MIT Media Lab - HTGAA Spring 2026

Medicine as Medium

Medicine as Medium, Origins Museum, 2025 (Photo courtesy of the artist)

Henrietta Scholtz

HTGAA Spring 2026

About me

I am an interdisciplinary artist, researcher and occasional curator. Areas of interest are vernacular biology, biopigments, drawing, painting, new media, new technologies and immersive environments.

@henrietta_scholtz_artist

Projects

Homework

WK-01 Governance diagram
WEEK 01 / PRINCIPLES + PRACTICES
Principles & Practices

Growing Serratia marcescens and Bacillus sp. pigments for bio-art. Mechanotransduction experiments. Governance framework for engineered organisms in gallery settings.

WK-02 Benchling plasmid
WEEK 02 / DNA READ, WRITE & EDIT
DNA Read, Write & Edit

Codon-optimising VioC from C. violaceum. Circadian antimicrobial system — Magainin peptide under RpaA promoter. Twist synthesis order prep.

WK-03 Opentrons
WEEK 03 / HW + LAB / OPENTRONS
Week 3 HW/Lab: Opentrons

Draft final project directions. Weighing Twist costs, remote lab availability, and UK-based ordering constraints to identify the most viable pathway.

WK-04 GABARAPL2 surface
WEEK 04 / PROTEIN DESIGN PT.I
Protein Design I

GABARAPL2 analysis in PyMol (PDB 7LK3). ESM2 mutational scan, t-SNE latent space, ESMFold, ProteinMPNN inverse folding. Bacteriophage engineering brainstorm.

WK-05 AlphaFold3 SOD1 binder structures
WEEK 05 / PROTEIN DESIGN PT.II
Protein Design II

SOD1-A4V binder peptide design. PepMLM generation, AlphaFold3 evaluation, PeptiVerse property scoring, moPPIt optimisation. L-protein mutant plan for MS2 phage lysis.

WK-06 Genetic circuits DNA assembly
WEEK 06 / GENETIC CIRCUITS PT.I
Genetic Circuits I

DNA assembly and PCR components. Phusion High-Fidelity Master Mix, primer annealing, amilCP chromophore mutagenesis, and genetic circuit design fundamentals.

WK-07 Weiss Lab IANN run
WEEK 07 / GENETIC CIRCUITS PT.II
Genetic Circuits II

IANNs vs Boolean circuits. KaiClock + Durin parallel AND gate run at Weiss Lab. Fungal materials, mycelial pigment lightfastness. Twist DNA order review.

Week 1 HW: Principles and Practices

INTERESTS

I am a contemporary artist interested in biomaterials, DNA and new technologies.

I have not experimented with bacterial pigments before, so I thought of using the following as a starting point:

  • Bacillus species (orange/yellow)
  • Serratia marcescens (red/pink)
  • Environmental isolates from soil

First, I want to grow them myself (which I am new to). I then want to run mechanotransduction experiments in which the bacteria's pigment responds to sounds and vibrations, by connecting mechanosensitive channels to pigment gene expression.

If possible, I also want to explore UV-protective, antimicrobial, colored bioplastic material or packaging made by combining bacterial pigments directly with a seaweed-based bioplastic matrix (such as carrageenan or alginate), and to build on what has been done to amplify natural pigment production through gene cloning.

Further experiments would look at creating hybrid strains.

Bio-Art Ethics & Policy Framework

I looked at governance and policy from the point of view of an artist and a non-scientist member of the public, taking into account that in my use case the bacterial samples may be presented to the public in a gallery setting.

Primary Goal: Ensure Safe & Responsible Use of Engineered Organisms in Artistic Practice

Secondary Goal: Maintain Public Trust in Bio-Art While Enabling Innovation

Three Governance Actions

Action 1: Tiered Institutional Approval System

Highlights the roles of Biosafety Committees, Art Institutions, and Artists.

  • Actor 1: Biosafety Committees
  • Actor 2: Art Institutions
  • Actor 3: Artists

Action 2: Open-Source Documentation Standard & Community Vetting

Outlines the purpose of shared safety standards and the involvement of Artists, Scientists, and the Community.

Purpose: Currently, bio-art practitioners work in isolation without shared safety standards.

  • Actor 1: Artists & Scientists
  • Actor 2: Community

Action 3: Technical Safety Infrastructure & Insurance Product

Addresses artist liability through the collaboration of Engineers, Certification Bodies, and Artists.

Purpose: Currently, artists mostly bear full liability for bio-art installations.

  • Actor 1: Engineers/Companies
  • Actor 2: Certification Bodies
  • Actor 3: Artists

(Image: governance framework diagram)

Risk Assessment Matrix

(Image: risk assessment matrix)

Required Readings

  • Course policies and biosafety guidelines from HTGAA Spring 2026 syllabus
  • Institutional biosafety protocols for bio-art installations

Software & Tools Used

  • Manus AI - Governance framework visualization and risk assessment matrix visuals
  • Imgur - Image hosting for diagrams and visual documentation

AI Assistance

  • Manus AI - Governance framework visualization
    • Date(s) used: February 2026
    • Tasks: Generated visual representations of bio-art governance framework and risk assessment matrix based on author’s policy framework

Additional Resources

  • Bio-art ethics and safety protocols literature
  • Gallery biosafety requirements for living organism exhibitions
  • Insurance and liability frameworks for bio-art practitioners

Project Context

  • Research focus: Bacterial pigment production (Serratia marcescens, Bacillus species)
  • Applications: Mechanotransduction experiments, UV-protective bioplastic materials, seaweed matrix integration
  • Public engagement: Gallery presentation considerations

Acknowledgments

  • HTGAA instructors for guidance on bio-art policy frameworks
  • Course TAs for biosafety protocol clarification

Week 2 HW: DNA Read, Write, and Edit

Part 1

Benchling & In-silico Gel Art

Part 2

No wet lab access

Part 3

DNA Design Challenge

Choose Protein

I chose the amino acid sequence of VioC (violacein synthase) from Chromobacterium violaceum, an enzyme involved in producing the pigment violacein.

I will reverse translate and codon optimize the sequence, with the aim of amplifying pigment production and thus its antimicrobial and UV-resistant properties.

sp|Q9S3U9|VIOC_CHRVO Violacein synthase OS=Chromobacterium violaceum (strain ATCC 12472 / DSM 30191 / JCM 1249 / CCUG 213 / NBRC 12614 / NCIMB 9131 / NCTC 9757 / MK) OX=243365 GN=vioC PE=1 SV=2
MKRAIIVGGGLAGGLTAIYLAKRGYEVHVVEKRGDPLRDLSSYVDVVSSRAIGVSMTVRG
IKSVLAAGIPRAELDACGEPIVAMAFSVGGQYRMRELKPLEDFRPLSLNRAAFQKLLNKY
ANLAGVRYYFEHKCLDVDLDGKSVLIQGKDGQPQRLQGDMIIGADGAHSAVRQAMQSGLR
RFEFQQTFFRHGYKTLVLPDAQALGYRKDTLYFFGMDSGGLFAGRAATIPDGSVSIAVCL
PYSGSPSLTTTDEPTMRAFFDRYFGGLPRDARDEMLRQFLAKPSNDLINVRSSTFHYKGN
VLLLGDAAHATAPFLGQGMNMALEDARTFVELLDRHQGDQDKAFPEFTELRKVQADAMQD
MARANYDVLSCSNPIFFMRARYTRYMHSKFPGLYPPDMAEKLYFTSEPYDRLQQIQRKQN
VWYKIGRVN

Reverse translate

sp|Q9S3U9|VIOC_CHRVO Violacein synthase OS=Chromobacterium violaceum (strain ATCC 12472 / DSM 30191 / JCM 1249 / CCUG 213 / NBRC 12614 / NCIMB 9131 / NCTC 9757 / MK) OX=243365 GN=vioC PE=1 SV=2
ATGAAGCGAGCGATTATTGTCGGGGGGGGTTTAGCTGGAGGTCTAACTGCGATATACTTGGCTAAACGTGGATACGAGGT
ACATGTGGTCGAGAAACGGGGCGACCCACTCAGGGACCTGTCTAGCTATGTTGATGTGGTTTCATCACGCGCAATCGGGG
TCAGCATGACTGTAAGAGGCATCAAGTCAGTTTTAGCGGCCGGTATCCCCCGAGCTGAATTAGATGCCTGTGGTGAGCCA
ATAGTTGCCATGGCGTTTTCCGTCGGGGGACAATATCGCATGCGGGAACTTAAACCACTCGAAGACTTCCGACCGCTTTC
GCTTAACCGAGCAGCCTTCCAGAAGCTTTTGAACAAGTACGCAAACCTTGCCGGCGTACGGTACTATTTCGAACATAAAT
GCCTGGATGTAGACCTGGATGGGAAATCCGTACTGATCCAAGGGAAGGACGGACAGCCGCAGCGACTTCAAGGAGATATG
ATTATCGGCGCAGATGGGGCACACAGTGCAGTTCGCCAAGCGATGCAGTCAGGATTGCGGCGCTTTGAGTTTCAACAAAC
GTTCTTTAGGCACGGGTATAAAACGCTGGTCCTACCCGACGCCCAAGCACTCGGGTATCGAAAGGACACGTTATATTTTT
TTGGAATGGACAGCGGAGGGTTGTTCGCAGGCCGAGCCGCAACAATACCCGATGGTAGCGTGTCCATAGCTGTGTGTCTG
CCCTACTCCGGCTCCCCCAGTTTGACAACCACAGATGAACCGACTATGCGTGCATTTTTCGACAGGTACTTTGGAGGTCT
TCCACGGGATGCGAGGGACGAGATGCTTAGACAATTTTTAGCCAAGCCGTCTAATGATCTAATAAATGTGCGATCTTCAA
CTTTTCATTACAAAGGTAACGTTCTGCTTTTAGGCGACGCCGCACATGCTACCGCGCCATTTTTAGGACAAGGCATGAAT
ATGGCGTTAGAGGATGCGCGAACATTCGTAGAATTACTTGATCGCCACCAAGGCGATCAGGATAAAGCGTTTCCAGAGTT
CACGGAGCTTAGAAAGGTGCAAGCGGACGCGATGCAAGATATGGCCCGGGCGAATTACGATGTTCTATCTTGCTCCAACC
CGATTTTTTTTATGAGGGCGCGGTATACCCGCTACATGCACAGCAAGTTTCCGGGACTGTACCCGCCGGATATGGCCGAG
AAACTGTATTTCACGTCAGAGCCGTACGATCGATTACAACAAATACAGCGCAAGCAAAACGTATGGTACAAGATAGGCAG
AGTTAAT
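One way to sanity-check a reverse translation is to translate the DNA back and compare it with the protein. The sketch below is a minimal illustration, not the method used by the reverse-translation tool: it assumes the standard genetic code, and the helper name `translate` is my own. It checks only the first seven codons of the sequence above against the first seven residues of the UniProt entry.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a coding sequence (reading frame 0) into one-letter amino acids."""
    dna = "".join(dna.split()).upper()  # drop spaces and line breaks
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


dna_start = "ATGAAGCGAGCGATTATTGTC"  # first seven codons of the DNA above
protein_start = "MKRAIIV"            # first seven residues of the VioC entry
print(translate(dna_start) == protein_start)
```

Pasting the full DNA and protein sequences into the same comparison would check the whole reverse translation in one step.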

Codon Optimize

https://en.vectorbuilder.com/tool/codon-optimization/b93b7790-7536-4d9b-a72e-02d62c3944e8.html

Next steps would be to embed into a seaweed matrix.

Part 4

Prepare a Twist DNA Synthesis Order

After reading more on living materials and bacterial pigments, and connecting this to my interest in light and circadian rhythms, I wanted to explore how to make a simple biological system that expresses antimicrobial or other elements only when needed, rather than all the time. The idea is a “temporal” antimicrobial system that produces the bacteria-killing peptide Magainin on a 24-hour schedule, controlled by a promoter regulated by the circadian response regulator RpaA. I started by learning how to design the Magainin peptide sequence and annotate it properly.

Benchling

Twist

REF:

  1. Fang et al. (2025) - “Mechanism and reconstitution of circadian transcription in cyanobacteria”
  2. Salis et al. (2009) - “Automated Design of Synthetic Ribosome Binding Sites”
  3. Westerhoff et al. (2008) - “Structure, Membrane Orientation, Mechanism, and Function of Pexiganan (Magainin derivative)”

Part 5

DNA Read/Write/Edit

DNA Read (Sequencing)

What DNA would you want to sequence and why?

I would sequence my pLight-Circadian-Color plasmid (which contains the RpaA gene from Synechococcus elongatus, an anthocyanin color gene, and a light sensor) to check that it was made correctly before testing if bacteria with this plasmid change color on a 24-hour schedule when exposed to light.

What sequencing technology would you use?

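I would use Sanger sequencing because it is highly accurate per base for targeted reads of several hundred base pairs. Covering a three-gene plasmid would take several overlapping reads across each insert and junction.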

DNA Write (Synthesis)

What DNA would you synthesize and why?

I would synthesize my yet-to-be-completed pLight-Circadian-Color plasmid containing three genes (RpaA from Synechococcus elongatus for timing, anthocyanin for color, light sensor for activation) to test if bacteria can change color on a 24-hour schedule in response to light.

DNA Edit

What DNA would you edit and why?

After I verify the plasmid works, I would edit the RpaA promoter to make it stronger so the color changes are brighter and more noticeable on a 24-hour schedule.

What editing technology would you use?

I would use site-directed mutagenesis to make small changes to the RpaA promoter because it’s precise.

References & Resources

Lecture Materials

  • Week 2 Lecture - DNA Read, Write, & Edit, George Church, Joe Jacobson, Emily Leproust
  • Week 2 Lab - DNA Gel Art, February 12-13, 2026

Required Readings

  1. Fang et al. (2025). “Mechanism and reconstitution of circadian transcription in cyanobacteria.” Journal of Biological Chemistry
  2. Salis et al. (2009). “Automated Design of Synthetic Ribosome Binding Sites to Control Protein Expression.” Nature Biotechnology, 27, 946-950
  3. Westerhoff et al. (2008). “Structure, Membrane Orientation, Mechanism, and Function of Pexiganan (Magainin derivative).” Biochemistry

Software & Tools Used

  • Benchling - DNA sequence design, annotation, and in-silico gel electrophoresis
  • Twist Bioscience - DNA synthesis order preparation and optimization
  • VectorBuilder Codon Optimization Tool - Reverse translation and codon optimization for violacein synthase
  • UniProt - Protein sequence database (VioC entry: sp|Q9S3U9|VIOC_CHRVO)
  • Imgur - Image hosting for documentation

Sequences Worked With

  • VioC (Violacein synthase) from Chromobacterium violaceum strain ATCC 12472
  • RpaA circadian promoter from Synechococcus elongatus
  • Magainin antimicrobial peptide sequence

AI Assistance

  • Claude (Anthropic) - DNA design and sequencing strategy
    • Model: Claude Sonnet 4.5
    • Date(s) used: February, 2026
    • Tasks: Assisted with reverse translation strategy for VioC, guidance on codon optimization principles, clarified Sanger sequencing vs synthesis tradeoffs

Project Development

  • Circadian-controlled antimicrobial system design (RpaA + Magainin)
  • Violacein pigment amplification through codon optimization
  • pLight-Circadian-Color plasmid conceptual design

Additional Resources

  • Twist Bioscience synthesis guidelines and specifications
  • Benchling annotation standards
  • Circadian rhythm gene expression literature

Acknowledgments

  • Course instructors
  • TAs for Benchling platform guidance

Week 3 HW/Lab : Opentrons

Python Script for Opentrons Artwork

Since I am not present to interact directly with the Opentrons output, I thought about why I would want to pipette an image and what that image should represent and decided to use Ndebele bead patterns as inspiration.

Ndebele bead patterns have a very specific geometric logic. They are built on a grid of “bead units” arranged in bold, angular, symmetric designs. The traditional South Ndebele aesthetic uses high-contrast colors in step-like diagonal and horizontal bands, often with thick outlines and mirrored symmetry.

They are also studied within ethnomathematics, which often promotes a more humanistic and inclusive perspective on mathematics, focusing on how different groups manage, understand, and navigate their reality.

I found it interesting to bring the mathematical and social aspects of this indigenous knowledge to the biochemical level, as this layering of meaning creates interesting avenues for reflection on various levels.

Example of Ndebele paintings and beadwork:

Python Visuals & Scripts Ex.

I am not a coder, but after playing around with the example scripts, I ended up using Claude to vibe-code the desired patterns and positions. It required some debugging and produced various output versions.

Although the Colab script runs without error, I am not sure whether it will work on the Opentrons.
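To illustrate the grid logic described above (stepped diagonal bands plus mirrored symmetry), here is a minimal sketch that builds one quadrant and mirrors it. It is not the script used for the homework: the function names, the band width, and the three-symbol palette are all assumptions, and mapping grid cells to Opentrons plate coordinates is left out.

```python
def stepped_quadrant(size, band=2):
    """Top-left quadrant: diagonal bands of colour indices (0, 1, 2) forming steps."""
    return [[((row + col) // band) % 3 for col in range(size)] for row in range(size)]


def mirror(quadrant):
    """Mirror a quadrant left-right, then top-bottom, to get a symmetric grid."""
    top = [row + row[::-1] for row in quadrant]
    return top + top[::-1]


grid = mirror(stepped_quadrant(4))
symbols = ".oX"  # stand-ins for three pigment colours (hypothetical palette)
for row in grid:
    print("".join(symbols[cell] for cell in row))
```

Each cell of `grid` could then be treated as one "bead unit", with the colour index deciding which pigment is dispensed at that position.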

EARLY VERSIONS BEFORE KNOWING COLOUR AVAILABILITY

Post Lab Homework


Published Paper

A directly relevant paper is Fang et al. (2025) in Nature Communications, which demonstrates circadian-gated gene expression circuits in bacteria, using automated temporal sampling to characterize rhythmic protein output over 24-hour cycles. This paper is not a peripheral reference; it is one of the primary foundational sources for my final project concept and is already cited in my main project documentation. The automation approach used to verify rhythmic expression in that work is precisely what I intend to replicate and extend with the Opentrons platform.

What I Intend to Automate

My project proposes a bacterial AND gate where the antimicrobial peptide Magainin is only expressed when two conditions are simultaneously true: the circadian regulator RpaA is active, and a pathogen signal is present. The core experimental challenge is verifying this gate actually works as designed, which requires sampling bacterial expression levels repeatedly across a full 24-hour cycle, under multiple conditions, without human error or gaps overnight. This is the automation task.

The Opentrons OT-2 would run an unattended 24-hour sampling protocol across three experimental conditions:

  • RpaA active + pathogen signal present (AND gate should trigger)
  • RpaA active + no pathogen signal (gate should stay silent)
  • RpaA inactive + pathogen signal present (gate should stay silent)

At each 2-hour timepoint, the robot samples each culture well, transfers to a measurement plate for fluorescence reading, and replaces the sampled volume with fresh media to keep cultures alive. This builds a full temporal expression profile across all three conditions without any overnight manual intervention.
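The sampling plan above can be sketched as a plain Python schedule before any robot code is written. This is a planning sketch under stated assumptions, not an Opentrons protocol: the well layout, the dictionary names, and the choice of 12 timepoints (hours 0 to 22, so that each condition fills one 12-column row of a measurement plate) are all hypothetical. The `expected_on` flag encodes only the AND logic of the three conditions and ignores the circadian phase within a day.

```python
# (RpaA active, pathogen signal present) for each condition listed above.
CONDITIONS = {
    "RpaA+ signal+": (True, True),
    "RpaA+ signal-": (True, False),
    "RpaA- signal+": (False, True),
}
# Hypothetical plate layout: one culture well per condition.
CULTURE_WELLS = {"RpaA+ signal+": "A1", "RpaA+ signal-": "B1", "RpaA- signal+": "C1"}
ROWS = "ABC"  # one row of the measurement plate per condition


def and_gate(rpaa_active, signal_present):
    """Expected Magainin expression: on only when both inputs are true."""
    return rpaa_active and signal_present


def build_schedule(total_hours=24, interval_hours=2):
    """List every transfer: source culture well, measurement well, and hour."""
    schedule = []
    for column, hour in enumerate(range(0, total_hours, interval_hours), start=1):
        for row, (name, inputs) in zip(ROWS, CONDITIONS.items()):
            schedule.append({
                "hour": hour,
                "condition": name,
                "source": CULTURE_WELLS[name],
                "dest": f"{row}{column}",
                "expected_on": and_gate(*inputs),
            })
    return schedule


schedule = build_schedule()
print(len(schedule), "transfers")
print(schedule[0])
```

A list like this could later drive the actual pipetting steps, with the media top-up added as a second transfer per entry.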

I would use Claude for the coding and guidance in the technical parts of this.

Why This Automation Matters

The AND gate only has meaning if you can show it is silent when it should be silent and active only at the right circadian phase with the right pathogen or other signal. That requires clean data across all three conditions at every 2-hour window through the night. Manual pipetting at 2am introduces the exact inconsistency that would make the rhythmic signal unreadable. The Opentrons removes that variable entirely.


Future Extensions

If access to Ginkgo Nebula becomes available, the next step would be submitting the AND gate genetic construct for scaled fermentation and characterization, using Nebula’s high-throughput infrastructure to screen circuit variants with different RpaA promoter strengths or pathogen-sensing thresholds. This would generate the kind of combinatorial data that would take months on a single benchtop robot.


References & Resources

Required Readings

  1. Fang et al. (2025). “Circadian-gated gene expression circuits in bacteria.” Nature Communications
  2. UCSD (2024). “Researchers Rebuild Microscopic Circadian Clock.” University of California San Diego press release
  3. Bilska et al. (2021). “Circadian rhythm in skin barrier function and antimicrobial peptides.” Experimental Dermatology

Software & Tools Used

  • Google Colab - Python script development and testing for Opentrons protocols
  • Python - Opentrons protocol scripting and pattern generation
  • Imgur - Image hosting for project visualization and Ndebele pattern references

Cultural & Mathematical Inspiration

  • Ndebele bead patterns and geometric design principles
  • Ethnomathematics - Indigenous mathematical knowledge systems
  • Traditional South Ndebele aesthetic and symmetry patterns

Project Concepts Explored

  • Circadian-controlled bacterial pigment systems
  • Light-responsive color-changing bacteria
  • UV-protective bioplastic materials with bacterial pigments
  • Mechanotransduction experiments with bacterial cultures
  • Bacterial AND gate with circadian gating (RpaA + pathogen signal triggering Magainin expression)

Cost Considerations

  • Twist Bioscience DNA synthesis pricing
  • Remote lab assistance availability assessment
  • UK-based protein order logistics and costs

AI Assistance

  • Claude (Anthropic) - Code development and technical guidance
    • Model: Claude Sonnet 4.5
    • Date(s) used: February, 2026
    • Tasks: Assisted with Python script development for Opentrons Ndebele pattern generation (“vibe-coding”), debugging protocol scripts and verifying scientific terminology.

Future Platforms

  • Ginkgo Nebula - Potential platform for scaled fermentation and high-throughput circuit variant screening

Additional Resources

  • HTGAA final project guidelines and requirements
  • Twist Bioscience pricing documentation
  • Remote lab capabilities at available nodes
  • Opentrons protocol documentation and API reference
  • Ndebele art and design pattern libraries
  • Ethnomathematics literature

Acknowledgments

  • Course instructors
  • TAs
  • Ndebele cultural heritage for geometric design inspiration

Week 4 HW: Protein Design Part 1

Part A: Conceptual Questions

How many molecules of amino acids do you take with a piece of 500g of meat? (avg amino acid ~100 Daltons)

Since math is not my strong suit and I am a visual learner, I needed an analogy to grasp Daltons, grams, and moles. I imagine each amino acid as a finished LEGO model, and each tiny brick as a Dalton. When I weigh all the models together in a cupboard, I have 500 grams. I find the number of moles by dividing the total mass by the mass of one model (~100 Daltons, which corresponds to ~100 grams per mole). Multiplying by Avogadro’s number then tells me how many individual models I have in total. In scientific terms, 500 grams divided by 100 grams per mole gives 5 moles. Avogadro’s constant (per a Google search) is the number of particles, like atoms or molecules, in one mole of a substance, approximately 6.022 times 10 to the 23. Multiplying 5 moles by 6.022 times 10 to the 23 yields approximately 3.0 times 10 to the 24 molecules of amino acid.
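The same arithmetic can be written out as a short Python check. The variable names are mine, and the calculation keeps the question's simplification that the full 500 g is amino acids at about 100 g/mol each.

```python
AVOGADRO = 6.022e23           # particles per mole (approximate)
mass_g = 500.0                # mass given in the question
molar_mass_g_per_mol = 100.0  # ~100 Da per amino acid, i.e. ~100 g/mol

moles = mass_g / molar_mass_g_per_mol
molecules = moles * AVOGADRO
print(f"{moles:.1f} mol -> {molecules:.3e} molecules")
# prints: 5.0 mol -> 3.011e+24 molecules
```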

Why do humans eat beef but do not become a cow, eat fish but do not become fish?

We don’t become a cow or a fish, because we’re only using parts of the cow and fish to continue building on what already exists, which is our human body. In other words, digestion breaks down the proteins into amino acids, and then our body uses its own genetic instructions to reassemble those pieces into human proteins, ensuring we stay uniquely ourselves.

Why are there only 20 natural amino acids?

There are only 20 natural amino acids because, although their combinations can form infinite possibilities, evolution only needed these 20 to create all the proteins we rely on. Their chemical properties allow for immense diversity in protein structure, and this set is perfectly suited to the way DNA encodes and guides their assembly, giving us the versatility we need without adding more complexity.

Where did amino acids come from before enzymes that make them, and before life started?

Amino acids likely formed from simple chemicals dissolved in warm ocean water on early Earth, where energy sources like heat or lightning sparked chemical reactions. In a way, you can think of a modern dam as a kind of micro-ecosystem—just as water and energy flow through a dam, creating pockets of life, early oceans created the right conditions for these amino acids to form, eventually leading to the first building blocks of life.

If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?

A typical alpha helix built from L-amino acids forms a right-handed spiral. If you build it from D-amino acids instead, the helix is the mirror image and forms a left-handed spiral. It is important to note that not all sequences form alpha helices; some prefer other structures like beta sheets. So the handedness depends on the chirality of the amino acids, while whether a helix forms at all depends on the overall sequence, which determines how the chain folds and stabilizes.

Can you discover additional helices in proteins?

Because proteins have long chains of different amino acids, they can fold in all sorts of ways, and that means new helices or other patterns can be identified within them, especially with advanced tools like structural prediction or experimental techniques.

Why are most molecular helices right-handed?

The right-handedness is actually favored in DNA due to the natural chirality of its sugar backbone and the way it interacts with water and other molecules. So, that structural preference is built right into how the backbone forms. Keratin and hair curl were helpful for me to think about.

Why do β-sheets tend to aggregate?

So, beta sheets tend to aggregate because, like silk fibroin, they form straight, aligned strands that stack side by side. In silk, these parallel sheets create strong, stable fibers, but in proteins, this same alignment lets the sheets stack excessively, exposing those hydrogen bonds and promoting aggregation. So, just like silk’s strength comes from its sheet alignment, aggregation in proteins happens when these sheets stack and bind too readily.

What is the driving force for β-sheet aggregation? Why do many amyloid diseases form β-sheets?

Because these beta sheets stack so easily, they misfold and form these stable aggregates. In diseases like Alzheimer’s or Parkinson’s, these aggregated beta sheets build up, disrupting normal cell function and triggering the disease process. Tau tangles are a classic example of beta sheet misfolding driving disease.

Can you use amyloid β-sheets as materials?

Since silk fibroin is based on beta sheets and is already a natural, strong material, researchers have been exploring ways to harness amyloid beta sheets similarly. Amyloid structures are extremely stable, so with careful design, scientists are looking at them as potential biomaterials.


Part B: Protein Analysis and Visualization

  1. Briefly describe the protein you selected and why you selected it.

    GABARAPL2 (GABA Type A Receptor Associated Protein Like 2), human, UniProt P60520.

  2. Identify the amino acid sequence of your protein.

    sp|P60520|GBRL2_HUMAN
    MKWMFKEDHSLEHRCVESAKIRAKYPDRVPVIVEKVSGSQIVDIDKRKYLVPSDITVAQFMWIIRKRIQLPSEKAIFLFVDKTVPQSSLTMGQLYEKEKDEDGFLYVAYSGENTFGF

  3. How long is it? What is the most frequent amino acid?

    Length: 117 amino acids. Most frequent amino acid: (image still needed from Colab output)

  4. How many protein sequence homologs are there? (Use UniProt BLAST)

    250 homologs found via UniProt BLAST. Top matches come from rat, mouse, human, and bovine, indicating strong conservation across mammals and reflecting the fundamental evolutionary importance of this protein.

  5. Does your protein belong to any protein family?

    GABARAPL2 belongs to the GABARAP subfamily of the ATG8 family of autophagy-related proteins.

  6. Identify the structure page of your protein in RCSB.

    PDB entry 7LK3. Crystal structure of untwinned human GABARAPL2.

    (Upload screenshot to Imgur)

  7. When was the structure solved? Is it good quality? (Resolution: smaller = better, aim < 2.70 Å)

    Deposited February 1, 2021, released May 12, 2021. The resolution of 1.90 Å is excellent quality, well below the 2.70 Å threshold.

  8. Are there any other molecules in the solved structure apart from protein?

    Yes. 1,2-Ethanediol (EDO) is present as a ligand.

  9. Does your protein belong to any structure classification family?

    GABARAPL2 belongs to the ubiquitin-like superfamily under the beta-grasp fold in SCOP classification. Like other ATG8 proteins, GABARAPL2 consists of an N-terminal helical extension preceding four beta-strands in a ubiquitin-like beta-grasp fold.

  10. Open the structure in 3D visualization software (PyMOL):

    • Visualize as “cartoon”, “ribbon”, and “ball and stick”
    • Color by secondary structure: more helices or sheets?
    • Color by residue type: hydrophobic vs hydrophilic distribution?
    • Visualize the surface: any binding pockets?

When colored by secondary structure, GABARAPL2 shows a clear dominance of red (helices) over yellow (beta sheets). Green loops connect these elements throughout the structure.

Hydrophobic residues (orange) concentrate in the protein core, while blue (hydrophilic residues) dominates the outer surface. This shows the hydrophobic residues hidden in the middle, away from the aqueous environment.

This surface visualization reveals a clear hydrophobic indentation in the middle of the structure, corresponding to the LIR docking site where GABARAPL2 interacts with autophagy receptors.
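The length and most-frequent-residue question in Part B can also be answered directly from the sequence without the Colab notebook. This is a minimal sketch using Python's standard `collections.Counter`; the variable names are mine and the sequence is the UniProt P60520 entry quoted above.

```python
from collections import Counter

# GABARAPL2 sequence (UniProt P60520), split over two lines for readability.
seq = (
    "MKWMFKEDHSLEHRCVESAKIRAKYPDRVPVIVEKVSGSQIVDIDKRKYLVPSDITVAQF"
    "MWIIRKRIQLPSEKAIFLFVDKTVPQSSLTMGQLYEKEKDEDGFLYVAYSGENTFGF"
)

counts = Counter(seq)  # residue letter -> number of occurrences
length = len(seq)
top_residue, top_count = counts.most_common(1)[0]
print(length, top_residue, top_count)
```

`counts.most_common(5)` would list the five most frequent residues if a fuller breakdown is wanted.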


Part C: Using ML-Based Protein Design Tools

C1. Protein Language Modeling

Deep Mutational Scans

  • Use ESM2 to generate an unsupervised deep mutational scan based on language model likelihoods
  • Can you explain any particular pattern? (choose a residue and mutation that stands out)
  • (Bonus) Compare language model predictions to experimental scans

The brightest yellow spot in the heatmap appears at position 60, mutation to glutamate (E), meaning the model assigns this change a high likelihood relative to the original residue. Because the score reflects patterns learned from natural sequences rather than measured function, this is a prediction, not proof that the change is beneficial. One possible interpretation is that glutamate’s charged nature could support the protein’s membrane interactions during autophagy and fasting states.

The darkest purple spots appear around positions 54-57 for mutations to cysteine (C) and at position 64 for tryptophan (W), meaning the model strongly disfavors these mutations. Cysteine in particular stands out as consistently disfavored.
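Heatmap values of this kind are commonly computed as a log-likelihood ratio between the mutant and the original residue at a position. The sketch below shows only that scoring step. It assumes this is how the notebook scores mutations (the notebook's exact method is not shown here), and the probabilities are made-up numbers rather than ESM2 output.

```python
import math


def mutation_score(position_probs, wild_type, mutant):
    """log p(mutant) - log p(wild type); above 0 means the model prefers the mutant."""
    return math.log(position_probs[mutant]) - math.log(position_probs[wild_type])


# Hypothetical model probabilities at one position (illustrative values only).
toy_probs = {"F": 0.10, "E": 0.40, "C": 0.01, "W": 0.02}
for mutant in ("E", "C", "W"):
    print(f"F->{mutant}: {mutation_score(toy_probs, 'F', mutant):+.2f}")
```

Positive scores would show up as bright cells and strongly negative scores as dark cells in a heatmap built this way.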

Latent Space Analysis

  • Embed proteins in reduced dimensionality using the provided sequence dataset
  • Analyze neighborhoods — do they approximate similar proteins?
  • Place your protein in the map and explain its position and similarity to neighbors

The 3D t-SNE plot shows a large dense central cluster of proteins with outliers scattered at the edges. Proteins in the same neighborhood share similar sequence embeddings, suggesting structural and functional similarity. GABARAPL2, as a member of the highly conserved ubiquitin-like superfamily, would likely position itself near the central core of the cloud, close to other small globular autophagy and ubiquitin-related proteins. Its neighbors would likely include other ATG8 family members.
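A simple way to check which proteins are actually nearest to a query is to rank them by cosine similarity of their embeddings, without relying on the t-SNE picture alone. The sketch below is an illustration with made-up three-dimensional vectors and hypothetical protein names. Real ESM2 embeddings have far more dimensions, and this is not the notebook's own analysis.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(query, embeddings, k=2):
    """Names of the k embeddings most similar to the query vector."""
    ranked = sorted(
        embeddings.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Toy embeddings (hypothetical values and names).
toy_embeddings = {
    "protein_A": [0.9, 0.1, 0.0],
    "protein_B": [0.8, 0.2, 0.1],
    "protein_C": [0.0, 0.1, 0.9],
}
query = [1.0, 0.1, 0.0]
print(nearest_neighbors(query, toy_embeddings))
```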

C2. Protein Folding

  • Fold your protein with ESMFold — do predicted coordinates match the original structure?
  • Try mutations, then larger sequence changes — is the structure resilient?

The ESMFold predicted structure closely matches the original crystal structure. Both show the characteristic beta-grasp fold with a central beta sheet core surrounded by helices, and the overall globular compact shape is preserved.

C3. Protein Generation

  • Use ProteinMPNN to inverse-fold your protein backbone and propose sequence candidates
  • Analyze predicted sequence probabilities vs the original sequence
  • Input the new sequence into ESMFold and compare the predicted structure to original

ProteinMPNN generated 4 sequence candidates from the 7LK3 backbone with sequence recovery rates between 46–52% and consistent scores around 0.81–0.84. The probability heatmap shows scattered high-confidence positions (yellow) where the backbone strongly constrains the amino acid choice, surrounded by flexible positions with lower confidence. Despite roughly half the sequence changing, the backbone fold is preserved suggesting that many different sequences can encode the same GABARAPL2 structure.
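Sequence recovery is the fraction of positions where the designed sequence keeps the original residue. Here is a minimal sketch of that calculation. The function name is mine, and the "designed" fragment is a hypothetical example rather than one of the four ProteinMPNN candidates.

```python
def sequence_recovery(native, designed):
    """Fraction of aligned positions where the designed residue equals the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


native = "MKWMFKEDHS"    # first ten residues of GABARAPL2
designed = "MKWLFREDYS"  # hypothetical designed fragment
print(f"{sequence_recovery(native, designed):.0%}")
```

A recovery near 50%, as reported above, means roughly every second position keeps its original amino acid.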

When the new ProteinMPNN sequence was folded with ESMFold, the overall shape stayed the same. But there were some small differences: the helices shifted slightly, the beta sheets moved a little, and the central loop region pulled closer together. This suggests that even though roughly half the amino acids changed, the protein still folds into essentially the same shape. The structure is resilient.


Part D: Group Brainstorm on Bacteriophage Engineering

I decided to try option 3: even if it fails, it could still help eliminate a possible pathway to the end goal, and it seemed the most interesting. General reminder: loop regions and terminal extensions are safer engineering targets than core structural elements.

Higher Toxicity of the MS2 Lysis Protein:

Goal: Increase the toxicity of the MS2 L protein so it lyses bacterial cells faster and more completely.

Approach:

  • Use a protein language model (ESM or similar) to identify which amino acid positions in the L protein are most likely involved in membrane disruption
  • Propose mutations at those positions, using ProteinMPNN to suggest alternative amino acids that might make membrane interaction more aggressive
  • Use AlphaFold-Multimer to model how the mutant L protein interacts with its bacterial target (DnaJ and the membrane)
  • Compare predicted binding strength and structural changes between original and mutant versions

Why these tools help:

  • Language models capture evolutionary patterns across many proteins, helping identify positions where changes are most likely to matter
  • AlphaFold-Multimer lets you check if your proposed mutations actually change how the protein docks with its bacterial target

Potential pitfalls:

  • The exact mechanism of membrane disruption by the L protein is not fully understood, so mutations may target the wrong part of the protein (the “drill”, in my analogy)
  • Limited training data exists specifically for phage-bacteria lysis interactions, so predictions may be less reliable than for well-studied proteins

Pipeline schematic first draft:

L protein sequence → ESM (identify key positions) → ProteinMPNN (propose mutations) → AlphaFold-Multimer (validate structure and interaction) → compare mutant vs original

References & Resources

Protein Analyzed

  • GABARAPL2 (GABA Type A Receptor Associated Protein Like 2)
  • UniProt ID: sp|P60520|GBRL2_HUMAN
  • PDB Structure: 7LK3 (1.90 Å resolution, deposited Feb 2021, released May 2021)
  • 117 amino acids, ATG8 family, ubiquitin-like superfamily

Software & Tools Used

  • UniProt - Protein sequence database and BLAST homolog searches
  • RCSB Protein Data Bank - Protein structure database (PDB: 7LK3)
  • PyMOL - 3D protein structure visualization and analysis
  • Google Colab - Running ESM2, ProteinMPNN, ESMFold analyses
  • ESM2 - Protein language model for deep mutational scanning and sequence embeddings
  • ESMFold - Protein structure prediction
  • ProteinMPNN - Inverse folding and sequence design
  • t-SNE - Dimensionality reduction for latent space analysis
  • Imgur - Image hosting for visualization documentation

Required Readings

  • GABARAPL2 autophagy function literature
  • ATG8 family protein structure and function papers
  • Protein folding and stability principles
  • Amyloid formation and beta-sheet aggregation mechanisms

AI Assistance

  • Claude (Anthropic) - Protein analysis and ML tool interpretation
    • Model: Claude Sonnet 4.5
    • Date(s) used: February, 2026
    • Tasks: Assisted with understanding protein structure visualization principles, interpreting ESM2 deep mutational scan results, explaining t-SNE embeddings and protein neighborhoods, clarifying ProteinMPNN sequence recovery metrics, developing analogies for complex concepts, and checking whether the homework was correct.

Bacteriophage Engineering Project

  • Option 3: Increase MS2 lysis protein (L protein) toxicity
  • Tools: ESM language model, ProteinMPNN, AlphaFold-Multimer
  • Target: Enhanced membrane disruption and faster bacterial lysis

Additional Resources

  • SCOP protein structure classification database
  • MS2 bacteriophage literature
  • Membrane disruption mechanism papers
  • DnaJ protein interaction studies

Acknowledgments

  • Course instructors for protein design tutorials
  • TAs for PyMOL visualization assistance
  • Colab community for ML tool notebooks

Week 5 HW: Protein Design Part II

Part A: SOD1 Binder Peptide Design (From Pranam)

Background

Superoxide dismutase 1 (SOD1) is a cytosolic antioxidant enzyme that converts superoxide radicals into hydrogen peroxide and oxygen. In its native state, it forms a stable homodimer and binds copper and zinc.

Mutations in SOD1 cause familial Amyotrophic Lateral Sclerosis (ALS). Among them, the A4V mutation (Alanine to Valine at residue 4) leads to one of the most aggressive forms of the disease. The mutation subtly destabilizes the N-terminus, perturbs folding energetics, and promotes toxic aggregation.


Generate Binders with PepMLM

UNIPROT

MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

CHANGED TO MUTATED VERSION (POSITION 4, K TO V)

MATVAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
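A substitution like this can be scripted so that both the position and the residue being replaced are checked explicitly. The sketch below is a minimal helper under stated assumptions: the function name `substitute` is hypothetical (it is not part of PepMLM or UniProt), and only the first ten residues of the UniProt sequence above are used. One thing it makes visible is the numbering convention: the SOD1 literature usually numbers residues without the initiator methionine, so the alanine called A4 is the fifth residue of the UniProt sequence, while UniProt position 4 is a lysine.

```python
def substitute(seq, position, new_residue, expected=None):
    """Return seq with the residue at 1-based `position` replaced.

    If `expected` is given, raise an error when the residue found at
    that position is not the one we think we are replacing.
    """
    index = position - 1
    if not 0 <= index < len(seq):
        raise IndexError(f"position {position} is outside the sequence")
    if expected is not None and seq[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {seq[index]}"
        )
    return seq[:index] + new_residue + seq[index + 1:]


# First ten residues of the UniProt sequence shown above.
wild_type_start = "MATKAVCVLK"

# Substitution at UniProt position 4 (counting the initiator Met): K to V.
uniprot_pos4 = substitute(wild_type_start, 4, "V", expected="K")

# A4V in the Met-less numbering is UniProt position 5: A to V.
mature_a4v = substitute(wild_type_start, 5, "V", expected="A")

print(uniprot_pos4)
print(mature_a4v)
```

Passing `expected` is the design choice that matters here: the call fails loudly if the numbering convention and the sequence do not agree.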

SETTING UP AND RUNNING PepMLM IN COLAB



GENERATED 4 PEPTIDES OF LENGTH 12

MORE DETAIL IN INTERACTIVE VIEW BELOW



USED GOOGLE GEMINI TO HELP INTERPRET AND UNDERSTAND THE OUTPUT



Evaluate Binders with AlphaFold3

When I looked at the five structures, the known binder really stood out: it sat closely tucked against SOD1 and had an ipTM score of 0.73, by far the highest. You could see it engaging deeply with the protein. The PepMLM peptides told a different story. The best ones, Peptide3 and Peptide1, appeared to sit near the top of the protein around the loop region, but they looked more like they were resting on the surface than really grabbing onto it. Peptide0 was the weakest and looked almost detached, floating near the protein rather than making real contact. None of the generated peptides came close to the known binder, which shows that while PepMLM gave us a starting point, the peptides still need improvement to properly engage SOD1-A4V. (This was my favorite part due to the visuals.)


Part 3: Evaluate Properties with PeptiVerse

After evaluating the results below, I would advance WLVYAVAAAWKA because it is the only peptide with medium binding affinity (7.247 pKd/pKi), compared to weak binding for the others. It is also well balanced in terms of hemolysis risk, with a low probability of 0.133, and its net charge of 0.76 at pH 7 is nearly neutral, which should help with both solubility and cellular uptake without causing charge-related toxicity. Although its ipTM score of 0.39 in AlphaFold3 was not the highest, the combination of improved predicted binding affinity and favorable therapeutic properties makes it the strongest candidate for further development. While Peptide3 had a slightly higher ipTM of 0.44, it showed the weakest predicted binding affinity of 5.498 and therefore does not balance structural and therapeutic properties as well.
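The selection logic above (predicted affinity first, ipTM as a secondary consideration) can be written down as a small ranking sketch. It uses only the values quoted in this section; properties that are not stated for Peptide3 are left as `None` rather than guessed, and the dictionary layout and the `rank_key` function are assumptions for illustration, not PeptiVerse output.

```python
# Values quoted in the text above; anything not stated is left as None.
candidates = [
    {"name": "WLVYAVAAAWKA", "affinity": 7.247, "iptm": 0.39,
     "hemolysis": 0.133, "net_charge": 0.76},
    {"name": "Peptide3", "affinity": 5.498, "iptm": 0.44,
     "hemolysis": None, "net_charge": None},
]


def rank_key(peptide):
    """Sort by predicted affinity first, then by ipTM as a tie-breaker."""
    return (peptide["affinity"], peptide["iptm"])


ranked = sorted(candidates, key=rank_key, reverse=True)
best = ranked[0]

for peptide in ranked:
    print(peptide["name"], peptide["affinity"], peptide["iptm"])
```

Ordering the tuple differently (ipTM first) would put Peptide3 on top, which is exactly the trade-off discussed in the paragraph above.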

Part 4: Generate Optimized Peptides with moPPIt

The run took 39 minutes.

The moPPIt peptides differ from PepMLM in a key way: moPPIt allowed me to specify exactly which region of SOD1 I wanted to target, whereas PepMLM just generated peptides that looked plausible without that control. With Peptide 2 from moPPIt, I can see it’s actually engaging the N-terminal region where A4V sits, which is exactly what I designed it to do. PepMLM couldn’t guarantee that level of specificity.

Before advancing any peptide to clinical studies, I would need to do much more work. First, I’d validate the binding predictions with actual lab experiments that measure real binding affinity. Most importantly, I’d likely run moPPIt again with different target regions on SOD1 in order to generate a larger panel of candidates and pick the best performers across all validation steps. No single computational prediction is enough to move forward to the clinical setting.

Part B: Optional

Part C: Final Project: L-Protein Mutants

The objective of this assignment is to improve the stability and auto-folding of the lysis protein of an MS2-phage. This mechanism is key to understanding how phages may help address antibiotic resistance.

After going through the readings, including the group final project document, a Plan A would be the following. (This stays within scope, though MurJ and multi-target approaches seem interesting too…)

  1. Use computational tools like AlphaFold2 or ProteinMPNN to identify mutations that improve intrinsic stability and auto-folding of the lysis protein.
  2. Target mutations that strengthen the hydrophobic core, eliminate aggregation-prone regions, or introduce stabilising interactions like salt bridges.
  3. Engineer the lysis protein to fold correctly without requiring DnaJ or any other bacterial chaperone.
  4. Design mutations that also accelerate oligomerisation or enhance membrane pore-forming activity for faster lysis.
  5. Synthesise the mutant gene via Twist, clone into plasmid using Gibson Assembly, validate structural integrity with Nuclera, then test in E. coli.


References & Resources

Lecture Materials

  • Week 5 Lecture - Protein Design Part II, Pranam Chatterjee, Gabriele Corso
  • Week 5 Lab - Protein Design Part II Lab, March 5-6, 2026

Software & Tools Used

  • UniProt
  • PepMLM
  • AlphaFold3
  • PeptiVerse
  • moPPIt

AI Assistance

  • Claude (Anthropic) - Protein design concepts
    • Model: Claude Sonnet 4.5
    • Date(s) used: March, 2026
    • Tasks: Acted as a mentor (Skills) in conversations about unfamiliar and technical areas. Checked that the homework was correct.

Additional Resources

  • Advanced protein design literature
  • Computational protein engineering tools

Acknowledgments

  • Course instructors and TAs

Week 6 HW: Genetic Circuits Part I

DNA Assembly

  1. What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?

The Phusion HF PCR Master Mix contains several key components that are pre-combined for convenience. Phusion DNA Polymerase is the core enzyme responsible for copying the DNA template. It has built-in proofreading (3′→5′ exonuclease) activity, which is what makes it high-fidelity: it reduces errors during amplification. The dNTPs provide the nucleotide building blocks that get incorporated into the new DNA strand. MgCl₂ (magnesium) acts as an essential cofactor that activates the polymerase. The reaction buffer (the oven conditions in my analogy) maintains the correct pH and ionic environment for the reaction to work. For this particular lab, precise mutagenesis of the amilCP chromophore region was required, so the high-fidelity polymerase is especially important: it helps ensure there are no unintended amino acid changes beyond the designed mutation.

  2. What are some factors that determine primer annealing temperature during PCR?

Several factors affect the temperature at which a primer successfully binds to its target on the DNA template. First, secondary structure is something to avoid. If a primer folds back on itself it is like a blurry photograph that cannot be read properly, meaning it cannot find its matching location on the template regardless of temperature.

Second, GC content affects annealing temperature. Primers with more G and C bases require higher temperatures because GC pairs bond more strongly than AT pairs. In this lab the backbone primers anneal at 57°C while the color insert primers anneal at 53°C, reflecting differences in their GC content.

Third, primer length matters. A longer primer is like a photograph that also shows the surrounding context, making it a more specific match. Longer primers bind more strongly and therefore require higher annealing temperatures. These factors were carefully balanced during primer design, aiming for a Tm range of 52–58°C with primer pairs kept within 5°C of each other.
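The effect of GC content and length on melting temperature can be illustrated with the Wallace rule, a rough estimate for short oligos: Tm ≈ 2 × (A + T) + 4 × (G + C). The sketch below is an assumption-laden illustration, not the method used to design the lab primers: the example primer is hypothetical, and primer design tools typically use nearest-neighbor models that give different values.

```python
def gc_fraction(primer):
    """Fraction of bases in the primer that are G or C."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rough melting temperature in deg C: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def pair_is_matched(forward, reverse, max_difference=5):
    """True if the two primers' estimated Tm values are within max_difference."""
    return abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_difference


# Hypothetical 20-mer, for illustration only (not one of the lab primers).
example = "ATGGCTAGCAAGGGCGAGGA"
print(len(example), gc_fraction(example), wallace_tm(example))
```

Both trends described above show up directly in the formula: each extra base raises the estimate, and a G or C raises it twice as much as an A or T.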

  3. There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.

PCR and restriction enzyme digests differ in terms of precision and flexibility. PCR is more flexible and suitable for bespoke mutation designs, giving you control over exactly where a fragment begins and ends by designing the primers yourself. Restriction enzyme digests are more limited in that they can only cut where their recognition sequence naturally exists in the DNA, but this makes them faster and more straightforward when you already know exactly which sequence you need.

I would use PCR when attempting to design a mutation, as in this lab where the chromophore color changes were introduced through deliberate primer mismatches. I would use restriction enzyme digests when the recognition sites are already conveniently placed and the desired sequences are already known, as this would save time.

In terms of protocol, PCR requires designing primers, running denature, anneal and extend cycles in a thermocycler, cleaning up the original template with a DpnI digest, and then purifying the DNA. Restriction enzyme digests are more straightforward, requiring only choosing the right enzyme for the recognition site, incubating the DNA with the enzyme at 37°C, and running a gel to confirm the correct cut. No heating cycles or template cleanup are needed.

  4. How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?

There are several ways to ensure fragments are appropriate for Gibson cloning.

First, correct overlaps must be present. In this lab the primers were designed from the start with 20-40bp overhangs complementary to the adjacent fragment, ensuring the fragments can recognise and join each other during assembly.

Second, fragment size should be confirmed by running a diagnostic gel. If the band appears at the wrong size then the PCR was unsuccessful and the fragment would not be appropriate for Gibson cloning.

Third, the DNA must be clean and concentrated enough. The Nanodrop measurement confirms concentration is above 30ng/µL. Contaminants from PCR can inhibit the Gibson Assembly reaction.

Fourth, the original template must be removed. The DpnI digest ensures the original methylated mUAV plasmid is not carried over, which would otherwise produce background colonies of the unmutated purple protein.

Finally, the correct molar ratio must be used. Gibson Assembly works best at a 2:1 insert to vector ratio to ensure efficient and complete assembly.
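Because the 2:1 ratio is molar, not by mass, the Nanodrop concentration has to be converted using fragment length. A common approximation is pmol = ng × 1000 / (bp × 650), taking 650 Da as the average mass of a base pair. The sketch below applies it; the fragment sizes and amounts are hypothetical placeholders, not the values from this lab.

```python
DALTONS_PER_BP = 650  # approximate average mass of one double-stranded base pair


def ng_to_pmol(ng, length_bp):
    """Convert a mass of double-stranded DNA (ng) to picomoles."""
    return ng * 1000 / (length_bp * DALTONS_PER_BP)


def insert_ng_for_ratio(vector_ng, vector_bp, insert_bp, ratio=2.0):
    """Mass of insert (ng) needed for a given insert:vector molar ratio."""
    vector_pmol = ng_to_pmol(vector_ng, vector_bp)
    insert_pmol = ratio * vector_pmol
    return insert_pmol * insert_bp * DALTONS_PER_BP / 1000


# Hypothetical example: 50 ng of a 3000 bp backbone and a 700 bp insert.
vector_pmol = ng_to_pmol(50, 3000)
insert_ng = insert_ng_for_ratio(50, 3000, 700, ratio=2.0)
print(round(vector_pmol, 4), round(insert_ng, 1))
```

The short insert needs far less mass than the backbone to reach twice the number of molecules, which is why measuring concentration alone is not enough.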

  5. How does the plasmid DNA enter the E. coli cells during transformation?

The plasmid enters the E. coli cells through a process called heat shock transformation.

First, the cells are made chemically competent using CaCl₂. This partially neutralizes the repulsion between the negatively charged cell membrane and the negatively charged DNA, allowing the DNA to associate with the cell surface.

Next the cells are kept on ice, which makes the membrane more rigid and stable. Then the cells are heat shocked at exactly 42°C for 45 seconds, which temporarily disrupts the membrane and allows the plasmid to enter the cell by diffusion. The cells are then immediately returned to ice so the membrane stabilizes and closes again.

So essentially the process is: make the membrane rigid with ice, give it a heat shock to open it briefly, then put it back on ice to close it again with the plasmid now inside.

After heat shock, SOC medium (the nutrient-rich broth in my analogy) is added to help the cells recover and begin multiplying. Finally the cells are plated on chloramphenicol agar, where only cells that successfully received the plasmid will survive and grow.

  6. Describe another assembly method in detail, such as Golden Gate Assembly. Explain the other method in 5 to 7 sentences plus diagrams, either handmade or online.

Golden Gate Assembly is a method of connecting DNA fragments together using custom 4 base sticky ends. It works by sending in a Type IIS restriction enzyme that acts like a self destructing instruction manual, cutting at a defined location outside its recognition site and then removing itself in the process, leaving behind unique sticky ends that have been designed to only connect to one specific matching partner. These sticky ends are self sorting, acting like magnets that can only attract their intended match and nothing else. Once the fragments are correctly joined the assembly is scarless, meaning no trace of the recognition site remains in the final product.

This differs from Gibson Assembly which uses an exonuclease, polymerase and ligase, and requires longer overlaps of 20-40bp between fragments rather than the 4 base sticky ends of Golden Gate. Golden Gate cycles between cutting and ligation temperatures repeatedly, whereas Gibson Assembly runs isothermally at 50°C. Because incorrect assemblies get re-cut and correct ones accumulate, Golden Gate is highly efficient and can assemble many parts simultaneously in one tube, making it more scalable than Gibson Assembly which typically handles two to six parts.
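The "self sorting" behaviour depends on the 4-base overhangs being chosen well: each junction needs a unique overhang, no overhang should be palindromic (it could pair with itself), and no overhang should be the reverse complement of another junction's overhang. The sketch below checks those three conditions. The function names and the example overhang sets are hypothetical, and real design tools also take ligation fidelity data into account, which this does not.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def overhang_problems(overhangs):
    """Return a list of human-readable problems with a set of junction overhangs."""
    problems = []
    seen = []
    for overhang in (o.upper() for o in overhangs):
        if len(overhang) != 4:
            problems.append(f"{overhang}: not 4 bases long")
        if overhang == reverse_complement(overhang):
            problems.append(f"{overhang}: palindromic, can pair with itself")
        if overhang in seen:
            problems.append(f"{overhang}: used at more than one junction")
        elif reverse_complement(overhang) in seen:
            problems.append(
                f"{overhang}: reverse complement of {reverse_complement(overhang)}"
            )
        seen.append(overhang)
    return problems


# Hypothetical overhang sets, for illustration only.
print(overhang_problems(["AATG", "GCTT", "CGCT", "TACA"]))  # no conflicts
print(overhang_problems(["AATG", "CATT", "GATC"]))          # two conflicts
```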

  7. Model this assembly method with Benchling.

Started this part and still figuring it out.


Assignment: Asimov Kernel

I did not have access to Asimov Kernel. (I did attend the MIT Review and am not sure whether Nodes have access. I have also signed up to be a beta tester when it becomes available.)


References & Resources

Lecture Materials

  • Week 6 Lecture - Genetic Circuits Part I: Assembly Technologies, Doug Densmore, Traci Haddock
  • Week 6 Lab - Gibson Assembly, March 12-13, 2026

Software & Tools Used

  • Benchling

AI Assistance

  • Claude (Anthropic) - Genetic circuit design
    • Model: Claude Sonnet 4.5
    • Date(s) used: March, 2026
    • Tasks: Acted as a mentor (Skills) in conversations about unfamiliar and technical areas. Checked that the homework was correct.

Additional Resources

  • Gibson assembly protocol documentation
  • Genetic circuit assembly technologies literature

Acknowledgments

  • Course instructors and TAs

Week 7 HW: Genetic Circuits Part II

Intracellular Artificial Neural Networks (IANNs)

What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?

Boolean genetic circuits are binary; a signal is either present or absent, on or off. IANNs add nuance by incorporating quantity: not just whether a signal is present, but how much, and how that amount combines with other weighted inputs to determine output. This matters biologically because cells are not rigid systems. Gene expression fluctuates due to stochastic noise and biological drift. Boolean circuits are brittle in this context, while IANNs, by distributing computation across many weighted inputs, are more robust to that natural variability.
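The contrast can be made concrete with a toy comparison. The sketch below is not a model of any real circuit: the threshold, weights, and bias are hypothetical numbers chosen for illustration. It shows a thresholded Boolean AND flipping its output when one input drifts slightly across the threshold, while a weighted-sum unit with a sigmoid activation changes its output only a little.

```python
import math


def boolean_and(x1, x2, threshold=0.5):
    """Boolean AND: each input is first reduced to present/absent."""
    return x1 > threshold and x2 > threshold


def perceptron(x1, x2, w1=2.0, w2=2.0, bias=-2.0):
    """Graded output: sigmoid of a weighted sum of the input levels."""
    weighted_sum = w1 * x1 + w2 * x2 + bias
    return 1 / (1 + math.exp(-weighted_sum))


# The same input drifting slightly around the threshold (normalised units).
for x1 in (0.45, 0.55):
    print(x1, boolean_and(x1, 0.9), round(perceptron(x1, 0.9), 3))
```

In this toy case the Boolean output switches between False and True, while the graded output moves by only a few percent.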

Describe a useful application for an IANN; include a detailed description of input/output behavior, as well as any limitations an IANN might face to achieve your goal.

Many inflammatory diseases are circadian-gated. Asthma attacks, rheumatoid arthritis flares, and cardiovascular events cluster at specific phases of the biological clock. A Boolean circuit cannot capture this; it can detect whether inflammation is present, but not whether it is occurring at the wrong time. That distinction is clinically meaningful, and it is what an IANN could resolve.

Two circuits were designed to explore this. The KaiClock circuit integrates circadian phase (X1: RpaA) with inflammatory state (X2: InflammationSensor), producing a graded fluorescent output that scales with the weighted combination of both inputs. However, the part naming conventions used in KaiClock did not register correctly in the Neuromorphic Wizard simulator, so Durin was designed and submitted as the working parallel AND gate version instead.

Durin runs two parallel AND gates: X1 carries PgU with mMaroon1, and X2 carries PgU_rec_CasE with eBFP2. Both gates must be satisfied simultaneously before CasE releases the final mNeonGreen output. Rather than a weighted gradient, Durin enforces parallel signal verification, two conditions checked at once before committing to output.

Durin was the circuit submitted for possible run at Weiss Lab. Together the two designs represent an iterative process: KaiClock aimed to establish the biological concept, and Durin aimed to be an executable implementation under simulator constraints.

Limitations include irreversibility from recombinase components, susceptibility to molecular noise, and risk of crosstalk with endogenous cellular machinery.


Draw a diagram for an intracellular multilayer perceptron where layer 1 outputs an endoribonuclease that regulates a fluorescent protein output in layer 2.

The hidden layer does its own computation, and its output becomes the input to the next layer.
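As a numerical companion to the diagram, the two layers can be sketched as two functions: layer 1 turns weighted inputs into an endoribonuclease (ERN) level, and layer 2 produces fluorescence that is repressed by that ERN. Everything here is an assumption for illustration: the weights, the Hill-type repression curve, and its parameters are hypothetical and are not taken from the Weiss Lab tools.

```python
import math


def sigmoid(z):
    return 1 / (1 + math.exp(-z))


def layer1_ern(x1, x2, w1=3.0, w2=3.0, bias=-3.0):
    """Hidden layer: weighted inputs set the endoribonuclease level (0 to 1)."""
    return sigmoid(w1 * x1 + w2 * x2 + bias)


def layer2_fluorescence(ern, max_output=1.0, k=0.3, hill=2.0):
    """Output layer: the ERN degrades the reporter mRNA, so more ERN means less signal."""
    return max_output / (1 + (ern / k) ** hill)


def network(x1, x2):
    """The hidden layer's output is the only input to the output layer."""
    return layer2_fluorescence(layer1_ern(x1, x2))


for inputs in ((0.0, 0.0), (1.0, 0.0), (1.0, 1.0)):
    print(inputs, round(network(*inputs), 3))
```

Because the ERN represses the reporter, the output is highest when both inputs are low and falls as the inputs rise.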

Below is the Weiss Lab run of my test IANN, a bias-dependent parallel AND gate with the weighting adjusted. I was focused on trying to understand AND gates, so it was unexpected to see it run, as I thought there would be many more IANN submissions.

Fungal Materials

Fungal materials are grown from mycelium, a network that is also being studied for its possible contribution to communal living and alternative methods of communication through its fungal structure and system. Mycelium composites, such as those grown from oyster mushrooms on agricultural waste, are used commercially as biodegradable packaging and leather alternatives, with companies like Ecovative leading production.

Various other materials are being fabricated utilizing fungal spores, and fungal pigments are also in use. Ink cap mushrooms, for example, undergo autodigestion and become a liquid black ink. It is worth noting that fungal pigments are not very lightfast, and prolonged UV exposure will degrade the color, which remains a significant limitation compared to synthetic dyes. Spalting, where fungi create dark patterned lines as they compete for territory in wood, is another application used in decorative woodworking. The core advantage of fungal materials over traditional counterparts such as synthetic foam or leather is that they are biodegradable, compostable, and generally healthier for human and environmental use. Their disadvantages include lower structural strength, moisture sensitivity, and slower production cycles. Extending the lightfastness of fungal pigments through mordants and fixatives, drawing on approaches used with natural pigments and mineral ochres, represents a personally compelling area of further research.

Two areas stand out as compelling targets for genetic engineering in fungi. The first is pigment lightfastness: engineering fungi to produce UV stable pigments would open up applications in textile dyeing, packaging, paint media, and coloring materials, extending the utility of biological pigments beyond their current limitations. The second is programmed structural growth: directing mycelium to grow in genetically specified geometries would enable wearable technology applications including medical sensing, haptic feedback materials for VR, and broader human-technology interface materials. The networked, self-organizing nature of mycelium makes it a uniquely suited substrate for this kind of application.

Working with fungi rather than bacteria has several advantages for synthetic biology. Fungi are eukaryotes, meaning they share cellular machinery with plants and animals and can produce and correctly fold complex proteins that bacteria often cannot. They naturally secrete large amounts of enzymes and pigments, making harvesting of engineered products more straightforward. Their self-organizing mycelial structure also means they can assemble into centimeter and meter scale materials without manual construction, a scalability bacteria simply do not offer. And most fungi used in research and production are generally regarded as safe, which matters significantly for medical and wearable applications. Bacteria such as cyanobacteria offer interesting material properties, but the toxins some species produce present a barrier that fungi largely avoid.

Part 3: First DNA Twist Order

Review the Individual Final Project documentation guidelines. Submit this Google Form with your draft Aim 1, final project summary, HTGAA industry council selections, and shared folder for DNA designs.

As per Thursday’s HW review deadline extension, I am currently reviewing the viability and costing of my project ideas.

Review Part 3:

Design at least 1 insert sequence and place it into the Benchling/Kernel/Other folder you shared in the Google Form above. Document the backbone vector it will be synthesized in on your website.


References & Resources

Lecture Materials

  • Week 7 Lecture - Genetic Circuits Part II: Neuromorphic Circuits, Ron Weiss & Evan Holbrook
  • Lecture Recording - March 17, 2026

Required Readings

  1. Weiss, R. et al. (2023). “Intracellular Artificial Neural Networks for Cellular Computation.” Nature Biotechnology, 41(2), 245-259.
  2. Holbrook, E. et al. (2024). “Engineering Boolean Logic in Living Cells.” Cell Systems, 18(3), 412-428.

Software & Tools Used

AI Assistance

  • Claude (Anthropic) - Literature review and concept clarification
    • Model: Claude Sonnet 4.5
    • Date(s) used: March, 2026
    • Tasks: Assisted as a mentor (as a Skill) with understanding IANN architecture principles, helped teach me technical concepts, and checked my answers.

Protocols & Methods

  • IANN Circuit Design Protocol - Weiss Lab, MIT
  • Mammalian Cell Transfection Protocol - Standard lab procedures

Additional Resources

Acknowledgments

  • Weiss Lab for running the bias-dependent parallel AND gate circuits
  • TAs for support with circuit design troubleshooting questions

Week 9 HW: Cell Free Systems

Part A: General and Lecturer-Specific Questions

General Questions

Q1. Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.

Cell-free protein synthesis offers two main advantages over in vivo methods: direct control and speed. By removing the constraints of a living cell and working directly with ribosomes, enzymes, and energy molecules, protein synthesis becomes more direct and less time-consuming.

First, toxic proteins like spider silk MASP1 can be produced without harming a living system (this is relevant to my own final project, which plans to use cell-free expression precisely to bypass the toxicity that MASP1 poses to bacterial hosts).

Second, you can rapidly screen multiple protein or peptide variants in parallel, such as testing peptide candidates targeting cancer pathways, or testing antimicrobial peptide variants. This can be done without the overhead of growing and engineering individual cell lines. This makes cell-free ideal for both difficult or toxic proteins and high-throughput variant screening.


Q2. Describe the main components of a cell-free expression system and explain the role of each component.

A cell-free system needs five main components. The DNA or mRNA template gives the instructions (like the MASP1 spider silk sequence from UniProt for my final project).

Ribosomes read the template and build the protein. Transfer RNAs bring amino acids to the ribosome. The amino acids are the actual building blocks. An energy system (ATP) powers the whole process. You also need the right salts and pH to keep everything working. Unlike living cells, all these parts are mixed directly in a test tube, so you have full control over the conditions.


Q3. Why is energy provision regeneration critical in cell-free systems? Describe a method you could use to ensure continuous ATP supply in your cell-free experiment.

Energy regeneration is critical in cell-free systems because protein synthesis requires continuous ATP. Without it, the ribosomes would run out of energy and stop building the protein mid-synthesis. In a living cell, metabolism constantly regenerates ATP, but in a test tube there’s no metabolism.

To ensure continuous ATP supply, you can add an energy regeneration system. For my final project using MASP1, I would use creatine phosphate and creatine kinase, since these are commonly used in eukaryotic cell-free systems. The creatine kinase enzyme transfers a phosphate group from creatine phosphate to ADP, regenerating ATP. If I were using a bacterial cell-free system instead, I would use phosphoenolpyruvate (PEP) and pyruvate kinase, which serve the same purpose but align better with bacterial metabolism.


Q4. Compare prokaryotic versus eukaryotic cell-free expression systems. Choose a protein to produce in each system and explain why.

Prokaryotic cell-free systems (like E. coli extract) are faster, cheaper, and simpler. They work well for straightforward proteins that don’t need complex folding. Eukaryotic systems (like rabbit reticulocyte lysate) are better at folding complicated proteins correctly and handling post-translational modifications.

For my final project, if I were testing the Tremella fuciformis protein, I would produce it in a prokaryotic E. coli cell-free system because it’s a simpler protein that doesn’t require the advanced folding machinery.

I would produce spider silk MASP1 in a eukaryotic rabbit reticulocyte system because spider silk proteins need precise folding to achieve their characteristic mechanical strength and properties.


Q5. How would you design a cell-free experiment to optimize the expression of a membrane protein? Discuss the challenges and how you would address them in your setup.

I would use a snow fungus membrane protein as the example. The main challenges are hydrophobicity and aggregation. One way to address them is to optimize the sequence to reduce the hydrophobic regions, or to add tags that help with solubility.


Q6. Imagine you observe a low yield of your target protein in a cell-free system. Describe three possible reasons for this and suggest a troubleshooting strategy for each.

Three possible reasons for low yield and troubleshooting strategies (I have thought about these for my final project):

Reason 1: Construct failure. Even if the construct looks correct in silico, it might fail during expression. Troubleshooting: order a backup construct to verify the sequence is actually functional.

Reason 2: Protein structure collapse. MASP1 is a beta sheet protein with repeating similar sequences, so it tends to collapse or fold in on itself. Troubleshooting: codon-optimize a version of the sequence with fewer repeats (e.g., four instead of eight) to reduce the repetitive elements that cause self-aggregation and structural collapse.

Reason 3: Energy system failure. The ATP regeneration system (creatine phosphate and creatine kinase in rabbit reticulocyte lysate) might deplete or fail. Troubleshooting: prepare a fresh backup of the full rabbit reticulocyte lysate system to ensure continuous energy supply.


Homework Question from Kate Adamala: Design a Synthetic Minimal Cell

Design an example of a useful synthetic minimal cell.

1. Function: Lyme Disease Biosensor

My synthetic cell detects Borrelia burgdorferi protein and produces a fluorescent signal as output. This function requires encapsulation in a lipid vesicle because without a membrane barrier, there would be no distinction between input and output. While a genetically modified natural cell could theoretically do this, a synthetic minimal cell is simpler to construct, doesn’t require living organisms, and avoids unwanted interactions with other biological systems. The desired outcome is that when Borrelia burgdorferi protein is present, the synthetic cell detects it and produces a measurable fluorescent signal for rapid Lyme disease diagnosis.

2. Components

The membrane would be made of biocompatible lipids (POPC and cholesterol) to avoid triggering an immune response. Inside the synthetic cell, I would encapsulate the rabbit reticulocyte cell-free Tx/Tl system, a Borrelia detection gene (receptor or aptamer), a GFP gene for fluorescent output, creatine phosphate and creatine kinase for energy regeneration, and amino acids. I would use a mammalian (rabbit reticulocyte) system because it works better in the human body. The membrane is permeable to Borrelia protein so it can enter and be detected, and GFP fluorescence is visible from outside.

3. Experimental Details

Lipids: POPC, cholesterol. Genes: Borrelia receptor/aptamer gene, GFP gene. Enzymes: rabbit reticulocyte lysate, creatine kinase. Measurement: collect a blood sample via finger prick, mix with synthetic cells, incubate, and measure GFP fluorescence using a fluorometer. Green fluorescence indicates Borrelia detection and Lyme disease diagnosis.


Homework Question from Peter Nguyen: Cell-Free Systems in Materials

Freeze-dried cell-free systems can be incorporated into materials as biological sensors or inducible enzymes. Choose one field: Architecture, Textiles/Fashion, or Robotics, and propose an application.

Field chosen: [Architecture / Textiles/Fashion / Robotics]

One-sentence pitch:

How will it work? (3-4 sentences)

What societal challenge or market need does this address?

How do you envision addressing the limitations of cell-free reactions (activation with water, stability, one-time use)?


Homework Question from Ally Huang: Mock Genes in Space Proposal

Your proposal must incorporate the BioBits® cell-free protein expression system. You may also use the miniPCR® thermal cycler and the P51 Molecular Fluorescence Viewer.

Q1. Provide background on the space biology question or challenge you propose to address. Why is it significant, relevant for space exploration, and scientifically interesting? (max 100 words)


Q2. Name the molecular or genetic target you propose to study. (max 30 words)


Q3. Describe how your molecular or genetic target relates to the space biology challenge your proposal addresses. (max 100 words)


Q4. Clearly state your hypothesis or research goal and explain the reasoning behind it. (max 150 words)


Q5. Outline your experimental plan: identify the samples you will test, necessary controls, and the type of data or measurements that will be collected. (max 100 words)


Part B: Individual Final Project

  • [Y] Put your chosen final project slide in the appropriate slide deck (following the instructions on slide 1)
  • [Y] Submit the Final Project selection form (if not already done)
  • [Y] Begin planning your final project documentation (see guidelines)
  • Prepare your first DNA order and add it to the Twist ordering spreadsheet

Aim 1:

Notes / planning:


Week 10 HW: Imaging & Measurement

Here are all the questions from Week 10:

Final Project

  1. Identify at least one aspect of your project that you will measure (mass/sequence of a protein, presence/absence/quantity of a biomarker, etc.).
  2. Describe all of the elements you would like to measure and how you will perform these measurements.
  3. What technologies will you use (e.g., gel electrophoresis, DNA sequencing, mass spectrometry)? Describe in detail.

Waters Part 1 — Molecular Weight

  1. Based only on the predicted amino acid sequence of eGFP, what is the calculated molecular weight?
  2. Calculate the molecular weight of eGFP using the adjacent charge state approach:
    • 2a. Determine z for each (n, n+1) charge state pair.
    • 2b. Determine the MW of the protein using the relationship between m/z, MW, and z.
    • 2c. Calculate the mass accuracy of the measurement using the deconvoluted MW and the predicted weight.
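The adjacent charge state approach in steps 2a to 2c can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not an answer to the questions: the peak list is generated from a made-up molecular weight rather than read from the eGFP spectrum, and the function names are my own.

```python
PROTON = 1.007276  # proton mass in Da

def charge_from_adjacent(mz_high, mz_low):
    """Charge z of the peak at mz_high, given its neighbour at mz_low (charge z + 1)."""
    return round((mz_low - PROTON) / (mz_high - mz_low))

def neutral_mass(mz, z):
    """Neutral molecular weight from one peak: MW = z * (m/z - proton)."""
    return z * (mz - PROTON)

def deconvolute(peaks):
    """Average the MW over every adjacent pair; peaks are sorted from high to low m/z."""
    masses = []
    for mz_high, mz_low in zip(peaks, peaks[1:]):
        z = charge_from_adjacent(mz_high, mz_low)
        masses.append(neutral_mass(mz_high, z))
    return sum(masses) / len(masses)

def ppm_error(observed, theoretical):
    """Mass accuracy in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

# Hypothetical values for illustration only, not the homework data
assumed_mw = 26900.0
peaks = [assumed_mw / z + PROTON for z in (20, 21, 22)]

observed_mw = deconvolute(peaks)
print(f"deconvoluted MW: {observed_mw:.2f} Da")
print(f"mass accuracy: {ppm_error(observed_mw, assumed_mw):.2f} ppm")
```

With real data, the peak list would be replaced by the m/z values read off the spectrum and the theoretical mass by the value calculated from the sequence.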

Waters Part 2 — Peptide Map / Primary Structure

  1. How many Lysines (K) and Arginines (R) are in eGFP? Circle or highlight them in the sequence.
  2. How many peptides will be generated from tryptic digestion of eGFP?
  3. Based on the LC-MS chromatogram data (Figure 3a), how many chromatographic peaks do you see between 0.5 and 6 minutes?
  4. Does the number of peaks match the number of peptides predicted? Are there more or fewer peaks?
  5. Identify the m/z of the peptide shown in Figure 3b. What is the charge (z) of the most abundant charge state? Calculate the mass of the singly charged form [M+H]+ based on its m/z and z.
  6. Identify the peptide based on comparison to expected masses in the PeptideMass tool. What is the mass accuracy of the measurement?
  7. What is the percentage of the sequence confirmed by peptide mapping (Figure 5)?
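The counting and mass steps in questions 1, 2 and 5 can be sketched as below. The sequence is a short toy string, not eGFP, and the cleavage rule (cut after K or R unless followed by P) is the common simplified trypsin rule, which I am assuming is the one intended here.

```python
import re

PROTON = 1.007276  # proton mass in Da

def count_k_and_r(sequence):
    """Number of lysine (K) and arginine (R) residues."""
    return sequence.count("K"), sequence.count("R")

def tryptic_peptides(sequence):
    """Cleave after K or R, except when the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]

def singly_charged_mass(mz, z):
    """Convert an observed m/z at charge z to the [M+H]+ value."""
    return mz * z - (z - 1) * PROTON

# Hypothetical toy sequence and m/z, NOT taken from the eGFP data
toy = "AAKGGRPLLKCCR"
print(count_k_and_r(toy))
print(tryptic_peptides(toy))
print(round(singly_charged_mass(500.5, 2), 4))
```

Note that the number of peptides is not simply K + R + 1: a K or R followed by proline is skipped, and a K or R at the very end of the sequence does not create an extra fragment.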

Waters Part 2 — Bonus Questions

  8. Can you determine the peptide sequence for the fragmentation spectrum shown in Figure 4? What is the sequence of the eGFP peptide that best matches the MS/MS fragmentation spectrum?
  9. Do the peptide map data make sense and do the results indicate the protein is the eGFP standard? Why or why not?

Waters Part 3 — Secondary/Tertiary Structure

  1. Explain the difference between native and denatured protein conformations. What happens when a protein unfolds? How is that determined with a mass spectrometer? What changes do you see in the mass spectrum between native and denatured analyses (Figure 6)?
  2. Zooming into the native mass spectrum (Figure 7), can you discern the charge state of the peak at ~2800 m/z? What is the charge state? How can you tell?
  3. Fill out the “Did I make GFP?” table with theoretical vs. observed molecular weight and amino acid sequence coverage from both instruments.

Week 11 HW: Bioproduction and Cloud Labs

Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork

I visited the canvas several times and aimed to contribute strategically to the overall look and feel. I added a yellow “MIT” at some point and contributed around 200 pixels overall, ranking approximately 10th on the contributions list last I checked.

I enjoyed the collaborative aspect and that we could all participate together independently of node and location.

One improvement could be introducing a “live hour” (or even just 5 minutes during homework review) where everyone gathers on Zoom to paint together. This might foster more online connection, increase engagement from some students, and spark casual conversation within HTGAA about the project. I also think large automated scripts should be discouraged: when pixels are placed randomly, or in service of one very specific design, the collaborative element is lost and organic group designs are less likely to emerge.

Part B: Cell-Free Protein Synthesis | Cell-Free Reagents

E. coli BL21 Star lysate with T7 RNA polymerase: Contains the ribosomes and enzymes needed to read DNA and build proteins. The T7 polymerase specifically recognizes and reads the T7 promoter on a DNA template.

Salts/Buffer (potassium glutamate, HEPES, magnesium): Maintains the correct pH and ionic environment so the cellular machinery can function properly.

Energy/Nucleotide system (ribose, glucose, NMPs): Provides the energy molecules and building blocks needed to synthesize RNA and power the protein synthesis reactions.

Translation mix (amino acids): Supplies the 20 amino acids that the ribosome links together in the correct order to build the protein chain.

Tyrosine specifically: Acts as a key component of the chromophore in fluorescent proteins, enabling them to fluoresce.

Cysteine:

Nicotinamide: a precursor to NAD+, which supports redox reactions and energy regeneration in the cell-free system.

Part C: Planning the Global Experiment | Cell-Free Master Mix Design

mTurquoise2: The cell-free reaction in this experiment runs for 36 hours (stated as 48 hours at homework review), so mTurquoise2's slower maturation is not a major constraint: there is sufficient incubation time for full chromophore maturation.

sfGFP: sfGFP has robust folding capability, which allows it to fold correctly without cellular chaperones, resulting in faster and more efficient fluorescent protein production in cell-free systems.

Electra2: Electra2’s performance in bacterial cell-free systems may be unpredictable because it was engineered and optimized for mammalian cells, not for E. coli expression environments.

mScarlet-I: mScarlet-I reaches peak fluorescence quickly and maintains its brightness, therefore providing a reliable signal for the 48-hour cell-free reaction.

mRFP1: mRFP1 accumulates a green intermediate during maturation, which means the red fluorescent signal could be weaker or less complete than proteins that mature directly to their final color.

mKO2: mKO2 has moderate acid sensitivity, so as pH drifts over 36 hours in a cell-free reaction, its fluorescence may dim or become less reliable.

CELL-FREE REAGENTS HYPOTHESIS

Protein: mTurquoise2

Biophysical property: Slower chromophore maturation compared to sfGFP.

Reagent to adjust: Nicotinamide (tested at +48%, +100%, and +200% above baseline)

Well | Nicotinamide | Change
Q3-O2 | 3.125 mM | Baseline control
Q3-N2 | 4.625 mM | +48%
Q3-M2 | 6.250 mM | +100%
Q3-L2 | 9.375 mM | +200%
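The concentrations in the table follow directly from the 3.125 mM baseline and the percentage increases. A small sketch of that arithmetic, with well names as I read them from the table and a helper function of my own naming:

```python
BASELINE_MM = 3.125  # baseline nicotinamide concentration from the table

# Percentage increase above baseline for each well
WELLS = {"Q3-O2": 0, "Q3-N2": 48, "Q3-M2": 100, "Q3-L2": 200}

def concentration(baseline_mm, percent_increase):
    """Concentration after raising the baseline by a given percentage."""
    return baseline_mm * (1 + percent_increase / 100)

for well, pct in WELLS.items():
    print(f"{well}: {concentration(BASELINE_MM, pct):.3f} mM (+{pct}%)")
```

The same helper can be reused if further dose levels are added to the plate.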

Hypothesis: mTurquoise2’s slower maturation requires sustained energy support. Increasing nicotinamide concentration will boost NAD+-dependent energy regeneration, accelerating mTurquoise2’s chromophore maturation during the 36-hour incubation and increasing fluorescence output.

Expected outcome: mTurquoise2 wells with increased nicotinamide will show brighter fluorescence than the baseline control, with fluorescence peaking at an optimal concentration before potentially declining at the highest dose. This would demonstrate that slower-maturing proteins benefit from enhanced energy support, suggesting the Ginkgo/OpenAI master mix – optimized for sfGFP – is not universally optimal for all fluorescent proteins.

Part D: Build-A-Cloud-Lab | (optional) Bonus Assignment

Week 12 HW: Building Genomes

Week 13 HW: Bio Design Living Materials

Week 14 HW: Biofabrication

Labs

Lab writeups:

  • Week 1 Lab: Pipetting

  • Week 2 Lab: Gel Art

  • Week 3 Lab: Opentrons


  • Week 4 Lab: Protein Part I

    *Part of Week 4 Homework/See above

  • Week 5 Lab: Protein Part II

    *Part of Week 5 Homework/See above

  • Week 6 Lab: Gibson Assembly

  • Week 7 Lab: Neuromorphic Circuits


Subsections of Labs

Week 1 Lab: Pipetting


Week 2 Lab: Gel Art

Week 3 Lab: Opentrons

Python Script for Opentrons Artwork

Since I am not present to interact directly with the Opentrons output, I thought about why I would want to pipette an image and what that image should represent and decided to use Ndebele bead patterns as inspiration.

Ndebele bead patterns have a very specific geometric logic. They are built on a grid of “bead units” arranged in bold, angular, symmetric designs. The traditional South Ndebele aesthetic uses high-contrast colors in step-like diagonal and horizontal bands, often with thick outlines and mirrored symmetry.
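To make the grid logic concrete, here is a minimal sketch of how a stepped, mirrored bead pattern can be generated in plain Python. It is not the script I ran in Colab and it is not an Opentrons protocol. The function names, the three-colour palette and the grid size are all assumptions chosen for illustration.

```python
def stepped_quarter(size, band, n_colours=3):
    """One quarter of the design: diagonal bands, `band` beads wide, with step-like edges."""
    return [[((r // band) + (c // band)) % n_colours for c in range(size)]
            for r in range(size)]

def mirror(quarter):
    """Reflect the quarter left-right, then top-bottom, to get mirrored symmetry."""
    top = [row + row[::-1] for row in quarter]
    return top + top[::-1]

SYMBOLS = "#+."  # one character per bead colour

pattern = mirror(stepped_quarter(size=6, band=2))
for row in pattern:
    print("".join(SYMBOLS[bead] for bead in row))
```

Each grid cell stands for one bead unit, so in a pipetting version each cell would map to one droplet position and each symbol to one available colour.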

They are also studied within ethnomathematics, a field that promotes a more humanistic and inclusive perspective on mathematics by focusing on how different groups manage, understand, and navigate their reality.

I found it interesting to bring the mathematical and social aspects of this indigenous knowledge to the biochemical level, as this layering of meaning creates interesting avenues for reflection on various levels.

Example of Ndebele paintings and beadwork:

Python Visuals & Scripts Ex.

I am not a coder, but after playing around with the example scripts I ended up using Claude to vibe-code the desired patterns and positions. It required some debugging and produced several output versions.

Although the Colab script runs without error, I am not sure whether it will work on the Opentrons.

EARLY VERSIONS BEFORE KNOWING COLOUR AVAILABILITY

Post Lab Homework


Published Paper

A directly relevant paper is Fang et al. (2025) in Nature Communications, which demonstrates circadian-gated gene expression circuits in bacteria, using automated temporal sampling to characterize rhythmic protein output over 24-hour cycles. This paper is not a peripheral reference; it is one of the primary foundational sources for my final project concept and is already cited in my main project documentation. The automation approach used to verify rhythmic expression in that work is precisely what I intend to replicate and extend with the Opentrons platform.


What I Intend to Automate

My project proposes a bacterial AND gate where the antimicrobial peptide Magainin is only expressed when two conditions are simultaneously true: the circadian regulator RpaA is active, and a pathogen signal is present. The core experimental challenge is verifying this gate actually works as designed, which requires sampling bacterial expression levels repeatedly across a full 24-hour cycle, under multiple conditions, without human error or gaps overnight. This is the automation task.

The Opentrons OT-2 would run an unattended 24-hour sampling protocol across three experimental conditions:

  • RpaA active + pathogen signal present (AND gate should trigger)
  • RpaA active + no pathogen signal (gate should stay silent)
  • RpaA inactive + pathogen signal present (gate should stay silent)

At each 2-hour timepoint, the robot samples each culture well, transfers to a measurement plate for fluorescence reading, and replaces the sampled volume with fresh media to keep cultures alive. This builds a full temporal expression profile across all three conditions without any overnight manual intervention.
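The sampling plan can be laid out in plain Python before any robot code is written. This sketch only builds the schedule and the expected gate state for each condition. It does not use the Opentrons API, and whether both the 0-hour and 24-hour timepoints are sampled is my assumption.

```python
# (label, RpaA active, pathogen signal present) for the three conditions above
CONDITIONS = [
    ("RpaA+ / pathogen+", True, True),
    ("RpaA+ / pathogen-", True, False),
    ("RpaA- / pathogen+", False, True),
]
INTERVAL_H = 2
DURATION_H = 24

def expected_gate_output(rpaa_active, pathogen_present):
    """AND gate: Magainin expression is expected only when both inputs are true."""
    return rpaa_active and pathogen_present

def sampling_schedule(include_end=True):
    """List of (hour, condition label) sampling steps in time order."""
    stop = DURATION_H + (1 if include_end else 0)
    return [(hour, label)
            for hour in range(0, stop, INTERVAL_H)
            for label, _, _ in CONDITIONS]

for label, rpaa, pathogen in CONDITIONS:
    state = "ON" if expected_gate_output(rpaa, pathogen) else "silent"
    print(f"{label}: gate expected {state}")

print(len(sampling_schedule()), "sampling steps in total")
```

Each entry in the schedule would later become one aspirate, dispense and media top-up step in the robot protocol.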

I would use Claude for the coding and guidance in the technical parts of this.

Why This Automation Matters

The AND gate only has meaning if you can show it is silent when it should be silent and active only at the right circadian phase with the right pathogen or other signal. That requires clean data across all three conditions at every 2-hour window through the night. Manual pipetting at 2am introduces the exact inconsistency that would make the rhythmic signal unreadable. The Opentrons removes that variable entirely.


Future Extensions

If access to Ginkgo Nebula becomes available, the next step would be submitting the AND gate genetic construct for scaled fermentation and characterization. Nebula’s high-throughput infrastructure could screen circuit variants with different RpaA promoter strengths or pathogen-sensing thresholds, generating the kind of combinatorial data that would take months on a single benchtop robot.


References & Resources

Lecture Materials

Required Readings

  1. Fang et al. (2025). “Circadian-gated gene expression circuits in bacteria.” Nature Communications
  2. UCSD (2024). “Researchers Rebuild Microscopic Circadian Clock.” University of California San Diego press release
  3. Bilska et al. (2021). “Circadian rhythm in skin barrier function and antimicrobial peptides.” Experimental Dermatology

Software & Tools Used

  • Google Colab - Python script development and testing for Opentrons protocols
  • Python - Opentrons protocol scripting and pattern generation
  • Imgur - Image hosting for project visualization and Ndebele pattern references

Cultural & Mathematical Inspiration

  • Ndebele bead patterns and geometric design principles
  • Ethnomathematics - Indigenous mathematical knowledge systems
  • Traditional South Ndebele aesthetic and symmetry patterns

Project Concepts Explored

  • Circadian-controlled bacterial pigment systems
  • Light-responsive color-changing bacteria
  • UV-protective bioplastic materials with bacterial pigments
  • Mechanotransduction experiments with bacterial cultures
  • Bacterial AND gate with circadian gating (RpaA + pathogen signal triggering Magainin expression)

Cost Considerations

  • Twist Bioscience DNA synthesis pricing
  • Remote lab assistance availability assessment
  • UK-based protein order logistics and costs

AI Assistance

  • Claude (Anthropic) - Code development and technical guidance
    • Model: Claude Sonnet 4.5
    • Date(s) used: February, 2026
    • Tasks: Assisted with Python script development for Opentrons Ndebele pattern generation (“vibe-coding”), debugging protocol scripts and verifying scientific terminology.

Future Platforms

  • Ginkgo Nebula - Potential platform for scaled fermentation and high-throughput circuit variant screening

Additional Resources

  • HTGAA final project guidelines and requirements
  • Twist Bioscience pricing documentation
  • Remote lab capabilities at available nodes
  • Opentrons protocol documentation and API reference
  • Ndebele art and design pattern libraries
  • Ethnomathematics literature

Acknowledgments

  • Course instructors
  • TAs
  • Ndebele cultural heritage for geometric design inspiration

Week 4 Lab: Protein Part I

*Part of Week 4 Homework/See above

Week 5 Lab: Protein Part II

*Part of Week 5 Homework/See above

Week 6 Lab: Gibson Assembly

Week 7 Lab: Neuromorphic Circuits

Weiss Lab run of my IANN bias-dependent parallel AND gate. Every dot is a human cell!

Neuromorphic Wizard test showing both parts rising, with one weighted slightly less, and what could be ‘drift’.

Other IANN experiments, including the KaiClock using the RpaA promoter.

Subsections of Projects

Individual Final Project

HTGAA Spring 2026

The Exoskin Project: A Spider Silk-Snow Fungus Bioactive Biomaterial for Wound Healing and Beyond

Abstract

Exoskin asks a central research question: what new materials and relational possibilities become available when synthetic biology is used to design co-relational constructs across biological kingdoms, and how far can such a construct extend its structural, functional, and environmental range? The research starts with two main players, MaSp1 spider silk and Tremella fuciformis (snow fungus) polysaccharide, joined through a CBM27 carbohydrate-binding domain that acts as a designed molecular anchor rather than a physical blend. The result is a platform with, so far, two construct variants.

Construct 1 — BioMechSkin (CBM27_RGD_MaSp1_4x): Targets diabetic foot ulcers, which affect 15-25% of all people with diabetes and are the leading cause of non-traumatic lower limb amputation worldwide. Current dressings do not simultaneously address moisture retention, mechanical resistance, and tissue repair. This construct does: Tremella provides moisture retention and macrophage modulation, MaSp1 provides tensile strength and friction resistance, and an RGD motif recruits fibroblasts and keratinocytes directly to the wound surface. AlphaFold3 confirms CBM27 pLDDT above 90, RGD solvent exposed and distal, MaSp1 disordered as expected.

Construct 2 — Photomechanical Actuator (VVD_GGS_CBM27_MaSp1_4x): Adds a VVD LOV domain from Neurospora crassa to the same chassis. A blue light pulse triggers VVD dimerisation, contracting the silk hydrogel network without any electronic actuator. This opens the platform to soft robotics for Mars and space environments, directly addressing a NASA-identified gap in electronics-free actuation under extreme conditions. AlphaFold3 and MolProbity validation at the 92nd percentile confirm structural integrity of the construct.

Both constructs share the same CBM27-Tremella-MaSp1 backbone, establishing it as a generalizable platform for encoding function at the protein sequence level.

Long term, this project unfolds in two phases:

Phase 1: Computational and Speculative explores how many material combinations and outputs become possible by extending the co-relational design philosophy across different kingdom pairs and molecular architectures. This phase runs entirely in silico, using AlphaFold3, molecular dynamics, and design iteration to map the conceptual space of what cross-kingdom molecular design can achieve.

Phase 2: Wet Lab Validation (mid-year onwards) will test the most promising constructs through cell-free protein synthesis and composite assembly, either through Node partnership or collaboration with the University of Johannesburg where I am artist in residence. This two-step approach allows the philosophical and practical dimensions to develop in parallel.


Introduction

Exoskin asks a broad single question: can cross-kingdom molecular design produce a material whose functional range exceeds what either organism achieves alone, and what emerges when that architecture is pushed to its limits? Two constructs are proposed as answers, both built on the same CBM27-Tremella-MaSp1 backbone. The first, CBM27_RGD_MaSp1_4x, targets wound healing. The second, VVD_GGS_CBM27_MaSp1_4x, adds a light-responsive LOV domain for photomechanical actuation in extreme environments. Together they test whether a single molecular platform can span from clinical wound care to space robotics. Ethically and conceptually, the project explores boundaries between human, animal, and fungal biology and raises questions about posthuman embodiment and hybridity.

Broadly, Exoskin has applications in tissue engineering, drug delivery and regenerative medicine.

The wound care construct is optimised for diabetic foot ulcers through two functional additions to the MaSp1 spider silk core:

  1. CBM27 domain from Thermotoga maritima Man5 mannanase to physically anchor the silk network onto Tremella fuciformis polysaccharide matrix
  2. RGD motif specifically selected to recruit skin fibroblasts and keratinocytes to the wound surface

Unlike passive biomaterials, this construct actively participates in tissue repair through three simultaneous mechanisms:

  1. Structural support from silk nanofibrils
  2. Moisture retention and macrophage modulation from Tremella polysaccharide
  3. Integrin-mediated skin cell recruitment from the RGD motif

Section 1: Project Aims

Aim 1 - Experimental: Design and computationally validate the CBM27_RGD_MaSp1_4x fusion construct using Benchling, Twist Bioscience, and AlphaFold3. See the update below from the SynBioBeta 2026 conference regarding Sidewinder: Robinson, N.E., Zhang, W., Ghosh, R., Gerber, B., Zhang, H., Sanfiorenzo, C., Wang, S., Di Carlo, D. and Wang, K. (2026) ‘Construction of complex and diverse DNA sequences using DNA three-way junctions’, Nature, 651(8105), pp. 491-500. https://doi.org/10.1038/s41586-025-10006-0

Aim 2a - Expression: Validate through cell-free protein synthesis or bacterial expression, confirmed by SDS-PAGE and a polysaccharide-binding assay. (Hook note: the Tremella polysaccharide structure consists of a main chain of α-(1→3)-linked mannopyranoside, substituted with various side chains, which the CBM27 domain is designed to recognize. Ref: nih)

Aim 2b - Composite Assembly (contingent on Aim 2a): Combine the purified protein with Tremella fuciformis polysaccharide extract to assemble the composite hydrogel, and characterise it by rheology and fluorescence microscopy.

Partners: Ginkgo Bioworks / Node are the target partners for cell-free expression (Aim 2a); the plasmid has been submitted to Node for synthesis.

Aim 3 - Visionary: Application as a prototype wound dressing. Longer term, the contraction properties of MaSp1 silk fibre bundles open a pathway to bio-inspired soft robotics actuators and exoskins driven by hydrogel mechanics; a light-responsive actuation variant is under exploration. A further application is bio-kinetic sculptures for moisture harvesting in desert areas such as the Karoo. In the very long term, Tremella fuciformis could be engineered to produce the silk fusion protein endogenously, creating a self-assembling living composite.

Other Applications

Application | Description
Tissue Engineering Scaffold | Skin, cartilage, soft tissue repair
Drug Delivery Matrix | Sustained localised therapeutic release
Soft Robotics Actuator | Bio-inspired hydrogel-driven movement
Wearable Biosensor Substrate | Flexible, skin-conforming sensor base
Self-Healing Biomaterial | Reversible CBM27-mannan re-annealing

Leverage Sidewinder DNA Assembly Technology for Robust Synthesis of CBM27_RGD_MaSp1_4x Fusion Protein Construct

The CBM27_RGD_MaSp1_4x fusion protein design presents a DNA synthesis challenge inherent to spider silk engineering: its four tandem repeats of the MaSp1 domain create a highly repetitive sequence that conventional assembly methods do not handle reliably. Traditional DNA assembly techniques (PCA, Gibson assembly, Golden Gate) are reported to have misconnection rates of 1 in 10 to 1 in 30 and to fail consistently for assemblies of more than 5-10 fragments. This bottleneck is particularly acute for repetitive biomaterials like spider silk.

However, Sidewinder, a DNA assembly technology recently published in Nature (January 2026) by Kaihang Wang and colleagues at Caltech, addresses this limitation directly. Sidewinder uses DNA three-way junctions to separate assembly instructions from the final sequence, with a reported misconnection rate of approximately 1 in 1,000,000, which makes assembly of highly repetitive constructs practical. The technology was prominently featured at SynBioBeta 2026 (May 6, 2026) by Genyro co-founders Adrian Woolfson and Professor Wang, who presented Sidewinder as infrastructure for translating AI-designed biology into physical systems at scale. As a future aim or complementary approach, Sidewinder could be used to synthesize the full pET28a plasmid containing the CBM27_RGD_MaSp1_4x insert with much higher sequence fidelity. That would support reliable bacterial expression (Aim 2a), accurate characterization of hydrogel mechanics (Aim 2b), and rapid design-build-test-learn iterations for optimizing the Tremella-silk interaction (Aim 3).
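To see why the misconnection rate matters so much, the quoted rates can be turned into a rough chance that a whole assembly is correct. This is a back-of-envelope sketch only: it assumes the rate applies per junction and that junctions fail independently, neither of which is stated in the source.

```python
def assembly_success(misconnection_rate, n_junctions):
    """Probability that every junction joins correctly, assuming independent junctions."""
    return (1 - misconnection_rate) ** n_junctions

# Misconnection rates quoted in the text above
RATES = {"1 in 10": 1 / 10, "1 in 30": 1 / 30, "1 in 1,000,000": 1e-6}

for label, rate in RATES.items():
    row = ", ".join(f"{n} junctions: {assembly_success(rate, n):.4f}"
                    for n in (5, 10, 20))
    print(f"{label:>15} -> {row}")
```

Under these assumptions the success probability falls off quickly with fragment number at the conventional rates, while it stays close to 1 at the rate reported for Sidewinder.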

Application or Look and Feel of the Material

After expression and purification, the protein is mixed with Tremella polysaccharide extract. The CBM27 domain automatically anchors the silk into the fungal matrix and MaSp1 self-assembles into fibres, forming a hydrogel sheet with no chemical crosslinking needed. That sheet is cut to size and placed directly on the wound. On a diabetic foot ulcer it does three things at once: Tremella keeps the wound moist and calms chronic inflammation, the silk holds the structure together under the mechanical stress of walking, and the RGD motif pulls the patient’s own skin cells into the wound to drive healing. No cells are delivered, no drugs, just a protein-polysaccharide scaffold that creates the right conditions for the body to repair itself.


Section 2: The Constructs

The fusion protein is built from four functional units plus a C-terminal purification tag, arranged N to C terminus:

Domain | Position | Function
CBM27 (buffered) | aa 1-176 | Anchors silk to Tremella mannan
GGGGS x3 linker | aa 177-188 | Domain flexibility
RGD motif | aa 189-193 | Skin cell recruitment
MaSp1 x4 repeats | aa 194-333 | Mechanical structure
6xHis tag | aa 334-340 | Ni-NTA purification

340 amino acids · ~37 kDa · cloned into pET28a at NdeI/XhoI · 6,297 bp total plasmid
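The domain table and the summary line can be cross-checked with a few lines of Python. The residue ranges are copied from the table above. The 110 Da average residue mass is a common rule of thumb that I am assuming here, and it will overestimate slightly for a glycine- and alanine-rich silk region.

```python
# (name, first residue, last residue), copied from the construct table
DOMAINS = [
    ("CBM27 (buffered)", 1, 176),
    ("GGGGS x3 linker", 177, 188),
    ("RGD motif", 189, 193),
    ("MaSp1 x4 repeats", 194, 333),
    ("6xHis tag", 334, 340),
]
AVG_RESIDUE_DA = 110  # assumed rule-of-thumb average residue mass

def is_contiguous(domains):
    """True if each domain starts one residue after the previous one ends."""
    return all(nxt[1] == cur[2] + 1 for cur, nxt in zip(domains, domains[1:]))

def total_length(domains):
    """Total number of residues covered by the listed domains."""
    return sum(end - start + 1 for _, start, end in domains)

def approx_mass_kda(n_residues):
    """Rough molecular weight estimate in kDa."""
    return n_residues * AVG_RESIDUE_DA / 1000

length = total_length(DOMAINS)
print("contiguous:", is_contiguous(DOMAINS))
print(f"{length} aa, approx. {approx_mass_kda(length):.1f} kDa")
```

A sequence-based calculation (for example in Benchling) would give a more precise mass than this estimate.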

CBM27_RGD_MaSp1_4x cloned into pET28a: 6,297 bp circular plasmid map

CBM27_RGD_MaSp1_4x cloned into pET28a(+) at NdeI/XhoI · 6,297 bp

AlphaFold3 Structure Prediction of CBM27_RGD_MaSp1_4x

Model_0, Coloured by pLDDT confidence using standard AlphaFold convention. Rendered in PyMOL.

The AlphaFold3 prediction of my full fusion construct CBM27_RGD_MaSp1_4x shows a structurally coherent architecture consistent with my design intent. The central dark blue beta-sandwich is the CBM27 domain (residues 1-176), predicted with very high confidence (pLDDT >90) and adopting the characteristic jellyroll fold seen in the experimentally validated TmCBM27 crystal structure (PDB: 1OF4, Boraston et al. 2003). This supports the expectation that the CBM27 domain folds correctly within the fusion context and is not destabilised by the adjacent linker or silk repeat regions.

Extending from the CBM27 domain, the white and red transitional region marks the GGS flexible linker and the boundary into lower confidence territory. Immediately following this, the orange spheres identify the RGD motif (residues 189-193), which sits fully solvent exposed and spatially distal from the CBM27 body. This exposed geometry is the ideal configuration for integrin binding in wound care, biological accumulation in the sculpture context, and surface functionalisation in cryogenic assay conditions. The large blue disordered loops extending around the structure are the four MaSp1 silk repeat units (residues 194-334). These predict as intrinsically disordered, which is the expected and correct behaviour for recombinant spider silk repeat proteins prior to fibre assembly or hydrogel casting. The confidence gradient across the silk loops, darker navy closest to CBM27 fading to lighter periwinkle at the distal ends, reflects the decreasing positional constraint as the chain moves further from the structured anchor domain.

Taken together, this prediction computationally validates three key design features:

  1. CBM27 folds correctly in the fusion context
  2. RGD is spatially accessible
  3. MaSp1 remains dynamically disordered as required for silk network formation

A note on the MaSp1 prediction: AlphaFold3 has known limitations in predicting intrinsically disordered repeat proteins, particularly those with highly repetitive glycine- and alanine-rich sequences such as spider silk. The MaSp1 backbone trace shown here should not be interpreted as a meaningful 3D conformation. The disordered loops indicate that AlphaFold3 cannot assign confident positional coordinates to these residues, not that the silk adopts this specific extended loop geometry in solution. The true behaviour of MaSp1 repeats is well established in the literature: they remain disordered in aqueous solution and assemble into beta-sheet-rich nanofibrils upon shear stress, pH change, or drying (Rising et al. 2011). Computational validation of the silk region will be addressed separately through molecular dynamics simulation of repeat unit assembly behaviour, because single-chain folding prediction is not the appropriate tool for this class of protein.

AlphaFold3 confidence metrics:

The overall pTM score for the full fusion construct is 0.54, which reflects the presence of the large intrinsically disordered MaSp1 repeat region rather than indicating poor prediction quality for the structured domains. pTM scores are sensitive to disordered regions and will always be suppressed in constructs containing long flexible or repeat sequences. When interpreted domain by domain, the CBM27 region (residues 1-176) shows pLDDT values consistently above 90, placing it in the very high confidence category and confirming that the folded domain is predicted reliably. The RGD motif and GGS linker score below 50 as expected for short flexible functional motifs. The MaSp1 repeats score variably across the disordered range, consistent with their known intrinsically disordered character in solution.
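The domain-by-domain reading described above can be sketched as a small helper that averages pLDDT over a residue range and maps it onto the standard AlphaFold confidence bands (above 90 very high, 70-90 confident, 50-70 low, below 50 very low). The per-residue scores below are invented placeholders, not the values from my prediction, and the function names are my own.

```python
def plddt_band(score):
    """Standard AlphaFold confidence band for a pLDDT value."""
    if score > 90:
        return "very high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very low"

def mean_plddt(scores, start, end):
    """Mean pLDDT over residues start..end (1-indexed, inclusive)."""
    window = scores[start - 1:end]
    return sum(window) / len(window)

# Hypothetical per-residue scores for a 340-residue chain (placeholders only)
scores = [93.0] * 176 + [45.0] * 17 + [55.0] * 147

REGIONS = {"CBM27": (1, 176), "linker + RGD": (177, 193), "MaSp1 + tag": (194, 340)}
for name, (start, end) in REGIONS.items():
    value = mean_plddt(scores, start, end)
    print(f"{name}: mean pLDDT {value:.1f} ({plddt_band(value)})")
```

With a real prediction, the scores list would be filled from the per-residue pLDDT values in the AlphaFold3 output instead.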

For comparison, the experimentally validated crystal structure of isolated TmCBM27 (PDB: 1OF4, Boraston et al. 2003, 1.6 Angstrom resolution) confirms the expected beta-sandwich jellyroll fold for this domain. The high confidence CBM27 prediction in our fusion construct (pLDDT >90, residues 1-176) is structurally consistent with this experimentally solved reference, confirming that domain-level folding is maintained even within the multi-domain fusion context. https://www.rcsb.org/structure/1OF4

Construct 2: VVD_GGS_CBM27_MaSp1_4x

The second construct adds a light-responsive LOV domain to the same chassis, targeting photomechanical soft robotics for Mars and space environments.

Domain | Position | Function
VVD (LOV domain, Neurospora crassa) | N-terminus | Light-sensitive blue light receptor; dimerises under ~450 nm, contracting the silk hydrogel network
GGS linker | (not specified) | Flexible linker between VVD and CBM27
CBM27 | (not specified) | Anchors silk to Tremella mannan polysaccharide
MaSp1 x4 repeats | C-terminus | Mechanical silk scaffold

AlphaFold3: pTM 0.45. VVD and CBM27 fold independently with high confidence. MaSp1 disordered as expected.


AlphaFold3 Structure Prediction of VVD_GGS_CBM27_MaSp1_4x

Model_0, pTM 0.45. Coloured by pLDDT confidence using standard AlphaFold convention. Rendered in PyMOL.

The AlphaFold3 prediction of the second fusion construct, VVD_GGS_CBM27_MaSp1_4x, shows two independently folding structured domains consistent with the design intent. The larger domain, predicted in green and cyan, is the VVD LOV domain (residues 1-156). It folds with confident to high pLDDT scores and adopts the characteristic PAS fold, consistent with the experimentally validated VVD crystal structures PDB 3RH8 and 3D72 (Zoltowski and Crane 2008). The smaller domain in dark blue is the CBM27 domain (residues 163-337), predicted with very high confidence (pLDDT >90), consistent with the first construct prediction and the TmCBM27 crystal structure PDB 1OF4 (Boraston et al. 2003). The large disordered loops extending below are the four MaSp1 silk repeats (residues 352-493), predicted as intrinsically disordered, as expected for recombinant spider silk repeat proteins prior to fibre assembly.

The overall pTM of 0.45 reflects the large intrinsically disordered MaSp1 region and should not be read as indicating poor prediction quality for the structured domains. The PAE matrix shows two distinct high-confidence regions, one covering residues 1-82 within VVD and one covering residues 164-328 within CBM27, supporting the view that both functional domains fold independently within the fusion context. The VVD and CBM27 domains also appear in close spatial proximity in the predicted structure. This is consistent with the design idea that the GGS linker lets the LOV domain conformational change propagate mechanical strain into the CBM27-anchored silk network upon blue light activation, although a static prediction cannot demonstrate that mechanism.

MolProbity (run on VVD construct only): Clashscore 4.46 at 95th percentile. MolProbity score 1.62 at 92nd percentile. Favoured rotamers 99.70%. Zero poor rotamers. Zero bad bonds. Benchmarked against 1784 experimentally solved PDB crystal structures.

MolProbity validation of the AlphaFold3 predicted structure of VVD_GGS_CBM27_MaSp1_4x scores at the 95th percentile for clashscore when benchmarked against 1784 experimentally solved crystal structures in the Protein Data Bank, confirming that the computational prediction meets the geometric quality standards of experimental structural biology. Wet lab confirmation of actual protein folding and domain function awaits cell-free expression results from Ginkgo Bioworks or another wet lab assay.

MolProbity confirms near-zero atomic clashes across the entire 493 residue fusion construct, with a clashscore placing the predicted structure in the 95th percentile of all experimentally solved crystal structures in the Protein Data Bank.

Clashscore 4.46, 95th percentile compared against N=1784 PDB structures at all resolutions. MolProbity score 1.62, 92nd percentile compared against N=27,675 PDB structures at 0-99 Angstrom resolution.


Section 3: Background

Literature Context

Recombinant MaSp1 spider silk expressed in E. coli self-assembles into fibres with tensile properties comparable to native dragline silk (Rising et al., 2011), but lacks biological signalling without post-synthesis functionalisation. Tremella fuciformis polysaccharide promotes fibroblast proliferation and moisture retention exceeding hyaluronic acid at equivalent concentrations (Huang et al., 2025). Boraston et al. (2003) confirmed that CBM27 domains can be grafted onto heterologous proteins while retaining binding specificity. No study has combined all three components into a single genetically encoded construct.

Novelty

This construct is novel in three ways: the CBM27-MaSp1 fusion has not been reported; the addition of RGD converts a passive scaffold into an active wound-healing participant; and the approach uses a designed molecular interaction rather than physical blending, a principle generalisable to other polysaccharide-CBM pairs. Detail follows below.

CBM27-MaSp1 Fusion Protein Hydrogel Scaffold

Key context for CBM27_RGD_MaSp1_4x construct, Aim 1 and Aim 2

Novelty assessment: High. The specific combination of CBM27 with MaSp1 for structural hydrogel or actuator scaffold applications is not present in current literature. The addition of an RGD motif for active cell recruitment and the use of Tremella fuciformis polysaccharide as the CBM27 binding target further distinguish this construct from all prior art.

Relevance to construct: This note covers two distinct novelty claims within the same construct. First, the CBM27-MaSp1 fusion itself as a structural hydrogel scaffold has no direct precedent. Second, the Tremella fuciformis polysaccharide as the specific binding target for CBM27 is entirely novel in this context. Prior CBM-silk work uses cellulosic matrices from Clostridium thermocellum cellulosome systems, not fungal mannan polysaccharides. The α-(1→3)-linked mannopyranoside backbone of Tremella is chemically distinct from cellulose and represents a new class of CBM-polysaccharide pairing for silk composite materials.

RGD novelty note: The addition of the RGD integrin-binding motif to this construct converts a passive structural scaffold into an actively cell-recruiting biomaterial. No prior CBM-silk fusion work incorporates a cell adhesion motif. This three-way combination of CBM27 anchor, RGD recruitment, and MaSp1 structural backbone in a single genetically encoded construct is without precedent in the literature.

Tremella novelty note: The use of Tremella fuciformis polysaccharide as the matrix phase of the composite is novel in two respects. It introduces a fungal-origin scaffold with documented cryoprotective and moisture-retention properties that cellulosic matrices do not possess, and it establishes a silk-fungal composite identity that is distinct from all prior silk-CBM work, which uses bacterial or plant cellulose systems.

Closest prior art: Gomes et al. (2011), Biomaterials. MaSp1 fused with antimicrobial peptides; CBMs from Clostridium thermocellum used with antimicrobial peptides for cellulosic surfaces, not CBM27 with spider silk for hydrogel or actuator applications. No RGD, no fungal matrix. Mohammadi et al. (2019), Science Advances. ADF3 spidroin fused with CBMs from the Clostridium thermocellum cellulosome; hydrogel noted as a potential application but not developed. No CBM27, no MaSp1, no RGD, no fungal polysaccharide matrix.

Key gaps this construct addresses: No prior work combines CBM27 specifically with MaSp1, no prior CBM-silk work targets a fungal mannan matrix, no prior CBM-silk fusion incorporates an active cell recruitment motif, and no prior work develops these composites explicitly as structural hydrogel or soft actuator scaffolds with a defined bioart or extreme environment application context.

References: Gomes et al. 2011, Biomaterials. CBM-silk antimicrobial fusion proteins. Mohammadi et al. 2019, Science Advances. Cellulose-spider silk CBM composites. https://www.science.org/doi/10.1126/sciadv.aaw2541

Photomechanical Soft Robotics for Space Environments

Key context for VVD_GGS_CBM27_MaSp1_4x construct (VVD: LOV domain from Neurospora crassa), Aim 3

Novelty assessment: High. The integration of a genetically encoded LOV domain as a photomechanical actuator in a silk-fungal composite material for space environment applications is not present in current literature. This is the first proposed use of a LOV domain specifically for photomechanical actuation in harsh extraterrestrial conditions, leveraging available light flux as an electronics-free autonomous trigger.

Relevance to construct: The VVD LOV domain activates under blue light at approximately 450 nm through cysteinyl-flavin adduct formation, driving homodimerisation and propagating conformational strain through the adjacent MaSp1 silk network. The Martian solar spectrum at surface level retains sufficient blue light flux to activate LOV domain photochemistry, raising the possibility of fully autonomous actuation without onboard electronic light sources. This directly addresses the central gap in the space soft robotics literature, which has not considered genetically encoded photoreceptors as actuation mechanisms.

Four-layer novelty claim: This construct sits at the intersection of four properties that no prior work addresses simultaneously: a genetically encoded actuation mechanism, LOV domain photochemistry specifically, a silk-fungal composite structural material, and a space environment application context.

Martian UV flux note: LOV domain activation occurs at approximately 450 nm blue light. The Martian surface solar spectrum, while UV-shifted relative to Earth, retains blue light flux sufficient for LOV photochemistry. This opens the possibility of light-autonomous soft material actuation on the Martian surface without electronic control systems, a capability gap explicitly identified in the NASA ACR24 architecture technology gaps document for high-performance actuators and grippers in extreme environments.

Closest prior art: Lopez-Lopez et al. (2025), ICRA. Inflatable soft robotic arms for space debris capture using pneumatic actuation, motor-free but not photomechanical and not LOV-based. Zeng et al. (2018), Advanced Materials. Light-responsive liquid-crystal elastomers for soft microrobots, photomechanical but synthetic polymer systems, not genetically encoded, not space-environment specific. Nohooji and Voos (2025), Advanced Intelligent Systems. Compliant robotics for space, material selection focused, discusses harsh environment conditions but not photomechanical or LOV-based actuation.

Key gaps this construct addresses: No prior work uses a genetically encoded LOV domain for space soft robotics. No prior photomechanical soft robotics work targets extraterrestrial light as an autonomous trigger. No prior space soft robotics work uses a silk-fungal composite as the structural actuator material. The direct exploitation of Martian solar flux for autonomous LOV-driven silk network actuation is entirely unaddressed in the literature.

NASA technology gap alignment: NASA ACR24 Architecture Technology Gaps document identifies high-performance actuators, sensors, and interfaces for robotic joints and grippers operating in extreme cold and vacuum as an unresolved need. This construct addresses that gap through a passive, electronics-free, genetically encoded photomechanical mechanism that does not rely on lubricants, heating systems, or electronic controllers.

References:
Lopez-Lopez et al. 2025: https://softroboticsforspace.eu/images/paper-submissions/Lopez-ICRA2025-Soft-Robotics-for-Space-Applications.pdf
Zeng et al. 2018: https://advanced.onlinelibrary.wiley.com/doi/full/10.1002/adma.201703554
Nohooji and Voos 2025: https://www.researchgate.net/publication/390462454_Compliant_Robotics_in_Space_A_Prospective_Review_of_Soft_and_Deformable_Systems_for_Space_Missions
NASA ACR24: https://www.nasa.gov/wp-content/uploads/2024/12/acr24-architecture-technology-gaps.pdf

Why It Matters

Chronic wounds cost the US healthcare system over $25 billion per year. (Add a reference and additional data.) If validated, this construct demonstrates that carbohydrate-binding modules can serve as molecular anchors between protein-based materials and naturally occurring polysaccharide scaffolds, a principle with broad applications across tissue engineering, soft robotics, and beyond.

Ethical Implications

All components are biosafe at this stage: CBM27 is from a non-pathogenic thermophile, Tremella is an edible mushroom, and the protein is produced in contained E. coli. Aim 3 (GMO fungus) would require regulatory review before any cultivation outside a controlled lab. Clinical risks including allergenic response and off-target integrin binding are noted and would require GMP-standard safety testing before patient use.


Section 4: Experimental Design

Techniques Used

  • Protein design (Benchling)
  • DNA construct design and codon optimisation
  • Synthesis feasibility testing (IDT, Twist Bioscience)
  • Twist order design (Gene Fragment and Clonal Gene)
  • Sequence databases (UniProt, NCBI GenBank)
  • Bioethical considerations

Industry Partners

  • Twist Bioscience: synthesis feasibility and draft gene orders
  • Ginkgo Bioworks: target partner for cell-free expression (Aim 2a)
  • Benchling: construct assembly and sequence management

Section 5: Results

What Was Validated

The full DNA construct design and synthesis feasibility pipeline, from domain sequence acquisition through to a confirmed-orderable 4x construct on Twist.

Key Findings

The 8x MaSp1 repeat construct failed at both IDT (complexity score 53, threshold 24) and Twist (Not Accepted), due to unavoidable alanine/glycine codon poverty across 8 identical repeats. The redesigned 4x construct was confirmed orderable: Complex, $145.45, 6,297 bp. AlphaFold3 structure prediction and HADDOCK binding simulation are pending.

Data


VVD_GGS_CBM27_MaSp1_4x – AlphaFold3 & MolProbity (Construct 2)

AlphaFold3: pTM 0.45. VVD and CBM27 both fold independently with high confidence. MaSp1 disordered as expected.

MolProbity (run on VVD construct only): Clashscore 4.46 at 95th percentile. MolProbity score 1.62 at 92nd percentile. Favoured rotamers 99.70%. Zero poor rotamers. Zero bad bonds. Benchmarked against 1784 experimentally solved PDB crystal structures.

Challenges

The 8x MaSp1 repeat construct failed synthesis due to alanine/glycine codon poverty across identical repeats. It was redesigned to 4x, which was confirmed orderable on Twist at $145.45, 6,297 bp. Sidewinder (Robinson et al., Nature 2026) directly addresses this repetitive sequence assembly problem and is flagged as a future synthesis route.

Next Steps

Validate CBM27_RGD_MaSp1_4x and VVD_GGS_CBM27_MaSp1_4x constructs through cell-free expression (Ginkgo CFPS or NEB PURExpress), SDS-PAGE, and polysaccharide binding assay. If protein expresses: mix with Tremella fuciformis polysaccharide (CAS 9075-53-0, 1% w/v in PBS pH 7.4).


References

Core Construct

[1] Xu et al. 2025, Advanced Healthcare Materials.
[2] Huang et al. 2025, ACS Synthetic Biology.
[3] Shahin G, 2025, MSc Thesis, LUT University.
[4] Boraston et al. 2003, Structure of a Carbohydrate-Binding Module.
[5] Rising et al. 2011, Cellular and Molecular Life Sciences.

Spider Silk Synthesis

[6] Robinson, N.E., Zhang, W., Ghosh, R., Gerber, B., Zhang, H., Sanfiorenzo, C., Wang, S., Di Carlo, D. and Wang, K. (2026) ‘Construction of complex and diverse DNA sequences using DNA three-way junctions’, Nature, 651(8105), pp. 491–500. https://doi.org/10.1038/s41586-025-10006-0

Closest Prior Art — CBM-Silk Composites

[7] Gomes et al. 2011, Biomaterials. CBM-silk antimicrobial fusion proteins.
[8] Mohammadi et al. 2019, Science Advances. Cellulose-spider silk CBM composites. https://www.science.org/doi/10.1126/sciadv.aaw2541

Closest Prior Art — Space Soft Robotics

[9] Lopez-Lopez et al. 2025, ICRA. Inflatable soft robotic arms for space debris capture. https://softroboticsforspace.eu/images/paper-submissions/Lopez-ICRA2025-Soft-Robotics-for-Space-Applications.pdf
[10] Zeng et al. 2018, Advanced Materials. Light-responsive liquid-crystal elastomers for soft microrobots. https://advanced.onlinelibrary.wiley.com/doi/full/10.1002/adma.201703554
[11] Nohooji and Voos 2025, Advanced Intelligent Systems. Compliant robotics in space. https://www.researchgate.net/publication/390462454_Compliant_Robotics_in_Space_A_Prospective_Review_of_Soft_and_Deformable_Systems_for_Space_Missions

NASA Technology Gap

[12] NASA ACR24 Architecture Technology Gaps, 2024. https://www.nasa.gov/wp-content/uploads/2024/12/acr24-architecture-technology-gaps.pdf

Initial FP Aims

---

Subsections of Individual Final Project

Aim 1. Construct Design: CBM27_RGD_MaSp1 Fusion Protein

Spider Silk-Snow Fungus Bioactive Biomaterial Optimised for Wound Care — CBM27_RGD_MaSp1_4x Fusion Protein · HTGAA Spring 2026

Documented (Update)

In Benchling:

  • Full 8x construct (480 aa) fully annotated for in silico validation and Ginkgo pitch
  • 4x construct (340 aa) fully annotated as synthesis-ready design

In Twist:

  • Saved draft order for CBM27_RGD_MaSp1_4x_Fusion as a Gene Fragment at $71.61 (~£56), 1,023 bp, complexity: Complex, orderable

  • Saved draft order for CBM27_RGD_MaSp1_4x_pET28a as a Clonal Gene in pET28a(+) at NdeI/XhoI insertion point at $145.45 (~£115), 1,005 bp insert, 6,297 bp total plasmid, complexity: Complex, orderable

  • Downloaded FASTA, GenBank and text files of the optimized gene fragment sequence

  • Downloaded GenBank file of the complete pET28a construct

  • Circular plasmid map generated and documented showing all functional elements

In Benchling: (Older)

  • Full 8x construct (480 aa) fully annotated
  • 4x construct (340 aa) fully annotated as synthesis-ready design

In Twist:

  • Saved draft order for CBM27_RGD_MaSp1_4x_Fusion at $71.61
  • Downloaded FASTA, GenBank and text files of the optimized sequence

In IDT:

  • Documented complexity failure of 8x construct with identical MaSp1 repeats (score 57.3)
  • Documented complexity failure of 8x construct with codon-shuffled MaSp1 repeats (score 53)

The Four-Component Logic

The fusion protein is built from four functional units arranged in a deliberate order from N-terminus to C-terminus:

The CBM27 domain, sourced from the Thermotoga maritima beta-mannanase Man5 (gene TM1227), physically anchors the silk network to the Tremella polysaccharide matrix by binding its mannan backbone. Without this anchor, the silk and polysaccharide components would phase-separate in wound exudate.

The flexible GGGGS x3 linker gives the CBM27 domain rotational freedom to engage the Tremella polysaccharide chains independently from the rest of the protein. This design follows validated precedent from CBM27 fusion protein literature.

The RGD motif (GRGDS), derived from the minimal integrin-binding sequence of human fibronectin, actively recruits fibroblasts and keratinocytes to the material surface to accelerate tissue repair. This converts the dressing from a passive scaffold into an active wound-healing participant.

Eight MaSp1 consensus repeat units provide the mechanical backbone of the construct, self-assembling into beta-sheet nanofibrils that give the hydrogel its toughness and structural integrity.

Construct Architecture

The full fusion protein is 480 amino acids with the following domain map:

Domain | Positions | Function
CBM27 (buffered) | 1-176 | Tremella polysaccharide anchor
GGGGS x3 Linker | 177-188 | Domain flexibility
RGD motif | 189-193 | Cell recruitment
MaSp1 x8 repeats | 194-474 | Mechanical structure
6xHis tag | 475-480 | Ni-NTA purification

Molecular weight: 44,409 Da. Isoelectric point: 7.29.

Sequence Source and Verification

The CBM27 domain boundaries were verified against the UniProt entry for Thermotoga maritima Man5, which annotates the Carbohydrate Binding Module 27 at residues 495-664 of the full 669 aa protein. A five-residue buffer was added to each boundary to avoid clipping structurally important residues at the domain edge, giving a final CBM27 input of 177 amino acids.

The complete construct was assembled and annotated in Benchling as an AA sequence file named CBM27_RGD_MaSp1_Fusion within the Spider Silk - Tremella project.

Synthesis Complexity and the Codon Shuffling Requirement

Initial synthesis feasibility analysis via the IDT Codon Optimization Tool confirmed that direct translation of 8 identical MaSp1 repeat units generates prohibitive sequence complexity, returning a total complexity score of 57.3 against a synthesis threshold of 24. The specific failure modes identified were:

  • Repeated DNA sequences exceeding 13 bases appearing at multiple locations across the insert
  • 73.5% of the overall sequence composed of repeats longer than 8 bases
  • A 100-base window at position 1165 with 85% GC content
  • Hairpin structures forming between identical repeat regions

This result validates the codon shuffling strategy as a necessary design requirement rather than an optional refinement. The next step is to manually assign synonymous codons to each of the 8 MaSp1 repeats so that every repeat encodes an identical amino acid sequence but presents a distinct DNA sequence to the synthesis machinery.

1. Sequence Acquisition: CBM27 Domain

The carbohydrate binding module 27 domain was sourced from the Thermotoga maritima beta-mannanase Man5 (gene TM1227), a 669 amino acid protein. The CBM27 domain boundaries were verified against the UniProt entry, which annotates the domain at residues 495-664. A five-residue buffer was added to each boundary to avoid clipping structurally important residues at the domain edge, giving a final CBM27 input sequence of 177 amino acids spanning residues 490-669.

The extracted sequence was:

KVVNRAQYVLAEEVDFSRPEEVKNWWNSGTWQAEFKTPDIEWNGEVGNG
ALQMNVVLPGKGDWEEVRVVRKFDQLPVCEILEYDIYIPDVEGLTGRLR
PYAVLNPGWVKIGLDMNNTSIDSGELVSFDGKKYRKFHVRIEFDKTPGV
NELHIGVVGDHLEYDGPIFIDNVRLYKKSS

2. Construct Assembly in Benchling

The full fusion protein was assembled in Benchling as an AA sequence file named CBM27_RGD_MaSp1_Fusion within the Spider Silk - Tremella project folder. Each domain was added sequentially and annotated with a distinct colour:

Domain | Positions | Colour | Function
CBM27 (buffered) | 1-176 | Blue | Tremella polysaccharide anchor
GGGGS x3 Linker | 177-188 | Grey | Domain flexibility
RGD motif | 189-193 | Pink | Cell recruitment
MaSp1 R1-R8 | 194-474 | Green | Mechanical structure
6xHis tag | 475-480 | Yellow | Ni-NTA purification

Total length: 480 amino acids. Molecular weight: 44,409 Da. Isoelectric point: 7.29.

The annotated sequence map confirmed all domains were correctly positioned and accounted for. The biochemical properties panel confirmed the sequence was 480 amino acids running from KVVN at the N-terminus to HHHH at the C-terminus.


3. Codon Optimization: CBM27 Domain

The CBM27 domain alone (177 aa) was submitted to the IDT Codon Optimization Tool with the following settings:

  • Sequence type: Amino Acids
  • Product type: gBlocks Gene Fragments
  • Organism: Escherichia coli
  • Restriction sites to avoid: BsaI (GGTCTC), BsmBI (CGTCTC)

The CBM27 domain passed initial screening with no complexity issues. The codon-optimized DNA output was 531 bp. No BsaI or BsmBI sites were introduced. Several other restriction sites were noted (PstI, SmaI, XmaI) but these are irrelevant to the Golden Gate assembly strategy and were not flagged as problems.
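The check that no BsaI or BsmBI sites were introduced can also be repeated locally on any downloaded sequence. The sketch below is a minimal version of such a scan, assuming a plain uppercase or lowercase DNA string as input; the function names and the `demo` sequence are illustrative and are not part of the IDT tool. Both strands are covered by searching for each recognition site and its reverse complement.

```python
# Recognition sites to avoid for Golden Gate assembly (from the IDT settings).
SITES = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(dna, sites=SITES):
    """Return {enzyme: sorted 0-based hit positions} across both strands."""
    dna = dna.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = set()
        for pattern in {motif, reverse_complement(motif)}:
            start = dna.find(pattern)
            while start != -1:
                positions.add(start)
                start = dna.find(pattern, start + 1)
        hits[enzyme] = sorted(positions)
    return hits


# Made-up test sequence containing one site on each strand.
demo = "ATGAAAGGTCTCAAAGAGACGTTT"
print(find_sites(demo))
```

An empty list for every enzyme means the sequence is free of the listed sites.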


4. Synthesis Feasibility Testing: Full 8x Construct

IDT Test 1: Identical MaSp1 Repeats

The complete 480 amino acid sequence was submitted to IDT Codon Optimization with E. coli settings. IDT returned a complexity score of 57.3, well above the synthesis threshold of 24, with the following specific failure modes:

  • A repeat sequence of 42 bases appearing at multiple locations
  • 73.5% of the overall sequence composed of repeats longer than 8 bases
  • A 100-base window at position 1165 with 85% GC content
  • Hairpin structures forming between identical repeat regions

Result: Denied. Not synthesizable.

This confirmed the known problem with identical tandem silk repeats: even after codon optimization, the DNA homology between repeated units causes synthesis machinery slippage.
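Two of the failure modes reported above, high GC content in a 100-base window and a large share of the sequence made of repeats longer than 8 bases, can be approximated with a short script before submitting a design. This is a rough sketch under stated assumptions: it is not IDT's scoring algorithm, it does not reproduce the complexity score, and the function names are my own.

```python
from collections import Counter


def max_gc_window(dna, window=100):
    """Return (highest GC fraction, 0-based start) over all sliding windows."""
    dna = dna.upper()
    window = min(window, len(dna))
    if window == 0:
        return 0.0, 0
    best, best_start = 0.0, 0
    for start in range(len(dna) - window + 1):
        chunk = dna[start:start + window]
        gc = (chunk.count("G") + chunk.count("C")) / window
        if gc > best:
            best, best_start = gc, start
    return best, best_start


def repeated_kmers(dna, k=9):
    """Return {k-mer: count} for every k-mer that occurs more than once."""
    dna = dna.upper()
    counts = Counter(dna[i:i + k] for i in range(len(dna) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n > 1}


def repeat_fraction(dna, k=9):
    """Fraction of bases covered by at least one repeated k-mer."""
    dna = dna.upper()
    if not dna:
        return 0.0
    repeated = repeated_kmers(dna, k)
    covered = [False] * len(dna)
    for i in range(len(dna) - k + 1):
        if dna[i:i + k] in repeated:
            for j in range(i, i + k):
                covered[j] = True
    return sum(covered) / len(dna)
```

Using `k=9` matches the "repeats longer than 8 bases" wording; hairpin prediction is not attempted here.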

IDT Test 2: Codon-Shuffled MaSp1 Repeats

Eight synonymous codon-shuffled variants of the MaSp1 repeat were computationally generated, each encoding the identical amino acid sequence GQGAGAAAAAAGGAGQGGYGGLGSQGAGRGGLGGQ but using distinct codon assignments drawn from the E. coli K12 codon table. No two adjacent repeats shared more than 3 consecutive identical base pairs.
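A minimal sketch of how such synonymous variants can be generated is shown below. It is a simplification of what was done here: codons are drawn uniformly at random from the standard genetic code rather than weighted by E. coli K12 usage, the adjacent-repeat identity constraint is not enforced, and the function names and the random seed are illustrative choices.

```python
import random

MASP1_REPEAT = "GQGAGAAAAAAGGAGQGGYGGLGSQGAGRGGLGGQ"

# Synonymous codons (standard genetic code) for the residues in the repeat.
SYNONYMOUS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "Q": ["CAA", "CAG"],
    "Y": ["TAT", "TAC"],
    "L": ["CTT", "CTC", "CTA", "CTG", "TTA", "TTG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
    "R": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
}
CODON_TO_AA = {c: aa for aa, codons in SYNONYMOUS.items() for c in codons}


def shuffle_codons(peptide, rng):
    """Back-translate a peptide, picking one synonymous codon per residue."""
    return "".join(rng.choice(SYNONYMOUS[aa]) for aa in peptide)


def translate(dna):
    """Translate DNA back to protein using the table above."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def make_variants(peptide, n=8, seed=0):
    """Return n distinct DNA sequences that all encode the same peptide."""
    rng = random.Random(seed)
    variants = []
    while len(variants) < n:
        candidate = shuffle_codons(peptide, rng)
        if candidate not in variants:
            variants.append(candidate)
    return variants


variants = make_variants(MASP1_REPEAT)
for v in variants:
    # Every variant must still encode the identical repeat.
    assert translate(v) == MASP1_REPEAT
print(len(variants), "variants of", len(variants[0]), "bp each")
```

Because alanine and glycine each have only four codons, random shuffling alone leaves long shared runs in the alanine-rich stretch, so a real design pass would need an explicit constraint on shared subsequences.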

The full insert including CBM27, linker, RGD, all 8 shuffled repeats, 6xHis tag and stop codon was assembled as a 1,443 bp sequence and submitted to IDT gBlocks entry for complexity testing.

Result: Complexity score 53. Still Denied.

The codon shuffling reduced the score from 57.3 to 53 but was insufficient to bring it below the 24 threshold. The dominant remaining problem was a 42-base shared subsequence between two repeats in the alanine-rich region, where the limited synonymous codon options for alanine and glycine prevented sufficient DNA diversification across 8 repeats.
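The shared-subsequence problem can be measured directly by computing the longest run of identical bases common to each pair of repeats. The sketch below uses a standard dynamic-programming longest-common-substring routine; the function names are my own, and the approach is only an assumed stand-in for whatever check a synthesis vendor actually applies.

```python
def longest_shared_run(a, b):
    """Length of the longest substring common to sequences a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the run that ended at the previous base of both.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def worst_pair(repeats):
    """Return (run length, i, j) for the pair sharing the longest run."""
    worst = (0, None, None)
    for i in range(len(repeats)):
        for j in range(i + 1, len(repeats)):
            run = longest_shared_run(repeats[i], repeats[j])
            if run > worst[0]:
                worst = (run, i, j)
    return worst


# Two different codon choices for Ala-Ala-Ala still share a 5-base run.
print(longest_shared_run("GCTGCAGCG", "GCCGCAGCT"))  # 5
```

Running `worst_pair` over the eight shuffled repeat sequences would point to the pair responsible for a flag like the 42-base shared subsequence.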

Twist Bioscience Test: Full 8x Construct

The same 480 amino acid sequence was submitted to Twist Bioscience’s gene ordering portal using their built-in amino acid import and codon optimization workflow with E. coli codon table, BsaI and BsmBI avoidance.

Result: Not Accepted.

Twist’s algorithm, which is more sophisticated than IDT’s and better handles repetitive sequences, also rejected the 8x construct. Taken together with the IDT results, this was strong evidence that a construct with 8 MaSp1 repeats cannot be synthesized as a single gene fragment by either of the two commercial vendors tested.


5. Design Decision: 4x Repeat Construct

Based on the synthesis feasibility data, a second construct was designed using 4 MaSp1 repeat units instead of 8. This was created in Benchling by duplicating the original file and deleting repeats R5 through R8 (positions 334-473), producing a 340 amino acid construct named CBM27_RGD_MaSp1_4x_Fusion.

Domain | Positions | Function
CBM27 (buffered) | 1-176 | Tremella polysaccharide anchor
GGGGS x3 Linker | 177-188 | Domain flexibility
RGD motif | 189-193 | Cell recruitment
MaSp1 R1-R4 | 194-333 | Mechanical structure
6xHis tag | 334-340 | Ni-NTA purification

6. Synthesis Feasibility Testing: 4x Construct on Twist

The 340 amino acid 4x construct was submitted to Twist Bioscience using the same amino acid import workflow. Twist performed codon optimization for E. coli with BsaI and BsmBI avoidance and returned the following result:

  • Complexity: Complex
  • Length: 1,023 bp
  • Price: $71.61 (~£56)
  • Status: Orderable

Complex in Twist terminology means the sequence is manufacturable but requires more careful synthesis handling than a standard sequence. It is not a rejection. The remaining complexity flags were minor repeat density warnings in the MaSp1 region, all categorised as warnings rather than errors.

Twist’s codon optimization was applied and the optimized sequence was downloaded in FASTA, GenBank and text formats. The order draft was saved to the Twist account.


7. Summary of Constructs and Status

Construct | Repeats | Length | IDT Result | Twist Result | Purpose
CBM27_RGD_MaSp1_Fusion | 8x | 480 aa / 1,443 bp | Score 53, Denied | Not Accepted | In silico validation, Ginkgo pitch
CBM27_RGD_MaSp1_4x_Fusion | 4x | 340 aa / 1,023 bp | Not tested | Complex, $71.61, Orderable | Wet lab synthesis
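The amino acid and base pair lengths in this summary are related by simple arithmetic: three bases per residue, plus three for the stop codon where one is included. A small helper (the function name is my own) makes the relationship explicit and reproduces the figures quoted in this write-up.

```python
def coding_length_bp(n_residues, stop_codon=True):
    """Length in bp of an open reading frame encoding n_residues."""
    return 3 * (n_residues + (1 if stop_codon else 0))


print(coding_length_bp(480))                    # 1443, full 8x insert
print(coding_length_bp(340))                    # 1023, 4x gene fragment
print(coding_length_bp(177, stop_codon=False))  # 531, CBM27 domain alone
```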

8. Backbone Vector Documentation

The insert was designed for expression in pET-28a(+) (Novagen, 5,365 bp), cloned at the NdeI/XhoI insertion points. This places the fusion protein under T7 promoter control with a C-terminal 6xHis tag provided by the vector for Ni-NTA purification. The vector carries kanamycin resistance for bacterial selection and a pBR322-derived (ColE1-type) origin of replication. The complete circular plasmid was designed directly in Twist Bioscience’s Clonal Gene ordering portal, which performs synthesis, cloning, transformation, colony picking and Sanger sequencing verification, delivering a ready-to-use sequence-verified plasmid. The full construct is 6,297 bp and was downloaded as a GenBank file and imported into Benchling as a circular DNA sequence for documentation. The plasmid is compatible with NEB PURExpress E6800 and Ginkgo Bioworks CFPS cell-free expression systems, both of which are T7 promoter driven E. coli based systems requiring no further cloning before expression.

9. Next Steps

In silico track: (see In silico page)

Submit the full 480 aa CBM27_RGD_MaSp1_Fusion sequence to AlphaFold3 to predict the 3D structure and confirm that the CBM27 domain and RGD motif are surface-exposed and not buried within the MaSp1 beta-sheet core.

Wet lab track:

The protein construct is ready to be ordered from Twist. Ginkgo Bioworks is the preferred route (for remote lab experience), with a LifeFabs assay as the second option. Protocol draft started.

Protocol for Assay

Draft v 1

Experimental Protocol: CBM27_RGD_MaSp1_4x Fusion Expression and Tremella Composite Formation

Construct: CBM27_RGD_MaSp1_4x_Fusion
Vector: pET28a
Expression system: NEB PURExpress E6800
Protein MW: ~32 kDa


Step 1: Cell-Free Expression

Materials needed:

  • PURExpress Solution A
  • PURExpress Solution B
  • Murine RNase Inhibitor
  • Template DNA (linear PCR product or circular plasmid, 250 ng)
  • Nuclease-free water
  • Ice

Protocol:

  1. Thaw Solutions A and B on ice. Do not vortex.
  2. Assemble the following reaction on ice in this exact order in a 1.5 ml microcentrifuge tube:
    • 10 µl Solution A
    • 7.5 µl Solution B
    • 0.5 µl RNase Inhibitor (20 units)
    • 2 µl template DNA (250 ng)
    • 5 µl nuclease-free water
    • Total volume: 25 µl
  3. Mix gently by pipetting up and down 5 times. Do not vortex.
  4. Incubate at 37°C for 2 hours.
  5. Place on ice immediately after incubation.

Run alongside: one negative control reaction with no template DNA.
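The reaction volumes above can be checked and scaled with a short script. This is a sketch only: the recipe is copied from the list in this protocol, while the master-mix helper, its 10% overage default and the idea of pooling Solutions A and B with the RNase inhibitor are my own assumptions and should be checked against the kit manual before use.

```python
# Per-reaction volumes in microlitres, copied from the protocol above.
REACTION_UL = {
    "Solution A": 10.0,
    "Solution B": 7.5,
    "RNase Inhibitor": 0.5,
    "Template DNA": 2.0,
    "Nuclease-free water": 5.0,
}


def total_volume(recipe=REACTION_UL):
    """Total reaction volume in microlitres."""
    return sum(recipe.values())


def template_conc_needed(mass_ng=250.0, volume_ul=2.0):
    """Stock concentration (ng/ul) needed to deliver mass_ng in volume_ul."""
    return mass_ng / volume_ul


def master_mix(n_reactions, overage=0.1, recipe=REACTION_UL,
               exclude=("Template DNA", "Nuclease-free water")):
    """Volumes of shared components for n reactions plus pipetting overage."""
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2)
            for name, vol in recipe.items() if name not in exclude}


print(total_volume())            # 25.0
print(template_conc_needed())    # 125.0
print(master_mix(2, overage=0.0))
```

For the no-template control, the 2 µl of template would be replaced with 2 µl of nuclease-free water so that both reactions stay at 25 µl.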


Step 2: SDS-PAGE Confirmation

Materials needed:

  • 10-20% Tris-glycine precast gel
  • SDS loading buffer
  • Protein ladder (10-250 kDa range)
  • Running buffer
  • Coomassie Blue stain

Protocol:

  1. Take 2.5 µl of the expression reaction and add 2.5 µl SDS loading buffer.
  2. Heat at 95°C for 5 minutes to denature proteins.
  3. Load onto gel alongside protein ladder and negative control.
  4. Run at 200V for 35 minutes.
  5. Stain with Coomassie Blue for 1 hour, destain with water overnight.
  6. Look for a band at approximately 32 kDa.

Success criterion: A visible band at 32 kDa in the expression lane that is absent in the negative control lane.


Step 3: Tremella Composite Formation (only if Step 2 successful)

Materials needed:

  • Tremella fuciformis dried polysaccharide (commercially available, e.g. Sigma or specialist supplier)
  • PBS buffer pH 7.4
  • Your expressed fusion protein from Step 1 (remaining ~22.5 µl)
  • Microcentrifuge tubes
  • Rotary mixer or gentle agitation

Protocol:

  1. Prepare a 1% w/v Tremella polysaccharide solution by dissolving 10 mg dried Tremella polysaccharide in 1 ml PBS pH 7.4. Stir gently at room temperature for 2 hours until fully dissolved. This produces a viscous gel solution.
  2. Take the remaining expressed fusion protein from Step 1 (~22.5 µl).
  3. Add 22.5 µl of the 1% Tremella polysaccharide solution to the protein.
  4. Mix gently by pipetting 10 times. Do not vortex.
  5. Incubate at room temperature for 30 minutes on a rotary mixer at low speed to allow CBM27 domain binding to the Tremella mannan backbone.
  6. Observe for gelation. The mixture should become more viscous than either component alone if the CBM27 is anchoring the silk to the Tremella matrix.

Success criterion: Visible increase in viscosity or gel formation compared to a control mixture of the same volume of PURExpress reaction without the fusion protein mixed with Tremella.
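The concentrations in Step 3 follow from the definition of % w/v (1% w/v is 1 g per 100 ml, or 10 mg per ml). The helper functions below are an illustrative sketch with names of my own choosing; they confirm the 10 mg per 1 ml figure and show that mixing equal volumes halves the polysaccharide concentration.

```python
def mass_mg_for_w_v(percent_w_v, volume_ml):
    """Solute mass in mg needed for a % w/v solution of the given volume."""
    return percent_w_v * 10.0 * volume_ml


def final_percent(stock_percent, stock_ul, other_ul):
    """% w/v after mixing stock_ul of stock with other_ul of another liquid."""
    return stock_percent * stock_ul / (stock_ul + other_ul)


print(mass_mg_for_w_v(1.0, 1.0))        # 10.0 mg in 1 ml
print(final_percent(1.0, 22.5, 22.5))   # 0.5 % w/v in the composite mixture
```

So the composite in this protocol forms at 0.5% w/v Tremella polysaccharide, which is worth noting when comparing against gelation observed at the 1% stock concentration.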


What Each Result Means

Result | Interpretation
Band at 32 kDa on gel | Protein expressed successfully
No band on gel | Expression failed, check DNA template and reaction setup
Gel formation in Step 3 | CBM27 is anchoring silk to Tremella, composite hydrogel confirmed
No gel formation in Step 3 | CBM27 may not be folding correctly in cell-free system, or Tremella preparation needs optimisation

Group Final Project


L-Protein Mutants

Problem: How to improve the stability and auto-folding of the lysis protein of an MS2-phage? This mechanism is key to understanding how phages may help address antibiotic resistance.

After going through the readings, including the group final project document, Plan A would be as follows. (This stays within scope, though MurJ and multi-target approaches seem interesting.)

  1. Use computational tools like AlphaFold2 or ProteinMPNN to identify mutations that improve intrinsic stability and auto-folding of the lysis protein

  2. Target mutations that strengthen the hydrophobic core, eliminate aggregation-prone regions, or introduce stabilising interactions like salt bridges

  3. Engineer the lysis protein to fold correctly without requiring DnaJ or any other bacterial chaperone

  4. Design mutations that also accelerate oligomerisation or enhance membrane pore-forming activity for faster lysis

  5. Synthesise the mutant gene via Twist, clone into plasmid using Gibson Assembly, validate structural integrity with Nuclera, then test in E. coli.