HW4 Conceptual Questions Here are my answers to the conceptual questions: [1] How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)
~3x10^24 molecules from (500 g/(100 g/mol)) x 6.02x10^23 [2] Why do humans eat beef but do not become a cow, eat fish but do not become fish?
HW5 Part A Part 1: Generate Binders with PepMLM I started by getting the SOD1 sequence from UniProt: MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ Then I added the A4V mutation which changed it to: MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
HW6 PCR and DNA Assembly What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose? The Phusion High-Fidelity PCR Master Mix contains a high-fidelity DNA polymerase for accurate DNA replication, dNTPs as the building blocks for new DNA strands, a buffer to maintain optimal reaction conditions, and magnesium ions which act as a cofactor for the polymerase. Together, these components enable efficient and precise DNA amplification.
HW7 Intracellular Artificial Neural Networks What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions? Intracellular Artificial Neural Networks use continuous analog signals instead of binary ones, which allows them to understand complex inputs like concentrations as opposed to just noting presence. They can use this to perform thresholding, enabling more complex reactions with fewer components. Overall, they are more scalable and better at multi-input sensing than regular genetic circuits.
HW9 General homework questions Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production. Cell-free gives more flexibility and control because it operates in a system where you can directly control concentrations of DNA, ions, and cofactors. This allows rapid prototyping and expression of toxic or non-natural proteins without killing your host cell. It is especially beneficial for producing toxic proteins and for quickly testing genetic circuits.
Subsections of Homework
Week 1 HW: Principles and Practices
HW1
Ultra-efficient DNA Synthesis Machine
My research is on designing a DNA synthesis machine that can reduce the cost and time to produce long strands of arbitrary DNA sequences. Right now we’re aiming for megabase strands of DNA, but the goal of the project is to eventually get to the Gb range and establish a technology that can scale and improve the way transistors did under Moore’s Law. For context, I work mostly on the mechanical side and hope to learn more about the biochemistry and synthetic biology in general from this class.
Most of today’s DNA synthesis tools and services are good at making short DNA pieces, from hundreds to thousands of bp, but as strands get longer, they become much harder and costlier to produce. As an example, Integrated DNA Technologies sells ~125-3,000 bp fragments, for uses like cloning single genes and antibody research, at around $0.07-$0.20 per base pair.
Short and mid-length sequences are still really useful:
20–100 bp pieces are used for primers to amplify DNA or guide RNA design
1,000–3,000 bp can code for single proteins or small metabolic pathways
10,000–100,000 bp pieces could represent entire operons
Once you hit the megabase range and beyond, you can do a whole lot more:
Bacterial genomes are typically ~1–13 Mbp (E. coli is ~4.6 Mbp)
Yeast has a genome around 12 Mbp long
A gigabase approaches the size of large eukaryotic chromosomes
3.2 Gbp is the size of the human genome
The goal is not just to achieve longer strands, but to make long, accurate DNA affordable and reliable, opening the door to lots more possibilities.
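To get a feel for why affordability matters at these lengths, here is a rough sketch that naively extrapolates the per-base-pair price range quoted above to longer constructs. It assumes, unrealistically, that the short-fragment price would hold at any length; the function name and the example lengths are illustrative choices, not vendor data.

```python
# Naive extrapolation: assumes the short-fragment price per bp quoted above
# stays constant at any length, which real synthesis does not do.
PRICE_PER_BP_USD = (0.07, 0.20)  # low and high end of the quoted range

def naive_cost_range(length_bp):
    """Return (low, high) cost in USD for a construct of length_bp base pairs."""
    low, high = PRICE_PER_BP_USD
    return (length_bp * low, length_bp * high)

examples = {
    "single gene (3 kbp)": 3_000,
    "megabase construct (1 Mbp)": 1_000_000,
    "yeast-sized genome (12 Mbp)": 12_000_000,
    "human-sized genome (3.2 Gbp)": 3_200_000_000,
}

for label, length in examples.items():
    low, high = naive_cost_range(length)
    print(f"{label}: ${low:,.0f} to ${high:,.0f}")
```

Even under this optimistic flat-price assumption, a megabase construct lands in the tens to hundreds of thousands of dollars, which is the gap the machine is meant to close.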
Governance/Policy
At the gigabase scale, DNA synthesis moves beyond individual genes and can be used to create entire genomes or chromosomes, which raises new safety, security, and ethical considerations. The main governance challenge in lowering the barrier to creating these large constructs is how this technology and its outputs can be developed and deployed responsibly as its capabilities scale.
Goal
Enable the constructive use of large-scale DNA synthesis while preventing misuse or unintended harm enabled by scale and accessibility.
Governance should aim to prevent malfeasance enabled by longer DNA constructs. This involves:
Preventing the synthesis of harmful DNA sequences
Reducing the risk posed by unintentional misuse
Ensuring that increased automation does not eliminate safety checks and that safety checks adapt to the technology
One way to enable this is through system-level design. The hierarchy of controls encourages eliminating risk before having to rely on training or user intent. By incorporating safety checks, containment options, and traceability mechanisms directly into hardware, safe operation becomes the default outcome. Without governance and training, things can still go wrong, and although accessibility is powerful, there should be limits set to ensure compliance and training. In case something goes wrong, there should be an established line of accountability and transparency with oversight.
Governance Actions
To ensure that large-scale DNA synthesis develops in a safe and constructive way, there should be a mix of technical, institutional, and regulatory governance actions.
Hardware-Level Constraints on Maximum Assembly Length
Purpose
Currently, DNA synthesis limits are largely economic and biochemical, but they could also be enforced by hardware limits.
Design
Mechanical or software-enforced limits on assembly length. These would be required by governance and built in through design choices made by manufacturers and research labs building synthesis platforms.
Assumptions
One assumption is that limiting the strand length meaningfully reduces misuse risk. I don’t know where this limit would be set, and it’s possible that users could get around it or that the allowed length could still lead to misuse. It is also assumed that this wouldn’t hinder legitimate research.
Risks of Failure & “Success”
As mentioned before, someone could find ways to misuse shorter strands or combine shorter strands into longer ones. The limit could also keep legitimate research from succeeding.
Sequence Screening and Logging
Purpose
Many DNA synthesis companies perform sequence screening to flag known pathogenic or regulated sequences. If synthesis moves toward in-house and machine-based systems, this screening could still take place or even be extended to logging.
Design
Automated sequence screening could occur within each synthesis machine, with flags raised or hardware/software locks enabled if needed. It might also be wise to log synthesized sequences above a certain size threshold. This can be implemented by academic institutions, commercial developers, and funding agencies. It would work the same way as MIT.nano, where your process must be approved by a committee before you begin and EHS reviews new and acceptable chemicals.
Assumptions
This assumes that known harmful sequences can be meaningfully identified and that users will accept limited logging in exchange for access.
Risks of Failure & “Success”
It is possible that screening can miss novel or emergent risks. Or that logging could raise intellectual-property concerns. If “too successful,” logging could discourage exploratory research using these systems.
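The screening-plus-logging idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the flagged motifs are made-up placeholders rather than a real screening database, the 10,000 bp logging threshold is arbitrary, and real screening tools compare against curated databases with far more sophisticated matching.

```python
# Hypothetical sketch: the flagged motifs, threshold, and log format are
# placeholders, not a real screening database or policy.
from datetime import datetime, timezone

FLAGGED_MOTIFS = {"GATTACAGATTACA", "TTTTGGGGCCCCAAAA"}  # made-up placeholder motifs
LOG_THRESHOLD_BP = 10_000  # assumed size above which every request is logged

def screen_request(user, sequence, log):
    """Return True if synthesis may proceed; append a log entry when required."""
    seq = sequence.upper()
    hits = sorted(m for m in FLAGGED_MOTIFS if m in seq)
    if hits or len(seq) >= LOG_THRESHOLD_BP:
        log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "length_bp": len(seq),
            "flagged_motifs": hits,
        })
    return not hits  # a software lock would block the run when this is False

audit_log = []
print(screen_request("lab-a", "ACGT" * 10, audit_log))              # short and clean
print(screen_request("lab-b", "AC" + "GATTACAGATTACA", audit_log))  # contains a placeholder motif
print(len(audit_log))
```

Exact substring matching like this is precisely the kind of check that would miss novel or emergent risks, which is the failure mode noted above.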
Tiered Access
Purpose
I think one of the best ways to govern this technology is with a tiered access model where synthesis capabilities scale with demonstrated ability, infrastructure, and oversight.
Design
You could start with fairly open basic access for short and mid-length synthesis. More advanced capabilities could then be unlocked by agreeing to institutional review, training, and safety approval, along with oversight by universities or national research bodies. It might even be better to keep the most advanced machines in a centralized location with specialized oversight while still granting access.
Assumptions
This assumes that governing institutions can fairly evaluate readiness and risk and that training and review improve safety outcomes. It also assumes that access tiers won’t become arbitrary gatekeeping, and won’t come down to the same financial barriers in place now.
Risks of Failure & “Success”
This could disadvantage smaller or less well-funded labs, and “success” might slow innovation if approval processes lag behind technology.
| Does the option: | Hardware/Software Constraints | Screening/Logging | Tiered Access |
| --- | --- | --- | --- |
| Enhance Biosecurity | | | |
| • By preventing incidents | 1 | 2 | 2 |
| • By helping respond | 3 | 1 | 2 |
| Foster Lab Safety | | | |
| • By preventing incidents | 2 | 2 | 2 |
| • By helping respond | 3 | 1 | 2 |
| Protect the environment | | | |
| • By preventing incidents | 1 | 3 | 2 |
| • By helping respond | 3 | 2 | 2 |
| Other considerations | | | |
| • Minimizing costs and burdens to stakeholders | 1 | 3 | 2 |
| • Feasibility? | 1 | 2 | 2 |
| • Not impede research | 2 | 3 | 2 |
| • Promote constructive applications | 2 | 2 | 1 |
Recommended Governance Approach and Trade-offs
Based on the scoring, I would prioritize a combined approach using sequence screening and logging (Option 2) and tiered access (Option 3). Together, these options provide the strongest balance between preventing misuse, enabling response if something goes wrong, and still enabling legitimate research.
Option 2 performs best in terms of biosecurity and response. As DNA synthesis moves toward in-house, automated, and large-scale systems, maintaining some form of sequence screening becomes increasingly important. Screening and logging help ensure that synthesis at larger scales includes visibility and accountability. It also enables backtracking to find problems or errors, which is important when failures or misuse may not be immediately obvious.
Option 3 complements this by recognizing that not all synthesis capabilities carry the same level of risk. A tiered access model allows safe DNA synthesis to remain relatively open, while reserving the most powerful capabilities for users and institutions with appropriate infrastructure, training, and oversight.
Week 2 Lecture Prep
Homework Questions from Professor Jacobson:
1: DNA polymerase has an error rate of 1:10^6. With the 3.2 Gb human genome, that’s ~3,200 errors per replication. Biology fixes this discrepancy using polymerase proofreading and the MutS repair system, which function like a multi-stage error-correction protocol to drop the effective mutation rate.
2: If an average human protein has around 400 amino acids, then redundancy allows for 400!/(20!)^20, which is on the order of ~10^501, DNA sequences to code for an average protein. But many versions fail because RNA secondary structures physically block ribosomes, or because sequences trigger RNA cleavage or run into codon bias issues.
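The two numbers in these answers can be checked with a short calculation. This sketch only reproduces the arithmetic as stated above (it does not judge whether 400!/(20!)^20 is the right count); the helper name `log10_factorial` is an illustrative choice.

```python
import math

# Question 1: expected raw errors per genome copy, before proofreading and repair.
genome_bp = 3.2e9   # human genome size from the answer above
error_rate = 1e-6   # one error per 10^6 bases
expected_errors = genome_bp * error_rate

# Question 2: order of magnitude of 400!/(20!)^20, computed with log-gamma
# because the factorials themselves are far too large for floats.
def log10_factorial(n):
    return math.lgamma(n + 1) / math.log(10)

log10_count = log10_factorial(400) - 20 * log10_factorial(20)

print(f"expected errors per replication: {expected_errors:,.0f}")
print(f"400!/(20!)^20 is about 10^{log10_count:.0f}")
```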
Homework Questions from Dr. LeProust:
1: Phosphoramidite synthesis
2: Synthesis hits a wall due to an exponential yield drop. Following the (1 - error rate)^N curve, the probability of a “perfect” strand decreases with every base added.
3: At 2kb, the yield of perfect strands is basically zero. To reach the Gb range, we have to assemble smaller, verified oligos using PCA or Gibson Assembly rather than making them in one shot.
2: As a fail-safe, the Jurassic Park lysine contingency is flawed because all animals are already naturally unable to synthesize lysine; they get it from food. An escaped organism would do the same and just find lysine-rich food.
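Returning to the synthesis questions above, the (1 - error rate)^N curve is easy to tabulate. The per-base error rates below are illustrative assumptions, not measured values for any particular synthesis platform.

```python
def perfect_fraction(length_nt, per_base_error):
    """Fraction of strands with no error after length_nt coupling steps."""
    return (1.0 - per_base_error) ** length_nt

# The per-base error rates here are illustrative assumptions, not measured values.
for error in (0.01, 0.005, 0.001):
    for length in (100, 200, 2000):
        frac = perfect_fraction(length, error)
        print(f"error={error:<6} N={length:<5} perfect fraction={frac:.2e}")
```

At an assumed 1% error per base, roughly a third of 100-mers are perfect, while essentially no 2 kb strands are, which is why long constructs are assembled from shorter verified pieces.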
“give me some examples of what certain lengths of DNA can achieve?”
Week 10 HW: Imaging and Measurement
HW10
General homework questions
Work in progress, check back later
Week 11 HW: Bioproduction and Cloud Labs
HW11
Part A
At some point during the global artwork process, I drew a little turtle in the bottom left of the top left plate. It later evolved into the border for the 2026 Media Lab side of the design, but some of the pixels still exist. I loved seeing the artwork evolve over time and seeing what each person created and how, sometimes, people would join together asynchronously to complete designs. I wish the cooldown time had been a little shorter; the 20-second period was more fun for individual creations, but the longer time was better for collaboration and competition.
Part B
Referencing the cell-free protein synthesis reaction composition (the middle box outlined in yellow on the image above, also listed below), provide a 1-2 sentence description of what each component’s role is in the cell-free reaction.
E. coli Lysate
BL21 (DE3) Star Lysate (includes T7 RNA Polymerase)
Provides the transcription and translation machinery (ribosomes, enzymes, tRNAs), with T7 RNA polymerase enabling strong expression from T7 promoters.
Salt/Buffer
Potassium Glutamate
Maintains intracellular-like ionic conditions that stabilize enzymes and ribosomes for efficient protein synthesis.
HEPES-KOH pH 7.5
Buffers the reaction to maintain a stable pH optimal for enzymatic activity.
Magnesium Glutamate
Supplies Mg²⁺ ions, which are essential cofactors for ribosomes, RNA polymerase, and ATP-dependent reactions.
Potassium phosphate monobasic
Contributes to buffering capacity and provides phosphate for metabolic processes.
Potassium phosphate dibasic
Works with the monobasic form to stabilize pH and maintain phosphate balance.
Energy / Nucleotide System
Ribose
Serves as a substrate for nucleotide regeneration through metabolic pathways.
Glucose
Provides a sustained energy source via glycolysis to extend reaction lifetime.
AMP
Acts as a precursor that can be phosphorylated to generate ATP for energy and transcription.
CMP
Precursor to CTP, required for RNA synthesis.
GMP
Precursor to GTP, necessary for transcription and translation.
UMP
Precursor to UTP, another essential RNA building block.
Guanine
A nucleobase that can be converted into GMP/GTP via salvage pathways in the lysate.
Translation Mix (Amino Acids)
17 Amino Acid Mix
Provides most amino acids needed for protein synthesis.
Tyrosine
Added separately due to solubility limitations in mixed amino acid stocks.
Cysteine
Added separately because it is chemically unstable and prone to oxidation.
Additives
Nicotinamide
Supports redox balance by maintaining NAD⁺/NADH-dependent metabolic activity for energy regeneration.
Backfill
Nuclease Free Water
Adjusts reaction volume without introducing nucleases that could degrade DNA or RNA.
Describe the main differences between the 1-hour optimized PEP-NTP master mix and the 20-hour NMP-Ribose-Glucose master mix shown in the Google Slide above.
The 1-hour PEP-NTP system uses phosphoenolpyruvate (PEP) and pre-supplied NTPs to drive rapid, high-level protein expression, but it depletes energy quickly and accumulates inhibitory byproducts. The 20-hour NMP–ribose–glucose system relies on nucleotide monophosphates and metabolic substrates to regenerate energy and NTPs over time, enabling slower but more sustained protein production. This results in a tradeoff between fast, short-lived expression and longer-lasting, more stable reactions.
Part C
Given the 6 fluorescent proteins we used for our collaborative painting, identify and explain at least one biophysical or functional property of each protein that affects expression or readout in cell-free systems.
Fluorescent protein properties:
sfGFP
sfGFP has enhanced folding efficiency, allowing it to maintain high fluorescence even under suboptimal cell-free conditions.
mRFP1
mRFP1 has a slow chromophore maturation time, which delays fluorescence development despite successful protein expression.
mKO2
mKO2 matures relatively quickly but can be sensitive to environmental factors like pH, affecting its fluorescence intensity.
mTurquoise2
mTurquoise2 is highly bright but depends strongly on proper folding and oxygen availability for chromophore formation.
mScarlet-I
mScarlet-I has high brightness and improved maturation speed, but still requires efficient folding for optimal fluorescence.
Electra2
Electra2 is engineered for high brightness and stability but may place higher demands on cellular resources, making it sensitive to energy availability.
Create a hypothesis for how adjusting one or more reagents in the cell-free mastermix could improve a specific biophysical or functional property you identified above, in order to maximize fluorescence over a 36-hour incubation. Clearly state the protein, the reagent(s), and the expected effect.
Increasing glucose concentration in reactions expressing mRFP1 will extend energy availability and reaction lifetime, allowing more protein to fully mature and increasing total fluorescence over 36 hours.
The second phase of this lab will be to define the precise reagent concentrations for your cell-free experiment. You will be assigned artwork wells with specific fluorescent proteins and receive an email with instructions this week (by April 24).
Week 2 HW: DNA Read, Write, and Edit
HW2
Part 1: Benchling & In-silico Gel Art
This week, we made gel electrophoresis art using Lambda phage DNA and ten restriction enzymes. Gel electrophoresis uses an electric field to pull negatively charged DNA toward the positive electrode through a porous gel. Longer strands move slower and shorter strands move faster, meaning that DNA fragments of different lengths appear as different bands in the gel. To use this in an artistic context, we take our input Lambda DNA and cut it to different lengths using different restriction enzymes, which gives us coarse control over where these bands end up, and thus we can make art with it. I have decided to really commit to my favorite animal, turtles, this semester and try to have a turtle-inspired theme to all of my projects. In an ideal world, this is what I wanted my gel art to look like.
However, it was really hard to design this with such coarse control. I spent hours on Ronan's website to no avail before eventually deciding on something simpler: my initials, AH.
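The reason the control is so coarse is that band positions are fixed by where each enzyme's recognition site happens to fall. Here is a minimal sketch of an in-silico digest on a linear sequence. The toy sequence is made up (it is not Lambda DNA), and the function only handles plain A/C/G/T recognition sites; EcoRI, which cuts G^AATTC, is used as the example enzyme.

```python
import re

def digest(sequence, site, cut_offset):
    """Cut a linear sequence at every occurrence of site.

    cut_offset is how many bases into the site the top strand is cut
    (1 for EcoRI, which cuts G^AATTC). Returns fragment lengths in bp.
    """
    seq = sequence.upper()
    # Lookahead so overlapping sites would also be found.
    cuts = [m.start() + cut_offset for m in re.finditer(f"(?={site})", seq)]
    edges = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(edges, edges[1:])]

# Toy 60 bp sequence (made up, not Lambda DNA) with two EcoRI sites.
toy = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 18
fragments = digest(toy, "GAATTC", 1)

# Shorter fragments migrate farther, so list them longest first (top of gel).
print(sorted(fragments, reverse=True))
```

Each lane of gel art is one such list of fragment lengths, so the only design freedom is which enzyme or enzyme combination goes in which lane.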
Part 2: Gel Art - Restriction Digests and Gel Electrophoresis
This was my first time going through the whole process of making a gel, but I’ll describe more about the experiment in the lab2 page on my website.
Part 3: DNA Design Challenge
3.1: Choose your protein
For my design challenge homework, I picked Green Fluorescent Protein (GFP) because I was inspired by the HTGAA website photos, it is great for visualizing gene expression, and I want a second chance at making a turtle. This time I want to make a turtle-shaped fluorescence pattern.
Here’s the protein sequence from UniProt:
>sp|P42212|GFP_AEQVI Green fluorescent protein OS=Aequorea victoria OX=6100 GN=GFP PE=1 SV=1
MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTL
VTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLV
NRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLAD
HYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK
3.2: Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence
I used NCBI to get this reverse translation:
>reverse translation of sp|P42212|GFP_AEQVI Green fluorescent protein OS=Aequorea victoria OX=6100 GN=GFP PE=1 SV=1 to a 714 base sequence of most likely codons.
atgagcaaaggcgaagaactgtttaccggcgtggtgccgattctggtggaactggatggc
gatgtgaacggccataaatttagcgtgagcggcgaaggcgaaggcgatgcgacctatggc
aaactgaccctgaaatttatttgcaccaccggcaaactgccggtgccgtggccgaccctg
gtgaccacctttagctatggcgtgcagtgctttagccgctatccggatcatatgaaacag
catgatttttttaaaagcgcgatgccggaaggctatgtgcaggaacgcaccatttttttt
aaagatgatggcaactataaaacccgcgcggaagtgaaatttgaaggcgataccctggtg
aaccgcattgaactgaaaggcattgattttaaagaagatggcaacattctgggccataaa
ctggaatataactataacagccataacgtgtatattatggcggataaacagaaaaacggc
attaaagtgaactttaaaattcgccataacattgaagatggcagcgtgcagctggcggat
cattatcagcagaacaccccgattggcgatggcccggtgctgctgccggataaccattat
ctgagcacccagagcgcgctgagcaaagatccgaacgaaaaacgcgatcatatggtgctg
ctggaatttgtgaccgcggcgggcattacccatggcatggatgaactgtataaa
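A quick way to sanity-check a reverse translation is to translate it forward again and compare with the protein. This is a small sketch using the standard genetic code; it is a stand-in for what a library such as Biopython would do, and it is applied here only to the first 60 bases of the sequence above.

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate a coding DNA sequence (length divisible by 3) into protein."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

# First 60 bases of the reverse translation above.
first_line = "atgagcaaaggcgaagaactgtttaccggcgtggtgccgattctggtggaactggatggc"
print(translate(first_line))  # should match the first 20 residues of the GFP protein
```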
3.3. Codon optimization
I used a handy website, https://www.novoprolabs.com/tools/codon-optimization, to optimize my codons for Escherichia coli (E. coli). I chose this organism because it is commonly used in synthetic biology, safe, robust, and easy for people new to the wet lab. Why did I have to optimize at all? GFP comes from a jellyfish; it occurs naturally in that species but not in E. coli. To ensure that it expresses well in E. coli, I have to pick the codons E. coli prefers (codon bias). This is possible because multiple codons can code for the same amino acid. Codon bias can be a problem if the host organism has low amounts of the matching tRNA, so codon optimization replaces rare codons with preferred codons and can also remove unwanted restriction sites.
Here’s the optimized sequence:
ATGTCTAAAGGCGAAGAACTGTTCACCGGTGTGGTTCCGATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGT
TCTCTGTATCTGGTGAAGGCGAGGGTGATGCAACCTACGGTAAACTGACTCTGAAGTTCATTTGCACTACTGGTAAACT
GCCGGTTCCGTGGCCGACTCTGGTCACTACTTTCAGCTACGGTGTACAATGTTTTTCCCGTTACCCGGATCACATGAAG
CAGCATGACTTCTTCAAATCTGCTATGCCGGAAGGCTACGTTCAGGAACGCACCATCTTCTTCAAAGACGACGGTAACT
ACAAAACTCGCGCTGAGGTTAAGTTTGAAGGCGACACCCTGGTTAATCGTATCGAACTGAAAGGCATTGACTTCAAAGA
AGATGGTAACATCCTGGGTCACAAACTGGAATACAACTACAACAGCCATAACGTTTACATCATGGCAGACAAACAGAAA
AACGGCATCAAGGTGAACTTCAAAATTCGTCACAATATCGAAGATGGTTCCGTGCAGCTGGCCGATCACTACCAGCAGA
ACACTCCGATCGGTGACGGTCCGGTGCTGCTGCCGGACAATCACTATCTGAGCACTCAAAGCGCCCTGAGCAAAGACCC
GAACGAAAAACGTGATCACATGGTGCTGCTGGAATTCGTTACCGCGGCAGGCATCACTCACGGCATGGATGAACTGTAT
AAA
And stats about what’s changed:
CAI before optimization: 0.80
CAI after optimization: 0.83
GC content before optimization: 48.60%
GC content after optimization: 49.30%
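GC content is the simplest of these statistics to compute by hand. The sketch below shows the calculation on just the first 30 bases of each sequence, so its output will not match the whole-sequence percentages reported by the tool; CAI is not computed here because it needs a host codon usage table.

```python
def gc_content(sequence):
    """Percentage of G and C bases in a DNA sequence."""
    seq = sequence.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

# First 30 bases of the sequence before and after optimization.
before = "atgagcaaaggcgaagaactgtttaccggc"
after = "ATGTCTAAAGGCGAAGAACTGTTCACCGGT"
print(f"{gc_content(before):.1f}% vs {gc_content(after):.1f}%")
```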
3.4. You have a sequence! Now what?
Cell-Dependent Expression:
To get the GFP DNA to express in a cell, I would insert the codon-optimized GFP gene into a plasmid and transform it into E. coli. The plasmid includes a promoter that allows the cell to recognize and transcribe the gene. Once inside the bacteria, RNA polymerase transcribes the GFP DNA into mRNA, and ribosomes translate the mRNA into the GFP protein. As the protein folds into its final structure, it begins to fluoresce. In this system, the living cell provides all the machinery needed for transcription and translation.
Cell-Independent Expression:
To get GFP to be produced without living cells, I’d need to mix a cell extract containing ribosomes, enzymes, and tRNAs with the GFP DNA template in a test tube. The extract carries out transcription and translation directly in solution. This allows faster protein production and more control over reaction conditions since there is no need to grow or maintain cells. It is useful for rapid testing of gene designs before moving into full bacterial expression.
Part 4: Prepare a Twist DNA Synthesis Order
I chose a similar protein to the example in class but replaced the sfGFP coding sequence with my regular GFP coding sequence. I wonder what the differences are and whether they are different enough to create a pattern with. Anyway, here’s my Benchling linear map.
Here's my final FASTA file with all the necessary regions:
>GFP_hw2
TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGCCATTAAAGAGGAGAAAGGTACCATGATGTCTAAAGGCGAAGAAC
TGTTCACCGGTGTGGTTCCGATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCTCTGTATCTGGTGAAGG
CGAGGGTGATGCAACCTACGGTAAACTGACTCTGAAGTTCATTTGCACTACTGGTAAACTGCCGGTTCCGTGGCCGACT
CTGGTCACTACTTTCAGCTACGGTGTACAATGTTTTTCCCGTTACCCGGATCACATGAAGCAGCATGACTTCTTCAAAT
CTGCTATGCCGGAAGGCTACGTTCAGGAACGCACCATCTTCTTCAAAGACGACGGTAACTACAAAACTCGCGCTGAGGT
TAAGTTTGAAGGCGACACCCTGGTTAATCGTATCGAACTGAAAGGCATTGACTTCAAAGAAGATGGTAACATCCTGGGT
CACAAACTGGAATACAACTACAACAGCCATAACGTTTACATCATGGCAGACAAACAGAAAAACGGCATCAAGGTGAACT
TCAAAATTCGTCACAATATCGAAGATGGTTCCGTGCAGCTGGCCGATCACTACCAGCAGAACACTCCGATCGGTGACGG
TCCGGTGCTGCTGCCGGACAATCACTATCTGAGCACTCAAAGCGCCCTGAGCAAAGACCCGAACGAAAAACGTGATCAC
ATGGTGCTGCTGGAATTCGTTACCGCGGCAGGCATCACTCACGGCATGGATGAACTGTATAAACATCACCATCACCATC
ATCACTAACCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTG
AACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA
Here’s my final plasmid:
Part 5: DNA Read/Write/Edit
5.1 DNA Read
If I could choose DNA to sequence, I would choose synthetic DNA used for digital data storage. DNA data storage is interesting because it treats DNA as a way of encoding information, like a hard drive, but at a molecular scale. Sequencing the DNA would allow us to read back the stored data and measure how accurately the system preserves information over time. This seems like an idea straight out of science fiction, though it is possible now.
To sequence this DNA, I would use Illumina sequencing, a second-generation sequencing technology. It works well for short, synthetic DNA fragments and provides high accuracy at relatively low cost. The input would be the synthetic DNA fragments that encode the information. These fragments would need to be prepared by adding adapter sequences to their ends, amplifying them by PCR, and loading them onto a flow cell. During sequencing, fluorescently labeled nucleotides are incorporated one base at a time, and a camera detects the color signal to determine which base was added. This process converts fluorescence into a digital readout of A, T, C, and G. The output is a large dataset of DNA reads that can be reconstructed into the original digital file.
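To make the "DNA as a hard drive" idea concrete, here is a minimal sketch of encoding bytes as bases and decoding them again. The two-bits-per-base mapping is the simplest possible assumption; real DNA storage schemes add error correction and avoid long runs of a single base, and this sketch does neither.

```python
# Simplest possible mapping: two bits per base. Real DNA storage schemes add
# error correction and avoid long runs of one base; this sketch does neither.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Turn bytes into a DNA string, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(dna: str) -> bytes:
    """Turn a DNA string produced by encode() back into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

message = b"turtle"
strand = encode(message)
print(strand)
print(decode(strand))
```

Sequencing plays the role of `decode` here: the reads give back the base string, and comparing the decoded bytes with the original file measures how well the information was preserved.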
5.2 DNA Write
If I could synthesize DNA, I would create my turtle-themed GFP expression construct but with multiple colors of fluorescent proteins. I would synthesize codon-optimized fluorescent protein genes under the control of a bacterial promoter so that they could be expressed in E. coli. The goal would be to design a system that produces a turtle-shell-like hexagonal fluorescence pattern. This connects synthetic biology with spatial design and pattern formation. Maybe I could even build a system for animating it and getting the turtle to move. The core sequence would be the coding region, inserted into a plasmid backbone.
To synthesize this DNA, I would use chemical DNA synthesis and fragment assembly. Short DNA oligos are chemically synthesized, assembled into the full gene using overlapping regions, cloned into a plasmid, and sequence-verified. This method is good for constructs around 1 kb, such as GFP. Limitations include cost increasing with length, possible synthesis errors, and longer turnaround times for larger constructs.
5.3 DNA Edit
An interesting DNA edit project would be engineering biological motors or force-generating systems that could act as microscopic actuators. It would be interesting to use the process that proteins such as Kinesin-1 use to convert chemical energy from ATP into mechanical motion along microtubules. By editing the genes that encode these motor proteins, we could potentially tune their speed, force output, or binding properties, creating nanoscale linear motors with maybe even the potential to scale up. Editing their DNA could allow us to design programmable biological actuators for soft robotics, microfluidics, or responsive materials.
To perform this editing, I would use CRISPR-Cas9. I’d design a guide RNA that targets the motor protein gene. Cas9, directed by the guide RNA, cuts the DNA at a specific location. If we want to introduce a modification, such as a mutation that alters motor speed or adds a binding domain, we would also provide a donor DNA template for repair. The inputs include the guide RNA sequence, Cas9, the donor template if inserting changes, and host cells. After editing, the modified cells would express the altered motor protein, and its mechanical properties could be measured experimentally. Limitations include variable editing efficiency and the possibility of off-target edits, but the approach might allow for precise modification of biological force-generating systems.
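As a rough illustration of the guide RNA design step, the sketch below lists candidate 20-nt target sites that sit directly upstream of an NGG PAM, which is the PAM recognized by the commonly used SpCas9. It scans only the strand it is given and uses a made-up toy sequence rather than a real motor protein gene; real guide design would also check the reverse strand and score off-target matches.

```python
import re

def find_guides(sequence, guide_len=20):
    """List (position, protospacer, PAM) for NGG PAM sites on the given strand."""
    seq = sequence.upper()
    guides = []
    # Lookahead so overlapping candidates are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    for match in pattern.finditer(seq):
        guides.append((match.start(), match.group(1), match.group(2)))
    return guides

# Made-up toy sequence with exactly one candidate site.
toy = "TTTT" + "ACGTACGTACGTACGTACGT" + "AGG" + "TTTT"
for position, protospacer, pam in find_guides(toy):
    print(position, protospacer, pam)
```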
“How do you transcribe DNA and translate a protein with a cell-dependent method?”
“How do you transcribe DNA and translate a protein with a cell-independent method?”
Week 3 HW: Lab Automation
HW3
Lab Preparation: Opentrons Artwork
This week, we programmed the Opentrons liquid handling robot to create fluorescent protein masterpieces. I was really looking forward to this lab and even did last week’s homework about expressing GFP in E. coli. Rather than using just GFP, I found we used a variety of different colors of fluorescent proteins. Ronan’s webtool [1] made it really easy to visualize a design, and we could even upload images to serve as a template for our designs. I decided to go all in on turtles and make a “turtles all the way down” image featuring a turtle with a globe for its shell. This was the original image, from my collection of Turtle CADs:
Once the image was imported, I went to work adjusting it to make my design. I tried to stick to fewer colors just in case we had fewer than what was on the website, which did not end up being an issue. This is how my image looked in the end:
I took the coordinates from this image for each different color and put them into the Colab Python script. I had to add and change colors to match the ones I used. I originally ran into a problem where I’d tell the robot to fill up the pipette, empty it, then continue attempting to dispense. I was able to see this error in the simulation and correct it by adding a loop where, after the pipette was empty (20 dots in my case, using a p20 and 1 µL droplets), I’d pick up more before continuing. I used the built-in Gemini chatbot to help with this assignment. After explaining the functions I had access to, I asked it to use these functions to deposit blue droplets at every coordinate in a list. From there, I used the same functions and procedures to do the rest of my colors. After some troubleshooting, the simulation finally showed me the result I wanted, and the verification also went through.
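The refill logic described above can be sketched without the robot. This is a simplified, library-free illustration rather than the actual Colab script or the Opentrons API: the function name `plan_dispenses` and the coordinate list are hypothetical, and the plan it returns would still need to be turned into real aspirate and dispense calls.

```python
def plan_dispenses(points, droplet_ul=1, capacity_ul=20):
    """Group droplet coordinates into aspirate/dispense batches.

    Returns a list of (aspirate_volume_ul, batch_of_points) so the pipette is
    refilled before it runs empty instead of dispensing from an empty tip.
    """
    per_fill = capacity_ul // droplet_ul  # 20 droplets per fill with the values above
    plan = []
    for start in range(0, len(points), per_fill):
        batch = points[start:start + per_fill]
        plan.append((len(batch) * droplet_ul, batch))
    return plan

# 45 made-up coordinates for one color
blue_points = [(x, 0) for x in range(45)]
for volume, batch in plan_dispenses(blue_points):
    print(f"aspirate {volume} uL, then dispense {len(batch)} droplets")
```

Aspirating only what the last batch needs avoids leaving unused liquid in the tip at the end of a color.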
It was really straightforward to go from my design to the Opentrons, at least from my point of view as a student (not sure what magic the TAs and staff work behind the scenes). The machine itself was incredibly precise, not only in movement but also in dispensing. Here’s how my final design ended up. I’ll explain more about the process on my lab3 webpage:
Post-Lab Questions
Bioautomation in the wild [2]
The paper I chose was entitled: “TidyTron: Reducing lab waste using validated wash-and-reuse protocols for common plasticware in Opentrons OT-2 lab robots” authored by Bryant et al. (2023). I chose this paper because it touches on two things I care about, one being minimizing lab waste, and two being washing processes.
In this study, the authors developed an automated system, called TidyTron, that runs on the Opentrons liquid-handling robot and reduces laboratory plastic waste by automating wash-and-reuse protocols for common lab consumables like pipette tips and microplates. Traditionally, many of these are discarded after a single use because of uncertainty about cross-contamination; the authors even state that biotechnology labs generate ~5.5 million tons of plastic waste per year. TidyTron addresses this problem by using the OT-2 to perform consistent and reliable cleaning processes that are tested and validated for effectiveness.
The paper describes how the automated protocols were implemented on the Opentrons to rinse, clean, and sterilize plastics used with DNA solutions or microbial cultures. They evaluated the cleaned materials by measuring residual contamination using colony-forming unit (CFU) counts, quantitative PCR to detect residual DNA, and other metrics to confirm that the robot-washed consumables were equivalent in performance to new plastics. Their results showed that the wash-and-reuse procedures could effectively remove contamination, demonstrating that automation makes reusable workflows both practical and safe. I thought this approach was interesting because it applies automation not to increasing throughput or accuracy, but to sustainability in biological labs. I hope to use a similar system, not for plastic but for metal/more permanent lab equipment, which I’ll build into my DNA dispensing machine.
Bioautomation in my final project
As a mechanical engineer, I am very interested in the bioautomation part of synthetic biology. Two of my three final project ideas either seek to improve or apply bioautomation as a main part of the project. One idea I had was to do what we did in lab this week but in 3D. Two approaches I imagine are either to embed a 3D model in a brick of clear gel, similar to the lasers that can engrave a model into glass, or to build a 3D model out of gel and “paint” different fluorescent proteins around it, all completely automated. I can imagine different challenges based on which version I pursue, but this would be nearly impossible without an automated system.
[2] Bryant JA Jr, Longmire C, Sridhar S, Janousek S, Kellinger M, Wright RC. TidyTron: Reducing lab waste using validated wash-and-reuse protocols for common plasticware in Opentrons OT-2 lab robots. SLAS Technol. 2024 Apr;29(2):100107. doi: 10.1016/j.slast.2023.08.007. Epub 2023 Sep 9. PMID: 37696493; PMCID: PMC12212179.
AI Prompts
“Using the functions described in this document, write a loop that deposits blue droplets at every coordinate in a given list”
“Use the function to refill the pipette after it’s empty, then continue depositing droplets if there are more droplets than the pipette can hold at once”
Week 4 HW: Protein Design Part I
HW4
Conceptual Questions
Here are my answers to the conceptual questions:
[1] How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)
~3×10²⁴ molecules, from (500 g / (100 g/mol)) × 6.02×10²³ molecules/mol
[2] Why do humans eat beef but do not become a cow, eat fish but do not become fish?
Proteins are broken into amino acids in the digestive system and reassembled using instructions from human DNA. We reuse the material, but not the entire structure.
[3] Why are there only 20 natural amino acids?
These 20 amino acids are diverse enough to form all of the complex functional proteins needed throughout evolution. Adding others does not necessarily provide an evolutionary advantage and increases complexity and error rates.
[4] Can you make other non-natural amino acids? Design some new amino acids.
Yes, you can do this by using engineered translation machinery, such as engineered ribosomes and orthogonal tRNA/aminoacyl-tRNA synthetase pairs. Some examples of this include adding fluorinated amino acids to increase hydrophobicity and stability and photo-reactive amino acids that respond to light.
[5] Where did amino acids come from before enzymes that make them, and before life started?
They came from abiotic chemical reactions on early Earth, such as reactions among atmospheric gases powered by lightning and UV radiation or chemistry at hydrothermal vents, and some may have been delivered by meteorites.
[6] If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?
Natural proteins are made of L-amino acids and form right-handed α-helices. If instead you used entirely D-amino acids, you'd make a left-handed α-helix.
[7] Can you discover additional helices in proteins?
Yes, proteins can form multiple helical structures, not just the α-helix. Some examples are 3₁₀ helices and π-helices.
[8] Why are most molecular helices right-handed?
Life evolved using L-amino acids, and chains of L-amino acids pack most favorably into right-handed helices.
[9] Why do β-sheets tend to aggregate? What is the driving force for β-sheet aggregation?
β-sheets form planar structures with backbone hydrogen bond donors and acceptors along their edges. This lets multiple sheets form hydrogen bonds and stack together.
[10] Why do many amyloid diseases form β-sheets? Can you use amyloid β-sheets as materials?
Skipped
[11] Design a β-sheet motif that forms a well-ordered structure.
Skipped
Protein Analysis and Visualization - ATP Synthase: P0AB98 · ATP6_ECOLI
Briefly describe the protein you selected and why you selected it.
I selected ATP synthase because it is a fascinating molecular machine. As a mechanically driven enzyme, it directly connects structure, motion, and energy conversion.
ATP synthase is a large multi-subunit complex composed of two major regions: F₀ (membrane-embedded rotor) and F₁ (soluble catalytic head). The amino acid sequence depends on the organism and subunit. I looked at ATP6_ECOLI which is a critical membrane component of the F₀ motor in E. coli. It has 271 amino acids and is highly hydrophobic with many transmembrane helices. The most frequent amino acid is leucine which appears 45 times.
Here’s the sequence:
MASENMTPQDYIGHHLNNLQLDLRTFSLVDPQNPPATFWTINIDSMFFSVVLGLLFLVLFRSVAKKATSGVPGKFQTAIELVIGFVNGSVKDMYHGKSKLIAPLALTIFVWVFLMNLMDLLPIDLLPYIAEHVLGLPALRVVPSADVNVTLSMALGVFILILFYSIKMKGIGGFTKELTLQPFNHWAFIPVNLILEGVSLLSKPVSLGLRLFGNMYAGELIFILIAGLLPWWSQWILNVPWAIFHILIITLQAFIFMVLTIVYLSMASEEH
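A quick way to check the length and the most frequent residue is to count letters in the sequence. This is a minimal sketch; `composition` is a hypothetical helper name, and the short string below is a toy example rather than the real protein.

```python
from collections import Counter


def composition(seq):
    """Return (length, residues ranked by count) for a one-letter protein sequence."""
    seq = "".join(seq.split()).upper()  # drop whitespace and line breaks
    counts = Counter(seq)
    return len(seq), counts.most_common()


# Toy sequence for illustration only.
length, ranked = composition("MLLKLA")
print(length, ranked[0])
```

Pasting the ATP6_ECOLI sequence above in place of the toy string gives the length and the per-residue counts for the real protein.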
BLAST searching in UniProt revealed 250 homologs. The protein belongs to the F-type ATPase subunit family, which is common across many life forms because of its role in energy metabolism.
The earliest solved structure I could find dates back to 2016, but in 2023 the resolution was improved from 6.9 Å to 3.1 Å, making it really close to a good-quality structure. The solved structures of ATP synthase in E. coli include multiple proteins as well as lipid molecules, detergent molecules, and even bound ATP or ADP.
In structural classification systems, subunit a belongs to the membrane protein α-helical bundle class, because it is mostly made of transmembrane α-helices.
Protein Visualizations: ATP Synthase 6oqw
Here's the protein as a cartoon in PyMOL
Here it is by secondary structure; it has a lot more helices than sheets
Here it is with hydrophobic in orange, polar in cyan, positive in blue, and negative in magenta. It's mostly hydrophobic, which is important for membrane stabilization.
The surface of the protein does not have any large holes and instead has narrow channels.
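The views above can be reproduced with a short PyMOL command script. The sketch below only builds the script text in Python so it can be saved as a .pml file and loaded in PyMOL; `pymol_script` is a hypothetical helper, and the grouping of residues into hydrophobic, polar, positive, and negative is my assumption (for example, histidine is treated as polar here).

```python
# Assumed residue grouping, matching the colors described above.
RESIDUE_GROUPS = {
    "orange": "ALA+VAL+LEU+ILE+MET+PHE+TRP+PRO+GLY",  # hydrophobic
    "cyan": "SER+THR+ASN+GLN+TYR+CYS+HIS",            # polar
    "blue": "LYS+ARG",                                # positive
    "magenta": "ASP+GLU",                             # negative
}


def pymol_script(pdb_id, view="cartoon", color_by="ss", surface=False):
    """Build a PyMOL command script as plain text."""
    if view not in ("cartoon", "ribbon", "ball_and_stick"):
        raise ValueError(f"unknown view: {view}")
    lines = [f"fetch {pdb_id}", "hide everything"]
    if view == "ball_and_stick":
        # Sticks plus small spheres approximate a ball-and-stick look.
        lines += ["show sticks", "show spheres",
                  "set sphere_scale, 0.25", "set stick_radius, 0.15"]
    else:
        lines.append(f"show {view}")
    if color_by == "ss":
        lines += ["color red, ss h", "color yellow, ss s", "color green, ss l+''"]
    elif color_by == "residue_type":
        for color, names in RESIDUE_GROUPS.items():
            lines.append(f"color {color}, resn {names}")
    if surface:
        lines.append("show surface")
    return "\n".join(lines)


print(pymol_script("6oqw", view="cartoon", color_by="residue_type", surface=True))
```

Generating the text instead of clicking through menus makes it easy to switch between cartoon, ribbon, and ball-and-stick views with the same coloring.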
Group Brainstorm on Bacteriophage Engineering
Computational Engineering of the MS2 Lysis Protein (L)
Background. The MS2 L protein is a 75-amino-acid polypeptide that lyses E. coli by an incompletely understood mechanism. Its C-terminal transmembrane (TM) domain inserts into the cytoplasmic membrane and oligomerizes, causing depolarization that triggers host autolytic enzymes to degrade the murein layer. Recessive, conservative missense mutations clustered around a conserved LS dipeptide strongly imply that L engages an unidentified host protein target rather than simply disrupting the bilayer. The dispensable N-terminal domain binds chaperone DnaJ (with solved PDB structures), modulating lysis timing. Its removal causes lysis ~20 min earlier. No experimental structure of L exists.
Goals. (1) Stabilize L for more robust membrane accumulation. (2) Accelerate lysis by bypassing DnaJ-dependent regulatory timing and improving delivery of functional L to the membrane. Because the downstream lytic target is unknown, we do not attempt to enhance per-molecule toxicity at the point of target engagement; we focus on removing regulatory brakes and increasing the supply of functional protein.
Pipeline: Three Tools, Each Non-Redundant
Clustal Omega (Conservation Map). Align L homologs across Leviviridae (MS2, f2, R17, GA, PP7, AP205, PRR1, M12, KU1, JP34). Conserved C-terminal residues, especially the LS motif, are presumed to mediate the unknown heterotypic interaction and are excluded from mutation. This map constrains all downstream design.
ESM2 + Deep Combinatorial Scanning (Fitness Oracle). Score every single-point mutation by log-likelihood change: increases at mutable positions indicate stabilizing substitutions (Goal 1). N-terminal scanning identifies mutations that disrupt DnaJ binding (Goal 2). A strict preservation rule applies near the LS motif: mutations are evaluated for maintenance of wild-type fitness, not improvement. The genetics show even conservative changes there cause recessive loss of function. Pairwise combinatorial scanning (~2M pairs) captures epistatic synergies at mutable positions. This could potentially be pushed further with enough compute.
AlphaFold 3 (Structural Filter + Complex Model). Predicts variant structures as a sanity check (does the TM helix survive?) and models the L–DnaJ complex to verify that N-terminal truncations/mutations disrupt the regulatory interface. Used as a filter, not a design engine. PAE matrix identifies confident interface contacts.
Ranking. Composite score: ESM2 log-likelihood gain (stability) + conservation preservation (all essential residues intact) + AF3-predicted DnaJ-binding disruption (for timing bypass). Top 10–20 variants advance to experimental validation.
Pipeline Schematic
Why Not More Tools? ProteinMPNN is excluded because it is trained on crystallized globular PDB proteins, not predicted structures of disordered membrane peptides. The compute is invested in combinatorial ESM2 depth.
Pitfalls
No experimental structure: All structural reasoning rests on AF3 predictions for a challenging target; mitigated by treating AF3 as a filter and cross-referencing against the conservation map.
Unknown lytic target: The central limitation. We cannot optimize target-binding affinity for an unidentified partner; engineering is restricted to upstream properties (stability, membrane delivery, DnaJ bypass).
Autolysin bottleneck: If lysis rate is limited by host autolytic enzyme activity rather than L accumulation, stabilization gains may show diminishing returns; the plaque assay will reveal this.
AI prompt: “Write PyMOL code to help me visualize the protein as “cartoon”, “ribbon” and “ball and stick”, color the protein by secondary structure, color the protein by residue type, and visualize the surface of the protein”
Week 5 HW: Protein Design Part II
HW5
Part A
Part 1: Generate Binders with PepMLM
I started by getting the SOD1 sequence from UniProt:
MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
Then I added the A4V mutation which changed it to:
MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
After running PepMLM I got these results:
Binder,Pseudo Perplexity
DHSYAVVVAWKX,15.47789718276849
DRYGAAAIKHKK,17.832445649262862
WSYPAAVVAHKX,7.216885793300712
DHVGAVAIEHKX,10.94105373182517
Compared to the known result of:
FLYRWLPSRRGG,20.63523127283615
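To compare the peptides at a glance, the table can be sorted by pseudo-perplexity. This is a small sketch using the values above; `rank_by_perplexity` is a hypothetical helper, and it assumes that a lower pseudo-perplexity means the model considers the peptide a more plausible binder.

```python
import csv
import io

# Values copied from the PepMLM output above, plus the known binder.
RESULTS = """Binder,Pseudo Perplexity
DHSYAVVVAWKX,15.47789718276849
DRYGAAAIKHKK,17.832445649262862
WSYPAAVVAHKX,7.216885793300712
DHVGAVAIEHKX,10.94105373182517
FLYRWLPSRRGG,20.63523127283615
"""


def rank_by_perplexity(csv_text):
    """Return (binder, pseudo_perplexity) pairs sorted from lowest to highest."""
    rows = csv.DictReader(io.StringIO(csv_text))
    scored = [(row["Binder"], float(row["Pseudo Perplexity"])) for row in rows]
    return sorted(scored, key=lambda item: item[1])


for binder, score in rank_by_perplexity(RESULTS):
    print(f"{binder}\t{score:.2f}")
```

By this measure alone, WSYPAAVVAHKX ranks first and the known binder ranks last, although perplexity is only one signal and does not replace the structure-based scores.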
The known binder's AlphaFold scores (ipTM, pTM):
FLYRWLPSRRGG, 0.3, 0.82
I modeled the complex between A4V mutant superoxide dismutase 1 (SOD1) and each generated peptide using AlphaFold. The PepMLM-generated peptides produced ipTM scores ranging from 0.38 to 0.51, while the known SOD1-binding peptide FLYRWLPSRRGG had an ipTM score of 0.30. The highest scoring peptide, DHSYAVVVAWKX, achieved an ipTM of 0.51 with a pTM of 0.89, suggesting a stronger predicted interaction with the protein compared to the known binder. The other generated peptides (DRYGAAAIKHKK, WSYPAAVVAHKX, and DHVGAVAIEHKX) also showed moderate interaction scores between 0.38–0.41, all exceeding the known peptide’s ipTM. Overall, these results suggest that the PepMLM model was able to generate candidate peptides with predicted binding strengths comparable to or stronger than the previously known SOD1 binder.
Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse
Among my peptides, WSYPAAVVAHKX is the best candidate to advance. All three peptides are predicted to be soluble and non-hemolytic with similar binding affinities (pKd/pKi ≈ 5.3), indicating comparable binding strength. However, WSYPAAVVAHKX has the lowest hemolysis probability (0.014) and a moderate net charge (0.85), which may improve stability and reduce potential toxicity. In contrast, DRYGAAAIKHKK has slightly stronger predicted binding but a much higher positive charge (2.84) and a strongly negative hydrophobicity score, which could affect membrane interactions. DHSYAVVVAWKX has near-neutral charge but a higher hemolysis probability than WSYPAAVVAHKX. Overall, WSYPAAVVAHKX provides the best balance of binding affinity, safety, and physicochemical properties, making it the most promising peptide to move forward with.
Part 4: Generate Optimized Peptides with moPPIt
I had issues running the Colab without a GPU, but here are my predictions. moPPIt was intended to generate 12-amino-acid peptides that bind to specific residues on the A4V mutant SOD1 while optimizing multiple properties like binding affinity, solubility, and low hemolysis. Unlike PepMLM, which broadly samples possible binders from the target sequence, moPPIt uses guided generation to steer peptides toward a chosen binding site and optimize several objectives simultaneously. As a result, moPPIt peptides would be expected to be more targeted and better optimized for therapeutic properties than the PepMLM peptides.
Before advancing these peptides further, I would first evaluate them computationally by predicting their structures and docking them to the SOD1 binding site to estimate binding strength. I would also check properties like stability, aggregation risk, toxicity, and immunogenicity. The most promising peptides should then be synthesized and tested experimentally with binding assays and cell-based tests to confirm binding, stability, and safety before considering preclinical studies.
Part C: Final Project: L-Protein Mutants
L-Protein Engineering | Option 1: Mutagenesis
L-Protein Engineering | Option 2: Mutagenesis using AF2-Multimer
To disrupt the interaction between the L-protein and DnaJ, I designed mutations in the soluble region that either change charge, disrupt hydrophobic contacts, or introduce steric changes that could weaken the binding interface. I avoided positions that appear highly conserved in the sequence alignments and focused on residues that are likely surface exposed.
Mutation Set 1:
E45K, L48D, Y52A
This mutation set flips the charge of E45 and introduces a charged residue at L48, which could disrupt electrostatic and hydrophobic interactions at the binding interface.
Mutation Set 2:
A60D, V63K, L67D
These substitutions introduce charged residues into a region that is likely hydrophobic, which may destabilize the interaction surface with DnaJ.
Mutation Set 3:
F72A, Y75A, L78D
Replacing aromatic and hydrophobic residues with alanine or charged residues reduces hydrophobic packing and weakens potential binding contacts.
Mutation Set 4:
K85E, R88E, L91A
This mutation set reverses positive charges that may interact with DnaJ and replaces a hydrophobic residue with alanine to reduce interface stability.
Mutation Set 5:
V96D, A99K, L102D
Introducing charged residues in this region may disrupt local structure or electrostatic interactions that stabilize the L-protein–DnaJ complex.
To evaluate these mutants, I would co-fold each mutant L-protein with DnaJ using AF2-Multimer and compare predicted interface contacts, confidence scores, and binding geometry. Mutants that show reduced interface contacts or lower interaction confidence would be considered more effective at disrupting the interaction.
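Before co-folding, it is worth checking each mutation code against the actual L-protein sequence, since a code like E45K only makes sense if position 45 exists and really is a glutamate. The background above describes L as a 75-amino-acid protein, so a check like this would flag any position above 75. The sketch below uses hypothetical helper names and a toy sequence, not the real L protein.

```python
import re


def parse_mutation(code):
    """Split a code such as 'E45K' into (wild_type, position, new_residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code.strip().upper())
    if match is None:
        raise ValueError(f"not a valid mutation code: {code!r}")
    wild_type, position, new_residue = match.groups()
    return wild_type, int(position), new_residue


def check_mutations(seq, codes):
    """Return (code, problem) pairs for mutations inconsistent with seq."""
    problems = []
    for code in codes:
        wild_type, position, _ = parse_mutation(code)
        if position < 1 or position > len(seq):
            problems.append((code, f"position {position} is outside the "
                                   f"{len(seq)}-residue sequence"))
        elif seq[position - 1] != wild_type:
            problems.append((code, f"expected {wild_type} at {position}, "
                                   f"found {seq[position - 1]}"))
    return problems


def apply_mutations(seq, codes):
    """Return the mutant sequence, refusing to apply inconsistent codes."""
    if check_mutations(seq, codes):
        raise ValueError("fix the mutation list before applying it")
    residues = list(seq)
    for code in codes:
        _, position, new_residue = parse_mutation(code)
        residues[position - 1] = new_residue
    return "".join(residues)


toy = "MKTEL"  # toy sequence for illustration only
print(check_mutations(toy, ["E4K", "L9D", "A2G"]))
print(apply_mutations(toy, ["E4K"]))
```

Running the five mutation sets through `check_mutations` with the real sequence would confirm the numbering before spending compute on AF2-Multimer.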
L-Protein Engineering | Option 3: Random Mutagenesis
‘Help me write python code to create random protein mutations’
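One way such code could look is sketched below. It is a minimal illustration with hypothetical names: `random_mutant` picks random positions, swaps each for a different standard amino acid, and skips any protected positions (for example, a conserved motif). The sequence shown is a toy example, not the L protein.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def random_mutant(seq, n_mutations=1, protected=(), rng=None):
    """Return (mutant_sequence, mutation_labels) with n random point mutations.

    `protected` holds 1-based positions that must not be changed.
    """
    rng = rng or random.Random()
    protected = set(protected)
    allowed = [i for i in range(len(seq)) if (i + 1) not in protected]
    if n_mutations > len(allowed):
        raise ValueError("not enough mutable positions")
    residues = list(seq)
    labels = []
    for i in sorted(rng.sample(allowed, n_mutations)):
        # Exclude the current residue so every mutation is a real change.
        choices = [aa for aa in AMINO_ACIDS if aa != residues[i]]
        new_residue = rng.choice(choices)
        labels.append(f"{residues[i]}{i + 1}{new_residue}")
        residues[i] = new_residue
    return "".join(residues), labels


# Toy sequence; positions 6 and 7 stand in for a protected motif.
mutant, labels = random_mutant("MKTELLS", n_mutations=2, protected={6, 7},
                               rng=random.Random(0))
print(mutant, labels)
```

Passing a seeded `random.Random` makes a run repeatable, which helps when the same mutant list has to be regenerated later.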
Week 6 HW: Genetic Circuits Part I
HW6
PCR and DNA Assembly
What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?
The Phusion High-Fidelity PCR Master Mix contains a high-fidelity DNA polymerase for accurate DNA replication, dNTPs as the building blocks for new DNA strands, a buffer to maintain optimal reaction conditions, and magnesium ions which act as a cofactor for the polymerase. Together, these components enable efficient and precise DNA amplification.
What are some factors that determine primer annealing temperature during PCR?
Primer annealing temperature is primarily determined by the melting temperature of the primers, which depends on their length and GC content. Higher GC content raises the melting temperature because G-C pairs form three hydrogen bonds while A-T pairs form only two.
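As a rough illustration of how length and GC content set the melting temperature, the classic Wallace rule estimates Tm = 2 °C × (A+T) + 4 °C × (G+C). This is only a rule of thumb for short primers and is not what dedicated Tm calculators use (those rely on nearest-neighbor thermodynamics and salt conditions); the function names and example primers here are hypothetical.

```python
def gc_content(primer):
    """Fraction of G and C bases in a primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rule-of-thumb melting temperature in degrees C (Wallace rule)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# Two hypothetical primers of equal length but different GC content.
for primer in ("ATATATATATATATATAT", "GCGCGCATATGCGCGCAT"):
    print(primer, f"GC={gc_content(primer):.2f}", f"Tm~{wallace_tm(primer)} C")
```

The second primer has the same length but more G and C bases, so its estimated Tm is higher.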
There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.
PCR generates linear DNA fragments by amplifying a target sequence using primers and a polymerase through thermal cycling, allowing for flexible design and modification of DNA ends. In contrast, restriction enzyme digestion cuts DNA at specific recognition sites, producing predictable fragments but with less flexibility. PCR is better when creating new sequences or adding overlaps, while restriction digests are better when cutting existing DNA at known sites for cloning.
How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?
To ensure compatibility with Gibson cloning, the DNA fragments must have overlapping homologous regions that allow them to anneal during assembly. They should be designed to keep the correct orientation and reading frame, and checked using sequence analysis tools. Gel electrophoresis can be used to confirm fragment size before assembly.
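The overlap requirement can be checked with a few lines of code. The sketch below assumes all fragments are written 5' to 3' on the same strand, and the 20 bp minimum is an assumed default rather than a value from the protocol; `shared_overlap` and `check_circular_assembly` are hypothetical helper names.

```python
def shared_overlap(upstream, downstream, min_len=20):
    """Length of the longest end of `upstream` that matches the start of
    `downstream`, or 0 if it is shorter than `min_len`."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    upstream, downstream = upstream.upper(), downstream.upper()
    longest = min(len(upstream), len(downstream))
    for n in range(longest, min_len - 1, -1):
        if upstream[-n:] == downstream[:n]:
            return n
    return 0


def check_circular_assembly(fragments, min_len=20):
    """Overlap length at every junction, including last fragment -> first."""
    return [
        shared_overlap(fragments[i], fragments[(i + 1) % len(fragments)], min_len)
        for i in range(len(fragments))
    ]


# Toy fragments with short overlaps, so min_len is lowered for the demo.
backbone = "CGTACGTTTTTGGGGAAAA"
insert = "GGGGAAAACCCCCGTACGT"
print(check_circular_assembly([backbone, insert], min_len=4))
```

A junction that reports 0 means the two fragments would not anneal there and the primers need redesigning.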
How does the plasmid DNA enter the E. coli cells during transformation?
Plasmid DNA enters cells through heat shock or electroporation. Heat shock temporarily increases membrane permeability because of the rapid temperature change, and electroporation uses an electric field to create pores in the cell membrane so DNA can pass into the cell.
Describe another assembly method in detail
Golden Gate Assembly uses Type IIS restriction enzymes that cut outside of their recognition sites to generate custom overhangs. These overhangs allow multiple DNA fragments to be assembled in a specific order in a single reaction. DNA ligase then joins the fragments together, resulting in a seamless construct without extra bases.
Explain the other method in 5–7 sentences plus diagrams
Golden Gate Assembly relies on Type IIS restriction enzymes, which cut DNA outside of their recognition sequences to produce specific overhangs. These overhangs are designed so that different DNA fragments can anneal in a predetermined order. During the reaction, restriction enzymes cut the DNA while ligase simultaneously joins compatible fragments, driving the system toward the correct final product. Because the recognition sites are removed during assembly, the resulting DNA sequence is seamless. This method is best for assembling multiple fragments in a single reaction and is popular for synthetic biology, but it requires careful design of overhang sequences to avoid incorrect assembly. It is a fast and scalable method for constructing complex DNA designs.
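The "careful design of overhang sequences" can be partly automated. The sketch below takes one overhang per junction (each junction is shared by the two fragments it joins) and flags overhangs that are palindromic, repeated, or the reverse complement of another, since any of these could let fragments join in the wrong order or orientation. The function names and example overhangs are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def overhang_problems(overhangs):
    """Return (overhang, reason) pairs for junction overhangs likely to misassemble."""
    problems = []
    seen = set()
    for overhang in (o.upper() for o in overhangs):
        rc = reverse_complement(overhang)
        if overhang == rc:
            problems.append((overhang, "palindromic, can ligate to itself"))
        elif overhang in seen:
            problems.append((overhang, "used at more than one junction"))
        elif rc in seen:
            problems.append((overhang, "reverse complement of another junction"))
        seen.add(overhang)
    return problems


print(overhang_problems(["AATG", "GCTT", "CGCT"]))  # no conflicts
print(overhang_problems(["AATG", "ACGT", "CATT"]))  # two conflicts
```

This only catches exact conflicts; overhangs that differ by a single base can still cross-ligate at a low rate, so published overhang sets are a safer starting point.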
Model this assembly method with Benchling or Asimov Kernel
For this part of the assignment, I used Benchling to create a Golden Gate Assembly of the pUC19 backbone with the GFP insert.
I used primers to ensure that I was able to include the entirety of the insert into the plasmid.
Here's what it looked like in the end:
Asimov Kernel
For the first part of the kernel homework I looked into the repressilator and multiplexer circuits. The repressilator is a synthetic oscillatory circuit made of three genes that repress each other in a loop, causing time-delayed negative feedback that creates periodic oscillations of protein expression. The multiplexer is a logic-based circuit that uses L-arabinose, aTc, and IPTG as inputs and produces a high GFP output only for specific combinations of these inputs. It was interesting to see their simulations and compare them to electronic circuits.
I also had AI help create a cheat sheet:
Cheat Sheet
pBad – activated by arabinose
pTet – repressed by TetR, activated by aTc
pLacI / pTac – repressed by LacI, activated by IPTG
RBS – controls translation strength
TetR – represses pTet
LacI – represses Lac promoters
GFP – fluorescent output
L3S2P24 Terminator – stops transcription
aTc – inhibits TetR
IPTG – inhibits LacI
Arabinose – activates pBad
structure - Promoter → RBS → CDS → Terminator → Promoter → RBS → CDS → Terminator
For the second part, I recreated the repressilator and simulated it.
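For intuition about why the loop oscillates, here is a minimal sketch of a simplified, protein-only repressilator model integrated with Euler steps. This is not the Asimov Kernel simulation: the equations (each protein is produced at a rate repressed by the previous protein in the loop and decays at rate 1), the parameter values, and the function name are all assumptions chosen for illustration.

```python
def simulate_repressilator(alpha=50.0, n=3.0, dt=0.01, steps=5000,
                           start=(1.0, 2.0, 3.0)):
    """Euler integration of dp_i/dt = alpha / (1 + p_(i-1)^n) - p_i for 3 proteins."""
    p = list(start)
    history = [tuple(p)]
    for _ in range(steps):
        # Protein i is repressed by protein i-1; index -1 wraps around,
        # which closes the three-gene loop.
        rates = [alpha / (1.0 + p[i - 1] ** n) - p[i] for i in range(3)]
        p = [p[i] + dt * rates[i] for i in range(3)]
        history.append(tuple(p))
    return history


history = simulate_repressilator()
# Print the first protein's level at a few time points.
for step in range(0, len(history), 1000):
    print(f"t={step * 0.01:5.1f}  p1={history[step][0]:.2f}")
```

With strong, cooperative repression (large `alpha` and `n`), this kind of model tends to cycle rather than settle, which matches the oscillating behavior seen in the circuit simulation.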
For the third part, I tried to design 3 circuits, a toggle switch, a pulse, and a tiered cascade.
The toggle switch should turn off when aTc is added and on with IPTG. This should happen by turning the transcript for LacI on and the transcript for TetR off, causing GFP production to stop. This one worked in simulation, though I also expected it to respond to IPTG.
The pulse should show a sharp increase and then a decrease when arabinose is added. This should happen by turning the transcript for TetR on, so that GFP goes from zero to high production and then decreases back to zero. This one kind of worked in simulation: I expected GFP to start and end at zero, but instead the pulse only went from a relatively low level to a high one, which might affect the visibility of the glow but won’t cause a flicker.
The cascade should look like stairs, starting from one rate of GFP production then stepping down to a lower rate of GFP production when arabinose is detected. This should happen by turning the transcript for TetR on, so that the transcript for GFP goes from low to high production, then decreases back to zero. This one did not work at all; there was no response to the arabinose. I believe that there was something wrong with the sensing part of the circuit, but I couldn’t troubleshoot it. ChatGPT recommended some changes to the TetR part to ensure detection of arabinose, but I wasn’t sure how to make them.
Week 7 HW: Genetic Circuits Part II
HW7
Intracellular Artificial Neural Networks
What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?
Intracellular Artificial Neural Networks use continuous analog signals instead of binary ones, which allows them to understand complex inputs like concentrations as opposed to just noting presence. They can use this to perform thresholding, enabling more complex reactions with fewer components. Overall, they are more scalable and better at multi-input sensing than regular genetic circuits.
Describe a useful application for an IANN; include a detailed description of input/output behavior, as well as any limitations an IANN might face to achieve your goal.
Diseases like cancer present many biomarkers at various concentrations, so an IANN could be used to engineer smart therapeutic cells that detect tumors by integrating multiple biomarkers. The inputs would be markers like hypoxia, lactate, and cytokine levels. These would all be sensed on a continuous scale and processed by taking a weighted combination of the inputs followed by a threshold response, so the cell only responds when levels are high enough to indicate a tumor. The output could then be something like the expression of a therapeutic protein or reporter. Some limitations of this system would be biological noise, difficulty tuning weights, and slow response times.
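The weighted-combination-plus-threshold behavior can be sketched as a single artificial neuron. Everything numeric here is an assumption for illustration: the inputs are normalized to a 0 to 1 scale, and the weights, threshold, and steepness are made-up values rather than measured ones.

```python
import math


def iann_output(inputs, weights, threshold, steepness=10.0):
    """Weighted sum of normalized inputs passed through a sigmoid threshold.

    Returns a value between 0 (no response) and 1 (full response).
    """
    total = sum(weights[name] * level for name, level in inputs.items())
    return 1.0 / (1.0 + math.exp(-steepness * (total - threshold)))


# Hypothetical weights and biomarker levels (0 = absent, 1 = saturating).
weights = {"hypoxia": 0.4, "lactate": 0.3, "cytokine": 0.3}
healthy = {"hypoxia": 0.1, "lactate": 0.2, "cytokine": 0.1}
tumor = {"hypoxia": 0.9, "lactate": 0.8, "cytokine": 0.7}

print(f"healthy tissue: {iann_output(healthy, weights, threshold=0.5):.3f}")
print(f"tumor-like:     {iann_output(tumor, weights, threshold=0.5):.3f}")
```

Because the sum is weighted, no single marker has to cross a hard cutoff on its own; a moderately elevated combination can still trigger the response, which a Boolean AND gate cannot do.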
Draw a diagram for an intracellular multilayer perceptron where layer 1 outputs an endoribonuclease that regulates a fluorescent protein output in layer 2.
Fungal Materials
What are some examples of existing fungal materials and what are they used for? What are their advantages and disadvantages over traditional counterparts?
Fungal materials made from mycelium are used for packaging, insulation, and leather alternatives. The pros are that these materials are biodegradable, sustainable, and can be grown with low energy input while utilizing waste. They are also often lighter than their counterparts. However, they are generally less strong and durable than plastics or metals, can be quite sensitive to moisture, and are challenging to mass-produce.
What might you want to genetically engineer fungi to do and why? What are the advantages of doing synthetic biology in fungi as opposed to bacteria?
You could engineer fungi to produce building materials that capture and store carbon taken up from their growth substrates. Fungi naturally form filamentous, multicellular networks ideal for shaping walls, panels, or insulation, and can sequester carbon directly in their biomass. The main challenges are that fungi grow more slowly and are more complex to engineer genetically than bacteria.
First DNA Twist Order
For my final project, I want to make a “hydration” checking wearable device. Originally, I wanted to sense increased sodium levels in sweat, but that proved to be difficult, so instead I’m approximating increased hydration need by just detecting lactate. I want this to be a cell-free system to make it more compatible with a safe wearable device.
I’m still working on my first DNA order. I want it to be cell-free and got some help designing a genetic circuit, but I want to verify it against the cell-free lesson before placing the order. I want it to follow this genetic circuit:
Lactate (input signal from sweat) -> Lactate Oxidase (breaks lactate down into) -> Pyruvate (byproduct, not used) + H2O2 (which then activates OxyR by oxidizing it) -> OxyR (has two states: it starts reduced, then is oxidized to become active and bind DNA to activate transcription from PoxyS) -> PoxyS Promoter (controlled by OxyR, switches on the reporter gene) -> RNA Aptamer (produced when the PoxyS promoter is enabled, binds to the dye) -> DFHBI Dye (non-fluorescent until bound to the RNA aptamer)
All together it follows this chain:
Lactate provides the biological input (increased sweat means increased need for hydration)
Lactate Oxidase and H2O2 convert it into a detectable chemical signal
OxyR and PoxyS act as a switch
RNA aptamer and dye generate a fast fluorescent output (that way the glow happens quicker than with protein translation, and it is safer for a wearable because there are no cells)
Week 9 HW: Cell Free Systems
HW9
General homework questions
Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.
Cell-free gives more flexibility and control because it operates in a system where you can directly control concentrations of DNA, ions, and cofactors. This allows rapid prototyping and expression of toxic or non-natural proteins without killing your host cell. It is especially beneficial for producing toxic proteins and for quickly testing genetic circuits.
Describe the main components of a cell-free expression system and explain the role of each component.
A cell-free expression system contains a cell extract with transcription and translation machinery, a DNA or mRNA template encoding the protein, and an energy system to fuel synthesis. It also includes amino acids as building blocks, plus salts and cofactors that support proper folding and reaction stability.
Why is energy provision regeneration critical in cell-free systems? Describe a method you could use to ensure continuous ATP supply in your cell-free experiment.
Energy regeneration is critical because protein synthesis quickly consumes ATP and GTP. Without replenishment the reaction stops. Continuous ATP supply can be maintained by using systems like phosphoenolpyruvate with pyruvate kinase to regenerate ATP from ADP.
Compare prokaryotic versus eukaryotic cell-free expression systems. Choose a protein to produce in each system and explain why.
Prokaryotic cell-free systems are fast and high-yield while eukaryotic systems support proper folding and modifications but are slower and more expensive. A simple protein like GFP is best produced in a prokaryotic system, but a complex protein like an antibody requires a eukaryotic system.
How would you design a cell-free experiment to optimize the expression of a membrane protein? Discuss the challenges and how you would address them in your setup.
A cell-free system can include liposomes or nanodiscs to mimic a membrane environment and prevent aggregation. Challenges include hydrophobic misfolding and instability, which can be addressed by carefully tuning reaction conditions to improve proper insertion and folding.
Imagine you observe a low yield of your target protein in a cell-free system. Describe three possible reasons for this and suggest a troubleshooting strategy for each.
Low protein yield might be caused by poor transcription, inefficient translation, or energy depletion. These can be addressed by using stronger promoters or more DNA, optimizing ribosome binding sites or codons, and improving ATP regeneration systems.
Homework question from Kate Adamala
Design an example of a useful synthetic minimal cell as follows:
Pick a function and describe it.
What would your synthetic cell do? What is the input and what is the output?
For my final project, I want to make a lactate-detecting cell-free system. It would detect lactate concentration and produce a measurable signal. It acts as a biochemical sensor that converts a lactate input into a fluorescent output and is useful for monitoring physiological or environmental lactate levels.
Could this function be realized by cell-free Tx/Tl alone, without encapsulation?
Yes, that’s how I plan to do it for my final project, but encapsulation would improve control over sensing and environmental interaction and might also let me make the reaction resettable. Without a membrane, diffusion and signal localization are harder to regulate.
Could this function be realized by a genetically modified natural cell?
A genetically modified natural cell could also perform lactate sensing using my genetic circuit. However, natural cells introduce complexity, slower response times, and biosafety concerns. My circuit could also produce a large amount of H2O2, which is harmful to natural cells but maybe not to synthetic ones.
Describe the desired outcome of your synthetic cell operation.
The desired outcome is a system that produces a reliable, quantifiable signal proportional to lactate concentration. The response should be sensitive, fast, and specific to lactate. This allows accurate monitoring in real time.
Design all components that would need to be part of your synthetic cell.
What would the membrane be made of?
The membrane would be composed of a lipid bilayer combined with cholesterol for stability. This provides a semi-permeable barrier similar to natural cells. It also helps maintain internal reaction conditions.
What would you encapsulate inside? Enzymes, small molecules.
Inside the cell would be transcription/translation machinery, DNA encoding the sensor and reporter, amino acids, ATP regeneration components, and cofactors. My cell in particular would also include lactate oxidase and DFHBI dye.
Which organism your Tx/Tl system will come from? Is bacterial OK, or do you need a mammalian system for some reason? (hint: for example, if you want to use small molecule modulated promotors, like Tet-ON, you need mammalian)
I plan to use E. coli, which is sufficient for expressing simple reporter proteins and OxyR. A mammalian system is not necessary unless complex regulatory elements are required. My second goal was to track sodium levels, and a mammalian system might be best for that.
How will your synthetic cell communicate with the environment? (hint: are substrates permeable? or do you need to express the membrane channel?)
Lactate would diffuse across the membrane or through membrane channels. If lactate permeability is low, the transporter protein lactate permease could be included.
Experimental details
List all lipids and genes.
Lipids: POPC and cholesterol. Genes: OxyR (the regulator that responds to H2O2 produced from lactate), a promoter controlled by OxyR, an RNA aptamer reporter that binds the dye, lactate oxidase, and possibly lactate permease for transport.
How will you measure the function of your system?
By monitoring fluorescence intensity over time using a plate reader or microscope. Increased fluorescence indicates a higher lactate concentration.
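To turn plate-reader fluorescence into a lactate estimate, one option is a standard curve built from wells with known lactate concentrations. The sketch below is a minimal example that assumes the response is roughly linear over the range of interest (a real sensor may saturate). The function names and all numbers are hypothetical placeholders, not measurements.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_lactate(fluorescence_au, slope, intercept):
    """Invert the calibration line to estimate lactate (mM) from a reading."""
    return (fluorescence_au - intercept) / slope


# Hypothetical standards: lactate in mM and fluorescence in arbitrary units.
standards_mM = [0.0, 5.0, 10.0, 20.0]
readings_au = [100.0, 600.0, 1100.0, 2100.0]

slope, intercept = fit_line(standards_mM, readings_au)
print(estimate_lactate(1600.0, slope, intercept))
```

If the curve flattens at high lactate, the linear fit could be restricted to the lower standards or swapped for a saturating model.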
Homework question from Peter Nguyen
Write a one-sentence summary pitch sentence describing your concept.
I want to make a freeze-dried cell-free system embedded in workout clothes that detects lactate buildup in sweat and changes color to indicate fatigue and performance.
How will the idea work, in more detail? Write 3-4 sentences or more.
Cell-free reactions containing a lactate-responsive genetic circuit are freeze-dried into removable patches that are attached to various locations on the fabric. When the wearer sweats, moisture rehydrates the system and lactate diffuses into the reaction, activating expression of an RNA aptamer that makes the reporter dye fluoresce. The intensity of the signal correlates with lactate concentration, providing real-time feedback on muscle fatigue.
What societal challenge or market need will this address?
I'm not sure if anyone needs this; I just think it would be cool. Originally, I wanted to do something that informs the wearer of dehydration.
How do you envision addressing the limitation of cell-free reactions (e.g., activation with water, stability, one-time use)?
I hope that sweat provides enough water to activate the freeze-dried systems, and using modular replaceable patches would allow for multiple uses and control stability.
Homework question from Ally Huang
Provide background information that describes the space biology question or challenge you propose to address. Explain why this topic is significant for humanity, relevant for space exploration, and scientifically interesting.
Microgravity leads to rapid muscle atrophy due to reduced mechanical loading, posing a major risk for long-duration space missions. Current countermeasures, like exercise, are time-consuming and not fully effective. Understanding and monitoring molecular pathways that regulate muscle growth and degradation is essential for developing better interventions. This is significant for maintaining astronaut health, enabling deep space exploration, and advancing treatments for muscle-wasting diseases on Earth. Studying these pathways in space also provides insight into fundamental mechanisms of mechanobiology.
Name the molecular or genetic target that you propose to study. Examples of molecular targets include individual genes and proteins, DNA and RNA sequences, or broader approaches.
Myostatin (MSTN) and muscle growth regulators, including IGF-1 pathway gene expression markers.
Describe how your molecular or genetic target relates to the space biology question or challenge your proposal addresses.
Myostatin negatively regulates muscle growth, while IGF-1 promotes muscle synthesis and repair. In microgravity, increased myostatin activity and reduced anabolic signaling contribute to muscle loss. By monitoring these molecular targets, we can track the balance between muscle degradation and growth. This helps identify when and how muscle atrophy occurs in space. Understanding these changes enables development of targeted countermeasures to maintain muscle mass.
Clearly state your hypothesis or research goal and explain the reasoning behind it.
Microgravity conditions upregulate myostatin expression and downregulate IGF-1 signaling, leading to accelerated muscle atrophy. A cell-free system can be used to detect changes in expression of these markers from biological samples in space. The goal is to develop a lightweight, rapid diagnostic platform that monitors muscle health in real time using the stability and simplicity of freeze-dried cell-free systems to make it suitable for spaceflight. If successful, it could guide personalized countermeasures such as exercise timing or therapeutic interventions. This would improve astronaut health and mission performance during long-duration space travel.
Outline your experimental plan - identify the sample(s) you will test in your experiment, including any necessary controls, the type of data or measurements that will be collected, etc.
Samples containing RNA or DNA proxies for MSTN and IGF-1 expression will be analyzed using BioBits® cell-free reactions designed to produce fluorescence in response to target sequences. miniPCR® will amplify target sequences if needed, and fluorescence will be measured using the P51 viewer. Controls will include samples without target sequences and with known concentrations. Data will be collected as fluorescence intensity over time and compared across simulated microgravity conditions. This will quantify changes in muscle-related gene expression.
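One simple way to compare conditions is to express each sample's fluorescence as a fold change over the no-target control. The sketch below assumes endpoint readings in arbitrary units with a few replicates per condition. The function name, condition labels, and numbers are made up for illustration.

```python
def fold_over_control(sample_au, no_target_au):
    """Mean sample fluorescence divided by mean no-target control fluorescence."""
    mean_sample = sum(sample_au) / len(sample_au)
    mean_control = sum(no_target_au) / len(no_target_au)
    if mean_control <= 0:
        raise ValueError("no-target control must have positive fluorescence")
    return mean_sample / mean_control


# Hypothetical endpoint readings (arbitrary units), three replicates each.
readings = {
    "ground": [410.0, 390.0, 400.0],
    "simulated_microgravity": [820.0, 780.0, 800.0],
}
no_target = [200.0, 190.0, 210.0]

for condition, values in readings.items():
    print(condition, round(fold_over_control(values, no_target), 2))
```

The same function could be applied at each time point to follow the fold change over the course of the reaction.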
First DNA Twist Order
For my final project, I want to make a “hydration” checking wearable device. Originally, I wanted to sense increased sodium levels in sweat, but that proved to be difficult, so instead I’m approximating increased dehydration risk by just detecting lactate. I want this to be a cell-free system to make it more compatible with a safe wearable device.
Lactate (input signal from sweat) -> Lactate Oxidase (breaks lactate down into) -> Pyruvate (byproduct, not used) + H2O2 (which then activates OxyR by oxidizing it) -> OxyR (has two states: it starts reduced, then is oxidized to become active and bind DNA, activating transcription from PoxyS) -> PoxyS Promoter (controlled by OxyR, switches on the reporter gene) -> RNA Aptamer (produced when the PoxyS promoter is enabled, binds to the dye) -> DFHBI Dye (non-fluorescent on its own, fluorescent when bound to the RNA aptamer)
All together it follows this chain:
Lactate provides the biological input (increased sweat means increased need for hydration)
Lactate Oxidase and H2O2 convert it into a detectable chemical signal
OxyR and PoxyS act as a switch
RNA aptamer and dye generate a fast fluorescent output (the glow appears sooner than it would with protein translation, and the system is safer for a wearable because there are no cells)
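To get a feel for how this chain might behave over time, here is a toy simulation of the cascade using simple Euler integration. It is only a sketch: it assumes lactate stays constant, OxyR activation follows a Hill function of H2O2, and DFHBI is in excess so that fluorescence tracks aptamer level. Every rate constant is a hypothetical placeholder, not a measured value.

```python
def simulate_aptamer(lactate_mM, minutes=60.0, dt=0.01):
    """Euler-integrate a toy lactate -> H2O2 -> OxyR -> aptamer cascade.

    Returns the aptamer level (arbitrary units) at the end of the run,
    taken as proportional to fluorescence when DFHBI is in excess.
    """
    # All constants below are illustrative assumptions.
    k_lox = 0.05        # H2O2 produced per mM lactate per minute
    k_h2o2_loss = 0.1   # first-order H2O2 loss per minute
    k_half = 1.0        # H2O2 level giving half-maximal OxyR activation
    hill_n = 2.0        # Hill coefficient for OxyR activation
    k_tx = 1.0          # maximal aptamer transcription rate per minute
    k_rna_loss = 0.05   # first-order aptamer degradation per minute

    h2o2 = 0.0
    aptamer = 0.0
    for _ in range(round(minutes / dt)):
        # Fraction of OxyR in the oxidized (active) state.
        active_oxyr = h2o2**hill_n / (k_half**hill_n + h2o2**hill_n)
        d_h2o2 = k_lox * lactate_mM - k_h2o2_loss * h2o2
        d_aptamer = k_tx * active_oxyr - k_rna_loss * aptamer
        h2o2 += d_h2o2 * dt
        aptamer += d_aptamer * dt
    return aptamer


for lactate in (0.0, 5.0, 20.0):
    print(lactate, simulate_aptamer(lactate))
```

With these placeholder values the output rises with lactate and levels off once OxyR is mostly active, which is the switch-like behavior described above. Real parameters would have to come from experiments.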