Homework

Weekly homework submissions:

Week 1 HW: Principles and Practices
HW1 Ultra-efficient DNA Synthesis Machine My research is on designing a DNA synthesis machine that can reduce the cost and time to produce long strands of arbitrary DNA sequences. Right now we’re aiming for megabase strands of DNA but the goal of the project is to eventually get to the Gb range and establish a technology that can scale and improve similar to transistors in Moore’s Law. For context, I work mostly on the mechanical side and hope to learn more about the biochemistry and synthetic biology in general from this class.
Week 10 HW: Imaging and Measurement
HW10 Homework: Final Project Please identify at least one (ideally many) aspect(s) of your project that you will measure. For my final project, I am developing a wearable hydration tracker that measures biomarkers related to hydration status from sweat samples. The project is designed to continuously monitor hydration in real time using a small wearable device.
Week 11 HW: Bioproduction and Cloud Labs
HW11 Part A At some point during the global artwork process, I drew a little turtle in the bottom left of the top left plate. It later evolved into the border for the 2026 media lab side of the design but some of the pixels still exist. I loved seeing the artwork evolve over time and seeing what each person created and how, sometimes, people would join together asynchronously to complete designs. I wished the cooldown time was a little less; the 20-second period was more fun for individual creations but the longer time was better for collaboration and competition.
Week 12 HW: Building Genomes
HW12 See Week 11 HW and Week 11 Lab for content, along with my individual final project page
Week 13 HW: Bio Design Living Materials
HW13 Check out my Final Project Page for this week's HW
Week 14 HW: Biofabrication
HW14 See my Final Project Presentation page
Week 2 HW: DNA Read, Write, and Edit
HW2 Part 1: Benchling & In-silico Gel Art This week, we made gel electrophoresis art using Lambda phage DNA and ten restriction enzymes. Gel electrophoresis uses a positive charge to pull negatively charged DNA through a conductive gel. Longer strands move slower and shorter strands move faster meaning that different lengths of DNA fragments will appear as different bars in your gel. To use this in an artistic context we take our input Lambda DNA and cut it to different lengths using different restriction enzymes which allows us to have coarse control over where these bars end up and thus we can make art with it. I have decided to really commit to my favorite animal, turtles, this semester and try to have a turtle-inspired theme to all of my projects. In an ideal world this is what I wanted my gel art to look like.
Week 3 HW: Lab Automation
HW3 Lab Preparation: Opentrons Artwork This week, we programmed the Opentrons liquid handling robot to create fluorescent protein masterpieces. I was really looking forward to this lab and even did last week's homework on expressing GFP in E. coli. Rather than just GFP, it turned out we used a variety of different colors of superfluorescent proteins. Ronan’s webtool [1] made it really easy to visualize a design, and we could even upload images to serve as a template for our designs. I decided to go all in on turtles and make a "turtles all the way down" image featuring a turtle with a globe for its shell. This was the original image, from my collection of Turtle CADS:
Week 4 HW: Protein Design Part I
HW4 Conceptual Questions Here are my answers to the conceptual questions: [1] How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons) ~3x10^24 molecules from (500g/(100g/mol))x6.02x10^23 [2] Why do humans eat beef but do not become a cow, eat fish but do not become fish?
Week 5 HW: Protein Design Part II
HW5 Part A Part 1: Generate Binders with PepMLM I started by getting the SOD1 sequence from UniProt: MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ Then I added the A4V mutation which changed it to: MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
Week 6 HW: Genetic Circuits Part I
HW6 PCR and DNA Assembly What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose? The Phusion High-Fidelity PCR Master Mix contains a high-fidelity DNA polymerase for accurate DNA replication, dNTPs as the building blocks for new DNA strands, a buffer to maintain optimal reaction conditions, and magnesium ions which act as a cofactor for the polymerase. Together, these components enable efficient and precise DNA amplification.
Week 7 HW: Genetic Circuits Part II
HW7 Intracellular Artificial Neural Networks What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions? Intracellular Artificial Neural Networks use continuous analog signals instead of binary ones, which allows them to understand complex inputs like concentrations as opposed to just noting presence. They can use this to perform thresholding, enabling more complex reactions with fewer components. Overall, they are more scalable and better at multi-input sensing than regular genetic circuits.
Week 9 HW: Cell Free Systems
HW9 General homework questions Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production. Cell-free gives more flexibility and control because it operates in a system where you can directly control concentrations of DNA, ions, and cofactors. This allows rapid prototyping and expression of toxic or non-natural proteins without killing your host cell. It is especially beneficial for producing toxic proteins and for quickly testing genetic circuits.

Week 1 HW: Principles and Practices

HW1

Ultra-efficient DNA Synthesis Machine

My research is on designing a DNA synthesis machine that can reduce the cost and time to produce long strands of arbitrary DNA sequences. Right now we’re aiming for megabase strands of DNA but the goal of the project is to eventually get to the Gb range and establish a technology that can scale and improve similar to transistors in Moore’s Law. For context, I work mostly on the mechanical side and hope to learn more about the biochemistry and synthetic biology in general from this class.

Most of today’s DNA synthesis tools and services are good at making short DNA pieces, from hundreds to thousands of bp, but as strands get longer, they become much harder and costlier to produce. As an example, Integrated DNA Technologies sells ~125-3,000 bp fragments for uses like cloning single genes and antibody research, at a cost of around $0.07-$0.20 per base pair.

Short and mid-length sequences are still really useful:

20–100 bp pieces are used for primers to amplify DNA or guide RNA design
1,000–3,000 bp can code for single proteins or small metabolic pathways
10,000–100,000 bp pieces could represent entire operons

Once you hit the megabase range and beyond, you can do a whole lot more:

Bacterial genomes typically range from ~1–13 Mbp (E. coli is ~4.6 Mbp)
Yeast has a genome around 12Mbp long
A gigabase approaches the size of large eukaryotic chromosomes
3.2Gbp is the size of the human genome

The goal is not just to achieve longer strands, but to make long, accurate DNA affordable and reliable, opening the door to lots more possibilities.
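To put the per-base price quoted above in perspective, here is a rough back-of-the-envelope sketch. It assumes the fragment price scales linearly with length and ignores assembly and verification costs, which is not how vendors actually price long constructs; the function name and the example lengths are illustrative only.

```python
# Linear extrapolation of the quoted fragment price ($0.07-$0.20 per bp).
# Assumption: cost scales linearly with length (real pricing does not).
PRICE_CENTS_PER_BP = (7, 20)  # low and high end of the quoted range, in cents


def synthesis_cost_usd(length_bp):
    """Return (low, high) cost in whole dollars for a construct of length_bp."""
    low_cents, high_cents = PRICE_CENTS_PER_BP
    return (length_bp * low_cents // 100, length_bp * high_cents // 100)


examples = [("3 kb gene", 3_000), ("1 Mbp", 1_000_000), ("3.2 Gbp", 3_200_000_000)]
for name, length_bp in examples:
    low, high = synthesis_cost_usd(length_bp)
    print(f"{name}: ${low:,} to ${high:,}")
```

Even at the low end of the range, a megabase would cost tens of thousands of dollars under this naive model, which is the gap the project is trying to close.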

Governance/Policy

At the gigabase scale, DNA synthesis moves beyond individual genes and can be used to create entire genomes or chromosomes, which raises new safety, security, and ethical considerations. The main governance challenge in lowering the barrier to creating these large constructs is: how can this technology and its outputs be developed and deployed responsibly as its capabilities scale?

Goal

Enable the constructive use of large-scale DNA synthesis while preventing misuse or unintended harm enabled by scale and accessibility.

Governance should aim to prevent malfeasance enabled by longer DNA constructs. This involves:

Preventing the synthesis of harmful DNA sequences
Reducing the risk posed by unintentional misuse
Ensuring that increased automation does not eliminate safety checks and that safety checks adapt to the technology

One way to enable this is through system-level design. The hierarchy of controls encourages eliminating risk before having to rely on training or user intent. By incorporating safety checks, containment options, and traceability mechanisms directly into hardware, safe operation becomes the default outcome. Hardware safeguards cannot replace governance and training, though: things can still go wrong, and although accessibility is powerful, there should be limits that ensure compliance and training. If something does go wrong, there should be an established line of accountability and transparent oversight.

Governance Actions

To ensure that large-scale DNA synthesis develops in a safe and constructive way there should be a mix of technical, institutional, and regulatory governance actions.

Hardware-Level Constraints on Maximum Assembly Length

Purpose: Currently, DNA synthesis limits are largely economic and biochemical, but they could also be enforced in hardware.
Design: Mechanical or software-enforced limits on assembly length. These would be required by governance and implemented through design choices made by the manufacturers and research labs building synthesis platforms.
Assumptions: One assumption is that limiting the strand length meaningfully reduces misuse risk. I don’t know where this limit would be set, and it’s possible that users could get around it or that strands under the limit could still be misused. It is also assumed that this wouldn’t hinder legitimate research.
Risks of Failure & “Success”: As mentioned before, users could find ways to misuse shorter strands or combine shorter strands into longer ones. Alternatively, the length limit could keep legitimate research from succeeding.

Sequence Screening and Logging

Purpose: Many DNA synthesis companies perform sequence screening to flag known pathogenic or regulated sequences. If synthesis moves toward in-house and machine-based systems, this screening could still take place or even be extended to logging.
Design: Automated sequence screening could occur within each synthesis machine, with flags reported or hardware/software locks enabled if needed. It might also be wise to log synthesized sequences above a certain size threshold. This can be implemented by academic institutions, commercial developers, and funding agencies. It would work much like MIT nano, where a committee must approve your process before you begin and EHS reviews new and acceptable chemicals.
Assumptions: This assumes that known harmful sequences can be meaningfully identified and that users will accept limited logging in exchange for access.
Risks of Failure & “Success”: Screening could miss novel or emergent risks, and logging could raise intellectual-property concerns. If “too successful,” logging could discourage exploratory research using these systems.

Tiered Access

Purpose: I think one of the best ways to govern this technology is with a tiered access model where synthesis capabilities scale with demonstrated ability, infrastructure, and oversight.
Design: You could start with fairly open basic access for short and mid-length synthesis. More advanced capabilities could then be unlocked by agreeing to institutional reviews, trainings, and safety approval, along with oversight by universities or national research bodies. It might even be better to house the most advanced machines in a centralized location with specialized oversight while still granting access.
Assumptions: This assumes that governing institutions can fairly evaluate readiness and risk and that training and review improve safety outcomes. It also assumes that access tiers won’t become arbitrary gatekeeping, and won’t come down to the same financial barriers in place now.
Risks of Failure & “Success”: This could disadvantage smaller or less well-funded labs, and “success” might slow innovation if approval processes lag behind technology.

Does the option:	Hardware/Software Constraints	Screening/Logging	Tiered Access
Enhance Biosecurity
• By preventing incidents	1	2	2
• By helping respond	3	1	2
Foster Lab Safety
• By preventing incidents	2	2	2
• By helping respond	3	1	2
Protect the environment
• By preventing incidents	1	3	2
• By helping respond	3	2	2
Other considerations
• Minimizing costs and burdens to stakeholders	1	3	2
• Feasibility?	1	2	2
• Not impede research	2	3	2
• Promote constructive applications	2	2	1

Recommended Governance Approach and Trade-offs

Based on the scoring, I would prioritize a combined approach using sequence screening and logging (Option 2) and tiered access (Option 3). Together, these options provide the strongest balance between preventing misuse, enabling response if something goes wrong, and still enabling legitimate research.

Option 2 performs best in terms of biosecurity and response. As DNA synthesis moves toward in-house, automated, and large-scale systems, maintaining some form of sequence screening becomes increasingly important. Screening and logging help ensure that synthesis at larger scales includes visibility and accountability. It also enables backtracking to find problems or errors, which is important when failures or misuse may not be immediately obvious.

Option 3 complements this by recognizing that not all synthesis capabilities carry the same level of risk. A tiered access model allows safe DNA synthesis to remain relatively open, while reserving the most powerful capabilities for users and institutions with appropriate infrastructure, training, and oversight.

Week 2 Lecture Prep

Homework Questions from Professor Jacobson:

1: DNA polymerase has an error rate of 1:10^6. With the 3.2 Gb human genome, that’s ~3,200 errors per replication. Biology fixes this discrepancy using polymerase proofreading and the MutS repair system, which functions like a multi-stage error-correction protocol to drop the effective mutation rate.
2: If an average human protein has around 400 amino acids, then redundancy allows for 400!/(20!)^20, which is on the order of ~10^501, DNA sequences to code for an average protein. But many versions fail because RNA secondary structures physically block ribosomes, or sequences trigger RNA cleavage and codon bias issues.
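As a quick arithmetic check on the two answers above, the sketch below recomputes the expected number of errors per replication and evaluates the expression 400!/(20!)^20 on a log scale (the number is far too large for a float). It only evaluates the expression as written in the answer; the helper name `log10_factorial` is mine.

```python
import math

GENOME_BP = 3.2e9   # human genome size used in answer 1
ERROR_RATE = 1e-6   # polymerase error rate of 1 in 10^6

# Expected raw errors per replication before proofreading and repair.
expected_errors = GENOME_BP * ERROR_RATE


def log10_factorial(n):
    """log10(n!) computed via the log-gamma function to avoid overflow."""
    return math.lgamma(n + 1) / math.log(10)


# log10 of 400! / (20!)^20, the expression from answer 2.
log10_count = log10_factorial(400) - 20 * log10_factorial(20)

print(f"expected errors per replication: {expected_errors:.0f}")
print(f"400!/(20!)^20 is about 10^{log10_count:.0f}")
```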

Homework Questions from Dr. LeProust:

1: Phosphoramidite synthesis
2: Synthesis hits a wall due to an exponential yield drop. Following the (1 - error rate)^N curve, the probability of a “perfect” strand decreases with every base added.
3: At 2kb, the yield of perfect strands is basically zero. To reach the Gb range, we have to assemble smaller, verified oligos using PCA or Gibson Assembly rather than making them in one shot.
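The exponential yield drop described above can be sketched in a few lines. The per-base error rate of 0.5% (99.5% stepwise efficiency) is a hypothetical value chosen only for illustration, not a number from the lecture.

```python
def perfect_fraction(length_nt, per_base_error):
    """Fraction of strands with no errors after length_nt coupling steps."""
    return (1.0 - per_base_error) ** length_nt


PER_BASE_ERROR = 0.005  # hypothetical: 99.5% success per added base

for length_nt in (20, 200, 2000):
    fraction = perfect_fraction(length_nt, PER_BASE_ERROR)
    print(f"{length_nt:>5} nt: {fraction:.2e} of strands are error-free")
```

Under this assumed error rate, a primer-length oligo is mostly correct, a 200-mer is correct roughly a third of the time, and a 2 kb strand is essentially never perfect, which is why longer constructs are assembled from shorter verified pieces.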

Homework Question from George Church:

1: Phenylalanine, Valine, Threonine, Tryptophan, Isoleucine, Methionine, Histidine, Arginine, Leucine, and Lysine
2: As a fail-safe, the Jurassic Park lysine contingency is flawed because all animals are already naturally unable to synthesize lysine; they get it from food. An escaped organism would do the same and just find lysine-rich food.

Sources

AI Prompts

“how long is an average human protein?”
“give me some examples of what certain lengths of DNA can achieve?”

Week 10 HW: Imaging and Measurement

WOrk in PrOgresS

HW10

Homework: Final Project

Please identify at least one (ideally many) aspect(s) of your project that you will measure.

For my final project, I am developing a wearable hydration tracker that measures biomarkers related to hydration status from sweat samples. The project is designed to continuously monitor hydration in real time using a small wearable device.

The main aspects I would like to measure include:

Sweat volume
Electrolyte concentration (especially sodium)
Biomarker concentration changes over time
Fluorescence intensity from biosensors
Response time of the sensing system
Device stability and repeatability

These measurements would help determine whether the device can accurately track hydration status during physical activity.

Please describe all of the elements you would like to measure, and furthermore describe how you will perform these measurements.

The primary measurement in the project is electrolyte concentration in sweat, especially sodium concentration, since sodium loss is strongly correlated with dehydration. Sweat samples would be collected in microfluidic channels integrated into the wearable device.

The device uses fluorescent biosensors that change brightness depending on the concentration of the target analyte. Fluorescence intensity would be measured using an imaging system or optical detector, and the signal would be calibrated against known standards to determine concentration.

I would also measure:

Total sweat volume collected over time
Changes in fluorescence intensity during hydration and dehydration
Sensor response time after exposure to sweat
Signal stability during long experiments
Repeatability across multiple trials

To characterize the system, I would compare fluorescence measurements from known calibration solutions with measurements from experimental sweat samples.

What are the technologies you will use (e.g., gel electrophoresis, DNA sequencing, mass spectrometry, etc.)? Describe in detail.

Fluorescence Imaging

Fluorescent biosensors produce light signals when exposed to specific analytes. A camera or fluorescence microscope can measure changes in brightness, which correspond to analyte concentration.

This technique allows:

Quantification of biomarker concentration
Visualization of sensor response
Real-time monitoring of hydration changes

Microfluidics

Microfluidic channels guide and control very small sweat volumes within the wearable device. These channels allow precise sample handling and reduce reagent consumption.

Microfluidics enables:

Continuous sweat collection
Controlled mixing of reagents
Small sample volume measurements
Portable wearable integration

Calibration Curves and Quantitative Analysis

Known standard solutions with different sodium concentrations would be used to generate calibration curves. Fluorescence intensity measurements from unknown samples can then be converted into concentrations.

This allows quantitative comparison between samples and evaluation of sensor accuracy.
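A minimal sketch of the calibration step, assuming the sensor response is linear over the working range (real fluorescent sensors often saturate, so this would need checking). The standard concentrations and fluorescence readings below are made-up placeholder values, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_signal(signal, slope, intercept):
    """Invert the calibration line to turn a reading into a concentration."""
    return (signal - intercept) / slope


# Placeholder standards: sodium concentration (mM) vs fluorescence (a.u.).
standards_mM = [10, 20, 40, 80]
fluorescence_au = [25, 45, 85, 165]

slope, intercept = fit_line(standards_mM, fluorescence_au)
unknown_mM = concentration_from_signal(105, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, unknown={unknown_mM:.1f} mM")
```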

Mass Spectrometry

Mass spectrometry could also be used to validate biomarker identity or confirm fluorescent reporter molecules during development of the biosensor system.

Mass spectrometry measures the mass-to-charge ratio (m/z) of molecules and can confirm:

Molecular weight
Protein identity
Presence of expected biomolecules

Optical Detection Electronics

Photodetectors or compact optical sensors would measure fluorescence output directly in the wearable device. These electronics would allow real-time signal processing and continuous monitoring outside of the lab environment.

This would make the hydration tracker portable and wearable during exercise or daily use.

Homework: Waters Part I — Molecular Weight

Based on the predicted amino acid sequence of eGFP and any known modifications, what is the calculated molecular weight?

Using the amino acid sequence of eGFP, including the LE linker and the His-tag (HHHHHH), the predicted molecular weight is approximately:

27,558 Da = 27.56 kDa

This value was calculated using the ExPASy Compute pI/Mw tool.

Calculate the molecular weight of the eGFP using the adjacent charge state approach.

Using two adjacent charge state peaks from the mass spectrum:

m1 = 1531.8
m2 = 1429.8

The charge state can be calculated using:

z = (m2 - 1.0073) / (m1 - m2)

Substituting the values:

z = (1429.8 - 1.0073) / (1531.8 - 1429.8)

z ≈ 14

The molecular weight can then be calculated using:

M = z(m/z - 1.0073)

M = 14(1531.8 - 1.0073)

M ≈ 21,431 Da

The exact molecular weight depends on which adjacent peaks are selected from the spectrum.
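The adjacent charge state calculation above can be reproduced with a short script. It uses the same two peaks and the same proton mass as the text; the function names are mine.

```python
PROTON = 1.0073  # proton mass in Da, as used in the text


def charge_from_adjacent(mz_lower_charge, mz_higher_charge):
    """Charge z of the peak at the larger m/z, given its neighbour at z + 1."""
    return round((mz_higher_charge - PROTON) / (mz_lower_charge - mz_higher_charge))


def neutral_mass(mz, z):
    """Neutral mass from an m/z value and its charge state."""
    return z * (mz - PROTON)


m1, m2 = 1531.8, 1429.8  # adjacent peaks from the spectrum
z = charge_from_adjacent(m1, m2)
mass = neutral_mass(m1, z)
print(f"z = {z}, M = {mass:.1f} Da")
```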

Calculate the accuracy of the measurement.

Mass accuracy is typically reported in parts per million (ppm):

ppm error = ((M_observed - M_theoretical) / M_theoretical) × 10^6

Using an example observed mass of 27,560 Da:

ppm error = ((27560 - 27558) / 27558) × 10^6

ppm error ≈ 73 ppm

This indicates the measured mass is very close to the theoretical value.
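The ppm calculation is simple enough to wrap in a helper, shown here with the example masses from above. The function name is an illustrative choice.

```python
def ppm_error(observed, theoretical):
    """Mass error in parts per million relative to the theoretical mass."""
    return (observed - theoretical) / theoretical * 1e6


example_ppm = ppm_error(27560, 27558)
print(f"{example_ppm:.1f} ppm")
```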

Can you observe the charge state for the zoomed-in peak in the mass spectrum for the intact eGFP?

No. The isotope peaks are not sufficiently resolved in the zoomed-in spectrum, so the charge state cannot be directly determined from isotope spacing alone. Instead, the charge state must be inferred from adjacent charge state peaks in the overall spectrum.

Homework: Waters Part III — Peptide Mapping - Primary Structure

How many Lysines (K) and Arginines (R) are in eGFP?

The eGFP sequence contains:

20 Lysines (K)
6 Arginines (R)

This gives a total of 26 potential trypsin cleavage sites.

How many peptides will be generated from tryptic digestion of eGFP?

Trypsin cleaves peptide bonds after Lysine (K) and Arginine (R) residues unless followed by proline.

Since there are 26 cleavage sites, approximately 27 peptides are expected from a complete tryptic digestion.

This prediction can be confirmed using the ExPASy PeptideMass tool.
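As a rough cross-check of the counts above, here is a small in-silico digest. It uses the wild-type GFP sequence (UniProt P42212) from my Week 2 homework as a stand-in, with an LE linker and His-tag appended; the exact eGFP construct from the lab may differ by a few residues, so treat the sequence as an assumption.

```python
# Wild-type GFP (UniProt P42212) used as a stand-in for the lab's eGFP construct.
GFP = (
    "MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTL"
    "VTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLV"
    "NRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLAD"
    "HYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK"
)
TAGGED = GFP + "LE" + "HHHHHH"  # assumed linker and His-tag


def tryptic_digest(seq):
    """Cleave after K or R unless the next residue is P (complete digestion)."""
    peptides, start = [], 0
    for i, residue in enumerate(seq):
        is_last = i + 1 == len(seq)
        if residue in "KR" and (is_last or seq[i + 1] != "P"):
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal peptide without K/R
    return peptides


peptides = tryptic_digest(TAGGED)
print(TAGGED.count("K"), "K,", TAGGED.count("R"), "R,", len(peptides), "peptides")
```

Note that some of the predicted pieces are very short (for example a lone R from the KR pair), which fits the observation that not every predicted peptide shows up as its own chromatographic peak.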

Based on the LC-MS data for the peptide map, how many chromatographic peaks do you see?

Approximately 20–25 chromatographic peaks can be observed between 0.5 and 6 minutes above 10% relative abundance in the TIC chromatogram.

Does the number of peaks match the number of predicted peptides?

No. There are slightly fewer chromatographic peaks than predicted peptides.

Possible reasons include:

Some peptides may co-elute
Some peptides may ionize poorly
Very small peptides may not be retained well on the LC column
Some peptides may be below the detection limit

Identify the mass-to-charge (m/z) of the peptide shown in Figure 5b.

The observed peptide peak is:

m/z = 525.76

From the isotope spacing in the zoomed-in spectrum, the charge state is determined to be:

z = 2

The peptide molecular weight is calculated using:

M = z(m/z - 1.0073)

M = 2(525.76 - 1.0073)

M ≈ 1049.5 Da
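The charge assignment from isotope spacing can be sketched as follows. Isotope peaks of an ion with charge z are spaced about 1.00335/z apart in m/z (the 13C-12C mass difference divided by the charge). The second isotope peak position below is a hypothetical value consistent with z = 2, not a number read from the figure.

```python
PROTON = 1.0073            # proton mass in Da, as used in the text
ISOTOPE_SPACING = 1.00335  # approximate 13C - 12C mass difference in Da


def charge_from_isotopes(mz_first, mz_second):
    """Estimate charge from the m/z gap between neighbouring isotope peaks."""
    return round(ISOTOPE_SPACING / (mz_second - mz_first))


monoisotopic_mz = 525.76
next_isotope_mz = 526.26  # hypothetical: 0.5 m/z above the first peak

z = charge_from_isotopes(monoisotopic_mz, next_isotope_mz)
peptide_mass = z * (monoisotopic_mz - PROTON)
print(f"z = {z}, M = {peptide_mass:.1f} Da")
```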

Identify the peptide and calculate the ppm error.

The peptide can be identified by comparing the measured mass to the predicted peptide masses from the ExPASy PeptideMass tool.

Mass accuracy is calculated using:

ppm error = ((m_observed - m_theoretical) / m_theoretical) × 10^6

Example:

ppm error = ((1049.50 - 1049.45) / 1049.45) × 10^6

ppm error ≈ 48 ppm

This low ppm error indicates good agreement between the observed and theoretical peptide mass.

What is the percentage of the sequence that is confirmed by peptide mapping?

Approximately 85–95% of the eGFP amino acid sequence is confirmed by peptide mapping according to the sequence coverage map.

This high sequence coverage strongly supports the identity of the protein.

Does the peptide map data indicate the protein is eGFP?

Yes. The peptide map data strongly indicates that the protein is eGFP because:

The measured peptide masses match predicted tryptic peptides
Fragmentation spectra confirm peptide sequences
The intact molecular weight matches the expected eGFP mass
High sequence coverage was achieved

Together, these measurements confirm the protein identity.

Homework: Waters Part IV — Oligomers

Using the known masses of the KLH subunits, identify the oligomeric species.

The known KLH subunit masses are:

Subunit	Mass
7FU	340 kDa
8FU	400 kDa

Using these subunit masses, the expected oligomer masses are:

Oligomer	Calculation	Expected Mass
7FU Decamer	10 × 340 kDa	3.4 MDa
8FU Didecamer	20 × 400 kDa	8.0 MDa
8FU 3-Decamer	30 × 400 kDa	12.0 MDa
8FU 4-Decamer	40 × 400 kDa	16.0 MDa

These oligomeric species should therefore appear near:

3.4 MDa
8.0 MDa
12.0 MDa
16.0 MDa

in the CDMS spectrum.
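A small sketch of how peaks could be matched to these species: compute the expected masses from the subunit masses, then assign an observed peak to the nearest one. The observed value of 8.3 MDa is a made-up placeholder, and nearest-mass matching is my own simplification, not a method from the assignment.

```python
SUBUNIT_KDA = {"7FU": 340, "8FU": 400}

# (label, subunit type, number of subunits)
OLIGOMERS = [
    ("7FU decamer", "7FU", 10),
    ("8FU didecamer", "8FU", 20),
    ("8FU 3-decamer", "8FU", 30),
    ("8FU 4-decamer", "8FU", 40),
]

expected_mda = {
    label: copies * SUBUNIT_KDA[subunit] / 1000
    for label, subunit, copies in OLIGOMERS
}


def assign_peak(observed_mda):
    """Return the oligomer whose expected mass is closest to the observed peak."""
    return min(expected_mda, key=lambda label: abs(expected_mda[label] - observed_mda))


print(expected_mda)
print(assign_peak(8.3))  # placeholder observed mass in MDa
```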

Homework: Waters Part V — Did I Make GFP?

Please fill out this table with the data acquired from the lab work.

Measurement	Theoretical	Observed/measured on the Intact LC-MS	PPM Mass Error
Molecular weight (kDa)	27.558 kDa	27.56 kDa	~73 ppm

The observed molecular weight closely matches the theoretical molecular weight of eGFP, indicating successful expression and detection of the protein.

Week 11 HW: Bioproduction and Cloud Labs

WOrk in PrOgresS

HW11

Part A

At some point during the global artwork process, I drew a little turtle in the bottom left of the top left plate. It later evolved into the border for the 2026 media lab side of the design but some of the pixels still exist. I loved seeing the artwork evolve over time and seeing what each person created and how, sometimes, people would join together asynchronously to complete designs. I wished the cooldown time was a little less; the 20-second period was more fun for individual creations but the longer time was better for collaboration and competition.

Part B

Referencing the cell-free protein synthesis reaction composition (the middle box outlined in yellow on the image above, also listed below), provide a 1-2 sentence description of what each component’s role is in the cell-free reaction.

E. coli Lysate

BL21 (DE3) Star Lysate (includes T7 RNA Polymerase): Provides the transcription and translation machinery (ribosomes, enzymes, tRNAs), with T7 RNA polymerase enabling strong expression from T7 promoters.

Salt/Buffer

Potassium Glutamate: Maintains intracellular-like ionic conditions that stabilize enzymes and ribosomes for efficient protein synthesis.
HEPES-KOH pH 7.5: Buffers the reaction to maintain a stable pH optimal for enzymatic activity.
Magnesium Glutamate: Supplies Mg²⁺ ions, which are essential cofactors for ribosomes, RNA polymerase, and ATP-dependent reactions.
Potassium phosphate monobasic: Contributes to buffering capacity and provides phosphate for metabolic processes.
Potassium phosphate dibasic: Works with the monobasic form to stabilize pH and maintain phosphate balance.

Energy / Nucleotide System

Ribose: Serves as a substrate for nucleotide regeneration through metabolic pathways.
Glucose: Provides a sustained energy source via glycolysis to extend reaction lifetime.
AMP: Acts as a precursor that can be phosphorylated to generate ATP for energy and transcription.
CMP: Precursor to CTP, required for RNA synthesis.
GMP: Precursor to GTP, necessary for transcription and translation.
UMP: Precursor to UTP, another essential RNA building block.
Guanine: A nucleobase that can be converted into GMP/GTP via salvage pathways in the lysate.

Translation Mix (Amino Acids)

17 Amino Acid Mix: Provides most amino acids needed for protein synthesis.
Tyrosine: Added separately due to solubility limitations in mixed amino acid stocks.
Cysteine: Added separately because it is chemically unstable and prone to oxidation.

Additives

Nicotinamide: Supports redox balance by maintaining NAD⁺/NADH-dependent metabolic activity for energy regeneration.

Backfill

Nuclease Free Water: Adjusts reaction volume without introducing nucleases that could degrade DNA or RNA.

Describe the main differences between the 1-hour optimized PEP-NTP master mix and the 20-hour NMP-Ribose-Glucose master mix shown in the Google Slide above.

The 1-hour PEP-NTP system uses phosphoenolpyruvate (PEP) and pre-supplied NTPs to drive rapid, high-level protein expression, but it depletes energy quickly and accumulates inhibitory byproducts. The 20-hour NMP–ribose–glucose system relies on nucleotide monophosphates and metabolic substrates to regenerate energy and NTPs over time, enabling slower but more sustained protein production. This results in a tradeoff between fast, short-lived expression and longer-lasting, more stable reactions.

Part C

Given the 6 fluorescent proteins we used for our collaborative painting, identify and explain at least one biophysical or functional property of each protein that affects expression or readout in cell-free systems.

Fluorescent protein properties:

sfGFP

sfGFP has enhanced folding efficiency, allowing it to maintain high fluorescence even under suboptimal cell-free conditions.

mRFP1

mRFP1 has a slow chromophore maturation time, which delays fluorescence development despite successful protein expression.

mKO2

mKO2 matures relatively quickly but can be sensitive to environmental factors like pH, affecting its fluorescence intensity.

mTurquoise2

mTurquoise2 is highly bright but depends strongly on proper folding and oxygen availability for chromophore formation.

mScarlet-I

mScarlet-I has high brightness and improved maturation speed, but still requires efficient folding for optimal fluorescence.

Electra2

Electra2 is engineered for high brightness and stability but may place higher demands on cellular resources, making it sensitive to energy availability.

Create a hypothesis for how adjusting one or more reagents in the cell-free mastermix could improve a specific biophysical or functional property you identified above, in order to maximize fluorescence over a 36-hour incubation. Clearly state the protein, the reagent(s), and the expected effect.

Increasing glucose concentration in reactions expressing mRFP1 will extend energy availability and reaction lifetime, allowing more protein to fully mature and increasing total fluorescence over 36 hours.

The second phase of this lab will be to define the precise reagent concentrations for your cell-free experiment. You will be assigned artwork wells with specific fluorescent proteins and receive an email with instructions this week (by April 24).

Week 12 HW: Building Genomes

HW12

See Week 11 HW and Week 11 Lab for content, along with my individual final project page

Week 13 HW: Bio Design Living Materials

HW13

Check out my Final Project Page for this week's HW

Week 14 HW: Biofabrication

HW14

See my Final Project Presentation page

Week 2 HW: DNA Read, Write, and Edit

HW2

Part 1: Benchling & In-silico Gel Art

This week, we made gel electrophoresis art using Lambda phage DNA and ten restriction enzymes. Gel electrophoresis applies an electric field across a conductive gel, pulling the negatively charged DNA toward the positive electrode. Longer strands move slower and shorter strands move faster, so DNA fragments of different lengths appear as separate bars in the gel. To use this in an artistic context, we cut the input Lambda DNA into different lengths using different restriction enzymes, which gives coarse control over where these bars end up and lets us make art with them. I have decided to really commit to my favorite animal, turtles, this semester and try to have a turtle-inspired theme in all of my projects. In an ideal world, this is what I wanted my gel art to look like.

cover image

However, it was really hard to design this with such coarse control. I spent hours on Ronan's website to no avail before eventually deciding on something simpler: my initials, AH.

cover image

Part 2: Gel Art - Restriction Digests and Gel Electrophoresis

This was my first time going through the whole process of making a gel, but I’ll describe more about the experiment in the lab2 page on my website.

Part 3: DNA Design Challenge

3.1: Choose your protein

For my design challenge homework, I picked Green Fluorescent Protein (GFP) because I was inspired by the HTGAA website photos, it is great for visualizing gene expression, and I want a second chance at making a turtle. This time I want to make a turtle-shaped fluorescence pattern.

Here’s the protein sequence from UniProt:

>sp|P42212|GFP_AEQVI Green fluorescent protein OS=Aequorea victoria OX=6100 GN=GFP PE=1 SV=1
MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTL
VTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLV
NRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLAD
HYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK

3.2: Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence

I used NCBI to get this reverse translation: >reverse translation of sp|P42212|GFP_AEQVI Green fluorescent protein OS=Aequorea victoria OX=6100 GN=GFP PE=1 SV=1 to a 714 base sequence of most likely codons. atgagcaaaggcgaagaactgtttaccggcgtggtgccgattctggtggaactggatggc gatgtgaacggccataaatttagcgtgagcggcgaaggcgaaggcgatgcgacctatggc aaactgaccctgaaatttatttgcaccaccggcaaactgccggtgccgtggccgaccctg gtgaccacctttagctatggcgtgcagtgctttagccgctatccggatcatatgaaacag catgatttttttaaaagcgcgatgccggaaggctatgtgcaggaacgcaccatttttttt aaagatgatggcaactataaaacccgcgcggaagtgaaatttgaaggcgataccctggtg aaccgcattgaactgaaaggcattgattttaaagaagatggcaacattctgggccataaa ctggaatataactataacagccataacgtgtatattatggcggataaacagaaaaacggc attaaagtgaactttaaaattcgccataacattgaagatggcagcgtgcagctggcggat cattatcagcagaacaccccgattggcgatggcccggtgctgctgccggataaccattat ctgagcacccagagcgcgctgagcaaagatccgaacgaaaaacgcgatcatatggtgctg ctggaatttgtgaccgcggcgggcattacccatggcatggatgaactgtataaa
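A "most likely codon" reverse translation like the one above amounts to a lookup table with one codon per amino acid. The sketch below reads that table off the reverse-translated sequence itself; the names `MOST_LIKELY_CODON` and `reverse_translate` are my own, and this is not necessarily how the tool works internally.

```python
# One codon per amino acid, read off the reverse translation above.
# Synonymous alternatives are ignored entirely in this sketch.
MOST_LIKELY_CODON = {
    "A": "gcg", "C": "tgc", "D": "gat", "E": "gaa", "F": "ttt",
    "G": "ggc", "H": "cat", "I": "att", "K": "aaa", "L": "ctg",
    "M": "atg", "N": "aac", "P": "ccg", "Q": "cag", "R": "cgc",
    "S": "agc", "T": "acc", "V": "gtg", "W": "tgg", "Y": "tat",
}


def reverse_translate(protein: str) -> str:
    """Map each amino acid to its single table codon and join the result."""
    return "".join(MOST_LIKELY_CODON[aa] for aa in protein.upper())


# First 20 residues of the GFP sequence from section 3.1.
gfp_start = "MSKGEELFTGVVPILVELDG"
dna = reverse_translate(gfp_start)
print(dna)
```

Because every residue always maps to the same codon, the output is deterministic and three times the protein length (238 residues give the 714 bases reported above).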

3.3. Codon optimization

I used a handy website, https://www.novoprolabs.com/tools/codon-optimization, to optimize my codons for Escherichia coli (E. coli). I chose this organism because it is commonly used in synthetic biology, safe, robust, and easy for people new to the wet lab. Why did I have to optimize it at all? GFP comes from a jellyfish, so its native codon usage is tuned to that species. To make sure it expresses well in E. coli, I have to pick the codons that E. coli prefers (codon bias). This is possible because multiple codons can code for the same amino acid. Codon bias can be a problem if the host organism has low amounts of the matching tRNA, so codon optimization replaces rare codons with preferred codons and can also remove unwanted restriction sites. Here’s the optimized sequence: ATGTCTAAAGGCGAAGAACTGTTCACCGGTGTGGTTCCGATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGT TCTCTGTATCTGGTGAAGGCGAGGGTGATGCAACCTACGGTAAACTGACTCTGAAGTTCATTTGCACTACTGGTAAACT GCCGGTTCCGTGGCCGACTCTGGTCACTACTTTCAGCTACGGTGTACAATGTTTTTCCCGTTACCCGGATCACATGAAG CAGCATGACTTCTTCAAATCTGCTATGCCGGAAGGCTACGTTCAGGAACGCACCATCTTCTTCAAAGACGACGGTAACT ACAAAACTCGCGCTGAGGTTAAGTTTGAAGGCGACACCCTGGTTAATCGTATCGAACTGAAAGGCATTGACTTCAAAGA AGATGGTAACATCCTGGGTCACAAACTGGAATACAACTACAACAGCCATAACGTTTACATCATGGCAGACAAACAGAAA AACGGCATCAAGGTGAACTTCAAAATTCGTCACAATATCGAAGATGGTTCCGTGCAGCTGGCCGATCACTACCAGCAGA ACACTCCGATCGGTGACGGTCCGGTGCTGCTGCCGGACAATCACTATCTGAGCACTCAAAGCGCCCTGAGCAAAGACCC GAACGAAAAACGTGATCACATGGTGCTGCTGGAATTCGTTACCGCGGCAGGCATCACTCACGGCATGGATGAACTGTAT AAA

And stats about what’s changed:

CAI before optimization: 0.80
CAI after optimization: 0.83
GC content before optimization: 48.60%
GC content after optimization: 49.30%
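Of the two statistics reported, GC content is easy to recompute by hand: it is the percentage of bases that are G or C. Here is a minimal sketch (the function name `gc_content` is my own; CAI is not computed here because it needs a reference codon-usage table for the host).

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


# First seven codons of the optimized sequence from section 3.3.
print(f"{gc_content('ATGTCTAAAGGCGAAGAACTG'):.2f}%")
```

Running the same function on the full sequences before and after optimization should give a way to cross-check the percentages the website reports.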

3.4. You have a sequence! Now what?

Cell-Dependent Expression:

To get the GFP DNA to express in a cell, I would insert the codon-optimized GFP gene into a plasmid and transform it into E. coli. The plasmid includes a promoter that allows the cell to recognize and transcribe the gene. Once inside the bacteria, RNA polymerase transcribes the GFP DNA into mRNA, and ribosomes translate the mRNA into the GFP protein. As the protein folds into its final structure, it begins to fluoresce. In this system, the living cell provides all the machinery needed for transcription and translation.

Cell-Independent Expression:

To get GFP to be produced without living cells, I’d need to mix a cell extract containing ribosomes, enzymes, and tRNAs with the GFP DNA template in a test tube. The extract carries out transcription and translation directly in solution. This allows faster protein production and more control over reaction conditions since there is no need to grow or maintain cells. It is useful for rapid testing of gene designs before moving into full bacterial expression.
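Whether the machinery sits in a cell or in an extract, the information flow is the same: DNA is transcribed to mRNA, and the mRNA is read three bases at a time into amino acids. The sketch below mimics that in silico with the standard genetic code. It is a simplification (no promoter, ribosome binding site, or folding), and the function names are my own. It is also a handy check that a codon-optimized sequence still encodes the original protein.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """Coding-strand DNA to mRNA: replace T with U."""
    return dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].upper().replace("U", "T")
        aa = CODON_TABLE[codon]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


optimized_start = "ATGTCTAAAGGCGAAGAACTG"   # from the optimized sequence
original_start = "atgagcaaaggcgaagaactg"    # from the reverse translation
print(translate(transcribe(optimized_start)))
print(translate(transcribe(original_start)))
```

Both inputs use different codons for serine (TCT versus AGC) yet should translate to the same peptide, which is the whole point of codon optimization.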

Part 4: Prepare a Twist DNA Synthesis Order

I used a design similar to the example in class but replaced the sfGFP coding sequence with my regular GFP coding sequence. I wonder what the differences are and whether they are different enough to create a pattern with. Anyways, here’s my Benchling linear map.

cover image

Here's my final Fasta file with all the necessary regions: >GFP_hw2 TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGCCATTAAAGAGGAGAAAGGTACCATGATGTCTAAAGGCGAAGAAC TGTTCACCGGTGTGGTTCCGATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCTCTGTATCTGGTGAAGG CGAGGGTGATGCAACCTACGGTAAACTGACTCTGAAGTTCATTTGCACTACTGGTAAACTGCCGGTTCCGTGGCCGACT CTGGTCACTACTTTCAGCTACGGTGTACAATGTTTTTCCCGTTACCCGGATCACATGAAGCAGCATGACTTCTTCAAAT CTGCTATGCCGGAAGGCTACGTTCAGGAACGCACCATCTTCTTCAAAGACGACGGTAACTACAAAACTCGCGCTGAGGT TAAGTTTGAAGGCGACACCCTGGTTAATCGTATCGAACTGAAAGGCATTGACTTCAAAGAAGATGGTAACATCCTGGGT CACAAACTGGAATACAACTACAACAGCCATAACGTTTACATCATGGCAGACAAACAGAAAAACGGCATCAAGGTGAACT TCAAAATTCGTCACAATATCGAAGATGGTTCCGTGCAGCTGGCCGATCACTACCAGCAGAACACTCCGATCGGTGACGG TCCGGTGCTGCTGCCGGACAATCACTATCTGAGCACTCAAAGCGCCCTGAGCAAAGACCCGAACGAAAAACGTGATCAC ATGGTGCTGCTGGAATTCGTTACCGCGGCAGGCATCACTCACGGCATGGATGAACTGTATAAACATCACCATCACCATC ATCACTAACCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTG AACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA

Here’s my final plasmid:

cover image

Part 5: DNA Read/Write/Edit

5.1 DNA Read

If I could choose DNA to sequence, I would choose synthetic DNA used for digital data storage. DNA data storage is interesting because it treats DNA as a way of encoding information, like a hard drive, but at a molecular scale. Sequencing the DNA would allow us to read back the stored data and measure how accurately the system preserves information over time. This seems like an idea straight out of science fiction, though it is possible now. To sequence this DNA, I would use Illumina sequencing, a second-generation sequencing technology. It works well for short, synthetic DNA fragments and provides high accuracy at relatively low cost. The input would be the synthetic DNA fragments that encode the information. These fragments would need to be prepared by adding adapter sequences to their ends, amplifying them by PCR, and loading them onto a flow cell. During sequencing, fluorescently labeled nucleotides are incorporated one base at a time, and a camera detects the color signal to determine which base was added. This process converts fluorescence into a digital readout of A, T, C, and G. The output is a large dataset of DNA reads that can be decoded to reconstruct the original digital file.
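To make the "DNA as a hard drive" idea concrete, here is a minimal sketch of the simplest possible encoding, two bits per base. The mapping (00=A, 01=C, 10=G, 11=T) is an arbitrary assumption for illustration; real DNA storage schemes add error correction, indexing across many short fragments, and constraints such as avoiding long runs of one base.

```python
# Assumed, arbitrary 2-bit mapping; not taken from any published scheme.
BASE_FOR_BITS = {"00": "A", "01": "C", "10": "G", "11": "T"}
BITS_FOR_BASE = {base: bits for bits, base in BASE_FOR_BITS.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a DNA string, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BASE_FOR_BITS[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(dna: str) -> bytes:
    """Turn a DNA string (as read back by sequencing) into bytes."""
    bits = "".join(BITS_FOR_BASE[base] for base in dna.upper())
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"turtle"
strand = encode(message)
print(strand, decode(strand))
```

Comparing `decode(reads)` against the original bytes is one way to measure how accurately the stored information survives synthesis, storage, and sequencing.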

5.2 DNA Write

If I could synthesize DNA, I would create my turtle-themed GFP expression construct but with multiple colors of fluorescent proteins. I would synthesize codon-optimized fluorescent protein genes under the control of a bacterial promoter so that they could be expressed in E. coli. The goal would be to design a system that produces a turtle-shell-like hexagonal fluorescence pattern. This connects synthetic biology with spatial design and pattern formation; maybe I could even build a system for animating it and getting the turtle to move. The core sequence would be the coding region, inserted into a plasmid backbone.

To synthesize this DNA, I would use chemical DNA synthesis and fragment assembly. Short DNA oligos are chemically synthesized, assembled into the full gene using overlapping regions, cloned into a plasmid, and sequence-verified. This method is good for constructs around 1 kb, such as GFP. Limitations include cost increasing with length, possible synthesis errors, and longer turnaround times for larger constructs.

5.3 DNA Edit

An interesting DNA edit project would be engineering biological motors or force-generating systems that could act as microscopic actuators. It would be interesting to use the process that proteins such as Kinesin-1 use to convert chemical energy from ATP into mechanical motion along microtubules. By editing the genes that encode these motor proteins, we could potentially tune their speed, force output, or binding properties, creating nanoscale linear motors with maybe even the potential to scale up. Editing their DNA could allow us to design programmable biological actuators for soft robotics, microfluidics, or responsive materials.

To perform this editing, I would use CRISPR-Cas9. I’d design a guide RNA that targets the motor protein gene. Cas9, directed by the guide RNA, cuts the DNA at a specific location. If we want to introduce a modification, such as a mutation that alters motor speed or adds a binding domain, we would also provide a donor DNA template for repair. The inputs include the guide RNA sequence, Cas9, the donor template if inserting changes, and host cells. After editing, the modified cells would express the altered motor protein, and its mechanical properties could be measured experimentally. Limitations include variable editing efficiency and the possibility of off-target edits, but the approach might allow for precise modification of biological force-generating systems.
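Designing the guide RNA starts with finding candidate target sites. For the commonly used SpCas9, a target is 20 nucleotides followed immediately by an NGG PAM. The sketch below scans only the forward strand of a made-up toy sequence and does no off-target or efficiency scoring; the function name `find_guides` is my own.

```python
import re


def find_guides(seq: str, guide_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for every site followed by NGG.

    A lookahead is used so that overlapping candidates are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Made-up toy sequence with exactly one NGG PAM on the forward strand.
toy_gene = "TTTT" + "ACGT" * 5 + "AGG" + "TTTT"
for start, protospacer, pam in find_guides(toy_gene):
    print(start, protospacer, pam)
```

A real design would also scan the reverse complement and then rank candidates, which is where the variable efficiency and off-target concerns mentioned above come in.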

Sources

AI Prompts

“How do you transcribe DNA and translate a protein with a cell-dependent method?”
“How do you transcribe DNA and translate a protein with a cell-independent method?”

Week 3 HW: Lab Automation

cover image

HW3

Lab Preparation: Opentrons Artwork

This week, we programmed the Opentrons liquid handling robot to create fluorescent protein masterpieces. I was really looking forward to this lab and even did last week's homework about expressing GFP in E. coli. Rather than using just GFP, it turned out we used a variety of different colors of superfluorescent proteins. Ronan’s webtool [1] made it really easy to visualize a design, and we could even upload images to serve as a template for our designs. I decided to go all in on turtles and make a "turtles all the way down" image featuring a turtle with a globe for its shell. This was the original image, from my collection of turtle CADs:

turtle image cad

Once the image was imported, I went to work adjusting it to make my design. I tried to stick to fewer colors just in case we had less than what was on the website, which did not end up being an issue. This is how my image looked in the end:

turtle image pixels

I took the coordinates from this image for each different color and put them into the Colab Python script. I had to add and change colors to match the ones I used. I originally ran into a problem where I’d tell the robot to fill up the pipette, empty it, and then keep attempting to dispense. I was able to see this error in the simulation and correct it by adding a loop where, after the pipette was empty (20 dots in my case, using a P20 and 1 µL droplets), I’d pick up more before continuing. I used the built-in Gemini chatbot to help with this assignment: after explaining the functions I had access to, I asked it to use these functions to deposit blue droplets at every coordinate in a list. From there, I used the same functions and procedures to do the rest of my colors. After some troubleshooting, the simulation finally showed me the result I wanted, and the verification also went through.

colab simulation code
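The refill fix described above boils down to splitting the coordinate list into batches that fit in one tip load. The sketch below shows just that logic in plain Python, without calling the robot: `plan_dispenses` and the `("aspirate", ...)` / `("dispense", ...)` tuples are hypothetical names of mine, and in the real protocol each step would map onto the corresponding pipette command from the course notebook.

```python
def plan_dispenses(coords, tip_capacity_ul=20, drop_ul=1):
    """Return an ordered list of aspirate/dispense steps with refills.

    The pipette is reloaded whenever the previous load has been used up,
    and the final load only takes what the remaining droplets need.
    """
    drops_per_load = int(tip_capacity_ul // drop_ul)
    if drops_per_load < 1:
        raise ValueError("a single droplet does not fit in the tip")

    steps = []
    for start in range(0, len(coords), drops_per_load):
        batch = coords[start:start + drops_per_load]
        steps.append(("aspirate", len(batch) * drop_ul))
        steps.extend(("dispense", xy) for xy in batch)
    return steps


# 45 made-up coordinates: expect loads of 20, 20, and 5 droplets.
blue_coords = [(x, 0) for x in range(45)]
steps = plan_dispenses(blue_coords)
print([s for s in steps if s[0] == "aspirate"])
```

Planning the steps first and executing them second also makes the failure mode easy to spot: no dispense can ever follow an empty tip.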

It was really straightforward to go from my design to the Opentrons, at least from my point of view as a student (not sure what magic the TAs and staff work behind the scenes). The machine itself was incredibly precise, not only in movement but also in dispensing. Here’s how my final design ended up (I’ll explain more about the process on my lab3 webpage):

cover image

Post-Lab Questions

Bioautomation in the wild [2]

The paper I chose was entitled: “TidyTron: Reducing lab waste using validated wash-and-reuse protocols for common plasticware in Opentrons OT-2 lab robots” authored by Bryant et al. (2023). I chose this paper because it touches on two things I care about, one being minimizing lab waste, and two being washing processes.

In this study, the authors developed an automated system, called TidyTron, that runs on the Opentrons liquid-handling robot and reduces laboratory plastic waste by automating wash-and-reuse protocols for common lab consumables like pipette tips and microplates. Traditionally, many of these are discarded after a single use because of uncertainty about cross-contamination; the authors even state that biotechnology labs generate ~5.5 million tons of plastic waste per year. TidyTron addresses this problem by using the OT-2 to perform consistent and reliable cleaning processes that are tested and validated for effectiveness.

The paper describes how the automated protocols were implemented on the Opentrons to rinse, clean, and sterilize plastics used with DNA solutions or microbial cultures. They evaluated the cleaned materials by measuring residual contamination using colony-forming unit (CFU) counts, quantitative PCR to detect residual DNA, and other metrics to confirm that the robot-washed consumables were equivalent in performance to new plastics. Their results showed that the wash-and-reuse procedures could effectively remove contamination, demonstrating that automation makes reusable workflows both practical and safe. I thought this approach was interesting because it applies automation not to increasing throughput or accuracy, but to sustainability in biological labs. I hope to use a similar system, not for plastic but for metal/more permanent lab equipment, which I’ll build into my DNA dispensing machine.

Bioautomation in my final project

As a mechanical engineer, I am very interested in the bioautomation part of synthetic biology. Two of my three final project ideas either seek to improve or apply bioautomation as a main part of the project. One idea I had was to do what we did in lab this week but in 3D. Two approaches I imagine are either to embed a 3D model in a brick of clear gel, similar to the lasers that can engrave a model into glass, or to build a 3D model out of gel and “paint” different fluorescent proteins around it, all completely automated. I can imagine different challenges based on which version I pursue, but this would be nearly impossible without an automated system.

Sources

[1] https://opentrons-art.rcdonovan.com/
[2] Bryant JA Jr, Longmire C, Sridhar S, Janousek S, Kellinger M, Wright RC. TidyTron: Reducing lab waste using validated wash-and-reuse protocols for common plasticware in Opentrons OT-2 lab robots. SLAS Technol. 2024 Apr;29(2):100107. doi: 10.1016/j.slast.2023.08.007. Epub 2023 Sep 9. PMID: 37696493; PMCID: PMC12212179.

AI Prompts

“Using the functions described in this document, write a loop that deposits blue droplets at every coordinate in a given list”
“Use the function to refill the pipette after it’s empty, then continue depositing droplets if there are more droplets than the pipette can hold at once”

Week 4 HW: Protein Design Part I

cover image

HW4

Conceptual Questions

Here are my answers to the conceptual questions:

[1] How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)

~$3 \times 10^{24}$ molecules, from $\dfrac{500\ \text{g}}{100\ \text{g/mol}} \times 6.02 \times 10^{23}\ \text{mol}^{-1}$
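The same estimate as a quick arithmetic check. It keeps the question's simplification that the whole 500 g is amino acids at ~100 g/mol, so it is an upper bound rather than a measurement; the function name is my own.

```python
AVOGADRO = 6.022e23  # molecules per mole


def count_molecules(mass_g: float, molar_mass_g_per_mol: float) -> float:
    """Moles = mass / molar mass; molecules = moles * Avogadro's number."""
    return mass_g / molar_mass_g_per_mol * AVOGADRO


n = count_molecules(500, 100)
print(f"{n:.2e} amino acid molecules")
```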

[2] Why do humans eat beef but do not become a cow, eat fish but do not become fish?

Proteins are broken into amino acids in the digestive system and reassembled using instructions from human DNA. We reuse the material, but not the entire structure.

[3] Why are there only 20 natural amino acids?

These 20 amino acids are diverse enough to form all of the complex functional proteins needed throughout evolution. Adding others does not necessarily provide an evolutionary advantage and increases complexity and error rates.

[4] Can you make other non-natural amino acids? Design some new amino acids.

Yes, you can do this by using engineered translation machinery, such as orthogonal aminoacyl-tRNA synthetase/tRNA pairs and engineered ribosomes. Some examples include fluorinated amino acids that increase hydrophobicity and stability, and photo-reactive amino acids that respond to light.

[5] Where did amino acids come from before enzymes that make them, and before life started?

They came from abiotic chemistry: reactions among gases on early Earth with lightning and UV radiation as energy sources, reactions at hydrothermal vents, and delivery by meteorites.

[6] If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?

Natural proteins are made of L-amino acids and form right-handed α-helices. If you instead used entirely D-amino acids, you'd make a left-handed α-helix.

[7] Can you discover additional helices in proteins?

Yes, proteins can form multiple helical structures, not just the α-helix. Some examples are 3₁₀ helices and π-helices.

[8] Why are most molecular helices right-handed?

Life evolved using L-amino acids, and chains built from L-amino acids sterically favor right-handed helices.

[9] Why do β-sheets tend to aggregate? What is the driving force for β-sheet aggregation?

β-sheets form planar structures with backbone hydrogen bond donors and acceptors along their edges. This lets multiple sheets form hydrogen bonds and stack together.

[10] Why do many amyloid diseases form β-sheets? Can you use amyloid β-sheets as materials?

Skipped

[11] Design a β-sheet motif that forms a well-ordered structure.

Skipped

Protein Analysis and Visualization - ATP Synthase: P0AB98 · ATP6_ECOLI

Briefly describe the protein you selected and why you selected it.

I selected ATP synthase because it is a fascinating molecular machine. As a mechanically driven enzyme, it directly connects structure, motion, and energy conversion. ATP synthase is a large multi-subunit complex composed of two major regions: F₀ (membrane-embedded rotor) and F₁ (soluble catalytic head). The amino acid sequence depends on the organism and subunit. I looked at ATP6_ECOLI which is a critical membrane component of the F₀ motor in E. coli. It has 271 amino acids and is highly hydrophobic with many transmembrane helices. The most frequent amino acid is leucine which appears 45 times.

Here’s the sequence: MASENMTPQDYIGHHLNNLQLDLRTFSLVDPQNPPATFWTINIDSMFFSVVLGLLFLVLFRSVAKKATSGVPGKFQTAIELVIGFVNGSVKDMYHGKSKLIAPLALTIFVWVFLMNLMDLLPIDLLPYIAEHVLGLPALRVVPSADVNVTLSMALGVFILILFYSIKMKGIGGFTKELTLQPFNHWAFIPVNLILEGVSLLSKPVSLGLRLFGNMYAGELIFILIAGLLPWWSQWILNVPWAIFHILIITLQAFIFMVLTIVYLSMASEEH

BLAST searching in UniProt revealed 250 homologs. The protein belongs to the F-type ATPase subunit family, which is common across many life forms because of its role in energy metabolism.

The earliest solved structure I could find dates back to 2016; by 2023 the resolution had improved from 6.9 Å to 3.1 Å, making it close to a good-quality structure. The solved structures of ATP synthase in E. coli include multiple proteins as well as lipid molecules, detergent molecules, and even bound ATP or ADP.

In structural classification systems, subunit a belongs to the membrane protein α-helical bundle class, because it is mostly made of transmembrane α-helices.

cover image

Protein Visualizations: ATP Synthase 6oqw

cover image

Here's the protein as a cartoon in PyMOL

cover image

Here it is by secondary structure; it has a lot more helices than sheets

cover image

Here it is with hydrophobic in orange, polar in cyan, positive in blue, and negative in magenta. It's mostly hydrophobic, which is important for membrane stabilization.

cover image

The surface of the protein does not have any large holes and instead has narrow channels.

Group Brainstorm on Bacteriophage Engineering

Computational Engineering of the MS2 Lysis Protein (L)

Background. The MS2 L protein is a 75-amino-acid polypeptide that lyses E. coli by an incompletely understood mechanism. Its C-terminal transmembrane (TM) domain inserts into the cytoplasmic membrane and oligomerizes, causing depolarization that triggers host autolytic enzymes to degrade the murein layer. Recessive, conservative missense mutations clustered around a conserved LS dipeptide strongly imply that L engages an unidentified host protein target rather than simply disrupting the bilayer. The dispensable N-terminal domain binds chaperone DnaJ (with solved PDB structures), modulating lysis timing. Its removal causes lysis ~20 min earlier. No experimental structure of L exists.

Goals. (1) Stabilize L for more robust membrane accumulation. (2) Accelerate lysis by bypassing DnaJ-dependent regulatory timing and improving delivery of functional L to the membrane. Because the downstream lytic target is unknown, we do not attempt to enhance per-molecule toxicity at the point of target engagement; we focus on removing regulatory brakes and increasing the supply of functional protein.

Pipeline: Three Tools, Each Non-Redundant

Clustal Omega (Conservation Map). Align L homologs across Leviviridae (MS2, f2, R17, GA, PP7, AP205, PRR1, M12, KU1, JP34). Conserved C-terminal residues, especially the LS motif, are presumed to mediate the unknown heterotypic interaction and are excluded from mutation. This map constrains all downstream design.
ESM2 + Deep Combinatorial Scanning (Fitness Oracle). Score every single-point mutation by log-likelihood change: increases at mutable positions indicate stabilizing substitutions (Goal 1). N-terminal scanning identifies mutations that disrupt DnaJ binding (Goal 2). A strict preservation rule applies near the LS motif: mutations are evaluated for maintenance of wild-type fitness, not improvement. The genetics show even conservative changes there cause recessive loss of function. Pairwise combinatorial scanning (~2M pairs) captures epistatic synergies at mutable positions. This could potentially be pushed further with enough compute.
AlphaFold 3 (Structural Filter + Complex Model). Predicts variant structures as a sanity check (does the TM helix survive?) and models the L–DnaJ complex to verify that N-terminal truncations/mutations disrupt the regulatory interface. Used as a filter, not a design engine. PAE matrix identifies confident interface contacts.

Ranking. Composite score: ESM2 log-likelihood gain (stability) + conservation preservation (all essential residues intact) + AF3-predicted DnaJ-binding disruption (for timing bypass). Top 10–20 variants advance to experimental validation.

Pipeline Schematic

cover image

Why Not More Tools? ProteinMPNN is excluded because it is trained on crystallized globular PDB proteins, not predicted structures of disordered membrane peptides. The compute is invested in combinatorial ESM2 depth.

Pitfalls

No experimental structure: All structural reasoning rests on AF3 predictions for a challenging target; mitigated by treating AF3 as a filter and cross-referencing against the conservation map.
Unknown lytic target: The central limitation. We cannot optimize target-binding affinity for an unidentified partner; engineering is restricted to upstream properties (stability, membrane delivery, DnaJ bypass).
Autolysin bottleneck: If lysis rate is limited by host autolytic enzyme activity rather than L accumulation, stabilization gains may show diminishing returns; the plaque assay will reveal this.
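The scanning step in the pipeline starts from a plain enumeration of every single-point mutant outside the protected positions. The sketch below shows only that enumeration, using the wild-type L sequence that appears elsewhere on this page. The `PROTECTED` positions are a hypothetical placeholder, not the actual conservation map, and the ESM2 scoring itself is not shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Wild-type MS2 L-protein sequence as used elsewhere on this page.
WT_L = "METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT"


def single_mutants(seq, protected=frozenset()):
    """Yield labels like 'M1A' for every substitution at unprotected sites.

    Positions are 1-based, matching the usual mutation notation.
    """
    for pos, wt_aa in enumerate(seq, start=1):
        if pos in protected:
            continue
        for aa in AMINO_ACIDS:
            if aa != wt_aa:
                yield f"{wt_aa}{pos}{aa}"


# Hypothetical placeholder: five positions frozen by the conservation map.
PROTECTED = frozenset(range(56, 61))

all_mutants = list(single_mutants(WT_L))
allowed = list(single_mutants(WT_L, PROTECTED))
print(len(all_mutants), len(allowed))
```

Each label in `allowed` would then be scored, and pairs of high-scoring labels combined for the pairwise scan.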

Sources

AI Prompts

Write PyMOL code to help me visualize the protein as “cartoon”, “ribbon” and “ball and stick”, color the protein by secondary structure, color the protein by residue type, and visualize the surface of the protein

Week 5 HW: Protein Design Part II

cover image

HW5

Part A

Part 1: Generate Binders with PepMLM

I started by getting the SOD1 sequence from UniProt: MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ Then I added the A4V mutation which changed it to: MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

After running PepMLM I got these results:

Binder,Pseudo Perplexity
DHSYAVVVAWKX,15.47789718276849
DRYGAAAIKHKK,17.832445649262862
WSYPAAVVAHKX,7.216885793300712
DHVGAVAIEHKX,10.94105373182517

Compared to the known result of: FLYRWLPSRRGG,20.63523127283615

Part 2: Evaluate Binders with AlphaFold3

cover image

Binder, ipTM, pTM
DHSYAVVVAWKX, 0.51, 0.89
DRYGAAAIKHKK, 0.4, 0.87
WSYPAAVVAHKX, 0.41, 0.85
DHVGAVAIEHKX, 0.38, 0.88

Compared to the known result of: FLYRWLPSRRGG, 0.3, 0.82

cover image

I modeled the complex between A4V mutant superoxide dismutase 1 (SOD1) and each generated peptide using AlphaFold 3. The PepMLM-generated peptides produced ipTM scores ranging from 0.38 to 0.51, while the known SOD1-binding peptide FLYRWLPSRRGG had an ipTM score of 0.30. The highest scoring peptide, DHSYAVVVAWKX, achieved an ipTM of 0.51 with a pTM of 0.89, suggesting a stronger predicted interaction with the protein compared to the known binder. The other generated peptides (DRYGAAAIKHKK, WSYPAAVVAHKX, and DHVGAVAIEHKX) also showed moderate interaction scores between 0.38 and 0.41, all exceeding the known peptide’s ipTM. Overall, these results suggest that the PepMLM model was able to generate candidate peptides with predicted binding strengths comparable to or stronger than the previously known SOD1 binder.

cover image

Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse

cover image

Among my peptides, WSYPAAVVAHKX is the best candidate to advance. All three peptides are predicted to be soluble and non-hemolytic with similar binding affinities (pKd/pKi ≈ 5.3), indicating comparable binding strength. However, WSYPAAVVAHKX has the lowest hemolysis probability (0.014) and a moderate net charge (0.85), which may improve stability and reduce potential toxicity. In contrast, DRYGAAAIKHKK has slightly stronger predicted binding but a much higher positive charge (2.84) and a strongly negative hydrophobicity score, which could affect membrane interactions. DHSYAVVVAWKX has near-neutral charge but a higher hemolysis probability than WSYPAAVVAHKX. Overall, WSYPAAVVAHKX provides the best balance of binding affinity, safety, and physicochemical properties, making it the most promising peptide to move forward with.

Part 4: Generate Optimized Peptides with moPPIt

I had issues running the Colab without a GPU, but here are my predictions. moPPIt is intended to generate 12-amino-acid peptides that bind to specific residues on the A4V mutant SOD1 while optimizing multiple properties like binding affinity, solubility, and low hemolysis. Unlike PepMLM, which broadly samples possible binders from the target sequence, moPPIt uses guided generation to steer peptides toward a chosen binding site and optimize several objectives simultaneously. As a result, moPPIt peptides would be expected to be more targeted and better optimized for therapeutic properties than the PepMLM peptides.

Before advancing these peptides further, I would first evaluate them computationally by predicting their structures and docking them to the SOD1 binding site to estimate binding strength. I would also check properties like stability, aggregation risk, toxicity, and immunogenicity. The most promising peptides should then be synthesized and tested experimentally with binding assays and cell-based tests to confirm binding, stability, and safety before considering preclinical studies.

Part C: Final Project: L-Protein Mutants

L-Protein Engineering | Option 1: Mutagenesis

cover image

L-Protein Engineering | Option 2: Mutagenesis using AF2-Multimer

To disrupt the interaction between the L-protein and DnaJ, I designed mutations in the soluble region that either change charge, disrupt hydrophobic contacts, or introduce steric changes that could weaken the binding interface. I avoided positions that appear highly conserved in the sequence alignments and focused on residues that are likely surface exposed.

Mutation Set 1: E45K, L48D, Y52A. This set flips the charge of E45 and introduces a charged residue at L48, which could disrupt electrostatic and hydrophobic interactions at the binding interface.
Mutation Set 2: A60D, V63K, L67D. These substitutions introduce charged residues into a region that is likely hydrophobic, which may destabilize the interaction surface with DnaJ.
Mutation Set 3: F72A, Y75A, L78D. Replacing aromatic and hydrophobic residues with alanine or charged residues reduces hydrophobic packing and weakens potential binding contacts.
Mutation Set 4: K85E, R88E, L91A. This set reverses positive charges that may interact with DnaJ and replaces a hydrophobic residue with alanine to reduce interface stability.
Mutation Set 5: V96D, A99K, L102D. Introducing charged residues in this region may disrupt local structure or electrostatic interactions that stabilize the L-protein–DnaJ complex.

To evaluate these mutants, I would co-fold each mutant L-protein with DnaJ using AF2-Multimer and compare predicted interface contacts, confidence scores, and binding geometry. Mutants that show reduced interface contacts or lower interaction confidence would be considered more effective at disrupting the interaction.
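Before co-folding, it can help to confirm that each mutation label names the residue actually present at that position in the sequence being submitted, since numbering often differs between alignments and constructs. The sketch below checks labels against the wild-type sequence from Option 3; the function names are my own, and labels that fall outside the sequence or name a different residue are flagged so they can be re-indexed rather than silently applied.

```python
import re

# Wild-type L-protein sequence, as in the Option 3 code on this page.
WT_L = "METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT"


def check_mutation(label: str, wt: str = WT_L) -> bool:
    """True if a label such as 'L48D' matches the wild-type residue."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label}")
    original, pos = match.group(1), int(match.group(2))
    if pos < 1 or pos > len(wt):
        return False  # position lies outside this sequence
    return wt[pos - 1] == original


def apply_mutations(labels, wt: str = WT_L) -> str:
    """Apply validated substitutions and return the mutant sequence."""
    seq = list(wt)
    for label in labels:
        if not check_mutation(label, wt):
            raise ValueError(f"{label} does not match the wild-type sequence")
        seq[int(label[1:-1]) - 1] = label[-1]
    return "".join(seq)


for label in ["E45K", "L48D", "Y52A"]:
    print(label, check_mutation(label))
```

Only sets whose labels all pass the check should be passed to `apply_mutations` and on to AF2-Multimer.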

L-Protein Engineering | Option 3: Random Mutagenesis

Python Function for Random Mutations

import random

# Wild-type L-protein sequence
wt_sequence = "METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT"

# Amino acid alphabet
amino_acids = list("ACDEFGHIKLMNPQRSTVWY")

def generate_random_mutant(sequence, num_mutations=2):
    seq_list = list(sequence)

    # randomly choose mutation positions
    positions = random.sample(range(len(sequence)), num_mutations)

    mutations = []

    for pos in positions:
        original = seq_list[pos]
        choices = [aa for aa in amino_acids if aa != original]
        new_aa = random.choice(choices)

        seq_list[pos] = new_aa
        mutations.append(f"{original}{pos+1}{new_aa}")

    mutated_sequence = "".join(seq_list)

    return mutated_sequence, mutations


# example mutant
mutant_seq, muts = generate_random_mutant(wt_sequence, num_mutations=3)

print("Mutations:", muts)
print("Mutant Sequence:", mutant_seq)

A good mutant should reduce the predicted interaction between the L-protein and DnaJ while still maintaining a stable fold. After generating mutants, each sequence can be co-folded with the DnaJ sequence using AlphaFold2-Multimer. Effective mutants would show fewer interface contacts or lower interaction confidence between the two proteins while still maintaining high structural confidence (pLDDT) for the L-protein. Mutants that weaken the interaction but preserve overall protein stability would be the best candidates for further experimental testing.

AI Prompts

‘Help me write python code to create random protein mutations’

Week 6 HW: Genetic Circuits Part I

cover image

HW6

PCR and DNA Assembly

What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?

The Phusion High-Fidelity PCR Master Mix contains a high-fidelity DNA polymerase for accurate DNA replication, dNTPs as the building blocks for new DNA strands, a buffer to maintain optimal reaction conditions, and magnesium ions which act as a cofactor for the polymerase. Together, these components enable efficient and precise DNA amplification.

What are some factors that determine primer annealing temperature during PCR?

Primer annealing temperature is primarily determined by the melting temperature (Tm) of the primers, which depends on their length and GC content. Higher GC content increases the Tm because G-C pairs form three hydrogen bonds, compared with two for A-T pairs.
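The dependence on length and GC content shows up directly in the Wallace rule, a rough rule of thumb for short oligos: Tm ≈ 2 °C per A/T plus 4 °C per G/C. The sketch below uses it only to illustrate the trend; real primer design tools use nearest-neighbor thermodynamics and account for salt and primer concentration, so the numbers here should not be used to set actual cycling conditions.

```python
def wallace_tm(primer: str) -> int:
    """Estimate Tm in degrees C with the Wallace rule: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    if at + gc != len(primer):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


# Same length, different GC content: the GC-rich primer melts higher.
print(wallace_tm("AAAATTTTAAAATTTT"), wallace_tm("GGGGCCCCGGGGCCCC"))
```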

There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.

PCR generates linear DNA fragments by amplifying a target sequence using primers and a polymerase through thermal cycling, allowing for flexible design and modification of DNA ends. In contrast, restriction enzyme digestion cuts DNA at specific recognition sites, producing predictable fragments but with less flexibility. PCR is better when creating new sequences or adding overlaps, while restriction digests are better when cutting existing DNA at known sites for cloning.

How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?

To ensure compatibility with Gibson cloning, the DNA fragments must have overlapping homologous regions that allow them to anneal during assembly. They should be designed to keep the correct orientation and reading frame, and checked using sequence analysis tools. Gel electrophoresis can be used to confirm fragment size before assembly.
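One way to check the overlap requirement in code is to look for a shared sequence between the end of one fragment and the start of the next. The sketch below is an assumption-laden example: the 20 to 40 bp window is a placeholder to adjust to your kit's recommendation, the function names are hypothetical, and it only compares fragments in the orientation given.

```python
def shared_overlap(upstream: str, downstream: str,
                   min_len: int = 20, max_len: int = 40) -> int:
    """Length of the longest suffix of `upstream` that equals a prefix of
    `downstream`, searched from max_len down to min_len; 0 if none is found."""
    up, down = upstream.upper(), downstream.upper()
    longest = min(max_len, len(up), len(down))
    for n in range(longest, min_len - 1, -1):
        if up[-n:] == down[:n]:
            return n
    return 0


def circular_junctions(fragments):
    """Overlap length at every junction of a circular assembly,
    including the last fragment back to the first."""
    return [
        shared_overlap(fragments[i], fragments[(i + 1) % len(fragments)])
        for i in range(len(fragments))
    ]


# Made-up fragments sharing a 24-nt homology region
homology = "ACGT" * 6
frag_a = "TTTTTTTTTT" + homology
frag_b = homology + "GGGGGGGGGG"
print(shared_overlap(frag_a, frag_b))  # overlap length found
print(shared_overlap(frag_b, frag_a))  # 0: no overlap in this direction
```

A junction that returns 0 means the two fragments would not anneal there and the primers need redesigning.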

How does the plasmid DNA enter the E. coli cells during transformation?

Plasmid DNA enters cells through heat shock or electroporation. Heat shock temporarily increases membrane permeability because of the rapid temperature change, and electroporation uses an electric field to create pores in the cell membrane so DNA can pass into the cell.

Describe another assembly method in detail

Golden Gate Assembly uses Type IIS restriction enzymes that cut outside of their recognition sites to generate custom overhangs. These overhangs allow multiple DNA fragments to be assembled in a specific order in a single reaction. DNA ligase then joins the fragments together, resulting in a seamless construct without extra bases.

Explain the other method in 5–7 sentences plus diagrams

Golden Gate Assembly relies on Type IIS restriction enzymes, which cut DNA outside of their recognition sequences to produce specific overhangs. These overhangs are designed so that different DNA fragments can anneal in a predetermined order. During the reaction, restriction enzymes cut the DNA while ligase simultaneously joins compatible fragments, driving the system toward the correct final product. Because the recognition sites are removed during assembly, the resulting DNA sequence is seamless. This method is best for assembling multiple fragments in a single reaction and is popular for synthetic biology, but it requires careful design of overhang sequences to avoid incorrect assembly. It is a fast and scalable method for constructing complex DNA designs.
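The "careful design of overhang sequences" step can be partly automated. The sketch below assumes an enzyme that leaves 4-nt overhangs (BsaI is a common example) and flags overhangs that are palindromic or that match another overhang or its reverse complement. The function names and example overhangs are illustrative, not taken from my Benchling design.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def check_overhangs(overhangs):
    """Return a list of human-readable problems found in a set of overhangs."""
    problems = []
    seen = set()
    for oh in (o.upper() for o in overhangs):
        rc = reverse_complement(oh)
        if len(oh) != 4:
            problems.append(f"{oh}: not 4 nt")
        if oh == rc:
            problems.append(f"{oh}: palindromic, can ligate to itself")
        if oh in seen or rc in seen:
            problems.append(f"{oh}: duplicates or complements another overhang")
        seen.add(oh)
    return problems


# Example sets (illustrative values)
good_set = ["GGAG", "AATG", "GCTT", "CGCT"]
bad_set = ["AATT", "GGAG", "CTCC"]  # AATT is palindromic; CTCC complements GGAG

print(check_overhangs(good_set))
print(check_overhangs(bad_set))
```

An empty list means each junction has a unique, non-self-complementary overhang, so the fragments can only assemble in the intended order.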

cover image

Model this assembly method with Benchling or Asimov Kernel

For this part of the assignment, I used Benchling to create a Golden Gate Assembly of the pUC19 backbone with the GFP insert.

cover image

I used primers to ensure that the entire insert was included in the plasmid.

cover image

Here's what it looked like in the end.

Asimov Kernel

For the first part of the Kernel homework, I looked into the repressilator and multiplexer circuits. The repressilator is a synthetic oscillatory circuit made of three genes that repress each other in a loop, causing time-delayed negative feedback that results in periodic oscillations of protein expression. The multiplexer is a logic-based circuit that uses L-arabinose, aTc, and IPTG as inputs and produces a high GFP output only for specific combinations of these inputs. It was interesting to see their simulations and compare them to electronic circuits.
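To see why a three-gene repression loop oscillates, here is a minimal sketch of a simplified, dimensionless, protein-only repressilator model, where each protein is produced at rate alpha / (1 + p_prev^n) and decays at rate 1. This is not the model Asimov Kernel uses; the parameter values, starting levels, and function name are assumptions chosen so the loop oscillates.

```python
def simulate_repressilator(alpha=20.0, n=4.0, dt=0.01, steps=10000,
                           start=(1.0, 2.0, 3.0)):
    """Forward-Euler integration of dp_i/dt = alpha / (1 + p_prev**n) - p_i
    for three proteins in a ring (each is repressed by the one before it)."""
    p = list(start)
    history = [tuple(p)]
    for _ in range(steps):
        # p[i - 1] wraps around, so protein 0 is repressed by protein 2
        rates = [alpha / (1.0 + p[i - 1] ** n) - p[i] for i in range(3)]
        p = [p[i] + dt * rates[i] for i in range(3)]
        history.append(tuple(p))
    return history


history = simulate_repressilator()
late_levels = [state[0] for state in history[5000:]]  # second half of the run
print(min(late_levels), max(late_levels))  # range protein 0 keeps swinging through
```

The starting levels are deliberately unequal: if all three proteins start identical, this symmetric model stays balanced and never begins to oscillate.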

I also had AI help create a cheat sheet:

Cheat Sheet
pBad – activated by arabinose
pTet – repressed by TetR, activated by aTc
pLacI / pTac – repressed by LacI, activated by IPTG
RBS – controls translation strength
TetR – represses pTet
LacI – represses Lac promoters
GFP – fluorescent output
L3S2P24 Terminator – stops transcription
aTc – inhibits TetR
IPTG – inhibits LacI
Arabinose – activates pBad
Structure – Promoter → RBS → CDS → Terminator → Promoter → RBS → CDS → Terminator

For the second part, I recreated the repressilator and simulated it.

cover image

For the third part, I tried to design three circuits: a toggle switch, a pulse, and a tiered cascade.

The toggle switch should turn off when aTc is added and back on with IPTG. Adding aTc should turn the LacI transcript on and the TetR transcript off, causing GFP production to stop. The aTc response worked in simulation, but the circuit did not respond to IPTG the way I expected.

cover image

The pulse should show a sharp increase and then a decrease in GFP when arabinose is added. Arabinose should turn the TetR transcript on, while GFP goes from zero to high production and then decreases back to zero. This one partially worked in simulation: I expected GFP to start and end at zero, but instead the pulse rose from a relatively low baseline to a high level and back. That might affect how visible the glow is, but it won’t cause a flicker.

cover image

The cascade should look like stairs, starting from one rate of GFP production and then stepping down to a lower rate when arabinose is detected. This should happen by turning the transcript for TetR on, so that GFP production drops from the higher rate to the lower one. This one did not work at all; there was no response to the arabinose. I believe that there was something wrong with the sensing part of the circuit, but I couldn’t troubleshoot it. ChatGPT recommended some changes to the TetR part to ensure detection of arabinose, but I wasn’t sure how to implement them.

cover image

Week 7 HW: Genetic Circuits Part II

cover image

HW7

Intracellular Artificial Neural Networks

What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?

Intracellular Artificial Neural Networks use continuous analog signals instead of binary ones, which allows them to understand complex inputs like concentrations as opposed to just noting presence. They can use this to perform thresholding, enabling more complex reactions with fewer components. Overall, they are more scalable and better at multi-input sensing than regular genetic circuits.

Describe a useful application for an IANN; include a detailed description of input/output behavior, as well as any limitations an IANN might face to achieve your goal.

Diseases like cancer involve many unique biomarkers at various concentrations, so an IANN could be used to engineer smart therapeutic cells that detect tumors by integrating multiple biomarkers. The inputs would be markers like hypoxia, lactate, and cytokine levels. These would all be read on a continuous scale and processed by taking a weighted combination of the inputs with a threshold response, so the cell only responds when levels are high enough to indicate a tumor. The output could then be something like the expression of a therapeutic protein or a reporter. Some limitations of this system would be biological noise, difficulty tuning the weights, and slow response times.
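The weighted combination with a threshold response described above is the same computation as a single perceptron unit. Here is a minimal numerical sketch; the weights, bias, steepness, and input values are hypothetical numbers chosen for illustration, and the inputs are assumed to be normalized to a 0 to 1 scale.

```python
import math


def perceptron_output(inputs, weights, bias, steepness=10.0):
    """Weighted sum of the inputs passed through a sigmoid threshold.
    Returns a value between 0 (no response) and 1 (full response)."""
    z = sum(weights[name] * inputs[name] for name in weights) + bias
    return 1.0 / (1.0 + math.exp(-steepness * z))


# Hypothetical weights and threshold (bias) for three biomarkers
weights = {"hypoxia": 0.4, "lactate": 0.3, "cytokine": 0.3}
bias = -0.6

healthy = {"hypoxia": 0.1, "lactate": 0.2, "cytokine": 0.1}
tumor = {"hypoxia": 0.9, "lactate": 0.8, "cytokine": 0.9}

print(perceptron_output(healthy, weights, bias))  # close to 0
print(perceptron_output(tumor, weights, bias))    # close to 1
```

In a cell, the weights would correspond to tunable parts such as promoter or RBS strengths, which is where the difficulty of tuning comes from.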

Draw a diagram for an intracellular multilayer perceptron where layer 1 outputs an endoribonuclease that regulates a fluorescent protein output in layer 2.

cover image

Fungal Materials

What are some examples of existing fungal materials and what are they used for? What are their advantages and disadvantages over traditional counterparts?

Fungal materials made from mycelium are used for packaging, insulation, and leather alternatives. The pros are that these materials are biodegradable, sustainable, and can be grown with low energy input on waste feedstocks. They are also often lighter than their traditional counterparts. However, they are generally weaker and less durable than plastics or metals, can be quite sensitive to moisture, and are still challenging to mass-produce.

What might you want to genetically engineer fungi to do and why? What are the advantages of doing synthetic biology in fungi as opposed to bacteria?

You could engineer fungi to produce building materials that capture and store CO₂ by locking carbon from their growth substrates into the material. Compared to bacteria, fungi naturally form filamentous, multicellular networks that are ideal for shaping walls, panels, or insulation, and they can sequester carbon directly in their biomass. The main challenges are that fungi grow more slowly and are more complex to engineer genetically than bacteria.

First DNA Twist Order

cover image

For my final project, I want to make a “hydration” checking wearable device. Originally, I wanted to sense increased sodium levels in sweat, but that proved to be difficult, so instead I’m approximating increased hydration risk by just detecting lactate. I want this to be a cell-free system to make it more compatible with a safe wearable device.

I’m still working on my first DNA order. I want it to be cell-free and got some help designing a genetic circuit, but I want to verify it against the cell-free lesson before placing the order. I want it to follow this genetic circuit:

Lactate (input signal from sweat) -> Lactate Oxidase (breaks lactate down into pyruvate and H2O2) -> Pyruvate (byproduct, not used) + H2O2 (activates OxyR by oxidizing it) -> OxyR (has two states: it starts reduced, then is oxidized to become active and bind DNA to activate transcription from PoxyS) -> PoxyS Promoter (controlled by OxyR, switches on the reporter gene) -> RNA Aptamer (produced when the PoxyS promoter is active, binds the dye) -> DFHBI Dye (fluorescent only when bound to the RNA aptamer)

All together it follows this chain:
Lactate provides the biological input (increased sweat means increased need for hydration).
Lactate oxidase and H2O2 convert it into a detectable chemical signal.
OxyR and PoxyS act as the switch.
The RNA aptamer and dye generate a fast fluorescent output (the glow appears sooner than it would with protein translation, and the system is safer for a wearable because there are no cells).

Week 9 HW: Cell Free Systems

cover image

HW9

General homework questions

Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.

Cell-free synthesis gives more flexibility and control because it runs as an open reaction where you can directly set the concentrations of DNA, ions, and cofactors. This allows rapid prototyping and the expression of toxic or non-natural proteins without having to keep a host cell alive. It is especially beneficial for producing toxic proteins and for quickly testing genetic circuits.

Describe the main components of a cell-free expression system and explain the role of each component.

A cell-free expression system contains a cell extract with transcription and translation machinery, a DNA or mRNA template encoding the protein, and an energy system to fuel synthesis. It also contains amino acids as building blocks, plus salts and cofactors that support proper folding and reaction stability.

Why is energy provision regeneration critical in cell-free systems? Describe a method you could use to ensure continuous ATP supply in your cell-free experiment.

Energy regeneration is critical because protein synthesis quickly consumes ATP and GTP. Without replenishment the reaction stops. Continuous ATP supply can be maintained by using systems like phosphoenolpyruvate with pyruvate kinase to regenerate ATP from ADP.

Compare prokaryotic versus eukaryotic cell-free expression systems. Choose a protein to produce in each system and explain why.

Prokaryotic cell-free systems are fast and high-yield, while eukaryotic systems support proper folding and post-translational modifications but are slower and more expensive. A simple protein like GFP is best produced in a prokaryotic system, but a complex protein like an antibody is better suited to a eukaryotic system.

How would you design a cell-free experiment to optimize the expression of a membrane protein? Discuss the challenges and how you would address them in your setup.

A cell-free system can include liposomes or nanodiscs to mimic a membrane environment and prevent aggregation. Challenges include misfolding of hydrophobic regions and instability, which can be addressed by carefully tuning reaction conditions to improve insertion and folding.

Imagine you observe a low yield of your target protein in a cell-free system. Describe three possible reasons for this and suggest a troubleshooting strategy for each.

Low protein yield might be caused by poor transcription, inefficient translation, or energy depletion. These can be addressed by using stronger promoters or more DNA, optimizing ribosome binding sites or codons, and improving ATP regeneration systems.

Homework question from Kate Adamala

Design an example of a useful synthetic minimal cell as follows:

Pick a function and describe it.

What would your synthetic cell do? What is the input and what is the output?

For my final project, I want to make a lactate-detecting cell-free system. It would detect lactate concentration and produce a measurable signal. It acts as a biochemical sensor that converts a lactate input into a fluorescent output and is useful for monitoring physiological or environmental lactate levels.

Could this function be realized by cell-free Tx/Tl alone, without encapsulation?

Yes, that’s how I plan to do it for my final project, but encapsulation would improve control over sensing and environmental interaction and might also let me make the reaction resettable. Without a membrane, diffusion and signal localization are harder to regulate.

Could this function be realized by a genetically modified natural cell?

A genetically modified natural cell could also perform lactate sensing using my genetic circuit. However, natural cells introduce complexity, slower response times, and biosafety concerns. My circuit could also produce a large amount of H2O2, which is harmful to natural cells but maybe not to synthetic ones.

Describe the desired outcome of your synthetic cell operation.

The desired outcome is a system that produces a reliable, quantifiable signal proportional to lactate concentration. The response should be sensitive, fast, and specific to lactate. This allows accurate monitoring in real time.

Design all components that would need to be part of your synthetic cell.

What would the membrane be made of?

The membrane would be composed of a lipid bilayer combined with cholesterol for stability. This provides a semi-permeable barrier similar to natural cells. It also helps maintain internal reaction conditions.

What would you encapsulate inside? Enzymes, small molecules.

Inside the cell would be transcription/translation machinery, DNA encoding the sensor and reporter, amino acids, ATP regeneration components, and cofactors. My cell in particular would also include lactate oxidase and DFHBI dye.

Which organism your Tx/Tl system will come from? Is bacterial OK, or do you need a mammalian system for some reason? (hint: for example, if you want to use small molecule modulated promotors, like Tet-ON, you need mammalian)

I plan to use E. coli, which is sufficient for expressing simple reporter proteins and OxyR. A mammalian system is not necessary unless complex regulatory elements are required. My second goal was to track sodium levels, which a mammalian system might be best for.

How will your synthetic cell communicate with the environment? (hint: are substrates permeable? or do you need to express the membrane channel?)

Lactate would diffuse across the membrane or through membrane channels. If lactate permeability is low, the transporter protein lactate permease could be included.

Experimental details

List all lipids and genes.

Lipids: POPC and cholesterol.
Genes: OxyR (the H2O2-responsive regulator), the OxyR-controlled promoter (PoxyS) driving the RNA aptamer reporter that binds the dye, lactate oxidase, and possibly lactate permease for transport.

How will you measure the function of your system?

By monitoring fluorescence intensity over time using a plate reader or microscope. Increased fluorescence indicates a higher lactate concentration.
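To turn a fluorescence reading into a lactate estimate, one option is a calibration curve built from standards of known concentration. The sketch below assumes the response is linear over the range of interest, which a real sensor may not be (it will likely saturate at high lactate). The function names are hypothetical and the standard values are made-up placeholders, not measurements.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_lactate(fluorescence, slope, intercept):
    """Invert the calibration line to get concentration from a reading."""
    return (fluorescence - intercept) / slope


# Placeholder standards: (lactate in mM, fluorescence in arbitrary units)
standards = [(0.0, 100.0), (5.0, 600.0), (10.0, 1100.0)]
slope, intercept = fit_line(standards)
print(estimate_lactate(850.0, slope, intercept))  # lactate estimate in mM
```

The intercept captures background fluorescence from a no-lactate control, so it should be re-measured for each batch of reactions.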

Homework question from Peter Nguyen

Write a one-sentence summary pitch sentence describing your concept.

I want to make a freeze-dried cell-free system embedded in workout clothes that detects lactate buildup in sweat and changes color to indicate fatigue and performance.

How will the idea work, in more detail? Write 3-4 sentences or more.

Cell-free reactions containing a lactate-responsive genetic circuit are freeze-dried into removable patches that are attached to various locations on the fabric. When the wearer sweats, moisture rehydrates the system and lactate diffuses into the reaction, activating expression of the fluorescent reporter. The intensity of the signal correlates with lactate concentration, providing real-time feedback on muscle fatigue.

What societal challenge or market need will this address?

Not sure if anyone needs this, I just think it would be cool. Originally, I wanted to do something that informs the wearer of dehydration.

How do you envision addressing the limitation of cell-free reactions (e.g., activation with water, stability, one-time use)?

I hope that sweat provides enough water to activate the freeze-dried systems, and using modular replaceable patches would allow for multiple uses and control stability.

Homework question from Ally Huang

Provide background information that describes the space biology question or challenge you propose to address. Explain why this topic is significant for humanity, relevant for space exploration, and scientifically interesting.

Microgravity leads to rapid muscle atrophy due to reduced mechanical loading, posing a major risk for long-duration space missions. Current countermeasures, like exercise, are time-consuming and not fully effective. Understanding and monitoring molecular pathways that regulate muscle growth and degradation is essential for developing better interventions. This is significant for maintaining astronaut health, enabling deep space exploration, and advancing treatments for muscle-wasting diseases on Earth. Studying these pathways in space also provides insight into fundamental mechanisms of mechanobiology.

Name the molecular or genetic target that you propose to study. Examples of molecular targets include individual genes and proteins, DNA and RNA sequences, or broader approaches.

Myostatin (MSTN) and muscle growth regulators, including IGF-1 pathway gene expression markers.

Describe how your molecular or genetic target relates to the space biology question or challenge your proposal addresses.

Myostatin negatively regulates muscle growth, while IGF-1 promotes muscle synthesis and repair. In microgravity, increased myostatin activity and reduced anabolic signaling contribute to muscle loss. By monitoring these molecular targets, we can track the balance between muscle degradation and growth. This helps identify when and how muscle atrophy occurs in space. Understanding these changes enables development of targeted countermeasures to maintain muscle mass.

Clearly state your hypothesis or research goal and explain the reasoning behind it.

Microgravity conditions upregulate myostatin expression and downregulate IGF-1 signaling, leading to accelerated muscle atrophy. A cell-free system can be used to detect changes in expression of these markers from biological samples in space. The goal is to develop a lightweight, rapid diagnostic platform that monitors muscle health in real time using the stability and simplicity of freeze-dried cell-free systems to make it suitable for spaceflight. If successful, it could guide personalized countermeasures such as exercise timing or therapeutic interventions. This would improve astronaut health and mission performance during long-duration space travel.

Outline your experimental plan - identify the sample(s) you will test in your experiment, including any necessary controls, the type of data or measurements that will be collected, etc.

Samples containing RNA or DNA proxies for MSTN and IGF-1 expression will be analyzed using BioBits® cell-free reactions designed to produce fluorescence in response to target sequences. miniPCR® will amplify target sequences if needed, and fluorescence will be measured using the P51 viewer. Controls will include samples without target sequences and with known concentrations. Data will be collected as fluorescence intensity over time and compared across simulated microgravity conditions. This will quantify changes in muscle-related gene expression.

First DNA Twist Order

cover image

Subsections of Homework

Week 1 HW: Principles and Practices

HW1

Ultra-efficient DNA Synthesis Machine

Governance/Policy

Goal

Governance Actions

Hardware-Level Constraints on Maximum Assembly Length

Sequence Screening and Logging

Tiered Access

Recommended Governance Approach and Trade-offs

Week 2 Lecture Prep

Homework Questions from Professor Jacobson:

Homework Questions from Dr. LeProust:

Homework Question from George Church:

Sources

Week 10 HW: Imaging and Measurement

HW10

Homework: Final Project

Please identify at least one (ideally many) aspect(s) of your project that you will measure.

Please describe all of the elements you would like to measure, and furthermore describe how you will perform these measurements.

What are the technologies you will use (e.g., gel electrophoresis, DNA sequencing, mass spectrometry, etc.)? Describe in detail.

Fluorescence Imaging

Microfluidics

Calibration Curves and Quantitative Analysis

Mass Spectrometry

Optical Detection Electronics

This would make the hydration tracker portable and wearable during exercise or daily use.

Homework: Waters Part I — Molecular Weight

Based on the predicted amino acid sequence of eGFP and any known modifications, what is the calculated molecular weight?

Calculate the molecular weight of the eGFP using the adjacent charge state approach.

Calculate the accuracy of the measurement.

Can you observe the charge state for the zoomed-in peak in the mass spectrum for the intact eGFP?

Homework: Waters Part III — Peptide Mapping - Primary Structure

How many Lysines (K) and Arginines (R) are in eGFP?

How many peptides will be generated from tryptic digestion of eGFP?

Based on the LC-MS data for the peptide map, how many chromatographic peaks do you see?

Does the number of peaks match the number of predicted peptides?

Identify the mass-to-charge (m/z) of the peptide shown in Figure 5b.

Identify the peptide and calculate the ppm error.

What is the percentage of the sequence that is confirmed by peptide mapping?

Does the peptide map data indicate the protein is eGFP?

Homework: Waters Part IV — Oligomers

Using the known masses of the KLH subunits, identify the oligomeric species.

Homework: Waters Part V — Did I Make GFP?

Please fill out this table with the data acquired from the lab work.

Week 11 HW: Bioproduction and Cloud Labs

HW11

Part A

Part B

Part C

Week 12 HW: Building Genomes

HW12

Week 13 HW: Bio Design Living Materials

HW13

Week 14 HW: Biofabrication

HW14

Week 2 HW: DNA Read, Write, and Edit

HW2

Part 1: Benchling & In-silico Gel Art

Part 2: Gel Art - Restriction Digests and Gel Electrophoresis

Part 3: DNA Design Challenge

3.1: Choose your protein

3.2: Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence

3.3. Codon optimization

3.4. You have a sequence! Now what?

Cell-Dependent Expression:

Cell-Independent Expression:

Part 4: Prepare a Twist DNA Synthesis Order

Part 5: DNA Read/Write/Edit

5.1 DNA Read

5.2 DNA Write

5.3 DNA Edit

Sources

Week 3 HW: Lab Automation

HW3

Lab Preparation: Opentrons Artwork

Post-Lab Questions