Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices

    Scale-up of nanocapsules for drug delivery using bacteria as ferritin manufacturers 1. Describe a biological engineering application or tool you want to develop and why. Biologics, drugs synthesized by living organisms, have gained prominence over the years (Walsh, 2018). Cancer drugs and vaccines are some of the achievements scientists have accomplished with biotechnology. This is a fast-growing field with a widening range of applications. Currently, iron deficiency is one of the main global issues affecting overall health (Lee et al., 2025). This project aims to develop a drug delivery system using bacteria-made ferritin, given the popularity and extended use of these microorganisms over the years for drug manufacturing (Kulkarni, 2026).

  • Week 10 HW: Advanced Imaging & Measurement Technology

    Please identify at least one (ideally many) aspect(s) of your project that you will measure. It could be the mass or sequence of a protein, the presence, absence, or quantity of a biomarker, etc. The main aspect to be measured is the expression and activity of the biosynthetic gene cluster (BGC). This includes: Presence and expression of BGC-associated enzymes Production of candidate metabolites Antibacterial activity against Leptospira Please describe all of the elements you would like to measure, and furthermore describe how you will perform these measurements.

  • Week 11 HW: Bioproduction and Cloud Labs

    1. Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork I was not able to complete the artwork since I never received the email :c. 2. Part B: Cell-Free Protein Synthesis | Cell-Free Reagents Referencing the cell-free protein synthesis reaction composition (the middle box outlined in yellow on the image above, also listed below), provide a 1-2 sentence description of what each component’s role is in the cell-free reaction.

  • Week 12 HW: Building Genomes

    Lab Homework: Bioproduction of Beta-Carotene and Lycopene Post-Lab questions Which genes when transferred into E. coli will induce the production of lycopene and beta-carotene, respectively? Lycopene production in E. coli typically requires the carotenoid biosynthesis genes crtE, crtB, and crtI from organisms such as Erwinia herbicola (now Pantoea agglomerans). Beta-carotene production additionally requires the crtY gene, which converts lycopene into beta-carotene through cyclization.

  • Week 2 HW: DNA Read, Write, and Edit

    HOMEWORK Part 1: Benchling & In-silico Gel Art Below are some screenshots from the steps followed to create a basic pattern: Step 1: The sequence is imported from the webpage to Benchling. Figures 1 and 2. Lambda DNA import process. Step 2: The digest function is shown as a test with EcoRI as the chosen restriction enzyme.

  • Week 3 HW: Lab Automation

    Assignment no. 1: Python Script for Opentrons Artwork The following code was used to create a simple swirl pattern with four different colors:

        from opentrons import types
        import math

        metadata = {
            'author': 'Jean Colmenares',
            'protocolName': 'Agar Swirl Pattern - 4 Colors',
            'description': 'Swirl pattern with four colors per branch',
            'source': 'HTGAA 2026 Opentrons Lab',
            'apiLevel': '2.20'
        }

        TIP_RACK_DECK_SLOT = 9
        COLORS_DECK_SLOT = 6
        AGAR_DECK_SLOT = 5
        PIPETTE_STARTING_TIP_WELL = 'A1'

        # Source well for each color on the cold plate
        well_colors = {
            'A1': 'Red',
            'B1': 'Green',
            'C1': 'Orange',
            'D1': 'Blue'
        }

        def run(protocol):
            tips_20ul = protocol.load_labware(
                'opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Opentrons 20uL Tips'
            )
            pipette_20ul = protocol.load_instrument(
                "p20_single_gen2", "right", [tips_20ul]
            )
            temperature_module = protocol.load_module(
                'temperature module gen2', COLORS_DECK_SLOT
            )
            temperature_plate = temperature_module.load_labware(
                'opentrons_96_aluminumblock_generic_pcr_strip_200ul', 'Cold Plate'
            )
            color_plate = temperature_plate
            agar_plate = protocol.load_labware(
                'htgaa_agar_plate', AGAR_DECK_SLOT, 'Agar Plate'
            )
            center_location = agar_plate['A1'].top()
            pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)

            # ----------------------------------------
            # Helper functions
            # ----------------------------------------
            def location_of_color(color_string):
                for well, color in well_colors.items():
                    if color.lower() == color_string.lower():
                        return color_plate[well]
                raise ValueError(f"No well found with color {color_string}")

            def dispense_and_detach(pipette, volume, location):
                above_location = location.move(types.Point(z=location.point.z + 5))
                pipette.move_to(above_location)
                pipette.dispense(volume, location)
                pipette.move_to(above_location)

            # ----------------------------------------
            # SWIRL PATTERN: BIG + FIXED COLOR PER BRANCH (P20 SAFE)
            # ----------------------------------------
            DROP_VOLUME = 3
            branches = 4
            points_per_branch = 24
            radius_start = 3
            radius_step = 1.6
            angle_step = math.pi/9
            branch_colors = ['Red', 'Green', 'Orange', 'Blue']

            for branch in range(branches):
                base_angle = branch * (2*math.pi/branches)
                color = branch_colors[branch]
                source = location_of_color(color)
                for i in range(points_per_branch):
                    pipette_20ul.pick_up_tip()
                    pipette_20ul.aspirate(DROP_VOLUME, source.bottom(1))
                    # Radius and angle grow together, so each branch spirals outward
                    angle = base_angle + i * angle_step
                    radius = radius_start + i * radius_step
                    x = radius * math.cos(angle)
                    y = radius * math.sin(angle)
                    loc = center_location.move(types.Point(x=x, y=y, z=0))
                    dispense_and_detach(pipette_20ul, DROP_VOLUME, loc)
                    pipette_20ul.drop_tip()

    The pattern is shown below:

  • Week 4 HW: Protein Design Part I

    PART A: 1. How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons) Assumptions: 500 g of meat ~31 g of protein per 100 g of meat (British Nutrition Foundation, 2021) Average amino acid mass ≈ 100 g/mol Avogadro’s number = 6.022 × 10^23 molecules/mol 1. Protein content in 500 g of meat

  • Week 5 HW: Protein Design part II

    PART A: SOD1 Binder Peptide Design The sequence for the original protein is: // sp|P00441|SODC_HUMAN Superoxide dismutase [Cu-Zn] OS=Homo sapiens OX=9606 GN=SOD1 PE=1 SV=2 MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ Mutation occurs at residue 4: Alanine becomes Valine // 1UXM_1|Chains A, B, C, D, E, F, G, H, I, J, K, L|SUPEROXIDE DISMUTASE [CU-ZN]|HOMO SAPIENS (9606) ATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVS IEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ Part 1: Generate Binders with PepMLM

  • Week 6 HW: Genetic Circuits Part I

    PART 1: Protocol questions What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose? The master mix contains: Phusion Hi-Fi DNA Polymerase: The enzyme that extends the primers to synthesize the amplicons during PCR; its proofreading activity keeps the error rate low. Deoxynucleotides: The building blocks necessary for replicating DNA fragments. Buffer including MgCl2: Maintains a stable pH for the enzyme, while Mg²⁺ is an essential cofactor for the polymerase. What are some factors that determine primer annealing temperature during PCR? The annealing temperature depends on the length of the primers and their GC content. Primers with higher GC content have higher melting temperatures. The sequence of the primer and the presence of mismatches also affect binding. In addition, salt concentration can influence primer stability.

  • Week 7 HW: Genetic Circuits Part II

    PART 1: IANNs What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions? Intracellular Artificial Neural Networks (IANNs) offer several advantages over traditional genetic circuits based on Boolean logic. While Boolean circuits operate in a binary manner (ON/OFF), IANNs can process continuous, graded inputs such as varying concentrations of metabolites or regulatory molecules. This enables more nuanced and biologically realistic responses. Additionally, IANNs integrate multiple inputs through weighted interactions, allowing for more flexible and complex decision-making compared to rigid logical gates like AND or OR.

  • Week 9 HW: Cell-Free Systems

    Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production. Main advantages (flexibility & control): Open system: Components such as DNA, cofactors, salts, inhibitors, can be directly modified. Precise control: You can tune Mg²⁺, ATP, amino acids, etc. Rapid expression: No need for cloning → transformation → growth. Toxic proteins: You can express proteins that would normally kill cells.

Subsections of Homework

Week 1 HW: Principles and Practices

Scale-up of nanocapsules for drug delivery using bacteria as ferritin manufacturers

1. Describe a biological engineering application or tool you want to develop and why.

Biologics, drugs synthesized by living organisms, have gained prominence over the years (Walsh, 2018). Cancer drugs and vaccines are some of the achievements scientists have accomplished with biotechnology. This is a fast-growing field with a widening range of applications. Currently, iron deficiency is one of the main global issues affecting overall health (Lee et al., 2025). This project aims to develop a drug delivery system using bacteria-made ferritin, given the popularity and extended use of these microorganisms over the years for drug manufacturing (Kulkarni, 2026).

As I entered my senior year at university, I wanted to work on topics related to drug development or delivery and on scaling up that process, so I did some research on this for my bachelor’s thesis. However, life took me down a different path, but here I am, trying to learn and see how far I can take this idea.

2. Describe one or more governance policy goals related to ensuring this application contributes to an ethical future & prevents harm.

  • Ensure quality in scaled production: Ensure that the biologic product complies with Good Manufacturing Practices (GMP).

  • Ensure non-maleficence in biomanufacturing: Avoid harmful use of bioengineered bacteria.

  • Foster and promote innovation and global access: Enable technology transfer to low-resource settings.

3. Describe at least three different potential governance actions by considering the purpose, design, assumptions, and risks of failures & “success”.

Below is a table with the three main governance actions:

Table 1. Governance actions

| Governance Action | Purpose | Design | Assumptions | Risks & Failures |
| --- | --- | --- | --- | --- |
| Biologics safeguards | Create biocontainment for bacteria as biological hazards | Entities such as FDA | The safeguards will be effective | Overlooking the safeguards may affect their effectiveness |
| Standardized GMPs | Elaborate guidelines for safe production | QA department staff | Fast implementation and adaptation by companies | High associated costs may create manufacturing monopolies |
| Traceability of biological product | Avoid misuse of the biologic | Molecular signatures | Traceability methods are robust | Mutations in microorganisms may render signatures ineffective |

4. Score each of your governance actions against your rubric of policy goals

Table 2. Scoring

| Does the option: | Biologics safeguards | Standardized GMPs | Traceability of biological product |
| --- | --- | --- | --- |
| Enhance Biosecurity | | | |
| • By preventing incidents | 1 | 2 | 2 |
| • By helping respond | 2 | 1 | 1 |
| Foster Lab Safety | | | |
| • By preventing incidents | 1 | 1 | 2 |
| • By helping respond | 2 | 2 | 2 |
| Protect the environment | | | |
| • By preventing incidents | 1 | 2 | 2 |
| • By helping respond | 2 | 2 | 2 |
| Other considerations | | | |
| • Minimizing costs and burdens | 3 | 2 | 2 |
| • Feasibility | 2 | 2 | 2 |
| • Not impede research | 2 | 3 | 2 |
| • Promote constructive applications | 2 | 1 | 2 |
| TOTAL SCORE | 18 | 18 | 19 |
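
As a quick check of the totals in Table 2, the column sums can be recomputed with a few lines of Python. This is only a sketch: it assumes every score is a single digit read row by row from the table, and the dictionary layout is my own choice for illustration.

```python
# Scores transcribed from Table 2, one list per governance action, in row order.
# Assumption: each cell of the table holds a single-digit score.
scores = {
    "Biologics safeguards": [1, 2, 1, 2, 1, 2, 3, 2, 2, 2],
    "Standardized GMPs": [2, 1, 1, 2, 2, 2, 2, 2, 3, 1],
    "Traceability of biological product": [2, 1, 2, 2, 2, 2, 2, 2, 2, 2],
}

# Sum each column to get the total score per governance action.
totals = {option: sum(values) for option, values in scores.items()}

for option, total in totals.items():
    print(f"{option}: {total}")
```

Keeping the scores in one structure like this makes it easy to re-score a single criterion and see how the totals shift.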

5. Based on scores, describe which governance option or combination of options you would prioritize, and why.

After reviewing the different options and their scores, the most reasonable combination of options to prioritize would be Standardized GMPs and Traceability of Biological Products. The former is selected due to its strong impact on both productivity and product quality, as well as its capacity to establish clear guidelines that ensure biological safety for both production staff and consumers. The latter is essential because traceability enables the identification of errors and deviations throughout the production process, allowing them to be corrected in a timely manner and ensuring the ethical and responsible use of this technology.


References

Walsh, G. (2018). Biopharmaceutical benchmarks 2018. Nature Biotechnology, 36, 1136–1145. https://www.nature.com/articles/nbt.4305

Lee, G. R., et al. (2025). Global burden of iron deficiency and its impact on health. Nature Medicine. https://www.nature.com/articles/s41591-025-03624-8

Kulkarni, S. (2026). Engineered microbes as API manufacturers in pharma. Pharma Now. https://www.pharmanow.live/knowledge-hub/research/engineered-microbes-api-manufacturing-pharma

Week 10 HW: Advanced Imaging & Measurement Technology

Please identify at least one (ideally many) aspect(s) of your project that you will measure. It could be the mass or sequence of a protein, the presence, absence, or quantity of a biomarker, etc.

The main aspect to be measured is the expression and activity of the biosynthetic gene cluster (BGC). This includes:

  • Presence and expression of BGC-associated enzymes
  • Production of candidate metabolites
  • Antibacterial activity against Leptospira

Please describe all of the elements you would like to measure, and furthermore describe how you will perform these measurements.

  • BGC enzyme expression: Measured using Western blot to confirm protein presence and approximate expression levels.
  • Metabolite production: Measured using LC-MS to detect and quantify candidate compounds produced by the BGC.
  • Antibacterial activity: Evaluated through antibiograms to assess inhibition of Leptospira growth.

What are the technologies you will use (e.g., gel electrophoresis, DNA sequencing, mass spectrometry, etc.)? Describe in detail.

  • Western blot: To detect specific proteins encoded by the BGC after separation by gel electrophoresis.
  • Gel electrophoresis: For protein separation prior to blotting.
  • LC-MS (Liquid Chromatography–Mass Spectrometry): Main analytical technique to identify and quantify metabolites based on retention time and mass-to-charge ratio.
  • Antibiogram assays: To determine the antibacterial effectiveness of produced compounds.

PART 1

Based on the predicted amino acid sequence of eGFP (see below) and any known modifications, what is the calculated molecular weight? You can use an online calculator like the one at https://web.expasy.org/compute_pi/

MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH

Based on the website provided, the molecular weight is 26.94 kDa, which does not include the linker and His tag. Including them, the MW is approximately 28 kDa. The former value is consistent with other GFPs in databases, such as Q9U6Y4 (26.17 kDa), P42212 (26.89 kDa) and Q9GZ28 (25.91 kDa).
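
The same kind of estimate can be sketched offline by summing average residue masses and adding one water for the free termini. This is a minimal sketch, not the Expasy implementation: the residue masses are standard average values typed in by hand, the names `average_mass` and `EGFP_HIS` are made up for this example, and no modifications (such as chromophore formation) are taken into account.

```python
# Average residue masses in Da (amino acid minus one water); standard textbook values.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the N- and C-terminus

# eGFP sequence from the question, including the C-terminal LEHHHHHH tag.
EGFP_HIS = (
    "MVSKGEELFTG" "VVPILVELDG" "DVNGHKFSVS" "GEGEGDATYG" "KLTLKFICTT"
    "GKLPVPWPTL" "VTTLTYGVQC" "FSRYPDHMKQ" "HDFFKSAMPE" "GYVQERTIFF"
    "KDDGNYKTRA" "EVKFEGDTLV" "NRIELKGIDF" "KEDGNILGHK" "LEYNYNSHNV"
    "YIMADKQKNG" "IKVNFKIRHN" "IEDGSVQLAD" "HYQQNTPIGD" "GPVLLPDNHY"
    "LSTQSALSKD" "PNEKRDHMVL" "LEFVTAAGIT" "LGMDELYKLE" "HHHHHH"
)
TAG = "LEHHHHHH"


def average_mass(sequence: str) -> float:
    """Average molecular mass (Da) of an unmodified linear peptide chain."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER


full_mass = average_mass(EGFP_HIS)
untagged_mass = average_mass(EGFP_HIS[: -len(TAG)])

print(f"With tag:    {full_mass / 1000:.2f} kDa")
print(f"Without tag: {untagged_mass / 1000:.2f} kDa")
```

The difference between the two printed values is the contribution of the linker and His tag, which is the gap between the two molecular weights discussed above.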

Using the mass spectrum of intact eGFP, the charge state $z$ of a peak can be estimated from two adjacent charge-state peaks with the expression:

$ z = \frac{(m/z)_{n+1}}{(m/z)_n - (m/z)_{n+1}} $

Then, considering the adjacent peaks at 800.5508 and 824.0635, $z$ is equal to 34.047.

Then, the molecular weight is given by:

$ MW = z \cdot \left((m/z)_n - 1\right) $

$ MW = 34.047 \cdot (824.0635 - 1) = 28{,}023.329 \ \text{Da} = 28.02 \ \text{kDa} $

With these values, the accuracy will be:

$ \frac{28.02 - 28}{28} = 7.14 \times 10^{-4} = 714 \ \text{ppm} $
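
The adjacent-charge-state calculation can also be written as a short script. This is a sketch under two assumptions of mine that go slightly beyond the expression above: the proton mass (about 1.007 Da) is subtracted in the numerator, and the charge is rounded to the nearest whole number before the mass is computed, since a charge state is an integer. Because of that rounding, the mass it reports differs slightly from the value obtained above with the unrounded $z$. The function names are hypothetical.

```python
PROTON = 1.00728  # mass of a proton in Da


def charge_from_adjacent_peaks(mz_lower: float, mz_higher: float) -> float:
    """Estimate the charge of the peak at mz_higher.

    mz_lower is the neighbouring peak at lower m/z, which carries one more charge.
    """
    return (mz_lower - PROTON) / (mz_higher - mz_lower)


def neutral_mass(mz: float, z: int) -> float:
    """Neutral mass (Da) of an ion observed at m/z with z protons attached."""
    return z * (mz - PROTON)


# Peaks quoted in the answer above.
z_estimate = charge_from_adjacent_peaks(800.5508, 824.0635)
z = round(z_estimate)  # charge states are whole numbers
mass = neutral_mass(824.0635, z)

print(f"estimated z = {z_estimate:.3f}, rounded to {z}")
print(f"neutral mass = {mass:.1f} Da")
```

A useful sanity check is to compute the mass from the neighbouring peak as well, using `z + 1`; both peaks should give nearly the same neutral mass if the charge assignment is right.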

Can you observe the charge state for the zoomed-in peak in the mass spectrum for the intact eGFP? If yes, what is it? If no, why not?

The peak has an approximate value of 1473.74, and the neighboring peaks are 1473.88 and 1473.79. Since the spacing between adjacent isotope peaks is equal to $1/z$, reading an isotope spacing of about 0.06 gives:

$ \Delta(m/z) \approx 0.06; \quad \frac{1}{z} = 0.06; \quad z \approx 17 $

PART 2

Based on learnings in the lab, please explain the difference between native and denatured protein conformations. For example, what happens when a protein unfolds? How is that determined with a mass spectrometer? What changes do you see in the mass spectrum between the native and denatured protein analyses (Figure 2)?

When a protein unfolds, residues that were buried in the folded structure become exposed, so more sites are available to pick up charge during ionization. This increases the number of charges and, therefore, the number of charge-state peaks, which is why the denatured protein shows a broader spectrum, shifted toward lower m/z, than the native protein.

Zooming into the native mass spectrum of eGFP from the Waters Xevo G3 QTof MS (see Figure 3), can you discern the charge state of the peak at ~2800? What is the charge state? How can you tell?

Yes. Since the spacing between adjacent isotope peaks is around 0.1 and this spacing equals $1/z$, the charge state is approximately +10.

PART 3

How many Lysines (K) and Arginines (R) are in eGFP? Please circle or highlight them in the eGFP sequence given in Waters Part I question 1 above. (Note: adding the sequence to Benchling as an amino acid file and clicking biochemical properties tab will show you a count for each amino acid).

MVS K GEELFTG VVPILVELDG DVNGH K FSVS GEGEGDATYG K LTL K FICTT G K LPVPWPTL VTTLTYGVQC FS R YPDHM K Q HDFF K SAMPE GYVQE R TIFF K DDGNY K T R A EV K FEGDTLV N R IEL K GIDF K EDGNILGH K LEYNYNSHNV YIMAD K Q K NG I K VNF K I R HN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALS K D PNE K R DHMVL LEFVTAAGIT LGMDELY K LE HHHHHH

Number of Lysines: 20

Number of Arginines: 6

How many peptides will be generated from tryptic digestion of eGFP?

Based on the website provided, there will be 19 fragments.

Figure 1. 19 fragments cut using the Expasy PeptideMass website.
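
The residue counts and the digestion can be reproduced with a small in-silico digest. This is a sketch, not the PeptideMass code: it assumes the usual trypsin rule (cut after K or R unless the next residue is P), no missed cleavages, and monoisotopic masses typed in by hand. The 500 Da lower cutoff on the [M+H]+ mass is my assumption about why the website lists fewer peptides than a raw digest produces; that setting is not stated above.

```python
# Monoisotopic residue masses in Da; standard values.
MONO_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728

EGFP_HIS = (
    "MVSKGEELFTG" "VVPILVELDG" "DVNGHKFSVS" "GEGEGDATYG" "KLTLKFICTT"
    "GKLPVPWPTL" "VTTLTYGVQC" "FSRYPDHMKQ" "HDFFKSAMPE" "GYVQERTIFF"
    "KDDGNYKTRA" "EVKFEGDTLV" "NRIELKGIDF" "KEDGNILGHK" "LEYNYNSHNV"
    "YIMADKQKNG" "IKVNFKIRHN" "IEDGSVQLAD" "HYQQNTPIGD" "GPVLLPDNHY"
    "LSTQSALSKD" "PNEKRDHMVL" "LEFVTAAGIT" "LGMDELYKLE" "HHHHHH"
)


def tryptic_digest(sequence: str) -> list[str]:
    """Cut after every K or R unless the following residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def mh_plus(peptide: str) -> float:
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(MONO_RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


peptides = tryptic_digest(EGFP_HIS)
# Assumed cutoff: keep only peptides with [M+H]+ above 500 Da.
visible = [p for p in peptides if mh_plus(p) > 500.0]

print("K:", EGFP_HIS.count("K"), "R:", EGFP_HIS.count("R"))
print("all peptides:", len(peptides), "above 500 Da:", len(visible))
```

Printing `peptides` next to `visible` shows which short fragments (for example single residues or dipeptides) drop out once the mass cutoff is applied.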

Based on the LC-MS data for the Peptide Map data generated in lab (please use Figure 5a as a reference) how many chromatographic peaks do you see in the eGFP peptide map between 0.5 and 6 minutes? You may count all peaks that are >10% relative abundance.

Figure 2. Chromatographic peaks.

Counting only the peaks above 10% of the highest peak (the one at 4.87), there are around 19 peaks.

Assuming all the peaks are peptides, does the number of peaks match the number of peptides predicted from question 2 above? Are there more peaks in the chromatogram or fewer?

Yes, there are around 19 peaks, which matches the number of fragments predicted in the previous question.

Identify the mass-to-charge of the peptide shown in Figure 5b. What is the charge of the most abundant charge state of the peptide? (Use the separation of the isotopes to determine the charge state). Calculate the mass of the singly charged form of the peptide based on its mass-to-charge ratio and charge state.

The observed m/z is 525.76.

The isotope spacing is:

$\Delta (m/z) = 0.499 $

Therefore, the charge state is:

$z \approx 2 $

The neutral mass is:

$M = (m/z)\cdot z - z \cdot 1.007 = 1049.51 \ \text{Da} $

6. Identify the peptide based on comparison to expected masses in the PeptideMass tool. What is mass accuracy of measurement? Please calculate the error in ppm

Based on Figure 1, the closest match is the peptide FEGDTLVNR, with a mass of 1050.42 Da.

The accuracy is therefore:

$ \frac{1050.42 - 1049.51}{1050.42} = 8.66 \times 10^{-4} = 866.32 \ \text{ppm} $

The calculated mass error is approximately 866 ppm, which is significantly higher than expected. This suggests that the peptide assignment (FEGDTLVNR) may be incorrect or that there is a mismatch between the experimental and theoretical mass definitions. The difference of about 0.91 Da is close to the mass of one proton, which would be consistent with comparing a neutral mass against a singly protonated ([M+H]+) value.
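
The steps from isotope spacing to ppm error can be sketched in Python so that both mass definitions are compared side by side. This is only an illustration: the function names are hypothetical, and whether the PeptideMass value of 1050.42 is a neutral mass or an [M+H]+ value depends on the tool settings, which this sketch does not know. It simply reports the error under each reading.

```python
PROTON = 1.00728  # mass of a proton in Da


def charge_from_isotope_spacing(delta_mz: float) -> int:
    """Isotope peaks are about 1 Da apart in mass, so they sit 1/z apart in m/z."""
    return round(1.0 / delta_mz)


def neutral_mass(mz: float, z: int) -> float:
    """Mass of the uncharged peptide, M."""
    return z * (mz - PROTON)


def singly_protonated_mass(mz: float, z: int) -> float:
    """Mass of the singly charged form, [M+H]+."""
    return neutral_mass(mz, z) + PROTON


def ppm_error(observed: float, theoretical: float) -> float:
    """Absolute mass error in parts per million."""
    return abs(observed - theoretical) / theoretical * 1e6


# Values quoted in the answers above.
z = charge_from_isotope_spacing(0.499)
m = neutral_mass(525.76, z)
mh = singly_protonated_mass(525.76, z)
theoretical = 1050.42

print(f"z = {z}, M = {m:.2f} Da, [M+H]+ = {mh:.2f} Da")
print(f"error if 1050.42 is a neutral mass: {ppm_error(m, theoretical):.0f} ppm")
print(f"error if 1050.42 is an [M+H]+ mass: {ppm_error(mh, theoretical):.0f} ppm")
```

Comparing like with like matters here: the two readings differ by one proton mass, which is enough to move the error by several hundred ppm for a peptide of this size.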

What is the percentage of the sequence that is confirmed by peptide mapping?

Based on the website, the coverage is around 90.7%.

PART 4

The Subunits are given in the following table:

Table 1. Oligomers and masses.

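| Oligomeric Species | # Subunits | Subunit Mass (kDa) | Total Mass (kDa) | Total Mass (MDa) |
| --- | --- | --- | --- | --- |
| 7FU Decamer | 10 | 340 | 3400 | 3.4 |
| 8FU Didecamer | 20 | 400 | 8000 | 8.0 |
| 8FU 3-Decamer | 30 | 400 | 12000 | 12.0 |
| 8FU 4-Decamer | 40 | 400 | 16000 | 16.0 |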

Therefore, the following figure identifies the associated peaks:

Figure 3. Subunits locations.

PART 5

Please fill out this table with the data you acquired from the lab work done at the Waters Immerse Lab in Cambridge, or else the data screenshots in this document if you were unable to have lab work done at Waters.

Table 2. PPM mass error

| | Theoretical (kDa) | Observed / Measured (kDa) | PPM Mass Error |
| --- | --- | --- | --- |
| Molecular weight (kDa) | 28.00 | 28.02 | 714 |

Week 11 HW: Bioproduction and Cloud Labs

1. Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork

I was not able to complete the artwork since I never received the email :c.

2. Part B: Cell-Free Protein Synthesis | Cell-Free Reagents

Referencing the cell-free protein synthesis reaction composition (the middle box outlined in yellow on the image above, also listed below), provide a 1-2 sentence description of what each component’s role is in the cell-free reaction.

E. coli Lysate

BL21 (DE3) Star Lysate (includes T7 RNA Polymerase)

This lysate provides the cellular machinery required for transcription and translation, including ribosomes, enzymes, tRNAs, and metabolic components. The T7 RNA polymerase specifically drives the transcription of genes under a T7 promoter, enabling protein synthesis in the cell-free system.

Salts/Buffer

Potassium Glutamate: Helps maintain ionic strength and mimics the intracellular environment of E. coli. It supports enzyme activity and improves protein synthesis efficiency.

HEPES-KOH pH 7.5: Acts as a buffering agent to maintain a stable pH during the reaction. Proper pH is essential for optimal enzyme and ribosome activity.

Magnesium Glutamate: Supplies Mg²⁺ ions, which stabilize ribosomes and nucleic acids and are essential cofactors for the enzymes involved in transcription and translation.

Potassium phosphate monobasic: Contributes to phosphate buffering capacity and helps maintain the chemical balance of the reaction mixture.

Potassium phosphate dibasic: Works together with the monobasic form to stabilize pH and provide phosphate ions necessary for cellular reactions.

Energy / Nucleotide System

Ribose: Serves as a carbon source and can contribute to energy regeneration pathways during the reaction.

Glucose: Provides metabolic energy that supports ATP regeneration and prolongs protein synthesis activity in the cell-free system.

AMP: A nucleotide precursor involved in ATP regeneration and nucleic acid metabolism during transcription.

CMP: Provides cytidine nucleotides required for RNA synthesis during transcription.

GMP: Supplies guanine nucleotides necessary for mRNA production.

UMP: Provides uridine nucleotides used during RNA synthesis.

Guanine: Supports nucleotide biosynthesis and recycling pathways that help sustain transcription.

17 Amino Acid Mix

This mixture provides most of the amino acids required for protein synthesis. Ribosomes use these amino acids to assemble the target protein.

Tyrosine: It is added separately to ensure sufficient availability during translation, since it is poorly soluble at neutral pH and is therefore usually handled as its own stock.

Cysteine: It is supplied separately because it is chemically reactive and prone to oxidation, which can reduce its stability in the mixture.

Additives

Nicotinamide: Nicotinamide functions as a precursor for NAD-related cofactors that support metabolic and enzymatic reactions in the cell-free system.

Backfill

Nuclease Free Water: Nuclease free water is used to adjust the final reaction volume while preventing degradation of nucleic acids by nucleases.

Describe the main differences between the 1-hour optimized PEP-NTP master mix and the 20-hour NMP-Ribose-Glucose master mix shown in the Google Slide above. (2-3 sentences)

The 1-hour optimized PEP-NTP master mix uses phosphoenolpyruvate (PEP) and directly supplied NTPs as an immediate energy and transcription source, allowing rapid protein production in a short incubation time. In contrast, the 20-hour NMP-Ribose-Glucose master mix relies on ribose, glucose, and nucleotide monophosphates (NMPs), which are metabolically converted into usable energy and nucleotides over time through endogenous cellular enzymes.

Additionally, the 20-hour system is designed to be more sustainable and cost-effective for long incubations, while the 1-hour system prioritizes fast and high-yield protein expression.

3. Part C: Planning the Global Experiment | Cell-Free Master Mix Design

Given the 6 fluorescent proteins we used for our collaborative painting, identify and explain at least one biophysical or functional property of each protein that affects expression or readout in cell-free systems. (Hint: options include maturation time, acid sensitivity, folding, oxygen dependence, etc) (1-2 sentences each)

Table 1: Protein properties

| Fluorescent Protein | Property Affecting Cell-Free Systems | Explanation |
| --- | --- | --- |
| sfGFP | Enhanced folding efficiency | sfGFP is engineered to fold efficiently even under stressful or suboptimal conditions, making it highly reliable in cell-free expression systems. Its fast maturation also allows rapid fluorescence detection. |
| mRFP1 | Slow maturation time | mRFP1 matures more slowly than many GFP variants, which delays fluorescence readout in cell-free reactions. Proper chromophore formation also requires efficient oxidation and folding. |
| mKO2 | Acid sensitivity | mKO2 fluorescence can decrease under acidic conditions, so pH changes in cell-free reactions may affect signal intensity. It also has relatively fast maturation compared to other orange fluorescent proteins. |
| mTurquoise2 | High brightness and folding efficiency | mTurquoise2 has a very high quantum yield, producing strong fluorescence signals even at lower expression levels. Its optimized folding improves performance in cell-free systems. |
| mScarlet-I | Rapid maturation and photostability | mScarlet-I matures quickly and produces bright fluorescence, making it useful for fast cell-free assays. Its high photostability also maintains signal quality during measurements. |
| Electra2 | Oxygen-dependent chromophore formation | Electra2 requires oxygen for proper chromophore maturation, so low oxygen availability in cell-free systems can reduce fluorescence output. Folding kinetics may also influence signal development time. |

Create a hypothesis for how adjusting one or more reagents in the cell-free mastermix could improve a specific biophysical or functional property you identified above, in order to maximize fluorescence over a 36-hour incubation. Clearly state the protein, the reagent(s), and the expected effect.

Increasing the concentration of ribose and glucose in the 20-hour NMP-Ribose-Glucose cell-free mastermix could improve the fluorescence output of mScarlet-I during a 36-hour incubation.

Because mScarlet-I depends on efficient folding and chromophore maturation over longer expression periods, providing additional long-term energy substrates may sustain ATP regeneration and protein synthesis, resulting in brighter and more stable fluorescence over time. Another possible modification would be increasing magnesium glutamate concentration slightly to improve ribosome activity and translation efficiency, which could further enhance mScarlet-I expression.

Bonus question: How can transcription occur if GMP is not included but Guanine is?

Transcription can still occur because the system may use guanine as a precursor to synthesize GMP and eventually GTP through endogenous metabolic enzymes present in the cell extract. In the NMP-Ribose-Glucose system, the cell-free lysate retains parts of cellular nucleotide salvage pathways, allowing guanine to be converted into usable nucleotide forms needed for RNA synthesis over long incubations.

Week 12 HW: Building Genomes

Lab Homework: Bioproduction of Beta-Carotene and Lycopene

Post-Lab questions

Which genes when transferred into E. coli will induce the production of lycopene and beta-carotene, respectively?

Lycopene production in E. coli typically requires the carotenoid biosynthesis genes crtE, crtB, and crtI from organisms such as Erwinia herbicola (now Pantoea agglomerans). Beta-carotene production additionally requires the crtY gene, which converts lycopene into beta-carotene through cyclization.

Why do the plasmids that are transferred into the E. coli need to contain an antibiotic resistance gene?

The antibiotic resistance gene acts as a selectable marker so only bacteria containing the plasmid survive when grown on antibiotic-containing media.

What outcomes might we expect to see when we vary the media, presence of fructose, and temperature conditions of the overnight cultures?

Different media compositions can affect bacterial growth rate and nutrient availability, influencing pigment production levels. For instance, the presence of fructose may increase carbon availability and metabolic flux toward carotenoid synthesis, while temperature changes can alter enzyme activity, protein folding, and plasmid stability, leading to differences in pigment accumulation.

Generally describe what “OD600” measures and how it can be interpreted in this experiment.

OD600 measures optical density at 600 nm and is used as an estimate of bacterial cell density in liquid culture. In this experiment, higher OD600 values generally indicate greater bacterial growth, which can correlate with increased carotenoid production if the engineered pathway is functioning efficiently.

What are other experimental setups where we may be able to use acetone to separate cellular matter from a compound we intend to measure?

Acetone is commonly used to extract hydrophobic pigments or metabolites such as chlorophyll, carotenoids, and lipids from cells. It can also precipitate proteins while leaving smaller metabolites in solution for downstream analysis.

Why might we want to engineer E. coli to produce lycopene and beta-carotene pigments when Erwinia herbicola naturally produces them?

E. coli is easier to genetically manipulate, grows rapidly, and is widely used for industrial-scale recombinant protein and metabolite production. Studies have shown that engineered E. coli strains can achieve enhanced growth and lycopene production, especially when optimized carbon sources such as fructose are used.

Metabolic Pathway

What are the enzymes of the carotene pathway?

The main enzymes are:

  • CrtE: geranylgeranyl pyrophosphate synthase
  • CrtB: phytoene synthase
  • CrtI: phytoene desaturase
  • CrtY: lycopene cyclase

Within this pathway, which is the rate determining step (the step that takes the longest)? Which enzyme is responsible for this step?

The phytoene desaturation step catalyzed by CrtI is often considered rate-limiting because it involves multiple sequential desaturation reactions required to form lycopene.

Notes for design of a DNA construct for bioproduction

The first thing to do is to decide which organism you are going to use for production (E. coli or S. cerevisiae). Which would you choose and why (emphasis on production differences)?

Now choose one of the enzymes, and let’s outline the parts of the construct for expression.

If you choose E. coli, let’s create an expression vector that works as a plasmid. Now, to make a functional construct, a variety of biological parts are needed, such as ribosome binding sites, terminators, operators, and promoters. The last ones are the most important in terms of enzyme or protein production. Let’s elaborate further on this biopart.

Promoter

What is the function of a promoter?

A promoter is a DNA sequence where RNA polymerase binds to initiate transcription of a gene.

What types of promoters do we have?

There is a variety of promoters, including:

  • Constitutive promoters
  • Inducible promoters
  • Repressible promoters
  • Tissue-specific promoters
  • Synthetic promoters

If we wanted to turn off the transcription of a gene in response to a metabolite, what type of promoter would be most useful? What if we wanted this to increase in the presence of the metabolite?

For turning off the transcription in response to a metabolite, a repressible promoter is useful. On the other hand, an inducible promoter would be useful if transcription should increase in the presence of the metabolite.

Now choose one of the genes of the metabolic pathway previously described (carotene/lycopene) and choose one enzyme to make an expression construct. What promoter could you use for this? Why did you choose it?

The lac promoter or T7 promoter could be used for crtI expression in E. coli. These promoters allow strong inducible expression, giving researchers control over carotenoid production timing and reducing metabolic burden before induction.

Origin of replication of plasmid

What is the origin of replication?

The origin of replication (ori) is the DNA sequence where plasmid replication begins inside the host cell.

What types of origin of replication do we have?

Origins can be:

  • High-copy origins
  • Low-copy origins
  • Broad-host-range origins
  • Narrow-host-range origins

(Extra) What are compatibility groups?

Compatibility groups describe whether two plasmids can coexist in the same cell without interfering with each other’s replication systems.

Now, for the previously chosen promoter and gene, what will be the best origin of replication?

A high-copy origin such as ColE1/pUC would be useful for strong crtI expression and increased carotenoid production in E. coli.

Elaborate further on other bioparts like RBS, terminators, operators you would use for a correct design and further bioproduction?

  • Strong RBS for efficient translation
  • Terminators to stop transcription properly
  • Operators for transcriptional regulation
  • Selection markers for plasmid maintenance

These components improve stability, expression efficiency, and control of metabolic production.

(Hot! Extra points) What are aptamers and riboswitches and how can they be used for metabolic tuning or engineering in prokaryotes?

Aptamers are nucleic acid structures that bind specific molecules, while riboswitches are RNA regulatory elements that alter gene expression in response to metabolite binding. They can be used for metabolic tuning by dynamically regulating pathway genes depending on intracellular metabolite concentrations.

(Extra points) Now what approach can be used to join all these parts together? Make a quick analysis of their sequence in search of possibilities (search for restriction sites, etc)

Methods include:

  • Gibson Assembly
  • Golden Gate Assembly
  • Restriction enzyme cloning
  • Yeast homologous recombination

Restriction site analysis helps identify compatible cloning strategies and avoid unwanted cutting inside coding sequences.
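
As a minimal sketch of such a restriction site analysis, the snippet below scans a sequence for the recognition sites of a few common enzymes. The toy sequence `demo` and the choice of enzymes are illustrative assumptions, not the actual construct; a tool such as Benchling would normally run this check on the real sequence.

```python
# Recognition sites of a few common restriction enzymes.
# All four are palindromic, so scanning one strand is enough.
RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
}


def find_sites(sequence, sites=RESTRICTION_SITES):
    """Return {enzyme: [0-based start positions]} for every site that occurs."""
    seq = sequence.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


# Hypothetical toy sequence used only for illustration.
demo = "ATGGAATTCAAAGGATCCTTTGAATTCTAA"
print(find_sites(demo))
```

An enzyme that cuts inside the coding sequence would be excluded from the cloning strategy, while enzymes with no hits remain candidates for flanking sites.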

(Extra Hot!!! Extra Points) Try to elaborate further on a biosynthetic pathway you would want to engineer in E. coli for production of a metabolite or product. What use could this bio-product have? Imagine dream applications!!!

I would engineer E. coli to produce biodegradable bioplastics such as polyhydroxyalkanoates (PHAs). These materials could replace petroleum-based plastics in packaging and medical applications, reducing environmental pollution and dependence on fossil fuels.

(Extra points) For S. cerevisiae create an integration cassette for homologous recombination.

As for prokaryotes, eukaryotic DNA designs need bioparts to build a functional design and achieve expression. Now search for a biosynthetic pathway of interest and describe one of the genes of the pathway.

One interesting eukaryotic biosynthetic pathway is the resveratrol biosynthesis pathway engineered in Saccharomyces cerevisiae (Meng et al., 2023). One important gene in this pathway is 4CL (4-coumarate:CoA ligase), which converts p-coumaric acid into p-coumaroyl-CoA, an essential precursor for resveratrol production. This step is important because it activates the substrate that will later be converted into resveratrol by stilbene synthase.

Now, remember that for making a functional construct there are a variety of biological parts needed for this, like ribosome binding sites or Kozak sequences, terminators, and promoters. List the ones you could use for DNA design.

Possible parts include:

  • GAL1 promoter
  • TEF1 promoter
  • Kozak sequence
  • CYC1 terminator
  • Selectable markers such as URA3 or LEU2

In yeast engineering we use DNA construction designs for making genome integration. What chromosome site could you use for integration of these and why?

Safe harbor loci such as the HO locus are commonly used because integration there minimizes disruption of essential genes while maintaining stable expression.

(Hot! Extra points) Following the next chart of how a DNA integration cassette should be designed and with the previously chosen parts elaborate the DNA sequence you could use to synthesize with Twist.

Week 2 HW: DNA Read, Write, and Edit

HOMEWORK

Part 1: Benchling & In-silico Gel Art

Below are some screenshots from the steps followed to create a basic pattern:

Step 1: The sequence is imported from the webpage to Benchling.

Figures 1 and 2. Lambda DNA import process.

Step 2: The digest function is shown as a test with EcoRI as the chosen restriction enzyme.

Figures 3 and 4. DNA Gel using EcoRI.

Step 3: The process is repeated using enzymes requested in the homework, and the result is the following:

Figures 5 and 6. DNA pattern using homework enzymes.

Step 4: The process is repeated using different enzymes requested in the homework to create a different pattern. In this case, the pattern looks like a series of barcodes, produced with the enzymes shown in Figure 7. Since they cut at different sites, they create many short DNA fragments that scatter across the gel, giving the impression of a barcode. This behavior could perhaps be used as a biomarker, although the idea needs further development:

Figures 7 and 8. DNA Gel pattern using different enzymes.

Part 3: DNA Design Challenge

3.1. Choose your protein

In this case, I chose Translation Initiation Factor 3 (TIF-3), encoded by the gene infC, a relatively short protein involved in the translation process (Gutu et al., 2013; Arenz & Wilson, 2016). Modifying the structure of this protein may help combat antibiotic resistance.

According to UniProt, the amino acid sequence is the following:

//

tr|A0A8S0FV27|A0A8S0FV27_ECOLX Translation initiation factor IF-3 OS=Escherichia coli OX=562 GN=infC PE=3 SV=1
MSLREALEKAEEAGVDLVEISPNAEPPVCRIMDYGKFLYEKSKSSKEQKKKQKVIQVKEI
KFRPGTDEGDYQVKLRSLIRFLEEGDKAKITLRFRGREMAHQQIGMEVLNRVKDDLLRRT
GSGRILPNEDRRPPDDHGARS

3.2 and 3.3: Reverse translation and codon optimization

Going back to Benchling, the amino acid sequence was imported and the back-translate function was used to obtain the DNA sequence, which is already codon-optimized, as shown in the figure below:

Figure 9. Codon optimization.

// DNA optimized sequence: ATGAGTTTACGTGAAGCACTGGAAAAAGCGGAAGAAGCCGGTGTTGATCTGGTCGAAATCAGTCCTAATGCAGAACCCCCGGTGTGCCGTATCATGGACTATGGCAAATTCCTCTACGAGAAATCTAAAAGCTCAAAGGAACAAAAAAAGAAACAGAAGGTTATTCAGGTCAAAGAGATTAAGTTTCGACCGGGGACTGACGAAGGAGACTATCAAGTGAAACTTCGCTCCTTGATTCGCTTCCTGGAAGAGGGGGATAAAGCGAAAATTACCCTGCGCTTTCGCGGCAGAGAGATGGCCCACCAGCAGATCGGCATGGAAGTATTGAACCGTGTGAAAGATGACTTACTGCGTCGCACGGGTAGCGGTCGTATACTGCCAAACGAGGATCGCCGGCCGCCGGATGATCATGGCGCTCGGTCG

The organism selected for this protein is E. coli due to its wide use in biotechnology. Codons have to be optimized because codon usage and the translation machinery differ from one bacterium to another. This means that a bacterium other than E. coli might express this protein at a different rate and level. In this case, the protein comes from E. coli according to UniProt, but since E. coli comprises different strains, codon optimization would still have to be performed.
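
One simple sanity check on a back-translated sequence is to translate it again and confirm that the protein is unchanged. The sketch below does this with the standard genetic code for the first 30 nucleotides of the optimized sequence above. The helper name `translate` is an assumption for illustration and is not a Benchling function.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a coding DNA sequence (length divisible by 3) into protein."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First 10 codons of the optimized sequence and first 10 residues from UniProt.
dna_start = "ATGAGTTTACGTGAAGCACTGGAAAAAGCG"
protein_start = "MSLREALEKA"
print(translate(dna_start) == protein_start)
```

The same call can be applied to the full optimized sequence and compared against the complete UniProt entry.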

3.4 and 3.5: Production technologies and alignment

I would use host cells, since the chosen protein is bacterial and E. coli is a common microorganism used for these purposes. This technique has a much lower cost than using cell-free systems, in which all the cellular components have to be supplied.

The alignment is shown in the following figures

Figures 10, 11 and 12. Central Dogma for TIF-3.

Part 4: Prepare a Twist DNA Sequence Order

Step 1: DNA Sequence

The same DNA linear sequence was already obtained in Part no. 3.

Figure 13. DNA Sequence.

Step 2: Building the chassis

The parts were initially searched for on the iGEM website, but the site was down. Due to this, the parts provided in the homework were used.

Figures 14 and 15. iGEM issue.

The chassis now looks like this:

Figure 16. Chassis.

Step 4: Ordering in Twist

The process is shown in the figures below:

Figure 17. Importing the sequence to Twist from Benchling.

Step 5: Creating the plasmid

The process is shown in the figures below:

Figures 18 and 19. Creating the plasmid.

Finally, the plasmid is shown below:

Figure 20. Final plasmid.

Part 5: DNA Read/Edit/Write

5.1.1: What DNA would you want to sequence (e.g., read) and why?

I would like to analyze DNA from insects such as flies, since many species act as vectors for infectious diseases. By sequencing their DNA, I could identify genetic elements associated with viral transmission, pathogen resistance, or susceptibility. This information could help improve disease monitoring and vector control strategies.

5.1.2: In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?

I would use second-generation sequencing technologies such as Illumina sequencing due to their high throughput, accuracy, and cost-effectiveness. Illumina sequencing is particularly efficient for short DNA fragments and allows the parallel sequencing of millions of reads, making it ideal for large-scale genomic analysis of insect populations.

5.2.1: What DNA would you want to synthesize (e.g., write) and why?

I would like to synthesize bacterial DNA initially because bacterial genomes are less complex than eukaryotic genomes, which makes them more manageable in terms of cost and laboratory procedures. This would allow me to gain experience with gene design and expression systems before working with more complex organisms.

5.2.2: What technology or technologies would you use to perform this DNA synthesis and why?

I would use common routes such as solid-phase phosphoramidite chemical DNA synthesis combined with gene assembly techniques. These methods allow precise synthesis of short oligonucleotides, which can then be assembled into longer DNA constructs. This approach is widely used, reliable, and scalable for constructing bacterial genes or plasmids.

5.3.1: What DNA would you want to edit and why?

I would edit DNA from mammalian cells, focusing on genes involved in the immune response. By modifying specific regulatory or coding sequences, it may be possible to enhance resistance to infectious diseases or better understand the mechanisms underlying autoimmune disorders. However, such research would need to be conducted carefully and ethically due to the potential implications of editing mammalian genomes.

PRE-LECTURE NOTES

Homework questions from Prof. Jacobson

Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?

The error rate of polymerase is approximately 1 in a million nucleotides. Considering the human genome length of approximately 3.2 billion base pairs, or 6.4 billion base pairs in a diploid cell, there would be roughly 6,400 errors each time a diploid genome is copied, and half that number (about 3,200) for a haploid genome. This implies a significant chance for defects or mutations to occur and potentially be passed down to offspring. However, biology has evolved multiple mechanisms that increase the fidelity of DNA replication, such as polymerase proofreading and mismatch repair. For instance, MutS is a protein shown to bind to mismatched DNA sequences. This mechanism therefore acts as an additional layer of error correction, and it has also been used to improve the fidelity of de novo DNA synthesis (Carr et al., 2004).
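
The arithmetic behind these numbers can be written out as a short sketch. It assumes the figures quoted in the answer (one error per million bases, 3.2 billion base pairs per haploid genome) and treats errors as uniform along the genome, which is a simplification.

```python
def expected_errors(bases_copied, bases_per_error=1_000_000):
    """Expected number of errors when copying a given number of bases."""
    return bases_copied / bases_per_error


HAPLOID_BP = 3_200_000_000  # approximate human haploid genome length

haploid_errors = expected_errors(HAPLOID_BP)
diploid_errors = expected_errors(2 * HAPLOID_BP)
print(f"haploid: {haploid_errors:.0f} errors, diploid: {diploid_errors:.0f} errors")
```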

How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice, what are some of the reasons that all of these different codes don’t work to code for the protein of interest?

A protein of n amino acids can be encoded by roughly $3^n$ different DNA sequences, because the 61 sense codons are shared among 20 amino acids (about three synonymous codons per amino acid on average). For instance, a typical protein may consist of 300 amino acids, which gives on the order of $3^{300} \approx 10^{143}$ possible coding sequences. For comparison, $20^{300}$ is the number of possible amino acid sequences of that length (Alberts et al., 2002). Some of the reasons these codes do not all work in practice include:

  • Codon usage bias: The prevalence of a codon is related to its translation efficiency; some codons are translated faster than others. This impacts protein expression levels and availability (Chakravarty, 2026).

  • Protein structure: Since proteins fold co-translationally, changes in codon usage can alter the timing of folding events, affecting protein structure and function (Moss et al., 2024).

Homework questions from Dr. LeProust

What’s the most commonly used method for oligo synthesis currently?

Currently, oligo synthesis is most commonly performed using phosphoramidite nucleosides as building blocks. This process consists of four main chemical reactions: detritylation, coupling, capping, and oxidation (Kosuri & Church, 2014).

Why is it difficult to make oligos longer than 200 nt via direct synthesis?

The main challenge in synthesizing long oligonucleotides using standard phosphoramidite chemistry lies in cumulative yield loss and error accumulation. Unwanted reactions, such as depurination during detritylation, and incomplete removal of protecting groups can leave gaps in the oligo backbone, reducing overall yield. In addition, single-base deletions are the predominant errors caused by inefficiencies during these reaction steps.
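
The cumulative yield loss can be illustrated with a back-of-the-envelope model: if each coupling step succeeds with a fixed efficiency, the fraction of full-length product falls exponentially with length. The coupling efficiencies below are illustrative assumptions, not measured values, and the model ignores depurination and other side reactions.

```python
def full_length_yield(length_nt, coupling_efficiency):
    """Fraction of full-length oligo, assuming length_nt - 1 independent couplings."""
    return coupling_efficiency ** (length_nt - 1)


# Assumed per-step coupling efficiencies, for illustration only.
for efficiency in (0.99, 0.995):
    for length in (20, 100, 200):
        fraction = full_length_yield(length, efficiency)
        print(f"efficiency {efficiency}: {length} nt -> {fraction:.3f} full length")
```

Under these assumptions, a 200 nt oligo made at 99% coupling efficiency gives only about 0.135 full-length product, which is why longer constructs are assembled from shorter oligos.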

Why can’t you make a 2000 bp gene via direct oligo synthesis?

Manufacturing an oligo of this length is highly prone to errors due to several factors. First, oligo concentrations obtained from a selected pool after processing are often quite low, reducing assembly efficiency. Second, when synthesizing large numbers of oligos, overlapping coding regions may introduce assembly errors at scale. Finally, significantly higher costs are required to produce the large number of strands necessary for successful gene assembly.

Homework question from George Church

Using Google & Prof. Church’s slide #4, what are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?

The ten essential amino acids in most animals are:

Arginine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Threonine, Tryptophan, and Valine. Cysteine and Tyrosine can also become essential, depending on species-specific metabolic capabilities (Hou and Wu, 2018).

Understanding the Lysine Contingency as a bioengineered constraint, the dependence of animals on multiple essential amino acids further strengthens this strategy. This dependency enables researchers to implement safer in vivo containment systems, as organisms lacking access to these amino acids are unable to survive outside controlled environments (Shivni, 2023).


References

Carr, A. M., Lambert, S., & Replication Stress Group. (2004). Mismatch repair proteins and DNA replication fidelity. Nucleic Acids Research, 32(20), e162. https://academic.oup.com/nar/article/32/20/e162/1115791

Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K., & Walter, P. (2002). Molecular Biology of the Cell (4th ed.). Garland Science. https://www.ncbi.nlm.nih.gov/books/NBK26830/

Chakravarty, A. (2026). What is codon bias? GoldBio. https://www.goldbio.com/blogs/articles/what-is-codon-bias

Moss, A. J., et al. (2024). Codon usage and protein folding dynamics. PMC. https://pmc.ncbi.nlm.nih.gov/articles/PMC11227313/

Kosuri, S., & Church, G. M. (2014). Large-scale de novo DNA synthesis. Nature Methods, 11, 499–507. https://www.nature.com/articles/nmeth.2918

Hou, Y., & Wu, G. (2018). Nutritionally essential amino acids in animals. Advances in Nutrition, 9(6), 849–858. https://doi.org/10.1093/advances/nmy054

Shivni, R. (2023). A pioneer of the multiplex frontier. The Scientist. https://www.the-scientist.com/a-pioneer-of-the-multiplex-frontier-71132

Week 3 HW: Lab Automation

Assignment no. 1: Python Script for Opentrons Artwork

The following code was used to create a simple swirl pattern with four different colors:

from opentrons import types
import math

metadata = {
    'author': 'Jean Colmenares',
    'protocolName': 'Agar Swirl Pattern - 4 Colors',
    'description': 'Swirl pattern with four colors per branch',
    'source': 'HTGAA 2026 Opentrons Lab',
    'apiLevel': '2.20'
}

TIP_RACK_DECK_SLOT = 9
COLORS_DECK_SLOT = 6
AGAR_DECK_SLOT = 5
PIPETTE_STARTING_TIP_WELL = 'A1'

well_colors = {
    'A1': 'Red',
    'B1': 'Green',
    'C1': 'Orange',
    'D1': 'Blue'
}

def run(protocol):

    tips_20ul = protocol.load_labware(
        'opentrons_96_tiprack_20ul',
        TIP_RACK_DECK_SLOT,
        'Opentrons 20uL Tips'
    )

    pipette_20ul = protocol.load_instrument(
        "p20_single_gen2",
        "right",
        [tips_20ul]
    )

    temperature_module = protocol.load_module(
        'temperature module gen2',
        COLORS_DECK_SLOT
    )

    temperature_plate = temperature_module.load_labware(
        'opentrons_96_aluminumblock_generic_pcr_strip_200ul',
        'Cold Plate'
    )

    color_plate = temperature_plate

    agar_plate = protocol.load_labware(
        'htgaa_agar_plate',
        AGAR_DECK_SLOT,
        'Agar Plate'
    )

    center_location = agar_plate['A1'].top()

    pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)

    # ------------------------------------------------------------------
    # Helper functions
    # ------------------------------------------------------------------

    def location_of_color(color_string):
        for well, color in well_colors.items():
            if color.lower() == color_string.lower():
                return color_plate[well]
        raise ValueError(f"No well found with color {color_string}")

    def dispense_and_detach(pipette, volume, location):
        above_location = location.move(types.Point(z=location.point.z + 5))
        pipette.move_to(above_location)
        pipette.dispense(volume, location)
        pipette.move_to(above_location)

    # ------------------------------------------------------------------
    # SWIRL PATTERN — BIG + FIXED COLOR PER BRANCH (P20 SAFE)
    # ------------------------------------------------------------------

    DROP_VOLUME = 3

    branches = 4
    points_per_branch = 24

    radius_start = 3
    radius_step = 1.6
    angle_step = math.pi/9

    branch_colors = ['Red', 'Green', 'Orange', 'Blue']

    for branch in range(branches):

        base_angle = branch * (2*math.pi/branches)
        color = branch_colors[branch]
        source = location_of_color(color)

        for i in range(points_per_branch):

            pipette_20ul.pick_up_tip()

            pipette_20ul.aspirate(DROP_VOLUME, source.bottom(1))

            angle = base_angle + i * angle_step
            radius = radius_start + i * radius_step

            x = radius * math.cos(angle)
            y = radius * math.sin(angle)

            loc = center_location.move(types.Point(x=x, y=y, z=0))
            dispense_and_detach(pipette_20ul, DROP_VOLUME, loc)

            pipette_20ul.drop_tip()
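
Before running the protocol on the robot, the swirl geometry can be checked offline. The sketch below recomputes the same drop coordinates without any Opentrons dependency and reports the number of drops, the largest radius, and the volume drawn per color. The usable plate radius of 40 mm is an assumption and should be replaced by the actual dimensions of the agar plate.

```python
import math


def swirl_points(branches=4, points_per_branch=24,
                 radius_start=3.0, radius_step=1.6, angle_step=math.pi / 9):
    """Return (x, y) offsets in mm from the plate center, as in the protocol."""
    points = []
    for branch in range(branches):
        base_angle = branch * (2 * math.pi / branches)
        for i in range(points_per_branch):
            angle = base_angle + i * angle_step
            radius = radius_start + i * radius_step
            points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points


ASSUMED_PLATE_RADIUS_MM = 40.0  # hypothetical usable radius of the agar plate
DROP_VOLUME_UL = 3

points = swirl_points()
max_radius = max(math.hypot(x, y) for x, y in points)
volume_per_color = 24 * DROP_VOLUME_UL

print(f"{len(points)} drops, max radius {max_radius:.1f} mm")
print(f"fits on plate: {max_radius <= ASSUMED_PLATE_RADIUS_MM}")
print(f"volume per color: {volume_per_color} uL")
```

With one tip per drop, the 96 drops use exactly one full tip rack.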

The pattern is shown below:

Assignment no.2: Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.

Automated Strain Construction for Biosynthetic Pathway Screening in Yeast (Astolfi et al., 2025)

In this study, the researchers programmed a Hamilton Microlab VANTAGE liquid-handling robot (a high-end automation platform, not an Opentrons) to integrate with additional on- and off-deck hardware (e.g., thermocyclers, plate sealers, colony pickers) via its central arm. Together with custom software and a user interface developed in the Hamilton VENUS environment, this system automated key steps in yeast strain construction such as transformation setup, heat-shock, washing, and plating.

This automated workflow achieved a throughput of up to ~2,000 transformations per week, enabling high-throughput construction and screening of libraries of engineered yeast strains. As a proof of concept, the team applied the system to screen gene variants within a biosynthetic pathway for the plant alkaloid precursor verazine. They identified several genes that significantly increased pathway product titers, demonstrating the utility of automated strain construction for rapid pathway discovery and optimization.

Assignment no. 3: Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details.

For my final project, I would like to use automation tools to optimize dye biodegradation using Bacillus subtilis. A liquid-handling robot such as the Opentrons OT-2 could automate pipetting, sample preparation, and serial dilutions to test different enzyme and dye concentrations efficiently.

The system could be used to prepare multiple reactions in 96-well plates and measure dye degradation through absorbance readings. Automation would improve reproducibility, reduce human error, and increase experimental throughput.

Simple Python scripts could also be used to organize experimental data and calculate degradation efficiency automatically.

for enzyme_concentration in concentrations:
    add_samples()
    incubate()
    measure_absorbance()
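
The degradation efficiency calculation mentioned above could look like the following sketch. It assumes that absorbance is proportional to dye concentration, so the percent decrease in absorbance approximates the percent of dye degraded. The readings and the U/mL unit are hypothetical placeholders, not experimental data.

```python
def degradation_efficiency(initial_absorbance, final_absorbance):
    """Percent of dye removed, assuming absorbance scales with concentration."""
    if initial_absorbance <= 0:
        raise ValueError("initial absorbance must be positive")
    return (initial_absorbance - final_absorbance) / initial_absorbance * 100.0


# Hypothetical readings: enzyme concentration -> (absorbance before, after).
readings = {0.5: (1.00, 0.80), 1.0: (1.00, 0.50), 2.0: (1.00, 0.25)}

for enzyme_concentration, (before, after) in readings.items():
    efficiency = degradation_efficiency(before, after)
    print(f"{enzyme_concentration} U/mL: {efficiency:.1f}% degraded")
```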

Additionally, Ginkgo Nebula could be used for DNA construct design and sequence analysis related to enzyme production in Bacillus subtilis.

05/17: Since the idea was changed, the description is written below:

For my final project, I would like to use automation tools to optimize the expression and testing of the LigA protein in Escherichia coli. A liquid-handling robot such as the Opentrons OT-2 could automate pipetting steps, preparation of cultures, and testing of different expression conditions such as inducer concentration, temperature, and media composition.

Automation would allow faster screening of conditions that improve protein production and interaction studies with the designed peptide.

Week 4 HW: Protein Design Part I

PART A:

1. How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)

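Assumptions: meat contains about 31 g of protein per 100 g, and the average amino acid has a molar mass of about 100 g/mol (100 Da).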


1. Protein content in 500 g of meat

$ 500 \,\text{g meat} \times \frac{31 \,\text{g protein}}{100 \,\text{g meat}} = 155 \,\text{g protein} $

2. Convert grams of protein to moles of amino acids

$ 155 \,\text{g} \times \frac{1 \,\text{mol}}{100 \,\text{g}} = 1.55 \,\text{mol amino acids} $

3. Convert moles to molecules

$ 1.55 \,\text{mol} \times 6.022 \times 10^{23} = 9.33 \times 10^{23} $

Final Answer

$ \boxed{9 \times 10^{23} \text{ amino acid molecules}} $

2. Why do humans eat beef but do not become a cow, eat fish but do not become fish?

Humans need to feed on beef, fish and other nutrients to obtain energy and raw materials. Even though we consume proteins and nucleic acids that were originally built according to another organism’s DNA, digestion breaks them down into basic biomolecules, namely amino acids and nucleotides. Our cells then use our DNA to reassemble those building blocks according to human genetic instructions, not those of a cow or a fish.

3. Why are there only 20 natural amino acids?

Around 500 amino acids are known in nature, but only 20 are encoded by the standard genetic code and used for building human proteins.

5. Where did amino acids come from before enzymes that make them, and before life started?

Amino acids likely formed through prebiotic chemical reactions before life emerged. Experimental evidence suggests they could have been synthesized abiotically under early Earth conditions, driven by energy sources such as lightning, volcanic activity, and hydrothermal systems rich in sulfur compounds. In this view, simple inorganic molecules combined with an energy input could generate organic building blocks like amino acids without the need for enzymes (Farias-Rico and Mourra-Diaz, 2022).

6. If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?

Because D-amino acids are the mirror image of L-amino acids, the energetically favorable backbone torsion angles are also inverted. As a result, the most stable α-helix formed by D-amino acids is left-handed.

7. Can you discover additional helices in proteins?

Yes. In fact, some studies have identified different forms of the α-helix in globular proteins, namely linear, curved or kinked (Kumar & Bansal, 1998). Additionally, there are $3_{10}$ and $\pi$ helices, which are less favourable in terms of stability but still occur (Kumar et al., 2022).

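8. Why are most molecular helices right-handed?

This is because the amino acids in natural proteins are almost exclusively L-oriented, and for L-amino acids the right-handed α-helix has the sterically and energetically favorable backbone torsion angles.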

9. Why do β-sheets tend to aggregate? What is the driving force for β-sheet aggregation

β-sheets tend to aggregate because exposed backbone hydrogen bond donors and acceptors allow β-strands from different molecules to bind to each other. Hydrophobic side chains further stabilize these interactions, making sheet stacking energetically favorable. Aggregation is driven mainly by intermolecular hydrogen bonding and hydrophobic interactions. These forces lower the system’s free energy and promote ordered β-sheet assembly, similar to crystallization. (Chen et al., 2017)

10. Why do many amyloid diseases form β-sheets? Can you use amyloid β-sheets as materials?

Many amyloid diseases involve β-sheet structures because the cross-β arrangement is highly stable and self-propagating. This stability allows misfolded proteins to accumulate as insoluble fibrils that disrupt normal cellular function (Chen et al., 2017). Despite their toxic effects on health, there is increasing research into their development as materials for several applications (Yadav et al., 2024).

PART B: Protein Analysis and Visualization

1. Briefly describe the protein you selected and why you selected it.

In this case I chose CotA, the laccase from Bacillus subtilis, an enzyme shown to have multiple applications, ranging from bioremediation to dye breakdown. It belongs to the family of multicopper oxidases (MCOs), which are capable of oxidizing a wide range of chemical compounds. Since one of my project ideas was to determine the optimal laccase concentration for wastewater treatment in textile factories, I thought it would be a great way to get used to the different tools we have for protein design.

2. Identify the amino acid sequence of your protein.

The sequence is:

MTLEKFVDALPIPDTLKPVQQSKEKTYYEVTMEECTHQLHRDLPPTRLWGYNGLFPGPTIEVKRNENVYVKWMNNLPSTHFLPIDHTIHHSDSQHEEPEVKTVVHLHGGVTPDDSDGYPEAWFSKDFEQTGPYFKREVYHYPNQQRGAILWYHDHAMALTRLNVYAGLVGAYIIHDPKEKRLKLPSDEYDVPLLITDRTINEDGSLFYPSAPENPSPSLPNPSIVPAFCGETILVNGKVWPYLEVEPRKYRFRVINASNTRTYNLSLDNGGDFIQIGSDGGLLPRSVKLNSFSLAPAERYDIIIDFTAYEGESIILANSAGCGGDVNPETDANIMQFRVTKPLAQKDESRKPKYLASYPSVQHERIQNIRTLKLAGTQDEYGRPVLLLNNKRWHDPVTETPKVGTTEIWSIINPTRGTHPIHLHLVSFRVLDRRPFDIARYQESGELSYTGPAVPPPPSEKGWKDTIQAHAGEVLRIAATFGPYSGRYVWHCHILEHEDYDMMRPMDITDPHK

Using the code provided, the most common amino acid is proline (P), which appears 46 times.
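
For reference, a residue count like this can be reproduced in a few lines. The sketch below is not the course's provided code but a minimal equivalent based on `collections.Counter`; the short peptide is a hypothetical example, and the CotA sequence above can be pasted in its place.

```python
from collections import Counter


def most_common_residue(sequence):
    """Return (residue, count) for the most frequent amino acid in a sequence."""
    counts = Counter(sequence.upper())
    return counts.most_common(1)[0]


# Hypothetical short peptide used only to demonstrate the call.
example_peptide = "MKKLPKA"
residue, count = most_common_residue(example_peptide)
print(f"Most common residue: {residue} ({count} times)")
```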

Some additional facts were obtained from UniProt:

  • Length (Number of AAs): 513
  • Molecular mass: 58.5 kDa
  • Family: Multicopper Oxidase
  • Number of homologues: 242

3. Identify the structure page of your protein in RCSB

The protein’s structure was solved in 2003 by Francisco J. Enguita and collaborators, with a resolution of 1.70 Å. There are 4 ligands: C2O, GOL, C1O and CU. It belongs to the Laccase family.

4. Open the structure of your protein in any 3D molecule visualization software

The following images of the protein were obtained from PyMOL:


Main structure

Figure 1. Structural representations of the laccase protein generated in PyMOL: cartoon, ribbon, and balls-and-sticks representations.

Figure 1 presents different structural representations of CotA. The cartoon view highlights its compact, globular fold dominated by β-sheets and connecting loops, characteristic of bacterial laccases, while the ribbon representation emphasizes the backbone organization and overall topology. The balls-and-sticks model displays all atoms explicitly, revealing dense atomic packing and the presence of copper ions within the protein core, coordinated by conserved histidine residues typical of multicopper oxidase active sites.


Secondary structures

Figure 2. Secondary structures and residue groups.

The surface of CotA was colored according to residue type: hydrophobic residues in yellow, polar uncharged in cyan, positively charged in blue, and negatively charged in red. The visualization shows a typical globular organization, with hydrophobic residues mainly buried in the protein core, contributing to structural stability, and polar and charged residues predominantly exposed on the surface, supporting solubility and potential functional interactions. The distribution is consistent with CotA’s role as a multicopper oxidase.

PART C: Using ML-Based Protein Design Tools

1. Protein Language Modeling

The selected protein was $\beta$-lactoglobulin, which is the main component of whey protein.

Sequence

3NPO_1|Chain A|Beta-lactoglobulin|Bos taurus (9913)
LIVTQTMKGLDIQKVAGTWYSLAMAASDISLLDAQSAPLRVYVEELKPTPEGDLEILLQKWENGECAQKKIIAEKTKIPAVFKIDALNENKVLVLDTDYKKYLLFCMENSAEPEQSLACQCLVRTPEVDDEALEKFDKALKALPMHIRLSFNPTQLEEQCHI

Deep Mutational Scans

Figure 3. Deep Mutational Scans.

A few mutational hotspots were identified, particularly around positions 17, 63, 142 and 158. These sites show a tendency toward substitution to Leucine (L) and Glutamic Acid (E). Given that these residues appear to be located on the outer region of the protein, this pattern may reflect structural permissiveness. Substitution toward Leucine could enhance local hydrophobic packing, whereas Glutamic Acid, being negatively charged, may be well tolerated on the protein surface. This suggests that these positions are structurally flexible and can accommodate both hydrophobic and charged residues without significantly disrupting the overall fold.

Latent Space Analysis

Figure 4. Latent Space Analysis.

2. Protein Folding

Figure 5. Structure obtained.

I was able to obtain the amino acid probabilities, which may help me in the future to predict the most likely mutations and refold the proteins.

Figure 6. Amino acid probabilities .

By mutating residues no. 1 (L), 15 (P), 36 (P), 122 (P), and 29 (L), the following sequence variants were obtained:

Variant 1

Mutations:

  • L1 → I
  • P15 → A

Sequence:

IIVTQTMKGLDIQKVAGTWYSLAMAASDISLLDAQSAPLRVYVEELKPTPEGDLEILLQKWENGECAQKKIIAEKTKIPAVFKIDALNENKVLVLDTDYKKYLLFCMENSAEPEQSLACQCLVRTPEVDDEALEKFDKALKALPMHIRLSFNPTQLEEQCHI


Variant 2

Mutations:

  • L29 → F
  • P36 → G
  • P122 → S

Sequence:

LIVTQTMKGLDIQKVAGTWYSLAMAASDISFLDAQSAGLRVYVEELKPTPEGDLEILLQKWENGECAQKKIIAEKTKIPAVFKIDALNENKVLVLDTDYKKYLLFCMENSAEPEQSLACQCLVRTPEVDDEALEKFDKALKALSMHIRLSFNPTQLEEQCHI


Variant 3

Mutations:

  • L1 → D
  • P15 → G
  • L29 → N
  • P36 → G
  • P122 → G

Sequence:

DIVTQTMKGLDIQKVGGTWYSLAMAASDISNNDAQSAGLRVYVEELKPTPEGDLEILLQKWENGECAQKKIIAEKTKIPAVFKIDALNENKVLVLDTDYKKYLLFCMENSAEPEQSLACQCLVRTPEVDDEALEKFDKALKGALPMHIRLSFNPTQLEEQCHI

Figure 7. Different mutations.
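A minimal sketch of how such variants could be built programmatically. The function name `apply_mutations` and the `L1I` label format are my own assumptions, not part of the tool used above. The check on the wild-type residue is there to catch numbering offsets between a mutation list and the sequence being edited.

```python
import re


def apply_mutations(sequence: str, mutations: list[str]) -> str:
    """Apply point mutations written as 'L1I' (wild type, 1-based position, new residue)."""
    residues = list(sequence)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"Cannot parse mutation {mutation!r}")
        wild_type = match.group(1)
        position = int(match.group(2))
        new_residue = match.group(3)
        if not 1 <= position <= len(sequence):
            raise ValueError(f"Position {position} is outside the sequence")
        # Refuse to mutate if the stated wild-type residue is not at that position.
        if sequence[position - 1] != wild_type:
            raise ValueError(
                f"{mutation}: expected {wild_type} at position {position}, "
                f"found {sequence[position - 1]}"
            )
        residues[position - 1] = new_residue
    return "".join(residues)


# First ten residues of the beta-lactoglobulin sequence used above.
wild_type_start = "LIVTQTMKGL"
print(apply_mutations(wild_type_start, ["L1I"]))
```

If a label does not match the residue found at that index, the function raises an error instead of silently mutating the wrong position.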

3. Protein Generation

New sequence

MPVKKTMEGLDISKLAGKWYTQAQAATKKELLSTKSSPYNRFTLELIPTPEGDLQVRYEYYENGECKDRLDTWHKTDDPAVFVKEGEEDERLVIMDTDYNNYALWCIEKKSEPEKELVCQCLVRKPEINEEALKKFEEAKKDLPFTVEKFWNPEQLQKRCLE

Mutations

Figure 8. Mutation probabilities.

Then, the two proteins, both original and new, are shown below:

Figure 9. Old and new proteins.

Both sequences appear to preserve the same overall structural fold, as several conserved motifs and key residues remain unchanged, including regions such as PTPEGDL, DTDY, and CQCLVR. These conserved segments suggest that the core structural or functional features of the protein were maintained.

However, the new sequence introduces substantial changes in amino acid composition, with an increase in charged and polar residues such as lysine (K), glutamate (E), and glutamine (Q), alongside a reduction in hydrophobic residues. Structurally, this likely makes the redesigned protein more soluble and less prone to aggregation, while also increasing surface polarity and local flexibility. In contrast, the original sequence appears to contain a more hydrophobic and compact core, which may contribute to greater structural rigidity and stability.
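The composition comparison can be checked with a short script. This is only a sketch: the residue groupings (`CHARGED`, `HYDROPHOBIC`) are a common but simplified classification that I am assuming here, and the helper name `residue_fraction` is illustrative.

```python
CHARGED = set("KRDE")         # assumed grouping: Lys, Arg, Asp, Glu
HYDROPHOBIC = set("AVILMFW")  # assumed grouping of nonpolar side chains


def residue_fraction(sequence: str, residue_set: set[str]) -> float:
    """Fraction of residues in the sequence that belong to residue_set."""
    return sum(1 for aa in sequence if aa in residue_set) / len(sequence)


original = (
    "LIVTQTMKGLDIQKVAGTWYSLAMAASDISLLDAQSAPLRVYVEELKPTPEGDLEILLQKWENGECAQKK"
    "IIAEKTKIPAVFKIDALNENKVLVLDTDYKKYLLFCMENSAEPEQSLACQCLVRTPEVDDEALEKFDKAL"
    "KALPMHIRLSFNPTQLEEQCHI"
)
redesigned = (
    "MPVKKTMEGLDISKLAGKWYTQAQAATKKELLSTKSSPYNRFTLELIPTPEGDLQVRYEYYENGECKDRL"
    "DTWHKTDDPAVFVKEGEEDERLVIMDTDYNNYALWCIEKKSEPEKELVCQCLVRKPEINEEALKKFEEAK"
    "KDLPFTVEKFWNPEQLQKRCLE"
)

for name, sequence in (("original", original), ("redesigned", redesigned)):
    print(
        name,
        f"charged={residue_fraction(sequence, CHARGED):.2f}",
        f"hydrophobic={residue_fraction(sequence, HYDROPHOBIC):.2f}",
    )

# Motifs that both sequences share, checked by simple substring search.
for motif in ("PTPEGDL", "DTDY", "CQCLVR"):
    print(motif, motif in original and motif in redesigned)
```

Composition alone says nothing about where the residues sit in the fold, so these fractions only support the solubility argument indirectly.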

PART D: Group Brainstorm on Bacteriophage Engineering

Group Members: Alvaro Pacheco (Lima, PE) and Renzo Condori (Lima, PE)

Goals

  • Perform mutagenesis on the LS-motif to enhance stability
  • Modify the promoter region of the DNA sequence to express larger amounts of the MS2 protein

Tools

-AlphaFold: Predicts the 3D structure of the mutant variants. It allows us to evaluate whether the mutations maintain the transmembrane topology and general conformation, and to verify that the functional LS motif keeps its orientation and stability.

-FoldX/Rosetta: Estimates the change in free energy (ΔΔG) caused by each mutation. This eases the selection of the mutant variants that are most likely to be thermodynamically stable, reducing the number of candidates.

-GROMACS: Simulates and analyzes the stability of the protein in a bacterial environment.

These tools complement each other by combining evolutionary, structural, and physical insights to improve MS2 lysis protein stability. Protein Language Models suggest mutations consistent with evolutionary constraints, AlphaFold2 screens for preserved structure and topology, energy calculations prioritize stabilizing variants, and molecular dynamics simulations test behavior in membrane conditions. Together, they enable rational design of stabilizing mutations while reducing the risk of impairing lytic function.

Pitfalls

One potential pitfall is the limited accuracy of current prediction tools for small, membrane-associated, and partially disordered proteins, which may reduce the reliability of both structural and energetic predictions. A second limitation is the trade-off between stability and function: increasing stability may reduce the conformational flexibility required for interaction with the membrane target, potentially impairing lytic activity.

Pipeline

Procedure

Functional Analysis and Definition of Constraints: The conserved transmembrane motif Leu48–Ser49 (LS) and its immediate surrounding region will be identified as critical functional regions that must not be mutated.

Directed In Silico Mutagenesis: Mutant variants will be generated using protein language models, restricting mutations to the remainder of the sequence.

Preliminary Energy Filtering: Variants will be evaluated through ΔΔG stability predictions, selecting those with improved thermodynamic stability.

Structural Prediction: Selected variants will be modeled using AlphaFold2 to verify preservation of the transmembrane domain and overall conformation.

Dynamic Validation in a Membrane Environment: Top candidates will be evaluated through molecular dynamics simulations in a bacterial membrane environment.

Final Variant Selection: Mutants showing the best balance between enhanced stability and functional conservation will be selected.
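The first steps of the procedure above could be sketched as follows. Everything here is a placeholder: `toy_sequence` is NOT the MS2 lysis protein, the protected positions only stand in for Leu48–Ser49 and its neighbors, and the ΔΔG values are invented to show the filtering logic. I also assume the convention that negative ΔΔG means stabilizing.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_mutants(sequence: str, protected_positions: set[int]):
    """Yield (label, variant) for every single substitution outside the protected 1-based positions."""
    for index, wild_type in enumerate(sequence):
        position = index + 1
        if position in protected_positions:
            continue  # constraint step: never touch the functional motif
        for new_residue in AMINO_ACIDS:
            if new_residue != wild_type:
                variant = sequence[:index] + new_residue + sequence[index + 1:]
                yield f"{wild_type}{position}{new_residue}", variant


def keep_stabilizing(ddg_by_label: dict[str, float], cutoff: float = -0.5) -> list[str]:
    """Keep labels with predicted ddG <= cutoff (kcal/mol), most stabilizing first."""
    kept = [label for label, ddg in ddg_by_label.items() if ddg <= cutoff]
    return sorted(kept, key=lambda label: ddg_by_label[label])


toy_sequence = "MKTLSAV"   # placeholder sequence, hypothetical
protected = {4, 5}         # stands in for the LS motif positions
variants = dict(single_mutants(toy_sequence, protected))
print(len(variants), "candidate single mutants")

mock_ddg = {"M1A": 0.8, "K2R": -1.2, "T3V": -0.6}  # hypothetical predictions
print(keep_stabilizing(mock_ddg))
```

In a real run the ΔΔG dictionary would come from the energy tool's output, and the surviving labels would be passed on to structure prediction.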


Week 5 HW: Protein Design part II

PART A: SOD1 Binder Peptide Design

The sequence for the original protein is:

// sp|P00441|SODC_HUMAN Superoxide dismutase [Cu-Zn] OS=Homo sapiens OX=9606 GN=SOD1 PE=1 SV=2
MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

The mutation occurs at residue 4 (A4V, numbered without the initiator methionine): Alanine becomes Valine

// 1UXM_1|Chains A, B, C, D, E, F, G, H, I, J, K, L|SUPEROXIDE DISMUTASE [CU-ZN]|HOMO SAPIENS (9606)
ATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

Part 1: Generate Binders with PepMLM

Adding Methionine to the mutated sequence, the generated peptides were:

| No. | Binder sequence | Perplexity |
|---|---|---|
| 1 | WRYYVAVVRWKK | 28.003177 |
| 2 | WRYYAAVLEWKE | 16.368043 |
| 3 | WHSYAVVLEWWK | 19.985691 |
| 4 | WLSGPVAVEWKK | 11.674838 |

Compared with the reference binder, FLYRWLPSRRGG, all four generated peptides are quite different. This might be due to the input parameters of the code as well as the fact that generation is stochastic, which can also contribute to higher perplexity values.

Part 2: Evaluate Binders with AlphaFold3

The following figures show how the SOD1 protein interacts with each of the four binders:

Figure 1. Visual representation of docking between different mutated binders and SOD1 protein.

The website also provided the following scores:

| No. | Peptide sequence | ipTM score | pTM score |
|---|---|---|---|
| 1 | WRYYVAVVRWKK | 0.78 | 0.85 |
| 2 | WRYYAAVLEWKE | 0.64 | 0.77 |
| 3 | WHSYAVVLEWWK | 0.74 | 0.83 |
| 4 | WLSGPVAVEWKK | 0.73 | 0.83 |

Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse

| No. | Peptide sequence | Solubility | Hemolysis | Binding Affinity | Molecular Weight | Net charge @ pH = 7 | Isoelectric Point | Hydrophobicity (GRAVY) |
|---|---|---|---|---|---|---|---|---|
| 1 | WRYYVAVVRWKK | Soluble | Non-hemolytic (0.066) | Medium (7.377 pKd/pKi) | 1654.0 Da | 3.76 | 10.45 | -0.57 |
| 2 | WRYYAAVLEWKE | Soluble | Non-hemolytic (0.057) | Weak binding (6.837 pKd/pKi) | 1613.8 Da | -0.23 | 6.28 | -0.68 |
| 3 | WHSYAVVLEWWK | Soluble | Non-hemolytic (0.090) | Weak binding (6.133 pKd/pKi) | 1603.8 Da | -0.15 | 6.75 | -0.12 |
| 4 | WLSGPVAVEWKK | Soluble | Non-hemolytic (0.039) | Weak binding (5.301 pKd/pKi) | 1399.6 Da | 0.76 | 8.59 | -0.16 |
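As a sanity check on the hydrophobicity values, GRAVY can be recomputed by hand as the mean Kyte-Doolittle hydropathy of the residues. The sketch below is not how PeptiVerse computes it internally (I do not know its implementation); it only applies the standard scale, and the function name `gravy` is my own.

```python
# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


peptides = ["WRYYVAVVRWKK", "WRYYAAVLEWKE", "WHSYAVVLEWWK", "WLSGPVAVEWKK"]
for peptide in peptides:
    print(peptide, round(gravy(peptide), 2))
```

Rounded to two decimals, the four values agree with the GRAVY figures reported for the peptides above (-0.57, -0.68, -0.12, -0.16).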

Part 4: Generate Optimized Peptides with moPPIt

The generated binders are shown in the following table:

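| Binder | Hemolysis | Solubility | Half-Life | Affinity | Motif | Specificity |
|---|---|---|---|---|---|---|
| EAVEGLTAEQIW | 0.95 | 0.5 | 9.92 | 5.91 | 0.34 | 0.61 |
| WIIWVTTTKAQK | 0.94 | 0.5 | 5.66 | 5.77 | 0.77 | 0.67 |
| ITLDEWLKKQCY | 0.88 | 0.67 | 14.46 | 7.16 | 0.78 | 0.57 |
| TDEQKVQLSAYW | 0.84 | 0.67 | 5.64 | 6.41 | 0.66 | 0.53 |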

PART C: Final Project: L-Protein Mutants

The programs were run and the heatmap along with some mutation candidates were obtained:

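| Amino Acid | Position in Protein | Amino Acid in Protein Sequence | Score |
|---|---|---|---|
| L | 50 | K | 2.56 |
| R | 29 | C | 2.39 |
| L | 39 | Y | 2.24 |
| S | 29 | C | 2.04 |
| Q | 9 | S | 2.01 |
| Q | 29 | C | 1.99 |
| P | 29 | C | 1.97 |
| L | 29 | C | 1.96 |
| I | 50 | K | 1.92 |
| L | 53 | N | 1.86 |

Figure 2. Amino acid-mutation predictions.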

Week 6 HW: Genetic Circuits Part I

PART 1: Protocol questions

  1. What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?

The master mix contains:

  • Phusion Hi-Fi DNA Polymerase: Synthesizes the new DNA strands during PCR. Its proofreading (3'→5' exonuclease) activity keeps the error rate low.
  • Deoxynucleotides: The building blocks necessary for replicating DNA fragments.
  • Buffer including MgCl2: Maintains a stable pH for the enzyme, while Mg²⁺ acts as the cofactor the polymerase needs to work.

  2. What are some factors that determine primer annealing temperature during PCR?

The annealing temperature depends on the length of the primers and their GC content. Primers with higher GC content have higher melting temperatures. The sequence of the primer and the presence of mismatches also affect binding. In addition, salt concentration can influence primer stability.
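To make the dependence on length and GC content concrete, here is a rough sketch using the Wallace rule, Tm ≈ 2(A+T) + 4(G+C) in °C. This rule is only a quick estimate for short primers and ignores salt concentration and mismatches; the primer sequence is hypothetical, and primer design tools use more accurate nearest-neighbor models.

```python
def gc_content(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Wallace-rule melting temperature estimate in degrees Celsius."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


primer = "ATGGCTAGCAAAGGAGAAG"  # hypothetical primer, for illustration only
print(f"GC content: {gc_content(primer):.2f}")
print(f"Estimated Tm: {wallace_tm(primer)} °C")
```

Each G or C adds twice as much to the estimate as an A or T, which is why GC-rich primers of the same length melt at higher temperatures.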

  3. There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.

PCR is a method that amplifies a specific DNA fragment using primers and a DNA polymerase. It is very flexible because primers can be designed to target almost any sequence. However, it requires thermal cycling and can introduce small errors.

Restriction enzyme digestion, on the other hand, cuts DNA at specific recognition sites. It is very precise but less flexible because it depends on the presence of those sites in the DNA. PCR is preferred when you need to create or modify fragments, while restriction enzymes are useful when the correct sites already exist.

  4. How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?

To use Gibson cloning, adjacent DNA fragments must have overlapping regions of about 20–40 base pairs. The end of one fragment must share the same sequence as the start of the next, so that the single strands exposed by the exonuclease are complementary and can anneal. The overlaps can be added during PCR by designing primers with extra sequences. It is also important to check that the sequences are in the correct orientation and reading frame.
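A small check like the one below could confirm that two neighboring fragments share a usable overlap before ordering primers. It is a sketch under simple assumptions: both fragments are given as top-strand sequences in the intended orientation, the 20–40 bp window comes from the answer above, and the toy sequences are invented.

```python
def shared_overlap(upstream: str, downstream: str, min_len: int = 20, max_len: int = 40) -> str:
    """Return the longest end-of-upstream / start-of-downstream match within [min_len, max_len], or ''."""
    longest = min(max_len, len(upstream), len(downstream))
    for length in range(longest, min_len - 1, -1):
        if upstream[-length:] == downstream[:length]:
            return upstream[-length:]
    return ""


# Hypothetical fragments that share a 23-bp overlap.
overlap = "ACGTTGCAAGCTTGGATCCATGC"
fragment_a = "TTTTTTTTTT" + overlap
fragment_b = overlap + "CCCCCCCCCC"

found = shared_overlap(fragment_a, fragment_b)
print(len(found), found)
```

An empty result means the pair would not assemble as designed, for example because one fragment is in the wrong orientation or the primer tail is too short.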

  5. How does the plasmid DNA enter the E. coli cells during transformation?

The plasmid DNA enters the cells through a process called electroporation where, by means of an externally applied voltage, the membrane permeability increases, allowing the plasmid to enter the bacterial cytosol.

  6. Describe another assembly method in detail (such as Golden Gate Assembly)

6.1 Explain the other method in 5 - 7 sentences plus diagrams (either handmade or online)

Golden Gate Assembly uses Type IIS restriction enzymes, which cut outside their recognition sites. This creates custom overhangs that control how DNA fragments join together. The reaction includes both digestion and ligation at the same time. The fragments are designed to have matching overhangs, so they assemble in the correct order. After assembly, the restriction sites are removed, leaving a clean sequence. This method is efficient and allows multiple fragments to be assembled in one step. A diagram is shown below:
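To illustrate how the cut falls outside the recognition site, the sketch below uses BsaI as an example enzyme (my assumption; the answer above does not name one). BsaI recognizes GGTCTC and cuts one nucleotide downstream on the top strand, leaving a 4-nt overhang. Only forward-oriented sites on the top strand are handled, and the toy part is invented.

```python
BSAI_SITE = "GGTCTC"  # BsaI recognition sequence (example Type IIS enzyme)


def bsai_overhangs(sequence: str) -> list[str]:
    """Return the 4-nt overhangs produced by forward-oriented BsaI sites on the top strand."""
    sequence = sequence.upper()
    overhangs = []
    start = sequence.find(BSAI_SITE)
    while start != -1:
        cut = start + len(BSAI_SITE) + 1  # top-strand cut is 1 nt past the site
        overhang = sequence[cut:cut + 4]
        if len(overhang) == 4:
            overhangs.append(overhang)
        start = sequence.find(BSAI_SITE, start + 1)
    return overhangs


toy_part = "TTGGTCTCAAATGCCCGGG"  # hypothetical part: site, 1-nt spacer, then AATG
print(bsai_overhangs(toy_part))
```

Because the overhang sequence is whatever follows the site, it can be chosen freely for each junction, which is what fixes the assembly order.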

Figure 1. Golden Gate diagram.

6.2. Model this assembly method with Benchling or Asimov Kernel!

In this case, GFP was selected as the insert and pET-28(+) as the cloning plasmid. The assembly was done in Benchling.

Figure 2. Sequences.

We click on the assembly wizard option on the lower-right corner of the page and select ‘Golden Gate Assembly’ as shown below:

Figure 3. Selection.

Then, we select the section outside of the MCS as the backbone, and the full GFP sequence as the fragment to be inserted.

Figure 4. Backbone and fragment.

After clicking ‘Assemble’, the construct is shown below:

Figure 5. Final construct.

PART 2: Asimov Kernel

  1. Repressilator

The model was copied from the template and the parts were looked up. The simulation was then run with the parameters Simple transfection, E. coli, and 24 h with 10 min time steps, obtaining the figures below:

Figure 6. Repressilator construct and graph.

  2. Constructs

2.1 CONSTRUCT 1

This is a basic construct that expresses LambdaCI. I tried to look for IPTG, since pTac is an inducible promoter, but did not find it among the available parts.

Figure 7. Construct 1.

Although the expression looks constitutive here, pTac is normally induced by IPTG. Several explanations for this behavior might be linked to the promoter properties. pTac might be a leaky promoter, because the expression, although low, is constant, as the graph shows. This is interesting from an efficiency standpoint. Perhaps when building more sophisticated constructs, pTac should be replaced by a more robust promoter.

2.2 CONSTRUCT 2

This construct is basically two repressing proteins against each other. The colors show which protein represses which promoter.

Figure 8. Construct 2.

Although both proteins start to express simultaneously, there is a decrease in AmtR and a notable increase in LitR production. This may be due to promoter strength. Since both proteins repress their promoters, small concentrations would be expected. However, the behavior shown in the graph might imply that pAmtR is a stronger promoter than pLitR.
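The idea that the stronger promoter wins can be illustrated with a toy model of two mutually repressing proteins. This is not the model Asimov Kernel uses; it is a minimal dimensionless sketch with Hill-type repression, and the names `strength_a`, `strength_b`, and `hill` as well as their values are assumptions chosen for illustration.

```python
def simulate_mutual_repression(
    strength_a: float,
    strength_b: float,
    hill: float = 2.0,
    dt: float = 0.01,
    steps: int = 5000,
) -> tuple[float, float]:
    """Euler integration of two proteins that repress each other's promoter."""
    a = b = 0.0  # both proteins start at zero, as in a fresh simulation
    for _ in range(steps):
        # Production is reduced by the other protein; degradation is first order.
        da = strength_a / (1.0 + b ** hill) - a
        db = strength_b / (1.0 + a ** hill) - b
        a += dt * da
        b += dt * db
    return a, b


# Protein A is driven by the stronger promoter (hypothetical strengths).
final_a, final_b = simulate_mutual_repression(strength_a=10.0, strength_b=5.0)
print(f"A = {final_a:.2f}, B = {final_b:.2f}")
```

Both proteins rise at first, but the one made from the stronger promoter accumulates faster, shuts down its competitor, and ends up dominant, which is the same qualitative pattern seen in the graph.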

2.3 CONSTRUCT 3

This construct expresses the protein LambdaCI constitutively in the simulation, even though pBad is a promoter that is induced by L-arabinose.

Figure 9. Construct C3.

However, when Arabinose for E. Coli is placed in the construct next to LambdaCI, the expression remains unchanged. This may be because L-arabinose is the correct inducer, and it is not found in the bacterial parts repository.

Figure 10. C3 expression with Arabinose for E. Coli.

Week 7 HW: Genetic Circuits Part II

PART 1: IANNs

  1. What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?

Intracellular Artificial Neural Networks (IANNs) offer several advantages over traditional genetic circuits based on Boolean logic. While Boolean circuits operate in a binary manner (ON/OFF), IANNs can process continuous, graded inputs such as varying concentrations of metabolites or regulatory molecules. This enables more nuanced and biologically realistic responses. Additionally, IANNs integrate multiple inputs through weighted interactions, allowing for more flexible and complex decision-making compared to rigid logical gates like AND or OR.

  1. Describe a useful application for an IANN; include a detailed description of input/output behavior, as well as any limitations an IANN might face to achieve your goal.

A useful application of an IANN is in cell-based disease diagnostics and therapeutic response, particularly for conditions such as cancer. In this context, inputs could include biomarkers such as cancer-associated microRNA levels, oxidative stress, and metabolic indicators like elevated lactate concentrations. The IANN processes these signals by assigning weights to each input and integrating them into a combined output that reflects the likelihood of a diseased state. Based on this computation, the system can trigger the expression of a therapeutic protein, such as one inducing apoptosis, or activate a fluorescent reporter for diagnostic purposes. Compared to Boolean circuits, which rely on strict thresholds and may fail when signals are near cutoff values, IANNs can interpret intermediate levels and produce more graded and accurate responses. However, limitations include gene expression noise, challenges in precisely tuning interaction weights, metabolic burden on the host cell, and potential instability due to mutations over time.
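The difference between a strict Boolean gate and a weighted, graded response can be sketched numerically. The input values, weights, and bias below are hypothetical and only meant to show the behavior near a cutoff; a real IANN would implement the weights through biochemical interaction strengths rather than numbers in code.

```python
import math


def boolean_and(inputs: list[float], threshold: float = 0.5) -> int:
    """Boolean AND gate: ON only if every input passes its threshold."""
    return int(all(value >= threshold for value in inputs))


def perceptron_output(inputs: list[float], weights: list[float], bias: float) -> float:
    """Single-layer perceptron: weighted sum of inputs passed through a sigmoid."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))


# Normalized levels of microRNA, oxidative stress, and lactate (hypothetical).
inputs = [0.45, 0.9, 0.6]
weights = [2.0, 1.5, 1.0]  # assumed relative importance of each biomarker
bias = -2.0                # assumed baseline repression

print("Boolean AND:", boolean_and(inputs))
print("Perceptron: ", round(perceptron_output(inputs, weights, bias), 2))
```

The AND gate returns OFF because one input sits just below its threshold, while the perceptron still produces an intermediate output that reflects the two strong signals.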

  1. Below is a diagram depicting an intracellular single-layer perceptron where the X1 input is DNA encoding for the Csy4 endoribonuclease and the X2 input is DNA encoding for a fluorescent protein output whose mRNA is regulated by Csy4. Tx: transcription; Tl: translation. Draw a diagram for an intracellular multilayer perceptron where layer 1 outputs an endoribonuclease that regulates a fluorescent protein output in layer 2.
Figure 1. Two-layer neural network.

The system is organized into two functional layers: the first layer integrates DNA inputs to produce an endoribonuclease, while the second layer uses this intermediate signal to post-transcriptionally regulate a fluorescent reporter.

PART 2: Fungal Materials

  1. What are some examples of existing fungal materials and what are they used for? What are their advantages and disadvantages over traditional counterparts?

Fungal materials, particularly those derived from mycelium, have been increasingly developed for sustainable applications across multiple industries. Examples include mycelium-based leather used as an alternative to animal-derived leather, biodegradable packaging materials that replace petroleum-based foams such as polystyrene, lightweight construction materials for insulation or structural use, and environmentally friendly textiles. These materials offer several advantages, including biodegradability, low environmental impact, and the ability to grow on inexpensive and renewable substrates such as agricultural waste. Additionally, fungal materials exhibit self-assembling properties, reducing the need for energy-intensive manufacturing processes. However, they also present disadvantages, including lower mechanical strength compared to some synthetic materials, sensitivity to moisture and environmental conditions, slower production rates, and variability in material properties due to the inherent complexity of biological systems.

  1. What might you want to genetically engineer fungi to do and why? What are the advantages of doing synthetic biology in fungi as opposed to bacteria?

Potential modifications include altering cell wall composition, such as chitin or glucan content, to improve mechanical strength, engineering the production of antimicrobial compounds for functionalized materials, and designing fungi that respond to environmental stimuli like humidity or temperature. Additionally, fungi could be engineered to incorporate biosensing or signaling capabilities, enabling the development of responsive or “smart” materials. Compared to bacterial systems, fungi offer several advantages, including their eukaryotic cellular structure, which supports proper folding and post-translational modification of complex proteins, as well as their natural ability to form multicellular networks (mycelium) that are well-suited for material fabrication. They also tend to have greater secretion capacity and environmental tolerance. Despite these benefits, fungal systems are generally more complex and slower to engineer than bacterial models, which can present practical challenges in their development and optimization.

Week 9 HW: Cell-Free Systems

  1. Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.

Main advantages (flexibility & control):

  • Open system: Components such as DNA, cofactors, salts, and inhibitors can be directly modified.
  • Precise control: You can tune Mg²⁺, ATP, amino acids, etc.
  • Rapid expression: No need for cloning → transformation → growth.
  • Toxic proteins: You can express proteins that would normally kill cells.

When CFPS is better than in vivo:

  • Producing toxic proteins (e.g., antimicrobial peptides)
  • Studying protein variants quickly (high-throughput screening, mutant libraries)
  • Incorporating non-natural amino acids
  • Expressing membrane proteins without worrying about cell viability

  1. Describe the main components of a cell-free expression system and explain the role of each component.

  • Cell extract (lysate): Contains ribosomes, tRNAs, and enzymes
  • DNA or mRNA template: The blueprint for your protein
  • Amino acids: Building blocks for protein synthesis
  • Energy system (ATP, GTP + regeneration system): Fuels translation
  • Salts (Mg²⁺, K⁺): Maintain ribosome stability and activity
  • Cofactors (NAD⁺, CoA, etc.): Support metabolic reactions
  • Enzymes (optional): Folding, disulfide bond formation, etc.

  1. Why is energy provision regeneration critical in cell-free systems? Describe a method you could use to ensure continuous ATP supply in your cell-free experiment.

Protein synthesis relies heavily on energy. If not enough ATP is available, the system is not able to produce proteins. Without regeneration, translation stops quickly and yield drops dramatically.

For instance, phosphoenolpyruvate (PEP) or creatine phosphate can be used as energy sources, since they regenerate ATP via substrate-level phosphorylation (through pyruvate kinase and creatine kinase, respectively).

  1. Compare prokaryotic versus eukaryotic cell-free expression systems. Choose a protein to produce in each system and explain why.

Prokaryotic systems (e.g., E. coli extract):

Fast, cheap, high yield. However, poor at post-translational modifications

Use case:

Produce enzymes like β-galactosidase → no complex folding/modifications needed

Eukaryotic systems (e.g., wheat germ, insect, mammalian extracts):

Slower and more expensive. However, they can perform folding, disulfide bond formation, and glycosylation (depending on the system)

Use case:

Produce antibodies or glycoproteins, which need proper folding and modifications

  1. How would you design a cell-free experiment to optimize the expression of a membrane protein? Discuss the challenges and how you would address them in your setup.

Challenges include aggregation, misfolding and insolubility. The strategies might include:

  • Add detergents
  • Use liposomes or nanodiscs to mimic membranes
  • Optimize Mg²⁺ and chaperones
  • Lower temperature to improve folding

A design idea would be:

CFPS + nanodiscs + chaperones → allows co-translational insertion into a membrane-like environment

  1. Imagine you observe a low yield of your target protein in a cell-free system. Describe three possible reasons for this and suggest a troubleshooting strategy for each.

  • Poor DNA template quality

Problem: degraded DNA or bad promoter

Fix: use fresh plasmid, stronger promoter (e.g., T7), optimize codons

  • Energy depletion

Problem: ATP runs out

Fix: improve regeneration system (PEP, glucose system)

  • Protein misfolding or degradation

Problem: aggregates or proteolysis

Fix: add chaperones, reduce temperature, include protease inhibitors

  • Suboptimal ion concentrations

Problem: Mg²⁺/K⁺ imbalance kills ribosome activity

Fix: optimize salt concentrations experimentally

Homework question from Kate Adamala

Design an example of a useful synthetic minimal cell as follows:

  1. Pick a function and describe it.

Source: Higashi et al., 2025

a. What would your synthetic cell do? What is the input and what is the output?

It would sense metals such as mercury and produce a signal in response. INPUT: Hg²⁺ ions; OUTPUT: AHL (acyl homoserine lactone), a common quorum sensing metabolite.

b. Could this function be realized by cell-free Tx/Tl alone, without encapsulation?

No. If not encapsulated, AHL would diffuse freely throughout the medium and the system would not behave like a cell.

c. Could this function be realized by genetically modified natural cell?

Yes. However, quorum sensing-related receptors are specific for every metal. Therefore, new cells would have to be modified for each case, which might raise ethical discussions as to whether or not to allow such modifications. SMCs, in a way, avoid this.

d. Describe the desired outcome of your synthetic cell operation.

In the presence of SMC: GFP is produced in response to mercury

In the absence of SMC: No response is observed

Thus, the SMC enables bacteria to indirectly sense mercury.

  1. Design all components that would need to be part of your synthetic cell.

a. What would be the membrane made of?

Phospholipids (POPC) and cholesterol.

b. What would you encapsulate inside? Enzymes, small molecules.

It would contain a Tx/Tl system from, for instance, E. coli, along with DNA consisting of a promoter, a regulatory protein gene, and an output gene (PmerT, merR, and luxI, respectively).

c. Which organism your Tx/Tl system will come from? Is bacterial OK, or do you need a mammalian system for some reason? (hint: for example, if you want to use small molecule modulated promotors, like Tet-ON, you need mammalian)

It would be a bacterial system, because the circuit relies on a bacterial metal-responsive regulator (MerR) and because bacterial extracts are cost-effective.

d. How will your synthetic cell communicate with the environment? (hint: are substrates permeable? or do you need to express the membrane channel?)

The substrate will be permeable, since mercury ions will diffuse through the membrane pores. The ions will activate the genetic circuit that synthesizes AHL, which will in turn induce the bacteria to produce GFP.

  1. Experimental details

a. List all lipids and genes. (bonus: find the specific genes; for example, instead of just saying “small molecule membrane channel” pick the actual gene.)

Lipids: POPC, cholesterol

Genes: merR, PmerT, luxI

Cells: E. coli

b. How will you measure the function of your system?

The function will be measured by quantifying GFP output with flow cytometry.

Homework question from Kate Adamala

Freeze-dried cell-free systems can be incorporated into all kinds of materials as biological sensors or as inducible enzymes to modify the material itself or the surrounding environment. Choose one application field — Architecture, Textiles/Fashion, or Robotics — and propose an application using cell-free systems that are functionally integrated into the material. Answer each of these key questions for your proposal pitch:

  • Write a one-sentence summary pitch sentence describing your concept.
  • How will the idea work, in more detail? Write 3-4 sentences or more.
  • What societal challenge or market need will this address?
  • How do you envision addressing the limitation of cell-free reactions (e.g., activation with water, stability, one-time use)?

This proposal describes a smart textile integrated with freeze-dried cell-free biosensors capable of detecting airborne pollutants such as carbon monoxide or volatile organic compounds and producing a visible color change as a warning. The system consists of microencapsulated transcription–translation machinery embedded within fabric fibers, which becomes activated upon exposure to environmental humidity (e.g., sweat). When target gases diffuse into the material, they trigger engineered genetic circuits that induce the production of chromoproteins, enabling real-time, user-friendly detection. This technology addresses the growing need for accessible, portable air quality monitoring, especially in urban environments, while limitations of cell-free systems are mitigated through lyophilization with stabilizers, protective encapsulation, and the use of replaceable sensing patches.

Based on Atalie, D., & Fikre, Y., 2026 and Pardee et al., 2016

Homework question from Ally Huang

Freeze-dried cell-free reactions have great potential in space, where resources are constrained. As described in my talk, the Genes in Space competition challenges students to consider how biotechnology, including cell-free reactions, can be used to solve biological problems encountered in space. While the competition is limited to only high school students, your assignment will be to develop your own mock Genes in Space proposal to practice thinking about biotech applications in space!

For this particular assignment, your proposal is required to incorporate the BioBits® cell-free protein expression system, but you may also use the other tools in the Genes in Space toolkit (the miniPCR® thermal cycler and the P51 Molecular Fluorescence Viewer). For more inspiration, check out https://www.genesinspace.org/ .

  1. Provide background information that describes the space biology question or challenge you propose to address. Explain why this topic is significant for humanity, relevant for space exploration, and scientifically interesting. (Maximum 100 words)

Space radiation is a major challenge for long-term human space exploration, as it can damage DNA and compromise biological systems. Current radiation monitoring relies on physical sensors, which do not capture biological effects at the molecular level. Developing biological sensors that directly report DNA damage would provide more relevant information for astronaut health and system performance. Cell-free systems, such as BioBits®, offer a safe, portable, and programmable platform for detecting molecular changes without the need for living cells, making them ideal for space environments where resources and containment are limited.

  1. Name the molecular or genetic target that you propose to study. Examples of molecular targets include individual genes and proteins, DNA and RNA sequences, or broader -omics approaches. (Maximum 30 words)

DNA damage–responsive regulatory elements (e.g., SOS response promoters such as recA/lexA) controlling expression of a fluorescent reporter (GFP).

  1. Describe how your molecular or genetic target relates to the space biology question or challenge your proposal addresses. (Maximum 100 words)

Space radiation induces DNA damage, particularly double-strand breaks, which activate conserved bacterial DNA damage responses such as the SOS pathway. By incorporating DNA damage–responsive promoters into a cell-free system, it is possible to directly link radiation-induced molecular damage to reporter gene expression. This approach enables real-time biological detection of radiation effects, rather than indirect physical measurements. Such systems are highly relevant for monitoring biomolecular integrity in space habitats, spacecraft, and biological experiments, where understanding functional damage to DNA is critical for maintaining astronaut health and ensuring the reliability of biotechnological systems.

  1. Clearly state your hypothesis or research goal and explain the reasoning behind it. (Maximum 150 words)

The main hypothesis is that DNA damage caused by radiation exposure can be detected in a cell-free system by coupling DNA damage–responsive regulatory elements to the expression of a fluorescent reporter protein. Specifically, if DNA templates or regulatory components are sufficiently affected by radiation, the SOS-like response elements will activate transcription, leading to measurable GFP production. This system would function as a biological dosimeter, translating molecular damage into a quantifiable signal. The reasoning is based on well-characterized bacterial DNA damage responses, which can be reconstituted in vitro using cell-free transcription–translation systems. Demonstrating this capability would support the development of portable, low-resource biosensors for monitoring radiation-induced biological damage during space missions.

  1. Outline your experimental plan - identify the sample(s) you will test in your experiment, including any necessary controls, the type of data or measurements that will be collected, etc. (Maximum 100 words)

Freeze-dried BioBits® reactions containing DNA constructs with SOS-responsive promoters controlling GFP will be prepared. Samples will include: (1) non-exposed control, (2) radiation-exposed DNA templates, and (3) negative controls lacking promoter activation. DNA may be pre-amplified using miniPCR®. After rehydration, reactions will be incubated and fluorescence measured using the P51 Molecular Fluorescence Viewer. GFP intensity will be compared across conditions to assess activation of the damage-responsive system. Increased fluorescence in exposed samples relative to controls will indicate successful detection of radiation-induced DNA damage.