Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices

    The Biological Engineering Tool Tool: Portable Cell-Free Allergen Biosensor. Description: A single-use, portable reaction unit containing shelf-stable biological sensing reagents. Mechanism: The user introduces a small sample of food (solid or liquid) into the unit. The device initiates a biochemical reaction that specifically recognizes the molecular signature of a target allergen (e.g., peanut or soy). If the target is detected, the device triggers a distinct visual signal (such as a color change or fluorescence) within minutes.

  • Week 2 HW: DNA Read Write and Edit

    DNA Design Challenge Chosen Protein: I chose GFP because it serves as a robust reporter that could be used for my allergen biosensor. The goal of the device is to turn a biological detection (sensing peanut DNA) into a signal the user can see. GFP spontaneously fluoresces green when exposed to UV or blue light (like a simple black light LED). By designing the system so that GFP is activated only when the allergen is detected (or shut off in the presence of the allergen), I can create an intuitive user interface.

  • Week 3 HW: Lab Automation

    Project Overview: Cell-Free Allergen Biosensor I am hoping to develop a rapid, consumer-grade biosensor designed to detect trace allergens like peanut or soy in a restaurant setting. To prioritize speed and accuracy, I will use a DNA-to-RNA detection circuit. The workflow consists of three main stages: Extraction and Amplification: I could use RPA (Recombinase Polymerase Amplification) to exponentially copy target DNA (like the Ara h 1 gene) at a constant 37°C. Transcription: T7 RNA polymerase can concurrently convert that DNA into Trigger RNA. RNA Toehold Detection: This Trigger RNA can bind to a synthetic Toehold Switch, and unzip an RNA hairpin to allow the translation of a reporter protein. This can create a visible color change or induce luminescence in under 20 minutes. By using a cell-free protein synthesis system, the entire reaction is shelf-stable and functions without the need for a traditional lab environment.

  • Week 4 HW: Protein Design part I

    HW Questions: How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons) Meat is roughly 20% protein by mass, so there’s ~100g of protein in 500g of meat. Average amino acid molecular weight is ~110 Da. 100 g ÷ 110 g/mol ≈ 0.91 mol of amino acid residues × Avogadro’s number: 0.91 × 6.022 × 10²³ ≈ ~5.5 × 10²³ amino acid residues Why do humans eat beef but do not become a cow, eat fish but do not become fish?

  • Week 5 HW: Protein Design part II

    Part A: SOD1 Binder Peptide Design sequences, scores, structure, and properties for all peptides PepMLM binder generation Perplexity scores for known and generated peptides: Alphafold binder evaluation ipTM Values and Comparison to Known Binder The ipTM values across all peptides are low, ranging from 0.27 to 0.43, and none exceed 0.5 — the general threshold for confident protein-peptide interaction prediction. Notably, two PepMLM-generated peptides (Sequence_0 at 0.40 and Sequence_1 at 0.43) actually exceed the known binder (Sequence_4 at 0.32), suggesting the model produced candidates with comparable or slightly better predicted interface confidence. However, all predictions share the same binding

  • Week 6 HW: Genetic Circuits

    HW Questions What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose? Phusion DNA Polymease, which synthesized new DNA by adding new nucleotides to the template DNA during PCR dNTPs, which are the nucleotide building blocks (dATP, dGTP, dCTP, and dTTP) reaction buffer, which acts as a chemical stabilizer that maintains the ideal pH and salt balance so the enzyme stays active and can accurately build new DNA strands. What are some factors that determine primer annealing temperature during PCR? Melting temperature of the primer, which is the temperature at which half of the DNA complex dissociates Primer length, since longer primers usually require higher annealing temperatures GC content, since higher GC content typically increases the primer melting temperature Salt concentration, since higher salt concentrations can stabilize the DNA and thus may require higher annealing temperatures There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other. Feature PCR (Polymerase Chain Reaction) Restriction Enzyme Digest Mechanism Enzymatic Synthesis: Building new strands from primers. Enzymatic Cleavage: Cutting phosphodiester bonds at specific sites. Protocol Thermal Cycling: Repeated steps of denaturation (95°C), annealing (55-65°C), and extension (72°C). Isothermal Incubation: DNA and enzymes are mixed in a buffer and held at a constant temp (usually 37°C). Reagents DNA template, Primers, dNTPs, Taq Polymerase, MgCl2, Buffer. DNA template, Restriction Enzymes, specific BSA/Salt Buffer, Water. Pros High sensitivity; amplifies DNA; creates specific fragments without needing existing cut sites. Simple setup; highly reproducible; great for verifying known sequences or circular DNA. Cons Prone to contamination; requires known flanking sequences; potential for polymerase errors. Does not amplify DNA; limited by the location of natural recognition sites. When to Use When you have minimal DNA, need a custom fragment, or want to add “tails” for cloning. When linearizing plasmids, performing diagnostic checks, or subcloning existing inserts. How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning? Both the PCR and digested fragments must share identical overlapping terminal sequences (15–40 bp) with their neighboring fragments to allow for seamless homology-directed assembly. How does the plasmid DNA enter the E. coli cells during transformation? Membrane pores open due to a thermal pressure imbalance during the heat shock, allowing the plasmid DNA (which has been neutralized by calcium ions) to be pulled into the cell. Describe another assembly method in detail (such as Golden Gate Assembly) Golden Gate Assembly is a highly efficient “one-pot” cloning method that allows you to join multiple DNA fragments together simultaneously using Type IIS restriction enzymes and T4 DNA ligase. Unlike standard enzymes, Type IIS enzymes like BsaI bind to a specific recognition sequence but cut the DNA several nucleotides away, creating custom 4-base overhangs. By strategically designing these overhangs to be complementary, you can ensure that multiple fragments assemble in a specific, directional order. During the reaction, you cycle the temperature to repeatedly cut and ligate the DNA until the fragments are perfectly joined. A key advantage is that the enzyme’s recognition sites are positioned to be “cut off” and removed during the process, meaning the final product cannot be re-cut. This makes the reaction irreversible and drives the assembly toward the final, seamless circular plasmid. Because of this precision, Golden Gate is the gold standard for modular cloning and building complex multi-gene constructs. Simulating Golden Gate using AddGene’s tool

  • week 7 HW: genetic circuits part II

    Part 1: Intracellular Artificial Neural Networks (IANNs) What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions? Traditional genetic circuits can only read a signal as ON/OFF, even though molecules inside a cell exist at all kinds of intermediate concentrations. To build something complex out of ON/OFF switches, you have to layer many of them together, and each added layer introduces new opportunities for components to accidentally influence each other or fall out of sync. IANNs instead pass graded responses between nodes. Each node receives an actual concentration value, weighs it, and passes a continuous output forward. This means a single node carries far more information than an ON/OFF switch, so you need fewer of them to represent something complex, and there are fewer points at which things can go wrong.

Subsections of Homework

Week 1 HW: Principles and Practices

allergen_image allergen_image

The Biological Engineering Tool

Tool: Portable Cell-Free Allergen Biosensor.

Description: A single-use, portable reaction unit containing shelf-stable biological sensing reagents.

Mechanism: The user introduces a small sample of food (solid or liquid) into the unit. The device initiates a biochemical reaction that specifically recognizes the molecular signature of a target allergen (e.g., peanut or soy). If the target is detected, the device triggers a distinct visual signal (such as a color change or fluorescence) within minutes.

Why: To increase food confidence and reduce anxiety for people with dietary restrictions. While the primary function is to prevent anaphylaxis, the secondary goal is to validate “mystery foods” (like sauces or baked goods) in social settings, allowing users to eat with peace of mind rather than fear.


Governance & Policy Goals

Primary Goal: Non-Malfeasance (Preventing Harm via Rigorous Safety)

  • Sub-Goal A (Reliability & Trust): The tool must provide high confidence in negative results (telling a user a food is safe). If a user trusts the device and eats a “safe” food that is actually contaminated (False Negative), the physical harm is severe. Conversely, if the device constantly cries “wolf” (False Positive), the user loses confidence and stops using it, returning to a state of anxiety.
  • Sub-Goal B (User Safety & Disposal): Ensure that the device—which may contain chemical neutralizing agents or biological waste—is safe to handle during use and safe to dispose of, without introducing new chemical hazards to the user.

Governance Actions (The Options)

Option 1: The “Matrix Stress Test” Certification

  • Purpose: Mandates that the sensor must be proven to work in “Worst Case Matrices” (foods high in fat, sugar, or acidity) to prevent false negatives caused by complex food chemistry interfering with the sensor.
  • Actor: FDA / Food Safety Regulators.

Option 2: The “Fail-Safe” Internal Control

  • Purpose: Every unit must contain a secondary “Positive Control” mechanism that signals if the biological reagents are functional. If this control signal is missing, the user knows the test is broken/expired. This is critical for confidence: a user needs to know the difference between “This food is safe” and “The test didn’t work.”
  • Actor: The Company / Product Designers.

Option 3: Hazardous Containment & Neutralization Regulations

  • Purpose: If the device uses a chemical “kill switch” (e.g., a bleaching agent or strong acid) to neutralize the biological components before disposal, strict regulations must govern the containment of these hazardous materials.
  • Actor: Consumer Product Safety Commission (CPSC) / Environmental Regulators.
  • Key Policy: Mandate “Child-Resistant” sealing mechanisms and clear, high-contrast warning labels (e.g., “CAUTION: CORROSIVE CONTENTS”) to prevent users from accidentally exposing themselves to the neutralizing chemicals inside.

Scoring Matrix

Scoring Key:

  • 1 = Strong Positive Impact (Best Outcome)
  • 2 = Moderate Impact / Minor Trade-off
  • 3 = Weak Impact / Negative Trade-off / Not Applicable
Does the option:Option 1
(Matrix Stress)
Option 2
(Fail-Safe)
Option 3
(Safe Containment)
Enhance BiosecurityN/AN/A1
• By preventing incidents331
• By helping respond333
Foster User Safety113
• By preventing incident (False Negatives)1 (High)1 (High)3
• By helping respond (Minimizing harm)313 (Risk of Leaks)
Protect the environment331
• By preventing incidents331
• By helping respond333
Other considerations
• Minimizing costs and burdens to stakeholders2 (Dev Cost)2 (Complexity)2 (Packaging Cost)
• Feasibility?211
• Not impede research221
• Promote constructive applications112

Recommendation & Prioritization

Recommendation: I would prioritize Option 2 (The Fail-Safe Internal Control), followed by Option 1.

Reasoning:

  • Why Option 2 First? To build confidence in “mystery foods,” ambiguity is the enemy. If the device fails silently (e.g., due to storage conditions), the user may assume the food is safe. Establishing a positive indication of functionality is the only way to give the user the peace of mind they are looking for.
  • The Trade-off of Option 3: While neutralizing biological waste is important for environmental governance, introducing a toxic, neutralizing chemical creates a new chemical safety hazard for the user. If the containment fails, the user could be burned or injured. Therefore, regulations on how that chemical is contained (Warning Labels, Shatter-proof casing) are critical, but the risk of injury might outweigh the benefit of neutralizing trace amounts of biological material.

Reflection

In class, we discussed the ‘Responsibility of the Toolmaker.’ If I build a tool that claims to detect peanuts, I am effectively taking responsibility for that person’s life for that meal. The ethical weight of a ‘False Negative’ here is far heavier than in other biodesign projects. This made me realize that ‘Accuracy’ isn’t just a technical spec; it’s an ethical requirement. If I can’t guarantee >99% accuracy across all food types, is it ethical to release the product at all?

Proposed Governance: We might need a ‘Beta Testing Transparency’ law. Startups often release ‘beta’ products to iterate quickly. However, for safety diagnostics, ‘Beta’ labels are insufficient. There should be governance prohibiting the release of ‘beta’ medical/safety diagnostics to consumers until they are fully validated.


Pre-Lecture Questions

1. The biological synthesis of DNA using an error-correcting polymerase has an error rate of 1 * 10^6 (one error for every million base pairs added). The human genome is 3.2 Gbp (3.2 billion base pairs). 3,200,000,000 * (1 error)/(1,000,000 bases) = 3,200 errors per cell division. The cell would accumulate thousands of mutations every time it divides, which is too high for a complex organism to survive. Biology resolves this discrepancy by employing a post-replication “spell check” mechanism known as the MutS repair system. This system uses specific proteins (MutS, MutL, and MutH) to scan the DNA for mismatches that the polymerase missed. To ensure it fixes the right letter, the system distinguishes the correct “template” strand from the error-prone “new” strand by looking for methylation markers; the old strand is methylated, while the newly made strand is not. The system then cuts the new strand, removes the section containing the error using exonucleases, and fills in the correct sequence.

2. The number of different ways to code for an average human protein is very large. The average human protein coding sequence is approximately 1,036 bp long, which is roughly 345 amino acids. Because the genetic code is redundant (multiple three-letter DNA codons for most amino acids), there are on average about three different options for every single position in the protein chain. To find the total number of combinations, you would multiply these options for every amino acid (3 times 3 times 3… for 345 times), resulting in approximately 10 to the power of 164 different DNA sequences capable of making the same protein.

In practice, however, the vast majority of these theoretical codes will not work inside a cell due to several biological and physical constraints. One major issue is RNA folding, or secondary structure; as shown in the NUPACK analysis slides, specific nucleotide sequences can twist into tight knots or hairpins based on their “Minimum Free Energy”. If the code you choose creates a tight structure near the start of the molecule, the ribosome may be unable to latch on, preventing the protein from ever being made. Additionally, cells possess “cleanup” enzymes like RNase III that hunt for specific sequence patterns or structures to destroy old or foreign RNA. If your engineered sequence accidentally creates one of these cleavage targets, the cell’s own immune-like system will chop up the instructions before they can be used. Finally, sequences that are extremely repetitive or have difficult chemical properties (such as improper GC content) can be nearly impossible to synthesize or assemble reliably in the lab without introducing errors.

3. The standard method for oligonucleotide synthesis is the phosphoramidite cycle, a solid-phase chemical process that builds DNA strands one base at a time through a repeating four-step sequence. The cycle begins with deprotection, where a blocking group is removed from the sugar of the previous nucleotide, followed by coupling to attach a new building block. This is followed by a capping step to block any unreacted chains from continuing with an incorrect sequence, and finally, oxidation to stabilize the newly formed phosphate linkage.

4. The difficulty in synthesizing oligos longer than 200 nucleotides arises from the cumulative effect of yield decay and chemical imperfections. Even with a highly efficient coupling rate of 99%, the percentage of sequence-perfect, full-length product decreases exponentially with every added base, leaving a mixture dominated by truncated “failure” sequences. Furthermore, as the strand grows, the likelihood of the DNA forming secondary structures increases, which can physically shield the growing end of the chain from the incoming chemicals and prevent successful reactions.

5. It is impossible to make a 2000bp gene through direct synthesis because no chemical process is accurate enough to maintain high fidelity over thousands of consecutive steps. Instead, large genes are produced using a hierarchical assembly approach where shorter, high-quality oligonucleotides are synthesized first and then “stitched” together. These overlapping fragments are joined using enzymatic methods such as Polymerase Chain Reaction (PCR) or Gibson Assembly, allowing for the construction of long, complex sequences while providing opportunities to filter out errors that occurred during the initial oligo printing.

6. In most animals, there are 10 essential amino acids that they simply can’t make themselves and have to get from food. These are Phenylalanine, Valine, Threonine, Tryptophan, Isoleucine, Methionine, Histidine, Arginine, Leucine, and Lysine. In biology, we look at these as the fundamental code—the “basepair code” and ribosomal translation that turns 4 RNA bases into 20+ amino acids to build life.

When you look at the “Lysine Contingency” from Jurassic Park through this lens, the logic is actually pretty weak. Since lysine is already an essential amino acid for almost all animals, every creature in the wild is technically already on a “lysine contingency”. If an engineered animal escaped, it would just find lysine by eating natural plants or other animals.

Modern tech, like the Genomically Recoded Organisms (GROs) mentioned in the slides, takes this concept much further. Instead of relying on something common like lysine, scientists are swapping out codons to create “metabolic isolation”. They engineer life to require Non-Standard Amino Acids (NSAAs)—synthetic building blocks that don’t exist in nature. If these organisms don’t get their specific lab-made “fuel,” their proteins won’t fold, and they won’t survive.

We’re even looking at “Mirror World” life, where the chirality of DNA and proteins is flipped. Since natural life uses L-amino acids and B-DNA, a “mirror” organism would be totally invisible to natural viruses and couldn’t exchange nutrients with the wild. It’s a much more secure “lock” than just hoping the dinosaurs don’t find a snack.

Week 2 HW: DNA Read Write and Edit

gel_art_image gel_art_image

DNA Design Challenge

Chosen Protein: I chose GFP because it serves as a robust reporter that could be used for my allergen biosensor. The goal of the device is to turn a biological detection (sensing peanut DNA) into a signal the user can see. GFP spontaneously fluoresces green when exposed to UV or blue light (like a simple black light LED). By designing the system so that GFP is activated only when the allergen is detected (or shut off in the presence of the allergen), I can create an intuitive user interface.

Amino Acid Sequence:

sp|P42212|GFP_AEQVI Green fluorescent protein OS=Aequorea victoria OX=6100 PE=1 SV=1 MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK

Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.: atgagtaaag gagaagaact tttcactgga gttgtcccaa ttcttgttga attagatggt gatgttaatg ggcacaaatt ttctgtcagt ggagagggtg aaggtgatgc aacatacgga aaacttaccc ttaaatttat ttgcactact ggaaaactac ctgttccatg gccaacactt gtcactactt tctgttatgg tgttcaatgc ttttcaagat acccagatca tatgaaacag catgactttt tcaagagtgc catgcccgaa ggttatgtac aggaaagaac tatatttttc aaagatgacg ggaactacaa gacacgtgct gaagtcaagt ttgaaggtga tacccttgtt aatagaatcg agttaaaagg tattgatttt aaagaagatg gaaacattct tggacacaaa ttggaataca actataactc acacaatgta tacatcatgg cagacaaaca aaagaatgga atcaaagtta acttcaaaat tagacacaac attgaagatg gaagcgttca actagcagac cattatcaac aaaatactcc aattggcgat ggccctgtcc ttttaccaga caaccattac ctgtccacac aatctgccct ttccaaagat cccaacgaaa agagagatca catggtcctt cttgagtttg taacagctgc tgggattaca catggcatgg atgaactata caaa

Codon optimization: Codon optimization is necessary because the genetic code is redundant; while multiple codons can specify the same amino acid, different organisms utilize these codons with varying frequencies. By rewriting the sequence to use the host’s “preferred” codons, we ensure that the corresponding tRNA molecules are readily available, which prevents ribosomal stalling and significantly accelerates protein synthesis. Additionally, optimization allows for the removal of inhibitory mRNA secondary structures and “unfavorable” sequences that could lead to premature degradation of the genetic instructions, ultimately ensuring a more robust and rapid visual signal for the user

For this assignment, I have chosen to optimize the sequence for Escherichia coli. As the most well-characterized model organism in synthetic biology, E. coli provides a reliable and standard baseline for protein expression with highly optimized commercial algorithms available for sequence design.

While E. coli is the selected host due to the robust, pre-set optimization tools available on platforms like Twist Bioscience, it serves as a functional baseline for standardization and ease of integration into common laboratory workflows. However, the ideal biological choice for a timely sensor would be Vibrio natriegens (TaxID: 1219067), which possesses a doubling time of under 10 minutes and a significantly higher ribosomal density. Utilizing E. coli for this optimization ensures a high Codon Adaptation Index (CAI) and reliable synthesis, though a custom-optimized Vibrio sequence would remain the preferred engineering solution for achieving maximum metabolic speed in a real-world application.

Optimized using Twist’s Codon optimization tool for E. Coli, avoiding EcoRI, BamHI, XhoI, HindIII, BasI, and BbsI cut sites. ATGTCCAAAGGTGAAGAGTTGTTTACCGGCGTTGTTCCCATCTTAGTGGAGCTCGACGGAGATGTCAATGGTCACAAATTCAGTGTATCAGGTGAAGGGGAAGGCGACGCGACATACGGGAAACTTACCTTAAAATTTATATGCACCACCGGCAAATTGCCCGTACCATGGCCAACGTTAGTGACCACCTTTTCCTACGGTGTCCAGTGCTTTTCACGGTACCCGGATCACATGAAACAGCACGACTTCTTCAAGTCCGCGATGCCGGAAGGCTACGTTCAAGAGCGCACCATATTCTTCAAGGATGACGGAAACTACAAGACGCGAGCAGAGGTTAAGTTCGAGGGAGACACCTTGGTAAATCGAATTGAATTAAAAGGCATTGACTTCAAGGAAGATGGAAACATCCTGGGCCATAAGCTGGAGTACAACTACAATAGTCATAATGTTTACATCATGGCGGATAAACAAAAGAATGGTATCAAGGTCAACTTCAAGATACGACACAATATCGAAGATGGATCTGTCCAATTAGCGGACCACTACCAGCAAAATACCCCCATTGGTGATGGTCCAGTTCTGCTCCCGGACAACCACTATTTGAGTACACAGTCGGCCCTCTCTAAGGACCCTAACGAAAAGCGGGACCATATGGTGCTCCTGGAATTTGTAACGGCCGCCGGAATTACCCACGGCATGGACGAGCTGTACAAATGA

codon_optimization codon_optimization

Now what?:

8The Transcription and Translation Process The process occurs in two stages. First, transcription happens when RNA Polymerase binds to the promoter and creates a messenger RNA (mRNA) strand. Next, translation begins as ribosomes dock onto the mRNA at the Ribosome Binding Site. The ribosome “reads” the mRNA codons and recruits tRNA molecules to assemble amino acids—starting at your M (Methionine) and ending at the TAA stop codon—folding the chain into a functional protein.

Cell-Dependent Production In cell-dependent systems, the DNA is inserted into a plasmid and transformed into a living host like E. coli or V. natriegens. The host’s metabolism provides the energy (ATP) and raw amino acids needed for synthesis. While highly scalable for mass production, this method requires significant time for cell growth and complex purification steps to separate your target protein from the host’s cellular components.

Cell-Free Production A cell-free (TX-TL) system mixes the linear DNA fragment directly with a lysate containing ribosomes and enzymes harvested from broken-open cells. This might be the ideal “timely” choice for a sensor because it eliminates the need to wait for living cultures to grow. It allows for immediate protein production and is more resilient to the lysis buffers used in food testing, though it is generally more expensive for large-scale use.


Prepare a Twist DNA Synthesis Order

Linear Map (Annotated) annotated_map annotated_map

Twist Cloning Vector cloning_vector_pic cloning_vector_pic


DNA Read/Write/Edit

Read

What DNA would you want to sequence (e.g., read) and why? I would sequence the metagenomic DNA from the human gut microbiome of patients with neurodegenerative diseases like Alzheimer’s. Research into the “gut-brain axis” suggests that microbial diversity and specific bacterial metabolites (such as short-chain fatty acids) directly influence neuroinflammation. By reading the entire microbial community, we can identify specific species that are neuroprotective or neurotoxic, providing a non-invasive way to discover early biomarkers for brain health.

In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why? I would use Illumina Next-Seq because it offers the optimal balance of accuracy and throughput required for complex microbiome analysis. When sequencing a fecal sample to understand the gut-brain axis, we are looking at a soup of DNA from thousands of different bacterial species. To detect rare but potentially influential neuro-active microbes, we need deep sequencing, meaning we must read the sample millions of times. Illumina’s second-generation technology is the industry leader for this because its cost per base is significantly lower than third-generation methods (like Nanopore) and its base-calling accuracy is higher (99.9%), ensuring that we don’t misidentify a bacterial species due to a sequencing error.

  1. This is a second-generation technology. It is defined by its “sequencing-by-synthesis” (SBS) method, which uses massive parallelism to read millions of short DNA fragments simultaneously with extremely high accuracy.
  2. The input is high-quality genomic DNA extracted from fecal samples. The essential preparation steps are fragmentation, adapter ligation, and PCR amplification
  3. The sequencer adds fluorescently labeled nucleotides one at a time. Each time a base is incorporated, it emits a specific color. A high-resolution camera records these flashes, and the software performs base calling by converting the sequence of colors into a digital string of A, C, G, and T.
  4. The output is a FASTQ file containing millions of short, highly accurate “reads” which are then computationally assembled to map the microbial species present in the gut.

Write

What DNA would you want to synthesize (e.g., write) and why? I would synthesize a closed-loop genetic logic gate circuit for use in iPSC-derived human microglia. This circuit would be designed to sense high levels of pro-inflammatory cytokines (like IL-1β) or Amyloid-beta and respond by triggering the production of a therapeutic anti-inflammatory protein (like IL-10) or a fluorescent reporter like mCherry for real-time monitoring of disease state in the lab. This would be useful because current drugs for Alzheimer’s are always on, leading to side effects. A synthetic circuit in iPSC-derived microglia could trigger a therapeutic response (like releasing a clearing enzyme) only when it senses high levels of Amyloid-beta or inflammatory cytokines.

What technology or technologies would you use to perform this DNA synthesis and why? I would use Twist Bioscience’s silicon platform because synthetic genetic circuits require the writing of many different regulatory parts (promoters, insulators, and reporters) that must work with each other. Twist’s technology is superior for this because it uses a semiconductor-based approach to synthesize thousands of Gene Fragments in parallel on a single silicon chip. For a BME project, this allows me to multiplex. I can order 50 variations of my Alzheimer’s-sensing circuit with slightly different promoter strengths to see which one has the best signal-to-noise ratio in my iPSC-derived microglia.

  1. Twist utilizes a semiconductor-based silicon platform to perform traditional phosphoramidite chemistry in miniature. The process involves a four-step cycle: De-protection (removing a blocking group), Coupling (adding the next nucleotide), Capping (preventing incorrect chains from growing), and Oxidation (stabilizing the bond). This is repeated for each base until the custom sequence is complete.

  2. Limitations:

  • Speed: Synthesis and shipping typically take 5–10 days, which is slower than biological replication.
  • Accuracy: Chemical synthesis has a small error rate that compounds with length; therefore, sequences longer than 2kb must be built by stitching together smaller, verified fragments.
  • Scalability: While the silicon platform allows for synthesizing thousands of different genes at once, the cost per base remains higher than large-scale natural DNA replication.

Edit

What DNA would you want to edit and why? I would want to edit the human APOE gene to convert the high-risk APOE4 allele into the protective APOE2 allele. APOE4 is the strongest genetic risk factor for late-onset Alzheimer’s, while APOE2 is known to be neuroprotective. Making this switch in human neural stem cells could fundamentally change a person’s risk profile and slow the progression of neurodegeneration.

What technology or technologies would you use to perform these DNA edits and why? I would choose CRISPR Base Editing specifically because of its safety profile in post-mitotic or sensitive cells like those found in the brain. Standard CRISPR-Cas9 (first-generation editing) creates Double-Strand Breaks, which can trigger a p53-mediated toxicity response or cause large, unintended deletions that could be catastrophic in a neural environment. Base Editing is the superior choice for a therapeutic application in Alzheimer’s research because it performs a search and replace at the single-atom level. By chemically converting the target nucleotide without ever cutting the DNA backbone, we minimize the risk of genomic instability while achieving the precise C -> T flip needed to convert the APOE4 risk allele into the protective APOE2 variant

  1. Base editing uses a “deactivated” Cas9 (dCas9) or a nickase (nCas9) fused to a deaminase enzyme. Unlike standard CRISPR, it does not cut the DNA backbone. The process begins with targeting, where a custom-designed guide RNA leads the Base Editor complex to the precise location of the APOE SNP within the genome. Once the target is reached, the Cas9 domain performs unzipping, pulling the DNA strands apart to create a localized window of single-stranded DNA. Finally, the chemical conversion occurs; the fused deaminase enzyme physically rearranges the atoms of a specific base—for instance, converting a Cytosine into a Uracil. The cell’s natural repair machinery then recognizes this change and converts it into a Thymine (C -> T), effectively flipping the genetic switch from a risk allele to a protective one without ever creating a double-strand break.
  2. For preparation, we must design a guide RNA with a 20bp spacer that is unique to the APOE locus. The input is typically the Base Editor protein (or mRNA) and the synthetic gRNA, delivered via lipid nanoparticles or viral vectors.
  3. Despite its precision, the method faces a significant challenge in efficiency, as not every target cell will successfully receive the editor or undergo the chemical conversion. This results in mosaicism, a state where only a fraction of the neural population is corrected while others remain in the high-risk state. Furthermore, the technology is limited by its precision regarding bystander editing. If other identical bases (such as multiple Cytosines) are located within the narrow activity window of the deaminase enzyme, the editor may unintentionally change those nearby bases as well, potentially leading to unintended genetic modifications.

Week 3 HW: Lab Automation

Project Overview: Cell-Free Allergen Biosensor

I am hoping to develop a rapid, consumer-grade biosensor designed to detect trace allergens like peanut or soy in a restaurant setting. To prioritize speed and accuracy, I will use a DNA-to-RNA detection circuit.

The workflow consists of three main stages:

  1. Extraction and Amplification: I could use RPA (Recombinase Polymerase Amplification) to exponentially copy target DNA (like the Ara h 1 gene) at a constant 37°C.
  2. Transcription: T7 RNA polymerase can concurrently convert that DNA into Trigger RNA.
  3. RNA Toehold Detection: This Trigger RNA can bind to a synthetic Toehold Switch, and unzip an RNA hairpin to allow the translation of a reporter protein. This can create a visible color change or induce luminescence in under 20 minutes.

By using a cell-free protein synthesis system, the entire reaction is shelf-stable and functions without the need for a traditional lab environment.


Automation Stack

  • Nebula: I could use this for computational modeling to predict the thermodynamic stability ($\Delta G$) of different toehold designs. It can help me pick sequences that are stable at room temperature but unzip quickly when triggered.
  • Opentrons OT-2: This is a good tool for kinetic screening. I can use it to automate the distribution of reagents and synchronize the start of up to 96 reactions across a plate, ensuring my “time-to-detection” data is precise.
  • Ginkgo: I can look to foundry-scale automation for validating my optimized sensors against complex food matrices (oils, acids, salts) to minimize false negatives.

Kinetic Screening Pseudo-code

This protocol could be used to screen my Nebula-designed candidates to find the one with the fastest response time.

# SENSOR OPTIMIZATION PROTOCOL

# REAGENTS
SET Designs = [Nebula_A, Nebula_B, Nebula_C, Nebula_D]
SET Concentrations = [0, 10, 50, 100] # ppm of Ara h 1 DNA

# 1. REAGENT DISTRIBUTION
FOR EACH well IN plate:
    DISPENSE 10uL Cell-Free/RPA Master Mix

# 2. SENSOR LOADING
FOR i, design IN enumerate(Designs):
    DISPENSE 2uL design INTO Column(i)

# 3. SYNCHRONIZED REACTION START
# Using multichannel to eliminate timing offsets
FOR j, conc IN enumerate(Concentrations):
    MULTICHANNEL_DISPENSE conc INTO Row(j)

# 4. KINETIC DATA COLLECTION
WHILE time < 20 minutes:
    READ_ABSORBANCE (570nm) EVERY 30 seconds
    LOG_DATA(well_id, time, signal_intensity)

# 5. ANALYSIS
RANK designs BY (Signal-to-Noise Ratio) AND (Time-to-Threshold)

Novel Lab Automation for Brain Organoids

Overview Organoids are 3D in vitro models that closely replicate the biology and physiology of their in vivo counterparts, making them highly valuable for developmental research and disease modeling. However, traditional manual cell culture protocols often lack consistency and expose these complex models to erratic swings in nutrient availability and the buildup of toxic metabolites, which can lead to cellular stress.

To address these limitations, researchers developed the “Autoculture” platform, a modular, automated microfluidic system designed to optimize 3D organoid growth. This Internet of Things (IoT)-enabled system uses a custom 24-well polydimethylsiloxane (PDMS) chip to house individual organoids in isolated microenvironments. The platform automatically delivers precisely scheduled media and removes waste without the need to take the cultures out of the incubator, offering high spatiotemporal resolution and customizable feeding schedules.

Findings When tested on cerebral cortex organoids over an 18-day period, the Autoculture platform supported robust growth and accurate neural differentiation comparable to conventional orbital shaker suspension methods. Crucially, RNA sequencing revealed that the automated microfluidic environment significantly reduced markers of cellular stress. Specifically, organoids grown in the automated system showed a marked downregulation in genes associated with canonical glycolysis and endoplasmic reticulum (ER) stress when compared to those maintained in traditional suspension cultures.

figure_2 figure_2

paper_link

Week 4 HW: Protein Design part I

HW Questions:

  1. How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)

    • Meat is roughly 20% protein by mass, so there’s ~100g of protein in 500g of meat. Average amino acid molecular weight is ~110 Da.
    • 100 g ÷ 110 g/mol ≈ 0.91 mol of amino acid residues × Avogadro’s number: 0.91 × 6.022 × 10²³ ≈ ~5.5 × 10²³ amino acid residues
  2. Why do humans eat beef but do not become a cow, eat fish but do not become fish?

    • During digestion, proteases break down the cow/fish proteins down into their constituent free amino acids and small peptides. These are then absorbed as monomers. Your ribosomes then reassemble them according to your own mRNA instructions.
  3. Why are there only 20 natural amino acids?

    • Likely a combination of three things. First, once the genetic code co-evolved around these 20 amino acids, any change would catastrophically mis-translate the entire proteome, so those 20 were ’locked in’. Second, the 20 cover the necessary chemical space (charged, polar, hydrophobic, aromatic, etc). Third, the simplest amino acids (Gly, Ala, Asp, Glu, Val…) are exactly the ones most readily produced abiotically, so the code evolved around what was chemically accessible early on. Selenocysteine and pyrrolysine as the “21st and 22nd” amino acids show the code can expand, but only under very constrained circumstances.
  4. Where did amino acids come from before enzymes that make them, and before life started?

    • They form spontaneously from simple chemistry. Amino acids are thermodynamically reasonable products.
    • The Miller-Urey experiment showed electric discharge through CH₄, NH₃, H₂O and H₂ produces Gly, Ala, Asp and others.
    • Hydrothermal vents provide mineral catalysts and redox gradients. Meteorites (like Murchison) contain over 70 amino acids synthesized in space, confirming abiotic production is universal wherever C, N, O, H and energy coexist.
  5. If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?

    • Left-handed. L-amino acids prefer φ≈−57°, ψ≈−47° → right-handed helix. D-amino acids are the mirror image → φ≈+57°, ψ≈+47° → left-handed helix.
  6. Can you discover additional helices in proteins?

    • Beyond the α-helix, several others exist.
    • The 3₁₀-helix is tighter (i→i+3 H-bonds), more strained, and common at helix termini.
    • The π-helix is wider (i→i+5 H-bonds) and surprisingly prevalent at functional sites — maybe 15% of proteins contain at least one π-turn.
    • The polyproline II (PPII) helix has no intramolecular H-bonds at all, is left-handed, and is extremely common in disordered regions, collagen, and signaling domains (SH3 recognition).
    • The collagen triple helix is three intertwined PPII-like chains stabilized by interchain H-bonds.
    • New folds continue to emerge from cryo-EM and AlphaFold-era structural biology, so the catalogue is probably not closed.
  7. Why are most molecular helices right-handed?

    • Most molecular helices are right-handed because all biological amino acids are L-form, so their bond geometry naturally favors coiling clockwise when chained together.
  8. Why do β-sheets tend to aggregate?

    • Edge strands have unpaired backbone NH and C=O groups pointing outward (they’re basically unsatisfied H-bond donors and acceptors, which makes them inherently “sticky.”)
    • Flat hydrophobic surfaces on sheet faces also drive stacking through the hydrophobic effect.
    • Side chains interdigitate into a tight steric zipper, which is favorable in both enthalpy and entropy, and thus very hard to prevent without chaperones or proline residues.
  9. Why do many amyloid diseases form β-sheets? Can you use amyloid β-sheets as materials?\

    • For many disease-associated sequences (Aβ in Alzheimer’s, α-synuclein in Parkinson’s, IAPP in type II diabetes), the amyloid state is actually thermodynamically more stable than the native fold. Stress, concentration increases, mutations, or metal ions can nucleate conversion, after which elongation proceeds rapidly like crystal growth. The resulting cross-β architecture is extraordinarily stable, resistant to heat, detergent, and proteases.
    • As materials, amyloid fibrils are definitely useful. They can have a Young’s modulus of 1–20 GPa (similar to silk), high aspect ratio, and nanoscale precision. Functional amyloids already exist naturally (curli fibers in bacterial biofilms, yeast prions as regulatory switches). Proposed applications include conductive nanowires (metallized with silver or gold), hydrogels for drug delivery, tissue engineering scaffolds, and even food technology(whey protein amyloids are already used commercially as emulsifiers).

Protein Analysis and Visualization

  1. Insulin is a small hormone produced by the pancreas that regulates blood glucose levels. I chose it because of its fascinating history as a therapeutic protein. Decades of protein engineering have produced analogs like insulin lispro and glargine, where just one or two amino acid changes dramatically alter how the drug behaves in the body. I wanted to explore the structure underlying a protein that has been so deliberately and successfully redesigned.
  2. Sequence:
    • GIVEQCCTSICSLYQLENYCN/FVNQHLCGSHLVEALYLVCGERGFFYTPK
    • length: 51 amino acids
    • most common residues are C and L, which appear 6 times each
    • this protein has 228 homologs
    • this specific insulin protein is in the broader insulin family
  3. Protein Structure Page
    • The structure was released on February 24, 2009. (Note: While the very first insulin structure was solved in 1969, this version (3E7Y) is the modern high resolution reference for the native human protein).
    • The quality is excellent - 1.60 Å
    • There are other molecules in the solved structure of the protein, as this classic structure represents the storage form (hexamer). It contains zinc ions, chloride ions, and water molecules
    • This protein is the defining member of the Insulin-like superfamily.
  4. 3D Structure
  • cartoon insulin_cartoon insulin_cartoon
  • ribbon insulin_ribbon insulin_ribbon
  • sticks insulin_sticks insulin_sticks
  • secondary structure: the protein has more helices than sheet insulin_secondary_structure insulin_secondary_structure
  • protein surface: the distribution of hydrophobic (orange) vs hydrophillic (cyan) residues follows as expected. Most of the surface residues are hydrophillic, and the hydrophobic residues line the binding pockets insulin_residue_type insulin_residue_type

Using ML-Based Protein Design Tools

For this section, I will be using the same Insulin protein from the prior section.

GIVEQCCTSICSLYQLENYCNFVNQHLCGSHLVEALYLVCGERGFFYTPKT

PDB ID: 3E7Y

C1.1. Mutational Scan

Mutation_Heatmap Mutation_Heatmap

Darker columns (low probablities mean mutations are unlikely) indicate more conserved residues (wild-type is strongly preffered). Likely, these are amino acids which are most crucial to the protein’s structure and function.

C1.2. Latent Space Analysis

Latent_Space Latent_Space

The neighborhoods in this map are determined by the high-dimensional proximity of ESM-2 embeddings, which represent the structural and evolutionary patterns of the sequences. In the ASTRAL SCOP dataset used here, these spatial clusters correspond to specific protein superfamilies and folds. Hovering over any cluster confirms that neighbors share identical or closely related SCOP classification codes, verifying that the visualization effectively approximates biological and structural similarity.

The insulin protein is represented by the lime green dot situated within a dense cluster of other signaling proteins. It is positioned in this specific neighborhood because the ESM-2 model identified its sequence signature, particularly the conserved cysteine motif, as being highly similar to the other sequences in that region. The predominant identities of its closest neighbors are other human and mammalian insulins, as well as members of the Insulin/IGF/Relaxin superfamily like IGF-1 and IGF-2. This proximity indicates that the model recognizes your protein as sharing the same disulfide-rich fold that defines the structural template for this entire cluster.

C2.1

folding_comp folding_comp

The predicted structure matches the experimentally determined structure.

C2.2

Mutating some portions had minimal impact on predicted structure, while mutations/deletions in other portions (the beginning for instance) meaningfully altered the predicted structure.

C3.1

The designed sequence obtained using inverse folding was:

GIEELCCESSCTPEELAEYCN/SSSGRYCGEELIEALAEVCGERGFTYAPP

The inverse folding analysis of the 3E7Y backbone yielded a native sequence score of 1.6521 and a ProteinMPNN-designed score of 0.9312, where the lower value for the design indicates a higher model-perceived likelihood that the new sequence will stabilize the target structure. This optimization is reflected in a sequence recovery of 0.5200, suggesting that while the model preserved 52% of the original residues to maintain the structural core, it proposed alternative mutations for the remaining 48% to improve the fit. The amino acid probability chart supports these scores by revealing that the model’s high confidence is concentrated at key positions like the cysteines, which are essential for the disulfide-rich insulin fold, while the lower scores for the redesign likely stem from the model finding more energetically favorable residues for the flexible, darker-colored regions on the heatmap.

aa_probabilities aa_probabilities

C3.2

The predicted structure of the inverse-folding result was extremely similar to the ESMFold result of the original sequence.

folding_comp_2 folding_comp_2

Week 5 HW: Protein Design part II

SOD1_dimer SOD1_dimer

Part A: SOD1 Binder Peptide Design

sequences, scores, structure, and properties for all peptides

PepMLM binder generation

  • Perplexity scores for known and generated peptides:
perplexity_scores perplexity_scores

Alphafold binder evaluation

  • ipTM Values and Comparison to Known Binder

The ipTM values across all peptides are low, ranging from 0.27 to 0.43, and none exceed 0.5 — the general threshold for confident protein-peptide interaction prediction. Notably, two PepMLM-generated peptides (Sequence_0 at 0.40 and Sequence_1 at 0.43) actually exceed the known binder (Sequence_4 at 0.32), suggesting the model produced candidates with comparable or slightly better predicted interface confidence. However, all predictions share the same binding

peptide_0_alphafold peptide_0_alphafold peptide_1_alphafold peptide_1_alphafold peptide_2_alphafold peptide_2_alphafold peptide_3_alphafold peptide_3_alphafold peptide_4_alphafold peptide_4_alphafold

Location note: surface-exposed, near the dimer interface, and not near the A4V mutation site.

Important caveat: none of these peptides, including the known binder, appear to engage the mutation directly, which limits their utility for direct mutant stabilization.

Peptiverse property evaluation

  • Peptiverse vs. AlphaFold3 Structural Predictions

There is no consistent correlation between ipTM and predicted binding affinity. Sequence_1 has the highest ipTM (0.43) but one of the weaker affinities (5.183 pKd/pKi), while the known binder has a modest ipTM (0.32) yet the strongest predicted affinity (5.968 pKd/pKi) — suggesting the two metrics are capturing different aspects of binding. Encouragingly, all peptides are fully soluble (P=1.00) and non-hemolytic, presenting clean safety profiles. The known binder is the closest to a hemolysis concern (P=0.047) and carries a high positive charge (+2.76), raising off-target interaction risk. Sequence_0 best balances the available metrics — strong ipTM among generated candidates (0.40), highest predicted affinity of the PepMLM sequences (5.624 pKd/pKi), full solubility, low hemolysis risk (P=0.034), and near-neutral charge (−1.14).

Peptide to Advance: Sequence_0 (WRYYPVGVEHGE)

I would advance Sequence_0. It offers the best combination of structural confidence, predicted affinity, and therapeutic safety among the generated candidates, and compares favorably to the known binder on hemolysis and charge. Crucially, like all candidates here, it binds near the dimer interface rather than the mutation site directly — meaning the likely mechanism of action would be dimer stabilization rather than direct rescue of the A4V destabilization. This is still therapeutically meaningful, since SOD1 A4V toxicity is closely linked to aberrant monomerization and aggregation, and stabilizing the dimer interface could slow that cascade. That mechanistic framing should be explicitly tested in follow-up biochemical and cell-based assays, but Sequence_0 represents the strongest starting point for that line of investigation.

Week 6 HW: Genetic Circuits

HW Questions

  1. What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?
  • Phusion DNA Polymease, which synthesized new DNA by adding new nucleotides to the template DNA during PCR
  • dNTPs, which are the nucleotide building blocks (dATP, dGTP, dCTP, and dTTP)
  • reaction buffer, which acts as a chemical stabilizer that maintains the ideal pH and salt balance so the enzyme stays active and can accurately build new DNA strands.
  1. What are some factors that determine primer annealing temperature during PCR?
  • Melting temperature of the primer, which is the temperature at which half of the DNA complex dissociates
  • Primer length, since longer primers usually require higher annealing temperatures
  • GC content, since higher GC content typically increases the primer melting temperature
  • Salt concentration, since higher salt concentrations can stabilize the DNA and thus may require higher annealing temperatures
  1. There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.
FeaturePCR (Polymerase Chain Reaction)Restriction Enzyme Digest
MechanismEnzymatic Synthesis: Building new strands from primers.Enzymatic Cleavage: Cutting phosphodiester bonds at specific sites.
ProtocolThermal Cycling: Repeated steps of denaturation (95°C), annealing (55-65°C), and extension (72°C).Isothermal Incubation: DNA and enzymes are mixed in a buffer and held at a constant temp (usually 37°C).
ReagentsDNA template, Primers, dNTPs, Taq Polymerase, MgCl2, Buffer.DNA template, Restriction Enzymes, specific BSA/Salt Buffer, Water.
ProsHigh sensitivity; amplifies DNA; creates specific fragments without needing existing cut sites.Simple setup; highly reproducible; great for verifying known sequences or circular DNA.
ConsProne to contamination; requires known flanking sequences; potential for polymerase errors.Does not amplify DNA; limited by the location of natural recognition sites.
When to UseWhen you have minimal DNA, need a custom fragment, or want to add “tails” for cloning.When linearizing plasmids, performing diagnostic checks, or subcloning existing inserts.
  1. How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?
  • Both the PCR and digested fragments must share identical overlapping terminal sequences (15–40 bp) with their neighboring fragments to allow for seamless homology-directed assembly.
  1. How does the plasmid DNA enter the E. coli cells during transformation?
  • Membrane pores open due to a thermal pressure imbalance during the heat shock, allowing the plasmid DNA (which has been neutralized by calcium ions) to be pulled into the cell.
  1. Describe another assembly method in detail (such as Golden Gate Assembly)
  • Golden Gate Assembly is a highly efficient “one-pot” cloning method that allows you to join multiple DNA fragments together simultaneously using Type IIS restriction enzymes and T4 DNA ligase. Unlike standard enzymes, Type IIS enzymes like BsaI bind to a specific recognition sequence but cut the DNA several nucleotides away, creating custom 4-base overhangs. By strategically designing these overhangs to be complementary, you can ensure that multiple fragments assemble in a specific, directional order. During the reaction, you cycle the temperature to repeatedly cut and ligate the DNA until the fragments are perfectly joined. A key advantage is that the enzyme’s recognition sites are positioned to be “cut off” and removed during the process, meaning the final product cannot be re-cut. This makes the reaction irreversible and drives the assembly toward the final, seamless circular plasmid. Because of this precision, Golden Gate is the gold standard for modular cloning and building complex multi-gene constructs.
golden_gate_assembly golden_gate_assembly

Simulating Golden Gate using AddGene’s tool

gg_sim gg_sim
  • AddGene’s tool allows you to work with either a single insert or multiple fragments (up to 100), and it handles both vectors that already have Type IIS restriction sites and those that don’t. When sites are missing, it automatically designs the PCR primers needed to add them. You can also set your preferred PCR primer Tm. The tool selects the appropriate enzyme recognition sites, predicts how overhangs will interact, and flags any potential mis-ligation issues through an assembly fidelity score. Once you’re happy with the design, it simulates the full digest-and-ligate reaction and generates the predicted product sequence, which you can inspect at the fusion points to confirm everything is in frame and in the right order. It also exports your primer sequences ready to order.
  1. select your vector gg_vector gg_vector

  2. insert fragment

  • check the vector digest gg_fragment gg_fragment
  1. review the predicted product

  2. adjust overhangs gg_overhangs gg_overhangs

  3. assemble gg_assembled gg_assembled

Asimov Kernel

Repressilator

I reacreated the represillator by searching dragging parts one by one from the Parts Search.

represillator represillator

The results of the simulation were similar to the simulation of the original represillator. The simulation shows oscillations in expression levels for all three proteins.

og_represillator_sim og_represillator_sim

Original Constructs

I first started with a simple construct consisting of a LambdaCI promoter, an A1 RBS, the LambdaCI CDS, a bacterial terminator, and a backbone. Simulating showed constant concentration of RNA, and highly expressed protein levels.

original_construct_sim original_construct_sim

Next, I removed the promoter. This meant that no presence of RNA or protein expression were predicted.

no_promoter_sim no_promoter_sim

Lastly, I removed the RBS. Here, despite predicted presence of RNA, there was no protein expression. This reveals the necessity of the RBS for protein epression.

no_RBS_sim no_RBS_sim

week 7 HW: genetic circuits part II

Part 1: Intracellular Artificial Neural Networks (IANNs)

  1. What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?

Traditional genetic circuits can only read a signal as ON/OFF, even though molecules inside a cell exist at all kinds of intermediate concentrations. To build something complex out of ON/OFF switches, you have to layer many of them together, and each added layer introduces new opportunities for components to accidentally influence each other or fall out of sync. IANNs instead pass graded responses between nodes. Each node receives an actual concentration value, weighs it, and passes a continuous output forward. This means a single node carries far more information than an ON/OFF switch, so you need fewer of them to represent something complex, and there are fewer points at which things can go wrong.

  1. Describe a useful application for an IANN; include a detailed description of input/output behavior, as well as any limitations an IANN might face to achieve your goal.

A good use case is engineering bacteria to produce a drug. The bacteria need to balance how much raw material is available against how much final product has built up, since overproduction can stall or kill the cells. A Boolean circuit can only respond to whether the product level is above or below a fixed cutoff, shutting production on or off entirely. An IANN can instead read both signals as continuous values and smoothly adjust enzyme production in response, the same way a thermostat gradually responds to temperature rather than just cutting the heat off when a room gets warm.

  1. Draw a diagram for an intracellular multilayer perceptron where layer 1 outputs an endoribonuclease that regulates a fluorescent protein output in layer 2.
circuit_hw_7 circuit_hw_7

Layer 1 takes X₁ and X₂ as DNA inputs, each transcribed outside the cell. Inside, X₁ is translated into Csy4 (the inhibitory node, red) and X₂ is transcribed into FP mRNA. Both exit Layer 1 and enter Layer 2, where Csy4 represses translation of the FP mRNA while the mRNA itself drives it. The surviving signal is then translated by the output Tl node into the fluorescent protein.

Part 2: Fungal Materials

  1. What are some examples of existing fungal materials and what are they used for? What are their advantages and disadvantages over traditional counterparts?

Packaging: Mycelium grown on agricultural waste like straw can be molded into compostable styrofoam alternatives, but costs more to produce and is harder to manufacture consistently at scale.

Insulation: Mycelium panels outperform synthetic foam on fire resistance and sound absorption and are fully biodegradable, but absorb moisture easily and aren’t strong enough for load-bearing applications.

  1. What might you want to genetically engineer fungi to do and why? What are the advantages of doing synthetic biology in fungi as opposed to bacteria?

If I were going to engineer fungi for materials, I’d want to tune the mycelium itself genetically to make it grow faster, more uniformly, and with better water resistance so that the material comes out of the growth process already having desireable properties, rather than having to add plastics or chemical treatments afterward to compensate.