Subsections of <YOUR NAME HERE> — HTGAA Spring 2026

Subsections of Labs

Week 1 Lab: Pipetting

Subsections of Projects

Individual Final Project

Group Final Project

Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices

    The Biological Engineering Tool I Want to Develop: The Gut-Longevity Diagnostic Platform Emerging research firmly establishes the gut microbiome as a key modulator of systemic inflammation, metabolic health, and even the rate of biological aging. The metabolic outputs of our gut bacteria—particularly short-chain fatty acids (SCFAs) like butyrate—are directly linked to immune regulation, insulin sensitivity, and cellular repair pathways. I propose developing a diagnostic platform to functionally map this ecosystem and provide actionable insights for promoting healthspan. The Gut-Longevity Diagnostic is an at-home testing system that moves beyond static genomic sequencing. A user sample is exposed to a standardized panel of prebiotic substrates within a disposable cartridge containing engineered biosensors. These sensors measure the real-time, functional metabolic output—the specific SCFAs and gases produced by the user’s unique microbial community. A validated algorithm interprets this dynamic functional profile against longitudinal health data, generating a personalized, food-based nutritional prescription designed to steer the microbiome towards an anti-inflammatory, metabolic, and pro-longevity phenotype.

  • Week 4: Protein Design

    Protein Design I Questions from Shuguang Zhang Question 1: How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons) Answer: In 500 grams of meat, you consume approximately 8.2 × 10²³ molecules of amino acids. This refined calculation uses a standard protein content of 30% for cooked meat and an adjusted average molecular weight of 110 Daltons per amino acid residue.


Week 2: DNA Read, Write, and Edit


Homework Questions from Professor Jacobson

  1. Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?

The core DNA replication machinery, DNA polymerase, has a raw error rate of about 1 mistake for every 100,000 nucleotides it copies. This might sound precise, but given the human genome is roughly 3 billion base pairs long, a single cell division would result in about 30,000 mutations if uncorrected. This is a catastrophic level of errors that would make life impossible.

To deal with this, biology employs a powerful, multi-layered proofreading system:

Proofreading (3’→5’ Exonuclease Activity): Many DNA polymerases have a built-in “backspace” function. As they add nucleotides, they can immediately check and remove a mismatched one, improving accuracy by about 100-fold.

Mismatch Repair: After replication, a separate system acts like a final quality control team. It scans the new DNA strand, identifies and corrects mismatches that escaped the initial proofreading, boosting fidelity by another 100 to 1000 times.

Together, these systems reduce the final error rate to an astonishingly low ~1 error per 10 billion nucleotides, making high-fidelity inheritance possible.
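
The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted in the answer above; the variable names are illustrative, and the mismatch-repair factor is set to the upper end of the quoted 100 to 1000-fold range as an assumption.

```python
# Figures quoted in the answer above; the fold-improvements are rough estimates.
GENOME_BP = 3_000_000_000        # ~3 billion base pairs
RAW_ERROR_RATE = 1 / 100_000     # raw polymerase errors per nucleotide copied
PROOFREADING_FOLD = 100          # ~100-fold gain from 3'->5' exonuclease proofreading
MISMATCH_REPAIR_FOLD = 1_000     # assumed upper end of the 100 to 1000-fold range


def expected_errors(genome_bp: float, error_rate: float) -> float:
    """Expected number of errors when copying genome_bp nucleotides once."""
    return genome_bp * error_rate


raw_errors = expected_errors(GENOME_BP, RAW_ERROR_RATE)
final_rate = RAW_ERROR_RATE / (PROOFREADING_FOLD * MISMATCH_REPAIR_FOLD)
final_errors = expected_errors(GENOME_BP, final_rate)

print(f"Uncorrected errors per genome copy: {raw_errors:,.0f}")
print(f"Final error rate per nucleotide:    {final_rate:.0e}")
print(f"Errors per genome copy after repair: {final_errors:.1f}")
```

With these inputs the uncorrected count is about 30,000 per genome copy, and the corrected rate is about 1 error per 10 billion nucleotides, or well under one error per copy.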

  2. How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice, what are some of the reasons that all of these different codes don’t work to code for the protein of interest?

Because the genetic code is redundant (multiple DNA codons can specify the same amino acid), the number of possible DNA sequences for an average 400-amino-acid protein is astronomically high: roughly 3⁴⁰⁰ (about 7 × 10¹⁹⁰) different sequences, assuming about three synonymous codons per amino acid on average.

However, not all these theoretical sequences will produce a functional protein efficiently in a living cell. Key biological constraints include:

Codon Usage Bias: Cells have preferred “words” (codons). Using rare codons that match scarce transfer RNA (tRNA) molecules can dramatically slow down protein production.

mRNA Structure: The sequence itself can fold into shapes that block the ribosome, preventing translation.

Protein Folding: The speed of translation, influenced by codon choice, can affect how the protein folds correctly as it’s being made.

Hidden Signals: The coding sequence might accidentally create signals that tell the cell to cut (splice) the RNA in the wrong place or stop translation early.
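
As a rough illustration of where the count of synonymous sequences comes from, the sketch below multiplies the number of codons available for each residue, using the degeneracy of the standard genetic code. The function name and the four-residue peptide are hypothetical, and the stop codon is ignored.

```python
import math

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_AA = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(protein: str) -> int:
    """Count DNA sequences that encode `protein` (one-letter codes, no stop codon)."""
    total = 1
    for aa in protein:
        total *= CODONS_PER_AA[aa]
    return total


# Hypothetical 4-residue peptide, used only for illustration.
print(count_encodings("MKWL"))  # 1 * 2 * 1 * 6 = 12

# Rough estimate from the text: ~3 codons per residue over 400 residues.
exponent = math.log10(3 ** 400)
print(f"3^400 is about 10^{exponent:.1f}")
```

For a real protein the exact count depends on its composition, since methionine and tryptophan contribute a factor of 1 while leucine, serine, and arginine contribute a factor of 6.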

Homework Questions from Dr. LeProust

  1. What is the most commonly used method for oligonucleotide synthesis currently? The industry standard is phosphoramidite-based solid-phase synthesis. In this automated process, DNA strands are built nucleotide-by-nucleotide onto a solid bead or chip. Each cycle adds one base with very high efficiency (99-99.5%), allowing for the reliable and scalable production of short DNA sequences.

  2. Why is it difficult to make oligonucleotides longer than ~200 nucleotides by direct synthesis? The limitation is cumulative yield loss. Even with 99.5% efficiency per step, after 200 cycles, the chance of any one strand being fully correct is only about 37%. Beyond this length, the majority of the product is fragments of various lengths, making it extremely difficult and expensive to purify the tiny amount of full-length, error-free DNA.

  3. Why can’t a 2000 bp gene be synthesized directly using oligo synthesis? For a 2000-base-pair gene, the probability of a perfect, full-length strand from direct chemical synthesis is effectively zero: at 99.5% efficiency per step, 0.995²⁰⁰⁰ is about 0.004%. Instead, scientists synthesize many shorter, manageable oligonucleotides (for example, 40-60 bases long) that overlap in sequence. These fragments are then stitched together in vitro using enzymatic assembly methods such as Gibson Assembly or Polymerase Chain Reaction (PCR)-based assembly, which rely on purified enzymes (a polymerase, plus an exonuclease and a ligase in Gibson Assembly) rather than on step-by-step chemical coupling.
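
The cumulative yield loss described in these answers can be sketched in a few lines. This is a simplified model that, like the answers above, treats an n-nucleotide oligo as n coupling cycles with a constant per-step efficiency; the function name is illustrative.

```python
def full_length_fraction(n_cycles: int, coupling_efficiency: float = 0.995) -> float:
    """Fraction of strands that are full length after n_cycles coupling steps."""
    return coupling_efficiency ** n_cycles


for length in (60, 200, 2000):
    fraction = full_length_fraction(length)
    print(f"{length:>5} nt: {fraction:.4%} full length")
```

Under this model a 60-mer is mostly full length, a 200-mer drops to roughly 37%, and a 2000-mer falls to a few thousandths of a percent, which is why long genes are assembled from short overlapping oligos.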

Homework Question from George Church

  1. What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”? The ten amino acids that animals cannot synthesize and must obtain from food are: Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Threonine, Tryptophan, Valine, and Arginine.

The “Lysine Contingency” comes from Jurassic Park, where the dinosaurs are engineered so that they cannot make lysine and must receive it from their keepers. Because lysine is already essential in every animal, the contingency adds no real constraint: like any other animal, the dinosaurs could obtain lysine simply by eating lysine-containing plants or prey. The broader lesson is an evolutionary one. Animal ancestors lost the complex biochemical pathways for producing lysine, likely because their diets already supplied it. Once lost, these pathways are virtually impossible to re-evolve. The resulting nutritional dependency shapes animal ecology, from what we eat to how food webs are structured, and shows how evolution can constrain future possibilities by eliminating unused metabolic options.

  2. What code would you suggest for AA:AA interactions?

For AA:AA interactions, I would suggest a code based on hydrophobicity scales. The rule is that amino acids with similar hydrophobicity values have a high propensity to interact, with hydrophobic ones driving core packing and hydrophilic ones favoring surface exposure. This provides a powerful, simplified model because the drive to sequester hydrophobic residues from water is the fundamental organizing principle of protein folding.
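
One way to make this rule concrete is sketched below. It assumes the Kyte-Doolittle hydropathy scale as the hydrophobicity scale (the answer does not name one), and the scoring function is a hypothetical illustration rather than an established interaction model.

```python
# Kyte-Doolittle hydropathy values (assumed choice of scale; others exist).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Largest possible difference on this scale (Ile at 4.5 versus Arg at -4.5).
MAX_DIFF = max(KYTE_DOOLITTLE.values()) - min(KYTE_DOOLITTLE.values())


def interaction_score(aa1: str, aa2: str) -> float:
    """Hypothetical score in [0, 1]: 1 for identical hydropathy, 0 for opposite extremes."""
    diff = abs(KYTE_DOOLITTLE[aa1] - KYTE_DOOLITTLE[aa2])
    return 1.0 - diff / MAX_DIFF


print(f"I-V: {interaction_score('I', 'V'):.2f}")  # two hydrophobic residues
print(f"I-K: {interaction_score('I', 'K'):.2f}")  # hydrophobic versus charged
```

The score only captures similarity in hydrophobicity. It ignores charge, size, and hydrogen bonding, which matches the simplified model proposed in the answer.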

Subsections of Homework

Week 1 HW: Principles and Practices

  1. The Biological Engineering Tool I Want to Develop: The Gut-Longevity Diagnostic Platform

Emerging research firmly establishes the gut microbiome as a key modulator of systemic inflammation, metabolic health, and even the rate of biological aging. The metabolic outputs of our gut bacteria, particularly short-chain fatty acids (SCFAs) like butyrate, are directly linked to immune regulation, insulin sensitivity, and cellular repair pathways. I propose developing a diagnostic platform to functionally map this ecosystem and provide actionable insights for promoting healthspan.

The Gut-Longevity Diagnostic is an at-home testing system that moves beyond static genomic sequencing. A user sample is exposed to a standardized panel of prebiotic substrates within a disposable cartridge containing engineered biosensors. These sensors measure the real-time, functional metabolic output—the specific SCFAs and gases produced by the user’s unique microbial community. A validated algorithm interprets this dynamic functional profile against longitudinal health data, generating a personalized, food-based nutritional prescription designed to steer the microbiome towards an anti-inflammatory, metabolic, and pro-longevity phenotype.

The goal is to transform gut health from a vague concept into a measurable, modifiable pillar of preventative medicine, providing a science-backed tool for conscious dietary choices aimed at healthy aging.

  2. Governance & Policy Goals for an Ethical Future

The primary goal is Beneficence and Non-Maleficence in Preventative Health: ensuring the tool delivers real health benefits while rigorously preventing harm across diverse populations.

Sub-goal 1.1: Ensure Clinical Validity and Safety. The algorithm’s nutritional recommendations must be grounded in robust clinical evidence to avoid harm. Incorrect advice could exacerbate conditions like metabolic syndrome or IBD. Governance must mandate rigorous validation against health outcomes, not just correlation.

Sub-goal 1.2: Prevent Biological Data Exploitation. Gut microbiome data is highly personal predictive information. Governance must establish it as a protected health entity, preventing its use by insurers or employers for discrimination or by third parties for unauthorized manipulation (e.g., targeted advertising for unhealthy foods).

Sub-goal 1.3: Architect for Equitable Access from Inception. To avoid exacerbating health disparities, the technology’s design and business model must prioritize accessibility. Governance should incentivize affordable, scalable solutions suitable for integration into public health initiatives for aging populations.

  3. Three Potential Governance Actions

Action 1: A “Functional Diagnostic” Pre-Market Framework.

Purpose: Create a new regulatory pathway for tools that provide functional health analysis and dietary advice, distinct from medical devices or supplements.

Design: A consortium of regulatory agencies, microbiologists, and nutrition scientists defines evidence tiers for claims. Developers must achieve a given tier before marketing.

Assumptions: Regulators can adapt quickly, and predefined evidence standards will accelerate, not hinder, responsible innovation.

Risks: Over-standardization could stifle novel approaches. Success could create a two-tier system where only well-funded entities achieve the highest validation tiers.

Action 2: A Microbiome Data Commons with Granular User Control.

Purpose: Shift data ownership from corporations to individuals by creating a user-controlled, interoperable data repository.

Design: A non-profit or public entity develops the open-source platform. Users hold encryption keys, granting time-limited, specific access to researchers or apps via a “data wallet.”

Assumptions: Users will manage their keys responsibly. Researchers will participate despite more complex data access procedures.

Risks: Increased platform complexity and liability. If poorly adopted, it could fragment the data ecosystem further.
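
The idea of time-limited, specific access can be sketched as a small data structure. Everything here is hypothetical (the class, the field names such as `scfa_profile`, and the grantee label); it leaves out encryption and key management entirely and only shows how a grant might combine a scope with an expiry.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AccessGrant:
    """Hypothetical record of what a grantee may read, and until when."""
    grantee: str
    scope: frozenset
    expires_at: datetime

    def allows(self, field: str, now: datetime) -> bool:
        """True only if the field is in scope and the grant has not expired."""
        return field in self.scope and now < self.expires_at


# Illustrative grant: one research group, one data field, 30 days.
issued = datetime(2026, 3, 1)
grant = AccessGrant(
    grantee="research-lab-a",
    scope=frozenset({"scfa_profile"}),
    expires_at=issued + timedelta(days=30),
)

print(grant.allows("scfa_profile", datetime(2026, 3, 15)))  # in scope, before expiry
print(grant.allows("scfa_profile", datetime(2026, 4, 15)))  # in scope, after expiry
print(grant.allows("raw_reads", datetime(2026, 3, 15)))     # out of scope
```

Passing the current time in explicitly keeps the check easy to audit and test. A real platform would also need revocation and cryptographic enforcement, which this sketch does not attempt.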

Action 3: Public-Private Development of a Core, Open-Source Algorithm.

Purpose: Ensure the core science remains transparent and auditable, preventing proprietary “black boxes” from dominating a public health field.

Design: A government-funded research center develops and validates a base algorithm using diverse, ethically-sourced data. Commercial entities build applications on this audited core.

Assumptions: An open-source model can achieve and maintain clinical-grade accuracy. Public funding will be sustained.

Risks: The core model could become outdated without continuous public investment. Commercial forks could deviate from safety guidelines.

  4. Scoring Governance Actions Against Policy Goals (Scoring: 1 = Best, 3 = Worst, n/a = Not Applicable)

| Does the option: | Option 1 | Option 2 | Option 3 |
|---|---|---|---|
| Enhance Biosecurity | | | |
| • By preventing incidents | 1 | 3 | 2 |
| • By helping respond | 1 | 2 | 2 |
| Foster Lab Safety | | | |
| • By preventing incidents | 2 | 1 | 3 |
| • By helping respond | 3 | 1 | 3 |
| Protect the environment | | | |
| • By preventing incidents | 3 | 2 | 1 |
| • By helping respond | 2 | 1 | 1 |
| Other considerations | | | |
| • Minimizing costs and burdens to stakeholders | 2 | 3 | 1 |
| • Feasibility? | 1 | 2 | 2 |
| • Not impede research | 2 | 1 | 1 |
| • Promote constructive applications | 1 | 1 | 1 |

  5. My Recommendation & Trade-Offs

To a National Institute on Aging or Public Health Agency:

I recommend prioritizing Action 3 (Open-Source Core) supported by Action 1 (Diagnostic Framework). This combination best serves long-term public health goals.

Why This Combination? Developing a publicly-audited core algorithm (Action 3) is a strategic investment that ensures foundational science remains a public good, fosters innovation on a level playing field, and directly enables equitable access by lowering the cost of entry. A clear, tiered Action 1 framework provides the necessary guardrails for safety and validity without prematurely stifling innovation around this open core.

The Trade-Off: This approach consciously accepts that early commercial market development may be less lucrative, potentially slowing initial private investment. However, it prioritizes long-term ecosystem health, scientific transparency, and equitable dissemination over short-term market capture by a few entities.

The Uncertainty: The major unknown is whether a publicly maintained model can match the rapid iteration pace of well-funded private labs. This requires a commitment to sustainable, competitive funding for the public core development team.

  6. Personal Ethical Reflection

This week’s work crystallized a critical ethical tension: the gap between personalized health technology and population-level health justice. A tool optimized for “longevity” risks being calibrated using data from affluent, already-healthy cohorts, potentially pathologizing normal variations in gut flora from underrepresented groups or labeling their traditional diets as “suboptimal.” This risks a new form of biomedical marginalization.

Therefore, a non-negotiable governance action must be mandatory diversity and inclusion in foundational research. Any public funding or regulatory approval for such platforms should require that the training datasets and validation cohorts are representative of global genetic, dietary, and socioeconomic diversity. This is not merely an ethical imperative but a scientific one, ensuring the resulting tools are robust, generalizable, and truly serve the goal of healthspan for all.

Week 4: Protein Design

Protein Design I

Questions from Shuguang Zhang

Question 1: How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)

Answer: In 500 grams of meat, you consume approximately 8.2 × 10²³ molecules of amino acids. This refined calculation uses a standard protein content of 30% for cooked meat and an adjusted average molecular weight of 110 Daltons per amino acid residue.

Step-by-Step Refined Calculation

  1. Calculate Total Protein Mass
    Raw meat typically contains about 20–22% protein, but 500 g of cooked lean meat (like chicken breast or steak) averages roughly 30% protein due to water loss during cooking.
    Protein mass = 500 g × 0.30 = 150 g.

  2. Adjust for Amino Acid Residue Weight
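    The ~100 Da in the question is a round figure for a free amino acid. Weighted by how often each occurs in proteins, a free amino acid averages closer to 128 Da, and each one loses a water molecule (18 Da) when it joins a protein chain. Biochemists therefore use an adjusted average of 110 Da per residue in a protein chain.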

  3. Convert Mass to Daltons
    Using the conversion factor where 1 g = 6.022 × 10²³ Da (Avogadro’s number):
    Total mass in Daltons = 150 g × (6.022 × 10²³ Da/g) = 9.033 × 10²⁵ Da.

  4. Calculate Total Molecules
    Divide the total mass in Daltons by the average mass of a single amino acid residue:
    Number of molecules = (9.033 × 10²⁵ Da) / (110 Da/residue) = 8.21 × 10²³ residues.
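
The four steps above can be sketched as a single function. The default values are the assumptions stated in the calculation (30% protein, 110 Da per residue), and the function name is illustrative.

```python
AVOGADRO = 6.022e23  # daltons per gram (numerically Avogadro's number)


def residues_in_meat(mass_g: float,
                     protein_fraction: float = 0.30,
                     residue_da: float = 110.0) -> float:
    """Estimate the number of amino acid residues in a piece of meat."""
    protein_g = mass_g * protein_fraction   # step 1: protein mass
    total_da = protein_g * AVOGADRO         # step 3: grams to daltons
    return total_da / residue_da            # step 4: residues


estimate = residues_in_meat(500)
print(f"{estimate:.2e} residues")  # about 8.21e+23

# Using the ~100 Da figure from the question instead of 110 Da.
print(f"{residues_in_meat(500, residue_da=100.0):.2e} residues")
```

Changing `protein_fraction` shows how sensitive the answer is to the cut of meat, while the choice between 100 Da and 110 Da shifts the result by about 10%.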

Comparison of Meat Types (500 g cooked)

The specific amino acid count fluctuates depending on the lean‑to‑fat ratio of the meat:

| Meat Type | Protein Content | Number of Amino Acid Molecules |
|---|---|---|
| Chicken Breast (Grilled) | ~160 g | 8.76 × 10²³ |
| Beef Steak (Lean) | ~155 g | 8.49 × 10²³ |
| Pork Chop (Lean) | ~158 g | 8.65 × 10²³ |
| Ground Beef (80/20) | ~125 g | 6.84 × 10²³ |

Question 2: Why do humans eat beef but do not become a cow, eat fish but do not become fish?

Answer:
Humans do not “become” the animals they eat because the body does not absorb tissues whole; it reduces them to universal chemical precursors before rebuilding them according to a strictly human genetic blueprint.

1. The Principle of Molecular Deconstruction (Digestion)

Proteolytic Breakdown: Enzymes like pepsin and trypsin hydrolyze the peptide bonds of animal proteins, reducing them into their smallest constituents: amino acids.
Universal Currency: At the molecular level, a leucine molecule from a cow is chemically identical to a leucine molecule in a human. By the time these nutrients reach the bloodstream, all species-specific “identity” has been stripped away.

2. Genetic Blueprint and Protein Synthesis (The “Human” Template)

Once absorbed, these universal amino acids enter the cellular amino acid pool. The reason we rebuild these into human muscle rather than bovine muscle lies in our DNA.
The Instruction Manual: Your DNA contains the specific sequences (templates) for human proteins. Through the processes of transcription and translation, your ribosomes use the “imported” amino acids to assemble exclusively human proteins (like human collagen or human hemoglobin).
Biological Identity: The body’s metabolic pathways are genetically “hard-coded.” Even if you consume fish DNA, your cells lack the biological machinery to incorporate foreign genetic material into your own genome; instead, foreign DNA is broken down into harmless nucleotides and recycled as raw material.

3. Evolutionary Barrier and Immune Guarding

From an evolutionary standpoint, the ability to maintain a stable biological identity while consuming diverse organic matter is a survival necessity.
Immune Surveillance: The human immune system recognizes and attacks “non-self” proteins. If a cow protein were to enter the bloodstream intact, the body would treat it as a foreign antigen, potentially triggering an immune or allergic response, rather than as a building block.
Metabolic Adaptability: Humans are physiologically omnivorous, meaning our systems are specialized in extracting energy from varied sources without compromising our own structural integrity.

Summary

| Concept | Analogy |
|---|---|
| Digestion | Taking apart a Lego castle into individual bricks |
| Amino Acids | The universal Lego bricks |
| DNA | The instruction manual for a human castle |
| Protein Synthesis | Building a human castle using the same bricks |

In short: We eat cow structure, but we follow human instructions.