Week 01 HW

HW.1: Class assignment

1. Describe an application

Identify a biological engineering tool or application you wish to develop and explain your motivation.

I would like to develop a way to make plants grow 100x faster. I find this a very interesting and ambitious question. Perhaps you reverse-engineer the genome, the morphological development and its constraints, and the proteins/enzymes/catalysts that drive growth. Perhaps you design a separate organism (or a pair of bacteria?) which produces biomass: a combination of a carbon sequesterer and a cellulose printer. Perhaps you attempt to design a minimal artificial cell, like a Xenobot or the JCVI minimal cells. Using new AI design software, you create a minimal genome, design your own morphological topology through simulation, and compile it down to gene regulatory networks (GRNs), transcription factors/thresholds, and DNA.

Why? Because trees and plants are great. They are calming, they look beautiful, they are functionally useful. Originally I wanted to build my own house, and was wondering - why is wood so expensive? If we could grow wood more quickly and effectively, that would be useful. It would also be fun to rapidly green certain areas of the world to produce arable land - the Australian desert, for example.

2. Establish governance goals

Describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future, like ensuring non-malfeasance (preventing harm). Break big goals down into two or more specific sub-goals.

  • Enhance biosecurity (prevent misuse and uncontrolled spread)

    • Prevent incidents

      • Restrict access to engineered strains, protocols, and enabling tools
      • Use genetic containment (kill-switches, auxotrophy, sterility)
      • Avoid traits that increase invasiveness or persistence outside intended settings
    • Help respond

      • Establish monitoring and reporting systems for unexpected dissemination
      • Maintain traceability (registries, audit logs, chain-of-custody)
  • Foster lab safety (reduce accidents during development)

    • Prevent incidents

      • Standard biosafety training and conservative organism/chassis selection
      • Physical containment and phased testing (lab → greenhouse → controlled trials)
      • Explicit evaluation of failure modes in growth and developmental pathways
    • Help respond

      • Clear spill/escape response protocols and emergency shutdown procedures
      • Regular safety reviews and independent oversight
  • Protect the environment (minimize ecological externalities)

    • Prevent incidents

      • Ecological risk assessment: gene flow, non-target effects, ecosystem disruption
      • Prohibit open release until long-term impacts are understood
      • Prefer reversible or self-limiting designs over permanent alterations
    • Help respond

      • Post-deployment surveillance and remediation plans
      • Defined liability and responsibility for environmental harms
  • Equity, autonomy, and constructive use (ensure benefits are fairly distributed)

    • Minimizing burdens to stakeholders

      • Community consultation for land-use and deployment decisions
      • Avoid shifting risks onto local ecosystems or vulnerable populations
    • Feasibility without blocking research

      • Clear regulatory pathways that enable safe experimentation
      • Transparency and documentation to support responsible scaling
    • Promote beneficial applications

      • Prioritize reforestation, sustainable materials, and climate-positive outcomes
      • Discourage purely extractive or destabilizing commercial deployment

3. Design governance actions

Describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”)

  1. Purpose: What is done now and what changes are you proposing?
  2. Design: What is needed to make it “work”? (including the actor(s) involved - who must opt-in, fund, approve, or implement, etc)
  3. Assumptions: What could you have wrong (incorrect assumptions, uncertainties)?
  4. Risks of Failure & “Success”: How might this fail, including any unintended consequences of the “success” of your proposed actions?

  Option 1: Containment-by-design + staged release
  • Actors: Institutional Biosafety Committees (IBC), national GMO regulators (e.g., OGTR/USDA), lab leads, funders
  • Design: engineered sterility/kill-switches, greenhouse-only trials, stepwise permits before field testing
  • Assumptions: containment works reliably; lab phenotypes predict outdoor behavior
  • Risks: safeguard failure, gene flow, invasive advantage, unexpected ecosystem effects

  Option 2: Access control + biosecurity screening
  • Actors: DNA synthesis firms, biosecurity agencies, research institutions, grant/journal oversight
  • Design: sequence screening, restricted strain distribution, dual-use review processes
  • Assumptions: misuse is limited by controlling access to key materials/information
  • Risks: leakage, uneven enforcement globally, slowing benign research

  Option 3: Environmental monitoring + liability framework
  • Actors: environmental agencies, local governments/landholders, independent ecologists, insurers/courts
  • Design: required impact studies, long-term surveillance, clear remediation liability
  • Assumptions: harms are detectable early and manageable with monitoring
  • Risks: underfunded surveillance, delayed ecological damage, liability discouraging deployment

4. Score against rubric

Evaluate each action against objectives including:

  • Biosecurity enhancement
  • Lab safety
  • Environmental protection
  • Cost/burden minimization
  • Feasibility and research impact
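
| Does the option: | Option 1 | Option 2 | Option 3 |
|---|---|---|---|
| Enhance Biosecurity | 3 | 3 | 2 |
| • By preventing incidents | 3 | 3 | 2 |
| • By helping respond | 2 | 2 | 3 |
| Foster Lab Safety | 3 | 2 | 1 |
| • By preventing incidents | 3 | 2 | 1 |
| • By helping respond | 2 | 2 | 2 |
| Protect the environment | 3 | 2 | 3 |
| • By preventing incidents | 3 | 2 | 2 |
| • By helping respond | 2 | 1 | 3 |
| Other considerations | | | |
| • Minimizing costs and burdens to stakeholders | 2 | 2 | 1 |
| • Feasibility? | 2 | 3 | 1 |
| • Not impede research | 1 | 2 | 1 |
| • Promote constructive applications | 3 | 2 | 3 |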
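
As a rough cross-check on the rubric, the sub-goal rows can be summed per option. This is only a sketch: it assumes that a higher score is better and that every row carries equal weight, neither of which is stated in the rubric itself. The three top-level rows (Enhance Biosecurity, Foster Lab Safety, Protect the environment) are left out so they are not counted twice alongside their own sub-rows.

```python
# Rubric sub-goal scores copied from the table: (Option 1, Option 2, Option 3).
# Assumption: higher = better, all rows weighted equally.
scores = {
    "Biosecurity: prevent incidents": (3, 3, 2),
    "Biosecurity: help respond": (2, 2, 3),
    "Lab safety: prevent incidents": (3, 2, 1),
    "Lab safety: help respond": (2, 2, 2),
    "Environment: prevent incidents": (3, 2, 2),
    "Environment: help respond": (2, 1, 3),
    "Minimize costs and burdens": (2, 2, 1),
    "Feasibility": (2, 3, 1),
    "Not impede research": (1, 2, 1),
    "Promote constructive applications": (3, 2, 3),
}

# Transpose the rows into one column per option and sum each column.
totals = [sum(column) for column in zip(*scores.values())]

for option_number, total in enumerate(totals, start=1):
    print(f"Option {option_number}: {total}")
```

Under those assumptions Option 1 comes out slightly ahead, with the gap small enough that a different weighting could change the order.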

5. Prioritize options

Last, drawing upon this scoring, describe which governance option, or combination of options, you would prioritize, and why. Outline any trade-offs you considered as well as assumptions and uncertainties.

For this, you can choose one or more relevant audiences for your recommendation, which could range from the very local (e.g. to MIT leadership or Cambridge Mayoral Office) to the national (e.g. to President Biden or the head of a Federal Agency) to the international (e.g. to the United Nations Office of the Secretary-General, or the leadership of a multinational firm or industry consortia). These could also be one of the “actor” groups in your matrix.

I would prioritise containment-by-design + staged release. There is immense uncertainty in how this project could be achieved, so for now I think it is a poor use of resources to build out the other governance actions. Rapid iteration to reduce uncertainty is the path towards achievement. A safety protocol that scales with each stage lets experiments move quickly without risk of ruin, until the project reaches the milestones needed to unlock funding and revenue.

HW.2: Lecture prep for W2

Answer prep questions from three faculty members:

Homework Questions from Professor Jacobson:

Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?

Error rate refers to errors per nucleotide added per replication. An error could be a misincorporation (the wrong base inserted opposite the template base), for example.

The error rate of polymerase synthesis (with proofreading) is roughly 1 in 10^7 (1e-7) per nucleotide.

Human genome has 3.1-3.2 Gbp or 3e9 base pairs.

The expected number of errors when polymerase copies the human genome once is 1e-7 * 3e9 ≈ 300. So without further correction, every copy of the genome would carry hundreds of mistakes.
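
A minimal sketch of that back-of-envelope estimate, using the genome size and error rate quoted above. The second rate (1e-9 after mismatch repair) is an assumed ballpark figure added for comparison, not a number from this homework.

```python
# Expected replication errors per genome copy = genome length * per-base error rate.
GENOME_BP = 3.0e9             # haploid human genome size, from the text
POLYMERASE_ERROR_RATE = 1e-7  # errors per nucleotide, from the text
ASSUMED_POST_REPAIR_RATE = 1e-9  # assumption: ballpark rate after mismatch repair


def expected_errors(genome_bp: float, error_rate: float) -> float:
    """Expected number of errors when the whole genome is copied once."""
    return genome_bp * error_rate


errors_per_copy = expected_errors(GENOME_BP, POLYMERASE_ERROR_RATE)
errors_after_repair = expected_errors(GENOME_BP, ASSUMED_POST_REPAIR_RATE)

print(f"Polymerase alone: ~{errors_per_copy:.0f} errors per genome copy")
print(f"With assumed repair: ~{errors_after_repair:.0f} errors per genome copy")
```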

Biology deals with the likely error through multiple levels of mitigation:

  • Proofreading during synthesis corrects errors (the polymerase's 3'→5' exonuclease activity removes a freshly added wrong base)
  • Mismatch repair after synthesis repairs errors
  • Redundancy and selection at multiple levels - DNA is double-stranded, cells exist in huge populations, misfolded proteins get degraded, defective RNAs are destroyed, faulty cells undergo apoptosis
  • Damage repair systems (e.g. base excision and nucleotide excision repair) fix lesions that arise outside of replication

How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?

Our assumptions:

  • Average human protein: 1036 bp of coding sequence (about 345 amino acids).
  • ~30,000 proteins observed in mammalian genome.
  • A protein of L amino acids is encoded by 3L nucleotides (bases) plus a stop codon in the genome.

Coding is the process by which DNA is transcribed into mRNA (triplets / codons), and mRNA (codons) is translated into a linear chain of amino acids (polypeptides), which folds into 3D protein structures.

How many different ways are there to code for an average human protein, meaning how many different DNA encodings would compile (transcribe and translate) down to the same protein (chain of amino acids) encoded by 1036 bp?


A codon is 3 nucleotides, each carrying one base (A, C, G, T in DNA; U replaces T in mRNA). There are 4^3 = 64 possible triplet combinations (codons) of the four bases. 61 of them encode an amino acid and 3 are stop codons. An amino acid can be encoded by multiple codons. For instance, codons GAA and GAG both specify glutamic acid. This redundancy is referred to as degeneracy.

The degeneracy of an amino acid is the number of codons which encode it, e.g. d(Leu)=6 means leucine is encoded by 6 codons.

Average codon degeneracy across amino acids is roughly 3 (61 sense codons / 20 amino acids ≈ 3.05).
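
The degeneracy numbers can be checked by building the standard genetic code table and counting codons per amino acid. This is a small sketch: the 64-letter string is the usual compact encoding of the standard code with bases ordered T, C, A, G, and the variable names are my own.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

codon_table = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degeneracy = number of codons per amino acid (stop codons excluded).
degeneracy = Counter(aa for aa in codon_table.values() if aa != "*")
mean_degeneracy = sum(degeneracy.values()) / len(degeneracy)

print("GAA ->", codon_table["GAA"], "| GAG ->", codon_table["GAG"])
print("d(Leu) =", degeneracy["L"])
print("mean degeneracy =", round(mean_degeneracy, 2))
```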

So to calculate the number of possible encodings for a protein, we take the degeneracy of each amino acid and multiply them together. E.g. for a protein of L=5 amino acids with average degeneracy d(*)=3, num_encodings = d(*) * d(*) * d(*) * d(*) * d(*) = d(*)^L = 3^5 = 243.

So for an average human protein encoded by 1036 bp, L ≈ 1036/3 ≈ 345 amino acids, and the number of possible encodings is about 3^L = 3^345 ≈ 10^165.
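
A quick sketch of the size of that number. It assumes that the 1036 bp figure is the length of the coding sequence (so about 345 codons) and it uses the rounded average degeneracy of 3. A real protein would need the per-residue degeneracies multiplied out instead.

```python
import math

CDS_BP = 1036          # average coding length from the assumptions above
AVG_DEGENERACY = 3     # rounded average number of codons per amino acid

n_codons = CDS_BP // 3                      # whole codons that fit in the coding sequence
n_encodings = AVG_DEGENERACY ** n_codons    # exact, thanks to Python big integers
log10_encodings = n_codons * math.log10(AVG_DEGENERACY)

print(f"codons: {n_codons}")
print(f"encodings: about 10^{log10_encodings:.1f}")
print(f"decimal digits: {len(str(n_encodings))}")
```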

There is an intractable number of possible encodings. However, functional “good” encodings are a tiny subset constrained by expression, folding, RNA processing, regulation, and host biology.


Homework Questions from Dr. LeProust:

What’s the most commonly used method for oligo synthesis currently?

solid-phase chemical synthesis with phosphoramidite chemistry

Why is it difficult to make oligos longer than 200nt via direct synthesis?

Because direct phosphoramidite synthesis has a per-step yield <1.0, errors compound exponentially with length. With a per-step error rate e ~= 0.01, P(success) = (1-e)^200 ≈ 0.13, so only about 13% of strands come out full-length and correct.

Why can’t you make a 2000bp gene via direct oligo synthesis?

With e ~= 0.01, (1-e)^2000 ≈ 2e-9, so essentially no strand survives all 2000 cycles. Errors accumulate with each synthetic cycle/step:

  • Depurination: the expected number of strand cleavage events scales ~linearly with cycle count and purine content
  • Misincorporations accumulate (wrong base addition)
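
The compounding can be sketched in a few lines. The per-step error rates below are illustrative assumptions (0.01 is the value used above, 0.005 is an extra comparison point), and the model ignores everything except independent per-step failure.

```python
def full_length_yield(n_steps: int, error_per_step: float = 0.01) -> float:
    """Fraction of strands that pass every synthesis step: (1 - e)^n."""
    return (1.0 - error_per_step) ** n_steps


# Compare a 200 nt oligo with a 2000 bp gene at two assumed error rates.
for error in (0.01, 0.005):
    for length in (200, 2000):
        fraction = full_length_yield(length, error)
        print(f"e={error}, n={length}: yield = {fraction:.3g}")
```

Even halving the per-step error only lifts the 200 nt yield to roughly a third, while the 2000-step case stays vanishingly small, which is why long genes are assembled from shorter oligos instead.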

Homework Question from George Church:

Choose ONE of the following three questions to answer; and please cite AI prompts or paper citations used, if any.

[Using Google & Prof. Church’s slide #4] What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?

[Given slides #2 & 4 (AA:NA and NA:NA codes)] What code would you suggest for AA:AA interactions?

[(Advanced students)] Given the one paragraph abstracts for these real 2026 grant programs sketch a response to one of them or devise one of your own:

https://arpa-h.gov/explore-funding/programs/boss

https://www.darpa.mil/research/programs/smart-rbc

https://www.darpa.mil/research/programs/go

What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?

Histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, valine (+ arginine conditional).

Out of the 20 amino acids needed, the human body synthesizes about 11, while the remaining 9 (the list above, without arginine), known as essential amino acids, must be obtained through diet.

This does not seem to hold across all animals. Counterexample: cats, which also require dietary taurine (not one of the 20 protein-building amino acids).

The Lysine Contingency was a genetic alteration Henry Wu performed in the dinosaur genome. The modification knocked out the ability of the dinosaurs to produce the amino acid Lysine.

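This forced the dinosaurs to depend on lysine supplements provided by the park’s veterinary staff. In this way, dinosaurs could never escape from the park because they would never survive long without the food supplements. But lysine is already on the essential list above, so animals cannot make it anyway and get it from food. Escaped dinosaurs could presumably do the same, which makes the contingency look redundant.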

Haha, I have to rewatch this film.

The way I would hack around this would be to feed the dinosaurs a culture of the rumen microbes that cows host. These microbes synthesise the essential amino acids, lysine included, from simple nitrogen sources, so the dinosaurs would not need to produce lysine themselves and would instead form a symbiotic relationship with the microbes in their gut.

I am not sure what this question is getting at, but it also reminds me of Liebig’s law of the minimum: would the restriction of one amino acid necessarily debilitate the dinosaurs so they can’t escape, or is nature more nonlinear and complex than that?

LLM prompts used:

  • 10 essential amino acids in all animals?
  • across all animals?
  • cows can synthesise most of their needed amino acids? how many which ones
  • how long can you survive without just one of the amnio acids ?

HW.3: Setup your website

CHECK IT OUT

https://pages.htgaa.org/2026a/liam-edwards-playne/