Week 1 HW: Principles and Practices
Biological Engineering Application
Proposed application: I want to develop a software tool that supports early-stage biological research by helping researchers analyze biological data more easily and more responsibly. The tool would help organize, validate, and interpret biological datasets (such as genomic or protein-related data) using bioinformatics and AI-assisted methods. It would also clearly flag where results are uncertain and where there is a risk of misuse.
Why this interests me: My interest stems from my academic exposure to bioinformatics and my curiosity about how software can meaningfully support biological research without lowering safety standards. As biological technologies become more accessible, I am particularly interested in how computational systems might help guide responsible use rather than accelerate harm or misuse.
Governance/Policy Goals for an Ethical Future
Ensure that AI- and software-assisted biological tools promote constructive scientific progress while minimizing risks related to misuse, safety failures, and inequitable access.
Sub-goals:
- Non-maleficence: prevent intentional or accidental misuse of biological data or tools.
- Safety: reduce risks related to unsafe experimental design or misinterpretation of results.
- Equity & Access: ensure tools do not disproportionately benefit only well-resourced institutions or regions.
- Transparency: encourage reproducibility and responsible documentation.
Governance Actions
Option 1: Mandatory Safety & Ethics Training for Tool Access
Purpose: Currently, many computational biology tools are accessible with minimal oversight. This action proposes requiring basic ethics and safety training before granting access to advanced biological analysis features.
Design
- Actor(s): Universities, research institutions, platform developers
- Short certification modules embedded into the tool onboarding
- Required before unlocking sensitive or high-risk functionalities
Assumptions
- Users will engage honestly with the training
- Training content is kept up to date
- Institutions agree on baseline standards
Risks of Failure & “Success”
- Risk: Training becomes a box-checking exercise
- Unintended success: May exclude independent researchers or under-resourced users if not designed inclusively

Option 2: Built-in Technical Safeguards and Usage Monitoring
Purpose: Introduce technical constraints that limit high-risk outputs and flag potentially dangerous use cases.
Design
- Actor(s): Software developers, private companies
- Automated flags, rate limits, and warning prompts
- Optional audit logs for institutional users
Assumptions
- Risky behaviors can be meaningfully detected
- Developers correctly anticipate misuse patterns
Risks of Failure & “Success”
- Risk: Over-blocking legitimate research
- Unintended success: Users may try to bypass safeguards using alternative tools
Option 3: Norms and Incentives for Transparent Documentation
Purpose: Encourage researchers to document both successes and failures to promote safer learning and reproducibility, much like chess players recording every move of a match.
Design
- Actor(s): Journals, funding bodies, academic institutions
- Incentives for publishing negative or null results
- Standardized documentation templates
Assumptions
- Researchers value incentives over speed or prestige
- Documentation does not expose sensitive information
Risks of Failure & “Success”
- Risk: Increased administrative burden
- Unintended success: Over-disclosure of sensitive methods
Scoring Governance Actions
| Does the option: | Option 1 | Option 2 | Option 3 |
|---|---|---|---|
| Enhance Biosecurity | |||
| • By preventing incidents | 2 | 1 | 2 |
| • By helping respond | 2 | 2 | 1 |
| Foster Lab Safety | |||
| • By preventing incidents | 1 | 2 | 2 |
| • By helping respond | 2 | 2 | 1 |
| Protect the environment | |||
| • By preventing incidents | 2 | 2 | 2 |
| • By helping respond | 2 | 3 | 1 |
| Other considerations | |||
| • Minimizing costs and burdens to stakeholders | 2 | 3 | 1 |
| • Feasibility? | 1 | 2 | 1 |
| • Not impede research | 2 | 3 | 1 |
| • Promote constructive applications | 1 | 2 | 1 |
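As a rough cross-check of the scoring, the column totals can be added up with a short script. This is only a sketch, and it rests on two assumptions that the table itself does not state: that 1 is the best score, and that every criterion carries equal weight.

```python
# Scores copied from the table above, in row order.
# Assumptions (not stated in the table): 1 is the best score,
# and all criteria are weighted equally.
scores = {
    "Option 1": [2, 2, 1, 2, 2, 2, 2, 1, 2, 1],
    "Option 2": [1, 2, 2, 2, 2, 3, 3, 2, 3, 2],
    "Option 3": [2, 1, 2, 1, 2, 1, 1, 1, 1, 1],
}

# Sum each option's column.
totals = {option: sum(values) for option, values in scores.items()}

# Rank options from lowest (best, under the assumption above) to highest total.
ranking = sorted(totals, key=totals.get)

for option in ranking:
    print(f"{option}: total {totals[option]}")
```

Under these assumptions the totals are 13 for Option 3, 17 for Option 1, and 22 for Option 2. A different weighting of the criteria could change the order.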
Prioritization & Recommendation
Based on the scoring, I would prioritize a combination of Option 1 (training requirements) and Option 3 (documentation norms). Together, these approaches encourage responsible behavior without heavily restricting legitimate research. While technical safeguards (Option 2) are important, they should be applied cautiously to avoid impeding innovation.
Target audience: Academic institutions and platform developers, with encouragement from funding agencies.
Trade-offs & uncertainties:
- Balancing accessibility with responsibility
- Ensuring governance mechanisms evolve alongside technology
- Risk that voluntary norms are unevenly adopted
Reflection: Ethical Concerns from This Week
This week highlighted how easily powerful biological tools can shift from beneficial to harmful depending on context, intent, and oversight. One ethical concern that stood out to me was the assumption that access alone equates to understanding or responsibility. I was also struck by how governance often lags behind technical capability.
Proposed additional governance action: Introduce interdisciplinary review processes that include not only scientists but also ethicists, policymakers, and community representatives when developing or deploying new biological tools.
Assignment (Week 2 Lecture Prep)
Homework Questions from Professor Jacobson
Question 1: Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?
Answer: Nature’s machinery for copying DNA, called DNA polymerase, has an error rate of approximately 1 error per million base pairs (1:10⁶). The human genome consists of approximately 3 billion base pairs, so at this rate a single round of replication would introduce around 3,000 errors.
Biology has evolved mechanisms to address this discrepancy and maintain genomic integrity:
Proofreading by DNA Polymerase: DNA polymerase has a built-in proofreading ability. If it incorporates an incorrect nucleotide, it can detect the error, remove the incorrect base, and replace it with the correct one. This significantly reduces the error rate to about 1 error per billion base pairs (1:10⁹).
DNA Repair Mechanisms: Cells have additional repair systems, such as mismatch repair, which identify and correct errors that escape the proofreading process. These mechanisms further reduce the error rate and help maintain the accuracy of the genome.
These processes ensure that the human genome remains stable and functional despite its vast size and the inherent error rate of DNA replication.
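The comparison between error rate and genome size is simple arithmetic, sketched below. It uses the 1:10⁶ and 1:10⁹ rates and the 3 billion base pair genome length from the answer above; the function name is only illustrative.

```python
# Figures taken from the answer above.
GENOME_BP = 3_000_000_000  # approximate human genome length in base pairs


def expected_errors(genome_bp: int, bases_per_error: int) -> float:
    """Expected number of errors when copying the whole genome once."""
    return genome_bp / bases_per_error


# 1 error per 10^6 bases versus 1 error per 10^9 bases.
at_one_per_million = expected_errors(GENOME_BP, 10**6)
at_one_per_billion = expected_errors(GENOME_BP, 10**9)

print(f"1:10^6 -> {at_one_per_million:.0f} expected errors per genome copy")
print(f"1:10^9 -> {at_one_per_billion:.0f} expected errors per genome copy")
```

The thousand-fold drop in error rate is what brings the expected number of errors per copy from thousands down to a handful.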
Question 2: How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?
Answer: To determine how many different ways DNA can code for an average human protein, we need to consider the following:
- Average Length of a Human Protein: The document states that the coding sequence for an average human protein is 1036 base pairs long. Since each amino acid is encoded by a codon (a sequence of three nucleotides), this corresponds to approximately 345 amino acids (1036 ÷ 3 ≈ 345).
- Codon Redundancy: The genetic code is degenerate, meaning multiple codons can encode the same amino acid. For example, there are 64 possible codons, but only 20 amino acids, so many amino acids are encoded by more than one codon. The number of codons per amino acid varies (e.g., leucine has 6 codons, while methionine has only 1).
- Number of Possible DNA Codes: If we assume an average of 3 codons per amino acid (a rough estimate based on the genetic code), the number of possible DNA sequences for an average human protein would be approximately 3³⁴⁵ (3 to the power of 345), which is an astronomically large number.
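The estimate above can be checked with a few lines of Python. This is a sketch that uses the codon counts of the standard genetic code (61 sense codons, stop codons excluded). The helper `count_encodings` and the short example peptide are illustrative and not part of the assignment.

```python
# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes; stop codons are excluded).
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}

coding_bp = 1036             # average coding length quoted above
n_residues = coding_bp // 3  # whole codons only

# Average codons per amino acid: 61 sense codons over 20 amino acids.
mean_degeneracy = sum(CODONS_PER_AMINO_ACID.values()) / len(CODONS_PER_AMINO_ACID)

# Rough count of encodings, assuming 3 codons for every residue.
rough_count = 3 ** n_residues
n_digits = len(str(rough_count))


def count_encodings(peptide: str) -> int:
    """Exact number of DNA sequences that encode a given peptide."""
    total = 1
    for residue in peptide:
        total *= CODONS_PER_AMINO_ACID[residue]
    return total


print(f"residues: {n_residues}, mean codons per amino acid: {mean_degeneracy:.2f}")
print(f"3^{n_residues} has {n_digits} decimal digits")
print(f"encodings of the peptide 'MLW': {count_encodings('MLW')}")
```

The exact count for a real protein depends on its amino acid composition, which is why `count_encodings` multiplies the per-residue codon counts instead of assuming 3 everywhere.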
Reasons Why All These Codes Don’t Work in Practice:
- Codon Bias: Different organisms have preferences for certain codons over others, known as codon bias. Codons that are rarely used in the host organism may lead to inefficient translation or reduced protein expression.
- mRNA Secondary Structures: Some DNA sequences may produce mRNA with secondary structures (e.g., hairpins) that interfere with ribosome binding or translation, reducing the efficiency of protein synthesis.
- Regulatory Elements: DNA sequences may inadvertently contain regulatory elements (e.g., promoters, enhancers, or silencers) that affect transcription or translation, leading to unintended consequences.
- Protein Folding and Function: While the amino acid sequence may be correct, the codon choice can influence the speed of translation, which in turn affects protein folding. Improper folding can result in non-functional or misfolded proteins.
- Post-Translational Modifications: Some DNA sequences may not allow for proper post-translational modifications, which are critical for the protein’s function.
- Codon Context Effects: The sequence context around codons can influence translation efficiency and accuracy, meaning that not all codon combinations are equally effective.
In practice, researchers often optimize codon usage for the host organism to ensure efficient and accurate protein production.
Homework Questions from Dr. LeProust
Question 1: What’s the most commonly used method for oligo synthesis currently?
Answer: The most commonly used method for oligonucleotide synthesis currently is solid-phase phosphoramidite chemistry.
Question 2: Why is it difficult to make oligos longer than 200nt via direct synthesis?
Answer: Making oligonucleotides (oligos) longer than 200 nucleotides (nt) via direct chemical synthesis is difficult primarily because of the exponential decrease in yield caused by coupling efficiencies being less than 100%, and the cumulative increase in chemical errors over long synthesis cycles.
Question 3: Why can’t you make a 2000bp gene via direct oligo synthesis?
Answer: Direct, single-step chemical synthesis of a 2000 base pair (bp) DNA sequence is currently not possible using standard automated oligonucleotide synthesis, primarily due to the exponential decrease in efficiency, accumulation of errors, and the inability to purify such long single-stranded molecules. While 2000 bp genes are commonly created, they are assembled from smaller oligonucleotides (typically 40-200 bases) rather than synthesized in one direct step.
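The exponential drop in yield mentioned in the two answers above can be illustrated with a short calculation. This is a sketch only: the 99% per-step coupling efficiency is an assumed round number for illustration and does not come from the lecture. The model treats every coupling step as independent.

```python
def full_length_yield(length_nt: int, coupling_efficiency: float) -> float:
    """Fraction of strands that reach full length.

    An oligo of length_nt bases needs (length_nt - 1) coupling steps,
    and each step succeeds with probability coupling_efficiency.
    """
    return coupling_efficiency ** (length_nt - 1)


# Assumed, illustrative per-step coupling efficiency (not from the source).
EFFICIENCY = 0.99

for length in (20, 100, 200, 2000):
    fraction = full_length_yield(length, EFFICIENCY)
    print(f"{length:>5} nt: {fraction:.2e} of strands are full length")
```

Even with this optimistic per-step efficiency, only a small fraction of 200 nt strands are full length, and the fraction for 2000 nt is effectively zero. This is why long genes are assembled from shorter oligos.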
Homework Question from George Church
Question 1: What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?
Answer: The 10 essential amino acids that must be acquired through the diet of most animals, including humans, are Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Threonine, Tryptophan, Valine, and Arginine. These are considered “essential” because animal bodies cannot synthesize these amino acids internally at a rate sufficient to meet metabolic needs and must instead obtain them from food.
How the Science Affects My View of the Contingency: Knowing that lysine is an essential amino acid for animals undermines the premise of the “Lysine Contingency” as a practical, reliable safety feature. No vertebrate can produce its own lysine; all of them must obtain it from food. Making a dinosaur “lysine-deficient” is therefore redundant, because it would already depend on dietary lysine. The contingency also fails because dinosaurs can easily find lysine in their environment: herbivores can eat soy, beans, and other common plants, while carnivores can obtain it by eating those herbivores. The failure of the contingency serves as a key plot point demonstrating human hubris and the inability to fully control nature, as noted by characters in the franchise.
AI Use Disclosure: I used ChatGPT (OpenAI) as a brainstorming and structuring aid while working on this assignment. Specifically, I used AI-generated prompts to help clarify the assignment requirements, organize my ideas, and explore example frameworks for governance and ethics analysis. All final interpretations, reflections, and written content were reviewed, adapted, and contextualized by me.