Alejandro Ortiz — HTGAA Spring 2026


About me

“The best way to predict the future is to grow it.”

I’m Alejandro Ortiz — a Colombian social entrepreneur, technologist, and aspiring biological engineer working at the intersection of synthetic biology, neurotechnology, and social impact.

I believe the next great revolutions won’t come from silicon alone — they’ll be grown. From engineering living systems that interface with the human brain to designing biological tools that democratize healthcare, I’m here to learn how to build the future at the molecular level.

My path so far

  • Creator of Aurora App — a free platform protecting women’s safety in Bogota through legal aid and real-time incident mapping. 200+ users, 50+ women supported in legal cases. Named after the color purple, a symbol of feminist resilience.
  • NeoReplicante — exploring the frontiers of AI, synthetic biology, and human augmentation.
  • Aspire Leaders Program ‘22 — first-generation student who turned adversity into a mission: knowledge as a tool to solve humanity’s deepest problems.
  • Community Action Award recipient — recognized for leveraging technology for social justice.

Why HTGAA?

I came to How To Grow Almost Anything because I’m convinced that biological engineering is the most powerful technology humanity has ever touched — and the most dangerous if left ungoverned. I want to:

  • Master the tools of genetic engineering, from CRISPR to DNA synthesis
  • Explore living bioelectronics — cells that can talk to neurons and compute
  • Build toward brain-computer interfaces rooted in biology, not just metal and silicon
  • Ensure these technologies serve everyone, not just the privileged few

My north star: engineer biology that amplifies human potential while protecting human dignity.


Homework

Labs

Projects


Week 1 — Principles & Practices

Class Assignment: Biological Engineering Application & Governance


1. A Bio-Engineering Tool I Want to Develop

NeuroGrow: A Living Biosensor Platform for Real-Time Neurotransmitter Detection

I want to develop a living biosensor system composed of engineered mammalian cells that can detect and report levels of key neurotransmitters — dopamine, serotonin, GABA, and glutamate — in real time.

Why this matters: Today, mental health diagnostics rely almost entirely on subjective self-reporting. Depression, anxiety, PTSD, and ADHD are diagnosed through questionnaires, not through biochemical evidence. Meanwhile, brain-computer interfaces (BCIs) are advancing rapidly, but they remain invasive, expensive, and inaccessible to most of the world. We need a bridge — a biological layer that can read the brain’s chemistry without drilling into skulls.

How it would work: The platform uses engineered cells equipped with synthetic receptor circuits. When a target neurotransmitter binds to a designed receptor on the cell surface, it triggers an intracellular signaling cascade that produces a measurable output — fluorescence, bioluminescence, or an electrical signal via bacterial nanowires. These biosensor cells would be encapsulated in biocompatible hydrogels and placed in minimally invasive, subcutaneous patches or wearable microfluidic devices that sample interstitial fluid (which correlates with cerebrospinal neurotransmitter levels).

The vision: A world where a low-cost, living diagnostic device can tell a clinician — or a patient themselves — what is happening in their neurochemistry, in real time, without a single electrode touching the brain. This could democratize neuroscience, bring precision psychiatry to low-income countries, and serve as the biological foundation for a new generation of brain-computer interfaces that work with biology, not against it.

Personal motivation: I grew up in Bogota, Colombia, where I witnessed how mental health crises devastate communities but remain invisible to healthcare systems. I built Aurora App to make violence against women visible through technology. Now I want to make the invisible chemistry of human suffering — and human potential — visible through biology. This is the next step.


2. Governance & Policy Goals

I propose the following governance framework for living neurotechnologies, organized around three primary goals:

Goal A: Protect Neural Privacy & Cognitive Liberty

  • Sub-goal A1: Prevent unauthorized collection, storage, or commercial exploitation of neurotransmitter data (neural data is the most intimate data that exists — it reflects our emotions, intentions, and mental states).
  • Sub-goal A2: Establish the legal concept of “cognitive liberty” — the right of every individual to mental self-determination, including the right to refuse neurochemical monitoring.

Goal B: Ensure Equitable Global Access

  • Sub-goal B1: Prevent monopolization of living neurotechnology by wealthy nations or corporations.
  • Sub-goal B2: Ensure that the biological materials, genetic constructs, and manufacturing knowledge required to produce these tools are accessible to researchers and clinicians in low- and middle-income countries.

Goal C: Prevent Weaponization & Misuse

  • Sub-goal C1: Prevent the engineering of biosensors that could be repurposed for covert surveillance of mental states (e.g., by authoritarian governments or employers).
  • Sub-goal C2: Prevent dual-use modification of neurotransmitter-responsive cells into neurotransmitter-producing cells that could be weaponized to alter behavior.

3. Three Governance Actions

Action 1: International Neurodata Privacy Standard (Regulatory)

Purpose: Currently, no international legal framework specifically addresses neural data. The EU GDPR covers “health data” broadly, and Chile passed a pioneering “neurorights” amendment in 2021, but no binding international standard exists. I propose a UN-level International Neurodata Convention that classifies neural data (including neurotransmitter profiles) as a protected category above standard health data, with explicit prohibitions on commercial sale, non-consensual collection, and use in employment or insurance decisions.

Design:

  • Actors: The UN Office of the Secretary-General would convene a working group including neuroscientists, bioethicists, representatives from WHO, and civil society organizations. States would opt in by ratification, similar to the Chemical Weapons Convention.
  • Mechanism: The convention would establish a “Neurodata Protection Authority” with the power to audit companies and institutions working with neural biosensors, require informed consent protocols, and impose sanctions for violations.
  • Implementation: Nations would transpose the convention into domestic law, similar to how GDPR works within EU member states.

Assumptions:

  • Neural data from biosensors is sufficiently distinct from other health data to warrant special protection (I believe it is, because it reveals internal mental states, not just physical conditions).
  • International consensus can be achieved — a significant assumption given geopolitical fragmentation.
  • Enforcement mechanisms would be effective across jurisdictions.

Risks of Failure & “Success”:

  • Failure: Without broad ratification, the standard becomes aspirational only. Countries that opt out could become “neurodata havens.”
  • Unintended success risk: Overly strict regulation could slow beneficial research, particularly in academic and non-profit contexts. Researchers might avoid the field entirely due to compliance burden.

Action 2: Open-Source Bio-BCI Toolkit Requirement (Incentive-Based)

Purpose: Currently, most BCI and biosensor research is locked behind corporate IP or paywalled academic publications. I propose that all publicly funded research on living neurotechnologies be required to release genetic constructs, protocols, and data under open-source licenses (similar to how NIH requires data sharing plans). This would be modeled on open-source DNA repositories such as Addgene and the iGEM Registry of Standard Biological Parts.

Design:

  • Actors: National funding agencies (NIH, NSF, ERC, COLCIENCIAS, etc.) would add this as a condition of grants.
  • Mechanism: Grantees must deposit all engineered genetic constructs in a public repository (e.g., Addgene) and publish all protocols on open platforms (e.g., protocols.io) within 12 months of publication.
  • Incentive structure: Compliance unlocks bonus funding or priority for future grants. Non-compliance results in ineligibility for renewal.
  • Community component: Fund community biology labs (like Genspace, BioCurious, or labs in the Global South) to test and adapt these tools for local contexts.

Assumptions:

  • Open-source release does not create unacceptable biosecurity risks (this is debatable and would require case-by-case biosecurity review).
  • Funding agencies have the political will to enforce this.
  • Open access to constructs will actually lead to broader use, not just accumulation in well-resourced labs.

Risks of Failure & “Success”:

  • Failure: Industry could lobby to exempt commercially relevant constructs. Researchers might release minimal or unusable data to technically comply.
  • Unintended success risk: Widely available genetic constructs for neurotransmitter-responsive cells could be accessed by bad actors. This creates a direct tension with Goal C (preventing weaponization) and requires complementary technical safeguards.

Action 3: Engineered Biosafety Kill-Switch in All Living Neural Sensors (Technical)

Purpose: Any living biosensor deployed near or inside the human body must have a built-in safety mechanism. I propose that all engineered cells used in neurotechnology applications be required to contain a genetic kill-switch — a synthetic circuit that causes cell death in the absence of a specific external molecule (a “deadman switch”) or in the presence of an emergency trigger compound. This is inspired by existing work on synthetic auxotrophies and engineered biocontainment in the synthetic biology community.

Design:

  • Actors: Regulatory bodies (FDA, EMA, INVIMA in Colombia) would require kill-switch certification as part of the approval process for any living neurotechnology product.
  • Technical design: Cells are engineered to be dependent on a synthetic amino acid not found in nature. Without regular external supply of this molecule, the cells die within 24-48 hours. Additionally, a secondary chemical trigger (e.g., a small molecule drug) can induce rapid apoptosis if needed.
  • Standards body: An international technical standards organization (e.g., ISO, or a new body under WHO) would define minimum requirements for kill-switch reliability, including required failure rates (<1 in 10^9 escape frequency).
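To make the <1 in 10^9 requirement concrete, here is a quick back-of-the-envelope calculation (the cell doses are hypothetical round numbers, not real device specifications):

```python
# How many escape mutants does a <1-in-10^9 standard tolerate in practice?
# Cell doses below are hypothetical round numbers, not real device specs.
ESCAPE_FREQUENCY = 1e-9          # maximum allowed escape frequency
for dose in (1e6, 1e8, 1e10):    # cells per implanted device (assumed)
    expected = dose * ESCAPE_FREQUENCY
    print(f"{dose:.0e} cells -> ~{expected:g} expected escape mutants")
```

Even at the required threshold, a sufficiently large implanted cell population can be expected to contain escape mutants, which is one reason the secondary apoptosis trigger matters.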

Assumptions:

  • Kill-switches can be made sufficiently robust (current kill-switches have known escape mutant problems — cells evolve to disable them).
  • The synthetic amino acid dependency doesn’t create supply chain vulnerabilities for patients.
  • Regulatory agencies have the technical capacity to evaluate kill-switch designs.

Risks of Failure & “Success”:

  • Failure: Evolution is relentless. Kill-switches can and do fail through mutation. Redundant circuits reduce but do not eliminate this risk.
  • Unintended success risk: Requiring kill-switches could create a false sense of security, leading to less stringent review of other safety aspects. Also, dependency on a synthetic molecule creates a potential point of coercion (withholding the molecule becomes a weapon).

4. Scoring Matrix

I score each governance action against my policy goals on a scale of 1 (best) to 3 (least effective), or n/a:

| Does the option: | Action 1: Neurodata Privacy Standard | Action 2: Open-Source Toolkit Requirement | Action 3: Biosafety Kill-Switch |
| --- | --- | --- | --- |
| **Protect Neural Privacy** | | | |
| • By preventing incidents | 1 | 3 | n/a |
| • By helping respond | 1 | n/a | n/a |
| **Ensure Equitable Access** | | | |
| • By preventing monopolization | 2 | 1 | 3 |
| • By enabling global access | 3 | 1 | 2 |
| **Prevent Weaponization** | | | |
| • By preventing incidents | 2 | 3 | 1 |
| • By helping respond | 2 | 3 | 1 |
| **Other considerations** | | | |
| • Minimizing costs/burdens | 3 | 2 | 3 |
| • Feasibility | 3 | 1 | 2 |
| • Not impeding research | 2 | 1 | 2 |
| • Promoting constructive applications | 2 | 1 | 2 |

5. Recommendation

I recommend a layered combination of all three governance actions, prioritized in this order:

First priority: Action 2 (Open-Source Toolkit Requirement) — This has the highest feasibility, lowest burden, and the greatest potential for immediate positive impact. It directly addresses equity (my deepest concern as someone from the Global South) and promotes constructive applications. Funding agencies already have mechanisms for this. It can be implemented within existing institutional frameworks. Audience: NIH Director, NSF, and COLCIENCIAS (Colombia’s science agency).

Second priority: Action 3 (Kill-Switch Requirement) — Essential for biosafety and biosecurity, but should be implemented as a technical standard developed collaboratively between synthetic biologists and regulators, not as a top-down mandate that stifles innovation. The key trade-off is between safety (kill-switches reduce risk) and reliability (they can fail, and dependency molecules create new vulnerabilities). Audience: FDA, WHO, and the international synthetic biology community (e.g., through iGEM’s safety committee).

Third priority: Action 1 (Neurodata Privacy Standard) — The most important in the long run, but the hardest to implement. International treaties take decades. I recommend starting with national-level “neurorights” legislation (following Chile’s example) and building toward an international convention. The key trade-off is between protection (strict rules safeguard privacy) and access (overly strict rules could prevent researchers in low-resource settings from participating). Audience: UN Office of the Secretary-General, with initial advocacy directed at the Chilean government (as a pioneer) and the Colombian Congress.

Key trade-off across all three: There is an inherent tension between openness (Action 2) and security (Actions 1 and 3). Making genetic constructs freely available increases both beneficial access and misuse potential. I believe this tension is best managed by combining open access with technical safeguards (kill-switches) and legal consequences (privacy standards), rather than by restricting access — because restricted access historically benefits the powerful and excludes the vulnerable.


6. Ethical Reflections from Week 1

This week’s lectures and readings surfaced several ethical concerns that were new or newly vivid to me:

The dual-use dilemma is not abstract — it is personal. When I think about engineering cells that detect neurotransmitters, I imagine helping a mother in Bogota understand why she feels the way she feels. But the same technology could be used by an authoritarian government to screen dissidents for “inappropriate” emotional states. George Church and David Baker’s 2024 Science editorial on protein design and biosecurity made this tension concrete: the same AI tools that design therapeutic proteins can design toxic ones. The question is not whether to build, but how to build responsibly.

Equity is a biosafety issue. The WHO’s “Global Guidance Framework for the Responsible Use of Life Sciences” (2022) discusses biorisks and dual-use governance, but says little about what happens when only wealthy nations can afford the safety infrastructure these frameworks require. If biosafety compliance is expensive, and only rich-country labs can afford it, then biosafety governance inadvertently becomes a tool of exclusion. This is a risk I want to address directly in my HTGAA project.

Community biology is an ethical imperative, not a hobby. The DIYBio movement and community biology labs (as described in the Handbook for Community Biology Spaces) are sometimes treated as curiosities. But after building Aurora App with a team that included non-engineers, I believe deeply that democratizing access to biological tools is essential for ethical governance. You cannot govern what you do not understand. And communities cannot advocate for their interests in biotechnology if they have never touched a pipette.

The “Lysine Contingency” is a metaphor for everything. In Jurassic Park, the dinosaurs were engineered to be dependent on externally supplied lysine — a kill-switch. It failed. Every biological safety mechanism we design will be tested by evolution. Humility about the limits of engineering control is itself an ethical stance.


Week 2 Lecture Preparation: DNA Read, Write, and Edit


Professor Jacobson’s Questions

Q1: What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?

DNA polymerase, the molecular machine responsible for copying DNA during cell division, has an intrinsic error rate of approximately 1 error per 10^4 to 10^5 nucleotides during initial synthesis. However, DNA polymerase also has a built-in 3’-to-5’ exonuclease proofreading activity that catches and corrects most of these errors immediately, improving fidelity to approximately 1 error per 10^7 nucleotides.

After replication, a second layer of correction — the mismatch repair (MMR) system — scans newly synthesized DNA for mismatches that proofreading missed. MMR improves overall fidelity by another factor of ~100-1000x, bringing the final error rate to approximately 1 error per 10^9 to 10^10 nucleotides per cell division.

The human genome is approximately 3.2 x 10^9 base pairs long. This means that even with all three layers of error correction (polymerase selectivity, proofreading, and mismatch repair), we expect roughly 0.3 to 3 new mutations per cell division across the entire genome.
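As a sanity check, the arithmetic behind these numbers fits in a few lines of Python (the rates are the order-of-magnitude values quoted above, not measured constants):

```python
# Back-of-the-envelope replication fidelity check.
# Rates are order-of-magnitude values; real rates vary by polymerase/context.
GENOME_BP = 3.2e9            # approximate haploid human genome length (bp)
polymerase_only = 1e-5       # ~1 error per 10^4-10^5 nt before proofreading
with_proofreading = 1e-7     # after 3'-to-5' exonuclease proofreading
with_mmr = 1e-9              # after mismatch repair (~100-1000x improvement)

for label, rate in [("polymerase only", polymerase_only),
                    ("+ proofreading", with_proofreading),
                    ("+ mismatch repair", with_mmr)]:
    print(f"{label:18s} ~{GENOME_BP * rate:,.1f} expected errors per division")
```

With the final rate around 10^-9 per nucleotide, the expected number of new errors per division lands in the low single digits, consistent with the 0.3-3 range.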

Biology “deals with” this remaining discrepancy through several strategies:

  • Redundancy: Most of the genome is non-coding (~98.5%), so most mutations land in regions where they have no immediate functional consequence.
  • Diploid genomes: Humans carry two copies of most genes, so a damaging mutation in one copy can be compensated by the healthy copy.
  • DNA damage repair systems: Beyond replication errors, additional repair pathways (base excision repair, nucleotide excision repair, homologous recombination) fix damage caused by UV light, oxidation, and other insults.
  • Apoptosis and immune surveillance: Cells that accumulate too many mutations can trigger programmed cell death or be eliminated by the immune system (this is why cancer, which evades these systems, is so dangerous).
  • Selection: At the population level, deleterious mutations are removed by natural selection over generations.

Q2: How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice, what are some of the reasons that all of these different codes don’t work?

The genetic code is degenerate (redundant), meaning most amino acids are encoded by more than one codon (triplet of nucleotides). There are 64 possible codons but only 20 amino acids (plus stop signals). On average, each amino acid is encoded by about 3 different codons.

For an average human protein of approximately 400 amino acids, the number of possible DNA sequences that encode the same amino acid sequence is roughly:

~3^400 ≈ 10^190 different DNA sequences

(The exact number depends on the specific amino acid composition, since some amino acids like leucine have 6 codons while methionine has only 1, but the order of magnitude is astronomically large.)
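Because the degeneracy of each amino acid is known, the exact count for any specific peptide is just a product of per-residue codon counts. A small sketch (the 10-residue example peptide is arbitrary):

```python
import math

# Number of synonymous codons per amino acid (standard genetic code; sums to 61).
DEGENERACY = {"A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2,
              "G": 4, "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2,
              "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4}

def synonymous_encodings(peptide: str) -> int:
    """Exact number of DNA sequences encoding this amino acid sequence."""
    return math.prod(DEGENERACY[aa] for aa in peptide.upper())

# Even 10 residues already admit thousands of encodings:
print(synonymous_encodings("MKTAYIAKQR"))

# Order-of-magnitude estimate for a 400-residue protein:
print(f"~10^{400 * math.log10(3):.0f}")
```

For a 400-residue protein the exponent is 400 · log10(3) ≈ 190.8, i.e. the astronomically large order of magnitude described above.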

In practice, most of these synonymous coding sequences would not work to produce a functional protein. The reasons include:

  1. Codon usage bias: Different organisms (and even different tissues within an organism) prefer certain codons over others. Rare codons can cause ribosomal stalling or pausing during translation, slowing or aborting protein production. For example, the codon AGG (arginine) is very rare in E. coli — using many of them in a synthetic gene would dramatically reduce protein yield.

  2. mRNA secondary structure: The specific nucleotide sequence determines the folding of the mRNA molecule. Certain sequences can form strong hairpin structures near the ribosome binding site or start codon that block translation initiation. Others can create structures within the coding region that cause ribosome stalling.

  3. mRNA stability: Some sequences contain motifs that are recognized by cellular RNA degradation machinery (e.g., AU-rich elements, or endonuclease recognition sites), leading to rapid mRNA destruction before sufficient protein can be made.

  4. Splicing signals: In eukaryotes, certain sequences within the coding region can be mistakenly recognized as splice donor or acceptor sites, leading to incorrect mRNA processing and truncated or non-functional proteins.

  5. CpG dinucleotide effects: In mammals, CG dinucleotides (CpGs) are targets for DNA methylation, which can silence gene expression. Sequences with high CpG content may be epigenetically silenced.

  6. Co-translational folding: The speed at which the ribosome translates different codons affects how the nascent protein folds. Some proteins require specific “pauses” at rare codons to allow domains to fold correctly before the next domain is synthesized. Changing these codons to fast-translating synonyms can cause misfolding.


Dr. LeProust’s Questions

Q1: What’s the most commonly used method for oligo synthesis currently?

The most commonly used method is phosphoramidite chemistry, developed in the early 1980s by Marvin Caruthers and colleagues. In this method, DNA oligonucleotides are synthesized chemically on a solid support (typically controlled pore glass or polystyrene beads) in the 3’-to-5’ direction — the opposite of biological DNA synthesis.

Each cycle adds one nucleotide and consists of four steps:

  1. Detritylation (deprotection): Removal of the DMT (dimethoxytrityl) protecting group from the 5’-OH of the growing chain, using trichloroacetic acid.
  2. Coupling: A phosphoramidite monomer (the next nucleotide, activated with tetrazole) reacts with the free 5’-OH. Coupling efficiencies are typically 98-99.5% per step.
  3. Capping: Unreacted 5’-OH groups are acetylated to prevent them from participating in subsequent cycles (which would produce deletion sequences).
  4. Oxidation: The phosphite triester linkage is oxidized to the more stable phosphate triester using iodine/water.

After all cycles are complete, the oligo is cleaved from the solid support and fully deprotected.

Q2: Why is it difficult to make oligos longer than 200nt via direct synthesis?

The fundamental problem is cumulative coupling efficiency. Even at an excellent coupling efficiency of 99.5% per step, the overall yield of full-length product decreases exponentially with length:

  • 20-mer: 0.995^20 = 90.5% full-length product
  • 100-mer: 0.995^100 = 60.6% full-length product
  • 200-mer: 0.995^200 = 36.7% full-length product
  • 300-mer: 0.995^300 = 22.2% full-length product
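The exponential decay of full-length yield is a one-line calculation; extending it to 2000 nt shows why direct gene synthesis is hopeless:

```python
def full_length_yield(n: int, coupling: float = 0.995) -> float:
    """Fraction of chains surviving all n coupling steps intact."""
    return coupling ** n

for n in (20, 100, 200, 300, 2000):
    print(f"{n:5d}-mer: {full_length_yield(n):9.4%} full-length product")
```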

At 200 nucleotides, roughly two-thirds of the molecules in the synthesis have at least one error (deletion, insertion, or substitution). This means:

  • The yield of correct full-length product is very low.
  • The purity is poor — the desired sequence is contaminated by a complex mixture of truncated and error-containing sequences.
  • Purification becomes extremely difficult because the error products are very similar in size and properties to the correct product.

Additionally, the depurination side reaction (loss of purine bases during the acid detritylation step) accumulates over many cycles, introducing additional sequence errors that are hard to remove.

Q3: Why can’t you make a 2000bp gene via direct oligo synthesis?

At 2000 nucleotides, even with 99.5% coupling efficiency, the theoretical yield of full-length correct product would be:

0.995^2000 ≈ 0.004% (4 in 100,000 molecules)

This is effectively zero usable product. The synthesis would produce an overwhelmingly complex mixture of trillions of different error-containing sequences, with the correct sequence being essentially undetectable.

Instead, genes of 2000bp or longer are made by gene assembly strategies:

  1. Synthesize short oligos (40-200nt) with overlapping sequences.
  2. Assemble them into longer fragments using PCR-based assembly (e.g., overlap extension PCR, Gibson Assembly) or ligation-based methods.
  3. Verify the assembled product by sequencing.
  4. Error-correct if needed, using enzymatic error correction (e.g., with mismatch-cleaving enzymes) or by selecting correct clones.

This hierarchical approach allows each short building block to be synthesized with manageable error rates, and errors can be identified and corrected at each stage of assembly. Companies like Twist Bioscience, IDT, and GenScript use automated versions of these assembly workflows to offer gene synthesis services for sequences of 10,000+ bp.
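To make step 1 of the assembly strategy concrete, here is a toy tiling function (deliberately simplified: a real oligo-design tool alternates strands, balances melting temperatures, and avoids repeats; the sequence and parameters below are invented):

```python
def split_into_oligos(gene: str, oligo_len: int = 60, overlap: int = 20) -> list[str]:
    """Tile a gene into overlapping sense-strand oligos (simplified sketch)."""
    step = oligo_len - overlap
    oligos = []
    pos = 0
    while pos < len(gene):
        oligos.append(gene[pos:pos + oligo_len])
        if pos + oligo_len >= len(gene):
            break  # last oligo reaches the end of the gene
        pos += step
    return oligos

gene = "ATG" + "ACGT" * 50 + "TAA"   # a toy 206 bp "gene"
parts = split_into_oligos(gene)
print(len(parts), "oligos of <=60 nt, each sharing 20 nt with its neighbor")
```

The shared 20 nt overlaps are what let overlap extension PCR or Gibson Assembly stitch the fragments back together in order.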


Professor Church’s Question

I chose Question 1:

What are the 10 essential amino acids in all animals, and how does this affect your view of the “Lysine Contingency”?

The 10 essential amino acids — those that animals cannot synthesize and must obtain from their diet — are:

  1. Histidine (His, H)
  2. Isoleucine (Ile, I)
  3. Leucine (Leu, L)
  4. Lysine (Lys, K)
  5. Methionine (Met, M)
  6. Phenylalanine (Phe, F)
  7. Threonine (Thr, T)
  8. Tryptophan (Trp, W)
  9. Valine (Val, V)
  10. Arginine (Arg, R) — conditionally essential (essential in young/growing animals, sometimes classified as semi-essential in adults)

(Note: Some sources list the canonical 9 — excluding arginine — but in all animals broadly, arginine is essential during growth.)

The “Lysine Contingency” is a fictional biosafety mechanism from Michael Crichton’s Jurassic Park (1990). In the novel and film, the dinosaurs were engineered to be unable to produce lysine, making them dependent on lysine-supplemented food provided by the park. The idea was that escaped dinosaurs would die without this external supply.

My view, informed by the biology:

The Lysine Contingency is a conceptually elegant but biologically naive containment strategy. Here’s why:

  1. Lysine is abundant in the environment. It is present in nearly all protein-rich food sources — meat, fish, legumes, and many plants. Any escaped carnivorous dinosaur eating other animals would obtain more than enough lysine. The contingency only works if the organism is in a completely controlled, lysine-free environment — which defeats the purpose of a containment strategy for escaped organisms.

  2. Evolution defeats auxotrophies. A lysine auxotrophy (inability to synthesize lysine) requires knocking out one or more genes in the lysine biosynthesis pathway. But mutations are stochastic, and reversion mutations or compensatory mutations that restore lysine synthesis — or alternative metabolic pathways — can arise. In a population of millions of cells (or thousands of dinosaurs over time), the selective pressure to regain lysine independence is enormous.

  3. Modern synthetic biology does better — but not perfectly. Today’s engineered auxotrophies use synthetic amino acids not found in nature (not one of the 20 standard amino acids), making environmental supplementation impossible. This is far more robust than the Lysine Contingency. But even these systems face evolutionary escape at rates of ~10^-7 to 10^-9 per generation, which is why redundant kill-switches and multi-layered containment are necessary.

  4. The real lesson: The Lysine Contingency teaches us that biological containment must assume biology will find a way around it. No single genetic safeguard is sufficient. Good biocontainment requires multiple orthogonal strategies — genetic, physical, and ecological — combined with monitoring and the humility to expect failure modes we haven’t predicted.
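A toy calculation shows why layering helps, under the (generous) assumption that safeguards fail independently; real escape mutations can be correlated, so these numbers are an upper bound on the benefit:

```python
# Toy model: combined escape rate of k safeguards that fail independently.
# Independence is a generous assumption; real escape mutations can be
# correlated (e.g., one mutation disabling a shared regulatory element).
per_switch_escape = 1e-7   # assumed per-generation escape rate per safeguard
for k in (1, 2, 3):
    print(f"{k} orthogonal safeguard(s): combined rate ~{per_switch_escape ** k:.0e}")
```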

This directly informs my HTGAA project: if I am engineering living biosensors to interface with the human nervous system, the containment strategy cannot rely on a single kill-switch. It must be layered, redundant, and designed with the assumption that evolution will test every weakness.

No AI prompts were used for this answer. References: Crichton, M. (1990). Jurassic Park. Church, G. & Regis, E. (2012). Regenesis. Mandell, D.J. et al. (2015). “Biocontainment of genetically modified organisms by synthetic protein design.” Nature, 518, 55-60.


Ethical Reflections — Week 1

Reflecting on this week’s lectures by David Kong, George Church, and Joe Jacobson, and the recitation by Ronan Donovan, Suvin Sundararajan, Subastian Kamau, and Greg Galperin, I want to highlight three ethical concerns:

1. The Global South is a stakeholder, not a spectator. Most governance frameworks for biotechnology are designed by and for institutions in the United States and Europe. As a Colombian, I notice how often “global governance” really means “governance by wealthy nations that developing countries are expected to follow.” Equitable governance must include voices from the communities most likely to be affected by — and most likely to benefit from — biological engineering.

2. The pace of capability outstrips the pace of governance. Gene synthesis costs are falling faster than regulators can write rules. CRISPR makes editing accessible to any lab with a few hundred dollars. This is not inherently bad — it is the same democratization that made computing transformative. But it demands that we invest in governance innovation at the same speed we invest in technical innovation.

3. Safety is not the opposite of access. The readings this week (especially the Synthetic Genomics: Options for Governance report and the WHO Global Guidance Framework) sometimes frame safety and openness as being in tension. I believe this is a false dichotomy if we design correctly. Open-source biological tools with built-in safety features (like the kill-switches I described above) can be both more accessible and safer than proprietary tools with no safety engineering at all.


Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices


  • Week 4 HW: Protein Design Part I

    Exploring BDNF structure, applying ML-based protein design tools, and engineering a bacteriophage in silico.


Week 1 HW: Principles and Practices

| Does the option: | Option 1 | Option 2 | Option 3 |
| --- | --- | --- | --- |
| **Enhance Biosecurity** | | | |
| • By preventing incidents | | | |
| • By helping respond | | | |
| **Foster Lab Safety** | | | |
| • By preventing incidents | | | |
| • By helping respond | | | |
| **Protect the environment** | | | |
| • By preventing incidents | | | |
| • By helping respond | | | |
| **Other considerations** | | | |
| • Minimizing costs and burdens to stakeholders | | | |
| • Feasibility | | | |
| • Not impeding research | | | |
| • Promoting constructive applications | | | |


Goal B: Ensure Equitable Global Access

  • Sub-goal B1: Prevent monopolization of living neurotechnology by wealthy nations or corporations.
  • Sub-goal B2: Ensure that the biological materials, genetic constructs, and manufacturing knowledge required to produce these tools are accessible to researchers and clinicians in low- and middle-income countries.

Goal C: Prevent Weaponization & Misuse

  • Sub-goal C1: Prevent the engineering of biosensors that could be repurposed for covert surveillance of mental states (e.g., by authoritarian governments or employers).
  • Sub-goal C2: Prevent dual-use modification of neurotransmitter-responsive cells into neurotransmitter-producing cells that could be weaponized to alter behavior.

3. Three Governance Actions

Action 1: International Neurodata Privacy Standard (Regulatory)

Purpose: Currently, no international legal framework specifically addresses neural data. The EU GDPR covers “health data” broadly, and Chile passed a pioneering “neurorights” amendment in 2021, but no binding international standard exists. I propose a UN-level International Neurodata Convention that classifies neural data (including neurotransmitter profiles) as a protected category above standard health data, with explicit prohibitions on commercial sale, non-consensual collection, and use in employment or insurance decisions.

Design:

  • Actors: The UN Office of the Secretary-General would convene a working group including neuroscientists, bioethicists, representatives from WHO, and civil society organizations. States would opt in by ratification, similar to the Chemical Weapons Convention.
  • Mechanism: The convention would establish a “Neurodata Protection Authority” with the power to audit companies and institutions working with neural biosensors, require informed consent protocols, and impose sanctions for violations.
  • Implementation: Nations would transpose the convention into domestic law, similar to how GDPR works within EU member states.

Assumptions:

  • Neural data from biosensors is sufficiently distinct from other health data to warrant special protection (I believe it is, because it reveals internal mental states, not just physical conditions).
  • International consensus can be achieved — a significant assumption given geopolitical fragmentation.
  • Enforcement mechanisms would be effective across jurisdictions.

Risks of Failure & “Success”:

  • Failure: Without broad ratification, the standard becomes aspirational only. Countries that opt out could become “neurodata havens.”
  • Unintended success risk: Overly strict regulation could slow beneficial research, particularly in academic and non-profit contexts. Researchers might avoid the field entirely due to compliance burden.

Action 2: Open-Source Bio-BCI Toolkit Requirement (Incentive-Based)

Purpose: Currently, most BCI and biosensor research is locked behind corporate IP or paywalled academic publications. I propose that all publicly funded research on living neurotechnologies be required to release genetic constructs, protocols, and data under open-source licenses (similar to how NIH requires data sharing plans). This would be modeled on the open-source DNA repositories like Addgene and the iGEM Registry of Standard Biological Parts.

Design:

  • Actors: National funding agencies (NIH, NSF, ERC, COLCIENCIAS, etc.) would add this as a condition of grants.
  • Mechanism: Grantees must deposit all engineered genetic constructs in a public repository (e.g., Addgene) and publish all protocols on open platforms (e.g., protocols.io) within 12 months of publication.
  • Incentive structure: Compliance unlocks bonus funding or priority for future grants. Non-compliance results in ineligibility for renewal.
  • Community component: Fund community biology labs (like Genspace, BioCurious, or labs in the Global South) to test and adapt these tools for local contexts.

Assumptions:

  • Open-source release does not create unacceptable biosecurity risks (this is debatable and would require case-by-case biosecurity review).
  • Funding agencies have the political will to enforce this.
  • Open access to constructs will actually lead to broader use, not just accumulation in well-resourced labs.

Risks of Failure & “Success”:

  • Failure: Industry could lobby to exempt commercially relevant constructs. Researchers might release minimal or unusable data to technically comply.
  • Unintended success risk: Widely available genetic constructs for neurotransmitter-responsive cells could be accessed by bad actors. This creates a direct tension with Goal C (preventing weaponization) and requires complementary technical safeguards.

Action 3: Engineered Biosafety Kill-Switch in All Living Neural Sensors (Technical)

Purpose: Any living biosensor deployed near or inside the human body must have a built-in safety mechanism. I propose that all engineered cells used in neurotechnology applications be required to contain a genetic kill-switch — a synthetic circuit that causes cell death in the absence of a specific external molecule (a “deadman switch”) or in the presence of an emergency trigger compound. This is inspired by existing work on synthetic auxotrophies and engineered biocontainment in the synthetic biology community.

Design:

  • Actors: Regulatory bodies (FDA, EMA, INVIMA in Colombia) would require kill-switch certification as part of the approval process for any living neurotechnology product.
  • Technical design: Cells are engineered to be dependent on a synthetic amino acid not found in nature. Without regular external supply of this molecule, the cells die within 24-48 hours. Additionally, a secondary chemical trigger (e.g., a small molecule drug) can induce rapid apoptosis if needed.
  • Standards body: An international technical standards organization (e.g., ISO, or a new body under WHO) would define minimum requirements for kill-switch reliability, including required failure rates (<1 in 10^9 escape frequency).

Assumptions:

  • Kill-switches can be made sufficiently robust (current kill-switches have known escape mutant problems — cells evolve to disable them).
  • The synthetic amino acid dependency doesn’t create supply chain vulnerabilities for patients.
  • Regulatory agencies have the technical capacity to evaluate kill-switch designs.

Risks of Failure & “Success”:

  • Failure: Evolution is relentless. Kill-switches can and do fail through mutation. Redundant circuits reduce but do not eliminate this risk.
  • Unintended success risk: Requiring kill-switches could create a false sense of security, leading to less stringent review of other safety aspects. Also, dependency on a synthetic molecule creates a potential point of coercion (withholding the molecule becomes a weapon).

4. Scoring Matrix

I score each governance action against my policy goals on a scale of 1 (best) to 3 (least effective), or n/a:

| Does the option: | Action 1: Neurodata Privacy Standard | Action 2: Open-Source Toolkit Requirement | Action 3: Biosafety Kill-Switch |
|---|---|---|---|
| Protect Neural Privacy | | | |
| • By preventing incidents | 1 | 3 | n/a |
| • By helping respond | 1 | n/a | n/a |
| Ensure Equitable Access | | | |
| • By preventing monopolization | 2 | 1 | 3 |
| • By enabling global access | 3 | 1 | 2 |
| Prevent Weaponization | | | |
| • By preventing incidents | 2 | 3 | 1 |
| • By helping respond | 2 | 3 | 1 |
| Other considerations | | | |
| • Minimizing costs/burdens | 3 | 2 | 3 |
| • Feasibility | 3 | 1 | 2 |
| • Not impeding research | 2 | 1 | 2 |
| • Promoting constructive applications | 2 | 1 | 2 |

5. Recommendation

I recommend a layered combination of all three governance actions, prioritized in this order:

First priority: Action 2 (Open-Source Toolkit Requirement) — This has the highest feasibility, lowest burden, and the greatest potential for immediate positive impact. It directly addresses equity (my deepest concern as someone from the Global South) and promotes constructive applications. Funding agencies already have mechanisms for this. It can be implemented within existing institutional frameworks. Audience: NIH Director, NSF, and COLCIENCIAS (Colombia’s science agency).

Second priority: Action 3 (Kill-Switch Requirement) — Essential for biosafety and biosecurity, but should be implemented as a technical standard developed collaboratively between synthetic biologists and regulators, not as a top-down mandate that stifles innovation. The key trade-off is between safety (kill-switches reduce risk) and reliability (they can fail, and dependency molecules create new vulnerabilities). Audience: FDA, WHO, and the international synthetic biology community (e.g., through iGEM’s safety committee).

Third priority: Action 1 (Neurodata Privacy Standard) — The most important in the long run, but the hardest to implement. International treaties take decades. I recommend starting with national-level “neurorights” legislation (following Chile’s example) and building toward an international convention. The key trade-off is between protection (strict rules safeguard privacy) and access (overly strict rules could prevent researchers in low-resource settings from participating). Audience: UN Office of the Secretary-General, with initial advocacy directed at the Chilean government (as a pioneer) and the Colombian Congress.

Key trade-off across all three: There is an inherent tension between openness (Action 2) and security (Actions 1 and 3). Making genetic constructs freely available increases both beneficial access and misuse potential. I believe this tension is best managed by combining open access with technical safeguards (kill-switches) and legal consequences (privacy standards), rather than by restricting access — because restricted access historically benefits the powerful and excludes the vulnerable.


6. Ethical Reflections from Week 1

This week’s lectures and readings surfaced several ethical concerns that were new or newly vivid to me:

The dual-use dilemma is not abstract — it is personal. When I think about engineering cells that detect neurotransmitters, I imagine helping a mother in Bogota understand why she feels the way she feels. But the same technology could be used by an authoritarian government to screen dissidents for “inappropriate” emotional states. George Church and David Baker’s 2024 Science editorial on protein design and biosecurity made this tension concrete: the same AI tools that design therapeutic proteins can design toxic ones. The question is not whether to build, but how to build responsibly.

Equity is a biosafety issue. The WHO’s “Global Guidance Framework for the Responsible Use of Life Sciences” (2022) discusses biorisks and dual-use governance, but says little about what happens when only wealthy nations can afford the safety infrastructure these frameworks require. If biosafety compliance is expensive, and only rich-country labs can afford it, then biosafety governance inadvertently becomes a tool of exclusion. This is a risk I want to address directly in my HTGAA project.

Community biology is an ethical imperative, not a hobby. The DIYBio movement and community biology labs (as described in the Handbook for Community Biology Spaces) are sometimes treated as curiosities. But after building Aurora App with a team that included non-engineers, I believe deeply that democratizing access to biological tools is essential for ethical governance. You cannot govern what you do not understand. And communities cannot advocate for their interests in biotechnology if they have never touched a pipette.

The “Lysine Contingency” is a metaphor for everything. In Jurassic Park, the dinosaurs were engineered to be dependent on externally supplied lysine — a kill-switch. It failed. Every biological safety mechanism we design will be tested by evolution. Humility about the limits of engineering control is itself an ethical stance.


Week 2 Lecture Preparation: DNA Read, Write, and Edit


Professor Jacobson’s Questions

Q1: What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?

DNA polymerase, the molecular machine responsible for copying DNA during cell division, has an intrinsic error rate of approximately 1 error per 10^4 to 10^5 nucleotides during initial synthesis. However, DNA polymerase also has a built-in 3’-to-5’ exonuclease proofreading activity that catches and corrects most of these errors immediately, improving fidelity to approximately 1 error per 10^7 nucleotides.

After replication, a second layer of correction — the mismatch repair (MMR) system — scans newly synthesized DNA for mismatches that proofreading missed. MMR improves overall fidelity by another factor of ~100-1000x, bringing the final error rate to approximately 1 error per 10^9 to 10^10 nucleotides per cell division.

The human genome is approximately 3.2 x 10^9 base pairs long. This means that even with all three layers of error correction (polymerase selectivity, proofreading, and mismatch repair), we expect roughly 0.3 to 3 new mutations per cell division across the entire genome.
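This back-of-the-envelope arithmetic is easy to check directly (a quick sketch using the genome size and post-repair error-rate range quoted above):

```python
# Expected new mutations per cell division, given the genome size and the
# post-repair error rate range discussed above.
GENOME_BP = 3.2e9  # approximate human genome length in base pairs

for error_rate in (1e-10, 1e-9):  # errors per nucleotide per division
    mutations = GENOME_BP * error_rate
    print(f"rate {error_rate:.0e} -> ~{mutations:.1f} new mutations/division")
# -> ~0.3 and ~3.2 new mutations per division
```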

Biology “deals with” this remaining discrepancy through several strategies:

  • Redundancy: Most of the genome is non-coding (~98.5%), so most mutations land in regions where they have no immediate functional consequence.
  • Diploid genomes: Humans carry two copies of most genes, so a damaging mutation in one copy can be compensated by the healthy copy.
  • DNA damage repair systems: Beyond replication errors, additional repair pathways (base excision repair, nucleotide excision repair, homologous recombination) fix damage caused by UV light, oxidation, and other insults.
  • Apoptosis and immune surveillance: Cells that accumulate too many mutations can trigger programmed cell death or be eliminated by the immune system (this is why cancer, which evades these systems, is so dangerous).
  • Selection: At the population level, deleterious mutations are removed by natural selection over generations.

Q2: How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice, what are some of the reasons that all of these different codes don’t work?

The genetic code is degenerate (redundant), meaning most amino acids are encoded by more than one codon (triplet of nucleotides). There are 64 possible codons but only 20 amino acids (plus stop signals). On average, each amino acid is encoded by about 3 different codons.

For an average human protein of approximately 400 amino acids, the number of possible DNA sequences that encode the same amino acid sequence is roughly:

~3^400 ≈ 10^190 different DNA sequences

(The exact number depends on the specific amino acid composition, since some amino acids like leucine have 6 codons while methionine has only 1, but the order of magnitude is astronomically large.)
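The count can be reproduced from the standard codon table (a minimal sketch; the degeneracy values are the per-amino-acid codon counts of the standard genetic code, and the 400-residue "average protein" is the same idealization as above):

```python
import math

# Codons per amino acid in the standard genetic code (61 sense codons total).
DEGENERACY = {
    'A': 4, 'R': 6, 'N': 2, 'D': 2, 'C': 2, 'Q': 2, 'E': 2, 'G': 4,
    'H': 2, 'I': 3, 'L': 6, 'K': 2, 'M': 1, 'F': 2, 'P': 4, 'S': 6,
    'T': 4, 'W': 1, 'Y': 2, 'V': 4,
}

def log10_encodings(protein: str) -> float:
    """log10 of the number of distinct DNA sequences encoding this protein."""
    return sum(math.log10(DEGENERACY[aa]) for aa in protein)

# Idealized 400-residue protein with ~3 codons per residue:
print(400 * math.log10(3))          # ≈ 190.8, i.e. ~10^190 encodings
print(log10_encodings('MKTAYIAK'))  # exact value for a short example peptide
```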

In practice, most of these synonymous coding sequences would not work to produce a functional protein. The reasons include:

  1. Codon usage bias: Different organisms (and even different tissues within an organism) prefer certain codons over others. Rare codons can cause ribosomal stalling or pausing during translation, slowing or aborting protein production. For example, the codon AGG (arginine) is very rare in E. coli — using many of them in a synthetic gene would dramatically reduce protein yield.

  2. mRNA secondary structure: The specific nucleotide sequence determines the folding of the mRNA molecule. Certain sequences can form strong hairpin structures near the ribosome binding site or start codon that block translation initiation. Others can create structures within the coding region that cause ribosome stalling.

  3. mRNA stability: Some sequences contain motifs that are recognized by cellular RNA degradation machinery (e.g., AU-rich elements, or endonuclease recognition sites), leading to rapid mRNA destruction before sufficient protein can be made.

  4. Splicing signals: In eukaryotes, certain sequences within the coding region can be mistakenly recognized as splice donor or acceptor sites, leading to incorrect mRNA processing and truncated or non-functional proteins.

  5. CpG dinucleotide effects: In mammals, CG dinucleotides (CpGs) are targets for DNA methylation, which can silence gene expression. Sequences with high CpG content may be epigenetically silenced.

  6. Co-translational folding: The speed at which the ribosome translates different codons affects how the nascent protein folds. Some proteins require specific “pauses” at rare codons to allow domains to fold correctly before the next domain is synthesized. Changing these codons to fast-translating synonyms can cause misfolding.


Dr. LeProust’s Questions

Q1: What’s the most commonly used method for oligo synthesis currently?

The most commonly used method is phosphoramidite chemistry, developed in the early 1980s by Marvin Caruthers and colleagues. In this method, DNA oligonucleotides are synthesized chemically on a solid support (typically controlled pore glass or polystyrene beads) in the 3’-to-5’ direction — the opposite of biological DNA synthesis.

Each cycle adds one nucleotide and consists of four steps:

  1. Detritylation (deprotection): Removal of the DMT (dimethoxytrityl) protecting group from the 5’-OH of the growing chain, using trichloroacetic acid.
  2. Coupling: A phosphoramidite monomer (the next nucleotide, activated with tetrazole) reacts with the free 5’-OH. Coupling efficiencies are typically 98-99.5% per step.
  3. Capping: Unreacted 5’-OH groups are acetylated to prevent them from participating in subsequent cycles (which would produce deletion sequences).
  4. Oxidation: The phosphite triester linkage is oxidized to the more stable phosphate triester using iodine/water.

After all cycles are complete, the oligo is cleaved from the solid support and fully deprotected.

Q2: Why is it difficult to make oligos longer than 200nt via direct synthesis?

The fundamental problem is cumulative coupling efficiency. Even at an excellent coupling efficiency of 99.5% per step, the overall yield of full-length product decreases exponentially with length:

  • 20-mer: 0.995^20 = 90.5% full-length product
  • 100-mer: 0.995^100 = 60.6% full-length product
  • 200-mer: 0.995^200 = 36.7% full-length product
  • 300-mer: 0.995^300 = 22.2% full-length product
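The exponential decay above is easy to reproduce (a quick sketch of the same calculation; following the text, the oligo length is used directly as the exponent, and side reactions like depurination are ignored):

```python
# Fraction of full-length product after n couplings at a given per-step
# coupling efficiency (side reactions such as depurination are ignored).
def full_length_yield(n_couplings: int, efficiency: float = 0.995) -> float:
    return efficiency ** n_couplings

for n in (20, 100, 200, 300):
    print(f"{n}-mer: {full_length_yield(n):.1%} full-length")
# -> 90.5%, 60.6%, 36.7%, 22.2%
```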

At 200 nucleotides, roughly two-thirds of the molecules in the synthesis have at least one error (deletion, insertion, or substitution). This means:

  • The yield of correct full-length product is very low.
  • The purity is poor — the desired sequence is contaminated by a complex mixture of truncated and error-containing sequences.
  • Purification becomes extremely difficult because the error products are very similar in size and properties to the correct product.

Additionally, the depurination side reaction (loss of purine bases during the acid detritylation step) accumulates over many cycles, introducing additional sequence errors that are hard to remove.

Q3: Why can’t you make a 2000bp gene via direct oligo synthesis?

At 2000 nucleotides, even with 99.5% coupling efficiency, the theoretical yield of full-length correct product would be:

0.995^2000 ≈ 0.004% (4 in 100,000 molecules)

This is effectively zero usable product. The synthesis would produce an overwhelmingly complex mixture of trillions of different error-containing sequences, with the correct sequence being essentially undetectable.

Instead, genes of 2000bp or longer are made by gene assembly strategies:

  1. Synthesize short oligos (40-200nt) with overlapping sequences.
  2. Assemble them into longer fragments using PCR-based assembly (e.g., overlap extension PCR, Gibson Assembly) or ligation-based methods.
  3. Verify the assembled product by sequencing.
  4. Error-correct if needed, using enzymatic error correction (e.g., with mismatch-cleaving enzymes) or by selecting correct clones.

This hierarchical approach allows each short building block to be synthesized with manageable error rates, and errors can be identified and corrected at each stage of assembly. Companies like Twist Bioscience, IDT, and GenScript use automated versions of these assembly workflows to offer gene synthesis services for sequences of 10,000+ bp.
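The tiling step of this hierarchical approach can be sketched in a few lines (a toy illustration, not a real assembly-design tool; the 60 nt oligo length and 20 nt overlap are arbitrary example parameters, and only the sense strand is modeled):

```python
import random

def tile(seq: str, oligo_len: int = 60, overlap: int = 20) -> list[str]:
    """Split a target sequence into overlapping oligos (sense strand only)."""
    step = oligo_len - overlap
    oligos, i = [], 0
    while True:
        oligos.append(seq[i:i + oligo_len])
        if i + oligo_len >= len(seq):
            return oligos
        i += step

def assemble(oligos: list[str], overlap: int = 20) -> str:
    """Rejoin tiles via their shared overlaps (mimicking overlap-extension PCR)."""
    seq = oligos[0]
    for o in oligos[1:]:
        assert seq[-overlap:] == o[:overlap], "overlap mismatch"
        seq += o[overlap:]
    return seq

random.seed(0)
gene = ''.join(random.choice('ACGT') for _ in range(2000))
tiles = tile(gene)
print(len(tiles), 'oligos')   # a 2000 bp gene tiles into 50 overlapping 60-mers
assert assemble(tiles) == gene
```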


Professor Church’s Question

I chose Question 1:

What are the 10 essential amino acids in all animals, and how does this affect your view of the “Lysine Contingency”?

The 10 essential amino acids — those that animals cannot synthesize and must obtain from their diet — are:

  1. Histidine (His, H)
  2. Isoleucine (Ile, I)
  3. Leucine (Leu, L)
  4. Lysine (Lys, K)
  5. Methionine (Met, M)
  6. Phenylalanine (Phe, F)
  7. Threonine (Thr, T)
  8. Tryptophan (Trp, W)
  9. Valine (Val, V)
  10. Arginine (Arg, R) — conditionally essential (essential in young/growing animals, sometimes classified as semi-essential in adults)

(Note: Some sources list the canonical 9 — excluding arginine — but in all animals broadly, arginine is essential during growth.)

The “Lysine Contingency” is a fictional biosafety mechanism from Michael Crichton’s Jurassic Park (1990). In the novel and film, the dinosaurs were engineered to be unable to produce lysine, making them dependent on lysine-supplemented food provided by the park. The idea was that escaped dinosaurs would die without this external supply.

My view, informed by the biology:

The Lysine Contingency is a conceptually elegant but biologically naive containment strategy. Here’s why:

  1. Lysine is abundant in the environment. It is present in nearly all protein-rich food sources — meat, fish, legumes, and many plants. Any escaped carnivorous dinosaur eating other animals would obtain more than enough lysine. The contingency only works if the organism is in a completely controlled, lysine-free environment — which defeats the purpose of a containment strategy for escaped organisms.

  2. Evolution defeats auxotrophies. A lysine auxotrophy (inability to synthesize lysine) requires knocking out one or more genes in the lysine biosynthesis pathway. But mutations are stochastic, and reversion mutations or compensatory mutations that restore lysine synthesis — or alternative metabolic pathways — can arise. In a population of millions of cells (or thousands of dinosaurs over time), the selective pressure to regain lysine independence is enormous.

  3. Modern synthetic biology does better — but not perfectly. Today’s engineered auxotrophies use synthetic amino acids not found in nature (not one of the 20 standard amino acids), making environmental supplementation impossible. This is far more robust than the Lysine Contingency. But even these systems face evolutionary escape at rates of ~10^-7 to 10^-9 per generation, which is why redundant kill-switches and multi-layered containment are necessary.

  4. The real lesson: The Lysine Contingency teaches us that biological containment must assume biology will find a way around it. No single genetic safeguard is sufficient. Good biocontainment requires multiple orthogonal strategies — genetic, physical, and ecological — combined with monitoring and the humility to expect failure modes we haven’t predicted.

This directly informs my HTGAA project: if I am engineering living biosensors to interface with the human nervous system, the containment strategy cannot rely on a single kill-switch. It must be layered, redundant, and designed with the assumption that evolution will test every weakness.

No AI prompts were used for this answer. References: Crichton, M. (1990). Jurassic Park. Church, G. & Regis, E. (2012). Regenesis. Mandell, D.J. et al. (2015). “Biocontainment of genetically modified organisms by synthetic protein design.” Nature, 518, 55-60.


Ethical Reflections — Week 1

Reflecting on this week’s lectures by David Kong, George Church, and Joe Jacobson, and the recitation by Ronan Donovan, Suvin Sundararajan, Subastian Kamau, and Greg Galperin, I want to highlight three ethical concerns:

1. The Global South is a stakeholder, not a spectator. Most governance frameworks for biotechnology are designed by and for institutions in the United States and Europe. As a Colombian, I notice how often “global governance” really means “governance by wealthy nations that developing countries are expected to follow.” Equitable governance must include voices from the communities most likely to be affected by — and most likely to benefit from — biological engineering.

2. The pace of capability outstrips the pace of governance. Gene synthesis costs are falling faster than regulators can write rules. CRISPR makes editing accessible to any lab with a few hundred dollars. This is not inherently bad — it is the same democratization that made computing transformative. But it demands that we invest in governance innovation at the same speed we invest in technical innovation.

3. Safety is not the opposite of access. The readings this week (especially the Synthetic Genomics: Options for Governance report and the WHO Global Guidance Framework) sometimes frame safety and openness as being in tension. I believe this is a false dichotomy if we design correctly. Open-source biological tools with built-in safety features (like the kill-switches I described above) can be both more accessible and safer than proprietary tools with no safety engineering at all.


Week 4 HW: Protein Design Part I

Part A. Conceptual Questions

Question: Why do humans eat beef but do not become a cow, or eat fish but do not become fish?

Answer: When humans consume beef or fish, our digestive system (using enzymes like pepsin and trypsin, along with stomach acid) breaks down the complex, species-specific proteins of the animal into their fundamental building blocks: individual amino acids. These free amino acids are absorbed into our bloodstream. Our own cellular machinery (ribosomes) then uses these generic amino acids to synthesize entirely new human proteins, strictly following the instructions encoded in our human DNA. We are simply recycling the universal building blocks of life, rather than incorporating the animal’s genetic blueprint.


Part B. Protein Analysis and Visualization

  • Protein Selected: Brain-Derived Neurotrophic Factor (BDNF)
  • PDB ID: 1B8M
  • Reason for selection: BDNF is a crucial protein for the survival, growth, and maintenance of neurons, specifically the spiral ganglion neurons in the inner ear. Understanding and engineering neurotrophins like BDNF is a fundamental step toward regenerative therapies for sensorineural hearing loss and tinnitus.
  • Amino Acid Sequence: HSDPARRGELSVCDSISEWVTAADKKTAVDMSGGTVTVLEKVPVSKGQLKQYFYETKCNPMGYTKEGCRGIDKRHWNSQCRTTQSYVRALTMDSKKRIGWRFIRIDTSCVCTLTIKRGR
  • Sequence Length & Most Frequent AA: The mature chain is 119 amino acids long. Arginine (R) and Lysine (K) are highly frequent, giving it a basic nature.
  • Homologs: There are hundreds of homologs. A pBLAST search reveals extremely high conservation across mammals (mice, rats, macaques), indicating its critical evolutionary role in nervous system development.
  • Protein Family: It belongs to the Neurotrophin family (which includes NGF, NT-3, and NT-4).
  • Structure Details (RCSB): The structure (1B8M) was solved in 1999 using X-ray diffraction. The resolution is 2.30 Å, which is considered a good quality, high-resolution structure.
  • Other Molecules: The solved structure often contains water molecules (HOH) and sometimes stabilizing ions like sulfate (SO4) from the crystallization buffer.
  • Structure Classification: It belongs to the Cysteine knot cytokine superfamily, characterized by a specific arrangement of disulfide bonds that stabilize the structure.
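The length and composition claims above can be verified in a few lines (a quick check against the mature-chain sequence listed above):

```python
from collections import Counter

# Mature-chain BDNF sequence as listed above.
BDNF = (
    "HSDPARRGELSVCDSISEWVTAADKKTAVDMSGGTVTVLEKVPVSKGQLKQYFYETKCNPMGYTKEGC"
    "RGIDKRHWNSQCRTTQSYVRALTMDSKKRIGWRFIRIDTSCVCTLTIKRGR"
)

counts = Counter(BDNF)
print(len(BDNF))                  # 119 residues
print(counts['K'], counts['R'])   # lysine and arginine counts (basic residues)
```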

(Note: Images of PyMOL visualizations to be uploaded)


Part C. Using ML-Based Protein Design Tools

Protein Language Models

  • Deep Mutational Scans (ESM2): The unsupervised scan generated by ESM2 shows a stark decrease in likelihood (low fitness scores) when mutating the conserved Cysteine residues (e.g., Cys13, Cys58, Cys109). This pattern makes sense because these residues form the critical “cysteine knot” that holds the neurotrophin structure together. Mutating them to bulky or charged amino acids destabilizes the entire fold.
  • Latent Space Analysis: In the reduced dimensionality embedding, BDNF maps closely to its evolutionary neighbors, such as Nerve Growth Factor (NGF) and Neurotrophin-3 (NT-3). It is positioned distinctly away from unrelated metabolic enzymes, confirming that the LLM successfully groups proteins by functional and structural homology without being explicitly told to do so.
  • Attention Maps: The attention heads in the deeper layers of ESM2 correlate strongly with the 2D contact map of BDNF. The model pays high attention to residue pairs that form the beta-hairpin structures and the stabilizing disulfide bridges.

Protein Folding & Generation

  • ESMFold: When inputting the wild-type BDNF sequence into ESMFold, the predicted coordinates overlap almost perfectly with the 1B8M crystal structure (high pLDDT scores), except for a few flexible terminal loops which typically lack structural definition.
  • Inverse-Folding (ProteinMPNN): Feeding the 1B8M backbone into ProteinMPNN yielded several novel sequence candidates. Interestingly, the predicted sequences only shared about 60% sequence identity with the wild-type human BDNF. However, when these novel sequences were fed back into ESMFold, they predicted the exact same 3D backbone. This demonstrates the power of inverse folding: we can design completely new sequences that adopt the same therapeutic shape.

Part D. Group Brainstorm on Bacteriophage Engineering

Computational Engineering of Phage Capsid for Increased Thermal Stability

Goal: Increased stability (preventing phage degradation during storage or in human physiological conditions).

  • Proposed Tools & Approaches:

    • ProteinMPNN: We propose using ProteinMPNN on the major capsid protein backbone of our target phage to perform inverse-folding. The goal is to generate sequence variants that naturally pack tighter and favor lower-energy states.
    • ESM2 (Deep Mutational Scanning): We will run a zero-shot mutational scan on the capsid sequence to identify specific point mutations that the language model predicts will increase fitness and thermal stability without disrupting the overall fold.
    • AlphaFold-Multimer: Since the capsid is a complex of many repeating units, we will use AF-Multimer to predict how our new mutant monomers assemble. We need to ensure that stabilizing the monomer doesn’t prevent the self-assembly of the full viral head.
  • Why these tools? Phage therapy is often limited by the short shelf-life of the phages. By using generative models (ProteinMPNN), we move away from random trial-and-error. Instead, we instruct the AI to find the most thermodynamically stable sequence for an existing geometric shape, rapidly accelerating the design phase.

  • Potential Pitfalls:

    • The “Rigid” Problem: If we over-stabilize the capsid protein, the phage might become too rigid. Bacteriophages need a degree of flexibility to effectively inject their DNA into the host bacteria upon binding. We risk creating a phage that survives forever but is functionally inert.
    • Compute Limits: Simulating an entire viral capsid assembly in AlphaFold-Multimer is computationally extremely expensive and might exceed our available Colab GPU limits.

Subsections of Labs

Week 1 Lab: Pipetting

cover image cover image

Subsections of Projects

Individual Final Project

cover image cover image

Group Final Project

cover image cover image