Week 1 HW: Principles and Practices


Biological Engineering Application & Research Proposal

Agnostic Biosurveillance in the Age of Generative AI


Important

HTGAA | APPLICATION

Describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project, something you are already doing in your research, or something you are just curious about.


The Landscape of Risk

As AI capabilities grow, concerns about the biosecurity risks posed by frontier large language models (LLMs) have grown significantly. Recent reports based on surveys of biosecurity and biology experts indicate that if AI reaches certain performance levels, the annual risk of a human-caused epidemic (one with more than 100,000 deaths) could increase from 0.3% to 1.5%.

Given this landscape, my proposal is to develop a detection system capable of identifying any genetically modified organism, even if it is completely new and does not exist in any database.


Focus on how it was made, not on what it is

As I understand it, the problem with current biosecurity is that it functions reactively, much like antivirus software that only recognizes threats it has seen before. If someone creates a new pathogen from scratch using AI, our current tools might not detect it because they would not recognize the specific sequence.

My proposal, instead, focuses not on the identity of the organism but on its creation process.

The “Human Accent” Hypothesis

The central idea is that human-manufactured DNA possesses an “accent,” a statistical pattern distinct from that of natural DNA. This occurs because humans use specific tools to design, optimize, and assemble genetic code. I propose to use AI models (specifically, genomic language models) to learn this “human accent” systematically.
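As an illustration only, the “accent” idea can be sketched as a log-likelihood-ratio score over k-mer frequencies. A real system would use a trained genomic language model; the two reference profiles here are toy stand-ins, not real data:

```python
import math
from collections import Counter

def kmer_freqs(seq, k=3):
    """Relative frequency of each overlapping k-mer in a DNA string."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

def log_likelihood_ratio(seq, engineered_profile, natural_profile, k=3, floor=1e-6):
    """Sum of per-k-mer log-odds: positive means 'looks engineered'.

    `floor` is a pseudo-probability for k-mers absent from a profile,
    so the ratio stays finite.
    """
    score = 0.0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        p_eng = engineered_profile.get(kmer, floor)
        p_nat = natural_profile.get(kmer, floor)
        score += math.log(p_eng / p_nat)
    return score
```

Usage: build `engineered_profile` from a corpus of known synthetic constructs and `natural_profile` from natural genomes, then flag sequences whose score exceeds a calibrated threshold. This k-mer model only captures the shallowest layer of the “accent”; the proposal’s genomic language models would learn far richer patterns.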


Mechanism and Vision

Once the AI identifies an artificially designed sequence, it would be coupled with an enhanced CRISPR-Cas13 system to trigger a biological alarm. Ultimately, I seek to create a tool that makes any genetic engineering intervention “visible,” ensuring that we can anticipate and mitigate the growing risks posed by generative AI applied to biology.


References of Interest

  • Linder, J., et al. (2024). Nature Machine Intelligence. Validates the use of AI architectures to identify statistical biases in synthetic DNA.
  • Gootenberg, J. S., et al. (2023). Science. Establishes the technical viability of CRISPR-Cas13 sensors for PCR-free biodefense applications.
  • Zhou et al. (2024). Heliyon. Confirms that CRISPR-Cas13a systems achieve exceptional sensitivity, with single-copy detection capacity.
  • Forecasting LLM-enabled biorisk and the efficacy of safeguards (2025). Forecasting Research Institute.

Important

HTGAA | ETHICS & GOVERNANCE

Describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future, like ensuring non-malfeasance (preventing harm). Break big goals down into two or more specific sub-goals.


Goal 1: Preventing Misuse of the System (Security)

Because this tool would have a wide reach, so would its implications; we must therefore prevent anyone from using it to learn how to manufacture threats that “deceive” detection.

  • Controlled Access (API): Do not release the AI code openly. Instead, allow it to be used under supervision so that no one can use it to “train” new invisible pathogens.
  • Attack Simulations (Red-Teaming): Assemble a panel of experts to constantly try to break or deceive the system. In this way, the goal is to stay ahead of those who wish to use AI to cause harm.

Goal 2: Fair Access for Everyone (Equity)

Biological security should not be a luxury for rich countries; it must protect everyone.

  • Accessible CRISPR Technology: Although the AI remains private for security reasons, the instructions to manufacture the biological sensor (CRISPR) must be free and low-cost so that any country can use them.
  • Solidarity-based Global Alert: Create a network where, if a country detects something unusual, it can warn the rest of the world efficiently.

Goal 3: Protecting Privacy and Legitimate Science (Ethics)

The system must be capable of distinguishing between a real threat and a medical breakthrough or a natural mutation.

  • Clear Boundaries: Create rules that separate what is a dangerous pathogen from what is medicine (such as gene therapy), in order to avoid slowing down science that actually helps people.
  • Human Oversight: Design safeguards and the corresponding alignment so that the AI does not make extreme decisions on its own. If the system detects something, a group of experts must review the situation before launching a panic-inducing alarm.

Additional References

  • International AI Safety Report (2026) | A comprehensive review of the latest scientific research on the capabilities and risks of general-purpose AI systems.

Important

HTGAA | STRATEGIC ACTIONS

Describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”)


Action 1: An International “Alert System”

Technical Strategy

  • Purpose: Currently, threat detection is isolated and slow. This proposal could eventually encourage the creation of a global network of shared sensors that use AI to report genetic anomalies in real time.
  • Design: Academic researchers and hospitals install low-cost CRISPR biosensors. Data is sent to a centralized platform (similar to the financial system that detects bank fraud).
  • Assumptions: We assume that countries will be willing to share health data for the common good.
  • Risks: If the system is “successful,” we could have too many alerts (false positives). If it fails, a real threat could go unnoticed if sensors are not well distributed geographically.

Action 2: “Black Box” Licenses for AI

Rule / Requirement

  • Purpose: To prevent the AI that detects the “human accent” from being used by criminals to learn how to erase that trace.
  • Design: Restrictions must be established so that technology companies do not release the model’s source code openly. Only verified users (governments and certified scientists) will be able to access the tool using special keys.
  • Assumptions: We believe we can maintain technological secrecy and that no one will manage to “hack” or replicate the AI privately.
  • Risks: The “success” of this rule could slow down scientific innovation if researchers cannot freely access the tool. Failure would be the emergence of a black market for “unrestricted” versions of this AI.

Action 3: “Clean DNA” Certification

Incentive

  • Purpose: Today, DNA synthesis companies cannot always tell whether an order is dangerous when it is a new design.
  • Design: DNA synthesis companies receive a certification or security seal if they run their orders through this AI filter before sending them to the customer. A regulatory entity validates that these companies comply with the standard.
  • Assumptions: We assume that companies will seek certification and that customers will not go to illegal providers.
  • Risks: If successful, designing biopathogens becomes almost impossible to print legally. A risk of success is that biotechnology becomes slower or more expensive. A failure would be that malicious actors simply buy their own “DNA printers” to bypass control.

Important

HTGAA | SCORES

Score each of your governance actions against your rubric of policy goals


🧭 POLICY GOVERNANCE | EVALUATION MATRIX

STRATEGIC ACTION              SECURITY 🛡️   EQUITY 🌍   EFFICACY   STRATEGIC NOTES
01. Global Alert System       ● ● ●          ● ● ●       ● ● ○      Best for universal protection. Requires heavy coordination.
02. Black Box Licenses        ● ● ●          ● ○ ○       ● ○ ○      High security. Risks excluding the Global South & slowing science.
03. Clean DNA Certification   ● ● ○          ● ● ○       ● ● ●      High industry flow. Relies on voluntary company adoption.

SCORING KEY: ● ● ● High Alignment | ● ● ○ Medium Alignment | ● ○ ○ Low Alignment



Important

HTGAA | FINAL RECOMMENDATION

Drawing upon this scoring, describe which governance option, or combination of options, you would prioritize, and why. Outline any trade-offs you considered as well as assumptions and uncertainties.


Strategic Priority: Academic-Industry Alliance

I recommend that MIT lead a joint task force with the International Gene Synthesis Consortium (IGSC) to implement a combination of Action 1 (Alert System) and Action 3 (Clean DNA Certification).

Why this combination? This alliance allows MIT’s cutting-edge research to become an industrial standard. While MIT provides the AI to detect the “human accent,” IGSC companies act as the first global filter, ensuring that no suspicious sequence is synthesized without oversight. This maximizes Security and Efficacy without relying solely on slow-moving government regulations.


Trade-offs

  • Security vs. Industrial Speed: Implementing an AI filter on every DNA order might make synthesis slightly slower. However, the cost of not doing so (an artificial pandemic) is much higher.
  • Transparency vs. Intellectual Property: Companies must allow an external AI to analyze their orders. To protect their trade secrets, this proposal points toward a system where the AI processes data in an encrypted manner (Privacy-Preserving AI).

Assumptions and Uncertainties

  • Universal Adoption: We assume that companies outside the consortium will not offer “no-questions-asked” services to attract malicious actors.
  • False Positives: There is uncertainty regarding whether the AI will halt legitimate synthetic biology experiments, which would require a fast and expert human review process.

Note

Ethical Reflections & New Actions

Regarding key concerns about the scientist’s responsibility:

  • The Duality of Detection: The same system that detects engineering could be used to teach others how to hide it.
    • Proposed Action: Establish a “Black Box” protocol where the source code for detection is only accessible to certified biosecurity auditors.
  • Technological Justice: We do not want this technology to create a gap where only MIT and large companies are protected.
    • Proposed Action: Create an Open Biosurveillance fund to donate low-cost CRISPR sensors to laboratories in countries of the Global South.

Conclusion for MIT and the IGSC

Biological security in the age of AI should not be solved by a single institution. By combining MIT’s intelligence with the IGSC’s infrastructure, the proposal aims to create an ecosystem where genetic engineering could be visible and accountable.


🧬 Q&A: Professor Jacobson

Polymerase Error Rate

  • 10^-4 to 10^-5 (1 error per 10,000–100,000 bases).

Genome Comparison

  • With 3.2 x 10^9 bp, lack of repair would cause ~100,000 mutations per division, making life impossible.

Biological Solution

  • A three-tier system (Selectivity – Proofreading – Mismatch Repair) reduces error to < 1 mutation per genome.
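These figures can be checked with back-of-the-envelope arithmetic; the raw error rate of 3 × 10^-5 and the repaired rate of 10^-10 used below are illustrative values chosen to fall within the ranges quoted above:

```python
# Back-of-the-envelope check of the replication-fidelity numbers above.
genome_size = 3.2e9      # base pairs per genome copy
raw_error_rate = 3e-5    # illustrative: within the 10^-4 to 10^-5 range

raw_mutations = genome_size * raw_error_rate
print(f"Without repair: ~{raw_mutations:,.0f} mutations per division")  # ≈ 96,000

# Selectivity, proofreading, and mismatch repair each multiply fidelity;
# an overall rate near 10^-10 per base gives < 1 mutation per genome.
repaired_error_rate = 1e-10
print(f"With repair:   ~{genome_size * repaired_error_rate:.2f} mutations per division")
```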

Coding Possibilities

  • A protein of ~400 amino acids can be encoded by ~10^191 different DNA sequences.
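The 10^191 figure is consistent with simple codon arithmetic, assuming an average degeneracy of about 3 synonymous codons per amino acid:

```python
import math

# With ~3 synonymous codons per amino acid on average, a 400-residue
# protein can be encoded by roughly 3^400 distinct DNA sequences.
avg_degeneracy = 3
n_residues = 400
log10_sequences = n_residues * math.log10(avg_degeneracy)
print(f"~10^{log10_sequences:.0f} possible coding sequences")  # → ~10^191
```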

Functional Constraints

  • Not all codes work due to:
    1. tRNA availability
    2. Folding kinetics
    3. mRNA stability
    4. Splicing signals

Reference: Alberts et al. (2022). Molecular Biology of the Cell.

Q&A: Professor LeProust

Oligo Synthesis: Methods and Limits

1. Most Common Method: The standard method is phosphoramidite synthesis, a chemical, solid-phase process that adds nucleotides one at a time to a growing chain.

2. Why is it hard to exceed 200 nt? It is a matter of cumulative yield. Even at 99% coupling efficiency per step, the fraction of full-length product decays exponentially:

  • At 100 nt, ~36% of the final molecules are correct.
  • At 200 nt, only ~13% are correct. Beyond 200 nt, the mixture contains too many truncated or damaged sequences, making it nearly impossible to isolate the correct one.

3. Why not a 2,000bp gene directly?

  • Zero Yield: At 2,000 bases, the cumulative yield (0.99^2000 ≈ 2 × 10^-9) leaves essentially no correct molecules.
  • DNA Damage: The chemicals used in each cycle gradually degrade the DNA chain before it can reach that length.

Instead, to make a 2,000 bp gene, companies synthesize many small 60–100 nt fragments (oligos) and then “stitch” them together using biological methods like assembly PCR or Gibson Assembly.
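The yield figures above all follow from the same cumulative-yield formula, e^n for per-step efficiency e and chain length n. A quick sketch:

```python
# Fraction of full-length product after n phosphoramidite coupling steps,
# assuming a constant 99% per-step coupling efficiency.
efficiency = 0.99
for n in (100, 200, 2000):
    print(f"{n:>5} nt: {efficiency ** n:8.2%} full-length")
# 100 nt ≈ 36.6%, 200 nt ≈ 13.4%, 2000 nt ≈ 0% — matching the text above.
```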

Q&A: Professor Church

The 10 dietary essential amino acids are:

  • Arg, His, Ile, Leu, Lys, Met, Phe, Thr, Trp, Val
  • The “lysine contingency” from Jurassic Park doesn’t work well because animals cannot synthesize lysine on their own anyway; they normally get it from food. An escaped animal could therefore still find lysine in nature, which makes this a weak safety control.