Part 1: Benchling & In-silico Gel Art
Gel lane 1: LAMCG - MboI
Gel lane 2: LAMCG - MwoI
Gel lane 3: LAMCG - PshAI
Gel lane 4: LAMCG - MwoI
Gel lane 5: LAMCG - MboI
Part 3: DNA Design Challenge
I am very interested in Interleukin-6 (IL-6), a protein indicating early inflammation and immune activation.
Assignment: Governance and Ethics in Biological Engineering
1. Proposed Biological Engineering Application or Tool
Description of the application or tool:
What does the future of passive diagnostic sensing look like?
The future of passive diagnostic sensing lies in distributed biosensing networks that combine cell-free biological systems with metagenomic analysis to enable proactive disease monitoring. Such systems would allow for continuous, decentralized, and low-cost sensing of biological signals associated with human health, shifting healthcare from reactive diagnosis toward anticipatory, preventative health management.
At the core of the vision are cell-free biosensors engineered to detect specific small molecules, such as metabolites, pathogen-related compounds, or inflammatory markers, and transduce these signals into measurable outputs, including fluorescence, audio (for example, a buzzing sound), or electrochemical responses. These biosensors could be deployed both as wearable devices for individual-level monitoring and as environmental sensors embedded in soil, wastewater, and natural water bodies to capture population- and ecosystem-level health signals.
Using Internet of Things (IoT) protocols and architectures, these sensing nodes would form short- and wide-range network topologies that convert biological responses into digital data through on-device computation and wireless transmission. By aggregating and analyzing these data streams in real time, the system would create a ubiquitous and scalable biosensing infrastructure capable of early warning, continuous monitoring, and data-driven public health decision-making.
Compared to traditional clinical testing, which is often slow, expensive, and inaccessible in remote or resource-limited settings, passive diagnostic sensing offers a pathway toward earlier detection of emerging health threats. In doing so, such systems could help residents and communities identify, prepare for, and potentially mitigate future pandemics before they escalate.
2. Governance and Policy Goals for an Ethical Future
High-level ethical goals:
The overall goal is for such a distributed system to be installed safely and to remain accurate across varied environments, while preventing misuse of the technology that could raise biosecurity issues. Deployment should also account for geopolitical and cultural contexts and for local ecosystems, so that it is equitable, ethical, and appropriate.
Specific sub-goals:
Goal 1: Ensure that the system does not cause physical, biological, or long-term harm to users and surrounding environments
Sub-goal 1: Physical safety and ergonomic compliance
Sub-goal 2: Biological safety of synthetic or bio-integrated components
Sub-goal 3: Electrical and mechanical safety
Goal 2: Data security and governance
Sub-goal 1: Policies define data sovereignty and user ownership
Sub-goal 2: Secure storage and transmission of biometric data
Sub-goal 3: Ethical use and purpose limitation
3. Proposed Governance Actions
Describe at least three governance actions pursued by different actors.
Governance Action 1: Unified Pre-deployment Review
Actor(s): Federal Regulators, Institutional Review Boards, Academic Researchers, and Private Biotech Companies
Purpose:
What is done now? Ethical review is currently fragmented. IRBs focus on human subjects, while biosafety committees focus on lab safety, often ignoring the unique risks of “distributed” sensors in the wild or on the body.
What change are you proposing? Mandate a unified “Pre-deployment Review” specifically for distributed networks. This forces an evaluation of biological safety, data governance, and potential dual-use exploitation before a single node is deployed.
Design:
What is needed to make this work? A new technical standard for hardware/material safety and data processing ethics.
Who must opt in, fund, approve, or implement it? Regulatory agencies need to create a “Distributed Systems” focus group with specialists. Funders require this certification as a condition for milestone payments.
Assumptions:
What assumptions does this rely on? Assumes ethical bodies have the technical expertise and literacy to understand distributed technology and synthetic gene circuits.
What uncertainties could undermine it? It is uncertain whether the “compliance cost” will overburden small startups or research labs, potentially centralizing the technology in the hands of big biotech companies.
Risks of Failure and Success:
How could this fail? The review becomes a “checkbox exercise” where researchers fill out forms without meaningful engagement and discussion, leading to an open-ended conversation without accountability.
What unintended consequences might arise even if it succeeds? While it creates safer protocols, the rigid standards might slow down the iterative “learn-and-build” cycle essential for new technological development.
Governance Action 2: Decoupled Data Architecture & Privacy Standards
Actor(s): Technical Standards Bodies, Software Engineers, and Privacy Regulators
Purpose:
Most sensors rely on centralized stores of raw biological data, which are high-value targets for breaches and commercial profiling.
Mandate Intentional Data Collection through a decentralized architecture. Data must be processed on the device (e.g., the wearable or environmental node) so that only anonymized “detected events” are transmitted, not raw biological sequences. Secure data transfer is also needed whenever processed data leaves the device.
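The on-device processing described above can be illustrated with a minimal sketch. Everything here is hypothetical: the `DetectedEvent` record, the `summarize_on_device` function, the threshold value, and the hourly time bucketing are assumptions made for illustration, not part of any existing sensor firmware. The point is only that the raw readings stay on the node and a small, coarse event record is what gets transmitted.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class DetectedEvent:
    """The only record that leaves the node: no raw readings, no device ID."""
    analyte: str
    detected: bool
    hour_bucket: int  # coarse time stamp (whole hours since epoch), not exact time


def summarize_on_device(analyte, raw_readings, threshold, unix_time):
    """Reduce raw sensor readings to one coarse event (illustrative only).

    The mean-above-threshold rule and hourly bucketing are assumptions.
    """
    detected = mean(raw_readings) >= threshold
    return DetectedEvent(analyte, detected, int(unix_time // 3600))


# Hypothetical readings in arbitrary units; raw_readings never leave this scope.
event = summarize_on_device("IL-6", [0.8, 1.1, 1.3], threshold=1.0, unix_time=7200)
print(event)
```

A real deployment would still need encryption in transit and a review of whether coarse events can be re-identified; this sketch does not address either.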
Design:
Development of open-source “Privacy-Preserving” firmware that can send out encrypted data without requiring a specific type of hardware.
Regulators refuse to license any sensor network that lacks “Privacy by Design” certification. Users should be able to opt in to sharing higher-fidelity data for research purposes.
Assumptions:
Wearable nodes have sufficient computational power to run complex ML noise-filtering locally.
Human bias in the underlying algorithms could still lead to biased data that affects demographics with low socioeconomic backgrounds.
Risks of Failure and Success:
Failure: Distributed nodes are very hard to quality-control. An error in the decentralized code could lead to massive data loss, require specialist repair, and potentially trigger incorrect public health alerts that cannot be retracted.
Success: A “Planetary Nervous System” that is resilient to hacking, but the extreme anonymity might make it difficult for health officials to find the “Patient Zero” during any upcoming outbreaks.
Governance Action 3: Explicit Use-Case Licensing
Actor(s): Policy Regulators, Legal Specialists, and Industry Collaboration Groups
Purpose: Healthcare data is often repurposed for surveillance or commercial profiling without clear legal boundaries for distributed devices. We need to create a licensing system that prohibits sensing publicly accessible materials for profiling, for commercial or business purposes, or as a threat.
Design: Legal definitions of “Prohibited Uses” and mandatory documentation of how data is being used. Enforcement needs to be handled through periodic audits and heavy fines for those who cross the legal boundaries (e.g., using a biosensor for something other than its licensed purpose).
Assumptions: Regulators can anticipate harmful use cases before they happen and detail them in the policy enforcement. However, this is incredibly difficult in a truly distributed system where nodes can be modified (e.g., DIYBio) or repurposed by the end user.
Risks of Failure and Success: Failure: Passive sensing occurs for grey-area purposes where licensing is ignored because the technology is so accessible. Success: Limits misuse. However, “Strict Purpose Limits” may block further research that could result in new discoveries, because no license covers it.
4. Scoring Governance Actions Against Policy Goals
Score each option from 1 to 3 (1 = best), or N/A.
| Policy Goal or Consideration | Option 1 | Option 2 | Option 3 |
| --- | --- | --- | --- |
| Enhance Biosecurity | | | |
| Preventing incidents | 1 | 2 | 2 |
| Helping respond | 2 | 1 | 3 |
| Foster Lab Safety | | | |
| Preventing incidents | 1 | 3 | 2 |
| Helping respond | 2 | N/A | N/A |
| Protect the Environment | | | |
| Preventing incidents | 1 | 2 | 2 |
| Helping respond | 2 | 1 | 3 |
| Other Considerations | | | |
| Minimizing costs and burdens | 3 | 2 | 3 |
| Feasibility | 2 | 1 | 2 |
| Does not impede research | 3 | 1 | 2 |
| Promote constructive uses | 2 | 1 | 2 |
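One way to compare the options is to average each column of the scores above, skipping N/A entries. This is only a sketch: the unweighted mean is an assumption (the assignment does not say how the criteria should be weighted), and the variable names are illustrative.

```python
# Scores copied from the table above, top to bottom (1 = best); None marks N/A.
scores = {
    "Option 1": [1, 2, 1, 2, 1, 2, 3, 2, 3, 2],
    "Option 2": [2, 1, 3, None, 2, 1, 2, 1, 1, 1],
    "Option 3": [2, 3, 2, None, 2, 3, 3, 2, 2, 2],
}


def mean_score(values):
    """Unweighted mean over the criteria that were actually rated."""
    rated = [v for v in values if v is not None]
    return sum(rated) / len(rated)


averages = {option: mean_score(values) for option, values in scores.items()}
best = min(averages, key=averages.get)  # lower is better on this scale

for option, avg in averages.items():
    print(f"{option}: {avg:.2f}")
print("Lowest (best) average:", best)
```

Under this simple aggregation Option 2 has the lowest average, although a different weighting of the criteria could change the ranking.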
5. Prioritization and Recommendations
Among the three options, a hybrid, responsive governance approach should be considered for implementation. Because it is difficult to monitor every node in a distributed network consistently and physically, technical standards (Action 2) act as the gatekeeper for the initial purpose and development of technological solutions. Embedding safety and security into standard practice makes self-regulating systems more likely to scale. The use-case license (Action 3) then specifies terms and conditions in greater detail to prevent further misuse of the technology.
Target audience(s):
Federal Regulatory Agencies and National Security Commissions - define policies and rules for the bioeconomy/safety
Venture Capitalists and Bio-Foundries - understand compliance to de-risk investment in distributed sensing
Local Communities (e.g., DIYBio and iGEM) - framework for balancing open science and biosecurity
Homework Questions
from Professor Jacobson
For DNA polymerases, the raw error rate during replication is roughly 10⁻⁶ errors per base.
The human genome is about 3 × 10⁹ base pairs. Therefore, at 10⁻⁶ errors per base, there are approximately 3,000 errors per cell division.
There are multiple error-correction systems. 1/ Proofreading by the polymerase during replication: its 3’→5’ exonuclease activity removes mispaired nucleotides as they are incorporated. 2/ Post-replication mismatch repair (MMR): specialized proteins identify mispaired bases and small insertions/deletions, then correct the newly synthesized (incorrect) strand.
Proofreading catches most mistakes in real time while mismatch repair acts as a second pass to correct whatever slips through during synthesis.
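The error estimate above can be reproduced with a few lines of arithmetic. Both input values come from the answer itself; exact fractions are used only to avoid floating-point rounding in the product.

```python
from fractions import Fraction

GENOME_BP = 3 * 10**9                # approximate human genome size (from the text)
RAW_ERROR_RATE = Fraction(1, 10**6)  # raw polymerase errors per base (from the text)

# Expected number of raw errors when the genome is copied once.
expected_errors = GENOME_BP * RAW_ERROR_RATE
print(f"Expected raw errors per genome copy: {int(expected_errors):,}")
```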
Reference: Bębenek, A., Ziuzia-Graczyk, I. Fidelity of DNA replication—a matter of proofreading. Curr Genet 64, 985–996 (2018). https://doi.org/10.1007/s00294-018-0820-1
from Dr. LeProust
The most commonly used oligo synthesis method currently is solid-phase chemical synthesis using phosphoramidite chemistry. It builds DNA or RNA oligonucleotides one nucleotide at a time by adding activated phosphoramidite monomers to a growing chain anchored to a solid support.
The longer the sequence, the more the overall yield declines. As the chain gets longer, side reactions such as depurination and deletions also accumulate, creating a mixture of truncated or incorrect products. Purification then becomes extremely difficult, which lowers the fraction of full-length, correct oligo.
A ~2000 bp gene is 10x longer than the ~200 nt typical maximum length for direct synthesis. The limitations that come with length include a lack of full-length product, difficulty isolating the correct oligo in usable quantities, and errors that accumulate with each coupling cycle.
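The length limit described above follows from compounding a per-step yield. As a rough sketch, assume every coupling step succeeds with the same probability and that the first nucleotide is already attached to the support, so an n-nt oligo needs n - 1 couplings. The 99% coupling efficiency used here is an illustrative assumption, not a figure from the references, and the model ignores depurination and other side reactions.

```python
def full_length_fraction(length_nt, coupling_efficiency=0.99):
    """Fraction of chains that reach full length after (length_nt - 1) couplings.

    coupling_efficiency is an assumed, illustrative per-step yield.
    """
    return coupling_efficiency ** (length_nt - 1)


for n in (20, 200, 2000):
    print(f"{n:>5} nt: {full_length_fraction(n):.3g}")
```

Even with this optimistic model, the full-length fraction falls to roughly one chain in seven at 200 nt and becomes vanishingly small at 2000 nt, which is why genes are assembled from shorter oligos rather than synthesized directly.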
Reference: Hughes, R. A., & Ellington, A. D. (2017). Synthetic DNA synthesis and assembly: Putting the synthetic in synthetic biology. Cold Spring Harbor Perspectives in Biology, 9(1), a023812. https://doi.org/10.1101/cshperspect.a023812
Reference: Sandahl, A. F., Nguyen, T. J. D., Hansen, R. A., Johansen, M. B., & Gothelf, K. V. (2021). On-demand synthesis of phosphoramidites. Nature Communications, 12, Article 2760. https://doi.org/10.1038/s41467-021-22945-z
The “Lysine Contingency” comes from a plot point in the film Jurassic Park: the engineered dinosaurs depend on supplied lysine and cannot survive without it. However, we inherently lack the biosynthetic pathway for lysine ourselves; the amino acid comes from food. The contingency therefore reflects an existing biochemical limitation in animal metabolic pathways, a dependency that could leave engineered organisms and systems vulnerable.
Here is the reverse translated DNA (nucleotide) sequence in both “Most likely codons” option and “Consensus codons” option:
reverse translation of CAA31312.1 IL-6 receptor precursor (AA -19 to 449) [Homo sapiens] to a 1404 base sequence of most likely codons.
atgctggcggtgggctgcgcgctgctggcggcgctgctggcggcgccgggcgcggcgctg
gcgccgcgccgctgcccggcgcaggaagtggcgcgcggcgtgctgaccagcctgccgggc
gatagcgtgaccctgacctgcccgggcgtggaaccggaagataacgcgaccgtgcattgg
gtgctgcgcaaaccggcggcgggcagccatccgagccgctgggcgggcatgggccgccgc
ctgctgctgcgcagcgtgcagctgcatgatagcggcaactatagctgctatcgcgcgggc
cgcccggcgggcaccgtgcatctgctggtggatgtgccgccggaagaaccgcagctgagc
tgctttcgcaaaagcccgctgagcaacgtggtgtgcgaatggggcccgcgcagcaccccg
agcctgaccaccaaagcggtgctgctggtgcgcaaatttcagaacagcccggcggaagat
tttcaggaaccgtgccagtatagccaggaaagccagaaatttagctgccagctggcggtg
ccggaaggcgatagcagcttttatattgtgagcatgtgcgtggcgagcagcgtgggcagc
aaatttagcaaaacccagacctttcagggctgcggcattctgcagccggatccgccggcg
aacattaccgtgaccgcggtggcgcgcaacccgcgctggctgagcgtgacctggcaggat
ccgcatagctggaacagcagcttttatcgcctgcgctttgaactgcgctatcgcgcggaa
cgcagcaaaacctttaccacctggatggtgaaagatctgcagcatcattgcgtgattcat
gatgcgtggagcggcctgcgccatgtggtgcagctgcgcgcgcaggaagaatttggccag
ggcgaatggagcgaatggagcccggaagcgatgggcaccccgtggaccgaaagccgcagc
ccgccggcggaaaacgaagtgagcaccccgatgcaggcgctgaccaccaacaaagatgat
gataacattctgtttcgcgatagcgcgaacgcgaccagcctgccggtgcaggatagcagc
agcgtgccgctgccgacctttctggtggcgggcggcagcctggcgtttggcaccctgctg
tgcattgcgattgtgctgcgctttaaaaaaacctggaaactgcgcgcgctgaaagaaggc
aaaaccagcatgcatccgccgtatagcctgggccagctggtgccggaacgcccgcgcccg
accccggtgctggtgccgctgattagcccgccggtgagcccgagcagcctgggcagcgat
aacaccagcagccataaccgcccggatgcgcgcgatccgcgcagcccgtatgatattagc
aacaccgattatttttttccgcgc
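The “most likely codons” reverse translation above maps each amino acid to a single codon. A minimal sketch of that mapping follows. The codon table is restricted to the seven amino acids that appear in the first 60 bases of the sequence above, and the amino-acid string is obtained by translating that first line with the standard genetic code; the function name is illustrative and this is not the tool's own implementation.

```python
# One codon per amino acid, as used in the first line of the sequence above.
MOST_LIKELY_CODON = {
    "M": "atg", "L": "ctg", "A": "gcg", "V": "gtg",
    "G": "ggc", "C": "tgc", "P": "ccg",
}


def reverse_translate(protein):
    """Map each amino acid to its single 'most likely' codon."""
    return "".join(MOST_LIKELY_CODON[aa] for aa in protein)


n_terminus = "MLAVGCALLAALLAAPGAAL"  # first 20 residues, inferred from the DNA
first_line = "atgctggcggtgggctgcgcgctgctggcggcgctgctggcggcgccgggcgcggcgctg"

dna = reverse_translate(n_terminus)
print(dna == first_line)  # the sketch reproduces the first line of the output
print(1404 // 3, "codons in the full 1404-base sequence")
```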
reverse translation of CAA31312.1 IL-6 receptor precursor (AA -19 to 449) [Homo sapiens] to a 1404 base sequence of consensus codons.
atgytngcngtnggntgygcnytnytngcngcnytnytngcngcnccnggngcngcnytn
gcnccnmgnmgntgyccngcncargargtngcnmgnggngtnytnacnwsnytnccnggn
gaywsngtnacnytnacntgyccnggngtngarccngargayaaygcnacngtncaytgg
gtnytnmgnaarccngcngcnggnwsncayccnwsnmgntgggcnggnatgggnmgnmgn
ytnytnytnmgnwsngtncarytncaygaywsnggnaaytaywsntgytaymgngcnggn
mgnccngcnggnacngtncayytnytngtngaygtnccnccngargarccncarytnwsn
tgyttymgnaarwsnccnytnwsnaaygtngtntgygartggggnccnmgnwsnacnccn
wsnytnacnacnaargcngtnytnytngtnmgnaarttycaraaywsnccngcngargay
ttycargarccntgycartaywsncargarwsncaraarttywsntgycarytngcngtn
ccngarggngaywsnwsnttytayathgtnwsnatgtgygtngcnwsnwsngtnggnwsn
aarttywsnaaracncaracnttycarggntgyggnathytncarccngayccnccngcn
aayathacngtnacngcngtngcnmgnaayccnmgntggytnwsngtnacntggcargay
ccncaywsntggaaywsnwsnttytaymgnytnmgnttygarytnmgntaymgngcngar
mgnwsnaaracnttyacnacntggatggtnaargayytncarcaycaytgygtnathcay
gaygcntggwsnggnytnmgncaygtngtncarytnmgngcncargargarttyggncar
ggngartggwsngartggwsnccngargcnatgggnacnccntggacngarwsnmgnwsn
ccnccngcngaraaygargtnwsnacnccnatgcargcnytnacnacnaayaargaygay
gayaayathytnttymgngaywsngcnaaygcnacnwsnytnccngtncargaywsnwsn
wsngtnccnytnccnacnttyytngtngcnggnggnwsnytngcnttyggnacnytnytn
tgyathgcnathgtnytnmgnttyaaraaracntggaarytnmgngcnytnaargarggn
aaracnwsnatgcayccnccntaywsnytnggncarytngtnccngarmgnccnmgnccn
acnccngtnytngtnccnytnathwsnccnccngtnwsnccnwsnwsnytnggnwsngay
aayacnwsnwsncayaaymgnccngaygcnmgngayccnmgnwsnccntaygayathwsn
aayacngaytayttyttyccnmgn
Here is the codon optimized sequence:
based on reverse translation of most likely codons.
ATGCTGGCCGTCGGCTGTGCACTGCTGGCCGCCCTGCTGGCAGCCCCCGGCGCCGCTCTGGCTCCCAGAAGGTGTCCCGCTCAGGAGGTGGCCAGAGGCGTGCTGACCTCCCTGCCAGGCGATTCCGTTACCCTGACCTGTCCAGGCGTGGAGCCCGAAGATAACGCCACCGTGCACTGGGTGCTGAGGAAACCCGCCGCCGGCTCCCATCCAAGCAGATGGGCAGGCATGGGGAGGAGACTGCTTCTGAGATCTGTCCAGCTGCACGACAGTGGGAACTATTCCTGTTACAGGGCTGGGAGGCCTGCCGGAACAGTGCATCTCCTGGTTGATGTGCCCCCAGAGGAGCCGCAGCTGTCTTGCTTCAGGAAGAGCCCCCTGAGCAATGTGGTGTGCGAATGGGGCCCCCGGAGTACCCCCAGCCTGACAACCAAAGCCGTGCTGCTGGTGAGGAAGTTTCAGAACAGCCCCGCTGAAGACTTTCAGGAGCCCTGCCAGTATAGCCAGGAGTCCCAGAAGTTTTCCTGCCAGCTGGCCGTGCCCGAGGGAGATAGCAGCTTCTACATCGTCTCCATGTGCGTGGCCTCTTCCGTGGGAAGTAAGTTTTCTAAGACCCAGACCTTCCAGGGCTGCGGCATCCTGCAGCCAGATCCTCCCGCCAATATCACAGTGACAGCCGTGGCCAGAAACCCCAGATGGCTGAGCGTGACCTGGCAGGACCCACACTCTTGGAATAGCTCTTTCTATAGGCTGAGGTTCGAGCTGAGATACAGAGCTGAGCGGTCCAAGACATTCACCACTTGGATGGTGAAGGACCTGCAGCATCACTGCGTGATTCACGATGCTTGGAGCGGCCTGAGGCATGTGGTGCAGCTGAGGGCCCAGGAAGAATTTGGGCAGGGGGAATGGAGTGAATGGTCCCCAGAGGCCATGGGGACACCCTGGACCGAGTCCAGGAGCCCACCAGCAGAAAATGAGGTGAGCACTCCCATGCAGGCTCTGACCACCAACAAAGACGATGATAACATTCTCTTCAGAGACTCTGCCAACGCCACCTCCCTGCCCGTGCAGGACAGCAGCTCCGTCCCTCTGCCAACCTTTCTGGTGGCCGGAGGCTCTCTGGCCTTTGGGACCCTGCTGTGTATCGCAATCGTGCTGAGGTTTAAGAAAACCTGGAAGCTGCGGGCCCTGAAGGAGGGAAAGACAAGCATGCACCCACCTTACTCCCTGGGACAGCTGGTGCCAGAGAGGCCCAGGCCTACACCTGTGCTGGTGCCCCTGATCTCCCCTCCTGTGTCTCCTTCCTCTCTGGGTTCTGACAATACAAGTAGCCACAACCGGCCAGACGCCAGGGACCCCCGAAGCCCCTATGATATCTCTAATACCGATTACTTCTTTCCAAGA
based on reverse translation of consensus codons. (not working, as the sequence length is not a multiple of 3)
Codon optimization is a necessary step because of codon bias. Although multiple codons can encode the same amino acid, different organisms prefer certain codons over others. This preference affects: 1/ Translation efficiency: codons that match abundant tRNAs in the host organism are translated faster. 2/ Protein expression levels: using preferred codons increases protein yield. 3/ mRNA stability: some codons help stabilize the mRNA, reducing degradation. 4/ Reduced secondary structure: optimized sequences minimize hairpins or repetitive sequences that hinder transcription/translation. 5/ Accurate protein folding: efficient translation reduces ribosome stalling, improving proper folding. Without codon optimization, a gene might produce very little protein in the chosen host.
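Codon optimization changes codons but must leave the encoded protein unchanged. The sketch below checks this for the first 60 bases of the “most likely codons” sequence and of the codon optimized sequence given above. The standard genetic code table is built from the usual TCAG ordering; the helper names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons(seq):
    """Split a sequence into upper-case codons, dropping any trailing partial codon."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def translate(seq):
    return "".join(GENETIC_CODE[c] for c in codons(seq))


# First 60 bases of each sequence, copied from the text above.
most_likely = "atgctggcggtgggctgcgcgctgctggcggcgctgctggcggcgccgggcgcggcgctg"
optimized = "ATGCTGGCCGTCGGCTGTGCACTGCTGGCCGCCCTGCTGGCAGCCCCCGGCGCCGCTCTG"

same_protein = translate(most_likely) == translate(optimized)
changed = sum(a != b for a, b in zip(codons(most_likely), codons(optimized)))
print("Same protein:", same_protein)
print("Codons changed:", changed, "of", len(codons(optimized)))
```

The same check could be run on the full sequences to confirm that every substitution is synonymous.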
Part 5
(i) What DNA would you want to sequence and why?
Chosen DNA
I would like to sequence DNA sampled from human bodily fluids, household sinks, and wastewater streams.
Why this DNA?
I am interested in how ubiquitous DNA samples can construct a genetic picture of reality that informs how we approach healthcare and preventative monitoring.