Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices

    Class Assignment Application: SoilBuddy With climate change set to disrupt global agriculture and the unsustainability of commercial monocultures threatening to render the arable land left unusable, humanity urgently needs solutions facilitating soil stewardship to fortify our food supply. Existing solutions are either too expensive for small-to-mid farms, worsening inequality in the developing world, impractical for massive commercial farms, necessitating guesswork set to be upended by the sea change wrought by climate change, or provide, at best, noisy, time-lagged feedback through remote sensing.

  • Week 2 HW: DNA Read, Write and Edit

    Part 0 Done. Part 1 Since it’s passe to create “MIT” with the electrophoresis gel, I decided to reverse the order of the letters to spell “TIM” instead. Part 2 I don’t have in-person access to a node, so can’t perform the wet lab component.

  • Week 3 HW: Lab Automation

    Part 1 Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications. The paper I’ll be analyzing is titled “Development of a high-throughput minimum inhibitory concentration (HT-MIC) testing workflow” published in Frontiers in Microbiology. In summary, the researchers used an Opentrons OT-2 robot to automate an assay used to test the safety-efficacy profile of an antibiotic. They accomplished this by programmatically preparing a serial dilution of the antibiotic candidate across multiple 96-well plates with replicates, inoculating the wells with the test bacteria, then measuring the optical density of each well with a plate reader to determine bacterial viability after the antibiotic challenge.

  • Week 4 - Protein Design Part I

    Part A Number of amino acids = 500 g / (100 g/mol) = 5 mol = 3.0 x 10^24 molecules of amino acids (approx). Over the course of digestion, the polynucleotides are broken down into constituent nucleobases prior to absorption. Considering the degeneracy of codons, these 20 amino acids were sufficient to produce a diversity of proteins, and were evolutionarily conserved due to a lack of strong selection pressure against them (i.e. they were “good enough”). They could’ve formed abiotically given the presence of organic precursors such as carboxylic acids, amines and small molecule side groups, catalyzed by zeolites, and triggered by lightning and heat on the primordial earth or extraterrestrial asteroids respectively. Left-handed spirals. Yes, some examples are $\pi$ helices and $3_{10}$ helices. It’s more energetically stable given L-amino acids and D-saccharides predominate natural protein helices and the sugar-phosphate backbone of polynucleotides respectively. In aqueous environments, the hydrophobic portions of beta sheets are driven together by steric interactions while the hydrophilic groups facilitate relatively strong intermolecular hydrogen bonding. Beta sheets allow for strong intermolecular bonding, in hydrophobic zippers, that renders misfolded proteins more stable than their functional conformations and facilitates the addition of more proteins to the aggregate fibril - the biochemical basis of amyloid pathogenesis. Yes, you could make materials out of amyloid beta sheets but they’d only be stable under aqueous conditions and would be potential biotoxins. Part B I chose to investigate an antifreeze protein from an arctic bacterium as a potential solution for winter snow-clearing.

Subsections of Homework

Week 1 HW: Principles and Practices

cover image

Class Assignment

Application: SoilBuddy

With climate change set to disrupt global agriculture and the unsustainability of commercial monocultures threatening to render the arable land left unusable, humanity urgently needs solutions facilitating soil stewardship to fortify our food supply. Existing solutions are either too expensive for small-to-mid farms, worsening inequality in the developing world, impractical for massive commercial farms, necessitating guesswork set to be upended by the sea change wrought by climate change, or provide, at best, noisy, time-lagged feedback through remote sensing.

I propose SoilBuddy as a low-cost, real-time, large-area biomachine platform to supercharge the soil microbiome using nature’s own solutions. The crux of SoilBuddy is intelligent biosensing that incorporates human-readable signals, provides an attack surface for genetic circuit-based intervention, and can integrate with existing, non-biological soil monitoring/fertilization workflows. At the same time, SoilBuddy features means to intervene in the soil microbiome by orchestrating microorganisms, and modulating nitrogenation, oxygenation and ion levels in the soil.

Getting to brass tacks, I outline its sensor/effector features below, with core functions necessary for an MVP in bold:

Sensing

  • pH –> Pigment
  • Nitrogenation –> Multiplexed pigment / biofluorescence
  • Oxygenation –> infrared pigment
  • Pathogen –> checkerboard pigment

Effector

  • pH (proton pumps)
  • pH buffering (zwitterion secretion)
  • Nitrogenation (modulating different metabolic circuits)
  • Oxygenation (modulating different metabolic circuits)
  • Pathogen defense (phage libraries, defense proteins, orchestrate natural predators)
  • Microbiome orchestration (quorum-sensing molecules, chemokines)

Governance Goals

While I was inspired by the four tenets of biomedical ethics, I reckoned the most important goals relevant to SoilBuddy are Non-Maleficence and Justice. Given the echoes of the GMO debate from earlier in the century, and the proliferative capacity of bacterial systems, prioritizing safeguards against ecological contamination is crucial for governing our bacteria-based platform. At the same time, given the long history of agricultural intervention being wielded as a political tool to harm access to food and development, it’s crucial that we prioritize justice in the administration of SoilBuddy to ensure its applications prove to be a tool for good.

Goal 1: Non-Maleficence

In the context of SoilBuddy, a commitment to non-maleficence - articulated in the credo “first, do no harm” - involves a rejection of utilitarian logic to instead foreground safeguards against proliferation and mitigation of unintended consequences before the launch of SoilBuddy.

1.1: Counter-Proliferation

SoilBuddy must be governed to prevent runaway proliferation in wild ecosystems and spillover events should be minimized as far as possible.

1.2: Mitigation

Before applying SoilBuddy, practitioners in the SoilBuddy ecosystem must enact measures that mitigate the harm caused by both foreseen unintended consequences (ie. known risks of biotech in agriculture per se, such as gene contamination) as well as unforeseen unintended consequences (ie. put in place inherent safeguards or plans of action to manage unknown unknowns).

Goal 2: Justice

In the context of SoilBuddy, our construction of justice pertains to its distribution and development into perpetuity – I mean to open-source soil health.

2.1: Material Equity

Governance should focus on ensuring geographic and socioeconomic equity in access to the materiel of SoilBuddy, administering its distribution on the basis that food ought to be a human right, in line with the UN’s SDGs.

2.2: IP Equity

Governance should guard against the privatization of developments on SoilBuddy as a platform, recognizing that, again, access to food as a human right contributes to the common good of the species. Insomuch as SoilBuddy will start off open-source, its derivative art should also be accessible to all.

Governance Actions

Option One: Building-in Self-Limitation as a Technical Strategy (“Self-Limitation”)

Similar to how nuclear power plants exported under technical assistance programs are designed to make weaponization impossible, one governance action for industrial, academic and governmental practitioners in the SoilBuddy ecosystem to undertake would be stressing self-limitation when designing SoilBuddy: incorporating failsafes (e.g. robust, immutable genetic kill-switches) within SoilBuddy to mitigate the risk of accidental proliferation arising from unknown unknowns; and only including biosensor and effector modalities that negate or greatly minimize the risk of deliberate proliferation.

Academic practitioners will need to self-police their research into SoilBuddy to avoid the development of high-risk features, as will industrial stakeholders, despite the potential upside of risky features in contravention of safety-first design principles. National governments must agree to avoid research into safeguard-free SoilBuddy platforms despite the agricultural productivity they may enjoy, not to speak of avoiding the weaponization of SoilBuddy to wage agricultural war. International governments will have to dedicate scarce resources, and scarcer funding, to coercive as well as persuasive measures to promote this technical strategy, promoting global buy-in despite political differences as well as counterproliferative interdiction.

A great uncertainty remains as to whether such kill-switches remain biologically feasible, and robust to accidental/natural degradation (eg mutations, horizontal gene transfer). Pessimistically speaking, this strategy also assumes an alignment of academic-industrial complexes away from private interests, not to speak of strategic apparatuses of power. Given how anything short of absolute, collective alignment will rapidly corrode the incentive of collective security that motivates cooperation on this technical strategy, it is also unclear exactly how stable such an arrangement will be without active enforcement against rogue states and non-state actors.

Beyond the failure modes analyzed while elucidating the policy’s assumptions, it is also possible this technical strategy of self-limitation could dampen the research momentum required to make headway on a key technology of humanity’s food future; that is, shirking risk altogether might forgo a benefit substantial enough to outweigh even the existential risk of SoilBuddy’s proliferation and inequitable distribution (i.e. concentration within centers of comparative advantage), which could actually be a more efficacious state of affairs.

Option Two: Mandating Equitable Access as an Explicit Policy Measure (“Legislating Equity”)

Switching gears from food, let’s consider the state of California’s Assembly Bill 685 as a piece of legislation enshrining access to water as a human right; in a similar fashion, option two would involve (inter)national legislation (and requisite enforcement) guaranteeing equitable access to the material and IP of SoilBuddy as the key to a more secure food supply across geopolitical borders and strata of development. This stands in stark contrast to the oligopolistic pricing of economic factor inputs and complementary agritech/biotech goods used in the production of food crop in certain major agricultural markets, as well as the fervent privatization of IP governing crop cultivation methods – much like the denial by John Deere of farmers’ right to repair their equipment, but in contraposition to Norman Borlaug’s methods of knowledge-sharing that underpinned the green revolution.

Given the bottlenecks manufacturing inequity – such as mass production, marketing and distribution – lie in industry, academia does not figure in Option Two as much insofar as we discount the role of university corporations that might own key patents and licenses withholding access to SoilBuddy platforms and developments thereof. Taking them to constitute industrial actors for the sake of argument, we thus find that the key stakeholders in Option Two are the industry players charged with balancing the profit motive and corporate responsibility, national regulators that must maintain the free market incentive to innovate with an interest in delivering on their duty to citizens, and international government faced with material limitations in enforcing violations of legislated equity despite national interests in shaping the competitiveness of their own agricultural productivity, or those of others.

Specifically, taking this legislation to be an international treaty guaranteeing mechanisms of cross-subsidization pegged to national income levels viz debt and requiring the registration of SoilBuddy IP with an INGO, we find that Option Two also requires the assumption of total cooperation to negate private incentives to cheat (in an economic sense). Beyond overlapping similarities with Option One’s modes of failure at the industrial and national levels, Option Two relies even more on state and international capacity for the enforcement of this law since there isn’t a structural incentive to ensure equity, as much as there is one to maintain technical non-proliferation due to salutary effects on SoilBuddy metabolic budgets, the avoidance of legal liability or sidestepping openings that might facilitate corporate espionage.

Apart from the failure of legislated equity to take off given the precedent of global pharmaceutical distribution, there is a distinct risk that this legislation may be weaponized for hybrid contestation on the international level. On the other hand, should there be comprehensive good-faith buy-in, the runaway success of mandated equity might paradoxically entrench corollary inequities in the distribution of R&D capacity across the Global South by eliminating the incentive to develop indigenous agritech capabilities, or entrench inter-nation trade dependencies that, if not adequately de-risked, could be upended by trade disruptions, however innocuous in origin, such as the one caused by the Ever Given’s freak beaching in the Suez Canal.

Option Three: Engineering Markets to Incentivize Equity (“Incentivized Equity”)

Apart from altruism and compulsion, a third drive worth exploring is that of economic incentives that might appeal to naturally self-interested, rational economic agents. Similar to the ethos underpinning the FDA orphan drugs scheme, I propose the disbursement of strings-attached market incentives such as tax holidays, targeted research funding, export preferences, indirect subsidies and fast-track approval to govern the nonproliferative development and equitable distribution of SoilBuddy.

Here, a finer dissection of industry is necessary. We lump industry’s R&D side of house with academia in the form of the academic-industrial complex, which will naturally be incentivized by the targeted fiscal outlays promoting responsible SoilBuddy platform development to incorporate biosafety features in the IP that will head into commercialization; this can be verified by funding bodies during progress reviews and the tiered disbursement of funding tranches subject to the satisfactory fulfilment of performance metrics germane to non-maleficent and just SoilBuddy platforms. Next, industrial players overseeing commercialization, production and marketing will be incentivized by corporate incentives such as tax holidays and indirect subsidies to stay faithful to safety-first IP, as well as allow regulatory oversight of their products to ensure they meet the counterproliferative and equitable design features articulated above. On the international level, member fee waivers at INGOs as well as the enactment of diplomatic commons would promote good faith adherence to biosafety in the use and distribution of SoilBuddy, just as the UN has slowly but steadily driven development across the board, and the WTO common prosperity through responsible trade.

Option Three rests heavily on governmental capacity for the microeconomic to international administration of the scheme, from the efficient disbursement of funds, to the capacity for responsible statesmanship promoting the common good. Aside from the obvious failure mode should government failure (ie. rent-seeking) distort incentives for the responsible development and distribution of SoilBuddy, there is the crucial issue of national strategic interest in the performance of the biotech firms developing SoilBuddy in the high-income biotech exporters likely to muster the industrial and governmental heft to implement Option Three in the first place; importing nations might indeed suffer dumping should the subsidies provided for the responsible development and disbursement of SoilBuddy be anti-competitive. Thus, strong international governance of SoilBuddy’s trade is pivotal in maintaining the sustainability of SoilBuddy’s application worldwide.

Evaluation of Governance Actions

We evaluate the governance actions on their efficacy in meeting the governance outcomes articulated above, as well as policy desirability, feasibility and sustainability.

| Governance Outcome / Policy Feature | Option One “Self-Limitation” | Option Two “Legislated Equity” | Option Three “Incentivized Equity” |
|---|---|---|---|
| Non-Maleficence (Counterproliferation) | 1 | 2 | 2 |
| Non-Maleficence (Mitigation) | 1 | 2 | 3 |
| Justice (Material Equity) | 3 | 2 | 3 |
| Justice (IP) | 3 | 2 | 3 |
| Desirability (Scale of drawbacks) | 1 | 1 | 2 |
| Feasibility (Policy implementation) | 3 | 2 | 2 |
| Sustainability (Policy sustainment) | 2 | 1 | 3 |

Selection of Governance Action(s)

I would select Option One (“Self-Limitation”) and Option Two (“Legislated Equity”) as complementary policies that offer both a high-uncertainty, high-reward as well as moderately-achievable option comprehensively guaranteeing biosecurity on both counts, of non-maleficence and justice.

Option One (“Self-Limitation”) alone would seem the most promising governance action barring its inability to directly achieve justice. It is also highly contingent on the biological underpinnings of SoilBuddy, which is an unfortunately uncertain, if rewarding bet. Given the shortfall of a single policy, we pursue either Option Two (“Legislated Equity”) or Option Three (“Incentivized Equity”) as a synergistic policy that, in a swiss cheese fashion, provides better coverage of biosecurity objectives.

The key differentiator between Option Two (“Legislated Equity”) and Option Three (“Incentivized Equity”) lies in the unsustainability of fiscal outlays at both the national and international levels as required for the latter, as well as the greater alignment of coercive incentive in the former to adhere to non-maleficence. As a whole, Option Two is attainable, though contingent on the tedious, if achievable negotiation of international interests to settle on a suitable piece of biosecurity legislation.

Crucially, both policies hinge on the fair assumption that there is sufficient technical as well as state capacity to see SoilBuddy through in the first place in a responsible manner. At the same time, we qualify that, pending further research, the technical research underpinning first-of-its-kind killswitches and safety measures required for Option One is more uncertain than international cooperation on the governance of a risky technology, which has precedent in the historical governance of nuclear energy, as well as more contemporary discussions of human germline editing and mirror life.

Ethical Concerns and Appropriate Governance Actions

This week’s class and this exercise as a whole have brought to mind two worries, one deontological and the other of a more utilitarian turn. Firstly, I questioned if the governance of biosecurity might err on the side of caution and unduly impinge on the freedom of sovereign individuals to pursue their SoilBuddy-related curiosities, or nations to self-actualize their agricultural potential. Applying a Lockean construction of the tradeoff between governance and freedom, however, one realizes that the necessary surrender of a few freedoms pertaining to the riskiest of technologies, especially ones that might have existential implications for the species and planet as a whole, really is a freedom from, to channel Isaiah Berlin, a greater evil; the freedom to feed our individual or collective curiosities is contingent on the responsible stewardship of the commons which make it possible.

Secondly, from the standpoint of effective altruism, I wondered if governance might be enacting longtermist harm by impeding the pace of technological research necessary to mitigate the existential threat the species faces to its food supply. Again, on the balance of probabilities this more consequentialist reasoning enables, we conclude that the risk of massive ecological contamination in the image of an Atwood dystopia requires that we factor in guardrails against irreversible harm borne of an irresponsibly Pandoran scientific accelerationism.

On a metacognitive level, inasmuch as one might worry about the balance of caution and speed required to safely realize the promise of SoilBuddy for food security across the species, it is this very debate that constantly refines humanity’s construction of this negotiation between risk and reward.

Week 2 Lecture Prep

Professor Jacobson

  1. Polymerase has an error rate of 1:10^6. Given the length of the human genome, ~3.2 Gbp, this implies ~3,200 expected errors per round of replication, which is substantial at the scale of the genome. The error rate of polymerase is mitigated by proofreading during DNA replication, as well as MutS-mediated mismatch repair after replication.

  2. Taking the average human protein to contain ~400 amino acids in its primary structure, this means ~400 mRNA codons would be necessary to code for it. Given the degeneracy of the genetic code, working backwards and assuming a roughly equal proportion of different amino acids for the sake of argument, the codon table implies 4^(8×20) × 3^(1×20) × 2^(12×20) ≈ 1.32×10^178 different sets of DNA triplet codes to encode the typical human protein. In practice, certain DNA triplet code sequences encode mRNAs that are more structurally stable, degradation-resistant, and amenable to translational regulatory elements necessary for mRNA translation and protein folding.
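The arithmetic in the two answers above can be checked with a short script. This is only a sketch: the function names are hypothetical, the genome length and error rate are the figures quoted in the text, and the exponents are taken as written in the answer (20 copies of each amino acid in a 400-residue protein) rather than re-derived from the codon table.

```python
# Rough arithmetic behind the two answers above (all figures taken from the text).

GENOME_BP = 3_200_000_000   # ~3.2 Gbp human genome, as stated above
BASES_PER_ERROR = 10**6     # polymerase error rate of 1 in 10^6, as stated above


def expected_errors(genome_bp: int, bases_per_error: int) -> float:
    """Expected number of polymerase errors in one pass over the genome."""
    return genome_bp / bases_per_error


def synonymous_encodings(copies_per_amino_acid: int = 20) -> int:
    """Evaluate the expression from the text for a 400-residue protein.

    The grouping (8 terms with 4 codons, 1 with 3, 12 with 2) is copied
    from the answer above, not recomputed from the genetic code.
    """
    n = copies_per_amino_acid
    return 4 ** (8 * n) * 3 ** (1 * n) * 2 ** (12 * n)


if __name__ == "__main__":
    print(expected_errors(GENOME_BP, BASES_PER_ERROR))
    # Python integers are arbitrary precision, so the count is exact;
    # it is converted to float only for compact display.
    print(f"{float(synonymous_encodings()):.2e}")
```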

Dr. LeProust

  1. Solid-phase phosphoramidite synthesis.
  2. Due to accumulated errors, the yield for fully-synthesized sequences is impractically low, especially since they have to be purified and isolated. For long sequences, steric hindrance near the substrate as well as chemical degradation further reduce yield.
  3. For similar reasons as above, direct synthesis’ stepwise inefficiency would cause the yield to be impractically low.

George Church

  1. They are: Phenylalanine, Threonine, Tryptophan, Leucine, Lysine, Methionine, Arginine, Histidine, Isoleucine and Valine [1]. To use the Jurassic Park example at surface value, I think picking lysine was a poor choice for omnivorous/carnivorous dinosaurs since they were able to obtain the essential amino acid from their diet, or the environment. At the same time, it offers a broader lesson on building safeguards into our biotechnology; even if we replaced lysine with a xenobiotic amino acid, for example, simply relying on one failsafe doesn’t provide a reasonable amount of biosecurity.

[1] Lopez MJ, Mohiuddin SS. Biochemistry, Essential Amino Acids. [Updated 2024 Apr 30]. In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2025 Jan-. Available from: https://www.ncbi.nlm.nih.gov/books/NBK557845/

Week 2 HW: DNA Read, Write and Edit

Part 0

Done.

Part 1

Since it’s passe to create “MIT” with the electrophoresis gel, I decided to reverse the order of the letters to spell “TIM” instead.

Digest artwork

Part 2

I don’t have in-person access to a node, so can’t perform the wet lab component.

Part 3

Part 3.1

I chose to investigate an inorganic pyrophosphatase-driven proton pump because it’s the most relevant to the SoilBuddy MVP. Here are other candidates I considered and why I rejected them:

  • Bacteriorhodopsin: (+) prokaryote-native; (-) a light-driven reaction is impractical for a soil-dwelling bacterium
  • Membrane H+-ATPase: (+) native cellular localization at the membrane is favorable; (-) high risk of cross-talk with native membrane proton pumps, which may preclude bacterial survival
  • Methanogenic proton pump: (-) presupposing an electron acceptor for the correct functioning of the proton pump limits our choice of target organism and may hinder the systems integration of other effector means for SoilBuddy

The sequence, obtained from UniProt, is as follows:

>AAK86977.2 H+ translocating pyrophosphate synthase [Agrobacterium fabrum str. C58]
MTVIPIVILCGVLSVVYAVWTTKSVLDADQGNERMREIAGYIREGAQAYLTRQYLTIAIVGLIVAVLAWY
LLSAIAAIGFVIGAVLSGVAGFVGMHVSVRANLRTAQAASHSLGAGLDIAFKSGAITGMLVAGLALLGVS
IYYFVLTSVLGHPPGSRAVIDALVSLGFGASLISIFARLGGGIFTKGADVGGDLVGKVEAGIPEDDPRNP
ATIADNVGDNVGDCAGMAADLFETYAVSVVATMVLAAIFFAGTPILESAMVYPLAICGACILTSIAGTFF
VKLGTNNSIMGALYKGLIATGVFSVAGLAVATYATVGWGTIGTVAGMEITGTNLFFCGLVGLVVTALIVV
ITEYYTGTNKRPVNSIAQASVTGHGTNVIQGLAVSLESTALPAIVIVGGIIGTYQLGGLFGTGIAVTAML
GLAGMIVALDAFGPVTDNAGGIAEMAGLDPDVRKATDALDAVGNTTKAVTKGYAIGSAGLGALVLFAAYA
NDLSYFAANGDTYPYFKDIGEISFSLANPYVVAGLLFGGLIPYLFGGIAMTAVGKAASAIVEEVRRQFRE
KPGIMAGTEKPDYGRAVDLLTKAAIREMVIPSLLPVLAPLVVYFGVLLISGSKASAFAALGASLLGVIIN
GLFVAISMTSGGGAWDNAKKSFEDGFIDKDGVRHVKGSEAHKASVTGDTVGDPYKDTAGPAVNPAIKITN
IVALLLLAVLAH

Part 3.2

The reverse-translated nucleotide sequence, obtained using bioinformatics.org’s reverse translation tool, is as follows:

atgaccgtgattccgattgtgattctgtgcggcgtgctgagcgtggtgtatgcggtgtgg
accaccaaaagcgtgctggatgcggatcagggcaacgaacgcatgcgcgaaattgcgggc
tatattcgcgaaggcgcgcaggcgtatctgacccgccagtatctgaccattgcgattgtg
ggcctgattgtggcggtgctggcgtggtatctgctgagcgcgattgcggcgattggcttt
gtgattggcgcggtgctgagcggcgtggcgggctttgtgggcatgcatgtgagcgtgcgc
gcgaacctgcgcaccgcgcaggcggcgagccatagcctgggcgcgggcctggatattgcg
tttaaaagcggcgcgattaccggcatgctggtggcgggcctggcgctgctgggcgtgagc
atttattattttgtgctgaccagcgtgctgggccatccgccgggcagccgcgcggtgatt
gatgcgctggtgagcctgggctttggcgcgagcctgattagcatttttgcgcgcctgggc
ggcggcatttttaccaaaggcgcggatgtgggcggcgatctggtgggcaaagtggaagcg
ggcattccggaagatgatccgcgcaacccggcgaccattgcggataacgtgggcgataac
gtgggcgattgcgcgggcatggcggcggatctgtttgaaacctatgcggtgagcgtggtg
gcgaccatggtgctggcggcgattttttttgcgggcaccccgattctggaaagcgcgatg
gtgtatccgctggcgatttgcggcgcgtgcattctgaccagcattgcgggcacctttttt
gtgaaactgggcaccaacaacagcattatgggcgcgctgtataaaggcctgattgcgacc
ggcgtgtttagcgtggcgggcctggcggtggcgacctatgcgaccgtgggctggggcacc
attggcaccgtggcgggcatggaaattaccggcaccaacctgtttttttgcggcctggtg
ggcctggtggtgaccgcgctgattgtggtgattaccgaatattataccggcaccaacaaa
cgcccggtgaacagcattgcgcaggcgagcgtgaccggccatggcaccaacgtgattcag
ggcctggcggtgagcctggaaagcaccgcgctgccggcgattgtgattgtgggcggcatt
attggcacctatcagctgggcggcctgtttggcaccggcattgcggtgaccgcgatgctg
ggcctggcgggcatgattgtggcgctggatgcgtttggcccggtgaccgataacgcgggc
ggcattgcggaaatggcgggcctggatccggatgtgcgcaaagcgaccgatgcgctggat
gcggtgggcaacaccaccaaagcggtgaccaaaggctatgcgattggcagcgcgggcctg
ggcgcgctggtgctgtttgcggcgtatgcgaacgatctgagctattttgcggcgaacggc
gatacctatccgtattttaaagatattggcgaaattagctttagcctggcgaacccgtat
gtggtggcgggcctgctgtttggcggcctgattccgtatctgtttggcggcattgcgatg
accgcggtgggcaaagcggcgagcgcgattgtggaagaagtgcgccgccagtttcgcgaa
aaaccgggcattatggcgggcaccgaaaaaccggattatggccgcgcggtggatctgctg
accaaagcggcgattcgcgaaatggtgattccgagcctgctgccggtgctggcgccgctg
gtggtgtattttggcgtgctgctgattagcggcagcaaagcgagcgcgtttgcggcgctg
ggcgcgagcctgctgggcgtgattattaacggcctgtttgtggcgattagcatgaccagc
ggcggcggcgcgtgggataacgcgaaaaaaagctttgaagatggctttattgataaagat
ggcgtgcgccatgtgaaaggcagcgaagcgcataaagcgagcgtgaccggcgataccgtg
ggcgatccgtataaagataccgcgggcccggcggtgaacccggcgattaaaattaccaac
attgtggcgctgctgctgctggcggtgctggcgcat
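As an illustration of what reverse translation does, the sketch below maps each amino acid to a single codon. The codon choices are read off the sequence above (one codon per residue appears throughout); how the bioinformatics.org tool picks its codons internally is not stated here, so the lookup table and the function name `reverse_translate` are assumptions for illustration only.

```python
# One codon per amino acid, as observed in the reverse-translated sequence above.
# This table is an illustration, not the tool's documented codon-usage table.
CODON_CHOICE = {
    "A": "gcg", "R": "cgc", "N": "aac", "D": "gat", "C": "tgc",
    "Q": "cag", "E": "gaa", "G": "ggc", "H": "cat", "I": "att",
    "L": "ctg", "K": "aaa", "M": "atg", "F": "ttt", "P": "ccg",
    "S": "agc", "T": "acc", "W": "tgg", "Y": "tat", "V": "gtg",
}


def reverse_translate(protein: str) -> str:
    """Return one possible DNA encoding of a protein sequence.

    Raises KeyError on any character that is not one of the
    20 standard one-letter amino acid codes.
    """
    return "".join(CODON_CHOICE[residue] for residue in protein.upper())


if __name__ == "__main__":
    # First 20 residues of the pyrophosphatase sequence from Part 3.1.
    print(reverse_translate("MTVIPIVILCGVLSVVYAVW"))
```

Running it on the first 20 residues reproduces the first line of the nucleotide sequence above. Because the genetic code is degenerate, this is only one of many valid encodings, which is what leaves room for the codon optimization in Part 3.3.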

Part 3.3

We need to optimize codons because different model organisms have aminoacyl-tRNAs in different abundances. Certain organisms may not have enough, or any, tRNAs with the anticodons corresponding to our nucleotide sequence, while for others, certain tRNAs may be more abundant, and thus facilitate more efficient translation of mature mRNA.

I’ll be optimizing my sequence for E. coli since it’s both a well-characterized transformation and plasmid amplification platform, and amenable to SoilBuddy’s target application. Given the length of the target sequence, I’ll hedge my bets on Golden Gate Assembly, and optimize my codons without the restriction sites for common type IIS enzymes such as BsaI, BbsI, BsmBI, and FokI. I used Twist Bioscience’s codon optimization tool:

ATGACAGTTATCCCTATAGTAATACTTTGTGGTGTTTTGTCGGTAGTCTACGCAGTTTGGACGAC
GAAGTCTGTCTTGGACGCTGACCAAGGTAATGAGAGAATGCGTGAGATCGCAGGTTACATACGTG
AGGGCGCACAAGCATACTTAACACGACAATACCTCACTATCGCTATCGTTGGGCTTATCGTAGCT
GTCTTAGCATGGTACTTATTATCAGCAATCGCAGCAATCGGGTTCGTCATAGGTGCTGTTCTTAG
TGGTGTAGCTGGATTCGTTGGTATGCACGTATCCGTTCGTGCTAATTTACGTACAGCACAAGCTG
CCTCACACTCTTTAGGGGCTGGTCTGGACATCGCTTTCAAGAGTGGAGCCATCACAGGTATGTTG
GTCGCCGGATTAGCCCTTCTTGGTGTTAGTATATACTACTTCGTCCTTACGTCGGTACTTGGGCA
CCCACCTGGGTCTAGAGCTGTTATAGACGCATTAGTTTCCTTGGGGTTCGGAGCATCGTTGATCT
CAATCTTCGCCCGTTTGGGTGGTGGTATCTTCACAAAGGGTGCAGACGTCGGCGGAGATCTTGTC
GGAAAGGTTGAGGCAGGTATCCCTGAGGACGACCCCCGTAATCCAGCTACAATCGCCGACAATGT
TGGAGACAATGTTGGTGACTGTGCAGGAATGGCCGCTGACCTCTTCGAGACTTACGCAGTTAGTG
TTGTTGCAACTATGGTTTTAGCCGCAATCTTCTTCGCTGGGACACCTATCTTAGAGTCTGCAATG
GTTTACCCATTAGCTATATGTGGAGCATGTATATTAACGTCCATCGCTGGTACGTTCTTCGTTAA
GTTAGGTACAAATAATAGTATCATGGGAGCCCTCTACAAGGGTTTAATCGCAACGGGAGTTTTCT
CTGTTGCAGGTCTTGCTGTTGCAACTTACGCAACAGTCGGTTGGGGTACTATCGGTACAGTAGCC
GGTATGGAGATAACTGGAACAAATTTGTTCTTCTGTGGTTTAGTAGGGTTAGTTGTCACAGCATT
GATAGTTGTAATAACAGAGTACTACACTGGAACTAATAAGCGACCAGTCAATTCCATCGCACAAG
CATCTGTCACAGGTCACGGGACGAATGTTATCCAAGGTTTAGCCGTTTCTTTAGAGTCGACAGCC
CTTCCTGCCATCGTCATAGTAGGCGGTATCATAGGTACTTACCAACTCGGTGGATTATTCGGTAC
AGGTATAGCCGTTACTGCAATGCTCGGATTGGCAGGCATGATAGTTGCCTTGGACGCATTCGGTC
CCGTAACAGACAATGCCGGTGGTATAGCTGAGATGGCTGGTCTTGACCCTGACGTCCGTAAGGCT
ACGGACGCACTCGACGCTGTAGGAAATACTACTAAGGCAGTCACGAAGGGATACGCTATAGGGTC
TGCAGGACTCGGTGCCTTGGTTCTCTTCGCCGCTTACGCTAATGACTTATCATACTTCGCAGCCA
ATGGTGACACATACCCTTACTTCAAGGACATCGGTGAGATCTCATTCTCATTGGCAAATCCATAC
GTTGTAGCCGGTTTATTATTCGGTGGATTAATCCCATACCTCTTCGGTGGTATCGCAATGACAGC
CGTCGGAAAGGCAGCTTCAGCCATCGTTGAGGAAGTTCGTCGGCAATTCCGAGAGAAGCCCGGTA
TCATGGCTGGTACGGAGAAGCCCGACTACGGGCGGGCAGTTGACCTGCTTACGAAGGCAGCCATT
CGTGAGATGGTTATCCCCAGTCTTTTACCTGTTCTCGCCCCTCTTGTAGTTTACTTCGGTGTTCT
TCTCATCTCTGGGTCGAAGGCTTCGGCTTTCGCAGCCCTCGGTGCTTCGTTATTGGGTGTTATCA
TAAATGGGTTGTTCGTAGCTATCAGTATGACGTCTGGTGGCGGGGCATGGGACAATGCCAAGAAG
TCATTCGAGGACGGGTTCATAGACAAGGACGGTGTTAGACACGTCAAGGGTAGCGAGGCTCACAA
GGCTTCCGTTACGGGTGACACAGTTGGTGACCCATACAAGGACACTGCCGGACCCGCCGTTAATC
CCGCTATCAAGATCACGAATATCGTTGCATTGCTTTTATTAGCAGTCCTCGCTCAC
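Before ordering a sequence for Golden Gate Assembly, it may be worth double-checking that none of the excluded type IIS recognition sites remain on either strand. The sketch below is one way to do that; the function names are hypothetical, and the recognition sequences are the standard ones for BsaI (GGTCTC), BbsI (GAAGAC), BsmBI (CGTCTC) and FokI (GGATG).

```python
# Recognition sequences (5'->3') for the type IIS enzymes named above.
TYPE_IIS_SITES = {
    "BsaI": "GGTCTC",
    "BbsI": "GAAGAC",
    "BsmBI": "CGTCTC",
    "FokI": "GGATG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, sites: dict = TYPE_IIS_SITES) -> dict:
    """Map enzyme name -> sorted 0-based positions of its site on either strand.

    A site on the bottom strand shows up in the top strand as the
    reverse complement of the recognition sequence, so both are searched.
    Enzymes with no hits are left out of the result.
    """
    seq = seq.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        for motif in {site, reverse_complement(site)}:
            start = seq.find(motif)
            while start != -1:
                positions.append(start)
                start = seq.find(motif, start + 1)
        if positions:
            hits[enzyme] = sorted(positions)
    return hits


if __name__ == "__main__":
    # Toy sequence with a BsaI site on the top strand at position 3.
    print(find_sites("AAAGGTCTCAAA"))
```

An empty dictionary means no listed site was found; pasting the optimized sequence in as a single string (line breaks removed) would run the same check on the full construct.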

Part 3.4

Once the target sequence is spliced into a cloning vector via golden gate assembly and a competent strain of E. coli is obtained, the bacteria could be transformed using heat shock treatment. After selection for expression of the cloning vector, transformed bacteria would be incubated in broth.

Cells would be extracted via centrifugation and lysis with detergent to release the membrane protein. The membrane protein would be purified using chromatography and verified using Western blot.

A cell-free protocol would use E. coli lysate to conduct the in vitro transcription then translation of the plasmid vector containing the target sequence. Purification and verification of the protein would similarly involve chromatography and Western blot.

Part 3.5

Upon transcription of a DNA sequence to pre-mRNA, post-transcriptional modification of the pre-mRNA involves alternative splicing, which allows for different combinations of exons to be present in the mature mRNA, and subsequently translated into protein isoforms.

Alternatively, transcription initiation complexes may form on different promoters within the gene, causing different transcripts to begin with.

Part 4

Part 4.1

Done

Part 4.2

ppase

Part 5

Part 5.1: DNA Read

(i) I’d like to sequence the eDNA of the soil microbiome to gain an understanding of the organisms SoilBuddy would have to co-exist with.

(ii)

  1. I’d use Illumina NGS sequencing because it balances read length (more than sufficient for analyzing eDNA fragments) with cost and throughput. It is classified as a 2nd-generation sequencing technology since it utilizes sequencing-by-synthesis, albeit in a massively parallel architecture.

  2. My input is purified DNA fragments obtained from soil solution. Depending on the distribution of fragment sizes, a coarse reading of which may be obtained with a pilot gel electrophoresis using a standard ladder, I will first use enzymatic fragmentation to produce suitable DNA fragments, then ligate Illumina NGS-specific adapters containing barcode sequences, and finally run a low number of PCR cycles to amplify my DNA fragments.

After PCR amplification, I’ll pool my samples to ensure equal representation in my library and verify their purity before loading them into the sequencer.

  3. Under Illumina NGS, the DNA fragments in my library hybridize sparsely across the surface of the sequencer’s flow cell via the adapters ligated to the eDNA samples. The fragments first undergo bridge amplification, in which multiple copies of each fragment (both forward and reverse strands) are generated in small clusters within the flow cell’s nanowells. The reverse strands are then enzymatically cleaved and washed off the flow cell.

Once clusters are produced, the flow cell is flooded with primer, DNA polymerase and modified, fluorescently-tagged nucleotides, which the polymerase can only incorporate one base at a time into the strands growing in each cluster. Unbound nucleotides are washed away, then the flow cell is imaged, which reveals the base incorporated in each cluster by virtue of the distinct wavelength of fluorescent light given off by each modified nucleotide (ie. A, T, C or G). The fluorescent tag and the terminator group blocking further synthesis are chemically cleaved, then the flow cell is again flooded with fluorescently-tagged nucleotides. This process repeats till the full length of the DNA fragments has been replicated.

Specifically, Illumina NGS identifies each base using the unique fluorescent signature emitted by the tag attached to each of the four possible nucleotides.

  4. After demultiplexing, the output is short-read data. I’d subsequently process the raw read data using a metagenomic analysis pipeline to identify the specific taxa and species present in the original soil solution sample.
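The base-identification step described above can be illustrated with a toy base caller that, for each cycle, picks the channel with the strongest fluorescent signal. This is only a sketch: the intensity values are made up, the four-channel layout is an assumption, and real base callers also correct for effects such as channel crosstalk and phasing.

```python
def call_bases(intensities, channels="ACGT"):
    """Call one base per cycle by picking the channel with the strongest signal."""
    read = []
    for cycle in intensities:
        brightest = max(range(len(channels)), key=lambda i: cycle[i])
        read.append(channels[brightest])
    return "".join(read)

# Made-up intensities for one cluster over five cycles (channel order: A, C, G, T).
cluster = [
    (0.91, 0.03, 0.04, 0.02),
    (0.05, 0.02, 0.06, 0.87),
    (0.04, 0.08, 0.85, 0.03),
    (0.02, 0.90, 0.05, 0.03),
    (0.88, 0.04, 0.05, 0.03),
]
read = call_bases(cluster)
print(read)
```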

Part 5.2: DNA Write

(i) I’d like to synthesize DNA for a genetic circuit that regulates bacterial membrane protein pumps depending on external pH as part of my SoilBuddy MVP. I’d choose a set point optimized for food crops such as corn.

(ii)

  1. I’d use restriction-ligation (BioBrick standard) assembly to put my genetic circuit together given the simplicity and reliability of the protocol. First, I’d identify appropriate restriction enzyme combinations for the BioBricks parts I’ll require. Next, I’ll pick two BioBricks parts at a time for insertion into the target plasmid.

Each BioBricks part and the vector will be digested in separate reactions with the appropriate restriction enzymes, followed by heat inactivation of the enzymes. Thereafter, the digested BioBricks parts and vector will be incubated together with DNA ligase, in molar amounts chosen to favor formation of the target construct. After transformation into a bacterial host, selection, amplification and extraction, the process is repeated till all the BioBricks parts are incorporated within the recombinant plasmid vector.

  2. The process is extremely time-intensive due to the multiple, repetitive transformations and amplifications involved. While laboratory automation could allow the process to scale, it would be very resource-intensive. Lastly, while each step might be lossy, regular purification, selection and verification steps would ensure the desired vector is formed in the end.
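The molar amounts for the ligation step are commonly set through an insert:vector molar ratio. The sketch below converts such a ratio into a mass of insert. The 3:1 default and all the example numbers are assumptions chosen for illustration, not values from my protocol.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector_ratio=3.0):
    """Mass of insert (ng) that gives the requested insert:vector molar ratio.

    Equal masses of DNA contain molecules in inverse proportion to their length,
    so the vector mass is scaled by the length ratio and then by the molar ratio.
    """
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector_ratio

# Hypothetical example: 50 ng of a 3000 bp vector and a 900 bp part at 3:1.
mass = insert_mass_ng(vector_ng=50, vector_bp=3000, insert_bp=900)
print(f"{mass:.1f} ng of insert")
```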

Part 5.3: DNA Edit

(i) One possible development of SoilBuddy would be a process, rather than an organism, that transforms native soil microbiota to enhance their salutary functions (eg nitrogen-fixing, pH buffering). To that end, I would like to edit the DNA of both bacteria (prokaryotes) and fungi (eukaryotes), considering the particular niche of commensal microbiota near plant roots. This would mainly involve gene insertion to introduce novel gene products, or base edits to enhance existing regulatory mechanisms in the microorganisms.

(ii)

  1. Given the range of target systems (both prokaryotic and eukaryotic) as well as payloads desired (ie. novel genes and edits), I would leverage the versatility of CRISPR. Taking a CRISPR-based gene knock-in of yeast as an example, we first design a DNA sequence encoding a gRNA that recognizes our target insertion site and has minimal off-target effects. Then, we select a yeast plasmid that carries an antibiotic selection marker and a sequence encoding Cas9 endonuclease. Using restriction-ligation cloning, we first digest the vector and gRNA-encoding sequence with appropriate restriction enzymes, then ligate the sequence into the plasmid using DNA ligase, transform the recombinant vector into competent yeast cells, and finally select transformants and amplify the plasmid through incubation.

We prepare a donor template containing the gene to be inserted and transform the exogenous donor template into yeast. Now, the yeast contains Cas9 endonuclease from expression of the plasmid, which complexes with the gRNA (also expressed from transcription of the plasmid DNA) to form a ribonucleoprotein complex. The crRNA portion of the gRNA binds to the recognition site, after which the Cas9 nuclease domains make a double-stranded break in the yeast DNA. Thereafter, the donor template provides for Homology-Directed Repair that knocks in the intended gene.

  2. The preparatory work involves designing the gRNA, selecting yeast cells that are competent and have an intact Homology-Directed Repair mechanism, and designing a donor template that incorporates a reporter (eg GFP) so that unintended edit products are minimized. Because gene knock-in has low efficiency, edited yeast cells then have to be selected, for example through fluorescence-based flow cytometry.

  3. While a well-designed gRNA and donor template will minimize indels and off-target edits and maximize the chances of successful gene knock-in, statistically speaking, the odds of a cell being successfully edited are low. Hence, care needs to be taken in selecting for edited cells, which can then be expanded to counteract the low efficiency of knock-in.
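As a first pass at the gRNA design step, candidate target sites can be listed by scanning for a PAM next to a 20-nt protospacer. The sketch below assumes the commonly used SpCas9 with an NGG PAM (the Cas9 variant is not fixed above), scans only the given strand, and does no off-target scoring. The target sequence is a made-up placeholder.

```python
import re

def find_protospacers(seq, guide_len=20):
    """Return (start, protospacer, PAM) for every NGG PAM site on the given strand."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

# Placeholder target region, not a real yeast locus.
target = "GATTACAGATTACAGATTACAGGTT"
hits = find_protospacers(target)
for start, protospacer, pam in hits:
    print(start, protospacer, pam)
```

A fuller design step would also scan the reverse complement and rank candidates by predicted off-target activity.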

Week 3 HW: Lab Automation

Part 1

Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.

The paper I’ll be analyzing is titled “Development of a high-throughput minimum inhibitory concentration (HT-MIC) testing workflow” published in Frontiers in Microbiology.

In summary, the researchers used an Opentrons OT-2 robot to automate an assay used to test the safety-efficacy profile of an antibiotic. They accomplished this by programmatically preparing a serial dilution of the antibiotic candidate across multiple 96-well plates with replicates, inoculating the wells with the test bacteria, then measuring the optical density of each well with a plate reader to determine bacterial viability after the antibiotic challenge.

The method outlined in the paper minimizes human error, standardizes the inoculation pattern used for each plate well and allows a wide range of antibiotics and antibiotic concentrations to be tested in parallel. This yields large improvements in the throughput, accuracy and replicability of results.
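The core readout of such a workflow can be sketched in a few lines: build a two-fold dilution series, then report the lowest concentration whose optical density stays below a growth threshold. The threshold, concentrations and OD values below are invented for illustration and are not taken from the paper.

```python
def twofold_series(top_conc, n_steps):
    """Concentrations of a two-fold serial dilution, highest first."""
    return [top_conc / (2 ** i) for i in range(n_steps)]

def mic_from_od(concs, od_values, growth_threshold=0.1):
    """Lowest concentration whose OD stays below the growth threshold, or None."""
    inhibitory = [c for c, od in zip(concs, od_values) if od < growth_threshold]
    return min(inhibitory) if inhibitory else None

# Made-up plate-reader values for one replicate row (same order as concs).
concs = twofold_series(top_conc=64, n_steps=8)
od = [0.04, 0.05, 0.04, 0.06, 0.45, 0.80, 0.92, 0.95]
mic = mic_from_od(concs, od)
print(concs, mic)
```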

Part 2

Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details.

Considering my main plan, SoilBuddy, lab automation serves primarily to augment the speed at which my experiments may be run, and allow for overlapping scheduling to optimize lab time.

There are a few key steps where lab automation will come in handy:

  • Building: plasmid digestion, transformation, and selection of bacteria
  • Testing: Testing the performance of the reporter system against multiplexed stimulus conditions

Building

Roughly, here are the steps I’ll be following along with the corresponding pieces of lab equipment required for each:

  • (if the plasmid isn’t ordered with gene construct and genetic switch incorporated) ATC Thermal cycler for PCR amplification of cassette, Plate incubator for plasmid digestion
  • Plate incubator for heat shock transformation
  • Plate incubator for bacterial recovery, expansion and selection

Testing

Roughly, here are the steps I’ll be following along with the corresponding pieces of lab equipment required for each:

  • Opentrons OT-2 for preparing serial dilutions of N2 agent and pH
  • Opentrons OT-2 for preparing different inoculations of culture and replicate plates
  • Spark plate reader for reading out reporter results
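Before writing the robot protocol itself, the plate layout for the serial dilutions can be planned in plain Python. The sketch below is not Opentrons API code. It only maps well names to concentrations, and the dilution factor, replicate rows and top concentration are placeholder assumptions.

```python
def dilution_plate_map(top_conc, n_columns=12, replicate_rows="ABC", factor=2.0):
    """Map well names (e.g. 'A1') to concentrations for a column-wise serial dilution."""
    plate = {}
    for col in range(1, n_columns + 1):
        conc = top_conc / (factor ** (col - 1))
        for row in replicate_rows:
            # Every row in replicate_rows receives the same concentration.
            plate[f"{row}{col}"] = conc
    return plate

# Hypothetical layout: six two-fold dilutions in triplicate, starting at 100 units.
plate = dilution_plate_map(top_conc=100.0, n_columns=6)
for well in ("A1", "B2", "C3"):
    print(well, plate[well])
```

A map like this could then drive the transfer steps of an OT-2 protocol and later be joined to the plate-reader output by well name.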

Week 4 - Protein Design Part I

Part A

  1. Number of amino acids = 500 g / (100 g/mol) = 5 mol ≈ 3.0 × 10^24 molecules of amino acids, taking the average amino acid mass of 100 Da as 100 g/mol.
  2. Over the course of digestion, the polynucleotides are broken down into constituent nucleobases prior to absorption.
  3. Considering the degeneracy of codons, these 20 amino acids were sufficient to produce a diversity of proteins, and were evolutionarily conserved due to a lack of strong selection pressure against them (ie. they were “good enough”).
  4. .
  5. They could’ve formed abiotically given the presence of organic precursors such as carboxylic acids, amines and small molecule side groups, catalyzed by zeolites, and triggered by lightning and heat on the primordial earth or extraterrestrial asteroids respectively.
  6. Left-handed spirals.
  7. Yes, some examples are $\pi$ helices and $3_{10}$ helices.
  8. It’s more energetically stable given L-amino acids and D-saccharides predominate natural protein helices and the sugar-phosphate backbone of polynucleotides respectively.
  9. In aqueous environments, the hydrophobic portions of beta sheets are driven together by the hydrophobic effect while the hydrophilic groups facilitate relatively strong intermolecular hydrogen bonding.
  10. Beta sheets allow for strong intermolecular bonding, in hydrophobic zippers, that renders misfolded proteins more stable than their functional conformations and facilitates the addition of more proteins to the aggregate fibril - the biochemical basis of amyloid pathogenesis. Yes, you could make materials out of amyloid beta sheets but they’d only be stable under aqueous conditions and would be potential biotoxins.
  11. .
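The estimate in item 1 can be reproduced with a short calculation, using the 500 g mass and 100 Da average amino acid mass stated there. The value of Avogadro's number is rounded.

```python
AVOGADRO = 6.022e23  # molecules per mole (rounded)

def count_amino_acids(mass_g, avg_residue_mass_g_per_mol=100.0):
    """Return (moles, molecules) of amino acids in a given mass of protein."""
    moles = mass_g / avg_residue_mass_g_per_mol
    return moles, moles * AVOGADRO

moles, molecules = count_amino_acids(500.0)
print(f"{moles:.1f} mol, {molecules:.2e} amino acids")
```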

Part B

  1. I chose to investigate an antifreeze protein from an arctic bacterium as a potential solution for winter snow-clearing.

  2. The sequence, as obtained from UniProt, is as follows:

>sp|H7FWB6|IBP_FLAFP Ice-binding protein OS=Flavobacterium frigoris (strain PS1) OX=1086011 GN=HJ01_03463 PE=1 SV=1
MKILKRIPVLAVLLVGLMTNCSNDSDSSSLSVANSTYETTALNSQKSSTDQPNSGSKSGQ
TLDLVNLGVAANFAILSKTGITDVYKSAITGDVGASPITGAAILLKCDEVTGTIFSVDAA
GPACKITDASRLTTAVGDMQIAYDNAAGRLNPDFLNLGAGTIGGKTLTPGLYKWTSTLNI
PTDITISGSSTDVWIFQVAGNLNMSSAVRITLAGGAQAKNIFWQTAGAVTLGSTSHFEGN
ILSQTGINMKTAASINGRMMAQTAVTLQMNTVTIPQ
  • It is 276 residues long. Threonine is the most common residue, appearing 33 times.
  • There are 250 homologs from across the tree of life.
  • It belongs to the ice-binding protein superfamily.
  3. The structure was modelled in Apr 2014. The model is high-resolution, providing details down to 2.10 Å. There are no non-protein components in the final protein structure. It belongs to the ice-binding protein superfamily.

  4. Cartoon View

Figure: Cartoon View of IBP_FLAFP

Ribbon View

Figure: Ribbon View of IBP_FLAFP

Ball-and-Stick View

Figure: Ball-and-stick View of IBP_FLAFP

Labelling by Secondary structure

Figure: Secondary Structure-Labelled View of IBP_FLAFP

The protein has more sheets than helices, though the helices are substantially larger.

Labelling by residue hydropathy.

Blue represents hydrophilicity while red represents hydrophobicity.

Figure: Hydropathic View of IBP_FLAFP

Hydrophobic residues are encountered sparingly, on outward-oriented branches, while hydrophilic residues are found on the main beta sheets accessible to the protein’s surroundings.
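The same hydropathy trend can be examined numerically from the sequence alone. The sketch below computes a sliding-window average using the Kyte-Doolittle scale, which is an assumption on my part since the viewer's colouring may be based on a different scale. The window size is likewise arbitrary.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=9):
    """Mean Kyte-Doolittle score for each full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

# First 20 residues of the IBP_FLAFP sequence shown earlier.
segment = "MKILKRIPVLAVLLVGLMTN"
profile = hydropathy_profile(segment, window=9)
print([round(x, 2) for x in profile])
```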

Surface View

Figure: Surface Energy View of IBP_FLAFP

There are a few pockets for water to bind to.
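The sequence statistics quoted earlier in Part B (length and most common residue) can be recomputed directly from the FASTA record. This is a minimal sketch using only the standard library. The header line is shortened here and the helper names are my own.

```python
from collections import Counter

# FASTA record copied from the UniProt entry above (header shortened).
FASTA = """>sp|H7FWB6|IBP_FLAFP Ice-binding protein OS=Flavobacterium frigoris (strain PS1)
MKILKRIPVLAVLLVGLMTNCSNDSDSSSLSVANSTYETTALNSQKSSTDQPNSGSKSGQ
TLDLVNLGVAANFAILSKTGITDVYKSAITGDVGASPITGAAILLKCDEVTGTIFSVDAA
GPACKITDASRLTTAVGDMQIAYDNAAGRLNPDFLNLGAGTIGGKTLTPGLYKWTSTLNI
PTDITISGSSTDVWIFQVAGNLNMSSAVRITLAGGAQAKNIFWQTAGAVTLGSTSHFEGN
ILSQTGINMKTAASINGRMMAQTAVTLQMNTVTIPQ
"""

def parse_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    return lines[0].lstrip(">"), "".join(lines[1:])

def residue_stats(seq):
    """Return (length, most common residue, its count)."""
    residue, count = Counter(seq).most_common(1)[0]
    return len(seq), residue, count

header, sequence = parse_fasta(FASTA)
length, top_residue, top_count = residue_stats(sequence)
print(length, top_residue, top_count)
```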

Part C

C1

Deep Mutational Scans

  1. Figure: Mutational Heat Map

  2. I think the bright bands which occur at regular intervals signify residues involved in linking helices together since they offer some flexibility in binding nucleation sites.
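For context, the heat map has one cell per possible single-residue substitution. The sketch below only enumerates those variants for a sequence. The scoring model that fills in the heat map is not shown, and the five-residue input is simply the start of the IBP_FLAFP sequence used as a small example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def single_mutants(seq):
    """Yield (label, mutant_sequence) for every single-residue substitution."""
    for pos, wild_type in enumerate(seq):
        for aa in AMINO_ACIDS:
            if aa != wild_type:
                # Labels follow the usual convention, e.g. M1A = Met at position 1 to Ala.
                yield f"{wild_type}{pos + 1}{aa}", seq[:pos] + aa + seq[pos + 1:]

variants = list(single_mutants("MKILK"))
print(len(variants), variants[0])
```

Each position contributes 19 substitutions, so a full scan of a protein of length L covers 19 × L variants.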

Latent Space Analysis

  1. Yes, they tend to be homologs.

C2

Folding

Figure: Protein Folding

  1. It largely matches the original structure including the spiral linking several beta sheets and a large alpha helix. However, it fails to model the hydrophobic side branches of the protein separately and instead connects them using a polypeptide chain.
  2. No, it is quite sensitive to missense mutations. This comports with the earlier heatmap of mutational scans.

C3

Inverse Folding

Figure: Inverse Folding