Subsections of Homework

Week 1 HW: Principles and Practices

Week1 homework

  1. First, describe a biological engineering application or tool you want to develop and why. I want to engineer a genetic biocontainment system linked to PET degradation, because this would allow controlled open-release applications of plastic-degrading bacteria. This biocontainment strategy should prevent both proliferation of the living bacteria outside the desired application zones and the spread of the synthetic genetic elements through horizontal gene transfer. PET plastic can be degraded by two enzymes into terephthalic acid, which some soil bacteria, such as Pseudomonas species, can use as a carbon source. Using transcription factors, a genetic circuit can be built to link the expression of a kill-switch to the degradation of PET, so that the cells don’t spread beyond the polluted area. CRISPR-Cas, as a kill-switch mechanism, not only prevents proliferation of live cells, but can also degrade the DNA encoding the enzymes, decreasing the likelihood of horizontal gene transfer.
  2. Next, describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future, like ensuring non-malfeasance (preventing harm). Break big goals down into two or more specific sub-goals. The products of synthetic biology, like engineered microbes, could be used to do amazing things that help society, but there are inherent risks in editing life. I believe a core governance goal should be to ensure we, as synthetic biologists, are designing our products with those risks in mind, and making choices to mitigate those risks. In some cases, such as open-release of engineered bacteria (for example, for bioremediation of plastic pollution in soil), we might not even know what all the risks are, or how likely they are. Therefore, an important subgoal is controlled, small-scale testing under realistic deployment conditions for risk assessment. Once risks are identified, the probabilities of occurrence should be considered along with the potential harms, and risk mitigation should be designed appropriately. So, another subgoal is requiring risk mitigation strategies for the identified risks, as well as a demonstration that the chosen strategies do minimize those risks.
  3. Next, describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”). In the US, bioremediation activity is usually regulated by the US Environmental Protection Agency; although depending on the specific application, the FDA/USDA might also have jurisdiction. For engineered bacteria for soil pollutant bioremediation, I believe the bacteria would need to be approved by the EPA. Therefore, they could take governance actions by implementing specific policies around these products:
    • EPA-led or funded research on risk assessment of engineered bacteria in realistic open-release conditions.
      • Purpose: Current EPA policies mostly disallow engineered organisms for open-release unless the organism has no “trans-genes” (genes from a different species). This is largely considered outdated relative to the current level of technology and scientific knowledge. This action proposes new risk assessment research to clearly identify possible risks and prioritize them by probability and potential harm, eventually opening a path to approval and safe implementation of engineered organism products.
      • Design: The EPA would need to do this research itself or fund other institutes to test engineered bacteria in conditions that reflect open-release applications. The EPA has multiple offices that do research, as well as some grants that give funding to external recipients. Ideally this would result in a list of possible risks and how to assess them.
      • Assumptions: Research outcomes can be broadly applicable to similar scenarios (this is a pretty huge assumption that I’m honestly unsure if I’m comfortable making); i.e. engineered bacteria in similar applications in similar environments might have similar risks.
      • Risks of Failure and Success: This could fail because there could be additional risks that are not identified. Especially when looking at something as potentially broad as an open-release application of a live organism, there are so many potential interactions that we can’t anticipate or test for in a controlled manner. For example, in a soil bioremediation or biofertilizer context, there are bench-scale microcosm and greenhouse-scale mesocosm experiments that can account for a lot of the soil/water/plant interactions. But what about things like weather and wildlife? A field study is needed, but if you control against those risks (such as with netting to keep out birds) to prevent escape during the risk assessment experiments, you still aren’t able to fully test those risks. So a risk of success is the harmful escape of an unsafe engineered bacterium during risk assessment experiments. With how connected environments are (e.g. through oceans), this could result in global spread.
    • Policy to require specific risk mitigation and demonstration of effectiveness under realistic application conditions for engineered bacteria approval.
      • Purpose: Currently, new products that might affect the environment and public health need to be approved by the EPA for commercial use. This policy would enact specific requirements for approval of engineered bacteria. Additionally, many publications about genetic biocontainment discuss it as potential risk mitigation, but the effectiveness of the biocontainment is only demonstrated under specific laboratory conditions (e.g. axenic, optimized media, etc.).
      • Design: This would be a change in current EPA standards and approval processes. The EPA would need to write and implement new policies, potentially train risk assessors and application managers, and develop testing procedures to ensure compliance. With the overturning of the Chevron doctrine, likely this sort of new policy would require the buy-in of either the companies trying to get their products approved or US Congress to pass new legislation.
      • Assumptions: Companies and researchers abide by federal regulations regarding testing and approval. Risk assessment is done in good faith, rather than by companies prioritizing profit over safety. Risk assessment is done by trained ecological and biological risk assessors who know what to look for or be aware of.
      • Risks of Failure and Success: This could fail if the requirement is too stringent to allow any new products to be approved. This could also fail if the requirements are too lax, and not all risks are accounted for and mitigated. If experimental conditions do not properly reflect application conditions, what appeared to be effective mitigation in the lab might not be effective mitigation in application.
    • Researchers and inventors could also implement relevant and effective genetic biocontainment in any engineered bacteria used for open-release applications.
      • Purpose: For risks around the unintended spread of engineered bacteria or their synthetic genetic constructs, genetic biocontainment can mitigate these risks by preventing proliferation and/or degrading the relevant DNA. By tying the biocontainment system to the intended use of the bacterium, researchers manage risk in a relevant manner: the bacterium stays specific to the intended application and its spread is minimized, thereby reducing risks.
      • Design: Any developer of an engineered bacterium that could be released would need to research biocontainment and engineer a system into their bacteria. This would require a change in the current culture of the field, where the risks of engineered bacteria spread and mitigation through biocontainment are sometimes discussed, but mostly considered somewhat niche. If it became common practice to consider application and risks thereof for the products of synthetic biology, I think the design of these sorts of safeguards would be more widespread. Any sort of research requires funding and incentive, so universities, grant funders, and biotech companies would need to start looking for these considerations in proposals to motivate it.
      • Assumptions: Genetic biocontainment is a good strategy to mitigate the potential ecological and public health risks of new synthetic biology products. These risks are limited to ones we think to test (i.e. microbial community shifts, horizontal gene transfer of antibiotic resistance genes or other functions, proliferation of engineered bacteria in unintended locations, local specific bacterial extinction event in the case of a particularly robust engineered bacterium).
      • Risks of Failure and Success: If we rely too heavily on genetic biocontainment, a failure of the genetic system could result in losing that protection against risk. It’s also possible risks would not be seriously considered because we too easily trust biocontainment to minimize the risk.
  4. Next, score (from 1-3 with, 1 as the best, or n/a) each of your governance actions against your rubric of policy goals.
| Does the option: | Risk Assessment Research | Risk Mitigation for Approval | Biocontainment in Practice |
| --- | --- | --- | --- |
| Enhance Biosecurity | | | |
| • By preventing incidents | 2 | 1 | 1 |
| • By helping respond | 2 | 2 | 3 |
| Foster Lab Safety | | | |
| • By preventing incident | 2 | n/a | 1 |
| • By helping respond | n/a | n/a | n/a |
| Protect the environment | | | |
| • By preventing incidents | 2 | 1 | 1 |
| • By helping respond | 2 | 1 | 2 |
| Other considerations | | | |
| • Minimizing costs and burdens to stakeholders | 1 | 3 | 3 |
| • Feasibility? | 1 | 2 | 3 |
| • Not impede research | 1 | 3 | 2 |
| • Promote constructive applications | 1 | 1 | 2 |
  5. Last, drawing upon this scoring, describe which governance option, or combination of options, you would prioritize, and why. Outline any trade-offs you considered as well as assumptions and uncertainties. I would prioritize the requirement of risk assessment and mitigation strategies for EPA approval of engineered bacteria. I believe this would have the biggest impact in terms of allowing engineered bacteria to be used for public good (such as pollution bioremediation) while preventing potential harm (such as ecosystem destabilization by permanently altering the native microbiome). However, I don’t think such a policy would be possible without prior research that tells EPA regulators what to look for, so the first strategy of risk assessment research would also have to be prioritized. The development of genetic biocontainment tools and implementation thereof becoming regular practice in the field of engineered microbes would be awesome, but I think it would be harder to bring about and would take longer, although it might actually have more impact. So maybe instituting a course on risk for bioengineering or biotechnology students could help to bring about that sort of cultural change.
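
The scoring above can be tallied per option as a rough cross-check. This is only a sketch: the equal weighting of every rubric row and the exclusion of n/a cells are assumptions of this example, not part of the rubric, and the name `mean_scores` is made up for illustration. An unweighted mean is one lens among several; the prioritization above leans on expected impact more than on cost or feasibility.

```python
# Rubric scores transcribed from the table above (1 = best, None = n/a).
OPTIONS = [
    "Risk Assessment Research",
    "Risk Mitigation for Approval",
    "Biocontainment in Practice",
]

SCORES = {
    "Biosecurity: preventing incidents": (2, 1, 1),
    "Biosecurity: helping respond": (2, 2, 3),
    "Lab safety: preventing incident": (2, None, 1),
    "Lab safety: helping respond": (None, None, None),
    "Environment: preventing incidents": (2, 1, 1),
    "Environment: helping respond": (2, 1, 2),
    "Minimizing costs and burdens": (1, 3, 3),
    "Feasibility": (1, 2, 3),
    "Not impede research": (1, 3, 2),
    "Promote constructive applications": (1, 1, 2),
}


def mean_scores(scores, n_options):
    """Unweighted mean per option, skipping n/a cells (an assumed convention)."""
    means = []
    for col in range(n_options):
        vals = [row[col] for row in scores.values() if row[col] is not None]
        means.append(sum(vals) / len(vals))
    return means


MEANS = mean_scores(SCORES, len(OPTIONS))
for name, mean in zip(OPTIONS, MEANS):
    print(f"{name}: {mean:.2f}")
```

Lower is better here, since 1 is the best score in the rubric.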

References:

  1. Yonatan Chemla, Connor J Sweeney, Christopher A Wozniak, et al. Engineering Bacteria for Environmental Release: Regulatory Challenges and Design Strategies. Authorea. July 05, 2024. DOI: 10.22541/au.171933709.97462270/v2
  2. Dalton R George, Mark Danciu, Peter W. Davenport, et al. A bumpy road ahead for genetic biocontainment. Nature Communications, 15(650). January 20, 2024. DOI: 10.1038/s41467-023-44531-1
  3. Jay Reichman, Gwendolyn McClung, Khoa Nguyen, et al. Research Needs for Novel Engineered Microbes and Biopesticides Intended for Open Release into the Environment. US EPA/600/R-22/109. September, 2022. https://cfpub.epa.gov/si/si_public_record_report.cfm?LAB=CPHEA&dirEntryID=357184

Week2 Lecture Prep

Jacobson:

  1. Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome. How does biology deal with that discrepancy? Polymerase error rate: $1 : 10^{6}$. The human genome is around 3.2 Gb, or $3.2 \times 10^{9}$ base pairs, so at that rate a single round of copying would introduce roughly 3,200 errors. Biology deals with this through error correction: polymerases have proofreading mechanisms.
  2. How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest? The average human protein is encoded within 1036bp. This might be answerable based on the last slide titled “Fabricational Complexity”, but I couldn’t quite figure out what these formulas are supposed to be calculating without explanation. So instead, we can do some back-of-the-napkin math together. 1036bp is $1036/3 \approx 345$ codons, or 344 amino acids (because of the stop codon at the end), assuming that the 1036bp figure doesn’t include introns. Most amino acids are encoded by either 4 or 2 codons, although a few have more or fewer. We’ll average it out to approximately 3 codons per amino acid. I imagine that not all amino acids are used at the same frequency in human proteins, but I don’t know the frequencies off the top of my head, so we’re just going to go with what we have. The possible DNA sequences for a given amino acid sequence are every combination of the possible codons at each position. So assuming an average human protein has 344 amino acids, and the average number of codons per amino acid is 3, there are $3^{344} \approx 1.3 \times 10^{164}$ different ways to code for an average human protein. In practice, not all tRNAs are synthesized at the same frequency, so rare codons may be recognized slowly during chain extension; and during DNA replication, errors can be made, some of which will be more tolerable than others due to codon wobble.
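
The napkin math in question 2 can be reproduced directly, since Python integers do not overflow. This is a minimal sketch under the same assumptions as the answer above (1036bp of coding sequence, one stop codon, an average of 3 codons per amino acid); the function name `napkin_encoding_count` is made up for illustration.

```python
import math

PROTEIN_CODING_BP = 1036   # average length quoted in the answer above
AVG_CODONS_PER_AA = 3      # rough average assumed in the answer above


def napkin_encoding_count(coding_bp, codons_per_aa):
    """Return (amino acid count, number of synonymous DNA encodings)."""
    codons = coding_bp // 3        # whole codons only
    amino_acids = codons - 1       # the last codon is the stop codon
    return amino_acids, codons_per_aa ** amino_acids


N_AA, COUNT = napkin_encoding_count(PROTEIN_CODING_BP, AVG_CODONS_PER_AA)
print(N_AA, "amino acids")
print(f"about 10^{math.log10(COUNT):.1f} possible encodings")
```

Working with the exact integer and only taking `log10` for display avoids the floating-point overflow that `3.0 ** 344` would not hit here but larger proteins could.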

LeProust:

  1. What’s the most commonly used method for oligo synthesis currently? Phosphoramidite synthesis.
  2. Why is it difficult to make oligos longer than 200nt via direct synthesis? There are side reactions that occur, causing the accumulation of errors (incorrect bases).
  3. Why can’t you make a 2000bp gene via direct oligo synthesis? I think this is because of the side reactions in Q2, right? Like, the accumulation of errors limits oligo synthesis to around 200 bases in practice. Also, oligos are single-stranded DNA; a 2000bp gene is double-stranded, and therefore you’d either need to synthesize both strands and ligate them together, or synthesize one strand and use it as a template for PCR or something.
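
The length limit discussed in questions 2 and 3 can be illustrated with a simple compounding model. This is a sketch, not a description of any vendor's chemistry: it assumes every coupling step succeeds independently with the same probability, and the efficiencies 0.99 and 0.995 are illustrative values chosen for this example, not figures from the lecture. The name `full_length_fraction` is made up.

```python
def full_length_fraction(length_nt, coupling_efficiency):
    """Fraction of strands in which every coupling step succeeded.

    An oligo of length_nt bases needs (length_nt - 1) coupling steps
    after the first base, and each step is assumed to be independent.
    """
    return coupling_efficiency ** (length_nt - 1)


for efficiency in (0.99, 0.995):          # illustrative assumptions
    for length in (20, 200, 2000):
        fraction = full_length_fraction(length, efficiency)
        print(f"efficiency={efficiency} length={length} nt "
              f"full-length fraction={fraction:.2e}")
```

Under this model the fraction of correct full-length product falls off exponentially with length, which is why a 2000-base strand is out of reach for direct synthesis even when each individual step is very reliable.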

Church:

  1. Given the one paragraph abstracts for these real 2026 grant programs sketch a response to one of them or devise one of your own: BioStabilization Systems - ARPA-H

Biologic therapeutics are critically important for a number of diseases, but require careful and specific conditions at all points on the supply chain to maintain efficacy. Specifically, cell therapies and biologics require extreme cold to prevent degradation, thus making biologics inaccessible to people who don’t live near a specialized medical center. To solve this problem, we propose to express biologic therapeutics in extremophiles from abyssal marine sediment, which demonstrated little cell proliferation in low-oxygen environments but regained metabolic activity when incubated with oxygen. We predict that the faster cell turnover under warmer, oxygen-rich, high-nutrient conditions will allow us to engineer these bacteria to produce the biologic therapeutic molecules. Once production is achieved, we will seal the cells into low-oxygen capsules for transport, which we predict will slow their metabolic rate enough to preserve the goal product until oxygen is provided again. If successful, this research could expand access to biologic therapeutics to anywhere that can aseptically incubate microbes at room temperature and purify the molecules therein.

References:

  • Morono, Y., Ito, M., Hoshino, T. et al. Aerobic microbial life persists in oxic marine sediment as old as 101.5 million years. Nat Commun 11, 3626 (2020). https://doi.org/10.1038/s41467-020-17330-1
  • Suzuki, Y., Webb, S.J., Kouduka, M. et al. Subsurface Microbial Colonization at Mineral-Filled Veins in 2-Billion-Year-Old Mafic Rock from the Bushveld Igneous Complex, South Africa. Microb Ecol 87, 116 (2024). https://doi.org/10.1007/s00248-024-02434-8

Personal notes/drafting

abstract formula:

  • 1 sentence on the broad problem: Biologic therapeutics are critically important for a number of diseases, but require careful and specific conditions at all points on the supply chain to maintain efficacy.
  • 1-2 sentences on the specific problem: How to transport cell therapies and biologics at room temperature, decentralizing medicine
  • 1 sentence on the broad goal: We aim to express biologic compounds in extremophiles from the deep subsurface where energy and nutrients are limited.
  • 2-3 sentences on methods: aerobic microbes from oxic abyssal marine sediment that proliferated at 10C with provision of nutrients and higher conc O2; might need to consider eukaryotic protein folding in prokaryotes; low O2 environment - maybe sealing the cells (post-therapeutic production, pre-shipping) into an airtight capsule would prevent metabolic activity including the breakdown of said therapeutics?
  • 1 sentence on future work: maybe also try extremophiles found within old rock samples
  • 1 sentence on conclusion/impact: expands access to biologics, especially to under-resourced communities

Week 1 HW: Principles and Practices version2

Week1 homework

  1. First, describe a biological engineering application or tool you want to develop and why.
    I want to optimize a strain of cyanobacteria for biomanufacturing. Cyanobacteria can be engineered to produce many useful things from atmospheric carbon dioxide, from commodity chemicals to bioactive compounds for pharmaceuticals, but harvesting the products is often energy intensive and expensive, especially at an industrial scale. I am particularly interested in cyanobacterial bioplastics, such as polyhydroxyalkanoates, because this would be a closed-loop carbon cycle for biodegradable plastic.

  2. Next, describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future, like ensuring non-malfeasance (preventing harm). Break big goals down into two or more specific sub-goals.

  • Goal: Prevent accidental release that could harm native ecosystems through microbial community shifts or production of commodity chemicals in the natural environment.
    • Subgoal: Include biocontainment systems in all commercially used industrial bioproduction strains.
    • Subgoal: Institute testing standards and protocols to notice any accidental release when it occurs.
  • Goal: Increase access to the genetic tools and strains used for cyanobacterial bioproduction to allow more chemicals to be manufactured in this carbon-neutral way.
    • Subgoal: Publish cyanobacterial genetic engineering research (such as new tools, etc.) in open access journals or make PDFs available on personal/lab websites.
    • Subgoal: Enable strain sharing.
  3. Next, describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”).
    • Policy to require specific risk mitigation and demonstration of effectiveness under realistic application conditions for engineered bacteria approval.
      • Purpose: Currently, engineered bacteria that might affect the environment and public health need to be approved by the EPA, FDA, or USDA for commercial use. This new policy would enact specific requirements for approval of engineered bacteria. Additionally, many publications about genetic biocontainment discuss it as potential risk mitigation, but the effectiveness of the biocontainment is only demonstrated under specific laboratory conditions (e.g. axenic, optimized media, etc.).
      • Design: This would be a change in current federal standards and approval processes. The EPA, FDA, and USDA would need to write and implement new policies, potentially train risk assessors and application managers, and develop testing procedures to ensure compliance. With the overturning of the Chevron doctrine, likely this sort of new policy would require the buy-in of either the companies trying to get their products approved or US Congress to pass new legislation.
      • Assumptions: Companies and researchers abide by federal regulations regarding testing and approval. Risk assessment is done in good faith, rather than by companies prioritizing profit over safety. Risk assessment is done by trained ecological and biological risk assessors who know what to look for or be aware of.
      • Risks of Failure and Success: This could fail if the requirement is too stringent to allow any new products to be approved. This could also fail if the requirements are too lax, and not all risks are accounted for and mitigated. If experimental conditions do not properly reflect application conditions, what appeared to be effective mitigation in the lab might not be effective mitigation in application.
    • Researchers and inventors could also implement relevant and effective genetic biocontainment in any engineered bacteria used for commercial biomanufacturing.
      • Purpose: For risks around the unintended spread of engineered bacteria or their synthetic genetic constructs, genetic biocontainment can mitigate these risks by preventing proliferation and/or degrading the relevant DNA. By tying the biocontainment system to the intended use of the bacterium, researchers manage risk in a relevant manner: the bacterium stays specific to the intended application and its spread is minimized, thereby reducing risks.
      • Design: Any developer of an engineered bacterium that could be intentionally or unintentionally released would need to research biocontainment and engineer a system into their bacteria. This would require a change in the current culture of the field, where the risks of engineered bacteria spread and mitigation through biocontainment are sometimes discussed, but mostly considered somewhat niche. If it became common practice to consider application and risks thereof for the products of synthetic biology, I think the design of these sorts of safeguards would be more widespread. Any sort of research requires funding and incentive, so universities, grant funders, and biotech companies would need to start looking for these considerations in proposals to motivate it.
      • Assumptions: Genetic biocontainment is a good strategy to mitigate the potential ecological and public health risks of new synthetic biology products. These risks are limited to ones we think to test (i.e. microbial community shifts, horizontal gene transfer of antibiotic resistance genes or other functions, proliferation of engineered bacteria in unintended locations, local specific bacterial extinction event in the case of a particularly robust engineered bacterium).
      • Risks of Failure and Success: If we rely too heavily on genetic biocontainment, a failure of the genetic system could result in losing that protection against risk. It’s also possible risks would not be seriously considered because we too easily trust biocontainment to minimize the risk.
    • Establish a professional society for cyanobacteria-specific or general photosynthetic-organism research to promote research and tool sharing.
      • Purpose: Currently, microalgae research is generally lumped in with all other non-model microbes in synthetic biology. A professional association or conference specific to photobiocatalysis could be a gathering place to collect all relevant tools, protocols, and standards, as well as potentially institute shared ethics or goals, including improving access to the research and its products.
      • Design: Perhaps a starting point would be to invite cyanobacteria, eukaryotic microalgae, macro-algae, and plant synthetic biologists to a conference on photobiocatalysis, along with industry representatives from companies using or creating engineered phototrophs. This might be best done under the banner of an existing synthetic biology or metabolic engineering professional association (such as the Society for Biological Engineering in the American Institute of Chemical Engineers). If there is enough interest at the conference, attendees could work together to establish a more specific sub-association, or just resolve to discuss access and research sharing at the conference itself.
      • Assumptions: This is a large enough field to host such a specific conference. It might be too niche, but I don’t think so, though the conference would probably be on the smaller side at first.
      • Risks of Failure and Success: It’s possible industry and start-ups might not want to share their research openly, as there is an economic disincentive.
  4. Next, score (from 1-3 with, 1 as the best, or n/a) each of your governance actions against your rubric of policy goals.
| Does the option: | Risk Mitigation for Approval | Biocontainment in Practice | Photobiomanufacturing Professional Society |
| --- | --- | --- | --- |
| Enhance Biosecurity | | | |
| • By preventing incidents | 1 | 1 | 3 |
| • By helping respond | 2 | 3 | 3 |
| Foster Lab Safety | | | |
| • By preventing incident | 2 | n/a | 2 |
| • By helping respond | 2 | n/a | 2 |
| Protect the environment | | | |
| • By preventing incidents | 1 | 1 | 2 |
| • By helping respond | 1 | 2 | 2 |
| Other considerations | | | |
| • Minimizing costs and burdens to stakeholders | 3 | 3 | 3 |
| • Feasibility? | 2 | 3 | 2 |
| • Not impede research | 3 | 2 | 1 |
| • Promote constructive applications | 1 | 2 | 1 |
  5. Last, drawing upon this scoring, describe which governance option, or combination of options, you would prioritize, and why. Outline any trade-offs you considered as well as assumptions and uncertainties.
    I would prioritize the requirement of risk assessment and mitigation strategies for federal approval of engineered bacteria. I believe this would have the biggest impact in terms of allowing engineered bacteria to be used for public good (such as biomanufacturing) while preventing potential harm (such as ecosystem destabilization by permanently altering the native microbiome in instances of escape). The development of genetic biocontainment tools and implementation thereof becoming regular practice in the field of engineered microbes would be awesome, but I think it would be harder to bring about and would take longer, although it might actually have more impact. The establishment of a professional society could help institute such norms. Starting a new conference would probably be the easiest option to test for feasibility: proposing it to a handful of host organizations would rapidly identify whether this is currently worth pursuing or if it would need to be worked on for a while first.

References:

  1. Chemla, Y; Sweeney, CJ; Wozniak, CA; et al. Engineering Bacteria for Environmental Release: Regulatory Challenges and Design Strategies. Authorea. July 05, 2024. DOI: 10.22541/au.171933709.97462270/v2
  2. George, DR; Danciu, M; Davenport, PW; et al. A bumpy road ahead for genetic biocontainment. Nature Communications, 15(650). January 20, 2024. DOI: 10.1038/s41467-023-44531-1
  3. Schmelling, NM; Bross, M. What is holding back cyanobacterial research and applications? A survey of the cyanobacterial research community. Nat Commun 15, 6758. August 8, 2024. DOI: 10.1038/s41467-024-50828-6

Week2 Lecture Prep

Jacobson:

  1. Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome. How does biology deal with that discrepancy?
    Polymerase error rate: $1 : 10^{6}$. The human genome is around 3.2 Gb, or $3.2 \times 10^{9}$ base pairs, so at that rate a single round of copying would introduce roughly 3,200 errors. Biology deals with this through error correction: polymerases have proofreading mechanisms. There are also mutation repair mechanisms.
  2. How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?
    The average human protein is encoded within 1036bp. This might be answerable based on the last slide titled “Fabricational Complexity”, but I couldn’t quite figure out what these formulas are supposed to be calculating without explanation. So instead, we can do some back-of-the-napkin math together. 1036bp is $1036/3 \approx 345$ codons, or 344 amino acids (because of the stop codon at the end), assuming that the 1036bp figure doesn’t include introns. Most amino acids are encoded by either 4 or 2 codons, although a few have more or fewer. We’ll average it out to approximately 3 codons per amino acid. I imagine that not all amino acids are used at the same frequency in human proteins, but I don’t know the frequencies off the top of my head, so we’re just going to go with what we have. The possible DNA sequences for a given amino acid sequence are every combination of the possible codons at each position. So assuming an average human protein has 344 amino acids, and the average number of codons per amino acid is 3, there are $3^{344} \approx 1.3 \times 10^{164}$ different ways to code for an average human protein. In practice, not all tRNAs are synthesized at the same frequency, so rare codons may be recognized slowly during chain extension; and during DNA replication, errors can be made, some of which will be more tolerable than others due to codon wobble.
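
The comparison asked for in question 1 comes down to one multiplication, sketched below. The raw rate and genome size are the figures from the answer above; the second rate ($1 : 10^{9}$) is a purely hypothetical value used to show how much a lower effective error rate changes the picture, and both function names are made up for illustration.

```python
import math

GENOME_BP = 3_200_000_000   # ~3.2 Gb human genome, from the answer above


def expected_errors(genome_bp, bases_per_error):
    """Expected number of miscopied bases in one pass over the genome."""
    return genome_bp / bases_per_error


def log10_prob_error_free(genome_bp, bases_per_error):
    """log10 of the chance that one full copy contains zero errors,
    assuming each base is copied independently."""
    return genome_bp * math.log1p(-1 / bases_per_error) / math.log(10)


for label, bases_per_error in [
    ("raw rate from the answer above", 10**6),
    ("hypothetical rate after correction", 10**9),  # illustrative assumption
]:
    errors = expected_errors(GENOME_BP, bases_per_error)
    log_p = log10_prob_error_free(GENOME_BP, bases_per_error)
    print(f"{label}: {errors:g} expected errors, "
          f"P(no errors) is about 10^{log_p:.1f}")
```

At the raw rate an error-free copy is essentially impossible, which is the discrepancy that proofreading and repair have to close.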

LeProust:

  1. What’s the most commonly used method for oligo synthesis currently?
    Phosphoramidite synthesis.
  2. Why is it difficult to make oligos longer than 200nt via direct synthesis?
    There are side reactions that occur, causing the accumulation of errors (incorrect bases).
  3. Why can’t you make a 2000bp gene via direct oligo synthesis?
    I think this is because of the side reactions in Q2, right? Like, the accumulation of errors limits oligo synthesis to around 200 bases in practice. Also, oligos are single-stranded DNA; a 2000bp gene is double-stranded, and therefore you’d either need to synthesize both strands and ligate them together, or synthesize one strand and use it as a template for PCR or something.

Church:

  1. Given the one paragraph abstracts for these real 2026 grant programs sketch a response to one of them or devise one of your own: BioStabilization Systems - ARPA-H

Biologic therapeutics are critically important for a number of diseases, but require careful and specific conditions at all points on the supply chain to maintain efficacy. Specifically, cell therapies and biologics require extreme cold to prevent degradation, thus making biologics inaccessible to people who don’t live near a specialized medical center. To solve this problem, we propose to express biologic therapeutics in extremophiles from abyssal marine sediment, which demonstrated little cell proliferation in low-oxygen environments but regained metabolic activity when incubated with oxygen. We predict that the faster cell turnover under warmer, oxygen-rich, high-nutrient conditions will allow us to engineer these bacteria to produce the biologic therapeutic molecules. Once production is achieved, we will seal the cells into low-oxygen capsules for transport, which we predict will slow their metabolic rate enough to preserve the goal product until oxygen is provided again. If successful, this research could expand access to biologic therapeutics to anywhere that can aseptically incubate microbes at room temperature and purify the molecules therein.

References:

  1. Morono, Y; Ito, M; Hoshino, T; et al. Aerobic microbial life persists in oxic marine sediment as old as 101.5 million years. Nat Commun 11, 3626. 2020. DOI: 10.1038/s41467-020-17330-1
  2. Suzuki, Y; Webb, SJ; Kouduka, M; et al. Subsurface Microbial Colonization at Mineral-Filled Veins in 2-Billion-Year-Old Mafic Rock from the Bushveld Igneous Complex, South Africa. Microb Ecol 87, 116. 2024. DOI: 10.1007/s00248-024-02434-8

Personal notes/drafting

abstract formula:

  • 1 sentence on the broad problem: Biologic therapeutics are critically important for a number of diseases, but require careful and specific conditions at all points on the supply chain to maintain efficacy.
  • 1-2 sentences on the specific problem: How to transport cell therapies and biologics at room temperature, decentralizing medicine
  • 1 sentence on the broad goal: We aim to express biologic compounds in extremophiles from the deep subsurface where energy and nutrients are limited.
  • 2-3 sentences on methods: aerobic microbes from oxic abyssal marine sediment that proliferated at 10C with provision of nutrients and higher conc O2; might need to consider eukaryotic protein folding in prokaryotes; low O2 environment - maybe sealing the cells (post-therapeutic production, pre-shipping) into an airtight capsule would prevent metabolic activity including the breakdown of said therapeutics?
  • 1 sentence on future work: maybe also try extremophiles found within old rock samples
  • 1 sentence on conclusion/impact: expands access to biologics, especially to under-resourced communities

Week 2 HW: Read, Write, Edit DNA

Contents

Part 1: Benchling and In-Silico Gel Art

Simulated lambda DNA digestions:

[grid] EcoRIdig HindIIIdig BamHIdig KpnIdig EcoRVdig EcoRIdig SacIdig SalIdig [/grid]

I couldn’t figure out how to use Ronan’s website other than the randomization button unfortunately. As a result, I went with a pretty simple smiley face design for my in-silico art. [Image: GelArtSmile]
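The simulated digests above boil down to finding every recognition site and measuring the distance between cuts. A minimal sketch of that idea for a linear molecule follows. The toy sequence is made up for illustration (it is not lambda DNA), and the function only searches one strand, which is sufficient for palindromic sites such as EcoRI (G^AATTC).

```python
def digest_linear(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths from cutting a linear sequence at every site.

    cut_offset is how many bases into the recognition site the top strand
    is cut (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


# Hypothetical toy sequence with two EcoRI sites (not lambda DNA).
toy_sequence = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 5
fragments = digest_linear(toy_sequence, "GAATTC", cut_offset=1)
# Sort largest first, the order bands appear from the top of a gel.
print(sorted(fragments, reverse=True))
```

Running the same function with a different site and offset per lane would give the band pattern for each enzyme in a gel-art design.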

Part 3: DNA Design Challenge

3.1 Protein

I chose PETase, a naturally occurring enzyme from Ideonella sakaiensis, which degrades poly(ethylene terephthalate) into monomers of mono-2-hydroxyethyl terephthalate.

sp|A0A0K8P6T7|PETH_PISS1 Poly(ethylene terephthalate) hydrolase OS=Piscinibacter sakaiensis OX=1547922 GN=ISF6_4831 PE=1 SV=1
MNFPRASRLMQAAVLGGLMAVSAAATAQTNPYARGPNPTAASLEASAGPFTVRSFTVSRP
SGYGAGTVYYPTNAGGTVGAIAIVPGYTARQSSIKWWGPRLASHGFVVITIDTNSTLDQP
SSRSSQQMAALRQVASLNGTSSSPIYGKVDTARMGVMGWSMGGGGSLISAANNPSLKAAA
PQAPWDSSTNFSSVTVPTLIFACENDSIAPVNSSALPIYDSMSRNAKQFLEINGGSHSCA
NSGNSNQALIGKKGVAWMKRFMDNDTRYSTFACENPNSTRVSDFRTANCS

3.2 Reverse translate

When reverse translating, Benchling asks which codon optimization scheme you want to use. For this initial DNA sequence, I just used Escherichia coli K12 as my organism, matching codon usage to the frequency found in the E. coli genome.

ATGAATTTTCCGCGCGCAAGTCGTTTAATGCAAGCGGCGGTACTGGGCGGCTTGATGGCAGTGTCGGCAGCTGCGACGGCTCAGACCAATCCGTATGCACGCGGTCCGAATCCAACCGCGGCCAGCCTGGAAGCATCCGCGGGTCCTTTTACTGTTCGAAGCTTCACAGTGAGCCGGCCGTCGGGCTATGGCGCTGGGACCGTGTATTATCCAACTAACGCGGGAGGCACCGTAGGTGCGATTGCTATCGTACCCGGCTACACAGCGCGTCAGTCCTCAATTAAATGGTGGGGCCCCCGCTTAGCGTCGCACGGTTTTGTTGTCATTACCATTGATACGAATAGTACCCTAGACCAACCATCGTCTCGTTCGTCTCAGCAGATGGCCGCGCTGCGCCAGGTTGCCAGCCTCAACGGCACGAGCTCATCTCCGATCTACGGTAAAGTCGATACGGCACGCATGGGCGTGATGGGATGGTCAATGGGCGGCGGTGGTAGTCTGATTAGTGCGGCGAATAACCCGTCTTTGAAAGCCGCCGCCCCGCAGGCCCCGTGGGATAGTAGCACAAACTTTTCCTCAGTTACTGTCCCGACCCTTATCTTCGCCTGTGAGAACGACTCCATTGCGCCTGTGAATAGCTCAGCCCTGCCGATATACGATTCAATGAGCCGTAATGCCAAGCAGTTTCTTGAAATCAATGGCGGAAGCCATAGCTGCGCAAACAGTGGGAATAGCAACCAAGCCCTGATTGGTAAAAAGGGGGTGGCGTGGATGAAACGCTTCATGGATAACGACACCAGGTACTCGACCTTCGCATGTGAAAACCCTAACAGCACGCGCGTGAGCGATTTTCGTACCGCCAACTGCTCG
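A quick sanity check on a reverse translation is to translate the DNA back and compare it with the protein. The sketch below does this with the standard genetic code on just the first 30 bases of the sequence above, to keep the example short. The helper names are my own, not part of Benchling.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable_length = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable_length, 3))


# First 30 bases of the E. coli reverse translation shown above.
reverse_translated_prefix = "ATGAATTTTCCGCGCGCAAGTCGTTTAATG"
# First 10 residues of the PETase protein sequence from section 3.1.
protein_prefix = "MNFPRASRLM"

print(translate(reverse_translated_prefix) == protein_prefix)
```

The same `translate` call on the full sequence, compared against the full protein, would confirm the whole reverse translation.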

3.3 Codon optimize

To practice codon optimization, I stayed within the Benchling tool. This time, I chose my potential host organism: Pseudomonas putida, a soil bacterium that is a fairly common chassis. I also chose to use only the best (most frequently occurring) codons, to theoretically improve expression, and I avoided BsaI cut sites in case I want to use Golden Gate cloning with this construct.

ATGAACTTCCCGCGCGCCAGCCGCCTGATGCAGGCCGCCGTGCTGGGCGGCCTGATGGCCGTGAGCGCCGCCGCCACCGCCCAGACCAACCCGTACGCCCGCGGCCCGAACCCGACCGCCGCCAGCCTGGAGGCCAGCGCCGGCCCGTTCACCGTGCGCAGCTTCACCGTGAGCCGCCCGAGCGGCTACGGCGCCGGCACCGTGTACTACCCGACCAACGCCGGCGGCACCGTGGGCGCCATCGCCATCGTGCCGGGCTACACCGCCCGCCAGAGCAGCATCAAGTGGTGGGGCCCGCGCCTGGCCAGCCACGGCTTCGTGGTGATCACCATCGACACCAACAGCACCCTGGACCAGCCGAGCAGCCGCAGCAGCCAGCAGATGGCCGCCCTGCGCCAGGTGGCCAGCCTGAACGGCACCAGCAGCAGCCCGATCTACGGCAAGGTGGACACCGCCCGCATGGGCGTGATGGGCTGGAGCATGGGCGGCGGCGGCAGCCTGATCAGCGCCGCCAACAACCCGAGCCTGAAGGCCGCCGCCCCGCAGGCCCCGTGGGACAGCAGCACCAACTTCAGCAGCGTGACCGTGCCGACCCTGATCTTCGCCTGCGAGAACGACAGCATCGCCCCGGTGAACAGCAGCGCCCTGCCGATCTACGACAGCATGAGCCGCAACGCCAAGCAGTTCCTGGAGATCAACGGCGGCAGCCACAGCTGCGCCAACAGCGGCAACAGCAACCAGGCCCTGATCGGCAAGAAGGGCGTGGCCTGGATGAAGCGCTTCATGGACAACGACACCCGCTACAGCACCTTCGCCTGCGAGAACCCGAACAGCACCCGCGTGAGCGACTTCCGCACCGCCAACTGCAGC
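Two properties of the optimized sequence are easy to check programmatically: that no BsaI site (GGTCTC, or GAGACC on the opposite strand) remains, and how the GC content shifted. This is a minimal sketch using only the first 30 bases of each sequence above. The function names are hypothetical helpers, not Benchling features.

```python
def find_sites(sequence: str, site: str = "GGTCTC") -> list[int]:
    """Return 0-based start positions of a site on either strand."""
    sequence = sequence.upper()
    complement = str.maketrans("ACGT", "TGCA")
    reverse_site = site.translate(complement)[::-1]
    hits = set()
    for motif in {site, reverse_site}:
        start = sequence.find(motif)
        while start != -1:
            hits.add(start)
            start = sequence.find(motif, start + 1)
    return sorted(hits)


def gc_content(sequence: str) -> float:
    """Fraction of bases that are G or C."""
    sequence = sequence.upper()
    return sum(base in "GC" for base in sequence) / len(sequence)


# First 30 bases of the E. coli and P. putida versions shown above.
ECOLI_PREFIX = "ATGAATTTTCCGCGCGCAAGTCGTTTAATG"
PUTIDA_PREFIX = "ATGAACTTCCCGCGCGCCAGCCGCCTGATG"

for name, prefix in (("E. coli", ECOLI_PREFIX), ("P. putida", PUTIDA_PREFIX)):
    print(name, "BsaI sites:", find_sites(prefix),
          "GC fraction:", round(gc_content(prefix), 3))
```

The higher GC fraction in the P. putida prefix reflects that organism's preference for GC-rich codons.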

3.4 Now what?

I optimized the codons for P. putida, so I would choose to express this in P. putida. I would probably put the gene onto an expression plasmid first, under a strong constitutive promoter, just to ensure it works. After transforming P. putida with the plasmid, I would test expression by looking at protein production with a Western blot, and also culturing with a sample of PET plastic, to check for degradation. Ultimately, I would want to integrate this gene into the genome of the bacteria, possibly under an inducible promoter for use in open-release plastic pollution bioremediation.

Part 4: Prepare a Twist DNA Synthesis Order

There is actually a way to use Twist’s expression vectors, so I wouldn’t have to design the whole expression cassette. For example, if I wanted to express my E. coli codon-optimized PETase gene in E. coli, I could select one of Twist’s pET expression vectors; in this case, I chose pET-blank(Kan). It has a T7 promoter and RBS already included in it, and lacO for inducible expression. I believe the host strain would need a T7 polymerase, but my PETase gene should be expressed in the presence of IPTG.

[Image: Twist construct: gene inside pET-blank(Kan)]

Gb file for Twist construct

Part 5: DNA Read/Write/Edit

5.1 Read

  1. What DNA would you want to sequence (e.g., read) and why? I would want to sequence metagenomic 16S sequences from soil samples. This gives me a baseline for bacterial community structure (prior to engineered strain addition).
  2. In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why? This would be Illumina sequencing probably, so second generation. This requires PCR amplification of the 16S variable region, adapter ligation, and then library pooling. The output would be many, many sequences that I would be able to compare to published 16S sequences to identify bacterial species present. I’d choose this method because it multiplexes better than Sanger sequencing, and it doesn’t need to be long-read like nanopore.

5.2 Write

  1. What DNA would you want to synthesize (e.g., write) and why? I would want to synthesize the CRISPR cassette for my kill-switch because it’s somewhat difficult and time-intensive to stitch together out of oligos.
  2. What technology or technologies would you use to perform this DNA synthesis and why? I’d order it from Twist because it has multiple internal repeats, and they’re one of the few companies with the technology to accomplish that.

5.3 Edit

  1. What DNA would you want to edit and why? I’d like to edit the P. putida genome to include PETase and MHETase genes, as well as a killswitch circuit for biocontainment to prevent unintended ecological effects during application.
  2. What technology or technologies would you use to perform these DNA edits and why? I’d like to use CRISPR-Cas9 because it is the most flexible when it comes to genomic integration location. I could identify a few good neutral sites for integration and design sgRNAs to target these locations. Then I could insert repair templates (including homology arms) onto vectors and transform those sequentially with the CRISPR-Cas9 plasmid (repair template should be first). The repair templates could either be wholly synthesized, or assembled through overlap PCR or Gibson assembly (primer design for homology arms and overlaps for assembly). I might need an antibiotic, fluorescent, or other marker to screen for initial transformation and also genomic integration post-plasmid loss - in that case, I would also need to consider a step to remove the marker.
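The sgRNA design step above starts with locating candidate target sites. For SpCas9 that means a 20-nt protospacer immediately 5' of an NGG PAM. The sketch below scans only the forward strand of a made-up toy sequence. A real design would also scan the reverse strand and screen for off-targets, and none of this comes from a specific design tool.

```python
import re


def find_cas9_targets(sequence: str, spacer_length: int = 20) -> list[tuple[int, str, str]]:
    """List (spacer_start, spacer, PAM) for NGG PAMs on the forward strand."""
    sequence = sequence.upper()
    targets = []
    # A lookahead is used so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", sequence):
        pam_start = match.start()
        if pam_start >= spacer_length:
            spacer = sequence[pam_start - spacer_length:pam_start]
            targets.append((pam_start - spacer_length, spacer, match.group(1)))
    return targets


# Hypothetical toy sequence: a 20-nt spacer, an AGG PAM, then filler.
toy_locus = "ATGC" * 5 + "AGG" + "TTTT"
for spacer_start, spacer, pam in find_cas9_targets(toy_locus):
    print(spacer_start, spacer, pam)
```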

Week 2 HW: Read, Write, Edit DNA Version2

Contents

Part 1: Benchling and In-Silico Gel Art

Simulated lambda DNA digestions: [Image: Multiple restriction enzyme digestions]

I couldn’t figure out how to use Ronan’s website other than the randomization button unfortunately. As a result, I went with a pretty simple smiley face design for my in-silico art. [Image: GelArtSmile]

Part 2: Gel Art - Restriction Digests and Gel Electrophoresis

See Week 2 lab for details. [Image: gel art]

Part 3: DNA Design Challenge

3.1 Choose protein

I’m interested in PhaC, a PHA synthase. This is an enzyme involved in the synthesis of polyhydroxyalkanoates (PHAs), a class of biopolymer that is considered a potential non-petroleum-derived thermoplastic. PHAs are also of interest for possible medical uses as biodegradable polymers. PhaC is the enzyme that catalyzes the polymerization step, adding on monomers to the chain.

I selected PhaC from Cupriavidus necator H16, whose primary product is poly(3-hydroxybutyrate). From UniProt, the accession number is P23608 · PHAC_CUPNH.

MATGKGAAASTQEGKSQPFKVTPGPFDPATWLEWSRQWQGTEGNGHAAASGIPGLDALAGVKIAPAQLGDIQQRYMKDFSALWQAMAEGKAEATGPLHDRRFAGDAWRTNLPYRFAAAFYLLNARALTELADAVEADAKTRQRIRFAISQWVDAMSPANFLATNPEAQRLLIESGGESLRAGVRNMMEDLTRGKISQTDESAFEVGRNVAVTEGAVVFENEYFQLLQYKPLTDKVHARPLLMVPPCINKYYILDLQPESSLVRHVVEQGHTVFLVSWRNPDASMAGSTWDDYIEHAAIRAIEVARDISGQDKINVLGFCVGGTIVSTALAVLAARGEHPAASVTLLTTLLDFADTGILDVFVDEGHVQLREATLGGGAGAPCALLRGLELANTFSFLRPNDLVWNYVVDNYLKGNTPVPFDLLFWNGDATNLPGPWYCWYLRHTYLQNELKVPGKLTVCGVPVDLASIDVPTYIYGSREDHIVPWTAAYASTALLANKLRFVLGASGHIAGVINPPAKNKRSHWTNDALPESPQQWLAGAIEHHGSWWPDWTAWLAGQAGAKRAAPANYGNARYRAIEPAPGRYVKAKA

3.2 Reverse translate

I used the Benchling back-translate tool set to match Escherichia coli K-12 naturally occurring codon usage because it didn’t have the native host C. necator as an option. They are in the same phylum (Pseudomonadota), so maybe it will be similar.

ATGGCAACTGGAAAGGGTGCGGCCGCGAGCACACAGGAAGGTAAATCACAGCCGTTTAAGGTAACCCCGGGCCCCTTCGATCCTGCCACGTGGCTCGAGTGGTCGCGTCAGTGGCAAGGCACTGAAGGTAATGGGCACGCAGCCGCCTCTGGCATCCCGGGTCTTGATGCCCTGGCAGGCGTGAAGATTGCCCCAGCCCAATTAGGTGACATTCAGCAACGTTACATGAAAGACTTTAGTGCACTATGGCAGGCCATGGCGGAAGGTAAAGCGGAGGCGACGGGGCCTCTGCATGATCGTCGCTTCGCCGGCGATGCGTGGCGTACCAACCTGCCGTATCGCTTCGCAGCGGCGTTTTATCTGCTCAACGCGCGTGCACTTACCGAGCTGGCTGACGCAGTAGAAGCCGACGCCAAAACCAGGCAACGCATCCGTTTTGCGATTAGCCAGTGGGTGGATGCCATGAGTCCGGCTAACTTTCTGGCGACCAACCCGGAAGCCCAGCGCCTCCTGATTGAATCCGGTGGCGAAAGTCTTCGCGCGGGAGTGCGAAACATGATGGAAGATCTGACGCGAGGTAAGATCAGCCAGACGGATGAAAGCGCATTCGAAGTCGGGCGTAATGTTGCCGTTACGGAGGGTGCGGTTGTGTTTGAGAACGAATATTTCCAGTTGTTACAGTATAAGCCGCTGACCGATAAAGTGCATGCCCGCCCACTTCTCATGGTACCTCCGTGCATCAACAAATACTACATTCTGGATCTTCAGCCTGAGAGCTCATTGGTACGCCATGTGGTAGAGCAAGGCCACACAGTGTTTCTAGTCTCATGGCGCAATCCGGACGCATCCATGGCCGGCTCGACGTGGGACGATTATATCGAACACGCGGCAATAAGAGCGATTGAGGTCGCGCGTGATATCAGCGGTCAGGACAAAATTAATGTGTTAGGTTTCTGCGTAGGCGGTACTATCGTGAGTACCGCCCTGGCGGTTTTGGCAGCTCGCGGCGAACATCCGGCCGCTTCAGTTACTCTTCTGACTACCCTGCTGGATTTTGCGGACACCGGCATTCTGGATGTCTTCGTAGATGAAGGACATGTTCAGTTGCGCGAAGCAACCTTAGGCGGGGGGGCGGGTGCCCCGTGTGCCTTACTGCGGGGCCTGGAACTCGCTAACACCTTTTCGTTCCTGCGCCCAAACGATCTGGTTTGGAATTACGTGGTCGATAACTATCTGAAAGGCAACACCCCGGTGCCGTTTGATCTGCTGTTTTGGAATGGCGACGCGACCAACCTGCCGGGCCCGTGGTATTGCTGGTACCTCCGCCACACATACCTGCAAAATGAACTAAAAGTGCCAGGCAAACTGACAGTTTGTGGCGTGCCTGTGGATTTGGCTTCCATTGACGTGCCGACGTACATTTACGGTTCGCGCGAAGATCACATCGTCCCGTGGACCGCTGCCTACGCTTCTACGGCGTTGTTAGCAAATAAACTTCGGTTCGTTTTAGGCGCATCTGGCCATATTGCGGGAGTTATTAATCCACCCGCGAAAAATAAGCGTAGCCATTGGACCAATGACGCGTTGCCTGAAAGCCCCCAGCAATGGCTGGCAGGCGCGATAGAGCATCACGGCAGCTGGTGGCCGGATTGGACCGCATGGTTAGCCGGCCAGGCCGGAGCGAAACGTGCTGCGCCCGCGAATTATGGAAACGCGCGTTATCGTGCCATTGAACCCGCCCCGGGGCGCTATGTCAAAGCGAAAGCA

They are not that similar, it turns out, although that may have less to do with codon usage frequency and more to do with which codon the reverse-translate tool happened to pick at each position. Here’s the DNA sequence alignment comparing the genomic sequence from C. necator with the E. coli optimized reverse translation. This sequence alignment was performed in Benchling, using MAFFT with pre-set parameters. [Image: partial phaC sequence alignment] Full alignment viewable here.

3.3 Codon optimize

I once again used the Benchling tool to codon optimize for E. coli K-12, but this time I selected the Best Codon option, and the optimization was run on the original C. necator phaC DNA sequence. It should produce the same sequence if it were done as a reverse translation from the amino acid sequence (since I confirmed that the phaC sequence translates to the PhaC sequence with 100% identity).

ATGGCAACTGGAAAGGGTGCGGCCGCGAGCACACAGGAAGGTAAATCACAGCCGTTTAAGGTAACCCCGGGCCCCTTCGATCCTGCCACGTGGCTCGAGTGGTCGCGTCAGTGGCAAGGCACTGAAGGTAATGGGCACGCAGCCGCCTCTGGCATCCCGGGTCTTGATGCCCTGGCAGGCGTGAAGATTGCCCCAGCCCAATTAGGTGACATTCAGCAACGTTACATGAAAGACTTTAGTGCACTATGGCAGGCCATGGCGGAAGGTAAAGCGGAGGCGACGGGGCCTCTGCATGATCGTCGCTTCGCCGGCGATGCGTGGCGTACCAACCTGCCGTATCGCTTCGCAGCGGCGTTTTATCTGCTCAACGCGCGTGCACTTACCGAGCTGGCTGACGCAGTAGAAGCCGACGCCAAAACCAGGCAACGCATCCGTTTTGCGATTAGCCAGTGGGTGGATGCCATGAGTCCGGCTAACTTTCTGGCGACCAACCCGGAAGCCCAGCGCCTCCTGATTGAATCCGGTGGCGAAAGTCTTCGCGCGGGAGTGCGAAACATGATGGAAGATCTGACGCGAGGTAAGATCAGCCAGACGGATGAAAGCGCATTCGAAGTCGGGCGTAATGTTGCCGTTACGGAGGGTGCGGTTGTGTTTGAGAACGAATATTTCCAGTTGTTACAGTATAAGCCGCTGACCGATAAAGTGCATGCCCGCCCACTTCTCATGGTACCTCCGTGCATCAACAAATACTACATTCTGGATCTTCAGCCTGAGAGCTCATTGGTACGCCATGTGGTAGAGCAAGGCCACACAGTGTTTCTAGTCTCATGGCGCAATCCGGACGCATCCATGGCCGGCTCGACGTGGGACGATTATATCGAACACGCGGCAATAAGAGCGATTGAGGTCGCGCGTGATATCAGCGGTCAGGACAAAATTAATGTGTTAGGTTTCTGCGTAGGCGGTACTATCGTGAGTACCGCCCTGGCGGTTTTGGCAGCTCGCGGCGAACATCCGGCCGCTTCAGTTACTCTTCTGACTACCCTGCTGGATTTTGCGGACACCGGCATTCTGGATGTCTTCGTAGATGAAGGACATGTTCAGTTGCGCGAAGCAACCTTAGGCGGGGGGGCGGGTGCCCCGTGTGCCTTACTGCGGGGCCTGGAACTCGCTAACACCTTTTCGTTCCTGCGCCCAAACGATCTGGTTTGGAATTACGTGGTCGATAACTATCTGAAAGGCAACACCCCGGTGCCGTTTGATCTGCTGTTTTGGAATGGCGACGCGACCAACCTGCCGGGCCCGTGGTATTGCTGGTACCTCCGCCACACATACCTGCAAAATGAACTAAAAGTGCCAGGCAAACTGACAGTTTGTGGCGTGCCTGTGGATTTGGCTTCCATTGACGTGCCGACGTACATTTACGGTTCGCGCGAAGATCACATCGTCCCGTGGACCGCTGCCTACGCTTCTACGGCGTTGTTAGCAAATAAACTTCGGTTCGTTTTAGGCGCATCTGGCCATATTGCGGGAGTTATTAATCCACCCGCGAAAAATAAGCGTAGCCATTGGACCAATGACGCGTTGCCTGAAAGCCCCCAGCAATGGCTGGCAGGCGCGATAGAGCATCACGGCAGCTGGTGGCCGGATTGGACCGCATGGTTAGCCGGCCAGGCCGGAGCGAAACGTGCTGCGCCCGCGAATTATGGAAACGCGCGTTATCGTGCCATTGAACCCGCCCCGGGGCGCTATGTCAAAGCGAAAGCA

3.4 Now what?

This sequence could be used to express PhaC in E. coli. I would probably put the gene onto an expression plasmid, under a strong constitutive promoter, just to ensure it works. After transforming E. coli with the plasmid, I would test expression by checking protein production with a Western blot and by examining cells under a microscope for PHA granules. I need to do a little more literature searching on heterologous expression of PhaC in E. coli - I think maybe other enzymes are needed for PHB synthesis.

3.5 Optional - how does it work in natural biological systems?

  1. Describe how a single gene codes for multiple proteins at the transcriptional level.
    Different reading frames on the same string of DNA bases give different codons, offset by which base (1-3) starts the frame. In this way, genes for multiple proteins can overlap on the same sequence of DNA.
  2. Try aligning the DNA sequence, the transcribed RNA, and also the resulting translated Protein!
    I created the transcript by using Benchling to create a new RNA sequence off the reverse of my codon-optimized sequence. I kept the annotations, so the translation should still be visible. Then I made a new alignment in Benchling using MAFFT with the automatic parameters. Again, the sequences match perfectly - although it’s not 100% identity because technically the T/U differences between DNA and RNA are counted as mismatches, but we can see visually across the bottom of the screenshot that we don’t have any actual mismatches. [Image: DNA RNA protein alignment]
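The T/U effect on the identity score can be reproduced with a few lines. This sketch transcribes the first 30 bases of the codon-optimized phaC sequence and scores identity two ways: a naive character comparison, and one that treats T and U as equivalent. It is an illustration of the scoring quirk, not a reimplementation of MAFFT or Benchling.

```python
def transcribe(coding_dna: str) -> str:
    """Write the coding strand as RNA by replacing T with U."""
    return coding_dna.upper().replace("T", "U")


def percent_identity(first: str, second: str, treat_t_as_u: bool = False) -> float:
    """Percent of aligned positions that match (no gaps, equal lengths)."""
    if len(first) != len(second):
        raise ValueError("sequences must be the same length")
    if treat_t_as_u:
        first, second = first.replace("T", "U"), second.replace("T", "U")
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)


# First 30 bases of the codon-optimized phaC sequence shown in 3.3.
phac_dna_prefix = "ATGGCAACTGGAAAGGGTGCGGCCGCGAGC"
phac_rna_prefix = transcribe(phac_dna_prefix)

print(percent_identity(phac_dna_prefix, phac_rna_prefix))
print(percent_identity(phac_dna_prefix, phac_rna_prefix, treat_t_as_u=True))
```

Every position that lowers the naive score is a T in the DNA paired with a U in the RNA, so the second score is 100%.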

Part 4: Prepare a Twist DNA Synthesis Order

Following the instructions in the Week2 Homework, I added the J23106 promoter and an RBS at the beginning of my codon-optimized phaC sequence. My coding sequence already had a start and stop codon, so I didn’t need to add those. I inserted the 7x-His tag just before the stop codon, and then I put the terminator after the stop codon at the end.
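The cassette layout described above (promoter, RBS, coding sequence with a His tag placed just before the stop codon, terminator) can be sketched as simple string assembly. The promoter, RBS, and terminator strings below are hypothetical placeholders rather than the real part sequences, and the toy CDS is made up. CAT is used as the histidine codon, which is one of the two options (CAT or CAC).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def add_c_terminal_tag(cds: str, tag_dna: str) -> str:
    """Insert tag_dna in frame immediately before the stop codon of cds."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("CDS must end with a stop codon")
    return cds[:-3] + tag_dna.upper() + cds[-3:]


HIS7_TAG = "CAT" * 7  # seven histidine codons

# Hypothetical placeholders: substitute the real part sequences.
PROMOTER = "N" * 35
RBS = "N" * 20
TERMINATOR = "N" * 40

toy_cds = "ATGGCAACTTAA"  # made-up CDS: Met-Ala-Thr-stop
tagged_cds = add_c_terminal_tag(toy_cds, HIS7_TAG)
cassette = PROMOTER + RBS + tagged_cds + TERMINATOR
print(tagged_cds)
print(len(cassette))
```

The check for a terminal stop codon guards against appending the tag to a coding sequence that has no stop codon.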

[Image: linear map of expression cassette]

Genbank file with annotations FASTA file

I then set up the Twist order, as if I were going to order this cassette to be synthesized. Again, following the instructions for upload, I chose cloning vector pTwist Amp High Copy to make a full plasmid.

[Image: circular map of plasmid to order from Twist]

Genbank file of plasmid

Part 5: DNA Read/Write/Edit

5.1 Read

  1. What DNA would you want to sequence (e.g., read) and why?
    I’d like to sequence the genomes of all cyanobacterial strains known to produce PHAs or specifically PHB (some already are sequenced, I think). I want to align all the known cyanobacterial PHA-synthases first. Then I would search the assembled genomes of PHA-producing strains that maybe aren’t annotated yet, to try to find their PHA-synthases and add those to my comparisons.
  2. In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?
    I would use third-generation sequencing on an Oxford Nanopore instrument. Long-read technology would give me much longer contigs, which makes genome assembly easier.

5.2 Write

  1. What DNA would you want to synthesize (e.g., write) and why?
    I’d like to get a CRISPR-Cas12a multiplexed gRNA cassette synthesized. This would allow multiple genomic edits to occur simultaneously, if the appropriate repair templates are included (one per gRNA target).
  2. What technology or technologies would you use to perform this DNA synthesis and why?
    I would submit an order to Twist to get this synthesized: the CRISPR array contains multiple internal repeats, which means traditional DNA synthesis technologies would struggle with this sequence.

5.3 Edit

  1. What DNA would you want to edit and why?
    I’d like to improve PHA-synthase expression in my cyanobacterial chassis strain of choice (specific strain yet to be determined). This could be accomplished through promoter replacement if we’re staying in the genome rather than adding a plasmid, but I’d also be interested in knocking out other biosynthetic pathways to improve carbon flux towards PHA synthesis. So I’d want to edit the genomic DNA of a cyanobacterial chassis.
  2. What technology or technologies would you use to perform these DNA edits and why?
    I’d use a CRISPR-Cas12a vector because it allows for multiplexed targeting, so I could make multiple genomic edits. Cas12a both processes the CRISPR-gRNA cassette and makes the cuts, so it requires fewer components than Cas9. Additionally, there’s some evidence suggesting Cas12a shows fewer off-target effects than Cas9.

Week 3 HW: Lab Automation

Contents

Python Script for Opentrons Artwork

Note: This is due before the Victoria node does its Opentrons artwork lab, at a future date TBD. This homework assignment is still in-progress because the due date is not yet established.

Post-lab questions

  1. Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.
    A paper published this month in ACS Synthetic Biology details a new workflow for automating MoClo plasmid assembly and transformation, with a semi-automated colony PCR, on an Opentrons OT-2 and Opentrons Flex. The workflows are designed to be user-friendly: they generate the Opentrons protocol from user-supplied CSV files, and the accompanying README files describe how to prepare those files.

Alternatively, the authors also developed a graphical user interface which requires no coding ability. This is a novel application because it is only the second automation of MoClo/Golden Gate cloning for the Opentrons system (as opposed to advanced high-throughput liquid handling systems), and, unlike the previously published AssemblyTron workflow, it does not require Python ability.

These workflows were validated by assembling plasmids with the MoClo Yeast Toolkit and MoClo SubtiToolKit, and transforming these plasmids into Saccharomyces cerevisiae and, sequentially, Escherichia coli and Bacillus subtilis, respectively. With both toolkits, the automated procedure achieved efficiency comparable to the manual procedures (> 90% and 60%, respectively).

Figure 1: Schematic overview of the protocol design workflows developed for the Opentrons platform. Protocols can be generated using either the generator.py Python script via the command line or the online Slowpoke tool, which features a user-friendly GUI. Both tools run the workflow.py files in the backend. (A) Workflow for Golden Gate-based cloning, where users define genetic part layouts and assembly combinations. (B) Workflow for colony PCR, including colony selection, reagent layout, and reaction recipe input.

  • Malci, K; Meng, F; Galez, H; et al. Slowpoke: An Automated Golden Gate Cloning Workflow for Opentrons OT-2 and Flex. 2026. ACS Synthetic Biology, 15(2): 511-521. DOI: 10.1021/acssynbio.5c00629
  2. Write a description about what you intend to do with automation tools for your final project.
    I’d want to utilize the Opentrons set-up in the Victoria node to enable the possible execution of my medium-term aim with as little scientist benchtime as possible. I don’t know the exact make and model of all modules that the Victoria Opentrons has, but below is a series of possible steps that might be automatable (best use of automation would be medium or high throughput, depending on the number of designs we are able to test):
    1. Gibson Assembly or MoClo plasmid assembly
      1. Transfer reaction components into wells
      2. Heat block for digestion/ligation/PCR steps
    2. Transformation of expression plasmid
      1. Transfer plasmids and competent cells into wells
      2. Heat block for heat shock
      3. Transfer media into wells
      4. Heated shaker for recovery
      5. Incubator for overnight growth
      6. Stamp onto new plate or pick into multiple liquid cultures for culturing
      7. Incubator or heated shaker for overnight growth
    3. Readout
      1. Transfer cells (and reagents) into wells
      2. Plate reader for fluorescent or colorimetric output
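Before writing any robot protocol for the steps above, it helps to decide which design goes in which well. The sketch below is plain Python, not the Opentrons API. It assigns designs and replicates to a 96-well plate in column-major order, and the design names and replicate count are hypothetical.

```python
ROWS = "ABCDEFGH"
COLUMNS = range(1, 13)
# Column-major well order: A1, B1, ..., H1, A2, B2, ...
WELLS = [f"{row}{column}" for column in COLUMNS for row in ROWS]


def plan_plate(designs: list[str], replicates: int = 3) -> dict[str, list[str]]:
    """Assign each design a run of consecutive wells, one per replicate."""
    needed = len(designs) * replicates
    if needed > len(WELLS):
        raise ValueError(f"{needed} wells needed but the plate has {len(WELLS)}")
    well_iterator = iter(WELLS)
    return {
        design: [next(well_iterator) for _ in range(replicates)]
        for design in designs
    }


# Hypothetical design names for illustration.
layout = plan_plate(["design_1", "design_2"], replicates=3)
for design, wells in layout.items():
    print(design, wells)
```

A mapping like this could then drive the transfer steps for assembly, transformation, and readout, so that every stage uses the same well assignments.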

Final project ideas

Brainstorming:

  • Identification of PhaC analog in Cyanobacterium aponinum UTEX 3222 and overproducing or engineering for increased efficiency
    • BLAST/align with known PHA-synthases
    • Compare efficiency / mutations that improved turnover in other PhaC - test analogous mutations (aligned location, similar or different AAs). improved substrate specificity?
    • Site-specific saturation mutagenesis? Would be good use for automation
  • Quorum sensing based killswitch (i.e. cell dies if it escapes bioreactor)
    • Has to have some kind of inducible element or won’t grow after initial transformation
    • What’s good at quorum sensing already?
  • Something else??? Something in E coli that can be done on Opentrons
    • Because it’s more convenient for a final project to be executed in Victoria remotely
  • Cyanobacterial expression plasmid across multiple cyano species
    • needs to include E coli machinery for manipulation and production (and conjugation, for relevant species)
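For the site-specific saturation mutagenesis idea in the first bullet, the library size is easy to enumerate: 19 substitutions per chosen position. The sketch below uses the first 10 residues of the C. necator PhaC sequence from Week 2 as a stand-in, and the positions chosen are arbitrary examples, not predicted hot spots.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturation_variants(protein: str, positions: list[int]) -> dict[str, str]:
    """Map variant names (e.g. 'A2C') to mutant sequences.

    positions are 1-based; each position yields the 19 non-wild-type residues.
    """
    variants = {}
    for position in positions:
        wild_type = protein[position - 1]
        for residue in AMINO_ACIDS:
            if residue == wild_type:
                continue
            name = f"{wild_type}{position}{residue}"
            variants[name] = protein[:position - 1] + residue + protein[position:]
    return variants


# First 10 residues of PhaC; positions 2 and 5 are arbitrary examples.
phac_prefix = "MATGKGAAAS"
library = saturation_variants(phac_prefix, [2, 5])
print(len(library))
print(library["K5R"])
```

Even a handful of positions produces dozens of variants, which is the scale where plate-based automation starts to pay off.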

Ideas:

  1. PhaC protein engineering
    1. Short term aim: Design small library of PhaC variants with expected improvement
    2. Medium term aim: Generate library and test in chassis strain
    3. Long term aim: Develop PHB bio-manufacturing cyanobacterial strain for carbon-neutral/carbon-negative plastic (depending on biodegradation).
  2. Quorum sensing based circuit for biocontainment
    1. Short term aim: Design killswitch with genetic circuit to trigger based on quorum sensing.
    2. Medium term aim: Build genetic circuit with expression based on quorum sensing with a measurable output; test circuit in E. coli.
    3. Long term aim: Optimize circuit sensitivity and test with killswitch expression; integrate into bio-manufacturing chassis strains for population-linked biocontainment.
  3. Broad cyanobacterial expression plasmid
    1. Short term aim: Design plasmid backbone based off native cyanobacterial plasmids and established E. coli machinery.
    2. Medium term aim: Test expression in multiple cyanobacterial strains (including some previously considered genetically intractable with classic broad-host-range vectors).
    3. Long term aim: Establish protocol for domestication of newly prospected, wild-type cyanobacterial strains using the cyanobacterial plasmid.
Google slide of ideas Google slide of ideas

Week 4 HW: Protein Design Part I

Contents

Part A: Conceptual Questions

Need to answer 9/11 questions; I skipped

  1. How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)
    $$ 500\,\text{g} \times \frac{1\,\text{mol AA}}{100\,\text{g}} = 5\,\text{mol AA} $$
    $$ 5\,\text{mol} \times \frac{6.02 \times 10^{23}\,\text{molecules}}{1\,\text{mol}} = 3.01 \times 10^{24}\,\text{molecules} $$
  2. Why do humans eat beef but do not become a cow, eat fish but do not become fish?
    We break down the proteins during digestion to the constituent amino acids. These amino acids are then used in our cells to build human proteins.
  3. Why are there only 20 natural amino acids?
    It’s been hypothesized that the 20 naturally occurring amino acids fairly effectively cover the “chemical space”, which would indicate that more complex or diverse amino acids are not needed for increasing function. This includes variation in chemical properties like molecular size, hydrophobicity, and charge, but also rotational conformations. These twenty sufficiently cover the space for effective function while also being relatively low in energy (easy to synthesize). Another paper hypothesizes that all twenty natural amino acids predate the RNA world, and in fact were naturally synthesized prebiotically with mineral catalysts - thus suggesting that the development of the three-base 64-codon alphabet actually was because a two-base 16-codon alphabet would restrict to sixteen instead of the existing 20 amino acids.
  4. Can you make other non-natural amino acids? Design some new amino acids.
    There are a few non-canonical amino acids that people have designed and used, made by swapping the natural side chain for an unnatural one.
  5. Where did amino acids come from before enzymes that make them, and before life started?
    In 2018, Bywater suggested that amino acids were synthesized prebiotically, with the simpler structures occurring through aqueous reactions, and more complex structures requiring mineral catalysts. Many amino acids have been identified on meteorites, suggesting that amino acids could have originated in outer space or, more likely, that the conditions needed to synthesize the “simpler” amino acids exist in multiple places. Other researchers have suggested that the “complex” amino acids must have been biosynthesized by early proteins made up of “simple” amino acids, and in particular, that histidine, phenylalanine, cysteine, methionine, tryptophan and tyrosine had to come after molecular oxygen because they have redox functionality.
  6. If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?
    I would expect D-amino acids would form a left-handed helix because L-amino acids form right-handed helices.
  7. Can you discover additional helices in proteins?
  8. Why are most molecular helices right-handed?
    In general, naturally occurring amino acids are L-enantiomers, which leads to right-handed helices because of steric hindrance requiring the side chains to point outwards.
  9. Why do β-sheets tend to aggregate? What is the driving force for β-sheet aggregation?
    Because beta sheets are flat, they can stack, and the large surface area means that the side-chains can have interactions (especially hydrophobic side-chains) between the sheets.
  10. Why do many amyloid diseases form β-sheets? Can you use amyloid β-sheets as materials?
  11. Design a β-sheet motif that forms a well-ordered structure.

Part B: Protein Analysis and Visualization

  1. Briefly describe the protein you selected and why you selected it.
  2. Identify the amino acid sequence of your protein.
    • How long is it? What is the most frequent amino acid?
    • How many protein sequence homologs are there for your protein?
    • Does your protein belong to any protein family?
  3. Identify the structure page of your protein in RCSB
    • When was the structure solved? Is it a good quality structure?
    • Are there any other molecules in the solved structure apart from protein?
    • Does your protein belong to any structure classification family?
  4. Open the structure of your protein in any 3D molecule visualization software:
    • Visualize the protein as “cartoon”, “ribbon” and “ball and stick”.
    • Color the protein by secondary structure. Does it have more helices or sheets?
    • Color the protein by residue type. What can you tell about the distribution of hydrophobic vs hydrophilic residues?
    • Visualize the surface of the protein. Does it have any “holes” (aka binding pockets)?

Part C: Using ML-based Protein Design Tools

Part D: Group Brainstorm on Bacteriophage Engineering