Subsections of Omama Syed — HTGAA Spring 2026

Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices

  • Week 2 HW: DNA Read/Write/Edit

  • Week 3 HW: Lab Automation

Week 1 HW: Principles and Practices

Week 1 Assignment

Q1) First, describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project and/or something for which you are already doing in your research, or something you are just curious about.

Engineered Cyanobacteria as Oxygen-Generating Systems for Extraterrestrial Environments

With renewed interest in space exploration following missions such as NASA’s Artemis II program [1], I can’t help but wonder how far we have come in space research and what else we have yet to accomplish in leaving the Earth and settling on other planets. Beyond the immediate and well-studied challenges of spaceflight, such as radiation exposure, bone loss, and other risks to human health in space [2], I have often found myself thinking about a more distant future: what would it be like to stand on the surface of another planet? And could humans one day (hopefully soon!) be able to breathe there without the constant protection of a spacesuit? Through this course, and particularly through learning more about synthetic biology, that once-distant idea feels less like science fiction and more like a long-term scientific possibility. For this assignment, I therefore propose the idea of engineering cyanobacteria-based biological oxygen generation systems for use on the Moon, Mars, or other extraterrestrial bodies. These organisms would be genetically optimized to carry out photosynthesis under extreme conditions—such as high radiation, low pressure, limited water availability, large temperature fluctuations, and reduced light—converting locally available carbon dioxide and resources into oxygen-rich, breathable air.

Cyanobacteria are photosynthetic bacteria that played a foundational role in shaping life on Earth: roughly 2.4 billion years ago, their oxygenic photosynthesis drove the rise of atmospheric oxygen known as the Great Oxidation Event, which paved the way for complex life. Today, they are found across a range of environments, from marine waters, rocks, and soils to extreme-temperature regions such as hot springs and the Arctic [3]. Because of this resilience, they are viewed as promising candidates for space-related applications. Several studies have already shown that cyanobacteria can survive and maintain photosynthetic and nitrogen-fixing activity in simulated Martian environments, highlighting their potential role in future life-support or terraforming strategies [4].

A core bottleneck in long-duration space exploration has been sustainable life support. Current systems on space stations rely on physicochemical methods such as water electrolysis, or on oxygen supplied from Earth, all of which are costly, resource-intensive, and difficult to scale [5]. While recent technologies like NASA’s MOXIE experiment have demonstrated that oxygen can be chemically extracted from the Martian atmosphere, these systems remain energy-intensive and limited in output [6]. By contrast, a biologically driven approach could offer a more self-sustaining and scalable alternative, potentially reducing mission risk and supporting a longer-term human presence beyond Earth.

At the same time, intentionally deploying engineered life beyond Earth raises some tough ethical, ecological, and governance challenges. Living systems can evolve, spread, and behave unpredictably, and once released, they may irreversibly alter extraterrestrial environments. For this reason, governance cannot be treated as an afterthought, but must be considered a central design constraint in the development of extraterrestrial bioengineering technologies.

To ensure that extraterrestrial bioengineering contributes to an ethical future, governance must balance innovation with planetary protection, safety, equity, and long-term responsibility. I propose three governance goals, most of which are directly aligned with COSPAR’s planetary protection policies:

Goal 1: Prevent Biological Harm and Irreversible Planetary Contamination

Sub-goals:

  • Prevent contamination of extraterrestrial environments by engineered organisms from Earth.
  • Protect the scientific value of pristine/untouched planetary bodies.
  • Avoid harm to potential undiscovered extraterrestrial life forms.
  • Reduce risks to astronauts and future settlers from uncontrolled biological behavior

The 2026 COSPAR guidelines emphasize preventing and controlling contamination, but they do not mention the intentional deployment of engineered organisms and must therefore be expanded [7].

Goal 2: Ensure Responsible Innovation Through Transparency, Oversight, and Accountability

Sub-goals:

  • Require rigorous pre-deployment testing and environmental impact assessment
  • Establish traceability and accountability for engineered strains
  • Prevent regulatory gaps between academic, governmental, and private actors

A space bioengineering project of this kind will involve universities, space agencies, and private companies, so governance must prevent unilateral deployment driven by competition or profit.

Goal 3: Promote Equity and Collective Stewardship of Extraterrestrial Environments

Sub-goals:

  • Prevent monopolization of oxygen-generating or terraforming technologies
  • Ensure international participation in decision-making
  • Treat planetary environments as a shared global responsibility rather than commercial property

Drawing on principles from the Antarctic Treaty System and the Outer Space Treaty, this goal emphasizes that planetary-scale interventions affect all of humankind [8, 9].

Q3) Next, describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”)

Action 1: Risk Review Before Using Engineered Organisms in Space

Purpose

At the moment, planetary protection rules are mainly designed to prevent accidental contamination from Earth microbes, but they do not fully account for situations where scientists intentionally design organisms to survive and function beyond Earth. Before engineered cyanobacteria are used on the Moon, Mars, or elsewhere, there should be a required risk review to ask straightforward but important questions: What does the organism do? Where will it be used? What could go wrong? And how would those risks be managed? The goal is to make sure decisions are made carefully and responsibly, rather than being rushed by competition or technological excitement.

Design

  • Actors: an international group involving space agencies (NASA, ESA, CNSA), scientists, and ethics experts, coordinated through a body such as UN COPUOS.
  • Any research group or company planning to use engineered cyanobacteria beyond Earth would need approval before launch.
  • The review would focus on basic questions of safety, containment, and long-term impact

Assumptions

  • International cooperation is feasible
  • Ecological risks can be modeled
  • “Harm” can be defined even in environments with no confirmed life

Risks of Failure & “Success”

  • Failure: Weak enforcement enables unilateral deployment
  • Unintended success: Overly restrictive assessments slow beneficial research and push work into unregulated domains

Action 2: Containment-First Deployment with Reversible Design Requirements

Purpose

Because living systems can evolve and spread in unexpected ways, releasing engineered organisms directly into a planetary environment carries serious risks. Therefore, early uses of engineered cyanobacteria should remain contained and reversible, rather than open or permanent. Instead of aiming for large-scale atmospheric change right away, cyanobacteria would first be used in closed or semi-closed systems, such as bioreactors or controlled habitats. These systems could be monitored, adjusted, or shut down if problems arise. Designing for reversibility helps reduce the risk of causing permanent harm before we fully understand the consequences.

Design

  • Actors: researchers, engineers, space agencies
  • Engineered cyanobacteria would initially be used only in closed or controlled systems, such as bioreactors or sealed habitats.
  • Genetic safeguards (kill switches, synthetic nutrient dependence) to shut down or remove the organisms if problems appear
  • Continuous monitoring would be required to detect unexpected changes.
  • Environmental dependency preventing survival outside intended settings.
  • Open-ended release into planetary environments prohibited at early stages.

Assumptions

  • Genetic and physical containment systems will function reliably under space conditions
  • Reversible systems are technically feasible at meaningful scales

Risks of Failure & “Success”

  • Failure: Radiation or mutation could weaken containment or control mechanisms. System failures could lead to unintended release.
  • Unintended success: Perceived reversibility may encourage people to move too quickly toward larger releases or premature deployment.

Action 3: International Open-Source Registry and Tiered Licensing System

Purpose

To ensure transparency, prevent monopolization, and enable oversight, all engineered cyanobacteria strains intended for space use should be traceable and licensed.

Design

  • Actors: United Nations-affiliated bodies, international scientific organizations (e.g., iGEM-style consortium)
  • Establish:
    • An open registry documenting genetic designs, safety features, and deployment locations
    • A tiered licensing system (lab testing → simulated environments → limited deployment → large-scale use), analogous to FDA clinical trial phases
    • Incentives for open data sharing and safety-focused design

Assumptions

  • Open-source models reduce misuse rather than enabling it
  • Regulators possess sufficient expertise
  • Risk levels can be meaningfully categorized

Risks of Failure & “Success”

  • Failure: Regulatory capture or non-compliance by powerful actors (countries or companies) under political or commercial competition
  • Unintended success: Too much bureaucracy could slow innovation and mission timelines

Q4) Next, score (from 1-3, with 1 as the best, or n/a) each of your governance actions against your rubric of policy goals.

[Image: scoring of the three governance actions against the policy goals]

Q5) Last, drawing upon this scoring, describe which governance option, or combination of options, you would prioritize, and why. Outline any trade-offs you considered as well as assumptions and uncertainties.

I would prioritize responsible innovation with clear accountability, supported by shared oversight. When it comes to introducing engineered life beyond Earth, it doesn’t feel like something that should move as fast as possible, even if the technology allows it. A risk review creates space to think carefully about safety and long-term consequences before taking irreversible steps.

Shared oversight adds another layer of care by making these decisions collective rather than competitive. Instead of one actor deciding the future of an entire planetary environment, responsibility is spread across many voices. The trade-off is time: broader discussion and staged approval slow things down and can feel frustrating in a field driven by ambition and speed. This approach also assumes cooperation between countries and institutions, which is not guaranteed. Still, given how permanent the consequences could be, moving more thoughtfully seems worth the cost.

References

[1] National Aeronautics and Space Administration. (n.d.). Artemis II. NASA. https://www.nasa.gov/mission/artemis-ii/

[2] Tomsia, M., Cieśla, J., Śmieszek, J., Florek, S., Macionga, A., Michalczyk, K., & Stygar, D. (2024). Long-term space missions’ effects on the human organism: What we do know and what requires further research. Frontiers in Physiology, 15, 1284644. https://doi.org/10.3389/fphys.2024.1284644

[3] Bekker, A., Holland, H. D., Wang, P.-L., Rumble, D. R. III, Stein, H. J., Hannah, J. L., Coetzee, L. L., & Beukes, N. J. (2004). Dating the rise of atmospheric oxygen. Nature, 427(6970), 117–120. https://doi.org/10.1038/nature02260

[4] Coleine, C., Delgado-Baquerizo, M., Rosado, A. S., & Zerboni, A. (2025). The role of extremophile microbiomes in terraforming Mars. Communications Biology, 8(1), 1588. https://doi.org/10.1038/s42003-025-08973-1

[5] Jones, H., & Anderson, G. (2017). Need for cost optimization of space life support systems. 47th International Conference on Environmental Systems. https://ntrs.nasa.gov/api/citations/20170010166/downloads/20170010166.pdf

[6] Hoffman, J. A., Hecht, M. H., Rapp, D., Hartvigsen, J. J., SooHoo, J. G., Aboobaker, A. M., McClean, J. B., Liu, A. M., Hinterman, E. D., Nasr, M., Hariharan, S., Horn, K. J., Meyen, F. E., Okkels, H., Steen, P., Elangovan, S., Graves, C. R., Khopkar, P., Madsen, M. B., . . . Eisenman, D. J. (2022). Mars Oxygen ISRU Experiment (MOXIE)—Preparing for human Mars exploration. Science Advances, 8(35), eabp8636. https://doi.org/10.1126/sciadv.abp8636

[7] COSPAR. (2026). COSPAR Policy on Planetary Protection. Space Research Today, 224, 17–39. https://doi.org/10.60970/012026SRT224/PPP

[8] The Antarctic Treaty. (1961). United Nations Treaty Series, 402, No. 5778. https://treaties.un.org/doc/Publication/UNTS/Volume%20402/volume-402-I-5778-English.pdf

[9] United Nations Office for Outer Space Affairs. (n.d.). Outer Space Treaty. https://www.unoosa.org/oosa/en/ourwork/spacelaw/treaties/outerspacetreaty.html


Homework Questions

Professor Jacobson

  1. Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome. How does biology deal with that discrepancy?

Biological DNA synthesis uses DNA polymerase, which has an error rate of about 1:10^5; that is, it adds one incorrect base for every 100,000 bases copied. Compared with the 3.2 Gb (3.2 × 10^9 bp) human genome, this would result in tens of thousands of errors in every genome replication cycle, which would be unsustainable. Part of this discrepancy is dealt with by the DNA polymerase itself, which has both polymerase and exonuclease activity. It extends the DNA strand in the 5’→3’ direction, and it proofreads by excising a misincorporated base with its 3’→5’ exonuclease activity before inserting the correct one. The polymerase plus its proofreading system has an error rate of about 1:10^6, which still gives over 3,000 errors per cell division. This is further reduced by the MutS mismatch repair system, in which a group of proteins (MutS, MutL, MutH) recognizes mismatches left after replication, cleaves the newly made strand near the incorrect base, and allows DNA polymerase to resynthesize the region with the correct bases.
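The arithmetic behind those error counts can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it uses the genome length and error rates quoted above, and the function name is made up for illustration.

```python
GENOME_BP = 3.2e9  # human genome length quoted above (3.2 Gb)

# Error rates quoted in the answer above (errors per base copied)
ERROR_RATES = {
    "polymerase alone": 1e-5,
    "polymerase + proofreading": 1e-6,
}


def expected_errors(genome_bp: float, error_rate: float) -> float:
    """Expected number of miscopied bases in one pass over the genome."""
    return genome_bp * error_rate


for label, rate in ERROR_RATES.items():
    errors = expected_errors(GENOME_BP, rate)
    print(f"{label}: ~{errors:,.0f} errors per replication")
```

The two results correspond to the "tens of thousands" and "over 3,000" figures in the text.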

  1. How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?

For an average human protein, the coding DNA would be about 1,036 bp long, which corresponds to roughly 345 codons (and therefore amino acids). Because the genetic code is degenerate, most amino acids can be encoded by several synonymous codons. Assuming an average of ~3 codons per amino acid, the number of possible DNA sequences that could encode the same protein is approximately 3^345, an astronomically large number. In practice, many of these sequences do not work because of physical and biochemical constraints. One of these is GC content: sequences with extremely high or low GC content tend to be poorly transcribed. GC content also affects secondary structure formation, since certain GC-rich or GC-poor stretches allow hairpins and loops to form, which again interferes with expressing the protein of interest. Moreover, the ribonuclease RNase III preferentially cuts certain RNA sequences or secondary structures (RNA cleavage rules), making many theoretical codes fail.
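To get a feel for how large 3^345 is, the count can be computed exactly with Python's arbitrary-precision integers. The average of 3 synonymous codons per amino acid is the same simplifying assumption used above, not a measured value, and the function name is hypothetical.

```python
import math

CODONS = 345               # protein length in codons, from the answer above
AVG_SYNONYMOUS_CODONS = 3  # assumed average degeneracy per amino acid


def synonymous_sequence_count(n_codons: int, avg_degeneracy: int) -> int:
    """Number of DNA sequences if every position has avg_degeneracy choices."""
    return avg_degeneracy ** n_codons


count = synonymous_sequence_count(CODONS, AVG_SYNONYMOUS_CODONS)
print(f"3^{CODONS} has {len(str(count))} digits, "
      f"i.e. about 10^{math.log10(count):.1f}")
```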

Dr. LeProust

  1. What’s the most commonly used method for oligo synthesis currently?

The most commonly used method for oligonucleotide synthesis currently is the phosphoramidite method, developed by Caruthers in 1981. This method involves a cyclic process performed on solid-phase support (CPG) where nucleotides are added one at a time in a repeating four-step cycle: (1) coupling with phosphoramidite to add the next nucleotide, (2) capping any unreacted sites to prevent errors, (3) oxidation to stabilize the phosphate linkage, and (4) deblocking to prepare for the next cycle. This cycle is repeated N times to build an oligonucleotide of N bases in length. The phosphoramidite method became the foundation for automated DNA synthesis when Applied Biosystems (ABI) introduced the first automated DNA synthesizer in 1983, and it remains the standard chemistry used in modern synthesis platforms, including high-throughput approaches like microarrays and silicon-based synthesis technologies.

  1. Why is it difficult to make oligos longer than 200nt via direct synthesis?

Making oligonucleotides longer than 200 nucleotides via direct synthesis is extremely difficult due to the cumulative effect of incomplete coupling reactions at each synthesis cycle. In the phosphoramidite method, each nucleotide addition cycle has a coupling efficiency of approximately 98-99%, meaning that 1-2% of growing chains fail to add the next base at each step. While this seems like a high success rate, the effect compounds over many cycles: after 200 cycles, only about 13% of chains are full length at 99% efficiency, and under 2% at 98%, with the majority of molecules being truncated or containing errors. This is evident from the significant achievement of synthesizing 700-mers for the first time using enhanced chemistry, which required substantial optimization to achieve 97% full-length material. The baseline synthesis process shows clear accumulation of truncation products for sequences approaching 500 nucleotides, and even with enhanced chemistry, a greater than 10-fold increase in PCR yield was necessary to obtain better uniformity and more full-length material. Essentially, as oligo length increases beyond 200 nt, the yield of correct, full-length product drops exponentially while errors accumulate, making direct synthesis increasingly impractical.
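The compounding effect can be sketched as a simple power law, assuming every cycle succeeds independently with the same coupling efficiency (a simplification; real chemistries vary by base and position). The function name and the chosen lengths are for illustration only.

```python
def full_length_fraction(coupling_efficiency: float, n_cycles: int) -> float:
    """Fraction of chains that coupled successfully in every one of n_cycles."""
    return coupling_efficiency ** n_cycles


for efficiency in (0.98, 0.99):
    for n_cycles in (50, 200, 700, 2000):
        fraction = full_length_fraction(efficiency, n_cycles)
        print(f"efficiency={efficiency:.2f}, cycles={n_cycles:>4}: "
              f"{fraction:.2e} full-length")
```

The same model also shows why a 2000-base product is out of reach for direct synthesis at these efficiencies: the full-length fraction falls by many orders of magnitude.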

  1. Why can’t you make a 2000bp gene via direct oligo synthesis?

A 2000 bp gene cannot be made via direct oligonucleotide synthesis because this length far exceeds the practical limits of the phosphoramidite method. Even with the most advanced chemistry enhancements, direct synthesis is limited to approximately 200-700 nucleotides, making a 2000 bp sequence 3-10 times longer than what is achievable in a single synthesis reaction. Instead, longer genes must be constructed through assembly: multiple shorter, overlapping oligonucleotides are designed and then pieced together, as in classical gene synthesis. Modern gene synthesis workflows, such as Twist's platform, use a multi-step process: first synthesizing ultra-long oligonucleotides (which are still limited in length), then using enzymatic assembly methods to join these fragments, and finally using long-read sequencing to verify the assembled product. This assembly strategy allows for the production of gene fragments up to 5 kb with low error rates (1:7,500), but these are explicitly described as assembled products rather than directly synthesized single molecules.

George Church

  1. What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?

The 20 amino acids found in nature are further divided into two groups based on their role in growth and nitrogen balance: essential amino acids and non-essential amino acids. Essential amino acids (EAAs) are those whose carbon skeletons are not synthesized de novo by animal cells, or are insufficiently synthesized de novo relative to metabolic needs (Wu, 2009). In other words, they cannot be produced by the body and therefore must be obtained from external sources (food). The 10 essential amino acids in all animals include Phe, Val, Thr, Trp, Ile, Met, His, Arg, Leu, and Lys. The "Lysine Contingency" is a concept from the Jurassic Park films in which the dinosaurs are genetically engineered to be unable to naturally produce the amino acid lysine and must be fed it in their diets. The scientists added this feature as a kill-switch: if the dinosaurs escaped their confines, they would be unable to survive in nature without human-supplemented lysine in their food. However, this concept is scientifically flawed because lysine is already an essential amino acid that isn't naturally synthesized by the body and can only be obtained externally through diet. The film's inaccuracy (portraying lysine as though it were normally non-essential) demonstrates the filmmakers' negligence in aligning the biotech/synbio-focused plot with biological reality. To non-science audiences, this may seem like a clever plot device, but as someone with a biology background who enjoyed these films growing up (and can even say they inspired me to pursue biotech), this now seems like a cringe-worthy and amusing mistake in such a major franchise.

Citations:

Hou, Y., & Wu, G. (2018). Nutritionally essential amino acids. Advances in Nutrition, 9(6), 849–851. https://doi.org/10.1093/advances/nmy054

https://jurassicpark.fandom.com/wiki/Lysine_contingency

(I acknowledge that AI tools such as ChatGPT, Claude, and DeepSeek were used to assist with conceptual understanding and language refinement. The content, ideas, and intellectual contributions of this work remain entirely my own.)


Week 2 HW: DNA Read/Write/Edit

Part 1: Benchling & In-silico Gel Art

For this part, I created an account on Benchling, imported the Lambda DNA, and experimented with 7 restriction enzymes to get gel patterns from the digests. Here are the enzymes I used and the resulting gel pattern of the digests:

[Images: restriction enzymes used and the resulting gel pattern]

(I envisioned creating a dragonfly, so if you look closely I hope you can see it 😄)


Part 3: DNA Design Challenge

3.1. Choose your protein

The protein I chose for this assignment is the tardigrade Dsup (Damage suppressor) protein, a unique DNA-binding protein produced only by tardigrades (phylum Tardigrada) that enables them to survive DNA damage induced by reactive oxygen species (ROS) and radiation (Hashimoto et al., 2016). It is one of the key reasons these tiny organisms can tolerate the extreme conditions of space (Jönsson et al., 2008). What I found especially fascinating is that when Dsup was expressed in cultured human cells (HEK293), it also reduced X-ray- and hydrogen-peroxide-induced DNA damage by lowering DNA breaks and fragmentation. Furthermore, many exciting studies are exploring Dsup-based gene and mRNA delivery approaches as potential radioprotective therapies, which are especially relevant to cancer treatment (Cui et al., 2025). The UniProt sequence of Dsup (https://www.uniprot.org/uniprotkb/P0DOW4/entry) is pasted below:

MASTHQSSTEPSSTGKSEETKKDASQGSGQDSKNVTVTKGTGSSATSAAIVKTGGSQGKDSSTTAGSSSTQGQKFSTTPTDPKTFSSDQKEKSKSPAKEVPSGGDSKSQGDTKSQSDAKSSGQSQGQSKDSGKSSSDSSKSHSVIGAVKDVVAGAKDVAGKAVEDAPSIMHTAVDAVKNAATTVKDVASSAASTVAEKVVDAYHSVVGDKTDDKKEGEHSGDKKDDSKAGSGSGQGGDNKKSEGETSGQAESSSGNEGAAPAKGRGRGRPPAAAKGVAKGAAKGAAASKGAKSGAESSKGGEQSSGDIEMADASSKGGSDQRDSAATVGEGGASGSEGGAKKGRGRGAGKKADAGDTSAEPPRRSSRLTSSGTGAGSAPAAAKGGAKRAASSSSTPSNAKKQATGGAGKAAATKATAAKSAASKAPQNGAGAKKKGGKAGGRKRK

References

  • Hashimoto, T., Horikawa, D. D., Saito, Y., Kuwahara, H., Kozuka-Hata, H., Shin-I, T., Minakuchi, Y., Ohishi, K., Motoyama, A., Aizu, T., Enomoto, A., Kondo, K., Tanaka, S., Hara, Y., Koshikawa, S., Sagara, H., Miura, T., Yokobori, S. I., Miyagawa, K., Suzuki, Y., … Kunieda, T. (2016). Extremotolerant tardigrade genome and improved radiotolerance of human cultured cells by tardigrade-unique protein. Nature communications, 7, 12808. https://doi.org/10.1038/ncomms12808
  • Jönsson, K. I., Rabbow, E., Schill, R. O., Harms-Ringdahl, M., & Rettberg, P. (2008). Tardigrades survive exposure to space in low Earth orbit. Current Biology, 18(17), R729–R731. https://doi.org/10.1016/j.cub.2008.06.048
  • Cui, Z., Lin, C., Zhao, H., & Wang, X. (2025). Radioprotection redefined: drug discovery at the intersection of tardigrade biology and translational pharmacology. Frontiers in pharmacology, 16, 1713914. https://doi.org/10.3389/fphar.2025.1713914

3.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.

Using the “Reverse Translate” tool of the Sequence Manipulation Suite (https://www.bioinformatics.org/sms2/rev_trans.html), I obtained the following DNA sequence:

atggcgagcacccatcagagcagcaccgaaccgagcagcaccggcaaaagcgaagaaacc aaaaaagatgcgagccagggcagcggccaggatagcaaaaacgtgaccgtgaccaaaggc accggcagcagcgcgaccagcgcggcgattgtgaaaaccggcggcagccagggcaaagat agcagcaccaccgcgggcagcagcagcacccagggccagaaatttagcaccaccccgacc gatccgaaaacctttagcagcgatcagaaagaaaaaagcaaaagcccggcgaaagaagtg ccgagcggcggcgatagcaaaagccagggcgataccaaaagccagagcgatgcgaaaagc agcggccagagccagggccagagcaaagatagcggcaaaagcagcagcgatagcagcaaa agccatagcgtgattggcgcggtgaaagatgtggtggcgggcgcgaaagatgtggcgggc aaagcggtggaagatgcgccgagcattatgcataccgcggtggatgcggtgaaaaacgcg gcgaccaccgtgaaagatgtggcgagcagcgcggcgagcaccgtggcggaaaaagtggtg gatgcgtatcatagcgtggtgggcgataaaaccgatgataaaaaagaaggcgaacatagc ggcgataaaaaagatgatagcaaagcgggcagcggcagcggccagggcggcgataacaaa aaaagcgaaggcgaaaccagcggccaggcggaaagcagcagcggcaacgaaggcgcggcg ccggcgaaaggccgcggccgcggccgcccgccggcggcggcgaaaggcgtggcgaaaggc gcggcgaaaggcgcggcggcgagcaaaggcgcgaaaagcggcgcggaaagcagcaaaggc ggcgaacagagcagcggcgatattgaaatggcggatgcgagcagcaaaggcggcagcgat cagcgcgatagcgcggcgaccgtgggcgaaggcggcgcgagcggcagcgaaggcggcgcg aaaaaaggccgcggccgcggcgcgggcaaaaaagcggatgcgggcgataccagcgcggaa ccgccgcgccgcagcagccgcctgaccagcagcggcaccggcgcgggcagcgcgccggcg gcggcgaaaggcggcgcgaaacgcgcggcgagcagcagcagcaccccgagcaacgcgaaa aaacaggcgaccggcggcgcgggcaaagcggcggcgaccaaagcgaccgcggcgaaaagc gcggcgagcaaagcgccgcagaacggcgcgggcgcgaaaaaaaaaggcggcaaagcgggc ggccgcaaacgcaaa
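As a rough illustration of what reverse translation does, the sketch below maps each amino acid to a single fixed codon. The codon choices are simply read off the output above; how the Sequence Manipulation Suite actually selects codons is not described here, so the table and the function name `reverse_translate` are assumptions for illustration only. Amino acids that do not appear in the output (such as Cys and Trp) are deliberately left out and raise an error.

```python
# One codon per amino acid, as observed in the reverse-translated output above.
# Illustrative only: residues absent from that output are not listed.
PREFERRED_CODON = {
    "A": "gcg", "D": "gat", "E": "gaa", "F": "ttt", "G": "ggc",
    "H": "cat", "I": "att", "K": "aaa", "L": "ctg", "M": "atg",
    "N": "aac", "P": "ccg", "Q": "cag", "R": "cgc", "S": "agc",
    "T": "acc", "V": "gtg", "Y": "tat",
}


def reverse_translate(protein: str) -> str:
    """Return one possible DNA sequence encoding the given protein."""
    try:
        return "".join(PREFERRED_CODON[aa] for aa in protein.upper())
    except KeyError as err:
        raise ValueError(f"no codon listed for residue {err.args[0]!r}") from err


# First 20 residues of the Dsup sequence
print(reverse_translate("MASTHQSSTEPSSTGKSEET"))
```

Because each amino acid has several synonymous codons, this is just one of many valid reverse translations of the same protein.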

3.3. Codon optimization

When expressing a foreign gene in a host organism (chassis), it is important to consider codon usage bias: each organism uses some synonymous codons for an amino acid far more often than others, and codons that are rare in the host tend to be translated inefficiently. Codon optimization addresses this by replacing such codons with synonymous ones the host prefers, without changing the amino acid sequence, so that the host organism translates the protein of interest efficiently.
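A quick sanity check for any codon-optimized sequence is that it still translates to the same protein. The sketch below builds the standard genetic code table and compares the first eight codons of the reverse-translated and the optimized Dsup sequences from this write-up. The helper name `translate` is hypothetical, and the check covers only this short excerpt, not the full gene.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence (frame 1) into one-letter amino acids."""
    dna = dna.upper().replace(" ", "")
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


before = "atggcgagcacccatcagagcagc"  # start of the reverse-translated sequence
after = "ATGGCGAGCACCCATCAGAGCTCC"   # start of the codon-optimized sequence

# Different codons (agc vs TCC for the last serine), same protein
print(translate(before), translate(after), translate(before) == translate(after))
```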

Supposing I want to produce an engineered Dsup protein for therapeutic use, I would choose E. coli as the chassis/delivery system. E. coli is one of the standard and most reliable chassis in synthetic biology: it is thoroughly studied, well characterized, and grows rapidly and inexpensively in large volumes.

Using the Codon Optimization Tool of VectorBuilder, I get the following improved sequence:

ATGGCGAGCACCCATCAGAGCTCCACCGAACCGAGCAGCACCGGCAAAAGCGAAGAAACCAAAAAAGATGCGTCACAGGGCTCAGGCCAGGATAGCAAAAATGTGACCGTGACCAAAGGTACCGGCAGCAGCGCCACCAGCGCGGCGATTGTAAAAACCGGCGGTAGCCAGGGCAAAGATAGTTCTACCACCGCGGGCAGCAGCAGTACCCAAGGTCAGAAATTCAGCACCACCCCGACCGATCCGAAAACCTTTAGCAGTGATCAGAAAGAAAAATCCAAAAGCCCGGCCAAAGAAGTGCCGAGCGGCGGTGATTCAAAAAGCCAGGGCGACACCAAAAGTCAGAGCGATGCGAAATCTAGCGGCCAATCACAGGGCCAGAGCAAAGATAGCGGTAAAAGCAGCAGCGACAGTTCGAAAAGCCATAGCGTTATTGGTGCGGTGAAAGATGTGGTTGCGGGCGCCAAAGATGTGGCAGGCAAAGCGGTGGAAGATGCGCCGTCCATTATGCATACCGCCGTGGATGCGGTTAAAAATGCAGCGACCACCGTTAAAGATGTGGCGAGCAGCGCAGCCAGCACCGTGGCGGAAAAAGTAGTGGATGCGTATCACTCAGTTGTCGGCGATAAAACTGATGATAAAAAAGAAGGCGAACATTCGGGCGATAAAAAAGATGATAGCAAAGCGGGTAGCGGCAGCGGCCAGGGCGGCGACAACAAAAAATCAGAAGGTGAAACGTCCGGCCAGGCGGAAAGCAGCTCAGGTAACGAAGGTGCCGCCCCGGCGAAAGGCCGCGGCCGTGGTCGCCCGCCAGCGGCCGCAAAAGGCGTTGCCAAAGGTGCCGCCAAAGGCGCGGCGGCATCAAAAGGCGCGAAATCGGGTGCGGAATCAAGTAAAGGCGGCGAACAGAGCAGCGGCGATATCGAAATGGCGGATGCTAGCAGCAAAGGCGGCAGCGATCAACGTGACAGTGCCGCCACCGTGGGCGAAGGTGGCGCGTCGGGCAGCGAAGGTGGTGCGAAAAAAGGCCGTGGCCGCGGTGCCGGCAAAAAAGCGGATGCGGGCGATACGAGCGCGGAACCGCCGCGCCGCAGCAGTCGTTTAACCTCAAGCGGCACCGGCGCCGGCAGCGCGCCGGCAGCGGCCAAAGGCGGCGCCAAACGCGCGGCCAGCAGCTCGAGCACCCCGAGCAATGCCAAAAAACAGGCGACCGGCGGCGCGGGCAAAGCGGCCGCAACCAAAGCGACCGCCGCGAAAAGTGCGGCGTCTAAAGCTCCGCAAAATGGCGCGGGCGCGAAAAAAAAAGGCGGCAAAGCAGGTGGCCGCAAACGCAAA

3.4. You have a sequence! Now what?

The next step is to express this sequence in a host: the gene is transcribed into mRNA, which is then translated into the protein. Since I optimized the sequence for E. coli, I would use an E. coli expression system, and the cell’s transcription/translation machinery will produce the protein. I will achieve this in the following steps:

  • Constructing a recombinant vector (Dsup gene + expression plasmid) via Gibson Assembly. It will comprise the Dsup gene, a promoter, a ribosome binding site, a selectable marker (antibiotic resistance) gene, and a His-tag sequence.
  • Transformation of E. coli cells by introducing the vector. Cells containing the plasmid are selected on antibiotic media.
  • Transcription: inside the bacterial cell, RNA polymerase recognizes the promoter on the plasmid and transcribes the Dsup gene into mRNA.
  • Translation: ribosomes bind the ribosome binding site on the mRNA and translate codons into amino acids using tRNAs. The growing polypeptide folds into the functional Dsup protein inside the cell.
  • Purification: the cells are lysed and the lysate is passed through a His-tag affinity chromatography column. Since the Dsup gene is fused to a His-tag sequence in the vector, the expressed protein carries the tag, which allows it to be separated from the other cellular proteins.

Part 4: Prepare a Twist DNA Synthesis Order

Step 1) Create a Twist account and a Benchling account

Step 2) Build Your DNA Insert Sequence

I used my codon-optimized Dsup sequence for an E. coli expression system. On Benchling, I sequentially inserted the following sequences (given in the HW) into the “Create DNA / RNA sequence” tab:

  • Promoter (e.g. BBa_J23106): TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC
  • RBS (e.g. BBa_B0034 with spacers for optimal expression): CATTAAAGAGGAGAAAGGTACC
  • Start Codon: ATG
  • Coding Sequence (my codon-optimized Dsup sequence)
  • 7x His Tag (to enable protein purification from E. coli): CATCACCATCACCATCATCAC
  • Stop Codon: TAA
  • Terminator (e.g. BBa_B0015): CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA

[Image: Benchling view during the process of annotating]

Linear Map (link: https://benchling.com/s/seq-FJPRVyjCrav3opFmzzZ2?m=slm-MQBEBBrWm3w4d8JMFsGv)
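The part order listed above can also be sketched in code as a simple concatenation, which makes it easy to check that the reading frame from the start codon to the stop codon stays intact. The part sequences are the ones listed above; `CDS_PLACEHOLDER` is a short hypothetical stand-in (the first eight codons of the optimized sequence), not the full Dsup coding sequence, and the function names are made up for illustration.

```python
# Hypothetical stand-in for the full codon-optimized Dsup coding sequence
CDS_PLACEHOLDER = "ATGGCGAGCACCCATCAGAGCTCC"

# Parts in 5'->3' order, as listed in Step 2
PARTS = [
    ("promoter BBa_J23106", "TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC"),
    ("RBS BBa_B0034 + spacers", "CATTAAAGAGGAGAAAGGTACC"),
    ("start codon", "ATG"),
    ("coding sequence", CDS_PLACEHOLDER),
    ("7x His tag", "CATCACCATCACCATCATCAC"),
    ("stop codon", "TAA"),
    ("terminator BBa_B0015",
     "CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGG"
     "TGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA"),
]


def assemble(parts: list[tuple[str, str]]) -> str:
    """Concatenate the part sequences in the given order."""
    return "".join(seq for _, seq in parts)


def open_reading_frame(parts: list[tuple[str, str]]) -> str:
    """Return the stretch from the start codon through the stop codon."""
    names = [name for name, _ in parts]
    first, last = names.index("start codon"), names.index("stop codon")
    return assemble(parts[first:last + 1])


cassette = assemble(PARTS)
orf = open_reading_frame(PARTS)
print(f"cassette length: {len(cassette)} bp")
print(f"ORF length: {len(orf)} bp, in frame: {len(orf) % 3 == 0}")
```

Swapping in the full coding sequence only requires replacing `CDS_PLACEHOLDER`; the frame check then confirms that the His tag and stop codon remain in frame with the gene.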

Step 3) On Twist Bioscience:

Selecting “Genes” and then “Clonal Genes” Option

cover image cover image cover image cover image

Importing the Benchling Cassette Sequence (FASTA)

cover image cover image

Choosing Twist Vector

cover image cover image

Downloading GenBank sequence of the construct

cover image cover image

Here is what the recombinant vector looks like in Twist:

cover image cover image

Step 4) Back to Benchling:

After importing the Twist construct (GenBank file), this is the final plasmid + expression cassette:

cover image cover image

Part 5: DNA Read/Write/Edit

5.1 DNA Read

(i) What DNA would you want to sequence (e.g., read) and why?

I would want to sequence the tardigrade dsup gene (damage suppressor). The Dsup protein binds and physically shields DNA from hydroxyl radicals generated by radiation and oxidative stress. This makes it super interesting for some of my envisioned projects on:

  • Space biology (could it be used to protect astronauts from space radiation? maybe in engineered probiotics or human gene therapy for long-term missions?)
  • Radioprotectant for cancer therapy (protecting healthy cells from radiotherapy?)
  • Biomanufacturing (dsup-engineered human cells that are radiation-resistant?)

(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?

I would choose Oxford Nanopore sequencing (3rd generation) because it provides very long read lengths and sequences DNA directly, without any amplification, which is ideal for verifying complex constructs like the Dsup gene and related assemblies. Skipping amplification reduces bias, and the platform generates data rapidly in real time. However, it has a higher raw error rate than Sanger (1st gen) or short-read (2nd gen) sequencing and requires computational base-calling, so I would use it primarily for assembly confirmation and structure verification.

The essential steps (I used ChatGPT to simplify this) are given below:

  1. Library preparation: attaching specific sequence adapters to the DNA so it can interact with pores
  2. DNA passes through protein nanopores: each single-stranded DNA molecule travels through a tiny biological pore embedded in a membrane
  3. Electrical signal detection: as each nucleotide passes through the pore, it causes a distinctive change in ionic current (measured in real-time)
  4. Base-calling: signal data is processed by software to convert electrical signals into a DNA sequence
  5. Data analysis: aligning the reads to the intended design to confirm correct assembly

Output: a set of DNA read files (FASTQ) containing the nucleotide sequences of the DNA fragments plus their associated quality scores. These reads can be aligned to the designed Dsup sequence to verify that the synthesized gene matches the intended design. Other outputs, such as consensus sequences and variant files, summarize the final DNA sequence and any differences from the design.
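As a rough illustration of what the FASTQ output looks like and how the quality scores can be used, here is a small Python sketch. The two reads in `toy_fastq` are made up, the threshold of Q10 is an arbitrary assumption, and the quality string is assumed to use the common Phred+33 encoding. Averaging Phred scores is only a coarse summary, and real alignment to the design would be done with a dedicated aligner rather than this script.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, quality_string) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:
            break
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line
        quality = handle.readline().strip()
        yield header[1:], sequence, quality


def mean_phred(quality: str) -> float:
    """Average Phred score, assuming the common Phred+33 ASCII encoding."""
    return sum(ord(ch) - 33 for ch in quality) / len(quality)


# Made-up example records: 'I' encodes Q40 and '!' encodes Q0 in Phred+33.
toy_fastq = "@read1\nACGT\n+\nIIII\n@read2\nACGA\n+\n!!!!\n"

records = list(read_fastq(io.StringIO(toy_fastq)))
MIN_MEAN_Q = 10  # arbitrary illustrative threshold
passing = [read_id for read_id, _, qual in records if mean_phred(qual) >= MIN_MEAN_Q]
print(passing)
```

Reads that pass a filter like this would then go on to alignment against the designed sequence.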

5.2 DNA Write

(i) What DNA would you want to synthesize (e.g., write) and why?

I would synthesize a fusion gene encoding a chimeric Dsup–HMGN protein for expression in human cells. Dsup (from tardigrades) protects DNA from radiation damage, but since it is not a human protein it may not localize efficiently to human chromatin and could trigger immune responses. To improve this, I would fuse Dsup to the nucleosome-binding domain of a human HMGN protein, which naturally binds chromatin in human cells. HMGN proteins are present in vertebrates, including humans, and share some sequence similarity with Dsup (Chavez et al., 2019).

The goal would be to create a protein that:

  • targets human DNA effectively
  • protects DNA during radiation exposure

In terms of potential applications, this could protect healthy tissue during radiotherapy and astronauts from space radiation.

(ii) What technology or technologies would you use to perform this DNA synthesis and why?

I would use phosphoramidite chemical DNA synthesis followed by enzymatic assembly (the workflow used by Twist Bioscience). Why?

  • My construct (~1.5–2 kb fusion gene) is too long to synthesize in one piece
  • Chemical synthesis reliably produces short oligos (≈150–200 bp)
  • These fragments can then be assembled into the full gene using PCR assembly/Gibson assembly
  • It allows codon optimization and seamless fusion of Dsup + linker + HMGN
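The idea of breaking a long gene into short synthesizable pieces that share overlapping ends can be sketched in Python. This is an in-silico illustration only: the 200-base fragment limit follows the range quoted above, while the 30-base overlap, the random 1,800-base stand-in sequence and the function names are assumptions of mine, not Twist's actual procedure.

```python
import random


def split_into_fragments(sequence, max_len=200, overlap=30):
    """Split a sequence into fragments of at most max_len bases.

    Adjacent fragments share `overlap` bases so they can be stitched together.
    """
    if not 0 < overlap < max_len:
        raise ValueError("overlap must be positive and smaller than max_len")
    fragments, start, step = [], 0, max_len - overlap
    while True:
        fragments.append(sequence[start:start + max_len])
        if start + max_len >= len(sequence):
            break
        start += step
    return fragments


def reassemble(fragments, overlap=30):
    """Rejoin fragments by verifying and trimming the shared overlap (in silico only)."""
    assembled = fragments[0]
    for fragment in fragments[1:]:
        if assembled[-overlap:] != fragment[:overlap]:
            raise ValueError("adjacent fragments do not share the expected overlap")
        assembled += fragment[overlap:]
    return assembled


# Random stand-in for a ~1.8 kb fusion gene (NOT a real Dsup-HMGN sequence).
rng = random.Random(0)
toy_gene = "".join(rng.choices("ACGT", k=1800))

fragments = split_into_fragments(toy_gene)
print(len(fragments), "fragments, longest:", max(len(f) for f in fragments))
print("round trip ok:", reassemble(fragments) == toy_gene)
```

In the wet lab, the shared ends are what let overlap-based methods such as Gibson assembly join neighbouring fragments in the right order.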

So my workflow would look like this:

  1. Design codon-optimized sequence (human expression)
  2. Chemically synthesize short DNA oligos
  3. Assemble into full gene
  4. Clone into plasmid
  5. Verify sequence using Oxford Nanopore sequencing

Why Oxford Nanopore?

  • Can read the entire plasmid in one read (long-read sequencing)
  • Confirms correct assembly and orientation
  • Detects large insertions, deletions, or rearrangements that Sanger may miss

Essential steps of the chosen sequencing method (Oxford Nanopore; repeated from above)

  1. Library preparation: attaching specific sequence adapters to the DNA so it can interact with pores
  2. DNA passes through protein nanopores: each single-stranded DNA molecule travels through a tiny biological pore embedded in a membrane
  3. Electrical signal detection: as each nucleotide passes through the pore, it causes a distinctive change in ionic current (measured in real-time)
  4. Base-calling: signal data is processed by software to convert electrical signals into a DNA sequence
  5. Data analysis: aligning the reads to the intended design to confirm correct assembly

Limitations of the sequencing method in terms of speed, accuracy, scalability: (I consulted the internet and ChatGPT for this)

  1. Accuracy

    • Raw read accuracy lower than short-read sequencing
    • Homopolymers (AAAAA…) are error-prone
    • Usually solved by high coverage consensus
  2. Speed

    • Fast setup and real-time results
    • But computational basecalling adds analysis time
  3. Scalability

    • Excellent for plasmids and constructs
    • Not ideal for very high-throughput small variant detection compared to Illumina
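The point about high-coverage consensus fixing raw read errors can be shown with a toy majority vote. This sketch assumes the reads are already aligned and of equal length, which real pipelines achieve with an aligner and a dedicated polishing tool. The reads below are made up, each with at most one error.

```python
from collections import Counter


def consensus(aligned_reads):
    """Column-wise majority vote over reads that are already aligned to equal length."""
    if len({len(read) for read in aligned_reads}) != 1:
        raise ValueError("reads must be pre-aligned and of equal length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_reads)
    )


# Made-up reads of the same 10-base region; three carry a single error each.
toy_reads = [
    "ACGTACGTAA",
    "ACCTACGTAA",  # error at position 3
    "ACGTACGAAA",  # error at position 8
    "ACGTTCGTAA",  # error at position 5
    "ACGTACGTAA",
]

print(consensus(toy_reads))
```

Because the errors fall at different positions in different reads, each column is still dominated by the correct base. Systematic errors such as homopolymer miscalls are harder, since many reads can be wrong in the same place.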

[References]

  • Chavez C, Cruz-Becerra G, Fei J, Kassavetis GA, Kadonaga JT (2019). The tardigrade damage suppressor protein binds to nucleosomes and protects DNA from hydroxyl radicals. eLife 8:e47682. https://doi.org/10.7554/eLife.47682

5.3 DNA Edit

(i) What DNA would you want to edit and why?

I would focus on editing human genomes to improve health, increase disease resistance, and extend healthy lifespan. With my background in cancer research and interest in space biology, I’m especially intrigued by the tardigrade Dsup protein, which protects DNA from radiation damage. Understanding or applying similar mechanisms could help humans better withstand cosmic radiation during space travel and even serve as radioprotective strategies in cancer therapy, all while enhancing our natural resilience to disease and aging.

(ii) What technology or technologies would you use to perform these DNA edits and why?

I would use CRISPR-Cas9 to edit human genomes because it is precise, efficient, and versatile, allowing targeted insertion of protective genes such as the tardigrade Dsup gene to enhance DNA resilience, disease resistance, and longevity.

  1. How does your technology of choice edit DNA? What are the essential steps? (Used ChatGPT to simplify and refine concepts)

CRISPR uses a guide RNA (gRNA) to direct the Cas9 enzyme to a specific DNA sequence, where Cas9 creates a double-strand break. When a donor template is supplied, the cell can repair the break via homology-directed repair (HDR), allowing precise insertion of a donor DNA sequence, such as my codon-optimized Dsup gene.

  2. What preparation do you need to do (e.g. design steps) and what is the input (e.g. DNA template, enzymes, plasmids, primers, guides, cells) for the editing?

Design Steps:

  • Identify the target DNA sequence in the human genome.
  • Design a specific guide RNA complementary to the target.
  • If introducing a new gene (e.g., Dsup), design a donor DNA template with homology arms for HDR.

Inputs Needed:

  • gRNA sequence targeting the genomic site.
  • Cas9 enzyme or Cas9-expressing plasmid.
  • Donor DNA template (for the dsup gene).
  • Human cells for editing, cultured in vitro.

  3. What are the limitations of your editing methods (if any) in terms of efficiency or precision?

  • HDR efficiency can be low, especially in non-dividing cells.
  • Off-target effects may cause unintended mutations.
  • Delivering CRISPR components into human tissues is challenging.
  • Ethical and safety concerns limit clinical applications in humans.
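The guide-design step above can be sketched as a simple scan for candidate target sites. This assumes the commonly used SpCas9 convention of a 20-nt protospacer followed by an NGG PAM. The `toy_target` sequence is made up (not a real human locus), the function names are my own, and the sketch does not score off-target risk, which real guide-design tools do.

```python
import re


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_guides(target: str, guide_len: int = 20):
    """Return (strand, protospacer, pam) for every site followed by an NGG PAM."""
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    hits = []
    for strand, seq in (("+", target), ("-", revcomp(target))):
        # The lookahead lets overlapping candidate sites all be reported.
        for match in pattern.finditer(seq):
            hits.append((strand, match.group(1), match.group(2)))
    return hits


# Made-up target region containing one forward-strand site (NOT a real locus).
toy_target = "AAAA" + "GATTACAGATTACAGATTAC" + "TGG" + "AAAA"

guides = find_guides(toy_target)
for strand, protospacer, pam in guides:
    print(strand, protospacer, pam)
```

Each reported protospacer is a candidate gRNA sequence. Candidates would still need to be checked against the rest of the genome to limit the off-target effects listed above.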

Week 3 HW: Lab Automation

Subsections of Labs

Week 1 Lab: Pipetting

cover image cover image

Subsections of Projects

Individual Final Project

cover image cover image

Group Final Project

cover image cover image