<Christine Zhou> — HTGAA Spring 2026


About me

Hello, I’m Christine, a Product Design Engineering student passionate about enhancing ecological integration within our communities through design. I am continually inspired by citizen science groups and their commitment to conservation and environmental research, and their work motivates me to explore how design can strengthen our relationship with nature.

I love learning and participating in habitat restoration projects, including coral planting, fungi observation, community farming, and wildlife surveys. These experiences have deepened my understanding of ecological systems and the role design can play in supporting environmental resilience.

Contact info

Homework

Labs

Projects

Subsections of <Christine Zhou> — HTGAA Spring 2026

Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices

    1. First, describe a biological engineering application or tool you want to develop and why. I am interested in developing materials that recover over time by mimicking fungal evolutionary conservatism and continuity. I am currently developing a pollen sensor intended to be placed in children’s playgrounds, while simultaneously reading Merlin Sheldrake’s Entangled Life. Together, these are inspiring me to think about design in a different way.
  • Week 2 HW: DNA Read, Write, Edit Life

    Part 1: Benchling & In-silico Gel Art Navigating to Lambda sequence and saving as a text file Importing Lambda file by saving FASTA Format of Sequence file Simulate Restriction Enzyme Digestion with the following Enzymes: EcoRI Testing Virtual digest example Adding more restriction enzymes Simulate Restriction Enzyme Digestion with all the Enzymes

  • Week 3 HW: Lab Automation

    Generate an artistic design using the GUI at opentrons-art.rcdonovan.com I wanted to draw Punch, the famous baby Japanese macaque from the zoo in Japan. The GUI created an approximate outline, which I used as a base and added more details in different colours on top. This was the end result. These were the settings with the coordinates.

  • Week 4 HW: Protein Design Part I

    Part A. Conceptual Questions Answer any NINE of the following questions from Shuguang Zhang: (i.e. you can select two to skip) How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons) Why do humans eat beef but do not become a cow, eat fish but do not become fish? Why are there only 20 natural amino acids? Can you make other non-natural amino acids? Design some new amino acids. Where did amino acids come from before enzymes that make them, and before life started? If you make an α-helix using D-amino acids, what handedness (right or left) would you expect? Can you discover additional helices in proteins? Why are most molecular helices right-handed? Why do β-sheets tend to aggregate? What is the driving force for β-sheet aggregation? Why do many amyloid diseases form β-sheets? Can you use amyloid β-sheets as materials? Design a β-sheet motif that forms a well-ordered structure. Part B: Protein Analysis and Visualization

Subsections of Homework

Week 1 HW: Principles and Practices


1. First, describe a biological engineering application or tool you want to develop and why.

I am interested in developing materials that recover over time by mimicking fungal evolutionary conservatism and continuity. I am currently developing a pollen sensor intended to be placed in children’s playgrounds, while simultaneously reading Merlin Sheldrake’s Entangled Life. Together, these are inspiring me to think about design in a different way.

In Entangled Life, Sheldrake describes how certain fungi infect insects, such as ants, causing them to veer off the trajectory of their own evolutionary story and onto the evolutionary path of the fungus. This blurring of where one organism ends and another begins prompted me to reflect on how other objects could operate in similar ways.

From this, I began to brainstorm how a pollen sensor, a public infrastructure where durability and reparability govern its sustainability and performance, could behave more like a fungus. How might it heal itself? How could it detect air quality and sense its environment without relying solely on electronic monitoring of pollen, carbon, dust, or humidity? What might be gained from infrastructure that is self-regenerative, adaptive, or even slowly evolving into another living organism?

2. Next, describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future, like ensuring non-malfeasance (preventing harm). Break big goals down into two or more specific sub-goals.

For self-recovering materials, especially in public infrastructure, the relevant governance goals are environmental responsibility, ecological integration, and public trust and safety standards for public infrastructure utilities.

3. Next, describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”).

  1. Environmental responsibility
  • Purpose: Ensure that self-recovering material used in public infrastructure does not harm the surrounding environment or affect the existing biodiversity.
  • Design: Require environmental impact assessments for all materials and define acceptable interactions between the infrastructure and local ecosystems.
  • Assumptions: Materials designed to recover or regenerate will affect ecosystems and hence have to be carefully governed.
  • Risks of Failure & “Success”: Failure is that the regenerative materials alter local ecosystems in unpredictable or harmful ways. Success is that the regenerative material improves the circularity of resources.
  2. Public safety and infrastructure standards compliance
  • Purpose: Maintain public trust by ensuring the material meets established safety standards, especially as it begins to repair itself.
  • Design: Require regular inspection protocols that account for material change, aging, or regeneration over time. Establish clear thresholds for intervention.
  • Assumptions: Public acceptance depends on reliability; when the infrastructure completely regenerates, it is well integrated back into nature.
  • Risks of Failure & “Success”: Failure is that the material recovery compromises structural integrity and introduces new safety hazards. Success is that the infrastructure remains safe and reliable over a period of time, until its original functionality is overtaken by a new identity.
  3. Ecological impact
  • Purpose: Ensure that public infrastructure with regenerative material operates within ecological limits.
  • Design: Ensure ecological integration rather than disruption.
  • Assumptions: The rate of regeneration varies greatly from location to location, so different types of material must be used rather than a one-size-fits-all solution.
  • Risks of Failure & “Success”: Success is that the regenerative material allows ecological compatibility and creates positive interactions with its surroundings. Failure is a failed integration between the organisms and their environment.

4. Next, score (from 1-3, with 1 as the best, or n/a) each of your governance actions against your rubric of policy goals. The following is one framework but feel free to make your own:

| Does the option: | Option 1 | Option 2 | Option 3 |
|---|---|---|---|
| Enhance Biosecurity | | | |
| • By preventing incidents | 3 | 3 | 3 |
| • By helping respond | 3 | 3 | 3 |
| Foster Lab Safety | 3 | 3 | 3 |
| • By preventing incident | 3 | 3 | 3 |
| • By helping respond | 3 | 3 | 3 |
| Protect the environment | | | |
| • By preventing incidents | 1 | 1 | 1 |
| • By helping respond | 1 | 1 | 1 |
| Other considerations | | | |
| • Minimizing costs and burdens to stakeholders | 1 | 2 | 2 |
| • Feasibility? | 2 | 2 | 1 |
| • Not impede research | 1 | 2 | 1 |
| • Promote constructive applications | 1 | 2 | 1 |

5. Last, drawing upon this scoring, describe which governance option, or combination of options, you would prioritize, and why. Outline any trade-offs you considered as well as assumptions and uncertainties.

Drawing upon this scoring, I would prioritise public safety and infrastructure standards compliance, because this governance option addresses the risks associated with materials and installation. It ensures that regulations and compliance requirements are met in line with industry standards, which together build public trust. One trade-off is that safety requirements may constrain certain material properties, so the material cannot completely replicate the conservatism or continuity of a fungus. Key assumptions are that current safety standards do not yet provide a framework for self-regenerative materials, and there is uncertainty about how the materials will behave under rapidly changing conditions.

In preparation for Week 2’s lecture on “DNA Read, Write, and Edit,"

In addition, answer these questions in each faculty member’s section:

Homework Questions from Professor Jacobson:

  1. Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy? Error rate: about 1 in 10⁶ bases.
    Throughput: 10 ms per base addition.

Compared with the length of the human genome (about 3 × 10⁹ base pairs), this error rate would leave on the order of thousands of errors in every copy. Biology deals with that discrepancy through error correction: polymerase proofreading and mismatch repair.

  2. How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest? About 10¹⁴³ possible DNA sequences. Not all of them work to code for the protein of interest, because of codon usage bias and mRNA stability.
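
A figure on the order of 10¹⁴³ can be reproduced with a short calculation. The sketch below assumes an average protein of roughly 300 residues and about three synonymous codons per residue; both numbers are assumptions chosen for illustration, not values taken from the lecture. The per-residue function uses the codon counts of the standard genetic code.

```python
import math

# Number of synonymous codons per amino acid in the standard genetic code.
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def log10_encodings(protein: str) -> float:
    """log10 of the number of DNA sequences that encode `protein` exactly."""
    return sum(math.log10(CODON_DEGENERACY[aa]) for aa in protein)


def log10_encodings_uniform(length: int, codons_per_residue: float = 3.0) -> float:
    """Rough estimate: every residue has the same number of codon choices."""
    return length * math.log10(codons_per_residue)


if __name__ == "__main__":
    # Assumed "average" protein: 300 residues, ~3 codons per residue.
    print(f"about 10^{log10_encodings_uniform(300):.0f} encodings")
```

Passing a real protein sequence to `log10_encodings` gives the exact count for that protein instead of the uniform estimate.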

Homework Questions from Dr. LeProust:

  1. What’s the most commonly used method for oligo synthesis currently? The most common method is solid-phase phosphoramidite synthesis, in which each nucleotide addition has a small failure rate (~0.5–1%).
  2. Why is it difficult to make oligos longer than 200nt via direct synthesis? It is difficult because longer sequences have more opportunities for incomplete coupling and other chemical damage.
  3. Why can’t you make a 2000bp gene via direct oligo synthesis? Direct oligo synthesis only reliably produces sequences of ≤200 nt, so a 2000 bp gene has to be assembled from shorter oligos.
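
The length limit follows from compounding the per-step failure rate. As a minimal sketch, assuming every coupling step succeeds independently with the same efficiency (a simplification that ignores depurination and other side reactions), the fraction of full-length product is the efficiency raised to the number of couplings:

```python
def full_length_yield(length_nt: int, coupling_efficiency: float) -> float:
    """Fraction of strands that reach full length.

    An n-nt oligo needs n - 1 coupling steps after the first, support-bound
    nucleotide; each step is assumed to succeed independently.
    """
    return coupling_efficiency ** (length_nt - 1)


if __name__ == "__main__":
    for length in (20, 200, 2000):
        for efficiency in (0.99, 0.995):
            y = full_length_yield(length, efficiency)
            print(f"{length:>5} nt at {efficiency:.1%} per step: {y:.3g}")
```

With a 1% failure rate per step, only about 13% of 200 nt strands are full length, and essentially none of a 2000 nt product would be.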

Homework Question from George Church: [Using Google & Prof. Church’s slide #4] What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?

  1. Arginine
  2. Histidine
  3. Isoleucine
  4. Leucine
  5. Lysine
  6. Methionine
  7. Phenylalanine
  8. Threonine
  9. Tryptophan
  10. Valine

Lysine is on this list, which shows that the Lysine Contingency is a normal, universal biological constraint: all animals need certain amino acids, including lysine, from food.

Week 2 HW: DNA Read, Write, Edit Life

Part 1: Benchling & In-silico Gel Art

Navigating to Lambda sequence and saving as a text file

Importing Lambda file by saving FASTA Format of Sequence file

Simulate Restriction Enzyme Digestion with the following Enzymes: EcoRI

Testing Virtual digest example

Adding more restriction enzymes

Simulate Restriction Enzyme Digestion with all the Enzymes
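
What the virtual digest computes can be sketched in a few lines. This is not Benchling's implementation; it is a minimal illustration that assumes a linear molecule and uses EcoRI, which recognises GAATTC and cuts after the first G. The toy sequence is made up for the example and is not the Lambda genome.

```python
def digest_linear(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every site.

    `cut_offset` is the position of the cut within the recognition site
    (1 for EcoRI, which cuts G^AATTC).
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical toy sequence with two EcoRI sites.
    toy = "AAAA" + "GAATTC" + "CCCCC" + "GAATTC" + "TT"
    fragments = digest_linear(toy)
    # Bands on a gel appear ordered by size, largest nearest the well.
    print(sorted(fragments, reverse=True))
```

Running the same function with several recognition sites, one enzyme per lane, gives the band patterns used to plan the gel art.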

Create a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks

Using https://rcdonovan.com/gel-art to create randomised Gel Art

Fireworks and Flower Gel Art Shape

Part 3: DNA Design Challenge

3.1. Choose your Protein

Figure 1

Figure 2

My chosen protein is the DdrC protein, seen in figures 1 and 2 above, because of its ability to respond to DNA damage by putting a stop to the damage and alerting the cell to begin the repair process. For my final project I would like to research materials that are able to self-repair and contain deterioration. Hence I wanted to research how DdrC from Deinococcus radiodurans is able to stabilise DNA breaks by binding to both single- and double-stranded lesions. Something I found quite interesting in the readings on this protein is its structural asymmetry, which allows precise lesion recognition (“The Unique Mechanism of DdrC in Enhancing DNA Stability,” 2024). It bears two asymmetric DNA binding sites located on either side of the dimer and can modulate the topology and level of compaction of circular DNA (Gueguen et al., 2022).

Protein Sequence

sp|Q9RYE6|DDRC_DEIRA DNA damage response protein C OS=Deinococcus radiodurans (strain ATCC 13939 / DSM 20539 / JCM 16871 / CCUG 27074 / LMG 4051 / NBRC 15346 / NCIMB 9279 / VKM B-1422 / R1) OX=243230 GN=ddrC PE=1 SV=2
MKNAPLTLNFGSVRLPVSADGLLHAPTAQQQLGLTQSWEAALVEHGLPETYRDFGAGPEA
AVSVPDFVALAFALDTPEARRWQKRARELLARAMQGDVRVAAQIAERNPEPDARRWLAAR
LESTGARRELLATVARHGGEGRVYGQLGSISNRTVLGKDSASVRQERGVKATRDGLTSAE
LLRLAYIDTVTARAIQESEARGNAAILTLHEQVARSERQSWERAGQVQRVG

3.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence

Using https://www.genecorner.ugent.be/rev_trans.html to create reverse translation of Protein DdrC

atgaaaaacgcgccgctgaccctgaactttggcagcgtgcgcctgccggtgagcgcggat ggcctgctgcatgcgccgaccgcgcagcagcagctgggcctgacccagagctgggaagcg gcgctggtggaacatggcctgccggaaacctatcgcgattttggcgcgggcccggaagcg gcggtgagcgtgccggattttgtggcgctggcgtttgcgctggataccccggaagcgcgc cgctggcagaaacgcgcgcgcgaactgctggcgcgcgcgatgcagggcgatgtgcgcgtg gcggcgcagattgcggaacgcaacccggaaccggatgcgcgccgctggctggcggcgcgc ctggaaagcaccggcgcgcgccgcgaactgctggcgaccgtggcgcgccatggcggcgaa ggccgcgtgtatggccagctgggcagcattagcaaccgcaccgtgctgggcaaagatagc gcgagcgtgcgccaggaacgcggcgtgaaagcgacccgcgatggcctgaccagcgcggaa ctgctgcgcctggcgtatattgataccgtgaccgcgcgcgcgattcaggaaagcgaagcg cgcggcaacgcggcgattctgaccctgcatgaacaggtggcgcgcagcgaacgccagagc tgggaacgcgcgggccaggtgcagcgcgtgggc
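
The output above uses a single codon for each amino acid, so the reverse translation can be sketched as a lookup table. The table below is read off the sequence above and covers only the 19 residues that occur in DdrC (cysteine is absent); it is an illustration of the idea, not the tool's actual code.

```python
# One codon per residue, as observed in the reverse-translated DdrC sequence.
ONE_CODON = {
    "A": "gcg", "D": "gat", "E": "gaa", "F": "ttt", "G": "ggc",
    "H": "cat", "I": "att", "K": "aaa", "L": "ctg", "M": "atg",
    "N": "aac", "P": "ccg", "Q": "cag", "R": "cgc", "S": "agc",
    "T": "acc", "V": "gtg", "W": "tgg", "Y": "tat",
}


def reverse_translate(protein: str) -> str:
    """Map each residue to its single chosen codon.

    Raises KeyError for residues that are not in the table (e.g. "C").
    """
    return "".join(ONE_CODON[aa] for aa in protein.upper())


if __name__ == "__main__":
    # First ten residues of DdrC.
    print(reverse_translate("MKNAPLTLNF"))
```

Because most amino acids have several codons, this is only one of many DNA sequences that encode the same protein.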

3.3. Codon optimization

Using https://en.vectorbuilder.com/tool/codon-optimization.html to codon-optimise the DNA/RNA sequence

Selected organism: Rat (Rattus norvegicus)

In your own words, describe why you need to optimize codon usage. Which organism have you chosen to optimize the codon sequence for and why?

Codon optimization is required to improve the efficiency of translation of mRNA into protein. It involves modifying the nucleotide sequence of a gene to replace rare or less-favoured codons with codons used more frequently in the host organism. The organism I have chosen is the rat, because its codon usage differs from that of the original source of the DdrC gene. Codon optimisation will replace codons that are rare in rats with preferred rat codons, allowing for faster translation, higher protein expression and more efficient production of the DdrC protein.

ATGAAGAACGCTCCACTGACCCTGAACTTTGGCAGCGTCAGACTGCCCGTGTCTGCAGACGGCCTGTTGCACGCCCCAACTGCCCAGCAGCAGCTGGGCCTGACCCAGTCCTGGGAAGCTGCTCTCGTGGAGCACGGACTGCCAGAGACCTACCGGGACTTCGGCGCTGGCCCTGAAGCAGCTGTGAGCGTGCCTGACTTTGTCGCTCTGGCCTTTGCCCTGGACACCCCTGAGGCCAGAAGATGGCAGAAGAGGGCTCGGGAGCTCCTGGCTAGAGCTATGCAGGGCGATGTGCGGGTGGCCGCCCAGATCGCTGAGAGAAACCCCGAGCCTGACGCCCGAAGATGGCTCGCCGCACGGCTGGAATCTACAGGGGCTAGAAGGGAGCTGCTTGCCACTGTGGCCAGACATGGCGGAGAAGGAAGGGTCTATGGACAGCTGGGCAGTATCAGCAATAGAACAGTGCTTGGAAAGGACTCCGCATCAGTCCGTCAGGAACGGGGAGTCAAAGCCACAAGAGACGGCCTGACAAGCGCCGAACTCCTCCGACTGGCCTATATCGACACCGTGACAGCAAGAGCCATCCAGGAGAGCGAGGCTAGGGGCAACGCCGCCATCCTGACCCTGCACGAGCAGGTCGCAAGATCTGAGCGGCAGTCATGGGAGAGAGCCGGACAGGTGCAGCGCGTGGGA
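
A quick sanity check on any codon-optimised sequence is that it still encodes the same protein. The sketch below builds the standard codon table, translates both versions, and counts how many codons were swapped. It is an independent check written for illustration, not part of the VectorBuilder tool, and the example uses only the first ten codons of each sequence.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 0 ("*" marks a stop codon)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def codon_changes(original: str, optimized: str) -> int:
    """Count swapped codons, after checking both encode the same protein."""
    if translate(original) != translate(optimized):
        raise ValueError("sequences do not encode the same protein")
    a, b = original.upper(), optimized.upper()
    return sum(a[i:i + 3] != b[i:i + 3] for i in range(0, len(a), 3))


if __name__ == "__main__":
    before = "atgaaaaacgcgccgctgaccctgaacttt"  # reverse-translated, first 10 codons
    after = "ATGAAGAACGCTCCACTGACCCTGAACTTT"   # rat-optimised, first 10 codons
    print(translate(after), codon_changes(before, after))
```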

3.4. You have a sequence! Now what?

  1. What technologies could be used to produce this protein from your DNA? Describe in your words how the DNA sequence can be transcribed and translated into your protein. You may describe either cell-dependent or cell-free methods, or both.

Cell-free protein synthesis, using extracted cellular machinery:

  1. The DNA is added to a reaction mixture; the coding sequence starts with ATG.

  2. RNA polymerase, part of the transcription machinery, binds to the DNA and transcribes it into mRNA.

  3. The mRNA that is produced carries the codons.

  4. The mRNA is translated by the reaction mixture, which contains ribosomes, tRNA molecules, the 20 amino acids and ATP.

  5. Translation starts with ribosomes binding to the mRNA. It begins at the start codon AUG, and each codon is read in order. tRNAs bring the correct amino acids, and the amino acids are joined to form a polypeptide chain.

  6. Translation ends at a stop codon and the protein is released.

  7. The protein then folds into the correct shape and becomes functional.
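
The transcription and translation steps can be sketched in code. This is a simplified model, assuming the input is the coding (sense) strand so that the mRNA is the same sequence with U in place of T; it ignores promoters, ribosome binding and folding. The short example sequence is illustrative only.

```python
from itertools import product

# Standard genetic code over RNA bases, codons ordered UUU, UUC, UUA, UUG, ...
RNA_BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
RNA_CODONS = {
    "".join(codon): aa
    for codon, aa in zip(product(RNA_BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """mRNA has the coding strand's sequence with U instead of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon or the end."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = RNA_CODONS[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGAAGAACGCTTAA")
    print(mrna, translate(mrna))
```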

3.5. (Optional) How does it work in nature/biological systems?

  1. Describe how a single gene codes for multiple proteins at the transcriptional level.

A single gene can code for multiple proteins through alternative splicing, a process in which different exons of the gene’s pre-mRNA transcript are selected and combined in different ways. This results in the production of different protein variants with unique functions (Science of Bio Genetics, 2023).

  2. Try aligning the DNA sequence, the transcribed RNA, and also the resulting translated Protein!!! See example below.

Created new DNA sequence entry and translated to Protein

Created mRNA from DNA

Combined DNA, mRNA and Protein

Part 4: Prepare a Twist DNA Synthesis Order

4.1. Create a Twist account and a Benchling account

4.2. Build Your DNA Insert Sequence

Promoter (e.g. BBa_J23106): TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC

RBS (e.g. BBa_B0034 with spacers for optimal expression): CATTAAAGAGGAGAAAGGTACC

Start Codon: ATG

Coding Sequence: codon-optimized DNA

7x His Tag: CATCACCATCACCATCATCAC

Stop Codon: TAA

Terminator (e.g. BBa_B0015): CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA
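
The insert is the parts above joined in order. The sketch below does that concatenation in Python; the part sequences are copied from this page, while the handling of the coding sequence (dropping a leading ATG and a trailing stop codon so they are not duplicated) is an assumption about how the parts should fit together, and the short CDS is a placeholder for the full codon-optimised sequence.

```python
PROMOTER = "TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC"   # BBa_J23106
RBS = "CATTAAAGAGGAGAAAGGTACC"                     # BBa_B0034 with spacers
START = "ATG"
HIS_TAG = "CATCACCATCACCATCATCAC"                  # 7x His
STOP = "TAA"
TERMINATOR = (                                     # BBa_B0015
    "CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGG"
    "TGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA"
)


def build_insert(cds: str) -> str:
    """Join promoter, RBS, start, CDS, His tag, stop and terminator."""
    cds = cds.upper()
    if cds.startswith(START):          # the start codon is added as its own part
        cds = cds[len(START):]
    if cds[-3:] in {"TAA", "TAG", "TGA"}:
        cds = cds[:-3]                 # the stop must come after the His tag
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return PROMOTER + RBS + START + cds + HIS_TAG + STOP + TERMINATOR


if __name__ == "__main__":
    # Placeholder: first ten codons of the codon-optimised DdrC sequence.
    insert = build_insert("ATGAAGAACGCTCCACTGACCCTGAACTTT")
    print(len(insert), "nt")
```

Keeping the His tag in frame, directly before the stop codon, is what lets the tag be translated as part of the protein.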

Export Sequence

4.3. On Twist, Select The “Genes” Option

4.4. Select “Clonal Genes” option

4.5. Import your sequence

4.6. Choose Your Vector

Download Construct (GenBank) and import to Benchling

Part 5: DNA Read/Write/Edit

a) What DNA would you want to sequence (e.g., read) and why?

I want to sequence the DNA (genome) of the bacterium Paenibacillus vortex, seen in figure 1, because of this microorganism’s social motility and its ability to employ cell-to-cell signaling to prompt activities such as attraction and repulsion under different environmental conditions (Myers, 2012). For example, when grown on hard surfaces, P. vortex generates aggregates of dense bacteria that are pushed forward by repulsive chemotactic signals sensed from the cells at the back (Sirota-Madi et al., 2010). When grown on soft surfaces, these microorganisms can exhibit collective motility by forming foraging arms that are sent out in search of food (Myers, 2012).

To achieve these cooperative ventures, P. vortex cells have to be able to communicate with each other. The information exchanged concerns population size, a myriad of individual environmental measurements at different locations, their internal states, and their phenotypic and epigenetic adjustments (Sirota-Madi et al., 2010). I want to know which sensors guide them as they navigate the environment, and which transducers and responders process the information into action. How do they communicate with each other?

Figure 1 (Sirota-Madi et al., 2010): Colony organisation of the P. vortex bacteria. A. Whole-colony view of P. vortex (grown on 15 g/l peptone and 2.25% (w/v) agar for four days). B. Two colonies of P. vortex (inoculated in two parallel lines, grown on 15 g/l peptone and 2.25% (w/v) agar). C. Colony pattern and vortex process (magnification ×20). D. Mature individual vortex (magnification ×500). E. Each individual bacterium has curvature (scans from an electron microscope).

b) What technology or technologies would you use to perform this sequencing and why?

With reference to the research paper Genome sequence of the pattern forming Paenibacillus vortex bacterium reveals potential for thriving in complex environments, the researchers used a hybrid approach of 454 Life Sciences and Illumina sequencing, achieving a total of 289× coverage, with 99.8% sequence identity between the two methods (Sirota-Madi et al., 2010). This is because the researchers claimed that P. vortex has an exceptionally large number of signal transduction genes and a high combined regulatory score (TCS, TFs, transport, defense genes). To verify this, they would require a highly accurate genome, minimal assembly artifacts and reliable gene annotation. The hybrid approach can reduce the distortions that come from using a single sequencing technology.

i) Is your method first-, second-, or third-generation or other? How so?

Illumina and 454 Life Sciences are both second-generation sequencing technologies. 454 provides longer reads, which give better contiguity, whilst Illumina provides massive depth and hence higher accuracy. The researchers chose this combination of methods because of the de novo assembly problem (no reference genome). Short reads alone (especially early Illumina reads) are difficult to assemble reliably, and the hybrid approach allowed more reliable identification of regulatory and signal transduction genes.

ii) What is your input? How do you prepare your input? List the essential steps.

Input: genomic DNA extracted from a P. vortex culture.

Step 1: Bacterial growth

  • P. vortex grown overnight in LB medium
  • 37 °C, shaking at 200 rpm
  • Ensures sufficient biomass

Step 2: Cell Harvesting

  • DNA extracted from 2 ml culture (~10⁹ cells/ml)

Step 3: Cell Lysis

  • Cells incubated with lysozyme for 45 minutes
  • Breaks down bacterial cell wall (important for Gram-positive bacteria)

Step 4: DNA Extraction

  • Qiagen DNeasy Blood & Tissue Kit
  • Removes proteins, lipids, and other contaminants

Step 5: DNA Elution

  • Eluted in 200 µl AE buffer (10 mM Tris-HCl, 0.5 mM EDTA pH 9.0)

iii) What are the essential steps of your chosen sequencing technology? How does it decode the bases of your DNA sample?

Sequencing approach incorporated 454 pyrosequencing with Illumina Genome Analyzer.

Figure 2, (MacLean, Jones, & Studholme, 2009)

Referring to MacLean, Jones, and Studholme (2009), seen in figure 2, the essential steps of 454 pyrosequencing are:

  1. DNA fragmentation and adapter ligation

    Genomic DNA is broken into small fragments and short adapter sequences are attached.

  2. Bead attachment and emulsion PCR

    Each DNA fragment is attached to a microscopic bead and amplified inside an oil droplet.

  3. Bead loading into wells

    DNA-coated beads are placed into tiny wells, one bead per well.

  4. Sequencing by nucleotide incorporation

    DNA polymerase adds nucleotides to the growing DNA strand one type at a time.

  5. Light generation (pyrosequencing)

    When a nucleotide is incorporated, pyrophosphate is released, triggering a reaction that produces light.

  6. Signal detection and base calling

    The emitted light is recorded and used to determine the DNA sequence.
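
Steps 4 to 6 can be illustrated with a toy model of how light intensities become a sequence. The sketch assumes nucleotides are flowed in a repeating T, A, C, G order and that the light signal is exactly proportional to the number of bases incorporated in that flow; the flow order and the noise-free signal are simplifying assumptions, not details taken from the cited figure.

```python
from itertools import cycle


def flowgram(read: str, flow_order: str = "TACG") -> list[tuple[str, int]]:
    """Simulate ideal signals: (flowed base, number of bases incorporated)."""
    if set(read) - set(flow_order):
        raise ValueError("read contains a base that is never flowed")
    signals, pos = [], 0
    for base in cycle(flow_order):
        if pos >= len(read):
            break
        run = 0
        # A homopolymer run is incorporated in a single flow, giving more light.
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
    return signals


def base_call(signals: list[tuple[str, float]]) -> str:
    """Decode the sequence by rounding each intensity to a base count."""
    return "".join(base * round(intensity) for base, intensity in signals)


if __name__ == "__main__":
    signals = flowgram("TTAGGC")
    print(signals)
    print(base_call(signals))
```

Because a run of identical bases is read as one brighter flash, rounding errors on long homopolymers are a typical weakness of this chemistry.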

Figure 3, (MacLean, Jones, & Studholme, 2009)

Referring to MacLean, Jones, and Studholme (2009), seen in figure 3, the essential steps of Illumina Genome Analyzer sequencing are:

  1. DNA fragmentation

    The library is a mixture of single-stranded, adaptor oligo-ligated DNA fragments.

  2. Attachment to the flow cell

  • A microfluidic cluster station adds these fragments to the surface of a glass flow cell.

  • Each flow cell is divided into eight separate lanes, and the interior surfaces have covalently attached oligos complementary to the specific adapters that are ligated onto the library fragments.

  3. Cluster amplification (bridge PCR)
  • Hybridization of these DNAs to the oligos on the flow cell occurs by an active heating and cooling step.
  • Subsequent incubation with reactants and an isothermal polymerase amplifies the fragments in a discrete area or ‘cluster’ on the flow cell surface.
  4. Flow cell is placed into a fluidics cassette within the sequencer
  • Each cluster is supplied with polymerase and four differentially labeled fluorescent nucleotides that have their 3′-OH chemically inactivated to ensure that only a single base is incorporated per cycle (MacLean, Jones, & Studholme, 2009).
  5. Identify the incorporated nucleotide at each cluster
  • Each base incorporation cycle is followed by an imaging step, and by a chemical step that removes the fluorescent group and deblocks the 3′ end for the next base incorporation cycle (MacLean, Jones, & Studholme, 2009).
  6. End of the sequencing run
  • The sequence of each cluster is computed and subjected to quality filtering to eliminate low-quality reads of between 32 and 40 bp (MacLean, Jones, & Studholme, 2009).

iv) What is the output of your chosen sequencing technology?

The outputs are raw sequence reads with per-base quality scores, stored as FASTQ files.
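
A FASTQ record has four lines: a header starting with @, the bases, a + separator, and one quality character per base. The sketch below parses such records and converts quality characters to Phred scores, assuming the Phred+33 encoding; that offset is an assumption, since older Illumina pipelines used a different one. The example record is made up.

```python
def parse_fastq(text: str):
    """Yield (read_id, sequence, phred_scores) for each 4-line record."""
    lines = [line.strip() for line in text.strip().splitlines()]
    for i in range(0, len(lines), 4):
        header, sequence, _separator, quality = lines[i:i + 4]
        # Phred+33 (assumed): score = ASCII code of the character minus 33.
        yield header[1:], sequence, [ord(ch) - 33 for ch in quality]


def error_probability(phred: int) -> float:
    """Phred score Q means the base call is wrong with probability 10^(-Q/10)."""
    return 10 ** (-phred / 10)


EXAMPLE = """@read1
ACGT
+
II5!
"""

if __name__ == "__main__":
    for read_id, sequence, scores in parse_fastq(EXAMPLE):
        mean_q = sum(scores) / len(scores)
        print(read_id, sequence, scores, f"mean Q = {mean_q:.1f}")
```

Quality filtering then amounts to dropping or trimming reads whose scores fall below a chosen threshold.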

DNA WRITE

a) What DNA would you want to synthesize (e.g., write) and why?

I want to synthesise the gene encoding green fluorescent protein (GFP) from the jellyfish Aequorea victoria and carry it on a plasmid, with the aim of emitting light naturally without electricity. GFP is fluorescent rather than luminescent, so it glows only while it is excited by blue or UV light.

b) What technology or technologies would you use to perform this DNA synthesis and why?

The technology I would use to perform this DNA synthesis is CRISPR-Cas9, because it is extremely precise and can insert GFP exactly at the desired location. It works in bacteria, yeast and plants, and allows for stable genome integration.

DNA EDIT

a) What DNA would you want to edit and why?

  • limonene synthase (Prem Kumar, Morehouse, Yu, & Oprian, 2019)

The protein that I want to edit is limonene synthase (Citrus sinensis), because I think it is interesting to edit smells. What would happen if you edited the smell of sweet orange and grew a fruit with the edited scent? Would it reduce fruit quality, or leave the fruit with no orange smell? How does smell evolve? What environmental factors can change the DNA behind certain smells?

b) What technology or technologies would you use to perform these DNA edits and why?

I could use base editors, which edit DNA without creating double-strand breaks. With them I can introduce specific point mutations in limonene synthase to tweak the enzyme’s function.

Bibliography

Gueguen, E., Bruto, M., Lemaire, D., Bertrand, E., Fichant, G., & Graille, M. (2022). Structural and functional characterization of DdrC, a novel DNA damage-induced nucleoid associated protein involved in DNA compaction. Nucleic Acids Research, 50(13). https://doi.org/10.1093/nar/gkac563

MacLean, D., Jones, J. D., & Studholme, D. J. (2009). Application of “next-generation” sequencing technologies to microbial genetics. Nature Reviews Microbiology, 7(4), 287–296. https://doi.org/10.1038/nrmicro2122

Morehouse, B. R., Kumar, R. P., Matos, J. O., Yu, Q., Bannister, A., Malik, K., Temme, J. S., Krauss, I. J., & Oprian, D. D. (2019). Direct evidence of an enzyme-generated LPP intermediate in (+)-limonene synthase using a fluorinated GPP substrate analog. ACS Chemical Biology, 14, 2035–2043. https://doi.org/10.1021/acschembio.9b00514

Myers, W. (2018). Bio design: Nature, science, creativity (Paola Antonelli, Foreword). Thames & Hudson.

Sirota-Madi, A., Olender, T., Helman, Y., Ingham, C., Brainis, I., Roth, D., Hagi, E., Brodsky, L., Leshkowitz, D., Galatenko, V., Nikolaev, V., Mugasimangalam, R. C., Bransburg-Zabary, S., Gutnick, D. L., Lancet, D., & Ben-Jacob, E. (2010). Genome sequence of the pattern forming Paenibacillus vortex bacterium reveals potential for thriving in complex environments. BMC Genomics, 11, 710. https://doi.org/10.1186/1471-2164-11-710

The unique mechanism of DdrC in enhancing DNA stability. (2024, December 30). Optimise.mfm.au. Retrieved from https://optimise.mfm.au/research/the-unique-mechanism-of-ddrc-in-enhancing-dna-stability/

Unknown author. (n.d.). JellyFish [Photograph]. The Pipette Pen. http://www.thepipettepen.com/wp-content/uploads/2016/04/7338555110_bc2ba135a5_z.jpg

Week 3 HW: Lab Automation

Generate an artistic design using the GUI at opentrons-art.rcdonovan.com

I wanted to draw Punch, the famous baby Japanese macaque from the zoo in Japan. The GUI created an approximate outline, which I used as a base and added more details in different colours on top.

This was the end result.

These were the settings with the coordinates.

Using the coordinates from the GUI, follow the instructions in the HTGAA26 Opentrons Colab to write your own Python script which draws your design using the Opentrons
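
Before writing the robot script, it can help to plan the drawing in plain Python. The sketch below is not the Opentrons API and does not reproduce the Colab's helper functions; it only groups coordinates by colour and splits them into batches that fit one pipette load. The drop volume, tip capacity, colour names and points are all hypothetical values chosen for illustration.

```python
from collections import defaultdict


def plan_dispenses(points, drop_ul=1.0, tip_capacity_ul=20.0):
    """Turn (colour, x_mm, y_mm) points into aspirate/dispense steps.

    Points of one colour are drawn together so each colour needs a single
    tip, and each aspirate covers as many drops as the tip can hold.
    """
    by_colour = defaultdict(list)
    for colour, x, y in points:
        by_colour[colour].append((x, y))

    drops_per_load = int(tip_capacity_ul // drop_ul)
    steps = []
    for colour, coords in by_colour.items():
        for i in range(0, len(coords), drops_per_load):
            batch = coords[i:i + drops_per_load]
            steps.append(("aspirate", colour, len(batch) * drop_ul))
            steps.extend(("dispense", colour, xy) for xy in batch)
    return steps


if __name__ == "__main__":
    # Hypothetical coordinates, in mm from the centre of the plate.
    design = [("red", 0, 0), ("green", 1, 1), ("red", 2, 0)]
    for step in plan_dispenses(design, tip_capacity_ul=2.0):
        print(step)
```

Each planned step would then be translated into the corresponding pipette command in the actual protocol script.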

1. Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.

2. Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details.

3. While your description/project idea doesn’t need to be set in stone, we would like to see core details of what you would automate. This is due at the start of lecture and does not need to be tested on the Opentrons yet.

Week 4 HW: Protein Design Part I

Part A. Conceptual Questions

Answer any NINE of the following questions from Shuguang Zhang: (i.e. you can select two to skip)

  1. How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)
  2. Why do humans eat beef but do not become a cow, eat fish but do not become fish?
  3. Why are there only 20 natural amino acids?
  4. Can you make other non-natural amino acids? Design some new amino acids.
  5. Where did amino acids come from before enzymes that make them, and before life started?
  6. If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?
  7. Can you discover additional helices in proteins?
  8. Why are most molecular helices right-handed?
  9. Why do β-sheets tend to aggregate?
  10. What is the driving force for β-sheet aggregation?
  11. Why do many amyloid diseases form β-sheets?
  12. Can you use amyloid β-sheets as materials?
  13. Design a β-sheet motif that forms a well-ordered structure.

Part B: Protein Analysis and Visualization

  1. Briefly describe the protein you selected and why you selected it.
  2. Identify the amino acid sequence of your protein.
  3. How long is it? What is the most frequent amino acid? You can use this Colab notebook to count the frequency of amino acids.
  4. How many protein sequence homologs are there for your protein? Hint: Use Uniprot’s BLAST tool to search for homologs.
  5. Does your protein belong to any protein family?
  6. Identify the structure page of your protein in RCSB
  7. When was the structure solved? Is it a good quality structure? Good quality structure is the one with good resolution. Smaller the better (Resolution: 2.70 Å)
  8. Are there any other molecules in the solved structure apart from protein?
  9. Does your protein belong to any structure classification family?
  10. Open the structure of your protein in any 3D molecule visualization software:
  11. PyMol Tutorial Here (hint: ChatGPT is good at PyMol commands)
  12. Visualize the protein as “cartoon”, “ribbon” and “ball and stick”.
  13. Color the protein by secondary structure. Does it have more helices or sheets?
  14. Color the protein by residue type. What can you tell about the distribution of hydrophobic vs hydrophilic residues?
  15. Visualize the surface of the protein. Does it have any “holes” (aka binding pockets)?
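
For question 3, the length and amino acid frequencies can also be computed locally. The sketch below uses the DdrC sequence from Week 2 as a stand-in, which is an assumption, since the protein chosen for this part may differ; it is not the course's Colab notebook.

```python
from collections import Counter

# DdrC (UniProt Q9RYE6), as listed in the Week 2 homework on this page.
DDRC = (
    "MKNAPLTLNFGSVRLPVSADGLLHAPTAQQQLGLTQSWEAALVEHGLPETYRDFGAGPEA"
    "AVSVPDFVALAFALDTPEARRWQKRARELLARAMQGDVRVAAQIAERNPEPDARRWLAAR"
    "LESTGARRELLATVARHGGEGRVYGQLGSISNRTVLGKDSASVRQERGVKATRDGLTSAE"
    "LLRLAYIDTVTARAIQESEARGNAAILTLHEQVARSERQSWERAGQVQRVG"
)


def residue_frequencies(sequence: str):
    """Return (residue, count, fraction) tuples, most frequent first."""
    counts = Counter(sequence)
    total = len(sequence)
    return [(aa, n, n / total) for aa, n in counts.most_common()]


if __name__ == "__main__":
    print(f"Length: {len(DDRC)} residues")
    for aa, n, fraction in residue_frequencies(DDRC)[:5]:
        print(f"{aa}: {n} ({fraction:.1%})")
```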

Part C. Using ML-Based Protein Design Tools

In this section, we will learn about the capabilities of modern protein AI models and test some of them in your chosen protein.

Copy the HTGAA_ProteinDesign2026.ipynb notebook and set up a colab instance with GPU.

Choose your favorite protein from the PDB.

We will now try multiple things in the three sections below; report each of these results in your homework writeup on your HTGAA website:

C1. Protein Language Modeling

  1. Deep Mutational Scans

a) Use ESM2 to generate an unsupervised deep mutational scan of your protein based on language model likelihoods.

b) Can you explain any particular pattern? (choose a residue and a mutation that stands out)

c) (Bonus) Find sequences for which we have experimental scans, and compare the prediction of the language model to experiment.

  2. Latent Space Analysis

a) Use the provided sequence dataset to embed proteins in reduced dimensionality.

b) Analyze the different formed neighborhoods: do they approximate similar proteins?

c) Place your protein in the resulting map and explain its position and similarity to its neighbors.

C2. Protein Folding

  1. Folding a protein

a) Fold your protein with ESMFold. Do the predicted coordinates match your original structure?

b) Try changing the sequence, first try some mutations, then large segments. Is your protein structure resilient to mutations?

C3. Protein Generation

  1. Inverse-Folding a protein: Let’s now use the backbone of your chosen PDB to propose sequence candidates via ProteinMPNN

a) Analyze the predicted sequence probabilities and compare the predicted sequence vs the original one.

b) Input this sequence into ESMFold and compare the predicted structure to your original.

Part D. Group Brainstorm on Bacteriophage Engineering

  1. Find a group of ~3–4 students

  2. Read through the Phage Reading material listed under “Reading & Resources” below.

  3. Review the Bacteriophage Final Project Goals for engineering the L Protein:

  • Increased stability (easiest)
  • Higher titers (medium)
  • Higher toxicity of lysis protein (hard)

Brainstorm Session

  1. Choose one or two main goals from the list that you think you can address computationally (e.g., “We’ll try to stabilize the lysis protein,” or “We’ll attempt to disrupt its interaction with E. coli DnaJ.”).

  2. Write a 1-page proposal (bullet points or short paragraphs) describing:

  • Which tools/approaches from recitation you propose using (e.g., “Use Protein Language Models to do in silico mutagenesis, then AlphaFold-Multimer to check complexes.”).

  • Why do you think those tools might help solve your chosen sub-problem?

  • Name one or two potential pitfalls (e.g., “We lack enough training data on phage–bacteria interactions.”).

  • Include a schematic of your pipeline.

This resource may be useful: HTGAA Protein Engineering Tools

Each individually put your plan on your HTGAA website

Include your group’s short plan for engineering a bacteriophage

Subsections of Labs

Week 1 Lab: Pipetting


Subsections of Projects

Individual Final Project


Group Final Project
