Jessica Wu — HTGAA Spring 2026
About me
Mechanical Engineer
Contact info
Homework
- Week 1 HW: Principles and Practices
- Week 2 HW: DNA Read, Write, & Edit
- Week 3 HW: Lab Automation
- Week 4 HW: Protein Design Part I
Week 1 HW: Principles and Practices
Class Assignment 1. First, describe a biological engineering application or tool you want to develop and why.
There is currently an urgent research focus on the biodegradation of plastics, due to the extremely long life cycle of synthetic polymers. Prior work has focused on a mix of exploring bacterial and microbial processes (e.g. anaerobic digestion) to break down plastics, and developing compositions that are commercially compostable (e.g. for single-use plastics). My personal interest is in fiber arts and sustainability, so I’d like to tackle this problem from a textile perspective. Fast fashion has exacerbated the volume of cheap, low-quality clothes produced every day. These clothes are often made with synthetic fibers and not meant for long-term use (although synthetic does not automatically mean short-lived). I believe it’s incredibly important to find a way to biodegrade polyester, one of the most common synthetic polymers in fast fashion clothing.
Action 1: Standardization of Process
Action 2: Polyester Tracking for Success Metrics
Action 3: Community Awareness
Action 4: Regulation on Composite Material Production
I’d focus on Process Standardization, as safety is the absolute first priority. Following that, between Success Metrics and Community Awareness, both have potential to contribute to a circular economy, but I’d like to prioritize Success Metrics for its potential to better target impacted areas down the line. So I’d work on a more technical level to develop more effective processes and data collection (which would likely involve academic institutions/environment-focused agencies).
I’m wary of how effective we’d be in a global setting, especially since the impact I’m hoping for depends on how well we can influence overseas institutions, where I believe most fast fashion waste is produced and accumulates.
1. Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?
The error rate for polymerase is about 1:10^6, i.e. roughly 1 out of every 1,000,000 bases copied might be wrong. Meanwhile, the human genome spans billions of base pairs, with a diploid genome being about 6.3 gigabase pairs (or 3.2 gigabase pairs for a haploid one) [1]. At that raw rate, copying a single haploid genome would introduce on the order of 3,000 errors. However, the polymerase can go through a proofreading process where its exonuclease activity removes a mismatched nucleotide (released as a whole nucleoside monophosphate) [2], essentially allowing the enzyme to “backspace” before continuing.
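To make the mismatch between error rate and genome size concrete, here is a small back-of-the-envelope sketch. It assumes a flat per-base error rate (a simplification) and uses only the figures quoted above; the function name is my own.

```python
# Rough estimate of copying errors per genome replication.
# Assumption: a single flat per-base error rate, which real polymerases only approximate.

HAPLOID_BP = 3.2e9  # haploid human genome size quoted above
DIPLOID_BP = 6.3e9  # diploid human genome size quoted above


def expected_errors(genome_bp: float, error_rate: float) -> float:
    """Expected number of miscopied bases when genome_bp bases are copied once."""
    return genome_bp * error_rate


if __name__ == "__main__":
    raw_rate = 1e-6  # 1 error per 10^6 bases, as stated above
    print(f"haploid: ~{expected_errors(HAPLOID_BP, raw_rate):,.0f} errors per copy")
    print(f"diploid: ~{expected_errors(DIPLOID_BP, raw_rate):,.0f} errors per copy")
```

Even a "one in a million" rate adds up to thousands of errors per copy, which is why proofreading matters.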
2. How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?
So that’s clearly an excessive amount of variation for a single protein. However, organisms have developed something called “codon usage bias”, a preference for certain codons that evolved over time. This can happen for several reasons [3].
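As a rough illustration of how large that variation gets, the sketch below counts the DNA sequences that encode a given peptide under the standard genetic code. The peptide in the demo is a made-up example, not a real "average human protein", and the helper names are my own.

```python
from collections import Counter
from itertools import product
from math import log10

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Number of codons per amino acid ("*" marks the stop codons)
DEGENERACY = Counter(CODON_TABLE.values())


def count_encodings(protein: str) -> int:
    """Number of distinct DNA sequences that translate to this amino acid string."""
    total = 1
    for aa in protein.upper():
        if aa not in DEGENERACY or aa == "*":
            raise ValueError(f"not a standard amino acid: {aa!r}")
        total *= DEGENERACY[aa]
    return total


if __name__ == "__main__":
    peptide = "MKWLSR"  # hypothetical toy peptide, for illustration only
    n = count_encodings(peptide)
    print(f"{peptide}: {n} possible coding sequences (about 10^{log10(n):.1f})")
```

With 61 sense codons spread over 20 amino acids, each residue multiplies the count by about 3 on average, so the total grows exponentially with protein length.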
1. What’s the most commonly used method for oligo synthesis currently?
Oligonucleotides are defined as DNA chains with a length under 200 nucleotides [5]. Oligo synthesis began with solid-phase synthesis, with additional chemistries (phosphodiester, phosphotriester, phosphite triester, phosphoramidite) developed up until the 1980s. Currently, solid-phase synthesis using the phosphoramidite method is the most common approach; the process was leveraged to implement the first automated DNA synthesizer and has since been optimized for high DNA production volume/thermal control [5].
2. Why is it difficult to make oligos longer than 200nt via direct synthesis?
Longer chains have reduced theoretical yield, since each additional nucleotide comes with its own “elongation cycle efficiency” (think success rate per step) and the losses stack up [5]. This is calculated with the equation theoretical yield = (elongation cycle efficiency)^nt, where nt is the number of nucleotides. Assuming an efficiency of 99%, a 200 nt oligo gives 0.99^200 ≈ 0.13, so only about 13% of the chains come out full length.
The phosphoramidite method in particular becomes ineffective beyond about 200 nucleotides. As a result, more recent alternatives (e.g. enzymatic synthesis) are being explored as research turns to using longer sequences.
3. Why can’t you make a 2000bp gene via direct oligo synthesis?
If we further calculate using the above equation with the same 99% efficiency, the yield for 2000 nt becomes 0.99^2000 ≈ 2 x 10^-9,
or effectively zero. At such a length, the individual “error rates” compound, leaving no realistic chance of success. Current efforts try to improve the process, e.g. by increasing the elongation cycle efficiency, or use workarounds like making batches of shorter segments to link together [5].
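The yield equation from the previous two answers is easy to tabulate. This is a minimal sketch: the 99% figure is the assumption used above, and the other efficiency is only there to show how sensitive the result is.

```python
def theoretical_yield(cycle_efficiency: float, length_nt: int) -> float:
    """Fraction of chains that reach full length after length_nt coupling cycles."""
    return cycle_efficiency ** length_nt


if __name__ == "__main__":
    for efficiency in (0.99, 0.995):  # 0.995 is an arbitrary comparison value
        for length in (20, 200, 2000):
            y = theoretical_yield(efficiency, length)
            print(f"efficiency={efficiency:.3f}  length={length:>4} nt  yield={y:.3e}")
```

Because the length sits in the exponent, small gains in per-cycle efficiency matter a lot at 200 nt, but no realistic efficiency rescues a 2000 nt direct synthesis.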
1. [Given slides #2 & 4 (AA:NA and NA:NA codes)] What code would you suggest for AA:AA interactions?
Not sure I’m fully understanding the question, but given there are 64 possible codons for amino acids yet only 20 amino acids, I’d create a code where all possible codons are inputted and outputted as “plaintext” and “ciphertext”, with the encryption “key” being the 20 amino acids they could be interpreted as. Something like the drawing below:
This could even be further streamlined for repeating letters:
(Prof. Church’s slides and paper at [6] used for reference.)
[1] Piovesan A, Pelleri MC, Antonaros F, Strippoli P, Caracausi M, Vitale L. On the length, weight and GC content of the human genome. BMC Res Notes. 2019 Feb 27;12(1):106. https://doi.org/10.1186/s13104-019-4137-z
[2] Hopfield, JJ. Kinetic proofreading: a new mechanism for reducing errors in biosynthetic processes requiring high specificity. Proc Natl Acad Sci USA. 1974 Oct; 71(10):4135-9. https://doi.org/10.1073/pnas.71.10.4135
[3] Ford, T. Plasmids 101: Codon usage bias. addgene Blog. 2018 Sept. https://blog.addgene.org/plasmids-101-codon-usage-bias
[4] Athey J, Alexaki A, Osipova E, Rostovtsev A, Santana-Quintero LV, Katneni U, Simonyan V, Kimchi-Sarfaty C. A new and updated resource for codon usage tables. BMC Bioinformatics. 2017 Sep 2; 18(1):391. https://doi.org/10.1186/s12859-017-1793-7
[5] Hoose A, Vellacott R, Storch M, et al. DNA synthesis technologies to close the gene writing gap. Nat Rev Chem. 2023 Jan; 7:144-161. https://doi.org/10.1038/s41570-022-00456-9
[6] Acevedo-Rocha CG, Budisa N. Xenomicrobiology: a roadmap for genetic code engineering. Microb Biotechnol. 2016 Sep; 9(5):666-76. https://doi.org/10.1111/1751-7915.12398
Week 2 HW: DNA Read, Write, & Edit
Part 1: Benchling & In-silico Gel Art
Had an initial mess-up where I tried to “speedrun” the process and ended up with a single lane packed with the combined effects of multiple restriction enzymes.
Finally got success by running each of the listed enzymes separately.

Some experimentation on Ronan’s website got me this pattern that sort of looks like a pair of pants. In hindsight I should’ve definitely explored results from a combination of enzymes (e.g. EcoRI and HindIII together), which would’ve given me a bigger range of visual results.
Replicated “sort-of pants” on Benchling, and my final result.
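For the "combination of enzymes" idea, here is a minimal sketch of how single and double digests could be compared in code before going to Benchling. It assumes a linear molecule, only looks at the top strand, and the demo sequence is a made-up toy, not the DNA used in the assignment.

```python
import re

# Recognition site and cut offset (bases from the start of the site, top strand)
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
}


def cut_positions(seq: str, enzyme_names) -> list:
    """Sorted cut positions for the chosen enzymes on a linear sequence."""
    seq = seq.upper()
    cuts = set()
    for name in enzyme_names:
        site, offset = ENZYMES[name]
        # Lookahead so overlapping sites are not skipped
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)


def fragment_lengths(seq: str, enzyme_names) -> list:
    """Fragment sizes (largest first), i.e. the bands you would expect in one lane."""
    bounds = [0] + cut_positions(seq, enzyme_names) + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


if __name__ == "__main__":
    toy = "AAAAGAATTCCCCCAAGCTTGGGG"  # hypothetical sequence with one site for each enzyme
    for combo in (["EcoRI"], ["HindIII"], ["EcoRI", "HindIII"]):
        print("+".join(combo), fragment_lengths(toy, combo))
```

Each combination gives a different set of band sizes, which is where the extra range of visual results would come from.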

I recently read about snake venom and how, because it is mostly made of proteins/enzymes, it is (theoretically) edible, since it can be digested in the stomach. That was a pretty fun fact. For this assignment, I picked irditoxin, a three-finger toxin that is selectively neurotoxic towards birds and lizards (but not mammals).
I found two subunits on UniProt and went with A.
I located irditoxin subunit A in the European Nucleotide Archive, with the following DNA sequence:
Codon optimization can be useful in controlling gene expression within a sequence (both increasing and decreasing it). It can also make mRNA production more efficient and impact translation speed, which can in turn affect things like folding speed (fast-translated sequences fold while waiting for slow-translated sequences).
I tried out IDT’s Codon Optimization Tool. IDT recognized a couple of different start and stop codons, so I picked the sequence between the first start codon (ATG) and the first stop codon (TGA at position 330). The sequence was optimized for E. coli, to go with a standard, well-understood, commonly used organism.
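The "first ATG to first stop" selection can also be done by hand. Below is a minimal sketch that assumes the stop codon must be in the same reading frame as the start; I don't know whether IDT counts positions from 0 or 1, so the function returns the sequence itself rather than positions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(dna: str) -> str:
    """Sequence from the first ATG through the first in-frame stop codon ('' if none)."""
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return ""
    # Walk codon by codon from the start so only in-frame stops count
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return dna[start:i + 3]
    return ""  # ran off the end without hitting a stop


if __name__ == "__main__":
    toy = "CCATGAAATGATT"  # hypothetical sequence: ATG AAA TGA with flanking bases
    print(first_orf(toy))
```

Trimming to a clean reading frame first makes it easier to check that the optimized sequence still translates to the same protein.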
Shortened old sequence:
Optimized sequence:
Another mess-up: IDT denied the optimized sequence due to its complexity, which means this sequence isn’t currently manufacturable and needs to be further redesigned. Seems some of my enzyme recognition sites weren’t ideal.
What technologies could be used to produce this protein from your DNA? Describe in your words how the DNA sequence can be transcribed and translated into your protein. You may describe either cell-dependent or cell-free methods, or both.
As we’re working with a neurotoxin, and not for high-volume production, I’d probably turn to a cell-free method.
From the downloaded FASTA file:
My expression cassette can be accessed here.
Here’s my plasmid. She’s beautiful!
(i) What DNA would you want to sequence (e.g., read) and why? This could be DNA related to human health (e.g. genes related to disease research), environmental monitoring (e.g., sewage waste water, biodiversity analysis), and beyond (e.g. DNA data storage, biobank).
I think bioindicators are an interesting group of organisms, and sequencing them could help us isolate the genes that react to their surroundings, which could then be used in a more standardized, widespread environment monitoring tool. E.g. microalgae can detect a wide range of water quality issues, from heavy metals to nanoparticles, yet “only a few species have been fully sequenced without any gaps” [1].
(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why? Also answer the following questions: Is your method first-, second- or third-generation or other? How so? What is your input? How do you prepare your input (e.g. fragmentation, adapter ligation, PCR)? List the essential steps. What are the essential steps of your chosen sequencing technology, how does it decode the bases of your DNA sample (base calling)? What is the output of your chosen sequencing technology?
I’d likely use an NGS-style method that can handle whole genomes, and across the multiple species that we come across at that. So potentially something like PacBio, which is third-generation (it reads single DNA molecules in real time). Both the sample and the library need to be prepared: the sample needs to be purified, and the DNA needs to be fragmented to length and capped at both ends with adapters. The DNA is decoded by a polymerase that runs along the sequence. As each fluorescently labeled nucleotide is incorporated, it emits light, which is recorded live and appended onto the current sequence. The end result is a straightforward sequence of DNA nucleotides which, once exported as text (e.g. FASTA/FASTQ), can be read in any notepad app.
(i) What DNA would you want to synthesize (e.g., write) and why? These could be individual genes, clusters of genes or genetic circuits, whole genomes, and beyond. As described in class thus far, applications could range from therapeutics and drug discovery (e.g., mRNA vaccines and therapies) to novel biomaterials (e.g. structural proteins), to sensors (e.g., genetic circuits for sensing and responding to inflammation, environmental stimuli, etc.), to art (DNA origamis). If possible, include the specific genetic sequence(s) of what you would like to synthesize! You will have the opportunity to actually have Twist synthesize these DNA constructs! :)
I’m continuously interested in biomaterials as an alternative to things like plastic. There’s been a lot of work on getting plastic that is biodegradable, and I’m wondering if there’s a way to go about it from the opposite direction, like fortifying kombucha leather to last longer.
(ii) What technology or technologies would you use to perform this DNA synthesis and why? Also answer the following questions: What are the essential steps of your chosen sequencing methods? What are the limitations of your sequencing method (if any) in terms of speed, accuracy, scalability?
Solid-phase synthesis using the phosphoramidite method seems to be the go-to method, so I’d stick with that. The steps are coupling the next phosphoramidite monomer onto the growing chain, capping unreacted sites, oxidizing the new phosphite linkage to a phosphate, deblocking the end for the next addition, and then repeating as needed. The limitation of this process is that its yield drops past a certain length (~200 nt, as discussed last week), so it could become difficult as the needed sequence gets longer.
(i) What DNA would you want to edit and why? In class, George shared a variety of ways to edit the genes and genomes of humans and other organisms. Such DNA editing technologies have profound implications for human health, development, and even human longevity and human augmentation. DNA editing is also already commonly leveraged for flora and fauna, for example in nature conservation efforts, (animal/plant restoration, de-extinction), or in agriculture (e.g. plant breeding, nitrogen fixation). What kinds of edits might you want to make to DNA (e.g., human genomes and beyond) and why?
This feels like a touchy subject in line with our ethical considerations from last week. The responsibility of human genome editing seems enormous, so I’ll consider other organisms. I’m thinking of how filter feeders play an important role in the water ecosystem, essentially “purifying” the water. Could something like that be intentionally edited into plants (or other microorganisms) to boost their “purifying” effect on the air?
(ii) What technology or technologies would you use to perform these DNA edits and why? Also answer the following questions: How does your technology of choice edit DNA? What are the essential steps? What preparation do you need to do (e.g. design steps) and what is the input (e.g. DNA template, enzymes, plasmids, primers, guides, cells) for the editing? What are the limitations of your editing methods (if any) in terms of efficiency or precision?
I’m unsure of this part as I’m not too familiar with the work needed to introduce this behavior into land organisms. I believe CRISPR-Cas9 is a go-to for gene editing, so it would probably be a popular approach regardless. The process begins with a guide RNA finding the target sequence in the DNA, at which point Cas9 cuts both strands. Then either new DNA can be added at the cut (replacing the segment), or the cell repairs the break on its own (often disrupting or deleting the segment). For preparation, an appropriate guide RNA needs to be designed and sourced. A limitation of CRISPR-Cas9 is its precision; it has the potential to cut or insert DNA at the wrong place, which can lead to unintended mutations, a particular concern when applied to human genome editing.
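As a sketch of the "design steps" for the guide RNA: the commonly used Cas9 from Streptococcus pyogenes needs an NGG motif (the PAM) right after a roughly 20 nt target. The snippet below lists candidate targets on one strand only; the demo sequence is made up, and a real design tool would also score off-target matches, which this does not attempt.

```python
import re


def find_cas9_targets(dna: str, protospacer_len: int = 20) -> list:
    """List (start, protospacer, pam) for every NGG PAM on the given strand."""
    dna = dna.upper()
    # Lookahead so that overlapping candidates are all reported
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    toy = "ACGTACGTACGTACGTACGT" + "TGG" + "CCCAAA"  # hypothetical sequence
    for start, protospacer, pam in find_cas9_targets(toy):
        print(start, protospacer, pam)
```

Only sites with a PAM can be cut at all, so this kind of scan is usually the first filter before worrying about efficiency or precision.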
[1] Evangelia Stavridou, Lefkothea Karapetsi, Georgia Maria Nteve, Georgia Tsintzou, Marianna Chatzikonstantinou, Meropi Tsaousi, Angel Martinez, Pablo Flores, Marián Merino, Luka Dobrovic, José Luis Mullor, Stefan Martens, Leonardo Cerasino, Nico Salmaso, Maslin Osathanunkul, Nikolaos E. Labrou, Panagiotis Madesis, Landscape of microalgae omics and metabolic engineering research for strain improvement: An overview, Aquaculture, Volume 587, 2024, https://doi.org/10.1016/j.aquaculture.2024.740803.
Week 3 HW: Lab Automation
Python Script for Opentrons Artwork
Opentrons Art
I tried to play around with math functions to create a design, like the Mathematical Heart sample. I drew up a cute fox in the Desmos graphing calculator using the following functions, making sure to scale them to the 40 mm limit. Transferring that to Colab was a bit more difficult, and I had to tweak the functions, ranges, and dispense volume to find something that looked good.

Some notes from the process:
- […] draw() function (like the Mathematical Heart).
- […] pipette_20ul.current_volume) and is handled all at once in the draw() function.
I also used https://ginkgoartworks.com/ to draw a mushroom and imported the program into Colab. Since the bacteria names don’t register as RGB colors, I had to “color-correct” well_colors to get the visualization to show up (but I assume both versions will work as long as the PCR tubes are physically in order).
Colab link for both projects here, including the color-corrected version for the mushroom.
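For anyone trying the math-function route, here is a minimal sketch of the scaling step: sample a parametric curve and shrink it to fit the plate. It uses the classic heart curve rather than my fox functions, treats the 40 mm limit as a maximum distance from the plate center (my assumption), and stops at producing coordinates, so it contains no Opentrons calls.

```python
import math

MAX_RADIUS_MM = 40.0  # assumed meaning of the 40 mm limit: max distance from center


def heart(t: float):
    """Classic parametric heart curve (unscaled units)."""
    x = 16 * math.sin(t) ** 3
    y = 13 * math.cos(t) - 5 * math.cos(2 * t) - 2 * math.cos(3 * t) - math.cos(4 * t)
    return x, y


def sample_curve(func, t_start: float, t_end: float, n_points: int) -> list:
    """Evaluate func at n_points evenly spaced parameter values (n_points >= 2)."""
    step = (t_end - t_start) / (n_points - 1)
    return [func(t_start + i * step) for i in range(n_points)]


def fit_to_plate(points, max_radius: float = MAX_RADIUS_MM, margin: float = 0.9) -> list:
    """Center the points and scale them so the farthest one sits at margin * max_radius."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    centered = [(x - cx, y - cy) for x, y in points]
    farthest = max(math.hypot(x, y) for x, y in centered)
    if farthest == 0:
        return centered  # all points coincide, nothing to scale
    scale = margin * max_radius / farthest
    return [(x * scale, y * scale) for x, y in centered]


if __name__ == "__main__":
    dots = fit_to_plate(sample_curve(heart, 0.0, 2 * math.pi, 60))
    print(f"{len(dots)} dots, first at ({dots[0][0]:.1f}, {dots[0][1]:.1f}) mm")
```

The number of sample points plays the same role as the ranges I had to tune in Colab: too many and neighboring droplets merge, too few and the outline breaks up.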
Extra: Nebula art
I designed some artwork for the 1536-well plates on the Nebula, which were made during the Saturday 2-6 pm Cloud Lab session. The first one was a firefly squid, inspired by bioluminescent photos I’ve seen of them underwater. Link to gallery image here.
I also made a second one resembling the Chinese jianzhi for Lunar New Year. I experimented with two color sets, to see how bacteria with similar coloring would contrast against each other. Link to gallery images here and here.
This paper explores “dyeing” bacterial cellulose, a bioplastic alternative to leather, which I found pretty interesting. Instead of applying an independent biodegradable dye, the researchers engineered Komagataeibacter rhaeticus to produce eumelanin (dark melanin), which gives the range of shades seen in the photo above. Opentrons is used in the production of the eumelanin development buffer, in which the K. rhaeticus pellicle is incubated to “dye” itself. The robot mixes the buffer, cells, and supernatant within a reaction plate while maintaining a constant low temperature to prevent eumelanin from forming too early.
Walker, K.T., Li, I.S., Keane, J. et al. Self-pigmenting textiles grown from cellulose-producing bacteria with engineered tyrosinase expression. Nat Biotechnol 43, 345–354 (2025). https://doi.org/10.1038/s41587-024-02194-3
More research is needed on my part for this, but I’d like to explore scaling up or going in depth with the range of results for my project (e.g. if self-pigmenting, then trials to develop swatches of colors). This would require a lot of samples, and liquid handlers like the Opentrons would be necessary for producing all the samples identically.
Coloring Bioplastic/Biotextiles (with an art-focused approach)
a. Self-pigmenting Bacterial Cellulose: Building on the above paper, further development with dyed bacterial cellulose using pheomelanin instead of eumelanin for a different color range. Likely this is already being explored, so as an ambitious goal Komagataeibacter rhaeticus could be edited to express both pheomelanin and eumelanin, allowing you a 2D range of colors.
b. Structural Color on Textiles as Biopigment: Naturally occurring structural color is tied to the genome, so we could intentionally induce colors as natural, biodegradable dyes for textiles. This paper highlights bacteria that naturally organize into colonies that display structural color, and this paper explores gene knockouts to change the color expressed by one such bacterium, Flavobacterium IR1. A potential project could explore different colors in these bacteria, or find a way to introduce the bacteria to textiles without disrupting colony formation.
Environmental Sensors (Algae biosensors?)
Polyester Biodegradation
a. Integration with Byproduct Biodegradation: Ideonella sakaiensis is a bacterium that can break down PET plastic through a two-step process involving the PETase and MHETase enzymes. The end products include carbon dioxide, water, and methane, which is itself a pollutant. Methane is a subject of research in its own right, with methanotrophs being a type of bacteria that metabolize methane. I’m wondering if the bacteria used for breaking down plastic could somehow be given the added function of breaking down methane through genetic engineering.
b. Polyester-Eating Enzymes: This is a less familiar topic for me, but current work on enzymatic degradation focuses on improving the performance of natural enzymes, e.g. their thermostability, pH tolerance, etc. Since there’s such a wide range of work being done, I’m sure there’s some further testing that could be done on an understudied bacterium/performance metric/modification method.
Week 4 HW: Protein Design Part I
Part A. Conceptual Questions
Answer any NINE of the following questions from Shuguang Zhang:
Why are there only 20 natural amino acids?
Each natural amino acid is specified by a codon, which is a set of three nucleotides (each of which can be adenine, uracil, guanine, or cytosine). This gives 4 x 4 x 4 = 64 total codons, but three of them are stop signals and the rest are redundant, so together they encode only 20 unique amino acids.
I picked crystallin, a structural protein in the eye lens that helps keep the lens clear so it can focus light. It’s notably transparent, being part of the eye lens, and water-soluble, which was a callback to our lecture. I picked the protein because I was interested in how cataracts form.
The specific protein I went with for the following questions is P02511, or Alpha-crystallin B (in humans).
The AA sequence from UniProt is
>sp|P02511|CRYAB_HUMAN Alpha-crystallin B chain OS=Homo sapiens OX=9606 GN=CRYAB PE=1 SV=2
MDIAIHHPWIRRPFFPFHSPSRLFDQFFGEHLLESDLFPTSTSLSPFYLRPPSFLRAPSW
FDTGLSEMRLEKDRFSVNLDVKHFSPEELKVKVLGDVIEVHGKHEERQDEHGFISREFHR
KYRIPADVDPLTITSSLSSDGVLTVNGPRKQVSGPERTIPITREEKPAVTAAPKK
Using the Colab notebook, the protein is 175 amino acids long with the most common amino acid being P (and appearing 17 times).
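The length and composition check can be reproduced outside the course notebook with a few lines. This is my own minimal sketch (not the notebook's code), using the UniProt sequence given above.

```python
from collections import Counter

# Human alpha-crystallin B chain (UniProt P02511), as listed above
CRYAB = (
    "MDIAIHHPWIRRPFFPFHSPSRLFDQFFGEHLLESDLFPTSTSLSPFYLRPPSFLRAPSW"
    "FDTGLSEMRLEKDRFSVNLDVKHFSPEELKVKVLGDVIEVHGKHEERQDEHGFISREFHR"
    "KYRIPADVDPLTITSSLSSDGVLTVNGPRKQVSGPERTIPITREEKPAVTAAPKK"
)


def composition(seq: str) -> Counter:
    """Count how often each amino acid letter appears."""
    return Counter(seq.upper())


def top_residues(seq: str):
    """Return (all residues sharing the highest count, that count)."""
    counts = composition(seq)
    best = max(counts.values())
    return sorted(aa for aa, n in counts.items() if n == best), best


if __name__ == "__main__":
    residues, count = top_residues(CRYAB)
    print(f"length: {len(CRYAB)}")
    print(f"most common: {', '.join(residues)} ({count} times)")
```

Listing every residue with the top count matters because Counter.most_common() breaks ties by first appearance; in this sequence S also seems to reach 17, so it is worth checking for a tie with P.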
According to UniProt, it’s part of the small heat shock protein (HSP20) family, along with all other Alpha-crystallin B proteins. However, according to the Transporter Classification Database, it’s part of the α-Crystallin Chaperone (CryA) family (where other Alpha-crystallin B proteins don’t appear).
Homology refers to protein sequences that likely have a common ancestor (identified through having similarities in sequence/structure?). Using the BLAST software gives 250 results for similar proteins, with results primarily appearing to be Alpha-crystallin B in different animals.
This step was particularly difficult for me, as I didn’t always understand how to get to the answer based on what I had on the screen.
The structure seems to have first been solved in 2009, with more entries added up until 2025. Some particularly high-resolution structures were determined in 2012 and 2014 through X-ray diffraction, with resolutions of 1.0 - 1.5 Å.
I’m not entirely sure how to identify this…
According to the Structural Classification website, it belongs to the “Alpha crystallin-like” family, which sits further within the “Hsp20 chaperone-like” family.
I chose to use PyMOL to open my structure, getting the structure below.
The protein seems to mostly be composed of sheets with some helices.
I colored hydrophobic residues in red and hydrophilic residues in green.
select hydrophobic, resn GLY+ALA+VAL+LEU+ILE+PRO+PHE+MET+TRP
select hydrophilic, resn SER+THR+ASN+GLN+CYS+GLY
(Gly is in both selections as written, so its final color depends on which color command runs last.)
I had to switch to a spheres visualization to better see how molecules were interacting. It was a little hard for me to see a significant pattern, but I do feel like the hydrophilic residues have more “open” facing areas, whereas the hydrophobic residues were more clumped (both together and with neighboring residues).
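To back up the visual impression with numbers, the same two groupings can be counted directly on a sequence. This is a rough sketch using the residue lists from my PyMOL selections (one-letter codes); the groupings are my own and leave out several amino acids, which end up under "neither".

```python
# One-letter versions of the two PyMOL selections above
HYDROPHOBIC = set("GAVLIPFMW")
HYDROPHILIC = set("STNQCG")


def classify(seq: str) -> dict:
    """Count residues falling in each selection (Gly lands in 'both')."""
    counts = {"hydrophobic": 0, "hydrophilic": 0, "both": 0, "neither": 0}
    for aa in seq.upper():
        in_phobic = aa in HYDROPHOBIC
        in_philic = aa in HYDROPHILIC
        if in_phobic and in_philic:
            counts["both"] += 1
        elif in_phobic:
            counts["hydrophobic"] += 1
        elif in_philic:
            counts["hydrophilic"] += 1
        else:
            counts["neither"] += 1
    return counts


if __name__ == "__main__":
    print(classify("MDIAIHHPWIRRPFFPFHSP"))  # first 20 residues of the sequence above
```

A large "neither" count is a hint that the selections miss the charged residues, which probably makes the surface pattern harder to see.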
Visualizing the protein as a surface was really helpful! I could easily find a couple areas that could be binding pockets. It’s a little difficult to show it accurately in a photo, but I indicated potential areas below:

In this section, we will learn about the capabilities of modern protein AI models and test some of them in your chosen protein.
Folding a protein
Picture Source: 1. Post from Sergey Ovchinnikov 2. Roney, Ovchinnikov et al (2022). State-of-the-art estimation of protein model accuracy using AlphaFold. Phys. Rev. Lett. 129, 238101
Inverse-Folding a protein: Let’s now use the backbone of your chosen PDB to propose sequence candidates via ProteinMPNN