Shitong Luo — HTGAA Spring 2026

About me
I am a second-year PhD student at MIT EECS. I work in computational biology and machine learning.

Week 1 HW: Principles and Practices
Class Assignment 1. First, describe a biological engineering application or tool you want to develop and why. I want to optimize PETase (polyethylene terephthalate hydrolase). PETase is an enzyme that can break down PET plastics, which are widely used in packaging. By optimizing PETase, we can enhance its efficiency in degrading PET and increase its stability under various conditions. This could lead to more effective recycling processes and help reduce plastic pollution.
Week 2 HW: DNA Read, Write, & Edit
Part 0: Basics of Gel Electrophoresis I have attended the recitation. Part 1: Benchling & In-silico Gel Art I made the gel art below. It is “HT” for “How To grow almost anything”. Part 2: Gel Art - Restriction Digests and Gel Electrophoresis I worked in a group with Louisa, Jasmine, and Yutong. We tried to make the cat gel art designed by Louisa, but unfortunately it was not very successful. Photo below:
Python Script for Opentrons Artwork I created a design using opentrons-art.rcdonovan.com Opentrons-Art Website: https://opentrons-art.rcdonovan.com/?id=80fx569l8o4tho4 Google Colab: https://colab.research.google.com/drive/1UPiCmwBP3sIFD_rNVRHeT3YhuiQQ5ZGP#scrollTo=pczDLwsq64mk&line=6&uniqifier=1 The OpentronMock gives the following output: Code:
I want to optimize PETase (polyethylene terephthalate hydrolase). PETase is an enzyme that can break down PET plastics, which are widely used in packaging. By optimizing PETase, we can enhance its efficiency in degrading PET and increase its stability under various conditions. This could lead to more effective recycling processes and help reduce plastic pollution.
I plan to use AI models such as ProteinMPNN to propose mutations and test them in the lab.
One governance goal for optimizing PETase is to ensure that the enzyme does not have unintended consequences on the environment or human health, such as producing harmful byproducts.
Possible sub-goals:
[Option 1] As researchers, we could conduct comprehensive testing of the optimized PETase to identify any potential harmful byproducts and assess their toxicity.
[Option 2] Companies that produce the enzymes should provide detailed information about the enzyme’s properties, including any potential risks and safety measures.
[Option 3] Regulators could establish guidelines for safe use and disposal to minimize potential impact.
| Does the option: | Option 1 | Option 2 | Option 3 |
|---|---|---|---|
| Enhance Biosecurity | | | |
| • By preventing incidents | 1 | 2 | 2 |
| • By helping respond | 2 | 1 | 2 |
| Foster Lab Safety | | | |
| • By preventing incidents | 1 | 1 | 2 |
| • By helping respond | 2 | 2 | 2 |
| Protect the environment | | | |
| • By preventing incidents | 1 | 1 | 3 |
| • By helping respond | 2 | 1 | 2 |
| Other considerations | | | |
| • Minimizing costs and burdens to stakeholders | 1 | 2 | 3 |
| • Feasibility? | 1 | 1 | 2 |
| • Not impede research | 2 | 2 | 2 |
| • Promote constructive applications | 2 | 2 | 2 |
I would prioritize Option 1, because this option is our responsibility as researchers, and it is the most direct way to ensure safety by eliminating risks at the source.
Error rate: $1:10^6$
Human genome length: 3.2 Gbp (billion base pairs)
Mechanisms to deal with the discrepancy: polymerase proofreading and mismatch repair (e.g. MutS).
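To make the discrepancy concrete, the short sketch below multiplies the per-base error rate by the genome length, using only the two numbers stated above. The function name is hypothetical, and the calculation ignores proofreading and repair on purpose.

```python
def expected_errors(genome_length_bp: float, error_rate_per_base: float) -> float:
    """Expected number of misincorporated bases in one uncorrected genome copy."""
    return genome_length_bp * error_rate_per_base


# Numbers from the answers above: 3.2 Gbp genome, 1 error per 10^6 bases.
errors_per_copy = expected_errors(3.2e9, 1e-6)
print(f"Expected errors per genome copy: {errors_per_copy:.0f}")
```

Thousands of errors per copy would be far too many, which is why the correction mechanisms matter.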
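Average human protein: 345 amino acids. Number of different ways to encode it: roughly $3^{345}$, assuming about three synonymous codons per amino acid on average.
Reasons that not all of these sequences work: codon usage bias differs among species, and some sequences form problematic mRNA secondary structures.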
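The $3^{345}$ figure is an average-case estimate. As a minimal sketch, the exact count for a given protein can be computed from the standard genetic code by multiplying the number of synonymous codons at each position. The names `DEGENERACY` and `count_encodings` are my own, and the example peptides are arbitrary.

```python
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons for each amino acid (and for stop).
DEGENERACY = {}
for aa in CODON_TABLE.values():
    DEGENERACY[aa] = DEGENERACY.get(aa, 0) + 1


def count_encodings(protein: str) -> int:
    """Number of distinct DNA sequences that translate to `protein`."""
    return prod(DEGENERACY[aa] for aa in protein)


print(count_encodings("MW"))    # Met and Trp each have a single codon
print(count_encodings("MLSR"))  # Leu, Ser, and Arg each have six codons
```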
Phosphoramidite method (solid-phase chemical synthesis).
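Because each coupling step is less than 100% efficient, the yield of full-length product decreases exponentially with length, and the error rate increases with length as well.
Because the cumulative error rate over that many synthesis cycles would be too high, leading to practically zero yield of correct full-length product.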
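The exponential drop in yield can be sketched in a few lines. This assumes a constant per-step coupling efficiency of 99%, which is an illustrative value rather than a number from the lecture, and the function name is hypothetical.

```python
def full_length_yield(length_nt: int, coupling_efficiency: float = 0.99) -> float:
    """Fraction of strands that reach full length after (length_nt - 1) couplings."""
    return coupling_efficiency ** (length_nt - 1)


# Compare a primer-sized oligo, a long oligo, and a gene-sized molecule.
for n in (20, 200, 2000):
    print(f"{n:>5} nt: {full_length_yield(n):.2e}")
```

Under this assumption most 20-nt strands are full length, while essentially none of the gene-sized strands are.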
Phenylalanine, Valine, Threonine, Tryptophan, Isoleucine, Methionine, Histidine, Leucine, Lysine, Arginine.
Lysine is essential for protein synthesis and enzyme production, which is critical for survival.
Here it is!
I have attended the recitation.
I made the gel art below. It is “HT” for “How To grow almost anything”.

I worked in a group with Louisa, Jasmine, and Yutong. We tried to make the cat gel art designed by Louisa, but unfortunately it was not very successful. Photo below:

I chose EGFR (Epidermal Growth Factor Receptor), because it is a protein that plays a critical role in cell growth and division, and it is often mutated in various cancers.
I used the reverse translation tool on the Twist Bioscience website. I added a stop codon (TAA) manually to the end of the DNA sequence. The DNA sequence is as follows:
I used the codon optimization tool on the Twist Bioscience website. The optimized DNA sequence is attached below.
Codon optimization is necessary because different organisms have different preferences for codons to encode the same amino acid. This can affect the efficiency of protein expression. I chose to optimize the codon sequence for E. coli, because it is a commonly used host organism for protein expression in the lab.
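As a rough illustration of the idea (not the algorithm the Twist Bioscience tool uses), reverse translation with codon preference can be sketched by picking one commonly used E. coli codon per amino acid. The `PREFERRED_CODON` table below is an assumption for this sketch; real tools use full codon-usage tables and also avoid repeats, extreme GC content, and unwanted restriction sites.

```python
# Illustrative "preferred" codon per amino acid for E. coli (an assumption for
# this sketch; real tools use full codon-usage tables and extra constraints).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def reverse_translate(protein: str, stop_codon: str = "TAA") -> str:
    """Encode a protein with one fixed codon per residue, then append a stop codon."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper()) + stop_codon


print(reverse_translate("MKV"))
```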
Cell-dependent and cell-free technologies could be used.
Cell-dependent method: First, insert the DNA sequence into a plasmid vector, and then transfer it into a host cell (e.g. E. coli). The host cell will transcribe the DNA into mRNA, which will then be translated into the protein.
I have done everything in Part 3 using the Twist Bioscience website, and followed the tutorial to finish the remaining steps.
I would want to sequence the DNA of a cancer cell, because we can learn which mutations led to the cancer, and use corresponding drugs (if available) to treat the cancer.
Illumina sequencing.
Q1: Is your method first-, second- or third-generation or other? How so?
Second-generation, because it reads many short, amplified DNA fragments in parallel using sequencing by synthesis.
Q2: What is your input? How do you prepare your input (e.g. fragmentation, adapter ligation, PCR)? List the essential steps.
The input is the DNA extracted from the cancer cell.
Fragmentation: The DNA is fragmented into smaller pieces.
Adapter ligation: Adapters are ligated to the ends of the DNA fragments.
PCR: The DNA fragments are amplified using PCR.
Q3: What are the essential steps of your chosen sequencing technology, how does it decode the bases of your DNA sample (base calling)?
First, DNA fragments are immobilized on a flow cell and amplified to form clusters. Then, fluorescently labeled nucleotides are added to the flow cell. Each time a nucleotide is incorporated into the growing DNA strand by DNA polymerase, a fluorescent signal is emitted. The fluorescence is detected to determine which nucleotide was incorporated. Finally, the fluorescence signals are converted into base calls.
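The last step can be sketched in a highly simplified form: for each cycle, call the base whose fluorescence channel is brightest. The four-channel input format and the example intensities are assumptions for illustration; real base callers also correct for channel crosstalk and phasing.

```python
CHANNELS = "ACGT"


def call_bases(intensities) -> str:
    """intensities: one (A, C, G, T) tuple of fluorescence readings per cycle."""
    read = []
    for cycle in intensities:
        # Pick the index of the brightest channel in this cycle.
        brightest = max(range(4), key=lambda i: cycle[i])
        read.append(CHANNELS[brightest])
    return "".join(read)


# Made-up readings for one cluster over three cycles.
cluster = [(0.9, 0.1, 0.0, 0.1), (0.1, 0.0, 0.8, 0.2), (0.0, 0.1, 0.1, 0.7)]
print(call_bases(cluster))
```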
Q4: What is the output of your chosen sequencing technology?
Sequence reads with quality scores. Sequence reads can be assembled into a complete genome sequence.
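Quality scores are usually reported on the Phred scale, where a score Q corresponds to an error probability of 10^(-Q/10). The sketch below decodes a quality string, assuming the common FASTQ encoding with an ASCII offset of 33; the example string is made up.

```python
def phred_to_error_prob(q: int) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def decode_quality(quality_string: str, offset: int = 33) -> list:
    """Convert a FASTQ quality string into a list of Phred scores."""
    return [ord(ch) - offset for ch in quality_string]


scores = decode_quality("II5#")
for q in scores:
    print(f"Q{q}: error probability {phred_to_error_prob(q):.4f}")
```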
I would want to synthesize the DNA sequence of mutated EGFR in cancers. By synthesizing the DNA sequence of EGFR and translating it into protein, we can study its function and develop targeted drugs.
I would use phosphoramidite chemical synthesis for short DNA fragments (e.g. 200bp), and then Gibson assembly for assembling the short fragments into the full-length DNA sequence.
Q1: What are the essential steps of your chosen sequencing methods?
Phosphoramidite chemical synthesis: One nucleotide is added at a time. Each cycle consists of four steps: deprotection, coupling, capping, and oxidation.
Gibson assembly: DNA fragments with overlapping ends are mixed together. Enzymes (exonuclease, DNA polymerase, and DNA ligase) are added to the mixture and stitch the DNA fragments together.
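The role of the overlapping ends can be sketched in silico: adjacent fragments are merged wherever the end of one matches the start of the next. This only models the sequence logic, not the enzymatic steps, and the function names, the `min_overlap` parameter, and the toy fragments are hypothetical.

```python
def join_by_overlap(left: str, right: str, min_overlap: int) -> str:
    """Merge two fragments whose ends share an identical overlap."""
    # Try the longest possible overlap first, down to the minimum allowed.
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    raise ValueError("no sufficient overlap between fragments")


def assemble(fragments, min_overlap: int = 20) -> str:
    """Join fragments in the given order using their terminal overlaps."""
    assembled = fragments[0]
    for fragment in fragments[1:]:
        assembled = join_by_overlap(assembled, fragment, min_overlap)
    return assembled


# Toy fragments with 4-nt overlaps (real designs use roughly 20 to 40 nt).
print(assemble(["ATGCCGTA", "CGTAGGCT", "GGCTTTAA"], min_overlap=4))
```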
I would want to edit the DNA sequence of a gene that is mutated in a genetic disease, such as cystic fibrosis. By editing the DNA sequence to correct the mutation, we can potentially cure the disease.
Q1: How does your technology of choice edit DNA? What are the essential steps?
CRISPR-Cas9 is a commonly used technology for DNA editing.
First, a guide RNA (gRNA) is designed to target the specific DNA sequence to be edited. The gRNA is then complexed with the Cas9 protein to form a ribonucleoprotein (RNP) complex, which is delivered into the target cells. The RNP binds to the target DNA sequence and creates a double-strand break. The cell’s repair mechanisms then repair the break according to a repair template.
Q2: What preparation do you need to do (e.g. design steps) and what is the input (e.g. DNA template, enzymes, plasmids, primers, guides, cells) for the editing?
Preparation and input: gRNA sequence that corresponds to the edit site (find), repair template (replace).
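The "find" part of the design can be sketched as a search for candidate target sites. The code below assumes the commonly used SpCas9 enzyme, which requires an NGG PAM immediately after a 20-nt protospacer, and it scans the forward strand only. The function name and example sequence are hypothetical.

```python
def find_guides(sequence: str, guide_length: int = 20):
    """Return (start, protospacer, PAM) for every forward-strand NGG PAM site."""
    sequence = sequence.upper()
    hits = []
    # A site needs guide_length bases plus a 3-nt PAM inside the sequence.
    for start in range(len(sequence) - guide_length - 2):
        pam = sequence[start + guide_length : start + guide_length + 3]
        if pam[1:] == "GG":
            hits.append((start, sequence[start : start + guide_length], pam))
    return hits


# Made-up 23-nt sequence: a 20-nt protospacer followed by a TGG PAM.
for hit in find_guides("ACGTACGTACGTACGTACGT" + "TGG"):
    print(hit)
```

A fuller design step would also scan the reverse strand and screen each candidate against the genome for similar off-target sites.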
Q3: What are the limitations of your editing methods (if any) in terms of efficiency or precision?
The main limitation is the off-target effect, where the edits are applied to unintended sites.
I created a design using opentrons-art.rcdonovan.com
Opentrons-Art Website: https://opentrons-art.rcdonovan.com/?id=80fx569l8o4tho4
Google Colab: https://colab.research.google.com/drive/1UPiCmwBP3sIFD_rNVRHeT3YhuiQQ5ZGP#scrollTo=pczDLwsq64mk&line=6&uniqifier=1
The OpentronMock gives the following output:

Code:
With the help of our TA Ronan, the art was printed with an Opentrons robot. The result is shown below:

PyLabRobot: An open-source, hardware-agnostic interface for liquid-handling robots and accessories. Wierenga, Rick P. et al. Device, Volume 1, Issue 4, 100111
This paper introduces PyLabRobot, an open-source Python library that provides a unified interface for controlling various liquid-handling robots and accessories, including Opentrons. PyLabRobot also includes a simulator (like the OpentronMock provided in this homework’s Google Colab notebook), which allows users to test and debug their protocols without needing access to the physical robot. Further, this paper also demonstrates the integration with LLMs, allowing users who are not familiar with programming to create protocols using natural language instructions, which are then translated into executable code for the robot.
I am interested in using lab automation for machine-learning-guided directed evolution of PETase (the PET plastic degradation enzyme).
First, I will use machine learning models such as ProteinMPNN to design an initial library of PETase variants. I will order the DNA fragments for these variants from Twist Bioscience.
Second, I will use a liquid handler to assemble the DNA fragments into plasmids, and then transform the plasmids into E. coli cells.
Then, I will use a plate reader to measure the activity of the PETase variants in degrading PET plastic. This can also be done in a high-throughput manner using 96-well or 384-well plates with an automation robot.
Finally, I will use the activity data to train a machine learning model to predict the activity of new PETase variants, and then use the model to design the next round of variants for testing. This iterative process can be repeated until we find highly active PETase variants for degrading PET plastic.
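The design, build, test, and learn loop described above can be sketched as a toy simulation. Everything here is a stand-in: `simulated_assay` replaces the plate-reader measurement with a made-up score, `propose_variants` replaces the design model with random point mutations, and the learning step is reduced to simply keeping the top-scoring variants. It does not call ProteinMPNN or any lab hardware.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def simulated_assay(variant: str, optimum: str) -> int:
    """Placeholder for the plate-reader measurement: counts positions matching
    a hidden optimum. A real assay would measure PET degradation instead."""
    return sum(a == b for a, b in zip(variant, optimum))


def propose_variants(parents, n_per_parent, rng):
    """Placeholder for the design model: one random point mutation per child."""
    children = []
    for parent in parents:
        for _ in range(n_per_parent):
            pos = rng.randrange(len(parent))
            children.append(parent[:pos] + rng.choice(AMINO_ACIDS) + parent[pos + 1:])
    return children


def directed_evolution(start, optimum, rounds=10, top_k=4, n_per_parent=24, seed=0):
    rng = random.Random(seed)
    parents, best_per_round = [start], []
    for _ in range(rounds):
        # Keep the parents in the pool so the best score never decreases.
        pool = set(parents) | set(propose_variants(parents, n_per_parent, rng))
        ranked = sorted(pool, key=lambda v: (simulated_assay(v, optimum), v), reverse=True)
        parents = ranked[:top_k]
        best_per_round.append(simulated_assay(parents[0], optimum))
    return parents[0], best_per_round


best, history = directed_evolution("AAAAAAAA", "MKVLDEFG")
print(best, history)
```

In a real campaign the selection step would be replaced by a model trained on the measured activities, which then ranks unseen variants before they are ordered.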
I used P20 (1-20uL) pipettes to create a letter “H” pattern with red food coloring solution contained in an Eppendorf tube.

Louisa, Jasmine, Yutong and I worked on this lab together.
TAE Buffer Preparation
We mixed 8 mL of 50× TAE stock with 492 mL of deionized water to make 500 mL of working TAE buffer (nominally 1×). We also added dye to the buffer to make the electrophoresis run easier to track.
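The dilution can be checked with the relation C1 × V1 = C2 × V2. The sketch below assumes the TAE stock is a 50× concentrate and the target is a 1× working buffer; the function names are hypothetical.

```python
def stock_volume_ml(stock_x: float, target_x: float, final_volume_ml: float) -> float:
    """Volume of stock needed, from C1 * V1 = C2 * V2."""
    return target_x * final_volume_ml / stock_x


def final_concentration_x(stock_x: float, stock_ml: float, final_volume_ml: float) -> float:
    """Working concentration obtained from a given volume of stock."""
    return stock_x * stock_ml / final_volume_ml


print(stock_volume_ml(50, 1, 500))        # stock needed for 500 mL at 1x
print(final_concentration_x(50, 8, 500))  # concentration from 8 mL of stock
```

Under that assumption, 10 mL of stock gives exactly 1×, and the 8 mL we used gives about 0.8×.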

Agarose Electrophoresis Gel Preparation
First, we added 0.75 g of agarose powder and 75 mL of TAE buffer into a microwavable flask. We shook the flask to mix the powder and the buffer.
Next, we microwaved the flask for multiple rounds of 20 seconds each until the agarose powder was completely dissolved.
After that, we waited for the agarose solution to cool down for about 20 minutes. We then added 7.5 µL of SYBR Safe DNA stain to the solution.
Finally, we poured the agarose solution into a gel mold, inserted the comb, and let it solidify for about 30 minutes. After the gel solidified, we removed the comb and placed the gel in the gel box.
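The recipe above can be sanity-checked with a little arithmetic: the agarose percentage as weight per volume, and the dilution factor of the stain. This is a minimal sketch using only the quantities listed above; the function names are hypothetical.

```python
def agarose_percent_w_v(agarose_g: float, buffer_ml: float) -> float:
    """Gel concentration in percent weight per volume (g per 100 mL)."""
    return agarose_g / buffer_ml * 100


def stain_dilution_factor(stain_ul: float, gel_ml: float) -> float:
    """How many fold the stain is diluted in the gel (1 mL = 1000 uL)."""
    return gel_ml * 1000 / stain_ul


print(f"Agarose: {agarose_percent_w_v(0.75, 75):.1f}% w/v")
print(f"Stain dilution: 1:{stain_dilution_factor(7.5, 75):.0f}")
```

These quantities correspond to a 1% gel with the stain diluted 1:10,000.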


Following the gel art design, we used EcoRV, SacI, BamHI, and KpnI. We prepared mixtures of Lambda DNA, the restriction enzyme, and the buffer according to the protocol. After mixing, we incubated the samples at 37 °C for 30 minutes to allow the digestion to occur.
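The expected band pattern for each digest can be predicted in silico by locating the recognition sites and measuring the distances between cuts. The sketch below uses the standard recognition sequences of the four enzymes on a made-up toy sequence; predicting our actual gel would require loading the real Lambda DNA sequence, which is not included here.

```python
# Recognition site and cut offset within the site (top strand) for each enzyme.
ENZYMES = {
    "EcoRV": ("GATATC", 3),  # GAT^ATC
    "SacI": ("GAGCTC", 5),   # GAGCT^C
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "KpnI": ("GGTACC", 5),   # GGTAC^C
}


def cut_positions(sequence: str, enzyme: str):
    """Top-strand cut positions for one enzyme (all four sites are palindromic)."""
    site, offset = ENZYMES[enzyme]
    positions, start = [], sequence.find(site)
    while start != -1:
        positions.append(start + offset)
        start = sequence.find(site, start + 1)
    return positions


def digest(sequence: str, enzymes):
    """Fragment lengths from cutting a linear sequence with the given enzymes."""
    cuts = sorted({p for e in enzymes for p in cut_positions(sequence, e)})
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


toy = "AAAGGATCCTTTGATATCAA"  # one BamHI site and one EcoRV site
print(digest(toy, ["BamHI", "EcoRV"]))
```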

First, we poured the TAE buffer into the gel box until the gel was submerged. Next, we pipetted the digested DNA samples into the wells of the gel.
We then connected the gel box to the power supply and ran the gel at 120V for about 30 minutes.
Finally, we visualized the gel in the imaging system. The gel art is shown below. It does not look exactly like the design we created, but at least we gave it a try and had fun in the process!

