Subsections of Shitong Luo — HTGAA Spring 2026

Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices

    Class Assignment 1. First, describe a biological engineering application or tool you want to develop and why. I want to optimize PETase (polyethylene terephthalate hydrolase). PETase is an enzyme that can break down PET plastics, which are widely used in packaging. By optimizing PETase, we can enhance its efficiency in degrading PET and increase its stability under various conditions. This could lead to more effective recycling processes and help reduce plastic pollution.

  • Week 2 HW: DNA Read, Write, & Edit

    Part 0: Basics of Gel Electrophoresis I have attended the recitation. Part 1: Benchling & In-silico Gel Art I made the gel art below. It is “HT” for “How To grow almost anything”. Part 2: Gel Art - Restriction Digests and Gel Electrophoresis I worked in a group with Louisa, Jasmine, and Yutong. We tried to make the cat gel art designed by Louisa, but unfortunately it was not very successful.

  • Week 3 HW: Lab Automation

    Python Script for Opentrons Artwork I created a design using opentrons-art.rcdonovan.com. Opentrons-Art Website: https://opentrons-art.rcdonovan.com/?id=80fx569l8o4tho4 Google Colab: https://colab.research.google.com/drive/1UPiCmwBP3sIFD_rNVRHeT3YhuiQQ5ZGP#scrollTo=pczDLwsq64mk&line=6&uniqifier=1

Subsections of Homework

Week 1 HW: Principles and Practices

Class Assignment

1. First, describe a biological engineering application or tool you want to develop and why.

I want to optimize PETase (polyethylene terephthalate hydrolase). PETase is an enzyme that can break down PET plastics, which are widely used in packaging. By optimizing PETase, we can enhance its efficiency in degrading PET and increase its stability under various conditions. This could lead to more effective recycling processes and help reduce plastic pollution.

I plan to use AI models such as ProteinMPNN to propose mutations and test them in the lab.

One governance goal for optimizing PETase is to ensure that the enzyme does not have unintended consequences on the environment or human health, such as producing harmful byproducts.

Possible sub-goals:

  • Identify possible byproducts in the lab.
  • Test the toxicity of the byproducts.
  • Ensure that there are no byproducts that could be harmful to the environment or human health.

3. Next, describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”)

  • [Option 1] As researchers, we could conduct comprehensive testing of the optimized PETase to identify any potential harmful byproducts and assess their toxicity.

    • Purpose: To ensure that the optimized PETase does not produce harmful byproducts.
    • Design: Conduct experiments to identify all the products produced by the PETase.
    • Assumptions: We know what the product profile looks like when there are no unexpected byproducts.
    • Risks of Failure: If harmful byproducts are not identified, it could lead to environmental or health issues.
    • Success: The optimized PETase is found to be safe and does not produce harmful byproducts.
  • [Option 2] Companies that produce the enzymes should provide detailed information about the enzyme’s properties, including any potential risks and safety measures.

    • Purpose: To sufficiently inform users about the potential risks associated with the enzyme.
    • Design: Disclose information about the enzyme’s properties and potential risks.
    • Assumptions: Companies will comply with the reporting requirements and provide accurate and sufficient information.
    • Risks of Failure: Could lead to mishandling of the enzyme.
    • Success: Users are well-informed about the enzyme and can use it safely.
  • [Option 3] Regulators could establish guidelines for safe use and disposal to minimize potential impact.

    • Purpose: To ensure that the enzyme is used and disposed of properly to minimize potential environmental impact.
    • Design: Develop guidelines of best practices for the use and disposal.
    • Assumptions: Users will follow the guidelines.
    • Risks of Failure: If users do not follow the guidelines, it could lead to negative consequences.
    • Success: Environmental and health impacts are minimized.

4. Next, score (from 1-3, with 1 as the best, or n/a) each of your governance actions against your rubric of policy goals. The following is one framework, but feel free to make your own.

Does the option:                                Option 1  Option 2  Option 3
Enhance Biosecurity
• By preventing incidents                       1         2         2
• By helping respond                            2         1         2
Foster Lab Safety
• By preventing incidents                       1         1         2
• By helping respond                            2         2         2
Protect the environment
• By preventing incidents                       1         1         3
• By helping respond                            2         1         2
Other considerations
• Minimizing costs and burdens to stakeholders  1         2         3
• Feasibility?                                  1         1         2
• Not impede research                           2         2         2
• Promote constructive applications             2         2         2

5. Last, drawing upon this scoring, describe which governance option, or combination of options, you would prioritize, and why. Outline any trade-offs you considered as well as assumptions and uncertainties.

I would prioritize Option 1, because this option is our responsibility as researchers, and it is the most direct way to ensure safety by eliminating risks at the source.

Lab Preparation

  • Complete Lab Specific Training in Person.
  • Complete Safety Training in Atlas

Week 2 Lecture Prep

Questions from Professor Jacobson

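1. Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?

Error rate: about $1:10^6$ (one error per million bases copied).

Human genome length: 3.2 Gbp (billion base pairs). At that error rate, one copy of the genome would carry roughly $3.2 \times 10^9 / 10^6 \approx 3{,}200$ errors.

Mechanisms to deal with the discrepancy: proofreading by the polymerase and mismatch repair (e.g. MutS).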

2. How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?

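Average human protein: about 345 amino acids. Number of different ways to code: roughly $3^{345}$ (on the order of $10^{164}$), taking an average of about 3 synonymous codons per amino acid (61 sense codons shared among 20 amino acids).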

Reasons that not all codes work: codon bias among species and mRNA secondary structure.
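As a rough check on the $3^{345}$ estimate, the count can be computed from the standard genetic code. This is only an illustrative sketch: `CODON_TABLE`, `SYNONYMS`, and `count_encodings` are names introduced here, and the short peptide is simply the first seven residues of the EGFR sequence used later in this homework.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Number of synonymous codons for each amino acid (and for stop, "*").
SYNONYMS = Counter(CODON_TABLE.values())


def count_encodings(protein: str) -> int:
    """Exact number of DNA sequences that translate to `protein`."""
    return prod(SYNONYMS[aa] for aa in protein)


sense_codons = 64 - SYNONYMS["*"]
print("average codons per amino acid:", sense_codons / 20)
print("encodings of MRPSGTA:", count_encodings("MRPSGTA"))
print("decimal digits in 3^345:", len(str(3 ** 345)))
```

The exact count for a real protein depends on its composition, because methionine and tryptophan have one codon each while leucine, serine, and arginine have six.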

Questions from Dr. LeProust

1. What’s the most commonly used method for oligo synthesis currently?

Phosphoramidite method (solid-phase phosphoramidite chemistry).

2. Why is it difficult to make oligos longer than 200nt via direct synthesis?

Because each coupling step is slightly less than 100% efficient, the yield of full-length product decreases exponentially with length, and errors accumulate with every added base as well.

3. Why can’t you make a 2000bp gene via direct oligo synthesis?

Because the cumulative error rate would be too high, leading to practically zero yield of the correct full-length sequence.
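The exponential drop in yield can be made concrete with a short calculation. This is a minimal sketch that assumes a per-step coupling efficiency of 99%, an illustrative figure chosen here rather than a value from the lecture; `full_length_yield` is a hypothetical helper name.

```python
def full_length_yield(length_nt: int, coupling_efficiency: float) -> float:
    """Fraction of strands that reach full length.

    A strand of n nucleotides needs n - 1 successful coupling steps.
    """
    return coupling_efficiency ** (length_nt - 1)


# Assumed coupling efficiency of 0.99 per step (illustrative only).
for n in (20, 200, 2000):
    print(f"{n:>5} nt: {full_length_yield(n, 0.99):.2e}")
```

Under this assumption, a 200 nt oligo still gives a usable fraction of full-length product, while a 2000 nt strand gives essentially none, which is why long genes are assembled from shorter oligos.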

Question from George Church

What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?

Phenylalanine, Valine, Threonine, Tryptophan, Isoleucine, Methionine, Histidine, Leucine, Lysine, Arginine.

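Lysine is essential for protein synthesis and enzyme production, which is critical for survival. Because lysine is already an essential amino acid that all animals must obtain from their diet, an engineered lysine dependency like the “Lysine Contingency” adds little containment beyond what nature already imposes.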

Your HTGAA Website

Here it is!

Week 2 HW: DNA Read, Write, & Edit

Part 0: Basics of Gel Electrophoresis

I have attended the recitation.

Part 1: Benchling & In-silico Gel Art

I made the gel art below. It is “HT” for “How To grow almost anything”.

Part 2: Gel Art - Restriction Digests and Gel Electrophoresis

I worked in a group with Louisa, Jasmine, and Yutong. We tried to make the cat gel art designed by Louisa, but unfortunately it was not very successful. Photo below:

Part 3: DNA Design Challenge

3.1. Choose your protein.

I chose EGFR (Epidermal Growth Factor Receptor), because it is a protein that plays a critical role in cell growth and division, and it is often mutated in various cancers.

>sp|P00533|EGFR_HUMAN Epidermal growth factor receptor OS=Homo sapiens OX=9606 GN=EGFR PE=1 SV=2
MRPSGTAGAALLALLAALCPASRALEEKKVCQGTSNKLTQLGTFEDHFLSLQRMFNNCEV
VLGNLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVERIPLENLQIIRGNMYYENSYALA
VLSNYDANKTGLKELPMRNLQEILHGAVRFSNNPALCNVESIQWRDIVSSDFLSNMSMDF
QNHLGSCQKCDPSCPNGSCWGAGEENCQKLTKIICAQQCSGRCRGKSPSDCCHNQCAAGC
TGPRESDCLVCRKFRDEATCKDTCPPLMLYNPTTYQMDVNPEGKYSFGATCVKKCPRNYV
VTDHGSCVRACGADSYEMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDSLSINATNIKHFK
NCTSISGDLHILPVAFRGDSFTHTPPLDPQELDILKTVKEITGFLLIQAWPENRTDLHAF
ENLEIIRGRTKQHGQFSLAVVSLNITSLGLRSLKEISDGDVIISGNKNLCYANTINWKKL
FGTSGQKTKIISNRGENSCKATGQVCHALCSPEGCWGPEPRDCVSCRNVSRGRECVDKCN
LLEGEPREFVENSECIQCHPECLPQAMNITCTGRGPDNCIQCAHYIDGPHCVKTCPAGVM
GENNTLVWKYADAGHVCHLCHPNCTYGCTGPGLEGCPTNGPKIPSIATGMVGALLLLLVV
ALGIGLFMRRRHIVRKRTLRRLLQERELVEPLTPSGEAPNQALLRILKETEFKKIKVLGS
GAFGTVYKGLWIPEGEKVKIPVAIKELREATSPKANKEILDEAYVMASVDNPHVCRLLGI
CLTSTVQLITQLMPFGCLLDYVREHKDNIGSQYLLNWCVQIAKGMNYLEDRRLVHRDLAA
RNVLVKTPQHVKITDFGLAKLLGAEEKEYHAEGGKVPIKWMALESILHRIYTHQSDVWSY
GVTVWELMTFGSKPYDGIPASEISSILEKGERLPQPPICTIDVYMIMVKCWMIDADSRPK
FRELIIEFSKMARDPQRYLVIQGDERMHLPSPTDSNFYRALMDEEDMDDVVDADEYLIPQ
QGFFSSPSTSRTPLLSSLSATSNNSTVACIDRNGLQSCPIKEDSFLQRYSSDPTGALTED
SIDDTFLPVPEYINQSVPKRPAGSVQNPVYHNQPLNPAPSRDPHYQDPHSTAVGNPEYLN
TVQPTCVNSTFDSPAHWAQKGSHQISLDNPDYQQDFFPKEAKPNGIFKGSTAENAEYLRV
APQSSEFIGA

3.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.

I used the reverse translation tool on the Twist Bioscience website. I added a stop codon (TAA) manually to the end of the DNA sequence. The DNA sequence is as follows:

>Egfr
ATGCGTCCGTCCGGCACTGCAGGGGCCGCACTGTTGGCTCTGCTCGCAGCCTTATGCCCCGCCAG
TCGAGCATTGGAGGAGAAGAAGGTGTGCCAGGGAACCTCTAACAAGCTGACACAACTCGGGACCT
TCGAGGACCATTTCCTGAGCCTCCAGAGAATGTTTAACAACTGCGAGGTTGTGTTGGGGAATCTG
GAGATTACCTACGTCCAGCGTAACTACGACCTCAGTTTCCTCAAGACAATACAGGAAGTGGCTGG
TTATGTTCTGATTGCACTGAATACCGTAGAGAGAATCCCCCTGGAAAACCTGCAAATTATTAGAG
GGAACATGTACTACGAAAACTCATATGCACTGGCCGTCCTGTCTAATTACGATGCCAACAAAACC
GGCCTGAAGGAGCTGCCCATGAGGAATCTCCAGGAAATTCTGCATGGGGCTGTAAGATTCAGCAA
TAACCCGGCTCTCTGTAACGTCGAGAGCATCCAGTGGCGCGACATAGTGAGCTCTGATTTCCTTT
CAAATATGTCCATGGATTTTCAAAACCATCTGGGGTCTTGTCAAAAGTGCGACCCATCTTGTCCC
AATGGTTCCTGTTGGGGCGCCGGAGAGGAGAATTGTCAGAAGCTTACCAAGATTATCTGTGCACA
GCAGTGTTCAGGGAGATGTAGGGGGAAGTCTCCAAGCGATTGCTGTCACAATCAATGCGCCGCCG
GGTGCACCGGCCCCAGAGAGAGTGACTGTTTAGTGTGCCGTAAATTCAGGGATGAGGCTACGTGC
AAAGACACATGCCCTCCACTCATGTTGTACAACCCAACAACTTATCAGATGGACGTGAACCCCGA
AGGGAAGTATAGCTTTGGGGCGACGTGCGTCAAGAAATGTCCACGCAACTATGTCGTTACGGACC
ACGGTTCCTGTGTGCGCGCTTGCGGAGCCGATTCATACGAAATGGAGGAAGACGGCGTCCGCAAG
TGTAAGAAGTGTGAAGGTCCTTGTCGCAAAGTCTGTAACGGTATCGGAATCGGCGAATTCAAGGA
CTCTCTCAGTATTAATGCAACCAATATCAAGCATTTTAAAAATTGTACCTCAATCTCAGGGGACT
TGCACATCCTGCCCGTTGCGTTTAGGGGGGACTCATTTACACACACCCCTCCACTGGATCCACAG
GAATTGGATATTCTGAAGACGGTGAAGGAAATCACCGGGTTCCTCCTGATTCAGGCCTGGCCAGA
GAACCGGACCGACCTTCACGCATTTGAGAACCTTGAAATTATTCGCGGGAGAACGAAACAACATG
GGCAATTCAGTCTCGCTGTCGTCAGCCTTAACATTACCTCCCTCGGATTGAGATCACTGAAGGAG
ATAAGCGACGGGGACGTGATAATCTCTGGGAACAAGAACTTGTGCTACGCCAACACTATTAACTG
GAAGAAGTTGTTCGGTACCTCAGGTCAGAAGACTAAAATCATCTCTAATCGAGGCGAAAATTCTT
GCAAAGCAACAGGGCAGGTGTGCCATGCCCTGTGCTCCCCTGAAGGGTGCTGGGGCCCCGAACCA
AGAGACTGCGTCTCTTGTCGGAATGTTTCCCGGGGGAGGGAATGTGTGGATAAGTGCAACCTGCT
TGAAGGAGAACCGAGAGAGTTTGTGGAAAATTCCGAATGTATCCAATGCCACCCTGAATGTTTAC
CTCAGGCCATGAACATTACATGCACAGGTAGGGGCCCCGACAATTGTATTCAGTGCGCCCATTAC
ATCGACGGACCCCACTGCGTAAAGACGTGCCCTGCCGGCGTTATGGGCGAGAATAATACCCTGGT
GTGGAAGTATGCCGACGCCGGGCATGTTTGCCATCTGTGCCACCCAAATTGTACATATGGCTGCA
CCGGCCCCGGGTTGGAGGGCTGCCCCACTAATGGACCTAAGATCCCCAGCATCGCCACCGGGATG
GTTGGCGCTCTGCTGCTGCTGTTGGTGGTCGCCTTGGGGATAGGGTTGTTCATGCGAAGGAGGCA
TATCGTCAGGAAGCGCACCCTGAGGAGGCTGCTCCAAGAGCGTGAGCTTGTCGAACCCCTGACCC
CCAGCGGCGAGGCCCCAAACCAGGCACTTTTGCGCATTCTGAAAGAGACCGAATTCAAAAAGATC
AAGGTGCTCGGCAGCGGGGCCTTCGGCACTGTGTACAAGGGTCTTTGGATCCCGGAGGGCGAAAA
AGTTAAAATACCCGTGGCAATAAAAGAGCTGAGGGAGGCTACAAGTCCAAAGGCAAACAAAGAAA
TCCTTGACGAAGCATACGTTATGGCATCCGTCGATAATCCACACGTGTGCAGGCTTCTCGGGATT
TGCCTGACGTCAACCGTGCAGCTGATTACGCAACTGATGCCATTTGGCTGTCTGCTCGATTATGT
GCGCGAGCACAAGGACAACATTGGATCTCAATACTTACTCAACTGGTGCGTGCAGATCGCGAAGG
GGATGAACTATCTGGAGGACCGCAGACTCGTCCACAGGGATCTTGCAGCCAGGAACGTACTCGTT
AAGACCCCGCAGCATGTGAAAATTACCGATTTCGGCCTTGCAAAACTGCTGGGTGCCGAAGAGAA
AGAATACCATGCAGAGGGTGGCAAAGTTCCTATCAAATGGATGGCGTTAGAGTCAATTCTGCATC
GGATCTATACCCATCAGAGCGACGTGTGGTCTTACGGTGTGACCGTTTGGGAGCTTATGACTTTT
GGGAGCAAACCGTACGACGGCATCCCGGCAAGCGAAATTTCCTCAATACTGGAAAAGGGGGAACG
TCTGCCCCAACCACCTATCTGCACTATAGACGTTTATATGATAATGGTGAAATGTTGGATGATCG
ACGCCGACAGTCGACCCAAGTTTCGAGAGTTAATCATCGAGTTCTCCAAGATGGCTCGGGATCCC
CAAAGGTACTTAGTCATCCAGGGCGATGAAAGAATGCACTTACCCTCACCCACAGATTCAAACTT
CTATCGAGCTTTGATGGATGAGGAAGACATGGATGATGTGGTAGACGCCGACGAGTACCTGATAC
CACAGCAGGGTTTTTTTTCTTCACCAAGCACATCTCGTACGCCTCTTCTGAGTAGCCTCAGCGCG
ACCTCCAACAACTCCACAGTGGCGTGCATCGACCGCAACGGACTTCAGTCCTGTCCAATTAAAGA
GGATTCCTTCCTGCAGAGGTATAGCAGCGACCCTACCGGAGCCCTTACCGAGGATAGTATTGATG
ATACATTCCTGCCTGTACCCGAATACATTAATCAGTCCGTGCCAAAACGCCCCGCAGGGAGTGTA
CAGAATCCAGTGTACCACAACCAGCCGCTGAACCCCGCACCCAGCCGAGACCCCCACTACCAGGA
CCCACATAGCACGGCCGTGGGAAATCCTGAATACCTGAACACCGTGCAGCCTACATGTGTGAATA
GCACTTTCGATAGCCCCGCACACTGGGCCCAGAAGGGCTCACACCAAATTAGCCTTGATAACCCT
GATTACCAGCAGGACTTCTTCCCCAAGGAGGCAAAGCCAAACGGTATCTTTAAGGGTAGCACGGC
CGAAAACGCAGAGTACTTGAGGGTTGCCCCTCAGTCCAGTGAGTTCATTGGCGCCTAA

3.3. Codon optimization.

I used the codon optimization tool on the Twist Bioscience website. The optimized DNA sequence is attached below.

Codon optimization is necessary because different organisms have different preferences for codons to encode the same amino acid. This can affect the efficiency of protein expression. I chose to optimize the codon sequence for E. coli, because it is a commonly used host organism for protein expression in the lab.

>Egfr
ATGAGACCCAGTGGAACCGCCGGTGCAGCTCTCCTTGCTTTGCTCGCTGCGCTCTGTCCAGCTTC
ACGGGCCCTTGAAGAAAAGAAAGTCTGTCAAGGTACAAGCAATAAACTCACGCAGTTGGGAACTT
TTGAAGATCACTTTCTGTCCCTGCAAAGGATGTTCAATAATTGTGAAGTAGTTCTGGGCAACCTC
GAAATCACATATGTACAGAGAAATTATGATTTATCCTTTCTGAAAACCATCCAAGAGGTAGCCGG
GTACGTCTTGATCGCTTTAAACACGGTTGAACGGATACCACTCGAGAATTTGCAGATAATCCGCG
GCAATATGTATTATGAGAATAGCTACGCCCTCGCGGTGCTCTCAAACTATGACGCGAATAAGACA
GGGTTAAAAGAATTACCAATGAGAAACCTGCAAGAGATACTCCACGGTGCAGTTAGGTTTAGTAA
CAATCCAGCCCTGTGCAATGTGGAATCTATTCAATGGCGAGATATCGTTAGTAGTGACTTTCTGT
CCAACATGAGTATGGACTTCCAGAATCACCTTGGCAGTTGCCAGAAATGTGATCCCAGCTGCCCA
AACGGGAGCTGCTGGGGAGCTGGGGAAGAAAACTGCCAGAAACTCACTAAAATCATATGCGCTCA
ACAATGCTCTGGCAGGTGCAGAGGCAAAAGCCCTTCCGACTGTTGCCATAACCAGTGTGCAGCTG
GATGTACTGGGCCGAGGGAAAGCGATTGCCTTGTCTGTAGAAAGTTTCGGGACGAAGCCACCTGT
AAGGATACTTGTCCACCCCTGATGCTCTATAATCCTACGACCTACCAAATGGATGTTAATCCGGA
AGGAAAATACTCCTTCGGCGCCACCTGTGTGAAGAAGTGCCCGCGGAATTACGTTGTGACAGATC
ATGGGTCTTGCGTCCGAGCCTGTGGTGCAGACTCTTATGAGATGGAAGAGGATGGGGTGAGGAAA
TGCAAGAAATGCGAGGGGCCATGCAGGAAGGTATGCAATGGAATTGGCATAGGTGAGTTTAAAGA
TTCACTGAGCATCAACGCGACAAACATTAAACACTTCAAGAACTGCACGTCCATATCTGGAGATC
TTCATATTCTTCCGGTGGCTTTCCGAGGAGATTCTTTCACCCATACACCACCCTTAGACCCTCAA
GAGCTGGACATATTGAAAACAGTTAAAGAGATTACAGGCTTTCTGCTTATCCAAGCTTGGCCTGA
AAATAGGACGGATCTCCATGCCTTCGAAAATCTGGAGATCATCAGAGGACGCACAAAGCAGCACG
GACAGTTTTCCCTGGCGGTGGTGTCTCTCAATATAACTTCACTTGGCTTACGCAGCCTCAAAGAA
ATTTCCGACGGAGATGTCATCATAAGTGGAAATAAGAATCTCTGTTACGCTAATACCATCAATTG
GAAGAAACTCTTCGGAACATCTGGACAAAAGACAAAGATCATTAGCAACCGCGGGGAGAACAGCT
GTAAGGCTACCGGACAAGTCTGTCACGCACTTTGTTCTCCAGAGGGATGTTGGGGCCCAGAGCCT
CGTGATTGTGTGTCCTGCAGGAACGTCAGCCGCGGCAGAGAGTGCGTTGACAAATGTAATCTCCT
CGAGGGCGAGCCTCGCGAATTCGTTGAGAACAGTGAGTGCATTCAGTGTCATCCAGAGTGCTTGC
CACAAGCTATGAATATCACCTGTACTGGACGCGGACCTGATAACTGCATCCAATGTGCTCACTAT
ATAGATGGGCCACATTGTGTGAAAACTTGTCCAGCTGGTGTAATGGGAGAAAACAACACATTAGT
TTGGAAATACGCAGATGCTGGTCACGTGTGTCACCTTTGTCATCCTAACTGCACCTACGGGTGTA
CTGGTCCAGGCCTCGAAGGTTGTCCGACCAACGGCCCAAAGATTCCTTCAATTGCAACAGGCATG
GTCGGGGCCTTGTTGTTGTTGCTGGTTGTGGCACTCGGTATTGGCCTGTTTATGAGACGGCGGCA
CATTGTGCGTAAAAGAACATTACGTCGCCTCTTACAGGAACGAGAACTGGTAGAGCCGCTCACAC
CTTCTGGGGAAGCACCGAATCAAGCCCTGCTCCGTATATTAAAGGAAACTGAGTTTAAGAAAATT
AAAGTACTGGGATCCGGCGCTTTTGGTACAGTTTATAAAGGGCTGTGGATACCTGAAGGGGAGAA
GGTGAAGATCCCTGTCGCCATCAAGGAATTGCGAGAAGCGACTTCCCCCAAAGCCAATAAAGAGA
TTCTGGATGAGGCCTATGTCATGGCTTCTGTGGACAACCCTCATGTTTGTCGCCTGTTAGGCATC
TGTCTTACTAGCACTGTCCAACTTATCACTCAGTTGATGCCGTTCGGGTGCCTTCTGGACTACGT
TAGAGAACATAAAGATAATATCGGTAGCCAGTATCTCCTGAATTGGTGTGTTCAAATAGCAAAAG
GCATGAATTACCTCGAAGATCGGCGGCTGGTTCATCGCGACCTGGCTGCTCGGAATGTCCTTGTC
AAAACACCCCAACACGTAAAGATAACAGACTTTGGGCTGGCTAAGCTTCTTGGCGCTGAGGAAAA
GGAGTATCACGCTGAAGGCGGAAAGGTCCCCATTAAGTGGATGGCTCTGGAATCCATCTTGCACA
GGATATACACTCACCAATCAGATGTCTGGTCCTATGGGGTAACAGTATGGGAACTGATGACCTTC
GGCTCCAAGCCATATGATGGAATACCTGCGAGTGAGATAAGCTCCATTTTGGAGAAAGGAGAGAG
GTTACCGCAGCCGCCCATATGTACAATTGATGTCTACATGATTATGGTCAAGTGCTGGATGATTG
ATGCAGATAGCCGGCCGAAATTCCGCGAATTGATTATTGAATTTAGCAAAATGGCCCGCGACCCA
CAGCGCTATTTGGTTATTCAAGGGGACGAGAGGATGCATCTGCCAAGCCCAACTGACAGCAACTT
TTACCGCGCCCTTATGGACGAAGAGGATATGGACGACGTTGTGGATGCTGATGAATATTTGATTC
CTCAACAAGGGTTCTTCAGTTCTCCCTCAACTTCCAGAACCCCACTGCTTTCAAGTTTATCCGCA
ACTAGTAATAATAGCACCGTTGCATGTATTGATCGGAATGGGCTGCAAAGCTGCCCCATCAAGGA
AGACTCATTCTTACAACGTTACTCATCTGATCCCACTGGGGCGCTGACTGAAGACTCAATCGACG
ACACCTTTCTTCCAGTCCCGGAGTATATCAACCAAAGTGTCCCCAAGAGACCTGCCGGATCCGTG
CAAAACCCTGTTTATCATAATCAACCTCTCAATCCAGCGCCGTCTAGGGATCCTCATTATCAAGA
TCCCCACTCTACTGCTGTAGGGAACCCAGAGTATCTCAATACAGTTCAACCCACTTGCGTCAACT
CTACCTTTGACAGTCCTGCCCATTGGGCGCAGAAAGGTTCCCATCAGATCTCCCTGGACAATCCA
GACTATCAACAAGATTTCTTTCCTAAAGAAGCCAAACCCAATGGGATATTCAAAGGATCTACCGC
AGAGAATGCGGAATATCTGCGCGTGGCACCCCAAAGCAGCGAATTTATCGGGGCTTAG
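One way to sanity-check a codon-optimized sequence is to confirm that it still translates to the same protein as the unoptimized one. The sketch below does this for the first 21 nucleotides of the two sequences above. It is an illustration, not the Twist tool's method; `translate`, `gc_content`, and `changed_codons` are names introduced here.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def gc_content(dna: str) -> float:
    """Fraction of G and C bases."""
    return (dna.count("G") + dna.count("C")) / len(dna)


def changed_codons(a: str, b: str) -> int:
    """Number of codon positions at which two equal-length sequences differ."""
    return sum(a[i:i + 3] != b[i:i + 3] for i in range(0, len(a), 3))


reverse_translated = "ATGCGTCCGTCCGGCACTGCA"  # first 21 nt from section 3.2
codon_optimized = "ATGAGACCCAGTGGAACCGCC"     # first 21 nt from section 3.3

print(translate(reverse_translated), translate(codon_optimized))
print("codons changed:", changed_codons(reverse_translated, codon_optimized))
print("GC content:", gc_content(reverse_translated), gc_content(codon_optimized))
```

Running the same check on the full sequences should reproduce the UniProt protein sequence from section 3.1 for both versions.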

3.4. You have a sequence! Now what?

Cell-dependent and cell-free technologies could be used.

Cell-dependent method: First, insert the DNA sequence into a plasmid vector, and then transfer it into a host cell (e.g. E. coli). The host cell will transcribe the DNA into mRNA, which will then be translated into the protein.

Part 4: Prepare a Twist DNA Synthesis Order

I have done everything in Part 3 using the Twist Bioscience website, and followed the tutorial to finish the remaining steps.

Part 5: DNA Read/Write/Edit

5.1 DNA Read

(i) What DNA would you want to sequence (e.g., read) and why?

I would want to sequence the DNA of a cancer cell, because we can learn which mutations led to the cancer, and use corresponding drugs (if available) to treat the cancer.

(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?

Illumina sequencing.

Q1: Is your method first-, second- or third-generation or other? How so?

Second-generation (next-generation sequencing), because it reads many clonally amplified short fragments in parallel using sequencing by synthesis.

Q2: What is your input? How do you prepare your input (e.g. fragmentation, adapter ligation, PCR)? List the essential steps.

The input is the DNA extracted from the cancer cell.

Fragmentation: The DNA is fragmented into smaller pieces.

Adapter ligation: Adapters are ligated to the ends of the DNA fragments.

PCR: The DNA fragments are amplified using PCR.

Q3: What are the essential steps of your chosen sequencing technology, how does it decode the bases of your DNA sample (base calling)?

First, DNA fragments are immobilized on a flow cell and amplified to form clusters. Then, fluorescently labeled nucleotides are added to the flow cell. Each time a nucleotide is incorporated into the growing DNA strand by DNA polymerase, a fluorescent signal is emitted. The fluorescence is imaged to determine which nucleotide was incorporated in each cluster, and the per-cycle signals are converted into base calls.

Q4: What is the output of your chosen sequencing technology?

Sequence reads with quality scores. Sequence reads can be assembled into a complete genome sequence.
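The quality scores attached to each base are Phred scores, defined as $Q = -10 \log_{10} p$, where $p$ is the estimated probability that the base call is wrong. The sketch below shows the conversion and the Phred+33 ASCII encoding used in FASTQ files. The function names are introduced here for illustration.

```python
import math


def phred_quality(error_probability: float) -> float:
    """Phred score Q for a base-call error probability p."""
    return -10 * math.log10(error_probability)


def error_probability(q: float) -> float:
    """Inverse of phred_quality."""
    return 10 ** (-q / 10)


def fastq_char(q: int) -> str:
    """Encode an integer Phred score as a Phred+33 FASTQ character."""
    return chr(q + 33)


def decode_fastq_quality(quality_line: str) -> list[int]:
    """Decode a FASTQ quality string (Phred+33) into integer scores."""
    return [ord(c) - 33 for c in quality_line]


print(phred_quality(0.001))          # a 1-in-1000 error chance
print(decode_fastq_quality("II?#"))  # per-base scores for a 4-base read
```

A score of Q30 therefore means a 1 in 1,000 chance that the base is wrong, which is a common threshold when filtering reads.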

5.2 DNA Write

(i) What DNA would you want to synthesize (e.g., write) and why?

I would want to synthesize the DNA sequence of mutated EGFR found in cancers. By synthesizing this DNA sequence and expressing it as protein, we can study its function and develop targeted drugs.

(ii) What technology or technologies would you use to perform this DNA synthesis and why?

I would use phosphoramidite chemical synthesis for short DNA fragments (e.g. 200 nt), and then Gibson assembly to join the short fragments into the full-length DNA sequence.

Q1: What are the essential steps of your chosen synthesis methods?

Phosphoramidite chemical synthesis: One DNA base is added at a time. Each cycle consists of four steps: deprotection, coupling, capping, and oxidation.

Gibson assembly: DNA fragments with overlapping ends are mixed together. Enzymes (exonuclease, DNA polymerase, and DNA ligase) are added to the mixture and stitch the DNA fragments together.
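The role of the overlapping ends can be illustrated in silico: adjacent fragments are joined wherever the end of one matches the start of the next. This is a toy sketch that only models the sequence logic, not the enzymatic reaction. `overlap_length` and `assemble` are hypothetical names, and the 4-base minimum overlap is chosen for the tiny example (real Gibson overlaps are typically tens of base pairs).

```python
def overlap_length(left: str, right: str, min_overlap: int) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def assemble(fragments: list[str], min_overlap: int = 4) -> str:
    """Join fragments in the given order, keeping each shared overlap once."""
    assembled = fragments[0]
    for frag in fragments[1:]:
        k = overlap_length(assembled, frag, min_overlap)
        if k == 0:
            raise ValueError("adjacent fragments share no overlap")
        assembled += frag[k:]
    return assembled


# Three overlapping pieces cut from the start of the section 3.2 sequence.
fragments = ["ATGCGTCCGTCC", "GTCCGGCACTGC", "CTGCAGGGGCC"]
print(assemble(fragments))
```

Fragments without a shared end raise an error, which mirrors the design requirement that every junction in a Gibson assembly needs a unique overlap.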

5.3 DNA Edit

(i) What DNA would you want to edit and why?

I would want to edit the DNA sequence of a gene that is mutated in a genetic disease, such as cystic fibrosis. By editing the DNA sequence to correct the mutation, we can potentially cure the disease.

Q1: How does your technology of choice edit DNA? What are the essential steps?

CRISPR-Cas9 is a commonly used technology for DNA editing.

First, a guide RNA (gRNA) is designed to target the specific DNA sequence to be edited. The gRNA is then complexed with the Cas9 protein to form a ribonucleoprotein (RNP) complex, which is delivered into the target cells. The RNP binds to the target DNA sequence and creates a double-strand break. The cell’s repair mechanisms then repair the break using the supplied repair template.

Q2: What preparation do you need to do (e.g. design steps) and what is the input (e.g. DNA template, enzymes, plasmids, primers, guides, cells) for the editing?

Preparation and input: gRNA sequence that corresponds to the edit site (find), repair template (replace).
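The "find" step can be sketched as a search for candidate gRNA target sites. The example assumes the commonly used SpCas9 enzyme, which requires a 20-nucleotide protospacer immediately followed by an NGG PAM. It scans only the given strand and does no off-target scoring. `find_spcas9_targets` is a hypothetical helper name.

```python
import re


def find_spcas9_targets(dna: str, protospacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for each site followed by an NGG PAM.

    Only the given strand is scanned; a fuller search would also scan the
    reverse complement.
    """
    dna = dna.upper()
    # The lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# First 65 nt of the section 3.2 EGFR sequence, used as a demo target.
demo = "ATGCGTCCGTCCGGCACTGCAGGGGCCGCACTGTTGGCTCTGCTCGCAGCCTTATGCCCCGCCAG"
for start, protospacer, pam in find_spcas9_targets(demo):
    print(start, protospacer, pam)
```

Each hit is only a candidate. In practice, guides are also filtered for similarity to other genomic sites, which relates to the off-target limitation discussed in Q3.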

Q3: What are the limitations of your editing methods (if any) in terms of efficiency or precision?

The main limitation is the off-target effect, where the edits are applied to unintended sites.

Week 3 HW: Lab Automation

Python Script for Opentrons Artwork

I created a design using opentrons-art.rcdonovan.com

Opentrons-Art Website: https://opentrons-art.rcdonovan.com/?id=80fx569l8o4tho4

Google Colab: https://colab.research.google.com/drive/1UPiCmwBP3sIFD_rNVRHeT3YhuiQQ5ZGP#scrollTo=pczDLwsq64mk&line=6&uniqifier=1

The OpentronMock gives the following output:

Code:

  ###
  ### YOUR CODE HERE to create your design
  ###
  sfgfp_points = [(-3.3, -3.3),(-1.1, -3.3),(1.1, -3.3),(3.3, -3.3),(-3.3, -5.5),(-1.1, -5.5),(1.1, -5.5),(3.3, -5.5),(-1.1, -7.7),(1.1, -7.7),(-5.5, -9.9),(-1.1, -9.9),(1.1, -9.9),(5.5, -9.9),(-3.3, -12.1),(-1.1, -12.1),(1.1, -12.1),(3.3, -12.1)]
  mrfp1_points = [(-23.1, 27.5),(-20.9, 27.5),(-18.7, 27.5),(18.7, 27.5),(20.9, 27.5),(23.1, 27.5),(-23.1, 25.3),(-20.9, 25.3),(-18.7, 25.3),(-16.5, 25.3),(-14.3, 25.3),(14.3, 25.3),(16.5, 25.3),(18.7, 25.3),(20.9, 25.3),(23.1, 25.3),(-23.1, 23.1),(-20.9, 23.1),(-18.7, 23.1),(-16.5, 23.1),(-14.3, 23.1),(-12.1, 23.1),(-3.3, 23.1),(-1.1, 23.1),(1.1, 23.1),(3.3, 23.1),(12.1, 23.1),(14.3, 23.1),(16.5, 23.1),(18.7, 23.1),(20.9, 23.1),(23.1, 23.1),(-23.1, 20.9),(-20.9, 20.9),(-18.7, 20.9),(-16.5, 20.9),(-14.3, 20.9),(-12.1, 20.9),(-9.9, 20.9),(-7.7, 20.9),(-5.5, 20.9),(-3.3, 20.9),(-1.1, 20.9),(1.1, 20.9),(3.3, 20.9),(5.5, 20.9),(7.7, 20.9),(9.9, 20.9),(12.1, 20.9),(14.3, 20.9),(16.5, 20.9),(18.7, 20.9),(20.9, 20.9),(23.1, 20.9),(-23.1, 18.7),(-20.9, 18.7),(-18.7, 18.7),(-16.5, 18.7),(-14.3, 18.7),(-12.1, 18.7),(-9.9, 18.7),(-7.7, 18.7),(7.7, 18.7),(9.9, 18.7),(12.1, 18.7),(14.3, 18.7),(16.5, 18.7),(18.7, 18.7),(20.9, 18.7),(23.1, 18.7),(-23.1, 16.5),(-20.9, 16.5),(-18.7, 16.5),(-16.5, 16.5),(-14.3, 16.5),(-12.1, 16.5),(12.1, 16.5),(14.3, 16.5),(16.5, 16.5),(18.7, 16.5),(20.9, 16.5),(23.1, 16.5),(-23.1, 14.3),(-20.9, 14.3),(-18.7, 14.3),(-16.5, 14.3),(16.5, 14.3),(18.7, 14.3),(20.9, 14.3),(23.1, 14.3),(-23.1, 12.1),(-20.9, 12.1),(-18.7, 12.1),(18.7, 12.1),(20.9, 12.1),(23.1, 12.1),(-23.1, 9.9),(-20.9, 9.9),(20.9, 9.9),(23.1, 9.9),(-23.1, 7.7),(-20.9, 7.7),(20.9, 7.7),(23.1, 7.7),(-23.1, 5.5),(23.1, 5.5),(-25.3, 3.3),(-23.1, 3.3),(23.1, 3.3),(25.3, 3.3),(-25.3, 1.1),(-23.1, 1.1),(23.1, 1.1),(25.3, 1.1),(-25.3, -1.1),(-23.1, -1.1),(23.1, -1.1),(25.3, -1.1),(-25.3, -5.5),(-23.1, -5.5),(23.1, -5.5),(25.3, -5.5),(-25.3, -7.7),(25.3, -7.7),(-23.1, -9.9),(23.1, -9.9),(-23.1, -12.1),(23.1, -12.1),(-23.1, -14.3),(23.1, -14.3),(-20.9, -16.5),(20.9, -16.5),(-20.9, -18.7),(-18.7, -18.7),(18.7, -18.7),(20.9, -18.7),(-18.7, -20.9),(-16.5, -20.9),(16.5, -20.9),(18.7, -20.9),(-16.5, -23.1),(-14.3, -23.1),(-12.1, -23.1),(12.1, -23.1),(14.3, -23.1),(16.5, -23.1),(-14.3, -25.3),(-12.1, 
-25.3),(-9.9, -25.3),(-7.7, -25.3),(7.7, -25.3),(9.9, -25.3),(12.1, -25.3),(14.3, -25.3),(-9.9, -27.5),(-7.7, -27.5),(-5.5, -27.5),(-3.3, -27.5),(-1.1, -27.5),(1.1, -27.5),(3.3, -27.5),(5.5, -27.5),(7.7, -27.5),(9.9, -27.5),(-1.1, -29.7),(1.1, -29.7)]
  azurite_points = [(-9.9, 7.7),(-7.7, 7.7),(7.7, 7.7),(9.9, 7.7),(-12.1, 5.5),(-9.9, 5.5),(-7.7, 5.5),(7.7, 5.5),(9.9, 5.5),(12.1, 5.5),(-9.9, 3.3),(9.9, 3.3)]
  mwasabi_points = [(-27.5, -3.3),(-25.3, -3.3),(-23.1, -3.3),(-20.9, -3.3),(-18.7, -3.3),(-16.5, -3.3),(16.5, -3.3),(18.7, -3.3),(20.9, -3.3),(23.1, -3.3),(25.3, -3.3),(27.5, -3.3),(-23.1, -7.7),(-20.9, -7.7),(-18.7, -7.7),(-16.5, -7.7),(16.5, -7.7),(18.7, -7.7),(20.9, -7.7),(23.1, -7.7),(-27.5, -9.9),(-25.3, -9.9),(25.3, -9.9),(27.5, -9.9),(-16.5, -12.1),(16.5, -12.1),(-20.9, -14.3),(-18.7, -14.3),(18.7, -14.3),(20.9, -14.3),(-23.1, -16.5),(23.1, -16.5),(-25.3, -18.7),(25.3, -18.7)]

  scale = 1

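  def draw_points(points, color="Red"):
    # A P20 tip holds at most 20 uL, so split the points into batches of 20 (1 uL per dot).
    segments = []
    for i in range(0, len(points), 20):
      segments.append(points[i : i+20])
    for seg in segments:
      pipette_20ul.pick_up_tip()
      # Aspirate 1 uL per point in this batch from the well holding this color.
      pipette_20ul.aspirate(len(seg), location_of_color(color))
      for x, y in seg:
        # Each point is an (x, y) offset from center_location; scale resizes the whole design.
        adjusted_location = center_location.move(types.Point(x=x*scale, y=y*scale))
        dispense_and_detach(pipette_20ul, 1, adjusted_location)
      pipette_20ul.drop_tip()

  # Draw each point list with the color named in the second argument.
  draw_points(sfgfp_points, "Red")
  draw_points(mrfp1_points, "Green")
  draw_points(azurite_points, "Orange")
  draw_points(mwasabi_points, "Orange")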
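The batching logic in `draw_points` can be checked without a robot or the notebook helpers. The standalone sketch below reproduces only the splitting step. `plan_batches` and the demo points are introduced here for illustration and are not part of the Colab notebook.

```python
def plan_batches(points, max_volume_ul=20, drop_volume_ul=1):
    """Split points into batches that a single aspiration can cover."""
    per_batch = max_volume_ul // drop_volume_ul
    return [points[i:i + per_batch] for i in range(0, len(points), per_batch)]


# 45 made-up points on a horizontal line, 2.2 mm apart.
demo_points = [(x * 2.2, 0.0) for x in range(45)]
batches = plan_batches(demo_points)

print("tips needed:", len(batches))
print("volume per aspiration (uL):", [len(b) for b in batches])
```

Each batch corresponds to one tip and one aspiration, so the number of batches is also the number of tips the run consumes for that color.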

Result

With the help of our TA Ronan, the art was printed with an Opentrons robot. The result is shown below:

Published paper about lab automation

PyLabRobot: An open-source, hardware-agnostic interface for liquid-handling robots and accessories. Wierenga, Rick P. et al. Device, Volume 1, Issue 4, 100111

This paper introduces PyLabRobot, an open-source Python library that provides a unified interface for controlling various liquid-handling robots and accessories, including Opentrons. PyLabRobot also includes a simulator (like the OpentronMock provided in this homework’s Google Colab notebook), which allows users to test and debug their protocols without needing access to the physical robot. Further, this paper also demonstrates the integration with LLMs, allowing users who are not familiar with programming to create protocols using natural language instructions, which are then translated into executable code for the robot.

My plan to use automation tools

I am interested in using lab automation to do machine-learning-guided directed evolution of PETase (the PET plastic degradation enzyme).

First, I will use machine learning models such as ProteinMPNN to design an initial library of PETase variants. I will order the DNA fragments for these variants from Twist Bioscience.

Second, I will use a liquid handler to assemble the DNA fragments into plasmids, and then transform the plasmids into E. coli cells.

Then, I will use a plate reader to measure the activity of the PETase variants in degrading PET plastic. This can also be done in a high-throughput manner using 96-well or 384-well plates with an automation robot.

Finally, I will use the activity data to train a machine learning model to predict the activity of new PETase variants, and then use the model to design the next round of variants for testing. This iterative process can be repeated until we find highly active PETase variants for degrading PET plastic.
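The design, measure, learn loop described above can be sketched in a few lines. This is a toy illustration under loud assumptions: a simple per-site average stands in for a real model such as ProteinMPNN or a trained regressor, `toy_assay` stands in for the plate-reader measurement, and the sequences and all function names are made up for the example.

```python
import random
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rng):
    """Return a copy of seq with one random point mutation."""
    pos = rng.randrange(len(seq))
    new = rng.choice([a for a in AMINO_ACIDS if a != seq[pos]])
    return seq[:pos] + new + seq[pos + 1:]


def fit_site_model(measurements):
    """Average measured activity for every (position, residue) pair seen so far."""
    totals, counts = defaultdict(float), defaultdict(int)
    for seq, activity in measurements.items():
        for site in enumerate(seq):
            totals[site] += activity
            counts[site] += 1
    return {site: totals[site] / counts[site] for site in totals}


def predict(model, seq, default):
    """Score a sequence as the mean of its per-site averages."""
    return sum(model.get(site, default) for site in enumerate(seq)) / len(seq)


def run_campaign(start, assay, rounds=5, library_size=8, seed=0):
    rng = random.Random(seed)
    measurements = {start: assay(start)}
    for _ in range(rounds):
        model = fit_site_model(measurements)                      # learn
        default = sum(measurements.values()) / len(measurements)
        parent = max(measurements, key=measurements.get)
        proposals = {mutate(parent, rng) for _ in range(4 * library_size)}
        candidates = sorted(proposals - set(measurements))        # design
        ranked = sorted(candidates, key=lambda s: predict(model, s, default), reverse=True)
        for seq in ranked[:library_size]:
            measurements[seq] = assay(seq)                        # measure
    best = max(measurements, key=measurements.get)
    return best, measurements[best]


HIDDEN_OPTIMUM = "MKTAYIAKQR"  # made-up sequence defining the toy fitness landscape


def toy_assay(seq):
    """Toy activity: number of positions matching the hidden optimum."""
    return sum(a == b for a, b in zip(seq, HIDDEN_OPTIMUM))


start = "MATAYLAKQG"
best, activity = run_campaign(start, toy_assay)
print(start, toy_assay(start), "->", best, activity)
```

In a real campaign, each call to the assay would be a plate of measurements from the robot, and the model would be retrained on all accumulated data before proposing the next library.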

Labs

Lab writeups:

Subsections of Labs

Week 1 Lab: Pipetting

I used P20 (1-20 µL) pipettes to create a letter “H” pattern with red food coloring solution contained in an Eppendorf tube.

Week 2 Lab: DNA Gel Art

Louisa, Jasmine, Yutong and I worked on this lab together.

TAE Buffer Preparation

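We mixed 8 mL of 50× TAE stock with 492 mL of deionized water to make 500 mL of working TAE buffer. This is slightly more dilute than the standard 1× buffer (about 0.8×, since a 1× dilution would use 10 mL of stock in 490 mL of water). We also added dye to the buffer to make the electrophoresis process traceable.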
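Buffer dilutions follow $C_1 V_1 = C_2 V_2$. The sketch below is a minimal calculator that assumes the TAE stock is the usual 50× concentrate and the working buffer is meant to be 1×. The function names are introduced here for illustration.

```python
def stock_volume_needed(stock_x: float, final_x: float, final_volume_ml: float) -> float:
    """Solve C1 * V1 = C2 * V2 for the stock volume V1 (mL)."""
    return final_x * final_volume_ml / stock_x


def resulting_concentration(stock_x: float, stock_ml: float, final_volume_ml: float) -> float:
    """Concentration (in x) after diluting stock_ml of stock to final_volume_ml."""
    return stock_x * stock_ml / final_volume_ml


needed = stock_volume_needed(stock_x=50, final_x=1, final_volume_ml=500)
print(f"stock: {needed} mL, water: {500 - needed} mL")
print("8 mL of stock in 500 mL gives", resulting_concentration(50, 8, 500), "x")
```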

Agarose Electrophoresis Gel Preparation

First, we added 0.75 g of agarose powder and 75 mL of TAE buffer to a microwavable flask (a 1% w/v gel). We shook the flask to mix the powder and the buffer.

Next, we microwaved the flask in rounds of 20 seconds each until the agarose powder was completely dissolved.

After that, we let the agarose solution cool for about 20 minutes. We then added 7.5 µL of SYBR Safe DNA stain to the solution.

Finally, we poured the agarose solution into a gel mold, inserted the comb, and let it solidify for about 30 minutes. After the gel solidified, we removed the comb and placed the gel in the gel box.

Digestion

Following our gel art design, we used EcoRV, SacI, BamHI, and KpnI. We prepared each mixture of Lambda DNA, restriction enzyme, and buffer according to the protocol. After mixing, we incubated the samples at 37 °C for 30 minutes to allow the digestion to occur.
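The band pattern each digest should produce can be predicted in silico by locating the recognition sites and measuring the distances between cuts. The sketch below uses the standard recognition sequences of the four enzymes and treats the DNA as a linear molecule. The toy sequence is made up for the example, and `cut_positions` and `digest` are hypothetical helper names.

```python
# Recognition sequence and top-strand cut offset for the enzymes used in this lab.
ENZYMES = {
    "EcoRV": ("GATATC", 3),  # GAT^ATC (blunt)
    "SacI": ("GAGCTC", 5),   # GAGCT^C
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "KpnI": ("GGTACC", 5),   # GGTAC^C
}


def cut_positions(dna, enzyme):
    """Positions (0-based, top strand) where the enzyme cuts."""
    site, offset = ENZYMES[enzyme]
    positions, start = [], dna.find(site)
    while start != -1:
        positions.append(start + offset)
        start = dna.find(site, start + 1)
    return positions


def digest(dna, enzymes):
    """Fragment lengths (bp) of a linear molecule, longest first."""
    cuts = sorted({p for e in enzymes for p in cut_positions(dna, e)})
    edges = [0] + cuts + [len(dna)]
    return sorted((b - a for a, b in zip(edges, edges[1:])), reverse=True)


# Made-up 47 bp sequence with one BamHI site and one EcoRV site.
toy = "A" * 10 + "GGATCC" + "T" * 20 + "GATATC" + "C" * 5
print(digest(toy, ["BamHI"]))
print(digest(toy, ["BamHI", "EcoRV"]))
```

Applied to the Lambda DNA sequence, the same function would give the fragment sizes expected in each lane, which is what tools like Benchling compute for the virtual gel.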

Running the Gel

First, we poured the TAE buffer into the gel box until the gel was submerged. Next, we pipetted the digested DNA samples into the wells of the gel.

We then connected the gel box to the power supply and ran the gel at 120 V for about 30 minutes.

Result

Finally, we visualized the gel in the imaging system. The gel art is shown below. It does not look exactly like the design we created, but at least we gave it a try and had fun in the process!

Subsections of Projects

Individual Final Project


Group Final Project
