Homework

Weekly homework submissions:

  • Week 2 HW: DNA Read, write, and edit

    This week, we evaluated the different methods to read, write, and edit DNA. Part 1: Benchling & In-silico Gel Art Make a free account at benchling.com Import the Lambda DNA. Simulate Restriction Enzyme Digestion with the following Enzymes: EcoRI HindIII BamHI KpnI EcoRV SacI Sall Create a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks. I decided to elaborate on the design of a square wine glass. This design is new, and it provides a fresh look for wine.

  • Week 3 HW: Lab automation

    Assignment: Python Script for Opentrons Artwork — DUE BY YOUR LAB TIME!

  1. Your task this week is to Create a Python file to run on an Opentrons liquid handling robot.
  2. Review this week’s recitation and this week’s lab for details on the Opentrons and programming it.
  3. Generate an artistic design using the GUI at opentrons-art.rcdonovan.com. Using the coordinates from the GUI, follow the instructions in the HTGAA26 Opentrons Colab to write your own Python script which draws your design using the Opentrons. You may use AI assistance for this coding — Google Gemini is integrated into Colab (see the stylized star bottom center); it will do a good job writing functional Python, while you probably need to take charge of the art concept. If you’re a proficient programmer and you’d rather code something mathematical or algorithmic instead of using your GUI coordinates, you may do that instead. 4. If you use AI to help complete this homework or lab, document how you used AI and which models made contributions.
  • Week 1 HW: Principles and Practices

    This week we evaluate the principles and governance policies for our new ideas. It is important to avoid the misuse of these new developments.

  1. First, describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project and/or something for which you are already doing in your research, or something you are just curious about.

Subsections of Homework

Week 2 HW: DNA Read, write, and edit

This week, we evaluated the different methods to read, write, and edit DNA.

Part 1: Benchling & In-silico Gel Art Make a free account at benchling.com Import the Lambda DNA.

Simulate Restriction Enzyme Digestion with the following Enzymes:

  1. EcoRI
  2. HindIII
  3. BamHI
  4. KpnI
  5. EcoRV
  6. SacI
  7. Sall

Create a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks.

I decided to elaborate on the design of a square wine glass. This design is new, and it provides a fresh look for wine.

To make this design, I used 3 types of restriction enzymes:

  • EcoRI
  • HindIII
  • BamHI

Part 2: Gel Art - Restriction Digests and Gel Electrophoresis Assignees for the following sections

    MIT/Harvard students = Required

    Committed Listeners = Optional (for those with Lab access)

    Part 3: DNA Design Challenge

    3.1. Choose your protein.

    In recitation, we discussed that you will pick a protein for your homework that you find interesting. Which protein have you chosen and why? Using one of the tools described in recitation (NCBI, UniProt, google), obtain the protein sequence for the protein you chose.

    [Example from our group homework, you may notice the particular format — The example below came from UniProt] />sp|P03609|LYS_BPMS2 Lysis protein OS=Escherichia phage MS2 OX=12022 PE=2 SV=1 METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLL EAVIRTVTTLQQLLT

    I chose kinesin (Kinesin-1 heavy chain, KIF5B). I was impressed by its movement because, during my career, I have seen different animations representing it. This protein walks along microtubules to transport cellular cargo.

    Retrieved from: https://www.chemistryworld.com/research/walking-proteins-tiny-steps-measured-with-germanium-nanospheres/4013257.article

    This is important to me because, although we know that life exists beneath our skin, this kind of animation is a great way to improve our understanding of life. Consequently, I want to increase my knowledge of this protein, especially because this protein is in several cellular components.

    >sp|P33176|KINH_HUMAN Kinesin-1 heavy chain OS=Homo sapiens OX=9606 GN=KIF5B PE=1 SV=1 MADLAECNIKVMCRFRPLNESEVNRGDKYIAKFQGEDTVVIASKPYAFDRVFQSSTSQEQVYNDCAKKIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKLHDPEGMGIIPRIVQDIFNYIYSMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNLSVHEDKNRVPYVKGCTERFVCSPDEVMDTIDEGKSNRHVAVTNMNEHSSRSHSIFLINVKQENTQTEQKLSGKLYLVDLAGSEKVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGSTYVPYRDSKMTRILQDSLGGNCRTTIVICCSPSSYNESETKSTLLFGQRAKTIKNTVCVNVELTAEQWKKKYEKEKEKNKILRNTIQWLENELNRWRNGETVPIDEQFDKEKANLEAFTVDKDITLTNDKPATAIGVIGNFTDAERRKCEEEIAKLYKQLDDKDEEINQQSQLVEKLKTQMLDQEELLASTRRDQDNMQAELNRLQAENDASKEEVKEVLQALEELAVNYDQKSQEVEDKTKEYELLSDELNQKSATLASIDAELQKLKEMTNHQKKRAAEMMASLLKDLAEIGIAVGNNDVKQPEGTGMIDEEFTVARLYISKMKSEVKTMVKRCKQLESTQTESNKKMEENEKELAACQLRISQHEAKIKSLTEYLQNVEQKKRQLEESVDALSEELVQLRAQEKVHEMEKEHLNKVQTANEVKQAVEQQIQSHRETHQKQISSLRDEVEAKAKLITDLQDQNQKMMLEQERLRVEHEKLKATDQEKSRKLHELTVMQDRREQARQDLKGLEETVAKELQTLHNLRKLFVQDLATRVKKSAEIDSDDTGGSAAQKQKISFLENNLEQLTKVHKQLVRDNADLRCELPKLEKRLRATAERVKALESALKEAKENASRDRKRYQQEVDRIKEAVRSKNMARRGHSAQIAKPIRPGQHPAASPTHPSAIRGGGAFVQNSQPVAVRGGGGKQV

    Tool: Uniprot

    3.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.

    The Central Dogma discussed in class and recitation describes the process in which DNA sequence becomes transcribed and translated into protein. The Central Dogma gives us the framework to work backwards from a given protein sequence and infer the DNA sequence that the protein is derived from. Using one of the tools discussed in class, NCBI or online tools (google “reverse translation tools”), determine the nucleotide sequence that corresponds to the protein sequence you chose above.

    [Example: Get to the original sequence of phage MS2 L-protein from its genome phage MS2 genome - Nucleotide - NCBI]

    Lysis protein DNA sequence/ atggaaacccgattccctcagcaatcgcagcaaactccggcatctactaatagacgccggccattcaaacatgaggattacccatgtcgaagacaacaaagaagttcaactctttatgtattgatcttcctcgcgatctttctctcgaaatttaccaatcaattgcttctgtcgctactggaagcggtgatccgcacagtgacgactttacagcaattgcttacttaa

    P33176|KINH_HUMAN Kinesin-1 heavy chain protein DNA sequence 2889 bases atggcggatctggcggaatgcaacattaaagtgatgtgccgctttcgcccgctgaacgaaagcgaagtgaaccgcggcgataaatatattgcgaaatttcagggcgaagataccgtggtgattgcgagcaaaccgtatgcgtttgatcgcgtgtttcagagcagcaccagccaggaacaggtgtataacgattgcgcgaaaaaaattgtgaaagatgtgctggaaggctataacggcaccatttttgcgtatggccagaccagcagcggcaaaacccataccatggaaggcaaactgcatgatccggaaggcatgggcattattccgcgcattgtgcaggatatttttaactatatttatagcatggatgaaaacctggaatttcatattaaagtgagctattttgaaatttatctggataaaattcgcgatctgctggatgtgagcaaaaccaacctgagcgtgcatgaagataaaaaccgcgtgccgtatgtgaaaggctgcaccgaacgctttgtgtgcagcccggatgaagtgatggataccattgatgaaggcaaaagcaaccgccatgtggcggtgaccaacatgaacgaacatagcagccgcagccatagcatttttctgattaacgtgaaacaggaaaacacccagaccgaacagaaactgagcggcaaactgtatctggtggatctggcgggcagcgaaaaagtgagcaaaaccggcgcggaaggcgcggtgctggatgaagcgaaaaacattaacaaaagcctgagcgcgctgggcaacgtgattagcgcgctggcggaaggcagcacctatgtgccgtatcgcgatagcaaaatgacccgcattctgcaggatagcctgggcggcaactgccgcaccaccattgtgatttgctgcagcccgagcagctataacgaaagcgaaaccaaaagcaccctgctgtttggccagcgcgcgaaaaccattaaaaacaccgtgtgcgtgaacgtggaactgaccgcggaacagtggaaaaaaaaatatgaaaaagaaaaagaaaaaaacaaaattctgcgcaacaccattcagtggctggaaaacgaactgaaccgctggcgcaacggcgaaaccgtgccgattgatgaacagtttgataaagaaaaagcgaacctggaagcgtttaccgtggataaagatattaccctgaccaacgataaaccggcgaccgcgattggcgtgattggcaactttaccgatgcggaacgccgcaaatgcgaagaagaaattgcgaaactgtataaacagctggatgataaagatgaagaaattaaccagcagagccagctggtggaaaaactgaaaacccagatgctggatcaggaagaactgctggcgagcacccgccgcgatcaggataacatgcaggcggaactgaaccgcctgcaggcggaaaacgatgcgagcaaagaagaagtgaaagaagtgctgcaggcgctggaagaactggcggtgaactatgatcagaaaagccaggaagtggaagataaaaccaaagaatatgaactgctgagcgatgaactgaaccagaaaagcgcgaccctggcgagcattgatgcggaactgcagaaactgaaagaaatgaccaaccatcagaaaaaacgcgcggcggaaatgatggcgagcctgctgaaagatctggcggaaattggcattgcggtgggcaacaacgatgtgaaacagccggaaggcaccggcatgattgatgaagaatttaccgtggcgcgcctgtatattagcaaaatgaaaagcgaagtgaaaaccatggtgaaacgctgcaaacagctggaaagcacccagaccgaaagcaacaaaaaaatggaagaaaacgaaaaagaactggcggcgtgccagctgcgcattagccagcatgaagcgaaaattaaaagcctgaccgaatatctgcagaacgtggaacagaaaaaacgccagctggaagaaagcgtggatgcgctgagcgaagaactggtgcagctgcgcgcgcaggaaaaagtgcatgaaatggaaaaagaacatctgaacaaagtgcagaccgcgaacgaagtgaaacaggcggtggaacagcagattcagagccatcgcgaaacccatcagaaacagattagcagcctgcgcgatgaagtggaagcgaaagcgaaactgattaccgatctgcaggatcagaaccagaaaatgatgctggaacaggaacgcctgcgcgtggaacatgaaaaactgaaagcgaccgatcaggaaaaaagccgcaaactgcatgaactgaccgtgatgcaggatcgccgcgaacaggcgcgccaggatctgaaaggcctggaagaaaccgtggcgaaagaactgcagaccctgcataacctgcgcaaactgtttgtgcaggatctggcgacccgcgtgaaaaaaagcgcggaaattgatagcgatgataccggcggcagcgcggcgcagaaacagaaaattagctttctggaaaacaacctggaacagctgaccaaagtgcataaacagctggtgcgcgataacgcggatctgcgctgcgaactgccgaaactggaaaaacgcctgcgcgcgaccgcggaacgcgtgaaagcgctggaaagcgcgctgaaagaagcgaaagaaaacgcgagccgcgatcgcaaacgctatcagcaggaagtggatcgcattaaagaagcggtgcgcagcaaaaacatggcgcgccgcggccatagcgcgcagattgcgaaaccgattcgcccgggccagcatccggcggcgagcccgacccatccgagcgcgattcgcggcggcggcgcgtttgtgcagaacagccagccggtggcggtgcgcggcggcggcggcaaacaggtg

    3.3. Codon optimization.

    Once a nucleotide sequence of your protein is determined, you need to codon optimize your sequence. You may, once again, utilize google for a “codon optimization tool”. In your own words, describe why you need to optimize codon usage. Which organism have you chosen to optimize the codon sequence for and why?

    [Example from Codon Optimization Tool | Twist Bioscience while avoiding Type IIs enzyme recognition sites BsaI, BsmBI, and BbsI]

    Lysis protein DNA sequence with Codon-Optimization

    ATGGAAACCCGCTTTCCGCAGCAGAGCCAGCAGACCCCGGCGAGCACCAACCGCCGCCGCCCGTTCAAACATGAAGATTATCCGTGCCGTCGTCAGCAGCGCAGCAGCACCCTGTATGTGCTGATTTTTCTGGCGATTTTTCTGAGCAAATTCACCAACCAGCTGCTGCTGAGCCTGCTGGAAGCGGTGATTCGCACAGTGACGACCCTGCAGCAGCTGCTGACCTAA

    Tool: https://www.idtdna.com/CodonOpt

    Codon Optimization chain

    ATG GCT GAT CTC GCT GAA TGT AAC ATC AAA GTG ATG TGC CGC TTT CGC CCC TTG AAC GAA TCA GAG GTG AAC CGC GGG GAC AAA TAC ATC GCC AAG TTT CAG GGG GAA GAT ACC GTG GTG ATT GCT TCT AAA CCT TAT GCG TTT GAT CGG GTG TTC CAG TCC TCA ACC TCC CAA GAA CAG GTG TAT AAC GAT TGT GCA AAG AAG ATC GTT AAA GAT GTT CTT GAG GGT TAC AAT GGC ACT ATC TTT GCC TAT GGC CAG ACT TCA TCC GGA AAG ACA CAC ACT ATG GAG GGC AAA CTT CAT GAT CCA GAG GGA ATG GGC ATC ATT CCA CGG ATT GTT CAG GAC ATA TTC AAC TAT ATA TAC AGC ATG GAC GAG AAC CTC GAG TTT CAT ATC AAG GTG AGC TAC TTC GAG ATC TAT CTC GAT AAA ATC CGG GAT CTT TTG GAT GTG TCT AAA ACT AAT CTG TCC GTT CAC GAG GAC AAG AAC AGA GTG CCC TAT GTG AAA GGG TGC ACC GAA CGG TTC GTG TGT TCA CCC GAC GAG GTC ATG GAT ACC ATT GAC GAG GGC AAA TCT AAC AGG CAT GTG GCT GTG ACC AAC ATG AAC GAG CAT AGC AGT AGG TCT CAT TCT ATA TTT CTG ATT AAT GTC AAG CAG GAG AAC ACC CAG ACT GAA CAG AAA TTG TCA GGC AAA CTC TAT CTG GTC GAC CTC GCA GGG AGC GAA AAG GTT TCC AAG ACA GGC GCA GAA GGC GCT GTG CTT GAC GAA GCC AAG AAT ATC AAC AAG TCC CTG AGC GCT CTT GGA AAC GTG ATA TCA GCC CTC GCC GAG GGC TCT ACG TAC GTT CCA TAT CGG GAT TCT AAA ATG ACC CGG ATC CTC CAA GAT TCC CTT GGA GGC AAC TGC AGG ACA ACA ATC GTC ATC TGT TGC AGT CCC TCT TCT TAC AAT GAG TCT GAA ACT AAG TCT ACT CTC CTG TTT GGG CAG AGA GCC AAG ACT ATA AAG AAT ACT GTG TGC GTC AAT GTG GAG CTG ACA GCG GAG CAG TGG AAG AAA AAA TAT GAA AAA GAA AAG GAA AAG AAT AAG ATC CTC AGA AAT ACC ATT CAG TGG CTT GAA AAC GAG CTG AAT AGG TGG AGG AAT GGC GAG ACT GTG CCC ATC GAC GAG CAG TTC GAT AAG GAG AAG GCT AAT TTG GAG GCG TTT ACA GTG GAT AAG GAT ATT ACA TTG ACA AAT GAC AAA CCA GCC ACC GCC ATT GGA GTA ATC GGC AAT TTT ACC GAT GCT GAG AGA AGG AAA TGC GAG GAG GAA ATC GCA AAG CTC TAT AAG CAA CTC GAT GAT AAG GAC GAG GAA ATC AAC CAA CAG TCC CAA CTC GTT GAA AAA CTG AAA ACA CAG ATG CTC GAC CAG GAA GAG CTG CTG GCC TCC ACT AGG CGG GAT CAG GAT AAT ATG CAG GCC GAA CTG AAC AGA CTT CAG GCC GAG AAC GAC GCC TCA AAG GAG GAG GTA AAG GAG GTG CTG CAG GCC CTG GAG GAG CTG GCG GTT AAC TAT GAT CAA AAG AGT CAG GAG GTG GAG GAC AAG ACT AAG GAG TAC GAA CTG CTG TCC GAC GAG CTT AAC CAG AAG TCA GCC ACA CTT GCG AGC ATC GAT GCC GAG CTC CAG AAA CTG AAA GAG ATG ACG AAT CAT CAG AAA AAG AGG GCT GCT GAA ATG ATG GCA AGC CTG TTG AAA GAC CTG GCG GAG ATC GGA ATC GCC GTG GGG AAT AAT GAT GTG AAA CAG CCC GAA GGG ACC GGA ATG ATA GAC GAG GAG TTC ACA GTA GCC AGA CTG TAC ATA AGC AAG ATG AAA TCT GAG GTA AAA ACG ATG GTT AAG CGA TGT AAA CAG CTC GAG TCT ACA CAG ACC GAG AGT AAC AAA AAG ATG GAG GAA AAT GAG AAA GAA CTG GCC GCT TGC CAG CTG CGG ATA TCA CAG CAT GAG GCC AAG ATT AAA AGT CTT ACT GAA TAC TTG CAG AAT GTA GAG CAA AAG AAA CGG CAA CTG GAG GAA AGC GTG GAT GCC CTC TCA GAG GAA CTC GTG CAG CTC AGA GCC CAA GAA AAG GTT CAT GAG ATG GAG AAA GAG CAC CTT AAT AAA GTA CAG ACG GCC AAT GAA GTC AAA CAG GCT GTG GAA CAG CAG ATC CAG TCT CAC AGG GAG ACA CAC CAG AAG CAG ATA AGC TCA CTG AGG GAC GAA GTG GAA GCA AAA GCC AAG CTC ATC ACT GAT CTC CAA GAC CAG AAT CAG AAG ATG ATG CTT GAG CAG GAG CGA CTC CGA GTG GAG CAT GAA AAA TTG AAG GCA ACT GAC CAA GAG AAG TCT AGA AAA CTT CAC GAA CTC ACT GTG ATG CAG GAC CGC AGG GAG CAG GCG CGC CAA GAC CTG AAA GGA CTT GAA GAG ACT GTG GCT AAG GAG CTC CAG ACC CTC CAT AAT CTG CGG AAG CTG TTC GTT CAG GAT TTG GCC ACC AGA GTC AAA AAA AGT GCG GAA ATT GAT AGC GAT GAC ACT GGC GGC AGT GCC GCC CAG AAG CAA AAA ATT TCT TTC TTG GAG AAC AAC TTG GAA CAG CTG ACA AAG GTA CAC AAG CAG CTG GTG AGA GAT AAC GCT GAC CTC CGA TGC GAA CTC CCA AAG TTG GAG AAA AGA CTG CGG GCC ACA GCA GAG AGG GTT AAA GCC CTG GAG TCA GCT CTG AAA GAA GCT AAG GAG AAC GCC TCC AGG GAC AGA AAA CGG TAC CAG CAA GAG GTA GAC CGG ATT AAA GAG GCC GTC AGG TCC AAA AAC ATG GCA AGA AGG GGG CAT AGT GCC CAG ATC GCC AAA CCC ATT AGA CCC GGA CAA CAC CCC GCC GCA TCC CCT ACC CAC CCT TCT GCA ATT CGG GGT GGG GGA GCC TTC GTT CAG AAT AGT CAG CCT GTG GCC GTA CGC GGC GGC GGA GGT AAG CAG GTG

    Why you need to optimize codon usage?

    Last week, we discussed how one amino acid might codify for several codons. These preferences vary from one organism to another, which means that if you try to put the gene from one organism into another, the choice of codon used by the gene might be different from the one preferred by the organism. Consequently, the expression of the protein will be affected. This explains why the optimal codon sequence is fundamental to ensure the highest level of expression of one specific protein.

    Which organism have you chosen to optimize the codon sequence for and why?

    Human (Homo sapiens)

    3.4. You have a sequence! Now what?

    What technologies could be used to produce this protein from your DNA? Describe in your words the DNA sequence can be transcribed and translated into your protein. You may describe either cell-dependent or cell-free methods, or both.

    Cell-dependent method:

    In this case, scientists use live cells and their biological capacity to produce one protein. This means that this method employs the central dogma of biology.

    DNA encodes RNA → RNA encodes Protein → Amino Acids Encode Proteins

    1. Transcription of DNA to RNA: During this step, the RNA polymerase uses the DNA strand of nucleic acids to produce an antiparallel RNA chain ( mRNA).

    2. Translation of RNA to protein: Protein synthesis occurs in the cytoplasm with the help of ribosomes. These structures read the mRNA and incorporate each amino acid according to the codon sequence.

    For example, scientists could use a bacterium to produce and purify a human protein. However, there are several disadvantages to this method:

    Living cells are complex and require specific conditions to ensure their growth.

    It is difficult to control the many variables that a cell possesses. The whole process is expensive

    Cell- free methods:

    In this method, scientists follow the central dogma of biology, but in this case, protein synthesis occurs in a controlled environment outside the cell.

    There are three components:

    1. Cell-free extract: Contains all the machinery from the cell to build proteins.
    2. DNA sequence: Provides genetic information to build the protein.
    3. Energy and Cofactors: Energy sources and supplies to facilitate the process

    This method has several advantages:

    • The process is fast; scientists might obtain one protein in a couple of hours.
    • It is more flexible because scientists can improve the reaction to produce the protein, and they do not need to maintain a living cell.
    • Less expensive, because it requires less maintenance in contrast to maintaining living cells
    • Minimal contamination of protein.

    These 2 methods have advantages and disadvantages, but without doubt, we can say that they improve our knowledge of life.

    Part 4: Prepare a Twist DNA Synthesis Order

    4.1. Create a Twist account and a Benchling account

    4.2. Build Your DNA Insert Sequence

    4.3. On Twist, Select The “Genes” Option

    4.4. Select “Clonal Genes” option

    4.5. Import your sequence

    4.6. Choose Your Vector

    This is the plasmid you just built with your expression cassette included. Congratulations on building your first plasmid!

    Part 5: DNA Read/Write/Edit

    5.1 DNA Read

    1. What DNA would you want to sequence (e.g., read) and why? This could be DNA related to human health (e.g. genes related to disease research), environmental monitoring (e.g., sewage waste water, biodiversity analysis), and beyond (e.g. DNA data storage, biobank).

    I want to read OPRM1 (Opioid Receptor Mu 1), which encodes the activity of opioid receptors in humans (MOR). MOR is the target of most opioid analgesics and other medicines related to pain management. Also, it has a fundamental role in dependence on other substances such as nicotine, cocaine, and alcohol.

    Scientists have found several variations in this gene related to a major risk of addiction. In my opinion, as a pharmacist, with the opioid crisis and the emergence of new substances every day, it is important to be aware of possible addictions. Especially, because the result of addictions might be death. And as a healthcare team, we don’t desire that people die because of medicines whose primary purpose was to treat pain.

    The objective of reading this gene is that if in one moment a patient goes to the doctor’s office suffering from chronic pain, the doctor will have the chance to sequence this specific gene in their patient and then know if their patient has a higher risk to develop addictions caused by variations or polymorphisms in this gene, especially A118G.

    Consequently, the doctor will use this information and their knowledge to prescribe a lower dose of medicine with risk of addiction or may try to manage the pain of the patient by other methods, such as physical therapy, massages, or even other medicines from different therapeutic groups.

    2. In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?

    I would like to perform sequencing on the OPRM1 gene using sequencing by synthesis (SBS). This is because this method allows the doctor to take a sample of DNA of the patient, and by using the fluorescent image obtained of each different color from nucleotides, we can compare the sequence obtained from the patient vs a normal sequence of the gene.

    Also, answer the following questions:

    2.1 Is your method first-, second-, or third-generation or other? How so?

    Second-generation sequencing, because it allows sequencing multiple fragments at the same time, which brings several advantages, including fast and economic results.

    2.2 What is your input? How do you prepare your input (e.g. fragmentation, adapter ligation, PCR)? List the essential steps.

    Input: Patient blood sample

    1. Purify the DNA sample
    2. Ensure that the sample is pure and undegraded
    3. Start library preparation:

      →Cut the DNA sample into DNA fragments using high-frequency sound waves or enzymes

      →Add adapters to each DNA fragment

      →Assure that the library contains enough concentration to sequence

    3. What are the essential steps of your chosen sequencing technology? How does it decode the bases of your DNA sample (base calling)?

    In sequencing by synthesis (SBS), the DNA fragments are copied one base at a time. And each nucleotide is marked with a fluorescent dye. This produces an image with the flow cell.

    3.1 What is the output of your chosen sequencing technology?

    After obtaining the flow cell, they pass through a process of demultiplexing, obtaining different reads that will be organized based on a reference genome.

    Consequently, scientists will compare the reference genome with the patient’s sample, and evaluate if there are any polymorphisms in OPRM1.

    5.2 DNA Write

    1.What DNA would you want to synthesize (e.g., write) and why? These could be individual genes, clusters of genes or genetic circuits, whole genomes, and beyond. As described in class thus far, applications could range from therapeutics and drug discovery (e.g., mRNA vaccines and therapies) to novel biomaterials (e.g. structural proteins), to sensors (e.g., genetic circuits for sensing and responding to inflammation, environmental stimuli, etc.), to art (DNA origamis). If possible, include the specific genetic sequence(s) of what you would like to synthesize! You will have the opportunity to actually have Twist synthesize these DNA constructs! :)

    I would like to create a cellular sensor for opioids. The sensor will use a mu-opioid receptor (MOR), whose activity is regulated by the OPRM1 gene. The principal idea is that when the receptor is activated by high concentrations of opioids, it will trigger a genetic circuit that produces a fluorescent protein, causing the cell to have a visible glow that might be easy to detect and measure.

    Higher glows indicate higher concentrations of opioids; this sensor is useful because it allows scientists to evaluate how different doses of opioids affect the activity of the receptors. Also, this sensor might be used in educational programs regarding the use of opioids to show people the activity in cells in a different way.

    2. What technology or technologies would you use to perform this DNA synthesis and why?

    I would like to use Twist Bioscience’s chip-based gene synthesis, because it allows precise and efficient synthesis. It is the easiest way, since I need to digitally design the sensor and submit the design to Twist for synthesis.

    Additionally, my sensor is composed of several DNA fragments, including the gene OPRM1, promoters, fluorescent circuit, etc. This method synthesizes multiple DNA fragments on one chip. It is fast, economic, and accurate.

    Also answer the following questions:

    2.1 What are the essential steps of your chosen sequencing methods?

    The essential types of Twist Bioscience’s chip-based gene synthesis are:

    1. Upload your gene sequence and configure your project
    2. DNA is synthesized at Twist
    3. DNA is assembled
    4. High-quality genes

    One method to verify if the synthesis is correct might be SBS, previously described.

    2.2 What are the limitations of your sequencing method (if any) in terms of speed, accuracy, and scalability?

    Disadvantages of Sequencing by Synthesis (SBS):

    • Sample preparation requires purification, PCR, and fragmentation
    • Acquiring fluorescent dyes might be difficult and expensive, depending on the market.
    • High initial instruments cost
    • It is a second-generation method of synthesis, with short read lengths in contrast to third-generation methods
    • This method has great potential for scalability

    5.3 DNA Edit

    1. What DNA would you want to edit and why? In class, George shared a variety of ways to edit the genes and genomes of humans and other organisms. Such DNA editing technologies have profound implications for human health, development, and even human longevity and human augmentation. DNA editing is also already commonly leveraged for flora and fauna, for example in nature conservation efforts, (animal/plant restoration, de-extinction), or in agriculture (e.g. plant breeding, nitrogen fixation). What kinds of edits might you want to make to DNA (e.g., human genomes and beyond) and why?

    I would like to edit and correct the OPRM1 gene in patients with the A118G polymorphism to reduce the risk of addiction. However, it is important to remember that addictions are multifactorial conditions, which means that reducing the risk does not eliminate it.

    I believe that this is a good framework, especially for patients with chronic pain whose better option to manage the pain is opioids. This approach, with adequate monitoring and supervision by doctors and family members, might reduce the incidence of addictions.

    2. What technology or technologies would you use to perform these DNA edits and why?

    I would like to use CRISPR-Cas9, because it is an editing technology frequently used today, and there are some protocols defined, even though this technology is not widely used for editing humans. It is a well-known technology.

    Also, answer the following questions:

    2.1 How does your technology of choice edit DNA? What are the essential steps?

    It has 2 parts:

    Cas9 protein: Cut DNA

    Guide RNA: Recognize the site of DNA to be edited

    C= Clustered

    R= Regularly

    I= Interspaced

    S= Short

    P= Palindromic

    R= Repeats

    1. CRISPR/Cas9 complex formation
    2. CRISPR/Cas9 complex attaches to the target DNA sequence and induces a double-strand break (DSB) at the specific site
    3. Insertion of donor DNA and results in the transformed DNA sequence

    2.2 What preparation do you need to do (e.g. design steps) and what is the input (e.g. DNA template, enzymes, plasmids, primers, guides, cells) for the editing?

    1. Identify the sequence of the human genome that is causing the disease or problem.
    2. Create a specific RNA
    3. Introducing the complex CRISPR-Cas9 to the cells
    4. The CRISPR-Cas9 complex can edit the sequence by eliminating, modifying, or inserting a new sequence.
    5. Use cells as the biological system.

    Retrieved from: https://www.researchgate.net/figure/Mechanism-of-the-CRISPR-cas9-system-The-first-step-in-this-process-is-the-CRISPR-Cas9_fig2_362382684

    2.3 What are the limitations of your editing methods (if any) in terms of efficiency or precision?

    • In terms of precision, CRISPR-Cas9 has a high frequency of off-target effects (OTEs), specifically ≥50%
    • There is the possibility that CRISPR-Cas9 triggers apoptosis rather than objective gene editing.
    • Immunotoxicity

    References

    • KIF5B kinesin family member 5B [Homo sapiens (human)] - Gene - NCBI. (2025). Nih.gov. https://www.ncbi.nlm.nih.gov/gene/3799
    • P33176 KINH_HUMAN (Homo sapiens(human))- Gene- Retrieved February (2026) Uniprot. https://www.uniprot.org/uniprotkb/P33176/entry
    • supreme_admin. (2025, March 31). Codon Optimization: Understanding the Basics | IDT. IDT. https://www.idtdna.com/page/support-and-education/decoded-plus/codon-optimization-the-basics-explained/
    • Social Science, L. (2020, June 30). 3.4: DNA and Protein Synthesis. Social Sci LibreTexts. https://socialsci.libretexts.org/Courses/College_of_the_Canyons/Anthro_101%3A_Physical_Anthropology/03%3A_Cell_biology/3.04%3A_DNA_and_Protein_Synthesis
    • Brookwell, A., Oza, J. P., & Caschera, F. (2021). Biotechnology Applications of Cell-Free Expression Systems. Life, 11(12), 1367. https://doi.org/10.3390/life11121367
    • Technologies , I. D. (2015). Cell-Free Protein Synthesis Explained | IDT. Integrated DNA Technologies. https://www.idtdna.com/pages/applications/cell-free-protein-synthesis
    • Medicine, N. L. of. (2025, November 25). OPRM1 opioid receptor mu 1 [Homo sapiens (human)] - Gene - NCBI. Www.ncbi.nlm.nih.gov. https://www.ncbi.nlm.nih.gov/gene/4988
    • Taqi, M. M., Faisal, M., & Zaman, H. (2019). OPRM1 A118G polymorphisms and its role in opioid addiction: Implication on severity and treatment approaches. Pharmacogenomics and Personalized Medicine, Volume 12, 361–368. https://doi.org/10.2147/pgpm.s198654
    • ClevaLab. (2022, December 4). Next Generation Sequencing - A Step-By-Step Guide to DNA Sequencing. Www.youtube.com. https://www.youtube.com/watch?v=WKAUtJQ69n8
    • Zhang, X., Jiang, X., Wang, Y., Chen, Q., Jiang, H., Zhang, H., Beltran, A., Yang, W., Chen, T., Liang, C., Cheng, N., Huang, Y., Ding, G., Xie, C., Gao, N., Liu, J., Xu, W., Huang, J., Cai, D., & Zhu, L. (2025). Scaling DNA synthesis with a microchip-based massively parallel synthesis system. Nature Biotechnology. https://doi.org/10.1038/s41587-025-02844-0
    • Fuller, C. W., Middendorf, L. R., Benner, S. A., Church, G. M., Harris, T., Huang, X., Jovanovich, S. B., Nelson, J. R., Schloss, J. A., Schwartz, D. C., & Vezenov, D. V. (2009). The challenges of sequencing by synthesis. Nature Biotechnology, 27(11), 1013–1023. https://doi.org/10.1038/nbt.1585
    • The Power of Silicon-Based DNA Synthesis- Retrieved February (2026) Twist BioScience. https://www.twistbioscience.com/products/genes/gene-synthesis?tab=overview&utm_source=google&utm_medium=cpc&utm_campaign=PSR-GLBL-FY21-1791-GENES-Twist-Genes-Product&adgroup=114820677303&utm_term=gene%20fragment%20synthesis&utm_content=aud-1246333009810:kwd-366151829721&creative=747198843491&device=c&matchtype=b&location=9004247&gad_source=1&gad_campaignid=12061463038&gbraid=0AAAAADdPWR--SRjJbKP9Btyj804YD913x&gclid=CjwKCAiA-sXMBhAOEiwAGGw6LHgdq1r8sKVubeax3HNyhDZuKiraOMwMm2M6z5Vk7xDgdWaBj3uD5hoCwBUQAvD_BwE
    • Jayachandran, M., Fei, Z., & Qu, S. (2022). Genetic advancements in obesity management and CRISPR-Cas9-based gene editing system. Molecular and Cellular Biochemistry, 478. https://doi.org/10.1007/s11010-022-04518-w
    • Mayo Clinic. (2018). CRISPR Explained [YouTube Video]. In YouTube. https://www.youtube.com/watch?v=UKbrwPL3wXE
    • Uddin, F., Rudin, C. M., & Sen, T. (2020). CRISPR Gene Therapy: Applications, Limitations, and Implications for the Future. Frontiers in Oncology, 10(1387). https://doi.org/10.3389/fonc.2020.01387

Week 3 HW: Lab automation

Assignment: Python Script for Opentrons Artwork — DUE BY YOUR LAB TIME!

0. Your task this week is to Create a Python file to run on an Opentrons liquid handling robot.

1. Review this week’s recitation and this week’s lab for details on the Opentrons and programming it.

2. Generate an artistic design using the GUI at opentrons-art.rcdonovan.com.

  • Using the coordinates from the GUI, follow the instructions in the HTGAA26 Opentrons Colab to write your own Python script which draws your design using the Opentrons.
  • You may use AI assistance for this coding — Google Gemini is integrated into Colab (see the stylized star bottom center); it will do a good job writing functional Python, while you probably need to take charge of the art concept.
  • If you’re a proficient programmer and you’d rather code something mathematical or algorithmic instead of using your GUI coordinates, you may do that instead.

4. If you use AI to help complete this homework or lab, document how you used AI and which models made contributions.

5. Sign up for a robot time slot if you are at MIT/Harvard/Wellesley or at a Node offering Opentrons automation. The Python script you created will be run on the robot to produce your work of art!

6. Submit your Python file via this form.

✨✨My code✨✨

https://colab.research.google.com/drive/1rOfQVambbO3m8ZcjPQd7lQDcfa-qjr_j?usp=sharing

My inspiration was this image of the little fox from the story of the little prince.

Post-Lab Questions — DUE BY START OF FEB 24 LECTURE

One of the great parts about having an automated robot is being able to precisely mix, deposit, and run reactions without much intervention, and design and deploy experiments remotely.

For this week, we’d like for you to do the following:

1. Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.

Title: Automation of protein crystallization scaleup via Opentrons 2 liquid handling

Publication: SLAS Technology, 2025

This article is impressive because it explains that protein crystallization is an important process; however, there are several variables that are difficult to control to execute this process, especially on a small scale. When humans develop this process, it might be difficult, exhausting, and the outcomes may be inaccurate.

Consequently, as a new way to resolve this problem, scientists have developed a new approach optimizing protein crystallization trials at the multi-microliter scale with the Opentrons-2 liquid handling robot.

Scientists explain that although there are different robots on the market whose objective is to improve this process, these robots are expensive and their programming is exclusive. On the other hand, we have Opentrons, which is a robot with automation for several purposes, and it can be programmed using Python.

With Python scripts, scientists compare the efficacy and accuracy of the process developed by Opentrons OT-2 vs the manual method.

The materials they used included:

  • Opentrons OT-2
  • Crystallization plates (sitting drop 24-well): for forming the protein crystals.
  • Protein solutions: the proteins they wanted to crystallize.
  • Precipitating and buffer solutions: substances that help the crystals form.

Steps

  1. Plate Preparation
  2. The wells of the crystallization plate were placed in specific positions on the robot deck
  3. Robot Programming
  • Python was used to instruct the robot on how to move liquids
  • Pick up protein from one container.
  • Pick up buffer/precipitate from another container.
  • Mix small drops into the wells of the plate.
  • Running Assays
  • The robot performed all pipetting automatically, drop by drop.

Some plates contained different combinations of protein and precipitate to test various conditions simultaneously.

They illustrate the process with the following images:

2. Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details. While your description/project idea doesn’t need to be set in stone, we would like to see core details of what you would automate. This is due at the start of lecture and does not need to be tested on the Opentrons yet.

For my project, I want to develop a biosensor. I plan to automate sample preparation and measurement using a liquid-handling robot (like Opentrons) and a plate reader (like PHERAstar).

This automation will be used only during laboratory development to ensure reproducible and accurate results. However, this automation will not be included in the final product, because the main goal of my biosensor is that it will be used in ambulatory settings.

Final Project Ideas — DUE BY START OF FEB 24 LECTURE

As explained in this week’s recitation, add 1-3 slides in your Node’s section of this slide deck with 3 ideas you have for an Individual Final Project. Be sure to put your name, city, and country on your slide!

References

  • DeRoo, Jacob B., et al. “Automation of Protein Crystallization Scaleup via Opentrons-2 Liquid Handling.” SLAS Technology, vol. 32, June 2025, p. 100268, pmc.ncbi.nlm.nih.gov/articles/PMC12229254/, https://doi.org/10.1016/j.slast.2025.100268. Accessed 12 Dec. 2025.
  • Taqi, M. M., Faisal, M., & Zaman, H. (2019). OPRM1 A118G polymorphisms and its role in opioid addiction: Implication on severity and treatment approaches. Pharmacogenomics and Personalized Medicine, Volume 12, 361–368. https://doi.org/10.2147/pgpm.s198654
  • Tasoulis, T., & Isbister, G. (2017). A review and database of snake venom proteomes. Toxins, 9(9), 290. https://doi.org/10.3390/toxins9090290
  • Alonso, L. L., Slagboom, J., Casewell, N. R., Samanipour, S., & Kool, J. (2025). Categorization and Characterization of Snake Venom Variability through Intact Toxin Analysis by Mass Spectrometry. Journal of Proteome Research, 24(3), 1329–1341. https://doi.org/10.1021/acs.jproteome.4c00923

Week 1 HW: Principles and Practices

cover image cover image

This week we evaluate the principles and governance policies for our new ideas. It is important to avoid the misuse of these new developments.

1. First, describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project and/or something for which you are already doing in your research, or something you are just curious about.

Portable Biosensor for Rapid Identification of Animal Venom Toxicity Profiles in Emergency Settings

This idea came about because I am from Colombia, where in some rural areas access to health care is a challenge, especially in emergencies. In low- and middle-income countries, it is common for people to encounter dangerous and venomous animals such as snakes, scorpions, spiders, jellyfish, and venomous fish. In many cases, treating a person who has been bitten or stung is difficult because doctors and emergency teams need to know the species and characteristics of the animal to provide appropriate care.

Identifying the animal that bit or stung a person is challenging because sometimes patients are unconscious, or, in general, people do not have the knowledge to identify the animal or describe its characteristics.

In these kinds of situations, the time window is crucial because if the time between the bite and receiving proper medical attention is too long, the chances of successfully treating the person decrease. In this context, it is essential to develop a tool that allows identification of the type of toxin affecting a person, for example, neurotoxic, hemotoxic, cytotoxic, or myotoxic. The use of a portable biosensor for rapid identification of animal venom toxicity profiles will allow identification of the type of venom or toxin affecting a person among the following: neurotoxic, hemotoxic, cytotoxic, or myotoxic. This will improve positive outcomes because it will be easier to treat the symptoms and administer the appropriate antivenom.

The general idea is that the portable biosensor identifies the type of venom using lateral flow assays (LFA) with antibodies directed against families of toxins. For example, snake venom is composed of a variety of proteins and enzymes; however, there are four main groups of proteins classified according to their activity.

Three-finger toxin (3FTx) family:

  • Neurotoxic effects (causing paralysis)
  • Cytotoxic effects

Phospholipases (PLA2s):

  • Neurotoxic effects
  • Cytotoxic effects (through direct/indirect plasma membrane disruption)

Metalloproteases (SVMPs):

  • Hemotoxic effects (cause severe hemorrhage, inflammation, and coagulopathy)

Serine proteases (SVSPs):

  • Multifunctional enzymes

These types of proteins are present in most snake species, which provides an advantage by reducing the impact of geographical variation, because the antibodies used by this sensor will be specific to the protein family rather than to a single type of snake venom.

2. Next, describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future, like ensuring non-malfeasance (preventing harm). Break big goals down into two or more specific sub-goals. Below is one example framework (developed in the context of synthetic genomics) you can choose to use or adapt, or you can develop your own. The example was developed to consider policy goals of ensuring safety and security, alongside other goals, like promoting constructive uses, but you could propose other goals for example, those relating to equity or autonomy.

A. Use the device according to the instructions for use, ensuring only trained and authorized personnel operate it.

Specific objectives:

  • Identify the secondary effects or risks associated with the use of this device
  • Recommend that users comply with safety protocols and instructions.

B. Ensure that the device meets the quality criteria established for medical detection devices by the FDA and other regulatory agencies.

Specific objectives

  • Identify the countries where this technology is urgently needed and therefore comply with the parameters established by their respective regulatory agency.
  • Verify device accuracy and reliability

3. Next, describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”). Try to outline a mix of actions (e.g. a new requirement/rule, incentive, or technical strategy) pursued by different “actors” (e.g. academic researchers, companies, federal regulators, law enforcement, etc). Draw upon your existing knowledge and a little additional digging, and feel free to use analogies to other domains (e.g. 3D printing, drones, financial systems, etc.). Purpose: What is done now and what changes are you proposing? Design: What is needed to make it “work”? (including the actor(s) involved - who must opt-in, fund, approve, or implement, etc) Assumptions: What could you have wrong (incorrect assumptions, uncertainties)? Risks of Failure & “Success”: How might this fail, including any unintended consequences of the “success” of your proposed actions?

Accessibility text Accessibility text

4. Next, score (from 1-3 with, 1 as the best, or n/a) each of your governance actions against your rubric of policy goals.

Accessibility text Accessibility text

5. Last, drawing upon this scoring, describe which governance option, or combination of options, you would prioritize, and why. Outline any trade-offs you considered as well as assumptions and uncertainties. For this, you can choose one or more relevant audiences for your recommendation, which could range from the very local (e.g. to MIT leadership or Cambridge Mayoral Office) to the national (e.g. to President Biden or the head of a Federal Agency) to the international (e.g. to the United Nations Office of the Secretary-General, or the leadership of a multinational firm or industry consortia). These could also be one of the “actor” groups in your matrix.

Based on the scoring of each governance action, these options ranked highest. Consequently, they would be prioritized:

  • Workshops with public health regulatory agencies, antivenom production centers, and poison centers.
  • Priority distribution programs in rural hospitals or high-risk areas.

These actions have a direct influence on the two governance policies:

Accessibility text Accessibility text

The priority distribution programs ensure that populations in high-risk areas have access to this device first, as they are the people who need it most. However, it is important to recognize that these priority programs require funding from regulatory agencies, hospitals, and health departments in order to function properly and avoid, for example, delays because of weather conditions or areas of difficult access, especially communities in middle- and low-income countries where there are some paramilitary groups.

As well, this is relevant for the medical care team in order to learn how to access, use, and interpret the results of the device. Regarding this idea, the workshops will require interest from researchers, the population, health care teams, emergency teams, and paramilitary groups. Additionally, they will require the availability of time to develop the workshops and meetings to monitor the continuing evaluation of the device and its use in the community.

The prioritization of these two actions assumes that all resources will be available consistently, which is uncertain, especially in remote areas. This is why it is important to consider these actions as a whole and not as separate actions. If there is no interest in promoting the device from regulatory agencies, antivenom production centers, and poison centers, it will be difficult for populations in remote areas to begin using the device. These two combined ideas will promote the safe use of the device and save lives in emergencies because it allows the identification of the type of toxin affecting a person after a bite or sting. Moreover, the medical team will provide adequate medical attention.

The target population for this recommendation is national health ministries, regulatory agencies, hospitals, and emergency teams responsible for device distribution and training.

Accessibility text Accessibility text

Warm-up Questions for Week 2

In preparation for Week 2’s lecture on “DNA Read, Write, and Edit," please review these materials:

Lecture 2 slides as posted below. The associated papers that are referenced in those slides. In addition, answer these questions in each faculty member’s section:

Homework Questions from Professor Jacobson:

1. Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?

Accessibility text Accessibility text

Which means that in the total length of the human genome, the polymerase fails in only 64 million bases. Indeed, this situation generates some discrepancy, but it is important to consider that 99% of the human genome is copied successfully. When some mismatches occur, the polymerase has several ways to correct these mistakes.

  • 3'-exonuclease activity
  • Mismatch repair:Fixes mistakes in DNA after replication or transcription, a repair mechanism for endogenous damage
  • Base excision: This method works by excising a single damaged base or nucleotide
  • Nucleotide excision:Use the mechanism Cut and patch

2. How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?

In the present day, we recognize approximately 500 amino acids; however, only 20 are considered in the Protein synthesis. Some scientists believe that this occurs due to a matter of evolution; they explain that evolution prioritized modifications that provided advantages to survive. In the beginning of the process, there are 4 nucleotides in DNA: A, T, C, and G, A, U, C, and G in RNA, these for nucleotides form codons of three nucleotides each, which means that: Accessibility text Accessibility text

From these 64 codons, 61 correspond to amino acids for protein synthesis, and 3 are used as stop signals in the process. Which means that initially we have 20 elements to correspond to 61 options, consequently there are several codons which codify for 1 amino acid. Indeed, to identify how many ways there might be to code 1 protein, it is necessary to know the number of amino acids in the protein.

For example, the Hemoglobin has 146 amino acids in one chain; basically, the calculation would be done like this.

Accessibility text Accessibility text Accessibility text Accessibility text

This is the theory of degeneracy in the genetic code. This theory suggests that the unequal distribution of 64 codons would be caused by nature to maintain stable C : H ratios. Although there are several ways to code a protein, not all of them work properly; this is because, during the protein synthesis process, there are a variety of codons more commonly used than others.

This might cause a problem when the process tries to use codons used rarely, as it will be difficult for the ribosomes decode these types of codons. This theory is known as Codon usage bias.

Homework Questions from Dr. LeProust:

1. What’s the most commonly used method for oligo synthesis currently?

  • Solid-phase phosphoramidite synthesis

It occurs on a solid support held between filters, in columns that allow all reagents and solvents to pass through freely.

Principal component: Solid supports

These are insoluble particles of 50-200 μm in diameter; the oligonucleotide makes bonds with these supports during the synthesis process.

The two most common materials used in solid supports are Controlled-pore glass (CPG) and Polystyrene (PS). Both methods work perfectly when synthesizing oligonucleotides up to 150 bases in length. After that limit, the efficiency rate decreases. One nucleotide is added per synthesis cycle; the whole process includes approximately 14 steps. This is because the growing oligonucleotide blocks the pores, which means that the diffusion of reagents will decrease. Additionally, in the case of PS with oligonucleotides longer than 40 bases, the efficiency rate will decrease because long chains generate steric hindrance.

Advantages:

  • Occurs in columns that allow reagents and solvents to pass through freely
  • Quick completion
  • No purification is required
  • The whole process might be automated

2. Why is it difficult to make oligos longer than 200nt via direct synthesis?

After reaching the limit, we can identify the principal disadvantage of the process, which is that the growing oligonucleotide blocks the pores, and long chains generate steric hindrance, which means that the diffusion of reagents will decrease.

The efficiency of the process is approximately 95.6% per 10 base chain length, which means that if you have a chain of 100nt, and you try to add, for example, 10 more bases, approximately 4 bases of the chain might fail, leaving the chain with 96nt. Imagine that you want to continue the process: you try to add 10 more bases, but again, probably 4 bases will be stuck, which leaves you with a chain of approximately 92 bases.

This is what scientists call coupling efficiency. At the end of the process, a chain of 200nt will have an efficiency rate of 36.9 %, which is less than half.

3. Why can’t you make a 2000bp gene via direct oligo synthesis?

A gene of 2000 bp has more than 200nt, which means that if the efficiency rate is less than half with just 200nt, then the efficiency rate of a gene of that length will be close to 0%.

Homework Question from George Church: Choose ONE of the following three questions to answer; and please cite AI prompts or paper citations used, if any.

1. [Using Google & Prof. Church’s slide #4] What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?

10 Essential amino acids

  1. Histidine (His)
  2. Isoleucine (Ile)
  3. Leucine (Leu)
  4. Lysine (Lys)
  5. Methionine (Met)
  6. Phenylalanine (Phe)
  7. Threonine (Thr)
  8. Tryptophan (Trp)
  9. Valine (Arg)

Lysine Contingency:

Lysine is the first limiting amino acid, which means that it is the amino acid present in food in the lowest quantity, compared to the amount needed by the body. In short, the quantity of lysine present in foods is less than the quantity required. However, we talk about lysine contingency even though there are nine other essential amino acids, because lysine has a fundamental role in the proper functioning of the body.

  1. Acts as a substrate for post-translational modifications and the synthesis of a variety of substances used and produced by the body to work properly
  2. Fundamental to adequate muscle performance
  3. Possible Herpes Simplex Infections Therapy
  4. Modulation of Immune and Cardiovascular Functions
  5. Cancer Therapy
  6. Elimination of Harmful Substances
  7. Osteoporosis Therapy
  8. Wound Healing

When we detail all these uses of lysine, it is possible to notice that since this amino acid is extremely relevant, its availability in the diet is low. Therefore, it makes sense to establish it as a principal amino acid and consider its importance as an advantage in science, for example, developing an organism that depends mostly on lysine; this might be a way of controlling the activity of this organism. Other applications could be developing new therapies to treat diseases or even indicators of diseases.

References

  • Tasoulis, T., & Isbister, G. (2017). A review and database of snake venom proteomes. Toxins, 9(9), 290. https://doi.org/10.3390/toxins9090290 Alonso, L. L., Slagboom, J., Casewell, N. R., Samanipour, S., & Kool, J. (2025). Categorization and Characterization of Snake Venom Variability through Intact Toxin Analysis by Mass Spectrometry. Journal of Proteome Research, 24(3), 1329–1341. https://doi.org/10.1021/acs.jproteome.4c00923
  • Miller, E. (n.d.). GeNotes. GeNotes. https://www.genomicseducation.hee.nhs.uk/genotes/knowledge-hub/genome/
  • Marteijn, J. A., Lans, H., Vermeulen, W., & Hoeijmakers, J. H. J. (2014). Understanding nucleotide excision repair and its roles in cancer and ageing. Nature Reviews Molecular Cell Biology, 15(7), 465–481. https://doi.org/10.1038/nrm3822
  • Behura, S. K., & Severson, D. W. (2012). Codon usage bias: causative factors, quantification methods and genome‐wide patterns: with emphasis on insect genomes. Biological Reviews/Biological Reviews of the Cambridge Philosophical Society, 88(1), 49–61. https://doi.org/10.1111/j.1469-185x.2012.00242.x
  • On the origin of degeneracy in the genetic code. (2019, October). The Royal Society Publishing. https://royalsocietypublishing.org/rsfs/article/9/6/20190038/35070/On-the-origin-of-degeneracy-in-the-genetic
  • Ye, S., & Lehmann, J. (2022). Genetic code degeneracy is established by the decoding center of the ribosome. Nucleic Acids Research, 50(7), 4113–4126. https://doi.org/10.1093/nar/gkac171
  • Qian, Y., Zhang, R., Jiang, X., & Wu, G. (2021). The constraints between amino acids influence the unequal distribution of codons and protein sequence evolution. Royal Society Open Science, 8(6), 201852. https://doi.org/10.1098/rsos.201852
  • ATDBio - Nucleic Acids Book - Chapter 5: Solid-phase oligonucleotide synthesis. (n.d.). ATDBio - Nucleic Acids Book. https://atdbio.com/nucleic-acids-book/Solid-phase-oligonucleotide-synthesis
  • Holeček, M. (2025). Lysine: sources, metabolism, physiological importance, and use as a supplement. International Journal of Molecular Sciences, 26(18), 8791. https://doi.org/10.3390/ijms26188791
  • Lopez, M. J., & Mohiuddin, S. S. (2024, April 30). Biochemistry, essential amino acids. StatPearls - NCBI Bookshelf. https://www.ncbi.nlm.nih.gov/books/NBK557845/
  • MMDB. (n.d.). 2H35: Solution structure of Human normal adult hemoglobin. https://www.ncbi.nlm.nih.gov/Structure/pdb/2H35