Charles Naney — HTGAA Spring 2026


About me

HTGAA Committed Listener (CL) Agreement

I am an HTGAA Committed Listener. My responsibilities are:

  • Watching class lectures and recitations
  • Participating in node reviews
  • Developing and documenting my homework
  • Actively communicating with other students and TAs on the forum
  • Allowing HTGAA and BioClub to share my work (with attribution)
  • Honestly reporting on my work, and appropriately attributing and citing the work of others (both human and non-human)
  • Following locally applicable health and safety guidance
  • Promoting a respectful environment free of harassment and discrimination

Signed by committing this file to my documentation page/repository,

Charles (Charley) C. Naney

2Mar2026

Contact info

Homework

Labs

Projects

Subsections of Charles Naney — HTGAA Spring 2026

Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices

    Class Assignment — DUE BY START OF FEB 10 LECTURE Question 01 First, describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project and/or something for which you are already doing in your research, or something you are just curious about. According to the well-regarded popular science writer Matthew Cobb (2022), since the Asilomar conference in 1975, molecular biologists have been the vanguard in self-regulating when playing God. This means we refrain from conducting our research irresponsibly by deploying unnecessarily hazardous experimental methods. Alas, this also means that some of the most exciting genetic engineering is no longer done. Consider Dr. Oswald Avery’s transforming principle experiment. Blindly take a population of virulent pneumonia bacteria and feed them harmless kin until they lose their aggressive function and magically adapt into weak and indifferent pneumonia. Since Asilomar, this is indeed one kind of experiment that trustworthy principal investigators must abstain from. I get it, and still I contemplate. Wasn’t Avery the best of us, though? Between Schrodinger and Watson, Crick, and Franklin – Dr. Avery intuited DNA into existence with his transforming principle and used it effectively. Surely I didn’t name my oldest son after this man for nothing?

  • Week 2 HW: DNA Read Write and Edit

    Table of contents Software used: Terminal, git, xcode, hugo, benchling, rcdonovan website, twist website. Objective: This week explores the read–write–edit toolkit: sequencing and synthesis workflows, restriction digests and gel electrophoresis, and early genome-editing frameworks.

  • Week 3 HW: Lab Automation

    Homework for HTGAA 2026 (Week 03): Lab Automation Table of contents Software used: Terminal, git, Opentrons, rcdonovan website, Google Colab. Objective: This week we get hands-on (or at least code-on) with pipetting robots.

  • Week 4 HW: Protein Design Part I

    Homework: Protein Design I Assignment Objective: Learn basic concepts: amino acid structure, 3D protein visualization, and the variety of ML-based design tools. Brainstorm as a group how to apply these tools to engineer a better bacteriophage (setting the stage for the final project).

  • Week 5 HW: Protein Design Part ii

    [] Homework — DUE BY START OF MAR 10 LECTURE Part A: SOD1 Binder Peptide Design (From Pranam) Superoxide dismutase 1 (SOD1) is a cytosolic antioxidant enzyme that converts superoxide radicals into hydrogen peroxide and oxygen. In its native state, it forms a stable homodimer and binds copper and zinc. Mechanis Mutations in SOD1 cause familial Amyotrophic Lateral Sclerosis (ALS). Among them, the A4V mutation (Alanine → Valine at residue 4) leads to one of the most aggressive forms of the disease. The mutation subtly destabilizes the N-terminus, perturbs folding energetics, and promotes toxic aggregation.

  • Week 6 HW: Genetic Circuits Part i

    []Homework — DUE BY START OF MAR 17 LECTURE Week 6 HW: Genetic Circuits Part 01 Assignment: DNA Assembly Protocol and Study Questions What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose? A proprietary gold standard heat-stable DNA polymerase alternative to Taq reagent synthesized and sold by Thermo Fisher Scientific. Unlike Taq which was isolated from thermophilic bacteria, Phusion emulates an archaea-based enzyme that evolved in the hydrothermal vents from extremeophile species. They function as DNA polymerases essentially in a form biomimickry with minimal replication error. The purpose of Phusion is to amplify target DNA sequences in the PCR protocol. Phusion PCR is more expensive but worth the investment to increase the accuracy of the run.

  • Week 7 HW: Genetic Circuits Part 2

    []Homework — DUE BY START OF MAR 31 LECTURE at 2PM ET Week 7 HW: Genetic Circuits Part 2 Assignment Part 1: Intracellular Artificial Neural Networks (IANNs) What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions? IANNs are ideal for the continuous transcriptomic-driven change observed in cells that are constantly moving and communicating in their intracellular environment – through analog computations. In contrast, much of the early synbio genetic circuit engineering was digital, with discrete logic gate switch programming or perhaps even through gene knock out (present versus absent) if such a connection would be permitted.

  • Week 9 HW: Cell Free Systems

    Week 9 HW: Cell Free Systems Assignment Part 1: ABC? EFG Reading & Resources

  • Week 10 HW: Imaging and Measurement

    Week 10 HW: Imaging and Measurement Assignment Part 1: ABC? EFG Reading & Resources

Subsections of Homework

Week 1 HW: Principles and Practices


Class Assignment — DUE BY START OF FEB 10 LECTURE

Question 01

First, describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project and/or something for which you are already doing in your research, or something you are just curious about.

According to the well-regarded popular science writer Matthew Cobb (2022), since the Asilomar conference in 1975, molecular biologists have been the vanguard in self-regulating when playing God. This means we refrain from conducting our research irresponsibly by deploying unnecessarily hazardous experimental methods. Alas, this also means that some of the most exciting genetic engineering is no longer done. Consider Dr. Oswald Avery’s transforming principle experiment. Take a population of harmless pneumococci, expose them to extract from their heat-killed virulent kin, and watch them magically acquire the virulent form. Since Asilomar, this is indeed one kind of experiment that trustworthy principal investigators must abstain from. I get it, and still I contemplate. Wasn’t Avery the best of us, though? Between Schrödinger and Watson, Crick, and Franklin, Dr. Avery intuited DNA into existence with his transforming principle and used it effectively. Surely I didn’t name my oldest son after this man for nothing?

Unlike Dr. Avery, I am fortunate to be proposing my HTGAA 2026 project after the discovery of DNA and the Asilomar conference and the Once-in-a-Century Pandemic when mRNA Vaccines and CRISPR gene editing approaches were available. However, like Avery, with molecular biology, we can still take a population perspective to current wicked health problems. My first professional mentor, Dr. Paul Farmer, would often credit his milieu caring for the poorest of the poor as the center of his mission. Though I once aspired to that also, I now reflect that I am fortunate to just be an aspiring Molecular Biologist. In addition, when I marvel over everything that has been achieved since Avery, especially since the Human Genome Project and the advances of systems biology to synthetic biology, I see there are now viable alternatives in biological practice to help others and the living world.

This brings me to the project. I agree with Dr. Aubrey De Grey that biological aging is a vexing, immutable inequality in public health that must be solved. In fact, I am engaged in this research with my excellent Biology PhD mentors at North Carolina Agricultural and Technical State University (NCATSU), and one of them was on the team that first postponed senescence in Drosophila back when Star Wars movies were worthy of the hype.

Like Dr. De Grey, I believe exit velocity will be achieved in our lifetimes by engineering negligible senescence. The difference is that his model species are cohorts of robust, rejuvenated rodents centralized in a single laboratory, and I propose we develop many sites and open science approaches using goats instead. I also think that we will need to develop applied computational systems biology simulators (synthetic biology simulators too if they exist) and at the center of the approach needs to be the host-microbiome.

Why goats? They’re not even a monogastric species. Please hear me out.

According to ChatGPT, the oldest recorded goat in the Guinness World Records is McGinty (22 years, 5 months). The buck was British and from a Pygmy breed. My understanding is that Pygmy goats were originally bred to feed large cats. This record was set in 2003 and I assume it hasn’t been challenged since. Although I never had the pleasure of meeting McGinty, the general indifference evident from his nefarious name, the dusting of a few social media posts, and the overall absence of life-history information make plain that society likely gave up on even understanding goat longevity decades ago. This means that despite living among Homo sapiens for more than 10,000 years and sustaining us in every challenging environment on Earth, we still know more about goats’ genomic diversity than life history. That’s not a bad thing, though, because goats’ genomes and immune systems are as infinitely fascinating as our own.

In addition to not being popular, goats live a life preoccupied by parasites, predators, and food insecurity that is only moderately improved by domestication, let’s be honest. I often reflect on the goats I met in the Galapagos Islands, the first example of extreme biological environments. Goats are not indigenous to the Galapagos. They are migrants. They didn’t migrate there of their own volition, though. Instead they were brought in on rafts and boats a century ago, and to this day a few re-wilded stragglers refuse to go extinct. In fact it’s hard to find an island in the Galapagos that doesn’t have a pile of goat skulls on it. I understand the issue is complicated, but whichever way you land on it, it’s hard to deny that goats are specialists in acclimating to extreme environments. Ironically, it’s Charles Darwin’s theory of Natural Selection that I would like to structure the computational systems biology goat longevity simulator around, particularly using Neo-Darwinian genetics and the postponement of senescence work by Rose, Muller, Luckinbill, and Graves.

I propose a Long Term Experimental Evolution (LTEE) study that leverages synthetic biology and local animal husbandry to study the role of gut microbiomes on cellular senescence in goats. I hypothesize that understanding diversity and abundance in genetic circuitry constituting biological signaling pathways between adaptive, senescence-resistant microbes and Metazoan somatic tissues will yield the putative attractor switches we need to cure cellular senescence and put apoptosis on a toggle switch. That is theoretical, though; I certainly don’t plan to achieve this in 10 weeks or, morally, at all. The point is that one contingent evolutionary endosymbiotic event transformed an alpha-proteobacterium into the power center for every Metazoan cell that came after, with all the effects mitochondria have on oxidative stress accumulation and stabilization. Once you understand that, you can trust that the solution to aging in somatic cells will never again be an if question.

Endpoints I will be investigating are biologically and statistically significant variation in “aging” host and microbe genes identified through differential gene expression. The study will be a multigenerational LTEE for Synthetic Biology 101, targeting the bidirectional interactions between living goat genes and pathways and the microbiota in their gut. My stakeholders are the American Milk Goat Breeding Association and Nanopore, and every isolated mountain village or homesteader that is still alive because of their goat herd.


Question 2

GPG01: Explore synthetic biology for goat life history for a putative Mitocarta or SASP gene and phenotypic pathways that may be useful in future studies to bioengineer negligible senescence in goats.

GPG02: Integrate aim 1 gene with OMICs data using computational model to explore molecule mediated bidirectional interactions between somatic host cells and microbes in goat microbiomes.

GPG03: Consider systems-level synthetic biology interventions for extreme environments that support goat metabolism and gut microbiome health.


Question 3

Next, describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”). Try to outline a mix of actions (e.g. a new requirement/rule, incentive, or technical strategy) pursued by different “actors” (e.g. academic researchers, companies, federal regulators, law enforcement, etc). Draw upon your existing knowledge and a little additional digging, and feel free to use analogies to other domains (e.g. 3D printing, drones, financial systems, etc.).

  • Purpose: What is done now and what changes are you proposing?
  • Design: What is needed to make it “work”? (including the actor(s) involved - who must opt-in, fund, approve, or implement, etc)
  • Assumptions: What could you have wrong (incorrect assumptions, uncertainties)?
  • Risks of Failure & “Success”: How might this fail, including any unintended consequences of the “success” of your proposed actions?

CN Answer Purpose: Humans have been artificially selecting phenotypes in goats to consume for our benefit for more than 10,000 years. I understand goat meat, milk, and fiber are necessary for human differential reproductive success and maintenance. I actually have a working goat farm. My purpose is not to adjudicate my species or rescue another. My intention is to use this incredible opportunity to make amends to another population of Metazoans by helping my peers use synthetic biology to help goats live longer, higher quality lives. I do not pursue this idea to make a more profitable goat commodity either. In summation, my reasoning at HW 1 is: because I know postponing senescence in Metazoans is possible and I care about the welfare of all goats, I want to help others and myself advance negligible senescence in goat somatic cells safely and humanely too. The changes I am proposing for my HTGAA 2026 project, though, are only to expand fair, accurate, timely, accessible open science data about goat life history and genetics, so that through synthetic biology we can help goats live longer, healthier lives.

Design: Based on what I know about Synthetic Biology today, which is far less than I care to admit without embarrassment on a public website, what is needed to make it “work” is Dr. Aubrey De Grey’s brilliance, vision, and ability, plus a sincere heart for animal welfare. Eventually a sustainable research enterprise plan will be useful to achieve endpoints, quality benchmarks, and safety standards. Let me make a clear point first, though: all I am proposing at this juncture is to cast out a net for data, reel it in, and evaluate what I find. This review will require oversight from experts. Can anyone put me in contact with Dr. George Church or Dr. Aubrey De Grey?

Assumptions: I love this question because I am a scientist and I think no other discipline is more pragmatic than ours when it comes to how we manage uncertainty. This is the crux about assumptions. Uncertainty is dangerous. Case in point: because I care about goat welfare, and recognize that I do not yet understand enough about Synthetic Biology interventions or about goat life history, physiology, and genetics, I would never do anything to disrupt in vivo what I am still learning to explain and make predictions about, that being the bioengineering of negligible senescence in goats.

Risks of Failure & “Success”: Here risk communication and management are key. I was an Epidemiologist for 20 years before going back to school. In Epidemiology, although all pandemics are orphans, a breach in prevention is always the root cause. I say this to explain why I am so proactive about preventing failure, especially when it comes to public health. Another example is that part of my PhD training was working in a Molecular Microbiology lab on an LTEE for NASA. There, a significant portion of the job is to monitor and improve protocols and practice to minimize contamination, especially on a 100-day LTEE study.


Stakeholders: Registry of Standard Biological Parts (RSBP), SAB Biotherapeutics (SABBio), World Health Organization (WHO), Rocky Hill Farm in WV (RHFWV)

Rating Scale: ♛ Most Effective ♞ Moderately Effective ♟ Minimally Effective

Does the option: | RSBP | SABBio | WHO
Explore synthetic biology for goat life history for a putative MitoCarta or SASP gene and phenotypic pathways that may be useful in future studies to bioengineer negligible senescence in goats.
• By reducing uncertainty about the life history of goats.
• By reducing uncertainty about synthetic biology interventions for negligible senescence in goats.
Integrate aim 1 gene with OMICs data using computational model to explore molecule mediated bidirectional interactions between somatic host cells and microbes in goat microbiomes.
• By mapping major biological signaling pathways where communication goes from goat somatic cell to -> GIT microbiome
• By mapping major biological signaling pathways where communication goes from GIT microbe in GIT microbiome to goat somatic cell or system
Consider systems-level synthetic biology interventions for extreme environments that support goat metabolism and gut microbiome health.
• By cataloging goat metabolites and microbiota and their interactions
• By modeling seed to goat food webs for diverse local environments.
• By writing an aspirational study protocol.
Other considerations
• Minimizing costs and burdens to stakeholders
• Feasibility?
• Not impede research
• Promote constructive applications

Assignment (Week 2 Lecture Prep) — DUE BY START OF FEB 10 LECTURE

In preparation for Week 2’s lecture on “DNA Read, Write, and Edit,” please review these materials:

  • Lecture 2 slides.
  • The associated papers that are referenced in those slides.

In addition, answer these questions in each faculty member’s section:

Homework Questions from Professor Jacobson:

Question 1

Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome. How does biology deal with that discrepancy?

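Based on the deck, the error rate of polymerase is 1:10^6, or one error for every 1,000,000 base pairs. According to Molecular Biology of the Gene by Watson et al. (2007), the human genome is 3,200 megabase pairs in length, which converts to 3,200,000,000 base pairs. At that rate, copying the whole genome once would leave roughly 3,200 errors. Biology deals with the discrepancy through redundancy and error correction: because DNA is double-stranded, the intact complementary strand serves as a template for repair, polymerases proofread with a 3’→5’ exonuclease activity, and mismatch repair systems catch many of the errors that slip through. Replication forks also move from many different origins at the same time, which speeds up copying but does not by itself lower the per-base error rate. Even so, errors still occur.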
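The comparison above is simple arithmetic, sketched here in Python as a sanity check. The two numbers are the ones quoted in the answer (one error per 10^6 bases from the deck, 3,200,000,000 bp for the genome); the function name is only illustrative.

```python
def expected_errors(genome_bp: int, bases_per_error: int) -> float:
    """Expected number of copying errors for one pass over a genome."""
    return genome_bp / bases_per_error


HUMAN_GENOME_BP = 3_200_000_000   # 3,200 Mbp, as quoted above
BASES_PER_ERROR = 1_000_000       # 1:10^6 error rate from the lecture deck

errors = expected_errors(HUMAN_GENOME_BP, BASES_PER_ERROR)
print(f"Expected errors per genome copy: {errors:,.0f}")
```

The point of the sketch is only the scale: a one-in-a-million error rate still means thousands of errors per copy of a genome this size, which is why additional correction is needed.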


Question 2

How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?

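There are two ways that I am aware of to read a protein’s amino acid sequence: Mass Spec and Edman Degradation. Both of these techniques identify the amino acid residues, each of which was encoded by a triplet codon, for a variety of lengths and structures. The number of DNA sequences that can code for the same protein comes from the degeneracy of the genetic code: 61 sense codons specify 20 amino acids, so each residue has between one and six synonymous codons, about three on average. The total is therefore the product of the codon choices at every position, roughly 3^N for a protein of N residues, rather than 3 codons multiplied by 20 possible amino acids. In practice many of these codes do not work well because of codon usage bias in the host, extreme GC content, repeats or secondary structure that interfere with synthesis and expression, and accidental sequence features such as restriction sites.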
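To make the counting concrete, here is a minimal Python sketch that multiplies the number of synonymous codons at each position of a peptide. It assumes the standard genetic code (NCBI translation table 1), and the six-residue example is just the first six amino acids of the MS2 lysis protein used later on this page; the helper names are mine.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of synonymous codons for each amino acid ('*' marks stop codons).
DEGENERACY: dict[str, int] = {}
for amino_acid in CODON_TABLE.values():
    DEGENERACY[amino_acid] = DEGENERACY.get(amino_acid, 0) + 1


def count_codings(protein: str) -> int:
    """Number of distinct DNA sequences that encode this amino acid sequence."""
    return prod(DEGENERACY[aa] for aa in protein.upper())


peptide = "METRFP"  # first six residues of the MS2 lysis protein
print(peptide, "can be encoded by", count_codings(peptide), "DNA sequences")
```

Because the count is a product over positions, it grows exponentially with protein length, which is why a full-length protein has an astronomically large number of possible coding sequences.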


Homework Questions from Dr. LeProust: [Lecture 2 slides]

Question 1

What’s the most commonly used method for oligo synthesis currently?

Solid-phase phosphoramidite chemical synthesis.


Question 2

Why is it difficult to make oligos longer than 200nt via direct synthesis?

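Each nucleotide addition is a chemical coupling step that is slightly less than 100% efficient, so the fraction of full-length product falls with every added base. Above 100nt this shows up as added complexity from higher chimera rates, drop-out rates, and uniformity constraints, and as slower turn-around time on results.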
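A rough way to see why length is the problem is to treat synthesis as a chain of independent coupling steps, so the full-length yield is the per-step efficiency raised to the number of steps. The sketch below does that in Python. The 99% coupling efficiency is an assumed illustrative value, not a figure from the lecture deck, and counting one coupling per nucleotide is a simplification.

```python
def full_length_yield(coupling_efficiency: float, n_couplings: int) -> float:
    """Fraction of strands that are full length after n coupling steps."""
    return coupling_efficiency ** n_couplings


ASSUMED_EFFICIENCY = 0.99  # illustrative assumption only

for length in (50, 100, 200, 2000):
    fraction = full_length_yield(ASSUMED_EFFICIENCY, length)
    print(f"{length:>5} nt: {fraction:.2e} of strands are full length")
```

Under this assumption the full-length fraction drops to roughly 13% at 200 couplings and becomes vanishingly small at 2000, which is the same scaling argument that applies to the 2000bp question further down.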


Question 3

Why can’t you make a 2000bp gene via direct oligo synthesis?

I couldn’t find an exact answer in the deck, but an article by Yin et al. (2025), cited below, which is relatively up-to-date, reports that the current length record for direct oligo synthesis is between 800 mer - 1728 mer. This alone is an accomplishment, since the authors explain that the rate of errors increases significantly above 100nt. The article also discusses the original 1000nt ceiling due to the steric hindrance of the substrate macromolecule. Please forgive my answer being a little choppy; I am still learning how to converse in this language.

Yin, Y., Arneson, R., Yuan, Y., and Fang, S. (2025). Long oligos: Direct chemical synthesis of genes with up to 1728 nucleotides. Chemical Science, 16(4), 1966–1973. https://doi.org/10.1039/D4SC06958G


Homework Question from George Church: [Lecture 2 slides]

Choose ONE of the following three questions to answer; and please cite AI prompts or paper citations used, if any.

Question 1

[Using Google & Prof. Church’s slide #4] What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?


Question 2

[Given slides #2 & 4 (AA:NA and NA:NA codes)] What code would you suggest for AA:AA interactions?

Here I think you are asking me to provide an alternative code for the foundation of life? The irony is that this is why I am here: I want you all to teach me how to rewrite code for AA:AA interactions. I know this because in preparing my answer I ran AA:AA interactions by AI for a second opinion. My prompt was “what do you think Dr. George Church means by code for AA:AA interactions”. AI tells me that this code may need to demonstrate how sequences of amino acids influence physical interaction rules, which I interpret to be first principles. What code, then, would I suggest to influence AA:AA interactions that are able (if AI’s tip is correct) to perturb first principles? My code, as Dr. Nick Lane would say, should be efficient at turning the feedback loops of matter into biophysical waves of energy. In further preparation I would turn to my greatest advisors in adaptation by Natural Selection: bacteria and their metabolic motifs in Metazoans. I would need a coding system that respects phylogeny and doesn’t imagine I could ever devise a coding system more ingenious than the Krebs cycle or the proton-driven molecular machinery of Complex V in the Electron Transport Chain. Still, it’s a fascinating thought experiment if nothing else. To this end, I used your Acevedo-Rocha and Budisa (2016) paper and AI to find the Poliseno et al. (2024) paper, and though I am not up-to-date on this team, I do suspect there is an Epidemiologist among them, because their example of concordant and discordant pairing of coding and noncoding functions is what my coding system would be based on to optimize around the canonical rigidity of present AA:AA interaction.

Acevedo‐Rocha, C. G., & Budisa, N. (2016). Xenomicrobiology: A roadmap for genetic code engineering. Microbial Biotechnology, 9(5), 666–676. https://doi.org/10.1111/1751-7915.12398

Poliseno, L., Lanza, M., & Pandolfi, P. P. (2024). Coding, or non-coding, that is the question. Cell Research, 34(9), 609–629. https://doi.org/10.1038/s41422-024-00975-8

[(Advanced students)] Given the one paragraph abstracts for these real 2026 grant programs sketch a response to one of them or devise one of your own:

  • https://arpa-h.gov/explore-funding/programs/boss
  • https://www.darpa.mil/research/programs/smart-rbc
  • https://www.darpa.mil/research/programs/go


Assignment (Your HTGAA Website) — DUE BY START OF FEB 10 LECTURE

Begin personalizing your HTGAA website in https://edit.htgaa.org/, starting with your homepage: fill in the template with information about yourself, or remove what’s there and make it your own. Be creative! As with all assignments in HTGAA, be sure to write up every part of this Homework on your HTGAA website in order to receive credit.


Important

For this week only, once your homework is complete and written up on your HTGAA website (and you’ve checked your published website at pages.htgaa.org and are happy with it), fill out the Homework 1 Completion form which David emailed out just after Lecture 1. This Google form expresses your interest in continuing with the course; without it you will not be accepted in HTGAA!

Week 2 HW: DNA Read Write and Edit

Table of contents

Software used:

  • Terminal,
  • git,
  • xcode,
  • hugo,
  • benchling,
  • rcdonovan website,
  • twist website.

Objective:

This week explores the read–write–edit toolkit: sequencing and synthesis workflows, restriction digests and gel electrophoresis, and early genome-editing frameworks.

Background:

Lectures on DNA Read (George Church), Write (Joe Jacobson), & Edit (Emily LeProust), in addition to recitation and Tokyo Biohub node lab meetings.

Methods:

  • Start with touchpoint of Design stage of SynBio DBTL cycle with In-silico Gel Art
  • Build DNA fragments in Benchling with restriction digests for Testing with Gel Electrophoresis
  • Learn from Benchling work & In-silico Gel Art
  • Start to Design or
  • Gel Electrophoresis
  • Obtain protein sequences
  • Plasmid digestion with restriction enzymes,
  • Preparing Twist DNA Synthesis Order

Tasks:

  1. Documentation
  • Make sure to document every step of the in-silico and lab experiments. Make sketches, screenshots, notes, drawings… anything that helps you - and others - understand the experiment. Your documentation should help you - and others - to understand the topic. Don’t be afraid to add things that don’t work. Show your failures - and how you overcame them. Your Documentation should be a description of the amazing journey you are on!
  2. Part 0: Basics of Gel Electrophoresis
  • Attend or watch all the lectures and recitation videos. Optionally watch bootcamp.

Part 1: Benchling & In-silico Gel Art

See the Gel Art: Restriction Digests and Gel Electrophoresis protocol for details.

Overview:

  • The EcoRI RE is sourced from Escherichia coli and recognizes the palindrome 5’-GAATTC-3’ 3’-CTTAAG-5’, cutting between G and A to leave a 5’ sticky end (AATT).
  • The BamHI RE is sourced from Bacillus amyloliquefaciens and scans for 5’-GGATCC-3’ 3’-CCTAGG-5’ to cut between G and G leaving a 5’ sticky end.
  • The HindIII RE is sourced from Haemophilus influenzae and scans for 5’-AAGCTT-3’ 3’-TTCGAA-5’ to cut between A and A leaving a 5’ sticky end.
  • The KpnI RE is sourced from Klebsiella pneumoniae; it requires metal ion cofactors including Mg and Ca ions to complete the cut with fidelity, uses 5’-GGTACC-3’ 3’-CCATGG-5’, and unlike most enzymes in this experimental RE set leaves a 3’ sticky end.
  • The EcoRV RE is also sourced from Escherichia coli, scans for 5’-GATATC-3’ 3’-CTATAG-5’, and leaves the only blunt end in this RE set.
  • The SacI RE is sourced from Streptomyces achromogenes and scans for 5’-GAGCTC-3’ 3’-CTCGAG-5’ leaving a 3’ sticky end.
  • The SalI RE is sourced from Streptomyces albus and scans for 5’-GTCGAC-3’ 3’-CAGCTG-5’ leaving a 5’ sticky end.
  • Source: Recognition sequences and cleavage patterns were verified using the REBASE database (Roberts et al., 2015).
  • Create a pattern/image w/style of Paul Vanouse’s Latent Figure Protocol artworks.
  • Use Ronan’s website as a helpful tool for quickly iterating on designs! Here is the link [https://rcdonovan.com/gel-art].
Benchling_Virtual_Digest_Report
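Since every enzyme in this set recognizes a palindromic site, the bottom strand can be double-checked mechanically: the reverse complement of the top strand should equal the top strand itself. The Python sketch below does that check for the seven recognition sequences listed above. It is only a consistency check on the written sequences and is not a substitute for REBASE.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Bottom strand read 3'->5', aligned under the top strand."""
    return seq.upper().translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Bottom strand read 5'->3'."""
    return complement(seq)[::-1]


# Top-strand recognition sites (5'->3') for the enzymes in this set.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "KpnI": "GGTACC",
    "EcoRV": "GATATC",
    "SacI": "GAGCTC",
    "SalI": "GTCGAC",
}

for enzyme, site in SITES.items():
    is_palindrome = reverse_complement(site) == site
    print(f"{enzyme:8} 5'-{site}-3'  3'-{complement(site)}-5'  palindromic: {is_palindrome}")
```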

HW2 is structured purposefully to make us think like synbio engineers. For example, we pair Restriction Digests with Gel Electrophoresis because large strands of DNA and RNA do not move well through the GE matrix. We need small enough pieces of readable genetic material just to accomplish the lab assay. This makes RD a function necessary to achieve our design objectives. Benchling is a similar addition to the HW2 learning module: we need to see the restriction digests applied to our Lambda model, along with the computational ladder for sizing the pieces of DNA in our GE matrix, and it helps that we can use Benchling in subsequent steps also.
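The idea of a virtual digest can be sketched in a few lines of Python: find every occurrence of a recognition site, cut at a fixed offset inside it, and report the fragment lengths that would show up as bands. This is a simplified sketch, not how Benchling works internally. It assumes a linear molecule, looks only at the top strand, and uses a made-up toy sequence rather than Lambda DNA.

```python
def digest(seq: str, site: str, cut_offset: int) -> list[int]:
    """Fragment lengths (largest first) after cutting a linear sequence.

    cut_offset is the number of bases of the site that stay on the
    left-hand fragment, e.g. 1 for EcoRI (G^AATTC).
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    edges = [0] + cuts + [len(seq)]
    return sorted((b - a for a, b in zip(edges, edges[1:])), reverse=True)


# Toy 21 bp sequence with two EcoRI sites (hypothetical, for illustration).
toy = "AAAGAATTCTTTTGAATTCCC"
print(digest(toy, "GAATTC", 1))  # three fragments of 10, 7 and 4 bases
```

Sorting largest first mirrors how bands appear on a gel, with the longest fragments staying closest to the wells.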

Part 2: Gel Art - Restriction Digests and Gel Electrophoresis

Perform the lab experiment you designed in Part 1 and outlined in the Gel Art: Restriction Digests and Gel Electrophoresis protocol.

Now, if your mind works like mine, it might seem abrupt to leap from the movement of DNA through the GE matrix to proteins. It is less abrupt once you understand the Central Dogma and, even more, the SynBio Design, Build, Test, Learn loop.

  • [https://doi.org/10.1371/journal.pbio.3002116]

Add a bacterial chromosome and plasmid sequenced with Oxford Nanopore MinION, because I am annoyingly meticulous with discovery. In my HW2 discussion questions I am going to sing praises to Nanopore, so it is also better to be consistent in DNA read inputs. I will download the chromosome and plasmid DNA and load them into Benchling. Please note that the GenBank files do not play nicely with Benchling, so I will need to shift to FASTAs.

Escherichia coli DSM 30083 = JCM 1649 = ATCC 11775

  • Chromosome GenBank: https://www.ncbi.nlm.nih.gov/nuccore/CP033092.2/ (CP033092.2 Escherichia coli DSM 30083 = JCM 1649 = ATCC 11775 chromosome, complete genome)
  • Plasmid GenBank: https://www.ncbi.nlm.nih.gov/nuccore/CP033091.2/ (CP033091.2 Escherichia coli DSM 30083 = JCM 1649 = ATCC 11775 plasmid unnamed, complete sequence)

Part 3: DNA Design Challenge

3.1. Choose your protein.

  • In recitation, we discussed that you will pick a protein for your homework that you find interesting. Which protein have you chosen and why? Using one of the tools described in recitation (NCBI, UniProt, google), obtain the protein sequence for the protein you chose.
  • [Example from our group homework, you may notice the particular format — The example below came from UniProt]
  • sp|P03609|LYS_BPMS2 Lysis protein OS=Escherichia phage MS2 OX=12022 PE=2 SV=1
    METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLL
    EAVIRTVTTLQQLLT
  • Considering: RpoS in E. coli K-12. I will download the amino acid sequence for the protein below and work backwards to the genome if I do not find an online reference that isn’t deleted.
83333_0:000b85 {"organism":"Escherichia coli K-12","genome_id":"GCF_000974885.1","pub_prot_id":"WP_000081588.1","pub_gene_id":"SF31_RS18190","description":"RNA polymerase sigma factor RpoS"}
MSQNTLKVHDLNEDAEFDENGVEVFDEKALVEQEPSDNDLAEEELLSQGATQRVLDATQLYLGEIGYSPLLTAEEEVYFARRALRGDVASRRRMIESNLRLVVKIARRYGNRGLALLDLIEEGNLGLIRAVEKFDPERGFRFSTYATWWIRQTIERAIMNQTRTIRLPIHIVKELNVYLRTARELSHKLDHEPSAEEIAEQLDKPVDDVSRMLRLNERITSVDTPLGGDSEKALLDILADEKENGPEDTTQDDDMKQSIVKWLFELNAKQREVLARRFGLLGYEAATLEDVGREIGLTRERVRQIQVEGLRRLREILQTQGLNIEALFRE

https://www.ebi.ac.uk/interpro/result/InterProScan/iprscan5-R20260216-160122-0718-15835993-p1m/internal-1771257679016-348-1/
https://alphafold.ebi.ac.uk/entry/P13445
https://www.ncbi.nlm.nih.gov/datasets/gene/GCF_003697165.1/

Protein Code for RpoS Gene from NZ_CP033092.1:4177924-4178988 Escherichia coli DSM 30083 = JCM 1649 = ATCC 11775 chromosome, complete genome

MSQNTLKVHDLNEDAEFDENGVEVFDEKALVEEEPSDNDLAEEELLSQGATQRVLDATQLYLGEIGYSPLLTAEEEVYFARRALRGDVASRRRMIESNLRLVVKIARRYGNRGLALLDLIEEGNLGLIRAVEKFDPERGFRFSTYATWWIRQTIERAIMNQTRTIRLPIHIVKELNVYLRTARELSHKLDHEPSAEEIAEQLDKPVDDVSRMLRLNERITSVDTPLGGDSEKALLDILADEKENGPEDTTQDDDMKQSIVKWLFELNAKQREVLARRFGLLGYEAATLEDVGREIGLTRERVRQIQVEGLRRLREILQTQGLNIEALFREEVSICQKGQSQARLAFFLLVHGTC*

3.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.

  • The Central Dogma discussed in class and recitation describes the process in which DNA sequence becomes transcribed and translated into protein. The Central Dogma gives us the framework to work backwards from a given protein sequence and infer the DNA sequence that the protein is derived from. Using one of the tools discussed in class, NCBI or online tools (google “reverse translation tools”), determine the nucleotide sequence that corresponds to the protein sequence you chose above.
  • [Example: Get to the original sequence of phage MS2 L-protein from its genome phage MS2 genome - Nucleotide - NCBI]
  • Lysis protein DNA sequence atggaaacccgattccctcagcaatcgcagcaaactccggcatctactaatagacgccggccattcaaacatgaggattacccatgtcgaagacaacaaagaagttcaactctttatgtattgatcttcctcgcgatctttctctcgaaatttaccaatcaattgcttctgtcgctactggaagcggtgatccgcacagtgacgactttacagcaattgcttacttaa
  • Nucleotide Sequence for my gene pick
>gnl|ECOLI|EG10510 rpoS RPOS-MONOMER (complement(2866559..2867551)) Escherichia coli K-12 substr. MG1655
atgAGTCAGA ATACGCTGAA AGTTCATGAT TTAAATGAAG ATGCGGAATT TGATGAGAAC
GGAGTTGAGG TTTTTGACGA AAAGGCCTTA GTAGAACAGG AACCCAGTGA TAACGATTTG
GCCGAAGAGG AACTGTTATC GCAGGGAGCC ACACAGCGTG TGTTGGACGC GACTCAGCTT
TACCTTGGTG AGATTGGTTA TTCACCACTG TTAACGGCCG AAGAAGAAGT TTATTTTGCG
CGTCGCGCAC TGCGTGGAGA TGTCGCCTCT CGCCGCCGGA TGATCGAGAG TAACTTGCGT
CTGGTGGTAA AAATTGCCCG CCGTTATGGC AATCGTGGTC TGGCGTTGCT GGACCTTATC
GAAGAGGGCA ACCTGGGGCT GATCCGCGCG GTAGAGAAGT TTGACCCGGA ACGTGGTTTC
CGCTTCTCAA CATACGCAAC CTGGTGGATT CGCCAGACGA TTGAACGGGC GATTATGAAC
CAAACCCGTA CTATTCGTTT GCCGATTCAC ATCGTAAAGG AGCTGAACGT TTACCTGCGA
ACCGCACGTG AGTTGTCCCA TAAGCTGGAC CATGAACCAA GTGCGGAAGA GATCGCAGAG
CAACTGGATA AGCCAGTTGA TGACGTCAGC CGTATGCTTC GTCTTAACGA GCGCATTACC
TCGGTAGACA CCCCGCTGGG TGGTGATTCC GAAAAAGCGT TGCTGGACAT CCTGGCCGAT
GAAAAAGAGA ACGGTCCGGA AGATACCACG CAAGATGACG ATATGAAGCA GAGCATCGTC
AAATGGCTGT TCGAGCTGAA CGCCAAACAG CGTGAAGTGC TGGCACGTCG ATTCGGTTTG
CTGGGGTACG AAGCGGCAAC ACTGGAAGAT GTAGGTCGTG AAATTGGCCT CACCCGTGAA
CGTGTTCGCC AGATTCAGGT TGAAGGCCTG CGCCGTTTGC GCGAAATCCT GCAAACGCAG
GGGCTGAATA TCGAAGCGCT GTTCCGCGAG taa
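As a sanity check on sequences like the ones above, a DNA coding sequence can be translated back into protein and compared with the amino acid sequence from Part 3.1. The sketch below is a minimal, assumed implementation using the standard genetic code; the function names `translate` and `codons_for` are my own and do not come from any tool used in class. It also shows why reverse translation is ambiguous: most amino acids have more than one codon.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, amino acids listed in TCAG codon order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding (sense-strand) DNA sequence, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # tolerate the spaces and line breaks of BioCyc-style output
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def codons_for(amino_acid: str) -> list[str]:
    """List every codon that encodes one amino acid (one-letter code)."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


if __name__ == "__main__":
    print(translate("atggaaacccgattccctcag"))   # first codons of the MS2 lysis gene
    print(translate("atgAGTCAGA ATACGCTGAA"))   # first codons of rpoS as formatted above
    print(codons_for("L"))                      # leucine has several synonymous codons
```

Running the full gene through `translate` and comparing the result with the UniProt sequence is one way to confirm that the right region was downloaded.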

Nucleotide Code for RpoS Gene from NZ_CP033092.1:4177924-4178988 Escherichia coli DSM 30083 = JCM 1649 = ATCC 11775 chromosome, complete genome (Forward 5’ to 3')

ATGAGTCAGAATACGCTGAAAGTTCATGATTTAAATGAAGATGCGGAATTTGATGAGAACGGAGTTGAGGTTTTTGACGAAAAGGCCTTAGTAGAAGAGGAACCCAGTGATAACGATTTGGCCGAAGAGGAACTGTTATCGCAGGGAGCCACACAGCGTGTGCTGGACGCGACTCAGCTTTACCTTGGTGAGATTGGTTATTCACCACTGTTAACGGCCGAAGAAGAAGTTTATTTTGCGCGTCGCGCACTGCGTGGAGATGTCGCCTCTCGCCGCCGGATGATCGAGAGTAACTTGCGTCTGGTGGTAAAAATTGCCCGCCGTTATGGCAATCGTGGTCTGGCGTTGCTGGACCTGATCGAAGAGGGCAACCTGGGGCTGATCCGCGCGGTAGAGAAGTTTGACCCGGAACGTGGTTTCCGCTTCTCAACATACGCAACCTGGTGGATTCGCCAGACGATCGAACGGGCGATTATGAACCAAACCCGTACTATTCGTTTGCCGATTCACATCGTAAAGGAGCTGAACGTTTACCTGCGAACCGCACGTGAGTTGTCCCATAAGCTGGACCACGAACCAAGTGCGGAAGAGATCGCAGAGCAACTGGATAAGCCAGTTGATGACGTCAGCCGTATGCTTCGTCTTAACGAGCGCATTACCTCGGTAGACACCCCGCTGGGTGGTGATTCCGAAAAAGCGTTGCTGGACATCCTGGCCGATGAAAAAGAGAATGGTCCGGAAGATACCACGCAAGATGACGATATGAAGCAGAGCATCGTCAAATGGCTGTTCGAGCTGAACGCCAAACAGCGTGAAGTACTGGCACGTCGATTCGGTTTGCTGGGGTACGAAGCGGCAACACTGGAAGATGTAGGTCGTGAAATTGGCCTCACCCGTGAACGTGTTCGCCAGATTCAGGTTGAAGGCCTGCGCCGTTTGCGCGAAATCCTGCAAACGCAGGGGCTGAATATCGAAGCGCTGTTCCGCGAAGAAGTAAGCATCTGTCAGAAAGGCCAGTCTCAAGCGAGGCTGGCTTTTTTTCTTTTGGTACATGGTACATGTTGA

Reverse Complement Nucleotide Code for RpoS Gene from NZ_CP033092.1:4177924-4178988 Escherichia coli DSM 30083 = JCM 1649 = ATCC 11775 chromosome, complete genome (complementary strand, written 5’ to 3')

TCAACATGTACCATGTACCAAAAGAAAAAAAGCCAGCCTCGCTTGAGACTGGCCTTTCTGACAGATGCTTACTTCTTCGCGGAACAGCGCTTCGATATTCAGCCCCTGCGTTTGCAGGATTTCGCGCAAACGGCGCAGGCCTTCAACCTGAATCTGGCGAACACGTTCACGGGTGAGGCCAATTTCACGACCTACATCTTCCAGTGTTGCCGCTTCGTACCCCAGCAAACCGAATCGACGTGCCAGTACTTCACGCTGTTTGGCGTTCAGCTCGAACAGCCATTTGACGATGCTCTGCTTCATATCGTCATCTTGCGTGGTATCTTCCGGACCATTCTCTTTTTCATCGGCCAGGATGTCCAGCAACGCTTTTTCGGAATCACCACCCAGCGGGGTGTCTACCGAGGTAATGCGCTCGTTAAGACGAAGCATACGGCTGACGTCATCAACTGGCTTATCCAGTTGCTCTGCGATCTCTTCCGCACTTGGTTCGTGGTCCAGCTTATGGGACAACTCACGTGCGGTTCGCAGGTAAACGTTCAGCTCCTTTACGATGTGAATCGGCAAACGAATAGTACGGGTTTGGTTCATAATCGCCCGTTCGATCGTCTGGCGAATCCACCAGGTTGCGTATGTTGAGAAGCGGAAACCACGTTCCGGGTCAAACTTCTCTACCGCGCGGATCAGCCCCAGGTTGCCCTCTTCGATCAGGTCCAGCAACGCCAGACCACGATTGCCATAACGGCGGGCAATTTTTACCACCAGACGCAAGTTACTCTCGATCATCCGGCGGCGAGAGGCGACATCTCCACGCAGTGCGCGACGCGCAAAATAAACTTCTTCTTCGGCCGTTAACAGTGGTGAATAACCAATCTCACCAAGGTAAAGCTGAGTCGCGTCCAGCACACGCTGTGTGGCTCCCTGCGATAACAGTTCCTCTTCGGCCAAATCGTTATCACTGGGTTCCTCTTCTACTAAGGCCTTTTCGTCAAAAACCTCAACTCCGTTCTCATCAAATTCCGCATCTTCATTTAAATCATGAACTTTCAGCGTATTCTGACTCAT

RNA Nucleotide Code for RpoS Gene from NZ_CP033092.1:4177924-4178988 Escherichia coli DSM 30083 = JCM 1649 = ATCC 11775 chromosome, complete genome

AUGAGUCAGAAUACGCUGAAAGUUCAUGAUUUAAAUGAAGAUGCGGAAUUUGAUGAGAACGGAGUUGAGGUUUUUGACGAAAAGGCCUUAGUAGAAGAGGAACCCAGUGAUAACGAUUUGGCCGAAGAGGAACUGUUAUCGCAGGGAGCCACACAGCGUGUGCUGGACGCGACUCAGCUUUACCUUGGUGAGAUUGGUUAUUCACCACUGUUAACGGCCGAAGAAGAAGUUUAUUUUGCGCGUCGCGCACUGCGUGGAGAUGUCGCCUCUCGCCGCCGGAUGAUCGAGAGUAACUUGCGUCUGGUGGUAAAAAUUGCCCGCCGUUAUGGCAAUCGUGGUCUGGCGUUGCUGGACCUGAUCGAAGAGGGCAACCUGGGGCUGAUCCGCGCGGUAGAGAAGUUUGACCCGGAACGUGGUUUCCGCUUCUCAACAUACGCAACCUGGUGGAUUCGCCAGACGAUCGAACGGGCGAUUAUGAACCAAACCCGUACUAUUCGUUUGCCGAUUCACAUCGUAAAGGAGCUGAACGUUUACCUGCGAACCGCACGUGAGUUGUCCCAUAAGCUGGACCACGAACCAAGUGCGGAAGAGAUCGCAGAGCAACUGGAUAAGCCAGUUGAUGACGUCAGCCGUAUGCUUCGUCUUAACGAGCGCAUUACCUCGGUAGACACCCCGCUGGGUGGUGAUUCCGAAAAAGCGUUGCUGGACAUCCUGGCCGAUGAAAAAGAGAAUGGUCCGGAAGAUACCACGCAAGAUGACGAUAUGAAGCAGAGCAUCGUCAAAUGGCUGUUCGAGCUGAACGCCAAACAGCGUGAAGUACUGGCACGUCGAUUCGGUUUGCUGGGGUACGAAGCGGCAACACUGGAAGAUGUAGGUCGUGAAAUUGGCCUCACCCGUGAACGUGUUCGCCAGAUUCAGGUUGAAGGCCUGCGCCGUUUGCGCGAAAUCCUGCAAACGCAGGGGCUGAAUAUCGAAGCGCUGUUCCGCGAAGAAGUAAGCAUCUGUCAGAAAGGCCAGUCUCAAGCGAGGCUGGCUUUUUUUCUUUUGGUACAUGGUACAUGUUGA
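The forward, reverse-complement, and RNA versions above are related by two simple string operations, sketched below under the assumption that the input is the coding (sense) strand written 5' to 3'. The function names are illustrative only.

```python
# Map each base to its Watson-Crick partner, preserving case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand sequence: same order of bases, U in place of T."""
    return coding_dna.upper().replace("T", "U")


if __name__ == "__main__":
    tail = "ACATGTTGA"                 # last nine bases of the forward sequence above
    print(reverse_complement(tail))    # should match the start of the reverse complement
    print(transcribe("ATGAGTCAG"))     # should match the start of the RNA sequence
```

Note that the mRNA matches the coding strand, not its reverse complement; the reverse complement corresponds to the template strand that RNA polymerase reads.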

Source: https://biocyc.org/ECOLI/sequence-rc?type=GENE&object=EG10510

3.3. Codon optimization.

Lysis protein DNA sequence with Codon-Optimization ATGGAAACCCGCTTTCCGCAGCAGAGCCAGCAGACCCCGGCGAGCACCAACCGCCGCCGCCCGTTCAAACATGAAGATTATCCGTGCCGTCGTCAGCAGCGCAGCAGCACCCTGTATGTGCTGATTTTTCTGGCGATTTTTCTGAGCAAATTCACCAACCAGCTGCTGCTGAGCCTGCTGGAAGCGGTGATTCGCACAGTGACGACCCTGCAGCAGCTGCTGACCTAA
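To make the idea of codon optimization concrete, here is a deliberately naive sketch that back-translates a protein using one fixed codon per amino acid. The `PREFERRED_CODON` table is an assumption for illustration: most entries are codons that appear in the optimized sequence above, and the entries for G and W are filled in by hand because the lysis protein contains neither residue. Real optimizers, including the one on the Twist website, likely weigh codon usage frequencies and synthesis constraints and mix synonymous codons, so their output will not match this one-codon-per-residue result exactly.

```python
# Hypothetical one-codon-per-amino-acid table (illustrative, not an official usage table).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGC", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def back_translate(protein: str) -> str:
    """Naive reverse translation: replace each residue with its single preferred codon."""
    protein = "".join(protein.split()).upper()
    return "".join(PREFERRED_CODON[aa] for aa in protein)


if __name__ == "__main__":
    # First seven residues of the MS2 lysis protein.
    print(back_translate("METRFPQ"))
```

For these first seven residues the naive result agrees with the start of the optimized sequence; further along the two diverge because the optimized sequence alternates between synonymous codons such as TTT/TTC and CGC/CGT.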

3.4. You have a sequence! Now what?

3.5. [Optional] How does it work in nature/biological systems?

  1. Describe how a single gene codes for multiple proteins at the transcriptional level.
  2. Try aligning the DNA sequence, the transcribed RNA, and also the resulting translated Protein like a provided example at [https://2026a.htgaa.org/2026a/course-pages/weeks/week-02/index.html]

Reading DNA

Part 4: Prepare a Twist DNA Synthesis Order

4.1. Create a Twist account, and Benchling account…

  • create Twist and Benchling accounts
  • Pick our protein! I will pick a protein related to aging for my final project; for now I am just trying to keep my head above water on HW2, so the protein I pick is the example provided. See the code block below, but what sort of nucleotides are “M E T…”? Clearly those are not nucleotides. They are single-letter codes for amino acids, each of which is encoded by a codon of three nucleotides. Here we are given the top-down Build of a protein, so we must run the Central Dogma in reverse: infer the RNA codons from the amino acids (reverse translation), then convert that RNA back to DNA.
>sp|P03609|LYS_BPMS2 Lysis protein OS=Escherichia phage MS2 OX=12022 PE=2 SV=1 METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLL EAVIRTVTTLQQLLT
                     /note="unnamed protein product; L-protein"
                     /codon_start=1
                     /transl_table=11
                     /protein_id="CAA23990.1"
                     /db_xref="GOA:P03609"
                     /db_xref="InterPro:IPR022599"
                     /db_xref="UniProtKB/Swiss-Prot:P03609"
                     /translation="METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFL
                     AIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT"
     CDS             1761..3398

Here is a crude example of what running backwards looks like. In this instance we go all the way back (1996) to the original sequence of the phage MS2 L-protein from its genome. This is an excerpt from the GenBank record for the “phage MS2 genome” [https://www.ncbi.nlm.nih.gov/nuccore/V00642].

Please note that this sequence does not come from the bottom of the GenBank file. Instead, the selected region must be extracted and then trimmed further to match the code provided below from the HW2 blog. With the correct NCBI link we can now confirm that the code from the blog actually came from this GenBank record [https://www.ncbi.nlm.nih.gov/nuccore/NC_001417.2?from=1678&to=1905&report=genbank]. I will also move this GenBank file into Benchling in place of the previous file.

          atggaaacccgattccctcagcaatcgcagcaaactccggcatctactaatagacgccggccattcaaacatgaggattacccatgtcgaagacaacaaagaagttcaactctttatgtattgatcttcctcgcgatctttctctcgaaatttaccaatcaattgcttctgtcgctactggaagcggtgatccgcacagtgacgactttacagcaattgcttacttaa

A closer match of genome nucleotides is obtained through another NCBI lookup [https://www.ncbi.nlm.nih.gov/nuccore/NC_001417.2?report=fasta&from=1643&to=1938], though even here the resulting gene fragment must be further trimmed.

>NC_001417.2:1643-1938 phage MS2 genome
GCTTATTGTTAAGGCA|
ATGCAAGGTCTCCTAAAAGATGGAAACCCGATTCCCTCAGCAATCGCAGCAAAC
TCCGGCATCTACTAATAGACGCCGGCCATTCAAACATGAGGATTACCCATGTCGAAGACAACAAAGAAGT
TCAACTCTTTATGTATTGATCTTCCTCGCGATCTTTCTCTCGAAATTTACCAATCAATTGCTTCTGTCGC
TACTGGAAGCGGTGATCCGCACAGTGACGACTTTACAGCAATTGCTTACTTAA|
GGGACGAATTGCTCACA
AAGCATCCGACCTTAG
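The trimming step can be done by converting genome coordinates into offsets within the downloaded fragment. The sketch below assumes 1-based, inclusive coordinates as used in the NCBI `from=` and `to=` URL parameters; `extract_region` is a hypothetical helper, not an NCBI or Benchling function.

```python
def extract_region(fragment: str, fragment_start: int, start: int, end: int) -> str:
    """Cut genome positions start..end (1-based, inclusive) out of a fragment
    whose first base sits at genome position fragment_start."""
    if start < fragment_start or end < start:
        raise ValueError("requested region is not inside the fragment")
    lo = start - fragment_start        # 0-based offset of the first wanted base
    hi = end - fragment_start + 1      # Python slices exclude the upper bound
    if hi > len(fragment):
        raise ValueError("requested region runs past the end of the fragment")
    return fragment[lo:hi]


if __name__ == "__main__":
    # First 44 bases of the 1643-1938 fragment shown above, with the pipe marker removed.
    head = "GCTTATTGTTAAGGCAATGCAAGGTCTCCTAAAAGATGGAAACC"
    print(extract_region(head, 1643, 1678, 1686))   # first three codons of the L gene
    print(1905 - 1678 + 1)                          # length of the 1678-1905 region
```

The 1678-1905 region is 228 nucleotides long, which is 76 codons: the 75 residues of the lysis protein plus one stop codon.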

Reflecting: since we need the gene that codes for the LYS_BPMS2 Lysis protein in Escherichia phage MS2, we go back to a GenBank file from 1996, when virology was the approach in molecular biology for engineering tag segments of an RNA strand with stem loops in the translation phase of the Central Dogma. The record is based on the original RNA virus from which MS2 was derived.

  • select Genes on the page with prompt “what can twist build for you?” for HW2
  • name the project “L protein” with “L” for “Lysis” for HW2.
  • select Clonal Genes order card and press “Order Now” when prompted to select gene type for HW2.
  • Avoid my mistake: this next page takes us to an “Excel-like” worksheet in which we develop our request. The old-school way was to download and upload meticulously formatted Excel spreadsheets; we are advanced humans capable of using web forms. Before we enter the DNA we require into this order form, we have to work through the DNA we were given to read in HW2. After completing the optimization process on the Twist website, we now have a codon-optimized Lysis protein DNA sequence.
ATGGAAACCCGCTTTCCGCAGCAGAGCCAGCAGACCCCGGCGAGCACCAACCGCCGCCGCCCGTTCAAACATGAAGATTATCCGTGCCGTCGTCAGCAGCGCAGCAGCACCCTGTATGTGCTGATTTTTCTGGCGATTTTTCTGAGCAAATTCACCAACCAGCTGCTGCTGAGCCTGCTGGAAGCGGTGATTCGCACAGTGACGACCCTGCAGCAGCTGCTGACCTAA

Optional: If we were going to synthesize more of this protein, we now have a set of genetic instructions to read in order to grow those proteins. However, there are different methods by which we can Build those proteins: we can consider cell-dependent or cell-free approaches. I will explain more about these when I pick my protein.

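In preparation for the next steps, remember that my codon-optimized Lysis protein DNA sequence is the one above. The Benchling instructions to transcribe a gene to RNA are:

  • “Highlight the DNA sequence of interest.”
  • “Right-click and select Copy Special.”
  • “Choose the Reverse Complement option to get the anti-sense strand (RNA equivalent).”
  • “Create a New DNA/RNA Sequence and paste the sequence, ensuring the type is set to “RNA”.”

Note that the mRNA itself has the same base order as the coding (sense) strand with U in place of T, as in the RNA sequence in Part 3.2; the reverse complement is the template strand.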

4.2. Build Your DNA Insert Sequence

  • Let’s first organize our directories in Benchling for the assembly line
  • Create a folder for the Registry of Standard Biological Parts [https://parts.igem.org/Part:BBa_J23106]. In that folder, create the following folders:
      • A_Promoter
      • B_RBS
      • C_Start Codon
      • D_Coding Sequence
      • E_7x His Tag
      • F_Stop Codon
      • G_Terminator

HW2 Objective of assembly: make a sequence that will make E. coli glow fluorescent green under UV light by constitutively (always) expressing sfGFP (a green fluorescent protein).

In Benchling, select New DNA/RNA sequence

Give your insert sequence a name and select DNA with a Linear topology (this is a linear sequence that will be inserted into a circular backbone vector of our choosing).

Go through each piece of the given DNA sequences highlighted below (Promoter, RBS, Start Codon, Coding Sequence, His Tag, Stop Codon, Terminator) and paste the sequences into the Benchling file one after the other (replacing the coding sequence with your codon optimized DNA sequence of interest!). Each time you add a new piece of the sequence, make sure to annotate by right clicking over the sequence and creating an annotation that describes what each piece (e.g., Promoter, RBS, etc.) is (see image below).

Promoter (e.g. BBa_J23106) TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC

RBS (e.g. BBa_B0034 with spacers for optimal expression) CATTAAAGAGGAGAAAGGTACC

Start Codon ATG

Coding Sequence (your codon optimized DNA for a protein of interest, sfGFP for example) AGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCACAAATTTTCTGTCCGTGGAGAGGGTGAAGGTGATGCTACAAACGGAAAACTCACCCTTAAATTTATTTGCACTACTGGAAAACTACCTGTTCCGTGGCCAACACTTGTCACTACTCTGACCTATGGTGTTCAATGCTTTTCCCGTTATCCGGATCACATGAAACGGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGGAACGCACTATATCTTTCAAAGATGACGGGACCTACAAGACGCGTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATCGTATCGAGTTAAAGGGTATTGATTTTAAAGAAGATGGAAACATTCTTGGACACAAACTCGAGTACAACTTTAACTCACACAATGTATACATCACGGCAGACAAACAAAAGAATGGAATCAAAGCTAACTTCAAAATTCGCCACAACGTTGAAGATGGTTCCGTTCAACTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCGACACAATCTGTCCTTTCGAAAGATCCCAACGAAAAGCGTGACCACATGGTCCTTCTTGAGTTTGTAACTGCTGCTGGGATTACACATGGCATGGATGAGCTCTACAAA

7x His Tag (Let’s add a 7×His tag at the C-terminus of the protein to enable protein purification from E. coli) CATCACCATCACCATCATCAC

Stop Codon TAA

Terminator (e.g. BBa_B0015) CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA

Once you’ve completed this, click on Linear Map to preview the entire sequence. If you intend to have a TA review a sequence in the future, this is a good way to verify that all sections are annotated!

This is not required for this exercise, but to share your design with others, please ensure that link sharing is turned on! (Optional) Share your final sequence link with a TA for review!

This insert sequence you built is commonly referred to as an expression cassette in molecular biology (a sequence you can drop into any vector and it’ll perform its function). Go ahead and download the FASTA file for the sequence you made.
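The same assembly can be sketched outside Benchling: concatenate the parts in order, record where each part lands so it can be annotated, and write the result as FASTA. The part sequences below are copied from the list above; the function names, the record name, and the 70-character line width are my own assumptions, and this is only an illustration of the bookkeeping, not a replacement for the Benchling workflow.

```python
PROMOTER = "TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC"          # BBa_J23106
RBS = "CATTAAAGAGGAGAAAGGTACC"                            # BBa_B0034 with spacers
HIS_TAG = "CATCACCATCACCATCATCAC"                         # 7x His
TERMINATOR = (
    "CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGG"
    "TGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA"
)                                                         # BBa_B0015


def build_cassette(coding_sequence: str):
    """Join the parts in order and return (sequence, features).

    Each feature is (name, start, end) with 1-based inclusive coordinates,
    which is how annotations are usually displayed in sequence editors.
    """
    parts = [
        ("Promoter", PROMOTER),
        ("RBS", RBS),
        ("Start Codon", "ATG"),
        ("Coding Sequence", coding_sequence.upper()),
        ("7x His Tag", HIS_TAG),
        ("Stop Codon", "TAA"),
        ("Terminator", TERMINATOR),
    ]
    sequence, features = "", []
    for name, part in parts:
        start = len(sequence) + 1
        sequence += part
        features.append((name, start, len(sequence)))
    return sequence, features


def to_fasta(name: str, sequence: str, width: int = 70) -> str:
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    seq, feats = build_cassette("AGCAAAGGA")   # placeholder: first three sfGFP codons
    for feat in feats:
        print(feat)
    print(to_fasta("demo_cassette", seq))
```

Printing the feature list is a quick way to confirm that every part is present and in the expected order before uploading the FASTA file.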

It’s helpful to visualize DNA designs using SBOL Canvas (Synthetic Biology Open Language) to convey your designs.

Here is my practice assembled copy of the HW2 gene fragment I will import in Twist. However, I will not submit an actual order to Twist because this is just my demonstration Clonal Gene fragment copy. I will repeat these steps with my own functional gene for official purchase order.

TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGCCATTAAAGAGGAGAAAGGTACCATGAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCACAAATTTTCTGTCCGTGGAGAGGGTGAAGGTGATGCTACAAACGGAAAACTCACCCTTAAATTTATTTGCACTACTGGAAAACTACCTGTTCCGTGGCCAACACTTGTCACTACTCTGACCTATGGTGTTCAATGCTTTTCCCGTTATCCGGATCACATGAAACGGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGGAACGCACTATATCTTTCAAAGATGACGGGACCTACAAGACGCGTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATCGTATCGAGTTAAAGGGTATTGATTTTAAAGAAGATGGAAACATTCTTGGACACAAACTCGAGTACAACTTTAACTCACACAATGTATACATCACGGCAGACAAACAAAAGAATGGAATCAAAGCTAACTTCAAAATTCGCCACAACGTTGAAGATGGTTCCGTTCAACTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCGACACAATCTGTCCTTTCGAAAGATCCCAACGAAAAGCGTGACCACATGGTCCTTCTTGAGTTTGTAACTGCTGCTGGGATTACACATGGCATGGATGAGCTCTACAAACATCACCATCACCATCATCACTAACCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA

Here is my final Twist purchase order, though I will not actually purchase this either until an experiment can be developed.

TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC
CATTAAAGAGGAGAAAGGTACC
ATG
ATGAGTCAGAATACGCTGAAAGTTCATGATTTAAATGAAGATGCGGAATTTGATGAGAACGGAGTTGAGGTTTTTGACGAAAAGGCCTTAGTAGAAGAGGAACCCAGTGATAACGATTTGGCCGAAGAGGAACTGTTATCGCAGGGAGCCACACAGCGTGTGCTGGACGCGACTCAGCTTTACCTTGGTGAGATTGGTTATTCACCACTGTTAACGGCCGAAGAAGAAGTTTATTTTGCGCGTCGCGCACTGCGTGGAGATGTCGCCTCTCGCCGCCGGATGATCGAGAGTAACTTGCGTCTGGTGGTAAAAATTGCCCGCCGTTATGGCAATCGTGGTCTGGCGTTGCTGGACCTGATCGAAGAGGGCAACCTGGGGCTGATCCGCGCGGTAGAGAAGTTTGACCCGGAACGTGGTTTCCGCTTCTCAACATACGCAACCTGGTGGATTCGCCAGACGATCGAACGGGCGATTATGAACCAAACCCGTACTATTCGTTTGCCGATTCACATCGTAAAGGAGCTGAACGTTTACCTGCGAACCGCACGTGAGTTGTCCCATAAGCTGGACCACGAACCAAGTGCGGAAGAGATCGCAGAGCAACTGGATAAGCCAGTTGATGACGTCAGCCGTATGCTTCGTCTTAACGAGCGCATTACCTCGGTAGACACCCCGCTGGGTGGTGATTCCGAAAAAGCGTTGCTGGACATCCTGGCCGATGAAAAAGAGAATGGTCCGGAAGATACCACGCAAGATGACGATATGAAGCAGAGCATCGTCAAATGGCTGTTCGAGCTGAACGCCAAACAGCGTGAAGTACTGGCACGTCGATTCGGTTTGCTGGGGTACGAAGCGGCAACACTGGAAGATGTAGGTCGTGAAATTGGCCTCACCCGTGAACGTGTTCGCCAGATTCAGGTTGAAGGCCTGCGCCGTTTGCGCGAAATCCTGCAAACGCAGGGGCTGAATATCGAAGCGCTGTTCCGCGAAGAAGTAAGCATCTGTCAGAAAGGCCAGTCTCAAGCGAGGCTGGCTTTTTTTCTTTTGGTACATGGTACATGTTGA
CATCACCATCACCATCATCAC
TAA
CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA
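One check that may be worth running on an assembled cassette before ordering is whether the reading frame that begins at the start codon runs uninterrupted through the His tag. The rpoS coding sequence above already begins with ATG and ends with TGA, so placing it between a separate start codon and the tag would add a second methionine and put a stop codon ahead of the tag. The sketch below is a hypothetical helper (`reading_frame_report` is not part of Benchling or Twist) that reports the position of the first in-frame stop.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reading_frame_report(orf: str) -> dict:
    """Summarize the frame of a sequence running from the start codon through the final stop."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    first_stop = next((i for i, c in enumerate(codons) if c in STOP_CODONS), None)
    return {
        "starts_with_atg": orf.startswith("ATG"),
        "length_multiple_of_3": len(orf) % 3 == 0,
        "codon_count": len(codons),
        "first_stop_index": first_stop,   # 0-based codon index, None if there is no stop
        "premature_stop": first_stop is not None and first_stop < len(codons) - 1,
    }


if __name__ == "__main__":
    his_tag = "CATCACCATCACCATCATCAC"
    # Toy coding sequence that carries its own ATG and its own TGA stop.
    with_own_stop = "ATG" + "ATGAGTCAGTGA" + his_tag + "TAA"
    # Same toy sequence with its ATG and TGA removed before assembly.
    trimmed = "ATG" + "AGTCAG" + his_tag + "TAA"
    print(reading_frame_report(with_own_stop))
    print(reading_frame_report(trimmed))
```

In the first toy case the report flags a premature stop; in the second the only stop is the final TAA, so the tag would be translated.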

4.3. On Twist, Select The “Genes” Option

4.4. Select “Clonal Genes” option

For this demonstration, we’ll choose Clonal Genes. You’ll select clonal genes or gene fragments depending on your final project.

Historically, HTGAA projects using clonal genes (circular DNA) have reached experimental results 1-2 weeks quicker because they can be transformed directly into E. coli without additional assembly.

Gene fragments (linear DNA) offer greater design flexibility but typically require an assembly or cloning step prior to transformation. An advantage is that, if designed with the appropriate exonuclease protection, gene fragments can be used directly in cell-free expression.

4.5. Import your sequence

You just took an amino acid sequence of interest and converted it into DNA, codon optimized it, and built an expression cassette around it! Choose the Nucleotide Sequence option and Upload Sequence File to upload your FASTA file.

4.6. Choose Your Vector

Since we’re ordering a clonal gene, you will need to refer to Twist’s Vector Catalog to choose your circular backbone. You can think of this as taking your linear expression cassette for your protein of interest, and completing the rest of the circle!

The backbone confers many special properties like antibiotic resistance, an origin of replication, and more. Discuss with your node to decide on appropriate antibiotic options. At MIT/Harvard, you can use Ampicillin, Chloramphenicol, or Kanamycin resistance.

Twist vectors do not contain restriction sites near the insert fragment, so make sure to flank your design with cut sites if you are intending to extract this DNA insert fragment later.
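A minimal sketch of flanking an insert with cut sites follows. The choice of EcoRI (GAATTC) and BamHI (GGATCC) is only an assumption for illustration, and `flank_with_sites` is a hypothetical helper. The check that neither site occurs inside the insert matters because an internal site would cut the insert itself during extraction.

```python
ECORI = "GAATTC"   # EcoRI recognition sequence
BAMHI = "GGATCC"   # BamHI recognition sequence


def flank_with_sites(insert: str, left: str = ECORI, right: str = BAMHI) -> str:
    """Add one cut site on each side of an insert, refusing inserts that already contain either site."""
    insert = insert.upper()
    for site in (left, right):
        if site in insert:
            raise ValueError("insert already contains the site " + site)
    return left + insert + right


if __name__ == "__main__":
    print(flank_with_sites("ATGAAATAA"))
```

Both example sites are palindromic, so searching one strand is enough here; for a non-palindromic site the reverse complement would need to be searched as well. Which enzymes suit a real design depends on the vector and on what the node has available.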

For this demonstration, choose a Twist cloning vector like pTwist Amp High Copy.

Click into your sequence and select download construct (GenBank) to get the full plasmid sequence:

Go back to your Benchling account. Inside of a folder, click the import DNA/RNA sequence button and upload the GenBank file you just downloaded.

This is the plasmid you just built with your expression cassette included. Congratulations on building your first plasmid!

Important

For your final projects, remember to include:

  • Fully annotated Benchling insert fragment
  • Desired Twist cloning vector

Part 5: DNA Read/Write/Edit

Assignees for the following sections:
  • MIT/Harvard students: Required
  • Committed Listeners: Required

5.1 DNA Read

(i) What DNA would you want to sequence (e.g., read) and why? This could be DNA related to human health (e.g. genes related to disease research), environmental monitoring (e.g., sewage waste water, biodiversity analysis), and beyond (e.g. DNA data storage, biobank).

I would like to use directed evolution in outbred goats to select my DNA to sequence. Therefore my plan is to stick with the HTGAA method until I have a Nanopore sequencer and reagents for genomic surveillance of my herd. My argument for why is still developing but essentially I have anecdotal observations to support a hypothesis. An example of the type of genes I would like to sequence is the second vector I uploaded – the RpoS gene in the K-12 strain of E. coli. The gene was sequenced with a Nanopore sequencer.

DNA-based digital data storage technology. Source: Archives in DNA: Workshop Exploring Implications of an Emerging Bio-Digital Technology through Design Fiction - Scientific Figure on ResearchGate. Available from: https://www.researchgate.net/figure/DNA-based-digital-data-storage-technology_fig1_353128454 [accessed 11 Feb 2025]

(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?

There is no substitute, in my opinion, for a Nanopore sequencer, with a distant second being a PacBio. Of course, Nanopore sequencers are far less popular than Illumina, and although I am a big fan of Craig Venter, I still prefer the scientific opportunities available with Nanopore. In fact, a significant reason why I went back to school post reproductive fitness equals zero is that when I graduated from college they still hadn’t completed the Human Genome Project. I learned about Nanopores during the COVID-19 pandemic, when I started one of the first wastewater surveillance programs in the U.S. I believe the accuracy, speed, and flexibility of pore-facilitated single-base sequencing, with parallel, multithreaded reads, fit my future research goals exactly, and I am on the cusp of becoming a Nanopore super user.

Also answer the following questions:

  • Is your method first-, second- or third-generation or other? How so?
  • What is your input? How do you prepare your input (e.g. fragmentation, adapter ligation, PCR)? List the essential steps.
  • What are the essential steps of your chosen sequencing technology, how does it decode the bases of your DNA sample (base calling)?
  • What is the output of your chosen sequencing technology?

I am focused on deep time series sequencing data that is broad enough to include changes in diversity of microbiome and host metatranscriptomic, epigenetic, and metabolomic signals as well as metagenomic changes. I also want to develop pipelines that I always have the opportunity to contribute to but never want to own or primarily benefit from. I believe paywall science is an etiologic mechanism that favors contagion.

The first benefit of 3rd and 4th generation sequencers, particularly the Nanopore machines, is that they do not even require PCR amplification. Don’t get me wrong, I love PCR as a flexible assay, but as an Epidemiologist I have never been comfortable with making more copies of pathogens on principle. I realize this is a bit of a semantic argument and there are plenty of biosafety measures in place. At the same time, those same biosecurity measures are drivers of inequality in applied Molecular Biology capabilities. What does it mean when the technology itself becomes a driver of unequal access to scientific techniques that everyone in a generation should have? I believe it means it’s time to keep innovating.

In addition, I think there is wisdom in sequencing the actual shoddy molecules collected from the field, particularly for my applications. This is a biosecurity advantage and a better fit to the Epi Triangle anyway. However, I am not saying there are no scenarios where higher-level biosecurity reference labs with PCR pipelines are necessary. I just think some sequencers should be managed and maintained by governments, and smaller non-PCR-based Nanopores should be prioritized by individual field researchers, like I intend to be.

Now there is an elephant in the room, though, and that’s data storage. I have been wrestling with data storage my entire career, and I know my interest in Nanopore sequencing isn’t going to make these challenges go away anytime soon. Therefore, I am all for DNA storage of genomic sequencing information about animals, ideally in plant DNA. If a safe method is already available, storage in animal subjects would be incredible as well. Could DNA storage be maintained in goat horns or sheep’s wool? I need to investigate the methodology further to see if this is even possible. I am ashamed to admit that until I read Dr. Church’s Epilogue in Regeneration, I had never even thought about this before.

5.2 DNA Write

(i) What DNA would you want to synthesize (e.g., write) and why? These could be individual genes, clusters of genes or genetic circuits, whole genomes, and beyond. As described in class thus far, applications could range from therapeutics and drug discovery (e.g., mRNA vaccines and therapies) to novel biomaterials (e.g. structural proteins), to sensors (e.g., genetic circuits for sensing and responding to inflammation, environmental stimuli, etc.), to art (DNA origamis). If possible, include the specific genetic sequence(s) of what you would like to synthesize! You will have the opportunity to actually have Twist synthesize these DNA constructs! :)

I am dazzled by DNA origamis for synthetic materials, but the complexity of the methods needed to achieve static outputs is not a tradeoff I would invest time in right now. Genetic circuits are different, though; I am fully attentive to this revolution, particularly as we see in the examples provided by the Elowitz Lab [https://www.elowitz.caltech.edu/research#!computationandsyntheticcircuits].

(ii) What technology or technologies would you use to perform this DNA synthesis and why? Also answer the following questions:

What are the essential steps of your chosen sequencing methods? What are the limitations of your sequencing method (if any) in terms of speed, accuracy, scalability?

I would like to learn more about the CHOMP (circuits of hacked orthogonal modular proteases) method to integrate binary logic into functional programming modules in biological circuitry. The CHOMP method can then be used to control regulatory cascades and, even more exciting to me, binary logic gates. My research interests are nonlinear models of aging and cellular senescence that use the insights they elucidate to improve areas of stasis with the potential for rejuvenation: waves of molecular decisions, all with decipherable underlying logic gates based on Boolean logic. The engineering methods focus on viruses and bacteria. The programming motif they target is the incoherent feed-forward loop. The amino acid they interact with is the nitrogen end of tyrosine, which they expand to a four-protein circuit. They image their single-transcript adaptive pulse circuits using time-lapse images. The result of the engineering is a ratchet to control the intrinsic nonlinearity of input and output in biological systems. The scalability and accuracy are tunable by the application. The circuits are slow to design and, once implemented, as fast as biological circuits. Another method I would be interested in investigating further is the noninvasive biosensor application from Chacko et al. (2026), which uses live-cell diffusion-weighted imaging to investigate the effect of Gly-Ser spacers in transcription.

  • Source: Xiaojing J. Gao et al., Programmable protein circuits in living cells. Science 361, 1252-1258 (2018). DOI: 10.1126/science.aat5062
  • Asish N. Chacko et al., A programmable genetic platform for engineering noninvasive biosensors. Sci. Adv. 12, eaec1211 (2026). DOI: 10.1126/sciadv.aec1211
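To build intuition for why an incoherent feed-forward loop produces an adaptive pulse, here is a toy simulation. It is a generic textbook-style model, not the model or the parameters from Gao et al.: an input X switches on at time zero, X drives both an output Z and a repressor Y, and Y shuts Z back down after a delay. Every name and number in it (`simulate_iffl`, `k`, `n`, the unit rate constants) is an assumption chosen for illustration.

```python
def simulate_iffl(k: float = 0.2, n: int = 2, dt: float = 0.01, t_end: float = 10.0) -> list[float]:
    """Euler integration of a toy incoherent feed-forward loop.

    dY/dt = X - Y                        (X activates the repressor Y)
    dZ/dt = X / (1 + (Y / k)**n) - Z     (X activates Z, Y represses Z)
    Returns the output Z at every time step after X steps from 0 to 1.
    """
    x = 1.0          # input is switched on at t = 0 and held constant
    y = z = 0.0      # repressor and output both start at zero
    trace = []
    for _ in range(int(round(t_end / dt))):
        dy = x - y
        dz = x / (1.0 + (y / k) ** n) - z
        y += dt * dy
        z += dt * dz
        trace.append(z)
    return trace


if __name__ == "__main__":
    z = simulate_iffl()
    peak = max(z)
    print("peak output:", round(peak, 3), "at step", z.index(peak))
    print("final output:", round(z[-1], 3))
```

The output rises quickly, peaks while the repressor is still accumulating, and then settles to a much lower level even though the input stays on. That is the sense in which the loop responds to a change in input rather than to its absolute level.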

5.3 DNA Edit

(i) What DNA would you want to edit and why? In class, George shared a variety of ways to edit the genes and genomes of humans and other organisms. Such DNA editing technologies have profound implications for human health, development, and even human longevity and human augmentation. DNA editing is also already commonly leveraged for flora and fauna, for example in nature conservation efforts, (animal/plant restoration, de-extinction), or in agriculture (e.g. plant breeding, nitrogen fixation). What kinds of edits might you want to make to DNA (e.g., human genomes and beyond) and why?

I am for animals. I want to contribute to the de-extinction and life extension of all the endangered and vulnerable organisms I can serve. I will contribute to human longevity as an afterthought to natural diversity and sustainability. I am not beholden to humans, though. I believe in the sanctity of all life.

(ii) What technology or technologies would you use to perform these DNA edits and why? Also answer the following questions: How does your technology of choice edit DNA? What are the essential steps? What preparation do you need to do (e.g. design steps) and what is the input (e.g. DNA template, enzymes, plasmids, primers, guides, cells) for the editing? What are the limitations of your editing methods (if any) in terms of efficiency or precision?

The synthesis of DNA and OMIC ontologies starts with phylogeny, small molecules, and phenotypes like disease. Therefore my career plan is to build the throughput for molecule-mediated bidirectional interactions between host physiology and the microbes in the microbiome and metagenome around the host. This is the big tent vision. Now, how do I get there? Imagine if I lay it all out here step-by-step. How could I do what I aspire to do if 1000 people do it first before me? Still, I come to HTGAA a pleb at the stairs to the Temple of Zeus, with my goats. Not to sacrifice, though. I will not be a culling scientist. There is no scientific discovery in the text of life that is worth sacrificing a living thing. I am a builder by nature anyway. I want to observe life without intervention; well, some intervention is necessary, but not like it’s currently done. Therefore, what to do?

Here is what I can share at HW2. Everywhere an organism lives, say a goat, there is a DNA and RNA wake of material. Much of it is waste material, residues from competing metabolic systems stacked and ready to be interpreted and transformed into data. With data come constraints, especially with OMIC data: it’s endless and massive in scale, randomized and chaotic. I like the idea of applied systems biology pipelines built around dead biological material. You can catalogue and reconstruct living systems from waste chemistry. Do you need an almond in its shell to understand the life history of that nut? Not really; a fragment of husk in a pile on the ground will tell you about the almond and the animal that consumed it.

COVID-19, which I met as a front-line epidemiologist in the center of the maelstrom, did not equivocate in its lessons. First and foremost, public health apparatuses like mRNA vaccination research and deployment infrastructure are useful when they are available, accessible, and appropriately matched to the agent. The rest is wastewater. Wastewater, especially where sewer infrastructure is insufficient to remove waste from a community fast enough, can be used to trace outbreaks in near real time. Wastewater surveillance is harder where the water is plentiful, deep, and fast-flowing. The great news for wastewater epidemic surveillance is that the structural inequalities above the sewers also exist within the sewers and drive disease transmission in outbreaks. This isn’t a hunch; the data support it. This is why I will continue to be interested in wastewater surveillance when I enter the workforce.

However, I will focus on much broader networks of waste than wastewater, which is what makes the intersection of gut microbiomes, microbes, and host physiology the biological nexus for me. Thus, applications: there are many options here, especially in agriculture. I like agriculture because soil is the ultimate biological pile of waste. I have watched animal waste turn into dirt for several years now, and from that waste, plants grow. The animals eat those plants and turn them into animal tissues using systems of heredity and variability that have nothing to do with anything I did. I just get the animal in front of a plant and they complete their reproductive and maintenance programs. If I keep the animals’ water clean and their housing dry, they do not get sick. These animals and the environment are an engine that I can run passively – they make the world a better place.

At the same time, though, this natural experiment produces a lot of opportunities to study molecule-mediated bidirectional relationships between animal hosts and the microbes in their microbiome and metagenome. Fortunately, for my experimental milieu, my species is driving Earth to its extreme of the boundary conditions for habitability, which certainly makes science more interesting – especially when local interventions can be developed to support sustainability, health, and longevity.

The last sentence is key for the edits I would dare to make. Never blindly, though. This is why I will structure my lab within evolution-directed synthesis.

Resources

  • DNA Sequencing at 40: Past, Present, and Future (2017), Shendure, J., Balasubramanian, S., Church, G. et al. https://doi.org/10.1038/nature24286
  • DNA Synthesis Technologies to Close the Gene Writing Gap (2023), Hoose, A., Vellacott, R., Storch, M. et al. https://doi.org/10.1038/s41570-022-00456-9
  • Recombineering and MAGE (2021), Wannier T, et al. Nat Rev Methods Primers, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9083505/
  • CRISPR Technology: A Decade of Genome Editing is Only the Beginning, Wang, Doudna, et al., https://www.science.org/doi/10.1126/science.add8643

Databases

  • GenBank overview: https://www.ncbi.nlm.nih.gov/genbank/
  • NCBI: https://www.ncbi.nlm.nih.gov/genome/
  • Ensembl: https://useast.ensembl.org/index.html
  • UCSC Genome Browser: https://genome.ucsc.edu/
  • Protective and Enhancing Alleles: https://arep.med.harvard.edu/gmc/protect.html

Editors and tutorials

CRISPR/Cas9

  • Short tutorial for designing gRNAs: https://blog.addgene.org/how-to-design-your-grna-for-crispr-genome-editing
  • Benchling specific tutorial for designing gRNAs: https://www.benchling.com/blog/how-to-design-grnas-to-target-your-favorite-gene
  • List of Cas editors and their PAM sites: https://www.synthego.com/guide/how-to-use-crispr/pam-sequence

Base Editors

Base editors contain a nicking or dead Cas9 enzyme fused to a deaminase.

a.) PAM requirement: For designing your guide RNA for base editing you will therefore have a PAM requirement like you would have for any Cas9 experiment.
b.) Deamination window: An additional design constraint is that the sequence window in which deamination occurs is only a few base pairs long. You can find information on the deamination windows in the review below (even though some new editors are not included). BE4 and ABE7.10 are good starting points and both use SpCas9 with an NGG PAM requirement. Base editors with other PAM sites have been constructed too.

  • Review of base editors (2018) including a list of all base editors, their editing window and PAM requirement: https://www.nature.com/articles/s41576-018-0059-1?WT.feed_name=subjects_animal-biotechnology

Other editors

  • Prime editor: https://www.nature.com/articles/s41586-019-1711-4
  • Tutorials/tools: https://primeedit.nygenome.org/ https://www.nature.com/articles/s41551-020-00622-8 http://pegfinder.sidichenlab.org/

TALEN

For TALENs, you can assume no sequence restrictions. One of the technology’s previous restrictions was a T starting base, but this has since been overcome. In contrast to the CRISPR/Cas technologies above, your DNA sequence is recognized through interactions between the DNA and the TALEN: each TAL in the array recognizes one base. (Note: In order to introduce a double strand break, you will need to design two TALENs targeting the opposing strands.)

  • Short guide: https://www.addgene.org/talen/guide/
  • One of the available design resources: https://tale-nt.cac.cornell.edu/node/add/talen
  • Directed evolution for overcoming starting base restriction: https://academic.oup.com/nar/article/41/21/9779/1276340

Additional Resources:

  • Gel Purification of DNA: after DNA gel electrophoresis, cutting a band of DNA out of the agarose gel allows isolation and purification of a specific DNA fragment: Addgene: Protocol - How to Purify DNA from an Agarose Gel
  • Overview of synthetic, unnatural organisms using recoding: Synthetic genomes with altered genetic codes (2020)
  • DNA recorders, Sense+Read+Write: Lineage tracing and analog recording in mammalian cells by single-site DNA writing (2021)
  • Molecular electronics, integrating single molecules into electronic chips: Molecular electronics sensors on a scalable semiconductor chip: A platform for single-molecule measurement of binding kinetics and enzyme activity (2022)
  • Review of genome editors (zinc finger nucleases, TALENs, CRISPR) at the time CRISPR was emerging as editing technology: https://www.cell.com/trends/biotechnology/pdf/S0167-7799(13)00087-5.pdf
  • Clinical trials of genome-editing therapies: https://www.nature.com/articles/d41573-020-00096-y
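
The two base-editor design constraints listed under Editors and tutorials (an NGG PAM next to the guide, and a target base that falls inside a short deamination window) can be sketched as a small Python check. This is only a rough illustration under stated assumptions: the function name `find_base_editor_guides` is hypothetical, it scans one strand only, it assumes a 20-nt protospacer, and the default window of positions 4 to 8 is an assumed placeholder that should be replaced with the window reported for the chosen editor in the base-editor review.

```python
def find_base_editor_guides(seq, target_index, window=(4, 8), protospacer_len=20):
    """List candidate guides that put seq[target_index] inside the editing window.

    Assumptions (illustrative, not taken from a specific protocol):
      - SpCas9-style NGG PAM directly 3' of a 20-nt protospacer
      - only the strand given in `seq` is scanned
      - `window` is 1-based and counted from the PAM-distal end of the protospacer
    Returns a list of (protospacer, pam, position_of_target_in_protospacer).
    """
    seq = seq.upper()
    hits = []
    for pam_start in range(protospacer_len, len(seq) - 2):
        # NGG: any base, then two Gs
        if seq[pam_start + 1:pam_start + 3] != "GG":
            continue
        proto_start = pam_start - protospacer_len
        position = target_index - proto_start + 1  # 1-based position in protospacer
        if window[0] <= position <= window[1]:
            hits.append((seq[proto_start:pam_start],
                         seq[pam_start:pam_start + 3],
                         position))
    return hits


# Made-up example sequence: a C at protospacer position 6, followed by a TGG PAM.
example = "AAAAA" + "C" + "A" * 14 + "TGG"
print(find_base_editor_guides(example, target_index=5))
```

A real design would also scan the reverse complement and check for other editable bases inside the same window, which this sketch deliberately leaves out.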

Week 3 HW: Lab Automation

Homework for HTGAA 2026 (Week 03): Lab Automation

NIH_Bioart_Bacterium

Table of contents

Software used:

Objective:

This week we get hands-on (or at least code-on) with pipetting robots.

Background:

No lecture. Recitation and Tokyo Biohub node lab meetings. Submit three slides with ideas to our node by 24Feb2026.

Ideas for Tokyo Biohub Deck

  • GPG01: Identify transcription indicators in post-reproductive goat life history indicative of alterations to NAD(H), ROS signaling, tissue-specific oxidative stress and inflammation.
  • GPG02: Explore the application of G-protein coupled receptors (GPCRs) in goats, a method Chen et al. (2019) propose more broadly to monitor bioactive microbial metabolites with associations to physiology.
  • GPG03: Consider systems-level synthetic biology agricultural interventions to improve yield of metabolite-specific foodstuffs to support molecule-mediated bidirectional interactions between goat hosts and microbiota.

Questions:

For this week, we’d like for you to do the following:

Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications. Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details. While your description/project idea doesn’t need to be set in stone, we would like to see core details of what you would automate. This is due at the start of lecture and does not need to be tested on the Opentrons yet.

  • Child HT, Wierzbicki L, Joslin GR, Roper K, Haxhiraj Q, Tennant RK. Automated environmental metagenomics using Oxford nanopore sequencing. BMC Genomics. 2025 Sep 26;26(1):835. doi: 10.1186/s12864-025-11989-w. PMID: 41013192; PMCID: PMC12465296.

I propose using a cloud laboratory and automation tools to process environmental metagenomics samples with Oxford nanopore sequencers. Here is the problem I am attempting to address. I am only one person. My time is always profoundly constrained. This means I am always burning and undercooking items with all my pans in the fire. Still, personally and professionally I am committed, red in tooth and claw, to environmental protection of the biodiversity and abundance of natural ecologies and agricultural middle corridors. In addition, I am personally offended by inequality, especially when it comes to the allocation of scientific discovery capacity and supply lines. The most diverse places on Earth are the most imperiled and at the same time the least equipped with tooling to achieve the scientific advancements they need to protect their habitats and communities. Allow me to also preface that HTGAA is a small example showing that the bottlenecks of which I speak are not in human capacity; they are in technology, energy, infrastructure, brick and mortar. I believe that cloud computing and automation tools are a stopgap measure urgently needed to fill the breach and provide platforms to facilitate the synergies of natural and unnatural selection required to advance sustainability and biodiversity. However, engineering these partnerships is going to be just as important as the technological capabilities. The great thing about HTGAA is that we are doing this work from the bottom up by participating in these cohorts.

Aside from HTGAA, my work with goats actually comes from the same engineering aspiration. I never saw a goat in the U.S. until I became a community health worker and started working outside of my country. Once I left the U.S. goats were much more plentiful, especially in rural mountainous regions. I am raising goats now to learn animal husbandry of these critical animals so that I can better understand how to help raise goats anywhere, in any locale, with any resource constraint. Goats in my opinion are the first automation tool that humans partnered with to survive in extreme environments. Through this partnership goats and humans expanded their gene and environment match with the physical constraints they were encountering in their struggle to ensure their families thrived. Goats and humans share many strengths and weaknesses, mainly their dedication to their families and security of FDR’s essential freedom from uninhabitable temperature and violence, hunger, thirst.

Now to the assignment, but from this perspective. The paper I reviewed, Child et al. (2025), compared manual and automated metagenomic workflows using Oxford nanopore sequencers and found minimal differences in the outcomes assessed. The first reason I chose this paper is that it starts with a fundamental truth: long-read sequencing has transformed our understanding of the microbiome. In fact, metagenomic and microbiome catalogues were not even attempted reliably until these machines entered the OMICs revolution. Enter the pipetter. I can attest, this is monastic work. The challenge is not the tool; it’s the lab space, and the sheer magnitude of the wells that the pipetter must span. Experimental protocols require precise allocation of minute quantities of fluids over and over again. From a personal vantage I quite enjoy the process, for there are few activities more zen in my day, but then I am also hyper-privileged. Again inequality rears its head in the hallways of science. Who enters the cloister of the dwindling lab spaces in the world, to the shelter of the bench, and how many minutes do they have to spend to achieve their objectives? Here too is another inequality, though, because let’s be honest: not their objectives but the objectives of their research supervisors, because labs are also part of the caste system.

How do we untangle all of these knots to do the critical work? Could it be that automation is the answer? This will depend on who has access to automation. Are we talking about robotic workflows that are accessible to anyone with curiosity about microbiomes and metagenomes? Likely not anytime soon. I guess it will be more about the workflows done by students with professors. This is where the revolution of OMICs and Next-Generation Sequencers must be fought. What about private start-ups? I don’t know enough to speculate here. I can ponder the task of expanding the paradigm so any student who wants bench exposure using sequencers can have it. Honestly, I think HTGAA is pursuing this admirably. The cause is certainly just. If students and professors with and without wet lab spaces can both access cloud platforms and automation labs, then we can realize the type of contingent niche environment that theoretically, at least, could be scaled up, and that is far better than not having a foothold at all. The Child lab (2025) certainly seems to understand this charge when they explain that automation is a game changer fit to improve throughput, reproducibility, and accuracy. What is less clear is whether the solution is the automated workflow or the Oxford nanopore sequencers that truly read the sample one base pair at a time very quickly and then write that information into a cloud library for template recognition against other annotated long-read sequences.

I didn’t really leave myself enough time to do this properly, ironically because this is lambing season, but Child et al. (2025) do make some very interesting observations in their side-by-side comparison of manual and automated workflows. I will apply these to my project now as well.

Child et al. (2025) explain that many of the current studies they reviewed for their article only contain high-throughput amplicon data from the COVID-19 pandemic. I do not see this as a challenge at all. Instead, when I think about the COVID-19 pandemic as a front-line warrior for metazoans, I see the good we accomplished when political will was aligned with scientific aspirations, and I trust that the only reason naysayers have any leeway now to gripe about the deluge of SARS-CoV-2 data and genetic contamination is that they are alive because of mRNA vaccines and wastewater surveillance, which Oxford nanopore significantly supported.

The liquid handling robot in the automated arm of the Child et al. (2025) study was a Bravo Automated Liquid Handling Platform. I want one. Is it worth the cost, though? Apparently the findings are not sufficient to justify a purchase based on read length alone. In the study, the manual and Bravo arms both analyzed the same 24 samples from a range of environments across a 96-well plate, and the outcomes were similar except for read length, which was on average longer in the manual arm than in the automated arm. We can assume, if we have ever pipetted, that the automated arm would be more consistent in the allocation of microfluidics, but confounding from variation in diverse soil samples appears to have made this distinction difficult to show. Meanwhile, the manual arm included elucidation of DNA samples that the automated arm didn’t replicate, which doesn’t seem fair to me. However, if the automated workflow literally is not able to do all of the workflow steps, then that is a strong point for manual over automated arms until the landscape is level.

Here’s the big takeaway for my project, though. Child et al. (2025) did find that improved automated libraries reduced PCR artefacts and increased sensitivity, providing a more accurate snapshot of the ecological taxa of the microbiota: in other words, more families, species, and sub-species of less abundant organisms in the samples. This is what I want to hear, because if this process were applied to five studies instead of one, then we would have 5x the power in detection of rare organisms that contribute to the diversity of the soil ecosystems, which is what I aspire most to understand and preserve.
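
To check my own intuition about how pooling more sequencing effort helps with rare organisms, here is a back-of-the-envelope sketch in Python. It is not an analysis from Child et al. (2025). It assumes every read is an independent draw from the community, that pooling five studies is equivalent to five times the reads, and it uses a made-up relative abundance and read depth; the function name `detection_probability` is my own.

```python
def detection_probability(relative_abundance, reads):
    """Chance of seeing at least one read from a taxon.

    Assumes each read is an independent draw and the taxon makes up
    `relative_abundance` (a fraction between 0 and 1) of the DNA in the sample.
    """
    return 1.0 - (1.0 - relative_abundance) ** reads


# Hypothetical numbers for illustration only.
rare_taxon = 1e-5          # one read in 100,000
reads_one_study = 50_000
reads_five_studies = 5 * reads_one_study

p_one = detection_probability(rare_taxon, reads_one_study)
p_five = detection_probability(rare_taxon, reads_five_studies)
print(f"one study:    {p_one:.2f}")
print(f"five studies: {p_five:.2f}")
```

With these made-up inputs the chance of detecting the rare taxon goes from roughly 0.39 to roughly 0.92. Under these assumptions the gain is large but not a literal five-fold multiplication, because the probability saturates toward 1 as reads accumulate.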

  • Final Project Ideas — DUE BY START OF FEB 24 LECTURE

Methods:

  • Cloud Computing

Tasks:

Assignment: Python Script for Opentrons Artwork — DUE BY YOUR LAB TIME!

  • Your task this week is to create a Python file to run on an Opentrons liquid handling robot.
  • Review this week’s recitation and this week’s lab for details on the Opentrons and programming it.
  • Generate an artistic design using the GUI at opentrons-art.rcdonovan.com. Using the coordinates from the GUI, follow the instructions in the HTGAA26 Opentrons Colab to write your own Python script which draws your design using the Opentrons.
  • You may use AI assistance for this coding — Google Gemini is integrated into Colab (see the stylized star bottom center); it will do a good job writing functional Python, while you probably need to take charge of the art concept. If you’re a proficient programmer and you’d rather code something mathematical or algorithmic instead of using your GUI coordinates, you may do that instead. [!warning] Ask for help early! If you are having any trouble with scripting, contact your TAs as soon as possible for help. Do not wait until your scheduled robot time slot or you may not be able to complete this assignment! If the Python component is proving too problematic even with AI and human assistance, download the full Python script from the GUI website and submit that:
    Use the download icon pointed to by the red arrow in this diagram.


  • If you use AI to help complete this homework or lab, document how you used AI and which models made contributions.
  • Sign up for a robot time slot if you are at MIT/Harvard/Wellesley or at a Node offering Opentrons automation. The Python script you created will be run on the robot to produce your work of art!
    At MIT/Harvard? Lab times are on Thursday Feb.19 between 10AM and 6PM.
    At other Nodes? Please coordinate with your Node.
  • Submit your Python file via this form.

Post-Lab Questions — DUE BY START OF FEB 24 LECTURE

One of the great parts about having an automated robot is being able to precisely mix, deposit, and run reactions without much intervention, and design and deploy experiments remotely.

For this week, we’d like for you to do the following:

Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications. Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details. While your description/project idea doesn’t need to be set in stone, we would like to see core details of what you would automate. This is due at the start of lecture and does not need to be tested on the Opentrons yet.

Example 1: You are creating a custom fabric, and want to deposit art onto specific parts that need to be intertwined in odd ways. You can design a 3D printed holder to attach this fabric to it, and be able to deposit bio art on top. Check out the Opentrons 3D Printing Directory.

Example 2: You are using the cloud laboratory to screen an array of biosensor constructs that you design, synthesize, and express using cell-free protein synthesis.

  1. Echo transfer biosensor constructs and any required cofactors into specified wells.
  2. Bravo stamp in CFPS reagent master mix into all wells of a 96-well / 384-well plate.
  3. Multiflo dispense the CFPS lysate to all wells to start protein expression.
  4. PlateLoc seal the plate.
  5. Inheco incubate the plate at 37°C while the biosensor proteins are synthesized.
  6. XPeel remove the seal.
  7. PHERAstar measure fluorescence to compare biosensor responses.

Final Project Ideas — DUE BY START OF FEB 24 LECTURE

Assignees for the following sections:
  • MIT/Harvard students: Required
  • Committed Listeners: Required

As explained in this week’s recitation, add a slide in your Node’s section of this slide deck with an idea you have for an Individual Final Project. Be sure to put your name on your slide!

Reading & Resources

  • Opentrons API Documentation: https://docs.opentrons.com/python-api/
  • Opentrons Artwork GUI Website: http://opentrons-art.rcdonovan.com/
  • Opentrons Artwork Colab: HTGAA26 Opentrons Colab
  • Automation Equipment: HTGAA 2026 Recitation: Lab Automation, Opentrons Art, Intro to Cloud Laboratories

HTGAA 2026: Opentrons Artwork Lab

By Eyal Perry, Laura Maria Gonzalez, Dominika Wawrzyniak, Alex Hadik, Suvin Sundararajan, Ronan Donovan

This notebook contains a few examples that demonstrate how the Opentrons OT-2 can be used to draw arbitrary patterns using the Python Opentrons API. These examples can and should be used as your template as you try to pattern your own colorful, synthetically engineered bacteria.

To use this, make your own copy of this Colab, and in that copy you can run and edit the last section (and your work will be saved in your copy!).

Note: After learning about how to program designs using colab and python, you may choose to print more designs with automated tools like Opentrons Art Interface.

Each example consists of two blocks of code:

  1. The first code block is where the pattern is drawn using .aspirate(), .move_to(), and dispense_and_detach() (as a wrapper around .dispense()) commands (similar to G-code). This block will typically generate no output, as it’s just loading the code (but doesn’t run it yet). This block of code can later be copied as-is and saved as a .py file to be executed on a real Opentrons machine.
  2. The second code block runs a simple simulation that visualizes the pipetted pattern by executing your code in the simulator in this colab. This block will draw the state of the plate after running the robot code.

At the end is a section for you to code your design in, with the same two code blocks. Make your own copy of this Colab notebook and work in your copy. When ready, upload the link to your first block in this section to the linked google form a day before your lab date! Don’t edit the second block in this section, as only the first block will be run on the robot.

Several important notes:

  • All units are in mm
  • Never go beyond a radius of 40mm from (0,0). If you do, you might hit the walls of the petri dish and all hell breaks loose, or you might dispense onto the wall of or even outside the petri dish. (Some common “90mm” or “100mm” petri dishes only have an inner diameter of 84mm in the bottom plate, and the tip occupies a radius of a couple mm.)
  • For the Black Agar Plates, dispense 1 uL drops by default. (If you are trying for a particular effect, going slightly higher in some places may be acceptable.) While that may sound like a small quantity, the E.coli will still be visible (especially after growing) and small “pixel” sizes can produce more detailed patterns.
  • Be careful of dispensing samples too close to each other! They will move around slightly depending on the size of the drop. 1uL drops 2mm apart may sometimes run together or may stay mostly distinct even after incubation; 1uL drops 5mm apart will almost always stay distinct, but give you less than half the “resolution” for your art. Midway between those - 3.5mm separation - may be a happy medium. (See past year photos here and in the Lab Protocol and count dots along one axis; these of course show the ones which were lucky enough to mostly not run together…)
  • On the robot if you dispense and immediately move the tip 1cm to the left it will create a streak of bio-ink (shaped according to the viscosity of the liquid). The simulator accounts for this basic effect, and you will see spurious lines between your dots or to random locations in the visualization. We have provided a routine dispense_and_detach() that dispenses and moves the tip slightly up & down to fully clear the droplet; you can use this in your code both for the simulator and the robot to avoid streaking.
  • We have defined standard configuration for the robot deck for this lab, and our template code follows it. We plan to have Red-, Yellow-, Green-, Cyan-, and Blue-fluorescing bacteria (but no others) at all sites in the robot for your use, and have provided a routine location_of_color() you can use to retrieve our standardized configuration’s location of a named color (which you can pass to an aspirate() call).
  • Pay attention to any text output from the simulator (typically just above the plate image) - it can give useful diagnostics and statistics. Don’t get so focused on your beautiful drawing that you forget to check this every once in a while.
  • Remember not to waste any resources (here tips & reagents, as explained in the Lab Protocol; you can confirm via the “Volume Totals by Color” and “Tip Count” summaries shown after every successful run), but don’t cross-contaminate your color wells.
  • The visualization is not 100% accurate. We don’t model any fluid dynamics, so any streaking if you don’t use dispense_and_detach(), any effects of dispensing from z>0, and even the droplet sizes in all cases are not physically realistic; and the simulator doesn’t have an awareness of the 3D positions of labware. (Feel free to contribute improvements to the simulator!)
  • The simulation is not even close to a 100% complete reimplementation of the Opentrons API. Some commands will work on the OT-2 but will cause errors in the simulation (Feel free to contribute!).
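
Two of the notes above (stay within 40mm of (0,0), and keep drops far enough apart that they do not run together) can be checked on a list of coordinates before anything is sent to the simulator or the robot. The following is a minimal plain-Python sketch and not part of the course Colab: the function name `check_points` and the choice of 3.5mm as the default spacing threshold are my own assumptions, taken from the "happy medium" separation mentioned in the notes.

```python
import math

MAX_RADIUS_MM = 40.0    # from the notes: never go beyond 40 mm from (0, 0)
MIN_SPACING_MM = 3.5    # assumed threshold: the suggested spacing for 1 uL drops


def check_points(points, max_radius=MAX_RADIUS_MM, min_spacing=MIN_SPACING_MM):
    """Return a list of human-readable problems for (x, y) drop coordinates in mm."""
    problems = []
    # Radius check: every drop must sit inside the safe drawing circle.
    for i, (x, y) in enumerate(points):
        if math.hypot(x, y) > max_radius:
            problems.append(f"point {i} ({x}, {y}) is outside the {max_radius} mm radius")
    # Spacing check: compare every pair of drops once.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            gap = math.dist(points[i], points[j])
            if gap < min_spacing:
                problems.append(f"points {i} and {j} are only {gap:.2f} mm apart")
    return problems


# Made-up coordinates: one drop too far out, and one pair too close together.
design = [(0, 0), (3.5, 0), (41, 0), (0, 2)]
for problem in check_points(design):
    print(problem)
```

The pairwise loop is quadratic in the number of drops, which is fine for a few hundred points on one plate. The simulator performs its own radius check on each dispense, so this sketch is only a quick pre-flight check on GUI coordinates.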

After your code is done, to submit it to be run on a robot:

  1. Make sure your code is accessible to us: in your colab click the “Share” icon in the upper right, set “General access” so that “Anyone with the link” can be a “Viewer”.
  2. Copy to the clipboard a link to your code: right-click in the first code block (which has the metadata = {...} section near the top and your code at the end) and choose “Copy link to cell”
  3. Paste this URL into the Google form for submitting to the OT-2 and submit at least a day before your robot time slot.
  4. Review the Instructions given in the Lab Protocol.

Prerequisite Code

The following block of code contains required installations and the simulation/visualization code. It only needs to be run once per runtime.

When run, it will output errors declaring “ERROR: pip’s dependency resolver does not currently take into account all the packages that are installed.” and list some package incompatibilities; that is expected, and is a result of the Opentrons API requiring an old version of some libraries. (No other errors are expected.)

This block can be re-run in a runtime without ill effect (but will show the same errors every time).

Run this block once per runtime to set up your environment

#@title Run this block once per runtime to set up your environment

The colab now comes with too new a version of numpy; opentrons still needs an older one.

So set up venv-like isolation of my pip installs (separated from colab packages) for all subsequent cells.

(Without doing this, colab would require restarting the runtime right after installing a different numpy version.)

import sys, os py = f"{sys.version_info.major}.{sys.version_info.minor}" PKG = f"/content/venv/lib/python{py}/site-packages" os.makedirs(PKG, exist_ok=True) if PKG not in sys.path: sys.path.insert(0, PKG) os.environ[“PIP_TARGET”] = PKG # routes !pip / %pip installs into the venv os.environ[“PYTHONNOUSERSITE”] = “1”

Install opentrons into the venv (and all its dependencies!) BEFORE any import numpy etc.

%pip install -q –upgrade –target “$PKG” opentrons

Now opentrons has been cleanly installed in its own venv-like environment with

versions of packages it likes; proceed to use it “normally” from here.

from opentrons import types import matplotlib.pyplot as plt plt.rcParams[“figure.figsize”] = (10,10)

Petri dish size constants

PETRI_INNER_DIAMETER = 84 # 84mm is hopefully a tight lower bound on inner diameter of “90mm” & “100mm” petri dishes MAX_DRAW_RADIUS = PETRI_INNER_DIAMETER/2 - 2 # leave 2mm margin for the tip size, drop size, miscalibration, etc.

Define some classes for our custom HTGAA Opentrons simulator/visualizer

nullLocation = types.Location(types.Point(x=250, y=250, z=250), None)

def same2DLocation(loc1, loc2): # ignores z (=> tests x, y, and labware) return loc1.point.x == loc2.point.x and loc1.point.y == loc2.point.y and loc1.labware == loc2.labware

def mock_print(str): #print("…\n" + str) pass

each PipetteSim instance tracks what it’s dispensed; if you have multiple, need to call visualize() on each.

(can’t unify multiple by making the instance variables into class variables; note this colab has at least

one instance per example, and we don’t want those sharing dispense states.)

class PipetteSim:   # modeled after InstrumentContext in the opentrons api
    def __init__(self, instrument_official_name, mount_LR, tip_rack_list, well_colors):
        if instrument_official_name != "p20_single_gen2":
            raise ValueError(f"Unsupported pipette {instrument_official_name} -- should be p20_single_gen2")
        self.max_volume = 20
        self.instrument_official_name = instrument_official_name

        if mount_LR != "right":
            raise ValueError(f"Unsupported pipette mount {mount_LR} -- should be right")
        self.mount_LR = mount_LR

        if tip_rack_list[0].labware_official_name != "opentrons_96_tiprack_20ul":
            raise ValueError(f"Unsupported tip rack {tip_rack_list[0].labware_official_name} -- should be opentrons_96_tiprack_20ul")
        self.tip_rack_list = tip_rack_list

        self.well_colors = well_colors
        self.droplets_x = []
        self.droplets_y = []
        self.droplets_size = []
        self.droplets_color = []
        self.smears = []                # list of 3-tuples: (xlist, ylist, color)
        self.location = nullLocation    # used by dispense_and_detach()
        self.justDispensedAt = None
        self.current_volume = 0
        self.aspirated_loc = None
        self.totalAspirated = {}        # 'color' : total
        self.totalDispensed = {}        # 'color' : total
        self.curr_color = 'orange'
        self.has_tip = False            # (in the opentrons api!)
        self.tip_count = 0

    def __del__(self):
        if self.has_tip:
            raise Exception("### ERROR: Run completed without dropping the tip!")   # python prints but ignores exceptions in destructors

    # used by our dispense_and_detach() routine
    def _get_last_location_by_api_version(self):    # (in the opentrons api!)
        return self.location

    # use the well id to make up a location on the petri dish diagram:
    # D6 in the center, A1 lower left, H12 upper right (assuming 96-well, but will work for any)
    def petriLocOfWell(self, well):     # (NOT in opentrons api)
        assert(isinstance(well, WellMock))
        x,y = well.get_row_col()
        return well.top().move(types.Point(x=(x-ord('D')) * MAX_DRAW_RADIUS/4, y=(y-6) * MAX_DRAW_RADIUS/6, z=0))

    # but no smear if it's just a z-move
    def smearIfJustDispensed(self, loc):    # (NOT in opentrons api)
        assert(isinstance(loc, (types.Location, WellMock)))
        if self.justDispensedAt is not None:
            newloc = loc if isinstance(loc, types.Location) else self.petriLocOfWell(loc)
            if not same2DLocation(self.justDispensedAt, newloc):
                line_end = self.justDispensedAt.move(0.5 * (newloc.point - self.justDispensedAt.point))
                self.smears.append(([self.justDispensedAt.point.x, line_end.point.x], [self.justDispensedAt.point.y, line_end.point.y], self.curr_color))
            self.justDispensedAt = None

    def dispense(self, volume, location):   # (in opentrons api)
        assert(isinstance(location, types.Location))    # not allowing dispensing into well or trashbin/wastechute for this lab -- petri only!
        assert(isinstance(volume, (int, float)))
        if (location.point.x**2 + location.point.y**2 > MAX_DRAW_RADIUS**2):
            raise ValueError(f'Dispensing outside "safe" area: Point ({location.point.x}, {location.point.y}) is more than' +
                             f" {MAX_DRAW_RADIUS}mm away from the petri dish's center.")
        if not self.has_tip:
            raise RuntimeError("dispense() called when no tip was being held")
        if self.current_volume < volume:
            raise ValueError(f"You dispensed {volume}uL, which is more than was in the pipette ({self.current_volume}uL).")
        if volume <= 0:
            raise ValueError(f"Dispensing {volume}uL -- you should dispense a positive amount.")
        if location.point.z < 0:
            raise ValueError(f"dispense() passed a location with z={location.point.z} -- do not go below z=0!")
        if location.point.z >= 10:
            print(f"Dispensing from a location with z={location.point.z} -- do you really want to dispense from that high?")
        self.smearIfJustDispensed(location)
        self.current_volume -= volume
        self.droplets_x.append(location.point.x)
        self.droplets_y.append(location.point.y)
        self.droplets_size.append(volume * 100)     # unprincipled scale factor (1uL->100 sq.pt), but it works
        self.droplets_color.append('lime' if self.curr_color.lower()=='green' else self.curr_color)    # map green -> lime (looks more like GFP)
        self.totalDispensed.setdefault(self.curr_color, 0)
        self.totalDispensed[self.curr_color] += volume
        self.location = location
        self.justDispensedAt = location

    def aspirate(self, volume, location):   # (in opentrons api)
        assert(isinstance(volume, (int, float)))
        assert(isinstance(location, (types.Location, WellMock)))
        if not self.has_tip:
            raise RuntimeError("aspirate() called when no tip was being held")
        if volume + self.current_volume > self.max_volume:
            raise ValueError(f"Aspirating {volume}uL + {self.current_volume}uL already in pipette = {volume + self.current_volume}uL,"
                             f" which is more than the pipette can hold ({self.max_volume}uL).")
        if volume <= 0:
            raise ValueError(f"Aspirating {volume}uL -- you should aspirate a positive amount.")
        if self.aspirated_loc is not None and self.aspirated_loc != location:
            raise RuntimeError(f"Cross-contaminating wells {self.aspirated_loc} and {location} with a single pipette")
        self.aspirated_loc = location
        self.smearIfJustDispensed(location)
        self.current_volume += volume
        if isinstance(location, WellMock):
            if location.well_id.upper() not in (id.upper() for id in self.well_colors.keys()):
                raise ValueError(f"aspirate() was passed well location {location} which hasn't been configured to have a color.")
            color = location.color()
            newloc = location
        else:
            # legal for aspirate() but we should probably treat this as an error for this lab? right now marking it white...
            print("WARNING -- aspirate() passed a Location rather than a well -- are you sure you know what you're doing?")
            if location.point.z < 0:
                raise ValueError(f"aspirate() passed a location with z={location.point.z} -- do not go below z=0!")
            color = 'white'     # we don't know where they're aspirating from... use an unusual color to mark it.
            newloc = location   # already a Location; petriLocOfWell() only accepts a WellMock and would assert here
        self.curr_color = color
        self.totalAspirated.setdefault(color, 0)
        self.totalAspirated[color] += volume
        self.location = newloc

    def pick_up_tip(self):      # (in opentrons api)
        loc = types.Location(types.Point(x=-MAX_DRAW_RADIUS, y=MAX_DRAW_RADIUS, z=0), 'Pickup Tip')
        self.smearIfJustDispensed(loc)
        if self.has_tip:
            raise RuntimeError("pick_up_tip() called when already holding a tip")
        self.has_tip = True
        assert(self.aspirated_loc is None)
        self.tip_count += 1
        self.current_volume = 0
        self.location = loc

    def drop_tip(self):         # (in opentrons api)
        loc = types.Location(types.Point(x=MAX_DRAW_RADIUS, y=MAX_DRAW_RADIUS, z=0), 'Drop Tip')
        self.smearIfJustDispensed(loc)
        if not self.has_tip:
            raise RuntimeError("drop_tip() called when no tip was being held")
        self.has_tip = False
        self.aspirated_loc = None
        self.current_volume = 0
        self.location = loc

    def move_to(self, location):    # (in opentrons api)
        if location.point.z < 0:
            raise ValueError(f"move_to() passed a location with z={location.point.z} -- do not go below z=0!")
        self.smearIfJustDispensed(location)
        self.location = location

    def visualize(self):        # (NOT in opentrons api)
        print("\n=== VOLUME TOTALS BY COLOR ===")
        for color in self.totalAspirated.keys() | self.totalDispensed.keys():
            comment = ''
            if self.totalAspirated.setdefault(color, 0) != self.totalDispensed.setdefault(color, 0):
                comment = "\t\t##### WASTING BIO-INK : more aspirated than dispensed!"
            print(f"\t{color}:\t\t aspirated {self.totalAspirated[color]}\t dispensed {self.totalDispensed[color]}{comment}")
        print(f"\t[all colors]:\t[aspirated {sum(self.totalAspirated.values())}]\t[dispensed {sum(self.totalDispensed.values())}]")
        print(f"\n=== TIP COUNT ===\n\t Used {self.tip_count} tip(s) (ideally exactly one per unique color)")
        print("\n")     # plus prints its own newline

        ## uncomment (only) one of these corresponding to the background medium you're printing on
        plt.gca().add_patch(plt.Circle((0, 0), radius=PETRI_INNER_DIAMETER/2, color='#000000', fill=True)) # petri dish - 84mm inner diam, black agar plate
        #plt.gca().add_patch(plt.Circle((0, 0), radius=PETRI_INNER_DIAMETER/2, color='#000000', fill=False)) # petri dish - 84mm inner diam, paper insert
        #plt.gca().add_patch(plt.Circle((0, 0), radius=PETRI_INNER_DIAMETER/2, color='#d7ca95', fill=True)) # petri dish - 84mm inner diam, agar plate

        plt.scatter(self.droplets_x, self.droplets_y, self.droplets_size, c=self.droplets_color)

        for xlist,ylist,color in self.smears:
            plt.gca().plot(xlist, ylist, color=color, linewidth=4, solid_capstyle='round')

        plt.xlim((-(PETRI_INNER_DIAMETER/2 + 0.5), PETRI_INNER_DIAMETER/2 + 0.5))
        plt.ylim((-(PETRI_INNER_DIAMETER/2 + 0.5), PETRI_INNER_DIAMETER/2 + 0.5))
        plt.show()

class WellMock:
    def __init__(self, well_id, well_color, labware_official_name):
        self.well_id = well_id
        self.labware_official_name = labware_official_name
        self.well_color = well_color if well_color else 'purple'

    def get_row_col(self):          # (NOT in opentrons api)
        row = ord(self.well_id[0].upper())
        col = int(self.well_id[1:])
        return (row, col)

    def set_row_col(self, row, col):# (NOT in opentrons api)
        self.well_id = chr(row) + str(col)

    def color(self):                # (NOT in opentrons api)
        return self.well_color

    def bottom(self, z):            # (in opentrons api)
        assert z >= 0
        return self

    def center(self):               # (in opentrons api)
        return self

    def top(self, z=0):             # (in opentrons api)
        assert(isinstance(z, (int, float)))
        return types.Location(types.Point(x=0, y=0, z=z), 'Well')
        # return self

    def move(self, location):       # (NOT in opentrons api) -- why do we have this here? what do we think it should do, move a well?
        assert(isinstance(location, types.Location))
        return self

    def __eq__(self, other):
        return self.__class__ == other.__class__ and self.__dict__ == other.__dict__

    def __repr__(self):
        return self.well_id

class LabwareMock:
    def __init__(self, labware_official_name, deck_slot, display_name, well_colors):
        self.labware_official_name = labware_official_name
        self.deck_slot = deck_slot
        self.display_name = display_name
        self.well_colors = well_colors

    # the opentrons api names these arguments: self, idx
    def well(self, well_id):        # (in opentrons api, but deprecated -- use wells(int) or wells_by_name(str) instead)
        return WellMock(well_id, self.well_colors.get(well_id, ''), self)

    def __getitem__(self, well_id):
        return WellMock(well_id, self.well_colors.get(well_id, ''), self)

    def __repr__(self):
        return "Deck Slot %s - %s" % (str(self.deck_slot), self.display_name)

class ModuleMock:
    def __init__(self, module_official_name, deck_slot, well_colors):
        self.module_official_name = module_official_name
        self.deck_slot = deck_slot
        self.well_colors = well_colors

    # the opentrons api names these arguments: self, name, label
    def load_labware(self, labware_official_name, display_name):    # (in opentrons api)
        mock_print("Module " + str(self.module_official_name) + " loaded " + str(labware_official_name))
        # use the instance's well_colors rather than relying on a global of the same name
        return LabwareMock(labware_official_name, self.deck_slot, display_name, self.well_colors)

    def set_temperature(self, celsius):     # (in opentrons api)
        assert(isinstance(celsius, int))
        assert(celsius >= 4 and celsius <= 110)
        mock_print("Setting temperature to " + str(celsius) + "C")

    def open_lid(self):                     # (in opentrons api)
        mock_print("Opening lid")

    def close_lid(self):                    # (in opentrons api)
        mock_print("Closing lid")

    def set_lid_temperature(self, temperature):     # (in opentrons api, but only for Thermocycler)
        assert(isinstance(temperature, int))
        assert(temperature >= 4 and temperature <= 110)
        mock_print("Setting lid temperature to " + str(temperature) + "C")

    def deactivate_lid(self):               # (in opentrons api, but only for Thermocycler)
        mock_print("Deactivate lid")

                                            # (in opentrons api, but only for Thermocycler)
    def set_block_temperature(self, temperature, hold_time_minutes=0, hold_time_seconds=0, ramp_rate=0, block_max_volume=25):
        assert(isinstance(temperature, int))
        assert(temperature >= 4 and temperature <= 110)
        assert(isinstance(hold_time_minutes, int))
        assert(isinstance(block_max_volume, int))
        mock_print("Setting block temperature to " + str(temperature) + "C")
        if (hold_time_minutes > 0):
            mock_print("Holding for " + str(hold_time_minutes) + " minutes...")
        if (hold_time_seconds > 0):
            mock_print("Holding for " + str(hold_time_seconds) + " seconds...")

    def execute_profile(self, steps, repetitions, block_max_volume):    # (in opentrons api, but only for Thermocycler)
        assert(isinstance(repetitions, int))
        assert(isinstance(block_max_volume, int))

        mock_print("Executing following protocol for " + str(repetitions) + " cycles")

        for step in steps:
            assert(isinstance(step, dict))
            assert(isinstance(step['temperature'], int))
            assert(isinstance(step['hold_time_seconds'], int))

            mock_print("Temperature: " + str(step['temperature']) + "C, Time: " + str(step['hold_time_seconds']) + " seconds")

# hmm, this appears to be unused...
class InstrumentMock:
    def __init__(self, instrument_official_name, mount_LR, tip_rack_list):
        self.instrument_official_name = instrument_official_name
        self.mount_LR = mount_LR
        self.starting_tip = None    # must be an attribute: advance_tip() and pick_up_tip() read self.starting_tip

        if "p20" in instrument_official_name:
            self.display_name = "P20"
            self.vol_range = (1, 20)
        elif "p300" in instrument_official_name:
            self.display_name = "P300"
            self.vol_range = (20, 300)
        elif "p1000" in instrument_official_name:
            self.display_name = "P1000"
            self.vol_range = (100, 1000)
        else:
            mock_print("WARNING: UNSUPPORTED PIPETTE")
            assert False

    def advance_tip(self):
        row, col = self.starting_tip.get_row_col()

        row += 1
        if row > ord('H'):
            row = ord('A')
            col += 1

        if col > 12:
            mock_print("WARNING: OUT OF TIPS!!!")
            assert False

        self.starting_tip.set_row_col(row, col)

    def pick_up_tip(self):
        row, col = self.starting_tip.get_row_col()
        assert(row >= ord('A') and row <= ord('H'))
        assert(col >= 1 and col <= 12)
        mock_print(self.display_name + " is picking up a tip from " + str(self.starting_tip))
        self.advance_tip()

    def drop_tip(self):
        mock_print(self.display_name + " is dropping a tip")

    def aspirate(self, volume, well):
        assert(isinstance(volume, (int, float)))
        assert(isinstance(well, WellMock))
        assert volume >= self.vol_range[0] and volume <= self.vol_range[1]
        mock_print("##### " + str(well.labware_official_name) + " [" + str(well.well_id) + "] ---> (" + str(volume) + "uL)")

    def dispense(self, volume, well):
        assert(isinstance(volume, (int, float)))
        assert(isinstance(well, WellMock))
        assert volume >= self.vol_range[0] and volume <= self.vol_range[1]
        mock_print("##### " + str(well.labware_official_name) + " [" + str(well.well_id) + "] <--- (" + str(volume) + "uL)")

    def blow_out(self):
        mock_print(self.display_name + " blow out")

    def mix(self, repetitions, volume, well):
        assert(isinstance(repetitions, int))
        assert(isinstance(volume, (int, float)))
        assert(isinstance(well, WellMock))
        assert volume >= self.vol_range[0] and volume <= self.vol_range[1]
        mock_print("##### " + str(well.labware_official_name) + " [" + str(well.well_id) + "] - Mixing - " + str(repetitions) + " times, volume " + str(volume) + "uL")

    def move_to(self, location, force_direct=False):
        assert(isinstance(force_direct, bool))
        assert(isinstance(location, WellMock))
        mock_print(self.display_name + " is moving")

class OpentronsMock:
    def __init__(self, well_colors):
        self.well_colors = well_colors
        self.pipette = None
        #self.location_cache = None     # unimplemented: opentrons api's more canonical way to get last_location, but these protocols don't need it

    def home(self):
        mock_print("Going home!")

    # the opentrons api names these arguments: self, load_name, location, label
    def load_labware(self, labware_official_name, deck_slot, display_name):
        mock_print("Loaded " + str(labware_official_name) + " in deck slot " + str(deck_slot))
        return LabwareMock(labware_official_name, deck_slot, display_name, self.well_colors)

    # the opentrons api names these arguments: self, module_name, location
    def load_module(self, module_official_name, deck_slot=0):
        mock_print("Loaded module " + str(module_official_name) + " in deck slot " + str(deck_slot))
        return ModuleMock(module_official_name, deck_slot, self.well_colors)

    # the opentrons api names these arguments: self, instrument_name, mount, tip_racks
    def load_instrument(self, instrument_official_name, mount_LR, tip_rack_list):
        self.pipette = PipetteSim(instrument_official_name, mount_LR, tip_rack_list, self.well_colors)
        return self.pipette

    def pause(self):
        mock_print("Robot pause")

    def visualize(self):
        self.pipette.visualize()
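
The `petriLocOfWell()` helper in the simulator places a source well on the petri-dish diagram using only its ID. The mapping can be sketched on its own as below, assuming the same 84 mm dish and 2 mm margin; `well_to_petri_xy` is a hypothetical name used only for this illustration and is not part of the simulator or the Opentrons API.

```python
PETRI_INNER_DIAMETER = 84                        # mm, as in the simulator
MAX_DRAW_RADIUS = PETRI_INNER_DIAMETER / 2 - 2   # 40 mm

def well_to_petri_xy(well_id):
    """Map a well ID such as 'A1' to (x, y) in mm on the petri diagram.

    Mirrors the simulator's formula: the row letter drives x (D is the
    center), and the column number drives y (6 is the center).
    """
    row = ord(well_id[0].upper())
    col = int(well_id[1:])
    x = (row - ord('D')) * MAX_DRAW_RADIUS / 4
    y = (col - 6) * MAX_DRAW_RADIUS / 6
    return (x, y)

for well in ("D6", "A1", "H12"):
    print(well, well_to_petri_xy(well))
```

D6 lands at the origin, A1 in the lower left, and H12 at (40, 40). These positions are only used for drawing smears to and from a source well, so they are not subject to the circular limit that `dispense()` enforces.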

Put your name in the 'author' field of the metadata near the top of the first block, give your protocol a 'protocolName' there, and fill in the 'description' of what the protocol will do. Write code to create your design at the very end of the first block.

DEVELOPMENT TIP: Write your code in short runnable chunks, and after you've written each one run both of your code blocks (running the first one loads your code, running the second one executes it on the simulator) to see that it's doing what you expect. Simulate often!
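
Alongside simulating often, a quick pre-flight check can catch design points that the simulator's `dispense()` would reject, since it refuses any point farther than `MAX_DRAW_RADIUS` from the dish center. The sketch below assumes the 40 mm radius defined in the simulator; `points_outside_radius` is a hypothetical helper name, not part of the simulator or the Opentrons API.

```python
import math

MAX_DRAW_RADIUS = 40.0  # mm; PETRI_INNER_DIAMETER/2 - 2 in the simulator

def points_outside_radius(points, radius=MAX_DRAW_RADIUS):
    """Return the (x, y) points whose distance from the dish center exceeds radius."""
    return [(x, y) for (x, y) in points if math.hypot(x, y) > radius]

# (40, 40) sits inside a +/-40 mm square but about 56.6 mm from the center.
sample = [(0, 0), (31, 15), (40, 0), (40, 40)]
print(points_outside_radius(sample))   # [(40, 40)]
```

A square bounding box of plus or minus 40 mm is looser than this circular test near the corners, so a design that passes a box check can still fail at dispense time.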

My Code

    from opentrons import types
    import math

    metadata = {
        "author": "Charley Naney",
        "protocolName": "HTGAA Opentrons Lab",
        "description": "HW3 multi-color agar patterning (safe preview + batched dispense)",
        "source": "HTGAA 2026 Opentrons Lab",
        "apiLevel": "2.20",
    }

    # ----------------------------
    # DECK CONSTANTS
    # ----------------------------
    TIP_RACK_DECK_SLOT = 9
    COLORS_DECK_SLOT = 6
    AGAR_DECK_SLOT = 5
    PIPETTE_STARTING_TIP_WELL = "A1"

    # ----------------------------
    # COLOR SOURCE WELL -> PATTERN NAME (for logging)
    # ----------------------------

    # What the TA mock/visualizer uses for plotting colors (must be matplotlib-valid)
    well_colors = {
        "A1": "#FFBF00",  # mko2
        "B1": "#FF4500",  # mrfp1
        "C1": "#FF2400",  # mscarlet
        "D1": "#32CD32",  # sfgfp
        "E1": "#7DF9FF",  # electra2
        "F1": "#4A4B43",  # mjuniper
    }

    # What YOU use for printing nice names in the log
    well_color_names = {
        "A1": "mko2",
        "B1": "mrfp1",
        "C1": "mscarlet",
        "D1": "sfgfp",
        "E1": "electra2",
        "F1": "mjuniper",
    }

    mko2_points = [(7, 29),(11, 29),(13, 29),(17, 29),(21, 29),(7, 27),(9, 27),(11, 27),(15, 27),(17, 27),(19, 27),(21, 27),(25, 27),(1, 25),(3, 25),(5, 25),(7, 25),(9, 25),(11, 25),(13, 25),(17, 25),(19, 25),(23, 25),(25, 25),(27, 25),(-1, 23),(1, 23),(5, 23),(7, 23),(9, 23),(11, 23),(13, 23),(21, 23),(23, 23),(27, 23),(29, 23),(-9, 21),(-5, 21),(-3, 21),(-1, 21),(1, 21),(3, 21),(5, 21),(23, 21),(27, 21),(29, 21),(-13, 19),(-9, 19),(-7, 19),(-3, 19),(-1, 19),(23, 19),(25, 19),(27, 19),(29, 19),(-17, 17),(-13, 17),(-11, 17),(-7, 17),(-5, 17),(-3, 17),(25, 17),(-21, 15),(-17, 15),(-15, 15),(-13, 15),(-11, 15),(-9, 15),(-7, 15),(25, 15),(27, 15),(29, 15),(31, 15),(-25, 13),(-23, 13),(-21, 13),(-19, 13),(-17, 13),(-15, 13),(-13, 13),(25, 13),(29, 13),(-27, 11),(-25, 11),(-23, 11),(-21, 11),(-19, 11),(-17, 11),(25, 11),(27, 11),(31, 11),(-27, 9),(-23, 9),(-21, 9),(25, 9),(27, 9),(-27, 7),(-25, 7),(-23, 7),(25, 7),(27, 7),(29, 7),(-29, 5),(-27, 5),(-25, 5),(-23, 5),(23, 5),(27, 5),(-29, 3),(-27, 3),(-25, 3),(-23, 3),(23, 3),(25, 3),(27, 3),(-29, 1),(-27, 1),(-25, 1),(-21, 1),(21, 1),(23, 1),(-27, -1),(-25, -1),(-23, -1),(-21, -1),(-19, -1),(-15, -1),(17, -1),(19, -1),(23, -1),(-23, -3),(-21, -3),(-19, -3),(-3, -3),(5, -3),(7, -3),(11, -3),(13, -3),(15, -3),(19, -3),(-21, -5),(-19, -5),(-15, -5),(-13, -5),(-11, -5),(-9, -5),(-7, -5),(-5, -5),(-3, -5),(-1, -5),(1, -5),(3, -5),(7, -5),(11, -5)]
    mrfp1_points = [(19, 29),(1, 27),(23, 27),(-3, 25),(3, 23),(7, 21),(25, 21),(31, 21),(-17, 19),(5, 19),(-9, 17),(27, 17),(31, 17),(-11, 13),(-7, 13),(31, 13),(29, 11),(-25, 9),(-19, 9),(29, 9),(31, 9),(33, 9),(-33, 7),(-31, 7),(-21, 7),(31, 7),(29, 5),(-23, 1),(25, 1),(-31, -1),(-17, -1),(15, -1),(21, -1),(-27, -3),(-25, -3),(1, -3),(3, -3),(9, -3),(17, -3),(-17, -5),(5, -5),(13, -5),(-23, -7),(-19, -7),(-17, -7),(-15, -7),(-7, -7),(3, -7),(5, -7),(-3, -9),(-17, -11),(-15, -11),(-3, -11),(3, -13),(-21, -15),(-21, -19)]
    mscarlet_i_points = [(-21, 5),(19, 5),(-33, 3),(-31, 3),(-21, 3),(-19, 3),(-35, 1),(-37, -3),(-37, -5),(-37, -7),(-35, -9),(-33, -11),(-27, -13),(-25, -13),(-23, -15),(-21, -17),(-19, -23),(-17, -27),(-13, -29),(-11, -29)]
    sfgfp_points = [(13, 17),(15, 17),(1, 15),(9, 15),(11, 15),(13, 15),(15, 15),(3, 13),(11, 13),(13, 13),(15, 13),(17, 13),(-5, 11),(-3, 11),(1, 11),(3, 11),(7, 11),(11, 11),(13, 11),(15, 11),(-9, 9),(-7, 9),(-5, 9),(-1, 9),(1, 9),(7, 9),(9, 9),(13, 9),(15, 9),(-11, 7),(-7, 7),(-3, 7),(-1, 7),(1, 7),(3, 7),(7, 7),(9, 7),(13, 7),(15, 7),(19, 7),(-7, 5),(-5, 5),(-3, 5),(-1, 5),(3, 5),(5, 5),(9, 5),(-15, 3),(-13, 3),(-7, 3),(-11, 1),(3, 1)]
    electra2_points = [(17, 21),(19, 19),(7, 17),(17, 17),(21, 15),(-3, 13),(9, 13),(9, 11),(17, 11),(-15, 9),(-13, 9),(5, 9),(21, 9),(-5, 7),(17, 7),(-15, 5),(-13, 5),(-11, 5),(11, 5),(-5, 3),(3, 3),(13, 3),(17, 3),(-13, 1),(9, 1),(11, 1)]
    mjuniper_points = [(17, 19),(9, 17),(19, 17),(17, 15),(5, 11),(-3, 9),(3, 9),(11, 9),(17, 9),(19, 9),(-13, 7),(-9, 7),(5, 7),(-9, 5),(1, 5),(13, 5),(-11, 3),(9, 3),(11, 3)]

    colors_to_points = {
        "A1": mko2_points,
        "B1": mrfp1_points,
        "C1": mscarlet_i_points,
        "D1": sfgfp_points,
        "E1": electra2_points,
        "F1": mjuniper_points,
    }

    def run(protocol):
        # Log through the protocol when it supports comments; fall back to print() on the mock
        def safe_comment(msg):
            if hasattr(protocol, "comment"):
                protocol.comment(str(msg))
            else:
                print(str(msg))

        # The mock's pause() takes no arguments, so retry without the message
        def safe_pause(msg):
            if hasattr(protocol, "pause"):
                try:
                    protocol.pause(str(msg))
                except TypeError:
                    protocol.pause()
                    print(f"[PAUSE] {msg}")
            else:
                print(f"[PAUSE] {msg}")

        # Only the mock pipette tracks aspirated_loc; this is a no-op on a real pipette
        def reset_mock_aspirated_loc(pipette):
            if hasattr(pipette, "aspirated_loc"):
                pipette.aspirated_loc = None

        tips_20ul = protocol.load_labware(
            "opentrons_96_tiprack_20ul",
            TIP_RACK_DECK_SLOT,
            "Opentrons 20uL Tips",
        )

        # The mock names its third parameter differently, so fall back to a positional argument
        try:
            pipette_20ul = protocol.load_instrument(
                "p20_single_gen2",
                "right",
                tip_racks=[tips_20ul],
            )
        except TypeError:
            pipette_20ul = protocol.load_instrument(
                "p20_single_gen2",
                "right",
                [tips_20ul],
            )

        pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)

        temperature_module = protocol.load_module(
            "temperature module gen2",
            COLORS_DECK_SLOT,
        )

        temperature_plate = temperature_module.load_labware(
            "opentrons_96_aluminumblock_generic_pcr_strip_200ul",
            "Cold Plate",
        )

        color_plate = temperature_plate

        agar_plate = protocol.load_labware(
            "htgaa_agar_plate",
            AGAR_DECK_SLOT,
            "Agar Plate",
        )

        center_location = agar_plate["A1"].top()

        DRY_RUN_ONLY = False
        DOT_UL = 1.0
        PIP_MAX_UL = 20.0
        HEADROOM_UL = 1.0
        PREVIEW_Z_MM = 10.0
        DETACH_Z_MM = 5.0

        X_MAX = 40.0
        Y_MAX = 40.0

        origin = center_location

        max_dots_per_asp = int(math.floor((PIP_MAX_UL - HEADROOM_UL) / DOT_UL))
        if max_dots_per_asp < 1:
            raise ValueError("Invalid aspiration volume configuration.")

        # Serpentine ordering: rows from top to bottom, alternating x direction on each row
        def optimize_path_scanline(points):
            buckets = {}
            for (x, y) in points:
                buckets.setdefault(float(y), []).append(float(x))
            ys = sorted(buckets.keys(), reverse=True)
            out = []
            flip = False
            for y in ys:
                xs = sorted(buckets[y], reverse=flip)
                out.extend([(x, y) for x in xs])
                flip = not flip
            return out

        def bounding_box_check(mapping):
            for src, pts in mapping.items():
                for (x, y) in pts:
                    if (float(x) < -X_MAX or float(x) > X_MAX or
                        float(y) < -Y_MAX or float(y) > Y_MAX):
                        raise RuntimeError(f"Point outside safety bounds for {src}: ({x}, {y})")

        def preview_move(pipette, location):
            # Location.move() adds an offset to the location, so pass only the extra height
            # (adding location.point.z here would count the z coordinate twice)
            high = location.move(types.Point(z=PREVIEW_Z_MM))
            pipette.move_to(high)

        def dispense_and_detach(pipette, volume, location):
            # same offset rule as preview_move(): raise by DETACH_Z_MM relative to the target
            above = location.move(types.Point(z=DETACH_Z_MM))
            pipette.move_to(above)
            pipette.dispense(volume, location)
            pipette.move_to(above)

        bounding_box_check(colors_to_points)

        safe_comment("Starting dry run preview.")
        pipette_20ul.pick_up_tip()

        for src_well in sorted(colors_to_points.keys()):
            pts = optimize_path_scanline(colors_to_points[src_well])
            safe_comment(
                f"Preview {src_well} ({well_color_names.get(src_well, src_well)}): {len(pts)} dots"
            )
            for (x, y) in pts:
                target = origin.move(types.Point(x=float(x), y=float(y)))
                preview_move(pipette_20ul, target)

        pipette_20ul.drop_tip()

        if DRY_RUN_ONLY:
            safe_comment("Preview only mode complete.")
            return

        safe_pause("Dry run complete. Resume to dispense.")

        safe_comment("Starting dispense.")

        for src_well in sorted(colors_to_points.keys()):
            source = color_plate[src_well]
            pts = optimize_path_scanline(colors_to_points[src_well])

            safe_comment(
                f"Pattern: {well_color_names.get(src_well, src_well)} | Source well: {src_well} | Dots: {len(pts)}"
            )

            pipette_20ul.pick_up_tip()
            reset_mock_aspirated_loc(pipette_20ul)

            # Aspirate in batches that leave HEADROOM_UL free, then print one dot per DOT_UL
            remaining = list(pts)
            while remaining:
                batch = remaining[:max_dots_per_asp]
                remaining = remaining[max_dots_per_asp:]

                volume_needed = len(batch) * DOT_UL
                pipette_20ul.aspirate(volume_needed, source)

                for (x, y) in batch:
                    target = origin.move(types.Point(x=float(x), y=float(y)))
                    dispense_and_detach(pipette_20ul, DOT_UL, target)

            pipette_20ul.drop_tip()
            reset_mock_aspirated_loc(pipette_20ul)

        safe_comment("Run complete.")
    
    # Execute Simulation / Visualization -- don't change this code block
    protocol = OpentronsMock(well_colors)
    run(protocol)
    protocol.visualize()
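
The dispense loop in `run()` batches dots so that each aspiration stays below the pipette's capacity. The arithmetic can be sketched on its own as below, assuming the same constants as the protocol (20 uL pipette, 1 uL headroom, 1 uL per dot); `plan_batches` is a hypothetical helper name for illustration and is not part of the protocol or the Opentrons API.

```python
import math

PIP_MAX_UL = 20.0    # pipette capacity, as in run()
HEADROOM_UL = 1.0    # volume deliberately left unused
DOT_UL = 1.0         # volume per printed dot

def plan_batches(n_dots, pip_max_ul=PIP_MAX_UL, headroom_ul=HEADROOM_UL, dot_ul=DOT_UL):
    """Return the list of aspiration volumes (uL) needed to print n_dots."""
    max_dots = int(math.floor((pip_max_ul - headroom_ul) / dot_ul))
    if max_dots < 1:
        raise ValueError("Invalid aspiration volume configuration.")
    volumes = []
    remaining = n_dots
    while remaining > 0:
        batch = min(remaining, max_dots)
        volumes.append(batch * dot_ul)
        remaining -= batch
    return volumes

print(plan_batches(56))   # [19.0, 19.0, 18.0]
```

With these settings each aspiration covers at most 19 dots, so a 56-dot pattern needs three trips to the source well with a single tip.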

Week 4 HW: Protein Design Part I

Homework: Protein Design I

Assignment


Objective:

Learn basic concepts: amino acid structure, 3D protein visualization, and the variety of ML-based design tools. Brainstorm as a group how to apply these tools to engineer a better bacteriophage (setting the stage for the final project).

HTGAA Protein Engineering Tools, HTGAA Protein Engineering Feedback​


Part A. Conceptual Questions

Answer any of the following questions by Shuguang Zhang:

1. How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)
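
A rough sketch of the arithmetic behind this question, using the ~100 Da average given in the prompt. The protein fraction of meat is an assumption (roughly 20% by mass is a common ballpark for raw meat), so two figures are shown: an upper bound that treats all 500 g as amino acids, and an estimate scaled by the assumed protein fraction.

```python
AVOGADRO = 6.02214076e23       # molecules per mole
MEAT_MASS_G = 500.0            # from the question
AA_MASS_G_PER_MOL = 100.0      # ~100 Da per amino acid, from the question
PROTEIN_FRACTION = 0.20        # ASSUMPTION: about 20% of raw meat mass is protein

def amino_acid_count(mass_g, protein_fraction):
    """Estimate the number of amino acid molecules in mass_g of meat."""
    moles = mass_g * protein_fraction / AA_MASS_G_PER_MOL
    return moles * AVOGADRO

upper = amino_acid_count(MEAT_MASS_G, 1.0)               # all 500 g counted as amino acids
rough = amino_acid_count(MEAT_MASS_G, PROTEIN_FRACTION)  # only the assumed protein share
print(f"upper bound: {upper:.1e}, with protein fraction: {rough:.1e}")
```

Under these assumptions the answer is on the order of 10^24 molecules at most (5 mol), and closer to 6 x 10^23 (about 1 mol) once the protein fraction is taken into account.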

2. Why humans eat beef but do not become a cow, eat fish but do not become fish?

Although the saying goes that we are what we eat, our genomes disagree. Digestion breaks the cow's proteins down into amino acids, and our cells reassemble those amino acids into human proteins according to our own genes. Genes preserve function against countless intrinsic and extrinsic stresses, including other organisms' genes. Animal genomes are inherited vertically, from ancestors to parents to offspring, and are protected in the germ line, but there are also the functional genes that make proteins in specialized tissue mosaics, which selfishly persist as long as their constituent cells survive. All cells are selfish like that. These specialized cells are sustained, though, by new macromolecules, which is likely why the human eats the cow in the first place. Here, though, Natural Selection has again preserved the reproductive programs of each species' genome, starting with the taste buds and olfactory centers of the human brain. A carnivore has a digestive tract that can disassemble and denature raw muscle protein very efficiently, largely because it has different enzymes. Are lions susceptible to prion diseases? That question will make more sense in a moment, but a quick search of the internet suggests they are highly resistant (particularly to Chronic Wasting Disease). Again, the enzymes in their gut reportedly protect them against 97-98% of the CWD prions they are exposed to. The risk to humans is different, but then so is the type of prion disease. Humans also have digestive tracts, though, that denature the proteins and lipids of the cow before they are absorbed. Unfortunately, humans also have large-scale industrial agriculture and rampant inequality in access to cuts and quality of cow meat. This means that economics largely determines which parts of cows humans will be exposed to, how concentrated the cows on the feedlots will be, and the condition of the slaughter floor. It's hard to know today, because enzymes outside of their niches are fragile things, but before humans started cooking their cow meat ahead of shredding and emulsification in their digestive systems, maybe they had more of the lion's enzymes. The loss of those enzymes is nowhere more felt than in the epidemiological triangle between the prion in the cow and the prion in man. Here the interaction between these two animals and the zoonotic pathogen is not gene-to-gene but protein-to-protein, which adds a new layer of complexity.

3. Why there are only 20 natural amino acids?

This is such a great question. Likely it is because Earth is a special planet for forming these 20 amino acids, or at least the most exceptional planet we have any knowledge of thus far. Foundational to this question is what I love most about the study of Earth life. Evolution is change in matter in response to energetic landscapes in constant motion. Everything in biology is nested in this evolutionary onion, through which cosmic evolution begets planetary evolution, all the way down to the subatomic particles of atoms and back up to cells-within-cells. All this evolutionary change, as my astrobiologist friends tell me, started with the creation of our Universe 13.8 billion years ago. Therefore, as I learned from Dr. Graves to Dr. Lane, single-celled organisms evolved into multicellular organisms through Natural Selection and biochemical pathways that can be traced back to the 20 natural amino acids and to the macromolecules that formed first single-celled and then multicelled organisms. We can trace the origin of the individual atoms in those molecules, from micromolecules to macromolecules, one element at a time or in families. Throughout all this evolutionary nesting, from the Cosmos to the Microcosmos, there is the unifying story of energy flowing and matter cycling. This is the root cause of the first tradeoff in Natural Selection: the struggle for existence of living organisms constrained by dynamic environmental conditions. Life, or the 20 natural amino acids, thus formed from elemental particles flowing from energy sink to energy sink over the vast horizon of evolutionary time in one long, continuous chain reaction. If anyone doubts this, break every natural amino acid down into its elemental atoms, starting with hydrogen, the most abundant element in the Universe.

6. Can you make other non-natural amino acids? Design some new amino acids.

We sure can, but we have to be heretical and leave the canonical 20 natural amino acids behind. The process then becomes a matter of which trajectory you take through the design space. Are you going to design new amino acids within cells or externally, and from the bottom up or the top down?

7. Where did amino acids come from before enzymes that make them, and before life started.

Earth was formed by the same cosmic evolutionary process that formed the other planets and their moons over 4 billion years ago. Cells are an administrative variable today. What is less understood is the abiotic chemistry of the primordial oceans, including the hydrothermal vents. In fact, we know that originally, when the Earth was first formed, it didn't even have a moon. Then there was movement of Jupiter, Mars was forever changed, and a giant piece of rock hit Earth, which led to the formation of our Moon. Again, all this was over 4 billion years ago, and there were no amino acids or enzymes at this time. Now, I am going to throw a bunch of science at you, but just remember when I do: the Earth and Moon are a twin study. The simple answer is that when amino acids are synthesized in a lab, outside of a cell, they form a racemic mixture of both L-amino acids and D-amino acids. However, when they are synthesized within cells, between genes and proteins for specific functions, they are almost exclusively L-amino acids. This is because cells use stereoselective enzymes to speed up biological synthesis, and this contributes to the preponderance of L-amino acids. Samples of the Moon's surface from the Apollo missions contained glycine, alanine, glutamic acid, aspartic acid, serine, and threonine. One reading is that amino acids were once being generated on the Moon, but the process stopped. Coincidentally, over 4 billion years ago, when the Earth was forming, a huge rock collided with it, leading to the formation of the orbiting Moon. This allows us to hypothesize that some simple R-group amino acids left Earth and entered the vacuum of space, where they were essentially frozen in time. Meanwhile, more natural L-amino acids continued to evolve on Earth. Where did they evolve? In bacterial and archaeal cells. The central dogma of molecular biology tells us that genes cannot be transcribed and translated into proteins without enzymes. Now, knowing that ontogeny recapitulates phylogeny, the question becomes: who is the index bacterium, Earth life's original synthesizer?

8. If you make an alpha-helix using D-amino acids, what handedness (right or left) would you expect?

The 20 primary amino acids are all L-amino acids (glycine, which is achiral, aside), as are most protein building blocks of cells. Alpha-helices built from them, like B-DNA, favor right-handedness. By the power of deduction, an alpha-helix built from D-amino acids, the mirror-image exceptions, would be left-handed. Now, the cool science here is that B-DNA favors right-handedness on Earth, and because of complementary strand synthesis, the pattern of right-handedness carries over to the A-form RNA during transcription, but not exclusively! This means that our converse hypothesis was correct, and there is another handedness in RNA, known as Z-RNA. Furthermore, there are specific conditions that generate the Z-RNA form.

9. Can you discover additional helices in proteins?

Yes, there are primary, secondary, tertiary, and quaternary forms in proteins based on the structural chemistry of the bonding molecules. The α-helix is the most ordinary secondary protein structure but in biology there are already exceptions. Examples of these exceptional helices include the 3₁₀ and π helices.

10. Why most molecular helices are right-handed?

Most life on Earth is evolutionarily rooted in B-DNA helices with a right-handed conformation due to origin in saltwater oceans, passed on to self-replicating cells synthesized from macromolecules shaped by complementarities in form dominated by non-covalent weak interactions.

11. Why do beta-sheets tend to aggregate?

Two ways to answer this question: Hysteresis and Natural Selection. This is a why question after all, and that’s what evolution shines light on, the why behind biological structure. If you dig to the bottom of the fossil pile or record you find energetic attractor conditions for the clumping of polypeptides. Let’s start with the canonical amino acid ingredients of beta-sheet secondary structures. Every amino acid from the natural 20 is capable of contributing to beta-sheets, but some are very improbable to be represented, including: Proline, Glycine, Aspartic acid, Glutamic acid, and Lysine. Conversely, some amino acids and properties frequently favor beta-sheet formation, for example the alternating chain pattern of hydrophobicity and polarity. Therefore, high probability beta-sheet producers include: Valine, Isoleucine, Phenylalanine, Serine, and Tyrosine.

12. What is the driving force for b-sheet aggregation?

Before we look at the science, my speculative answer is structural stability. In nature we see the tensile strength of silk

13. Why many amyloid diseases form b-sheet?

14. Can you use amyloid b-sheets as materials?

15. Design a b-sheet motif that forms a well-ordered structure.


Part B: Protein Analysis and Visualization

In this part of the homework, you will be using online resources and 3D visualization software to answer questions about proteins.

I summon mTOR for my protein (including 4JSV, 4JSN, 4DRH, 4DRI, 3ML9, 5WBH, 5GPG, 5H64, 5FLC, 6BCX, 6BCU, 6SB0, 6SB2, 6ZWM, 7PE7, 7PEC, 8ERA, 9ED4, 9ED6, 9ED7, 9ED8).

mTor mTor
Pick any protein (from any organism) of your interest that has a 3D structure and answer the following questions.
  1. Briefly describe the protein you selected and why you selected it. The protein I am selecting, now that I better understand the exercise and tooling, is the mighty mTOR/TOR (mechanistic target of rapamycin) protein kinase. I pick this protein because just trying to write the name is challenging. In addition, there is conserved and then there is mTOR conserved; it’s as common as cytochrome c. Also, I like the puzzle of navigating all of the codes just to plug it into the codebase; the combinations are amazing, as my son’s Pokémon characters would say. Lastly, it has a lot of wiring for considering circuitry, as a quintessential regulator of Natural Selection programs for growth, maintenance, and metabolism under stress.
  2. Identify the amino acid sequence of your protein.
PDB_A | PDB_B | Protein | AA Sequence
4JSV | 4JSV_2 | Target of rapamycin complex subunit LST8 | MNTSPGTVGSDPVILATAGYDHTVRFWQAHSGICTRTVQHQDSQVNALEVTPDRSMIAAAGYQHIRMYDLNSNNPNPIISYDGVNKNIASVGFHEDGRWMYTGGEDCTARIWDLRSRNLQCQRIFQVNAPINCVCLHPNQAELIVGDQSGAIHIWDLKTDHNEQLIPEPEVSITSAHIDPDASYMAAVNSTGNCYVWNLTGGIGDEVTQLIPKTKIPAHTRYALQCRFSPDSTLLATCSADQTCKIWRTSNFSLMTELSIKSGNPGESSRGWMWGCAFSGDSQYIVTASSDNLARLWCVETGEIKREYGGHQKAVVCLAFNDSVLG
4JSV | 4JSV_1 | Serine/threonine-protein kinase mTOR | DDNGIVLLGERAAKCRAYAKALHYKELEFQKGPTPAILESLISINNKLQQPEAAAGVLEYAMKHFGELEIQATWYEKLHEWEDALVAYDKKMDTNKDDPELMLGRMRCLEALGEWGQLHQQCCEKWTLVNDETQAKMARMAAAAAWGLGQWDSMEEYTCMIPRDTHDGAFYRAVLALHQDLFSLAQQCIDKARDLLDAELTAMAGESYSRAYGAMVSCHMLSELEEVIQYKLVPERREIIRQIWWERLQGCQRIVEDWQKILMVRSLVVSPHEDMRTWLKYASLCGKSGRLALAHKTLVLLLGVDPSRQLDHPLPTVHPQVTYAYMKNMWKSARKIDAFQHMQHFVQTMQQQAQHAIATEDQQHKQELHKLMARCFLKLGEWQLNLQGINESTIPKVLQYYSAATEHDRSWYKAWHAWAVMNFEAVLHYKHQNQARDEKKKLRHASGANITNATTAATTAATATTTASTEGSNSESEAESTENSPTPSPLQKKVTEDLSKTLLMYTVPAVQGFFRSISLSRGNNLQDTLRVLTLWFDYGHWPDVNEALVEGVKAIQIDTWLQVIPQLIARIDTPRPLVGRLIHQLLTDIGRYHPQALIYPLTVASKSTTTARHNAANKILKNMCEHSNTLVQQAMMVSEELIRVAILWHEMWHEGLEEASRLYFGERNVKGMFEVLEPLHAMMERGPQTLKETSFNQAYGRDLMEAQEWCRKYMKSGNVKDLTQAWDLYYHVFRRISKQLPQLTSLELQYVSPKLLMCRDLELAVPGTYDPNQPIIRIQSIAPSLQVITSKQRPRKLTLMGSNGHEFVFLLKGHEDLRQDERVMQLFGLVNTLLANDPTSLRKNLSIQRYAVIPLSTNSGLIGWVPHCDTLHALIRDYREKKKILLNIEHRIMLRMAPDYDHLTLMQKVEVFEHAVNNTAGDDLAKLLWLKSPSSEVWFDRRTNYTRSLAVMSMVGYILGLGDRHPSNLMLDRLSGKILHIDFGDCFEVAMTREKFPEKIPFRLTRMLTNAMEVTGLDGNYRITCHTVMEVLREHKDSVMAVLEAFVYDPLLNWRLMDTNTKGNKRSRTRTDSYSAGQSVEILDGVELGEPAHKKTGTTVPESIHSFIGDGLVKPEALNKKAIQIINRVRDKLTGRDFSHDDTLDVPTQVELLIKQATSHENLCQCYIGWCPFW
4JSN
4DRH
4DRI
3ML9
5WBH
5GPG
5H64
5FLC
6BCX
6BCU
6SB0
6SB2
6ZWM
7PE7
7PEC
8ERA
9ED4
9ED6
9ED7
9ED8
  1. How long is it? What is the most frequent amino acid? You can use this notebook to count most frequent amino acid - https://colab.research.google.com/drive/1vlAU_Y84lb04e4Nnaf1axU8nQA6_QBP1?usp=sharing

    The length of the Surfeit locus protein 2 protein is: 257 amino acids. The most common amino acid is: L, which appears 27 times. All Amino Acid Frequencies: L: 27 (10.51%) K: 25 (9.73%) E: 24 (9.34%) R: 24 (9.34%) S: 20 (7.78%) P: 18 (7.00%) D: 15 (5.84%) Q: 15 (5.84%) T: 14 (5.45%) H: 13 (5.06%) F: 12 (4.67%) G: 10 (3.89%) A: 8 (3.11%) Y: 8 (3.11%) V: 7 (2.72%) N: 7 (2.72%) C: 5 (1.95%) M: 2 (0.78%) I: 2 (0.78%) W: 1 (0.39%)
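The length and frequency report above can be reproduced outside the Colab notebook with a few lines of Python. This is a minimal sketch, not the notebook's own code: the function name `aa_frequencies` is hypothetical, and the 12-residue example string is only a stand-in for a full protein sequence.

```python
from collections import Counter


def aa_frequencies(sequence):
    """Return (length, rows), where rows are (residue, count, percent) tuples
    sorted from most to least frequent."""
    seq = "".join(sequence.split()).upper()  # drop whitespace and line wraps
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    rows = [(aa, n, 100.0 * n / len(seq)) for aa, n in counts.most_common()]
    return len(seq), rows


# Stand-in input: a short 12-residue fragment, used purely for illustration.
length, rows = aa_frequencies("MATKVVCVLKGD")
print(f"Length: {length} amino acids")
for aa, n, pct in rows:
    print(f"{aa}: {n} ({pct:.2f}%)")
```

Pasting a full sequence in place of the stand-in should give the same kind of table as the notebook output, which makes it easy to check that the report matches the protein actually selected.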

  2. How many protein sequence homologs are there for your protein? Hint: Use the pBLAST tool to search for homologs and ClustalOmega to align and visualize them. Tutorial Here

  3. Does your protein belong to any protein family?

  4. Identify the structure page of your protein in RCSB

  5. When was the structure solved? Is it a good quality structure? Good quality structure is the one with good resolution. Smaller the better (Resolution: 2.70 Å)

  6. Are there any other molecules in the solved structure apart from protein?

  7. Does your protein belong to any structure classification family? SURF2

  8. Open the structure of your protein in any 3D molecule visualization software:

  • PyMol Tutorial Here (hint: ChatGPT is good at PyMol commands)
    • Visualize the protein as “cartoon”, “ribbon” and “ball and stick”.
    • Color the protein by secondary structure. Does it have more helices or sheets?
    • Color the protein by residue type. What can you tell about the distribution of hydrophobic vs hydrophilic residues?
    • Visualize the surface of the protein. Does it have any “holes” (aka binding pockets)?

Part C. Using ML-Based Protein Design Tools

aaprob aaprob fps fps 3dtsne 3dtsne

In this section, we will learn about the capabilities of modern protein AI models and test some of them in your chosen protein. Copy the notebook below and set up a colab instance with GPU for this section: HTGAA_ProteinDesign2026.ipynb

Choose your favorite protein from the PDB. We will now try multiple things, report each of those results in your homework page:

  • Protein Language Models
    • Deep Mutational Scans: Use ESM2 to generate an unsupervised deep mutational scan of your protein based on language model likelihoods. Can you explain any particular pattern? (choose a residue and a mutation that stands out) (Bonus) Find sequences for which we have experimental scans, and compare the prediction of the language model to experiment
    • Latent Space Analysis: Use the provided sequence dataset to embed proteins in reduced dimensionality. Analyze the different formed neighborhoods: do they approximate similar proteins? Place your protein in the resulting map and explain its position and similarity to its neighbors
    • Attention Maps: Analyze the attention maps of ESM2. Investigate if its layers correlate to the 2D map of residue distances of your protein
  • Protein Folding
    • Folding a protein: Fold your protein with ESMFold. Do the predicted coordinates match your original structure? Try changing the sequence, first try some mutations, then large segments. Is your protein structure resilient to mutations?
  • Protein Generation
    • Inverse-Folding a protein: Let’s now use the backbone of your chosen PDB to propose sequence candidates via ProteinMPNN. Analyze the predicted sequence probabilities and compare the predicted sequence vs the original one. Input this sequence into ESMFold and compare the predicted structure to your original

newmaaprob newmaaprobnewpdb newpdb

Last step in my copy of HTGAA_ProteinDesign2026.ipynb script

Generating sequences…
tmp, score=2.2338, fixed_chains=[], designed_chains=[‘A’], model_name=v_48_020
MDEPPSDVLAFLRQHPSLRLLPNTRKVRCSLTGHELPCRLPELQEYTRGKKYQRLSSSFSNFDYAAFEPHIVPSTKNRHQLFCKLTLRHINKSPEHVLRHTQGRRYQRALHQYEECQKQGVEYVPACLLHKRKKREDQTNSDELPGQRTGFWEPASSDEEDALSDDSMTDLYPPELFTKRELGKPKNDDTPEDFLTDQQDEKPEHSEEKSFREREEARVGHKRGRKLRKKQLTSLTKKFKSYHHKPKNFSSFKQLGR
T=0.1, sample=0, score=1.1620, seq_recovery=0.2568
MPPLPPEVVAFLAQHPHLVALPGQPLVRCTLTGEELPAELPVLRAHVATPRHQALAAREKNFDFSKYEPHIVPSRWDPDKLFCRLCLKEIPKTPEAVEAHVNSKEHQEALKEYEEAKKRGKRYIPKRLRKRRRRRRRRRRRRRGRRKRRKRRPPPPRRRPRRKRRRRRRRLVPREWLRRRRRRRRRRRRRRRPRRRRRPPRRVVGAAPEPAVAALAEAPAPPAPPAPPPPPEERPEPPPPPPERREPPPEELEEEEE

New Sequence: MPPLPPEVVAFLAQHPHLVALPGQPLVRCTLTGEELPAELPVLRAHVATPRHQALAAREKNFDFSKYEPHIVPSRWDPDKLFCRLCLKEIPKTPEAVEAHVNSKEHQEALKEYEEAKKRGKRYIPKRLRKRRRRRRRRRRRRRGRRKRRKRRPPPPRRRPRRKRRRRRRRLVPREWLRRRRRRRRRRRRRRRPRRRRRPPRRVVGAAPEPAVAALAEAPAPPAPPAPPPPPEERPEPPPPPPERREPPPEELEEEEE
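The `seq_recovery` value in the ProteinMPNN output can be read as the fraction of positions where the designed sequence keeps the original residue. The sketch below assumes the simplest version of that idea, a position-by-position identity over two equal-length sequences; ProteinMPNN itself restricts the calculation to designed positions, so `sequence_recovery` here is a hypothetical helper, not the tool's implementation. The two inputs are just the first 12 residues of each sequence shown above.

```python
def sequence_recovery(native, designed):
    """Fraction of aligned positions where the designed residue equals the native one.

    Assumes both sequences have the same length and are already aligned
    position by position (no gaps)."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("empty sequence")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


# First 12 residues of the input sequence and of the T=0.1 sample above.
native_start = "MDEPPSDVLAFL"
designed_start = "MPPLPPEVVAFL"
print(sequence_recovery(native_start, designed_start))  # 0.5
```

Run on the full 257-residue pair, the same function gives a quick sanity check against the reported 0.2568, and it makes the long arginine and proline runs in the design easier to put in context: most of the sequence no longer resembles the original.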

Part D. Group Brainstorm on Bacteriophage Engineering

  • Find a group of ~3–4 students
  • Review the Bacteriophage Final Project Goals:
    • Increased stability (easiest)
    • Higher titers (medium)
    • Higher toxicity of lysis protein (hard)
  • Brainstorm Session: Choose one or two main goals from the list that you think you can address computationally (e.g., “We’ll try to stabilize the lysis protein,” or “We’ll attempt to disrupt its interaction with E. coli DnaJ.”).
  • Write a 1-page proposal (bullet points or short paragraphs) describing:
    • Which tools/approaches from recitation you propose using (e.g., “Use Protein Language Models to do in silico mutagenesis, then AlphaFold-Multimer to check complexes.”).
    • Why you think those tools might help solve your chosen sub-problem.
    • One or two potential pitfalls (e.g., “We lack enough training data on phage–bacteria interactions.”).
  • Include a schematic of your pipeline
  • This resource may be useful: HTGAA Protein Engineering Tools
  • Individually put your plan on your website page
  • Each group’s short plan for engineering a bacteriophage
  • Schedule time (HTGAA Protein Engineering Feedback) to get feedback/discuss your ideas, and put the feedback on your website

[Optional] Part E. Find a drug for an oncology target

Week 5 HW: Protein Design Part ii

  • [] Homework — DUE BY START OF MAR 10 LECTURE

Part A: SOD1 Binder Peptide Design (From Pranam)

Superoxide dismutase 1 (SOD1) is a cytosolic antioxidant enzyme that converts superoxide radicals into hydrogen peroxide and oxygen. In its native state, it forms a stable homodimer and binds copper and zinc. Mechanis

Mutations in SOD1 cause familial Amyotrophic Lateral Sclerosis (ALS). Among them, the A4V mutation (Alanine → Valine at residue 4) leads to one of the most aggressive forms of the disease. The mutation subtly destabilizes the N-terminus, perturbs folding energetics, and promotes toxic aggregation.

  • [] Task A

Your challenge:

    Background: Design short peptides that bind mutant SOD1 and then decide which ones are worth advancing toward therapy. You will use three models developed in our lab:
  • PepMLM: target sequence-conditioned peptide generation via masked language modeling.
  • PeptiVerse: therapeutic property prediction.
  • moPPIt: motif-specific multi-objective peptide design using Multi-Objective Guided Discrete Flow Matching (MOG-DFM)
  • Part 1: Generate Binders with PepMLM

                      Begin by retrieving the human SOD1 sequence from UniProt (P00441) and introducing the A4V mutation.
    

    🟢🟤🟡P00441

    Here is the fully translated superoxide dismutase protein P00441 in UniProt, with the initiator methionine included. We need to cleave that M off before we apply our requested mutation to progress with a mature enzyme.

    So not this… 1 2 3 4 M A T K

    But this.. 1 2 3 4 A T K A

    To create our A4V SOD (love the rhyme) mutant… 1 2 3 4 A T K V

    The savvy student who fails to cleave the first methionine (M) can intuit the actual amino acid to change without thinking through any of the previous steps, but it’s nice to have a why in all things, since this is biology after all and we have evolution and ChatGPT. Please note that we will not want to use a protein sequence with any sort of truncation or wrapping, so here are my sequences for PepMLM-650M.

                      Using the PepMLM Colab linked from the HuggingFace PepMLM-650M model card: 
    

    🤗pepmlm650mlink

                      Generate four peptides of length 12 amino acids conditioned on the mutant SOD1 sequence.
    


    Mutant A4V SOD1 input for PepMLM-650M: there are two options, the full protein sequence and a 12-residue input, which I settled on in later runs.
    MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

    colabcode

    MATKVVCVLKGD
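    The numbering logic above (A4V counts from the mature protein, after the initiator methionine is removed) is easy to get wrong by one position, so here is a small sketch that makes the offset explicit. The helper `apply_point_mutation` and its `mature_numbering` flag are hypothetical names for illustration, and the input is only the first 12 residues of the wild-type precursor, inferred from the mutant input above by reverting position 4 to alanine.

```python
import re


def apply_point_mutation(precursor, mutation, mature_numbering=True):
    """Apply a mutation such as 'A4V' to a precursor sequence.

    With mature_numbering=True, residue 1 is the residue AFTER the initiator
    Met, so position N maps to 0-based index N in the precursor string."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    wild, position, new = match.group(1), int(match.group(2)), match.group(3)
    if mature_numbering and not precursor.startswith("M"):
        raise ValueError("mature numbering expects a precursor starting with Met")
    index = position if mature_numbering else position - 1
    if precursor[index] != wild:
        raise ValueError(
            f"expected {wild} at position {position}, found {precursor[index]}"
        )
    return precursor[:index] + new + precursor[index + 1:]


wild_type_start = "MATKAVCVLKGD"  # first 12 residues, initiator Met included
print(apply_point_mutation(wild_type_start, "A4V"))  # MATKVVCVLKGD
```

    The wild-type check is the useful part: with precursor numbering (`mature_numbering=False`) position 4 is a lysine, so the call fails instead of silently mutating the wrong residue.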

    Within the PepMLM-650M codebase in the Google Colab notebook, there are sliders and input fields to parameterize individual runs. However, changes to these parameters didn’t seem to take effect, so I finally hard-coded them, as I will show below as a series of excerpts pulled from the codebase.

    single_sequence = True #@param {type:"boolean"}
    protein_seq = "MATKVVCVLKGD" #@param {type:"string"}
    # Initial value for num_binders
    num_binders = 4
    # Initial values for top_k and peptide_length
    top_k = 3
    peptide_length = 12

    code_constrained_step

    Initial_4in1_SequenceSet

    Binder | Pseudo_Perplexity_Score
    WVVVLVAGVVGE | 35.014933
    LTLVVAVGEVGE | 25.582245
    SVTEEVEDVDPV | 21.336863
    LPTVVVEGVDPE | 17.079494

    To your generated list, add the known SOD1-binding peptide FLYRWLPSRRGG for comparison.

    Record the perplexity scores that indicate PepMLM’s confidence in the binders.
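    For reading the scores, it helps to see what a pseudo-perplexity is: the exponential of the negative mean log-probability the masked language model assigns to each residue when that residue is hidden. The sketch below shows that formula on made-up log-probabilities and then ranks the four binders recorded above. It assumes the usual reading that a lower pseudo-perplexity means higher model confidence; `pseudo_perplexity` is a hypothetical helper, not a function from the PepMLM notebook.

```python
import math


def pseudo_perplexity(token_log_probs):
    """exp(-mean(log p_i)), where p_i is the model's probability for residue i
    when that residue is masked. Natural logarithms are assumed."""
    if not token_log_probs:
        raise ValueError("need at least one log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Illustration only: if every masked residue were predicted with p = 0.5,
# the pseudo-perplexity would be 2 (up to floating-point rounding).
print(pseudo_perplexity([math.log(0.5)] * 12))

# Scores recorded in the table above, ranked from most to least confident
# under the lower-is-better assumption.
scores = {
    "WVVVLVAGVVGE": 35.014933,
    "LTLVVAVGEVGE": 25.582245,
    "SVTEEVEDVDPV": 21.336863,
    "LPTVVVEGVDPE": 17.079494,
}
ranked = sorted(scores.items(), key=lambda item: item[1])
for peptide, score in ranked:
    print(f"{peptide}  {score:.2f}")
```

    Under that assumption LPTVVVEGVDPE comes first and WVVVLVAGVVGE last, which is the ordering to carry into the AlphaFold3 and PeptiVerse comparisons.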

    Part 2: Evaluate Binders with AlphaFold3

    Navigate to the AlphaFold Server: alphafoldserver.com

    For each peptide, submit the mutant SOD1 sequence followed by the peptide sequence as separate chains to model the protein-peptide complex.

    Record the ipTM score and briefly describe where the peptide appears to bind. Does it localize near the N-terminus where A4V sits? Does it engage the β-barrel region or approach the dimer interface? Does it appear surface-bound or partially buried?

    In a short paragraph, describe the ipTM values you observe and whether any PepMLM-generated peptide matches or exceeds the known binder.

    Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse

    Structural confidence alone is insufficient for therapeutic development. Using PeptiVerse, let’s evaluate the therapeutic properties of your peptide! For each PepMLM-generated peptide:

    Paste the peptide sequence. Paste the A4V mutant SOD1 sequence in the target field. Check the boxes:
    • Predicted binding affinity
    • Solubility
    • Hemolysis probability
    • Net charge (pH 7)
    • Molecular weight

    Compare these predictions to what you observed structurally with AlphaFold3. In a short paragraph, describe what you see. Do peptides with higher ipTM also show stronger predicted affinity? Are any strong binders predicted to be hemolytic or poorly soluble? Which peptide best balances predicted binding and therapeutic properties?

    Choose one peptide you would advance and justify your decision briefly.

    Part 4: Generate Optimized Peptides with moPPIt

    Now, move from sampling to controlled design. moPPIt uses Multi-Objective Guided Discrete Flow Matching (MOG-DFM) to steer peptide generation toward specific residues and optimize binding and therapeutic properties simultaneously. Unlike PepMLM, which samples plausible binders conditioned on just the target sequence, moPPIt lets you choose where you want to bind and optimize multiple objectives at once.

    Open the moPPIt Colab linked from the HuggingFace moPPIt model card. Make a copy and switch to a GPU runtime. In the notebook:
    • Paste your A4V mutant SOD1 sequence.
    • Choose specific residue indices on SOD1 that you want your peptide to bind (for example, residues near position 4, the dimer interface, or another surface patch).
    • Set peptide length to 12 amino acids.
    • Enable motif and affinity guidance (and solubility/hemolysis guidance if available).
    • Generate peptides.

    After generation, briefly describe how these moPPIt peptides differ from your PepMLM peptides. How would you evaluate these peptides before advancing them to clinical studies?

Part B: BRD4 Drug Discovery Platform Tutorial (Gabriele)

  • [] Task B - Boltz Document

https://docs.google.com/document/d/18Vd9TQL2FjpEU0QdlGCgHe1D0BDoMzcfPRiFEXQIAas/preview

Part C: Final Project: L-Protein Mutants

This homework requires computation that might take you a while to run, so please get started early.

Tools

See HTGAA Protein Engineering Tools spreadsheet

Week 6 HW: Genetic Circuits Part i

[]Homework — DUE BY START OF MAR 17 LECTURE

Week 6 HW: Genetic Circuits Part 01

    Assignment: DNA Assembly

    Protocol and Study Questions

    What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?
    

    The key component is Phusion, a proprietary, gold-standard, heat-stable DNA polymerase and alternative to Taq, sold by Thermo Fisher Scientific. Unlike Taq, which was isolated from thermophilic bacteria, Phusion emulates an archaea-based enzyme that evolved in extremophile species of the hydrothermal vents. It functions as a DNA polymerase, essentially a form of biomimicry, with minimal replication error. Its purpose is to amplify target DNA sequences in the PCR protocol. The master mix also supplies the dNTPs that the polymerase incorporates and a reaction buffer containing Mg2+, which the polymerase needs as a cofactor. Phusion PCR is more expensive but worth the investment to increase the accuracy of the run.

    What are some factors that determine primer annealing temperature during PCR?
    

    I don’t know if a question will formally cover this, but PCR methods include initialization, denaturation, annealing, and extension. Heat is first applied in the initialization step of a hot start PCR protocol. There are two temperature modalities with a typical run: 205 °F or 208 °F (about 96 °C or 98 °C). Phusion polymerase would be a proper reagent for a hot start PCR run. The next phase, denaturation, again includes a 201–208 °F (about 94–98 °C) step to separate double-stranded DNA templates by breaking hydrogen bonds. The next temperature cycle is the annealing step, where temperature drops to 122–149 °F (50–65 °C). The annealing temperature itself is set by the primers’ melting temperature (Tm), which depends on primer length, GC content, and the salt and primer concentrations in the reaction. A key factor with annealing is to be exact with temperature and time to avoid off-target products in the reaction mixture.
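    Because annealing temperature is usually chosen relative to the primers' melting temperature, a rough Tm estimate is a handy companion to the temperatures quoted above. The sketch below uses the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a coarse approximation for short primers and is not the calculation Phusion's vendors recommend; the function names and the example primer are hypothetical. A small converter is included since the temperatures in this answer are given in Fahrenheit.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def wallace_tm(primer):
    """Rough melting temperature in Celsius by the Wallace rule:
    Tm = 2 * (A + T) + 4 * (G + C). Intended for short primers only."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq) or not seq:
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return 2 * at + 4 * gc


# Hypothetical 20-mer with 50% GC content.
primer = "ATGCATGCATGCATGCATGC"
print(wallace_tm(primer))            # 60
print(fahrenheit_to_celsius(122.0))  # 50.0
print(fahrenheit_to_celsius(149.0))  # 65.0
```

    The example shows the two factors a primer designer controls most directly: a longer primer or a higher GC fraction both raise the estimate, and the annealing step is then set a few degrees relative to it according to the polymerase's guidelines.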

    There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.  How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?  
    

    Fundamentally, the first difference between restriction enzymes and PCR is the inventor. Bacteria invented restriction enzymes to decrease the size of their individual single chromosome genome through Natural Selection to adapt to environmental niches faster. A professor once explained it to me like the traveller who embarks into the desert. Why would they carry junk DNA they do not need when they can prioritize every ounce of storage space for genes they will need to survive in the desert? PCR is an entirely different angle. Now we, the scientists, are using a laboratory machine to initialize, anneal, and extend sections of DNA we are interested in replicating experimentally. In fact, it is the exact opposite mechanism, like the continuum between divestment and investment of DNA. In terms of protocols, restriction enzymes are more of a puzzle based on the actual information available genetically, and PCR can be applied to any segment of DNA anywhere on the genome that can be extracted. Additionally, there is a quality differential between approaches in the input DNA.

    How does the plasmid DNA enter the E. coli cells during transformation?
    

    Bacteria naturally use three methods to transfer genetic material, including conjugation, transduction, and transformation. Conjugation requires direct contact. Transduction uses phages as intermediates. Transformation occurs in nature when bacteria incorporate genetic material from dead bacteria in the environment. Scientists have learned how to leverage bacterial transformation using heat shock and electroporation.

    Describe another assembly method in detail (such as Golden Gate Assembly (GGA))
    

    Golden Gate is one of my favorite parks in Cali, and GGA is a new (circa 1996) assembly method detailed by New England Biolabs. Key components are Type IIS restriction enzymes, T4 DNA Ligase, the “backbone”, and Transcription Activator Like Effectors (TALEs). Conceptually, GGA is revolutionary because it unites restriction enzymes and PCR amplicon assembly in an expedited way. The workflow for GGA has a two-modality kit for either BsaI or BsmBI directed assemblies; both are restriction enzymes that generate 4-bp overhangs when they cut. They have different recognition sites: BsaI is for standard assembly and uses GGTCTC, and BsmBI is a hierarchical system mod that uses CGTCTC.
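    One practical check before a Golden Gate reaction is whether the parts contain internal BsaI or BsmBI sites, since those would be cut too. The sketch below scans both strands for the two recognition sequences named above. It only reports where the recognition sites are, not where the enzyme cuts, and the function names and the test sequence are made up for illustration.

```python
RECOGNITION_SITES = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC"}


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(sequence, site):
    """Return sorted (0-based start, strand) hits for a recognition site.

    '+' means the site reads 5'->3' on the given strand; '-' means its
    reverse complement is found there, i.e. the site is on the other strand."""
    seq = sequence.upper()
    hits = []
    for strand, motif in (("+", site), ("-", reverse_complement(site))):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)
    return sorted(hits)


# Made-up sequence with one forward and one reverse BsaI site.
demo = "AAGGTCTCTTTTGAGACCAA"
for enzyme, site in RECOGNITION_SITES.items():
    print(enzyme, find_sites(demo, site))
```

    On the demo sequence this reports a forward BsaI site at index 2, a reverse one at index 12, and no BsmBI sites, which is the orientation pattern a Golden Gate part typically has at its two ends.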

    Explain the other method in 5 - 7 sentences plus diagrams (either handmade or online).
    

    An alternative to GGA is Gibson Assembly. In this approach:

    1. Step01: The DNA fragments to be assembled must share homologous overlaps of 20-40 bp at their ends.
    2. Step02: The reaction mix contains a 5’ exonuclease, which chews back the 5’ ends of both fragments and exposes single-stranded 3’ overhangs.
    3. Step03: The matching overlapping ends then anneal (base pair) spontaneously, which aligns the joined fragments in the correct order.
    4. Step04: A high-fidelity DNA polymerase (i.e., Phusion) then extends the 3’ ends to fill in the missing bases.
    5. Step05: Taq DNA ligase then seals the remaining nicks, and with that the fragments assemble into a continuous double-stranded DNA plasmid or linear construct.
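    The net effect of the five steps can be sketched in code: two fragments that share a terminal overlap end up joined with the overlap present once. This is a minimal string-level model under stated assumptions, with the enzymology (chew-back, annealing, fill-in, ligation) collapsed into a single join; `terminal_overlap`, `gibson_join`, and the example fragments are all hypothetical.

```python
def terminal_overlap(left, right, min_len=20):
    """Length of the longest suffix of `left` that equals a prefix of `right`,
    or 0 if no shared end of at least `min_len` bases exists."""
    longest = min(len(left), len(right))
    for n in range(longest, min_len - 1, -1):
        if left[-n:] == right[:n]:
            return n
    return 0


def gibson_join(left, right, min_len=20):
    """Join two fragments through their shared end, keeping the overlap once."""
    n = terminal_overlap(left, right, min_len)
    if n == 0:
        raise ValueError(f"no terminal overlap of at least {min_len} bp")
    return left + right[n:]


# Made-up fragments sharing a 24 bp homology region.
overlap = "GCTAGCATCGATCGGATCCTAGGT"
vector_end = "TTTT" + overlap
insert_start = overlap + "GGGG"
print(terminal_overlap(vector_end, insert_start))  # 24
print(gibson_join(vector_end, insert_start))
```

    The `min_len` default mirrors the 20 bp lower bound from Step01; fragments with a shorter shared end are rejected rather than joined, which is the same reason primers for Gibson Assembly are designed with overlap tails.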
    Model this assembly method with Benchling or a similar tool!
    

    Gibson Protocol in Benchling GibAssemb Adding Magenta Plasmid MagentaPlasmidDemo

    Repeating a similar Benchling workflow to BioClub Japan teammate Nourelden Rihan's notebook. Not only is Dr. Rihan a natural leader of excellence in the work but his formatting makes me wish I were Egyptian. 
    

    The workflow is to stitch a GFP protein into a plasmid. I tried the ENA route with a different vector and a GFP ideal for mitochondrial superfluorescence imaging. I then took a different course.

    1. First, I instead start with an eGFP protein without any additional fragments attached to it, shown here as U55761_EGFP_CDS translated into AA sequence for better artistic effect. U55761_EGFP_CDS

    Properties

    Position | 1-239
    Summary | MVSK…ELYK, 239 AAs
    Molecular weight | 26941.36 Da
    Isoelectric point (pI) | 5.58
    Extinction coefficient (Cys fully reduced) | 21890.00 M-1cm-1; Abs 0.1% (1 g/l) 0.813
    Extinction coefficient (Cys fully oxidized) | 22015.00 M-1cm-1; Abs 0.1% (1 g/l) 0.817
    Instability index | 29.06 (stable)

    Amino Acid Frequencies

    Amino acid | Count | Percent
    Ala A | 8 | 3.3%
    Arg R | 6 | 2.5%
    Asn N | 13 | 5.4%
    Asp D | 18 | 7.5%
    Cys C | 2 | 0.8%
    Gln Q | 8 | 3.3%
    Glu E | 16 | 6.7%
    Gly G | 22 | 9.2%
    His H | 9 | 3.8%
    Ile I | 12 | 5.0%
    Leu L | 21 | 8.8%
    Lys K | 20 | 8.4%
    Met M | 6 | 2.5%
    Phe F | 12 | 5.0%
    Pro P | 10 | 4.2%
    Ser S | 10 | 4.2%
    Thr T | 16 | 6.7%
    Trp W | 1 | 0.4%
    Tyr Y | 11 | 4.6%
    Val V | 18 | 7.5%
    Pyl O | 0 | 0.0%
    Sec U | 0 | 0.0%

    Net Charge

    pH | Charge
    4 | 22.49
    4.5 | 13.06
    5 | 5.50
    5.5 | 0.67
    6 | -3.06
    6.5 | -5.96
    7 | -7.70
    7.5 | -8.63
    8 | -9.31
    8.5 | -10.37
    9 | -12.81
    9.5 | -17.98
    10 | -26.38

    1. After wrestling with my pre-loaded mito_mGFP plasmid, I used ChatGPT to find the simplest possible vanilla mammalian expression vector with a CMV promoter and selectable marker. Mission accomplished? Not quite, many more steps follow after this. Also shown in this screenshot from the Benchling account are the primers that I had to create to guide the Gibson Assembly to follow. pcDNA31_plasmidwREIIandPCRPrimers

    Assignment: Asimov Kernel

    Protocol and Study Questions

    Create a Repository for your work
    
    Create a blank Notebook entry to document the homework and save it to that Repository
    
    Explore the devices in the Bacterial Demos Repo to understand how the parts work together by running the Simulator on various examples, following the instructions for the simulator found in the “Info” panel (click the “i” icon on the right to open the Info panel)
    
    Create a blank Construct and save it to your Repository
    
    Recreate the Repressilator in that empty Construct by using parts from the Characterized Bacterial Parts repository
    
    Search the parts using the Search function in the right menu
    
    Drag and drop the parts into the Construct
    
    Confirm it works as expected by running the Simulator (“play” button) and compare your results with the Repressilator Construct found in the Bacterial Demos repository
    
    Document all of this work in your Notebook entry - you can copy the glyph image and the simulator graphs, and paste them into your Notebook
    
    Build three of your own Constructs using the parts in the Characterized Bacterials Parts Repo
    
    Explain in the Notebook Entry how you think each of the Constructs should function
    
    Run the simulator and share your results in the Notebook Entry
    
    If the results don’t match your expectations, speculate on why and see if you can adjust the simulator settings to get the expected outcome
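    For comparing against the Kernel simulator, it can help to have a bare-bones repressilator model to hand. The sketch below integrates the dimensionless three-gene ring (each protein represses the next gene's mRNA) with a simple Euler step. Everything specific here is an assumption for illustration: the parameter values (`alpha`, `alpha0`, `beta`, `n`), the initial conditions, and the function name are not taken from the Kernel or from the Characterized Bacterial Parts repository, and the Kernel simulator may use a different model entirely.

```python
def simulate_repressilator(alpha=216.0, alpha0=0.216, beta=5.0, n=2.0,
                           dt=0.01, steps=5000):
    """Euler integration of a three-gene repressor ring.

    m[i] is the mRNA and p[i] the protein of gene i; gene i is repressed by
    the protein of gene i-1. Returns the protein levels at every step."""
    m = [1.0, 0.0, 0.0]   # illustrative, deliberately asymmetric start
    p = [2.0, 1.0, 3.0]
    history = []
    for _ in range(steps):
        new_m, new_p = [], []
        for i in range(3):
            repressor = p[(i - 1) % 3]
            dm = -m[i] + alpha / (1.0 + repressor ** n) + alpha0
            dp = -beta * (p[i] - m[i])
            new_m.append(m[i] + dt * dm)
            new_p.append(p[i] + dt * dp)
        m, p = new_m, new_p
        history.append(tuple(p))
    return history


trace = simulate_repressilator()
# Print a coarse sample of the three protein levels over time.
for step in range(0, len(trace), 500):
    print(step, [round(value, 2) for value in trace[step]])
```

    If a construct built in the Kernel does not oscillate as expected, changing one parameter at a time in a toy model like this (promoter strength via `alpha`, leakiness via `alpha0`, cooperativity via `n`) is one way to form a guess about which simulator setting to adjust.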
    

    Resources

    1. Primer Design: HTGAA’s Supplement to Gibson Assembly Recitation
    2. NEB’s (New England Biolabs) video Introduction to Gibson Assembly
    3. NEB’s (New England Biolabs) explanation & protocols for Gibson Assembly®

Week 7 HW: Genetic Circuits Part 2

muse muse

[]Homework — DUE BY START OF MAR 31 LECTURE at 2PM ET

Week 7 HW: Genetic Circuits Part 2

    Assignment Part 1: Intracellular Artificial Neural Networks (IANNs)

    What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?
    

    IANNs are ideal for the continuous transcriptomic-driven change observed in cells that are constantly moving and communicating in their intracellular environment – through analog computations. In contrast, much of the early synbio genetic circuit engineering was digital, with discrete logic gate switch programming or perhaps even through gene knock out (present versus absent) if such a connection would be permitted.

    Describe a useful application for an IANN; include a detailed description of input/output behavior, as well as any limitations an IANN might face to achieve your goal.
    

    Perhaps it’s still a bit conceptual at this time, but the redox lesion-boundary in my personal HTGAA project may be a useful application for framing an intracellular artificial neural network (IANN). My goal would be to create a perceptron-esque intracellular circuit that classifies tissue sample sites by integrating continuous biochemical inputs into a weighted, thresholded, graded output that identifies lesion boundaries in host animals. This data will depend on a time series with continuous expression and variation that is spatially distributed across host goats. The input layer in this model might not even include a mite or mite eggs, as is traditionally used, but instead models inputs like ROS, hypoxia, and inflammatory damage that spread across hosts in similar phenotypic patterns. The weights for this model would be the expected promoter strength, affinity, repression, and similarity. Activation would be evaluated using a threshold nonlinear gene response (it’s biology after all), and the output would be fluorescent marking (highlighting) of lesion boundary cellular expression.
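    The input/output behavior described above can be sketched as a single perceptron whose activation is a Hill-type response, a common way to model a thresholded but graded gene response. This is a toy model under stated assumptions: the input names, the weights, the threshold, and the steepness are all invented for illustration and are not measured promoter strengths or affinities.

```python
def hill_activation(signal, threshold=1.0, steepness=4.0):
    """Graded, thresholded response in [0, 1): half-maximal at `threshold`,
    sharper switching for larger `steepness`."""
    if signal <= 0.0:
        return 0.0
    powered = signal ** steepness
    return powered / (threshold ** steepness + powered)


def lesion_boundary_score(inputs, weights, threshold=1.0, steepness=4.0):
    """Weighted sum of biochemical inputs passed through a Hill activation.

    The returned value stands in for the relative fluorescent output."""
    if set(inputs) != set(weights):
        raise ValueError("inputs and weights must use the same names")
    weighted_sum = sum(weights[name] * inputs[name] for name in inputs)
    return hill_activation(weighted_sum, threshold, steepness)


# Hypothetical weights (stand-ins for promoter strength / affinity).
weights = {"ROS": 0.6, "hypoxia": 0.8, "inflammation": 0.5}

healthy = {"ROS": 0.1, "hypoxia": 0.1, "inflammation": 0.0}
boundary = {"ROS": 1.0, "hypoxia": 0.5, "inflammation": 0.0}
lesion = {"ROS": 2.0, "hypoxia": 1.5, "inflammation": 1.0}

for label, site in (("healthy", healthy), ("boundary", boundary), ("lesion", lesion)):
    print(label, round(lesion_boundary_score(site, weights), 3))
```

    The limitation shows up directly in the model: the output is only as good as the weights, and in a cell those are set by promoter and binding properties that drift with context, so a boundary call that works at one sample site may need re-tuning at another.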

    Below is a diagram depicting an intracellular single-layer perceptron where the X1 input is DNA encoding for the Csy4 endoribonuclease and the X2 input is DNA encoding for a fluorescent protein output whose mRNA is regulated by Csy4. Tx: transcription; Tl: translation. 
    

    Example Circuit

    Draw a diagram for an intracellular multilayer perceptron where layer 1 outputs an endoribonuclease that regulates a fluorescent protein output in layer 2.
    

    With a minimal understanding of this week’s lab, I am now going to try to adapt the Cortassa lab’s (2004) computational mitochondrial oscillator on the reactive oxygen species model to the diagram challenge for the intracellular multilayer perceptron. Why? I am endlessly fascinated by both. The Cortassa model’s hypotheses target spontaneous metabolic oscillations in heart cells, and the model is built using sound biology and stone-cold math. I also respect that Cortassa was modelling the mitochondria in 2003; we didn’t know anything about mitochondria back then, compared to what we know now. Another great feature of the Cortassa model is the scalability across time. Often, when we discuss the interstellar nature of scale in biology, we focus on space, but time is just as perplexing when it comes to the bioenergetics of metabolic redox reactions, as Cortassa et al. (2004) describe, with timescales from milliseconds to hours. Lastly, the Cortassa model even considers fluorescent probes, so as a side note, if this lab is looking for a PostDoc.

    Cortassa Lab Dynamic Mitochondrial Oscillator Model

    • The dynamics are nonlinear
    • Each modelled mitochondrion is assumed to possess an inner membrane anion channel, abbreviated as IMAC.
      • An IMAC is activated
      • An IMAC is modulated by Mg2+ and pH
      • An IMAC is inhibited by amphiphilic molecules.
    • The model has two state changes:
      • There is the relaxation mitochondrial oscillator state with slow and fast spaces. Over the slow space ROS builds up in the mitochondrial matrix
      • There is the stable mitochondrial oscillatory state

    Source:
    Cortassa S, Aon MA, Winslow RL, O’Rourke B. A mitochondrial oscillator dependent on reactive oxygen species. Biophys J. 2004 Sep;87(3):2060-73. doi: 10.1529/biophysj.104.041749. PMID: 15345581; PMCID: PMC1304608.
    Cortassa S, Aon MA, Marbán E, Winslow RL, O’Rourke B. An integrated model of cardiac mitochondrial energy metabolism and calcium dynamics. Biophys J. 2003 Apr;84(4):2734-55. doi: 10.1016/S0006-3495(03)75079-6. PMID: 12668482; PMCID: PMC1201507.

    Assignment Part 2: Fungal Materials

    What are some examples of existing fungal materials and what are they used for? 
    
    What are their advantages and disadvantages over their traditional counterparts?
    
    What might you want to genetically engineer fungi to do and why? 
    

    There are so many things that we could genetically engineer with fungi, so why is it so confounding just to pick one? Ironically, it’s easier to imagine what a bird would genetically engineer with fungi. Shelter is, after all, the first demand for survival; a bird’s nest could be a noble odyssey for scientific discovery. To start, what do birds build nests with? On this farm, birds use sticks, hay, and straight wool. What do all of these substrates have in common? For the most part, they are straight and bendable, but there are other notable attributes. Hay and wool, for example, are resistant to most environmental antagonists: water, wind, bugs, and fire. I also think birds use composites of these materials because they are structurally reliable with a predictable function. For example, all three are tough enough and not heavy. A bird can pick up a piece with its beak, fly back to the nest, and drop it where it wants it. Does this make it a tool? Anyway, once the bird drops a piece of stick, wool, or hay, it has just two more steps: tuck in side A and then tuck in side B – done. The problem with all three of these materials is that they exist independently of the task of providing birds with material for nests. Enter a fungal farm for bird nest materials. All we need are yeast plasmids, a PCR machine, a gel electrophoresis device, reagents, electricity, computers, and revolutionary institutions to teach us how. Fortified with our why and how, we commence…

    What are the advantages of doing synthetic biology in fungi as opposed to bacteria?
    

    There are several ways to approach this question. We choose bacteria because of their remarkable ability to adapt. While it’s important to acknowledge the bacterium’s collaboration with natural selection, it took the kingdom over a billion years to develop this trait. Traditionally, we prioritize bacteria in synthetic biology because bacteria occupy the niche first. Why is this significant? Well, it can be summarized in one sentence: the niche is where matter and energy intersect. Fungi eventually occupy the niche too, and when they do, they provide structure, and in a less specific way. Perhaps this universality arises because fungi share genes with both plants and animals. Scale is another reason to choose fungi. Millions of years ago, bacteria learned the illusion of invisibility to multicellular ocular adaptations, which was advantageous in their race for niches against multicellular competitors. Not to downplay humans too much: we figured this out too, and we can even watch bacteria at their little spinning wheels and transform them from the inside out. However, be it tangential or not, I still wonder where the evolutionary studies are that prospectively follow synbio bacteria as they reintegrate with nature. In summary, then, I choose fungi over bacteria because they replicate slowly for microbes, which, for example, fits better with the human chronological understanding of space and time.

    Installation Guide and Walk-through of the Neuromorphic Wizard Software.
    Open link for the installation file: https://drive.google.com/drive/folders/10_gEzYV2J5hVOdKt6sNBeSH8cEMMGX8O
    Click the three dots.
    Click Download.
    Move the folder somewhere you can reach from Terminal with the cd command.
    If the downloaded file is zipped, unzip it.
    In Terminal, cd into the NeuromorphicWizard folder and run ls to list its contents.
    In Terminal, run: conda create -n neuro_wiz python==3.10
    In Terminal, run: conda activate neuro_wiz
    In Terminal, run: pip install -r requirements.txt
    In Terminal, run: python main.py
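    Before the last step, a small pre-flight check can save a confusing error message. The sketch below is my own helper, not part of the Neuromorphic Wizard software: the function name check_setup is hypothetical, and it assumes that requirements.txt and main.py sit at the top level of the NeuromorphicWizard folder, as the steps above imply.

```python
import sys
from pathlib import Path

# Files the steps above expect in the NeuromorphicWizard folder (assumption).
REQUIRED_FILES = ("requirements.txt", "main.py")


def check_setup(folder, version_info=sys.version_info):
    """Return a list of readable problems; an empty list means ready to run."""
    problems = []
    if tuple(version_info[:2]) != (3, 10):
        problems.append(
            f"Python 3.10 expected, found {version_info[0]}.{version_info[1]}"
        )
    folder = Path(folder)
    for name in REQUIRED_FILES:
        if not (folder / name).is_file():
            problems.append(f"missing {name} in {folder}")
    return problems


if __name__ == "__main__":
    issues = check_setup(Path.cwd())
    print("ready to run python main.py" if not issues else "\n".join(issues))
```

    Saved as a file inside the folder and run with the neuro_wiz environment active, it reports either that everything is in place or which piece is missing.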
    

    Assignment Part 3: First DNA Twist Order

    Review the Individual Final Project documentation guidelines.

    Submit this Google Form with your draft Aim 1, final project summary, HTGAA industry council selections, and shared folder for DNA designs. DUE MARCH 20 FOR MIT/HARVARD/WELLESLEY STUDENTS

    Review Part 3: DNA Design Challenge of the week 2 homework.

    Design at least 1 insert sequence and place it into the Benchling/Kernel/Other folder you shared in the Google Form above.

    Document the backbone vector it will be synthesized in on your website.

    Reading & Resources

    The perceptron, the basis of artificial neural networks: https://www.geeksforgeeks.org/deep-learning/what-is-perceptron-the-simplest-artificial-neural-network/

    Many examples of artificial neural networks made using biomolecules: https://doi.org/10.1016/j.biosystems.2024.105164


Week 9 HW: Cell Free Systems


    Assignment Part 1:

    ABC?
    

    EFG

    Reading & Resources


Week 10 HW: Imaging and Measurement


    Assignment Part 1:

    ABC?
    

    EFG

    Reading & Resources


Subsections of Labs

Week 1 Lab: Pipetting


Projects

Final projects:

  • Sarcoptes scabiei Suspected Here! HTGAA 2026 Final Project: Sarcoptes exploit host geometry – developing assay to measure redox landscape in stratum corneum of goat. By Charley Naney

Subsections of Projects

Individual Final Project


Sarcoptes scabiei Suspected Here!

FinalSlide

HTGAA 2026 Final Project: Sarcoptes exploit host geometry – developing assay to measure redox landscape in stratum corneum of goat

By Charley Naney

SECTION 1: ABSTRACT

  1. Provide an abstract/summary for your project. (minimum 150 words)
  • Should be a self-contained description of the project
  • Should contain brief outline of:
  • Significance: The ectoparasitic mite Sarcoptes scabiei exploits the stratum corneum of mammalian skin, including hosts within the Bovidae family (subfamily Caprinae), notably the genera Capra (goats) and Ovis (sheep). While S. scabiei infects a broad range of mammals, this project focuses on caprine hosts (Capra), which provide a tractable and agriculturally relevant system. Goats and sheep share a common ancestor within Caprinae but diverged approximately 4–5 million years ago, adapting to distinct ecological geometries. Goats evolved in heterogeneous, vertical, and discontinuous mountain environments, whereas sheep adapted to more homogeneous, open, and flock-oriented landscapes. These divergent ecological pressures have shaped not only behavior and morphology but potentially the spatial structure of skin physiology, including barrier properties, microenvironmental heterogeneity, and host–parasite interactions. Sarcoptes scabiei is a globally distributed, highly contagious parasite of major veterinary and public health concern. Cross-host transmission is frequently observed in practice: infestations in sheep may spread to goats, dogs, and humans in shared environments, reflecting the long cohabitation of these species under both natural and artificial selection over the past 8,000–10,000 years. This deep ecological and evolutionary entanglement suggests that Sarcoptes scabiei operates within host-derived microenvironments that are conserved enough to permit transmission, yet variable enough to shape infection dynamics. This project introduces a spatially resolved diagnostic concept: rather than sampling lesions indiscriminately, it targets the peri-lesional boundary, hypothesized to represent a localized maximum in reactive oxygen species (ROS) generated by host immune activity. 
By developing a small-scale assay—potentially leveraging fluorescent ROS probes—to detect this spatial oxidative signature, I aim to create a rapid, geometry-informed approach to identifying mite-associated lesions. While species-level discrimination of Sarcoptes lineages may not be feasible within the project timeline, this approach can be strengthened through triangulation, integrating spatial ROS signals with observational and survey-based metadata (host species, lesion morphology, transmission context). Together, these elements may yield a more robust and field-deployable diagnostic framework. Beyond immediate application, this work addresses a broader evolutionary and biophysical question: how parasites exploit host tissue geometry to create stable microenvironments, and how those environments can be detected through their spatial redox signatures. In this sense, Sarcoptes scabiei infection becomes not only a veterinary problem, but a model system for understanding how geometry, evolution, and immune dynamics converge in living tissue.
  • Broad objectives: I will develop a spatially resolved diagnostic assay for detecting Sarcoptes scabiei infection, or evidence of burrows in the epidermal layers of skin, using fluorescent ROS probes capable of detecting an oxidative stress signature in caprine skin. This approach will include farmer education, a survey for collecting host and environmental metadata, and skin sampling targeted at peri-lesional microenvironments. Testing data can later be combined with transcriptomic sequencing, as other researchers are starting to do, to identify putative genes that can be analyzed to develop more exact assay-based biomarkers for diagnosis, prognosis, and more targeted therapy with effective surveillance, as only a recovering epidemiologist and goat and sheep farmer, returned to evolutionary microbiology, can dream up.
  • Hypotheses: Sarcoptes scabiei infection or infestation (I still need to decide on the best state-variable terminology here) is a pathogenic process involving reorganization of the geometry of the stratum corneum. The mites burrow to feed, make shelter, and lay eggs, which results in a diffusion-constrained microenvironment ideal for assaying the surrounding host immune responses, including what I hypothesize to be a localized region of maximum ROS concentration distributed within a peri-lesional boundary.
  • Specific aims
  • Null Hypothesis: In goats with probable Sarcoptes scabiei infestation, ROS levels will be uniformly distributed across the lesion core and the peri-lesional boundary tissue, with no spatial enrichment at the lesion boundary.
  • Alternative Hypothesis: In goats with probable Sarcoptes scabiei infestation, a localized maximum of ROS will be present at the peri-lesional boundary in caprine skin.
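One way these two hypotheses could be compared, once paired fluorescence readings exist for each lesion (one from the core, one from the boundary), is an exact sign-flip permutation test. The Python sketch below is only an illustration under my own assumptions: the function name is hypothetical, the choice of test is mine rather than a settled part of the protocol, and the intensity values are invented placeholders, not measurements.

```python
from itertools import product


def paired_sign_flip_test(core, boundary):
    """One-sided exact sign-flip test for boundary > core on paired readings.

    Returns (mean paired difference, p-value). Under the null hypothesis the
    sign of each paired difference is arbitrary, so every sign pattern is
    enumerated and compared with the observed mean difference.
    """
    if len(core) != len(boundary) or not core:
        raise ValueError("need equal-length, non-empty paired readings")
    diffs = [b - c for c, b in zip(core, boundary)]
    n = len(diffs)
    observed = sum(diffs) / n
    at_least_as_large = 0
    for signs in product((1, -1), repeat=n):
        flipped = sum(s * d for s, d in zip(signs, diffs)) / n
        if flipped >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_large += 1
    return observed, at_least_as_large / 2 ** n


# Placeholder intensities (arbitrary units), invented purely for illustration.
core_readings = [10, 12, 11, 9, 10, 13]
boundary_readings = [15, 16, 14, 15, 13, 18]
mean_diff, p_value = paired_sign_flip_test(core_readings, boundary_readings)
print(f"mean boundary - core difference: {mean_diff:.2f}, one-sided p = {p_value:.4f}")
```

A small p-value would count against the null hypothesis of a uniform distribution. Full enumeration is only practical for a modest number of lesions, since the number of sign patterns doubles with each added pair.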

  • Methods to be employed
  • Use lay language (i.e., understandable by the general public) as much as possible

SECTION 2: PROJECT AIMS
  2. Outline three aims of your final project (min. 3 sentences, at least one for each aim)
  • Aim 1: Baseline Fluorescence Mapping of Herd. The subject (e.g., a goat, sheep, or other mammal frequently in contact with the herd) is visually inspected using the system for suspected lesions indicative of possible infestation. A low-toxicity fluorescent dye or marker may be applied to the skin surface to establish baseline imaging conditions. A portable light source (e.g., UV or visible excitation light) coupled with an imaging device (e.g., a smartphone camera with an accessory macro lens) will be used to capture images of the skin regions where the first coat of fluorescent dye label is applied. These regions can include a suspected lesion or a control region. The images are then analyzed to establish baseline fluorescence patterns and to refine the instructions for collecting baseline images.
  • Aim 2: Reactive Oxygen Species Detection in Herd. A low-toxicity fluorescent probe capable of interacting with reactive oxygen species is applied in the field to the skin surface, including the lesion and the surrounding peri-lesional region. The sample will not be fixed with formaldehyde, a known carcinogen, but will instead be air dried, with an optional portable fan to accelerate this non-chemical fixation process. The probe will then be excited with an appropriate light source, and the ROS signals will be scanned and captured. Spatial variation in fluorescence intensity will then be analyzed to identify biologically active regions, including potential peri-lesional infestation boundaries.
  • Aim 3: Image Analysis and Risk Stratification in Herd. Captured images from Aim 1 and Aim 2 are analyzed using computational methods, including an intracellular artificial neural network circuit for analogue machine learning, to identify patterns of fluorescence intensity associated with peri-lesional activity. The system will classify individual animals in the herd by risk level, both to identify individuals requiring further testing and to guide that testing and treatment. In some embodiments, the system may be deployed to integrate data across multiple animals to support herd-level decision-making.
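To make the image-analysis step a little more concrete, here is a minimal Python sketch of one possible measurement: the mean fluorescence in a ring around the lesion divided by the mean in the lesion core. Everything in it is an assumption for illustration: the function names, the idea of a single lesion centre with fixed radii in pixels, the follow-up threshold of 1.5, and the synthetic test image, which is not real data.

```python
import math


def annulus_means(image, center, edges):
    """Mean pixel intensity in concentric rings [edges[i], edges[i+1]) around center.

    image is a list of rows of numbers; center is (row, col) in pixels.
    """
    cy, cx = center
    sums = [0.0] * (len(edges) - 1)
    counts = [0] * (len(edges) - 1)
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            r = math.hypot(y - cy, x - cx)
            for i in range(len(edges) - 1):
                if edges[i] <= r < edges[i + 1]:
                    sums[i] += value
                    counts[i] += 1
                    break
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


def boundary_enrichment(image, center, core_radius, boundary_width):
    """Ratio of mean intensity in the peri-lesional ring to the lesion core."""
    edges = [0, core_radius, core_radius + boundary_width]
    core, boundary = annulus_means(image, center, edges)
    return boundary / core


def flag_for_follow_up(ratio, threshold=1.5):
    """Hypothetical rule: flag when the ring is clearly brighter than the core."""
    return ratio >= threshold


# Synthetic 21x21 image: dim background with a bright ring at radius 4 to 7.
size = 21
synthetic = [
    [50.0 if 4 <= math.hypot(y - 10, x - 10) < 7 else 10.0 for x in range(size)]
    for y in range(size)
]
ratio = boundary_enrichment(synthetic, (10, 10), core_radius=4, boundary_width=3)
print("boundary/core ratio:", ratio, "flag:", flag_for_follow_up(ratio))
```

A ratio near 1 would be consistent with uniform signal, while a ratio well above 1 would point to enrichment at the boundary. Real photos would first need the lesion centre and radii to be marked, which this sketch simply takes as inputs.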
  • State or link any methods/experimental protocols/OpenTrons protocols/DNA or protein designs/protein design tools or models/Twist orders you will use

  • Quality Control and Validation: This is a scientific embodiment, so before the system is packaged for any farmer with a smartphone and a goat, an appropriate confirmatory analysis using cutting-edge laboratory methods will be performed. The analysis will be conducted using automation and high-throughput systems. The validation protocol will be developed with guidance from MIT HTGAA 2026. Validation steps may be used to correlate field-detected fluorescence signals with molecular or biological markers of infestation.

  • You will provide a detailed, step-by-step outline of how you will achieve your goal for the first aim in the experimental design portion of this assignment (i.e., in question 7)

  • Feel free to run your goal by a TA

  • The second aim should be a medium-term aim that is a follow-up to your first aim and focused on goals beyond this class, building toward your third, visionary aim

  • For example, your second aim may be to successfully execute a set of experimental protocols, solve a specific problem, or develop a specific technology building upon the goals of your first aim.

  • The third aim should be a visionary, long-term aim

  • Reveal how the larger goal of the project can be impactful

  • Examples: challenging an existing paradigm or clinical practice, addressing a critical barrier to progress in the field, describing how you envision a new technology to change how a certain type of research is conducted

SECTION 3: BACKGROUND

  3. Provide background research for your final project. This should describe the current state of knowledge related to your project and be a critical evaluation of the literature that identifies the gap in knowledge that this project will fill. Cite at least 2 peer-reviewed research papers. (min. 4 sentences)
  4. Describe how your project is innovative (min. 3 sentences)
  • Examples of topics to discuss
  • Novel applications, usage, or development of theoretical concepts, approaches, methodologies, instrumentation, and/or interventions
  • How it challenges current theories, paradigms, or ways in which technology/biological tools are used
  • How your project pushes the boundaries of synthetic biology
  5. Briefly expand upon the significance of your final project. (min. 5 sentences)
  • Examples of topics to discuss
  • There are several ways my project solves a pressing problem in the world. First and foremost, it equips any farmer who has a smartphone with my kit, which I hope they can buy at Walmart or Tractor Supply, or pick up for free where such conveniences are not available. The kit will include a low-toxicity fluorescent dye, a snap-on macro lens, a UV flashlight, and a QR code with registration for an app. If they snap a photo of their cat, the prompt will say “not goat.” Once they find a goat, the prompt will confirm the correct animal and tell them where to take the photo, reminding them to part the hair first, etc. This tool will empower them to test their intuition about an animal in the herd that they think may be infested with Sarcoptes scabiei, without spending hundreds or thousands of dollars to expose every animal in the herd to toxic preventive treatments, or to consult a vet before they know whether they have a problem. For example, “sulfur lime,” which is one of the treatments I am referring to, coats the entire animal in reactive oxygen species. In contrast, S. scabiei is highly localized and, as an obligate parasite with a very long evolutionary tail, will move from one susceptible animal to the next at a leisurely pace. Granted, when enough animals are infested, the chain of transmission can be hard to break, but localized remediation is effective when the infestation is detected early, which is why early detection is critical. Another challenge is the random way the mite enters a herd, often because neighboring farmers are in the same boat. Then there are all the intermediate hosts between herds. The only viable alternative to my solution is marrying a vet or having a kid go to vet school, but my kit is a lot cheaper.
  • Importance of the problem it solves or the critical barrier(s) to progress in the field that the proposed project addresses
  • The ways in which it contributes to the larger society
  • How the proposed project will improve scientific knowledge, technical capability, and/or clinical practice in one or more broad fields
  • How the concepts, methods, technologies, treatments, services, or preventative interventions that drive this field will be changed if the proposed aims are achieved
  6. Describe the bioethical considerations involved in your project. (min. 2 paragraphs)
  • First paragraph: Include what ethical implications are involved in your project. Try to suggest the ethical principle(s) you may apply (e.g., non-maleficence, justice).
  • Second paragraph: Describe the measures that should be taken to ensure that your project is ethical (both in how the research is conducted and in its broader implications for society). You may wish to answer the following questions:
  • What action(s) do you propose?
  • What are potential unintended consequences of your proposed actions?
  • What could you have wrong (e.g., incorrect assumptions and uncertainties)?
  • What are alternatives to your proposed actions?
  • Note: in an NIH proposal, an ethics statement is used to describe the relevance of this research to public health

SECTION 4: EXPERIMENTAL DESIGN
  7. Create a detailed experimental plan for your final project. Include a timeline for each part of your experimental plan (i.e., how long you would expect each step in your final project to take). (min. 15 lines/sentences; a numbered list is acceptable)
  • Include specific methods/tools/technologies/biological concepts for each part of the final project and analysis
  • This section will be used to determine whether the experiments are well designed, feasible, and likely to succeed in testing your hypothesis
  • Often this section is broken into discrete tasks/sub-aims
  • For each experiment and/or analysis, include a description of your expected results
  • If possible, include figure(s) that visually shows a broad workflow of your project or a specific aspect of your experimental plan

SECTION 5: TECHNIQUES, TOOLS, AND TECHNOLOGY
  8. We discussed and practiced various techniques related to synthetic biology throughout the semester. Place a check next to the techniques relevant to your project.
  Pipetting Pipetting Lab Safety Bioethical Considerations (must check this box) DNA Gel Art DNA Sequencing DNA Editing (e.g., CRISPR) DNA Construct Design Restriction Enzyme Digestion Gel Electrophoresis DNA Purification From Gel Databases (e.g., GenBank, NCBI, Ensembl, and UCSC Genome Browser) Opentrons Creating Code for Laboratory Automation PyLabRobot Using Liquid Handling Robots (e.g., Opentrons) Protein Design Protein Design Models and Notebooks Databases Tools BioProduction BioProduction Chassis Selection (e.g., Dh5alpha) Registry of Standard Biological Parts FreeGenes Plasmid Preparation Bacterial Culturing Quality Control/Analysis Bacterial Processing (e.g., Centrifugation, Lysis, DNA Purification) Cell Free Cell Free Reactions Freeze-Dried Cell Free Systems miniPCR Tools Week 7: Gibson Assembly Primer Design or Selection PCR Reactions Gibson Assembly Other Cloning Methods (e.g., Restriction Enzyme Digestion or Gateway Cloning) Week 8-9: CRISPR CRISPR/Cas9 Designing Prime Editing gRNA Creating Twist Order
  9. Expand upon two techniques you checked in the previous question by describing how you would utilize those techniques in your final project. (min. 4 sentences)

SECTION 6: PROJECT VALIDATION
  10. You are required to validate at least one aspect of your final project aims. This is to ensure that you are able to successfully apply a relevant synthetic biology technique to your project. Include figures if you have them; accuracy is critical in figures, tables, and graphs. Here is a non-exhaustive list of acceptable validations:
  • Performing a PCR reaction using primers relevant to your final project
  • Performing a Gibson assembly relevant to your final project
  • Designing DNA relevant to your final project
  • Creating and performing a cell-free assay related to your final project
  • Creating and running code to validate an aspect of your final project
  • Developing a model or completing a computational analysis relevant to your project
  • Designing DNA construct(s) that can express at least one gene of interest, ordering it (via Twist), and testing of the expression of the construct(s) (potentially using an OpenTrons robot)
  10a. What aspect of your final project did you choose to validate? (min. 2 sentences)
  10b. Write down a detailed protocol of how you validated this aspect of your final project. (Numbered list or paragraph is fine)
  10c. What synthetic biology techniques did you utilize in validating this aspect of your final project? You can refer to the list of techniques in question 8. (min. 4 sentences)
  11. Did you encounter any unexpected challenge(s) when performing your validation? If so, describe the challenge(s) and strategies to overcome it. If not, discuss potential problems, difficulties, limitations, and/or alternative strategies to overcome challenges in your final project. (min. 4 sentences)

SECTION 7: ADDITIONAL INFORMATION
  12. List all references cited in this assignment (bullet-point list)
  13. Create a supply list and budget for your project (bullet-point list)
  • What supplies, equipment, and budget are needed for your project to work?

Group Final Project
