First, describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project and/or something for which you are already doing in your research, or something you are just curious about.
Thyroid implant for canines
In the canine world, there is a very specific problem that owners face: hypothyroidism. It is a common condition that dogs develop around the ages of 2 or 3. Still, most of the time it is mistaken for other health conditions, such as intestinal problems, allergies, or dermatological conditions. Because the condition affects many parts of a dog's body, veterinarians sometimes reach a wrong diagnosis.
Part 1: Benchling & In-silico Gel Art
See this week’s lab protocol “Gel Art: Restriction Digests and Gel Electrophoresis” for details. Overview:
Make a free account at benchling.com
Import the Lambda DNA.
Simulate Restriction Enzyme Digestion with the following Enzymes:
EcoRI
HindIII
BamHI
KpnI
EcoRV
SacI
SalI
Restriction Enzyme Digestion made with Benchling
Create a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks.
You might find Ronan’s website a helpful tool for quickly iterating on designs!
E=m*a2
EcoRV vs. EcoRI
Single Enzymes
Pyramid Enzymes
Part 3: DNA Design Challenge
Assignment: Python Script for Opentrons Artwork — DUE BY YOUR LAB TIME!
Your task this week is to create a Python file to run on an Opentrons liquid handling robot.
Review this week’s recitation and this week’s lab for details on the Opentrons and programming it.
Generate an artistic design using the GUI at opentrons-art.rcdonovan.com.
Star's birth
Rectangular color palette
Iteration color palette 1
Iteration color palette 2
Iteration color palette 3 - Ellipse
Iteration color palette 4 - Circumference
Using the coordinates from the GUI, follow the instructions in the HTGAA26 Opentrons Colab to write your own Python script which draws your design using the Opentrons.
You may use AI assistance for this coding: Google Gemini is integrated into Colab (see the stylized star bottom center); it will do a good job writing functional Python, while you probably need to take charge of the art concept.
Iteration color palette 5
Coding done by Gemini
This code was written with Google Gemini. The steps were: first, loading the coordinates made in the GUI; second, giving the AI instructions describing the expected outcome; finally, iterating until the idea was achieved. The instructions given to the AI did not include any code; they were all written in plain language.
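One step in turning GUI coordinates into a robot script is deciding how many dots the pipette can draw per aspiration. The sketch below is plain Python and does not call the Opentrons API. The function name `plan_dispenses`, the 1 µL drop size, and the 20 µL pipette capacity are assumptions for illustration, not values taken from the course Colab.

```python
def plan_dispenses(points_by_color, drop_ul=1.0, max_ul=20.0):
    """Group (x, y) points into aspirate batches, one color at a time.

    drop_ul and max_ul are assumed values (1 uL per dot, 20 uL pipette).
    Returns a list of (color, volume_to_aspirate, points_in_batch).
    """
    per_load = int(max_ul // drop_ul)  # dots that fit in one aspiration
    plan = []
    for color, points in points_by_color.items():
        for start in range(0, len(points), per_load):
            batch = points[start:start + per_load]
            plan.append((color, len(batch) * drop_ul, batch))
    return plan


# Hypothetical design: 3 green dots and a row of 25 red dots.
design = {
    "green": [(0, 0), (1, 1), (2, 2)],
    "red": [(x, 0) for x in range(25)],
}
plan = plan_dispenses(design)
for color, volume, batch in plan:
    print(color, volume, len(batch))
```

Each entry in the plan would correspond to one pick-up, aspirate, series of dispenses, and tip drop in the actual robot script.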
Part A. Conceptual Questions
Answer any NINE of the following questions from Shuguang Zhang: (i.e. you can select two to skip)
How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)
Why do humans eat beef but do not become a cow, eat fish but do not become fish?
Why are there only 20 natural amino acids?
Can you make other non-natural amino acids? Design some new amino acids.
Where did amino acids come from before enzymes that make them, and before life started?
If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?
Can you discover additional helices in proteins?
Why are most molecular helices right-handed?
Why do β-sheets tend to aggregate? What is the driving force for β-sheet aggregation?
Why do many amyloid diseases form β-sheets? Can you use amyloid β-sheets as materials?
Design a β-sheet motif that forms a well-ordered structure.
Part B: Protein Analysis and Visualization
Part A: SOD1 Binder Peptide Design (From Pranam)
Superoxide dismutase 1 (SOD1) is a cytosolic antioxidant enzyme that converts superoxide radicals into hydrogen peroxide and oxygen. In its native state, it forms a stable homodimer and binds copper and zinc.
Mutations in SOD1 cause familial Amyotrophic Lateral Sclerosis (ALS). Among them, the A4V mutation (Alanine → Valine at residue 4) leads to one of the most aggressive forms of the disease. The mutation subtly destabilizes the N-terminus, perturbs folding energetics, and promotes toxic aggregation.
Assignment: DNA Assembly
Answer these questions about the protocol in this week’s lab:
What are some components in the Phusion High-Fidelity PCR Master Mix, and what is their purpose? Elements:
Phusion DNA Polymerase: a Pyrococcus-like enzyme fused with a processivity-enhancing domain that increases fidelity and speed. It has 5'→3' polymerase activity and 3'→5' exonuclease activity, and it generates blunt-ended products. The master mix contains Phusion DNA Polymerase, deoxynucleotides, and an optimized reaction buffer that includes MgCl2. Requirements to be added: template, primers, and water. Purpose:
Assignment Part 1: Intracellular Artificial Neural Networks (IANNs)
What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?
Intracellular artificial neural networks (IANNs) provide an analog approach to computation within biological systems, enabling cells to process continuous inputs and generate graded, non-binary outputs. IANNs can integrate multiple signals and capture complex, non-linear relationships and patterns through distributed gene regulation.
Part A: General and Lecturer-Specific Questions
General homework questions
Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.
Cell-free protein synthesis is a protein expression approach that enables the production of a target protein without the use of living cells. In vivo methods are also a protein expression approach, but they use living cells such as bacteria (E. coli is the most common), yeast, insect cells, and mammalian cells. Here is a chart that summarizes the differences between both methods:
Waters Part I — Molecular Weight
We will analyze an eGFP standard on a Waters Xevo G3 QTof MS system to determine the molecular weight of intact eGFP and observe its charge state distribution in the native and denatured (unfolded) states. The conditions for LC-MS analysis of intact protein cause it to unfold and be detected in its denatured form (due to the solvents and pH used for analysis).
Based on the predicted amino acid sequence of eGFP (see below) and any known modifications, what is the calculated molecular weight? You can use an online calculator like the one at https://web.expasy.org/compute_pi/
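As a cross-check on an online calculator, the average molecular weight of an unmodified protein can be estimated by summing average residue masses and adding one water for the free termini. The sketch below assumes standard average residue masses and does not account for any modification; the function name `molecular_weight` and the short test peptides are illustrative only, and the eGFP sequence itself is not reproduced here.

```python
# Average residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the N- and C-terminal H and OH


def molecular_weight(sequence):
    """Average mass (Da) of an unmodified, linear peptide chain."""
    seq = "".join(sequence.split()).upper()  # drop whitespace and newlines
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


# Sanity checks on small peptides with well-known masses.
print(round(molecular_weight("G"), 3))   # free glycine
print(round(molecular_weight("GG"), 3))  # glycylglycine
```

Pasting the full protein sequence into `molecular_weight` gives the unmodified average mass; any known modifications would then be added or subtracted by hand.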
Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork
Contribute at least one pixel to this global artwork experiment before the editing ends on Sunday 4/19 at 11:59 PM EST.
A personalized URL was sent to the email address associated with your Discourse account, and you can discuss the artwork on the Discourse. If you did not have a chance to contribute, it’s okay, just make sure you become a TA this fall! 😉 2. Make a note on your HTGAA webpages including:
Subsections of Homework
Week 1 HW: Principles and Practices
First, describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project and/or something for which you are already doing in your research, or something you are just curious about.
Thyroid implant for canines
In the canine world, there is a very specific problem that owners face: hypothyroidism. It is a common condition that dogs develop around the ages of 2 or 3. Still, most of the time it is mistaken for other health conditions, such as intestinal problems, allergies, or dermatological conditions. Because the condition affects many parts of a dog's body, veterinarians sometimes reach a wrong diagnosis.
There are four types of hypothyroidism:
Primary
Secondary
Tertiary
Congenital
We are going to center our attention on primary hypothyroidism, caused by an immunological condition or an idiopathic problem.
The origin of primary hypothyroidism lies in the communication between the brain and the thyroid gland: a malfunction prevents the gland from producing more T4. The communication between the brain, the hypothalamus, and the pituitary gland (hypophysis) works at its normal pace, and the gland receives the message, but it cannot keep producing T4 because it has been attacked by immune cells or has started to turn into fat, with no point of return.
The interesting thing about this system is that it is only partially broken. The T4 and T3 hormones do not depend on the gland to be taken up by the body; they depend on the genetic switches inside the organs they enter: liver, kidneys, brain, and muscles. Another interesting fact is that T4 is a largely inactive hormone that travels around the body through the circulatory system.
Today there is a pharmaceutical solution called levothyroxine. This pill is a dose of T4 that enters the body through the stomach and, once absorbed, is delivered to the different organs to be converted into T3. Although it is a simple solution, the dose is not always accurate, and it has to be monitored every 6 months and adjusted according to repeated blood tests until it is right.
Now, what if we could implant a thyroid substitute that reads TSH directly from the blood and produces T4 as needed, with a more precise response? To achieve this objective, we must address the following challenges:
The body needs to accept the cells as its own. The most reliable way to do this is to extract a sample of endodermal or glandular cells from the pet's body and reprogram them into thyroid cells. This should be done in a lab, and the cells must reach a stable state so that they operate accurately and do not stop working once they are inside the body. The importance of using cells from the same body is that they won't be read as a major threat, as other materials could be.
Since the body has a very active immune system, the reprogrammed cells must be placed in a container or membrane that keeps immune cells out and prevents direct infiltration. This membrane should be soft, with the right pore size to let in oxygen, TSH, and nutrients and to let T4 pass out into the bloodstream. It should be made of biomimetic biogels, which are highly hydrated and can be read as neutral by immune cells.
To grow an implant, the cells must be guided so that they reproduce the architecture of the thyroid gland. This could be done with a biopolymer that directs how the cells grow together.
Finally, this implant should be placed in the subcutaneous region so that it is surrounded by blood vessels; TSH can then reach the implant and trigger the production of T4.
The implant must be tested and programmed to react over a period of weeks, because T4 levels are not regulated at the exact moment TSH goes up or down, but rather when TSH stays at a given level for several weeks. These slow reactions are similar to how the original gland reacts and do not raise a red flag for immune cells. The implant would react accurately and at its natural pace, without delivering inaccurate doses of T4.
Another possible addition would be to integrate a nanochip that could scan and give feedback on the situation inside the body. For now, however, this is not a stable solution, because the immune system would read it as a major danger that must be eliminated.
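The idea that the implant should respond only to TSH levels sustained over weeks, not to momentary swings, can be expressed as a simple rule. The sketch below is a toy model only: the thresholds, the three-week window, the step size, and the arbitrary units are all hypothetical and are not physiological values.

```python
def sustained(readings, weeks, condition):
    """True if the last `weeks` weekly readings all satisfy `condition`."""
    recent = readings[-weeks:]
    return len(recent) == weeks and all(condition(r) for r in recent)


def adjust_output(current, tsh_weekly, high=1.0, low=0.2, weeks=3, step=0.1):
    """Return the new T4 output level (arbitrary units, hypothetical rule).

    Output rises only if TSH stayed above `high` for `weeks` weeks in a row,
    falls only if it stayed below `low`, and otherwise does not change.
    """
    if sustained(tsh_weekly, weeks, lambda r: r > high):
        return current + step
    if sustained(tsh_weekly, weeks, lambda r: r < low):
        return max(0.0, current - step)
    return current


# A single high week is ignored; three high weeks in a row raise the output.
print(adjust_output(1.0, [0.5, 0.5, 1.4]))
print(adjust_output(1.0, [1.3, 1.4, 1.5]))
```

The point of the window is that a brief spike in TSH leaves the output unchanged, which mirrors the slow response described above.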
ai-generated image
Next, describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future, like ensuring non-malfeasance (preventing harm). Break big goals down into two or more specific sub-goals. Below is one example framework (developed in the context of synthetic genomics) you can choose to use or adapt, or you can develop your own. The example was developed to consider policy goals of ensuring safety and security, alongside other goals, like promoting constructive uses, but you could propose other goals, for example, those relating to equity or autonomy.
Prevention of physical and psychological harm during experimental stages, non-malfeasance.
Strengthen diagnostic protocols for canine hypothyroidism before giving the individuals any medical procedure or medicine.
Limit the access and the quantity of samples extracted to each individual.
Establish rules and strong selection criteria for canine individuals that will be participating in the experimental stage.
Fair and free access to information
Establish clear, accessible, and transparent information for caregivers regarding the potential health risks that the research and experimental stages pose.
Ensure that the results and information obtained in the experimental stages are transferred to students and professionals in the relevant specializations, with emphasis on ethical and responsible use.
In the event of a successful result in canines and the intention to use this research on humans, any prior required animal experimentation must be reviewed and regulated by an official regulatory authority, ensuring fairness, accountability, and ethical continuity across species.
Next, describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”). Try to outline a mix of actions (e.g. a new requirement/rule, incentive, or technical strategy) pursued by different “actors” (e.g. academic researchers, companies, federal regulators, law enforcement, etc). Draw upon your existing knowledge and a little additional digging, and feel free to use analogies to other domains (e.g. 3D printing, drones, financial systems, etc.).
Purpose: What is done now and what changes are you proposing?
Design: What is needed to make it “work”? (including the actor(s) involved - who must opt-in, fund, approve, or implement, etc)
Assumptions: What could you have wrong (incorrect assumptions, uncertainties)?
Risks of Failure & “Success”: How might this fail, including any unintended consequences of the “success” of your proposed actions?
A. Review of individual cases based on the evaluation of diagnostics made by an external entity and professional
To obtain a verified hypothyroid diagnosis, an external entity must review and confirm the evaluations. This review must rely on physical and chemical tests that provide consistent and standardized results. Its conclusion would accept or deny the participation of each subject.
For this action to be applied, institutions and external professionals must collaborate. These entities and actors must be accredited by and affiliated with the government, in this case the government of Ecuador. External evaluation must be a mandatory step before initiating any procedure, because it ensures that participants are correctly diagnosed and that budgets, time, and assets are well used.
For these actions to happen, it is assumed that external professionals will be qualified and available to be part of the program, that diagnostic criteria will be consistent in every case, and that these reviews will reduce misdiagnosis. The risks include clinical disagreements between internal and external actors and a high risk of corruption among professionals and institutions, which could dramatically lower the chances of upholding ethical standards and good treatment.
B. Unannounced audit inspections of research developments and animals' health and care
Research progress should be audited through unannounced visits so that all the information about the project stays clear. This action would be led by an inspector who is external to both the research institutions and the research team. It would ensure the supervision of ethical practices in the lab or clinic, as well as of the real progress of the experimental project. The external inspector would be accredited by external academic institutions that are renowned for their knowledge in the field and their ethical practices. It is important that this actor has the authority to propose adjustments when needed, to stop the research, and to allow it to continue once the conditions are met.
This would be a successful action assuming that academic institutions have this type of professional under their wing and have the capacity to support these control mechanisms. The risks of this policy are that frequent inspections could create a hostile environment, affecting the performance of researchers and, therefore, the success of the research. Administrative burdens could slow the process, and rigidity could stifle the creative and precise environment needed in this type of research.
C. Mandatory and transparent educational programs
Mandatory education for all actors involved, together with transparent communication pathways, ensures that experimental procedures, ethical considerations, and long-term care requirements are responsibly applied beyond the research setting. This action would be implemented through educational programs given by the academic institutions involved in this research. They would have to be open to the specialized students, professionals, and caregivers who will be part of the program. Accessible and clear training sessions would also be provided, focusing on care requirements and ethical responsibilities.
Assuming that the information is passed on transparently and understood by all attendees, it could lead to more ethical decision-making and supervision among all the actors involved. On the other hand, these actions could fail if the educational content is not accessible to all participants because it is overly technical or poorly communicated. It is also important to keep in mind that caregivers will go through an emotional and subjective experience during the experimentation period; this can lead to inadvertent or sudden dropouts, putting at risk both the research process and the animal’s life.
Next, score (from 1-3 with, 1 as the best, or n/a) each of your governance actions against your rubric of policy goals. The following is one framework but feel free to make your own:
Does the option:                                  Option 1   Option 2   Option 3
Enhance Biosecurity
• By preventing incidents                         1          1          1
• By helping respond                              3          2          2
Foster Lab Safety
• By preventing incidents                         2          1          1
• By helping respond                              3          2          2
Protect the environment
• By preventing incidents                         3          2          3
• By helping respond                              3          2          3
Other considerations
• Animal welfare and intervention                 1          1          1
• Diagnostic accuracy and animal inclusion        1          2          1
• Ethical inclusion and transparency              2          1          2
• Equity in access to knowledge and care          2          3          1
• Minimizing costs and burdens to stakeholders    1          2          3
• Feasibility?                                    1          2          2
• Not impede research                             2          3          2
• Promote constructive applications               3          1          1
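One simple way to read the matrix is to add up each option's scores, remembering that 1 is the best score, so a lower total is better. The sketch below does only that. An unweighted total treats every criterion as equally important, which is an assumption of this sketch and not part of the rubric; weighting the criteria that matter most for the project would change the picture.

```python
# Scores copied from the matrix above, as (Option 1, Option 2, Option 3).
SCORES = {
    "Biosecurity: preventing incidents": (1, 1, 1),
    "Biosecurity: helping respond": (3, 2, 2),
    "Lab safety: preventing incidents": (2, 1, 1),
    "Lab safety: helping respond": (3, 2, 2),
    "Environment: preventing incidents": (3, 2, 3),
    "Environment: helping respond": (3, 2, 3),
    "Animal welfare and intervention": (1, 1, 1),
    "Diagnostic accuracy and animal inclusion": (1, 2, 1),
    "Ethical inclusion and transparency": (2, 1, 2),
    "Equity in access to knowledge and care": (2, 3, 1),
    "Minimizing costs and burdens": (1, 2, 3),
    "Feasibility": (1, 2, 2),
    "Not impede research": (2, 3, 2),
    "Promote constructive applications": (3, 1, 1),
}

# Sum each column; zip(*...) turns the rows into three per-option columns.
totals = [sum(column) for column in zip(*SCORES.values())]
for number, total in enumerate(totals, start=1):
    print(f"Option {number}: {total}")
```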
Last, drawing upon this scoring, describe which governance option, or combination of options, you would prioritize, and why. Outline any trade-offs you considered as well as assumptions and uncertainties. For this, you can choose one or more relevant audiences for your recommendation, which could range from the very local (e.g. to MIT leadership or Cambridge Mayoral Office) to the national (e.g. to President Biden or the head of a Federal Agency) to the international (e.g. to the United Nations Office of the Secretary-General, or the leadership of a multinational firm or industry consortia). These could also be one of the “actor” groups in your matrix.
The fundamental governance option to be prioritized is “Review of individual cases based on the evaluation of diagnostics made by an external entity and professional”. As it acts on the foundations of the experimental phases, it can prevent misdiagnosis or weakly supported clinical assumptions before the animal enters the trial. This option reduces the risk of unethical procedures, minimizes harm, protects animal welfare, and strengthens the scientific validity of the research foundation. The trade-off to be considered here is that this extra evaluation may slow down the start of the research and limit the number of participants, but in the long run it is preventive in many respects.
Option 2 and Option 3 are complementary governance options. These actions ensure accountability among all the actors involved by making knowledge a regulatory and independent system. They prevent the misuse of information and empower informed decision-making throughout the research cycle. External regulatory actors are also essential because they can verify whether clinical, laboratory, and welfare standards are applied before, during, and after experimentation. The trade-off here is that research institutions might perceive these actors as intruders and as a source of uncertainty for the development of the research. On the other hand, the combination of early prevention, continuous oversight, and broad education creates distributed responsibility, which reduces the chance that ethical failures go unnoticed or unaddressed.
Reflecting on what you learned and did in class this week, outline any ethical concerns that arose, especially any that were new to you. Then propose any governance actions you think might be appropriate to address those issues. This should be included on your class page for this week.
Since this is experimental research that needs to be done in live animals, specifically canines, it is important to keep the limit between ethical and unethical procedures very clear. The fact that we do not communicate in the same way might be an important barrier between researchers and subjects. The animals might end up being seen as numbers rather than as brave and important living beings who are giving their lives to the experiment.
A question that arose for me is whether the major medical advancement is worth the many health discomforts that the research subjects will undergo.
Another question to keep in mind is whether we could develop a nanoscanner able to read the gland’s condition and avoid any invasive procedure, preventing misdiagnosis and giving a much more accurate reading of the situation inside the animal's body.
Some bibliography found about thyroid organoids for humans:
Kariyawasam, D., Stoupa, A., Nguyen Quoc, A., Pimentel Dantas, I., Polak, M., & Carré, A. (2025). From stem cells to organoids in thyroid: Useful tools or a step for cell therapy? La Presse Médicale, 54(4), 104301. https://doi.org/10.1016/j.lpm.2025.104301
Zhang, Y., Fu, M., Wang, H., & Sun, H. (2023). Advances in the Construction and Application of Thyroid Organoids. Physiological Research, 72(5), 557–564. https://doi.org/10.33549/physiolres.935102
Week 2 HW: DNA Read, Write & Edit
Part 1: Benchling & In-silico Gel Art
See this week’s lab protocol “Gel Art: Restriction Digests and Gel Electrophoresis” for details. Overview:
Make a free account at benchling.com
Import the Lambda DNA.
Simulate Restriction Enzyme Digestion with the following Enzymes:
EcoRI
HindIII
BamHI
KpnI
EcoRV
SacI
SalI
Restriction Enzyme Digestion made with Benchling
Create a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks.
You might find Ronan’s website a helpful tool for quickly iterating on designs!
E=m*a2
EcoRV vs. EcoRI
Single Enzymes
Pyramid Enzymes
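Benchling runs the digest in its own interface, but the underlying logic is small enough to sketch: find every recognition site, cut at the enzyme's offset, and list the fragment lengths that would appear as bands. The recognition sites below are the standard ones for these seven enzymes. The short `toy` sequence is a made-up stand-in for Lambda DNA, and the function names are illustrative.

```python
import re

# Recognition site (5'->3') and cut position within the site on the top strand.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "HindIII": ("AAGCTT", 1),
    "BamHI": ("GGATCC", 1),
    "KpnI": ("GGTACC", 5),
    "EcoRV": ("GATATC", 3),
    "SacI": ("GAGCTC", 5),
    "SalI": ("GTCGAC", 1),
}


def cut_positions(seq, enzyme):
    """Cut coordinates for one enzyme; all seven sites are palindromic."""
    site, offset = ENZYMES[enzyme]
    # A lookahead pattern also finds overlapping sites.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq.upper())]


def fragment_lengths(seq, enzymes):
    """Fragment sizes (largest first) for a linear molecule cut by `enzymes`."""
    cuts = sorted({p for name in enzymes for p in cut_positions(seq, name)})
    edges = [0] + cuts + [len(seq)]
    return sorted((b - a for a, b in zip(edges, edges[1:])), reverse=True)


# Made-up 38 bp sequence with one EcoRI, one BamHI, and one EcoRV site.
toy = "AAAAGAATTCTTTTTTGGATCCAAAAAAAAGATATCGG"
print(fragment_lengths(toy, ["EcoRI"]))
print(fragment_lengths(toy, ["EcoRI", "BamHI", "EcoRV"]))
```

Replacing `toy` with the imported Lambda sequence would give the band sizes to compare against the simulated gel lanes.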
Part 3: DNA Design Challenge
3.1. Choose your protein.
In recitation, we discussed that you will pick a protein for your homework that you find interesting. Which protein have you chosen and why? Using one of the tools described in recitation (NCBI, UniProt, google), obtain the protein sequence for the protein you chose.
[Example from our group homework, you may notice the particular format — The example below came from UniProt]
The world of proteins is so vast that choosing a single protein has been a profound task.
To follow the same path as the week 1 HW, let's start with thyroglobulin, a very complex and specialized protein that is key to the generation of the T3 and T4 hormones; in other words, it is a hormone precursor protein. Because of its complexity and specificity, it is a relatively modern protein. Some interesting facts about thyroglobulin: it is very large compared to other proteins, it only functions in the thyroid gland, it is prone to being attacked by immune cells when something is not working well, and it does not tolerate errors in its processing. If we compare it to actin, we can see that actin is a simpler protein that performs a general function and has been present in all eukaryotic forms since early life on Earth. Actin is the protein in charge of the formation of the cytoskeleton, motility, and shape of cells, among many other functions. The interesting fact about actin is that it can tolerate errors, in contrast to thyroglobulin, which is very precise.
In the exercise below I will work with thyroglobulin for Canis lupus familiaris.
3.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.
The Central Dogma discussed in class and recitation describes the process in which DNA sequence becomes transcribed and translated into protein. The Central Dogma gives us the framework to work backwards from a given protein sequence and infer the DNA sequence that the protein is derived from. Using one of the tools discussed in class, NCBI or online tools (google “reverse translation tools”), determine the nucleotide sequence that corresponds to the protein sequence you chose above.
[Example: Get to the original sequence of phage MS2 L-protein from its genome phage MS2 genome - Nucleotide - NCBI]
Lysis protein DNA sequence
atggaaacccgattccctcagcaatcgcagcaaactccggcatctactaatagacgccggccattcaaacatgaggattacccatgtcgaagacaacaaagaagttcaactctttatgtattgatcttcctcgcgatctttctctcgaaatttaccaatcaattgcttctgtcgctactggaagcggtgatccgcacagtgacgactttacagcaattgcttacttaa
Once a nucleotide sequence of your protein is determined, you need to codon optimize your sequence. You may, once again, utilize google for a “codon optimization tool”. In your own words, describe why you need to optimize codon usage. Which organism have you chosen to optimize the codon sequence for and why?
[Example from Codon Optimization Tool | Twist Bioscience while avoiding Type IIs enzyme recognition sites BsaI, BsmBI, and BbsI]
Lysis protein DNA sequence with Codon-Optimization
ATGGAAACCCGCTTTCCGCAGCAGAGCCAGCAGACCCCGGCGAGCACCAACCGCCGCCGCCCGTTCAAACATGAAGATTATCCGTGCCGTCGTCAGCAGCGCAGCAGCACCCTGTATGTGCTGATTTTTCTGGCGATTTTTCTGAGCAAATTCACCAACCAGCTGCTGCTGAGCCTGCTGGAAGCGGTGATTCGCACAGTGACGACCCTGCAGCAGCTGCTGACCTAA
Codon optimization is needed so that a gene from the original host is written with the codons preferred by the organism that will later produce the protein, without modifying the amino acid sequence. In research, it is necessary to have a stock of the protein to analyze and test, and it is not sustainable to always obtain it from the original host for many reasons: budget, quantity, ethics, etc.
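The claim that codon optimization leaves the amino acid sequence unchanged can be checked by translating both versions and comparing the results. The sketch below uses the standard genetic code and only the first 30 nucleotides of the two lysis protein sequences shown above; the function name `translate` is illustrative.

```python
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence, codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 30 nt of the original and the codon-optimized lysis protein sequences.
original = "atggaaacccgattccctcagcaatcgcag"
optimized = "ATGGAAACCCGCTTTCCGCAGCAGAGCCAG"

print(translate(original))
print(translate(optimized))
print(translate(original) == translate(optimized))
```

The same comparison can be run on the full sequences to confirm that an optimization tool changed only the codons and not the protein.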
In the case of canine thyroglobulin, the experiment will be based on the question: how can we produce canine thyroglobulin that could be used as one component of a thyroid gland implant? The host that will produce the protein will be CHO (Chinese hamster ovary) cells, a mammalian cell line that has the capacity to perform the complex processing needed for proteins as specialized as thyroglobulin.
THYROGLOBULIN - DNA sequence with Codon-Optimization
ATGTTCTCACCACCTGTGTCTTCTGGCAAGAATGGCCCCACCTCCCTGGCTTCTGGCCACTTCACCGGAAGCAACGTGGAGGACAGGTCCTCTTCCGGCTCCTGGGGCAATGGCGGCCACCCAAGTCCATCTCGAAACTACGGCGACGGGACCCCTTACGATCACATGACCTCCAGAGACCTGGGCTCTCATGACAATCTGTCTCCCCCATTTGTGAACTCCCGGATTCAGTCTAAGACTGAGCGGGGCTCATACAGCTCTTACGGACGCGAGAGCAACCTGCAGGGTTGTCACCAGTCCCTGCTGGGCGGAGACATGGACATGGGCACCCCCGGGACCCTCTCTCCTACTAAGCCTGGCTCTCAGTATTACCAGTACTCCTCCAATAACCCTCGAAGGCGGCCCCTGCACAGCAGTGCCATGGAGGTCCAGACAAAGAAAGTCAGGAAGGTGCCACCAGGCCTGCCCAGCTCCGTCTATGCCCCAAGCGCCTCCACCGCCGATTACAATCGAGATAGCCCCGGTTACCCCTCCTCTAAGCCAGCAGCCTCTACTTTCCCTAGCTCCTTCTTTATGCAGGACGGCCATCACTCAAGTGATCCTTGGTCCAGCAGCTCTGGCATGAACCAGCCAGGGTACGGTGGTATGCTGGGTTCTTCCAGTCACATCCCTCAGTCTTCTTCCTACTGTAGTCTGCATCCACATGAGCGCCTGTCATACCCCAGCCACTCCTCTGCCGACATCAATAGCTCCCTGCCACCCATGTCAACCTTCCATAGGAGCGGCACTAACCATTATTCCACATCCAGCTGCACTCCTCCCGCTAACGGTACTGACTCTATCATGGCTAACAGAGGCTCCGGCGCCGCTGGAAGCAGTCAGACCGGAGATGCTCTCGGCAAGGCCCTGGCTTCTATCTATTCTCCCGACCATACCAACAATTCTTTCAGTAGCAACCCTTCTACTCCCGTGGGCTCCCCTCCTTCCCTGTCCGCCGGAACCGCTGTGTGGTCTAGGAATGGCGGCCAGGCCAGCTCCAGCCCTAATTATGAGGGCCCCCTGCACAGCCTGCAGTCTCGTATTGAGGATAGGCTGGAGCGACTGGACGACGCTATTCACGTGCTGCGTAACCATGCTGTGGGCCCAAGCACCGCTATGCCCGGGGGACACGGAGACATGCACGGAATCATCGGCCCTTCTCACAACGGGGCTATGGGGGGTCTGGGCAGCGGCTACGGAACAGGCCTGCTGTCCGCCAACAGGCACTCTCTGATGGTGGGTGCCCACCGGGAAGACGGAGTGGCCCTGAGAGGGTCACATAGCCTGGTGCCTAACCAGGTGCCTGTGCCTCAGCTGCCCGTGCAGAGTGCTACTAGCCCCGATCTGAACCCTCCACAGGACCCTTACAGAGGCATGCCACCCGGTCTGCAGGGACAGTCTGTGTCCTCTGGCAGTAGCGAGATCAAGTCAGATGACGAGGGAGACGAGAACCTGCAGGATACAAAGAGCTCCGAGGATAAGAAATTGGACGACGACAAGAAGGACATCAAGTCCATCACCAGCAACAACGACGACGAGGACCTGACTCCTGAGCAGAAGGCCGAACGGGAAAAGGAAAGGCGGATGGCTAACAATGCAAGAGAACGCCTGAGGGTCAGGGATATCAATGAGGCTTTCAAGGAGCTGGGCAGGATGGTGCAGCTGCATCTTAAGTCTGACAAGCCACAGACAAAGCTGCTGATCCTGCACCAGGCTGTGGCTGTGATTCTGTCCCTGGAGCAGCAGGTGAGAGAGAGGAACCTGAACCCTAAGGCCGCTTGCCTGAAAAGACGGGAGGAGGAAAAAGTGAGCTCTGAGCCCCCACCCCTCTCCCTGGCCGGACCACACCCCGGCATGGGCGACGCTTCTAACCACATGGGCCAGATG
3.4. You have a sequence! Now what?
What technologies could be used to produce this protein from your DNA? Describe in your words the DNA sequence can be transcribed and translated into your protein. You may describe either cell-dependent or cell-free methods, or both.
To produce this protein from my DNA, I would use a codon-optimized sequence so that the gene can be expressed efficiently by the host cells (CHO cells, as chosen above). Although canine and human thyroglobulin, and the DNA that encodes them, are not very different, this step is necessary so that the codons match those the host translates most efficiently while the amino acid order stays the same. Inside the cell, the DNA is transcribed into mRNA, and ribosomes then translate the mRNA codon by codon into the amino acid chain. The technologies to do this would be:
cell-dependent methods: for producing the protein, specifically CHO cells, which come from Chinese hamster ovaries. As this protein comes from a mammal, the cells that produce it should also be mammalian; it is not efficient to use bacterial cells such as E. coli, for example, because the protein needs to fold in a specific way, and the differences between bacteria, mammals, and plants make this process very different.
bioreactor: for scaling up production, avoiding contamination of the cell culture, and giving the cells a controlled environment in which to grow and fold the protein.
Part 4: Prepare a Twist DNA Synthesis Order
This is a practice exercise, not necessarily your real Twist order!
4.1. Create a Twist account and a Benchling account
Although the Benchling account was successfully created and used, I was not able to get access to a Twist account.
sfGFP expressed in E. coli
4.2. Build Your DNA Insert Sequence
For example, let’s make a sequence that will make E. coli glow fluorescent green under UV light by constitutively (always) expressing sfGFP (a green fluorescent protein):
In Benchling, select New DNA/RNA sequence
Give your insert sequence a name and select DNA with a Linear topology (this is a linear sequence that will be inserted into a circular backbone vector of our choosing).
Go through each piece of the given DNA sequences highlighted below (Promoter, RBS, Start Codon, Coding Sequence, His Tag, Stop Codon, Terminator) and paste the sequences into the Benchling file one after the other (replacing the coding sequence with your codon optimized DNA sequence of interest!). Each time you add a new piece of the sequence, make sure to annotate by right clicking over the sequence and creating an annotation that describes what each piece (e.g., Promoter, RBS, etc.) is (see image below).
For this part, I started by searching for the protein sequence in FASTA format, then I reverse-translated it into a DNA sequence, and finally I codon-optimized the DNA sequence for E. coli. The results are below:
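The reverse-translation and codon-optimization steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure that Benchling or Twist actually uses: the `PREFERRED_CODON` table picks one frequently used E. coli codon per amino acid as an assumption, and the function names `parse_fasta` and `reverse_translate` are hypothetical. Real optimizers also balance codon frequencies and avoid repeats and unwanted restriction sites.

```python
# Assumed table: one frequently used E. coli codon per amino acid.
# Illustrative only; a real optimizer uses full codon-usage frequencies.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def parse_fasta(text):
    """Return the sequence from a single-record FASTA string (header dropped)."""
    lines = [line.strip() for line in text.splitlines()]
    return "".join(line for line in lines if line and not line.startswith(">"))


def reverse_translate(protein, add_stop=False):
    """Map each amino acid to its assumed preferred codon."""
    codons = [PREFERRED_CODON[aa] for aa in protein.upper()]
    if add_stop:
        codons.append("TAA")  # one of the three stop codons
    return "".join(codons)


print(reverse_translate("MKL"))  # → ATGAAACTG
```

The amino acid sequence is unchanged by this step; only the choice among synonymous codons differs.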
5.1 DNA Read
(i) What DNA would you want to sequence (e.g., read) and why? This could be DNA related to human health (e.g. genes related to disease research), environmental monitoring (e.g., sewage waste water, biodiversity analysis), and beyond (e.g. DNA data storage, biobank).
I would like to sequence the DNA of an Ecuadorian tree called Cholán, or Tecoma stans.
(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?
Also answer the following questions:
Is your method first-, second- or third-generation or other? How so?
What is your input? How do you prepare your input (e.g. fragmentation, adapter ligation, PCR)? List the essential steps.
What are the essential steps of your chosen sequencing technology, how does it decode the bases of your DNA sample (base calling)?
What is the output of your chosen sequencing technology?
5.2 DNA Write
(i) What DNA would you want to synthesize (e.g., write) and why? These could be individual genes, clusters of genes or genetic circuits, whole genomes, and beyond. As described in class thus far, applications could range from therapeutics and drug discovery (e.g., mRNA vaccines and therapies) to novel biomaterials (e.g. structural proteins), to sensors (e.g., genetic circuits for sensing and responding to inflammation, environmental stimuli, etc.), to art (DNA origamis). If possible, include the specific genetic sequence(s) of what you would like to synthesize! You will have the opportunity to actually have Twist synthesize these DNA constructs! :)
(ii) What technology or technologies would you use to perform this DNA synthesis and why?
Also answer the following questions:
What are the essential steps of your chosen sequencing methods?
What are the limitations of your sequencing method (if any) in terms of speed, accuracy, scalability?
5.3 DNA Edit
(i) What DNA would you want to edit and why? In class, George shared a variety of ways to edit the genes and genomes of humans and other organisms. Such DNA editing technologies have profound implications for human health, development, and even human longevity and human augmentation. DNA editing is also already commonly leveraged for flora and fauna, for example in nature conservation efforts, (animal/plant restoration, de-extinction), or in agriculture (e.g. plant breeding, nitrogen fixation). What kinds of edits might you want to make to DNA (e.g., human genomes and beyond) and why?
(ii) What technology or technologies would you use to perform these DNA edits and why?
Also answer the following questions:
How does your technology of choice edit DNA? What are the essential steps?
What preparation do you need to do (e.g. design steps) and what is the input (e.g. DNA template, enzymes, plasmids, primers, guides, cells) for the editing?
What are the limitations of your editing methods (if any) in terms of efficiency or precision?
Week 3 HW: Lab Automation
Assignment: Python Script for Opentrons Artwork — DUE BY YOUR LAB TIME!
Your task this week is to Create a Python file to run on an Opentrons liquid handling robot.
Review this week’s recitation and this week’s lab for details on the Opentrons and programming it.
Generate an artistic design using the GUI at opentrons-art.rcdonovan.com.
Using the coordinates from the GUI, follow the instructions in the HTGAA26 Opentrons Colab to write your own Python script which draws your design using the Opentrons.
You may use AI assistance for this coding — Google Gemini is integrated into Colab (see the stylized star bottom center); it will do a good job writing functional Python, while you probably need to take charge of the art concept.
Iteration color palette 5
Coding done by Gemini
This code was written with Google Gemini. The steps were: first, loading the coordinates made in the GUI; second, giving the AI instructions describing the expected outcome; finally, iterating until the idea was achieved. The instructions given to the AI did not include any code; they were all written in plain language.
If you’re a proficient programmer and you’d rather code something mathematical or algorithmic instead of using your GUI coordinates, you may do that instead.
If the Python component is proving too problematic even with AI and human assistance, download the full Python script from the GUI website and submit that:
If you use AI to help complete this homework or lab, document how you used AI and which models made contributions.
Gemini instructions - Colab
Sign up for a robot time slot if you are at MIT/Harvard/Wellesley or at a Node offering Opentrons automation. The Python script you created will be run on the robot to produce your work of art!
At MIT/Harvard? Lab times are on Thursday Feb.19 between 10AM and 6PM.
At other Nodes? Please coordinate with your Node.
Submit your Python file via this form.
STAR´S BIRTH CODES
ITERATION COLOR PALETTE - 5
SUCCESSFUL ATTEMPT
As you can see, the code has not been uploaded, because I could not figure out how to do it. After asking some people for help, one of my classmates, María José Rivas, gave me this link: https://github.com/Mozta/opentrons-bioart-sim/tree/main?tab=readme-ov-file#from-source-for-development. She used it to load her coordinates and well colors from opentrons-art into the Colab doc. The catch is that the opentrons-art well colors differ from the Colab doc well colors. This tool runs the protocol fine.
For this process, I first downloaded Python to see if I could run it there. That was not successful, but I came to understand how the program works (at a very basic level). Then I went to the Colab doc and tried to import the files, again without success, so I asked ChatGPT for help. We went through the whole process together, and step by step it helped me import the repository from the link as well as my .py file.
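One way to see the mismatch between the two well-color maps is to check which colors a design needs against the wells that a map actually defines. The sketch below is only illustrative: `missing_colors` is a hypothetical helper, the first map is the three-color one from the Colab template, and the design colors are a few of the names that appear in the opentrons-art export.

```python
# Well map from the Colab template (three generic colors).
template_well_colors = {"A1": "Red", "B1": "Green", "C1": "Orange"}

# A few entries of the well map exported by opentrons-art (protein names).
export_well_colors = {"A1": "sfGFP", "A6": "Azurite", "B1": "Electra2"}

# Some of the color names used by the design's point lists.
design_colors = ["electra2", "azurite", "sfgfp"]


def missing_colors(colors, well_colors):
    """Return the design colors that no well provides (case-insensitive)."""
    available = {name.lower() for name in well_colors.values()}
    return [c for c in colors if c.lower() not in available]


print(missing_colors(design_colors, template_well_colors))  # all three are missing
print(missing_colors(design_colors, export_well_colors))    # nothing is missing
```

With the template map, a lookup for any of these colors would fail, which is consistent with the design only running once the exported well map is used.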
The final results are these:
from opentrons import types
metadata = {    # see https://docs.opentrons.com/v2/tutorial.html#tutorial-metadata
    'author': '',
    'protocolName': '',
    'description': '',
    'source': 'HTGAA 2026 Opentrons Lab',
    'apiLevel': '2.20'
}
##############################################################################
###   Robot deck setup constants - don't change these
##############################################################################
TIP_RACK_DECK_SLOT = 9
COLORS_DECK_SLOT = 6
AGAR_DECK_SLOT = 5
PIPETTE_STARTING_TIP_WELL = 'A1'
well_colors = {
    'A1' : 'Red',
    'B1' : 'Green',
    'C1' : 'Orange'
}
def run(protocol):
    ##############################################################################
    ###   Load labware, modules and pipettes
    ##############################################################################
    # Tips
    tips_20ul = protocol.load_labware('opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Opentrons 20uL Tips')
    # Pipettes
    pipette_20ul = protocol.load_instrument("p20_single_gen2", "right", [tips_20ul])
    # Modules
    temperature_module = protocol.load_module('temperature module gen2', COLORS_DECK_SLOT)
    # Temperature Module Plate
    temperature_plate = temperature_module.load_labware('opentrons_96_aluminumblock_generic_pcr_strip_200ul',
                                                        'Cold Plate')
    # Choose where to take the colors from
    color_plate = temperature_plate
    # Agar Plate
    agar_plate = protocol.load_labware('htgaa_agar_plate', AGAR_DECK_SLOT, 'Agar Plate')  ## TA MUST CALIBRATE EACH PLATE!
    # Get the top-center of the plate, make sure the plate was calibrated before running this
    center_location = agar_plate['A1'].top()
    pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)
    ##############################################################################
    ###   Patterning
    ##############################################################################
    ###
    ### Helper functions for this lab
    ###
    # pass this e.g. 'Red' and get back a Location which can be passed to aspirate()
    def location_of_color(color_string):
        for well,color in well_colors.items():
            if color.lower() == color_string.lower():
                return color_plate[well]
        raise ValueError(f"No well found with color {color_string}")
    # For this lab, instead of calling pipette.dispense(1, loc) use this: dispense_and_detach(pipette, 1, loc)
    def dispense_and_detach(pipette, volume, location):
        """
        Move laterally 5mm above the plate (to avoid smearing a drop); then drop down to the plate,
        dispense, move back up 5mm to detach drop, and stay high to be ready for next lateral move.
        5mm because a 4uL drop is 2mm diameter; and a 2deg tilt in the agar pour is >3mm difference across a plate.
        """
        assert(isinstance(volume, (int, float)))
        above_location = location.move(types.Point(z=location.point.z + 5))  # 5mm above
        pipette.move_to(above_location)       # Go to 5mm above the dispensing location
        pipette.dispense(volume, location)    # Go straight downwards and dispense
        pipette.move_to(above_location)       # Go straight up to detach drop and stay high
    ###
    ### YOUR CODE HERE to create your design
    ###
!git clone https://github.com/Mozta/opentrons-bioart-sim.git
%cd opentrons-bioart-sim
!ls
%cd /content/opentrons-bioart-sim
!ls
%cd /content/opentrons-bioart-sim/opentrons-bioart-sim/src/opentrons_bioart_sim
!ls
!sed -n '1,200p' /content/opentrons-bioart-sim/opentrons-bioart-sim/src/opentrons_bioart_sim/colors.py
!sed -n '1,200p' /content/opentrons-bioart-sim/opentrons-bioart-sim/src/opentrons_bioart_sim/visualization.py
%cd /content
!ls
import sys
sys.path.append("/content/opentrons-bioart-sim/opentrons-bioart-sim/src")
import OTDesign_96_deep_well_plate
from opentrons_bioart_sim.mock import OpentronsMock
!sed -n '1,200p' OTDesign_96_deep_well_plate.py
# Don't forget to end with a drop_tip()
fatal: destination path 'opentrons-bioart-sim' already exists and is not an empty directory.
/content/opentrons-bioart-sim
CONTRIBUTING.md LICENSE opentrons-bioart-sim README.md tests
examples notebooks pyproject.toml src
/content/opentrons-bioart-sim
CONTRIBUTING.md LICENSE opentrons-bioart-sim README.md tests
examples notebooks pyproject.toml src
/content/opentrons-bioart-sim/opentrons-bioart-sim/src/opentrons_bioart_sim
cli.py __init__.py opentrons-bioart-sim visualization.py
colors.py mock.py __pycache__
"""
colors.py — Fluorescent protein color mappings for Opentrons Bio-Art visualization
===================================================================================
Maps fluorescent protein names to matplotlib-compatible colors for Petri dish rendering.
"""
# ═══════════════════════════════════════════════════════════════════════
# Petri dish constants
# ═══════════════════════════════════════════════════════════════════════
PETRI_INNER_DIAMETER: float = 84 # mm — inner diameter of "90mm" and "100mm" plates
MAX_DRAW_RADIUS: float = PETRI_INNER_DIAMETER / 2 - 2 # 2mm margin for tip size, drops, calibration
# ═══════════════════════════════════════════════════════════════════════
# Protein → visual color mapping
# ═══════════════════════════════════════════════════════════════════════
PROTEIN_VISUAL_COLORS: dict[str, str] = {
# Reds / Pinks
'mrfp1': 'red',
'mcherry': 'firebrick',
'dsred': 'darkred',
'mruby2': 'crimson',
'mscarlet_i': 'tomato',
'mkate2': 'deeppink',
'mkate2_tf': 'mediumvioletred',
'tagrfp': 'coral',
'tdtomato': 'orangered',
'eqfp578': 'salmon',
'mlychee_tf': 'hotpink',
'mwatermelon': 'lightcoral',
# Oranges / Yellows
'mko2': 'orange',
'mpapaya': 'lightsalmon',
'venus': 'yellow',
'mcitrine': 'gold',
'mvenus': 'goldenrod',
'mbanana': 'khaki',
'mstaygold2': 'gold',
'mchartreuse_tf': 'chartreuse',
# Greens
'sfgfp': 'lime',
'egfp': 'lime',
'megfp': 'limegreen',
'avgfp': 'palegreen',
'mneongreen': 'greenyellow',
'mazamigreen': 'forestgreen',
'mclover3': 'green',
'mwasabi': 'lightgreen',
'mjuniper': 'darkgreen',
'zsgreen1': 'springgreen',
'pa_gfp': 'mediumseagreen',
'mhoneydew': 'yellowgreen',
# Blues / Cyans
'azurite': 'royalblue',
'tagbfp': 'blue',
'mtagbfp2': 'mediumblue',
'ultramarine': 'navy',
'mturquoise2': 'turquoise',
'mcerulean3': 'cyan',
'mtfp1': 'darkcyan',
'mmicy': 'aquamarine',
'electra2': 'deepskyblue',
# Others
'mplum': 'purple',
}
def resolve_visual_color(protein_or_color_name: str) -> str:
    """Resolve a fluorescent protein name or color name to a matplotlib color.
    Lookup order:
        1. Check PROTEIN_VISUAL_COLORS (case-insensitive)
        2. Map 'green' → 'lime' for better visibility on dark backgrounds
        3. Pass through as-is (assumed to be a valid matplotlib color)
    Args:
        protein_or_color_name: Protein name (e.g. 'sfGFP') or color (e.g. 'red').
    Returns:
        A matplotlib-compatible color string.
    """
    key = protein_or_color_name.lower().strip()
    if key in PROTEIN_VISUAL_COLORS:
        return PROTEIN_VISUAL_COLORS[key]
    if key == 'green':
        return 'lime'
    return protein_or_color_name
"""
visualization.py — Petri dish visualization for Opentrons Bio-Art protocols
============================================================================
Renders droplet positions, smears, and volume summaries as a matplotlib figure.
"""
from __future__ import annotations
from typing import Optional
import matplotlib.pyplot as plt
from matplotlib.figure import Figure
from matplotlib.axes import Axes
from .colors import PETRI_INNER_DIAMETER
def visualize_petri(
    droplets_x: list[float],
    droplets_y: list[float],
    droplets_size: list[float],
    droplets_color: list[str],
    smears: list[tuple[list[float], list[float], str]],
    total_aspirated: dict[str, float],
    total_dispensed: dict[str, float],
    tip_count: int,
    background: str = 'black',
    title: str = 'Opentrons Bio-Art Simulation',
    save_path: Optional[str] = None,
    show: bool = True,
    dpi: int = 150,
    figsize: tuple[float, float] = (10, 10),
) -> tuple[Figure, Axes]:
    """Render a Petri dish visualization with all dispensed droplets.
    Args:
        droplets_x: X coordinates of each droplet (mm from center).
        droplets_y: Y coordinates of each droplet (mm from center).
        droplets_size: Size of each droplet in scatter points (volume × 100).
        droplets_color: Matplotlib color of each droplet.
        smears: List of (x_list, y_list, color) tuples for smear lines.
        total_aspirated: Dict mapping color name → total µL aspirated.
        total_dispensed: Dict mapping color name → total µL dispensed.
        tip_count: Number of tips used during the protocol.
        background: 'black' (dark agar), 'agar' (beige agar), or 'paper' (outline only).
        title: Plot title.
        save_path: If provided, save figure to this file path.
        show: If True, call plt.show(). Set False for headless/test usage.
        dpi: Resolution for saved images.
        figsize: Figure size in inches.
    Returns:
        Tuple of (Figure, Axes) for further customization.
    """
    # ── Print volume summary ──
    _print_volume_summary(total_aspirated, total_dispensed, tip_count)
    # ── Create figure ──
    fig, ax = plt.subplots(figsize=figsize)
    # ── Petri dish background ──
    radius = PETRI_INNER_DIAMETER / 2
    bg_colors = {
        'black': ('#000000', True),
        'agar': ('#d7ca95', True),
        'paper': ('#000000', False),
    }
    color, fill = bg_colors.get(background, bg_colors['black'])
    ax.add_patch(plt.Circle((0, 0), radius=radius, color=color, fill=fill))
    # ── Droplets ──
    if droplets_x:
        ax.scatter(droplets_x, droplets_y, droplets_size, c=droplets_color)
    # ── Smears ──
    for xlist, ylist, scolor in smears:
        ax.plot(xlist, ylist, color=scolor, linewidth=4, solid_capstyle='round')
    # ── Axes setup ──
    margin = radius + 0.5
    ax.set_xlim(-margin, margin)
    ax.set_ylim(-margin, margin)
    ax.set_aspect('equal')
    ax.set_title(title)
    # ── Save / Show ──
    if save_path:
        fig.savefig(save_path, dpi=dpi, bbox_inches='tight',
                    facecolor=fig.get_facecolor(), edgecolor='none')
        print(f"\nImage saved to: {save_path}")
    if show:
        plt.show()
    return fig, ax
def _print_volume_summary(
    total_aspirated: dict[str, float],
    total_dispensed: dict[str, float],
    tip_count: int,
) -> None:
    """Print a summary of aspirated/dispensed volumes by color."""
    from .colors import resolve_visual_color
    print("\n=== TOTAL VOLUMES BY COLOR ===")
    all_colors = total_aspirated.keys() | total_dispensed.keys()
    for color in sorted(all_colors):
        asp = total_aspirated.get(color, 0)
        disp = total_dispensed.get(color, 0)
        waste = "\t\t##### WASTE: more aspirated than dispensed!" if asp != disp else ''
        vis = resolve_visual_color(color)
        print(f"\t{color} ({vis}):\t aspirated {asp:.1f}\t dispensed {disp:.1f}{waste}")
    total_asp = sum(total_aspirated.values())
    total_disp = sum(total_dispensed.values())
    print(f"\t[all]:\t\t[aspirated {total_asp:.1f}]\t[dispensed {total_disp:.1f}]")
    print(f"\n=== TIPS USED ===\n\t{tip_count} tip(s) (ideal: one per color)\n")
/content
opentrons-bioart-sim __pycache__ venv
OTDesign_96_deep_well_plate.py sample_data
from opentrons import types
import string
metadata = {
'protocolName': '{YOUR NAME} - Opentrons Art - HTGAA',
'author': 'HTGAA',
'source': 'HTGAA 2026',
'apiLevel': '2.20'
}
Z_VALUE_AGAR = 2.0
POINT_SIZE = 1
electra2_points = [(-4.4,39.6), (-8.8,37.4), (6.6,37.4), (-6.6,35.2), (15.4,35.2), (-13.2,33), (0,33), (-11,30.8), (19.8,30.8), (4.4,28.6), (13.2,28.6), (-15.4,26.4), (-6.6,26.4), (-19.8,24.2), (-26.4,22), (30.8,22), (-35.2,11), (35.2,11), (-37.4,8.8), (-39.6,2.2), (-33,0), (33,0), (-37.4,-6.6), (35.2,-8.8), (-33,-13.2), (33,-15.4), (-30.8,-22), (-15.4,-24.2), (-24.2,-26.4), (-11,-33), (8.8,-33), (15.4,-33), (4.4,-37.4)]
mturquoise2_points = [(-24.2,26.4), (-22,22), (-19.8,19.8), (-17.6,19.8), (-19.8,17.6), (-17.6,17.6), (-15.4,17.6), (-17.6,15.4), (-15.4,15.4), (-13.2,13.2), (-11,13.2), (-13.2,11), (-11,11), (-8.8,8.8), (-4.4,6.6), (-6.6,4.4), (11,4.4), (13.2,4.4), (15.4,4.4), (11,2.2), (13.2,2.2), (15.4,2.2), (19.8,2.2), (8.8,0), (11,0), (-4.4,-8.8), (-4.4,-11), (-4.4,-13.2), (-6.6,-15.4), (-4.4,-15.4), (-2.2,-17.6), (0,-24.2), (-4.4,-28.6)]
azurite_points = [(-2.2,39.6), (-6.6,37.4), (-15.4,35.2), (-11,35.2), (-4.4,35.2), (-19.8,33), (-17.6,33), (-6.6,33), (-22,30.8), (-15.4,30.8), (-19.8,28.6), (-17.6,28.6), (-28.6,24.2), (-22,24.2), (-28.6,22), (-33,19.8), (-28.6,19.8), (-24.2,19.8), (-30.8,17.6), (-28.6,17.6), (22,17.6), (-35.2,15.4), (-33,15.4), (17.6,15.4), (24.2,15.4), (-30.8,13.2), (-33,11), (15.4,11), (19.8,11), (22,11), (26.4,11), (17.6,8.8), (19.8,8.8), (22,8.8), (15.4,6.6), (17.6,6.6), (-37.4,4.4), (19.8,4.4), (24.2,4.4), (17.6,2.2), (22,2.2), (24.2,2.2), (19.8,0), (22,0), (26.4,0), (24.2,-2.2)]
sfgfp_points = [(0,-8.8), (2.2,-8.8), (-2.2,-11), (0,-11), (2.2,-11), (-2.2,-13.2), (-4.4,-19.8), (-2.2,-19.8)]
mjuniper_points = [(-2.2,-22), (0,-22), (-4.4,-24.2), (-2.2,-24.2), (-4.4,-26.4), (-6.6,-28.6), (-2.2,-28.6), (-4.4,-33), (-2.2,-35.2)]
mko2_points = [(2.2,8.8), (-6.6,6.6), (0,6.6), (2.2,6.6), (-4.4,4.4), (-2.2,4.4), (0,4.4), (-4.4,2.2), (-2.2,2.2), (6.6,2.2), (8.8,2.2), (-6.6,0), (-4.4,0), (4.4,0), (6.6,0), (-8.8,-2.2), (-6.6,-2.2), (2.2,-2.2), (4.4,-2.2), (0,-4.4), (2.2,-4.4), (4.4,-4.4), (6.6,-4.4), (-2.2,-6.6), (0,-6.6), (2.2,-6.6), (4.4,-6.6), (6.6,-6.6), (8.8,-6.6), (-2.2,-8.8), (8.8,-8.8), (11,-8.8), (13.2,-8.8), (11,-11), (15.4,-11), (13.2,-13.2), (11,-15.4), (13.2,-15.4), (13.2,-17.6), (15.4,-17.6), (17.6,-17.6), (19.8,-19.8), (19.8,-22), (22,-24.2), (26.4,-26.4)]
mwasabi_points = [(0,8.8), (-2.2,6.6), (-6.6,2.2), (-8.8,0)]
mrfp1_points = [(8.8,-11), (13.2,-11), (11,-13.2)]
mscarlet_i_points = [(15.4,-15.4), (17.6,-15.4)]
point_name_pairing = [("electra2", electra2_points),("mturquoise2", mturquoise2_points),("azurite", azurite_points),("sfgfp", sfgfp_points),("mjuniper", mjuniper_points),("mko2", mko2_points),("mwasabi", mwasabi_points),("mrfp1", mrfp1_points),("mscarlet_i", mscarlet_i_points)]
# Robot deck setup constants
TIP_RACK_DECK_SLOT = 9
COLORS_DECK_SLOT = 6
AGAR_DECK_SLOT = 5
PIPETTE_STARTING_TIP_WELL = 'A1'
# Place the PCR tubes in this order
well_colors = {
'A1': 'sfGFP',
'A2': 'mRFP1',
'A3': 'mKO2',
'A4': 'Venus',
'A5': 'mKate2_TF',
'A6': 'Azurite',
'A7': 'mCerulean3',
'A8': 'mClover3',
'A9': 'mJuniper',
'A10': 'mTurquoise2',
'A11': 'mBanana',
'A12': 'mPlum',
'B1': 'Electra2',
'B2': 'mWasabi',
'B3': 'mScarlet_I',
'B4': 'mPapaya',
'B5': 'eqFP578',
'B6': 'tdTomato',
'B7': 'DsRed',
'B8': 'mKate2',
'B9': 'EGFP',
'B10': 'mRuby2',
'B11': 'TagBFP',
'B12': 'mChartreuse_TF',
'C1': 'mLychee_TF',
'C2': 'mTagBFP2',
'C3': 'mEGFP',
'C4': 'mNeonGreen',
'C5': 'mAzamiGreen',
'C6': 'mWatermelon',
'C7': 'avGFP',
'C8': 'mCitrine',
'C9': 'mVenus',
'C10': 'mCherry',
'C11': 'mHoneydew',
'C12': 'TagRFP',
'D1': 'mTFP1',
'D2': 'Ultramarine',
'D3': 'ZsGreen1',
'D4': 'mMiCy',
'D5': 'mStayGold2',
'D6': 'PA_GFP'
}
volume_used = {
'electra2': 0,
'mturquoise2': 0,
'azurite': 0,
'sfgfp': 0,
'mjuniper': 0,
'mko2': 0,
'mwasabi': 0,
'mrfp1': 0,
'mscarlet_i': 0
}
def update_volume_remaining(current_color, quantity_to_aspirate):
    rows = string.ascii_uppercase
    for well, color in list(well_colors.items()):
        # Compare case-insensitively: well_colors uses names like 'Electra2',
        # while the point lists and volume_used use lowercase keys like 'electra2'
        if color.lower() == current_color.lower():
            if (volume_used[current_color] + quantity_to_aspirate) > 250:
                # Move to the next well by advancing the row letter, keeping the column number
                row = well[0]
                col = well[1:]
                # Find next row letter
                next_row = rows[rows.index(row) + 1]
                next_well = f"{next_row}{col}"
                del well_colors[well]
                well_colors[next_well] = current_color
                volume_used[current_color] = quantity_to_aspirate
            else:
                volume_used[current_color] += quantity_to_aspirate
            break
def run(protocol):
    # Load labware, modules and pipettes
    protocol.home()
    # Tips
    tips_20ul = protocol.load_labware('opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Opentrons 20uL Tips')
    # Pipettes
    pipette_20ul = protocol.load_instrument("p20_single_gen2", "right", [tips_20ul])
    # Deep Well Plate
    temperature_plate = protocol.load_labware('nest_96_wellplate_2ml_deep', 6)
    # Agar Plate
    agar_plate = protocol.load_labware('htgaa_agar_plate', AGAR_DECK_SLOT, 'Agar Plate')
    agar_plate.set_offset(x=0.00, y=0.00, z=Z_VALUE_AGAR)
    # Get the top-center of the plate, make sure the plate was calibrated before running this
    center_location = agar_plate['A1'].top()
    pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)
    # Helper function (dispensing)
    def dispense_and_jog(pipette, volume, location):
        assert(isinstance(volume, (int, float)))
        # Go above the location
        above_location = location.move(types.Point(z=location.point.z + 2))
        pipette.move_to(above_location)
        # Go downwards and dispense
        pipette.dispense(volume, location)
        # Go upwards to avoid smearing
        pipette.move_to(above_location)
    # Helper function (color location)
    def location_of_color(color_string):
        for well,color in well_colors.items():
            if color.lower() == color_string.lower():
                return temperature_plate[well]
        raise ValueError(f"No well found with color {color_string}")
    # Print pattern by iterating over lists
    for i, (current_color, point_list) in enumerate(point_name_pairing):
        # Skip the rest of the loop if the list is empty
        if not point_list:
            continue
        # Get the tip for this run, set the bacteria color, and then aspirate the bacteria of choice
        pipette_20ul.pick_up_tip()
        max_aspirate = int(18 // POINT_SIZE) * POINT_SIZE
        quantity_to_aspirate = min(len(point_list)*POINT_SIZE, max_aspirate)
        update_volume_remaining(current_color, quantity_to_aspirate)
        pipette_20ul.aspirate(quantity_to_aspirate, location_of_color(current_color))
        # Iterate over the current points list and dispense them, refilling along the way
        for i in range(len(point_list)):
            x, y = point_list[i]
            adjusted_location = center_location.move(types.Point(x, y))
            dispense_and_jog(pipette_20ul, POINT_SIZE, adjusted_location)
            if pipette_20ul.current_volume == 0 and len(point_list[i+1:]) > 0:
                # NOTE: point_list[i:] still includes the point just dispensed, so a refill that is
                # not capped at max_aspirate draws one POINT_SIZE more than needed;
                # point_list[i+1:] would avoid this.
                quantity_to_aspirate = min(len(point_list[i:])*POINT_SIZE, max_aspirate)
                update_volume_remaining(current_color, quantity_to_aspirate)
                pipette_20ul.aspirate(quantity_to_aspirate, location_of_color(current_color))
        # Drop tip between each color
        pipette_20ul.drop_tip()
Finally, the result is this:
=== TOTAL VOLUMES BY COLOR ===
Azurite (royalblue): aspirated 47.0 dispensed 46.0 ##### WASTE: more aspirated than dispensed!
Electra2 (deepskyblue): aspirated 34.0 dispensed 33.0 ##### WASTE: more aspirated than dispensed!
mJuniper (darkgreen): aspirated 9.0 dispensed 9.0
mKO2 (orange): aspirated 46.0 dispensed 45.0 ##### WASTE: more aspirated than dispensed!
mRFP1 (red): aspirated 3.0 dispensed 3.0
mScarlet_I (tomato): aspirated 2.0 dispensed 2.0
mTurquoise2 (turquoise): aspirated 34.0 dispensed 33.0 ##### WASTE: more aspirated than dispensed!
mWasabi (lightgreen): aspirated 4.0 dispensed 4.0
sfGFP (lime): aspirated 8.0 dispensed 8.0
[all]: [aspirated 187.0] [dispensed 183.0]
=== TIPS USED ===
9 tip(s) (ideal: one per color)
(<Figure size 1000x1000 with 1 Axes>,
<Axes: title={'center': 'Opentrons Bio-Art Simulation'}>)
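The four WASTE lines in the summary above appear to come from the refill step of the protocol, which counts `point_list[i:]` (including the point that was just dispensed) instead of `point_list[i+1:]`. The sketch below replays only the aspirate/dispense bookkeeping in plain Python, without the robot or the simulator. The function name `total_aspirated` and its flag are hypothetical; the 18 µL cap and 1 µL point size are taken from the protocol, and the point counts are the dispensed totals from the summary.

```python
MAX_ASPIRATE = 18  # µL per fill, as in the protocol
POINT_SIZE = 1     # µL per point, as in the protocol

# Number of points per color (equal to the dispensed volumes in the summary).
POINT_COUNTS = {
    "azurite": 46, "electra2": 33, "mjuniper": 9, "mko2": 45, "mrfp1": 3,
    "mscarlet_i": 2, "mturquoise2": 33, "mwasabi": 4, "sfgfp": 8,
}


def total_aspirated(n_points, count_current_point=True):
    """Replay the fill/dispense loop and return the total volume aspirated.

    count_current_point=True mimics len(point_list[i:]) in the refill step;
    False mimics len(point_list[i+1:]).
    """
    in_tip = min(n_points * POINT_SIZE, MAX_ASPIRATE)
    total = in_tip
    for i in range(n_points):
        in_tip -= POINT_SIZE              # dispense one point
        remaining = n_points - (i + 1)    # points still to be dispensed
        if in_tip == 0 and remaining > 0:
            counted = remaining + 1 if count_current_point else remaining
            in_tip = min(counted * POINT_SIZE, MAX_ASPIRATE)
            total += in_tip
    return total


for color, n in sorted(POINT_COUNTS.items()):
    print(color, total_aspirated(n), total_aspirated(n, count_current_point=False))
```

Under these assumptions the replay gives 47 µL for Azurite, 34 µL for Electra2 and mTurquoise2, and 46 µL for mKO2, which matches the simulator output; counting only the remaining points brings each total back to the dispensed volume.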
Special thanks to María José and Rafael Pérez Aguirre (@Mozta)
Post-Lab Questions — DUE BY START OF FEB 24 LECTURE
One of the great parts about having an automated robot is being able to precisely mix, deposit, and run reactions without much intervention, and design and deploy experiments remotely.
For this week, we’d like for you to do the following:
Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.
Automation of protein crystallization scale-up via Opentrons-2 liquid handling
This study presents an approach for optimizing protein crystallization trials at multi-microliter scale using the Opentrons-2 liquid handling robot. The research shows that, using Python scripts for precise control, the robot can mix and set up crystallization plates with a model protein (hen egg white lysozyme) and with a periplasmic protein from Campylobacter jejuni, whose crystals are used in the Snow lab as a biomaterial for nanotechnology and therefore need to be produced in large, consistent batches. Automating the process can significantly reduce manual labor and costs and improve the reliability of crystallization results. Because the Opentrons is programmed in Python, protocols are easy to set up, iterate on, and improve.
DeRoo, J. B., Jones, A. A., Slaughter, C. K., Ahr, T. W., Stroup, S. M., Thompson, G. B., & Snow, C. D. (2025). Automation of protein crystallization scaleup via Opentrons-2 liquid handling. SLAS Technology, 32, 100268. https://doi.org/10.1016/j.slast.2025.100268
Other interesting studies demonstrate how Opentrons can be linked to other types of technology, such as 3D bioprinting. Although a 3D printer does not handle proteins the way the OT-2 does, it can print different types of labware, reducing costs and making specialized tools possible.
Apart from robots, there is now a collaboration between automated labs and AI assistance.
Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details.
While your description/project idea doesn’t need to be set in stone, we would like to see core details of what you would automate. This is due at the start of lecture and does not need to be tested on the Opentrons yet.
Project tech proposals:
DNA OPTIMIZER: for codon-optimizing sequences so that proteins are expressed well in their host cells, for example E. coli expressing luciferase.
OPENTRONS OT-2: for running iterations of cell expression experiments, for example bacteria expressing and folding proteins while sound frequencies are played.
CLOUD LAB: for iterating on different proteins and combinations before going to the lab, for example recombining DNA for cyanobacteria and luciferase.
BIOREACTOR: for expressing proteins properly in a contamination-free environment.
Final Project Ideas — DUE BY START OF FEB 24 LECTURE
For the final project ideas, there are 3 options to take into consideration:
MUSIC & BACTERIA
AQUATIC MICROORGANISMS & BIOLUMINESCENT SENSORS
PROTEIN BASED CRYSTALLINE MATERIALS & SPIDER-SILK TEXTILES
Week 4: Protein Design - part I
Part A. Conceptual Questions
Answer any NINE of the following questions from Shuguang Zhang: (i.e. you can select two to skip)
How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)
Why do humans eat beef but do not become a cow, eat fish but do not become fish?
Why are there only 20 natural amino acids?
Can you make other non-natural amino acids? Design some new amino acids.
Where did amino acids come from before enzymes that make them, and before life started?
If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?
Can you discover additional helices in proteins?
Why are most molecular helices right-handed?
Why do β-sheets tend to aggregate?
What is the driving force for β-sheet aggregation?
Why do many amyloid diseases form β-sheets?
Can you use amyloid β-sheets as materials?
Design a β-sheet motif that forms a well-ordered structure.
Part B: Protein Analysis and Visualization
In this part of the homework, you will be using online resources and 3D visualization software to answer questions about proteins. Pick any protein (from any organism) of your interest that has a 3D structure and answer the following questions:
LUCIFERASE OF Pyrophorus plagiophthalamus
Luciferase is an enzyme that generates bioluminescence by catalyzing the oxidation of D-luciferin in the presence of ATP, oxygen, and Mg2+. In this particular insect, Pyrophorus plagiophthalamus, different luciferase isoforms emit light ranging from green to orange, depending on the organ in which the gene is expressed. These color variations arise from subtle structural differences in the enzyme's active site that alter the electronic environment of the excited oxyluciferin intermediate. Click beetle luciferase is stable over a wide pH range compared with other luciferases. The enzyme is commonly used for in vivo imaging applications, especially its red-emitting variants, and also as a biosensor and reporter gene for monitoring gene expression.
I chose this particular protein because I am interested in analyzing how sound frequencies might influence bacterial protein expression, growth dynamics, or spatial organization. In this way, having luciferase as a biosensor is ideal; light emission provides a real-time, quantifiable readout.
Burbelo, P. D., Kisailus, A. E., & Peck, J. W. (2002). Detecting Protein-Protein Interactions Using Renilla Luciferase Fusion Proteins. BioTechniques, 33(5), 1044–1050. https://doi.org/10.2144/02335st05
How long is it? What is the most frequent amino acid?
For this part, I used Google Colab and did some research on Leucine.
Luciferase of Pyrophorus plagiophthalamus has 543 amino acids; the most frequent is L (leucine), which appears 56 times.
Leucine is commonly known as an amino acid that helps synthesize muscle proteins and supports tissue regeneration. In this protein, its role is structural: it is a hydrophobic residue that contributes to the hydrophobic core, correct protein folding, and the formation of alpha helices.
Luciferase - colab run
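The length and most-frequent-residue count can be reproduced with a few lines of Python in Colab. This is a minimal sketch rather than the exact notebook used here: `sequence_stats` is a hypothetical helper, and the FASTA text below is a short made-up example, not the luciferase sequence.

```python
from collections import Counter


def sequence_stats(fasta_text):
    """Return (length, most frequent residue, its count) for a one-record FASTA."""
    lines = [line.strip() for line in fasta_text.splitlines()]
    seq = "".join(line for line in lines if line and not line.startswith(">"))
    residue, count = Counter(seq).most_common(1)[0]
    return len(seq), residue, count


# Toy input for illustration only; paste the real FASTA record here instead.
toy_fasta = ">toy\nMLLKAL\nLG\n"
print(sequence_stats(toy_fasta))  # → (8, 'L', 4)
```

Run on the full luciferase record, the same function should return the protein length together with the most frequent residue and its count.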
How many protein sequence homologs are there for your protein?
According to UniProt's BLAST tool, it has 236 homologs. This means that there is a variety of similar proteins in the living world. They are not identical, but they share a very similar structure. These homologs can be orthologs or paralogs. Paralogs are related proteins found within the same organism (here, the insect itself), with subtle variations in their structure.
Does your protein belong to any protein family?
Yes, it belongs to the insect luciferase family. Proteins of this type need ATP, D-luciferin, and oxygen to carry out the oxidation reaction.
When was the structure solved? Is it a good quality structure? A good-quality structure is one with high resolution. Smaller the better (Resolution: 2.70 Å) :
This particular protein, the luciferase of Pyrophorus plagiophthalamus, is not in the RCSB Protein Data Bank, so I took the first luciferase structure deposited there: 1LCI, firefly luciferase from Photinus pyralis. Its structure was solved in 1997. The resolution is 2.00 Å, which is good quality.
Are there any other molecules in the solved structure apart from protein?
The structure contains crystallographic water molecules (HOH), which stabilize the protein and may participate in hydrogen bond formation. As this is the first luciferase structure to be solved, it does not include any other components besides the protein and water.
Red dots alone: crystallographic water
Does your protein belong to any structure classification family?
It belongs to the ATP-dependent AMP-binding enzyme family. This family includes enzymes that activate substrates through adenylation using ATP, forming an AMP-bound intermediate.
UniProt's info
Open the structure of your protein in any 3D molecule visualization software:
- PyMol Tutorial Here (hint: ChatGPT is good at PyMol commands)
- Visualize the protein as “cartoon”, “ribbon” and “ball and stick”.
Luciferase-cartoon
Luciferase-ribbon
Luciferase-ball and sticks
Color the protein by secondary structure. Does it have more helices or sheets?
Luciferase-ss
The protein shows a predominance of alpha helices (red) over beta sheets (green). This indicates that firefly luciferase is mainly an alpha-helical protein with a smaller portion of beta-sheet structures.
Color the protein by residue type. What can you tell about the distribution of hydrophobic vs hydrophilic residues?
Luciferase-hydrophobic (yellow) vs. hydrophilic (yellow)
This enzyme, which operates in an aqueous environment, has an exterior dominated by hydrophilic residues and a core made up mostly of hydrophobic residues.
Visualize the surface of the protein. Does it have any “holes” (aka binding pockets)?
Yes, the protein surface shows a variety of pockets: one large pocket and several small ones. It is in the large pocket that ATP reacts with D-luciferin to form luciferyl-AMP, which then reacts with oxygen to form oxyluciferin and light.
Part C. Using ML-Based Protein Design Tools
C1. Protein Language Modeling
Deep Mutational Scans
a. Use ESM2 to generate an unsupervised deep mutational scan of your protein based on language model likelihoods.
b. Can you explain any particular pattern? (choose a residue and a mutation that stands out).
c. (Bonus) Find sequences for which we have experimental scans, and compare the prediction of the language model to experiment.
Deep Mutational Scan
The map shows that a large number of mutations are possible, although two main regions should not be changed because the protein could collapse; those regions appear as two dark blue columns. Also, three subtle rows show consistent color, corresponding to W, M, and C.
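Each cell of a heatmap like this is usually a score comparing how likely the model finds the mutant residue versus the wild-type residue at that position. The sketch below assumes a log-likelihood-ratio score, which is a common choice for language-model mutational scans; whether the course notebook uses exactly this formula is an assumption. The probabilities are invented for illustration and do not come from ESM2.

```python
import math

def mutation_score(position_probs: dict, wild_type: str, mutant: str) -> float:
    """Log-likelihood ratio: log p(mutant) - log p(wild type) at a single position."""
    return math.log(position_probs[mutant]) - math.log(position_probs[wild_type])

# Hypothetical model probabilities at one strongly conserved position (made-up numbers).
probs = {"G": 0.90, "A": 0.05, "W": 0.001}

# More negative scores mean the model considers the mutation less plausible.
print(mutation_score(probs, "G", "A"))
print(mutation_score(probs, "G", "W"))
```

A column of strongly negative scores, like the dark blue columns in the map, corresponds to a position where almost every substitution is much less likely than the wild-type residue.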
Latent Space Analysis
a. Use the provided sequence dataset to embed proteins in reduced dimensionality.
b. Analyze the different formed neighborhoods: do they approximate similar proteins?
c. Place your protein in the resulting map and explain its position and similarity to its neighbors.
The map shows that the nearest neighbor of the analyzed protein (firefly luciferase from Photinus pyralis) is the luciferase of Luciola cruciata, a protein produced by another species of firefly. The first one (PP) is from North America, while the second one (LC) is from Japan. The main differences are the geographical origin and the molecular composition, which results in a slightly different emission color and in the stability of the enzyme. Both proteins use D-luciferin and ATP to produce light, but PP luciferase is widely used in biotech as a reporter gene, whereas LC luciferase is used to understand how active-site residues interact with the substrate.
C2. Protein Folding
Fold your protein with ESMFold. Do the predicted coordinates match your original structure?
Try changing the sequence, first try some mutations, then large segments. Is your protein structure resilient to mutations?
When folded with ESMFold, the protein shows an almost identical structure to the original one. When some mutations are introduced, it presents a few anomalies but no radical changes, meaning that the structure is largely resilient to mutations.
C3. Protein Generation
Inverse-Folding a protein
Let’s now use the backbone of your chosen PDB to propose sequence candidates via ProteinMPNN
Analyze the predicted sequence probabilities and compare the predicted sequence vs the original one.
Input this sequence into ESMFold and compare the predicted structure to your original.
Even though the predicted sequence has a very different amino acid composition, the structure remains the same. This is why the protein shown in 3D is very similar to the original in its alpha and beta structures. Neither the backbone nor the overall logic of where certain types of amino acids are placed is altered.
Part D. Group Brainstorm on Bacteriophage Engineering
Find a group of ~3–4 students
Read through the Phage Reading material listed under “Reading & Resources” below.
Review the Bacteriophage Final Project Goals for engineering the L Protein:
Increased stability (easiest)
Higher titers (medium)
Higher toxicity of lysis protein (hard)
Brainstorm Session
Choose one or two main goals from the list that you think you can address computationally (e.g., “We’ll try to stabilize the lysis protein,” or “We’ll attempt to disrupt its interaction with E. coli DnaJ.”).
Write a 1-page proposal (bullet points or short paragraphs) describing:
Which tools/approaches from recitation you propose using (e.g., “Use Protein Language Models to do in silico mutagenesis, then AlphaFold-Multimer to check complexes.”).
Why do you think those tools might help solve your chosen sub-problem?
Name one or two potential pitfalls (e.g., “We lack enough training data on phage–bacteria interactions.”).
Include a schematic of your pipeline.
This resource may be useful: HTGAA Protein Engineering Tools
Each individually put your plan on your HTGAA website
Include your group’s short plan for engineering a bacteriophage
Week 5: Protein Design - part II
Part A: SOD1 Binder Peptide Design (From Pranam)
Superoxide dismutase 1 (SOD1) is a cytosolic antioxidant enzyme that converts superoxide radicals into hydrogen peroxide and oxygen. In its native state, it forms a stable homodimer and binds copper and zinc.
Mutations in SOD1 cause familial Amyotrophic Lateral Sclerosis (ALS). Among them, the A4V mutation (Alanine → Valine at residue 4) leads to one of the most aggressive forms of the disease. The mutation subtly destabilizes the N-terminus, perturbs folding energetics, and promotes toxic aggregation.
Your challenge:
Design short peptides that bind mutant SOD1.
Then decide which ones are worth advancing toward therapy.
Pseudo-perplexity reflects the confidence of the model: the lower the value, the higher the confidence. This means that the peptide with 15.42 is less natural, while the peptide with 10.32 is a more natural peptide and more similar to the sequence. The added SOD1-binding sequence stands out: it has a pseudo-perplexity of 20.63, a very high number, which means that the model has low confidence in it.
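Pseudo-perplexity for a masked language model is commonly computed by masking each residue in turn, taking the log-probability the model assigns to the true residue, and exponentiating the negative mean. The sketch below shows only that last arithmetic step; the log-probabilities are invented for illustration and were not produced by PepMLM.

```python
import math

def pseudo_perplexity(token_log_probs):
    """exp(-mean log-probability) over all masked positions (natural log)."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# Hypothetical per-residue log-probabilities for two 12-residue peptides.
confident_peptide = [math.log(0.5)] * 12   # model gives each true residue p = 0.5
uncertain_peptide = [math.log(0.05)] * 12  # model gives each true residue p = 0.05

print(pseudo_perplexity(confident_peptide))
print(pseudo_perplexity(uncertain_peptide))
```

The peptide whose residues the model predicts with higher probability gets the lower pseudo-perplexity, which matches the reading that lower values mean higher model confidence.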
I took the peptides generated in PepMLM and modeled each one with AlphaFold together with the mutant SOD1 sequence. The results show high confidence in the predicted structure of the protein. The ipTM values, however, are under 0.6, which means there is low confidence in the interaction between the peptide and the protein. Also, the regions where the peptide does make some contact with the protein correspond to the beginning of the sequence, which appears to be a more flexible region of the protein.
Peptide 1
Peptide 2
Peptide 3
Peptide 4
Peptide 5
C. Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse
PEPTIDE 1: WHYPVVGLALGX
PROPERTY PREDICTION VALUE UNIT
💧 Solubility Soluble 1.000 Probability
🩸 Hemolysis Non-hemolytic 0.103 Probability
🔗 Binding Affinity Weak binding 6.293 pKd/pKi
⚖️ Molecular Weight 1193.6 Da
⚡ Net Charge (pH 7) -0.15
PEPTIDE 2: WRVYPAGAAHKK
PROPERTY PREDICTION VALUE UNIT
💧 Solubility Soluble 1.000 Probability
🩸 Hemolysis Non-hemolytic 0.010 Probability
🔗 Binding Affinity Weak binding 4.953 pKd/pKi
⚖️ Molecular Weight 1383.6 Da
⚡ Net Charge (pH 7) 2.84
PEPTIDE 3: KRYPAVALAWWE
PROPERTY PREDICTION VALUE UNIT
💧 Solubility Soluble 1.000 Probability
🩸 Hemolysis Non-hemolytic 0.071 Probability
🔗 Binding Affinity Weak binding 6.653 pKd/pKi
⚖️ Molecular Weight 1489.7 Da
⚡ Net Charge (pH 7) 0.77
PEPTIDE 4: WLYYAVGAEHGE
PROPERTY PREDICTION VALUE UNIT
💧 Solubility Soluble 1.000 Probability
🩸 Hemolysis Non-hemolytic 0.065 Probability
🔗 Binding Affinity Weak binding 6.053 pKd/pKi
⚖️ Molecular Weight 1394.5 Da
⚡ Net Charge (pH 7) -2.14
PEPTIDE 5: FLYRWLPSRRGG
PROPERTY PREDICTION VALUE UNIT
💧 Solubility Soluble 1.000 Probability
🩸 Hemolysis Non-hemolytic 0.047 Probability
🔗 Binding Affinity Weak binding 5.968 pKd/pKi
⚖️ Molecular Weight 1507.7 Da
⚡ Net Charge (pH 7) 2.76
It seems that the best candidate as a therapeutic peptide is Peptide 3: KRYPAVALAWWE. Although other candidates show very good qualities and achieve similar results in terms of solubility, hemolysis, molecular weight, and net charge, they do not present a strong binding score. In this case, Peptide 3 shows the highest predicted binding affinity among the candidates.
If we compare these results with the ipTM values predicted by AlphaFold, we can observe that the confidence of interaction between the peptides and the protein is generally low. For Peptide 3 in particular, the ipTM value lies in the middle of the observed range, suggesting moderate structural confidence in the predicted interaction, within an overall low range.
Additionally, when designing peptides for therapeutic purposes, several properties must be considered. First, peptides need to be soluble so that they can circulate in the biological fluids without forming aggregates. Second, hemolysis probabilities should remain below 0.2, since higher values indicate that peptides may disrupt red blood cells and release hemoglobin into the bloodstream, which can be toxic. Third, binding affinity is important because it helps to predict whether a peptide will interact strongly with the target protein. Furthermore, molecular weight is preferably small, as smaller peptides are easier to synthesize and diffuse through biological environments. Finally, a moderate positive net charge is often favorable, because it can promote electrostatic interactions with negatively charged regions on protein surfaces, potentially stabilizing the peptide-protein interaction.
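The selection logic described above can be sketched as a filter followed by a ranking. The property values are copied from the PeptiVerse tables; the hemolysis cutoff of 0.2 comes from the text, while the solubility cutoff of 0.5 and the function name `rank_candidates` are assumptions made for this example.

```python
# Values copied from the PeptiVerse results listed above.
peptides = [
    {"name": "Peptide 1", "seq": "WHYPVVGLALGX", "solubility": 1.000, "hemolysis": 0.103, "affinity": 6.293},
    {"name": "Peptide 2", "seq": "WRVYPAGAAHKK", "solubility": 1.000, "hemolysis": 0.010, "affinity": 4.953},
    {"name": "Peptide 3", "seq": "KRYPAVALAWWE", "solubility": 1.000, "hemolysis": 0.071, "affinity": 6.653},
    {"name": "Peptide 4", "seq": "WLYYAVGAEHGE", "solubility": 1.000, "hemolysis": 0.065, "affinity": 6.053},
    {"name": "Peptide 5", "seq": "FLYRWLPSRRGG", "solubility": 1.000, "hemolysis": 0.047, "affinity": 5.968},
]

def rank_candidates(rows, max_hemolysis=0.2, min_solubility=0.5):
    """Keep soluble, non-hemolytic peptides and sort them by predicted affinity (pKd/pKi)."""
    passing = [
        r for r in rows
        if r["hemolysis"] < max_hemolysis and r["solubility"] >= min_solubility
    ]
    return sorted(passing, key=lambda r: r["affinity"], reverse=True)

ranked = rank_candidates(peptides)
best = ranked[0]
print(best["name"], best["seq"], best["affinity"])
```

Because all five peptides pass the solubility and hemolysis filters, the ranking is decided by predicted binding affinity alone.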
D. Part 4: Generate Optimized Peptides with moPPIt
I chose to target the residues nearest to the mutation because the flexibility around this region is beneficial to peptide-protein binding.
Values considered in the analysis:
💧 Solubility: 1.0 (good)
🩸 Hemolysis: 1.0 (good)
🔗 Binding Affinity: the higher the better
🧩 Motif: 1.0 (good)
Therefore, Peptide 3: YYQKTCLVKKEH reflects that it is the best candidate for binding to the mutant SOD1. It presents balanced and consistent results in every aspect: hemolysis, solubility, affinity, and motif. Although the solubility is slightly lower compared to Peptide 0, it still falls in the favorable range, suggesting that the peptide can remain stable and soluble in physiological conditions. Also, it presents a high affinity and motif, meaning that it can perform a strong and specific interaction with the selected residues of the protein.
Compared to the PepMLM peptides, the moPPIt results show good affinity and motif scores, which did not appear in the PepMLM peptides. I think the moPPIt peptides have a higher affinity and better chances of binding the protein because they were generated for a specific set of target residues in a specific region, while PepMLM gives a general result based on stable and more plausible sequences without focusing on any particular binding site.
How would you evaluate these peptides before advancing them to clinical studies?
I would first run a few more computational tests, namely docking and molecular dynamics simulations, to get consistent results on the stability and strength of the peptide-protein bond. Afterward, it would be necessary to do in vitro experiments to test whether the solubility, hemolysis, binding affinity, and motif results remain consistent with the computational predictions. Finally, in vivo models would be used to assess safety, stability, and pharmacokinetic properties, to see if the peptide meets the requirements for clinical studies.
Part C: Final Project: L-Protein Mutants
High-level summary: The objective of this assignment is to improve the stability and auto-folding of the lysis protein of the MS2 phage. This mechanism is key to understanding how phages could potentially help solve antibiotic resistance.
Week 6: Genetic Circuits - part I
Assignment: DNA Assembly
Answer these questions about the protocol in this week’s lab:
What are some components in the Phusion High-Fidelity PCR Master Mix, and what is their purpose?
Elements:
Pyrococcus-like enzyme fused with a processivity-enhancing domain that increases fidelity and speed.
5′→3′ polymerase activity, 3′→5′ exonuclease activity, and generates blunt-ended products.
Phusion DNA Polymerase, deoxynucleotides, and reaction buffer that has been optimized and includes MgCl2
Requirements to be added: template, primers, and water.
Purpose:
High-fidelity applications, where the DNA sequence needs to be correct after amplification
Cloning, and long or difficult amplicons
New England Biolabs. (2026). Neb.com. https://www.neb.com/en/products/m0531-phusion-high-fidelity-pcr-master-mix-with-hf-buffer?srsltid=AfmBOorEwfoZbsx9mLnMvqC0ZzoqOD_2vjUU8fWlYwjLxsupJd6g5J4T
What are some factors that determine primer annealing temperature during PCR?
Annealing temperature:
The temperature used during the primer annealing step of a PCR, which depends on the primer melting temperature
Usually lower than the melting temperature
Too high Ta: the primer will not anneal to the template, and the amplification will not proceed
Too low Ta: nonspecific binding of the primers to the template or each other can occur, causing:
increased likelihood of nonspecific product formation
decreased formation of the intended product due to inefficient reaction conditions.
Factors of primer annealing temperature:
Base composition (proportion of ATCG nucleotides)
Primer concentration
Ionic reaction environment
New England Biolabs. (2025). Neb.com. https://www.neb.com/en/faqs/how-should-i-determine-the-appropriate-annealing-temperature-for-my-reaction
PCR conditions | Primer annealing specificity | PCR buffers. (n.d.). Www.qiagen.com. https://www.qiagen.com/us/knowledge-and-support/knowledge-hub/bench-guide/pcr/introduction/pcr-conditions
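To illustrate how base composition feeds into the annealing temperature, here is a rough sketch using the Wallace rule (2 °C per A/T and 4 °C per G/C), which is only a coarse approximation for short primers. It ignores primer concentration and the ionic environment, both listed above as factors, so real designs should rely on a vendor calculator. The 5 °C offset below the lower melting temperature is a common rule of thumb taken here as an assumption, and the primer sequences are hypothetical.

```python
def wallace_tm(primer: str) -> float:
    """Rough melting temperature in °C: 2 per A/T plus 4 per G/C (Wallace rule)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2.0 * at + 4.0 * gc

def suggested_ta(forward: str, reverse: str, offset: float = 5.0) -> float:
    """Annealing temperature set a few degrees below the lower of the two primer Tm values."""
    return min(wallace_tm(forward), wallace_tm(reverse)) - offset

# Hypothetical primers, for illustration only.
forward_primer = "ATGCGTACCGTTAGC"
reverse_primer = "TTAGGCATCGGATCC"
print(wallace_tm(forward_primer), wallace_tm(reverse_primer))
print(suggested_ta(forward_primer, reverse_primer))
```

A GC-rich primer gets a higher estimate than an AT-rich primer of the same length, which is why base composition appears first in the list of factors.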
Two methods from this class create linear DNA fragments: PCR and restriction enzyme digestion. Compare and contrast these two methods, both in terms of protocol and in terms of when one may be preferable to the other.
PROTOCOLS PCR vs RESTRICTION ENZYME DIGESTION:
PCR is a technique used mainly for DNA amplification.
Restriction enzyme digestion is a technique for cutting DNA at specific sequences.
The two methods use different temperatures and procedures, and their applications partly overlap: PCR is used for DNA amplification, cloning, sequencing, etc., while restriction enzyme digestion is used for site-specific DNA cutting, cloning, etc.
The table below compares the two protocols:
Protocol comparison chart
Addgene. (2016, October 11). Molecular Biology Protocol - Restriction Digest of Plasmid DNA. Addgene.org. https://www.addgene.org/protocols/restriction-digest/
New England Biolabs. (2026). Neb.com. https://www.neb.com/en/protocols/taq-dna-polymerase-with-standard-taq-buffer-m0273?srsltid=AfmBOoofDqU_j-Dq9LMqmvJJS77-KBrCTdRfvMap0CX93U1VmEXFKB2t
New England Biolabs. (2026). Neb.com. https://www.neb.com/en/protocols/restriction-digest-protocol?srsltid=AfmBOorUr9MjG3wtFkW9aSZK3zyRojgexRcNxwj0Tinahe9D5B5V07vw
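The cutting step of a restriction digest can be sketched in code: find every recognition site and split the linear sequence at the cut position. This simplified example looks only at the top strand, so it suits palindromic sites such as EcoRI (G^AATTC, cut one base into the site). The toy sequence and the function name `digest_linear` are made up for illustration.

```python
def digest_linear(seq: str, site: str, cut_offset: int):
    """Cut a linear sequence at every occurrence of `site`; return the fragment lengths."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)    # position of the cut within the full sequence
        start = seq.find(site, start + 1)  # keep searching after this site
    edges = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(edges, edges[1:])]

# Toy 23-bp sequence containing two EcoRI sites (GAATTC).
toy_dna = "AAAA" + "GAATTC" + "CCCCC" + "GAATTC" + "TT"
fragments = digest_linear(toy_dna, "GAATTC", cut_offset=1)
print(fragments)
```

The fragment lengths are what a gel would separate by size, while PCR would instead copy the region between two primers.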
How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?
How does the plasmid DNA enter the E. coli cells during transformation?
Describe another assembly method in detail (such as Golden Gate Assembly)
Explain the other method in 5 - 7 sentences plus diagrams (either handmade or online).
Model this assembly method with Benchling or Asimov Kernel!
Assignment: Asimov Kernel
Create a Repository for your work
Create a blank Notebook entry to document the homework and save it to that Repository
Explore the devices in the Bacterial Demos Repo to understand how the parts work together by running the Simulator on various examples, following the instructions for the simulator found in the “Info” panel (click the “i” icon on the right to open the Info panel)
Create a blank Construct and save it to your Repository
Recreate the Repressilator in that empty Construct by using parts from the Characterized Bacterial Parts repository
Search the parts using the Search function in the right menu
Drag and drop the parts into the Construct
Confirm it works as expected by running the Simulator (“play” button) and compare your results with the Repressilator Construct found in the Bacterial Demos repository
Document all of this work in your Notebook entry - you can copy the glyph image and the simulator graphs, and paste them into your Notebook
Build three of your own Constructs using the parts in the Characterized Bacterial Parts Repo
Explain in the Notebook Entry how you think each of the Constructs should function
Run the simulator and share your results in the Notebook Entry
If the results don’t match your expectations, speculate on why and see if you can adjust the simulator settings to get the expected outcome
For this assignment, I took notes in the Notebook as suggested. Here are the images of it, in which I registered every step:
For the first part, the Repressilator construct, the process went smoothly, and the results matched the reference.
For the second part, I built constructs 1, 2, and 3. First, I designed a simple GFP expression construct; the results reached a steady state. Next, I tried a repressilator using GFP, BFP, and RFP. The first simulation was pretty good, but I noticed that the fluorescent proteins were not being expressed, so I made a few changes. The result was not what I expected, because the oscillations were not sustained as I thought they would be. Instead, I obtained a few oscillations at the beginning of the simulation, after which the expression became steady. For the last construct, I used GFP and RFP with a TetR protein and a ligand. The simulation without the ligand was successful after a few iterations. The ligand, however, did not work, even though I tried many options.
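The difference between sustained oscillations and a few oscillations that fade into a steady state can be reproduced with a very small model. The sketch below is a dimensionless, protein-only repressilator integrated with a simple Euler step; it is not the Asimov Kernel simulator, and the parameter values (`alpha`, `n`, the initial levels) are assumptions chosen for illustration.

```python
def simulate_repressilator(alpha, n=3.0, t_end=100.0, dt=0.01, p0=(1.0, 1.5, 2.0)):
    """Euler integration of dp_i/dt = alpha / (1 + p_(i-1)^n) - p_i for three repressors."""
    p = list(p0)
    trace = []
    for _ in range(int(t_end / dt)):
        # Each protein is repressed by the previous one in the ring (index -1 wraps around).
        dp = [alpha / (1.0 + p[i - 1] ** n) - p[i] for i in range(3)]
        p = [p[i] + dt * dp[i] for i in range(3)]
        trace.append(p[0])
    return trace

def late_amplitude(trace):
    """Peak-to-trough swing of the first protein over the second half of the run."""
    tail = trace[len(trace) // 2:]
    return max(tail) - min(tail)

strong = late_amplitude(simulate_repressilator(alpha=10.0))  # strong promoters
weak = late_amplitude(simulate_repressilator(alpha=2.0))     # weak promoters
print(strong, weak)
```

In this toy model, strong expression relative to degradation keeps the ring oscillating, while weak expression gives a few oscillations that die out. That is one possible explanation for the behavior described above, though the actual cause in the Kernel simulation may differ.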
Week 7 HW: Genetic Circuits - part II
Assignment Part 1: Intracellular Artificial Neural Networks (IANNs)
What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?
Intracellular Artificial Neural Networks (IANNs) provide an analog approach to computation within biological systems, enabling cells to process continuous inputs and generate a graded, non-binary output. IANNs can integrate multiple signals and capture more complex, non-linear relationships through distributed gene regulation, which lets them recognize patterns in complex inputs.
In contrast, traditional circuits based on Boolean logic are binary systems with an ON/OFF output that responds directly to the given inputs. This limits the range of data they can process to predefined inputs. They are easier to implement experimentally, but their results are limited and they cannot respond to complex, continuous, and dynamic signals.
The chart below summarizes the comparison:
Feature | Boolean Genetic Circuits | IANNs
Response Type | Digital (ON/OFF) | Analog (Graded/Continuous)
Logic Type | Exact Logic (AND, NOT) | Non-linear Approximation
Complexity | Limited / Simple | Complex / Adaptive
Robustness | Low (Noise sensitive) | High (Pattern Recognition)
Main Use Case | Switch / Memory | Classification / Processing
Comparison table made with AI's help
3Blue1Brown. (2017). But what is a Neural Network? | Deep learning, chapter 1. In YouTube. https://www.youtube.com/watch?v=aircAruvnKk
TeachTech Online. (2016, April 15). [Tema 3] Puertas lógicas y circuitos combinacionales. YouTube. https://www.youtube.com/watch?v=r7YNXYGCx7s
Shao, B., Liu, X., Zhang, D., Wu, J., & Ouyang, Q. (2015). From Boolean Network Model to Continuous Model Helps in Design of Functional Circuits. PLOS ONE, 10(6), e0128630. https://doi.org/10.1371/journal.pone.0128630
1. Introduction to biological circuit design — Biological Circuit Design documentation. (n.d.). Biocircuits.github.io. https://biocircuits.github.io/chapters/01_intro_to_circuit_design.html
Neural network (machine learning). (2024, February 18). Wikipedia. https://en.wikipedia.org/wiki/Neural_network_(machine_learning)
Describe a useful application for an IANN; include a detailed description of input/output behavior, as well as any limitations an IANN might face to achieve your goal.
An interesting application of an IANN could be in the production of palm hearts. The proposal would be to optimize growth by introducing a symbiotic bacterium that sends signals to the plant's genes, which in turn make the palm produce more or less of certain proteins to reach optimal fiber production.
Conceptually, the output could be expressed as:
O=0.3M+0.25H−0.2S+0.25G
Here O is the output, M represents the metabolic state (sugar levels), H represents hormone levels, G represents expression related to structural growth, and S represents stress levels. In this approximation, when the bacteria detect that the weighted output has shifted, they send a signal to the plant's DNA so that it adjusts production until the output is balanced again. This system would be dynamic, because the plant's state is never constant due to environmental causes and internal processes.
Some possible complications would be the complexity of the plant's system itself and the variability between individuals, which would greatly expand the range of readings the bacteria would have to take and would require an adaptive network. Also, there is no guarantee that the bacteria would survive within different individuals.
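The weighted sum above can be written as a tiny function to make the input/output behavior concrete. The weights are the ones in the formula; the assumption that every input is normalized between 0 and 1, as well as the target value, the tolerance, and the names `plant_output` and `correction_signal`, are hypothetical choices for this sketch.

```python
# Weights taken from the formula O = 0.3M + 0.25H - 0.2S + 0.25G.
WEIGHTS = {"M": 0.3, "H": 0.25, "S": -0.2, "G": 0.25}

def plant_output(inputs: dict) -> float:
    """Weighted sum of the four sensed states (each assumed to be scaled to 0..1)."""
    return sum(WEIGHTS[key] * inputs[key] for key in WEIGHTS)

def correction_signal(inputs: dict, target: float = 0.5, tolerance: float = 0.05) -> str:
    """Decide what the bacterium should signal, given a hypothetical target output."""
    output = plant_output(inputs)
    if output < target - tolerance:
        return "increase"
    if output > target + tolerance:
        return "decrease"
    return "hold"

healthy = {"M": 1.0, "H": 1.0, "S": 0.0, "G": 1.0}
stressed = {"M": 0.0, "H": 0.0, "S": 1.0, "G": 0.0}
print(plant_output(healthy), correction_signal(healthy))
print(plant_output(stressed), correction_signal(stressed))
```

Because stress carries a negative weight, a stressed plant pushes the output down and triggers an "increase" signal, while a well-fed, unstressed plant overshoots the target and triggers "decrease".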
Below is a diagram depicting an intracellular single-layer perceptron where the X1 input is DNA encoding for the Csy4 endoribonuclease and the X2 input is DNA encoding for a fluorescent protein output whose mRNA is regulated by Csy4. Tx: transcription; Tl: translation. Draw a diagram for an intracellular multilayer perceptron where layer 1 outputs an endoribonuclease that regulates a fluorescent protein output in layer 2.
For this example, I will use:
X1: glucose, construct: Promoter (pCRP)+ RBS + Csy4 + T
Output: Fluorescence in higher or lower quantities.
In the image above, you can see how X1 and X2 determine the amount of Csy4 produced in the cell; then, the amount of Csy4 enzyme will affect the expression of GFP. Therefore, when there is more Csy4, there will be less fluorescence (GFP), and when there is less Csy4, the fluorescence (GFP) will be higher.
Multilayer Perceptron - chart developed with AI assistance
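The two-layer behavior can also be sketched numerically. In this toy version, layer 1 turns the inputs X1 and X2 into a Csy4 level through a sigmoid, and layer 2 produces GFP with Hill-type repression by Csy4. The weights, bias, repression constant `k`, and Hill coefficient `n` are all hypothetical; they are not measured values from any construct.

```python
import math

def csy4_level(x1: float, x2: float, w1: float = 1.0, w2: float = 1.0, bias: float = -1.0) -> float:
    """Layer 1: weighted inputs passed through a sigmoid give a Csy4 level between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-(w1 * x1 + w2 * x2 + bias)))

def gfp_output(csy4: float, k: float = 0.5, n: float = 2.0, max_gfp: float = 1.0) -> float:
    """Layer 2: GFP expression repressed by Csy4 (Hill-type repression)."""
    return max_gfp / (1.0 + (csy4 / k) ** n)

low_input_gfp = gfp_output(csy4_level(0.0, 0.0))
high_input_gfp = gfp_output(csy4_level(2.0, 2.0))
print(low_input_gfp, high_input_gfp)
```

Higher inputs produce more Csy4 in layer 1 and therefore less GFP in layer 2, which is the inverse relationship described above.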
Assignment Part 2: Fungal Materials
What are some examples of existing fungal materials and what are they used for? What are their advantages and disadvantages over traditional counterparts?
There are many examples of fungal materials in use, but the most interesting and useful ones are:
Architecture: specifically with NASA, testing fungal materials to build biosensors for space exploration devices.
Biotextiles: the development and commercialization of fungal leather and other types of biotextiles.
Mycelial wood: a wood substitute that is lighter, faster, and easier to produce.
Food industry: the production of meat alternatives and proteins.
Antibiotic research.
Also, the types of materials developed with fungi are classified as rigid materials or flexible materials.
Rigid materials are created by combining fungi with fibers or lignocellulose particles. The characteristics of each rigid material depend on many factors, such as the type of substrate, the fungal species and strain, the type of hypha, and the overall growth conditions. Some examples of rigid materials are related to construction.
For flexible materials, the production variables are similar to those of rigid ones, but the difference lies in the final product. In this case, textiles tend to be fragile and limited, which is why they have to be produced with biotechnology so that their properties can improve. Some industries use fungi to produce flexible products like textiles, food, foams, leather, and many more.
In the chart below, there is a list of various companies that work with mycelium, and the industries they are involved in:
Fungi companies around the world
Sciencedirect.com. (2026). https://www.sciencedirect.com/science/article/pii/S2950194625002079#ab0010
Hinneburg, H., Gu, S., & Naseri, G. (2025). Fungal Innovations—Advancing Sustainable Materials, Genetics, and Applications for Industry. Journal of Fungi, 11(10), 721. https://doi.org/10.3390/jof11100721
What might you want to genetically engineer fungi to do and why? What are the advantages of doing synthetic biology in fungi as opposed to bacteria?
I think it could be interesting to genetically engineer fungi to efficiently convert plant biomass into biofuels such as ethanol, lipids, or energy-rich compounds. Due to their natural ability to degrade complex polymers like cellulose or lignin, fungi can serve as a biological platform for transforming agricultural waste into usable energy sources.
According to the paper “Fungal Innovations - Advancing sustainable materials,” the advantages of doing synthetic biology in fungi lie in their structural and biosynthetic capabilities. E. coli and S. cerevisiae are the common hosts, with rapid growth and simple genetics, but they have some complications with RNA splicing and complex regulation. Filamentous fungi, specifically, can fold complex proteins and can drive advances in transcriptional regulation tools, genome editing techniques, and rapid DNA assembly methods; therefore, they are powerful biomanufacturing platforms.
Hinneburg, H., Gu, S., & Naseri, G. (2025). Fungal Innovations—Advancing Sustainable Materials, Genetics, and Applications for Industry. Journal of Fungi, 11(10), 721. https://doi.org/10.3390/jof11100721
Assignment Part 3: First DNA Twist Order
Review the Individual Final Project documentation guidelines.
Submit this Google Form with your draft Aim 1, final project summary, HTGAA industry council selections, and shared folder for DNA designs. DUE MARCH 20 FOR MIT/HARVARD/WELLESLEY STUDENTS
Review Part 3: DNA Design Challenge of the week 2 homework. Design at least 1 insert sequence and place it into the Benchling/Kernel/Other folder you shared in the Google Form above. Document the backbone vector it will be synthesized in on your website.
Week 9 HW: Cell Free Systems
Part A: General and Lecturer-Specific Questions
General homework questions
Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.
Cell-free protein synthesis: a protein expression approach that produces a target protein without the use of living cells.
In vivo methods: also a protein expression approach, but one that uses living cells such as bacteria (most commonly E. coli), yeast, insect cells, and mammalian cells.
Here is a chart that summarizes the differences between both methods:
Cell-free protein synthesis vs. traditional in vivo methods
As shown, cell-free protein synthesis has several advantages compared to traditional in vivo methods, mainly because everything happens outside of a living cell. This makes the system much more flexible, since you can directly control things like the amount of DNA, enzymes, and other components without worrying about how the cell will react. In cell-dependent methods, by contrast, the limitations are significant because of the cell's metabolism, regulation, and need to survive.
Another important advantage is that cell-free systems allow the production of proteins that might be toxic to cells. With in vivo methods, these proteins can kill or damage the host organism, making them difficult or impossible to produce. In cell-free expression, this is not a problem because there are no living cells involved.
In terms of speed, the cell-free approach is faster, since no time is spent growing or transforming cells. This makes it easier to quickly test different DNA sequences or protein variants.
There are several situations where the cell-free method is more useful, for example when producing toxic proteins that cannot be expressed in cells, or in rapid prototyping applications such as synthetic biology circuits or screening multiple protein variants in a short time.
It is important to add that, while cell-free methods are better for rapid work and protein expression at a controlled, medium scale, in vivo methods can handle large-scale, low-cost protein production.
New England Biolabs. (2026). Neb.com. https://www.neb.com/en/applications/protein-expression/cell-free-protein-expression?srsltid=AfmBOorKmWZBUknZgtYzmBfbx0IiXqMcRlLcgWd8oi4EKcYBFwv4sudk
Mason, E. (2023, March 23). Advantages of Cell-Free Protein Expression. Biocompare.com. https://www.biocompare.com/Editorial-Articles/594727-Advantages-of-Cell-Free-Protein-Expression/
Cui, Y., Chen, X., Wang, Z., & Lu, Y. (2022). Cell-Free PURE System: Evolution and Achievements. BioDesign Research, 2022. https://doi.org/10.34133/2022/9847014
Describe the main components of a cell-free expression system and explain the role of each component.
Cell-free expression systems are made up of several components that work together to produce proteins outside living cells:
Cell extract: contains the molecular machinery needed for protein synthesis. This extract usually comes from broken cells (like bacteria) and provides ribosomes, tRNAs, enzymes, and other factors required for transcription and translation.
DNA template: this is the gene that encodes the protein of interest. The system uses this DNA to produce mRNA and then translates it into the desired protein.
Amino acids: they are the building blocks of proteins. These are added to the system so that ribosomes can assemble them into a protein based on the sequence of the mRNA.
NTPs (nucleoside triphosphates): such as ATP, GTP, CTP, and UTP. These molecules are essential both for building the mRNA during transcription and for providing energy during translation.
Energy source: protein synthesis requires a lot of energy, so the system needs molecules like ATP and other energy-regenerating compounds to keep the reaction running.
Cofactors and salts: these stabilize the chemical environment and help enzymes function correctly, so that the system remains stable and efficient.
Cell-free components
Kim, W., Han, J., Chauhan, S., & Lee, J. W. (2025). Cell-free protein synthesis and vesicle systems for programmable therapeutic manufacturing and delivery. Journal of Biological Engineering, 19(1). https://doi.org/10.1186/s13036-025-00523-x
Why is energy provision regeneration critical in cell-free systems? Describe a method you could use to ensure continuous ATP supply in your cell-free experiment.
Energy regeneration is critical in cell-free systems because protein synthesis is a highly energy-demanding process. Both transcription and translation require large amounts of ATP and GTP. Because there is no cellular metabolism to generate energy naturally, a cell-free system would run out of ATP very fast. Without a continuous energy supply, the reaction stops: protein yields would be very low, and the reaction would only last a short period.
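A toy calculation makes the argument concrete: with a fixed ATP pool and constant consumption, the reaction stops quickly, while a regeneration substrate extends its lifetime. All quantities below are in arbitrary units and were invented for illustration; they are not measurements from any cell-free system.

```python
def time_to_depletion(atp0, use_rate, regen_rate, fuel0, dt=0.1, t_end=100.0):
    """Return the time at which ATP runs out (or t_end if it never does)."""
    atp, fuel, t = atp0, fuel0, 0.0
    while t < t_end:
        # ATP can only be regenerated while the secondary energy source lasts.
        regenerated = min(regen_rate * dt, fuel)
        fuel -= regenerated
        atp += regenerated - use_rate * dt
        if atp <= 0:
            return t
        t += dt
    return t_end

without_regeneration = time_to_depletion(atp0=10.0, use_rate=1.0, regen_rate=0.0, fuel0=0.0)
with_regeneration = time_to_depletion(atp0=10.0, use_rate=1.0, regen_rate=0.8, fuel0=20.0)
print(without_regeneration, with_regeneration)
```

In this sketch the regenerating reaction runs roughly three times longer than the one relying on its initial ATP alone, which is the reason secondary energy sources are added to the mix.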
There are several interesting pathways for producing continuous energy in cell-free systems. Some of them are:
Glucose and sugar metabolism: systems frequently derived from E. coli or yeast enable high-yield in vitro protein production by using their metabolic pathways to break down glucose, which can improve cost-efficiency.
Maltodextrin metabolism: maltodextrin is a low-cost secondary energy compound for CFPS. It has been reported to produce higher levels of protein than PEP, glucose, and glucose-6-phosphate. The enhancement of protein synthesis was largely attributed to better-controlled phosphate levels (recycling of inorganic phosphate) and a more homeostatic reaction environment.
Electricity-generated power: in cells, ATP is synthesized through a rather complicated process involving several membrane-bound redox protein complexes. Electrons are transferred along different redox centers, creating a proton motive force across the membrane, which is subsequently harvested for ATP synthesis.
System | How it generates ATP | Advantages | Disadvantages | When to use it
Glucose and sugar metabolism | Uses glucose or other sugars (e.g., via glycolysis) to produce ATP | | |
Wang Y, Zhang YH. Cell-free protein synthesis energized by slowly-metabolized maltodextrin. BMC Biotechnol. 2009 Jun 28;9:58. doi: 10.1186/1472-6750-9-58. PMID: 19558718; PMCID: PMC2716334.
Luo, S., Adam, D., Giaveri, S., Barthel, S., Cestellos-Blanco, S., Hege, D., Paczia, N., Castañeda-Losada, L., Klose, M., Arndt, F., Heider, J., & Erb, T. J. (2023). ATP production from electricity with a new-to-nature electrobiological module. Joule, 7(8), 1745-1758. https://doi.org/10.1016/j.joule.2023.07.012
Calhoun, Kara & Swartz, James. (2005). Energizing cell-free protein synthesis with glucose metabolism. Biotechnology and bioengineering. 90. 606-13. 10.1002/bit.20449.
Compare prokaryotic versus eukaryotic cell-free expression systems. Choose a protein to produce in each system and explain why.
How would you design a cell-free experiment to optimize the expression of a membrane protein? Discuss the challenges and how you would address them in your setup.
Imagine you observe a low yield of your target protein in a cell-free system. Describe three possible reasons for this and suggest a troubleshooting strategy for each.
Week 10 HW: Imaging and Measurement
Waters Part I — Molecular Weight
We will analyze an eGFP standard on a Waters Xevo G3 QTof MS system to determine the molecular weight of intact eGFP and observe its charge state distribution in the native and denatured (unfolded) states. The conditions for LC-MS analysis of intact protein cause it to unfold and be detected in its denatured form (due to the solvents and pH used for analysis).
1. Based on the predicted amino acid sequence of eGFP (see below) and any known modifications, what is the calculated molecular weight? You can use an online calculator like the one at https://web.expasy.org/compute_pi/
eGFP Sequence:
MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH
Note: This contains a His-purification tag (HHHHHH) and a linker (the LE before it).
Using the link, the result obtained was:
Theoretical pI/Mw of eGFP: 5.90 / 28006.60
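As a rough cross-check of the online calculator, the average molecular weight can also be estimated directly from the sequence. The sketch below is only an illustration: the residue-mass table holds standard average residue masses that I am assuming here (they are not taken from the assignment), and the function name `average_mw` is hypothetical. It ignores any modifications.

```python
# Average residue masses in Da (assumed standard values, not from the assignment).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added for the free N- and C-termini

# eGFP sequence from question 1, including the LE linker and the His tag.
EGFP = (
    "MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT "
    "GKLPVPWPTL VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF "
    "KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV "
    "YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY "
    "LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH"
).replace(" ", "")


def average_mw(sequence: str) -> float:
    """Sum the average residue masses and add one water for the termini."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


mw = average_mw(EGFP)
print(f"{len(EGFP)} residues, estimated average MW = {mw:.2f} Da")
```

The printed value should land close to the calculator result, with small differences depending on the mass table used.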
2. Calculate the molecular weight of the eGFP using the adjacent charge state approach described in the recitation. Select two charge states from the intact LC-MS data (Figure 1) and:
Determine z for each adjacent pair of peaks (n, n+1) using:
$$ z = \frac{(m/z)_{n+1}}{(m/z)_n - (m/z)_{n+1}} $$
For this part, I chose this adjacent pair of peaks:
(m/z)_n = 1037.4423
(m/z)_{n+1} = 1000.4302
Substituting into the formula:
$$ z = \frac{1000.4302}{1037.4423 - 1000.4302} = \frac{1000.4302}{37.0121} = 27.0298 \approx 27 $$
Since z is a charge state, it has no units.
Determine the MW of the protein using the relationship between m/zn, MW, z:
For this part, I will use the equation given in the presentation to determine MW. First I will determine n, and then MW.
n
$$ n = \frac{(m/z)_{n+1} - 1}{(m/z)_n - (m/z)_{n+1}} = \frac{1000.4302 - 1}{1037.4423 - 1000.4302} = \frac{999.4302}{37.0121} \approx 27 $$
MW
$$ MW = n \cdot (m/z)_n - n = (27 \times 1037.4423) - 27 = 28010.9421 - 27 = 27983.9421 \approx 27984\ \text{Da} $$
Comparing the theoretical MW (28006.60 Da) with the experimental MW (27984 Da) shows only a slight variation of about 22.6 Da, meaning that the two results are consistent with each other. The final MW is, as mentioned before, 27984 Da.
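The adjacent charge state calculation can also be sketched in a few lines of Python. This is a minimal illustration, not the method from the recitation: the function names are hypothetical, and it uses a proton mass of 1.00728 Da (my assumption) instead of the rounded value of 1 used in the hand calculation, so the result differs slightly in the decimals.

```python
PROTON = 1.00728  # assumed proton mass in Da (the hand calculation rounds this to 1)


def charge_from_adjacent(mz_high: float, mz_low: float) -> float:
    """Estimate the charge of the higher-m/z peak from two adjacent charge states."""
    return (mz_low - PROTON) / (mz_high - mz_low)


def neutral_mass(mz: float, z: int) -> float:
    """Neutral mass of an ion observed at m/z with z protons attached."""
    return z * (mz - PROTON)


MZ_N = 1037.4423         # peak with charge n
MZ_N_PLUS_1 = 1000.4302  # adjacent peak with charge n + 1

z_estimate = charge_from_adjacent(MZ_N, MZ_N_PLUS_1)
n = round(z_estimate)

mass_from_n = neutral_mass(MZ_N, n)
mass_from_n_plus_1 = neutral_mass(MZ_N_PLUS_1, n + 1)

print(f"estimated z = {z_estimate:.4f}, assigned n = {n}")
print(f"MW from charge {n}: {mass_from_n:.2f} Da")
print(f"MW from charge {n + 1}: {mass_from_n_plus_1:.2f} Da")
```

Both peaks should give a mass of about 27984 Da, which is a quick way to confirm that the charge assignment is self-consistent.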
Calculate the accuracy of the measurement using the deconvoluted MW from 2.2 and the predicted weight of the protein from 2.1 using:
$$ \text{Accuracy} = \frac{MW_{\text{experiment}} - MW_{\text{theory}}}{MW_{\text{theory}}} = \frac{27984 - 28006.60}{28006.60} = \frac{-22.6}{28006.60} = -8.07 \times 10^{-4} \approx -0.0807\% $$
The final accuracy of -0.0807% shows that the experimental MW is only about 22.6 Da below the theoretical MW. Such a small relative difference supports the charge-state assignment and peak selection.
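The same accuracy calculation can be written as a small Python helper. This is only a sketch with a hypothetical function name; reporting the error in parts per million (ppm) as well is my addition, since that unit is commonly used for mass accuracy.

```python
def mass_error(measured: float, theoretical: float) -> tuple[float, float]:
    """Return the relative mass error as (percent, ppm)."""
    relative = (measured - theoretical) / theoretical
    return relative * 100, relative * 1e6


percent, ppm = mass_error(measured=27984.0, theoretical=28006.60)
print(f"accuracy = {percent:.4f} %  ({ppm:.0f} ppm)")
```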
Can you observe the charge state for the zoomed-in peak in the mass spectrum for the intact eGFP? If yes, what is it? If no, why not?
Yes, it can be observed, and the zoomed-in view also shows smaller peaks within each peak. Because the protein carries multiple protons, it produces a charge-state envelope: a series of peaks from the same protein with different numbers of charges. When we zoom in, we can see the isotopic peaks.
Waters Part II — Secondary/Tertiary structure
We will analyze eGFP in its native, folded state and compare it to its denatured, unfolded state on a quadrupole time-of-flight MS. We will be doing MS-only analysis (no liquid chromatography, also known as “direct infusion” experiments) on the Waters Xevo G3-QToF MS.
1. Based on learnings in the lab, please explain the difference between native and denatured protein conformations. For example, what happens when a protein unfolds? How is that determined with a mass spectrometer? What changes do you see in the mass spectrum between the native and denatured protein analyses (Figure 2)?
The main difference between native and denatured protein conformations is that native proteins are in their natural, functional 3D conformation, organized into secondary, tertiary, and quaternary structures. Denatured proteins, on the other hand, have lost their 3D structure and unfolded into random, inactive polypeptide chains due to stress (heat, pH, chemicals), although they retain their primary amino acid sequence.
Charge states:
Native: the compact structure hides the chargeable sites, so the spectrum shows fewer peaks with lower charge states (higher m/z values).
Denatured: with the 3D structure lost, protonation sites are exposed, so the result is a broad distribution of higher charge states (lower m/z values).
Conservation of the quaternary complex
Native: native mass spectrometry allows the complex to be observed intact because the non-covalent interactions (hydrogen bonds, hydrophobic interactions) are preserved.
Denatured: the complex dissociates into individual subunits, or it is observed as a fully unfolded polypeptide chain.
Presence of adducts (ligands/salts)
Native: the spectrum shows a peak corresponding to the protein bound to small ligands, metals, or subunits, which confirms a functional state.
Denatured: ligands are lost in the denatured state.
Solvent conditions
Native: Native MS uses neutral pH (6.5-7.5) and aqueous buffer.
Denatured: Denaturing MS uses acidic conditions (for example: TFA, formic acid) and organic solvents (for example: acetonitrile).
In Figure 2, the native protein shows only a few well-defined peaks, which indicates a compact structure, while the denatured protein shows a broad series of peaks, which is evidence of an unfolded structure with exposed protonation sites. The charge states of the denatured protein are also much higher, and form a longer consecutive series, than those of the native protein.
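To illustrate why higher charge states appear at lower m/z, the sketch below computes the expected m/z of the same protein at several charge states. It is a minimal example under assumptions: the mass is the rounded experimental value from Part I, the proton mass of 1.00728 Da and the function name are mine, and the charge states listed are illustrative picks rather than values read from Figure 2.

```python
PROTON = 1.00728  # assumed proton mass in Da


def mz_for_charge(mass: float, z: int) -> float:
    """Expected m/z of a protein of the given neutral mass carrying z protons."""
    return (mass + z * PROTON) / z


MASS = 27984.0  # rounded experimental MW from Part I

# Illustrative charge states: low (native-like) and high (denatured-like).
for z in (10, 11, 12, 26, 27, 28):
    print(f"z = {z:2d}  ->  m/z = {mz_for_charge(MASS, z):.2f}")
```

Low charge states fall in the m/z 2300-2800 region and high charge states near m/z 1000, which matches the shift between the native and denatured spectra described above.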
2. Zooming into the native mass spectrum of eGFP from the Waters Xevo G3 QTof MS (see Figure 3), can you discern the charge state of the peak at ~2800 m/z? What is the charge state? How can you tell?
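To determine the charge state, we can use two adjacent isotope peaks within the same charge state, such as:
m1: 2544.8552
m2: 2544.7637
Isotope peaks of an ion with charge z are separated by approximately 1/z on the m/z axis, so the formula we will use is:
$$ z \approx \frac{1}{m_1 - m_2} = \frac{1}{2544.8552 - 2544.7637} = \frac{1}{0.0915} \approx 10.9 $$
The charge state of this peak is therefore z = 11.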
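The two m/z values above are only 0.0915 apart, which is the spacing expected between isotope peaks of one charge state rather than between two different charge states. As a hedged cross-check, the sketch below estimates the charge from that spacing. It assumes the two peaks are neighbouring isotope peaks and uses an isotope spacing of 1.00335 Da (the approximate 13C - 12C mass difference); the function name is hypothetical.

```python
ISOTOPE_SPACING = 1.00335  # assumed 13C - 12C mass difference in Da


def charge_from_isotope_spacing(mz_a: float, mz_b: float) -> float:
    """Estimate z from the m/z distance between two neighbouring isotope peaks."""
    return ISOTOPE_SPACING / abs(mz_a - mz_b)


z_estimate = charge_from_isotope_spacing(2544.8552, 2544.7637)
z = round(z_estimate)
print(f"estimated z = {z_estimate:.2f}, assigned z = {z}")
```

A charge of 11 at m/z 2544.8 corresponds to a mass of roughly 27980 Da, which agrees with the intact mass determined in Part I.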
Waters Part III — Peptide Mapping - primary structure
We will digest the eGFP protein standard into peptides using trypsin (an enzyme that selectively cleaves the peptide bond after Lysine (K) and Arginine (R) residues). The resulting peptides will be analyzed on the Waters BioAccord LC-MS to measure their molecular weights and fragmented to confirm the amino acid sequence within each peptide – generating a “peptide map”. This process is used to confirm the primary structure of the protein.
There are a variety of tools available online to calculate protein molecular weight and predict a list of peptides generated from a tryptic digest. We will be using tools within the online resource Expasy (the bioinformatics resource portal of the Swiss Institute of Bioinformatics (SIB)) to predict a list of tryptic peptides from eGFP.
1. How many Lysines (K) and Arginines (R) are in eGFP? Please circle or highlight them in the eGFP sequence given in Waters Part I question 1 above. (Note: adding the sequence to Benchling as an amino acid file and clicking biochemical properties tab will show you a count for each amino acid).
Copy/paste the sequence above into the input box in the PeptideMass tool to generate expected list of peptides.
Use Figure 4 below as a guide for the relevant parameters to predict peptides from eGFP.
Click “Perform the Cleavage” button in the PeptideMass tool and report the number of peptides generated when using trypsin to perform the digest.
Generated peptide
When the cleavage is performed, the number of peptides is: 19
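The K/R count and a simple tryptic digest can also be sketched in Python. This is an illustration under assumptions, not a reimplementation of PeptideMass: it applies only the basic rule "cleave after K or R, except before P" with no missed cleavages and no mass filter, so its peptide count can differ from the number the tool reports with the Figure 4 parameters. The function name is hypothetical.

```python
# eGFP sequence from Waters Part I, question 1 (spaces removed).
EGFP = (
    "MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT "
    "GKLPVPWPTL VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF "
    "KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV "
    "YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY "
    "LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH"
).replace(" ", "")


def tryptic_digest(sequence: str) -> list[str]:
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide without K/R
    return peptides


lysines = EGFP.count("K")
arginines = EGFP.count("R")
peptides = tryptic_digest(EGFP)

print(f"K: {lysines}, R: {arginines}")
print(f"peptides from the simple rule: {len(peptides)}")
for peptide in peptides:
    print(peptide)
```

Very short peptides from this list may be missing from the PeptideMass output if the tool only displays peptides above a minimum mass.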
3. Based on the LC-MS data for the Peptide Map data generated in lab (please use Figure 5a as a reference) how many chromatographic peaks do you see in the eGFP peptide map between 0.5 and 6 minutes? You may count all peaks that are >10% relative abundance.
According to the map data, the number of peaks clearly identified and that are >10% relative abundance, is 17:
4. Assuming all the peaks are peptides, does the number of peaks match the number of peptides predicted from question 2 above? Are there more peaks in the chromatogram or fewer?
Predicted peptides: 19
Chromatogram peaks: 17
If we assume that the peaks in the chromatogram are peptides, then we would have fewer than the predicted number in theory.
5. Identify the mass-to-charge (m/z) of the peptide shown in Figure 5b. What is the charge (z) of the most abundant charge state of the peptide (use the separation of the isotopes to determine the charge state)? Calculate the mass of the singly charged form of the peptide ([M+H]+) based on its m/z and z.
$$ z = \frac{(m/z)_{n+1}}{(m/z)_n - (m/z)_{n+1}} $$
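For a peptide, the charge is usually read from the isotope spacing, and the singly charged mass follows from m/z and z. The sketch below shows that arithmetic under assumptions: the proton mass (1.00728 Da), the isotope spacing (1.00335 Da), and the function names are mine, and the example numbers are hypothetical placeholders, not values read from Figure 5b.

```python
PROTON = 1.00728           # assumed proton mass in Da
ISOTOPE_SPACING = 1.00335  # assumed 13C - 12C mass difference in Da


def charge_from_isotope_spacing(mz_a: float, mz_b: float) -> int:
    """Charge state from the distance between two neighbouring isotope peaks."""
    return round(ISOTOPE_SPACING / abs(mz_a - mz_b))


def singly_charged_mz(mz: float, z: int) -> float:
    """Convert an observed m/z at charge z to the [M+H]+ value."""
    neutral = z * (mz - PROTON)  # remove all z protons
    return neutral + PROTON      # add one proton back


# Hypothetical example values (NOT taken from Figure 5b).
example_mz = 500.0
example_next_isotope = 500.5

z = charge_from_isotope_spacing(example_mz, example_next_isotope)
print(f"z = {z}, [M+H]+ = {singly_charged_mz(example_mz, z):.4f}")
```

Replacing the placeholder values with the monoisotopic m/z and the next isotope peak from the figure gives the answer for the actual peptide.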
Week 11 HW: Bioproduction & Cloud Labs
Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork
1. Contribute at least one pixel to this global artwork experiment before the editing ends on Sunday 4/19 at 11:59 PM EST.
A personalized URL was sent to the email address associated with your Discourse account, and you can discuss the artwork on the Discourse.
If you did not have a chance to contribute, it’s okay, just make sure you become a TA this fall! 😉
2. Make a note on your HTGAA webpages including:
What you contributed to the community bioart project (e.g., “I made part of the DNA on the bottom right plate”)
For this part I contributed a sinuous line intended to connect the plates at the bottom. It was made with sfGFP (a green fluorescent protein). I think I also made some red pixels, but I could not identify them when the artwork was complete.
Pixel Artwork intervention
What you liked about the project, and
The project sparked my curiosity and excitement about what the final output would be. Everyone was making changes to it, so sometimes the drawing was pretty clear, but other times it mutated. It is nice to see how everyone contributes, whether with nothing particular in mind or with a very specific idea. I like the idea of a collaborative artwork whose final form nobody knows in advance. It is also an extensive job for the cloud lab to process.
What about this collaborative art experiment could be made better for next year?
I would like to see more colors in it. It would also be nice to use some coding to let everyone make a complete drawing, and then have an AI analysis that produces a final output. I would like to understand what the AI is reading and which patterns it selects and keeps. It would be nice to have that curiosity resolved.
Part B: Cell-Free Protein Synthesis | Cell-Free Reagents
1. Referencing the cell-free protein synthesis reaction composition (the middle box outlined in yellow on the image above, also listed below), provide a 1-2 sentence description of what each component’s role is in the cell-free reaction.
E. coli Lysate
BL21 (DE3) Star Lysate (includes T7 RNA Polymerase): It provides the machinery of the E. coli cell that makes it possible to express the proteins encoded on the plasmid. In the specific case of BL21 (DE3) Star, the strain carries a mutation in RNase E that reduces mRNA degradation, resulting in greater mRNA stability and a higher yield of protein expression.
Salts/Buffer
Potassium Glutamate: It maintains osmotic balance and supports enzyme function in the bacterial lysate. It is an alternative to KCl that supports highly efficient reactions, and it mimics intracellular ionic strength.
HEPES-KOH pH 7.5: It is a common high-quality buffer suited for cell culture, protein purification, and enzyme studies. It keeps the pH stable within its buffering range of 6.8-8.2.
Magnesium Glutamate: Mg²⁺ ions are essential for ribosome function and enzymatic activity, while the glutamate counterion acts as a biocompatible high-capacity buffer that stabilizes pH and mimics intracellular conditions. Magnesium glutamate therefore offers good buffering capacity across a broader pH range, enhancing protein stability and activity.
Potassium phosphate monobasic: it is a highly soluble inorganic salt used as a foundational buffering agent. It acts as a weak acid in the preparation of phosphate buffer systems to maintain a stable pH (between 5.8 and 8.0), critical for enzyme stability and cell culture viability.
Potassium phosphate dibasic: generally, it is combined with potassium phosphate monobasic for creating biological buffers. It’s used in CFPS when needing high-purity and low-background buffering.
Energy / Nucleotide System
Ribose: It is crucial for rebuilding nucleotides in the reaction mixture and for regenerating energy molecules such as ATP (adenosine triphosphate) and other adenine nucleotides. Ribose is a five-carbon sugar that provides the scaffold for nucleobase salvage, turning nucleobases back into active nucleotides (important during energy stress).
Glucose: It regenerates ATP via glycolysis and offers a higher energy yield per molecule than traditional high-energy phosphate donors. Glucose-driven systems show a lag in protein synthesis because ATP is consumed to prime glycolysis before net regeneration begins.
AMP: Adenosine monophosphate acts as a signal of energy depletion in CFPS. It is recycled back to ADP/ATP to prolong protein synthesis, often via endogenous kinases such as adenylate kinase.
CMP: Cytidine 5'-monophosphate is a nucleotide component in CFPS. It serves as a precursor for CTP, which is essential for RNA synthesis during transcription. It is used in energy regeneration pathways and as a building block for nucleic acids.
GMP: Guanosine monophosphate is a critical nucleotide precursor that serves as a building block for RNA synthesis (transcription) and, once phosphorylated to GTP, as an energy substrate for translation.
UMP: Uridine monophosphate is a critical precursor for pyrimidine nucleotide metabolism, supplying the UTP required for RNA synthesis and energy-dependent processes. While UTP is used for transcription, UMP phosphorylation depends on the energy regeneration system, which replenishes the ATP required by UMP kinases.
Guanine: This nucleobase is salvaged into GMP and then phosphorylated to GTP, which powers the translation machinery: typically 2 molecules of GTP are consumed for each amino acid incorporated into a growing polypeptide. GTP is required during initiation, elongation, and termination of translation.
Translation Mix (Amino Acids)
17 Amino Acid Mix: Provides the majority of the amino acids required for protein synthesis, enabling ribosomes to assemble the target protein. Although 20 amino acids are standard, a 17-amino-acid mix is used in some formulations to improve solubility, balance metabolism, or allow unnatural amino acids to be incorporated.
Tyrosine: It is supplied separately because of its low solubility. It is typically added at 1-4 mM when high-yield protein production is needed.
Cysteine: It is also supplied separately, because of its reactivity and tendency to oxidize. Its availability is important for proper protein folding and disulfide bond formation.
Additives
Nicotinamide: A precursor of NAD+/NADH, commonly used in metabolic engineering to produce nicotinamide mononucleotide (NMN), a key intermediate in NAD+ synthesis. It supports the redox reactions and metabolic processes that help sustain energy regeneration.
Backfill
Nuclease Free Water: It is used to adjust the final reaction volume while preventing degradation of nucleic acids by nucleases.
2. Describe the main differences between the 1-hour optimized PEP-NTP master mix and the 20-hour NMP-Ribose-Glucose master mix shown in the Google Slide above. (2-3 sentences)
The main difference lies in the energy regeneration strategy used in each system: the 1-hour PEP-NTP master mix relies on phosphoenolpyruvate (PEP) as a high-energy phosphate donor, enabling rapid and efficient ATP regeneration for short-term, high-yield protein synthesis. In contrast, the 20-hour NMP-Ribose-Glucose master mix uses a slower, metabolism-based pathway that recycles nucleotides and generates ATP through glycolysis, allowing sustained protein production over longer periods but with lower instantaneous energy output.
3. Bonus question: How can transcription occur if GMP is not included but Guanine is?
Transcription can occur because guanine can be converted into GMP through nucleotide salvage pathways present in the lysate. GMP is then phosphorylated to GDP and GTP, which is required for RNA synthesis. Therefore, even in the absence of externally supplied GMP, the system can regenerate the necessary guanine nucleotides from guanine.
Part C: Planning the Global Experiment | Cell-Free Master Mix Design
1. Given the 6 fluorescent proteins we used for our collaborative painting, identify and explain at least one biophysical or functional property of each protein that affects expression or readout in cell-free systems. (Hint: options include maturation time, acid sensitivity, folding, oxygen dependence, etc) (1-2 sentences each)
sfGFP: it has an OSER ratio of 2.2 ± 0.1 (74 cells), where 2.2 is the mean of the OSER measurements, 0.1 is the variability, and 74 is the number of cells analyzed. This suggests that sfGFP is a fairly stable protein with only a low tendency to aggregate; any aggregation that does occur could reduce correct folding or fluorescence.
OSER ratio (organized smooth endoplasmic reticulum): it evaluates whether a protein tends to oligomerize or aggregate in cells. In CFPS it can only indicate a tendency to do so. In general, values around 1-2 indicate a tendency to stay monomeric and therefore more stable, while values around 3-4 indicate a tendency to oligomerize and therefore to aggregate.
mRFP1: this protein has a pKa of 4.5. CFPS reactions normally run at a pH of around 7-7.5 to obtain high transcription and translation efficiency. Because the pKa is well below this range, mRFP1 fluorescence should stay close to its maximum at the reaction pH, and it would only drop substantially if the reaction acidified toward pH 4.5.
pKa: a measure of the acid sensitivity of a fluorescent protein. It is the pH at which fluorescence intensity drops to 50% of its maximum value, so a lower pKa means the protein stays fluorescent over a wider pH range.
mKO2: it has a pKa of 5.5 and a maturation time of 108 min. The long maturation time delays fluorescence detection in CFPS, affecting the early readout. Its pKa is also closer to the reaction pH than that of the other proteins, so the fluorescence readout decreases if the reaction becomes more acidic.
Maturation time (min): the time a protein needs after synthesis to fold and form its fluorescent chromophore. Typical ranges are 10-30 min (fast), 30-60 min (medium), and more than 90 min (slow).
mTurquoise2: this protein has a maturation time of 33.5 min, which is a moderate rate, and a low pKa of 3.3, so its fluorescence is stable at physiological pH. Regarding oxygen dependence, mTurquoise2 is derived from GFP, so oxygen is required for chromophore maturation, which makes fluorescence dependent on oxygen availability.
mScarlet_I: it has a fluorescence lifetime of 3.9 ns. This means each molecule remains in the excited state for 3.9 ns on average before emitting a photon. This can improve readout reliability by enabling discrimination from background fluorescence.
Lifetime (ns): the average time a fluorophore remains excited before returning to the ground state by emitting a photon. It is influenced by the local environment, such as pH and molecular interactions.
Electra2: this protein has a high photostability, with a half-life of 1466 seconds, allowing it to maintain fluorescence under continuous illumination.
Photostability: the ability of a fluorescent protein to maintain its fluorescence under continuous illumination. High photostability improves the CFPS readout because less photobleaching means a more reliable signal.
2. Create a hypothesis for how adjusting one or more reagents in the cell-free mastermix could improve a specific biophysical or functional property you identified above, in order to maximize fluorescence over a 36-hour incubation. Clearly state the protein, the reagent(s), and the expected effect.
I chose the mKO2 protein because it has a relatively long maturation time (108 min), which delays fluorescence detection. My hypothesis is that this can be improved by combining two changes. First, more agitation would increase the oxygen level in the reaction and accelerate chromophore formation. Second, adding a chaperone system such as GroEL/GroES would improve folding efficiency and increase the yield of properly folded protein over a 36-hour incubation.
3. The second phase of this lab will be to define the precise reagent concentrations for your cell-free experiment. You will be assigned artwork wells with specific fluorescent proteins and receive an email with instructions this week (by April 24). You can begin composing master mix compositions here.
4. The final phase of this lab will be analyzing the fluorescence data we collect to determine whether we can draw any conclusions about favorable reagent compositions for our fluorescent proteins. This will be due a week after the data is returned (date TBD!). The reaction composition for each well will be as follows:
6 μL of Lysate
10 μL of 2X Optimized Master Mix from above
2 μL of assigned fluorescent protein DNA template
2 μL of your custom reagent supplements
Total: 20 μL reaction
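When composing supplements, it helps to check the volumes and dilution factors of this 20 μL reaction. The sketch below is a minimal helper under assumptions: the function names and the 10 mM example stock are hypothetical, and only the four volumes come from the list above.

```python
# Per-well volumes in microliters, taken from the reaction composition above.
REACTION_UL = {
    "lysate": 6.0,
    "2X master mix": 10.0,
    "DNA template": 2.0,
    "custom supplements": 2.0,
}
TOTAL_UL = sum(REACTION_UL.values())


def final_concentration(stock: float, added_ul: float, total_ul: float = TOTAL_UL) -> float:
    """Final concentration after dilution (C1 * V1 = C2 * V2), same units as stock."""
    return stock * added_ul / total_ul


def stock_needed(target_final: float, added_ul: float, total_ul: float = TOTAL_UL) -> float:
    """Stock concentration required to reach a target final concentration."""
    return target_final * total_ul / added_ul


print(f"total reaction volume: {TOTAL_UL} uL")
# The 2X master mix ends up at 1X in the final reaction.
print(f"master mix final strength: {final_concentration(2.0, 10.0)}X")
# Hypothetical example: a 10 mM supplement stock added in the 2 uL slot.
print(f"supplement final: {final_concentration(10.0, 2.0)} mM")
print(f"stock for 5 mM final: {stock_needed(5.0, 2.0)} mM")
```

Because the supplement slot is 2 μL out of 20 μL, any custom reagent is diluted tenfold, so its stock must be ten times the intended final concentration.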
Part D: Build-A-Cloud-Lab | (optional) Bonus Assignment