Homework

Weekly homework submissions:

Week 1 HW: Principles and Practices
First, describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project and/or something for which you are already doing in your research, or something you are just curious about. Thyroid implant for canines: In the canine world, there is a very specific problem that owners face: hypothyroidism. It is a common condition that dogs develop around the ages of 2 or 3. Still, most of the time it is mistaken for other health conditions, such as intestinal problems, allergies, or dermatological conditions. Because the condition affects so many systems in a dog's body, veterinarians can sometimes arrive at a misdiagnosis.
Week 2 HW: DNA Read, Write & Edit
Part 1: Benchling & In-silico Gel Art See this week’s lab protocol “Gel Art: Restriction Digests and Gel Electrophoresis” for details. Overview: Make a free account at benchling.com Import the Lambda DNA. Simulate Restriction Enzyme Digestion with the following Enzymes: EcoRI HindIII BamHI KpnI EcoRV SacI SalI Restriction Enzyme Digestion made with Benchling Create a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks. You might find Ronan’s website a helpful tool for quickly iterating on designs! E=m*a2 EcoRV vs. EcoRI Single Enzymes Pyramid Enzymes Part 3: DNA Design Challenge
Week 3 HW: Lab Automation
Assignment: Python Script for Opentrons Artwork (DUE BY YOUR LAB TIME!) Your task this week is to create a Python file to run on an Opentrons liquid handling robot. Review this week’s recitation and this week’s lab for details on the Opentrons and programming it. Generate an artistic design using the GUI at opentrons-art.rcdonovan.com. Star's birth Rectangular color palette Iteration color palette 1 Iteration color palette 2 Iteration color palette 3 - Ellipse Iteration color palette 4 - Circumference Using the coordinates from the GUI, follow the instructions in the HTGAA26 Opentrons Colab to write your own Python script which draws your design using the Opentrons. You may use AI assistance for this coding: Google Gemini is integrated into Colab (see the stylized star bottom center); it will do a good job writing functional Python, while you probably need to take charge of the art concept. Iteration color palette 5 Coding done by Gemini This code was written with Google Gemini. The steps were: first, loading the coordinates made in the GUI; second, giving the AI instructions describing the expected outcome; finally, iterating until the idea was achieved. The instructions given to the AI did not include any code; they were all written in plain language.
Week 4: Protein Design - part I
Part A. Conceptual Questions Answer any NINE of the following questions from Shuguang Zhang: (i.e. you can select two to skip) How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons) Why do humans eat beef but do not become a cow, eat fish but do not become fish? Why are there only 20 natural amino acids? Can you make other non-natural amino acids? Design some new amino acids. Where did amino acids come from before enzymes that make them, and before life started? If you make an α-helix using D-amino acids, what handedness (right or left) would you expect? Can you discover additional helices in proteins? Why are most molecular helices right-handed? Why do β-sheets tend to aggregate? What is the driving force for β-sheet aggregation? Why do many amyloid diseases form β-sheets? Can you use amyloid β-sheets as materials? Design a β-sheet motif that forms a well-ordered structure. Part B: Protein Analysis and Visualization
Week 5: Protein Design - part II
Part A: SOD1 Binder Peptide Design (From Pranam) Superoxide dismutase 1 (SOD1) is a cytosolic antioxidant enzyme that converts superoxide radicals into hydrogen peroxide and oxygen. In its native state, it forms a stable homodimer and binds copper and zinc. Mutations in SOD1 cause familial Amyotrophic Lateral Sclerosis (ALS). Among them, the A4V mutation (Alanine → Valine at residue 4) leads to one of the most aggressive forms of the disease. The mutation subtly destabilizes the N-terminus, perturbs folding energetics, and promotes toxic aggregation.
Week 6: Genetic Circuits - part I
Assignment: DNA Assembly Answer these questions about the protocol in this week’s lab: What are some components in the Phusion High-Fidelity PCR Master Mix, and what is their purpose? Elements: Pyrococcus-like enzyme fused with a processivity-enhancing domain that increases fidelity and speed. 5′→3′ polymerase activity, 3′→5′ exonuclease activity, and generates blunt-ended products. Phusion DNA Polymerase, deoxynucleotides, and reaction buffer that has been optimized and includes MgCl2. Requirements to be added: template, primers, and water. Purpose:
Week 7 HW: Genetic Circuits-part II
Assignment Part 1: Intracellular Artificial Neural Networks (IANNs)

What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions? Intracellular Artificial Neural Networks (IANNs) provide an analog approach to computation within biological systems, enabling cells to process continuous inputs and generate graded, non-binary outputs. IANNs can integrate multiple signals and capture complex, non-linear relationships through distributed gene regulation, which lets them recognize patterns in complex inputs.

Week 9 HW: Cell Free Systems
Part A: General and Lecturer-Specific Questions General homework questions

Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production. Cell-free protein synthesis is a protein expression approach that enables the production of a target protein without the use of living cells. In vivo methods are also a protein expression approach, but one that uses living cells such as bacteria (most commonly E. coli), yeast, insect cells, and mammalian cells. Here is a chart that summarizes the differences between the two methods:

Week 10 HW: Imaging and Measurement
Waters Part I — Molecular Weight We will analyze an eGFP standard on a Waters Xevo G3 QTof MS system to determine the molecular weight of intact eGFP and observe its charge state distribution in the native and denatured (unfolded) states. The conditions for LC-MS analysis of intact protein cause it to unfold and be detected in its denatured form (due to the solvents and pH used for analysis).

Based on the predicted amino acid sequence of eGFP (see below) and any known modifications, what is the calculated molecular weight? You can use an online calculator like the one at https://web.expasy.org/compute_pi/

Week 11 HW: Bioproduction & Cloud Labs
Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork

Contribute at least one pixel to this global artwork experiment before the editing ends on Sunday 4/19 at 11:59 PM EST. A personalized URL was sent to the email address associated with your Discourse account, and you can discuss the artwork on the Discourse. If you did not have a chance to contribute, it’s okay, just make sure you become a TA this fall! 😉 2. Make a note on your HTGAA webpages including:

Week 1 HW: Principles and Practices

First, describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project and/or something for which you are already doing in your research, or something you are just curious about.

Thyroid implant for canines

In the canine world, there is a very specific problem that owners face: hypothyroidism. It is a common condition that dogs develop around the ages of 2 or 3. Still, most of the time it is mistaken for other health conditions, such as intestinal problems, allergies, or dermatological conditions. Because the condition affects so many systems in a dog's body, veterinarians can sometimes arrive at a misdiagnosis.

There are four types of hypothyroidism:

Primary
Secondary
Tertiary
Congenital

We are going to center our attention on primary hypothyroidism, caused by an immunological condition or an idiopathic problem.

The origin of primary hypothyroidism lies in the communication between the brain's instructions and the thyroid gland. Communication between the brain, the hypothalamus, and the pituitary gland (hypophysis) works at its normal pace, and the gland receives the message. However, the gland cannot keep producing T4, because it has been attacked by immune cells or its tissue has started to be replaced by fat beyond the point of return.

The interesting thing about this system is that it is only partially broken. The T4 and T3 hormones do not depend on the gland to be absorbed by the body; they depend on the genetic switches inside the organs they enter: liver, kidneys, brain, and muscles. Another interesting fact is that T4 is not a cell but a small, passive molecule that travels around the body through the circulatory system.

At present, there is a pharmaceutical solution called levothyroxine. This pill is a dose of synthetic T4 that is taken orally; once absorbed, it is delivered to the different organs, where it is converted to T3. Although it is a simple solution, the dose is not always accurate, and it has to be monitored every 6 months and adjusted according to multiple blood tests until it is right.

Now, what if we could implant a thyroid substitute that could read TSH directly from the blood and produce T4 as the body needs it, with a more precise response? To achieve this objective, we must address the following challenges:

The body needs to accept the cells as its own. The most reliable way to do this is to extract a sample of endodermal or glandular cells from the pet's body and reprogram them into thyroid cells. This should be done in a lab, and the cells must reach a stable state so that they operate accurately and do not stop working once they are inside the body. The importance of using cells from the same body is that they won't be read as a major threat, as other materials could be.
Since the body has a very acute immune system, the reprogrammed cells need to be placed in a container or membrane that filters out immune cells and controls infiltration. This membrane should be soft, with the right pore size to let in oxygen, TSH, and nutrients while letting T4 pass out into the bloodstream. It should be made of biomimetic biogels, which are highly hydrated and can be read as neutral elements by immune cells.
To grow an implant, it is important to guide the cells so that they reproduce the architecture of the thyroid gland. This could be done with a biopolymer scaffold that guides how the cells grow together.
Finally, the implant should be placed in the subcutaneous region, where it is surrounded by blood vessels, so that TSH can reach it and trigger the production of T4.

The implant must be tested and programmed to react over a period of weeks, because T4 levels are not regulated at the exact moment that TSH goes up or down, but rather when TSH stays at a given level for weeks. These slow reactions are similar to how the original gland behaves, and they do not raise a red flag for immune cells. The implant would respond accurately and at its natural pace, without introducing inaccurate doses of T4.

Another possible addition would be a nanochip that could scan and report on conditions inside the body. For now, however, it is not a stable solution, because the immune system would read it as a major danger that must be eliminated.

Thyroid implant scheme — ai-generated image

Next, describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future, like ensuring non-malfeasance (preventing harm). Break big goals down into two or more specific sub-goals. Below is one example framework (developed in the context of synthetic genomics) you can choose to use or adapt, or you can develop your own. The example was developed to consider policy goals of ensuring safety and security, alongside other goals, like promoting constructive uses, but you could propose other goals, for example, those relating to equity or autonomy.

Prevention of physical and psychological harm during experimental stages, non-malfeasance.

Strengthen diagnostic protocols for canine hypothyroidism before giving the individuals any medical procedure or medicine.
Limit access to, and the quantity of, samples extracted from each individual.
Establish rules and strong selection criteria for canine individuals that will be participating in the experimental stage.

Fair and free access to information

Establish clear, accessible, and transparent information for caregivers regarding the potential health risks that the research and experimental stages pose.
Ensure that the results and information obtained in the experimental stages are shared with students and professionals in relevant specializations, with emphasis on ethical and responsible use.
In the event of a successful result in canines and the intention to use this research on humans, any prior required animal experimentation must be reviewed and regulated by an official regulatory authority, ensuring fairness, accountability, and ethical continuity across species.

Next, describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”). Try to outline a mix of actions (e.g. a new requirement/rule, incentive, or technical strategy) pursued by different “actors” (e.g. academic researchers, companies, federal regulators, law enforcement, etc). Draw upon your existing knowledge and a little additional digging, and feel free to use analogies to other domains (e.g. 3D printing, drones, financial systems, etc.).

Purpose: What is done now and what changes are you proposing?
Design: What is needed to make it “work”? (including the actor(s) involved - who must opt-in, fund, approve, or implement, etc)
Assumptions: What could you have wrong (incorrect assumptions, uncertainties)?
Risks of Failure & “Success”: How might this fail, including any unintended consequences of the “success” of your proposed actions?

A. Review of individual cases based on the evaluation of diagnostics made by an external entity and professional

To obtain a verified hypothyroidism diagnosis, an external entity must review and confirm each evaluation. This review must rely on physical and chemical tests that provide consistent, standardized results, and its conclusion would accept or deny the participation of each subject. For this action to be applied, institutions and external professionals must collaborate, and these entities and actors must be accredited by and affiliated with the government (of Ecuador, in this case). External evaluation must be a mandatory step before initiating any procedure, because it ensures that participants are correctly diagnosed and that budgets, time, and assets are well used.

For this action to work, it is assumed that external professionals will be qualified and available to take part in the program, that diagnostic criteria will be applied consistently in every case, and that these reviews will reduce misdiagnosis. The risks include clinical disagreements between internal and external actors, and a high risk of corruption among professionals and institutions, which could dramatically lower the chances of achieving ethical practice and good treatment.

B. Unannounced auditor's inspections regarding research developments and animals' health and care

Research progress should be audited through unannounced visits so that all information about the project is clear. This action would be led by an inspector who is external to both the research institutions and the research team. It would ensure supervision of ethical practices in the lab or clinic, as well as of the real progress of the experimental project. The external inspector would be accredited by external academic institutions that are renowned for their knowledge in the field and their ethical practices. It is important that this actor have the authority to propose adjustments when needed, and to stop the research or allow it to continue when the conditions are met.

This would be a successful action assuming that academic institutions have this type of professional available and the capacity to support these control mechanisms. The risks are that frequent inspections could create a hostile environment, affecting the performance of researchers and, therefore, the success of the research. Administrative burdens could slow the process, and rigidity could cloud the creative and precise environment needed in this type of research.

C. Mandatory and transparent educational programs

Mandatory education for all actors involved, together with transparent communication pathways, ensures that experimental procedures, ethical considerations, and long-term care requirements are responsibly applied beyond the research setting. This action would be implemented through educational programs given by the academic institutions involved in this research. The programs would be open to the specialized students, professionals, and caregivers who take part, and would include accessible and clear training sessions focused on care requirements and ethical responsibilities.

Assuming that the information is passed on transparently and understood by all attendees, this could lead to more ethical decision-making and supervision among all the actors involved. On the other hand, these actions could fail if the educational content is not accessible to all participants, for example if it is overly technical or poorly communicated. It is also important to keep in mind the emotional strain that caregivers will experience throughout the experimental period; this can lead to unplanned or sudden dropouts, putting at risk both the research process and the animal’s life.

Next, score (from 1-3 with, 1 as the best, or n/a) each of your governance actions against your rubric of policy goals. The following is one framework but feel free to make your own:

Does the option:	Option 1	Option 2	Option 3
Enhance Biosecurity
• By preventing incidents	1	1	1
• By helping respond	3	2	2
Foster Lab Safety
• By preventing incidents	2	1	1
• By helping respond	3	2	2
Protect the environment
• By preventing incidents	3	2	3
• By helping respond	3	2	3
Other considerations
• Animal welfare and intervention	1	1	1
• Diagnostic accuracy and animal inclusion	1	2	1
• Ethical inclusion and transparency	2	1	2
• Equity in access to knowledge and care	2	3	1
• Minimizing costs and burdens to stakeholders	1	2	3
• Feasibility?	1	2	2
• Not impede research	2	3	2
• Promote constructive applications	3	1	1

Last, drawing upon this scoring, describe which governance option, or combination of options, you would prioritize, and why. Outline any trade-offs you considered as well as assumptions and uncertainties. For this, you can choose one or more relevant audiences for your recommendation, which could range from the very local (e.g. to MIT leadership or Cambridge Mayoral Office) to the national (e.g. to President Biden or the head of a Federal Agency) to the international (e.g. to the United Nations Office of the Secretary-General, or the leadership of a multinational firm or industry consortia). These could also be one of the “actor” groups in your matrix.

The governance option to prioritize is “Review of individual cases based on the evaluation of diagnostics made by an external entity and professional” (Option 1). Because it acts on the foundations of the experimental phases, it can prevent misdiagnosis or weakly supported clinical assumptions before the animal enters the trial. This option reduces the risk of unethical procedures, minimizes harm, protects animal welfare, and strengthens the scientific validity of the research. The trade-off is that this extra evaluation may slow the research down at the beginning and limit the number of participants, but it is preventive in many respects later on.

Options 2 and 3 are complementary governance options. These actions ensure accountability among all the actors involved by making knowledge a regulatory and independent system. They prevent the misuse of information and empower informed decision-making throughout the research cycle. External regulatory actors are also essential because they can verify whether clinical, laboratory, and welfare standards are applied before, during, and after experimentation. The trade-off is that research institutions might perceive these actors as intruders and as a source of uncertainty for the research. On the other hand, the combination of early prevention, continuous oversight, and broad education creates distributed responsibility, which reduces the chance that ethical failures go unnoticed or unaddressed.

Reflecting on what you learned and did in class this week, outline any ethical concerns that arose, especially any that were new to you. Then propose any governance actions you think might be appropriate to address those issues. This should be included on your class page for this week.

Because this experimental research needs to be done in live animals, specifically dogs, it is important to keep the line between ethical and unethical procedures very clear. The fact that we do not communicate in the same way might be an important barrier between researchers and subjects: the animals might end up being seen as numbers rather than as brave and important living beings who are giving their lives to the experiment. This raised a question for me: is a major medical advancement worth the many health discomforts that the research subjects will undergo?

Another question to keep in mind is whether we could develop a nanoscanner able to read the gland’s condition without any invasive procedure, preventing misdiagnosis and giving a much more accurate reading of the situation inside the animal's body.

References on thyroid organoids in humans:

Kariyawasam, D., Stoupa, A., Nguyen Quoc, A., Pimentel Dantas, I., Polak, M., & Carré, A. (2025). From stem cells to organoids in thyroid: Useful tools or a step for cell therapy? La Presse Médicale, 54(4), 104301. https://doi.org/10.1016/j.lpm.2025.104301
Zhang, Y., Fu, M., Wang, H., & Sun, H. (2023). Advances in the Construction and Application of Thyroid Organoids. Physiological Research, 72(5), 557–564. https://doi.org/10.33549/physiolres.935102

Week 2 HW: DNA Read, Write & Edit

Part 1: Benchling & In-silico Gel Art

See this week’s lab protocol “Gel Art: Restriction Digests and Gel Electrophoresis” for details. Overview:

Make a free account at benchling.com
Import the Lambda DNA.
Simulate Restriction Enzyme Digestion with the following Enzymes:
- EcoRI
- HindIII
- BamHI
- KpnI
- EcoRV
- SacI
- SalI

Virtual digest sequence LAMCG — *Restriction Enzyme Digestion made with Benchling*

Create a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks.
You might find Ronan’s website a helpful tool for quickly iterating on designs!
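
The digest simulation above can also be sketched in a few lines of Python, which helps when iterating on gel-art patterns. This is only a minimal sketch, not how Benchling does it: it assumes a linear sequence, searches for the standard recognition sites of the seven listed enzymes, and returns fragment lengths. The function names and the toy sequence are made up for illustration; the real Lambda DNA sequence would have to be loaded from the imported file.

```python
# Recognition sites (5'->3') and top-strand cut offsets for the enzymes
# listed above. The offset counts bases from the start of the site.
ENZYMES = {
    "EcoRI":   ("GAATTC", 1),  # G^AATTC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "BamHI":   ("GGATCC", 1),  # G^GATCC
    "KpnI":    ("GGTACC", 5),  # GGTAC^C
    "EcoRV":   ("GATATC", 3),  # GAT^ATC
    "SacI":    ("GAGCTC", 5),  # GAGCT^C
    "SalI":    ("GTCGAC", 1),  # G^TCGAC
}


def cut_positions(seq, enzyme):
    """Return the top-strand cut positions of one enzyme in a linear sequence."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, enzymes):
    """Return fragment lengths (largest first) after cutting with all enzymes."""
    cuts = sorted({p for e in enzymes for p in cut_positions(seq, e)})
    bounds = [0] + cuts + [len(seq)]
    lengths = [b - a for a, b in zip(bounds, bounds[1:])]
    return sorted(lengths, reverse=True)


# Toy sequence (hypothetical, NOT Lambda DNA): one EcoRI and one EcoRV site.
toy = "AAAAGAATTCAAAAAAAAGATATCAAAA"
print(digest(toy, ["EcoRI"]))           # single-enzyme lane
print(digest(toy, ["EcoRI", "EcoRV"]))  # double-digest lane
```

Each call to digest corresponds to one lane of the virtual gel. Because all seven sites are palindromic, searching only the top strand is enough for this sketch.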

Part 3: DNA Design Challenge

3.1. Choose your protein.

In recitation, we discussed that you will pick a protein for your homework that you find interesting. Which protein have you chosen and why? Using one of the tools described in recitation (NCBI, UniProt, google), obtain the protein sequence for the protein you chose.

[Example from our group homework, you may notice the particular format — The example below came from UniProt]

>sp|P03609|LYS_BPMS2 Lysis protein OS=Escherichia phage MS2 OX=12022 PE=2 SV=1
METRFPQQSQQTPASTNRRPRFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLSLLEAVIRTVTTLQQLLT

THYROGLOBULIN (CANIS LUPUS FAMILIARIS)

The world of proteins is so vast that choosing a single protein has been a profound task.

To follow the same path as the Week 1 HW, let's start with thyroglobulin, a very complex and specialized protein that is key to the generation of the T3 and T4 hormones; in other words, it is the precursor protein from which these hormones are made. Because of its complexity and specificity, it can be considered a modern protein. Some interesting facts about thyroglobulin: it is very large compared to other proteins, it functions only in the thyroid gland, it is prone to attack by immune cells when something is not working well, and it tolerates very few errors in its processing. By comparison, actin is a simpler protein with a general role, and it has been present in all eukaryotic life forms since early in their history. Actin is in charge of forming the cytoskeleton and of the motility and shape of cells, among many other functions. Interestingly, actin can tolerate errors, in contrast to thyroglobulin, which is very precise.

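In the exercise below I will work with thyroglobulin from Canis lupus familiaris. Note that the UniProt record pasted here (P15881, ITF2_CANLF) is annotated in its own header as Transcription factor 4 (gene TCF4), not thyroglobulin, so this sequence and the DNA sequences derived from it in 3.2 and 3.3 correspond to TCF4 until the canine thyroglobulin record is substituted.

THYROGLOBULIN - CANIS LUPUS FAMILIARIS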

>sp|P15881|ITF2_CANLF Transcription factor 4 OS=Canis lupus familiaris OX=9615 GN=TCF4 PE=2 SV=2
MFSPPVSSGKNGPTSLASGHFTGSNVEDRSSSGSWGNGGHPSPSRNYGDGTPYDHMTSRD
LGSHDNLSPPFVNSRIQSKTERGSYSSYGRESNLQGCHQSLLGGDMDMGTPGTLSPTKPG
SQYYQYSSNNPRRRPLHSSAMEVQTKKVRKVPPGLPSSVYAPSASTADYNRDSPGYPSSK
PAASTFPSSFFMQDGHHSSDPWSSSSGMNQPGYGGMLGSSSHIPQSSSYCSLHPHERLSY
PSHSSADINSSLPPMSTFHRSGTNHYSTSSCTPPANGTDSIMANRGSGAAGSSQTGDALG
KALASIYSPDHTNNSFSSNPSTPVGSPPSLSAGTAVWSRNGGQASSSPNYEGPLHSLQSR
IEDRLERLDDAIHVLRNHAVGPSTAMPGGHGDMHGIIGPSHNGAMGGLGSGYGTGLLSAN
RHSLMVGAHREDGVALRGSHSLVPNQVPVPQLPVQSATSPDLNPPQDPYRGMPPGLQGQS
VSSGSSEIKSDDEGDENLQDTKSSEDKKLDDDKKDIKSITSNNDDEDLTPEQKAEREKER
RMANNARERLRVRDINEAFKELGRMVQLHLKSDKPQTKLLILHQAVAVILSLEQQVRERN
LNPKAACLKRREEEKVSSEPPPLSLAGPHPGMGDASNHMGQM

3.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.

The Central Dogma discussed in class and recitation describes the process in which DNA sequence becomes transcribed and translated into protein. The Central Dogma gives us the framework to work backwards from a given protein sequence and infer the DNA sequence that the protein is derived from. Using one of the tools discussed in class, NCBI or online tools (google “reverse translation tools”), determine the nucleotide sequence that corresponds to the protein sequence you chose above.

[Example: Get to the original sequence of phage MS2 L-protein from its genome phage MS2 genome - Nucleotide - NCBI]

Lysis protein DNA sequence
atggaaacccgattccctcagcaatcgcagcaaactccggcatctactaatagacgccggccattcaaacatgaggattacccatgtcgaagacaacaaagaagttcaactctttatgtattgatcttcctcgcgatctttctctcgaaatttaccaatcaattgcttctgtcgctactggaagcggtgatccgcacagtgacgactttacagcaattgcttacttaa

THYROGLOBULIN - DNA sequence

atgttctccccccccgtgtcctccggcaagaacggccccacctccctggcctccggccac
ttcaccggctccaacgtggaggaccgctcctcctccggctcctggggcaacggcggccac
ccctccccctcccgcaactacggcgacggcaccccctacgaccacatgacctcccgcgac
ctgggctcccacgacaacctgtccccccccttcgtgaactcccgcatccagtccaagacc
gagcgcggctcctactcctcctacggccgcgagtccaacctgcagggctgccaccagtcc
ctgctgggcggcgacatggacatgggcacccccggcaccctgtcccccaccaagcccggc
tcccagtactaccagtactcctccaacaacccccgccgccgccccctgcactcctccgcc
atggaggtgcagaccaagaaggtgcgcaaggtgccccccggcctgccctcctccgtgtac
gccccctccgcctccaccgccgactacaaccgcgactcccccggctacccctcctccaag
cccgccgcctccaccttcccctcctccttcttcatgcaggacggccaccactcctccgac
ccctggtcctcctcctccggcatgaaccagcccggctacggcggcatgctgggctcctcc
tcccacatcccccagtcctcctcctactgctccctgcacccccacgagcgcctgtcctac
ccctcccactcctccgccgacatcaactcctccctgccccccatgtccaccttccaccgc
tccggcaccaaccactactccacctcctcctgcaccccccccgccaacggcaccgactcc
atcatggccaaccgcggctccggcgccgccggctcctcccagaccggcgacgccctgggc
aaggccctggcctccatctactcccccgaccacaccaacaactccttctcctccaacccc
tccacccccgtgggctcccccccctccctgtccgccggcaccgccgtgtggtcccgcaac
ggcggccaggcctcctcctcccccaactacgagggccccctgcactccctgcagtcccgc
atcgaggaccgcctggagcgcctggacgacgccatccacgtgctgcgcaaccacgccgtg
ggcccctccaccgccatgcccggcggccacggcgacatgcacggcatcatcggcccctcc
cacaacggcgccatgggcggcctgggctccggctacggcaccggcctgctgtccgccaac
cgccactccctgatggtgggcgcccaccgcgaggacggcgtggccctgcgcggctcccac
tccctggtgcccaaccaggtgcccgtgccccagctgcccgtgcagtccgccacctccccc
gacctgaaccccccccaggacccctaccgcggcatgccccccggcctgcagggccagtcc
gtgtcctccggctcctccgagatcaagtccgacgacgagggcgacgagaacctgcaggac
accaagtcctccgaggacaagaagctggacgacgacaagaaggacatcaagtccatcacc
tccaacaacgacgacgaggacctgacccccgagcagaaggccgagcgcgagaaggagcgc
cgcatggccaacaacgcccgcgagcgcctgcgcgtgcgcgacatcaacgaggccttcaag
gagctgggccgcatggtgcagctgcacctgaagtccgacaagccccagaccaagctgctg
atcctgcaccaggccgtggccgtgatcctgtccctggagcagcaggtgcgcgagcgcaac
ctgaaccccaaggccgcctgcctgaagcgccgcgaggaggagaaggtgtcctccgagccc
ccccccctgtccctggccggcccccaccccggcatgggcgacgcctccaaccacatgggc
cagatg
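
The reverse translation above can be reproduced with a small Python sketch. It assumes one fixed codon per amino acid; the table below was read off the sequence above (for example M gives atg, F gives ttc, S gives tcc) and is an assumption about the tool's settings, since most amino acids have several valid codons. The function name is made up for illustration.

```python
# One codon per amino acid, as observed in the reverse-translated sequence
# above. This is one of many valid choices, not the only correct table.
CODON_FOR = {
    "A": "gcc", "C": "tgc", "D": "gac", "E": "gag", "F": "ttc",
    "G": "ggc", "H": "cac", "I": "atc", "K": "aag", "L": "ctg",
    "M": "atg", "N": "aac", "P": "ccc", "Q": "cag", "R": "cgc",
    "S": "tcc", "T": "acc", "V": "gtg", "W": "tgg", "Y": "tac",
}


def reverse_translate(protein):
    """Map each amino acid letter to its chosen codon and join the codons."""
    return "".join(CODON_FOR[aa] for aa in protein.upper())


# First ten residues of the protein sequence in 3.1.
first_residues = "MFSPPVSSGK"
dna = reverse_translate(first_residues)
print(dna)
```

Because the mapping from protein to DNA is one-to-many, a different table would give a different but equally valid DNA sequence, which is what makes codon optimization in 3.3 possible. Note that the sketch, like the sequence above, does not add a stop codon.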

3.3. Codon optimization.

Once a nucleotide sequence of your protein is determined, you need to codon optimize your sequence. You may, once again, utilize google for a “codon optimization tool”. In your own words, describe why you need to optimize codon usage. Which organism have you chosen to optimize the codon sequence for and why?

[Example from Codon Optimization Tool | Twist Bioscience while avoiding Type IIs enzyme recognition sites BsaI, BsmBI, and BbsI]

Lysis protein DNA sequence with Codon-Optimization
ATGGAAACCCGCTTTCCGCAGCAGAGCCAGCAGACCCCGGCGAGCACCAACCGCCGCCGCCCGTTCAAACATGAAGATTATCCGTGCCGTCGTCAGCAGCGCAGCAGCACCCTGTATGTGCTGATTTTTCTGGCGATTTTTCTGAGCAAATTCACCAACCAGCTGCTGCTGAGCCTGCTGGAAGCGGTGATTCGCACAGTGACGACCCTGCAGCAGCTGCTGACCTAA

Codon optimization rewrites the coding sequence from the original organism using the codons preferred by the organism that will produce the protein, without modifying the amino acid sequence. Different organisms use synonymous codons at different frequencies, so a sequence written with the host's preferred codons is generally expressed more efficiently. In research, a supply of the protein is needed for analysis and testing, and it is not sustainable to always obtain it from the original organism for many reasons: budget, quantity, ethics, etc.

In the case of canine thyroglobulin, the experiment is based on the question: how can we produce canine thyroglobulin that could be used as one component of a thyroid gland implant? The host that will produce the protein is CHO (Chinese hamster ovary) cells, a mammalian cell line with the capacity to perform the complex processing needed to produce proteins as specialized as thyroglobulin.
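
One way to check that codon optimization did not change the amino acid sequence is to translate both DNA sequences and compare the results. The sketch below is a minimal illustration under stated assumptions: it uses the standard genetic code, the function names are made up, and the two inputs are just the first six codons of the lysis protein example before and after optimization.

```python
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding DNA sequence codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def gc_content(dna):
    """Fraction of G and C bases in the sequence."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)


# First six codons of the lysis protein example, before and after optimization.
original = "atggaaacccgattccct"
optimized = "ATGGAAACCCGCTTTCCG"

print(translate(original), translate(optimized))
print(gc_content(original), gc_content(optimized))
```

If the two translations differ, the optimization tool changed the protein and the result should not be used. The GC content is printed only as a quick indicator of how much the nucleotide sequence shifted.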

THYROGLOBULIN - DNA sequence with Codon-Optimization

ATGTTCTCACCACCTGTGTCTTCTGGCAAGAATGGCCCCACCTCCCTGGCTTCTGGCCACTTCACCGGAAGCAACGTGGAGGACAGGTCCTCTTCCGGCTCCTGGGGCAATGGCGGCCACCCAAGTCCATCTCGAAACTACGGCGACGGGACCCCTTACGATCACATGACCTCCAGAGACCTGGGCTCTCATGACAATCTGTCTCCCCCATTTGTGAACTCCCGGATTCAGTCTAAGACTGAGCGGGGCTCATACAGCTCTTACGGACGCGAGAGCAACCTGCAGGGTTGTCACCAGTCCCTGCTGGGCGGAGACATGGACATGGGCACCCCCGGGACCCTCTCTCCTACTAAGCCTGGCTCTCAGTATTACCAGTACTCCTCCAATAACCCTCGAAGGCGGCCCCTGCACAGCAGTGCCATGGAGGTCCAGACAAAGAAAGTCAGGAAGGTGCCACCAGGCCTGCCCAGCTCCGTCTATGCCCCAAGCGCCTCCACCGCCGATTACAATCGAGATAGCCCCGGTTACCCCTCCTCTAAGCCAGCAGCCTCTACTTTCCCTAGCTCCTTCTTTATGCAGGACGGCCATCACTCAAGTGATCCTTGGTCCAGCAGCTCTGGCATGAACCAGCCAGGGTACGGTGGTATGCTGGGTTCTTCCAGTCACATCCCTCAGTCTTCTTCCTACTGTAGTCTGCATCCACATGAGCGCCTGTCATACCCCAGCCACTCCTCTGCCGACATCAATAGCTCCCTGCCACCCATGTCAACCTTCCATAGGAGCGGCACTAACCATTATTCCACATCCAGCTGCACTCCTCCCGCTAACGGTACTGACTCTATCATGGCTAACAGAGGCTCCGGCGCCGCTGGAAGCAGTCAGACCGGAGATGCTCTCGGCAAGGCCCTGGCTTCTATCTATTCTCCCGACCATACCAACAATTCTTTCAGTAGCAACCCTTCTACTCCCGTGGGCTCCCCTCCTTCCCTGTCCGCCGGAACCGCTGTGTGGTCTAGGAATGGCGGCCAGGCCAGCTCCAGCCCTAATTATGAGGGCCCCCTGCACAGCCTGCAGTCTCGTATTGAGGATAGGCTGGAGCGACTGGACGACGCTATTCACGTGCTGCGTAACCATGCTGTGGGCCCAAGCACCGCTATGCCCGGGGGACACGGAGACATGCACGGAATCATCGGCCCTTCTCACAACGGGGCTATGGGGGGTCTGGGCAGCGGCTACGGAACAGGCCTGCTGTCCGCCAACAGGCACTCTCTGATGGTGGGTGCCCACCGGGAAGACGGAGTGGCCCTGAGAGGGTCACATAGCCTGGTGCCTAACCAGGTGCCTGTGCCTCAGCTGCCCGTGCAGAGTGCTACTAGCCCCGATCTGAACCCTCCACAGGACCCTTACAGAGGCATGCCACCCGGTCTGCAGGGACAGTCTGTGTCCTCTGGCAGTAGCGAGATCAAGTCAGATGACGAGGGAGACGAGAACCTGCAGGATACAAAGAGCTCCGAGGATAAGAAATTGGACGACGACAAGAAGGACATCAAGTCCATCACCAGCAACAACGACGACGAGGACCTGACTCCTGAGCAGAAGGCCGAACGGGAAAAGGAAAGGCGGATGGCTAACAATGCAAGAGAACGCCTGAGGGTCAGGGATATCAATGAGGCTTTCAAGGAGCTGGGCAGGATGGTGCAGCTGCATCTTAAGTCTGACAAGCCACAGACAAAGCTGCTGATCCTGCACCAGGCTGTGGCTGTGATTCTGTCCCTGGAGCAGCAGGTGAGAGAGAGGAACCTGAACCCTAAGGCCGCTTGCCTGAAAAGACGGGAGGAGGAAAAAGTGAGCTCTGAGCCCCCACCCCTCTCCCTGGCCGGACCACACCCCGGCATGGGCGACGCTTCTAACCACATGGGCCAGATG

3.4. You have a sequence! Now what?

What technologies could be used to produce this protein from your DNA? Describe in your words how the DNA sequence can be transcribed and translated into your protein. You may describe either cell-dependent or cell-free methods, or both.

To produce this protein from my DNA, I would use a codon-optimized sequence so that the host cells (here, CHO cells) can translate it efficiently. Although canine and human thyroglobulin and their DNA are not very different, this step is still needed because each organism prefers different codons for the same amino acid. Codon optimization swaps in the host's preferred synonymous codons without changing the order of the amino acids. Inside the cell, the DNA is transcribed into mRNA, and ribosomes read the mRNA three bases at a time to build the protein. The technologies to do this would be:

cell-dependent methods: for producing the protein, specifically CHO cells, which come from Chinese hamster ovaries. Because this protein comes from a mammal, the production cells should also be mammalian. Bacterial cells such as E. coli are not efficient here, because the protein needs to fold in a specific way, and protein processing differs greatly between bacteria, mammals, and plants.
bioreactor: for scaling up production, avoiding contamination, and giving the cells a controlled environment in which to grow and fold the protein.
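
To make the transcription and translation steps concrete, here is a minimal Python sketch. It transcribes the first six codons of the codon-optimized sequence above into mRNA and translates them with the standard genetic code. The function names `transcribe` and `translate` are illustrative only (they do not come from any library), and a real workflow would normally rely on a tool such as Benchling for this step.

```python
# Standard genetic code, written compactly: codons are enumerated with the
# bases in the order T, C, A, G at each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def transcribe(coding_dna):
    """Return the mRNA for a coding (sense) strand: same sequence, U instead of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three bases at a time and stop at the first stop codon."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        codon = mrna[start:start + 3].replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon: the ribosome releases the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


# First six codons of the codon-optimized sequence shown above.
first_codons = "ATGTTCTCACCACCTGTG"
mrna = transcribe(first_codons)
print(mrna)
print(translate(mrna))
```

In a cell, RNA polymerase performs the first step and the ribosome performs the second; the sketch only mirrors the information flow, not the biochemistry.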

Part 4: Prepare a Twist DNA Synthesis Order

This is a practice exercise, not necessarily your real Twist order!

4.1. Create a Twist account and a Benchling account

The Benchling account was created and used successfully, but I was not able to get access to a Twist account.

4.2. Build Your DNA Insert Sequence

For example, let’s make a sequence that will make E. coli glow fluorescent green under UV light by constitutively (always) expressing sfGFP (a green fluorescent protein):

In Benchling, select New DNA/RNA sequence
Give your insert sequence a name and select DNA with a Linear topology (this is a linear sequence that will be inserted into a circular backbone vector of our choosing).
Go through each piece of the given DNA sequences highlighted below (Promoter, RBS, Start Codon, Coding Sequence, His Tag, Stop Codon, Terminator) and paste the sequences into the Benchling file one after the other (replacing the coding sequence with your codon optimized DNA sequence of interest!). Each time you add a new piece of the sequence, make sure to annotate by right clicking over the sequence and creating an annotation that describes what each piece (e.g., Promoter, RBS, etc.) is (see image below).

For this part, I started by searching for the protein sequence in FASTA format, then I reverse-translated it to a DNA sequence, and finally I codon-optimized the DNA sequence for E. coli. The results are below:

FASTA SEQUENCE:

>UFQ89826.1 sfGFP [synthetic construct]
MRKGEELFTGVVPILVELDGDVNGHKFSVRGEGEGDATNGKLTLKFICTTGKLPVPWPTLVTTLTYGVQC
FARYPDHMKQHDFFKSAMPEGYVQERTISFKDDGTYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHK
LEYNFNSHNVYITADKQKNGIKANFKIRHNVEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSVLSKD
PNEKRDHMVLLEFVTAAGITHGMDELYKRP

REVERSE TRANSLATE:

atgcgtaaaggtgaagaactgtttaccggtgttgttccgattctggttgaactggatggt
gatgttaatggtcataaatttagcgttcgtggtgaaggtgaaggtgatgcaaccaatggt
aaactgaccctgaaatttatttgtaccaccggtaaactgccggttccgtggccgaccctg
gttaccaccctgacctatggtgttcagtgttttgcacgttatccggatcatatgaaacag
catgatttttttaaaagcgcaatgccggaaggttatgttcaggaacgtaccattagcttt
aaagatgatggtacctataaaacccgtgcagaagttaaatttgaaggtgataccctggtt
aatcgtattgaactgaaaggtattgattttaaagaagatggtaatattctgggtcataaa
ctggaatataattttaatagccataatgtttatattaccgcagataaacagaaaaatggt
attaaagcaaattttaaaattcgtcataatgttgaagatggtagcgttcagctggcagat
cattatcagcagaataccccgattggtgatggtccggttctgctgccggataatcattat
ctgagcacccagagcgttctgagcaaagatccgaatgaaaaacgtgatcatatggttctg
ctggaatttgttaccgcagcaggtattacccatggtatggatgaactgtataaacgtccg

CODON OPTIMIZATION FOR E. COLI

ATGCGTAAAGGCGAAGAACTGTTTACCGGCGTGGTGCCGATTCTGGTGGAACTGGATGGCGATGTGAATGGCCATAAATTTAGCGTGCGTGGCGAAGGTGAAGGCGATGCGACCAATGGCAAACTGACCCTGAAATTTATCTGCACCACCGGTAAACTGCCGGTGCCGTGGCCGACCCTGGTGACCACCCTGACCTACGGCGTGCAGTGTTTTGCGCGCTACCCGGATCATATGAAACAGCATGATTTTTTTAAAAGCGCGATGCCGGAAGGCTATGTGCAGGAACGCACCATTAGTTTCAAAGATGATGGCACCTACAAAACCCGTGCCGAAGTGAAATTCGAAGGCGATACCCTGGTGAATCGCATTGAACTGAAAGGCATTGATTTTAAAGAAGATGGTAACATTCTGGGCCACAAACTGGAATACAACTTTAACAGCCATAACGTGTACATTACGGCGGATAAACAGAAAAATGGCATTAAAGCGAACTTTAAAATTCGCCATAACGTGGAAGATGGCTCAGTGCAGCTGGCGGATCACTATCAGCAGAACACCCCGATTGGCGATGGCCCGGTTCTGCTGCCGGATAACCACTATCTGAGCACCCAGAGCGTGCTGTCGAAAGATCCGAACGAAAAACGCGATCACATGGTGCTGCTGGAATTTGTGACCGCCGCGGGCATCACCCATGGTATGGATGAACTGTATAAACGCCCG
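
As a rough illustration of what the optimization step changed, the sketch below compares the first 20 codons of the reverse-translated sequence with the same 20 codons of the E. coli-optimized sequence. The helper names `codons` and `gc_fraction` are made up for this example, and only the first 60 bases of each sequence are used to keep it short.

```python
# First 60 bases (20 codons) of each sequence shown above.
reverse_translated = (
    "atgcgtaaaggtgaagaactgtttaccggtgttgttccgattctggttgaactggatggt".upper()
)
optimized = "ATGCGTAAAGGCGAAGAACTGTTTACCGGCGTGGTGCCGATTCTGGTGGAACTGGATGGC"


def codons(seq):
    """Split a sequence into consecutive three-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


# Codon positions (1-based) where the two sequences differ.
changed = [
    (position, before, after)
    for position, (before, after) in enumerate(
        zip(codons(reverse_translated), codons(optimized)), start=1
    )
    if before != after
]

for position, before, after in changed:
    print(f"codon {position}: {before} -> {after}")
print(f"GC before: {gc_fraction(reverse_translated):.2f}")
print(f"GC after:  {gc_fraction(optimized):.2f}")
```

For these 20 codons, six codons change (GGT to GGC, both glycine, and GTT to GTG, both valine), so the protein is unchanged while the GC fraction rises from 0.45 to 0.55.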

After that, I imported the sequence into Benchling, where it looked like this:

And finally, I added each of the parts listed before:

Promoter (e.g. BBa_J23106): TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC
RBS (e.g. BBa_B0034 with spacers for optimal expression): CATTAAAGAGGAGAAAGGTACC
Start Codon: ATG
Coding Sequence (your codon optimized DNA for a protein of interest, sfGFP for example): for this part I used the one listed before
7x His Tag (Let’s add a 7×His tag at the C-terminus of the protein to enable protein purification from E. coli): CATCACCATCACCATCATCAC
Stop Codon: TAA
Terminator (e.g. BBa_B0015): CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA

The final result was this:

🍊Orange: promoter 🌾 Light orange: RBS 🌞 Yellow: Start codon 🍏 Green: coding sequence 💎 Sky blue: 7xHis Tag 🌞 Yellow: Stop codon 🎀 Pink: Terminator

Finally, this is the link of the construct:

https://benchling.com/s/seq-TQtZceLQYEMAHO9Vdig1?m=slm-SUGXLLIINcheG13p5SEZ
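
The assembly done by hand in step 4.2 can also be sketched in a few lines of Python. The part sequences are the ones listed above; the coding sequence is truncated to six sfGFP codons for readability, and it is assumed to be pasted without its own leading ATG because the start codon is a separate part. The function `assemble` and the annotation format are hypothetical, not a Benchling API.

```python
# Truncated coding sequence for illustration: sfGFP codons 2-7 from the
# E. coli-optimized sequence (assumption: the leading ATG is supplied by
# the separate "Start codon" part).
CDS_EXCERPT = "CGTAAAGGCGAAGAACTG"

PARTS = [
    ("Promoter (BBa_J23106)", "TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC"),
    ("RBS (BBa_B0034 with spacers)", "CATTAAAGAGGAGAAAGGTACC"),
    ("Start codon", "ATG"),
    ("Coding sequence", CDS_EXCERPT),
    ("7x His tag", "CATCACCATCACCATCATCAC"),
    ("Stop codon", "TAA"),
    ("Terminator (BBa_B0015)",
     "CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGG"
     "TGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA"),
]


def assemble(parts):
    """Concatenate parts in order and record 1-based inclusive coordinates."""
    sequence = ""
    annotations = []
    for name, part in parts:
        start = len(sequence) + 1
        sequence += part
        annotations.append((name, start, len(sequence)))
    return sequence, annotations


insert, annotations = assemble(PARTS)
for name, start, end in annotations:
    print(f"{start:>4}-{end:<4} {name}")

# The open reading frame runs from the start codon through the stop codon
# and must be a whole number of codons for the His tag to stay in frame.
orf_start = annotations[2][1]
orf_end = annotations[5][2]
print("ORF length divisible by 3:", (orf_end - orf_start + 1) % 3 == 0)
```

Printing the coordinates makes it easy to compare against the annotations created manually in Benchling.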

- 4.3. On Twist, Select The “Genes” Option

- 4.4. Select “Clonal Genes” option

- 4.5. Import your sequence

- 4.6. Choose Your Vector

Part 5: DNA Read/Write/Edit

5.1 DNA Read (i) What DNA would you want to sequence (e.g., read) and why? This could be DNA related to human health (e.g. genes related to disease research), environmental monitoring (e.g., sewage waste water, biodiversity analysis), and beyond (e.g. DNA data storage, biobank).

I would like to sequence the DNA of an Ecuadorian tree called Cholán, or Tecoma stans.

(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why? Also answer the following questions:

Is your method first-, second- or third-generation or other? How so?
What is your input? How do you prepare your input (e.g. fragmentation, adapter ligation, PCR)? List the essential steps.
What are the essential steps of your chosen sequencing technology, how does it decode the bases of your DNA sample (base calling)?
What is the output of your chosen sequencing technology?

5.2 DNA Write

(i) What DNA would you want to synthesize (e.g., write) and why? These could be individual genes, clusters of genes or genetic circuits, whole genomes, and beyond. As described in class thus far, applications could range from therapeutics and drug discovery (e.g., mRNA vaccines and therapies) to novel biomaterials (e.g. structural proteins), to sensors (e.g., genetic circuits for sensing and responding to inflammation, environmental stimuli, etc.), to art (DNA origamis). If possible, include the specific genetic sequence(s) of what you would like to synthesize! You will have the opportunity to actually have Twist synthesize these DNA constructs! :)

(ii) What technology or technologies would you use to perform this DNA synthesis and why? Also answer the following questions:

What are the essential steps of your chosen synthesis methods?
What are the limitations of your synthesis method (if any) in terms of speed, accuracy, scalability?

5.3 DNA Edit

(i) What DNA would you want to edit and why? In class, George shared a variety of ways to edit the genes and genomes of humans and other organisms. Such DNA editing technologies have profound implications for human health, development, and even human longevity and human augmentation. DNA editing is also already commonly leveraged for flora and fauna, for example in nature conservation efforts, (animal/plant restoration, de-extinction), or in agriculture (e.g. plant breeding, nitrogen fixation). What kinds of edits might you want to make to DNA (e.g., human genomes and beyond) and why?

(ii) What technology or technologies would you use to perform these DNA edits and why? Also answer the following questions:

How does your technology of choice edit DNA? What are the essential steps?
What preparation do you need to do (e.g. design steps) and what is the input (e.g. DNA template, enzymes, plasmids, primers, guides, cells) for the editing?
What are the limitations of your editing methods (if any) in terms of efficiency or precision?

Week 3 HW: Lab Automation

Assignment: Python Script for Opentrons Artwork — DUE BY YOUR LAB TIME!

Your task this week is to Create a Python file to run on an Opentrons liquid handling robot.

Review this week’s recitation and this week’s lab for details on the Opentrons and programming it.
Generate an artistic design using the GUI at opentrons-art.rcdonovan.com.

Iteration color palette 4 - Circumference

Using the coordinates from the GUI, follow the instructions in the HTGAA26 Opentrons Colab to write your own Python script which draws your design using the Opentrons.

You may use AI assistance for this coding — Google Gemini is integrated into Colab (see the stylized star bottom center); it will do a good job writing functional Python, while you probably need to take charge of the art concept.

This code was written with Google Gemini. The steps were: first, loading the coordinates made in the GUI; second, telling the AI what the expected outcome was; finally, iterating until the idea was achieved. The instructions given to the AI contained no code at all; everything was done with written instructions.
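
Before handing GUI coordinates to the robot, it can help to confirm that every point falls inside the drawable area of the plate. The sketch below is one way to do that, assuming the 84 mm inner diameter and 2 mm margin that appear in the simulator's colors.py printed further down this page. The function name `points_outside_plate` is made up for this example, and the third sample point is deliberately invalid.

```python
import math

# Values taken from colors.py in the simulator: 84 mm inner diameter, 2 mm margin.
PETRI_INNER_DIAMETER_MM = 84
MAX_DRAW_RADIUS_MM = PETRI_INNER_DIAMETER_MM / 2 - 2  # 40 mm


def points_outside_plate(points, max_radius=MAX_DRAW_RADIUS_MM):
    """Return the (x, y) points farther from the plate center than max_radius."""
    return [(x, y) for x, y in points if math.hypot(x, y) > max_radius]


# Two points from the design plus one hypothetical point that is too far out.
sample_points = [(-4.4, 39.6), (26.4, -26.4), (30.0, 30.0)]
print(points_outside_plate(sample_points))
```

An empty list means every point is within the margin; anything returned should be moved or removed in the GUI before running the protocol.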

If you’re a proficient programmer and you’d rather code something mathematical or algorithmic instead of using your GUI coordinates, you may do that instead.

If the Python component is proving too problematic even with AI and human assistance, download the full Python script from the GUI website and submit that:
If you use AI to help complete this homework or lab, document how you used AI and which models made contributions.

Sign up for a robot time slot if you are at MIT/Harvard/Wellesley or at a Node offering Opentrons automation. The Python script you created will be run on the robot to produce your work of art!

At MIT/Harvard? Lab times are on Thursday Feb.19 between 10AM and 6PM.
At other Nodes? Please coordinate with your Node.

Submit your Python file via this form.
- STAR'S BIRTH CODES

ITERATION COLOR PALETTE - 5

SUCCESSFUL ATTEMPT

As you can see, the code has not been uploaded, because I could not work out how to do it. After I asked several people for help, one of my classmates, María José Rivas, gave me this link: https://github.com/Mozta/opentrons-bioart-sim/tree/main?tab=readme-ov-file#from-source-for-development. She used it to load her coordinates and well colors from opentrons-art into the Colab doc. The catch is that the opentrons-art well colors differ from the Colab doc well colors. The simulator from that link runs the protocol fine.
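
The well-color mismatch can be shown with a small check. The sketch below compares the color names a design uses against a well map, ignoring case in the same way as the `location_of_color` helper in the protocol. The function `missing_colors` is hypothetical, and the second well map is only an excerpt of the one in the opentrons-art script.

```python
# Well map from the Colab template.
template_wells = {"A1": "Red", "B1": "Green", "C1": "Orange"}

# Excerpt of the well map from the opentrons-art script.
gui_wells_excerpt = {"A1": "sfGFP", "A3": "mKO2", "B1": "Electra2"}

# A few of the color names used by the design's point lists.
design_colors = ["electra2", "sfgfp", "mko2"]


def missing_colors(colors, well_colors):
    """Return the design colors that have no matching well (case-insensitive)."""
    available = {name.lower() for name in well_colors.values()}
    return [color for color in colors if color.lower() not in available]


print(missing_colors(design_colors, template_wells))
print(missing_colors(design_colors, gui_wells_excerpt))
```

With the template map, every design color is reported as missing, which is the situation where `location_of_color` would raise its ValueError. With the opentrons-art map, nothing is missing.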

For this process, I first downloaded Python to see if I could run the script locally. That was not successful, but I came to understand how the program works (at a very basic level). Then I went to the Colab doc and tried to import the files, again without success, so I asked ChatGPT for help. We went through the whole process together, and step by step it helped me import the repository from the link as well as my .py file.

The final results are these:

from opentrons import types

metadata = {    # see https://docs.opentrons.com/v2/tutorial.html#tutorial-metadata
    'author': '',
    'protocolName': '',
    'description': '',
    'source': 'HTGAA 2026 Opentrons Lab',
    'apiLevel': '2.20'
}

##############################################################################
###   Robot deck setup constants - don't change these
##############################################################################

TIP_RACK_DECK_SLOT = 9
COLORS_DECK_SLOT = 6
AGAR_DECK_SLOT = 5
PIPETTE_STARTING_TIP_WELL = 'A1'

well_colors = {
    'A1' : 'Red',
    'B1' : 'Green',
    'C1' : 'Orange'
}


def run(protocol):
  ##############################################################################
  ###   Load labware, modules and pipettes
  ##############################################################################

  # Tips
  tips_20ul = protocol.load_labware('opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Opentrons 20uL Tips')

  # Pipettes
  pipette_20ul = protocol.load_instrument("p20_single_gen2", "right", [tips_20ul])

  # Modules
  temperature_module = protocol.load_module('temperature module gen2', COLORS_DECK_SLOT)

  # Temperature Module Plate
  temperature_plate = temperature_module.load_labware('opentrons_96_aluminumblock_generic_pcr_strip_200ul',
                                                      'Cold Plate')
  # Choose where to take the colors from
  color_plate = temperature_plate

  # Agar Plate
  agar_plate = protocol.load_labware('htgaa_agar_plate', AGAR_DECK_SLOT, 'Agar Plate')  ## TA MUST CALIBRATE EACH PLATE!
  # Get the top-center of the plate, make sure the plate was calibrated before running this
  center_location = agar_plate['A1'].top()

  pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)

  ##############################################################################
  ###   Patterning
  ##############################################################################

  ###
  ### Helper functions for this lab
  ###

  # pass this e.g. 'Red' and get back a Location which can be passed to aspirate()
  def location_of_color(color_string):
    for well,color in well_colors.items():
      if color.lower() == color_string.lower():
        return color_plate[well]
    raise ValueError(f"No well found with color {color_string}")

  # For this lab, instead of calling pipette.dispense(1, loc) use this: dispense_and_detach(pipette, 1, loc)
  def dispense_and_detach(pipette, volume, location):
      """
      Move laterally 5mm above the plate (to avoid smearing a drop); then drop down to the plate,
      dispense, move back up 5mm to detach drop, and stay high to be ready for next lateral move.
      5mm because a 4uL drop is 2mm diameter; and a 2deg tilt in the agar pour is >3mm difference across a plate.
      """
      assert(isinstance(volume, (int, float)))
      above_location = location.move(types.Point(z=location.point.z + 5))  # 5mm above
      pipette.move_to(above_location)       # Go to 5mm above the dispensing location
      pipette.dispense(volume, location)    # Go straight downwards and dispense
      pipette.move_to(above_location)       # Go straight up to detach drop and stay high

  ###
  ### YOUR CODE HERE to create your design
  ###

!git clone https://github.com/Mozta/opentrons-bioart-sim.git
%cd opentrons-bioart-sim
!ls

%cd /content/opentrons-bioart-sim
!ls

%cd /content/opentrons-bioart-sim/opentrons-bioart-sim/src/opentrons_bioart_sim
!ls

!sed -n '1,200p' /content/opentrons-bioart-sim/opentrons-bioart-sim/src/opentrons_bioart_sim/colors.py

!sed -n '1,200p' /content/opentrons-bioart-sim/opentrons-bioart-sim/src/opentrons_bioart_sim/visualization.py

%cd /content
!ls

import sys
sys.path.append("/content/opentrons-bioart-sim/opentrons-bioart-sim/src")

import OTDesign_96_deep_well_plate
from opentrons_bioart_sim.mock import OpentronsMock

!sed -n '1,200p' OTDesign_96_deep_well_plate.py

  # Don't forget to end with a drop_tip()

fatal: destination path 'opentrons-bioart-sim' already exists and is not an empty directory.
/content/opentrons-bioart-sim
CONTRIBUTING.md  LICENSE    opentrons-bioart-sim  README.md  tests
examples	 notebooks  pyproject.toml	  src
/content/opentrons-bioart-sim
CONTRIBUTING.md  LICENSE    opentrons-bioart-sim  README.md  tests
examples	 notebooks  pyproject.toml	  src
/content/opentrons-bioart-sim/opentrons-bioart-sim/src/opentrons_bioart_sim
cli.py	   __init__.py	opentrons-bioart-sim  visualization.py
colors.py  mock.py	__pycache__
"""
colors.py — Fluorescent protein color mappings for Opentrons Bio-Art visualization
===================================================================================
Maps fluorescent protein names to matplotlib-compatible colors for Petri dish rendering.
"""

# ═══════════════════════════════════════════════════════════════════════
# Petri dish constants
# ═══════════════════════════════════════════════════════════════════════

PETRI_INNER_DIAMETER: float = 84  # mm — inner diameter of "90mm" and "100mm" plates
MAX_DRAW_RADIUS: float = PETRI_INNER_DIAMETER / 2 - 2  # 2mm margin for tip size, drops, calibration

# ═══════════════════════════════════════════════════════════════════════
# Protein → visual color mapping
# ═══════════════════════════════════════════════════════════════════════

PROTEIN_VISUAL_COLORS: dict[str, str] = {
    # Reds / Pinks
    'mrfp1':            'red',
    'mcherry':          'firebrick',
    'dsred':            'darkred',
    'mruby2':           'crimson',
    'mscarlet_i':       'tomato',
    'mkate2':           'deeppink',
    'mkate2_tf':        'mediumvioletred',
    'tagrfp':           'coral',
    'tdtomato':         'orangered',
    'eqfp578':          'salmon',
    'mlychee_tf':       'hotpink',
    'mwatermelon':      'lightcoral',
    # Oranges / Yellows
    'mko2':             'orange',
    'mpapaya':          'lightsalmon',
    'venus':            'yellow',
    'mcitrine':         'gold',
    'mvenus':           'goldenrod',
    'mbanana':          'khaki',
    'mstaygold2':       'gold',
    'mchartreuse_tf':   'chartreuse',
    # Greens
    'sfgfp':            'lime',
    'egfp':             'lime',
    'megfp':            'limegreen',
    'avgfp':            'palegreen',
    'mneongreen':       'greenyellow',
    'mazamigreen':      'forestgreen',
    'mclover3':         'green',
    'mwasabi':          'lightgreen',
    'mjuniper':         'darkgreen',
    'zsgreen1':         'springgreen',
    'pa_gfp':           'mediumseagreen',
    'mhoneydew':        'yellowgreen',
    # Blues / Cyans
    'azurite':          'royalblue',
    'tagbfp':           'blue',
    'mtagbfp2':         'mediumblue',
    'ultramarine':      'navy',
    'mturquoise2':      'turquoise',
    'mcerulean3':       'cyan',
    'mtfp1':            'darkcyan',
    'mmicy':            'aquamarine',
    'electra2':         'deepskyblue',
    # Others
    'mplum':            'purple',
}


def resolve_visual_color(protein_or_color_name: str) -> str:
    """Resolve a fluorescent protein name or color name to a matplotlib color.

    Lookup order:
      1. Check PROTEIN_VISUAL_COLORS (case-insensitive)
      2. Map 'green' → 'lime' for better visibility on dark backgrounds
      3. Pass through as-is (assumed to be a valid matplotlib color)

    Args:
        protein_or_color_name: Protein name (e.g. 'sfGFP') or color (e.g. 'red').

    Returns:
        A matplotlib-compatible color string.
    """
    key = protein_or_color_name.lower().strip()
    if key in PROTEIN_VISUAL_COLORS:
        return PROTEIN_VISUAL_COLORS[key]
    if key == 'green':
        return 'lime'
    return protein_or_color_name
"""
visualization.py — Petri dish visualization for Opentrons Bio-Art protocols
============================================================================
Renders droplet positions, smears, and volume summaries as a matplotlib figure.
"""

from __future__ import annotations

from typing import Optional

import matplotlib.pyplot as plt
from matplotlib.figure import Figure
from matplotlib.axes import Axes

from .colors import PETRI_INNER_DIAMETER


def visualize_petri(
    droplets_x: list[float],
    droplets_y: list[float],
    droplets_size: list[float],
    droplets_color: list[str],
    smears: list[tuple[list[float], list[float], str]],
    total_aspirated: dict[str, float],
    total_dispensed: dict[str, float],
    tip_count: int,
    background: str = 'black',
    title: str = 'Opentrons Bio-Art Simulation',
    save_path: Optional[str] = None,
    show: bool = True,
    dpi: int = 150,
    figsize: tuple[float, float] = (10, 10),
) -> tuple[Figure, Axes]:
    """Render a Petri dish visualization with all dispensed droplets.

    Args:
        droplets_x: X coordinates of each droplet (mm from center).
        droplets_y: Y coordinates of each droplet (mm from center).
        droplets_size: Size of each droplet in scatter points (volume × 100).
        droplets_color: Matplotlib color of each droplet.
        smears: List of (x_list, y_list, color) tuples for smear lines.
        total_aspirated: Dict mapping color name → total µL aspirated.
        total_dispensed: Dict mapping color name → total µL dispensed.
        tip_count: Number of tips used during the protocol.
        background: 'black' (dark agar), 'agar' (beige agar), or 'paper' (outline only).
        title: Plot title.
        save_path: If provided, save figure to this file path.
        show: If True, call plt.show(). Set False for headless/test usage.
        dpi: Resolution for saved images.
        figsize: Figure size in inches.

    Returns:
        Tuple of (Figure, Axes) for further customization.
    """
    # ── Print volume summary ──
    _print_volume_summary(total_aspirated, total_dispensed, tip_count)

    # ── Create figure ──
    fig, ax = plt.subplots(figsize=figsize)

    # ── Petri dish background ──
    radius = PETRI_INNER_DIAMETER / 2
    bg_colors = {
        'black': ('#000000', True),
        'agar':  ('#d7ca95', True),
        'paper': ('#000000', False),
    }
    color, fill = bg_colors.get(background, bg_colors['black'])
    ax.add_patch(plt.Circle((0, 0), radius=radius, color=color, fill=fill))

    # ── Droplets ──
    if droplets_x:
        ax.scatter(droplets_x, droplets_y, droplets_size, c=droplets_color)

    # ── Smears ──
    for xlist, ylist, scolor in smears:
        ax.plot(xlist, ylist, color=scolor, linewidth=4, solid_capstyle='round')

    # ── Axes setup ──
    margin = radius + 0.5
    ax.set_xlim(-margin, margin)
    ax.set_ylim(-margin, margin)
    ax.set_aspect('equal')
    ax.set_title(title)

    # ── Save / Show ──
    if save_path:
        fig.savefig(save_path, dpi=dpi, bbox_inches='tight',
                    facecolor=fig.get_facecolor(), edgecolor='none')
        print(f"\nImage saved to: {save_path}")

    if show:
        plt.show()

    return fig, ax


def _print_volume_summary(
    total_aspirated: dict[str, float],
    total_dispensed: dict[str, float],
    tip_count: int,
) -> None:
    """Print a summary of aspirated/dispensed volumes by color."""
    from .colors import resolve_visual_color

    print("\n=== TOTAL VOLUMES BY COLOR ===")
    all_colors = total_aspirated.keys() | total_dispensed.keys()
    for color in sorted(all_colors):
        asp = total_aspirated.get(color, 0)
        disp = total_dispensed.get(color, 0)
        waste = "\t\t##### WASTE: more aspirated than dispensed!" if asp != disp else ''
        vis = resolve_visual_color(color)
        print(f"\t{color} ({vis}):\t aspirated {asp:.1f}\t dispensed {disp:.1f}{waste}")

    total_asp = sum(total_aspirated.values())
    total_disp = sum(total_dispensed.values())
    print(f"\t[all]:\t\t[aspirated {total_asp:.1f}]\t[dispensed {total_disp:.1f}]")
    print(f"\n=== TIPS USED ===\n\t{tip_count} tip(s)  (ideal: one per color)\n")
/content
opentrons-bioart-sim		__pycache__  venv
OTDesign_96_deep_well_plate.py	sample_data
from opentrons import types

import string

metadata = {
    'protocolName': '{YOUR NAME} - Opentrons Art - HTGAA',
    'author': 'HTGAA',
    'source': 'HTGAA 2026',
    'apiLevel': '2.20'
}

Z_VALUE_AGAR = 2.0
POINT_SIZE = 1

electra2_points = [(-4.4,39.6), (-8.8,37.4), (6.6,37.4), (-6.6,35.2), (15.4,35.2), (-13.2,33), (0,33), (-11,30.8), (19.8,30.8), (4.4,28.6), (13.2,28.6), (-15.4,26.4), (-6.6,26.4), (-19.8,24.2), (-26.4,22), (30.8,22), (-35.2,11), (35.2,11), (-37.4,8.8), (-39.6,2.2), (-33,0), (33,0), (-37.4,-6.6), (35.2,-8.8), (-33,-13.2), (33,-15.4), (-30.8,-22), (-15.4,-24.2), (-24.2,-26.4), (-11,-33), (8.8,-33), (15.4,-33), (4.4,-37.4)]
mturquoise2_points = [(-24.2,26.4), (-22,22), (-19.8,19.8), (-17.6,19.8), (-19.8,17.6), (-17.6,17.6), (-15.4,17.6), (-17.6,15.4), (-15.4,15.4), (-13.2,13.2), (-11,13.2), (-13.2,11), (-11,11), (-8.8,8.8), (-4.4,6.6), (-6.6,4.4), (11,4.4), (13.2,4.4), (15.4,4.4), (11,2.2), (13.2,2.2), (15.4,2.2), (19.8,2.2), (8.8,0), (11,0), (-4.4,-8.8), (-4.4,-11), (-4.4,-13.2), (-6.6,-15.4), (-4.4,-15.4), (-2.2,-17.6), (0,-24.2), (-4.4,-28.6)]
azurite_points = [(-2.2,39.6), (-6.6,37.4), (-15.4,35.2), (-11,35.2), (-4.4,35.2), (-19.8,33), (-17.6,33), (-6.6,33), (-22,30.8), (-15.4,30.8), (-19.8,28.6), (-17.6,28.6), (-28.6,24.2), (-22,24.2), (-28.6,22), (-33,19.8), (-28.6,19.8), (-24.2,19.8), (-30.8,17.6), (-28.6,17.6), (22,17.6), (-35.2,15.4), (-33,15.4), (17.6,15.4), (24.2,15.4), (-30.8,13.2), (-33,11), (15.4,11), (19.8,11), (22,11), (26.4,11), (17.6,8.8), (19.8,8.8), (22,8.8), (15.4,6.6), (17.6,6.6), (-37.4,4.4), (19.8,4.4), (24.2,4.4), (17.6,2.2), (22,2.2), (24.2,2.2), (19.8,0), (22,0), (26.4,0), (24.2,-2.2)]
sfgfp_points = [(0,-8.8), (2.2,-8.8), (-2.2,-11), (0,-11), (2.2,-11), (-2.2,-13.2), (-4.4,-19.8), (-2.2,-19.8)]
mjuniper_points = [(-2.2,-22), (0,-22), (-4.4,-24.2), (-2.2,-24.2), (-4.4,-26.4), (-6.6,-28.6), (-2.2,-28.6), (-4.4,-33), (-2.2,-35.2)]
mko2_points = [(2.2,8.8), (-6.6,6.6), (0,6.6), (2.2,6.6), (-4.4,4.4), (-2.2,4.4), (0,4.4), (-4.4,2.2), (-2.2,2.2), (6.6,2.2), (8.8,2.2), (-6.6,0), (-4.4,0), (4.4,0), (6.6,0), (-8.8,-2.2), (-6.6,-2.2), (2.2,-2.2), (4.4,-2.2), (0,-4.4), (2.2,-4.4), (4.4,-4.4), (6.6,-4.4), (-2.2,-6.6), (0,-6.6), (2.2,-6.6), (4.4,-6.6), (6.6,-6.6), (8.8,-6.6), (-2.2,-8.8), (8.8,-8.8), (11,-8.8), (13.2,-8.8), (11,-11), (15.4,-11), (13.2,-13.2), (11,-15.4), (13.2,-15.4), (13.2,-17.6), (15.4,-17.6), (17.6,-17.6), (19.8,-19.8), (19.8,-22), (22,-24.2), (26.4,-26.4)]
mwasabi_points = [(0,8.8), (-2.2,6.6), (-6.6,2.2), (-8.8,0)]
mrfp1_points = [(8.8,-11), (13.2,-11), (11,-13.2)]
mscarlet_i_points = [(15.4,-15.4), (17.6,-15.4)]

point_name_pairing = [("electra2", electra2_points),("mturquoise2", mturquoise2_points),("azurite", azurite_points),("sfgfp", sfgfp_points),("mjuniper", mjuniper_points),("mko2", mko2_points),("mwasabi", mwasabi_points),("mrfp1", mrfp1_points),("mscarlet_i", mscarlet_i_points)]

# Robot deck setup constants
TIP_RACK_DECK_SLOT = 9
COLORS_DECK_SLOT = 6
AGAR_DECK_SLOT = 5
PIPETTE_STARTING_TIP_WELL = 'A1'

# Place the PCR tubes in this order
well_colors = {
    'A1': 'sfGFP',
    'A2': 'mRFP1',
    'A3': 'mKO2',
    'A4': 'Venus',
    'A5': 'mKate2_TF',
    'A6': 'Azurite',
    'A7': 'mCerulean3',
    'A8': 'mClover3',
    'A9': 'mJuniper',
    'A10': 'mTurquoise2',
    'A11': 'mBanana',
    'A12': 'mPlum',
    'B1': 'Electra2',
    'B2': 'mWasabi',
    'B3': 'mScarlet_I',
    'B4': 'mPapaya',
    'B5': 'eqFP578',
    'B6': 'tdTomato',
    'B7': 'DsRed',
    'B8': 'mKate2',
    'B9': 'EGFP',
    'B10': 'mRuby2',
    'B11': 'TagBFP',
    'B12': 'mChartreuse_TF',
    'C1': 'mLychee_TF',
    'C2': 'mTagBFP2',
    'C3': 'mEGFP',
    'C4': 'mNeonGreen',
    'C5': 'mAzamiGreen',
    'C6': 'mWatermelon',
    'C7': 'avGFP',
    'C8': 'mCitrine',
    'C9': 'mVenus',
    'C10': 'mCherry',
    'C11': 'mHoneydew',
    'C12': 'TagRFP',
    'D1': 'mTFP1',
    'D2': 'Ultramarine',
    'D3': 'ZsGreen1',
    'D4': 'mMiCy',
    'D5': 'mStayGold2',
    'D6': 'PA_GFP'
}

volume_used = {
    'electra2': 0,
    'mturquoise2': 0,
    'azurite': 0,
    'sfgfp': 0,
    'mjuniper': 0,
    'mko2': 0,
    'mwasabi': 0,
    'mrfp1': 0,
    'mscarlet_i': 0
}

def update_volume_remaining(current_color, quantity_to_aspirate):
    rows = string.ascii_uppercase
    for well, color in list(well_colors.items()):
        if color == current_color:
            if (volume_used[current_color] + quantity_to_aspirate) > 250:
                # Move to next well horizontally by advancing row letter, keeping column number
                row = well[0]
                col = well[1:]
                
                # Find next row letter
                next_row = rows[rows.index(row) + 1]
                next_well = f"{next_row}{col}"
                
                del well_colors[well]
                well_colors[next_well] = current_color
                volume_used[current_color] = quantity_to_aspirate
            else:
                volume_used[current_color] += quantity_to_aspirate
            break

def run(protocol):
    # Load labware, modules and pipettes
    protocol.home()

    # Tips
    tips_20ul = protocol.load_labware('opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Opentrons 20uL Tips')

    # Pipettes
    pipette_20ul = protocol.load_instrument("p20_single_gen2", "right", [tips_20ul])

    # Deep Well Plate
    temperature_plate = protocol.load_labware('nest_96_wellplate_2ml_deep', 6)

    # Agar Plate
    agar_plate = protocol.load_labware('htgaa_agar_plate', AGAR_DECK_SLOT, 'Agar Plate')
    agar_plate.set_offset(x=0.00, y=0.00, z=Z_VALUE_AGAR)

    # Get the top-center of the plate, make sure the plate was calibrated before running this
    center_location = agar_plate['A1'].top()

    pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)
    
    # Helper function (dispensing)
    def dispense_and_jog(pipette, volume, location):
        assert(isinstance(volume, (int, float)))
        # Go above the location
        above_location = location.move(types.Point(z=location.point.z + 2))
        pipette.move_to(above_location)
        # Go downwards and dispense
        pipette.dispense(volume, location)
        # Go upwards to avoid smearing
        pipette.move_to(above_location)

    # Helper function (color location)
    def location_of_color(color_string):
        for well,color in well_colors.items():
            if color.lower() == color_string.lower():
                return temperature_plate[well]
        raise ValueError(f"No well found with color {color_string}")

    # Print pattern by iterating over lists
    for i, (current_color, point_list) in enumerate(point_name_pairing):
        # Skip the rest of the loop if the list is empty
        if not point_list:
            continue

        # Get the tip for this run, set the bacteria color, and the aspirate bacteria of choice
        pipette_20ul.pick_up_tip()
        max_aspirate = int(18 // POINT_SIZE) * POINT_SIZE
        quantity_to_aspirate = min(len(point_list)*POINT_SIZE, max_aspirate)
        update_volume_remaining(current_color, quantity_to_aspirate)
        pipette_20ul.aspirate(quantity_to_aspirate, location_of_color(current_color))

        # Iterate over the current points list and dispense them, refilling along the way
        for i in range(len(point_list)):
            x, y = point_list[i]
            adjusted_location = center_location.move(types.Point(x, y))

            dispense_and_jog(pipette_20ul, POINT_SIZE, adjusted_location)
            
            if pipette_20ul.current_volume == 0 and len(point_list[i+1:]) > 0:
                quantity_to_aspirate = min(len(point_list[i:])*POINT_SIZE, max_aspirate)
                update_volume_remaining(current_color, quantity_to_aspirate)
                pipette_20ul.aspirate(quantity_to_aspirate, location_of_color(current_color))

        # Drop tip between each color
        pipette_20ul.drop_tip()

Finally, the result is this:

=== TOTAL VOLUMES BY COLOR ===
	Azurite (royalblue):	 aspirated 47.0	 dispensed 46.0		##### WASTE: more aspirated than dispensed!
	Electra2 (deepskyblue):	 aspirated 34.0	 dispensed 33.0		##### WASTE: more aspirated than dispensed!
	mJuniper (darkgreen):	 aspirated 9.0	 dispensed 9.0
	mKO2 (orange):	 aspirated 46.0	 dispensed 45.0		##### WASTE: more aspirated than dispensed!
	mRFP1 (red):	 aspirated 3.0	 dispensed 3.0
	mScarlet_I (tomato):	 aspirated 2.0	 dispensed 2.0
	mTurquoise2 (turquoise):	 aspirated 34.0	 dispensed 33.0		##### WASTE: more aspirated than dispensed!
	mWasabi (lightgreen):	 aspirated 4.0	 dispensed 4.0
	sfGFP (lime):	 aspirated 8.0	 dispensed 8.0
	[all]:		[aspirated 187.0]	[dispensed 183.0]

=== TIPS USED ===
	9 tip(s)  (ideal: one per color)

[Figure: Opentrons Bio-Art Simulation]
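The WASTE lines in the summary above can be traced to the refill arithmetic in the dispensing loop. The sketch below is a plain-Python model of that logic, not the Opentrons API. It assumes POINT_SIZE is 1 µL and an 18 µL limit per aspiration (the `18 // POINT_SIZE` in the script); `simulate_refills` and its `count_current` flag are hypothetical names introduced only for illustration.

```python
def simulate_refills(n_points, point_size=1.0, capacity=18.0, count_current=True):
    """Return (aspirated, dispensed) volume in µL for one color.

    count_current=True mimics len(point_list[i:]) in the refill step;
    count_current=False mimics len(point_list[i+1:]).
    Assumption: point_size and capacity are illustrative defaults.
    """
    max_aspirate = int(capacity // point_size) * point_size
    aspirated = min(n_points * point_size, max_aspirate)
    in_tip = aspirated
    dispensed = 0.0
    for i in range(n_points):
        # Dispense one point
        in_tip -= point_size
        dispensed += point_size
        remaining = n_points - (i + 1)
        # Refill only when the tip is empty and points are still pending
        if in_tip == 0 and remaining > 0:
            counted = remaining + 1 if count_current else remaining
            refill = min(counted * point_size, max_aspirate)
            aspirated += refill
            in_tip = refill
    return aspirated, dispensed


if __name__ == "__main__":
    for n in (46, 33, 9):
        print(n, simulate_refills(n), simulate_refills(n, count_current=False))
```

Under these assumptions, a color with 46 points aspirates 47 µL when the just-dispensed point is counted, which matches the Azurite line, and 46 µL when only the remaining points are counted.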

Some info to keep in mind:

STAR in colab doc

CIRCUMFERENCE in colab doc

GIT HUB LINK

Special thanks to María José and Rafael Pérez Aguirre (@Mozta)

Post-Lab Questions — DUE BY START OF FEB 24 LECTURE

One of the great parts about having an automated robot is being able to precisely mix, deposit, and run reactions without much intervention, and design and deploy experiments remotely.

For this week, we’d like for you to do the following:

Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.

Automation of protein crystallization scale-up via Opentrons-2 liquid handling

This study presents an approach for optimizing protein crystallization trials at the multi-microliter scale using the Opentrons-2 liquid handling robot. Using Python scripts for precise control, the robot mixes and sets up crystallization plates with a model protein, hen egg white lysozyme, and with a periplasmic protein from Campylobacter jejuni. Crystals of the latter are used in the Snow lab as a biomaterial for nanotechnology, which requires large, consistent batches. Automating the process can significantly reduce manual labor and costs and improve the reliability of protein crystallization results. The Opentrons is programmed in Python, which makes it easier to iterate on and improve protocols.

DeRoo, J. B., Jones, A. A., Slaughter, C. K., Ahr, T. W., Stroup, S. M., Thompson, G. B., & Snow, C. D. (2025). Automation of protein crystallization scaleup via Opentrons-2 liquid handling. SLAS Technology, 32, 100268. https://doi.org/10.1016/j.slast.2025.100268

Other interesting studies demonstrate how the Opentrons can be linked to other technologies, such as 3D bioprinting. Although a 3D printer does not handle proteins the way the OT-2 does, it can print different types of labware, which reduces costs and makes specialized tools possible. Beyond robots, automated labs are now also being combined with AI assistance.

Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details.

While your description/project idea doesn’t need to be set in stone, we would like to see core details of what you would automate. This is due at the start of lecture and does not need to be tested on the Opentrons yet.

Project tech proposals:

DNA OPTIMIZER: for optimizing codon sequences so that proteins express well in their host cells. For example: E. coli expressing luciferase.
OPENTRONS OT-2: for running iterations with expressing cells. For example, bacteria folding proteins while sound frequencies are played.
CLOUD LAB: for iterating with different proteins and combinations before going to the lab. For example: recombining DNA for cyanobacteria and luciferase.
BIOREACTOR: for expressing proteins properly in a contamination-free environment.
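As a first idea of what the DNA optimizer step could look like, here is a minimal sketch that back-translates a protein into DNA using one codon per amino acid. The codon table is an illustrative assumption (codons commonly cited as frequent in E. coli), and `back_translate` is a hypothetical helper; a real optimizer would use full codon-usage tables and also check GC content, repeats, and restriction sites.

```python
# Illustrative "one preferred codon per amino acid" table for E. coli.
# Assumption: chosen for the example only, not taken from a measured usage table.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
STOP_CODON = "TAA"


def back_translate(protein):
    """Return a DNA sequence encoding `protein`, ending with a stop codon."""
    protein = protein.upper().strip()
    unknown = sorted(set(protein) - set(PREFERRED_CODON))
    if unknown:
        raise ValueError(f"Unsupported residues: {unknown}")
    return "".join(PREFERRED_CODON[aa] for aa in protein) + STOP_CODON


if __name__ == "__main__":
    # First ten residues of the firefly luciferase sequence
    print(back_translate("MEDAKNIKKG"))
```

The same function could later feed the OT-2 or cloud lab steps, since each candidate sequence is just a string that can be logged next to the well it is tested in.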

Final Project Ideas — DUE BY START OF FEB 24 LECTURE

For the final project ideas, there are 3 options to take into consideration:

MUSIC & BACTERIA
AQUATIC MICROORGANISMS & BIOLUMINESCENT SENSORS
PROTEIN BASED CRYSTALLINE MATERIALS & SPIDER-SILK TEXTILES

Week 4: Protein Design - part I

Part A. Conceptual Questions

Answer any NINE of the following questions from Shuguang Zhang: (i.e. you can select two to skip)

How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)
Why do humans eat beef but do not become a cow, eat fish but do not become fish?
Why are there only 20 natural amino acids?
Can you make other non-natural amino acids? Design some new amino acids.
Where did amino acids come from before enzymes that make them, and before life started?
If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?
Can you discover additional helices in proteins?
Why are most molecular helices right-handed?
Why do β-sheets tend to aggregate?
What is the driving force for β-sheet aggregation?
Why do many amyloid diseases form β-sheets?
Can you use amyloid β-sheets as materials?
Design a β-sheet motif that forms a well-ordered structure.

Part B: Protein Analysis and Visualization

In this part of the homework, you will be using online resources and 3D visualization software to answer questions about proteins. Pick any protein (from any organism) of your interest that has a 3D structure and answer the following questions:

LUCIFERASE OF Pyrophorus plagiophthalamus

Luciferase is an enzyme that generates bioluminescence by catalyzing the oxidation of D-luciferin in the presence of ATP, oxygen, and Mg²⁺. In this particular insect, Pyrophorus plagiophthalamus, different luciferase isoforms can emit light ranging from green to orange, depending on the organ in which the gene is expressed. These color variations arise from subtle structural differences in the enzyme's active site that alter the electronic environment of the excited oxyluciferin intermediate. Click beetle luciferase is very stable across a wide pH range compared to other luciferases. It is commonly used for in vivo imaging applications, especially the red-emitting variants, and also as a biosensor to monitor gene expression and as a reporter gene.

I chose this particular protein because I am interested in analyzing how sound frequencies might influence bacterial protein expression, growth dynamics, or spatial organization. In this way, having luciferase as a biosensor is ideal; light emission provides a real-time, quantifiable readout.

Burbelo, P. D., Kisailus, A. E., & Peck, J. W. (2002). Detecting Protein-Protein Interactions Using Renilla Luciferase Fusion Proteins. BioTechniques, 33(5), 1044–1050. https://doi.org/10.2144/02335st05

‌
How long is it? What is the most frequent amino acid?

AAQ11735.1 luciferase [Pyrophorus plagiophthalamus]

MMKREKNVVYGPEPLHALEDLTAGEMLFRALRKHSHLPQALVDVYGEEWISYKEFFETTCLLAQSLHNCG
YKMSDVVSICAENNKRFFVPIIAAWYIGMIVAPVNEGYIPDELCKVMGISRPQLVFCTKNILNKVLEVQS
RTDFIKRIIILDAVENIHGCESLPNFISRYSDGNIANFKPLHYDPVEQVAAILCSSGTTGLPKGVMQTHR
NVCVRLIHALDPRVGTQLIPGVTVLVYLPFFHAFGFSINLGYFMVGLRVIMLRRFDQEAFLKAIQDYEVR
SVINVPAIILFLSKSPLVDKYDLSSLRELCCGAAPLAKEVAEIAVKRLNLPGIRCGFGLTESTSANIHSL
RDEFKSGSLGKVTPFMAVKIADRETGKALGPNQVGELCVKGPMVSKGYVNNVEATKEAIDDDGWLHSGDF
GYYDQDEHFYVVDRYKELIKYKGSQVAPAELEEILLKNPCIRDVAVVGIPDLEAGELPSAFVVIQPGKEI
TAKEVYDYLAERVSHTKYLRGGVRFVDSIPRNVTGKITRKELLKQLLEKSSKL

For this part, I used Google Colab and did some research on leucine. The luciferase of Pyrophorus plagiophthalamus has 543 amino acids, and the most frequent one is L (leucine), which appears 56 times. Leucine is commonly known as an amino acid that helps synthesize muscle proteins and supports tissue regeneration. In this protein, its role is related to the hydrophobic core, correct protein folding, and the formation of alpha helices.
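The counting step can be sketched in a few lines of Python. This is a minimal example, not the exact Colab notebook used; `parse_fasta` and `composition` are hypothetical helper names, and the demo uses only the first 70 residues of the sequence above, so its numbers differ from the full-length result.

```python
from collections import Counter


def parse_fasta(text):
    """Join the sequence lines of a single-record FASTA string, skipping the header."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return "".join(ln for ln in lines if not ln.startswith(">")).upper()


def composition(sequence):
    """Return (length, most frequent residue, its count)."""
    counts = Counter(sequence)
    residue, n = counts.most_common(1)[0]
    return len(sequence), residue, n


if __name__ == "__main__":
    fasta = """>AAQ11735.1 luciferase (first line only)
MMKREKNVVYGPEPLHALEDLTAGEMLFRALRKHSHLPQALVDVYGEEWISYKEFFETTCLLAQSLHNCG
"""
    print(composition(parse_fasta(fasta)))
```

Pasting the complete record into `fasta` gives the length and the most frequent residue for the whole protein.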

How many protein sequence homologs are there for your protein?

According to UniProt's BLAST tool, it has 236 homologs. This means that there is a variety of similar proteins across living organisms. They are not identical, but they share a very similar structure. These homologs can be orthologs or paralogs. Paralogs are related proteins found within the same organism, with very subtle variations in their structure.

Does your protein belong to any protein family?

Yes, it belongs to the insect luciferases. This type of protein needs ATP, D-luciferin, and oxygen to carry out the oxidation reaction.

When was the structure solved? Is it a good quality structure? A good-quality structure is one with high resolution. Smaller the better (Resolution: 2.70 Å) :

This particular protein, the luciferase of Pyrophorus plagiophthalamus, has no entry in the RCSB Protein Data Bank, so I used the first luciferase structure deposited there: 1LCI, firefly luciferase from Photinus pyralis. Its structure was solved in 1997. The resolution is 2.00 Å, which indicates good quality.

BAL46512.1 firefly luciferase [Photinus pyralis]

MEDAKNIKKGPAPFYPLEDGTAGEQLHKAMKRYALVPGTIAFTDAHIEVNITYAEYFEMSVRLAEAMKRY
GLNTNHRIVVCSENSLQFFMPVLGALFIGVAVAPANDIYNERELLNSMNISQPTVVFVSKKGLQKILNVQ
KKLPIIQKIIIMDSKTDYQGFQSMYTFVTSHLPPGFNEYDFVPESFDRDKTIALIMNSSGSTGLPKGVAL
PHRTACVRFSHARDPIFGNQIIPDTAILSVVPFHHGFGMFTTLGYLICGFRVVLMYRFEEELFLRSLQDY
KIQSALLVPTLFSFFAKSTLIDKYDLSNLHEIASGGAPLSKEVGEAVAKRFHLPGIRQGYGLTETTSAIL
ITPEGDDKPGAVGKVVPFFEAKVVDLDTGKTLGVNQRGELCVRGPMIMSGYVNNPEATNALIDKDGWLHS
GDIAYWDEDEHFFIVDRLKSLIKYKGYQVAPAELESILLQHPNIFDAGVAGLPDDDAGELPAAVVVLEHG
KTMTEKEIVDYVASQVTTAKKLRGGVVFVDEVPKGLTGKLDARKIREILIKAKKGGKSKL

Are there any other molecules in the solved structure apart from protein?

The structure contains crystallographic water molecules (HOH), which stabilize the protein and may participate in hydrogen bonding. As this is the first luciferase structure to be solved, it does not include any other components besides the protein and water.

Does your protein belong to any structure classification family?

It belongs to the ATP-dependent AMP-binding enzyme family. This family includes enzymes that activate substrates through adenylation using ATP, forming an AMP-bound intermediate.

Open the structure of your protein in any 3D molecule visualization software: - PyMol Tutorial Here (hint: ChatGPT is good at PyMol commands) - Visualize the protein as “cartoon”, “ribbon” and “ball and stick”.

Color the protein by secondary structure. Does it have more helices or sheets?

The protein shows a predominance of alpha helices (red) over beta sheets (green). This indicates that firefly luciferase is mainly an alpha-helical protein with a smaller portion of beta-sheet structures.

Color the protein by residue type. What can you tell about the distribution of hydrophobic vs hydrophilic residues?

Luciferase - hydrophobic (yellow) vs. hydrophilic (yellow)

The image shows that this enzyme, which operates in an aqueous environment, has a predominantly hydrophilic exterior and a hydrophobic core.

Visualize the surface of the protein. Does it have any “holes” (aka binding pockets)?

Yes, the protein surface shows a variety of pockets: one large pocket and several small ones. It is in the large pocket that ATP and D-luciferin bind to form luciferyl-AMP, which then reacts with oxygen to yield oxyluciferin and light.

Part C. Using ML-Based Protein Design Tools

C1. Protein Language Modeling

Deep Mutational Scans

a. Use ESM2 to generate an unsupervised deep mutational scan of your protein based on language model likelihoods. b. Can you explain any particular pattern? (choose a residue and a mutation that stands out). c. (Bonus) Find sequences for which we have experimental scans, and compare the prediction of the language model to experiment.

The map shows that a large number of mutations are possible, although two main regions, visible as two dark-blue columns, should not be changed because the protein could collapse. Also, three faint rows of consistent color correspond to W, M, and C.

Latent Space Analysis

a. Use the provided sequence dataset to embed proteins in reduced dimensionality. b. Analyze the different formed neighborhoods: do they approximate similar proteins? c. Place your protein in the resulting map and explain its position and similarity to its neighbors.

The map shows that the analyzed protein (firefly luciferase from Photinus pyralis) lies close to the luciferase of Luciola cruciata, a protein produced by another firefly species. The first one, PP, is from North America, while the second one, LC, is from Japan. The main differences are their geographic origin and their molecular composition, which translates into a slightly different emission color and enzyme stability. Both proteins use D-luciferin and ATP to produce light, but PP luciferase is widely used in biotech as a reporter gene, whereas LC luciferase is used to understand how active-site residues interact with the substrate.

C2. Protein Folding

Fold your protein with ESMFold. Do the predicted coordinates match your original structure? Try changing the sequence, first try some mutations, then large segments. Is your protein structure resilient to mutations?

When folded with ESMFold, the protein shows an almost identical structure to the original one. When some mutations are introduced, it shows a few changes and anomalies, but nothing radical, meaning that the protein is largely resilient to mutations.

C3. Protein Generation

Inverse-Folding a protein Let’s now use the backbone of your chosen PDB to propose sequence candidates via ProteinMPNN Analyze the predicted sequence probabilities and compare the predicted sequence vs the original one. Input this sequence into ESMFold and compare the predicted structure to your original.

SDRIRVGPEPAEPVQPGTAGQLLHDAMRKFAAIPGTVAFIDAETGKSMTYEEFYTDSVKMAAALKNYGLDKNDAIAVMSKNSLQYFIPVLGALMIGVAVAPINPDYDVEGALTAMSRAKPKVVFTSKENIEKVKEVQKKLPTIKEIIVLDSKEPYKGLDSIYTFIEKYLPEGFDPWKFKPAEFDRDTTIAFILEDXXXXXEPKGVAHPHRALVHNFSIAVDPVYGIAPVPGTVILLTTPLTEHVGLTNTLGAIYAGFTVVLISKFDEDLFLKTLQDYKVQEAYVEPEMLELLAKSTKISQYDLSSLKRISSGGHVISKEVADAVAKKFNLPGVRRGYGKTETFHAFIITPEGXXXGGAAGHVVPYYEARVVDPETGEVLGVNEVGEIEVRGPMIMAGYVDDPEATAERIDEDGWYHTGDLGYFDENGALYIVXXXXXLILNNGKPVDPADLEAVLRSHPAIKDAGVAGLPDPAAGELPAAVVVKAPGKTITEAEVVAYVASQVPPHKHLTGGVVFVDEVPXXXXXAVDRAAVRAILVAAKG

Even though the predicted sequence has a considerably different amino acid composition, the structure remains the same. This is why the protein shown in 3D is very similar to the original in its alpha and beta structures. Neither the backbone nor the overall logic of where certain types of amino acids are placed is altered.

Part D. Group Brainstorm on Bacteriophage Engineering

Find a group of ~3–4 students
Read through the Phage Reading material listed under “Reading & Resources” below.
Review the Bacteriophage Final Project Goals for engineering the L Protein:
- Increased stability (easiest)
- Higher titers (medium)
- Higher toxicity of lysis protein (hard)
Brainstorm Session
- Choose one or two main goals from the list that you think you can address computationally (e.g., “We’ll try to stabilize the lysis protein,” or “We’ll attempt to disrupt its interaction with E. coli DnaJ.”).
- Write a 1-page proposal (bullet points or short paragraphs) describing:
  - Which tools/approaches from recitation you propose using (e.g., “Use Protein Language Models to do in silico mutagenesis, then AlphaFold-Multimer to check complexes.”).
  - Why do you think those tools might help solve your chosen sub-problem?
  - Name one or two potential pitfalls (e.g., “We lack enough training data on phage–bacteria interactions.”).
  - Include a schematic of your pipeline.
- This resource may be useful: HTGAA Protein Engineering Tools
Each individually put your plan on your HTGAA website
- Include your group’s short plan for engineering a bacteriophage

Week 5: Protein Design - part II

Part A: SOD1 Binder Peptide Design (From Pranam)

Superoxide dismutase 1 (SOD1) is a cytosolic antioxidant enzyme that converts superoxide radicals into hydrogen peroxide and oxygen. In its native state, it forms a stable homodimer and binds copper and zinc.

Mutations in SOD1 cause familial Amyotrophic Lateral Sclerosis (ALS). Among them, the A4V mutation (Alanine → Valine at residue 4) leads to one of the most aggressive forms of the disease. The mutation subtly destabilizes the N-terminus, perturbs folding energetics, and promotes toxic aggregation.

Your challenge:

Design short peptides that bind mutant SOD1. Then decide which ones are worth advancing toward therapy.

A. Part 1: Generate Binders with PepMLM

SOD1 SEQUENCE

sp|P00441|SODC_HUMAN Superoxide dismutase [Cu-Zn] OS=Homo sapiens OX=9606 GN=SOD1 PE=1 SV=2

MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS
AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV
HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

SOD1 SEQUENCE with A4V mutation

MATK**V**VCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS
AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV
HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

After generating 4 peptides of 12 amino acids each against the mutant sequence, we got:

Binder#    Sequence        Pseudo Perplexity

1         WHYPVVGLALGX    10.324439
2         WRVYPAGAAHKK    12.004818
3         KRYPAVALAWWE    14.695892
4         WLYYAVGAEHGE    15.423531
5         FLYRWLPSRRGG    20.635226

Pseudo-perplexity reflects the confidence of the model: the lower the value, the higher the confidence. This means that the peptide with 15.42 is less natural, while the peptide with 10.32 is more natural and a better match to the sequence. The added SOD1-binding sequence (binder 5) stands out with a pseudo-perplexity of 20.63, a very high number, which by the same logic means the model is least confident in it.
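To make the metric concrete: pseudo-perplexity for a masked language model is commonly defined as the exponential of the negative mean log-probability the model gives each residue when that position is masked. The sketch below assumes those per-position log-probabilities are already available; how PepMLM computes them internally is not shown, and `pseudo_perplexity` is a hypothetical helper name.

```python
import math


def pseudo_perplexity(masked_log_probs):
    """exp(-mean(log p_i)), where p_i is the model's probability for the
    true residue at position i when that position is masked."""
    if not masked_log_probs:
        raise ValueError("need at least one position")
    return math.exp(-sum(masked_log_probs) / len(masked_log_probs))


if __name__ == "__main__":
    # A 12-residue peptide where every residue gets probability 0.1
    print(pseudo_perplexity([math.log(0.1)] * 12))
```

A peptide whose residues each receive probability 0.1 has a pseudo-perplexity of 10, so the values in the table roughly say how many residue choices the model is hesitating between at each position.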

B. Part 2: Evaluate Binders with AlphaFold3

Binder#    Sequence        Pseudo Perplexity       ipTM      PTM

1         WHYPVVGLALGX     10.324439               0.40      0.85  
2         WRVYPAGAAHKK     12.004818               0.39      0.86
3         KRYPAVALAWWE     14.695892               0.36      0.87
4         WLYYAVGAEHGE     15.423531               0.32      0.81
5         FLYRWLPSRRGG     20.635226               0.36      0.82

I took the peptides generated with PepMLM and modeled them in AlphaFold3 together with the mutant SOD1 sequence. The pTM values (0.81 to 0.87) are high, indicating that the model has high confidence in the overall predicted structure. The ipTM values, however, are all under 0.6, which means there is low confidence in the interaction between the peptide and the protein. Also, where the peptide does make some contact with the protein, it is near the beginning of the sequence, which appears to be a more flexible region of the protein.

C. Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse

PEPTIDE 1: WHYPVVGLALGX

PROPERTY                    PREDICTION        VALUE         UNIT
💧   Solubility              Soluble           1.000        Probability
🩸   Hemolysis               Non-hemolytic     0.103        Probability
🔗  Binding Affinity        Weak binding      6.293        pKd/pKi
⚖️  Molecular Weight                          1193.6       Da
⚡  Net Charge (pH 7)                         -0.15

PEPTIDE 2: WRVYPAGAAHKK    

PROPERTY                    PREDICTION        VALUE         UNIT
💧   Solubility              Soluble           1.000        Probability
🩸   Hemolysis               Non-hemolytic     0.010        Probability
🔗  Binding Affinity        Weak binding      4.953        pKd/pKi
⚖️  Molecular Weight                          1383.6       Da
⚡  Net Charge (pH 7)                         2.84

PEPTIDE 3: KRYPAVALAWWE    

PROPERTY                    PREDICTION        VALUE         UNIT
💧   Solubility              Soluble           1.000        Probability
🩸   Hemolysis               Non-hemolytic     0.071        Probability
🔗  Binding Affinity        Weak binding      6.653        pKd/pKi
⚖️  Molecular Weight                          1489.7       Da
⚡  Net Charge (pH 7)                         0.77

PEPTIDE 4: WLYYAVGAEHGE    

PROPERTY                    PREDICTION        VALUE         UNIT
💧   Solubility              Soluble           1.000        Probability
🩸   Hemolysis               Non-hemolytic     0.065        Probability
🔗  Binding Affinity        Weak binding      6.053        pKd/pKi
⚖️  Molecular Weight                          1394.5       Da
⚡  Net Charge (pH 7)                         -2.14

PEPTIDE 5: FLYRWLPSRRGG    

PROPERTY                    PREDICTION        VALUE         UNIT
💧   Solubility              Soluble           1.000        Probability
🩸   Hemolysis               Non-hemolytic     0.047        Probability
🔗  Binding Affinity        Weak binding      5.968        pKd/pKi
⚖️  Molecular Weight                          1507.7       Da
⚡  Net Charge (pH 7)                         2.76

It seems that the best candidate as a therapeutic peptide is Peptide 3: KRYPAVALAWWE. Although other candidates show very good qualities and achieve similar results in terms of solubility, hemolysis, molecular weight, and net charge, they do not present a strong binding score. In this case, Peptide 3 shows the highest predicted binding affinity among the candidates.

If we compare these results with the ipTM values predicted by AlphaFold3, we can observe that the confidence in the interaction between the peptides and the protein is generally low. For Peptide 3 in particular, the ipTM value lies in the middle of the observed range, suggesting moderate confidence in the predicted interaction relative to the other low values.

Additionally, when designing peptides for therapeutic purposes, several properties must be considered. First, peptides need to be soluble so that they can circulate in the biological fluids without forming aggregates. Second, hemolysis probabilities should remain below 0.2, since higher values indicate that peptides may disrupt red blood cells and release hemoglobin into the bloodstream, which can be toxic. Third, binding affinity is important because it helps to predict whether a peptide will interact strongly with the target protein. Furthermore, molecular weight is preferably small, as smaller peptides are easier to synthesize and diffuse through biological environments. Finally, a moderate positive net charge is often favorable, because it can promote electrostatic interactions with negatively charged regions on protein surfaces, potentially stabilizing the peptide-protein interaction.
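The selection criteria above can be written as a small filter-and-rank step. This is a sketch using the PeptiVerse values from the tables in this section; the `Peptide` class, the `rank` function, and the 0.5 solubility cutoff are assumptions made for illustration, while the 0.2 hemolysis limit comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Peptide:
    sequence: str
    solubility: float  # probability of being soluble
    hemolysis: float   # probability of being hemolytic
    affinity: float    # predicted pKd/pKi
    net_charge: float  # at pH 7


CANDIDATES = [
    Peptide("WHYPVVGLALGX", 1.000, 0.103, 6.293, -0.15),
    Peptide("WRVYPAGAAHKK", 1.000, 0.010, 4.953, 2.84),
    Peptide("KRYPAVALAWWE", 1.000, 0.071, 6.653, 0.77),
    Peptide("WLYYAVGAEHGE", 1.000, 0.065, 6.053, -2.14),
    Peptide("FLYRWLPSRRGG", 1.000, 0.047, 5.968, 2.76),
]


def rank(peptides, max_hemolysis=0.2, min_solubility=0.5):
    """Drop peptides that fail the safety filters, then sort by affinity."""
    passing = [
        p for p in peptides
        if p.hemolysis < max_hemolysis and p.solubility >= min_solubility
    ]
    return sorted(passing, key=lambda p: p.affinity, reverse=True)


if __name__ == "__main__":
    for p in rank(CANDIDATES):
        print(p.sequence, p.affinity)
```

All five peptides pass the filters, so the ranking is decided by binding affinity alone; molecular weight and net charge could be added as extra sort keys if a tie-breaker were needed.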

D. Part 4: Generate Optimized Peptides with moPPIt

Index  Binder         Hemolysis            Solubility           Affinity             Motif

0      GYGFGEGQCYTH   0.92835745960474     0.9166666865348816   6.400432586669922    0.838644802570343
1      WVISFKKEEEKA   0.974050072953105    0.5833333134651184   6.252631664276123    0.5743995308876038
2      KKTIQVKTECGR   0.9746747445315124   0.8333333134651184   6.728649616241455    0.8640713095664978
3      YYQKTCLVKKEH   0.9649858362972736   0.8333333134651184   7.076708793640137    0.882260799407959

I chose to target the residues nearest the mutation because the flexibility of this region is beneficial to peptide-protein binding.

Reference values for the analysis:
💧 Solubility: 1.0 (good)
🩸 Hemolysis: 1.0 (good)
🔗 Binding Affinity: the higher the better
🧩 Motif: 1.0 (good)

Therefore, Peptide 3 (YYQKTCLVKKEH) appears to be the best candidate for binding to the mutant SOD1. It presents balanced and consistent results in every aspect: hemolysis, solubility, affinity, and motif. Although its solubility is slightly lower than that of Peptide 0, it still falls in the favorable range, suggesting that the peptide can remain stable and soluble under physiological conditions. It also has a high affinity and motif score, meaning that it can form a strong and specific interaction with the selected residues of the protein.

Compared to the PepMLM peptides, the moPPIt results show good affinity and motif scores, which did not appear in the PepMLM peptides. I think the moPPIt peptides have a higher affinity and better chances of binding the protein because they were generated against specific target residues in a specific region, while PepMLM gives a general result based on stable and more plausible sequences without focusing on any particular binding site.

How would you evaluate these peptides before advancing them to clinical studies?

I would first run a few more computational tests, such as docking and molecular dynamics simulations, to check that the stability and strength of the peptide-protein interaction are consistent. Afterward, in vitro experiments would be needed to test whether solubility, hemolysis, binding affinity, and binding at the target motif remain consistent with the computational predictions. Finally, in vivo models would be used to assess safety, stability, and pharmacokinetic properties to see whether the peptide meets the requirements for clinical studies.

Part C: Final Project: L-Protein Mutants

High-level summary: The objective of this assignment is to improve the stability and folding of the lysis protein of the MS2 phage. This mechanism is key to understanding how phages could help address antibiotic resistance.

Week 6: Genetic Circuits - part I

Assignment: DNA Assembly

Answer these questions about the protocol in this week’s lab:

What are some components in the Phusion High-Fidelity PCR Master Mix, and what is their purpose?

Elements:
- Pyrococcus-like enzyme fused with a processivity-enhancing domain that increases fidelity and speed.
- 5′→3′ polymerase activity and 3′→5′ exonuclease activity; generates blunt-ended products.
- Phusion DNA Polymerase, deoxynucleotides, and a reaction buffer that has been optimized and includes MgCl2
- Components the user must add: template, primers, and water.
Purpose:
- Applications where the DNA sequence must be correct after amplification
- Cloning and long or difficult amplicons

New England Biolabs. (2026). Neb.com. https://www.neb.com/en/products/m0531-phusion-high-fidelity-pcr-master-mix-with-hf-buffer?srsltid=AfmBOorEwfoZbsx9mLnMvqC0ZzoqOD_2vjUU8fWlYwjLxsupJd6g5J4T

What are some factors that determine primer annealing temperature during PCR?

Annealing temperature:
- The temperature used during the primer annealing step of a PCR, which depends on the primer melting temperature
- Usually lower than the melting temperature
- Too high Ta: the primer will not anneal to the template, and the amplification will not proceed
- Too low Ta: nonspecific binding of the primers to the template or each other can occur, causing:
  - increased likelihood of nonspecific product formation
  - decreased formation of the intended product due to inefficient reaction conditions.
Factors of primer annealing temperature:
- Base composition (proportion of ATCG nucleotides)
- Primer concentration
- Ionic reaction environment
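To illustrate how base composition drives the estimate, here is a rough sketch using the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a quick approximation for short primers. It ignores primer concentration and salt, the other two factors listed above, so a supplier's Tm calculator should be preferred for real primers. The 5 °C offset below the lower Tm is an assumed rule of thumb, and both function names are hypothetical.

```python
def wallace_tm(primer):
    """Rough melting temperature in °C: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    invalid = sorted(set(primer) - set("ACGT"))
    if invalid:
        raise ValueError(f"Unexpected characters: {invalid}")
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def suggested_annealing(forward, reverse, offset=5):
    """Annealing temperature set `offset` °C below the lower primer Tm.
    Assumption: the offset is a generic starting point, not a Phusion-specific value."""
    return min(wallace_tm(forward), wallace_tm(reverse)) - offset


if __name__ == "__main__":
    fwd, rev = "ATGGAAGACGCCAAAAAC", "TTACAATTTGGACTTTCC"
    print(wallace_tm(fwd), wallace_tm(rev), suggested_annealing(fwd, rev))
```

Two primers of the same length can differ by several degrees just from their GC content, which is why primer pairs are usually designed with matched Tm values.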

New England Biolabs. (2025). Neb.com. https://www.neb.com/en/faqs/how-should-i-determine-the-appropriate-annealing-temperature-for-my-reaction

PCR conditions | Primer annealing specificity | PCR buffers. (n.d.). Www.qiagen.com. https://www.qiagen.com/us/knowledge-and-support/knowledge-hub/bench-guide/pcr/introduction/pcr-conditions

Two methods from this class create linear DNA fragments: PCR and restriction enzyme digestion. Compare and contrast these two methods, both in terms of protocol and in terms of when one may be preferable to the other.

PROTOCOLS PCR vs RESTRICTION ENZYME DIGESTION:
- PCR is mainly a technique for DNA amplification.
- RESTRICTION ENZYME DIGESTION is a technique for cutting DNA at specific sites.
The two methods differ in temperatures and procedures, and their applications partly overlap: PCR is used for DNA amplification, cloning, sequencing, etc., while restriction enzyme digestion is used for site-specific DNA cutting, cloning, etc.
In the table below, the comparison between protocols can be better understood:

Addgene. (2016, October 11). Molecular Biology Protocol - Restriction Digest of Plasmid DNA. Addgene.org. https://www.addgene.org/protocols/restriction-digest/

‌New England Biolabs. (2026). Neb.com. https://www.neb.com/en/protocols/taq-dna-polymerase-with-standard-taq-buffer-m0273?srsltid=AfmBOoofDqU_j-Dq9LMqmvJJS77-KBrCTdRfvMap0CX93U1VmEXFKB2t

‌New England Biolabs. (2026). Neb.com. https://www.neb.com/en/protocols/restriction-digest-protocol?srsltid=AfmBOorUr9MjG3wtFkW9aSZK3zyRojgexRcNxwj0Tinahe9D5B5V07vw

How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?
To ensure that the DNA sequences are appropriate for Gibson cloning, we need to verify the following points:
1. Confirm the overlapping homologous regions: these overlaps should be 20-40 bp long and are important because they allow the fragments to anneal to each other during assembly.
2. Design PCR fragments correctly: the PCR should amplify the correct fragment, including overlap extensions compatible with the neighboring fragment or vector.
3. Verify fragment size by electrophoresis: run the PCR and digested products on an agarose gel to confirm the expected band size, clean amplification, and minimal nonspecific products.
4. Purify the DNA fragments: remove primers, nucleotides, enzymes, and template DNA, since contaminants can inhibit the Gibson enzymes.
5. Ensure compatible vector linearization: if using a plasmid backbone, confirm that the vector is fully linearized and that its ends correspond to the designed overlaps.
6. Check the sequence orientation and reading frame: this avoids frameshifts and truncated proteins.
7. Sequence verification: after assembly and transformation, perform colony PCR and Sanger sequencing to confirm that the insert is present, that there are no mutations, and that the assembly junctions are correct.
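The first check in the list can be done in silico before ordering primers. The sketch below looks for the longest stretch, within the 20-40 bp window, where the end of one fragment matches the start of the next. It assumes both fragments are written 5′ to 3′ on the same strand; `end_overlap` is a hypothetical helper, not part of any cloning package.

```python
def end_overlap(upstream, downstream, min_len=20, max_len=40):
    """Length of the longest overlap (min_len..max_len) between the 3' end of
    `upstream` and the 5' end of `downstream`; 0 if none is found."""
    upstream, downstream = upstream.upper(), downstream.upper()
    limit = min(max_len, len(upstream), len(downstream))
    for n in range(limit, min_len - 1, -1):
        if upstream[-n:] == downstream[:n]:
            return n
    return 0


if __name__ == "__main__":
    shared = "ACGTTGCAAGCTTGGATCCGTACG"  # 24 bp homology arm (made-up example)
    insert = "TTTTTTTTTT" + shared
    vector = shared + "GGGGGGGGGG"
    print(end_overlap(insert, vector))
```

For a circular assembly, the same check would also be run between the last fragment and the first one, so that every junction is covered.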
How does the plasmid DNA enter the E. coli cells during transformation?
Plasmid DNA can enter E. coli cells in two ways:
1. Chemical transformation
  In chemical transformation, plasmid DNA is mixed with chemically competent cells, incubated on ice, and then briefly exposed to an elevated temperature (heat shock). The initial incubation is carried out in a polypropylene tube on ice for 5 to 30 minutes.
2. Electroporation
  Electroporation uses an electroporator to subject competent cells and plasmid DNA to a brief pulse of a high-voltage electric field. This treatment induces transient pores in the cell membranes, which permit plasmid entry into the cells. The most common type of electric pulse in bacterial transformation is exponential decay, where a set voltage is applied and allowed to decay over a few milliseconds, called the time constant. The field strength (V/cm) is the initial peak voltage (V) divided by the gap, in cm, between the electrodes of the cuvette used.
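The field-strength relationship is simple arithmetic, sketched here in Python. The voltages and cuvette gaps in the demo are illustrative values, not settings taken from the protocol, and `field_strength_kv_per_cm` is a hypothetical helper name.

```python
def field_strength_kv_per_cm(voltage_kv, gap_cm):
    """Field strength = initial peak voltage / electrode gap."""
    if gap_cm <= 0:
        raise ValueError("gap must be positive")
    return voltage_kv / gap_cm


if __name__ == "__main__":
    # Same voltage, two cuvette gap widths
    for gap in (0.1, 0.2):
        print(gap, field_strength_kv_per_cm(1.8, gap))
```

The same voltage gives half the field strength in a cuvette with twice the gap, so the electroporator setting has to be chosen together with the cuvette.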

Feature	Chemical Transformation	Electroporation
Definition	Introduction of plasmid DNA into chemically competent cells using cations and heat shock	Introduction of plasmid DNA into competent cells using a high-voltage electric pulse
Main Mechanism	Mg²⁺ or other cations help neutralize charges, and heat shock creates temporary membrane permeability	Electric field creates transient pores in the cell membrane allowing DNA entry
Key Steps	DNA mixed with competent cells → incubation on ice → heat shock at 42°C → recovery on ice	DNA mixed with competent cells → electric pulse applied with electroporator → recovery
Typical Conditions	42°C for ~30 seconds after 5–30 min incubation on ice	Field strength >15 kV/cm using 0.1 cm cuvettes
DNA Amount Required	1–10 ng plasmid DNA	Usually small amounts of plasmid DNA are sufficient
Cell Volume	50–100 µL competent cells	20–80 µL in electroporation cuvettes
Equipment Needed	Water bath or heat block	Electroporator and specialized cuvettes
Transformation Efficiency	Lower efficiency, especially with ligation mixtures	Generally higher efficiency
Advantages	Simple, inexpensive, does not require special equipment	Highly efficient and suitable for difficult transformations
Disadvantages	Lower efficiency; ligation mixtures may reduce success to 1–10%	Risk of arcing, which decreases viability and efficiency
Sensitivity to Buffers	Less sensitive to buffer composition	Conductive buffers (MgCl₂, phosphates) can cause arcing
Membrane Permeabilization Method	Heat shock	Electric pulse
Post-transformation Handling	Cells returned to ice before recovery	Cells recovered immediately after pulse
Storage of Competent Cells	Can be refrozen, but efficiency drops ~50% per freeze/thaw cycle	Same limitation applies; best stored in single-use aliquots
Best Use Cases	Routine cloning and simple plasmid transformations	High-efficiency cloning or transformation of difficult cells/plasmids

Bacterial Transformation Workflow | Thermo Fisher Scientific - US. (2025). Thermofisher.com. https://www.thermofisher.com/ec/en/home/life-science/cloning/cloning-learning-center/invitrogen-school-of-molecular-biology/molecular-cloning/transformation/bacterial-transformation-workflow.html

Describe another assembly method in detail (such as Golden Gate Assembly)

Explain the other method in 5 - 7 sentences plus diagrams (either handmade or online).
Golden Gate Assembly is a molecular cloning method that allows multiple DNA fragments to be assembled in a specific order in a single reaction tube. The method uses Type IIS restriction enzymes, such as BsaI, which cut DNA outside of their recognition site and generate custom overhangs. These overhangs are designed so that each DNA fragment joins only to the correct neighboring fragment. During the reaction, the DNA is repeatedly digested and ligated using DNA ligase, producing a seamless final construct without extra restriction sites. One major advantage is that many DNA fragments can be assembled simultaneously, making the process faster than traditional cloning methods.

Traditional methods vs Golden Gate Assembly

Golden Gate Assembly - the restriction enzymes most commonly used across Type IIS assembly standards: BsaI, BbsI, SapI, and BsmBI.

Step 1: DNA Fragments with Type IIS Restriction Sites

[BsaI]-Fragment A-[BsaI]
[BsaI]-Fragment B-[BsaI]
[BsaI]-Fragment C-[BsaI]

                ↓ Digestion

Step 2: Custom Sticky Ends are Generated

Fragment A ====
               ==== Fragment B
                               ==== Fragment C

                ↓ Ligation

Step 3: Final Assembled Construct

Fragment A — Fragment B — Fragment C
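The way custom overhangs fix the fragment order can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each junction is represented by a single 4-nt top-strand sequence, the overhang sequences and the `Fragment` / `assembly_order` names are hypothetical, and no real digestion or ligation is modeled.

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    name: str
    left: str   # overhang exposed on the upstream end after Type IIS digestion
    right: str  # overhang exposed on the downstream end after Type IIS digestion


def assembly_order(fragments: List[Fragment], start_overhang: str) -> List[str]:
    """Chain fragments by matching each right overhang to the next left overhang."""
    by_left = {}
    for frag in fragments:
        if frag.left in by_left:
            raise ValueError(f"overhang {frag.left} is used twice; order is ambiguous")
        by_left[frag.left] = frag

    order = []
    current = start_overhang
    while current in by_left:
        frag = by_left.pop(current)
        order.append(frag.name)
        current = frag.right

    if by_left:
        raise ValueError("some fragments have no matching neighbor")
    return order


# Hypothetical overhangs, listed out of order on purpose.
parts = [
    Fragment("Fragment C", "CCTA", "CGCT"),
    Fragment("Fragment A", "GGAG", "AATG"),
    Fragment("Fragment B", "AATG", "CCTA"),
]
print(assembly_order(parts, "GGAG"))
```

Because every junction has a unique overhang, the fragments can only join as A, B, C regardless of the order in which they are listed, which mirrors the one-pot behavior described above.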

(2026). Igem.org. https://technology.igem.org/technologies/golden-gate

New England Biolabs. (2024). Getting Started with Golden Gate Assembly. Neb.com. https://www.neb.com/en/nebinspired-blog/getting-started-with-golden-gate

‌

Model this assembly method with Benchling or Asimov Kernel!

Assignment: Asimov Kernel

Create a Repository for your work
Create a blank Notebook entry to document the homework and save it to that Repository
Explore the devices in the Bacterial Demos Repo to understand how the parts work together by running the Simulator on various examples, following the instructions for the simulator found in the “Info” panel (click the “i” icon on the right to open the Info panel)
Create a blank Construct and save it to your Repository
Recreate the Repressilator in that empty Construct by using parts from the Characterized Bacterial Parts repository
Search the parts using the Search function in the right menu
Drag and drop the parts into the Construct
Confirm it works as expected by running the Simulator (“play” button) and compare your results with the Repressilator Construct found in the Bacterial Demos repository
Document all of this work in your Notebook entry - you can copy the glyph image and the simulator graphs, and paste them into your Notebook
Build three of your own Constructs using the parts in the Characterized Bacterial Parts Repo
Explain in the Notebook Entry how you think each of the Constructs should function
Run the simulator and share your results in the Notebook Entry
If the results don’t match your expectations, speculate on why and see if you can adjust the simulator settings to get the expected outcome

For this assignment, I took notes in the Notebook as suggested. Here are the images of it, in which I registered every step:

For the first part, the repressilator construct, the process went smoothly and the results matched the reference. For the second part, I built constructs 1, 2, and 3. Construct 1 was a simple GFP expression cassette, and its expression reached a steady state. For construct 2, I tried a repressilator using GFP, BFP, and RFP. The first simulation looked good, but I noticed that the fluorescent proteins were not being expressed, so I made a few changes. The result was not what I expected: instead of sustained oscillations, I obtained a few oscillations at the beginning of the simulation, after which expression settled to a steady level. For the last construct, I tried to use GFP and RFP with a TetR protein and a ligand. After a few iterations, the simulation without the ligand was successful. The ligand did not work, although I tried many options.

Week 7 HW: Genetic Circuits-part II

Assignment Part 1: Intracellular Artificial Neural Networks (IANNs)

1. What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?

Intracellular artificial neural networks (IANNs) provide an analog approach to computation within biological systems, enabling cells to process continuous inputs and generate graded, non-binary outputs. IANNs can integrate multiple signals and capture complex, non-linear relationships through distributed gene regulation, which lets them recognize patterns across many inputs.

In contrast, traditional circuits based on Boolean logic are binary systems whose ON/OFF output responds directly to the given inputs. This limits the information they can process to predefined input combinations. They are easier to implement experimentally, but their results are limited and they cannot respond to complex, continuous, and dynamic signals.

The table below summarizes the comparison:

Feature	Boolean Genetic Circuits	IANNs
Response Type	Digital (ON/OFF)	Analog (Graded/Continuous)
Logic Type	Exact Logic (AND, NOT)	Non-linear Approximation
Complexity	Limited / Simple	Complex / Adaptive
Robustness	Low (Noise sensitive)	High (Pattern Recognition)
Main Use Case	Switch / Memory	Classification / Processing

Comparison table made with AI's help

3Blue1Brown. (2017). But what is a Neural Network? | Deep learning, chapter 1. In YouTube. https://www.youtube.com/watch?v=aircAruvnKk

‌TeachTech Online. (2016, April 15). [Tema 3] Puertas lógicas y circuitos combinacionales. YouTube. https://www.youtube.com/watch?v=r7YNXYGCx7s

Shao, B., Liu, X., Zhang, D., Wu, J., & Ouyang, Q. (2015). From Boolean Network Model to Continuous Model Helps in Design of Functional Circuits. PLOS ONE, 10(6), e0128630. https://doi.org/10.1371/journal.pone.0128630

‌1. Introduction to biological circuit design — Biological Circuit Design documentation. (n.d.). Biocircuits.github.io. https://biocircuits.github.io/chapters/01_intro_to_circuit_design.html

‌Neural network (machine learning). (2024, February 18). Wikipedia. https://en.wikipedia.org/wiki/Neural_network_(machine_learning)

2. Describe a useful application for an IANN; include a detailed description of input/output behavior, as well as any limitations an IANN might face to achieve your goal.

An interesting application of an IANN could be in the production of palm hearts. The proposal is to optimize growth by introducing a symbiotic bacterium that sends signals to the plant's genes, causing the palm to produce more or fewer proteins in order to reach an optimal production of fiber. A conceptual formula for this would be:

                                                        O = 0.3M + 0.25H − 0.2S + 0.25G

Here O is the output, M represents the metabolic state (sugar levels), H represents hormone levels, S represents stress levels, and G represents expression related to structural growth. In this approximation, when the bacteria detect that the output has shifted away from the balanced value, they send a signal to the plant's DNA so that it adjusts protein production until the output is balanced again. This system would be dynamic because the plant's state is never constant, due to environmental causes and internal processes.
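The weighted sum can be evaluated directly. The sketch below uses the weights from the formula; the input values are invented, normalized numbers (0 to 1) used only to show the arithmetic, and `output_score` is a hypothetical helper name.

```python
# Weights taken from the conceptual formula: O = 0.3M + 0.25H - 0.2S + 0.25G
WEIGHTS = {"M": 0.30, "H": 0.25, "S": -0.20, "G": 0.25}


def output_score(levels: dict) -> float:
    """Weighted sum of the sensed plant states (all inputs assumed normalized to 0-1)."""
    missing = set(WEIGHTS) - set(levels)
    if missing:
        raise KeyError(f"missing inputs: {sorted(missing)}")
    return sum(WEIGHTS[key] * levels[key] for key in WEIGHTS)


# Hypothetical readings: good sugar level, moderate hormones, some stress, modest growth.
example = {"M": 0.8, "H": 0.6, "S": 0.5, "G": 0.4}
print(round(output_score(example), 3))
```

Stress is the only input with a negative weight, so raising S while holding the other inputs fixed always lowers the output.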

Some possible complications are the complexity of the plant's system itself and the variability between individuals, which would greatly expand the range of readings the bacteria would need to interpret and would require the network to be adaptive. Also, there is no guarantee that the bacteria would survive within different individuals.

3. Below is a diagram depicting an intracellular single-layer perceptron where the X1 input is DNA encoding for the Csy4 endoribonuclease and the X2 input is DNA encoding for a fluorescent protein output whose mRNA is regulated by Csy4. Tx: transcription; Tl: translation. Draw a diagram for an intracellular multilayer perceptron where layer 1 outputs an endoribonuclease that regulates a fluorescent protein output in layer 2.

For this example, I will use:

X1: glucose, construct: Promoter (pCRP) + RBS + Csy4 + T
X2: stress, construct: Promoter (pRpoS) + RBS + Csy4 + T
Layer 1: Csy4 mRNA = w(X1 + X2)
Layer 2: GFP construct: Promoter + RBS + GFP (with cutting site) + T
Output: Fluorescence in higher or lower quantities.

In the image above, you can see how X1 and X2 determine the amount of Csy4 produced in the cell; then, the amount of Csy4 enzyme will affect the expression of GFP. Therefore, when there is more Csy4, there will be less fluorescence (GFP), and when there is less Csy4, the fluorescence (GFP) will be higher.

Multilayer Perceptron - chart developed with AI assistance
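The two-layer behavior described above (inputs set the Csy4 level, and Csy4 then lowers GFP) can be sketched numerically. This is a toy model with assumptions throughout: the sigmoid activation, the Hill-type repression, and every parameter value (weights, bias, `k`, `n`) are illustrative choices, not measured properties of Csy4 or of these promoters.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def csy4_level(x1: float, x2: float, w1: float = 1.0, w2: float = 1.0,
               bias: float = -1.0) -> float:
    """Layer 1: both input constructs add to one Csy4 pool (normalized 0-1)."""
    return sigmoid(w1 * x1 + w2 * x2 + bias)


def gfp_level(csy4: float, k: float = 0.5, n: float = 2.0,
              gfp_max: float = 1.0) -> float:
    """Layer 2: Csy4 cleaves the GFP mRNA, modeled here as Hill-type repression."""
    return gfp_max / (1.0 + (csy4 / k) ** n)


# Scan a few hypothetical input combinations (0 = absent, 1 = saturating).
for x1, x2 in [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]:
    csy4 = csy4_level(x1, x2)
    print(f"X1={x1} X2={x2} -> Csy4={csy4:.3f}, GFP={gfp_level(csy4):.3f}")
```

In this sketch GFP falls as either input rises, which matches the qualitative statement that more Csy4 means less fluorescence.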

Assignment Part 2: Fungal Materials

1. What are some examples of existing fungal materials and what are they used for? What are their advantages and disadvantages over traditional counterparts?

There are many examples of fungal materials in use; some of the most interesting and useful are:

 * Architecture: specifically with NASA, testing fungal materials to build biosensors for space exploration devices.
 * Biotextiles: the development and commercialization of fungal leather and other types of biotextiles.
 * Mycelial wood: a wood replacement that is lighter, faster, and easier to produce.
 * Food industry: the production of meat alternatives and proteins.
 * Antibiotic research.

Also, the types of materials developed with fungi are classified as rigid materials or flexible materials. For the rigid ones, materials are created by combining fungi with fibers or lignocellulose particles. The characteristic of each rigid material depends on many factors, such as the type of substrate, the fungi species and strains, the type of hypha, and overall growth conditions. Some examples of rigid materials are related to construction.

For the flexible materials, the production variables are similar to those of the rigid ones, but the difference lies in the final product. In this case, textiles tend to be fragile and limited, which is why they are produced with the help of biotechnology to improve their properties. Several industries use fungi to produce flexible products such as textiles, food, foams, leather, and many more.

In the chart below, there is a list of various companies that work with mycelium, and the industries they are involved in:

(2026). Sciencedirect.com. https://www.sciencedirect.com/science/article/pii/S2950194625002079#ab0010

Hinneburg, H., Gu, S., & Naseri, G. (2025). Fungal Innovations—Advancing Sustainable Materials, Genetics, and Applications for Industry. Journal of Fungi, 11(10), 721. https://doi.org/10.3390/jof11100721

2. What might you want to genetically engineer fungi to do and why? What are the advantages of doing synthetic biology in fungi as opposed to bacteria?

I think it could be interesting to genetically engineer fungi to efficiently convert plant biomass into biofuels such as ethanol, lipids, or energy-rich compounds. Due to their natural ability to degrade complex polymers like cellulose or lignin, fungi can serve as a biological platform for transforming agricultural waste into usable energy sources.

According to the paper “Fungal Innovations - Advancing sustainable materials,” the advantages of doing synthetic biology in fungi lie in their structural and biosynthetic capabilities. E. coli and S. cerevisiae are the common hosts because of their rapid growth and simple genetics, but they have difficulties with RNA splicing and complex regulation. Filamentous fungi, specifically, can fold complex proteins, and work with them is driving advances in transcriptional regulation tools, genome editing techniques, and rapid DNA assembly methods; therefore, they are powerful biomanufacturing platforms.

Hinneburg, H., Gu, S., & Naseri, G. (2025). Fungal Innovations—Advancing Sustainable Materials, Genetics, and Applications for Industry. Journal of Fungi, 11(10), 721. https://doi.org/10.3390/jof11100721

‌ Assignment Part 3: First DNA Twist Order

0. Review the Individual Final Project documentation guidelines.

1. Submit this Google Form with your draft Aim 1, final project summary, HTGAA industry council selections, and shared folder for DNA designs. DUE MARCH 20 FOR MIT/HARVARD/WELLESLEY STUDENTS

For this part, I made the submission but forgot to take a screenshot of it. The shared folder can be found at this link: https://benchling.com/s/seq-kXwSe99jGIKKQLPAx7XC?m=slm-7BhFEtQHpBAKVNKSrK9c

2. Review Part 3: DNA Design Challenge of the week 2 homework. Design at least 1 insert sequence and place it into the Benchling/Kernel/Other folder you shared in the Google Form above. Document the backbone vector it will be synthesized in on your website.

I made an insert for E. coli using aeBlue.

AeBlue is a chromoprotein originally derived from the sea anemone Actinia equina. It expresses a strong, bright blue color, and its color palette ranges from pink to bright blue. I chose this chromoprotein because its color intensity is higher than that of many other chromoproteins, making it especially useful for visualization during co-culture experiments with K. xylinus during cellulose production and bio-leather formation.

A stronger color expression would make it easier to monitor protein expression and track bacterial distribution throughout the material.

For this insert, I chose to use:

Constitutive promoter BBa_J23101
RBS BBa_B0034
CDS aeBlue BBa_K864401
Terminator BBa_B0015
Backbone: pUC19

Week 9 HW: Cell Free Systems

Part A: General and Lecturer-Specific Questions

General homework questions

1. Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.

Cell-free protein synthesis: a protein expression approach that produces a target protein without the use of living cells.
In vivo methods: protein expression approaches that use living cells such as bacteria (most commonly E. coli), yeast, insect cells, and mammalian cells.

Here is a chart that summarizes the differences between the two methods:

Cell-free protein synthesis vs. traditional in vivo methods

As shown, cell-free protein synthesis has several advantages over traditional in vivo methods, mainly because everything happens outside a living cell. This makes the system much more flexible, since you can directly control the amount of DNA, enzymes, and other components without worrying about how the cell will react. Cell-dependent methods, in contrast, are strongly limited by the host's metabolism, regulation, and survival.

Another important advantage is that cell-free systems allow the production of proteins that might be toxic to cells, while in in vivo methods, these types of proteins can kill or damage the host organism, making them difficult or impossible to produce. In cell-free expression, this is not a problem because there are no living cells involved.

In terms of speed, the cell-free approach is faster, since no time is spent growing or transforming cells. This makes it easier to quickly test different DNA sequences or protein variants.

There are several situations where the cell-free method is more useful, for example when producing toxic proteins that cannot be expressed in cells, or for rapid prototyping applications such as synthetic biology circuits or screening multiple protein variants in a short time. It is important to add that while cell-free methods are better for rapid processes and controlled, medium-scale protein expression, in vivo methods can handle massive, low-cost protein production.

New England Biolabs. (2026). Neb.com. https://www.neb.com/en/applications/protein-expression/cell-free-protein-expression?srsltid=AfmBOorKmWZBUknZgtYzmBfbx0IiXqMcRlLcgWd8oi4EKcYBFwv4sudk

‌Mason, E. (2023, March 23). Advantages of Cell-Free Protein Expression. Biocompare.com. https://www.biocompare.com/Editorial-Articles/594727-Advantages-of-Cell-Free-Protein-Expression/

‌Cui, Y., Chen, X., Wang, Z., & Lu, Y. (2022). Cell-Free PURE System: Evolution and Achievements. BioDesign Research, 2022. https://doi.org/10.34133/2022/9847014

2. Describe the main components of a cell-free expression system and explain the role of each component.

Cell-free expression systems are made up of several components that work together to produce proteins outside living cells:

Cell extract: contains the molecular machinery needed for protein synthesis. This extract usually comes from broken cells (like bacteria) and provides ribosomes, tRNAs, enzymes, and other factors required for transcription and translation.
DNA template: this is the gene that encodes the protein of interest. The system uses this DNA to produce mRNA and then translates it into the desired protein.
Amino acids: they are the building blocks of proteins. These are added to the system so that ribosomes can assemble them into a protein based on the sequence of the mRNA.
NTPs (nucleoside triphosphates): such as ATP, GTP, CTP, and UTP. These molecules are essential both for building the mRNA during transcription and for providing energy during translation.
Energy source: protein synthesis requires a lot of energy, so the system needs molecules like ATP and other energy-regenerating compounds to keep the reaction running.
Cofactors and salts: these stabilize the chemical environment and help enzymes function correctly, keeping the system stable and efficient.

Kim, W., Han, J., Chauhan, S., & Lee, J. W. (2025). Cell-free protein synthesis and vesicle systems for programmable therapeutic manufacturing and delivery. Journal of Biological Engineering, 19(1). https://doi.org/10.1186/s13036-025-00523-x

3. Why is energy provision regeneration critical in cell-free systems? Describe a method you could use to ensure continuous ATP supply in your cell-free experiment.

Energy regeneration is critical in cell-free systems because protein synthesis is a highly energy-demanding process. Both transcription and translation require large amounts of ATP and GTP, and without a continuous energy supply the reaction stops. Because there is no cellular metabolism to generate energy naturally, a cell-free system runs out of ATP very quickly. Without regeneration, protein yields would be very low and the reaction would last only a short time.

There are some interesting pathways to produce continuous energy in cell-free systems, some of which are:

Glucose and sugar metabolism: systems frequently derived from E. coli or yeast enable high-yield in vitro protein production by using metabolic pathways to break down glucose, which can improve cost-efficiency.
Maltodextrin metabolism: a low-cost secondary energy source for CFPS. It produces higher levels of protein than PEP, glucose, and glucose-6-phosphate. The enhancement of protein synthesis was largely attributed to better-controlled phosphate levels (recycling of inorganic phosphate) and a more homeostatic reaction environment.
Electricity-generated power: in cells, ATP is synthesized through a rather complicated process involving several membrane-bound redox protein complexes. Electrons are transferred along different redox centers, creating a proton motive force across the membrane, which is subsequently harvested for ATP synthesis.

System	How it generates ATP	Advantages	Disadvantages	When to use it
Glucose and sugar metabolism	Uses glucose or other sugars (e.g., glycolysis) to produce ATP	Low cost; biologically natural; supports longer reactions	Can produce acidic byproducts; less control over conditions	When a cost-effective and stable system is needed
Maltodextrin metabolism	Uses maltodextrin, slowly broken down to generate ATP	More stable energy release; fewer inhibitory byproducts	Requires optimization; depends on enzymatic activity	When higher stability and consistency are required
Electricity-powered generation	Uses electrical energy to drive redox reactions	Precise control; real-time adjustment; reduced byproducts	Technically complex; requires specialized equipment	Advanced research or tight energy control systems
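Why regeneration matters can be illustrated with a toy simulation that compares ATP over time with and without a secondary energy source. Everything here is an assumption made for illustration: the first-order kinetics, the rate constants, the starting concentrations, and the `simulate_atp` name do not come from any of the cited systems.

```python
def simulate_atp(atp0: float = 1.5, fuel0: float = 0.0, k_use: float = 1.0,
                 k_regen: float = 0.5, hours: float = 6.0, dt: float = 0.01):
    """Euler integration of ATP (mM) consumed by Tx/Tl and regenerated from a fuel pool.

    Toy model: consumption rate = k_use * ATP, regeneration rate = k_regen * fuel.
    """
    atp, fuel = atp0, fuel0
    trace = []
    steps = int(round(hours / dt))
    for i in range(steps + 1):
        trace.append((i * dt, atp))
        used = k_use * atp * dt
        regenerated = k_regen * fuel * dt
        atp += regenerated - used
        fuel -= regenerated
    return trace


no_regen = simulate_atp(fuel0=0.0)     # ATP only, no secondary energy source
with_regen = simulate_atp(fuel0=30.0)  # hypothetical 30 mM fuel (e.g. a sugar)

for label, trace in [("no regeneration", no_regen), ("with regeneration", with_regen)]:
    print(f"{label}: ATP after {trace[-1][0]:.0f} h = {trace[-1][1]:.3f} mM")
```

In this sketch the ATP pool without regeneration decays toward zero within a few hours, while the fueled reaction keeps ATP available for longer, which is the qualitative point of adding a regeneration system.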

Wang Y, Zhang YH. Cell-free protein synthesis energized by slowly-metabolized maltodextrin. BMC Biotechnol. 2009 Jun 28;9:58. doi: 10.1186/1472-6750-9-58. PMID: 19558718; PMCID: PMC2716334.

Luo, S., Adam, D., Giaveri, S., Barthel, S., Cestellos-Blanco, S., Hege, D., Paczia, N., Castañeda-Losada, L., Klose, M., Arndt, F., Heider, J., & Erb, T. J. (2023). ATP production from electricity with a new-to-nature electrobiological module. Joule, 7(8), 1745-1758. https://doi.org/10.1016/j.joule.2023.07.012

Calhoun, Kara & Swartz, James. (2005). Energizing cell-free protein synthesis with glucose metabolism. Biotechnology and bioengineering. 90. 606-13. 10.1002/bit.20449.

4. Compare prokaryotic versus eukaryotic cell-free expression systems. Choose a protein to produce in each system and explain why.

First, we need to understand both terms:

A prokaryote is a unicellular organism that lacks a defined nucleus; its DNA is located in the cytoplasm.
A eukaryote is a cell or organism with a defined nucleus, enclosed by a membrane, in which the DNA is stored.

When comparing them in the context of cell-free expression, we can outline some important differences that arise from the complexity of each type of intracellular machinery:

Prokaryotic CFPS:
- Yields higher quantities of protein
- Is cost-effective
- Suited to the production of simple proteins
- Can produce toxic proteins that would kill a living host
- Example: E. coli
Eukaryotic CFPS:
- Crucial for properly folding complex, functional proteins that require post-translational modifications
- Supports disulfide bond formation
- Examples: wheat germ, rabbit reticulocyte lysate, insect cell lysate

In the image below, we can analyze a comparative chart between Prokaryotic vs. Eukaryotic CFPS:

And a useful comparison between CFPS systems classified by eukaryotic and prokaryotic cells:

A Comparative Guide: Prokaryotic vs. Eukaryotic Cell-Free Expression Systems for Eukaryotic Proteins - CD Biosynsis. (2025). Biosynsis.com. https://www.biosynsis.com/a-comparative-guide-prokaryotic-vs-eukaryotic-cell-free-expression-systems-for-eukaryotic-proteins.html

‌Cell-Free Protein Expression | Thermo Fisher Scientific - US. (2026). Thermofisher.com. https://www.thermofisher.com/ec/en/home/life-science/protein-biology/protein-biology-learning-center/protein-biology-resource-library/pierce-protein-methods/cell-free-protein-expression.html

Zemella A, Thoring L, Hoffmeister C, Kubick S. Cell-Free Protein Synthesis: Pros and Cons of Prokaryotic and Eukaryotic Systems. Chembiochem. 2015 Nov;16(17):2420-31. doi: 10.1002/cbic.201500340. Epub 2015 Oct 19. PMID: 26478227; PMCID: PMC4676933.‌

5. How would you design a cell-free experiment to optimize the expression of a membrane protein? Discuss the challenges and how you would address them in your setup.

To optimize membrane protein expression in a cell-free system, I would use a eukaryotic extract such as a CHO or insect cell extract, since both systems are more suitable for producing complex membrane proteins with proper folding and co-translational processing.

These extracts contain microsomal vesicles that provide a hydrophobic membrane-like environment, helping membrane proteins maintain their structure and preventing aggregation. This is especially important because membrane proteins are naturally adapted to lipid environments.

To monitor protein expression, I would use a GFP tag on the protein construct. GFP fluorescence would serve as an indicator of successful protein expression and would also help visualize where expression occurs within the system.

For energy regeneration, I would include a creatine phosphate/creatine kinase system to maintain ATP levels. Additionally, I would implement a Continuous Exchange Cell-Free (CECF) system. This setup would extend the reaction time from only a few hours up to 24 hours, allowing a higher protein yield.

Finally, in comparison with E. coli extracts, these eukaryotic systems usually produce a lower amount of protein. However, the main benefit is higher-fidelity membrane protein expression, since protein folding and membrane insertion are difficult to achieve in E. coli.

6. Imagine you observe a low yield of your target protein in a cell-free system. Describe three possible reasons for this and suggest a troubleshooting strategy for each.

Inappropriate extract system: Using the wrong extract system may reduce protein yield because some proteins require specific folding machinery, chaperones, or membrane environments that are not present in all extracts. For example, E. coli extracts can produce proteins very quickly, but they are not always a good fit for complex membrane proteins (as in the question above). To troubleshoot this, it is necessary to identify the extract and supplementary structures best suited to the specific protein.
Low energy supply: Cell-free protein synthesis requires a constant energy supply to maintain transcription and translation. If ATP regeneration is insufficient, the reaction may stop prematurely, resulting in a low protein yield. To address this, it is necessary to ensure constant energy regeneration, which could be achieved with alternatives that provide a more stable and continuous energy source, such as glucose metabolism, maltodextrin metabolism, the creatine phosphate/creatine kinase system, or electricity-powered ATP regeneration.
Incorrect hydrophobic mimetics: Some proteins need to have hydrophobic environments to remain stable and folded. If the wrong detergents, nanodiscs, or liposomes are used, then the protein would aggregate or lose its functionality, therefore reducing protein yield. To solve this problem, different membrane mimetics and concentrations could be tested to mimic the natural membrane environment and stabilize the protein during expression.

Homework question from Kate Adamala

Design an example of a useful synthetic minimal cell as follows:

Pick a function and describe it.

What would your synthetic cell do? What is the input and what is the output?

I would like to design a synthetic cell that identifies flea saliva inside a dog's body and then sends a signal to a common skin bacterium.

The input would be a protein from the saliva of the flea Ctenocephalides felis, specifically the major allergen Cte f1, a chymotrypsin-like enzyme responsible for triggering histamine-mediated allergic responses. The output would be IPTG, which would activate the engineered bacteria living in the dog's hair follicles.

The general output would be the expression of an anti-allergenic protein that blocks the saliva protein’s activity.

Could this function be realized by cell-free Tx/Tl alone, without encapsulation?

No, it could not be done by cell-free Tx/Tl alone, because the dog's skin contains substances such as sweat that could degrade the components of the cell-free system, including the DNA and the transcription and translation enzymes.

Could this function be realized by genetically modified natural cell?

It would be damaging for the dog and for the skin bacteria, because when a mutation is inserted into an organism, it can generate a very complex chain of reactions that could harm the health of the living being. In this case, bacterial metabolism would be compromised by the environment, which could cause mutations. This would shut down the receptor mechanism, and the sensor would not work.

Describe the desired outcome of your synthetic cell operation.

In the presence of the flea's salivary protein (Cte f1), the synthetic minimal cell (SMC) produces and releases IPTG. The IPTG would then activate the modified Staphylococcus epidermidis that lives in the hair follicles. Once activated, the bacteria will produce the enzyme histaminase.

Design all components that would need to be part of your synthetic cell.

What would be the membrane made of?
- 70% POPC: as a structural fluid base.
- 20% Cholesterol: to reduce permeability and give rigidity against physical force.
- 10% DSPE-PEG2000: to prevent aggregation of vesicles in shampoo or gel products and to improve adherence to the dog´s fur.
What would you encapsulate inside? Enzymes, small molecules.
- Macromolecules and enzymes: Tx/Tl machinery from an E. coli chassis, RNA polymerase, tRNAs, and elongation factors.
- Small molecules: IPTG, ATP, GTP, amino acids, and an energy regeneration system.
- Genetic material: DNA plasmid containing the allergen-responsive riboswitch circuit.
Which organism your Tx/Tl system will come from? Is bacterial OK, or do you need a mammalian system for some reason? (hint: for example, if you want to use small molecule modulated promotors, like Tet-ON, you need mammalian)

It will come from E. coli, which is a good match for a bacterial circuit. This bacterial system is fully adequate because the synthetic circuit uses a modified bacterial RNA aptamer/riboswitch that interacts directly with prokaryotic 70S ribosomes.

How will your synthetic cell communicate with the environment? (hint: are substrates permeable? or do you need to express the membrane channel?)
- Input communication: since the flea allergen protein Cte f1 is too large to cross the membrane by passive diffusion, the SMC membrane will constitutively contain modified α-hemolysin (αHL) pores. These pores would allow the passage of peptides or small allergen fragments. Alternatively, a membrane-bound scFv receptor (single-chain variable fragment antibody) could be coupled to the pore system to induce conformational signaling upon allergen binding.
- Output communication: IPTG is a relatively small molecule and would diffuse through activated αHL pores once the molecular circuit is triggered.

Experimental details

List all lipids and genes. (bonus: find the specific genes; for example, instead of just saying “small molecule membrane channel” pick the actual gene.)
- LIPIDS:
  1. POPC (1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine)
  2. Cholesterol
  3. DSPE-PEG2000 (1,2-distearoyl-sn-glycero-3-phosphoethanolamine-N-[amino(polyethylene glycol)-2000])
- GENES:
  1. Constitutive T7 promoter (pT7)
  2. Cte f1-specific riboswitch/aptamer: An engineered RNA aptamer that specifically binds the flea allergen Cte f1. In the absence of the allergen, the ribosome binding site remains blocked. Once the allergen binds, the RNA changes conformation and allows translation.
  3. Holin S105 gene: After activation, the system expresses the Lambda phage Holin S105 protein, which forms membrane pores and triggers the rapid release of encapsulated IPTG.
How will you measure the function of your system?
- IPTG Release assay: A fluorescent dye such as calcein would be co-encapsulated inside the SMCs. When the flea allergen is added, pore formation would release the dye, and fluorescence increase would be measured over time.
- Bacterial Activation Assay: The SMCs would be co-cultured with engineered Staphylococcus epidermidis containing an IPTG-inducible mCherry reporter. Successful activation would produce red fluorescence.
- Histamine Degradation Assay: After system activation, histamine concentration would be measured using ELISA or HPLC to confirm DAO-mediated histamine degradation.

Chen YE, et al. Engineered skin bacteria induce antitumor T cell responses against melanoma. Science. 2023;380:203-210. doi: 10.1126/science.abp9563.

Brown MM, Horswill AR. Staphylococcus epidermidis-Skin friend or foe? PLoS Pathog. 2020 Nov 12;16(11):e1009026. doi: 10.1371/journal.ppat.1009026. PMID: 33180890; PMCID: PMC7660545.

Smith JM, Chowdhry R, Booth MJ. Controlling Synthetic Cell-Cell Communication. Front Mol Biosci. 2022 Jan 5;8:809945. doi: 10.3389/fmolb.2021.809945. PMID: 35071327; PMCID: PMC8766733.

Clinics, V. (2025, November 7). Dermatitis in dogs: treating flea allergy dermatitis with fipronil. Vets and Clinics; Vets & Clinics. https://vetsandclinics.com/en/library/dermatitis-in-dogs-treating-flea-allergy-dermatitis-with-fipronil

Homework question from Peter Nguyen

Freeze-dried cell-free systems can be incorporated into all kinds of materials as biological sensors or as inducible enzymes to modify the material itself or the surrounding environment. Choose one application field — Architecture, Textiles/Fashion, or Robotics — and propose an application using cell-free systems that are functionally integrated into the material. Answer each of these key questions for your proposal pitch:

Write a one-sentence summary pitch sentence describing your concept.

A freeze-dried cell-free bioleather capable of sensing environmental temperature changes and responding by producing visible aeBlue chromoprotein coloration within the material.

How will the idea work, in more detail? Write 3-4 sentences or more.

The project consists of a bacterial cellulose-based bioleather embedded with freeze-dried cell-free protein synthesis systems containing the genetic circuit for aeBlue chromoprotein expression. The CFPS components would remain inactive while dry, allowing the material to be stored and used safely without living engineered bacteria.

When the material is exposed to moisture, humidity, sweat, or environmental rehydration, the freeze-dried CFPS becomes active. Temperature-responsive genetic elements or the natural thermosensitive behaviour of aeBlue would trigger different levels of blue pigmentation depending on environmental conditions.

The bacterial cellulose matrix acts both as a structural biomaterial and a carrier for the embedded CFPS microcapsules. As temperature changes occur, the material dynamically changes color, creating an environmentally responsive biofabricated leather.

What societal challenge or market need will this address?

This project addresses the growing demand for sustainable and responsive biomaterials within the fashion and textile industries. Traditional leather production has major environmental impacts, while synthetic plastics contribute to pollution and microplastic accumulation.

A major challenge, however, is the wide variety of bioleather materials and variants that already exist and are available on the market.

How do you envision addressing the limitation of cell-free reactions (e.g., activation with water, stability, one-time use)?

The CFPS system would be freeze-dried and encapsulated within protective hydrogel or polymer microcapsules distributed throughout the cellulose matrix. This approach increases long-term stability and protects the biological machinery from premature degradation.

The material would only activate after exposure to moisture or humidity, extending shelf life during storage and transport. To address the one-time-use limitation, the bioleather could incorporate a replaceable or rechargeable CFPS layer that can be rehydrated multiple times or periodically replenished with fresh freeze-dried reaction components.

Homework question from Ally Huang

Freeze-dried cell-free reactions have great potential in space, where resources are constrained. As described in my talk, the Genes in Space competition challenges students to consider how biotechnology, including cell-free reactions, can be used to solve biological problems encountered in space. While the competition is limited to only high school students, your assignment will be to develop your own mock Genes in Space proposal to practice thinking about biotech applications in space!

For this particular assignment, your proposal is required to incorporate the BioBits® cell-free protein expression system, but you may also use the other tools in the Genes in Space toolkit (the miniPCR® thermal cycler and the P51 Molecular Fluorescence Viewer). For more inspiration, check out https://www.genesinspace.org/ .

Provide background information that describes the space biology question or challenge you propose to address. Explain why this topic is significant for humanity, relevant for space exploration, and scientifically interesting. (Maximum 100 words)

Long-duration space missions expose astronauts to enclosed environments where oxygen imbalance or toxic gas accumulation may become life-threatening. Current oxygen monitoring systems depend heavily on electronic infrastructure, which can fail during emergencies. This proposal explores a freeze-dried cell-free biosensor capable of detecting low oxygen levels and harmful gas accumulation while activating an emergency biological response system. Such a platform could support astronaut survival in spacecraft, planetary habitats, or exploration missions with limited resources. Developing lightweight, portable, and biologically programmable gas-response systems is significant for future lunar, Martian, and deep-space exploration, where autonomous emergency technologies will become increasingly necessary.

Name the molecular or genetic target that you propose to study. Examples of molecular targets include individual genes and proteins, DNA and RNA sequences, or broader -omics approaches. (Maximum 30 words)

Oxygen-sensitive regulatory system (Fnr system), hypoxia-responsive promoters, and gas-responsive reporter proteins integrated into the BioBits cell-free protein expression platform.

Describe how your molecular or genetic target relates to the space biology question or challenge your proposal addresses. (Maximum 100 words)

The selected molecular targets allow the CFPS to biologically detect oxygen depletion and potentially dangerous atmospheric conditions. The Fnr oxygen-sensing regulatory system naturally responds to hypoxic environments in bacteria, making it suitable for engineering emergency biosensors. By integrating the regulatory elements into the BioBits platform, the system could activate visible reporter signals or trigger gas-response mechanisms when oxygen concentration drops below safe thresholds. This directly addresses one of the major biological and engineering challenges in space exploration: maintaining safe atmospheric conditions in isolated environments where equipment malfunction or delayed rescue may threaten astronaut survival.

Clearly state your hypothesis or research goal and explain the reasoning behind it. (Maximum 150 words)

This project hypothesizes that a freeze-dried BioBits cell-free system can function as a portable oxygen deficiency biosensor capable of activating a detectable biological response under hypoxic conditions. The goal is to engineer a lightweight and stable emergency platform that can detect dangerous atmospheric changes without relying entirely on electronic systems.

The proposed system would use oxygen-sensitive regulatory components to activate reporter protein expression when oxygen levels decrease. This response could generate visible colorimetric or fluorescent signals that warn astronauts about unsafe environments. In future applications, the platform could potentially be expanded to activate oxygen-generating or gas-neutralizing biological pathways.

This research is important because future space habitats will require autonomous, low-resource emergency technologies that remain functional even during equipment failures or power limitations.

Outline your experimental plan - identify the sample(s) you will test in your experiment, including any necessary controls, the type of data or measurements that will be collected, etc. (Maximum 100 words)

Freeze-dried BioBits reactions containing oxygen-sensitive genetic circuits will be prepared and exposed to controlled atmospheric conditions with varying oxygen concentrations. Normoxic samples will serve as controls, while hypoxic chambers will simulate oxygen-deficient environments.

Reporter protein expression will be measured using visible chromoproteins or fluorescence detected with the P51 Molecular Fluorescence Viewer. The miniPCR thermal cycler may be used to amplify and verify DNA constructs before freeze-drying.

Collected data will include fluorescence intensity, response time, activation thresholds, and system stability after rehydration. These measurements will evaluate whether the biosensor can reliably detect dangerous atmospheric conditions in space-like environments.

Homework Part B: Individual Final Project

We’d like students to start exploring their final project in depth this week! Of your three Aims, for this week you should have at least Aim 1 decided and written down.

Put your chosen final project slide in the appropriate slide deck following the instructions on slide 1:
- MIT/Harvard/Wellesley ONE FINAL PROJECT IDEA
- Committed Listener ONE FINAL PROJECT IDEA
This is my final project idea. It is still a work in progress, but the slide emphasizes the main idea of the experiment:

Submit this Final Project selection form if you have not already.
I already submitted it but forgot to take a screenshot of the form.
Begin planning how you will write your final project documentation based on these guidelines
Prepare your first DNA order and put it in the “Twist (MIT)” or “Twist (Nodes)” tab of the 2026 HTGAA Ordering: DNA, Reagents, Consumables spreadsheet, as appropriate.
- First Twist order deadline for MIT/Harvard/Wellesley students is Friday, April 3 at 11PM ET
- First Twist order deadline for Committed Listeners is Friday, April 10 at 11PM ET. (Your Node Lead will place the Twist order, so please work with them to finalize your constructs and ordering decisions.)

For the USFQ node, the Twist orders were cancelled due to administrative procedures.

Week 10 HW: Imaging and Measurement

Waters Part I — Molecular Weight

We will analyze an eGFP standard on a Waters Xevo G3 QTof MS system to determine the molecular weight of intact eGFP and observe its charge state distribution in the native and denatured (unfolded) states. The conditions for LC-MS analysis of intact protein cause it to unfold and be detected in its denatured form (due to the solvents and pH used for analysis).

1. Based on the predicted amino acid sequence of eGFP (see below) and any known modifications, what is the calculated molecular weight? You can use an online calculator like the one at https://web.expasy.org/compute_pi/

eGFP Sequence:

MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH

Note: This contains a His-purification tag (HHHHHH) and a linker (the LE before it).

Using the link, the result obtained was:
- Theoretical pI/Mw of eGFP: 5.90 / 28006.60

2. Calculate the molecular weight of the eGFP using the adjacent charge state approach described in the recitation. Select two charge states from the intact LC-MS data (Figure 1) and:

Determine z for each adjacent pair of peaks (n, n+1) using:

$$z = \frac{(m/z)_{n+1}}{(m/z)_n - (m/z)_{n+1}}$$

For this part, I chose this adjacent pair of peaks:
- (m/z)_n = 1037.4423
- (m/z)_{n+1} = 1000.4302
The formula would be:

$$z = \frac{1000.4302}{1037.4423 - 1000.4302} = \frac{1000.4302}{37.0121} = 27.0298 \approx 27$$

z is a charge state, so it is a unitless whole number: the peak at m/z 1037.4423 carries 27 charges, and the adjacent peak at m/z 1000.4302 carries 28.

Determine the MW of the protein using the relationship between (m/z)_n, MW, and z:
For this part, I will use the equation given in the presentation to determine MW. First I will determine n (the charge of the peak at (m/z)_n), and then MW, taking the proton mass as approximately 1 Da:

$$n = \frac{(m/z)_{n+1} - 1}{(m/z)_n - (m/z)_{n+1}} = \frac{1000.4302 - 1}{1037.4423 - 1000.4302} = \frac{999.4302}{37.0121} = 27.0028 \approx 27$$

$$MW = n \cdot (m/z)_n - n$$

$$MW = (27 \times 1037.4423) - 27 = 28010.9421 - 27 = 27983.9421$$

$$MW \approx 27984\ \text{Da}$$

Comparing the theoretical MW (28006.60 Da) with the experimental MW (27984 Da) shows only a slight variation of about 22.6 Da, meaning that the two results agree closely. The final experimental MW is, as mentioned before, 27984 Da.

Calculate the accuracy of the measurement using the deconvoluted MW from 2.2 and the predicted weight of the protein from 2.1 using:

$$\text{Accuracy} = \frac{MW_{experiment} - MW_{theory}}{MW_{theory}}$$

$$\text{Accuracy} = \frac{27984 - 28006.60}{28006.60} = \frac{-22.6}{28006.60} = -8.07 \times 10^{-4}$$

$$\text{Accuracy} \approx -0.0807\%$$

The final accuracy of -0.0807% (about -807 ppm) shows that the experimental MW differs from the theoretical MW by only a small fraction. This supports the charge-state assignment and peak selection.
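The adjacent charge state calculation above can also be checked numerically. The following is a minimal Python sketch, assuming a proton mass of approximately 1 Da as in the hand calculation; the function names are illustrative and not part of any instrument software.

```python
def charge_from_adjacent_peaks(mz_n: float, mz_n_plus_1: float, proton: float = 1.0) -> float:
    """Charge n of the peak at mz_n, given the adjacent peak with one more charge."""
    return (mz_n_plus_1 - proton) / (mz_n - mz_n_plus_1)


def neutral_mass(mz: float, z: int, proton: float = 1.0) -> float:
    """Neutral mass M, from m/z = (M + z * proton) / z."""
    return z * mz - z * proton


def relative_error(experiment: float, theory: float) -> float:
    """Signed relative error (experiment - theory) / theory."""
    return (experiment - theory) / theory


mz_n, mz_n1 = 1037.4423, 1000.4302            # adjacent peaks chosen from Figure 1
n = round(charge_from_adjacent_peaks(mz_n, mz_n1))
mw = neutral_mass(mz_n, n)                     # mass from the peak with charge n
mw_check = neutral_mass(mz_n1, n + 1)          # same mass from the peak with charge n + 1
err = relative_error(mw, 28006.60)             # theoretical MW from Expasy

print(f"n = {n}")
print(f"MW = {mw:.1f} Da (cross-check from the n+1 peak: {mw_check:.1f} Da)")
print(f"error = {err * 100:.4f} % ({err * 1e6:.0f} ppm)")
```

Computing the mass from both peaks is a quick consistency test: if the charge assignment is right, the two values should agree to within a fraction of a dalton. The error differs slightly in the last digit from the hand calculation because the mass is not rounded to 27984 before dividing.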

Can you observe the charge state for the zoomed-in peak in the mass spectrum for the intact eGFP? If yes, what is it? If no, why not?
Yes, it can be observed. The protein is charged with multiple protons, which creates a charge-state envelope: a series of peaks, each corresponding to the same protein carrying a different number of charges. When we zoom in on one of these peaks, we can see the isotopic peaks inside it, and their spacing (approximately 1/z on the m/z axis) gives the charge state.

Waters Part II — Secondary/Tertiary structure

We will analyze eGFP in its native, folded state and compare it to its denatured, unfolded state on a quadrupole time-of-flight MS. We will be doing MS-only analysis (no liquid chromatography, also known as “direct infusion” experiments) on the Waters Xevo G3-QToF MS.

1. Based on learnings in the lab, please explain the difference between native and denatured protein conformations. For example, what happens when a protein unfolds? How is that determined with a mass spectrometer? What changes do you see in the mass spectrum between the native and denatured protein analyses (Figure 2)?

The main difference between native and denatured protein conformations is that native proteins are in their natural, functional 3D conformation, folded into secondary, tertiary, and (where applicable) quaternary structures. Denatured proteins, on the other hand, have lost their 3D structure and unfolded into inactive, random polypeptide chains due to stress (heat, pH, chemicals), although they retain their primary amino acid sequence.

Charge states:
- Native: the compact structure hides many of the protonatable sites, so the spectrum shows fewer peaks with lower charge states (higher m/z values).
- Denatured: with the 3D structure lost, the protonatable sites are exposed, so the result is a broad distribution of higher charge states (lower m/z values).
Conservation of the quaternary complex
- Native: native mass spectrometry allows the complex to be observed intact because the non-covalent interactions (hydrogen bonds, hydrophobic interactions) are preserved.
- Denatured: the complex dissociates into individual subunits, or it is observed as a fully unfolded polypeptide chain.
Presence of adducts (ligands/salts)
- Native: the spectrum shows peaks corresponding to the protein bound to small ligands, metals, or subunits, which confirms a functional state.
- Denatured: ligands are lost in the denatured state.
Solvent conditions
- Native: native MS uses neutral pH (6.5-7.5) and an aqueous buffer.
- Denatured: denaturing MS uses acidic conditions (for example, TFA or formic acid) and organic solvents (for example, acetonitrile).

In Figure 2, the native protein shows only a few well-defined peaks, which indicates a compact structure, while the denatured protein shows a broad series of peaks, which is evidence of an unfolded structure with its protonatable sites exposed. The charge states of the denatured protein are also much higher, and form a longer consecutive series, than those of the native protein.

2. Zooming into the native mass spectrum of eGFP from the Waters Xevo G3 QTof MS (see Figure 3), can you discern the charge state of the peak at ~2800 m/z? What is the charge state? How can you tell?

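To determine the charge state, we can use the adjacent isotope peaks visible inside a single zoomed-in peak, such as:

m1: 2544.8552
m2: 2544.7637

These two peaks are only 0.0915 m/z apart, so they are neighbouring isotope peaks of the same charge state, not adjacent charge states. Isotope peaks differ by about 1 Da in mass, which is 1/z on the m/z axis, so the formula we will use is:

$$z = \frac{1}{m_1 - m_2}$$

$$z = \frac{1}{2544.8552 - 2544.7637} = \frac{1}{0.0915} \approx 10.9 \approx 11$$

The peak at m/z ≈ 2545 therefore has a charge state of 11+, which agrees with the MW from Part I (11 × 2544.86 - 11 ≈ 27982 Da). By the same reasoning, the peak at ~2800 m/z corresponds to the neighbouring 10+ charge state (27984 / 10 + 1 ≈ 2799).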
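A quick way to sanity-check a charge assignment in the native spectrum is to predict where each charge state of the intact protein should appear. The sketch below is a minimal Python example, assuming the intact mass of about 27984 Da estimated in Part I and a proton mass of 1.00728 Da; the helper names are illustrative.

```python
PROTON = 1.00728  # Da, proton mass (assumed constant for this sketch)


def mz_for_charge(mass: float, z: int) -> float:
    """m/z of an ion with neutral mass `mass` carrying z protons."""
    return (mass + z * PROTON) / z


def nearest_charge(mass: float, observed_mz: float, z_range=range(1, 61)) -> int:
    """Charge state whose predicted m/z lies closest to the observed m/z."""
    return min(z_range, key=lambda z: abs(mz_for_charge(mass, z) - observed_mz))


mass = 27984.0  # Da, intact eGFP mass estimated in Part I

# Predicted positions of a few native charge states
for z in range(9, 13):
    print(f"z = {z:2d}  ->  m/z = {mz_for_charge(mass, z):.2f}")

z_2800 = nearest_charge(mass, 2800.0)       # peak at ~2800 m/z
z_2545 = nearest_charge(mass, 2544.8552)    # zoomed-in isotope cluster
print(z_2800, z_2545)
```

A charge state is a small whole number, so a result in the tens of thousands signals that a formula meant for adjacent charge states was applied to isotope peaks instead.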

Waters Part III — Peptide Mapping - primary structure

We will digest the eGFP protein standard into peptides using trypsin (an enzyme that selectively cleaves the peptide bond after Lysine (K) and Arginine (R) residues). The resulting peptides will be analyzed on the Waters BioAccord LC-MS to measure their molecular weights and fragmented to confirm the amino acid sequence within each peptide – generating a “peptide map”. This process is used to confirm the primary structure of the protein.

There are a variety of tools available online to calculate protein molecular weight and predict a list of peptides generated from a tryptic digest. We will be using tools within the online resource Expasy (the bioinformatics resource portal of the Swiss Institute of Bioinformatics (SIB)) to predict a list of tryptic peptides from eGFP.

1. How many Lysines (K) and Arginines (R) are in eGFP? Please circle or highlight them in the eGFP sequence given in Waters Part I question 1 above. (Note: adding the sequence to Benchling as an amino acid file and clicking biochemical properties tab will show you a count for each amino acid).

eGFP Sequence:

MVS[K]GEELFTG VVPILVELDG DVNGH[K]FSVS GEGEGDATYG [K]LTL[K]FICTT G[K]LPVPWPTL VTTLTYGVQC FS(R)YPDHM[K]Q HDFF[K]SAMPE GYVQE(R)TIFF [K]DDGNY[K]T(R)A EV[K]FEGDTLV N(R)IEL[K]GIDF [K]EDGNILGH[K] LEYNYNSHNV YIMAD[K]Q[K]NG I[K]VNF[K]I(R)HN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALS[K]D PNE[K](R)DHMVL LEFVTAAGIT LGMDELY[K]LE HHHHHH

In the sequence, we can identify:

Lysines (K): 20
Arginines (R): 6

2. How many peptides will be generated from tryptic digestion of eGFP?

Navigate to https://web.expasy.org/peptide_mass/
Copy/paste the sequence above into the input box in the PeptideMass tool to generate expected list of peptides.
Use Figure 4 below as a guide for the relevant parameters to predict peptides from eGFP.
Click “Perform the Cleavage” button in the PeptideMass tool and report the number of peptides generated when using trypsin to perform the digest.

When the cleavage is performed, the number of peptides is: 19
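The residue counts and the digest can be reproduced with a few lines of Python. This is a minimal sketch, assuming the simplest trypsin rule (cleave after K or R unless the next residue is P) with no missed cleavages and no mass cutoff; it is not the PeptideMass algorithm, and the function name is illustrative.

```python
import re
from collections import Counter

# eGFP sequence exactly as given in Part I (whitespace is removed below)
EGFP = "".join("""
MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL
VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV
NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD
HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE
HHHHHH
""".split())


def tryptic_digest(sequence: str) -> list[str]:
    """Split after every K or R that is not followed by P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


counts = Counter(EGFP)
peptides = tryptic_digest(EGFP)

print(f"K: {counts['K']}, R: {counts['R']}")
print(f"fragments from the simple rule: {len(peptides)}")
for p in peptides:
    print(len(p), p)
```

The simple rule keeps every fragment, including very short ones such as R or TR, so it returns more fragments than the 19 reported by the tool. A likely reason is that the tool's settings hide small peptides below a mass limit, but that depends on the parameters in Figure 4.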

3. Based on the LC-MS data for the Peptide Map data generated in lab (please use Figure 5a as a reference) how many chromatographic peaks do you see in the eGFP peptide map between 0.5 and 6 minutes? You may count all peaks that are >10% relative abundance.

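According to the map data, 17 peaks are clearly identified with >10% relative abundance. The first row gives the peak number and the second row the retention time in minutes:

     1 /   2 /   3 /   4 /   5 /   6 /   7 /   8 /   9 /   10/  11 /  12 /  13 /  14 /  15 /  16 /  17
  0.43/ 0.61/ 0.79/ 1.43/ 1.80/ 1.85/ 1.93/ 2.17/ 2.26/ 2.54/ 2.78/ 3.53/ 3.59/ 3.70/ 4.48/ 4.64/ 4.87

Note that peak 1 (0.43 min) falls slightly before the 0.5 min limit; excluding it would leave 16 peaks inside the 0.5 to 6 minute window.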

4. Assuming all the peaks are peptides, does the number of peaks match the number of peptides predicted from question 2 above? Are there more peaks in the chromatogram or fewer?

Predicted peptides: 19
Chromatogram peaks: 17

If we assume that all the peaks in the chromatogram are peptides, there are fewer peaks in the chromatogram than peptides predicted in theory.

5. Identify the mass-to-charge (m/z) of the peptide shown in Figure 5b. What is the charge (z) of the most abundant charge state of the peptide (use the separation of the isotopes to determine the charge state)? Calculate the mass of the singly charged form of the peptide ([M+H]⁺) based on its m/z and z.

m/z₁ = 525.76712
m/z₂ = 526.25918

To identify the charge z, we first need the spacing between the isotope peaks m/z₁ and m/z₂. For that we will use this formula:

$$\Delta(m/z) = (m/z)_2 - (m/z)_1$$

$$\Delta(m/z) = 526.25918 - 525.76712 = 0.49206 \approx 0.49$$

Adjacent isotope peaks differ by about 1 Da in mass, so their spacing on the m/z axis is 1/z. We use this to identify the charge:

$$\frac{1}{z} = \Delta(m/z) \approx 0.49$$

$$z \approx \frac{1}{0.49} \approx 2$$

For the next part, we will use the formula below, with n = z = 2 and the proton mass H = 1.007:

$$MW = n \cdot (m/z)_n - n \cdot H$$

$$MW = (2 \times 525.76712) - (2 \times 1.007) = 1051.53424 - 2.014$$

$$MW \approx 1049.52$$

We now have M, and we already know H (1.007):

$$[M+H]^+ = 1049.52 + 1.007$$

$$[M+H]^+ \approx 1050.53$$

By calculating z from the isotope spacing, we know that the peptide has a charge of 2+ at m/z = 525.77. The singly charged ion of the same peptide would appear at m/z ≈ 1050.53. Both values describe the same peptide in different charge states. This occurs because peptides can carry different numbers of protons during electrospray ionization, depending on the presence of protonatable amino acids.
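The isotope-spacing approach can be written as a short Python sketch. It assumes a proton mass of 1.00728 Da and that the two input values are adjacent isotope peaks of the same ion; the function names are illustrative.

```python
PROTON = 1.00728  # Da, proton mass (assumed constant for this sketch)


def charge_from_isotope_spacing(mz_a: float, mz_b: float) -> int:
    """Adjacent isotope peaks are about 1 Da apart in mass, i.e. 1/z apart in m/z."""
    return round(1.0 / abs(mz_b - mz_a))


def neutral_mass(mz: float, z: int) -> float:
    """Neutral monoisotopic mass M from the monoisotopic m/z of a z+ ion."""
    return z * (mz - PROTON)


mz1, mz2 = 525.76712, 526.25918      # first two isotope peaks in Figure 5b

z = charge_from_isotope_spacing(mz1, mz2)
m = neutral_mass(mz1, z)
m_plus_h = m + PROTON                # singly protonated form, [M+H]+

print(f"z = {z}")
print(f"M = {m:.4f} Da")
print(f"[M+H]+ = {m_plus_h:.4f}")
```

Using the monoisotopic (lowest m/z) peak for the mass matters here: taking the second isotope peak instead would shift M by about 1 Da.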

6. Identify the peptide based on comparison to expected masses in the PeptideMass tool. What is mass accuracy of measurement? Please calculate the error in ppm. Recall that Accuracy = (MW_experiment - MW_theory) / MW_theory

MW_theory = 1049.51
MW_experiment = 1049.52

$$\text{Accuracy} = \frac{MW_{experiment} - MW_{theory}}{MW_{theory}}$$

$$\text{Accuracy} = \frac{1049.52 - 1049.51}{1049.51} = \frac{0.01}{1049.51} = 9.53 \times 10^{-6}$$

$$\text{Accuracy (ppm)} = 9.53 \times 10^{-6} \times 10^{6} \approx 9.5\ \text{ppm}$$
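To identify which predicted peptide the measured mass belongs to, the tryptic fragments can be matched by monoisotopic mass. The sketch below is a minimal Python example and not the PeptideMass tool itself. It assumes the simple trypsin rule (cleave after K or R unless followed by P), standard monoisotopic residue masses, unmodified peptides, and an arbitrary 20 ppm matching tolerance; all names are illustrative.

```python
import re

# eGFP sequence exactly as given in Part I (whitespace is removed below)
EGFP = "".join("""
MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL
VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV
NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD
HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE
HHHHHH
""".split())

# Standard monoisotopic residue masses in Da
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # Da, added once per peptide for the free termini


def peptide_mass(peptide: str) -> float:
    """Monoisotopic neutral mass of an unmodified peptide."""
    return sum(RESIDUE[aa] for aa in peptide) + WATER


def ppm_error(measured: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


peptides = [p for p in re.split(r"(?<=[KR])(?!P)", EGFP) if p]
measured = 1049.52        # neutral mass from question 5
tolerance_ppm = 20.0      # assumed tolerance for this sketch

matches = [
    (p, peptide_mass(p))
    for p in peptides
    if abs(ppm_error(measured, peptide_mass(p))) <= tolerance_ppm
]
above_500 = sum(1 for p in peptides if peptide_mass(p) > 500.0)

for seq, mass in matches:
    print(f"{seq}: {mass:.4f} Da, error {ppm_error(measured, mass):.1f} ppm")
print(f"fragments above 500 Da: {above_500} of {len(peptides)}")
```

With these inputs the only fragment within tolerance is FEGDTLVNR (about 1049.514 Da), an error of roughly 5 to 6 ppm against the measured 1049.52. The 9.5 ppm in the hand calculation comes from rounding both masses to two decimals. The count of fragments above 500 Da is 19, which would be consistent with the number reported in question 2 if the tool was set to display only peptides above 500 Da (an assumption about the settings in Figure 4).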

7. What is the percentage of the sequence that is confirmed by peptide mapping? (see Figure 6)

The percentage of the sequence confirmed by the peptide mapping is 88%.

Waters Part IV — Oligomers

We will determine Keyhole Limpet Hemocyanin (KLH)’s oligomeric states using charge detection mass spectrometry (CDMS). CDMS single-particle measurements of KLH allow us to make direct mass measurements to determine what oligomeric states (that is, how many protein subunits combine) are present in solution. Using the known masses of the polypeptide subunits (Table 1) for KLH, identify where the following oligomeric species are on the spectrum shown below from the CDMS (Figure 7):

- 7FU Decamer - 8FU Didecamer - 8FU 3-Decamer - 8FU 4-Decamer

Polypeptide Subunit Name	Subunit Mass
7FU Decamer	340 kDa
8FU Didecamer	400 kDa

To calculate the masses of the oligomers, we have to understand:

Decamer = 10 subunits (didecamer = 20, 3-decamer = 30, 4-decamer = 40)
FU = functional unit, a domain within each KLH subunit polypeptide (7 or 8 per subunit)

Oligomeric species	Number of subunits × subunit mass	Total mass
7FU Decamer	10 × 340 kDa	3.4 MDa
8FU Didecamer	20 × 400 kDa	8.0 MDa
8FU 3-Decamer	30 × 400 kDa	12.0 MDa
8FU 4-Decamer	40 × 400 kDa	16.0 MDa
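The same arithmetic can be scripted so that a mass read off the CDMS spectrum is assigned to the closest expected species. This is a minimal Python sketch using only the subunit masses from Table 1; the `nearest_species` helper and the example mass of 8.1 MDa are hypothetical illustrations, not values read from Figure 7.

```python
# Subunit masses from Table 1, in kDa
SUBUNIT_KDA = {"7FU": 340, "8FU": 400}

# (species name, subunit type, number of subunits)
OLIGOMERS = [
    ("7FU Decamer", "7FU", 10),
    ("8FU Didecamer", "8FU", 20),
    ("8FU 3-Decamer", "8FU", 30),
    ("8FU 4-Decamer", "8FU", 40),
]

# Expected total mass of each species in MDa
expected_mda = {
    name: SUBUNIT_KDA[subunit] * copies / 1000
    for name, subunit, copies in OLIGOMERS
}


def nearest_species(measured_mda: float) -> str:
    """Name of the expected species closest in mass to a measured value (MDa)."""
    return min(expected_mda, key=lambda name: abs(expected_mda[name] - measured_mda))


for name, mass in expected_mda.items():
    print(f"{name}: {mass:.1f} MDa")

print(nearest_species(8.1))  # hypothetical measured mass, for illustration only
```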

Week 11 HW: Bioproduction & Cloud Labs

Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork

1. Contribute at least one pixel to this global artwork experiment before the editing ends on Sunday 4/19 at 11:59 PM EST.

A personalized URL was sent to the email address associated with your Discourse account, and you can discuss the artwork on the Discourse.
If you did not have a chance to contribute, it’s okay, just make sure you become a TA this fall! 😉

2. Make a note on your HTGAA webpages including:

What you contributed to the community bioart project (e.g., “I made part of the DNA on the bottom right plate”)
For this part I contributed a sinuous line intended to connect the plates at the bottom. It was made with sfGFP (superfolder green fluorescent protein). I think I also made some pixels in red, but I could not identify them when the artwork was complete.

What you liked about the project, and
I think it sparked my curiosity and excitement about what the final output was going to be. Everyone was making changes to it, and sometimes the drawing was pretty clear, but other times it mutated. It is nice to see how everyone contributes, either without having anything in mind or with a very specific idea. I like the idea of a collaborative artwork where nobody knows how it is going to look at the end. Also, it is an extensive job for the cloud lab to process.
What about this collaborative art experiment could be made better for next year?
I would like to see more colors in it. Also, it would be nice to use some coding to let everyone make a complete drawing, and then have an AI analysis produce a final output. I would like to understand what the AI is reading and which patterns it selects and keeps. It would be nice to have that curiosity resolved.

Part B: Cell-Free Protein Synthesis | Cell-Free Reagents

1. Referencing the cell-free protein synthesis reaction composition (the middle box outlined in yellow on the image above, also listed below), provide a 1-2 sentence description of what each component’s role is in the cell-free reaction.

E. coli Lysate
- BL21 (DE3) Star Lysate (includes T7 RNA Polymerase): It provides the E. coli machinery (ribosomes, translation factors, tRNAs, and enzymes) that makes it possible to express the proteins encoded in the plasmid. The DE3 background supplies T7 RNA polymerase, and the Star strain carries a mutated RNase E so that mRNA is degraded less, resulting in more stable transcripts and a higher protein yield.
Salts/Buffer
- Potassium Glutamate: It is used to maintain osmotic balance and support enzyme function in the bacterial lysate. It supplies K⁺ at a physiological ionic strength and is an alternative to KCl that supports a more efficient reaction.
- HEPES-KOH pH 7.5: It is a common high-quality buffer suited for cell culture, protein purification, and enzyme studies. It keeps the pH stable within its buffering range of 6.8-8.2.
- Magnesium Glutamate: Mg²⁺ ions are essential for ribosome function and enzymatic activity, while the glutamate counterion is biocompatible and mimics intracellular conditions, which favors protein stability and activity.
- Potassium phosphate monobasic: It is a highly soluble inorganic salt used as a foundational buffering agent. It acts as the weak acid in phosphate buffer systems, which maintain a stable pH (between 5.8 and 8.0) that is critical for enzyme stability.
- Potassium phosphate dibasic: It is generally combined with potassium phosphate monobasic to create biological buffers. It is used in CFPS when high-purity, low-background buffering is needed.
Energy / Nucleotide System
- Ribose: A five-carbon sugar that provides the scaffold for nucleobase salvage, turning free bases back into active nucleotides (important during energy stress). It is therefore crucial for rebuilding nucleotides in the reaction mixture and for regenerating energy molecules such as ATP (adenosine triphosphate).
- Glucose: It fuels CFPS by regenerating ATP via glycolysis, and it offers a higher energy yield per molecule than traditional high-energy phosphate donors. Glucose-driven systems show a lag in protein synthesis because ATP is consumed to prime glycolysis before net regeneration begins.
- AMP: Adenosine monophosphate acts as a signal of energy depletion in cells. In CFPS it is recycled back to ADP/ATP to prolong protein synthesis, often via endogenous kinases such as adenylate kinase.
- CMP: Cytidine 5′-monophosphate is a nucleotide component in CFPS. It serves as a precursor for the production of CTP, which is essential for RNA synthesis during transcription, and is a building block for nucleic acids.
- GMP: Guanosine monophosphate is a critical nucleotide precursor. After phosphorylation to GTP, it serves as a building block for RNA synthesis (transcription) and as an energy source for the ribosome (translation).
- UMP: Uridine monophosphate acts as a critical precursor for pyrimidine nucleotide metabolism, supplying the UTP required for RNA synthesis. While UTP is used for transcription, UMP phosphorylation depends on the energy regeneration systems, which replenish the ATP required by UMP kinases.
- Guanine: A nucleobase that is salvaged into GMP and then phosphorylated to GTP. GTP is crucial for powering the translation machinery, with about 2 molecules of GTP consumed for each amino acid incorporated into a growing polypeptide. It is required during initiation, elongation, and termination of translation.
Translation Mix (Amino Acids)
- 17 Amino Acid Mix: Provides the majority of the amino acids required for protein synthesis, enabling ribosomes to assemble the target protein. While 20 amino acids are the standard, a 17-amino-acid mix also works in specific formulations and is used to enhance solubility, balance metabolism, or allow the incorporation of unnatural amino acids.
- Tyrosine: It is supplied separately due to its low solubility. It is typically added at 1-4 mM when high-yield protein production is needed.
- Cysteine: Also supplied separately, due to its reactivity and tendency to oxidize. Its availability is important for proper protein folding and disulfide bond formation.
Additives
- Nicotinamide: A precursor of the redox cofactors NAD⁺/NADH. In metabolic engineering it is commonly used to produce nicotinamide mononucleotide (NMN), an intermediate of NAD⁺ synthesis. It supports the redox reactions and metabolic processes that help sustain energy regeneration.
Backfill
- Nuclease Free Water: It is used to adjust the final reaction volume while preventing degradation of nucleic acids by nucleases.

2. Describe the main differences between the 1-hour optimized PEP-NTP master mix and the 20-hour NMP-Ribose-Glucose master mix shown in the Google Slide above. (2-3 sentences)

The main difference lies in the energy regeneration strategy used in each system: the 1-hour PEP-NTP master mix relies on phosphoenolpyruvate (PEP) as a high-energy phosphate donor, enabling rapid and efficient ATP regeneration for short-term, high-yield protein synthesis. In contrast, the 20-hour NMP-Ribose-Glucose master mix uses a slower, metabolism-based pathway that recycles nucleotides and generates ATP through glycolysis, allowing sustained protein production over longer periods but with lower instantaneous energy output.

3. Bonus question: How can transcription occur if GMP is not included but Guanine is?

Transcription can occur because guanine can be converted into GMP through nucleotide salvage pathways present in the lysate. GMP is then phosphorylated to GDP and GTP, which are required for RNA synthesis. Therefore, even in the absence of externally supplied GMP, the system can regenerate the necessary guanine nucleotides from guanine.

Part C: Planning the Global Experiment | Cell-Free Master Mix Design

1. Given the 6 fluorescent proteins we used for our collaborative painting, identify and explain at least one biophysical or functional property of each protein that affects expression or readout in cell-free systems. (Hint: options include maturation time, acid sensitivity, folding, oxygen dependence, etc) (1-2 sentences each)

sfGFP: has a ratio of 2.2 ± 0.1 (74 cells) (2.2 is the average of OSER parameters; 0.1 is the variability; 74 is the number of cells in which it has been analyzed). Therefore, sfGFP is an almost stable protein that tends to aggregate at a low rate, which means that it might not fold correctly or be less fluorescent.
- OSER ratio (Organized smooth endoplasmic reticulum) evaluates if a protein tends to oligomerize or aggregate in cells. In the case of CFPS, it indicates a tendency to do so. In general, values around 1-2 indicate a tendency to monomerize and therefore be more stable, while values around 3-4 indicate a tendency to oligomerize and therefore to aggregate.
mRFP1: this protein has a pKa of 4.5. In CFPS, the pH is normally kept around 7-7.5 to obtain higher transcription and translation efficiency. Because this is well above the pKa, mRFP1 stays close to its maximum fluorescence, and small pH changes during the reaction should have little effect on the readout. Fluorescence would only drop substantially if the reaction acidified toward pH 4.5.
- pKa: the measure of the acid sensitivity of a fluorescent protein (FP). It is the pH at which fluorescence intensity drops to 50% of its maximum value, so a lower pKa means the protein tolerates more acidic conditions.
mKO2: has a pKa of 5.5 and a maturation time of 108 min. The long maturation time delays fluorescence detection in CFPS, affecting the early readout. The pKa is below the usual CFPS pH, but the fluorescence readout would decrease if the reaction became acidic enough to approach pH 5.5.
- Maturation time (min): the time a newly synthesized protein needs to fold and form its chromophore before it fluoresces. Values of 10-30 min are considered fast, 30-60 min medium, and more than 90 min slow.
mTurquoise2: this protein has a maturation time of 33.5 min, which is moderate, and a pKa of 3.3, which is low, so its fluorescence is stable at physiological pH. Because mTurquoise2 is derived from GFP, oxygen is required for chromophore maturation, making fluorescence dependent on oxygen availability.
mScarlet_I: has a fluorescence lifetime of 3.9 ns. This means each molecule remains in the excited state for 3.9 ns on average before emitting a photon. A long lifetime can improve readout reliability when lifetime-based detection is used, because it helps discriminate the signal from background fluorescence.
- Lifetime (ns): it is the time a fluorophore remains excited before returning to the ground state by emitting a photon. It is influenced by the local environment, such as pH and molecular interactions.
Electra2: this protein has a high photostability, with a half-life of 1466 seconds, allowing it to maintain fluorescence under continuous illumination.
- Photostability: the ability of a fluorescent protein to maintain its fluorescence under continuous illumination. High photostability improves CFPS readout because less photobleaching occurs during repeated measurements, which gives a more reliable signal.
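
The pKa definition above (the pH at which fluorescence falls to 50% of its maximum) can be turned into a rough estimate of how much signal each protein keeps at the CFPS pH. The sketch below assumes a simple single-site titration curve, which is a simplification and not something measured in this lab. The function name `fraction_fluorescent` and the example pH of 7.2 are illustrative choices. The pKa values are the ones quoted in the answers above.

```python
def fraction_fluorescent(ph: float, pka: float) -> float:
    """Fraction of maximal fluorescence at a given pH.

    Assumes a single-site titration (Henderson-Hasselbalch form),
    so the result is exactly 0.5 when pH equals the pKa.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa values as quoted in the answers above
pka_values = {"mRFP1": 4.5, "mKO2": 5.5, "mTurquoise2": 3.3}

CFPS_PH = 7.2  # illustrative value inside the 7-7.5 range mentioned above

for name, pka in pka_values.items():
    at_cfps = fraction_fluorescent(CFPS_PH, pka)
    at_pka = fraction_fluorescent(pka, pka)
    print(f"{name}: {at_cfps:.3f} of max at pH {CFPS_PH}, {at_pka:.2f} at pH = pKa")
```

Under this assumption, all three proteins stay above 95% of their maximum at pH 7.2, and mKO2 (the highest pKa of the three) would be the first to lose signal if the reaction acidified.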

2. Create a hypothesis for how adjusting one or more reagents in the cell-free mastermix could improve a specific biophysical or functional property you identified above, in order to maximize fluorescence over a 36-hour incubation. Clearly state the protein, the reagent(s), and the expected effect.

I chose the mKO2 protein because it has a relatively long maturation time (108 min), which delays fluorescence detection. I think this can be improved by combining two changes. First, applying more agitation would increase the oxygen level in the reaction and accelerate chromophore formation. Second, adding a chaperone, such as GroEL/GroES, would improve folding efficiency and increase the yield of properly folded protein over a 36-hour incubation.
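
As a rough check on how much the 108 min maturation time matters across the incubation, the sketch below assumes first-order maturation and treats 108 min as a half-time. Both are assumptions made only for illustration, and the function name `fraction_matured` is hypothetical. The model also ignores the fact that protein is synthesized gradually and not all at once.

```python
def fraction_matured(t_min: float, half_time_min: float) -> float:
    """Fraction of already-folded protein with a mature chromophore.

    Assumes first-order kinetics with the given half-time.
    """
    return 1.0 - 0.5 ** (t_min / half_time_min)


MKO2_HALF_TIME_MIN = 108.0  # value quoted above, treated as a half-time (assumption)

for hours in (1, 2, 4, 8, 36):
    frac = fraction_matured(hours * 60, MKO2_HALF_TIME_MIN)
    print(f"{hours:>2} h: {frac:.3f} matured")
```

If these assumptions hold, slow maturation mainly affects the first few hours of readout. By 36 hours the endpoint fluorescence would depend more on how much protein is made and correctly folded, which is where the chaperone part of the hypothesis would matter most.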

3. The second phase of this lab will be to define the precise reagent concentrations for your cell-free experiment. You will be assigned artwork wells with specific fluorescent proteins and receive an email with instructions this week (by April 24). You can begin composing master mix compositions here.

4. The final phase of this lab will be analyzing the fluorescence data we collect to determine whether we can draw any conclusions about favorable reagent compositions for our fluorescent proteins. This will be due a week after the data is returned (date TBD!). The reaction composition for each well will be as follows:

6 μL of Lysate
10 μL of 2X Optimized Master Mix from above
2 μL of assigned fluorescent protein DNA template
2 μL of your custom reagent supplements
Total: 20 μL reaction
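
The per-well composition above can be checked and scaled with a few lines of Python. This is a minimal sketch: the per-well volumes come from the list above, while the helper names, the number of wells, and the 10% pipetting overage are hypothetical values chosen for illustration.

```python
# Per-well volumes in µL, taken from the reaction composition listed above
PER_WELL_UL = {
    "lysate": 6.0,
    "2X master mix": 10.0,
    "DNA template": 2.0,
    "custom supplements": 2.0,
}


def total_volume(per_well: dict) -> float:
    """Sum of all component volumes for one well, in µL."""
    return sum(per_well.values())


def scale_volumes(per_well: dict, n_wells: int, overage: float = 0.10) -> dict:
    """Volumes needed for n_wells, with a fractional overage (assumed 10%)."""
    factor = n_wells * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in per_well.items()}


# The components should add up to the stated 20 µL reaction
assert total_volume(PER_WELL_UL) == 20.0

print(scale_volumes(PER_WELL_UL, n_wells=4))
```

Because the 2X master mix makes up 10 μL of the 20 μL reaction, it ends up at 1X in the final well. Any custom supplement has to fit within the remaining 2 μL.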

Part D: Build-A-Cloud-Lab | (optional) Bonus Assignment