Subsections of <Paulina Flores> — HTGAA Spring 2026

Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices


  • Week 2 HW: DNA Read, Write & Edit


  • Week 3 HW: Lab Automation


  • Week 4: Protein Design - part I

    Part A. Conceptual Questions Answer any NINE of the following questions from Shuguang Zhang: (i.e. you can select two to skip) How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons) Why do humans eat beef but do not become a cow, eat fish but do not become fish? Why are there only 20 natural amino acids? Can you make other non-natural amino acids? Design some new amino acids. Where did amino acids come from before enzymes that make them, and before life started? If you make an α-helix using D-amino acids, what handedness (right or left) would you expect? Can you discover additional helices in proteins? Why are most molecular helices right-handed? Why do β-sheets tend to aggregate? What is the driving force for β-sheet aggregation? Why do many amyloid diseases form β-sheets? Can you use amyloid β-sheets as materials? Design a β-sheet motif that forms a well-ordered structure. Part B: Protein Analysis and Visualization

  • Week 5: Protein Design - part II

    Part A: SOD1 Binder Peptide Design (From Pranam) Superoxide dismutase 1 (SOD1) is a cytosolic antioxidant enzyme that converts superoxide radicals into hydrogen peroxide and oxygen. In its native state, it forms a stable homodimer and binds copper and zinc. Mutations in SOD1 cause familial Amyotrophic Lateral Sclerosis (ALS). Among them, the A4V mutation (Alanine → Valine at residue 4) leads to one of the most aggressive forms of the disease. The mutation subtly destabilizes the N-terminus, perturbs folding energetics, and promotes toxic aggregation.

Subsections of Homework

Week 1 HW: Principles and Practices

Mies portada

First, describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project and/or something for which you are already doing in your research, or something you are just curious about.

Thyroid implant for canines

In the canine world, there is a very specific problem that owners face: hypothyroidism. It is a common condition that dogs develop around the ages of 2 or 3. Still, it is often mistaken for other health conditions, such as intestinal problems, allergies, or dermatological conditions. Because it affects so many systems in a dog's body, veterinarians can sometimes arrive at a misdiagnosis.

There are four types of hypothyroidism:

  • Primary
  • Secondary
  • Tertiary
  • Congenital

We will focus on primary hypothyroidism, which is caused by an autoimmune condition or an idiopathic problem.

Primary hypothyroidism originates in the thyroid gland itself rather than in the signals it receives from the brain. The signalling chain from the hypothalamus to the pituitary gland (hypophysis) works at its normal pace, and the gland receives the message. However, the gland cannot keep up the production of T4, because it has been attacked by immune cells or has started to turn irreversibly into fatty tissue.

The interesting thing about this system is that it is only partially broken. The uptake of T4 and its conversion to T3 do not depend on the gland; they depend on the genetic switches inside the organs the hormones enter: liver, kidneys, brain, and muscles. Another interesting fact is that T4 is a relatively passive hormone that travels around the body through the circulatory system.

Today, there is a pharmaceutical solution: levothyroxine. This pill is a dose of synthetic T4 that enters the body through the digestive tract; once absorbed, it is delivered to the different organs to be converted to T3. Although it is a simple solution, the dose is not always accurate, and it has to be monitored every 6 months and adjusted based on repeated blood tests until it is right.

Now, what if we could implant a thyroid substitute that could read TSH directly from the blood and produce T4 as needed, with a more precise response? To achieve this objective, we must address the following challenges:

  1. The body needs to accept the cells as its own. The most reliable way to do this is to extract a sample of endodermal or glandular cells from the pet's own body and reprogram them into thyroid cells. This should be done in a lab, and the cells must reach a stable state so that they operate accurately and do not stop working once they are inside the body. The importance of using cells from the same body is that they won't be read as a major threat, as other materials could be.

  2. Since the body has a very sensitive immune system, the reprogrammed cells must be placed in a container or membrane that keeps out immune cells and direct infiltration. This membrane should be soft and have pores of the right size to let oxygen, TSH, and nutrients in, and to let T4 out into the bloodstream. It should be made of biomimetic biogels, which are highly hydrated and can be read as neutral elements by immune cells.

  3. To grow an implant, it is important to guide the cells so that they can reproduce the architecture of the thyroid gland. This could be guided by a biopolymer that suggests how cells should grow together.

  4. Finally, this implant should be placed in the subcutaneous region so that it is well supplied with blood; TSH can then reach the implant and trigger the production of T4.

The implant must be tested and tuned to respond over a period of weeks, because T4 levels are regulated not at the exact moment that TSH goes up or down, but when TSH stays at a given level for several weeks. This slow response is similar to how the original gland reacts and does not raise a red flag for immune cells. The implant would respond accurately and at its natural pace, without the inaccurate T4 doses that medication can introduce.
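The slow, weeks-long response described above can be illustrated with a toy rule: change the output only when the average TSH over a whole window stays out of range. This is a minimal sketch, not a physiological model. The function name, the 21-day window, the thresholds, and the step size are all hypothetical placeholders in arbitrary units, not clinical values.

```python
from collections import deque


def simulate_slow_response(tsh_readings, window=21, high=1.0, low=0.5, step=0.1):
    """Return the relative T4 output after each daily TSH reading.

    The output changes only when the mean TSH over a full window is out of
    range, so a brief spike is ignored. All parameters are hypothetical
    placeholders in arbitrary units, not clinical values.
    """
    recent = deque(maxlen=window)
    output = 1.0  # relative T4 production, 1.0 = baseline
    history = []
    for tsh in tsh_readings:
        recent.append(tsh)
        if len(recent) == window:
            mean_tsh = sum(recent) / window
            if mean_tsh > high:
                output += step
                recent.clear()  # wait for a fresh window before adjusting again
            elif mean_tsh < low:
                output = max(0.0, output - step)
                recent.clear()
        history.append(round(output, 2))
    return history


# A one-day spike leaves the output unchanged; sustained high TSH raises it.
spike = simulate_slow_response([0.7] * 10 + [5.0] + [0.7] * 30)
sustained = simulate_slow_response([2.0] * 42)
print(spike[-1], sustained[20], sustained[41])
```

Clearing the window after each adjustment is one simple way to force the next change to wait for another full period; a real implant would rely on the cells' own biology rather than on an explicit rule like this.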

Another possible addition would be to integrate a nanochip that could scan the implant and give feedback on the situation inside the body. For now, however, this is not a stable solution, because the immune system would treat it as a major threat to be eliminated.

Thyroid implant scheme (AI-generated image)

Next, describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future, like ensuring non-malfeasance (preventing harm). Break big goals down into two or more specific sub-goals. Below is one example framework (developed in the context of synthetic genomics) you can choose to use or adapt, or you can develop your own. The example was developed to consider policy goals of ensuring safety and security, alongside other goals, like promoting constructive uses, but you could propose other goals, for example, those relating to equity or autonomy.

Prevention of physical and psychological harm during experimental stages (non-malfeasance)

  1. Strengthen diagnostic protocols for canine hypothyroidism before any individual receives a medical procedure or medication.
  2. Limit access to, and the quantity of, samples extracted from each individual.
  3. Establish rules and strict selection criteria for the canine individuals that will participate in the experimental stage.

Fair and free access to information

  1. Establish clear, accessible, and transparent information for caregivers regarding the potential health risks that the research and experimental stages pose.
  2. Ensure that the results and information obtained in the experimental stages are transferred to students and professionals in the relevant specializations, with emphasis on ethical and responsible use.
  3. In the event of a successful result in canines and the intention to use this research on humans, any prior required animal experimentation must be reviewed and regulated by an official regulatory authority, ensuring fairness, accountability, and ethical continuity across species.

Next, describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”). Try to outline a mix of actions (e.g. a new requirement/rule, incentive, or technical strategy) pursued by different “actors” (e.g. academic researchers, companies, federal regulators, law enforcement, etc). Draw upon your existing knowledge and a little additional digging, and feel free to use analogies to other domains (e.g. 3D printing, drones, financial systems, etc.).

  1. Purpose: What is done now and what changes are you proposing?
  2. Design: What is needed to make it “work”? (including the actor(s) involved - who must opt-in, fund, approve, or implement, etc)
  3. Assumptions: What could you have wrong (incorrect assumptions, uncertainties)?
  4. Risks of Failure & “Success”: How might this fail, including any unintended consequences of the “success” of your proposed actions?

A. Review of individual cases based on diagnostic evaluations made by an external entity and professional

To obtain a verified hypothyroid diagnosis, an external entity must review and confirm the evaluations. This review must rely on physical and chemical tests that provide consistent and standardized results, and its conclusion would accept or deny the participation of each subject. Applying this action requires collaboration between institutions and external professionals. These entities and actors must be accredited by and affiliated with the government (of Ecuador, in this case). External evaluation must be a mandatory step before initiating any procedure, because it ensures that participants are correctly diagnosed and that budgets, time, and assets are well used.

This action assumes that external professionals will be qualified and available to take part in the program, that diagnostic criteria will be consistent in every case, and that these reviews will reduce misdiagnosis. The risks include clinical disagreements between internal and external actors, and a high risk of corruption among professionals and institutions, which could dramatically lower the chances of upholding ethical standards and good treatment.

B. Unannounced audits of research progress and of the animals' health and care

Research progress should be audited through unannounced visits, so that all information about the project stays clear. This action would be led by an external inspector of the research institutions, together with the research team. It would ensure supervision of ethical practices in the lab or clinic, as well as of the real progress of the experimental project. The external inspector would be accredited by external academic institutions that are renowned for their knowledge in the field and their ethical practices. It is important that this actor has the authority to propose adjustments when needed, and to stop the research or allow it to continue once the conditions are met.

This action would succeed assuming that academic institutions have this type of professional under their wing and have the capacity to support these control mechanisms. The risks are that frequent inspections could create a hostile environment, affecting the performance of researchers and, therefore, the success of the research. Administrative burdens could slow the process down, and rigidity could stifle the creative and precise environment that this type of research needs.

C. Mandatory and transparent educational programs

Mandatory education for all actors involved, together with transparent communication pathways, ensures that experimental procedures, ethical considerations, and long-term care requirements are responsibly applied beyond the research setting. This action would be implemented through educational programs given by the academic institutions involved in this research. They would have to be open to the specialized students, professionals, and caregivers who will be part of the program. Accessible and clear training sessions would also be provided, focusing on care requirements and ethical responsibilities.

Assuming that the information is passed on transparently and understood by all attendees, it could lead to more ethical decision-making and supervision among all the actors involved. On the other hand, these actions could fail if the educational content is not accessible to all participants because it is overly technical or poorly communicated. It is also important to keep in mind that caregivers will experience an emotional and subjective burden throughout the experimental period; this can lead to inadvertent or sudden dropouts, putting at risk both the research process and the animal’s life.


Next, score (from 1-3, with 1 as the best, or n/a) each of your governance actions against your rubric of policy goals. The following is one framework but feel free to make your own:

Does the option: | Option 1 | Option 2 | Option 3
Enhance Biosecurity | | |
• By preventing incidents | 1 | 1 | 1
• By helping respond | 3 | 2 | 2
Foster Lab Safety | | |
• By preventing incidents | 2 | 1 | 1
• By helping respond | 3 | 2 | 2
Protect the environment | | |
• By preventing incidents | 3 | 2 | 3
• By helping respond | 3 | 2 | 3
Other considerations | | |
• Animal welfare and intervention | 1 | 1 | 1
• Diagnostic accuracy and animal inclusion | 1 | 2 | 1
• Ethical inclusion and transparency | 2 | 1 | 2
• Equity in access to knowledge and care | 2 | 3 | 1
• Minimizing costs and burdens to stakeholders | 1 | 2 | 3
• Feasibility? | 1 | 2 | 2
• Not impede research | 2 | 3 | 2
• Promote constructive applications | 3 | 1 | 1

Last, drawing upon this scoring, describe which governance option, or combination of options, you would prioritize, and why. Outline any trade-offs you considered as well as assumptions and uncertainties. For this, you can choose one or more relevant audiences for your recommendation, which could range from the very local (e.g. to MIT leadership or Cambridge Mayoral Office) to the national (e.g. to President Biden or the head of a Federal Agency) to the international (e.g. to the United Nations Office of the Secretary-General, or the leadership of a multinational firm or industry consortia). These could also be one of the “actor” groups in your matrix.

The fundamental governance option to be prioritized is “Review of individual cases based on the evaluation of diagnostics made by an external entity and professional”. Because it acts on the foundations of the experimental phases, it can prevent misdiagnosis or weakly supported clinical assumptions before the animal enters the trial. This option reduces the risk of unethical procedures, minimizes harm, protects animal welfare, and strengthens the scientific validity of the research foundation. The trade-off is that this extra evaluation may slow the research down at the beginning and limit the number of participants, but in the long run it is preventive in many respects.

Option 2 and Option 3 are complementary governance options. These actions ensure accountability among all the actors involved by making knowledge part of a regulated and independent system. They prevent the misuse of information and empower informed decision-making throughout the research cycle. External regulatory actors are also essential, because they can verify whether clinical, laboratory, and welfare standards are applied before, during, and after experimentation. The trade-off here is that research institutions might perceive these actors as intruders and as a source of uncertainty for the research. On the other hand, the combination of early prevention, continuous oversight, and broad education creates distributed responsibility, which reduces the chance that ethical failures go unnoticed or unaddressed.


Reflecting on what you learned and did in class this week, outline any ethical concerns that arose, especially any that were new to you. Then propose any governance actions you think might be appropriate to address those issues. This should be included on your class page for this week.

As this is experimental research that must be carried out on live animals, specifically canines, it is important to keep the boundary between ethical and unethical procedures very clear. The fact that we do not communicate in the same way might be an important barrier between researchers and subjects. The animals might end up being seen as numbers rather than as brave and important living beings who are giving their lives to the experiment. This raised a question for me: is the experiment worth the many health discomforts that the research subjects will undergo in order to gain a major medical advancement?

Another question to keep in mind is whether we could develop a nanoscanner able to read the gland’s condition and avoid any invasive procedure, preventing misdiagnoses and giving a much more accurate reading of the situation inside the animal's body.


Some bibliography found about thyroid organoids for humans:

  • Kariyawasam, D., Stoupa, A., Nguyen Quoc, A., Pimentel Dantas, I., Polak, M., & Carré, A. (2025). From stem cells to organoids in thyroid: Useful tools or a step for cell therapy? La Presse Médicale, 54(4), 104301. https://doi.org/10.1016/j.lpm.2025.104301

  • Zhang, Y., Fu, M., Wang, H., & Sun, H. (2023). Advances in the Construction and Application of Thyroid Organoids. Physiological Research, 72(5), 557–564. https://doi.org/10.33549/physiolres.935102

Week 2 HW: DNA Read, Write & Edit

Xray of DNA

Part 1: Benchling & In-silico Gel Art

See this week’s lab protocol “Gel Art: Restriction Digests and Gel Electrophoresis” for details. Overview:

  • Make a free account at benchling.com
  • Import the Lambda DNA.
  • Simulate Restriction Enzyme Digestion with the following Enzymes:
    • EcoRI
    • HindIII
    • BamHI
    • KpnI
    • EcoRV
    • SacI
    • SalI
Virtual digest sequence LAMCG
Restriction Enzyme Digestion made with Benchling
  • Create a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks.
  • You might find Ronan’s website a helpful tool for quickly iterating on designs!
E=m*a2
EcoRV vs. EcoRI
Single Enzymes
Pyramid Enzymes
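The virtual digests above can also be approximated in a few lines of code, which helps when iterating on a gel-art design. This is a minimal sketch under stated assumptions: the function names are hypothetical, the test sequence is a toy string rather than Lambda DNA, and the molecule is treated as linear. The recognition sites and cut positions are the standard ones for these seven enzymes.

```python
# Recognition site and cut offset on the top strand for each enzyme.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "KpnI": ("GGTACC", 5),     # GGTAC^C
    "EcoRV": ("GATATC", 3),    # GAT^ATC
    "SacI": ("GAGCTC", 5),     # GAGCT^C
    "SalI": ("GTCGAC", 1),     # G^TCGAC
}


def cut_positions(seq, enzyme):
    """Return every top-strand cut position of one enzyme in seq."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, enzymes):
    """Return fragment lengths (largest first) for a linear molecule."""
    cuts = sorted({p for name in enzymes for p in cut_positions(seq, name)})
    edges = [0] + cuts + [len(seq)]
    return sorted((b - a for a, b in zip(edges, edges[1:])), reverse=True)


# Toy sequence with one EcoRI site and one EcoRV site (not Lambda DNA).
toy = "A" * 10 + "GAATTC" + "C" * 20 + "GATATC" + "T" * 5
print(digest(toy, ["EcoRI"]))
print(digest(toy, ["EcoRI", "EcoRV"]))
```

All seven sites are palindromic, so searching one strand is enough. Sorting fragments from largest to smallest mirrors the top-to-bottom order of bands in a gel lane.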

Part 3: DNA Design Challenge

3.1. Choose your protein.

In recitation, we discussed that you will pick a protein for your homework that you find interesting. Which protein have you chosen and why? Using one of the tools described in recitation (NCBI, UniProt, google), obtain the protein sequence for the protein you chose.

[Example from our group homework, you may notice the particular format — The example below came from UniProt]

>sp|P03609|LYS_BPMS2 Lysis protein OS=Escherichia phage MS2 OX=12022 PE=2 SV=1
METRFPQQSQQTPASTNRRPRFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLSLLEAVIRTVTTLQQLLT

THYROGLOBULIN (CANIS LUPUS FAMILIARIS) / ACTIN (CANIS LUPUS FAMILIARIS) vs. ACTIN (FUNGUS: S. CEREVISIAE)

The world of proteins is so vast that choosing a single protein has been a difficult task.

To follow the same path as the Week 1 HW, let's start with thyroglobulin, a very complex and specialized protein that is key to the generation of the T3 and T4 hormones; in other words, it is the precursor protein from which these hormones are made. Because of its complexity and specificity, it can be considered a modern protein. Some interesting facts about thyroglobulin: it is very large compared to other proteins, it functions only in the thyroid gland, it is prone to being attacked by immune cells when something is not working well, and its processing tolerates very few errors. By comparison, actin is a simpler protein with a general role, and it has been present in all eukaryotes since early in their evolution. Actin is in charge of forming the cytoskeleton and of cell motility and shape, among many other functions. Interestingly, actin can tolerate some errors, in contrast to thyroglobulin, which is very precise.

In the exercise below I will work with thyroglobulin from Canis lupus familiaris and also compare actin in dogs with actin in a fungus (Saccharomyces cerevisiae).

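THYROGLOBULIN - CANIS LUPUS FAMILIARIS

Note that the UniProt record retrieved below, P15881 (ITF2_CANLF), is annotated as Transcription factor 4 (gene TCF4) rather than thyroglobulin (gene TG), so the entry should be re-checked before this sequence is used.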

>sp|P15881|ITF2_CANLF Transcription factor 4 OS=Canis lupus familiaris OX=9615 GN=TCF4 PE=2 SV=2
MFSPPVSSGKNGPTSLASGHFTGSNVEDRSSSGSWGNGGHPSPSRNYGDGTPYDHMTSRD
LGSHDNLSPPFVNSRIQSKTERGSYSSYGRESNLQGCHQSLLGGDMDMGTPGTLSPTKPG
SQYYQYSSNNPRRRPLHSSAMEVQTKKVRKVPPGLPSSVYAPSASTADYNRDSPGYPSSK
PAASTFPSSFFMQDGHHSSDPWSSSSGMNQPGYGGMLGSSSHIPQSSSYCSLHPHERLSY
PSHSSADINSSLPPMSTFHRSGTNHYSTSSCTPPANGTDSIMANRGSGAAGSSQTGDALG
KALASIYSPDHTNNSFSSNPSTPVGSPPSLSAGTAVWSRNGGQASSSPNYEGPLHSLQSR
IEDRLERLDDAIHVLRNHAVGPSTAMPGGHGDMHGIIGPSHNGAMGGLGSGYGTGLLSAN
RHSLMVGAHREDGVALRGSHSLVPNQVPVPQLPVQSATSPDLNPPQDPYRGMPPGLQGQS
VSSGSSEIKSDDEGDENLQDTKSSEDKKLDDDKKDIKSITSNNDDEDLTPEQKAEREKER
RMANNARERLRVRDINEAFKELGRMVQLHLKSDKPQTKLLILHQAVAVILSLEQQVRERN
LNPKAACLKRREEEKVSSEPPPLSLAGPHPGMGDASNHMGQM

3.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.

The Central Dogma discussed in class and recitation describes the process in which DNA sequence becomes transcribed and translated into protein. The Central Dogma gives us the framework to work backwards from a given protein sequence and infer the DNA sequence that the protein is derived from. Using one of the tools discussed in class, NCBI or online tools (google “reverse translation tools”), determine the nucleotide sequence that corresponds to the protein sequence you chose above.

[Example: Get to the original sequence of phage MS2 L-protein from its genome phage MS2 genome - Nucleotide - NCBI]

Lysis protein DNA sequence
atggaaacccgattccctcagcaatcgcagcaaactccggcatctactaatagacgccggccattcaaacatgaggattacccatgtcgaagacaacaaagaagttcaactctttatgtattgatcttcctcgcgatctttctctcgaaatttaccaatcaattgcttctgtcgctactggaagcggtgatccgcacagtgacgactttacagcaattgcttacttaa
THYROGLOBULIN - DNA sequence

atgttctccccccccgtgtcctccggcaagaacggccccacctccctggcctccggccac
ttcaccggctccaacgtggaggaccgctcctcctccggctcctggggcaacggcggccac
ccctccccctcccgcaactacggcgacggcaccccctacgaccacatgacctcccgcgac
ctgggctcccacgacaacctgtccccccccttcgtgaactcccgcatccagtccaagacc
gagcgcggctcctactcctcctacggccgcgagtccaacctgcagggctgccaccagtcc
ctgctgggcggcgacatggacatgggcacccccggcaccctgtcccccaccaagcccggc
tcccagtactaccagtactcctccaacaacccccgccgccgccccctgcactcctccgcc
atggaggtgcagaccaagaaggtgcgcaaggtgccccccggcctgccctcctccgtgtac
gccccctccgcctccaccgccgactacaaccgcgactcccccggctacccctcctccaag
cccgccgcctccaccttcccctcctccttcttcatgcaggacggccaccactcctccgac
ccctggtcctcctcctccggcatgaaccagcccggctacggcggcatgctgggctcctcc
tcccacatcccccagtcctcctcctactgctccctgcacccccacgagcgcctgtcctac
ccctcccactcctccgccgacatcaactcctccctgccccccatgtccaccttccaccgc
tccggcaccaaccactactccacctcctcctgcaccccccccgccaacggcaccgactcc
atcatggccaaccgcggctccggcgccgccggctcctcccagaccggcgacgccctgggc
aaggccctggcctccatctactcccccgaccacaccaacaactccttctcctccaacccc
tccacccccgtgggctcccccccctccctgtccgccggcaccgccgtgtggtcccgcaac
ggcggccaggcctcctcctcccccaactacgagggccccctgcactccctgcagtcccgc
atcgaggaccgcctggagcgcctggacgacgccatccacgtgctgcgcaaccacgccgtg
ggcccctccaccgccatgcccggcggccacggcgacatgcacggcatcatcggcccctcc
cacaacggcgccatgggcggcctgggctccggctacggcaccggcctgctgtccgccaac
cgccactccctgatggtgggcgcccaccgcgaggacggcgtggccctgcgcggctcccac
tccctggtgcccaaccaggtgcccgtgccccagctgcccgtgcagtccgccacctccccc
gacctgaaccccccccaggacccctaccgcggcatgccccccggcctgcagggccagtcc
gtgtcctccggctcctccgagatcaagtccgacgacgagggcgacgagaacctgcaggac
accaagtcctccgaggacaagaagctggacgacgacaagaaggacatcaagtccatcacc
tccaacaacgacgacgaggacctgacccccgagcagaaggccgagcgcgagaaggagcgc
cgcatggccaacaacgcccgcgagcgcctgcgcgtgcgcgacatcaacgaggccttcaag
gagctgggccgcatggtgcagctgcacctgaagtccgacaagccccagaccaagctgctg
atcctgcaccaggccgtggccgtgatcctgtccctggagcagcaggtgcgcgagcgcaac
ctgaaccccaaggccgcctgcctgaagcgccgcgaggaggagaaggtgtcctccgagccc
ccccccctgtccctggccggcccccaccccggcatgggcgacgcctccaaccacatgggc
cagatg
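The reverse-translation step can be sketched in code. Because the genetic code is degenerate, many DNA sequences encode the same protein, so a tool has to pick one codon per amino acid. The table below is an assumption chosen to mirror the codons that appear in the sequence above; it is not the documented table of any specific online tool, and the function name is hypothetical.

```python
# One codon per amino acid (assumed table, mirroring the sequence above).
PREFERRED_CODON = {
    "A": "gcc", "R": "cgc", "N": "aac", "D": "gac", "C": "tgc",
    "Q": "cag", "E": "gag", "G": "ggc", "H": "cac", "I": "atc",
    "L": "ctg", "K": "aag", "M": "atg", "F": "ttc", "P": "ccc",
    "S": "tcc", "T": "acc", "W": "tgg", "Y": "tac", "V": "gtg",
}


def reverse_translate(protein):
    """Return one possible DNA sequence encoding the given protein."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())


# First ten residues of the protein sequence listed in section 3.1.
dna = reverse_translate("MFSPPVSSGK")
print(dna, len(dna))
```

The result is only one of many valid answers: swapping any codon for a synonymous one gives a different DNA sequence that still encodes the same protein.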

3.3. Codon optimization.

Once a nucleotide sequence of your protein is determined, you need to codon optimize your sequence. You may, once again, utilize google for a “codon optimization tool”. In your own words, describe why you need to optimize codon usage. Which organism have you chosen to optimize the codon sequence for and why?

[Example from Codon Optimization Tool | Twist Bioscience while avoiding Type IIs enzyme recognition sites BsaI, BsmBI, and BbsI]

Lysis protein DNA sequence with Codon-Optimization
ATGGAAACCCGCTTTCCGCAGCAGAGCCAGCAGACCCCGGCGAGCACCAACCGCCGCCGCCCGTTCAAACATGAAGATTATCCGTGCCGTCGTCAGCAGCGCAGCAGCACCCTGTATGTGCTGATTTTTCTGGCGATTTTTCTGAGCAAATTCACCAACCAGCTGCTGCTGAGCCTGCTGGAAGCGGTGATTCGCACAGTGACGACCCTGCAGCAGCTGCTGACCTAA

Codon optimization rewrites a gene using the codons preferred by the organism that will express it, without modifying the amino acid sequence. Different organisms favour different synonymous codons, so a sequence taken directly from the original host may be expressed poorly in another one. For research, it is necessary to have a supply of the protein to analyze and test, and it is not sustainable to always obtain it from the original host for many reasons: budget, quantity, ethics, etc.

In the case of canine thyroglobulin, the experiment will be based on the question: how can we produce canine thyroglobulin that could be used as one component of a thyroid gland implant? The cells that will produce the protein will be CHO (Chinese hamster ovary) cells, a mammalian cell line that can carry out the complex processing needed to produce proteins as specialized as thyroglobulin.

THYROGLOBULIN - DNA sequence with Codon-Optimization

ATGTTCTCACCACCTGTGTCTTCTGGCAAGAATGGCCCCACCTCCCTGGCTTCTGGCCACTTCACCGGAAGCAACGTGGAGGACAGGTCCTCTTCCGGCTCCTGGGGCAATGGCGGCCACCCAAGTCCATCTCGAAACTACGGCGACGGGACCCCTTACGATCACATGACCTCCAGAGACCTGGGCTCTCATGACAATCTGTCTCCCCCATTTGTGAACTCCCGGATTCAGTCTAAGACTGAGCGGGGCTCATACAGCTCTTACGGACGCGAGAGCAACCTGCAGGGTTGTCACCAGTCCCTGCTGGGCGGAGACATGGACATGGGCACCCCCGGGACCCTCTCTCCTACTAAGCCTGGCTCTCAGTATTACCAGTACTCCTCCAATAACCCTCGAAGGCGGCCCCTGCACAGCAGTGCCATGGAGGTCCAGACAAAGAAAGTCAGGAAGGTGCCACCAGGCCTGCCCAGCTCCGTCTATGCCCCAAGCGCCTCCACCGCCGATTACAATCGAGATAGCCCCGGTTACCCCTCCTCTAAGCCAGCAGCCTCTACTTTCCCTAGCTCCTTCTTTATGCAGGACGGCCATCACTCAAGTGATCCTTGGTCCAGCAGCTCTGGCATGAACCAGCCAGGGTACGGTGGTATGCTGGGTTCTTCCAGTCACATCCCTCAGTCTTCTTCCTACTGTAGTCTGCATCCACATGAGCGCCTGTCATACCCCAGCCACTCCTCTGCCGACATCAATAGCTCCCTGCCACCCATGTCAACCTTCCATAGGAGCGGCACTAACCATTATTCCACATCCAGCTGCACTCCTCCCGCTAACGGTACTGACTCTATCATGGCTAACAGAGGCTCCGGCGCCGCTGGAAGCAGTCAGACCGGAGATGCTCTCGGCAAGGCCCTGGCTTCTATCTATTCTCCCGACCATACCAACAATTCTTTCAGTAGCAACCCTTCTACTCCCGTGGGCTCCCCTCCTTCCCTGTCCGCCGGAACCGCTGTGTGGTCTAGGAATGGCGGCCAGGCCAGCTCCAGCCCTAATTATGAGGGCCCCCTGCACAGCCTGCAGTCTCGTATTGAGGATAGGCTGGAGCGACTGGACGACGCTATTCACGTGCTGCGTAACCATGCTGTGGGCCCAAGCACCGCTATGCCCGGGGGACACGGAGACATGCACGGAATCATCGGCCCTTCTCACAACGGGGCTATGGGGGGTCTGGGCAGCGGCTACGGAACAGGCCTGCTGTCCGCCAACAGGCACTCTCTGATGGTGGGTGCCCACCGGGAAGACGGAGTGGCCCTGAGAGGGTCACATAGCCTGGTGCCTAACCAGGTGCCTGTGCCTCAGCTGCCCGTGCAGAGTGCTACTAGCCCCGATCTGAACCCTCCACAGGACCCTTACAGAGGCATGCCACCCGGTCTGCAGGGACAGTCTGTGTCCTCTGGCAGTAGCGAGATCAAGTCAGATGACGAGGGAGACGAGAACCTGCAGGATACAAAGAGCTCCGAGGATAAGAAATTGGACGACGACAAGAAGGACATCAAGTCCATCACCAGCAACAACGACGACGAGGACCTGACTCCTGAGCAGAAGGCCGAACGGGAAAAGGAAAGGCGGATGGCTAACAATGCAAGAGAACGCCTGAGGGTCAGGGATATCAATGAGGCTTTCAAGGAGCTGGGCAGGATGGTGCAGCTGCATCTTAAGTCTGACAAGCCACAGACAAAGCTGCTGATCCTGCACCAGGCTGTGGCTGTGATTCTGTCCCTGGAGCAGCAGGTGAGAGAGAGGAACCTGAACCCTAAGGCCGCTTGCCTGAAAAGACGGGAGGAGGAAAAAGTGAGCTCTGAGCCCCCACCCCTCTCCCTGGCCGGACCACACCCCGGCATGGGCGACGCTTCTAACCACATGGGCCAGATG
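Two quick checks are useful after codon optimization: that the optimized DNA still encodes the same protein, and that it avoids the Type IIS recognition sites named in the example (BsaI, BsmBI, BbsI). The sketch below uses short toy inputs and hypothetical helper names; the codon table is the standard genetic code, and the three recognition sequences are the standard ones for these enzymes.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), AMINO)}

TYPE_IIS_SITES = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC", "BbsI": "GAAGAC"}


def translate(dna):
    """Translate DNA codon by codon from position 0 (stops shown as '*')."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_type_iis_sites(dna):
    """Return names of Type IIS enzymes whose site occurs on either strand."""
    top = dna.upper()
    bottom = reverse_complement(top)
    return sorted(n for n, s in TYPE_IIS_SITES.items() if s in top or s in bottom)


# Toy check: the first four codons before and after optimization.
original = "atgttctccccc"
optimized = "ATGTTCTCACCA"
print(translate(original) == translate(optimized))
print(find_type_iis_sites(optimized))
```

The sites are checked on both strands because Type IIS recognition sequences are not palindromic, so a site on the reverse strand would be missed by a single-strand search.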

3.4. You have a sequence! Now what?

What technologies could be used to produce this protein from your DNA? Describe in your words how the DNA sequence can be transcribed and translated into your protein. You may describe either cell-dependent or cell-free methods, or both.

To produce this protein from my DNA, I would use a sequence that is codon-optimized for the host cells (here, CHO cells) so that they can express it efficiently. Although canine and human thyroglobulin are not very different, this step is still necessary. Inside the cell, the DNA is transcribed into mRNA, and ribosomes then read the mRNA one codon (three bases) at a time, adding the matching amino acid for each codon until the protein chain is complete. The technologies to do this would be:

  • cell-dependent methods: to produce the protein, specifically CHO cells, which are derived from Chinese hamster ovaries. As this protein comes from a mammal, the cells that produce it should also be mammalian. Bacterial cells such as E. coli are not efficient here, because the protein needs to fold in a specific way, and protein processing differs considerably between bacteria, mammals, and plants.

  • bioreactor: for scaling up production, avoiding contamination, and giving the cells a controlled environment in which to grow and fold the protein.
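The two steps that the host cell performs, transcription and translation, can be mimicked with a short script. This is a simplified sketch with hypothetical function names: it assumes the input is the coding strand, ignores splicing and regulation, and starts translating at the first AUG it finds.

```python
from itertools import product

# Standard genetic code for RNA codons, bases ordered U, C, A, G.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
RNA_CODONS = {"".join(c): aa for c, aa in zip(product("UCAG", repeat=3), AMINO)}


def transcribe(coding_dna):
    """The mRNA matches the coding strand, with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon or the end."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = RNA_CODONS[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy gene: the first four codons of the optimized sequence plus a stop codon.
mrna = transcribe("ATGTTCTCACCATAA")
print(mrna)
print(translate(mrna))
```

In a real CHO cell the new chain would also be folded and modified after translation, which is the main reason a mammalian host was chosen over bacteria.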


Part 4: Prepare a Twist DNA Synthesis Order

This is a practice exercise, not necessarily your real Twist order!

- 4.1. Create a Twist account and a Benchling account

- 4.2. Build Your DNA Insert Sequence

- 4.3. On Twist, Select The “Genes” Option

- 4.4. Select “Clonal Genes” option

- 4.5. Import your sequence

- 4.6. Choose Your Vector

Week 3 HW: Lab Automation

Starsbirth

Assignment: Python Script for Opentrons Artwork — DUE BY YOUR LAB TIME!

Your task this week is to create a Python file to run on an Opentrons liquid handling robot.

  1. Review this week’s recitation and this week’s lab for details on the Opentrons and programming it.
  2. Generate an artistic design using the GUI at opentrons-art.rcdonovan.com.
  3. Using the coordinates from the GUI, follow the instructions in the HTGAA26 Opentrons Colab to write your own Python script which draws your design using the Opentrons.
  • You may use AI assistance for this coding — Google Gemini is integrated into Colab (see the stylized star bottom center); it will do a good job writing functional Python, while you probably need to take charge of the art concept.
Iteration color palette 5
Coding done by Gemini

This code was written with Google Gemini. The steps were: first, loading the coordinates made in the GUI; second, giving the AI instructions describing the expected outcome; finally, iterating until the idea was achieved. The instructions given to the AI did not include any code; they were written entirely in plain language.

  • If you’re a proficient programmer and you’d rather code something mathematical or algorithmic instead of using your GUI coordinates, you may do that instead.
  4. If the Python component is proving too problematic even with AI and human assistance, download the full Python script from the GUI website and submit that:

  5. If you use AI to help complete this homework or lab, document how you used AI and which models made contributions.

Gemini instructions - Colab
  6. Sign up for a robot time slot if you are at MIT/Harvard/Wellesley or at a Node offering Opentrons automation. The Python script you created will be run on the robot to produce your work of art!
  • At MIT/Harvard? Lab times are on Thursday Feb.19 between 10AM and 6PM.
  • At other Nodes? Please coordinate with your Node.
  7. Submit your Python file via this form.

    • STAR´S BIRTH CODES
  • ITERATION COLOR PALETTE - 5

SUCCESSFUL ATTEMPT

As you can see, the code has not been uploaded, because I could not work out how to do it. After I asked several people for help, one of my classmates, María José Rivas, gave me this link: https://github.com/Mozta/opentrons-bioart-sim/tree/main?tab=readme-ov-file#from-source-for-development. She used it to load her coordinates and well colors from opentrons-art into the Colab doc. The catch is that the opentrons-art well colors differ from the Colab doc well colors. With this tool, the protocol runs fine.

For this process, I first downloaded Python to see if I could run the script there. That was not successful, but I came to understand how the program works (at a very basic level). Then I went to the Colab doc and tried to import the files, but that did not work either, so I asked ChatGPT for help. We went through the whole process together, and step by step it helped me import the repository from the link as well as my .py file.

The final results are these:

from opentrons import types

metadata = {    # see https://docs.opentrons.com/v2/tutorial.html#tutorial-metadata
    'author': '',
    'protocolName': '',
    'description': '',
    'source': 'HTGAA 2026 Opentrons Lab',
    'apiLevel': '2.20'
}

##############################################################################
###   Robot deck setup constants - don't change these
##############################################################################

TIP_RACK_DECK_SLOT = 9
COLORS_DECK_SLOT = 6
AGAR_DECK_SLOT = 5
PIPETTE_STARTING_TIP_WELL = 'A1'

well_colors = {
    'A1' : 'Red',
    'B1' : 'Green',
    'C1' : 'Orange'
}


def run(protocol):
  ##############################################################################
  ###   Load labware, modules and pipettes
  ##############################################################################

  # Tips
  tips_20ul = protocol.load_labware('opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Opentrons 20uL Tips')

  # Pipettes
  pipette_20ul = protocol.load_instrument("p20_single_gen2", "right", [tips_20ul])

  # Modules
  temperature_module = protocol.load_module('temperature module gen2', COLORS_DECK_SLOT)

  # Temperature Module Plate
  temperature_plate = temperature_module.load_labware('opentrons_96_aluminumblock_generic_pcr_strip_200ul',
                                                      'Cold Plate')
  # Choose where to take the colors from
  color_plate = temperature_plate

  # Agar Plate
  agar_plate = protocol.load_labware('htgaa_agar_plate', AGAR_DECK_SLOT, 'Agar Plate')  ## TA MUST CALIBRATE EACH PLATE!
  # Get the top-center of the plate, make sure the plate was calibrated before running this
  center_location = agar_plate['A1'].top()

  pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)

  ##############################################################################
  ###   Patterning
  ##############################################################################

  ###
  ### Helper functions for this lab
  ###

  # pass this e.g. 'Red' and get back a Location which can be passed to aspirate()
  def location_of_color(color_string):
    for well,color in well_colors.items():
      if color.lower() == color_string.lower():
        return color_plate[well]
    raise ValueError(f"No well found with color {color_string}")

  # For this lab, instead of calling pipette.dispense(1, loc) use this: dispense_and_detach(pipette, 1, loc)
  def dispense_and_detach(pipette, volume, location):
      """
      Move laterally 5mm above the plate (to avoid smearing a drop); then drop down to the plate,
      dispense, move back up 5mm to detach drop, and stay high to be ready for next lateral move.
      5mm because a 4uL drop is 2mm diameter; and a 2deg tilt in the agar pour is >3mm difference across a plate.
      """
      assert(isinstance(volume, (int, float)))
      above_location = location.move(types.Point(z=location.point.z + 5))  # 5mm above
      pipette.move_to(above_location)       # Go to 5mm above the dispensing location
      pipette.dispense(volume, location)    # Go straight downwards and dispense
      pipette.move_to(above_location)       # Go straight up to detach drop and stay high

  ###
  ### YOUR CODE HERE to create your design
  ###

!git clone https://github.com/Mozta/opentrons-bioart-sim.git
%cd opentrons-bioart-sim
!ls

%cd /content/opentrons-bioart-sim
!ls

%cd /content/opentrons-bioart-sim/opentrons-bioart-sim/src/opentrons_bioart_sim
!ls

!sed -n '1,200p' /content/opentrons-bioart-sim/opentrons-bioart-sim/src/opentrons_bioart_sim/colors.py

!sed -n '1,200p' /content/opentrons-bioart-sim/opentrons-bioart-sim/src/opentrons_bioart_sim/visualization.py

%cd /content
!ls

import sys
sys.path.append("/content/opentrons-bioart-sim/opentrons-bioart-sim/src")

import OTDesign_96_deep_well_plate
from opentrons_bioart_sim.mock import OpentronsMock

!sed -n '1,200p' OTDesign_96_deep_well_plate.py

  # Don't forget to end with a drop_tip()
fatal: destination path 'opentrons-bioart-sim' already exists and is not an empty directory.
/content/opentrons-bioart-sim
CONTRIBUTING.md  LICENSE    opentrons-bioart-sim  README.md  tests
examples	 notebooks  pyproject.toml	  src
/content/opentrons-bioart-sim
CONTRIBUTING.md  LICENSE    opentrons-bioart-sim  README.md  tests
examples	 notebooks  pyproject.toml	  src
/content/opentrons-bioart-sim/opentrons-bioart-sim/src/opentrons_bioart_sim
cli.py	   __init__.py	opentrons-bioart-sim  visualization.py
colors.py  mock.py	__pycache__
"""
colors.py — Fluorescent protein color mappings for Opentrons Bio-Art visualization
===================================================================================
Maps fluorescent protein names to matplotlib-compatible colors for Petri dish rendering.
"""

# ═══════════════════════════════════════════════════════════════════════
# Petri dish constants
# ═══════════════════════════════════════════════════════════════════════

PETRI_INNER_DIAMETER: float = 84  # mm — inner diameter of "90mm" and "100mm" plates
MAX_DRAW_RADIUS: float = PETRI_INNER_DIAMETER / 2 - 2  # 2mm margin for tip size, drops, calibration

# ═══════════════════════════════════════════════════════════════════════
# Protein → visual color mapping
# ═══════════════════════════════════════════════════════════════════════

PROTEIN_VISUAL_COLORS: dict[str, str] = {
    # Reds / Pinks
    'mrfp1':            'red',
    'mcherry':          'firebrick',
    'dsred':            'darkred',
    'mruby2':           'crimson',
    'mscarlet_i':       'tomato',
    'mkate2':           'deeppink',
    'mkate2_tf':        'mediumvioletred',
    'tagrfp':           'coral',
    'tdtomato':         'orangered',
    'eqfp578':          'salmon',
    'mlychee_tf':       'hotpink',
    'mwatermelon':      'lightcoral',
    # Oranges / Yellows
    'mko2':             'orange',
    'mpapaya':          'lightsalmon',
    'venus':            'yellow',
    'mcitrine':         'gold',
    'mvenus':           'goldenrod',
    'mbanana':          'khaki',
    'mstaygold2':       'gold',
    'mchartreuse_tf':   'chartreuse',
    # Greens
    'sfgfp':            'lime',
    'egfp':             'lime',
    'megfp':            'limegreen',
    'avgfp':            'palegreen',
    'mneongreen':       'greenyellow',
    'mazamigreen':      'forestgreen',
    'mclover3':         'green',
    'mwasabi':          'lightgreen',
    'mjuniper':         'darkgreen',
    'zsgreen1':         'springgreen',
    'pa_gfp':           'mediumseagreen',
    'mhoneydew':        'yellowgreen',
    # Blues / Cyans
    'azurite':          'royalblue',
    'tagbfp':           'blue',
    'mtagbfp2':         'mediumblue',
    'ultramarine':      'navy',
    'mturquoise2':      'turquoise',
    'mcerulean3':       'cyan',
    'mtfp1':            'darkcyan',
    'mmicy':            'aquamarine',
    'electra2':         'deepskyblue',
    # Others
    'mplum':            'purple',
}


def resolve_visual_color(protein_or_color_name: str) -> str:
    """Resolve a fluorescent protein name or color name to a matplotlib color.

    Lookup order:
      1. Check PROTEIN_VISUAL_COLORS (case-insensitive)
      2. Map 'green' → 'lime' for better visibility on dark backgrounds
      3. Pass through as-is (assumed to be a valid matplotlib color)

    Args:
        protein_or_color_name: Protein name (e.g. 'sfGFP') or color (e.g. 'red').

    Returns:
        A matplotlib-compatible color string.
    """
    key = protein_or_color_name.lower().strip()
    if key in PROTEIN_VISUAL_COLORS:
        return PROTEIN_VISUAL_COLORS[key]
    if key == 'green':
        return 'lime'
    return protein_or_color_name
"""
visualization.py — Petri dish visualization for Opentrons Bio-Art protocols
============================================================================
Renders droplet positions, smears, and volume summaries as a matplotlib figure.
"""

from __future__ import annotations

from typing import Optional

import matplotlib.pyplot as plt
from matplotlib.figure import Figure
from matplotlib.axes import Axes

from .colors import PETRI_INNER_DIAMETER


def visualize_petri(
    droplets_x: list[float],
    droplets_y: list[float],
    droplets_size: list[float],
    droplets_color: list[str],
    smears: list[tuple[list[float], list[float], str]],
    total_aspirated: dict[str, float],
    total_dispensed: dict[str, float],
    tip_count: int,
    background: str = 'black',
    title: str = 'Opentrons Bio-Art Simulation',
    save_path: Optional[str] = None,
    show: bool = True,
    dpi: int = 150,
    figsize: tuple[float, float] = (10, 10),
) -> tuple[Figure, Axes]:
    """Render a Petri dish visualization with all dispensed droplets.

    Args:
        droplets_x: X coordinates of each droplet (mm from center).
        droplets_y: Y coordinates of each droplet (mm from center).
        droplets_size: Size of each droplet in scatter points (volume × 100).
        droplets_color: Matplotlib color of each droplet.
        smears: List of (x_list, y_list, color) tuples for smear lines.
        total_aspirated: Dict mapping color name → total µL aspirated.
        total_dispensed: Dict mapping color name → total µL dispensed.
        tip_count: Number of tips used during the protocol.
        background: 'black' (dark agar), 'agar' (beige agar), or 'paper' (outline only).
        title: Plot title.
        save_path: If provided, save figure to this file path.
        show: If True, call plt.show(). Set False for headless/test usage.
        dpi: Resolution for saved images.
        figsize: Figure size in inches.

    Returns:
        Tuple of (Figure, Axes) for further customization.
    """
    # ── Print volume summary ──
    _print_volume_summary(total_aspirated, total_dispensed, tip_count)

    # ── Create figure ──
    fig, ax = plt.subplots(figsize=figsize)

    # ── Petri dish background ──
    radius = PETRI_INNER_DIAMETER / 2
    bg_colors = {
        'black': ('#000000', True),
        'agar':  ('#d7ca95', True),
        'paper': ('#000000', False),
    }
    color, fill = bg_colors.get(background, bg_colors['black'])
    ax.add_patch(plt.Circle((0, 0), radius=radius, color=color, fill=fill))

    # ── Droplets ──
    if droplets_x:
        ax.scatter(droplets_x, droplets_y, droplets_size, c=droplets_color)

    # ── Smears ──
    for xlist, ylist, scolor in smears:
        ax.plot(xlist, ylist, color=scolor, linewidth=4, solid_capstyle='round')

    # ── Axes setup ──
    margin = radius + 0.5
    ax.set_xlim(-margin, margin)
    ax.set_ylim(-margin, margin)
    ax.set_aspect('equal')
    ax.set_title(title)

    # ── Save / Show ──
    if save_path:
        fig.savefig(save_path, dpi=dpi, bbox_inches='tight',
                    facecolor=fig.get_facecolor(), edgecolor='none')
        print(f"\nImage saved to: {save_path}")

    if show:
        plt.show()

    return fig, ax


def _print_volume_summary(
    total_aspirated: dict[str, float],
    total_dispensed: dict[str, float],
    tip_count: int,
) -> None:
    """Print a summary of aspirated/dispensed volumes by color."""
    from .colors import resolve_visual_color

    print("\n=== TOTAL VOLUMES BY COLOR ===")
    all_colors = total_aspirated.keys() | total_dispensed.keys()
    for color in sorted(all_colors):
        asp = total_aspirated.get(color, 0)
        disp = total_dispensed.get(color, 0)
        waste = "\t\t##### WASTE: more aspirated than dispensed!" if asp != disp else ''
        vis = resolve_visual_color(color)
        print(f"\t{color} ({vis}):\t aspirated {asp:.1f}\t dispensed {disp:.1f}{waste}")

    total_asp = sum(total_aspirated.values())
    total_disp = sum(total_dispensed.values())
    print(f"\t[all]:\t\t[aspirated {total_asp:.1f}]\t[dispensed {total_disp:.1f}]")
    print(f"\n=== TIPS USED ===\n\t{tip_count} tip(s)  (ideal: one per color)\n")
/content
opentrons-bioart-sim		__pycache__  venv
OTDesign_96_deep_well_plate.py	sample_data
from opentrons import types

import string

metadata = {
    'protocolName': '{YOUR NAME} - Opentrons Art - HTGAA',
    'author': 'HTGAA',
    'source': 'HTGAA 2026',
    'apiLevel': '2.20'
}

Z_VALUE_AGAR = 2.0
POINT_SIZE = 1

electra2_points = [(-4.4,39.6), (-8.8,37.4), (6.6,37.4), (-6.6,35.2), (15.4,35.2), (-13.2,33), (0,33), (-11,30.8), (19.8,30.8), (4.4,28.6), (13.2,28.6), (-15.4,26.4), (-6.6,26.4), (-19.8,24.2), (-26.4,22), (30.8,22), (-35.2,11), (35.2,11), (-37.4,8.8), (-39.6,2.2), (-33,0), (33,0), (-37.4,-6.6), (35.2,-8.8), (-33,-13.2), (33,-15.4), (-30.8,-22), (-15.4,-24.2), (-24.2,-26.4), (-11,-33), (8.8,-33), (15.4,-33), (4.4,-37.4)]
mturquoise2_points = [(-24.2,26.4), (-22,22), (-19.8,19.8), (-17.6,19.8), (-19.8,17.6), (-17.6,17.6), (-15.4,17.6), (-17.6,15.4), (-15.4,15.4), (-13.2,13.2), (-11,13.2), (-13.2,11), (-11,11), (-8.8,8.8), (-4.4,6.6), (-6.6,4.4), (11,4.4), (13.2,4.4), (15.4,4.4), (11,2.2), (13.2,2.2), (15.4,2.2), (19.8,2.2), (8.8,0), (11,0), (-4.4,-8.8), (-4.4,-11), (-4.4,-13.2), (-6.6,-15.4), (-4.4,-15.4), (-2.2,-17.6), (0,-24.2), (-4.4,-28.6)]
azurite_points = [(-2.2,39.6), (-6.6,37.4), (-15.4,35.2), (-11,35.2), (-4.4,35.2), (-19.8,33), (-17.6,33), (-6.6,33), (-22,30.8), (-15.4,30.8), (-19.8,28.6), (-17.6,28.6), (-28.6,24.2), (-22,24.2), (-28.6,22), (-33,19.8), (-28.6,19.8), (-24.2,19.8), (-30.8,17.6), (-28.6,17.6), (22,17.6), (-35.2,15.4), (-33,15.4), (17.6,15.4), (24.2,15.4), (-30.8,13.2), (-33,11), (15.4,11), (19.8,11), (22,11), (26.4,11), (17.6,8.8), (19.8,8.8), (22,8.8), (15.4,6.6), (17.6,6.6), (-37.4,4.4), (19.8,4.4), (24.2,4.4), (17.6,2.2), (22,2.2), (24.2,2.2), (19.8,0), (22,0), (26.4,0), (24.2,-2.2)]
sfgfp_points = [(0,-8.8), (2.2,-8.8), (-2.2,-11), (0,-11), (2.2,-11), (-2.2,-13.2), (-4.4,-19.8), (-2.2,-19.8)]
mjuniper_points = [(-2.2,-22), (0,-22), (-4.4,-24.2), (-2.2,-24.2), (-4.4,-26.4), (-6.6,-28.6), (-2.2,-28.6), (-4.4,-33), (-2.2,-35.2)]
mko2_points = [(2.2,8.8), (-6.6,6.6), (0,6.6), (2.2,6.6), (-4.4,4.4), (-2.2,4.4), (0,4.4), (-4.4,2.2), (-2.2,2.2), (6.6,2.2), (8.8,2.2), (-6.6,0), (-4.4,0), (4.4,0), (6.6,0), (-8.8,-2.2), (-6.6,-2.2), (2.2,-2.2), (4.4,-2.2), (0,-4.4), (2.2,-4.4), (4.4,-4.4), (6.6,-4.4), (-2.2,-6.6), (0,-6.6), (2.2,-6.6), (4.4,-6.6), (6.6,-6.6), (8.8,-6.6), (-2.2,-8.8), (8.8,-8.8), (11,-8.8), (13.2,-8.8), (11,-11), (15.4,-11), (13.2,-13.2), (11,-15.4), (13.2,-15.4), (13.2,-17.6), (15.4,-17.6), (17.6,-17.6), (19.8,-19.8), (19.8,-22), (22,-24.2), (26.4,-26.4)]
mwasabi_points = [(0,8.8), (-2.2,6.6), (-6.6,2.2), (-8.8,0)]
mrfp1_points = [(8.8,-11), (13.2,-11), (11,-13.2)]
mscarlet_i_points = [(15.4,-15.4), (17.6,-15.4)]

point_name_pairing = [("electra2", electra2_points),("mturquoise2", mturquoise2_points),("azurite", azurite_points),("sfgfp", sfgfp_points),("mjuniper", mjuniper_points),("mko2", mko2_points),("mwasabi", mwasabi_points),("mrfp1", mrfp1_points),("mscarlet_i", mscarlet_i_points)]

# Robot deck setup constants
TIP_RACK_DECK_SLOT = 9
COLORS_DECK_SLOT = 6
AGAR_DECK_SLOT = 5
PIPETTE_STARTING_TIP_WELL = 'A1'

# Place the PCR tubes in this order
well_colors = {
    'A1': 'sfGFP',
    'A2': 'mRFP1',
    'A3': 'mKO2',
    'A4': 'Venus',
    'A5': 'mKate2_TF',
    'A6': 'Azurite',
    'A7': 'mCerulean3',
    'A8': 'mClover3',
    'A9': 'mJuniper',
    'A10': 'mTurquoise2',
    'A11': 'mBanana',
    'A12': 'mPlum',
    'B1': 'Electra2',
    'B2': 'mWasabi',
    'B3': 'mScarlet_I',
    'B4': 'mPapaya',
    'B5': 'eqFP578',
    'B6': 'tdTomato',
    'B7': 'DsRed',
    'B8': 'mKate2',
    'B9': 'EGFP',
    'B10': 'mRuby2',
    'B11': 'TagBFP',
    'B12': 'mChartreuse_TF',
    'C1': 'mLychee_TF',
    'C2': 'mTagBFP2',
    'C3': 'mEGFP',
    'C4': 'mNeonGreen',
    'C5': 'mAzamiGreen',
    'C6': 'mWatermelon',
    'C7': 'avGFP',
    'C8': 'mCitrine',
    'C9': 'mVenus',
    'C10': 'mCherry',
    'C11': 'mHoneydew',
    'C12': 'TagRFP',
    'D1': 'mTFP1',
    'D2': 'Ultramarine',
    'D3': 'ZsGreen1',
    'D4': 'mMiCy',
    'D5': 'mStayGold2',
    'D6': 'PA_GFP'
}

volume_used = {
    'electra2': 0,
    'mturquoise2': 0,
    'azurite': 0,
    'sfgfp': 0,
    'mjuniper': 0,
    'mko2': 0,
    'mwasabi': 0,
    'mrfp1': 0,
    'mscarlet_i': 0
}

def update_volume_remaining(current_color, quantity_to_aspirate):
    rows = string.ascii_uppercase
    for well, color in list(well_colors.items()):
        if color.lower() == current_color.lower():  # names differ only in case ('sfGFP' vs 'sfgfp')
            if (volume_used[current_color] + quantity_to_aspirate) > 250:
                # Move to next well horizontally by advancing row letter, keeping column number
                row = well[0]
                col = well[1:]
                
                # Find next row letter
                next_row = rows[rows.index(row) + 1]
                next_well = f"{next_row}{col}"
                
                del well_colors[well]
                well_colors[next_well] = current_color
                volume_used[current_color] = quantity_to_aspirate
            else:
                volume_used[current_color] += quantity_to_aspirate
            break

def run(protocol):
    # Load labware, modules and pipettes
    protocol.home()

    # Tips
    tips_20ul = protocol.load_labware('opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Opentrons 20uL Tips')

    # Pipettes
    pipette_20ul = protocol.load_instrument("p20_single_gen2", "right", [tips_20ul])

    # Deep Well Plate
    temperature_plate = protocol.load_labware('nest_96_wellplate_2ml_deep', 6)

    # Agar Plate
    agar_plate = protocol.load_labware('htgaa_agar_plate', AGAR_DECK_SLOT, 'Agar Plate')
    agar_plate.set_offset(x=0.00, y=0.00, z=Z_VALUE_AGAR)

    # Get the top-center of the plate, make sure the plate was calibrated before running this
    center_location = agar_plate['A1'].top()

    pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)
    
    # Helper function (dispensing)
    def dispense_and_jog(pipette, volume, location):
        assert(isinstance(volume, (int, float)))
        # Go above the location
        above_location = location.move(types.Point(z=location.point.z + 2))
        pipette.move_to(above_location)
        # Go downwards and dispense
        pipette.dispense(volume, location)
        # Go upwards to avoid smearing
        pipette.move_to(above_location)

    # Helper function (color location)
    def location_of_color(color_string):
        for well,color in well_colors.items():
            if color.lower() == color_string.lower():
                return temperature_plate[well]
        raise ValueError(f"No well found with color {color_string}")

    # Print pattern by iterating over lists
    for i, (current_color, point_list) in enumerate(point_name_pairing):
        # Skip the rest of the loop if the list is empty
        if not point_list:
            continue

        # Get the tip for this run, set the bacteria color, and the aspirate bacteria of choice
        pipette_20ul.pick_up_tip()
        max_aspirate = int(18 // POINT_SIZE) * POINT_SIZE
        quantity_to_aspirate = min(len(point_list)*POINT_SIZE, max_aspirate)
        update_volume_remaining(current_color, quantity_to_aspirate)
        pipette_20ul.aspirate(quantity_to_aspirate, location_of_color(current_color))

        # Iterate over the current points list and dispense them, refilling along the way
        for i in range(len(point_list)):
            x, y = point_list[i]
            adjusted_location = center_location.move(types.Point(x, y))

            dispense_and_jog(pipette_20ul, POINT_SIZE, adjusted_location)
            
            if pipette_20ul.current_volume == 0 and len(point_list[i+1:]) > 0:
                quantity_to_aspirate = min(len(point_list[i:])*POINT_SIZE, max_aspirate)
                update_volume_remaining(current_color, quantity_to_aspirate)
                pipette_20ul.aspirate(quantity_to_aspirate, location_of_color(current_color))

        # Drop tip between each color
        pipette_20ul.drop_tip()
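
Before sending a design like the one above to the robot, it can help to check that every point fits on the plate and that every color has a tube assigned. The sketch below is not part of the GUI export: `check_design` is a hypothetical helper name, and the 40 mm limit is taken from `MAX_DRAW_RADIUS` in the `colors.py` listing above (84 / 2 - 2).

```python
import math

MAX_DRAW_RADIUS = 84 / 2 - 2  # mm, same value as MAX_DRAW_RADIUS in colors.py


def check_design(point_name_pairing, well_colors, max_radius=MAX_DRAW_RADIUS):
    """Return a list of problems found; an empty list means the design passed."""
    problems = []
    # Color names are compared case-insensitively, like location_of_color() does
    available = {color.lower() for color in well_colors.values()}
    for color, points in point_name_pairing:
        if color.lower() not in available:
            problems.append(f"{color}: no well assigned")
        for x, y in points:
            if math.hypot(x, y) > max_radius:
                problems.append(f"{color}: point ({x}, {y}) is outside {max_radius} mm")
    return problems


# Small made-up example: one valid color and one color with two problems
demo_pairing = [("sfgfp", [(0, -8.8), (2.2, -8.8)]), ("mplum", [(41, 0)])]
demo_wells = {"A1": "sfGFP"}
for problem in check_design(demo_pairing, demo_wells):
    print(problem)
```

Passing the real `point_name_pairing` and `well_colors` from the script instead of the demo values would run the same checks on the star design.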

Finally, the result is this:

=== TOTAL VOLUMES BY COLOR ===
	Azurite (royalblue):	 aspirated 47.0	 dispensed 46.0		##### WASTE: more aspirated than dispensed!
	Electra2 (deepskyblue):	 aspirated 34.0	 dispensed 33.0		##### WASTE: more aspirated than dispensed!
	mJuniper (darkgreen):	 aspirated 9.0	 dispensed 9.0
	mKO2 (orange):	 aspirated 46.0	 dispensed 45.0		##### WASTE: more aspirated than dispensed!
	mRFP1 (red):	 aspirated 3.0	 dispensed 3.0
	mScarlet_I (tomato):	 aspirated 2.0	 dispensed 2.0
	mTurquoise2 (turquoise):	 aspirated 34.0	 dispensed 33.0		##### WASTE: more aspirated than dispensed!
	mWasabi (lightgreen):	 aspirated 4.0	 dispensed 4.0
	sfGFP (lime):	 aspirated 8.0	 dispensed 8.0
	[all]:		[aspirated 187.0]	[dispensed 183.0]

=== TIPS USED ===
	9 tip(s)  (ideal: one per color)

(<Figure size 1000x1000 with 1 Axes>,
 <Axes: title={'center': 'Opentrons Bio-Art Simulation'}>)
Star in Colab doc
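
The WASTE warnings in the volume summary appear only for colors with more than 18 points, that is, colors that need at least one refill. One possible explanation, assuming the refill logic in the `OTDesign_96_deep_well_plate.py` listing is what ran, is an off-by-one in the refill line: after dispensing point `i`, the points still to draw are `point_list[i+1:]`, but the refill volume is computed from `point_list[i:]`, which counts one point too many. The simulation below is a sketch with a hypothetical function name (`simulate_refills`); it reproduces only the volume bookkeeping, not the robot.

```python
def simulate_refills(n_points, point_size=1, max_aspirate=18, fixed=False):
    """Return (total_aspirated, total_dispensed) in µL for one color."""
    aspirated = min(n_points * point_size, max_aspirate)
    in_tip = aspirated
    dispensed = 0
    for i in range(n_points):
        in_tip -= point_size
        dispensed += point_size
        points_left = n_points - (i + 1)
        if in_tip == 0 and points_left > 0:
            # The listing counts len(point_list[i:]), i.e. one point too many;
            # fixed=True counts len(point_list[i+1:]) instead.
            counted = points_left if fixed else points_left + 1
            refill = min(counted * point_size, max_aspirate)
            aspirated += refill
            in_tip = refill
    return aspirated, dispensed


print(simulate_refills(46))              # (47, 46), as in the Azurite row
print(simulate_refills(46, fixed=True))  # (46, 46)
print(simulate_refills(9))               # (9, 9), no refill needed
```

Under this assumption each affected color wastes exactly 1 µL, which is harmless for the artwork but explains the warnings.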

Some info to keep in mind:

  • STAR in colab doc
  • CIRCUMFERENCE in colab doc
  • GITHUB LINK

Special thanks to María José and Rafael Pérez Aguirre (@Mozta)


Post-Lab Questions — DUE BY START OF FEB 24 LECTURE

One of the great parts about having an automated robot is being able to precisely mix, deposit, and run reactions without much intervention, and design and deploy experiments remotely.

For this week, we’d like for you to do the following:

  1. Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.

Automation of protein crystallization scale-up via Opentrons-2 liquid handling

This study presents an approach for optimizing protein crystallization trials at multi-microliter scale using the Opentrons-2 liquid handling robot. It shows that, with Python scripts for precise control, the robot can mix and set up crystallization plates for a model protein, hen egg white lysozyme, and for a periplasmic protein from Campylobacter jejuni whose crystals are used in the Snow lab as a biomaterial for nanotechnology and are needed in large, consistent batches. Automating the process can significantly reduce manual labor and costs and improve the reliability of the crystallization results. Because the Opentrons is programmed in Python, protocols are easier to iterate on and improve.

DeRoo, J. B., Jones, A. A., Slaughter, C. K., Ahr, T. W., Stroup, S. M., Thompson, G. B., & Snow, C. D. (2025). Automation of protein crystallization scaleup via Opentrons-2 liquid handling. SLAS Technology, 32, 100268. https://doi.org/10.1016/j.slast.2025.100268

Other interesting studies demonstrate how the Opentrons can be linked to other types of technology, such as 3D bioprinting. Although a 3D printer does not work with proteins in the same way as the OT-2, it can print different types of labware, which reduces costs and makes specialized tools possible. Beyond robots, automated labs are now also being combined with AI assistance.

  2. Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details.

While your description/project idea doesn’t need to be set in stone, we would like to see core details of what you would automate. This is due at the start of lecture and does not need to be tested on the Opentrons yet.

Project tech proposals:

  • DNA OPTIMIZER: for codon-optimizing sequences so that proteins express well in their host cells, for example E. coli expressing luciferase.
  • OPENTRONS OT-2: for running iterations of cell-expression experiments, for example bacteria folding proteins while sound frequencies are played.
  • CLOUD LAB: for iterating on different proteins and combinations before going to the lab, for example recombining DNA for cyanobacteria and luciferase.
  • BIOREACTOR: for expressing proteins properly in a contamination-free environment.
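
As a rough idea of what the DNA optimizer step could look like, the sketch below replaces every codon with one "preferred" codon for the same amino acid. Everything here is an assumption for illustration: the function names are made up, and the `PREFERRED_CODON` table is a simplified list of codons commonly cited as frequent in E. coli, which should be checked against a real codon usage table. Real optimizers also balance GC content and avoid repeats and restriction sites.

```python
from itertools import product

# Standard genetic code, with DNA codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}

# ASSUMPTION: one illustrative "preferred" codon per amino acid for E. coli
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TO_AA[dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def codon_optimize(dna: str) -> str:
    """Re-encode the same protein using only the preferred codons (no stop codon)."""
    return "".join(PREFERRED_CODON[aa] for aa in translate(dna))


original = "ATGTTCTCACCACCTGTG"
optimized = codon_optimize(original)
print(optimized)                                     # ATGTTTAGCCCGCCGGTG
print(translate(optimized) == translate(original))   # True
```

The key property is the last line: the DNA changes, but the protein it encodes does not.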

Final Project Ideas — DUE BY START OF FEB 24 LECTURE

For the final project ideas, there are 3 options to take into consideration:

  • MUSIC & BACTERIA

  • AQUATIC MICROORGANISMS & BIOLUMINESCENT SENSORS

  • PROTEIN BASED CRYSTALLINE MATERIALS & SPIDER-SILK TEXTILES

Week 4: Protein Design - part I

Part A. Conceptual Questions

Answer any NINE of the following questions from Shuguang Zhang: (i.e. you can select two to skip)

  1. How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)
  2. Why do humans eat beef but do not become a cow, eat fish but do not become fish?
  3. Why are there only 20 natural amino acids?
  4. Can you make other non-natural amino acids? Design some new amino acids.
  5. Where did amino acids come from before enzymes that make them, and before life started?
  6. If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?
  7. Can you discover additional helices in proteins?
  8. Why are most molecular helices right-handed?
  9. Why do β-sheets tend to aggregate?
  10. What is the driving force for β-sheet aggregation?
  11. Why do many amyloid diseases form β-sheets?
  12. Can you use amyloid β-sheets as materials?
  13. Design a β-sheet motif that forms a well-ordered structure.

Part B: Protein Analysis and Visualization

In this part of the homework, you will be using online resources and 3D visualization software to answer questions about proteins. Pick any protein (from any organism) of your interest that has a 3D structure and answer the following questions:

LUCIFERASE OF Pyrophorus plagiophthalamus

Luciferase is an enzyme that generates bioluminescence by catalyzing the oxidation of D-luciferin in the presence of ATP, oxygen, and Mg²⁺. In this particular insect, Pyrophorus plagiophthalamus, different isoforms of luciferase can emit light ranging from green to orange, depending on the organ in which the gene is expressed. These color variations arise from subtle structural differences in the enzyme's active site that alter the electronic environment of the excited oxyluciferin intermediate. Click beetle luciferase is very stable over a wide pH range compared to other active luciferases. This enzyme is commonly used for in vivo imaging applications, especially the red-emitting variants. It is also used as a biosensor to monitor gene expression and as a gene reporter.

I chose this particular protein because I am interested in analyzing how sound frequencies might influence bacterial protein expression, growth dynamics, or spatial organization. In this way, having luciferase as a biosensor is ideal; light emission provides a real-time, quantifiable readout.

Burbelo, P. D., Kisailus, A. E., & Peck, J. W. (2002). Detecting Protein-Protein Interactions Using Renilla Luciferase Fusion Proteins. BioTechniques, 33(5), 1044–1050. https://doi.org/10.2144/02335st05


How long is it? What is the most frequent amino acid?

AAQ11735.1 luciferase [Pyrophorus plagiophthalamus]

MMKREKNVVYGPEPLHALEDLTAGEMLFRALRKHSHLPQALVDVYGEEWISYKEFFETTCLLAQSLHNCG
YKMSDVVSICAENNKRFFVPIIAAWYIGMIVAPVNEGYIPDELCKVMGISRPQLVFCTKNILNKVLEVQS
RTDFIKRIIILDAVENIHGCESLPNFISRYSDGNIANFKPLHYDPVEQVAAILCSSGTTGLPKGVMQTHR
NVCVRLIHALDPRVGTQLIPGVTVLVYLPFFHAFGFSINLGYFMVGLRVIMLRRFDQEAFLKAIQDYEVR
SVINVPAIILFLSKSPLVDKYDLSSLRELCCGAAPLAKEVAEIAVKRLNLPGIRCGFGLTESTSANIHSL
RDEFKSGSLGKVTPFMAVKIADRETGKALGPNQVGELCVKGPMVSKGYVNNVEATKEAIDDDGWLHSGDF
GYYDQDEHFYVVDRYKELIKYKGSQVAPAELEEILLKNPCIRDVAVVGIPDLEAGELPSAFVVIQPGKEI
TAKEVYDYLAERVSHTKYLRGGVRFVDSIPRNVTGKITRKELLKQLLEKSSKL

For this part, I used Google Colab and did some research on leucine. The luciferase of Pyrophorus plagiophthalamus has 543 amino acids; the most frequent is L (leucine), which appears 56 times. Leucine is commonly known as an amino acid that helps synthesize muscle proteins and supports tissue regeneration. In this protein, its role is related to forming the hydrophobic core, correct protein folding, and the formation of alpha helices.
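
The length and the leucine count can be checked with a few lines of Python. This is a sketch of one way to do the count, not necessarily the exact Colab cell that was run; it uses only `collections.Counter` from the standard library and the AAQ11735.1 sequence pasted above.

```python
from collections import Counter

# AAQ11735.1 luciferase [Pyrophorus plagiophthalamus], as pasted above
sequence = (
    "MMKREKNVVYGPEPLHALEDLTAGEMLFRALRKHSHLPQALVDVYGEEWISYKEFFETTCLLAQSLHNCG"
    "YKMSDVVSICAENNKRFFVPIIAAWYIGMIVAPVNEGYIPDELCKVMGISRPQLVFCTKNILNKVLEVQS"
    "RTDFIKRIIILDAVENIHGCESLPNFISRYSDGNIANFKPLHYDPVEQVAAILCSSGTTGLPKGVMQTHR"
    "NVCVRLIHALDPRVGTQLIPGVTVLVYLPFFHAFGFSINLGYFMVGLRVIMLRRFDQEAFLKAIQDYEVR"
    "SVINVPAIILFLSKSPLVDKYDLSSLRELCCGAAPLAKEVAEIAVKRLNLPGIRCGFGLTESTSANIHSL"
    "RDEFKSGSLGKVTPFMAVKIADRETGKALGPNQVGELCVKGPMVSKGYVNNVEATKEAIDDDGWLHSGDF"
    "GYYDQDEHFYVVDRYKELIKYKGSQVAPAELEEILLKNPCIRDVAVVGIPDLEAGELPSAFVVIQPGKEI"
    "TAKEVYDYLAERVSHTKYLRGGVRFVDSIPRNVTGKITRKELLKQLLEKSSKL"
)

counts = Counter(sequence)
print("Length:", len(sequence))                      # Length: 543
print("Leucine (L) count:", counts["L"])             # Leucine (L) count: 56
print("Three most frequent:", counts.most_common(3))
```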

Luciferase - colab run

How many protein sequence homologs are there for your protein?

According to UniProt's BLAST tool, it has 236 homologs. This means that there is a variety of similar proteins across living organisms. They are not identical, but they share a very similar structure. These homologs can be orthologs or paralogs. Paralogs are related proteins found within the same organism, here the insect itself, with very subtle variations in their structure.

Does your protein belong to any protein family?

Yes, it belongs to the insect luciferases. This type of protein needs ATP, D-luciferin, and oxygen to perform the oxidation process.

When was the structure solved? Is it a good quality structure? A good-quality structure is one with high resolution. Smaller the better (Resolution: 2.70 Å) :

This particular protein, the luciferase of Pyrophorus plagiophthalamus, is not in the RCSB Protein Data Bank, so I used the first luciferase structure deposited there: 1LCI, firefly luciferase from Photinus pyralis. Its structure was solved in 1997. The resolution is 2.00 Å, which indicates good quality.

BAL46512.1 firefly luciferase [Photinus pyralis]

MEDAKNIKKGPAPFYPLEDGTAGEQLHKAMKRYALVPGTIAFTDAHIEVNITYAEYFEMSVRLAEAMKRY
GLNTNHRIVVCSENSLQFFMPVLGALFIGVAVAPANDIYNERELLNSMNISQPTVVFVSKKGLQKILNVQ
KKLPIIQKIIIMDSKTDYQGFQSMYTFVTSHLPPGFNEYDFVPESFDRDKTIALIMNSSGSTGLPKGVAL
PHRTACVRFSHARDPIFGNQIIPDTAILSVVPFHHGFGMFTTLGYLICGFRVVLMYRFEEELFLRSLQDY
KIQSALLVPTLFSFFAKSTLIDKYDLSNLHEIASGGAPLSKEVGEAVAKRFHLPGIRQGYGLTETTSAIL
ITPEGDDKPGAVGKVVPFFEAKVVDLDTGKTLGVNQRGELCVRGPMIMSGYVNNPEATNALIDKDGWLHS
GDIAYWDEDEHFFIVDRLKSLIKYKGYQVAPAELESILLQHPNIFDAGVAGLPDDDAGELPAAVVVLEHG
KTMTEKEIVDYVASQVTTAKKLRGGVVFVDEVPKGLTGKLDARKIREILIKAKKGGKSKL

Luciferase - Photinus pyralis

Are there any other molecules in the solved structure apart from protein?

Crystallographic water molecules (HOH) are present; they stabilize the protein and may participate in hydrogen bond formation. As this was the first luciferase structure to be solved, it does not include any other components besides the protein and water.

Red dots alone: crystallographic water

Does your protein belong to any structure classification family?

It belongs to the ATP-dependent AMP-binding enzyme family. This family includes enzymes that activate substrates through adenylation using ATP, forming an AMP-bound intermediate.

UniProt's info

Open the structure of your protein in any 3D molecule visualization software:

  • PyMol Tutorial Here (hint: ChatGPT is good at PyMol commands)
  • Visualize the protein as “cartoon”, “ribbon” and “ball and stick”.

Luciferase-cartoon
Luciferase-ribbon
Luciferase-ball and sticks

Color the protein by secondary structure. Does it have more helices or sheets?

Luciferase-ss

The protein shows a predominance of alpha helices (red) over beta sheets (green). This indicates that firefly luciferase is mainly an alpha-helical protein with a smaller portion of beta-sheet structures.

Color the protein by residue type. What can you tell about the distribution of hydrophobic vs hydrophilic residues?

Luciferase-hydrophobic (yellow) vs. hydrophilic (yellow)

This enzyme operates in an aqueous environment, so its exterior is dominated by hydrophilic residues, while its core is made up mostly of hydrophobic residues.

Visualize the surface of the protein. Does it have any “holes” (aka binding pockets)?

Yes, the protein surface shows a variety of pockets: one large pocket and several smaller ones. It is in the large pocket that ATP reacts with D-luciferin to form luciferyl-AMP, which is then oxidized by molecular oxygen to produce oxyluciferin and light.


Part C. Using ML-Based Protein Design Tools

C1. Protein Language Modeling

Deep Mutational Scans

a. Use ESM2 to generate an unsupervised deep mutational scan of your protein based on language model likelihoods.
b. Can you explain any particular pattern? (choose a residue and a mutation that stands out).
c. (Bonus) Find sequences for which we have experimental scans, and compare the prediction of the language model to experiment.

Deep Mutational Scan

The map shows that a large number of mutations are possible, although two main regions should not be changed because the protein could collapse; those regions appear as two dark blue columns. Also, three subtle rows show consistent coloring, corresponding to W, M, and C.
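Each cell of such a heatmap is typically a log-likelihood ratio between a mutant residue and the wild-type residue at a masked position. The sketch below illustrates that scoring rule only; it does not call ESM2, the probabilities are hypothetical, and whether the course notebook uses exactly this formula is an assumption.

```python
import math


def mutation_scores(position_probs: dict[str, float], wild_type: str) -> dict[str, float]:
    """Score each substitution as log p(mutant) - log p(wild type) at one position."""
    wt_log_prob = math.log(position_probs[wild_type])
    return {
        residue: math.log(prob) - wt_log_prob
        for residue, prob in position_probs.items()
        if residue != wild_type
    }


# Hypothetical model probabilities at a strongly conserved glycine (not real ESM2 output).
probs = {"G": 0.90, "A": 0.06, "S": 0.03, "W": 0.01}
scores = mutation_scores(probs, wild_type="G")
for residue, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"G->{residue}: {score:.2f}")
```

At a conserved position every substitution scores well below zero, which is what a dark column in the map corresponds to.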

Latent Space Analysis

a. Use the provided sequence dataset to embed proteins in reduced dimensionality.
b. Analyze the different formed neighborhoods: do they approximate similar proteins?
c. Place your protein in the resulting map and explain its position and similarity to its neighbors.

The map shows that the analyzed protein (firefly luciferase from Photinus pyralis) sits close to the luciferase of Luciola cruciata, a protein produced by another species of firefly. The first, PP, is from North America, while the second, LC, is from Japan. Besides geographical origin, the main differences lie in their molecular composition, which results in a slightly different emission color, and in the stability of the enzyme. Although both proteins use D-luciferin and ATP to produce light, PP luciferase is widely used in biotech as a reporter gene. In contrast, LC luciferase is used to understand how active-site residues interact with the substrate.

C2. Protein Folding

Fold your protein with ESMFold. Do the predicted coordinates match your original structure? Try changing the sequence, first try some mutations, then large segments. Is your protein structure resilient to mutations?

When folded with ESMFold, the protein shows a structure almost identical to the original one. When a few mutations are introduced, it shows some changes and a few anomalies, but none of them radical, suggesting that the structure is largely resilient to mutations.

C3. Protein Generation

Inverse-folding a protein: let’s now use the backbone of your chosen PDB to propose sequence candidates via ProteinMPNN. Analyze the predicted sequence probabilities and compare the predicted sequence vs the original one. Input this sequence into ESMFold and compare the predicted structure to your original.

SDRIRVGPEPAEPVQPGTAGQLLHDAMRKFAAIPGTVAFIDAETGKSMTYEEFYTDSVKMAAALKNYGLDKNDAIAVMSKNSLQYFIPVLGALMIGVAVAPINPDYDVEGALTAMSRAKPKVVFTSKENIEKVKEVQKKLPTIKEIIVLDSKEPYKGLDSIYTFIEKYLPEGFDPWKFKPAEFDRDTTIAFILEDXXXXXEPKGVAHPHRALVHNFSIAVDPVYGIAPVPGTVILLTTPLTEHVGLTNTLGAIYAGFTVVLISKFDEDLFLKTLQDYKVQEAYVEPEMLELLAKSTKISQYDLSSLKRISSGGHVISKEVADAVAKKFNLPGVRRGYGKTETFHAFIITPEGXXXGGAAGHVVPYYEARVVDPETGEVLGVNEVGEIEVRGPMIMAGYVDDPEATAERIDEDGWYHTGDLGYFDENGALYIVXXXXXLILNNGKPVDPADLEAVLRSHPAIKDAGVAGLPDPAAGELPAAVVVKAPGKTITEAEVVAYVASQVPPHKHLTGGVVFVDEVPXXXXXAVDRAAVRAILVAAKG

Even though the predicted sequence differs from the original at many positions, the fold remains the same. This is why the protein shown in 3D is very similar to the original in its alpha and beta structures. The backbone is not altered, and neither is the overall logic of where certain types of amino acids are placed.
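One way to quantify how different the designed sequence is from the native one is percent identity over aligned positions. The sketch below assumes the two sequences have already been aligned to the same length; the function name `percent_identity` is hypothetical, and the two 14-residue fragments were taken from the native and ProteinMPNN sequences above and lined up by eye, which is an assumption rather than a proper alignment.

```python
def percent_identity(a: str, b: str, skip: str = "X") -> float:
    """Percent of matching residues in two pre-aligned, equal-length sequences.

    Positions where either sequence has a character in `skip` (for example the
    X placeholders in the ProteinMPNN output) are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x not in skip and y not in skip]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


# Fragments from the sequences above, aligned by eye (assumption).
native_fragment = "GTAGEQLHKAMKRY"
designed_fragment = "GTAGQLLHDAMRKF"
print(f"{percent_identity(native_fragment, designed_fragment):.1f}% identical")
```

For the full-length comparison, a real pairwise alignment would be needed first, since the two sequences do not have the same length.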


Part D. Group Brainstorm on Bacteriophage Engineering

  1. Find a group of ~3–4 students
  2. Read through the Phage Reading material listed under “Reading & Resources” below.
  3. Review the Bacteriophage Final Project Goals for engineering the L Protein:
    • Increased stability (easiest)
    • Higher titers (medium)
    • Higher toxicity of lysis protein (hard)
  4. Brainstorm Session
    • Choose one or two main goals from the list that you think you can address computationally (e.g., “We’ll try to stabilize the lysis protein,” or “We’ll attempt to disrupt its interaction with E. coli DnaJ.”).
    • Write a 1-page proposal (bullet points or short paragraphs) describing:
      • Which tools/approaches from recitation you propose using (e.g., “Use Protein Language Models to do in silico mutagenesis, then AlphaFold-Multimer to check complexes.”).
      • Why do you think those tools might help solve your chosen sub-problem?
      • Name one or two potential pitfalls (e.g., “We lack enough training data on phage–bacteria interactions.”).
      • Include a schematic of your pipeline.
    • This resource may be useful: HTGAA Protein Engineering Tools
  5. Each individually put your plan on your HTGAA website
    • Include your group’s short plan for engineering a bacteriophage

Week 5: Protein Design - part II

Part A: SOD1 Binder Peptide Design (From Pranam)

Superoxide dismutase 1 (SOD1) is a cytosolic antioxidant enzyme that converts superoxide radicals into hydrogen peroxide and oxygen. In its native state, it forms a stable homodimer and binds copper and zinc.

Mutations in SOD1 cause familial Amyotrophic Lateral Sclerosis (ALS). Among them, the A4V mutation (Alanine → Valine at residue 4) leads to one of the most aggressive forms of the disease. The mutation subtly destabilizes the N-terminus, perturbs folding energetics, and promotes toxic aggregation.

Your challenge:

Design short peptides that bind mutant SOD1. Then decide which ones are worth advancing toward therapy.

A. Part 1: Generate Binders with PepMLM

SOD1 SEQUENCE

sp|P00441|SODC_HUMAN Superoxide dismutase [Cu-Zn] OS=Homo sapiens OX=9606 GN=SOD1 PE=1 SV=2
MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS
AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV
HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

SOD1 SEQUENCE with A4V mutation

MATK**V**VCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS
AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV
HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
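The mutant sequence can also be produced programmatically. Note that A4V is numbered without the initiator methionine, so in the UniProt sequence (which starts with M) the affected alanine is the fifth residue. The helper below is a minimal sketch; the name `apply_mutation` and its `offset` parameter are hypothetical, and the offset of 1 is inferred from the mutant sequence shown above.

```python
def apply_mutation(sequence: str, mutation: str, offset: int = 0) -> str:
    """Apply a point mutation such as 'A4V' to a protein sequence.

    `offset` is the number of leading residues (for example an initiator Met)
    that the mutation numbering does not count.
    """
    wild_type, position, mutant = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position - 1 + offset  # 0-based index into `sequence`
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + mutant + sequence[index + 1:]


# First 20 residues of the P00441 sequence shown above (includes Met1).
sod1_start = "MATKAVCVLKGDGPVQGIIN"
mutant_start = apply_mutation(sod1_start, "A4V", offset=1)
print(mutant_start)  # MATKVVCVLKGDGPVQGIIN
```

The wild-type check guards against numbering mistakes: calling the function with `offset=0` raises an error because residue 4 of the full sequence is K, not A.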

After generating 4 peptides of 12 amino acids each against the mutant sequence, and adding the known SOD1-binding peptide (row 5) for comparison, we got:

Binder#    Sequence        Pseudo Perplexity

1         WHYPVVGLALGX    10.324439
2         WRVYPAGAAHKK    12.004818
3         KRYPAVALAWWE    14.695892
4         WLYYAVGAEHGE    15.423531
5         FLYRWLPSRRGG    20.635226

The pseudo-perplexity score works as follows: the lower the value, the higher the confidence of the model. This means that the peptide with 15.42 is less natural, while the peptide with 10.32 is more natural and more compatible with the target sequence. The known SOD1-binding sequence added for comparison stands out: it has a pseudo-perplexity of 20.63, the highest value in the table, which means that the model is least confident in it.
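To make the score more concrete: for masked language models, pseudo-perplexity is usually defined as the exponential of the negative mean log-probability the model assigns to each residue when that residue is masked. The sketch below uses that common definition; whether PepMLM computes it in exactly this way is an assumption, and the probabilities are hypothetical.

```python
import math


def pseudo_perplexity(token_log_probs: list[float]) -> float:
    """exp of the negative mean log-probability of each residue when masked."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Hypothetical 12-residue peptides: the model gives every residue the same probability.
confident = [math.log(0.5)] * 12    # model is fairly sure about each residue
uncertain = [math.log(0.05)] * 12   # model finds each residue unlikely

print(pseudo_perplexity(confident))  # about 2
print(pseudo_perplexity(uncertain))  # about 20
```

A value near 20 therefore corresponds to the model assigning, on average, about a 1 in 20 chance to each residue, which is no better than picking among the 20 amino acids at random.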

B. Part 2: Evaluate Binders with AlphaFold3

Binder#    Sequence        Pseudo Perplexity       ipTM      pTM

1         WHYPVVGLALGX     10.324439               0.40      0.85  
2         WRVYPAGAAHKK     12.004818               0.39      0.86
3         KRYPAVALAWWE     14.695892               0.36      0.87
4         WLYYAVGAEHGE     15.423531               0.32      0.81
5         FLYRWLPSRRGG     20.635226               0.36      0.82

I took the peptides generated with PepMLM and modeled each one in complex with the mutant SOD1 sequence using AlphaFold3. The pTM values (0.81 to 0.87) indicate that the model has high confidence in the overall predicted structure. The ipTM values, however, are all below 0.6, which means there is low confidence in the predicted interaction between the peptide and the protein. Also, the places where the peptide does make some contact with the protein correspond to the beginning of the sequence, which appears to be a more flexible region of the protein.

Peptide 1
Peptide 2
Peptide 3
Peptide 4
Peptide 5

C. Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse

PEPTIDE 1: WHYPVVGLALGX

PROPERTY                    PREDICTION        VALUE         UNIT
💧   Solubility              Soluble           1.000        Probability
🩸   Hemolysis               Non-hemolytic     0.103        Probability
🔗  Binding Affinity        Weak binding      6.293        pKd/pKi
⚖️  Molecular Weight                          1193.6       Da
⚡  Net Charge (pH 7)                         -0.15

PEPTIDE 2: WRVYPAGAAHKK

PROPERTY                    PREDICTION        VALUE         UNIT
💧   Solubility              Soluble           1.000        Probability
🩸   Hemolysis               Non-hemolytic     0.010        Probability
🔗  Binding Affinity        Weak binding      4.953        pKd/pKi
⚖️  Molecular Weight                          1383.6       Da
⚡  Net Charge (pH 7)                         2.84

PEPTIDE 3: KRYPAVALAWWE

PROPERTY                    PREDICTION        VALUE         UNIT
💧   Solubility              Soluble           1.000        Probability
🩸   Hemolysis               Non-hemolytic     0.071        Probability
🔗  Binding Affinity        Weak binding      6.653        pKd/pKi
⚖️  Molecular Weight                          1489.7       Da
⚡  Net Charge (pH 7)                         0.77

PEPTIDE 4: WLYYAVGAEHGE

PROPERTY                    PREDICTION        VALUE         UNIT
💧   Solubility              Soluble           1.000        Probability
🩸   Hemolysis               Non-hemolytic     0.065        Probability
🔗  Binding Affinity        Weak binding      6.053        pKd/pKi
⚖️  Molecular Weight                          1394.5       Da
⚡  Net Charge (pH 7)                         -2.14

PEPTIDE 5: FLYRWLPSRRGG

PROPERTY                    PREDICTION        VALUE         UNIT
💧   Solubility              Soluble           1.000        Probability
🩸   Hemolysis               Non-hemolytic     0.047        Probability
🔗  Binding Affinity        Weak binding      5.968        pKd/pKi
⚖️  Molecular Weight                          1507.7       Da
⚡  Net Charge (pH 7)                         2.76

It seems that the best candidate as a therapeutic peptide is Peptide 3: KRYPAVALAWWE. The other candidates show very good qualities and achieve similar results in terms of solubility, hemolysis, molecular weight, and net charge, but their binding scores are lower. Peptide 3 shows the highest predicted binding affinity among the candidates (6.653), although all five are still classified as weak binders.

If we compare these results with the ipTM values predicted by AlphaFold3, we can observe that the confidence in the interaction between the peptides and the protein is generally low. For Peptide 3 in particular, the ipTM value (0.36) lies in the middle of the observed range, so within these low values it suggests only moderate structural confidence in the predicted interaction.

Additionally, when designing peptides for therapeutic purposes, several properties must be considered. First, peptides need to be soluble so that they can circulate in the biological fluids without forming aggregates. Second, hemolysis probabilities should remain below 0.2, since higher values indicate that peptides may disrupt red blood cells and release hemoglobin into the bloodstream, which can be toxic. Third, binding affinity is important because it helps to predict whether a peptide will interact strongly with the target protein. Furthermore, molecular weight is preferably small, as smaller peptides are easier to synthesize and diffuse through biological environments. Finally, a moderate positive net charge is often favorable, because it can promote electrostatic interactions with negatively charged regions on protein surfaces, potentially stabilizing the peptide-protein interaction.
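The selection logic described above can be written as a small filter-and-rank step. This is a sketch under stated assumptions: the values are copied from the PeptiVerse tables above, the 0.2 hemolysis cutoff comes from the text, and the 0.5 solubility cutoff and the names `Peptide` and `shortlist` are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Peptide:
    name: str
    sequence: str
    solubility: float  # predicted probability of being soluble
    hemolysis: float   # predicted probability of being hemolytic
    affinity: float    # predicted pKd/pKi


# Values copied from the PeptiVerse tables above.
peptides = [
    Peptide("Peptide 1", "WHYPVVGLALGX", 1.000, 0.103, 6.293),
    Peptide("Peptide 2", "WRVYPAGAAHKK", 1.000, 0.010, 4.953),
    Peptide("Peptide 3", "KRYPAVALAWWE", 1.000, 0.071, 6.653),
    Peptide("Peptide 4", "WLYYAVGAEHGE", 1.000, 0.065, 6.053),
    Peptide("Peptide 5", "FLYRWLPSRRGG", 1.000, 0.047, 5.968),
]


def shortlist(
    candidates: list[Peptide],
    max_hemolysis: float = 0.2,   # cutoff mentioned in the text
    min_solubility: float = 0.5,  # assumed cutoff, not from the source
) -> list[Peptide]:
    """Drop peptides that fail the safety filters, then rank by predicted affinity."""
    safe = [
        p for p in candidates
        if p.hemolysis < max_hemolysis and p.solubility >= min_solubility
    ]
    return sorted(safe, key=lambda p: p.affinity, reverse=True)


ranked = shortlist(peptides)
for p in ranked:
    print(p.name, p.sequence, p.affinity)
```

All five peptides pass both filters, so the ranking is decided by affinity alone and Peptide 3 comes out first.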

D. Part 4: Generate Optimized Peptides with moPPIt

Index  Binder         Hemolysis            Solubility           Affinity             Motif

0      GYGFGEGQCYTH   0.92835745960474     0.9166666865348816   6.400432586669922    0.838644802570343
1      WVISFKKEEEKA   0.974050072953105    0.5833333134651184   6.252631664276123    0.5743995308876038
2      KKTIQVKTECGR   0.9746747445315124   0.8333333134651184   6.728649616241455    0.8640713095664978
3      YYQKTCLVKKEH   0.9649858362972736   0.8333333134651184   7.076708793640137    0.882260799407959

I chose to target the residues nearest the mutation because the flexibility of this region may favor peptide-protein binding.

How to read the values:

💧 Solubility: 1.0 (good)
🩸 Hemolysis: 1.0 (good)
🔗 Binding Affinity: the higher the better
🧩 Motif: 1.0 (good)

Therefore, Peptide 3 (YYQKTCLVKKEH) is the best candidate for binding to the mutant SOD1. It presents balanced and consistent results in every aspect: hemolysis, solubility, affinity, and motif. Although its solubility is slightly lower than that of Peptide 0, it still falls in the favorable range, suggesting that the peptide can remain stable and soluble under physiological conditions. It also has the highest affinity and motif scores, suggesting a strong and specific interaction with the selected residues of the protein.

Compared to the PepMLM peptides, the moPPIt results reach a higher predicted affinity (up to 7.08 versus 6.65) and a good motif score, which was not available for the PepMLM peptides. I think the moPPIt peptides have a better chance of binding the protein because they were optimized against specific target residues in a specific region, while PepMLM gives a general result based on stable and more plausible sequences without focusing on any particular binding site.
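To get a feeling for what the affinity numbers mean, a pKd value can be converted into a dissociation constant with Kd = 10^(-pKd) mol/L. The sketch below treats both predicted scores as pKd, which is an assumption (PeptiVerse labels the unit as pKd/pKi, and the two tools may not use the same predictor), so the comparison is only indicative.

```python
def pkd_to_kd_nanomolar(pkd: float) -> float:
    """Convert pKd (-log10 of Kd in mol/L) into Kd in nanomolar."""
    return 10 ** (-pkd) * 1e9


best_pepmlm = 6.653              # Peptide 3 (KRYPAVALAWWE), from PeptiVerse
best_moppit = 7.076708793640137  # Peptide 3 (YYQKTCLVKKEH), from moPPIt

print(f"PepMLM best: {pkd_to_kd_nanomolar(best_pepmlm):.0f} nM")  # roughly 222 nM
print(f"moPPIt best: {pkd_to_kd_nanomolar(best_moppit):.0f} nM")  # roughly 84 nM
```

Because the scale is logarithmic, a difference of about 0.4 pKd units corresponds to a predicted Kd that is between two and three times lower.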

How would you evaluate these peptides before advancing them to clinical studies?

I would first run a few more computational tests, such as docking and molecular dynamics simulations, to obtain consistent results on the stability and strength of the peptide-protein bond. Afterward, it will be necessary to do some in vitro experiments to test whether the solubility, hemolysis, binding affinity, and motif results remain consistent with the computational predictions. Finally, in vivo models would be used to assess safety, stability, and pharmacokinetic properties to see if the peptide meets the requirements for clinical studies.


Part C: Final Project: L-Protein Mutants

High-level summary: The objective of this assignment is to improve the stability and folding of the lysis protein of the MS2 phage. This mechanism is key to understanding how phages could help address antibiotic resistance.

Subsections of Labs

Week 1 Lab: Pipetting


Subsections of Projects

Individual Final Project


Group Final Project
