Homework

Weekly homework submissions:

Week 1: Principles and Practices
Class Assignment Project Idea Chronic wounds and surgical site infections affect millions of patients and cost healthcare systems tens of billions of dollars annually, yet closure devices often remain as passive stitches that do not actively orchestrate local immunity or regeneration [1][2]. Drug-eluting sutures have shown that suture material can safely deliver local therapeutics, but current designs provide only finite, non-adaptive release of single agents such as antibiotics or growth factors [3][4]. Cell-filled sutures packed with mesenchymal stem cells already demonstrate that viable cells can be integrated into suture structures and enhance healing, but these cells are unmodified and lack controllable, multi-functional outputs [5]. Separately, engineered combinatorial cell devices in fiber-like formats can secrete optimized cocktails of growth factors to accelerate wound and bone repair, but they are not load-bearing sutures and do not address infection or scar modulation at the incision line [6].
Week 2: DNA Read, Write, and Edit
Benchling and In-Silico Gel Art Simulate Restriction Enzyme Digest I found this process quite intuitive, as I’ve done similar simulations with the application SnapGene, but it was interesting to notice the small interface differences between the two!
Week 3: Lab Automation
Python script for Opentrons artwork For the art portion of this week’s assignment, I decided to code Yoshi from Super Mario Brothers since the Designer Cells node only had the red and green colors. I used this photo as reference. From there, I started to code for the Opentrons automation.
Week 4: Protein Design Pt 1
Conceptual Questions (Question 1) A 500 g piece of meat would weigh about 3.011 × 10²⁶ Daltons, and since each amino acid weighs about 100 Daltons, consuming this piece of meat means consuming roughly 3.011 × 10²⁴ amino acids. (Question 2) When we eat meat, we physically and enzymatically break it down into amino acids, fatty acids, and sugars, which in turn are used to provide energy to our bodies.
Week 5: Protein Design Pt 2
SOD1 binder peptide design Generate binders with PepMLM The original SOD1 sequence[1] is as follows: >sp|P00441|SODC_HUMAN Superoxide dismutase [Cu-Zn] OS=Homo sapiens OX=9606 GN=SOD1 PE=1 SV=2 MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ The A4V mutation changes the alanine to valine at codon 4, which results in
Week 6: Genetic Circuits Pt 1
DNA Assembly (Question 1) Within Phusion High-Fidelity Master Mix [1], there is a Phusion DNA Polymerase (which enzymatically synthesizes the DNA in the 5’ to 3’ direction), nucleotides (the building blocks of the synthesized DNA), and an optimized reaction buffer (maintains optimal conditions for the polymerase). (Question 2) Some factors that determine primer annealing temperature are the primer length, the GC content, and the salt concentration.
Week 7: Genetic Circuits Pt 2
Intracellular Artificial Neural Networks (IANNs) (Question 1) IANNs have the advantage of providing a more nuanced approach than conventional genetic circuits by allowing continuous input and output responses, whereas genetic circuits that use Boolean logic often respond in a more binary manner. (Question 2) The introduction of IANNs raised an interesting question for my individual project idea. Since my final project involves the design of a genetic circuit that can sense and then respond to the formation of fibrotic scarring, IANNs could be used as a more sophisticated approach to this problem by increasing the specificity of the circuit so that it only activates in a truly fibrotic wound microenvironment. In my original circuit, I had aimed to have part of my circuit sense both STAT3 and NF-kB as a trigger to secrete the anti-fibrotic factor, decorin. However, by incorporating an IANN instead, I could further decrease the noise from transient inflammatory spikes by encoding three synthetic transcription factors whose expression is driven by STAT3, NF-kB, TGF-B, and HIF-1a promoters respectively. The second section of my genetic circuit would then be placed under a promoter that requires the binding of all three synthetic transcription factors.
Week 9: Cell Free Systems
General questions (Question 1) Since the cell-free protein synthesis system eliminates the cell membrane, this means that the environment that the reaction is performed in is less limited by what can enter or exit the cell as it alters the dependence of the reaction on other cellular constraints. For example, the energy source and the chaperone/cofactor concentrations can be altered independently of the cell’s own needs. This poses a particularly interesting environment for cases such as the incorporation of non-standard amino acids, in which cells may not contain the machinery necessary to incorporate but contain machinery that would resist the incorporation of such amino acids. Another intriguing application would be the prototyping of vaccine antigen production. Due to the speed that cell-free systems can perform at, the system would be able to produce a functional antigen from a gene sequence much quicker without the need to engineer a stable cell line to express the desired antigen.
Week 10: Advanced Imaging and Measurement Technology
Final Project For my final project, I will need to sequence the genetic circuit that I ultimately construct as well as measure the concentrations of the IL-10, Decorin, and Bxb1 and PhiC31 integrases produced by the circuit. To sequence the genetic circuit, the most common method would be Sanger sequencing, which uses electrophoresis after synthesis to sort the fragments by length and read the sequence from the base that terminated each fragment. To measure the concentrations of the IL-10, Decorin, and Bxb1 and PhiC31 integrases produced by my genetic circuit, I can use mass spectrometry. After harvesting the expression cells at the appropriate time points, I will use the spike-in standards strategy and then calculate the ratio of my endogenous peptide signal to the heavy standard signal, calculating the concentration based on the moles of the protein measured divided by the volume of my original sample. Waters Pt. 1: Molecular Weight For the following calculations, I will be using the provided eGFP sequence
Week 11: Bioproduction and Cloud Labs
The 1,536 Pixel Artwork Canvas I ended up contributing 6 pixels of various colors to the canvas, which were mostly made on the border, but didn’t end up in the final artwork. I really enjoyed that this assignment was a play on other iterations of the collaborative pixel artwork challenges across various platforms, and felt like a fun way to be able to interact with the entire HTGAA community. I think that a lower cooldown time was needed (and I heard that it was implemented towards the end), as I would often click onto another tab while waiting (and then would get distracted…). Overall though, it was fun to see what came out of the community and what ended up on the final canvas.

Week 1: Principles and Practices

Class Assignment

Project Idea

Chronic wounds and surgical site infections affect millions of patients and cost healthcare systems tens of billions of dollars annually, yet closure devices often remain as passive stitches that do not actively orchestrate local immunity or regeneration [1][2].

Drug-eluting sutures have shown that suture material can safely deliver local therapeutics, but current designs provide only finite, non-adaptive release of single agents such as antibiotics or growth factors [3][4]. Cell-filled sutures packed with mesenchymal stem cells already demonstrate that viable cells can be integrated into suture structures and enhance healing, but these cells are unmodified and lack controllable, multi-functional outputs [5]. Separately, engineered combinatorial cell devices in fiber-like formats can secrete optimized cocktails of growth factors to accelerate wound and bone repair, but they are not load-bearing sutures and do not address infection or scar modulation at the incision line [6].

In a separate project, I explored how patient skin cells (such as fibroblasts) could be engineered to express a genetic circuit that could counteract the persistent inflammation of chronic wounds, sense a biomarker indicative of the end of the inflammatory wound healing phase, and then kickstart the proliferation phase sequentially.

I want to use a similar premise to propose a hollow, bioabsorbable suture that houses genetically engineered cells programmed to sense wound and infection cues and, in response, secrete combinations of pro-regenerative and antimicrobial factors over the critical healing window. This would transform sutures from a passive mechanical closure tool into an adaptive, living therapeutic that directly tackles both impaired healing and scarring in a way that current drug-eluting or cell-based sutures cannot.

Governance/Policy Goals

Enhance Biosecurity

Introducing genetically modified living materials into the body always poses the risk of unintended side effects in terms of how that newly modified

Escape and persistence of engineered cells
- Genetically engineered cells have the potential to leak from the suture material during degradation, which may allow these cells to migrate to other, unintended areas of the body
Unintended immune suppression hotspots
- One potential application of the seeded engineered cells is to assist in the healing of chronic wounds, which would require the secretion of anti-inflammatory cytokines. In this case, the suture could create a local environment that is susceptible to tumor growth, because the body’s natural protection mechanisms would be temporarily reduced

Foster Lab and Patient Safety

By preventing incident
Informed patient consent
Adverse events

Protect the Environment

Wasted suture material
- As this suture material would contain a living cellular component, the wasted material would need to be properly disposed of through the right channels
Resistance ecology
- As the suture material could aim to reduce microbial infection, this could lead to an inadvertent resistance issue through evolution (similar to antibiotic resistance) and should be thoughtfully considered

Other Considerations

Equal access
Not impede research
Promote constructive applications

Potential Governance Actions

Specialized biosafety and clinical training track for “living implant” users

Purpose:
- Researchers and clinicians complete general biosafety and surgical training, but there is no standardized curriculum for working with engineered living implants
- Establish a dedicated training and certification program for labs and clinicians who design, manufacture, or implant living sutures, similar to specialized credentialing for radiation safety or gene therapy administration
Design:
- Government entities would implement a standardized curriculum and requirement for all individuals working with living materials
- Universities, hospitals, and organizations could develop modules on containment of genetically engineered materials, safety functions and limitations, and proper disposal methods, and would require designated users to complete them
Assumptions:
- Assumes that the training would be taken seriously by all parties involved
- Assumes that institutions have the resources to implement this level of training
Risks of Failures and Success:
- Training can easily devolve into people trying to just “pass a quiz”
- Small or underprivileged institutions may not be able to support the certification
- Credentials could become a bottleneck in care, limiting broader patient impact

Mandatory standardized labeling and risk communication for living sutures

Purpose:
- Implanted devices and sutures often have minimal patient-facing documentation and many patients do not know exactly what materials are being used
- Require clear, standardized labeling and risk summaries for any engineered-cell suture, both on packaging for clinicians and in take-home materials for patients, similar to medication guides for high-risk drugs
Design:
- Government entities should define a standardized label and one-page explanation stating that the suture is living/engineered, along with its intended benefits, key unknowns, possible risks, and recommended follow-up durations
- Medical professionals should ensure that patients receive and acknowledge these materials during consent and discharge
Assumptions:
- Assumes patients will read and understand the materials
- Assumes that clinicians will consistently use and explain documents instead of just handing them over
- Assumes that simple language used for materials can convey the complex biological concepts utilized
Risks of Failures and Success:
- Overly technical language may confuse or scare patients without helping them to make an informed decision
- If the material emphasizes uncertainty too strongly, clinicians may avoid using the sutures due to patient refusal or anxieties, even when risk-benefit is favorable in high-need cases

Open safety data and pre-registration for living-suture research

Purpose:
- Clinical trials are often pre-registered, but preclinical work, especially in industry, can remain proprietary and negative results are frequently unpublished
- Require prospective registration and open reporting of both clinical and key preclinical studies involving engineered-cell sutures, including negative or inconclusive safety findings
Design:
- Academic and industrial labs should register protocols in public or semi-public databases and post summaries of the key findings, including failures
- Government or regulatory safety boards should aggregate data and identify patterns which can be communicated to different programs and companies
Assumptions:
- Assumes companies will accept some loss of competitive secrecy for safety transparency
- Assumes public reporting can be done in ways that protect intellectual property while still being meaningful
Risks of Failures and Success:
- Compliance may be partial; some negative preclinical findings could stay hidden in internal reports
- Low-quality data could mislead more than inform
- Highly publicized early safety issues, even if fixable, could erode public trust in otherwise promising tools

Governance Actions vs Policy Goals

	Researchers	Medical professionals	Government Entities (Ex: FDA)	Patients
Enhance Biosecurity
• Escape and persistence of engineered cells	3	1	2	n/a
• Unintended immune suppression hotspots	1	3	2	n/a
Foster Lab and Patient Safety
• By preventing incident	3	2	1	4
• Informed patient consent	4	1	2	3
• Adverse events	3	2	1	4
Protect the environment
• Wasted suture material	3	2	1	4
• Resistance ecology	3	2	1	4
Other considerations
• Equal access	3	2	1	n/a
• Not impede research		2	1	2
• Promote constructive applications	1	2	3	4

1 = most responsibility, 4 = least responsibility

Lecture 2 Preparation Questions

Questions from Professor Jacobson

The error rate for DNAP is about 10⁻⁶ (about 1 in 1 million bases). Since the human genome is roughly 3.2 x 10⁹ bp, this means that there would be around 3,200 errors each time a genome copy is made. However, nature is able to combat these errors through additional error correction mechanisms, such as the MutS repair system.
If we assume that an average human protein has 375 amino acids [7], and that about three codons encode each amino acid, then there are roughly 3³⁷⁵ ≈ 10¹⁷⁹ ways to code for the average human protein. However, some of these codings could be invalid if they don’t have a proper start codon, if they produce unstable mRNA, or if they produce a misfolded protein.
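As a quick check of the arithmetic in the two answers above, the sketch below recomputes both numbers in Python. The variable names are my own, and treating every residue as having exactly three synonymous codons is the same simplification used in the text, so the exponent (which comes out near 179) should be read as an order of magnitude only.

```python
import math

# Values taken from the answers above
error_rate = 1e-6          # polymerase errors per base copied
genome_bp = 3.2e9          # approximate human genome size in bp
expected_errors = error_rate * genome_bp

protein_length = 375       # average human protein length in residues
codons_per_residue = 3     # simplifying assumption: ~3 synonymous codons each
# Work in log space so the huge number never has to be held as a float
log10_codings = protein_length * math.log10(codons_per_residue)

print(f"Expected errors per genome copy: {expected_errors:.0f}")
print(f"Possible codings: about 10^{log10_codings:.0f}")
```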

Questions from Dr. LeProust

Currently, the method typically used for oligo synthesis is solid-phase phosphoramidite chemistry. In each cycle, the protecting group on the 5’ end of the growing, support-bound chain is removed to expose the 5’ hydroxyl, the next phosphoramidite (a modified, 5’-protected version of the nucleotide) couples to it, and an oxidizing solution stabilizes the bond that was just formed. Because the newly added nucleotide is itself 5’-protected, the chain end stays protected until the next cycle, and the process repeats until one obtains the desired oligonucleotide [8].
Oligos longer than about 200 nt are typically too difficult to synthesize because no coupling step is perfectly efficient, so impurities accumulate with each added base and significantly decrease the yield [9].
Coding a gene over 2000bp directly by oligo synthesis would be difficult for the same reason: the yield decreases exponentially with length, and purifying the full-length product becomes difficult.
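To make the exponential yield loss concrete, here is a minimal sketch that assumes every coupling step succeeds with the same probability. The 99% coupling efficiency is an illustrative assumption of mine, not a figure from the cited references, and the function name is hypothetical.

```python
def full_length_yield(length_nt, coupling_efficiency=0.99):
    """Fraction of chains that reach full length.

    Assumes each of the (length_nt - 1) coupling steps succeeds
    independently with the same probability (an idealized model).
    """
    return coupling_efficiency ** (length_nt - 1)

for n in (20, 200, 2000):
    print(f"{n:>5} nt: {full_length_yield(n):.2e} of chains are full length")
```

Under this assumption a 200-mer already loses most of its chains to truncation, and a 2000-mer is effectively unobtainable, which is why longer genes are assembled from shorter oligos instead.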

Questions from George Church

In response to question #2

The NA:NA code relies on pairing G with C and A with T (or U in RNA). This is then translated in the AA:NA code as a three-nucleotide codon that specifies one of the twenty amino acids, which means that a single amino acid can be encoded by multiple codons. In order to create an AA:AA code, which would represent protein-protein interactions, I would anticipate the need to consider 3D structure as well as the properties of each amino acid. For example, a positively charged amino acid, like histidine, would pair best with a negatively charged amino acid, such as glutamic acid. Since there are multiple amino acids with these properties, the code would not have a singular outcome, like NA:NA, but it could be further optimized through 3D structural complementarity [10][11].

Week 2: DNA Read, Write, and Edit

Benchling and In-Silico Gel Art

Simulate Restriction Enzyme Digest

I found this process quite intuitive, as I’ve done similar simulations with the application SnapGene, but it was interesting to notice the small interface differences between the two!

Pattern in the style of Paul Vanouse

I attempted to make a “Y” for Yonsei, but it turned out to be more difficult than I expected and this was the closest I ended up getting… Huge respect to the people who were able to make a more comprehensive image like the ones that spell MIT!

DNA Design Challenge

My Chosen Protein

I chose to explore Calreticulin (CALR) as my protein of interest for this week due to its role as a pro-healing cue in wound healing[1]. CALR typically serves to support the progression through the four wound healing phases (hemostasis, inflammation, proliferation, and remodeling) [2], which is classically disrupted during chronic wounds [3].

Protein sequence[4]:

sp|P27797|CRTC_HUMAN CALRETICULIN PRECURSOR from residues
                     31- 64, Pval= 3.8e-18, (100% identity); putative"
ORIGIN      
        1 mllsvplllg llglavaepa vyfkeqfldg dgwtsrwies khksdfgkfv lssgkfygde
       61 ekdkglqtsq darfyalsas fepfsnkgqt lvvqftvkhe qnidcgggyv klfpnsldqt
      121 dmhgdseyni mfgpdicgpg tkkvhvifny kgknvlinkd irckddefth lytlivrpdn
      181 tyevkidnsq vesgsleddw dflppkkikd pdaskpedwd erakiddptd skpedwdkpe
      241 hipdpdakkp edwdeemdge weppviqnpe ykgewkprqi dnpdykgtwi hpeidnpeys
      301 pdpsiyaydn fgvlgldlwq vksgtifdnf litndeayae efgnetwgvt kaaekqmkdk
      361 qdeeqrlkee eedkkrkeee eaedkedded kdedeedeed keedeeedvp gqakdel

Nucleotide sequence[5]:

atgctgctgagcgtgccgctgctgctgggcctgctgggcctggcggtggcggaaccggcg
gtgtattttaaagaacagtttctggatggcgatggctggaccagccgctggattgaaagc
aaacataaaagcgattttggcaaatttgtgctgagcagcggcaaattttatggcgatgaa
gaaaaagataaaggcctgcagaccagccaggatgcgcgcttttatgcgctgagcgcgagc
tttgaaccgtttagcaacaaaggccagaccctggtggtgcagtttaccgtgaaacatgaa
cagaacattgattgcggcggcggctatgtgaaactgtttccgaacagcctggatcagacc
gatatgcatggcgatagcgaatataacattatgtttggcccggatatttgcggcccgggc
accaaaaaagtgcatgtgatttttaactataaaggcaaaaacgtgctgattaacaaagat
attcgctgcaaagatgatgaatttacccattataccctgattgtgcgcccggataacacc
tatgaagtgaaaattgataacagccaggtggaaagcggcagcctggaagatgattgggat
tttctgccgccgaaaaaaattaaagatccggatgcgagcaaaccggaagattgggatgaa
cgcgcgaaaattgatgatccgaccgatagcaaaccggaagattgggataaaccggaacat
attccggatccggatgcgaaaaaaccggaagattgggatgaagaaatggatggcgaatgg
gaaccgccggtgattcagaacccggaatataaaggcgaatggaaaccgcgccagattgat
aacccggattataaaggcacctggattcatccggaaattgataacccggaatatagcccg
gatccgagcatttatgcgtatgataactttggcgtgctgggcctggatctgtggcaggtg
aaaagcggcaccatttttgataactttctgattaccaacgatgaagcgtatgcggaagaa
tttggcaacgaaacctggggcgtgaccaaagcggcggaaaaacagatgaaagataaacag
gatgaagaacagcgcctgaaagaagaagaagaagataaaaaacgcaaagaagaagaagaa
gcggaagataaagaagatgatgaagataaagatgaagatgaagaagatgaagaagataaa
gaagaagatgaagaagaagatgtgccgggccaggcgaaagatgaactgtaa

Codon optimization[6]:

ATGCTCCTGTCCGTGCCCCTGCTGCTGGGCCTGCTGGGGCTCGCCGTGGCTGAGCCCGCC
GTGTACTTCAAGGAGCAGTTCCTGGACGGCGATGGCTGGACATCCAGATGGATCGAGTCT
AAGCATAAGTCCGACTTCGGCAAGTTCGTGCTGTCCAGCGGGAAGTTCTATGGGGACGAG
GAGAAGGACAAAGGCCTGCAGACCTCACAGGACGCAAGATTCTATGCCCTTAGCGCCAGC
TTCGAGCCCTTCTCAAACAAAGGGCAGACTCTGGTGGTGCAGTTCACTGTGAAGCATGAG
CAGAACATTGATTGCGGCGGCGGCTACGTGAAGCTGTTTCCTAATAGCCTGGATCAGACA
GACATGCACGGGGACAGCGAGTATAACATCATGTTCGGCCCAGACATTTGCGGCCCAGGC
ACTAAGAAGGTGCACGTGATTTTCAATTATAAAGGCAAAAACGTGCTGATCAATAAAGAC
ATTAGGTGTAAGGATGACGAGTTCACCCATTACACCCTGATCGTGCGCCCCGACAACACC
TACGAGGTGAAGATCGACAACTCACAGGTGGAGAGCGGGAGCCTGGAGGACGACTGGGAC
TTTCTGCCACCAAAGAAGATTAAGGACCCCGACGCCTCCAAGCCCGAGGACTGGGACGAG
CGGGCCAAAATCGACGATCCAACAGATTCAAAGCCCGAAGACTGGGATAAGCCTGAGCAC
ATCCCCGACCCAGACGCAAAGAAGCCTGAAGACTGGGACGAGGAGATGGACGGCGAGTGG
GAGCCCCCTGTGATCCAGAACCCCGAGTACAAGGGGGAGTGGAAGCCAAGGCAGATTGAC
AACCCCGACTACAAAGGCACTTGGATTCACCCTGAGATCGACAACCCCGAATATTCACCC
GACCCCTCTATCTACGCCTACGACAATTTCGGGGTGCTGGGCCTGGACCTGTGGCAGGTG
AAGAGCGGCACCATCTTCGACAATTTCCTGATCACAAACGACGAGGCCTACGCCGAAGAG
TTCGGCAATGAGACATGGGGCGTGACCAAAGCCGCCGAGAAGCAGATGAAGGACAAGCAA
GACGAGGAGCAGCGCCTGAAAGAGGAGGAGGAGGACAAAAAGCGCAAGGAGGAGGAGGAA
GCCGAGGACAAAGAAGACGACGAGGATAAGGACGAGGATGAAGAGGACGAAGAAGACAAG
GAGGAGGACGAGGAGGAAGATGTCCCCGGACAGGCCAAGGACGAGCTGTGA

Codon optimization is essential to ensuring proper and efficient expression of a given protein within a specific organism. Different organisms typically favor different codons for the same amino acid [7], which is why optimizing for the specific organism you intend to use ensures that frequently used codons, rather than rarely used ones, are encoded for your expression host of choice [8].
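One sanity check worth running after codon optimization is confirming that the optimized DNA still encodes the same protein. The sketch below does this for the first 60 nt of the two sequences listed above, using the standard genetic code built from its usual compact TCAG ordering. The function and variable names are my own, and this is only an illustration, not the tool used to generate the optimized sequence.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def codons(dna):
    """Split a coding sequence into upper-case triplets."""
    dna = dna.upper()
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]

def translate(dna):
    """Translate a coding sequence with the standard genetic code."""
    return "".join(CODON_TABLE[c] for c in codons(dna))

# First 60 nt of the back-translated and codon-optimized CALR sequences above
native = "atgctgctgagcgtgccgctgctgctgggcctgctgggcctggcggtggcggaaccggcg"
optimized = "ATGCTCCTGTCCGTGCCCCTGCTGCTGGGCCTGCTGGGGCTCGCCGTGGCTGAGCCCGCC"

same_protein = translate(native) == translate(optimized)
changed = sum(a != b for a, b in zip(codons(native), codons(optimized)))
print(translate(optimized), same_protein, changed)
```

Running the same comparison on the full-length sequences would confirm that every codon swap is synonymous.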

In this case, I think that the most applicable technology to produce CALR would be HEK293T cells. These cells are human-derived and replicate quickly, meaning that they would be able to produce this protein with great efficiency. To do this, I would first need to clone my protein-coding insert into a mammalian expression vector with a strong promoter, which would then be transfected into the HEK293T cells (for example, by lipofection). Within the cells, RNA polymerase recognizes the promoter and transcribes the insert into mRNA. This mRNA is then translated into protein by the ribosome.

Prepare a Twist DNA Synthesis Order

I prepared my optimized gene within Benchling for Twist [9]. This order page was so simple compared to other ordering sites I used, and I liked that you could export the entire plasmid as well.

DNA Read/Write/Edit

DNA Read

On a similar theme as my previous assignments, one of my main interests is in finding the key molecular mechanisms and differences that distinguish successful wound repair from chronic, non-healing counterparts. For this reason, I’d be interested in comparing the mitochondrial genomes of healthy (efficiently healing) and chronic wound patients, as mitochondria have been shown to play a central role in wound metabolism [10][11]. Sequencing these genomes and contrasting the two may reveal variants that predispose individuals to, or make them more vulnerable to, chronic wound development.

In order to achieve this goal, I would use Oxford Nanopore sequencing, which is a third-generation sequencing technology. Since the main goal would be to read mitochondrial DNA specifically, the first step would be to extract the DNA from a wound tissue or normal tissue sample and quantify it. Next, I would perform long-amplicon PCR to enrich the mitochondrial DNA, then add an A-tail so that sequencing adapters can ligate efficiently. Following this, the sequencing adapters are attached and sequencing is started. To decode the bases of the DNA sample, Oxford Nanopore sequencing measures the ionic current through a nanopore as the DNA strand passes through it. Since different bases disrupt the current to different degrees, the sequence can be traced by decoding the current signal over time. The final output of this sequencing technology is a FASTQ file that includes the DNA sequence along with a per-base quality score.
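To illustrate what the FASTQ output looks like in practice, here is a minimal sketch of reading a record and converting its quality string into per-base Phred scores. It assumes the common four-lines-per-record layout and Phred+33 encoding; the read name and bases in the example are made up for illustration, and the function names are my own.

```python
import io

def read_fastq(handle):
    """Yield (read_id, sequence, phred_scores) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:
            break
        sequence = handle.readline().strip()
        handle.readline()                      # '+' separator line, ignored
        quality = handle.readline().strip()
        # Phred+33: each character's ASCII code minus 33 is the quality score
        scores = [ord(ch) - 33 for ch in quality]
        yield header[1:], sequence, scores

def error_probability(q):
    """Convert a Phred score into the estimated chance the base call is wrong."""
    return 10 ** (-q / 10)

# Hypothetical record, not real mitochondrial data
example = "@read_1\nGATTACA\n+\nIIII5+!\n"
records = list(read_fastq(io.StringIO(example)))

for read_id, sequence, scores in records:
    mean_q = sum(scores) / len(scores)
    print(read_id, sequence, scores, f"mean Q = {mean_q:.1f}")
```

In a real comparison of healthy and chronic wound samples, low-quality reads or bases would typically be filtered on these scores before calling variants.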

AI citation

Perplexity - “Can you explain in simple terms how Oxford Nanopore sequencing is prepared and what is the outcome?”

DNA Write

One of the long-term projects I’ve been working on at the Designer Cells lab is synthesizing a genetic circuit for chronic wounds. Since the main obstacle to healing chronic wounds is their persistent inflammation, I designed a FLEx (Flip Excision) switch [12] that first expresses an anti-inflammatory gene set and then switches irreversibly to a migratory gene set after sensing a biomarker indicative of the end of the inflammation phase of wound healing.

Since this genetic circuit insert ended up being close to 4.5 kb, I think that the most efficient method to synthesize it would be an enzymatic synthesis approach. This synthesis method is similar to how primers are synthesized. One limitation of this method is that it has limited ability to synthesize longer strands of DNA.

DNA Edit

One DNA edit I’ve been looking at is modifying Calreticulin to add domains that recognize damage-associated extracellular matrix patterns. This would allow the engineered protein to more effectively seek out damaged tissues (such as in the case of chronic wounds) in order to facilitate wound healing.

To achieve this, I could use CRISPR-Cas HDR. I would need to design a gRNA that targets the desired insertion site, as well as a donor template carrying the edits that need to be made. Then, I would need to ensure that these components are delivered to the cells of my choosing. The potential limitations revolve around the structural integrity of the edited protein as well as the efficiency of HDR.
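As a rough illustration of the gRNA design step, the sketch below scans a DNA sequence for candidate target sites. It assumes an SpCas9-style system with a 20-nt protospacer followed by an NGG PAM, which is my assumption since the text does not name a specific Cas enzyme. It only searches the forward strand, the function name is hypothetical, and a real design would also score off-target risk and distance to the insertion site.

```python
import re

def find_protospacers(dna, spacer_len=20):
    """Return (start, protospacer, pam) for every forward-strand site
    where a spacer_len-nt protospacer is followed by an NGG PAM."""
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]

# Toy sequence for illustration only, not part of the CALR gene
toy_sequence = "A" * 20 + "TGG"
for start, protospacer, pam in find_protospacers(toy_sequence):
    print(start, protospacer, pam)
```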

Week 3: Lab Automation

Python script for Opentrons artwork

For the art portion of this week’s assignment, I decided to code Yoshi from Super Mario Brothers since the Designer Cells node only had the red and green colors. I used this photo as reference. From there, I started to code for the Opentrons automation.

from opentrons import types

metadata = {    # see https://docs.opentrons.com/v2/tutorial.html#tutorial-metadata
    'author': 'Sydney',
    'protocolName': 'Yoshi',
    'description': 'Prints Yoshi',
    'source': 'HTGAA 2026 Opentrons Lab',
    'apiLevel': '2.20'
}

##############################################################################
###   Robot deck setup constants - don't change these
##############################################################################

TIP_RACK_DECK_SLOT = 9
COLORS_DECK_SLOT = 6
AGAR_DECK_SLOT = 5
PIPETTE_STARTING_TIP_WELL = 'A1'

well_colors = {
    'A1' : 'Red',
    'B1' : 'Green',
    'C1' : 'Orange'
}

sfgfp_points = [(0, 34),(2, 34),(4, 34),(6, 34),(0, 32),(2, 32),(4, 32),(6, 32),(-4, 30),(-2, 30),(0, 30),(2, 30),(-4, 28),(-2, 28),(0, 28),(2, 28),(-4, 26),(-2, 26),(12, 26),(14, 26),(16, 26),(18, 26),(20, 26),(22, 26),(-4, 24),(-2, 24),(12, 24),(14, 24),(16, 24),(18, 24),(20, 24),(22, 24),(-8, 22),(-6, 22),(-4, 22),(-2, 22),(8, 22),(10, 22),(12, 22),(14, 22),(16, 22),(18, 22),(24, 22),(26, 22),(-8, 20),(-6, 20),(-4, 20),(-2, 20),(8, 20),(10, 20),(12, 20),(14, 20),(16, 20),(18, 20),(24, 20),(26, 20),(-8, 18),(-6, 18),(-4, 18),(-2, 18),(0, 18),(2, 18),(4, 18),(6, 18),(8, 18),(10, 18),(12, 18),(14, 18),(16, 18),(18, 18),(20, 18),(22, 18),(24, 18),(26, 18),(-8, 16),(-6, 16),(-4, 16),(-2, 16),(0, 16),(2, 16),(4, 16),(6, 16),(8, 16),(10, 16),(12, 16),(14, 16),(16, 16),(18, 16),(20, 16),(22, 16),(24, 16),(26, 16),(-12, 14),(-10, 14),(-8, 14),(-6, 14),(8, 14),(10, 14),(12, 14),(14, 14),(16, 14),(18, 14),(20, 14),(22, 14),(24, 14),(26, 14),(-12, 12),(-10, 12),(-8, 12),(-6, 12),(8, 12),(10, 12),(12, 12),(14, 12),(16, 12),(18, 12),(20, 12),(22, 12),(24, 12),(26, 12),(-12, 10),(-10, 10),(8, 10),(10, 10),(12, 10),(14, 10),(16, 10),(18, 10),(20, 10),(22, 10),(24, 10),(26, 10),(-12, 8),(-10, 8),(8, 8),(10, 8),(12, 8),(14, 8),(16, 8),(18, 8),(20, 8),(22, 8),(24, 8),(26, 8),(12, 6),(14, 6),(16, 6),(18, 6),(20, 6),(22, 6),(12, 4),(14, 4),(16, 4),(18, 4),(20, 4),(22, 4),(8, 2),(10, 2),(8, 0),(10, 0),(-4, -2),(-2, -2),(0, -2),(2, -2),(4, -2),(6, -2),(-4, -4),(-2, -4),(0, -4),(2, -4),(4, -4),(6, -4),(-24, -6),(-22, -6),(-4, -6),(-2, -6),(0, -6),(2, -6),(4, -6),(6, -6),(12, -6),(14, -6),(16, -6),(18, -6),(-24, -8),(-22, -8),(-4, -8),(-2, -8),(0, -8),(2, -8),(4, -8),(6, -8),(12, -8),(14, -8),(16, -8),(18, -8),(-24, -10),(-22, -10),(-20, -10),(-18, -10),(-8, -10),(-6, -10),(-4, -10),(-2, -10),(0, -10),(2, -10),(4, -10),(6, -10),(8, -10),(10, -10),(12, -10),(14, -10),(16, -10),(18, -10),(-24, -12),(-22, -12),(-20, -12),(-18, -12),(-8, -12),(-6, -12),(-4, -12),(-2, -12),(0, -12),(2, 
-12),(4, -12),(6, -12),(8, -12),(10, -12),(12, -12),(14, -12),(16, -12),(18, -12),(-20, -14),(-18, -14),(-16, -14),(-14, -14),(-12, -14),(-10, -14),(-8, -14),(-6, -14),(-4, -14),(-2, -14),(0, -14),(-20, -16),(-18, -16),(-16, -16),(-14, -16),(-12, -16),(-10, -16),(-8, -16),(-6, -16),(-4, -16),(-2, -16),(0, -16),(-16, -18),(-14, -18),(-12, -18),(-10, -18),(-8, -18),(-6, -18),(-4, -18),(-2, -18),(0, -18),(-16, -20),(-14, -20),(-12, -20),(-10, -20),(-8, -20),(-6, -20),(-4, -20),(-8, -22),(-6, -22),(-4, -22),(-2, -22),(0, -22),(2, -22),(4, -22),(-8, -24),(-6, -24),(-4, -24),(-2, -24),(0, -24),(2, -24),(4, -24)]
mrfp1_points = [(6, 30),(6, 28),(4, 26),(6, 26),(4, 24),(6, 24),(4, 22),(6, 22),(20, 22),(22, 22),(0, 20),(2, 20),(4, 20),(6, 20),(20, 20),(22, 20),(-12, 18),(-10, 18),(-12, 16),(-10, 16),(-16, 14),(-14, 14),(-4, 14),(-2, 14),(0, 14),(2, 14),(6, 14),(-16, 12),(-14, 12),(-4, 12),(6, 12),(-16, 10),(-14, 10),(-8, 10),(-6, 10),(6, 10),(-16, 8),(-14, 8),(-8, 8),(6, 8),(-12, 6),(-10, 6),(10, 6),(-12, 4),(-10, 4),(8, 4),(10, 4),(-12, 2),(-10, 2),(-8, 2),(-6, 2),(-12, 0),(-10, 0),(-8, 0),(-6, 0),(-2, 0),(0, 0),(2, 0),(4, 0),(6, 0),(-8, -2),(-6, -2),(10, -2),(-8, -4),(-6, -4),(10, -4),(-16, -6),(-14, -6),(-12, -6),(-10, -6),(-6, -6),(10, -6),(-16, -8),(-14, -8),(-12, -8),(-10, -8),(-6, -8),(10, -8),(-16, -12),(-14, -12),(-12, -12),(-10, -12),(-24, -14),(10, -14),(-24, -16),(-22, -16),(10, -16),(-20, -18),(8, -18),(-20, -20),(-18, -20),(6, -20),(8, -20),(-16, -22),(-16, -24),(-14, -24),(-12, -24),(-10, -24),(-8, -26),(-6, -26),(-4, -26),(-2, -26),(0, -26),(-8, -28),(-6, -28),(-4, -28),(-2, -28),(0, -28),(-8, -30),(-6, -30),(-4, -30),(-2, -30),(0, -30),(2, -30),(4, -30),(-8, -32),(-6, -32),(-4, -32),(-2, -32),(0, -32),(2, -32),(4, -32)]


def run(protocol):
  ##############################################################################
  ###   Load labware, modules and pipettes
  ##############################################################################

  # Tips
  tips_20ul = protocol.load_labware('opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Opentrons 20uL Tips')

  # Pipettes
  pipette_20ul = protocol.load_instrument("p20_single_gen2", "right", [tips_20ul])

  # Modules
  temperature_module = protocol.load_module('temperature module gen2', COLORS_DECK_SLOT)

  # Temperature Module Plate
  temperature_plate = temperature_module.load_labware('opentrons_96_aluminumblock_generic_pcr_strip_200ul',
                                                      'Cold Plate')
  # Choose where to take the colors from
  color_plate = temperature_plate

  # Agar Plate
  agar_plate = protocol.load_labware('htgaa_agar_plate', AGAR_DECK_SLOT, 'Agar Plate')  ## TA MUST CALIBRATE EACH PLATE!
  # Get the top-center of the plate, make sure the plate was calibrated before running this
  center_location = agar_plate['A1'].top()

  pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)

  ##############################################################################
  ###   Patterning
  ##############################################################################

  ###
  ### Helper functions for this lab
  ###

  # pass this e.g. 'Red' and get back a Location which can be passed to aspirate()
  def location_of_color(color_string):
    for well,color in well_colors.items():
      if color.lower() == color_string.lower():
        return color_plate[well]
    raise ValueError(f"No well found with color {color_string}")

  # For this lab, instead of calling pipette.dispense(1, loc) use this: dispense_and_detach(pipette, 1, loc)
  def dispense_and_detach(pipette, volume, location):
      """
      Move laterally 5mm above the plate (to avoid smearing a drop); then drop down to the plate,
      dispense, move back up 5mm to detach drop, and stay high to be ready for next lateral move.
      5mm because a 4uL drop is 2mm diameter; and a 2deg tilt in the agar pour is >3mm difference across a plate.
      """
      assert(isinstance(volume, (int, float)))
      above_location = location.move(types.Point(z=5))  # offset is relative, so this is 5mm above
      pipette.move_to(above_location)       # Go to 5mm above the dispensing location
      pipette.dispense(volume, location)    # Go straight downwards and dispense
      pipette.move_to(above_location)       # Go straight up to detach drop and stay high

  ###
  ### YOUR CODE HERE to create your design
  ###


  # -----------------------------
  # Printing parameters
  # -----------------------------
  VOL_PER_DOT = 0.50

  # Keep aspirates comfortably below 20uL for accuracy/safety
  MAX_ASPIRATE_UL = 18.0
  MAX_BATCH_DOTS = int(MAX_ASPIRATE_UL // VOL_PER_DOT)  # 18.0 // 0.50 = 36

  # Choose where on Z you actually want to dispense (offset in mm).
  # 0 means "at agar_plate['A1'].top() plane"; positive values are above that plane.
  # This run uses +2. If your drops need to touch the agar more, lower it step by step after testing.
  DISPENSE_DZ = 2

  def point_location_from_center(dx, dy, dz=DISPENSE_DZ):
      # Offsets are in mm
      return center_location.move(types.Point(x=dx, y=dy, z=dz))

  def print_points(points, color_name):
      pipette_20ul.pick_up_tip()

      i = 0
      while i < len(points):
          batch = points[i:i + MAX_BATCH_DOTS]
          batch_volume = len(batch) * VOL_PER_DOT

          # Pull enough dye for this batch
          pipette_20ul.aspirate(batch_volume, location_of_color(color_name))

          # Dispense each dot
          for (dx, dy) in batch:
              loc = point_location_from_center(dx, dy)
              dispense_and_detach(pipette_20ul, VOL_PER_DOT, loc)

          i += MAX_BATCH_DOTS

      pipette_20ul.drop_tip()

  # -----------------------------
  # Print your two datasets
  # -----------------------------
  print_points(sfgfp_points, "Green")
  print_points(mrfp1_points, "Red")

  # Don't forget to end with a drop_tip()

This code successfully resulted in the following image.
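The batching arithmetic in print_points can be checked off-robot before a run. The sketch below is a standalone helper (plan_batches is a hypothetical name, not part of the Opentrons API) that mirrors the same slicing logic and reports the volume of each aspirate, assuming the same 0.50 uL per dot and 18 uL ceiling used above.

```python
def plan_batches(n_points, vol_per_dot=0.50, max_aspirate_ul=18.0):
    """Return the aspirate volume (uL) for each batch, mirroring print_points."""
    max_batch_dots = int(max_aspirate_ul // vol_per_dot)
    if max_batch_dots < 1:
        raise ValueError("vol_per_dot is larger than max_aspirate_ul")
    volumes = []
    i = 0
    while i < n_points:
        dots_in_batch = min(max_batch_dots, n_points - i)
        volumes.append(dots_in_batch * vol_per_dot)
        i += max_batch_dots
    return volumes


# Hypothetical design with 100 dots of one color
print(plan_batches(100))
```

Passing len(sfgfp_points) or len(mrfp1_points) shows how many aspirations each color needs and confirms that no single aspirate exceeds the ceiling.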

Lab automation questions

Torchia, E., et al. Fabrication of cell culture hydrogels by robotic liquid handling automation for high-throughput drug testing. Commun Eng 4, 222 (2025)

Cell-based assays, typically used for drug screening, are limited by their reliance on rigid substrates, which can distort results. Planar hydrogels have been shown to be a promising solution, but achieving uniform thin hydrogel layers remains a technical limitation. In this paper, Torchia et al. explore the use of Opentrons to uniformly produce hydrogels for drug testing. Their methodology, HYDRA (HYDrogels by Robotic liquid-handling Automation), provides a scalable and automated way to generate uniform micrometric planar hydrogels directly within standardized plates. This protocol preserved canonical drug responsiveness while providing a reproducible, biomimetic substrate for high-content pharmacological imaging.

For my own project, I envision using the Opentrons to aid in the transformation process. Each of my final project ideas requires high-throughput screening of candidates, which could be automated using the Opentrons system.

Final project ideas

Week 4: Protein Design Pt 1

Conceptual Questions

(Question 1) A 500g piece of meat would weigh about 3.011x10²⁶ Daltons, and since each amino acid weighs about 100 Daltons, consuming this piece of meat means consuming about 3.011x10²⁴ amino acids (treating the meat as pure protein).
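The arithmetic behind this estimate can be written out as a short sketch. It relies on 1 g being equal to Avogadro's number of Daltons. The protein_fraction parameter is an assumption added here for illustration: the answer above effectively uses 1.0, while real meat is mostly water, so a smaller fraction would lower the count.

```python
AVOGADRO = 6.022e23          # Daltons per gram
MEAN_RESIDUE_MASS_DA = 100   # rough average mass of one amino acid, in Daltons


def amino_acids_in(mass_g, protein_fraction=1.0):
    """Estimate the number of amino acids in a sample of the given mass."""
    total_daltons = mass_g * AVOGADRO * protein_fraction
    return total_daltons / MEAN_RESIDUE_MASS_DA


print(f"{amino_acids_in(500):.3e} amino acids")
```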

(Question 2) When we eat sources of meat, we physically and enzymatically break down the proteins into amino acids, fatty acids, and sugars, which in turn are used to provide energy to our bodies.

(Question 3) The 20 natural amino acids that are used most regularly today represent the 20 amino acids that contributed to early evolution and allowed for both efficiency and redundancy in order to prevent disastrous mutations [1].

(Question 5) The Miller-Urey experiment [2] famously showed that amino acids can form abiotically under conditions simulating the early Earth’s atmosphere, supporting the idea that they first arose through chemical synthesis.

(Question 6) Most natural proteins are built from L-amino acids, which form right-handed alpha helices. Since D-amino acids are mirror images of L-amino acids, they would form left-handed alpha helices.

(Question 7) There are other helical conformations besides the alpha helix, such as the pi helix, which is a wider helix [3]. New helical structures can also be designed using synthetic biology by incorporating non-canonical amino acids to create new hydrogen-bonding patterns [4].

(Question 8) Most molecular helices are right-handed because life mostly uses L-amino acids, for which the right-handed helix is the more stable conformation.

(Question 9) Beta sheets tend to aggregate because their extended backbone is exposed, which means that hydrogen bonds can form more easily between strands without steric hindrance. These hydrogen bonds are the driving force for beta sheet aggregation.

(Question 10) Proteins involved in amyloid diseases often form beta sheets because of their ability to aggregate and their extreme stability [5]. These same properties, strength and the ability to self-assemble, also make amyloid beta sheets attractive material candidates.

Protein Analysis and Visualization

I wanted to expand upon what I had done in week 2 with Calreticulin (CALR), since I also used it as the premise of one of my final project ideas. CALR is a pro-healing cue in wound healing [6] and supports the progression through the four wound healing phases (hemostasis, inflammation, proliferation, and remodeling) [7], which is classically disrupted during chronic wounds [8].

Protein sequence[9]:

sp|P27797|CRTC_HUMAN CALRETICULIN PRECURSOR
ORIGIN      
        1 mllsvplllg llglavaepa vyfkeqfldg dgwtsrwies khksdfgkfv lssgkfygde
       61 ekdkglqtsq darfyalsas fepfsnkgqt lvvqftvkhe qnidcgggyv klfpnsldqt
      121 dmhgdseyni mfgpdicgpg tkkvhvifny kgknvlinkd irckddefth lytlivrpdn
      181 tyevkidnsq vesgsleddw dflppkkikd pdaskpedwd erakiddptd skpedwdkpe
      241 hipdpdakkp edwdeemdge weppviqnpe ykgewkprqi dnpdykgtwi hpeidnpeys
      301 pdpsiyaydn fgvlgldlwq vksgtifdnf litndeayae efgnetwgvt kaaekqmkdk
      361 qdeeqrlkee eedkkrkeee eaedkedded kdedeedeed keedeeedvp gqakdel

This protein sequence is 417 amino acids long and the most frequent amino acid is aspartate (D), which appears 55 times. There are 250 identified homologs [10]. Calreticulin belongs to the calreticulin family of proteins, which are highly conserved ER-resident chaperones [11].
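The length and composition can be checked directly from the pasted record. The sketch below strips the position numbers and spaces from the ORIGIN block shown above and counts residues with the standard library; parse_origin is a hypothetical helper name.

```python
from collections import Counter

GENBANK_ORIGIN = """
        1 mllsvplllg llglavaepa vyfkeqfldg dgwtsrwies khksdfgkfv lssgkfygde
       61 ekdkglqtsq darfyalsas fepfsnkgqt lvvqftvkhe qnidcgggyv klfpnsldqt
      121 dmhgdseyni mfgpdicgpg tkkvhvifny kgknvlinkd irckddefth lytlivrpdn
      181 tyevkidnsq vesgsleddw dflppkkikd pdaskpedwd erakiddptd skpedwdkpe
      241 hipdpdakkp edwdeemdge weppviqnpe ykgewkprqi dnpdykgtwi hpeidnpeys
      301 pdpsiyaydn fgvlgldlwq vksgtifdnf litndeayae efgnetwgvt kaaekqmkdk
      361 qdeeqrlkee eedkkrkeee eaedkedded kdedeedeed keedeeedvp gqakdel
"""


def parse_origin(block):
    """Keep only the letters (drops position numbers, spaces, and newlines)."""
    return "".join(ch for ch in block if ch.isalpha()).upper()


sequence = parse_origin(GENBANK_ORIGIN)
counts = Counter(sequence)

print("length:", len(sequence))
print("most common residues:", counts.most_common(3))
```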

This structure was determined in 2011 and has a resolution of 1.65 Å. It belongs to the Concanavalin A-like lectins/glucanases structural classification family [12].

I’ve never used PyMOL before, so it was quite interesting to explore the different functions.

First I visualized CALR in the cartoon, ribbon, and ball-and-stick representations.

Next, I colored it by the secondary structures and noticed that the structure is mostly made up of beta sheets.

I wasn’t able to figure out how to color the structure by residues.

By looking at the surface of CALR, I determined that the binding pocket from this part of the protein was most likely the hook part of the ‘upside down L’.

Using ML-Based Protein Design Tools

Protein Language Modeling

I once again decided to use the crystal structure of the calreticulin lectin domain as my model protein for this part of the homework.

1. Mutation Scan Heatmap

In the leucine row, it appears that most leucine mutations would be tolerated, as shown by its bright yellow to green band.

2. Latent Space Analysis

Protein Folding

The first image is the structure generated by ESMFold and the second is the experimentally determined structure; the two seem to match quite well.

Bacteriophage Engineering

GOAL: Increase the thermal and structural stability of the MS2 L lysis protein

To achieve this goal, we’ll need to ensure that the L protein is less likely to denature within the cytosol while still preserving (if not optimizing) the L protein’s interactions with DnaJ.

With this goal in mind, I began to brainstorm which tools we could use:

Tool	Explanation
ESM-2	Helps to generate single- and multi-residue substitution variants with scores of predicted stability, telling us which residues are more sensitive to mutation
Conservation analysis	Conserved residues (found from public databases like UniProt/NCBI) are most likely to be essential to structure and function of the protein, meaning that they should remain the same in our iteration as well
AlphaFold2	Predict the structures of our top candidate mutants and verify whether the essential folds are maintained
AlphaFold-Multimer	Model the L protein with DnaJ to compare the original versus our mutant, which will help to verify whether the protein-chaperone interactions are maintained or improved
BLAST	Identify structurally homologous proteins for natural templates in which positions to mutate

Some potential pitfalls that I could predict:

The L protein is short and phage lysis proteins are poorly characterized overall, so the conservation analysis may not be reliable, which is why we also need to include structural homologs and not just sequence homologs
The ESM-2 score doesn’t model chaperone-dependent folding, which means that the predicted score may not directly translate to stability in the cell

Week 5: Protein Design Pt 2

SOD1 binder peptide design

Generate binders with PepMLM

The original SOD1 sequence[1] is as follows:

>sp|P00441|SODC_HUMAN Superoxide dismutase [Cu-Zn] OS=Homo sapiens OX=9606 GN=SOD1 PE=1 SV=2
MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS
AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV
HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

The A4V mutation changes the alanine at position 4 to valine (numbered without the initiator methionine, so it is the fifth letter of the sequence above), which results in

MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS
AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV
HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
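The substitution can also be applied programmatically, which makes the numbering convention explicit. This is a minimal sketch: apply_mutation and its numbering_skips_met flag are hypothetical names, and the flag encodes the assumption that mutation names such as A4V count from the residue after the initiator methionine.

```python
SOD1_WT = (
    "MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS"
    "AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV"
    "HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ"
)


def apply_mutation(seq, mutation, numbering_skips_met=True):
    """Apply a point mutation written like 'A4V' to a protein sequence."""
    ref, pos, alt = mutation[0], int(mutation[1:-1]), mutation[-1]
    # With Met excluded from the numbering, residue N sits at 0-based index N.
    index = pos if numbering_skips_met else pos - 1
    if seq[index] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[index]}")
    return seq[:index] + alt + seq[index + 1:]


sod1_a4v = apply_mutation(SOD1_WT, "A4V")
print(sod1_a4v[:11])
```

The reference-residue check catches an off-by-one in the numbering: with numbering_skips_met=False the function would find K at position 4 and raise an error instead of silently mutating the wrong residue.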

Binder	Pseudo Perplexity
WRYPVAGLAHWE	15.095015
WRYPVAAVAHKE	11.070548
WRYPAVALRHKK	16.016262
WRYGPAALAWGE	12.227762
FLYRWLPSRRGG	-

Evaluate binders with AlphaFold3

Binder	ipTM score	Where peptide binds
WRYPVAGLAHWE	0.3	near the helix
WRYPVAAVAHKE	0.33	closer to the beta barrel
WRYPAVALRHKK	0.28	along the beta barrel
WRYGPAALAWGE	0.31	within a helix cavity
FLYRWLPSRRGG	0.29	next to the beta barrel

All of the ipTM values were higher than that of the known binder (0.29) except for the third one, and all five fell within a narrow range (0.28 to 0.33). Because every score is low, this is weak evidence at best, but it suggests that my predicted binders may bind about as well as, or slightly better than, the known binder.

Evaluate properties of generated peptides in the PeptiVerse

Binder	Solubility	Hemolysis	Binding Affinity	MW	Net Charge (pH 7)
WRYPVAGLAHWE	Soluble	Non-hemolytic	Weak binding (5.888 pKd/pKi)	1484.7 Da	-0.15
WRYPVAAVAHKE	Soluble	Non-hemolytic	Weak binding (5.298 pKd/pKi)	1426.6 Da	0.85
WRYPAVALRHKK	Soluble	Non-hemolytic	Weak binding (5.522 pKd/pKi)	1524.8 Da	3.84
WRYGPAALAWGE	Soluble	Non-hemolytic	Weak binding (6.051 pKd/pKi)	1376.5 Da	-0.23
FLYRWLPSRRGG	Soluble	Non-hemolytic	Weak binding (5.968 pKd/pKi)	1507.7 Da	2.76

I was surprised to see that, apart from the fourth generated binder, the ipTM score had an inverse relationship with the binding affinity predicted by PeptiVerse. All of my binders were once again fairly similar. For these reasons, I chose to further explore the fourth peptide.
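The relationship between the two scores can be quantified with a Pearson correlation over the values in the two tables above. This is only a rough sketch: with five peptides the coefficient is very sensitive to any single point, so it describes the trend in this small set rather than establishing one.

```python
from math import sqrt

# (peptide, ipTM from AlphaFold3, predicted pKd/pKi from PeptiVerse)
rows = [
    ("WRYPVAGLAHWE", 0.30, 5.888),
    ("WRYPVAAVAHKE", 0.33, 5.298),
    ("WRYPAVALRHKK", 0.28, 5.522),
    ("WRYGPAALAWGE", 0.31, 6.051),
    ("FLYRWLPSRRGG", 0.29, 5.968),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


iptm = [r[1] for r in rows]
affinity = [r[2] for r in rows]
r_all = pearson(iptm, affinity)
print(f"Pearson r (all five peptides): {r_all:.2f}")
```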

Generate optimized peptides with moPPIt

Using moPPIt, I generated four new binders with the following properties.

Binder	Hemolysis	Solubility	Affinity
KRDKQKKKTCYV	0.983	0.917	7.452
GGHTRTRSHTYI	0.967	0.917	6.118
KYDEKEETCKQL	0.826	0.917	6.885
KRRGRKRKKTSE	0.966	1.0	7.122

These generated peptides have a higher predicted affinity than all of the previously generated ones.

Week 6: Genetic Circuits Pt 1

DNA Assembly

(Question 1) Within Phusion High-Fidelity Master Mix [1], there is a Phusion DNA Polymerase (which enzymatically synthesizes the DNA in the 5’ to 3’ direction), nucleotides (the building blocks of the synthesized DNA), and an optimized reaction buffer (maintains optimal conditions for the polymerase).

(Question 2) Some factors that determine primer annealing temperature are the primer length, the GC content, and the salt concentration.
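The effect of length and GC content can be illustrated with the Wallace rule, a rough estimate for short primers: Tm ≈ 2(A+T) + 4(G+C) in °C. The sketch below does not include salt concentration, which more complete nearest-neighbor models account for, and the primer is a hypothetical example.

```python
def wallace_tm(primer):
    """Rough melting temperature (deg C) of a short primer via the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    if at + gc != len(primer):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


def gc_fraction(primer):
    """Fraction of bases that are G or C."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


example_primer = "ATGGTGAGCAAGGGCGAGGA"  # hypothetical 20-nt primer
print(wallace_tm(example_primer), "deg C at", f"{gc_fraction(example_primer):.0%} GC")
```

Lengthening a primer or raising its GC content both increase the estimate, which matches the first two factors listed above.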

(Question 3) While PCR and restriction enzyme digest both produce linear DNA, they differ in what they require. PCR requires the design of fragment-specific primers and relies on a polymerase to synthesize the specified DNA fragment. Restriction enzyme digest requires restriction enzyme sites in appropriate locations within the plasmid as well as incubation with the restriction enzyme itself.

(Question 4) To ensure that a DNA sequence is appropriate for Gibson cloning, it is essential to check that its ends contain overlaps (~20-40 bp) that are homologous to the fragment you want to combine it with.

(Question 5) When transforming E. coli, the plasmid enters the bacterial cell through pores in the membrane, which are made chemically or through heat shock [2].

(Question 6) Golden Gate is similar to Gibson assembly in the sense that it combines two or more fragments. Golden Gate, however, uses type IIS endonucleases to make single-stranded, fragment-specific overhangs. This process hinges on the creation of primers that will not only amplify the desired fragment, but will also add on the specific overhang, a type IIS restriction enzyme site, and a non-specific overhang on each side of the fragment. Once all of the fragments are amplified using PCR, Golden Gate is performed in a one-pot reaction, combining the fragments, the selected restriction enzyme, ligase, and a buffer.

I used Benchling to simulate using Golden Gate to insert mCherry into a pET28 backbone. It was quite simple! I used BsaI as the restriction enzyme.
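Before a Golden Gate design, it helps to confirm that the insert and backbone have no unexpected internal BsaI sites, since those would be cut during the reaction. The sketch below scans both strands for the BsaI recognition sequence GGTCTC; find_sites is a hypothetical helper and the example sequence is made up for illustration.

```python
BSAI_SITE = "GGTCTC"


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_sites(seq, site=BSAI_SITE):
    """Return sorted (0-based index, strand) pairs for every occurrence of site."""
    seq = seq.upper()
    hits = []
    for motif, strand in ((site, "+"), (reverse_complement(site), "-")):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)
    return sorted(hits)


example = "AAGGTCTCATTTGAGACCAA"  # made-up fragment with one site on each strand
print(find_sites(example))
```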

Asimov Kernel

I have a bit of past experience constructing genetic circuits, but this was my first time using a dedicated program for it. It was quite interesting to play around with the different circuits and observe how the outputs changed based on the arrangement.

This is the XNOR circuit

This is the XOR circuit

This is the NAND circuit

Next, I moved on to replicating the repressilator on my own by referencing the demo repressilator, which is as follows. It ended up turning out like this:

In order to further verify that I replicated the repressilator correctly, I wanted to compare the simulation of each of the circuits.

Original Repressilator	Replicated Repressilator

I noticed that although I copied the repressilator exactly, the expression of the parts did not vary as they did in the original.
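One way to sanity-check a repressilator outside the Kernel is to integrate the classic three-gene model, in which each gene's mRNA is repressed by the protein of the previous gene in the ring. The sketch below is a simple Euler integration of that textbook model, not the Kernel's own simulation, and all parameter values are illustrative assumptions. Whether the traces oscillate depends on the parameters and on the wiring order, which are the first things to compare between an original and a replicated design.

```python
def simulate_repressilator(alpha=216.0, alpha0=0.216, beta=5.0, n=2.0,
                           dt=0.01, steps=5000):
    """Euler integration of a three-gene repressor ring.

    dm_i/dt = -m_i + alpha / (1 + p_j**n) + alpha0   (j = previous gene)
    dp_i/dt = -beta * (p_i - m_i)
    Returns a list of (p0, p1, p2) protein levels, one tuple per step.
    """
    m = [1.0, 2.0, 3.0]   # asymmetric start so the genes do not stay identical
    p = [0.0, 0.0, 0.0]
    history = []
    for _ in range(steps):
        dm = [-m[i] + alpha / (1.0 + p[(i - 1) % 3] ** n) + alpha0
              for i in range(3)]
        dp = [-beta * (p[i] - m[i]) for i in range(3)]
        m = [m[i] + dt * dm[i] for i in range(3)]
        p = [p[i] + dt * dp[i] for i in range(3)]
        history.append(tuple(p))
    return history


trace = simulate_repressilator()
for step in range(0, len(trace), 1000):
    print(step, [round(level, 2) for level in trace[step]])
```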

After this, I moved towards simulating three of my own constructs.

Autonomous FLEx Switch	Three-Phase Integrase Switch	3

The main concept I used to formulate the first two circuits was the underlying principle of my final project idea. At its core, my final project relies on the sequential production of two gene sets. Upon the production of the first gene set, the circuit will respond to the resulting environment in order to express the second gene set. However, in order to avoid permanently altering the biological environment, both of the gene sets eventually need to be irreversibly turned off.

The first strategy I used to achieve this was a modification of the FLEx switch to become an autonomously responding circuit. It uses Cre recombinase and lox2272/loxP sites to shut off the first gene set.

The second strategy I used was to utilize two integrases in order to inactivate the gene sets rather than the Cre/lox system. This circuit showed high concentrations of all three phases, which I assume is because I used promoters that are responsive to downstream products rather than to the genes that I selected within the circuit.

For the third circuit, I wanted to experiment more with the repressilator idea, but reversed to make a circuit that amplifies itself over time.

Week 7: Genetic Circuits Pt 2

Intracellular Artificial Neural Networks (IANNs)

(Question 1) IANNs have the advantage of providing a more nuanced approach than conventional genetic circuits by allowing continuous input and output responses, whereas conventional genetic circuits, which use Boolean logic, often respond in a more binary manner.
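The contrast between the two behaviors can be sketched numerically. Below, a Boolean AND gate thresholds each input and returns 0 or 1, while a single perceptron-like node computes a weighted sum and passes it through a sigmoid, giving a graded output. The weights, bias, and threshold are arbitrary illustrative values, not parameters of any real circuit.

```python
import math


def boolean_and(inputs, threshold=0.5):
    """Binary gate: 1 only if every input exceeds the threshold."""
    return int(all(x > threshold for x in inputs))


def perceptron(inputs, weights, bias):
    """Graded node: sigmoid of the weighted sum of the inputs."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


weights, bias = (4.0, 4.0), -5.0  # illustrative values
for signal in [(0.6, 0.6), (0.9, 0.6), (0.9, 0.9)]:
    print(signal, boolean_and(signal), round(perceptron(signal, weights, bias), 2))
```

All three input pairs give the same Boolean output of 1, while the perceptron output rises with input strength, which is the continuous response described above.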

(Question 2) The introduction of IANNs raised an interesting question for my individual project idea. Since my final project involves the design of a genetic circuit that can sense and then respond to the formation of fibrotic scarring, IANNs could offer a more sophisticated approach by increasing the specificity of the circuit so that it only activates in a truly fibrotic wound microenvironment. In my original circuit, I had aimed to have part of my circuit sense both STAT3 and NF-kB as a trigger to secrete the anti-fibrotic factor decorin. However, by incorporating an IANN instead, I could further decrease the noise from transient inflammatory spikes by encoding synthetic transcription factors whose expression is driven by STAT3-, NF-kB-, TGF-B-, and HIF-1a-responsive promoters, respectively. The second section of my genetic circuit would then be placed under a promoter that requires the binding of all of these synthetic transcription factors.

While this approach does allow for more specificity, it raises a few logistical concerns in terms of cassette size. Originally, I was planning on using the piggyBac transposon system to integrate my circuit into the fibroblast genome. However, this IANN would nearly double the size of my circuit, making it much less reliable to use as a transposon cassette.

(Question 3)

Fungal Materials

(Question 1) One application of fungal materials that caught my attention was their use as packaging foam by companies such as Ecovative [1]. This company aims to reduce the use of traditional packing foam made from EPS and Styrofoam by growing mycelium composites within molds of the desired packaging shapes. I thought this was quite a unique approach, as it offers a more eco-friendly alternative with a low density that is comparable to EPS foams. However, I could see this having scalability issues, as it takes time to grow the mycelium, which may result in higher costs.

(Question 2) Unique characteristics of fungi are their ability to “heal themselves” and their 3D microstructure. Taking advantage of these aspects, it would be interesting to make living tissue engineering scaffolds or living wound dressings, which could secrete a variety of proteins or even drugs.

Some advantages of using fungi over bacteria for synthetic biology include that fungi are able to perform post-translational modifications, which has important implications for producing functional proteins. Another interesting aspect is that fungi give the product its own native 3D shape, with further implications for the fiber density and the branching of the mycelium.

First DNA Twist Order

Due to the complexity of my proposed genetic circuit, it is unfortunately unable to be ordered using Twist. While pieces of it could be ordered and then manually pieced together using Gibson or Golden Gate, due to my lack of access to the lab, the node and I decided not to proceed with the Twist order.

Week 9: Cell Free Systems

General questions

(Question 1) Since the cell-free protein synthesis system eliminates the cell membrane, the reaction environment is no longer limited by what can enter or exit the cell, and the reaction depends less on other cellular constraints. For example, the energy source and the chaperone/cofactor concentrations can be altered independently of the cell’s own needs. This is particularly useful for cases such as the incorporation of non-standard amino acids, where cells may lack the machinery needed for incorporation and may contain machinery that resists it. Another intriguing application would be the prototyping of vaccine antigen production. Because cell-free systems are fast, they could produce a functional antigen from a gene sequence much more quickly, without the need to engineer a stable cell line to express the desired antigen.

(Question 2)

The cell extract provides the basic machinery of a cell, such as the ribosomes, translation factors, endogenous tRNA synthetases, chaperones and folding machinery, as well as the RNAP.
The DNA/mRNA template determines what will be expressed by the cell-free system.
The RNAP allows the template to be transcribed.
Amino acids allow for the translation of the template.
The energy regeneration system provides energy for the translation process and sustains the reaction.
The NTPs are used for translation elongation and mRNA synthesis.
Mg2+ ions are necessary for ribosome assembly, and K+ ions stabilize the ribosome and support translation fidelity.
Cofactors can also be added depending on the target protein.

(Question 3) Energy regeneration is critical in cell-free systems for the very reason that they are cell-free: there is no longer a cell to produce the energy for the reactions. Since transcription and translation are energy-intensive processes, a cell-free system requires supplemented energy in order to successfully complete its assigned task. One method to continuously supply ATP to a cell-free experiment is a phosphocreatine/creatine kinase system. Creatine kinase regenerates ATP by transferring the phosphate group from phosphocreatine to ADP, leaving creatine as the byproduct.

(Question 4) Prokaryotic and eukaryotic cell-free expression systems differ by the machinery that is present in the native cells. In prokaryotic cells, transcription and translation are coupled and occur nearly simultaneously whereas in eukaryotic systems, transcription and translation are separated by location within the cell. Eukaryotic systems are also able to perform post-translational modifications, whereas prokaryotic systems are not equipped for these kinds of modifications.

I chose to explore the production of T4 lysozyme within a prokaryotic cell-free system. It is typically toxic to the bacterial host because it degrades the peptidoglycan cell wall, but the cell-free system bypasses this constraint.

Due to eukaryotic cell-free systems’ ability to perform post-translational modifications, I wanted to explore the hormone erythropoietin. Glycosylation is an essential part of the hormone’s production, and it would not occur within a prokaryotic cell-free system.

(Question 5) For the synthesis of membrane proteins using a cell-free system, one of the main limitations is that the hydrophobic membrane protein would be produced within an aqueous mix, directly opposing the nature of the molecule. For this reason, the experimental setup would require a hydrophobic carrier/chaperone to ensure that the protein can be synthesized and folded properly. From my research, nanodiscs appear to be a good way to achieve this, and they are in fact used for functional studies of membrane proteins [1].

(Question 6)

One potential reason for low yield of a target protein could be that there is not enough supplemented energy to carry out the entire transcription/translation process, as discussed in question 3. To troubleshoot this, it would be beneficial to increase the phosphocreatine concentration, implement a feeding strategy, or switch energy systems.
Another potential reason for low yield could be inefficient translation of the target protein, which can occur when the mRNA is degraded faster than it can be translated. Some potential troubleshooting strategies include codon-optimizing the sequence, optimizing the template concentration, or adding an RNase inhibitor to prevent mRNA degradation.
An additional reason for low yield of a target protein could be protein misfolding. To troubleshoot this, you could supplement chaperones into the cell-free system.

Kate Adamala’s questions

For this assignment, I wanted to explore using a cell-free system for an idea similar to my final project, rather than engineering the fibroblasts themselves, as an application to surgical response. For this, I want to use the cell-free system to sense and respond appropriately to an inflammatory environment.

(Question 1)

In this concept, the synthetic cell would be designed as an inflammation-responsive growth factor delivery system for chronic wound healing. Chronic wounds are characterized by their failure to heal due to unresolved inflammation, caused by persistently elevated MMPs that degrade growth factors more quickly than the tissue can respond to them, resulting in dysfunction in the balance between pro- and anti-inflammatory signals. I want my synthetic system to sense the elevated MMP-9 activity within the wound microenvironment and release the encapsulated template DNA of the growth factor PDGF-BB, so that the growth factor can be produced and reach therapeutically relevant concentrations.

Without encapsulation, the cell-free machinery would be degraded by the immune mediators within the wound environment and the template DNA would be degraded by extracellular DNases. The encapsulation of the template DNA also allows for the sense-and-respond mechanism to be functional.

While this could be replicated in genetically modified cells, there are a few pros to using a synthetic system instead. For example, the synthetic cell is less likely to cause an immune response and due to the lack of a cell itself, cannot replicate and therefore is less likely to become tumorigenic.

Ideally, the synthetic cell would remain undetectable whenever MMP-9 is low and only be triggered in a wound environment where the MMP-9 concentration is elevated, indicating a wound that is unable to heal. When this happens, the MMP-9 would cleave a crosslinker, which releases the DNA template and initiates the transcription and translation process. This allows the production and release of PDGF-BB into the wound environment, eventually resulting in the recruitment and activation of local fibroblasts, contributing to the wound healing process. As the healing progresses, the MMP-9 levels would normalize, providing negative feedback on PDGF-BB production.

(Question 2)

For the membrane of the synthetic cell, I need something that is biocompatible, can respond to the MMP trigger, and has enough stability to survive in a wound environment until the input/output system is triggered. For this, I anticipate using DOPC, DOPE, DOPG, and cholesterol as the main components of the membrane and including an MMP-responsive crosslinker, such as GPLGIAGQ, which is a well-validated MMP-9 cleavage substrate [2].

Within the membrane, I would plan to encapsulate the cell-free transcription/translation machinery as well as the template DNA for PDGF-BB under the control of a T7 promoter. To aid with the proper folding of PDGF-BB, I would also need the chaperone DsbC and a glutathione buffer due to the presence of critical disulfide bonds. I think it would also be helpful to include an RNase inhibitor to protect the mRNA from any RNase activity.

Since PDGF-BB is not dependent on post-translational modifications, it would be alright to use an E. coli cell-free system due to the lower costs and higher yield.

The signal input of the system would rely on MMP-9 sensing, which would not need a membrane channel since it depends on the cleavage of the peptide crosslinker. As for the output, the PDGF-BB release would primarily rely on the destabilization of the membrane through the cleavage of the crosslinker, which is a form of passive release. However, this could be optimized using a pore-forming mechanism, such as an alpha-hemolysin channel. The small molecules involved in the cell-free mixture (NTPs, amino acids, and Mg2+) would not be able to freely pass through the DOPC bilayer. However, since all of the transcription/translation machinery is encapsulated, external substrate uptake would not be required, which naturally limits the operational window.

(Question 3)

Lipids

DOPC
DOPE
DOPG
Cholesterol
GPLGIAGQ conjugate

Genes

PDGFB (human, but codon-optimized for E. coli) for therapeutic output
dsbC for correct disulfide bond formation in PDGF-BB
hla (alpha-hemolysin) for membrane pore (PDGF-BB release)
T7 RNAP

In order to measure the function of the system, we would first need to validate the transcription/translation machinery by confirming PDGF-BB expression using SDS-PAGE and western blot. Comparing non-reducing and reducing SDS-PAGE could also be used to check whether the PDGF-BB disulfide bonds formed correctly. It would also be necessary to validate vesicle formation using dynamic light scattering. Finally, validating the MMP-9 triggered release would be necessary to ensure the proper functioning of the system. By encapsulating a fluorescent dye in the vesicles and then adding MMP-9, which is supposed to cleave the GPLGIAGQ crosslinker, the release of fluorescence from the vesicles would validate the trigger.

Peter Nguyen’s question -> Fashion/Textile

Body odor has long been the subject of self-consciousness, and with this project, a wearable fabric embedded with cell-free systems senses the skin’s biochemistry in real time and responds accordingly with various harmonious fragrances, masking the need to feel embarrassed.

The skin’s chemical landscape is constantly changing throughout our daily lives. Sweat can cause the pH to shift between 4.5 and 7.5, depending on exercise, stress, and metabolism. Skin temperature can fluctuate several degrees depending on the area and throughout the day. These physiological signals often go undetected, but this project will actively interpret those signals as inputs to a fragrance synthesis program embedded within the fabric itself.

When I imagine this system, I imagine three different circuits existing along the fabric.

The first would be a floral base that is active at resting body physiology, which is typically pH 5.5-6.5. Using a pH-sensitive promoter, the enzyme linalool synthase would be produced, and its substrate, geranyl pyrophosphate (GPP), would be encapsulated as well. Conversion of GPP to linalool would emit a floral scent that acts as an ambient perfume [3].
The second layer would be activated by a rise in body temperature, which often occurs during physical activity. This would require a temperature-sensitive switch that activates expression of the enzyme limonene synthase, which would convert GPP into limonene, a citrus scent that masks the increase in body odor [4].
The final layer would be sweat-activated. Sweating, which can follow exercise or stress, raises the pH above normal. Upon activation of a high-pH-sensitive switch, valencene synthase would be expressed, producing valencene, which provides a woody smell [5].
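The three circuits above can be read as a small piece of decision logic. The sketch below is only an illustration of that logic: the resting pH window comes from the text, while the trigger temperature `WARM_SKIN_C` and the function name `active_scents` are hypothetical placeholders, since no temperature threshold is specified.

```python
# Resting pH window taken from the text; the temperature trigger is an assumed value.
RESTING_PH = (5.5, 6.5)
WARM_SKIN_C = 35.0  # hypothetical threshold, not specified in the text


def active_scents(ph: float, skin_temp_c: float) -> list[str]:
    """Return the terpene products expected for a given skin pH and temperature."""
    scents = []
    if RESTING_PH[0] <= ph <= RESTING_PH[1]:
        scents.append("linalool")   # floral base at resting pH
    if skin_temp_c >= WARM_SKIN_C:
        scents.append("limonene")   # citrus layer on a temperature rise
    if ph > RESTING_PH[1]:
        scents.append("valencene")  # woody layer when sweat raises the pH
    return scents


if __name__ == "__main__":
    for ph, temp in [(6.0, 32.0), (6.0, 36.0), (7.2, 36.0)]:
        print(ph, temp, active_scents(ph, temp))
```

Keeping the thresholds as named constants makes it easy to swap in measured switch-activation values once real parts are chosen.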

One of the limitations of using cell-free systems is premature activation by water from rain, humidity, or other sources. To prevent this, we could employ a double-layer capsule that consists of a hydrophobic outer shell and a pH-responsive polymer inner shell, so that a certain pH is required to activate the cell-free system. Another potential limitation is the limited lifetime of the fabric, since the circuit consumes energy and precursors. To work around this, the capsules should be loaded with enough material to perform the reaction many times over a set number of wears.

Ally Huang’s questions

During long periods of microgravity, muscles atrophy due to the lack of resistance. Even on short 5-11 day missions, astronauts have lost up to 20% of muscle mass, and current countermeasures such as resistance exercise and nutrition guidelines rely on verbal communication of problems rather than observation of physiological biomarkers [6][7]. By the time atrophy becomes clinically apparent, irreversible damage could already have occurred [8]. On a Mars mission lasting 2-3 years, undetected progressive atrophy could result in crewmembers becoming physically unable to perform critical operations [9]. Real-time monitoring is therefore not a convenience, but a critical safety measure that should be invested in.

For this project, I would target myostatin (GDF-8) protein concentration and IL-6 myokine levels in saliva, which could be detected with a toehold switch coupled to a fluorescent reporter output.

Myostatin is a protein that directly suppresses muscle growth, and elevated myostatin signals active muscle catabolism. IL-6 is released by contracting and stressed muscle fibers. It can indicate a productive exercise response, but elevated IL-6 can also indicate inflammatory muscle breakdown. Together, these two targets can show whether muscle tissue is in a productive remodeling state or in a catabolic, degenerative state. Both proteins can be detected in saliva, allowing for non-invasive sampling.

I hypothesize that the salivary myostatin and IL-6 profiling using BioBits freeze-dried cell-free toehold switch biosensors would be able to detect the molecular signature of muscle degradation at least two weeks prior to clinically measurable muscle volume loss. I reason that because molecular changes precede gross anatomical changes in all known muscle wasting conditions, early molecular detection is both scientifically justified and clinically actionable.

The weekly saliva collection will be processed as follows.

Add 5 µL of saliva to two BioBits freeze-dried toehold switch reaction tubes (one for myostatin and one for IL-6)
Rehydrate with 45 µL of nuclease-free water
Incubate at 37 °C for 2 hours
Read the fluorescent output using P51 viewer

It would be necessary to obtain a pre-flight baseline from each crewmember. For the positive control, a synthetic myostatin/IL-6 spike at a known concentration could be used, and for the negative control, water only.

Individual final project

Week 10: Advanced Imaging and Measurement Technology

Final Project

For my final project, I will need to sequence the genetic circuit that I ultimately construct and measure the concentrations of the IL-10, Decorin, and Bxb1 and PhiC31 integrases that the circuit produces.

To sequence the genetic circuit, the most common method would be Sanger sequencing, which uses electrophoresis after the synthesis step to sort the fragments by length and read each position from the base that terminated synthesis.
To measure the concentrations of the IL-10, Decorin, and Bxb1 and PhiC31 integrases produced by my genetic circuit, I can use mass spectrometry. After harvesting the expression cells at the appropriate time points, I will use the spike-in standards strategy and calculate the ratio of my endogenous peptide signal to the heavy standard signal. The concentration is then the moles of protein measured divided by the volume of my original sample.
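The spike-in calculation described above can be sketched in a few lines. This is a minimal illustration, not a full quantification workflow: the function name, the argument names, and the example numbers are all hypothetical, and it assumes complete digestion and one copy of the measured peptide per protein molecule.

```python
def protein_concentration(light_area: float, heavy_area: float,
                          heavy_spike_mol: float, sample_volume_l: float) -> float:
    """Concentration (mol/L) of the target protein from a light/heavy peak-area ratio.

    light_area      -- peak area of the sample (endogenous) peptide
    heavy_area      -- peak area of the heavy-labeled standard peptide
    heavy_spike_mol -- moles of heavy standard added to the sample
    sample_volume_l -- volume of the original sample in liters
    """
    if heavy_area <= 0 or sample_volume_l <= 0:
        raise ValueError("heavy_area and sample_volume_l must be positive")
    ratio = light_area / heavy_area          # light-to-heavy signal ratio
    target_mol = ratio * heavy_spike_mol     # moles of target peptide (1:1 with protein)
    return target_mol / sample_volume_l


if __name__ == "__main__":
    # Made-up example values: 50 fmol of standard spiked into a 10 uL sample.
    conc = protein_concentration(2.0e6, 4.0e6, 50e-15, 10e-6)
    print(f"{conc * 1e9:.2f} nM")
```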

Waters Pt. 1: Molecular Weight

For the following calculations, I will be using the provided eGFP sequence

MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH

Using an online calculator, the expected molecular weight of this sequence was found to be 28006.60 g/mol. The experimental molecular weight of eGFP can then be calculated using the figure below.

I’m going to start by selecting my n peak at the value 933.7349 and my n+1 peak at 903.7148. Using the formula provided, we get that:

$z=\frac{903.7148}{933.7349-903.7148}=30.10$

From this, we know that the charge of the first peak (n) is 30 and the charge for the second peak (n+1) is 31.

Using the first peak, we can calculate the molecular weight by rearranging the formula for the mass-to-charge ratio to $MW=z\times\left(\frac{m}{z}\right)_n-z$

$MW=(30)\times(933.7349)-30=27{,}982\ \text{Da}$

We can also double check this number by using our n+1 peak and doing the same calculation.

$MW=(31)\times(903.7148)-31=27{,}984\ \text{Da}$

This gives a percent error of $\frac{|27{,}984\ \text{Da}-28{,}006\ \text{Da}|}{28{,}006\ \text{Da}}\approx 0.08\%$
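The same charge-state arithmetic can be recomputed directly from the two peak values quoted above. The sketch below follows the simplification used in the formulas here, treating each proton as about 1 Da. The function names are illustrative, and `proton_mass` is exposed as a parameter in case a more exact value (about 1.00728 Da) is preferred.

```python
def charge_from_adjacent_peaks(mz_n: float, mz_n_plus_1: float) -> int:
    """Charge of peak n, given the neighboring peak that carries one more charge."""
    return round(mz_n_plus_1 / (mz_n - mz_n_plus_1))


def neutral_mass(mz: float, z: int, proton_mass: float = 1.0) -> float:
    """Neutral molecular weight from an m/z value and its charge state."""
    return z * mz - z * proton_mass


if __name__ == "__main__":
    mz_n, mz_n1 = 933.7349, 903.7148        # peak values taken from the text
    z = charge_from_adjacent_peaks(mz_n, mz_n1)
    print("z(n) =", z)
    print("MW from n   =", round(neutral_mass(mz_n, z), 2))
    print("MW from n+1 =", round(neutral_mass(mz_n1, z + 1), 2))
```

Computing the mass from both peaks is a quick consistency check: the two values should agree to within a few Da if the charge assignment is right.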

It is possible to estimate the charge state from the zoomed-in peaks by observing their spacing, since adjacent peaks are separated by $\Delta (m/z)=\frac{1}{z}$. Since the spacing between the peaks is roughly 0.05-0.07 m/z units, $z=\frac{1}{\Delta (m/z)}$ lies between $\frac{1}{0.07}\approx 14$ and $\frac{1}{0.05}=20$. This gives a rough approximation of the charge.

Waters Pt. 3: Peptide Mapping–Primary Structure

First, I analyzed the provided eGFP sequence for lysine (K) and arginine (R) residues, highlighting the lysines and bolding the arginines.

MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH

After counting, this gives 20 lysine and 6 arginine residues, so 26 total cleavage sites. Using trypsin, the predicted digest results in 19 peptides.
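The residue count and the digest can be cross-checked with a short script. This is a minimal sketch that assumes the common trypsin rule (cleave after K or R, except before P) with no missed cleavages. It returns every fragment, including one- and two-residue pieces, so its fragment count can be higher than the number reported by a tool that filters out very small peptides.

```python
import re

# eGFP sequence as provided above, with the spacing removed.
EGFP = (
    "MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL "
    "VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV "
    "NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD "
    "HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH"
).replace(" ", "")


def tryptic_peptides(seq: str) -> list[str]:
    """Split after every K or R that is not followed by P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", seq) if p]


if __name__ == "__main__":
    peptides = tryptic_peptides(EGFP)
    print("K:", EGFP.count("K"), "R:", EGFP.count("R"))
    print("fragments:", len(peptides))
    for p in peptides:
        print(p)
```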

Based on the Peptide Map data, the tallest peak is at 4.87 minutes, with 1.2x10⁷ counts, so the 10% cutoff would be at about 1.2x10⁶ counts. Above this cutoff, there are 14 relevant chromatographic peaks. This does not match the 19 predicted peptides, meaning fewer peptides are present in the chromatogram than predicted.

From this figure, the most abundant peak is at 525.767122. From this, M+1 would be 526.25918 and M+2 would be 526.76845, with spacings of 0.4921 and 0.5093 respectively, making the average spacing 0.50. This means that $z=\frac{1}{\Delta (m/z)}=\frac{1}{0.50}=2$

To calculate the singly charged mass, we can use the equation $[M+H]^+=(m/z\times z)-(z-1)\times 1.00728$, where 1.00728 Da is the mass of a proton. This comes out to be $[M+H]^+=(525.767122\times 2)-(1)\times 1.00728=1050.527\ \text{Da}$. Based on this mass, we can assume this peak corresponds to the peptide FEGDTLVNR, which the tool estimated to be 1050.5214 Da. Given this, we find that percent error $=\frac{|1050.527\ \text{Da}-1050.5214\ \text{Da}|}{1050.527\ \text{Da}}=0.0005\%$.
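The chain from isotope spacing to charge, then to the singly protonated mass and the mass error, can be written out as a short script. This is a sketch under stated assumptions: it uses the peak value 525.767122 from the calculation above and a proton mass of 1.00728 Da, and the helper names are illustrative.

```python
PROTON_MASS = 1.00728  # Da, assumed value for the proton mass


def charge_from_isotopes(peaks: list[float]) -> int:
    """Estimate z from the mean spacing of consecutive isotope peaks (spacing = 1/z)."""
    spacings = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(1.0 / mean_spacing)


def singly_protonated_mass(mz: float, z: int) -> float:
    """Convert an m/z value at charge z to the equivalent [M+H]+ mass."""
    return mz * z - (z - 1) * PROTON_MASS


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


if __name__ == "__main__":
    isotope_peaks = [525.767122, 526.25918, 526.76845]   # M, M+1, M+2 from the text
    z = charge_from_isotopes(isotope_peaks)
    mh = singly_protonated_mass(isotope_peaks[0], z)
    print("z =", z)
    print("[M+H]+ =", round(mh, 4), "Da")
    print("error vs 1050.5214 Da:", round(ppm_error(mh, 1050.5214), 1), "ppm")
```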

The percentage of the sequence that is confirmed by the peptide mapping is 88%.

Waters Pt. 4: Oligomers

First, we need to calculate the expected masses of each oligomeric species.

Species	Subunits	Expected Mass
7FU Decamer	10 x 340 kDa	3.40 MDa
8FU-Didecamer	20 x 400 kDa	8.00 MDa
8FU 3-Decamer	30 x 400 kDa	12.00 MDa
8FU 4-Decamer	40 x 400 kDa	16.00 MDa
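The expected masses in the table are simply the subunit count multiplied by the subunit mass. As a small sketch, they can be reproduced as follows (the dictionary layout and function name are illustrative only):

```python
# (number of subunits, subunit mass in kDa) for each species in the table
SPECIES = {
    "7FU Decamer": (10, 340),
    "8FU-Didecamer": (20, 400),
    "8FU 3-Decamer": (30, 400),
    "8FU 4-Decamer": (40, 400),
}


def expected_mass_mda(n_subunits: int, subunit_kda: float) -> float:
    """Expected oligomer mass in MDa (1 MDa = 1000 kDa)."""
    return n_subunits * subunit_kda / 1000.0


if __name__ == "__main__":
    for name, (n, kda) in SPECIES.items():
        print(f"{name}: {expected_mass_mda(n, kda):.2f} MDa")
```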

Comparing this to the mass spectrum obtained, we find that all of the oligomers except the 8FU 4-Decamer appear on the spectrum as distinct peaks, and that the 8FU-Didecamer is the most abundant.

Waters Pt. 5: Did I make GFP?

Based on the images provided, I calculated that

	Theoretical	Observed/measured on the Intact LC-MS	PPM Mass error
Molecular weight (kDa)	28.006 kDa	28.032 kDa	+928 ppm

Week 11: Bioproduction and Cloud Labs

The 1,536 Pixel Artwork Canvas

I ended up contributing 6 pixels of various colors to the canvas, which were mostly made on the border, but didn’t end up in the final artwork.

I really enjoyed that this assignment was a play on other iterations of the collaborative pixel artwork challenges across various platforms, and felt like a fun way to be able to interact with the entire HTGAA community. I think that a lower cooldown time was needed (and I heard that it was implemented towards the end), as I would often click onto another tab while waiting (and then would get distracted…). Overall though, it was fun to see what came out of the community and what ended up on the final canvas.

Cell-Free Protein Synthesis

Component	Role
E. coli Lysate
BL21 (DE3) Star Lysate (includes T7 RNA Polymerase)	Provides the transcription/translation machinery, with the T7 RNAP driving the transcription of any gene that is under a T7 promoter
Salts/Buffer
Potassium Glutamate	Potassium Glutamate stabilizes the ribosomes and maintains osmotic balance
HEPES-KOH pH 7.5	This is a non-reactive buffer that promotes an optimal pH for transcription/translation
Magnesium Glutamate	Supplies Mg2+ to the system, which is necessary for ribosome assembly, RNAP activity, and ATP hydrolysis
Potassium phosphate monobasic	Provides additional buffering capacity and also a phosphate source that can feed into the energy pathways
Potassium phosphate dibasic	Provides additional buffering capacity and also a phosphate source that can feed into the energy pathways
Energy/Nucleotide System
Ribose	Energy source that feeds the metabolic pathways in the lysate to regenerate ATP
Glucose	Energy source that feeds the metabolic pathways in the lysate to regenerate ATP
AMP	Nucleoside monophosphate that is a building block for RNA synthesis during transcription
CMP	Nucleoside monophosphate that is a building block for RNA synthesis during transcription
GMP	Nucleoside monophosphate that is a building block for RNA synthesis during transcription
UMP	Nucleoside monophosphate that is a building block for RNA synthesis during transcription
Guanine	Helps replenish GMP, which is consumed rapidly during translation
Translation Mix (Amino Acids)
17 Amino Acid Mix	Provides the majority of the standard amino acids needed for translation
Tyrosine	Separately provided due to poor solubility at neutral pH
Cysteine	Separately added since it is prone to oxidation and to avoid off-target reactions
Additives
Nicotinamide	Precursor for NAD+ that helps sustain the energy metabolism
Backfill
Nuclease Free Water	Brings the reaction to its final working volume without introducing RNases or DNases

The first difference I noticed was in the nucleotide composition. The 1-hour mix uses NTPs, while the 20-hour mix provides NMPs. The 20-hour mix also relies on ribose and glucose as an energy source, as opposed to the PEP-Mono used by the 1-hour mix, which indicates that the 20-hour mix is designed to sustain long-term expression. This is also indicated by the higher concentration of amino acids in the 20-hour mix.

Planning the Global Experiment– Cell-Free Master Mix Design

In our pixel artwork, we utilized six fluorescent proteins: sfGFP, mRFP1, mKO2, mTurquoise2, mScarlet-I, and Electra2, each with unique biophysical properties.

sfGFP: designed to reliably produce signal even when the lysate is suboptimal
- could increase the amino acid (17 amino acid mix + tyrosine) concentrations to increase translation rates towards the end of the reaction
mRFP1: slow to mature and has low acid sensitivity
- could increase tyrosine since it has low solubility
mKO2: high dependence on oxygen
- could increase the ribose and glucose to make sure that oxygen levels are sufficient as well as re-supplying energy
mTurquoise2: slow to mature
- could increase the magnesium glutamate concentration and increase the NMP pool to overcome the maturation
mScarlet-I: fast maturation
- could increase the tyrosine concentration since it is typically consumed quicker than other amino acids
Electra2: tends to form aggregates due to resistance to acidity
- could decrease the magnesium glutamate concentration to reduce the aggregation

Reaction Mixture	Explanation
	Since I wanted to compare the Electra2 versus others to test my hypothesis, I increased the glutamate concentration
	Increased tyrosine by 1.5%
	Increased tyrosine by 3.1%
	Increased tyrosine by 4.6%
	Increased glucose, as well as tyrosine, AMP, and GMP
	Increased magnesium glutamate by 9.0%
	Increased magnesium glutamate by 17.9%
	Increased tyrosine, AMP, GMP, and glucose