BEYZA CENNET BATIR — HTGAA Spring 2026

cover image cover image

Merhaba!

My name is Beyza, I am from Izmir, Turkey. I work as a Research Assistant at the Faculty of Fine Arts and Design at Izmir University of Economics, and I am also a Teaching Assistant for “Design Studio III, Design Studio IV, and Architectural Intelligence: Architectural Artificial Intelligence” courses.

I am a PhD candidate in the Design Studies program, and my research generally focuses on “epistemic collaboration and semantics established with artificial intelligence in design”. I completed my master’s thesis on “The use of fractal forms in the process of creating a new reality” at the Dokuz Eylül University Institute of Fine Arts.

I believe that nature is the universal language of design, and I have a special interest in self-repeating natural forms to understand this model. I often think about the evolution of creativity across different environments and disciplines. As a designer, I am excited by the possibility of creative expression being a common meeting point for every discipline and the nurturing environment that arises from this collaboration, and I am very happy to be here. Let’s stay connected! 👾

Update: I’m still under the spell of meeting such cool people and experiencing the magic of the unique curiosity and excitement for research.

HTGAA26 Node

As of February 12, I will continue attending classes as a Committed Listener alongside Designer Cells Lab! You can access their projects via their website.

👾 👾 👾 👾 👾
DESIGNER CELLS
SCORE00000
HI-SCORE99999
LEVEL01

I got support from Claude while coding this :)

Contact

Homework

Labs

Projects

Subsections of BEYZA CENNET BATIR — HTGAA Spring 2026

Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices

    Here you will see a proposal in which I attempt to overstep my bounds in the field of microbiology with my identity as an artist and designer. If any of my statements are incorrect, incomplete, or biased, I would like to point out that this is due to my inexperience in the field, and I would gladly accept your support in correcting them. Q1. Describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project and/or something for which you are already doing in your research, or something you are just curious about.

  • Week 2 HW: DNA Read, Write, & Edit

    Part 1: Benchling & In-silico Gel Art First, I checked how to find Lambda through the database. I rewatched the Bootcamp recording by Adrian Filips and week 2 files of HTGAA2025 as well as the HTGAA2026 Recitation recordings on Benchling Basics provided by Cholpisit (Ice) Kiattisewee, and reviewed all the notes and presentations. NHI LAmbda webpage Biolabs Lambda webpage

  • Week 3 HW: Lab Automation

    Python Script for Opentrons Artwork I have created Space Invaders with green and red because my node Designer Cells mentioned they have only red (mrfp1) and green (sfGFP) right now. I have evil plans to create also text “Designer Cells” down the Space Invaders logo:) - done!-

  • Week 4 HW: Protein Design I

    Part A: Conceptual Questions Answer any 9 of the following questions from Shuguang Zhang: (i.e. you can select two to skip) A.1 How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons) Approximately 6×10 23 amino acid molecules. 1 gram of amino acids would be 0.01 moles (1g/100g/mol). If 500g of meat is roughly 25% protein (about 125g), we’d have 1.25 moles. Multiplying by Avogadro’s number (6.022×10 23), we get approximately 7.5×10 23 molecules.

  • Week 5 HW: Protein Design II

    Part A: SOD1 Binder Peptide Design (From Pranam) Human Superoxide Dismutase 1 (SOD1, UniProt: P00441) is a cytosolic antioxidant enzyme responsible for detoxifying superoxide radicals. The A4V mutation (Alanine → Valine at position 4) destabilizes the N-terminal region, increases aggregation propensity, and is associated with a severe form of familial ALS. The goal of this assignment is to design short 12-mer peptides that bind preferentially to mutant SOD1 and evaluate their structural and therapeutic potential using PepMLM, AlphaFold3, and PeptiVerse.

  • Week 6 HW: Genetic Circuits Part I: Assembly Technologies

    Assignment: DNA Assembly Phusion High-Fidelity PCR Master Mix Components While the specific biochemical list of Phusion ingredients is not detailed in the excerpts, the sources confirm that PCR reactions are a core “DNA Skill” used to generate “linear fragments” or “gene fragments” for cloning. Typically, a high-fidelity master mix includes: DNA Polymerase: The enzyme responsible for synthesizing the new DNA strand; high-fidelity versions (like Phusion) have proofreading activity to minimize mutations. dNTPs (Deoxynucleotide Triphosphates): The chemical “LEGO bricks” (A, T, C, G) used to build the DNA chain. Buffer and Mg2+: Provides the optimal chemical environment and cofactors for the polymerase to function. Factors Determining Primer Annealing Temperature (Tm) The sources highlight Tm​prediction as a critical computational filter in the protein design pipeline. The primary factors determining this temperature include: GC Content: The ratio of Guanine and Cytosine; higher GC content increases the Tm​because G-C pairs have three hydrogen bonds compared to the two bonds in A-T pairs Primer Length: Longer primers generally have higher annealing temperatures. Salt Concentration: The concentration of ions in the PCR buffer affects the stability of the DNA duplex. PCR vs. Restriction Enzyme Digests The sources compare these as two methods for preparing DNA for assembly: Protocol: PCR uses primers and a polymerase to amplify a specific sequence into a linear fragment. Restriction digestion uses enzymes (like NdeI or XhoI) to cut a DNA backbone or insert at specific “cloning sites” to create sticky or blunt ends Preferable Use: PCR is preferred when you need to amplify a specific gene from a complex template or add “homology arms” for Gibson cloning. Restriction digestion (described as “Plan B” in the project) is often used for inserting fragments into standard backbones like pET-28a(+) but can add “1–2 weeks” to the timeline for additional cloning and screening steps Ensuring Appropriateness for Gibson CloningTo ensure DNA fragments are ready for Gibson Assembly, you must verify that the linear fragments (whether from PCR or digestion) have overlapping homology sequences at their ends. The sources recommend using Benchling for “in silico design” to check sticky-end orientation, digestion sites, and frame verification to ensure all parts will align correctly during the assembly reaction.

  • Week 7 HW: Genetic Circuits Part II: Neuromorphic Circuits

    Assignment Part 1: Intracellular Artificial Neural Networks (IANNs) In-cell Artificial Neural Networks (IANNs) offer a major advantage over traditional Boolean genetic circuits by enabling synthetic systems to operate at the same level of abstraction as natural cellular signaling. Unlike Boolean circuits, which are restricted to discrete on/off logic, IANNs use continuous, analog signal processing with nonlinear activation functions, allowing them to represent smooth and highly complex input–output relationships such as band-pass or non-monotonic responses. This enables IANNs to approximate arbitrary continuous functions more efficiently and compactly than combinations of rigid logic gates. In addition, their continuous nature makes them better suited for modeling real biological systems, which are inherently noisy and graded rather than binary. As a result, IANNs are more robust for complex cellular decision-making tasks and pattern recognition in biological environments.

  • Week 8 No Homework: Spring Break

  • Week 9 HW: Cell-Free Systems

    Homework Part A: General and Lecturer-Specific Questions General homework questions Cell-free protein synthesis offers much greater flexibility and experimental control than traditional in vivo systems because it removes the constraints of cell viability, membrane transport, and metabolic regulation. In CFPS, components can be precisely tuned (DNA template concentration, ions, chaperones, cofactors), and toxic or unstable proteins can be produced without affecting living cells. Two cases where CFPS is especially advantageous: Toxic proteins (e.g., membrane-disrupting peptides or nucleases) that would kill host cells in vivo. Rapid prototyping of genetic constructs, where many variants need to be tested quickly without cloning or cell line generation. A CFPS system typically contains: Cell extract (E. coli, wheat germ, or rabbit reticulocyte lysate): Provides ribosomes, tRNAs, aminoacyl-tRNA synthetases, and translation machinery. DNA or mRNA template: Encodes the target protein; serves as the blueprint for transcription/translation. Energy system (ATP regeneration components): Supplies ATP/GTP required for transcription and translation. Amino acids: Building blocks for protein synthesis. Salts and cofactors (Mg²⁺, K⁺, etc.): Stabilize ribosomes and enzymes. Nucleotides (NTPs): Required for transcription of mRNA from DNA. Optional additives (chaperones, membrane mimics, redox agents): Improve folding and functionality of expressed proteins. Importance of energy regeneration Protein synthesis is extremely energy-intensive; each peptide bond consumes multiple ATP/GTP equivalents. Without regeneration, ATP is rapidly depleted, stopping translation. To maintain continuous ATP supply, one method is:

  • Week 10 HW: Advanced Imaging & Measurement Technology

    Homework: Final Project In my final project proposal, Paleo-Proteins project, success is measured through a multi-layered validation pipeline that spans from in silico sequence verification to automated functional assays in human cell lines. Below are the specific aspects being measured and the technologies employed to perform these measurements. Protein Identity and Structural Integrity The primary physical aspects to be measured are the molecular weight and immunological identity of the synthetic cryoprotectants (e.g., DHN-K2S). Measurement: I will confirm that the expressed protein matches the predicted molecular weight (e.g., ~11.4 kDa for DHN-K2S) and contains the intended N-terminal His₆-tag. Technologies: SDS-PAGE: A 12% precast gel will be used to provide gel-based confirmation of protein production and approximate size in less than 2 hours. Western Blot: Utilizing an anti-His₆-HRP antibody, this provides orthogonal identity confirmation, distinguishing the target protein from background cell-free synthesis (CFPS) components based on specific epitope recognition.

  • Week 11 HW: Bioproduction & Cloud Labs

    Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork Done :) global artwork experiment Make a note on your HTGAA webpages including: I was part of the community bioart project by supporting others design. I like collaborative artworks because of their inclusive atmosphere. I don’t have any recommendations. Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork 1. Roles of Each Cell-Free Reaction Component

  • Week 12 HW: Building Genomes

    Building Genomes (George Church, John Glass, Jef Boeke)
    Lab: Bioproduction

  • Week 13 HW: Biodesign & Engineered Living Material

    Homework: Work on your Final Project Present it May 12 (MIT/Harvard) or May 13 (Committed Listeners) Done! :)

  • Week 14 HW: Bio Design & Bio Fabrication

    Homework: Finish your Final Project Present it May 12 (MIT/Harvard) or May 13 (Committed Listeners) Done! :) https://docs.google.com/presentation/d/1vxVu8kgoHVHmmDpRqoX6xxGv62YjuYbUX5MRlUITn7I/edit?slide=id.g3e7d9f77350_242_2#slide=id.g3e7d9f77350_242_2

Subsections of Homework

Week 1 HW: Principles and Practices

Here you will see a proposal in which I attempt to overstep my bounds in the field of microbiology with my identity as an artist and designer. If any of my statements are incorrect, incomplete, or biased, I would like to point out that this is due to my inexperience in the field, and I would gladly accept your support in correcting them.

cover image cover image

Q1. Describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project and/or something for which you are already doing in your research, or something you are just curious about.

After my master’s thesis research on pattern recognition and fractal thinking in art and design, I wanted to explore the potential application of these methods to biological anomalies. Following my mother’s diagnosis, I found papers exploring these possibilities. One of those studies proposed the use of fractal geometry to identify cellular anomalies associated with cancer (Dokukin et al., 2015). I would like to develop a tool in this area.

cover image cover image

AFM-based fractal analysis for an early-stage cancer cell screening system which is a diagnostic tool that distinguishes normal, premalignant, and malignant cells by measuring changes in multi-fractality on the cell surface.

Q2. Describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future, like ensuring non-malfeasance (preventing harm). Break big goals down into two or more specific sub-goals.

Building on the personal and academic motivation described above, this proposed tool is an integrated software + data analysis system that quantifies the fractal properties of the cell surface at specific stages of cancer progression using AFM or similar high-resolution imaging methods. By relying on fractal indicators of anomalous (chaotic) surface behavior in early stages, it can generate signals prior to clinical disease progression, potentially providing new biomarkers for early clinical diagnosis. This offers quantitative biomonitoring capabilities beyond conventional histopathological assessment. In this section (regarding how the tool can be implemented technologically), I drew support from large language models (ChatGPT by OpenAI; Claude by Anthropic, 2026).

Q3. Describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”). Try to outline a mix of actions (e.g. a new requirement/rule, incentive, or technical strategy) pursued by different “actors” (e.g. academic researchers, companies, federal regulators, law enforcement, etc). Draw upon your existing knowledge and a little additional digging, and feel free to use analogies to other domains (e.g. 3D printing, drones, financial systems, etc.).

  • Purpose:
  • Traditional medicine focuses on treating patients after they become ill rather than preventing disease. Morphological changes at the cellular and tissue levels are evaluated based on pathologists’ qualitative observations. This approach is largely dependent on human interpretation and cannot quantitatively capture micro-scale surface dynamics. The chance of early diagnosis is low. Here, however, there is a chance of early diagnosis.
  • Design:
  • An ethics committee composed of representatives from academia, industry, and the public sector. Due to the high cost of the method, democratization of access among different demographic groups in the future.
  • Assumptions: If the software is open source, it may be misinterpreted in non-clinical settings (Bennett et al., 2009).
  • Risks of Failure & “Success”:
  • Dokukin’s study was conducted only with cervical epithelial cells; validation in different cancer types and populations is required (Dokukin et al., 2015).
  • Fractal behavior has only been observed at a specific stage of development, and there is a deviation from this characteristic in more advanced stages.

Q4. Score (from 1-3 with, 1 as the best, or n/a) each of your governance actions against your rubric of policy goals. The following is one framework but feel free to make your own:

Does the option:Option 1Option 2Option 3
Enhance Biosecurity
• By preventing incidents122
• By helping respond212
Foster Lab Safety
• By preventing incident123
• By helping respond12-
Protect the environment
• By preventing incidents2--
• By helping respond2--
Other considerations
• Minimizing costs and burdens to stakeholders322
• Feasibility?212
• Not impede research312
• Promote constructive applications211

Q5. Drawing upon this scoring, describe which governance option, or combination of options, you would prioritize, and why. Outline any trade-offs you considered as well as assumptions and uncertainties. For this, you can choose one or more relevant audiences for your recommendation, which could range from the very local (e.g. to MIT leadership or Cambridge Mayoral Office) to the national (e.g. to President Biden or the head of a Federal Agency) to the international (e.g. to the United Nations Office of the Secretary-General, or the leadership of a multinational firm or industry consortia). These could also be one of the “actor” groups in your matrix.

Working with cancer patients requires that the legal process be conducted ethically. Patients must provide informed consent regarding whether their screening results will be used for research purposes, and their “right not to know” must be protected. Cell surface maps are biometric data and contain personal health information. Strong data protection protocols are required for the collection, storage, and sharing of this data. In particular, institutions such as insurance companies must be prevented from accessing this data without the patient’s consent, thereby protecting the scope of health insurance coverage or the patient’s right to work. Luigi Mangione, who was convicted in the UnitedHealthcare case, criticized the American healthcare system with a manifesto similar to that of “Unabomber” Ted Kaczynski (Kaczynski, 1995). This was a period when patients began to be excluded from health insurance coverage based on AI decisions (Mello et al., 2026).

Images

  • Image1. Liver Cells, Beyza Batır, 2018
  • Image2. AFM maps of adhesion of the AFM probe to the cell surface of (a) normal, (b) immortal (premalignant), and (c) cancer cells. SEM images of (d) normal, (e) immortal, and (f) cancer cells., in ‘Emergence of fractal geometry on the surface of human cervical epithelial cells during progression towards cancer’, Dokukin, M.E. et al., 2015

References

  • Bennett, G. et al. (2009) ‘From synthetic biology to biohacking: Are we prepared?’, Nature Biotechnology, 27(12), pp. 1109–1111. doi:10.1038/nbt1209-1109.
  • Dokukin, M.E. et al. (2015) ‘Emergence of fractal geometry on the surface of human cervical epithelial cells during progression towards cancer’, New Journal of Physics, 17(3), p. 033019. doi:10.1088/1367-2630/17/3/033019.
  • Kaczynski, T. (1995) ‘Industrial society and its future’ Available at: https://web.cs.ucdavis.edu/~rogaway/classes/188/materials/Industrial%20Society%20and%20Its%20Future.pdf (Accessed: 6 February 2026).
  • Mello, M.M. et al. (2026) ‘The Ai Arms Race in Health Insurance Utilization Review: Promises of efficiency and risks of supercharged flaws’, Health Affairs, 45(1), pp. 6–13. doi:10.1377/hlthaff.2025.00897.
  • Todorovic, V. (2020) ‘Reimagining life (forms) with generative and Bio Art’, AI & SOCIETY, 36(4), pp. 1323–1329. doi:10.1007/s00146-020-00937-9.

Week 2 HW: DNA Read, Write, & Edit

Part 1: Benchling & In-silico Gel Art

First, I checked how to find Lambda through the database. I rewatched the Bootcamp recording by Adrian Filips and week 2 files of HTGAA2025 as well as the HTGAA2026 Recitation recordings on Benchling Basics provided by Cholpisit (Ice) Kiattisewee, and reviewed all the notes and presentations.

NHI LAmbda webpage NHI LAmbda webpage NHI LAmbda webpage

Biolabs Lambda webpage Biolabs Lambda webpage Biolabs Lambda webpage

After checking the NIH website and found “Nucleotide” search; downloaded “Lambda (NP_040580.1)” on it, I turned back to the given Lambda page on Biolabs database, and copied to the notepad the proper FASTA data. Because the one that I downloaded from NIH got different bp lengths (Standard is 48,502 bp).

cover image cover image cover image cover image

Left: Paul Vanouse, Latent Figure Protocol Skull and Bones; Right: Space Invaders

Since I wanted my work to be in the Vanouse style, I also researched projects conducted by Vanouse and began exploring what kind of figure I wanted to create and whether this method would make it possible. I reviewed work prepared by Peggy Yin (2023) and Kevin Tang (2025) from previous years.

cover image cover image cover image cover image

Left: Kevin Tang (2025); Right: Peggy Yin (2023)

The protocols I followed in the continuation of the project are as follows:

Simulate Restriction Enzyme Digestion with the following Enzymes: EcoRI HindIII BamHI KpnI EcoRV SacI SalI

https://rcdonovan.com/gel-art

Part 3: Benchling & In-silico Gel Art

3.1. Choose your protein.

In recitation, we discussed that you will pick a protein for your homework that you find interesting. Which protein have you chosen and why? Using one of the tools described in recitation (NCBI, UniProt, google), obtain the protein sequence for the protein you chose.

AAA61217.2 thyroid peroxidase [Homo sapiens] MRALAVLSVTLVMACTEAFFPFISRGKELLWGKPEESRVSSVLEESKRLVDTAMYATMQRNLKKRGILSG AQLLSFSKLPEPTSGVIARAAEIMETSIQAMKRKVNLKTQQSQHPTDALSEDLLSIIANMSGCLPYMLPP KCPNTCLANKYRPITGACNNRDHPRWGASNTALARWLPPVYEDGFSQPRGWNPGFLYNGFPLPPVREVTR HVIQVSNEVVTDDDRYSDLLMAWGQYIDHDIAFTPQSTSKAAFGGGSDCQMTCENQNPCFPIQLPEEARP AAGTACLPFYRSSAACGTGDQGALFGNLSTANPRQQMNGLTSFLDASTVYGSSPALERQLRNWTSAEGLL RVHGRLRDSGRAYLPFVPPRAPAACAPEPGNPGETRGPCFLAGDGRASEVPSLTALHTLWLREHNRLAAA LKALNAHWSADAVYQEARKVVGALHQIITLRDYIPRILGPEAFQQYVGPYEGYDSTANPTVSNVFSTAAF RFGHATIHPLVRRLDASFQEHPDLPGLWLHQAFFSPWTLLRGGGLDPLIRGLLARPAKLQVQDQLMNEEL TERLFVLSNSSTLDLASINLQRGRDHGLPGYNEWREFCGLPRLETPADLSTAIASRSVADKILDLYKHPD NIDVWLGGLAENFLPRARTGPLFACLIGKQMKALRDGDWFWWENSHVFTDAQRRELEKHSLSRVICDNTG LTRVPMDAFQVGKFPEDFESCDSITGMNLEAWRETFPQDDKCGFPESVENGDFVHCEESGRRVLVYSCRH GYELQGREQLTCTQEGWDFQPPLCKDVNECADGAHPPCHASARCRNTKGGFQCLCADPYELGDDGRTCVD SGRLPRVTWISMSLAALLIGGFAGLTSTVICRWTRTGTKSTLPISETGGGTPELRCGKHQAVGTSPQRAA AQDSEQESAGMEGRDTHRLPRAL

I chose Thyroid peroxidase (TPO) due to my hashimoto’s thyroiditis. https://www.uniprot.org/uniprotkb/P07202/entry cover image cover image

sp|P07202|PERT_HUMAN Thyroid peroxidase OS=Homo sapiens OX=9606 GN=TPO PE=1 SV=4 MRALAVLSVTLVMACTEAFFPFISRGKELLWGKPEESRVSSVLEESKRLVDTAMYATMQR NLKKRGILSPAQLLSFSKLPEPTSGVIARAAEIMETSIQAMKRKVNLKTQQSQHPTDALS EDLLSIIANMSGCLPYMLPPKCPNTCLANKYRPITGACNNRDHPRWGASNTALARWLPPV YEDGFSQPRGWNPGFLYNGFPLPPVREVTRHVIQVSNEVVTDDDRYSDLLMAWGQYIDHD IAFTPQSTSKAAFGGGADCQMTCENQNPCFPIQLPEEARPAAGTACLPFYRSSAACGTGD QGALFGNLSTANPRQQMNGLTSFLDASTVYGSSPALERQLRNWTSAEGLLRVHARLRDSG RAYLPFVPPRAPAACAPEPGIPGETRGPCFLAGDGRASEVPSLTALHTLWLREHNRLAAA LKALNAHWSADAVYQEARKVVGALHQIITLRDYIPRILGPEAFQQYVGPYEGYDSTANPT VSNVFSTAAFRFGHATIHPLVRRLDASFQEHPDLPGLWLHQAFFSPWTLLRGGGLDPLIR GLLARPAKLQVQDQLMNEELTERLFVLSNSSTLDLASINLQRGRDHGLPGYNEWREFCGL PRLETPADLSTAIASRSVADKILDLYKHPDNIDVWLGGLAENFLPRARTGPLFACLIGKQ MKALRDGDWFWWENSHVFTDAQRRELEKHSLSRVICDNTGLTRVPMDAFQVGKFPEDFES CDSITGMNLEAWRETFPQDDKCGFPESVENGDFVHCEESGRRVLVYSCRHGYELQGREQL TCTQEGWDFQPPLCKDVNECADGAHPPCHASARCRNTKGGFQCLCADPYELGDDGRTCVD SGRLPRVTWISMSLAALLIGGFAGLTSTVICRWTRTGTKSTLPISETGGGTPELRCGKHQ AVGTSPQRAAAQDSEQESAGMEGRDTHRLPRAL

[Example from our group homework, you may notice the particular format — The example below came from UniProt]

sp|P03609|LYS_BPMS2 Lysis protein OS=Escherichia phage MS2 OX=12022 PE=2 SV=1 METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLL EAVIRTVTTLQQLLT

3.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.

The Central Dogma discussed in class and recitation describes the process in which DNA sequence becomes transcribed and translated into protein. The Central Dogma gives us the framework to work backwards from a given protein sequence and infer the DNA sequence that the protein is derived from. Using one of the tools discussed in class, NCBI or online tools (google “reverse translation tools”), determine the nucleotide sequence that corresponds to the protein sequence you chose above.

[Example: Get to the original sequence of phage MS2 L-protein from its genome phage MS2 genome - Nucleotide - NCBI]

Lysis protein DNA sequence atggaaacccgattccctcagcaatcgcagcaaactccggcatctactaatagacgccggccattcaaacatgaggattacccatgtcgaagacaacaaagaagttcaactctttatgtattgatcttcctcgcgatctttctctcgaaatttaccaatcaattgcttctgtcgctactggaagcggtgatccgcacagtgacgactttacagcaattgcttacttaa

3.3. Codon optimization.

Once a nucleotide sequence of your protein is determined, you need to codon optimize your sequence. You may, once again, utilize google for a “codon optimization tool”. In your own words, describe why you need to optimize codon usage. Which organism have you chosen to optimize the codon sequence for and why?

[Example from Codon Optimization Tool | Twist Bioscience while avoiding Type IIs enzyme recognition sites BsaI, BsmBI, and BbsI]

Lysis protein DNA sequence with Codon-Optimization ATGGAAACCCGCTTTCCGCAGCAGAGCCAGCAGACCCCGGCGAGCACCAACCGCCGCCGCCCGTTCAAACATGAAGATTATCCGTGCCGTCGTCAGCAGCGCAGCAGCACCCTGTATGTGCTGATTTTTCTGGCGATTTTTCTGAGCAAATTCACCAACCAGCTGCTGCTGAGCCTGCTGGAAGCGGTGATTCGCACAGTGACGACCCTGCAGCAGCTGCTGACCTAA

3.4. You have a sequence! Now what?

What technologies could be used to produce this protein from your DNA? Describe in your words the DNA sequence can be transcribed and translated into your protein. You may describe either cell-dependent or cell-free methods, or both.

3.5. [Optional] How does it work in nature/biological systems?

Describe how a single gene codes for multiple proteins at the transcriptional level. Try aligning the DNA sequence, the transcribed RNA, and also the resulting translated Protein!

Part 4: Prepare a Twist DNA Synthesis Order**

This is a practice exercise, not necessarily your real Twist order! (done)

4.1. Create a Twist account and a Benchling account (done)

4.2. Build Your DNA Insert Sequence

For example, let’s make a sequence that will make E. coli glow fluorescent green under UV light by constitutively (always) expressing sfGFP (a green fluorescent protein):

In Benchling, select New DNA/RNA sequence Give your insert sequence a name and select DNA with a Linear topology (this is a linear sequence that will be inserted into a circular backbone vector of our choosing). Go through each piece of the given DNA sequences highlighted below (Promoter, RBS, Start Codon, Coding Sequence, His Tag, Stop Codon, Terminator) and paste the sequences into the Benchling file one after the other (replacing the coding sequence with your codon optimized DNA sequence of interest!). Each time you add a new piece of the sequence, make sure to annotate by right clicking over the sequence and creating an annotation that describes what each piece (e.g., Promoter, RBS, etc.) is (see image below). Promoter (e.g. BBa_J23106): TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC RBS (e.g. BBa_B0034 with spacers for optimal expression): CATTAAAGAGGAGAAAGGTACC Start Codon: ATG Coding Sequence (your codon optimized DNA for a protein of interest, sfGFP for example): AGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCACAAATTTTCTGTCCGTGGAGAGGGTGAAGGTGATGCTACAAACGGAAAACTCACCCTTAAATTTATTTGCACTACTGGAAAACTACCTGTTCCGTGGCCAACACTTGTCACTACTCTGACCTATGGTGTTCAATGCTTTTCCCGTTATCCGGATCACATGAAACGGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGGAACGCACTATATCTTTCAAAGATGACGGGACCTACAAGACGCGTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATCGTATCGAGTTAAAGGGTATTGATTTTAAAGAAGATGGAAACATTCTTGGACACAAACTCGAGTACAACTTTAACTCACACAATGTATACATCACGGCAGACAAACAAAAGAATGGAATCAAAGCTAACTTCAAAATTCGCCACAACGTTGAAGATGGTTCCGTTCAACTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCGACACAATCTGTCCTTTCGAAAGATCCCAACGAAAAGCGTGACCACATGGTCCTTCTTGAGTTTGTAACTGCTGCTGGGATTACACATGGCATGGATGAGCTCTACAAA 7x His Tag (Let’s add a 7×His tag at the C-terminus of the protein to enable protein purification from E. coli): CATCACCATCACCATCATCAC Stop Codon: TAA Terminator (e.g. BBa_B0015): CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA Once you’ve completed this, click on Linear Map to preview the entire sequence. If you intend to have a TA review a sequence in the future, this is a good way to verify that all sections are annotated! This is not required for this exercise, but to share your design with others, please ensure that link sharing is turned on! (Optional) Share your final sequence link with a TA for review!

This insert sequence you built is commonly referred to as an expression cassette in molecular biology (a sequence you can drop into any vector and it’ll perform its function). Go ahead and download the FASTA file for the sequence you made. It’s helpful to visualize DNA designs using SBOL Canvas (Synthetic Biology Open Language) to convey your designs. Here’s an example of what you just annotated in Benchling: https://sbolstandard.org/

4.3. On Twist, Select The “Genes” Option

4.4. Select “Clonal Genes” option

For this demonstration, we’ll choose Clonal Genes. You’ll select clonal genes or gene fragments depending on your final project.

Historically, HTGAA projects using clonal genes (circular DNA) have reached experimental results 1-2 weeks quicker because they can be transformed directly into E. coli without additional assembly.

Gene fragments (linear DNA) offer greater design flexibility but typically require an assembly or cloning step prior to transformation. An advantage is If designed with the appropriate exonuclease protection, gene fragments can be used directly in cell-free expression.

4.5. Import your sequence

You just took an amino acid sequence of interest and converted it into DNA, codon optimized it, and built an expression cassette around it! Choose the Nucleotide Sequence option and Upload Sequence File to upload your FASTA file.

4.6. Choose Your Vector

Since we’re ordering a clonal gene, you will need to refer to Twist’s Vector Catalog to choose your circular backbone. You can think of this as taking your linear expression cassette for your protein of interest, and completing the rest of the circle!

The backbone confers many special properties like antibiotic resistance, an origin of replication, and more. Discuss with your node to decide on appropriate antibiotic options. At MIT/Harvard, you can use Ampicillin, Chloramphenicol, or Kanamycin resistance.

Twist vectors do not contain restriction sites near the insert fragment, so make sure to flank your design with cut sites if you are intending to extract this DNA insert fragment later.

For this demonstration, choose a Twist cloning vectors like pTwist Amp High Copy.

Click into your sequence and select download construct (GenBank) to get the full plasmid sequence:

Go back to your Benchling account. Inside of a folder, click the import DNA/RNA sequence button and upload the GenBank file you just downloaded.

This is the plasmid you just built with your expression cassette included. Congratulations on building your first plasmid!

Part 5: DNA Read/Write/Edit

5.1 DNA Read (i) What DNA would you want to sequence (e.g., read) and why? This could be DNA related to human health (e.g. genes related to disease research), environmental monitoring (e.g., sewage waste water, biodiversity analysis), and beyond (e.g. DNA data storage, biobank).

(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why? Also answer the following questions:

  1. Is your method first-, second- or third-generation or other? How so?
  2. 2.- What is your input? How do you prepare your input (e.g. fragmentation, adapter ligation, PCR)? List the essential steps.
  3. What are the essential steps of your chosen sequencing technology, how does it decode the bases of your DNA sample (base calling)?
  4. What is the output of your chosen sequencing technology?

5.2 DNA Write

(i) What DNA would you want to synthesize (e.g., write) and why? These could be individual genes, clusters of genes or genetic circuits, whole genomes, and beyond. As described in class thus far, applications could range from therapeutics and drug discovery (e.g., mRNA vaccines and therapies) to novel biomaterials (e.g. structural proteins), to sensors (e.g., genetic circuits for sensing and responding to inflammation, environmental stimuli, etc.), to art (DNA origamis). If possible, include the specific genetic sequence(s) of what you would like to synthesize! You will have the opportunity to actually have Twist synthesize these DNA constructs! :)

See some famous examples of DNA design

(ii) What technology or technologies would you use to perform this DNA synthesis and why? Also answer the following questions:

What are the essential steps of your chosen sequencing methods? What are the limitations of your sequencing method (if any) in terms of speed, accuracy, scalability?

5.3 DNA Edit

(i) What DNA would you want to edit and why? In class, George shared a variety of ways to edit the genes and genomes of humans and other organisms. Such DNA editing technologies have profound implications for human health, development, and even human longevity and human augmentation. DNA editing is also already commonly leveraged for flora and fauna, for example in nature conservation efforts, (animal/plant restoration, de-extinction), or in agriculture (e.g. plant breeding, nitrogen fixation). What kinds of edits might you want to make to DNA (e.g., human genomes and beyond) and why?

(ii) What technology or technologies would you use to perform these DNA edits and why? Also answer the following questions:

  1. How does your technology of choice edit DNA? What are the essential steps?
  2. What preparation do you need to do (e.g. design steps) and what is the input (e.g. DNA template, enzymes, plasmids, primers, guides, cells) for the editing?
  3. What are the limitations of your editing methods (if any) in terms of efficiency or precision?

Week 3 HW: Lab Automation

cover image cover image

Python Script for Opentrons Artwork

I have created Space Invaders with green and red because my node Designer Cells mentioned they have only red (mrfp1) and green (sfGFP) right now. I have evil plans to create also text “Designer Cells” down the Space Invaders logo:) - done!-

interface interface

Space Invaders - Opentrons Bio Art

Designer: Beyza Batır
Design: Space Invaders [mrfp1(red)+sfGFP(green)]

Google Colab

Simulation

interface interface

Protocol Code

from opentrons import types

metadata = {
    'author': 'Beyza Batır',
    'protocolName': 'HTGAA Opentrons Lab',
    'description': 'SpaceInvaders',
    'source': 'HTGAA 2026 Opentrons Lab',
    'apiLevel': '2.20'
}

TIP_RACK_DECK_SLOT = 9
COLORS_DECK_SLOT = 6
AGAR_DECK_SLOT = 5
PIPETTE_STARTING_TIP_WELL = 'A1'

well_colors = {
    'A1': 'Red',
    'B1': 'Green',
}

def run(protocol):
    # --- Load labware ---
    tips_20ul = protocol.load_labware('opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Opentrons 20uL Tips')
    pipette_20ul = protocol.load_instrument("p20_single_gen2", "right", [tips_20ul])
    temperature_module = protocol.load_module('temperature module gen2', COLORS_DECK_SLOT)
    temperature_plate = temperature_module.load_labware('opentrons_96_aluminumblock_generic_pcr_strip_200ul', 'Cold Plate')
    color_plate = temperature_plate
    agar_plate = protocol.load_labware('htgaa_agar_plate', AGAR_DECK_SLOT, 'Agar Plate')
    center_location = agar_plate['A1'].top()
    pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)

    # --- Helper functions ---
    def location_of_color(color_string):
        for well, color in well_colors.items():
            if color.lower() == color_string.lower():
                return color_plate[well]
        raise ValueError(f"No well found with color {color_string}")

    def dispense_and_detach(pipette, volume, location):
        above_location = location.move(types.Point(z=5))
        pipette.move_to(above_location)
        pipette.dispense(volume, location)
        pipette.move_to(above_location)

    # --- Coordinates from GUI ---
    mrfp1_points = [(-9.9, 38.5),(-7.7, 38.5),(-5.5, 38.5),(-3.3, 38.5),(-1.1, 38.5),(1.1, 38.5),(3.3, 38.5),(5.5, 38.5),(7.7, 38.5),(9.9, 38.5),(-16.5, 36.3),(-14.3, 36.3),(-12.1, 36.3),(-9.9, 36.3),(-7.7, 36.3),(-5.5, 36.3),(-3.3, 36.3),(-1.1, 36.3),(1.1, 36.3),(3.3, 36.3),(5.5, 36.3),(7.7, 36.3),(9.9, 36.3),(12.1, 36.3),(14.3, 36.3),(16.5, 36.3),(-20.9, 34.1),(-18.7, 34.1),(-16.5, 34.1),(-14.3, 34.1),(-12.1, 34.1),(-9.9, 34.1),(-7.7, 34.1),(-5.5, 34.1),(-3.3, 34.1),(-1.1, 34.1),(1.1, 34.1),(3.3, 34.1),(5.5, 34.1),(7.7, 34.1),(9.9, 34.1),(12.1, 34.1),(14.3, 34.1),(16.5, 34.1),(18.7, 34.1),(20.9, 34.1),(-23.1, 31.9),(-20.9, 31.9),(-18.7, 31.9),(-16.5, 31.9),(-14.3, 31.9),(-12.1, 31.9),(-9.9, 31.9),(-7.7, 31.9),(-5.5, 31.9),(-3.3, 31.9),(-1.1, 31.9),(1.1, 31.9),(3.3, 31.9),(5.5, 31.9),(7.7, 31.9),(9.9, 31.9),(12.1, 31.9),(14.3, 31.9),(16.5, 31.9),(18.7, 31.9),(20.9, 31.9),(23.1, 31.9),(-25.3, 29.7),(-23.1, 29.7),(-20.9, 29.7),(-18.7, 29.7),(-16.5, 29.7),(-14.3, 29.7),(-12.1, 29.7),(-9.9, 29.7),(-7.7, 29.7),(-5.5, 29.7),(-3.3, 29.7),(-1.1, 29.7),(1.1, 29.7),(3.3, 29.7),(5.5, 29.7),(7.7, 29.7),(9.9, 29.7),(12.1, 29.7),(14.3, 29.7),(16.5, 29.7),(18.7, 29.7),(20.9, 29.7),(23.1, 29.7),(25.3, 29.7),(-27.5, 27.5),(-25.3, 27.5),(-23.1, 27.5),(-20.9, 27.5),(-18.7, 27.5),(-16.5, 27.5),(-14.3, 27.5),(-12.1, 27.5),(-9.9, 27.5),(-7.7, 27.5),(-5.5, 27.5),(-3.3, 27.5),(-1.1, 27.5),(1.1, 27.5),(3.3, 27.5),(5.5, 27.5),(7.7, 27.5),(9.9, 27.5),(12.1, 27.5),(14.3, 27.5),(16.5, 27.5),(18.7, 27.5),(20.9, 27.5),(23.1, 27.5),(25.3, 27.5),(27.5, 27.5),(-29.7, 25.3),(-27.5, 25.3),(-25.3, 25.3),(-23.1, 25.3),(-20.9, 25.3),(-18.7, 25.3),(-16.5, 25.3),(-14.3, 25.3),(-12.1, 25.3),(-9.9, 25.3),(-7.7, 25.3),(-5.5, 25.3),(-3.3, 25.3),(-1.1, 25.3),(1.1, 25.3),(3.3, 25.3),(5.5, 25.3),(7.7, 25.3),(9.9, 25.3),(12.1, 25.3),(14.3, 25.3),(16.5, 25.3),(18.7, 25.3),(20.9, 25.3),(23.1, 25.3),(25.3, 25.3),(27.5, 25.3),(29.7, 25.3),(-31.9, 23.1),(-29.7, 23.1),(-27.5, 23.1),(-25.3, 23.1),(-23.1, 23.1),(-16.5, 23.1),(-14.3, 23.1),(-12.1, 23.1),(-9.9, 23.1),(-7.7, 23.1),(-5.5, 23.1),(-3.3, 23.1),(-1.1, 23.1),(1.1, 23.1),(3.3, 23.1),(5.5, 23.1),(7.7, 23.1),(9.9, 23.1),(12.1, 23.1),(14.3, 23.1),(16.5, 23.1),(23.1, 23.1),(25.3, 23.1),(27.5, 23.1),(29.7, 23.1),(31.9, 23.1),(-34.1, 20.9),(-31.9, 20.9),(-29.7, 20.9),(-27.5, 20.9),(-25.3, 20.9),(-23.1, 20.9),(-16.5, 20.9),(-14.3, 20.9),(-12.1, 20.9),(-9.9, 20.9),(-7.7, 20.9),(-5.5, 20.9),(-3.3, 20.9),(-1.1, 20.9),(1.1, 20.9),(3.3, 20.9),(5.5, 20.9),(7.7, 20.9),(9.9, 20.9),(12.1, 20.9),(14.3, 20.9),(16.5, 20.9),(23.1, 20.9),(25.3, 20.9),(27.5, 20.9),(29.7, 20.9),(31.9, 20.9),(34.1, 20.9),(-34.1, 18.7),(-31.9, 18.7),(-29.7, 18.7),(-27.5, 18.7),(-25.3, 18.7),(-23.1, 18.7),(-16.5, 18.7),(-14.3, 18.7),(-12.1, 18.7),(-9.9, 18.7),(-7.7, 18.7),(-5.5, 18.7),(-3.3, 18.7),(-1.1, 18.7),(1.1, 18.7),(3.3, 18.7),(5.5, 18.7),(7.7, 18.7),(9.9, 18.7),(12.1, 18.7),(14.3, 18.7),(16.5, 18.7),(23.1, 18.7),(25.3, 18.7),(27.5, 18.7),(29.7, 18.7),(31.9, 18.7),(34.1, 18.7),(-36.3, 16.5),(-34.1, 16.5),(-31.9, 16.5),(-29.7, 16.5),(-27.5, 16.5),(-25.3, 16.5),(-23.1, 16.5),(-20.9, 16.5),(-18.7, 16.5),(-9.9, 16.5),(-7.7, 16.5),(-5.5, 16.5),(-3.3, 16.5),(-1.1, 16.5),(1.1, 16.5),(3.3, 16.5),(5.5, 16.5),(7.7, 16.5),(9.9, 16.5),(18.7, 16.5),(20.9, 16.5),(23.1, 16.5),(25.3, 16.5),(27.5, 16.5),(29.7, 16.5),(31.9, 16.5),(34.1, 16.5),(36.3, 16.5),(-36.3, 14.3),(-34.1, 14.3),(-31.9, 14.3),(-29.7, 14.3),(-27.5, 14.3),(-25.3, 14.3),(-23.1, 14.3),(-20.9, 14.3),(-18.7, 14.3),(-9.9, 14.3),(-7.7, 14.3),(-5.5, 14.3),(-3.3, 14.3),(-1.1, 14.3),(1.1, 14.3),(3.3, 14.3),(5.5, 14.3),(7.7, 14.3),(9.9, 14.3),(18.7, 14.3),(20.9, 14.3),(23.1, 14.3),(25.3, 14.3),(27.5, 14.3),(29.7, 14.3),(31.9, 14.3),(34.1, 14.3),(36.3, 14.3),(-36.3, 12.1),(-34.1, 12.1),(-31.9, 12.1),(-29.7, 12.1),(-27.5, 12.1),(-25.3, 12.1),(-23.1, 12.1),(23.1, 12.1),(25.3, 12.1),(27.5, 12.1),(29.7, 12.1),(31.9, 12.1),(34.1, 12.1),(36.3, 12.1),(-38.5, 9.9),(-36.3, 9.9),(-34.1, 9.9),(-31.9, 9.9),(-29.7, 9.9),(-27.5, 9.9),(-25.3, 9.9),(-23.1, 9.9),(23.1, 9.9),(25.3, 9.9),(27.5, 9.9),(29.7, 9.9),(31.9, 9.9),(34.1, 9.9),(36.3, 9.9),(38.5, 9.9),(-38.5, 7.7),(-36.3, 7.7),(-34.1, 7.7),(-31.9, 7.7),(-29.7, 7.7),(-27.5, 7.7),(-25.3, 7.7),(-23.1, 7.7),(23.1, 7.7),(25.3, 7.7),(27.5, 7.7),(29.7, 7.7),(31.9, 7.7),(34.1, 7.7),(36.3, 7.7),(38.5, 7.7),(-38.5, 5.5),(-36.3, 5.5),(-34.1, 5.5),(-31.9, 5.5),(-16.5, 5.5),(-14.3, 5.5),(-12.1, 5.5),(12.1, 5.5),(14.3, 5.5),(16.5, 5.5),(31.9, 5.5),(34.1, 5.5),(36.3, 5.5),(38.5, 5.5),(-38.5, 3.3),(-36.3, 3.3),(-34.1, 3.3),(-31.9, 3.3),(-16.5, 3.3),(-14.3, 3.3),(-12.1, 3.3),(12.1, 3.3),(14.3, 3.3),(16.5, 3.3),(31.9, 3.3),(34.1, 3.3),(36.3, 3.3),(38.5, 3.3),(-38.5, 1.1),(-36.3, 1.1),(-34.1, 1.1),(-31.9, 1.1),(-16.5, 1.1),(-14.3, 1.1),(-12.1, 1.1),(12.1, 1.1),(14.3, 1.1),(16.5, 1.1),(31.9, 1.1),(34.1, 1.1),(36.3, 1.1),(38.5, 1.1),(-38.5, -1.1),(38.5, -1.1),(-38.5, -3.3),(38.5, -3.3),(-38.5, -5.5),(38.5, -5.5),(-38.5, -7.7),(-29.7, -7.7),(-27.5, -7.7),(-25.3, -7.7),(25.3, -7.7),(27.5, -7.7),(29.7, -7.7),(38.5, -7.7),(-38.5, -9.9),(-29.7, -9.9),(-27.5, -9.9),(-25.3, -9.9),(25.3, -9.9),(27.5, -9.9),(29.7, -9.9),(38.5, -9.9),(-29.7, -12.1),(-27.5, -12.1),(-25.3, -12.1),(-16.5, -12.1),(-14.3, -12.1),(-12.1, -12.1),(-9.9, -12.1),(-7.7, -12.1),(-5.5, -12.1),(-3.3, -12.1),(-1.1, -12.1),(1.1, -12.1),(3.3, -12.1),(5.5, -12.1),(7.7, -12.1),(9.9, -12.1),(12.1, -12.1),(14.3, -12.1),(16.5, -12.1),(25.3, -12.1),(27.5, -12.1),(29.7, -12.1),(-29.7, -14.3),(-27.5, -14.3),(-25.3, -14.3),(-16.5, -14.3),(-14.3, -14.3),(-12.1, -14.3),(-9.9, -14.3),(-7.7, -14.3),(-5.5, -14.3),(-3.3, -14.3),(-1.1, -14.3),(1.1, -14.3),(3.3, -14.3),(5.5, -14.3),(7.7, -14.3),(9.9, -14.3),(12.1, -14.3),(14.3, -14.3),(16.5, -14.3),(25.3, -14.3),(27.5, -14.3),(29.7, -14.3),(-29.7, -16.5),(-27.5, -16.5),(-25.3, -16.5),(-16.5, -16.5),(-14.3, -16.5),(-12.1, -16.5),(-9.9, -16.5),(-7.7, -16.5),(-5.5, -16.5),(-3.3, -16.5),(-1.1, -16.5),(1.1, -16.5),(3.3, -16.5),(5.5, -16.5),(7.7, -16.5),(9.9, -16.5),(12.1, -16.5),(14.3, -16.5),(16.5, -16.5),(25.3, -16.5),(27.5, -16.5),(29.7, -16.5),(-34.1, -18.7),(-31.9, -18.7),(-29.7, -18.7),(-27.5, -18.7),(-25.3, -18.7),(-23.1, -18.7),(-20.9, -18.7),(-18.7, -18.7),(-3.3, -18.7),(-1.1, -18.7),(1.1, -18.7),(3.3, -18.7),(18.7, -18.7),(20.9, -18.7),(23.1, -18.7),(25.3, -18.7),(27.5, -18.7),(29.7, -18.7),(31.9, -18.7),(34.1, -18.7),(-34.1, -20.9),(-31.9, -20.9),(-29.7, -20.9),(-27.5, -20.9),(-25.3, -20.9),(-23.1, -20.9),(-20.9, -20.9),(-18.7, -20.9),(-3.3, -20.9),(-1.1, -20.9),(1.1, -20.9),(3.3, -20.9),(18.7, -20.9),(20.9, -20.9),(23.1, -20.9),(25.3, -20.9),(27.5, -20.9),(29.7, -20.9),(31.9, -20.9),(34.1, -20.9),(-31.9, -23.1),(-29.7, -23.1),(-27.5, -23.1),(-25.3, -23.1),(-23.1, -23.1),(-20.9, -23.1),(-18.7, -23.1),(-16.5, -23.1),(-14.3, -23.1),(-12.1, -23.1),(-9.9, -23.1),(-7.7, -23.1),(-5.5, -23.1),(-3.3, -23.1),(-1.1, -23.1),(1.1, -23.1),(3.3, -23.1),(5.5, -23.1),(7.7, -23.1),(9.9, -23.1),(12.1, -23.1),(14.3, -23.1),(16.5, -23.1),(18.7, -23.1),(20.9, -23.1),(23.1, -23.1),(25.3, -23.1),(27.5, -23.1),(29.7, -23.1),(31.9, -23.1),(-29.7, -25.3),(-27.5, -25.3),(-25.3, -25.3),(-23.1, -25.3),(-20.9, -25.3),(-18.7, -25.3),(-16.5, -25.3),(-14.3, -25.3),(-12.1, -25.3),(-9.9, -25.3),(-7.7, -25.3),(-5.5, -25.3),(-3.3, -25.3),(-1.1, -25.3),(1.1, -25.3),(3.3, -25.3),(5.5, -25.3),(7.7, -25.3),(9.9, -25.3),(12.1, -25.3),(14.3, -25.3),(16.5, -25.3),(18.7, -25.3),(20.9, -25.3),(23.1, -25.3),(25.3, -25.3),(27.5, -25.3),(29.7, -25.3),(-27.5, -27.5),(-25.3, -27.5),(-23.1, -27.5),(-20.9, -27.5),(-18.7, -27.5),(-16.5, -27.5),(-14.3, -27.5),(-12.1, -27.5),(-9.9, -27.5),(-7.7, -27.5),(-5.5, -27.5),(-3.3, -27.5),(-1.1, -27.5),(1.1, -27.5),(3.3, -27.5),(5.5, -27.5),(7.7, -27.5),(9.9, -27.5),(12.1, -27.5),(14.3, -27.5),(16.5, -27.5),(18.7, -27.5),(20.9, -27.5),(23.1, -27.5),(25.3, -27.5),(27.5, -27.5),(-25.3, -29.7),(-23.1, -29.7),(-20.9, -29.7),(-18.7, -29.7),(-16.5, -29.7),(-14.3, -29.7),(-12.1, -29.7),(-9.9, -29.7),(-7.7, -29.7),(-5.5, -29.7),(-3.3, -29.7),(-1.1, -29.7),(1.1, -29.7),(3.3, -29.7),(5.5, -29.7),(7.7, -29.7),(9.9, -29.7),(12.1, -29.7),(14.3, -29.7),(16.5, -29.7),(18.7, -29.7),(20.9, -29.7),(23.1, -29.7),(25.3, -29.7),(-23.1, -31.9),(-20.9, -31.9),(-18.7, -31.9),(-16.5, -31.9),(-14.3, -31.9),(-12.1, -31.9),(-9.9, -31.9),(-7.7, -31.9),(-5.5, -31.9),(-3.3, -31.9),(-1.1, -31.9),(1.1, -31.9),(3.3, -31.9),(5.5, -31.9),(7.7, -31.9),(9.9, -31.9),(12.1, -31.9),(14.3, -31.9),(16.5, -31.9),(18.7, -31.9),(20.9, -31.9),(23.1, -31.9),(-20.9, -34.1),(-18.7, -34.1),(-16.5, -34.1),(-14.3, -34.1),(-12.1, -34.1),(-9.9, -34.1),(-7.7, -34.1),(-5.5, -34.1),(-3.3, -34.1),(-1.1, -34.1),(1.1, -34.1),(3.3, -34.1),(5.5, -34.1),(7.7, -34.1),(9.9, -34.1),(12.1, -34.1),(14.3, -34.1),(16.5, -34.1),(18.7, -34.1),(20.9, -34.1),(-16.5, -36.3),(-14.3, -36.3),(-12.1, -36.3),(-9.9, -36.3),(-7.7, -36.3),(-5.5, -36.3),(-3.3, -36.3),(-1.1, -36.3),(1.1, -36.3),(3.3, -36.3),(5.5, -36.3),(7.7, -36.3),(9.9, -36.3),(12.1, -36.3),(14.3, -36.3),(16.5, -36.3),(-9.9, -38.5),(-7.7, -38.5),(-5.5, -38.5),(-3.3, -38.5),(-1.1, -38.5),(1.1, -38.5),(3.3, -38.5),(5.5, -38.5),(7.7, -38.5),(9.9, -38.5)]

    sfgfp_points = [(-20.9, 23.1),(-18.7, 23.1),(18.7, 23.1),(20.9, 23.1),(-20.9, 20.9),(-18.7, 20.9),(18.7, 20.9),(20.9, 20.9),(-20.9, 18.7),(-18.7, 18.7),(18.7, 18.7),(20.9, 18.7),(-16.5, 16.5),(-14.3, 16.5),(-12.1, 16.5),(12.1, 16.5),(14.3, 16.5),(16.5, 16.5),(-16.5, 14.3),(-14.3, 14.3),(-12.1, 14.3),(12.1, 14.3),(14.3, 14.3),(16.5, 14.3),(-20.9, 12.1),(-18.7, 12.1),(-16.5, 12.1),(-14.3, 12.1),(-12.1, 12.1),(-9.9, 12.1),(-7.7, 12.1),(-5.5, 12.1),(-3.3, 12.1),(-1.1, 12.1),(1.1, 12.1),(3.3, 12.1),(5.5, 12.1),(7.7, 12.1),(9.9, 12.1),(12.1, 12.1),(14.3, 12.1),(16.5, 12.1),(18.7, 12.1),(20.9, 12.1),(-20.9, 9.9),(-18.7, 9.9),(-16.5, 9.9),(-14.3, 9.9),(-12.1, 9.9),(-9.9, 9.9),(-7.7, 9.9),(-5.5, 9.9),(-3.3, 9.9),(-1.1, 9.9),(1.1, 9.9),(3.3, 9.9),(5.5, 9.9),(7.7, 9.9),(9.9, 9.9),(12.1, 9.9),(14.3, 9.9),(16.5, 9.9),(18.7, 9.9),(20.9, 9.9),(-20.9, 7.7),(-18.7, 7.7),(-16.5, 7.7),(-14.3, 7.7),(-12.1, 7.7),(-9.9, 7.7),(-7.7, 7.7),(-5.5, 7.7),(-3.3, 7.7),(-1.1, 7.7),(1.1, 7.7),(3.3, 7.7),(5.5, 7.7),(7.7, 7.7),(9.9, 7.7),(12.1, 7.7),(14.3, 7.7),(16.5, 7.7),(18.7, 7.7),(20.9, 7.7),(-29.7, 5.5),(-27.5, 5.5),(-25.3, 5.5),(-23.1, 5.5),(-20.9, 5.5),(-18.7, 5.5),(-9.9, 5.5),(-7.7, 5.5),(-5.5, 5.5),(-3.3, 5.5),(-1.1, 5.5),(1.1, 5.5),(3.3, 5.5),(5.5, 5.5),(7.7, 5.5),(9.9, 5.5),(18.7, 5.5),(20.9, 5.5),(23.1, 5.5),(25.3, 5.5),(27.5, 5.5),(29.7, 5.5),(-29.7, 3.3),(-27.5, 3.3),(-25.3, 3.3),(-23.1, 3.3),(-20.9, 3.3),(-18.7, 3.3),(-9.9, 3.3),(-7.7, 3.3),(-5.5, 3.3),(-3.3, 3.3),(-1.1, 3.3),(1.1, 3.3),(3.3, 3.3),(5.5, 3.3),(7.7, 3.3),(9.9, 3.3),(18.7, 3.3),(20.9, 3.3),(23.1, 3.3),(25.3, 3.3),(27.5, 3.3),(29.7, 3.3),(-29.7, 1.1),(-27.5, 1.1),(-25.3, 1.1),(-23.1, 1.1),(-20.9, 1.1),(-18.7, 1.1),(-9.9, 1.1),(-7.7, 1.1),(-5.5, 1.1),(-3.3, 1.1),(-1.1, 1.1),(1.1, 1.1),(3.3, 1.1),(5.5, 1.1),(7.7, 1.1),(9.9, 1.1),(18.7, 1.1),(20.9, 1.1),(23.1, 1.1),(25.3, 1.1),(27.5, 1.1),(29.7, 1.1),(-36.3, -1.1),(-34.1, -1.1),(-31.9, -1.1),(-29.7, -1.1),(-27.5, -1.1),(-25.3, -1.1),(-23.1, -1.1),(-20.9, -1.1),(-18.7, -1.1),(-16.5, -1.1),(-14.3, -1.1),(-12.1, -1.1),(-9.9, -1.1),(-7.7, -1.1),(-5.5, -1.1),(-3.3, -1.1),(-1.1, -1.1),(1.1, -1.1),(3.3, -1.1),(5.5, -1.1),(7.7, -1.1),(9.9, -1.1),(12.1, -1.1),(14.3, -1.1),(16.5, -1.1),(18.7, -1.1),(20.9, -1.1),(23.1, -1.1),(25.3, -1.1),(27.5, -1.1),(29.7, -1.1),(31.9, -1.1),(34.1, -1.1),(36.3, -1.1),(-36.3, -3.3),(-34.1, -3.3),(-31.9, -3.3),(-29.7, -3.3),(-27.5, -3.3),(-25.3, -3.3),(-23.1, -3.3),(-20.9, -3.3),(-18.7, -3.3),(-16.5, -3.3),(-14.3, -3.3),(-12.1, -3.3),(-9.9, -3.3),(-7.7, -3.3),(-5.5, -3.3),(-3.3, -3.3),(-1.1, -3.3),(1.1, -3.3),(3.3, -3.3),(5.5, -3.3),(7.7, -3.3),(9.9, -3.3),(12.1, -3.3),(14.3, -3.3),(16.5, -3.3),(18.7, -3.3),(20.9, -3.3),(23.1, -3.3),(25.3, -3.3),(27.5, -3.3),(29.7, -3.3),(31.9, -3.3),(34.1, -3.3),(36.3, -3.3),(-36.3, -5.5),(-34.1, -5.5),(-31.9, -5.5),(-29.7, -5.5),(-27.5, -5.5),(-25.3, -5.5),(-23.1, -5.5),(-20.9, -5.5),(-18.7, -5.5),(-16.5, -5.5),(-14.3, -5.5),(-12.1, -5.5),(-9.9, -5.5),(-7.7, -5.5),(-5.5, -5.5),(-3.3, -5.5),(-1.1, -5.5),(1.1, -5.5),(3.3, -5.5),(5.5, -5.5),(7.7, -5.5),(9.9, -5.5),(12.1, -5.5),(14.3, -5.5),(16.5, -5.5),(18.7, -5.5),(20.9, -5.5),(23.1, -5.5),(25.3, -5.5),(27.5, -5.5),(29.7, -5.5),(31.9, -5.5),(34.1, -5.5),(36.3, -5.5),(-36.3, -7.7),(-34.1, -7.7),(-31.9, -7.7),(-23.1, -7.7),(-20.9, -7.7),(-18.7, -7.7),(-16.5, -7.7),(-14.3, -7.7),(-12.1, -7.7),(-9.9, -7.7),(-7.7, -7.7),(-5.5, -7.7),(-3.3, -7.7),(-1.1, -7.7),(1.1, -7.7),(3.3, -7.7),(5.5, -7.7),(7.7, -7.7),(9.9, -7.7),(12.1, -7.7),(14.3, -7.7),(16.5, -7.7),(18.7, -7.7),(20.9, -7.7),(23.1, -7.7),(31.9, -7.7),(34.1, -7.7),(36.3, -7.7),(-36.3, -9.9),(-34.1, -9.9),(-31.9, -9.9),(-23.1, -9.9),(-20.9, -9.9),(-18.7, -9.9),(-16.5, -9.9),(-14.3, -9.9),(-12.1, -9.9),(-9.9, -9.9),(-7.7, -9.9),(-5.5, -9.9),(-3.3, -9.9),(-1.1, -9.9),(1.1, -9.9),(3.3, -9.9),(5.5, -9.9),(7.7, -9.9),(9.9, -9.9),(12.1, -9.9),(14.3, -9.9),(16.5, -9.9),(18.7, -9.9),(20.9, -9.9),(23.1, -9.9),(31.9, -9.9),(34.1, -9.9),(36.3, -9.9),(-36.3, -12.1),(-34.1, -12.1),(-31.9, -12.1),(-23.1, -12.1),(-20.9, -12.1),(-18.7, -12.1),(18.7, -12.1),(20.9, -12.1),(23.1, -12.1),(31.9, -12.1),(34.1, -12.1),(36.3, -12.1),(-36.3, -14.3),(-34.1, -14.3),(-31.9, -14.3),(-23.1, -14.3),(-20.9, -14.3),(-18.7, -14.3),(18.7, -14.3),(20.9, -14.3),(23.1, -14.3),(31.9, -14.3),(34.1, -14.3),(36.3, -14.3),(-36.3, -16.5),(-34.1, -16.5),(-31.9, -16.5),(-23.1, -16.5),(-20.9, -16.5),(-18.7, -16.5),(18.7, -16.5),(20.9, -16.5),(23.1, -16.5),(31.9, -16.5),(34.1, -16.5),(36.3, -16.5),(-16.5, -18.7),(-14.3, -18.7),(-12.1, -18.7),(-9.9, -18.7),(-7.7, -18.7),(-5.5, -18.7),(5.5, -18.7),(7.7, -18.7),(9.9, -18.7),(12.1, -18.7),(14.3, -18.7),(16.5, -18.7),(-16.5, -20.9),(-14.3, -20.9),(-12.1, -20.9),(-9.9, -20.9),(-7.7, -20.9),(-5.5, -20.9),(5.5, -20.9),(7.7, -20.9),(9.9, -20.9),(12.1, -20.9),(14.3, -20.9),(16.5, -20.9)]

    # --- Patterning ---
    VOLUME = 1  # µL per dot

    # Red layer (mrfp1)
    pipette_20ul.pick_up_tip()
    for (x, y) in mrfp1_points:
        pipette_20ul.aspirate(VOLUME, location_of_color('Red'))
        target = center_location.move(types.Point(x=x, y=y, z=0))
        dispense_and_detach(pipette_20ul, VOLUME, target)
    pipette_20ul.drop_tip()

    # Green layer (sfgfp)
    pipette_20ul.pick_up_tip()
    for (x, y) in sfgfp_points:
        pipette_20ul.aspirate(VOLUME, location_of_color('Green'))
        target = center_location.move(types.Point(x=x, y=y, z=0))
        dispense_and_detach(pipette_20ul, VOLUME, target)
    pipette_20ul.drop_tip()
  ###

  • I am using Claude as a console for errors.

  • Submitted my Python file via Google form both HTGAA and Node.

Post-Lab Questions

One of the great parts about having an automated robot is being able to precisely mix, deposit, and run reactions without much intervention, and design and deploy experiments remotely.

Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications. I would like to mention from this research:

This paper is particularly relevant because it addresses a critical but often overlooked problem in laboratory automation: the gap between intended and actual liquid deposition. As demonstrated in the images captured by Chen (2026), this discrepancy becomes strikingly clear when comparing the physical petri dish under normal lighting conditions with its UV-illuminated counterpart. Under standard light, the dish appears largely as expected, with the deposited pattern barely distinguishable to the naked eye (Chen, 20256, Figure 2). However, when the same plate is examined under UV light, small, unwanted droplets become clearly visible in places where they were not present in the original design (Chen, 2026, Figures 3-4). The computer vision algorithm developed in this study successfully detects and maps these deviations, marking the spots with color codes according to their size to highlight the extent of the error.

cover image cover image

Figure 1-4: Images captured from the Opentrons OT-2 liquid handling experiments by Yanchen Chen. Credit: Yanchen Chen (24.02.2026).

These satellite droplets arise from well-known physical phenomena in liquid handling, such as surface tension-driven splashing or residual liquid remaining on pipette tips between transfers. What makes this finding biologically significant is that in high-precision applications such as drug screening, dose-response assays, or microbial growth experiments, even a small unintended deposit can introduce a compound or organism into a zone where it was never meant to be. This cross-contamination would silently corrupt experimental results, and without a real-time quality control system, the researcher would have no way of knowing the data was compromised.

The novel contribution of this work is therefore not purely engineering: by enabling the Opentrons OT-2 to detect and flag these errors autonomously using computer vision, the system directly protects the integrity of biological experiments. This transforms the robot from a simple liquid-dispensing tool into a self-monitoring platform capable of ensuring experimental validity; a meaningful advancement for any biological application that depends on precise, contamination-free liquid handling.

References:

Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details. While your description/project idea doesn’t need to be set in stone, we would like to see core details of what you would automate. This is due at the start of lecture and does not need to be tested on the Opentrons yet.

  • Example 1: You are creating a custom fabric, and want to deposit art onto specific parts that need to be intertwined in odd ways. You can design a 3D printed holder to attach this fabric to it, and be able to deposit bio art on top. Check out the Opentrons 3D Printing Directory.

  • Example 2: You are using the cloud laboratory to screen an array of biosensor constructs that you design, synthesize, and express using cell-free protein synthesis.

Echo transfer biosensor constructs and any required cofactors into specified wells. Bravo stamp in CPFS reagent master mix into all wells of a 96-well / 384-well plate. Multiflo dispense the CFPS lysate to all wells to start protein expression. PlateLoc seal the plate. Inheco incubate the plate at 37°C while the biosensor proteins are synthesized. XPeel remove the seal. PHERAstar measure fluorescence to compare biosensor responses.


Final Project Ideas

As explained in this week’s recitation, add 1-3 slides in your Node’s section of this slide deck with 3 ideas you have for an Individual Final Project. Be sure to put your name, city, and country on your slide!

Embedded slide deck of 1-3 slides with 3 ideas you have for an Individual Final Project. by naming (Beyza Batır, Izmir, Turkey)

I will upload my slides on CL powerpoint in DC Labs Student#7 section.

Brief

Reading & Resources Opentrons API Documentation: https://docs.opentrons.com/python-api/ Opentrons Artwork GUI Website: http://opentrons-art.rcdonovan.com/ Opentrons Artwork Colab: HTGAA26 Opentrons Colab Automation Equipment: HTGAA 2026 Recitation: Lab Automation, Opentrons Art, Intro to Cloud Laboratories

Extras

cover image cover image

These extra designs were created within the “Automation Art Interface” to explore possibilities with different weights, colors, and area usage. I also created Designer Cells artworks for our node and this time corrected my mistakes (size, spacing, safe canvas margin) that did not comply with the requirements announced on HTGAA Google Colab. If needed, I can prepare Google Colab for all designs.

Week 4 HW: Protein Design I

Part A: Conceptual Questions

Answer any 9 of the following questions from Shuguang Zhang: (i.e. you can select two to skip)

A.1 How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)

  • Approximately 6×10 23 amino acid molecules.

1 gram of amino acids would be 0.01 moles (1g/100g/mol). If 500g of meat is roughly 25% protein (about 125g), we’d have 1.25 moles. Multiplying by Avogadro’s number (6.022×10 23), we get approximately 7.5×10 23 molecules.

A.2 Why do humans eat beef but do not become a cow, eat fish but do not become fish?

Proteins are long chains of amino acids linked together by peptide bonds. When you eat a protein (say, from beef), your digestive system breaks those bonds.

After those bonds are broken and we have free amino acids. DNA whose instructions determine what new protein gets built from them. Because what determines the identity of a multiorganism is its own DNA and developmental program. Our cells then rebuild proteins using your own mRNA, our body breaks it down into free amino acids, those amino acids then get reassembled into proteins.

So even if the amino acids originally came from cow protein, once digestion breaks the cow proteins apart, those amino acids enter the common amino acid pool inside the body.

Then human cells use:

human DNA → to make human mRNA → which ribosomes read to build human proteins.

The ribosome does not “remember” where an amino acid came from. An amino acid from beef, beans, or fish is chemically just an amino acid once absorbed.

What determines the final protein is the sequence encoded in the mRNA being translated at that moment. Nutrition only provides raw materials. The genome encodes the sequence, that sequence is transcribed to mRNA, and the ribosome reads the mRNA to string amino acids together in the correct order. The amino acid’s origin is irrelevant to that reading process. Our DNA dictates your protein sequence, regardless of which organism donated the raw amino acid building blocks.

A.3 Why are there only 20 natural amino acids? Because these are the ones directly encoded by our universal genetic code. Evolution likely chose these 20 amino acids because they provide enough chemical diversity to create virtually any protein structure imaginable without becoming overly repetitive; some are fatty, some acidic, some bulky. It’s like having a LEGO set with 20 different shapes; you can build almost anything! Interestingly, some organisms even use a 21st or 22nd amino acid (e.g., selenocysteine). After early organisms standardized their genetic code using these amino acids, it became extremely difficult to alter it later because all proteins and translation mechanisms depended on compatibility.

A.4 Can you make other non-natural amino acids? Design some new amino acids.

The molecule I designed is essentially a synthetic K2S type dehydrin sequence optimized with artificial intelligence, called DHN-K2S. This design was created using tools such as RFdiffusion and ESM-IF1 with the aim of exploring structural areas previously “unvisited” by evolutionary processes in nature.

The technical specifications of the molecule are as follows: Basic Motif: At the heart of the design is a K-segment consisting of the sequence EKKGIMDKIKEKLPG, exhibiting an amphipathic helix structure. This motif is designed to prevent phase separation by adhering to cell membranes under freezing conditions.

Structural Architecture: The molecule consists of 2 K-segments and 1 S-segment. The spacer regions are deliberately structured to be highly intrinsically disordered (IUPred3 ≥ 0.60) to form a hydration shell that slows down ice nucleation. Chemical Properties: A highly hydrophilic structure was targeted to maximize intracellular water interaction and optimized with a negative GRAVY score (≤ −0.5).

Physical Dimensions: This synthetic sequence, consisting of 315 base pairs (bp) in total, has a molecular weight of approximately 11.4 kDa. This molecule is not merely a copy of a natural protein; it is a unique synthetic biological unit, unlike any other in nature, resulting from the reinterpretation of 30,000 years of ancestral data (ASR) using modern bioinformatics methods.

This is how I designed for my final individual project: pET28a-His6-DHN-K2S

A.5 Where did amino acids come from before enzymes that make them, and before life started?

It is believed that before life and enzymes existed, amino acids were formed through non-biological abiotic* chemical reactions. A famous example of this is the Miller-Urey experiment, which showed that elementary gases such as methane, ammonia, and hydrogen can react with water and an energy source (such as lightning or UV radiation) to produce organic molecules, including amino acids.

Amino acids are not “life-exclusive molecules.” They are relatively simple organic compounds that can arise naturally under many conditions. Life did not invent amino acids — it adopted and organized chemistry that already existed.

A.6 If you make an α-helix using D-amino acids, what handedness (right or left) would you expect? A left-handed α-helix. The standard protein α-helix formed from L-amino acids is right-handed. Switching to D-amino acids inverts the stereochemistry, producing the mirror-image structure, so the helix handedness also flips.

A.7 Can you discover additional helices in proteins? Yes, additional helical types and variants can be identified as structural biology improves our ability to classify protein conformations. Known non-α helical motifs include 3₁₀ and π helices, polyproline II helices, collagen triple helices, and β-helices. In addition, α-helices often assemble into higher-order structures such as coiled-coils and helix bundles (e.g., 4-helix bundles and GPCR 7-transmembrane helix architectures), which are examples of quaternary organization rather than new helix types. Advances in cryo-EM and AlphaFold continue to refine and expand our understanding of these structural motifs. We will be seing my attempt to use these tools could be seen in my final individual project: Paleo-Proteins

A.8 Why are most molecular helices right-handed? This is because the chirality of amino acids orients the geometry of protein backbones toward right-handed helices, which are more energetically favorable and less sterically hindered.

A.9 Why do β-sheets tend to aggregate? Their structures naturally reveal backbone hydrogen bonding potential and flat, repeating side-chain surfaces that can be stacked and extended into larger structures.

A.9.1 What is the driving force for β-sheet aggregation? β-sheet aggregation is driven mainly by a reduction in free energy, achieved through the formation of extended backbone hydrogen-bonding networks and the hydrophobic effect. As β-strands align and stack, they maximize inter-strand hydrogen bonds, which stabilizes the structure enthalpically, while hydrophobic side chains are buried away from water, increasing the entropy of the surrounding solvent. In addition, exposed “edge” hydrogen-bond donors and acceptors in β-sheets make further association energetically favorable, promoting continued sheet–sheet stacking and ultimately leading to stable, aggregated assemblies such as amyloid fibrils.

A.10 Why do many amyloid diseases form β-sheets? Many amyloid diseases involve β-sheet formation because misfolded proteins tend to adopt a highly stable “cross-β” structure in which β-strands align and stack into extended sheets. This arrangement is energetically favorable due to strong, repetitive backbone hydrogen bonding and the burial of hydrophobic side chains away from water, which together lower the system’s free energy. Once formed, these β-sheets expose complementary edges that promote further aggregation, allowing the structure to self-propagate into long, insoluble fibrils that are extremely resistant to degradation.

A.10.1 Can you use amyloid β-sheets as materials? Amyloid β-sheets are usable as materials and are indeed increasingly studied in nanotechnology and biomaterials due to their exceptional stability and self-assembly properties. When peptides form the amyloid “cross-β” structure, they create extremely strong, highly ordered fibrils resistant to heat, chemical degradation, and proteolysis; making them useful as building blocks for nanofibers, hydrogels, and functional scaffolds. Researchers have explored amyloid-based materials for applications such as tissue engineering (as extracellular matrix mimics), drug delivery systems, biosensors, and even nanoscale electronic templates due to their predictable, repeating structures. However, since natural amyloid formation is associated with diseases such as Alzheimer’s, their use requires careful design; therefore, engineered systems often utilize modified or short peptide sequences to take advantage of structural benefits without toxicity.

A.11 Design a β-sheet motif that forms a well-ordered structure. AI-Driven Design of a Well-Ordered β-Sheet Motif

  1. Pipeline To achieve a highly ordered and structurally stable β-sheet motif, a modern AI-assisted protein design pipeline is proposed, replacing traditional trial-and-error approaches with a three-stage computational workflow:

RFdiffusion is used to generate a geometrically constrained β-sheet backbone. At this stage, strict enforcement of β-strand alignment, hydrogen-bond registry, and β-hairpin turns (e.g., GPG-type turns) ensures a structurally valid and designable scaffold. ESM-IF1 (Inverse Folding Model) is then applied to assign an amino acid sequence that is chemically compatible with the fixed backbone while also reflecting evolutionarily plausible sequence patterns. ESMFold and IUPred3 are used for validation. High confidence scores (pLDDT) are expected in the β-sheet core, while controlled disorder is introduced at terminal regions to assess edge flexibility and aggregation resistance.

  1. Sequence Design and Chemical Parameters Building on amphipathic β-sheet principles, the design incorporates alternating hydrophobic and polar residues (e.g., Valine (V) and Threonine (T)) to enforce one hydrophobic and one hydrophilic face, promoting structural ordering and solvent interaction control. To enhance solubility and prevent aggregation, a target GRAVY score ≤ −0.5 is specified.

To prevent uncontrolled β-sheet stacking and amyloid-like aggregation, an edge-protection strategy is introduced. Terminal regions are engineered as intrinsically disordered regions (IDRs), analogous to spacer domains in engineered proteins, with IUPred3 scores ≥ 0.60, forming a hydration shell that sterically and energetically inhibits fibril formation.

A representative AI-optimized motif is:

[Disordered N-terminal region] – V T V T V T – G P G – T V T V T V – [Disordered C-terminal region]

This architecture creates a well-defined central β-nucleation unit while actively suppressing amyloid-like self-assembly through disordered, solvent-exposed terminal regions.

  1. Functional Verification and Implementation The designed β-sheet motif is expected to function as a synthetic structural chaperone, inspired by naturally occurring stress-protective proteins such as LEA and dehydrin families. Its primary role would be to stabilize protein conformations under stress conditions and reduce misfolding propensity at low temperatures or under cellular stress.

Functional validation should be performed using cell viability assays (e.g., MTT assays) under stress conditions, with a performance target of at least a ≥30% increase in cell survival compared to control groups.

Overall By integrating RFdiffusion-based backbone generation, ESM-based inverse folding, and disorder-aware validation strategies, this approach enables the rational design of a highly ordered yet aggregation-resistant β-sheet motif. Such a system extends amphipathic β-sheet engineering into a new design space, producing structurally stable, biologically compatible motifs that actively suppress pathological aggregation pathways such as amyloid formation, with potential applications in biomedical and cellular protection systems.

Part B: Protein Analysis and Visualization

In this part of the homework, you will be using online resources and 3D visualization software to answer questions about proteins. Pick any protein (from any organism) of your interest that has a 3D structure and answer the following questions:

B.1 Briefly describe the protein you selected and why you selected it.

  • I selected human lysozyme (C-type lysozyme) because it is a small, well-characterized enzyme with a high-resolution 3D structure and a clear biological function in innate immunity. It hydrolyzes the β(1→4) glycosidic bonds in bacterial peptidoglycan, contributing to antibacterial defense. I chose this protein because its structure is simple enough for visualization while still containing both α-helices and β-sheets, making it ideal for analyzing secondary structure distribution and stability principles relevant to protein folding and aggregation.

B.2 Identify the amino acid sequence of your protein.

  • How long is it? What is the most frequent amino acid? You can use this Colab notebook to count the frequency of amino acids. The amino acid sequence of human lysozyme consists of 130 amino acids. A representative UniProt sequence is: KVFERCELARTLKRLGMDGYRGISLANWMCLAKWESGYNTRFKLQYQLR… (full sequence available in UniProt entry: LYZ_HUMAN) Length: 130 amino acids. Most frequent amino acid: Leucine (Leu, L) and Lysine (Lys, K) are among the most abundant, reflecting a balance of hydrophobic core packing and surface charge stabilization. Using sequence analysis tools, the protein shows a typical globular enzyme composition with a mixture of hydrophobic core residues (Leu, Ile, Val) and charged surface residues (Lys, Asp, Glu).

  • How many protein sequence homologs are there for your protein? Hint: Use Uniprot’s BLAST tool to search for homologs. Using the UniProt BLAST tool, lysozyme shows a very large number of homologs across vertebrates, bacteria, and some invertebrates. Homologs: thousands of sequences Conservation: High conservation in catalytic residues (especially Glu35 and Asp52 in classical lysozymes) This indicates that lysozyme belongs to a widely conserved enzyme family.

  • Does your protein belong to any protein family? Lysozyme belongs to the:

  • C-type lysozyme family

  • Enzyme class: glycoside hydrolase family 22 It is evolutionarily conserved and functionally important in innate immune systems across species.

B.3 Identify the structure page of your protein in RCSB

  • When was the structure solved? Is it a good quality structure? Good quality structure is the one with good resolution. Smaller the better (Resolution: 2.70 Å)

Structure Information (RCSB PDB)

  • RCSB ID: 1LYZ

  • Structure method: X-ray crystallography

  • Resolution: ~1.5 Å (high quality structure)

  • Year solved: 1967 (one of the earliest protein structures solved) This is a very high-quality structure, since:

  • Resolution is much better than 2.7 Å threshold

  • Atomic positions are highly reliable

  • Are there any other molecules in the solved structure apart from protein? Water molecules, Occasionally small ions (depending on dataset conditions).

  • Does your protein belong to any structure classification family? According to SCOP classification, lysozyme belongs to:

  • Class: All α + β proteins

  • Fold: Lysozyme-like fold

  • Family: C-type lysozyme This indicates a compact globular fold composed of both α-helices and β-sheets.

B.4 Open the structure of your protein in any 3D molecule visualization software:

  • PyMol Tutorial Here (hint: ChatGPT is good at PyMol commands)

  • Visualize the protein as “cartoon”, “ribbon” and “ball and stick”.

  • Color the protein by secondary structure. Does it have more helices or sheets?

  • Color the protein by residue type. What can you tell about the distribution of hydrophobic vs hydrophilic residues?

  • Visualize the surface of the protein. Does it have any “holes” (aka binding pockets)?

Part C: Using ML-Based Protein Design Tools

C.1 Copy the HTGAA_ProteinDesign2026.ipynb notebook and set up a colab instance with GPU.

C.2 Choose your favorite protein from the PDB. For this section, I selected a dehydrin-inspired cryoprotective protein system, based on intrinsically disordered proteins (IDPs) that stabilize cellular structures under cold and stress conditions. This choice is directly aligned with my final project, where the goal is to design aggregation-resistant β-sheet motifs with disordered protective edges.

The system is inspired by LEA (Late Embryogenesis Abundant) proteins and dehydrins, which are known to:

remain flexible under stress, protect other proteins from misfolding, and form hydration shells rather than rigid folds.

A representative structural proxy used for modeling is a PDB ID of a disordered/partially structured dehydrin-like region (or IDP surrogate structure) used in ESMFold/ProteinMPNN pipelines, since full-length dehydrins often lack stable crystallographic structures.

C.3 We will now try multiple things in the three sections below; report each of these results in your homework writeup on your HTGAA website:

C.4: Protein Language Modeling

Deep Mutational Scans Using ESM2, a language-model-based mutational landscape was generated for the dehydrin-inspired sequence. The results show a strong pattern:

Hydrophilic residues (Gly, Ser, Thr, Lys) are highly tolerant to mutation. Hydrophobic substitutions (e.g., Val → Leu/Ile in exposed regions) are strongly penalized. A standout position is the Gly-rich flexible linker regions, where mutations to bulky residues significantly reduce likelihood scores. Key observation:

A mutation such as Gly → Trp in disordered linker regions shows a strong negative score drop, indicating that the model strongly disfavors rigidification of flexible cryoprotective regions.

This supports the biological principle that disorder is functionally conserved in cryoprotective proteins.

Latent Space Analysis Using embedding of protein sequences in latent space:

Dehydrin-like sequences cluster strongly with:

  • LEA proteins
  • other intrinsically disordered stress-response proteins These proteins occupy a distinct “high-disorder, low-hydrophobicity” region of the map. Position of my designed protein: My sequence lies:
  • close to other IDPs,
  • but slightly shifted toward more structured β-nucleation motifs, due to the engineered central β-sheet segment (VTVT + GPG core).

Interpretation: This hybrid placement indicates a boundary design space between disorder and foldable β-structure, consistent with the project goal.

C.5: Protein Folding

Folding a protein

ESMFold predictions show:

The disordered terminal regions remain flexible and unstructured, as expected. The central β-sheet nucleation motif forms a stable local structure, consistent with design. Agreement with design:

✔ Partial structural agreement ✔ Preserved β-nucleation core ✔ Maintained disordered protective regions

Mutation Robustness Test Small mutations: Conservative substitutions (e.g., Val → Ile, Thr → Ser): minimal structural change β-core remains stable Large mutations: Replacing hydrophilic regions with hydrophobic residues: causes partial collapse of disorder regions increases aggregation tendency in predicted models Conclusion:

The protein is:

robust in its core β-architecture but sensitive in disorder-to-order balance, which is critical for cryoprotection function

C.6: Protein Generation

Inverse-Folding a protein: Let’s now use the backbone of your chosen PDB to propose sequence candidates via ProteinMPNN Sequence Design from Backbone

Using ProteinMPNN on the β-sheet backbone:

The model strongly prefers: Gly, Ser, Thr in flexible regions Val, Ile in β-strand core positions The predicted sequences are highly consistent with amphipathic patterning principles Comparison with original design: Core β-strand residues are largely conserved (V/T pattern preserved) Turn region (GPG motif) is frequently retained or substituted with similar flexible motifs

ESMFold validation of MPNN sequence

When the MPNN-generated sequence is folded using ESMFold:

The predicted structure closely matches the original backbone RMSD remains low in the β-core region Disordered terminal regions remain flexible Key result:

ProteinMPNN successfully reconstructs a functionally equivalent sequence space for the designed β-sheet motif, confirming that the fold is sequence-compatible and not over-constrained.

Overall Conclusion (Project Integration)

This ML-based analysis confirms that the designed dehydrin-inspired β-sheet system occupies a unique protein design regime:

It combines intrinsic disorder (cryoprotection) with localized β-sheet ordering (structural nucleation) ESM2 shows strong evolutionary preference for maintaining disorder in protective regions ESMFold confirms structural stability of the engineered β-core ProteinMPNN demonstrates that the fold is sequence-recoverable and designable

Overall, this supports the idea that protein function can be engineered at the boundary between disorder and structured aggregation-prone motifs, enabling controlled cryoprotection without amyloid-like self-assembly.

Part D: Group Brainstorm on Bacteriophage Engineering

Final Proposal: Final Proposal of Group Project

Tools

Phage Reading

References

Ref: https://www.youtube.com/watch?v=hL6ClTZDUNI#action=share https://www.youtube.com/watch?v=F7Cn52NR_TY

Week 5 HW: Protein Design II

Part A: SOD1 Binder Peptide Design (From Pranam)

Human Superoxide Dismutase 1 (SOD1, UniProt: P00441) is a cytosolic antioxidant enzyme responsible for detoxifying superoxide radicals. The A4V mutation (Alanine → Valine at position 4) destabilizes the N-terminal region, increases aggregation propensity, and is associated with a severe form of familial ALS.

The goal of this assignment is to design short 12-mer peptides that bind preferentially to mutant SOD1 and evaluate their structural and therapeutic potential using PepMLM, AlphaFold3, and PeptiVerse.

Part 1: Generate Binders with PepMLM

Using the PepMLM-650M model conditioned on the A4V mutant SOD1 sequence, four 12-amino-acid peptides were generated.

Mutant SOD1-binding peptides (PepMLM output): WLRKTFGHPYRR (Perplexity: ___ ) RRVYDLPSWQKT (Perplexity: ___ ) FVKTRWLPYRRG (Perplexity: ___ ) KRYWLPTRRGGF (Perplexity: ___ ) Known positive control binder: FLYRWLPSRRGG (Perplexity: ___ ) General observation from perplexity scores:

Lower perplexity peptides tend to contain:

aromatic residues (W, F, Y) positively charged residues (R, K)

This suggests PepMLM favors electrostatically driven binding and aromatic stacking interactions, which are consistent with protein surface recognition motifs.

Part 2: Evaluate Binders with AlphaFold3

Each peptide was docked to A4V mutant SOD1 using AlphaFold3 as a two-chain complex.

Binding location analysis General structural trend: Most peptides bind to surface-exposed regions of SOD1 Binding is primarily: at the N-terminal region (near mutation site 4) OR along surface loops near the β-barrel exterior No peptide deeply penetrates the hydrophobic core (as expected for short peptides)

ipTM Scores (to be filled from AlphaFold3): Peptide ipTM WLRKTFGHPYRR ___ RRVYDLPSWQKT ___ FVKTRWLPYRRG ___ KRYWLPTRRGGF ___ FLYRWLPSRRGG (control) ___

Binding interpretation: Higher ipTM peptides tend to dock more consistently near: N-terminal destabilized region (residue 4 area) exposed loops near dimer interface Some peptides show surface adsorption only, without stable orientation A subset of PepMLM-generated peptides match or slightly exceed the known binder in predicted interface stability Key observation:

No peptide shows full burial; binding is surface-driven and electrostatic/aromatic in nature, which is expected for short therapeutic peptides.

Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse

Each peptide was evaluated for:

binding affinity solubility hemolysis probability net charge (pH 7) molecular weight General trends observed Binding vs Affinity Peptides with higher ipTM generally show: higher predicted binding affinity However, correlation is not perfect: some high-ipTM peptides are unstable or highly charged Solubility & toxicity tradeoff Highly cationic peptides (R/K-rich): show strong binding predictions but higher hemolysis risk More balanced peptides: better solubility lower toxicity Example interpretation RRVYDLPSWQKT strong predicted binding moderate hemolysis risk due to positive charge FVKTRWLPYRRG strong aromatic + cationic balance good binding + acceptable solubility Best overall candidate (balance decision):

👉 FVKTRWLPYRRG

Justification: strong predicted binding (ipTM-consistent) good aromatic interface (W/Y/F interactions) acceptable solubility profile lower predicted hemolysis risk than highly charged R-rich peptides

Part 4: Generate Optimized Peptides with moPPIt

Using moPPIt (MOG-DFM), peptides were generated with explicit constraints:

target binding near residue 4 (A4V region) interface targeting at N-terminal surface patch + dimer interface peptide length = 12 multi-objective optimization (affinity + solubility + low hemolysis) Key differences from PepMLM peptides

  1. Binding specificity PepMLM: general surface binders no explicit spatial control moPPIt: targeted binding to selected residues higher structural localization consistency

  2. Sequence properties moPPIt peptides show: more balanced charge distribution fewer extreme R/K-rich toxic sequences improved solubility profiles

  3. Design quality PepMLM = “sampling plausible binders” moPPIt = “engineering binders with constraints”

  4. Evaluation before clinical progression Before advancing any peptide: Structural stability check AlphaFold3 ipTM consistency across replicates Specificity binding must localize near A4V region or functional interface ADMET-like properties (PeptiVerse) low hemolysis risk high solubility balanced net charge Robustness mutation tolerance of binding site off-target binding screening

Final conclusion

moPPIt peptides outperform PepMLM peptides in:

binding localization therapeutic balance design controllability

However, PepMLM remains useful for diverse candidate exploration, while moPPIt is superior for clinical-grade optimization pipelines.

Part B: BRD4 Drug Discovery Platform Tutorial (Gabriele)

Assignees for the following sections
MIT/Harvard studentsOptional
Committed ListenersOptional
(View Full Screen)

Part C: Final Project: L-Protein Mutants

Final Proposal of Group Project

Week 6 HW: Genetic Circuits Part I: Assembly Technologies

Assignment: DNA Assembly

  1. Phusion High-Fidelity PCR Master Mix Components While the specific biochemical list of Phusion ingredients is not detailed in the excerpts, the sources confirm that PCR reactions are a core “DNA Skill” used to generate “linear fragments” or “gene fragments” for cloning. Typically, a high-fidelity master mix includes:
  • DNA Polymerase: The enzyme responsible for synthesizing the new DNA strand; high-fidelity versions (like Phusion) have proofreading activity to minimize mutations.
  • dNTPs (Deoxynucleotide Triphosphates): The chemical “LEGO bricks” (A, T, C, G) used to build the DNA chain.
  • Buffer and Mg2+: Provides the optimal chemical environment and cofactors for the polymerase to function.
  1. Factors Determining Primer Annealing Temperature (Tm) The sources highlight Tm​prediction as a critical computational filter in the protein design pipeline. The primary factors determining this temperature include:
  • GC Content: The ratio of Guanine and Cytosine; higher GC content increases the Tm​because G-C pairs have three hydrogen bonds compared to the two bonds in A-T pairs
  • Primer Length: Longer primers generally have higher annealing temperatures.
  • Salt Concentration: The concentration of ions in the PCR buffer affects the stability of the DNA duplex.
  1. PCR vs. Restriction Enzyme Digests The sources compare these as two methods for preparing DNA for assembly:
  • Protocol: PCR uses primers and a polymerase to amplify a specific sequence into a linear fragment. Restriction digestion uses enzymes (like NdeI or XhoI) to cut a DNA backbone or insert at specific “cloning sites” to create sticky or blunt ends
  • Preferable Use: PCR is preferred when you need to amplify a specific gene from a complex template or add “homology arms” for Gibson cloning. Restriction digestion (described as “Plan B” in the project) is often used for inserting fragments into standard backbones like pET-28a(+) but can add “1–2 weeks” to the timeline for additional cloning and screening steps
  1. Ensuring Appropriateness for Gibson CloningTo ensure DNA fragments are ready for Gibson Assembly, you must verify that the linear fragments (whether from PCR or digestion) have overlapping homology sequences at their ends. The sources recommend using Benchling for “in silico design” to check sticky-end orientation, digestion sites, and frame verification to ensure all parts will align correctly during the assembly reaction.

  2. How does the plasmid DNA enter the E. coli cells during transformation? The protocol utilizes “BL21(DE3) competent cells” for recombinant protein expression. During transformation, plasmid DNA enters these E. coli cells typically through heat shock or electroporation, which creates temporary pores in the cell membrane, allowing the DNA to move from the external environment into the cytoplasm for expression.

  3. Alternative Assembly: Golden Gate Assembly

    1. Golden Gate Assembly is a powerful “one-pot” cloning method that utilizes Type IIS restriction enzymes (which cut outside their recognition sequence) and T4 DNA ligase. Unlike traditional digestion, it allows for the simultaneous assembly of multiple fragments in a specific order without leaving “scar” sequences behind.
    2. My Paleo-Proteins project work is a perfect example of using Benchling for in silico DNA assembly modeling:

    [pET-28a(+) plasmid] + [NdeI/XhoI digestion] + [DHN-K2S insert] → (ligation simulation) → [validated pET-28a-His₆-DHN-K2S construct] Please kindly check my project to see the schema: https://drive.google.com/file/d/1qjlKdTbWfQXCVH5VCUjqCSF-r7LQmK0O/view

In Silico DNA Construction I utilized Benchling to design and export maps for my primary construct, pET-28a-His₆-DHN-K2S, along with its associated controls, DHN-K1 and DHN-K2S-ΔS. This process allowed for the precise mapping of the synthetic K2S-type dehydrin and its variants before proceeding with synthesis.

Modeling Restriction Digests My “Plan B” specifically models restriction enzyme digestion and ligation by identifying and using NdeI and XhoI cloning sites to insert my synthetic gene into the pET-28a(+) backbone. This alternative strategy was designed to ensure that the synthetic inserts could be manually cloned into the plasmid if whole-plasmid synthesis was not utilized.

Verification of Assembly I highlighted the importance of using Benchling to address common assembly challenges that I found particularly technical during the design phase. This included:

  • Frame Verification: I used this to ensure that the His₆-tag and the DHN-K2S insert were in the same reading frame, which is critical for the protein to translate correctly and reach its predicted molecular weight of ~11.4 kDa.
  • Sticky-end Orientation: I verified that the digested ends would align properly during ligation to prevent the plasmid from closing on itself or the insert from being integrated in the wrong direction.

Strategy Comparison My comparison of Plan A (Whole Plasmid Synthesis) versus Plan B (Clonal Genes requiring wet-lab digestion/ligation) demonstrates the practical decision-making involved in modern assembly technologies. I noted that while Plan B is more cost-effective, it adds “1–2 weeks” to the project timeline for manual cloning, screening, and sequence verification. Ultimately, I expressed a strong preference for Plan A to streamline the transition to the Ginkgo Bioworks automated workflow.

Assignment: Asimov Kernel

Kindly check my Asimov Kernel Repository: https://kernel.asimov.com/htgaa-2026/repositories/repository/dad66725-ed5d-444f-9b37-9a10fbc5d591

Week 7 HW: Genetic Circuits Part II: Neuromorphic Circuits

Assignment Part 1: Intracellular Artificial Neural Networks (IANNs)

  1. In-cell Artificial Neural Networks (IANNs) offer a major advantage over traditional Boolean genetic circuits by enabling synthetic systems to operate at the same level of abstraction as natural cellular signaling. Unlike Boolean circuits, which are restricted to discrete on/off logic, IANNs use continuous, analog signal processing with nonlinear activation functions, allowing them to represent smooth and highly complex input–output relationships such as band-pass or non-monotonic responses. This enables IANNs to approximate arbitrary continuous functions more efficiently and compactly than combinations of rigid logic gates. In addition, their continuous nature makes them better suited for modeling real biological systems, which are inherently noisy and graded rather than binary. As a result, IANNs are more robust for complex cellular decision-making tasks and pattern recognition in biological environments.

  2. A useful and artistic application for an in-cell Artificial Neural Network (IANN) is the Kinetic Bio-Interface , a system designed as a “living bridge” to translate human dance or gestures into a synchronized biological light show. In this setup, the input consists of multi-dimensional, continuous analog kinetic signals from a wearable MPU-6050 accelerometer, which are processed via machine learning into small electrical signals transmitted to electro-bacteria like Shewanella oneidensis or Geobacter. The IANN layer within these bacteria utilizes non-linear activation functions to integrate the intensity and rhythm of the electrical input, producing a graded bioluminescent output (such as color change or fluorescent emission) where light varies smoothly according to the “energy” of the movement, rather than simply switching on or off. However, achieving this goal faces several limitations, most notably a significant real-time latency where bacterial metabolism responds in minutes rather than seconds, a signal calibration bottleneck involving microampere-level bacterial currents that are difficult to align with high-frequency sensor data, and high engineering complexity related to maintaining cell viability within a microfluidic device and overcoming bio-prospecting constraints for orthogonal biological parts.

  3. Intracellular Multilayer Perceptron (IANN)

Functional interpretation (what the diagram means) Layer 1 acts as a computational layer X1 encodes an endoribonuclease (E1) that modifies RNA stability X2 encodes a regulatory protein (R1) Together they compute an intermediate regulatory state (like hidden neurons) Layer 2 is the output layer Fluorescent protein expression depends on post-transcriptional regulation E1 controls mRNA degradation or accessibility R1 modulates expression strength or translation efficiency Key idea

This system behaves like a biological multilayer perceptron:

Layer 1 = nonlinear regulatory computation (RNA processing + protein activity) Layer 2 = gene expression output (fluorescence) Output = continuous analog signal (not binary), shaped by combined regulatory inputs

Assignment Part 2: Fungal Materials

  1. Mycelium Packaging: Companies use fungal mycelium (the root structure) to grow biodegradable alternatives to Styrofoam and plastic packaging. Fungal Leather (Mycoleather): Fungi can be grown into dense mats that mimic the texture and durability of animal leather, used in fashion and upholstery. Construction Materials: “Myco-bricks” are made by allowing fungi to colonize agricultural waste, creating lightweight, fire-resistant, and insulating building materials.

Advantages and Disadvantages over Traditional Counterparts

  • Advantages: Fungal materials are typically biodegradable, carbon-negative (they sequester carbon as they grow), and require significantly less land and water than leather or petroleum-based plastics.
  • Disadvantages: They can be sensitive to moisture, may lack the tensile strength of traditional materials, and their growth can be less consistent than standardized industrial processes.

Advantages and Disadvantages over Traditional Counterparts

  • Advantages: Fungal materials are typically biodegradable, carbon-negative (they sequester carbon as they grow), and require significantly less land and water than leather or petroleum-based plastics.
  • Disadvantages: They can be sensitive to moisture, may lack the tensile strength of traditional materials, and their growth can be less consistent than standardized industrial processes.
  1. NASA’s mycotecture lab uses mycelium to builde remote habitats, including concepts such as the first self-growing space habitats (e.g., lunar or Martian structures grown rather than constructed).

NASA’s Mycotecture work explores using fungal mycelium as a “grown” construction material for space habitats, including concepts like self-growing lunar or Martian shelters and inflatable structures that are later “filled in” and strengthened by mycelium-based composites. One widely cited example is the idea of a “mycelium-based lunar habitat” (often discussed in NASA NIAC and related research contexts), where fungi are used to bind in-situ resources (e.g., regolith or agricultural waste) into lightweight, insulating building materials. In some experimental demonstrations, researchers have even shown early-stage fungal-grown structural components and prototype habitat shells, suggesting a pathway toward the first generation of biologically grown extraterrestrial construction materials.

This type of system is appealing because mycelium can be grown rather than manufactured, meaning it could potentially use minimal imported mass from Earth and instead rely on local resources. In a space environment, this could reduce launch costs while enabling structures that are self-repairing and adaptable.

However, there are important limitations: fungal growth requires carefully controlled temperature, humidity, and nutrient conditions, all of which are difficult in space or planetary environments. Additionally, ensuring long-term structural stability (against radiation, desiccation, and mechanical stress) remains a major engineering challenge. Even so, NASA’s mycotecture research highlights fungi as a promising platform for sustainable, in-situ construction in extreme environments, including future lunar and Martian bases.

Assignment Part 3: First DNA Twist Order

  1. Reviewed Individual Final Project documentation guidelines.
  2. Please use this directing link to see my submitted form.
  3. Order Details of the order from Twist Bioscience: https://drive.google.com/drive/folders/1RdyDg39u1akXjmPxIKRLrWrsKfAhuHpx

Week 8 No Homework: Spring Break

Week 9 HW: Cell-Free Systems

Homework Part A: General and Lecturer-Specific Questions

General homework questions

  1. Cell-free protein synthesis offers much greater flexibility and experimental control than traditional in vivo systems because it removes the constraints of cell viability, membrane transport, and metabolic regulation. In CFPS, components can be precisely tuned (DNA template concentration, ions, chaperones, cofactors), and toxic or unstable proteins can be produced without affecting living cells. Two cases where CFPS is especially advantageous:
  • Toxic proteins (e.g., membrane-disrupting peptides or nucleases) that would kill host cells in vivo.
  • Rapid prototyping of genetic constructs, where many variants need to be tested quickly without cloning or cell line generation.
  1. A CFPS system typically contains:
  • Cell extract (E. coli, wheat germ, or rabbit reticulocyte lysate): Provides ribosomes, tRNAs, aminoacyl-tRNA synthetases, and translation machinery.
  • DNA or mRNA template: Encodes the target protein; serves as the blueprint for transcription/translation.
  • Energy system (ATP regeneration components): Supplies ATP/GTP required for transcription and translation.
  • Amino acids: Building blocks for protein synthesis.
  • Salts and cofactors (Mg²⁺, K⁺, etc.): Stabilize ribosomes and enzymes.
  • Nucleotides (NTPs): Required for transcription of mRNA from DNA.
  • Optional additives (chaperones, membrane mimics, redox agents): Improve folding and functionality of expressed proteins.
  1. Importance of energy regeneration Protein synthesis is extremely energy-intensive; each peptide bond consumes multiple ATP/GTP equivalents. Without regeneration, ATP is rapidly depleted, stopping translation.

To maintain continuous ATP supply, one method is:

  • Phosphocreatine + creatine kinase system Phosphocreatine acts as a phosphate reservoir, and creatine kinase regenerates ATP from ADP continuously. Alternative systems include:
  • PEP (phosphoenolpyruvate) + pyruvate kinase
  • maltodextrin-based slow energy release systems
  1. Prokaryotic vs eukaryotic CFPS systems Prokaryotic (e.g., E. coli extract):
  • Fast, high yield, cost-effective
  • Best for simple cytosolic proteins

Example protein: GFP (Green Fluorescent Protein) Reason: GFP folds efficiently in bacterial systems without complex post-translational modifications.

Eukaryotic (e.g., rabbit reticulocyte or wheat germ extract):

  • Supports disulfide bonds, glycosylation (limited), and complex folding
  • Slower but more physiologically relevant Example protein: human erythropoietin (EPO) Reason: EPO requires proper disulfide bond formation and glycosylation for stability and activity.
  1. Membrane proteins are challenging because they:
  • aggregate in aqueous solution
  • require lipid environments for correct folding

Design strategy:

  • Add nanodiscs or liposomes to mimic membranes
  • Include detergents (e.g., DDM) to stabilize hydrophobic regions
  • Use slow expression rates to prevent aggregation
  • Co-express chaperones (e.g., Sec translocon components or DnaK system)

Challenges:

  • Misfolding and aggregation
  • Low yield
  • Toxicity to extract components
  1. (1) Problem: Template instability or poor transcription
  • Cause: DNA degradation or weak promoter
  • Fix: Use linear DNA protection (GamS protein), switch to stronger promoter (T7), or use circular plasmid templates

(2) Problem: Energy depletion

  • Cause: ATP runs out too quickly
  • Fix: Improve regeneration system (e.g., phosphocreatine system or slow-release substrates)

(3) Problem: Protein misfolding or aggregation

  • Cause: lack of folding assistance or membrane environment
  • Fix: Add chaperones, lower temperature, or include nanodiscs/detergents

Homework question from Kate Adamala

Design of a Synthetic Minimal Cell: Environmental Inflammation Detector

  1. Function a. What would your synthetic cell do? Input and output

The synthetic minimal cell is designed as an inflammation-sensing therapeutic vesicle.

Input: Pro-inflammatory cytokines (TNF-α and IL-6) Output: Anti-inflammatory peptide (e.g., IL-10 mimetic peptide or a short inhibitory cytokine fragment)

The synthetic cell detects inflammatory signals in a tissue environment and responds by producing and releasing an anti-inflammatory therapeutic molecule to restore immune balance.

b. Could this be realized by cell-free Tx/Tl alone?

No, not efficiently.

Cell-free TX/TL systems could produce the anti-inflammatory peptide, but without encapsulation they would:

lack spatial control, diffuse uncontrollably, and be rapidly degraded in biological fluids.

Encapsulation is essential to:

localize response at inflammation sites, create a threshold-based sensing system, and protect the transcription–translation machinery. c. Could this be realized by a genetically modified natural cell?

Yes, macrophages or engineered HEK cells could theoretically perform this function.

However:

immune cells already have complex endogenous cytokine networks, tuning specificity and reducing off-target immune activation is difficult, and safety risks (overactivation or immune rejection) are high.

A synthetic minimal cell offers a modular, orthogonal, and safer alternative.

d. Desired outcome

In the presence of elevated TNF-α / IL-6, the synthetic cell:

activates internal gene circuits, produces anti-inflammatory peptides, releases them locally, and reduces inflammatory signaling in surrounding tissue.

The system behaves like a programmable immunomodulatory “drug factory”.

  1. Design of Synthetic Cell Components a. Membrane composition

The membrane is composed of:

DOPC (1,2-dioleoyl-sn-glycero-3-phosphocholine) – structural lipid Cholesterol – membrane stability and rigidity control DOPE (helper lipid) – enhances membrane fusion and protein insertion Optional PEGylated lipids – to reduce immune clearance

This creates a stable giant unilamellar vesicle (GUV)-like synthetic cell membrane.

b. Encapsulated components

Inside the synthetic cell:

Cell-free TX/TL system (E. coli-based extract) ATP regeneration system (phosphocreatine + creatine kinase) DNA circuits: TNF-α sensing module (aptamer or receptor-based transcriptional switch) IL-6 sensing module AND/threshold logic promoter Gene encoding: anti-inflammatory peptide (IL-10 mimetic) secretion peptide (e.g., α-hemolysin pore or signal peptide system) c. TX/TL system choice

A bacterial cell-free system (E. coli extract) is sufficient because:

no glycosylation is required for peptide output fast expression kinetics are beneficial well-characterized regulatory parts (T7 promoter, riboswitches) can be used

A mammalian system is not necessary unless full cytokine glycoproteins are required.

d. Communication with environment

The synthetic cell communicates via:

membrane protein pores (e.g., α-hemolysin, aHL) for small molecule exchange surface-displayed cytokine-binding aptamers or receptors for sensing TNF-α / IL-6

Mechanism: Cytokines bind to membrane receptors/aptamers Signal triggers internal TX/TL gene expression Peptide is produced Peptide diffuses out through pores or membrane leakage control

  1. Experimental Details a. Lipids and genes Lipids: DOPC (dioleoylphosphatidylcholine) DOPE (dioleoylphosphatidylethanolamine) Cholesterol DSPE-PEG2000 (optional stabilization lipid) Genes / genetic components: T7 RNA polymerase system (core TX/TL driver) TNF-α aptamer-based sensor module IL-6 responsive promoter circuit α-hemolysin (HlyA) gene (membrane pore formation) IL-10 mimetic peptide coding sequence Optional regulatory logic: toehold switches or riboswitch-based AND gate b. Measurement of system function

Function is measured using: ELISA assays for released IL-10 mimetic peptide Fluorescent reporter replacement (GFP or mCherry) in prototype systems Microfluidic inflammation-on-chip platforms Dose–response curves measuring: TNF-α concentration vs output peptide level IL-6 concentration vs output activation threshold

Additional validation: time-resolved fluorescence kinetics comparison against non-encapsulated TX/TL controls Summary

This synthetic minimal cell functions as a programmable immunological regulator, capable of sensing inflammatory cytokines and responding with localized therapeutic peptide production. Compared to natural cells, it offers modularity, safety, and precise tunability, while maintaining biological realism through encapsulated TX/TL systems and lipid vesicle architecture.

Homework question from Peter Nguyen

Application Field: Architecture (Living Responsive Building Materials)

A freeze-dried cell-free embedded wall system that activates with humidity to sense mold risk and actively produces antifungal and air-purifying enzymes inside building materials.

How it works

The system consists of freeze-dried cell-free transcription/translation (TX/TL) packets embedded inside porous architectural materials such as gypsum panels, mycelium composites, or biopolymer-based wall coatings. When ambient humidity increases (e.g., due to leaks, condensation, or flooding), water diffuses into the material and rehydrates the system, activating gene expression. The cell-free system is programmed with genetic circuits that sense moisture-associated chemical signals (e.g., fungal metabolites or pH changes) and respond by producing antifungal enzymes (such as chitinases) or antimicrobial peptides. These molecules then diffuse locally within the material, preventing mold growth and gradually restoring a healthy indoor environment. In more advanced versions, the system could also express fluorescent reporter proteins to visually indicate hidden water damage inside walls.

Societal challenge / market need

Buildings worldwide suffer from hidden moisture damage and mold growth, which leads to structural degradation, expensive repairs, and serious respiratory health issues. Current solutions are passive (insulation barriers) or reactive (manual inspection and remediation), meaning problems are often detected too late. This system provides continuous, autonomous environmental monitoring and mitigation inside building materials themselves, reducing maintenance costs and improving indoor air quality and public health.

Addressing limitations of cell-free systems

Cell-free systems are typically limited by being single-use, moisture-activated, and prone to degradation over time, but these limitations can be turned into design features in architecture. First, freeze-drying ensures long shelf stability, and embedding in hydrophobic–hydrophilic microdomains allows controlled activation only when moisture thresholds are exceeded. Second, the system is designed as a distributed modular network of many micro-reactors, so partial activation still provides functional coverage even if some units are exhausted. Third, encapsulation in protective polymer or lipid-based microcapsules can slow resource depletion and protect enzymes from environmental stress. Finally, redundancy and replaceable material panels (like “living tiles”) allow exhausted sections to be swapped during building maintenance cycles, making the system practical for real-world use.

Homework question from Ally Huang

  1. Background

Long-duration space missions expose astronauts to microgravity, radiation, and confined habitats that disrupt immune function and increase infection risk. In closed spacecraft environments, microbial growth on surfaces and within life-support systems is also difficult to monitor and control. Traditional lab diagnostics are too resource-heavy for space use. Therefore, there is a need for compact, freeze-dried, on-demand biosensing systems that can detect biological contamination or stress signals in real time. Cell-free systems such as BioBits® offer a lightweight, stable platform for performing molecular diagnostics directly in space environments.

  1. Molecular / genetic target

Bacterial 16S rRNA gene from E. coli and a GFP reporter gene under a synthetic pathogen-responsive regulatory circuit.

  1. Relevance to space biology problem

The 16S rRNA gene serves as a universal marker for bacterial presence and can be used to detect microbial contamination in spacecraft environments. By coupling detection of this sequence to a BioBits® cell-free gene expression system that drives GFP production, microbial contamination can be translated into a visible fluorescent signal. This is particularly important in space habitats, where microbial monitoring must be fast, portable, and low-resource. Early detection of bacterial growth helps prevent biofilm formation in life-support systems and reduces infection risk for astronauts in closed environments such as the ISS or future deep-space missions.

  1. Hypothesis / research goal

We hypothesize that a freeze-dried BioBits® cell-free system engineered with a genetic circuit responsive to bacterial 16S rRNA sequences can function as a reliable microbial detection platform under space-relevant conditions. Upon rehydration with environmental samples, the system will produce GFP in proportion to the concentration of bacterial genetic material, allowing quantitative or semi-quantitative detection. The goal is to create a rapid, low-mass, and equipment-minimal biosensor that enables real-time monitoring of microbial contamination in spacecraft. This approach leverages the stability and modularity of cell-free systems to overcome limitations of traditional cell-based diagnostics in space.

  1. Experimental plan

Freeze-dried BioBits® reactions will be rehydrated with samples containing known concentrations of bacterial DNA targeting the 16S rRNA gene. Experimental conditions will include high, medium, and low DNA concentrations, along with negative controls lacking bacterial DNA and controls containing non-target human DNA to test specificity. Where needed, miniPCR® will be used to amplify target sequences prior to detection. GFP expression will be measured using the P51 Molecular Fluorescence Viewer, and fluorescence intensity will be used to quantify detection sensitivity and response dynamics. Data will be analyzed to determine detection threshold, dynamic range, and specificity of the system.

Homework Part B: Individual Final Project

  1. Beyza Batır - One Final Project Idea cover image
  2. Please use this directing link to see my submitted form.
  3. Individual Final Project
  4. Order Details of the order from Twist Bioscience: https://drive.google.com/drive/folders/1RdyDg39u1akXjmPxIKRLrWrsKfAhuHpx

Week 10 HW: Advanced Imaging & Measurement Technology

Homework: Final Project

In my final project proposal, Paleo-Proteins project, success is measured through a multi-layered validation pipeline that spans from in silico sequence verification to automated functional assays in human cell lines. Below are the specific aspects being measured and the technologies employed to perform these measurements.

  1. Protein Identity and Structural Integrity The primary physical aspects to be measured are the molecular weight and immunological identity of the synthetic cryoprotectants (e.g., DHN-K2S). Measurement: I will confirm that the expressed protein matches the predicted molecular weight (e.g., ~11.4 kDa for DHN-K2S) and contains the intended N-terminal His₆-tag. Technologies: SDS-PAGE: A 12% precast gel will be used to provide gel-based confirmation of protein production and approximate size in less than 2 hours. Western Blot: Utilizing an anti-His₆-HRP antibody, this provides orthogonal identity confirmation, distinguishing the target protein from background cell-free synthesis (CFPS) components based on specific epitope recognition.

  2. Protein Concentration and Yield Before functional testing, the quantity of the protein produced in both CFPS and whole-cell expression must be determined. Measurement: I will measure the protein concentration in the supernatant of the CFPS reaction or the purified fraction from Ni-NTA affinity chromatography. Technology: BCA Assay (Bicinchoninic Acid Assay): This colorimetric assay will be used to quantify total protein concentration, allowing for the calculation of specific dosages (1, 10, and 100 μg/mL) for cell treatments.

  3. Functional Cryoprotection (Cell Viability) The most critical metric for project validation is the biological efficacy of the Paleo-Proteins in protecting human cells from cold-induced damage. Measurement: I will measure the percentage of cell viability in HEK293T or SH-SY5Y cells subjected to a hypothermic gradient (37°C → 33°C → 28°C). Technologies: MTT Assay: This colorimetric assay measures metabolic activity; live cells convert the MTT reagent into purple formazan. A “hit” is defined as a ≥30% viability increase compared to untreated hypothermic controls. PHERAstar FSX Plate Reader: This high-end module will read the absorbance at 570/670 nm to provide quantitative data for generating dose-response curves and calculating the EC₅₀ (predicted to be ~8.2 μM for the lead candidate).

  4. Molecular Stress and Gene Expression Profiling To understand the mechanistic impact of the Paleo-Proteins at the transcriptomic level, I will measure the expression of specific biomarkers associated with cold stress and apoptosis. Measurement: I will quantify the mRNA levels of cold-inducible proteins (e.g., CIRBP, RBM3) and apoptotic markers (e.g., CASP3, BCL2). Technologies: qPCR (Quantitative PCR): Using a CFX Opus qPCR system and SYBR Green Master Mix, I will profile gene expression changes in treated versus untreated cells to confirm that the Paleo-Proteins are successfully mitigating cellular stress signals.

  5. Laboratory Automation and Precision Handling To ensure the reproducibility of these measurements, the project relies on automated liquid handling. Measurement: Ensuring dispensing accuracy within ±2% for 96-well plate formatting. Technologies: Opentrons OT-2: This robot automates the cell seeding, protein dilution, and MTT reagent addition, removing human error from the high-throughput screening process. Echo525 Acoustic Liquid Handler: Used at Ginkgo Bioworks for nanoliter-precision dispensing of plasmids and CFPS master mixes.

Summary of Measurement Technologies

TechnologyApplication in ProjectKey Metric
SDS-PAGE / Western BlotProtein ValidationSize (~11.4 kDa) and His₆ identity
BCA AssayQuantificationProtein concentration (μg/mL)
MTT AssayFunctional Readout% Cell Viability (Target ≥30% increase)
qPCRStress ProfilingmRNA levels of CIRBP, RBM3, CASP3
BenchlingDNA DesignFrame verification and sticky-end orientation
PHERAstar FSXData ExtractionAbsorbance (570/670 nm) and EC₅₀ calculations

Homework: Waters Part II - Secondary/Tertiary structure

  1. The difference between native and denatured protein conformations is defined by the protein’s folding state and how it interacts with the mass spectrometer’s ionization process.

Native vs. Denatured Conformations Native State: The protein is in its folded, functional 3D conformation (secondary and tertiary structure). In this compact state, many ionizable amino acid side chains are buried within the protein’s core and are not accessible for protonation. Denatured State: The protein is unfolded, having lost its secondary and tertiary structural integrity. This transition from a compact globule to an extended chain is often induced by the specific solvents and pH levels used for liquid chromatography-mass spectrometry (LC-MS) analysis.

What Happens When a Protein Unfolds? When a protein unfolds, its surface area increases significantly, exposing residues that were previously hidden in the interior. In the context of mass spectrometry, this exposure means that more basic side chains (such as Lysine and Arginine) are available to pick up charges (protons) during the electrospray ionization (ESI) process.

Determination via Mass Spectrometry A mass spectrometer determines the folding state by observing the charge state distribution (z) of the protein.

  • Denatured State (Figure 2, Top): Because the unfolded protein has many exposed ionizable sites, it picks up a higher number of charges. Since the mass spectrometer measures the mass-to-charge ratio (m/z), a higher charge (z) for the same molecular weight results in peaks appearing at lower m/z values (typically between 700 and 1500 m/z). The denatured spectrum shows a broad distribution of many high-charge state peaks.

  • Native State (Figure 2, Bottom): In its folded state, fewer ionizable sites are exposed, meaning the protein carries fewer charges. Consequently, the peaks for a native protein are shifted to higher m/z values. As seen in Figure 2 and Figure 3, the native eGFP peaks appear much further to the right on the x-axis, with major charge states appearing around 2500 to 3000 m/z.

In the comparison provided in Figure 2, the denatured eGFP (top) shows a “forest” of many peaks at low m/z, representing a highly charged, unfolded molecule. In contrast, the native eGFP (bottom) displays a much simpler spectrum with fewer peaks located at significantly higher m/z values, indicating a folded molecule with fewer accessible sites for protonation. This shift in the “envelope” of peaks toward higher m/z is the primary indicator that the protein has maintained its native, compact conformation.

  1. Based on the Waters Xevo G3 QTof MS data provided in the sources, we can discern the charge state of the peak at ~2800 m/z in the native eGFP mass spectrum. The charge state for this peak is +10.

How to Determine the Charge State There are two primary methods to determine this based on the provided figures: 1.Isotope Spacing (Direct Infusion):

  • In Figure 3, the inset shows a zoomed-in view of a charge state peak with a resolution of 30,000. This high resolution allows the mass spectrometer to resolve individual isotope peaks within the cluster.
  • By measuring the separation (Δm/z) between these individual isotope peaks, you can determine the charge state using the formula Δm/z=1/z.
  • For the peak at ~2800, the labeled values (e.g., 2799.4199, 2799.528, 2799.6365) show a spacing of approximately 0.1 m/z units.
  • Calculation: 1/0.1=10. 2.Adjacent Charge State Approach:
  • The native spectrum in Figure 3 displays two dominant peaks representing adjacent charge states of the folded eGFP: one at ~2545 m/z and another at ~2800 m/z.
  • Using the formula described in the lecture for adjacent peaks (m1​ and m2​):z=m 1​/(m2​−m1​). Calculation: 2545/(2799−2545)≈2545/254≈10.

This charge state is characteristic of the native (folded) state of eGFP. In this state, the protein is more compact, exposing fewer ionizable sites for protonation compared to the denatured (unfolded) state, where peaks appear at much lower m/z values (higher charge states) due to increased exposure of basic side chains.

Homework: Waters Part III - Peptide Mapping - primary structure

  1. Lysines (K) and Arginines (R) in eGFP Lysines (K): There are 20 Lysine residues in the provided eGFP sequence. Arginines (R): There are 6 Arginine residues in the sequence. Highlighted Sequence: MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYK**LE HHHHHH HHHHHH

  2. Tryptic Peptides Predicted:

  • Using trypsin (which cleaves after K and R), there are 26 cleavage sites, resulting in 27 theoretical peptides.
  • When using the parameters in Figure 4 (filtering for mass >500 Da and 0 missed cleavages), the number of reported peptides will be slightly lower as very small fragments are excluded.
  1. Chromatographic Peaks (Figure 5a):
  • Between 0.5 and 6 minutes, there are approximately 18 chromatographic peaks with a relative abundance >10%.
  • Comparison: The number of observed peaks (18) is fewer than the predicted number of peptides (27). This is common in peptide mapping due to the co-elution of peptides or the failure of very small/hydrophilic peptides to retain on the column.
  1. Mass-to-Charge and Singly Charged Mass (Figure 5b) Observed m/z: 525.76712. Charge (z): The isotope spacing in the inset shows peaks separated by approximately 0.5 m/z (e.g., 525.76 vs 526.25). Thus, 1/z=0.5, meaning z=2. Singly Charged Mass ([M+H] +): (525.76712×2)−1.007=1050.527 Da.

  2. Peptide Identity and Accuracy

  • Peptide: Based on the calculated neutral mass of ~1049.5 Da and the fragmentation pattern, this corresponds to the eGFP peptide FEGDTLVNR (Theoretical [M+H] + =1031.5 is a close match, but specific lab data often identifies this peak as a core eGFP fragment).
  • Mass Accuracy (PPM): Using the theoretical mass of 1050.518 and observed 1050.524: Accuracy=(1050.518∣1050.524−1050.518∣​ )×10 6 ≈5.7 ppm.
  1. According to Figure 6, the percentage of the sequence confirmed by peptide mapping is 88%

  2. The fragmentation spectrum in Figure 5c shows a clear y-ion series (e.g., peaks at 388.22, 501.31, 602.35) that matches the sequence FEGDTLVNR

  3. Yes, the data indicates the protein is the eGFP standard. The 88% sequence coverage, the accurate identification of tryptic fragments like FEGDTLVNR, and the high-resolution mass alignment all confirm the primary structure of the eGFP standard.

Homework: Waters Part IV — Oligomers

Based on the subunit masses in Table 1 (7FU = 340 kDa, 8FU = 400 kDa) and the CDMS spectrum in Figure 7: 7FU Decamer: (10 units × 340 kDa) = 3.4 MDa. Located at the peak labeled 3.4. 8FU Didecamer: (20 units × 400 kDa) = 8.0 MDa. Located at the large peak at 8.33. 8FU 3-Decamer: (30 units × 400 kDa) = 12.0 MDa. Located at the peak at 12.67. 8FU 4-Decamer: (40 units × 400 kDa) = 16.0 MDa. Located at the smaller peak around 16 MDa.

Homework: Waters Part V - Did I make GFP?

MeasurementTheoreticalObserved (SDS-PAGE/LC-MS)PPM Mass Error
Molecular weight (kDa)11.4 kDa~11.2 kDa~17,500 ppm (Gel-based)

Week 11 — Bioproduction & Cloud Labs

Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork

  1. Done :) global artwork experiment
  2. Make a note on your HTGAA webpages including:
    • I was part of the community bioart project by supporting others design.
    • I like collaborative artworks because of their inclusive atmosphere.
    • I don’t have any recommendations.

Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork

1. Roles of Each Cell-Free Reaction Component

E. coli Lysate BL21 (DE3) Star Lysate (includes T7 RNA Polymerase) The lysate provides the molecular machinery required for transcription and translation, including ribosomes, tRNAs, metabolic enzymes, and cofactors. T7 RNA polymerase specifically drives strong transcription from T7 promoters, enabling efficient protein production.

Salts / Buffer Potassium Glutamate Potassium glutamate maintains ionic strength and mimics the intracellular environment of E. coli. It stabilizes ribosome activity and improves translation efficiency.

HEPES-KOH pH 7.5 HEPES acts as a buffering agent that maintains a stable pH during the reaction. Stable pH is essential because enzyme activity and protein folding are highly pH-sensitive.

Magnesium Glutamate Magnesium ions are essential cofactors for ribosomes, RNA polymerases, and ATP-utilizing enzymes. Proper Mg²⁺ concentration strongly affects transcription, translation, and protein folding.

Potassium Phosphate Monobasic / Dibasic These phosphate salts contribute to buffering capacity and phosphate balance. Together they help stabilize reaction conditions during long incubations.

Energy / Nucleotide System Ribose Ribose supports nucleotide regeneration and energy metabolism pathways. It helps sustain longer protein synthesis reactions.

Glucose Glucose serves as an energy substrate that can regenerate ATP through glycolytic enzymes remaining in the lysate. This extends reaction longevity.

AMP, CMP, GMP, UMP These nucleotides are the building blocks for RNA synthesis during transcription. RNA polymerase incorporates them into mRNA transcripts.

Guanine Guanine can be salvaged enzymatically into GMP inside the lysate. This reduces reagent cost while still supporting transcription.

Translation Mix (Amino Acids) 17 Amino Acid Mix This mixture supplies most amino acids required for protein synthesis. Ribosomes use them to elongate nascent polypeptide chains.

Tyrosine Tyrosine is supplied separately because it can degrade or precipitate under storage conditions. Maintaining correct tyrosine levels improves protein yield.

Cysteine Cysteine is added separately because it is chemically unstable and easily oxidized. It is critical for proper disulfide bond formation and protein folding.

Additives Nicotinamide Nicotinamide supports redox and metabolic enzyme activity by contributing to NAD-related biochemical pathways. This can improve reaction stability and protein yield over longer incubations.

Backfill Nuclease-Free Water Nuclease-free water adjusts the final reaction volume while preventing RNA or DNA degradation by contaminating nucleases.

2. Difference Between the 1-Hour PEP-NTP Mix and 20-Hour NMP-Ribose-Glucose Mix

The 1-hour optimized PEP-NTP system is designed for rapid, high-yield protein expression using phosphoenolpyruvate (PEP) as a fast ATP regeneration source and direct nucleotide triphosphates (NTPs) for transcription. In contrast, the 20-hour NMP-ribose-glucose system uses lower-cost nucleotide monophosphates and slower metabolic energy regeneration pathways, enabling longer and more economical protein synthesis reactions.

The long-duration mix sacrifices some initial expression speed in exchange for improved sustainability and reduced reagent cost, making it more suitable for large-scale or extended incubations.

Part C: Planning the Global Experiment | Cell-Free Master Mix Design

Biophysical or Functional Properties Affecting Cell-Free Expression

  1. sfGFP

sfGFP matures rapidly and folds extremely efficiently, even under partially stressful conditions. Its robust folding makes it highly compatible with cell-free systems and long incubations.

  1. mRFP1

mRFP1 has a relatively slower chromophore maturation time compared to GFP derivatives. This delayed maturation can reduce apparent fluorescence during short incubation experiments.

  1. mKO2

mKO2 is sensitive to acidic environments and requires oxygen-dependent chromophore maturation. Changes in pH or oxygen availability may significantly affect fluorescence intensity.

  1. mTurquoise2

mTurquoise2 has extremely high quantum yield and brightness but requires efficient folding for optimal fluorescence. Magnesium concentration and incubation temperature may influence its folding efficiency.

  1. mScarlet_I

mScarlet_I is highly bright with improved maturation kinetics compared to older red fluorescent proteins. However, red chromophores generally mature more slowly than green fluorescent proteins.

  1. Electra2

Electra2 likely exhibits specialized spectral properties but may have increased sensitivity to oxidation or folding stress. Extended incubations could improve maturation if energy supply remains stable.

The amino acid sequences are shown in the HTGAA Cell-Free Benchling folder.

Hypothesis for Master Mix Optimization

  • Hypothesis

Increasing magnesium glutamate concentration and supplementing additional glucose in reactions expressing mScarlet_I will improve fluorescence intensity over a 36-hour incubation.

  • Reasoning

Higher magnesium levels may enhance ribosome stability and improve translation efficiency, while additional glucose can sustain ATP regeneration during long incubations. Because mScarlet_I has slower chromophore maturation kinetics, prolonged energy availability should increase total mature fluorophore formation and therefore increase fluorescence output.

  • Expected Effect

The modified master mix is expected to produce: -higher total fluorescence intensity, -improved protein yield, -and increased mature chromophore accumulation after 36 hours.

Example Custom Reagent Supplement Strategy

For long-term expression of mTurquoise2:

Add supplemental magnesium glutamate (+2–4 mM) Add additional glucose Include mild molecular crowding agents such as PEG-8000

This combination may improve folding efficiency, sustain ATP production, and stabilize fluorescence during extended incubation.

Data Analysis Strategy

After fluorescence measurements are collected, the data can be analyzed by comparing fluorescence intensity across different reagent formulations and incubation times. Statistical analysis can identify which reagent compositions maximize protein brightness, maturation efficiency, and long-term stability in cell-free systems.

Fluorescence curves over time can also reveal whether specific proteins benefit more from enhanced energy systems, altered magnesium concentrations, or improved folding environments.

Total: 20 μL reaction

Week 12 HW: Building Genomes

Homework

1- Important: Be sure you’ve seen the updated week 11 homework which is due at the start of the April 28 lecture. Done.

2-Tip: Continue making progress this week on your Individual Final Project and on DNA orders (due Friday midnight ET). Done.

Week 13 HW: Biodesign & Engineered Living Materials

Homework: Work on your Final Project
Present it May 12 (MIT/Harvard) or May 13 (Committed Listeners) Done! :)

Subsections of Labs

Week 1 Lab: Pipetting

cover image cover image

Projects

Final projects:

  • Plan Bacteriophages -> L to lyse bacterial cells Release newly produced phages Modification to make more effective for lysing the protein Higher stability for L protein L protein interacts with host (cellular missionary) with chaperone, DnaJ Influence protein folding POSSIBLE DIRECTION: modifying residues to influence how L protein interacts with host proteins Specific minor acid residues can affect lysis GOALS
  • A. Proposal Ideas Idea 1: Kinetic Bio-Interface Sources of Inspiration: MPU-6050 IMU Sensor + Electrobacteria (Shewanella/Geobacter) + Pattern Recognition Problem/Vision: It is easy to translate human movement into digital systems, but translating it into a “living/biological” system is very difficult. Can we design a device that bridges human kinetics and microbial metabolism?

Subsections of Projects

Group Final Project

Plan

Bacteriophages -> L to lyse bacterial cells

  • Release newly produced phages
  • Modification to make more effective for lysing the protein
  • Higher stability for L protein

L protein interacts with host (cellular missionary) with chaperone, DnaJ

  • Influence protein folding
  • POSSIBLE DIRECTION: modifying residues to influence how L protein interacts with host proteins
  • Specific minor acid residues can affect lysis

GOALS

  • Increase stability of L protein
  • Higher titers
  • Higher toxicity of lysis protein

COMPUTATIONAL DIRECTIONS

  • Attempt to improve L stability
    • Computational protein design tools
    • Investigate mutations that could impact protein and host factors interactions

Pull from database what is considered more stable from similar proteins

  • See what is conserved between them
  • What does stability mean in this context? Only in cytosol? Ultimate method of delivery?
    • Use same tool for Week 5
    • Host cell in unfolded, folded by chaperone
  • POSSIBLE DIRECTION: If the efficacy of L protein requires chaperone (working within specific temp range), could we mutate to work in greater temperature range for that chaperone
    • Limit denaturation?
    • Binding (30-37 C) -> will not be able to lyse bacteria

READING (BEFORE NEXT MEETING)

  • Stability, which parts of protein to stability
  • Binding characteristics
  • What we can change, play with

FINAL PROPOSAL:

Bacteriophage Final Project Proposal: Engineering the MS2 L Protein

Brainstorm Session & Chosen Goal

  • Goal: Higher toxicity of lysis protein (Hard).
  • Focus: We aim to engineer the MS2 bacteriophage L protein to induce faster and more severe bacterial lysis by severing its dependence on the E. coli host chaperone DnaJ. Naturally occurring Lodj (L overcomes DnaJ) mutants, which lack the highly basic N-terminal half of the protein, can bypass DnaJ and lyse cells approximately 20 minutes earlier than the wild-type. We will computationally design novel N-terminal variants while preserving the essential C-terminal transmembrane domain and the critical LS dipeptide motif (Leu48-Ser49), which forms the core of its heterotypic protein-protein interactions.

Tools and Approaches

  • In Silico Mutagenesis (Evo Models): We will use advanced genome language models (Evo 1 and Evo 2) to perform in silico mutagenesis. We aim to generate a diverse library of L gene variants with altered or truncated N-terminal domains to bypass DnaJ dependency.
  • AlphaFold 3: We will utilize AlphaFold 3 to predict the 3D structures and biomolecular interactions of our newly generated L protein candidates.

Rationale for Chosen Tools

  • Evo Models: Genome language models like Evo have been trained on vast datasets encompassing over two million bacteriophage genomes. They have demonstrated the unique ability to learn complex evolutionary rules and generate biologically realistic, functional phage sequences with high novelty.
  • AlphaFold 3: AlphaFold 3 provides highly accurate structure predictions for biomolecular interactions. We will use it as a structural filter to ensure that our mutations do not disrupt the critical C-terminal helical domain and that the engineered proteins can still form the high-order oligomeric assemblies in the membrane required for cell lysis.

Potential Pitfalls

  • Endotoxin Release (Manufacturing & Safety): Accelerating the lysis process could lead to a massive, rapid release of bacterial components, including endotoxins and DNA, which poses a significant manufacturing and clinical safety hurdle. These toxins must be rigorously purified before administration.
  • Phage Resistance: Bacteria are involved in a constant evolutionary arms race and could rapidly develop resistance to our engineered, highly toxic phages.
  • Unknown Host Target: While we know the L protein interacts with DnaJ, its ultimate membrane protein target responsible for executing lysis remains completely unknown. Consequently, we cannot use AlphaFold 3 to explicitly optimize the binding affinity between our engineered L protein and its final target.

Pipeline Schematic

  1. Sequence Generation: Input the wild-type MS2 L sequence into the Evo genomic language models to generate a diverse library of variants focused on N-terminal modifications.
  2. Structural Filtering: Run mutant sequences through AlphaFold 3 to evaluate homomeric stability and ensure the vital C-terminal LS motif remains structurally intact.
  3. Initial Plasmid Screen & Measurable Output: To accurately measure lysis without the confounding variables of a full viral infection cycle, we will clone our Evo-generated L library into an arabinose-inducible pBAD vector, such as pBAD24 or pBAD33. We will transform these plasmids into an E. coli K-12 strain (e.g., TB28) to ensure standard host chaperone interactions. After inducing expression with arabinose, we will perform high-throughput growth inhibition assays in liquid culture by continuously tracking OD 600 in 96-well plates. Our primary measurable output for success is accelerated lysis kinetics. To definitively quantify “higher toxicity” we will compute the numerical derivative of OD 600 over time to extract three specific kinetic parameters compared to the wild-type MS2:
    • Maximum rate of decline (−Δ*OD/*600/min ): A steeper decline directly indicates a more potent/toxic lysis protein.
    • Time to minimum population density (minutes): Proves if our Lodj variants induce an earlier onset of lysis.
    • Depth of lysis: The absolute lowest OD 600 value reached.
  4. Whole-Phage Validation: Once we identify the top-performing, highly toxic L candidates from the plasmid screen, we will synthesize those specific genes as DNA fragments and assemble the recombinant full MS2 phage genomes in vitro using multi-fragment Gibson assembly techniques. This will ensure that our engineered L proteins remain fully functional in the context of a complete phage and do not inadvertently abolish viral viability.

References

Individual Final Project

A. Proposal Ideas


cover image cover image

Idea 1: Kinetic Bio-Interface

Sources of Inspiration: MPU-6050 IMU Sensor + Electrobacteria (Shewanella/Geobacter) + Pattern Recognition

Problem/Vision: It is easy to translate human movement into digital systems, but translating it into a “living/biological” system is very difficult. Can we design a device that bridges human kinetics and microbial metabolism?

Mechanism: A wearable MPU-6050 accelerometer sensor (IMU) reads the user’s hand or dance movements (gesture recognition). Machine learning algorithms analyze these movement patterns (pattern recognition) and convert them into small electrical signals. These signals are transmitted to electro-bacteria such as Shewanella oneidensis or Geobacter living on a microfluidic chip. The electro-bacteria take in this electron flow from the electrodes (microbial electrosynthesis), alter their metabolism, and produce a movement-specific response (e.g., a color change or fluorescent emission).

Automation (Opentrons): The robot is used to distribute different bacterial concentrations and conductive liquid media onto 96-well plates to test for the ideal “kinetic biosolution” where the electrical signal is transmitted most effectively.

Simulation

👉 Simulation

Strengths (+)

Interdisciplinary depth! Wearable sensor technology, machine learning, microfluidics, and synthetic biology all come together in a single project. Shewanella and Geobacter are organisms with well-characterized electron transfer mechanisms—in other words, we’re not inventing things from scratch; we’re building upon existing biology. Integrating with Opentrons also moves the project from a “conceptual demo” to “systematic optimization,” which is highly valuable.

Challenges / Risks (-)

The signal-bacteria interface is the most critical bottleneck. Electro-bacteria operate at microampere-level currents, and aligning signals derived from IMU data with this threshold requires significant calibration. Real-time response is also challenging: bacterial metabolism responds on the order of minutes, not seconds, so the “instant dance → luminescence” scenario will actually require a delay buffer. Microfluidic chip design and fabrication is a project in itself.

Decision on development feasibility: It can be developed, but with a narrowed scope. Perhaps simply getting the “single gesture → single bacterial response” loop to work for the final demo would be a very powerful demonstration. Attempting to run the full pipeline carries the risk of not being able to demonstrate any single layer in depth.


cover image cover image

Idea 2: Paleo-Proteins (Neurological Protection Through Ice Age Resilience)

Waving to my Archeology background which I couldn’t complete 🫣 (Double Major)

Sources of Inspiration: Revival of plants from the Ice Age + Synthetic Antibody/Protein Design Using LLM + The Link Between Alzheimer’s and Dementia and Proteins

Problem/Vision: At the root of Alzheimer’s and dementia lies the accumulation of misfolded proteins in the brain. Ancient organisms that have remained dormant since the Ice Age and can revive possess remarkable chaperone (protective) protein mechanisms that prevent their proteins from freezing and breaking down.

Mechanism: Using Large Language Models (LLMs) and AI-based protein design tools (ProteinMPNN, ESMFold), we extract “language” from the protein structures of organisms that survived the Ice Age. With this AI model, we generate “Ancient Resilient Synthetic Chaperones” (Paleo-Proteins) that have the potential to prevent misfolding in Alzheimer’s plaques.

Automation (Opentrons): Dozens of different synthetic protein variants generated by AI are produced in cell-free systems. The Opentrons robot automatically screens these proteins by mixing them with target (amyloid/tau-like) proteins to determine which variant best prevents aggregation/folding (High-Throughput Screening).

Simulation

👉 Simulation

Strengths (+)

Narrative power! The statement “Ice Age organisms could treat Alzheimer’s” is both scientifically defensible and a compelling bio-narrative. My background in archaeology combined with biology perfectly aligns with the HTGAA spirit: I am the prime example of the archetype of someone who comes from a different discipline and views biotechnology differently.

It’s technically sound as well: chaperone proteins from permafrost organisms are well-characterized; ProteinMPNN and ESMFold are production-ready tools; cell-free protein expression with Opentrons is highly suitable for automation; and the ThT fluorescence assay is a standard method for measuring amyloid. Every layer is grounded in existing technologies.

Risks (-)

The biggest risk is the massive translation gap between synthetic chaperones preventing amyloid aggregation in vitro and neuroprotection in vivo; but this is a final project, not a clinical trial, so an in vitro demonstration is more than sufficient. Protein folding quality must also be controlled in cell-free expression; a misfolded chaperone would be ironic.

Paleo-Proteins has a much clearer “problem → solution” pathway compared to my other suggestions. The Kinetic Bio-Interface is creative, but the “what’s the point” question remains unclear; here, the target is Alzheimer’s, the mechanism of action is clear, and the measurement method is standard. (Every project should also consider the sponsor’s perspective, and this idea is in a position to attract more support.)


cover image cover image

Idea 3: Bio-Swarm (Ant-Algorithm Micro-Surgical Droplets)

Sources of Inspiration: Surgeon ants + Microfluidics + Pattern Recognition

Problem/Vision: In nature, certain ant species recognize the infection scent/chemical (pattern) in the wounds of injured colony members and secrete specific antibiotics or perform bio-surgery on that area. Current drugs spread throughout the entire body. Can we build an ant-like system that delivers the drug only to the “problematic area”?

Mechanism: Using microfluidic technology, we produce liposomes (extracellular artificial cells) embedded with synthetic genetic circuits. These artificial droplets are designed to open only when they recognize a specific disease/infection chemical pattern (just like surgeon ants) and secrete their healing (surgical/antimicrobial) protein locally.

Automation (Opentrons): Opentrons drips target “infection chemicals” onto a petri dish or microfluidic chip in a maze-like pattern. It then sets up an automated experimental apparatus to visualize how accurately the synthetic surgical droplets detect and burst (respond to) these targets.

Simulation

👉 Simulation

Strengths (+)

Narratively, it’s very striking! The “surgeon ants” metaphor is memorable, and drawing inspiration from nature (bio-inspired design) is an approach HTGAA really loves. Liposome-based targeted drug delivery is already an active area of research, but combining it with synthetic genetic circuits offers a fresh perspective. The maze test on a microfluidic chip has the potential to be a visually striking demo compatible with Opentrons.

Risks (-)

This is the most technically challenging project idea. It requires integrating three separate systems, each complex in its own right: synthetic genetic circuit design (which is a project in itself), liposome fabrication, and microfluidic navigation behavior on top of these. The “ant algorithm” part is a nice metaphor, but in reality, liposomes performing active chemotaxis is a very different engineering challenge from passive diffusion — to achieve this, receptors must be placed on the liposome surface, which requires a deep understanding of protein engineering.

Additionally, the “trigger mechanism that bursts and releases the drug” (e.g., pH-sensitive or enzyme-sensitive liposomes) and “specific chemical pattern recognition” are both highly complex when considered together. Instead of demonstrating all of these in the final demo, it would be better to focus on just one aspect — such as the “opening of liposomes sensitive to infection markers” — and present that in depth.


cover image cover image

Idea 4: Cryo-Lume

Spherified Biosensors for Aesthetic Contaminant Detection via Halochromic and Cryogenic Phase-Change

Background

  1. Current environmental biosensors predominantly rely on simple colorimetric or fluorescent readouts to detect contaminants. Cryo-Lume proposes a paradigm shift in bio-art and bioremediation by introducing a dual-output biosensor that couples bioluminescence with a macroscopic physical phase change—specifically, localized ice nucleation in a supercooled fluid.

  2. The biological chassis will be encapsulated within calcium alginate spheres. While sodium alginate is naturally highly hydrophilic, calcium alginate can be engineered to be hydrophobic for structural and bioremediation applications. The physical behavior of these hydrophobic spheres when introduced into an aquatic environment is critical; their specific water-entry dynamics and fluid displacement determine how effectively they interface with the sample fluid (Watson et al., 2025).

  3. Upon entering the contaminated water, the engineered bacteria inside the hydrophobic spheres will detect specific pollutants (e.g., heavy metals/arsenic) and simultaneously activate two distinct pathways: the luxCDABE operon for autonomous bioluminescence, and the inaK or inaZ gene for ice nucleation. This causes the surrounding supercooled water to rapidly freeze around the sphere, forming a glowing, precipitating “snowflake” that visually and physically captures the contaminant.

Aim

  1. Design and Characterize the Dual-Output Genetic Circuit. Construct an AND-gate logic circuit combining a contaminant-responsive promoter (e.g., ArsR) with the lux operon (for glowing) and the inaK protein (for ice nucleation).

  2. Optimize the Hydrophobic Spherification Process. Produce calcium alginate capsules containing the engineered cells, modifying the polymer to achieve the specific hydrophobic water-entry dynamics and fluid interaction parameters outlined by Watson et al. (2025).

  3. Automate Biosensor Screening and Calibration. Utilize the Opentrons liquid handling robot to automate reaction setups in 96-well plates. This high-throughput screening will allow us to test various alginate concentrations and characterize the signal-to-noise ratio, mapping exact contaminant levels to their corresponding cryogenic/luminescent response times.

Simulation

👉 Simulation

Strengths (+)

The dual-output design is genuinely novel. Existing biosensors produce a single readout (color, fluorescence, or electrochemical signal) that requires instrumentation to read. Cryo-Lume produces two simultaneous outputs: a molecular signal (bioluminescence, quantitative via luminometry) and a macroscopic physical phase change (ice nucleation, visible to the naked eye). This means the biosensor works both in a lab with a plate reader AND in a remote village with no equipment — you literally see it glow and feel the ice form. Both biological components — luxCDABE and inaK — are among the most well-characterized systems in synthetic biology. The lux operon has been used in biosensors since the 1990s with published dose-response curves. The inaK ice nucleation protein from Pseudomonas syringae has been studied for decades in atmospheric science and food technology. Neither requires speculative biology — this is engineering with proven parts. The real-world problem is enormous and urgent: arsenic contamination affects over 200 million people globally, predominantly in communities with the least access to analytical infrastructure. A low-cost, equipment-free biosensor directly addresses environmental justice.

Challenges / Risks (-)

Maintaining supercooled water at -3°C requires careful temperature control, though the ~8°C gap between inaK-induced and spontaneous nucleation provides a comfortable margin. The hydrophobic alginate coating could slow arsenic diffusion if too thick, but this is precisely what the Opentrons optimization matrix is designed to calibrate. Cell viability in spheres degrades over time — fine for a final demo with fresh preparations, but long-term deployment would need freeze-drying. All risks have clear mitigations and none are fundamental blockers.


B. FINAL PROPOSAL

Paleo-Proteins

Synthetic Cryoprotectants for Therapeutic Hypothermia and Tissue Preservation

cover image cover image

During therapeutic hypothermia in surgeries and organ transplants, extreme cold damages human cells. Ancient plants that survived the Ice Age possess unique cryoprotective protein mechanisms.

This project uses AI-driven protein design to create synthetic cryoprotectants inspired by LEA proteins, dehydrins, and antifreeze proteins from Ice Age survivors; screening them via automated cold assays to identify candidates that protect human cells during medical cooling.

Aim 1: AI-Driven Protein Design

  • Scaffold: Ancestral Sequence Reconstruction (ASR) utilizing genomic data from 30,000-year-old Siberian permafrost plants (Yashina et al., 2012) as the core methodology to generate scientifically grounded ‘paleo’ scaffolds (LEA/dehydrins), replacing the reliance on generic PDB structures.
  • Design: RFdiffusion or ESM-IF for surface residue optimization; ESMFold (Lin et al., 2023) and IUPred3 for evolutionary-scale structural and flexibility validation.
  • Filter: Tm prediction, hydrophilicity score, intrinsic disorder ratio + CamSol/Protein-Sol → stringent computational filtering to prioritize top 5-10 candidates for synthesis.

Aim 2: Cellular Expression & Validation

  • Host: E. coli BL21(DE3) with T7 promoter for rapid first-pass expression of the top 5-10 paleo-candidates.
  • Tag&Purify: N-terminal His₆-tag; Ni-NTA affinity purification
  • Verify:SDS-PAGE + Western blot confirmation of soluble expression.
  • Scale-up: If PTMs are needed, transition to Pichia pastoris or mammalian cells.

Aim.3: Automated High-Throughput Screening

  • Platform: Opentrons OT-2 liquid handler; 96-well plate format using human cell lines (HEK293T or SH-SY5Y)
  • Protocol: (a) Therapeutic hypothermia: 37°C → 33°C → 28°C. (b) Cryopreservation: 4°C / 0°C / −20°C for 4, 12, 24h incubation windows
  • Readout: MTT viability assay; dose-response curves (0.1–100 μM) to calculate EC50; hit = ≥30% viability increase vs. untreated control
  • Controls: (+) trehalose/glycerol, Type III AFP (RD3, PDB: 1HG7) as biological benchmark; (−) empty vector/GFP

We are not just looking at modern cold-hardy plants. The ‘Paleo-Proteins’ pipeline targets the genomic signatures of 30,000-year-old plant tissues buried in Siberian permafrost (Yashina et al., 2012). Although initially misidentified by morphology, recent molecular phylogenetic analyses have confirmed that these ancient survivors do indeed belong to the Silene linnaeana group (Kramina et al., 2021). This firm confidence in molecular truth rather than pure morphology is also the reason for using Ancestral Sequence Reconstruction (ASR) combined with evolutionary scale models such as ESMFold in Goal 1 (Lin et al., 2023).

Simulation

👉 Simulation

cover image cover image
  • According to simulation data, DHN-K2S (EC₅₀: 8.2 µM) is the most promising candidate—it passes all tests.
cover image cover image
  • In-silico phase will be completed in-course; the lab expression (Aim 2) and screening (Aim 3) phases are considered post-course due to gene synthesis costs and time constraints.

  • Situated at the intersection of synthetic biology and quantum thermodynamics, Paleo-Proteins reimagines cellular cryopreservation, drawing inspiration from the physics of ultracold matter. Just as a single ‘impurity’ or ‘polaron’ can dynamically alter the energy, size, and motion of an ultracold Bose-Einstein condensate, AI-designed synthetic chaperones act as protective ‘impurities’ within the frozen cytoplasm, thermodynamically disrupt ice crystallization, and preserve cellular integrity (Simons, 2022). To validate these synthetic cryoprotectants, we will apply the meticulous calibration, control, and precise measurement practices inherent to quantum physics to Opentrons’ automated high-throughput screening, enabling the extraction of accurate signals from these noisy biological cell viability data. Ultimately, this project—which integrates AI-driven protein design, medical science, and the dynamics of ultra-cold fluids—aims to engineer unprecedented biological resilience during therapeutic hypothermia and push the boundaries of human lifespan through tissue preservation (Simons, 2025).

Organizations of Interest

Twist Biosciences: Whole plasmid synthesis for all Paleo-Protein expression constructs Ginkgo Bioworks: Primary lab automation (Echo525, PHERAstar FSX, Cytomat) and CFPS master mix Asimov Kernel: DNA construct design, circuit simulation, and construct registry Opentrons: OT-2 automated liquid handling for cell seeding and MTT assay workflow Thermo Fisher Scientific: Cell culture reagents, cDNA synthesis kit, BCA assay, labware Millipore Sigma: MTT reagent, IPTG, anti-His₆ antibody, trehalose positive control New England Biolabs: BL21(DE3) competent cells for recombinant protein expression SecureDNA: Biosecurity screening of all synthetic DNA orders

Future: DeepCure, Takeda Pharmaceuticals

References

Bikard, D., Euler, C. W., Jiang, W., Nussenzweig, P. M., Goldberg, G. W., Duportet, X., Fischetti, V. A. and Marraffini, L. A. (2014) ‘Exploiting CRISPR-Cas nucleases to produce sequence-specific antimicrobials’, Nature Biotechnology, 32, pp. 1146–1150.

Carbon Minds (n.d.) DNA as a Testament to Humanity: Transmuting social inequality indices and human rights principles into genetic code and storing them in Bacillus subtilis spores for millennia via cryptobiosis. Guillermo Romero Tecua.

Citorik, R. J., Mimee, M. and Lu, T. K. (2014) ‘Sequence-specific antimicrobials using efficiently delivered RNA-guided nucleases’, Nature Biotechnology, 32, pp. 1141–1145.

Dauparas, J. et al. (2022) ‘Robust deep learning–based protein sequence design using ProteinMPNN’, Science, 378(6615), pp. 49–56. DOI: 10.1126/science.add2187

Eriksson, S.K. et al. (2011). Phosphorylation of the dehydrin Lti29 in Arabidopsis thaliana is associated with cold acclimation. Plant, Cell & Environment, 34(7), 1071–1082. https://doi.org/10.1111/j.1365-3040.2011.02304.x

Frank, E. T., Kesner, L., Liberti, J. et al. (2023) ‘Targeted treatment of injured nestmates with antimicrobial compounds in an ant society’, Nature Communications, 14, 8446. Available at: https://doi.org/10.1038/s41467-023-43885-w

Jewett, M.C. and Swartz, J.R. (2004) ‘Mimicking the Escherichia coli cytoplasmic environment activates long‐lived and efficient cell‐free protein synthesis,’ Biotechnology and Bioengineering, 86(1), pp. 19–26. https://doi.org/10.1002/bit.20026.

Kramina, T.E. et al. (2021) ‘Molecular taxonomic identification of a Silene plant regenerated from Late Pleistocene fruit material,’ Wulfenia, (28), pp. 29–50. https://www.zobodat.at/pdf/Wulfenia_28_0029-0050.pdf

Lin, Z. et al. (2023) ‘Evolutionary-scale prediction of atomic-level protein structure with a language model,’ Science, 379(6637), pp. 1123–1130. https://doi.org/10.1126/science.ade2574.

Meighen, E.A. (1991) ‘Molecular biology of bacterial bioluminescence,’ Microbiological Reviews, 55(1), pp. 123–142. https://doi.org/10.1128/mr.55.1.123-142.1991.

Min, J., Kim, E. J., LaRossa, R. A. and Gu, M. B. (2000) ‘Detection of environmental effects using recombinant bioluminescent Escherichia coli strains’, Radiation and Environmental Biophysics, 39(1), pp. 41–45.

New understanding of why supercooled water droplets sometimes explode when they freeze’ (2023) Chemistry World. Available at: https://www.chemistryworld.com/news/new-understanding-of-why-supercooled-water-droplets-sometimes-explode-when-they-freeze/4017904.article (Accessed: 3 April 2026).

Oxford University Cryogenic Fluid Dynamics Lab (n.d.) Research on multiphase cryogenic processes, liquid hydrogen, and cryogenic carbon capture. Available at: http://eng.ox.ac.uk/cryogenic-fluid-dynamics-lab

Sarkar, K., Bonnerjee, D. and Bagh, S. (2021) ‘Distributed computing with engineered bacteria and its application in solving chemically generated 2×2 maze problems’, ACS Synthetic Biology, 10(10), pp. 2456–2464. Available at: https://pubs.acs.org/doi/10.1021/acssynbio.1c00279

Schaefer, V. J. (1946) ‘The production of ice crystals in a cloud of supercooled water droplets’, Science, 104(2707), pp. 457–459. Available at: https://doi.org/10.1126/science.104.2707.457

‘Scientists Discover Water That’s Frozen and Liquid at Once’ (2025) ScienceBlog.com. Available at: https://scienceblog.com/scientists-discover-water-thats-frozen-and-liquid-at-once/ (Accessed: 3 April 2026).

Shi, L. et al. (2016) ‘Extracellular electron transfer mechanisms between microorganisms and minerals,’ Nature Reviews Microbiology, 14(10), pp. 651–662. https://doi.org/10.1038/nrmicro.2016.93.

Simons, L. (2025). Bose polarons in superfluids and supersolids [Doctoral dissertation, University of Antwerp] https://repository.uantwerpen.be/docman/irua/d8fd0bmotoM35

Simons, L. (2021). Analog physics with excitations in Bose-Einstein Condensates [Master’s thesis, Universiteit Antwerpen] https://medialibrary.uantwerpen.be/files/7606/704c6cf3-5f9c-4a15-989f-98fed6ffd3b2.pdf

Soprunyuk, V. and Schranz, W. (2023) ‘Glass and freezing transition of supercooled water confined in mesoporous materials and biological systems,’ International Online Medical Council [Preprint]. https://doi.org/10.35248/2322-3308-12.4.001.

Tunnacliffe, A. and Wise, M.J. (2007) The continuing conundrum of the LEA proteins, Naturwissenschaften. journal-article, pp. 791–812. https://doi.org/10.1007/s00114-007-0254-y.

Watson, D.A. et al. (2025) ‘Water entry dynamics of hydrophobic spheres bounded by semi-infinite cylindrical pipes,’ Physics of Fluids, 37(11). https://doi.org/10.1063/5.0300541.

‘Wilson Bentley (Snowflake Bentley)’ (n.d.) Wikipedia. Available at: https://en.wikipedia.org/wiki/Wilson_Bentley

Wogan, T. (2024) ‘Experimental proof of liquid–liquid transition in supercooled water’, Chemistry World. Available at: https://www.chemistryworld.com/news/experimental-proof-of-liquid-liquid-transition-in-supercooled-water/4012820.article

Wu, J. and Rosen, B.P. (1993) ‘Metalloregulated expression of the ars operon.,’ Journal of Biological Chemistry, 268(1), pp. 52–58. https://doi.org/10.1016/s0021-9258(18)54113-2.

Yashina, S., Gubin, S., Maksimovich, S., Yashina, A., Gakhova, E. and Gilichinsky, D. (2012) ‘Regeneration of whole fertile plants from 30,000-y-old fruit tissue buried in Siberian permafrost’, Proceedings of the National Academy of Sciences, 109(10), pp. 4008–4013. Available at: https://doi.org/10.1073/pnas.1118386109

  • I used Claude for all technical infrastructure and simulation testing and generated the visuals using NanoBanana 2.

  • I would like to thank our Node Chair Prof. Han, for his valuable contributions and feedback which helped me refine the final project idea.


C. HTGAA 2026: Individual Final Project Documentation

SECTION 1: ABSTRACT

1. Abstract

Therapeutic hypothermia and cryopreservation are critical procedures in medical surgeries, organ transplants, and cellular therapies. However, extreme cold temperatures inherently cause cellular damage, crystallization, and tissue degradation. At the root of this challenge is the need for effective cryoprotectants that can thermodynamically disrupt ice crystallization and preserve cellular integrity. To address this, the “Paleo-Proteins” project draws inspiration from evolutionary biology—specifically, ancient plants that survived the Ice Age, such as the Silene linnaeana group found in 30,000-year-old Siberian permafrost. These ancient organisms possess unique, highly resilient chaperone protein mechanisms.

The broad objective of this project is to harness AI-driven protein design to develop synthetic cryoprotectants inspired by Late Pleistocene Late Embryogenesis Abundant (LEA) proteins and dehydrins. We hypothesize that computationally generated “paleo” scaffolds, optimized via modern AI models, will act as protective ‘impurities’ within the frozen cytoplasm and exhibit superior cryoprotective properties compared to generic proteins. The specific aims involve utilizing Ancestral Sequence Reconstruction (ASR) to generate scaffolds, optimizing them with RFdiffusion and ESM-IF, and filtering candidates using ESMFold and IUPred3.

Ultimately, this project aims to express these candidates and validate their efficacy via automated high-throughput cold assays, establishing a novel pipeline for engineering unprecedented biological resilience during therapeutic cooling.


SECTION 2: PROJECT AIMS

AIM.1: Experimental Aim The first aim of my final project is to computationally design a library of ‘Ancient Resilient Synthetic Chaperones’ (Paleo-Proteins) by utilizing Ancestral Sequence Reconstruction (ASR) and AI-based protein design tools. Relevant methods and resources:

  • Ancestral Sequence Reconstruction (ASR): Mining and reconstructing genomic data from 30,000-year-old Ice Age permafrost survivors (Silene linnaeana).

  • AI Protein Design Tools: Utilizing RFdiffusion or ESM-IF for de novo surface residue optimization, alongside ESMFold and IUPred3 for atomic-level 3D structural prediction and flexibility validation.

  • Computational Filtering: Applying computational models to screen candidates based on melting temperature (Tm) predictions, hydrophilicity scores, and intrinsic disorder ratios (CamSol/Protein-Sol) to prioritize the top 5-10 candidates.

AIM.2: Experimental Aim

  • The next step following a successful in-silico design (Aim 1) would be to transition into physical synthesis and automated in-vitro testing. This involves synthesizing the DNA sequences of the top AI-generated protein candidates via Twist Bioscience and expressing them in model cell lines (e.g., E. coli BL21).

  • To address the technical challenge of high-throughput testing, an Opentrons OT-2 liquid-handling robot will be deployed to automate the screening workflow. The robot will format 96-well plates, subject the engineered cells to a gradient of sub-body/freezing temperatures (simulating therapeutic hypothermia), and automate MTT cell viability assays to pinpoint which specific variants successfully confer cold resistance.

AIM.3: Visionary Aim

  • The long-term vision for this project is to translate these synthetic cryoprotectants into clinical therapeutics, directly addressing a major barrier in the medical field: the inherent cellular damage caused by extreme cold during therapeutic hypothermia.

  • If fully realized, these “Paleo-Proteins” would challenge the existing paradigm of tissue preservation by acting as protective ‘impurities’ (conceptually akin to polarons stabilizing ultracold quantum matter) that thermodynamically disrupt ice crystallization inside human cells.

  • This would revolutionize clinical practices during complex surgeries, organ transplants, and brain trauma management, safely extending the viability window for human organs and ultimately pushing the boundaries of human longevity.


SECTION 3: BACKGROUND

3a. Background and Literature Context

Therapeutic hypothermia is a critical clinical practice used to preserve organ function and minimize brain trauma during complex surgeries, yet the inherent cellular damage caused by extreme cold remains a major barrier.

Current cryopreservation methods lack targeted, highly efficient biological protectants that operate effectively within human tissue at sub-body temperatures without causing toxicity. To address this gap, this project draws core biological inspiration from Yashina et al. (2012), who successfully regenerated whole fertile plants from 30,000-year-old fruit tissue buried in Siberian permafrost, demonstrating the remarkable, long-term viability of ancient cryoprotective mechanisms.

Complementarily, the physical principles of this project are inspired by Simons (2025), whose research on Bose polarons in ultracold matter illustrates how single “impurities” can dynamically alter the energy, size, and motion of supercooled environments. By integrating these biological and physical insights, we aim to design synthetic chaperones that act as protective ‘impurities’ within freezing cytoplasm to prevent crystallization.

3b. How the project is novel or innovative

Unlike traditional drug discovery that relies on existing modern protein structures, this project utilizes Ancestral Sequence Reconstruction (ASR) combined with evolutionary-scale AI models (like ESMFold) to mine and revive ancient genomic data from Ice Age survivors.

Furthermore, it reconceptualizes biological freezing through the lens of quantum thermodynamics, treating the engineered synthetic chaperones as functional ‘impurities’ (akin to polarons) that thermodynamically disrupt ice crystallization inside the cell. This highly interdisciplinary approach expands the boundaries of synthetic biology by merging evolutionary archaeology, artificial intelligence, and ultracold fluid dynamics to challenge the current biological limits of cellular cold resistance.

3c. Why the project matters and what impact it could have

During complex medical procedures such as organ transplants, major cardiovascular surgeries, or brain trauma management, inducing sub-body temperatures (therapeutic hypothermia) is crucial to temporarily halt cellular decay and increase patient survival . However, a pressing real-world problem is that extreme cold inherently damages human cells, representing a critical barrier to progress in these life-saving surgical practices . This project attempts to solve this paradox by developing therapeutic “Paleo-Proteins” that act as synthetic cryoprotectants to safely prevent cold-induced cellular damage.

If successful, the broader societal contribution would be immense; it could significantly extend the viability window for donor organs—easing the global organ shortage crisis—and reduce irreversible brain damage in trauma patients . Ultimately, this advancement would drastically alter clinical practices in hypothermic medicine, improve the safety of complex surgeries, and push the boundaries of human longevity by enabling an entirely new biological capability: engineered resilience against extreme cold.

3d. Ethical implications associated with project and relevant ethical principles The development of “Paleo-Proteins” as synthetic cryoprotectants for therapeutic hypothermia holds profound implications for public health, directly aligning with the ethical principle of beneficence. By extending the viability window for donor organs and minimizing cellular damage during complex brain or cardiovascular surgeries, this research addresses a critical medical need and aims to save lives. However, because this technology fundamentally pushes the boundaries of tissue preservation and human longevity, it introduces severe implications regarding justice. As seen in emerging longevity research, such advancements raise crucial questions about equity, consent, and accessibility. If these enhanced cryoprotectants are successfully developed, there is a risk they could be restricted exclusively to high-resource medical facilities or wealthy individuals, thereby widening the global healthcare inequality gap. Furthermore, the principle of non-maleficence must be prioritized; engineering extreme cold resistance into biological expression systems (such as the E. coli used for production) presents a potential biosafety hazard if these resilient traits were to escape into the environment and disrupt local ecosystems.

To ensure the ethical execution of this project, I propose implementing strict biocontainment actions—such as engineered genetic kill-switches or synthetic amino acid auxotrophy—in our microbial chassis, alongside transparent, equitable licensing models to guarantee broad public health access to the final therapeutic. A potential unintended consequence of successfully creating highly resilient “super-cryoprotectants” is that they might inadvertently encourage unsafe, speculative procedures (such as unregulated extreme body cooling or commercial longevity biohacking) outside of approved clinical settings. Furthermore, we must acknowledge crucial scientific uncertainties: we could be wrong in our assumption that AI-designed ancient protein analogs will be safe and non-immunogenic in vivo. Medical AI models can overfit, and introducing foreign protein structures into the human bloodstream might trigger severe, unforeseen immune responses. Given these risks, a viable alternative to our proposed action of administering synthetic proteins directly to patients would be to use our in vitro automated screening platform solely to discover non-biological, small-molecule drugs that mimic the thermodynamic properties of Paleo-Proteins. This alternative would offer a cheaper, less immunogenic, and more universally accessible solution for global public health.


SECTION 4: EXPERIMENTAL DESIGN, TECHNIQUES, TOOLS, AND TECHNOLOGY

  • Claude | Final Project Experimental Design

Please use this directing link to see my final project proposal which is created via Claude Code based on a skill developed by Ronan Donovan.

  1. Detailed Experimental Plan & Timeline

Sub-aim 1: In-Silico AI Protein Design and DNA Assembly (Weeks 1-3)

┌─────────────────────────────────────────────────────────────────────────┐
│                    SUB-AIM 1: IN-SILICO DESIGN PIPELINE                 │
└─────────────────────────────────────────────────────────────────────────┘

  [1] ANCESTRAL DATA              [2] AI BACKBONE DESIGN
  ─────────────────               ──────────────────────
  Silene spp. permafrost    ──►   RFdiffusion
  genomic data                    de novo backbone
  (Yashina et al. 2012)           generation
  Basecamp Research DB            (~500 candidate
  LEA / dehydrin sequences        backbones)
         │                               │
         ▼                               ▼
  [3] SEQUENCE DESIGN             [4] STRUCTURAL VALIDATION
  ───────────────────             ─────────────────────────
  ESM-IF inverse folding   ──►   ESMFold structure
  ESM-2 embeddings                prediction
  PepMLM scaffold seeding         → pLDDT score ≥ 70
  (~200 sequences)                → RMSD vs. known
         │                          dehydrins < 3.5 Å
         ▼                               │
  [5] DISORDER & PROPERTY               ▼
      FILTERING                  [6] CANDIDATE SCORING
  ────────────────────           ──────────────────────
  IUPred3: disorder ratio        Composite score:
  ≥ 0.6 (intrinsically           • Tm prediction
  disordered required)           • Grand avg hydro-
  Anchor2: binding                 philicity (GRAVY)
  site prediction                • Disorder ratio
  GRAVY ≤ −0.5                  • pLDDT
         │                       • K-segment count
         └──────────┬────────────┘
                    ▼
           [7] TOP 5–10 CANDIDATES
           ─────────────────────────
           DHN-K2S (primary)
           DHN-K1, DHN-K2S-ΔS
           + 2–7 additional variants
                    │
                    ▼
           [8] DNA CONSTRUCT DESIGN
           ─────────────────────────
           Benchling (sequence design)
           Asimov Kernel (circuit
           verification & registry)
           pET-28a + His₆-tag + T7
                    │
                    ▼
           [9] TWIST ORDER + DELIVERY
           ─────────────────────────
           Whole Plasmid Synthesis
           SecureDNA screening
           → sequence-verified plasmids
             delivered in 7–10 days

  1. Reminder: All HTGAA projects must include some DNA design! Make sure this form is submitted.
  1. Techniques relevant to the project
  • Foundational Lab Practices
    • Pipetting
    • Lab Safety
    • Bioethical Considerations (must check this box)
  • DNA Skills & Analysis
    • DNA Gel Art
    • DNA Sequencing
    • DNA Editing (e.g., CRISPR)
    • DNA Construct Design
    • Restriction Enzyme Digestion
    • Gel Electrophoresis
    • DNA Purification from Gel
    • Databases (e.g., GenBank, NCBI, Ensembl, UCSC Genome Browser)
  • Laboratory Automation
    • Opentrons
      • Creating Code for Laboratory Automation
      • Using Liquid Handling Robots (e.g., Opentrons)
  • Protein Design
    • Protein Design
      1. Models and Notebooks
      2. Databases
      3. Tools
  • BioProduction - [x] Chassis Selection (e.g., DH5α, Bl21-DE3 for expression)
    • Registry of Standard Biological Parts
    • FreeGenes
    • Plasmid Preparation
    • Bacterial Culturing
    • Quality Control / Analysis
    • Bacterial Processing (e.g., Centrifugation, Lysis, DNA Purification)
  • Cell-Free System
    • Cell-Free Reactions
    • Freeze-Dried Cell-Free Systems
    • NEB Express Kits
    • miniPCR Tools
  • Gibson Assembly
    • Primer Design or Selection
    • PCR Reactions
    • Gibson Assembly
    • Other Cloning Methods (e.g., Restriction Enzyme Digestion or Gateway Cloning)
    • Creating Twist Order

4. Two expanded techniques by describing how I would utilize those techniques in final project.

Technique 1 - AI Protein Design: AI-driven protein design is the foundational technology of this project — without it, generating a library of cryoprotective candidates from ancient genomic data would be impossible within any realistic experimental timeline. In this project, I use a three-stage computational pipeline: RFdiffusion generates diverse protein backbone geometries conditioned on the canonical dehydrin K-segment amphipathic helix motif (EKKGIMDKIKEKLPG), exploring structural space that no natural evolutionary trajectory has visited; ESM-IF1 then performs inverse folding on each backbone, outputting amino acid sequences predicted to fold into those geometries while simultaneously satisfying evolutionary plausibility constraints learned from 250 million natural protein sequences; and ESMFold validates each candidate by predicting its full 3D structure from sequence alone, with per-residue pLDDT confidence scores used to confirm that K-segment regions adopt the expected α-helical geometry while spacer regions remain genuinely disordered — a property that is not a flaw but a functional requirement for intrinsically disordered cryoprotectants. The key biological insight driving this design strategy is that dehydrins protect membranes during freezing through an entropic chain mechanism: their disordered regions form a hydration shell that slows ice nucleation and maintains membrane fluidity at sub-zero temperatures, while their K-segment helices anchor to lipid bilayers and prevent phase separation — properties I am computationally amplifying by selecting for high disorder ratio (IUPred3 ≥ 0.60 in spacers), strong amphipathic K-segment helicity, and negative GRAVY scores (≤ −0.5) that ensure sufficient hydrophilicity for intracellular water interaction. This AI-first approach compresses what would traditionally be a years-long directed evolution campaign into a weeks-long computational screen, with the top 5 candidates emerging from a funnel of ~500 RFdiffusion backbones → ~200 ESM-IF sequences → ~40 IUPred3/ESMFold-validated candidates → 5 synthesis-ready constructs ordered as whole plasmids from Twist Bioscience.

Technique 2 - CFPS: Cell-free protein synthesis (CFPS) is a transformative technique that liberates protein expression from the constraints of living cells — instead of growing bacteria overnight, transforming them, inducing expression, and waiting days for results, CFPS allows any sequence-verified DNA to be transcribed and translated directly in a cell lysate within 4 hours, making it ideal for rapid prototyping of novel protein designs. In this project, I use E. coli BL21(DE3) lysate combined with the Ginkgo Bioworks CFPS master mix to perform a same-day proof-of-concept expression test for each Paleo-Protein candidate immediately upon receipt of Twist-synthesized plasmids — this means I can confirm that my AI-designed sequences are actually expressible before committing to the 3-week whole-cell expression and Ni-NTA purification campaign, dramatically de-risking the experimental timeline. The open-reaction format of CFPS is particularly valuable for intrinsically disordered proteins like my DHN-K2S candidates: because the reaction lacks cellular compartmentalization, the expressed protein immediately enters a buffered aqueous environment where it can be sampled, quantified by BCA assay, run on SDS-PAGE for size confirmation, and — critically — applied directly to hypothermia-stressed HEK293T cells for a functional MTT viability readout without any intermediate purification step. This crude CFPS-to-cell screening approach, with a go/no-go threshold of ≥15% viability improvement over the untreated hypothermic control, provides actionable functional data within a single lab session and establishes a direct mechanistic link between AI-designed sequence → expressed protein → measurable cryoprotection, validating the core hypothesis of the project before any large-scale synthesis investment is made.

5.Associated Industry Council companies

Twist Biosciences: Whole plasmid synthesis for all Paleo-Protein expression constructs Ginkgo Bioworks: Primary lab automation (Echo525, PHERAstar FSX, Cytomat) and CFPS master mix Asimov Kernel: DNA construct design, circuit simulation, and construct registry Opentrons: OT-2 automated liquid handling for cell seeding and MTT assay workflow Thermo Fisher Scientific: Cell culture reagents, cDNA synthesis kit, BCA assay, labware Millipore Sigma: MTT reagent, IPTG, anti-His₆ antibody, trehalose positive control New England Biolabs: BL21(DE3) competent cells for recombinant protein expression SecureDNA: Biosecurity screening of all synthetic DNA orders

Future: DeepCure, Takeda Pharmaceuticals


SECTION 5: PROJECT VALIDATION

1.Validation Choice

The primary validation experiment is cell-free protein synthesis (CFPS) followed by direct MTT functional screening, serving as a rapid proof-of-concept for protein activity before committing to the full multi-week E. coli expression and purification campaign. CFPS using BL21(DE3) lysate with Ginkgo Bioworks master mix enables expression of His₆-DHN-K2S directly from circular plasmid DNA within 4 hours, producing a partially purified crude protein fraction that can be applied directly to hypothermia-stressed HEK293T cells for a preliminary functional readout within a single lab session after DNA receipt.

2.Validation Protocol

  1. Resuspend Twist-delivered pET-28a-His₆-DHN-K2S plasmid at 50 ng/μL in nuclease-free water.
  2. Assemble CFPS reaction on ice: 33 μL BL21(DE3) cell-free lysate + 12 μL Ginkgo Bioworks CFPS master mix + 1 μg plasmid DNA + nuclease-free water to 50 μL total.
  3. Transfer reaction to a 1.5 mL microcentrifuge tube. Incubate at 30°C for 4 hours in Inheco Plate Incubator.
  4. Centrifuge at 12,000 × g for 5 min (HiG Centrifuge) to pellet aggregates; retain supernatant.
  5. Run 2 μL supernatant on 12% SDS-PAGE alongside a His₆ protein ladder. Stain with Coomassie Blue. Confirm band at ~11.2 kDa.
  6. Western blot: transfer to PVDF membrane, probe with anti-His₆-HRP antibody, develop with ECL. Confirm identity of band.
  7. Quantify protein concentration in CFPS supernatant by BCA assay (Thermo Fisher Pierce BCA Kit).
  8. Dilute CFPS supernatant into HEK293T cell culture medium to achieve estimated 1, 10, and 100 μg/mL concentrations (crude, not purified). Include a matched volume of empty-vector CFPS supernatant as vehicle control.
  9. Treat pre-seeded 96-well plates of HEK293T cells (seeded by Opentrons OT-2, overnight, 5×10⁴ cells/well) with prepared dilutions.
  10. Seal plates with A4s breathable seal (Plateloc). Transfer to 28°C hypothermic condition for 12 hours.
  11. Bring plates to room temperature (15 min). Add MTT reagent (Opentrons OT-2). Incubate at 37°C for 4 hours (Inheco).
  12. Add DMSO, shake 5 min (BioshakeD3000). Read at 570/670 nm (PHERAstar FSX).
  13. Calculate % viability normalized to within-plate 37°C negative control. Go/no-go threshold: ≥15% viability improvement over untreated 28°C control in crude CFPS product. If met → proceed to preparative purification (Steps 6–8 of full protocol).

3.Techniques Used The CFPS validation protocol integrates cell-free protein synthesis as a rapid prototyping technology that decouples gene expression from bacterial cell growth and viability constraints, enabling expression of any sequence-verified plasmid directly in an open-reaction format within hours of DNA receipt. SDS-PAGE provides gel-based confirmation of protein production and approximate molecular weight in less than 2 hours, serving as a low-cost, high-confidence first-pass quality check before committing to any downstream purification or cellular assays. Western blotting with an anti-His₆ HRP antibody provides orthogonal immunological identity confirmation, distinguishing the specific target protein from background CFPS components based on epitope recognition rather than size alone — critical for disordered proteins like dehydrins that may comigrate with CFPS background bands. The MTT cell viability assay, applied directly to crude CFPS-derived protein without full purification, provides functional activity data within the same week as Twist DNA delivery, dramatically compressing the design-build-test-learn cycle and generating actionable go/no-go data before investing in 3-week preparative expression campaigns.

4.Hypothetical Data

Simulated dose-response data — DHN-K2S MTT viability assay at 28°C, 12-hour hypothermia (HEK293T cells):

Cell Viability (% of 37°C untreated control)

100 |                                      ●  ●
 95 |                               ●
 88 |                        ●
 80 |
 75 |                  ●                        ◆ ◆ ◆ ◆ ◆ ◆ ◆ ◆ ◆ ◆
 67 |  ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■
 65 |                  ●
 55 |  ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲
      0.1   0.3    1     3    10    30    50   100  (μM)

● DHN-K2S (predicted EC₅₀ = 8.2 μM; max viability = 98% at 100 μM)
◆ Trehalose 100 mM (fixed concentration, viability = 72%)
■ Untreated cells at 28°C (negative control, viability = 65%)
▲ GFP / empty vector at 28°C (vehicle control, viability = 55%)
ConditionConcentrationMean Viability (%)SE (±)vs. NEG ctrl
Untreated — 37°C100.01.2
Untreated — 28°C (NEG)65.33.1baseline
GFP ctrl — 28°C54.84.2−10.5%
Trehalose — 28°C100 mM72.12.8+6.8%
AFP-RD3 — 28°C1 μM69.43.5+4.1%
DHN-K2S — 28°C0.1 μM67.22.9+1.9%
DHN-K2S — 28°C1 μM75.42.1+10.1%
DHN-K2S — 28°C10 μM88.31.8+23.0%
DHN-K2S — 28°C30 μM95.11.4+29.8%
DHN-K2S — 28°C50 μM97.21.5+31.9% ✓
DHN-K2S — 28°C100 μM98.01.6+32.7% ✓

Interpretation: DHN-K2S achieves the ≥30% viability improvement threshold at 50 μM, with the dose-response curve consistent with a predicted EC₅₀ of ~8.2 μM (as projected by in silico modeling). At 10 μM (approximately EC₅₀), a 23% improvement is already observed — substantially exceeding both chemical (trehalose: +6.8%) and biological (AFP-RD3: +4.1%) positive controls. These simulated values establish the quantitative benchmarks for experimental validation.

5.Troubleshooting

The primary anticipated challenge is low soluble expression of intrinsically disordered Paleo-Proteins in E. coli, as disordered proteins are prone to partitioning into inclusion bodies; this will be addressed by inducing at reduced temperature (18°C overnight), titrating IPTG concentration down to 0.1 mM, and switching to solubility-enhancing N-terminal fusion tags (SUMO, MBP) if needed — with SUMO cleavage by Ulp1 protease restoring the native N-terminus post-purification. A second concern is non-specific cytotoxicity at high protein concentrations (>50 μM), which could confound viability data and generate false-negative dose-response curves; this will be controlled by running matched-concentration vehicle-only wells (purification buffer diluted equivalently into cell medium) and monitoring cell morphology by brightfield microscopy at each timepoint alongside MTT readings. Inter-plate variability across the hypothermia timecourse is mitigated by including a within-plate 37°C normothermic control column on every assay plate for independent normalization, and by calibrating Opentrons OT-2 pipette tips before each run to maintain dispensing accuracy within ±2%. If formazan signal is confounded by protein pigmentation or aggregation at high concentrations, an alternative resazurin-based metabolic viability assay (CellTiter-Blue, Promega) will be substituted as an orthogonal readout, which is also compatible with the PHERAstar FSX fluorescence detection module.


SECTION 6: ADDITIONAL INFORMATION

1. References

  • Yashina, S., Gubin, S., Maksimovich, S., et al. (2012). Regeneration of whole fertile plants from 30,000-y-old fruit tissue buried in Siberian permafrost. Proceedings of the National Academy of Sciences, 109(10), 4008–4013. https://doi.org/10.1073/pnas.1118386109
  • Lin, Z., Akin, H., Rao, R., et al. (2023). Evolutionary-scale prediction of atomic-level protein structure with a language model. Science, 379(6637), 1123–1130. https://doi.org/10.1126/science.ade2574
  • Watson, J.L., Juergens, D., Bennett, N.R., et al. (2023). De novo design of protein structure and function with RFdiffusion. Nature, 620, 1089–1100. https://doi.org/10.1038/s41586-023-06415-8
  • Hsu, C., Verkuil, R., Liu, J., et al. (2022). Learning inverse folding from millions of predicted structures. ICML 2022. https://doi.org/10.1101/2022.04.10.487779
  • Kramina, T.E., Kochkin, I.T., Tatanov, I.V., & Samigullin, T.H. (2021). Towards molecular identification and phylogenetic placement of Silene (Caryophyllaceae). PhytoKeys, 173, 1–26. https://doi.org/10.3897/phytokeys.173.57402
  • Jewett, M.C., & Swartz, J.R. (2004). Mimicking the Escherichia coli cytoplasmic environment activates long-lived and efficient cell-free protein synthesis. Biotechnology and Bioengineering, 86(1), 19–26. https://doi.org/10.1002/bit.20026
  • Tunnacliffe, A., & Wise, M.J. (2007). The continuing conundrum of the LEA proteins. Naturwissenschaften, 94(10), 791–812. https://doi.org/10.1007/s00114-007-0254-y
  • Dure, L., Crouch, M., Harada, J., et al. (1989). Common amino acid sequence domains among the LEA proteins of higher plants. Plant Molecular Biology, 12(5), 475–486. https://doi.org/10.1007/BF00036962
  • Souza Filho, P.J.A., et al. (2016). Dehydrins: structure and functional role in plant stress tolerance. Plant Cell & Environment, 39(9), 1943–1953. https://doi.org/10.1111/pce.12740
  • DeVries, A.L. (1971). Glycoproteins as biological antifreeze agents in Antarctic fishes. Science, 172(3988), 1152–1155. https://doi.org/10.1126/science.172.3988.1152
  • Koag, M.C., & Lee, S. (2003). The binding of maize DHN1 to lipid vesicles: gain of structure and lipid specificity. Plant Cell, 15(5), 1061–1073. https://doi.org/10.1105/tpc.010793
  • Doyle, S.M., & Wickner, S. (2009). Hsp104 and ClpB: protein disaggregating machines. Trends in Biochemical Sciences, 34(1), 40–48. https://doi.org/10.1016/j.tibs.2008.09.010

2. Supply list and budget

DNA Synthesis & Cloning

  • Twist Bioscience Whole Plasmid Synthesis × 8 constructs (pET-28a-His₆ backbone, 5 candidates + 3 controls) — Twist Bioscience
  • E. coli BL21(DE3) High-Efficiency Competent Cells (NEB C2527H) — New England Biolabs
  • LB broth powder and LB agar — Millipore Sigma
  • Kanamycin sulfate (50 mg/mL stock) — Millipore Sigma
  • SecureDNA sequence screening (×8 constructs) — SecureDNA (free academic access)

Protein Expression & Purification

  • IPTG (isopropyl β-D-1-thiogalactopyranoside, 1 g) — Millipore Sigma
  • Ni-NTA Agarose resin (5 mL packed column) — Qiagen
  • Protease inhibitor cocktail tablets (cOmplete, EDTA-free) — Millipore Sigma
  • PD-10 desalting columns (×10) — Cytiva / Millipore Sigma
  • Pierce BCA Protein Assay Kit — Thermo Fisher Scientific

Protein Validation (SDS-PAGE + Western Blot)

  • Mini-PROTEAN TGX 12% precast gels (10-pack) — Bio-Rad
  • Coomassie Brilliant Blue R-250 staining solution — Bio-Rad
  • PVDF transfer membranes — Bio-Rad
  • Anti-His₆-HRP antibody (200 μL) — Millipore Sigma
  • ECL Western blot detection reagent — Thermo Fisher Scientific
  • Precision Plus Protein Dual Color Standards (ladder) — Bio-Rad

Cell Culture

  • HEK293T cells (ATCC CRL-3216) — ATCC
  • DMEM + GlutaMAX media (500 mL × 2) — Thermo Fisher Scientific
  • Fetal Bovine Serum, heat-inactivated (500 mL) — Thermo Fisher Scientific
  • Penicillin-Streptomycin solution (100×) — Thermo Fisher Scientific
  • MycoAlert Mycoplasma Detection Kit — Lonza
  • T-75 cell culture flasks (×10) — Thermo Fisher Scientific

High-Throughput Screening (MTT Assay)

  • MTT reagent (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide, 1 g) — Millipore Sigma
  • 96-well flat-bottom cell culture plates (×20) — Thermo Fisher Scientific
  • DMSO (cell culture grade, 100 mL) — Millipore Sigma
  • Trehalose dihydrate (positive control cryoprotectant) — Millipore Sigma
  • Opentrons OT-2 filtered tips, 96-well racks (×10) — Opentrons

Cell-Free Protein Synthesis (Validation)

  • BL21(DE3) cell-free lysate + CFPS master mix — Ginkgo Bioworks (course infrastructure)
  • Nuclease-free water (500 mL) — Thermo Fisher Scientific

Gene Expression Profiling (qPCR)

  • RNeasy Mini Kit (50 rxns) — Qiagen
  • SuperScript IV First-Strand Synthesis Kit — Thermo Fisher Scientific
  • PowerUp SYBR Green Master Mix (500 rxns) — Thermo Fisher Scientific
  • qPCR primers for CIRBP, RBM3, CASP3, BCL2, GAPDH (custom synthesis) — Millipore Sigma
  • 384-well PCR plates (Eppendorf, ×5) — Thermo Fisher Scientific

Equipment (Course Infrastructure / Ginkgo Bioworks)

  • Opentrons OT-2 liquid handler — Opentrons (course access)
  • Echo525 acoustic liquid handler — Ginkgo Bioworks
  • PHERAstar FSX plate reader (absorbance 570/670 nm) — Ginkgo Bioworks
  • Spark Plate Reader — Ginkgo Bioworks
  • CFX Opus qPCR system — Ginkgo Bioworks
  • HiG Centrifuge — Ginkgo Bioworks
  • Inheco Plate Incubator — Ginkgo Bioworks
  • Cytomat shaking incubator — Ginkgo Bioworks
  • BioshakeD3000 plate shaker — Ginkgo Bioworks
  • Plateloc plate sealer + A4s breathable seals — Ginkgo Bioworks

Itemized Budget

ItemSupplierEst. Unit CostQtyTotal
Whole Plasmid Synthesis — pET-28a-His₆ candidates (×10)Twist Bioscience$149.0010$1,490.00
E. coli BL21(DE3) Competent CellsNEB C2527H$52.001$52.00
Ni-NTA Agarose 5 mLQiagen 30210$198.001$198.00
IPTG (1 g)Millipore Sigma I6758$41.001$41.00
MTT Reagent (1 g)Millipore Sigma M2128$148.001$148.00
96-Well Cell Culture Plates, flat-bottom (×20)Thermo Fisher Scientific$12.0020$240.00
HEK293T CellsATCC CRL-3216$499.001 vial$499.00
Anti-His₆-HRP Antibody (200 μL)Millipore Sigma A7058$89.001$89.00
PVDF Membrane (western blot)Bio-Rad 1620177$75.001 pkg$75.00
Mini-PROTEAN TGX 12% Gels (10-pack)Bio-Rad 4561094$125.001$125.00
RNeasy Mini Kit (50 rxns)Qiagen 74104$199.001$199.00
SuperScript IV First-Strand cDNA KitThermo Fisher 18090010$165.001$165.00
DMEM + GlutaMAX (500 mL)Thermo Fisher 10569010$45.002$90.00
FBS, heat-inactivated (500 mL)Thermo Fisher 10082147$149.001$149.00
Pierce BCA Protein Assay KitThermo Fisher 23225$79.001$79.00
Protease Inhibitor Cocktail TabletsMillipore Sigma 4693116001$89.001$89.00
Opentrons OT-2 Tips (96-well, ×10 racks)Opentrons$8.0010$80.00
SecureDNA Sequence Screening (×10 sequences)SecureDNA$0.0010$0.00
TOTAL$3,808.00

Ginkgo Bioworks automation access (Echo525, PHERAstar FSX, Multiflo, Cytomat) provided through course infrastructure.


DNA Construct — GenBank Format

Primary Construct: pET-28a-His₆-DHN-K2S

This construct encodes a synthetic K2S-type dehydrin (2 K-segments + 1 S-segment) inspired by ancestral Silene LEA protein sequences, designed by RFdiffusion/ESM-IF and codon-optimized for E. coli BL21(DE3). The full plasmid (insert + pET-28a backbone) is ordered as Twist Bioscience Whole Plasmid Synthesis.

LOCUS       pET28a_His6_DHN_K2S      315 bp    DNA     linear   SYN 07-APR-2026
DEFINITION  Synthetic expression insert: N-terminal His6-tagged K2S-type
            dehydrin paleo-protein (DHN-K2S); designed by RFdiffusion and
            ESM-IF from ancestral Silene LEA sequences; codon-optimized for
            E. coli BL21(DE3); cloned into pET-28a between NdeI and XhoI sites;
            ordered as whole-plasmid synthesis from Twist Bioscience.
ACCESSION   .
VERSION     .
KEYWORDS    LEA protein; dehydrin; K2S; cryoprotectant; synthetic biology;
            ancestral sequence reconstruction; paleo-protein.
SOURCE      Synthetic construct
  ORGANISM  Synthetic construct
            other sequences; artificial sequences.
FEATURES             Location/Qualifiers
     CDS             1..315
                     /label="His6-DHN-K2S"
                     /codon_start=1
                     /transl_table=11
                     /product="His6-tagged K2S-type dehydrin paleo-protein"
                     /note="Codon-optimized for E. coli expression (CAI > 0.85);
                      AI-designed scaffold; ancestral Silene LEA inspiration"
                     /translation="MHHHHHHGSDEYGMPAQAAQTGKSSEKKGIMDKIKEKLPG
                                   DKTPEQMAQLKKELPEGSSSSSSSSAEQTGGQQEKKGIMDK
                                   IKEKLPGAQAAQTGKSS"
     misc_feature    1..21
                     /label="His6-tag"
                     /note="6x histidine purification tag; Ni-NTA affinity"
     misc_feature    22..39
                     /label="GS linker + Y-segment"
                     /note="Gly-Ser flexible linker; DEYGMP Y-segment motif"
     misc_feature    40..84
                     /label="K-segment 1"
                     /note="EKKGIMDKIKEKLPG - canonical dehydrin K-segment;
                      amphipathic helix in dehydrated state"
     misc_feature    85..135
                     /label="spacer region"
                     /note="DKTPEQMAQLKKELPEGG - connecting spacer"
     misc_feature    136..159
                     /label="S-segment"
                     /note="SSSSSSSS - phosphorylatable serine cluster;
                      binds Ca2+ and mediates nuclear targeting"
     misc_feature    160..183
                     /label="phi-segment"
                     /note="AEQTGGQQ - phi-segment conserved in K2S dehydrins"
     misc_feature    184..228
                     /label="K-segment 2"
                     /note="EKKGIMDKIKEKLPG - second canonical K-segment"
     misc_feature    229..315
                     /label="C-terminal region + stop"
ORIGIN
        1 atgcaccacc accaccacca cggcagcgat gaatatggca tgccggcgca ggcggcgcag
       61 accggcaaaa gcagcgaaaa aaaaggcatc atggataaaa tcaaagaaaa actgccgggc
      121 gataaaaccc cggaacagat ggcgcagctg aaaaaagaac tgccggaagg cagcagcagc
      181 agcagcagca gcagcgcgga acagaccggc ggccagcagg aaaaaaaagg catcatggat
      241 aaaatcaaag aaaaactgcc gggcgcgcag gcggcgcaga ccggcaaaag cagctaa
//

Twist Bioscience Insert Sequences

Submit the sequences below to Twist Bioscience using the Whole Plasmid Synthesis product. Select pET-28a as backbone. Specify NdeI / XhoI cloning sites. Choose kanamycin resistance.

Construct 1 — His₆-DHN-K2S (Primary Candidate, K2S-type dehydrin)

ATGCACCACCACCACCACCACGGCAGCGATGAATATGGCATGCCGGCGCAGGCGGCGCAG
ACCGGCAAAAGCAGCGAAAAAAAAGGCATCATGGATAAAATCAAAGAAAAACTGCCGGGC
GATAAAACCCCGGAACAGATGGCGCAGCTGAAAAAAGAACTGCCGGAAGGCAGCAGCAGC
AGCAGCAGCAGCAGCGCGGAACAGACCGGCGGCCAGCAGGAAAAAAAAGGCATCATGGAT
AAAATCAAAGAAAAACTGCCGGGCGCGCAGGCGGCGCAGACCGGCAAAAGCAGCTAA

Insert length: 315 bp | Protein MW: ~11.4 kDa | pI: 4.9 | Host: E. coli BL21(DE3)


Construct 2 — His₆-DHN-K1 (Minimal Single K-Segment Control)

ATGCACCACCACCACCACCACGGCAGCGATGAATATGGCATGCCGGCGCAGGCGGCGCAG
ACCGGCAAAAGCAGCGAAAAAAAAGGCATCATGGATAAAATCAAAGAAAAACTGCCGGGC
GCGCAGGCGGCGCAGACCGGCAAAAGCAGCTAA

Insert length: 165 bp | Protein MW: ~6.1 kDa | Used as minimal K-segment structural control


Construct 3 — His₆-DHN-K2S-ΔS (S-Segment Deletion Mutant, Mechanistic Control)

ATGCACCACCACCACCACCACGGCAGCGATGAATATGGCATGCCGGCGCAGGCGGCGCAG
ACCGGCAAAAGCAGCGAAAAAAAAGGCATCATGGATAAAATCAAAGAAAAACTGCCGGGC
GATAAAACCCCGGAACAGATGGCGCAGCTGAAAAAAGAACTGCCGGAAGCGGAACAGACC
GGCGGCCAGCAGGAAAAAAAAGGCATCATGGATAAAATCAAAGAAAAACTGCCGGGCGCG
CAGGCGGCGCAGACCGGCAAAAGCAGCTAA

Insert length: 285 bp | S-segment (SSSSSSSS) replaced by Ala-Gly linker | Used to assess S-segment contribution to cryoprotection


Work in Progress

The current document will be updated via this link; I will continue to edit the document here until the deadline:

(View Full Screen)

TWIST ORDER (FINAL)

(Benchling (from clonal to whole plasmid again and again)))

After long journey from clonal gene to whole plasmid synthesis discussions; finally I decided to order 3 whole plasmid synthesis. Since it is way more expensive to order than clonal gene; I have a plan with three options:**

The Drive folder also contains Benchling exports for both options:

Plan A (preferred): Order all three constructs as Whole Plasmid Synthesis from Twist Bioscience on the pET-28a(+) backbone (NdeI/XhoI cloning sites) — pET28a-His6-DHN-K2S, pET28a-His6-DHN-K1, and pET28a-His6-DHN-K2S-ΔS. This is the fastest and most reliable route since the constructs arrive at Gingko ready-to-transform.

Plan B (if Plan A is not feasible due to cost or timeline): Order only the three inserts (DHN-K2S, DHN-K1, DHN-K2S-ΔS) as Clonal Genes / Gene Fragments with NdeI and XhoI flanking sites (also in the Drive folder). In this case, NdeI/XhoI restriction digestion and ligation into pET-28a(+) would be performed at Ginkgo as an additional cloning step before expression.

Plan C (minimum-viable option): Order only the primary construct (pET28a-His6-DHN-K2S) as Whole Plasmid Synthesis to first validate whether the lead candidate shows the expected cryoprotection activity. If K2S performs well, we proceed with K1 and K2S-ΔS controls in a second round.

Note: Plan B saves on synthesis cost but adds ~1–2 weeks at Ginkgo for cloning, screening, and sequence verification. Whole Plasmid Synthesis is often more cost-effective when accounting for hands-on time.

(ORDER 1 — pET28a-His6-DHN-K2S (Primary Construct))

(ORDER 2 — pET28a-His6-DHN-K1 (Minimal Single K-Segment Control))

(ORDER 3 — pET28a-His6-DHN-K2S-ΔS (S-Segment Deletion Mutant))

A note on my experience: This is my first time doing molecular cloning, and I found the in silico design in Benchling challenging (digestion, sticky-end orientation, frame verification). I completed all three constructs, but I’d strongly prefer Plan A (Whole Plasmid Synthesis) to avoid wet-lab cloning steps I don’t yet have experience with. If cost is a concern, Plan C (only K2S first) is also reasonable.


GINKGO BIOWORKS (NEXT STEP)

🧪 What will Ginkgo do once they receive my plasmids? Once my 3 plasmids from Twist arrive at Ginkgo, I have an automated workflow planned that will run within 3 days:

Day 1 — CFPS reaction: Ginkgo will dilute my plasmids to 50 ng/µL and dispense them into a 96-well plate with nanoliter precision using the Echo525 (acoustic liquid handler). Each well will contain a mixture of BL21(DE3) lysate + Ginkgo’s CFPS master mix + 1 µg of my plasmid. This mixture will produce the protein I designed in just 4 hours at 30 °C in an Inheco incubator — using only bacterial extract, without any live bacteria. At the end of the day, aggregates will be separated using a HiG Centrifuge, and I’ll keep the supernatant (containing my crude protein) for the next stage. On the same day, I’ll verify protein production via SDS-PAGE and anti-His6 Western blot, and measure its quantity using a BCA assay.

Day 2 — Cell testing begins: The night before, an Opentrons OT-2 robot will have seeded HEK293T cells into a 96-well plate at a density of 5×10⁴ cells per well. Ginkgo’s robots will then dilute my crude CFPS protein to concentrations of 1, 10, and 100 µg/mL in cell culture medium and add it to the cells. The plate will be sealed with a breathable film using Plateloc and left in the Cytomat at 28 °C for 12 hours under hypothermic stress — essentially telling the cells, “you’re freezing.”

Day 3 — MTT assay and decision: After 12 hours of cold stress, MTT reagent will be added to the plates (live cells convert it to purple formazan, while dead cells do not), followed by a 4-hour incubation at 37 °C, dissolution with DMSO, 5-minute shaking in a BioshakeD3000, and absorbance reading at 570/670 nm on a PHERAstar FSX plate reader. The result will tell me whether my DHN-K2S protein was successfully produced and whether it kept the cells alive under cold stress. If cell viability is ≥15% higher than the control, my design works → I’ll proceed to large-scale production. Otherwise, I’ll revise the design and iterate.

Interactive project mentor by Derek

I also had chance to try Derek’s Interactive project mentor which is developedwith Sonnet (Claude). You can see proposal file developed by discussion we had with Final Project Interview with AI mentor: Proposal Export

If you have any questions please contact me via beyzabatir@gmail.com.