Hana Sylvester — HTGAA Spring 2026

About me

Hi, I’m Hana. I’m from London and I trained in social anthropology. I have always regretted forgoing the natural sciences for the social sciences, although I am very grateful for my humanities background. With HTGAA I hope to pursue my love of biology and biological engineering further. I am also keen to explore the intersection between biology and design and get some hands-on experience in the lab!

Contact info

Hana Sylvester

hanayusy@gmail.com

Homework

Labs

Projects

Subsections of Hana Sylvester — HTGAA Spring 2026

Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices

  • Week 2 HW: DNA Reading

  • Week 3 HW: Lab automation

    🤖 Part I: Python Script for Opentrons Artwork I wanted to do a yin and yang symbol so I went about designing it on the very helpful website made by Ronan! https://opentrons-art.rcdonovan.com This is my design

  • Week 4 HW: Protein Design Part I

    💪🏼 Part A. Conceptual Questions 1. How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons) On average 100g of meat 🥩 contains 26g of protein = 26% protein 100:26 ∴ 500:130g 1 amino acid = 100 daltons ≈ 1.66 × 10⁻²²g

  • Week 5 HW: Protein Design Part II

    Part 1: Generate Binders with PepMLM 🐧 Human SOD1 Sequence from UniProt (154 amino acids): https://www.uniprot.org/uniprotkb/P00441/entry https://www.uniprot.org/uniprotkb/P00441/entry#sequences Human SOD1 sequence: MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ Human SOD1 sequence with the A4V mutation, which causes ALS: MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ I had to manually reset the code as I was not able to change the parameters with the sliding scales. I was stuck with generating a single binder, 15 amino acids long, at a time. Thankfully I was able to change this by editing the back end of the form and forcing the code to make 4 binders that were each 12 amino acids long.

  • Week 6 HW: Genetic Circuits Part I

    Answer these questions about the protocol in this week’s lab 🥼🧪🔬🧫 : 1. What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose? The Phusion High-Fidelity PCR Master Mix contains Phusion DNA Polymerase → a high-fidelity, thermostable enzyme for fast, robust, and accurate DNA amplification in PCR; it is used particularly for cloning and sequencing

  • Week 7 HW: Genetic Circuits Part II

    Summary: "Cancer will ultimately be a chronic disease." - Rob Weiss. I have to say this was one of my favourite lectures. Assignment Part 1: Intracellular Artificial Neural Networks (IANNs) 📟 1. What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions? Boolean functions are binary, representing either on or off, whereas the logic of IANNs is more nuanced and geared towards more brain-like information processing. They can detect not merely whether a signal is present or not, but how strong it is. This is important because, in short, biology is messy!

  • Week 9 HW: Cell Free Systems

    General Homework Questions 1. Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production. Cell-free protein synthesis leverages biology as an engineering tool (Kate). Living cells require a lot of resources such as the correct amounts of

  • Week 10 HW: Imaging and Measurement

    Waters Part I — Molecular Weight 💦 We will analyze an eGFP standard on a Waters Xevo G3 QTof MS system to determine the molecular weight of intact eGFP and observe its charge state distribution in the native and denatured (unfolded) states. The conditions for LC-MS analysis of intact protein cause it to unfold and be detected in its denatured form (due to the solvents and pH used for analysis).

  • Week 11 HW: Bioproduction and Cloudlabs

  • Week 12 HW: Building Genomes

  • Week 13 HW: Biodesign & Engineered Living Materials

  • Week 14 HW: Biodesign & Biofabrication

Subsections of Homework

Week 1 HW: Principles and Practices

🧬 1. First, describe a biological engineering application or tool you want to develop and why.

Here are some ideas that I brainstormed:

🩹 Regenerative Medicine

  • Autologous skin grafts to replace necrotic, cancerous, scarred, or miscoloured tissue etc
  • Synthetic skin models for surgeons or tattoo artists to practice on
  • Self-regenerating organs (for people with damaged organs, cancer, or in need of transplants, e.g. using bioink)
  • Cure for frostbite → regenerating cell tissue or even whole digits or limbs!
  • Growing teeth in a lab to replace damaged or missing teeth!
  • Regeneration of spinal column

🧪 Cures

  • Targeting misfolded proteins (prions) in the brain
  • Making malarial mosquitos infertile (* ethical considerations)
  • Synthesising biofuel
  • AIDS cure
  • Cancer detection mechanism to cause cellular death/induce apoptosis in abnormal cells
  • Cure for sickle cell
  • Cure for cataracts by growing autologous lenses
  • Cure for blindness or eyesight degeneration
  • Making VADs → (HKU hydrogel transistors, electrical interaction with living cells)
  • Helping diabetics to endogenously produce effective insulin

🛡️ Prophylaxis

  • Predicting/modelling viruses and vaccines (* ethics/dangers must be taken into consideration)

🏭 Biomaterial/Industrial applications

  • Manufacturing leather without the need for mass animal slaughter
  • Bioplastic on industrial scale (mitigation of microplastic crisis)
  • Industrially producing natural inks
  • Bioremediation:
    • Landfills
    • Radioactive sites
    • Sewage systems

🎭 Fun ideas

  • Tabulating or transcribing music on plants (patterns on leaves → gene expression)
  • Bioluminescent lamps for:
    • industrial applications
    • architectural applications (can be used practically in hospitals, clubs as exit signs etc)
    • design/decor!
    • bioluminescent paint
    • hair dye/cosmetics
  • Printing the taste of a meal! For an art/design installation, involving coding a Hana programme/Hana AI to detect what is in an image and “print out” certain proteins(?) that correspond to flavours etc
    • edible
  • Print out QR codes on biomaterial to be scanned and play music for art/design installations
    • or if not QR codes then fun characters or shapes (presets) that can be scanned by Hana programme/Hana AI which then play certain songs?
    • stills from films even?
    • covers of albums?
  • Mushroom testing kit which prints out the species name on the flesh of the mushroom or on a testing strip once it is pricked or a sample is collected (kind of like a pregnancy test but not, more like DNA sequencing mushrooms to determine what species they are)
  • Sustainable nail polish
  • Generating music from colony of bacteria for art/design/science installation

. . . In the end I decided that I would like to pursue further the idea of printing an image as well as the taste of a food/meal using edible media. I would like to call it “Tastemaker” or “Yum Dot Com.” This project highlights the sensory potential of biological engineering, whilst sitting at the intersection of biology, technology, design, the culinary, and the quotidian human experience.

⚖️ 2. Next, describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future.

🔐 Data Protection and Consent

  • This application will be handling datasets in the form of photos submitted by users, so it is important that users know what their images are being used for, where and how they will be stored, and who will have access to them → Images submitted by users can constitute personal data under GDPR and related frameworks
  • Explicit consent must be given by participants to enable the use of their photographs/images
  • Participants may withdraw their consent at any moment

🔎 Promoting transparency

  • Ensure consumers understand what biological materials are being used, how the food is produced, and what they are ingesting
  • Clearly distinguish between artistic/experimental food products and nutritionally complete or medically relevant foods
  • Prevent deceptive use in advertising, novelty food products, or commercial settings

🧯 Preventing Malfeasance

  • Ensure that edible biological media used to print images and taste profiles is safe to consume, ethically produced, and not misleading or harmful, while still allowing creative and experimental uses:
  • Ensure all edible media and biological components are non-toxic, allergen-aware, and safe for repeated consumption
  • Prevent accidental or intentional contamination during production, printing, or distribution
  • Avoid misleading representations of nutritional value or ingredients (e.g. printing “healthy” foods that are not)
  • Avoid applications that could be used to manipulate consumers, e.g. by falsely replicating branded foods, culturally significant dishes, or nutritionally complete meals
  • Ensure the technology is not repurposed for coercive or exploitative contexts, such as pressuring individuals to consume unfamiliar or unsafe substances, e.g. spiking or biowarfare

🧩 Non-maleficence

  • Prevent misuse of biosensing infrastructure for coercive surveillance or targeting specific communities
  • Make sure that biomaterial is sourced as ethically as possible

🧭 3. Next, describe at least three different potential governance “actions”

📝 Mandatory Explicit Consent & Transparency Requirements

Purpose:

  • To ensure users submit images voluntarily, so that they have the most fun and interesting experience
  • To prevent misuse of images

Design:

  • Actors include app developers, users, and potential patrons/donors/commissioners, e.g. contemporary art galleries and museums
  • Mandatory consent form
  • Consent must be opt-in, and users can revoke it at any time, with data deletion confirmed
  • Creation of a dashboard to manage or delete submissions

Assumptions:

  • Users will read and understand consent forms
  • App will faithfully implement deletion and management protocols
  • Regulatory bodies will enforce compliance

Risks of Failure & “Success”:

  • Failure: Users ignore consent warnings, app mishandles data, or withdrawal requests are delayed/inefficiently processed
  • Success: Overly strict consent enforcement may slow down research or creative experimentation

🧰 Technical data safeguards

Purpose:

  • Reduce risk of data breaches

Design:

  • Actors include app developers and IT/security teams
  • Store images on encrypted servers
  • Remove metadata or personal identifiers when possible (data minimisation)

Assumptions:

  • Sufficient encryption/safety measures in place
  • Users will take/upload appropriate photographs of food items
  • Users will read and understand how their data is handled and trust the system to protect it

Risks of Failure & “Success”:

  • Failure: Hacking or accidental leaks; users abuse terms of upload
  • Success: Minimal data retention may limit research/creativity

🥽 Safety standards for edible media

Purpose:

  • Prevent malfeasance/negligence in food production (contamination, allergens, deceptive use) while enabling creative applications

Design:

  • Actors include food safety regulators (e.g., FDA, EFSA) and companies producing edible media
  • Develop guidelines and certification for safe edible inks and substrates, including allergen disclosure
  • Require periodic audits/inspections

Assumptions:

  • Regulatory frameworks can be adapted to new bioengineered food technologies

Risks of Failure & “Success”:

  • Failure: Lack of legislation or enforcement can lead to dangerous practices
  • Success: Excessive regulation could slow innovation or increase costs

📊 4. Scoring Policy Goals

Option 1: Mandatory consent & transparency
Option 2: Technical data safeguards
Option 3: Regulatory oversight for edible media

| Does the option: | Option 1 | Option 2 | Option 3 |
|---|---|---|---|
| Enhance Biosecurity | | | |
| • By preventing incidents | N/A | 1 | 1 |
| • By helping respond | N/A | 1 | 1 |
| Foster Lab Safety | | | |
| • By preventing incidents | N/A | 1 | 1 |
| • By helping respond | N/A | 1 | 1 |
| Protect the environment | | | |
| • By preventing incidents | 1 | N/A | 1 |
| • By helping respond | 1 | N/A | 1 |
| Other considerations | | | |
| • Minimizing costs and burdens to stakeholders | 1 | 1 | 2 |
| • Feasibility? | 1 | 1 | 1 |
| • Not impede research | 2 | 2 | 3 |
| • Promote constructive applications | 1 | 1 | 1 |

⚠️ 5. Ethical concerns highlighted above

📚 Homework Questions from Professor Jacobson

Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?

  • Natural (polymerase) error rate: 1:10⁶, or 1 in a million
  • Human (chemical synthesis) error rate: 1:10², or 1 in a hundred
  • Biological mechanisms are therefore ×10⁴, or 10,000 times, more accurate than human ones

The length of the human genome is roughly 3.2 billion base pairs, so even at one error per million bases, copying the whole genome would introduce around 3,200 errors each time.

Biology deals with this discrepancy through built-in error correction mechanisms. One example is 3’-5’ exonuclease activity, an intrinsic “proofreading” function of the polymerase. Exonucleases are specialist enzymes which catalyse the removal of incorrectly incorporated nucleotides by breaking phosphodiester bonds via hydrolysis.

Furthermore, the genetic code is particularly robust due to its degeneracy, meaning that multiple codons can code for the same amino acid, e.g. all of the codons for alanine start with GC, and four of the six codons for leucine start with CU. In this way, the effect of mutations can be mitigated. Moreover, in the case of a conservative substitution (or replacement), a type of missense mutation (whereby a nucleotide change results in a different amino acid being incorporated into a protein), the new amino acid has similar physicochemical properties to the original one and therefore usually has little effect on the protein’s tertiary (3D) structure. Physicochemical properties in this instance refer mostly to hydrophobicity and size.
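
To make the comparison between the error rate and the genome length concrete, here is a minimal sketch of the arithmetic in Python. It only uses the figures quoted in this answer; the function and variable names are my own and purely illustrative.

```python
# Figures quoted in the answer above (approximate values).
POLYMERASE_ERROR_RATE = 1e-6   # about 1 error per million bases copied
SYNTHESIS_ERROR_RATE = 1e-2    # about 1 error per hundred bases
GENOME_LENGTH_BP = 3.2e9       # approximate length of the human genome


def expected_errors(error_rate: float, length_bp: float) -> float:
    """Expected number of miscopied bases in one pass over `length_bp` bases."""
    return error_rate * length_bp


errors_per_copy = expected_errors(POLYMERASE_ERROR_RATE, GENOME_LENGTH_BP)
fold_improvement = SYNTHESIS_ERROR_RATE / POLYMERASE_ERROR_RATE

print(f"Expected errors per genome copy: {errors_per_copy:,.0f}")
print(f"Polymerase makes about {fold_improvement:,.0f}x fewer errors per base")
```

Thousands of errors per copy would be far too many, which is why the extra proofreading and repair layers matter.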

How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?

Theoretically a lot. 😹 This is due to the degenerate nature of the genetic code: most amino acids can be encoded by several codons, so the number of possible DNA sequences multiplies with every residue. Practically speaking, not all of these different codes work to code for the protein of interest, because:

  • Cells engaged in translation prefer certain codons
  • Rare codons correspond to rare tRNA molecules
  • The number of actually functional proteins is smaller than the theoretical possibilities
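
As a rough illustration of how big "a lot" is, the sketch below counts the DNA sequences that could encode a given peptide by multiplying together the number of codons available for each amino acid in the standard genetic code. The function name and the example peptide are my own choices for illustration.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}
STOP_CODON_COUNT = 3  # TAA, TAG, TGA


def count_encodings(protein: str, include_stop: bool = False) -> int:
    """Count the distinct DNA sequences that translate to `protein`."""
    count = prod(CODONS_PER_AMINO_ACID[aa] for aa in protein.upper())
    return count * STOP_CODON_COUNT if include_stop else count


# A short illustrative peptide: M(1) x S(6) x K(2) x G(4) x E(2) x E(2) x L(6)
print(count_encodings("MSKGEEL"))  # 1152
```

Even seven residues give over a thousand possibilities, so for a protein of a few hundred residues the number is astronomically large.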

🧪 Homework Questions from Dr. LeProust

What’s the most commonly used method for oligo synthesis currently?

Currently, the most commonly used method for oligo synthesis is solid-phase phosphoramidite chemistry, developed by Marvin Caruthers in the early 1980s. It superseded the earlier phosphodiester method pioneered by Har Gobind Khorana in the 1950s.

Why is it difficult to make oligos longer than 200nt via direct synthesis?

It is difficult to make oligos longer than 200nt through direct synthesis because the yield of full-length product drops exponentially as length increases. Every added nucleotide requires another round of chemical reactions, and each round carries a small chance of failure or error, so these accumulate.

Why can’t you make a 2000bp gene via direct oligo synthesis?

The same error accumulation applies, only more severely: after roughly 2,000 coupling steps essentially no full-length, error-free product remains.
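
The exponential drop can be sketched with a simple model in which every coupling step succeeds with the same probability. The 99% coupling efficiency used here is an assumed, illustrative figure rather than a value from the lecture, and the function name is hypothetical.

```python
def full_length_yield(length_nt: int, coupling_efficiency: float = 0.99) -> float:
    """Fraction of strands reaching full length after (length_nt - 1) couplings.

    `coupling_efficiency` is an assumed per-step success rate (illustrative only).
    """
    if length_nt < 1:
        raise ValueError("length_nt must be at least 1")
    return coupling_efficiency ** (length_nt - 1)


for n in (20, 200, 2000):
    print(f"{n:>5} nt: {full_length_yield(n):.2e} of strands are full length")
```

Under this assumption only about 13% of strands reach 200 nt, and the fraction at 2,000 nt is vanishingly small.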

🧬 Homework Question from Dr. George Church

Choose ONE of the following three questions to answer; and please cite AI prompts or paper citations used, if any.

What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?

The ten essential amino acids in all animals are

  • Arginine
  • Histidine
  • Isoleucine
  • Leucine
  • Lysine
  • Methionine
  • Phenylalanine
  • Threonine
  • Tryptophan
  • Valine

The Lysine Contingency is codswallop because even if dinosaurs were not able to naturally produce endogenous lysine, they could get it from food sources! Humans can’t even produce it and must get it from their diet.

Week 2 HW: DNA Reading

🧬 3.1 Choose your protein

I was thinking of choosing between two proteins:

  1. Titin (also known as connectin), the largest known protein, which is encoded by the TTN gene. In humans it accounts for around 0.5kg of body weight! Titin is important in muscle cells, acting as a molecular spring. It is the third most abundant protein in muscles, giving them their elasticity, structural integrity, and stability.
  2. Green fluorescent protein or GFP, which is found in the crystal jelly or hydromedusa (Aequorea victoria), as well as in various species of coral, sea anemones, and crustaceans. GFP is often used as a reporter gene in cell and molecular biology.

Scientists have created many organisms which can express GFP, which serves as a proof of concept that a gene can be expressed by a given organism. The GFP gene has been introduced into and expressed in many species, maintained in their genomes, and even passed on to their offspring; such organisms include bacteria, yeast, fungi, fish, and mammalian cells, including those of humans.

Interestingly, Roger Y. Tsien, Osamu Shimomura, and Martin Chalfie were awarded the 2008 Nobel Prize in Chemistry for their discovery and development of GFP.

I decided to choose GFP as it is extremely widely used and familiar to scientists thanks to its in vivo and in vitro applications, but also because of my interest in creating a bioluminescent biosensor for radiation.

I obtained its protein sequence from NCBI:

GFP Protein Sequence
MSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL
VTTFSYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV
NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD
HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT HGMDELYK
  
🧬 3.2 Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence

I also obtained the DNA (nucleotide) sequence from NCBI:

GFP DNA Sequence
TACACACGAATAAAAGATAACAAAGATGAGTAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTT
GTTGAATTAGATGGCGATGTTAATGGGCAAAAATTCTCTGTCAGTGGAGAGGGTGAAGGTGATGCAACAT
ACGGAAAACTTACCCTTAAATTTATTTGCACTACTGGGAAGCTACCTGTTCCATGGCCAACACTTGTCAC
TACTTTCTCTTATGGTGTTCAATGCTTTTCAAGATACCCAGATCATATGAAACAGCATGACTTTTTCAAG
AGTGCCATGCCCGAAGGTTATGTACAGGAAAGAACTATATTTTACAAAGATGACGGGAACTACAAGACAC
GTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATAGAATCGAGTTAAAAGGTATTGATTTTAAAGA
AGATGGAAACATTCTTGGACACAAAATGGAATACAACTATAACTCACATAATGTATACATCATGGCAGAC
AAACCAAAGAATGGAATCAAAGTTAACTTCAAAATTAGACACAACATTAAAGATGGAAGCGTTCAATTAG
CAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTC
CACACAATCTGCCCTTTCCAAAGATCCCAACGAAAAGAGAGATCACATGATCCTTCTTGAGTTTGTAACA
GCTGCTGGGATTACACATGGCATGGATGAACTATACAAATAAATGTCCAGACTTCCAATTGACACTAAAG
TGTCCGAACAATTACTAAATTCTCAGGGTTCCTGGTTAAATTCAGGCTGAGACTTTATTTATATATTTAT
AGATTCATTAAAATTTTATGAATAATTTATTGATGTTATTAATAGGGGCTATTTTCTTATTAAATAGGCT
ACTGGAGTGTAT
  

. . . I then looked for the coding sequence (CDS), the stretch that runs from the start codon (ATG) to the stop codon (TAA):

GFP DNA Sequence (CDS)
ATGAGTAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGCGATGTTAATGGGCAAAAATTCTCTGTCAGTGGAGAGGGTGAAGGTGATGCAACATACGGAAAACTTACCCTTAAATTTATTTGCACTACTGGGAAGCTACCTGTTCCATGGCCAACACTTGTCACTACTTTCTCTTATGGTGTTCAATGCTTTTCAAGATACCCAGATCATATGAAACAGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGGAAAGAACTATATTTTACAAAGATGACGGGAACTACAAGACACGTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATAGAATCGAGTTAAAAGGTATTGATTTTAAAGAAGATGGAAACATTCTTGGACACAAAATGGAATACAACTATAACTCACATAATGTATACATCATGGCAGACAAACCAAAGAATGGAATCAAAGTTAACTTCAAAATTAGACACAACATTAAAGATGGAAGCGTTCAATTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCCACACAATCTGCCCTTTCCAAAGATCCCAACGAAAAGAGAGATCACATGATCCTTCTTGAGTTTGTAACAGCTGCTGGGATTACACATGGCATGGATGAACTATACAAATAA
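
Finding a CDS like this by hand means scanning for a start codon and reading in threes until a stop codon appears. The sketch below does the same thing in Python. It is a simplified illustration (forward strand only, first open reading frame found), not the method NCBI uses to annotate a record, and the function name is my own.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_first_orf(sequence: str) -> Optional[str]:
    """Return the first ATG-to-stop open reading frame on the forward strand.

    Whitespace and line breaks are ignored. Returns None if no ATG is
    followed by an in-frame stop codon.
    """
    seq = "".join(sequence.split()).upper()
    start = seq.find("ATG")
    while start != -1:
        # Read codon by codon from this ATG until a stop codon is reached.
        for i in range(start, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return seq[start:i + 3]
        start = seq.find("ATG", start + 1)
    return None


# Tiny made-up example: two flanking bases, then ATG AAA TAA.
print(find_first_orf("CCATGAAATAAGG"))  # ATGAAATAA
```

For real sequences it is worth checking the result against the annotated CDS in the database record, since a short open reading frame can appear before the true one.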
  
🧬 3.3 Codon optimisation

The genetic code is degenerate, meaning that multiple codons can code for the same amino acid. However, certain codons are preferred over others in different organisms. It is for that reason that we must optimise codon usage depending on the host organism (e.g. when expressing human or jellyfish genes in E. coli).

Replacing rare codons with more frequently used ones can increase translation speed, improve mRNA stability, and help prevent premature translation termination.

I have opted to use E. coli as the host organism for which the codon sequence is optimised. This is because E. coli is widely used thanks to its fast reproduction, low cost, and well-characterised genetics. For these reasons E. coli is an ideal host for producing proteins, including those from eukaryotes such as the jellyfish from which GFP is isolated.

I used the IDT codon optimisation tool to optimise my codon sequence:

E. coli Optimised GFP DNA Sequence
ATG TCC AAA GGT GAG GAA CTG TTT ACA GGT GTT GTG CCT ATC CTG GTT GAA CTG GAT GGG GAC GTT AAC GGG CAG AAG TTT AGT GTA TCA GGC GAG GGG GAG GGG GAT GCT ACA TAT GGC AAA CTT ACT TTG AAA TTC ATC TGT ACT ACT GGC AAG TTA CCG GTT CCC TGG CCA ACA TTG GTC ACA ACG TTT TCT TAT GGG GTC CAG TGT TTT TCC AGA TAC CCG GAC CAC ATG AAG CAG CAC GAC TTT TTC AAG TCT GCT ATG CCC GAA GGT TAT GTG CAA GAA CGT ACT ATT TTC TAT AAG GAC GAC GGC AAT TAC AAA ACC AGA GCC GAA GTA AAA TTC GAA GGG GAT ACC TTG GTT AAT CGT ATC GAG CTG AAA GGT ATC GAT TTT AAA GAG GAC GGT AAT ATT TTG GGC CAC AAA ATG GAG TAT AAT TAT AAT TCC CAC AAT GTA TAT ATC ATG GCG GAC AAA CCC AAA AAT GGC ATC AAA GTC AAT TTT AAA ATA CGC CAT AAC ATT AAA GAC GGC TCG GTG CAG CTT GCG GAT CAT TAT CAG CAG AAT ACC CCC ATA GGC GAC GGT CCT GTA CTG CTG CCT GAC AAT CAC TAT CTT TCA ACA CAA TCA GCC CTG TCG AAA GAC CCG AAT GAA AAA CGC GAC CAT ATG ATA CTG CTT GAA TTC GTA ACT GCT GCC GGA ATA ACA CAC GGT ATG GAC GAG TTA TAC AAA TAA
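
A useful sanity check after codon optimisation is to confirm that the new DNA still encodes the same protein. Here is a minimal sketch that translates two sequences with the standard genetic code and compares the results. The helper names are my own, and the example only uses the first seven codons of the original and optimised sequences above.

```python
BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acids ('*' = stop)."""
    seq = "".join(dna.split()).upper()  # ignore spaces and line breaks
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def encodes_same_protein(dna_a: str, dna_b: str) -> bool:
    """True if both DNA sequences translate to the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)


original_start = "ATGAGTAAAGGAGAAGAACTT"          # first 7 codons of the CDS
optimised_start = "ATG TCC AAA GGT GAG GAA CTG"   # first 7 optimised codons

print(translate(original_start))                              # MSKGEEL
print(encodes_same_protein(original_start, optimised_start))  # True
```

The same check can be run on the full sequences by pasting them in place of the two short strings.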
  
🧬 3.4 You have a sequence! Now what?

Cell-dependent method (using E. coli)

In order to produce this protein from our DNA we can use E. coli as host cells. This is the most common method for producing high protein yields.

1. Transcription

We would clone the optimised DNA sequence into a plasmid vector with a strong promoter. Inside the E. coli cell, RNA polymerase would bind to the promoter and transcribe the DNA sequence into mRNA.

2. Translation

Ribosomes within the E. coli cell bind to the mRNA and commence translation by reading the codon sequence and assembling the corresponding amino acids. The optimised codon sequence reduces the risk of translation stalling, and the finished chain of amino acids then folds into the functional GFP protein.

3. Induction

If the promoter is inducible, expression is switched on by adding an inducer; IPTG is a common reagent used to induce protein expression in bacteria. The E. coli would then produce GFP.

Cell-independent method

This method does not require living cells; instead it uses cell extracts that contain the necessary machinery.

1. Mixing

You combine your DNA sequence, sources of energy (ATP, GTP), amino acids, and translation machinery (ribosomes, tRNA, enzymes) derived from bacterial lysates in a test tube.

2. Transcription and Translation

The processes of transcription and translation occur simultaneously in this open system, producing the GFP protein without limitations such as the need to keep cells viable.

An advantage of this method is that you can produce proteins that are toxic to living bacterial (e.g. E. coli) cells.

🧬 4.1-2 Create a Twist account and a Benchling account to Build Your DNA Insert Sequence

Here is the link for my final sequence:

https://benchling.com/s/seq-P12tVvT9nqlExDjD5hkS?m=slm-DFTTw8aWcdihybyBVM8P

🧬 4.3-6 Twist, Vector choice, Sequence download

🧬 5.1 (i) What DNA would you want to sequence and why? (ii) What technology or technologies would you use to perform sequencing on your DNA and why?

(i) I would like to sequence the DNA of sunflowers, Conan the Bacterium, bowhead whales, and yeast.

  1. Sunflowers have been used for phytoremediation (using plants to remove contaminants, including radioactive ones, from soil), for example in the aftermath of the disasters in Chernobyl (1986) and Fukushima (2011). This is because sunflowers are able to act as radiation sponges, which classes them as “hyperaccumulators.” They are able to soak up Pb (lead) and Cd (cadmium), for example, and store them in their leaves and shoots (Alaboudi et al. 2018; Al-Jabori et al. 2019).

They also absorb radioactive isotopes because these isotopes resemble essential nutrients, a form of chemical mimicry. Specifically, Cesium-137 acts like potassium, and Strontium-90 acts like calcium. Their extensive root systems mean that they are able to soak up contaminants from large volumes of soil. Moreover, their capacity to grow quickly and produce large amounts of biomass means that they are a high-capacity storage system for contaminants, thereby limiting damage to the surrounding ecosystem. It should be noted that sunflowers do not neutralise radiation but make it easier to clean up.

I would like to sequence the DNA of sunflowers to better understand how they don’t die under radiation stress: what about their DNA makes them so resilient? I am interested in creating a radiation biosensor. Sequencing the DNA of sunflowers could be useful for human, animal, and plant health, environmental monitoring, AND potentially data storage.

  2. Conan the Bacterium, otherwise known as Deinococcus radiodurans, is one of the most radiation-resistant organisms known. Not only can it withstand radiation, but also dehydration, vacuum, and highly acidic conditions, which qualifies it as a polyextremophile. It can withstand up to 5 kGy (kiloGrays) of gamma radiation and 1,000 J/m² of UV radiation without significant loss of viability. It has even been recorded as having survived on the exterior of the ISS for 3 years.

Deinococcus radiodurans is able to do this by employing specialised DNA repair mechanisms and shielding its proteome against oxidation, with proteins such as PprI and DdrC involved. It is able to repair hundreds of DNA double-strand breaks (DSBs) caused by ionising radiation or desiccation within hours. It achieves this through a two-phase process: rapid Extended Synthesis-Dependent Strand Annealing (ESDSA) and then RecA-dependent homologous recombination.

I would sequence the DNA of this organism to understand what makes it so hardy. It could have applications for human, animal, and plant health in the context of conditions that induce genotoxic stress. It could also be used for the creation of radiation biosensors/useful biomaterials. I care less about tech bro space applications.

  3. Bowhead whales have incredible resistance to cancer. This is despite the fact that they have many times more cells (and so a greater likelihood of mutation) than smaller animals such as humans and mice, a phenomenon known as Peto’s Paradox.

So what is the reason for their remarkable cancer resistance? Whilst some large mammals such as elephants have extra TP53 tumour suppressor genes, bowhead whales have highly efficient DNA repair systems. They have high levels of CIRBP (cold-inducible RNA-binding protein), a stress response protein which minimises genotoxic stress. The production of this protein is induced by such factors as low temperatures, UV light, and hypoxia (low oxygen levels). It is located in the nucleus of a cell, where it regulates alternative splicing, and in the cytoplasm, where it stabilises mRNA. It thus influences cell survival, circadian rhythms, and inflammation (Zhong and Huang 2017).

Simply put, this protein aids in the repair of double-strand DNA breaks, thus minimising tumour cell growth and proliferation, and maintaining genomic stability.

Bowhead whale CIRBP (bwCIRBP) is extremely stable while human CIRBP (hCIRBP) is relatively unstable. The whale version has five unique amino acids. I want to sequence bowhead whale DNA for the purposes of human, animal, and plant health, particularly for oncological research and therapy.

  1. Yeast have incredible prion suppression abilities. Saccharomyces cerevisiae - yes, the yeast used for baking and wine-making. Whilst mammals, humans included, are unable to “cure” prions (misfolded proteins), yeast have developed advanced mechanisms for eliminating them.

For instance, they possess the protein Hsp104, which works alongside the proteins Hsp70 and Hsp40. Hsp104 breaks apart amyloid filaments in cells by resolubilising aggregated proteins. Hsp70 and Hsp40 work together in what is known as a “molecular chaperone system”: Hsp40 delivers misfolded proteins to Hsp70, which remodels them. Humans do not have a homolog of Hsp104, though we do have the Hsp70/Hsp40/Hsp110 system. Unfortunately our system is not as effective at disaggregating existing clumps of amyloid, hence the prevalence of prion diseases.

The gene for Hsp104 in Saccharomyces cerevisiae is the HSP104 gene (also known as YLL026W). I would sequence the DNA of yeast to better understand the mechanisms of prion suppression to help combat neurodegenerative disorders.

(ii) I would use a variety of DNA sequencing technologies for the DNA of different organisms that I wish to sequence.

For sunflowers

  1. Technology and Generation

    • I would like to use a combination of ONT (Oxford Nanopore Technologies) long-read sequencing + Illumina short-read sequencing
    • The goal is to identify novel repair genes, structural variants and “hardy” regulatory genes ∴ a hybrid sequencing approach avoids gaps in the data
    • Long reads with ONT are essential for the sunflower genome, which is notoriously large (approx. 3.6 Gb, i.e. 3.6 billion base pairs) and filled with repetitive “junk” DNA that hides the hyperaccumulator genes; roughly 78–80% of the sunflower genome is made up of repetitive sequences
    • Short-read technologies such as Illumina have higher per-base accuracy and can be used to “polish” long-read data
    • ONT: third generation, Illumina: second generation
  2. Input

    • Input: High Molecular Weight (HMW) Genomic DNA extracted from sunflower leaf or root tissue
    • Extraction: Using a specialized method (like CTAB or magnetic beads) to get ultra-pure, long DNA strands
    • End-Repair: We use enzymes to “blunt” the ends of the DNA strands so they aren’t ragged
    • Adapter Ligation: We attach specialized sequencing adapters to both ends of the DNA to help unzip 👖 the double helix
  3. Steps

  • Translocation: the motor protein zips the DNA strand through a protein nanopore embedded in an electrically resistant membrane.
  • Ionic Current Blockage: a constant flow of ions passes through the pore. As the DNA bases (A, T, C, G) pass through, they physically block the pore
  • Base Calling: each of the four bases has a different shape and size. Therefore, each base (or more accurately, a “k-mer” or group of bases) causes a unique disruption in the electrical current
  • Decoding electric signals with AI
  4. Output
  • The output of this process is a FASTQ file
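To make the FASTQ output a bit more concrete, here is a minimal sketch of reading one in Python. Each FASTQ record takes four lines: an identifier starting with `@`, the bases, a `+` separator, and a quality string. The sketch assumes Phred+33 quality encoding, and both the `parse_fastq` helper and the two example reads are made up for illustration.

```python
from io import StringIO

def parse_fastq(handle):
    """Yield (read_id, sequence, phred_scores) for each 4-line FASTQ record."""
    while True:
        header = handle.readline().strip()
        if not header:
            break
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().strip()
        # Phred+33: quality score = ASCII code - 33 (assumed encoding)
        scores = [ord(ch) - 33 for ch in quality]
        yield header[1:], sequence, scores

# Made-up example data, not real sunflower reads
example = StringIO("@read1\nGATTACA\n+\nIIIIIII\n@read2\nACGT\n+\n!!5I\n")
records = list(parse_fastq(example))
for read_id, sequence, scores in records:
    mean_q = sum(scores) / len(scores)
    print(read_id, len(sequence), round(mean_q, 1))
```

A real pipeline would hand the file to an assembler rather than parse it by hand; this only shows what the records look like.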

For Conan the Bacterium

  1. Technology and Generation

    • I would use Illumina once again, as this short-read technology is accurate and sufficient for sequencing the genome of Conan, which is only approx. 3.2 Mb (3.2 million base pairs)
    • Extreme accuracy is required to identify the specific mutations in the PprI and DdrC repair genes
    • The Goal: High-fidelity “polishing” to ensure every single base of those rapid-repair enzymes is mapped correctly for biosensor design
  2. Input

  • Input: Genomic DNA (gDNA) extracted from a culture of D. radiodurans
  • Fragmentation (cut DNA into consistent chunks)
  • A-Tailing (add a single “A” overhang so fragments don’t stick to each other and the adapters can attach)
  • Adapter Ligation (We attach Y-shaped DNA adapters for sample identification)
  • PCR Amplification
  3. Steps
  • Illumina uses fluorescent light
  • Cluster Generation: grow colonies of identical strands
  • Reversible Terminator Dance: machine floods the flow cell with four types of fluorescently labeled nucleotides (A, T, C, G)
  • Competition: Only the correct matching base can bind to the DNA strand. When it binds, it emits a specific color of light (e.g., Green for G, Red for T)
  • Action!: high-resolution camera takes a photo of the entire flow cell; color of the “dot” tells the computer which base was just added
  • Cleave and Repeat: The “terminator” (the chemical block) is washed away, and the next base is added - happens hundreds of times
  4. Output
  • The output of this process is a FASTQ file
  • To build a radiation biosensor using D. radiodurans we must isolate the promoters, the “DNA switches” that turn on the recA and pprI genes the moment radiation is detected
  • Short-read sequencing is the gold standard for finding these precise regulatory sequences
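As a rough planning aid, here is a minimal sketch of how many short reads such a small genome would need, using the usual estimate coverage = reads × read length / genome size. The 150 bp read length and 30× depth are my assumptions, not values from a specific protocol.

```python
import math

def reads_needed(genome_size_bp, read_length_bp, target_coverage):
    """Reads required so that reads * read_length / genome_size >= target_coverage."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)

GENOME_SIZE_BP = 3_200_000   # approx. D. radiodurans genome size quoted above
READ_LENGTH_BP = 150         # assumed Illumina read length
TARGET_COVERAGE = 30         # assumed sequencing depth

n_reads = reads_needed(GENOME_SIZE_BP, READ_LENGTH_BP, TARGET_COVERAGE)
print(n_reads)
```

Under these assumptions the answer is well under a million reads, which is tiny compared with what a single Illumina run produces.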

For Bowhead whales

  1. Technology and Generation
  • The goal: looking for the unique amino acid swaps in the bwCIRBP protein and the duplication of DNA repair genes
  • ∴ We need a sequencing method that combines the length of the sunflower strategy with the accuracy of the bacterial strategy
  • Solution: PacBio HiFi Sequencing, third generation
  • Unlike standard Nanopore (which can be “noisy”) or Illumina (which is too short), HiFi reads are both long (up to 20,000 bp) and ultra-accurate
  2. Input
  • Input: HMW (High Molecular Weight) DNA, usually taken from a skin biopsy or blood sample of the whale 🐋
  • Shearing: The DNA is sheared into large fragments of approximately 15–20 kb
  • Exonuclease Digestion: We clean up any stray single-stranded DNA to ensure only high-quality double-stranded templates remain
  • Hairpin Adapter Ligation: This is the “secret sauce” of PacBio. We attach hairpin adapters to both ends of the DNA fragment, turning it into a closed circle (called a SMRTbell)
  • Polymerase Binding: A highly engineered DNA Polymerase is attached to the adapter, ready to start “looping” around the circle
  3. Steps

Decoding: Circular Consensus Sequencing (CCS) - this is where the name HiFi comes from

  • The ZMW (Zero-Mode Waveguide): The SMRTbell is dropped into a tiny well called a ZMW. There are millions of these on a single chip.
  • The “Rolling Circle”: The polymerase starts unzipping and copying the DNA. Because the DNA is a circle, the polymerase doesn’t stop at the end—it just keeps going around and around.
  • Fluorescence Detection: As each base (A, T, C, G) is added, it lets off a flash of light. A camera at the bottom of the ZMW records these flashes.
  • Base Calling by Consensus: The machine reads the same strand of DNA 10, 20, or 30 times. If there was a random error in the first pass, it gets corrected by the subsequent passes.
  • Result: You get a single, long “Consensus” read that is as accurate as Illumina but long enough to capture a whole gene! 🧬
  4. Output
  • The output is a BAM file (which holds the consensus reads and their quality scores) that can be converted into a FASTQ file
  • We can align the whale’s CIRBP sequence directly against the human hCIRBP sequence
  • Because the data is so clean, the five unique amino acids mentioned above will stand out clearly, allowing us to design synthetic versions for oncological therapy 🥳
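The idea behind “base calling by consensus” can be sketched as a simple majority vote across passes. This is only a toy: real circular consensus sequencing aligns the passes and uses a statistical model that also handles insertions and deletions, and the passes below are made up.

```python
from collections import Counter

def consensus(passes):
    """Majority-vote base at each position across equal-length passes."""
    if len({len(p) for p in passes}) != 1:
        raise ValueError("all passes must have the same length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*passes)
    )

# Made-up passes over the same molecule; three of them carry one random error
passes = ["ACGTTAGC", "ACGTAAGC", "ACCTTAGC", "ACGTTAGC", "TCGTTAGC"]
print(consensus(passes))
```

Because each error appears in only one pass, the other passes outvote it, which is why more passes give a cleaner read.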

If we can understand how the whale stabilises its mRNA via CIRBP, we might be able to replicate that stability in human cells undergoing chemotherapy.
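Once the two protein sequences are aligned, listing the amino acid swaps is straightforward. Here is a minimal sketch, assuming the sequences are already aligned to the same length with no gaps; the two short sequences are toy placeholders and are not the real CIRBP proteins.

```python
def substitutions(reference, query):
    """List (position, ref_residue, query_residue) where two aligned sequences differ."""
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, ref, qry)
        for i, (ref, qry) in enumerate(zip(reference, query), start=1)
        if ref != qry
    ]

# Toy sequences for illustration only; these are NOT the real CIRBP proteins
human_toy = "MKTAYIAKQR"
whale_toy = "MKTAHIAKQK"
diffs = substitutions(human_toy, whale_toy)
for position, ref, qry in diffs:
    print(f"{ref}{position}{qry}")
```

The printed labels follow the common convention of reference residue, position, new residue.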

For regular degular yeast

  1. Technology and Generation

Ok, admittedly this is a little curveball answer, but instead of sequencing the DNA of yeast, I would sequence the RNA of yeast. I am seeking to understand the mechanism of how Hsp104 interacts with Hsp70 and Hsp40. Just knowing the DNA sequence is not sufficient - instead I need to know how yeast regulates these genes in a protein-folding crisis.
  • The goal: To map the “Chaperone Network” and identify the regulatory triggers that allow yeast to resolubilise amyloid filaments

Therefore I would use Illumina for transcriptomic purposes (RNA sequencing) as it is:

  • Cost effective
  • Highly quantitative: doesn’t just tell you the sequence, but also how many copies of the Hsp104 mRNA are present
  2. Input
  • RNA extracted from yeast cells that have been “stressed” (e.g., heat-shocked or exposed to misfolded proteins)
  • Poly-A Selection: Use magnetic beads to grab only the mRNA (the instructions for proteins), which have a “Poly-A tail”
  • Reverse Transcription: Illumina cannot sequence RNA directly so we must use reverse transcriptase to turn the unstable RNA back into stable cDNA (complementary DNA)
  • Fragmentation & Adapters: Just like with Conan the Bacterium, we shear the cDNA into small bits and attach Illumina adapters (barcodes and anchors)
  • PCR: We amplify the library to ensure we can detect even the rarest “stress-response” instructions
  3. Steps

We use the same technique we used for Conan:

  • Fluorescence: Each time a base is added to the strand being copied from the cDNA, it flashes a specific colour
  • High throughput: We can sequence millions of RNA transcripts simultaneously
  4. Output
  • The output is a FASTQ file, but the way we use it is different
  • Instead of building a map, we count
  • If we see 5,000 reads for HSP104 in stressed yeast but only 10 reads in healthy yeast, we know exactly how much the “prion-curing” machinery has been ramped up
  • Alternative Splicing: In principle you can also see if the yeast is “editing” its mRNA to create different versions of a protein for different types of protein clumps, though most yeast genes have no introns, so this is probably rare
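The “counting” step can be sketched in a few lines. The read counts come from the example above; the library size of 20 million reads per sample, the pseudocount and the function names are my assumptions for illustration.

```python
import math

def counts_per_million(gene_reads, total_reads):
    """Scale a raw read count by library size."""
    return gene_reads / total_reads * 1_000_000

def log2_fold_change(treated_cpm, control_cpm, pseudocount=1.0):
    """log2 ratio with a pseudocount so zero counts do not divide by zero."""
    return math.log2((treated_cpm + pseudocount) / (control_cpm + pseudocount))

# Read counts from the example above; library sizes are assumed
stressed_cpm = counts_per_million(5_000, 20_000_000)
healthy_cpm = counts_per_million(10, 20_000_000)
lfc = log2_fold_change(stressed_cpm, healthy_cpm)
print(round(stressed_cpm, 1), round(healthy_cpm, 1), round(lfc, 2))
```

A real analysis would use biological replicates and a dedicated differential expression package rather than a single ratio.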
| Organism | Primary Research Goal | Best Technology | Data Output Type |
| --- | --- | --- | --- |
| Sunflower | Find metal-transporter clusters | Oxford Nanopore | Long-Read Map |
| Conan (Bacterium) | Map rapid-repair genes (PprI) | Illumina (SBS) | High-Accuracy Map |
| Bowhead Whale | Compare CIRBP amino acids | PacBio HiFi | Ultra-Clean Gene Sequence |
| Yeast | Watch Hsp104 “curing” prions | RNA-Seq (Illumina) | Gene Expression Levels |

🧬 5.2 (i) What DNA would you want to synthesize and why? (ii) What technology or technologies would you use to perform this DNA synthesis and why?

I would synthesise a Chimeric Genetic Circuit that combines the “sensing” power of the sunflowers and Conan with the “repair” power of bowhead whales and yeast. It would have these specific components:

  • Switch (Promoter): The DNA sequence for the PprI promoter from D. radiodurans. This acts as a high-sensitivity “on-switch” that only activates in the presence of ionising radiation.
  • Shield (Protective Protein): The DNA sequence for the Bowhead Whale CIRBP (with those five unique amino acids). This protein would be produced the moment the PprI switch is flipped.
  • Cleaner: The DNA sequence for Yeast Hsp104, modified with human-compatible codons (through codon optimisation) to ensure it works efficiently in mammalian cells to prevent protein clumping under radiation stress
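Codon optimisation for the “Cleaner” part can be sketched as re-encoding a protein with one preferred codon per amino acid. The small table below is an illustrative subset of codons often reported as frequent in human genes; treat the choices as assumptions, since a real design would use a full codon-usage table and also check things like GC content and repeats. The peptide is a toy, not a piece of Hsp104.

```python
# Illustrative subset of codons often reported as frequent in human genes.
# These choices are assumptions; a real design would use a full usage table.
PREFERRED_HUMAN_CODON = {
    "M": "ATG", "K": "AAG", "L": "CTG", "E": "GAG",
    "A": "GCC", "G": "GGC", "S": "AGC", "*": "TGA",
}

def back_translate(protein):
    """Encode a protein string using one preferred codon per amino acid."""
    try:
        return "".join(PREFERRED_HUMAN_CODON[aa] for aa in protein)
    except KeyError as missing:
        raise ValueError(f"no codon listed for residue {missing}") from None

# Toy peptide, not a fragment of the real Hsp104 sequence
dna = back_translate("MKLEAGS*")
print(dna)
```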

By synthesizing this custom circuit, we can create a prophylactic genetic shield. This could be used to create radiation-proof crops 🌾 for phytoremediation or specialized cell therapies for organisms living in high radiation areas, or even cancer patients undergoing high-dose radiotherapy.

In order to do this, I would use Phosphoramidite Synthesis (Oligonucleotide Synthesis) to create primers and short units of DNA, which is what happens at Twist as Emily Leproust explained. I would also use Enzymatic DNA Synthesis because this could enable the synthesis of longer strands of DNA. And finally, I would build the circuit by joining all of the short units of DNA together using Gibson assembly.
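Gibson assembly relies on neighbouring fragments sharing a stretch of identical sequence at their ends. Here is a minimal sketch of designing those overlaps and checking that the fragments join in the intended order. The three parts are toy stand-ins for the promoter, CIRBP and Hsp104 sequences, and the 6-base overlap is only for readability; real overlaps are typically tens of base pairs long.

```python
def add_overlaps(parts, overlap):
    """Prefix each part (except the first) with the tail of its upstream neighbour."""
    fragments = [parts[0]]
    for upstream, part in zip(parts, parts[1:]):
        fragments.append(upstream[-overlap:] + part)
    return fragments

def assemble(fragments, overlap):
    """Join fragments in order, checking that neighbouring ends really match."""
    product = fragments[0]
    for fragment in fragments[1:]:
        if product[-overlap:] != fragment[:overlap]:
            raise ValueError("adjacent fragments do not share the expected overlap")
        product += fragment[overlap:]
    return product

# Toy stand-ins for the promoter, CIRBP and Hsp104 parts; not real sequences
parts = ["TTGACATATAAT", "ATGGCCAGCGAT", "ATGAACGACCAG"]
OVERLAP = 6  # kept short for readability
fragments = add_overlaps(parts, OVERLAP)
assembled = assemble(fragments, OVERLAP)
print(fragments)
print(assembled)
```

In practice the overlaps are usually added through PCR primers or ordered as part of the synthetic fragments.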

🧬 5.3 (i) What DNA would you want to edit and why? (ii) What technology or technologies would you use to perform these DNA edits and why?

Maybe I would edit stem cells . . . not sure. Have to think about the risks and ethical implications. But if I were going to I would use CRISPR because it’s really good at targeting DNA for editing with great precision.

Week 3 HW: Lab automation

🤖 Part I: Python Script for Opentrons Artwork
Yin and Yang Symbol

I wanted to do a yin and yang symbol so I went about designing it on the very helpful website made by Ronan! https://opentrons-art.rcdonovan.com

This is my design

First Time Running Code Visualisation/Simulation - Jupyter was kicking my ass! 😭

Second Time Running Code - A Success! 🥳 These weird long lines show the movement of the Opentrons robot

Click to view the Python of my Design 1 (BEWARE: It is super long!)
from opentrons import types

metadata = {
    'protocolName': 'HTGAA Opentrons Yin Yang Art',
    'author': 'Hana',
    'apiLevel': '2.20',
}

# --- Robot deck setup constants ---
TIP_RACK_DECK_SLOT = 9
COLORS_DECK_SLOT = 6
AGAR_DECK_SLOT = 5

well_colors = {
    'A1': 'Red',   # Pink (mkate2_tf)
    'D1': 'Cyan',  # Teal (mturquoise2)
}

# Layers queued by paint_layer(); run() paints them in this order
layers = []

def run(protocol):
    # Load Labware with required display names for the Mock Simulator
    tips = protocol.load_labware('opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Tips')
    pipette = protocol.load_instrument("p20_single_gen2", "right", [tips])
    colors = protocol.load_labware('nest_96_wellplate_200ul_flat', COLORS_DECK_SLOT, 'Inks')
    plate = protocol.load_labware('nest_96_wellplate_200ul_flat', AGAR_DECK_SLOT, 'Agar Plate')
    target_well = plate['A1']

    for points, color_well_name in layers:
        pipette.pick_up_tip()
        # Direct dictionary access for Mock objects
        color_source = colors[color_well_name]

        for x, y in points:
            # Destination point relative to the well top
            destination = target_well.top().move(types.Point(x=x, y=y, z=0))

            pipette.aspirate(1, color_source)
            # Pass location into dispense for the simulator
            pipette.dispense(1, destination)

        pipette.drop_tip()

def paint_layer(points, color_well_name):
    # Queue one colour layer; the pipetting itself happens inside run(),
    # where the pipette and labware exist
    layers.append((points, color_well_name))

# --- FULL COORDINATE LISTS ---
mkate2_tf_points = [(-4.4, 39.6),(-2.2, 39.6),(0, 39.6),(2.2, 39.6),(4.4, 39.6),(-13.2, 37.4),(-11, 37.4),(-8.8, 37.4),(-6.6, 37.4),(-4.4, 37.4),(-2.2, 37.4),(0, 37.4),(2.2, 37.4),(4.4, 37.4),(6.6, 37.4),(8.8, 37.4),(-17.6, 35.2),(-15.4, 35.2),(-13.2, 35.2),(-11, 35.2),(-8.8, 35.2),(-6.6, 35.2),(-4.4, 35.2),(-2.2, 35.2),(0, 35.2),(2.2, 35.2),(4.4, 35.2),(6.6, 35.2),(8.8, 35.2),(11, 35.2),(-22, 33),(-19.8, 33),(-17.6, 33),(-15.4, 33),(-13.2, 33),(-11, 33),(-8.8, 33),(-6.6, 33),(-4.4, 33),(-2.2, 33),(0, 33),(2.2, 33),(4.4, 33),(6.6, 33),(8.8, 33),(11, 33),(13.2, 33),(-24.2, 30.8),(-22, 30.8),(-19.8, 30.8),(-17.6, 30.8),(-15.4, 30.8),(-13.2, 30.8),(-11, 30.8),(-8.8, 30.8),(-6.6, 30.8),(-4.4, 30.8),(-2.2, 30.8),(0, 30.8),(2.2, 30.8),(4.4, 30.8),(6.6, 30.8),(8.8, 30.8),(11, 30.8),(13.2, 30.8),(15.4, 30.8),(-26.4, 28.6),(-24.2, 28.6),(-22, 28.6),(-19.8, 28.6),(-17.6, 28.6),(-15.4, 28.6),(-13.2, 28.6),(-11, 28.6),(-8.8, 28.6),(-6.6, 28.6),(-4.4, 28.6),(-2.2, 28.6),(0, 28.6),(2.2, 28.6),(4.4, 28.6),(6.6, 28.6),(8.8, 28.6),(11, 28.6),(13.2, 28.6),(15.4, 28.6),(-28.6, 26.4),(-26.4, 26.4),(-24.2, 26.4),(-22, 26.4),(-19.8, 26.4),(-17.6, 26.4),(-15.4, 26.4),(-13.2, 26.4),(-11, 26.4),(-8.8, 26.4),(-6.6, 26.4),(-4.4, 26.4),(-2.2, 26.4),(0, 26.4),(2.2, 26.4),(4.4, 26.4),(6.6, 26.4),(8.8, 26.4),(11, 26.4),(13.2, 26.4),(15.4, 26.4),(-30.8, 24.2),(-28.6, 24.2),(-26.4, 24.2),(-24.2, 24.2),(-22, 24.2),(-19.8, 24.2),(-17.6, 24.2),(-15.4, 24.2),(-13.2, 24.2),(-11, 24.2),(-8.8, 24.2),(-6.6, 24.2),(-4.4, 24.2),(4.4, 24.2),(6.6, 24.2),(8.8, 24.2),(11, 24.2),(13.2, 24.2),(15.4, 24.2),(17.6, 24.2),(-33, 22),(-30.8, 22),(-28.6, 22),(-26.4, 22),(-24.2, 22),(-22, 22),(-19.8, 22),(-17.6, 22),(-15.4, 22),(-13.2, 22),(-11, 22),(-8.8, 22),(-6.6, 22),(6.6, 22),(8.8, 22),(11, 22),(13.2, 22),(15.4, 22),(17.6, 22),(-33, 19.8),(-30.8, 19.8),(-28.6, 19.8),(-26.4, 19.8),(-24.2, 19.8),(-22, 19.8),(-19.8, 19.8),(-17.6, 19.8),(-15.4, 19.8),(-13.2, 19.8),(-11, 19.8),(-8.8, 19.8),(-6.6, 19.8),(6.6, 19.8),(8.8, 
19.8),(11, 19.8),(13.2, 19.8),(15.4, 19.8),(17.6, 19.8),(-35.2, 17.6),(-33, 17.6),(-30.8, 17.6),(-28.6, 17.6),(-26.4, 17.6),(-24.2, 17.6),(-22, 17.6),(-19.8, 17.6),(-17.6, 17.6),(-15.4, 17.6),(-13.2, 17.6),(-11, 17.6),(-8.8, 17.6),(-6.6, 17.6),(6.6, 17.6),(8.8, 17.6),(11, 17.6),(13.2, 17.6),(15.4, 17.6),(17.6, 17.6),(-35.2, 15.4),(-33, 15.4),(-30.8, 15.4),(-28.6, 15.4),(-26.4, 15.4),(-24.2, 15.4),(-22, 15.4),(-19.8, 15.4),(-17.6, 15.4),(-15.4, 15.4),(-13.2, 15.4),(-11, 15.4),(-8.8, 15.4),(-6.6, 15.4),(-4.4, 15.4),(4.4, 15.4),(6.6, 15.4),(8.8, 15.4),(11, 15.4),(13.2, 15.4),(15.4, 15.4),(17.6, 15.4),(-37.4, 13.2),(-35.2, 13.2),(-33, 13.2),(-30.8, 13.2),(-28.6, 13.2),(-26.4, 13.2),(-24.2, 13.2),(-22, 13.2),(-19.8, 13.2),(-17.6, 13.2),(-15.4, 13.2),(-13.2, 13.2),(-11, 13.2),(-8.8, 13.2),(-6.6, 13.2),(-4.4, 13.2),(-2.2, 13.2),(0, 13.2),(2.2, 13.2),(4.4, 13.2),(6.6, 13.2),(8.8, 13.2),(11, 13.2),(13.2, 13.2),(15.4, 13.2),(-37.4, 11),(-35.2, 11),(-33, 11),(-30.8, 11),(-28.6, 11),(-26.4, 11),(-24.2, 11),(-22, 11),(-19.8, 11),(-17.6, 11),(-15.4, 11),(-13.2, 11),(-11, 11),(-8.8, 11),(-6.6, 11),(-4.4, 11),(-2.2, 11),(0, 11),(2.2, 11),(4.4, 11),(6.6, 11),(8.8, 11),(11, 11),(13.2, 11),(15.4, 11),(-37.4, 8.8),(-35.2, 8.8),(-33, 8.8),(-30.8, 8.8),(-28.6, 8.8),(-26.4, 8.8),(-24.2, 8.8),(-22, 8.8),(-19.8, 8.8),(-17.6, 8.8),(-15.4, 8.8),(-13.2, 8.8),(-11, 8.8),(-8.8, 8.8),(-6.6, 8.8),(-4.4, 8.8),(-2.2, 8.8),(0, 8.8),(2.2, 8.8),(4.4, 8.8),(6.6, 8.8),(8.8, 8.8),(11, 8.8),(13.2, 8.8),(-37.4, 6.6),(-35.2, 6.6),(-33, 6.6),(-30.8, 6.6),(-28.6, 6.6),(-26.4, 6.6),(-24.2, 6.6),(-22, 6.6),(-19.8, 6.6),(-17.6, 6.6),(-15.4, 6.6),(-13.2, 6.6),(-11, 6.6),(-8.8, 6.6),(-6.6, 6.6),(-4.4, 6.6),(-2.2, 6.6),(0, 6.6),(2.2, 6.6),(4.4, 6.6),(6.6, 6.6),(8.8, 6.6),(11, 6.6),(-39.6, 4.4),(-37.4, 4.4),(-35.2, 4.4),(-33, 4.4),(-30.8, 4.4),(-28.6, 4.4),(-26.4, 4.4),(-24.2, 4.4),(-22, 4.4),(-19.8, 4.4),(-17.6, 4.4),(-15.4, 4.4),(-13.2, 4.4),(-11, 4.4),(-8.8, 4.4),(-6.6, 4.4),(-4.4, 4.4),(-2.2, 4.4),(0, 4.4),(2.2, 
4.4),(4.4, 4.4),(6.6, 4.4),(8.8, 4.4),(-39.6, 2.2),(-37.4, 2.2),(-35.2, 2.2),(-33, 2.2),(-30.8, 2.2),(-28.6, 2.2),(-26.4, 2.2),(-24.2, 2.2),(-22, 2.2),(-19.8, 2.2),(-17.6, 2.2),(-15.4, 2.2),(-13.2, 2.2),(-11, 2.2),(-8.8, 2.2),(-6.6, 2.2),(-4.4, 2.2),(-2.2, 2.2),(0, 2.2),(2.2, 2.2),(4.4, 2.2),(-39.6, 0),(-37.4, 0),(-35.2, 0),(-33, 0),(-30.8, 0),(-28.6, 0),(-26.4, 0),(-24.2, 0),(-22, 0),(-19.8, 0),(-17.6, 0),(-15.4, 0),(-13.2, 0),(-11, 0),(-8.8, 0),(-6.6, 0),(-39.6, -2.2),(-37.4, -2.2),(-35.2, -2.2),(-33, -2.2),(-30.8, -2.2),(-28.6, -2.2),(-26.4, -2.2),(-24.2, -2.2),(-22, -2.2),(-19.8, -2.2),(-17.6, -2.2),(-15.4, -2.2),(-13.2, -2.2),(-11, -2.2),(-39.6, -4.4),(-37.4, -4.4),(-35.2, -4.4),(-33, -4.4),(-30.8, -4.4),(-28.6, -4.4),(-26.4, -4.4),(-24.2, -4.4),(-22, -4.4),(-19.8, -4.4),(-17.6, -4.4),(-15.4, -4.4),(-13.2, -4.4),(-37.4, -6.6),(-35.2, -6.6),(-33, -6.6),(-30.8, -6.6),(-28.6, -6.6),(-26.4, -6.6),(-24.2, -6.6),(-22, -6.6),(-19.8, -6.6),(-17.6, -6.6),(-15.4, -6.6),(-37.4, -8.8),(-35.2, -8.8),(-33, -8.8),(-30.8, -8.8),(-28.6, -8.8),(-26.4, -8.8),(-24.2, -8.8),(-22, -8.8),(-19.8, -8.8),(-17.6, -8.8),(-37.4, -11),(-35.2, -11),(-33, -11),(-30.8, -11),(-28.6, -11),(-26.4, -11),(-24.2, -11),(-22, -11),(-19.8, -11),(-37.4, -13.2),(-35.2, -13.2),(-33, -13.2),(-30.8, -13.2),(-28.6, -13.2),(-26.4, -13.2),(-24.2, -13.2),(-22, -13.2),(-19.8, -13.2),(-35.2, -15.4),(-33, -15.4),(-30.8, -15.4),(-28.6, -15.4),(-26.4, -15.4),(-24.2, -15.4),(-22, -15.4),(-2.2, -15.4),(0, -15.4),(2.2, -15.4),(-35.2, -17.6),(-33, -17.6),(-30.8, -17.6),(-28.6, -17.6),(-26.4, -17.6),(-24.2, -17.6),(-22, -17.6),(-4.4, -17.6),(-2.2, -17.6),(0, -17.6),(2.2, -17.6),(4.4, -17.6),(-33, -19.8),(-30.8, -19.8),(-28.6, -19.8),(-26.4, -19.8),(-24.2, -19.8),(-22, -19.8),(-4.4, -19.8),(-2.2, -19.8),(0, -19.8),(2.2, -19.8),(4.4, -19.8),(-33, -22),(-30.8, -22),(-28.6, -22),(-26.4, -22),(-24.2, -22),(-22, -22),(-4.4, -22),(-2.2, -22),(0, -22),(2.2, -22),(4.4, -22),(-30.8, -24.2),(-28.6, -24.2),(-26.4, -24.2),(-24.2, 
-24.2),(-22, -24.2),(-2.2, -24.2),(0, -24.2),(2.2, -24.2),(-28.6, -26.4),(-26.4, -26.4),(-24.2, -26.4),(-22, -26.4),(-19.8, -26.4),(-26.4, -28.6),(-24.2, -28.6),(-22, -28.6),(-19.8, -28.6),(-24.2, -30.8),(-22, -30.8),(-19.8, -30.8),(-17.6, -30.8),(-22, -33),(-19.8, -33),(-17.6, -33),(-17.6, -35.2),(-15.4, -35.2),(-13.2, -37.4)]

mturquoise2_points = [(13.2, 37.4),(15.4, 35.2),(17.6, 35.2),(17.6, 33),(19.8, 33),(22, 33),(19.8, 30.8),(22, 30.8),(24.2, 30.8),(19.8, 28.6),(22, 28.6),(24.2, 28.6),(26.4, 28.6),(19.8, 26.4),(22, 26.4),(24.2, 26.4),(26.4, 26.4),(28.6, 26.4),(-2.2, 24.2),(0, 24.2),(2.2, 24.2),(22, 24.2),(24.2, 24.2),(26.4, 24.2),(28.6, 24.2),(30.8, 24.2),(-4.4, 22),(-2.2, 22),(0, 22),(2.2, 22),(4.4, 22),(22, 22),(24.2, 22),(26.4, 22),(28.6, 22),(30.8, 22),(33, 22),(-4.4, 19.8),(-2.2, 19.8),(0, 19.8),(2.2, 19.8),(4.4, 19.8),(22, 19.8),(24.2, 19.8),(26.4, 19.8),(28.6, 19.8),(30.8, 19.8),(33, 19.8),(-4.4, 17.6),(-2.2, 17.6),(0, 17.6),(2.2, 17.6),(4.4, 17.6),(22, 17.6),(24.2, 17.6),(26.4, 17.6),(28.6, 17.6),(30.8, 17.6),(33, 17.6),(35.2, 17.6),(-2.2, 15.4),(0, 15.4),(2.2, 15.4),(22, 15.4),(24.2, 15.4),(26.4, 15.4),(28.6, 15.4),(30.8, 15.4),(33, 15.4),(35.2, 15.4),(19.8, 13.2),(22, 13.2),(24.2, 13.2),(26.4, 13.2),(28.6, 13.2),(30.8, 13.2),(33, 13.2),(35.2, 13.2),(37.4, 13.2),(19.8, 11),(22, 11),(24.2, 11),(26.4, 11),(28.6, 11),(30.8, 11),(33, 11),(35.2, 11),(37.4, 11),(17.6, 8.8),(19.8, 8.8),(22, 8.8),(24.2, 8.8),(26.4, 8.8),(28.6, 8.8),(30.8, 8.8),(33, 8.8),(35.2, 8.8),(37.4, 8.8),(15.4, 6.6),(17.6, 6.6),(19.8, 6.6),(22, 6.6),(24.2, 6.6),(26.4, 6.6),(28.6, 6.6),(30.8, 6.6),(33, 6.6),(35.2, 6.6),(37.4, 6.6),(15.4, 4.4),(17.6, 4.4),(19.8, 4.4),(22, 4.4),(24.2, 4.4),(26.4, 4.4),(28.6, 4.4),(30.8, 4.4),(33, 4.4),(35.2, 4.4),(37.4, 4.4),(39.6, 4.4),(11, 2.2),(13.2, 2.2),(15.4, 2.2),(17.6, 2.2),(19.8, 2.2),(22, 2.2),(24.2, 2.2),(26.4, 2.2),(28.6, 2.2),(30.8, 2.2),(33, 2.2),(35.2, 2.2),(37.4, 2.2),(39.6, 2.2),(6.6, 0),(8.8, 0),(11, 0),(13.2, 0),(15.4, 0),(17.6, 0),(19.8, 0),(22, 0),(24.2, 0),(26.4, 0),(28.6, 0),(30.8, 0),(33, 0),(35.2, 0),(37.4, 0),(39.6, 0),(-4.4, -2.2),(-2.2, -2.2),(0, -2.2),(2.2, -2.2),(4.4, -2.2),(6.6, -2.2),(8.8, -2.2),(11, -2.2),(13.2, -2.2),(15.4, -2.2),(17.6, -2.2),(19.8, -2.2),(22, -2.2),(24.2, -2.2),(26.4, -2.2),(28.6, -2.2),(30.8, -2.2),(33, -2.2),(35.2, 
-2.2),(37.4, -2.2),(39.6, -2.2),(-8.8, -4.4),(-6.6, -4.4),(-4.4, -4.4),(-2.2, -4.4),(0, -4.4),(2.2, -4.4),(4.4, -4.4),(6.6, -4.4),(8.8, -4.4),(11, -4.4),(13.2, -4.4),(15.4, -4.4),(17.6, -4.4),(19.8, -4.4),(22, -4.4),(24.2, -4.4),(26.4, -4.4),(28.6, -4.4),(30.8, -4.4),(33, -4.4),(35.2, -4.4),(37.4, -4.4),(39.6, -4.4),(-11, -6.6),(-8.8, -6.6),(-6.6, -6.6),(-4.4, -6.6),(-2.2, -6.6),(0, -6.6),(2.2, -6.6),(4.4, -6.6),(6.6, -6.6),(8.8, -6.6),(11, -6.6),(13.2, -6.6),(15.4, -6.6),(17.6, -6.6),(19.8, -6.6),(22, -6.6),(24.2, -6.6),(26.4, -6.6),(28.6, -6.6),(30.8, -6.6),(33, -6.6),(35.2, -6.6),(37.4, -6.6),(-13.2, -8.8),(-11, -8.8),(-8.8, -8.8),(-6.6, -8.8),(-4.4, -8.8),(-2.2, -8.8),(0, -8.8),(2.2, -8.8),(4.4, -8.8),(6.6, -8.8),(8.8, -8.8),(11, -8.8),(13.2, -8.8),(15.4, -8.8),(17.6, -8.8),(19.8, -8.8),(22, -8.8),(24.2, -8.8),(26.4, -8.8),(28.6, -8.8),(30.8, -8.8),(33, -8.8),(35.2, -8.8),(37.4, -8.8),(-15.4, -11),(-13.2, -11),(-11, -11),(-8.8, -11),(-6.6, -11),(-4.4, -11),(-2.2, -11),(0, -11),(2.2, -11),(4.4, -11),(6.6, -11),(8.8, -11),(11, -11),(13.2, -11),(15.4, -11),(17.6, -11),(19.8, -11),(22, -11),(24.2, -11),(26.4, -11),(28.6, -11),(30.8, -11),(33, -11),(35.2, -11),(37.4, -11),(-15.4, -13.2),(-13.2, -13.2),(-11, -13.2),(-8.8, -13.2),(-6.6, -13.2),(-4.4, -13.2),(-2.2, -13.2),(0, -13.2),(2.2, -13.2),(4.4, -13.2),(6.6, -13.2),(8.8, -13.2),(11, -13.2),(13.2, -13.2),(15.4, -13.2),(17.6, -13.2),(19.8, -13.2),(22, -13.2),(24.2, -13.2),(26.4, -13.2),(28.6, -13.2),(30.8, -13.2),(33, -13.2),(35.2, -13.2),(37.4, -13.2),(-17.6, -15.4),(-15.4, -15.4),(-13.2, -15.4),(-11, -15.4),(-8.8, -15.4),(-6.6, -15.4),(-4.4, -15.4),(4.4, -15.4),(6.6, -15.4),(8.8, -15.4),(11, -15.4),(13.2, -15.4),(15.4, -15.4),(17.6, -15.4),(19.8, -15.4),(22, -15.4),(24.2, -15.4),(26.4, -15.4),(28.6, -15.4),(30.8, -15.4),(33, -15.4),(35.2, -15.4),(-17.6, -17.6),(-15.4, -17.6),(-13.2, -17.6),(-11, -17.6),(-8.8, -17.6),(-6.6, -17.6),(6.6, -17.6),(8.8, -17.6),(11, -17.6),(13.2, -17.6),(15.4, -17.6),(17.6, 
-17.6),(19.8, -17.6),(22, -17.6),(24.2, -17.6),(26.4, -17.6),(28.6, -17.6),(30.8, -17.6),(33, -17.6),(35.2, -17.6),(-17.6, -19.8),(-15.4, -19.8),(-13.2, -19.8),(-11, -19.8),(-8.8, -19.8),(-6.6, -19.8),(6.6, -19.8),(8.8, -19.8),(11, -19.8),(13.2, -19.8),(15.4, -19.8),(17.6, -19.8),(19.8, -19.8),(22, -19.8),(24.2, -19.8),(26.4, -19.8),(28.6, -19.8),(30.8, -19.8),(33, -19.8),(-17.6, -22),(-15.4, -22),(-13.2, -22),(-11, -22),(-8.8, -22),(-6.6, -22),(6.6, -22),(8.8, -22),(11, -22),(13.2, -22),(15.4, -22),(17.6, -22),(19.8, -22),(22, -22),(24.2, -22),(26.4, -22),(28.6, -22),(30.8, -22),(33, -22),(-17.6, -24.2),(-15.4, -24.2),(-13.2, -24.2),(-11, -24.2),(-8.8, -24.2),(-6.6, -24.2),(-4.4, -24.2),(4.4, -24.2),(6.6, -24.2),(8.8, -24.2),(11, -24.2),(13.2, -24.2),(15.4, -24.2),(17.6, -24.2),(19.8, -24.2),(22, -24.2),(24.2, -24.2),(26.4, -24.2),(28.6, -24.2),(30.8, -24.2),(-15.4, -26.4),(-13.2, -26.4),(-11, -26.4),(-8.8, -26.4),(-6.6, -26.4),(-4.4, -26.4),(-2.2, -26.4),(0, -26.4),(2.2, -26.4),(4.4, -26.4),(6.6, -26.4),(8.8, -26.4),(11, -26.4),(13.2, -26.4),(15.4, -26.4),(17.6, -26.4),(19.8, -26.4),(22, -26.4),(24.2, -26.4),(26.4, -26.4),(28.6, -26.4),(-15.4, -28.6),(-13.2, -28.6),(-11, -28.6),(-8.8, -28.6),(-6.6, -28.6),(-4.4, -28.6),(-2.2, -28.6),(0, -28.6),(2.2, -28.6),(4.4, -28.6),(6.6, -28.6),(8.8, -28.6),(11, -28.6),(13.2, -28.6),(15.4, -28.6),(17.6, -28.6),(19.8, -28.6),(22, -28.6),(24.2, -28.6),(26.4, -28.6),(-13.2, -30.8),(-11, -30.8),(-8.8, -30.8),(-6.6, -30.8),(-4.4, -30.8),(-2.2, -30.8),(0, -30.8),(2.2, -30.8),(4.4, -30.8),(6.6, -30.8),(8.8, -30.8),(11, -30.8),(13.2, -30.8),(15.4, -30.8),(17.6, -30.8),(19.8, -30.8),(22, -30.8),(24.2, -30.8),(-13.2, -33),(-11, -33),(-8.8, -33),(-6.6, -33),(-4.4, -33),(-2.2, -33),(0, -33),(2.2, -33),(4.4, -33),(6.6, -33),(8.8, -33),(11, -33),(13.2, -33),(15.4, -33),(17.6, -33),(19.8, -33),(22, -33),(-11, -35.2),(-8.8, -35.2),(-6.6, -35.2),(-4.4, -35.2),(-2.2, -35.2),(0, -35.2),(2.2, -35.2),(4.4, -35.2),(6.6, -35.2),(8.8, -35.2),(11, 
-35.2),(13.2, -35.2),(15.4, -35.2),(17.6, -35.2),(-8.8, -37.4),(-6.6, -37.4),(-4.4, -37.4),(-2.2, -37.4),(0, -37.4),(2.2, -37.4),(4.4, -37.4),(6.6, -37.4),(8.8, -37.4),(11, -37.4),(13.2, -37.4),(-4.4, -39.6),(-2.2, -39.6),(0, -39.6),(2.2, -39.6),(4.4, -39.6)]

# --- EXECUTE ART ---
paint_layer(mkate2_tf_points, 'A1')    # Pink layer
paint_layer(mturquoise2_points, 'D1')  # Teal layer

Second design with an accent! 🤌🏼

First Time Running Code Visualisation/Simulation - Drum roll please 🥁

YAY!! 🥂 🍾

I had a problem with this design wherein I somehow used 5 pigments, having intended to use only 3. I had to find out exactly which wells deviated slightly in pigment. Having 5 pigments would make the Python unnecessarily complicated.
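One way to hunt down stray pigments like this is to count how many points use each colour and flag the ones that barely appear. Here is a minimal sketch, assuming the design can be written as a list of (x, y, colour) tuples; the colour names, the made-up design and the threshold of 5 points are all hypothetical.

```python
from collections import Counter

def rare_colours(points, min_points=5):
    """Return {colour: [(x, y), ...]} for colours used by fewer than min_points points."""
    usage = Counter(colour for _, _, colour in points)
    rare = {colour for colour, n in usage.items() if n < min_points}
    flagged = {colour: [] for colour in rare}
    for x, y, colour in points:
        if colour in rare:
            flagged[colour].append((x, y))
    return flagged

# Hypothetical design: three intended colours plus two accidental ones
design = (
    [(i * 2.2, 0.0, "mkate2_tf") for i in range(10)]
    + [(i * 2.2, 2.2, "mturquoise2") for i in range(10)]
    + [(i * 2.2, 4.4, "accent") for i in range(6)]
    + [(0.0, 6.6, "mkate2_tf_alt"), (2.2, 6.6, "mturquoise2_alt")]
)
stray = rare_colours(design)
print(stray)
```

The flagged coordinates can then be recoloured in the design tool before exporting the point lists again.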

I really like this design but I think that the first is more feasible in vitro.

🤖 Part II: Post-Lab Questions

1. Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.

I found these two papers:

  1. AssemblyTron: flexible automation of DNA assembly with Opentrons OT-2 lab robots (Bryant Jr. et al. 2023)

This paper describes the logistical problems that inevitably arise with the trial-and-error DBTL (design, build, test, learn) cycles which characterise the field of synthetic biology. Thus far, many software and hardware platforms have been built to streamline and algorithmically perform each step of this cycle, but there has been a severe dearth of automation for the build part of the cycle. Here, the researchers experiment with AssemblyTron - an open-source Python package designed to implement the build step on Opentrons liquid handling robots with minimal human intervention - to assemble DNA. They find that AssemblyTron is able to carry out efficient PCR and DNA assembly, as their “data provide[s] evidence that AssemblyTron delivers robust results for cloning workflows” (Bryant Jr. et al. 2023:8).

This has massive implications, as it reduces the “time, training, costs and wastes associated with synthetic biology, which, along with open-source and affordable automation, will further foster the accessibility of synthetic biology and accelerate biological research and engineering” (Bryant Jr. et al. 2023:1). Scientists can spend more time asking questions, as well as learning from and designing experiments, rather than scrutinising error-prone cloning methodologies.

  1. Automation of protein crystallization scaleup via Opentrons-2 liquid handling (DeRoo et al. 2025)

This article deals with a protein crystallisation methodology that uses the Opentrons-2 liquid handling robot to work with a model protein, namely hen egg white lysozyme, and a periplasmic protein from Campylobacter jejuni - a pathogenic bacterium associated with poultry. Together these create a crystal that is used in the Snow Lab for nanotechnological purposes which necessitate large, consistent batches.

This study “shows facile adaption of the Opentrons interface and hardware” (DeRoo et al. 2025:1) as this automated approach to protein crystallisation has been shown to

  • reduce manual labour
  • increase reliability in protein crystallisation
  • reduce variability

Thus “offering an economical and versatile tool for laboratories” (ibid.), which is significant as protein crystallisation is a time-consuming and fickle process.

All developed liquid handling routines and relevant data files, in addition to demonstration videos are available at https://github.com/jbderoo/Opentrons2-Protein-Crystallization

2. Write a description about what you intend to do with automation tools for your final project.

For my living heart sleeve idea: I would use an Opentrons liquid-handling robot to mix the special transistor hydrogel precursors with patients’ modified hypoimmunogenic cells at precise ratios, to make an even sleeve layer which behaves as much like normal human tissue as possible!

Week 4 HW: Protein Design Part I

💪🏼 Part A. Conceptual Questions

1. How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)

On average 100g of meat 🥩 contains 26g of protein = 26% protein

100:26 ∴ 500:130g

1 amino acid = 100 daltons ≈ 1.66 × 10⁻²²g

130g / 1.66 × 10⁻²²g = 7.83×10²³ amino acids

7.83×10²³ / Avogadro’s number

= 7.83×10²³ / (6.02214076×10²³ mol⁻¹) ≈ 1.3 mol of amino acids
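The arithmetic above can be double-checked with a few lines of Python. The 26% protein content is the average I assumed above, and the constants are the standard values for Avogadro’s number and the dalton.

```python
AVOGADRO = 6.02214076e23          # molecules per mole
DALTON_IN_GRAMS = 1.66053907e-24  # mass of one dalton in grams

meat_g = 500
protein_fraction = 0.26           # assumed average protein content of meat
amino_acid_da = 100               # average mass given in the question

protein_g = meat_g * protein_fraction
molecules = protein_g / (amino_acid_da * DALTON_IN_GRAMS)
moles = molecules / AVOGADRO

print(f"{protein_g:.0f} g protein, {molecules:.2e} amino acids, {moles:.2f} mol")
```

A shortcut gives the same answer: 100 Da is 100 g/mol, so 130 g / 100 g/mol = 1.3 mol.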

2. Why do humans eat beef but do not become a cow, eat fish but do not become fish?

When humans eat other animals, their digestive systems break down the foreign animal matter into its constituent parts (i.e. complex proteins become amino acids). In this way genetic information is not transferred from the cow or fish to the human, though the amino acids released by digestion are reused to build human proteins.

Furthermore, human DNA contains the instructions for human cells only, and the DNA in food is digested rather than read or translated into proteins within human cells.

3. Why are there only 20 natural amino acids?

It is not yet understood why there are only 20 natural amino acids. Francis Crick proposed the “frozen accident” theory, which postulates that the universal genetic code is not optimised, but is rather the result of a random, early and arbitrary arrangement which later became “frozen” 🥶. This would imply that the 20 natural amino acids just so happened to become the foundation for all life. Other studies hypothesise that the properties of these 20 natural amino acids, such as polarity/charge, hydrophilicity/hydrophobicity, and size, are evenly distributed throughout the chemical space.

Publication on “What Froze the Genetic Code?” (Pouplana et al. 2017) https://pmc.ncbi.nlm.nih.gov/articles/PMC5492136/

However, for now the reason for there being only 20 natural amino acids is a mystery. Today scientists have created thousands of non-naturally occurring amino acids and continue to do so for medicine, material science, and research purposes.

Article on the applications of “Reprogramming natural proteins using unnatural amino acids” (Adhikari et al. 2021): https://pmc.ncbi.nlm.nih.gov/articles/PMC9044140/#

4. Can you make other non-natural amino acids? Design some new amino acids.

As I touched on briefly in the question above, yes! You can make other non-natural amino acids.

5. Where did amino acids come from before enzymes that make them, and before life started?

“Most models of the origin of life suggest organisms developed from environmentally available organic compounds. A variety of amino acids are easily produced under conditions which were believed to have existed on the primitive Earth or in the early solar nebula.” (Cleaves II 2010:480)

https://pubmed.ncbi.nlm.nih.gov/20034500/#:~:text=Most%20models%20of%20the%20origin,in%20the%20early%20solar%20nebula.

It is believed that before amino acids were manufactured by biotic factors, they were formed through abiotic chemical reactions, with key sources of material finding their way onto Earth through extraterrestrial material such as carbonaceous meteorites.

In the movie 🎥 🍿 I watched last week, Yi Yi (2000) dir. Edward Yang, there was a scene with lightning 🌩️ ⚡️ ⛈️ and the narrator talked about how scientists posit that it was lightning on Earth 🌍 that helped to create the first amino acids. They said this in passing but it stuck with me, so I did some research into this and found that yes, research does indicate that the high-temperature shock waves produced by lightning would be a highly efficient energy source for abiotic synthesis of amino acids on prebiotic Earth.

In fact the groundbreaking 1952 Miller-Urey Experiment, conducted by Stanley Miller and supervised by Nobel laureate Harold Urey at the University of Chicago, demonstrated that organic molecules such as amino acids could be synthesised from inorganic precursors under simulated prebiotic Earth conditions. In essence, electrical sparks simulating lightning were passed through a mixture of water, ammonia, methane, and hydrogen. This experiment produced complex organic compounds, thus supporting the theory of abiogenesis.

Miller published a paper detailing the experiment in a 1953 edition of the journal Science: “A production of amino acids under possible primitive earth conditions” (Miller 1953).

https://pubmed.ncbi.nlm.nih.gov/13056598/

Here is a paper reviewing this finding: “The 1953 Stanley L. Miller Experiment: Fifty Years of Prebiotic Organic Chemistry” (Lazcano and Bada 2003).

https://link.springer.com/article/10.1023/A:1024807125069

However, some posit that the results of the Miller-Urey experiment are contentious as the experiment produced a racemic mixture, that is, an equal amount of both left- and right-handed amino acids. This poses a problem as life on Earth uses almost exclusively left-handed amino acids, and as of yet no natural mechanism has been demonstrated to separate the two forms under early Earth conditions.

Some scholars thus argue that homochirality has extraterrestrial origins: “Chirality and the origin of life” (Bailey 2000).

https://www.sciencedirect.com/science/article/abs/pii/S0094576500000242

6. If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?

You would expect left-handedness.

7. Can you discover additional helices in proteins?

Yes! There are not only

  • α-helices

there are also

  • 3₁₀-helices

  • π-helices

8. Why are most molecular helices right-handed?

Most molecular helices, such as α-helices in proteins and DNA (in its A and B forms), are right-handed because of the chirality of their building blocks: proteins are composed of L-amino acids and DNA of D-sugars, and for these building blocks a right-handed configuration maximises hydrogen bonding and minimises steric hindrance.

An example of a molecular helix which is not right-handed is Z-DNA.

9. Why do β-sheets tend to aggregate? - What is the driving force for β-sheet aggregation?

β-sheets tend to aggregate due to the fact that their edges readily interact with other β-strands, leading to edge-to-edge aggregation. In other words, β-sheets are flatter and tend to stack easily.

Hydrophobicity is the driving force for β-sheet aggregation.

10. Why do many amyloid diseases form β-sheets? - Can you use amyloid β-sheets as materials?

Many amyloid diseases form β-sheets as this structure is a highly stable, low-energy, and thermodynamically favourable one which enables the aggregation of misfolded proteins. The β-sheets stack into “cross-β” configurations. This leads to the formation of insoluble fibrils which resist degradation and maximise hydrophobic interactions 🚫💧.

This aggregation is a hallmark of Alzheimer’s Disease, Type 2 Diabetes, and Spongiform Encephalopathies (i.e. Creutzfeldt-Jakob disease (CJD) 🧠 in humans, and Bovine Spongiform Encephalopathy (BSE)/“Mad Cow Disease” 🦠🐄 in cattle).

“Amyloid fibrils” (Rambaran and Serpell 2008) https://pmc.ncbi.nlm.nih.gov/articles/PMC2634529/#:~:text=Introduction,of%20melanosomes%2C%20curli%20and%20hydrophobins.

“Biology of Amyloid: Structure, Function, and Regulation” (Greenwald and Riek 2010) https://www.sciencedirect.com/science/article/pii/S0969212610003084#:~:text=Amyloids%20are%20highly%20ordered%20cross,a%20structural%20framework%20for%20polymorphisms.

And yes! You can use amyloid β-sheets as materials, as their stable cross-β structure, self-assembly properties, and mechanical robustness can be leveraged to create versatile, high-strength, eco-friendly 🌱 materials including hydrogels, nanowires, and bioplastics. These have potential for tissue engineering 🫁, biosensors, and drug 💊 delivery 🚚 🚛 📦.

💪🏼 Part B: Protein Analysis and Visualization

1. Briefly describe the protein you selected and why you selected it.

The protein I have chosen is the Sodium-Potassium Pump, also written as the Na⁺/K⁺-ATPase enzyme. I chose this protein because I remember learning about it at school and being fascinated by how it works. The Na⁺/K⁺-ATPase enzyme is found in all metazoan (animal) cells, and is located in the plasma membrane of cells.

It essentially maintains an extracellular concentration of Na⁺ (sodium ions) that is greater than the intracellular concentration, whilst maintaining an intracellular concentration of K⁺ (potassium ions) that is greater than the extracellular concentration.

In other words, it pumps out 3 sodium ions, and takes in 2 potassium ions. It depends on energy in the form of ATP to complete this action. The sodium-potassium pump is essential for understanding how neurons (nerve cells) function, as they depend on this pump to respond to stimuli and transmit impulses.

I just think that it is a masterpiece of biological engineering. Evolution is so crazy!

2. Identify the amino acid sequence of your protein.

I went to NCBI (which was a bloody maze!) and identified the sequence of my protein:

NP_000692.2 sodium/potassium-transporting ATPase subunit alpha-1 [Homo sapiens]

Click to view Na⁺/K⁺-ATPase Enzyme Amino Acid Sequence (CDS)
MGKGVGRDKYEPAAVSEQGDKKGKKGKKDRDMDELKKEVSMDDHKLSLDELHRKYGTDLSRGLTSARAAEILARDGPNALTPPPTTPEWIKFCRQLFGGFSMLLWIGAILCFLAYSIQAATEEEPQNDNLYLGVVLSAVVIITGCFSYYQEAKSSKIMESFKNMVPQQALVIRNGEKMSINAEEVVVGDLVEVKGGDRIPADLRIISANGCKVDNSSLTGESEPQTRSPDFTNENPLETRNIAFFSTNCVEGTARGIVVYTGDRTVMGRIATLASGLEGGQTPIAAEIEHFIHIITGVAVFLGVSFFILSLILEYTWLEAVIFLIGIIVANVPEGLLATVTVCLTLTAKRMARKNCLVKNLEAVETLGSTSTICSDKTGTLTQNRMTVAHMWFDNQIHEADTTENQSGVSFDKTSATWLALSRIAGLCNRAVFQANQENLPILKRAVAGDASESALLKCIELCCGSVKEMRERYAKIVEIPFNSTNKYQLSIHKNPNTSEPQHLLVMKGAPERILDRCSSILLHGKEQPLDEELKDAFQNAYLELGGLGERVLGFCHLFLPDEQFPEGFQFDTDDVNFPIDNLCFVGLISMIDPPRAAVPDAVGKCRSAGIKVIMVTGDHPITAKAIAKGVGIISEGNETVEDIAARLNIPVSQVNPRDAKACVVHGSDLKDMTSEQLDDILKYHTEIVFARTSPQQKLIIVEGCQRQGAIVAVTGDGVNDSPALKKADIGVAMGIAGSDVSKQAADMILLDDNFASIVTGVEEGRLIFDNLKKSIAYTLTSNIPEITPFLIFIIANIPLPLGTVTILCIDLGTDMVPAISLAYEQAESDIMKRQPRNPKTDKLVNERLISMAYGQIGMIQALGGFFTYFVILAENGFLPIHLLGLRVDWDDRWINDVEDSYGQQWTYEQRKIVEFTCHTAFFVSIVVVQWADLVICKTRRNSVFQQGMKNKILIFGLFEETALAAFLSYCPGMGVALRMYPLKPTWWFCAFPYSLLIFVYDEVRKLIIRRRPGGWVEKETYY

https://www.ncbi.nlm.nih.gov/gene?Db=gene&Cmd=DetailsSearch&Term=476

https://www.ncbi.nlm.nih.gov/protein/NP_000692.2

https://www.ncbi.nlm.nih.gov/protein/NP_000692.2?report=fasta

- How long is it? What is the most frequent amino acid? You can use this Colab notebook to count the frequency of amino acids.

The length of the protein is: 1023 amino acids. The most common amino acid is: Leucine, which appears 96 times.

Amino Acid Frequency Analysis

Protein: Na+/K+-ATPase subunit alpha-1
Total Length: 1023 amino acids

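| Amino Acid | Count | Frequency (%) |
| --- | --- | --- |
| L (Leucine) | 96 | 9.39% |
| I (Isoleucine) | 83 | 8.12% |
| A (Alanine) | 80 | 7.83% |
| G (Glycine) | 76 | 7.44% |
| V (Valine) | 76 | 7.44% |
| E (Glutamic Acid) | 67 | 6.56% |
| T (Threonine) | 61 | 5.97% |
| D (Aspartic Acid) | 59 | 5.77% |
| K (Lysine) | 56 | 5.48% |
| S (Serine) | 56 | 5.48% |
| R (Arginine) | 45 | 4.40% |
| P (Proline) | 45 | 4.40% |
| F (Phenylalanine) | 44 | 4.31% |
| N (Asparagine) | 42 | 4.11% |
| Q (Glutamine) | 35 | 3.42% |
| M (Methionine) | 26 | 2.54% |
| Y (Tyrosine) | 24 | 2.35% |
| C (Cysteine) | 23 | 2.25% |
| H (Histidine) | 16 | 1.57% |
| W (Tryptophan) | 12 | 1.17% |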
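
A count like the one in this table can be reproduced with Python's standard library. The sketch below is not the course Colab notebook; the function name is made up, and `demo` holds only the first 28 residues of the sequence above to keep the example short.

```python
from collections import Counter

def amino_acid_frequencies(sequence):
    """Return (residue, count, percent) tuples, most common residue first."""
    seq = "".join(sequence.split()).upper()   # drop whitespace and line breaks
    counts = Counter(seq)
    total = len(seq)
    return [(aa, n, 100 * n / total) for aa, n in counts.most_common()]

# First 28 residues of the Na+/K+-ATPase sequence, for illustration only
demo = "MGKGVGRDKYEPAAVSEQGDKKGKKGKK"
for aa, n, pct in amino_acid_frequencies(demo)[:5]:
    print(f"{aa}\t{n}\t{pct:.2f}%")
```

Pasting the full sequence in place of `demo` gives the total length (the denominator) and the per-residue counts in one go.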

- How many protein sequence homologs are there for your protein? Hint: Use Uniprot’s BLAST tool to search for homologs.

250 found in UniProtKB!

The protein is highly conserved in primates besides humans, such as chimpanzees, orangutans, and baboons. This means that the Na⁺/K⁺-ATPase enzyme is so essential it has barely changed over millions of years of evolution!

https://www.uniprot.org/blast/uniprotkb/ncbiblast-R20260311-160929-0627-30578154-p1m/overview

- Does your protein belong to any protein family?

This protein belongs to the P-type ATPase family ❤️ (Huang et al. 2024).

https://pmc.ncbi.nlm.nih.gov/articles/PMC11168508/

3. Identify the structure page of your protein in RCSB

Using RCSB, I typed in Sodium-Potassium Pump and ended up on the “Crystal structure of the sodium - potassium pump in the E2.2K+.Pi state” for an organism which turned out to be Squalus acanthias, or dogfish. They are so cute. Anyhow I tried again by entering the protein sequence (it helps to be more specific!) and found the entry for Homo sapiens: “Cryo EM structure of a Na+-bound Na+,K+-ATPase in the E1 state.”

Dogfish https://www.rcsb.org/structure/2ZXE

Human https://www.rcsb.org/structure/7E1Z#entity-1

When was the structure solved? Is it a good quality structure? Good quality structure is the one with good resolution. Smaller the better (Resolution: 2.70 Å)

This particular structure was

  • deposited on the 4th of February 2021

  • released/solved on the 15th of June 2022

It has a resolution of 3.20 Å - a fair to good quality, which is respectable for a massive, complex membrane protein like the Sodium-Potassium Pump.

Are there any other molecules in the solved structure apart from protein?

Yes there are

  • Sodium ions (Na)
  • a Magnesium ion (Mg)
  • Cholesterol Hemisuccinate (Y01)
  • Phospholipids (PC1): 1,2-diacyl-sn-glycero-3-phosphocholine
  • Sugar (NAG): 2-acetamido-2-deoxy-beta-D-glucopyranose

- Does your protein belong to any structure classification family?

It is a membrane protein (Hydrolase / Transport Protein).

4. Open the structure of your protein in any 3D molecule visualization software:

- Visualize the protein as “cartoon”, “ribbon” and “ball and stick”.

Cartoon

Ribbon

Ball and Stick

Got a bit carried away . . .

Mesh

Dots

- Color the protein by secondary structure. Does it have more helices or sheets?

  • α-helices = red
  • β-sheets = yellow
  • loops connecting the two = green

As you can see, there are abundantly more helices. This serves a structural function as the helices act as pillars in the pump.

- Color the protein by residue type. What can you tell about the distribution of hydrophobic vs hydrophilic residues?

  • Orange = hydrophobic
  • Blue = hydrophilic

The distribution of protein residues follows a pattern which is much easier to see in the photo below this one. The orange hydrophobic residues are more concentrated in the center of the protein, as these form a “transmembrane belt” which interacts with the phospholipid bilayer of a cell. Conversely, the blue hydrophilic residues are located on the exterior surfaces, as these are exposed to the aqueous environments of the cytoplasm and extracellular fluid.

- Visualize the surface of the protein. Does it have any “holes” (aka binding pockets)?

Perspective 1

Perspective 2

Perspective 3

Yes. It may be hard to tell, but these are largely located in the transmembrane domain where the orange hydrophobic residues are. These “holes” are the binding pockets that hold the sodium ions in the pump.

💪🏼 Part C. Using ML-Based Protein Design Tools

C1. Protein Language Modeling

1. Deep Mutational Scans

a. Use ESM2 to generate an unsupervised deep mutational scan of your protein based on language model likelihoods.

Mutation Scan Heatmap of ATPase

Mutation Scan Heatmap of ATPase (first 400 residues)

b. Can you explain any particular pattern? (choose a residue and a mutation that stands out)

The first image is a beautiful Mutation Scan Heatmap of ATPase which is 1023 residues long! It reminded me of the beautiful tartan skirts of Vivienne Westwood and Chopova Lowena, though the latter I cannot support as she abuses her interns and her garments are bad quality 😶‍🌫️ I digress . . .

In this image there are a few vertical bands of dark purple. The most prominent is at roughly the residue 370 mark. This likely means that this residue has stayed consistent over thousands or even millions of years of evolution. Altering it would probably disrupt the movement of sodium and potassium ions (hence the low score).

However, large swathes of the map are yellow and green, and a significant portion is even blue. The lighter areas are probably variable loops which deal with the surface of the protein and not the main sodium and potassium pump mechanism. The protein is able to function even if these residues experience mutation.

Vivienne Westwood Wool Miniskirt! So cute ❤️

The second image is a Mutation Scan Heatmap of the first 400 residues of ATPase. I did this as the protein is very large and its size kept causing Colab to glitch or warn me that I was running out of memory. Scanning it initially took 10+ minutes and my Mac started to heat up, so I just “trimmed” ATPase to be able to do some modelling.

I chose residue 54 on the heatmap which shows a column with a lot of dark purple (score ≈ 10). This indicates that the position is vital for the ATPase’s function/structural integrity. A mutation here likely disrupts the ion binding mechanism, making the sequence highly improbable in nature. This residue can likely only fit a very specific amino acid.
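
To make the heatmap colours more concrete, here is a rough sketch of how a language-model mutational scan is commonly scored: each substitution gets the log of its predicted probability minus the log-probability of the wild-type residue at that position. Whether the course notebook uses exactly this scoring is an assumption, and the probabilities below are made up for illustration rather than taken from ESM2.

```python
import math

def mutation_scores(position_probs, wild_type):
    """Log-likelihood ratio of every substitution versus the wild-type residue.

    position_probs: one dict per position, mapping residue -> predicted probability
    wild_type: the original sequence, same length as position_probs
    """
    scores = []
    for probs, wt in zip(position_probs, wild_type):
        wt_logp = math.log(probs[wt])
        scores.append({aa: math.log(p) - wt_logp for aa, p in probs.items()})
    return scores

# Toy probabilities (hypothetical, NOT real model output)
toy = [
    {"D": 0.97, "E": 0.02, "A": 0.01},   # strongly constrained position
    {"S": 0.40, "T": 0.35, "A": 0.25},   # tolerant position
]
for pos, row in enumerate(mutation_scores(toy, "DS"), start=1):
    print(pos, {aa: round(s, 2) for aa, s in row.items()})
```

In this toy case every substitution at the first position scores far below zero (a dark column), while substitutions at the second position score close to zero (a light column).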

Latent Space Analysis

a. Use the provided sequence dataset to embed proteins in reduced dimensionality.

Result of First Run

I ran some code to help me isolate ATPase

Let's look closer

Look at the little green blob! 🟢

. . . Another perspective

The Wild West out here!

Mistaken identity

So it found the wrong protein. Time to try again.

Found you maybe

Another mistaken identity

Could this be it?

Found you?

Found you!

Bingo!

b. Analyze the different formed neighborhoods: do they approximate similar proteins?

Above is the ATPase of a fruit fly. Though my ATPase was that of a human, they are both ATPases. This suggests that the model we are using, namely ESM-2, is grouping proteins by their functional neighbourhood rather than simply by which species they come from. So in short yes, the different neighbourhoods that formed do approximate similar proteins.

c. Place your protein in the resulting map and explain its position and similarity to its neighbors.

My 400 residue ATPase segment was clustered with other ATPases. This suggests that the model recognises structural elements that are related to ATPase. It was able to situate my protein in the relevant neighbourhood, even if it gave me the ATPase of a different organism.

C2. Protein Folding

Folding a protein

a. Fold your protein with ESMFold.

ATPase

ATPase first 400 residues

b. Try changing the sequence, first try some mutations, then large segments. Is your protein structure resilient to mutations?

ATPase with 5 mutations

I introduced 5 mutations into the above protein. At the 200 residue mark there is a segment of “LVEVK” which I changed to “EDEVK” to induce a charge reversal to test the protein fold’s stability. As you can see it is pretty stable.

I also deleted a large section of 70 residues and tried to run this but it kept crashing and saying that I had run out of storage on the free tier 😞.

However, from what I have ascertained from the modelling in the above sections, this protein is very stable.

C3. Protein Generation

Inverse-Folding a protein:

Let’s now use the backbone of your chosen PDB to propose sequence candidates via ProteinMPNN

1. Analyze the predicted sequence probabilities and compare the predicted sequence vs the original one.

2. Input this sequence into ESMFold and compare the predicted structure to your original.

💪🏼 Part D. Group Brainstorm on Bacteriophage Engineering

Week 5 HW: Protein Design Part II

Part 1: Generate Binders with PepMLM 🐧

Human SOD1 Sequence from UniProt (154 amino acids): https://www.uniprot.org/uniprotkb/P00441/entry https://www.uniprot.org/uniprotkb/P00441/entry#sequences

Click to view Human SOD1 Sequence
MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
Click to view Human SOD1 Sequence with A4V mutation which causes ALS
MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

I had to manually edit the code as I was not able to change the parameters with the sliders. I was stuck with generating a single 15-residue binder at a time. Thankfully I was able to change this by editing the code behind the form and forcing it to make 4 binders that were 12 residues long.

Here is a table with the binders ranked and compared against a known binder:

| Rank | Peptide Source | Sequence | Pseudo Perplexity |
| --- | --- | --- | --- |
| 1 | Reference (Experimental) | FLYRWLPSRRGG | 2.2833 |
| 2 | PepMLM (Candidate 0) | KLVPAVVLAHKX | 7.4714 |
| 3 | PepMLM (Candidate 1) | KRSYPTALRHWX | 10.1367 |
| 4 | PepMLM (Candidate 2) | WRYPVAABHGK | 11.0383 |
| 5 | PepMLM (Candidate 3) | WHVYVVGLRHKE | 25.8914 |

The perplexity metric measures how perplexed, or “surprised” as it were, a model is by a sequence. Hence a lower score represents higher model confidence that the peptide is a plausible binder for the target; it is not a direct measurement of affinity. Here, the known binder FLYRWLPSRRGG acts as a benchmark, scoring 2.28 on the pseudo perplexity rating, which is significantly lower than the newly generated designs. As you can see, I have ranked the binders in order of their respective perplexity ratings.
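
For reference, pseudo perplexity for a masked language model is usually defined as the exponential of the negative mean log-probability the model assigns to each residue when that residue is masked. The sketch below assumes that standard definition; PepMLM's own implementation may differ in detail, and the input values are hypothetical.

```python
import math

def pseudo_perplexity(token_log_probs):
    """exp(-mean log-probability) over the residues of one peptide.

    token_log_probs: natural-log probability of each true residue,
    obtained by masking that position and asking the model to fill it in.
    """
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# Hypothetical peptide where the model gives every residue a 50% chance
coin_flip = [math.log(0.5)] * 12
print(pseudo_perplexity(coin_flip))
```

A model that is, on average, choosing between two equally likely residues at each position scores about 2, which gives a feel for how confident the model is about the reference binder (2.28) compared with Candidate 3 (25.89).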

Part 2: Evaluate Binders with AlphaFold3 🦐

| Rank | Job Name | ipTM | pTM | Primary Binding Location | Target Engagement |
| --- | --- | --- | --- | --- | --- |
| 1 | SOD1 and KLVPAVVLAHK | 0.58 | 0.82 | N-terminus Groove | High (Pocket) |
| 2 | SOD1 and WHVYVVGLRHKE | 0.49 | 0.81 | Upper β-barrel Ridge | Moderate (Surface) |
| 3 | SOD1 and KRSYPTALRHW | 0.44 | 0.90 | β-barrel Loops | Moderate (Surface) |
| 4 | SOD1 and WRYPVAABHGK | 0.39 | 0.83 | Lower Dimer Interface | Low/Mod (Surface) |
| 5 | SOD1 and FLYRWLPSRRGG (Ref) | 0.26 | 0.81 | Surface Loops | Low (Transient) |

Key

| Confidence Level | pLDDT Range | Emoji | Corresponding Color |
| --- | --- | --- | --- |
| Very High | pLDDT > 90 | 💙 | Dark Blue |
| Confident | 90 > pLDDT > 70 | 🩵 | Light Blue (Cyan) |
| Low | 70 > pLDDT > 50 | 💛 | Yellow |
| Very Low | pLDDT < 50 | 🧡 | Orange |

Protein-peptide complex Models using AlphaFold3 and Residue Alignment Charts (Green)

They are ordered according to their ipTM score, with the first (KLVPAVVLAHK) having the greatest score (0.58), and so on. Please refer to the relevant table above if necessary.

SOD1 & KLVPAVVLAHK

SOD1 & WHVYVVGLRHKE

SOD1 & KRSYPTALRHW

SOD1 & WRYPVAABHGK

SOD1 & FLYRWLPSRRGG - Reference

Alignment: SOD1 & KLVPAVVLAHKX

Alignment: SOD1 & WHVYVVGLRHKE

Alignment: SOD1 & KRSYPTALRHW

Alignment: SOD1 & WRYPVAABHGK

Alignment: SOD1 & FLYRWLPSRRGG - Reference

So the AlphaFold 3 modelling supported the potential of the peptides I designed using PepMLM. Interestingly, all four peptides that I generated using the model outperformed FLYRWLPSRRGG, the known experimental reference, in terms of ipTM (interface confidence). This was despite the reference having the lowest pseudo perplexity score. Candidate 0 (KLVPAVVLAHK) had the greatest ipTM score with 0.58; it also had the lowest pseudo perplexity score of the generated peptides (7.4714). Its high ipTM score suggests that it can dock deeply within the N-terminus groove where the A4V (ALS-causing) mutation of SOD1 sits. The other peptides showed varying affinities for the β-barrel and dimer interface.

The second highest performing binder was Candidate 3 (WHVYVVGLRHKE) - which had an ipTM score of 0.49, though it had the highest pseudo perplexity score of 25.8914. This means that it binds quite well to the mutated SOD1 though it is less likely to be found in nature than the other peptides.

Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse 🪐

The first thing I noticed was that the PeptiVerse Website Interface is so cute and easy to use. Wow! Pranam and his team really popped off with this one 👏🏼 so hats 🎩 🧢 👒 off to them!

So kawaii!

In the search for a peptide that can stabilize the SOD1 protein, a primary target in ALS research, we move from structural modeling in AlphaFold 3 to therapeutic profiling in PeptiVerse. While a 3D model shows us how a peptide looks, the 11 metrics provided by PeptiVerse tell us how it is predicted to behave in a human body. Here is the result of testing the four PepMLM-designed candidates against a known reference binder (FLYRWLPSRRGG) in order of ipTM score (highest to lowest):

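| Metric / Property | KLVPAVVLAHK | WHVYVVGLRHKE | KRSYPTALRHW | WRYPVAABHGK | FLYRWLPSRRGG (Ref) |
| --- | --- | --- | --- | --- | --- |
| 🐊 ipTM (Structural) | 0.58 | 0.49 | 0.44 | 0.39 | 0.26 |
| 💧 Solubility | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 🔬 Permeability | 0.242 | 0.143 | 0.849 | 0.359 | 0.862 |
| 🩸 Hemolysis | 0.032 | 0.052 | 0.022 | 0.010 | 0.047 |
| 👯 Non-Fouling | 0.285 | 0.297 | 0.549 | 0.480 | 0.666 |
| ⏱️ Half-Life (hrs) | 0.438 | 0.412 | 0.342 | 0.339 | 0.310 |
| 🔗 Binding (pKd) | 5.528 | 5.919 | 5.965 | 5.300 | 5.968 |
| 📏 Length (aa) | 11 | 12 | 11 | 11 | 12 |
| ⚖️ Mol. Weight (Da) | 1174.5 | 1522.8 | 1414.6 | 1166.5 | 1507.7 |
| ⚡ Net Charge (pH 7) | +1.59 | +0.94 | +2.85 | +1.85 | +2.76 |
| 🎯 Isoelectric Point | 10.00 | 8.60 | 11.00 | 9.99 | 11.71 |
| 💦 GRAVY (Hydrophobicity) | 1.02 | -0.38 | -1.44 | -0.73 | -0.71 |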

The results revealed a fascinating trade-off between structural integrity and medicinal viability!

As we know Candidate 0 (KLVPAVVLAHK) had the greatest ipTM score of 0.58, meaning that AlphaFold3 has great confidence in its structural fit to the mutated SOD1. However, based on PeptiVerse’s therapeutic metrics, Candidate 1 (KRSYPTALRHW) comes out on top though it had a lower ipTM score of 0.44.

Why is this?

Candidate 1 (KRSYPTALRHW) had the second lowest pseudo perplexity score of the generated peptides with 10.1367, meaning that it is reasonable to assume that it could occur in nature. Among the generated peptides it achieved the highest predicted affinity score with 5.965 pKd, as well as the best permeability score of 0.849, suggesting that it could actually reach the mutated SOD1 proteins inside cells. It also has the highest positive net charge of all the peptides (including the reference), with a score of +2.85, which could make it well suited to crossing the blood-brain barrier and binding to the negatively charged aggregates of mutant SOD1.

Therefore I would advance Candidate 1 (KRSYPTALRHW).

Part 4: Generate Optimized Peptides with moPPIt 🧼🧽🫧💦🧹🪣

| Run | Sequence | Affinity (pKd) | Solubility | Specificity | Motif Score | Hemolysis |
| --- | --- | --- | --- | --- | --- | --- |
| #1 | RFKCIVKVMVRR | 8.881 | 0.500 | 0.615 | 0.553 | 0.944 |
| #2 | KRLQLYRKKCAE | 7.193 | 0.750 | 0.737 | 0.634 | 0.964 |
| #3 | QRACDYFRDDED | 7.783 | 0.833 | 0.679 | 0.059 | 0.895 |
| #4 | KEKEGPCWESEK | 7.360 | 0.833 | 0.871 | 0.002 | 0.962 |

Part 5: L-Protein Mutants ☣️

Lysis Protein Sequence from UniProt (75 amino acids): https://www.uniprot.org/uniprotkb/P03609/entry

Click to view Lysis Protein Sequence
METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT

DnaJ Protein Sequence from UniProt (376 amino acids): https://www.uniprot.org/uniprotkb/P08622/entry

Click to view DnaJ Protein Sequence
MAKQDYYEILGVSKTAEEREIRKAYKRLAMKYHPDRNQGDKEAEAKFKEIKEAYEVLTDSQKRAAYDQYGHAAFEQGGMGGGGFGGGADFSDIFGDVFGDIFGGGRGRQRAARGADLRYNMELTLEEAVRGVTKEIRIPTLEECDVCHGSGAKPGTQPQTCPTCHGSGQVQMRQGFFAVQQTCPHCQGRGTLIKDPCNKCHGHGRVERSKTLSVKIPAGVDTGDRIRLAGEGEAGEHGAPAGDLYVQVQVKQHPIFEREGNNLYCEVPINFAMAALGGEIEVPTLDGRVKLKVPGETQTGKLFRMRGKGVKSVRGGAQGDLLCRVVVETPVGLNERQKQLLQELQESFGGPTGEHNSPRSKSFFDGVKKFFDDLTR

Week 6 HW: Genetic Circuits Part I

Answer these questions about the protocol in this week’s lab 🥼🧪🔬🧫 :

1. What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?

Phusion High-Fidelity PCR Master Mix contains

  • Phusion DNA Polymerase

→ High-fidelity, thermostable enzyme for fast, robust, and accurate DNA amplification in PCR; it is used particularly for cloning and sequencing

“Phusion is one of the most accurate thermostable polymerases available.” (New England Biolabs Website 2026)

  • Deoxynucleotide Triphosphates (dNTPs)

→ The essential building blocks of DNA; the polymerase incorporates them into the newly synthesised strands

  • Optimised reaction buffer including MgCl₂

→ Premixed solution that is designed to serve as the ideal chemical environment for DNA polymerase to function at maximum efficiency; it acts as a ready-to-use foundation for PCR experiments, thus eliminating tedious optimisation of key components. It does this by maintaining the optimal pH and ionic strength for the enzyme to function. It often contains additives that stabilise the enzyme at high temperatures.

→ MgCl₂ supplies Mg²⁺, an essential cofactor for the enzyme as it enables its catalytic ability. This is due to the fact that the magnesium ions coordinate with the phosphate groups of the dNTPs as well as the active site of the enzyme, thereby facilitating the formation of the phosphodiester bond. It also helps primers anneal to the template DNA by reducing electrostatic repulsion.

“All that is required is the addition of template, primers and water.” (Fisher Scientific Website 2026)

Sources:

New England Biolabs Website

https://www.neb.com/en-gb/products/m0531-phusion-high-fidelity-pcr-master-mix-with-hf-buffer?srsltid=AfmBOootavZNHzBVowCOZ4YrKlqZXzX4yHB3pT_1kN38DhhZEm9E7yBX

Fisher Scientific Website

https://www.fishersci.fi/shop/products/phusion-high-fidelity-pcr-master-mix-gc-buffer/10523288

2. What are some factors that determine primer annealing temperature during PCR?

  • Primer’s melting temperature
  • Primer length
  • Proportion of GC relative to AT (GC content)
  • Primer concentration
  • Ionic strength of the buffer
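
Two of the factors above, primer length and GC content, can be turned into a rough melting temperature estimate with the Wallace rule of thumb (2 °C per A/T and 4 °C per G/C). This is a sketch only: the rule is crude and meant for short oligos, real Tm calculators also account for salt and primer concentration, and the primer below is hypothetical.

```python
def gc_content(primer):
    """Percentage of G and C bases in a primer."""
    seq = primer.upper()
    return 100 * (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(primer):
    """Rule-of-thumb melting temperature in °C: 2 per A/T plus 4 per G/C."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

primer = "ATGGCTAGCAAAGGAGAAG"   # hypothetical primer, for illustration only
print(f"GC content: {gc_content(primer):.1f}%")
print(f"Estimated Tm: {wallace_tm(primer)} °C")
```

The annealing temperature is then chosen relative to the Tm of the two primers; the exact offset depends on the polymerase, so the supplier's recommendation is the thing to follow.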

3. There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.

PCR (Polymerase Chain Reaction)

What it does: Amplifies specific DNA sequences, basically a biological photocopier. ADDITIVE - synthesises DNA. Output: lots of copies of one fragment

How: Uses thermal cycling (95°C→Variable→72°C) and DNA polymerase to exponentially increase the number of copies of a particular DNA sequence

With: Requires primers and dNTPs

Why: Finding and amplifying

When: You need to isolate some DNA out of the whole genome, and also you don’t have a lot of DNA to start with i.e. a trace amount of DNA such as a cheek swab or ancient DNA!

Restriction Enzyme Digest

What it does: Cuts specific DNA sequences, basically biological scissors. SUBTRACTIVE - subtracts DNA. Output: several fragments of varying sizes

How: Uses restriction endonucleases which recognise short, specific DNA sequences and cut the sugar-phosphate backbone at those sites; the reaction is kept at a constant temperature (usually 37°C)

With: Requires specific recognition sites (GAATTC, etc.)

Why: Cutting and checking

When: You want to

  • Manipulate or verify existing high-concentration DNA (i.e. plasmids - you can cut them open and check if desired gene is actually inside by using gel electrophoresis),
  • Do Cloning/Ligation (opening up and inserting gene into a plasmid)
  • Do Genomic mapping - before modern sequencing Restriction Fragment Length Polymorphism (RFLP) was used to compare DNA samples by looking at unique cut patterns
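
The “cutting and checking” idea can be sketched in code: find every recognition site, cut, and list the fragment sizes you would expect to see on a gel. This is a simplified illustration that assumes linear DNA, looks only at the top strand, and ignores sticky ends; the function name and the toy sequence are made up.

```python
def digest_linear(sequence, site="GAATTC", cut_offset=1):
    """Cut a linear sequence at every recognition site and return fragment lengths.

    cut_offset is how many bases into the site the top strand is cut
    (EcoRI cuts G^AATTC, so the offset is 1).
    """
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    edges = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(edges, edges[1:])]

# Toy 47 bp sequence containing two EcoRI sites
toy = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 5
print(digest_linear(toy))   # [11, 26, 10]
```

Two cuts in a linear molecule give three fragments; the same two cuts in a circular plasmid would give only two, which is worth remembering when reading a diagnostic gel.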

Here is that same information tabulated for the sake of easier comparison:

Comparison of PCR and Restriction Enzyme Digests

| Feature | PCR (Polymerase Chain Reaction) | Restriction Enzyme Digest |
| --- | --- | --- |
| Action | Additive: Synthesizes and amplifies DNA. | Subtractive: Cleaves and fragments DNA |
| Analogy | Biological Photocopier | Biological Scissors |
| The “How” | Thermal Cycling: Uses DNA polymerase; cycles through 95°C - 72°C | Isothermal: Uses endonucleases; kept at a constant temperature (usually 37°C) |
| Requirements | Specific Primers and dNTPs. | Specific Recognition Sites (e.g., GAATTC) |
| Output | Millions of copies of one specific fragment | Several fragments of varying sizes |
| Primary Goal | Finding and Amplifying | Cutting and Checking |
| Ideal Scenario | Trace amounts of DNA: Isolating a single gene from a whole genome (e.g., cheek swab or ancient DNA) | High-concentration DNA: Manipulating or verifying plasmids (e.g., cloning/ligation or genomic mapping) |
| Verification | Used to “zoom in” on a needle in a haystack | Used in RFLP to compare DNA samples via unique “cut patterns” on a gel |

Key Takeaways for Lab Application

PCR is your primary tool for creation. It allows you to build a massive amount of DNA from a microscopic starting point by repeatedly synthesizing the target sequence.

Restriction Digests are your primary tool for modification. They allow you to “cut and paste” DNA or verify that a plasmid contains the correct insert by checking fragment sizes via gel electrophoresis.

4. How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?

  • Remove enzymes/buffers using PCR clean up because they might interfere with the Gibson Master Mix
  • Run gel electrophoresis to check if you have the correct DNA fragment/gene
  • Have at least 20 bp overlap
  • Design PCR primers with appropriate tails
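
The overlap requirement in the checklist above can be checked on the computer before any pipetting. The sketch below finds the longest stretch where the end of one fragment matches the start of the next; it assumes both fragments are written 5'→3' on the same strand, and the sequences and function names are hypothetical.

```python
def end_overlap(upstream, downstream):
    """Length of the longest suffix of `upstream` that equals a prefix of `downstream`."""
    a, b = upstream.upper(), downstream.upper()
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def gibson_ready(upstream, downstream, min_overlap=20):
    """True if the two fragments share at least `min_overlap` bp at their junction."""
    return end_overlap(upstream, downstream) >= min_overlap

shared = "ACGTTGCAAGCTTGGATCCA"        # hypothetical 20 bp homology region
frag_a = "TTTTTTTTTT" + shared
frag_b = shared + "GGGGGGGGGG"
print(end_overlap(frag_a, frag_b), gibson_ready(frag_a, frag_b))   # 20 True
```

For a circular assembly the same check also has to pass between the last fragment and the first.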

5. How does the plasmid DNA enter the E. coli cells during transformation?

Heat Shock

This is the process of generating transient pores in the bacterial cell envelope (wall and membrane) through abrupt temperature changes

Electroporation

This is the process of generating transient pores in the bacterial cell envelope (wall and membrane) using a high electrical voltage

Both of these methods cause the cell membrane to “open up”, thus allowing the plasmid to enter the cells via diffusion. After being shocked with either heat or electricity, the E. coli cells are incubated in a nutrient-rich liquid broth such as Super Optimal Broth (SOB) or Lysogeny Broth (LB) at 37°C for about an hour. This period of time allows the cells to recover and begin multiplication. Some of these cells will have the plasmid inside. After this process, the transformed cells are placed on an agar plate with antibiotics. Only the successful transformants which have received the plasmid will survive, as the plasmid carries a gene for antibiotic resistance. Colonies of the E. coli with the plasmid will grow.

After a day or two, you will see the colours expressed by the inserted gene, for example if the plasmid contains GFP.

6. Describe another assembly method in detail (such as Golden Gate Assembly)

I did some research on Golden Gate Assembly. It is a goated assembly method which was published in 2008 by Carola Engler, Romy Kandzia, and Sylvestre Marillonnet, working in Halle, Germany, but has its origins in 1996. I erroneously thought it was named so because it was invented in San Francisco but no - it is called Golden Gate Assembly as it can seamlessly combine multiple DNA fragments, acting as a “golden gate” which allows desired fragments to pass through while restricting unwanted or empty products. This one-pot, one-step cloning method was developed to allow efficient and seamless assembly of multiple DNA fragments using Type IIS restriction enzymes (like BsaI).

It is a scarless method (does not leave any unwanted sequences) that can be used to assemble multiple fragments, even up to 50+ (Thermo Fisher Scientific Website).

Here is the pioneering 2008 paper “A One Pot, One Step, Precision Cloning Method with High Throughput Capability” (Engler et al. 2008):

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0003647

I watched this video to help me understand it better:

https://www.youtube.com/watch?v=NzQdLQ44I7w

Background info:

Golden Gate Assembly is a versatile cloning technique which allows for the seamless assembly of multiple fragments in a single cloning reaction. It leverages the very features of Type IIS restriction enzymes that make them unsuitable for standard restriction enzyme cloning:

  • Restriction site is non-palindromic (Type IIS enzymes recognise and bind to asymmetric DNA sequences)
  • In general sites range from 4-7 nucleotides
  • DNA is cleaved outside of the recognition site!
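The last point is the one the whole method relies on, so here is a minimal Python sketch of it. It assumes BsaI (recognition sequence GGTCTC, cutting 1 base downstream on the top strand and 5 bases downstream on the bottom strand). The function name and the example sequence are hypothetical and only scan the top strand.

```python
BSAI_SITE = "GGTCTC"  # BsaI recognition sequence, top strand, 5'->3'

def bsai_overhangs(seq):
    """Return (site_position, overhang) for every top-strand BsaI site.

    BsaI cuts 1 nt after its site on the top strand and 5 nt after it on the
    bottom strand, so the 4 nt in between become the single-stranded overhang.
    Sites in the reverse orientation are ignored in this sketch.
    """
    seq = seq.upper()
    hits = []
    start = seq.find(BSAI_SITE)
    while start != -1:
        cut_top = start + len(BSAI_SITE) + 1   # index just after the top-strand cut
        overhang = seq[cut_top:cut_top + 4]    # the 4-nt "fusion site"
        if len(overhang) == 4:
            hits.append((start, overhang))
        start = seq.find(BSAI_SITE, start + 1)
    return hits

# Hypothetical example: 2-nt pad, BsaI site, 1-nt spacer, then the fusion site AATG
example = "TTGGTCTCAAATGGCTAGC"
print(bsai_overhangs(example))
```

Because the overhang is read from the bases next to the site rather than from the site itself, you get to choose its sequence freely when you design your parts.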

Golden Gate Assemblies are done using a destination vector and one or more fragments of interest - these can come from multiple organisms. A lot of vectors are available through commercial sources or Addgene. But you can also adapt a vector for Golden Gate Assembly in your lab by making sure:

  • The destination vector has Type IIS restriction sites which flank the desired insert region
  • The sites flank a selectable or screenable gene

Some other things to keep in mind:

  • Different Type IIS enzymes will cleave DNA to produce different length overhangs
  • 4 base overhangs are often referred to as “fusion sites”
  • Type IIS restriction enzymes most commonly used for Golden Gate Assembly are BsaI, BsmBI, and BbsI (all of these create 4 nucleotide overhangs)

The orientation of type IIS enzymes as well as the recognition and fusion sites is critical to the success of Golden Gate Assembly!

In the destination vector, the sites must face away from each other (be outward facing) and flank the region being excised. This means that after cleavage, the recognition sequences are removed from the remaining plasmid backbone, thus making it resistant to further digestion by the restriction enzyme.

The resulting overhangs are not compatible with each other so the plasmid cannot recircularise!

Preparation:

You may use any double stranded DNA fragment for Golden Gate Assembly including plasmids or PCR products. Thankfully a lot of Golden Gate Assembly kits come with plasmids with sequence-verified inserts. PCR makes it simple to convert or accommodate any sequence for the assembly. You must design PCR primers with flanking bases, type IIS recognition sites, and an overhang sequence. This way you can introduce the required recognition site at each end of the plasmid/PCR product.

Our fragment of interest will have Type IIS sites that are inward facing. This means that cuts will occur between the two recognition sites. This way the insert fragment will not contain a Type IIS recognition site after cleavage, so it cannot be digested any further.
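To make the primer layout concrete, here is a rough Python sketch. It assumes BsaI (GGTCTC plus a 1-base spacer before the overhang). The function names, the 2-base pad, the spacer base, the 20-base binding length, and the example overhangs are all placeholders I picked for illustration, not values from a kit manual.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def golden_gate_primers(insert, left_overhang, right_overhang,
                        site="GGTCTC", pad="TT", spacer="A", anneal_len=20):
    """Build forward/reverse primers that add inward-facing Type IIS sites.

    Layout of each primer (5'->3'): pad + site + spacer + overhang + binding region.
    The overhangs are added outside the insert; they are not taken from it.
    """
    insert = insert.upper()
    fwd = pad + site + spacer + left_overhang + insert[:anneal_len]
    rev = (pad + site + spacer + revcomp(right_overhang)
           + revcomp(insert[-anneal_len:]))
    return fwd, rev

# Hypothetical insert and fusion sites
insert = "ATGCCCGGGTTTAAACCCGGGTAA"
fwd, rev = golden_gate_primers(insert, "AATG", "GCTT")
print("Forward:", fwd)
print("Reverse:", rev)
```

The reverse primer carries the reverse complement of the right-hand overhang because it is written 5'->3' on the bottom strand.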

Reaction time baby!

You combine all your components into a single reaction tube:

  • Destination Vector: The backbone plasmid.
  • Inserts: The DNA fragments you want to assemble.
  • Type IIS Restriction Enzyme: (e.g., BsaI or BsmBI) to cut the DNA
  • T4 DNA Ligase: To join the fragments together
  • Reaction Buffer: Containing ATP (required for the ligase)

A thermocycler cycles through different reaction conditions/temperatures:

Digestion (37°C): The optimal temperature for the restriction enzyme to cut the DNA and reveal the 4-bp overhangs.

Ligation (16°C): A temperature at which the short overhangs stay paired long enough for the T4 ligase to seal the phosphodiester backbone.

Annealing (generally between 50°C and 65°C; the specific temperature is determined by your primers): This step belongs to the PCR used to prepare the inserts beforehand, not to the Golden Gate reaction itself. It is the chemical sweet spot which is cool enough to allow the primers to bind the template but warm enough to prevent them from sticking to the wrong places. In the Golden Gate reaction, the complementary 4-bp overhangs anneal during the ligation step.

During the enzyme digestion the type IIS restriction enzyme exposes single stranded ends which are no longer associated with the type IIS recognition sequence. Complementary single stranded overhangs anneal creating the “Golden Gate.” Voila! The annealed overhangs are then covalently closed by DNA ligase. Cleavage with a Type IIS enzyme eliminates any unmodified destination vector that survived through the reaction.

Finally, at 80°C any remaining enzyme activity is inactivated. The plasmid product is now ready for transformation into an appropriate host!
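As a small sketch, the cycling described above can be written down as data and used to estimate how long a run takes. The temperatures come from the text. The hold times and the number of cycles are placeholder assumptions, so check the manual of your enzyme or kit for real values.

```python
# Each step: (name, temperature in °C, minutes).
# Hold times and cycle count are placeholder assumptions, not protocol values.
CYCLE = [("digestion", 37, 5), ("ligation", 16, 5)]
N_CYCLES = 30
FINAL_STEPS = [("heat inactivation", 80, 20)]

def total_minutes(cycle, n_cycles, final_steps):
    """Total run time: the repeated cycle plus the one-off final steps."""
    per_cycle = sum(minutes for _, _, minutes in cycle)
    return per_cycle * n_cycles + sum(minutes for _, _, minutes in final_steps)

for name, temp, minutes in CYCLE:
    print(f"x{N_CYCLES}: {name} at {temp}°C for {minutes} min")
for name, temp, minutes in FINAL_STEPS:
    print(f"x1: {name} at {temp}°C for {minutes} min")
print("Total:", total_minutes(CYCLE, N_CYCLES, FINAL_STEPS), "min")
```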

7. Explain the other method in 5 - 7 sentences plus diagrams (either handmade or online).

Golden Gate Assembly TLDR:

  1. Design primers. Choose your DNA fragments. Make sure they are “clean” (no internal Type IIS sites) and have Type IIS restriction sites which flank the desired insert region.
  2. Put all of your components into a reaction tube. This includes the vector, the inserts, the Type IIS enzyme, T4 ligase, and reaction buffer.
  3. The Type IIS enzyme cuts outside its recognition sequence, creating unique 4-bp sticky ends that dictate the exact order of assembly.
  4. The thermocycler should alternate between digestion (37°C) and ligation (16°C) steps over repeated cycles.
  5. As correct assemblies form, they lose the restriction sites and become “immune” to further cutting, while original vectors are repeatedly re-cut.
  6. The mix is heated to 80°C to stop all enzyme activity. This temperature will denature the Type IIS restriction enzyme and the ligase. This prevents the Type IIS restriction enzyme from cutting up the DNA or the ligase from sticking to the DNA.
  7. The plasmid can now be transformed into the desired host.
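Step 3 says that the 4-bp sticky ends dictate the order of assembly. Here is a minimal Python sketch of that idea: each part is described only by its left and right overhang, and the order is found by chaining matching overhangs from one side of the opened vector to the other. The part names and overhang sequences are made up for illustration.

```python
def assembly_order(vector, parts):
    """Chain parts by matching overhangs across the opened vector.

    vector: (first_overhang, last_overhang) left on the backbone after cutting;
            the two must differ, otherwise the backbone could close on itself.
    parts:  dict of name -> (left_overhang, right_overhang)
    """
    first, last = vector
    if first == last:
        raise ValueError("vector overhangs must be different")
    by_left = {}
    for name, (left, _right) in parts.items():
        if left in by_left:
            raise ValueError(f"overhang {left} is used by two parts")
        by_left[left] = name
    order, current = [], first
    while current != last:
        name = by_left.get(current)
        if name is None or name in order:
            raise ValueError(f"no unused part starts with overhang {current}")
        order.append(name)
        current = parts[name][1]
    if len(order) != len(parts):
        raise ValueError("some parts are not incorporated")
    return order

# Hypothetical three-part assembly, deliberately listed out of order
parts = {
    "terminator": ("TTCG", "GCTT"),
    "promoter": ("AATG", "AGGT"),
    "cds": ("AGGT", "TTCG"),
}
print(assembly_order(("AATG", "GCTT"), parts))
```

Whatever order the parts are pipetted in, only one arrangement satisfies all the overhang matches, which is why everything can go into one tube.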

8. Model this assembly method with Benchling or Asimov Kernel!

Week 7 HW: Genetic Circuits Part II

Summary: "Cancer will ultimately be a chronic disease." - Rob Weiss

I have to say this was one of my favourite lectures.

Assignment Part 1: Intracellular Artificial Neural Networks (IANNs) 📟

1. What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?

Boolean functions are binary, representing either on or off, whereas the logic of IANNs (Intracellular Artificial Neural Networks) is more nuanced and geared towards more brain-like information processing. They can detect not merely whether a signal is present or not, but how strong it is. This is important because, in short, biology is messy!

Advantages of IANNs include:

  • Utilising analog as opposed to digital signalling: Whereas a binary system requires a precise threshold to signal either yes or no, IANNs operate using graded continuous intracellular signals. This allows for a greater degree of control over gene expression.

  • Robustness to noise: Boolean circuits are more brittle. IANNs can handle the stochastic noise of biology, which is essential within cellular environments

  • Reduced metabolic burden: IANNs can conduct complex decision-making with fewer genetic components than traditional genetic circuits

  • Complex Signal Capabilities: IANNs are able to weight and combine multiple inputs at the same time. This is comparable to neural networks, as opposed to simple AND/OR logic gates.

  • Greater Complexity: IANNs can carry out non-linear functions and adapt to protean environments. This leads to greater scalability for complex biological tasks.
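To illustrate the difference between a Boolean gate and a weighted, graded unit, here is a purely conceptual Python sketch. It is not a biochemical model: the threshold, weights, and bias are arbitrary numbers I chose, and the sigmoid simply stands in for a smooth dose-response curve.

```python
import math

def boolean_and(x1, x2, threshold=0.5):
    """Digital gate: output is 1 only if both inputs exceed a fixed threshold."""
    return int(x1 > threshold and x2 > threshold)

def perceptron(inputs, weights, bias):
    """Analog unit: weighted sum of graded inputs passed through a sigmoid."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# Sweep the first input while holding the second at 0.6 (arbitrary units)
for x1 in (0.2, 0.6, 0.9):
    gate = boolean_and(x1, 0.6)
    unit = perceptron([x1, 0.6], weights=[4.0, 4.0], bias=-4.0)
    print(f"x1={x1}: boolean={gate}, perceptron={unit:.2f}")
```

The gate only ever reports 0 or 1, while the perceptron output keeps rising as the input gets stronger, which is the "how strong is the signal" behaviour described above.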

2. Describe a useful application for an IANN; include a detailed description of input/output behavior, as well as any limitations an IANN might face to achieve your goal.

In the lecture Rob Weiss mentioned IANNs could be used for cancerous cell detection. I keep getting ill these days so I think an IANN could be used for oxidative stress detection.

3. Below is a diagram depicting an intracellular single-layer perceptron where the X1 input is DNA encoding for the Csy4 endoribonuclease and the X2 input is DNA encoding for a fluorescent protein output whose mRNA is regulated by Csy4. Tx: transcription; Tl: translation.

Draw a diagram for an intracellular multilayer perceptron where layer 1 outputs an endoribonuclease that regulates a fluorescent protein output in layer 2.

Assignment Part 2: Fungal Materials 🍄

1. What are some examples of existing fungal materials and what are they used for? What are their advantages and disadvantages over traditional counterparts?

2. What might you want to genetically engineer fungi to do and why? What are the advantages of doing synthetic biology in fungi as opposed to bacteria?

Assignment Part 3: First DNA Twist Order 🧬

Review Part 3: DNA Design Challenge of the week 2 homework. Design at least 1 insert sequence and place it into the Benchling/Kernel/Other folder you shared in the Google Form above. Document the backbone vector it will be synthesized in on your website.

Week 9 HW: Cell Free Systems

General Homework Questions

1. Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.

Cell-free protein synthesis leverages biology as an engineering tool (Kate). Living cells require a lot of resources, such as the correct amounts of

  • Water
  • Energy
  • Gases
  • Pressure
  • Heat
  • Equipment to sustain them

and they produce a lot of waste (Peter). Without a living cell to keep alive, the reaction is open: you can add or change components (DNA template, salts, energy source) directly and control their concentrations. With cell-free, you are able to freeze-dry systems and store them for up to a year. It has the advantages of

  • Transportability
  • Less need for cold chain distribution
  • Therapeutics on demand → just add water!
  • Rapid manufacturing
  • Greater biosafety: you don’t have to worry about living cells getting about

Cell free is more beneficial in

  1. Space
  2. Places where supply chains are not as strong/resources are scarce, e.g. in developing countries

The military was also mentioned . . .

Applications of cell-free

  • Synthetic Biology: Designing and testing biological circuits or pathways without cellular constraints
  • Protein Engineering: Rapid protein production and screening, especially for proteins that are toxic or hard to express in cells
  • Metabolic Engineering: Production of high-value chemicals, biofuels, and pharmaceuticals via synthetic pathways
  • Biosensing: Creating diagnostic tools that are portable and easy to use, like paper-based biosensors
  • Gene Editing Research: Testing CRISPR-based systems or genome editing tools in a controlled environment

2. Describe the main components of a cell-free expression system and explain the role of each component.

Whole cell extract including:

  • Ribosomes
  • tRNA
  • aminoacyl-tRNA synthetases
  • translation factors
  • RNA polymerase*

The cell extract provides the necessary machinery, as it were, for translation.

DNA Template such as a plasmid or a linear PCR product

This functions as the blueprint for the desired protein that we will be producing, as it contains the coding sequence for it. Upstream of this sequence is a strong promoter such as a T7 promoter, which the RNA polymerase will recognise in order to begin transcription.

Source of energy i.e. ATP and GTP and also phosphoenolpyruvate and pyruvate kinase (for regeneration of ATP from ADP)

The process of protein synthesis is very energy-intensive; and as there is not a living cell metabolism to make use of, energy must be supplied in an external form.

Amino acids

These function as the fundamental building blocks of the protein. The ribosomes inside the whole cell extract will link these together according to the DNA template!

Nucleotides (NTPs)

These will serve as the fundamental building blocks for the mRNA strand.

Salts and organic molecules

  • Magnesium is necessary for stabilising the ribosome structure and the polymerase activity
  • Potassium is necessary for maintaining the right ionic strength which allows protein folding and enzyme activity
  • A buffer, such as HEPES, is necessary for maintaining a stable pH so that the enzymes do not denature

Chaperones and Protease inhibitors

The chaperones help the complex proteins to fold properly, while the protease inhibitors prevent the breakdown of the protein as it is being made.

3. Why is energy provision regeneration critical in cell-free systems? Describe a method you could use to ensure continuous ATP supply in your cell-free experiment.

Energy provision regeneration is critical in cell-free systems because there is no living cell metabolism. There are no mitochondria so you don’t have the normal mechanisms of energy production.

One of the most common methods of ensuring a continuous ATP supply is by using PEP (phosphoenolpyruvate) and PK (pyruvate kinase). PEP is a high-energy phosphate donor, whilst PK is an enzyme which catalyses the transfer of the phosphate group from PEP to ADP - the products of which are pyruvate and ATP.

(Lian et al. 2014)

Mechanisms of such:

  1. You add a high-energy phosphate donor (PEP) and an enzyme (pyruvate kinase) to your reaction mix
  2. The ribosomes in the whole cell extract use ATP, converting it to ADP
  3. The pyruvate kinase enzyme takes a phosphate group from the PEP and sticks it back onto the ADP
  4. This recreates ATP, allowing the cycle to continue until the PEP supply is exhausted
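The cycle above can be sketched as a toy bookkeeping model in Python. This is only an illustration of the logic (ATP is spent, then topped back up from PEP until the PEP runs out). All quantities are in arbitrary units and the numbers are invented, not measured rates.

```python
def powered_steps(atp, pep, use_per_step, regen_per_step, max_steps=10_000):
    """Count how many steps translation can be fully powered.

    Each step: the extract spends ATP (ATP -> ADP), then pyruvate kinase
    moves phosphate from PEP back onto ADP (ADP -> ATP) while PEP lasts.
    """
    adp = 0
    steps = 0
    while steps < max_steps and atp >= use_per_step:
        atp -= use_per_step                      # ribosomes etc. consume ATP
        adp += use_per_step
        regen = min(adp, pep, regen_per_step)    # limited by ADP, PEP, and enzyme rate
        atp += regen
        adp -= regen
        pep -= regen
        steps += 1
    return steps

print("Without PEP:", powered_steps(atp=10, pep=0, use_per_step=2, regen_per_step=2))
print("With PEP:   ", powered_steps(atp=10, pep=50, use_per_step=2, regen_per_step=2))
```

With these made-up numbers the reaction runs several times longer when PEP is supplied, and it stops shortly after the PEP is exhausted.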

Tangent I went on . . .

I wondered why they didn’t just put mitochondria into the whole cell extract. I found out that it is because mitochondria are delicate. But I was confused because I read that there are free floating mitochondria in human blood.

But in blood they are wrapped in cell membranes or extracellular vesicles which act like protective bubbles, thus shielding them from the harsh plasma environment and keeping their internal chemistry stable. Also most of these free floating mitochondria come from platelets as a stress response or to help other cells. So they are quite delicate, in that they require intact membranes to maintain the proton gradient (electrical charge) across their inner membrane which causes ATP synthesis. With any membrane damage, they lose their function.

Also the mitochondria need oxygen to function and this can be used up quickly in a microcentrifuge tube. Ok they are quite precious I see.

I also asked Juan Diego and he told me that if you were to introduce mitochondria into cell-free you would just be complicating matters because as endosymbiotic theory posits, mitochondria are the descendants of free-living bacteria. Hence, you would kind of be introducing bacterial DNA into cell-free. You should just stick to what is strictly necessary essentially.

I remember that mitochondria possess their own DNA which is inherited from the mother. Because of this scientists came up with the concept of “mitochondrial Eve”, the most recent common ancestor of all living humans along the maternal line.

4. Compare prokaryotic versus eukaryotic cell-free expression systems. Choose a protein to produce in each system and explain why.

Prokaryotic cell-free systems

  • Use E.coli

Protein synthesis is

  • High yield
  • Rapid
  • Low-cost

Ideal for simple protein synthesis and high throughput screening

Eukaryotic cell-free systems

  • Use rabbit reticulocytes (immature red blood cells), wheat germ, HeLa (Henrietta Lacks’s cervical cancer cells), Chinese Hamster Ovary cells

Protein synthesis is

  • Lower yield
  • Slower
  • Higher-cost

Ideal for complex folding, post-translational modifications, production of membrane proteins

Source: Cell-Free Protein Synthesis: Pros and Cons of Prokaryotic and Eukaryotic Systems (Zemella et al. 2015)

In a prokaryotic cell-free system I would synthesise GFP as it is a robust, single-domain protein consisting of 238 amino acids, which does not require enzyme-driven post-translational modifications, i.e. the addition of sugars, in order to function.

In a eukaryotic cell-free system I would synthesise human erythropoietin (EPO). It is a glycoprotein hormone that is mainly produced in the kidneys. Though it is composed of relatively fewer amino acids at 165, it must undergo glycosylation - that is the addition of sugar chains - in order for it to be biologically active and stable in the human body. Around 40% of its total weight consists of sugar chains/carbohydrates (Jelkmann 2013). A prokaryotic cell-free system using E. coli, for example, cannot perform glycosylation - it must be engineered to do so. Without this engineering, it would produce a protein without the sugar chains, rendering it unusable in a medical context. Eukaryotic cell-free systems that retain endoplasmic reticulum-derived vesicles can use the enzymes inside them to add these sugar groups, thereby making sure that EPO is folded correctly into its complex tertiary shape.

In case you are wondering . . . GFP and EPO are both single polypeptide chains, so each has a tertiary rather than a quaternary structure (although GFP can pair up into dimers at high concentrations). The difference that matters here is that, as I explained, EPO requires post-translational modifications.

5. How would you design a cell-free experiment to optimize the expression of a membrane protein? Discuss the challenges and how you would address them in your setup.

The hydrophobic element of membrane proteins poses one of the greatest challenges to designing an experiment with them. In a cell-free extract they may misfold or aggregate as there is no lipid membrane for them to sit in. Normally, the phospholipid bilayer stabilises their transmembrane domains.

To overcome this we would have to add a synthetic membrane to the cell-free system to mimic the normal environment of membrane proteins. For this we can add nanodiscs, which help keep the protein soluble, or liposomes, which will essentially “house” the membrane proteins.

In addition to this we can add a lot of protein chaperones, which assist newly synthesised proteins in folding into their correct functional 3D shape.

6. Imagine you observe a low yield of your target protein in a cell-free system. Describe three possible reasons for this and suggest a troubleshooting strategy for each.

There are a few reasons as to why you may experience a low yield of your target protein. These include:

Energy/Nutrient Depletion and Phosphate Byproduct Accumulation: Cell-free protein synthesis requires large amounts of the energy molecules ATP and GTP. In addition to this, standard systems often rely on secondary energy sources such as phosphoenolpyruvate (PEP) or creatine phosphate. As these are used up, inorganic phosphate builds up. This can inhibit the reaction and change the pH, thereby stalling translation.

Troubleshooting Strategy: Secondary Energy Source. Shift to a secondary energy source which does not produce byproducts that can inhibit the reaction. Good examples include pyruvate or glucose, which regenerate ATP more “cleanly.”

Troubleshooting Strategy: Dialysis. Run the reaction in a dialysis cassette submerged in a large reservoir of “feeding buffer.” This will enable the “fresh” substrates to diffuse in and the toxic phosphate to diffuse out, thereby achieving continuous exchange.

Template Issues: You may have a template with a suboptimal purity level which inhibits the reaction, or you may encounter degradation by endonucleases/exonucleases in the lysate. This is particularly common in cell-free systems derived from E. coli or wheat germ. What happens is that endogenous nucleases essentially “chew up” your DNA or mRNA template before the ribosomes have had the time to complete protein synthesis (translation).

Troubleshooting Strategy: Use a circular template. It would be best to use a plasmid as opposed to linear PCR products, as this will protect against the action of exonucleases.

Troubleshooting Strategy: Add RNase inhibitors or small molecule stabilisers. This will protect the mRNA.

Troubleshooting Strategy: Protection or chemical modification of linear DNA. This can be achieved by adding the GamS protein (which inhibits RecBCD) or by having phosphorothioate bonds at the ends of your primers to make the DNA “indigestible” to the nucleases.

Protein Folding and Solubility Issues: Low yields may not be due to a lack of synthesis but rather because the protein may aggregate or form inclusion bodies after leaving the ribosome. Cell-free extracts often lack the chaperones or the specific redox environment (for disulfide bonds) found in a living cell. Otherwise you may be using codons poorly represented in the extract (e.g., using a eukaryotic protein gene in a bacterial lysate).

Troubleshooting Strategy: Adjust temperature. Lowering the reaction temperature (e.g., from 37°C to 25°C or even 16°C) slows down synthesis, which gives the protein more time to fold correctly.

Troubleshooting Strategy: Add Chaperones/Detergents. Supplementing the reaction with purified chaperones (like DnaK/J or GroEL/ES) or mild detergents (like Brij-35) helps to keep hydrophobic patches from sticking together.

Troubleshooting Strategy: Redox Shifting. If your protein requires disulfide bonds, add a mixture of reduced and oxidized glutathione (GSH/GSSG) to simulate the environment of the endoplasmic reticulum.

Troubleshooting Strategy: Check codon usage by making sure the template is optimised for the host system.

Suboptimal reaction conditions

Troubleshooting Strategy: Optimise your CFPS system by checking that the Mg and K concentrations are carefully balanced and that the reaction pH is correct.

Homework questions from Kate Adamala

Design an example of a useful synthetic minimal cell as follows:

1. Pick a function and describe it. a. What would your synthetic cell do? What is the input and what is the output?

b. Could this function be realized by cell-free Tx/Tl alone, without encapsulation?

c. Could this function be realized by genetically modified natural cell?

d. Describe the desired outcome of your synthetic cell operation.

2. Design all components that would need to be part of your synthetic cell. a. What would be the membrane made of?

b. What would you encapsulate inside? Enzymes, small molecules.

c. Which organism your Tx/Tl system will come from? Is bacterial OK, or do you need a mammalian system for some reason? (hint: for example, if you want to use small molecule modulated promotors, like Tet-ON, you need mammalian)

d. How will your synthetic cell communicate with the environment? (hint: are substrates permeable? or do you need to express the membrane channel?)

3. Experimental details

a. List all lipids and genes. (bonus: find the specific genes; for example, instead of just saying “small molecule membrane channel” pick the actual gene.)

b. How will you measure the function of your system?

Homework questions from Peter Nguyen

Freeze-dried cell-free systems can be incorporated into all kinds of materials as biological sensors or as inducible enzymes to modify the material itself or the surrounding environment. Choose one application field — Architecture, Textiles/Fashion, or Robotics — and propose an application using cell-free systems that are functionally integrated into the material. Answer each of these key questions for your proposal pitch:

1. Write a one-sentence summary pitch sentence describing your concept.

2. How will the idea work, in more detail? Write 3-4 sentences or more.

3. What societal challenge or market need will this address?

4. How do you envision addressing the limitation of cell-free reactions (e.g., activation with water, stability, one-time use)?

Homework questions from Ally Huang

Freeze-dried cell-free reactions have great potential in space, where resources are constrained. As described in my talk, the Genes in Space competition challenges students to consider how biotechnology, including cell-free reactions, can be used to solve biological problems encountered in space. While the competition is limited to only high school students, your assignment will be to develop your own mock Genes in Space proposal to practice thinking about biotech applications in space!

For this particular assignment, your proposal is required to incorporate the BioBits® cell-free protein expression system, but you may also use the other tools in the Genes in Space toolkit (the miniPCR® thermal cycler and the P51 Molecular Fluorescence Viewer). For more inspiration, check out https://www.genesinspace.org/ .

1. Provide background information that describes the space biology question or challenge you propose to address. Explain why this topic is significant for humanity, relevant for space exploration, and scientifically interesting. (Maximum 100 words)

2. Name the molecular or genetic target that you propose to study. Examples of molecular targets include individual genes and proteins, DNA and RNA sequences, or broader -omics approaches. (Maximum 30 words)

3. Describe how your molecular or genetic target relates to the space biology question or challenge your proposal addresses. (Maximum 100 words)

4. Clearly state your hypothesis or research goal and explain the reasoning behind it. (Maximum 150 words)

5. Outline your experimental plan - identify the sample(s) you will test in your experiment, including any necessary controls, the type of data or measurements that will be collected, etc. (Maximum 100 words)

Homework Part B: Individual Final Project

Here is my final project page

Week 10 HW: Imaging and Measurement

Waters Part I — Molecular Weight 💦

We will analyze an eGFP standard on a Waters Xevo G3 QTof MS system to determine the molecular weight of intact eGFP and observe its charge state distribution in the native and denatured (unfolded) states. The conditions for LC-MS analysis of intact protein cause it to unfold and be detected in its denatured form (due to the solvents and pH used for analysis).

1. Based on the predicted amino acid sequence of eGFP (see below) and any known modifications, what is the calculated molecular weight? You can use an online calculator like the one at https://web.expasy.org/compute_pi/

GFP Sequence:

MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH

Note: This contains a His-purification tag (HHHHHH) and a linker (the LE before it).

MW = 28006.60 - 20.04 Da (chromophore maturation)

∴ MW = 27986.56 Da

2. Calculate the molecular weight of the eGFP using the adjacent charge state approach described in the recitation.

Select two charge states from the intact LC-MS data (Figure 1) and: a. Determine z for each adjacent pair of peaks

Peak 1: 1000.4302

Peak 2: 966.0390

Peak 1 z = +28

Peak 2 z = +29

b. Determine the MW of the protein

MW of protein is 27983.82

Peak 1 Calculation Peak 2 Calculation
Above: Side-by-side comparison of the eGFP molecular weight calculations for Peak 1 ( z = +28) and Peak 2 (z = +29). Both results consistently show a mass of approximately 27,985 Da!
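For anyone who wants to re-run the numbers, here is a short Python sketch of the adjacent charge state approach using my two peaks. It assumes positive-mode ions where each charge is one added proton (taken here as 1.00728 Da), and that the lower m/z peak carries exactly one more charge than the higher one. The function names are my own.

```python
PROTON = 1.00728  # mass of one proton in Da

def charge_of_higher_mz_peak(mz_high, mz_low):
    """Charge z of the higher-m/z peak, assuming the lower-m/z peak carries z + 1.

    From M = z * (mz_high - PROTON) = (z + 1) * (mz_low - PROTON).
    """
    return (mz_low - PROTON) / (mz_high - mz_low)

def neutral_mass(mz, z):
    """Neutral mass M from m/z = (M + z * PROTON) / z."""
    return z * (mz - PROTON)

peak1, peak2 = 1000.4302, 966.0390
z1 = round(charge_of_higher_mz_peak(peak1, peak2))
z2 = z1 + 1

print("z for peak 1:", z1, "| z for peak 2:", z2)
print(f"MW from peak 1: {neutral_mass(peak1, z1):.2f} Da")
print(f"MW from peak 2: {neutral_mass(peak2, z2):.2f} Da")
```

This gives about 27983.8 Da from peak 1 and about 27985.9 Da from peak 2. The small difference from my hand calculation for peak 1 comes from the exact value used for the proton mass.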

c. Calculate the accuracy of the measurement using the deconvoluted MW from 2.2 and the predicted weight of the protein from 2.1

Accuracy error when using 28006.60 Da value = 0.0813%

Accuracy error when accounting for loss of 20 Da (27986.56) = 0.01%

Accuracy Calculation Part 1 Accuracy Calculation Part 2
Above: Initially I made a mistake and used the value of 28006.60 forgetting to account for the loss of Daltons. Here is my working out nevertheless!
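The accuracy figures can be reproduced with a few lines of Python. This is just the arithmetic (measured minus expected, divided by expected). The function names are my own, and I have added the ppm version since that is the unit asked for later in the peptide mapping section.

```python
def percent_error(measured, expected):
    """Absolute error as a percentage of the expected mass."""
    return abs(measured - expected) / expected * 100

def ppm_error(measured, expected):
    """Signed error in parts per million."""
    return (measured - expected) / expected * 1e6

measured = 27983.82              # deconvoluted MW from part b
for expected in (28006.60, 27986.56):
    print(f"expected {expected}: "
          f"{percent_error(measured, expected):.4f}% "
          f"({ppm_error(measured, expected):.1f} ppm)")
```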

3. Can you observe the charge state for the zoomed-in peak in the mass spectrum for the intact eGFP? If yes, what is it? If no, why not?

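No. It is a matter of resolution: at a charge state of around +28 the isotope peaks sit only about 0.036 m/z apart (1/z), so they are not resolved in the zoomed-in peak and their spacing cannot be used to read off the charge.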

This was the image provided.
Waters Part II — Secondary/Tertiary structure 💦 💦

We will analyze eGFP in its native, folded state and compare it to its denatured, unfolded state on a quadrupole time-of-flight MS. We will be doing MS-only analysis (no liquid chromatography, also known as “direct infusion” experiments) on the Waters Xevo G3-QToF MS.

1. Based on learnings in the lab, please explain the difference between native and denatured protein conformations. For example, what happens when a protein unfolds? How is that determined with a mass spectrometer? What changes do you see in the mass spectrum between the native and denatured protein analyses (Figure 2)?

2. Zooming into the native mass spectrum of eGFP from the Waters Xevo G3 QTof MS (see Figure 3), can you discern the charge state of the peak at ~2800? What is the charge state? How can you tell?

Waters Part III — Peptide Mapping - primary structure 💦💦💦

We will digest the eGFP protein standard into peptides using trypsin (an enzyme that selectively cleaves the peptide bond after Lysine (K) and Arginine (R) residues). The resulting peptides will be analyzed on the Waters BioAccord LC-MS to measure their molecular weights and fragmented to confirm the amino acid sequence within each peptide – generating a “peptide map”. This process is used to confirm the primary structure of the protein.

There are a variety of tools available online to calculate protein molecular weight and predict a list of peptides generated from a tryptic digest. We will be using tools within the online resource Expasy (the bioinformatics resource portal of the Swiss Institute of Bioinformatics (SIB)) to predict a list of tryptic peptides from eGFP.
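As a rough cross-check of what the online tools do, here is a simplified in-silico tryptic digest in Python. It assumes trypsin cuts after every K and R with no exceptions (no proline rule, no missed cleavages), so the result may differ slightly from Expasy depending on the settings chosen there. The function name is my own.

```python
def tryptic_digest(seq):
    """Split a protein after every K or R (simplified rule, see assumptions above)."""
    seq = seq.replace(" ", "").upper()
    peptides, current = [], ""
    for residue in seq:
        current += residue
        if residue in "KR":
            peptides.append(current)
            current = ""
    if current:                       # C-terminal peptide without a trailing K/R
        peptides.append(current)
    return peptides

# eGFP sequence as given in Waters Part I (spaces are stripped by the function)
EGFP = (
    "MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL "
    "VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV "
    "NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD "
    "HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH"
)

sequence = EGFP.replace(" ", "")
print("K count:", sequence.count("K"), "| R count:", sequence.count("R"))
print("Number of peptides:", len(tryptic_digest(EGFP)))
```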

1. How many Lysines (K) and Arginines (R) are in eGFP? Please circle or highlight them in the eGFP sequence given in Waters Part I question 1 above. (Note: adding the sequence to Benchling as an amino acid file and clicking biochemical properties tab will show you a count for each amino acid).

2. How many peptides will be generated from tryptic digestion of eGFP?

3. Based on the LC-MS data for the Peptide Map data generated in lab (please use Figure 5a as a reference) how many chromatographic peaks do you see in the eGFP peptide map between 0.5 and 6 minutes? You may count all peaks that are >10% relative abundance

4. Assuming all the peaks are peptides, does the number of peaks match the number of peptides predicted from question 2 above? Are there more peaks in the chromatogram or fewer?

5. Identify the mass-to-charge of the peptide shown in Figure 5b. What is the charge (z) of the most abundant charge state of the peptide (use the separation of the isotopes to determine the charge state). Calculate the mass of the singly charged form of the peptide

6. Identify the peptide based on comparison to expected masses in the PeptideMass tool. What is mass accuracy of measurement? Please calculate the error in ppm.

7. What is the percentage of the sequence that is confirmed by peptide mapping?

Waters Part IV — Oligomers 🐛

Week 11 HW: Bioproduction and Cloudlabs

Week 12 HW: Building Genomes

Week 13 HW: Biodesign & Engineered Living Materials

Week 14 HW: Biodesign & Biofabrication

Subsections of Labs

Week 1 Lab: Pipetting

cover image

Week 9 Lab: Cell-Free Systems

TX/TL Overview with Juan Diego
Practical

Hello Honey!

Step 1

We pipetted the metal beads.

Step 2

Step 3

Step 4

Projects

Final projects:

Subsections of Projects

Individual Final Project

Chosen Final Project

Biotransistor VAD: the Living Heart Sleeve ❤️🫀❤️‍🩹

Here is the link to my Google slide

SECTION 1: ABSTRACT

Traditional VAD

Components of a traditional VAD

This project seeks to engineer a “living” biotransistor in the form of a Ventricular Assist Device (VAD) which would fit around a coronary disease patient’s heart. Traditional VADs function like pumps, are very rigid, and carry the risks of blood clotting or tissue rejection; whereas mine would function less like a pump and behave more like a synthetic biological muscle. The broad objective of this project is to create a 3D printed hydrogel sleeve to wrap around a ventricle. This would eliminate the need for a mechanical rotor, as my biotransistor would host a patient’s living cells, and therefore would be able to detect the heart’s natural rhythm to adjust its pumping action in real time, for example if a patient is walking uphill or is relaxed etc. This hydrogel would be soft and tissue-like, thereby reducing the friction and mechanical stress that titanium pumps place on the heart. Moreover, the likelihood of rejection would be minimised as a patient’s blood cells/immune system would see the Living Heart Sleeve as a living vessel as opposed to a foreign body, thus potentially eliminating the need for life-long blood thinners. Owing to this recognition and mechanical offloading, a patient may experience a regeneration of their own cardiac cells. At the moment nothing like this exists due to the fact that hydrogel transistors are still a pretty nascent technology.


The central hypothesis of this project is that a hydrogel biotransistor will help a defective heart to function better than a traditional VAD. To test this I will test either:

a) The mechanical integration and biocompatibility of a codon-optimised adhesion protein designed and synthesised for expression in an E. coli system (to maximise protein yield)

b) The signal transduction and biosensing ability of a living component using fluorescence (bacterial biosensor)

I will be utilising Benchling and Twist Biosciences to design and synthesise DNA.


The expected outcome is

a) Visible bacterial colony/high cell density on protein-treated hydrogel

b) Fluorescence of bacteria upon detection of an electrical signal

SECTION 2: PROJECT AIMS

Aim 1: Experimental Aim (this project): Use Benchling and Twist Biosciences to design and synthesise DNA to be tested in vitro

The first aim of my final project is to test either:

a) The viability of a codon-optimised adhesion protein to act as a glue between the heart sleeve and the patient’s heart. To do this I will culture engineered bacteria. Two slides will be prepared: the first will have the hydrogel with E. coli which have been genetically engineered to express an adhesion protein, and the second will have the hydrogel with E. coli that lack the adhesion plasmid, which will function as a control.

b) The signal transduction and biosensing ability of E. coli in the lab using sfGFP in a genetic circuit, with fluorescence being the indicator of a positive result. To do this I will genetically engineer E. coli to produce sfGFP in response to distress. I will run a small current through the hydrogel using a battery to simulate the pulse of the heart. This is to test whether the biosynthetic heart sleeve could detect the heart’s natural rhythm and adjust to it in real time.

In order to do this I will be using Benchling and Twist Biosciences to design and synthesise DNA which will be tested in person at LifeFabs Institute.


Protocols

a) Adhesion Proteins

Here is the link to my Benchling for the adhesion protein plasmid

| Component | Start (bp) | End (bp) | Length (bp) | Function | Visual Color |
| --- | --- | --- | --- | --- | --- |
| Promoter (J23100) | 1 | 35 | 35 | Constitutive Promoter: Constant "on" signal. | Periwinkle |
| RBS (B0034) | 36 | 47 | 12 | High-efficiency Ribosome Binding Site. | Turquoise |
| Mam7 Adhesion | 48 | 147 | 100 | Multivalent Adhesion Molecule ("Velcro"). | Yellow |
| Stop Codon (TAA) | 148 | 150 | 3 | Translation Termination. | Orange |
| Terminator (B0015) | 151 | 218 | 68 | Transcription Stop: Prevents read-past. | Pink |
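One way to sanity-check a part map like the one above before ordering DNA is to confirm that each annotated length matches its start and end coordinates, and that the parts sit back to back with no gaps or overlaps. The sketch below is a minimal, assumed example: the `Part` tuple and `check_parts` helper are hypothetical names introduced here, and the coordinates are copied by hand from the table rather than read from Benchling.

```python
from typing import List, NamedTuple


class Part(NamedTuple):
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    length: int  # length as annotated in the table


def check_parts(parts: List[Part]) -> List[str]:
    """Return a list of problems; an empty list means the map is consistent."""
    problems = []
    expected_start = 1
    for part in parts:
        # The annotated length should equal the inclusive span of the coordinates.
        if part.end - part.start + 1 != part.length:
            problems.append(
                f"{part.name}: length {part.length} does not match {part.start}-{part.end}"
            )
        # Each part should begin right after the previous one ends.
        if part.start != expected_start:
            problems.append(
                f"{part.name}: starts at {part.start}, expected {expected_start}"
            )
        expected_start = part.end + 1
    return problems


# Coordinates copied from the adhesion plasmid table (not pulled from Benchling).
ADHESION_PARTS = [
    Part("Promoter (J23100)", 1, 35, 35),
    Part("RBS (B0034)", 36, 47, 12),
    Part("Mam7 Adhesion", 48, 147, 100),
    Part("Stop Codon (TAA)", 148, 150, 3),
    Part("Terminator (B0015)", 151, 218, 68),
]

if __name__ == "__main__":
    issues = check_parts(ADHESION_PARTS)
    print("Part map is consistent" if not issues else "\n".join(issues))
```

The same check could be run on any other part list that uses these columns. It only tests the bookkeeping of the coordinates, not whether the underlying sequences are correct.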

Measuring Output

  • The Setup: Prepare two identical plates with hydrogel. Plate A will be covered with the genetically engineered E. coli designed to express the Mam7 protein, which acts like a sort of biological velcro. Plate B will have E. coli which lack the Mam7 plasmid.
  • The Input: I will rinse both plates with the same volume of saline solution. I will then collect the run-off from each and plate it on agar as plate C (run-off from the Mam7 E. coli) and plate D (run-off from the control). I will then leave these for a day.
  • The Measurement (Qualitative): Are there more bacteria on plate C or D? How many more? What is this in terms of surface area or as a percentage? The plate with fewer colonies represents the stickier bacteria, as more of them were able to stay attached to the original hydrogel plate.

Metric for success: Adhesion Efficiency. If the Mam7 expressing E. coli stay attached at a 5x higher rate than the control under stress, the biological velcro may be deemed a success.
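The success metric above can be made concrete with a small calculation. The sketch below assumes something the protocol does not yet specify: that the number of cells loaded onto each hydrogel plate is known (for instance from a dilution plated beforehand), so the fraction retained after the saline wash can be estimated from the run-off count. All function names and colony counts here are hypothetical placeholders, not results.

```python
def retained_fraction(loaded_cfu: float, runoff_cfu: float) -> float:
    """Estimate the fraction of cells that stayed on the hydrogel after washing."""
    if loaded_cfu <= 0:
        raise ValueError("loaded_cfu must be positive")
    if not 0 <= runoff_cfu <= loaded_cfu:
        raise ValueError("runoff_cfu must be between 0 and loaded_cfu")
    return (loaded_cfu - runoff_cfu) / loaded_cfu


def adhesion_fold_change(
    loaded_mam7: float,
    runoff_mam7: float,
    loaded_control: float,
    runoff_control: float,
) -> float:
    """Ratio of retained fractions: Mam7-expressing cells over control cells."""
    control = retained_fraction(loaded_control, runoff_control)
    if control == 0:
        raise ValueError("control retained nothing; fold change is undefined")
    return retained_fraction(loaded_mam7, runoff_mam7) / control


if __name__ == "__main__":
    # Hypothetical counts: 1000 CFU loaded per plate, run-off counted on plates C and D.
    fold = adhesion_fold_change(
        loaded_mam7=1000, runoff_mam7=400,
        loaded_control=1000, runoff_control=900,
    )
    print(f"Mam7 cells were retained at {fold:.1f}x the control rate")
    print("Meets the 5x threshold:", fold >= 5)
```

If the loaded cell number cannot be measured, the raw run-off counts on plates C and D can still be compared directly, but they would not give a retention rate.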

b) Biosensor Genetic Circuit

Here is the link to my Benchling for the biosensor genetic circuit

ComponentStart (bp)End (bp)Length (bp)FunctionVisual Color
precA Promoter1108108Inducible Input: SOS response trigger.Pink
RBS (B0034)10911911High-efficiency Ribosome Binding Site.Yellow
sfGFP120836717Reporter Output: Green Fluorescent Protein.Lime Green
Stop Codon (TAA)8378393Signals the end of the protein chain.Pale Orange
Terminator (B0015)84090364Transcription Stop: Prevents read-through.Periwinkle

Measuring Output

  • The Setup: Embed the engineered bacteria in the hydrogel
  • The Input: Use a 9V battery or a simple Arduino to send a micro-current through the gel (simulating a heart pulse)
  • The Measurement (Qualitative): Do the bacteria get greener under a UV light after the current is applied?

To quantify this, use a cheap photodiode or a phone app that measures “lux” (light intensity) to see whether the brightness increases over time and by how much, or use flow cytometry.
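To turn photodiode or lux readings into a single number, one option is to express each reading taken after the current is applied as a multiple of the average reading taken beforehand. The sketch below is an assumed approach rather than part of the protocol: the function name is hypothetical, and the readings are placeholder values in arbitrary units.

```python
from statistics import mean
from typing import List, Sequence


def fold_change_over_baseline(
    baseline_readings: Sequence[float],
    post_readings: Sequence[float],
) -> List[float]:
    """Divide each post-stimulus reading by the mean pre-stimulus reading."""
    if not baseline_readings:
        raise ValueError("need at least one baseline reading")
    baseline = mean(baseline_readings)
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return [reading / baseline for reading in post_readings]


if __name__ == "__main__":
    # Placeholder readings (arbitrary units), not measured data.
    before_current = [10, 10]
    after_current = [10, 15, 20]
    folds = fold_change_over_baseline(before_current, after_current)
    for index, fold in enumerate(folds):
        print(f"reading {index}: {fold:.2f}x baseline")
```

Measuring a gel that receives no current in the same way would help to separate a real response from drift in the sensor or ambient light.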

Aim 2: Development Aim: Use bioprinter to fabricate hydrogel biotransistor and test in vivo

Once I have conducted the preliminary experiments outlined above, I would repeat them many times and optimise the materials. For instance, I could design and test many adhesion proteins and rank these according to their stickiness, but also their potential therapeutic viability and cost. Afterwards, I would move on to designing my own viable biotransistor scaffold, that is, the Living Heart Sleeve itself. I would test this many times with mammalian cardiomyocytes. Ideally the scaffold should be thin enough to fit around the heart but robust enough to withstand being inside a body, so finding the right thickness will be a challenge. The end goal would be to bioprint the sleeve and test it with an animal heart (sorry cow or pig ❤️), probably a pig’s, as their hearts are remarkably similar to human ones. Pigs are one of my favourite animals 🐷. I am not sure if it would be inserted into cows/pigs with heart conditions or tested on the hearts of cows/pigs who have already died.

Aim 3: Visionary Aim: Make fully functional heart sleeve and bring about a paradigm shift in cardiac therapeutics (yay!)

The long term vision is to bring about a paradigm shift in VADs and cardiac therapeutics broadly. As I explained above the Living Heart Sleeve has the potential to reduce the risks of blood clotting and tissue rejection, pump and adjust to the patient’s natural heart rhythm, eliminate or at least lessen the need for medication, and also bring about the regeneration of a patient’s heart cells.

I have touched upon this above, but I must reiterate that nothing like this exists as yet. This technology would bypass the need for mechanical rotors made of titanium, as used in traditional VADs, and would be safer in the body.

I hope for the Living Heart Sleeve to be used by cardiac patients with cardiovascular disease and arrhythmia, for instance. I would hope that it would become viable for prophylactic purposes in regular, geriatric, and pediatric contexts, thereby ultimately helping to prevent heart attacks in vulnerable people. This hydrogel biotransistor technology could also have wide-ranging implications for other fields.

SECTION 3: BACKGROUND

Research

Thank you Instagram!

Doomscrolling with a purpose . . .

Very embarrassingly, I came across the news that researchers at the University of Hong Kong (HKU) had created a “soft 3D transistor using hydrogels” on one of those big “shocking news in tech” Instagram accounts which often make viral posts: Here is the reel. I watched the video, and whilst the caption focused on how the transistors mimicked neurons, and commenters bemoaned the fall of humanity due to the potential applications this material would somehow have for AI, I instantly thought to myself that this could be used for the heart in some way, as it contains electrogenic cells. My mind was also drawn to the memory of my own grandfather, who died of a myocardial infarction (heart attack) last year.

Paper 1

Interpenetrated hydrogel transistors, showing their exclusive ability to resemble the spatial 3D structure of neural circuits in the brain, subjected to strain values of 0% and 30%. Video courtesy of HKU

I then read the paper published in Science, Increasing the dimensionality of transistors with hydrogels (Liu et al. 2025), by researchers at the University of Hong Kong (HKU), which demonstrates the feasibility of creating millimetre-thick semiconducting layers using biocompatible hydrogels, and became very interested in developing a Ventricular Assist Device (VAD). The paper is unfortunately behind a paywall and is not even accessible through UCL alumni institutional login 💔, or via SciHub, as it was published after 2021. In fact, it was published on the 20th of November 2025, less than 6 months ago. I had to get my brother at MIT to send me the PDF 🎻.

This paper details a breakthrough in materials science, as scientists at HKU document their ability to construct a 3D hydrogel which can achieve millimetre-scale 3D modulation. Traditional transistors, which form the basis of modern electronics, are 2D and hard, which severely limits their integration into biological systems, whereas hydrogels are soft and biocompatible.

Scalable produced 3D hydrogel semiconductor fibers

Intrinsic stretchability of the hydrogel transistors

3D hydrogel transistors function similarly to organic electrochemical transistors (OECTs); however, OECTs lose functionality as their thickness increases. “In contrast, the linear thickness-capacitance dependency of hydrogel semiconductor remains consistent regardless of thickness (up to millimeters), indicating that complete 3D modulation is achieved in the hydrogel semiconductor” (Liu et al. 2025:826). At HKU, the scientists were able to achieve this modulation, balancing electron conductivity and ion activity whilst mimicking real neuronal connections, by pioneering a double-network hydrogel system, which provides structural stability.

Something which I found particularly interesting was that the researchers used the reservoir computing (RC) framework (a machine learning approach which has minimal computational requirements and can operate with small training datasets) to demonstrate the potential of their 3D hydrogel transistor in the construction of neuromorphic circuits. The system they created achieved a prediction accuracy of up to 91.93%, which is comparable to conventional artificial neural networks! Moreover, this prediction accuracy can be maintained under up to 30% strain applied in any direction (see video above).

The authors also deal with the challenges of scaling up this new technology, stating that they believe the potential of 3D hydrogel semiconductors can only be fully leveraged by enabling efficient production using accessible and low-cost methods such as printing and textile manufacturing. They were able to develop a one-step water-processable fabrication protocol wherein the materials required for assembling the composite hydrogel semiconductor are premixed in a single step. The 3D hydrogel semiconductor can then be formed through a simple cross-linking process. Thankfully the necessary materials (with their sources) and methodology are detailed in a brilliant supplementary document.

Supplementary paper

Modulation Strategy

Transistor up close

Circuit

Intrinsic stretchability of the hydrogel transistors

Paper 2

The second article I read, which incidentally was cited in the first, is another published in Science: Transistor in a tube: A route to three-dimensional bioelectronics (Pitsalidis et al. 2018). Thankfully I was able to access this paper via SciHub which you can access by clicking the previous link. This paper

Innovation

This project consists of something that has never been done before, as it uses novel technology, namely 3D hydrogel biotransistors, to replace traditional cardiac therapeutics. It relies on the electrogenic quality of myocardial cells and is designed to help sufferers of cardiac diseases. Moreover, it constitutes both a new methodology and a new technology as a “living” organ scaffold. In this way the project challenges existing paradigms by bringing into the realm of healthcare a fusion of synthetic biology and engineering which seeks not to fully replace the pumping ability of a diseased heart, but to aid it, whilst helping the regeneration of a patient’s own heart cells, thus endogenously enhancing functionality. At the moment people either get VADs or heart transplants, both of which run the risk of infection or rejection, whilst the waiting list for the latter can be years long.

Impacts

Explain why your project matters and what impact it could have. (Minimum 5 sentences.) Examples of topics to discuss: The problem addressed: What pressing real-world problem does your project attempt to solve? Importance of the problem: Why is this problem significant, or what critical barrier to progress in the field does it represent? Broader societal contribution: How could the outcomes of your project benefit society beyond the immediate research context? Advancement of knowledge or capability: How might the project improve scientific understanding, technical capability, or clinical practice within one or more fields? Field-level change: If your aims are achieved, how could the concepts, methods, technologies, treatments, services, or preventative approaches used in this field of research change?

Ethics

First paragraph: Include what ethical implications are involved in your project. Try to suggest the ethical principle(s) you may apply (e.g. non-maleficence, justice). Second paragraph: Describe the measures that should be taken to ensure that your project is ethical (both in how the research is conducted and in its broader implications for society). You may wish to answer the following questions: What action(s) do you propose? What are potential unintended consequences of your proposed actions? What could you have gotten wrong (e.g., incorrect assumptions and uncertainties)? What are alternatives to your proposed actions? Note: in an NIH proposal, an ethics statement is used to describe the relevance of this research to public health

SECTION 4: EXPERIMENTAL DESIGN, TECHNIQUES, TOOLS, AND TECHNOLOGY
SECTION 5: RESULTS & QUANTITATIVE EXPECTATIONS
SECTION 6: ADDITIONAL INFORMATION
Final Project Ideas

The Tastemaker

Sea Wool/Sea Plastic 🌊🐑

Bioluminescent Sunflowers: Radiation Biosensor 🌻☢️

Lipophilic biomaterial for bioremediation 🪸🐠💙🐋

Chimeric Genetic Circuit: Prophylactic Shield 🛡️

Group Final Project
