Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices

    First, describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project and/or something for which you are already doing in your research, or something you are just curious about. I have worked with the concept of CA before within design and 3d space generative making through creating tools for generating patterns and environments, so it was really fascinating to see it being brought up during class. So, for my idea I’d like to merge my previous digital experience with CA and synthetic biology tooling in a form of a computer aided design tool for spatial synthetic biology

  • Week 2 HW: DNA Read, Write, & Edit

    Part 1: Benchling & In-silico Gel Art Here’s my version of made in Benchling pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks :) .. It’s a frog!!! https://benchling.com/s/seq-E1w2i5oXtxnqS8BWnwcZ?m=slm-mz29OlZAWUwDeJTn4nir I also played around separately to that drawing with Ronan’s tool, and this is what I got ( I wanted to go for something more abstract this time)

  • Week 3 HW: Principles and Practices

    For this lab, I made a little bunny drawing! I used Ronan’s tool as a base and then fixed it to work with colab environment and colours that we have in our node. from opentrons import types import math metadata = { 'protocolName': 'Bunny', 'author': 'HTGAA', 'source': 'HTGAA 2026', 'apiLevel': '2.20' } # Coordinate Data mkate2_points = [(12.1, 29.7),(14.3, 29.7),(16.5, 29.7),(18.7, 29.7),(20.9, 29.7),(23.1, 29.7),(9.9, 27.5),(12.1, 27.5),(16.5, 27.5),(18.7, 27.5),(20.9, 27.5),(23.1, 27.5),(-1.1, 25.3),(1.1, 25.3),(3.3, 25.3),(5.5, 25.3),(7.7, 25.3),(20.9, 25.3),(23.1, 25.3),(-3.3, 23.1),(-1.1, 23.1),(7.7, 23.1),(14.3, 23.1),(20.9, 23.1),(25.3, 23.1),(27.5, 23.1),(-7.7, 20.9),(14.3, 20.9),(20.9, 20.9),(29.7, 20.9),(12.1, 18.7),(16.5, 18.7),(18.7, 18.7),(29.7, 18.7),(-12.1, 16.5),(12.1, 16.5),(14.3, 16.5),(16.5, 16.5),(18.7, 16.5),(20.9, 16.5),(23.1, 16.5),(25.3, 16.5),(27.5, 16.5),(-14.3, 14.3),(12.1, 14.3),(16.5, 14.3),(18.7, 14.3),(25.3, 14.3),(-14.3, 12.1),(27.5, 12.1),(-16.5, 9.9),(27.5, 9.9),(-16.5, 7.7),(29.7, 7.7),(-16.5, 5.5),(-14.3, 5.5),(29.7, 5.5),(-16.5, 3.3),(-14.3, 3.3),(31.9, 3.3),(-14.3, 1.1),(-12.1, 1.1),(31.9, 1.1),(-12.1, -3.3),(34.1, -3.3),(-16.5, -5.5),(-14.3, -5.5),(34.1, -5.5),(-27.5, -7.7),(-25.3, -7.7),(-23.1, -7.7),(-18.7, -7.7),(-16.5, -7.7),(34.1, -7.7),(-27.5, -9.9),(-23.1, -9.9),(-20.9, -9.9),(-18.7, -9.9),(-16.5, -9.9),(-7.7, -9.9),(-5.5, -9.9),(34.1, -9.9),(-27.5, -12.1),(-18.7, -12.1),(-16.5, -12.1),(-14.3, -12.1),(-12.1, -12.1),(-9.9, -12.1),(-7.7, -12.1),(-5.5, -12.1),(34.1, -12.1),(-25.3, -14.3),(-16.5, -14.3),(-14.3, -14.3),(-12.1, -14.3),(-29.7, -16.5),(-27.5, -16.5),(-25.3, -16.5),(-23.1, -16.5),(-12.1, -16.5),(-9.9, -16.5),(31.9, -16.5),(-29.7, -18.7),(-20.9, -18.7),(-29.7, -20.9),(-18.7, -20.9),(-16.5, -20.9),(-14.3, -20.9),(-12.1, -20.9),(27.5, -20.9),(-27.5, -23.1),(-12.1, -23.1),(25.3, -23.1),(27.5, -23.1),(-27.5, -25.3),(-25.3, -25.3),(25.3, -25.3),(-23.1, -27.5),(-20.9, -27.5),(-18.7, -27.5),(-16.5, -27.5),(1.1, -27.5),(3.3, -27.5),(5.5, -27.5),(7.7, -27.5),(23.1, -27.5),(-14.3, -29.7),(-12.1, -29.7),(-9.9, -29.7),(-7.7, -29.7),(-3.3, -29.7),(5.5, -29.7),(7.7, -29.7),(9.9, -29.7),(12.1, -29.7),(14.3, -29.7),(18.7, -29.7),(20.9, -29.7)] mcerulean3_points = [(14.3, 25.3),(27.5, 20.9),(23.1, 18.7),(25.3, 18.7),(27.5, 18.7),(-5.5, 12.1),(-7.7, 9.9),(-5.5, 9.9)] TIP_RACK_DECK_SLOT = 9 COLORS_DECK_SLOT = 6 AGAR_DECK_SLOT = 5 PIPETTE_STARTING_TIP_WELL = 'A1' well_colors = { 'A1' : 'Purple', 'B1' : 'Pink', 'C3' : 'Cyan' } def run(protocol): tips_20ul = protocol.load_labware('opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Opentrons 20uL Tips') pipette_20ul = protocol.load_instrument("p20_single_gen2", "right", [tips_20ul]) temperature_module = protocol.load_module('temperature module gen2', COLORS_DECK_SLOT) color_plate = temperature_module.load_labware('opentrons_96_aluminumblock_generic_pcr_strip_200ul', 'Cold Plate') agar_plate = protocol.load_labware('htgaa_agar_plate', AGAR_DECK_SLOT, 'Agar Plate') center_location = agar_plate['A1'].top() pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL) def location_of_color(color_string): for well,color in well_colors.items(): if color.lower() == color_string.lower(): return color_plate[well] raise ValueError(f"No well found with color {color_string}") def dispense_and_detach(pipette, volume, location): above_location = location.move(types.Point(z=location.point.z + 5)) pipette.move_to(above_location) pipette.dispense(volume, location) pipette.move_to(above_location) drop_vol = 0.5 # Draw mcerulean3_points using Pink pipette_20ul.pick_up_tip() pipette_20ul.aspirate(len(mcerulean3_points) * drop_vol, location_of_color('Pink')) for x, y in mcerulean3_points: dispense_and_detach(pipette_20ul, drop_vol, center_location.move(types.Point(x, y))) pipette_20ul.drop_tip() # Draw mkate2_points using Blue pipette_20ul.pick_up_tip() num_batches = math.ceil((len(mkate2_points) * drop_vol) / 18) points_per_batch = math.floor(len(mkate2_points) / num_batches) for b in range(num_batches): start = b * points_per_batch end = min((b + 1) * points_per_batch, len(mkate2_points)) current_batch = mkate2_points[start:end] pipette_20ul.aspirate(len(current_batch) * drop_vol, location_of_color('Cyan')) for x, y in current_batch: dispense_and_detach(pipette_20ul, drop_vol, center_location.move(types.Point(x, y))) pipette_20ul.drop_tip() Post-Lab Questions Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.

  • Week 4 HW: Protein Design Part I

    Part A. Conceptual Questions 1.How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons) 500g of meat = ~20% protein 100g ÷ 110 g/mol ≈ 0.91 mol 0.91 × 6.022 × 10²³ ≈ 5.5 × 10²³ molecules* Why are there only 20 natural amino acids?

  • Week 5 HW: Protein Design Part II

    Part A: SOD1 Binder Peptide Design (From Pranam) Part 1: Generate Binders with PepMLM sp|P00441|SODC_HUMAN Superoxide dismutase [Cu-Zn] OS=Homo sapiens OX=9606 GN=SOD1 PE=1 SV=2 MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ (taken from https://rest.uniprot.org/uniprotkb/P00441.fasta ) -> muted form A-> V MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ (2-5) # Binder Pseudo Perplexity Notes 0 WHYGAAQAAHWX 7.60026377248636 High confidence 1 WLYGASAAAWKK 7.46473740432208 Highest confidence!!! 2 WLYGAAGVAWKE 10.9325804754158 Moderate confidence 3 WLYYPQAAKLKK 15.5499787120909 Lowest confidence — FLYRWLPSRRGG 20.9180890005569 Known binder (control) Part 2: Evaluate Binders with AlphaFold3

  • Week 6 HW: Genetic Circuits Part I: Assembly Technologies

    Assignment: DNA Assembly What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose? Based on Protocol for Phusion™ High-Fidelity PCR Master Mix with GC Buffer (https://www.neb.com/en-gb/protocols/protocol-for-phusion-high-fidelity-pcr-master-mix-with-gc-buffer-m0532) Phusion DNA Polymerase - a high-fidelity enzyme that synthesises new DNA strands with 3’→5’ exonuclease activity for proofreading dNTPs - building blocks for DNA synthesis (200 µM each at 1X concentration) MgCl₂ - Cofactor required for polymerase activity (1.5 mM at 1X concentration) Possibly optional as the question doesn’t mention it (GC Buffer - optimised buffer for amplifying difficult templates with high GC content or secondary structure) DMSO (optional additive) - helps denature secondary structures in GC-rich or difficult templates (recommended at 3% final concentration)

  • Week 7 HW: Genetic Circuits Part II: Neuromorphic Circuits

    Assignment Part 1: Intracellular Artificial Neural Networks (IANNs) What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions? Analogue quality of computations (not just on and off, but rather treat them more like waves ->easier to work with) -> Thresholds finetuning works You can also work with more inputs simultaneously that will automatically become a weighted sum Which also means they can be trained like neural networks! whereas Boolean functions are just True/False with operators AND, OR, and NOT

  • Week 9 HW: Cell-Free Systems

    Homework Part A: General and Lecturer-Specific Questions General homework questions Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production. As the reaction happens outside living cells, you have much more flexibility and control over experimental variables!

  • Week 10 HW: Advanced Imaging & Measurement Technology

    Final Project Please identify at least one (ideally many) aspect(s) of your project that you will measure. It could be the mass or sequence of a protein, the presence, absence, or quantity of a biomarker, etc. molecular validation (to confirm whether the Twist plasmids are correct and transformed properly) through the presence of the sender and receiver plasmid, correct insert size and sequence identity, then successful bacterial transformation functional signalling validation / AHL sensing ability through receiver activation threshold, GFP expression strength, dose-response curve, signal saturation spatial pattern formation to measure whether LuxI actually produces AHL through sender-generated AHL activity, effective AHL equivalent concentration, and signalling consistency between cultures ML/CAD dataset features through OT-2 plate maps, fluorescence reads, OD600 and metadata logs Please describe all of the elements you would like to measure, and furthermore, describe how you will perform these measurements. What are the technologies you will use (e.g., gel electrophoresis, DNA sequencing, mass spectrometry, etc.)? Describe in detail.

  • Week 11 HW: Bioproduction & Cloud Labs

    Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork At this point, I’m not entirely sure what I’ve contributed to, as the artwork changed quite a lot. I’ve added a few yellow and green pixels when we first got access to the board, but as time went it changed so much that I don’t think any of them were left in the same spots. I did like the collaborative aspect of it, but going forward, I kind of wish everyone could contribute to just one/two pixels that couldn’t be overwritten. That would make the pointing-out aspect of it so much easier

  • Week 12 HW: Building Genomes

    See week 12~

Subsections of Homework

Week 1 HW: Principles and Practices

  1. First, describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project and/or something for which you are already doing in your research, or something you are just curious about. 

I have worked with the concept of CA before within design and 3d space generative making through creating tools for generating patterns and environments, so it was really fascinating to see it being brought up during class. So, for my idea I’d like to merge my previous digital experience with CA and synthetic biology tooling in a form of a computer aided design tool for spatial synthetic biology

  1. Next, describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future, like ensuring non-malfeasance (preventing harm). Break big goals down into two or more specific sub-goals.

Ensuring biosafety: a) Possibility of the prediction of biological behaviour b) Testable behaviour (not just false confidence in whatever is happening, both for a and b!) c) safety protocols built in within the ux/ui of the software d) training provided

Transparency whilst preventing harm: a) Maintenance of accountability over the created projects (as in not fully automated) b) Responsible use c) (Ideally!) some sort of encouragement of socially beneficial applications

  1. Next, describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”). Try to outline a mix of actions (e.g. a new requirement/rule, incentive, or technical strategy) pursued by different “actors” (e.g. academic researchers, companies, federal regulators, law enforcement, etc). Draw upon your existing knowledge and a little additional digging, and feel free to use analogies to other domains (e.g. 3D printing, drones, financial systems, etc.).
    1. Purpose: What is done now and what changes are you proposing?
    2. Design: What is needed to make it “work”? (including the actor(s) involved - who must opt-in, fund, approve, or implement, etc)
    3. Assumptions: What could you have wrong (incorrect assumptions, uncertainties)?
    4. Risks of Failure & “Success”: How might this fail, including any unintended consequences of the “success” of your proposed actions?
cover image cover image
  1. Next, score (from 1-3 with, 1 as the best, or n/a) each of your governance actions against your rubric of policy goals. The following is one framework but feel free to make your own:
Does the option:Option 1Option 2Option 3
Enhance Biosecurity
• By preventing incidents212
• By helping respond322
Foster Lab Safety
• By preventing incident112
• By helping respond213
Protect the environment
• By preventing incidents321
• By helping respond221
Other considerations
• Minimizing costs and burdens to stakeholders321
• Feasibility?221
• Not impede research231
• Promote constructive applications112

Homework Questions from Professor Jacobson: [Lecture 2 slides]

1)Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase?

1:10⁶

How does this compare to the length of the human genome?

approximately 3.2 Gbp

How does biology deal with that discrepancy?

Through 5’-3’ error-correcting exonuclease and 3’-5’ proofreading exonucleas + there is also MutS Repair System

2)How many different ways are there to code (DNA nucleotide code) for an average human protein?

2³⁴⁵ or higher (as there is 1,036 bp and 345 amino acids)

In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?

  • because of the secondary structure formation
  • RNA cleavage
  • mRNA instability
  • different organisms might have preferences for certain codons

Homework Questions from Dr. LeProust: [Lecture 2 slides]

1.What’s the most commonly used method for oligo synthesis currently?

The phosphoramidite method

2.Why is it difficult to make oligos longer than 200nt via direct synthesis?

Because of the 1)error accumulation, 2)”enhanced chemistry” is required

3.Why can’t you make a 2000bp gene via direct oligo synthesis?

Error accumulation and length limitations > assembly approach is required

Homework Question from George Church: [Lecture 2 slides]

[Given slides #2 & 4 (AA:NA and NA:NA codes)] What code would you suggest for AA:AA interactions?

A protein–protein interaction code

Week 2 HW: DNA Read, Write, & Edit

Part 1: Benchling & In-silico Gel Art

Here’s my version of made in Benchling pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks :)

 virtual_digest_sequence_Lambda_NEB_2026  virtual_digest_sequence_Lambda_NEB_2026 .. It’s a frog!!!

 Frog  Frog

https://benchling.com/s/seq-E1w2i5oXtxnqS8BWnwcZ?m=slm-mz29OlZAWUwDeJTn4nir

I also played around separately to that drawing with Ronan’s tool, and this is what I got ( I wanted to go for something more abstract this time)

EnzymeNumber of CutsNumber of FragmentsFragment Sizes (bp)
EcoRI5621,226 / 7,421 / 5,804 / 5,643 / 4,878 / 3,530
HindIII679,416 / 6,682 / 4,361 / 3,130 / 2,322 / 2,027 / 564
BamHI5616,841 / 7,233 / 6,770 / 6,527 / 5,626 / 5,505
KpnI2329,942 / 17,057 / 1,503
EcoRV21225,765 / 5,376 / 4,613 / 3,873 / 3,744 / 3,595 / 2,884 / 2,674 / 1,921 / 1,679 / 1,434 / 1,403 / 1,377 / 1,313 / 738 / 655 / 618 / 597 / 588 / 268 / 52 / 35
SacI2324,776 / 22,621 / 1,105
SalI2332,745 / 15,258 / 499
 In-silico Gel Art  In-silico Gel Art

Restriction Enzymes Used

  • EcoRI
  • EcoRV
  • HindIII
  • KpnI
  • BamHI
  • SacI
  • SalI

Restriction Digest Setup

LaneWaterCutSmart Bufferλ DNAEnzyme(s)
M (Ladder)14 μL2 μL3 μL-
113 μL2 μL3 μL1 μL EcoRI
213 μL2 μL3 μL1 μL KpnI + 1 μL BamHI
314 μL2 μL3 μL1 μL EcoRI + 1 μL HindIII
414 μL2 μL3 μL1 μL EcoRV
513 μL2 μL3 μL1 μL EcoRI + 1 μL KpnI
613 μL2 μL3 μL1 μL SacI + 1 μL HindIII
713 μL2 μL3 μL1 μL SacI + 1 μL SacI
813 μL2 μL3 μL1 μL SalI + 1 μL KpnI
914 μL2 μL3 μL1 μL SacI + 1 μL SacI
1013 μL2 μL3 μL1 μL HindIII + 1 μL SacI

Total volume per tube: 20 μL

Restriction Digest Parameters

  • Incubation: 37°C for 60 minutes
  • Heat Inactivation (optional): 80°C for 20 minutes

DNA Gel Electrophoresis

Goal: 100-150 ng of DNA per lane

Hand-Cast Gel Protocol

Digest Sample:

  • 14.7 μL Water
  • 3.3 μL Loading Dye
  • 2 μL Digest
  • Total: 20 μL

Ladder:

  • 6.6 μL Water
  • 3.3 μL Loading Dye
  • 10 μL Ladder (15 ng/μL stock)
  • Total: 20 μL

E-Gel Protocol

Digest Sample:

  • 18 μL Water
  • 2 μL Digest
  • Total: 20 μL

Ladder:

  • 10 μL Water
  • 10 μL Ladder (15 ng/μL stock)
  • Total: 20 μL

Part 3: DNA Design Challenge

3.1 For this task, I’d like to use green fluorescent protein also known as GFP as it is, firstly, safe and pretty well characterised (as described here! https://pubmed.ncbi.nlm.nih.gov/8303295/), essential for CA mechanics to be working (https://www.science.org/doi/10.1126/science.abb8205) and just generally directly connects to tool making (https://www.nature.com/articles/s41467-024-53078-8)

sp|P42212|GFP_AEQVI Green fluorescent protein OS=Aequorea victoria OX=6100 GN=GFP PE=1 SV=1 MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPT LVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDT LVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQ LADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK

3.2 Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.

based on this tool https://www.bioinformatics.org/sms2/rev_trans.html

reverse translation of sp|P42212|GFP_AEQVI Green fluorescent protein OS=Aequorea victoria OX=6100 GN=GFP PE=1 SV=1 to a 714 base sequence of consensus codons. atgwsnaarggngargarytnttyacnggngtngtnccnathytngtngarytngayggn gaygtnaayggncayaarttywsngtnwsnggngarggngarggngaygcnacntayggn aarytnacnytnaarttyathtgyacnacnggnaarytnccngtnccntggccnacnytn gtnacnacnttywsntayggngtncartgyttywsnmgntayccngaycayatgaarcar caygayttyttyaarwsngcnatgccngarggntaygtncargarmgnacnathttytty aargaygayggnaaytayaaracnmgngcngargtnaarttygarggngayacnytngtn aaymgnathgarytnaarggnathgayttyaargargayggnaayathytnggncayaar ytngartayaaytayaaywsncayaaygtntayathatggcngayaarcaraaraayggn athaargtnaayttyaarathmgncayaayathgargayggnwsngtncarytngcngay caytaycarcaraayacnccnathggngayggnccngtnytnytnccngayaaycaytay ytnwsnacncarwsngcnytnwsnaargayccnaaygaraarmgngaycayatggtnytn ytngarttygtnacngcngcnggnathacncayggnatggaygarytntayaar

3.3. Codon optimization.

based on this tool - https://en.vectorbuilder.com/tool/codon-optimization/51ebfb2e-a00a-4190-9fe9-c1578d3ecfbe.html

Pasted Sequence: GC=48.60%, CAI=1.00

ATGAGCAAAGGCGAAGAACTGTTTACCGGCGTGGTGCCGATTCTGGTGGAACTGGATGGCGATGTGAACGGCCATAAATTTAGCGTGAGCGGCGAAGGCGAAGGCGATGCGACCTATGGCAAACTGACCCTGAAATTTATTTGCACCACCGGCAAACTGCCGGTGCCGTGGCCGACCCTGGTGACCACCTTTAGCTATGGCGTGCAGTGCTTTAGCCGCTATCCGGATCATATGAAACAGCATGATTTTTTTAAAAGCGCGATGCCGGAAGGCTATGTGCAGGAACGCACCATTTTTTTTAAAGATGATGGCAACTATAAAACCCGCGCGGAAGTGAAATTTGAAGGCGATACCCTGGTGAACCGCATTGAACTGAAAGGCATTGATTTTAAAGAAGATGGCAACATTCTGGGCCATAAACTGGAATATAACTATAACAGCCATAACGTGTATATTATGGCGGATAAACAGAAAAACGGCATTAAAGTGAACTTTAAAATTCGCCATAACATTGAAGATGGCAGCGTGCAGCTGGCGGATCATTATCAGCAGAACACCCCGATTGGCGATGGCCCGGTGCTGCTGCCGGATAACCATTATCTGAGCACCCAGAGCGCGCTGAGCAAAGATCCGAACGAAAAACGCGATCATATGGTGCTGCTGGAATTTGTGACCGCGGCGGGCATTACCCATGGCATGGATGAACTGTATAAA

Improved DNA[1]: GC=48.88%, CAI=0.97

ATGAGCAAAGGCGAAGAACTGTTTACCGGCGTGGTGCCGATTCTGGTGGAACTGGATGGCGATGTGAATGGCCATAAATTTAGCGTGAGCGGCGAAGGTGAAGGCGATGCGACCTATGGCAAACTGACCCTGAAATTTATCTGCACCACCGGTAAACTGCCGGTGCCGTGGCCGACCCTGGTGACCACCTTCAGCTACGGCGTGCAGTGTTTTAGCCGCTACCCGGATCATATGAAACAGCATGATTTTTTTAAAAGCGCGATGCCGGAAGGCTATGTGCAGGAACGCACCATTTTTTTCAAAGATGATGGCAATTACAAAACCCGTGCCGAAGTGAAATTCGAAGGCGATACCCTGGTGAATCGCATTGAACTGAAAGGCATTGATTTTAAAGAAGATGGTAACATTCTGGGCCACAAACTGGAATACAACTATAACAGCCATAACGTGTACATTATGGCGGATAAACAGAAAAATGGCATTAAAGTGAACTTTAAAATTCGCCATAACATTGAAGATGGCTCAGTGCAGCTGGCGGATCACTATCAGCAGAACACCCCGATTGGCGATGGCCCGGTTCTGCTGCCGGATAACCACTATCTGAGCACCCAGAGCGCGCTGTCGAAAGATCCGAACGAAAAACGCGATCACATGGTGCTGCTGGAATTTGTGACCGCCGCGGGCATCACCCATGGTATGGATGAACTGTATAAA Avoid cleavage sites of restriction enzymes: BbsI BsaI

It’s done to get a more reliable and consistent protein expression!

I chose Escherichia coli K-12 substr. MG1655 because it’s standard and saf to work with + GFP expression is well established within it

3.4 If it’s a cell-dependent method the DNA sequence can be transcribed and translated into my protein through these steps: 1. Transformation 2.Transcription 3.Translation 4.Protein Folding 5.Fluorescence

Part 4: Prepare a Twist DNA Synthesis Order

https://benchling.com/s/seq-TYyZiuVgAJUcWN179kJj?m=slm-SR2kt8YFMq4ngkmbdM2Q

 Linear Map  Linear Map Afterwards in Twist I got a bunch of errors when importing as when I was making a sequence a copy pasted an article “a"into it as it was a part of it. It is also visible here as 931 is not divisible by 3 fixed version!>  Linear Map  Linear Map how it works>   SBOL Canvas   SBOL Canvas

final link https://benchling.com/s/seq-jNpnBifWoDGPy7RLJoHu?m=slm-nUfWbFSbXctEauGsTTaq

  plasmid with expression cassette   plasmid with expression cassette

Part 5: DNA Read/Write/Edit

5.1 DNA Read

(i) What DNA would you want to sequence (e.g., read) and why?

So far some sort of plasmids encoding CA-based pattern formation circuits in bacterial populations to be able to design patterns with them for the final project tool

(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?

Probably something like Illumina as it is pretty commonly used which means it will be easier to find

5.2 DNA Write

(i) What DNA would you want to synthesize (e.g., write) and why?

GFP-based circuits so I could prototype programmable living pattern systems

(ii) What technology or technologies would you use to perform this DNA synthesis and why?

PCR or Gibson assembly to do precise construction of my custom genetic circuit

And then in terms of computer based tech Benchling and Twist

5.3 DNA Edit

(i) What DNA would you want to edit and why?

So far based on my research for this project E. coli K-12 MG1655as it is safe, well-documented, accessible and also works amazing with my project

(ii) What technology or technologies would you use to perform these DNA edits and why?

I’m not fully sure what’s possible to use for the projects yet but if I can CRISPR-Cas9 as it can precisely edit genetic circuits

Week 3 HW: Principles and Practices

For this lab, I made a little bunny drawing! I used Ronan’s tool as a base and then fixed it to work with colab environment and colours that we have in our node.

Bunny drawing Bunny drawing
    from opentrons import types
    import math
    
    metadata = {
        'protocolName': 'Bunny',
        'author': 'HTGAA',
        'source': 'HTGAA 2026',
        'apiLevel': '2.20'
    }
    
    # Coordinate Data
    mkate2_points = [(12.1, 29.7),(14.3, 29.7),(16.5, 29.7),(18.7, 29.7),(20.9, 29.7),(23.1, 29.7),(9.9, 27.5),(12.1, 27.5),(16.5, 27.5),(18.7, 27.5),(20.9, 27.5),(23.1, 27.5),(-1.1, 25.3),(1.1, 25.3),(3.3, 25.3),(5.5, 25.3),(7.7, 25.3),(20.9, 25.3),(23.1, 25.3),(-3.3, 23.1),(-1.1, 23.1),(7.7, 23.1),(14.3, 23.1),(20.9, 23.1),(25.3, 23.1),(27.5, 23.1),(-7.7, 20.9),(14.3, 20.9),(20.9, 20.9),(29.7, 20.9),(12.1, 18.7),(16.5, 18.7),(18.7, 18.7),(29.7, 18.7),(-12.1, 16.5),(12.1, 16.5),(14.3, 16.5),(16.5, 16.5),(18.7, 16.5),(20.9, 16.5),(23.1, 16.5),(25.3, 16.5),(27.5, 16.5),(-14.3, 14.3),(12.1, 14.3),(16.5, 14.3),(18.7, 14.3),(25.3, 14.3),(-14.3, 12.1),(27.5, 12.1),(-16.5, 9.9),(27.5, 9.9),(-16.5, 7.7),(29.7, 7.7),(-16.5, 5.5),(-14.3, 5.5),(29.7, 5.5),(-16.5, 3.3),(-14.3, 3.3),(31.9, 3.3),(-14.3, 1.1),(-12.1, 1.1),(31.9, 1.1),(-12.1, -3.3),(34.1, -3.3),(-16.5, -5.5),(-14.3, -5.5),(34.1, -5.5),(-27.5, -7.7),(-25.3, -7.7),(-23.1, -7.7),(-18.7, -7.7),(-16.5, -7.7),(34.1, -7.7),(-27.5, -9.9),(-23.1, -9.9),(-20.9, -9.9),(-18.7, -9.9),(-16.5, -9.9),(-7.7, -9.9),(-5.5, -9.9),(34.1, -9.9),(-27.5, -12.1),(-18.7, -12.1),(-16.5, -12.1),(-14.3, -12.1),(-12.1, -12.1),(-9.9, -12.1),(-7.7, -12.1),(-5.5, -12.1),(34.1, -12.1),(-25.3, -14.3),(-16.5, -14.3),(-14.3, -14.3),(-12.1, -14.3),(-29.7, -16.5),(-27.5, -16.5),(-25.3, -16.5),(-23.1, -16.5),(-12.1, -16.5),(-9.9, -16.5),(31.9, -16.5),(-29.7, -18.7),(-20.9, -18.7),(-29.7, -20.9),(-18.7, -20.9),(-16.5, -20.9),(-14.3, -20.9),(-12.1, -20.9),(27.5, -20.9),(-27.5, -23.1),(-12.1, -23.1),(25.3, -23.1),(27.5, -23.1),(-27.5, -25.3),(-25.3, -25.3),(25.3, -25.3),(-23.1, -27.5),(-20.9, -27.5),(-18.7, -27.5),(-16.5, -27.5),(1.1, -27.5),(3.3, -27.5),(5.5, -27.5),(7.7, -27.5),(23.1, -27.5),(-14.3, -29.7),(-12.1, -29.7),(-9.9, -29.7),(-7.7, -29.7),(-3.3, -29.7),(5.5, -29.7),(7.7, -29.7),(9.9, -29.7),(12.1, -29.7),(14.3, -29.7),(18.7, -29.7),(20.9, -29.7)]
    mcerulean3_points = [(14.3, 25.3),(27.5, 20.9),(23.1, 18.7),(25.3, 18.7),(27.5, 18.7),(-5.5, 12.1),(-7.7, 9.9),(-5.5, 9.9)]
    
    TIP_RACK_DECK_SLOT = 9
    COLORS_DECK_SLOT = 6
    AGAR_DECK_SLOT = 5
    PIPETTE_STARTING_TIP_WELL = 'A1'
    
    well_colors = {
        'A1' : 'Purple',
        'B1' : 'Pink',
        'C3' : 'Cyan'
    }
    
    def run(protocol):
      tips_20ul = protocol.load_labware('opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Opentrons 20uL Tips')
      pipette_20ul = protocol.load_instrument("p20_single_gen2", "right", [tips_20ul])
      temperature_module = protocol.load_module('temperature module gen2', COLORS_DECK_SLOT)
      color_plate = temperature_module.load_labware('opentrons_96_aluminumblock_generic_pcr_strip_200ul', 'Cold Plate')
      agar_plate = protocol.load_labware('htgaa_agar_plate', AGAR_DECK_SLOT, 'Agar Plate')
      center_location = agar_plate['A1'].top()
      pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)
    
      def location_of_color(color_string):
        for well,color in well_colors.items():
          if color.lower() == color_string.lower():
            return color_plate[well]
        raise ValueError(f"No well found with color {color_string}")
    
      def dispense_and_detach(pipette, volume, location):
          above_location = location.move(types.Point(z=location.point.z + 5))
          pipette.move_to(above_location)
          pipette.dispense(volume, location)
          pipette.move_to(above_location)
    
      drop_vol = 0.5
    
      # Draw mcerulean3_points using Pink
      pipette_20ul.pick_up_tip()
      pipette_20ul.aspirate(len(mcerulean3_points) * drop_vol, location_of_color('Pink'))
      for x, y in mcerulean3_points:
        dispense_and_detach(pipette_20ul, drop_vol, center_location.move(types.Point(x, y)))
      pipette_20ul.drop_tip()
    
      # Draw mkate2_points using Blue
      pipette_20ul.pick_up_tip()
      num_batches = math.ceil((len(mkate2_points) * drop_vol) / 18)
      points_per_batch = math.floor(len(mkate2_points) / num_batches)
    
      for b in range(num_batches):
        start = b * points_per_batch
        end = min((b + 1) * points_per_batch, len(mkate2_points))
        current_batch = mkate2_points[start:end]
        pipette_20ul.aspirate(len(current_batch) * drop_vol, location_of_color('Cyan'))
        for x, y in current_batch:
          dispense_and_detach(pipette_20ul, drop_vol, center_location.move(types.Point(x, y)))
      pipette_20ul.drop_tip()

Post-Lab Questions

Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.

https://www.nature.com/articles/s41598-024-64938-0

This study by Norton-Baker et al. (2024) used an Opentrons OT-2 liquid-handling tool to efficiently characterise a large number of proteins. They also described a generalizable pipeline for high-throughput protein purification using small-scale expression in E. coli and an affordable liquid-handling robot. As a result, the automation significantly increased throughput, reduced manual labour, and improved consistency across samples, demonstrating how accessible robotics can accelerate biological research workflows. It also allowed to confirm the validity of previous findings.

On a separate note, I really enjoyed their explanation for using the tool : “Therefore, we aimed to develop a protocol using the OT-2 to provide a low-cost option for the purification and analysis of enzymes, or other proteins, making high-throughput studies more accessible to a broader range of research laboratories. Beyond increasing efficiency, this automation-assisted approach reduces the labor burden on researchers and lowers the risk of repetitive use injuries.”"

Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details.

For my final project, I plan on using Opentrons to support a design, then build, test and learn pipeline for programmable living CA.

Things to do:

  • build libraries of sender receiver circuits with a range of parameters (I’ve done a similar thing with chi.bio before so should be straightforward). I’d also like to see how I can integrate it within the pre existing lab environment (Airtable) to get more transparent data for the future

  • Go through the process of general screening of spatial rule variants in plates to test pattern formation under different genetic and environmental conditions, depending on available options

  • In the later stages. use Cloud lab integration (e.g., Ginkgo Nebula) to scale, run large combinatorial libraries, and feed results back into the CA simulation engine for model fine tunning

All in all, this automation will increase reproducibility, enable systematic exploration of rule space, and accelerate the feedback loop between digital design and wet-lab experimentation.

I would also like to explore more of the idea using Ginkgo’s cloud lab (Nebula) to run large combinatorial libraries of CA rule variants. That would allow me to test hundreds of spatial rule configurations and feed experimental results back into the simulation engine!

Final Project Ideas:

Idea 1 Idea 1Idea 2 Idea 2Idea 3 Idea 3

Week 4 HW: Protein Design Part I

Part A. Conceptual Questions

1.How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)

500g of meat = ~20% protein 100g ÷ 110 g/mol ≈ 0.91 mol 0.91 × 6.022 × 10²³ ≈ 5.5 × 10²³ molecules*

Why are there only 20 natural amino acids?

Because of the frozen accident!! https://pubmed.ncbi.nlm.nih.gov/27926995/ Evolution has developed so, and it’s too late to change that also, they have the optimal protein structure

Where did amino acids come from before enzymes that make them, and before life started? Based on this Nature article( https://www.nature.com/scitable/topicpage/an-evolutionary-perspective-on-amino-acids-14568445/ ) from metabolic pathways up to Archean Eon and chemical synthesis up to Hadean Eon

If you make an α-helix using D-amino acids, what handedness (right or left) would you expect? Left!

Can you discover additional helices in proteins? Yes, through like DSSP or PROCHECK software.

Why are most molecular helices right-handed? They’re more stable! https://www.ch.ic.ac.uk/rzepa/blog/?p=3802

Why do β-sheets tend to aggregate? their structure has edge strands with hydrogen-bonding capacity, making them thermodynamically prone to stacking through edge-to-edge or face-to-face interactions

What is the driving force for β-sheet aggregation?

It is determined by ATR-FTIR spectroscopy

Why do many amyloid diseases form β-sheets?

“Amyloid beta peptide (Aβ) is produced through the proteolytic processing of a transmembrane protein, amyloid precursor protein (APP), by β- and γ-secretases. Aβ accumulation in the brain is proposed to be an early toxic event”. https://www.nature.com/articles/aps201728

Can you use amyloid β-sheets as materials? Design a β-sheet motif that forms a well-ordered structure. Yes, they can be used as stable nano materials with applications in environmental sciences, material engineering, and translational medicines https://pmc.ncbi.nlm.nih.gov/articles/PMC8508955/. Still unsure how to design it

Part B: Protein Analysis and Visualization

1. Briefly describe the protein you selected and why you selected it.

I’ve selected LuxR as it controls bioluminescence through quorum sensing which is a “is the regulation of gene expression in response to fluctuations in cell-population density”(https://pubmed.ncbi.nlm.nih.gov/11544353/). As I want to do project in connection to CA it seemed fitting. For structure visualizatio I will be using TraR (PDB: 1L3L)

2. Identify the amino acid sequence of your protein.

from collections import Counter: Protein sequence provided by the user protein_sequence = “MSSNIESLYRQMLNEIIEQMEKEGKISREEALAVLESKLKKSNVSRDIILSHGFNSGVTQSAQPSSFLNNMAAIAKQNLHFSDIFTDLENQFIAELEELEQSYKNLVTKLSQKLDLSSLTEQALVELRSYLNQIKQSGLLDSAQNIFNFSEQISALQEAIGQLSQSIANLELQSREISELNQRVQDLEQELSFLSQQLASAQTQASQLQAQINQLNHQLEALEKAHQEALSRAKQEIEKLRS”

The length of the protein is: 242 aminoacids. The most common amino acid is: L, which appears 36 times.

How many protein sequence homologs are there for your protein? Hint: Use Uniprot’s BLAST tool to search for homologs. 250!  protein sequence homologs  protein sequence homologs

*The UniProt BLAST search identified over 250 homologs with highly significant E-values (ranging from ~0 to 10⁻²⁴), indicating LuxR is part of a large, well-conserved protein family. LuxR belongs to the LuxR-FixJ superfamily and the LuxR family of transcriptional regulators, which are widespread across gram-negative bacteria and typically involved in quorum sensing and gene regulation.

LuxR belongs to the TetR family RCSB PDB of transcriptional regulators, specifically to Vibrio alginolyticus variant, and more broadly to LuxR-FixJ

3. Identify the structure page of your protein in RCSB

https://www.uniprot.org/uniprotkb/P12746/entry#sequences

structure page structure page

Resolution: 1.66 Å

Are there any other molecules in the solved structure apart from protein?

Yes! LAE (3-OXO-OCTANOIC ACID (2-OXO-TETRAHYDRO-FURAN-3-YL)-AMIDE) and MSE (L-PEPTIDE LINKING)

Does your protein belong to any structure classification family?

Yes! https://www.uniprot.org/uniprotkb/P33905/entry#family_and_domains

LuxR (TraR) belongs to the autoinducer-regulated transcriptional regulatory protein family. The 2 domains are an N-terminal autoinducer-binding domain (Pfam: Autoind_bind) that binds the AHL quorum sensing signal, and a C-terminal HTH LuxR-type DNA-binding domain (Pfam: GerE) that activates transcription. Also classified under the LuxR-FixJ superfamily (SUPFAM) and the Winged Helix DNA-binding superfamily for its C-terminal domain.

4.Open the structure of your protein in any 3D molecule visualization software: structure structure structure2 structure2

:The structure is predominantly alpha-helical. Hydrophobicity Hydrophobicity

Hydrophobic residues are concentrated in the protein core, forming the structural interior and lining the AHL ligand-binding pocket.

binding pockets binding pockets

It does have binding pockets

Part C. Using ML-Based Protein Design Tools

Hydrophobicity Hydrophobicity

I also asked Gemini to identify mutation hotspots based on the compiled data in the heatmap:**

“Let’s analyze the mutation scan heatmap to identify potential mutation hotspots. I’ll write some code to scan through each position in the protein sequence and highlight those where several different amino acid substitutions are predicted to be favorable (meaning they have a high positive Log-Likelihood Ratio). Here’s the analysis:

Mutation Hotspots Identified (positions with at least 3 mutations having LLR > 0.5):

Position 2 (Original: Q):

  • Mutate to M: LLR = 7.67
  • Mutate to A: LLR = 2.68
  • Mutate to L: LLR = 2.55
  • Mutate to V: LLR = 2.43
  • Mutate to R: LLR = 2.33
  • Mutate to E: LLR = 2.02
  • Mutate to I: LLR = 2.00
  • Mutate to T: LLR = 1.99
  • Mutate to S: LLR = 1.91
  • Mutate to K: LLR = 1.90
  • Mutate to Y: LLR = 1.89
  • Mutate to F: LLR = 1.69
  • Mutate to G: LLR = 1.59
  • Mutate to Q: LLR = 1.59
  • Mutate to H: LLR = 1.33
  • Mutate to D: LLR = 1.31
  • Mutate to N: LLR = 1.26
  • Mutate to P: LLR = 1.16 Position 4 (Original: W):
  • Mutate to A: LLR = 1.31
  • Mutate to E: LLR = 1.11
  • Mutate to T: LLR = 1.07
  • Mutate to S: LLR = 1.01
  • Mutate to R: LLR = 1.00
  • Mutate to Q: LLR = 0.96
  • Mutate to D: LLR = 0.95
  • Mutate to K: LLR = 0.80 Position 5 (Original: L):
  • Mutate to L: LLR = 2.45
  • Mutate to I: LLR = 1.99
  • Mutate to A: LLR = 1.96
  • Mutate to V: LLR = 1.77
  • Mutate to T: LLR = 1.17
  • Mutate to F: LLR = 1.07
  • Mutate to R: LLR = 0.80
  • Mutate to Y: LLR = 0.76
  • Mutate to E: LLR = 0.72
  • Mutate to S: LLR = 0.56 Position 15 (Original: E):
  • Mutate to A: LLR = 1.13
  • Mutate to S: LLR = 1.00
  • Mutate to T: LLR = 0.62
  • Mutate to R: LLR = 0.62
  • Mutate to L: LLR = 0.58 Position 20 (Original: I):
  • Mutate to E: LLR = 5.70
  • Mutate to A: LLR = 5.51
  • Mutate to D: LLR = 5.31
  • Mutate to T: LLR = 4.86
  • Mutate to Q: LLR = 4.72
  • Mutate to S: LLR = 4.71
  • Mutate to R: LLR = 4.31
  • Mutate to K: LLR = 4.31
  • Mutate to G: LLR = 4.15
  • Mutate to N: LLR = 4.03
  • Mutate to P: LLR = 3.68
  • Mutate to V: LLR = 3.55
  • Mutate to L: LLR = 3.52
  • Mutate to H: LLR = 3.47
  • Mutate to I: LLR = 2.68
  • Mutate to M: LLR = 2.11
  • Mutate to F: LLR = 2.08
  • Mutate to Y: LLR = 1.95
  • Mutate to W: LLR = 1.11 Position 25 (Original: L):
  • Mutate to A: LLR = 3.12
  • Mutate to V: LLR = 2.93
  • Mutate to L: LLR = 2.73
  • Mutate to I: LLR = 2.70
  • Mutate to T: LLR = 1.85
  • Mutate to M: LLR = 1.40
  • Mutate to F: LLR = 1.15
  • Mutate to C: LLR = 0.80 Position 28 (Original: I):
  • Mutate to E: LLR = 0.99
  • Mutate to A: LLR = 0.73
  • Mutate to L: LLR = 0.64 Position 38 (Original: A):
  • Mutate to V: LLR = 1.76
  • Mutate to A: LLR = 1.56
  • Mutate to I: LLR = 1.23
  • Mutate to S: LLR = 0.90
  • Mutate to L: LLR = 0.81
  • Mutate to G: LLR = 0.79
  • Mutate to T: LLR = 0.53 Position 40 (Original: L):
  • Mutate to I: LLR = 1.66
  • Mutate to L: LLR = 1.64
  • Mutate to V: LLR = 1.53
  • Mutate to F: LLR = 0.63 Position 42 (Original: I):
  • Mutate to D: LLR = 2.31
  • Mutate to E: LLR = 1.39
  • Mutate to L: LLR = 1.08
  • Mutate to S: LLR = 0.71
  • Mutate to G: LLR = 0.63
  • Mutate to N: LLR = 0.62
  • Mutate to V: LLR = 0.62
  • Mutate to T: LLR = 0.60 Position 43 (Original: Q):
  • Mutate to D: LLR = 4.50
  • Mutate to E: LLR = 2.76
  • Mutate to N: LLR = 2.38
  • Mutate to P: LLR = 2.35
  • Mutate to S: LLR = 2.06
  • Mutate to T: LLR = 1.76
  • Mutate to A: LLR = 1.44
  • Mutate to G: LLR = 1.35
  • Mutate to H: LLR = 1.29
  • Mutate to R: LLR = 1.26
  • Mutate to Q: LLR = 1.06
  • Mutate to K: LLR = 0.92
  • Mutate to L: LLR = 0.80
  • Mutate to V: LLR = 0.51 Position 44 (Original: H):
  • Mutate to D: LLR = 1.89
  • Mutate to E: LLR = 1.59
  • Mutate to P: LLR = 1.31
  • Mutate to G: LLR = 0.72
  • Mutate to A: LLR = 0.55 Position 45 (Original: R):
  • Mutate to D: LLR = 2.75
  • Mutate to N: LLR = 1.61
  • Mutate to E: LLR = 1.59
  • Mutate to G: LLR = 1.42
  • Mutate to S: LLR = 1.10
  • Mutate to T: LLR = 0.62 Position 46 (Original: H):
  • Mutate to D: LLR = 1.93
  • Mutate to G: LLR = 1.62
  • Mutate to E: LLR = 1.26
  • Mutate to N: LLR = 0.61 Position 47 (Original: I):
  • Mutate to E: LLR = 1.81
  • Mutate to D: LLR = 1.51
  • Mutate to R: LLR = 1.04
  • Mutate to T: LLR = 0.98
  • Mutate to S: LLR = 0.90
  • Mutate to G: LLR = 0.88
  • Mutate to Q: LLR = 0.77
  • Mutate to K: LLR = 0.74
  • Mutate to A: LLR = 0.60 Position 48 (Original: T):
  • Mutate to G: LLR = 1.72
  • Mutate to L: LLR = 1.12
  • Mutate to D: LLR = 0.98
  • Mutate to E: LLR = 0.85
  • Mutate to A: LLR = 0.75
  • Mutate to R: LLR = 0.50 Position 53 (Original: Y):
  • Mutate to G: LLR = 1.56
  • Mutate to E: LLR = 0.93
  • Mutate to A: LLR = 0.90
  • Mutate to D: LLR = 0.89
  • Mutate to S: LLR = 0.68 Position 55 (Original: R):
  • Mutate to G: LLR = 1.31
  • Mutate to S: LLR = 1.27
  • Mutate to D: LLR = 1.14
  • Mutate to T: LLR = 1.12
  • Mutate to P: LLR = 0.75
  • Mutate to A: LLR = 0.56 Position 56 (Original: Q):
  • Mutate to P: LLR = 1.07
  • Mutate to G: LLR = 0.79
  • Mutate to D: LLR = 0.77
  • Mutate to E: LLR = 0.59 Position 57 (Original: W):
  • Mutate to D: LLR = 1.10
  • Mutate to G: LLR = 0.99
  • Mutate to E: LLR = 0.93
  • Mutate to S: LLR = 0.56 Position 58 (Original: Q):
  • Mutate to G: LLR = 2.29
  • Mutate to D: LLR = 1.82
  • Mutate to E: LLR = 1.73
  • Mutate to P: LLR = 1.65
  • Mutate to A: LLR = 1.52
  • Mutate to S: LLR = 1.45
  • Mutate to T: LLR = 1.40
  • Mutate to L: LLR = 1.33
  • Mutate to R: LLR = 1.20
  • Mutate to N: LLR = 1.14
  • Mutate to Q: LLR = 1.01
  • Mutate to K: LLR = 0.96
  • Mutate to H: LLR = 0.91
  • Mutate to V: LLR = 0.90
  • Mutate to I: LLR = 0.84
  • Mutate to F: LLR = 0.60
  • Mutate to Y: LLR = 0.57 Position 59 (Original: S):
  • Mutate to G: LLR = 1.33
  • Mutate to L: LLR = 1.02
  • Mutate to A: LLR = 0.89
  • Mutate to D: LLR = 0.89
  • Mutate to S: LLR = 0.84
  • Mutate to P: LLR = 0.82
  • Mutate to T: LLR = 0.63
  • Mutate to E: LLR = 0.61
  • Mutate to I: LLR = 0.53 Position 61 (Original: Y):
  • Mutate to E: LLR = 0.68
  • Mutate to D: LLR = 0.61
  • Mutate to A: LLR = 0.60 Position 63 (Original: D):
  • Mutate to D: LLR = 1.12
  • Mutate to S: LLR = 0.94
  • Mutate to A: LLR = 0.84
  • Mutate to G: LLR = 0.75
  • Mutate to P: LLR = 0.74
  • Mutate to T: LLR = 0.67
  • Mutate to L: LLR = 0.59
  • Mutate to E: LLR = 0.56 Position 65 (Original: K):
  • Mutate to A: LLR = 0.89
  • Mutate to P: LLR = 0.88
  • Mutate to E: LLR = 0.87
  • Mutate to L: LLR = 0.85
  • Mutate to G: LLR = 0.80
  • Mutate to D: LLR = 0.67
  • Mutate to R: LLR = 0.65 Position 66 (Original: F):
  • Mutate to S: LLR = 1.46
  • Mutate to P: LLR = 1.41
  • Mutate to D: LLR = 1.29
  • Mutate to T: LLR = 1.27
  • Mutate to G: LLR = 0.86
  • Mutate to A: LLR = 0.79
  • Mutate to R: LLR = 0.78
  • Mutate to E: LLR = 0.75 Position 75 (Original: K):
  • Mutate to A: LLR = 1.30
  • Mutate to E: LLR = 1.23
  • Mutate to L: LLR = 1.01
  • Mutate to R: LLR = 0.86
  • Mutate to D: LLR = 0.65
  • Mutate to S: LLR = 0.59
  • Mutate to W: LLR = 0.52 Position 78 (Original: R):
  • Mutate to L: LLR = 2.00
  • Mutate to I: LLR = 1.32
  • Mutate to F: LLR = 0.60
  • Mutate to V: LLR = 0.56 Position 80 (Original: R):
  • Mutate to G: LLR = 1.42
  • Mutate to E: LLR = 0.95
  • Mutate to D: LLR = 0.84
  • Mutate to S: LLR = 0.69
  • Mutate to A: LLR = 0.64 Position 82 (Original: H):
  • Mutate to G: LLR = 1.36
  • Mutate to E: LLR = 1.11
  • Mutate to D: LLR = 1.00
  • Mutate to T: LLR = 0.96
  • Mutate to P: LLR = 0.96
  • Mutate to S: LLR = 0.92
  • Mutate to R: LLR = 0.89
  • Mutate to A: LLR = 0.87
  • Mutate to L: LLR = 0.55 Position 84 (Original: F):
  • Mutate to L: LLR = 1.07
  • Mutate to V: LLR = 1.05
  • Mutate to I: LLR = 0.99 Position 85 (Original: T):
  • Mutate to V: LLR = 0.84
  • Mutate to L: LLR = 0.65
  • Mutate to I: LLR = 0.54 Position 86 (Original: W):
  • Mutate to V: LLR = 2.01
  • Mutate to L: LLR = 1.91
  • Mutate to I: LLR = 1.90
  • Mutate to A: LLR = 1.64
  • Mutate to T: LLR = 1.52
  • Mutate to S: LLR = 1.39
  • Mutate to R: LLR = 1.12
  • Mutate to F: LLR = 1.10
  • Mutate to E: LLR = 0.90
  • Mutate to D: LLR = 0.82
  • Mutate to G: LLR = 0.71
  • Mutate to Y: LLR = 0.63
  • Mutate to K: LLR = 0.62
  • Mutate to H: LLR = 0.62
  • Mutate to P: LLR = 0.56
  • Mutate to N: LLR = 0.53 Position 90 (Original: H):
  • Mutate to L: LLR = 1.31
  • Mutate to I: LLR = 1.28
  • Mutate to V: LLR = 1.09
  • Mutate to A: LLR = 1.03
  • Mutate to T: LLR = 0.94
  • Mutate to P: LLR = 0.79
  • Mutate to S: LLR = 0.76
  • Mutate to G: LLR = 0.75
  • Mutate to D: LLR = 0.64
  • Mutate to E: LLR = 0.61 Position 92 (Original: R):
  • Mutate to L: LLR = 1.41
  • Mutate to I: LLR = 0.92
  • Mutate to D: LLR = 0.91
  • Mutate to V: LLR = 0.67 Position 93 (Original: P):
  • Mutate to L: LLR = 1.17
  • Mutate to I: LLR = 0.76
  • Mutate to V: LLR = 0.73 Position 97 (Original: K):
  • Mutate to A: LLR = 1.41
  • Mutate to P: LLR = 1.39
  • Mutate to E: LLR = 0.93
  • Mutate to D: LLR = 0.80 Position 105 (Original: H):
  • Mutate to L: LLR = 1.84
  • Mutate to A: LLR = 1.60
  • Mutate to I: LLR = 1.40
  • Mutate to F: LLR = 1.26
  • Mutate to V: LLR = 1.24
  • Mutate to E: LLR = 0.89
  • Mutate to Y: LLR = 0.75
  • Mutate to T: LLR = 0.72
  • Mutate to R: LLR = 0.71 Position 107 (Original: S):
  • Mutate to L: LLR = 1.67
  • Mutate to R: LLR = 1.33
  • Mutate to I: LLR = 1.30
  • Mutate to V: LLR = 1.14
  • Mutate to A: LLR = 0.70 Position 108 (Original: D):
  • Mutate to R: LLR = 1.01
  • Mutate to A: LLR = 0.62
  • Mutate to E: LLR = 0.57
  • Mutate to K: LLR = 0.51 Position 118 (Original: P):
  • Mutate to A: LLR = 0.97
  • Mutate to G: LLR = 0.81
  • Mutate to R: LLR = 0.67
  • Mutate to L: LLR = 0.50 Position 123 (Original: N):
  • Mutate to D: LLR = 1.15
  • Mutate to T: LLR = 0.76
  • Mutate to S: LLR = 0.58 Position 125 (Original: F):
  • Mutate to E: LLR = 1.39
  • Mutate to V: LLR = 1.08
  • Mutate to R: LLR = 1.00
  • Mutate to L: LLR = 0.98
  • Mutate to K: LLR = 0.93
  • Mutate to I: LLR = 0.87
  • Mutate to T: LLR = 0.87
  • Mutate to D: LLR = 0.72
  • Mutate to A: LLR = 0.64 Position 126 (Original: X):
  • Mutate to V: LLR = 5.51
  • Mutate to I: LLR = 5.07
  • Mutate to L: LLR = 4.74
  • Mutate to A: LLR = 4.02
  • Mutate to T: LLR = 3.96
  • Mutate to F: LLR = 3.36
  • Mutate to R: LLR = 3.28
  • Mutate to G: LLR = 3.26
  • Mutate to E: LLR = 3.18
  • Mutate to S: LLR = 3.08
  • Mutate to P: LLR = 3.07
  • Mutate to Y: LLR = 3.06
  • Mutate to C: LLR = 2.88
  • Mutate to M: LLR = 2.85
  • Mutate to K: LLR = 2.72
  • Mutate to D: LLR = 2.59
  • Mutate to W: LLR = 2.55
  • Mutate to Q: LLR = 2.46
  • Mutate to H: LLR = 2.39
  • Mutate to N: LLR = 1.89 Position 127 (Original: S):
  • Mutate to G: LLR = 1.53
  • Mutate to A: LLR = 1.52
  • Mutate to L: LLR = 1.44
  • Mutate to V: LLR = 1.37
  • Mutate to I: LLR = 1.23
  • Mutate to Y: LLR = 0.84
  • Mutate to F: LLR = 0.69
  • Mutate to T: LLR = 0.51 Position 128 (Original: X):
  • Mutate to V: LLR = 6.03
  • Mutate to A: LLR = 5.56
  • Mutate to L: LLR = 5.32
  • Mutate to I: LLR = 5.28
  • Mutate to T: LLR = 4.73
  • Mutate to G: LLR = 4.70
  • Mutate to F: LLR = 4.55
  • Mutate to S: LLR = 4.33
  • Mutate to M: LLR = 4.08
  • Mutate to Y: LLR = 3.94
  • Mutate to C: LLR = 3.80
  • Mutate to R: LLR = 2.84
  • Mutate to E: LLR = 2.66
  • Mutate to H: LLR = 2.64
  • Mutate to Q: LLR = 2.62
  • Mutate to W: LLR = 2.53
  • Mutate to N: LLR = 2.39
  • Mutate to D: LLR = 2.05
  • Mutate to P: LLR = 1.81
  • Mutate to K: LLR = 1.69 Position 129 (Original: F):
  • Mutate to L: LLR = 1.23
  • Mutate to I: LLR = 1.05
  • Mutate to V: LLR = 0.93 Position 130 (Original: T):
  • Mutate to L: LLR = 1.51
  • Mutate to I: LLR = 1.10
  • Mutate to V: LLR = 1.04 Position 131 (Original: X):
  • Mutate to V: LLR = 4.62
  • Mutate to I: LLR = 4.53
  • Mutate to L: LLR = 4.41
  • Mutate to A: LLR = 4.02
  • Mutate to G: LLR = 3.53
  • Mutate to F: LLR = 3.40
  • Mutate to S: LLR = 3.16
  • Mutate to M: LLR = 3.09
  • Mutate to T: LLR = 2.82
  • Mutate to C: LLR = 2.71
  • Mutate to Y: LLR = 1.81
  • Mutate to D: LLR = 1.67
  • Mutate to E: LLR = 1.65
  • Mutate to N: LLR = 1.64
  • Mutate to W: LLR = 1.44
  • Mutate to H: LLR = 1.37
  • Mutate to Q: LLR = 1.36
  • Mutate to R: LLR = 1.35
  • Mutate to P: LLR = 0.87 Position 137 (Original: V):
  • Mutate to R: LLR = 1.52
  • Mutate to G: LLR = 1.38
  • Mutate to S: LLR = 0.79
  • Mutate to A: LLR = 0.66
  • Mutate to D: LLR = 0.63 Position 138 (Original: I):
  • Mutate to A: LLR = 0.73
  • Mutate to S: LLR = 0.66
  • Mutate to L: LLR = 0.53 Position 139 (Original: D):
  • Mutate to F: LLR = 2.18
  • Mutate to W: LLR = 1.91
  • Mutate to P: LLR = 1.48
  • Mutate to L: LLR = 1.48
  • Mutate to Y: LLR = 1.07
  • Mutate to A: LLR = 0.53 Position 142 (Original: ):
  • Mutate to D: LLR = 0.74
  • Mutate to A: LLR = 0.69
  • Mutate to E: LLR = 0.59 Position 149 (Original: A):
  • Mutate to L: LLR = 1.31
  • Mutate to I: LLR = 0.67
  • Mutate to V: LLR = 0.53 Position 152 (Original: A):
  • Mutate to L: LLR = 1.36
  • Mutate to A: LLR = 1.14
  • Mutate to V: LLR = 0.83
  • Mutate to I: LLR = 0.71 Position 155 (Original: G):
  • Mutate to L: LLR = 1.54
  • Mutate to A: LLR = 1.41
  • Mutate to V: LLR = 0.68
  • Mutate to I: LLR = 0.65
  • Mutate to E: LLR = 0.63
  • Mutate to R: LLR = 0.61 Position 157 (Original: I):
  • Mutate to L: LLR = 2.75
  • Mutate to A: LLR = 2.72
  • Mutate to I: LLR = 1.95
  • Mutate to V: LLR = 1.88
  • Mutate to E: LLR = 1.73
  • Mutate to G: LLR = 1.66
  • Mutate to S: LLR = 1.64
  • Mutate to F: LLR = 1.55
  • Mutate to T: LLR = 1.35
  • Mutate to R: LLR = 1.11
  • Mutate to M: LLR = 1.03
  • Mutate to Q: LLR = 0.90
  • Mutate to D: LLR = 0.67
  • Mutate to K: LLR = 0.57 Position 161 (Original: I):
  • Mutate to E: LLR = 1.09
  • Mutate to A: LLR = 1.02
  • Mutate to D: LLR = 0.97 Position 162 (Original: S):
  • Mutate to A: LLR = 3.09
  • Mutate to E: LLR = 2.57
  • Mutate to R: LLR = 2.27
  • Mutate to S: LLR = 2.09
  • Mutate to T: LLR = 1.97
  • Mutate to D: LLR = 1.89
  • Mutate to K: LLR = 1.80
  • Mutate to Q: LLR = 1.79
  • Mutate to G: LLR = 1.70
  • Mutate to L: LLR = 1.47
  • Mutate to P: LLR = 1.05
  • Mutate to V: LLR = 1.02
  • Mutate to N: LLR = 0.98
  • Mutate to H: LLR = 0.90 Position 174 (Original: A):
  • Mutate to A: LLR = 3.50
  • Mutate to R: LLR = 3.12
  • Mutate to S: LLR = 3.01
  • Mutate to T: LLR = 2.99
  • Mutate to L: LLR = 2.88
  • Mutate to K: LLR = 2.82
  • Mutate to G: LLR = 2.71
  • Mutate to E: LLR = 2.64
  • Mutate to D: LLR = 2.51
  • Mutate to V: LLR = 2.51
  • Mutate to Q: LLR = 2.31
  • Mutate to N: LLR = 2.19
  • Mutate to I: LLR = 2.11
  • Mutate to P: LLR = 1.95
  • Mutate to H: LLR = 1.79
  • Mutate to M: LLR = 1.36
  • Mutate to F: LLR = 1.32
  • Mutate to Y: LLR = 0.83 Position 178 (Original: P):
  • Mutate to A: LLR = 1.11
  • Mutate to R: LLR = 0.96
  • Mutate to Q: LLR = 0.52 Position 181 (Original: A):
  • Mutate to E: LLR = 2.23
  • Mutate to D: LLR = 1.54
  • Mutate to R: LLR = 1.26
  • Mutate to Q: LLR = 0.98
  • Mutate to A: LLR = 0.85
  • Mutate to K: LLR = 0.67 Position 182 (Original: T):
  • Mutate to V: LLR = 4.57
  • Mutate to I: LLR = 3.69
  • Mutate to L: LLR = 3.16
  • Mutate to F: LLR = 2.15
  • Mutate to A: LLR = 2.15
  • Mutate to C: LLR = 1.91
  • Mutate to T: LLR = 1.13
  • Mutate to M: LLR = 0.77 Position 185 (Original: R):
  • Mutate to L: LLR = 2.84
  • Mutate to A: LLR = 1.98
  • Mutate to G: LLR = 1.76
  • Mutate to M: LLR = 1.67
  • Mutate to R: LLR = 1.53
  • Mutate to Q: LLR = 1.14
  • Mutate to S: LLR = 1.00
  • Mutate to E: LLR = 0.96
  • Mutate to H: LLR = 0.74
  • Mutate to T: LLR = 0.71
  • Mutate to D: LLR = 0.67
  • Mutate to F: LLR = 0.64
  • Mutate to K: LLR = 0.57
  • Mutate to Y: LLR = 0.56 Position 188 (Original: A):
  • Mutate to E: LLR = 3.31
  • Mutate to A: LLR = 2.80
  • Mutate to D: LLR = 2.68
  • Mutate to S: LLR = 2.50
  • Mutate to Q: LLR = 2.36
  • Mutate to T: LLR = 1.84
  • Mutate to N: LLR = 1.65
  • Mutate to G: LLR = 1.42
  • Mutate to L: LLR = 1.39
  • Mutate to R: LLR = 1.37
  • Mutate to K: LLR = 1.36
  • Mutate to H: LLR = 0.71 Position 192 (Original: T):
  • Mutate to V: LLR = 7.46
  • Mutate to N: LLR = 7.19
  • Mutate to Q: LLR = 6.97
  • Mutate to I: LLR = 6.92
  • Mutate to A: LLR = 6.88
  • Mutate to L: LLR = 6.77
  • Mutate to T: LLR = 6.67
  • Mutate to D: LLR = 6.52
  • Mutate to E: LLR = 6.46
  • Mutate to S: LLR = 6.35
  • Mutate to P: LLR = 6.23
  • Mutate to Y: LLR = 6.07
  • Mutate to M: LLR = 5.65
  • Mutate to G: LLR = 5.41
  • Mutate to R: LLR = 5.30
  • Mutate to H: LLR = 5.06
  • Mutate to F: LLR = 4.99
  • Mutate to K: LLR = 4.49
  • Mutate to W: LLR = 3.79
  • Mutate to C: LLR = 3.53 Position 198 (Original: D):
  • Mutate to A: LLR = 1.24
  • Mutate to E: LLR = 0.90
  • Mutate to L: LLR = 0.81
  • Mutate to I: LLR = 0.68 Position 199 (Original: V):
  • Mutate to L: LLR = 3.77
  • Mutate to M: LLR = 1.63
  • Mutate to V: LLR = 1.14
  • Mutate to I: LLR = 0.95
  • Mutate to F: LLR = 0.67
  • Mutate to T: LLR = 0.61 Position 202 (Original: V):
  • Mutate to S: LLR = 1.98
  • Mutate to P: LLR = 1.27
  • Mutate to A: LLR = 1.20 Position 203 (Original: K):
  • Mutate to V: LLR = 2.31
  • Mutate to P: LLR = 1.82
  • Mutate to I: LLR = 1.59
  • Mutate to L: LLR = 1.38
  • Mutate to E: LLR = 1.20
  • Mutate to A: LLR = 1.18
  • Mutate to R: LLR = 0.76
  • Mutate to T: LLR = 0.67 Position 208 (Original: R):
  • Mutate to T: LLR = 1.82
  • Mutate to N: LLR = 1.48
  • Mutate to S: LLR = 1.02 Position 209 (Original: V):
  • Mutate to H: LLR = 3.42
  • Mutate to Y: LLR = 2.66
  • Mutate to R: LLR = 1.62
  • Mutate to Q: LLR = 0.73
  • Mutate to L: LLR = 0.59 Position 212 (Original: R):
  • Mutate to N: LLR = 2.55
  • Mutate to S: LLR = 1.41
  • Mutate to R: LLR = 0.98
  • Mutate to A: LLR = 0.88 Position 214 (Original: E):
  • Mutate to M: LLR = 8.36
  • Mutate to Y: LLR = 8.16
  • Mutate to L: LLR = 8.13
  • Mutate to R: LLR = 7.78
  • Mutate to F: LLR = 7.65
  • Mutate to E: LLR = 7.36
  • Mutate to A: LLR = 7.24
  • Mutate to Q: LLR = 7.22
  • Mutate to K: LLR = 7.17
  • Mutate to I: LLR = 7.12
  • Mutate to V: LLR = 6.98
  • Mutate to S: LLR = 6.58
  • Mutate to C: LLR = 6.40
  • Mutate to T: LLR = 6.04
  • Mutate to W: LLR = 5.67
  • Mutate to H: LLR = 5.62
  • Mutate to G: LLR = 5.38
  • Mutate to N: LLR = 4.57
  • Mutate to D: LLR = 3.21
  • Mutate to P: LLR = 1.85 Position 224 (Original: K):
  • Mutate to E: LLR = 2.96
  • Mutate to Q: LLR = 2.18
  • Mutate to D: LLR = 1.11
  • Mutate to A: LLR = 0.56

Total hotspots found: 75”

These results seemed a bit too excessive so I changed the code around a little bit and got this

Mutation Hotspots Identified (positions with at least 3 mutations having LLR > 5):

Position 20 (Original: I):

  • Mutate to E: LLR = 5.70
  • Mutate to A: LLR = 5.51
  • Mutate to D: LLR = 5.31 Position 128 (Original: X):
  • Mutate to V: LLR = 6.03
  • Mutate to A: LLR = 5.56
  • Mutate to L: LLR = 5.32
  • Mutate to I: LLR = 5.28 Position 192 (Original: T):
  • Mutate to V: LLR = 7.46
  • Mutate to N: LLR = 7.19
  • Mutate to Q: LLR = 6.97
  • Mutate to I: LLR = 6.92
  • Mutate to A: LLR = 6.88
  • Mutate to L: LLR = 6.77
  • Mutate to T: LLR = 6.67
  • Mutate to D: LLR = 6.52
  • Mutate to E: LLR = 6.46
  • Mutate to S: LLR = 6.35
  • Mutate to P: LLR = 6.23
  • Mutate to Y: LLR = 6.07
  • Mutate to M: LLR = 5.65
  • Mutate to G: LLR = 5.41
  • Mutate to R: LLR = 5.30
  • Mutate to H: LLR = 5.06 Position 214 (Original: E):
  • Mutate to M: LLR = 8.36
  • Mutate to Y: LLR = 8.16
  • Mutate to L: LLR = 8.13
  • Mutate to R: LLR = 7.78
  • Mutate to F: LLR = 7.65
  • Mutate to E: LLR = 7.36
  • Mutate to A: LLR = 7.24
  • Mutate to Q: LLR = 7.22
  • Mutate to K: LLR = 7.17
  • Mutate to I: LLR = 7.12
  • Mutate to V: LLR = 6.98
  • Mutate to S: LLR = 6.58
  • Mutate to C: LLR = 6.40
  • Mutate to T: LLR = 6.04
  • Mutate to W: LLR = 5.67
  • Mutate to H: LLR = 5.62
  • Mutate to G: LLR = 5.38

Total hotspots found: 4

One thing I noticed whilst analysing the heatmap, the x spots around 16, 122 to 126, 189 and 211 appear brighter. That means that these are the areas where the protein’s sequence might be quite adaptable, making them interesting for further investigation into their role and potential for modification.

Hydrophobicity HydrophobicityHydrophobicity HydrophobicityHydrophobicity Hydrophobicity

New sequence for folding: MNFLDDVEKLSKVKGDDETFKKGLEEIAKKHGFSGWNFVYDNNGHLKVWSNLDPALLKEYFEKNWYKVDPVIKKAREEEKLFTWSWDEEWPKLTEEEKALGEEYKKYGVKSGITIPIKAENGTRAYFTFYSKKDKIELKEPIDDEKAKKVTKLLYKKIVDNKWTPTGHTPMSLNEKERAYLKLRSKGYSQREIAEKLGVAQSTVSRTIRKAMEKFGVKTIEELIKIAKEMGLI

What it looks like New protein structure New protein structure

Part D. Group Brainstorm on Bacteriophage Engineering

acteriophage Engineering proposal acteriophage Engineering proposal

*Claude AI prompt: Can you please analyse the results of UniProtKB LuxR search ** Gemini built in to colab AI prompt “identify mutation hotspots based on the compiled data in the heatmap”

Week 5 HW: Protein Design Part II

Part A: SOD1 Binder Peptide Design (From Pranam)

Part 1: Generate Binders with PepMLM

  1. sp|P00441|SODC_HUMAN Superoxide dismutase [Cu-Zn] OS=Homo sapiens OX=9606 GN=SOD1 PE=1 SV=2 MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ (taken from https://rest.uniprot.org/uniprotkb/P00441.fasta ) -> muted form A-> V MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

  2. (2-5)
    #BinderPseudo PerplexityNotes
    0WHYGAAQAAHWX7.60026377248636High confidence
    1WLYGASAAAWKK7.46473740432208Highest confidence!!!
    2WLYGAAGVAWKE10.9325804754158Moderate confidence
    3WLYYPQAAKLKK15.5499787120909Lowest confidence
    FLYRWLPSRRGG20.9180890005569Known binder (control)

Part 2: Evaluate Binders with AlphaFold3

So what I found out is that in WHYGAAQAAHWX X is an unknown amino acid and that AlphaFold3 is not going to work with it so I’m just skipping it for now.

 SOD1_A4V_WLYYPQAAKLKK  SOD1_A4V_WLYYPQAAKLKK SOD1_A4V_WLYYPQAAKLKK

 SOD1_A4V_WLYGASAAAWKK  SOD1_A4V_WLYGASAAAWKK SOD1_A4V_WLYGASAAAWKK

 SOD1_A4V_WLYGAAGVAWKE  SOD1_A4V_WLYGAAGVAWKE SOD1_A4V_WLYGAAGVAWKE

 SOD1_A4V_FLYRWLPSRRGG  SOD1_A4V_FLYRWLPSRRGG SOD1_A4V_FLYRWLPSRRGG

PeptideipTMpTMBinding Location
FLYRWLPSRRGG (known)0.340.83Surface-bound, near bottom/loop region
WLYGAAGVAWKE0.450.88Engages β-barrel, partially buried near core
WLYGASAAAWKK0.240.78Surface-bound, near N-terminus/top loops
WLYYPQAAKLKK0.270.70Surface-bound, loose association near loops
  1. ipTM scores across all peptides ranged from 0.24 to 0.45 -> suggests weak-to-moderate predicted interface confidence. The PepMLM-generated peptide WLYGAAGVAWKE (ipTM = 0.45) outperformed the known binder FLYRWLPSRRGG (ipTM = 0.34), appearing more engaged with the β-barrel core of SOD1. The remaining two peptides WLYGASAAAWKK (ipTM = 0.24) and WLYYPQAAKLKK (ipTM = 0.27) scored below FLYRWLPSRRGG and appeared loosely surface-bound. None exceeded an ipTM of 0.5, which should be expected for short peptides against a structured target

Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse

 WLYYPQAAKLKK_pv  WLYYPQAAKLKK_pv WLYYPQAAKLKK_pv

 WLYGASAAAWKK_pv  WLYGASAAAWKK_pv WLYGASAAAWKK_pv

 WLYGAAGVAWKE_pv  WLYGAAGVAWKE_pv WLYGAAGVAWKE_pv

 FLYRWLPSRRGG_pv  FLYRWLPSRRGG_pv FLYRWLPSRRGG_pv

PeptideipTMBinding Affinity (pKd)SolubilityHemolysisNet Charge
FLYRWLPSRRGG (known)0.345.968SolubleNon-hemolytic+2.76
WLYGAAGVAWKE0.456.683SolubleNon-hemolytic-0.23
WLYGASAAAWKK0.246.071SolubleNon-hemolytic+1.76
WLYYPQAAKLKK0.275.482SolubleNon-hemolytic+2.76

WLYGAAGVAWKE had both the highest ipTM (0.45) and the strongest predicted binding affinity (6.683 pKd), suggesting that structural confidence and binding prediction align for this peptide. All peptides were soluble and non-hemolytic, meaning none raised safety red flags. Notably, higher ipTM did loosely correlate with stronger affinity — WLYGAAGVAWKE topped both metrics while WLYYPQAAKLKK scored lowest on affinity and had a weak ipTM.

Peptide to Advance: WLYGAAGVAWKE!!!

WLYGAAGVAWKE has the best structural binding confidence (ipTM = 0.45), the strongest predicted affinity (6.683 pKd), is fully soluble, and non-hemolytic. It outperforms the known binder FLYRWLPSRRGG on every key metric! Therefore, we should use this one

Part 4: Generate Optimized Peptides with moPPIt

I tried to run for Motif positions 1, 2, 3, 4, 5, 6, 7 but it’s just taking so long, so, I’m nit going to finosh running it but here’re some results that I got so far:

moPPIt Generated Peptides

#PeptideTarget Residues
1GKTEKTYTDCCD1, 2, 3, 4, 5, 6, 7
2EEQNTCIQTTKA1, 2, 3, 4, 5, 6, 7

Comparison: moPPIt vs PepMLM Peptides:

moPPIt peptides differ notably from PepMLM-generated ones in both composition and design intent. PepMLM peptides (e.g., WLYGAAGVAWKE, WLYGASAAAWKK) were dominated by W at position 1 and showed a hydrophobic character, as the model simply sampled sequences likely to bind SOD1 without any spatial constraints. In contrast, moPPIt peptides (GKTEKTYTDCCD, EEQNTCIQTTKA) are more polar and charged while balancing affinity, solubility and hemolysis objectives. moPPIt asks “what binds specifically near position 4, and is also therapeutically viable?”

Before advancing the studies of moPPIt peptides, the following next steps/ evaluations needed:

  1. SPR or ITC should be used to measure actual binding affinity
  2. AlphaFold3 rystallography to confirm binding near the target residues 1–7
  3. To confirm non-toxic
  4. Need to assess peptide half-life through test degradation in serum
  5. Check for cell permeability
  6. ALS mouse model

new!! TKCVATKKLQED

Part B: BRD4 Drug Discovery Platform Tutorial (Gabriele)

waiting to be accepted onto the platform

Part C: Final Project: L-Protein Mutants

 SOD1_A4V_WLYYPQAAKLKK  SOD1_A4V_WLYYPQAAKLKK  SOD1_A4V_WLYYPQAAKLKK  SOD1_A4V_WLYYPQAAKLKK

I’ve also asked Claude* for help with the validation of computations, and this was the correlation assessment

STRONG CORRELATION between computational scores and experimental LLR values:

Computational PredictionExperimental ValidationAgreement
K50L (Score: 2.56)Bright yellowExcellent
C29R (Score: 2.40)Position 29 hotspotExcellent
Y39L (Score: 2.24)Bright at pos 39Strong
N53L (Score: 1.87)TM leucine patternStrong
S9Q (Score: 2.01)Some positive signalsGood

Conclusion: The ESM2 language model predictions correlate well with experimental data, particularly for:

  • Identifying hotspot positions (29, 39, 50)
  • Predicting beneficial amino acid types (hydrophobic in TM, removing C29)
  • Overall mutation effects (positive vs negative)

This validates using computational approaches for rational protein design, though experimental validation remains essential.

Top 20 Mutations with Scores

RankMutationOriginal AANew AAPositionScoreRegion
1K50LKL502.561Transmembrane
2C29RCR292.395Soluble
3Y39LYL392.242Soluble
4C29SCS292.043Soluble
5S9QSQ92.014Soluble
6C29QCQ291.997Soluble
7C29PCP291.971Soluble
8C29LCL291.961Soluble
9K50IKI501.929Transmembrane
10N53LNL531.865Transmembrane
11E61LEL611.818Transmembrane
12T52LTL521.814Transmembrane
13K50FKF501.802Transmembrane
14C29TCT291.797Soluble
15C29KCK291.796Soluble
16F5QFQ51.795Soluble
17F5RFR51.660Soluble
18C29ACA291.649Soluble
19Y27RYR271.628Soluble
20F22RFR221.602Soluble

my final 5 mutations: K50L (Transmembrane) 2.561 C29R (Soluble) 2.395 Y39L (Soluble) 2.242 N53L (Transmembrane) 1.865 S9Q (Soluble) 2.014

*prompt used with Claude find correlation between the experimental data L-Protein Mutants - Sheet1.csv and protein_mutations_scores.csv

Week 6 HW: Genetic Circuits Part I: Assembly Technologies

Assignment: DNA Assembly

What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?

Based on Protocol for Phusion™ High-Fidelity PCR Master Mix with GC Buffer (https://www.neb.com/en-gb/protocols/protocol-for-phusion-high-fidelity-pcr-master-mix-with-gc-buffer-m0532)

Phusion DNA Polymerase - a high-fidelity enzyme that synthesises new DNA strands with 3’→5’ exonuclease activity for proofreading dNTPs - building blocks for DNA synthesis (200 µM each at 1X concentration) MgCl₂ - Cofactor required for polymerase activity (1.5 mM at 1X concentration) Possibly optional as the question doesn’t mention it (GC Buffer - optimised buffer for amplifying difficult templates with high GC content or secondary structure) DMSO (optional additive) - helps denature secondary structures in GC-rich or difficult templates (recommended at 3% final concentration)

What are some factors that determine primer annealing temperature during PCR?

Primer Tm as in melting temperature, primer length, GC content, primer pair Tm similarity

There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.

PCR vs. Restriction Enzyme Digests

Comparison Table

AspectPCRRestriction Enzyme Digests
MechanicsAmplifies specific DNA regions using primersCuts DNA at specific recognition sequences
FlexibilityMore flexible; doesn’t require specific cut sitesLimited by existing restriction sites in sequence
MutationsCan introduce mutations via primer designCannot introduce mutations
End ProductsVariable ends based on primer designCreates predictable, precise ends
Time needed~90 minutes with thermal cyclingFaster (~1-2 hours incubation)
Template amountCan work with small amounts of templateRequires sufficient DNA quantity
Sticky endsNot applicableCan create compatible sticky ends
When to useWhen introducing mutations into sequences, no suitable restriction sites exist in the DNA, from small amounts of template DNAWhen restriction sites are conveniently located in the sequence, Creating compatible sticky ends for cloning, Cutting plasmids for traditional cloning applications, Precise, predictable cuts are required

How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?

  1. Ensure primers have 20-22 bp overhangs that are complementary between fragments

  2. After PCR and DpnI digest, purify the DNA using the Zymo DNA Clean & Concentrator kit and measure concentration using Nanodrop/Qubit. The concentration should be above ~30 ng/µL.

  3. Run the PCR products on an agarose gel to verify correct band sizes. Mix 3 µL of sample with 3.3 µL of 6x Loading Dye, run at ~100 mV for 15 min, and compare against a DNA ladder. Calculate your predicted digest on Benchling to verify the correct band size matches what you see on the gel

  4. Ensure only unmethylated PCR products (not the original methylated template plasmid) are present through DpnI Treatment, reducing background colonies from unmutated template.

How does the plasmid DNA enter the E. coli cells during transformation?

By diffusion when the cells are shocked

Describe another assembly method in detail (such as Golden Gate Assembly) Explain the other method in 5 - 7 sentences plus diagrams (either handmade or online).

Golden Gate assembly is a molecular cloning technique used to join DNA fragments in a single reaction. Methods for golden gate include MoClo, GoldenBraid 2.0, Mobius assembly and EMMA. It’s an efficient and cost-effective means to generate construct variants. Type IIS restriction enzymes are used to assemble multiple fragments in a single reaction. It’s also important to make sure that the Type IIS included in the process is not present in the vector or insert sequences

Golden Gate Assembly method Golden Gate Assembly method Screenshot taken from https://www.youtube.com/watch?v=JuiJASyG9gc

Model this assembly method with Benchling or Asimov Kernel!

As we don’t have access to Asimove I’ll be modelling this with Benchling

  1. First, I added the DNA sequense of the well-known green fluorescent protein (GFP), which is native to the jellyfish Aequorea victoria

  2. Changed the GGTCTC to GGTCTT to be able to work with the Golden Gate Assembly. Then I’ve added the plasmid to the project. I used the same one described in the lab. Then I checked for the presence of GGTCTC to use as the typeII enzyme.. there were 2 of those. Checked the other ones 2 of those again. Basically all three Type IIS enzymes (BsaI, BbsI, and Esp3I) have recognition sites in the mUAV plasmid

  3. I ended up editing the GGTCTC ones as they were all in the base parts of the plasmid so they didn’t affect much. ( GGTCTC → GGTCTT )

  4. Then I figured out that amilCP gene starts at 2114 and ends at 2779 Complement: A↔T, G↔C

  5. I then loooked for stop codons and found the TAA one. For the primer design i also needed the 20 protein sequence of the GFP which was atgtctaaaggtgaagaatt. Basically my primer components ended up being BsaI site: GGTCTC, Overhang A: AATG (includes the ATG start codon) and the binding region which is those first 20 bp of GFP = atgtctaaaggtgaagaatt. Then for the reverse binding region, the positions of 666 to 685 (20 before the TAA) are TGGTCTTGTTAGAATTTGTT therefore reverse complement ones are AACAAATTCTAACAAGACCA

  6. End primers are GFP Forward 5’-GGTCTCAATGTCTAAAGGTGAAGAATT-3’ and GFP Reverse 5’-GAGACCAAGCAACAAATTCTAACAAGACCA-3'

  7. Then as for the backbone forward primer 5’-GGTCTCGCTTAAGCTTCAAATAAAACGAA-3’ as it is (taagcttcaaataaaacgaa) where GGTCTC is a BsaI recognition site, G a spacer base, and CTTAAGCTTCAAATAAAACGAA a binding region.

  8. Backbone Reverse ended up being 5’-GAGACCCATTTTAGTATTTCTCCTCTTTCT-3'

  9. Then knowing all this it’s time to do the Golden Gate Assembly Golden Gate Assembly method Golden Gate Assembly method

  10. Needed to change the setting from use cut lines otherwise i was getting an error Golden Gate Assembly method Golden Gate Assembly method but other than that it was a pretty straightforward process Golden Gate Assembly method Golden Gate Assembly method

and here are some final setting that I used

Golden Gate Assembly method Golden Gate Assembly method

all done !!! https://benchling.com/s/seq-7kPMc9tGfTBvUaQHkniq?m=slm-VIKLDWpNc1o4GKeuM4Ii

Golden Gate Assembly method Golden Gate Assembly method

Assignment: Asimov Kernel

(waiting for further instructions)

Week 7 HW: Genetic Circuits Part II: Neuromorphic Circuits

Assignment Part 1: Intracellular Artificial Neural Networks (IANNs)

What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?

  • Analogue quality of computations (not just on and off, but rather treat them more like waves ->easier to work with)
  • -> Thresholds finetuning works
  • You can also work with more inputs simultaneously that will automatically become a weighted sum
  • Which also means they can be trained like neural networks!

whereas Boolean functions are just True/False with operators AND, OR, and NOT

Describe a useful application for an IANN; include a detailed description of input/output behavior, as well as any limitations an IANN might face to achieve your goal.

Actually, mycelium described in the later parts of the homework is a great application of IANN due to its non-uniform growth, which is basically a response to chemical gradients, nutrient availability, CO₂ levels, humidity, and light. That’s all analogue and continuous signals! Therefore, we could engineer a fungal cell to use an IANN to sense its local environment and decide how to grow, giving mycelium a programmable, environmentally-responsive architecture, which is basically a CA <3 where simple local rules produce complex emergent global patterns!!

So then, for input behaviour, we have

-Local nutrient and carbon concentration sensed via metabolite-responsive promoters -CO₂ or O₂ levels -Mechanical compression (fungi lowkey already have these) -Neighbour cell density, which would act like quorum-sensing signalling molecules

Then, for output behaviour, we have

-Expression of growth-promoting genes -Expression of growth-halting genes → denser, stiffer composite in that region -Secretion of signalling molecules → coordinating behaviour with neighbouring cells

Limitations: intercellular signal noise (“local” signals may not stay local, warping and structural variability (Walter and Gürsoy (2022)), instability of speed as the IANN may lag behind rapidly changing environmental conditions

Draw a diagram for an intracellular multilayer perceptron where layer 1 outputs an endoribonuclease that regulates a fluorescent protein output in layer 2.

Here! A super quick sketch made in figma I tried to make it as close to the example as possible

Circuits Circuits

Assignment Part 2: Fungal Materials

What are some examples of existing fungal materials and what are they used for? What are their advantages and disadvantages over traditional counterparts?

I ended up reading a lot about acoustic dampening and how fungal materials are used for it. So the most established fungal material is the mycelium-based composite. It is produced when the vegetative root network of fungi (mycelium), is grown on agricultural or waste plant substrates and then dried to stop growth. The resulting material has the potential to replace petrochemical-based materials within architectural systems and can serve as a biodegradable(!!!) alternative to synthetic sound-absorbing materials. Based on this paper https://pmc.ncbi.nlm.nih.gov/articles/PMC9394424/ , the acoustic performance is genuinely competitive. Mycelium-based composites cultivated on shredded and fine cardboard can both be considered sound-absorbing materials and have the potential to compete with the performance of synthetic sound absorbers as both sample groups outperformed a polypropylene acoustic panel at low frequencies (125 Hz, 250 Hz, 500 Hz). An earlier study found that even the lowest-performing substrate, 100% cotton bur fibre, still yielded higher than 70% acoustic absorption at 1000 Hz.

Advantages over traditional counterparts like fibreglass or stone wool are primarily environmental. As for disadvantages, performance is variable and hard to control because mycelium growth is sensitive to humidity, temperature, and other things.

What might you want to genetically engineer fungi to do and why? What are the advantages of doing synthetic biology in fungi as opposed to bacteria?

I’d be so interested in researching further applications for acoustics. Making a speaker out of a part made out of fungi would be really interesting from a sound perspective. Like mycelium composites are already being used as acoustic panels because their fibrous, porous structure absorbs and diffuses sound well.

It’s so much easier to create a mater out of fungi. They are cheaper and easier to grow at scale. You can also also mold it, control its density, and at the end harvest a structured composite

Assignment Part 3: First DNA Twist Order

I’m still working on it but here’s a little outline of what it’s going to be:

the LuxI/LuxR sender-receiver circuit: the LuxI expression cassette (constitutive promoter + LuxI + terminator) and the LuxR receiver cassette (LuxR + pLux promoter + GFP + terminator)

For a backbone vector, a gene in a plasmid probably pSB1C3 which is an iGEM standard backbone, chloramphenicol resistance and widely used for QS parts

https://pmc.ncbi.nlm.nih.gov/articles/PMC9394424/ https://www.mdpi.com/2313-7673/7/3/100

Week 9 HW: Cell-Free Systems

Homework Part A: General and Lecturer-Specific Questions

General homework questions

Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.

As the reaction happens outside living cells, you have much more flexibility and control over experimental variables!

To be more specific, the biggest difference is the promoter. In cell-free protein synthesis, a T7 promoter is often preferred because T7 RNA polymerase is readily added or already present in the extract, making transcription simple and highly controllable. And this is why we have a much more precise control over transcription This doesn’t happen with real bacteria, as gene expression is tied to cell growth phase, metabolism and others. That also means that prototyping is faster as you can propagate the plasmid in E. coli, then purify plasmid DNA, then add it to cell-free protein synthesis and be kinda done within so much faster

Two cases where cell-free expression is more beneficial than cell production:

  1. unstable proteins degraded in vivo
  2. toxic proteins, where something might kill bacteria before enough protein accumulates or like if one needs to do any sort of rapid screening

Describe the main components of a cell-free expression system and explain the role of each component.

  • Cell Extract (Lysate). It provides the machinery for transcription/translation
  • Genetic Template (DNA or mRNA). Contains the gene encoding the target protein.
  • Energy Regeneration System. Sustain long-term protein synthesis through supplying ATP/GTP
  • Amino Acids and Building Blocks. Needed for polypeptide synthesis
  • Buffer and Salts. Stabilise the reaction environment and optimise enzyme function and others

Why is energy provision regeneration critical in cell-free systems? Describe a method you could use to ensure continuous ATP supply in your cell-free experiment.

Transcription and translation consume ATP super fast -> ATP is quickly depleted. To avoid that a common method is adding PEP with pyruvate kinase (Calhoun and Swartz, 2007)

Compare prokaryotic versus eukaryotic cell-free expression systems. Choose a protein to produce in each system and explain why.

Prokaryotic cell-free systems - rapid, high-yield production of simple proteins -> Green fluorescent protein (GFP) (small, soluble, no need for complicated glycosylation) Eukaryotic systems - complex proteins requiring proper folding and post-translational modification -> Erythropoietin (EPO) (requires specific disulfide bonds and glycosylation)

How would you design a cell-free experiment to optimize the expression of a membrane protein? Discuss the challenges and how you would address them in your setup.

To express a membrane protein, I would use a eukaryotic cell-free system with liposomes. Then I’d also include chaperones, and optimise reaction conditions to promote proper folding and membrane insertion while preventing aggregation.

Imagine you observe a low yield of your target protein in a cell-free system. Describe three possible reasons for this and suggest a troubleshooting strategy for each.

Low yield may result from ATP depletion (->add an energy regeneration system), protein misfolding (->include chaperones or membrane mimetics), or poor template quality(->would be solved by optimising template design and reaction conditions)

Homework question from Kate Adamala

Design an example of a useful synthetic minimal cell as follows: Pick a function and describe it.

For this mini project/ question, I’d like to explore how synthetic minimal cells for the wols of biocensors specifically, I’d like to look into the possibilities of heavy metal detection in water. It would be also interesting to explore what’s possible to be done after the detection as a form of capturing it

What would your synthetic cell do? What is the input and what is the output?

The synthetic cell would function as a heavy metal detector targeting mercury (Hg²⁺) contamination in water. It will first sense dissolved Hg²⁺ through a metal-responsive transcription factor, then synthesise both a fluorescent reporter (so you can see the signal) and a metal-binding protein to physically capture the mercury.

Input Hg²⁺ ions in the surrounding aqueous environment Output GFP fluorescence, which would act as a quantitative detection signal

Could this function be realized by cell-free Tx/Tl alone, without encapsulation?

No, as it would be ineffective

Could this function be realized by genetically modified natural cell?

Yes, it’s possible to use E.coli but that wouldn’t be safe for the environment.

Describe the desired outcome of your synthetic cell operation.

The desired outcome would be a real-time fluorescent detection of mercury contamination (+passive remediation!)

Design all components that would need to be part of your synthetic cell.

  • PURE system - E. coli Tx/Tl machinery
  • MerR + PmerT-DNA - Hg²⁺ sensor + promoter
  • GFP - fluorescent reporter
  • Metallothionein - Hg²⁺ binding protein
  • ATP regen (creatine-P / CK) - energy source

What would be the membrane made of?

DPPC (structural bilayer backbone), cholesterol (reduced passive permeability), DSPE-PEG2000PE (for stability in water), DPhPC (help with protein reconstitution)

What would you encapsulate inside? Enzymes, small molecules.

Genetic material: A linear DNA expression cassette encoding gfp and MT1 (human metallothionein-1) under a single MerR-regulated PmerT promoter — both genes transcribed as a bicistronic message driven by the same mercury-sensing switch. Tx/Tl machinery: PURE system (PUREfrex 2.0 or similar), providing all ribosomes, elongation factors, tRNAs, aminoacyl-tRNA synthetases, and RNA polymerase needed for in-vesicle protein synthesis. Pre-loaded proteins/small molecules:

MerR transcription factor (pre-expressed and loaded — it is the sensor that detects Hg²⁺ and activates PmerT) Creatine phosphate + creatine kinase (ATP regeneration system to power the PURE system) All 20 amino acids and NTP mix RNase inhibitor (to protect mRNA during synthesis)

Which organism your Tx/Tl system will come from? Is bacterial OK, or do you need a mammalian system for some reason? (hint: for example, if you want to use small molecule modulated promoters, like Tet-ON, you need mammalian)

A bacterial PURE system is the right choice here because all genes are prokaryotic (merR, merB, gfp, MT1 with bacterial codon optimization),

How will your synthetic cell communicate with the environment? (hint: are substrates permeable? or do you need to express the membrane channel?)

Through MerB and MscL

Experimental details List all lipids and genes. (bonus: find the specific genes; for example, instead of just saying “small molecule membrane channel” pick the actual gene.)

Lipids:

  • DPPC (Avanti Polar Lipids )
  • Cholesterol (Sigma C8667)
  • DSPE-PEG2000
  • DPhPC

Genes / DNA Cassettes:

  • merR Uniprot P0A3C3
  • merB Uniprot P0A393
  • gfp Uniprot P42212
  • MT1 Uniprot P04731
  • mscL Uniprot P0A742

How will you measure the function of your system?

  • In vitro validation (to consider dose-response fluorescence, mercury sequestration, Tx/Tl confirmation, membrane integrity controls)
  • Then test the same synthetic cells against Pb, Cd, Cu, Zn, and Fe at equivalent concentrations
  • No-MerR negative control (cells assembled without the MerR protein but with the PmerT-gfp cassette should be no flouresebce)
  • Environmental validation through spiked water sample test

Homework question from Peter Nguyen

Freeze-dried cell-free systems can be incorporated into all kinds of materials as biological sensors or as inducible enzymes to modify the material itself or the surrounding environment. Choose one application field — Architecture, Textiles/Fashion, or Robotics — and propose an application using cell-free systems that are functionally integrated into the material. Answer each of these key questions for your proposal pitch: Write a one-sentence summary pitch sentence describing your concept.

I’m also really interested in how acoustics work and operate. I know there’s been a lot of recent development with mycelium-based material research for speakers, but for this project, I’d like to look into wall panels, which you can find a lot in music venues. In most cases, such venues require a specific setup for the type of music that’s going to be played, and this would solve this issue. I imagine them being updated with using freeze-dried cell-free enzymes to modify material porosity/stiffness in response to sound frequency and humidity, optimising acoustics in real-time for different performances.

How will the idea work, in more detail? Write 3-4 sentences or more.

The panels contain freeze-dried cell-free systems with mechanosensitive promoters and enzymes that degrade polymers in the material matrix. When sound waves vibrate the material and ambient humidity activates the system -> the enzymes modify the panel’s density and porosity. High-frequency performances trigger different enzymatic pathways than bass-heavy music, creating frequency-specific acoustic reflection. The material could shift from absorptive (for loud rock) to reflective (for classical) over the course of hours.

What societal challenge or market need will this address?

Live music venues struggle with “one-size-fits-all” acoustics; a room great for orchestras sounds terrible for others. Acoustic treatment is expensive and static. This addresses the massive live music industry with adaptive infrastructure that reduces the need for costly manual reconfiguration.

How do you envision addressing the limitation of cell-free reactions (e.g., activation with water, stability, one-time use)?

One could use multiple “generations” of enzymes with different activation thresholds. Early activators work on first moisture exposure, but reserve populations activate only after pH changes from initial reactions, creating extended activity. Incorporate hygroscopic materials that capture atmospheric moisture in humid London venues to enable repeated activation cycles.

Homework question from Ally Huang

Provide background information that describes the space biology question or challenge you propose to address. Explain why this topic is significant for humanity, relevant for space exploration, and scientifically interesting. (Maximum 100 words)

When in space, astronauts lose quite a bit of their bone density 1-2%, which is the same thing as having osteoporosis on Earth for a year. Therefore, it’s important to understand how bone-building and bone-resorbing cells operate for long missions in space. There is a chance that cellular automata could have an answer for that. With CA, we could simulate spatial-temporal patterns of bone remodelling to predict how mechanical unloading disrupts the delicate balance between mineralisation and resorption. This research could inform is of possible countermeasures

Name the molecular or genetic target that you propose to study. Examples of molecular targets include individual genes and proteins, DNA and RNA sequences, or broader -omics approaches. (Maximum 30 words)

The RANK/RANKL/OPG signalling pathway regulating osteoblast-osteoclast communication (RUNX2, osteocalcin, collagen type I).

Describe how your molecular or genetic target relates to the space biology question or challenge your proposal addresses. (Maximum 100 words)

The RANK/RANKL/OPG pathway controls bone remodelling balance. In microgravity, mechanical unloading downregulates osteoblast activity (reduced RUNX2 expression) while RANKL signalling increases osteoclast formation. Cellular automata can model how local cell-cell signalling rules—based on real gene expression data from spaceflight experiments—produce emergent bone microarchitecture patterns.

Clearly state your hypothesis or research goal and explain the reasoning behind it. (Maximum 150 words)

Altered RANK/RANKL/OPG expression ratios in microgravity create spatial signalling gradients that, when modelled using cellular automata rules, will reproduce bone loss patterns observed in astronaut imaging studies. Traditional biomechanical models miss the discrete cellular interactions driving remodelling. Cellular automata excel at capturing emergent behaviour from simple local rules.

Outline your experimental plan - identify the sample(s) you will test in your experiment, including any necessary controls, the type of data or measurements that will be collected, etc. (Maximum 100 words)

Samples: Analyse existing NASA GeneLab datasets from microgravity analogue rodent hindlimb unloading experiments, measuring RANK/RANKL/OPG expression in bone tissue at multiple timepoints. Controls: Earth-gravity parameter sets; randomised signalling gradients. Data collected: Bone volume fraction, trabecular thickness, connectivity metrics from CA output; compare against μCT imaging data from actual experiments.

Homework Part B: Individual Final Project

Circuits Circuits

Bibliography Calhoun, K.A. and Swartz, J.R. (2007). Energy Systems for ATP Regeneration in Cell-Free Protein Synthesis Reactions. In Vitro Transcription and Translation Protocols, pp.3–17. doi:https://doi.org/10.1007/978-1-59745-388-2_1.

Week 10 HW: Advanced Imaging & Measurement Technology

Final Project

Please identify at least one (ideally many) aspect(s) of your project that you will measure. It could be the mass or sequence of a protein, the presence, absence, or quantity of a biomarker, etc.

  1. molecular validation (to confirm whether the Twist plasmids are correct and transformed properly) through the presence of the sender and receiver plasmid, correct insert size and sequence identity, then successful bacterial transformation
  2. functional signalling validation / AHL sensing ability through receiver activation threshold, GFP expression strength, dose-response curve, signal saturation
  3. spatial pattern formation to measure whether LuxI actually produces AHL through sender-generated AHL activity, effective AHL equivalent concentration, and signalling consistency between cultures
  4. ML/CAD dataset features through OT-2 plate maps, fluorescence reads, OD600 and metadata logs

Please describe all of the elements you would like to measure, and furthermore, describe how you will perform these measurements. What are the technologies you will use (e.g., gel electrophoresis, DNA sequencing, mass spectrometry, etc.)? Describe in detail.

The project will measure genetic construct integrity, quorum-sensing response strength, sender-derived signalling output, and spatial cellular automata pattern formation.

  1. Plasmid integrity will be measured using colony PCR followed by agarose gel electrophoresis to confirm the expected insert sizes of the LuxI sender and LuxR-sfGFP receiver constructs. Sanger sequencing will then be used to verify promoter-RBS-CDS junctions, spacer regions, stop codons, and inducible promoter elements.

  2. Functional LuxR receiver performance will be quantified by exposing transformed receiver E. coli to a defined dilution series of 3-oxo-C6-HSL and measuring GFP fluorescence over time using a handmade fluorescence plate reader. OD600 measurements will also be collected for normalisation

  3. LuxI sender activity will be measured indirectly by collecting cell-free sender supernatant and applying it to the receiver strain, using the previously generated AHL-GFP calibration curve to infer effective signalling molecule concentration.

  4. Spatial pattern formation relevant to biological cellular automata will be measured using Opentrons OT-2 automated plate layouts combined with fluorescence time-lapse imaging. Metrics including activation radius, propagation velocity, neighbourhood response, and endpoint fluorescence distribution will be extracted and used as training data for ML-enhanced CAD prediction in Aim 2.

Waters Part I — Molecular Weight

  1. Calculated molecular weight

Compute pI/Mw - Results Theoretical pI/Mw (average)

Sequence: 10 20 30 40 50 60 MVSKGEELFT GVVPILVELD GDVNGHKFSV SGEGEGDATY GKLTLKFICT TGKLPVPWPT

    70         80         90        100        110        120 

LVTTLTYGVQ CFSRYPDHMK QHDFFKSAMP EGYVQERTIF FKDDGNYKTR AEVKFEGDTL

   130        140        150        160        170        180 

VNRIELKGID FKEDGNILGH KLEYNYNSHN VYIMADKQKN GIKVNFKIRH NIEDGSVQLA

   190        200        210        220        230        240 

DHYQQNTPIG DGPVLLPDNH YLSTQSALSK DPNEKRDHMV LLEFVTAAGI TLGMDELYKL

EHHHHHH

Theoretical pI/Mw: 5.90 / 28006.60

2. Calculated molecular weight of the eGFP using the adjacent charge state approach described in the recitation

For my data I chose peaks at $m/z = 933.8044$ and $m/z = 966.0037$

$$966.0037 - 933.8044 = 32.1993$$

For $n$ and $n-1$:

$$\frac{MW}{n} - \frac{MW}{n-1} \approx \frac{MW}{n(n-1)}$$

$z = 30$ and $z = 29$:

$$\text{Expected spacing} \approx \frac{MW}{30 \times 29} \approx \frac{28000}{870} \approx 32.2$$

$m/z = 933.8044$ is $z = 30$

$m/z = 966.0037$ is $z = 29$

MW ($z = 30$):

$$MW = (30 \times 933.8044) - (30 \times 1.007825)$$ $$= 28,014.13 - 30.23$$ $$= 27,983.9\ \text{Da}$$

Super close to the theoretical MW of $28,006.60\ \text{Da}$ from the first step


3. accuracy of the measurement using the deconvoluted MW from 2.2 and the predicted weight of the protein from 2.1

$$\text{Accuracy} = \frac{|MW_{\text{experiment}} - MW_{\text{theory}}|}{MW_{\text{theory}}}$$ $$= \frac{|27,983.9 - 28,006.60|}{28,006.60}$$ $$= \frac{22.7}{28,006.60}$$ $$= 0.00081$$ $$= 0.081\%$$

Waters Part III — Peptide Mapping - primary structure

1. Lysines (K) and Arginines (R) are in eGFP

K R

MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYGKLTLKFICTT GKLPVPWPTL VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH

K=20 R=6

2. Peptides generated from tryptic digestion of eGFP?

Theoretical pI: 5.90
Mw (average mass): 28006.60
Mw (monoisotopic mass): 27988.96

#MassPosition#MCModificationsPeptide Sequence
14472.1752170–2100HNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSK
22566.2931217–2390DHMVLLEFVTAAGITLGMDELYK
32437.26085–270GEELFTGVVPILVELDGDVNGHK
42378.257754–740LPVPWPTLVTTLTYGVQCFSR
51973.9062142–1570LEYNYNSHNVYIMADK
61503.659728–420FSVSGEGEGDATYGK
71266.578387–970SAMPEGYVQER
81083.4979240–2470LEHHHHHH
91050.5214115–1230FEGDTLVNR
10982.4952133–1410EDGNILGHK
11821.394081–860QHDFFK
12790.355275–800YPDHMK
13769.391347–530FICTTGK
14711.2944103–1080DDGNYK
15655.381398–1020TIFFK
16602.2780211–2150DPNEK
17579.3137128–1320GIDFK
18507.2925164–1670VNFK
19502.3235124–1270IELK

Predicted peptides = 19

  1. -5.

Observed LC peaks = ~19

Observed peaks are about the same as predicted

Main peptide peak = 525.76712

Spacing = 0.5 m/z, => most abundant charge state is 2+. The singly charged mass is 1050.52 Da

Charge = +2 [ M + H ] + [M+H] + = 1050.52145 Da [ M + H ] + [M+H] + = 1050.52438 Da

Identified peptide: FEGDTLVNR

the error is about 2.8 ppm

  1. Identify the peptide from PeptideMass

FEGDTLVNR has a theoretical monoisotopic mass of 1050.52143 Da

7. 88 is the percentage of the sequence that is confirmed by peptide mapping

Waters Part IV — Oligomers

7FU Decamer = 3.40 MDa

8FU Didecamer = 8 MDa

8FU 3-Decamer = 12 MDa

8FU 4-Decamer = 16 MDa

Waters Part V — Did I make GFP?

Yes!

ParameterTheoreticalObserved/Measured on Intact LC-MSPPM Mass ErrorMolecular Weight (kDa)
eGFP Monomer26,905 Da26,903 Da0.7 ppm26.90

Week 11 HW: Bioproduction & Cloud Labs

Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork

At this point, I’m not entirely sure what I’ve contributed to, as the artwork changed quite a lot. I’ve added a few yellow and green pixels when we first got access to the board, but as time went it changed so much that I don’t think any of them were left in the same spots. I did like the collaborative aspect of it, but going forward, I kind of wish everyone could contribute to just one/two pixels that couldn’t be overwritten. That would make the pointing-out aspect of it so much easier

arrwork contribution arrwork contribution

Part B: Cell-Free Protein Synthesis | Cell-Free Reagents

E. coli Lysate Provides the cellular machinery necessary for both transcription and translation, including ribosomes, tRNAs, translation factors, and other essential proteins.

Salts/Buffer

Potassium Glutamate Acts as the primary osmolyte to maintain ionic strength and pH stability in the reaction mixture.

HEPES-KOH pH 7.5 Maintains optimal pH for enzyme activity and protein synthesis throughout the reaction.

Magnesium Glutamate Provides essential Mg²⁺ cofactor required for ribosome function, translation factors, and RNA stability.

Potassium phosphate monobasic & dibasic Work together as a phosphate buffer system to maintain pH stability and provide inorganic phosphate necessary for nucleotide synthesis.

Energy/Nucleotide System

Ribose: A pentose sugar that serves as the sugar backbone for nucleotide synthesis.

Glucose: Provides metabolic energy and carbon source; can be phosphorylated to enter glycolysis and the pentose phosphate pathway for energy generation.

AMP, CMP, GMP, UMP Nucleoside monophosphates that serve as building blocks for RNA synthesis during transcription and are recycled during the reaction.

Guanine A purine base that can be salvaged and converted into GMP, providing an alternative route for guanine nucleotide synthesis when needed.

Translation Mix (Amino Acids)

17 Amino Acid Mix Provides the standard amino acids (minus Tyr and Cys, which are listed separately) needed for protein synthesis during translation.

Tyrosine & Cysteine Included separately, likely because they require special handling (Tyr for fluorescent protein variants, Cys to prevent oxidation or for selenomethionine incorporation).

Additives

Nicotinamide: Acts as a precursor for NAD⁺/NADH, essential cofactors for redox reactions and energy metabolism within the cell-free system.

Backfill

Nuclease Free Water: Dissolves all components and serves as the reaction medium; nuclease-free formulation prevents degradation of RNA transcripts.

Main Differences:

1-hour PEP-NTP vs. 20-hour NMP-Ribose-Glucose Master Mixes The 1-hour optimized PEP-NTP master mix uses phosphoenolpyruvate (PEP) and nucleoside triphosphates (NTPs), which provide rapid, readily-available energy and nucleotides for quick transcription/translation reactions with high yields but shorter reaction windows. In contrast, the 20-hour NMP-Ribose-Glucose master mix uses nucleoside monophosphates (NMPs) along with glucose and ribose, creating a more sustainable, self-regenerating system where nucleotides and energy are generated on-demand through metabolic pathways, allowing the reaction to continue for much longer periods but requiring longer reaction times to reach equivalent protein yields. The longer format is designed for extended protein synthesis reactions, while the PEP-NTP system is optimized for speed and high initial productivity.

Part C: Planning the Global Experiment | Cell-Free Master Mix Design

Part 1: Biophysical/Functional Properties of Each Fluorescent Protein

  1. sfGFP can fold in under 10 minutes, making it one of the fastest-maturing green fluorescent proteins.

  2. mRFP1 is reported to be a somewhat slowly-maturing monomer with low acid sensitivity.

  3. mKO2 is an orange fluorescent protein with moderate acid sensitivity

  4. In mTurquoise2, the reaction with oxygen is a rate-determining step (although folding contributes as well), and oxygen is necessary for generating the planar alkene of the fluorophore. In cell-free systems, limited oxygen availability—especially as the reaction progresses and consumes dissolved oxygen—can slow or halt chromophore maturation, limiting cyan fluorescence development

  5. mScarlet is a red fluorescent protein with moderate acid sensitivity, a fluorescence lifetime of 3.9 ns (the highest value recorded to date for mRFPs), and a quantum yield of 0.70; much higher than the quantum yield of other monomeric RFPs, resulting in the highest calculated brightness in the mRFP spectral class.

  6. Electra2 is a flavin-based fluorescent protein that functions independently from molecular oxygen—its fluorescence intensity is independent from molecular oxygen and remains stable under hypoxic conditions.

Part 2: Hypothesis for Optimization

Hypothesis: Optimizing mTurquoise2 Expression via Enhanced Oxygenation Protein of Focus: mTurquoise2 Reagent(s) to Adjust:

Add: Oxygen-rich buffer or increase aeration (inject air/oxygen into the reaction headspace or use a gas-permeable reaction chamber) Add: Catalase enzyme supplement (to manage hydrogen peroxide accumulation, which competes with oxygen utilization) Increase: Magnesium Glutamate concentration (higher Mg²⁺ can stabilize protein folding and chromophore maturation steps, especially when combined with oxygenation)

Expected Effect: Since the reaction with oxygen is a rate-determining step in mTurquoise2 maturation, increasing dissolved oxygen availability throughout the 36-hour incubation would:

Accelerate the rate-limiting oxidation step of chromophore maturation Generate peak cyan fluorescence earlier in the reaction timeline Maintain sustained fluorescence output even as reaction conditions evolve over 36 hours Prevent the typical oxygen-depletion bottleneck that limits cyan FP maturation in extended cell-free reactions

This approach leverages the known oxygen-dependency of mTurquoise2 maturation to directly address its biophysical limiting factor in a 36-hour cell-free context.

Alternative Hypothesis: Optimizing Electra2 for Cell-Free Viability via Flavin Supplementation Protein of Focus: Electra2 Reagent(s) to Adjust:

Add: Flavin Mononucleotide (FMN) or Flavin Adenine Dinucleotide (FAD) supplement (10–50 μM) Increase: Riboflavin concentration in the energy/nucleotide system (to replenish flavin pool as it’s consumed)

Expected Effect: Since Electra2 is a flavin-binding fluorescent protein, flavin mononucleotide (FMN) is essential to cell physiology as well as for determining fluorescence in flavin-based reporters. Supplementing exogenous FMN or FAD would:

Ensure adequate flavin cofactor availability throughout the 36-hour reaction Prevent signal loss due to flavin depletion (which occurs as the cell-free lysate consumes flavins for other metabolic processes) Maintain consistent blue fluorescence output independent of oxygen fluctuations Potentially enable higher protein expression levels by not competing for limited intracellular flavin pools

This approach addresses Electra2’s cofactor-dependent nature and leverages its oxygen-independence to provide a robust fluorescence readout over extended cell-free reactions.

Week 12 HW: Building Genomes

See week 12~