Homework

Weekly homework submissions:

  • Week 01 HW: Principles and Practices

  • Week 02 HW: DNA Read, Write, & Edit

  • Week 03 HW lab-automation:

    Table of Contents Assignment: Python Script for Opentrons Artwork Statement of Intent — Why Reaction–Diffusion Post-Lab Questions 1.Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications 2.Write a description about what you intend to do with automation tools for your final project Automation as a Tool to Explore the Behavioral Landscape of Living Materials Moving Beyond Optimization My Intended Use of Automation 1. Mapping Morphogenetic Regimes of Bacterial Cellulose 2. Spatial Programming of Living Matter 3. Toward a Cybernetic Living Material System Why This Matters Final Project Ideas Assignment: Python Script for Opentrons Artwork https://opentrons-art.rcdonovan.com/?id=sux110hip535fnx

  • Week 04 HW: Protein Design Part 1

    Table of Contents Conceptual Questions – Answers

  1. How many molecules of amino acids are in 500 g of meat? 2. Why do humans eat beef but do not become a cow? 3. Why are there only 20 natural amino acids? 4. Can we make non-natural amino acids? 5. Where did amino acids come from before life? 6. If you make an α-helix using D-amino acids, what handedness would you expect? 7. Can you discover additional helices in proteins? 8. Why are most molecular helices right-handed? 9. Why do β-sheets tend to aggregate? 9b. What is the driving force for β-sheet aggregation? 10. Why do many amyloid diseases form β-sheets? 10b. Can you use amyloid β-sheets as materials? 11. Design a β-sheet motif that forms a well-ordered structure Part B: Protein Analysis and Visualization
  1. Submit mutant SOD1 + peptide chains for AlphaFold modeling 3. Record ipTM score and binding localization 4. Compare ipTM values and known binder Part 3. Evaluate Properties in PeptiVerse
  1. Components of the Phusion High-Fidelity PCR Master Mix 2. Factors determining primer annealing temperature 3. PCR vs restriction enzyme digests 4. Ensuring compatibility for Gibson cloning 5. Plasmid DNA transformation into E. coli 6. Golden Gate Assembly Benchling / Modeling Component Asimov Kernel Assignment — Repository and Circuit Design
  1. Advantages of IANNs over Boolean genetic circuits 2. Useful applications for IANNs 3. Intracellular Multilayer Perceptron Diagram Assignment Part 2: Fungal Materials
  2. Existing fungal materials: uses, advantages, and disadvantages 2. Genetic engineering of fungi and advantages over bacteria Assignment Part 3: First DNA Twist Order
  1. Advantages of cell-free protein synthesis 2. Components of a cell-free expression system 3. Energy regeneration and ATP supply 4. Prokaryotic vs eukaryotic cell-free systems 5. Optimizing membrane protein expression 6. Troubleshooting low protein yield Homework question from Kate Adamala - Design Synthetic Minimal Cell Genetic Circuits
  • Week 10 HW: Advanced Imaging & Measurement Technology

    Table of Contents Homework: Final Project Final Project — Measurement Plan 1. Genetic Construct Verification 2. Tyr1 Protein Expression 3. Melanin Production 4. Bacterial Cellulose Growth 5. Electrochemical / Impedance Behavior 6. Environmental / Culture Conditions Summary Table Overall Goal Homework: Waters Part I — Molecular Weight Waters Part I — eGFP Molecular Weight 1. Calculated Molecular Weight 2. Molecular Weight from Adjacent Charge States Charge State Observation from the Zoomed-In Peak Homework: Waters Part II — Secondary/Tertiary Structure

  • Week 11 - HW - Bioproduction & Cloud Labs

    Table of Contents Part A — The 1,536 Pixel Artwork Canvas | Collective Artwork Part B: Cell-Free Protein Synthesis | Cell-Free Reagents

  1. Cell-free protein synthesis reaction components E. coli Lysate Salts / Buffer Energy / Nucleotide System Translation Mix (Amino Acids) Additives Backfill 2. PEP-NTP vs NMP-Ribose-Glucose master mix Part C: Planning the Global Experiment | Cell-Free Master Mix Design

Subsections of Homework

Week 01 HW: Principles and Practices

1) Biological engineering application / tool I want to develop (and why)

Application / tool: Growing interactive surfaces from bacterial cellulose

I already know how to grow 3D artifacts in bacterial cellulose (BC). In this project, I want to develop a biological functionalization workflow that turns grown BC artifacts into interactive surfaces, measured through impedance-based sensing (e.g., tactile volumetric response, sensitivity to pressure), with minimal embedded electronics.

The goal is not to integrate electronic components into the object, but to make the material itself behave as an interface. Electrodes are used only for readout and characterization, not as active elements.

Concretely, this toolchain would include:

  • Biological functionalization workflow
    Modulating the extracellular matrix during growth (for example through protein secretion into the BC network) to tune impedance-based electrical behavior and enable material-level interaction.

  • Measurement interface (readout only)
    A simple electrode setup and impedance readout used to validate, compare, and characterize material responses across different growth and functionalization conditions.

  • Optional integration (if time allows)
    Integrating functionalization into my existing 3D growth workflow (molds, scaffolds, and growth conditions), so objects can be grown already functionalized rather than treated after fabrication.

Why this matters

Most interactive devices are built by assembling electronics into synthetic materials. This project proposes a different paradigm:

Instead of fabricating objects and then adding interactivity, objects are cultivated so that interactivity emerges from biological growth and organization.

This matters for several reasons:

  • Sustainable interfaces: biodegradable material base, low-energy fabrication, and the possibility of metabolic degradation or recycling.
  • New interface paradigms: artifacts become processual and time-based, where growth history and material state shape interaction.
  • Research tool: impedance acts as a lens to study how morphology, hydration, and growth dynamics become measurable signals.

2) Governance / policy goals for an ethical future

Inspired by the iGEM Safety framework, the ethical challenges of this project are framed around biosafety, environmental responsibility, and responsible use of engineered living materials. Because this work lowers the barrier to growing functional living artifacts, governance should guide not only what is built, but how materials are grown, handled, and disposed of.

Goal A — Ensure biosafety and non-maleficence during cultivation and use

Although bacterial cellulose–producing organisms are generally considered low-risk, improper handling or uncontrolled experimentation could lead to contamination or unintended exposure.

Sub-goals

  • A1. Safe handling and containment
    Ensure cultivation, functionalization, and testing are conducted using appropriate hygiene, containment, and separation practices, especially when combining multiple biological agents.
  • A2. Transparency about biological modification
    Maintain a clear distinction between engineered biological components and non-biological elements (e.g., electrodes used only for readout).

Goal B — Prevent environmental harm and unintended release

A key motivation of this work is sustainability: growing interactive surfaces from living matter that can eventually be degraded or recycled. This benefit only holds if environmental risks are managed.

Sub-goals

  • B1. Controlled disposal and deactivation
    Ensure BC artifacts, growth media, and byproducts are deactivated or degraded before disposal, preventing accidental release into wastewater or compost systems.
  • B2. Avoid persistent or toxic additives
    Favor biological or biodegradable functionalization methods over persistent chemicals or heavy metals to preserve viable end-of-life pathways.

Goal C — Promote responsible development of living interactive interfaces

Because this project treats living materials as sensing and computational substrates, governance should guide how such systems are framed and used, particularly outside the lab.

Sub-goals

  • C1. Prevent misleading claims
    Avoid presenting biologically functionalized materials as autonomous, intelligent, or fully controllable systems.
  • C2. Limit sensing to material interaction
    Encourage applications focused on material state (touch, pressure, hydration).

3) Governance actions (comparative overview)

The table below summarizes three governance actions addressing biosafety, environmental impact, and responsible use of grown bacterial cellulose artifacts. Each action is evaluated across its purpose, design, assumptions, and potential risks.

| Governance Action | Purpose (What changes?) | Design (How it works / Who acts?) | Key Assumptions | Risks of Failure & “Success” |
|---|---|---|---|---|
| Option 1 — Biological process & provenance documentation | Current documentation emphasizes performance over biological process. This action requires explicit documentation of organism source, biological modification, and separation between biological and technical components. | Actors: instructors, research labs, journals, funding agencies; lightweight 1–2 page template integrated into coursework or reports; descriptive, not approval-based | Transparency reduces misuse, confusion, and overclaiming, and improves accountability if issues arise. | Failure: becomes a checkbox exercise with little reflection. Success risk: excessive formalization discourages exploratory research. Mitigation: keep requirements descriptive and allow uncertainty. |
| Option 2 — Standardized deactivation & disposal protocols | Living materials are often treated as benign, leading to inconsistent disposal. This action establishes mandatory deactivation and disposal procedures for living artifacts and growth media. | Actors: lab managers, EHS offices, instructors; clear SOPs (heat, chemical, drying); labeled waste streams and visible instructions | Improper disposal is the most likely real-world risk, especially as production scales or decentralizes. | Failure: users bypass protocols due to inconvenience or lack of enforcement. Success risk: overly strict rules push work into informal or unregulated spaces. Mitigation: make compliance easy and well-supported. |
| Option 3 — Readout-limited interface design norms | Biological sensing systems can drift toward surveillance or control narratives. This action promotes norms framing living materials as readout-based material interfaces, not monitoring systems. | Actors: researchers, designers, companies; explicit framing in documentation and demos; focus on local, material-state signals | Early design norms can influence long-term application trajectories and public expectations. | Failure: norms ignored in commercial or extractive contexts. Success risk: may limit some beneficial applications. Mitigation: scope norms to public, non-clinical, and creative deployments. |

4) Potential risks, use, misuse, and governance considerations

| Concern Category | Plausible Worst-Case Scenario | Why This Matters | Governance Focus |
|---|---|---|---|
| Biosecurity | Engineered bacteria persist outside intended settings | BC artifacts are wet and long-lived | Provenance + deactivation |
| Lab & user safety | Artifacts become contamination reservoirs | Objects may be touched or deployed publicly | Hygiene + stabilization |
| Environmental impact | Degradation causes microbial blooms | Biodegradable ≠ neutral at scale | Disposal pathways |
| Waste & end-of-life | Functionalization alters compost systems | Degradation is context-dependent | Additive disclosure |
| Misuse of sensing | Interfaces repurposed for monitoring | Ambiguous signals enable misuse | Readout limits |
| Overclaiming | Materials framed as autonomous | False trust and adoption | Transparency |
| Decentralized production | DIY use without safeguards | Escapes institutional oversight | Education-first governance |

5) Scoring governance actions

Scoring: 1 = best, 2 = moderate, 3 = weak

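| Policy Goal / Criterion | Option 1 | Option 2 | Option 3 |
|---|---|---|---|
| Prevent biosecurity incidents | 2 | 1 | 3 |
| Respond to incidents | 1 | 2 | 3 |
| Prevent lab safety incidents | 2 | 1 | 3 |
| Protect environment | 2 | 1 | 2 |
| Minimize burden | 1 | 2 | 1 |
| Feasibility | 1 | 2 | 2 |
| Not impede research | 1 | 2 | 2 |
| Promote constructive use | 1 | 2 | 2 |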
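
One way to read the scores above is to add them up per option. The sketch below does that twice: once over all eight criteria, and once over only the first four rows. Treating those four rows as a "harm-prevention" subset is an assumption made for illustration, not a grouping defined in the assignment. Lower totals are better, since 1 = best.

```python
# Scores copied from the table above (1 = best, 3 = weak).
# Tuple order: (Option 1, Option 2, Option 3).
scores = {
    "Prevent biosecurity incidents": (2, 1, 3),
    "Respond to incidents": (1, 2, 3),
    "Prevent lab safety incidents": (2, 1, 3),
    "Protect environment": (2, 1, 2),
    "Minimize burden": (1, 2, 1),
    "Feasibility": (1, 2, 2),
    "Not impede research": (1, 2, 2),
    "Promote constructive use": (1, 2, 2),
}
options = ("Option 1", "Option 2", "Option 3")


def totals(criteria):
    """Sum the scores of each option over the given criteria (lower is better)."""
    return {
        opt: sum(scores[c][i] for c in criteria)
        for i, opt in enumerate(options)
    }


# Unweighted total over every criterion.
all_totals = totals(scores)

# Hypothetical subset: the first four rows, read here as harm prevention.
harm_criteria = list(scores)[:4]
harm_totals = totals(harm_criteria)

print("All criteria:   ", all_totals)
print("Harm prevention:", harm_totals)
```

Over all criteria the unweighted sum favors Option 1, while the harm-related subset favors Option 2, so the ranking depends on how much weight harm prevention is given relative to burden and feasibility.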

6) Prioritization and recommendation

Priority order

  1. Option 2 — Deactivation & disposal protocols
  2. Option 1 — Provenance documentation
  3. Option 3 — Readout-limited norms

Option 2 addresses the most immediate and likely harm: environmental release and waste. Option 1 supports accountability and response without limiting research. Option 3 addresses longer-term risks if the technology scales.


7) Reflection

This exercise highlighted that the main risks of growing interactive living artifacts arise from success and scale, not malicious intent. Even biodegradable materials can become harmful when produced or discarded in large quantities. Another key insight is how easily biological sensing can be reframed beyond material interaction.

Governance should therefore focus on practice, documentation, and lifecycle management, rather than strict prohibition.

At this stage, the cultivation of interactive BC artifacts should remain confined to controlled research and educational contexts, and not be deployed in public or commercial environments without additional review.


8) Project + governance overview

mindmap
  root((BC Interactive Artifacts))
    Growth
      3D molds
      Functionalized matrix
    Interaction
      Impedance
      Tactile response
    Risks
      Biosecurity
      Waste
      Misuse
    Governance
      Disposal protocols
      Provenance docs
      Readout limits

Strategies to ensure an ethical biological future (project-integrated)

The table below summarizes the concrete strategies embedded in my final project to ensure it contributes to an ethical biological future. Rather than external rules, these strategies are implemented through design choices, constraints, and documentation practices.

| Strategy | Ethical Principle | What Risk It Addresses | How It Is Implemented in the Project | Why This Matters |
|---|---|---|---|---|
| Readout-first interface design | Observation over control | Instrumentalization of living systems; drift toward surveillance or behavioral monitoring | Electronics are used only for impedance readout; no actuation or feedback loops acting on the living material; signals framed as material-state indicators | Limits coercive or extractive uses of living matter and preserves biological agency |
| Scale and context limitation | Responsibility through constraint | Environmental harm and biosafety risks arising from success and scale | Project framed as research/educational tool; growth and testing limited to controlled lab or studio environments; public or commercial deployment explicitly excluded without further review | Prevents premature deployment and unmanaged scaling of living interfaces |
| Lifecycle-aware design | End-of-life accountability | Ecological disruption from disposal or accumulation of living materials | Preference for biologically degradable functionalization; explicit deactivation and disposal protocols included in the workflow | Sustainability is addressed across the full material lifecycle, not only fabrication |
| Transparency about biological modification | Epistemic responsibility | Overclaiming, black-box narratives, and misleading representations of “living intelligence” | Clear distinction between engineered biological processes and technical readout systems; documentation of uncertainty and variability | Builds trust and prevents misuse driven by misunderstanding or hype |
| Reflexive governance practice | Ethics as an evolving process | Static rules becoming obsolete as techniques evolve | Periodic reassessment of risks; use of lightweight governance tools (documentation, norms, constraints) rather than fixed prohibitions | Allows ethical considerations to evolve alongside the project and its capabilities |

Homework Questions from Professor Jacobson:

Question 1 — Polymerase error rate, genome size, and biological error correction

DNA polymerase, the molecular machinery responsible for copying DNA, has a non-zero error rate. As presented in the lecture slides, the error rate of an error-correcting DNA polymerase is approximately 1 error per 10⁶ base incorporations. While this error rate is very low at the scale of individual nucleotides, it becomes significant when compared to the size of the human genome, which is on the order of 3.2 × 10⁹ base pairs. If replication relied solely on polymerase accuracy, this discrepancy would imply the accumulation of thousands of errors during each complete genome replication, which would be incompatible with stable inheritance and organismal viability.

Biology resolves this apparent mismatch through multiple layers of error correction rather than relying on perfect synthesis. First, many DNA polymerases include proofreading activity (3′→5′ exonuclease) that detects and removes incorrectly incorporated bases during replication. Second, additional post-replication repair systems, such as mismatch repair pathways (e.g., the MutS system shown in the slides), identify and correct errors that escape polymerase proofreading. Together, these layered mechanisms dramatically reduce the effective error rate of DNA replication, allowing large genomes like the human genome to be copied reliably despite the intrinsic imperfection of polymerase activity.
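
The size of the mismatch can be checked with a short calculation. This is a minimal sketch using only the two figures quoted above (1 error per 10⁶ bases and 3.2 × 10⁹ base pairs). The variable names are illustrative, and the "less than one error per copy" target is an assumption chosen to make the gap concrete.

```python
# Figures quoted in the answer above.
error_rate = 1e-6      # errors per incorporated base
genome_size = 3.2e9    # base pairs in the human genome

# Expected number of errors if the whole genome is copied once at this rate.
expected_errors = error_rate * genome_size
print(f"Expected errors per genome copy: {expected_errors:.0f}")

# Error rate that would give, on average, one error per genome copy
# (an illustrative target, not a value from the slides).
required_rate = 1 / genome_size
print(f"Rate for ~1 error per copy: {required_rate:.2e} per base")

# Fold improvement that the additional correction layers would have to supply.
fold_improvement = error_rate / required_rate
print(f"Required improvement: about {fold_improvement:.0f}-fold")
```

With these inputs the expected count is roughly 3,200 errors per copy, which is the "thousands of errors" figure mentioned above.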


Question 2 — Coding redundancy and why most theoretical DNA codes do not work in practice

Due to the redundancy of the genetic code, an average human protein can, in theory, be encoded by an extremely large number of different DNA sequences. As shown in the slides, an average human protein is approximately 1036 base pairs long, corresponding to roughly 345 amino acids. Because many amino acids can be encoded by multiple synonymous codons, the number of possible nucleotide sequences that map to the same amino acid sequence grows combinatorially. In principle, this means there are astronomically many distinct DNA sequences that could encode the same protein sequence.

In practice, however, most of these theoretically valid codes do not function effectively in living systems. The lecture slides highlight several biological constraints that limit which sequences work. Certain DNA or RNA sequences form secondary structures with unfavorable free energy that interfere with transcription or translation. GC content bias can destabilize sequences or alter expression efficiency. Additionally, synthesis and assembly processes have their own error profiles, making some sequences more fragile than others. Finally, translation depends on interactions with cellular machinery such as ribosomes and tRNA pools, meaning that codon usage and sequence context matter beyond simple amino acid encoding. As a result, although the genetic code is redundant in theory, only a narrow subset of possible DNA sequences reliably produce functional proteins in practice.
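
To get a feel for how fast the number of synonymous sequences grows, the sketch below multiplies the number of codons available for each residue, using the codon degeneracy of the standard genetic code. The function names and the example peptides are illustrative, and the stop codon is ignored.

```python
import math

# Number of codons per amino acid in the standard genetic code (61 sense codons).
DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def synonymous_count(protein):
    """Exact number of DNA sequences encoding the given protein sequence."""
    return math.prod(DEGENERACY[aa] for aa in protein)


def log10_synonymous_count(protein):
    """Same quantity on a log10 scale, convenient for long proteins."""
    return sum(math.log10(DEGENERACY[aa]) for aa in protein)


# Short illustrative peptides (not taken from the slides).
for peptide in ("MW", "LRS", "MKTAYIAKQR"):
    print(peptide, synonymous_count(peptide))

# A hypothetical 345-residue protein made of a repeated motif, to show the scale.
toy_protein = "MKTAYIAKQRLSGEV" * 23
print(f"{len(toy_protein)} residues: about 10^{log10_synonymous_count(toy_protein):.0f} sequences")
```

Methionine and tryptophan contribute a factor of 1, while leucine, arginine, and serine each contribute a factor of 6, so the total depends on composition as well as length.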

Homework Questions from Dr. LeProust:

1. What’s the most commonly used method for oligo synthesis currently?

The most commonly used method for oligonucleotide synthesis today is solid-phase phosphoramidite chemical synthesis, originally developed by Caruthers in the early 1980s and still the industry standard. In this approach, DNA is synthesized base by base on a solid support (typically controlled pore glass or functionalized silica). Each synthesis cycle consists of four repeated chemical steps: coupling of a phosphoramidite nucleotide, capping of unreacted chains, oxidation to stabilize the backbone, and deprotection (deblocking) to expose the next reactive site. This cycle is repeated sequentially to build the desired oligo.

This method dominates because it is highly automatable, scalable, and compatible with parallelization, especially in modern platforms such as silicon-based microarrays (e.g., Twist Bioscience). However, it is fundamentally a stepwise chemical process, meaning that each added nucleotide has a non-zero failure rate. Even with very high per-step efficiencies (>99.5%), errors accumulate with length, which directly constrains how long oligos can be synthesized reliably. This intrinsic accumulation of errors explains many of the downstream limitations discussed in the following questions.

2. Why is it difficult to make oligos longer than ~200 nt via direct synthesis?

The primary difficulty in synthesizing oligos longer than ~200 nucleotides arises from the cumulative error rate of stepwise chemical synthesis. Each coupling step has a small probability of failure (incomplete coupling, side reactions, or truncation). While a single error is unlikely at short lengths, the probability of obtaining a full-length, error-free molecule drops exponentially as the number of synthesis cycles increases. Beyond ~200 nt, the fraction of correct full-length molecules becomes very low, even if average synthesis efficiency remains high.

In addition to error accumulation, sequence-dependent effects further complicate long oligo synthesis. Regions with high or low GC content, homopolymers, inverted repeats, or strong secondary structures (e.g., hairpins) reduce coupling efficiency and increase truncation or deletion events. Purification also becomes more challenging: separating full-length products from truncated sequences is increasingly inefficient as length grows. As a result, while advances in chemistry have pushed this limit somewhat (e.g., validated synthesis of ~300–500 nt in controlled contexts), ~200 nt remains the practical and economical upper bound for routine, high-fidelity direct synthesis.
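
The exponential drop described above can be sketched with a very simple model in which every coupling step succeeds independently with the same probability. This is an assumption for illustration: it ignores the sequence-dependent effects, side reactions, and purification losses mentioned in the text. The 99.5% default comes from the per-step efficiency quoted earlier, and the function name is hypothetical.

```python
def full_length_fraction(length_nt, coupling_efficiency=0.995):
    """Fraction of chains that reach full length under a constant per-step efficiency.

    An n-nt oligo needs n - 1 coupling steps, because the first nucleotide
    is already attached to the solid support.
    """
    return coupling_efficiency ** (length_nt - 1)


for length in (20, 60, 200, 500, 2000):
    fraction = full_length_fraction(length)
    print(f"{length:>5} nt: {fraction:.2e} full-length fraction")
```

Under this model, roughly 37% of chains reach full length at 200 nt, and well under 0.01% at 2000 nt, before any base-level errors are counted.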

3. Why can’t you make a 2000 bp gene via direct oligo synthesis?

A 2000 bp gene cannot be synthesized directly because chemical oligo synthesis does not scale linearly with length. At that size, the compounded error rate would make the probability of producing even a single correct full-length molecule essentially negligible. Even if synthesis chemistry allowed chain extension to 2000 nt, the resulting product pool would be dominated by truncated, mutated, or rearranged sequences, rendering it unusable without extensive correction.

Instead, long genes are produced through a hierarchical assembly strategy: shorter oligos (typically 40–200 nt) are first synthesized, then enzymatically assembled using methods such as PCR-based gene assembly or ligation. These assembled fragments are subsequently cloned and sequence-verified, allowing error correction through selection rather than chemistry. This separation of concerns—chemical synthesis for short, precise building blocks, and biological processes for long-range assembly and error correction—is fundamental. It reflects a broader principle emphasized in the course: chemistry writes locally, biology validates globally. Direct chemical synthesis alone cannot replace this division of labor for long DNA constructs.
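
The first step of the hierarchical strategy, breaking a long target into short overlapping oligos, can be sketched as follows. This is a simplified illustration, not a real assembly design tool: it tiles a single strand only, whereas real PCR-based assembly designs typically alternate strands and tune overlaps by melting temperature. The function name, the 60 nt oligo length, and the 20 nt overlap are assumptions.

```python
def tile_oligos(sequence, oligo_len=60, overlap=20):
    """Split a sequence into oligos of at most oligo_len that overlap by `overlap` nt."""
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and smaller than oligo_len")
    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        oligos.append(sequence[start:start + oligo_len])
        # Stop once the current oligo reaches the end of the target.
        if start + oligo_len >= len(sequence):
            break
        start += step
    return oligos


def reassemble(oligos, overlap=20):
    """Rebuild the target by joining oligos and dropping the shared overlap."""
    result = oligos[0]
    for oligo in oligos[1:]:
        result += oligo[overlap:]
    return result


# Toy 200 nt target (a repeated motif, for illustration only).
gene = "ACGT" * 50
oligos = tile_oligos(gene)
print(len(oligos), "oligos, lengths:", [len(o) for o in oligos])
print("Reassembles correctly:", reassemble(oligos) == gene)
```

Each oligo stays in the length range where direct synthesis is reliable, and the overlaps carry the information needed to order the fragments during enzymatic assembly.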

Homework Question from George Church:

Given the examples in slides #2 and #4, where biological systems use structured “codes” to mediate interactions between polymers (NA:NA via base pairing and AA:NA via translation and binding domains), an AA:AA interaction code would most plausibly rely on side-chain chemistry and spatial complementarity rather than a discrete symbolic alphabet.

Unlike nucleic acids, amino acids do not pair through a uniform geometry or hydrogen-bonding scheme. Instead, proteins interact through combinations of hydrophobicity, charge, polarity, aromatic stacking, and steric fit. Therefore, an AA:AA code is inherently analog and multidimensional, not digital. A reasonable “code” would be based on classes of side-chain interactions—for example, positively charged residues pairing with negatively charged ones, hydrophobic patches aligning with hydrophobic patches, or aromatic residues engaging in π–π stacking. This resembles a physicochemical code rather than a base-pair code.

In practice, such a code already exists implicitly in protein–protein interaction domains, coiled-coil motifs, antibody–antigen interfaces, and self-assembling protein systems. From a design perspective, an explicit AA:AA code could be engineered by constraining proteins to limited alphabets or repeat motifs, where interaction specificity emerges from repeating patterns of side-chain chemistry and geometry. This mirrors how TALE proteins use a simple amino-acid pair code to recognize DNA, but adapted to protein–protein recognition. Thus, AA:AA coding is best understood not as a lookup table, but as a designed interaction grammar grounded in chemistry and structure.
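
As a toy illustration of what an interaction grammar based on side-chain classes could look like, the sketch below scores two aligned peptide segments position by position. Everything here is hypothetical: the residue classes are a coarse simplification, the scores are arbitrary placeholders rather than measured energies, and real interfaces depend on three-dimensional geometry that this ignores.

```python
# Coarse residue classes (a simplification; unlisted residues score 0).
RESIDUE_CLASS = {
    "K": "positive", "R": "positive",
    "D": "negative", "E": "negative",
    "A": "hydrophobic", "V": "hydrophobic", "L": "hydrophobic",
    "I": "hydrophobic", "M": "hydrophobic",
    "F": "aromatic", "W": "aromatic", "Y": "aromatic",
    "S": "polar", "T": "polar", "N": "polar", "Q": "polar",
}

# Placeholder scores for class pairs (arbitrary values, chosen for illustration).
PAIR_SCORE = {
    frozenset(["positive", "negative"]): 1.0,   # salt bridge
    frozenset(["hydrophobic"]): 1.0,            # hydrophobic contact
    frozenset(["aromatic"]): 1.0,               # pi-pi stacking
    frozenset(["polar"]): 0.5,                  # hydrogen bonding
    frozenset(["positive"]): -1.0,              # like-charge repulsion
    frozenset(["negative"]): -1.0,
}


def interface_score(segment_a, segment_b):
    """Sum pair scores over aligned positions of two equal-length segments."""
    if len(segment_a) != len(segment_b):
        raise ValueError("segments must have the same length")
    total = 0.0
    for a, b in zip(segment_a, segment_b):
        pair = frozenset((RESIDUE_CLASS.get(a), RESIDUE_CLASS.get(b)))
        total += PAIR_SCORE.get(pair, 0.0)
    return total


print(interface_score("KKEE", "EEKK"))  # complementary charges
print(interface_score("KKKK", "KKKK"))  # like charges
print(interface_score("LIVF", "LIVF"))  # hydrophobic and aromatic contacts
```

The point of the sketch is only that specificity can come from patterns of classes along a repeat, in the spirit of coiled-coil charge pairing, rather than from a one-to-one symbol table.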


PROMPT:

You can find here the slides of Reading & Writing Life from George Church. Normally I have to answer one of these two questions (only one, not both):

A. [Using Google & Prof. Church’s slide #4] What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?
B. [Given slides #2 & 4 (AA:NA and NA:NA codes)] What code would you suggest for AA:AA interactions?

So, I don’t know which one is easiest to answer at the moment. We will use the PDF to determine which question is most likely to be answerable from it and reduce the uncertainty. To do this, we will:

  1. Break down the problem into sub-elements.
  2. Treat each of them with an explicit confidence level (0.0 - 1.0).
  3. Verify by checking the logic, facts, completeness, possible biases, what could be done, and how to achieve the objective (determine which question is easiest to answer without being wrong and give the answer).
  4. Synthesize and combine using weighted confidence levels.
  5. If confidence is below 0.8, identify weaknesses and how we could proceed to achieve a better level. Formulate clear answers, the level of confidence, and key points for vigilance.

Week 02 HW: DNA Read, Write, & Edit

0. Basics of Gel Electrophoresis

Gel electrophoresis is a fascinating process that allows DNA fragments to be separated according to size. The migration of fragments through agarose reveals invisible molecular differences as visible banding patterns. What interests me most is how information encoded in DNA becomes a spatial structure that can be interpreted visually.


1. Benchling and In-silico Gel Art

Following the protocol Gel Art: Restriction Digests and Gel Electrophoresis, I explored Benchling to simulate restriction enzyme digestion.

I imported Lambda DNA and simulated restriction digests using:

  • EcoRI
  • HindIII
  • BamHI
  • KpnI
  • EcoRV
  • SacI
  • SalI

At first, I was not entirely sure what I was doing — I explored the platform experimentally, clicking through the interface to understand how the enzymes cut and how the DNA fragments were segmented.

The origin of the fragments.

The operons targeted.

How it could or should look on a gel.

After generating fragment sizes, I moved to Ronan Donovan’s gel art tool:

👉 https://rcdonovan.com/gel-art

There, I entered fragment sizes from Benchling to simulate gel patterns. Interestingly, some of the same enzymes were already available in the tool. I experimented with fragment combinations, spacing, and contrast to create structured visual patterns. I tried composing a geometric “invader”-like form inspired by Paul Vanouse’s aesthetic approach.

This exercise made clear how restriction logic can generate latent images embedded in genomic structure.

2. Gel Art Restriction Digests and Electrophoresis

This tutorial was particularly helpful:

👉 https://www.youtube.com/watch?v=TIZRGt3YAug

Although I do not currently have access to a laboratory to run physical gels, I have previously seen Paul Vanouse’s Latent Figure Protocol artworks in person. A curator friend owns one and showed it to me during a bioart conference. Seeing the material presence of gel-based imagery deeply informed how I approached this in-silico exercise.


3.1 Choose Your Protein

Protein chosen: Tyrosinase (Tyr1) from Bacillus megaterium

I chose the tyrosinase Tyr1 enzyme from Bacillus megaterium because it directly enables the biosynthesis of eumelanin, a redox-active polymer with electronic and material properties.

This protein is particularly relevant to my project, which explores the biological functionalization of bacterial cellulose. By expressing Tyr1 in Komagataeibacter rhaeticus, the cellulose-producing bacterium can generate eumelanin within or around the extracellular matrix, thereby modifying the optical and potentially electroactive properties of the material.

Tyr1 can thus encode a macroscopic material property in Komagataeibacter rhaeticus cells.

This strategy has been experimentally validated in:

Walker et al., 2024, “Self-pigmenting textiles grown from cellulose-producing bacteria with engineered tyrosinase expression”. The authors engineered K. rhaeticus strains expressing Tyr1 and demonstrated eumelanin production under buffered neutral pH conditions (Supplementary Data 1, 41587_2024_2194_MOESM1_ESM).

The paper is here 👉 https://www.nature.com/articles/s41587-024-02194-3
Details are in Supplementary information / Supplementary Figs. 1–5, Tables 1–3 and Supplementary Data 1 and 2.

Or here: 📄 Supplementary Walker et al.

The principal idea:

DNA (tyr1)
↓ transcription
mRNA
↓ translation
Tyrosinase enzyme
↓ catalytic activity
Melanin polymer
↓ integration into cellulose matrix
Modified impedance
↓
Human interaction (touch sensing)

Rather than maximizing conductivity, the goal is to explore how living metabolism can reorganize electrochemical properties within a self-grown material matrix. A hybrid approach combining enzymatic melanin production with controlled ionic modulation of the growth medium may provide a balance between biological emergence and measurable electrical response.

Impedance vs Conduction (for BC-based “electronic” biofilms)

Quick intuition

  • Conduction (DC): current flows through a continuous conductive path (like a wire).
    → You usually need a percolating network (CNT/graphene/metal paths).

  • Impedance (AC): the material reacts to an alternating signal.
    → It mixes resistance + capacitance + ionic diffusion, and is extremely sensitive to: hydration, ions, porosity, and microstructure.

Since my goal is a tactile / capacitive surface, impedance is usually the most realistic target.
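To make the distinction concrete, here is a minimal sketch that treats a hydrated BC sample as a resistor in parallel with a capacitor. This equivalent circuit is a simplifying assumption, and `R_OHMS` and `C_FARADS` are hypothetical placeholder values, not measurements of a real pellicle.

```python
import cmath
import math

# Hypothetical placeholder values, not measurements of a real BC sample.
R_OHMS = 100_000.0   # bulk (mostly ionic) resistance
C_FARADS = 10e-9     # effective capacitance

def parallel_rc_impedance(freq_hz, r=R_OHMS, c=C_FARADS):
    """Complex impedance of a resistor in parallel with a capacitor."""
    omega = 2 * math.pi * freq_hz
    return r / (1 + 1j * omega * r * c)

for f in (0.0, 100.0, 1_000.0, 10_000.0):
    z = parallel_rc_impedance(f)
    phase_deg = math.degrees(cmath.phase(z))
    print(f"{f:>8.0f} Hz  |Z| = {abs(z):>10.1f} ohm  phase = {phase_deg:6.1f} deg")
```

At 0 Hz (DC) only the resistance is visible. As the frequency rises, the capacitive branch lowers the magnitude and shifts the phase, which is the extra information an AC measurement gives.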


Concept map (what you measure defines what you build)

flowchart TB
A[BC biofilm / pellicle]--> B{What do I measure}

B-->|DC or very low frequency| C[Conduction]
B-->|AC multi-frequency| D[Impedance]

C-->C1[Electronic transport]
C-->C2[Needs a percolating path]
C2-->C3[CNT graphene Ag nanowires PEDOT]

D-->D1[Resistance R]
D-->D2[Capacitance C double layer]
D-->D3[Ionic diffusion]
D-->D4[Strongly affected by water ions porosity]
D4-->D5[Good for touch pressure hydration sensing]

Protein Sequence (from Supplementary Data 1)
>sp|Tyr1_BACME|Tyrosinase OS=Bacillus megaterium
MGNKYRVRKNVLHLTDTEKRDFVRTVLILKEKGIYDRYIAWHGAAGKFHTPPGSDRNAAHMSSAFLPWHREYLLRFERDLQSINPEVTLPYWEWETDAQMQDPSQSQIWSADFMGGNGNPIKDFIVDTGPFAAGRWTTIDEQGNPSGGLKRNFGATKEAPTLPTRDDVLNALKITQYDTPPWDMTSQNSFRNQLEGFINGPQLHNRVHRWVGGQMGVVPTAPNDPVFFLHHANVDRIWAVWQIIHRNQNYQPMKNGPFGQNFRDPMYPWNTTPEDVMNHRKLGYVYDIELRKSKRSS*

In Benchling: https://benchling.com/s/seq-cSFfevSwjHFxf6KwwXGx?m=slm-D8VDordW2oyuqdI6bH4r

Structural Micro-Analysis: Copper Coordination and Catalysis

Tyrosinases are type-3 copper enzymes. Their catalytic activity depends on two copper-binding sites (CuA and CuB), each coordinated by conserved histidine residues. These histidine-rich motifs enable:

  • Binding of molecular oxygen
  • Oxidation of L-tyrosine to L-DOPA
  • Conversion to dopaquinone
  • Spontaneous polymerization into eumelanin

The presence of conserved histidines in the Tyr1 sequence reflects this copper-dependent catalytic architecture. The structure of the enzyme directly determines its ability to generate a redox-active polymer. Here, protein folding and metal coordination translate genetic code into material transformation.


Why This Protein Matters

1️⃣ DNA → enzyme → material property

The tyr1 gene encodes an enzyme. The enzyme catalyzes a redox reaction. The reaction produces a polymer. The polymer modifies material properties.

Thus: genetic sequence → enzyme structure → catalytic activity → polymer formation → macroscopic material pigmentation and electroactivity.

This directly embodies the theme of “Reading & Writing Life.”

2️⃣ Minimal synthetic biology intervention

Unlike multi-gene systems (e.g., the curli operon), Tyr1:

  • Requires only one coding sequence
  • Does not modify the cellulose biosynthesis operon (bcsABCD)
  • Adds a new functional layer to the extracellular matrix

This makes it an elegant example of additive biological functionalization, where a single gene expands the material phenotype of a living system.

3️⃣ Connection to biofabrication and living materials

Walker et al. demonstrate patterned melanin production using optogenetic control of tyr1 expression. This shows that:

  • Gene expression can spatially control material coloration
  • Biological programming can encode textile-level patterning
  • Material properties can be written through genetic regulation

This aligns directly with my research interest in growing functionalized 3D artifacts and interactive living interfaces.

Beyond color: material programming

In this project, the goal is not simply to encode color in DNA. The visible pigmentation is only a surface effect. What is actually encoded is a redox-active polymer production pathway inside living matter.

Melanin modifies:

  • local redox properties,
  • hydration-dependent conductivity,
  • and potentially the impedance behavior of the cellulose matrix.

So pigmentation here is not aesthetic encoding, but electrochemical encoding. The gene does not just change appearance — it changes how the material behaves electrically. In that sense, DNA is not only encoding a protein, but indirectly programming a material state.

3.2 Reverse Translate

Protein (amino acid) sequence → DNA (nucleotide) sequence

Because of codon degeneracy, there is not a single unique DNA sequence for a given protein sequence. Many different nucleotide sequences can encode the exact same amino acid sequence. To reverse-translate Tyr1, I used a reverse-translation approach (equivalent to common online reverse-translation tools) by selecting one plausible codon per amino acid (a “standard/high-frequency codon” strategy).

The protein sequence used is Tyr1 from Bacillus megaterium as reported in Walker et al. 2024 (Supplementary Data 1).

Reverse-translated nucleotide sequence (one valid coding DNA sequence)
ATGGGCAATAAATACCGCGTGCGTAAGAATGTTCTGCACCTGACAGATACCGAGAAGCGTGACTTCGTGCGCACTGTACTGATTTTGAAAGAGAAGGGCATTTACGATCGTTACATCGCATGGCACGGCGCCGCGGGTAAGTTTCACACCCCGCCCGGTAGTGACCGTAACGCGGCGCACATGTCGAGTGCGTTCTTGCCTTGGCACCGCGAATATCTGCTGCGCTTTGAGCGCGATCTGCAATCGATTAACCCTGAGGTGACTCTGCCGTACTGGGAGTGGGAAACCGATGCTCAAATGCAAGACCCTAGCCAGTCGCAGATCTGGAGCGCCGACTTCATGGGCGGCAATGGCAACCCAATTAAGGACTTCATTGTAGACACGGGCCCGTTCGCTGCCGGCCGTTGGACAACCATTGACGAGCAGGGTAACCCGTCAGGCGGCTTAAAGCGCAACTTCGGTGCGACTAAGGAAGCCCCCACCCTGCCGACGCGCGACGACGTGCTGAACGCACTTAAGATTACCCAATACGACACCCCACCCTGGGACATGACGTCCCAGAATAGTTTCCGCAACCAACTCGAGGGTTTCATCAATGGCCCGCAACTGCATAACCGTGTGCATCGCTGGGTCGGTGGCCAAATGGGTGTCGTCCCTACCGCGCCCAACGACCCGGTGTTCTTCCTGCATCATGCGAACGTTGACCGCATCTGGGCCGTGTGGCAGATCATCCACCGCAACCAGAATTACCAACCAATGAAGAATGGCCCGTTCGGCCAGAATTTCCGTGACCCAATGTATCCATGGAACACCACGCCTGAGGATGTAATGAATCACCGTAAACTGGGCTATGTTTATGACATCGAGTTGCGTAAGTCGAAGCGCAGCTCTTGA

Degeneracy and probabilistic encoding

During this process, I realized that reverse translation is not deterministic. The genetic code is degenerate: multiple DNA sequences can encode the same protein. Choosing a DNA sequence is therefore not neutral. Codon usage can influence expression level, folding efficiency, and metabolic burden in the host organism. I am only beginning to understand how codon bias shapes expression outcomes. What appears to be a simple reverse translation step is in reality a probabilistic optimization problem shaped by cellular context.
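As a rough illustration of how large this choice space is, the sketch below counts how many distinct DNA sequences encode a given peptide under the standard genetic code. The helper name `count_encodings` is my own, and the example uses only the first ten residues of Tyr1.

```python
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Invert the table: amino acid -> list of synonymous codons
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)

def count_encodings(protein):
    """Number of distinct DNA sequences that encode the given protein."""
    return prod(len(SYNONYMS[aa]) for aa in protein)

peptide = "MGNKYRVRKN"  # first 10 residues of Tyr1
print(count_encodings(peptide))
```

Even these ten residues admit 18,432 synonymous encodings, so the full protein has an astronomically large number of valid coding sequences.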

3.3 Codon Optimization

Why codon optimization is necessary

Although multiple DNA sequences can encode the same protein due to codon degeneracy, not all synonymous codons are used equally in different organisms.

If a gene uses codons that are rare in the host organism:

  • Translation can be slow
  • Ribosomes may stall
  • Protein yield can decrease
  • Misfolding may increase

Codon optimization ensures that the nucleotide sequence:

  • Matches the host organism’s codon usage bias
  • Improves translation efficiency
  • Reduces metabolic burden
  • Avoids problematic motifs (e.g., strong secondary structures, cryptic promoters, restriction sites)

Thus, codon optimization improves expression of the gene in the chosen host.


Organism chosen for optimization

I chose to optimize the tyr1 gene for expression in: Komagataeibacter rhaeticus

This organism is the cellulose-producing bacterium used in Walker et al. (2024) and is central to my broader project on biologically functionalized bacterial cellulose.

Because K. rhaeticus has a relatively high GC content and a codon usage profile distinct from Bacillus megaterium, optimization is required to:

  • Improve translation efficiency
  • Ensure stable protein production
  • Support effective melanin biosynthesis

Additional design constraints

Following best practices (e.g., Twist Bioscience optimization tools), the optimized sequence was also designed to:

  • Avoid Type IIS restriction enzyme recognition sites:
    • BsaI
    • BsmBI
    • BbsI
  • Avoid internal repetitive motifs
  • Maintain balanced GC distribution
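These constraints can be checked with a few lines of code. The sketch below scans both strands for the recognition sequences of the three enzymes listed above (BsaI GGTCTC, BsmBI CGTCTC, BbsI GAAGAC) and reports the overall GC fraction. The function names and the toy sequence are illustrative assumptions; this is not the check that Twist's own tools run.

```python
# Recognition sequences of the Type IIS enzymes listed above
TYPE_IIS_SITES = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC", "BbsI": "GAAGAC"}

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_type_iis_sites(seq):
    """Return {enzyme: [0-based positions]} for sites found on either strand."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in TYPE_IIS_SITES.items():
        positions = []
        for motif in (site, reverse_complement(site)):
            start = seq.find(motif)
            while start != -1:
                positions.append(start)
                start = seq.find(motif, start + 1)
        if positions:
            hits[enzyme] = sorted(positions)
    return hits

def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

example = "ATGGGTCTCAAGAGACGTTT"  # toy sequence, not the tyr1 construct
print(find_type_iis_sites(example))
print(round(gc_fraction(example), 2))
```

An empty dictionary from `find_type_iis_sites` means none of the three sites is present on either strand.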

Codon-Optimized DNA Sequence (for K. rhaeticus)
ATGGGCAACAAGTATCGCGTCCGCAA GAACGTGCTGCACCTGACCGACACCGAGAAGCGCGACTTCCGCAGCACCTGCACCATTCTGAGGAGATCGGCTACGACCGCATCTACATCGGCGCGGCGGGCAAGTTCCACACCCCCACCGGCAGCGACCGCAACGCGGCACACTCGATCGCGCGTTCACCTGGCAACGCTACCTGCGCTTCGAGCGCTACCAGAGCATCAACCCCGAGGTGACCCTGCCGTACTGGGAGTGGGAGACCGACGCGCAGATCGACCCCGACAGCCAGATCTGGAGCGCGGACTTCATCGGCGGCAACGGCAACCCCATCAAGGACTTCATCGTCGACACCGGCCCCGTC TTCGCGGCGGGCCCCACCATCGACGAGCAGGGCAACCCCAGCGGCGGCCTGAAGCGCAACTTCGGCGCGACCAAGGAGGCGCCGACCCCCGCGCGCTCCGACGACGTGCTGAAGCGCCTCAAGATCTACC
CCAACATCGACACCCCCCTGGACATGACCTCCAACAGCTTCCGCAACCAGCTCGAGGGCATCTACAACGGCCCCCAGATCTTCAACAACCGCGTGCACCGCTGGGGCGGCCAGATGGGCGTGGTGCCCACCCCCGCGCCCAACGACCCCGTCTTCTCCTGCACCATGCCAACGTGGACCGCTACTGGGCCGTGTGGCAGATCCACCGCAACCAGAACTACCCCCATGAAGAACGGCCCCTTTGGCCAGAACTTCCGCTACCCCAACGTCTCCTGGAATACCACCGCGGAGGACGTCATGAACCACCGCAAGCTGGGCTACGTGTACGACATCGAGCTGCGCAAGAGCAAGCGCAGCAGCTAA

3.4 You have a sequence / Now what?

Now that we have a codon-optimized DNA sequence encoding Tyr1, the next step is to express the protein.

How can this protein be produced?

There are two main strategies:


A. Cell-dependent expression (in vivo)

In this approach, the DNA sequence is inserted into a plasmid vector containing:

  • A promoter (constitutive or inducible)
  • A ribosome binding site (RBS)
  • The coding sequence (our optimized tyr1)
  • A transcription terminator

The plasmid is then introduced into a host organism (e.g., Komagataeibacter rhaeticus).

What happens biologically?

1️⃣ Transcription
RNA polymerase binds to the promoter and transcribes the DNA sequence into messenger RNA (mRNA).

2️⃣ Translation
Ribosomes bind to the mRNA at the ribosome binding site. The mRNA codons (triplets) are read. tRNAs deliver amino acids corresponding to each codon. The ribosome assembles the amino acids into the Tyr1 protein.

3️⃣ Protein folding and function
The translated Tyr1 protein folds into its 3D structure. Copper ions bind to the active site. The enzyme becomes catalytically active.

In the case of Tyr1, the enzyme then catalyzes: L-tyrosine → L-DOPA → dopaquinone → eumelanin Thus, genetic information becomes a catalytic material transformation.


B. Cell-free expression (in vitro)

Alternatively, the DNA sequence can be expressed using a cell-free transcription-translation system.

These systems contain:

  • RNA polymerase
  • Ribosomes
  • tRNAs
  • Amino acids
  • Energy sources

The optimized DNA is added directly to the mixture.

Transcription and translation occur outside living cells.

Advantages:

  • Rapid prototyping
  • No need for transformation
  • Easier control of conditions

For exploratory material functionalization, cell-free systems can serve as a fast validation step before in vivo implementation.


From DNA to Protein: Summary

DNA (double-stranded)
→ transcription →
mRNA (single-stranded)
→ translation →
Protein (amino acid chain)

This is the operational flow of the Central Dogma.
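The same flow can be sketched in a few lines of Python. This is a simplified model that assumes the input is the coding strand read from its first base. The function names are my own, and the example is the first six codons of the reverse-translated tyr1 sequence followed by a stop codon.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def transcribe(dna):
    """Coding-strand DNA -> mRNA (T replaced by U)."""
    return dna.upper().replace("T", "U")

def translate(mrna):
    """Read codons from the first base until a stop codon or the end of the message."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA codons
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)

dna = "ATGGGCAATAAATACCGCTGA"  # first 6 codons of reverse-translated tyr1 + stop
mrna = transcribe(dna)
print(mrna)
print(translate(mrna))
```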


From DNA to material interface

If we follow the chain of transformations:

DNA (tyr1 gene)
→ mRNA
→ Tyrosinase enzyme
→ Melanin polymer formation
→ Integration into the cellulose matrix
→ Modified impedance behavior
→ Human touch interaction

This illustrates that writing DNA is not only about modifying organisms. It is about programming how matter self-organizes over time.

My interest is not genetic novelty for its own sake, but how cellular metabolism reorganizes matter into functional biofilms.


4. Prepare a Twist DNA Synthesis Order

4.1. Create a Twist account, and Benchling account

To prepare the DNA synthesis order, I created two accounts:

  • Benchling, used to design and annotate the DNA construct.
  • Twist, used to validate and prepare the sequence for synthesis.

The idea is to make a manufacturable artifact, marking the transition from theory to industrial bioengineering.


4.2. Build Your DNA Insert Sequence

The expression cassette was assembled in Benchling as a linear DNA construct including:

  • Promoter
  • RBS
  • Coding sequence (Tyr1)
  • Stop codon
  • Terminator

Benchling file:

🔗 Benchling sequence link

ZIP version here : 📄 Tyr1 ZIP

Annotated sequence overview

Benchling allows visualization of the construct with annotated features and automatic translation of the coding sequence.

Benchling translation view

This confirms:

  • Correct open reading frame
  • Proper annotation of CDS
  • Expected protein translation
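The same three checks can also be run outside Benchling. The sketch below is a minimal, assumption-laden version: it only looks at the coding sequence itself (start codon, reading frame, stop codons), and the function name `check_orf` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def check_orf(cds):
    """Return a list of problems found in a coding sequence (empty list = looks fine)."""
    cds = cds.upper()
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length is not a multiple of 3")
    if not cds.startswith("ATG"):
        problems.append("does not start with ATG")
    # Split into complete codons, ignoring any trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("contains an internal stop codon")
    return problems

print(check_orf("ATGGGCAATTGA"))   # well-formed toy ORF
print(check_orf("ATGTGAGGCTAA"))   # toy ORF with a premature stop
```

Running this on a sequence before ordering is a cheap way to catch a frameshift or a truncated paste.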

Twist synthesis validation

The sequence was uploaded to Twist Bioscience for synthesis validation and plasmid build preparation.

Twist plasmid preview

Twist checks for:

  • Sequence length constraints
  • GC content
  • Secondary structures
  • Restriction site conflicts
  • Synthesis compatibility

This step transforms a conceptual genetic design into a manufacturable DNA construct.


Files

📄 Download: Tyr1 synthesis package (ZIP)


Optional: SBOL visualization

To visually represent the genetic architecture, the construct can also be recreated using: 🔗 https://sbolcanvas.org
SBOL Canvas enables graphical design of: Promoter → RBS → CDS → Terminator
This helps communicate the logic of the construct clearly and aligns with synthetic biology design standards.


5. DNA Read/Write/Edit

5.1 DNA Read
(i) What DNA would I want to sequence and why?

I would sequence synthetic DNA used for digital data storage. Instead of reading DNA from living organisms, I am interested in DNA that has been artificially synthesized to encode digital information (text, images, archives). In this case, DNA is not used as biology — it is used as a storage material.

Why this matters

DNA can store enormous amounts of information in a very small volume and can remain stable for long periods of time. Unlike hard drives or servers, it does not depend on electronic infrastructure. Sequencing such DNA means reading information written into molecules. It turns a biological technology into a digital decoding tool. This connects biology with computation and raises important questions about how information can be stored, preserved, and accessed in the future.


(ii) Which sequencing technology would I use and why?

I would use Illumina sequencing.

Why Illumina?

DNA used for digital storage is usually made of short fragments. Illumina sequencing works very well for short pieces of DNA and provides very high accuracy, which is essential when decoding stored data. Small sequencing errors could corrupt the recovered file, so accuracy is more important than read length.


What generation is it?

Illumina is considered a second-generation sequencing technology because it reads millions of DNA fragments at the same time using amplified clusters and fluorescent signals.


What is the input and preparation?

Input:
Synthetic double-stranded DNA fragments (around 100–200 base pairs).

Basic preparation steps:

  1. Add short adapter sequences to the DNA fragments
  2. Amplify the fragments
  3. Load them onto the sequencing machine

How does Illumina read DNA?

The machine copies the DNA one base at a time.
Each base (A, T, C, G) produces a specific fluorescent signal.

A camera records the signal after each step, and software converts these signals into a DNA sequence.


What is the output?

The output is a digital file containing:

  • The DNA sequences
  • A quality score for each base

In DNA data storage, these sequences are decoded back into binary information using error-correction algorithms.
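As a toy illustration of that decoding step, the sketch below maps two bits to one base and back. This naive mapping is an assumption made for clarity: practical schemes add addressing and error correction and avoid long runs of the same base, none of which is modeled here.

```python
# Naive 2-bits-per-base mapping (illustrative assumption, not a production scheme)
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Bytes -> DNA string, 4 bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(dna: str) -> bytes:
    """DNA string -> bytes (inverse of encode)."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

message = b"BC"
strand = encode(message)
print(strand)
print(decode(strand))
```

A single misread base changes two bits of the recovered file, which is why read accuracy and error correction matter so much for this application.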


Why not Nanopore?

Nanopore sequencing can read longer DNA fragments, but it generally has higher error rates. For digital data storage, accuracy is more important than read length, so Illumina is currently more suitable.


Conceptual note

In this case:

DNA is not a gene.
It is a storage medium.

Sequencing is not diagnosing life.
It is decoding information.

5.2 DNA Write

(i) What DNA would I want to synthesize and why?

I would synthesize a genetic construct enabling melanin production in cellulose-producing bacteria.

More specifically, I would synthesize a codon-optimized version of the tyrosinase gene (tyr1) from Bacillus megaterium, placed inside an expression cassette designed for bacterial hosts.

The goal is to enable engineered Komagataeibacter strains (cellulose-producing bacteria) to produce eumelanin, a dark redox-active polymer, directly within the growing cellulose matrix.

This would allow:

  • Pigmented living materials
  • Potentially electroactive cellulose composites
  • A direct link between gene expression and macroscopic material properties

In this case, DNA is used not only to encode a protein, but to program a material property.


Example genetic construct (simplified expression cassette)

Promoter → RBS → tyr1 CDS → Stop → Terminator

This DNA construct would enable constitutive expression of tyrosinase, leading to melanin formation when L-tyrosine is available.


(ii) What DNA synthesis technology would I use and why?

I would use phosphoramidite-based chemical DNA synthesis, the standard method used by companies like Twist Bioscience. This is currently the most reliable and scalable way to synthesize custom genes.


Essential steps of DNA synthesis
  1. Chemical synthesis of short oligonucleotides
  2. Assembly of oligos into full-length genes
  3. Error correction and amplification
  4. Cloning into a plasmid vector
  5. Sequence verification

Limitations
  • Error rates increase with sequence length
  • GC-rich or repetitive sequences are harder to synthesize
  • Cost increases with size
  • Large constructs require assembly from smaller fragments
  • Scalability

Scaling biological material production is not a simple matter of increasing volume. In living systems, morphology, oxygen gradients, metabolic stress, and contamination risks can fundamentally alter structural outcomes. A construct that works at lab scale does not automatically behave the same way at industrial scale. Understanding small-scale control is therefore a necessary first step before considering scalability.

Despite these limits, gene-scale synthesis (1–3 kb) is highly robust today.


5.3 DNA Edit

My ambition is not maximal intervention, but controlled understanding. Before redesigning entire biosynthetic networks, I prefer mastering one enzymatic layer and observing its material consequences. Editing cellulose crystallinity, secretion pathways, or c-di-GMP regulation may be possible in the future. But at this stage, focusing on a single added function (tyr1 expression) allows a clearer understanding of cause and effect.

(i) What DNA would I want to edit and why?

I would want to edit the genome of a cellulose-producing bacterium (Komagataeibacter) to integrate a functional gene such as tyr1 directly into its chromosome.

Rather than keeping the gene on a plasmid, chromosomal integration would:

  • Increase genetic stability
  • Reduce reliance on antibiotic selection
  • Make the engineered trait more sustainable

More broadly, DNA editing could be used to:

  • Improve cellulose yield
  • Modify crystallinity or fiber structure
  • Control secretion of functional proteins
  • Enable responsive or patterned material growth

The objective is material programming through living systems such as Komagataeibacter. This raises technical and ecological questions regarding containment, stability, and unintended environmental interactions.


(ii) What technology would I use and why?

I would use CRISPR-Cas9 genome editing. CRISPR allows precise insertion or modification of DNA at specific genomic locations.


How CRISPR edits DNA
  1. Design a guide RNA (gRNA) targeting a specific genomic sequence
  2. Deliver Cas9 protein and guide RNA into the cell
  3. Cas9 creates a double-strand break at the target site
  4. Provide a donor DNA template containing the new gene
  5. The cell repairs the break using the donor template (homology-directed repair)
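Step 1 can be sketched as a simple search. The example below assumes the commonly used SpCas9 enzyme, whose targets are 20-nt protospacers followed by an NGG PAM. It scans only the strand it is given and does no off-target or efficiency scoring, so it is a starting point rather than a guide-design tool. The function name is hypothetical.

```python
import re

def find_protospacers(seq, length=20):
    """Yield (start, protospacer, pam) for each NGG PAM on the given strand."""
    seq = seq.upper()
    # Lookahead so that overlapping candidates are all reported
    pattern = re.compile(rf"(?=([ACGT]{{{length}}})([ACGT]GG))")
    for match in pattern.finditer(seq):
        yield match.start(), match.group(1), match.group(2)

toy_target = "A" * 20 + "TGG" + "CCC"  # toy sequence, not a Komagataeibacter locus
for start, protospacer, pam in find_protospacers(toy_target):
    print(start, protospacer, pam)
```

To cover both strands, the same function can be run on the reverse complement of the target region.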

Required inputs
  • Guide RNA sequence
  • Cas9 enzyme (plasmid or ribonucleoprotein complex)
  • Donor DNA template
  • Competent cells
  • Transformation system

Limitations
  • Editing efficiency may be low in non-model organisms
  • Off-target edits are possible
  • Homology-directed repair is not always efficient
  • Delivery systems can be challenging

Despite these challenges, CRISPR remains one of the most precise and flexible editing tools currently available.


Conceptual note

DNA writing defines what is possible.
DNA editing reshapes what already exists.

Together, they allow us to move from reading biology to programming living materials.

Personal note

Writing DNA is not merely modifying organisms; it is programming how matter self-organizes over time.

6. Exploration of other strategies for my project

Compare 4 Conductivity Strategies (What They Really Do)
| Strategy | Main Effect | What Carries the Signal | Typical Ingredients | What Changes in the Material | Difficulty (1–5) |
| --- | --- | --- | --- | --- | --- |
| 1) Graphene/CNT in-situ | Strong conduction | Electronic percolation network | CNT, graphene/rGO, dispersion aid | Large drop in resistance, wire-like behavior | 2/5 |
| 2) PEDOT:PSS / Ag nanowires (post or in-situ) | Strong conduction | Polymer film or metallic network | PEDOT:PSS, AgNWs | High conductivity, less biological emergence | 3/5 |
| 3) Tyr1 → melanin (bio-made dopant) | Impedance / redox / protonic | Redox polymer + water/ions | Tyr1 + Cu²⁺ + tyrosine (neutral pH development step) | Response depends on hydration and pH; useful for sensing | 3/5 |
| 4) Curli-like programmable fibers | Structure + hybrid conduction | Protein fibers + mineralization or binding | Curli operon + metal-binding peptides | Programmable and patternable but multi-gene | 4/5 |

Reading this table
  • If the goal is to create a wire-like conductive material, strategies 1–2 are the most effective.
  • If the goal is to create a living sensing surface, strategy 3 (and potentially 4 later) is more coherent.

Is Tyr1 Enough for a Tactile Sensor?

What I am actually building with Tyr1 + BC

With Tyr1/melanin integrated into bacterial cellulose, I am not creating a high-performance conductor. Instead, I am most likely creating an impedance-based sensor.

Two realistic sensing behaviors:

  1. Touch or pressure → impedance change
    Mechanical compression modifies microstructure and redistributes water and ions.
    → Impedance changes, especially at specific AC frequencies.

  2. Hydro-tactile response (touch + humidity)
    BC is highly sensitive to hydration.
    → Strong signal variation, but humidity must be controlled to avoid false readings.


Does the material need to remain alive?

Not necessarily. Two modes are possible:

  • Living mode
    The material continues to grow and self-organize.
    Advantage: conceptual strength, biological emergence.
    Limitation: signal drift over time due to metabolism and morphology changes.

  • Stabilized mode
    Washed and partially dried but rehydratable.
    Advantage: more stable electrical behavior and easier reproducibility.

For experimental validation, a stabilized but responsive material may be more reliable.


Is electrodes + amplifier + ADC sufficient?

Yes — provided that impedance (AC) is measured correctly.

Basic workflow:

  • Inject a small, known AC excitation signal
  • Measure amplitude (and ideally phase)
  • Test at 2–3 different frequencies
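The amplitude and phase extraction can be done in software once the excitation voltage and the resulting current are both sampled. The sketch below uses a single-frequency correlation (the same idea as a lock-in amplifier). The signals are synthetic and every numeric value is a hypothetical placeholder; the method assumes the record covers a whole number of excitation periods.

```python
import math

def amplitude_and_phase(samples, freq_hz, sample_rate_hz):
    """Estimate amplitude and phase (radians) of the freq_hz component of a sampled signal."""
    n = len(samples)
    i_sum = q_sum = 0.0
    for k, s in enumerate(samples):
        angle = 2 * math.pi * freq_hz * k / sample_rate_hz
        i_sum += s * math.cos(angle)   # in-phase correlation
        q_sum += s * math.sin(angle)   # quadrature correlation
    i_avg, q_avg = 2 * i_sum / n, 2 * q_sum / n
    return math.hypot(i_avg, q_avg), math.atan2(-q_avg, i_avg)

FREQ = 1_000.0    # excitation frequency in Hz (hypothetical)
RATE = 50_000.0   # ADC sample rate in Hz (hypothetical)
N = 500           # 10 full periods at these settings

t = [k / RATE for k in range(N)]
voltage = [0.1 * math.cos(2 * math.pi * FREQ * x) for x in t]          # synthetic 100 mV excitation
current = [2e-6 * math.cos(2 * math.pi * FREQ * x + 0.5) for x in t]   # synthetic response

v_amp, v_phase = amplitude_and_phase(voltage, FREQ, RATE)
i_amp, i_phase = amplitude_and_phase(current, FREQ, RATE)
z_mag = v_amp / i_amp          # impedance magnitude in ohms
z_phase = v_phase - i_phase    # impedance phase in radians
print(f"|Z| = {z_mag:.1f} ohm, phase = {math.degrees(z_phase):.1f} deg")
```

Repeating this at two or three frequencies gives a small impedance spectrum per touch event instead of a single number.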

Possible hardware:

  • Dedicated impedance IC (e.g., AD5933 / AD5940)
  • Or a simple bridge with sinusoidal excitation and amplitude measurement

Electrode configurations:

  • 2-electrode setup
    Simpler, but more sensitive to contact variability.

  • 4-electrode setup
    Better separation of excitation and measurement, more stable results.

    See what I’ve already done with bacterial cellulose and electrode signals:


Minimal validation experiment

To test whether Tyr1 truly contributes:

  • BC control (no Tyr1)
  • BC + Tyr1 + Cu²⁺/tyrosine development step
  • Compare impedance vs pressure and humidity

If Tyr1 produces a measurable difference compared to control, the biological contribution is validated.


Why Copper Matters and Hybrid Strategies

Do I need to add copper?

Yes.

Tyrosinase is copper-dependent.
Without sufficient Cu²⁺, enzyme activity drops significantly.

However:

  • Too little copper → weak melanin synthesis
  • Too much copper → toxicity and growth inhibition

Copper availability must therefore be optimized, not maximized.


Can other dopants improve impedance?

Yes, but carefully.

Impedance in BC systems is strongly influenced by:

  • Hydration level
  • Ionic concentration
  • Microstructure and porosity

A mild hybrid strategy can be interesting:

  • Tyr1 provides a biologically synthesized redox layer
  • Controlled ionic tuning (salts, buffering, hydration control) shapes electrical response

This maintains biological emergence while improving signal stability.

Long-Term Direction: Redox → c-di-GMP → Cellulose Organization

c-di-GMP is a central intracellular regulator controlling biofilm formation and cellulose production.

In Komagataeibacter, cellulose synthesis is tightly linked to c-di-GMP levels.

A future direction could be:

  • A redox or electrical stimulus alters intracellular c-di-GMP levels
  • This modifies cellulose production or organization
  • Which then changes macroscopic impedance

This would represent a deeper form of biological programming, where structure and electrical behavior emerge from intracellular regulation.

However, this approach is high complexity (4/5–5/5) and better suited as a long-term research direction.


Week 03 HW lab-automation:

Assignment: Python Script for Opentrons Artwork

https://opentrons-art.rcdonovan.com/?id=sux110hip535fnx

As you can see, I played with the plug-and-play interface to make some faces. You can see a “Crying Wojak” at the top, or here: https://opentrons-art.rcdonovan.com/?id=43vt9gy453hud34


Statement of Intent — Why Reaction–Diffusion

I have been playing with reaction–diffusion algorithms for a long time. I keep returning to them out of curiosity — both for their history (their direct link to Alan Turing’s work on emergent patterns) and for what they suggest in terms of morphogenesis.

What interests me is not the naïve idea that “everything reduces” to these equations, but rather the fact that a very simple model can produce rich structures that resemble certain biological patterns. In developmental biology, reaction–diffusion models are often invoked to explain parts of gradient formation, repetition, or textural organization (and, to a limited extent, aspects of differentiation). Of course, real biological systems are far more complex: mechanical constraints, multi-scale signaling, feedback loops, and energetic limitations all play crucial roles.

Precisely for this reason, in the context of this Opentrons experiment, I am interested in translating a dynamic of emergence into a very concrete material gesture — an image composed of discrete deposits, where a continuous phenomenon (reaction and diffusion) becomes a physical field of dots.

A good website for understanding the reaction–diffusion equations: https://www.karlsims.com/rd.html

My Colab notebook is here: https://colab.research.google.com/drive/1Kuuh7rw8CXD4cQs-CxuRT5iAjiOw7TgX?usp=sharing

Opentrons Python script:
from opentrons import types
import subprocess, sys
import numpy as np


metadata = {    # see https://docs.opentrons.com/v2/tutorial.html#tutorial-metadata
    'author': 'Vivien',
    'protocolName': 'Gray-Scott Reaction-Diffusion Pattern',
    'description': 'Generates a reaction-diffusion pattern on an agar plate using the Gray-Scott model, pipetting where the B concentration is high.',
    'source': 'HTGAA 2026 Opentrons Lab',
    'apiLevel': '2.20'
}

##############################################################################
###   Gray-Scott Model Global Parameters and Functions
##############################################################################

# Gray-Scott Model Parameters
GS_DA = 1.0
GS_DB = 0.5
GS_F = 0.014 # Feed rate. Experiment with values like 0.035 (spots), 0.014 (worms), 0.062 (moving spots)
GS_K = 0.045 # Kill rate. Experiment with values like 0.065 (spots), 0.045 (worms), 0.061 (moving spots)
GS_DT = 1.0

# Other (f, k) presets to try:
# 1. Mitosis / spots  : f = 0.0367, k = 0.0649
# 2. Solitons / worms : f = 0.030,  k = 0.062
# 3. Coral / dense    : f = 0.0545, k = 0.062
# 4. Vibe             : f = 0.029,  k = 0.057
# 5. Labyrinth        : f = 0.060,  k = 0.063

def laplace(Z):
    """Weighted 3x3 Laplacian (Karl Sims style) using numpy.roll."""
    return (
        -1.0 * Z +
        0.2 * (np.roll(Z,  1, axis=0) + np.roll(Z, -1, axis=0) +
               np.roll(Z,  1, axis=1) + np.roll(Z, -1, axis=1)) +
        0.05 * (np.roll(np.roll(Z,  1, axis=0),  1, axis=1) +
                np.roll(np.roll(Z,  1, axis=0), -1, axis=1) +
                np.roll(np.roll(Z, -1, axis=0),  1, axis=1) +
                np.roll(np.roll(Z, -1, axis=0), -1, axis=1))
    )

def simulate_gray_scott(A, B, num_iterations):
    """
    Runs the Gray-Scott simulation for a given number of iterations.
    Modifies A and B arrays in place.
    """
    for _ in range(num_iterations):
        lapA = laplace(A)
        lapB = laplace(B)

        reaction = A * B * B
        A += (GS_DA * lapA - reaction + GS_F * (1 - A)) * GS_DT
        B += (GS_DB * lapB + reaction - (GS_K + GS_F) * B) * GS_DT
    return A, B # Return A and B for clarity, though they are modified in-place

##############################################################################
###   Robot deck setup constants - don't change these
##############################################################################

TIP_RACK_DECK_SLOT = 9
COLORS_DECK_SLOT = 6
AGAR_DECK_SLOT = 5
PIPETTE_STARTING_TIP_WELL = 'A1'

well_colors = {
    'A1' : 'Red',
    'B1' : 'Green',
    'C1' : 'Orange'
}


def run(protocol):
  ##############################################################################
  ###   Load labware, modules and pipettes
  ##############################################################################

  # Tips
  tips_20ul = protocol.load_labware('opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Opentrons 20uL Tips')

  # Pipettes
  pipette_20ul = protocol.load_instrument("p20_single_gen2", "right", [tips_20ul])

  # Modules
  temperature_module = protocol.load_module('temperature module gen2', COLORS_DECK_SLOT)

  # Temperature Module Plate
  temperature_plate = temperature_module.load_labware('opentrons_96_aluminumblock_generic_pcr_strip_200ul',
                                                      'Cold Plate')
  # Choose where to take the colors from
  color_plate = temperature_plate

  # Agar Plate
  agar_plate = protocol.load_labware('htgaa_agar_plate', AGAR_DECK_SLOT, 'Agar Plate')  ## TA MUST CALIBRATE EACH PLATE!
  # Get the top-center of the plate, make sure the plate was calibrated before running this
  center_location = agar_plate['A1'].top()

  pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)

  ##############################################################################
  ###   Patterning
  ##############################################################################

  ###
  ### Helper functions for this lab
  ###

  # pass this e.g. 'Red' and get back a Location which can be passed to aspirate()
  def location_of_color(color_string):
    for well,color in well_colors.items():
      if color.lower() == color_string.lower():
        return color_plate[well]
    raise ValueError(f"No well found with color {color_string}")

  # For this lab, instead of calling pipette.dispense(1, loc) use this: dispense_and_detach(pipette, 1, loc)
  def dispense_and_detach(pipette, volume, location):
      """
      Move laterally 5mm above the plate (to avoid smearing a drop); then drop down to the plate,
      dispense, move back up 5mm to detach drop, and stay high to be ready for next lateral move.
      5mm because a 4uL drop is 2mm diameter; and a 2deg tilt in the agar pour is >3mm difference across a plate.
      """
      assert(isinstance(volume, (int, float)))
      above_location = location.move(types.Point(z=location.point.z + 5))  # 5mm above
      pipette.move_to(above_location)       # Go to 5mm above the dispensing location
      pipette.dispense(volume, location)    # Go straight downwards and dispense
      pipette.move_to(above_location)       # Go straight up to detach drop and stay high

  ###
  ### YOUR CODE HERE to create your design (Gray-Scott Reaction-Diffusion Model)
  ###

  # Grid size for simulation
  size = 100 # Reduced size to make simulation faster and denser pattern on plate
  A = np.ones((size, size))
  B = np.zeros((size, size))

  # Create a central square seed for B
  r = 5
  A[size//2-r:size//2+r, size//2-r:size//2+r] = 0.5
  B[size//2-r:size//2+r, size//2-r:size//2+r] = 0.25

  # Add small random noise to the entire grid to introduce variation
  # This helps break symmetry and encourages diverse pattern formation
  A += np.random.rand(size, size) * 0.05 # Add noise up to 0.05
  B += np.random.rand(size, size) * 0.05 # Add noise up to 0.05

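  # NOTE: these statistics describe the seeded grid *before* the simulation runs.
  # The color thresholds further down are derived from this pre-simulation B_max.
  B_min = float(B.min())
  B_max = float(B.max())
  B_mean = float(B.mean())
  print("B stats:", B_min, B_mean, B_max)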

  # Ensure values remain within bounds [0, 1] after adding noise
  A = np.clip(A, 0.0, 1.0)
  B = np.clip(B, 0.0, 1.0)

  print("Running Gray-Scott simulation...")

  # Define the number of simulation iterations here
  SIMULATION_ITERATIONS = 8000 # Number of Gray-Scott update steps to run

  A, B = simulate_gray_scott(A, B, SIMULATION_ITERATIONS)

  print(f"Gray-Scott simulation complete after {SIMULATION_ITERATIONS} iterations. Starting patterning.")
  print(f"To run more iterations, change the 'SIMULATION_ITERATIONS' variable in the code and re-run this cell and the visualization cell below.")


  # Patterning Parameters for Opentrons
  PIPETTE_VOLUME = 1 # 1uL per dot
  ASPIRATE_VOLUME = 16 # Aspirate up to this much at a time
  MAX_DOTS_PER_COLOR = 300 # Maximum number of dots to dispense per color

  # Colors to use for B components
  COLOR_FOR_PRIMARY_PATTERN = 'Green' # For the 'thick black lines' pattern
  COLOR_FOR_SECONDARY_PATTERN = 'Orange' # For other B concentrations
  COLOR_FOR_TERTIARY_PATTERN = 'Red' # For the highest B concentrations

  # Thresholds for B values
  # For 'thick black lines' (Green color)
  PRIMARY_PATTERN_B_LOWER_THRESHOLD = 0.40 * B_max
  PRIMARY_PATTERN_B_UPPER_THRESHOLD = 0.70 * B_max # This value will also define the lower bound for Red

  # For the secondary pattern (Orange color)
  # This will catch 'B' values that are above this, but not in the primary or tertiary pattern bands
  SECONDARY_PATTERN_B_THRESHOLD = 0.20 * B_max # Adjust this percentage as needed

  print("Using Red Pattern (Highest B) threshold: B >=", PRIMARY_PATTERN_B_UPPER_THRESHOLD)
  print("Using Green Pattern (Mid B - 'thick lines') B thresholds: (", PRIMARY_PATTERN_B_LOWER_THRESHOLD, ",", PRIMARY_PATTERN_B_UPPER_THRESHOLD, ")")
  print("Using Orange Pattern (Lower B) threshold: (", SECONDARY_PATTERN_B_THRESHOLD, ",", PRIMARY_PATTERN_B_LOWER_THRESHOLD, "]")


  # Agar plate dimensions (estimated for a standard 90mm round agar plate).
  # The pattern is drawn in a 55x55mm square; its corners lie about 39mm from
  # the plate center, inside the 40mm radius safe area.
  PATTERN_AREA_WIDTH_MM = 55 # Reduced from 85mm to fit within 40mm radius safe area
  PATTERN_AREA_HEIGHT_MM = 55 # Reduced from 85mm to fit within 40mm radius safe area

  # Calculate scaling factor to map grid coordinates to millimeters on the plate
  scale_x = PATTERN_AREA_WIDTH_MM / size
  scale_y = PATTERN_AREA_HEIGHT_MM / size

  # Add sampling step for clearer dots
  SAMPLING_STEP = 4 # Pipette every Nth pixel to create distinct dots
  DOT_SPACING_MM = SAMPLING_STEP * scale_x # Actual physical spacing between dot centers
  print(f"Desired dot spacing for distinct patterns: {DOT_SPACING_MM:.2f} mm (approx. 2mm drop diameter).")

  pipetted_count_primary = 0 # Green
  pipetted_count_secondary = 0 # Orange
  pipetted_count_tertiary = 0 # Red

  # --- Pipetting for Tertiary Pattern (Red component - highest B concentration) ---
  pipette_20ul.pick_up_tip()
  pipette_20ul.aspirate(ASPIRATE_VOLUME, location_of_color(COLOR_FOR_TERTIARY_PATTERN))
  current_pipette_volume = ASPIRATE_VOLUME

  print(f"Starting patterning for Tertiary Pattern with {COLOR_FOR_TERTIARY_PATTERN}.")

  for y in range(0, size, SAMPLING_STEP):
      for x in range(0, size, SAMPLING_STEP):
          if B[y, x] >= PRIMARY_PATTERN_B_UPPER_THRESHOLD:
              if current_pipette_volume < PIPETTE_VOLUME:
                  pipette_20ul.aspirate(ASPIRATE_VOLUME, location_of_color(COLOR_FOR_TERTIARY_PATTERN))
                  current_pipette_volume += ASPIRATE_VOLUME

              x_offset_mm = (x - size / 2) * scale_x
              y_offset_mm = (y - size / 2) * scale_y

              adjusted_location = center_location.move(types.Point(x=x_offset_mm, y=y_offset_mm))
              dispense_and_detach(pipette_20ul, PIPETTE_VOLUME, adjusted_location)
              current_pipette_volume -= PIPETTE_VOLUME
              pipetted_count_tertiary += 1
          if pipetted_count_tertiary >= MAX_DOTS_PER_COLOR:
              break
      if pipetted_count_tertiary >= MAX_DOTS_PER_COLOR:
          break
  pipette_20ul.drop_tip()
  print(f"Total {pipetted_count_tertiary} drops pipetted for Tertiary Pattern using {COLOR_FOR_TERTIARY_PATTERN}.")


  # --- Pipetting for Primary Pattern (Green component - 'thick lines') ---
  pipette_20ul.pick_up_tip()
  pipette_20ul.aspirate(ASPIRATE_VOLUME, location_of_color(COLOR_FOR_PRIMARY_PATTERN))
  current_pipette_volume = ASPIRATE_VOLUME

  print(f"Starting patterning for Primary Pattern with {COLOR_FOR_PRIMARY_PATTERN}.")

  for y in range(0, size, SAMPLING_STEP):
      for x in range(0, size, SAMPLING_STEP):
          if (B[y, x] > PRIMARY_PATTERN_B_LOWER_THRESHOLD) and (B[y, x] < PRIMARY_PATTERN_B_UPPER_THRESHOLD):
              if current_pipette_volume < PIPETTE_VOLUME:
                  pipette_20ul.aspirate(ASPIRATE_VOLUME, location_of_color(COLOR_FOR_PRIMARY_PATTERN))
                  current_pipette_volume += ASPIRATE_VOLUME

              x_offset_mm = (x - size / 2) * scale_x
              y_offset_mm = (y - size / 2) * scale_y

              adjusted_location = center_location.move(types.Point(x=x_offset_mm, y=y_offset_mm))
              dispense_and_detach(pipette_20ul, PIPETTE_VOLUME, adjusted_location)
              current_pipette_volume -= PIPETTE_VOLUME
              pipetted_count_primary += 1
          if pipetted_count_primary >= MAX_DOTS_PER_COLOR:
              break
      if pipetted_count_primary >= MAX_DOTS_PER_COLOR:
          break
  pipette_20ul.drop_tip()
  print(f"Total {pipetted_count_primary} drops pipetted for Primary Pattern using {COLOR_FOR_PRIMARY_PATTERN}.")

  # --- Pipetting for Secondary Pattern (Orange component) ---
  pipette_20ul.pick_up_tip()
  pipette_20ul.aspirate(ASPIRATE_VOLUME, location_of_color(COLOR_FOR_SECONDARY_PATTERN))
  current_pipette_volume = ASPIRATE_VOLUME

  print(f"Starting patterning for Secondary Pattern with {COLOR_FOR_SECONDARY_PATTERN}.")

  for y in range(0, size, SAMPLING_STEP):
      for x in range(0, size, SAMPLING_STEP):
          # Dispense Orange if B is above its threshold, AND NOT in the Green or Red band
          if (B[y, x] > SECONDARY_PATTERN_B_THRESHOLD) and (B[y, x] <= PRIMARY_PATTERN_B_LOWER_THRESHOLD):
              if current_pipette_volume < PIPETTE_VOLUME:
                  pipette_20ul.aspirate(ASPIRATE_VOLUME, location_of_color(COLOR_FOR_SECONDARY_PATTERN))
                  current_pipette_volume += ASPIRATE_VOLUME

              x_offset_mm = (x - size / 2) * scale_x
              y_offset_mm = (y - size / 2) * scale_y

              adjusted_location = center_location.move(types.Point(x=x_offset_mm, y=y_offset_mm))
              dispense_and_detach(pipette_20ul, PIPETTE_VOLUME, adjusted_location)
              current_pipette_volume -= PIPETTE_VOLUME
              pipetted_count_secondary += 1
          if pipetted_count_secondary >= MAX_DOTS_PER_COLOR:
              break
      if pipetted_count_secondary >= MAX_DOTS_PER_COLOR:
          break
  pipette_20ul.drop_tip()
  print(f"Total {pipetted_count_secondary} drops pipetted for Secondary Pattern using {COLOR_FOR_SECONDARY_PATTERN}.")
# Execute Simulation / Visualization -- don't change this code block
protocol = OpentronsMock(well_colors)
run(protocol)
protocol.visualize()

 B stats: 1.1769677182305039e-07 0.02738330028607403 0.2993330786497384
Running Gray-Scott simulation...
Gray-Scott simulation complete after 8000 iterations. Starting patterning.
To run more iterations, change the 'SIMULATION_ITERATIONS' variable in the code and re-run this cell and the visualization cell below.
Using Red Pattern (Highest B) threshold: B >= 0.20953315505481687
Using Green Pattern (Mid B - 'thick lines') B thresholds: ( 0.11973323145989537 , 0.20953315505481687 )
Using Orange Pattern (Lower B) threshold: ( 0.059866615729947684 , 0.11973323145989537 ]
Desired dot spacing for distinct patterns: 2.20 mm (approx. 2mm drop diameter).
Starting patterning for Tertiary Pattern with Red.
Total 0 drops pipetted for Tertiary Pattern using Red.
Starting patterning for Primary Pattern with Green.
Total 0 drops pipetted for Primary Pattern using Green.
Starting patterning for Secondary Pattern with Orange.
Total 0 drops pipetted for Secondary Pattern using Orange.

=== VOLUME TOTALS BY COLOR ===
	Orange:		 aspirated 16	 dispensed 0		##### WASTING BIO-INK : more aspirated than dispensed!
	Green:		 aspirated 16	 dispensed 0		##### WASTING BIO-INK : more aspirated than dispensed!
	Red:		 aspirated 16	 dispensed 0		##### WASTING BIO-INK : more aspirated than dispensed!
	[all colors]:	[aspirated 48]	[dispensed 0]

=== TIP COUNT ===
	 Used 3 tip(s)  (ideally exactly one per unique color)
```
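The log above reports zero drops in all three bands, so no sampled B value exceeded even the lowest threshold. The thresholds come from `B_max` measured on the seeded grid, before the simulation. One possible diagnostic, sketched below under that assumption, is to re-measure the maximum after the simulation and rebuild the bands from it. The helper names (`band_thresholds`, `classify`, `count_dots`) and the toy grid are hypothetical and not part of the protocol; the band fractions and comparison rules mirror the ones used in the script.

```python
from collections import Counter

def band_thresholds(b_max, low=0.20, mid=0.40, high=0.70):
    """Return (orange_min, green_min, red_min) as fractions of b_max."""
    return (low * b_max, mid * b_max, high * b_max)

def classify(value, thresholds):
    """Same band rules as the protocol: Red >= high, Green in (mid, high), Orange in (low, mid]."""
    orange_min, green_min, red_min = thresholds
    if value >= red_min:
        return "Red"
    if value > green_min:
        return "Green"
    if value > orange_min:
        return "Orange"
    return None

def count_dots(grid, thresholds, step=4):
    """Count how many sampled grid points fall into each color band."""
    counts = Counter()
    for y in range(0, len(grid), step):
        for x in range(0, len(grid[y]), step):
            color = classify(grid[y][x], thresholds)
            if color is not None:
                counts[color] += 1
    return counts

# Toy post-simulation grid (invented values): B has decayed well below the seed level.
toy = [[0.0] * 8 for _ in range(8)]
toy[0][0], toy[0][4], toy[4][0] = 0.05, 0.03, 0.015

stale = band_thresholds(0.2993)                           # B_max taken before the simulation
fresh = band_thresholds(max(max(row) for row in toy))     # B_max taken after the simulation

stale_counts = count_dots(toy, stale)
fresh_counts = count_dots(toy, fresh)
print("stale thresholds:", dict(stale_counts))
print("fresh thresholds:", dict(fresh_counts))
```

Counting dots per band before picking up a tip would also make it possible to skip the initial aspirate for a color with no dots, which is what the "WASTING BIO-INK" warnings point at.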

I used Gemini (2.5) to help translate a Gray-Scott reaction–diffusion model into a stable Opentrons protocol and to choose a robust rendering strategy (iso-contour band → dot sampling) that produces reliable aesthetic output under time/volume constraints.


Post-Lab Questions

Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.

Published Example:

Open Hardware and Laboratory Automation in Biology, Tobias Wenzel - arXiv:2210.08976 https://arxiv.org/abs/2210.08976

The paper is not specifically about Opentrons, but it discusses how open hardware platforms, including Opentrons, are transforming access to biological research tools. The author explains how open-source automation systems allow laboratories to build, adapt, and maintain their own equipment instead of relying only on expensive proprietary machines.

What is particularly interesting is the idea of “appropriate technology.” The paper argues that automation is not just about saving money. It is also about local fabrication, adaptability, transparency, and knowledge transfer. Open systems such as Opentrons make automation accessible to more researchers, especially in low-resource environments, while still enabling advanced biological workflows.

This approach supports reproducibility, customization, and global collaboration. Instead of being locked into closed commercial systems, researchers can modify and improve their automation tools to fit specific experimental needs.

In that sense, laboratory automation becomes not only a productivity tool, but also a platform for scientific autonomy and innovation.


Write a description about what you intend to do with automation tools for your final project.

Automation as a Tool to Explore the Behavioral Landscape of Living Materials

Automation tools such as Opentrons have been widely used for:

  • High-throughput DNA assembly
  • CRISPR editing workflows
  • Combinatorial genetic library screening
  • Automated protein expression testing

These systems increase reproducibility, reduce human variability, and enable scalable experimentation. In most cases, automation serves to optimize molecular workflows and accelerate genetic engineering cycles.

However, my interest in automation is different.


Moving Beyond Optimization

In classical bioprocess engineering, automation is used to:

  • Optimize growth conditions
  • Reduce experimental noise
  • Standardize reproducibility
  • Improve yield

But living materials — such as bacterial cellulose biofilms — do not behave like linear industrial systems.

They exhibit:

  • Non-linear responses
  • Narrow stability windows
  • Emergent morphologies
  • Phase transitions under small parameter shifts

Traditional experimental design (e.g., Taguchi matrices) assumes relatively smooth and predictable response surfaces. In living systems, this assumption often fails. Small changes in pH, oxygen availability, carbon source, or metal ions can lead to abrupt structural transitions.

Automation, in this context, is not merely an efficiency tool. It becomes a way to systematically explore instability and emergence.


My Intended Use of Automation

1. Mapping Morphogenetic Regimes of Bacterial Cellulose

Instead of optimizing for maximum growth, I aim to use automation to:

  • Vary glucose concentration
  • Modulate copper availability (for Tyr1-dependent melanin production)
  • Adjust nitrogen sources
  • Introduce controlled gradients

The objective is to map how these conditions change the following properties of living cellulose:

  • Thickness
  • Porosity
  • Impedance
  • Mechanical anisotropy
  • Conductive behavior

This transforms automation into a cartographic tool:

Not optimizing yield, but mapping the behavioral topology of a living material.
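A minimal sketch of how such a mapping experiment could be laid out, assuming a full-factorial design assigned to a 96-well plate. The concentration values, the nitrogen sources, and the function names (`plate_wells`, `design_grid`) are hypothetical placeholders chosen for illustration, not measured or planned conditions.

```python
from itertools import product

ROWS = "ABCDEFGH"
COLS = range(1, 13)

def plate_wells():
    """Well names of a 96-well plate in row-major order: A1..A12, B1..B12, ..."""
    return [f"{row}{col}" for row in ROWS for col in COLS]

def design_grid(glucose_g_per_l, copper_um, nitrogen_sources):
    """Assign every combination of the three factors to one well."""
    conditions = list(product(glucose_g_per_l, copper_um, nitrogen_sources))
    wells = plate_wells()
    if len(conditions) > len(wells):
        raise ValueError("more conditions than wells on one plate")
    return {
        well: {"glucose_g_per_l": g, "copper_um": cu, "nitrogen": n}
        for well, (g, cu, n) in zip(wells, conditions)
    }

# Hypothetical factor levels: 4 x 4 x 2 = 32 conditions.
layout = design_grid(
    glucose_g_per_l=[5, 10, 20, 40],
    copper_um=[0, 10, 50, 100],
    nitrogen_sources=["yeast extract", "peptone"],
)
for well in ("A1", "A2", "C8"):
    print(well, layout[well])
```

A dictionary keyed by well name keeps the design readable and can later be joined with per-well measurements such as thickness or impedance.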


2. Spatial Programming of Living Matter

A more ambitious direction is to use liquid handling automation to:

  • Deposit gradients of dopants
  • Create patterned functional zones
  • Introduce local conductivity modulation
  • Encode anisotropy into growing pellicles

Instead of post-processing materials (e.g., adding graphene or PEDOT after growth), this approach attempts to let functionality emerge during growth. Automation allows spatial control. Living matter performs the structuration.


3. Toward a Cybernetic Living Material System

A future extension would integrate measurement and feedback:

  1. Grow bacterial cellulose
  2. Measure impedance or electrical response
  3. Adjust copper or nutrient concentration automatically
  4. Iterate

This creates a cybernetic loop:

Living material → Measurement → Algorithmic adjustment → Modified growth

Automation becomes a mediator between biological behavior and computational control.
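The loop can be sketched as a simple proportional controller. Everything here is an assumption made for illustration: the function names, the gain, the concentration limits, and especially `toy_impedance_ohm`, an invented stand-in for a real growth-and-measurement cycle. The sign convention (more copper lowers impedance) is also assumed, not measured.

```python
def next_copper_um(current_um, measured_ohm, target_ohm,
                   gain=0.05, lower=0.0, upper=100.0):
    """One proportional update of the copper concentration, clamped to [lower, upper]."""
    error = measured_ohm - target_ohm
    proposed = current_um + gain * error
    return max(lower, min(upper, proposed))

def toy_impedance_ohm(copper_um):
    """Invented linear response standing in for 'grow, then measure'."""
    return 2000.0 - 10.0 * copper_um

copper = 10.0
for cycle in range(20):
    measured = toy_impedance_ohm(copper)   # step 2: measure
    copper = next_copper_um(copper, measured, target_ohm=1500.0)  # step 3: adjust

print(f"copper after 20 cycles: {copper:.2f} uM")
print(f"impedance at that setting: {toy_impedance_ohm(copper):.1f} ohm")
```

In a real system each iteration would correspond to a full growth cycle, and a non-linear or abrupt response could make a fixed gain unstable, which is the kind of regime change the project aims to detect.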


Why This Matters

This project shifts the role of automation from eliminating biological variability to engaging systematically with it.

Rather than forcing the living system into industrial predictability, automation is used to:

  • Detect bifurcations
  • Explore phase transitions
  • Reveal hidden regimes
  • Enable programmable morphogenesis

In this sense, automation becomes a bridge between:

  • Synthetic biology
  • Biofabrication
  • Morphogenesis
  • Cybernetic design

It allows living matter to be explored not as a static substrate, but as a dynamic, programmable system.

Final Project Ideas

https://docs.google.com/presentation/d/1FAFN4YYisOcso3CI5F3W3Z7hj6_n9D1vAhVUywQXKPU/edit?slide=id.g3c8d4cf45e8_0_55#slide=id.g3c8d4cf45e8_0_55

Week 04 HW: Protein Design Part 1

Table of Contents

Conceptual Questions – Answers

1. How many molecules of amino acids are in 500 g of meat?

Typical meat contains about 20% protein by weight. So for a 500 g piece of meat:

$$ 500 \ g \times 0.20 = 100 \ g \text{ of protein} $$

The average mass of an amino acid is approximately 100 Daltons. Since:

$$ 1 \ Da = 1.66 \times 10^{-24} \ g $$

then:

$$ 100 \ Da \approx 1.66 \times 10^{-22} \ g $$

The approximate number of amino acids is therefore:

$$ \frac{100}{1.66 \times 10^{-22}} \approx 6 \times 10^{23} $$

So a 500 g piece of meat contains roughly:

$$ 6 \times 10^{23} \sim 10^{24} $$

amino acid molecules (after proteins are digested), which is about one mole.
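The estimate above can be checked numerically. This is a small sketch using the same inputs (20% protein, 100 Da per amino acid); the function name `amino_acid_count` is introduced here for illustration.

```python
DALTON_IN_GRAMS = 1.66e-24  # mass of 1 Da in grams, as used above

def amino_acid_count(meat_g, protein_fraction=0.20, residue_mass_da=100.0):
    """Number of amino acids in a piece of meat under the stated assumptions."""
    protein_g = meat_g * protein_fraction
    return protein_g / (residue_mass_da * DALTON_IN_GRAMS)

n = amino_acid_count(500)
print(f"{n:.2e} amino acids")

# Sensitivity check: an average residue mass of about 110 Da is also commonly quoted.
print(f"{amino_acid_count(500, residue_mass_da=110.0):.2e} amino acids")
```

Changing the assumed residue mass shifts the result by about 10%, so the order of magnitude is robust.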


2. Why do humans eat beef but do not become a cow?

Food proteins are not directly incorporated into our bodies.

Instead, they are broken down during digestion: protein → peptides → amino acids

These amino acids are then reused by our cells to build human proteins, according to our own genetic instructions: DNA → RNA → protein

In other words, food provides molecular building blocks, not ready-made biological structures.

Eating a cow is like receiving bricks, not a building.


3. Why are there only 20 natural amino acids?

The genetic code uses 64 codons, but these encode only 20 amino acids plus stop signals.

These 20 amino acids provide enough chemical diversity to build functional proteins:

  • hydrophobic residues
  • hydrophilic residues
  • charged residues
  • aromatic residues
  • flexible or rigid structures

Evolution stabilized around this set because it provides a good balance between chemical diversity and translational efficiency.

Some rare exceptions exist (for example selenocysteine and pyrrolysine), but the canonical system uses twenty.


4. Can we make non-natural amino acids?

Yes.

Chemists and synthetic biologists routinely create non-natural amino acids.

Examples include amino acids containing:

  • fluorinated groups
  • photo-reactive groups
  • click-chemistry handles
  • metal-binding groups

These molecules can be incorporated into proteins using engineered:

  • tRNA molecules
  • aminoacyl-tRNA synthetases

This expands the chemical capabilities of proteins beyond what natural biology provides.


5. Where did amino acids come from before life?

Several hypotheses exist.

One well-known mechanism is prebiotic chemistry, demonstrated by the Miller–Urey experiment, where simple molecules such as

  • methane
  • ammonia
  • hydrogen
  • water

can react under energy input (lightning, heat) to produce amino acids.

Amino acids have also been detected in meteorites, suggesting that some may have arrived from space.

Another possibility involves hydrothermal vents, where mineral catalysis and heat may drive organic synthesis.


6. If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?

Natural proteins use L-amino acids, which form mainly: right-handed α-helices

If the chirality is reversed and D-amino acids are used, the geometry of the peptide backbone flips and the helix becomes: left-handed

This is a direct consequence of molecular chirality.


7. Can you discover additional helices in proteins?


8. Why are most molecular helices right-handed?


9. Why do β-sheets tend to aggregate?

β-sheets expose hydrogen-bonding edges along their backbone. These edges can interact with neighboring sheets: β-sheet + β-sheet → stacked structure

This stacking is stabilized by:

  • hydrogen bonding
  • hydrophobic interactions
  • van der Waals forces

Because β-sheets are relatively flat structures, they pack easily into extended aggregates.


9. What is the driving force for β-sheet aggregation?

The main driving forces are:

  • backbone hydrogen bonding
  • hydrophobic interactions
  • van der Waals interactions
  • exclusion of water from the interface

Together these interactions stabilize stacked β-sheet structures and can lead to fibrillar assemblies.


10. Why do many amyloid diseases form β-sheets?

Many amyloid-associated diseases involve proteins misfolding into β-sheet-rich conformations. β-sheets expose repetitive hydrogen-bonding surfaces that can stack into highly stable fibrillar aggregates called amyloids.

Because these structures are:

  • energetically stable,
  • self-templating,
  • and difficult for cells to degrade,

they can progressively accumulate in tissues.

Examples include:

  • Alzheimer’s disease (amyloid-β),
  • Parkinson’s disease (α-synuclein),
  • Huntington’s disease.

The pathological behavior emerges not only from the protein sequence itself, but from the ability of β-sheet structures to nucleate and propagate aggregation.

10. Can you use amyloid β-sheets as materials?

Yes. Although amyloids are associated with disease in humans, their structural properties are also highly attractive for materials science. Amyloid-based materials exploit the natural ability of peptides to self-assemble into ordered architectures.

Amyloid fibrils exhibit:

  • high mechanical strength,
  • nanoscale self-assembly,
  • chemical stability,
  • and hierarchical organization.

Researchers are exploring amyloid-inspired systems for:

  • nanofibers,
  • hydrogels,
  • tissue engineering,
  • biosensors,
  • bioelectronics,
  • and programmable biomaterials.

Some biological systems even naturally use functional amyloids, showing that amyloid assembly is not inherently pathological.


11. Design a β-sheet motif that forms a well-ordered structure.

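One simple strategy is to alternate hydrophobic and hydrophilic amino acids. In a β-strand, consecutive side chains point to opposite faces, so this alternation gives one hydrophobic face and one hydrophilic face:

Val-Lys-Val-Lys-Val-Lys-Val-Lys

or, in one-letter code:

VKVKVKVK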
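A small sketch of this design rule, assuming a simplified hydrophobic residue set (classification schemes differ, so `HYDROPHOBIC` is an illustrative choice). The helper names are hypothetical. The check only tests the alternation pattern; it says nothing about whether a peptide actually folds or assembles.

```python
HYDROPHOBIC = set("AVILMFW")  # simplified, assumed classification

def alternating_motif(hydrophobic="V", polar="K", repeats=4):
    """Build an alternating hydrophobic/polar motif such as VKVKVKVK."""
    return (hydrophobic + polar) * repeats

def strand_faces(sequence):
    """Split a strand into its two faces: odd positions and even positions."""
    return sequence[0::2], sequence[1::2]

def is_amphipathic(sequence):
    """True if one face is entirely hydrophobic and the other entirely non-hydrophobic."""
    face_a, face_b = strand_faces(sequence)
    a_hydrophobic = all(r in HYDROPHOBIC for r in face_a)
    b_hydrophobic = all(r in HYDROPHOBIC for r in face_b)
    a_polar = not any(r in HYDROPHOBIC for r in face_a)
    b_polar = not any(r in HYDROPHOBIC for r in face_b)
    return (a_hydrophobic and b_polar) or (b_hydrophobic and a_polar)

motif = alternating_motif()
print(motif, strand_faces(motif), is_amphipathic(motif))
print("VVKK", strand_faces("VVKK"), is_amphipathic("VVKK"))
```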

Part B: Protein Analysis and Visualization

B1. Identify the amino acid sequence of your protein.

The protein selected for this analysis is BcsA (Bacterial Cellulose Synthase catalytic subunit) from Rhodobacter sphaeroides.

BcsA is the catalytic core of the bacterial cellulose synthase complex. It polymerizes UDP-glucose into linear β-1,4-glucan chains while simultaneously translocating the growing cellulose polymer across the inner membrane.

This protein is particularly interesting because it directly couples:

  • enzymatic catalysis,
  • membrane transport,
  • and extracellular material production.

BcsA therefore represents a molecular interface between cellular metabolism and large-scale material morphogenesis.

I selected this protein because it is closely related to my research interests in bacterial cellulose biofabrication and living material growth.


B2. How long is it? What is the most frequent amino acid? You can use this Colab notebook to count the frequency of amino acids.

The amino acid sequence was retrieved from UniProt entry Q3J125, corresponding to BcsA / Cellulose synthase catalytic subunit [UDP-forming] from Cereibacter sphaeroides / Rhodobacter sphaeroides.

The canonical protein sequence is 788 amino acids long.

Using an amino-acid frequency count, the five most frequent residues are:

| Amino acid | One-letter code | Count |
|---|---|---|
| Leucine | L | 102 |
| Alanine | A | 93 |
| Arginine | R | 69 |
| Valine | V | 63 |
| Proline | P | 56 |

The most frequent amino acid in BcsA is therefore leucine (L).

This is consistent with BcsA being a membrane-associated protein, since hydrophobic amino acids such as leucine and valine are common in transmembrane regions.
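A frequency count of this kind can be sketched with the standard library. The function name `residue_frequencies` is introduced here, and the example string is a short invented fragment, not the BcsA sequence; the real sequence from UniProt Q3J125 would be pasted in its place.

```python
from collections import Counter

def residue_frequencies(sequence, top=5):
    """Return (length, most common residues) for a protein sequence string."""
    cleaned = "".join(sequence.split()).upper()  # drop whitespace and line breaks
    return len(cleaned), Counter(cleaned).most_common(top)

# Invented toy fragment for illustration only.
toy_sequence = """
MLLAV
LLRAV
"""
length, top_residues = residue_frequencies(toy_sequence)
print("length:", length)
for residue, count in top_residues:
    print(residue, count)
```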


B2. How many protein sequence homologs are there for your protein? Hint: Use Uniprot’s BLAST tool to search for homologs. Does your protein belong to any protein family?

I searched for homologs using the UniProt BLAST tool with the BcsA amino-acid sequence from UniProt Q3J125. The search returned many homologous sequences across bacteria, especially among cellulose-producing and biofilm-forming species. This indicates that BcsA is a conserved bacterial cellulose synthase protein rather than a species-specific enzyme. The exact number of homologs depends on the BLAST identity and coverage thresholds used. For my analysis, I focused on close bacterial homologs with significant sequence similarity.

Figure: UniProt BLAST results

BcsA belongs to the cellulose synthase catalytic subunit family.

More specifically, it is associated with:

  • Glycosyltransferase family 2 (GT2)
  • Cellulose synthase / BcsA family
  • PilZ domain-containing proteins, because BcsA contains a C-terminal PilZ domain involved in cyclic-di-GMP regulation.

This family is responsible for polymerizing UDP-glucose into β-1,4-glucan chains during bacterial cellulose biosynthesis.

Figure: family tree

Figure: family


B.3 Identify the structure page of your protein in RCSB. When was the structure solved? Is it a good quality structure? Good quality structure is the one with good resolution. Smaller the better (Resolution: 2.70 Å)

The structure selected for this analysis is the bacterial cellulose synthase complex containing BcsA from Rhodobacter sphaeroides.

RCSB Structure ID: 4HG6

RCSB link: https://www.rcsb.org/structure/4HG6

The structure was solved in 2013.

The experimental method used was X-ray diffraction.

The structure resolution is approximately 3.25 Å.

This structure corresponds to the cellulose synthase catalytic subunit BcsA associated with BcsB during active cellulose synthesis.

Figure: BcsA structure


B.3 Are there any other molecules in the solved structure apart from protein? Does your protein belong to any structure classification family?

The solved structure contains several non-protein molecules associated with cellulose synthesis and regulation. These include: UDP, cyclic-di-GMP, cellulose oligomers, lipid or detergent molecules, and water molecules.

These molecules are important because they help reveal how BcsA polymerizes glucose into cellulose, regulates catalytic activity, and translocates the growing cellulose chain across the membrane.


B.4 Open the structure of your protein in any 3D molecule visualization

The structure of BcsA (PDB: 4HG6) was visualized mainly in PyMOL, with some later work in ChimeraX. The resulting videos were then uploaded to YouTube for easier viewing on the GitHub page.

B.4 Visualize the protein as “cartoon”, “ribbon” and “ball and stick”

Different molecular representations were explored including:

  • cartoon (PyMOL commands: hide everything, then show cartoon)
Figure: protein in cartoon representation
  • ribbon (PyMOL commands: hide everything, then show ribbon)
Figure: protein in ribbon representation
  • ball-and-stick (PyMOL commands: hide everything, show sticks, set stick_radius, 0.2, show spheres, then set sphere_scale, 0.25)
Figure: protein in ball-and-stick representation

These representations allow observation of:

  • global folding organization,
  • secondary structures,
  • residue distribution,
  • and surface topology.
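To keep the three representations reproducible, the PyMOL commands listed above can be collected into small scripts. The sketch below only generates the command text (it does not call PyMOL itself); `REPRESENTATIONS` and `pymol_script` are hypothetical names, and the optional `png` line is an addition for saving an image.

```python
REPRESENTATIONS = {
    "cartoon": ["show cartoon"],
    "ribbon": ["show ribbon"],
    "ball_and_stick": [
        "show sticks",
        "set stick_radius, 0.2",
        "show spheres",
        "set sphere_scale, 0.25",
    ],
}

def pymol_script(pdb_id, representation, image_name=None):
    """Return PyMOL command lines that load a structure and apply one representation."""
    lines = [f"fetch {pdb_id}", "hide everything"]
    lines += REPRESENTATIONS[representation]
    if image_name is not None:
        lines.append(f"png {image_name}")  # optional: save the current view
    return "\n".join(lines)

for name in REPRESENTATIONS:
    print(f"# {name}")
    print(pymol_script("4HG6", name, image_name=f"bcsa_{name}.png"))
    print()
```

Each generated block can be pasted into the PyMOL command line or saved as a `.pml` file.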

Figures: ChimeraX views

B.4 Color the protein by secondary structure. Does it have more helices or sheets?

The protein contains predominantly alpha helices with relatively few beta sheets.

This is consistent with BcsA being a membrane-associated protein, since transmembrane regions are commonly formed by alpha-helical bundles. Several long alpha helices span the membrane and appear organized into a compact extrusion architecture associated with cellulose translocation. The beta-sheet content appears mainly localized within internal globular domains associated with catalytic or regulatory regions.

I used the following PyMOL commands for coloring:

dss
color salmon, ss h
color marine, ss s
color gray, ss l+''
Figure: protein colored by secondary structure

B.4 Color the protein by residue type. What can you tell about the distribution of hydrophobic vs hydrophilic residues?

Figure: hydrophobic residues

Figure: hydrophilic residues

Figures: charged residues (two views)

Hydrophobic residues are mainly buried inside the protein core, contributing to structural stability, while hydrophilic and charged residues are predominantly exposed at the surface, where they may interact with solvent or ligands. In the transmembrane region, hydrophobic residues are also expected on the lipid-facing surface.

This distribution reflects the dual nature of BcsA as both:

  • a membrane-integrated transport system,
  • and an enzymatic catalytic complex.
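The visual impression can be complemented by a rough composition count. This is a sketch under a simplified, assumed grouping of residues (hydrophobic: AVILMFW, charged: DEKR, everything else counted as polar); other schemes assign some residues differently. The example string is an invented fragment, not the BcsA sequence.

```python
RESIDUE_CLASSES = {
    "hydrophobic": set("AVILMFW"),  # assumed grouping
    "charged": set("DEKR"),         # assumed grouping
}

def classify_residue(residue):
    """Return the class of one residue; anything not listed is treated as polar."""
    for name, members in RESIDUE_CLASSES.items():
        if residue in members:
            return name
    return "polar"

def class_fractions(sequence):
    """Fraction of residues in each class for a protein sequence string."""
    counts = {"hydrophobic": 0, "charged": 0, "polar": 0}
    for residue in sequence.upper():
        counts[classify_residue(residue)] += 1
    return {name: count / len(sequence) for name, count in counts.items()}

fractions = class_fractions("LLAVKDES")  # invented toy fragment
for name, value in fractions.items():
    print(f"{name}: {value:.3f}")
```

A sequence-level fraction does not show where residues sit in 3D, so it supports the structural coloring rather than replacing it.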

B.4 Visualize the surface of the protein. Does it have any “holes” (aka binding pockets)?

Surface visualization reveals several cavities and channel-like regions within the protein complex.

These structural pockets are likely associated with:

  • substrate binding,
  • catalytic activity,
  • and cellulose translocation across the membrane.

Figure: surface view showing a pocket

Part C. Using ML-Based Protein Design Tools

C.1 Protein Language Modeling

The protein selected for this section is BcsA (Cellulose synthase catalytic subunit) from Rhodobacter sphaeroides (UniProt: Q3J125, PDB: 4HG6).

BcsA is the catalytic membrane-associated enzyme responsible for bacterial cellulose biosynthesis. It polymerizes UDP-glucose into β-1,4-glucan chains while simultaneously translocating cellulose across the membrane.

Because my HTGAA final project focuses on bacterial cellulose morphogenesis and engineered living materials, BcsA represents a particularly relevant biological fabrication system.


Deep Mutational Scans

1.a Use ESM2 to generate an unsupervised deep mutational scan of your protein based on language model likelihoods.

An unsupervised deep mutational scan was generated using the ESM2 protein language model.

The model predicts the relative likelihood of amino-acid substitutions across the BcsA sequence based on learned statistical and structural constraints from large protein datasets.

The resulting heatmap visualizes:

  • sequence positions (x-axis),
  • amino-acid substitutions (y-axis),
  • and predicted mutation likelihoods (color scale).

Figure: ESM2 deep mutational scan heatmap
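One common way to turn language-model likelihoods into a mutational score is the log-likelihood ratio between the mutant and the wild-type residue at a position. The sketch below assumes that scoring rule (the notebook used here may compute its scores differently) and uses invented probabilities for a single hypothetical buried leucine position, not real ESM2 output.

```python
import math

def mutation_scores(position_probs, wild_type):
    """Score each substitution as log p(mutant) - log p(wild type) at one position."""
    wt_logp = math.log(position_probs[wild_type])
    return {
        residue: math.log(prob) - wt_logp
        for residue, prob in position_probs.items()
        if residue != wild_type
    }

# Invented model probabilities at one position (they would come from the model).
toy_probs = {"L": 0.60, "I": 0.20, "V": 0.15, "D": 0.05}
scores = mutation_scores(toy_probs, wild_type="L")

for residue, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"L -> {residue}: {score:.2f}")
```

Scores near zero mean the model finds the substitution about as plausible as the wild type; strongly negative scores mark disfavored substitutions, which is what a dark column in the heatmap corresponds to.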

1.b Can you explain any particular pattern? (choose a residue and a mutation that stands out)

The mutational heatmap reveals several highly constrained positions appearing as vertical dark bands across many amino-acid substitutions.

These positions are likely associated with:

  • catalytic residues,
  • conserved transmembrane packing regions,
  • or structurally critical motifs required for cellulose synthesis and membrane extrusion.

The overall pattern is consistent with the biological role of BcsA as a highly conserved membrane-associated glycosyltransferase.

Hydrophobic residues such as:

  • leucine,
  • valine,
  • isoleucine,
  • and alanine

appear frequently favored across multiple positions, which is expected for a transmembrane protein rich in alpha-helical membrane domains.

Mutations preserving similar physicochemical properties appear more tolerated than chemically disruptive substitutions.

For example, hydrophobic → hydrophobic substitutions tend to produce weaker predicted effects than substitutions that introduce charge, polarity, or steric disruption inside membrane-associated regions.

One notable observation is that substitutions introducing charged residues within predicted transmembrane helices are strongly disfavored by the model, suggesting structural incompatibility with membrane insertion and packing.


1.c (Bonus) Find sequences for which we have experimental scans, and compare the prediction of the language model to experiment.

No experimental mutational scan dataset specific to BcsA was available during this work.

However, the overall predictions produced by ESM2 appear biologically coherent with known structural constraints of membrane-associated glycosyltransferases:

  • strong conservation of catalytic regions,
  • high sensitivity of transmembrane helices to disruptive substitutions,
  • and tolerance for conservative hydrophobic substitutions.

The results suggest that protein language models capture meaningful structural and evolutionary information even without explicit supervision.


C.1 Latent Space Analysis

1.a Use the provided sequence dataset to embed proteins in reduced dimensionality.

The latent-space analysis was performed using protein embeddings generated by ESM2.

Protein sequences were transformed into high-dimensional embedding vectors representing learned structural and evolutionary features.

These embeddings were then projected into lower-dimensional space using t-SNE for visualization.

Originally, the notebook relied on an external SCOP/ASTRAL FASTA dataset which became temporarily inaccessible during the analysis. The workflow was therefore adapted conceptually using related protein sequences and homologous embeddings.


1.b Analyze the different formed neighborhoods: do they approximate similar proteins?

The latent-space projection organizes proteins into local neighborhoods reflecting structural and functional similarity.

Proteins clustering near one another tend to share:

  • similar folds,
  • catalytic mechanisms,
  • membrane architectures,
  • or conserved domains.

This suggests that the embedding space learned by ESM2 captures biologically meaningful relationships beyond simple sequence identity.

The observed clustering behavior approximates known evolutionary and structural protein families.
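Neighborhoods can also be checked directly in the embedding space, since distances in a t-SNE plot are only qualitatively meaningful. The sketch below ranks neighbors by cosine similarity; the three-dimensional vectors and protein labels are invented toy values, not ESM2 embeddings, and the function names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_neighbors(name, embeddings, k=2):
    """Return the k most similar entries to `name`, best first."""
    query = embeddings[name]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != name
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Invented toy embeddings for illustration only.
toy_embeddings = {
    "BcsA": [0.9, 0.1, 0.2],
    "GT2_enzyme": [0.8, 0.2, 0.1],
    "membrane_transporter": [0.5, 0.1, 0.8],
    "soluble_kinase": [0.1, 0.9, 0.1],
}
neighbors = nearest_neighbors("BcsA", toy_embeddings)
for other, similarity in neighbors:
    print(f"{other}: {similarity:.3f}")
```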


1.c Place your protein in the resulting map and explain its position and similarity to its neighbors.

BcsA is expected to cluster near:

  • glycosyltransferases,
  • polysaccharide synthases,
  • and membrane-associated biosynthetic enzymes.

This position is coherent with:

  • its GT2 catalytic domain,
  • its transmembrane architecture,
  • and its role in extracellular polysaccharide extrusion.

Its embedding likely reflects both:

  • catalytic similarity,
  • and membrane-associated structural constraints.

The latent-space organization therefore supports the interpretation of BcsA as a specialized molecular fabrication system coupling:

  • signaling,
  • polymer synthesis,
  • and extracellular material translocation.

C2. Protein Folding

Folding a protein

1 Fold your protein with ESMFold. Do the predicted coordinates match your original structure?

The BcsA sequence was folded using ESMFold and compared to the experimentally solved structure (PDB: 4HG6).

The predicted structure closely matches the experimental structure.

The global fold, alpha-helical organization, and domain arrangement remain highly similar between prediction and experiment.

This result demonstrates the strong predictive capabilities of modern protein folding models.

The predicted structure preserves:

  • transmembrane helices,
  • catalytic core organization,
  • and overall spatial topology.

The similarity suggests that ESMFold successfully captures both:

  • sequence-derived structural constraints,
  • and long-range interactions within the protein.

2.b Try changing the sequence, first try some mutations, then large segments. Is your protein structure resilient to mutations?

Small point mutations generally produced limited structural changes.

Conservative substitutions often preserved:

  • overall folding,
  • alpha-helical organization,
  • and membrane topology.

However, larger sequence modifications, especially deletions or strongly disruptive substitutions, produced more significant structural perturbations.

Regions associated with:

  • transmembrane packing,
  • catalytic organization,
  • and domain interfaces

appear particularly sensitive to disruption.

Overall, the structure appears relatively resilient to small mutations but less stable under large-scale sequence modifications.

This behavior is consistent with many membrane-associated proteins where:

  • local substitutions may be tolerated,
  • but large disruptions destabilize membrane insertion and folding.

C3. Protein Generation

Inverse-Folding a protein: Let’s now use the backbone of your chosen PDB to propose sequence candidates via ProteinMPNN

ProteinMPNN was used in inverse-folding mode to generate amino-acid sequences compatible with the backbone structure of BcsA.

Instead of predicting structure from sequence, inverse folding predicts sequences capable of adopting a target structure.

This approach explores the relationship between:

  • structural constraints,
  • sequence variability,
  • and protein design.

1. Analyze the predicted sequence probabilities and compare the predicted sequence vs the original one.

The generated sequences preserve many structural and physicochemical properties of the original BcsA sequence.

Conserved regions are especially maintained in:

  • transmembrane helices,
  • catalytic motifs,
  • and structurally constrained regions.

The probability distributions suggest that multiple alternative sequences may still satisfy the same global fold.

Hydrophobic residues remain strongly favored in membrane-associated regions, while more variable positions appear primarily in solvent-exposed loops or flexible regions.

This demonstrates the degeneracy of sequence space: multiple distinct sequences may encode similar structural organizations.
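One simple way to quantify the comparison between a designed and a native sequence is sequence recovery, the fraction of positions where both carry the same residue. The sketch below assumes the two sequences have equal length, as they do when ProteinMPNN redesigns a fixed backbone. The two short sequences are hypothetical toy strings, not BcsA data.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of positions where the designed residue equals the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


def substitutions(native: str, designed: str) -> list[tuple[int, str, str]]:
    """List (1-based position, native residue, designed residue) for each change."""
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(native, designed))
        if a != b
    ]


# Hypothetical toy sequences, for illustration only.
native = "MLVLAGIT"
designed = "MLILAGVT"
print(sequence_recovery(native, designed))  # 0.75
print(substitutions(native, designed))      # [(3, 'V', 'I'), (7, 'I', 'V')]
```

Listing the substitutions by position makes it easy to check whether changes fall in loops or in membrane-spanning segments.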


2. Input this sequence into ESMFold and compare the predicted structure to your original.

The ProteinMPNN-generated sequence was folded again using ESMFold.

The predicted structure remains globally similar to the original BcsA fold.

The resulting structure preserves:

  • alpha-helical membrane organization,
  • catalytic domain topology,
  • and overall architecture.

This suggests that the generated sequence remains compatible with the original structural scaffold.

The experiment highlights the growing capability of AI-based protein design systems to:

  • generate plausible protein sequences,
  • preserve structural organization,
  • and explore functional sequence space computationally.

Part D. Group Brainstorm on Bacteriophage Engineering

See Life Lab group.

Week 05 HW: Protein design part II

Table of Contents


Part A: SOD1 Binder Peptide Design

Superoxide dismutase 1 (SOD1) is an antioxidant enzyme. The A4V mutation in this protein destabilizes its folding and promotes toxic aggregation, leading to ALS. The goal of this exercise is to design and evaluate short 12-amino-acid peptides that bind to this mutant and stabilize it, using modern Machine Learning pipelines.

Part 1. Generate Binders with PepMLM

I retrieved the human SOD1 sequence (UniProt P00441) and introduced the ALS-associated A4V mutation.

Because SOD1 numbering conventionally excludes the initiator methionine, the A4V mutation corresponds to replacing the fifth residue in the UniProt sequence.

Wild-type SOD1 sequence:

MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

A4V mutant SOD1 sequence:

MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
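The numbering offset described above can be sketched in a few lines. The helper name `apply_mutation` and its `offset` parameter are illustrative assumptions, not part of PepMLM or any other tool used here. Only the first 11 residues of the sequence are used to keep the example short.

```python
def apply_mutation(sequence: str, mutation: str, offset: int = 0) -> str:
    """Apply a point mutation such as 'A4V' to a protein sequence.

    offset is the number of leading residues that the mutation numbering
    skips (1 when the initiator methionine is not counted).
    """
    wild_type, position, mutant = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position + offset - 1  # convert to a 0-based string index
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + mutant + sequence[index + 1:]


# First 11 residues of the wild-type sequence above, initiator Met included.
sod1_start = "MATKAVCVLKG"
print(apply_mutation(sod1_start, "A4V", offset=1))  # MATKVVCVLKG
```

Checking the wild-type residue before substituting catches off-by-one errors: without the offset, position 4 is a lysine and the function raises an error.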

Using the PepMLM model, I generated candidate binders against the A4V mutant SOD1 sequence.

Known baseline binder: FLYRWLPSRRGG

Generated Candidates & Perplexity Scores:

| # | PepMLM output | Perplexity | Sequence used (X replaced by A) |
|---|---|---|---|
| 0 | GSWPPWLIAKEFKYKLKKSGYSWSAGAAHEAEAAWARAEAVARVAEEALX | 19.625494735054094 | GSWPPWLIAKEFKYKLKKSGYSWSAGAAHEAEAAWARAEAVARVAEEALA |
| 1 | WSWWEAAIEEALEYYKKESSSATAAGHAHTDAWAWAARVLAGALLLAAAR | 17.830449559706416 | unchanged |
| 2 | TSSPAWAITAYFEELYSTKYGSTKGHAHAGGGGEAALVALLAVRLEYAAG | 23.151092004161082 | unchanged |
| 3 | WTSWATAAKKAYKLYGRKKAAAAAGSHHAEGGAEEARAAGALRRREALLX | 16.85729772690791 | WTSWATAAKKAYKLYGRKKAAAAAGSHHAEGGAEEARAAGALRRREALLA |


Part 2. Evaluate Binders with AlphaFold3

1. Navigate to the AlphaFold Server: alphafoldserver.com

2. For each peptide, submit the mutant SOD1 sequence followed by the peptide sequence as separate chains to model the protein-peptide complex.

To verify structural binding, I modeled the mutant SOD1-peptide complexes using the AlphaFold3 Server.


3. Record the ipTM score and briefly describe where the peptide appears to bind. Does it localize near the N-terminus where A4V sits? Does it engage the β-barrel region or approach the dimer interface? Does it appear surface-bound or partially buried?

Structural Results (ipTM scores & Binding location):

  • Known binder (FLYRWLPSRRGG): ipTM = 0.42, pTM = 0.83. The pTM value indicates a confident structural prediction for the folded SOD1 protein. However, the interface predicted TM-score (ipTM) was lower (0.42), suggesting only weak to moderate confidence in the peptide-protein interaction itself. Visually, the peptide appeared mostly surface-associated rather than deeply buried within a defined binding pocket. It localized near the exterior surface of SOD1 rather than forming a strong, highly structured interface.

This may indicate:

  • transient binding,
  • weak affinity,
  • or limited structural specificity between the peptide and the A4V mutant region.

The result highlights an important limitation of current peptide-binding prediction workflows: a peptide can appear structurally plausible while still exhibiting low-confidence intermolecular interactions.

4. In a short paragraph, describe the ipTM values you observe and whether any PepMLM-generated peptide matches or exceeds the known binder.

I evaluated the interaction between the A4V mutant SOD1 protein and one PepMLM-generated peptide using AlphaFold3.

  • Tested peptide: WSWWEAAIEEALEYYKKESSSATAAGHAHTDAWAWAARVLAGALLLAAAR
  • Predicted complex: ipTM = 0.21, pTM = 0.72

The AlphaFold3 prediction suggests weak or limited interaction confidence between the generated peptide and the SOD1 target. The peptide appears largely surface-associated rather than deeply integrated into the protein structure. No strong binding pocket or stable interface near the A4V mutation site was clearly observed.

The predicted SOD1 structure itself remained globally stable and preserved the characteristic β-barrel fold expected for SOD1, while the peptide adopted a mostly α-helical conformation.

Compared to the reference binder, the PepMLM-generated peptide did not appear to achieve stronger interaction confidence. However, this experiment demonstrates the general workflow:

  • generate peptide candidates with PepMLM,
  • evaluate structural interaction using AlphaFold3,
  • compare predicted interaction metrics such as ipTM,
  • and analyze potential binding interfaces.

Overall, this exercise highlighted both the potential and current limitations of AI-based peptide binder generation workflows. While language-model-generated peptides can produce structurally plausible candidates, reliable prediction of functional binding interactions remains challenging and often requires extensive computational screening and experimental validation.

Part 3. Evaluate Properties in PeptiVerse

Structural confidence (ipTM) is only one dimension of design. A peptide must also be physically viable as a therapeutic. I used PeptiVerse to predict the functional properties of my top candidates.

Properties for my best candidate ([insert the best sequence]):

  • 🔗 Binding Affinity: 5.555 (pKd/pKi)
  • 💧 Solubility: 1.000 (soluble)
  • 🩸 Hemolysis: 0.047 (non-hemolytic)
  • ⚡ Net Charge (pH 7): 2.76
  • ⚖️ Molecular Weight: 1507.7 Da
  • 📏 Length: 12 aa
  • 🎯 Isoelectric Point: pH 11.71
  • 💦 Hydrophobicity (GRAVY): -0.71
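The GRAVY score in this list is straightforward to reproduce by hand: it is the mean Kyte-Doolittle hydropathy value over all residues. The sketch below is a minimal re-implementation for checking purposes and is not PeptiVerse's own code. Applied to the known baseline binder FLYRWLPSRRGG, it gives about -0.71, the same value as the GRAVY entry above.

```python
# Kyte-Doolittle hydropathy scale (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide.upper()) / len(peptide)


print(round(gravy("FLYRWLPSRRGG"), 2))  # -0.71
```

A negative GRAVY indicates a hydrophilic peptide overall, which is consistent with the high predicted solubility.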

Decision & Justification: While some peptides displayed potentially favorable structural interactions in AlphaFold3, PeptiVerse predictions suggest that several candidates may exhibit limited therapeutic potential due to poor solubility or elevated hemolysis probability.

This highlights the importance of evaluating both:

  • structural compatibility,
  • and physicochemical therapeutic properties.

A peptide with moderate predicted binding but improved solubility and lower hemolysis risk may represent a better therapeutic compromise than the strongest structural binder alone.

Part 4. Generate Optimized Peptides with moPPIt

Moving from probabilistic sampling to controlled design, I set up moPPIt (Multi-Objective Guided Discrete Flow Matching) to steer peptide generation toward residues near the A4V mutation site of SOD1.

Observations: Due to compute limitations and GPU requirements, the full moPPIt optimization could not be completed, either locally or on a paid Google A100 GPU service.

However, the workflow was successfully configured by:

  • defining the A4V SOD1 mutant as the target protein,
  • selecting residues near the mutation site as motif-guided binding positions,
  • and enabling affinity, solubility, and hemolysis optimization objectives.

Compared to PepMLM, moPPIt introduces a more controlled generative strategy by explicitly steering peptide generation toward predefined biological and therapeutic constraints.

Conceptually, this approach represents a transition from unconstrained statistical peptide sampling toward guided multi-objective biomolecular design.


Part C: Final Project: L-Protein Mutants

As part of the global HTGAA effort to engineer bacteriophages against antibiotic resistance, the goal here is to mutate the MS2 phage L-Protein. A common E. coli resistance mechanism involves a mutation in DnaJ that prevents L-protein binding. By engineering the L-protein, we aim to overcome this chaperone dependency.

Based on mutational analysis and structure-based models, here are my 5 proposed L-protein mutations.

Design Constraints applied:

  • At least 2 variants in the transmembrane region (affects lysis activity directly).
  • At least 2 variants in the soluble region (domain responsible for DnaJ interaction).

Proposed Mutations

Variant 1 (Transmembrane Region)

  • Mutations: [Ex: L25A, F28V...]
  • Rationale: [Ex: Modifying these hydrophobic residues may alter the oligomerization dynamics of the pore without disrupting membrane insertion.]

Variant 2 (Transmembrane Region)

  • Mutations: [Ex: I20V, L21A...]
  • Rationale: [Ex: Derived from positive mutational scores, aiming to create a faster integration into the E. coli membrane.]

Variant 3 (Soluble Region - DnaJ Interaction)

  • Mutations: [Ex: R15A, K16E...]
  • Rationale: [Ex: By changing the charge distribution in the soluble tail, we aim to decrease the L-protein's dependency on the DnaJ chaperone for folding.]

Variant 4 (Soluble Region)

  • Mutations: [Ex: D22N, Y24F...]
  • Rationale: [Ex: Selected based on sequence alignment (avoiding highly conserved sites) to allow autonomous insertion.]

Variant 5 (Combinatorial / Random)

  • Mutations: [Ex: T10A, P11G...]
  • Rationale: [Ex: A broader structural perturbation to test if increased flexibility in the N-terminus accelerates the breakdown of the membrane.]

Week 06 HW: Genetic circuit part I

Table of Contents


Assignment: DNA Assembly

1. What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?

The Phusion High-Fidelity PCR Master Mix contains several essential components required for DNA amplification. The protocol specifically lists the use of Phusion HF PCR Mix (2X) during PCR setup.

Typical components include:

  • Phusion DNA Polymerase: a high-fidelity DNA polymerase responsible for synthesizing new DNA strands with very low error rates.
  • dNTPs (deoxynucleotide triphosphates): the molecular building blocks (A, T, G, C) used to synthesize DNA.
  • Reaction buffer: maintains optimal ionic strength and pH for enzyme activity.
  • Mg²⁺ ions: essential cofactor required for polymerase function.
  • Stabilizers and salts: improve enzyme stability and PCR efficiency.

The use of a high-fidelity polymerase is especially important in cloning experiments because mutations introduced during PCR could alter the final protein sequence.

2. What are some factors that determine primer annealing temperature during PCR?

Several factors determine the annealing temperature during PCR:

  • primer length,
  • GC content,
  • melting temperature (Tm),
  • sequence complementarity,
  • and secondary structures.

The protocol recommends:

  • primers with a Tm between approximately 52–58°C,
  • primer pairs within 5°C of each other,
  • GC content around 40–60%,
  • and inclusion of a GC clamp at the 3′ end.

As described in the appendix, the annealing temperature is generally chosen about 2–5°C below the lower primer Tm.

If the annealing temperature is too low, nonspecific binding may occur. If it is too high, primers may fail to bind efficiently.
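The primer rules above can be checked with a short script. This is a rough sketch: it uses the Wallace rule (2°C per A/T, 4°C per G/C), a simple approximation that is an assumption here and not the method named in the protocol. Primer design tools usually rely on nearest-neighbor thermodynamics and will report different Tm values. The two primer sequences are hypothetical.

```python
def gc_content(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> float:
    """Approximate Tm in °C with the Wallace rule: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2.0 * at + 4.0 * gc


def has_gc_clamp(primer: str) -> bool:
    """True if the 3' terminal base is G or C."""
    return primer.upper()[-1] in "GC"


def suggest_annealing(forward: str, reverse: str, margin: float = 3.0) -> float:
    """Annealing temperature set `margin` °C below the lower primer Tm."""
    return min(wallace_tm(forward), wallace_tm(reverse)) - margin


# Hypothetical primer pair, for illustration only.
forward = "ATGAAAGCTCTGACCGTC"
reverse = "TTAGCCTTCACGAGTACG"
for name, primer in (("forward", forward), ("reverse", reverse)):
    print(name, f"GC={gc_content(primer):.0%}",
          f"Tm~{wallace_tm(primer):.0f}C", f"clamp={has_gc_clamp(primer)}")
print("suggested annealing:", suggest_annealing(forward, reverse))
```

The `margin` default of 3°C sits inside the 2–5°C window given above and can be adjusted per reaction.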

3. Compare PCR and restriction enzyme digests for generating linear DNA fragments

Both PCR and restriction enzyme digestion can generate linear DNA fragments, but they work differently and are useful in different contexts.

PCR amplifies a specific DNA region using:

  • primers,
  • polymerase,
  • thermal cycling.

Advantages:

  • highly specific,
  • can introduce mutations,
  • can add Gibson overlaps,
  • does not require restriction sites.

Disadvantages:

  • may introduce amplification errors,
  • requires careful primer design.

In this protocol, PCR is used to generate:

  • the backbone fragment,
  • and the color insert fragment for Gibson assembly.

Restriction Enzyme Digest

Restriction digestion cuts DNA at specific recognition sequences using enzymes.

Advantages:

  • precise cleavage,
  • efficient for existing plasmid architectures.

Disadvantages:

  • requires compatible restriction sites,
  • less flexible for mutagenesis,
  • may leave unwanted scars.

Restriction digests are preferable when:

  • suitable restriction sites already exist,
  • and no sequence modification is needed.

PCR is preferable when:

  • introducing mutations,
  • assembling custom fragments,
  • or performing Gibson assembly.

4. How can you ensure that the DNA sequences are appropriate for Gibson cloning?

For Gibson Assembly, DNA fragments must contain overlapping homologous regions.

The protocol specifies:

  • overlaps of approximately 20–40 bp,
  • correct 5′→3′ orientation,
  • and complementary overhangs designed through primers.

To ensure compatibility:

  • primers must include overlap regions,
  • fragments must be purified,
  • PCR products should be verified by gel electrophoresis,
  • fragment sizes should match expected values.

The protocol also uses DpnI digestion to remove the methylated parental plasmid template after PCR. This reduces background colonies carrying the original, unmodified plasmid.
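The overlap requirement can be verified in silico before ordering primers. The sketch below finds the longest stretch where the 3′ end of one fragment matches the 5′ start of the next and checks it against the 20–40 bp window. Function names and the test fragments are hypothetical. All fragments are assumed to be written 5′→3′ on the same strand.

```python
def shared_overlap(upstream: str, downstream: str) -> int:
    """Length of the longest suffix of `upstream` that is a prefix of `downstream`."""
    limit = min(len(upstream), len(downstream))
    for length in range(limit, 0, -1):
        if upstream[-length:] == downstream[:length]:
            return length
    return 0


def check_gibson(fragments, min_len=20, max_len=40, circular=True):
    """Return (overlap length, within range?) for each junction, in order."""
    if circular:
        # The last fragment must also overlap the first to close the plasmid.
        pairs = zip(fragments, fragments[1:] + fragments[:1])
    else:
        pairs = zip(fragments, fragments[1:])
    results = []
    for upstream, downstream in pairs:
        length = shared_overlap(upstream.upper(), downstream.upper())
        results.append((length, min_len <= length <= max_len))
    return results


# Hypothetical fragments sharing a 22 bp junction.
junction = "GATTACAGATTACAGATTACAG"
fragment_a = "TTTT" + junction
fragment_b = junction + "CCCC"
print(check_gibson([fragment_a, fragment_b], circular=False))  # [(22, True)]
```

For a circular plasmid assembled from backbone and insert, `circular=True` also checks the junction between the last and first fragment.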

5. How does plasmid DNA enter E. coli cells during transformation?

The protocol describes two common transformation methods:

  • heat shock,
  • electroporation.

Both methods temporarily create pores in the bacterial membrane.

Heat shock: a rapid temperature increase causes transient membrane destabilization.

Electroporation: a high-voltage electrical pulse creates temporary membrane pores.

According to the protocol:

  • “The plasmid now enters the cells by diffusion.”

After transformation, cells recover in SOC medium, express antibiotic resistance genes, and are plated on selective media.

Only transformed cells survive antibiotic selection.

6. Describe another assembly method in detail: Golden Gate Assembly

Golden Gate Assembly is a cloning method based on:

  • Type IIS restriction enzymes,
  • and DNA ligase.

Unlike standard restriction enzymes, Type IIS enzymes cut outside of their recognition sequence, allowing the generation of custom overhangs.

This enables:

  • scarless assembly,
  • directional cloning,
  • simultaneous assembly of multiple fragments in one reaction.

A Golden Gate reaction typically alternates between:

  • digestion,
  • and ligation cycles.

The restriction enzyme keeps re-cutting any product that still carries a recognition site, while ligase seals correctly matched fragments.

Because the recognition sites are removed during assembly:

  • the final construct is stable,
  • and cannot be recut.

Golden Gate is particularly useful for:

  • modular cloning,
  • synthetic biology,
  • combinatorial DNA assembly,
  • and large multi-fragment constructs.

Example Diagram of Golden Gate Assembly

Fragment A --[BsaI]--> sticky end A
Fragment B --[BsaI]--> sticky end B

Digest → compatible overhangs
Ligate → seamless assembly

Final construct:
[A][B][C][D]
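The diagram can be made concrete with a small script that locates BsaI sites and reads off the 4-nt overhangs. BsaI recognizes GGTCTC and cuts one nucleotide downstream on the top strand and five on the bottom strand, which leaves a 4-nt 5′ overhang. The part sequence and function names below are hypothetical. Overhangs are reported in top-strand coordinates.

```python
BSAI_SITE = "GGTCTC"
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def bsai_overhangs(seq: str) -> list[str]:
    """4-nt overhangs produced by BsaI, read on the top strand."""
    seq = seq.upper()
    overhangs = []
    # Forward sites: GGTCTC N ^ NNNN (overhang follows the site).
    start = seq.find(BSAI_SITE)
    while start != -1:
        cut = start + len(BSAI_SITE) + 1
        if cut + 4 <= len(seq):
            overhangs.append(seq[cut:cut + 4])
        start = seq.find(BSAI_SITE, start + 1)
    # Reverse sites appear as GAGACC on the top strand: NNNN N GAGACC.
    rc_site = reverse_complement(BSAI_SITE)
    start = seq.find(rc_site)
    while start != -1:
        if start - 5 >= 0:
            overhangs.append(seq[start - 5:start - 1])
        start = seq.find(rc_site, start + 1)
    return overhangs


def is_palindromic(overhang: str) -> bool:
    """Palindromic overhangs can ligate to themselves and should be avoided."""
    return overhang.upper() == reverse_complement(overhang)


# Hypothetical part: BsaI site, overhang AATG, insert, overhang GCTT, BsaI site.
part = "GGTCTCA" + "AATG" + "CCCCCC" + "GCTT" + "T" + "GAGACC"
print(bsai_overhangs(part))                       # ['AATG', 'GCTT']
print([is_palindromic(o) for o in ("AATG", "AATT")])  # [False, True]
```

Two parts join when the downstream overhang of one equals the upstream overhang of the next, so comparing these lists across parts is a quick check of assembly order.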

Benchling / Modeling Component

Golden Gate Assembly can be modeled in Benchling by:

  • defining BsaI restriction sites,
  • designing compatible overhangs,
  • and simulating fragment assembly.

Benchling allows:

  • visualization of sticky ends,
  • plasmid circularization,
  • and verification of reading frames and orientation.

Asimov Kernel Assignment — Repository and Circuit Design

I was able to access the Asimov Kernel interface and explore the public repositories, including the Characterized Bacterial Parts repository and the Bacterial Demos examples. However, my account did not appear to have the necessary permissions or node access required to fully create, save, and simulate Constructs within a personal Repository.

Nevertheless, I investigated how the system is intended to function and reconstructed the expected logic of the exercise conceptually.

Expected Repressilator Design

The Repressilator is a synthetic oscillatory gene circuit composed of three repressors connected in a cyclic inhibition loop:

  • pTetR → LacI
  • pLacI → LambdaCI
  • pLambdaCI → TetR

The expected regulatory behavior is:

  • TetR represses pTetR
  • LacI represses pLacI
  • LambdaCI represses pLambdaCI

This creates a delayed negative feedback loop capable of producing oscillatory dynamics in gene expression.
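Since the Kernel simulator was not available, the expected oscillation can be illustrated with a minimal dimensionless model. This is a sketch under stated assumptions: each protein is produced at rate alpha / (1 + repressor^n) and decays at rate 1, mRNA is not modeled, and the parameter values (alpha = 20, Hill coefficient 3) are arbitrary choices that place this simplified model in its oscillatory regime. None of these values come from Kernel or from characterized parts.

```python
def repressilator(alpha=20.0, hill=3.0, dt=0.01, steps=10000,
                  start=(1.0, 1.5, 2.0)):
    """Euler integration of a protein-only repressilator model.

    Returns a list of (TetR, LacI, LambdaCI) levels, one per time step.
    """
    tetr, laci, ci = start  # unequal start values break the symmetry
    trajectory = [(tetr, laci, ci)]
    for _ in range(steps):
        d_laci = alpha / (1.0 + tetr ** hill) - laci  # pTetR -> LacI
        d_ci = alpha / (1.0 + laci ** hill) - ci      # pLacI -> LambdaCI
        d_tetr = alpha / (1.0 + ci ** hill) - tetr    # pLambdaCI -> TetR
        tetr += dt * d_tetr
        laci += dt * d_laci
        ci += dt * d_ci
        trajectory.append((tetr, laci, ci))
    return trajectory


trajectory = repressilator()
late_tetr = [state[0] for state in trajectory[len(trajectory) // 2:]]
# A large spread late in the run indicates sustained oscillation
# rather than settling to a steady state.
print("TetR range after transient:", min(late_tetr), max(late_tetr))
```

Lowering `hill` to 2 in this simplified model removes the oscillation, which illustrates how sensitive the behavior is to repression cooperativity.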

To construct this system in Kernel, the workflow would likely involve:

  • Creating a blank Construct inside a Repository.
  • Searching the Characterized Bacterial Parts database.
  • Dragging promoters, CDS regions, RBS elements, and terminators into the Construct editor.
  • Linking the transcriptional units sequentially.
  • Running the simulator to observe oscillatory expression patterns.

Proposed Personal Constructs

Construct 1: Self-Repression Circuit

  • pLacI → LacI

This circuit would likely produce negative autoregulation. I would expect expression to stabilize at an intermediate level rather than continuously increasing.

Construct 2: Toggle Switch

  • pLacI → TetR
  • pTetR → LacI

This mutual repression architecture should behave as a bistable toggle switch where one repressor dominates while suppressing the other.
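Bistability can be illustrated with the same kind of minimal dimensionless model: two repressors, each produced at rate alpha / (1 + other^n) and decaying at rate 1. The parameter values (alpha = 10, Hill coefficient 2) are illustrative assumptions, not measured part properties.

```python
def toggle_switch(laci0, tetr0, alpha=10.0, hill=2.0, dt=0.01, steps=5000):
    """Euler integration of a two-repressor toggle; returns final (LacI, TetR)."""
    laci, tetr = laci0, tetr0
    for _ in range(steps):
        d_laci = alpha / (1.0 + tetr ** hill) - laci  # pTetR -> LacI
        d_tetr = alpha / (1.0 + laci ** hill) - tetr  # pLacI -> TetR
        laci += dt * d_laci
        tetr += dt * d_tetr
    return laci, tetr


# Same circuit, two different starting points, two different end states.
print(toggle_switch(2.0, 1.0))  # LacI ends high, TetR low
print(toggle_switch(1.0, 2.0))  # TetR ends high, LacI low
```

Whichever repressor starts with a slight advantage ends up dominating, and the state persists, which is the memory-like behavior expected from a toggle switch.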

Construct 3: Repression Cascade

  • pLambdaCI → LacI
  • pLacI → TetR

This design would likely create delayed repression behavior rather than oscillations. Changes in upstream regulation would propagate progressively through the circuit.

Expected Simulation Behavior

If the simulator were fully accessible, I would expect:

  • oscillatory curves for the repressilator,
  • stable equilibria for self-repression,
  • bistable states for the toggle switch,
  • and delayed temporal dynamics for the repression cascade.

Differences between expected and simulated behavior could result from:

  • promoter strength imbalance,
  • degradation rate settings,
  • insufficient repression efficiency,
  • stochastic effects,
  • or simulation parameter choices.

Adjusting repression constants, degradation rates, or transcriptional delays would likely help tune the system toward the expected behavior.

Week 7 — Genetic Circuits Part II: Neuromorphic Circuits

Table of Contents


Assignment Part 1: Intracellular Artificial Neural Networks (IANNs)

1. What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?

Intracellular Artificial Neural Networks (IANNs) provide several advantages compared to traditional Boolean genetic circuits.

Traditional genetic circuits generally operate using discrete ON/OFF logic gates such as: AND, OR, NOT, NAND. These systems are powerful for simple decision-making but are limited when dealing with:

  • noisy biological signals,
  • continuous gradients,
  • or complex nonlinear relationships.

In contrast, IANNs can process information in a graded and weighted manner, similarly to artificial neural networks in machine learning. Instead of binary responses, IANNs can integrate multiple inputs with different strengths and produce continuous outputs. This allows:

  • analog computation,
  • pattern recognition,
  • signal integration,
  • adaptive responses,
  • and more robust behavior in noisy biological environments.

Another advantage is scalability. As Boolean circuits become larger, they often suffer from:

  • metabolic burden,
  • crosstalk,
  • and combinatorial complexity.

Neural-network-inspired architectures may allow more compact and flexible computation using weighted interactions between regulators such as:

  • transcription factors,
  • endoribonucleases,
  • or RNA regulators.

Finally, IANNs are conceptually closer to biological systems themselves, which rarely behave as strictly binary systems and instead rely heavily on gradients, thresholds, and probabilistic regulation.


2. Describe a useful application for an IANN

One useful application for an intracellular artificial neural network would be a programmable therapeutic cell capable of detecting complex disease states from multiple biomarkers.

For example, engineered immune or bacterial cells could monitor combinations of:

  • inflammatory markers,
  • cancer-associated metabolites,
  • hypoxia,
  • pH,
  • or signaling molecules.

Instead of responding to a single threshold, the IANN could integrate weighted biological inputs and classify whether the cellular environment corresponds to a pathological condition.

Example Input/Output Behavior

Inputs:

  • X1 = hypoxia marker
  • X2 = inflammatory cytokine
  • X3 = tumor-associated metabolite

Hidden layer:

weighted integration through RNA regulators or endoribonucleases

Output:

fluorescent reporter, therapeutic protein, or apoptosis-inducing signal.

The network could produce a weak output for isolated signals, but strong activation only when a pathological combination of signals is detected.

This would reduce false positives and enable more context-aware therapies.
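The input/output behavior above can be sketched as a single artificial neuron: a weighted sum of the three inputs passed through a sigmoid. The weights and bias are hypothetical numbers chosen for illustration. In a cell they would correspond to quantities such as promoter strengths or regulator affinities, not to stored parameters.

```python
import math


def perceptron_output(inputs, weights, bias):
    """Weighted sum of inputs followed by a sigmoid, giving a value in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights: each signal alone is too weak to overcome the bias.
weights = [4.0, 4.0, 4.0]   # X1 hypoxia, X2 cytokine, X3 metabolite
bias = -10.0

cases = {
    "hypoxia only": [1, 0, 0],
    "hypoxia + cytokine": [1, 1, 0],
    "all three signals": [1, 1, 1],
}
for label, inputs in cases.items():
    print(f"{label}: output = {perceptron_output(inputs, weights, bias):.3f}")
```

With these numbers the output stays near zero for one signal, rises slightly for two, and becomes strong only when all three are present.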

Limitations

Several limitations currently affect IANN implementation:

  • biological noise,
  • stochastic gene expression,
  • metabolic burden,
  • limited orthogonal regulatory parts,
  • slow response times,
  • and difficulty tuning precise weights between biological components.

Another challenge is signal interference between layers, especially in large intracellular networks. Unlike electronic neural networks, biological systems are constrained by:

  • resource competition,
  • molecular degradation,
  • diffusion,
  • and evolutionary instability.

Nevertheless, IANNs represent an important direction toward adaptive and programmable cellular computation. In biological neural-like systems, weights may emerge from molecular concentration, degradation rates, binding affinity, and diffusion dynamics rather than fixed numerical parameters.


3. Intracellular Multilayer Perceptron Diagram

Below is a conceptual diagram of a multilayer intracellular perceptron.

Layer 1 produces an endoribonuclease regulator. Layer 2 uses this regulator to modulate fluorescent protein expression.

flowchart LR

subgraph Layer1
X1[X1]
X2[X2]
ENDOA[EndoRNase A]
ENDOB[EndoRNase B]
end

subgraph Layer2
REG[Regulated mRNA]
GFP[GFP Output]
end

X1 --> ENDOA
X2 --> ENDOB

ENDOA -. repress .-> REG
ENDOB -. modulate .-> REG

REG --> GFP

In this multilayer architecture:

  • the first layer processes initial biological inputs,
  • the intermediate layer computes regulatory transformations,
  • and the final layer controls reporter expression.

This creates hierarchical intracellular computation analogous to multilayer artificial neural networks.


Assignment Part 2: Fungal Materials

1. What are some examples of existing fungal materials and what are they used for? What are their advantages and disadvantages over traditional counterparts?

Fungal materials are biomaterials produced using fungal mycelium, the filamentous vegetative network of fungi. In recent years, mycelium-based materials have been explored as sustainable alternatives to plastics, foams, leather, textiles, and construction materials.

Examples include:

  • Mycelium packaging materials: companies such as Ecovative produce mycelium-based packaging as an alternative to expanded polystyrene foam. Agricultural waste is colonized by fungal mycelium, which binds the substrate into lightweight composite materials.
  • Mycelium leather: companies such as MycoWorks and Bolt Threads develop fungal leather alternatives for fashion and upholstery. These materials imitate some properties of animal leather while avoiding animal agriculture.
  • Construction materials: mycelium composites are explored for insulation panels, acoustic materials, and lightweight structural elements due to their low density and thermal properties.
  • Biofabricated textiles and design objects: designers and researchers use fungal growth to create experimental furniture, wearable materials, and biohybrid artifacts.

Advantages over traditional materials include:

  • biodegradability,
  • renewable feedstocks,
  • low-energy production,
  • carbon sequestration potential,
  • and compatibility with circular material systems.

Unlike petroleum-derived plastics, fungal materials can often be composted at end of life and grown from agricultural waste streams.

However, fungal materials also present limitations:

  • lower mechanical strength,
  • moisture sensitivity,
  • variability between growth batches,
  • slower production times,
  • and challenges in large-scale industrial standardization.

Many fungal materials also require post-processing treatments to stabilize growth and improve durability, which can reduce some of their ecological advantages.

My course on mycelium

2. What might you want to genetically engineer fungi to do and why? What are the advantages of doing synthetic biology in fungi as opposed to bacteria?

Fungi could be genetically engineered to produce materials with programmable mechanical, optical, electrical, or biochemical properties.

For example, engineered fungi could:

  • produce conductive biomaterials,
  • synthesize pigments or bioactive molecules,
  • self-heal damaged structures,
  • sense environmental changes,
  • or generate morphologically controlled growth patterns.

In the context of biofabrication, one interesting possibility is engineering fungal mycelium to act as a living morphogenetic substrate capable of:

  • growing predefined architectures,
  • responding to environmental stimuli,
  • or integrating sensing and computation directly into materials.

Fungi are particularly interesting because they naturally form:

  • large-scale filamentous networks,
  • spatially distributed structures,
  • and mechanically coherent materials.

Compared to bacteria, fungi offer several advantages for synthetic biology:

  • Multicellular and filamentous growth: fungi naturally generate large interconnected structures that are better suited for material fabrication than bacterial colonies.
  • Complex extracellular matrices: fungi produce chitin, glucans, hydrophobins, and other structural polymers useful for biomaterials.
  • Mechanical robustness: mycelial networks can create macroscopic materials with structural integrity.
  • Spatial morphogenesis: fungal growth inherently involves branching, differentiation, and environmental adaptation.
  • Compatibility with biofabrication: fungi can directly colonize scaffolds and substrates to generate large objects.

Bacteria, on the other hand, are generally:

  • easier to engineer genetically,
  • faster to grow,
  • and better characterized molecularly.

However, bacteria usually lack the large-scale structural organization naturally found in fungal mycelium.

Because of this, fungi occupy a particularly interesting space between organism, material, and morphogenetic fabrication system.

This makes fungal synthetic biology highly relevant for:

  • sustainable manufacturing,
  • living materials,
  • biohybrid systems,
  • and programmable ecological fabrication.


Assignment Part 3: First DNA Twist Order

DNA Design Challenge — Insert Design Submission

For the DNA Design Challenge, I designed an expression cassette based on the tyrosinase Tyr1 gene from Bacillus megaterium for heterologous expression in Komagataeibacter rhaeticus.

The goal of this construct is to enable the biosynthesis of eumelanin within bacterial cellulose pellicles, following the strategy demonstrated by Walker et al. (2024).


Expression Cassette Design

The insert was designed as a complete bacterial expression cassette including:

Promoter → RBS → tyr1 CDS → Stop Codon → Terminator

More specifically:

| Element | Function |
|---|---|
| pJ23104 constitutive promoter | continuous transcription |
| BBa_B0034 RBS | translation initiation |
| codon-optimized tyr1 CDS | tyrosinase production |
| TAA stop codon | translation termination |
| BBa_B0015 terminator | transcription termination |

The Tyr1 coding sequence was codon-optimized for K. rhaeticus expression.

Backbone Vector and DNA Assembly

The Tyr1 expression cassette was assembled and visualized in Benchling using the pTwist Amp High Copy plasmid backbone, a common commercial cloning vector used for DNA synthesis and sequence delivery.

The plasmid contains:

  • an ampicillin resistance marker (ampR),
  • a high-copy bacterial origin of replication,
  • and multiple cloning sites compatible with downstream assembly workflows.

The Tyr1 insert was positioned within the cloning region of the plasmid to simulate a synthesis-ready construct.

Benchling Assembly

The construct includes:

  • promoter,
  • ribosome binding site (RBS),
  • Tyr1 coding sequence,
  • terminator,

and represents a conceptual expression cassette for eumelanin biosynthesis in Komagataeibacter rhaeticus.

Insert-Level View

The translation view confirms:

correct reading frame, continuous open reading frame, and proper placement of the coding sequence downstream of the promoter and RBS.
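The checks listed above (reading frame, continuous ORF) can also be scripted outside Benchling. The following is a minimal sketch, not the workflow actually used here: the function name `check_orf` is hypothetical, the toy sequence is a placeholder rather than the real tyr1 CDS, and it assumes an ATG start codon and the standard stop codons.

```python
# Minimal ORF sanity check. The toy sequence below is a hypothetical
# placeholder, NOT the codon-optimized tyr1 CDS.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_orf(cds: str) -> list:
    """Return a list of problems found in a coding sequence (empty list = OK)."""
    cds = cds.upper().replace(" ", "").replace("\n", "")
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length is not a multiple of 3")
    if not cds.startswith("ATG"):
        problems.append("does not start with ATG")
    # Split into complete codons only, ignoring any trailing partial codon.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("internal stop codon")
    return problems


toy_cds = "ATGGCTAAATAA"  # ATG GCT AAA TAA
print(check_orf(toy_cds))
```

An empty list means the sequence passes all four checks. The real CDS exported from Benchling could be passed in the same way.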

[Figure: cassette]

Benchling Design

The sequence and construct architecture were designed in Benchling:

https://benchling.com/s/seq-cSFfevSwjHFxf6KwwXGx?m=slm-D8VDordW2oyuqdI6bH4r

Experimental Status

The computational and sequence design stages were completed. However, due to limitations in available infrastructure and wet-lab logistics during the course, the full DNA synthesis and cloning workflow was not experimentally completed within the class timeline.

Nevertheless, the construct architecture closely follows experimentally validated systems reported in: Walker et al., 2024 — Self-pigmenting textiles grown from cellulose-producing bacteria with engineered tyrosinase expression.

Week 09 HW: Cell-free system


Homework Part A: General and Lecturer-Specific Questions

1. Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.

Cell-free protein synthesis (CFPS) allows protein production outside living cells by using extracted cellular machinery such as ribosomes, enzymes, and cofactors. Unlike in vivo systems, CFPS provides direct control over reaction conditions without needing to maintain cell viability.

Major advantages include:

rapid prototyping of genetic constructs, precise control over temperature, pH, salts, cofactors, and energy supply, easier incorporation of non-natural amino acids, absence of cellular toxicity constraints, direct access to the biochemical environment.

Because there is no membrane barrier or growth requirement, researchers can manipulate the system much more freely than in living organisms.

Cell-free systems are especially beneficial for:

  • Toxic proteins: Some proteins damage or kill host cells during expression. CFPS bypasses this limitation because no living organism must survive the production process.
  • Membrane proteins: Membrane proteins are difficult to express in vivo because they often misfold or aggregate in cells. Cell-free systems allow controlled addition of liposomes, detergents, or nanodiscs to stabilize folding.

Additional applications include:

biosensing, rapid vaccine prototyping, synthetic biology circuit testing, and on-demand biomanufacturing.


2. Describe the main components of a cell-free expression system and explain the role of each component.

A cell-free expression system contains the molecular machinery necessary for transcription and translation.

Main components include:

Component	Role
Cell extract	Contains ribosomes, tRNAs, enzymes, and translation machinery
DNA or mRNA template	Encodes the target protein
Amino acids	Building blocks for protein synthesis
Nucleotides (ATP, GTP, CTP, UTP)	Required for transcription and energy transfer
RNA polymerase	Transcribes DNA into mRNA
Ribosomes	Translate mRNA into protein
Energy regeneration system	Maintains ATP levels
Cofactors and salts	Stabilize enzymatic activity and folding

In bacterial systems such as E. coli extracts, the lysate already contains most endogenous translation machinery. Researchers mainly supplement substrates and energy sources.


3. Why is energy regeneration critical in cell-free systems? Describe a method you could use to ensure continuous ATP supply in your cell-free experiment.

Protein synthesis consumes very large amounts of energy, especially ATP and GTP. Without energy regeneration, ATP becomes depleted rapidly and translation stops.

Energy regeneration is critical because:

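  • charging tRNAs with amino acids consumes ATP (this is the energy that ultimately drives peptide bond formation),
  • ribosome translocation and other elongation steps consume GTP,
  • transcription also requires nucleoside triphosphates.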

One common strategy is to use:

phosphoenolpyruvate (PEP), creatine phosphate, or glucose metabolism

as secondary energy sources.

For example, phosphoenolpyruvate can regenerate ATP through pyruvate kinase activity:

PEP + ADP → Pyruvate + ATP

Another approach uses slow glucose metabolism, which can provide more stable long-term ATP regeneration and reduce accumulation of inhibitory byproducts.


4. Compare prokaryotic versus eukaryotic cell-free expression systems. Choose a protein to produce in each system and explain why.

Prokaryotic and eukaryotic CFPS systems differ mainly in complexity, speed, and post-translational processing capabilities.

Feature	Prokaryotic CFPS	Eukaryotic CFPS
Speed	Fast	Slower
Cost	Lower	Higher
Yield	Often high	Moderate
Post-translational modifications	Limited	Extensive
Folding complexity	Simpler proteins	Complex proteins

An E. coli system would be ideal for producing:

GFP, bacterial enzymes, or tyrosinase.

For example, the Tyr1 tyrosinase from Bacillus megaterium could be efficiently expressed in a bacterial CFPS system because it is a bacterial enzyme and does not require complex glycosylation.

A eukaryotic system would be preferable for:

antibodies, receptors, or human membrane proteins.

For example, expressing a human GPCR would benefit from a eukaryotic lysate because these receptors require complex folding and post-translational modifications.

5. How would you design a cell-free experiment to optimize the expression of a membrane protein? Discuss the challenges and how you would address them in your setup.

Membrane proteins are difficult to express because hydrophobic transmembrane domains tend to aggregate outside lipid environments.

To optimize expression, I would:

  • Use a cell-free system supplemented with liposomes, nanodiscs, or mild detergents.
  • Lower the expression temperature to reduce aggregation.
  • Optimize magnesium and salt concentrations to stabilize translation.
  • Add molecular chaperones if available.

The main challenges include:

aggregation, improper folding, low solubility, and instability outside membranes.

Nanodiscs are particularly useful because they mimic native membrane environments while remaining soluble.

For example, if expressing the bacterial cellulose synthase BcsA membrane complex, adding lipid nanodiscs during translation could help stabilize the transmembrane helices and preserve catalytic function.


6. Imagine you observe a low yield of your target protein in a cell-free system. Describe three possible reasons for this and suggest a troubleshooting strategy for each.

Low protein yield may result from several factors.

Possible Cause	Explanation	Troubleshooting
Poor DNA template quality	Degraded or impure DNA reduces transcription efficiency	Purify DNA and verify concentration
ATP depletion	Translation stops when energy runs out	Improve energy regeneration system
Protein aggregation	Misfolded proteins precipitate	Lower temperature or add chaperones

Additional causes may include:

incorrect magnesium concentration, RNase contamination, or codon usage incompatibility.

For membrane proteins specifically, adding lipids or detergents may dramatically improve yield and folding stability.


Homework question from Kate Adamala - Design Synthetic Minimal Cell Genetic Circuits

1. Fluorescent Metabolic Reporter

[Figure: global]

[Figure: pH_sensitive_TXTL_reporter]

This construct represents a synthetic minimal cell designed to report the metabolic state of a Komagataeibacter bacterial cellulose culture through fluorescence.

The circuit uses the acid-responsive pCadC promoter to regulate expression of sfGFP inside a bacterial TXTL (cell-free transcription/translation) system encapsulated within a lipid vesicle. As the bacterial culture consumes sugar and progressively acidifies the medium, the promoter becomes activated and induces fluorescent protein production.

This design transforms an invisible metabolic parameter — acidification — into a visible optical signal. In the context of my final project on impedance-sensitive bacterial cellulose functionalized with Tyr1-mediated eumelanin, this synthetic minimal cell acts as a companion diagnostic system capable of indicating when the culture enters an active production or stress phase.

The system provides:

  • a real-time metabolic readout,
  • non-invasive monitoring,
  • and a possible way to determine the optimal timing for material harvesting, hydration control, or impedance measurements.

2. Pore-Release Minimal Cell System

[Figure: SMC_pore_sensitive]

This second construct explores a different synthetic minimal cell strategy based on membrane pore formation rather than direct fluorescent protein accumulation.

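Instead of expressing sfGFP, the acid-responsive promoter regulates expression of alpha-hemolysin, a pore-forming protein (labeled hlyA in this design; note that the heptameric alpha-hemolysin pore commonly used in synthetic cells is the Staphylococcus aureus protein encoded by hla, whereas hlyA is the name of the E. coli hemolysin gene). Under acidic conditions generated by bacterial metabolism, the TXTL system produces alpha-hemolysin, which forms pores in the synthetic cell membrane.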

This mechanism could allow the controlled release of:

  • encapsulated fluorophores,
  • signaling molecules,
  • ions,
  • or metabolic markers.

Compared to the sfGFP design, this strategy is conceptually closer to how synthetic minimal cells are often imagined: not as fully living systems, but as programmable biochemical compartments capable of conditionally interacting with their environment.

The pore-release system could potentially provide:

  • signal amplification,
  • faster environmental response,
  • and stronger coupling between membrane state and metabolic sensing.

Comparison Between Both Designs

Design	Output Mechanism	Advantages	Limitations
sfGFP Reporter	Fluorescent protein production	Simple, visible, easy to interpret	Slower response due to protein maturation
hlyA Pore System	Membrane pore formation and fluorophore release	Faster environmental interaction, stronger protocell logic	More complex membrane behavior and leakage control

The sfGFP design behaves more like a classical synthetic biology reporter circuit, while the hlyA system explores a more protocellular approach where the membrane itself becomes an active functional interface.

Together, these two designs explore different ways synthetic minimal cells can translate environmental metabolic changes into readable outputs.


Relation to My Final Project

These synthetic minimal cell systems were designed as speculative companion technologies for my project:

Growable Impedance-Sensitive Surface from Bacterial Cellulose via Tyr1-Mediated Eumelanin

In this context, the synthetic minimal cells do not directly produce the bacterial cellulose material. Instead, they provide an externalized metabolic sensing layer capable of monitoring:

  • acidification,
  • sugar consumption,
  • metabolic stress,
  • and active cellulose production phases.

Because impedance behavior depends strongly on hydration, ionic concentration, eumelanin deposition, and bacterial metabolic activity, such systems could help identify the optimal temporal window for material functionalization and electrochemical characterization.

More broadly, this project explores how synthetic minimal cells might function not only as biosensors, but as programmable metabolic observers embedded within living material ecologies.

Homework question from Peter Nguyen - Freeze-Dried Cell-Free Systems for Living Textile Interfaces

One-Sentence Pitch

I propose a bacterial-cellulose-based textile integrating freeze-dried cell-free systems capable of locally producing visible or electrochemical responses when activated by sweat, humidity, or environmental metabolites.

[Figure: proposition]

Concept Description

This project explores the integration of freeze-dried TXTL (cell-free transcription/translation) systems into bacterial cellulose textiles functionalized with conductive or redox-active biomaterials such as Tyr1-mediated eumelanin.

The textile would contain embedded cell-free reaction zones distributed within the bacterial cellulose matrix. These regions would remain inactive in the dry state, preserving stability during storage and transportation. When exposed to moisture, sweat, or environmental humidity, the freeze-dried TXTL system would reactivate and produce a programmable output such as:

  • fluorescence,
  • color change,
  • local enzymatic activity,
  • or modulation of impedance behavior.

One possible implementation would use pH-sensitive or ion-sensitive genetic circuits to detect changes in skin perspiration or environmental conditions. For example, increased humidity or acidification could activate expression of chromoproteins or enzymes modifying the electrochemical behavior of the material.

Rather than treating textiles as passive substrates, this approach imagines fabrics as metabolically responsive interfaces capable of transient biochemical computation without requiring living engineered organisms.


Societal Challenge / Market Need

This concept addresses several emerging needs in wearable technology and sustainable materials research.

Current smart textiles often rely on:

  • rigid electronics,
  • batteries,
  • non-biodegradable conductive materials,
  • and difficult-to-recycle sensor architectures.

By contrast, freeze-dried cell-free systems offer:

  • low-energy biosensing,
  • biological programmability,
  • reduced ecological persistence,
  • and compatibility with biodegradable materials such as bacterial cellulose.

Potential applications include:

  • health-monitoring garments,
  • adaptive sportswear,
  • environmental exposure indicators,
  • disposable biomedical patches,
  • or interactive biofabricated fashion.

The project also contributes to broader questions around ecological electronics and post-silicon material computation, where sensing and responsiveness emerge from biochemical rather than electronic processes.


Addressing Cell-Free System Limitations

One of the major limitations of freeze-dried cell-free systems is that they are often:

  • activated only once,
  • sensitive to hydration conditions,
  • and unstable over long durations.

Several strategies could help address these challenges.

First, the bacterial cellulose matrix itself can act as a hydration regulator due to its high water retention capacity and porous nanofibrillar structure. This could help maintain localized moisture conditions and prolong TXTL activity.

Second, the cell-free reactions could be spatially compartmentalized into microcapsules or hydrogel domains embedded inside the textile. This would reduce premature activation and allow selective local responses.

Third, instead of designing continuous sensing systems, the textile could operate as a transient or event-based material:

  • activated only during specific conditions,
  • producing temporary readouts,
  • then naturally degrading or becoming inactive afterward.

Finally, rather than competing with electronic devices in durability or computational complexity, these materials could occupy a complementary niche where biodegradability, programmability, softness, and metabolic responsiveness are more important than long-term operation.


Relation to My Research

This proposal directly connects to my ongoing work on:

Growable Impedance-Sensitive Surfaces from Bacterial Cellulose via Tyr1-Mediated Eumelanin

In this context, freeze-dried cell-free systems could provide localized biochemical sensing layers embedded inside the living material itself. Instead of adding external electronics onto bacterial cellulose, the textile would integrate programmable biochemical functions directly into the material architecture.

This opens the possibility of biofabricated interfaces where sensing, metabolism, hydration, coloration, and impedance modulation become intertwined within a single grown material ecosystem.


Homework question from Ally Huang - Mock Genes in Space Proposal — Cell-Free Radiation Stress Reporter

[Figure: groundcontrol]

1. Background

Spaceflight exposes biological systems to ionizing radiation, microgravity, and limited access to diagnostic infrastructure. Radiation can damage DNA, proteins, and cellular function, making it a major challenge for long-duration missions. A portable, freeze-dried cell-free biosensor could help astronauts detect molecular signs of radiation stress without culturing living cells. This is significant for humanity because future space exploration will require autonomous biological monitoring systems that are lightweight, stable, and easy to activate on demand.

2. Molecular or Genetic Target

DNA damage response pathway, using a synthetic promoter responsive to oxidative or DNA-damage stress driving GFP expression.

3. Relation to the Space Biology Challenge

Radiation exposure in space can generate reactive oxygen species and DNA damage. These molecular stresses activate damage-response pathways in living cells. A cell-free system cannot fully reproduce cellular repair, but it can express a reporter gene controlled by a damage-responsive regulatory element. This makes it possible to build a simplified molecular sensor that translates invisible radiation-related biochemical stress into a fluorescent output.

4. Hypothesis / Research Goal

I hypothesize that a freeze-dried BioBits® cell-free protein expression system can be used as a portable reporter for radiation-associated molecular stress. If a DNA-damage- or oxidative-stress-responsive genetic construct is exposed to radiation-mimicking conditions, then the cell-free reaction should produce a measurable fluorescent signal. The goal is to evaluate whether cell-free systems can act as lightweight biological diagnostics for space environments, where traditional cell culture is difficult, slow, and resource-intensive.

5. Experimental Plan

I would test freeze-dried BioBits® reactions containing a GFP reporter construct under different simulated stress conditions. Samples would include: no-stress control, oxidative-stress condition, UV/radiation-mimic condition, and positive GFP-expression control. Reactions would be activated with water and incubated. Fluorescence would be measured using the P51 Molecular Fluorescence Viewer. If DNA amplification is needed, the miniPCR® thermal cycler could prepare or verify target DNA templates before expression.

Week 10 HW: Advanced Imaging & Measurement Technology


Homework: Final Project

Final Project — Measurement Plan

My final project is:

Growable Impedance-Sensitive Surface from Bacterial Cellulose via Tyr1-Mediated Eumelanin

The goal is to engineer or design a bacterial cellulose material whose electrochemical behavior is modified through Tyr1-mediated eumelanin production.


1. Genetic Construct Verification

I would first measure whether the tyr1 expression cassette was correctly assembled.

What I want to measure:

  • presence of the tyr1 gene,
  • correct promoter/RBS/CDS/terminator structure,
  • correct plasmid size,
  • absence of major cloning errors.

Technologies:

  • PCR / colony PCR,
  • agarose gel electrophoresis,
  • Sanger sequencing.

Expected result: A correct PCR band at the expected size, followed by sequencing confirming that the tyr1 coding sequence and regulatory elements are intact.


2. Tyr1 Protein Expression

I would measure whether the transformed bacteria actually express Tyr1.

What I want to measure:

  • presence of Tyr1 protein,
  • approximate protein size,
  • expression level.

Technologies:

  • SDS-PAGE,
  • optional Western blot if a His-tag or antibody is available.

Expected result: A protein band corresponding to Tyr1, around the expected molecular weight for tyrosinase.


3. Melanin Production

The central functional readout is whether Tyr1 produces eumelanin.

What I want to measure:

  • visible pigmentation,
  • melanin intensity,
  • spatial distribution of pigmentation in the bacterial cellulose pellicle.

Technologies:

  • photography under controlled lighting,
  • image analysis,
  • UV-Vis spectroscopy,
  • optional colorimetric quantification.

Expected result: The cellulose pellicle should progressively darken when exposed to L-tyrosine and copper under suitable pH conditions.


4. Bacterial Cellulose Growth

Because the material itself is grown, I would measure bacterial cellulose production.

What I want to measure:

  • wet mass,
  • dry mass,
  • thickness,
  • surface area,
  • growth time,
  • morphology.

Technologies:

  • scale for mass,
  • calipers or microscopy for thickness,
  • photographic documentation,
  • drying protocol for dry mass.

Expected result: A measurable pellicle that can be compared between wild-type and Tyr1-functionalized conditions.


5. Electrochemical / Impedance Behavior

This is the key material measurement.

What I want to measure:

  • impedance magnitude,
  • phase response,
  • frequency-dependent behavior,
  • hydration sensitivity,
  • pressure/touch sensitivity,
  • difference between native BC and melanin-rich BC.

Technologies:

  • two-electrode or four-electrode setup,
  • frequency sweep,
  • controlled hydration measurements.

Expected result: Eumelanin-functionalized bacterial cellulose may show altered impedance behavior compared to native bacterial cellulose, especially under different hydration or ionic conditions.
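To make the frequency-sweep idea concrete, here is a minimal sketch of how impedance magnitude and phase change with frequency for a simple equivalent circuit. It assumes a resistor in parallel with a capacitor, which is only an illustrative model and not a claim about how hydrated bacterial cellulose actually behaves. The component values and function names are hypothetical.

```python
import cmath
import math


def parallel_rc_impedance(freq_hz: float, r_ohm: float, c_farad: float) -> complex:
    """Complex impedance of a resistor in parallel with a capacitor."""
    omega = 2 * math.pi * freq_hz
    return r_ohm / (1 + 1j * omega * r_ohm * c_farad)


def magnitude_and_phase(z: complex) -> tuple:
    """Return |Z| in ohms and the phase angle in degrees."""
    return abs(z), math.degrees(cmath.phase(z))


# Hypothetical values chosen only for illustration.
R = 100e3   # ohms
C = 10e-9   # farads
f_corner = 1 / (2 * math.pi * R * C)  # frequency where the phase reaches -45 degrees

for f in sorted([10, 100, f_corner, 1e3, 1e4, 1e5]):
    mag, phase = magnitude_and_phase(parallel_rc_impedance(f, R, C))
    print(f"{f:10.1f} Hz  |Z| = {mag:10.1f} ohm  phase = {phase:6.1f} deg")
```

In a real measurement, sweeps of native and melanin-rich BC at matched hydration could be compared in the same magnitude and phase form. Fitting an equivalent circuit would be a separate step.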


6. Environmental / Culture Conditions

Because cellulose growth and melanin production depend strongly on the culture environment, I would also monitor the culture conditions.

What I want to measure:

  • pH,
  • sugar consumption,
  • conductivity of the medium,
  • hydration state,
  • incubation time.

Technologies:

  • pH meter or pH strips,
  • conductivity meter,
  • refractometer or glucose assay,
  • mass-based hydration tracking.

Expected result: These measurements help connect metabolic state with material formation and final impedance behavior.


Summary Table

Measurement	Purpose	Technology
tyr1 DNA presence	Confirm construct	PCR, gel electrophoresis
Sequence correctness	Confirm cassette integrity	Sanger sequencing
Tyr1 expression	Confirm protein production	SDS-PAGE / Western blot
Melanin production	Confirm functional enzyme activity	Photography, UV-Vis
BC growth	Quantify material formation	Mass, thickness, imaging
Impedance	Measure functional material behavior	LCR meter / impedance analyzer
pH / conductivity	Monitor culture state	pH meter, conductivity meter

Overall Goal

The final objective is to connect:

DNA design
→ Tyr1 expression
→ eumelanin production
→ bacterial cellulose modification
→ impedance-sensitive material behavior

This measurement plan would allow me to evaluate not only whether the genetic system works, but also whether the biological modification produces a meaningful change in the material’s electrochemical properties.


Homework: Waters Part I — Molecular Weight

Waters Part I — eGFP Molecular Weight

Using the eGFP amino acid sequence provided in the assignment, including:

  • the LE linker,
  • and the C-terminal 6xHis purification tag (HHHHHH),

I calculated the theoretical molecular weight and isoelectric point using the ExPASy Compute pI/Mw tool.

[Figure: Compute pI/Mw]

Results

Property	Value
Theoretical pI	5.90
Theoretical Molecular Weight	28,006.60 Da
Approximate Molecular Weight	28.0 kDa

Because LC-MS intact protein analysis is performed under denaturing solvent conditions, I would expect eGFP to unfold and produce a charge-state distribution that deconvolutes to the denatured protein mass of ~28 kDa.

The His-tag and linker slightly increase the final molecular weight compared to native GFP.


2. Molecular Weight from Adjacent Charge States

From Figure 1, I selected two adjacent charge-state peaks:

Peak	m/z
z + 1 peak	875.4421
z peak	903.7138

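Using the adjacent charge-state equation, where m/z(z) and m/z(z+1) are the two neighboring peaks:

z ≈ (m/z(z+1) - H⁺) / (m/z(z) - m/z(z+1)) = (875.4421 - 1.0073) / (903.7138 - 875.4421) ≈ 30.9

the estimated charge state is:

z ≈ 31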

Then the molecular weight can be calculated from:

MW = z × (m/z - H⁺)

where:

H⁺ = 1.0073 Da

So:

MW = 31 × (903.7138 - 1.0073)
MW ≈ 27,983.9 Da

Comparison with theoretical mass

The theoretical molecular weight calculated from the eGFP sequence was:

MW_theory = 28,006.60 Da

The accuracy/error is:

|27,983.9 - 28,006.6| / 28,006.6 = 0.00081

or approximately:

0.081%

Interpretation

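The molecular weight estimated from the adjacent charge states is very close to the theoretical eGFP molecular weight, about 22.7 Da lower than the sequence-based value. Part of this gap may reflect chromophore maturation, which removes roughly 20 Da (loss of water plus an oxidation step) and is not included in a sequence-only calculation. The remainder may come from peak reading precision, isotope distribution, adducts, or calibration differences in the LC-MS measurement.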
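The adjacent charge-state calculation above can be reproduced in a few lines. This is a small sketch using the two peak values read from Figure 1 and the theoretical mass from ExPASy. The function names are my own, and the proton mass is rounded to 1.0073 Da as in the text.

```python
PROTON = 1.0073  # Da, same rounded value as used in the text


def charge_from_adjacent_peaks(mz_lower: float, mz_higher: float) -> float:
    """Charge z of the higher-m/z peak, given the neighboring (z + 1) peak at lower m/z."""
    return (mz_lower - PROTON) / (mz_higher - mz_lower)


def neutral_mass(mz: float, z: int) -> float:
    """Neutral mass from an m/z value and its charge: M = z * (m/z - H+)."""
    return z * (mz - PROTON)


mz_z_plus_1 = 875.4421   # (z + 1) peak
mz_z = 903.7138          # z peak
mw_theory = 28006.60     # Da, from ExPASy Compute pI/Mw

z_estimate = charge_from_adjacent_peaks(mz_z_plus_1, mz_z)
z = round(z_estimate)
mw = neutral_mass(mz_z, z)
error_pct = abs(mw - mw_theory) / mw_theory * 100

print(f"z estimate = {z_estimate:.2f} -> z = {z}")
print(f"MW = {mw:.1f} Da, error = {error_pct:.3f} %")
```

The same two functions can be applied to any other pair of neighboring peaks in the envelope as a check that they give a consistent mass.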


Charge State Observation from the Zoomed-In Peak

Yes, the charge state can be observed from the zoomed-in isotopic peak distribution.

In the inset spectrum, the individual isotope peaks are separated by approximately:

Δ(m/z) ≈ 0.032

For multiply charged ions in mass spectrometry:

Charge state z ≈ 1 / Δ(m/z)

So:

z ≈ 1 / 0.032 ≈ 31

This corresponds well to the charge state estimated previously from the adjacent peak method.

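This works because neighboring isotope peaks of the same ion differ in mass by about 1 Da (for example, one additional ¹³C atom), so on the m/z axis they are spaced by about 1/z. The higher the charge state, the closer together the isotope peaks appear, which is why highly charged proteins show very tightly spaced isotope patterns.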


Homework: Waters Part II — Secondary/Tertiary structure

Native vs Denatured Protein Conformations

Proteins can exist in either a native folded state or a denatured unfolded state.

In the native state, the protein maintains its compact three-dimensional structure through:

  • hydrogen bonding,
  • hydrophobic interactions,
  • electrostatic interactions,
  • and sometimes disulfide bonds.

When a protein denatures, these stabilizing interactions are disrupted. The protein unfolds and exposes amino acid residues that were previously buried inside the structure. In mass spectrometry, denaturation is commonly induced using acidic solvents and organic solvents such as acetonitrile.

This unfolding strongly affects the protein charge-state distribution during electrospray ionization (ESI). A folded protein has a compact surface with fewer solvent-accessible protonation sites, so it acquires fewer charges. An unfolded protein exposes many more basic residues to the solvent, allowing it to acquire many additional protons.

As a result:

  • Native proteins usually produce:

    • lower charge states,
    • higher m/z peaks,
    • narrower charge-state distributions.
  • Denatured proteins usually produce:

    • higher charge states,
    • lower m/z peaks,
    • broader charge-state distributions.

Interpretation of Figure 2

In the native eGFP spectrum (bottom/red spectrum), the peaks appear at much higher m/z values (~2500–2800 m/z), indicating that the protein carries relatively few charges. This is consistent with a compact folded structure.

In the denatured eGFP spectrum (top/green spectrum), the peaks shift toward lower m/z values (~700–1400 m/z) and show a much broader charge-state distribution. This indicates that the protein has unfolded and acquired many more charges during ionization.

Therefore, the mass spectrometer indirectly detects protein folding state through the observed charge-state distribution:

  • compact folded proteins → low charge states,
  • unfolded proteins → high charge states.

Charge State of the Native eGFP Peak at ~2800 m/z

Yes, the charge state of the native eGFP peak around ~2800 m/z can be determined from the isotopic peak spacing in the zoomed-in spectrum.

The isotope peaks are separated by approximately:

Δ(m/z) ≈ 0.09

In electrospray ionization mass spectrometry, the relationship between isotope spacing and charge state is:

z ≈ 1 / Δ(m/z)

Therefore:

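z ≈ 1 / 0.09 ≈ 11

So the isotope spacing suggests approximately the:

11+ charge state

The spacing is hard to read precisely at this scale, so it is worth cross-checking against the intact mass from Part I (≈ 27,984 Da). An 11+ ion of that mass would appear near m/z 2545, whereas a 10+ ion would appear near m/z 2799. A peak at ~2800 m/z is therefore more consistent with the 10+ charge state (expected spacing ≈ 0.10), and the 0.09 reading is probably a slight underestimate of the true spacing.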

This makes sense for native folded eGFP because folded proteins generally acquire fewer charges than denatured proteins. The compact native structure exposes fewer protonatable residues, resulting in lower charge states and therefore higher m/z values.
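A quick way to sanity-check a native charge-state assignment is to compute where each charge state of a ~28 kDa protein should appear. The sketch below uses the experimental mass estimated in Part I (27,983.9 Da) and assumes the native ion has the same mass, ignoring any retained adducts or solvent. The function name is hypothetical.

```python
PROTON = 1.0073  # Da


def mz_for_charge(mass_da: float, z: int) -> float:
    """Expected m/z of an [M + zH]z+ ion for a given neutral mass."""
    return (mass_da + z * PROTON) / z


mass = 27983.9  # Da, experimental intact mass from Part I

for z in range(9, 13):
    print(f"{z:2d}+  m/z = {mz_for_charge(mass, z):8.1f}")

# Charge state whose predicted m/z lies closest to the observed ~2800 peak.
closest = min(range(5, 20), key=lambda z: abs(mz_for_charge(mass, z) - 2800))
print("closest charge state to m/z 2800:", closest)
```

Comparing these predicted positions with the peaks in Figure 2 gives an independent check on the value obtained from isotope spacing.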


Homework: Waters Part III — Peptide Mapping - primary structure

Lysine (K) and Arginine (R) Count in eGFP

Trypsin cleaves peptide bonds after:

  • Lysine (K)
  • Arginine (R)

(except when followed by Proline).

Using the eGFP sequence provided, I counted:

Amino Acid	Count
Lysine (K)	20
Arginine (R)	6
Total tryptic cleavage residues	26

eGFP Sequence with K and R Highlighted

MVS[K]GEELFTG VVPILVELDG DVNGH[K]FSVS GEGEGDATYG [K]LTL[K]FICTT
G[K]LPVPWPTL VTTLTYGVQC FS[R]YPDHM[K]Q HDFF[K]SAMPE
GYVQE[R]TIFF [K]DDGNY[K]T[R]A EV[K]FEGDTLV N[R]IEL[K]GIDF
[K]EDGNILGH[K] LEYNYNSHNV YIMAD[K]Q[K]NG I[K]VNF[K]I[R]HN
IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALS[K]D PNE[K][R]DHMVL
LEFVTAAGIT LGMDELY[K]LE HHHHHH

The high number of Lysine and Arginine residues explains why trypsin digestion generates many smaller peptides suitable for LC-MS peptide mapping.

[Figure: benchling]

Expected Number of Tryptic Peptides from eGFP

Using the ExPASy PeptideMass tool with the parameters shown:

  • Enzyme: Trypsin
  • Missed cleavages: 0
  • Monoisotopic masses
  • Reduced cysteines
  • Display peptides larger than 500 Da

the digestion of eGFP produced:

19 predicted tryptic peptides

As shown in the PeptideMass output, this corresponds to approximately:

90.7% sequence coverage

The peptides listed correspond to the fragments expected after trypsin cleavage at Lysine (K) and Arginine (R) residues.
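The cleavage rule can also be applied directly to the sequence as a cross-check on the K/R count. This is a minimal in silico digest, assuming no missed cleavages and the "not before proline" rule stated above. It does not reproduce the PeptideMass mass filter (peptides above 500 Da), so it returns all fragments, including short ones. The function name is my own.

```python
# eGFP sequence with LE linker and 6xHis tag, as highlighted above (brackets removed).
EGFP = (
    "MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTT"
    "GKLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPE"
    "GYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDF"
    "KEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHN"
    "IEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVL"
    "LEFVTAAGITLGMDELYKLEHHHHHH"
)


def tryptic_digest(seq: str) -> list:
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal fragment
    return peptides


peptides = tryptic_digest(EGFP)
print("K:", EGFP.count("K"), "R:", EGFP.count("R"))
print("total fragments:", len(peptides))
for peptide in peptides:
    print(peptide)
```

The total number of fragments is larger than the 19 reported by PeptideMass because very short fragments fall below the 500 Da display cutoff.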

[Figure: website]

Number of Chromatographic Peaks in the eGFP Peptide Map

Using Figure 5a and counting chromatographic peaks between:

0.5 and 6.0 minutes

with approximately:

10% relative abundance

I observe approximately:

20 major chromatographic peaks

These peaks correspond to different tryptic peptides generated from the digestion of eGFP. Each peptide elutes at a different retention time depending on properties such as:

hydrophobicity, peptide length, charge, and amino acid composition.

The large number of peaks reflects the complexity of the peptide mixture produced after trypsin digestion.


The number of chromatographic peaks is close to, but slightly higher than, the number of peptides predicted by ExPASy.

ExPASy predicted:

19 tryptic peptides above 500 Da

From the chromatogram, I counted approximately:

20 major peaks between 0.5 and 6 minutes

So there appear to be slightly more chromatographic peaks than predicted peptides.

This difference is expected because one theoretical peptide can sometimes appear as more than one chromatographic signal due to different charge states, oxidation states, adducts, missed cleavages, or partially resolved co-eluting species. Also, not every chromatographic peak necessarily corresponds to a unique eGFP tryptic peptide.


Peptide m/z, Charge State, and Singly Charged Mass

From Figure 5b, the most abundant peptide peak is at:

m/z = 525.767

In the zoomed-in isotope distribution, the isotope peaks are separated by approximately:

526.259 - 525.767 = 0.492

Since isotope spacing is approximately 1 / z, the charge state is:

z ≈ 1 / 0.492 ≈ 2

So the peptide is mainly observed as a 2+ ion.

To calculate the singly charged peptide mass:

[M+H]+ = (m/z × z) - (z - 1) × H+

Using:

m/z = 525.767
z = 2
H+ = 1.0073
[M+H]+ = (525.767 × 2) - 1.0073
[M+H]+ ≈ 1050.53

This matches the singly charged peak observed near: m/z ≈ 1050.52
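The two steps above (charge from isotope spacing, then conversion to the singly charged value) can be written as a short sketch. The function names are my own, and the proton mass is rounded as in the text.

```python
PROTON = 1.0073  # Da


def charge_from_isotope_peaks(mz_first: float, mz_second: float) -> int:
    """Charge state from two neighboring isotope peaks: z is about 1 / spacing."""
    return round(1 / abs(mz_second - mz_first))


def singly_charged_mz(mz: float, z: int) -> float:
    """Convert an [M + zH]z+ m/z value to the equivalent [M+H]+ value."""
    return mz * z - (z - 1) * PROTON


z = charge_from_isotope_peaks(525.767, 526.259)
mh = singly_charged_mz(525.767, z)
print(f"z = {z}, [M+H]+ = {mh:.2f}")
```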


Peptide Identification and Mass Accuracy

From Question 5, the experimentally measured singly charged peptide mass was:

MW_experimental ≈ 1050.52 Da

Comparing this value with the predicted tryptic peptide masses from the ExPASy PeptideMass output, the closest match is:

Peptide	Theoretical Mass (Da)
FEGDTLVNR	1050.5214

Therefore, the peptide observed at retention time 2.78 min is most likely:

FEGDTLVNR
Mass Accuracy Calculation

Using:

Accuracy = |MW_experimental - MW_theory| / MW_theory

and converting to ppm:

ppm error = Accuracy × 10^6

With:

MW_experimental = 1050.5238
MW_theory = 1050.5214

ppm error = (|1050.5238 - 1050.5214| / 1050.5214) × 10^6
ppm error ≈ 2.3 ppm

Interpretation

A mass error of only a few ppm indicates excellent agreement between the experimental LC-MS measurement and the theoretical peptide mass prediction.
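As a quick check of the ppm calculation, here is a minimal Python sketch. The function name is my own; the two masses are the values quoted in this section.

```python
def ppm_error(measured: float, theoretical: float) -> float:
    """Relative mass error in parts per million."""
    return abs(measured - theoretical) / theoretical * 1e6


# Measured [M+H]+ versus the ExPASy value for FEGDTLVNR
error = ppm_error(1050.5238, 1050.5214)
print(f"{error:.1f} ppm")
```

This reproduces the error of about 2.3 ppm calculated by hand.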


Percentage of Sequence Confirmed by Peptide Mapping

Based on Figure 6, the LC-MS peptide mapping identified: 88% sequence coverage

This means that peptides corresponding to approximately 88% of the eGFP amino acid sequence were experimentally detected and confirmed by LC-MS/MS analysis.

The uncovered regions likely correspond to peptides that are too small, poorly ionized, difficult to separate chromatographically, or outside the optimal mass detection range.
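To make the idea of sequence coverage concrete, the sketch below computes it from a list of detected peptide positions. The positions are a made-up toy example, not the peptides from Figure 6, and the function name is my own.

```python
def sequence_coverage(protein_length: int, peptides: list[tuple[int, int]]) -> float:
    """Percent of residues covered by at least one peptide.

    Each peptide is given as (start, end), 1-based and inclusive.
    Overlapping peptides are only counted once.
    """
    covered = set()
    for start, end in peptides:
        covered.update(range(start, end + 1))
    return 100.0 * len(covered) / protein_length


# Toy data (hypothetical): a 50-residue protein and three detected peptides,
# two of which overlap
toy_peptides = [(1, 10), (8, 20), (31, 40)]
coverage = sequence_coverage(50, toy_peptides)
print(f"{coverage:.0f}% coverage")
```

In this toy case, residues 1 to 20 and 31 to 40 are covered, so 30 of 50 residues are confirmed.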


Homework: Waters Part IV — Oligomers

KLH Oligomeric State Assignment from CDMS

Using the subunit masses:

Subunit	Mass
7FU	340 kDa
8FU	400 kDa

we can calculate the expected oligomer masses.


Calculations

7FU Decamer

10 × 340 kDa = 3400 kDa = 3.4 MDa

This corresponds to the peak near 3.4 MDa.

8FU Didecamer

A didecamer contains 20 subunits:

20 × 400 kDa = 8000 kDa = 8.0 MDa

This corresponds to the large peak near 8.33 MDa.

8FU 3-Decamer

A 3-decamer contains 30 subunits:

30 × 400 kDa = 12000 kDa = 12.0 MDa

This corresponds to the peak near 12.67 MDa.

8FU 4-Decamer

A 4-decamer contains 40 subunits:

40 × 400 kDa = 16000 kDa = 16.0 MDa

This corresponds to the lower-intensity signal around 16 MDa.

Summary Table

Species	Calculation	Expected Mass	Observed Peak
7FU Decamer	10 × 340 kDa	3.4 MDa	~3.4 MDa
8FU Didecamer	20 × 400 kDa	8.0 MDa	~8.33 MDa
8FU 3-Decamer	30 × 400 kDa	12.0 MDa	~12.67 MDa
8FU 4-Decamer	40 × 400 kDa	16.0 MDa	~16 MDa
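The same assignments can be tabulated in Python, which also shows how far each observed peak sits from the expected mass. This is only a sketch: the variable and function names are my own, and the observed peak positions are the approximate values read from the CDMS spectrum.

```python
SUBUNIT_KDA = {"7FU": 340, "8FU": 400}

# (species, subunit type, number of subunits, approximate observed peak in MDa)
ASSIGNMENTS = [
    ("7FU decamer", "7FU", 10, 3.4),
    ("8FU didecamer", "8FU", 20, 8.33),
    ("8FU 3-decamer", "8FU", 30, 12.67),
    ("8FU 4-decamer", "8FU", 40, 16.0),
]


def expected_mda(subunit: str, n_subunits: int) -> float:
    """Expected oligomer mass in MDa from the subunit mass in kDa."""
    return SUBUNIT_KDA[subunit] * n_subunits / 1000


rows = []
for species, subunit, n, observed in ASSIGNMENTS:
    expected = expected_mda(subunit, n)
    deviation_pct = (observed - expected) / expected * 100
    rows.append((species, expected, observed, deviation_pct))
    print(f"{species}: expected {expected:.1f} MDa, "
          f"observed ~{observed} MDa ({deviation_pct:+.1f}%)")
```

The didecamer and 3-decamer peaks come out roughly 4 to 6% above the expected masses, while the other two match the rounded subunit masses closely.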

Week 11 - HW - Bioproduction & Cloud Labs


Part A — The 1,536 Pixel Artwork Canvas | Collective Artwork

[Image: paint]

Part B: Cell-Free Protein Synthesis | Cell-Free Reagents

1. Referencing the cell-free protein synthesis reaction composition (the middle box outlined in yellow on the image above, also listed below), provide a 1-2 sentence description of what each component’s role is in the cell-free reaction.

E. coli Lysate

BL21 (DE3) Star Lysate (includes T7 RNA Polymerase): The lysate contains the cellular machinery required for transcription and translation, including ribosomes, enzymes, tRNAs, cofactors, and metabolic components. The T7 RNA polymerase included in the lysate drives strong transcription from T7 promoters, enabling high protein expression in the cell-free system.


Salts / Buffer

Potassium Glutamate: Potassium glutamate helps reproduce the intracellular ionic environment of E. coli and stabilizes ribosomes and enzymes involved in translation. It also contributes to osmotic balance and protein folding efficiency.

HEPES-KOH pH 7.5: HEPES is a buffering agent that maintains a stable physiological pH during the reaction. Stable pH is critical because transcription and translation enzymes are highly sensitive to pH changes.

Magnesium Glutamate: Magnesium ions are essential cofactors for ribosomes, RNA polymerases, ATP-dependent enzymes, and nucleotide interactions. The magnesium concentration strongly affects translation efficiency and overall reaction stability.

Potassium phosphate monobasic: This phosphate salt contributes to buffering capacity and phosphate availability in the reaction. It helps stabilize biochemical conditions during prolonged incubations.

Potassium phosphate dibasic: Together with the monobasic form, this salt maintains phosphate equilibrium and contributes to pH stabilization and ionic balance.


Energy / Nucleotide System

Ribose: Ribose acts as a precursor for nucleotide synthesis and energy metabolism. It helps sustain transcription and translation during long incubations.

Glucose: Glucose provides a metabolic energy source that supports ATP regeneration pathways within the lysate. Sustained ATP availability is essential for protein synthesis.

AMP: AMP is a nucleotide precursor used in RNA synthesis and cellular energy metabolism. It contributes to maintaining nucleotide pools during transcription.

CMP: CMP provides cytidine nucleotides required for RNA synthesis during transcription.

GMP: GMP supplies guanosine nucleotides necessary for RNA production and nucleotide balance.

UMP: UMP supplies uridine nucleotides required for mRNA synthesis.

Guanine: Guanine can be recycled into guanine nucleotides and helps maintain nucleotide biosynthesis capacity in the reaction mixture.


Translation Mix (Amino Acids)

17 Amino Acid Mix: This mixture supplies most amino acids required for protein synthesis by ribosomes. Continuous amino acid availability is essential for efficient translation.

Tyrosine: Tyrosine is added separately because it is poorly soluble in water at neutral pH, which makes it hard to include in a concentrated amino acid stock. It is required for synthesis of proteins containing tyrosine residues.

Cysteine: Cysteine is supplied separately because its thiol group is chemically reactive and oxidizes easily. It is required for proteins containing cysteine residues, including those that form disulfide bonds.


Additives

Nicotinamide: Nicotinamide functions as a precursor for NAD⁺/NADP⁺ cofactors involved in metabolic and redox reactions. It supports enzymatic activity and energy regeneration pathways in the lysate.


Backfill

Nuclease Free Water: Nuclease-free water is used to adjust the final reaction volume while preventing degradation of DNA and RNA by contaminating nucleases.


2. Describe the main differences between the 1-hour optimized PEP-NTP master mix and the 20-hour NMP-Ribose-Glucose master mix shown in the Google Slide above. (2-3 sentences)

The 1-hour optimized PEP-NTP master mix is designed for rapid and high-intensity protein production. It uses phosphoenolpyruvate (PEP) as a fast energy source together with the full set of nucleoside triphosphates (ATP, GTP, CTP, UTP), allowing immediate transcription and translation but with limited long-term stability.

In contrast, the 20-hour NMP-Ribose-Glucose master mix is optimized for slower and more sustainable protein synthesis. Instead of directly supplying high-energy nucleotides, it relies on nucleoside monophosphates (AMP, CMP, GMP, UMP), ribose, and glucose to gradually regenerate energy and nucleotide pools over time, enabling longer reaction durations.


Part C: Planning the Global Experiment | Cell-Free Master Mix Design

1. Fluorescent Protein Properties Relevant to Cell-Free Expression

sfGFP: sfGFP is a strong choice for cell-free expression because it is designed to fold robustly and matures very rapidly. This makes it useful as a reliable green fluorescence output, especially when expression conditions are variable or not fully optimized.

mRFP1: mRFP1 is a monomeric red fluorescent protein, but it matures more slowly than many newer red fluorescent proteins. In a cell-free system, this may delay visible fluorescence, especially in shorter incubations, although its low acid sensitivity makes it relatively stable across changing pH conditions.

mKO2: mKO2 is an orange fluorescent protein that is relatively fast-folding, but it has moderate acid sensitivity. In long cell-free reactions, pH drift could reduce fluorescence readout, so maintaining buffer stability is important.

mTurquoise2: mTurquoise2 is a cyan fluorescent protein that matures rapidly and has very low acid sensitivity. This makes it a good candidate for long incubations where pH changes might otherwise reduce fluorescence.

mScarlet-I: mScarlet-I is a bright, rapidly maturing red fluorescent protein. Its high brightness makes it attractive for maximizing fluorescence output, although its moderate acid sensitivity means that pH control may still matter during long incubations.

Electra2: Electra2 is a blue fluorescent protein derived from Entacmaea quadricolor. Because blue fluorescent proteins can be more difficult to detect cleanly and may have weaker apparent brightness depending on the imaging system, signal strength and excitation/emission compatibility are especially important for readout.

[Image: colorsss]

2. Hypothesis for Master Mix Optimization

For a 36-hour incubation, I would optimize the master mix for mScarlet-I because it is bright and rapidly maturing but may be affected by pH drift over time. My hypothesis is that increasing the buffering capacity and improving long-term energy regeneration will increase final mScarlet-I fluorescence after 36 hours.


3. Reagents to Adjust

I would increase or carefully tune:

  • HEPES-KOH pH 7.5
    to maintain pH stability during long incubation.

  • Glucose + ribose + NMP system
    to support sustained ATP and nucleotide regeneration over time.

  • Magnesium glutamate
    to optimize ribosome activity and translation efficiency.

4. Analyzing the fluorescence

I expect that stronger pH stabilization and sustained energy regeneration will improve total protein yield and allow mScarlet-I to mature more completely. This should increase final red fluorescence intensity after 36 hours compared to a faster but shorter-lived PEP-NTP formulation.

Part D: Build-A-Cloud-Lab | (optional) Bonus Assignment

[Image: nebulaGingko]