Week 11 Review: Bioproduction & Cloud Labs

One-line takeaway. A cloud lab is a wet-lab you drive from a laptop. This week you design a cell-free protein synthesis (CFPS) reaction that will run on one, in a global 1,536-well bioart canvas.

| Course | HTGAA Spring 2026 |
| Lecture | Tues, Apr 14, 2026: Reshma Shetty, Bioproduction & Cloud Labs |
| Recitation | Wed, Apr 15: Ronan Donovan, Cloud laboratories |
| Author | Fiona, Committed Listener, BioPunk SF |


Why CFPS as the canonical case for cloud labs?

A cell-free reaction is the fastest way to make protein. Make protein, read out a signal, learn something, design the next round — that loop is what synthetic biology runs on. The faster the loop, the more design space you get to search.

Cloud labs let you run that loop without owning a lab. You write a protocol, queue it, and a fleet of robots executes overnight. That changes who can do science (you, on a Tuesday) and how reproducible it is (every parameter is explicit because a robot needs it explicit).

Tie-ins to your final projects:

  • MS2 L-protein (group project) — dozens of designed mutants, identical expression-and-readout pipeline. A textbook cloud-lab use case.
  • Cholera Shield (individual final project) — Week 9 anti-cholera-toxin module will need cell-free expression conditions; the master-mix logic locked this week is the foundation.

Vocabulary (skim once, refer back later)

| Term | What it is |
| --- | --- |
| Cloud lab | A wet-lab facility you drive remotely. Ginkgo Nebula, Emerald, Strateos. |
| RAC | Reconfigurable Automation Cart: one instrument + a robotic arm + a software wrapper. |
| CFPS | Cell-free protein synthesis. Protein made in a tube of E. coli extract; no living cells. |
| Lysate | The soluble fraction of broken-open E. coli: ribosomes, tRNAs, enzymes. Source of all the machinery. |
| BL21 (DE3) Star | The standard CFPS strain. (DE3) = carries T7 RNA polymerase. Star = truncated RNase E (rne131), so mRNA lasts longer. |
| T7 promoter | TAATACGACTCACTATAGGG: the canonical 20-bp T7 RNAP cassette (–17 to +3), including the GGG initiator triplet required for efficient transcription start. Why your plasmid has one. |
| NTP / NMP / NDP | Nucleotide tri- / mono- / di-phosphates. NTPs are the working currency; NMPs are the cheap precursor. |
| Master mix | Pre-mixed cocktail of everything in the reaction except lysate and DNA. Used at 2× and diluted into the well. |
| Maturation half-time | How long after translation finishes before the chromophore lights up. Dominant property for CFPS readout. |

How a cloud lab is built

Reshma’s mental model is a 2×2 of automation vs flexibility:

quadrantChart
    title Lab work, on two axes
    x-axis "Low flexibility" --> "High flexibility"
    y-axis "Low automation" --> "High automation"
    quadrant-1 "Autonomous lab (Waymo)"
    quadrant-2 "Work cell (red line)"
    quadrant-3 "—"
    quadrant-4 "Lab bench (car)"
    "Lab bench": [0.85, 0.15]
    "Work cell": [0.15, 0.85]
    "Walk-up Opentrons": [0.5, 0.5]
    "Cloud lab (Nebula)": [0.85, 0.85]

Lab bench = max flexibility, near-zero automation. Reshma’s framing: the vast majority of research budgets — both academic and pharma R&D — still sit here. Work cell = max automation, locked to one workflow. Cloud lab = both. The hard engineering problem.

[Figure: Cloud-lab CFPS overview]

Fallback rendering. If the quadrant chart above doesn’t render in your viewer, here’s the same information as a table:

| Mode | Flexibility | Automation | Example |
| --- | --- | --- | --- |
| Lab bench | High | Low | A graduate student pipetting |
| Walk-up automation | Medium | Medium | Opentrons in the HTGAA teaching lab |
| Work cell | Low | High | NGS sample-prep robot |
| Cloud lab | High | High | Ginkgo Nebula, Emerald Cloud Lab |

What’s inside a Nebula rack

Each rack is one instrument inside a standardized enclosure, with a robotic arm and a sample transport track running past it.

flowchart LR
    subgraph RAC["A single RAC"]
        Inst[Instrument<br/>e.g. centrifuge, Echo, Bravo, reader]
        Arm[Robotic arm]
        SW[Software wrapper<br/>parameterized control]
    end
    Track[Sample transport track<br/>SBS-format plates]
    Plate[Plate]
    Plate --> Track
    Track <--> Arm
    Arm <--> Inst
    Inst <--> SW

A few practical points worth knowing:

  • ~50 racks in Boston right now, targeting 100.
  • ~2 months to onboard a new instrument (custom internal fittings — “iKit” — for each one).
  • SLAS / SBS plate format is the only meaningful adopted standard in the industry.
  • Centrifuges become bottlenecks before mass specs, because every protocol “quick-spins” everything.
  • Nebula exposes Generic_* modules (Generic_echo_hitpick, generic_multiflo_dispense, generic_spark_read, etc.) rather than packaged workflows. You compose your own.
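To make "compose your own" concrete, here is a minimal Python sketch of a protocol written as a list of explicit steps. The module names are the ones quoted in the bullet above; every parameter name and value, and the `Step` and `unresolved` helpers, are hypothetical illustrations and not the actual Nebula interface.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Step:
    """One protocol step: a module name plus fully explicit parameters."""
    module: str                                 # module names as quoted in the notes
    params: dict = field(default_factory=dict)  # HYPOTHETICAL parameter names and values


def unresolved(steps):
    """List (module, parameter) pairs still set to None, i.e. not yet explicit."""
    return [(s.module, k) for s in steps for k, v in s.params.items() if v is None]


# HYPOTHETICAL composition: dispense master mix, transfer DNA, read fluorescence.
protocol = [
    Step("generic_multiflo_dispense", {"volume_ul": 10.0, "reagent": "2x master mix"}),
    Step("Generic_echo_hitpick", {"volume_nl": 500.0, "source_well": "A1"}),
    Step("generic_spark_read", {"excitation_nm": 584, "emission_nm": None}),
]

# A robot cannot guess a missing value, so flag anything left unspecified.
print(unresolved(protocol))
```

The design point is the one from the opening section: a robot needs every parameter stated, so a protocol with a `None` in it is not ready to queue.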

Design choice worth noting. Ginkgo is building purpose-built automation hardware, not humanoid robots. Reshma is open that this is a bet. Other labs are taking the opposite bet.


Cell-free protein synthesis — what’s in the tube

A CFPS reaction is a protein factory in a tube. The factory needs continuous energy, building blocks, the right ionic environment, and the machinery to run it. The reagents map cleanly onto those four roles:

[Figure: CFPS reaction anatomy]

| Role | Reagents | What they do |
| --- | --- | --- |
| Machinery | BL21 (DE3) Star lysate | Ribosomes, tRNAs, RNA polymerase, all translation factors, metabolic enzymes. T7 RNAP is pre-induced before lysis. Star = rne131 truncated RNase E → longer mRNA half-life. |
| Ionic environment | K-glutamate (130 mM), HEPES-KOH pH 7.5 (50 mM), Mg-glutamate (10 mM Mg²⁺), K-phosphate mono + dibasic (10 mM each) | Reconstruct the E. coli cytoplasm. K-glutamate over K-chloride because chloride inhibits many enzymes (Jewett & Swartz 2004). |
| Energy + nucleotides | Glucose, ribose, AMP, CMP, UMP, guanine (but no GMP) | Drive NTP regeneration. See “Two batteries” below. |
| Building blocks | 17-AA mix + tyrosine (separate, low solubility) + cysteine (separate, oxidizes) | The amino acids. |
| Additive | Nicotinamide | Precursor for NAD⁺, which GAPDH needs to keep glycolysis running. |
| Backfill | Nuclease-free water | Volume only; no RNases allowed. |

Two batteries — the energy systems

[Figure: CFPS energy regimes comparison]

A CFPS reaction needs ATP and GTP regenerated continuously. There are two ways to do this:

| | PEP-NTP mix | NMP-Ribose-Glucose mix (this week) |
| --- | --- | --- |
| Analogy | A single-use lithium battery | A wall outlet with a power adapter |
| What you add | Pre-made NTPs + PEP (high-energy phosphate donor) | Cheap NMPs + glucose + ribose + free guanine |
| How it regenerates | Pyruvate kinase puts PEP’s phosphate onto NDP → NTP | Glucose → glycolysis → ATP → NMP kinases + NDP kinase → all four NTPs |
| Runtime | ~1 hour | ~20 hours |
| Cost | High (NTPs are expensive) | 84–99 % cheaper (Olsen et al. 2025) |
| Best for | Fast prototyping, fast-maturing reporters | Long incubations, slow-maturing reporters, cost-sensitive work |

Worth the bold. The reason this week uses NMPs, glucose, and ribose instead of NTPs and PEP is that the cheaper system lets the reaction run 20× longer at roughly 1⁄10 the cost. That single trade is what makes the 1,536-well global canvas affordable to run.
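The "20× longer at roughly 1⁄10 the cost" summary can be checked against the table with a few lines of arithmetic. This is a sketch using only the runtime and cost figures quoted above; the function and variable names are illustrative.

```python
def cost_fraction(percent_cheaper: float) -> float:
    """Fraction of the baseline cost that remains after a given percent saving."""
    return 1.0 - percent_cheaper / 100.0


RUNTIME_PEP_NTP_H = 1.0   # ~1 hour, from the comparison table
RUNTIME_NMP_RG_H = 20.0   # ~20 hours, from the comparison table

runtime_ratio = RUNTIME_NMP_RG_H / RUNTIME_PEP_NTP_H

# "84-99 % cheaper" (Olsen et al. 2025) brackets the "roughly 1/10" figure.
low_saving = cost_fraction(84.0)    # about 0.16 of the PEP-NTP cost
high_saving = cost_fraction(99.0)   # about 0.01 of the PEP-NTP cost

print(f"runtime ratio: {runtime_ratio:.0f}x")
print(f"remaining cost: {high_saving:.2f} to {low_saving:.2f} of PEP-NTP")
```

So "1⁄10 the cost" sits inside the reported range rather than at either end of it.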

The salvage shortcut — why guanine alone is enough

The master mix this week supplies no GMP. Only free guanine. The reaction still produces RNA, which needs GTP. How?

flowchart LR
    G[Guanine] -->|"+ PRPP (from ribose + ATP)"| GMP
    GMP -->|guanylate kinase + ATP| GDP
    GDP -->|NDP kinase + ATP| GTP
    GTP -->|RNA polymerase| RNA[Transcribed mRNA]
    style G fill:#ffe
    style GTP fill:#9cf

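The lysate kept the cell’s purine salvage machinery. The key enzyme is a guanine phosphoribosyltransferase, referred to here as HPRT (hypoxanthine-guanine phosphoribosyltransferase is the mammalian name; in E. coli the guanine-salvaging activity comes mainly from Gpt, xanthine-guanine phosphoribosyltransferase, with a smaller contribution from Hpt). It takes guanine and PRPP (made from ribose + ATP) and produces GMP. From there the usual kinases phosphorylate it up to GTP. As long as ribose and ATP are around, guanine alone carries the GTP pool. Bonus answer for Part B Q3.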
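The salvage route can also be written down as a small lookup table and walked step by step. The sketch below only transcribes the flowchart above; the `SALVAGE_STEPS` structure and `trace` helper are illustrative names, and the tally counts only the co-substrates shown in the diagram (it does not count the ATP spent making PRPP from ribose).

```python
from collections import Counter

# substrate -> (product, enzyme, co-substrate), transcribed from the flowchart.
SALVAGE_STEPS = {
    "guanine": ("GMP", "guanine phosphoribosyltransferase (HPRT in these notes)", "PRPP"),
    "GMP": ("GDP", "guanylate kinase", "ATP"),
    "GDP": ("GTP", "NDP kinase", "ATP"),
}


def trace(start: str, goal: str):
    """Follow the chain from start to goal and return the ordered steps taken."""
    path, current = [], start
    while current != goal:
        if current not in SALVAGE_STEPS:
            raise ValueError(f"no route from {current!r} to {goal!r}")
        product, enzyme, cosubstrate = SALVAGE_STEPS[current]
        path.append((current, product, enzyme, cosubstrate))
        current = product
    return path


steps = trace("guanine", "GTP")
cosubstrates = Counter(step[3] for step in steps)

for substrate, product, enzyme, cosub in steps:
    print(f"{substrate} + {cosub} -> {product}  ({enzyme})")
print(dict(cosubstrates))
```

Every step draws on ribose-derived PRPP or on ATP, which is why the pathway works only while the glucose and ribose supply holds up.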


The fluorescent proteins — five-property cheat sheet

Before any FP-specific reasoning, the five properties that decide whether a fluorescent protein works in CFPS:

| # | Property | Why it matters for CFPS |
| --- | --- | --- |
| 1 | Maturation half-time | How long after translation before the chromophore is fluorescent. The dominant property for long reactions. |
| 2 | Brightness = EC × QY | Sets the detection floor. |
| 3 | pKa | CFPS metabolism acidifies the well. High pKa → signal fades as the reaction ages. |
| 4 | Oxygen dependence | All GFP/DsRed-family chromophores need O₂ to mature. Sealed wells run out within hours. |
| 5 | Oligomeric state | Monomers behave best. Dimers can self-quench at high local concentration. |

Photostability matters for imaging, not for endpoint reads.

The six FPs in this week’s canvas

[Figure: Six fluorescent proteins, spectrum and properties]

Each is chosen to expose a different bottleneck. The “supplement” column shows the per-well 2 µL intervention the dominant bottleneck argues for:

| FP | Dominant CFPS property | Suggested 2 µL supplement | Primary reference |
| --- | --- | --- | --- |
| sfGFP | Fast maturation (<10 min, 37 °C). No bottleneck. | Water control (no supplement needed). | Pédelacq et al. 2006, doi:10.1038/nbt1172 |
| mRFP1 | Slow, O₂-dependent two-step maturation (<1 hr at 37 °C per Campbell 2002, vs ~24 hr for DsRed). | Catalase, 100 U/mL: clears H₂O₂, recycles O₂. (See worked example.) | Campbell et al. 2002, doi:10.1073/pnas.082243699 |
| mKO2 | Moderate pKa (~5.5, per FPbase). Acid-sensitive at long incubation. | Boost K-phosphate to 15 mM + HEPES to 75 mM final. | Sakaue-Sawano et al. 2008, doi:10.1016/j.cell.2007.12.033 |
| mTurquoise2 | Highest QY of any monomer (0.93), pKa 3.1, essentially pH-insensitive. Fast-maturing (Goedhart 2012; the I146F structural mutation also accelerates folding); ceiling is translation rate. | Boost Mg²⁺ to 12 mM final. | Goedhart et al. 2012, doi:10.1038/ncomms1738 |
| mScarlet-I | T74I mutation = fast maturation. QY drops 0.70 (parent mScarlet) → 0.54 (mScarlet-I). EC ≈ 100,300 M⁻¹cm⁻¹ for the parent and largely preserved in mScarlet-I (the T74I substitution affects QY and folding kinetics, not the absorptive cross-section). Still a two-step chromophore. | Catalase, 100 U/mL (same as mRFP1; smaller effect expected). | Bindels et al. 2016, doi:10.1038/nmeth.4074 |
| Electra2 | Newer scaffold (2022, from Entacmaea quadricolor), limited CFPS-specific data. | Water control; treat as the calibration arm of the canvas. | Papadaki et al. 2022, doi:10.1038/s41598-022-13214-0 |
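Two of the five properties reduce to one-line formulas, so it helps to see them with the table's own numbers. The sketch below computes brightness as EC × QY and estimates how much signal survives acidification with a Henderson-Hasselbalch curve. Treating the pKa as the pH of half-maximal fluorescence with a single titratable site is a simplifying assumption, and the well pH values are hypothetical.

```python
def brightness(ec_per_m_cm: float, quantum_yield: float) -> float:
    """Molecular brightness as defined in the cheat sheet: EC x QY."""
    return ec_per_m_cm * quantum_yield


def relative_signal(ph: float, pka: float) -> float:
    """Fraction of maximal fluorescence at a given pH.

    ASSUMPTION: single-site titration, with pKa the pH of half-maximal signal.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# EC is the parent mScarlet value from the table, described there as
# largely preserved in mScarlet-I; QYs are 0.70 (parent) and 0.54 (mScarlet-I).
EC_MSCARLET = 100_300
print("mScarlet  :", brightness(EC_MSCARLET, 0.70))
print("mScarlet-I:", brightness(EC_MSCARLET, 0.54))

# HYPOTHETICAL well pH values, chosen only to illustrate acidification.
for ph in (7.5, 6.5, 5.5):
    mko2 = relative_signal(ph, 5.5)    # mKO2 pKa ~5.5 from the table
    mtq2 = relative_signal(ph, 3.1)    # mTurquoise2 pKa 3.1 from the table
    print(f"pH {ph}: mKO2 {mko2:.3f}, mTurquoise2 {mtq2:.3f}")
```

Under this model mKO2 loses half its signal when the well reaches its pKa while mTurquoise2 barely moves, which is the reasoning behind giving mKO2 the extra buffer.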

Worked example — designing the mRFP1 well

mRFP1 is the most useful teaching case because its bottleneck — slow, O₂-dependent chromophore maturation — has a clean reagent fix.

How the mRFP1 chromophore actually matures

flowchart LR
    A["Folded protein<br/>(Gln-Tyr-Gly chromophore precursor)"] -->|"cyclize<br/>~1 min"| B["Cyclized intermediate<br/>(not fluorescent)"]
    B -->|"+ O₂ → H₂O₂<br/>1:1 stoichiometry<br/>(Strack 2010)"| C["GFP-class green intermediate"]
    C -->|"+ O₂ consumed<br/>(acylimine formation;<br/>H₂O₂ release inferred,<br/>not directly quantified)"| D["Mature red mRFP1<br/>584 / 607 nm"]
    style D fill:#f88
    style C fill:#9f9

Two things go wrong in a sealed, 36-hour well:

  1. O₂ runs out. Dissolved O₂ starts at ~250 µM and depletes to single-digit µM within hours. mRFP1 maturation stalls at the green intermediate.
  2. H₂O₂ builds up. Each oxidation step releases peroxide. Peroxide oxidizes the mature chromophore back to dark species, and oxidizes methionine/cysteine in the lysate machinery.

The fix

Add bovine liver catalase to the 2 µL supplement slot, 100 U/mL final. Catalase runs the reaction 2 H₂O₂ → 2 H₂O + O₂, which:

  • clears the damaging peroxide, and
  • regenerates one O₂ for every two peroxides consumed.

Both bottlenecks get hit by the same enzyme.
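A rough oxygen budget shows why the recycling matters. The sketch below assumes a fully sealed well in which chromophore maturation is the only O₂ sink, which is unrealistic because lysate metabolism also consumes O₂, and it assumes the second oxidation releases H₂O₂ 1:1 like the first. Treat the outputs as upper bounds for intuition, not predictions.

```python
O2_START_UM = 250.0              # starting dissolved O2 quoted above (~250 uM)
OXIDATIONS_PER_CHROMOPHORE = 2   # two O2-consuming steps in the maturation scheme
H2O2_PER_OXIDATION = 1.0         # 1:1 for step one (Strack 2010); ASSUMED for step two
O2_PER_H2O2_CATALASE = 0.5       # catalase: 2 H2O2 -> 2 H2O + O2


def net_o2_per_chromophore(with_catalase: bool) -> float:
    """Net O2 molecules consumed to mature one chromophore."""
    consumed = float(OXIDATIONS_PER_CHROMOPHORE)
    if not with_catalase:
        return consumed
    returned = OXIDATIONS_PER_CHROMOPHORE * H2O2_PER_OXIDATION * O2_PER_H2O2_CATALASE
    return consumed - returned


def max_mature_um(with_catalase: bool) -> float:
    """Upper bound on mature chromophore (uM) if maturation were the only O2 sink."""
    return O2_START_UM / net_o2_per_chromophore(with_catalase)


print("without catalase:", max_mature_um(False), "uM")
print("with catalase   :", max_mature_um(True), "uM")
```

Under these assumptions catalase halves the net O₂ cost per chromophore, so the same starting oxygen can mature up to twice as much protein.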

E. coli has its own catalases (KatE and KatG) and these are present in the lysate — but at limited concentration. Supplemental bovine liver catalase (Sigma C9322, the standard commercial source) boosts activity well above the lysate’s residual level, which is the rationale for adding it rather than relying on the native enzymes.

The 2× master mix (shared across the canvas)

| Component | 2× stock | Final (1×) | Why |
| --- | --- | --- | --- |
| HEPES-KOH pH 7.5 | 100 mM | 50 mM | Cytoplasmic-pH buffer. |
| K-glutamate | 260 mM | 130 mM | Ionic strength; glutamate avoids chloride enzyme inhibition. |
| Mg-glutamate (Mg²⁺) | 20 mM | 10 mM | Centered in the safe window. |
| K-phosphate dibasic | 20 mM | 10 mM | Phosphate pool + buffer. |
| K-phosphate monobasic | 20 mM | 10 mM | Buffer pair. |
| Glucose | 80 mM | 40 mM | Bumped above canonical 30 mM for the 36-hour endpoint. Reasoned extrapolation: Olsen et al. 2025’s reported optimum should be cross-checked before submission; if their value differs significantly, defer to theirs. |
| Ribose | 60 mM | 30 mM | PRPP precursor + pentose-phosphate flux. |
| AMP | 3 mM | 1.5 mM | NMP load. |
| CMP | 1.8 mM | 0.9 mM | NMP load. |
| GMP | 0 mM | 0 mM | Omitted by design; guanine + HPRT replaces it. |
| UMP | 1.8 mM | 0.9 mM | NMP load. |
| Guanine | 1 mM | 0.5 mM | Salvage substrate. |
| 17 AA mix | 3 mM each | 1.5 mM each | Building blocks. |
| Tyrosine | 3 mM | 1.5 mM | Separate (low solubility). |
| Cysteine | 3 mM | 1.5 mM | Separate (oxidizes); add fresh. |
| Nicotinamide | 2 mM | 1 mM | NAD⁺ salvage precursor. |
| NF-water | q.s. | | Backfill. |

Three deliberate departures from canonical 20-hr NMP-RG: glucose elevated for the 36-hr endpoint; GMP fully omitted; Mg²⁺ centered rather than pushed high.
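The 2× to 1× arithmetic is easy to check programmatically before submitting a recipe. The sketch below transcribes the 2× stock column, halves it, and works out the volume budget for a 20 µL well with a 2 µL supplement slot. How the remaining volume is split between lysate and DNA is not stated in these notes, so it is left as a single number.

```python
# 2x stock concentrations in mM, transcribed from the master-mix table.
MASTER_MIX_2X_MM = {
    "HEPES-KOH pH 7.5": 100.0, "K-glutamate": 260.0, "Mg-glutamate": 20.0,
    "K-phosphate dibasic": 20.0, "K-phosphate monobasic": 20.0,
    "Glucose": 80.0, "Ribose": 60.0, "AMP": 3.0, "CMP": 1.8, "GMP": 0.0,
    "UMP": 1.8, "Guanine": 1.0, "17 AA mix (each)": 3.0, "Tyrosine": 3.0,
    "Cysteine": 3.0, "Nicotinamide": 2.0,
}

STOCK_FOLD = 2.0        # master mix is used at 2x
WELL_VOLUME_UL = 20.0   # final reaction volume
SUPPLEMENT_UL = 2.0     # per-well supplement slot

# Final (1x) concentration of every component in the well.
final_mm = {name: conc / STOCK_FOLD for name, conc in MASTER_MIX_2X_MM.items()}

# A 2x stock must make up half the well to land at 1x.
master_mix_ul = WELL_VOLUME_UL / STOCK_FOLD
# Volume left for lysate, DNA, and any water backfill.
remaining_ul = WELL_VOLUME_UL - master_mix_ul - SUPPLEMENT_UL

for name, conc in final_mm.items():
    print(f"{name:24s} {conc:6.2f} mM")
print(f"master mix: {master_mix_ul} uL, supplement: {SUPPLEMENT_UL} uL, "
      f"lysate + DNA + water: {remaining_ul} uL")
```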

The 2 µL supplement (mRFP1-specific)

| Component | Stock | Final (in 20 µL) | Purpose |
| --- | --- | --- | --- |
| Bovine liver catalase | 1000 U/mL | 100 U/mL | H₂O₂ clearance + O₂ recycling. |
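The stock and final values follow from the usual C₁V₁ = C₂V₂ dilution rule. Here is a minimal check using the numbers from the table; the helper names are illustrative.

```python
def final_concentration(stock: float, added_ul: float, total_ul: float) -> float:
    """Concentration after diluting added_ul of stock into total_ul (C1*V1 = C2*V2)."""
    return stock * added_ul / total_ul


def volume_needed_ul(stock: float, target: float, total_ul: float) -> float:
    """Volume of stock required to reach the target concentration in total_ul."""
    return target * total_ul / stock


# 2 uL of 1000 U/mL catalase in a 20 uL well.
print(final_concentration(1000.0, 2.0, 20.0), "U/mL final")
# Working backwards from the 100 U/mL target.
print(volume_needed_ul(1000.0, 100.0, 20.0), "uL of stock")
```

The same two functions apply to any other reagent placed in the 2 µL slot, such as the buffer boosts suggested for mKO2.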

Expected outcome

Higher endpoint fluorescence than a no-supplement control. Magnitude unknown — there’s no direct DsRed-family + catalase CFPS literature precedent, so any specific multiplier would be speculation. The time-course should rise for longer, with no plateau-then-droop.

What we don’t know yet.

  • Whether the per-well 2 µL slot accepts enzyme supplements (confirm with the TAs).
  • Whether catalase keeps full activity in CFPS buffer (it should, but unverified).
  • Whether the second mRFP1 oxidation releases peroxide at exactly 1:1 (mechanistically required, less explicitly measured than the first step).


The bigger picture — the GPT-5 case study

The same experimental loop you’re running by hand was run by GPT-5 driving Nebula in late 2025, on the same kind of reaction:

flowchart LR
    A[GPT-5 designs<br/>~256 conditions] -->|Pydantic<br/>validator| B[~128 feasible<br/>conditions]
    B -->|"384-well plate<br/>78 conditions × 4 reps"| C[Nebula execution]
    C -->|"titer + run metadata<br/>+ control QC"| D[GPT-5 lab notebook<br/>+ next-round hypotheses]
    D -->|next round| A
    style A fill:#cdf
    style D fill:#cdf
    style C fill:#fc9

Six rounds, 36,000 conditions, 40 % cost reduction, 27 % titer increase vs the Olsen et al. 2025 SOTA. The most striking detail: in early rounds, before GPT-5 had access to the Olsen preprint, it independently proposed swapping NTPs for NMPs — the same insight that drove the prior SOTA.
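The design, validate, and plate steps in the loop can be sketched in a few lines. The flowchart names a Pydantic validator; the version below swaps in a standard-library dataclass as a stand-in so it runs without dependencies. The field names and the feasibility rules are hypothetical, with the Mg²⁺ ceiling borrowed from the pitfalls list in these notes rather than from the study.

```python
from dataclasses import dataclass
from string import ascii_uppercase

ROWS, COLS = 16, 24  # 384-well SBS plate geometry


@dataclass(frozen=True)
class Condition:
    name: str
    mg_mm: float        # HYPOTHETICAL field names
    glucose_mm: float

    def problems(self) -> list:
        """Return reasons this condition is infeasible (empty list = feasible)."""
        issues = []
        if self.mg_mm < 0 or self.glucose_mm < 0:
            issues.append("negative concentration")
        if self.mg_mm > 15:  # phosphate precipitation threshold from the pitfalls list
            issues.append("Mg2+ above ~15 mM")
        return issues


def layout(conditions, replicates=4):
    """Assign each condition `replicates` consecutive wells, row by row."""
    wells = [f"{ascii_uppercase[r]}{c + 1}" for r in range(ROWS) for c in range(COLS)]
    needed = len(conditions) * replicates
    if needed > len(wells):
        raise ValueError(f"{needed} wells needed, plate has {len(wells)}")
    it = iter(wells)
    return {cond.name: [next(it) for _ in range(replicates)] for cond in conditions}


# HYPOTHETICAL proposals: one feasible, two rejected by the rules above.
proposed = [Condition("c0", 10, 40), Condition("c1", 18, 40), Condition("c2", 12, -5)]
feasible = [c for c in proposed if not c.problems()]
plate = layout(feasible)
print(plate)

# Plate arithmetic from the flowchart: 78 conditions x 4 replicates.
wells_used = 78 * 4
print(wells_used, "of", ROWS * COLS, "wells used")
```

The validation step exists to reject proposals that cannot physically run before they consume plate space, and the capacity check shows that 78 conditions in quadruplicate fit on one 384-well plate with wells to spare.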

This week’s homework deliberately puts you in the GPT-5 role — you do, by hand and reasoning, the parameter optimization Reshma argues should ultimately be outsourced to models. The point is to build the intuition for what that role actually requires.

Reshma’s framing of human × AI division of labor. Outsource parameter optimization (DOE, “which ten permutations next”) to the model. Humans pick the destination — the problem worth working on, the question worth asking. Waymo analogy: the human says where; the system handles speed, lane, turn.

The ethics tension, openly. A student raised that friction is where expertise lives — practical difficulty is a biosafety entry barrier. Reshma agreed: Ginkgo applies human review at the end-to-end research-services tier (won’t build bioweapons). The cloud-lab tier is currently data-in / data-out (CFPS, enzyme assays) and innocuous, so biosafety filters haven’t been forced. As the catalog expands and primitives can be chained, that conversation has to happen. The 50-state Cloud Lab Act legislation moving through Congress will force convergence.


Pitfalls & controls — how to know it worked

Run these alongside the optimized well:

| Control | Purpose |
| --- | --- |
| No-supplement (water in the 2 µL slot) | Baseline anchor. |
| Heat-inactivated catalase (boil stock 10 min at ≥95 °C, then add 2 µL as in the active condition) | Heat denatures the enzyme while preserving protein concentration in the well. If the active condition outperforms this, the gain is enzymatic, not a generic protein-additive (osmotic, crowding) effect. |
| No-DNA (water in the DNA slot) | Lysate background fluorescence; subtract. |
| sfGFP positive control | Confirms lysate + master mix are competent. |
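The controls combine into a simple readout: subtract the no-DNA background, then compare the active well against both baselines. The sketch below uses made-up fluorescence readings purely for illustration. A real analysis would use replicates and a statistical test rather than a single comparison.

```python
def summarize(active: float, no_supplement: float,
              heat_inactivated: float, no_dna: float) -> dict:
    """Background-subtract each reading and report fold changes over the controls."""
    a = active - no_dna
    b = no_supplement - no_dna
    h = heat_inactivated - no_dna
    if b <= 0 or h <= 0:
        raise ValueError("control signal is at or below background")
    return {
        "fold_vs_no_supplement": a / b,
        "fold_vs_heat_inactivated": a / h,
        # True when active catalase beats the denatured-protein control.
        "enzymatic_gain": a > h,
    }


# HYPOTHETICAL readings in arbitrary fluorescence units.
result = summarize(active=1500.0, no_supplement=1100.0,
                   heat_inactivated=1150.0, no_dna=100.0)
print(result)
```

If the heat-inactivated well matched the active well, the improvement over water would point to a protein-additive effect rather than catalase activity.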

Common pitfalls:

  • Sealing too tightly. Cuts off O₂; mRFP1 stalls at the green intermediate. Use a breathable film.
  • Old cysteine. Cysteine that sat at 4 °C for >1 week is mostly oxidized cystine — replace.
  • Mg²⁺ too high. Above ~15 mM, phosphate precipitates. Visible flocculent precipitate = remake.
  • Endpoint-only reads. A plateau-then-droop diagnoses pH or peroxide problems; an endpoint number hides it. Request kinetic reads where possible.
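The endpoint-versus-kinetic point is easy to demonstrate with two made-up traces that finish at the same value. The sketch below flags a plateau-then-droop by comparing the final reading to the peak. The 10 % threshold and both time courses are hypothetical.

```python
def droop_fraction(readings) -> float:
    """Fractional loss from the peak reading to the final reading (0 = no droop)."""
    peak = max(readings)
    if peak <= 0:
        raise ValueError("no positive signal in trace")
    return (peak - readings[-1]) / peak


def classify(readings, threshold: float = 0.10) -> str:
    """Label a trace; the threshold is a HYPOTHETICAL cutoff, not a standard."""
    return "plateau-then-droop" if droop_fraction(readings) > threshold else "no droop"


# HYPOTHETICAL kinetic reads with identical endpoints (700 units).
steady = [0, 100, 300, 500, 700]
drooping = [0, 300, 900, 1000, 700]

for name, trace in (("steady", steady), ("drooping", drooping)):
    print(name, "endpoint:", trace[-1], "->", classify(trace))
```

Both wells report 700 at the endpoint, yet only the kinetic trace reveals that one of them peaked higher and then lost signal.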

Four papers that anchor everything on this page:

  1. Olsen et al. 2025. Design-driven optimization of low-cost reagent formulations for reproducible and high-yielding cell-free gene expression. bioRxiv 2025.08.01.668204. doi:10.1101/2025.08.01.668204 — The NMP-Ribose-Glucose paper; 84–99 % cost reduction over PEP-NTP.
  2. Ginkgo Bioworks × OpenAI. 2026. Using a GPT-5-driven autonomous lab to optimize the cost and titer of cell-free protein synthesis. bioRxiv 2026.02.05.703998 — The 36,000-condition autonomous CFPS study; 40 % cost + 27 % titer beyond Olsen.
  3. Jewett & Swartz 2004. Mimicking the Escherichia coli cytoplasmic environment activates long-lived and efficient cell-free protein synthesis. Biotechnol Bioeng 86:19–26. doi:10.1002/bit.20026 — Foundational CFPS reagent paper; the K-glutamate-over-chloride paper.
  4. Goedhart et al. 2012. Structure-guided evolution of cyan fluorescent proteins towards a quantum yield of 93 %. Nat Commun 3:751. doi:10.1038/ncomms1738 — mTurquoise2 design; reference example for how structure-guided FP engineering pushes limits.


References

  1. Olsen ML et al. 2025. bioRxiv 2025.08.01.668204. doi:10.1101/2025.08.01.668204.
  2. Ginkgo Bioworks × OpenAI. 2026. bioRxiv 2026.02.05.703998.
  3. Jewett MC, Swartz JR. 2004. Biotechnol Bioeng 86:19–26. doi:10.1002/bit.20026.
  4. Pédelacq JD et al. 2006. Nat Biotechnol 24:79–88. doi:10.1038/nbt1172.
  5. Campbell RE et al. 2002. PNAS 99:7877–7882. doi:10.1073/pnas.082243699.
  6. Sakaue-Sawano A et al. 2008. Cell 132:487–498. doi:10.1016/j.cell.2007.12.033.
  7. Goedhart J et al. 2012. Nat Commun 3:751. doi:10.1038/ncomms1738.
  8. Bindels DS et al. 2016. Nat Methods 14:53–56. doi:10.1038/nmeth.4074.
  9. Papadaki S et al. 2022. Sci Rep 12:10190. doi:10.1038/s41598-022-13214-0.
  10. Strack RL et al. 2010. J Am Chem Soc 132:8496–8505. doi:10.1021/ja1020364.