Individual Final Project
AI-Designed Antimicrobial Peptide Cocktails Tested via Cell-Free Protein Synthesis
Sean Murphy · MAS.885 Spring 2026
Abstract
Antimicrobial resistance is on track to cause more than 10 million deaths per year by 2050, and the discovery pipeline for new small-molecule antibiotics has slowed sharply over the past two decades. Antimicrobial peptides (AMPs) are an attractive alternative class because they kill bacteria primarily by membrane disruption, a mechanism that is intrinsically harder for pathogens to evolve resistance against.
The broad objective of this project is to ask whether modern AI peptide-design tools, paired with cell-free protein synthesis (CFPS), can be combined into a fast prototyping platform that not only identifies single-peptide hits, but also surfaces synergistic peptide cocktails. The hypothesis is that AMPs produced by mechanistically distinct generative models (a length-conditioned latent-space diffusion model trained on natural AMPs and a CLIP-based target-binder generator) will exhibit synergy when co-expressed, with the most pronounced synergy appearing in cross-method pairs that combine membrane-active generalists with target-directed binders.
To test this, the specific aims are:
- Generate candidates with AMP-Diffusion, PepPrCLIP, and a MadSBM fallback strategy
- Order 15 codon-optimized linear gene fragments from Twist
- Express each peptide in BL21 Star DE3 lysate with GamS nuclease inhibitor to protect linear templates
- Score crude lysates against E. coli ATCC 25922 and B. subtilis ATCC 6633 in a two-organism antimicrobial assay, then run pairwise co-expression on top hits and score synergy by an FICI-style metric
Project Aims
Aim 1 · Experimental Aim (this project)
Determine whether AI-designed antimicrobial peptides exhibit synergistic activity when co-expressed. The platform integrates three generative peptide-design tools, 15 codon-optimized Twist linear gene fragments, a Ginkgo Bioworks BL21 Star DE3 CFPS lysate supplemented with NEBExpress GamS Nuclease Inhibitor, and a dual-organism optical-density growth-inhibition assay scored against scrambled-peptide and Cecropin B controls. A primary single-peptide screen identifies hits, and a follow-on pairwise co-expression screen at half-dose scores synergy using an FICI-style fractional inhibition index calculated separately for each organism.
Aim 2 · Development Aim
Close the design-build-test loop by feeding measured single-peptide and pairwise activity data back into the generative models as a fine-tuning signal, producing a second wave of candidates conditioned on both predicted activity and predicted synergy partners. In practice, this means:
| Step | Action |
|---|---|
| i | Build a small labelled dataset (sequence → MIC-equivalent inhibition per organism, plus pairwise FICI labels) |
| ii | Parameter-efficient fine-tuning of AMP-Diffusion and a re-trained MiniCLIP scorer to bias toward in-house hit signatures |
| iii | Extend the assay panel to clinical isolates (drug-resistant E. coli, MRSA, A. baumannii) + a mammalian-cell cytotoxicity counter-screen |
| iv | Migrate from manual pipetting to Opentrons OT-2 / Ginkgo automation, raising throughput from ~100 to several hundred reactions per screen |
Aim 3 · Visionary Aim
A personalized, on-demand antibiotic-cocktail design platform: a clinician submits a patient’s pathogen genome (or a culture resistance profile), a generative pipeline designs a custom cocktail targeting that strain’s essential proteins and membrane chemistry, the cocktail is expressed locally in a freeze-dried CFPS reaction, and the peptide mixture is administered as a same-day therapeutic. The cocktail approach addresses resistance directly: pathogens that mutate around one membrane-active peptide are far less likely to escape three simultaneously, each with a distinct mechanism. If achieved, this could reduce time-to-treatment for drug-resistant infections from months to days.
Background
Key References
| Citation | Contribution to this project |
|---|---|
| Torres, M. D. T. et al. (2025) “AMP-Diffusion: Generative AI design of antimicrobial peptides,” Cell Biomaterials | Latent-space diffusion model (~16.5 M parameters) on ESM-2 embeddings, trained on 19,670 natural AMPs. Provides the generative backbone for Group A and the Group C fallback. |
| Bhat, S. et al. (2025) “PepPrCLIP: Target-directed de novo peptide binder design via contrastive language-image pretraining for proteins,” Science Advances | Frozen ESM-2 (650 M) encoder + contrastive “MiniCLIP” scorer that ranks ~100K Gaussian-perturbed candidates against a target. Used to design Group B binders against E. coli FtsZ and LpxC. |
Novelty
| # | Innovation |
|---|---|
| 1 | Two mechanistically distinct generative paradigms in the same Twist order. A generalist distribution-based sampler (AMP-Diffusion) and a target-directed contrastive ranker (PepPrCLIP) are combined head-to-head so that cross-method synergy can be tested against within-method synergy. |
| 2 | Linear Twist fragments straight into CFPS instead of plasmid cloning. Compresses the design-build-test cycle from weeks to days and makes the platform compatible with future on-demand / point-of-care use cases. |
| 3 | Cocktail design as a first-class objective. Rather than asking “which peptide is most potent?”, the platform asks “which combination is most potent at half-dose each?”, the relevant question for resistance-resistant therapeutics. |
Why It Matters
The WHO estimates that drug-resistant infections currently cause ~1.27 million deaths per year, projected to climb past 10 million per year by 2050 unless the discovery pipeline accelerates. The pharmaceutical industry has largely exited small-molecule antibiotic development because the economic returns are poor, so the gap is increasingly being filled by academic and government-funded efforts to find chemically distinct alternatives. AMPs are a particularly promising chemical class because their primary action (disrupting bacterial membranes) is harder for pathogens to evolve away from than enzyme-target binding, and peptide synthesis can be reprogrammed faster than small-molecule chemistry. If the methods developed here generalize, the broader contribution is a faster, cheaper, more modular antibiotic discovery workflow that can be rerun for each emerging resistant strain.
Ethical Considerations
This project sits at the intersection of three ethical concerns:
| Concern | Mitigation |
|---|---|
| Dual-use biosecurity (generative tools could in principle design toxin-like peptides) | Twist orders pass SecureDNA screening; peptide pool filtered for cationic amphipathic AMP-like character; ATCC reference strains only (not clinical isolates) |
| Therapeutic safety (hemolysis, off-target toxicity) | PeptiVerse hemolysis, toxicity, and developability predictions on every candidate before ordering DNA; candidates flagged as likely RBC-toxic discarded |
| Responsible representation of preliminary data | Report negative results (e.g., the no-GamS validation run, the missing MadSBM inference code) with the same emphasis as positive results |
Guiding principles: non-maleficence (do no harm, including by not enabling foreseeable misuse), responsibility (using the synthesis-screening infrastructure consciously and transparently), beneficence (adding to humanity’s antibiotic arsenal), and justice (cocktail-based on-demand antibiotic design must be paired with deployment strategies that reach low-income contexts).
Experimental Design
Peptide Panel
| Group | Tool | Constructs | Purpose |
|---|---|---|---|
| A | AMP-Diffusion | AMP-D-01 through AMP-D-06 | 6 generalist AMPs, selected for diversity from ~2,000 candidates after physicochemical filtering |
| B | PepPrCLIP | PPC-FtsZ-01/02, PPC-LpxC-01/02 | 4 targeted binders against E. coli FtsZ (UniProt P0A9A6) and LpxC (UniProt P0A725) |
| C | AMP-Diffusion fallback | AMPD-ALT-01/02/03 | 3 additional diverse AMP-Diffusion candidates (MadSBM inference code unavailable) |
| D | Controls | CTRL-POS-01 (Cecropin B), CTRL-NEG-01 (scrambled), no-DNA | Gram-negative-selective positive, negative, and lysate-only references |
Construct Architecture
Each construct is a single linear cassette:
All 15 constructs are ≥ 300 bp (Twist gene-fragment minimum), adapters off, codon-optimized for E. coli K-12, and screened against BsaI, BbsI, EcoRI, HindIII sites and ≥6 nt homopolymer runs.
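The sequence-level checks described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual screening script: the function name `screen_construct` and the constants are hypothetical, and the recognition sequences are the standard ones for the four enzymes named (BsaI and BbsI are non-palindromic, so both strands are checked).

```python
import re

# Standard recognition sites (5'->3') for the enzymes named in the text.
SITES = {
    "BsaI": "GGTCTC",
    "BbsI": "GAAGAC",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
}
MIN_LENGTH_BP = 300   # Twist gene-fragment minimum stated in the text
MAX_RUN_NT = 5        # runs of 6 nt or more are rejected

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def screen_construct(seq: str) -> list[str]:
    """Return a list of problems found; an empty list means the construct passes."""
    seq = seq.upper()
    problems = []
    if len(seq) < MIN_LENGTH_BP:
        problems.append(f"length {len(seq)} bp < {MIN_LENGTH_BP} bp")
    for name, site in SITES.items():
        # Check both strands; palindromic sites collapse to a single motif.
        for motif in sorted({site, reverse_complement(site)}):
            if motif in seq:
                problems.append(f"{name} site ({motif})")
    # (.)\1{5,} matches any base repeated 6 or more times in a row.
    if re.search(r"(.)\1{%d,}" % MAX_RUN_NT, seq):
        problems.append(f"homopolymer run >= {MAX_RUN_NT + 1} nt")
    return problems
```

A construct would be accepted only when `screen_construct(seq)` returns an empty list. Codon optimization itself is not shown here.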
Timeline
| Week | Phase | Activities | Status |
|---|---|---|---|
| 1 | Computational | AMP-Diffusion (2,000 candidates, lengths 20/25/30/35) → physicochemical filter (charge +2..+9, hydrophobic ratio 0.30..0.70, no homopolymer ≥4) → greedy max-min Hamming selection. PepPrCLIP Quickstart on FtsZ + LpxC. PeptiVerse developability check. Codon-optimize and assemble cassettes. | Complete |
| 1 | Ordering | 15 Twist gene fragments (~4.9 kb total, ~$342). NEBExpress GamS (NEB #P0774S, $88). Confirm Ginkgo CFPS kit. | Complete |
| 2 | Bench | Receive fragments, resuspend to 10 ng/µL. Primary single-peptide CFPS screen: 15 constructs × 3 biological replicates × 2 organisms. 20 µL reactions, 50–100 ng linear template, GamS at 0.6 µg per 20 µL, 30 °C × 4 h. Transfer 5 µL crude lysate into 100 µL of ~5 × 10⁵ CFU/mL Mueller-Hinton culture. OD600 readout. | In progress |
| 2 | Bench validation | Pre-flight no-GamS control run for Group A vs E. coli. | Complete |
| 3 | Bench | Take top 5–6 hits; pairwise co-expression at half-dose (25–50 ng of each template per 20 µL); single-agent half-dose controls; dual-plate format. | Pending |
| 3 | Analysis | Compute FICI per pair per organism. FICI ≤ 0.5 = synergy; > 0.5 to 1.0 = additive; > 1.0 = indifference / antagonism. Flag cross-method vs within-method synergy. | Pending |
| 4 | Presentation | Repeat best hit/pair on a fresh CFPS batch; produce inhibition heatmaps, FICI grids, helical wheels for top hits; write final report. | Pending |
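For reference, the week 3 synergy scoring can be illustrated with the classical fractional inhibitory concentration index, FICI = MIC_A(combo)/MIC_A(alone) + MIC_B(combo)/MIC_B(alone). The writeup does not specify how the OD-based "FICI-style" index converts half-dose inhibition readings into fractional concentrations, so the sketch below uses the classical MIC-based definition as a stand-in and applies the interpretation thresholds from the Analysis row. The function names are hypothetical.

```python
def fici(mic_a_alone: float, mic_b_alone: float,
         mic_a_combo: float, mic_b_combo: float) -> float:
    """Classical FICI from MICs measured alone and in combination.

    Assumption: the project's OD-based index is replaced here by the
    textbook MIC-based formula, purely for illustration.
    """
    if mic_a_alone <= 0 or mic_b_alone <= 0:
        raise ValueError("single-agent MICs must be positive")
    return mic_a_combo / mic_a_alone + mic_b_combo / mic_b_alone


def classify_fici(value: float) -> str:
    """Apply the thresholds from the timeline table."""
    if value <= 0.5:
        return "synergy"
    if value <= 1.0:
        return "additive"
    return "indifference / antagonism"
```

For example, if each peptide in a pair inhibits at one quarter of its single-agent MIC, `fici(8, 16, 2, 4)` gives 0.5 and the pair is classified as synergy; if each needs half its single-agent MIC, the index is 1.0 and the pair is additive.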
Techniques Used
| Category | Technique |
|---|---|
| Protein design | AMP-Diffusion (generative protein modeling), PepPrCLIP (target-directed peptide design), PeptiVerse (developability prediction), Benchling, models and notebooks, UniProt / NCBI databases |
| DNA design | DNA construct design, codon optimization, Twist linear gene-fragment ordering |
| Cell-free systems | BL21 Star DE3 CFPS lysate, T7 RNAP transcription, GamS-protected linear template reactions |
| Bioproduction | Chassis selection, bacterial culturing (E. coli ATCC 25922, B. subtilis ATCC 6633), Mueller-Hinton broth |
| QC and analysis | OD600 growth-inhibition assay, FICI-style synergy scoring |
| Lab automation | Python / PyTorch inference and filtering scripts, Twist order generation |
| Bioethics | SecureDNA screening, PeptiVerse hemolysis filter, BSL-1 ATCC strains only |
Two Techniques in Depth
Cell-Free Reactions
The entire bench phase is built around BL21 Star DE3 cell-free protein synthesis. Each Twist linear gene fragment is resuspended to 10 ng/µL and added at 50–100 ng to a 20 µL reaction containing lysate, reaction buffer, and NEBExpress GamS Nuclease Inhibitor (0.6 µg per 20 µL reaction). GamS is essential: it binds and inhibits the RecBCD exonuclease complex in the crude lysate, which would otherwise degrade the linear DNA template within minutes. The no-GamS validation run (below) is consistent with this: without GamS, no antimicrobial activity was detectable in any well.
Reactions incubate at 30 °C for 4 h, then 5 µL of crude lysate is transferred directly into a 100 µL bacterial test well. No peptide purification step is needed because the assay tolerates lysate background, controlled for by the scrambled-Cecropin negative control. Using CFPS rather than plasmid-based in vivo expression compresses the design-build-test loop from weeks to days and decouples the peptide chemistry from any host-toxicity confounder, since the host that would normally express the AMP is not part of the assay.
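The volumes implied by this setup follow from simple arithmetic, sketched below using only the quantities stated above (10 ng/µL stock, 50–100 ng template, 0.6 µg GamS per reaction, 5 µL lysate into 100 µL culture). The helper names are hypothetical, and treating the primary screen as 15 × 3 × 2 assay wells is an assumption about how replicates map to wells.

```python
STOCK_NG_PER_UL = 10.0    # resuspension concentration from the text
GAMS_UG_PER_RXN = 0.6     # GamS dose per 20 µL reaction from the text


def template_volume_ul(template_ng: float,
                       stock_ng_per_ul: float = STOCK_NG_PER_UL) -> float:
    """Volume of template stock needed to deliver the requested mass."""
    return template_ng / stock_ng_per_ul


def lysate_fraction(lysate_ul: float = 5.0, culture_ul: float = 100.0) -> float:
    """Fraction of the final assay-well volume that is crude lysate."""
    return lysate_ul / (lysate_ul + culture_ul)


def primary_screen_wells(constructs: int = 15, replicates: int = 3,
                         organisms: int = 2) -> int:
    """Assay wells in the primary screen (assumed: one well per combination)."""
    return constructs * replicates * organisms


def gams_needed_ug(n_reactions: int) -> float:
    """Total GamS mass for a given number of CFPS reactions."""
    return n_reactions * GAMS_UG_PER_RXN
```

With these numbers, 50–100 ng of template corresponds to 5–10 µL of stock, which is 25–50% of a 20 µL reaction, and the crude lysate makes up roughly 5% of each assay well.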
Setting up CFPS reactions at the bench.

Transferring template and reagents into reaction strips.

Incubator with rack ready for the 30 °C × 4 h CFPS step.

Spiking crude lysate into the bacterial test plate.

Protein Design (Generative Peptide Modeling)
Three AI tools were used in this project:
| Tool | Role | Output |
|---|---|---|
| AMP-Diffusion (Torres et al., Cell Biomaterials 2025) | Latent-space diffusion model conditioned on length; produces broad-distribution AMP candidates | ~2,000 candidates across lengths 20/25/30/35, filtered by net charge (+2..+9), hydrophobic ratio (0.30..0.70), homopolymer runs (<4). Top 6 most diverse selected for Group A + 3 fallback for Group C. |
| PepPrCLIP (Bhat et al., Science Advances 2025) | CLIP-based contrastive scorer that ranks a 100K Gaussian-perturbed peptide pool against a target | E. coli FtsZ (P0A9A6) and LpxC (P0A725) scored; top 2 diverse binders per target picked for Group B. |
| PeptiVerse (Zhang et al., 2026) | Developability prediction (hemolysis, solubility, toxicity, permeability) | Run on all 13 designed peptides via the Gradio API. Hemolysis score < 0.5 required for every shipped construct. |
All peptide ORFs were reverse-translated using a high-expression E. coli K-12 codon table, screened against common restriction sites and homopolymer runs, then padded out to at least 300 bp to meet the Twist gene-fragment minimum.
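The AMP-Diffusion post-processing (physicochemical filter followed by greedy max-min Hamming selection) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the project's script: net charge counts K/R as +1 and D/E as -1 and ignores histidine and termini, the hydrophobic residue set is an assumed choice, unequal-length sequences are compared by counting the length difference as mismatches, and the selection is seeded with the first candidate.

```python
HYDROPHOBIC = set("AVILMFWC")   # assumption: the residue set is not given in the text
POSITIVE = set("KR")            # assumption: histidine is ignored
NEGATIVE = set("DE")


def net_charge(pep: str) -> int:
    return sum(aa in POSITIVE for aa in pep) - sum(aa in NEGATIVE for aa in pep)


def hydrophobic_ratio(pep: str) -> float:
    return sum(aa in HYDROPHOBIC for aa in pep) / len(pep)


def has_homopolymer(pep: str, n: int = 4) -> bool:
    """True if any residue repeats n or more times in a row."""
    return any(pep[i:i + n] == pep[i] * n for i in range(len(pep) - n + 1))


def passes_filter(pep: str) -> bool:
    """Charge +2..+9, hydrophobic ratio 0.30..0.70, no homopolymer run >= 4."""
    return (2 <= net_charge(pep) <= 9
            and 0.30 <= hydrophobic_ratio(pep) <= 0.70
            and not has_homopolymer(pep))


def hamming(a: str, b: str) -> int:
    """Hamming distance; any length difference counts as mismatches (assumption)."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def greedy_max_min(candidates: list[str], k: int) -> list[str]:
    """Repeatedly add the candidate whose nearest already-picked sequence is farthest."""
    pool = list(dict.fromkeys(candidates))   # de-duplicate, keep order
    if not pool or k <= 0:
        return []
    picked = [pool[0]]                       # assumption: seed with first candidate
    while len(picked) < min(k, len(pool)):
        remaining = [c for c in pool if c not in picked]
        picked.append(max(remaining,
                          key=lambda c: min(hamming(c, p) for p in picked)))
    return picked
```

A typical call would be `greedy_max_min([p for p in candidates if passes_filter(p)], 6)` for Group A, with the Group C fallback drawn the same way from the remaining pool.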
Industry Council Partners
| Company | Contribution |
|---|---|
| Twist Bioscience | Supplier of the 15 linear gene fragments (the physical DNA inputs to the project) |
| Ginkgo Bioworks | Supplier of the BL21 Star DE3 CFPS lysate used to express every peptide |
| New England Biolabs | Supplier of NEBExpress GamS Nuclease Inhibitor (#P0774S), required to protect linear templates from RecBCD |
| ATCC | Supplier of E. coli ATCC 25922 and B. subtilis ATCC 6633 test strains |
| SecureDNA | Sequence screening pipeline that the Twist order passed through before synthesis |
Results
What Has Been Validated
Two aspects of the project are validated so far:
- Computational design pipeline. All three AI tools were stood up successfully and produced sequences that passed physicochemical and developability filters; the resulting 15 codon-optimized Twist constructs are documented in constructs/construct_details.csv and assembled in twist_orders/twist_order_batch1_FINAL.csv.
- Bench workflow’s interpretive logic. Group A AMP peptides were expressed in BL21 Star DE3 lysate without GamS and assayed against E. coli ATCC 25922 as a negative-by-design control. The expectation was that the linear DNA would be degraded by RecBCD before useful amounts of peptide could accumulate, so no antimicrobial activity should be visible above the scrambled control. That expectation was met, which supports the need for GamS in subsequent runs and shows that the assay returns a null result when no functional peptide is produced.
No-GamS Validation Run, Quantitative Data
Wells A1 through F1 correspond to AMP-D-01 through AMP-D-06 spiked into E. coli ATCC 25922 in Mueller-Hinton broth at ~5 × 10⁵ CFU/mL starting density. OD600 read every 30 min for 7.5 h on the plate reader.

Figure 1. OD600 growth curves over 7.5 h for the no-GamS Group A validation run. Wells A1 through F1 correspond to AMP-D-01 through AMP-D-06 in CFPS reactions added to E. coli ATCC 25922 in Mueller-Hinton broth. All six wells show normal sigmoidal growth with no inhibition relative to one another, consistent with the linear DNA template being degraded by RecBCD before functional peptide accumulates when GamS is absent.
| Phase | Time | OD600 |
|---|---|---|
| Lag | 0–2 h | ~0.10 (flat) |
| Exponential | 2–5 h | Rising from ~0.10 to ~0.45 |
| Stationary | 5–7.5 h | Endpoint OD600 ≈ 0.55 (well F1) to ≈ 0.78 (well E1); wells A1, B1, C1, D1 clustered tightly between ≈ 0.55 and 0.61 |
The growth curves overlap heavily through lag and early-exponential phases and only diverge modestly in stationary phase, the normal pattern for untreated wells of the same strain. The key interpretive point is that no peptide well showed any reduction in E. coli growth relative to the others. The well that looks like an outlier (E1) is higher, not lower, which is the opposite of an antimicrobial signal. This is the expected null result. Without GamS, the linear template is degraded by RecBCD before functional peptide accumulates, so no antimicrobial activity is detectable, even though the bacteria are clearly viable and growing normally. The same plate reader, lysate prep, Twist fragments, and E. coli culture will be used for the +GamS run, so any growth suppression observed there can be attributed to peptide expression rather than to assay artifact.
Quantitative Expectations for the +GamS Phase
| Metric | Target |
|---|---|
| Group A hits (≥50% inhibition vs scrambled) | ≥ 2 of 6 against at least one organism |
| Group B hits (≥50% inhibition vs scrambled) | ≥ 1 of 4 against E. coli (bias expected, since binders are designed against E. coli targets) |
| Positive control (Cecropin B) | Strong inhibition of E. coli, weak inhibition of B. subtilis (Gram-negative-selective) |
| Negative control (scrambled) | OD600 indistinguishable from lysate-only |
| Synergy screen | ≥ 1 FICI ≤ 0.5 combination, with cross-method (Group A × Group B) pairs over-represented vs within-method pairs |
Published CFPS-AMP studies typically report 30–70% growth inhibition for active AMPs versus scrambled controls.
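The hit criterion in the table (≥50% inhibition vs scrambled) can be made concrete with a short calculation. The sketch below assumes percent inhibition is computed from replicate-mean endpoint OD600 values relative to the scrambled control, with an optional blank subtraction; the exact formula and the function names are assumptions, since the writeup does not spell them out.

```python
from statistics import mean


def percent_inhibition(od_test: float, od_control: float,
                       od_blank: float = 0.0) -> float:
    """Growth inhibition relative to the scrambled control, in percent.

    Assumption: inhibition = 100 * (1 - (test - blank) / (control - blank)).
    """
    control_growth = od_control - od_blank
    if control_growth <= 0:
        raise ValueError("control OD600 must exceed the blank")
    return 100.0 * (1.0 - (od_test - od_blank) / control_growth)


def is_hit(test_replicates: list[float], control_replicates: list[float],
           od_blank: float = 0.0, threshold_pct: float = 50.0) -> bool:
    """Call a hit when mean inhibition across replicates meets the threshold."""
    inhibition = percent_inhibition(mean(test_replicates),
                                    mean(control_replicates), od_blank)
    return inhibition >= threshold_pct
```

As a hypothetical example, a peptide well ending at OD600 0.30 against a scrambled control at 0.60, with a blank of 0.10, corresponds to about 60% inhibition and would count as a hit; a well matching the control gives 0% and would not.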
Challenges and Limitations
| Challenge | Status | Mitigation |
|---|---|---|
| MadSBM inference code not public | Encountered | Model weights are on HuggingFace (ChatterjeeLab/MadSBM) but no inference code or usage example was published. Rather than reverse-engineer a Schrödinger-bridge inference loop, fell back to a parallel AMP-Diffusion run maximizing Hamming distance from Group A picks. Loses the original spectrum-broadening hypothesis but preserves the cocktail-synergy hypothesis, which is the project’s main scientific question. |
| Linear template degradation by RecBCD in CFPS lysate | Mitigated | NEBExpress GamS Nuclease Inhibitor (NEB #P0774S) added at 0.6 µg per 20 µL reaction. Validated by the deliberately-no-GamS Group A run (Figure 1). |
| Modest yield of short peptides in CFPS | Anticipated | Strong E. coli RBS + codon-optimized ORFs + 3 biological replicates per construct. Fallback: scale CFPS reaction volume from 20 µL to 50 µL. |
| 15-peptide pilot, generalizability | Acknowledged | Framed explicitly as a feasibility study, not a clinical pre-screen. Cross-method vs within-method pairs are built-in controls, making the central synergy assumption directly testable. |
Supply List and Budget
| Item | Est. cost (USD) | Notes |
|---|---|---|
| Twist Gene Fragments (15 × 302–350 bp, adapters off) | $342–$440 | ~$0.07/bp; ~4.9 kb total. Ordered. |
| NEBExpress GamS Nuclease Inhibitor (#P0774S, 75 µg) | $88 | ~0.6 µg per 20 µL reaction; required for linear DNA protection |
| Ginkgo BL21 Star DE3 CFPS kit (~100 reactions) | $200–$500 | Provided through HTGAA / Ginkgo. Confirmed available. |
| E. coli ATCC 25922 + B. subtilis ATCC 6633 strains | $80–$140 | ATCC standard reference strains |
| Mueller-Hinton broth + agar plates | $60–$100 | Standard microbiology consumables |
| 96-well clear flat-bottom plates (×2 per screen) | $40–$80 | One plate per assay organism |
| Filter tips, microcentrifuge tubes, pipettes | $50–$120 | Shared lab stock |
| GPU compute (L40S × ~20 h for all AI runs) | $30–$60 | Cloud GPU credits |
| TOTAL (estimated range) | $890–$1,528 | Within the $2,000 project ceiling |
References
| Reference | Use in this project |
|---|---|
| Torres, M. D. T., Mohanty, A., Chatterjee, P., de la Fuente-Nunez, C., et al. (2025). AMP-Diffusion: Generative AI design of antimicrobial peptides. Cell Biomaterials. | Generative backbone for Groups A and C |
| Bhat, S., Palepu, K., Chatterjee, P., et al. (2025). PepPrCLIP: Target-directed de novo peptide binder design via contrastive language-image pretraining for proteins. Science Advances. | Target-binder generation for Group B |
| Goel, V. & Chatterjee, P. (2026). Minimal-Action Discrete Schrödinger Bridge Matching (MadSBM). arXiv:2601.22408 / ICLR 2026. Weights on HuggingFace: ChatterjeeLab/MadSBM; no inference code available. | Originally intended for Group C; replaced by AMP-Diffusion fallback |
| Zhang, A., Chatterjee, P., et al. (2026). PeptiVerse: A unified platform for therapeutic peptide developability prediction. bioRxiv, DOI: 10.64898/2025.12.31.697180. Hosted at huggingface.co/spaces/ChatterjeeLab/PeptiVerse. | Hemolysis, solubility, toxicity filter |
| Sun, Z. Z., Hayes, C. A., Shin, J., et al. (2013). Protocols for implementing an Escherichia coli-based TX-TL cell-free expression system for synthetic biology. JoVE 79, e50762. | GamS-based linear DNA protection protocol |
| World Health Organization (2024). Global Antimicrobial Resistance Surveillance Report. | AMR mortality and forecasting figures |
| NEB Product Page: NEBExpress GamS Nuclease Inhibitor (#P0774S). www.neb.com | GamS dosing (~0.6 µg per 20 µL CFPS reaction) |
| UniProt: P0A9A6 (FtsZ, E. coli K-12) and P0A725 (LpxC, E. coli K-12) | PepPrCLIP target inputs |
| ATCC: ATCC 25922 (E. coli) and ATCC 6633 (Bacillus subtilis subsp. spizizenii) | Test strains for dual-organism antimicrobial assay |
| Twist Bioscience Gene Fragment specifications | 300 bp minimum, adapters off |