Individual Final Project

AI-Designed Antimicrobial Peptide Cocktails Tested via Cell-Free Protein Synthesis

Sean Murphy · MAS.885 Spring 2026

Abstract

Antimicrobial resistance is on track to cause more than 10 million deaths per year by 2050, and the discovery pipeline for new small-molecule antibiotics has slowed sharply over the past two decades. Antimicrobial peptides (AMPs) are an attractive alternative class because they kill bacteria primarily by membrane disruption, a mechanism that is intrinsically harder for pathogens to evolve resistance against.

The broad objective of this project is to test whether modern AI peptide-design tools, paired with cell-free protein synthesis (CFPS), can serve as a fast prototyping platform that identifies single-peptide hits and also surfaces synergistic peptide cocktails. The hypothesis is that AMPs produced by mechanistically distinct generative models (a length-conditioned latent-space diffusion model trained on natural AMPs and a CLIP-based target-binder generator) will exhibit synergy when co-expressed, with the most pronounced synergy appearing in cross-method pairs that combine membrane-active generalists with target-directed binders.

To test this, the specific aims are:

Generate candidates with AMP-Diffusion, PepPrCLIP, and a MadSBM fallback strategy
Order 15 codon-optimized linear gene fragments from Twist
Express each peptide in BL21 Star DE3 lysate with GamS nuclease inhibitor to protect linear templates
Score crude lysates against E. coli ATCC 25922 and B. subtilis ATCC 6633 in a two-organism antimicrobial assay, then run pairwise co-expression on top hits and score synergy by an FICI-style metric

Project Aims

Aim 1 · Experimental Aim (this project)

Determine whether AI-designed antimicrobial peptides exhibit synergistic activity when co-expressed. The platform integrates three generative peptide-design tools, 15 codon-optimized Twist linear gene fragments, a Ginkgo Bioworks BL21 Star DE3 CFPS lysate supplemented with NEBExpress GamS Nuclease Inhibitor, and a dual-organism optical-density growth-inhibition assay scored against scrambled-peptide and Cecropin B controls. A primary single-peptide screen identifies hits, and a follow-on pairwise co-expression screen at half-dose scores synergy using an FICI-style fractional inhibition index calculated separately for each organism.

Aim 2 · Development Aim

Close the design, build, test loop by feeding measured single-peptide and pairwise activity data back into the generative models as a fine-tuning signal, producing a second wave of candidates conditioned on both predicted activity and predicted synergy partners. Practically this means:

Step	Action
i	Build a small labelled dataset (sequence → MIC-equivalent inhibition per organism, plus pairwise FICI labels)
ii	Parameter-efficient fine-tuning of AMP-Diffusion and a re-trained MiniCLIP scorer to bias toward in-house hit signatures
iii	Extend the assay panel to clinical isolates (drug-resistant E. coli, MRSA, A. baumannii) + a mammalian-cell cytotoxicity counter-screen
iv	Migrate from manual pipetting to Opentrons OT-2 / Ginkgo automation, raising throughput from ~100 to several hundred reactions per screen

Aim 3 · Visionary Aim

A personalized, on-demand antibiotic-cocktail design platform: a clinician submits a patient’s pathogen genome (or a culture resistance profile), a generative pipeline designs a custom cocktail targeting that strain’s essential proteins and membrane chemistry, the cocktail is expressed locally in a freeze-dried CFPS reaction, and the peptide mixture is administered as a same-day therapeutic. The cocktail approach addresses resistance directly: pathogens that mutate around one membrane-active peptide are far less likely to escape three simultaneously, each with a distinct mechanism. If achieved, this could reduce time-to-treatment for drug-resistant infections from months to days.

Background

Key References

Citation	Contribution to this project
Torres, M. D. T. et al. (2025) “AMP-Diffusion: Generative AI design of antimicrobial peptides,” Cell Biomaterials	Latent-space diffusion model (~16.5 M parameters) on ESM-2 embeddings, trained on 19,670 natural AMPs. Provides the generative backbone for Group A and the Group C fallback.
Bhat, S. et al. (2025) “PepPrCLIP: Target-directed de novo peptide binder design via contrastive language-image pretraining for proteins,” Science Advances	Frozen ESM-2 (650 M) encoder + contrastive “MiniCLIP” scorer that ranks ~100K Gaussian-perturbed candidates against a target. Used to design Group B binders against E. coli FtsZ and LpxC.

Novelty

#	Innovation
1	Two mechanistically distinct generative paradigms in the same Twist order. A generalist distribution-based sampler (AMP-Diffusion) and a target-directed contrastive ranker (PepPrCLIP) are combined head-to-head so that cross-method synergy can be tested against within-method synergy.
2	Linear Twist fragments straight into CFPS instead of plasmid cloning. Compresses design, build, test from weeks to days and makes the platform compatible with future on-demand / point-of-care use cases.
3	Cocktail design as a first-class objective. Rather than asking “which peptide is most potent?”, the platform asks “which combination is most potent at half-dose each?”, which is the more relevant question for therapeutics designed to withstand resistance evolution.

Why It Matters

The WHO cites estimates that drug-resistant bacterial infections were directly responsible for ~1.27 million deaths in 2019, with widely quoted projections climbing past 10 million per year by 2050 unless the discovery pipeline accelerates. The pharmaceutical industry has largely exited small-molecule antibiotic development because the economic returns are poor, so the gap is increasingly being filled by academic and government-funded efforts to find chemically distinct alternatives. AMPs are a particularly promising chemical class because their primary action (disrupting bacterial membranes) is harder for pathogens to evolve away from than enzyme-target binding, and peptide synthesis can be reprogrammed faster than small-molecule chemistry. If the methods developed here generalize, the broader contribution is a faster, cheaper, more modular antibiotic discovery workflow that can be rerun for each emerging resistant strain.

Ethical Considerations

This project sits at the intersection of three ethical concerns:

Concern	Mitigation
Dual-use biosecurity (generative tools could in principle design toxin-like peptides)	Twist orders pass SecureDNA screening; peptide pool filtered for cationic amphipathic AMP-like character; ATCC reference strains only (not clinical isolates)
Therapeutic safety (hemolysis, off-target toxicity)	PeptiVerse hemolysis, toxicity, and developability predictions on every candidate before ordering DNA; candidates flagged as likely RBC-toxic discarded
Responsible representation of preliminary data	Report negative results (e.g., the no-GamS validation run, the missing MadSBM inference code) with the same emphasis as positive results

Guiding principles: non-maleficence (do no harm, including by not enabling foreseeable misuse), responsibility (using the synthesis-screening infrastructure consciously and transparently), beneficence (adding to humanity’s antibiotic arsenal), and justice (cocktail-based on-demand antibiotic design must be paired with deployment strategies that reach low-income contexts).

Experimental Design

Peptide Panel

Group	Tool	Constructs	Purpose
A	AMP-Diffusion	AMP-D-01 through AMP-D-06	6 generalist AMPs, selected for diversity from ~2,000 candidates after physicochemical filtering
B	PepPrCLIP	PPC-FtsZ-01/02, PPC-LpxC-01/02	4 targeted binders against E. coli FtsZ (UniProt P0A9A6) and LpxC (UniProt P0A725)
C	AMP-Diffusion fallback	AMPD-ALT-01/02/03	3 additional diverse AMP-Diffusion candidates (MadSBM inference code unavailable)
D	Controls	CTRL-POS-01 (Cecropin B), CTRL-NEG-01 (scrambled), no-DNA	Gram-negative-selective positive, negative, and lysate-only references
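
The pairwise screen compares cross-method pairs with within-method pairs, so it helps to enumerate and label the pairs before plating. The sketch below uses the construct IDs from the panel table. Treating Groups A and C as the same method (both come from AMP-Diffusion) is an assumption of this example, and the hit list is hypothetical, since no hits have been measured yet.

```python
from itertools import combinations

# Construct IDs copied from the Peptide Panel table; controls are excluded.
GROUP_OF = {
    **{f"AMP-D-0{i}": "A" for i in range(1, 7)},
    "PPC-FtsZ-01": "B", "PPC-FtsZ-02": "B",
    "PPC-LpxC-01": "B", "PPC-LpxC-02": "B",
    **{f"AMPD-ALT-0{i}": "C" for i in range(1, 4)},
}

# Assumption: Groups A and C count as one method because both use AMP-Diffusion.
METHOD_OF_GROUP = {"A": "AMP-Diffusion", "B": "PepPrCLIP", "C": "AMP-Diffusion"}


def label_pairs(hits):
    """Return every unordered pair of hits tagged as cross- or within-method."""
    pairs = []
    for a, b in combinations(sorted(hits), 2):
        same = METHOD_OF_GROUP[GROUP_OF[a]] == METHOD_OF_GROUP[GROUP_OF[b]]
        pairs.append((a, b, "within-method" if same else "cross-method"))
    return pairs


# Hypothetical hit list, for illustration only.
example_hits = ["AMP-D-01", "AMP-D-04", "AMPD-ALT-02", "PPC-FtsZ-01", "PPC-LpxC-02"]
pairs = label_pairs(example_hits)
n_cross = sum(1 for _, _, kind in pairs if kind == "cross-method")
print(f"{len(pairs)} pairs, {n_cross} cross-method")
```

With five hits there are ten pairs to co-express, which is why the follow-on screen is limited to the top 5–6 hits rather than all 13 designed peptides.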

Construct Architecture

Each construct is a single linear cassette:

[5′ pad] → [T7 promoter] → [strong E. coli RBS] → [ATG] → [codon-optimized ORF] → [TAA TAA] → [T7 terminator] → [3′ pad]

All 15 constructs are ≥ 300 bp (Twist gene-fragment minimum), adapters off, codon-optimized for E. coli K-12, and screened against BsaI, BbsI, EcoRI, HindIII sites and ≥6 nt homopolymer runs.
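
The screening rules above can be sketched as a small checker. This is a minimal illustration rather than the script actually used: the recognition sequences are the standard ones for the four enzymes, the function name `screen_cassette` is hypothetical, and the two test sequences are toy strings, not project constructs.

```python
import re

# Standard recognition sites for the enzymes named in the text.
FORBIDDEN_SITES = {
    "BsaI": "GGTCTC",
    "BbsI": "GAAGAC",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def screen_cassette(seq, min_len=300, max_run=5):
    """Return a list of problems; an empty list means the cassette passes."""
    seq = seq.upper()
    problems = []
    if len(seq) < min_len:
        problems.append(f"length {len(seq)} bp is below the {min_len} bp minimum")
    rc = reverse_complement(seq)
    for enzyme, site in FORBIDDEN_SITES.items():
        # Check both strands, since BsaI and BbsI sites are not palindromic.
        if site in seq or site in rc:
            problems.append(f"{enzyme} site present")
    # A base followed by max_run or more copies is a run of max_run + 1 (>= 6 nt).
    if re.search(r"([ACGT])\1{%d,}" % max_run, seq):
        problems.append(f"homopolymer run longer than {max_run} nt")
    return problems


# Toy sequences for illustration only.
clean = "ACGTCAGT" * 40                      # 320 bp, no sites, no long runs
flawed = "GGTCTC" + "A" * 6 + "ACGT" * 10    # short, BsaI site, poly-A run
print(screen_cassette(clean))
print(screen_cassette(flawed))
```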

Timeline

Week	Phase	Activities	Status
1	Computational	AMP-Diffusion (2,000 candidates, lengths 20/25/30/35) → physicochemical filter (charge +2..+9, hydrophobic ratio 0.30..0.70, no homopolymer ≥4) → greedy max-min Hamming selection. PepPrCLIP Quickstart on FtsZ + LpxC. PeptiVerse developability check. Codon-optimize and assemble cassettes.	Complete
1	Ordering	15 Twist gene fragments (~4.9 kb total, ~$342). NEBExpress GamS (NEB #P0774S, $88). Confirm Ginkgo CFPS kit.	Complete
2	Bench	Receive fragments, resuspend to 10 ng/µL. Primary single-peptide CFPS screen: 15 constructs × 3 biological replicates × 2 organisms. 20 µL reactions, 50–100 ng linear template, GamS at 0.6 µg per 20 µL, 30 °C × 4 h. Transfer 5 µL crude lysate into 100 µL of ~5 × 10⁵ CFU/mL Mueller-Hinton culture. OD600 readout.	In progress
2	Bench validation	Pre-flight no-GamS control run for Group A vs E. coli.	Complete
3	Bench	Take top 5–6 hits; pairwise co-expression at half-dose (25–50 ng of each template per 20 µL); single-agent half-dose controls; dual-plate format.	Pending
3	Analysis	Compute FICI per pair per organism. FICI ≤ 0.5 = synergy; >0.5 to 1.0 = additive; >1.0 = indifference / antagonism. Flag cross-method vs within-method synergy.	Pending
4	Presentation	Repeat best hit/pair on a fresh CFPS batch; produce inhibition heatmaps, FICI grids, helical wheels for top hits; write final report.	Pending
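
The Week 3 synergy call can be sketched as follows. The text gives the thresholds but not the formula, so the index here is only one plausible assumption: the summed fractional inhibition of the two single agents at half-dose, divided by the fractional inhibition of the co-expressed pair. Under that assumption a pair that exactly matches the sum of its parts scores 1.0, and a pair twice as active as that sum scores 0.5. The function names and example numbers are hypothetical.

```python
import math


def fici_style_index(inhib_a_half, inhib_b_half, inhib_pair):
    """Assumed index: (I_A + I_B) / I_AB, all as fractions between 0 and 1.

    Negative single-agent inhibition (growth above control) is clipped to 0.
    A pair with no inhibition gets an infinite index, i.e. never synergy.
    """
    if inhib_pair <= 0:
        return math.inf
    singles = max(inhib_a_half, 0.0) + max(inhib_b_half, 0.0)
    return singles / inhib_pair


def classify(index):
    """Apply the thresholds from the timeline table."""
    if index <= 0.5:
        return "synergy"
    if index <= 1.0:
        return "additive"
    return "indifference / antagonism"


# Hypothetical inhibition fractions, for illustration only.
for a, b, ab in [(0.10, 0.15, 0.60), (0.20, 0.20, 0.40), (0.30, 0.30, 0.40)]:
    idx = fici_style_index(a, b, ab)
    print(f"A={a:.2f} B={b:.2f} pair={ab:.2f} -> index {idx:.2f} ({classify(idx)})")
```

Because the index is computed per organism, each pair would get two values, one for E. coli and one for B. subtilis.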

Techniques Used

Category	Technique
Protein design	AMP-Diffusion (generative protein modeling), PepPrCLIP (target-directed peptide design), PeptiVerse (developability prediction), Benchling, models and notebooks, UniProt / NCBI databases
DNA design	DNA construct design, codon optimization, Twist linear gene-fragment ordering
Cell-free systems	BL21 Star DE3 CFPS lysate, T7 RNAP transcription, GamS-protected linear template reactions
Bioproduction	Chassis selection, bacterial culturing (E. coli ATCC 25922, B. subtilis ATCC 6633), Mueller-Hinton broth
QC and analysis	OD600 growth-inhibition assay, FICI-style synergy scoring
Lab automation	Python / PyTorch inference and filtering scripts, Twist order generation
Bioethics	SecureDNA screening, PeptiVerse hemolysis filter, BSL-1 ATCC strains only

Two Techniques in Depth

Cell-Free Reactions

The entire bench phase is built around BL21 Star DE3 cell-free protein synthesis. Each Twist linear gene fragment is resuspended to 10 ng/µL and added at 50–100 ng to a 20 µL reaction containing lysate, reaction buffer, and NEBExpress GamS Nuclease Inhibitor (0.6 µg per 20 µL reaction). GamS is essential: it binds and inhibits the RecBCD exonuclease in the crude lysate, which would otherwise degrade the linear DNA template within minutes. The no-GamS validation run (below) is consistent with this: without GamS, no antimicrobial activity was detectable in the growth assay.

Reactions incubate at 30 °C for 4 h, then 5 µL of crude lysate is transferred directly into a 100 µL bacterial test well. No peptide purification step is needed because the assay tolerates lysate background, which is controlled for by the scrambled-peptide negative control (CTRL-NEG-01) and the no-DNA lysate-only reference. Using CFPS rather than plasmid-based in vivo expression compresses the design-build-test loop from weeks to days and decouples the peptide chemistry from any host-toxicity confounder, since the host that would normally express the AMP is not part of the assay.
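
The volumes quoted above imply a few quantities worth checking before setting up reactions. The following is a back-of-envelope sketch using only numbers stated in this document (the 75 µg GamS vial size comes from the budget table); the function names are illustrative.

```python
# Values taken from the text and the budget table.
STOCK_NG_PER_UL = 10      # template stock concentration
REACTION_UL = 20          # CFPS reaction volume
SPIKE_UL = 5              # crude lysate moved into each test well
CULTURE_UL = 100          # bacterial culture per test well
GAMS_NG_PER_RXN = 600     # 0.6 µg GamS per 20 µL reaction
GAMS_VIAL_NG = 75_000     # 75 µg vial (NEB #P0774S)


def template_volume_ul(template_ng):
    """Volume of 10 ng/µL stock needed to deliver the given template mass."""
    return template_ng / STOCK_NG_PER_UL


def template_fraction(template_ng):
    """Fraction of the 20 µL reaction taken up by template stock."""
    return template_volume_ul(template_ng) / REACTION_UL


def lysate_fraction_in_well():
    """Final fraction of crude lysate in the assay well after spiking."""
    return SPIKE_UL / (SPIKE_UL + CULTURE_UL)


def gams_reactions_per_vial():
    return GAMS_VIAL_NG // GAMS_NG_PER_RXN


for ng in (50, 100):
    print(f"{ng} ng template = {template_volume_ul(ng):.0f} µL "
          f"({template_fraction(ng):.0%} of the reaction)")
print(f"lysate fraction in test well: {lysate_fraction_in_well():.1%}")
print(f"GamS reactions per vial: {gams_reactions_per_vial()}")
```

At the stated stock concentration the template occupies a quarter to half of the reaction volume, which may be worth weighing against the lysate and buffer volumes the kit requires.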

Setting up CFPS reactions at the bench.

Transferring template and reagents into reaction strips.

Incubator with rack ready for the 30 °C × 4 h CFPS step.

Spiking crude lysate into the bacterial test plate.

Protein Design (Generative Peptide Modeling)

Three AI tools were used in this project:

Tool	Role	Output
AMP-Diffusion (Torres et al., Cell Biomaterials 2025)	Latent-space diffusion model conditioned on length; produces broad-distribution AMP candidates	~2,000 candidates across lengths 20/25/30/35, filtered by net charge (+2..+9), hydrophobic ratio (0.30..0.70), homopolymer runs (<4). Top 6 most diverse selected for Group A + 3 fallback for Group C.
PepPrCLIP (Bhat et al., Science Advances 2025)	CLIP-based contrastive scorer that ranks a 100K Gaussian-perturbed peptide pool against a target	E. coli FtsZ (P0A9A6) and LpxC (P0A725) scored; top 2 diverse binders per target picked for Group B.
PeptiVerse (Zhang et al., 2026)	Developability prediction (hemolysis, solubility, toxicity, permeability)	Run on all 13 designed peptides via the Gradio API. Hemolysis score < 0.5 required for every shipped construct.
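
The AMP-Diffusion filtering and diversity selection can be sketched as below. Several details are assumptions, since the text gives only the thresholds: net charge is a simple K/R minus D/E count, the hydrophobic set is AVILMFWY, unequal-length peptides are compared by counting the length difference as mismatches, and the greedy selection is seeded with the alphabetically first candidate. The peptides are made-up toy strings, not project sequences.

```python
import re

HYDROPHOBIC = set("AVILMFWY")  # assumed residue set; definitions vary


def net_charge(seq):
    """Crude net charge at neutral pH: +1 per K/R, -1 per D/E."""
    return sum(seq.count(x) for x in "KR") - sum(seq.count(x) for x in "DE")


def hydrophobic_ratio(seq):
    return sum(aa in HYDROPHOBIC for aa in seq) / len(seq)


def passes_filter(seq):
    """Charge +2..+9, hydrophobic ratio 0.30..0.70, no run of 4+ identical residues."""
    return (2 <= net_charge(seq) <= 9
            and 0.30 <= hydrophobic_ratio(seq) <= 0.70
            and re.search(r"(.)\1{3,}", seq) is None)


def hamming(a, b):
    """Hamming distance; extra residues in the longer peptide count as mismatches."""
    short, long_ = sorted((a, b), key=len)
    return sum(x != y for x, y in zip(short, long_)) + len(long_) - len(short)


def greedy_max_min(candidates, k):
    """Repeatedly add the candidate farthest from its nearest chosen neighbour."""
    pool = sorted(set(candidates))
    if not pool:
        return []
    chosen = [pool.pop(0)]  # arbitrary but deterministic seed
    while pool and len(chosen) < k:
        best = max(pool, key=lambda s: min(hamming(s, c) for c in chosen))
        chosen.append(best)
        pool.remove(best)
    return chosen


# Toy peptides for illustration only.
toy = [
    "KWLKKALGRLFKAVLKGIAK",  # passes
    "KWLKKALGRLFKAVLKGIAR",  # passes, near-duplicate of the first
    "RFIRKVLRWAGKVIKRFLGA",  # passes
    "DEDEKLLAGSDEELLAGSDE",  # fails: net negative charge
    "KKKKLLAVGRLFKAVLKGIA",  # fails: KKKK run
    "GSGSGKGSGRGSGKGSGNGS",  # fails: no hydrophobic residues
]
passing = [p for p in toy if passes_filter(p)]
picked = greedy_max_min(passing, 2)
print(picked)
```

The max-min criterion drops the near-duplicate, which is the behaviour the diversity step is meant to provide.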

All peptide ORFs were reverse-translated using a high-expression E. coli K-12 codon table, screened against common restriction sites and homopolymer runs, then padded to at least 300 bp (the Twist gene-fragment minimum).
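
A minimal sketch of the reverse-translation step, assuming a one-codon-per-residue table built from codons commonly listed as high-frequency in E. coli K-12. The actual table used for the order is not given here, so `PREFERRED_CODON` is an assumption, and a real pipeline would typically vary codons to avoid repeats and forbidden sites.

```python
# Assumed table: one frequently used E. coli K-12 codon per amino acid.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def reverse_translate(peptide):
    """Return ATG + one codon per residue + the TAA TAA double stop."""
    return "ATG" + "".join(PREFERRED_CODON[aa] for aa in peptide.upper()) + "TAATAA"


def padding_needed(cassette_len, minimum=300):
    """Bases of 5'/3' pad required to reach the fragment minimum."""
    return max(0, minimum - cassette_len)


# ORF length for the four peptide lengths sampled from AMP-Diffusion.
for n_residues in (20, 25, 30, 35):
    orf_len = len(reverse_translate("K" * n_residues))
    print(f"{n_residues} aa -> {orf_len} nt ORF including start and stops")
```

Even the longest 35-residue peptide yields an ORF of only 114 nt, so the promoter, RBS, terminator, and pads account for most of each 300 bp fragment.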

Industry Council Partners

Company	Contribution
Twist Bioscience	Supplier of the 15 linear gene fragments (the physical DNA inputs to the project)
Ginkgo Bioworks	Supplier of the BL21 Star DE3 CFPS lysate used to express every peptide
New England Biolabs	Supplier of NEBExpress GamS Nuclease Inhibitor (#P0774S), required to protect linear templates from RecBCD
ATCC	Supplier of E. coli ATCC 25922 and B. subtilis ATCC 6633 test strains
SecureDNA	Sequence screening pipeline that the Twist order passed through before synthesis

Results

What Has Been Validated

Two aspects of the project are validated so far:

Computational design pipeline. All three AI tools were stood up successfully and produced sequences that passed physicochemical and developability filters; the resulting 15 codon-optimized Twist constructs are documented in constructs/construct_details.csv and assembled in twist_orders/twist_order_batch1_FINAL.csv.
Bench workflow’s interpretive logic. Group A AMP peptides were expressed in BL21 Star DE3 lysate without GamS and assayed against E. coli ATCC 25922 as a negative-by-design control. The expectation was that the linear DNA would be chewed up by RecBCD before useful amounts of peptide could accumulate, so no antimicrobial activity should be visible above scrambled control. That expectation was met. The result is consistent with GamS being required in subsequent runs, and it demonstrates that the assay returns a null result when no functional peptide is expected; the matched +GamS run is still needed to show the converse.

No-GamS Validation Run, Quantitative Data

Wells A1 through F1 correspond to AMP-D-01 through AMP-D-06 spiked into E. coli ATCC 25922 in Mueller-Hinton broth at ~5 × 10⁵ CFU/mL starting density. OD600 read every 30 min for 7.5 h on the plate reader.

Figure 1. OD600 growth curves over 7.5 h for the no-GamS Group A validation run. Wells A1 through F1 correspond to AMP-D-01 through AMP-D-06 in CFPS reactions added to E. coli ATCC 25922 in Mueller-Hinton broth. All six wells show normal sigmoidal growth with no inhibition relative to one another, consistent with the linear DNA template being degraded by RecBCD before functional peptide accumulates when GamS protection is absent.

Phase	Time	OD600
Lag	0–2 h	~0.10 (flat)
Exponential	2–5 h	Rising from ~0.10 to ~0.45
Stationary	5–7.5 h	Endpoint OD600 ≈ 0.55 (well F1) to ≈ 0.78 (well E1); wells A1, B1, C1, D1 clustered tightly between ≈ 0.55 and 0.61

The growth curves overlap heavily through lag and early-exponential phases and diverge only modestly in stationary phase, the normal pattern for untreated wells of the same strain. The key interpretive point is that no peptide well showed reduced E. coli growth relative to the others. The one apparent outlier (E1) is higher, not lower, which is the opposite of an antimicrobial signal. This is the expected null result, and it is consistent with the linear template being degraded by RecBCD before functional peptide accumulates: no antimicrobial activity is detectable even though the bacteria are clearly viable and growing normally. The same plate reader, lysate prep, Twist fragments, and E. coli culture will be used for the +GamS run, so growth suppression observed there can be attributed with more confidence to peptide expression rather than to assay artifact.

Quantitative Expectations for the +GamS Phase

Metric	Target
Group A hits (≥50% inhibition vs scrambled)	≥ 2 of 6 against at least one organism
Group B hits (≥50% inhibition vs scrambled)	≥ 1 of 4 against E. coli (bias expected, since binders are designed against E. coli targets)
Positive control (Cecropin B)	Strong inhibition of E. coli, weak inhibition of B. subtilis (Gram-negative-selective)
Negative control (scrambled)	OD600 indistinguishable from lysate-only
Synergy screen	≥ 1 FICI ≤ 0.5 combination, with cross-method (Group A × Group B) pairs over-represented vs within-method pairs

Published CFPS-AMP studies typically report 30–70% growth inhibition for active AMPs versus scrambled controls.
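
The "≥50% inhibition vs scrambled" hit criterion can be written out explicitly. This is a sketch under the assumption that inhibition is computed from blank-subtracted endpoint OD600 relative to the scrambled control; the function names and OD values are hypothetical.

```python
def percent_inhibition(od_sample, od_scrambled, od_blank=0.0):
    """Percent growth inhibition relative to the scrambled-control well.

    od_blank is the OD600 of sterile medium; subtracting it leaves only
    the signal attributable to bacterial growth.
    """
    growth_ref = od_scrambled - od_blank
    if growth_ref <= 0:
        raise ValueError("scrambled control shows no growth above blank")
    return 100.0 * (1.0 - (od_sample - od_blank) / growth_ref)


def is_hit(od_sample, od_scrambled, od_blank=0.0, threshold=50.0):
    """Apply the >= 50% inhibition criterion from the expectations table."""
    return percent_inhibition(od_sample, od_scrambled, od_blank) >= threshold


# Hypothetical endpoint readings, for illustration only.
print(round(percent_inhibition(0.30, 0.60, od_blank=0.10), 1))  # strong inhibition
print(round(percent_inhibition(0.55, 0.60, od_blank=0.10), 1))  # weak inhibition
```

A negative value means the well grew more than the scrambled control, as well E1 did relative to its neighbours in the no-GamS run.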

Challenges and Limitations

Challenge	Status	Mitigation
MadSBM inference code not public	Encountered	Model weights are on HuggingFace (ChatterjeeLab/MadSBM) but no inference code or usage example was published. Rather than reverse-engineer a Schrödinger-bridge inference loop, fell back to a parallel AMP-Diffusion run maximizing Hamming distance from Group A picks. Loses the original spectrum-broadening hypothesis but preserves the cocktail-synergy hypothesis, which is the project’s main scientific question.
Linear template degradation by RecBCD in CFPS lysate	Mitigated	NEBExpress GamS Nuclease Inhibitor (NEB #P0774S) added at 0.6 µg per 20 µL reaction. Supported by the deliberate no-GamS Group A run (Figure 1), which showed no activity without protection; the +GamS screen is the matching positive test.
Modest yield of short peptides in CFPS	Anticipated	Strong E. coli RBS + codon-optimized ORFs + 3 biological replicates per construct. Fallback: scale CFPS reaction volume from 20 µL to 50 µL.
15-peptide pilot, generalizability	Acknowledged	Framed explicitly as a feasibility study, not a clinical pre-screen. Cross-method vs within-method pairs are built-in controls, making the central synergy assumption directly testable.

Supply List and Budget

Item	Est. cost (USD)	Notes
Twist Gene Fragments (15 × 302–350 bp, adapters off)	$342–$440	~$0.07/bp; ~4.9 kb total. Ordered.
NEBExpress GamS Nuclease Inhibitor (#P0774S, 75 µg)	$88	~0.6 µg per 20 µL reaction; required for linear DNA protection
Ginkgo BL21 Star DE3 CFPS kit (~100 reactions)	$200–$500	Provided through HTGAA / Ginkgo. Confirmed available.
E. coli ATCC 25922 + B. subtilis ATCC 6633 strains	$80–$140	ATCC standard reference strains
Mueller-Hinton broth + agar plates	$60–$100	Standard microbiology consumables
96-well clear flat-bottom plates (×2 per screen)	$40–$80	One plate per assay organism
Filter tips, microcentrifuge tubes, pipettes	$50–$120	Shared lab stock
GPU compute (L40S × ~20 h for all AI runs)	$30–$60	Cloud GPU credits
TOTAL (estimated range)	$890–$1,528	Within the $2,000 project ceiling
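
As a quick arithmetic check, the totals row can be reproduced from the line items. The figures are copied from the table above; the shortened item names are for readability only.

```python
# (low, high) estimates in USD, copied from the budget table.
BUDGET = {
    "Twist gene fragments": (342, 440),
    "NEBExpress GamS": (88, 88),
    "Ginkgo CFPS kit": (200, 500),
    "ATCC strains": (80, 140),
    "Mueller-Hinton broth + agar": (60, 100),
    "96-well plates": (40, 80),
    "Tips, tubes, pipettes": (50, 120),
    "GPU compute": (30, 60),
}
CEILING = 2000

low_total = sum(low for low, _ in BUDGET.values())
high_total = sum(high for _, high in BUDGET.values())
print(f"Estimated range: ${low_total:,} to ${high_total:,}")
print(f"Headroom at the high estimate: ${CEILING - high_total:,}")
```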

References

Reference	Use in this project
Torres, M. D. T., Mohanty, A., Chatterjee, P., de la Fuente-Nunez, C., et al. (2025). AMP-Diffusion: Generative AI design of antimicrobial peptides. Cell Biomaterials.	Generative backbone for Groups A and C
Bhat, S., Palepu, K., Chatterjee, P., et al. (2025). PepPrCLIP: Target-directed de novo peptide binder design via contrastive language-image pretraining for proteins. Science Advances.	Target-binder generation for Group B
Goel, V. & Chatterjee, P. (2026). Minimal-Action Discrete Schrödinger Bridge Matching (MadSBM). arXiv:2601.22408 / ICLR 2026. Weights on HuggingFace: ChatterjeeLab/MadSBM; no inference code available.	Originally intended for Group C; replaced by AMP-Diffusion fallback
Zhang, A., Chatterjee, P., et al. (2026). PeptiVerse: A unified platform for therapeutic peptide developability prediction. bioRxiv, DOI: 10.64898/2025.12.31.697180. Hosted at huggingface.co/spaces/ChatterjeeLab/PeptiVerse.	Hemolysis, solubility, toxicity filter
Sun, Z. Z., Hayes, C. A., Shin, J., et al. (2013). Protocols for implementing an Escherichia coli-based TX-TL cell-free expression system for synthetic biology. JoVE 79, e50762.	GamS-based linear DNA protection protocol
World Health Organization (2024). Global Antimicrobial Resistance Surveillance Report.	AMR mortality and forecasting figures
NEB Product Page: NEBExpress GamS Nuclease Inhibitor (#P0774S). www.neb.com	GamS dosing (~0.6 µg per 20 µL CFPS reaction)
UniProt: P0A9A6 (FtsZ, E. coli K-12) and P0A725 (LpxC, E. coli K-12)	PepPrCLIP target inputs
ATCC: ATCC 25922 (E. coli) and ATCC 6633 (Bacillus subtilis subsp. spizizenii)	Test strains for dual-organism antimicrobial assay
Twist Bioscience Gene Fragment specifications	300 bp minimum, adapters off
