Week 5 HW: Protein Design Part 2

Part A: SOD1 A4V Peptide Binder Design

Superoxide dismutase 1 (SOD1) is a cytosolic antioxidant enzyme that converts superoxide radicals into hydrogen peroxide and oxygen. In its native state it forms a stable homodimer and binds copper and zinc. Mutations in SOD1 cause familial Amyotrophic Lateral Sclerosis (ALS). The A4V mutation (alanine to valine at residue 4) leads to one of the most aggressive forms of the disease. The mutation subtly destabilizes the N-terminus, perturbs folding energetics, and promotes toxic aggregation.

The goal here is to design short peptides that bind mutant SOD1, then decide which ones are worth advancing toward therapy, using three models: PepMLM, PeptiVerse, and moPPIt.

Generate four 12-mer binders with PepMLM and record perplexity scores

Four 12-residue peptides were generated using PepMLM-650M conditioned on the SOD1 A4V mutant sequence, alongside the known binder FLYRWLPSRRGG.

Peptide IDSequenceSourcePerplexity
1WRYYVAAVRWGEgenerated21.23
2WRSPPVGVEHKAgenerated22.21
3WLYYPVGAELKEgenerated16.06
4WHSGVVVLALKAgenerated13.84
5FLYRWLPSRRGGknown binder20.64

Lower pseudo-perplexity indicates higher model confidence. Peptide 4 (WHSGVVVLALKA, PPL = 13.84) shows the highest PepMLM confidence, followed by Peptide 3 (WLYYPVGAELKE, PPL = 16.06). Both outperform the known binder (PPL = 20.64), suggesting the model considers them plausible binders. All four generated peptides begin with Trp (W), suggesting a strong N-terminal preference for aromatic anchoring to SOD1.

Evaluate binders with AlphaFold3

All five peptide–SOD1 complexes were submitted to AlphaFold Server (fold date: 2026-03-09). Each job modeled the SOD1 A4V monomer (154 residues, chain A) with one 12-mer peptide (chain B). Results are stored in peptides/af3_results/.

PeptideipTM (best)Binding LocationSurface/BuriedNotes
WRYYVAAVRWGE0.31Dimer interface / β-barrelSurface-boundPAE 9.07 Å, moderate confidence
WRSPPVGVEHKA0.36Extended surface grooveSurface-boundSecond-best ipTM, extended conformation
WLYYPVGAELKE0.24β-barrel regionSurface-boundPAE 10.81 Å, lowest confidence
WHSGVVVLALKA0.48Dimer interface pocketPartially buriedBest model: PAE 4.97 Å, well-defined binding
FLYRWLPSRRGG0.31β-barrel / dimer interfaceSurface-boundKnown binder, PAE 8.60 Å

ipTM values range from 0.24 to 0.48 across the five complexes. While all fall below the 0.6 threshold typically considered high-confidence for protein–peptide interactions, they show meaningful differentiation among candidates.

Peptide 4 (WHSGVVVLALKA, ipTM = 0.48) clearly stands out: its ipTM exceeds the known binder FLYRWLPSRRGG (0.31) by 55%, and its PAE of 4.97 Å is roughly half that of the next-best model, indicating a well-resolved binding pose at the dimer interface pocket. It is also the only one predicted to be partially buried, suggesting tighter engagement.

Peptide 2 (WRSPPVGVEHKA, ipTM = 0.36) ranks second structurally, adopting an extended conformation along a surface groove. Peptides 1 and 5 tie at ipTM = 0.31, with Peptide 1 localizing to the dimer interface / β-barrel region and Peptide 5 (the known binder) similarly positioned. Peptide 3 (WLYYPVGAELKE, ipTM = 0.24) has the weakest structural prediction despite its moderate PepMLM perplexity (16.06), with a high PAE (10.81 Å) indicating uncertain binding geometry.

Notably, none of the five peptides bind near the N-terminus where the A4V mutation resides (position 4). All predicted binding sites localize to the dimer interface or β-barrel region, suggesting these peptides may act through general fold stabilization or dimer modulation rather than direct mutation-site engagement.

Evaluate therapeutic properties with PeptiVerse

PeptideSourcePPLBinding Affinity (pKd)SolubilityHemolysisNet Charge (pH 7)MW (Da)
WRYYVAAVRWGEgenerated21.237.021 (Medium)1.0000.093+0.771555.7
WRSPPVGVEHKAgenerated22.214.826 (Weak)1.0000.013+0.851362.5
WLYYPVGAELKEgenerated16.065.722 (Weak)1.0000.033-1.231467.7
WHSGVVVLALKAgenerated13.846.055 (Weak)1.0000.079+0.851279.5
FLYRWLPSRRGGknown binder20.645.968 (Weak)1.0000.047+2.761507.7

ipTM vs. PeptiVerse affinity. AlphaFold3 structural confidence and PeptiVerse-predicted affinity disagree on the top candidate. Peptide 4 (WHSGVVVLALKA) dominates structurally (ipTM = 0.48, PAE = 4.97 Å) but has only moderate predicted affinity (pKd = 6.055, “Weak”). Conversely, Peptide 1 (WRYYVAAVRWGE) has the best PeptiVerse affinity (pKd = 7.021, “Medium binding”) but an unremarkable ipTM of 0.31. This divergence likely reflects the fact that PeptiVerse predicts binding strength from sequence features while AF3 models 3D structural complementarity. The two views are complementary.

PepMLM perplexity vs. ipTM. These two metrics show better agreement. Peptide 4 ranks first in both (PPL = 13.84, ipTM = 0.48), supporting its candidacy from two independent perspectives. The correlation is imperfect: Peptide 3 ranks second by PepMLM (PPL = 16.06) but last by AF3 (ipTM = 0.24), so low perplexity does not guarantee a well-resolved pose.

Therapeutic safety. All five peptides are predicted to be fully soluble (probability = 1.000) and non-hemolytic (all below 0.10). No candidates present safety red flags. Peptide 2 (WRSPPVGVEHKA) has the lowest hemolysis risk (0.013) but also the weakest binding (pKd = 4.826).

Physicochemical properties. Net charges range from -1.23 to +2.76 at pH 7, all within reasonable bounds for cell-penetrating peptides. The known binder has the highest positive charge (+2.76), consistent with its arginine-rich C-terminus. Molecular weights are in the 1280–1556 Da range, typical for 12-mers.

Top candidate to advance: Peptide 4 (WHSGVVVLALKA), with Peptide 1 (WRYYVAAVRWGE) as a strong alternative.

Peptide 4 has the best PepMLM confidence (PPL = 13.84) and the best AlphaFold3 structural prediction by a wide margin (ipTM = 0.48, PAE = 4.97 Å). Two independent methods (sequence-based PepMLM and structure-based AF3) agree that this peptide has the most credible interaction with SOD1. Its predicted binding at the dimer interface pocket, where it is partially buried, suggests a geometrically specific interaction rather than nonspecific surface adhesion. While its PeptiVerse-predicted affinity is moderate (pKd = 6.055), the structural evidence from AF3 provides stronger support for a real binding event. It is fully soluble, non-hemolytic (0.079), and has the lowest molecular weight (1279.5 Da) among the candidates.

Peptide 1 (WRYYVAAVRWGE) remains a compelling alternative: it has the strongest predicted binding affinity (pKd = 7.021, the only “Medium binding” peptide), excellent safety properties, and a moderate ipTM (0.31). If PeptiVerse affinity predictions are weighted more heavily than AF3 structural models, Peptide 1 would be the preferred choice.

For experimental validation, both peptides merit testing: Peptide 4 as the structurally favored lead, Peptide 1 as the affinity-favored alternative.

Generate optimized peptides with moPPIt

The moPPIt model (discrete flow matching with multi-objective gradient guidance) was used to generate 11 peptides targeting the SOD1 A4V mutant. Target motifs were set to residues 1–15 (N-terminus, near the A4V mutation) and residues 49–54 (dimer interface near the EFGDN loop). Peptide length was 12 amino acids. Objective weights were [1, 1, 1, 4, 4, 2], so affinity and motif specificity were prioritized 4×. Results are in peptides/moPPIt/sod1_moppit_results.csv.

PeptideHemolysisNon-FoulingHalf-LifeAffinityMotifSpecificity
QKRRLLSLPVFK0.9020.6020.806.000.4780.622
YPPCAYYWQATD0.9290.5873.427.100.5630.686
SIVKTGVTFLTK0.9200.1861.816.380.5840.699
PPLIHRWYAATM0.9220.3213.496.300.4440.660
EEQVVKRIKVGP0.9530.7360.686.540.5800.679
CVQNKKPTFLII0.9110.4971.566.140.6680.647
LKKKIREFLKLG0.9520.5611.166.190.5120.660
YDPLPCAWTPTH0.9350.7262.696.570.4820.699
KPFVFFAKTEIM0.9320.1301.416.250.5890.538
PTWVIETKKKFR0.9790.6112.305.730.6090.667
GPKGWTGKQCFI0.8880.7112.077.000.4740.635

Hemolysis: probability of being non-hemolytic (higher is safer). Affinity: predicted binding score (higher is stronger). Motif: fraction of binding at target residues (higher means more on-target).

All 11 peptides show high predicted hemolysis scores (0.89–0.98), indicating low hemolytic risk. Affinity predictions span 5.73 to 7.10, with YPPCAYYWQATD (7.10) and GPKGWTGKQCFI (7.00) showing the strongest predicted binding. Half-lives vary considerably (0.68–3.49 hours), with PPLIHRWYAATM (3.49 h) and YPPCAYYWQATD (3.42 h) the most stable.

Top moPPIt candidates

CategoryPeptideHighlights
Highest affinityYPPCAYYWQATDAffinity 7.10, half-life 3.42, specificity 0.686
Best motif targetingCVQNKKPTFLIIStrongest on-target binding (motif 0.668)
Best therapeutic profileEEQVVKRIKVGPHighest non-hemolytic (0.953), best non-fouling (0.736), strong affinity (6.54)
Best overall balanceYDPLPCAWTPTHAffinity 6.57, non-fouling 0.726, half-life 2.69, specificity 0.699

Comparison to PepMLM peptides

  • Design philosophy. PepMLM generates peptides via masked language modeling conditioned on the target sequence: it learns what peptide “looks right” next to SOD1 based on evolutionary patterns. moPPIt uses discrete flow matching with explicit multi-objective gradient guidance: it actively optimizes for binding affinity, motif specificity, and therapeutic properties simultaneously.
  • Binding specificity. PepMLM peptides are generated without any notion of where on SOD1 they should bind. moPPIt peptides are explicitly guided toward residues 1–15 and 49–54 via the BindEvaluator motif score, with a specificity penalty that discourages off-target binding.
  • Sequence composition. PepMLM peptides all start with W (tryptophan), suggesting a strong bias for aromatic N-terminal anchors. moPPIt peptides are more diverse: no single residue dominates, and compositions vary based on the objective trade-offs the sampler explores.
  • Affinity. moPPIt’s highest-affinity peptide (YPPCAYYWQATD, 7.10) is comparable to PepMLM’s best (WRYYVAAVRWGE, 7.02 via PeptiVerse). However, moPPIt consistently produces peptides in the 6.0–7.1 range, while PepMLM has more variance (4.8–7.0), suggesting moPPIt’s affinity guidance is effective.
  • Solubility tradeoff. PepMLM peptides all have perfect predicted solubility (1.000). Some moPPIt peptides sacrifice solubility (SIVKTGVTFLTK non-fouling = 0.186, KPFVFFAKTEIM = 0.130) in favor of higher affinity. This reflects the multi-objective nature: aggressive affinity optimization can push sequences toward hydrophobic compositions.

Evaluation before clinical advancement

In silico

  • Molecular dynamics simulations of peptide–SOD1 complexes (starting from AF3 structures) to assess binding stability.
  • Binding free energy calculations (MM/PBSA or MM/GBSA) for ranking.
  • Aggregation prediction (AGGRESCAN, TANGO).

In vitro

  • Surface plasmon resonance (SPR) or isothermal titration calorimetry (ITC) to measure actual Kₓ against A4V SOD1.
  • Hemolysis assay with human red blood cells.
  • Serum stability to validate half-life predictions.
  • ThT fluorescence and aggregation assays to test whether the peptide inhibits A4V SOD1 aggregation.

Cell-based

  • Cell viability (MTT/MTS) to confirm non-cytotoxicity.
  • Cell-penetrating peptide assessment, since SOD1 is cytosolic.
  • Co-immunoprecipitation to confirm peptide–SOD1 interaction in cellular context.

In vivo preclinical

  • Pharmacokinetics (bioavailability, clearance, tissue distribution).
  • Efficacy testing in the SOD1-G93A transgenic ALS mouse model.
  • Standard safety pharmacology panel.

The key bottleneck for peptide therapeutics is typically delivery (cell penetration plus proteolytic stability), not binding affinity. Strategies to address this include D-amino acid substitution, cyclization, stapling, and conjugation to cell-penetrating peptide motifs.


Part B: BRD4 Drug Discovery with Boltz Lab

Tutorial designed by Geoffrey Smith, Boltz Lab.

Target: BRD4 (Bromodomain-containing protein 4), an epigenetic reader and validated oncology target. BRD4 is a member of the BET (Bromodomain and Extra-Terminal) family. It recognizes acetylated lysine residues on histone tails and recruits transcriptional machinery to gene promoters, driving expression of oncogenes including c-Myc. Dysregulated BRD4 activity is implicated in haematological malignancies, solid tumours, and inflammatory disease.

Reference: Filippakopoulos P. et al. Selective inhibition of BET bromodomains. Nature 468, 1067–1073 (2010). Crystal structure: PDB 3MXF.

Compound progression (Hit → Lead → Candidate)

StageCompoundSMILES
HitStripped Back CoreCC1C2C(=C(SC=2NCCN=1)C)C
LeadTriazole + AcidO=C(C[C@@H]1N=C(C)C2C(=C(SC=2N2C1=NN=C2C)C)C)O
Candidate(+)-JQ1O=C(C[C@H]1C2=NN=C(N2C3=C(C(C4=CC=C(C=C4)Cl)=N1)C(C)=C(S3)C)C)OC(C)(C)C

Boltz-2 metrics

MetricRangeMeaningTrust threshold
Binding Confidence0–1How confidently Boltz-2 places the ligand in the binding site.> 0.7 reliable; > 0.8 high confidence
Optimization Score0–1Relative affinity ranking for a congeneric series.Use for relative ranking
Structure Confidence0–1Confidence in the predicted structure.> 0.8 high confidence

All three metrics need to be high to trust a prediction.

Run Boltz-2 predictions for the Hit, Lead, and JQ1

CompoundBinding ConfidenceOptimization ScoreStructure Confidence
Hit0.430.220.93
Lead0.740.270.98
JQ10.960.440.98

Does Binding Confidence increase from hit to clinical candidate?

Yes, Binding Confidence increases monotonically across the drug discovery progression: Hit (0.43) → Lead (0.74) → JQ1 (0.96). This is exactly what we would expect: each optimization stage adds chemical features that improve shape complementarity and specific interactions with the BRD4 acetyl-lysine binding pocket. The Hit (stripped back core) contains only the minimal thienodiazepine scaffold with no substituents to make specific contacts, so Boltz-2 has low confidence in placing it. The Lead adds a triazole and carboxylic acid that mimic the acetyl-lysine pharmacophore, roughly doubling the Binding Confidence. JQ1 adds the chlorophenyl group and tert-butyl ester, filling the WPF shelf and ZA channel of the bromodomain pocket and pushing Binding Confidence to 0.96, well above the 0.8 high-confidence threshold.

Structure Confidence is high for all three compounds (0.93–0.98), indicating that the protein structure itself is well-predicted regardless of the ligand. This makes sense since BRD4 is a well-characterized, rigid globular domain.

Inspect the predicted binding pose for JQ1

JQ1 scores 0.96 Binding Confidence with 0.98 Structure Confidence, indicating a highly reliable predicted pose. Key binding interactions, expected from the known crystal structure (PDB 3MXF):

  • The triazole ring and methyl group occupy the acetyl-lysine recognition site, forming a hydrogen bond with the conserved asparagine (N140) in the BC loop. This is the hallmark interaction of BET bromodomain inhibitors.
  • The chlorophenyl ring packs against the WPF shelf (W81, P82, F83), providing hydrophobic anchoring.
  • The tert-butyl ester group extends into the ZA channel, contributing additional hydrophobic contacts and shape complementarity.
  • The thienodiazepine core sits at the mouth of the pocket, bridging the ZA and BC loops.

Compare the Optimization Scores

The Optimization Scores track the same progression: Hit (0.22) → Lead (0.27) → JQ1 (0.44). JQ1’s score (0.44) is roughly 63% higher than the Lead’s (0.27), reflecting the substantial affinity gain from adding the chlorophenyl and tert-butyl ester groups. The Hit-to-Lead jump is more modest (0.22 → 0.27, ~23%), consistent with the triazole and acid adding some specific contacts but not yet achieving full pocket occupancy.

Using the categorization thresholds, JQ1 falls squarely in the high confidence binder range (Binding Confidence > 0.80, Opt. Score > 0.40). The Lead sits at moderate confidence (0.74, 0.27, both within the 0.65–0.80 and 0.25–0.40 ranges). The Hit falls in the low confidence / non-binder category (0.43, 0.22), consistent with its role as an unoptimized screening hit.

1K virtual screen

A design project was created in Boltz Lab using PDB 3MXF (BRD4 bromodomain 1 co-crystallized with JQ1) as the structural template. JQ1 was specified as the molecular probe to define the acetyl-lysine binding pocket. The platform automatically detected the binding site from the JQ1 co-crystal pose, identifying key pocket residues: the WPF shelf (W81, P82, F83), BC loop (N140), and ZA channel. Project ID: VS-BRD4WO-5P52.

A virtual screen of 993 AI-designed small molecules was generated from the Enamine REAL chemical space with Drug-Like filtering. All compounds were scored by Boltz-2 against the BRD4 binding pocket.

Score distributions across the library

MetricMinMaxMean
Binding Confidence0.070.850.30
Optimization Score0.000.480.23
Structure Confidence> 0.84> 0.96~0.92

The vast majority of compounds cluster at low Binding Confidence (< 0.40), consistent with the expectation that random chemical space sampling yields few genuine binders. Structure Confidence remains high throughout (> 0.84), indicating that the protein structure predictions are reliable regardless of ligand quality.

Top 5 compounds by Binding Confidence

RankIDBinding ConfidenceOpt. ScoreSMILES
1SM-AQ8GBD730.850.35Cc1cc(-c2cc(C)c(Cl)c(C)c2)cc(C)c1O
2SM-VP5CRXFK0.840.25CN1Cc2c(NC(=O)c3cccnc3)cccc2C1=O
3SM-2MZLAGQT0.800.48Cc1nc2c(cc1C(=O)Nc1cnn(CC(C)(C)O)c1C)c(C)nn2C
4SM-G95H15CR0.760.20CCC(=O)N(C)c1ccc2c(c1)CN(C)C2
5SM-1ASUYQAA0.740.34CCN(C(=O)C(C)C)c1ccc(Cl)cc1F

Categorize the results and benchmark against JQ1

CategoryCriteriaCount% of library
High confidence bindersBC > 0.80, OS > 0.4010.1%
Moderate confidenceBC 0.65–0.80, OS 0.25–0.40131.3%
Low confidence / non-bindersBC < 0.65, OS < 0.2597998.6%

The reference compounds validate the scoring system:

CompoundCategory
JQ1High confidence binder (0.96 / 0.44)
LeadModerate confidence (0.74 / 0.27)
HitLow confidence (0.43 / 0.22)

The sole high-confidence AI hit:

IDBinding ConfidenceOpt. ScoreStructure ConfidenceSMILES
SM-2MZLAGQT0.800.480.92Cc1nc2c(cc1C(=O)Nc1cnn(CC(C)(C)O)c1C)c(C)nn2C

SM-2MZLAGQT contains a pyridazine-pyrazole core with multiple methyl groups and an amide linker to a neopentyl alcohol. Structurally distinct from JQ1, but it shares nitrogen-rich heterocyclic character.

How does JQ1 rank alongside the AI-generated library?

JQ1 scores BC = 0.96, OS = 0.44, substantially outperforming every AI-generated compound on Binding Confidence. By BC alone, JQ1 ranks #1 by a wide margin (0.96 vs. the next-best AI compound SM-AQ8GBD73 at 0.85). No AI-generated molecule approaches JQ1’s level of binding confidence.

However, SM-2MZLAGQT (the only high-confidence AI hit) achieves a higher Optimization Score (0.48) than JQ1 (0.44). This is notable: the Optimization Score reflects relative affinity ranking within a congeneric series, and SM-2MZLAGQT’s higher OS suggests it may achieve comparable or slightly better binding affinity despite lower structural confidence in its predicted pose.

CompoundBC RankOS RankBCOS
JQ1 (benchmark)120.960.44
SM-2MZLAGQT410.800.48
SM-AQ8GBD73260.850.35
SM-VP5CRXFK30.840.25

JQ1 is not the top compound by Optimization Score, but it dominates Binding Confidence. This is expected: JQ1 is a highly optimized clinical candidate with known high-affinity binding to BRD4, whereas the AI compounds are generated from general chemical space without iterative medicinal chemistry optimization.

How do the top scoring binders compare in binding pose to JQ1?

The top-scoring AI compound SM-2MZLAGQT (Cc1nc2c(cc1C(=O)Nc1cnn(CC(C)(C)O)c1C)c(C)nn2C) contains a fused pyridazine-pyrazole bicyclic core decorated with methyl groups and an amide-linked pyrazole bearing a neopentyl alcohol. Compared with JQ1’s thienodiazepine scaffold:

Shared pharmacophoric features

  • Both molecules feature nitrogen-rich heterocyclic cores capable of occupying the acetyl-lysine recognition site and forming hydrogen bonds with N140.
  • Multiple methyl substituents in both compounds provide hydrophobic contacts with the pocket walls.
  • Both have molecular weights in the drug-like range (SM-2MZLAGQT ~314 Da vs. JQ1 ~457 Da).

Key structural differences

  • JQ1 uses a thienodiazepine (7-membered ring with sulfur), while SM-2MZLAGQT uses a pyridazine-pyrazole (two fused 6+5 rings with nitrogen).
  • JQ1’s chlorophenyl group fills the WPF shelf. SM-2MZLAGQT lacks an equivalent aromatic group, which may explain its lower Binding Confidence.
  • JQ1’s tert-butyl ester extends into the ZA channel; SM-2MZLAGQT’s neopentyl alcohol (CC(C)(C)O) may partially mimic this interaction but with a hydroxyl instead of an ester.
  • SM-2MZLAGQT is more compact and lacks the extended hydrophobic features that give JQ1 its high shape complementarity.

The second-highest BC compound, SM-AQ8GBD73 (Cc1cc(-c2cc(C)c(Cl)c(C)c2)cc(C)c1O), is a simple biaryl phenol with chlorine and methyl substitution, structurally much simpler than JQ1. Its high BC (0.85) but moderate OS (0.35) suggests it may sit in the pocket with good shape complementarity but lack the specific pharmacophoric interactions (N140 hydrogen bond, ZA channel occupancy) that drive high affinity.

Selectivity analysis: BRD4 vs. BRD2

This analysis was not performed. A selectivity screen against BRD2 (PDB 5UEN) would require re-running the top-scoring compounds from the BRD4 screen against the BRD2 bromodomain structure and comparing Binding Confidence and Optimization Scores across the two targets. Compounds scoring highly for BRD4 but poorly for BRD2 would indicate selectivity, a desirable property for reducing off-target effects, since BRD4 and BRD2 share highly conserved acetyl-lysine binding pockets. JQ1 itself is a pan-BET inhibitor (binds BRD2, BRD3, and BRD4), so identifying BRD4-selective compounds from the AI screen would represent a potential advantage over the benchmark.

Resources

ResourceLink
Boltz Lab Platformdocs.boltz.bio
Key BRD4 paperFilippakopoulos P. et al. Nature 468, 1067–1073 (2010)
JQ1 PDB structurercsb.org/structure/3MXF

Part C: Phage Lysis Protein Design Challenge

L-Protein (Lysis Protein), 75 residues:

METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT
RegionResiduesRole
Soluble domain1–40Interacts with DnaJ.
Transmembrane domain41–75Drives lysis activity.

Engineering goals:

  1. DnaJ independence. L-protein folds and functions without requiring DnaJ.
  2. Faster or more efficient lysis. Reduces the window for E. coli to acquire resistance.
  3. Higher L-protein expression. Increases the amount of functional protein produced.

Approach. ESM-2 mutational scanning, experimental mutant data from PMC5775895, and conservation analysis via pBLAST + ClustalOmega were integrated to design five mutant L-protein sequences.

Generate mutational effect scores with ESM-2

The ESM-2 protein language model (650M parameters) was run on the 75-residue L-protein sequence. For each position, all 19 alternative amino acid substitutions were scored by computing the log-likelihood ratio (LLR = mutant log-probability minus wild-type log-probability). Results are saved in ms2/mutation_scores.csv (1,425 mutations across 75 positions).

MetricValue
Total mutations scored1,425
Positions75
Soluble region (1–40)760 mutations
Transmembrane region (41–75)665 mutations
Positive LLR (predicted beneficial)400 (28.1%)
Negative LLR (predicted deleterious)1,025 (71.9%)

Top 10 highest-scoring substitutions:

MutationLLRRegion
C29R+3.64Soluble
K50P+3.56TM
C29P+3.17Soluble
C29Q+3.06Soluble
C29S+3.04Soluble
K50L+2.96TM
C29K+2.76Soluble
C29L+2.74Soluble
C29A+2.55Soluble
C29T+2.52Soluble

Two positions dominate the positive LLR landscape: C29 (cysteine at position 29 in the soluble domain) and K50 (lysine at position 50 in the TM domain). ESM-2 strongly prefers substituting the cysteine at position 29, likely because free cysteines are rare in most proteins and the model considers them destabilizing. K50 scores highly because the model views a charged residue in a hydrophobic TM context as unfavorable. The most strongly disfavored mutations are all at the initiator methionine (M1).

Review the experimental mutant data

Experimental mutant data was obtained from PMC5775895 and is stored in ms2/L-Protein Mutants - Sheet1.csv. The dataset contains 139 entries representing 82 unique mutations across 49 positions in the L-protein.

CategoryCount
Total entries139
Unique mutations82
Missense mutations100 (59 unique)
Stop codon mutations39
Missense with lysis = 1 (functional)35 (19 unique)
Missense with lysis = 0 (non-functional)65 (40 unique)

Soluble domain (residues 1–40). This region is remarkably tolerant of mutation. Substitutions at R18, R19, R20 (the arginine-rich region) all retain lysis activity despite dramatically changing the charge profile (R18G, R18I, R19H, R19S, R20L, R20W; all lysis = 1). Positions 23 (K→E) and 25 (E→V, E→G, E→D) are also fully tolerant. Notable exceptions: M1 (initiator Met, essential), P6L (lysis = 0), Q8L (lysis = 0), Y39H (lysis = 0). C29R retains lysis, and C29 itself appears to be non-essential for function despite moderate conservation.

Transmembrane domain (residues 41–75). Far less tolerant. Most substitutions abolish lysis. K50 is functionally critical: all four tested substitutions (K50E, K50I, K50N, K50Q) show lysis = 0, yet the protein is still expressed (protein level = 1 for most), indicating that K50 is required for the lysis mechanism itself, not for protein stability. Proline substitutions in the TM helix are generally lethal (L48P, L56P, L57P, L60P all lysis = 0). Rare functional TM mutations include L44P and A45P; prolines at the TM boundary are tolerated, possibly because they sit at the helix-membrane interface. Positions 49–53 (S49, K50, F51, T52, N53) form a particularly intolerant stretch.

Does the experimental data correlate with the language model scores?

The ESM-2 LLR scores show no meaningful correlation with experimental lysis outcomes.

TestResult
Point-biserial correlationrₓ₋ = -0.041, p = 0.757
Mann–Whitney UU = 421, p = 0.511
Mean LLR for lysis = 1 (functional)-0.560
Mean LLR for lysis = 0 (non-functional)-0.433

The correlation is essentially zero and far from statistical significance. If anything, the slight negative trend (functional mutations have marginally lower LLR) contradicts the expected direction. The Mann–Whitney U test confirms that the LLR distributions for functional and non-functional mutations are not distinguishable.

Of the 59 matched mutations, ESM-2 predictions agree with experiment in approximately 30 cases (roughly 50%), no better than random.

What does this say about ESM-2 for the L-protein?

ESM-2’s evolutionary signal does not capture the functional constraints of the L-protein. Several factors explain this:

  1. Extreme sequence rarity. The L-protein is a 75-residue protein encoded by an overlapping reading frame in the MS2 genome. It has very few homologs in sequence databases (only 2–3 close relatives, fr and M12, plus a handful of distantly related levivirus lysis proteins). ESM-2 was trained on millions of sequences, and its effectiveness depends on having sufficient evolutionary depth. The L-protein’s shallow phylogenetic tree gives the model little signal to leverage.
  2. Unusual evolutionary constraints. Because the lysis gene overlaps the coat protein and replicase genes, its evolution is constrained by the reading frames of two other genes. The selective pressures captured in ESM-2’s training reflect these overlapping constraints, not the intrinsic functional requirements of the L-protein itself.
  3. Non-standard function. The L-protein is a single-pass transmembrane toxin whose function (membrane disruption) may not follow the same structure–function relationships ESM-2 captures well for globular enzymes.

The protein-level correlation is equally absent (r = 0.039, p = 0.768), confirming that ESM-2 does not predict expression or stability for this protein either.

Where does the model succeed and where does it fail?

Where ESM-2 succeeds

  • Strongly deleterious mutations at conserved positions. M1I and M1T (LLR = -6.13 and -5.63) are correctly predicted as non-functional. The initiator methionine is universally conserved and essential. Similarly, I42N (LLR = -1.43, lysis = 0) and I46N (LLR = -1.43, lysis = 0) in the transmembrane domain are correctly identified; replacing hydrophobic residues with polar asparagine disrupts TM helix packing.
  • Proline substitutions in the TM helix. L48P (LLR = -2.31), L56P (LLR = -1.22), L56H (LLR = -2.11), L57P (LLR = -0.42), and L60P (LLR = -0.84) all correctly receive negative LLR and experimentally show no lysis. ESM-2 recognizes that proline is incompatible with alpha-helical transmembrane segments.

Where ESM-2 fails

  • The arginine-rich soluble region (R18, R19, R20). R18G (LLR = -1.02), R18I (-1.37), R19H (-1.03), R19S (-0.30), R20L (-0.23), and R20W (-2.30) are all predicted deleterious, yet every one permits lysis. This is because the soluble N-terminal domain (residues 1–40) is largely dispensable for lysis activity; the amino-terminal half of the protein can tolerate extensive mutation as long as the transmembrane domain is intact. ESM-2 cannot distinguish “conserved for overlapping gene constraints” from “conserved for L-protein function.”
  • Position K50 in the TM domain. K50E (LLR = +0.50), K50I (+2.41), K50N (+0.86), and K50Q (+0.78) all receive positive or near-positive LLR scores, yet all four experimentally show no lysis. K50 is a charged “snorkeling” lysine in the TM domain that is apparently critical for membrane disruption. ESM-2 interprets this unusual charged residue in a hydrophobic context as unfavorable, when in fact it is functionally essential.
  • The failure pattern is region-dependent. Per-region analysis shows a slight positive trend in the soluble domain (rₓ₋ = +0.134) but a slight negative trend in the transmembrane domain (rₓ₋ = -0.166). ESM-2 is marginally better at predicting outcomes in the soluble domain but actively misleading in the transmembrane domain, likely because the functional rules for single-pass TM toxins differ from the evolutionary patterns in ESM-2’s training set.

Conservation analysis via pBLAST + ClustalOmega

A pBLAST search of the L-protein sequence identified 10 levivirus lysis protein homologs: fr (CAA33137), M12 (AAF19634), GA (CAA27498), JP34 (AAA72211), KU1 (AAF67675), BZ13 (ACT66727), Hgal1 (YP007237174), C1 (YP007237128), PP7 (NP042306), and PRR1 (YP717670). These were aligned with ClustalOmega and conservation scores were computed per position (ms2/conservation_scores.csv, ms2/alignment.fasta).

The alignment spans 11 sequences (MS2 L-protein + 10 homologs). Not all sequences cover every position; the N-terminal and C-terminal regions have variable sequence coverage (2–11 sequences per position).

Highly conserved positions (conservation ≥ 0.80):

PositionResidueConservationShannon EntropyRegion
1M1.000.00Soluble
2E1.000.00Soluble
3T1.000.00Soluble
4R1.000.00Soluble
9S0.800.72Soluble
12T0.800.72Soluble
29C0.820.68Soluble
46I0.820.87TM
48L0.820.87TM
64I0.880.54TM
69T0.880.54TM
70L0.880.54TM
73L1.000.00TM
75T1.000.00TM

The first four residues (METR) are universally conserved across all homologs. C29 (conservation = 0.82) is notable as the only cysteine in the protein and is highly conserved despite ESM-2 strongly favoring its substitution, highlighting a disconnect between evolutionary conservation and model preferences.

Highly variable positions (conservation ≤ 0.30):

PositionResidueConservationMost common AARegion
6P0.20PSoluble
17N0.30MSoluble
18R0.18GSoluble
19R0.09LSoluble
25E0.27KSoluble
26D0.18ESoluble
28P0.27LSoluble
30R0.18SSoluble
37T0.27RSoluble
41L0.27WTM
43F0.27ATM
50K0.30DTM
53N0.30STM
56L0.30STM
74L0.29PTM

The soluble domain (positions 1–40) shows a gradient: the first four residues are perfectly conserved, then conservation drops substantially in the R18–R20 arginine-rich region (0.09–0.38) and the E25–P28 stretch (0.18–0.27). The transmembrane domain (positions 41–75) has a mix of well-conserved structural residues (I46, L48, I64, T69, L70, L73, T75) and highly variable positions (L41, F43, K50, N53, L56), suggesting that TM helix geometry is maintained but specific side chains can vary.

Design 5 mutant variants

The variants below were selected by integrating three data sources: ESM-2 LLR scores (predicted mutational effect), conservation analysis (10 levivirus lysis protein homologs aligned via ClustalOmega), and experimental lysis data (59 characterized mutations). Selection criteria: positive LLR, non-conserved position (conservation < 0.8), and experimentally supported where available.

Variant 1: L-K23E

ItemValue
Full mutant sequenceMETRFPQQSQQTPASTNRRRPFEHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT
RegionSoluble (position 23)
MutationK23 → E (lysine to glutamate)
Language model score+0.289 (predicted beneficial)
Experimental supportLysis = 1 (functional); Protein level = 0 (not detected by Western blot)
Conservation status0.545 (moderately variable; Shannon entropy 2.05)
Criteria met3/3 (positive LLR, non-conserved, experimentally supported)
RationaleCharge reversal (positive K to negative E) in the soluble domain’s basic region near the DnaJ interaction interface. Position 23 is moderately conserved but shows high entropy (2.05), indicating tolerance for diverse amino acids across levivirus lysis proteins. The K→E substitution replaces the most common residue at this position with a negatively charged alternative, potentially altering the electrostatic interaction surface with DnaJ. Experimentally confirmed to retain lysis activity.
Target goalDnaJ independence. Charge reversal at the chaperone interaction surface may weaken DnaJ binding while the protein retains lysis function through an alternative folding pathway.

Variant 2: L-E25G

ItemValue
Full mutant sequenceMETRFPQQSQQTPASTNRRRPFKHGDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT
RegionSoluble (position 25)
MutationE25 → G (glutamate to glycine)
Language model score+0.251 (predicted beneficial)
Experimental supportLysis = 1 (functional); Protein level = 0
Conservation status0.273 (highly variable; most common AA at this position is K, not E)
Criteria met3/3
RationalePosition 25 is poorly conserved (0.273); across the 11-sequence alignment this site shows K, E, A, I, R, D, and others, indicating minimal functional constraint. The E→G substitution removes a bulky charged side chain and introduces maximum backbone flexibility. Experimentally confirmed functional.
Target goalHigher expression. Glycine at this unconstrained position may improve co-translational folding efficiency and reduce dependence on chaperone-assisted folding.

Variant 3: L-K50P

ItemValue
Full mutant sequenceMETRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSPFTNQLLLSLLEAVIRTVTTLQQLLT
RegionTransmembrane (position 50)
MutationK50 → P (lysine to proline)
Language model score+3.561 (highest LLR of all candidates)
Experimental supportNo direct data for K50P. Caution: K50E, K50I, K50N, and K50Q all show lysis = 0, indicating K50 may be functionally essential.
Conservation status0.300 (variable; most common AA at this position is D)
Criteria met2/3 (positive LLR, non-conserved; no direct experimental data)
RationaleESM-2 assigns the highest LLR to this mutation because K50 is a charged residue in a hydrophobic TM context, and the model strongly prefers hydrophobic alternatives. However, this represents a known ESM-2 blind spot: K50 appears to be a functionally critical “snorkeling” lysine whose charge is required for membrane disruption. This variant is included as a hypothesis-testing candidate: if K50P retains lysis, it would demonstrate that the helix-breaking property of proline can substitute for the charge-based mechanism.
Target goalFaster / more efficient lysis. If functional, the proline-induced helix kink could create a more aggressive membrane disruption geometry. This is the highest-risk, highest-reward variant.

Variant 4: L-K50L

ItemValue
Full mutant sequenceMETRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT
RegionTransmembrane (position 50)
MutationK50 → L (lysine to leucine)
Language model score+2.956 (second highest LLR)
Experimental supportNo direct data for K50L. Same caution as Variant 3: four other K50 substitutions are non-functional.
Conservation status0.300 (variable)
Criteria met2/3 (positive LLR, non-conserved)
RationaleLeucine is the most common residue in alpha-helical TM segments and represents the “default” hydrophobic substitution. Unlike the proline in Variant 3, leucine maintains helix geometry. This variant tests whether the loss of K50’s charge alone abolishes lysis or whether the specific chemistry of K50E/I/N/Q is what fails. Together, Variants 3 and 4 test two hypotheses: (3) can a structural perturbation compensate for charge loss; (4) is any uncharged residue tolerated?
Target goalFaster / more efficient lysis. If the TM domain can function with a fully hydrophobic helix, this would indicate that membrane insertion efficiency can compensate for the loss of charge-mediated disruption.

Variant 5: L-E25V

ItemValue
Full mutant sequenceMETRFPQQSQQTPASTNRRRPFKHVDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT
RegionSoluble (position 25)
MutationE25 → V (glutamate to valine)
Language model score+0.152 (predicted mildly beneficial)
Experimental supportLysis = 1 (functional); Protein level = 0
Conservation status0.273 (highly variable)
Criteria met3/3
RationaleSame position as Variant 2 (E25G) but with a different substitution strategy. While E25G maximizes flexibility, E25V introduces a branched hydrophobic side chain. This provides a paired comparison at a known-tolerant position: flexibility (G) vs. hydrophobicity (V). Position 25 is adjacent to the conserved C29 (conservation = 0.818), so mutations here probe the boundary between the variable N-terminal region and the more constrained core. Both E25G and E25V are experimentally confirmed functional.
Target goalDnaJ independence. Replacing the charged glutamate with hydrophobic valine at the soluble-domain surface creates a local hydrophobic patch that may reduce the protein’s requirement for DnaJ-mediated folding assistance.

Summary

VariantMutationRegionLLRConservationExp. LysisTarget Goal
1K23ESoluble+0.2890.545YesDnaJ independence
2E25GSoluble+0.2510.273YesHigher expression
3K50PTM+3.5610.300No data*Faster lysis
4K50LTM+2.9560.300No data*Faster lysis
5E25VSoluble+0.1520.273YesDnaJ independence

* Other K50 substitutions (E, I, N, Q) experimentally show no lysis.

Caveats

  1. K50 risk. Variants 3 and 4 target position K50, where 4/4 tested mutations are non-functional. These are hypothesis-testing variants, not safe bets. Lower-risk TM alternatives include L44P (lysis = 1, LLR = -1.84) or A45P (lysis = 1, LLR = -0.43), though these have negative ESM-2 scores.
  2. Position redundancy. The design includes two mutations at position 25 and two at position 50. This enables paired comparisons (flexibility vs. hydrophobicity at pos 25; helix-breaking vs. helix-maintaining at pos 50) but reduces position diversity.
  3. ESM-2 limitations for L-protein. As documented in the correlation analysis, ESM-2 LLR scores do not predict lysis outcomes for this protein (rₓ₋ = -0.041). The conservation analysis and experimental data were therefore weighted more heavily in the final selection.