Week 5 HW: Protein Design II

🧬 Part 1: Generate Binders with PepMLM

Human SOD1 Sequence:

https://www.uniprot.org/uniprotkb/P00441/entry

https://www.uniprot.org/uniprotkb/P00441/entry#sequences

SOD1 sequence

MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

SOD1 sequence with A4V mutation

MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
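The two sequences above differ at a single residue. A minimal sketch of applying a point mutation such as A4V programmatically is shown here; the function name `apply_mutation` and the `skips_initiator_met` flag are hypothetical, and the sketch assumes the mutation label counts residues from the one after the initiator methionine, which is what the two sequences above imply (the changed residue is the fifth character of the full string).

```python
import re


def apply_mutation(sequence: str, mutation: str, skips_initiator_met: bool = True) -> str:
    """Apply a point mutation written like 'A4V' to a protein sequence.

    Assumption: when skips_initiator_met is True, position 1 refers to the
    residue after the leading Met, so 'A4V' targets the fifth character.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"Unrecognised mutation string: {mutation!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)

    # Convert the 1-based label to a 0-based index, shifting by one if the
    # numbering scheme ignores the initiator Met.
    index = position - 1 + (1 if skips_initiator_met else 0)
    if sequence[index] != wild_type:
        raise ValueError(
            f"Expected {wild_type} at index {index}, found {sequence[index]}"
        )
    return sequence[:index] + mutant + sequence[index + 1:]


# First eleven residues of the wild-type sequence listed above.
print(apply_mutation("MATKAVCVLKG", "A4V"))  # MATKVVCVLKG
```

The wild-type check guards against off-by-one numbering mistakes: if the residue at the computed index is not the expected one, the function raises instead of silently editing the wrong position.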

Here is a table with the binders ranked and compared against a known binder:

| Rank | Peptide Source | Sequence | Pseudo-Perplexity |
|---|---|---|---|
| 1 | Reference (Experimental) | FLYRWLPSRRGG | 2.2833 |
| 2 | PepMLM (Candidate 0) | KLVPAVVLAHKX | 7.4714 |
| 3 | PepMLM (Candidate 1) | KRSYPTALRHWX | 10.1367 |
| 4 | PepMLM (Candidate 2) | WRYPVAABHGK | 11.0383 |
| 5 | PepMLM (Candidate 3) | WHVYVVGLRHKE | 25.8914 |

Pseudo-perplexity measures how "surprised" the model is by a sequence, so a lower score means the model is more confident in that peptide for the given target. It is a proxy for how plausible a binder looks to the model rather than a direct measurement of affinity. Here, the known binder FLYRWLPSRRGG acts as the benchmark, scoring 2.28, which is well below any of the newly generated designs. I have ranked the binders in ascending order of pseudo-perplexity.
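To make the metric concrete, here is a minimal sketch of how a pseudo-perplexity is commonly computed for a masked language model: each residue is masked in turn, the model's log-probability of the true residue is recorded, and the score is the exponential of the negative mean. Whether the PepMLM notebook follows exactly this recipe is an assumption, and the log-probabilities below are made-up illustrative numbers, not model output.

```python
import math


def pseudo_perplexity(token_log_probs):
    """Return exp(-mean(log p)) over the masked positions.

    token_log_probs: natural-log probabilities that the model assigned to
    the true residue at each position when that position was masked.
    """
    if not token_log_probs:
        raise ValueError("Need at least one log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Hypothetical values: a model that gives every true residue p = 0.5
# versus one that gives every true residue p = 0.05.
confident = [math.log(0.5)] * 12
uncertain = [math.log(0.05)] * 12

print(round(pseudo_perplexity(confident), 3))  # 2.0
print(round(pseudo_perplexity(uncertain), 3))  # 20.0
```

A score of 2 roughly means the model is choosing between two equally likely residues at each position, which is why the reference binder's 2.28 reads as a confident prediction and 25.89 as a very uncertain one.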

🔬 Part 2: Evaluate Binders with AlphaFold3

| Rank | Job Name | ipTM | pTM | Primary Binding Location | Target Engagement |
|---|---|---|---|---|---|
| 1 | SOD1 and KLVPAVVLAHK | 0.58 | 0.82 | N-terminus Groove | High (Pocket) |
| 2 | SOD1 and WHVYVVGLRHKE | 0.49 | 0.81 | Upper β-barrel Ridge | Moderate (Surface) |
| 3 | SOD1 and KRSYPTALRHW | 0.44 | 0.90 | β-barrel Loops | Moderate (Surface) |
| 4 | SOD1 and WRYPVAABHGK | 0.39 | 0.83 | Lower Dimer Interface | Low/Mod (Surface) |
| 5 | SOD1 and FLYRWLPSRRGG (Ref) | 0.26 | 0.81 | Surface Loops | Low (Transient) |

Key

| Confidence Level | pLDDT Range | Corresponding Color |
|---|---|---|
| Very High | pLDDT > 90 | Dark Blue |
| Confident | 90 > pLDDT > 70 | Light Blue (Cyan) |
| Low | 70 > pLDDT > 50 | Yellow |
| Very Low | pLDDT < 50 | Orange |
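The key can also be applied programmatically, for example when reading per-residue pLDDT values out of a prediction file. The sketch below is only an illustration: `plddt_band` is a hypothetical helper, and because the key leaves the exact boundary values (90, 70, 50) unassigned, placing them in the lower band is an assumption.

```python
def plddt_band(plddt: float) -> tuple[str, str]:
    """Map a pLDDT value (0-100) to its confidence level and display colour."""
    if not 0 <= plddt <= 100:
        raise ValueError(f"pLDDT must be between 0 and 100, got {plddt}")
    if plddt > 90:
        return ("Very High", "Dark Blue")
    if plddt > 70:
        return ("Confident", "Light Blue (Cyan)")
    if plddt > 50:
        return ("Low", "Yellow")
    # Assumption: exactly 50 falls in the lowest band along with values below it.
    return ("Very Low", "Orange")


for value in (95.2, 81.0, 63.5, 41.7):
    print(value, plddt_band(value))
```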

Protein-peptide complex models from AlphaFold3, with residue alignment charts (green)

The models are ordered by ipTM score, from the highest (KLVPAVVLAHK, 0.58) to the lowest (FLYRWLPSRRGG, 0.26).

AlphaFold 3 modelling supported the binding potential of the peptides generated using PepMLM. Notably, all four model-generated peptides outperformed the experimental reference peptide, FLYRWLPSRRGG, in terms of ipTM (interface confidence), despite the reference having the lowest pseudo-perplexity score.

Candidate 0, KLVPAVVLAHK, achieved the highest ipTM score of 0.58 and also had the lowest pseudo-perplexity score among the generated candidates, at 7.4714. The predicted model places it deep within the N-terminal groove of SOD1, near the ALS-associated A4V mutation site, and its ipTM gives that placement more interface confidence than any other peptide in the set. In contrast, the remaining peptides displayed differing binding preferences across the β-barrel region and dimer interface.

The second strongest binder was Candidate 3, WHVYVVGLRHKE, with an ipTM score of 0.49. Interestingly, this peptide also had the highest pseudo-perplexity score at 25.8914. In other words, PepMLM considers its sequence the least likely of the generated candidates, even though AlphaFold 3 predicts a comparatively favourable interface with mutant SOD1.
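The disagreement between the two rankings can be quantified with a Spearman rank correlation. The sketch below uses the pseudo-perplexity and ipTM values reported above; the helper names are hypothetical, and the simple formula assumes there are no tied values, which holds for this data.

```python
def ranks(values, reverse=False):
    """Return 1-based ranks; rank 1 is the smallest value (largest if reverse)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=reverse)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(rank_a, rank_b):
    """Spearman correlation from two rank lists (valid when there are no ties)."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


peptides = ["FLYRWLPSRRGG", "KLVPAVVLAHK", "KRSYPTALRHW", "WRYPVAABHGK", "WHVYVVGLRHKE"]
perplexity = [2.2833, 7.4714, 10.1367, 11.0383, 25.8914]
iptm = [0.26, 0.58, 0.44, 0.39, 0.49]

perplexity_ranks = ranks(perplexity)      # rank 1 = lowest pseudo-perplexity
iptm_ranks = ranks(iptm, reverse=True)    # rank 1 = highest ipTM

rho = spearman_rho(perplexity_ranks, iptm_ranks)
print(round(rho, 2))  # -0.3
```

A value of 1 would mean the two metrics order the peptides identically. The slightly negative result here reflects that the reference peptide ranks first by pseudo-perplexity but last by ipTM, while Candidate 3 shows the opposite pattern. With only five peptides this is descriptive, not statistically meaningful.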

🧪 Part 3: Evaluate Properties of Generated Peptides in PeptiVerse

In the search for peptides capable of stabilizing the SOD1 protein, a major therapeutic target in ALS research, the focus shifts from structural prediction in AlphaFold 3 to therapeutic evaluation in PeptiVerse. While AlphaFold 3 provides insight into the three-dimensional binding structure of a peptide, the 11 profiling metrics generated by PeptiVerse offer a broader assessment of how each candidate may behave in a biological and therapeutic context. Shown below are the results of evaluating the four PepMLM-designed peptide candidates against the established reference binder, FLYRWLPSRRGG, ranked from highest to lowest ipTM score.

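| Metric / Property | KLVPAVVLAHK | WHVYVVGLRHKE | KRSYPTALRHW | WRYPVAABHGK | FLYRWLPSRRGG (Ref) |
|---|---|---|---|---|---|
| ipTM (Structural) | 0.58 | 0.49 | 0.44 | 0.39 | 0.26 |
| Solubility | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| Permeability | 0.242 | 0.143 | 0.849 | 0.359 | 0.862 |
| Hemolysis | 0.032 | 0.052 | 0.022 | 0.010 | 0.047 |
| Non-Fouling | 0.285 | 0.297 | 0.549 | 0.480 | 0.666 |
| Half-Life (hrs) | 0.438 | 0.412 | 0.342 | 0.339 | 0.310 |
| Binding (pKd) | 5.528 | 5.919 | 5.965 | 5.300 | 5.968 |
| Length (aa) | 11 | 12 | 11 | 11 | 12 |
| Mol. Weight (Da) | 1174.5 | 1522.8 | 1414.6 | 1166.5 | 1507.7 |
| Net Charge (pH 7) | +1.59 | +0.94 | +2.85 | +1.85 | +2.76 |
| Isoelectric Point | 10.00 | 8.60 | 11.00 | 9.99 | 11.71 |
| GRAVY (Hydrophobicity) | 1.02 | -0.38 | -1.44 | -0.73 | -0.71 |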
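The GRAVY row can be checked by hand. GRAVY is conventionally the mean Kyte-Doolittle hydropathy over all residues; that PeptiVerse uses this scale is an assumption, although the sketch below gives values that agree with the table for the peptides made only of standard residues. The function name `gravy` is illustrative.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    if not sequence:
        raise ValueError("Sequence is empty")
    unknown = sorted(set(sequence) - set(KYTE_DOOLITTLE))
    if unknown:
        # Non-standard codes such as B or X have no hydropathy value here.
        raise ValueError(f"Non-standard residue codes: {unknown}")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


print(round(gravy("KLVPAVVLAHK"), 2))   # 1.02
print(round(gravy("KRSYPTALRHW"), 2))   # -1.44
print(round(gravy("FLYRWLPSRRGG"), 2))  # -0.71
```

Candidate 2 (WRYPVAABHGK) contains the ambiguity code B, so this sketch raises an error for it instead of guessing how that residue should be scored.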

The results revealed an interesting trade-off between structural binding confidence and therapeutic potential.

Although Candidate 0 (KLVPAVVLAHK) achieved the highest ipTM score of 0.58, meaning AlphaFold 3 is more confident in its predicted interface with mutant SOD1 than in any other peptide's, PeptiVerse's therapeutic profiling identified Candidate 1 (KRSYPTALRHW) as the most promising overall candidate despite its lower ipTM score of 0.44.

This distinction likely arises from the balance between binding performance and drug-like properties. Candidate 1 had the second-lowest pseudo-perplexity score among the generated peptides at 10.1367, suggesting that its sequence remains relatively plausible to the model. In addition, it achieved the highest predicted binding affinity of the generated candidates, with a pKd of 5.965, alongside the highest permeability score among them at 0.849, indicating an increased likelihood of penetrating cells and reaching intracellular mutant SOD1 targets. On both metrics it sits just below the reference peptide (pKd 5.968, permeability 0.862).

Importantly, Candidate 1 also displayed the highest positive net charge of all tested peptides, including the reference peptide, with a score of +2.85. This characteristic may enhance its ability to cross the blood–brain barrier and interact with the negatively charged aggregates associated with mutant SOD1 pathology.

Taken together, these results suggest that while Candidate 0 demonstrates the strongest predicted structural fit, Candidate 1 offers the most balanced combination of binding capability, permeability, and therapeutic suitability, making it the strongest candidate for further investigation.
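The per-metric comparisons behind this conclusion can be reproduced with a small lookup over the PeptiVerse values reported above. This is only a sketch: the dictionary layout and the `top_peptide` helper are hypothetical, and only three of the metrics are included.

```python
REFERENCE = "FLYRWLPSRRGG"

# Values copied from the PeptiVerse results table above.
profiles = {
    "KLVPAVVLAHK":  {"permeability": 0.242, "pKd": 5.528, "net_charge": 1.59},
    "WHVYVVGLRHKE": {"permeability": 0.143, "pKd": 5.919, "net_charge": 0.94},
    "KRSYPTALRHW":  {"permeability": 0.849, "pKd": 5.965, "net_charge": 2.85},
    "WRYPVAABHGK":  {"permeability": 0.359, "pKd": 5.300, "net_charge": 1.85},
    REFERENCE:      {"permeability": 0.862, "pKd": 5.968, "net_charge": 2.76},
}


def top_peptide(metric: str, include_reference: bool = True) -> str:
    """Return the peptide with the highest value for the given metric."""
    candidates = {
        name: values
        for name, values in profiles.items()
        if include_reference or name != REFERENCE
    }
    return max(candidates, key=lambda name: candidates[name][metric])


for metric in ("permeability", "pKd", "net_charge"):
    print(
        metric,
        "| best generated:", top_peptide(metric, include_reference=False),
        "| best overall:", top_peptide(metric),
    )
```

Run as written, this shows KRSYPTALRHW leading the generated peptides on all three metrics, while the reference peptide remains ahead on permeability and pKd when it is included.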

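⚙️ Part 4: Generate Optimized Peptides with moPPIt

| Run | Sequence | Affinity (pKd) | Solubility | Specificity | Motif Score | Hemolysis |
|---|---|---|---|---|---|---|
| #1 | RFKCIVKVMVRR | 8.881 | 0.500 | 0.615 | 0.553 | 0.944 |
| #2 | KRLQLYRKKCAE | 7.193 | 0.750 | 0.737 | 0.634 | 0.964 |
| #3 | QRACDYFRDDED | 7.783 | 0.833 | 0.679 | 0.059 | 0.895 |
| #4 | KEKEGPCWESEK | 7.360 | 0.833 | 0.871 | 0.002 | 0.962 |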

The PepMLM-generated peptides primarily emphasize confident structural docking alongside balanced biophysical properties, resulting in a more conservative profile that improves on the baseline reference in ipTM but not in predicted pKd. In contrast, the moPPIt-generated peptides explore a broader chemical space and place greater emphasis on targeted binding interactions, with reported pKd values between 7.193 and 8.881.