Week 5 HW: Protein Design Part II
Part A: SOD1 Binder Peptide Design (From Pranam)
Part 1: Generate Binders with PepMLM
Here is the human SOD1 sequence from UniProt (P00441):
MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
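Here it is again after introducing the A4V mutation. A4V uses mature-protein numbering, which does not count the initiator methionine, so the substitution falls at position 5 of the UniProt sequence above: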
MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
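The substitution can be reproduced in a few lines of Python, which also guards against the off-by-one between mature-protein numbering (A4V) and the UniProt sequence that still includes Met1. This is only a sketch: `apply_mutation` and its `offset` parameter are hypothetical helpers written for this write-up, not part of PepMLM or any other tool used here.

```python
# Wild-type human SOD1 as listed above (UniProt P00441, initiator Met included).
SOD1_WT = (
    "MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ"
)


def apply_mutation(seq: str, mutation: str, offset: int = 1) -> str:
    """Apply a point mutation written like 'A4V'.

    offset=1 shifts mature-protein numbering (Met1 not counted) onto a
    sequence that still starts with the initiator methionine.
    """
    wild_type, position, mutant = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position + offset - 1  # convert to a 0-based string index
    if seq[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {seq[index]}"
        )
    return seq[:index] + mutant + seq[index + 1:]


SOD1_A4V = apply_mutation(SOD1_WT, "A4V")
print(SOD1_A4V[:10])  # MATKVVCVLK
```

The wild-type check is the useful part: if the offset is wrong, the function raises an error instead of silently mutating the wrong residue.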
Here are the produced peptides:
| Index | Binder | Pseudo Perplexity |
|---|---|---|
| 1 | WRYYPTGLRHKX | 12.016788 |
| 2 | HHYGAVVLELKK | 18.394675 |
| 3 | KRYPVAAARWKX | 10.061424 |
| 4 | WHVYVVAVALKE | 21.195186 |
| 5 | FLYRWLPSRRGG | N / A |
Part 2: Evaluate Binders with AlphaFold3
Two of the generated peptides contained X (an unknown residue) in their sequences, and AlphaFold3 rejected them, so I replaced each X with A (alanine) on the advice of Google Gemini.
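The replacement can be done programmatically so that every peptide is cleaned the same way. The sketch below is an assumption about how one might do it; `sanitize_peptide` is a hypothetical helper, and alanine is simply the placeholder chosen in this homework, not a rule required by AlphaFold3.

```python
# The 20 standard amino acids in one-letter code.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def sanitize_peptide(seq: str, placeholder: str = "A") -> str:
    """Replace any non-standard residue (such as X) with a placeholder residue."""
    return "".join(aa if aa in STANDARD_AA else placeholder for aa in seq.upper())


peptides = [
    "WRYYPTGLRHKX",
    "HHYGAVVLELKK",
    "KRYPVAAARWKX",
    "WHVYVVAVALKE",
    "FLYRWLPSRRGG",
]
cleaned = [sanitize_peptide(p) for p in peptides]

for original, fixed in zip(peptides, cleaned):
    flag = " (modified)" if original != fixed else ""
    print(f"{original} -> {fixed}{flag}")
```

Note that the substituted sequences are no longer exactly what PepMLM produced, so the AlphaFold3 scores for those two peptides describe the alanine variants.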
AlphaFold3 Prediction Results
Summary Analysis
The ipTM scores for the PepMLM-generated peptides range from 0.26 to 0.68. By commonly cited AlphaFold guidance, an ipTM score above 0.8 represents a high-confidence prediction, while scores below 0.6 generally indicate a failed prediction. Most candidates, including the known binder (0.32), fall into the failure category, indicating that these interactions are likely unstable or poorly modeled.
However, the peptide HHYGAVVLELKK achieved an ipTM of 0.68, placing it in the “grey zone” (0.6–0.8). This score indicates that the prediction could potentially be correct and represents a significant improvement over the control binder. While most peptides remain surface-bound, HHYGAVVLELKK shows the most potential to move beyond the surface and possibly engage with the dimer interface or the destabilized N-terminus where the A4V mutation sits.
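The thresholds used in this analysis can be written down as a small classifier. This is a sketch of the cutoffs quoted above (0.6 and 0.8), with the boundary handling (whether exactly 0.6 or 0.8 counts as grey zone) being my own assumption; `classify_iptm` is a hypothetical helper.

```python
def classify_iptm(iptm: float) -> str:
    """Bucket an ipTM score using the 0.6 / 0.8 cutoffs quoted in the text."""
    if iptm > 0.8:
        return "high confidence"
    if iptm >= 0.6:
        return "grey zone"
    return "likely failed"


# Only the scores stated explicitly in the text are used here.
scores = {
    "HHYGAVVLELKK": 0.68,
    "known binder (control)": 0.32,
}
for name, iptm in scores.items():
    print(f"{name}: ipTM {iptm:.2f} -> {classify_iptm(iptm)}")
```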
Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse
PeptiVerse Property Comparison
Analysis Paragraph
Comparing the structural data from AlphaFold3 to the chemical properties from PeptiVerse reveals that higher structural confidence (ipTM) does not correlate with stronger predicted binding affinity in this dataset. For instance, HHYGAVVLELKK has the highest ipTM (0.68) but the lowest predicted affinity (5.454). Conversely, WHVYVVAVALKE shows the highest affinity (6.406) but is also the most hemolytic (0.113), which is a significant therapeutic drawback. Interestingly, all peptides are predicted to be highly soluble (1.000). While KRYPVAAARWKX stands out as the safest option with the lowest hemolysis probability (0.012), its structural confidence remains low.
Decision & Justification
Peptide to Advance: HHYGAVVLELKK
Justification: I would advance HHYGAVVLELKK because it is the only candidate that provides a credible structural binding mode, moving out of the “failure zone” and into the AlphaFold3 “grey zone” (ipTM 0.68). While its predicted affinity is lower than the others, a confidently predicted binding mode is often a more reliable indicator of specific binding for complex targets like SOD1 A4V. Additionally, it is predicted to be highly soluble and to have a low hemolysis probability, which supports a safe therapeutic profile, and it may stabilize the destabilized N-terminus better than the low-confidence surface binders.
Part 4: Generate Optimized Peptides with moPPIt
In this section, I moved from global sampling to controlled design. I used the moPPIt model to target specific residue indices (2, 3, 4, 5, 6) corresponding to the A4V mutation site at the N-terminus of SOD1.
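As a sanity check on the target selection, the residues at those indices can be read off the mutant sequence. The sketch below assumes the indices are 1-based and refer to the full UniProt-numbered sequence (initiator Met at index 1); whether the moPPIt notebook uses exactly this convention is an assumption worth verifying there.

```python
# First 16 residues of the SOD1 A4V sequence given earlier in this write-up.
SOD1_A4V_NTERM = "MATKVVCVLKGDGPVQ"

# Target indices passed to moPPIt, assumed here to be 1-based.
target_positions = [2, 3, 4, 5, 6]

motif = {pos: SOD1_A4V_NTERM[pos - 1] for pos in target_positions}
for pos, residue in motif.items():
    print(pos, residue)
```

Under that assumption the target patch is A-T-K-V-V, and index 5 is the valine introduced by A4V, so the mutated residue sits inside the targeted window.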

Additionally, I selected all optimization properties in the notebook, even though the computation ran on a T4 GPU in Google Colab, which has limited computational resources. The run took 43 minutes.
Here are the moPPIt generated peptides:
| Peptide Sequence | Hemolysis | Solubility | Affinity | Motif |
|---|---|---|---|---|
| KANYWTTWTSDS | 0.93190462142229 | 0.75 | 5.74363183975219 | 0.78769564628601 |
| KCETKFLQKREI | 0.966306183487176 | 0.75 | 6.49503183364868 | 0.894703328609466 |
| KRQSCQKTKPFV | 0.938299626111984 | 0.75 | 6.26246261596679 | 0.869844377040863 |
| KSQKKQTEICGR | 0.958696339279413 | 0.916666686534881 | 6.46437692642211 | 0.800572216510772 |
In the next step, I ran these peptides through AlphaFold3 and PeptiVerse and compared them with the PepMLM ones.
Candidate 1: KSQKKQTEICGR (Lead Candidate)
- AlphaFold3 Validation: ipTM Score: 0.52 (The highest structural confidence among the optimized set).
- Binding Analysis: This peptide shows the most promising localization. Unlike the PepMLM binders that were floating away, this sequence remains in close proximity to the N-terminal region. It appears to “hug” the site of the A4V mutation, suggesting it could potentially stabilize the destabilized fold.

PeptiVerse Property Profile:
- Affinity: 6.464
- Solubility: 1.000 (Perfectly soluble)
- Hemolysis: 0.041 (Low toxicity)

Candidate 2: KCETKFLQKREI
- AlphaFold3 Validation: ipTM Score: 0.42
- Binding Analysis: While the confidence is slightly lower than Candidate 1, it remains docked near the beta-barrel region adjacent to the N-terminus. It is not “floating away” into the solvent, indicating a specific interaction with the protein surface.

PeptiVerse Property Profile:
- Affinity: 6.495 (Highest Predicted Affinity)
- Solubility: 1.000
- Hemolysis: 0.074

Candidate 3: KRQSCQKTKPFV
- AlphaFold3 Validation: ipTM Score: 0.41
- Binding Analysis: This candidate also shows proximity to the mutation site. While the ipTM is in the lower confidence range, the physical placement in the model remains focused on the targeted residue patch rather than random surface binding.

PeptiVerse Property Profile:
- Affinity: 6.262
- Solubility: 1.000
- Hemolysis: 0.037

Candidate 4: KANYWTTWTSDS
- AlphaFold3 Validation: ipTM Score: 0.37
- Binding Analysis: This peptide localizes near the target but shows higher flexibility in the model, reflected in the lower ipTM score. It is close to the N-terminus but less “packed” than the lead candidate.

PeptiVerse Property Profile:
- Affinity: 5.744
- Solubility: 1.000
- Hemolysis: 0.066
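Because no single candidate is best on every metric, one way to compare the four is a Pareto check: a peptide is dropped only if another peptide is at least as good on ipTM, affinity, and hemolysis and strictly better on at least one. The sketch below uses the values reported above; the `Candidate` type and `dominates` function are hypothetical helpers, and solubility is left out because all four peptides score 1.000.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    sequence: str
    iptm: float       # higher is better
    affinity: float   # PeptiVerse predicted affinity, higher is better
    hemolysis: float  # PeptiVerse hemolysis probability, lower is better


candidates = [
    Candidate("KSQKKQTEICGR", 0.52, 6.464, 0.041),
    Candidate("KCETKFLQKREI", 0.42, 6.495, 0.074),
    Candidate("KRQSCQKTKPFV", 0.41, 6.262, 0.037),
    Candidate("KANYWTTWTSDS", 0.37, 5.744, 0.066),
]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every metric and better on at least one."""
    no_worse = (
        a.iptm >= b.iptm and a.affinity >= b.affinity and a.hemolysis <= b.hemolysis
    )
    better = a.iptm > b.iptm or a.affinity > b.affinity or a.hemolysis < b.hemolysis
    return no_worse and better


pareto_front = [
    c for c in candidates if not any(dominates(other, c) for other in candidates)
]
for c in pareto_front:
    print(c.sequence, c.iptm, c.affinity, c.hemolysis)
```

With these numbers only KANYWTTWTSDS is dominated (by KSQKKQTEICGR), so the first three candidates remain as trade-offs: best ipTM, best affinity, and lowest hemolysis respectively.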

How moPPIt Peptides Differ from PepMLM Peptides
The moPPIt-designed peptides improve on the PepMLM set in targeting and predicted properties, although their ipTM scores (0.37 to 0.52) remain below that of the best PepMLM peptide (0.68):
- Controlled Specificity: PepMLM performs “Global Sampling,” which often results in peptides that bind to random surface loops. moPPIt uses Multi-Objective Guided Discrete Flow Matching (MOG-DFM) to steer the peptide specifically toward residues 2-6 (the A4V site).
- Property Reliability: moPPIt optimized for Affinity and Solubility during the generation phase itself. As a result, all four peptides were predicted to be fully soluble by PeptiVerse (1.000), and three of the four had high predicted affinity scores (pKd/pKi > 6.2).
- Targeting the “Toxic” Site: By steering the peptide to bind near the N-terminus, moPPIt creates candidates more likely to stabilize the SOD1 dimer interface, whose destabilization is thought to drive A4V-linked ALS.
Pre-Clinical Evaluation Strategy
To advance the lead candidate (KSQKKQTEICGR) toward clinical application, I would follow this validation pipeline:
- Biophysical Assays (SPR/ITC): I would use Surface Plasmon Resonance (SPR) to confirm the pKd/pKi values. Computational predictions must be validated with physical measurements of binding kinetics to ensure high-affinity binding in the nanomolar range.
- Aggregation Inhibition (ThT Assay): Since the A4V mutation causes toxic protein clumping, a Thioflavin T assay is essential to prove the peptide actually prevents SOD1 from aggregating.
- Efficacy in Motor Neurons: Testing on ALS patient-derived motor neurons is required to see if the peptide reduces intracellular SOD1 aggregates without causing cellular toxicity.
- Proteolytic Stability: I would evaluate the peptide’s half-life in human serum to ensure it isn’t degraded by proteases before it can reach the target neurons in the CNS.
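To put the nanomolar target in the SPR/ITC step in context, the predicted affinity can be converted to a dissociation constant. The sketch below assumes the PeptiVerse affinity is a pKd, i.e. the negative base-10 logarithm of Kd in molar units; `pkd_to_kd_nm` is a hypothetical helper.

```python
def pkd_to_kd_nm(pkd: float) -> float:
    """Convert pKd (assumed to be -log10 of Kd in molar) to Kd in nanomolar."""
    kd_molar = 10 ** (-pkd)
    return kd_molar * 1e9


lead_pkd = 6.464  # PeptiVerse predicted affinity for KSQKKQTEICGR
print(f"Predicted Kd: {pkd_to_kd_nm(lead_pkd):.0f} nM")
```

Under that assumption, a pKd of 6.464 corresponds to a Kd of a few hundred nanomolar, so the prediction sits at the weaker end of the nanomolar range, and the SPR measurement would show whether the real affinity is tighter or weaker than predicted.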









