SOD1 Binder Peptide Design


Intro from Pranam Chatterjee

Superoxide dismutase 1 (SOD1) is a cytosolic antioxidant enzyme that converts superoxide radicals into hydrogen peroxide and oxygen. In its native state, it forms a stable homodimer and binds copper and zinc. Mutations in SOD1 cause familial Amyotrophic Lateral Sclerosis (ALS). Among them, the A4V mutation (Alanine → Valine at residue 4) leads to one of the most aggressive forms of the disease. The mutation subtly destabilizes the N-terminus, perturbs folding energetics, and promotes toxic aggregation.

CHALLENGE

1. Design short peptides that bind mutant SOD1

2. Then decide which ones are worth advancing toward therapy


Part 1. Generate Binders with PepMLM

SOD1 original sequence (154 aa, including the initiator methionine):

MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

A4V mutated sequence:

Replace A with V at position 4 (the initiator methionine at position 1 is not counted, so this is the fifth letter of the sequence below):

MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
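The substitution can also be reproduced in a few lines of Python. This is only a minimal sketch: the helper name `apply_mutation` and its `met_counted` flag are illustrative assumptions, not part of any tool used in this assignment.

```python
# Wild-type SOD1 sequence as listed above (initiator Met included).
WILD_TYPE = "MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ"


def apply_mutation(sequence, mutation, met_counted=False):
    """Apply a point mutation such as 'A4V' to a one-letter sequence string."""
    wild, position, mutant = mutation[0], int(mutation[1:-1]), mutation[-1]
    # If Met1 is not counted, residue n sits at 0-based index n; otherwise at n - 1.
    index = position - 1 if met_counted else position
    if sequence[index] != wild:
        raise ValueError(f"expected {wild} at index {index}, found {sequence[index]}")
    return sequence[:index] + mutant + sequence[index + 1:]


A4V = apply_mutation(WILD_TYPE, "A4V")
print(A4V[:10])
```

The check on the wild-type residue guards against an off-by-one error: with `met_counted=True` the function raises, because the fourth letter of the full sequence is K, not A.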


Approach 01: Generation of 4 peptides (length 12 aa, top K 3)

Index   Peptides        Pseudo Perplexity
0       WLSPVVAAEHKE    15.485618
1       WRYGAAAAEHKE    9.098425
2       WRYYAAAVAHKX    8.351979
3       HHSYAAAVALKK    12.665942

Approach 02: Selection of 4 peptides out of 200 generated (length 12 aa, top K 3)

Index   Peptides        Pseudo Perplexity
47      WRYPAAAVALKX    5.951070210421078
135     WRSPAAVAAHKX    4.819682049260351
147     WHSGVVALAHKX    6.103797038822761
199     WRYGAVAARLKX    5.926568160063284

For comparison, known peptide binder FLYRWLPSRRGG:

Index   Binder          Pseudo Perplexity
N/A     FLYRWLPSRRGG    20.63523127283615

Pseudo perplexity (PPL) measures how confident the model is in a sequence. A lower PPL (≈ 3–6) means that the sequence fits well-learned binding patterns, so the model is more confident that the peptide will bind the target. A higher PPL (> 10–15) means that the sequence differs from the learned patterns, so the model is less confident that the peptide will bind the target.

At first glance, the high pseudo perplexity (> 15) of the known binder FLYRWLPSRRGG is surprising. However, the authors of the reference paper showed that known binders can have relatively high perplexity values. Confidence scores are useful for ranking candidates, but they have limitations: the score evaluates how familiar a sequence looks to the model, not whether the peptide truly binds the protein of interest. See documentation for more details on outliers.
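To make the score more concrete: for a masked language model, pseudo perplexity is generally defined as the exponential of the average negative log-probability the model assigns to each true residue when that residue is masked. The sketch below assumes this general definition and uses made-up per-residue probabilities; it does not call PepMLM, and the exact computation in the Colab may differ in detail.

```python
import math


def pseudo_perplexity(residue_probs):
    """exp(mean negative log-probability) over the masked residues."""
    if not residue_probs:
        raise ValueError("need at least one probability")
    negative_log_likelihoods = [-math.log(p) for p in residue_probs]
    return math.exp(sum(negative_log_likelihoods) / len(negative_log_likelihoods))


# Hypothetical probabilities for a 12-residue peptide (not model output).
familiar = [0.25] * 12    # the model gives each true residue a 1-in-4 chance
unfamiliar = [0.05] * 12  # the model gives each true residue a 1-in-20 chance

print(round(pseudo_perplexity(familiar), 2))
print(round(pseudo_perplexity(unfamiliar), 2))
```

Read this way, a PPL near 4 means the model is on average choosing among about four plausible residues per position, while a PPL near 20 is roughly what a uniform guess over the 20 standard amino acids would give.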

UPDATE: The basic AlphaFold functions do not accept undefined amino acids (X), so a new series of peptides without X was generated with Colab.

Reference   Peptides        Pseudo Perplexity
Binder      FLYRWLPSRRGG    20.64
0           WLSPVVAAEHKE    15.49
1           WRYGAAAAEHKE    9.10
3           HHSYAAAVALKK    12.67
15          WLSGAVGAAHKK    8.10
43          WRYGAAAAEHGK    6.95
94          WRYPAAAARLGK    7.06
111         WRYGAVAAAWKE    7.94
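Screening out peptides that contain X can be automated before submitting anything to AlphaFold. The following is a minimal sketch using the four Approach 01 peptides; the helper name `has_only_standard_residues` is an illustrative assumption.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

# (peptide, pseudo perplexity) pairs from Approach 01.
APPROACH_01 = [
    ("WLSPVVAAEHKE", 15.485618),
    ("WRYGAAAAEHKE", 9.098425),
    ("WRYYAAAVAHKX", 8.351979),
    ("HHSYAAAVALKK", 12.665942),
]


def has_only_standard_residues(peptide):
    """True if every letter is one of the 20 standard amino acids (no X)."""
    return set(peptide) <= STANDARD_AMINO_ACIDS


# Keep peptides without X, then rank from lowest to highest perplexity.
usable = sorted(
    (pair for pair in APPROACH_01 if has_only_standard_residues(pair[0])),
    key=lambda pair: pair[1],
)
for peptide, ppl in usable:
    print(peptide, round(ppl, 2))
```

Only the peptide ending in X is dropped, which leaves the three Approach 01 peptides that were carried forward.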

2. Evaluate Binders with AlphaFold3

AlphaFold Visualization:

Except for peptides 3 and 43, which form an alpha helix, all peptides are predicted as unfolded: some appear floppy, others follow the shape of the protein. None of the peptides is buried in the mutant SOD1: if they bind the protein, they all appear to be surface-bound. None of the peptides appears to interact with the N-terminus or the dimer interface. All peptides seem to interact, at least partially, with the beta-barrel. Besides the known binder, peptides 43, 94 and 111 also appear to interact with the electrostatic loop (the loop containing a short stretch of alpha helix).



Prediction of the different A4V Mutant SOD1-peptide Bindings:

Reference   ipTM
Binder      0.3
0           0.27
1           0.28
3           0.73
15          0.46
43          0.35
94          0.42
111         0.31

ipTM (interface predicted TM-score) reports how confident AlphaFold is in the predicted position of the peptide relative to the protein; it is used here as a proxy for the probability of binding rather than a direct measure of it. Typically, an ipTM > 0.8 indicates a confident prediction and an ipTM < 0.6 a likely failed one, with values in between being ambiguous. Thus, it is surprising that the known binder only reaches an ipTM of 0.3 with SOD1, which is lower than or similar to the values of all generated peptides. This may be because the peptide binds the surface rather than being buried inside the protein, and/or because the peptide forms an atypical interaction with the electrostatic loop. With an ipTM of 0.73, peptide 3 has the highest score: it is worth predicting its therapeutic properties to evaluate whether it should be advanced toward therapy.
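The thresholds quoted above (0.8 and 0.6) can be applied to the table directly. This is a minimal sketch; the band labels and the function name `iptm_band` are illustrative choices, not AlphaFold output.

```python
# ipTM values from the table above, keyed by peptide reference.
IPTM = {
    "Binder": 0.30, "0": 0.27, "1": 0.28, "3": 0.73,
    "15": 0.46, "43": 0.35, "94": 0.42, "111": 0.31,
}


def iptm_band(score, low=0.6, high=0.8):
    """Place an ipTM value in one of three bands using the 0.6 / 0.8 cutoffs."""
    if score > high:
        return "high"
    if score < low:
        return "low"
    return "intermediate"


bands = {reference: iptm_band(score) for reference, score in IPTM.items()}
best = max(IPTM, key=IPTM.get)
print(best, IPTM[best], bands[best])
```

With these cutoffs no complex reaches the high band: peptide 3 falls in the intermediate band and every other complex, including the known binder, is in the low band.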


3. Predicting Therapeutic Properties with PeptiVerse

All evaluated peptides are predicted to be soluble and non-hemolytic.

Except for peptide 111, they all appear to bind only weakly to the mutant SOD1 target.

Interestingly, the PeptiVerse binding affinity predictions:

  • confirm previous surprising results indicating a weak binding of the known binder (FLYRWLPSRRGG) to mutant SOD1
  • don’t corroborate AlphaFold results about the binding of peptides with higher ipTM, not even peptide 3 (ipTM=0.73)
  • newly identify peptide 111 as a potential binding candidate (medium binding to mutant SOD1)

Judging by predicted binding strength alone, peptide 111 seems to be the best candidate to proceed with. However, the discrepancy between the results of the different prediction tools is a first concern. Second, when checking further properties needed for the peptide to become a real drug, PeptiVerse indicates that this peptide has low membrane permeability and a fouling behavior. Thus, peptide 111 might not be the ideal candidate to advance toward therapy. This result aligns with the purpose of PeptiVerse explained in the reference paper, which is to filter out bad candidates early, and it may explain compromises made in the process, e.g. the weaker binding of the known SOD1 binder.

Peptide 111 extended analysis: [figure]

My strategy to find a potential candidate would be to (1) repeat the peptide generation steps and select hundreds of candidates, and (2) narrow down the selection by requiring consistent results across the different prediction tools, checking all therapeutic properties and evaluating how the binding site may affect therapeutic efficacy (see next section).
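The consistency check in step (2) could look like the sketch below, which combines the ipTM values with the PeptiVerse binding classes reported above. The cutoff of 0.6 and the rule "medium or strong counts as a hit" are assumptions made for illustration only.

```python
# reference: (AlphaFold ipTM, PeptiVerse binding class), as reported above.
PREDICTIONS = {
    "Binder": (0.30, "weak"),
    "0": (0.27, "weak"),
    "1": (0.28, "weak"),
    "3": (0.73, "weak"),
    "15": (0.46, "weak"),
    "43": (0.35, "weak"),
    "94": (0.42, "weak"),
    "111": (0.31, "medium"),
}

IPTM_CUTOFF = 0.6                        # assumed threshold
BINDING_HITS = {"medium", "strong"}      # assumed definition of a hit


def alphafold_hit(iptm):
    return iptm >= IPTM_CUTOFF


def peptiverse_hit(binding_class):
    return binding_class in BINDING_HITS


# Candidates flagged by at least one tool, and by both tools.
either = [r for r, (i, b) in PREDICTIONS.items() if alphafold_hit(i) or peptiverse_hit(b)]
consensus = [r for r, (i, b) in PREDICTIONS.items() if alphafold_hit(i) and peptiverse_hit(b)]
print("either tool:", either)
print("both tools:", consensus)
```

On the current set each tool flags a different peptide and none passes both, which is why a much larger pool of candidates would be needed before a consensus filter becomes useful.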


4. Generating Optimized Peptides with moPPIt

Colab moPPIt was used to generate peptides binding to an 8 aa motif centered on residue 4, the mutation site (motif: 1-8, specificity: off). The affinity, solubility and hemolysis guidance parameters were enabled.
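To see which residues the motif 1-8 covers, the sketch below slices the start of the A4V sequence. It assumes that motif positions are 1-based and counted on the submitted sequence, initiator methionine included; this indexing convention is an assumption and should be checked against the moPPIt Colab.

```python
# First 16 residues of the A4V sequence given in Part 1.
A4V_N_TERMINUS = "MATKVVCVLKGDGPVQ"


def motif_residues(sequence, start, end):
    """Return (position, residue) pairs for positions start..end, 1-based inclusive."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("motif range outside the sequence")
    return [(pos, sequence[pos - 1]) for pos in range(start, end + 1)]


motif = motif_residues(A4V_N_TERMINUS, 1, 8)
print("".join(residue for _, residue in motif))
```

Under this convention the mutated valine is position 5 of the submitted sequence (residue 4 when the methionine is not counted), so it falls inside the 1-8 window either way.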

Optimized Peptide 01: RKTTCQLTKEQG

Optimized Peptide 02: TEKSEEFKKKII

Optimized Peptide 03: RNETCVQKSKGF

moPPIt values:

Binder          Hemolysis   Solubility   Affinity   Motif
RKTTCQLTKEQG    0.97        0.92         5.82       0.88
TEKSEEFKKKII    0.98        0.75         5.67       0.42
RNETCVQKSKGF    0.97        0.83         6.70       0.88
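These values can be ranked programmatically once more peptides are generated. The sketch below keeps peptides whose motif score passes a cutoff and sorts them by affinity; the cutoff of 0.5 and the names `MoppitScore` and `rank_candidates` are hypothetical choices for illustration.

```python
from typing import NamedTuple


class MoppitScore(NamedTuple):
    binder: str
    hemolysis: float
    solubility: float
    affinity: float
    motif: float


# Values from the moPPIt table above.
SCORES = [
    MoppitScore("RKTTCQLTKEQG", 0.97, 0.92, 5.82, 0.88),
    MoppitScore("TEKSEEFKKKII", 0.98, 0.75, 5.67, 0.42),
    MoppitScore("RNETCVQKSKGF", 0.97, 0.83, 6.70, 0.88),
]


def rank_candidates(scores, min_motif=0.5):
    """Drop peptides below the (assumed) motif cutoff, then sort by affinity, highest first."""
    kept = [s for s in scores if s.motif >= min_motif]
    return sorted(kept, key=lambda s: s.affinity, reverse=True)


ranked = rank_candidates(SCORES)
for score in ranked:
    print(score.binder, score.affinity, score.motif)
```

With this cutoff TEKSEEFKKKII is dropped for its low motif score, and RNETCVQKSKGF ranks ahead of RKTTCQLTKEQG on affinity.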

Analysis of the optimized peptides in PeptiVerse

RKTTCQLTKEQG [figure: PeptiVerse results]

TEKSEEFKKKII [figure: PeptiVerse results]

RNETCVQKSKGF [figure: PeptiVerse results]

Interpretation:

  • All peptides bind only weakly to the target protein and are not permeable.
  • However, all peptides are soluble, non-hemolytic and non-fouling.
  • The hemolysis results of moPPIt and PeptiVerse appear contradictory: the exact meaning of the moPPIt “hemolysis” values (for example, whether a high value indicates a non-hemolytic peptide) needs to be checked.

Analysis of the optimized peptides in AlphaFold

[figure: AlphaFold predictions for the optimized peptides]

Interpretation:

  • moPPIt-generated peptide 02 doesn’t interact with the N-terminus, which is consistent with its lower moPPIt motif score (0.42).
  • The predicted binding scores of optimized peptides 01 and 02 are higher than that of the known binder.

CONCLUSION

While the binding affinity of the known binder (FLYRWLPSRRGG) is evaluated as weak in PeptiVerse, its score of 5.96 seems sufficient for it to bind to the target protein. Besides, the known binder exhibits all the therapeutic characteristics needed: it is soluble, permeable, non-hemolytic and non-fouling.

EVALUATION OF THE KNOWN BINDER (FLYRWLPSRRGG) IN PEPTIVERSE:

[figure: PeptiVerse results for FLYRWLPSRRGG]

None of the peptides generated in this assignment combines all of these properties at once. Concerning optimized peptide 01: a weaker permeability is not a deal breaker for potential therapeutic use, but given that its binding affinity isn’t strong, I would exclude it from the list of potential therapeutic candidates for now. My strategy for the next steps would be to generate more peptides using moPPIt while testing different motifs: different motif lengths at a given site (e.g. around the mutation site) and different site locations. In particular, the electrostatic loop has been reported to be involved in the misfolding of SOD1 and the formation of toxic oligomers. Because it has a highly exposed surface and structural flexibility, this site is a top candidate motif (see documentation).


DOCUMENTATION

Colab PepMLM

Source Protein Sequence: UniProt P00441

Simple explanations of the reference paper:

https://chatgpt.com/s/t_69b036610424819197b52653c5b55f6b

https://chatgpt.com/s/t_69b03ad48f748191aa89855f9d3830e4

https://chatgpt.com/s/t_69b03d68d6908191a0c3e4b251c01972

Strategies for creating peptides + explanation of the top K value:

https://chatgpt.com/s/t_69b0555069408191ba5e1253c8c345f2

https://chatgpt.com/s/t_69b05b9e75bc8191876d8d01f138fbf8

Perplexity score: https://chatgpt.com/s/t_69b05b9e75bc8191876d8d01f138fbf8

Creation of the peptides: Peptides were created using the PepMLM Colab. The slider functions were blocked, so the parameters (length, top K and number of binders) were changed manually in the code. Peptides were selected according to their perplexity values using the interactive table function in the Generate Peptide block.


The pseudo perplexity value of the known SOD1-binding peptide FLYRWLPSRRGG was calculated by Gemini within Colab using the compute_pseudo_perplexity function.


More about outlier binders: https://chatgpt.com/s/t_69b09f5842848191ad68e48392e7c919


AlphaFold

The mutant A4V SOD1 sequence and each peptide sequence were submitted as separate chains to model the protein-peptide complex in AlphaFold.

SOD1 structure

Explanation of the parameters in the AlphaFold analysis results:

https://chatgpt.com/s/t_69b1d7639604819191417d9ae3069f9f

https://chatgpt.com/s/t_69b486d55cf48191b082ed6264b81c3a

Data

Mutant SOD1 - Peptide 00 complex https://alphafoldserver.com/fold/43ecc9ff963fde21

Mutant SOD1 - Peptide 01 complex https://alphafoldserver.com/fold/86c286d21a2176a

Mutant SOD1 - Peptide 03 complex https://alphafoldserver.com/fold/58da5ac311db6c0a

Mutant SOD1 - Peptide 15 complex https://alphafoldserver.com/fold/19055fa61aaf71a5

Mutant SOD1 - Peptide 43 complex https://alphafoldserver.com/fold/556af6d37be44d8f

Mutant SOD1 - Peptide 94 complex https://alphafoldserver.com/fold/3dba002c05b228b1

Mutant SOD1 - Peptide 111 complex https://alphafoldserver.com/fold/4018c707b09f0f5a


PeptiVerse

Peptides were evaluated using the PeptiVerse platform.

Reference paper

Simple explanations of the reference paper: https://chatgpt.com/s/t_69b489a6aa048191bbbe0e45a0227899


Colab moPPIt

Optimized peptides were generated using Colab moPPIt.

Reference paper

Simple explanations of the reference paper: https://chatgpt.com/s/t_69b53a1913b08191bbeffd3d085f973d

Run time: 30 min for 3 peptides (no access to A100 or L4 GPU).


Gemini coded an extra cell to download results.


ChatGPT feedback on the motif: https://chatgpt.com/s/t_69b7054bb99881918c31e0f77b6662d4