Week 5 HW: Protein Design Part II

Part A: SOD1 Binder Peptide Design (From Pranam)

Superoxide dismutase 1 (SOD1) is a cytosolic antioxidant enzyme that converts superoxide radicals into hydrogen peroxide and oxygen. In its native state, it forms a stable homodimer and binds copper and zinc.

Mutations in SOD1 cause familial Amyotrophic Lateral Sclerosis (ALS). Among them, the A4V mutation (Alanine → Valine at residue 4) leads to one of the most aggressive forms of the disease. The mutation subtly destabilizes the N-terminus, perturbs folding energetics, and promotes toxic aggregation.

Your challenge:

  1. Design short peptides that bind mutant SOD1.
  2. Then decide which ones are worth advancing toward therapy.

Part 1: Generate Binders with PepMLM

  1. Begin by retrieving the human SOD1 sequence from UniProt (P00441) and introducing the A4V mutation.

MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
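The substitution in step 1 can also be reproduced programmatically. The following is a minimal sketch, not part of the assignment: `apply_point_mutation` is a hypothetical helper, and it assumes the usual SOD1 convention in which residue numbering skips the initiator methionine, so A4 is the fifth character of the UniProt sequence.

```python
def apply_point_mutation(sequence: str, mutation: str, offset: int = 0) -> str:
    """Apply a substitution such as 'A4V' to a protein sequence.

    `offset` is added to the residue number to obtain the 1-based position in
    `sequence` (use 1 when the numbering skips the initiator methionine).
    """
    wild_type, position, mutant = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position + offset - 1  # convert to a 0-based string index
    if sequence[index] != wild_type:
        raise ValueError(
            f"Expected {wild_type} at residue {position}, found {sequence[index]}"
        )
    return sequence[:index] + mutant + sequence[index + 1:]


# Wild-type human SOD1 (UniProt P00441), split into chunks for readability.
SOD1_WILD_TYPE = (
    "MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS"
    "AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV"
    "HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ"
)

sod1_a4v = apply_point_mutation(SOD1_WILD_TYPE, "A4V", offset=1)
print(sod1_a4v[:11])  # MATKVVCVLKG
```

The check against the expected wild-type residue guards against off-by-one errors in the numbering convention.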

  2. Using the PepMLM Colab linked from the HuggingFace PepMLM-650M model card:

    a. Generate four peptides of length 12 amino acids conditioned on the mutant SOD1 sequence.

    b. To your generated list, add the known SOD1-binding peptide FLYRWLPSRRGG for comparison.

    c. Record the perplexity scores that indicate PepMLM’s confidence in the binders.

To calculate the reference binder pseudo-perplexity, I used the compute_pseudo_perplexity function defined in the Load Model cell. For this purpose, I created an additional cell in the Colab notebook and computed the pseudo-perplexity of the known SOD1-binding peptide FLYRWLPSRRGG under the same conditions used for the generated binders.

    #@title Calculate Pseudo Perplexity
    comp_binder_seq = 'FLYRWLPSRRGG'
    ppl_value = compute_pseudo_perplexity(model, tokenizer, protein_seq, comp_binder_seq)
    print(f"Pseudo Perplexity: {ppl_value}")

The resulting pseudo-perplexity values are shown in the table below. Lower pseudo-perplexity indicates higher model confidence in the corresponding binder sequence. Notably, three of the four PepMLM-generated binders (predicted 1, 2, and 3) have lower pseudo-perplexity than the reference peptide, suggesting that the model assigns higher confidence to these sequences when conditioned on the mutant SOD1 context. Only predicted 0 scores higher than the reference (24.75 vs. 20.64).

| Binder | Binder Sequence | Pseudo-Perplexity |
|---|---|---|
| reference | FLYRWLPSRRGG | 20.635231 |
| predicted 0 | WLVYVVAVRLWK | 24.751888 |
| predicted 1 | WLYYPAVLEHKE | 15.901555 |
| predicted 2 | HHYGAVAVAWKE | 9.600505 |
| predicted 3 | HRYYAAAARLWK | 12.24365 |
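
For context on what these scores measure: pseudo-perplexity for a masked language model is commonly defined as the exponential of the average negative log-likelihood of each token when it is masked in turn. The sketch below illustrates only that final arithmetic step on a list of per-residue probabilities. `pseudo_perplexity_from_probs` is a hypothetical helper, and the notebook's `compute_pseudo_perplexity` may differ in details such as how the binder is tokenized and masked.

```python
import math


def pseudo_perplexity_from_probs(true_residue_probs):
    """Pseudo-perplexity from the probability assigned to each true residue
    when that position is masked (one probability per residue)."""
    if not true_residue_probs:
        raise ValueError("At least one probability is required")
    mean_nll = -sum(math.log(p) for p in true_residue_probs) / len(true_residue_probs)
    return math.exp(mean_nll)


# Illustrative values only (not model output): a 12-mer in which every residue
# is predicted with probability 1/20, i.e. a uniform guess over 20 amino acids.
uniform_guess = [1 / 20] * 12
print(round(pseudo_perplexity_from_probs(uniform_guess), 3))  # 20.0
```

Under this definition, a value near 20 is what a uniform guess over the 20 standard amino acids would produce, which gives a rough reference point when reading the table.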

Part 2: Evaluate Binders with AlphaFold3

  1. Navigate to the AlphaFold Server
  2. For each peptide, submit the mutant SOD1 sequence followed by the peptide sequence as separate chains to model the protein-peptide complex.
  3. Record the ipTM score and briefly describe where the peptide appears to bind. Does it localize near the N-terminus where A4V sits? Does it engage the β-barrel region or approach the dimer interface? Does it appear surface-bound or partially buried?
  4. In a short paragraph, describe the ipTM values you observe and whether any PepMLM-generated peptide matches or exceeds the known binder.

| Complex | ipTM value | Binder structure | Binding site |
|---|---|---|---|
| SOD1-Reference | 0.33 | Disordered | The peptide's C-terminus sits between two small α-helices (aa 51–67 and 134–136), while its N-terminus lies near the bottom of the β-barrel, opposite the V4 residue. |
| SOD1-Binder0 | 0.48 | Disordered | Binds on the SOD1 surface near a β-sheet region of the β-barrel (aa 90–102). |
| SOD1-Binder1 | 0.20 | Short α-helix in the middle | Buried between two small α-helices (aa 51–67 and 134–136) next to the β-barrel. Not close to V4. |
| SOD1-Binder2 | 0.34 | Disordered | Same binding site as Binder1. |
| SOD1-Binder3 | 0.42 | Disordered | Binds on the β-barrel surface, far from residue V4 of SOD1. |

The predicted protein–peptide complexes have relatively low ipTM values, ranging from 0.20 to 0.48, which indicates limited confidence in the modeled interfaces. Among the PepMLM-generated peptides, Binder0 has the highest ipTM (0.48), followed by Binder3 (0.42); both exceed the known SOD1-binding reference peptide (0.33). Binder2 (0.34) is essentially equal to the reference, and Binder1 (0.20) falls below it. The peptides bind to surface-exposed regions of the SOD1 β-barrel or near the two small α-helical elements rather than close to the N-terminal region harboring the A4V mutation. Four of the five peptides remain disordered in the predicted complex, suggesting flexible and potentially weak interactions. Overall, two generated binders show clearly higher predicted interface confidence than the reference, but at these ipTM values none of the interfaces can be considered reliably modeled.
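
The comparison against the reference can be made explicit with a few lines of Python. This is a small sketch using the ipTM values recorded in the table above; the variable names are illustrative.

```python
# ipTM values copied from the table above.
iptm = {
    "Reference": 0.33,
    "Binder0": 0.48,
    "Binder1": 0.20,
    "Binder2": 0.34,
    "Binder3": 0.42,
}

reference_iptm = iptm["Reference"]

# Rank complexes from highest to lowest ipTM.
ranked = sorted(iptm.items(), key=lambda item: item[1], reverse=True)

# Generated peptides whose ipTM is at least the reference value.
at_or_above_reference = [
    name
    for name, score in ranked
    if name != "Reference" and score >= reference_iptm
]

for name, score in ranked:
    print(f"{name:<10} {score:.2f}")
print("At or above reference:", at_or_above_reference)
```

Differences as small as the 0.01 between Binder2 and the reference are unlikely to be meaningful, so the list is best read as a first filter rather than a ranking.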

Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse

Structural confidence alone is insufficient for therapeutic development. Using PeptiVerse, let’s evaluate the therapeutic properties of your peptide! For each PepMLM-generated peptide:

  1. Paste the peptide sequence.
  2. Paste the A4V mutant SOD1 sequence in the target field.
  3. Check the boxes:

    a. Predicted binding affinity

    b. Solubility

    c. Hemolysis probability

    d. Net charge (pH 7)

    e. Molecular weight
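
Two of these properties, net charge and molecular weight, can be cross-checked by hand. The sketch below is a rough local estimate and not PeptiVerse's method: it assumes free termini (whose charges cancel), counts K and R as +1 and D and E as -1 at pH 7 while treating histidine as neutral, and uses standard average residue masses. The function names are hypothetical.

```python
# Average residue masses in Da (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886, "C": 103.1388,
    "E": 129.1155, "Q": 128.1307, "G": 57.0519, "H": 137.1411, "I": 113.1594,
    "L": 113.1594, "K": 128.1741, "M": 131.1926, "F": 147.1766, "P": 97.1167,
    "S": 87.0782, "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_MASS = 18.015  # added once for the free N- and C-termini


def molecular_weight(peptide: str) -> float:
    """Average molecular weight of a linear, unmodified peptide in Da."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS


def approximate_net_charge(peptide: str) -> int:
    """Crude integer net charge at pH 7: (K + R) - (D + E)."""
    positive = sum(peptide.count(aa) for aa in "KR")
    negative = sum(peptide.count(aa) for aa in "DE")
    return positive - negative


peptides = {
    "reference": "FLYRWLPSRRGG",
    "predicted 0": "WLVYVVAVRLWK",
    "predicted 1": "WLYYPAVLEHKE",
    "predicted 2": "HHYGAVAVAWKE",
    "predicted 3": "HRYYAAAARLWK",
}

for name, seq in peptides.items():
    print(
        f"{name:<12} {seq}  charge {approximate_net_charge(seq):+d}  "
        f"MW {molecular_weight(seq):.1f} Da"
    )
```

Because histidine is partially protonated near neutral pH, tools that use pKa-based calculations will report fractional charges that differ slightly from these integers.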

Compare these predictions to what you observed structurally with AlphaFold3. In a short paragraph, describe what you see. Do peptides with higher ipTM also show stronger predicted affinity? Are any strong binders predicted to be hemolytic or poorly soluble? Which peptide best balances predicted binding and therapeutic properties?

Choose one peptide you would advance and justify your decision briefly.