Week 5 - Protein Design Part II
This week we learn how cutting-edge AI and protein language models are used to design functional proteins and peptides “in silico”.

Part A: SOD1 Binder Peptide Design
Your challenge:
Design short peptides that bind mutant SOD1.
Then decide which ones are worth advancing toward therapy.
You will use three models developed in our lab:
- PepMLM: target sequence-conditioned peptide generation via masked language modeling
- PeptiVerse: therapeutic property prediction
- moPPIt: motif-specific multi-objective peptide design using Multi-Objective Guided Discrete Flow Matching (MOG-DFM)
Part 1: Generate Binders with PepMLM
- Begin by retrieving the human SOD1 sequence from UniProt (P00441) and introducing the A4V mutation.
Here is the SOD1 sequence:
sp|P00441|SODC_HUMAN Superoxide dismutase [Cu-Zn] OS=Homo sapiens OX=9606 GN=SOD1 PE=1 SV=2
MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
And here it is with the A4V mutation (alanine replaced by valine at residue 4; the numbering does not count the initiator methionine):
MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
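The substitution above can be reproduced programmatically, which also guards against off-by-one mistakes in the numbering. This is a minimal sketch: `apply_mutation` and its `met_counted` flag are hypothetical helpers written for this page, not part of any of the tools used here.

```python
# Wild-type human SOD1 (UniProt P00441), as listed above.
WILD_TYPE = (
    "MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS"
    "AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV"
    "HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ"
)


def apply_mutation(sequence, mutation, met_counted=False):
    """Apply a point mutation such as 'A4V' to a protein sequence.

    With met_counted=False, residue 1 is the residue after the initiator Met,
    which is the convention used for A4V here.
    """
    old, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    # Convert the residue number to a 0-based string index.
    index = position - 1 if met_counted else position
    if sequence[index] != old:
        raise ValueError(
            f"expected {old} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + new + sequence[index + 1:]


MUTANT = apply_mutation(WILD_TYPE, "A4V")
print(MUTANT[:10])
```

The check on the original residue makes the function fail loudly if the numbering convention is wrong, for example when `A4V` is applied while counting the methionine.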
- Generate four peptides of length 12 amino acids conditioned on the mutant SOD1 sequence.
| Binder | Pseudo-perplexity |
|---|---|
| WRSYAYXLRLGE | 16.616841 |
| WRVGAYAARWKK | 11.673187 |
| WHYPVAAVAHKK | 12.469790 |
| WHVPVAAVAWKE | 12.689345 |
| FLYRWLPSRRGG | n/a (known binder) |
For the first peptide, the X likely stands for any amino acid, so I chose a specific residue in its place to run the AlphaFold prediction.
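A small script can rank the generated peptides and flag letters such as X before they are passed to downstream tools. This is a sketch under the usual reading that a lower pseudo-perplexity means the language model is more confident in the sequence; the helper name `noncanonical_positions` is made up for this example.

```python
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids

# (peptide, pseudo-perplexity) pairs copied from the PepMLM results above
candidates = [
    ("WRSYAYXLRLGE", 16.616841),
    ("WRVGAYAARWKK", 11.673187),
    ("WHYPVAAVAHKK", 12.46979),
    ("WHVPVAAVAWKE", 12.689345),
]


def noncanonical_positions(peptide):
    """Return (1-based position, letter) for each letter outside the 20 standard residues."""
    return [(i, aa) for i, aa in enumerate(peptide, start=1) if aa not in CANONICAL]


# Sort from lowest (most confident) to highest pseudo-perplexity.
ranked = sorted(candidates, key=lambda item: item[1])
for peptide, ppl in ranked:
    flags = noncanonical_positions(peptide)
    note = f"residue must be chosen at {flags}" if flags else "ok"
    print(f"{peptide}  {ppl:.2f}  {note}")
```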
Part 2: Evaluate Binders with AlphaFold3
- Navigate to the AlphaFold Server: alphafoldserver.com
- For each peptide, submit the mutant SOD1 sequence followed by the peptide sequence as separate chains to model the protein-peptide complex.
- Record the ipTM score and briefly describe where the peptide appears to bind. Does it localize near the N-terminus where A4V sits? Does it engage the β-barrel region or approach the dimer interface? Does it appear surface-bound or partially buried?
- In a short paragraph, describe the ipTM values you observe and whether any PepMLM-generated peptide matches or exceeds the known binder.
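Instead of typing each complex into the web form, the jobs can be prepared as a JSON file. The sketch below assumes the server's JSON upload format (a list of jobs with `proteinChain` entries and the `alphafoldserver` dialect); the field names should be checked against the server's own documentation before uploading. The peptide containing X is left out because a concrete residue has to be chosen for it first.

```python
import json

MUTANT_SOD1 = (
    "MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS"
    "AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV"
    "HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ"
)

# X-free PepMLM peptides plus the known binder
peptides = ["WRVGAYAARWKK", "WHYPVAAVAHKK", "WHVPVAAVAWKE", "FLYRWLPSRRGG"]


def build_job(name, target, peptide):
    """One job: target and peptide as two separate protein chains (assumed schema)."""
    return {
        "name": name,
        "modelSeeds": [],  # empty list: let the server pick the seed
        "sequences": [
            {"proteinChain": {"sequence": target, "count": 1}},
            {"proteinChain": {"sequence": peptide, "count": 1}},
        ],
        "dialect": "alphafoldserver",
        "version": 1,
    }


jobs = [build_job(f"SOD1_A4V_{p}", MUTANT_SOD1, p) for p in peptides]
job_file_text = json.dumps(jobs, indent=2)
print(job_file_text[:200])
```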
| Binder | ipTM | Description |
|---|---|---|
| WRSYAYXLRLGE | 0.33 | On the side of the β-barrel, but on the opposite side from V4 |
| WRVGAYAARWKK | 0.41 | Also on the β-barrel, but close to the N-terminus and V4 (cf. image) |
| WHYPVAAVAHKK | 0.33 | Lies horizontally along the β-barrel, not close to V4 |
| WHVPVAAVAWKE | 0.31 | On the opposite side of the protein from the N-terminus |
| FLYRWLPSRRGG | 0.39 | On the opposite side of the protein from the N-terminus but |
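The comparison against the known binder can be made explicit with a few lines of Python. The values are the ipTM scores recorded in the table above; nothing else is assumed.

```python
# ipTM values recorded above for the PepMLM-generated peptides
iptm = {
    "WRSYAYXLRLGE": 0.33,
    "WRVGAYAARWKK": 0.41,
    "WHYPVAAVAHKK": 0.33,
    "WHVPVAAVAWKE": 0.31,
}
KNOWN_BINDER = "FLYRWLPSRRGG"
known_iptm = 0.39

# Peptides whose ipTM matches or exceeds the known binder
matches_or_exceeds = [p for p, score in iptm.items() if score >= known_iptm]

for peptide, score in sorted(iptm.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{peptide}  ipTM={score:.2f}  delta vs {KNOWN_BINDER}: {score - known_iptm:+.2f}")
print("matches or exceeds known binder:", matches_or_exceeds)
```

Only WRVGAYAARWKK (0.41) exceeds the known binder (0.39); the other three peptides score lower.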

Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse
-> Still to be finished: I have fallen a bit behind because the computing part took quite some time to get working, but I'm catching up.
Structural confidence alone is insufficient for therapeutic development. Using PeptiVerse, let’s evaluate the therapeutic properties of your peptide! For each PepMLM-generated peptide:
- Paste the peptide sequence.
- Paste the A4V mutant SOD1 sequence in the target field.
- Check the boxes:
  - Predicted binding affinity
  - Solubility
  - Hemolysis probability
  - Net charge (pH 7)
  - Molecular weight
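Two of these properties, molecular weight and net charge at pH 7, follow directly from the sequence, so they can be sanity-checked by hand. The sketch below uses standard average residue masses and the Henderson-Hasselbalch equation with one assumed pKa set; published pKa scales differ slightly, and PeptiVerse may compute these values differently.

```python
# Average residue masses in Da (free amino acid minus one water)
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528

# Assumed pKa values (one common set; other scales give slightly different charges)
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM, PKA_C_TERM = 8.6, 3.6


def molecular_weight(peptide):
    """Average molecular weight of a linear, unmodified peptide in Da."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def net_charge(peptide, ph=7.0):
    """Net charge from Henderson-Hasselbalch, free N- and C-termini assumed."""
    positive = 1 / (1 + 10 ** (ph - PKA_N_TERM))
    negative = 1 / (1 + 10 ** (PKA_C_TERM - ph))
    for aa in peptide:
        if aa in PKA_POSITIVE:
            positive += 1 / (1 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            negative += 1 / (1 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return positive - negative


# The peptide containing X is skipped: it has no defined mass or charge.
for peptide in ["WRVGAYAARWKK", "WHYPVAAVAHKK", "WHVPVAAVAWKE"]:
    print(f"{peptide}  MW={molecular_weight(peptide):.1f} Da  charge={net_charge(peptide):+.2f}")
```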
Compare these predictions to what you observed structurally with AlphaFold3. In a short paragraph, describe what you see. Do peptides with higher ipTM also show stronger predicted affinity? Are any strong binders predicted to be hemolytic or poorly soluble? Which peptide best balances predicted binding and therapeutic properties?
Choose one peptide you would advance and justify your decision briefly.
Part 4: Generate Optimized Peptides with moPPIt
Now, move from sampling to controlled design. moPPIt uses Multi-Objective Guided Discrete Flow Matching (MOG-DFM) to steer peptide generation toward specific residues and optimize binding and therapeutic properties simultaneously. Unlike PepMLM, which samples plausible binders conditioned on just the target sequence, moPPIt lets you choose where you want to bind and optimize multiple objectives at once.
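To make the idea of optimizing several objectives at once concrete, here is a sketch of Pareto filtering: a candidate is kept if no other candidate is at least as good on every objective and strictly better on one. This is a generic illustration of multi-objective trade-offs, not the MOG-DFM algorithm, and the candidate names and scores are hypothetical toy values, not model outputs.

```python
def dominates(a, b):
    """True if score tuple a is >= b on every objective and > b on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scored):
    """Names of candidates that no other candidate dominates."""
    return [
        name
        for name, scores in scored.items()
        if not any(
            dominates(other, scores)
            for other_name, other in scored.items()
            if other_name != name
        )
    ]


# Hypothetical toy scores: (affinity, solubility, 1 - hemolysis probability)
toy_scores = {
    "candidate_1": (0.9, 0.4, 0.7),
    "candidate_2": (0.6, 0.8, 0.9),
    "candidate_3": (0.5, 0.3, 0.6),
}
print(pareto_front(toy_scores))
```

Here `candidate_3` is worse than `candidate_1` on every objective and is dropped, while `candidate_1` and `candidate_2` represent different trade-offs and both remain.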
- Open the moPPIt Colab linked from the HuggingFace moPPIt model card.
- Make a copy and switch to a GPU runtime.
- In the notebook:
  - Paste your A4V mutant SOD1 sequence.
  - Choose specific residue indices on SOD1 that you want your peptide to bind (for example, residues near position 4, the dimer interface, or another surface patch).
  - Set peptide length to 12 amino acids.
  - Enable motif and affinity guidance (and solubility/hemolysis guidance if available). Generate peptides.
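When choosing residue indices near position 4, note that the A4V numbering skips the initiator methionine, so the mutated valine is the fifth residue of the pasted sequence. The sketch below converts between the two conventions and lists a window of neighbouring residues. It assumes the notebook expects 1-based indices into the pasted sequence, which should be checked in the notebook itself; both helper names are made up for this example.

```python
MUTANT_SOD1 = (
    "MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS"
    "AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV"
    "HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ"
)


def to_sequence_index(literature_position, met_counted=False):
    """Convert a literature residue number to a 1-based index in the pasted sequence."""
    return literature_position if met_counted else literature_position + 1


def window(center, flank, length):
    """1-based indices within `flank` residues of `center`, clipped to the sequence ends."""
    return list(range(max(1, center - flank), min(length, center + flank) + 1))


center = to_sequence_index(4)  # A4V: numbering without the initiator Met
assert MUTANT_SOD1[center - 1] == "V"  # confirm we landed on the mutated residue
motif = window(center, 3, len(MUTANT_SOD1))
print(center, motif)
```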
After generation, briefly describe how these moPPIt peptides differ from your PepMLM peptides. How would you evaluate these peptides before advancing them to clinical studies?
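One simple way to quantify how two sets of 12-mers differ is pairwise sequence identity at aligned positions. The sketch below applies it to the PepMLM peptides and the known binder from Part 1; the same function could be run between PepMLM and moPPIt peptides once those are generated. It is only a position-by-position comparison, not an alignment.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of positions with the same residue in two equal-length peptides."""
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


peptides = [
    "WRSYAYXLRLGE",
    "WRVGAYAARWKK",
    "WHYPVAAVAHKK",
    "WHVPVAAVAWKE",
    "FLYRWLPSRRGG",  # known binder
]

for a, b in combinations(peptides, 2):
    print(f"{a}  {b}  identity={identity(a, b):.2f}")
```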
Part B: BRD4 Drug Discovery Platform Tutorial
Skipped so far.
Part C: Final Project: L-Protein Mutants
Cf. doc.