Week 5 HW: Protein Design Part II


Part A: SOD1 Binder Peptide Design

Part 1: PepMLM Peptide Generation

Wild-type SOD1 sequence (UniProt P00441):

>sp|P00441|SODC_HUMAN Superoxide dismutase [Cu-Zn] OS=Homo sapiens OX=9606 GN=SOD1 PE=1 SV=2
MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

SOD1 A4V mutation sequence:

MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

To generate candidate binders, I retrieved the human SOD1 sequence from UniProt (accession P00441) and introduced the A4V mutation. A4V follows mature-protein numbering, which omits the initiator methionine, so the substituted alanine is residue 5 of the UniProt sequence (MATKA becomes MATKV). I then used the PepMLM Colab notebook to generate four 12-amino-acid peptides conditioned on the mutant sequence. Because only four peptides were generated, I used a moderate top-k of 5 so that the candidates would differ enough to sample distinct potential binding interactions with SOD1.
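The substitution step can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the assignment: the helper `apply_substitution` and the constant `MATURE_OFFSET` are hypothetical names, and the offset of 1 assumes that A4V is numbered on the mature protein (initiator methionine removed) while the UniProt sequence still includes that methionine.

```python
# Wild-type human SOD1 (UniProt P00441), including the initiator methionine.
WILD_TYPE = (
    "MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGG"
    "PKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACG"
    "VIGIAQ"
)

# Assumption: mutation names such as A4V count from the mature protein,
# so they are shifted by one relative to the UniProt sequence.
MATURE_OFFSET = 1


def apply_substitution(sequence, position, expected, replacement):
    """Replace the residue at a 1-based position, checking the original residue first."""
    index = position - 1
    if sequence[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


mutant = apply_substitution(WILD_TYPE, 4 + MATURE_OFFSET, "A", "V")
print(mutant[:10])
```

Checking the expected residue before replacing it guards against off-by-one numbering mistakes, which are easy to make when a mutation name and a database sequence use different conventions.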

| Index | Binder | Pseudo-perplexity | Interpretation |
|---|---|---|---|
| 0 | MESILTLLLKRK | 12.187512 | Moderate confidence |
| 1 | MKSLITDQLLVI | 10.999398 | Best candidate (lowest perplexity) |
| 2 | MSLTETDLLIVV | 13.539035 | Acceptable but weaker |
| 3 | MAKILVLLQRKI | 19.123590 | Low confidence |
| Known binder | FLYRWLPSRRGG | 21.421776 | Lowest confidence |

For comparison, I scored the known SOD1-binding peptide FLYRWLPSRRGG with PepMLM, which gave a pseudo-perplexity of 21.42. This is higher than all four generated peptides, meaning the model assigns the known binder a lower likelihood than the new candidates. MKSLITDQLLVI had the lowest pseudo-perplexity (11.00), so it is the sequence the model is most confident in. Because an experimentally validated binder scores worst here, language-model likelihood clearly does not track measured binding perfectly, but it can still help propose plausible new candidates.
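To make the scores easier to interpret, the sketch below shows one common way a pseudo-perplexity is defined: the exponential of the mean negative log-likelihood over the peptide positions. Whether the PepMLM notebook computes it in exactly this way is an assumption here, and `pseudo_perplexity` is a hypothetical helper that takes per-position log-probabilities rather than calling the model.

```python
import math


def pseudo_perplexity(token_log_probs):
    """Exponential of the mean negative log-likelihood over peptide positions.

    token_log_probs: natural-log probabilities the model assigned to the true
    residue at each position (assumed to come from masking one position at a time).
    """
    if not token_log_probs:
        raise ValueError("need at least one position")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Reference point: a model that guesses uniformly over the 20 standard amino
# acids assigns probability 1/20 to every residue of a 12-mer.
uniform_log_probs = [math.log(1 / 20)] * 12
baseline = pseudo_perplexity(uniform_log_probs)
print(round(baseline, 6))
```

Under this definition a score near 20 corresponds to uniform guessing over the standard amino acids, which gives a rough scale for reading the table: lower values mean the model found the residues more predictable given the target.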

Part 2: AlphaFold3 Binding Evaluation

MESILTLLLKRK peptide: ipTM = 0.54, pTM = 0.86, localized near the β-barrel region

MKSLITDQLLVI peptide: ipTM = 0.25, pTM = 0.84, localized near the N-terminus

MSLTETDLLIVV peptide: ipTM = 0.34, pTM = 0.86, localized near the β-barrel region

MAKILVLLQRKI peptide: ipTM = 0.63, pTM = 0.88, localized near the β-barrel region

FLYRWLPSRRGG peptide: ipTM = 0.34, pTM = 0.81, localized near the N-terminus

The peptide-protein complexes were predicted with AlphaFold3 against the A4V mutant of SOD1. MAKILVLLQRKI showed the strongest predicted interaction (ipTM = 0.63), near the β-barrel region, while the known binder FLYRWLPSRRGG scored lower (ipTM = 0.34) and localized near the N-terminus.
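The two scoring steps can be compared side by side with a short script. This is only a sketch: the tuples are copied from the PepMLM and AlphaFold3 results reported in this write-up, and the variable names are illustrative.

```python
# (peptide, PepMLM pseudo-perplexity, AlphaFold3 ipTM) from the results above.
results = [
    ("MESILTLLLKRK", 12.187512, 0.54),
    ("MKSLITDQLLVI", 10.999398, 0.25),
    ("MSLTETDLLIVV", 13.539035, 0.34),
    ("MAKILVLLQRKI", 19.123590, 0.63),
    ("FLYRWLPSRRGG", 21.421776, 0.34),
]

# Higher ipTM means a more confident predicted interface.
by_iptm = sorted(results, key=lambda r: r[2], reverse=True)
# Lower pseudo-perplexity means the language model finds the peptide more likely.
by_perplexity = sorted(results, key=lambda r: r[1])

best_iptm = by_iptm[0][0]
best_perplexity = by_perplexity[0][0]

for peptide, perplexity, iptm in by_iptm:
    print(f"{peptide}  ipTM={iptm:.2f}  pseudo-perplexity={perplexity:.2f}")
```

Sorting both ways shows that the two metrics disagree on this set: the peptide with the lowest pseudo-perplexity (MKSLITDQLLVI) has the lowest ipTM, while the top ipTM peptide (MAKILVLLQRKI) ranks fourth of five by pseudo-perplexity.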

Part 3: PeptiVerse Therapeutic Evaluation

MESILTLLLKRK peptide:

MKSLITDQLLVI peptide:

MSLTETDLLIVV peptide:

MAKILVLLQRKI peptide:

FLYRWLPSRRGG peptide:

I evaluated the peptides in PeptiVerse against the A4V mutant of SOD1 to assess therapeutic properties: predicted binding affinity, solubility, hemolysis probability, net charge, and molecular weight. In general, peptides with higher AlphaFold3 ipTM scores tended to show stronger predicted binding affinity, although some candidates showed slightly lower solubility. None of the peptides showed strong hemolytic risk. MAKILVLLQRKI appeared to offer the best balance of structural binding confidence, predicted affinity, and acceptable physicochemical properties, so I selected it as the peptide to advance for further study.
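Two of the simpler physicochemical properties can be approximated directly from sequence, which is a useful sanity check on tool output. The sketch below is a crude approximation and not PeptiVerse's method: it counts K/R as +1 and D/E as -1, ignores histidine and assumes the terminal charges cancel, and the set of residues treated as hydrophobic is an assumed grouping.

```python
POSITIVE = set("KR")            # treated as +1 near neutral pH
NEGATIVE = set("DE")            # treated as -1 near neutral pH
HYDROPHOBIC = set("AVILMFWC")   # assumed grouping, conventions vary


def approx_net_charge(peptide):
    """Integer charge estimate: basic residues minus acidic residues."""
    positives = sum(aa in POSITIVE for aa in peptide)
    negatives = sum(aa in NEGATIVE for aa in peptide)
    return positives - negatives


def hydrophobic_fraction(peptide):
    """Fraction of residues in the assumed hydrophobic set."""
    return sum(aa in HYDROPHOBIC for aa in peptide) / len(peptide)


peptides = [
    "MESILTLLLKRK",
    "MKSLITDQLLVI",
    "MSLTETDLLIVV",
    "MAKILVLLQRKI",
    "FLYRWLPSRRGG",
]

for peptide in peptides:
    charge = approx_net_charge(peptide)
    fraction = hydrophobic_fraction(peptide)
    print(f"{peptide}  charge={charge:+d}  hydrophobic={fraction:.2f}")
```

A high hydrophobic fraction is one plausible reason a candidate might score lower on solubility, so estimates like these help explain trade-offs between predicted affinity and physicochemical properties.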

Part 4: moPPIt Optimized Peptides

Using moPPIt, I generated peptides by targeting residues near the A4V mutation on SOD1 while optimizing binding affinity and therapeutic properties. Compared to the peptides produced earlier with PepMLM, the moPPIt designs appeared more directed toward the chosen binding region and were predicted to have better physicochemical properties, including higher solubility and lower hemolysis risk. Before any candidate could advance toward clinical studies, the peptides would need further evaluation with structure prediction tools such as AlphaFold3, followed by experimental validation of binding strength, stability, and toxicity in laboratory assays.
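One simple way to pick target residues "near" the mutation is a window along the sequence, sketched below. The write-up does not state which residues were actually targeted or how they were passed to moPPIt, so the helper `residue_window`, the flank of 3, and the use of UniProt numbering (mutated residue at position 5) are all assumptions. A sequence window is also only a proxy, since residues close in 3D space can be far apart in sequence.

```python
# First residues of the A4V mutant sequence (UniProt numbering, Met = 1).
MUTANT_PREFIX = "MATKVVCVLKGDGPVQ"


def residue_window(sequence, center, flank):
    """Return (position, residue) pairs within `flank` residues of a 1-based center."""
    start = max(1, center - flank)
    end = min(len(sequence), center + flank)
    return [(pos, sequence[pos - 1]) for pos in range(start, end + 1)]


# Mutated valine is residue 5 when the initiator methionine is counted.
window = residue_window(MUTANT_PREFIX, center=5, flank=3)
print(window)
```

Clamping the window to the sequence bounds matters here because the mutation sits close to the N-terminus, so a larger flank would otherwise run past residue 1.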

Part C: L-Protein Mutants

The project focuses on improving the stability and folding efficiency of the lysis protein from the MS2 bacteriophage. In this assignment, computational tools are used to generate mutations that may improve the protein’s structural stability while maintaining its biological function. By comparing predicted folding energies and structural models, promising mutants can be identified for further experimental testing. Improving the stability of the lysis protein could help researchers better understand how bacteriophages disrupt bacterial cells, which may contribute to future strategies against antibiotic-resistant bacteria.