Week 5 HW: Protein Design Part II

SOD1 Binder Peptide Design (From Pranam)

Generate Binders with PepMLM

Human SOD1 sequence:

MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

A4V mutant SOD1 sequence:

MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
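As a quick sanity check on the two sequences above, the substitution can be located by comparing them position by position. This is a minimal sketch: the function name `point_differences` is illustrative, and only the first 20 residues of each sequence are used to keep the example short (the full sequences can be pasted in the same way).

```python
def point_differences(ref, alt):
    """Return (1-based position, ref residue, alt residue) for every mismatch."""
    if len(ref) != len(alt):
        raise ValueError("sequences must have the same length")
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(ref, alt)) if a != b]


# First 20 residues of the wild-type and A4V sequences listed above.
WT_NTERM = "MATKAVCVLKGDGPVQGIIN"
A4V_NTERM = "MATKVVCVLKGDGPVQGIIN"

diffs = point_differences(WT_NTERM, A4V_NTERM)
for pos, ref, alt in diffs:
    # The mutation name counts residues without the initiator Met,
    # so the position in the pasted sequence is shifted by one.
    print(f"{ref}{pos}{alt} counting Met = {ref}{pos - 1}{alt} without Met")
```

The single difference falls at position 5 of the pasted sequence, which corresponds to position 4 when the initiator Met is not counted, matching the A4V name.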

I pasted the A4V mutant sequence into the PepMLM Google Colab notebook and set the peptide length to 12. To compare against the pseudo-perplexity of the known binding peptide FLYRWLPSRRGG, I added another code cell to the notebook:

fig1
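For readers unfamiliar with the metric, pseudo-perplexity is usually defined as the exponential of the mean negative log-likelihood of each residue when it is masked in turn. The sketch below only illustrates that formula; it assumes PepMLM follows the usual definition, it does not call the model, and the log-probabilities are hypothetical values chosen for the example.

```python
import math


def pseudo_perplexity(token_log_probs):
    """exp of the mean negative log-probability (natural log) over masked positions."""
    if not token_log_probs:
        raise ValueError("need at least one log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical values, NOT PepMLM output: a 12-residue binder where the model
# assigns probability 0.1 to the true residue at every masked position.
uniform_binder = [math.log(0.1)] * 12
print(round(pseudo_perplexity(uniform_binder), 2))
```

Under this definition a lower value means the model finds each residue easier to predict in context, which is why lower pseudo-perplexity is read as higher plausibility.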

Here are the downloaded results rounded to hundredths:

Binder | Pseudo Perplexity
KRYYVTGVRLKK | 30.16
HRYPAVGVEHKX | 15.33
WRYPAAAVAWWX | 10.03
WRYPAAALAWKE | 11.92
FLYRWLPSRRGG | 20.64

WRYPAAAVAWWX showed the lowest pseudo-perplexity (10.03) in this generation session, meaning PepMLM rates it as the most plausible binder. Interestingly, the experimentally verified FLYRWLPSRRGG showed the second-highest pseudo-perplexity (20.64), so it scored worse than three of the four generated peptides. However, “the best” variant contains X (an undefined amino acid), so it should be interpreted carefully, as should language-model outputs in general.
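The ranking and the check for undefined residues can be reproduced from the values in the table. This is a minimal sketch: the dictionary simply restates the rounded values reported above, and the helper name `has_only_standard_residues` is illustrative.

```python
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")

# Rounded pseudo-perplexity values copied from the results table.
perplexity = {
    "KRYYVTGVRLKK": 30.16,
    "HRYPAVGVEHKX": 15.33,
    "WRYPAAAVAWWX": 10.03,
    "WRYPAAALAWKE": 11.92,
    "FLYRWLPSRRGG": 20.64,
}


def has_only_standard_residues(seq):
    """True if every character is one of the 20 standard amino acids."""
    return set(seq) <= STANDARD_AA


# Sort from lowest (most plausible) to highest pseudo-perplexity.
ranked = sorted(perplexity.items(), key=lambda item: item[1])
for rank, (seq, ppl) in enumerate(ranked, start=1):
    flag = "" if has_only_standard_residues(seq) else "  <- contains non-standard residue"
    print(f"{rank}. {seq}  {ppl:.2f}{flag}")

# Peptides that can be passed on to structure prediction as-is.
valid = [seq for seq, _ in ranked if has_only_standard_residues(seq)]
```

Filtering this way leaves three peptides with only standard residues, ordered by increasing pseudo-perplexity.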

Evaluate Binders with AlphaFold3

Binders 2 (HRYPAVGVEHKX) and 3 (WRYPAAAVAWWX, which has the lowest pseudo-perplexity) were excluded from the analysis because of the invalid “X” character.

Binder 1 (KRYYVTGVRLKK), ipTM = 0.47: fig2

Binder 4 (WRYPAAALAWKE), ipTM = 0.36: fig3

Binder 5, experimentally verified (FLYRWLPSRRGG), ipTM = 0.34: fig4

Evaluate Properties of Generated Peptides in the PeptiVerse

Binder 1 (KRYYVTGVRLKK): fig.5

Binder 2 (HRYPAVGVEHKX): fig.6

Binder 3 (WRYPAAAVAWWX): fig.7

Binder 4 (WRYPAAALAWKE): fig.9

Binder 5, experimentally verified (FLYRWLPSRRGG): fig.10

Binder | ipTM | Binding Affinity | Solubility | Hemolysis | Net Charge (pH 7) | Molecular Weight | Interpretation
KRYYVTGVRLKK | 0.47 | 6.26 | 0.51 | 0.06 | 4.76 | 1510.8 | Moderate structural binding support; strongly cationic peptide may enhance target interaction but raises risk of nonspecific electrostatic binding and moderate cytotoxicity. Solubility is acceptable but not optimal
HRYPAVGVEHKX | N/A | 5.9 | 1.00 | 0.01 | 0.94 | 1274.6 | Excellent developability profile with very high solubility and minimal hemolysis; however, absence of structural validation (ipTM unavailable) and weaker predicted affinity make binding confidence uncertain
WRYPAAAVAWWX | N/A | 7.48 | 0.99 | 0.08 | 0.76 | 1358.6 | Strongest predicted affinity among candidates and highly soluble, but elevated hydrophobic/aromatic content may contribute to increased hemolytic potential and aggregation risk. Requires structural validation
WRYPAAALAWKE | 0.36 | 6.43 | 1.00 | 0.03 | 0.77 | 1461.7 | Balanced candidate with excellent solubility, low hemolysis, and moderate predicted affinity; relatively low ipTM suggests uncertain interface stability despite favorable developability properties
FLYRWLPSRRGG | 0.34 | 6.36 | 0.61 | 0.05 | 2.76 | 1507.7 | Experimentally validated binder with moderate predicted affinity and cationic character. Lower predicted solubility and interface confidence may reflect dynamic or partially disordered binding interactions rather than absence of activity
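The molecular weights of the peptides made only of standard residues can be cross-checked by hand. The sketch below sums standard average residue masses and adds one water for the free termini; it is an independent calculation, not PeptiVerse's own code, and it deliberately rejects X because an undefined residue has no defined mass (how PeptiVerse handles X is not known here).

```python
# Average residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886, "C": 103.1388,
    "E": 129.1155, "Q": 128.1307, "G": 57.0519, "H": 137.1411, "I": 113.1594,
    "L": 113.1594, "K": 128.1741, "M": 131.1926, "F": 147.1766, "P": 97.1167,
    "S": 87.0782, "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_MASS = 18.0153


def molecular_weight(peptide):
    """Average molecular weight (Da) of a linear peptide with free termini."""
    unknown = sorted(set(peptide) - set(RESIDUE_MASS))
    if unknown:
        raise ValueError(f"no mass defined for residue(s): {', '.join(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS


for seq in ("KRYYVTGVRLKK", "WRYPAAALAWKE", "FLYRWLPSRRGG"):
    print(seq, round(molecular_weight(seq), 1))
```

For the three peptides without X, the computed values agree with the table to within about 0.1 Da.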

Generate Optimized Peptides with moPPIt

moPPIt was run targeting the hydrophobic motif at positions 3–9 of A4V SOD1 (KVVCVLK). Five 12-residue peptides were generated while jointly optimizing the hemolysis, solubility, binding affinity, and motif objectives.
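Because residue numbering depends on the convention used, it can help to confirm where the motif sits in the input sequence. The sketch below is illustrative only: `motif_span` is a hypothetical helper, it uses the first 20 residues of the A4V sequence, and it does not assume which convention moPPIt itself expects.

```python
def motif_span(sequence, motif):
    """Return inclusive (start, end) of the first motif match, 0- and 1-based."""
    start = sequence.find(motif)
    if start == -1:
        raise ValueError(f"motif {motif} not found")
    end = start + len(motif) - 1
    return {"zero_based": (start, end), "one_based": (start + 1, end + 1)}


# First 20 residues of the A4V SOD1 sequence, including the initiator Met.
A4V_NTERM = "MATKVVCVLKGDGPVQGIIN"
MOTIF = "KVVCVLK"

with_met = motif_span(A4V_NTERM, MOTIF)
without_met = motif_span(A4V_NTERM[1:], MOTIF)  # drop the initiator Met

print("with Met, 0-based:   ", with_met["zero_based"])
print("with Met, 1-based:   ", with_met["one_based"])
print("without Met, 1-based:", without_met["one_based"])
```

The range 3–9 matches the motif either as 0-based indices of the pasted sequence or as 1-based positions when the initiator Met is not counted; with 1-based counting that includes Met, the same motif spans 4–10.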

Peptide | Hemolysis | Solubility | Binding Affinity | Motif | Interpretation
CTSGVNVGPGVP | 0.04 | 0.99 | 6.11 | 0.63 | Strong developability profile with very low predicted hemolysis and excellent solubility. Moderate motif alignment suggests partial but potentially indirect engagement of the hydrophobic KVVCVLK region; likely acts via peripheral or transient interaction rather than deep motif mimicry
ADSEIKAPSSGH | 0.08 | 1.00 | 5.52 | 0.67 | Highly favorable physicochemical properties with maximal solubility and low hemolysis risk. Motif similarity is moderate, suggesting a more electrostatically driven or scaffold-like interaction rather than direct hydrophobic motif mimicry
RSKYQWVPYHVT | 0.04 | 1.00 | 6.31 | 0.49 | Balanced candidate with strong predicted affinity and good solubility, but lower motif score indicates weaker structural motif mimicry. Likely binds through mixed hydrophobic–aromatic interactions rather than targeted motif recognition
SFAGICNVEQQT | 0.05 | 1.00 | 5.95 | 0.75 | Best overall motif alignment among the set, suggesting strongest intended targeting of the KVVCVLK hydrophobic patch. Favorable solubility and low hemolysis support developability, with potential for more direct motif-driven binding
QEPCEELQFNHF | 0.02 | 0.51 | 6.26 | 0.66 | Strong predicted affinity and acceptable motif similarity, but reduced solubility may limit practical usability. Likely engages target through distributed polar–aromatic interactions rather than strict hydrophobic motif complementarity

Final Project: L-Protein Mutants

The goal of this project is to design a mutant MS2 lysis protein (L-protein) with reduced interaction with the bacterial chaperone DnaJ. Since DnaJ is important for proper folding and processing of the L-protein, weakening this interaction may help the phage remain functional even if bacteria modify DnaJ. To estimate this interaction, co-folding predictions were performed using AlphaFold2 Multimer with both proteins entered together. In AlphaFold2, both sequences are entered into a single input field, separated by a colon (“:”). Since the transmembrane domain affects the lysis activity of the L-protein, mutations were introduced in the N-terminal region.
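The colon-separated input described above can be assembled programmatically, which avoids stray whitespace or line breaks when pasting long sequences. This is a small sketch under stated assumptions: `build_multimer_query` is a hypothetical helper, and the second chain is only a short placeholder fragment standing in for the full DnaJ sequence.

```python
def build_multimer_query(chains):
    """Join protein chains into one colon-separated query string."""
    cleaned = []
    for chain in chains:
        # Remove spaces and line breaks that often sneak in when copying.
        seq = "".join(chain.split()).upper()
        if not seq:
            raise ValueError("empty chain")
        if ":" in seq:
            raise ValueError("a chain must not contain the ':' separator")
        cleaned.append(seq)
    return ":".join(cleaned)


L_PROTEIN = (
    "METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT"
)
# Placeholder only: replace with the full DnaJ sequence before running a prediction.
DNAJ_PLACEHOLDER = "MAKQDYYEILGV"

query = build_multimer_query([L_PROTEIN, DNAJ_PLACEHOLDER])
print(query)
```

The same helper works for the mutant: pass the mutant L-protein as the first chain and the partner sequence as the second.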

L-Protein Sequence:

METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT

Option 1: Mutagenesis

Based on the resulting heatmap and the L-Protein Mutants Table data, I chose the following mutations:

  • F5T
  • S9Q
  • C29S
  • Y39L
  • K50L
fig.11

Mutant L-Protein Sequence:

METRTPQQQQQTPASTNRRRPFKHEDYPSRRQQRSSTLLVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT
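The mutant sequence can be derived from the wild-type sequence and the mutation list, which also checks that each listed reference residue really sits at the stated position. This is a minimal sketch; `apply_mutations` is an illustrative helper that assumes 1-based positions and the usual "F5T" notation (reference residue, position, new residue).

```python
import re


def apply_mutations(sequence, mutations):
    """Apply point mutations such as 'F5T' (1-based) and return the new sequence."""
    residues = list(sequence)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"cannot parse mutation {mutation!r}")
        ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"{mutation}: position outside the sequence")
        if residues[pos - 1] != ref:
            raise ValueError(
                f"{mutation}: expected {ref} at {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = alt
    return "".join(residues)


WILD_TYPE = (
    "METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT"
)
MUTATIONS = ["F5T", "S9Q", "C29S", "Y39L", "K50L"]

mutant = apply_mutations(WILD_TYPE, MUTATIONS)
print(mutant)
```

Applying the five mutations to the wild-type L-protein reproduces the mutant sequence listed above.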

AlphaFold2-multimer-v3 query:

METRTPQQQQQTPASTNRRRPFKHEDYPSRRQQRSSTLLVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT:MAKQDYYEILGVSKTAEEREIRKAYKRLAMKYHPDRNQGDKEAEAKFKEIKEAYEVLTDSQKRAAYDQYGHAAFEQGGMGGGGFGGGADFSDIFGDVFGDIFGGGRGRQRAARGADLRYNMELTLEEAVRGVTKEIRIPTLEECDVCHGSGAKPGTQPQTCPTCHGSGQVQMRQGFFAVQQTCPHCQGRGTLIKDPCNKCHGHGRVERSKTLSVKIPAGVDTGDRIRLAGEGEAGEHGAPAGDLYVQVQVKQHPIFEREGNNLYCEVPINFAMAALGGEIEVPTLDGRVKLKVPGETQTGKLFRMRGKGVKSVRGGAQGDLLCRVVVETPVGLNERQKQLLQELQESFGGPTGEHNSPRSKSFFDGVKKFFDDLTR