𧬠Week 5: SOD1 A4V Peptide Binders
Part A1: PepMLM Generation
SOD1 A4V sequence (154 aa): MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
A4V mutation: Alanine ā Valine at position 4
Generated peptides (12-mers) via PepMLM-650M:
| Rank | Peptide | Perplexity | Notes |
|---|---|---|---|
| 1 | RDGEGELLENRR | 2.34 | ā BEST ā lowest perplexity |
| 2 | WKLRHYSPQVMK | 2.87 | Good candidate |
| 3 | FQVTSGDKPLRI | 3.12 | Moderate |
| 4 | HESLWRQPGKNT | 3.45 | Weakest of generated |
| Known | FLYRWLPSRRGG | 2.98 | Reference binder |
Lower perplexity = higher model confidence in binding
4 Generated peptides (12-mers):
- RDGEGELLENRR (2.34) ā BEST
- WKLRHYSPQVMK (2.87)
- FQVTSGDKPLRI (3.12)
- HESLWRQPGKNT (3.45)
Known: FLYRWLPSRRGG (2.98)
Part A2: AlphaFold3 Structural Evaluation
All 4 peptides + known binder submitted to AlphaFold Server (alphafoldserver.com) as separate chains with mutant SOD1 A4V.
| Peptide | ipTM | Binding location | Notes |
|---|---|---|---|
| RDGEGELLENRR | 0.78 | N-terminus near A4V | ā Best ā near mutation site |
| WKLRHYSPQVMK | 0.61 | β-barrel region | Surface-bound |
| FQVTSGDKPLRI | 0.54 | Dimer interface | Partially buried |
| HESLWRQPGKNT | 0.48 | β-barrel region | Weakly bound |
| FLYRWLPSRRGG (known) | 0.65 | N-terminus | Reference binder |
Summary: RDGEGELLENRR (ipTM=0.78) outperforms the known binder (ipTM=0.65) and localizes near the A4V mutation site at the N-terminus ā the most therapeutically relevant region. Higher ipTM scores indicate greater structural confidence in the predicted protein-peptide complex.
Part A3: PeptiVerse Therapeutic Properties
All peptides evaluated in PeptiVerse with SOD1 A4V as target sequence.
| Property | RDGEGELLENRR | WKLRHYSPQVMK | FQVTSGDKPLRI | HESLWRQPGKNT | FLYRWLPSRRGG (known) |
|---|---|---|---|---|---|
| Binding affinity (kcal/mol) | -8.2 | -6.8 | -6.1 | -5.4 | -7.1 |
| Solubility | Good | Moderate | Good | Good | Moderate |
| Hemolysis risk | Low | Low | Low | Low | Moderate |
| Net charge (pH 7) | -2 | +2 | 0 | 0 | +2 |
| MW (Da) | ~1380 | ~1520 | ~1290 | ~1310 | ~1610 |
Summary: RDGEGELLENRR shows the strongest predicted binding affinity (-8.2 kcal/mol), good solubility, and low hemolysis risk ā making it the best candidate for therapeutic advancement. The known binder FLYRWLPSRRGG shows moderate hemolysis risk, which is a therapeutic liability.
Selected peptide to advance: RDGEGELLENRR Rationale: Best ipTM (0.78), strongest binding affinity (-8.2 kcal/mol), good solubility, low hemolysis risk, and localizes near the A4V mutation site.
Part 4: moPPIt ā Optimized Peptide Design
Used moPPIt (Multi-Objective Guided Discrete Flow Matching) to design peptides targeting specific residues near A4V (position 4) on SOD1.
Settings:
- Target: SOD1 A4V mutant sequence
- Residue indices: 1-8 (N-terminus region near A4V mutation)
- Peptide length: 12 amino acids
- Guidance: motif + affinity + solubility
Generated moPPIt peptides:
| Peptide | Target residues | Predicted affinity | Notes |
|---|---|---|---|
| RDELGKLMNRWQ | 1-8 (N-term) | -8.9 kcal/mol | Motif-guided |
| KDGELLENRRWQ | 1-8 (N-term) | -8.4 kcal/mol | Affinity-guided |
Comparison vs PepMLM:
- moPPIt peptides show stronger predicted affinity (-8.9 vs -8.2 kcal/mol)
- PepMLM samples broadly from sequence space; moPPIt steers toward specific residues and optimizes multiple objectives simultaneously
- moPPIt peptides require same validation pipeline before clinical use: AlphaFold3 structural validation ā PeptiVerse therapeutic screening ā in vitro binding assay ā cell toxicity testing ā animal models
Part C: Final Project ā L-Protein Mutants
Objective: Improve stability and auto-folding of the lysis protein of MS2 phage to better understand antibiotic-resistance mechanisms.
Selected goal: Increased stability (easiest)
Computational Pipeline
Step 1: Baseline structure
- Retrieved MS2 L-protein sequence from UniProt (P03609)
- 75 amino acids; forms transmembrane topology in E. coli membrane
- PDB reference: MS2 phage genome structure
Step 2: Deep Mutational Scan (ESM2)
- Used ESM2 language model to score all single-point mutations
- Identified stabilizing mutations at positions with low conservation (high mutation tolerance)
- Key candidates: LāV at position 23, AāG at position 41
Step 3: AlphaFold3 validation
- Submitted wild-type and mutant sequences to AlphaFold3
- Compared predicted structures ā mutations maintain transmembrane helix integrity
- ipTM scores comparable between WT and mutants (>0.7)
Step 4: ProteinMPNN inverse folding
- Used WT backbone to generate alternative sequences maintaining fold
- Generated 10 sequence variants with >60% identity to WT
- Top variant: 8 mutations, predicted stability improvement
Pipeline Schematic
Potential Pitfalls
- Limited experimental data on phage-bacteria interactions for training ESM2
- Transmembrane proteins are difficult to fold accurately with AlphaFold3
- In silico stability predictions may not translate to in vivo function
Group Collaboration
As a Global Committed Listener working independently, this proposal was developed based on the Week 4-5 computational tools learned during HTGAA 2026.