Week 5: Protein Design - Part II

SOD Peptide Design from Pranam

Superoxide dismutase 1 (SOD1) is a cytosolic antioxidant enzyme that converts superoxide radicals into hydrogen peroxide and oxygen. In its native state, it forms a stable homodimer and binds copper and zinc. Mutations in SOD1 cause familial Amyotrophic Lateral Sclerosis (ALS). Among them, the A4V mutation (Alanine → Valine at residue 4) leads to one of the most aggressive forms of the disease. The mutation subtly destabilizes the N-terminus, perturbs folding energetics, and promotes toxic aggregation. Your challenge: a. Design short peptides that bind mutant SOD1. b. Then decide which ones are worth advancing toward therapy.

All the given steps were followed and the results are summarized below.

Binder | Perplexity Score | ipTM Score | Binding Region
ARWDPVVGAVEARAK | 12.81 | 0.34 | Forms a helix and binds a bit far from N-terminal around the 70-100 residue positions
ARWPEYVAVYEAKRA | 12.16 | 0.25 | Forms a helix and binds near N-terminal
WRWGVVTARKKAARK | 12.24 | 0.48 | A bit far from N-terminal, around the 70-100 residue positions
AWVPVLTARVELAKX | 18.139 | 0.42 | Near N-terminal
FLYRWLPSRRGG | - | 0.39 | Near N-terminal

The ipTM scores for the PepMLM-generated peptides show variable docking confidence, ranging from 0.25 to 0.48. The known binder (FLYRWLPSRRGG) scores a moderate 0.39 for its interaction near the N-terminus, and two of the designed peptides exceed this baseline: WRWGVVTARKKAARK has the highest ipTM score (0.48) and binds away from the N-terminus (around residues 70–100), while AWVPVLTARVELAKX scores 0.42 and binds near the N-terminus. The remaining two PepMLM candidates, ARWDPVVGAVEARAK (0.34) and ARWPEYVAVYEAKRA (0.25), fall below the benchmark set by the known binder.
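
The baseline comparison can be reproduced with a short Python sketch. The scores are copied from the results table; the names `iptm`, `KNOWN_BINDER`, and `above_baseline` are illustrative and not part of any tool used in the assignment.

```python
# ipTM scores copied from the results table.
iptm = {
    "ARWDPVVGAVEARAK": 0.34,
    "ARWPEYVAVYEAKRA": 0.25,
    "WRWGVVTARKKAARK": 0.48,
    "AWVPVLTARVELAKX": 0.42,
    "FLYRWLPSRRGG": 0.39,  # known binder, used as the baseline
}
KNOWN_BINDER = "FLYRWLPSRRGG"

baseline = iptm[KNOWN_BINDER]

# Designed peptides whose ipTM exceeds the known binder's score, best first.
above_baseline = sorted(
    (p for p in iptm if p != KNOWN_BINDER and iptm[p] > baseline),
    key=lambda p: iptm[p],
    reverse=True,
)

for peptide in above_baseline:
    print(f"{peptide}: ipTM {iptm[peptide]:.2f} (baseline {baseline:.2f})")
```

A strict "greater than" comparison is used here, so a peptide that only ties the known binder would not count as exceeding it.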

Peptide | MW (Da) | Charge | GRAVY | Solubility | Hemolysis
AWVPVLTARVELAKX | 1534.9 | +0.80 | 0.70 | Soluble (1.000) | Low (0.087)
WRWGVVTARKKAARK | 1813.2 | +5.76 | -0.95 | Soluble (0.985) | Low (0.023)
ARWPEYVAVYEAKRA | 1809.0 | +0.80 | -0.63 | Soluble (1.000) | Low (0.084)
ARWDPVVGAVEARAK | 1624.8 | +0.80 | -0.20 | Soluble (1.000) | Low (0.049)
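
The GRAVY column can be checked by hand: GRAVY is the mean Kyte-Doolittle hydropathy over all residues. The sketch below is a minimal re-implementation, not the tool that produced the table. It assumes that the unknown residue X contributes 0 to the sum while still counting toward the length; under that assumption the table values are reproduced.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Mean hydropathy of a peptide.

    Assumption: non-standard residues such as X add 0 to the sum
    but are still counted in the sequence length.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    total = sum(KYTE_DOOLITTLE.get(res, 0.0) for res in sequence.upper())
    return total / len(sequence)


for peptide in ("AWVPVLTARVELAKX", "WRWGVVTARKKAARK",
                "ARWPEYVAVYEAKRA", "ARWDPVVGAVEARAK"):
    print(f"{peptide}: GRAVY {gravy(peptide):.2f}")
```

Positive values indicate a net hydrophobic peptide and negative values a net hydrophilic one, which is why AWVPVLTARVELAKX stands out in the table.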

Analysis & Structural Correlation

  • ipTM vs. Affinity: ipTM is a confidence score for the predicted interface, not a measured affinity. Higher AlphaFold3 ipTM scores tend to accompany tighter structural complementarity (hydrogen bonding, salt bridges, or hydrophobic packing), which is expected, though not guaranteed, to correspond to a more favorable (more negative) binding free energy.
  • Safety Profiles: None of the candidates, including the stronger predicted binders, raises a predicted safety concern. All four sequences are predicted to be soluble (0.985–1.000) with a low hemolysis probability (0.023–0.087). Even the most hydrophobic sequence, AWVPVLTARVELAKX (GRAVY: 0.70), is predicted to be non-hemolytic.
  • Optimal Balance: ARWDPVVGAVEARAK achieves the best therapeutic balance. It pairs maximum solubility (1.000) and low hemolysis (0.049) with a moderate GRAVY score (-0.20), reducing the aggregation risk associated with highly hydrophobic variants.

Lead Selection & Justification

Selection: ARWDPVVGAVEARAK
Justification: This peptide offers the best overall therapeutic profile: maximal predicted solubility (1.000), low hemolytic risk (0.049), and a balanced hydropathy index (GRAVY -0.20) that should limit in vitro aggregation. Its ipTM score (0.34) is below both the known binder's 0.39 and the 0.42 of AWVPVLTARVELAKX, so its binding should be confirmed experimentally before it advances.

Binders obtained through moPPIt:

Sample Sequence | Hemolysis Score | Solubility Score | Affinity Score
RAVARNVWFAWF | 0.0126 | 1.0000 | 7.4749
ACEEFGFEAVCA | 0.0077 | 1.0000 | 6.9697
GSRRWWVYWHYT | 0.0244 | 1.0000 | 7.5653
CFAGAGNRYGWK | 0.0054 | 0.9998 | 6.5160
GRRCAGPYYNWG | 0.0068 | 1.0000 | 7.3545
CRDTRVGCPHRC | 0.0220 | 1.0000 | 7.7880
WCCQWADGRWER | 0.0158 | 0.9995 | 7.2062
PFFCREYALYCY | 0.0043 | 0.9983 | 7.7089
FAYYRPCGCGCR | 0.0251 | 1.0000 | 8.1801
  1. PepMLM vs. moPPIt Peptides
  • PepMLM generates general linear binders across a target sequence. moPPIt utilizes discrete flow matching to target hyper-localized, functional subsequence motifs.
  • PepMLM candidates are designed as general sequence binders ideal for scaffolding or linking to functional domains (e.g., degradation tags), whereas moPPIt candidates act as compact, high-density competitive inhibitors tailored to disrupt precise protein-protein interaction interfaces.
  • Both sets show favorable predicted safety metrics (solubility ≈ 1.0; hemolysis ≤ 0.087 for PepMLM and < 0.03 for moPPIt), but moPPIt additionally reports a predicted affinity score for each peptide (ranging from 6.52 to 8.18).
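
To make the affinity comparison concrete, the moPPIt candidates can be ranked with a few lines of Python. The values are copied from the table above. The sketch assumes that a higher affinity score means stronger predicted binding; the variable names are illustrative.

```python
# (sequence, hemolysis, solubility, affinity) copied from the moPPIt table.
candidates = [
    ("RAVARNVWFAWF", 0.0126, 1.0000, 7.4749),
    ("ACEEFGFEAVCA", 0.0077, 1.0000, 6.9697),
    ("GSRRWWVYWHYT", 0.0244, 1.0000, 7.5653),
    ("CFAGAGNRYGWK", 0.0054, 0.9998, 6.5160),
    ("GRRCAGPYYNWG", 0.0068, 1.0000, 7.3545),
    ("CRDTRVGCPHRC", 0.0220, 1.0000, 7.7880),
    ("WCCQWADGRWER", 0.0158, 0.9995, 7.2062),
    ("PFFCREYALYCY", 0.0043, 0.9983, 7.7089),
    ("FAYYRPCGCGCR", 0.0251, 1.0000, 8.1801),
]

# Assumption: higher affinity score = stronger predicted binding.
ranked = sorted(candidates, key=lambda row: row[3], reverse=True)

for seq, hemolysis, solubility, affinity in ranked[:3]:
    print(f"{seq}: affinity {affinity:.4f}, "
          f"hemolysis {hemolysis:.4f}, solubility {solubility:.4f}")

max_hemolysis = max(row[1] for row in candidates)
print(f"Highest hemolysis score in the set: {max_hemolysis:.4f}")
```

Because every candidate has a hemolysis score below 0.03 and a solubility score near 1.0, the safety columns do not separate the peptides here, and the ranking is driven by the affinity score alone.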
  2. Pre-Clinical Evaluation Strategy

Phase I: Structural Validation

  • AF3 Conformation: Run AlphaFold3 to assess structural complementarity and docking metrics (ipTM/pLDDT).
  • Disulfide Mapping: Predict intra-chain disulfide bonding patterns for cysteine-rich variants to evaluate self-cyclization.
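
As a first pass at the disulfide-mapping step, the cysteine positions and all possible intra-chain pairings can be enumerated for each moPPIt peptide. This is only a combinatorial listing, not a structural prediction: it does not check geometry or redox state, and the function name `cysteine_pairs` is hypothetical.

```python
from itertools import combinations


def cysteine_pairs(sequence: str):
    """Return 1-based cysteine positions and every possible Cys-Cys pairing."""
    positions = [i for i, res in enumerate(sequence, start=1) if res == "C"]
    pairs = list(combinations(positions, 2))
    return positions, pairs


# moPPIt sequences copied from the table of generated binders.
moppit_peptides = [
    "RAVARNVWFAWF", "ACEEFGFEAVCA", "GSRRWWVYWHYT",
    "CFAGAGNRYGWK", "GRRCAGPYYNWG", "CRDTRVGCPHRC",
    "WCCQWADGRWER", "PFFCREYALYCY", "FAYYRPCGCGCR",
]

for peptide in moppit_peptides:
    positions, pairs = cysteine_pairs(peptide)
    if len(positions) >= 2:  # at least two cysteines are needed for a bond
        print(f"{peptide}: Cys at {positions}, candidate pairs {pairs}")
```

Peptides with three cysteines have several competing pairings, so they are the ones where a structure prediction or experimental mapping is most informative.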

Phase II: Biophysical Characterization

  • Binding Kinetics: Synthesize top leads and determine absolute Kd values via Surface Plasmon Resonance (SPR).
  • Proteolytic Stability: Incubate in human serum to quantify metabolic half-life and assess the need for backbone stabilization (e.g., D-amino acids, cyclization).

Phase III: Functional & Safety Assays

  • Toxicity Screening: Perform wet-lab erythrocyte hemolysis and cellular viability (MTT) assays to validate computational safety metrics.
  • On-Target Efficacy: Run cell-based functional assays to confirm target inhibition or degradation capability.

Final Project: L-Protein Mutants

Option 1: Mutagenesis

  1. Experimental vs. ESM Score Correlation

The experimental data show a poor direct correlation with the subtle variations in the ESM snippet scores.

  • The ESM snippet shows tight, high scores (0.00753 to 0.00759) for mutating position 39 (Y39).

  • In the dataset, the wild-type residue Y39 is highly critical. Mutating it to Histidine (Y39H) completely abolishes function (Lysis: 0, Protein Levels: 0).

  • Conclusion: The language model identifies this region as highly significant (yielding high absolute scores), but its tiny fractional score differences do not capture the binary functional disruption seen experimentally.

  2. Five Engineered Mutants

The 5 mutants I have chosen are:

Soluble Region Variants

  • Variant 1: P13L

  • Why: Directly validated in the data to maintain full function (Lysis: 1, Protein Levels: 1). It confirms the soluble loop tolerates hydrophobic substitution.

  • Variant 2: S15A

  • Why: Directly validated in the data (Lysis: 1, Protein Levels: 1). Substituting Serine with Alanine preserves functional spacing while removing the hydroxyl group.

Transmembrane (TM) Region Variants

  • Variant 3: R30L

  • Why: Validated in the data to maintain full function (Lysis: 1, Protein Levels: 1). Replacing a charged Arginine with Leucine at the TM boundary increases local hydrophobicity, and the data indicate that this position tolerates the change.

  • Variant 4: I46F

  • Why: Validated in the data to maintain full function (Lysis: 1, Protein Levels: 1). Swapping Isoleucine for Phenylalanine maintains hydrophobicity and supports membrane-disrupting alpha-helix packing.

Robust Combination Variant

  • Variant 5: P13L + S15A
  • Why: Combines two adjacent, experimentally proven functional mutations from the soluble domain. This tests whether the local region accommodates additive hydrophobic structural changes without losing stability.
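
The variants above can be generated programmatically from mutation labels such as P13L. The sketch below parses each label, checks that the stated wild-type residue is actually present at that position, and applies the substitution. The function name `apply_mutations` is hypothetical, and the demo sequence is a toy placeholder, not the real L-protein sequence.

```python
import re

# Matches labels such as "P13L": wild-type residue, 1-based position, new residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(wild_type: str, mutations: list[str]) -> str:
    """Apply point mutations to a sequence, validating each wild-type residue."""
    residues = list(wild_type)
    for label in mutations:
        match = MUTATION_RE.match(label.strip())
        if match is None:
            raise ValueError(f"cannot parse mutation label: {label!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if wild_type[pos - 1] != old:
            raise ValueError(
                f"{label}: expected {old} at position {pos}, "
                f"found {wild_type[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)


# Toy placeholder with P at position 13 and S at position 15.
# This is NOT the L-protein sequence; substitute the real one before use.
toy_sequence = "A" * 12 + "P" + "A" + "S" + "A" * 5

single = apply_mutations(toy_sequence, ["P13L"])
double = apply_mutations(toy_sequence, "P13L + S15A".split("+"))
print(single)
print(double)
```

Validating the wild-type residue catches off-by-one numbering mistakes, which are easy to make when the dataset and the sequence file use different indexing.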

Option 2: Mutagenesis Using AlphaFold2-Multimer

Component / Section | Key Findings & Metric Analysis | Structural / Biophysical Implications
AlphaFold2 Model Metrics | Model 5 top rank (pLDDT = 84.1, pTM = 0.575); high MSA depth (>15,000 sequences) | Robust, high-confidence structural predictions that have stably converged over 3 recycles.
Domain Confidence Profile | Domain 1: residues 1–70 (pLDDT > 80); Loop: residues 75–105 (pLDDT ~20–40); Domain 2: residues 110–350+ (pLDDT > 80) | Avoid engineering mutations in the flexible loop (75–105), as disordered regions tolerate changes without disrupting binding.
Mutation Set 1: Electrostatic Inversion | M1 (Domain 1): Basic to Acidic (e.g., K/R → E); M2 (Domain 2): Acidic to Basic (e.g., D/E → K); M3 (Domain 2): Basic to Acidic (e.g., R → E) | Destabilizes binding by shattering existing salt bridges and generating active electrostatic repulsion at the interface.
Mutation Set 2: Steric & Hydrophobic Clash | M4 (Domain 1): Hydrophobic to Charged (e.g., L/I → R); M5 (Domain 2): Small to Bulky Aromatic (e.g., A/G → W) | Drives a massive energetic desolvation penalty and introduces severe steric hindrance to physically wedge the complex apart.