Week 5 Homework - Protein Design ⓶

Notes

Part A: SOD1 Binder Peptide Design

A.1: Generate Binders with PepMLM

A.1.1: Retrieve SOD1 and introduce the A4V mutation

Note

Begin by retrieving the human SOD1 sequence from UniProt (P00441) and introducing the A4V mutation.

Find the entry for P00441 (SODC_HUMAN) on UniProt.

!()[uniprot01.jpg?width=768px]

Clicking the Amino Acids "go to sequence" link takes us to the sequence.

!()[uniprot02.jpg?width=768px]

The Download link downloads the FASTA file, but you can also just copy the sequence.

MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

I manually introduce the A4V mutation, mutating Alanine to Valine at residue 4.

MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

Note

What exactly does A4V mean?

If we start counting at 1, then the 4th residue is K.

1234567
|||||||
MAT*K*AVC..

If we start counting at 0, then residue 4 is actually A.

0123456
|||||||
MATK*A*VC..

Which one is correct? Both? Neither? Translation begins at the start codon, which encodes M (methionine). This initiator methionine is often removed after translation, so MATKAVC.. becomes ATKAVC... If we now start counting residues at 1, then residue 4 is A. This mature-protein numbering is the one A4V uses.

1234567
|||||||
ATK*A*VCV..
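
The numbering argument above can be checked with a short sketch. The helper `apply_mutation` and its `met_offset` parameter are hypothetical names introduced here for illustration. The default offset of 1 encodes the assumption that A4V is numbered on the mature protein, with the initiator methionine removed, while the UniProt sequence still includes it.

```python
# Wild-type human SOD1 as copied from UniProt P00441 (includes the initiator M).
SOD1_WT = (
    "MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ"
)


def apply_mutation(sequence, mutation, met_offset=1):
    """Apply a point mutation written like 'A4V' to a protein sequence.

    met_offset=1 (an assumption) means the position is counted on the mature
    protein, so the stored sequence is shifted by one leading methionine.
    """
    wild_type = mutation[0]
    position = int(mutation[1:-1])
    mutant = mutation[-1]
    index = position - 1 + met_offset  # convert to a 0-based Python index
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at index {index}, found {sequence[index]}"
        )
    return sequence[:index] + mutant + sequence[index + 1:]


SOD1_A4V = apply_mutation(SOD1_WT, "A4V")
print(SOD1_A4V[:8])
```

The wild-type check is the useful part: calling `apply_mutation(SOD1_WT, "A4V", met_offset=0)` raises a `ValueError`, because with the methionine counted as residue 1 the fourth residue is K, not A.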

A.1.2: Use the PepMLM Colab

Make a copy of the [https://colab.research.google.com/drive/1u0i-LBog_lvQ5YRKs7QLKh_RtI-tV8qM?usp=sharing](PepMLM Colab) from ChatterjeeLab/PepMLM-650M.

A.1.3: Generate four peptides of length 12

Binder,Pseudo Perplexity
HRYYVAAVRHWK,23.837650573350423
WRSPVVVAEHKK,13.883698506608807
WLYYAAALRLKE,19.11173304337227
WRYYAAALAWGX,10.052019407114733
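
To compare the candidates, the CSV output can be sorted by pseudo perplexity. This is a minimal sketch that assumes a lower pseudo perplexity means the model is more confident in the peptide. The variable names are illustrative and not part of the Colab.

```python
import csv
import io

# PepMLM output pasted from the Colab, one row per generated binder.
PEPMLM_OUTPUT = """\
Binder,Pseudo Perplexity
HRYYVAAVRHWK,23.837650573350423
WRSPVVVAEHKK,13.883698506608807
WLYYAAALRLKE,19.11173304337227
WRYYAAALAWGX,10.052019407114733
"""

rows = list(csv.DictReader(io.StringIO(PEPMLM_OUTPUT)))

# Sort ascending, so the lowest pseudo perplexity comes first.
ranked = sorted(rows, key=lambda row: float(row["Pseudo Perplexity"]))

for row in ranked:
    print(f'{row["Binder"]}  {float(row["Pseudo Perplexity"]):.2f}')
```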

A.1.4: Add known SOD1-binding peptide FLYRWLPSRRGG

HRYYVAAVRHWK
WRSPVVVAEHKK
WLYYAAALRLKE
WRYYAAALAWGX
FLYRWLPSRRGG
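
Before passing the list on, a quick sanity check may be worthwhile. The sketch below confirms that each peptide has the requested length of 12 and flags any character outside the 20 standard amino acids. The function `check_peptide` is a hypothetical helper written for this note.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

peptides = [
    "HRYYVAAVRHWK",
    "WRSPVVVAEHKK",
    "WLYYAAALRLKE",
    "WRYYAAALAWGX",
    "FLYRWLPSRRGG",  # known SOD1-binding peptide
]


def check_peptide(peptide, expected_length=12):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(peptide) != expected_length:
        problems.append(f"length {len(peptide)}, expected {expected_length}")
    unknown = sorted(set(peptide) - STANDARD_AMINO_ACIDS)
    if unknown:
        problems.append(f"non-standard residues: {''.join(unknown)}")
    return problems


report = {peptide: check_peptide(peptide) for peptide in peptides}
for peptide, problems in report.items():
    print(peptide, "OK" if not problems else "; ".join(problems))
```

In this list only WRYYAAALAWGX is flagged: its final X stands for an unknown residue, so it may need to be replaced or regenerated before structure prediction.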

A.2: Evaluate Binders with AlphaFold3

A.2.1: Evaluate Binders with AlphaFold3

alphafoldserver.com