Homework — Protein Design Part II
Warning
Mandatory for MIT/Harvard students and Committed Listeners. Due Tue, Mar 11 (before lecture).
Key Links
- Tracking sheet: Google Sheet
Part A — PepMLM peptide design (From Pranam)
Info
Create a Hugging Face account. We’ll use PepMLM‑650M (see the model page).
- Generate a token: Settings → Tokens (create new).
- Ensure repo is ChatterjeeLab/PepMLM-650M.
- Open the PepMLM Colab and make a copy: Colab (linked from the model page).
- In Colab, select the T4 GPU runtime, then run all cells.
- When prompted “Input HF token”, paste your token. When asked “Add token as git credential?”, choose No.
Get the amino‑acid sequence of SOD1 from UniProt (ID: P00441) and introduce the A4V mutation (alanine to valine).
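The substitution step can be sketched in Python as below. This is a minimal illustration, not part of the official Colab: the helper name `apply_point_mutation` and its `offset` parameter are hypothetical, and the ten-residue string is only the N-terminal stretch of the entry, used as a stand-in. Paste the full P00441 sequence from UniProt in its place. One thing to check: A4V is conventionally numbered without the initiator methionine, so in a sequence that starts with M the alanine should sit at position 5. The wild-type check in the helper fails loudly if the numbering is off.

```python
def apply_point_mutation(sequence, wild_type, position, mutant, offset=0):
    """Return `sequence` with one residue substituted.

    `position` is 1-based; `offset` shifts it to account for residues
    (such as an initiator Met) that the mutation name does not count.
    Both the function and the offset convention are illustrative assumptions.
    """
    index = position - 1 + offset
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position} "
            f"(offset {offset}), found {sequence[index]}"
        )
    return sequence[:index] + mutant + sequence[index + 1:]


# Placeholder: N-terminal fragment only. Replace with the full UniProt sequence.
sod1_fragment = "MATKAVCVLK"

# offset=1 because the sequence includes the initiator Met, which A4V does not count.
sod1_a4v_fragment = apply_point_mutation(sod1_fragment, "A", 4, "V", offset=1)
print(sod1_a4v_fragment)  # MATKVVCVLK
```

Calling the helper with `offset=0` on this fragment raises a `ValueError`, because residue 4 of the Met-inclusive sequence is K rather than A.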
Run PepMLM inference and generate 4 peptides (length 12 aa). (2 is acceptable if time‑limited.)
Add a known SOD1‑binding peptide to your list: FLYRWLPSRRGG (see Genes & Development reference: genesdev.cshlp.org).
Use AlphaFold‑Multimer (ColabFold notebook) to model the SOD1:peptide complex.
- Open notebook: AlphaFold‑Multimer
- Set model_type = alphafold2_multimer_v3, which has been shown to recapitulate peptide‑protein binding accurately. Ref: Frontiers in Bioinformatics (2022)
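ColabFold expects the chains of a complex in a single query string, separated by a colon. The sketch below builds one query per peptide and rejects sequences that contain non-standard characters (for example an `X` left over from generation). The helper names are hypothetical, and the target string is again only a placeholder fragment. Use your full A4V SOD1 sequence instead.

```python
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def build_complex_query(target_seq, peptide):
    """Join target and peptide into a 'TARGET:PEPTIDE' query string."""
    target_seq, peptide = target_seq.upper(), peptide.upper()
    for label, seq in (("target", target_seq), ("peptide", peptide)):
        unknown = set(seq) - STANDARD_AA
        if not seq or unknown:
            raise ValueError(f"{label} has invalid residues: {sorted(unknown)}")
    return f"{target_seq}:{peptide}"


def build_queries(target_seq, peptides):
    """Map each peptide to its query string, preserving input order."""
    return {pep: build_complex_query(target_seq, pep) for pep in peptides}


# Placeholder target fragment; replace with the full mutated SOD1 sequence.
target = "MATKVVCVLK"

# The known binder from the assignment; append your generated peptides here.
peptides = ["FLYRWLPSRRGG"]

queries = build_queries(target, peptides)
for peptide, query in queries.items():
    print(peptide, "->", query)
```

Each resulting string can be pasted into the notebook's query sequence field, one run per peptide.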
After running AF‑Multimer with your 5 peptides (4 generated + 1 known), plot the ipTM scores to compare relative binding confidence.
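One way to compare the runs is a sorted bar chart of ipTM per peptide. The sketch below assumes that each ColabFold run writes a scores JSON file containing an `iptm` key, so check your own output files before relying on it. All function names and the output file name are hypothetical, and no score values are included here. Supply the ones from your runs.

```python
import json


def read_iptm(score_file):
    """Return the ipTM value from one score JSON file (key name is an assumption)."""
    with open(score_file) as handle:
        scores = json.load(handle)
    if "iptm" not in scores:
        raise KeyError(f"no 'iptm' entry in {score_file}")
    return scores["iptm"]


def rank_by_iptm(scores):
    """Sort a {peptide: ipTM} mapping from highest to lowest ipTM."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def plot_iptm(scores, out_path="iptm_scores.png"):
    """Save a bar chart of ipTM per peptide and return the file path."""
    import matplotlib.pyplot as plt  # imported here so ranking works without it

    ranked = rank_by_iptm(scores)
    labels = [name for name, _ in ranked]
    values = [value for _, value in ranked]

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(labels, values)
    ax.set_ylabel("ipTM")
    ax.set_ylim(0, 1)  # ipTM is reported on a 0 to 1 scale
    ax.set_title("SOD1 A4V : peptide complexes")
    ax.tick_params(axis="x", rotation=45)
    fig.tight_layout()
    fig.savefig(out_path, dpi=150)
    plt.close(fig)
    return out_path
```

To use it, build a dictionary such as `{peptide: read_iptm(path_to_that_run's_scores_file)}` for all five peptides and pass it to `plot_iptm`. Sorting before plotting makes it easy to see where the known binder falls relative to the generated peptides.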
Write a 1‑paragraph summary of your results.
Part B — Final Project: L‑Protein Mutants
Info
This is computationally heavy — start early.
More details: Final Project Page (external Notion): www.notion.so