Week 5: HW Protein Design Part II
Part A
Part 1: Generate Binders with PepMLM
| Binder Sequence | Perplexity Score |
|---|---|
| WRYYATVARHKE | 15.38014901 |
| WLYYVVVLRHGE | 32.45105037 |
| WRYYAAGARLKE | 11.75101427 |
| WRYYATAVELKG | 10.44080960 |
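For reference, the scores above can be read against the usual pseudo-perplexity of a masked language model. This is an assumption, since the write-up does not define the score. For a binder $x$ of length $L$ conditioned on a target sequence $t$ (both symbols are introduced here for illustration), it would be:

$$\mathrm{PPL}(x) = \exp\left(-\frac{1}{L}\sum_{i=1}^{L} \log p\left(x_i \mid x_{\setminus i}, t\right)\right)$$

Under that reading, lower values mean the model finds the binder more plausible, so WRYYATAVELKG (10.44) would rank first and WLYYVVVLRHGE (32.45) last.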
Part 2: Evaluate Binders with AlphaFold3
| Peptide ID | Binder Sequence | ipTM Score | Location |
|---|---|---|---|
| Peptide 0 | WRYYATVARHKE | 0.35 | Approaches the $\beta$ barrel; surface-bound |
| Peptide 1 | WLYYVVVLRHGE | 0.33 | Approaches the $\beta$ barrel; surface-bound |
| Peptide 2 | WRYYAAGARLKE | 0.35 | Approaches the $\beta$ barrel; surface-bound |
| Peptide 3 | WRYYATAVELKG | 0.22 | Conforms to the dimer interface; binds to a pocket |
| Control | FLYRWLPSRRGG | 0.31 | Surrounding the $\beta$ barrel; surface-bound |
Peptides 0-2 exceed the ipTM score of the control (a known binder, 0.31), which would nominally indicate a stronger protein-protein interaction, but all of these scores still represent low-confidence predictions. I would therefore not conclude that any of them interacts more strongly than the control, or that they are suitable as potential molecular glues.
Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse
| Peptide ID | Binder Sequence | Predicted Binding Affinity | Solubility | Hemolysis Probability | Net Charge (pH 7) | Molecular Weight |
|---|---|---|---|---|---|---|
| Peptide 0 | WRYYATVARHKE | 5.659 pKd/pKi | 1.000 | 0.034 | 1.85 | 1579.8 Da |
| Peptide 1 | WLYYVVVLRHGE | 6.562 pKd/pKi | 1.000 | 0.158 | -0.15 | 1533.8 Da |
| Peptide 2 | WRYYAAGARLKE | 5.878 pKd/pKi | 1.000 | 0.042 | 1.77 | 1483.7 Da |
| Peptide 3 | WRYYATAVELKG | 5.785 pKd/pKi | 1.000 | 0.060 | 8.50 | 1456.6 Da |
The ipTM scores span too narrow a range to draw meaningful conclusions about binding affinity, though I would expect structural conformation to bear directly on the strength of protein-protein interactions. Peptide 1 has the highest predicted binding affinity, but it also has the highest hemolysis probability and the lowest net charge. On that basis, peptide 3 makes the best therapeutic candidate: its net charge at pH 7 and its solubility should assist drug delivery, while its predicted binding affinity suggests it would be efficacious as a therapeutic.
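As a sanity check on the molecular weight column, the values can be recomputed from average residue masses. The sketch below is not how PeptiVerse computes them (that is not stated here); it assumes unmodified peptides with free termini and uses standard average residue masses. The function and dictionary names are made up for illustration.

```python
# Average residue masses in Da (amino acid minus one water); standard values.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def molecular_weight(sequence):
    """Average molecular weight (Da) of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


# Binder sequences copied from the table above.
PEPTIDES = {
    "Peptide 0": "WRYYATVARHKE",
    "Peptide 1": "WLYYVVVLRHGE",
    "Peptide 2": "WRYYAAGARLKE",
    "Peptide 3": "WRYYATAVELKG",
}

for name, seq in PEPTIDES.items():
    print(f"{name}: {molecular_weight(seq):.2f} Da")
```

Under these assumptions the recomputed masses agree with the tabulated ones to within roughly 0.1 Da.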
Part 4
KKCGKQFKQKQE
This peptide differs vastly from the others in conformation and osmolarity. I would request representative lab samples of the protein and first run binding efficiency assays using mass spectrometry (MS) before advancing either as a candidate.
Part B
Part C
Protein Design Process Overview
We’ll use random mutagenesis to generate variants of the L-protein optimized for both (1) lysis activity and (2) expression level. We begin with the provided experimental data as ground truth, generate combinations of mutations, and test them using AF2-Multimer.
Mutation Set
We filter mutations by excluding nonsense mutations and prioritizing missense mutations that enhance both lytic activity and expression level. Frankly, predicting the synergistic effects of different mutations is not a straightforward problem, so we resort to a brute-force search over combinations of L-protein mutations that affect its binding to DNAJ. In an ideal world, we would build a performance score that also assesses lytic activity and expression, but existing models are unlikely to encode these properties, especially for a small transmembrane protein.
| Mut # | Amino Acid Position | Amino Acid Change |
|---|---|---|
| 0 | 13 | P->L |
| 1 | 15 | S->A |
| 2 | 18 | R->G |
| 3 | 18 | R->I |
| 4 | 30 | R->Q |
| 5 | 30 | R->L |
| 6 | 31 | R->I |
| 7 | 44 | L->P |
| 8 | 45 | A->P |
| 9 | 46 | I->F |
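Mathematically, there are $2^6 \times 3^2 = 576$ possible sets of mutations. Six positions (13, 15, 31, 44, 45, 46) have one candidate substitution each, giving two choices (wild type or mutant), while positions 18 and 30 have two candidate substitutions each, giving three choices. This count includes the unmutated wild type.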
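The count can be checked by enumerating every per-position choice directly. This is a minimal sketch: the mutation list is copied from the table above, and the helper names are hypothetical.

```python
from itertools import product

# (position, wild-type residue, mutant residue), copied from the mutation table.
MUTATIONS = [
    (13, "P", "L"), (15, "S", "A"), (18, "R", "G"), (18, "R", "I"),
    (30, "R", "Q"), (30, "R", "L"), (31, "R", "I"), (44, "L", "P"),
    (45, "A", "P"), (46, "I", "F"),
]


def options_per_position(mutations):
    """Map each position to its allowed residues: wild type first, then mutants."""
    options = {}
    for pos, wild_type, mutant in mutations:
        options.setdefault(pos, [wild_type]).append(mutant)
    return options


def enumerate_variants(mutations):
    """Yield every combination as a {position: residue} dict (wild type included)."""
    options = options_per_position(mutations)
    positions = sorted(options)
    for choice in product(*(options[p] for p in positions)):
        yield dict(zip(positions, choice))


variants = list(enumerate_variants(MUTATIONS))
print(len(variants))  # 576
```

Treating each position as one choice, rather than each mutation as an independent on/off switch, is what rules out impossible sets such as R18G together with R18I.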
Evaluation
In the absence of reliable lysis and expression-level metrics, we instead use the AF2-Multimer ipTM score for the DNAJ:L-protein complex as a proxy for binding affinity.
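The Results table reports double mutants only. Assuming its `i_j` labels refer to the mut # indices from the mutation table, the set of pairs to evaluate can be sketched as every pair of mutations that do not share a position. The variable names are illustrative.

```python
from itertools import combinations

# Amino acid position of each mutation, indexed by mut # (0-9) from the table.
POSITION_OF_MUT = [13, 15, 18, 18, 30, 30, 31, 44, 45, 46]

# Two substitutions at the same position cannot coexist, so drop those pairs.
pairs = [
    (i, j)
    for i, j in combinations(range(len(POSITION_OF_MUT)), 2)
    if POSITION_OF_MUT[i] != POSITION_OF_MUT[j]
]
labels = [f"{i}_{j}" for i, j in pairs]

print(len(labels))  # 43
```

Only `2_3` and `4_5` are excluded, which leaves 43 of the 45 possible pairs.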
Results
| Mutations | ipTM value |
|---|---|
| 0_1 | 0.152 $\pm$ 0.02 |
| 0_2 | 0.16 $\pm$ 0.02 |
| 0_3 | 0.148 $\pm$ 0.02 |
| 0_4 | 0.152 $\pm$ 0.02 |
| 0_5 | 0.144 $\pm$ 0.02 |
| 0_6 | 0.168 $\pm$ 0.02 |
| 0_7 | 0.17 $\pm$ 0.02 |
| 0_8 | 0.178 $\pm$ 0.02 |
| 0_9 | 0.208 $\pm$ 0.04 |
| 1_2 | 0.15 $\pm$ 0.02 |
| 1_3 | 0.152 $\pm$ 0.02 |
| 1_4 | 0.154 $\pm$ 0.03 |
| 1_5 | 0.146 $\pm$ 0.01 |
| 1_6 | 0.168 $\pm$ 0.03 |
| 1_7 | 0.178 $\pm$ 0.03 |
| 1_8 | 0.172 $\pm$ 0.02 |
| 1_9 | 0.178 $\pm$ 0.02 |
| 2_4 | 0.146 $\pm$ 0.02 |
| 2_5 | 0.146 $\pm$ 0.02 |
| 2_6 | 0.156 $\pm$ 0.02 |
| 2_7 | 0.164 $\pm$ 0.03 |
| 2_8 | 0.174 $\pm$ 0.03 |
| 2_9 | 0.196 $\pm$ 0.04 |
| 3_4 | 0.136 $\pm$ 0.02 |
| 3_5 | 0.144 $\pm$ 0.02 |
| 3_6 | 0.156 $\pm$ 0.03 |
| 3_7 | 0.136 $\pm$ 0.02 |
| 3_8 | 0.142 $\pm$ 0.02 |
| 3_9 | 0.144 $\pm$ 0.02 |
| 4_6 | 0.146 $\pm$ 0.03 |
| 4_7 | 0.14 $\pm$ 0.02 |
| 4_8 | 0.144 $\pm$ 0.02 |
| 4_9 | 0.14 $\pm$ 0.02 |
| 5_6 | 0.146 $\pm$ 0.03 |
| 5_7 | 0.14 $\pm$ 0.02 |
| 5_8 | 0.144 $\pm$ 0.02 |
| 5_9 | 0.14 $\pm$ 0.02 |
| 6_7 | 0.142 $\pm$ 0.01 |
| 6_8 | 0.142 $\pm$ 0.02 |
| 6_9 | 0.14 $\pm$ 0.02 |
| 7_8 | 0.14 $\pm$ 0.02 |
| 7_9 | 0.14 $\pm$ 0.02 |
| 8_9 | 0.152 $\pm$ 0.02 |
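Thus, METRFPQQSQQTLASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAFFLSKFTNQLLLSLLEAVIRTVTTLQQLLT (mut #0 and mut #9, i.e. P13L and I46F) is identified as the “best” L-protein, with the highest ipTM value in the table (0.208 $\pm$ 0.04). All of the ipTM values are low, so this ranking should be read as tentative.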
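The reported sequence can be checked against the mutation table with a small helper that applies substitutions by 1-based position. The wild-type sequence is not given in this write-up, so the sketch below assumes it is the reported sequence with the two substitutions reverted. `apply_mutations` is a hypothetical helper name.

```python
REPORTED_BEST = (
    "METRFPQQSQQTLASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAFFLSKFTNQLLLSLLEAVIRTVTTLQQLLT"
)


def apply_mutations(sequence, mutations):
    """Apply (position, old, new) substitutions; positions are 1-based."""
    residues = list(sequence)
    for pos, old, new in mutations:
        if residues[pos - 1] != old:
            raise ValueError(
                f"expected {old} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)


# Assumption: the wild type is the reported variant with L13 and F46 reverted.
wild_type = apply_mutations(REPORTED_BEST, [(13, "L", "P"), (46, "F", "I")])

# Re-apply mut #0 (P13L) and mut #9 (I46F) and compare with the reported variant.
best = apply_mutations(wild_type, [(13, "P", "L"), (46, "I", "F")])
print(best == REPORTED_BEST)  # True
```

The residue check inside `apply_mutations` catches off-by-one position errors, which are easy to make when moving between 1-based residue numbering and 0-based string indexing.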