Week 5: HW Protein Design Part II

Part A

Part 1: Generate Binders with PepMLM

| Binder Sequence | Perplexity Score |
| --- | --- |
| WRYYATVARHKE | 15.38014901 |
| WLYYVVVLRHGE | 32.45105037 |
| WRYYAAGARLKE | 11.75101427 |
| WRYYATAVELKG | 10.44080960 |

Part 2: Evaluate Binders with AlphaFold3

| Peptide ID | Binder Sequence | ipTM Score | Location |
| --- | --- | --- | --- |
| Peptide 0 | WRYYATVARHKE | 0.35 | Approaches the $\beta$ barrel; surface-bound |
| Peptide 1 | WLYYVVVLRHGE | 0.33 | Approaches the $\beta$ barrel; surface-bound |
| Peptide 2 | WRYYAAGARLKE | 0.35 | Approaches the $\beta$ barrel; surface-bound |
| Peptide 3 | WRYYATAVELKG | 0.22 | Conforms to the dimer interface; binds to a pocket |
| Control | FLYRWLPSRRGG | 0.31 | Surrounding the $\beta$ barrel; surface-bound |

Peptides 0-2 exceed the ipTM score of the control (a known binder), which would nominally indicate a stronger predicted protein-protein interaction, but all of these values still represent low confidence. I would therefore not conclude that they produce stronger protein-protein interactions for use as potential molecular glues.
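The comparison against the control can be made explicit with a short script. This is only a minimal sketch: the 0.6 and 0.8 cutoffs are the commonly quoted AlphaFold guidance (below 0.6 is likely a failed prediction, above 0.8 is confident) and are assumed here rather than taken from the assignment, and the names `classify`, `beats_control`, and `confidence` are illustrative.

```python
# ipTM scores copied from the table above.
iptm = {
    "Peptide 0": 0.35,
    "Peptide 1": 0.33,
    "Peptide 2": 0.35,
    "Peptide 3": 0.22,
}
CONTROL_IPTM = 0.31

# Assumed cutoffs (commonly quoted AlphaFold guidance, not from the assignment).
LOW_CUTOFF = 0.6
HIGH_CUTOFF = 0.8


def classify(score):
    """Bin an ipTM score into a coarse confidence label."""
    if score >= HIGH_CUTOFF:
        return "high"
    if score >= LOW_CUTOFF:
        return "ambiguous"
    return "low"


# Peptides whose ipTM is strictly above the control's.
beats_control = sorted(name for name, s in iptm.items() if s > CONTROL_IPTM)
confidence = {name: classify(s) for name, s in iptm.items()}

for name, score in iptm.items():
    delta = score - CONTROL_IPTM
    print(f"{name}: ipTM={score:.2f} ({delta:+.2f} vs control), {confidence[name]} confidence")
```

Under these assumed cutoffs every peptide, and the control itself, falls in the low-confidence bin, which is why the small differences relative to the control are not treated as evidence of stronger binding.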

Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse

| Peptide ID | Binder Sequence | Predicted Binding Affinity (pKd/pKi) | Solubility | Hemolysis Probability | Net Charge (pH 7) | Molecular Weight (Da) |
| --- | --- | --- | --- | --- | --- | --- |
| Peptide 0 | WRYYATVARHKE | 5.659 | 1.000 | 0.034 | 1.85 | 1579.8 |
| Peptide 1 | WLYYVVVLRHGE | 6.562 | 1.000 | 0.158 | -0.15 | 1533.8 |
| Peptide 2 | WRYYAAGARLKE | 5.878 | 1.000 | 0.042 | 1.77 | 1483.7 |
| Peptide 3 | WRYYATAVELKG | 5.785 | 1.000 | 0.060 | 8.50 | 1456.6 |

The ipTM scores span too narrow a range to draw meaningful conclusions about binding affinity, although I would expect structural conformation to contribute, by necessity, to the strength of protein-protein interactions. Peptide 1 had the highest predicted binding affinity (6.562 pKd/pKi) but also the highest hemolysis probability (0.158) and the lowest net charge (-0.15). On that note, peptide 3 makes for the best therapeutic candidate: its net charge at pH 7 and its solubility assist in drug delivery, while its predicted binding affinity supports its efficacy as a therapeutic.
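As a rough cross-check on the net-charge column, the charge at pH 7 can be estimated from sequence with the Henderson-Hasselbalch equation. This is a sketch under stated assumptions: the pKa values below are textbook approximations chosen for illustration, PeptiVerse's own method is not described here, and the results will shift with a different pKa set.

```python
# Assumed pKa values (textbook approximations, not PeptiVerse's parameters).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 2.3


def net_charge(seq, ph=7.0):
    """Estimate the net charge of a free peptide at the given pH."""
    # Fraction protonated (charged) for basic groups, starting with the N-terminus.
    positive = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    # Fraction deprotonated (charged) for acidic groups, starting with the C-terminus.
    negative = 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for aa in seq:
        if aa in POSITIVE_PKA:
            positive += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            negative += 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return positive - negative


peptides = {
    "Peptide 0": "WRYYATVARHKE",
    "Peptide 1": "WLYYVVVLRHGE",
    "Peptide 2": "WRYYAAGARLKE",
    "Peptide 3": "WRYYATAVELKG",
}

for name, seq in peptides.items():
    print(f"{name} ({seq}): estimated net charge at pH 7 = {net_charge(seq):+.2f}")
```

The estimate only counts ionizable side chains and the two termini, so it is best read as a sanity check on the sign and rough magnitude of each reported value.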

Part 4

KKCGKQFKQKQE

This peptide is vastly different from the other peptides in terms of conformation and osmolarity. I would request representative lab samples of the protein and first conduct binding efficiency assays using mass spectrometry (MS) before advancing either as a candidate.
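One property that can be compared directly from sequence is hydropathy. The sketch below computes the GRAVY score (mean Kyte-Doolittle hydropathy) for the Part 4 peptide and the four PepMLM binders. Choosing GRAVY as the comparison metric is an assumption made for illustration; it is not part of the assignment.

```python
# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Mean hydropathy per residue; more negative means more hydrophilic."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


PART4_PEPTIDE = "KKCGKQFKQKQE"
PEPMLM_BINDERS = ["WRYYATVARHKE", "WLYYVVVLRHGE", "WRYYAAGARLKE", "WRYYATAVELKG"]

part4_gravy = gravy(PART4_PEPTIDE)
binder_gravy = {seq: gravy(seq) for seq in PEPMLM_BINDERS}

print(f"{PART4_PEPTIDE}: GRAVY = {part4_gravy:.2f}")
for seq, score in binder_gravy.items():
    print(f"{seq}: GRAVY = {score:.2f}")
```

On this scale the Part 4 peptide is more hydrophilic than any of the PepMLM binders, driven by its five lysines and three glutamines.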

Part B

Part C

Protein Design Process Overview

We’ll be utilizing random mutagenesis to generate variants of L-protein optimized for both (1) lysis activity and (2) expression levels. We begin with the ground truth of the experimental data provided, generate combinations of mutations, and test them using AF2-Multimer.

Mutation Set

We filter mutations by excluding nonsense mutations and prioritizing missense mutations that enhance both lytic activity and expression levels. Frankly, predicting the synergistic effects of different mutations is not a straightforward problem, so we resort to a brute-force search over the possible combinations of mutations in L-protein that affect its binding to DNAJ. In an ideal world, we would create a performance score that also assessed lytic activity and expression, but existing models wouldn’t encode these properties, especially for a small transmembrane protein.

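| Mut # | Amino Acid Position | Amino Acid Change |
| --- | --- | --- |
| 0 | 13 | P->L |
| 1 | 15 | S->A |
| 2 | 18 | R->G |
| 3 | 18 | R->I |
| 4 | 30 | R->Q |
| 5 | 30 | R->L |
| 6 | 31 | R->I |
| 7 | 44 | L->P |
| 8 | 45 | A->P |
| 9 | 46 | I->F |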

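Mathematically, there are $2^6 \times 3^2 = 576$ possible sets of mutations that can be made (counting the unmutated wild type): six positions (13, 15, 31, 44, 45, 46) carry a single candidate substitution and so have two states each, while positions 18 and 30 each carry two alternative substitutions and so have three states each.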
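The count can be reproduced by enumerating the sets directly. The following is a minimal sketch, not the pipeline actually used: the helper names are hypothetical, and the wild-type sequence is an assumption reconstructed by reverting P13L and I46F in the variant sequence reported in the Results section.

```python
from itertools import product

# (mut #, 1-based position, wild-type residue, new residue), from the table above.
MUTATIONS = [
    (0, 13, "P", "L"), (1, 15, "S", "A"), (2, 18, "R", "G"), (3, 18, "R", "I"),
    (4, 30, "R", "Q"), (5, 30, "R", "L"), (6, 31, "R", "I"), (7, 44, "L", "P"),
    (8, 45, "A", "P"), (9, 46, "I", "F"),
]

# Variant reported in the Results section (mut #0 and mut #9 applied).
REPORTED_0_9 = ("METRFPQQSQQTLASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAFFLSKF"
                "TNQLLLSLLEAVIRTVTTLQQLLT")


def substitute(seq, pos, new):
    """Return seq with the residue at 1-based position pos replaced by new."""
    return seq[:pos - 1] + new + seq[pos:]


# Assumption: wild type recovered by reverting the two reported substitutions.
WILD_TYPE = substitute(substitute(REPORTED_0_9, 13, "P"), 46, "I")


def apply_mutations(seq, mut_ids):
    """Apply the listed mutations, rejecting clashes and wild-type mismatches."""
    by_id = {m[0]: m for m in MUTATIONS}
    positions = [by_id[i][1] for i in mut_ids]
    if len(set(positions)) != len(positions):
        raise ValueError("two mutations target the same position")
    for i in mut_ids:
        _, pos, wt, new = by_id[i]
        if seq[pos - 1] != wt:
            raise ValueError(f"expected {wt} at position {pos}, found {seq[pos - 1]}")
        seq = substitute(seq, pos, new)
    return seq


# Each position is either left unmutated (None) or takes one of its candidates.
choices = {}
for mut_id, pos, _, _ in MUTATIONS:
    choices.setdefault(pos, [None]).append(mut_id)

all_sets = [tuple(m for m in combo if m is not None)
            for combo in product(*choices.values())]
double_mutants = [s for s in all_sets if len(s) == 2]

print(len(all_sets))        # total sets, including the empty (wild-type) set
print(len(double_mutants))  # sets with exactly two mutations
print(apply_mutations(WILD_TYPE, (0, 9)))
```

Grouping the candidates by position is what excludes impossible sets such as mut #2 with mut #3, which both target residue 18.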

Evaluation

In the absence of reliable lysis and expression level metrics, we instead use the ipTM score of the DNAJ:L-protein complex predicted by AF2-Multimer as a proxy for binding affinity.

Results

Mutationsiptm value
0_10.152 $\pm$ 0.02
0_20.16 $\pm$ 0.02
0_30.148 $\pm$ 0.02
0_40.152 $\pm$ 0.02
0_50.144 $\pm$ 0.02
0_60.168 $\pm$ 0.02
0_70.17 $\pm$ 0.02
0_80.178 $\pm$ 0.02
0_90.208 $\pm$ 0.04
1_20.15 $\pm$ 0.02
1_30.152 $\pm$ 0.02
1_40.154 $\pm$ 0.03
1_50.146 $\pm$ 0.01
1_60.168 $\pm$ 0.03
1_70.178 $\pm$ 0.03
1_80.172 $\pm$ 0.02
1_90.178 $\pm$ 0.02
2_40.146 $\pm$ 0.02
2_50.146 $\pm$ 0.02
2_60.156 $\pm$ 0.02
2_70.164 $\pm$ 0.03
2_80.174 $\pm$ 0.03
2_90.196 $\pm$ 0.04
3_40.136 $\pm$ 0.02
3_50.144 $\pm$ 0.02
3_60.156 $\pm$ 0.03
3_70.136 $\pm$ 0.02
3_80.142 $\pm$ 0.02
3_90.144 $\pm$ 0.02
4_60.146 $\pm$ 0.03
4_70.14 $\pm$ 0.02
4_80.144 $\pm$ 0.02
4_90.14 $\pm$ 0.02
5_60.146 $\pm$ 0.03
5_70.14 $\pm$ 0.02
5_80.144 $\pm$ 0.02
5_90.14 $\pm$ 0.02
6_70.142 $\pm$ 0.01
6_80.142 $\pm$ 0.02
6_90.14 $\pm$ 0.02
7_80.14 $\pm$ 0.02
7_90.14 $\pm$ 0.02
8_90.152 $\pm$ 0.02

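Thus, the double mutant combining mut #0 (P13L) and mut #9 (I46F), which has the highest ipTM value (0.208 $\pm$ 0.04), is identified to be the “best” L-protein: METRFPQQSQQTLASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAFFLSKFTNQLLLSLLEAVIRTVTTLQQLLT.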
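The ranking behind this choice can be reproduced from the results table. The sketch below sorts the double mutants by mean ipTM and lists those whose interval overlaps the top hit's. Treating each $\pm$ value as a symmetric spread around the mean is an assumption, since the write-up does not say how it was computed, and the variable names are illustrative.

```python
# Pair, mean ipTM, and reported +/- value, copied from the results table.
RESULTS = """
0_1 0.152 0.02
0_2 0.16 0.02
0_3 0.148 0.02
0_4 0.152 0.02
0_5 0.144 0.02
0_6 0.168 0.02
0_7 0.17 0.02
0_8 0.178 0.02
0_9 0.208 0.04
1_2 0.15 0.02
1_3 0.152 0.02
1_4 0.154 0.03
1_5 0.146 0.01
1_6 0.168 0.03
1_7 0.178 0.03
1_8 0.172 0.02
1_9 0.178 0.02
2_4 0.146 0.02
2_5 0.146 0.02
2_6 0.156 0.02
2_7 0.164 0.03
2_8 0.174 0.03
2_9 0.196 0.04
3_4 0.136 0.02
3_5 0.144 0.02
3_6 0.156 0.03
3_7 0.136 0.02
3_8 0.142 0.02
3_9 0.144 0.02
4_6 0.146 0.03
4_7 0.14 0.02
4_8 0.144 0.02
4_9 0.14 0.02
5_6 0.146 0.03
5_7 0.14 0.02
5_8 0.144 0.02
5_9 0.14 0.02
6_7 0.142 0.01
6_8 0.142 0.02
6_9 0.14 0.02
7_8 0.14 0.02
7_9 0.14 0.02
8_9 0.152 0.02
"""

results = []
for line in RESULTS.strip().splitlines():
    pair, mean, spread = line.split()
    results.append((pair, float(mean), float(spread)))

# Rank by mean ipTM, highest first.
ranked = sorted(results, key=lambda r: r[1], reverse=True)
best_pair, best_mean, best_spread = ranked[0]

# Pairs whose upper bound reaches the best pair's lower bound (assumed intervals).
overlapping = [pair for pair, mean, spread in ranked[1:]
               if mean + spread >= best_mean - best_spread]

print(f"best: {best_pair} ({best_mean} +/- {best_spread})")
print(f"{len(overlapping)} other pairs overlap its interval")
```

Because many intervals overlap the top hit's, and every ipTM value here is low in absolute terms, the ranking is best read as a way to prioritize variants for follow-up rather than as evidence of a real difference in binding.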