Week 10 HW
Homework: Waters Part I — Molecular Weight
1. Predicted molecular weight of eGFP
The eGFP sequence contains a C-terminal His purification tag (HHHHHH) and a short linker (LE) before the tag.
Using the full provided amino acid sequence:
- Length: 247 amino acids
- Predicted average molecular weight before chromophore maturation: 28,006.60 Da
- GFP chromophore maturation (cyclization with loss of H2O, followed by oxidation with loss of 2 H) causes a mass loss of approximately 20.03 Da
- Corrected theoretical molecular weight: 27,986.58 Da, or 27.987 kDa
So the calculated molecular weight is approximately:
MW_theory = 27.987 kDa
2. Molecular weight from adjacent charge states
I selected two adjacent charge-state peaks from the intact LC-MS spectrum:
| Peak | m/z | Assigned charge |
|---|---|---|
| Peak 1 | 933.739 | 30+ |
| Peak 2 | 966.039 | 29+ |
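The charge state of the lower-m/z peak can be estimated from the spacing between the two adjacent peaks, where "higher" and "lower" refer to the m/z values:
[ z_{lower} \approx \frac{m/z_{higher}}{m/z_{higher} - m/z_{lower}} ]
[ z_{lower} \approx \frac{966.039}{966.039 - 933.739} = \frac{966.039}{32.300} \approx 29.9 \approx 30 ]
So the peak at m/z 933.739 is 30+, and the adjacent peak at m/z 966.039 carries one fewer charge, 29+.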
Then the molecular weight can be calculated from:
[ MW = z(m/z - H^+) ]
where (H^+ \approx 1.0073) Da.
For the 30+ peak:
[ MW = 30(933.739 - 1.0073) = 27,981.95 \text{ Da} ]
For the 29+ peak:
[ MW = 29(966.039 - 1.0073) = 27,985.92 \text{ Da} ]
Average experimental molecular weight:
[ MW_{experiment} \approx 27,983.94 \text{ Da} = 27.984 \text{ kDa} ]
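The adjacent-charge-state calculation above can be sketched in a few lines of Python. This is only an illustration using the two peaks from the table; the function names are my own, and the charge formula subtracts the proton mass from the numerator, which is slightly more exact than the approximation in the text but rounds to the same charge.

```python
PROTON = 1.0073  # Da, the proton mass value used in the text


def charge_of_lower_peak(mz_lower: float, mz_higher: float) -> int:
    """Charge z of the lower-m/z peak, assuming the higher-m/z peak is z - 1."""
    # Follows from M = z * (mz_lower - H) = (z - 1) * (mz_higher - H)
    return round((mz_higher - PROTON) / (mz_higher - mz_lower))


def neutral_mass(mz: float, z: int) -> float:
    """Neutral mass from a single charge-state peak: MW = z * (m/z - H+)."""
    return z * (mz - PROTON)


z = charge_of_lower_peak(933.739, 966.039)
masses = [neutral_mass(933.739, z), neutral_mass(966.039, z - 1)]
mean_mass = sum(masses) / len(masses)

print(z, [round(m, 2) for m in masses], round(mean_mass, 2))
```

Averaging only two charge states is a rough estimate; including more peaks from the envelope would normally tighten the result.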
Relative error compared with the theoretical molecular weight:
[ \text{Relative error} = \frac{|27,983.94 - 27,986.58|}{27,986.58} \approx 9.4 \times 10^{-5} ]
This is approximately:
0.0094% error, or 94 ppm
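As a quick check of the error arithmetic, here is a minimal sketch of the relative-error calculation in ppm and percent. The helper name `ppm_error` is hypothetical; the two masses are the experimental and theoretical values from the text.

```python
def ppm_error(observed: float, theoretical: float) -> float:
    """Absolute relative mass error in parts per million."""
    return abs(observed - theoretical) / theoretical * 1e6


err_ppm = ppm_error(27983.94, 27986.58)
err_percent = err_ppm / 1e4  # 1 % = 10,000 ppm

print(f"{err_ppm:.0f} ppm, {err_percent:.4f} %")
```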
3. Charge state of the zoomed-in intact eGFP peak
Yes, the charge state can be observed from the isotope spacing in the zoomed-in peak near m/z 1473.7.
The isotope peaks are separated by about 0.053 m/z. Adjacent isotope peaks differ by about 1 Da in mass, so for an ion of charge z the spacing in m/z is approximately:
[ \Delta(m/z) = \frac{1}{z} ]
then:
[ z \approx \frac{1}{0.053} \approx 19 ]
So the zoomed-in peak is approximately:
19+ charge state
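The isotope-spacing rule can be written as a one-line helper. This sketch assumes, as in the text, a mass difference of 1 Da between neighbouring isotope peaks; the more exact 13C/12C difference is about 1.00335 Da, which can be passed as `isotope_delta` and does not change the rounded charge here. The function name is my own.

```python
def charge_from_isotope_spacing(spacing_mz: float, isotope_delta: float = 1.0) -> int:
    """Estimate charge from the m/z spacing between adjacent isotope peaks."""
    return round(isotope_delta / spacing_mz)


print(charge_from_isotope_spacing(0.053))           # zoomed-in intact peak near m/z 1473.7
print(charge_from_isotope_spacing(0.053, 1.00335))  # same peak with the 13C/12C difference
```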
Homework: Waters Part II — Secondary/Tertiary Structure
1. Native versus denatured protein conformations
A native protein is folded into its functional three-dimensional structure. A denatured protein is unfolded, so more of its amino acid residues are exposed to solvent.
In mass spectrometry, this changes the charge-state distribution. A denatured protein usually takes on more protons because more of its basic sites are exposed and accessible for protonation. This produces higher charge states and therefore peaks at lower m/z values.
In the denatured eGFP spectrum, there are many charge states spread across lower m/z values. In the native eGFP spectrum, the protein remains more compact, so it carries fewer charges. The native spectrum therefore shows lower charge states at higher m/z values, with fewer dominant peaks.
So the main difference is:
Denatured eGFP: more highly charged, lower m/z, broader charge-state distribution.
Native eGFP: less highly charged, higher m/z, fewer charge states.
2. Charge state of the native peak near m/z 2800
Yes. The peak near m/z 2800 corresponds to approximately the 10+ charge state.
Reason:
The molecular weight of eGFP is about 28 kDa. A 10+ ion would appear near:
[ \frac{28,000 + 10H^+}{10} \approx 2800 ]
The isotope spacing also supports this. The zoomed-in isotope peaks are separated by about 0.1 m/z, and:
[ z = \frac{1}{0.1} = 10 ]
Therefore:
The native eGFP peak at ~2800 m/z is 10+.
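The mass-based part of this argument can be checked with a short sketch: given an approximate neutral mass, compute the m/z expected for a charge state, or the charge that best explains an observed m/z. The 28,000 Da mass is the rounded value used above, and both function names are hypothetical.

```python
PROTON = 1.0073  # Da, the proton mass value used in the text


def expected_mz(mass: float, z: int) -> float:
    """m/z of an ion of neutral mass `mass` carrying z protons."""
    return (mass + z * PROTON) / z


def nearest_charge(mass: float, observed_mz: float) -> int:
    """Integer charge that best explains an observed m/z for a known mass."""
    return round(mass / (observed_mz - PROTON))


print(nearest_charge(28000, 2800))
print(round(expected_mz(28000, 10), 1))
```

The neighbouring 9+ and 11+ ions would fall near m/z 3112 and 2546, so a peak near 2800 is unambiguous for a 28 kDa protein.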
Homework: Waters Part III — Peptide Mapping / Primary Structure
1. Number of lysines and arginines in eGFP
In the provided eGFP sequence:
- Lysine, K: 20
- Arginine, R: 6
Highlighted sequence, with K and R bolded:
2. Number of peptides generated by tryptic digestion
Trypsin cleaves on the C-terminal side of K and R, except when the next residue is proline.
Using the PeptideMass-style settings shown in the homework figure:
- Enzyme: Trypsin
- Missed cleavages: 0
- Mass mode: monoisotopic
- Output: peptides larger than 500 Da
The digest gives:
27 total tryptic fragments if all fragments are counted.
19 reported peptides if the PeptideMass cutoff of >500 Da is applied.
For the homework answer, I would report:
19 peptides above 500 Da.
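The trypsin rule and the K/R count can be reproduced with a small in-silico digest. This is a minimal sketch with zero missed cleavages; the toy sequence below is made up for illustration and is not the eGFP sequence, and the >500 Da filter used by PeptideMass is not implemented here.

```python
import re


def tryptic_digest(sequence: str) -> list[str]:
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    # Zero-width split: position preceded by K/R and not followed by P (Python 3.7+)
    return [pep for pep in re.split(r"(?<=[KR])(?!P)", sequence) if pep]


toy_sequence = "MSKGEELFRPAKFEGDTLVNRIELK"  # hypothetical example, not eGFP

n_lys = toy_sequence.count("K")
n_arg = toy_sequence.count("R")
peptides = tryptic_digest(toy_sequence)

print(n_lys, n_arg, peptides)
```

In the toy sequence the R followed by P is not cleaved, which is why five K/R residues give four fragments rather than five. Running the same functions on the full eGFP sequence should reproduce the K, R, and total-fragment counts reported above.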
3. Number of chromatographic peaks between 0.5 and 6 minutes
From the peptide map TIC, counting peaks above roughly 10% relative abundance between 0.5 and 6 minutes, I see approximately:
18 chromatographic peaks
This is a visual count from the figure, so I would describe it as approximate.
4. Does the number of chromatographic peaks match the predicted peptide number?
The chromatogram shows slightly fewer peaks than the 19 predicted peptides above 500 Da, and far fewer than the 27 total theoretical tryptic fragments.
This does not necessarily mean the digest failed. Several things can happen:
- Some peptides may co-elute in the same chromatographic peak.
- Some peptides may be too small, too low abundance, or ionize poorly.
- Some peptides may not be detected under the LC-MS conditions.
- Some chromatographic peaks may include adducts, modified peptides, or background signals.
So the experimental chromatogram is close to the predicted peptide count, but it does not produce a perfect one-peak-per-peptide match.
5. m/z, charge state, and singly charged mass of the peptide in Figure 5b
The most abundant peptide peak in Figure 5b is at approximately:
m/z = 525.767
The isotope spacing is about 0.5 m/z, which indicates:
[ z = \frac{1}{0.5} = 2 ]
So the most abundant charge state is:
2+
To calculate the singly charged form ([M+H]^+):
[ [M+H]^+ = z(m/z) - (z-1)H^+ ]
[ [M+H]^+ = 2(525.76712) - 1.0073 ]
[ [M+H]^+ \approx 1050.527 \text{ Da} ]
Therefore:
m/z = 525.767, z = 2+, [M+H]+ ≈ 1050.527 Da
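The conversion from a multiply charged m/z to the singly charged mass can be sketched as follows. The function name is my own; the input is the unrounded m/z used in the calculation above.

```python
PROTON = 1.0073  # Da, the proton mass value used in the text


def singly_charged_mass(mz: float, z: int) -> float:
    """Convert an observed m/z at charge z to the [M+H]+ value."""
    # z * (m/z) is M + z*H; removing (z - 1) protons leaves M + H
    return z * mz - (z - 1) * PROTON


mh = singly_charged_mass(525.76712, 2)
print(round(mh, 3))
```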
6. Peptide identity and mass accuracy
The peptide that best matches this mass in the predicted tryptic digest is:
FEGDTLVNR
The theoretical singly charged mass of this peptide is approximately:
[ [M+H]^+_{theory} = 1050.521 \text{ Da} ]
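The theoretical value can be recomputed from standard monoisotopic residue masses. This is a sketch, not the PeptideMass implementation: the residue masses are the commonly tabulated values rounded to five decimals, the peptide is assumed to be unmodified, and the function name is hypothetical.

```python
# Monoisotopic residue masses in Da (standard tabulated values)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # Da, monoisotopic H2O added for the free termini
PROTON = 1.00728   # Da


def mh_monoisotopic(peptide: str) -> float:
    """Monoisotopic [M+H]+ of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


print(round(mh_monoisotopic("FEGDTLVNR"), 3))
```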
Experimental value:
[ [M+H]^+_{experiment} = 1050.527 \text{ Da} ]
Mass error:
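Carrying one more decimal place than the rounded values above, to avoid rounding error in the small difference:
[ \text{ppm error} = \frac{|1050.5269 - 1050.5214|}{1050.5214} \times 10^6 ]
[ \text{ppm error} \approx 5.2 \text{ ppm} ]
(With both masses rounded to three decimals, 1050.527 and 1050.521, the same formula gives about 5.7 ppm.)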
So the mass accuracy is approximately:
5 ppm error
7. Percentage of sequence confirmed by peptide mapping
From the amino acid coverage map:
88% sequence coverage
Bonus 8. Peptide sequence from the fragmentation spectrum
The peptide sequence that best matches the mass and fragmentation spectrum is:
FEGDTLVNR
Bonus 9. Does the peptide map indicate the protein is eGFP?
Yes. The peptide map supports that the sample is the eGFP standard.
The evidence is:
- The intact protein mass is close to the theoretical eGFP mass.
- The tryptic peptide mass at m/z 525.767, 2+ matches a predicted eGFP peptide.
- The fragmentation spectrum supports the sequence FEGDTLVNR.
- The peptide mapping coverage is high, at 88%.
Together, these results strongly indicate that the protein is eGFP.
Homework: Waters Part IV — Oligomers
KLH oligomer masses can be estimated by multiplying the subunit mass by the number of subunits in the oligomer.
| Oligomeric species | Calculation | Expected mass | Peak position in spectrum |
|---|---|---|---|
| 7FU Decamer | 10 × 340 kDa | 3,400 kDa | ~3.4 MDa |
| 8FU Didecamer | 20 × 400 kDa | 8,000 kDa | ~8.3 MDa |
| 8FU 3-Decamer | 30 × 400 kDa | 12,000 kDa | ~12.7 MDa |
| 8FU 4-Decamer | 40 × 400 kDa | 16,000 kDa | ~16 MDa, weak/low abundance |
The strongest visible assignments are approximately:
- 3.4 MDa = 7FU decamer
- 8.33 MDa = 8FU didecamer
- 12.67 MDa = 8FU 3-decamer
- ~16 MDa = expected 8FU 4-decamer, but it appears weak or not clearly resolved in the shown spectrum
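The oligomer arithmetic can be tabulated with a short script. This sketch uses the nominal subunit masses and the approximate peak positions listed above; the variable names are my own, and the percent deviation is only a rough comparison because the subunit masses are rounded.

```python
# (species, number of subunits, nominal subunit mass in kDa)
SPECIES = [
    ("7FU decamer", 10, 340),
    ("8FU didecamer", 20, 400),
    ("8FU 3-decamer", 30, 400),
    ("8FU 4-decamer", 40, 400),
]
# Approximate peak positions read from the spectrum, in MDa
OBSERVED_MDA = {"7FU decamer": 3.4, "8FU didecamer": 8.33, "8FU 3-decamer": 12.67}

expected_mda = {name: n * subunit_kda / 1000 for name, n, subunit_kda in SPECIES}

for name, expected in expected_mda.items():
    observed = OBSERVED_MDA.get(name)
    if observed is None:
        print(f"{name}: expected {expected:.1f} MDa, not clearly resolved")
    else:
        deviation = (observed - expected) / expected * 100
        print(f"{name}: expected {expected:.1f} MDa, observed {observed} MDa ({deviation:+.1f} %)")
```

The observed didecamer and 3-decamer peaks sit roughly 4 to 6 % above the nominal values, which is what I would expect if the real 8FU subunit is somewhat heavier than the rounded 400 kDa.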
Homework: Waters Part V — Did I make GFP?
| Measurement | Theoretical | Observed (intact LC-MS) | Mass error (ppm) |
|---|---|---|---|
| Molecular weight | 27.987 kDa | 27.984 kDa | ~94 |
Based on the intact protein mass, peptide map, and 88% sequence coverage, the sample is consistent with eGFP.