Mass Spectrometry

1. Calculation of eGFP molecular weight

1a. Online calculator

Reference eGFP sequence:

MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH

According to Gemini:

  • In experimental settings, enhanced GFP (eGFP) is expressed with the N-terminal methionine encoded by the start codon, as well as the His purification tag (HHHHHH) and its linker (LE).
  • The calculation sums the masses of all residues (linear Mw, which includes an extra hydrogen at the N-terminus and an extra OH at the C-terminus) and then accounts for chromophore maturation: the self-catalyzed backbone cyclization and oxidation of the tripeptide Thr65–Tyr66–Gly67 that makes eGFP fluorescent (see diagram below). Cyclization removes one water molecule (-18.02 Da) and oxidation removes two hydrogen atoms (-2 Da).
  • Average molecular weight is the better choice for this calculation because it takes into account the natural abundance of all isotopes, whereas monoisotopic weight considers only the single most abundant isotope of each element. In simple terms, the average mass better reflects the isotope mixture found in nature.

Schematic diagram of the chromophore formation in maturing eGFP:

Source image: What is the maturation time for fluorescent proteins?


Calculation

Entering the full sequence in the online calculator gives a theoretical average molecular weight of ~28.007 kDa (“Th.Av. Mw” = 28006.60 Da).

Integration of the fluorophore maturation: Th.Av. Mw - (H2O Mw + H2 Mw) = 28006.60 - (18.02 + 2) = 28006.60 - 20.02 = 27986.58 Da

In conclusion, the molecular weight of mature eGFP is estimated at 27.987 kDa.
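The calculation above can be reproduced with a short script. This is a minimal sketch, not the online calculator's implementation: it assumes the standard table of average residue masses, and the names `AVG_RESIDUE_MASS`, `linear_average_mass` and `mature_mass` are illustrative. The maturation losses default to the rounded values used in this report (18.02 Da and 2 Da).

```python
# Average residue masses in Da (amino acid minus one water); standard table values.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
AVG_WATER = 18.01528  # H at the N-terminus plus OH at the C-terminus

# Reference eGFP sequence from this report (start Met, LE linker, His tag).
EGFP = (
    "MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT "
    "GKLPVPWPTL VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF "
    "KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV "
    "YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY "
    "LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH"
).replace(" ", "")


def linear_average_mass(seq):
    """Average mass of the unmodified linear chain: residues plus one water."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in seq) + AVG_WATER


def mature_mass(linear_mass, water_loss=18.02, h2_loss=2.0):
    """Subtract the cyclization (H2O) and oxidation (2 H) losses of the chromophore."""
    return linear_mass - water_loss - h2_loss


linear = linear_average_mass(EGFP)
mature = mature_mass(linear)
print(f"linear: {linear:.2f} Da, mature: {mature:.2f} Da")
```

Using more precise values for the losses (18.015 Da for water, 2.016 Da for two hydrogens) shifts the result by about 0.01 Da, which is negligible at this mass.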

1b. Adjacent charge state approach

Pair of adjacent peaks selected from the intact LC-MS data:

M/Zn+1 = 875.4421 and M/Zn = 903.7148

According to the adjacent charge state formula presented during recitation:

n = [ (M/Zn+1) -1 ] / [ (M/Zn) - (M/Zn+1) ]

n = (875.4421 -1) / (903.7148 - 875.4421) = 874.4421 / 28.2727 = 30.9289

Thus, the charge states of the adjacent peaks are:

Zn = ~31+ and Zn+1 = ~32+

According to molecular weight formula presented during recitation:

MW = ( n x M/Zn ) - n

MW (31+) = ( 31 x 903.7148 ) - 31 = 28015.1588 - 31 = 27984.1588 Da

MW (32+) = ( 32 x 875.4421 ) - 32 = 28014.1472 - 32 = 27982.1472 Da

Thus, average experimental molecular weight:

MW = 27.983 kDa
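The adjacent charge state steps above can be sketched in a few lines. This is an illustrative sketch under the recitation's approximation that the proton mass is 1 Da; the function names are hypothetical, and passing `proton=1.00728` would give a slightly more precise result.

```python
def charge_from_adjacent_peaks(mz_low, mz_high, proton=1.0):
    """Charge n of the higher-m/z peak; the lower-m/z peak carries n + 1."""
    return (mz_low - proton) / (mz_high - mz_low)


def neutral_mass(mz, charge, proton=1.0):
    """Neutral mass from an m/z value and its charge: MW = n * m/z - n * proton."""
    return charge * mz - charge * proton


mz_n1, mz_n = 875.4421, 903.7148  # M/Zn+1 and M/Zn from the intact LC-MS data

n = round(charge_from_adjacent_peaks(mz_n1, mz_n))
masses = [neutral_mass(mz_n, n), neutral_mass(mz_n1, n + 1)]
mean_mass = sum(masses) / len(masses)

print(n, [round(m, 4) for m in masses], round(mean_mass, 3))
```

Averaging over more than two charge states, when they are readable, would reduce the influence of a single poorly centroided peak.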

According to the mass accuracy formula presented during recitation:

Accuracy = | MW experiment - MW theory | / MW theory x 1’000’000

With MW experiment = 27’983 Da and MW theory = 27’987 Da

Accuracy = | 27983 - 27987 | / 27987 x 1’000’000 = 142.92 ppm

Conclusion: the mass error (~143 ppm) exceeds 50 ppm, which indicates that either the protein is denatured or the mass spectrometer was not calibrated.
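The mass accuracy formula can be written as a small helper. This is a minimal sketch; the function name `ppm_error` and the variable `THRESHOLD_PPM` are illustrative, and the 50 ppm cut-off is the one quoted in the conclusion above.

```python
def ppm_error(measured, theoretical):
    """Absolute mass error in parts per million."""
    return abs(measured - theoretical) / theoretical * 1_000_000


THRESHOLD_PPM = 50.0  # cut-off quoted in this report

intact_ppm = ppm_error(27983.0, 27987.0)
exceeds_threshold = intact_ppm > THRESHOLD_PPM

print(f"{intact_ppm:.2f} ppm, above threshold: {exceeds_threshold}")
```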

1c. Charge state zoomed-in peak

The values of the zoomed-in peak are not readable.

2. Secondary/tertiary structure

2a. Charge-state distribution

Analysis of eGFP in its native, folded state and comparison with its denatured, unfolded state on a quadrupole time-of-flight MS (lab experiment on a Waters Xevo G3-QToF MS).

When a protein is denatured, it loses the compact 3D (tertiary) structure of its native form. Unfolding exposes more of the protein surface to the solvent, so the protein acquires more charges during electrospray ionization. The extended shape of the protein also increases the drift time through the tube.

The mass spectrometer can detect this by measuring the protein’s charge-state distribution:

Reference: Lecture Week 10 by Lindsay Morrison

Image credit: Waters Corporation (slides from the lecture)

2b. Charge-state ~2800 m/z peak

Rearranging the formula above, MW = ( n x M/Zn ) - n, gives Zn = MW / ( M/Zn - 1 )

With MW_theory = 27986.58 Da and M/Zn = 2799.4199 for the peak at ~2800 M/Z.

Thus, Zn = 27986.58 / 2798.4199 = 10.001, i.e. Zn = ~10+

3. Peptide mapping: primary structure

3a. Lysine (K) and Arginine (R) residues count

Manual count:

Lysine (K) = 20

Arginine (R) = 6

These counts were confirmed by analysing the sequence in Benchling.

3b. Trypsin-generated peptides

Online tool used to predict the list of peptides generated from a tryptic digest: Expasy

Trypsin cuts after the Lysine (K) and Arginine (R) residues. If the digestion is complete, one can expect it to generate 27 (20+6+1) peptides.

However, when running the digestion in Expasy, the list only contains 19 peptides:

This discrepancy is explained by the application of a filter: PeptideMass only displays peptides above 500 Da.

So in conclusion:

  • Theoretical digest: 27 peptides

  • PeptideMass list (>500Da) = 19 peptides
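The peptide counts above can be checked with a simple in silico digest. This is a sketch under stated assumptions, not PeptideMass itself: it cleaves after every K or R unless the next residue is P, allows no missed cleavages, and compares monoisotopic [M+H]+ values against the 500 Da cut-off. The function names are illustrative.

```python
# Monoisotopic residue masses in Da (standard table values).
MONO_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
MONO_WATER = 18.01056
PROTON = 1.00728

EGFP = (
    "MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT "
    "GKLPVPWPTL VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF "
    "KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV "
    "YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY "
    "LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH"
).replace(" ", "")


def tryptic_digest(seq):
    """Split after each K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal peptide
    return peptides


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(MONO_RESIDUE_MASS[aa] for aa in peptide) + MONO_WATER + PROTON


peptides = tryptic_digest(EGFP)
above_500 = [p for p in peptides if mh_plus(p) > 500.0]

print(EGFP.count("K"), EGFP.count("R"), len(peptides), len(above_500))
```

The peptides that fall below the cut-off are the short ones (for example single residues and di- to tetrapeptides), which is why the filtered list is shorter than the theoretical digest.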

3c. Peptide Map

Peaks count:

  • If the threshold is set relative to the baseline intensity (10% above 5e6): 21 to 23 peaks (depending on whether unlabelled peaks are included)
  • If the threshold is set relative to the maximum peak intensity (10% of 1.2e7): 18 peaks

3d. Experiment vs theory

PeptideMass predicted 19 peptides (>500 Da) but the experimental data are slightly different (18-23 peptides, depending on threshold applied). The degree of mismatch appears reasonable and might be explained by:

  • Peptides too small to be detected
  • Miscleavage
  • Sample degradation or modification (e.g. oxidation)
  • Peaks merging due to similar elution time
  • Impurities

3e. Analysis of the peptide at 2.78 min retention time

Charge State

The most abundant peak for the peptide at 2.78 min retention time is at M/Z = 525.76712

According to the formula [M/Z_adj - M/Z] = [M_adj - M] / Z, where adjacent isotopic peaks differ in mass by ~1 Da (M_adj - M = ~1), and given that the adjacent peak is at M/Z_adj = 526.25918,

M/Z_adj - M/Z= 526.25918 - 525.76712 = 0.4921

Thus, Z = 1 / 0.4921 = 2.03 = ~2+


Mass of the singly charged form of the peptide [M+H]+

[M+H]+ = Z x (M/Z) - (Z-1) x H

with M/Z = 525.767, Z = 2 and H+ = 1.00728 Da

[M+H]+ = 2 x 525.76712 - 1 x 1.00728 = 1051.53424 - 1.00728 = 1050.52696 Da
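The two steps above (charge from isotope spacing, then conversion to the singly charged form) can be sketched as follows. The function names are illustrative, and the isotope spacing is taken as 1 Da as in this report; the carbon-13 spacing of about 1.00335 Da gives the same integer charge.

```python
PROTON = 1.00728  # mass of a proton in Da


def charge_from_isotope_spacing(mz_a, mz_b, isotope_step=1.0):
    """Charge state from the m/z gap between two adjacent isotopic peaks."""
    return round(isotope_step / abs(mz_b - mz_a))


def singly_charged_mz(mz, charge, proton=PROTON):
    """Convert an m/z value observed at the given charge to [M+H]+."""
    return charge * mz - (charge - 1) * proton


z = charge_from_isotope_spacing(525.76712, 526.25918)
mh = singly_charged_mz(525.76712, z)

print(z, round(mh, 5))
```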


This result matches the singly charged peak at m/z = 1050.52438 observed in Fig.5c:

3f. Peptide identification & Mass Accuracy

Match with PeptideMass generated list (see above): Peptide FEGDTLVNR 1050.5214 Da (residues 115–123)

Accuracy: Accuracy = | MW experiment - MW theory | / MW theory x 1’000’000

With MW experiment = 1050.52438 Da and MW theory = 1050.5214 Da

Accuracy = | 1050.52438 - 1050.5214 | / 1050.5214 x 1’000’000 = 2.837 ppm

3g. Peptide map coverage

According to this last figure, 88% of the eGFP sequence was identified, which is relatively close to the 90.2% predicted by PeptideMass.

4. Oligomers

5. Experimental data