Week 10 HW: Imaging and Measurement (Mass Spectrometry)

Waters Part I — Molecular Weight

  1. Theoretical molecular weight: Based on the 247 aa sequence (including the initiator methionine, linker, and His-tag), the calculated average molecular weight is 28,006.60 Da. Chromophore maturation, an autocatalytic post-translational modification in which residues T66-Y67-G68 undergo cyclization (−18.02 Da, loss of water) and oxidation (−2.02 Da, loss of two hydrogens), contributes a total mass loss of 20.04 Da. The corrected predicted mass is therefore 27,986.56 Da.

  2. Experimental molecular weight

    1. z for adjacent peaks: z = m/z(n+1) / (m/z(n) − m/z(n+1)). Two peaks were chosen, m/z(n+1) = 875.4421 and m/z(n) = 903.7148; therefore z = 875.4421 / (903.7148 − 875.4421) = 30.9642 ≈ 31, the number of charges the ~28 kDa eGFP protein carries at the m/z 903.7148 peak. z(n) = 31; z(n+1) = 32.
    2. MW = z × (m/z(n) − 1.00728): MW = 31 × (903.7148 − 1.00728) = 31 × 902.70752 = 27,983.93312 Da
    3. Accuracy = |MWexp − MWtheor| / MWtheor × 10^6 (in ppm): Accuracy = |27,983.93312 − 27,986.56| / 27,986.56 × 10^6 ≈ 94 ppm, which exceeds the 50 ppm mass error tolerance (referenced in a Waters application note for another intact protein analyzed on the Xevo™ G3 QTof, https://www.waters.com/nextgen/us/en/library/application-notes/2024/lc-ms-analysis-of-intact-lysine-conjugated-adcs-using-the-acquity-premier-uplc-and-xevo-g3-qtof-mass-spectrometer.html), so by that criterion the measurement is not reliable.
  3. Individual peaks cannot be resolved in the zoomed-in peak: the spacing between them is too small to measure and therefore cannot be used to assign a charge. The zoomed-in peaks are isotope peaks within the charge state envelope.

    (To calculate the theoretical MW, Expasy calculator https://web.expasy.org/compute_pi/ or Biopython can be used)

    # Install Biopython first if needed (notebook syntax): !pip install biopython

    from Bio.SeqUtils.ProtParam import ProteinAnalysis

    # Define the sequence (247 aa, no spaces) and run the analysis
    seq = "MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKLEHHHHHH"

    analysis = ProteinAnalysis(seq)

    mw = analysis.molecular_weight()   # average molecular weight in Da
    pi = analysis.isoelectric_point()  # theoretical pI

    print(f"Molecular Weight: {mw:.2f} Da")
    print(f"Isoelectric Point: {pi:.2f}")
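The charge-state, molecular weight, and accuracy arithmetic from Part I can be reproduced with a few lines of Python. This is only a sketch of the calculation as written in this homework: the function names are made up for illustration, and the charge formula is the same approximation used in step 2.1.

```python
PROTON = 1.00728  # Da, proton mass used in this homework

def charge_of_higher_mz_peak(mz_n, mz_n_plus_1):
    """Estimate the charge n of the higher-m/z peak from two adjacent peaks."""
    return mz_n_plus_1 / (mz_n - mz_n_plus_1)

def neutral_mass(mz, z):
    """Neutral mass from an m/z value and its charge."""
    return z * (mz - PROTON)

def ppm_error(measured, theoretical):
    """Absolute mass error in parts per million."""
    return abs(measured - theoretical) / theoretical * 1e6

mz_n, mz_n_plus_1 = 903.7148, 875.4421       # adjacent peaks chosen above
z_estimate = charge_of_higher_mz_peak(mz_n, mz_n_plus_1)
z = round(z_estimate)

mw_exp = neutral_mass(mz_n, z)
mw_theor = 28006.60 - 18.02 - 2.02           # average MW minus chromophore maturation

print(f"z estimate: {z_estimate:.4f} -> z = {z}")
print(f"Experimental MW: {mw_exp:.2f} Da")
print(f"Theoretical MW:  {mw_theor:.2f} Da")
print(f"Mass error: {ppm_error(mw_exp, mw_theor):.1f} ppm")
```

Rounding the estimate to the nearest integer is what turns 30.96 into the assigned charge of 31.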

Waters Part II — Secondary/Tertiary structure

  1. The mass spectrum shifts toward lower m/z when a denatured protein is evaluated, and shifts toward higher m/z when a native protein is evaluated. In the case of denatured protein, charge envelopes become broader as many charge states are observed.
  2. The charge state of the peak at ~2800 ….
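The direction of the shift follows from m/z = (M + z × 1.00728) / z: the more charges a protein carries, the lower its m/z. A minimal sketch, using the predicted mass from Part I and charge states picked purely for illustration (they are not read from the spectra):

```python
PROTON = 1.00728  # Da

def mz(mass, z):
    """m/z of a protein of neutral mass `mass` carrying z protons."""
    return (mass + z * PROTON) / z

mass = 27986.56  # predicted eGFP mass from Part I, Da

# Hypothetical charge states: high z (unfolded-like) versus low z (folded-like)
for z in (31, 20, 12):
    print(f"z = {z:2d}  ->  m/z = {mz(mass, z):.2f}")
```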

Waters Part III — Peptide Mapping - primary structure

  1. Lysine (K) and Arginine (R) in the eGFP: 20 Lysines and 6 Arginines.

  2. Theoretical digest produced 19 peptides (Expasy output):

    Selected enzyme:Trypsin Maximum number of missed cleavages (MC): 0 Cysteines modifications: All cysteines in reduced form Methionines modifications: Methionines have not been oxidized. Mass of displayed peptides: > 500 Dalton Mass calculation: Using monoisotopic masses of the occurring amino acid residues and giving peptide masses as [M+H]+.

    [Theoretical pI: 5.90 / Mw (average mass): 28006.60 / Mw (monoisotopic mass): 27988.96]

count  mass       position  #MC  modifications  peptide sequence
1      4472.1752  170-210   0                   HNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSK
2      2566.2931  217-239   0                   DHMVLLEFVTAAGITLGMDELYK
3      2437.2608  5-27      0                   GEELFTGVVPILVELDGDVNGHK
4      2378.2577  54-74     0                   LPVPWPTLVTTLTYGVQCFSR
5      1973.9062  142-157   0                   LEYNYNSHNVYIMADK
6      1503.6597  28-42     0                   FSVSGEGEGDATYGK
7      1266.5783  87-97     0                   SAMPEGYVQER
8      1083.4979  240-247   0                   LEHHHHHH
9      1050.5214  115-123   0                   FEGDTLVNR
10     982.4952   133-141   0                   EDGNILGHK
11     821.3940   81-86     0                   QHDFFK
12     790.3552   75-80     0                   YPDHMK
13     769.3913   47-53     0                   FICTTGK
14     711.2944   103-108   0                   DDGNYK
15     655.3813   98-102    0                   TIFFK
16     602.2780   211-215   0                   DPNEK
17     579.3137   128-132   0                   GIDFK
18     507.2925   164-167   0                   VNFK
19     502.3235   124-127   0                   IELK

90.7% of sequence covered:

        10         20         30         40         50         60 
mvskGEELFT GVVPILVELD GDVNGHKFSV SGEGEGDATY GKltlkFICT TGKLPVPWPT 

        70         80         90        100        110        120 
LVTTLTYGVQ CFSRYPDHMK QHDFFKSAMP EGYVQERTIF FKDDGNYKtr aevkFEGDTL 

       130        140        150        160        170        180 
VNRIELKGID FKEDGNILGH KLEYNYNSHN VYIMADKqkn gikVNFKirH NIEDGSVQLA 

       190        200        210        220        230        240 
DHYQQNTPIG DGPVLLPDNH YLSTQSALSK DPNEKrDHMV LLEFVTAAGI TLGMDELYKL 


EHHHHHH
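The K/R counts, the 19 peptides, and the 90.7% coverage can be cross-checked with a small in-silico digest. The sketch below is not the Expasy implementation: it assumes the usual trypsin rule (cleave after K or R, but not before P), no missed cleavages, and standard monoisotopic residue masses, and the function names are invented for illustration.

```python
# Monoisotopic residue masses (Da)
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728

SEQ = (
    "MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPT"
    "LVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTL"
    "VNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLA"
    "DHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKL"
    "EHHHHHH"
)

def tryptic_digest(seq):
    """Split after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal peptide
    return peptides

def mh_plus(peptide):
    """Monoisotopic [M+H]+ of a peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON

peptides = tryptic_digest(SEQ)
kept = [p for p in peptides if mh_plus(p) > 500]          # same > 500 Da filter
coverage = 100 * sum(len(p) for p in kept) / len(SEQ)

print(f"K: {SEQ.count('K')}, R: {SEQ.count('R')}")
print(f"Peptides: {len(peptides)} total, {len(kept)} above 500 Da")
print(f"Coverage by peptides above 500 Da: {coverage:.1f}%")
```

The eight peptides that fall below the 500 Da cutoff are the lowercase stretches in the coverage map.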
  1. If peaks of >10% abundance are considered, ~19 peaks can be counted, about the same number as the predicted peptides:

    [Figure: Marked_chromatogram] Chromatogram with peaks of >10% relative abundance highlighted. The red line marks the 10% cutoff.

  2. The mass-to-charge (m/z) of the peptide shown in Figure 5b.

The isotope spacing is 1/z; the peaks are spaced ~0.5 m/z apart (525.76712, 526.25918), therefore z = +2.

The mass of the singly charged form of the peptide ([M+H]⁺), based on m/z = 525.76712 and z = 2: M(neutral) = (m/z × z) − (z × 1.0073) = 525.76712 × 2 − 2 × 1.0073 = 1,051.53424 − 2.0146 = 1,049.51964 Da; the singly charged form is [M+H]⁺ = M + 1.0073 = 1,049.51964 + 1.0073 = 1,050.52694 m/z.

  1. Identification of the peptide

The peptide is FEGDTLVNR, with a theoretical [M+H]⁺ mass of 1050.5214 Da.

Accuracy = |MWexp − MWtheor| / MWtheor × 10^6 (in ppm): Accuracy = |1,050.52694 − 1,050.5214| / 1,050.5214 × 10^6 = 5.27 ppm, which is within the 10 ppm mass error tolerance for peptide evaluation.

  1. The percentage of the sequence that is confirmed by peptide mapping is 88% (see Figure 6).
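The isotope-spacing and mass-error arithmetic for the Figure 5b peptide can be sketched as follows. The variable names are illustrative, and the proton mass is rounded to 1.0073 as in the calculation above.

```python
PROTON = 1.0073  # Da, rounded as in the text

mono_peak, next_isotope = 525.76712, 526.25918   # from Figure 5b
spacing = next_isotope - mono_peak
z = round(1 / spacing)                           # isotope spacing is 1/z

neutral = mono_peak * z - z * PROTON             # neutral monoisotopic mass
mh_exp = neutral + PROTON                        # singly charged [M+H]+
mh_theor = 1050.5214                             # FEGDTLVNR, Expasy digest

ppm = abs(mh_exp - mh_theor) / mh_theor * 1e6

print(f"spacing = {spacing:.5f}, z = {z}")
print(f"[M+H]+ = {mh_exp:.5f}")
print(f"error = {ppm:.2f} ppm")
```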

Bonus Peptide Map Questions

The Fragment Ion Calculator tool http://db.systemsbiology.net/proteomicsToolkit/FragIonServlet.html provided the following results:

Sequence: FEGDTLVNR, pI: 4.37029

Mass/Charge Table

Ion          Mass - Mono   Mass - Average
(M)          1049.51422    1050.13629
(M+H)+       1050.52149    1051.14356
(M+2H)2+     525.76441     526.07544
(M+3H)3+     350.84538     351.05273
(M+4H)4+     263.38586     263.54138

Fragment Ion Table, monoisotopic masses

Seq  #  B           Y           # (+1)
F    1  148.07574   1050.52149  9
E    2  277.11833   903.45308   8
G    3  334.13979   774.41049   7
D    4  449.16673   717.38902   6
T    5  550.21441   602.36208   5
L    6  663.29848   501.31440   4
V    7  762.36689   388.23034   3
N    8  876.40982   289.16192   2
R    9  1032.51093  175.11900   1
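The b and y values in the table can be reproduced by summing residue masses. This is a minimal sketch, not the Fragment Ion Calculator's code: it assumes singly charged ions and standard monoisotopic residue masses, with b = residue sum + proton and y = residue sum + water + proton.

```python
# Monoisotopic residue masses (Da) for the residues in this peptide
MONO = {
    "F": 147.06841, "E": 129.04259, "G": 57.02146, "D": 115.02694,
    "T": 101.04768, "L": 113.08406, "V": 99.06841, "N": 114.04293,
    "R": 156.10111,
}
WATER, PROTON = 18.01056, 1.00728

def b_y_ions(peptide):
    """Return lists of singly charged b and y ion m/z values (b1..bn, y1..yn)."""
    masses = [MONO[aa] for aa in peptide]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n + 1)]
    y = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n + 1)]
    return b, y

b, y = b_y_ions("FEGDTLVNR")
for i, (b_i, y_i) in enumerate(zip(b, y), start=1):
    print(f"b{i} = {b_i:10.5f}    y{i} = {y_i:10.5f}")
```

Small differences in the last decimal place relative to the table come from rounding of the residue masses used here.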

Based on the ion masses, 7 y ions can be identified in the spectrum:

[Figure: Marked_fragmentation] Figure 5c. Fragmentation spectrum of the peptide eluting at retention time 2.78 minutes in Figure 5a.

No b ions were identified on the spectrum. Claude explains why:

  • b and y ions are not produced equally — the chemistry of fragmentation strongly favors one series over the other depending on the peptide
  • Arginine (R) at the C-terminus — your peptide ends in R, which is highly basic and strongly attracts the charge. This means the C-terminal fragments (y-ions) tend to retain the charge and be detected, while the complementary b-ions lose the charge and become invisible. This is actually a well-known rule: tryptic peptides ending in R or K almost always show dominant y-ion series with weak or absent b-ions

Several peaks do not match with any b or y ions.

Claude explains what these peaks might refer to:

536.74844 and 525.76712 - these are spaced ~11 m/z apart and are in the 500s range, which suggests doubly charged ions (z=+2). 525.76712 matches the doubly charged precursor measured in Figure 5b; 536.74844 may be a doubly charged ion of a larger species related to the precursor.

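214.09165 - The most likely explanation is an internal fragment ion. These form when two bond breaks happen in the same fragmentation event, releasing a small piece from the middle of the peptide. They are named by the residues they contain. The residue masses of T + L sum to 101.05 + 113.08 = 214.13, which is close; however, a singly protonated internal fragment would be expected about 1.007 higher (~215.14), so this assignment is tentative.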

86.09155, 122.08946 - immonium ions, which are small diagnostic ions produced as side products of fragmentation. They are specific to individual amino acids. 86.09155 - immonium ion of Leucine (L), which is indeed in the sequence. 122.08946 - assigned here to the immonium ion of Phenylalanine (F), the first residue in the sequence; the Phe immonium ion is normally expected at m/z 120.08, so this assignment is uncertain.
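One way to sanity-check immonium and internal-fragment assignments is to compute the expected m/z values and compare them with the observed peaks. The sketch below assumes standard monoisotopic masses, immonium ions calculated as residue mass minus CO plus a proton, and b-type internal fragments calculated as the residue sum plus a proton; the helper names are invented for illustration.

```python
# Monoisotopic residue masses (Da) for the residues discussed above
MONO = {"F": 147.06841, "L": 113.08406, "T": 101.04768}
CO, PROTON = 27.99491, 1.00728

def immonium(residue):
    """Expected m/z of the immonium ion of a single residue."""
    return MONO[residue] - CO + PROTON

def internal_fragment(residues):
    """Expected m/z of a singly charged b-type internal fragment."""
    return sum(MONO[aa] for aa in residues) + PROTON

observed = {"L immonium": 86.09155, "F immonium": 122.08946, "TL internal": 214.09165}
expected = {
    "L immonium": immonium("L"),
    "F immonium": immonium("F"),
    "TL internal": internal_fragment("TL"),
}

for name, obs in observed.items():
    exp = expected[name]
    print(f"{name:12s} observed {obs:.5f}  expected {exp:.5f}  diff {obs - exp:+.4f}")
```

A difference of about 1 or 2 m/z units between observed and expected values suggests the peak may belong to a different ion than the one proposed.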

The y ions found in the fragmentation spectrum are consistent with the predicted sequence (FEGDTLVNR). Figure 6, which depicts the percentage of amino acid coverage from peptides positively identified by their calculated mass and fragmentation, confirms that this peptide (FEGDTLVNR) was matched.

Waters Part IV — Oligomers

[Figure: Marked_klh-mass-spec]

Waters Part V — Did I make GFP?