Week 10: Imaging and Measurement

Part I — Molecular Weight

Q1 — Theoretical MW from Sequence

The eGFP sequence was entered into the ExPASy Compute pI/Mw tool.

The resulting molecular weight was:

28,006.60 Da

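However, eGFP undergoes autocatalytic chromophore maturation: cyclization with loss of H2O (about 18 Da) followed by oxidation with loss of H2 (about 2 Da), which together remove approximately 20 Da from the protein.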

Therefore:

Theoretical MW = 28,006.60 - 20.00 = 27,986.60 Da

Q2 — Adjacent Charge State Calculation

From Figure 1, two adjacent peaks were selected:

$$ (m/z)_n = 903.7148 $$

$$ (m/z)_{n+1} = 875.4421 $$

Step 1 — Determine the Charge State

$$ z \approx \frac{(m/z)_{n+1}}{(m/z)_n - (m/z)_{n+1}} = \frac{875.4421}{903.7148 - 875.4421} $$

$$ z = \frac{875.4421}{28.2727} $$

$$ z = 30.96 \approx 31 $$

Step 2 — Determine Molecular Weight

$$ MW = z \times \left[(m/z)_n - 1\right] $$

$$ MW = 31 \times (903.7148 - 1) $$

$$ MW = 31 \times 902.7148 $$

$$ MW = \mathbf{27,984.16 \text{ Da}} $$
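Steps 1 and 2 can be checked numerically. The sketch below is an illustration, not the procedure used for the report. The function names are hypothetical, and unlike the hand calculation it uses the proton mass of 1.00727 Da by default instead of rounding it to 1 Da, so its default result differs slightly from the value above.

```python
PROTON = 1.00727  # Da; the hand calculation above rounds this to 1


def charge_from_adjacent(mz_n, mz_n1, proton=PROTON):
    """Estimate the charge of the higher-m/z peak (mz_n), given the
    neighboring peak (mz_n1) that carries one more charge."""
    return (mz_n1 - proton) / (mz_n - mz_n1)


def neutral_mass(mz, z, proton=PROTON):
    """Neutral mass of an [M + zH]^z+ ion."""
    return z * (mz - proton)


z_estimate = charge_from_adjacent(903.7148, 875.4421)
z = round(z_estimate)
mass_exact_proton = neutral_mass(903.7148, z)
mass_rounded_proton = neutral_mass(903.7148, z, proton=1.0)

print(f"z estimate = {z_estimate:.2f}, rounded to {z}")
print(f"MW with proton = 1.00727 Da: {mass_exact_proton:.2f} Da")
print(f"MW with proton = 1 Da:       {mass_rounded_proton:.2f} Da")
```

Both versions round to the same charge state; only the second decimal place of the estimate and the final mass depend on how the proton mass is treated.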

Step 3 — Determine Accuracy

$$ \text{Accuracy} = \frac{|27,984.16 - 27,986.60|}{27,986.60} $$

$$ = \frac{2.44}{27,986.60} $$

$$ = 8.72 \times 10^{-5} $$

$$ = \mathbf{87.2 \text{ ppm}} $$

At 87.2 ppm, the error exceeds the ideal threshold of 50 ppm, but a 2.44 Da difference on a roughly 28 kDa protein is still close enough to strongly suggest the protein is eGFP.
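The accuracy calculation is a relative error scaled to parts per million. A minimal sketch, with a hypothetical helper name:

```python
def ppm_error(observed, theoretical):
    """Absolute mass error relative to the theoretical mass, in ppm."""
    return abs(observed - theoretical) / theoretical * 1e6


intact_ppm = ppm_error(27984.16, 27986.60)
print(f"Intact mass error: {intact_ppm:.1f} ppm")
```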

Q3 — Charge State of the Zoomed-In Peak

In the zoomed inset, isotope peaks are separated by approximately 0.05 m/z.

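Because adjacent isotope peaks differ by about 1 Da in mass, their spacing on the m/z axis equals:

$$ \Delta(m/z) \approx \frac{1}{z} $$

The charge state is approximately:

$$ z \approx \frac{1}{0.05} = 20 $$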

The peak is sufficiently resolved to directly observe isotopic spacing.
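The spacing rule reduces to a single division. The sketch below assumes a spacing read by eye from an inset and a mass difference of about 1 Da between neighboring isotope peaks; the function name is hypothetical.

```python
def charge_from_spacing(spacing_mz):
    """Nearest integer charge for a given isotope-peak spacing in m/z."""
    if spacing_mz <= 0:
        raise ValueError("isotope spacing must be positive")
    return round(1.0 / spacing_mz)


# Spacings quoted in this report
for spacing in (0.05, 0.09, 0.5):
    print(f"spacing {spacing} m/z -> z = {charge_from_spacing(spacing)}")
```

Because the result is rounded to an integer, a small reading error matters more at high charge, where neighboring charge states differ by only a few thousandths of an m/z unit in spacing.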

Part II — Secondary and Tertiary Structure

Q1 — Native vs. Denatured Protein

A native protein retains its folded three-dimensional structure, including intact secondary and tertiary interactions.

A denatured protein is unfolded into an extended, flexible chain, exposing additional protonatable sites.

In mass spectrometry:

Denatured proteins acquire more charges
Higher charge states produce lower m/z values
Charge-state distributions become broader and shift to lower m/z (left)

Native proteins:

Acquire fewer charges
Produce higher m/z values
Generate narrower charge-state distributions shifted to higher m/z (right)

In Figure 2:

The denatured spectrum (top) shows many peaks around 700–1000 m/z
The native spectrum (bottom) shows fewer peaks above 2000 m/z
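The inverse relationship between charge and m/z can be made concrete with the theoretical mass from Part I. The charge states listed below are illustrative choices, not values read from Figure 2, and the function name is hypothetical.

```python
PROTON = 1.00727  # Da


def mz_for_charge(mass, z, proton=PROTON):
    """m/z of an [M + zH]^z+ ion for a neutral mass in Da."""
    return (mass + z * proton) / z


EGFP_MASS = 27986.60  # Da, theoretical value from Part I

# Illustrative high (denatured-like) and low (native-like) charge states
for z in (35, 31, 27, 11, 10, 9):
    print(f"z = {z:2d} -> m/z = {mz_for_charge(EGFP_MASS, z):8.2f}")
```

Higher charge states land below about 1100 m/z, while the low charge states land above 2000 m/z, in line with the two spectra described above.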

Q2 — Charge State of the ~2800 m/z Native Peak

From the Figure 3 inset, isotope peaks are spaced approximately 0.09 m/z apart.

Using:

$$ \frac{1}{z} \approx 0.09 $$

The charge state is approximately:

$$ z \approx \frac{1}{0.09} \approx 11 $$

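At high instrument resolution, isotopic spacing directly reveals charge state, although a spacing read by eye fixes it only to within about one charge: dividing the theoretical mass by the peak position (27,986.60 / 2800 ≈ 10) points to a neighboring value.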

Part III — Peptide Mapping

Q1 — K and R Count + Highlighted Sequence

MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKLEHHHHHH

Residue counts:

20 Lysine (K)
6 Arginine (R)

Total cleavage sites:

26 cleavage sites
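The residue counts can be verified directly from the sequence string. This sketch assumes the usual trypsin rule (cleavage C-terminal to K or R, except when the next residue is P); the variable names are hypothetical.

```python
SEQ = "MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKLEHHHHHH"

n_lys = SEQ.count("K")
n_arg = SEQ.count("R")

# A cleavage site is a K or R that is not the last residue
# and is not followed by proline.
n_sites = sum(
    1
    for i, aa in enumerate(SEQ[:-1])
    if aa in "KR" and SEQ[i + 1] != "P"
)

print(f"K: {n_lys}, R: {n_arg}, cleavage sites: {n_sites}")
```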

Q2 — Tryptic Peptide Count

Using the ExPASy PeptideMass tool with the settings shown in Figure 4:

19 peptides

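Very short peptides are excluded from the output by the tool's minimum-mass setting, which is why the count is lower than the 27 peptides that a complete digest at 26 sites would give.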
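The peptide count can be reproduced with a small in-silico digest. This is a sketch under stated assumptions, not the PeptideMass implementation: it assumes complete cleavage after K or R (not before P), monoisotopic [M+H]+ masses, and a 500 Da minimum-mass cutoff. The cutoff value is an assumption, since the Figure 4 settings are not reproduced here.

```python
# Monoisotopic residue masses in Da
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
MIN_MASS = 500.0  # Da; assumed cutoff, not taken from Figure 4

SEQ = "MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKLEHHHHHH"


def tryptic_digest(seq):
    """Split after every K or R that is not followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal peptide
    return peptides


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


peptides = tryptic_digest(SEQ)
kept = [p for p in peptides if mh_plus(p) > MIN_MASS]

print(f"All peptides: {len(peptides)}")
print(f"Peptides above {MIN_MASS:.0f} Da: {len(kept)}")
```

Under these assumptions the digest yields 27 peptides, of which 19 pass the cutoff.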

Q3 — Chromatographic Peaks

From Figure 5a, counting peaks greater than 10% relative abundance between 0.5–6 minutes gives:

~21–23 peaks

Q4 — Peak Count vs. Predicted Peptide Count

The observed number of chromatographic peaks exceeds the predicted peptide count.

Possible reasons include:

missed trypsin cleavages
non-specific cleavage
oxidized or modified peptides
peptides detected in multiple charge states

Q5 — Peptide at 2.78 min: m/z, Charge, and Mass

From Figure 5b:

Most abundant peak:
$$ m/z = 525.76712 $$

The isotope spacing is approximately 0.5 m/z, indicating:

$$ z \approx \frac{1}{0.5} = 2 $$

Neutral Molecular Weight

$$ MW = z \times (m/z) - z \times 1.00727 $$

$$ MW = 2 \times 525.76712 - 2 \times 1.00727 $$

$$ MW = 1051.53424 - 2.01454 $$

$$ MW = 1049.52 \text{ Da} $$

Protonated Mass

$$ [M+H]^+ = 1049.52 + 1.00727 $$

$$ [M+H]^+ = \mathbf{1050.53 \text{ Da}} $$

Q6 — Peptide Identification + Mass Accuracy

The closest peptide match from the PeptideMass output is:

FEGDTLVNR

Theoretical protonated mass:

$$ [M+H]^+ = 1050.5214 \text{ Da} $$

PPM Error

$$ \text{Error} = \frac{|1050.527 - 1050.5214|}{1050.5214} \times 10^6 $$

$$ = \frac{0.0056}{1050.5214} \times 10^6 $$

$$ \approx \mathbf{5.3 \text{ ppm}} $$

This is well within the accepted threshold for confident identification.
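The theoretical mass and the ppm error can be recomputed from residue masses as a check. The sketch below uses standard monoisotopic residue masses for the nine residues in FEGDTLVNR; the function names are hypothetical, and the small difference from the hand value comes from rounding of intermediate numbers.

```python
# Monoisotopic residue masses in Da (only the residues needed here)
MONO = {
    "F": 147.06841, "E": 129.04259, "G": 57.02146,
    "D": 115.02694, "T": 101.04768, "L": 113.08406,
    "V": 99.06841, "N": 114.04293, "R": 156.10111,
}
WATER = 18.010565
PROTON = 1.007276


def theoretical_mh(peptide):
    """Monoisotopic [M+H]+ from the residue masses."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


def observed_mh(mz, z):
    """Convert an observed m/z at charge z to the [M+H]+ scale."""
    return z * mz - (z - 1) * PROTON


theo = theoretical_mh("FEGDTLVNR")
obs = observed_mh(525.76712, 2)
ppm = abs(obs - theo) / theo * 1e6

print(f"Theoretical [M+H]+: {theo:.4f} Da")
print(f"Observed [M+H]+:    {obs:.4f} Da")
print(f"Error: {ppm:.1f} ppm")
```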

Q7 — Sequence Coverage

From Figure 6:

88% sequence coverage

Bonus Q8 — Fragment Ion Matching

Using the Fragment Ion Calculator with:

peptide: FEGDTLVNR
singly charged ions
B and Y ions enabled

The fragmentation spectrum in Figure 5c closely matches the predicted fragments.

Most major B and Y ions align correctly. Small unmatched peaks likely represent noise or internal fragment ions.
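The predicted fragments can be sketched from the standard definitions of singly charged b and y ions. This is not the Fragment Ion Calculator's own code, and it does not reproduce the Figure 5c peak list; it only generates the theoretical ladder that the spectrum is compared against. The function name is hypothetical.

```python
MONO = {
    "F": 147.06841, "E": 129.04259, "G": 57.02146,
    "D": 115.02694, "T": 101.04768, "L": 113.08406,
    "V": 99.06841, "N": 114.04293, "R": 156.10111,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_ions(peptide):
    """Return (b, y) lists of singly charged m/z values.

    b[i] covers the first i+1 residues; y[i] covers the last i+1.
    The full-length ions are omitted.
    """
    masses = [MONO[aa] for aa in peptide]
    b, running = [], 0.0
    for m in masses[:-1]:
        running += m
        b.append(running + PROTON)
    y, running = [], 0.0
    for m in reversed(masses[1:]):
        running += m
        y.append(running + WATER + PROTON)
    return b, y


b_ions, y_ions = fragment_ions("FEGDTLVNR")
for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i} = {b:9.4f}    y{i} = {y:9.4f}")
```

A useful sanity check is that each complementary pair (b1 with y8, b2 with y7, and so on) sums to the peptide's [M+H]+ plus one proton mass.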

Bonus Q9 — Did We Make eGFP?

Yes, the collected evidence strongly supports that the sample is eGFP.

Supporting evidence includes:

88% sequence coverage
peptide identifications within 10 ppm
intact molecular weight close to theoretical

The remaining unconfirmed sequence likely corresponds to peptides outside the detectable mass range.

Part IV — Oligomers

Using the subunit masses provided in Table 1:

Species	Calculation	Mass
7FU Decamer	10 × 340 kDa	3,400 kDa (3.4 MDa)
8FU Didecamer	20 × 400 kDa	8,000 kDa (8.0 MDa)
8FU 3-Decamer	30 × 400 kDa	12,000 kDa (12 MDa)
8FU 4-Decamer	40 × 400 kDa	16,000 kDa (16 MDa)

These masses correspond to the major species observed in the CDMS spectrum.

CDMS is especially useful for extremely large assemblies because it measures the m/z and the charge of each individual ion simultaneously, so mass is obtained directly without charge-state deconvolution.
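The table entries are simple multiples of the subunit mass. A minimal sketch, taking the subunit masses of 340 kDa (7FU) and 400 kDa (8FU) from the calculations above; the variable names are hypothetical.

```python
SUBUNIT_KDA = {"7FU": 340, "8FU": 400}  # subunit masses used in the table

# (species name, subunit type, number of subunits)
SPECIES = [
    ("7FU Decamer", "7FU", 10),
    ("8FU Didecamer", "8FU", 20),
    ("8FU 3-Decamer", "8FU", 30),
    ("8FU 4-Decamer", "8FU", 40),
]

oligomer_kda = {
    name: SUBUNIT_KDA[subunit] * count
    for name, subunit, count in SPECIES
}

for name, mass in oligomer_kda.items():
    print(f"{name}: {mass:,} kDa ({mass / 1000:.1f} MDa)")
```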

Part V — Did I Make GFP?

	Theoretical	Observed (Intact LC-MS)	PPM Error
Molecular Weight (Da)	27,986.60	27,984.16	87.2

Conclusion

The observed intact mass differs from the theoretical value by approximately 87 ppm, which is above the ideal threshold of 50 ppm.

However, peptide mapping provides strong supporting evidence:

88% sequence coverage
FEGDTLVNR identified within 5.3 ppm
all major peptides match predicted tryptic fragments

The 2.44 Da discrepancy in intact mass likely results from calculating the mass from a single manually selected charge-state peak rather than from incorrect protein identity.