Week 10 HW: Advanced Imaging & Measurement Technology
Homework: Final Project
For your final project:
Please identify at least one (ideally many) aspect(s) of your project that you will measure. It could be the mass or sequence of a protein, the presence, absence, or quantity of a biomarker, etc.
Please describe all of the elements you would like to measure, and furthermore describe how you will perform these measurements.
What are the technologies you will use (e.g., gel electrophoresis, DNA sequencing, mass spectrometry, etc.)? Describe in detail.
Ans: The primary aspect measured in this project is the biosensor response, observed as a visible color change resulting from the activation of a reporter gene in the presence of plastic-associated chemical signals. Additionally, the level of gene expression will be evaluated to understand the strength of the response under different signal conditions. These measurements will be performed using in silico simulation tools, where the behavior of the genetic circuit is modeled to predict activation and output. DNA construct design and validation will be carried out using Benchling to check sequence structure and functionality. The simulation of circuit behavior will be conducted using Asimov Kernel.
In a practical setting, these measurements could be further validated using cell-free expression assays and techniques such as spectrophotometry or fluorescence analysis to quantify the output signal.
The output of the biosensor is a visible color change, which can be measured quantitatively using spectrophotometry. This technique measures the absorbance of light at the specific wavelength corresponding to the produced color. Within the linear range of the instrument, absorbance is directly proportional to the amount of reporter protein expressed.
If a fluorescent reporter is used, fluorescence intensity can be measured using a fluorometer. The emitted light intensity corresponds to the level of gene expression. This method provides high sensitivity and allows precise quantification of the biosensor response, especially at low signal concentrations.
Homework: Waters Part I - Molecular Weight
We will analyze an eGFP standard on a Waters Xevo G3 QTof MS system to determine the molecular weight of intact eGFP and observe its charge state distribution in the native and denatured (unfolded) states. The conditions for LC-MS analysis of intact protein cause it to unfold and be detected in its denatured form (due to the solvents and pH used for analysis).
- Based on the predicted amino acid sequence of eGFP (see below) and any known modifications, what is the calculated molecular weight? You can use an online calculator like the one at https://web.expasy.org/compute_pi/
eGFP Sequence: MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH
Note: This contains a His-purification tag (HHHHHH) and a linker (the LE before it).
Ans: The molecular weight calculated with the Expasy Compute pI/Mw tool (https://web.expasy.org/compute_pi/) is 28,006.60 Da.

Ans: a. A common pair of adjacent peaks is visible at approximately:
$(m/z)_n \approx 933$
$(m/z)_{n+1} \approx 903$
Determination of charge state (z):
Two adjacent peaks were selected at approximately 933 and 903 m/z. Using the given formula:
$z_n = \dfrac{(m/z)_{n+1}}{(m/z)_n - (m/z)_{n+1}} = \dfrac{903}{933 - 903} \approx 30$
Thus, the charge state of the peak at m/z 933 is z ≈ +30.
b. Molecular weight calculation:
Using the relationship:
$MW = z \cdot (m/z) - z$
$MW = 30 \times 933 - 30 = 27{,}960$ Da
Thus, the molecular weight of eGFP is approximately 27.96 kDa.
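The charge-state and mass arithmetic above can be sketched in a few lines of Python. This is a minimal illustration rather than the instrument software's method: the function names are hypothetical, and it uses a proton mass of 1.00728 Da instead of the rounded value of 1 used in the hand calculation, so the result differs slightly from 27,960 Da.

```python
PROTON = 1.00728  # proton mass in Da (the hand calculation rounds this to 1)


def charge_from_adjacent_peaks(mz_n, mz_n_plus_1):
    """Charge of the peak at mz_n, given the next-higher charge state at mz_n_plus_1.

    mz_n is the higher m/z value (charge z); mz_n_plus_1 is the lower one (charge z + 1).
    """
    z = (mz_n_plus_1 - PROTON) / (mz_n - mz_n_plus_1)
    return round(z)


def neutral_mass(mz, z):
    """Neutral mass of a species observed at m/z with z protons attached."""
    return z * mz - z * PROTON


z = charge_from_adjacent_peaks(933.0, 903.0)
mass_from_933 = neutral_mass(933.0, z)
mass_from_903 = neutral_mass(903.0, z + 1)  # cross-check with the neighbouring peak

print(f"z = {z}")
print(f"MW from m/z 933: {mass_from_933:.1f} Da")
print(f"MW from m/z 903: {mass_from_903:.1f} Da")
```

The two peaks give masses within about 2 Da of each other, which is a quick sanity check that the charge assignment is self-consistent given the approximate peak positions.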
c. From the earlier results:
$MW_{exp} = 27{,}960$ Da
$MW_{theory} = 28{,}006.60$ Da
Accuracy calculation:
$\text{Accuracy} = \dfrac{|27{,}960 - 28{,}006.60|}{28{,}006.60} \approx 0.00166$
Error ≈ 0.166%
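The same accuracy calculation can be expressed as a small helper that reports the error both as a percentage and in ppm. This is a sketch; `relative_error` is a hypothetical name, and the inputs are the approximate values from the calculation above.

```python
def relative_error(measured, theoretical):
    """Signed fractional error of a measured mass relative to the theoretical mass."""
    return (measured - theoretical) / theoretical


mw_exp = 27960.0      # Da, from the charge-state calculation
mw_theory = 28006.60  # Da, from the sequence-based calculation

err = relative_error(mw_exp, mw_theory)
percent_error = abs(err) * 100
ppm_error = err * 1e6  # negative: the measured mass is below the theoretical mass

print(f"Error: {percent_error:.3f} %")
print(f"Error: {ppm_error:.0f} ppm")
```

Keeping the sign in the ppm value shows the direction of the deviation, while the percentage is reported as a magnitude, as in the hand calculation.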
- Charge state of zoomed-in peak:
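No, the charge state cannot be directly determined from the zoomed-in peak because the isotopic spacing is not resolved, and resolved isotope peaks are needed to assign a charge state from a single peak. At z ≈ 30, adjacent isotope peaks would be only about 1/30 ≈ 0.033 m/z apart.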
Homework: Waters Part III - Peptide Mapping - primary structure
- How many Lysines (K) and Arginines (R) are in eGFP? Please circle or highlight them in the eGFP sequence given in Waters Part I question 1 above. (Note: adding the sequence to Benchling as an amino acid file and clicking biochemical properties tab will show you a count for each amino acid).
Ans: eGFP contains 20 lysines (K) and 6 arginines (R). They are marked in brackets in the sequence below:
MVS[K]GEELFTG VVPILVELDG DVNGH[K]FSVS GEGEGDATYG [K]LTL[K]FICTT G[K]LPVPWPTL VTTLTYGVQC FS[R]YPDHM[K]Q HDFF[K]SAMPE GYVQE[R]TIFF [K]DDGNY[K]T[R]A EV[K]FEGDTLV N[R]IEL[K]GIDF [K]EDGNILGH[K] LEYNYNSHNV YIMAD[K]Q[K]NG I[K]VNF[K]I[R]HN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALS[K]D PNE[K][R]DHMVL LEFVTAAGIT LGMDELY[K]LE HHHHHH
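As an alternative to counting by eye or in Benchling, the residue counts can be checked with a short script. This is a minimal sketch: the sequence is the one given in Waters Part I, and the variable names are illustrative only.

```python
from collections import Counter

# eGFP sequence from Waters Part I (spacing is stripped below)
EGFP = """
MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL
VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV
NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD
HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE
HHHHHH
"""

sequence = "".join(EGFP.split())
counts = Counter(sequence)

# Wrap every K and R in brackets to produce a highlighted sequence
highlighted = "".join(f"[{aa}]" if aa in "KR" else aa for aa in sequence)

print(f"Length: {len(sequence)} residues")
print(f"Lysines (K): {counts['K']}, Arginines (R): {counts['R']}")
print(highlighted)
```

Generating the highlighted string from the same sequence that is counted avoids a mismatch between the reported totals and the marked residues.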
2. How many peptides will be generated from tryptic digestion of eGFP?
Navigate to https://web.expasy.org/peptide_mass/
Copy/paste the sequence above into the input box in the PeptideMass tool to generate expected list of peptides.
Use Figure 4 below as a guide for the relevant parameters to predict peptides from eGFP.
Click the "Perform the Cleavage" button in the PeptideMass tool and report the number of peptides generated when using trypsin to perform the digest.
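Ans: Using the PeptideMass tool with default settings, digestion with trypsin gives 19 peptides. A complete digest at all 26 K and R sites (none of them is followed by proline) would produce 27 peptides, but the 8 peptides with a mass below 500 Da are excluded from the list.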
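The peptide count can be cross-checked with a simple in silico digest. The sketch below is not the PeptideMass implementation: it assumes complete cleavage after every K or R that is not followed by P, no missed cleavages, no modifications, and that the 500 Da cutoff applies to the monoisotopic [M+H]+ mass. The function names are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values)
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # Da, added once per peptide
PROTON = 1.00728   # Da, added for the [M+H]+ ion

EGFP = """
MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL
VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV
NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD
HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE
HHHHHH
"""
sequence = "".join(EGFP.split())


def tryptic_digest(seq):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal peptide
    return peptides


def mh_mass(peptide):
    """Monoisotopic [M+H]+ mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


peptides = tryptic_digest(sequence)
above_cutoff = [p for p in peptides if mh_mass(p) > 500.0]

print(f"All tryptic peptides: {len(peptides)}")
print(f"Peptides above 500 Da: {len(above_cutoff)}")
for p in above_cutoff:
    print(f"{mh_mass(p):10.4f}  {p}")
```

Under these assumptions the digest yields 27 peptides in total, 19 of which are above the 500 Da cutoff, in agreement with the count reported above.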
- Based on the LC-MS data for the Peptide Map data generated in lab (please use Figure 5a as a reference) how many chromatographic peaks do you see in the eGFP peptide map between 0.5 and 6 minutes? You may count all peaks that are >10% relative abundance.

Ans: The number of peptide peaks observed in the chromatogram between 0.5 and 6 minutes, considering only peaks above 10% relative abundance, is approximately 12 to 14.
- Assuming all the peaks are peptides, does the number of peaks match the number of peptides predicted from question 2 above? Are there more peaks in the chromatogram or fewer?
Ans: The predicted number of peptides from tryptic digestion is 19 (based on PeptideMass with filtering). The observed number of chromatographic peaks, approximately 12 to 14, is lower, so there are fewer peaks in the chromatogram than predicted peptides.
This difference arises because:
Some peptides may not ionize efficiently
Very small or hydrophilic peptides may not be detected
Some peptides may co-elute
Peaks below the 10% relative abundance threshold were not counted

Ans: Charge state and peptide mass
The isotopic peak spacing for the peptide at 2.78 minutes is approximately 0.5 m/z, giving:
$z = \dfrac{1}{0.5} = 2$
Thus, the charge state is +2.
Using a proton mass of 1.0073 Da:
$MW = z \cdot (m/z) - z \times 1.0073$
$MW = 2 \times 526.27 - 2.0146 = 1050.525$ Da
Therefore, the peptide has:
Charge state (z) = +2
Molecular weight ≈ 1050.5 Da
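The two steps above (charge from isotope spacing, then mass from m/z) can be sketched as follows. The function names are hypothetical, and the inputs are the spacing and m/z values quoted in the answer.

```python
PROTON = 1.00728  # proton mass in Da


def charge_from_isotope_spacing(spacing_mz):
    """Isotope peaks of an ion with charge z are about 1/z apart on the m/z axis."""
    return round(1.0 / spacing_mz)


def neutral_mass(mz, z):
    """Neutral mass of a species observed at m/z with z protons attached."""
    return z * mz - z * PROTON


z = charge_from_isotope_spacing(0.5)
mass = neutral_mass(526.27, z)

print(f"z = +{z}")
print(f"MW = {mass:.3f} Da")
```

Rounding 1/spacing to the nearest integer makes the assignment tolerant of small errors in reading the spacing from the spectrum.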
- Identify the peptide based on comparison to expected masses in the PeptideMass tool. What is mass accuracy of measurement? Please calculate the error in ppm. (Recall that Accuracy from part I)
Ans: Peptide Identification
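The peptide is FEGDTLVNR, listed in the PeptideMass output with a theoretical mass of 1050.5214 Da (this value corresponds to the monoisotopic [M+H]+ ion of the sequence).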
The mass accuracy was calculated using:
$\dfrac{1050.525 - 1050.5214}{1050.5214} \times 10^6 = \dfrac{0.0036}{1050.5214} \times 10^6 \approx 3.4$ ppm
- What is the percentage of the sequence that is confirmed by peptide mapping? (see Figure 6)

Ans: Peptide mapping confirms 88% of the eGFP sequence, as indicated in Figure 6. This value represents the proportion of the protein sequence that is covered by experimentally identified peptides in the LC-MS analysis. The highlighted regions correspond to detected peptides, while the unhighlighted regions indicate portions of the sequence that were not observed, likely due to limitations such as poor ionization or peptide size. This high coverage suggests successful and reliable protein identification.
Bonus Peptide Map Questions
- Can you determine the peptide sequence for the peptide fragmentation spectrum shown in Figure 5c? (HINT: Use your results from Question 2 above to match the peptide molecular weight that is closest to that shown in Figure 5b. Copy and paste its sequence into this tool online to predict the fragmentation pattern based on its amino acid sequence: http://db.systemsbiology.net/proteomicsToolkit/FragIonServlet.html) What is the sequence of the eGFP peptide that best matches the fragmentation spectrum in Figure 5c?
Ans: The peptide sequence that best matches the fragmentation spectrum in Figure 5c is FEGDTLVNR. This identification is based on matching the experimentally determined molecular weight (~1050.5 Da) with the theoretical peptide masses from the PeptideMass tool. The peptide FEGDTLVNR (1050.5214 Da) shows the closest agreement.
- Does the peptide map data make sense, i.e. do the results indicate the protein is the eGFP standard? Why or why not? Consult with Figure 6, which depicts the % amino acid coverage of peptides positively identified using their calculated mass and fragmentation pattern.
Ans: Yes, the peptide map data is consistent with the protein being the eGFP standard. Figure 6 shows approximately 88% amino acid sequence coverage, indicating that a large majority of the protein sequence has been experimentally confirmed through peptide identification. Additionally, multiple peptides across different regions of the sequence were identified and validated using both mass measurements and fragmentation patterns, providing strong evidence for correct protein identification. The remaining 12% of the sequence not covered is likely due to typical limitations such as poor ionization or peptide detectability and does not significantly affect the confidence of identification. Therefore, the results strongly support that the protein analyzed is eGFP.
Homework: Waters Part IV - Oligomers
We will determine Keyhole Limpet Hemocyanin (KLH)'s oligomeric states using charge detection mass spectrometry (CDMS). CDMS single-particle measurements of KLH allow us to make direct mass measurements to determine what oligomeric states (that is, how many protein subunits combine) are present in solution. Using the known masses of the polypeptide subunits (Table 1) for KLH, identify where the following oligomeric species are on the spectrum shown below from the CDMS (Figure 7):
7FU Decamer
8FU Didecamer
8FU 3-Decamer
8FU 4-Decamer

Ans: To identify the oligomeric species of Keyhole Limpet Hemocyanin (KLH) on the provided CDMS spectrum, we calculate the expected mass of each assembly and correlate it with the experimental peaks shown in Figure 7.
Theoretical Mass Calculations
Since the x-axis of the spectrum is in megadaltons (MDa) and the subunit masses are in kilodaltons (kDa), we use the conversion 1,000 kDa = 1 MDa.
| Oligomeric Species | Composition | Calculation | Theoretical Mass |
|---|---|---|---|
| 7FU Decamer | 10 × 7FU subunits | 10 × 340 kDa | 3.40 MDa |
| 8FU Didecamer | 20 × 8FU subunits | 20 × 400 kDa | 8.00 MDa |
| 8FU 3-Decamer | 30 × 8FU subunits | 30 × 400 kDa | 12.00 MDa |
| 8FU 4-Decamer | 40 × 8FU subunits | 40 × 400 kDa | 16.00 MDa |
Species Identification on Spectrum (Figure 7)
Based on the calculations above, the oligomeric species correspond to the following peaks labeled in the mass spectrum:
7FU Decamer: Assigned to the peak at 3.4 MDa. This matches the theoretical calculation exactly.
8FU Didecamer: Assigned to the highest-intensity peak at 8.33 MDa. The shift from 8.00 MDa to 8.33 MDa (about 4%) is likely due to native glycosylation and adducts, which are common in large KLH assemblies.
8FU 3-Decamer: Assigned to the peak at 12.67 MDa, which represents the assembly of 30 8FU subunits (theoretical mass 12.00 MDa).
8FU 4-Decamer: Assigned to the low-intensity cluster of peaks between 16.00 and 17.00 MDa, which corresponds to the 40-subunit assembly.
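The theoretical masses and their deviation from the assigned peaks can be tabulated with a short script. This is a sketch using only the numbers quoted in this answer (340 kDa and 400 kDa subunit masses, and the peak positions read from Figure 7); the dictionary and function names are illustrative.

```python
# Subunit masses in kDa, as used in the table above
SUBUNIT_KDA = {"7FU": 340.0, "8FU": 400.0}

# Oligomer name -> (subunit type, number of subunits)
ASSEMBLIES = {
    "7FU Decamer": ("7FU", 10),
    "8FU Didecamer": ("8FU", 20),
    "8FU 3-Decamer": ("8FU", 30),
    "8FU 4-Decamer": ("8FU", 40),
}

# Peak positions in MDa as assigned above (the 4-decamer is a broad cluster, so it is omitted)
OBSERVED_MDA = {"7FU Decamer": 3.4, "8FU Didecamer": 8.33, "8FU 3-Decamer": 12.67}


def theoretical_mda(name):
    """Theoretical oligomer mass in MDa (1,000 kDa = 1 MDa)."""
    subunit, n = ASSEMBLIES[name]
    return n * SUBUNIT_KDA[subunit] / 1000.0


def deviation_percent(name):
    """Percent difference between the assigned peak and the theoretical mass."""
    theory = theoretical_mda(name)
    return (OBSERVED_MDA[name] - theory) / theory * 100.0


for name in ASSEMBLIES:
    line = f"{name:15s} theory = {theoretical_mda(name):5.2f} MDa"
    if name in OBSERVED_MDA:
        line += f", observed = {OBSERVED_MDA[name]:5.2f} MDa ({deviation_percent(name):+.1f} %)"
    print(line)
```

Expressing the offsets as percentages makes it easier to see whether the larger assemblies deviate from the theoretical masses by a similar relative amount.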
Homework: Waters Part V - Did I make GFP?
Please fill out this table with the data you acquired from the lab work done at the Waters Immerse Lab in Cambridge, or else the data screenshots in this document if you were unable to have lab work done at Waters.
| Theoretical (kDa) | Observed/Measured on Intact LC-MS (kDa) | PPM Mass Error |
|---|---|---|
| 28.0066 | 27.960 | -1664 ppm |
The relatively high ppm error is due to approximate peak selection and lack of deconvolution in intact protein analysis.
Calculations:
Theoretical MW (from sequence):
28,006.60 Da = 28.0066 kDa
Observed MW (from LC-MS intact protein), taken from the earlier intact MS calculation:
≈ 27,960 Da = 27.960 kDa
PPM error calculation
Formula:
$\text{PPM} = \dfrac{MW_{obs} - MW_{theory}}{MW_{theory}} \times 10^6$
$\text{PPM} = \dfrac{27{,}960 - 28{,}006.60}{28{,}006.60} \times 10^6 \approx -1664$ ppm