HTGAA 2026 · Homework Week 10 · April 7, 2026
Week 10

Advanced Imaging & Measurement

Mass spectrometry for protein characterization — molecular weight determination, native vs denatured state analysis, peptide mapping of eGFP, and charge detection MS of KLH oligomers. Applied to Füzi Poiesis biosafety validation at UFRO-BIOREN.

Evan Daugharthy · Waters Corp. · LC-MS · MALDI-TOF · Peptide Mapping
Final Project · Measurement Plan

What Füzi Poiesis Measures

Füzi Poiesis integrates measurement across three layers — computational, biological, and ecological — each with specific targets, instruments, and success thresholds. The connection to this week's mass spectrometry content is direct: MALDI-TOF and HPLC-MS at UFRO's BIOREN Proteomics and Metabolomics Unit are the primary biosafety validation instruments for Aim 2.

What to measure
Measurement targets across all three aims
  • Simulated GFP production (Aim 1). Activation of the AND-gate circuit's output layer is tracked in silico by modeling GFP reporter fluorescence across a 0–100 nM range of the Z₂ AHL inducer, generating a sigmoidal dose-response curve in Python/Matplotlib. The target: output below 5% of maximum in OFF states and above 90% in the (AHL HIGH, SRP HIGH) ON state.
  • Population dynamics and auxotrophic interdependence (Aim 1). Growth curves tracking monoculture extinction versus stable co-culture steady states, computed from the six-equation Monod ODE system. The interior fixed point (n* ≈ 0.629, Re(λ_max) ≈ −0.215) and cascade extinction time under single-strain removal are the key quantitative outputs.
  • Kill switch activation and off-target effects (Aim 2). Specific triggering of the MazE/MazF biocontainment module under low-phosphorus conditions, screened for unintended metabolic byproducts by MALDI-TOF and HPLC.
  • Plasmid stability and biomass (Aim 2). Consortium viability under high-competition solid-state fermentation conditions — target: 10⁸–10⁹ CFU/g without loss of engineered circuits.
  • Fecal coliform reduction (Aims 2–3). Concentration of fecal coliforms in multi-stress microcosms; target: greater than 3-log reduction from baseline (the SEREMI-documented 54,000 MPN/100 mL).
  • Phosphorus and H₂S sequestration (Aims 2–3). Environmental concentrations of soluble reactive phosphorus and hydrogen sulfide — targets: greater than 60% phosphorus reduction and greater than 80% H₂S reduction relative to initial conditions.
  • Ecological resilience (Aims 2–3). Shannon entropy of microbial community composition in microcosms under operational stress, from 16S rRNA amplicon sequencing — the consortium should not collapse native biodiversity below the baseline Shannon entropy of undisturbed Budi sediments.
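The Aim 1 dose-response target can be sketched with a single activating Hill function. This is a minimal illustration, not the project's AND-gate model: it covers only the AHL input, and the half-maximal constant `K_nM`, the Hill coefficient `n`, and the 5 nM OFF cut-off are hypothetical values chosen for the example.

```python
import numpy as np

def hill_response(ahl_nM, K_nM=20.0, n=4.0):
    """Fractional GFP output (0..1) of an activating Hill function.

    K_nM and n are hypothetical parameters, not fitted values.
    """
    ahl = np.asarray(ahl_nM, dtype=float)
    return ahl**n / (K_nM**n + ahl**n)

ahl = np.linspace(0.0, 100.0, 101)   # 0-100 nM inducer range from the text
gfp = hill_response(ahl)             # already a fraction of the saturating maximum

# Check the two thresholds named in the measurement plan.
off_ok = gfp[ahl <= 5.0].max() < 0.05   # OFF: below 5% (5 nM cut-off is assumed)
on_ok = gfp[-1] > 0.90                  # ON: above 90% at 100 nM
print(off_ok, on_ok)
```

Plotting `ahl` against `gfp` with Matplotlib gives the sigmoidal curve; a steeper `n` sharpens the OFF/ON separation.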
How to measure
Technologies and protocols
  • Python / SciPy / Matplotlib. ODE integration (scipy.integrate.solve_ivp, RK45), Jacobian eigenvalue analysis (SymPy + scipy.linalg.eigvals), AND-gate Hill function modeling (NumPy). All Aim 1 computational deliverables.
  • Benchling. In silico genetic circuit design — pFP-A, pFP-B, pFP-C annotated plasmids, virtual restriction digest, R-M shielding verification.
  • MALDI-TOF mass spectrometry (UFRO-BIOREN). Confirmation that the AND-gate PhoA circuit produces the correct protein under inducing conditions; screening for off-target proteins when the MazE/MazF kill switch activates. MALDI-TOF provides rapid intact protein identification by molecular weight without chromatographic separation, applying the same intact-mass principle demonstrated this week on eGFP by LC-MS.
  • HPLC and HPLC-MS/MS (UFRO-BIOREN). Quantification of PhoA enzymatic output (alkaline phosphatase activity assay by pNPP substrate), screening for unintended metabolic byproducts, and confirmation of Microcin J25 lasso peptide production by Strain A. HPLC-MS/MS provides the secondary stringent toxin screening required before any organism enters a non-sterile environment.
  • MPN coliform kits. Standard MPN/100 mL quantification of fecal coliform reduction in microcosm experiments. The regulatory baseline is 1,000 MPN/100 mL (NCh 1333); the documented contamination level at Puerto Domínguez was 54,000 MPN/100 mL.
  • H₂S and SRP sensors. Electrochemical H₂S sensors and colorimetric SRP kits (ascorbic acid-molybdate method) for real-time tracking of remediation efficiency in microcosms.
  • 16S rRNA amplicon sequencing. V3-V4 hypervariable region sequencing on Illumina MiSeq platform; Shannon entropy calculated from OTU tables using QIIME2. Baseline community composition from published studies of comparable Budi sediment samples.
  • CRISPR/Cas9. Gene knockout verification (ΔhisD, ΔtrpB, ΔleuB) by colony PCR and Sanger sequencing of the deletion loci in Aim 2.
  • Gibson Assembly. Physical construction of pFP-A, pFP-B, pFP-C plasmids from PCR-amplified inserts; confirmed by restriction digest gel electrophoresis and Sanger sequencing.
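Two of the ecological metrics above reduce to short formulas. The sketch below computes Shannon entropy from a vector of taxon counts and the log reduction of a coliform count. The count vectors are made-up illustrations, and the natural-log base is a choice made here; other tools may use a different base.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy H = -sum(p * ln p) over taxa with nonzero counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)

def log_reduction(baseline, final):
    """Log10 reduction between a baseline and a final concentration."""
    return math.log10(baseline / final)

# Hypothetical four-taxon communities: perfectly even vs. dominated by one taxon.
even = shannon_entropy([25, 25, 25, 25])
skewed = shannon_entropy([97, 1, 1, 1])

# 54,000 per 100 mL is the documented baseline; 54 per 100 mL is the
# boundary of the ">3-log" target, so the final count must fall below it.
boundary = log_reduction(54_000, 54)
print(round(even, 3), round(skewed, 3), boundary)
```

A consortium that drives the community toward the skewed case would fail the resilience criterion even if it met the coliform target.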
BIOREN connection — Proteómica y Metabolómica UFRO

The BIOREN Proteomics and Metabolomics Unit at Universidad de La Frontera is the institutional infrastructure for Füzi Poiesis Aim 2 biosafety validation. Its MALDI-TOF system enables rapid intact protein profiling — the same principle demonstrated this week on eGFP with the Waters Xevo G3. For Füzi Poiesis, this translates directly: PhoA expression (MW ~47 kDa) from Strain C, and Microcin J25 (MW ~2.1 kDa, lasso-modified) from Strain A, would be confirmed by intact mass measurement before any co-culture experiment. The unit's HPLC pipeline provides the quantitative metabolite screening required to confirm that AND-gate activation produces the intended bioremediation output without unintended byproducts.

Waters Part I

Molecular Weight of eGFP

Question 1 — Theoretical MW
Calculated molecular weight from amino acid sequence (ExPASy ProtParam)

The eGFP sequence (246 amino acids including the His-tag linker LEHHHHHH) was submitted to ExPASy ProtParam and cross-validated with the PeptideMass tool, whose output header reports the same values:

Theoretical pI: 5.90
MW (average mass): 28,006.60 Da
MW (monoisotopic mass): 27,988.96 Da

The difference between average and monoisotopic mass (~17.6 Da) reflects the contribution of naturally occurring heavy isotopes (¹³C, ¹⁵N, ²H) averaged across all atoms in the protein. At 28 kDa the monoisotopic isotopologue is far too rare to detect, so an intact-mass measurement reports the average mass; the monoisotopic value becomes the relevant reference only for peptide-level measurements.

Question 2 — Experimental MW from charge states
Adjacent charge state approach — peaks at m/z 933.7349 and 903.7148
Mass spectrum of intact eGFP from Waters Xevo G3 LC-MS showing charge state envelope from m/z 700-1100 with individual peaks labeled. Major peaks at 933.7349, 903.7148, 875.4421, 848.9758 and others.
Figure 1 · Mass spectrum of intact eGFP · Waters Xevo G3 LC-MS (30,000 resolution) · Charge state envelope spanning m/z 700–1100 · Inset: zoomed peak at ~1473 m/z

Two adjacent peaks were selected from the denatured eGFP charge state envelope: m/z = 933.7349 (charge state n) and m/z = 903.7148 (charge state n+1).

Step 1: Determine z for each adjacent pair

z = m/z_(n+1) / (m/z_n − m/z_(n+1))
z = 903.7148 / (933.7349 − 903.7148)
z = 903.7148 / 30.0201
z = 30.10 → z_n = 30 (z_(n+1) = 31)

Step 2: Determine MW from m/z and z

MW = (m/z_n × z_n) − (z_n × m_proton)

From peak n (m/z = 933.7349, z = 30):
MW = (933.7349 × 30) − (30 × 1.00728) = 28,012.05 − 30.22 = 27,981.83 Da

From peak n+1 (m/z = 903.7148, z = 31):
MW = (903.7148 × 31) − (31 × 1.00728) = 28,015.16 − 31.23 = 27,983.93 Da

Average MW (experimental) = 27,982.88 Da = 27.983 kDa
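The two steps above can be reproduced in a few lines. This sketch includes the proton mass in the charge estimate, which shifts the unrounded value slightly (about 30.07 instead of 30.10) but rounds to the same integer. Function names are illustrative.

```python
PROTON = 1.00728  # proton mass in Da, as used in this report

def charge_of_higher_mz_peak(mz_n, mz_n_plus_1):
    """Charge z of the peak at mz_n, given the adjacent peak carrying z + 1."""
    return (mz_n_plus_1 - PROTON) / (mz_n - mz_n_plus_1)

def neutral_mass(mz, z):
    """Neutral mass M from an [M + zH]^z+ ion observed at mz."""
    return z * (mz - PROTON)

mz_n, mz_n_plus_1 = 933.7349, 903.7148
z_n = round(charge_of_higher_mz_peak(mz_n, mz_n_plus_1))

masses = [neutral_mass(mz_n, z_n), neutral_mass(mz_n_plus_1, z_n + 1)]
mean_mass = sum(masses) / len(masses)
print(z_n, [round(m, 2) for m in masses], round(mean_mass, 2))
```

Averaging over more than two charge states of the envelope would tighten the estimate; two peaks are used here to match the hand calculation.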

Step 3: Accuracy of measurement

Accuracy = |MW_experiment − MW_theory| / MW_theory × 10⁶

vs. average mass (28,006.60 Da) — correct reference for intact MS at this mass range:
= |27,982.88 − 28,006.60| / 28,006.60 × 10⁶ = 847 ppm

vs. monoisotopic mass (27,988.96 Da) — note only:
= |27,982.88 − 27,988.96| / 27,988.96 × 10⁶ = 217 ppm
Reported accuracy: 847 ppm vs average mass.

For intact proteins above ~5 kDa, the instrument measures the centroid of the isotope envelope — which corresponds to the average mass, not the monoisotopic mass. At 28 kDa the monoisotopic isotopologue (all ¹²C, ¹H, ¹⁴N, ¹⁶O) is practically undetectable; the observed signal is the weighted average of all isotopologues. Comparing to the monoisotopic mass (217 ppm) would be technically incorrect for intact LC-MS at this mass range. 847 ppm vs the average mass of 28,006.60 Da is the appropriate accuracy metric and falls within the expected performance of the Waters Xevo G3 for proteins of this size.
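As a quick check of the accuracy arithmetic, the ppm error can be wrapped in a small helper. The function name is illustrative; the masses are the ones reported above.

```python
def ppm_error(observed, theoretical):
    """Absolute mass error in parts per million."""
    return abs(observed - theoretical) / theoretical * 1e6

observed = 27_982.88                               # experimental average of two charge states
vs_average = ppm_error(observed, 28_006.60)        # reference: theoretical average mass
vs_monoisotopic = ppm_error(observed, 27_988.96)   # shown for comparison only
print(round(vs_average), round(vs_monoisotopic))
```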
Question 3 — Zoomed peak charge state
Can the charge state of the peak at ~1473 m/z be observed?

The charge state of the zoomed peak at m/z ≈ 1473 can be determined by calculation, though resolving individual isotope peaks by eye requires care. From the intact MW of 27,983 Da and the observed m/z:

z = MW / m/z ≈ 27,983 / 1473.7 ≈ 19

Verification: MW = 1473.7428 × 19 − 19 × 1.00728 = 28,001.11 − 19.14 = 27,981.97 Da ✓

Expected isotope peak spacing at z = 19: Δ(m/z) = 1/z = 1/19 = 0.053 Da

The inset in Figure 1 shows isotope peaks separated by approximately 0.05–0.15 m/z. At the instrument's resolution of 30,000 and m/z ≈ 1473, the peak width (Δm/z = m/z ÷ resolution) is approximately 1473/30,000 ≈ 0.049 m/z, only slightly narrower than the 0.053 m/z isotope spacing, so individual isotopes sit at the edge of what the instrument can resolve. The charge state of z = 19 is consistent with the observed peak pattern and the calculated MW.
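The resolvability argument compares two numbers: the isotope spacing at a given charge and the peak width implied by the instrument resolution. A minimal sketch, using the same 1/z approximation as the text (the exact ¹³C spacing is slightly larger than 1 Da) and illustrative function names:

```python
def isotope_spacing(z):
    """Approximate m/z spacing between adjacent isotope peaks at charge z."""
    return 1.0 / z

def peak_width(mz, resolution):
    """Peak width in m/z implied by resolution R = (m/z) / width."""
    return mz / resolution

spacing = isotope_spacing(19)
width = peak_width(1473.7428, 30_000)
barely_resolved = spacing > width   # True, but only by a small margin
print(round(spacing, 4), round(width, 4), round(spacing / width, 2))
```

A spacing-to-width ratio close to 1 means adjacent isotope peaks overlap heavily, which is why the charge state is easier to obtain by calculation than by reading the inset.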

Waters Part II · Optional

Native vs Denatured — Secondary/Tertiary Structure

Question 1
What happens when a protein unfolds, and how does mass spectrometry detect it?

A native protein maintains its three-dimensional fold through non-covalent interactions — hydrogen bonds, hydrophobic packing, electrostatic contacts, and van der Waals forces. These interactions compact the structure and, critically for mass spectrometry, determine how many proton-accepting sites are accessible to solvent. In the native state, many basic residues (Lys, Arg, His) are buried in the hydrophobic core or engaged in salt bridges, reducing their ability to acquire protons during electrospray ionization.

When a protein unfolds — by denaturing solvent (acetonitrile, methanol, low pH) — the polypeptide chain extends fully, exposing all basic residues to solvent and dramatically increasing the number of protons the molecule can accept. This produces a higher average charge state and a broader charge state distribution shifted toward lower m/z values in the denatured spectrum.

In Figure 2, the denatured eGFP spectrum (top) shows a broad envelope of high-charge states centered around m/z 800–1000 — consistent with 30+ protons on a fully extended 246-residue chain. The native eGFP spectrum (bottom) shows a narrow distribution at higher m/z (lower charge states), consistent with a compact folded structure where fewer basic residues are accessible. The shift to higher m/z and narrower distribution is the direct spectroscopic signature of tertiary structure.

Question 2
Charge state of the peak at ~2800 m/z in the native spectrum

From the native eGFP spectrum (Figure 3), the peak at ~2800 m/z corresponds to a low charge state of the folded protein. Using the intact MW of 27,983 Da:

z = MW / m/z ≈ 27,983 / 2800 ≈ 10

Verification: m/z = (27,983 + 10 × 1.00728) / 10 = 27,993.07 / 10 = 2,799.3 ≈ 2800 ✓

The charge state is z = 10. This low charge state is expected for native eGFP: the compact β-barrel fold buries most basic residues, limiting proton uptake during electrospray. The narrow, symmetrical peak shape in the inset of Figure 3 confirms a single well-defined conformational state — the native fold — rather than the broad, heterogeneous distribution seen in the denatured spectrum.
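The native and denatured assignments both follow from the same relation between neutral mass, charge, and m/z. The sketch below lists the expected m/z for the three charge states discussed in this report, using the experimental intact mass of 27,983 Da. The function name is illustrative.

```python
PROTON = 1.00728  # proton mass in Da, as used in this report

def mz_for_charge(mass, z):
    """Expected m/z of an [M + zH]^z+ ion for a given neutral mass."""
    return (mass + z * PROTON) / z

mass = 27_983.0
ladder = {z: mz_for_charge(mass, z) for z in (10, 19, 30)}
for z, mz in ladder.items():
    print(f"z = {z:2d}  ->  m/z = {mz:.2f}")
```

The z = 10 ion lands near 2800 (native spectrum), while z = 19 and z = 30 land near 1474 and 934 (denatured envelope).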

Waters Part III

Peptide Mapping — Primary Structure of eGFP

Question 1
Lysines (K) and Arginines (R) in eGFP — trypsin cleavage sites

Trypsin cleaves after Lys (K) and Arg (R) residues. Counting from the eGFP sequence:

K residues: K(28), K(42), K(53), K(74), K(86), K(97), K(102), K(108), K(122), K(123), K(132), K(141), K(157), K(162), K(167), K(209), K(215), K(239), K(246-HisTag)
R residues: R(73), R(80), R(87), R(109), R(116), R(127), R(128), R(145), R(215), R(217)

Total K: ~19 · Total R: ~10 · Total cleavage sites: ~29

With no missed cleavages, ~29 cleavage sites would give about 30 tryptic peptides; the PeptideMass output below lists the 19 of these above 500 Da, which together cover 90.7% of the sequence.
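Counting cleavage sites and predicting peptides can be sketched as an in silico digest. The input below is a toy string stitched together from fragments named in this report (mvsk, aevk, FEGDTLVNR, and the LEHHHHHH tag); it is not the full eGFP sequence. The rule that trypsin does not cleave before proline is the commonly used convention and is applied here as an assumption.

```python
import re

def tryptic_digest(sequence):
    """Split after K or R, except when the next residue is P (no missed cleavages)."""
    return re.findall(r".*?[KR](?!P)|.+$", sequence)

toy = "MVSKAEVKFEGDTLVNRLEHHHHHH"   # illustrative fragment string, not eGFP
n_sites = toy.count("K") + toy.count("R")
peptides = tryptic_digest(toy)
print(n_sites, peptides)
```

Running the same function on the full construct sequence would give the exact K and R counts and the peptide list to compare against the PeptideMass table.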

Questions 2 & 3
PeptideMass tryptic digest — 19 peptides predicted, ~18–20 chromatographic peaks observed
PeptideMass tool submission results showing eGFP sequence, trypsin enzyme, 0 missed cleavages, cysteines in reduced form
ExPASy PeptideMass · eGFP tryptic digest · Trypsin · 0 missed cleavages · Cysteines reduced · Masses as [M+H]⁺
PeptideMass results table showing 19 peptides with masses from 502 to 4472 Da, 90.7% sequence coverage
PeptideMass results · 19 peptides >500 Da · Theoretical pI 5.90 · Average MW 28,006.60 Da · Monoisotopic MW 27,988.96 Da · 90.7% sequence coverage
Peptides predicted (trypsin, 0 missed cleavages, >500 Da): 19
Chromatographic peaks in TIC (0.5–6 min, >10% relative abundance): ~20
More peaks than predicted peptides is expected: peptides below 500 Da (not listed in the PeptideMass table) still appear in the chromatogram. Co-elution works in the opposite direction by merging peptides into one peak, so the peak count only approximates the peptide count.
Questions 4 & 5
Peptide at 2.78 min: m/z 525.767, charge state z=2, [M+H]⁺ = 1050.527 Da

The chromatographic peak at 2.78 min shows a dominant charge state at m/z = 525.767. From the isotope peak spacing in the Figure 5b inset:

Isotope spacing ≈ 0.50 m/z → z = 1/0.50 = 2

[M+H]⁺ = m/z × z − (z−1) × m_proton
= 525.767 × 2 − 1 × 1.00728
= 1051.534 − 1.007 = 1050.527 Da
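The same conversion in code, as a small sketch: the charge comes from the isotope spacing, and the singly protonated mass follows from removing all but one proton. Function names are illustrative.

```python
PROTON = 1.00728  # proton mass in Da, as used in this report

def charge_from_spacing(spacing_mz):
    """Integer charge state from the m/z spacing of adjacent isotope peaks."""
    return round(1.0 / spacing_mz)

def singly_protonated_mass(mz, z):
    """Convert an [M + zH]^z+ ion to its [M+H]+ equivalent."""
    return mz * z - (z - 1) * PROTON

z = charge_from_spacing(0.50)
mh = singly_protonated_mass(525.767, z)
print(z, round(mh, 3))
```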
Questions 6 & 7 (Bonus 8 & 9)
Peptide identification: FEGDTLVNR · ~5 ppm accuracy · 90.7% sequence coverage

Matching [M+H]⁺ = 1050.527 Da against the PeptideMass table: the closest entry is FEGDTLVNR (residues 115–123) with theoretical [M+H]⁺ = 1050.52149 Da (monoisotopic).

Mass accuracy = |1050.527 − 1050.52149| / 1050.52149 × 10⁶
= 0.00551 / 1050.52149 × 10⁶ = 5.2 ppm

5.2 ppm is within the typical accuracy specification of the Waters BioAccord LC-MS system (<5–15 ppm), supporting the peptide identification.
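The theoretical mass of FEGDTLVNR can be cross-checked by summing standard monoisotopic residue masses and adding water and protons. The residue table below holds only the nine residues needed and is rounded to five decimals, so the result agrees with the calculator values to within about 0.0001 Da rather than exactly.

```python
# Monoisotopic residue masses (Da), standard values rounded to 5 decimals.
RESIDUE_MASS = {
    "G": 57.02146, "V": 99.06841, "T": 101.04768, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "E": 129.04259, "F": 147.06841,
    "R": 156.10111,
}
WATER = 18.01056
PROTON = 1.00728

def peptide_mz(sequence, z):
    """Monoisotopic m/z of [M + zH]^z+ for an unmodified peptide."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral + z * PROTON) / z

mh1 = peptide_mz("FEGDTLVNR", 1)   # compare with 1050.52149
mh2 = peptide_mz("FEGDTLVNR", 2)   # compare with 525.76441
print(round(mh1, 4), round(mh2, 4))
```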

Fragment Ion Calculator results for FEGDTLVNR showing b and y ion series, monoisotopic [M+H]+ = 1050.52149, [M+2H]2+ = 525.76441
Fragment Ion Calculator · FEGDTLVNR · pI 4.37 · [M+H]⁺ = 1050.52149 Da · [M+2H]²⁺ = 525.76441 Da · b/y ion series confirmed
Amino acid coverage map of eGFP showing 90.7% of sequence covered by identified peptides, with uncovered regions in lowercase
Amino acid coverage map · eGFP tryptic digest · BioAccord LC-MS · 90.7% sequence coverage · Uncovered regions: mvsk (N-term) and aevk (region 111–114)
Sequence coverage: 88% experimental (Figure 6, BioAccord LC-MS) vs 90.7% theoretical (PeptideMass prediction)
The 2.7% difference reflects peptides predicted by PeptideMass that were not detected in the actual LC-MS run — typical for very small peptides or those with poor ionization efficiency.

Conclusion: The peptide map data confirm this is eGFP. The mass accuracy (~5 ppm), the correct peptide sequence (FEGDTLVNR confirmed by b/y fragmentation ions), and 88% experimental sequence coverage (90.7% predicted) are all consistent with a correctly produced and pure eGFP standard.
Waters Part IV

KLH Oligomers — Charge Detection Mass Spectrometry

Oligomeric state identification
KLH species assignment from CDMS spectrum

Keyhole Limpet Hemocyanin (KLH) exists in multiple oligomeric assemblies in solution. Using the subunit masses from Table 1 (7FU = 340 kDa, 8FU = 400 kDa), the expected masses for each oligomeric species are:

Species        | Subunit        | Subunits | Expected mass | Observed (CDMS) | Δ (%)
7FU Decamer    | 7FU (340 kDa)  | 10       | 3,400 kDa     | 3.40 MDa        | 0% (exact match)
8FU Didecamer  | 8FU (400 kDa)  | 20       | 8,000 kDa     | 8.33 MDa        | +4.1%
8FU 3-Decamer  | 8FU (400 kDa)  | 30       | 12,000 kDa    | 12.67 MDa       | +5.6%
8FU 4-Decamer  | 8FU (400 kDa)  | 40       | 16,000 kDa    | ~16–17 MDa      | ~+4%
The 7FU Decamer matches its predicted mass exactly (3.40 MDa). The three 8FU assemblies show a systematic positive deviation of ~4–6% relative to the polypeptide-only mass of 400 kDa per subunit. This is not instrument error: KLH is a heavily glycosylated protein, and the carbohydrate mass is not captured in the polypeptide subunit mass given in Table 1. The observed deviation implies an effective 8FU subunit mass of approximately 416–422 kDa (8.33 MDa / 20 ≈ 416.5 kDa; 12.67 MDa / 30 ≈ 422.3 kDa), consistent with the known glycosylation of KLH used in immunological applications. The additional peaks visible in the CDMS spectrum (0.79, 1.52, 4.01, 7.52 MDa) correspond to sub-assemblies or intermediate oligomeric states not requested in this assignment.
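The oligomer assignments can be tabulated programmatically from the Table 1 subunit masses and the observed CDMS peaks. This is a sketch with illustrative names; the 4-decamer is left out because only an approximate observed range is available for it.

```python
def expected_mass_MDa(subunit_kDa, n_subunits):
    """Expected assembly mass in MDa from the polypeptide subunit mass."""
    return subunit_kDa * n_subunits / 1000.0

def deviation_pct(observed_MDa, expected_MDa):
    """Signed deviation of the observed mass from the expected mass, in percent."""
    return (observed_MDa - expected_MDa) / expected_MDa * 100.0

# (species, subunit mass in kDa, subunit count, observed CDMS mass in MDa)
species = [
    ("7FU Decamer", 340.0, 10, 3.40),
    ("8FU Didecamer", 400.0, 20, 8.33),
    ("8FU 3-Decamer", 400.0, 30, 12.67),
]

results = {}
for name, subunit, n, observed in species:
    expected = expected_mass_MDa(subunit, n)
    implied_subunit = observed * 1000.0 / n   # effective subunit mass in kDa
    results[name] = (expected, deviation_pct(observed, expected), implied_subunit)
    print(f"{name}: expected {expected:.2f} MDa, "
          f"deviation {results[name][1]:+.1f}%, implied subunit {implied_subunit:.1f} kDa")
```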
Waters Part V

Did I Make GFP?

                 | Theoretical                               | Observed (intact LC-MS)    | PPM mass error
Molecular weight | 28,006.60 Da (28.007 kDa), average mass   | 27,982.88 Da (27.983 kDa)  | 847 ppm
Conclusion: Yes — eGFP was successfully produced.

The observed MW of 27,982.88 Da is compared against the theoretical average mass (28,006.60 Da), giving 847 ppm error. For intact proteins at 28 kDa, the instrument detects the centroid of the isotope envelope (the average mass), not the monoisotopic peak, which is undetectable at this molecular weight. 847 ppm is within the expected performance range for intact LC-MS on the Waters Xevo G3 at this mass. The peptide map provides orthogonal confirmation: 88% experimental sequence coverage (Figure 6) with FEGDTLVNR identified at approximately 5 ppm, consistent with a correctly produced, pure eGFP standard.