group-final-project

L-Protein Engineering | Option 1: Mutagenesis

☀️ Team Members

LIAO LITING      WANG YUXIN      ZHANG SIWEI

Important
Objective
"Engineering the MS2 Lysis Protein to enhance mutagenesis efficiency while balancing cellular viability—a significant challenge in modern synthetic biology."
1. Chaperone-independent lysis design
2. Rapid and efficient E. coli killing
3. Increased lysis protein yield

Fig 1. Electron micrograph of bacteriophages showing their characteristic morphology.

Note

📜 Project Background:

1. Prototype Lysis Systems:

  • MS2 Lysis Protein (L): A single-gene lysis system that triggers membrane fusion and cell wall degradation.
  • ϕX174 Lysis Protein (E): A classic model for chaperone-dependent lysis in E. coli.

These proteins serve as the biological foundation for our engineered modifications, providing the baseline for lysis efficiency and cellular impact.


Fig 2. Genome organization of ϕX174 and MS2 phages and similarities between their lysis proteins. The lysis genes of the two phages are shaded blue.

Important

Key Insights & Design Principles

  1. Functional Core: The C-terminal 25-30 residues of the L-protein are the functional heart of lysis, capable of dissipating the proton-motive force via hydrophilic pores (Goessens et al., 1988).

  2. Chaperone Evasion: Modifying the non-essential N-terminus allows the protein to evade DnaJ C-terminal sequestration, optimizing lysis independent of host chaperones (Chamakura et al., 2017).

  3. Critical Targeting: Bayer’s patches (membrane adhesion sites) are the decisive targets that determine the efficiency of the infection and lysis process (Chamakura et al., 2017).


Core Design Principle: "Nature evolves for survival stability; engineering designs for peak performance."

References

  • Goessens, W.H.F., et al. (1988). A synthetic peptide corresponding to the C-terminal 25 residues of phage MS2-coded lysis protein… EMBO J, 7:867–873.
  • Chamakura, K.R., et al. (2017). MS2 lysis of Escherichia coli depends on host chaperone DnaJ. Journal of Bacteriology, 199(12).
Tip

Technical Approach: Chaperone-Independent Lysis

1. Re-evaluating Chaperone Requirements

To engineer a superior lysis protein, we must first account for what the host chaperone DnaJ provides:

  • Proteostasis: Preventing non-specific aggregation of lysis proteins.
  • Kinetic Control: Establishing a precision “Lysis Timer”.
  • Spatial Navigation: Ensuring accurate targeting to Bayer’s Patches.
  • Conformational Modulation: Facilitating smooth transmembrane insertion.

2. Engineering Chaperone-Independence

Goal: Achieving Autonomy while Preserving Lytic Potency.

  • Internalization of Function: Converting external chaperone support into intrinsic protein functionality.
  • Strategic Trade-off: Precision balancing between lysis timing and viral burst size.

3. Core Design Principles

  • Stability: Augmenting protein conformational stability.
  • Latency: Expanding the kinetic latency buffer for optimized maturation.
  • Affinity: Fortifying site-specific binding to the cell envelope.

Conclusion: By implementing these designs, we achieve a tempered infection that optimizes the delicate balance between lysis timing and total viral burst size.

Insights from DnaJ External Support

A. Host-Derived Recruitment (The "External Support")

  • System: DnaJ is an endogenous E. coli protein from the Hsp40 family.
  • Strategy: Instead of encoding its own chaperones, the MS2 phage recruits the host's system to assist in L-protein folding.
  • Implication: This biological "dependency" creates a vulnerability that our engineering aims to internalize.

B. N-Terminal: The Evolutionary Sandbox

  • Character: The N-terminal domain is nonessential for core lysis function, granting it high evolutionary latitude.
  • Potential: It allows for the exploration of flexible linker lengths and dynamic charge distributions.
  • Design Goal: This is the optimal entry point for "Tempered Self-Evolution," enabling the protein to reach a functional equilibrium.
Fig 3. Lysis protein sequence alignment. GenBank accession numbers: MS2 (CAA23990.1), M12 (AAF19634.1), fr (CAA33137.1), GA (CAA27498.1), JP34 (AAA72211.1), KU1 (AAF67675.1), Hgal1 (YP_007237174.1), C1 (YP_007237128.1), AP205 (NP_085469.1), PP7 (NP_042306.1), PRR1 (YP_717670.1). The conserved LS motif (yellow) is essential for lytic function, preceded by a hydrophobic stretch (underlined) that facilitates membrane insertion. Highly basic N-termini (red) and acidic residues (blue) are strategically positioned to regulate electrostatic interactions.

🛠️ Implementation Process

Core Focus: Targeted Synergistic Effects

Overview: Redesign the fragile domains of the L-protein to bypass the DnaJ dependency. By creating a self-stabilizing and autonomously positioning structure, we aim to increase the protein's robustness and conformational stability.

 

STEP 1

Data Acquisition & Analysis

  • Gather multi-omic data: Lysis protein sequences & DNA structural motifs.
  • Identify conserved functional sites and critical domains.
  • Systematically review known mutational effects from global research databases.
🧬 Lysis Protein (MS2) UniProt: P03609
METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT
🧬 Host Chaperone: DnaJ (E. coli) UniProt: P08622
MAKQDYYEILGVSKTAEEREIRKAYKRLAMKYHPDRNQGDKEAEAKFKEIKEAYEVLTDSQKRAAYDQYGHAAFEQGGMGGGGFGGGADFSDIFGDVFGDIFGGGRGRQRAARGADLRYNMELTLEEAVRGVTKEIRIPTLEECDVCHGSGAKPGTQPQTCPTCHGSGQVQMRQGFFAVQQTCPHCQGRGTLIKDPCNKCHGHGRVERSKTLSVKIPAGVDTGDRIRLAGEGEAGEHGAPAGDLYVQVQVKQHPIFEREGNNLYCEVPINFAMAALGGEIEVPTLDGRVKLKVPGETQTGKLFRMRGKGVKSVRGGAQGDLLCRVVVETPVGLNERQKQLLQELQESFGGPTGEHNSPRSKSFFDGVKKFFDDLTR

### 💡 Known Mutational Effects

LS Dipeptide: The Leucine-Serine (LS) dipeptide of the MS2 L-protein is essential for lytic function. In the UniProt sequence above, the Leu-Ser pair that follows the hydrophobic stretch sits at positions 48 and 49.

Domain Organization

The L-protein is partitioned into four domains:

  • Domain 1 (N-terminus): Although highly basic, it is dispensable for the lysis function itself; it primarily mediates binding to the host chaperone DnaJ.
  • Domain 2 to Domain 4 (C-terminal half): This region, which contains the LS motif, constitutes the essential core for executing lysis.

Design Focus: The design centers on Domain 1, increasing its hydrophobicity via amino acid substitution to promote spontaneous folding and structural stability.
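One way to put a number on "increasing hydrophobicity" is the mean Kyte-Doolittle hydropathy of a segment. The sketch below is illustrative only: the choice of scale and the cut after residue 35 are assumptions made here, not part of the project's pipeline.

```python
# Kyte-Doolittle hydropathy values (Kyte & Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Wild-type MS2 L-protein (UniProt P03609), as listed above.
WT = "METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT"


def mean_hydropathy(seq):
    """Average Kyte-Doolittle value; higher means more hydrophobic."""
    return sum(KD[aa] for aa in seq) / len(seq)


# Assumed, illustrative split: residues 1-35 versus 36-75.
print(f"N-terminal part: {mean_hydropathy(WT[:35]):+.2f}")
print(f"C-terminal part: {mean_hydropathy(WT[35:]):+.2f}")
```

Comparing the same value before and after a substitution gives a quick, coarse check that a proposed change moves hydrophobicity in the intended direction.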



Fig.4 Schematic representation of the core structural domains of MS2 L-protein

 

STEP 2

### 💡 Select an approach to make sequence variants


Fig.5 & 6 Schematic of mutation site screening and selection process

Plan 1: Design Strategy


  1. K50L Mutation (LLR score: 2.56)
    • Effect: Strengthens hydrophobic anchoring at the Domain 2/4 interface.
  2. Y39L Mutation (Domain 1; LLR score: 2.24)
    • Effect: Replaces Tyrosine (Y) with Leucine (L) to create a robust transmembrane helix.

Rationale: The goal is to enhance structural stability via increased hydrophobicity and better interface anchoring.
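The write-up does not say which model produced the LLR scores. Assuming they follow the usual log-likelihood-ratio definition from protein language models (natural log of the mutant residue's probability minus that of the wild-type residue at the same position), a minimal sketch looks like this. The probabilities are placeholders, not model output.

```python
import math


def llr(p_mutant, p_wild_type):
    """Log-likelihood ratio of a substitution (natural log, by assumption)."""
    return math.log(p_mutant) - math.log(p_wild_type)


# Placeholder probabilities for illustration only.
print(round(llr(0.30, 0.05), 2))

# Under this definition, a score s means the model rates the mutant
# residue exp(s) times more likely than the wild-type residue.
print(round(math.exp(2.56), 1))
```

A positive score then means the model prefers the substituted residue in that sequence context; it is a plausibility signal, not a measurement of lysis activity.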
Note

Mutated Sequence: Plan 1 (Y39L & K50L)

METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLLVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT


  • Position 39 (Y→L): LLR Score 2.24 | Enhances TM helix robustness.
  • Position 50 (K→L): LLR Score 2.56 | Strengthens hydrophobic anchoring.
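To guard against transcription slips when writing out variants by hand, the substitutions can be applied programmatically to the wild-type sequence. This is a small sketch; `apply_mutations` is a helper name introduced here for illustration.

```python
import re

# Wild-type MS2 L-protein (UniProt P03609).
WT = "METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT"


def apply_mutations(seq, mutations):
    """Apply point mutations written like 'Y39L' (1-based positions)."""
    residues = list(seq)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"Cannot parse mutation {mutation!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        # Refuse to mutate if the stated wild-type residue is not there.
        if not 1 <= pos <= len(seq) or seq[pos - 1] != old:
            raise ValueError(f"{mutation}: residue {pos} of the input is not {old}")
        residues[pos - 1] = new
    return "".join(residues)


plan_1 = apply_mutations(WT, ["Y39L", "K50L"])
print(plan_1)
```

The check on the original residue catches off-by-one numbering errors before a sequence is sent for synthesis.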

Plan 1.5 | Design Action: Charge Enhancement

Design Action: Based on LLR scores, the C29R (Cys-to-Arg) mutation yields a high score of 2.39.

Mechanism: Increasing the positive charge enhances the protein’s autonomous attraction to the negatively charged cell membrane, thereby reducing its functional dependency on DnaJ escorting.

METRFPQQSQQTPASTNRRRPFKHEDYPRRRQQRSSTLLVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT


  • C29R Mutation (C-to-R at position 29, adjacent to R30 and R31): LLR Score 2.39 | Boosts electrostatic attraction.
  • Functional Impact: Bypasses DnaJ dependency for more autonomous membrane targeting.
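The charge argument can be checked with a simple residue count. The sketch below uses a deliberately crude model, which is an assumption of this example: Lys and Arg count as +1, Asp and Glu as -1, and histidine and the chain termini are ignored.

```python
PLAN_1 = "METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLLVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT"
PLAN_1_5 = "METRFPQQSQQTPASTNRRRPFKHEDYPRRRQQRSSTLLVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT"


def net_charge(seq):
    """Crude net charge: (K + R) minus (D + E); His and termini ignored."""
    positive = seq.count("K") + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    return positive - negative


print(net_charge(PLAN_1))    # 5
print(net_charge(PLAN_1_5))  # 6
```

Note that K50L in Plan 1 removes one positive charge, so the extra arginine in Plan 1.5 restores the count rather than adding to the wild-type total.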

Plan 2 | Design Action: Structural Rigidity Reinforcement

Design Action: Based on LLR scores, the S-to-Q mutation yields a high score of 2.39.

Mechanism: Increasing the rigidity of Domain 1 facilitates its autonomous folding into a helical state, optimizing its structural readiness for membrane insertion.

METRFPQQQQQQTPASTNRRRPFKHEDYPRRRQQRSSTLLVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT


  • S-to-Q Mutation (Pos 8-12 area): LLR Score 2.39 | Enhances alpha-helical propensity.
  • Functional Impact: Promotes spontaneous folding of Domain 1, ensuring structural stability before membrane interaction.

 

STEP 3

Mutated Simulation: Plan 1 (Y39L & K50L)

Fig. 7-9 Mutation simulation of Plan 1: Enhancing hydrophobic anchoring (K50L & Y39L)

Mutated Simulation: Plan 1.5 (Charge Enhancement)

Fig. 10 Surface electrostatic potential simulation after C-to-R mutation

Fig. 11 Interaction analysis between the positive charge cluster and lipid bilayer

Fig. 12 Stability and binding energy evaluation for Plan 1.5 design

Mutated Simulation: Plan 2 (Structural Rigidity Reinforcement)

Fig. 13 Structural comparison of Domain 1: Wild-type vs. S-to-Q mutated rigid state

Fig. 14 Helix propensity analysis and autonomous folding simulation

Fig. 15 Energy landscape of Plan 2 design during membrane transition

STEP 4 Submit 5 mutated sequences

🛠️ Implementation Process

Core Focus: Integrated Strategy

Note

Design Logic: Structural Stability

🔹 Structural Stability: Intramolecular Salt-Bridge Lock
🔹 Rationale: If both the N- and C-termini of the L protein are highly enriched in positive charges, they repel each other electrostatically. Without an anionic (negatively charged) chaperone such as DnaJ to bridge them, this "dual-cationic" structure behaves like a tensed spring: it is highly unstable and prone to non-specific aggregation.

Fig. 16 Integrated Strategy for Lysis Protein Analysis. (c) Multiple sequence alignment of the lysis proteins. GenBank accession numbers: MS2 (CAA23990.1), M12 (AAF19634.1), fr (CAA33137.1), GA (CAA27498.1), JP34 (AAA72211.1), KU1 (AAF67675.1), Hgal1 (YP_007237174.1), C1 (YP_007237128.1), AP205 (NP_085469.1), PP7 (NP_042306.1), and PRR1 (YP_717670.1).


  • Sequence Motifs: The conserved LS motif is highlighted in yellow, preceded by a stretch of hydrophobic residues (underlined) and highly basic N-termini.
  • Amino Acid Properties: Basic and acidic residues are highlighted in red and blue, respectively.
  • Mutagenesis Analysis:
    • Green asterisks (*) indicate all codon positions where a nonsense mutation is accessible by a single nucleotide change.
    • Underlined asterisks (*) indicate positions where no nonsense mutants were obtained in the experimental mutagenesis.
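The idea behind the asterisks, codons that sit one nucleotide change away from a stop codon, can be expressed in a few lines. This is a sketch using the three standard stop codons; the function names and the example coding sequence are made up for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def one_step_to_stop(codon):
    """True if a single-nucleotide substitution turns this codon into a stop."""
    codon = codon.upper()
    for i in range(3):
        for base in "ACGT":
            if base != codon[i] and codon[:i] + base + codon[i + 1:] in STOP_CODONS:
                return True
    return False


def nonsense_accessible_codons(cds):
    """1-based codon indices where one substitution yields a stop codon."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    return [n for n, codon in enumerate(codons, start=1) if one_step_to_stop(codon)]


# Hypothetical four-codon example: ATG CAG CTG CGT
print(nonsense_accessible_codons("ATGCAGCTGCGT"))  # [2]
```

Choosing synonymous codons that are not one step from a stop (for example CTG instead of TTA for leucine) is one way to lower the chance of accidental truncation.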

Protein Data Card: Lysis Protein (MS2)

1. Basic Information

  • UniProt ID: P03609 (LYS_BPMS2)
  • Full Name: Lysis protein
  • Organism: Escherichia phage MS2 (OX=12022)
  • Evidence: PE=2 (Evidence at protein level)
  • Version: SV=1

2. Sequence Analysis

N-terminal Start: M (Methionine)
C-terminal End: T (Threonine)

Full Sequence Segment:

         10          20          30
METRF PQQSQ QTPAS TNRRR PFKHE DYPCR RQQRS (N-terminal, residues 1-35)

   40          50          60          70
STLYV LIFLA IFLSK FTNQL LLSLL EAVIR TVTTL QQLLT (C-terminal, residues 36-75)

Highlighted segment: LYV LIFLA IFLSK FTNQL LLSLL (residues 38-60)

Important

DESIGN OPTIMIZATION & STABILITY RATIONALE

The design strictly follows the principles of in-situ salt-bridge locking to stabilize the autoinhibitory state, while carefully optimizing codon usage to avoid the introduction of nonsense mutations that would truncate the L-protein.

Engineered Sequence 1

Rationale: First, the electrostatic repulsion is converted into intramolecular attraction, where the N- and C-termini form multiple E-R or D-K pairs that are spatially proximal, establishing stable multivalent salt bridges.

Full Sequence Segment:

         10          20          30
          |           |           |
METRF PQQSQ QTPAS TNRRR PFKHE DYPCR RQQRS (N-terminal)
[------- N-terminal: Cationic/Basic Region -----------]

   40          50          60          70
    |           |           |           |
GGSGG SGEDD ELYVL IFLAI FLSKF TNQLL LSLLR RRW (C-terminal)
[ Linker ] [--- C-terminal: Anionic & Hydrophobic ---]

Fig. 17 AlphaFold 3 and ColabFold-Based Structural Analysis of De Novo Designed MS2 Lysis Protein. (a) High-confidence structural model predicted by AlphaFold 3, highlighting the optimized helical regions. (b) Comparative alignment using ColabFold, demonstrating the consistency of the intramolecular salt-bridge formation between the N- and C-termini.

Preventing Premature Lysis: Proline Switch / pH-Responsive Histidine Switch

Rationale: The local pH at Bayer's patches (5.5–6.0) is lower than the normal physiological pH (7.2–7.4), which shifts toward acidity during active infection.

Important
CRITICAL DESIGN: TRIGGERED RELEASE & MEMBRANE KINETICS
Next, histidine and proline switches are incorporated into the N-terminus so that the "salt-bridge lock" is released specifically at the Bayer's patches, thereby optimizing the viral burst size. Enhancing the N-terminal hydrophobic masking is intended to improve the solubility of the L protein and delay its insertion into the host membrane.
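The logic of a histidine pH switch follows from the Henderson-Hasselbalch relation: the fraction of protonated (positively charged) side chains rises as pH falls toward the pKa. The sketch below assumes a side-chain pKa of 6.0, a typical textbook value; the real pKa of a histidine inside a folded protein depends on its environment.

```python
def fraction_protonated(ph, pka=6.0):
    """Fraction of histidine side chains carrying a proton at a given pH.

    pka=6.0 is an assumed textbook value, not a measured one for this design.
    """
    return 1.0 / (1.0 + 10 ** (ph - pka))


# pH values taken from the ranges quoted in the rationale above.
for ph in (5.5, 6.0, 7.2, 7.4):
    print(f"pH {ph}: {fraction_protonated(ph):.2f} protonated")
```

Under this assumption, histidine is about half protonated at pH 6.0 and only a few percent protonated at pH 7.4, which is the contrast the switch relies on.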

Engineered Sequence 2

Full Sequence Segment:

         10          20          30
          |           |           |
METRF PQQSQ QTPAS TNRRR PFKHE DYPCR RQQRS (N-terminal)
[------- N-terminal: Cationic/Basic Region -----------]

   40          50          60          70
    |           |           |           |
GGSGG SGHPH EDDEL YVLIF LAIFL SKFTN QLLLS LLRRR W (C-terminal)
        ^^^ HPH (residues 43-45)
[ Linker ] [--- C-terminal: Anionic & Hydrophobic ---]
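Both engineered sequences can be rebuilt from their parts, which makes it easy to confirm that the hydrophobic core carrying the LS motif is carried over from the wild type unchanged. The segment names below (`ACIDIC`, `CORE`, `TAIL`, `SWITCH`) are labels chosen for this sketch; only "Linker" appears in the annotation above.

```python
WT = "METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT"

N_TERM = "METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRS"  # wild-type residues 1-35
LINKER = "GGSGGSG"
SWITCH = "HPH"                       # present only in Engineered Sequence 2
ACIDIC = "EDDE"
CORE = "LYVLIFLAIFLSKFTNQLLLSLL"     # wild-type residues 38-60
TAIL = "RRRW"

seq_1 = N_TERM + LINKER + ACIDIC + CORE + TAIL
seq_2 = N_TERM + LINKER + SWITCH + ACIDIC + CORE + TAIL

for name, seq in (("Engineered 1", seq_1), ("Engineered 2", seq_2)):
    # Report length and where the retained core starts (1-based).
    print(name, len(seq), "aa; core starts at residue", seq.index(CORE) + 1)

# The core must be an exact substring of the wild type.
print("core found in wild type at residue", WT.index(CORE) + 1)
```

Building variants from named segments also makes later edits (for example a different linker length) a one-line change.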

Stage 2: Synthesize the L-protein mutant gene via Twist

Full Sequence Segment:

        10         20         30         40         50
         |          |          |          |          |
CTCGAGGGTA CCACCGGTGA GTCCCATGGC ATATGGGGCC CGTGCACGGC (Row 1)
GCGCCGCTAG CGCGGCCGCG GTACCATGCA TCCTAGGGGA TCCGAAGACA (Row 2)
GATCTTTAAT TAACTCGAGG GGCCCCACGT CGGTCTCCGT CTCATCGATT (Row 3)
TCGAAATCGA TCGGCCGGAG CTCGAATTCG ATATCCGTCT CAAGCTTGTT (Row 4)
AACGGTACCA CGCGTCTGCA GCGATCGCAG CTGGAGCTCC CGCGGGTCGA (Row 5)
CCCCGGGTAC GTAACTAGTG CATGCCTCGA GCCCGGGTCT AGACTCGAGC (Row 6)
CCGGGTAACT CGAG                                        (Row 7)
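Before a sequence is submitted for synthesis, a quick length and GC-content check is a common sanity step. The sketch below parses a row-formatted block like the one above. The helper names are made up for this example, and no vendor-specific limits are assumed.

```python
import re


def clean_dna(block):
    """Strip row labels such as '(Row 1)', spaces and anything that is not A/C/G/T."""
    block = re.sub(r"\([^)]*\)", "", block)
    return re.sub(r"[^ACGTacgt]", "", block).upper()


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


row_1 = "CTCGAGGGTA CCACCGGTGA GTCCCATGGC ATATGGGGCC CGTGCACGGC (Row 1)"
seq = clean_dna(row_1)
print(len(seq), f"{gc_fraction(seq):.2f}")
```

Passing the full multi-row block to `clean_dna` works the same way, since the labels and whitespace are removed before counting.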

Stage 3: Clone the L-protein mutant gene into a plasmid using Gibson

🧬 Stage 3: Gibson Assembly for L-Protein Cloning

Objective: A highly efficient, seamless cloning method to insert the optimized L-protein sequence into a linearized expression vector.


  • Vector Preparation (vector linearization)

    • Linearize your target plasmid (e.g., pET or pBAD) via Restriction Digestion (using high-fidelity enzymes) or Inverse PCR.
    • Note: Ensure the linearized vector is purified to remove any residual circular template.
  • Insert Preparation

    • Perform PCR on your optimized L-protein sequence.
    • Key Requirement: Use primers designed to add 20–40 bp overlapping arms identical to the ends of your linearized plasmid.
  • The Master Mix Reaction (one-pot reaction)

    • Mix the vector and insert (molar ratio ~1:2) with the Gibson Assembly Master Mix.
    • Reaction Condition: Incubate at 50°C for 15–60 minutes.

| Enzyme | Action | Description |
|---|---|---|
| Exonuclease | Chew-back | Removes nucleotides from the 5’ ends, creating single-stranded 3’ overlaps. |
| DNA Polymerase | Gap-fill | Incorporates nucleotides after the overlapping strands anneal. |
| DNA Ligase | Nick-seal | Covalently joins the DNA fragments into a circular, double-stranded plasmid. |

  • Transformation
    • Transform the assembled DNA into competent E. coli cells (e.g., DH5α or BL21).
    • Process: 1. Heat shock (42°C) or electroporation; 2. Recovery in SOC/LB medium; 3. Plating on selective agar.
    • Outcome: Replicates the assembled plasmid for subsequent sequence verification.
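The ~1:2 vector-to-insert molar ratio translates into very different masses when the insert is much shorter than the vector. The sketch below shows the arithmetic, using the common approximation of 650 g/mol per base pair of double-stranded DNA. The fragment sizes and the 100 ng vector amount are hypothetical.

```python
def pmol_dsdna(mass_ng, length_bp):
    """Approximate pmol of double-stranded DNA (about 650 g/mol per bp)."""
    return mass_ng * 1000 / (length_bp * 650)


def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector=2.0):
    """Insert mass (ng) needed for a given insert:vector molar ratio."""
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector


# Hypothetical numbers: 100 ng of a 5000 bp vector and a 250 bp insert.
needed = insert_mass_ng(100, 5000, 250)
print(f"{needed:.1f} ng insert")
print(f"{pmol_dsdna(100, 5000):.3f} pmol vector, {pmol_dsdna(needed, 250):.3f} pmol insert")
```

Because the insert here is 20 times shorter than the vector, only about a tenth of the vector mass is needed to reach a twofold molar excess.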


Stage 4: Test the L-protein mutant’s structural integrity using the Nuclera system

🧪 Stage 4: Cell-Free Synthesis & Quality Control

System: Nuclera eProtein Desktop Platform / Microfluidic Integration

  • DNA Input (substrate loading)

    • Load the constructed plasmid (from Stage 3) or the high-purity PCR-amplified linear DNA into the Nuclera microfluidic chip.
    • Requirement: Ensure DNA concentration meets the chip’s specified detection range.
  • Microscale Synthesis

    • The system executes automated coupled transcription and translation (TX-TL).
    • Synthesis occurs within discrete microdroplets on-chip, enabling rapid production of the L-protein mutant in a cell-free environment.
  • Integrity Check

    • Utilizes integrated biosensors for real-time monitoring.
    • Focus: Confirms the protein’s biophysical state, specifically targeting a monomeric and soluble profile to avoid aggregation or misfolding.

Stage 5: Test the L-protein in E. coli with plaque assays

🧫 Stage 5: Functional Validation via Plaque Assays

Objective: Evaluate the lysis activity of the L-protein mutant in vivo using E. coli host systems.

  • Induction of Expression

    • Culture the transformed E. coli cells until they reach mid-log phase ($OD_{600} \approx 0.4 - 0.6$).
    • Trigger protein synthesis using a specific inducer:
      • L-arabinose (for pBAD vectors) or IPTG (for pET vectors).
    • Note: Maintain optimal temperature to balance protein folding and expression levels.
  • Plaque Formation

    • Employ the Double-layer Agar Technique.
    • Mechanism: Functional L-protein triggers localized host cell lysis.
    • Observation: Formation of visible, clear circular zones (Plaques) within the bacterial lawn.
  • Efficiency Analysis

    • PFU Calculation: Quantify lysis efficiency by calculating Plaque Forming Units (PFU/mL).
    • Phenotypic Mapping: Measure plaque diameters to assess:
      1. The stability provided by the Salt-bridge Lock.
      2. The impact of enhanced Bayer’s Patch binding affinity on lysis kinetics.
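The PFU calculation is the standard titer formula: plaque count divided by the product of the dilution factor and the volume plated. A minimal sketch, with made-up plate counts:

```python
def pfu_per_ml(plaque_count, dilution, volume_plated_ml):
    """Titer in PFU/mL from one plate.

    dilution is the fraction of the original stock on the plate
    (e.g. 1e-6 for a 10^-6 dilution).
    """
    return plaque_count / (dilution * volume_plated_ml)


# Hypothetical plate: 50 plaques from 0.1 mL of a 10^-6 dilution.
print(f"{pfu_per_ml(50, 1e-6, 0.1):.2e} PFU/mL")
```

Averaging the titer across replicate plates within a countable range gives a more reliable estimate than any single plate.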