Phase 4: Promoter-Terminator Pairing and Expression Simulation (Asimov Kernel)

Promoter–Terminator Pairing and Expression Design

Initial Design Strategy

I first assembled a promoter library containing 20 plant promoters with different reported expression strengths, together with a smaller library of seven plant terminators.

The initial strategy was to generate multiple promoter–terminator combinations for each CODH gene and then computationally simulate their predicted expression behavior using the Asimov Kernel platform. This simulation step was intended to help compare the different expression architectures before final construct selection.

The design process was based on several important principles:

  • Genes encoding larger or more functionally critical proteins should receive stronger promoters
  • Structural subunits should maintain relatively balanced stoichiometry
  • Strong promoters should generally be paired with stronger terminators
  • Construct size should remain compatible with cloning and synthesis workflows
  • Extremely high expression should be avoided when possible to reduce metabolic stress and instability risks

Using these principles, multiple candidate expression sets were generated for both the structural genes and the maturation genes.
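The rank-matching principle (stronger promoters paired with stronger terminators) can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used to build the sets: the function name pair_by_rank is hypothetical, the promoter strengths are the relative values reported for D100, SM, and FMV 34S in this section, and the terminator order (tOCS, then tHSP18.2, then tATPase) follows the ranking described in the set tables.

```python
def pair_by_rank(promoter_strengths, terminators_by_rank):
    """Pair each gene with a terminator so that the strongest promoter gets
    the highest-ranked terminator, the second strongest the next, and so on.

    promoter_strengths: dict mapping gene -> (promoter name, relative strength)
    terminators_by_rank: list of terminator names, strongest first
    """
    if len(promoter_strengths) > len(terminators_by_rank):
        raise ValueError("not enough terminators for the genes supplied")
    # Sort genes by relative promoter strength, strongest first.
    ranked = sorted(promoter_strengths.items(),
                    key=lambda item: item[1][1], reverse=True)
    return {gene: (promoter, terminator)
            for (gene, (promoter, _)), terminator
            in zip(ranked, terminators_by_rank)}


# Relative strengths as reported in this section (illustrative input only).
structural = {
    "coxL": ("D100", 2.2),
    "coxM": ("SM", 2.1),
    "coxS": ("FMV 34S", 2.0),
}
pairs = pair_by_rank(structural, ["tOCS", "tHSP18.2", "tATPase"])
for gene, (promoter, terminator) in pairs.items():
    print(gene, promoter, terminator)
```

With these inputs the sketch reproduces the stepwise pairing used for the structural genes in Set 2. It considers promoter strength only; construct size and metabolic burden, which are also listed among the design principles, are not modeled.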

Structural Gene Expression Sets

The structural construct contains the three genes directly forming the CODH enzyme complex:

  • coxL: the large catalytic subunit
  • coxM: the medium electron-transfer subunit
  • coxS: the small iron-sulfur subunit

Set 1 — High Balanced Expression (Primary Candidate)

Gene | Promoter | Relative Strength | Recommended Terminator | Reasoning
coxL | dPCisV | not stated | tOCS | Strongest terminator paired with the strongest promoter to maximize CoxL expression. CoxL is the largest catalytic subunit and requires the highest transcriptional support.
coxM | PNCR | not stated | tHSP18.2 | Second strongest terminator matched with a highly active promoter to maintain balanced expression relative to CoxL.
coxS | DaMVFLt4 | not stated | tATPase | High-performance terminator selected to provide expression levels comparable to coxM while preserving subunit stoichiometry.

This configuration was designed to maximize structural gene expression while maintaining relatively balanced production between the three subunits.

Set 2 — Medium-High Balanced Expression (Alternative)

Gene | Promoter | Relative Strength | Recommended Terminator | Reasoning
coxL | D100 | 2.2× | tOCS | Again, the strongest terminator was paired with the lead structural gene to maximize transcriptional output and support high CoxL accumulation.
coxM | SM | 2.1× | tHSP18.2 | Promoters and terminators with similar strengths were combined to maintain balanced intermediate expression levels.
coxS | FMV 34S (Sgt) | not stated | tATPase | The same stepwise promoter–terminator pairing strategy was maintained to preserve proportional expression among structural subunits.

This set provided a more moderate expression profile. Although weaker than Set 1, it was expected to reduce cellular burden and lower the risks associated with excessive transgene expression.

Set 3 — Very High Expression Configuration

Gene | Promoter | Relative Strength | Recommended Terminator | Reasoning
coxL | M24 | 10× | tOCS | M24 is an extremely strong promoter and therefore requires pairing with the strongest terminator to ensure efficient transcription termination and prevent premature transcript instability.
coxM | CPV 2 | Comparable to e35S | tHSP18.2 | tHSP18.2 was selected to support stable expression; however, CPV2 is substantially weaker than M24, creating a potential stoichiometric imbalance between CoxL and CoxM expression levels.
coxS | TobUbi.u4 | not stated | tATPase | A strong terminator was retained to match the high activity of the TobUbi.u4 promoter and maintain efficient expression of the coxS subunit.

This configuration aimed to maximize expression output. However, because of the extremely strong promoters involved, it also carried higher risks of stoichiometric imbalance, metabolic stress, transcriptional instability, and possible silencing effects.

Maturation Gene Expression Sets

The maturation construct contains four accessory genes involved in CODH assembly and activation:

  • coxD
  • coxE
  • coxF
  • coxG

Unlike the structural subunits, these proteins are not part of the final catalytic complex itself but are essential for proper enzyme maturation, sulfur insertion, and cofactor incorporation.

Special attention was given to coxD because it plays a central role in active-site maturation.

Set 4 — Balanced Maturation Expression

Gene | Promoter | Relative Strength | Recommended Terminator | Reasoning
coxD | PTSB1 | ~2.4× | tOCS | The strongest promoter in this maturation construct was paired with the strongest terminator because CoxD is the most critical maturation protein and should not become rate-limiting during enzyme assembly.
coxE | D100 | 2.2× | tHSP18.2 | The second strongest promoter was matched with a highly efficient terminator to maintain balanced and stable expression of the coxE maturation factor.
coxF | SM | 2.1× | T-35S | A moderately strong viral terminator was selected to support stable transcription while avoiding repeated use of the same terminator combinations across constructs.
coxG | FMV 34S (Sgt) | not stated | tATPase | The strong tATPase terminator was used to compensate for the relatively weaker promoter and maximize final transcript accumulation for coxG.

This set was designed to maintain balanced maturation-protein production while prioritizing coxD expression because of its importance in catalytic-site activation.

Set 5 — Lower Expression Configuration

Gene | Promoter | Relative Strength | Recommended Terminator | Reasoning
coxD | S100 | 1.8× | tOCS | Weaker promoters benefit the most from highly efficient terminators; therefore, tOCS was selected to compensate for the lower promoter strength and maximize transcript stability.
coxE | BM | 1.72× | tHSP18.2 | The same compensation strategy was applied by pairing a moderately weak promoter with a high-performance terminator to improve overall expression efficiency.
coxF | PPHYB | ~1.5× | tATPase | A robust terminator was retained to stabilize transcripts produced from the moderate-strength PPHYB promoter.
coxG | MSD3 | 1.15× | T-E9 | The T-E9 terminator was selected as a reliable transcriptional terminator to support expression from the weakest promoter within this construct set.

This configuration represented a weaker-expression alternative intended to minimize cellular burden and reduce possible stress associated with transgene overexpression.

The original plan for this phase was to computationally simulate all designed expression sets using the Asimov Kernel platform.

The objective of these simulations was to:

  • Predict relative expression behavior
  • Evaluate stoichiometric balance between genes
  • Identify potential bottlenecks in the pathway
  • Detect excessive or insufficient expression levels
  • Refine promoter–terminator combinations before DNA synthesis
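While simulation access is unavailable, two of the objectives above (stoichiometric balance and detection of excessive or insufficient expression) can be roughly screened from the reported promoter strengths alone. The sketch below is a crude stand-in and does not represent the Asimov Kernel platform or its interface: the function names balance_ratio and flag_outliers are hypothetical, the 1.2 to 2.5 window is an arbitrary illustrative threshold, and relative promoter strength is treated as a proxy for expression, which ignores terminator effects, mRNA stability, and translation.

```python
def balance_ratio(strengths):
    """Return max/min of the relative promoter strengths in one construct.

    A value of 1.0 means every gene is driven by an equally strong promoter;
    larger values indicate a wider spread between genes.
    """
    values = list(strengths.values())
    if not values or min(values) <= 0:
        raise ValueError("strengths must be non-empty and positive")
    return max(values) / min(values)


def flag_outliers(strengths, low, high):
    """List genes whose promoter strength falls outside [low, high]."""
    return sorted(g for g, s in strengths.items() if s < low or s > high)


# Relative promoter strengths as listed for Set 5 (the approximate "~1.5x"
# value for PPHYB is used as given).
set5 = {"coxD": 1.8, "coxE": 1.72, "coxF": 1.5, "coxG": 1.15}

ratio = balance_ratio(set5)
print(f"Set 5 max/min promoter ratio: {ratio:.2f}")
# Hypothetical acceptance window, chosen only for illustration.
print("Outside window:", flag_outliers(set5, low=1.2, high=2.5))
```

A screen of this kind can only rank candidate sets against one another; it cannot replace the pathway-level simulation originally planned.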

At the current stage of the project, access to the Asimov Kernel platform is still pending. To avoid delaying the workflow, provisional promoter–terminator combinations were selected manually based on promoter strength, expected biological balance, construct compactness, and cloning feasibility.

If access to Asimov Kernel becomes available later, the selected systems will still be computationally validated, and additional adjustments may be introduced if simulation results suggest improved expression architectures.

Final Selected Expression Systems

Final Structural Construct Selection

For the structural genes, Set 2 was selected as the final configuration.

Although Set 1 and Set 3 could potentially generate stronger expression, Set 2 was considered more biologically balanced and technically safer. The moderate promoter strengths reduce the likelihood of excessive chloroplast burden, instability, or transcriptional silencing while still maintaining relatively balanced subunit expression.

Final structural configuration:

Gene | Promoter | Terminator
coxL | D100 | tOCS
coxM | SM | tHSP18.2
coxS | FMV 34S | tATPase

Final Maturation Construct Selection

For the maturation genes, a modified version of Set 4 was selected.

Initially, the promoter PTSB1 was assigned to coxD because of its relatively strong expression profile. However, this promoter was approximately 1.5 kb long, which significantly increased construct size and cloning complexity.

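To maintain a more compact and synthesis-friendly construct, PTSB1 was replaced with D100 for coxD. Relative to Set 4, the final configuration also assigns SM to coxE and S100 to coxF and exchanges the coxF and coxG terminators, while preserving the overall balanced-expression strategy.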

Final maturation configuration:

Gene | Promoter | Terminator
coxD | D100 | tOCS
coxE | SM | tHSP18.2
coxF | S100 | tATPase
coxG | FMV 34S | T-35S

This final configuration aimed to preserve balanced maturation-gene expression while improving construct compactness and compatibility with downstream Gibson Assembly and DNA synthesis workflows.

Objective

After completing sequence collection, codon optimization, chloroplast transit peptide fusion, and cleavage site verification, the next objective was to design the regulatory architecture controlling expression of the seven CODH genes inside Nicotiana tabacum cells.

The CODH pathway is composed of multiple interacting structural and maturation proteins that must function together in a coordinated manner. Because of this, maintaining balanced expression between the genes is critical. Excessive or insufficient expression of specific subunits could negatively affect protein folding, complex assembly, chloroplast burden, and overall enzyme functionality.

Therefore, the main goal of this phase was to design a biologically balanced expression system by selecting suitable promoter–terminator combinations capable of driving efficient and coordinated expression of all seven CODH genes.

The initial plan for this phase was to:

  • Build multiple promoter–terminator combinations for each gene
  • Simulate their expression behavior using the Asimov Kernel platform
  • Compare predicted expression outputs
  • Select the most balanced and stable expression architecture for the final constructs

The final promoter–terminator combinations were selected based on relative promoter strengths, functional compatibility between regulatory elements, and expected expression balance across the CODH pathway. Terminator efficiency values were taken from the comparative plant expression data reported by Shakhova et al. (2022). The combined performance ratings (shown as stars) were estimated with an AI-based evaluation (Claude AI) that integrated promoter strength, terminator efficiency, and expected transcriptional balance; they are qualitative estimates rather than simulation results.

Gene | Promoter | Strength | Terminator | Combined Performance
coxL | D100 | 2.2× | tOCS | ★★★★
coxM | SM | 2.1× | tHSP18.2 | ★★★★
coxS | FMV 34S | 2.0× | tATPase | ★★★
coxD | D100 | 2.2× | tOCS | ★★★★
coxE | SM | 2.1× | tHSP18.2 | ★★★★
coxF | S100 | 1.8× | tATPase | ★★★
coxG | FMV 34S | 2.0× | T-35S | ★★★
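The final configuration can also be recorded as a small data structure and checked for repeated regulatory parts, since repeated sequences within one construct can complicate homology-based assembly methods such as Gibson Assembly. The sketch below is an assumption-laden illustration: the names FINAL_DESIGN, reused_within_construct, and shared_between_constructs are hypothetical, the grouping into "structural" and "maturation" constructs follows the construct descriptions in this section, and only part names are compared, not actual sequences.

```python
from collections import Counter

# Final configuration transcribed from the summary table above.
# gene -> (construct, promoter, terminator)
FINAL_DESIGN = {
    "coxL": ("structural", "D100", "tOCS"),
    "coxM": ("structural", "SM", "tHSP18.2"),
    "coxS": ("structural", "FMV 34S", "tATPase"),
    "coxD": ("maturation", "D100", "tOCS"),
    "coxE": ("maturation", "SM", "tHSP18.2"),
    "coxF": ("maturation", "S100", "tATPase"),
    "coxG": ("maturation", "FMV 34S", "T-35S"),
}


def reused_within_construct(design):
    """Return {construct: parts that appear more than once in that construct}."""
    counts = {}
    for construct, promoter, terminator in design.values():
        counts.setdefault(construct, Counter()).update([promoter, terminator])
    return {c: sorted(p for p, n in ctr.items() if n > 1)
            for c, ctr in counts.items()}


def shared_between_constructs(design):
    """Return the parts that are used in more than one construct."""
    used_in = {}
    for construct, promoter, terminator in design.values():
        for part in (promoter, terminator):
            used_in.setdefault(part, set()).add(construct)
    return sorted(p for p, constructs in used_in.items() if len(constructs) > 1)


print(reused_within_construct(FINAL_DESIGN))
print(shared_between_constructs(FINAL_DESIGN))
```

In this transcription no promoter or terminator is repeated within either construct, while several parts (for example D100 and tOCS) are shared between the two constructs. Whether that sharing matters depends on whether the constructs are assembled and delivered separately, which this check does not address.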