Week 7 — Genetic Circuits Part II: Neuromorphic Circuits

Assignment Part 1: Intracellular Artificial Neural Networks (IANNs)

Question 1: Advantages of IANNs over traditional Boolean genetic circuits

Traditional genetic circuits compute Boolean functions — AND, OR, NAND, NOR — where each input is treated as fully on or fully off, and the output is discrete. This binary logic imposes a hard constraint: the circuit cannot distinguish how much of a signal is present, only whether it is present. IANNs overcome this and several related limitations.

Continuous, analog computation. IANNs integrate inputs as graded signals rather than binary thresholds. Promoter activity, ribosome occupancy, and protein concentration all vary continuously inside a cell; IANNs exploit this natural analog dynamic range instead of discarding it. A Boolean AND gate fires only when both inputs are fully active; an IANN perceptron node produces a graded output that reflects the combined intensity of all inputs simultaneously.
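The contrast can be made concrete with a minimal Python sketch. The weights, bias, and steepness below are illustrative assumptions rather than measured circuit parameters; the point is only that the Boolean gate returns the same answer for moderately strong and fully saturated inputs, while the perceptron node does not.

```python
import math


def boolean_and(x1: float, x2: float, threshold: float = 0.5) -> int:
    """Digital gate: each input is binarised at a threshold, then ANDed."""
    return int(x1 > threshold and x2 > threshold)


def perceptron_node(x1: float, x2: float,
                    w1: float = 1.0, w2: float = 1.0,
                    bias: float = -1.0, steepness: float = 4.0) -> float:
    """Analog node: weighted sum passed through a sigmoid.

    All parameter values are hypothetical, chosen for illustration only.
    """
    z = w1 * x1 + w2 * x2 + bias
    return 1.0 / (1.0 + math.exp(-steepness * z))


# Inputs are normalised signal intensities between 0 and 1.
for x1, x2 in [(0.2, 0.2), (0.6, 0.6), (0.6, 0.9), (1.0, 1.0)]:
    print(x1, x2, boolean_and(x1, x2), round(perceptron_node(x1, x2), 3))
```

The last three input pairs all give 1 from the Boolean gate, whereas the perceptron output rises with the combined input intensity.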

Richer input-output mappings without rewiring. A Boolean circuit implements one fixed truth table. Changing its input-output relationship requires redesigning the topology — new parts, new combinatorial logic, new construction. An IANN can implement any linearly separable function (single-layer) or, with multiple layers, approximate any continuous function (universal approximation theorem). Changing the response profile requires changing weights, not the circuit architecture.

Scalability. For $n$ binary inputs, there are $2^{2^n}$ possible Boolean functions to consider, and implementing each requires a distinct circuit. An IANN scales to many inputs through a single weighted sum — adding a new input dimension means adding a new weight, not redesigning the whole network.
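As a quick check on the numbers, the short Python sketch below compares the size of the Boolean function space with the number of tunable parameters in a single perceptron node. Counting one weight per input plus one bias/threshold term is my assumption about how the node is parameterised.

```python
def count_boolean_functions(n_inputs: int) -> int:
    """Number of distinct truth tables over n binary inputs: 2^(2^n)."""
    return 2 ** (2 ** n_inputs)


def count_perceptron_parameters(n_inputs: int) -> int:
    """One weight per input plus one bias/threshold term (assumed parameterisation)."""
    return n_inputs + 1


for n in range(1, 6):
    print(n, count_boolean_functions(n), count_perceptron_parameters(n))
```

The function space grows double-exponentially with the number of inputs, while the perceptron parameter count grows linearly. A single node still covers only the linearly separable subset of those functions.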

Noise tolerance. Boolean circuits are brittle: stochastic fluctuations in molecule copy number can flip a node from 0 to 1 unpredictably near the threshold. An IANN’s continuous, sigmoidal activation function responds gradually, so a small fluctuation in an input produces a small change in output rather than a full state flip, and summing several weighted inputs partially averages out independent noise.

Adaptability. In principle, weights can be adjusted — through directed evolution of regulatory sequences, operator binding affinities, or ribosome binding site strengths — allowing the IANN to be “retrained” to a new classification boundary without changing circuit topology.


Question 2: A Useful Application for an IANN

Application: Intracellular gallic acid dosimeter in Aspergillus niger for tannase bioprocess optimisation

Motivation

In fermentative production of tannase and gallic acid from tannic acid-rich agricultural waste (grape pomace, pomegranate peel, tea dust), the relationship between tannase expression level and gallic acid yield is not linear. At low tannase activity, substrate conversion is incomplete; at very high expression levels, intracellular resource burden suppresses growth and secondary metabolic flux. The optimum lies at an intermediate tannase expression level that maximises gallic acid titre without imposing metabolic cost. A Boolean circuit cannot detect this optimum: it can only signal “gallic acid present” or “gallic acid absent.” An IANN can be designed to integrate multiple metabolic signals and output a graded response that reports whether the cell is operating in the productive window.

IANN Design

The intracellular single-layer perceptron takes three weighted inputs:

  • X₁ — expression level of tanA (tannase), encoded as the output of a PthiA-driven transcriptional unit; signal strength is proportional to tanA mRNA abundance, itself tunable by thiamine titration.
  • X₂ — intracellular gallic acid concentration, sensed by the PobR transcription factor (which activates PpobA-driven reporters in the presence of gallic acid / protocatechuate); PobR occupancy of its operator scales continuously with gallic acid concentration, producing a graded signal.
  • X₃ — a proxy for metabolic burden, encoded as the inverse of growth rate (e.g., a constitutively expressed reporter whose dilution rate by growth is measurable).

Each signal drives expression of a distinct transcription factor at a level proportional to input intensity. Each transcription factor drives a shared output promoter containing operator sites with tunable affinities (weights). The weighted sum is passed through an ultrasensitive Hill-function promoter (high cooperativity, n ≥ 3) serving as the sigmoid activation function. The output is mNeonGreen fluorescence, measurable by plate reader or flow cytometry.

Input/Output Behaviour

When tanA expression is low (X₁ weak), gallic acid is low (X₂ weak), and burden is low (X₃ weak) → weighted sum sub-threshold → low mNeonGreen signal: process is under-performing.

When tanA expression is high and gallic acid is accumulating at productive rates (X₁ and X₂ both strong), but burden is manageable (X₃ moderate) → weighted sum crosses threshold → high mNeonGreen: cell is in the target operating window.

When burden signal X₃ is very strong (growth collapsed) → X₃ weight drives the sum above a second, higher threshold → a secondary reporter output (e.g., a chromoprotein) signals over-expression toxicity.

The weights can be tuned by adjusting operator copy number and binding affinity so that the “productive window” corresponds precisely to the gallic acid titre range of industrial interest.
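The three regimes above can be sketched numerically. This is a minimal Python model, not a fitted one: the weights, the two thresholds, and the Hill coefficient are hypothetical values chosen so that the behaviour matches the description, and the toxicity reporter is modelled as a second output node reading the same weighted sum with a higher threshold.

```python
def hill(s: float, k: float, n: float) -> float:
    """Hill activation s^n / (k^n + s^n) for a non-negative signal s."""
    return s ** n / (k ** n + s ** n)


# Hypothetical parameters (assumptions for illustration only).
WEIGHTS = (0.4, 0.4, 1.0)   # w1 (tanA), w2 (gallic acid), w3 (burden)
K_PRODUCTIVE = 0.8          # threshold of the mNeonGreen output node
K_TOXIC = 1.5               # higher threshold of the secondary reporter
HILL_N = 4                  # cooperativity, n >= 3 as in the design


def iann_outputs(x1: float, x2: float, x3: float) -> tuple[float, float]:
    """Return (mNeonGreen level, secondary reporter level), each in 0..1."""
    s = sum(w * x for w, x in zip(WEIGHTS, (x1, x2, x3)))
    return hill(s, K_PRODUCTIVE, HILL_N), hill(s, K_TOXIC, HILL_N)


scenarios = {
    "under-performing": (0.1, 0.1, 0.1),
    "productive window": (0.9, 0.9, 0.4),
    "burden collapse": (0.9, 0.9, 1.0),
}
for name, inputs in scenarios.items():
    green, toxic = iann_outputs(*inputs)
    print(f"{name}: mNeonGreen={green:.2f}, secondary={toxic:.2f}")
```

With these assumed values the first scenario leaves both outputs low, the second raises mNeonGreen while the secondary reporter stays below half-maximal, and the third pushes the secondary reporter above half-maximal as well.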

Limitations

  • Weight encoding precision. Encoding synaptic weights as operator binding affinities or RBS strengths is imprecise; small sequence changes cause non-linear affinity shifts, making fine-grained weight tuning difficult without extensive screening.
  • Transcriptional resource competition. Running three weighted input arms simultaneously in a single fungal cell imposes RNAP and ribosome burden, which could itself perturb the metabolic signals being measured — a confounding feedback.
  • No in vivo retraining. Adjusting weights to reflect a new optimum requires strain re-engineering; true online learning would require a directed evolution loop inside the fermenter, which is currently not feasible safely.
  • Fungal IANN parts scarcity. Characterised, orthogonal transcription factors with graded, tunable activation curves are far less abundant in A. niger than in E. coli, limiting the number of weighted input arms that can be composed without crosstalk.

Question 3 — Diagram: Intracellular Multilayer Perceptron

The diagram below shows an intracellular two-layer perceptron where Layer 1 outputs an endoribonuclease (Csy4) that post-transcriptionally regulates Layer 2, whose output is a fluorescent protein.

Intracellular multilayer perceptron. Layer 1 integrates two DNA inputs (X1, X2) at the transcriptional level, producing Csy4 endoribonuclease as its output. Layer 2 receives the Csy4 output: Csy4 cleaves a hairpin in the mRNA of the fluorescent protein reporter, stabilising translation. The fluorescent protein (FP) is the final network output. Transcription (Tx) and translation (Tl) steps are labelled at each layer.


Layer 1 — Integration node:

  • Input X₁: DNA encoding a transcription factor (TF₁), driven by an inducible promoter (e.g., aTc-inducible Ptet)
  • Input X₂: DNA encoding a second transcription factor (TF₂), driven by a second inducible promoter (e.g., IPTG-inducible Plac)
  • TF₁ and TF₂ both bind the hybrid promoter P(hybrid) upstream of the Csy4 endoribonuclease gene
  • P(hybrid) acts as an AND-gate-like integrator: significant Csy4 transcription requires both TF₁ and TF₂ input signals
  • Csy4 mRNA is translated into Csy4 protein → this is the Layer 1 output

Layer 2 — Output node:

  • Fluorescent protein (FP) gene is constitutively transcribed, but its mRNA contains a Csy4 recognition hairpin in its 5’ UTR that sequesters the ribosome binding site
  • In the absence of Csy4: hairpin blocks translation → FP is not produced
  • When Csy4 is present (from Layer 1 output): Csy4 cleaves the hairpin → RBS is accessible → FP is translated → fluorescence output

Network behaviour:

  • No inputs → no TFs → no Csy4 → FP mRNA blocked → no fluorescence (output = 0)
  • Only X₁ → TF₁ only → insufficient P(hybrid) activation → weak/no Csy4 → low FP (output ≈ 0)
  • Only X₂ → TF₂ only → insufficient P(hybrid) activation → weak/no Csy4 → low FP (output ≈ 0)
  • Both X₁ and X₂ → strong P(hybrid) activation → Csy4 produced → hairpin cleaved → FP translated → fluorescence (output = 1)

This is a biologically implemented two-layer perceptron performing AND-gate-like multilayer computation.
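The truth-table behaviour can be reproduced with a small two-layer model in Python. This is a sketch under stated assumptions: Layer 1 is modelled as a Hill function of the summed TF inputs with a threshold set above the single-input level, Layer 2 as a Hill function of Csy4 level, and every constant is hypothetical.

```python
def hill(s: float, k: float, n: float) -> float:
    """Hill activation s^n / (k^n + s^n) for a non-negative signal s."""
    return s ** n / (k ** n + s ** n)


def layer1_csy4(x1: float, x2: float, k: float = 1.5, n: float = 4) -> float:
    """Hybrid promoter: threshold k sits above a single fully-on input (1.0)."""
    return hill(x1 + x2, k, n)


def layer2_fp(csy4: float, k: float = 0.5, n: float = 2) -> float:
    """Fraction of reporter mRNA translated as a function of Csy4 level."""
    return hill(csy4, k, n)


def network(x1: float, x2: float) -> float:
    """Two-layer perceptron: inputs -> Csy4 -> fluorescent protein."""
    return layer2_fp(layer1_csy4(x1, x2))


for x1, x2 in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(x1, x2, round(network(x1, x2), 3))
```

Only the (1, 1) row gives a high output with these values; the single-input rows stay low because the second layer further suppresses the weak Csy4 signal from Layer 1.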


Assignment Part 2: Fungal Materials

Question 1: Examples, Uses, Advantages and Disadvantages

Fungal (mycelium-based) materials are grown from the vegetative hyphal networks of fungi — commonly Ganoderma, Pleurotus, or Trametes species — colonising lignocellulosic agricultural waste as the growth substrate. Several commercial and experimental material categories have emerged.

Mycelium composite packaging foam (e.g., Ecovative Mushroom Packaging, used commercially by Dell and IKEA) replaces expanded polystyrene (EPS). Compressed mycelium colonising hemp shiv or corn husks self-bonds into rigid, low-density foam over several days. Advantages: fully home-compostable within weeks, fire-resistant without chemical additives, thermally and acoustically insulating, and carbon-negative to produce (no synthetic polymer inputs, grown at ambient temperature on waste). Disadvantages: significantly lower compressive strength than injection-moulded EPS, moisture-sensitive during use, and production cycle of days to two weeks versus minutes for EPS — a scalability constraint for high-volume applications.

Mycelium leather alternatives (Bolt Threads’ Mylo, Ecovative’s Forager) are compressed, surface-finished mycelium sheets that replicate the drape, texture, and workability of bovine leather. Advantages: substantially lower land use, water consumption, and greenhouse gas footprint than full-grain leather; no toxic chrome tanning chemistry. Disadvantages: current abrasion and tear resistance fall short of full-grain leather for demanding applications; at commercial scale, finishing binders and surface coatings often reintroduce petrochemical inputs, diluting the sustainability benefit.

Mycoprotein food ingredients (Quorn, derived from Fusarium venenatum) are protein-rich fungal biomass used as meat analogues. Advantages: approximately 45% protein by dry weight, low saturated fat, low agricultural land footprint relative to beef, and the fibrous hyphal structure provides a naturally meat-like texture without processing. Disadvantages: energy-intensive continuous fermentation; a subset of consumers report mild gastrointestinal sensitivity at high intake levels; regulatory approval pathways for novel mycoprotein sources remain slow.

Mycelium structural and insulation panels have been explored as bio-based replacements for mineral wool or rigid foam board insulation. Thermal and acoustic performance is comparable for low-density applications, and end-of-life biodegradation is complete. Disadvantages: tensile and compressive strength are well below conventional structural materials; moisture management during installation remains a practical challenge; and current production does not yet achieve the thickness uniformity required for building code certification at scale.


Question 2: What to Genetically Engineer Fungi to Do, and Why

Tannase and gallic acid valorisation in Aspergillus niger

The most direct and personally relevant application is the engineering of A. niger to produce tannase under titratable control, as in the BioCircuit Tannase project. Wild-type A. niger produces tannase natively under solid-state fermentation conditions, but expression is poorly controlled, and the enzyme is co-secreted with a complex mixture of other enzymes that contaminate product streams. Genetic engineering allows tighter control: replacing the native promoter with a titratable one (such as the thiamine-repressible PthiA promoter) decouples tannase expression from growth phase and substrate composition.

Beyond tannase itself, engineering a PobR-based gallic acid biosensor into the same strain creates a self-reporting production circuit: the cell signals in real time whether gallic acid — the product of tannin hydrolysis — is accumulating at target concentrations, without the need for offline HPLC sampling. This closed-loop sensing capability is something that cannot be achieved by process engineering alone; it requires the cell to function as its own analytical instrument.

Gallic acid is a high-value platform molecule. It is a direct precursor to propyl gallate (a major food antioxidant), pyrogallol (used in pharmaceuticals and photography), and ellagic acid (a polyphenol of biomedical interest). Engineering A. niger to both produce and sense gallic acid positions it as a cell factory for the valorisation of tannin-rich agro-industrial waste streams — grape pomace, pomegranate peel, tea dust — which are otherwise low-value byproducts of the food and beverage industry.

Secondary metabolite activation

Fungi are the natural source of some of the most important small molecules in medicine and industry: penicillin (Penicillium chrysogenum), lovastatin (Aspergillus terreus), cyclosporin (Tolypocladium inflatum), and a vast array of characterised and uncharacterised polyketides and non-ribosomal peptides encoded in biosynthetic gene clusters. Many of these clusters are silenced under standard laboratory growth conditions — their activating transcription factors respond to signals that are poorly reproduced in flask culture. Synthetic biology can replace native cluster regulators with orthogonal, inducible promoters, turning silent clusters on and enabling the discovery and production of novel bioactive compounds.

Engineering mycelium material properties

The mechanical properties of mycelium composites depend directly on cell wall composition — the ratio of chitin to β-glucan, the density of inter-hyphal anastomoses, and the abundance and type of surface hydrophobins. Genetic engineering of these parameters would allow mycelium to be grown to specification: stiffer composites for structural panels, more flexible and surface-smooth material for leather alternatives, or hydrophobic coatings for water resistance. This is a fundamentally different design paradigm from the current approach of empirical substrate and species screening.

Advantages of synthetic biology in fungi versus bacteria

The decisive advantage is eukaryotic cell biology. Many industrially and medically valuable proteins require post-translational modifications (glycosylation, disulfide bond formation, GPI anchoring, correct folding via the endoplasmic reticulum secretory pathway) that bacteria either cannot perform or perform poorly. Expression of these targets in E. coli often produces misfolded, inactive inclusion bodies; expression in A. niger or Pichia pastoris is far more likely to yield correctly folded, active, secreted protein.

Aspergillus species have been industrially optimised for secreted enzyme production; well-engineered strains routinely secrete tens of grams of protein per litre of culture medium. Bacterial secretion systems are far less efficient, and in gram-negative species the outer membrane traps product in the periplasm, complicating recovery.

For materials applications, bacteria simply cannot produce a macroscopic, self-structuring solid biomass under ambient conditions. The multicellular, hyphal growth architecture — with its capacity to infiltrate and bind heterogeneous substrates — is uniquely fungal. No amount of bacterial engineering replicates this morphological property.

Fungi also tolerate heterogeneous, recalcitrant feedstocks — tannin-rich waste streams, lignocellulosic materials, high-phenol substrates — that would be toxic or non-utilisable by most bacteria, owing to the rich native secretome of oxidative and hydrolytic enzymes. This feedstock flexibility is essential for sustainable, low-cost bioprocessing on real agricultural waste.

The primary disadvantages relative to bacteria are slower growth rates, longer genetic engineering cycles due to larger and more complex genomes, and a less mature synthetic biology toolkit — fewer characterised promoters, fewer orthogonal transcription factors, and slower CRISPR editing cycles than in E. coli. These gaps are closing rapidly in the industrially important Aspergillus and Pichia hosts, but bacterial chassis remain the first choice for rapid prototyping of new circuits before transfer to a fungal production host.


Assignment Part 3: First DNA Twist Order — BioCircuit Tannase Final Project

3.0 — Review of Individual Final Project Documentation Guidelines

Individual Final Project presentations are 3 minutes for Global Committed Listeners, with 1-2 minutes of Q&A. The project is documented on the HTGAA website and evaluated on scientific quality, clarity, and relevance. All DNA designs are submitted via Benchling or equivalent and sent to TAs by March 18, 2026, for Twist synthesis. The final project is presented on May 13, 2026.


3.1 — Google Form Submission (Draft Aim 1, Summary, Industry Council, Shared Folder)

[Submitted separately via Google Form at https://forms.gle/b45ukkwPKfKPLcSA7]


3.2 — Insert Sequence Design: Complete Protocol and Documentation

Project Title

BioCircuit Tannase: A Gallic Acid Biosensor–Guided Tannase and β-Glucosidase Co-Expression System in Aspergillus niger

Backbone Vector

The insert sequence will be synthesised by Twist Bioscience and cloned into pTwist Amp High Copy (Twist Bioscience catalog vector). This backbone provides:

  • Ampicillin resistance (ampR) for E. coli selection on LB + ampicillin plates
  • pUC origin of replication (high copy, ~100–300 copies/cell in E. coli)
  • No restriction sites flanking the insert cloning region

Note for Aspergillus niger transformation: The AMA1 episomal replication element and pyrG selectable marker required for fungal maintenance are encoded within the insert sequence itself. The pTwist Amp backbone serves for E. coli propagation and verification.


Insert Sequence Overview

The insert is a multi-cassette linear expression construct of approximately 10,970 bp total length, encoding four sequential gene expression units (cassettes) arranged 5’ to 3’. It was designed as a linear topology insert in Benchling and submitted for Twist Clonal Gene synthesis.

Five BsaI recognition sites were identified across the full 10,970 bp sequence during initial design and were domesticated (silently mutated at the wobble position of the nearest codon to destroy the site while preserving amino acid sequence) prior to submission. This is required because Twist Bioscience’s internal cloning process uses BsaI-based Golden Gate assembly, and internal BsaI sites in the submitted sequence would cause incorrect fragmentation during synthesis.


Element-by-Element Protocol: How Each Part Was Retrieved, Designed, and Assembled

NotI Flanking Site (5’ end)

Sequence: GCGGCCGC (8 bp)

Function: Rare-cutting restriction site at the 5’ boundary of the insert, allowing future re-extraction from the backbone if needed for subcloning into an alternative vector.

Design: Added manually in Benchling at position 1 of the insert. Annotated as NotI_5prime_flank.


Element 1: PthiA Promoter (~1,000 bp)

What it does: Thiamine-repressible promoter from A. niger. When thiamine is present in the growth medium, the thiA gene (and this promoter) is repressed. Removing thiamine from the medium de-represses the promoter and induces expression of whatever gene cassette follows it — in this case, tanA (tannase). This allows separation of the fungal growth phase (with thiamine) from the production phase (without thiamine).

How to retrieve:

  1. Go to FungiDB
  2. Search: thiA with organism set to Aspergillus niger CBS 513.88
  3. Open the gene entry. Navigate to the Sequences tab
  4. Download 1,000 bp upstream genomic sequence (immediately upstream of the thiA start codon)
  5. This is your PthiA promoter sequence
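If the chromosome sequence and gene coordinates are downloaded instead of using the web interface, the same upstream region can be cut out programmatically. The Python sketch below assumes 1-based inclusive coordinates (GenBank style) and a known strand; the function name and the toy sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(chromosome: str, cds_start: int, cds_end: int,
                    strand: str, length: int = 1000) -> str:
    """Return `length` bp immediately upstream of the start codon.

    Coordinates are 1-based inclusive with cds_start < cds_end on either
    strand (assumed convention). The result reads 5'->3' on the coding strand.
    """
    if strand == "+":
        begin = max(0, cds_start - 1 - length)
        return chromosome[begin:cds_start - 1]
    if strand == "-":
        end = min(len(chromosome), cds_end + length)
        return reverse_complement(chromosome[cds_end:end])
    raise ValueError("strand must be '+' or '-'")


# Toy example: a 9 bp gene at positions 6..14 on each strand.
plus_chrom = "TTTTT" + "ATGAAATAA" + "CCCCC"
minus_chrom = "GGGGG" + reverse_complement("ATGAAATAA") + "ACGTA"
print(upstream_region(plus_chrom, 6, 14, "+", length=5))
print(upstream_region(minus_chrom, 6, 14, "-", length=5))
```

For a minus-strand gene the promoter lies at higher coordinates than the CDS, which is why that branch slices after `cds_end` and reverse-complements the result.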

Benchling: Paste immediately after the NotI flanking site. Annotate: Name = PthiA_promoter, Type = Promoter, Color = Orange.


Element 2: tanA Coding Sequence (~1,500 bp)

What it does: Encodes tannase (tannin acyl hydrolase, EC 3.1.1.20) — the core enzyme that cleaves ester bonds in tannic acid to release gallic acid and glucose. The protein contains an N-terminal signal peptide (~19–30 aa) that directs it to the A. niger secretory pathway, so the mature enzyme is secreted into the culture medium.

Correction applied: The accession previously cited in this guide (XM_001394592) was incorrect — it encodes an uncharacterised protein, not tannase. The correct accession is XM_001401772.

How to retrieve:

  1. Go to NCBI Nucleotide. Direct link: https://www.ncbi.nlm.nih.gov/nuccore/XM_001401772
  2. In the Features panel on the right, click CDS to highlight only the coding sequence (ATG to stop codon, no UTRs)
  3. Click Send to → File → FASTA to download the CDS only
  4. Save as tanA_native.fasta

Codon optimisation:

  1. Go to Twist Codon Optimization Tool
  2. Paste the tanA CDS
  3. Organism: Aspergillus niger
  4. Check: Avoid BsaI, Avoid BsmBI, Avoid BbsI
  5. Click Optimize → download as FASTA → save as tanA_codonopt_Aniger.fasta

Signal peptide annotation: After pasting into Benchling, go to SignalP 6.0, paste the translated protein sequence, and identify the signal peptide cleavage site. Annotate the corresponding nucleotides as tanA_signal_peptide (pink).

Benchling: Paste immediately after PthiA_promoter. Annotate: Name = tanA_coding_sequence, Type = CDS, Color = Red.


Element 3: TtrpC Terminator — Instance 1 (~543 bp)

What it does: Transcriptional terminator from Aspergillus nidulans trpC gene. Signals RNA polymerase to stop transcription and release the mRNA after tanA. Without this, transcription would read through into the next cassette, producing unstable chimeric mRNAs and disrupting downstream gene expression.

Sequence (validated, Punt et al. 1987):

GCAATGGATCTTCAAGAGAAGGATGCCTTCCTGCTGCTTCTGCTCAAGGCCAAGGAATCGGGTGAAGG
TGAATCGTAAGACCTCAACCTCAAACCAAGACATTAATCAAGAAGACCAAGACCTCGAATCAAGATCTC
GATCAAGAGACCAAGATCACGATAGAGGAAGATCGTGAGAAGGACAAGAAGGAGAAGCAAGAAGAAGAGG
ATGAGGATGATGCCGAAGACGAAGACGACGAATCCGATGAAGGCGACGCAGACGGATCAGACGGTCACGG
CGGTGATGCCGATGGCAGTGGTGGCGTTGGTGCCGGAGCTGGAGCCACCAGATCCTCAGACGCCCCATCA
AAGCCCACGGCCACCGAAAAGCCCCAGTCCCCAGCGGCCCCCGCATCAGCCCCACCCATCCTCCCGATGA
TGCCCCCCTCTTCAAGAAGCCTCCAGCAGCAGCAAGCCCCAGAAAGCCCAGCCCCACTCCAACAAG

Source: Punt et al. (1987) Gene 56:117–124. Also retrievable from pAN7-1 vector (Addgene #26908).

Benchling: Paste after tanA_coding_sequence. Annotate: Name = TtrpC_terminator_1, Type = Terminator, Color = Purple.


Element 4: PgpdA Promoter — Instance 1 (~800 bp)

What it does: Strong constitutive promoter from the A. niger glyceraldehyde-3-phosphate dehydrogenase (gpdA) gene. Active during all growth phases without requiring any inducer. Used to drive bglA (β-glucosidase) constitutively so the accessory enzyme is always present to assist tannin hydrolysis.

How to retrieve — direct link (800 bp upstream, confirmed coordinates):

Direct NCBI link: https://www.ncbi.nlm.nih.gov/nuccore/NT_166531.1?report=fasta&from=392794&to=393593

This gives the 800 bp region immediately upstream of the gpdA (An16g01830) ATG start codon on scaffold NT_166531.1.

  1. Open the link above in your browser
  2. Click Send to → File → FASTA to download
  3. Save as PgpdA_800bp.fasta
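The same coordinate range can also be requested through NCBI E-utilities instead of the browser. The Python sketch below only builds the efetch URL and parses FASTA text; the helper names are hypothetical, and the network call is left as an uncalled function so the block runs offline.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession: str, start: int, stop: int) -> str:
    """Build an efetch URL for a 1-based inclusive sub-range in FASTA format."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
        "seq_start": start,
        "seq_stop": stop,
    }
    return f"{EFETCH}?{urlencode(params)}"


def fetch_fasta(accession: str, start: int, stop: int) -> str:
    """Download the FASTA text (requires network access; not called here)."""
    with urlopen(efetch_fasta_url(accession, start, stop)) as response:
        return response.read().decode()


def fasta_to_sequence(fasta_text: str) -> str:
    """Drop header lines and join the sequence lines into one string."""
    return "".join(line.strip() for line in fasta_text.splitlines()
                   if not line.startswith(">"))


start, stop = 392794, 393593
print(efetch_fasta_url("NT_166531.1", start, stop))
print("expected length:", stop - start + 1)
# sequence = fasta_to_sequence(fetch_fasta("NT_166531.1", start, stop))
```

Checking that `stop - start + 1` equals 800 before downloading confirms the range matches the intended promoter length.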

Benchling: Paste after TtrpC_terminator_1. Annotate: Name = PgpdA_promoter_1, Type = Promoter, Color = Orange.


Element 5: bglA Coding Sequence (~1,400 bp)

What it does: Encodes β-glucosidase (EC 3.2.1.21) from Talaromyces emersonii — a thermostable accessory enzyme that cleaves glucose from partially hydrolysed gallotannin intermediates. Works synergistically with tannase: tannase cleaves the galloyl ester bonds, releasing gallic acid; bglA cleaves the remaining glucose backbone of incompletely hydrolysed intermediates, preventing product inhibition and increasing completeness of tannin conversion.

How to retrieve:

  1. Go to NCBI Nucleotide
  2. Search: AJ313330. Direct link: https://www.ncbi.nlm.nih.gov/nuccore/AJ313330
  3. Click CDS to highlight the coding sequence
  4. Send to → FASTA → save as bglA_native.fasta
  5. Codon-optimise using Twist tool (same settings as tanA, organism = Aspergillus niger)
  6. Save as bglA_codonopt_Aniger.fasta

Benchling: Paste after PgpdA_promoter_1. Annotate: Name = bglA_coding_sequence, Type = CDS, Color = Red.


Element 6: TtrpC Terminator — Instance 2 (~543 bp)

What it does: Terminates bglA transcription. Identical sequence to Element 3.

Benchling: Paste after bglA_coding_sequence. Annotate: Name = TtrpC_terminator_2, Type = Terminator, Color = Purple.


Element 7: PgpdA Promoter — Instance 2 (~700 bp, A. nidulans)

What it does: Constitutive promoter driving pobR expression. PobR (the gallic acid sensor protein) must always be present in the cell — it cannot be inducible, because you need it ready to sense gallic acid the moment production begins.

Note on repeat sequences: Using the same PgpdA sequence twice in one construct creates direct repeats that can promote recombination-mediated deletions in E. coli during propagation. To mitigate this, use the A. nidulans PgpdA here instead:

Retrieval via FungiDB (A. nidulans gpdA promoter, ~700 bp): https://fungidb.org → organism: Aspergillus nidulans FGSC A4 → search gpdA (locus AN1246) → Sequences tab → 700 bp upstream region
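Before submission, the assembled construct can be screened for the direct repeats this note warns about. The Python sketch below reports every k-mer that occurs more than once; the 12 bp window and the toy sequences are arbitrary assumptions, and a real screen would use a longer window and the actual insert sequence.

```python
from collections import defaultdict


def find_repeated_kmers(seq: str, k: int = 20) -> dict[str, list[int]]:
    """Map each k-mer that occurs more than once to its 0-based start positions."""
    seq = seq.upper()
    positions: dict[str, list[int]] = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


# Toy construct: the same 20 bp "promoter" used twice around a 10 bp spacer.
promoter = "ACGTTGCAAGCTTAGGCATC"
construct = promoter + "T" * 10 + promoter
repeats = find_repeated_kmers(construct, k=12)
for kmer, pos in sorted(repeats.items(), key=lambda item: item[1]):
    print(kmer, pos)
```

Swapping the second promoter copy for a divergent homologue, as proposed here, should make a report like this come back empty for that region.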

Benchling: Paste after TtrpC_terminator_2. Annotate: Name = PgpdA_promoter_2_Anidulans, Type = Promoter, Color = Orange.


Element 8: pobR Coding Sequence (~900 bp)

What it does: Encodes PobR, an IclR-family transcriptional activator from Acinetobacter baylyi ADP1 that binds 4-hydroxybenzoate and structurally related phenolics including gallic acid as effector molecules. When gallic acid binds PobR, the protein undergoes a conformational change and binds the pob operator DNA sequence, activating transcription of the downstream reporter gene (mNeonGreen). This is the sensor node of the biosensor genetic circuit. Requires codon optimisation because it comes from a bacterium, and its native codon usage is poorly matched to the codon preferences of A. niger.

How to retrieve:

  1. Go to NCBI Gene: https://www.ncbi.nlm.nih.gov/gene/2879266
  2. Click through to the nucleotide sequence for locus ACIAD3381 on genome NC_005966.1
  3. Extract the CDS
  4. Codon-optimise using Twist tool — organism: Aspergillus niger — avoid BsaI/BsmBI/BbsI
  5. Save as pobR_codonopt_Aniger.fasta

Benchling: Paste after PgpdA_promoter_2_Anidulans. Annotate: Name = pobR_coding_sequence, Type = CDS, Color = Dark Blue.


Element 9: TglaA Terminator (~400 bp)

What it does: Terminates pobR transcription. To break up direct repeat sequences, this instance uses the TglaA terminator from the A. niger glucoamylase gene as an alternative.

TglaA terminator source:

  1. Go to NCBI: https://www.ncbi.nlm.nih.gov/nuccore/M57398
  2. This is the A. niger glaA (glucoamylase) gene sequence
  3. Extract the ~400 bp immediately downstream of the glaA stop codon — this is the TglaA terminator

Benchling: Paste after pobR_coding_sequence. Annotate: Name = TglaA_terminator_1, Type = Terminator, Color = Purple.


Element 10: pob Operator × 3 Tandem Copies (~412 bp)

What it does: This is the regulatory heart of the gallic acid biosensor circuit. The pob operator is the specific DNA sequence that PobR binds when it has detected gallic acid. PobR bound to this operator recruits RNA polymerase and activates transcription of the downstream reporter gene (mNeonGreen). Three tandem copies are used to amplify the signal: more PobR binding sites means stronger transcriptional activation and higher reporter fluorescence per unit of gallic acid sensed.

Critical structural note: The pob operator is embedded within the 134 bp intergenic region between pobR and pobA in the A. baylyi ADP1 genome. This 134 bp region contains: the PobR binding site (~35 bp core with inverted repeats), the pobA –35 and –10 elements, and the pobA transcription start site (22 bp before the pobA ATG). The entire 134 bp intergenic region is used as the functional unit, not just 60 bp.

How to retrieve the exact 134 bp intergenic sequence:

  1. Go to NCBI Gene for pobA (ACIAD3380): https://www.ncbi.nlm.nih.gov/gene/2879265
  2. Note the exact chromosomal coordinates of pobA on NC_005966.1
  3. Go to NCBI Gene for pobR (ACIAD3381): https://www.ncbi.nlm.nih.gov/gene/2879266
  4. Note the coordinates of pobR
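  5. The 134 bp intergenic region is the sequence lying between the pobR and pobA coding regions. Check the strand of each gene first: pobR and pobA are reported to be divergently transcribed in ADP1, in which case the region is bounded by the two start codons rather than by the pobR stop codon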
  6. Use the NCBI Sequence Viewer range tool on NC_005966.1 to extract this 134 bp region
  7. Verify the inverted repeat within this region using the EMBOSS Palindrome tool (https://www.ebi.ac.uk/jdispatcher/seqstats/emboss_palindrome/)
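As a rough local cross-check of step 7, perfect inverted repeats can be found with a few lines of Python. This is a simplified stand-in, not a reimplementation of EMBOSS Palindrome: it allows no mismatches, and the arm length and maximum gap are arbitrary assumptions. The test sequence is a toy example, not the pob operator.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, arm: int = 6, max_gap: int = 10):
    """Return (start, gap, left_arm, right_arm) for perfect inverted repeats.

    The right arm must equal the reverse complement of the left arm and
    start `gap` bases after the left arm ends.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for gap in range(max_gap + 1):
            j = i + arm + gap
            right = seq[j:j + arm]
            if len(right) < arm:
                break  # ran off the end of the sequence
            if right == target:
                hits.append((i, gap, left, right))
    return hits


toy = "TT" + "GATTCA" + "ACGT" + "TGAATC" + "GG"
for hit in find_inverted_repeats(toy):
    print(hit)
```

Nested hits are expected when the spacer is itself palindromic, because shifted arm pairs inside a longer palindrome also qualify.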

Assembly in Benchling:

  1. Paste the 134 bp sequence
  2. Add a 5 bp neutral spacer: AAAAA
  3. Paste the 134 bp sequence again
  4. Add another AAAAA spacer
  5. Paste the 134 bp sequence a third time
  6. Total: 134 + 5 + 134 + 5 + 134 = 412 bp
  7. Annotate the whole block: Name = pob_operator_3x, Type = Regulatory, Color = Yellow

Important: Because the 134 bp intergenic region contains the pobA promoter elements (–35, –10, and TSS), the last copy of the 3× block also serves as the promoter for mNeonGreen transcription. No separate minimal core promoter is needed.
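The assembly arithmetic can be checked with a short Python sketch. The real 134 bp intergenic sequence is not reproduced here, so a placeholder of the same length stands in for it; `build_tandem_array` is a hypothetical helper name.

```python
def build_tandem_array(unit: str, copies: int = 3, spacer: str = "AAAAA") -> str:
    """Join `copies` of `unit` with `spacer` between adjacent copies."""
    return spacer.join([unit] * copies)


# Placeholder for the 134 bp pobR-pobA intergenic region (not the real sequence).
placeholder_unit = "N" * 134

array = build_tandem_array(placeholder_unit)
print(len(array))  # 3 * 134 + 2 * 5
```

Because spacers go only between copies, three units need two spacers, which gives the 412 bp total in the steps above.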


Element 11: mNeonGreen Coding Sequence (~720 bp)

What it does: Encodes mNeonGreen fluorescent protein — the output reporter of the biosensor circuit. When gallic acid accumulates intracellularly → PobR is activated → PobR binds the pob operator → mNeonGreen is transcribed and translated → green fluorescence is produced. Fluorescence intensity is proportional to intracellular gallic acid concentration, providing a real-time, non-destructive readout of the tannase pathway’s output that can be measured with a fluorescence plate reader (excitation 506 nm, emission 517 nm).

mNeonGreen is chosen over standard GFP because it is brighter, matures faster, and its excitation/emission spectrum is well-separated from the autofluorescence of A. niger (which emits primarily in the 450–490 nm range), giving superior signal-to-noise in filamentous fungal cells.

How to retrieve:

  1. Go to FPbase: https://www.fpbase.org/protein/mneongreen/
  2. Click the Sequences tab → download the protein (amino acid) sequence as FASTA
  3. Go to Twist Codon Optimization Tool: https://www.twistbioscience.com/resources/digital-tools/codon-optimization-tool
  4. Select input type: Amino Acid
  5. Paste mNeonGreen amino acid sequence
  6. Organism: Aspergillus niger
  7. Avoid BsaI/BsmBI/BbsI
  8. Click Optimize → download as FASTA → save as mNeonGreen_codonopt_Aniger.fasta

Placement note: Place the mNeonGreen ATG exactly 22 bp after the end of the third pob operator/intergenic repeat. The 22 bp spacing between the pobA TSS and ATG is the native translational context described for Acinetobacter (strain ADP1, now A. baylyi, was formerly classified as A. calcoaceticus); preserving it is intended to keep translation initiation efficient.

Benchling: Paste 22 bp after the end of pob_operator_3x. Annotate: Name = mNeonGreen_reporter, Type = CDS, Color = Green.


Element 12: TtrpC Terminator — Instance 3 (~543 bp)

What it does: Final terminator. Terminates mNeonGreen transcription and marks the 3’ end of the entire insert cassette.

Benchling: Paste after mNeonGreen_reporter. Annotate: Name = TtrpC_terminator_3, Type = Terminator, Color = Purple. (Using TtrpC here again is acceptable since it is separated from TtrpC instances 1 and 2 by ~5 kb of intervening sequence — recombination risk is low over this distance.)


AscI Flanking Site (3’ end)

Sequence: GGCGCGCC (8 bp)

Function: Rare-cutting restriction site at the 3’ boundary of the insert, enabling future re-extraction as a pair with the 5’ NotI site.

Benchling: Paste after TtrpC_terminator_3. Annotate: Name = AscI_3prime_flank.
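Re-extraction with NotI and AscI only works cleanly if neither site also occurs inside the insert. As a rough sketch of that check (the function names and toy sequences are hypothetical, and the real assembled sequence would need to be loaded separately):

```python
NOTI = "GCGGCCGC"  # 5' flanking site (palindromic)
ASCI = "GGCGCGCC"  # 3' flanking site (palindromic)


def count_sites(seq: str, site: str) -> int:
    """Count occurrences of `site` in `seq`, allowing overlaps."""
    return sum(1 for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i))


def flanks_are_unique(insert: str) -> bool:
    """True if the insert starts with NotI, ends with AscI, and has no internal copies."""
    insert = insert.upper()
    return (
        insert.startswith(NOTI)
        and insert.endswith(ASCI)
        and count_sites(insert, NOTI) == 1
        and count_sites(insert, ASCI) == 1
    )


# Toy inserts for illustration only.
clean_insert = NOTI + "ATGAAATAA" + ASCI
bad_insert = NOTI + "AT" + ASCI + "GG" + ASCI  # extra internal AscI site

print(flanks_are_unique(clean_insert))
print(flanks_are_unique(bad_insert))
```

Because both sites are palindromic, scanning a single strand is sufficient here.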


BsaI Domestication — 5 Sites Across 10,970 bp

During sequence assembly, a BsaI scan in Benchling (Tools → Restriction Sites → BsaI) revealed 5 BsaI recognition sites distributed across the full 10,970 bp insert. The BsaI recognition sequence is GGTCTC(1/5), with reverse complement GAGACC. Each site within a coding sequence was domesticated with a silent change: the third (wobble) base of a codon overlapping the recognition sequence was swapped for a synonymous alternative that disrupts the GGTCTC/GAGACC hexamer while preserving the encoded amino acid. Site 3 lies in a non-coding terminator, so the single-base change there has no codon to preserve and only needs to disrupt the hexamer.

The five sites were in:

  • Site 1: Within the codon-optimised tanA coding sequence
  • Site 2: Within the codon-optimised bglA coding sequence
  • Site 3: Within the TtrpC_terminator_2 sequence
  • Site 4: Within the codon-optimised pobR coding sequence
  • Site 5: Within the codon-optimised mNeonGreen sequence

After domestication, a second BsaI scan confirmed zero BsaI sites across the full insert. The sequence was then exported as GenBank (.gb) format for Twist submission.
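The wobble-base strategy for the coding-sequence sites can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the procedure actually run in Benchling: the helper names are hypothetical, the input must be an in-frame CDS, and the standard genetic code is assumed. It does not apply to the non-coding terminator site, and it does not check whether a change creates a site for a different enzyme.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

BSAI_MOTIFS = ("GGTCTC", "GAGACC")  # recognition site and its reverse complement


def translate(cds: str) -> str:
    """Translate an in-frame CDS with the standard genetic code."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def count_bsai(seq: str) -> int:
    """Number of BsaI recognition hexamers on either strand."""
    return sum(seq.count(m) for m in BSAI_MOTIFS)


def _fix_one_site(cds: str, pos: int, protein: str):
    """Try a synonymous change at each wobble base inside the hexamer at `pos`."""
    before = count_bsai(cds)
    for wobble in range(pos, pos + 6):
        if wobble % 3 != 2:  # only third codon positions
            continue
        for base in "ACGT":
            candidate = cds[:wobble] + base + cds[wobble + 1:]
            if translate(candidate) == protein and count_bsai(candidate) < before:
                return candidate
    return None


def domesticate_bsai(cds: str) -> str:
    """Remove BsaI sites from a CDS without changing the encoded protein."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("CDS length must be a multiple of 3")
    protein = translate(cds)
    while count_bsai(cds):
        pos = min(p for p in (cds.find(m) for m in BSAI_MOTIFS) if p != -1)
        fixed = _fix_one_site(cds, pos, protein)
        if fixed is None:
            raise ValueError(f"no synonymous wobble change removes the site at {pos}")
        cds = fixed
    return cds


# Toy CDS (Met-Gly-Leu-stop) containing one BsaI site; not a real gene.
toy_cds = "ATGGGTCTCTAA"
fixed_cds = domesticate_bsai(toy_cds)
print(fixed_cds, translate(fixed_cds), count_bsai(fixed_cds))
```

Requiring the site count to drop strictly on each step guarantees the loop terminates and mirrors the re-scan described above.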


Full Insert Sequence Summary Table

| # | Element | Length (bp) | Source | Benchling Annotation |
|---|---------|-------------|--------|----------------------|
| 1 | NotI flanking site | 8 | Manual | NotI_5prime_flank |
| 2 | PthiA promoter | ~1,000 | FungiDB: A. niger CBS 513.88 thiA upstream | PthiA_promoter |
| 3 | tanA CDS | ~1,500 | NCBI XM_001401772 → Twist codon opt | tanA_coding_sequence |
| 4 | TtrpC terminator 1 | ~543 | Punt et al. 1987 / Addgene pAN7-1 (#26908) | TtrpC_terminator_1 |
| 5 | PgpdA promoter 1 | ~800 | NCBI NT_166531.1 pos. 392794–393593 | PgpdA_promoter_1 |
| 6 | bglA CDS | ~1,400 | NCBI AJ313330 → Twist codon opt | bglA_coding_sequence |
| 7 | TtrpC terminator 2 | ~543 | Same as above | TtrpC_terminator_2 |
| 8 | PgpdA promoter 2 | ~700 | FungiDB: A. nidulans FGSC A4 gpdA upstream | PgpdA_promoter_2_Anidulans |
| 9 | pobR CDS | ~900 | NCBI Gene ID 2879266 (NC_005966.1) → Twist codon opt | pobR_coding_sequence |
| 10 | TglaA terminator 1 | ~400 | NCBI M57398 (A. niger glaA downstream) | TglaA_terminator_1 |
| 11 | pob operator ×3 | ~412 | NCBI NC_005966.1 intergenic region (ACIAD3380–3381) | pob_operator_3x |
| 12 | mNeonGreen CDS | ~720 | fpbase.org → Twist codon opt (AA input) | mNeonGreen_reporter |
| 13 | TtrpC terminator 3 | ~543 | Same as Element 4 | TtrpC_terminator_3 |
| 14 | AscI flanking site | 8 | Manual | AscI_3prime_flank |
| | Total insert | ~10,970 bp | | |
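As a quick arithmetic cross-check of the table, the nominal element lengths can be summed and compared with the stated total. The sketch below uses only the approximate (~) values listed above, so it is a sanity check rather than a measurement; the variable names are illustrative.

```python
# Nominal lengths (bp) copied from the summary table; "~" values are approximate.
element_lengths = {
    "NotI flanking site": 8,
    "PthiA promoter": 1000,
    "tanA CDS": 1500,
    "TtrpC terminator 1": 543,
    "PgpdA promoter 1": 800,
    "bglA CDS": 1400,
    "TtrpC terminator 2": 543,
    "PgpdA promoter 2": 700,
    "pobR CDS": 900,
    "TglaA terminator 1": 400,
    "pob operator x3": 412,
    "mNeonGreen CDS": 720,
    "TtrpC terminator 3": 543,
    "AscI flanking site": 8,
}

stated_total = 10_970  # total reported for the assembled insert
nominal_total = sum(element_lengths.values())
gap = stated_total - nominal_total

print(f"nominal sum: {nominal_total} bp, stated total: {stated_total} bp, gap: {gap} bp")
```

The nominal sum comes to 9,477 bp, roughly 1.5 kb below the stated ~10,970 bp. Since most entries are rounded estimates, the length of the exported GenBank file is the figure to trust, and the per-element lengths in Benchling are worth re-reading against it.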

How the Circuit Works — Complete System Description

The insert encodes two functional modules that work together:

Module A — Expression module (Cassettes 1 and 2): When the fungus is grown on tannic acid medium without thiamine, the PthiA promoter is de-repressed and drives high-level tannase (TanA) transcription and secretion. Secreted TanA hydrolyses tannic acid in the medium, releasing gallic acid and glucose. Simultaneously, the constitutive PgpdA promoter continuously drives β-glucosidase (BglA) expression. BglA cleaves glucose from partially hydrolysed tannin intermediates that TanA has started but not finished processing, preventing product inhibition and increasing the completeness of tannin hydrolysis. Together, TanA + BglA convert tannic acid more completely to gallic acid than either enzyme alone.

Module B — Biosensor circuit (Cassettes 3 and 4): PobR (encoded by Cassette 3 under constitutive PgpdA control) is always present in the cell, ready to sense gallic acid. When tannase activity generates gallic acid intracellularly, gallic acid molecules bind PobR and cause it to change conformation from its inactive state to its DNA-binding active state. Active PobR binds the 3× pob operator array (Element 10), recruits RNA polymerase, and drives transcription of mNeonGreen. Green fluorescence is produced in direct proportion to intracellular gallic acid concentration — giving a real-time, quantitative readout of the tannase pathway’s performance.
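A proportional fluorescence readout is likely to hold only over part of the sensor's range, since transcription-factor biosensors typically saturate at high ligand levels. One common way to describe such a response is a Hill function. The sketch below is a generic model, not a fitted description of this circuit: the function name and every parameter value (basal signal, maximum signal, half-maximal concentration, Hill coefficient) are placeholder assumptions that would have to be measured by calibration.

```python
def hill_response(ligand_uM: float,
                  basal: float = 50.0,       # hypothetical leak fluorescence (a.u.)
                  f_max: float = 5000.0,     # hypothetical saturated fluorescence (a.u.)
                  k_half_uM: float = 100.0,  # hypothetical half-maximal concentration
                  n: float = 1.5) -> float:  # hypothetical Hill coefficient
    """Predicted reporter fluorescence for a given intracellular ligand level."""
    if ligand_uM < 0:
        raise ValueError("concentration must be non-negative")
    occupancy = ligand_uM ** n / (k_half_uM ** n + ligand_uM ** n)
    return basal + (f_max - basal) * occupancy


# Response across a range of hypothetical gallic acid concentrations.
for conc in (0.0, 10.0, 100.0, 1000.0):
    print(f"{conc:7.1f} uM -> {hill_response(conc):8.1f} a.u.")
```

Under this model the signal rises steeply around `k_half_uM` and flattens towards `f_max`, which is why a calibration curve against known gallic acid concentrations would be needed before reading fluorescence as a quantitative concentration.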

Circuit feedback value: The biosensor circuit is not merely decorative — it generates data that HPLC cannot. While HPLC measures extracellular gallic acid at discrete time points, the mNeonGreen biosensor reports intracellular gallic acid continuously at single-cell resolution. This allows identification of production bottlenecks (is gallic acid being made but not exported? Is the intracellular concentration reaching toxic levels before it can leave the cell?), optimisation of induction conditions, and high-throughput screening of fermentation variables using a plate reader rather than repeated HPLC runs.


Ethical Considerations

The chassis organism (Aspergillus niger) is classified as BSL-1 and GRAS. The insert contains no sequences encoding virulence factors, toxins, or antimicrobial resistance beyond ampR for laboratory selection. The gallic acid biosensor and tannase co-expression system have clear industrial and environmental applications (valorisation of agricultural tannin waste, reduction of dependence on chemical gallic acid synthesis). No human, animal, or plant pathogens are involved. The project complies with all standard biosafety requirements for BSL-1 work.