Week 2 HW: DNA Read, Write and Edit
Part 1: Benchling & In-silico Gel Art
- Benchling: 48,502 bp Lambda phage DNA.
Open Project - Enzymes: NdeI, PvuII, SacI only.
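A minimal sketch of what the in-silico digest computes, assuming a linear molecule (as for Lambda) and top-strand cut offsets for the three enzymes. The toy sequence and function names are hypothetical; real Lambda fragment sizes come from running this on the full 48,502 bp sequence exported from Benchling.

```python
# Recognition site (5'->3') and cut offset on the top strand.
# All three sites are palindromic, so scanning one strand is enough.
ENZYMES = {
    "NdeI": ("CATATG", 2),   # CA^TATG
    "PvuII": ("CAGCTG", 3),  # CAG^CTG
    "SacI": ("GAGCTC", 5),   # GAGCT^C
}


def cut_positions(seq, enzymes=ENZYMES):
    """Return sorted top-strand cut coordinates for all enzymes."""
    seq = seq.upper()
    cuts = set()
    for site, offset in enzymes.values():
        start = seq.find(site)
        while start != -1:
            cuts.add(start + offset)
            start = seq.find(site, start + 1)
    return sorted(cuts)


def linear_fragments(seq, enzymes=ENZYMES):
    """Fragment lengths (largest first) for a linear molecule."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


# Hypothetical 35 nt toy sequence with one site per enzyme.
toy = "AAAACATATGAAAAAAACAGCTGAAAAGAGCTCAA"
print(cut_positions(toy))
print(linear_fragments(toy))
```

Sorting fragments largest first mirrors how bands appear on a gel from top to bottom.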

Part 3: Protein Design - Human Lysozyme (hLYZ) (no Part 2 because no lab attendance yet)
UniProt: P61626
Why: Lyses Gram+ bacteria (clearing zones on plates); ~130 aa antimicrobial for food/biofilms.
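Codon-optimized for E. coli (I used VectorBuilder). The sequence below encodes the full precursor, including the N-terminal signal peptide, and has no stop codon: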
ATGAAAGCGCTGATTGTGCTGGGCCTGGTGCTGCTGAGCGTGACCGTGCAGGGCAAAGTGTTTGAACGCTGTGAACTGGCCCGTACCCTGAAACGTCTGGGCATGGATGGCTATCGCGGCATTAGCCTGGCGAACTGGATGTGCCTGGCGAAATGGGAAAGCGGATATAACACCCGCGCGACCAACTATAACGCAGGCGATCGTAGCACCGATTATGGCATTTTCCAGATTAACAGCCGTTATTGGTGCAATGATGGCAAAACCCCGGGCGCCGTGAACGCGTGCCATCTGAGCTGTAGCGCCCTGCTGCAGGATAACATTGCGGATGCCGTGGCCTGCGCGAAACGCGTGGTGCGCGATCCGCAGGGCATTCGCGCGTGGGTGGCGTGGCGCAACCGCTGCCAGAACCGCGATGTTCGCCAGTACGTGCAGGGCTGTGGCGTG
Why codon optimized?:
The genetic code is degenerate: 61 sense codons (plus 3 stop codons) encode 20 amino acids, but cells prefer “optimal” codons with abundant tRNAs. The native human lysozyme gene (P61626) uses codons that are rare in E. coli (e.g., AGA/AGG/CGA/CGG for Arg) → ribosomal pausing → low protein yield (~1-10% of maximum yield).
Optimization process:
- Scan AA seq → Replace rare codons (e.g., AGA Arg → CGT/CGC).
- GC ~50% (E. coli sweet spot).
- Avoid RE sites/repeats.
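The scan-and-replace step above can be sketched as follows. This is only an illustration: the rare-codon set is the commonly cited E. coli list (an assumption, the exact set depends on the usage table), the function names are hypothetical, and real tools such as VectorBuilder also balance GC content, repeats and restriction sites.

```python
# Commonly cited rare codons in E. coli (assumed set, not a full usage table).
RARE_ECOLI_CODONS = {"AGA", "AGG", "CGA", "CGG", "ATA", "CTA", "CCC", "GGA"}
RARE_ARG_CODONS = {"AGA", "AGG", "CGA", "CGG"}


def codons(orf):
    """Split an in-frame ORF into codons."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return [orf[i:i + 3] for i in range(0, len(orf), 3)]


def rare_codon_positions(orf, rare=RARE_ECOLI_CODONS):
    """Return (codon index, codon) for every rare codon in the ORF."""
    return [(i, c) for i, c in enumerate(codons(orf)) if c in rare]


def swap_rare_arg(orf, replacement="CGT"):
    """Replace rare Arg codons with a preferred one (synonymous change)."""
    return "".join(replacement if c in RARE_ARG_CODONS else c for c in codons(orf))


def gc_fraction(seq):
    """Fraction of G/C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


toy_orf = "ATGAGACGTCGGTAA"  # Met-Arg-Arg-Arg-stop, hypothetical
print(rare_codon_positions(toy_orf))
print(swap_rare_arg(toy_orf))
print(round(gc_fraction(swap_rare_arg(toy_orf)), 2))
```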
Why E. coli?
K12/BL21 are the canonical lab strains:
- ~20 min doubling time.
- T7 strains/plasmids widely stocked.
- Lysis-halo assay is simple to run.
- BSL-1; ~$1 per liter of culture.
Protein Production Technologies
To produce lysozyme from my codon-opt DNA (plasmid: T7 promoter + RBS + hLYZ + term + AmpR), two main paths: cell-dependent (in vivo) and cell-free (TXTL)—both HTGAA staples.
1. Cell-Dependent (E. coli BL21(DE3))
Tech: Chemical transformation → IPTG induction → lysis/purification.
Steps:
- Transform plasmid into competent BL21(DE3) (T7 RNA pol strain).
- LB+Amp plate (37°C overnight) → single colonies.
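- Liquid culture → mid-log (OD600~0.6) → add 1 mM IPTG (relieves LacI repression → T7 RNA pol is expressed and transcribes from the T7 promoter).
- Express 4h (monitor growth by OD600) → centrifuge → lyse cells (e.g., sonication; added egg-white lysozyme in the lysis buffer would confound the activity assay) → Ni-NTA purify (His-tag).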
Why: Scalable (L-scale), cheap, cellular chaperones assist folding. Yield: 50-200 mg/L. Assay: Lysis halos on Micrococcus luteus agar. Limit: Inclusion bodies if misfolded; hLYZ has four disulfide bonds, which the reducing E. coli cytoplasm does not favor.
2. Cell-Free (TXTL / PURExpress)
Tech: NEB PURExpress (reconstituted, lysate-free) or a lysate-based TXTL kit such as Arbor Biosciences myTXTL (plasmid-fed).
Steps:
- Mix: DNA template (10-100 ng), lysate/lysate-free enzymes, NTPs/aa-tRNAs, energy (PEP), Mg/ATP.
- 29-37°C incubate 4-16h (plate reader monitor if fluorescent tag).
- Direct assay (no purification): Add bacteria → halo formation.
Why: Rapid prototyping (hours), no transformation or cell culture (linear templates can skip cloning too), toxic proteins OK. Yield: 1-10 μM (~0.015-0.15 mg/mL for a ~14.7 kDa protein). HTGAA fave for Week 2 demos. Limit: Costly, short-lived reactions.
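A quick check of the μM to mg/mL conversion for the cell-free yield, assuming a molecular weight of about 14.7 kDa for mature human lysozyme (an assumption; the function name is hypothetical):

```python
def micromolar_to_mg_per_ml(conc_um, mw_da):
    """Convert a molar concentration to a mass concentration.

    umol/L * g/mol = ug/L; dividing by 1e6 gives g/L, which equals mg/mL.
    """
    return conc_um * mw_da / 1e6


MW_HLYZ_DA = 14_700  # assumed approximate mass of mature hLYZ

for conc in (1, 10):
    print(conc, "uM ->", round(micromolar_to_mg_per_ml(conc, MW_HLYZ_DA), 4), "mg/mL")
```

For a protein of this size, 1-10 μM works out to roughly 0.015-0.15 mg/mL.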
Transcription/Translation (Central Dogma):
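DNA → transcription (RNA pol binds promoter → mRNA w/ RBS/5’UTR; on the order of 50-200 nt/s depending on the polymerase, with T7 RNA pol at the fast end).
mRNA → translation (30S ribosomal subunit binds RBS → 70S assembles → tRNAs decode codons → peptide chain ~20 aa/s → release factor → fold).
Lysozyme: ~130 aa → chain synthesized in ~7 s at 20 aa/s, so active enzyme within about a minute of transcription is plausible if folding is fast. Cell-free skips transformation and cell growth, but not folding!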
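A back-of-the-envelope sketch of the synthesis times implied by the rates quoted above (~100 nt/s transcription, ~20 aa/s translation). It ignores initiation, folding and disulfide formation, and the function name is hypothetical:

```python
def synthesis_time_s(n_aa, tx_rate_nt_s=100.0, tl_rate_aa_s=20.0):
    """Return (transcription_s, translation_s) for an ORF of n_aa residues."""
    orf_nt = 3 * (n_aa + 1)       # coding codons plus one stop codon
    tx_s = orf_nt / tx_rate_nt_s  # elongation only
    tl_s = n_aa / tl_rate_aa_s    # elongation only
    return tx_s, tl_s


tx_s, tl_s = synthesis_time_s(130)
print(f"transcription ~{tx_s:.1f} s, translation ~{tl_s:.1f} s")
```

Both steps take only seconds at these rates, so elongation is not what limits how quickly active lysozyme appears.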
Part 4 - STILL EDITING
Part 5.1 DNA Read
(i) What DNA would I sequence & why?
Synthetic DNA data storage archive (e.g., a petabyte-scale archive encoded in oligo pools, in the style of the Microsoft and ETH Zurich DNA-storage demonstrations).
Why: DNA can store on the order of 1 exabyte/gram (~10^18 bytes/g) and stay readable for 1000+ years when kept cold (4°C), vs an HDD service life of ~5 years. Perfect for AI training data, climate models, genomic libraries; an HTGAA synbio extension (store lysozyme designs, node protocols). Forward error-correcting codes (FEC) handle 1-5% synthesis/seq errors. Applications: Space (NASA), biobanks, “DNA cloud”.
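To make the "binary data in DNA" idea concrete, here is a toy 2-bits-per-base mapping. It is an illustration only, not the encoding used by any real archive: practical schemes add error-correcting codes, addressing, and constraints on homopolymers and GC content. The mapping and function names are hypothetical.

```python
BASES = "ACGT"  # assumed mapping: 00->A, 01->C, 10->G, 11->T


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bits first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(seq: str) -> bytes:
    """Decode a base string produced by bytes_to_dna."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


encoded = bytes_to_dna(b"hLYZ")
print(encoded, dna_to_bytes(encoded))
```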
(ii) Technology: Oxford Nanopore MinION + PromethION (3rd-generation)
Why MinION: Long reads (>10 kb possible) cover each oligo or concatemer end to end; portable (USB, London node); real-time basecalling (Guppy 6.0, ~98% raw → 99.9% consensus). PromethION scales to ~290 Gb per flow cell for a PB archive.
Generation: 3rd—nanopore sequencing (single-molecule, no amplification bias, real-time vs SBS parallel/PCR).
Input/Preparation (100ng oligos):
- End-repair/A-tail: Blunt→A-overhangs (NEB Ultra II).
- Adapter ligation: SQK-LSK114 (motor protein + tether). No fragmentation (native length preserves codes).
- Loading: 30-48 flow cells (PromethION), R10.4.1 pore (the chemistry that matches SQK-LSK114).
Essential Steps & Base Calling:
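- Threading: Motor protein (helicase) ratchets ssDNA through a protein nanopore (engineered CsgG in current ONT flow cells; α-hemolysin in early prototypes).
- Ionic current: The ~100 pA signal depends on the k-mer (several bases) sitting in the pore constriction, not on single bases, so each base is inferred from overlapping k-mer levels. Homopolymers longer than the k-mer give a flat signal, so their length is hard to call.
- Raw signal (FAST5/POD5 squiggles) → neural-net basecaller (Guppy, RNN-based; Dorado, GPU, live) → FASTQ with Phred scores (Q10 = 90%, Q20 = 99%, Q30 = 99.9% per-base accuracy).
- Polish: Medaka/Racon consensus from 30x coverage.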
Output: FASTQ reads (10-100kb), PHRED scores. Assemble → ECC decode → binary data.
Limits: Homopolymers (~5% err), speed (~400-450 bases/s/pore); simplex reads are ~Q20, and duplex mode (both strands) pushes toward Q30.
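The Q-scores quoted in this part follow the standard Phred definition, Q = -10·log10(P_error). A small sketch of the conversion and of decoding a FASTQ quality string (Phred+33 offset, the usual encoding for current FASTQ files); the function names are hypothetical:

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Per-base accuracy implied by a Phred score."""
    return 1.0 - 10 ** (-q / 10)


def error_to_phred(p_error: float) -> float:
    """Phred score for a given per-base error probability."""
    return -10 * math.log10(p_error)


def decode_fastq_quality(qual: str, offset: int = 33) -> list[int]:
    """Convert a FASTQ quality string into integer Phred scores."""
    return [ord(ch) - offset for ch in qual]


for q in (10, 20, 30):
    print(f"Q{q}: {phred_to_accuracy(q):.3%} accurate")
print(decode_fastq_quality("I5+"))
```

Under this definition Q10 is 90% accuracy, Q20 is 99%, and Q30 is 99.9%.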
Part 5.2 DNA Write
(i) What DNA to synthesize & why?
Codon-optimized human lysozyme (hLYZ, P61626) + Gibson homology arms for pET28a integration.
Why: Antimicrobial for food preservation (Gram+ lysis), biofilms, therapeutics. Free through Twist (~$40 value); test lysis in Week 4. Applications: Self-sterilizing surfaces, phage replacement.
(ii) Technology: Twist Bioscience (arrayed phosphoramidite synthesis → assembly)
Why: $0.09/bp, 1-5kb genes error-free, 10-14 day turnaround, HTGAA partner (Benchling→Twist integration).
Essential Steps:
- Oligo array synthesis: Inkjet print A/C/G/T phosphoramidites → 10^6 spots/plate → 200nt oligos (60mer tiles).
- Error correction: 4x coverage unique tiles → NGS validate → discard errors.
- Hierarchical assembly: Gibson (isothermal enzymatic overlap assembly: exonuclease + polymerase + ligase) or NEBuilder HiFi.
- Cloning/Verification: Electrocompetent DH5α → colony PCR/Sanger → ship 2-5μg plasmid.
Limitations:
- Speed: 2wks (vs enzymatic 2days).
- Accuracy: 99.9% post-assembly (raw oligos 1:100 err).
- Scalability: 5kb genes ($450), 100kb pathways $$$; toxic/repeats fail.
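The prices quoted above are consistent with a flat per-base rate. A minimal sketch using the $0.09/bp figure from this section; the ~444 bp insert length (148 codons, before homology arms) is an assumption, and real quotes depend on vendor, length tier and cloning options:

```python
def synthesis_cost_usd(length_bp, price_per_bp=0.09):
    """Estimated list price for a gene at a flat per-base rate."""
    return round(length_bp * price_per_bp, 2)


print(synthesis_cost_usd(444))   # assumed hLYZ insert length
print(synthesis_cost_usd(5000))  # 5 kb gene
```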
Part 5.3 DNA Edit
(i) What DNA to edit & why?
E. coli MG1655 genome—insert hLYZ cassette (constitutive pro, RBS, hLYZ, term) at neutral locus (betT/aga).
Why: Chromosomal integration eliminates plasmid loss/stability issues and the need for antibiotic selection. Industrial: Self-lysing probiotics for food safety. Conservation: Edit coral bacteria to fight bleaching pathogens.
Edit: 1.2kb insertion replacing pseudogene.
(ii) Technology: CRISPR-Cas12a recombineering (enAsCas12a)
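Why: T-rich PAM (TTTV for wild-type AsCas12a; enAsCas12a also accepts TTYN, VTTV and others) reaches sites that SpCas9's NGG PAM misses; crRNA arrays make multiplexing easy; up to ~90% HR efficiency w/ RecET in the best case.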
How it edits: Cas12a crRNA binds target → RuvC cleaves → RecET boosts HDR from ssDNA donor.
Preparation/Steps:
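- Design (Benchling): 20-24 nt spacer immediately 3' of the PAM (mine is 21 nt); ssDNA donor = 1.2 kb cassette flanked by 40 nt homology arms (~1.3 kb total).
- Inputs: enAsCas12a protein (20μM), crRNA (1μM; Cas12a needs no tracrRNA), ssDNA (1μM), MG1655 cells expressing RecET.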
- Electroporation: mix RNP + donor with cells → pulse (25μF) → recover 37°C → plate Amp + lysis assay.
- Screen: PCR + Sanger verify insertion.
Limitations:
- Efficiency: more typically 10-50% in practice (donor design critical).
- Precision: 0.1% off-target (Cas12a collateral nuclease quenched).
- Size: <3kb insertions optimal.
My spacer: GTTGATCTGGAAGCTGACCGC (betT locus).
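A small sanity check for the spacer, plus a top-strand scan for wild-type Cas12a sites (TTTV PAM directly 5' of the protospacer). This is a sketch only: the PAM context in the toy sequence is hypothetical and not taken from the real betT locus, the bottom strand is not scanned, and the function names are made up for illustration.

```python
import re

SPACER = "GTTGATCTGGAAGCTGACCGC"  # spacer from these notes


def gc_fraction(seq):
    """Fraction of G/C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_cas12a_sites(seq, spacer_len=21):
    """Return (PAM start, PAM, protospacer) for TTTV sites on the top strand."""
    # Lookahead keeps overlapping candidate sites from being skipped.
    pattern = re.compile(r"(?=(TTT[ACG])([ACGT]{%d}))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq.upper())]


# Hypothetical context: 2 nt + TTTC PAM + spacer + 2 nt.
toy_locus = "GG" + "TTTC" + SPACER + "AA"

print(len(SPACER), round(gc_fraction(SPACER), 3))
print(find_cas12a_sites(toy_locus))
```

Running the same scan on both strands of the real locus sequence from Benchling would show whether this spacer actually sits next to a usable PAM.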