Week 2 - 2. DNA Read, Write & Edit

This week explores the read–write–edit toolkit: sequencing and synthesis workflows, restriction digests and gel electrophoresis, and early genome-editing frameworks.

I’m a commited listener but without acces to a biology lab.

0. Basics of Gel Electrophoresis

I followed the class and take a look at the content around this technique.

1. Benchling & In-silico Gel Art

After creating an account on Benchling and using it to understand better the possibilities it offers.

I made a Digestion Simulation with all the indicated restriction enzymes :

Here is my created image using this same technique :

“Seagull flying in a clear sky”, I acknowleged it is quite simple it’s the first inspiration that came to my mind :)

2. Gel Art - Restriction Digests and Gel Electrophoresis

I don’t have acces to a lab as mentionned ealier

3. DNA Design Challenge

Choose your protein. I’ve chosen Chitodextrinase, it’s an enzyme that permit the degradation of chitin which a polysaccaride that form structure and rigidity of living orgnisme. It’s found mostly in mushrooms and insect, there for chitodextrinase is produced by animal eating insect or plants to protect themselve from fungi. I found it intersting because i worked on mycelium based composite and chitin is on of the main strucuring compond allowing those object to be resistant and light while being biodegradable. So i thought it would be intersting to understand better the process of it’s decomposition by living organisme.

Here is this protein Sequence :

LOCUS CAM6841167 609 aa linear BCT 18-MAR-2025 DEFINITION Chitodextrinase [Enterobacter rongchengensis]

    1 mnkrtllsvl vagacvapfm aqaaslqatt sepytikasd lakkekelts fplmasvket
   61 iqtldnaqve liepgraanp dnvkrvegiv kasdweylfp lraqaytysn flkavgkfpa
  121 lcktyndgrd sdaicrkela tmfahfaqet ggheswrpea ewrqalvhvr emgwsegqkg
  181 gyngecnpdv wqgqtwpcgk dkdgdflsyf grgakqlsyn ynygpfseam ygdvrtlldk
  241 pelvadtwln lasaifffay pqppkpsmlq vidgtwqpnd hdkanglvpg fgvttqiing
  301 gvecggptei aqsenrikyy kefanylkvp vpanevlgca nmkqfdegga galkiyweqd
  361 wgwsadtpsg atyscqlvgy qtpfsafkeg dyskcvqkff nvnivnddgs atpdqtpvtp
  421 tptpapsqde tpapapvpde tpaepaavnh apvadiagpi gavdagaqvs lsaegstdad
  481 gnaltytwrs qdgqtvtgqd kavvtfkape aataqqieis ltvsdgelss ttsyllnvka
  541 kaapsqdegt sgnyaawsan skykagdivn nhgklfqckp fpysgwcnna payyepgagl
  601 awadawtal

information retrieve from : https://www.ncbi.nlm.nih.gov/protein/CAM6841167.1 and originating from a French lab in Paris not super far from where I live!

elipse: I started my research of an intersting protein thinking about cafeine but it’s an alcaloïde. Then I though that i could be one at the origine of the smell of fresh rain : it come from geosmin a bacterian made metabolite (way smaller than proteines). I also though about melanine but it’s not so simple you have different type and they aren’t protein either. After that I thought about all the things we hears about the protein in our alimentation and remembered that the most protein dense “veggie” in french it’s considered a legumineuse is lentils and the 2nd most important protein in quantity in them is “Albumine” a protein playing an important role in human lymphatic systeme alowing a proper osmotique balance of the plasma.

So for bonus here is an Albumine sequence I founded too :

    1 mkwvtfisll flfssaysrg vfrrdahkse vahrfkdlge enfkalvlia faqylqqcpf
   61 edhvklvnev tefaktcvad esaencdksl htlfgdklct vatlretyge madccakqep
  121 ernecflqhk ddnpnlprlv rpevdvmcta fhdneetflk kylyeiarrh pyfyapellf
  181 fakrykaaft eccqaadkaa cllpkldelr degkassakq rlkcaslqkf gerafkawav
  241 arlsqrfpka efaevsklvt dltkvhtecc hgdllecadd radlakyice nqdsissklk
  301 eccekpllek shciaevend empadlpsla adfveskdvc knyaeakdvf lgmflyeyar
  361 rhpdysvvll lrlaktyett lekccaaadp hecyakvfde fkplveepqn likqncelfe
  421 qlgeykfqna llvrytkkvp qvstptlvev srnlgkvgsk cckhpeakrm pcaedylsvv
  481 lnqlcvlhek tpvsdrvtkc cteslvnrrp cfsalevdet yvpkefnaet ftfhadictl
  541 sekerqikkq talvelvkhk pkatkeqlka vmddfaafve kcckaddket cfaeegkklv
  601 aasqaalgl

Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence. using : https://www.bioinformatics.org/sms2/rev_trans.html I found the translation of my Chitodextrinase protein amino acid code to nucleotide.

reverse translation of Untitled to a 1827 base sequence of most likely codons. atgaacaaacgcaccctgctgagcgtgctggtggcgggcgcgtgcgtggcgccgtttatg gcgcaggcggcgagcctgcaggcgaccaccagcgaaccgtataccattaaagcgagcgat ctggcgaaaaaagaaaaagaactgaccagctttccgctgatggcgagcgtgaaagaaacc attcagaccctggataacgcgcaggtggaactgattgaaccgggccgcgcggcgaacccg gataacgtgaaacgcgtggaaggcattgtgaaagcgagcgattgggaatatctgtttccg ctgcgcgcgcaggcgtatacctatagcaactttctgaaagcggtgggcaaatttccggcg ctgtgcaaaacctataacgatggccgcgatagcgatgcgatttgccgcaaagaactggcg accatgtttgcgcattttgcgcaggaaaccggcggccatgaaagctggcgcccggaagcg gaatggcgccaggcgctggtgcatgtgcgcgaaatgggctggagcgaaggccagaaaggc ggctataacggcgaatgcaacccggatgtgtggcagggccagacctggccgtgcggcaaa gataaagatggcgattttctgagctattttggccgcggcgcgaaacagctgagctataac tataactatggcccgtttagcgaagcgatgtatggcgatgtgcgcaccctgctggataaa ccggaactggtggcggatacctggctgaacctggcgagcgcgatttttttttttgcgtat ccgcagccgccgaaaccgagcatgctgcaggtgattgatggcacctggcagccgaacgat catgataaagcgaacggcctggtgccgggctttggcgtgaccacccagattattaacggc ggcgtggaatgcggcggcccgaccgaaattgcgcagagcgaaaaccgcattaaatattat aaagaatttgcgaactatctgaaagtgccggtgccggcgaacgaagtgctgggctgcgcg aacatgaaacagtttgatgaaggcggcgcgggcgcgctgaaaatttattgggaacaggat tggggctggagcgcggataccccgagcggcgcgacctatagctgccagctggtgggctat cagaccccgtttagcgcgtttaaagaaggcgattatagcaaatgcgtgcagaaatttttt aacgtgaacattgtgaacgatgatggcagcgcgaccccggatcagaccccggtgaccccg accccgaccccggcgccgagccaggatgaaaccccggcgccggcgccggtgccggatgaa accccggcggaaccggcggcggtgaaccatgcgccggtggcggatattgcgggcccgatt ggcgcggtggatgcgggcgcgcaggtgagcctgagcgcggaaggcagcaccgatgcggat ggcaacgcgctgacctatacctggcgcagccaggatggccagaccgtgaccggccaggat aaagcggtggtgacctttaaagcgccggaagcggcgaccgcgcagcagattgaaattagc ctgaccgtgagcgatggcgaactgagcagcaccaccagctatctgctgaacgtgaaagcg aaagcggcgccgagccaggatgaaggcaccagcggcaactatgcggcgtggagcgcgaac agcaaatataaagcgggcgatattgtgaacaaccatggcaaactgtttcagtgcaaaccg tttccgtatagcggctggtgcaacaacgcgccggcgtattatgaaccgggcgcgggcctg gcgtgggcggatgcgtggaccgcgctg

Codon optimization.

Different organisms “prefer” different codons for the same amino acid, because their tRNA pools are tuned to their own typical codon usage. If I took a gene from another organism and used it as‑is in a new host, translation could be slow or inefficient, leading to low protein yield or even misfolded, non‑functional protein. Codon optimization rewrites the DNA sequence without changing the amino acid sequence, replacing rare codons with codons that are common in the chosen host to match its tRNA availability and boost expression. I chose to optimize the sequence for E. coli, because it is one of the most widely used and well‑characterized hosts in synthetic biology and industrial protein production, with many tools, protocols, and optimization resources available.

using https://en.vectorbuilder.com/tool/codon-optimization.html I found this codon optimization for E. coli

Improved DNA[1]: GC=78.96%, CAI=0.88 GCCACCGGCGCAGCGTGCGCGGCCGCGTGCGGTTGTGCGTGCTGCTGTACCGGCTGCACCGGCGCCGGCTGTGGCACAGGCTGCACCGGGGGCACCGGCGGCTGCGGCGGCGGTTGCGGCTGCGGCACCGGCTGCGGCACCGGCGGCTGCGGCTGCTGCGGTACAACCACCGCGACCGGGGGCTGTGGCTGTGCCGGCGGTTGCGGCGGCTGCGGCGCGGGCTGCTGCACCGGCTGCGCGGGCGGCTGCGGCGCCTGCTGTGCGTGTTGCGCGGGCTGCGGCGCAGCGTGCTGTGGCACTGCGACCGCCTGCTGTGCGACCACCGCCGCTGCGGGTTGTGGCGCGGGTTGCGGTGCCACCTGTACCGGCGGCTGCGGCGCGGCGGCGGCGGCGGCGGGTGCGGCGGCAGCCGCCGGCGCGGCGTGCACCGGCGCCTGCTGCGCCGGCTGTACCACCACTTGCTGCGGTTGCACGGGCGCGACTGGTGGCTGTGGTGCAGGCTGCGGCACCGGCGCAGCGGCAGGCGCCGCGGCCTGCTGCGCGACCACCTGCGCAGGCGCGTGTTGCTGCACCGGTGGAGCGACCGCCGCGTGCGGCTGTGGTTGCGCGGGTGGTACCGGCGGCGCCGCCTGTACCGGCGCCACCACTGGCGCGGCGTGTTGTGGCGGCGGCTGCTGTGGCTGCGGCTGCGGAGGTTGCGGTGCGGCGTGTTGCTGCGGCGGCGCGACCGCGGCGTGTGGTACCGGCGCAGCCGCTTGCGGTTGCGGCACCGGCGGCGCAGCCGGCGGCTGCGCCACCACCGGTACCGGCGCCGCGGCGGGCTGCGGCGCGGGTTGCGGCGCCACCACGGGCGGCGGCGCGGCCACCGCGACGTGCACCGGTACGACCACCTGCTGTGGCTGCACTGGTTGTGGCTGCGGTTGCGGTTGCGCCGGCGGTTGCGGCACCGCGACCGCGTGTTGTACCGCGACCGCGGGCTGCGCGGCGTGCACCACCACCTGCACCGGTGCGGCGGCGGGCTGCGGCGGCACGGGCGGCGGCTGCGCGGCGGCGACCACGACCTGCTGCGGTGGTTGCGGTTGTACCGGCACCGGCTGCGCGGCCGCGGCATGCTGTACAGCCACCGCGGCCTGCGGCGCGACCGGTGGGTGCTGCGGCTGCGGCGCAACCGCCGGCTGCGGTGCGACCGGCTGCGGTGCAACCACCACCGGTTGCTGCGGCTGCGCGGCCGCCGGCGCGGCATGCACGGGCGGCTGTGGCGCCTGCTGCGCCACCGGCACCACAACCGGGTGCGGCTGCGCAACGACCACCACCGGCTGCGGCTGTGCCGGTGGCGCGGCGGCCTGCTGTGGTGGCTGCGGTGGCTGTTGCGCGACCGGCGCCGCGGCGGGTTGTACCGGTGGTTGCGGTTGCTGCTGCGGCGGAGCGGCGGGCTGCGGCGGCGCGGCGACGGGCGGTTGCGGTTGCTGCGCGGGCGGCTGTGGTTGCACCGGCGGCACTGGCTGCGCGACCGGCACCGGCTGCGGCTGTGGCGCCGCCGCGACCGGCGGCGGCTGCACTGGCGGCGCGGGCTGTGGCGCCGCGGGTGGCTGCTGCGCGGGCGCGGCAGCCGGTGGTTGCGGCGGCTGCACTGCCACTGCGGCGTGCGGTGGCTGCGGCGCGGCGACCGGCTGTGCAGCCTGTTGTTGCGGTGGCGCGACCGGTACCGGCACTGGCGGCTGCGCGGGCGGCGGTTGTTGTGCGGGTGCGTGTTGTACCGGCGGCTGTTGCGGCACGGGCTGCGGCGGCTGTGCGGCGGCAGGCGCGACCGCGGCGGCGGGCGCGACCGGCGGCTGCGGTGCGACCACCACCACCTGCACGGGCGCGGGCTGCACCGCCACCACCACCACCGGCGGCTGTTGTGGCTGCGGCGGCTGCGGCTGCGGTGCCGCGGCCTGCGCGGGTTGCACCGGCGCGGGCTGTACCGCGACCGCGGCCTGTACCGCCACGGCGGCGTGCACTGCGACCGGCGGCTGCTGTTGCGGCACCACCACCGCCGGCTGCGGTGCCGCGGGCTGTGGTGCGACCGGCACCGCCACCGGCGGATGCGGCGCTACCGGCACCGGCTGCGGTTGTGCGTGCTGTTGTACCGGCTGCACGGGCGGCGCGACCGCAGCGGCCTGCTGTGGCGGGGCCGCGTGCACCGGCGGCACCGGCGGCTGCGGCGGCGCCACCGCCTGCTGTACCGGAGGTTGCACGGGCGCGGCGTGTTGCACCGGCGGCTGTGGCGCGGGCTGCGGTTGCGGCGCGACCACCACCACCACCACCACCACCACCACCACCGGCTGTGGTACCGCAACCTGTTGCGGCTGCGCCGGCTGCTGCGGCTGCTGCGGCGCCGCAGCATGTTGTGGCGCGGGTTGTGCAACGGGTTGTACAGGCTGCGCCGGTGGTACCGGCGCGACCACCGGCGCGACCGGTGGCTGTGCGTGCTGTACCGGCGGCTGTGCGGGCTGCTGCGGCGCCGCGTGCGGCGCCACCTGCGCCACCGGCGCCACCGCCGCGGCGGGCTGTGGCGCCGCATGCGGCGGCTGCTGTACCGGCGGCACCGGCTGCTGCGGCGGCGGATGCACCACGACCGGCGGCTGCGGCACCGGTGCGTGCTGCGCGTGCTGCTGCGCGGGCGCGACGACGGCGACCACCGCGGCGTGCGGCGGCTGCGGCGGCTGCGGCACCGGTGGTGCCGCCACCGGCTGTGGCGGTTGCGGCGGTTGTTGCTGTGGTGCCTGCTGTGGCGCGGCCGCCACCACCGGCTGCGGCTGTGCCGGCGCCGGCTGCGGCGCCGCCGCGGCCTGCTGCGGCTGCGCGACCACCGCAGCCGCCACCGCGACCACCGCAACAGCCGCGGCCGGCGCGGCCACCACCACAGGTTGTGGTGCCGCGTGCACCGCGACGTGCACGGGTGCCGCCGCCGGTACCGGCTGCTGCGGTGGCACCGGGTGTTGCGGCGGCTGCGGCGCCGCGTGCGGCGCCGCGGGCACCGGCTGTACCGGTGGCGGTTGCACCGGCTGCGGCTGCGGCGCGGCATGCGCGACCGGCGCGGCCGCATGCGCCGGCACCACCACCGGCGCCACGGGCGCGGCGGGCGGGTGCGGCGGTTGCGGCTGTGGCGGCGGCTGTGGCTGCGGCTGCACCGGAGCAGCCGCGGCGACCACCACCGCGACCACCGGTGGCGGCGCCGCGTGCGCAGGCGGCGCGACCACCGGCGGCGGCGGCTGCACCGGCGGCGCGGGCTGCGGCTGCGGCGGCGCCACCGCCTGCTGTTGCTGTGGCGCCGGTTGCGGTGGCTGTGGTTGCGGTGCGTGCTGTACGGCAACCGCGGGGTGTACCGGCTGCTGCGCGGGCTGCACCGGTGGCACCGGGGGCGGCTGCACCGCCACATGTGCCGGTGCGTGCTGCTGTTGTGGCACCACGACCGCGGGCTGTGGTTGCGGCACCACCACCGCGGCCGCGGGTGCGGCGGGCGGCTGCGGCGCGACCACCGCAACCGCGGGCTGTGCAGCGGCCACCGGCTGCGGCACCGGCTGTGCGGGCGCGGCGGCCACCACCACCACGACCACCGCCGCCTGTGGTACCGGTGCGGCCTGCGCCACCACCGGCACCGGCGCGGCCTGCGGCGCGACCGGGGCGACCGGCGGCTGTGCCGGCTGTGGCTGCGGCGCATGCTGCTGCTGCGGTGGTGCAACCTGCGCGGGCGCGTGCTGTTGTTGCGGCGGCACCGGTGCCTGCTGCTGCTGTGGCGCGTGCTGCTGCTGTGGCGCCTGCTGCTGCTGCGGCGGCTGCGGCTGTTGCGGTGCCGGCTGTTGCGCAGGCGGCGCGACCGGTGCCGCGGCGTGCTGCTGTTGTGGTGGTTGCGGTTGCTGCGGCGGTTGTGGCTGCTGCGGCGGTACCGGCTGTTGCGGCGGTGCCACCGGCGCCGCGGCATGCTGCTGCTGTGGCGGTTGCGGCGGCGCCGCGTGCTGCGGCGGTTGCGGCGGCTGTGGCGGTACAGGCGCCGCGTGCTGCGCAACCGGCTGTGGCTGCTGCGGCGGCACCGGCGGTTGCGGTGGTGCCACCGCCACCACCGGCTGCGGCGGCGGCTGCTGCTGCGGCGCGACCACCGGCGGCTGCGGTTGTGGTGGCACCGGCGGTGCGACCGGCTGTGGCGGCGGTTGTGGTTGCGGCTGCGCGGGCGGCACCGGCGCCGGCTGCTGCACCGGCGCGGGTTGCGGCTGCGGTGGCGCAGCCGGCGGCTGCGCGGGCTGCGCCTGCTGCGGCGCGACCGGCTGCGGTGGCGCCACCGGCGGCTGTGCGGCGTGCGGTTGCGGTTGCACCGGCGCGTGTTGTACCGCGACCGCGTGCTGCACCGGTGGCTGCGGCTGCGCCGGCTGCTGCGCGGGGGGCGCCACCGGCGGCTGCTGCGCGGGCGCCTGTTGCGGCACCGGCGCGTGTTGCGGCGGCTGCTGCGCCGGCGGTGCGACCGCCGCAGCCGGCTGCGGCGGTACCGGCGGCACCGGCGCGTGTTGCACCACCACCGCCGCGGCGGGCTGCGGTTGTTGCGGCGGTGCGGCCGGTTGTGGTGGCTGTGGCGCCTGCTGCGGCTGCGGTTGCGCGGGCTGTGCGGGCGCGACCACCGGTGCAGCCGCGACCACCGCGGGTTGCTGCACGGGTGCGTGCTGTGGCACCGGCGCCGGCTGTGGCGCGACCGGCGGTTGTGGCGCGGCGTGTACCGGCGCGGGTTGTGCGGGCTGCGCGTGCTGCGCGTGTTGCGCCGGCTGCACGGCGACGTGTACCGGTTGCACCGGCGCCGCGTGCGGTACCGGCGCGGCCGCGGGCTGCGGCGCCGCGGCGGGTTGCGGTGGTTGCGGTTGCTGTGGCGCAGGTTGTTGCGCGGGCGGCGCGACCGGCGCCGCGGGAGGCTGCGCCTGCTGCGCCGGCTGCGGCGGCTGCGCCGCGTGTACCGCCACGGGCTGCGGCGGTTGTGGCACCGGCGGCGCGGGTTGCGGCTGCGGCGCGGCCTGTGCAGGCTGCGCCGCCGCGACCGCGACCGCCGCGGCGGGTTGCGGCGGCGGCTGTGGCGCGACCGCGACCACCGGCACCGGCGCCGCGTGCGCGGCGTGCTGTGCAACCGGCGGTTGCGCAGCGGCGTGTACCGGCACCACCACCTGCGCCGGCACCGGCTGTGCGGCTGCGTGCTGCGGCACCACCACCTGTTGTGGCACCGCCACCGCGGGCTGTGGCGGCTGCACCGGCGGCACCGGCTGCGCGGCCTGCGCCGCGTGTGGCTGTGGCTGCTGTGGCGGCTGCGGCACCGCGACCACCGCGACGGGTGCCGCCTGTTGCGGCGGTGGCTGCGGTTGTGGTGGTGGTTGCTGCACCGGCGGCTGTGGCACCGGCGGCGGGTGCGGCGGCGCAACCGGCTGCGGCACCGGCGGTGCGTGTTGCGGTTGCGGTTGCACCGGCTAA

You have a sequence! Now what? What technologies could be used to produce this protein from your DNA? Describe in your words the DNA sequence can be transcribed and translated into your protein. You may describe either cell-dependent or cell-free methods, or both.

You could produce the protein using recombinant expression in cells (like E. coli, yeast, insect, or mammalian cells) or by cell‑free translation systems that use cell extracts. In both cases, your DNA sequence is first transcribed into messenger RNA by RNA polymerase, using base pairing rules (A–U, C–G) to copy the gene. Then ribosomes translate the mRNA: every three nucleotides (a codon) are read in order and matched to a tRNA carrying the corresponding amino acid. As the ribosome moves along the mRNA, it links amino acids together into a polypeptide chain that folds into your functional protein

How does it work in nature/biological systems? Describe how a single gene codes for multiple proteins at the transcriptional level.

A single gene can give multiple proteins mainly through alternative splicing of its pre‑mRNA. During transcription, the whole gene (exons + introns) is copied into a pre‑mRNA. The spliceosome can then join exons together in different combinations (for example skipping an exon, using alternative 5′ or 3′ splice sites, or choosing mutually exclusive exons), producing several distinct mRNA isoforms from the same gene. Each mRNA has a different coding sequence, so translation produces related but different protein isoforms with altered domains or lengths.

4. Prepare a Twist DNA Synthesis Order

Whole process done around the exemple of “Constitutive sfGFP” .fasta and .gbk file in the folder

5. DNA Read/Write/Edit

DNA Read

(i) What DNA would you want to sequence (e.g., read) and why? This could be DNA related to human health (e.g. genes related to disease research), environmental monitoring (e.g., sewage waste water, biodiversity analysis), and beyond (e.g. DNA data storage, biobank).

I would want to sequence metagenomic DNA extracted from the outer skin/surface of our biolab’s mycelium composite object. This would reveal the microbial communities (bacteria, fungi, etc.) colonizing the surface, helping us understand biofilm formation, contamination risks, and natural protection that mycelium provides against external microbes. Additionally, I’d sequence internal DNA from the mycelium-wood composite itself (after surface sterilization) to compare the native microbiome within the material versus surface colonizers. This contrast would show whether the mycelium creates a unique internal niche or if external contaminants penetrate.

(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?

I would use Illumina sequencing for the metagenomic DNA from our biolab’s mycelium composite surface and internal samples. Because Illumina provides high accuracy and important throughput (millions of reads), which is adapted for detecting low-abundance microbes on material surfaces and generating statistically robust community profiles. Short reads (150-300 bp) cover the hypervariable regions of 16S rRNA (bacteria/archaea) and ITS (fungi) needed for species identification.

Is your method first-, second- or third-generation or other? How so? This method is considered second-generation. It sequences millions of DNA fragments simultaneously using PCR-amplified clusters on a flow cell, unlike first-gen Sanger (one fragment at a time) or third-gen single-molecule methods.
What is your input? How do you prepare your input (e.g. fragmentation, adapter ligation, PCR)? List the essential steps. For the imput and preparation the essentials steps are : Input: Total DNA (~10-100 ng) extracted from surface swabs and internal core samples.
Essential preparation steps:

PCR amplification of 16S V3-V4 (~460 bp) and ITS1 (~250 bp) regions using universal primers
Adapter ligation to add Illumina sequencing handles
Library pooling and normalization
Cluster generation on flow cell via bridge amplification

What are the essential steps of your chosen sequencing technology, how does it decode the bases of your DNA sample (base calling)?

Sequencing by synthesis: DNA polymerase adds one fluorescently labeled nucleotide (with reversible terminator) at a time
Each base (A/T/C/G) emits a unique fluorescent color
Imaging: Camera captures light emission after each incorporation
Cleavage: Fluorescent blocker removed, allowing next base addition
Base calling: Software converts fluorescence intensities → ACGT sequence

What is the output of your chosen sequencing technology? a FASTQ files containing:
Millions of short reads (150-300 bp) per sample
Phred quality scores per base (Q30 = 99.9% accuracy)
Processed via QIIME2/DADA2 → taxonomic profiles showing bacterial/archaeal/fungal relative abundances on mycelium surface vs. interior

DNA Write (i) What DNA would you want to synthesize (e.g., write) and why? To be fully honest i’m not so sure so far, I need to take time to do more research to understand more the possibilities and the outcomes it could bring.

(ii) What technology or technologies would you use to perform this DNA synthesis and why?

(ii) What DNA synthesis technology would I use and why?

I could use phosphoramidite-based chemical DNA synthesis the industry standard employed by companies like Twist Bioscience and IDT. This method offers the highest reliability, scalability, and quality control for custom gene synthesis up to 3 kb, with >99% accuracy at the oligo level.

Essential steps:

Solid-phase phosphoramidite synthesis of short oligonucleotides (50-200 bp)
Gibson assembly or Golden Gate to stitch oligos into full-length genes
Error correction via enzymatic selection or hybridization capture
Cloning into a plasmid vector
Sanger sequencing verification

Key limitations:

Error rates accumulate with length (>3 kb requires multi-fragment assembly)
GC-rich, repetitive, or hairpin-forming sequences reduce yield
Cost scales linearly with length (~$0.10-0.20/base)

Why it fits: Despite limitations, gene-scale synthesis (1-3 kb) is now routine and robust enough for most synthetic biology applications. For a caffeine pathway, multiple short genes can be synthesized separately then assembled.

Scaling reality check: Unlike chemical reactors, biological systems face oxygen gradients, metabolic stress, and contamination at scale. Lab success doesn’t guarantee industrial performance, so small-scale mastery comes first.

DNA Edit (i) What DNA would you want to edit and why? Same as for the Write,i’m not sure yet. I don’t think I have a big enough understanding.

(ii) What technology or technologies would you use to perform these DNA edits and why?

I would use *CRISPR-Cas9 genome editing because it enables precise insertion or modification of DNA at specific genomic locations using programmable guide RNAs, making it far more flexible and efficient than older methods like TALENs or ZFNs.

How CRISPR edits DNA:

Design guide RNA (gRNA) to target specific 20-bp genomic sequence (adjacent to PAM site)
Deliver Cas9 protein + gRNA as RNP complex or plasmids into cells
Cas9 creates double-strand break at target site
Provide donor DNA template with desired gene flanked by homology arms
Homology-directed repair (HDR) integrates new sequence during break repair

Required inputs:

Custom gRNA sequence
Cas9 enzyme (plasmid or RNP)
Donor DNA template with homology arms
Competent cells + electroporation/chemical transformation