Week 2 – DNA Read, Write & Edit

Week 2 – Homework

✨ Part 1: Benchling & In‑silico Gel Art ✨

I simulated a restriction digest of λ DNA (the genome of bacteriophage lambda, which infects E. coli) in Benchling using seven restriction enzymes: EcoRI, HindIII, BamHI, KpnI, EcoRV, SacI, and SalI. Each enzyme recognizes its own specific DNA sequence, so the fragment pattern depends on how many of its cut sites are present. Some enzymes generate sticky ends, while others, like EcoRV, create blunt ends. By comparing the band patterns, we can see which enzymes cut the DNA, how many fragments they produce, and estimate fragment sizes, from large fragments (~10 kb) to very small ones (~100 bp). If an enzyme doesn’t cut, the result is a single intact band.
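The logic behind such a virtual digest can be sketched in a few lines of Python. This is only an illustration of the idea, not how Benchling implements it: the `digest` function and the 36-bp `TOY` sequence are made up for the example (it is not λ DNA), while the recognition sites and cut positions are the standard ones for these enzymes.

```python
# Recognition sites and cut offsets (top strand, 5'->3') for the enzymes used above.
# The offset is the number of bases of the site that stay on the left-hand fragment.
ENZYMES = {
    "EcoRI":   ("GAATTC", 1),  # G^AATTC, sticky ends
    "HindIII": ("AAGCTT", 1),  # A^AGCTT, sticky ends
    "BamHI":   ("GGATCC", 1),  # G^GATCC, sticky ends
    "KpnI":    ("GGTACC", 5),  # GGTAC^C, sticky ends
    "EcoRV":   ("GATATC", 3),  # GAT^ATC, blunt ends
    "SacI":    ("GAGCTC", 5),  # GAGCT^C, sticky ends
    "SalI":    ("GTCGAC", 1),  # G^TCGAC, sticky ends
}


def digest(seq, enzyme):
    """Return fragment lengths (left to right) for a linear DNA molecule."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical toy sequence for illustration only (NOT lambda DNA):
# two EcoRI sites, one EcoRV site, no BamHI site.
TOY = "AAAA" + "GAATTC" + "CCCCCCCC" + "GATATC" + "TTTT" + "GAATTC" + "GG"

for name in ("EcoRI", "EcoRV", "BamHI"):
    print(name, digest(TOY, name))
```

All seven sites are palindromic, so scanning one strand is enough. An enzyme with no site returns a single fragment of the full length, which corresponds to the single intact band on the gel.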

✨ Part 3: DNA Design Challenge ✨

3.1. Choose your protein

For this assignment, I chose linalool synthase, an enzyme involved in the biosynthesis of linalool, one of the major aromatic and bioactive compounds found in lavender (Lavandula spp.). Since my bachelor’s thesis focuses on the bioactive components of lavender, including linalool, this protein felt like a meaningful and relevant choice.

Using UniProt, I obtained the amino acid sequence for the linalool synthase I selected:

 >sp|Q2XSC5|LALIN_LAVAN R-linalool synthase OS=Lavandula angustifolia (Lavender) OX=39329 PE=2 SV=1
 MSININMPAAAVLRPFRCSQLHVDETRRSGNYRPSAWDSNYIQSLNSQYKEKKCLTRLEGLIEQVKELKGTKMEAVQQLELIDDSQNLGLSYYFQDKIKHILNLIYNDHKYFYDSEAEGMDLYFTALGFRLFRQHGFKVSQEVFDRFKNENGTYFKHDDTKGLLQLYEASFLVREGEETLEQAREFATKSLQRKLDEDGDGIDANIESWIRHSLEIPLHWRAQRLEARWFLDAYARRPDMNPVIFELAKLNFNIVQATQQEELKALSRWWSSLGLAEKLPFVRDRLVESYFWAIPLFEPHQYGYQRKVATKIITLITSLDDVYDIYGTLDELQLFTNLFERWDNASIGRLPEYLQLFYFAIHNFVSEVAYDILKEKGFTSIVYLQRSWVDLLKGYLKEAKWYNSGYTPSLEEYFDNAFMTIGAPPVLSQAYFTLGSSMEKPIIESMYEYDNILRVSGMLVRLPDDLGTSSFEMERGDVPKSVQLYMKETNATEEEAVEHVRFLNREAWKKMNTAEAAGDSPLVSDVVAVAANLGRAAQFMYFDGDGNQSSLQQWIVSMLFEPYA

3.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence

Using the online tool available at https://proteiniq.io , I reverse‑translated the amino acid sequence of the linalool synthase protein into a corresponding DNA sequence. This process is based on the Central Dogma of Molecular Biology, which states that DNA makes RNA, and RNA makes protein. By working backwards from the protein sequence, the tool generates a plausible nucleotide sequence that could encode the same protein.

I obtained the following reverse‑translated DNA sequence:

Converted Sequence ATGTCAATAAACATAAATATGCCGGCAGCAGCTGTTCTGCGGCCTTTCCGGTGCAGCCAGCTTCATGTCGATGAAACACG AAGATCGGGTAACTATAGGCCCTCTGCCTGGGATTCTAACTACATACAAAGCTTGAACTCACAATATAAGGAGAAGAAGT GCTTAACGAGGCTTGAGGGTCTAATTGAGCAAGTAAAAGAATTAAAAGGGACGAAAATGGAAGCTGTGCAGCAACTGGAA TTGATCGACGATTCCCAAAACCTTGGGCTATCATATTACTTTCAGGATAAAATTAAGCACATTCTCAATCTGATCTACAA CGACCACAAGTATTTTTACGATAGCGAGGCAGAGGGGATGGACTTATATTTCACGGCCTTGGGTTTCCGTCTCTTTCGGC AACACGGGTTCAAGGTCTCACAAGAAGTCTTCGATAGGTTCAAAAATGAAAACGGAACGTACTTTAAACACGATGACACT AAAGGGCTACTTCAGCTTTATGAGGCGTCCTTCTTGGTCCGAGAAGGAGAGGAGACGCTAGAACAAGCACGAGAGTTCGC CACTAAGAGCTTACAGAGAAAACTTGACGAGGACGGAGACGGAATTGACGCGAATATCGAATCATGGATACGGCATAGTC TTGAGATACCCTTGCATTGGCGCGCGCAGCGTCTCGAAGCCCGCTGGTTTTTAGACGCATACGCTCGGAGGCCGGATATG AACCCTGTTATTTTCGAATTGGCTAAGTTAAATTTTAATATTGTGCAAGCAACGCAACAAGAGGAGCTCAAGGCGCTTTC TCGGTGGTGGTCGTCATTGGGCCTAGCCGAGAAACTACCATTTGTGAGAGACAGACTGGTGGAGTCATACTTTTGGGCCA TTCCGCTATTTGAACCACATCAGTACGGTTACCAGCGAAAGGTGGCGACTAAGATAATTACGCTCATAACCTCATTAGAT GACGTCTACGATATCTATGGAACCTTAGACGAATTGCAACTTTTCACCAACCTCTTCGAACGCTGGGATAACGCGTCGAT CGGGAGGCTACCCGAATATCTGCAACTGTTTTACTTTGCGATACACAATTTTGTCAGTGAGGTCGCGTATGATATCCTGA AAGAAAAGGGCTTCACTTCAATAGTATACTTACAAAGAAGTTGGGTTGATTTACTTAAAGGTTACCTCAAGGAAGCTAAA TGGTACAACAGCGGGTATACGCCTTCGCTGGAAGAGTATTTTGACAATGCGTTCATGACGATAGGTGCGCCCCCGGTCCT TTCTCAAGCCTACTTTACACTGGGTTCAAGCATGGAAAAACCCATTATAGAATCCATGTATGAATATGACAATATCCTAC GAGTAAGCGGCATGCTGGTGCGCCTGCCTGACGATTTGGGAACAAGTTCGTTCGAAATGGAGCGCGGGGACGTTCCTAAA TCCGTCCAGCTCTACATGAAGGAGACCAATGCAACTGAAGAGGAAGCAGTAGAACATGTGCGCTTTCTGAACAGGGAGGC TTGGAAAAAAATGAACACTGCTGAGGCTGCGGGCGACTCGCCGTTAGTGTCCGACGTAGTTGCTGTAGCAGCCAATCTAG GACGCGCAGCGCAATTTATGTATTTCGACGGAGATGGCAATCAATCCTCGTTGCAACAGTGGATTGTGTCCATGCTTTTC GAGCCATATGCA
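Reverse translation does not have a unique answer, because most amino acids are encoded by several codons. The sketch below illustrates this with the standard genetic code. The function names are made up for the example, and picking the first listed codon for each amino acid is an arbitrary choice, not the rule the online tool uses.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Invert the table: amino acid -> list of synonymous codons.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate a DNA coding sequence (length a multiple of 3) to protein."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def reverse_translate(protein):
    """Return ONE possible DNA sequence (first synonymous codon per residue)."""
    return "".join(SYNONYMS[aa][0] for aa in protein)


def count_encodings(protein):
    """Number of distinct DNA sequences that encode the same protein."""
    return prod(len(SYNONYMS[aa]) for aa in protein)


peptide = "MSININ"  # first six residues of the linalool synthase above
dna = reverse_translate(peptide)
print(dna, translate(dna) == peptide, count_encodings(peptide))
```

Even this six-residue stretch has hundreds of valid encodings, which is why a reverse-translated sequence is only one plausible candidate and why codon choice can be tuned in the next step.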

3.3. Codon optimization

1. Why do we optimize codons?
  • Different organisms prefer different codons for the same amino acid. When a gene from one species is expressed in another, the codon usage may not match the host’s preferences, which can slow down translation and reduce protein expression.
  • Codon optimization rewrites the DNA sequence using codons that the host organism uses frequently, without changing the amino acid sequence. This can increase translation efficiency and overall protein yield, and in some cases mRNA stability. It is a standard technique in biotechnology to improve recombinant protein production.
2. Which organism did you choose and why?
  • I chose Nicotiana tabacum for codon optimization because it is a plant species, just like lavender, the natural source of linalool synthase. Since both are plants, their codon usage patterns are more similar, making N. tabacum a more suitable host for expressing a plant-derived enzyme than bacteria or yeast.
  • In addition, several studies on related Nicotiana species show that linalool plays an important ecological role in plant defense. For example, Nicotiana attenuata emits (S)-(+)-linalool to attract predators of herbivores such as Manduca sexta, reducing leaf damage. Linalool is also known to have insecticidal and repellent properties in many species, including mosquitoes and agricultural pests.
  • Because N. tabacum is widely used in biotechnology, cosmetics, and pharmaceutical production, enhancing its natural protection against insects through increased linalool production could reduce the need for pesticides. Introducing a codon‑optimized linalool synthase gene from lavender into tobacco could therefore help the plant produce higher levels of linalool and benefit from its natural repellent and defensive properties.
Codon Optimization Using Two Different Tools

To ensure that my codon optimization results were reliable and not dependent on a single algorithm, I performed the optimization using two independent online tools, each of which uses different reference datasets and calculation methods for Nicotiana tabacum. The CAI (Codon Adaptation Index) is calculated against each tool's own reference codon-usage table, and each tool selects different synonymous codons, so the reported CAI and GC% values differ between platforms, which is expected.

Tool 1 — VectorBuilder Codon Optimization

https://en.vectorbuilder.com/tool/codon-optimization/59fae592-9784-4c9e-976c-f649a1865c8f.html

| Metric | Before Optimization | After Optimization |
| --- | --- | --- |
| CAI | 0.67 | 0.88 |
| GC content | 46.34% | 38.06% |

This tool showed a clear improvement in CAI, indicating that the optimized sequence is much better adapted to the codon usage preferences of Nicotiana tabacum.

Tool 2 — NovoPro Codon Optimization

https://www.novoprolabs.com/tools/codon-optimization

| Metric | Before Optimization | After Optimization |
| --- | --- | --- |
| CAI | 0.86 | 0.81 |
| GC content | 40.37% | 42.55% |

In this case, the original sequence was already well adapted according to NovoPro’s reference dataset. The optimized version scored a slightly lower CAI (0.81 vs. 0.86), with a slightly higher GC content (42.55% vs. 40.37%).
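To make the two reported metrics concrete, here is a minimal sketch of how GC content and CAI are usually computed. CAI is the geometric mean of each codon's relative adaptiveness, that is, its usage divided by the usage of the most common synonymous codon in the reference set. The weights in `TOY_WEIGHTS` are invented for illustration and are not real Nicotiana tabacum values. Neither tool's actual reference table is known here, which is also why their CAI numbers are not directly comparable.

```python
from math import exp, log


def gc_content(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def cai(seq, weights):
    """Codon Adaptation Index: geometric mean of per-codon weights.

    `weights` maps codon -> relative adaptiveness (0 < w <= 1).
    Codons missing from `weights` (e.g. ATG, TGG, stops) are skipped.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    logs = [log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return exp(sum(logs) / len(logs))


# Hypothetical weights for two amino acids (Lys, Glu); NOT real tobacco data.
TOY_WEIGHTS = {"AAA": 0.5, "AAG": 1.0, "GAA": 1.0, "GAG": 0.8}

print(cai("AAGGAA", TOY_WEIGHTS))   # only the preferred codons
print(cai("AAAGAG", TOY_WEIGHTS))   # only the less preferred codons
print(gc_content("AAGGAA"))
```

Both toy sequences encode the same dipeptide (Lys-Glu), yet they score differently, which is the effect codon optimization exploits.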

3.4. You have a sequence! Now what?

Now that I have my optimized DNA sequence, the next step is to think about how this DNA could actually be used to produce the protein. In general, the process is the same in any biological system: the DNA is transcribed into mRNA, and then the mRNA is translated by ribosomes into the protein.

One way to do this is by expressing the gene directly in a plant, such as Nicotiana tabacum. To get the gene into the plant genome, a genome‑editing tool like CRISPR–Cas9 could be used. Cas9 can make a cut at a specific location in the plant’s DNA, and then my optimized gene can be inserted at that site. After the gene is integrated, the plant’s own machinery will read it, make mRNA from it, and then produce the protein.

Another option would be to use cell‑free expression systems or express the gene in E. coli, for example, but plant expression is especially relevant when the protein is naturally part of a plant pathway.

Overall, once the DNA is inside the host (either a cell or a cell‑free system), the basic flow is the same: DNA → mRNA → protein.

DNA → RNA → Protein

3.5. [Optional] How does it work in nature/biological systems?

1. Describe how a single gene codes for multiple proteins at the transcriptional level.
  • A gene is first transcribed into a long RNA molecule called pre‑mRNA. This pre‑mRNA contains exons, which are kept, and introns, which are removed.
  • Alternative splicing: The cell can splice (cut and join) the exons in different combinations. Different exon combinations give different mRNA molecules.
  • Different mRNAs → different proteins: Each mRNA variant is translated into a protein. Because the set of included exons changes (their order does not), the amino acid sequence changes too, so one gene can produce multiple proteins.
DNA
EXON1 — intron — EXON2 — intron — EXON3
↓ alternative splicing
mRNA Variant 1
EXON1 + EXON2 + EXON3
mRNA Variant 2
EXON1 + EXON3
↓ translation
Protein 1
Protein 2
2. Try aligning the DNA sequence, the transcribed RNA, and also the resulting translated Protein!!!
In nature, DNA is double-stranded, but only one strand is used as the template during transcription. RNA polymerase reads the template strand (3′→5′) and synthesizes a single-stranded RNA molecule based on base complementarity: A pairs with U (because RNA has no T), T pairs with A, G with C, and C with G.

After transcription, the mRNA is read in groups of three nucleotides called codons. Each codon corresponds to one amino acid. During translation, tRNA molecules bring amino acids to the ribosome by matching their anticodon to each codon on the mRNA. As amino acids join together, they form the polypeptide chain — the protein.
DNA 5′→3′: ATG TCA ATA AAC ATA AAT
DNA 3′→5′: TAC AGT TAT TTG TAT TTA
RNA 5′→3′: AUG UCA AUA AAC AUA AAU
AA: M S I N I N
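
The alignment above can be reproduced with a short script. This is a sketch using the standard genetic code and the first 18 nucleotides of my reverse‑translated sequence. The function names are my own, and the template strand is handled as a string written in the 3′→5′ direction so that it lines up position by position with the coding strand.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def template_strand(coding_5to3):
    """Template strand written 3'->5', aligned under the coding strand."""
    return coding_5to3.translate(DNA_COMPLEMENT)


def transcribe(template_3to5):
    """mRNA (5'->3') made by base-pairing against the template strand."""
    return template_3to5.translate(RNA_COMPLEMENT)


def translate(mrna):
    """Translate mRNA codon by codon ('*' marks a stop codon)."""
    dna = mrna.replace("U", "T")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def spaced(seq):
    """Insert a space between codons for display."""
    return " ".join(seq[i:i + 3] for i in range(0, len(seq), 3))


coding = "ATGTCAATAAACATAAAT"
template = template_strand(coding)
mrna = transcribe(template)
protein = translate(mrna)

print("DNA 5'->3':", spaced(coding))
print("DNA 3'->5':", spaced(template))
print("RNA 5'->3':", spaced(mrna))
print("AA:        ", "   ".join(protein))
```

The mRNA ends up identical to the coding strand except that U replaces T, which is why the coding strand is the one normally written out for a gene.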

✨ Part 4: Prepare a Twist DNA Synthesis Order ✨

I created my Twist Bioscience account, and I already had a Benchling account from the previous step. In Benchling, I created a new DNA sequence named Linalool_E.coli, where I inserted the codon‑optimized DNA sequence of my gene of interest (Linalool synthase), optimized for E. coli. Before the coding sequence, I added the following genetic elements:

  • Promoter (BBa_J23106)
 TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC 
  • RBS (BBa_B0034 – ribosome binding site)
 CATTAAAGAGGAGAAAGGTACC 
  • Start Codon (ATG)
 ATG 
  • Coding Sequence (Linalool synthase, codon‑optimized for E. coli)
AGCATCAACATTAATATGCCGGCGGCGGCGGTGCTGCGCCCGTTTCGTTGCAGCCAGCTGCACGTTGATGAAACCCGTCGCAGCGGCAATTATCGTCCGAGCGCGTGGGATAGCAATTATATTCAGAGCCTGAATAGCCAGTATAAAGAAAAAAAATGTCTGACCCGCCTGGAAGGCCTGATTGAACAGGTGAAAGAACTGAAAGGCACCAAAATGGAGGCGGTGCAGCAGCTGGAACTGATCGATGATAGCCAGAATTTGGGCCTGAGCTATTATTTTCAGGATAAAATTAAACATATTCTGAACCTGATTTATAACGATCATAAATATTTTTACGATAGCGAAGCGGAAGGCATGGACCTGTACTTTACCGCGCTTGGCTTTCGCCTGTTTCGCCAGCATGGCTTTAAAGTGTCGCAGGAAGTGTTTGATCGCTTTAAAAACGAAAATGGCACCTATTTTAAACATGATGATACCAAAGGTCTGCTGCAGCTGTATGAAGCGAGCTTTCTGGTTCGCGAAGGCGAAGAAACCTTAGAACAGGCCCGCGAATTCGCGACGAAAAGCCTGCAGCGCAAACTGGATGAAGATGGCGATGGCATTGATGCGAACATTGAAAGCTGGATTCGCCACAGCCTGGAAATTCCGCTGCATTGGCGTGCGCAGCGCCTGGAAGCCCGCTGGTTTCTGGATGCCTACGCGCGCCGCCCGGATATGAATCCGGTGATTTTCGAACTGGCCAAACTGAACTTTAACATTGTGCAGGCGACCCAGCAAGAAGAACTGAAAGCGCTGAGCCGCTGGTGGAGCTCTCTGGGCCTGGCAGAAAAACTGCCGTTTGTGCGTGATCGTCTGGTGGAAAGCTATTTCTGGGCGATTCCGCTGTTTGAACCGCATCAGTATGGCTATCAGCGCAAAGTCGCGACCAAAATTATTACCCTGATTACCAGCCTGGATGATGTCTATGATATTTATGGCACCCTGGATGAACTGCAGCTGTTCACGAATTTATTTGAACGTTGGGATAACGCGAGCATTGGTCGCCTGCCGGAATATCTGCAGCTGTTCTATTTCGCGATCCATAATTTTGTGTCGGAAGTGGCCTATGATATTCTGAAAGAAAAAGGCTTTACCAGCATTGTGTACCTGCAGCGCTCCTGGGTGGATCTGCTGAAAGGCTACCTGAAAGAAGCGAAATGGTATAATTCAGGCTATACCCCGAGCCTGGAAGAATATTTTGATAATGCCTTCATGACGATTGGCGCCCCTCCGGTGCTGTCGCAGGCCTATTTCACCCTGGGCAGCAGCATGGAGAAACCGATTATTGAAAGCATGTATGAATATGATAATATTCTGCGTGTGAGCGGCATGCTGGTTCGCCTGCCGGATGATCTGGGCACCAGCAGTTTTGAAATGGAGCGCGGCGATGTGCCGAAAAGCGTGCAGCTGTACATGAAAGAAACCAACGCCACCGAAGAAGAAGCCGTGGAACATGTGCGCTTCCTGAATCGCGAAGCGTGGAAAAAAATGAATACCGCGGAAGCAGCGGGTGACAGCCCGCTGGTAAGCGATGTGGTGGCGGTGGCCGCGAACCTGGGCCGCGCAGCGCAGTTCATGTATTTTGATGGCGATGGCAACCAGAGCTCACTGCAGCAGTGGATTGTGAGCATGCTGTTTGAACCGTATGCG

After the coding sequence, I added:

  • 7×His Tag
 CATCACCATCACCATCATCAC 
  • Stop Codon (TAA)
 TAA 
  • Terminator (BBa_B0015)
 CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA 
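
Before ordering, it can help to sanity‑check that the parts are joined in the intended order and that the open reading frame stays in frame through the His tag to the stop codon. The sketch below does this with the part sequences listed above. `build_cassette` and `check_orf` are hypothetical helper names, and `TOY_CDS` is only the first 18 nt of the coding sequence, used as a stand‑in to keep the example short.

```python
PROMOTER = "TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC"          # BBa_J23106
RBS = "CATTAAAGAGGAGAAAGGTACC"                            # BBa_B0034
START = "ATG"
HIS_TAG = "CATCACCATCACCATCATCAC"                         # 7x His
STOP = "TAA"
TERMINATOR = ("CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTG"
              "TTTGTCGGTGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGC"
              "GTTTATA")                                  # BBa_B0015
STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_cassette(cds):
    """Concatenate the parts in 5'->3' order around the coding sequence."""
    return PROMOTER + RBS + START + cds + HIS_TAG + STOP + TERMINATOR


def check_orf(cds):
    """Return a list of problems found in START + cds + HIS_TAG + STOP."""
    orf = START + cds + HIS_TAG + STOP
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    problems = []
    if len(orf) % 3 != 0:
        problems.append("length not a multiple of 3")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("premature stop codon")
    return problems


# Stand-in for the full coding sequence: its first 18 nt only (illustration).
TOY_CDS = "AGCATCAACATTAATATG"

cassette = build_cassette(TOY_CDS)
print(len(cassette), check_orf(TOY_CDS))
print(all(HIS_TAG[i:i + 3] in {"CAT", "CAC"} for i in range(0, len(HIS_TAG), 3)))
```

Running `check_orf` on the full coding sequence instead of `TOY_CDS` would flag a frame shift or an internal stop codon before the construct is submitted.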

The resulting linear map can be seen in the first image, next to this text, and the second one is the visual diagram.


✨ This is the plasmid I just built! ✨

✨ Part 5: DNA Read/Write/Edit ✨

5.1 DNA Read

I would choose to sequence DNA from plants that naturally show resistance to agricultural pathogens, as well as DNA from the pathogens themselves. Understanding the genetic basis of plant immunity — for example, genes involved in pathogen recognition, antimicrobial compound production, or stress signaling — can help identify natural strategies that crops use to defend themselves without relying on chemical pesticides.
This connects directly with my own project, where I designed a plasmid for the biosynthesis of linalool in E. coli. Linalool is a naturally occurring monoterpene found in many aromatic plants, and it is known to have antimicrobial and insect‑repellent properties. By studying the DNA of pathogen‑resistant plants, we can discover how these organisms use compounds like linalool or related molecules as part of their defense systems.
Sequencing both plant and pathogen DNA would therefore support sustainable agriculture by revealing natural defense pathways that could be enhanced, transferred, or synthetically produced — reducing the need for synthetic insecticides and promoting more resilient crop systems.
Sequencing technology I would use

To sequence plant resistance genes and agricultural pathogens in a way that is fast, portable, and useful directly in the field, I would use Oxford Nanopore MinION. This technology allows rapid, on‑site DNA analysis, which supports sustainable agriculture by allowing early pathogen detection and reducing unnecessary pesticide use.

1. Generation: Oxford Nanopore MinION is a third‑generation method. It reads DNA directly by measuring small electrical changes as the strand passes through a nanopore.
2. Input & Preparation: The input is purified DNA from plant tissue or pathogens.

Preparation steps:
• DNA extraction
• Optional DNA cutting/fragmentation
• Add Nanopore adapters
• Load the sample into the MinION
• Sequencing starts
3. How it reads the DNA: The DNA strand moves through a nanopore. Each base changes the electrical signal slightly. The device reads these signal patterns and the software turns them into A, T, C, or G.
4. Output: The output is the actual DNA reading: long sequences plus quality scores (FASTQ files). This shows exactly which bases were detected in the sample.
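
As an illustration of what that output looks like, the sketch below parses FASTQ records and decodes their quality strings. It assumes the usual four‑line record layout and Phred+33 encoding. The two reads in `TOY_FASTQ` are invented, and averaging Phred scores directly is a simplification (a stricter approach averages the underlying error probabilities).

```python
from io import StringIO


def read_fastq(handle):
    """Yield (name, sequence, phred_scores) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().strip()
        if not header:
            break
        seq = handle.readline().strip()
        handle.readline()                      # '+' separator line
        qual = handle.readline().strip()
        yield header[1:], seq, [ord(ch) - 33 for ch in qual]


# Invented example data: 'I' encodes Q40, '5' encodes Q20, '+' encodes Q10.
TOY_FASTQ = "@read1\nACGT\n+\nII55\n@read2\nGGCCAA\n+\n++++++\n"

records = list(read_fastq(StringIO(TOY_FASTQ)))
for name, seq, quals in records:
    print(name, len(seq), sum(quals) / len(quals))
```

A Phred score Q corresponds to an error probability of 10^(-Q/10), so Q20 means roughly a 1 in 100 chance that the base call is wrong.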

5.2 DNA Write

I would like to synthesize the DNA for a plant gene that naturally produces linalool, a fragrant molecule with mild antimicrobial and insect‑repellent properties. Adding this gene to Nicotiana tabacum could help the plant protect itself better in a natural way, supporting more sustainable agriculture without relying on chemical pesticides.

  1. What are the essential steps of your chosen synthesis method?
I would use commercial DNA synthesis technology, such as the automated chemical DNA writing used by companies like Twist Bioscience. This method can quickly and accurately produce the exact DNA sequence I want, including the gene responsible for linalool production. It is reliable, fast, and ideal for creating small custom DNA fragments for research.
  2. What are the limitations of your synthesis method (if any) in terms of speed, accuracy, scalability?
The main limitations of DNA synthesis are related to length, errors, time, and cost. Longer DNA sequences are harder to synthesize and may take more time to produce. Errors can appear during synthesis, so the final DNA often needs to be checked by sequencing. Finally, one of the biggest limitations is cost, because high‑quality synthesis technologies and equipment are expensive.

5.3 DNA Edit

If I could edit DNA, I would choose to modify the genome of common ragweed (Ambrosia artemisiifolia). I would edit the genes responsible for pollen development and allergenic pollen proteins. I would target two types of DNA regions.

First, I would edit genes controlling pollen formation so the plant becomes male-sterile and produces non-viable pollen. This could be done using modern gene-editing technologies such as CRISPR, which allow precise mutations in specific genes. If the pollen cannot develop properly, the plant would release little or no functional pollen, which would strongly reduce allergy problems and also limit the plant’s uncontrolled spread.

Second, I would consider modifying the DNA coding for the main allergenic pollen proteins, such as the Amb a allergens, so that their structure becomes less likely to trigger immune reactions. Even if allergies cannot be completely removed, reducing both pollen quantity and allergen strength could significantly decrease the public health impact.

AI generated image
The reason I would edit ragweed DNA is that this plant already grows naturally in polluted and disturbed environments and tolerates poor soil conditions. Because of this, it could potentially be used for **opportunistic phytoremediation**, that is, absorbing some heavy metals from contaminated soil without intensive cultivation. Currently, public policy focuses on **elimination** rather than **domestication** of ragweed because of its allergy risk. However, if gene editing reduced pollen hazards and spread, the plant might instead be safely managed and used **ecologically** for environmental cleanup.

Therefore, editing ragweed DNA could transform a harmful invasive species into a controlled plant with potential environmental benefits while reducing risks to human health.


1. How does your technology of choice edit DNA? What are the essential steps?

A: The DNA editing technology I would use for both approaches is CRISPR. For the first approach, I would design a guide RNA targeting the pollen-development genes, let Cas9 cut the DNA at that site, and rely on the cell's error-prone repair to disrupt the gene, producing non-viable pollen. For the second, CRISPR would target the allergen genes, cut the DNA at the epitope-coding regions, and use a repair template to introduce small changes so the protein becomes less allergenic.

2. What preparation do you need to do (e.g. design steps) and what is the input (e.g. DNA template, enzymes, plasmids, primers, guides, cells) for the editing?

A: For the first approach, I would prepare a guide RNA targeting the pollen-development gene, and for the second, a guide RNA targeting the allergen gene along with a DNA repair template for the small changes. The input for the editing includes the plant cells, the Cas9 enzyme, the guide RNAs, and the repair template (for the allergen modification), which together allow the desired DNA changes to be made.

3. What are the limitations of your editing methods (if any) in terms of efficiency or precision?

A: The main limitations of CRISPR are that DNA repair is not always perfect, which can cause unintended mutations, and not all cells may be successfully edited, so the efficiency is less than 100%. Precise changes, like modifying allergen proteins, are harder to achieve than simply turning a gene off, making the method less precise for complex edits.
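
The guide‑design step mentioned in the answers above can be sketched as a search for candidate target sites. The example assumes the commonly used SpCas9 enzyme, which needs a 20‑nt target followed by an NGG PAM. `find_guides` is a hypothetical helper, the toy sequence is invented (it is not a ragweed gene), and only the forward strand is scanned. A real design would also scan the reverse complement and screen for off‑target matches in the genome.

```python
import re


def find_guides(seq, guide_len=20):
    """Return (start, protospacer, PAM) for every guide_len-mer followed by NGG."""
    seq = seq.upper()
    # Lookahead so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Invented toy target: 2 nt of padding, a 20-nt protospacer, a TGG PAM, 2 nt more.
TOY_TARGET = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"

for start, protospacer, pam in find_guides(TOY_TARGET):
    print(start, protospacer, pam)
```

For the allergen edit, the chosen site would need to sit close to the epitope‑coding region so that the repair template can introduce the intended change near the cut.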


References

  1. https://www.benchling.com
  2. https://eu.idtdna.com/page/support-and-education/decoded-plus/codon-optimization-the-basics-explained/
  3. https://doi.org/10.1073/pnas.1818585116
  4. https://doi.org/10.1177/1934578X0800300727
  5. https://nanoporetech.com/platform/technology
  6. https://en.wikipedia.org/wiki/Ragweed#Allergy
  7. https://my.clevelandclinic.org/health/diseases/ragweed-allergy
  8. https://www.aloki.hu/pdf/1302_489512.pdf
  9. https://doi.org/10.1016/j.envpol.2015.11.041

Week 2 – Lecture Prep

AI generated

Questions

1. What’s the most commonly used method for oligo synthesis currently?

Answer: The most commonly used method for oligo synthesis today is phosphoramidite DNA synthesis.

2. Why is it difficult to make oligos longer than 200nt via direct synthesis?

Answer: It is difficult because each chemical synthesis step has less than 100% efficiency. Errors and truncated products accumulate with length, so the yield of correct full-length product drops sharply and oligos longer than ~200 nt become unreliable.

3. Why can’t you make a 2000bp gene via direct oligo synthesis?

Answer: Chemical DNA synthesis has an error rate of about 1 in 100 bases. Over 2000 bases these errors accumulate, so the chance of obtaining an error-free full‑length gene through direct synthesis is practically zero.
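
The arithmetic behind the last two answers can be checked quickly. The sketch assumes that every base is added correctly with the same independent probability, and takes the 1 in 100 error rate quoted above (99% per‑base accuracy) as an illustrative value. `error_free_fraction` is a name made up for this example.

```python
def error_free_fraction(length_nt, per_base_accuracy):
    """Fraction of molecules with no error, assuming independent steps."""
    return per_base_accuracy ** length_nt


for length in (20, 200, 2000):
    fraction = error_free_fraction(length, 0.99)
    print(f"{length:>5} nt: {fraction:.3g}")
```

With these assumptions, only about 13% of 200‑nt molecules are error‑free, and for 2000 nt the fraction falls to roughly 2 in a billion.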