Week 2 HW: DNA READ, WRITE AND EDIT

Part 0: Basics of Gel Electrophoresis

I attended and reviewed the lecture and recitation materials on gel electrophoresis. From these sessions, I learned that gel electrophoresis is a technique used to separate DNA fragments based on size, where negatively charged DNA migrates toward the positive electrode through an agarose matrix. Smaller fragments travel faster than larger ones, allowing visualization of fragment patterns after staining. Understanding these principles was important for designing and interpreting restriction digest patterns in the subsequent gel art exercise.

Part 1: Benchling & In-silico Gel Art

I created a Benchling account and imported the Lambda DNA sequence as described in the lab protocol. I simulated restriction enzyme digestion using multiple enzymes, including EcoRI, EcoRV, KpnI, SalI, SacI, XhoI, and combinations of these enzymes. I explored different digestion conditions to observe how fragment sizes and band patterns changed under various enzyme combinations.

To design the gel art, I iteratively tested different digests and compared band distributions to identify visually interesting patterns. I used Ronan’s gel design approach as inspiration to refine the arrangement and achieve a structured pattern consistent with the artistic style of Paul Vanouse’s Latent Figure Protocol works, where DNA fragment bands are treated as compositional elements.

The final design was constructed using the following ladder and digestion conditions:

Ladder

Life 1 kb Plus ladder

Lane setup

LAMCG-1 — EcoRI
LAMCG-1 — EcoRI + EcoRV
LAMCG-1 — KpnI
LAMCG-1 — EcoRI + KpnI
LAMCG-1 — XhoI
LAMCG-1 — KpnI + SalI
LAMCG-1 — EcoRI
LAMCG-1 — EcoRV + XhoI
LAMCG-1 — EcoRV + SalI
LAMCG-1 — SacI + SacII

Through this process, I evaluated multiple digestion scenarios and selected combinations that produced distinct band spacing and contrast, enabling a visually coherent pattern. The design reflects both the structural diversity of restriction fragments and the concept of DNA as a medium for encoding visual information.

Part 3: DNA Design Challenge

3.1 Choose your protein

I chose mCherry, a red fluorescent protein commonly used as a reporter in synthetic biology and molecular biology experiments. mCherry allows visualization of gene expression and protein localization due to its bright fluorescence and stable folding. Because it is widely expressed in E. coli and frequently used in engineered biological systems, it serves as a practical example for exploring DNA design, reverse translation, and protein expression workflows.

The protein sequence was obtained from NCBI.

UFQ89828.1 mCherry [synthetic construct]
MVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQF
MYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPV
MQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSHN
EDYTIVEQYERAEGRHSTGGMDELYK

3.2 Reverse Translate: Protein to DNA

Using the reverse translation tool provided by Ghent University (GeneCorner), I converted the amino acid sequence into a corresponding nucleotide sequence. Reverse translation applies the principles of the central dogma, where multiple codons can encode each amino acid, allowing computational inference of a possible DNA sequence that produces the protein.

The reverse-translated DNA sequence was:

atggtgagcaaaggcgaagaagataacatggcgattattaaagaatttatgcgctttaaa
gtgcatatggaaggcagcgtgaacggccatgaatttgaaattgaaggcgaaggcgaaggc
cgcccgtatgaaggcacccagaccgcgaaactgaaagtgaccaaaggcggcccgctgccg
tttgcgtgggatattctgagcccgcagtttatgtatggcagcaaagcgtatgtgaaacat
ccggcggatattccggattatctgaaactgagctttccggaaggctttaaatgggaacgc
gtgatgaactttgaagatggcggcgtggtgaccgtgacccaggatagcagcctgcaggat
ggcgaatttatttataaagtgaaactgcgcggcaccaactttccgagcgatggcccggtg
atgcagaaaaaaaccatgggctgggaagcgagcagcgaacgcatgtatccggaagatggc
gcgctgaaaggcgaaattaaacagcgcctgaaactgaaagatggcggccattatgatgcg
gaagtgaaaaccacctataaagcgaaaaaaccggtgcagctgccgggcgcgtataacgtg
aacattaaactggatattaccagccataacgaagattataccattgtggaacagtatgaa
cgcgcggaaggccgccatagcaccggcggcatggatgaactgtataaa

3.3 Codon optimization

The reverse-translated sequence was codon optimized using VectorBuilder for expression in E. coli. Codon optimization improves protein expression by adjusting codon usage to match the host organism’s tRNA abundance, reducing ribosomal pausing and increasing translation efficiency.

I chose E. coli as the expression host because it is a well-characterized organism widely used for recombinant protein production, with fast growth and robust genetic tools.

During optimization, restriction enzyme recognition sites were removed to avoid cloning complications and enable compatibility with common assembly methods. These included:

Golden Gate enzymes (BsaI, BbsI, SapI, Esp3I)
BioBrick enzymes (EcoRI, XbaI, SpeI, PstI)
Common cloning sites (NcoI, NdeI, BamHI, XhoI, HindIII, KpnI, SalI, SacI)
NotI, PacI, PmeI, AscI, FseI, SfiI

ATGGTGAGCAAAGGCGAAGAAGATAACATGGCCATTATTAAAGAATTTATGCGCTTTAAAGTGCACATGGAAGGCAGCGTGAACGGCCATGAATTTGAAATTGAAGGCGAAGGGGAAGGCCGCCCGTACGAAGGCACCCAGACCGCGAAACTGAAAGTGACCAAAGGCGGTCCGCTGCCGTTTGCCTGGGATATTCTGAGCCCGCAGTTTATGTATGGCTCAAAAGCGTATGTGAAACACCCGGCAGATATTCCGGATTATCTGAAACTGAGCTTCCCGGAAGGTTTTAAATGGGAACGTGTAATGAACTTTGAAGATGGTGGCGTGGTAACGGTGACGCAGGATAGCAGCCTGCAAGACGGTGAATTTATCTACAAAGTGAAACTGCGCGGCACCAACTTCCCGAGTGATGGTCCGGTGATGCAGAAAAAAACAATGGGCTGGGAAGCGAGCAGCGAACGCATGTACCCGGAAGATGGGGCGCTGAAAGGTGAAATTAAACAGCGTCTGAAACTGAAAGATGGCGGCCACTACGATGCGGAAGTGAAAACCACCTACAAAGCGAAAAAACCTGTACAGCTGCCGGGCGCATATAACGTGAACATTAAACTGGATATTACCTCACATAATGAAGATTATACCATTGTTGAACAGTACGAACGCGCCGAAGGCCGTCACAGCACCGGGGGCATGGATGAACTGTACAAA

3.4 You have a sequence — now what?

The codon optimized DNA sequence can be inserted into an expression vector under a promoter and transformed into E. coli. The promoter recruits RNA polymerase to transcribe the DNA into mRNA, and the ribosome binds to the ribosome binding site to initiate translation. The mCherry protein is synthesized and folds into a fluorescent structure that emits red light.

Protein production can occur in:

Cell-dependent systems: recombinant expression in E. coli
Cell-free systems: transcription-translation reactions without living cells

These approaches allow controlled production and characterization of the protein.

Part 4: Prepare a Twist DNA Synthesis Order

4.1 Create accounts

I created accounts on Benchling and Twist Bioscience to design and simulate ordering a synthetic DNA construct. These platforms were used to build, annotate, and visualize the expression cassette and to simulate cloning into a plasmid backbone.

4.2 Build your DNA insert sequence

Using Benchling, I created a new DNA sequence with linear topology to represent the insert fragment. The goal was to design an expression cassette that constitutively expresses the mCherry fluorescent protein in E. coli.

I assembled the sequence by placing each genetic element sequentially:

Constitutive promoter (BBa_J23106)
Ribosome binding site (BBa_B0034)
mCherry coding sequence (codon optimized for E. coli)
C-terminal 7×His tag for purification
Stop codon
Transcription terminator (BBa_B0015)

Each component was annotated in Benchling to clearly indicate its function within the construct. The linear map view was used to verify that all features were correctly ordered and properly labeled.

This assembled sequence represents an expression cassette that can drive transcription and translation of mCherry in a bacterial host.

4.5 Import sequence

I uploaded the FASTA file of the annotated mCherry expression cassette to Twist using the nucleotide sequence upload option. This step represents submitting the design for synthesis.

4.6 Choose vector

I selected the pTwist Amp High Copy vector as the cloning backbone. This vector contains:

Ampicillin resistance for selection
High copy origin of replication for strong expression
Standard cloning features

Choosing a high copy vector supports robust production of the mCherry protein in E. coli.

After selecting the vector, I downloaded the full plasmid construct in GenBank format and imported it back into Benchling to visualize the complete plasmid map.

Final construct

The final plasmid contains the mCherry expression cassette inserted into the pTwist Amp High Copy backbone. The plasmid map confirms the presence of the insert, antibiotic resistance marker, and origin of replication, indicating a successful simulated cloning workflow.

This exercise demonstrates the process of designing a synthetic gene, assembling regulatory elements, and preparing a construct for DNA synthesis and expression.

Part 5: DNA Read / Write / Edit

5.1 DNA Read

(i) What DNA would you want to sequence and why?

I would want to sequence microbial DNA from wastewater samples. Wastewater contains DNA from bacteria, viruses, and other microorganisms, making it a powerful way to monitor public health and environmental changes. By sequencing wastewater DNA, we can track the spread of pathogens, detect antibiotic resistance genes, and monitor outbreaks such as viral infections. This approach can provide early warning signals for disease surveillance and help guide public health decisions.

(ii) What sequencing technology would you use and why?

I would use a combination of Illumina sequencing and Oxford Nanopore sequencing.

Illumina provides highly accurate short reads, which are useful for identifying specific genes and variants, while Nanopore sequencing provides long reads that help assemble genomes and detect structural variations.

Generation type

Illumina: Second-generation sequencing
Nanopore: Third-generation sequencing

Input and preparation

Input would be extracted DNA from wastewater samples.

Preparation steps include:

DNA extraction
Fragmentation
Adapter ligation
PCR amplification (for Illumina)
Library preparation

Essential sequencing steps

For Illumina:

DNA fragments bind to flow cell
Bridge amplification forms clusters
Sequencing by synthesis detects fluorescent signals as nucleotides are incorporated

For Nanopore:

DNA passes through a nanopore
Changes in electrical current are measured
Base calling software converts signals into sequence

Output

The output is a digital DNA sequence (FASTQ/FASTA files) that can be analyzed to identify organisms and genes present in the sample.

5.2 DNA Write

(i) What DNA would you want to synthesize and why?

I would like to synthesize a genetic circuit that expresses the mCherry fluorescent protein in response to environmental stress signals. This could serve as a biosensor to detect changes such as oxidative stress or toxic compounds. Such circuits could be used for environmental monitoring or research applications.

Example sequence component:

Promoter
Regulatory element
mCherry coding sequence
Terminator

(ii) What technology would you use and why?

I would use phosphoramidite chemical DNA synthesis followed by gene assembly methods such as Gibson Assembly or Twist Bioscience gene synthesis.

These methods allow precise construction of custom DNA sequences with high accuracy.

Essential steps

Chemical synthesis of oligonucleotides
Assembly into full-length gene
Sequence verification
Cloning into vector

Limitations

Error rates increase with longer sequences
Cost increases for large constructs
Synthesis length limits
Need for sequence verification

5.3 DNA Edit

(i) What DNA would you want to edit and why?

I would be interested in editing genes related to genetic diseases, such as correcting mutations that cause inherited disorders. For example, editing disease-associated genes could help develop gene therapies. Beyond medicine, editing microbial genomes to improve metabolic pathways could help in sustainable biomanufacturing.

(ii) What technology would you use and why?

I would use CRISPR-Cas9 because it is precise, programmable, and widely used for targeted genome editing.

How it edits DNA

CRISPR uses a guide RNA to direct the Cas9 enzyme to a specific DNA sequence, where it creates a double-strand break. The cell’s repair machinery then repairs the DNA, allowing insertion, deletion, or correction.

Preparation and input

Guide RNA design
Cas9 protein or plasmid
Donor template (if precise repair needed)
Target cells

Limitations

Off-target effects
Variable editing efficiency
Delivery challenges
Ethical considerations

Overall, DNA read, write, and edit technologies provide powerful tools for understanding and engineering biological systems, with applications ranging from medicine to environmental monitoring.