Week 2 HW: DNA Read, Write & Edit

Part 1: Benchling & In-silico Gel Art


Part 3: DNA Design Challenge

3.1 Choose your protein

Erythropoietin (EPO) is a hormone that stimulates red blood cell production. I selected it because:

  1. Vital in anemia therapy

  2. High-value biotechnology protein

  3. Relevant to the pharmaceutical industry

Erythropoietin:

>sp|P01588|EPO_HUMAN Erythropoietin OS=Homo sapiens OX=9606 GN=EPO PE=1 SV=1
MGVHECPAWLWLLLSLLSLPLGLPVLGAPPRLICDSRVLERYLLEAKEAENITTGCAEHC
SLNENITVPDTKVNFYAWKRMEVGQQAVEVWQGLALLSEAVLRGQALLVNSSQPWEPLQL
HVDKAVSGLRSLTTLLRALGAQKEAISPPDAASAAPLRTITADTFRKLFRVYSNFLRGKL
KLYTGEACRTGDR

3.2 Reverse Translate: Protein sequence to DNA sequence

Erythropoietin DNA sequence:

>sp|P01588|EPO_HUMAN Erythropoietin OS=Homo sapiens OX=9606 GN=EPO PE=1 SV=1
ATGGGGGTGCACGAATGCCCAGCATGGTTGTGGCTACTATTGAGCCTTCTGTCCTTGCCCTTAGGTCTCCCTGTACTTGG
GGCGCCCCCCCGACTAATATGTGACTCGCGGGTTTTAGAGCGGTACCTGTTGGAAGCAAAAGAAGCGGAAAATATCACTA
CTGGCTGCGCTGAACATTGTTCCTTAAATGAGAATATCACAGTTCCCGACACCAAGGTAAATTTTTATGCGTGGAAACGC
ATGGAGGTTGGCCAACAAGCAGTCGAAGTTTGGCAGGGGTTAGCGCTACTTTCTGAGGCAGTGCTTAGAGGCCAGGCATT
GTTAGTAAATTCAAGCCAGCCTTGGGAGCCTCTACAACTTCATGTGGACAAAGCCGTGTCAGGCCTGAGATCCCTAACTA
CGCTCCTCCGCGCGCTAGGAGCGCAAAAAGAGGCTATCAGTCCGCCCGACGCAGCTTCTGCCGCCCCACTCCGTACCATA
ACAGCTGACACTTTCCGAAAACTTTTCAGAGTTTATTCAAACTTCCTACGAGGTAAATTGAAATTATACACTGGCGAAGC
CTGCAGGACTGGGGATCGC
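Reverse translation is not unique, because most amino acids are encoded by more than one codon. The sketch below illustrates this with the standard genetic code. It is only an illustration: the function names are made up for this example, and picking the first listed codon for each residue is an arbitrary choice, not the method used by the tool that produced the sequence above.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group synonymous codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def reverse_translate(protein):
    """Pick the first listed codon for each residue (an arbitrary choice)."""
    return "".join(SYNONYMS[aa][0] for aa in protein)


def count_encodings(protein):
    """Number of distinct DNA sequences that encode the same protein."""
    total = 1
    for aa in protein:
        total *= len(SYNONYMS[aa])
    return total


peptide = "MGVHEC"  # first six residues of the EPO precursor above
dna = reverse_translate(peptide)
print(dna, translate(dna) == peptide, count_encodings(peptide))
```

Even this six-residue peptide has many possible encodings, which is why a reverse-translated sequence should always be translated back and compared with the original protein.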

3.3 Codon Optimization

For this project, Escherichia coli has been chosen as the expression host.

ATGGGTGTGCACGAATGCCCAGCATGGTTGTGGCTACTGTTGAGCCTTCTGTCCTTGCCGTTAGGTCTCCCTGTACTTGGGGCGCCCCCGCGTCTTATTTGTGATTCGCGTGTTCTGGAGCGGTACCTGTTGGAAGCCAAAGAAGCGGAAAATATTACTACCGGCTGCGCTGAACATTGTTCCTTAAATGAAAACATCACAGTTCCGGACACCAAGGTCAACTTTTATGCGTGGAAACGCATGGAGGTCGGCCAACAGGCGGTCGAAGTGTGGCAGGGGCTGGCGCTACTGAGCGAGGCAGTGCTTCGTGGCCAGGCACTGTTAGTAAATAGTAGCCAGCCTTGGGAGCCGCTGCAACTGCATGTGGACAAAGCCGTGTCAGGCCTGCGCTCGCTGACGACGCTCCTCCGCGCGCTGGGAGCGCAGAAGGAAGCTATCAGTCCGCCGGATGCAGCCTCTGCCGCCCCACTGCGTACCATTACCGCTGATACATTCCGAAAACTGTTCCGTGTTTATTCAAACTTTCTGCGCGGTAAACTGAAATTATACACTGGTGAAGCCTGCAGAACGGGCGATCGC
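To see what the optimization changed, the two DNA sequences can be compared codon by codon. The sketch below does this for the first 18 nucleotides of the 3.2 and 3.3 sequences and also reports GC content. The helper names are hypothetical, and the code only compares sequences; it does not perform codon optimization itself.

```python
def gc_content(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def changed_codons(original, optimized):
    """List (codon number, old codon, new codon) for every codon that differs."""
    if len(original) != len(optimized):
        raise ValueError("sequences must have the same length")
    changes = []
    for i in range(0, len(original) - 2, 3):
        old, new = original[i:i + 3], optimized[i:i + 3]
        if old != new:
            changes.append((i // 3 + 1, old, new))
    return changes


before = "ATGGGGGTGCACGAATGC"  # first 18 nt of the sequence in 3.2
after = "ATGGGTGTGCACGAATGC"   # first 18 nt of the optimized sequence in 3.3

print(changed_codons(before, after))
print(round(gc_content(before), 3), round(gc_content(after), 3))
```

In this short window only the second codon changes (GGG to GGT, both glycine), so the encoded protein is unaffected.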


3.4 You have a sequence! Now what?

Once the DNA sequence encoding Erythropoietin (EPO) has been designed and codon-optimized for E. coli, the next step is to produce the protein using an appropriate expression system. Two major technological approaches can be used: cell-dependent expression systems and cell-free expression systems.

A. Cell-Dependent Expression System (Using E. coli)

In this project, a cell-dependent system is used with the pET-28a expression vector.

Step 1: Cloning into pET-28a

The optimized EPO gene is inserted into the Multiple Cloning Site (MCS) of the pET-28a plasmid. The construct includes:

a. T7 promoter

b. Ribosome Binding Site (RBS)

c. Start codon (ATG)

d. EPO coding sequence

e. Stop codon

f. Optional His-tag for purification

This produces the recombinant plasmid pET-28a-EPO.

Step 2: Transformation into E. coli

The recombinant plasmid is introduced into competent E. coli cells (e.g., BL21(DE3)). Transformed cells are selected using kanamycin resistance encoded by the plasmid.

Step 3: Transcription

Upon induction:

  1. The T7 RNA polymerase recognizes the T7 promoter.

  2. The DNA sequence is transcribed into messenger RNA (mRNA).

  3. During transcription:

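    a. Adenine (A) in the DNA template pairs with uracil (U) in the growing RNA

    b. Wherever the DNA coding strand has thymine (T), the mRNA has uracil (U)

The result is an mRNA molecule that is complementary to the DNA template strand and therefore matches the coding strand, with U in place of T.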
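The relationship between the two DNA strands and the mRNA can be sketched in a few lines. This is a simplified illustration with hypothetical function names; both strands are written 5' to 3', and promoter recognition, termination, and everything else about the polymerase are ignored.

```python
# Map each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def transcribe_from_coding(coding_5to3):
    """The mRNA matches the coding strand, with U in place of T."""
    return coding_5to3.replace("T", "U")


def transcribe_from_template(template_5to3):
    """The mRNA is complementary to the template strand."""
    return reverse_complement(template_5to3).replace("T", "U")


coding = "ATGGGT"                      # first two codons of the optimized gene
template = reverse_complement(coding)  # "ACCCAT"
print(transcribe_from_coding(coding), transcribe_from_template(template))
```

Both routes give the same mRNA, which is why the coding strand is the one usually written down for a gene.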

Step 4: Translation

  1. The ribosome binds to the Ribosome Binding Site (RBS).

  2. Translation begins at the start codon (AUG).

  3. Each codon (three nucleotides) is translated into one amino acid.

  4. Transfer RNA (tRNA) delivers the corresponding amino acids.

  5. The polypeptide chain elongates until a stop codon is reached.

This process follows the Central Dogma of Molecular Biology:

DNA → RNA → Protein

The final product is recombinant EPO protein, typically fused with a His-tag for purification.
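The translation steps above can be sketched as a loop over codons. This is a minimal model under stated assumptions: it starts at the first AUG it finds rather than positioning the ribosome through the RBS, it uses the standard genetic code, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code in the RNA alphabet, codons ordered U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_mrna(mrna):
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


# Two untranslated bases, six codons from the optimized gene, a UAA stop, two more bases.
print(translate_mrna("GGAUGGGUGUGCACGAAUGCUAAGG"))
```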

B. Cell-Free Expression System (Alternative Method)

Alternatively, the EPO protein can be produced using a cell-free expression system.

In this system:

a. The DNA template is added directly to a reaction mixture.

b. The mixture contains RNA polymerase, ribosomes, tRNAs, amino acids, and necessary cofactors.

c. Transcription and translation occur in vitro (outside living cells).

Advantages:

a. Faster protein production

b. No need for cell transformation

c. Suitable for rapid screening

However, for large-scale production, cell-dependent systems are generally preferred.

3.5 [Optional] How does it work in nature/biological systems?

In human cells:

a. Alternative splicing generates isoforms.

b. RNA editing may modify nucleotides post-transcription.

c. Post-translational modifications (e.g., glycosylation in EPO) alter protein stability and function.

In contrast, E. coli:

a. Does not perform alternative splicing.

b. Does not process introns.

c. Does not perform complex glycosylation.

Thus, recombinant EPO produced in E. coli may differ structurally from native human EPO.
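One way to see where glycosylation could matter is to scan the protein for the N-linked glycosylation sequon N-X-S/T (X is any residue except proline). The sketch below uses a regular expression for this. It is a rough illustration with a hypothetical function name: a sequon only marks a potential site, and the scan says nothing about O-linked glycosylation or about whether a site is actually used.

```python
import re

# N, then any residue except P, then S or T. The lookahead allows overlapping hits.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_n_glyc_sequons(protein):
    """Return 1-based positions of asparagines in N-X-S/T motifs."""
    return [match.start() + 1 for match in SEQUON.finditer(protein)]


# Fragment copied from the EPO precursor sequence in 3.1.
fragment = "EAKEAENITTGCAEHCSLNENITVPD"
print(find_n_glyc_sequons(fragment))
```

Positions are relative to the fragment passed in, so running the function on the full sequence from 3.1 gives positions in the precursor numbering.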

5.1 DNA Read

(i) What DNA would you want to sequence and why?

Primary DNA to Sequence in This Project

I would sequence:

  1. The recombinant pET-28a-EPO plasmid

  2. The EPO coding sequence (CDS)

  3. The promoter–insert junction regions

Why?

a. To confirm that the EPO gene was inserted correctly.

b. To verify that no mutations occurred during gene synthesis or cloning.

c. To confirm the correct reading frame with the His-tag.

d. To ensure no premature stop codons or frameshifts are present.
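Once a sequencing read is available, the last checks in this list can be automated. The sketch below is a minimal example with a hypothetical function name; it checks only the coding sequence itself and does not verify the reading frame relative to the vector's His-tag.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_cds(cds):
    """Return a list of problems found in a coding sequence (empty if none)."""
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length is not a multiple of 3 (possible frameshift)")
    if not cds.startswith("ATG"):
        problems.append("does not start with ATG")
    # Split into complete codons; trailing bases that do not fill a codon are ignored.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    internal = [n + 1 for n, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]
    if internal:
        problems.append(f"premature stop codon at codon(s) {internal}")
    return problems


print(check_cds("ATGGGTTAA"))     # clean toy sequence
print(check_cds("ATGTAAGGTTAA"))  # stop codon in the second position
```

Note that the sequences in 3.2 and 3.3 end with the codon for the last amino acid (CGC), so a stop codon such as TAA would have to be appended before this check passes.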

Beyond this project, sequencing could be applied to disease-associated genes in human health (e.g., cancer mutations).


(ii) What sequencing technology would you use and why?

For this project, I would use:

Sanger Sequencing

Why?

a. The EPO gene is relatively short (~500–600 bp coding region).

b. Plasmid verification is well suited for Sanger sequencing.

c. High accuracy for single-gene validation.

d. Cost-effective for small constructs.

Classification

a. First-generation sequencing

b. Uses chain-termination chemistry

c. Produces highly accurate reads (~700–1000 bp per read)

Input: Purified plasmid DNA (pET-28a-EPO)

Preparation Steps:

  1. Plasmid extraction from E. coli

  2. Primer design (forward and reverse primers flanking insert)

  3. Cycle sequencing reaction (PCR-like thermal cycling) with fluorescently labeled dideoxynucleotides (ddNTPs)
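For the primer design step, a forward primer can be taken from the start of the region and a reverse primer from the reverse complement of its end. The sketch below also estimates melting temperature with the Wallace rule, Tm = 2(A+T) + 4(G+C) °C, which is only a rough guide for short oligos. The function names and the 18 nt primer length are assumptions made for this example, not a validated primer design.

```python
# Map each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Rough melting temperature in degrees C (Wallace rule)."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def flanking_primers(region, length=18):
    """Forward primer from the start, reverse primer from the end of a region."""
    forward = region[:length]
    reverse = reverse_complement(region[-length:])
    return forward, reverse


forward = "ATGGGTGTGCACGAATGC"  # first 18 nt of the optimized EPO sequence
print(forward, wallace_tm(forward))
print(flanking_primers("ATGCCCGGGTGA", length=3))
```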

Essential Steps of Sanger Sequencing

  1. DNA denaturation

  2. Primer annealing

  3. DNA polymerase extension

  4. Random incorporation of fluorescently labeled ddNTPs

  5. Chain termination

  6. Capillary electrophoresis separation

  7. Laser detection of fluorescent signals

Base Calling: Each ddNTP is labeled with a different fluorescent dye. When one is incorporated, elongation stops. Fragments of different lengths are separated and detected. The emitted fluorescence determines the base identity (A, T, C, or G).

Output: Chromatogram (electropherogram), DNA sequence file (.ab1 or .seq), base quality scores
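The logic of base calling can be shown with a toy model: each terminated fragment has a length and a dye, and reading the dyes in order of increasing length gives the sequence. The dye-to-base mapping below is hypothetical, and real base callers work on noisy fluorescence traces and assign quality scores, which this sketch ignores.

```python
# Hypothetical dye assignment, one colour per ddNTP.
DYE_TO_BASE = {"green": "A", "red": "T", "blue": "C", "black": "G"}


def call_bases(fragments):
    """Read a sequence from (fragment length, dye colour) pairs given in any order."""
    # Shorter fragments migrate faster and reach the detector first.
    ordered = sorted(fragments, key=lambda fragment: fragment[0])
    return "".join(DYE_TO_BASE[dye] for _, dye in ordered)


# Three fragments that ended at consecutive positions after the primer.
print(call_bases([(23, "red"), (21, "green"), (22, "black")]))
```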

5.2 DNA Write

(i) What DNA would you want to synthesize and why?

In this project, I would synthesize:

Codon-optimized human EPO gene for E. coli

Purpose:

a. Produce recombinant EPO for research applications.

b. Study protein folding and expression optimization.

(ii) What technology would you use for DNA synthesis?

Chosen Technology:

Chemical DNA synthesis followed by gene assembly, ordered from a commercial gene synthesis provider that synthesizes the DNA de novo.

Essential Steps of DNA Synthesis

  1. Oligonucleotide synthesis (phosphoramidite chemistry)

  2. Assembly of short oligos into full-length gene

  3. Error correction (if necessary)

  4. Cloning into plasmid backbone (pET-28a)

  5. Sequence verification
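The second step, building a gene from short oligos, relies on neighbouring oligos sharing overlapping ends. The sketch below splits a sequence into overlapping pieces and joins them again. It is a simplification: the oligo length and overlap are arbitrary example values, the function names are hypothetical, and real assembly designs alternate strands and balance the melting temperatures of the overlaps.

```python
def split_into_oligos(seq, oligo_len=60, overlap=20):
    """Split a sequence into oligos that each share `overlap` bases with the next."""
    if overlap >= oligo_len:
        raise ValueError("overlap must be shorter than the oligo length")
    if len(seq) <= oligo_len:
        return [seq]
    step = oligo_len - overlap
    return [seq[i:i + oligo_len] for i in range(0, len(seq) - overlap, step)]


def assemble(oligos, overlap=20):
    """Join oligos by dropping the overlapping prefix of each later piece."""
    assembled = oligos[0]
    for oligo in oligos[1:]:
        assembled += oligo[overlap:]
    return assembled


gene = "ATGGGTGTGCACGAATGC" * 6  # 108 nt toy sequence, not the real gene
oligos = split_into_oligos(gene)
print(len(oligos), [len(oligo) for oligo in oligos], assemble(oligos) == gene)
```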

Limitations of DNA Synthesis

Speed: Synthesis of longer genes takes more time.

Accuracy: Errors can occur during chemical synthesis, so the product requires sequencing validation.

Scalability: Cost increases with gene length, and whole-genome synthesis remains complex and expensive.

However, for single-gene constructs like EPO, synthesis is efficient and practical.
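The accuracy limit can be made concrete with a simple model: if every coupling step succeeds with the same probability, the fraction of full-length product falls off exponentially with oligo length. The 99% coupling efficiency used below is an assumed example value, not a figure from any particular provider, and the function name is hypothetical.

```python
def full_length_fraction(n_bases, coupling_efficiency=0.99):
    """Expected fraction of full-length oligo after n_bases - 1 coupling steps."""
    # The first base is already attached to the support, so an n-mer needs n - 1 couplings.
    return coupling_efficiency ** (n_bases - 1)


for length in (20, 60, 200):
    print(length, round(full_length_fraction(length), 3))
```

This is one reason genes are assembled from short oligos rather than synthesized as a single long strand.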

5.3 DNA Edit

(i) What DNA would you want to edit and why?

In this project, I would edit:

The EPO coding sequence

Possible edits:

a. Improve solubility in E. coli

b. Reduce aggregation

c. Modify specific amino acids to increase stability

Beyond this project: I would like to correct disease-causing mutations in humans.
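At the sequence level, an amino-acid change in the plasmid comes down to replacing one codon in the coding sequence. The sketch below shows that bookkeeping. The function name is hypothetical, and the example substitution is purely illustrative; it is not a proposed or validated edit to EPO.

```python
def substitute_codon(cds, residue_number, new_codon):
    """Replace the codon for a 1-based residue number with a new codon."""
    if len(new_codon) != 3:
        raise ValueError("a codon must be exactly 3 nucleotides")
    start = (residue_number - 1) * 3
    if start < 0 or start + 3 > len(cds):
        raise ValueError("residue number is outside the coding sequence")
    return cds[:start] + new_codon + cds[start + 3:]


cds = "ATGGGTGTGCACGAATGC"                # first six codons of the optimized gene
mutant = substitute_codon(cds, 6, "AGC")  # illustrative only: TGC (Cys) to AGC (Ser)
print(mutant)
```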

(ii) What technology would you use?

For Plasmid Editing: Site-Directed Mutagenesis

For Genomic Editing: CRISPR-Cas systems

How CRISPR Edits DNA

  1. Design guide RNA (gRNA) complementary to target DNA.

  2. Cas enzyme binds to gRNA.

  3. Complex locates target DNA.

  4. Cas creates double-strand break.

  5. Repair occurs via:

    a. Non-homologous end joining (NHEJ)

    b. Homology-directed repair (HDR)

Required Inputs: DNA template (target sequence), guide RNA, Cas enzyme, repair template (if a precise edit is desired), host cells

Essential Design Steps

  1. Identify target sequence.

  2. Design guide RNA.

  3. Check for off-target sites.

  4. Prepare delivery system (plasmid or ribonucleoprotein).

  5. Validate edits via sequencing.
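The first two design steps can be sketched as a search for candidate target sites. The example below assumes SpCas9, which recognizes a 20 nt protospacer followed by an NGG PAM and cuts about 3 bp upstream of the PAM; the text above does not name a specific Cas enzyme, so this is an assumption. The function name is hypothetical, only the given strand is scanned, and no off-target check is performed.

```python
import re

# 20 nt protospacer followed by an NGG PAM. The lookahead allows overlapping sites.
PROTOSPACER_PAM = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


def find_guides(dna):
    """List candidate SpCas9 target sites on the given strand."""
    guides = []
    for match in PROTOSPACER_PAM.finditer(dna):
        start = match.start()
        guides.append({
            "start": start,                   # 0-based index of the protospacer
            "protospacer": match.group(1),
            "pam": match.group(2),
            "cut_index": start + 17,          # break falls just before this index
        })
    return guides


target = "ACGTACGTACGTACGTACGT" + "AGG" + "TTT"  # toy sequence, not a real locus
for guide in find_guides(target):
    print(guide)
```

A full design would also scan the reverse strand and score each candidate for off-target matches elsewhere in the genome.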

Limitations of Editing Technologies

Efficiency: Editing efficiency may vary by cell type.

Precision: Off-target mutations may occur.

Delivery Challenges: Introducing CRISPR components into certain cells is difficult.

Ethical Considerations: Human genome editing raises significant ethical concerns.