Week 2 Homework: DNA Read, Write & Edit

1. Gel Art:

vs

The digestion results:

3.1. Choose your protein.

Human ubiquitin is one of the most highly conserved proteins across eukaryotic species: its amino acid sequence is nearly identical in organisms ranging from yeast to humans (the yeast and human proteins differ at only 3 of 76 residues). It plays a central role in the ubiquitin-proteasome system, where it tags damaged or unnecessary proteins for degradation, regulating critical cellular processes such as cell cycle progression, DNA repair, signal transduction, and protein quality control.

Protein sequence (76 amino acids, a single ubiquitin unit from the UniProt entry below):

>sp|P0CG48|UBC_HUMAN Polyubiquitin-C OS=Homo sapiens OX=9606 GN=UBC PE=1 SV=2
MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG

3.2. Reverse Translation:

ATGCAGATCTTCGTGAAGACCCTGACCGGCAAGACCATCACTCTGGAGGTGGAGCCCAGTGACACCATCGAGAACGTGAAGGCCAAGATCCAGGACAAGGAGGGCATCCCTCCTGACCAGCAGAGGCTGATCTTTGCCGGCAAGCAGCTGGAAGATGGCCGCACCCTGTCTGACTACAACATCCAGAAGGAGTCCACCCTGCACCTGGTGCTGCGTCTGCGTGGCGGC
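As a sanity check, the reverse-translated DNA can be translated back and compared with the protein from 3.1. The snippet below is a minimal sketch in plain Python; the helper name `translate` and the compact way of building the standard codon table are my own choices, not part of the assignment.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence (reading frame 0) into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


UBIQUITIN = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQ"
    "QRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
)

# Reverse-translated sequence from section 3.2, split into 30-nt chunks.
NATIVE_CDS = (
    "ATGCAGATCTTCGTGAAGACCCTGACCGGC"
    "AAGACCATCACTCTGGAGGTGGAGCCCAGT"
    "GACACCATCGAGAACGTGAAGGCCAAGATC"
    "CAGGACAAGGAGGGCATCCCTCCTGACCAG"
    "CAGAGGCTGATCTTTGCCGGCAAGCAGCTG"
    "GAAGATGGCCGCACCCTGTCTGACTACAAC"
    "ATCCAGAAGGAGTCCACCCTGCACCTGGTG"
    "CTGCGTCTGCGTGGCGGC"
)

print("encodes ubiquitin:", translate(NATIVE_CDS) == UBIQUITIN)
```

Note that the sequence has no stop codon yet; one is added when the expression cassette is built.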

3.3. Codon optimization:

Different organisms have distinct preferences for which codons they use for each amino acid, due to differences in tRNA abundance and translation efficiency. The native human sequence may contain codons that are rare in other hosts, leading to slower translation and ribosomal stalling, which increase the risk of misfolding. I decided to optimize for Escherichia coli K-12, which is fast-growing, inexpensive, well-characterized, and widely used for high-yield protein expression in research and industry. Using Benchling's codon optimization tool, I obtained:

ATGCAGATCTTTGTGAAAACCCTGACCGGTAAAACCATTACCCTGGAAGTGGAGCCGAGCGATACCATTGAAAACGTGAAAGCGAAAATTCAGGATAAAGAAGGCATTCCGCCGGATCAGCAGCGCCTGATTTTTGCCGGCAAACAGCTGGAAGATGGTCGTACCCTGAGCGACTATAACATTCAGAAAGAAAGCACCTTACATCTGGTGCTGCGTCTGCGTGGTGGT
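Codon optimization should change only the DNA, never the protein. The sketch below compares the two sequences from 3.2 and 3.3: it checks that both translate to the same protein, counts how many codons were swapped, and reports GC content. The helper names (`codons`, `translate`, `gc_content`) are illustrative; this is not how Benchling computes its result.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codons(dna: str) -> list[str]:
    """Split a coding sequence into consecutive triplets."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna: str) -> str:
    return "".join(CODON_TABLE[c] for c in codons(dna))


def gc_content(dna: str) -> float:
    return (dna.count("G") + dna.count("C")) / len(dna)


NATIVE = (  # section 3.2
    "ATGCAGATCTTCGTGAAGACCCTGACCGGC" "AAGACCATCACTCTGGAGGTGGAGCCCAGT"
    "GACACCATCGAGAACGTGAAGGCCAAGATC" "CAGGACAAGGAGGGCATCCCTCCTGACCAG"
    "CAGAGGCTGATCTTTGCCGGCAAGCAGCTG" "GAAGATGGCCGCACCCTGTCTGACTACAAC"
    "ATCCAGAAGGAGTCCACCCTGCACCTGGTG" "CTGCGTCTGCGTGGCGGC"
)
OPTIMIZED = (  # section 3.3
    "ATGCAGATCTTTGTGAAAACCCTGACCGGT" "AAAACCATTACCCTGGAAGTGGAGCCGAGC"
    "GATACCATTGAAAACGTGAAAGCGAAAATT" "CAGGATAAAGAAGGCATTCCGCCGGATCAG"
    "CAGCGCCTGATTTTTGCCGGCAAACAGCTG" "GAAGATGGTCGTACCCTGAGCGACTATAAC"
    "ATTCAGAAAGAAAGCACCTTACATCTGGTG" "CTGCGTCTGCGTGGTGGT"
)

same_protein = translate(NATIVE) == translate(OPTIMIZED)
# (position, native codon, optimized codon) for every codon that was replaced
changed = [
    (i + 1, a, b)
    for i, (a, b) in enumerate(zip(codons(NATIVE), codons(OPTIMIZED)))
    if a != b
]

print("same protein:", same_protein)
print("codons changed:", len(changed), "of", len(codons(NATIVE)))
print(f"GC content: native {gc_content(NATIVE):.1%}, optimized {gc_content(OPTIMIZED):.1%}")
```

Every entry in `changed` should be a synonymous swap, which is exactly what the `same_protein` check confirms.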

3.4. You have a sequence! Now what?

The optimized DNA sequence can be synthesized and cloned into a vector for protein production in E. coli K-12:

The DNA is inserted into a plasmid and introduced into E. coli cells by transformation. Host RNA polymerase transcribes the DNA into mRNA. Ribosomes then translate the mRNA into the polypeptide chain using tRNAs, amino acids, and energy from GTP/ATP. The protein folds, and the cells are lysed so that it can be purified.

3.5. How does it work in nature/biological systems?

How a single gene codes for multiple proteins at the transcriptional level: One common mechanism is alternative splicing, where a single pre-mRNA transcript is processed in different ways to include or exclude exons, producing multiple mature mRNA isoforms from the same gene. These isoforms are translated into distinct proteins with different functions. (Ubiquitin genes themselves primarily produce identical monomers via polyubiquitin precursors or ribosomal protein fusions, with cleavage occurring post-translationally; alternative splicing is more prominent in other human genes, e.g., for diversity in receptors or enzymes.)

Alignment of DNA, transcribed RNA, and translated Protein:
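A minimal sketch of such an alignment for the first ten codons of the reverse-translated sequence from 3.2. It assumes the DNA shown is the coding strand, so the mRNA has the same sequence with U in place of T; the function name `align` and the layout are illustrative.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def align(coding_dna: str) -> str:
    """Return three lines: DNA codons, mRNA codons, and the amino acid under each codon."""
    dna = coding_dna.upper()
    usable = len(dna) - len(dna) % 3          # ignore an incomplete trailing codon
    dna_codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    rna_codons = [c.replace("T", "U") for c in dna_codons]   # transcription: T -> U
    residues = [f" {CODON_TABLE[c]} " for c in dna_codons]   # centre letter under codon
    return "\n".join([
        "DNA     " + " ".join(dna_codons),
        "RNA     " + " ".join(rna_codons),
        "Protein " + " ".join(residues),
    ])


FIRST_30_NT = "ATGCAGATCTTCGTGAAGACCCTGACCGGC"  # start of the sequence in section 3.2
print(align(FIRST_30_NT))
```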

Important:

  • the genome contains multiple genes that code for ubiquitin (ribosomal fusion genes and polyubiquitin genes), which allows production to be upregulated during periods of increased metabolic stress;
  • in general, ubiquitin genes do not contain introns, most likely because ubiquitin must be produced quickly and with a low chance of errors;
  • ubiquitin itself undergoes post-translational modifications;
  • ubiquitination is the process by which the E1, E2 and E3 enzymes mark misfolded or denatured proteins for degradation by covalently attaching ubiquitin to the target (substrate) protein:
  • E1 activates ubiquitin;
  • E2 carries the activated ubiquitin to the target protein;
  • E3 facilitates ubiquitin-substrate binding.

4.2

Complete sequence (promoter, RBS, start codon, coding sequence, 7xHis tag, stop codon, terminator)
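The part order above can be sketched as a simple concatenation. In the snippet below only the coding sequence (from 3.3) is real; `PROMOTER`, `RBS` and `TERMINATOR` are hypothetical placeholder strings of N, not the parts actually used, and `build_cassette` is an illustrative helper. Real designs usually also need a short spacer between the RBS and the start codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_cassette(promoter, rbs, cds, terminator, n_his=7, stop="TAA"):
    """Concatenate the parts 5'->3' and return (cassette, open_reading_frame)."""
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must begin with the ATG start codon")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if any(cds[i:i + 3] in STOP_CODONS for i in range(0, len(cds), 3)):
        raise ValueError("remove in-frame stop codons so the His tag is translated")
    his_tag = "CAT" * n_his        # CAT encodes histidine (CAC would work too)
    orf = cds + his_tag + stop     # start codon ... protein ... 7xHis ... stop codon
    return promoter + rbs + orf + terminator, orf


# Placeholders only (assumption): replace with the real part sequences.
PROMOTER = "N" * 35
RBS = "N" * 12
TERMINATOR = "N" * 40

CDS = (  # codon-optimized ubiquitin from section 3.3
    "ATGCAGATCTTTGTGAAAACCCTGACCGGT" "AAAACCATTACCCTGGAAGTGGAGCCGAGC"
    "GATACCATTGAAAACGTGAAAGCGAAAATT" "CAGGATAAAGAAGGCATTCCGCCGGATCAG"
    "CAGCGCCTGATTTTTGCCGGCAAACAGCTG" "GAAGATGGTCGTACCCTGAGCGACTATAAC"
    "ATTCAGAAAGAAAGCACCTTACATCTGGTG" "CTGCGTCTGCGTGGTGGT"
)

cassette, orf = build_cassette(PROMOTER, RBS, CDS, TERMINATOR)
print(len(orf), "nt open reading frame;", len(cassette), "nt cassette")
```

Placing the tag at the C-terminus, as here, means the stop codon must come after the tag rather than after the protein.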

4.3, 4.4, 4.5, 4.6 Results:

5.1 What DNA would you want to sequence (e.g., read) and why? DNA READ

i) I would be thrilled to sequence the genes that encode vaults, the large ribonucleoprotein particles found in most eukaryotic cells. A deeper understanding of how MVPs (major vault proteins) assemble into literal containers, which are thought to be used for cellular transport, would help us learn how to design intricate structures far more complex than anything evolution is likely to produce in the coming millions of years.

ii) I would use third-generation sequencing, specifically Oxford Nanopore Technologies (ONT). Why? Unlike second-generation (Next-Generation Sequencing/NGS) platforms like Illumina, which rely on repeated cycles of “sequencing by synthesis” on amplified clusters, ONT performs single-molecule, real-time sequencing. It does not require PCR amplification to create a signal, so it avoids the amplification biases that can lead to errors and uneven coverage in GC-rich regions of the genome.

5.2 DNA Write:

i) A genetic circuit designed for the real-time detection of microplastic degradation products in aquatic ecosystems.

The sheer volume of plastic entering our oceans has created a vicious cycle: macro-plastics break down into microscopic particles that enter the food chain, and human health ultimately bears the brunt of their toxic effects. The design would be a modular genetic circuit that responds to terephthalic acid (TPA), a primary breakdown product of PET plastics.

  • The Sensor: A specific transcription factor (TpaR) that remains inactive until it binds to TPA.
  • The Amplifier: Once TpaR is activated, it drives an amplification stage that filters out “leaky” (basal) expression while keeping sensitivity high.
  • The Reporter: A high-intensity chromogenic protein (like amilCP, which turns dark blue) that is easily visible to the naked eye, even in murky environmental water samples.

ii)

  • Synthesis Technology: Silicon-based synthesis. I would choose this technology because it uses a silicon chip as the solid support for the chemical reactions. Traditional plastic plates are inefficient; moving to silicon avoids the wastefulness of older methods, reducing reagent use by over 99%.
  • Scalability: It can synthesize 9,600 genes on a single chip.
  • Length Constraints: Each base-addition step is slightly less than 100% efficient, so the fraction of full-length product drops as the DNA strand grows longer.
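The length constraint can be made concrete with a small calculation. If every base addition succeeds with the same probability (the coupling efficiency), the fraction of strands that reach full length is that probability raised to the number of coupling steps. The 99.5% figure below is an illustrative assumption, not a value from the platform's documentation.

```python
def full_length_fraction(length_nt: int, coupling_efficiency: float = 0.995) -> float:
    """Fraction of strands that reach full length.

    A strand of length_nt bases needs (length_nt - 1) coupling steps after the
    first base, each succeeding with probability coupling_efficiency.
    """
    if length_nt < 1:
        raise ValueError("length must be at least 1 nt")
    return coupling_efficiency ** (length_nt - 1)


# Assumed per-step efficiency of 99.5% (illustrative only).
for n in (50, 100, 200, 300):
    print(f"{n:>4} nt: {full_length_fraction(n):.1%} full-length")
```

This is why long genes are typically assembled from shorter synthesized fragments rather than made in one pass.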

5.3 DNA Edit:

(i) The Edit: Enabling Nitrogen Fixation in Cereals

Currently, among crops, essentially only legumes can obtain usable nitrogen from the atmosphere, via a symbiotic relationship with specialized nitrogen-fixing bacteria. Most cereal crops lack this ability, forcing farmers into a vicious cycle of applying synthetic nitrogen fertilizers. The planet’s soil and waterways bear the brunt of this practice, which leads to massive nutrient runoff and greenhouse gas emissions.

I would want to edit the Symbiosis Receptor-Like Kinase (SYMRK) and Nod Factor Receptor (NFR) genes in wheat. These edits would “re-tune” the plant’s root receptors to recognize the signaling molecules of nitrogen-fixing bacteria. This change would challenge the notion that high-yield farming must be ecologically destructive, as the crops would essentially produce their own “green” fertilizer.

(ii) The Technology: Prime Editing

To achieve these precise changes without causing accidental damage to the rest of the genome, I would use Prime Editing (PE).

While standard CRISPR-Cas9 is effective for “breaking” genes, Prime Editing is a “search-and-replace” tool that can rewrite specific DNA letters with high precision. This is essential for plants, where the genome contains delicate regulatory networks that can be easily disrupted by the “clumsy” double-strand breaks of older technologies.

How it Edits DNA (Essential Steps):

Targeting: The Prime Editor, a fusion protein consisting of a Cas9 nickase and a reverse transcriptase (RT), is guided by the pegRNA to the specific root receptor gene.

Nicking: Instead of cutting both strands of DNA (which is a hazardous event for a cell), it creates a small nick in only one strand.

Reverse Transcription: The pegRNA (prime editing guide RNA) provides a template. The RT enzyme “reads” this RNA and synthesizes a new strand of DNA that contains the desired edit.

Flap Competition: The newly synthesized “edited” 3′ DNA flap competes with the original unedited 5′ flap for pairing with the complementary strand.

Incorporation: Through natural cellular repair, the old flap is removed, and the new, edited sequence is permanently integrated into the genome.

Preparation and Input

Prime Editing is complex and requires meticulous preparation before a single cell is touched.

Design Steps: We must use software to design the pegRNA. This includes choosing a Primer Binding Site (PBS), which anneals to the nicked DNA strand and tells the reverse transcriptase where to start, and an RT template, which contains the actual “fix” for the nitrogen receptor.
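To show what the design software is computing, here is a rough sketch of how the PBS and RT template follow from the target sequence. It assumes an SpCas9-based editor (NGG PAM, nick 3 nt upstream of the PAM on the PAM-containing strand) and typical lengths of 13 nt for the PBS and 14 nt for the RT template. The target sequence is a made-up toy example, not a wheat gene, and `design_extension` is a hypothetical helper, not the API of any real design tool.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_extension(target, spacer, edit_offset, new_base, pbs_len=13, rtt_len=14):
    """Sketch the pegRNA 3' extension for a single-base substitution.

    target      PAM-strand sequence, 5'->3'
    spacer      20-nt protospacer found in target, immediately followed by NGG
    edit_offset position of the base to change, counted from the nick (0 = first
                base 3' of the nick)
    Sequences are returned as DNA letters; in the real pegRNA they are RNA.
    """
    target, spacer = target.upper(), spacer.upper()
    start = target.find(spacer)
    if start < 0 or len(spacer) != 20:
        raise ValueError("a 20-nt spacer present in the target is required")
    if target[start + 21:start + 23] != "GG":
        raise ValueError("spacer is not followed by an NGG PAM")
    nick = start + 17                          # nick lies 3 nt 5' of the PAM
    if nick < pbs_len or nick + rtt_len > len(target):
        raise ValueError("target too short for the requested PBS/RTT lengths")
    if not 0 <= edit_offset < rtt_len:
        raise ValueError("edit must fall inside the RT template")

    upstream = target[nick - pbs_len:nick]     # 3' end of the nicked strand
    downstream = target[nick:nick + rtt_len]   # region to be rewritten
    edited = downstream[:edit_offset] + new_base + downstream[edit_offset + 1:]

    pbs = revcomp(upstream)                    # anneals to the nicked strand
    rtt = revcomp(edited)                      # template carrying the edit
    return {"pbs": pbs, "rtt": rtt, "extension": rtt + pbs, "edited": edited}


# Toy target (hypothetical): 7 nt + 20-nt spacer + TGG PAM + 16 nt.
SPACER = "GACCTTGAACTCAGTAGCAT"
TARGET = "GATTACA" + SPACER + "TGG" + "CATTGCAAGTCCGATT"

peg = design_extension(TARGET, SPACER, edit_offset=1, new_base="G")
print("PBS:", peg["pbs"])
print("RTT:", peg["rtt"])
```

Reading the extension 5'->3', the RT template comes first and the PBS last, because the PBS must sit at the very 3' end of the pegRNA to prime reverse transcription.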

Inputs:

The Prime Editor Construct: Usually delivered as a plasmid or mRNA encoding the Cas9-RT fusion protein.

The pegRNA: The custom RNA sequence that “searches and replaces.”

Delivery System: For plants, we often use Agrobacterium-mediated transformation or Biolistics (a “gene gun”) to shoot these components into the plant tissue.

Target Cells: Embryogenic callus cells from the wheat plant, which can be grown back into a whole, fertile plant.

Limitations: Efficiency and Precision

Although Prime Editing is highly precise, it is not yet perfect:

Efficiency: In many plant species, the editing efficiency is still quite low (often <10%). Getting the “search-and-replace” to actually stick across millions of cells remains a bottleneck.

Size Constraints: Because the Prime Editor protein is so large, it is difficult to pack into some delivery vehicles.

Target Range: The editor must be near a specific sequence called a PAM site. If the receptor gene we want to edit isn’t near a PAM, the tool is effectively “blind” to that location.
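Whether a site is reachable can be checked by scanning for PAMs on both strands. The sketch below assumes the SpCas9 PAM (NGG); engineered Cas9 variants recognize other PAMs, in which case the pattern would change. The function name `find_pams` and the test sequence are illustrative.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_pams(seq: str, pam_regex: str = "[ACGT]GG"):
    """List (strand, position) of every PAM match on both strands.

    Positions are 0-based and always refer to the leftmost base of the PAM
    in top-strand coordinates. A lookahead is used so overlapping PAMs
    (e.g. in GGGG) are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(f"(?=({pam_regex}))")
    hits = [("+", m.start()) for m in pattern.finditer(seq)]
    for m in pattern.finditer(revcomp(seq)):
        pam_len = len(m.group(1))
        hits.append(("-", len(seq) - m.start() - pam_len))
    return sorted(hits, key=lambda h: h[1])


# Toy sequence (hypothetical): one PAM on each strand.
print(find_pams("AAAGGTTTCCAAA"))
```

If the list is empty in the neighbourhood of the desired edit, an NGG-dependent editor cannot be positioned there.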