Week 2 HW: DNA READ, WRITE & EDIT

PART 0: BASICS

I have attended all lectures and recitation necessary to prepare for this week.

PART 1: GEL ART & BENCHLING

I made my free account on Benchling, following Ice’s tutorial in class on Lambda DNA. I then played around with Ronan’s website and Benchling’s Digest feature to try to come up with something I liked. Ultimately, I came up with something that looks vaguely like a lucky cat (if you squint).

PART 2: GEL ART & RESTRICTION DIGESTS

Look at my design here!

PART 3: DNA DESIGN CHALLENGE

3.1

The protein I chose is CLOCK, which helps regulate the roughly 24-hour mammalian circadian rhythm. I chose it because circadian rhythms interest me in general, and it’s interesting that the rhythm is regulated by a protein and not some psychological factor. Although I guess if you dove deep into any human action, you could boil it down to the result of some complex protein interactions.

3.2

The protein sequence is as follows:

NP_001254772.1 CLOCK [organism=Homo sapiens] [GeneID=9575]
MLFTVSCSKMSSIVDRDDSSIFDGLVEEDDKDKAKRVSRNKSEKKRRDQFNVLIKELGSMLPGNARKMDKSTVLQKSIDFLRKHKEITAQSDASEIRQDWKPTFLSNEEFTQLMLEALDGFFLAIMTDGSIIYVSESVTSLLEHLPSDLVDQSIFNFIPEGEHSEVYKILSTHLLESDSLTPEYLKSKNQLEFCCHMLRGTIDPKEPSTYEYVKFIGNFKSLNSVSSSAHNGFEGTIQRTHRPSYEDRVCFVATVRLATPQFIKEMCTVEEPNEEFTSRHSLEWKFLFLDHRAPPIIGYLPFEVLGTSGYDYYHVDDLENLAKCHEHLMQYGKGKSCYYRFLTKGQQWIWLQTHYYITYHQWNSRPEFIVCTHTVVSYAEVRAERRRELGIEESLPETAADKSQDSGSDNRINTVSLKEALERFDHSPTPSASSRSSRKSSHTAVSDPSSTPTKIPTDTSTPPRQHLPAHEKMVQRRSSFSSQSINSQSVGSSLTQPVMSQATNLPIPQGMSQFQFSAQLGAMQHLKDQLEQRTRMIEANIHRQQEELRKIQEQLQMVHGQGLQMFLQQSNPGLNFGSVQLSSGNSSNIQQLAPINMQGQVVPTNQIQSGMNTGHIGTTQHMIQQQTLQSTSTQSQQNVLSGHSQQTSLPSQTQSTLTAPLYNTMVISQPAAGSMVQIPSSMPQNSTQSAAVTTFTQDRQIRFSQGQQLVTKLVTAPVACGAVMVPSTMLMGQVVTAYPTFATQQQQSQTLSVTQQQQQQSSQEQQLTSVQQPSQAQLTQPPQQFLQTSRLLHGNPSTQLILSAAFPLQQSTFPQSHHQQHQSQQQQQLSRHRTDSLPDPSKVQPQ

3.3

The corresponding nucleotide sequence (the genomic record for the gene on chromosome 4) is quite long, so only the beginning is shown here:

NC_000004.12:c55546909-55427903 CLOCK [organism=Homo sapiens] [GeneID=9575] [chromosome=4]
GCTGACGACGCATGCGCCGGGAGGGGGCGCAATCACGGACTCGGCTTGCGGCTGCCGGTTTAAAAAAGGAAACCCCGGAGAGCGAGAGCGCGAAGGAAATCTGGCCGCCGCCGCCGCGAGCGCTCCCGGTGAGAGGCGCCCGCCCGGTGGGCCCAGGGCCTGCCGAGTGCCGGTTGGCTTCCTTGGCGGCGCATGCGCGCTCCTGGGCTGGTGGAGGAGGGGAAGGGAAGGGAGGGGGAGGAGGAGCTGGCCACAGGAGCGGCGAATTTTTGGGGGGGTGGGTGGGGGGCGCCACTCACAGCCCCAGGTGCTGCTGGAGGTGGGAGCCGCGGCGCCTCCTGGACACAGGCGGGGTAGTGGTTCCGAGTCACCGCAGCGGGAGACCTGGGTGGGGGAGGGAAGAAGCCGGAGCCGCCGCAA

To optimize the codons, I used the codon optimization tool offered by Twist. Since I’m just playing around with the protein, I left most options, such as “Sites to Avoid Introducing,” blank.

Codon optimization is useful because different organisms prefer different synonymous codons to encode the same amino acid. Although the genetic code is nearly universal, the frequency with which specific codons are used varies between species. If a gene is expressed in a host organism that rarely uses certain codons, translation can become slow or inefficient. Optimizing codon usage matches the DNA sequence to the host organism’s codon bias while leaving the encoded protein unchanged.
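As a rough illustration of the idea, here is a minimal Python sketch that back-translates the first nine residues of CLOCK using one "preferred" codon per amino acid. The table PREFERRED_CODON and the function names are hypothetical and only cover the amino acids in this short peptide; the chosen codons are illustrative and are not taken from Twist's tool or from a real codon usage database.

```python
# Hypothetical "preferred codon" picks for a few amino acids.
# Illustrative only: a real tool uses a full usage table for the host.
PREFERRED_CODON = {
    "M": "ATG", "L": "CTG", "F": "TTC", "T": "ACC",
    "V": "GTG", "S": "AGC", "C": "TGC", "K": "AAG",
}

# Synonymous codons for the same amino acids (standard genetic code).
SYNONYMS = {
    "M": {"ATG"},
    "L": {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"},
    "F": {"TTT", "TTC"},
    "T": {"ACT", "ACC", "ACA", "ACG"},
    "V": {"GTT", "GTC", "GTA", "GTG"},
    "S": {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"},
    "C": {"TGT", "TGC"},
    "K": {"AAA", "AAG"},
}
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMS.items() for codon in codons}


def back_translate(protein):
    """Pick the preferred codon for every residue."""
    return "".join(PREFERRED_CODON[aa] for aa in protein)


def translate(dna):
    """Translate an in-frame DNA string (only the codons listed above)."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


peptide = "MLFTVSCSK"  # first nine residues of the CLOCK protein sequence
optimized = back_translate(peptide)
unoptimized = "ATGTTATTTACTGTATCTTGTTCTAAA"  # same peptide, different codons

print(optimized)
print(translate(optimized) == translate(unoptimized) == peptide)
```

Both DNA strings encode the same peptide, which is the point: optimization changes which synonymous codons are used, not the protein.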

I chose to optimize the codon sequence for Homo sapiens because the CLOCK protein is a human transcription factor and is typically studied in mammalian cell systems. Since the goal is to express functional CLOCK protein in a human cellular environment, optimizing for human codon bias ensures efficient translation using human tRNA pools and supports proper protein folding and regulation. Additionally, expressing CLOCK in human cells preserves the relevant post-translational modifications and cellular context necessary for its biological function in circadian regulation.

The optimized codon sequence is as follows:

ORIGIN
    1 GCTGATGATG CTTGCGCTGG ACGTGGCCGG AATCATGGTT TGGGACTCAG ACTTCCTGTG
   61 TGAAAGCGGA AGCCGAGACG AGCACGAGCA CGCCGGAAGA GTGGGCGTAG AAGACGAGAA
  121 AGGAGCAGAT AAGAAGCTCC TGCAAGATGG GCTCAAGGAC TCCCAAGCGC AGGCTGGTTG
  181 CCATGGAGAC GAATGAGGGC CCCAGGCTTA GTAGAAGAAG GAAAAGGCCG GGAAGGTGAA
  241 GAAGAATTGG CTACTGGCGC AGCAAATTTC TGGGGCGGCG GATGGGGAGC TCCTTTAACC
  301 GCACCTGGAG CAGCAGGCGG CGGCAGTCGT GGTGCAAGCT GGACGCAAGC TGGTTAATGG
  361 TTTAGGGTGA CTGCCGCCGG GGATTTAGGC GGCGGTGGAA AGAAACCTGA ACCACCCCAG

3.4

To produce the CLOCK protein from the optimized DNA, both in vivo and in vitro systems can be used.

In vivo methods (cell-dependent) involve inserting the optimized CLOCK DNA sequence into a plasmid expression vector under the control of a strong promoter. This plasmid can then be introduced into human cells. Once inside the cell:

The plasmid enters the nucleus, where transcription occurs
mRNA is processed
mRNA is translated
tRNA pairs with complementary codons on mRNA
The protein is synthesized
The protein folds

Because the DNA was optimized for Homo sapiens, translation in human cells should be efficient.

In vitro methods (cell-free) involve adding the optimized DNA into a biochemical mixture containing things like tRNAs, amino acids, purified ribosomes, etc. In these systems:

DNA is first transcribed into mRNA
Ribosomes in the extract translate the mRNA directly into protein
The reaction follows the same biochemical principles of transcription/translation. These systems allow rapid protein production and tighter control over experimental conditions, though they may lack some post-translational modifications found in living cells.

In both methods, the fundamental process remains the same.
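The shared transcription and translation steps can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the input is the coding strand, translation starts at the first AUG and stops at the first in-frame stop codon, and mRNA processing and folding are ignored. The short dna string is a made-up example, not the real CLOCK coding sequence.

```python
from itertools import product

# Standard genetic code, built from the usual compact U/C/A/G ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Coding-strand DNA to mRNA: same sequence with T replaced by U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG until the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy construct: short leader, ORF for MLFTV, stop codon, tail.
dna = "GGCATGCTGTTCACCGTGTAAGC"
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)
print(protein)
```

Whether the ribosomes sit in a living cell or in a cell-free extract, the codon-by-codon reading modeled here is the same.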

PART 4: PREPARING A DNA ORDER

I followed the tutorial on the homework page.

PART 5: DNA READ/WRITE/EDIT

5.1 DNA READ

To be honest, I would want to sequence my own DNA the most. All of these other things are probably frequently sequenced in research, but there’s probably a <1% chance that my DNA would ever be sequenced in a relevant way. I could learn a lot about myself that would be almost impossible for me to learn otherwise.
I would use Illumina sequencing to do so. It is a second-generation technology and is well suited to human DNA because it provides high throughput and accuracy at a relatively low cost. The input would be genomic DNA extracted from my cells (from a blood sample, for example), which would then be fragmented into short pieces. After fragmentation, adapters would be ligated onto both ends of each fragment; they let the fragments bind to a flow cell and also serve as primer-binding sites. The bound fragments are amplified through bridge PCR to create clusters of identical DNA copies. Sequencing then proceeds by sequencing-by-synthesis. Fluorescently labeled reversible terminator nucleotides are added in each cycle, and each incorporated base emits a distinct fluorescent signal that is detected by a camera. After imaging, the terminator is removed and the next cycle can begin. Base calling is performed by identifying the fluorescent signal at each cycle to determine the sequence of each fragment. The output would ultimately be millions to billions of short DNA reads.
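The base-calling step can be illustrated with a toy Python sketch. It assumes one fluorescence channel per base and simply picks the brightest channel in each cycle; the intensity values and the names call_bases and cluster_signal are made up for illustration, and real base callers also correct for effects such as channel crosstalk and phasing.

```python
CHANNELS = "ACGT"  # assumed order of the four fluorescence channels


def call_bases(cycles):
    """Call one base per cycle by taking the brightest channel."""
    read = []
    for intensities in cycles:
        brightest = max(range(len(CHANNELS)), key=lambda k: intensities[k])
        read.append(CHANNELS[brightest])
    return "".join(read)


# Hypothetical intensities for one cluster over four cycles (A, C, G, T).
cluster_signal = [
    (0.91, 0.03, 0.04, 0.02),
    (0.05, 0.02, 0.08, 0.85),
    (0.02, 0.04, 0.90, 0.04),
    (0.03, 0.88, 0.06, 0.03),
]

read = call_bases(cluster_signal)
print(read)
```

Each cluster on the flow cell produces one such read, and a run repeats this for millions to billions of clusters in parallel.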

5.2 DNA WRITE

I don’t know much about DNA synthesis and what’s actually possible. I think sensors would be cool, although I’m not creative enough to come up with novel ideas. If I had to choose, I think a biomaterial would be most up my alley, like synthetic spider silk (which is really hard to make) or something inspired by spider silk for textiles. The construct could also include an anchoring domain to coat cotton fibers or a crosslinking handle for durability, enabling a bio-based coating or fiber additive that improves comfort and moisture handling without petroleum-derived polymers. It would be very interesting to me, and it would be a unique material to work with if I were actually able to synthesize something with the same relative strength as spider silk.
I would use chemical DNA synthesis followed by gene assembly. In modern DNA synthesis, short DNA fragments are chemically built one nucleotide at a time on a solid support. Because longer genes cannot be synthesized in a single piece, multiple short fragments are designed with overlapping regions and then assembled enzymatically into the full-length gene. After assembly, the construct would be inserted into a plasmid and verified using Sanger sequencing, in which DNA polymerase copies the template strand in the presence of fluorescently labeled chain-terminating nucleotides. When a terminator is incorporated, elongation stops, producing fragments of different lengths that are separated by size and read to determine the sequence. This method is highly accurate and well suited for confirming a single engineered gene, though it is relatively low-throughput and not ideal for sequencing very large genomes.
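To make the overlap-based assembly concrete, here is a small Python sketch that splits a sequence into overlapping fragments and joins them back together by matching the shared ends. The function names, fragment length, overlap length, and the toy gene string are all assumptions for illustration; real designs also have to keep the overlaps unique and balance their melting temperatures, which this sketch ignores.

```python
def split_with_overlap(gene, frag_len, overlap):
    """Cut a sequence into fragments that share `overlap` bases with the next one."""
    if not 0 < overlap < frag_len:
        raise ValueError("need 0 < overlap < frag_len")
    step = frag_len - overlap
    fragments = []
    start = 0
    while True:
        fragments.append(gene[start:start + frag_len])
        if start + frag_len >= len(gene):
            break
        start += step
    return fragments


def assemble(fragments, overlap):
    """Join fragments in order, checking that adjacent ends really overlap."""
    assembled = fragments[0]
    for frag in fragments[1:]:
        if assembled[-overlap:] != frag[:overlap]:
            raise ValueError("adjacent fragments do not overlap")
        assembled += frag[overlap:]
    return assembled


# Hypothetical 36-nt toy "gene" standing in for a much longer construct.
gene = "ATGCTGTTCACCGTGAGCTGCAGCAAGATGAGCAGC"
fragments = split_with_overlap(gene, frag_len=16, overlap=6)
rebuilt = assemble(fragments, overlap=6)

for frag in fragments:
    print(frag)
print(rebuilt == gene)
```

In the lab the joining is done enzymatically rather than by string matching, but the design logic of shared ends between neighboring fragments is the same.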

5.3 DNA EDIT

I think I would probably be most interested in conservation. I would edit the DNA of reef-building corals to enhance their resilience to oceanic changes. Coral bleaching occurs when heat stress disrupts the relationship between corals and their symbiotic algae, leading to widespread reef decline. Rather than drastically altering the organism, I would focus on targeted edits to genes involved in heat shock response, oxidative stress regulation, and cellular repair pathways. Enhancing the expression or function of these stress-response genes could increase thermal tolerance and reduce bleaching under moderate heat stress.
I would use CRISPR-based genome editing technologies, as they allow precise and targeted modifications to specific DNA sequences. CRISPR systems use a guide RNA designed to match a target gene sequence and a DNA-cutting enzyme to create a break at that location. The cell’s natural DNA repair machinery then repairs the break, either introducing small changes or incorporating a designed DNA template to achieve a specific modification. Preparation would involve identifying the target coral genes, designing guide RNAs to match those sequences, and preparing the necessary components—such as the Cas enzyme, guide RNA, and potentially a repair template. These components would be delivered into coral embryos or cells using appropriate transformation methods. While CRISPR offers high precision compared to older gene-editing tools, limitations include variable editing efficiency, potential off-target edits, and challenges associated with delivering editing machinery effectively in marine organisms.
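The guide RNA design step can be sketched as a simple sequence search. This Python example assumes the commonly used SpCas9 enzyme, which needs an NGG PAM immediately 3' of a 20-nt target site; it scans only the forward strand and does no off-target or efficiency scoring. The function name find_guides and the target string are hypothetical, and the target is not a real coral gene.

```python
import re


def find_guides(target, guide_len=20):
    """Return (protospacer, PAM, PAM position) for each NGG site on the forward strand."""
    guides = []
    # The lookahead lets overlapping PAM sites be reported as well.
    for match in re.finditer(r"(?=([ACGT]GG))", target):
        pam_start = match.start()
        if pam_start >= guide_len:  # need a full-length protospacer upstream
            protospacer = target[pam_start - guide_len:pam_start]
            guides.append((protospacer, match.group(1), pam_start))
    return guides


# Hypothetical toy target: 20 nt, then a TGG PAM, then two extra bases.
target = "ACGTACGTACGTACGTACGTTGGAT"
guides = find_guides(target)
for protospacer, pam, position in guides:
    print(protospacer, pam, position)
```

A real design workflow would also scan the reverse strand and filter candidates against the rest of the genome to reduce off-target edits.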