Week 2 HW: DNA read and edit
Part 1: Create a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks. You might find Ronan’s website a helpful tool for quickly iterating on designs!
Part 3: 3.1. Choose your protein.
In recitation, we discussed that you will pick a protein for your homework that you find interesting. Which protein have you chosen and why? Using one of the tools described in recitation (NCBI, UniProt, google), obtain the protein sequence for the protein you chose.
A) For my protein, I chose HPLC-12, a type III antifreeze protein found in ocean pout and eelpout (polar fish). It is only 66 amino acids long. Its function is to keep ice crystals from growing in the bloodstream at sub-zero temperatures.
The 66-amino-acid sequence obtained from UniProt:
NQASVVANQLIPINTALTLVMMRSEVVTPVGIPAEDIPRLVSMQVNRAVPLGTTLMPDMVKGYPPA
3.2 Reverse Translation
The Central Dogma discussed in class and recitation describes the process in which DNA sequence becomes transcribed and translated into protein. The Central Dogma gives us the framework to work backwards from a given protein sequence and infer the DNA sequence that the protein is derived from. Using one of the tools discussed in class, NCBI or online tools (google “reverse translation tools”), determine the nucleotide sequence that corresponds to the protein sequence you chose above.
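A) I used an online reverse translation tool to obtain a likely DNA coding sequence for the protein. Because most amino acids can be encoded by several codons, this is one of many possible sequences, not necessarily the original gene.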
ATGAATCAAGCCTCCGTAGTCGCTAACCAACTCATACCGATCAATACAATGTTAACACTCGTAATGATGAGGAGTGAGGTCGTGACACCTGTAGGAATTCCTGCTGAAGACATTCCCAGACTAGTCTCTATGCAAGTTAACAGGGCAGTGCCATTGGGAACAACGTTAATGCCCGACATGGTAAAGGGATACGCCGCC
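To illustrate why reverse translation has many valid answers, here is a minimal Python sketch that counts how many different DNA sequences could encode a given peptide under the standard genetic code. The function name `count_encodings` is my own, and the short peptides are only examples.

```python
from collections import Counter
from math import prod

# Standard genetic code, codons ordered T, C, A, G at each position ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

# Number of synonymous codons per amino acid, e.g. 1 for M and 6 for L.
DEGENERACY = Counter(CODON_TABLE.values())


def count_encodings(protein: str) -> int:
    """Count the distinct coding sequences (stop codon excluded) for a protein."""
    return prod(DEGENERACY[aa] for aa in protein)


print(count_encodings("MW"))      # 1: M and W each have a single codon
print(count_encodings("KGYPPA"))  # 2 * 4 * 2 * 4 * 4 * 4 = 1024
```

Even the last six residues of the protein have over a thousand possible encodings, so a reverse translation tool has to pick one sequence by some rule, such as the most frequent codon in a reference organism.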
3.3. Codon optimization.
Once a nucleotide sequence of your protein is determined, you need to codon optimize your sequence. You may, once again, utilize google for a “codon optimization tool”. In your own words, describe why you need to optimize codon usage. Which organism have you chosen to optimize the codon sequence for and why?
A) Reverse translation does not recover the exact DNA sequence that originally encoded the protein. Most amino acids can be encoded by several codons, so the tool returns one likely sequence out of many possibilities. Different organisms prefer different codons, which is why we optimize the sequence to suit the host species: codon optimization swaps in the codons the host uses most often so that translation runs as smoothly as possible. I tried several optimization tools but got many errors and premature stop codons. I believe this is because I am using a fairly uncommon gene, which makes it hard to optimize for human use.
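The idea can be sketched in a few lines of Python. This is a simplified illustration, not how any particular online tool works: `optimize` picks the most frequent codon for each residue, and `internal_stops` checks for in-frame stop codons. A frame shift, for example from a sequence whose length is not a multiple of three, is one possible cause of premature stops. The usage fractions in `TOY_USAGE` are made-up placeholders, not real codon usage data for any organism.

```python
def optimize(protein: str, usage: dict[str, dict[str, float]]) -> str:
    """Pick the most frequent host codon for each residue and append a stop codon."""
    codons = [max(usage[aa], key=usage[aa].get) for aa in protein]
    return "".join(codons) + "TAA"


def internal_stops(dna: str) -> list[int]:
    """Return codon indices of in-frame stop codons before the final codon."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3 (frame shift?)")
    stops = {"TAA", "TAG", "TGA"}
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon in stops]


# HYPOTHETICAL usage fractions, for illustration only.
TOY_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAG": 0.6, "AAA": 0.4},
    "G": {"GGT": 0.5, "GGA": 0.3, "GGC": 0.1, "GGG": 0.1},
    "Y": {"TAC": 0.6, "TAT": 0.4},
}

cds = optimize("MKGY", TOY_USAGE)
print(cds)                  # ATGAAGGGTTACTAA
print(internal_stops(cds))  # []
```

A real optimization would use a full usage table for the host (all 20 amino acids) and usually also avoids unwanted restriction sites and extreme GC content.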
3.4. You have a sequence! Now what?
What technologies could be used to produce this protein from your DNA? Describe in your words the DNA sequence can be transcribed and translated into your protein. You may describe either cell-dependent or cell-free methods, or both.
A) The most realistic human application of this protein is in organ freezing (cryopreservation). Given the sensitivity of such applications, the product must be extremely pure and consistent. I decided on a cell-dependent method (using a living organism to manufacture the protein), with the yeast Pichia pastoris as the host. E. coli is the most common host for cell-dependent methods, but it produces endotoxins, which must be thoroughly removed before any human clinical use. P. pastoris is a eukaryote and does not produce endotoxins. Yeast cells also contain an endoplasmic reticulum, which helps the 66-amino-acid chain fold correctly into the 3D shape that allows it to bind to ice crystals.
Transcription and translation:
- The optimized DNA sequence is inserted into a plasmid, which is then introduced into the yeast cells.
- The promoter upstream of the sequence triggers transcription: RNA polymerase in the yeast nucleus reads the sequence and builds a complementary mRNA strand, which carries the instructions from the DNA to the ribosome.
- At the ribosome, the mRNA is read codon by codon, producing a protein chain.
- The protein chain enters the endoplasmic reticulum, where chaperones help fold it into its final 3D shape.
- After purification, HPLC-12 is added at low concentrations to the preservation medium, preventing ice crystals from growing large enough to puncture the membranes of preserved organs at sub-zero temperatures.
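The transcription and translation steps above can be mimicked in a short Python sketch. It is a deliberately simplified model (no promoter, splicing, or folding), and the function names are my own. The example input is the first five codons of the DNA sequence from 3.2 followed by a stop codon.

```python
# Standard genetic code, codons ordered T, C, A, G at each position ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def transcribe(coding_dna: str) -> str:
    """The mRNA matches the coding strand, with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGAATCAAGCCTCCTAA")
print(mrna)             # AUGAAUCAAGCCUCCUAA
print(translate(mrna))  # MNQAS
```

Translating the full construct this way and comparing the result with the UniProt sequence is a quick check that no residue was changed during reverse translation or optimization.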
4.1, 4.2) Build Your DNA Insert Sequence.
A) Since I am using yeast (Pichia pastoris), I selected the GAP promoter. GAP (glyceraldehyde-3-phosphate dehydrogenase) is a naturally occurring metabolic enzyme in yeast, which makes its promoter highly compatible with Pichia pastoris. The promoter is constitutively active and does not require methanol induction, unlike the AOX1 promoter. As for the ribosome binding site, eukaryotic systems generally do not require a dedicated sequence-based binding site of the kind used in bacteria. Everything else followed the procedure (inserting the HPLC-12 DNA sequence, His tag, and stop codon). The terminator I selected is the AOX1 TT, one of the most common terminator choices for Pichia pastoris. A reliable terminator is especially important in Pichia pastoris because of its strong promoters: transcription has to end cleanly so that the mRNA is efficiently processed.
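As a rough sketch of how the parts line up, the Python snippet below concatenates them in order: promoter, coding sequence, His tag, stop codon, terminator. The promoter and terminator strings are placeholders, not the real GAP or AOX1 TT sequences, and `build_insert` is a hypothetical helper. The His tag is written as six CAC codons, one valid way to encode six histidines.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_insert(promoter: str, cds: str, terminator: str,
                 his_tag: str = "CAC" * 6, stop: str = "TAA") -> str:
    """Join promoter, coding sequence, C-terminal His tag, stop codon and terminator."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence must be a whole number of codons")
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]  # drop an existing stop so the tag stays in frame
    return promoter + cds + his_tag + stop + terminator


# Placeholders in brackets stand in for the real regulatory sequences.
insert = build_insert("[GAP promoter]", "ATGAATCAAGCCTCC", "[AOX1 TT]")
print(insert)
```

Removing any stop codon at the end of the coding sequence matters here: if it stayed in place, translation would end before the His tag was added.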
4.6) choose your vector.
5.1 DNA Read (i) What DNA would you want to sequence (e.g., read) and why? This could be DNA related to human health (e.g. genes related to disease research), environmental monitoring (e.g., sewage waste water, biodiversity analysis), and beyond (e.g. DNA data storage, biobank).
A) I would like to sequence the HTT gene, which is responsible for Huntington’s disease. The disease is caused by an expanded run of CAG repeats: when the number of repeats reaches about 36 or more, it becomes pathogenic. The disease causes neurons in the brain to degenerate, leading to progressive loss of motor function and cognition. Why? Sequencing HTT may assist in early detection and, by improving our understanding of the disease, may support editing the gene to prevent protein aggregation and further expansion of the CAG repeat region.
(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why? Also answer the following questions:
Is your method first-, second- or third-generation or other? How so? What is your input? How do you prepare your input (e.g. fragmentation, adapter ligation, PCR)? List the essential steps. What are the essential steps of your chosen sequencing technology, how does it decode the bases of your DNA sample (base calling)? What is the output of your chosen sequencing technology?
- The best option for sequencing this would be Oxford Nanopore sequencing. It is third-generation: it reads single DNA molecules directly and provides long reads that can span the whole repeat region. It also does not require PCR amplification, which is prone to errors in repetitive regions (critical for HTT, where diagnosis relies on the exact number of CAG repeats).
- Input: genomic DNA (from blood or saliva). Preparation: extract the DNA, fragment it, ligate adapters that help the machine recognize the DNA, and load the library into the nanopore flow cell.
- How it works: each DNA strand passes through a nanopore that carries an electric current. The bases in the pore cause characteristic shifts in the current; these signals are recorded and converted into a base sequence (base calling).
- Output: long DNA reads stored as digital data (FASTQ), which would provide information on the exact number of CAG repeats and any mutations.
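To show how the FASTQ output could be turned into a repeat count, here is a minimal Python sketch. It assumes simple four-line FASTQ records and counts the longest uninterrupted CAG run in each read. The read below is made up for illustration and the function names are my own. Real nanopore reads contain errors, so an actual analysis would need alignment to the HTT locus and an error-tolerant repeat caller.

```python
import io
import re


def read_fastq(handle):
    """Yield (read_id, sequence) pairs from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:
            return
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality string
        yield header[1:], sequence


def longest_cag_run(sequence: str) -> int:
    """Length, in repeat units, of the longest uninterrupted CAG tract."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


# Made-up read: 5 CAG units between arbitrary flanking bases.
fastq = io.StringIO("@read1\nTTC" + "CAG" * 5 + "CCGCCA\n+\n" + "I" * 24 + "\n")
for read_id, sequence in read_fastq(fastq):
    print(read_id, longest_cag_run(sequence))  # read1 5
```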
5.2 DNA write. (i) What DNA would you want to synthesize (e.g., write) and why? These could be individual genes, clusters of genes or genetic circuits, whole genomes, and beyond. As described in class thus far, applications could range from therapeutics and drug discovery (e.g., mRNA vaccines and therapies) to novel biomaterials (e.g. structural proteins), to sensors (e.g., genetic circuits for sensing and responding to inflammation, environmental stimuli, etc.), to art (DNA origamis). If possible, include the specific genetic sequence(s) of what you would like to synthesize! You will have the opportunity to actually have Twist synthesize these DNA constructs! :)
A) To tie back to my 5.1 answer, I would like to synthesize a therapeutic construct targeting HTT using CRISPR. It would encode a guide RNA that targets mutant HTT, together with Cas9 to cut the targeted DNA. The goal is to cut the expanded repeats in the HTT gene.
(ii) What technology or technologies would you use to perform this DNA synthesis and why? Also answer the following questions:
What are the essential steps of your chosen sequencing methods? What are the limitations of your sequencing method (if any) in terms of speed, accuracy, scalability?
A) I would use phosphoramidite DNA synthesis, which builds DNA one base at a time. It is well suited to short sequences (we are not synthesizing the entire HTT gene, only the DNA encoding a guide that targets the repeat region). The essential steps: attach the first nucleotide to a solid support as a starting point; run the elongation cycle, which involves deprotection, coupling, capping, and oxidation; repeat the cycle until the full DNA sequence is produced; then detach the DNA strand and purify it. Limitations: it is very expensive for long sequences, and the length it can synthesize is limited, because each coupling step is slightly less than 100% efficient and the yield of full-length product falls as the strand grows.
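The length limit can be made concrete with a small calculation. If every coupling step succeeds with the same probability, the fraction of strands that reach full length is that probability raised to the number of couplings. The Python sketch below assumes a coupling efficiency of 99%, which is an illustrative value and not a figure from any specific synthesizer or vendor.

```python
def full_length_fraction(length: int, coupling_efficiency: float = 0.99) -> float:
    """Fraction of strands reaching full length after (length - 1) coupling cycles.

    The first nucleotide is already attached to the solid support, so a strand
    of `length` bases needs `length - 1` successful couplings.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return coupling_efficiency ** (length - 1)


for n in (20, 100, 200):
    print(n, round(full_length_fraction(n), 3))
```

Under this assumption roughly 83% of 20-mers are full length, but only about 14% of 200-mers are, which is why long constructs are usually assembled from shorter oligos.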
5.3 DNA Edit (i) What DNA would you want to edit and why? In class, George shared a variety of ways to edit the genes and genomes of humans and other organisms. Such DNA editing technologies have profound implications for human health, development, and even human longevity and human augmentation. DNA editing is also already commonly leveraged for flora and fauna, for example in nature conservation efforts, (animal/plant restoration, de-extinction), or in agriculture (e.g. plant breeding, nitrogen fixation). What kinds of edits might you want to make to DNA (e.g., human genomes and beyond) and why?
A) I would want to edit the mutant HTT gene in human neural cells to remove CAG repeats. I would use CRISPR/Cas9, in which the guide RNA binds to the targeted DNA and the Cas9 enzyme cuts the double-stranded DNA; the cell then repairs the cut through NHEJ or HDR. 1) Steps: design the gRNA, deliver the CRISPR system into the cell, Cas9 cuts the DNA, and the cell repairs the break. 2) Limitations: cutting in the wrong places (off-target effects), harming the normal HTT allele instead of targeting the expanded repeat region, and inefficient repair.
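The first step, designing the gRNA, starts with finding candidate target sites. The commonly used Cas9 from Streptococcus pyogenes needs a 20-nucleotide target immediately followed by an NGG PAM. The Python sketch below lists such sites on one strand only. The input is a made-up sequence, not the real HTT locus, and `find_cas9_sites` is my own helper name.

```python
import re


def find_cas9_sites(dna: str, guide_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for every NGG PAM on the given strand."""
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len
    return [(m.start(), m.group(1), m.group(2)) for m in re.finditer(pattern, dna)]


# Made-up sequence for illustration only.
example = "A" * 20 + "TGG" + "CCCC"
print(find_cas9_sites(example))
# [(0, 'AAAAAAAAAAAAAAAAAAAA', 'TGG')]
```

A real design would also scan the reverse complement and score each candidate for off-target matches elsewhere in the genome, which ties directly to the "cutting in the wrong places" limitation above.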