Week 2 HW: DNA Read, Write, and Edit

Part 1: Benchling and In-silico Gel Art 

I started this week with great ambition. Frankly, that faded pretty fast! Initially, I wanted to only use enzymes that started with letters in my name (Katharine) but that was over-ambitious given the time I had available to devote to the exercise this week. I had a couple ideas in mind for design… a space invader; the Windows logo. I had used Benchling before (in fact I’m trying to design a competitor) but was a bit rusty, so it took me a second to get my bearings. Following the documentation provided (thank you!) and playing around a bit with Ronan’s website for some inspiration, I was able to kind of make “HI”. Initially, I also included an exclamation point, but it just wasn’t translating. See below for screenshots of my viewer and the final PNG. Enzymes were selected and arranged according to the steps outlined in the provided protocol.

Part 2 was not completed due to lack of lab access. However, I ran gels often at an old gig looking at shark DNA. Took me a second to stop breaking the wells. Pissed my manager off a lot, though she did her best not to show it. Miss those days!

Part 3: DNA Design Challenge

3.1 Choose Your Protein

I have chosen the human microtubule-associated protein tau for this exercise. Before my current job at HoX, I spent 6 years working in Translational R&D at Ed Boyden and Li-Huei Tsai’s incredible startup, Cognito Therapeutics. As the “amyloid hypothesis” (which posits that the buildup of amyloid-beta peptides in the brain is the primary driver of Alzheimer’s pathogenesis) has fallen by the wayside, and interest in tau has risen in its place, I spent many hours devising a schematic/protocol for the exploration of novel tau and tau-adjacent biomarkers in our pivotal trial participants, and that’s all I can say for now 🙂

Here’s the sequence of this nasty thing! 

>sp|P10636|TAU_HUMAN Microtubule-associated protein tau OS=Homo sapiens OX=9606 GN=MAPT PE=1 SV=5

MAEPRQEFEVMEDHAGTYGLGDRKDQGGYTMHQDQEGDTDAGLKESPLQTPTEDGSEEPG

SETSDAKSTPTAEDVTAPLVDEGAPGKQAAAQPHTEIPEGTTAEEAGIGDTPSLEDEAAG

HVTQEPESGKVVQEGFLREPGPPGLSHQLMSGMPGAPLLPEGPREATRQPSGTGPEDTEG

GRHAPELLKHQLLGDLHQEGPPLKGAGGKERPGSKEEVDEDRDVDESSPQDSPPSKASPA

QDGRPPQTAAREATSIPGFPAEGAIPLPVDFLSKVSTEIPASEPDGPSVGRAKGQDAPLE

FTFHVEITPNVQKEQAHSEEHLGRAAFPGAPGEGPEARGPSLGEDTKEADLPEPSEKQPA

AAPRGKPVSRVPQLKARMVSKSKDGTGSDDKKAKTSTRSSAKTLKNRPCLSPKHPTPGSS

DPLIQPSSPAVCPEPPSSPKYVSSVTSRTGSSGAKEMKLKGADGKTKIATPRGAAPPGQK

GQANATRIPAKTPPAPKTPPSSGEPPKSGDRSGYSSPGSPGTPGSRSRTPSLPTPPTREP

KKVAVVRTPPKSPSSAKSRLQTAPVPMPDLKNVKSKIGSTENLKHQPGGGKVQIINKKLD

LSNVQSKCGSKDNIKHVPGGGSVQIVYKPVDLSKVTSKCGSLGNIHHKPGGGQVEVKSEK

LDFKDRVQSKIGSLDNITHVPGGGNKKIETHKLTFRENAKAKTDHGAEIVYKSPVVSGDT

SPRHLSNVSSTGSIDMVDSPQLATLADEVSASLAKQGL

3.2 Reverse Translate: Protein sequence to DNA sequence

Here is the unoptimized DNA sequence for this protein, generated with the reverse translation tool at bioinformatics.org:

atggcggaaccgcgccaggaatttgaagtgatggaagatcatgcgggcacctatggcctg

ggcgatcgcaaagatcagggcggctataccatgcatcaggatcaggaaggcgataccgat

gcgggcctgaaagaaagcccgctgcagaccccgaccgaagatggcagcgaagaaccgggc

agcgaaaccagcgatgcgaaaagcaccccgaccgcggaagatgtgaccgcgccgctggtg

gatgaaggcgcgccgggcaaacaggcggcggcgcagccgcataccgaaattccggaaggc

accaccgcggaagaagcgggcattggcgataccccgagcctggaagatgaagcggcgggc

catgtgacccaggaaccggaaagcggcaaagtggtgcaggaaggctttctgcgcgaaccg

ggcccgccgggcctgagccatcagctgatgagcggcatgccgggcgcgccgctgctgccg

gaaggcccgcgcgaagcgacccgccagccgagcggcaccggcccggaagataccgaaggc

ggccgccatgcgccggaactgctgaaacatcagctgctgggcgatctgcatcaggaaggc

ccgccgctgaaaggcgcgggcggcaaagaacgcccgggcagcaaagaagaagtggatgaa

gatcgcgatgtggatgaaagcagcccgcaggatagcccgccgagcaaagcgagcccggcg

caggatggccgcccgccgcagaccgcggcgcgcgaagcgaccagcattccgggctttccg

gcggaaggcgcgattccgctgccggtggattttctgagcaaagtgagcaccgaaattccg

gcgagcgaaccggatggcccgagcgtgggccgcgcgaaaggccaggatgcgccgctggaa

tttacctttcatgtggaaattaccccgaacgtgcagaaagaacaggcgcatagcgaagaa

catctgggccgcgcggcgtttccgggcgcgccgggcgaaggcccggaagcgcgcggcccg

agcctgggcgaagataccaaagaagcggatctgccggaaccgagcgaaaaacagccggcg

gcggcgccgcgcggcaaaccggtgagccgcgtgccgcagctgaaagcgcgcatggtgagc

aaaagcaaagatggcaccggcagcgatgataaaaaagcgaaaaccagcacccgcagcagc

gcgaaaaccctgaaaaaccgcccgtgcctgagcccgaaacatccgaccccgggcagcagc

gatccgctgattcagccgagcagcccggcggtgtgcccggaaccgccgagcagcccgaaa

tatgtgagcagcgtgaccagccgcaccggcagcagcggcgcgaaagaaatgaaactgaaa

ggcgcggatggcaaaaccaaaattgcgaccccgcgcggcgcggcgccgccgggccagaaa

ggccaggcgaacgcgacccgcattccggcgaaaaccccgccggcgccgaaaaccccgccg

agcagcggcgaaccgccgaaaagcggcgatcgcagcggctatagcagcccgggcagcccg

ggcaccccgggcagccgcagccgcaccccgagcctgccgaccccgccgacccgcgaaccg

aaaaaagtggcggtggtgcgcaccccgccgaaaagcccgagcagcgcgaaaagccgcctg

cagaccgcgccggtgccgatgccggatctgaaaaacgtgaaaagcaaaattggcagcacc

gaaaacctgaaacatcagccgggcggcggcaaagtgcagattattaacaaaaaactggat

ctgagcaacgtgcagagcaaatgcggcagcaaagataacattaaacatgtgccgggcggc

ggcagcgtgcagattgtgtataaaccggtggatctgagcaaagtgaccagcaaatgcggc

agcctgggcaacattcatcataaaccgggcggcggccaggtggaagtgaaaagcgaaaaa

ctggattttaaagatcgcgtgcagagcaaaattggcagcctggataacattacccatgtg

ccgggcggcggcaacaaaaaaattgaaacccataaactgacctttcgcgaaaacgcgaaa

gcgaaaaccgatcatggcgcggaaattgtgtataaaagcccggtggtgagcggcgatacc

agcccgcgccatctgagcaacgtgagcagcaccggcagcattgatatggtggatagcccg

cagctggcgaccctggcggatgaagtgagcgcgagcctggcgaaacagggcctg
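Here is a minimal sketch of what a reverse translation like this does: each amino acid is swapped for a single fixed codon. The codon choices below are read off the output above (for example M -> atg, A -> gcg); whether the tool uses exactly this table internally is my assumption, and `reverse_translate` is a name I made up for illustration.

```python
# One codon per amino acid, inferred from the reverse-translated sequence above.
# Assumption: the tool always emits the same codon for a given amino acid.
# W does not occur in the tau sequence above; tgg is its only standard codon.
PREFERRED_CODON = {
    "A": "gcg", "C": "tgc", "D": "gat", "E": "gaa", "F": "ttt",
    "G": "ggc", "H": "cat", "I": "att", "K": "aaa", "L": "ctg",
    "M": "atg", "N": "aac", "P": "ccg", "Q": "cag", "R": "cgc",
    "S": "agc", "T": "acc", "V": "gtg", "W": "tgg", "Y": "tat",
}


def reverse_translate(protein: str) -> str:
    """Map each amino acid to one fixed codon (no optimization, no stop codon)."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())


if __name__ == "__main__":
    # First ten residues of tau from the FASTA record above.
    print(reverse_translate("MAEPRQEFEV"))
```

Because the mapping is one-to-one, the output is deterministic but ignores codon usage in the expression host, which is what the optimization step in 3.3 addresses.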

3.3 Codon optimization

Here is the codon-optimized DNA sequence from vectorbuilder.com: 

ATGGCTGAGCCCCGGCAGGAGTTCGAAGTGATGGAAGACCATGCTGGAACCTATGGTCTGGGCGACAGGAAGGACCAGGGCGGATACACAATGCATCAGGACCAGGAGGGCGACACAGACGCCGGCCTGAAAGAGTCTCCCCTGCAGACCCCTACCGAAGACGGGTCAGAAGAGCCCGGCTCTGAGACCTCTGACGCTAAGAGCACACCAACCGCCGAGGATGTCACCGCCCCCCTGGTGGATGAAGGCGCCCCCGGCAAACAGGCCGCAGCCCAGCCCCACACTGAGATCCCCGAAGGAACAACCGCTGAGGAGGCCGGCATTGGCGATACCCCTTCTCTGGAAGATGAAGCCGCCGGGCACGTGACCCAGGAACCTGAGTCTGGAAAGGTCGTGCAGGAAGGCTTCCTGCGCGAGCCAGGGCCTCCCGGACTGTCTCACCAACTCATGAGCGGCATGCCCGGGGCCCCTTTACTCCCCGAGGGTCCCAGAGAGGCCACACGTCAGCCATCTGGAACAGGCCCCGAGGACACCGAAGGCGGTAGACATGCTCCAGAGCTGCTTAAACACCAGCTGCTGGGCGACCTCCACCAGGAGGGCCCTCCTCTGAAGGGCGCCGGGGGCAAGGAAAGGCCCGGCAGTAAAGAGGAAGTGGATGAGGACAGAGATGTGGATGAATCTTCTCCTCAGGATTCTCCCCCATCTAAGGCCTCTCCTGCCCAGGACGGCAGGCCACCTCAGACTGCTGCCAGGGAGGCCACCTCCATTCCTGGATTCCCAGCAGAAGGCGCCATTCCACTGCCCGTGGATTTCCTGTCTAAAGTGTCAACCGAAATCCCCGCTTCTGAACCCGATGGCCCTTCCGTGGGGCGAGCCAAGGGCCAGGACGCCCCTCTGGAGTTCACCTTTCATGTGGAGATAACACCAAACGTGCAGAAGGAGCAGGCTCACTCTGAGGAGCATCTTGGGAGAGCTGCCTTTCCCGGCGCCCCTGGGGAAGGGCCAGAGGCCAGAGGGCCTTCCCTGGGCGAGGACACAAAGGAGGCCGATCTGCCCGAACCTAGCGAGAAGCAGCCCGCTGCTGCTCCTCGCGGGAAACCAGTGTCCCGGGTCCCACAACTCAAGGCTAGAATGGTTTCCAAGTCCAAGGACGGAACAGGCTCAGACGATAAAAAGGCCAAGACTAGCACCCGGTCTAGTGCCAAGACACTGAAAAACCGCCCCTGCCTGAGCCCTAAGCACCCAACACCCGGAAGTTCTGACCCTCTGATTCAGCCCTCTTCCCCTGCAGTGTGTCCCGAGCCTCCTTCCAGTCCCAAATACGTGTCATCTGTAACTAGCCGGACTGGCTCCAGCGGAGCCAAAGAGATGAAGCTCAAGGGGGCCGACGGGAAGACAAAGATTGCCACCCCTCGGGGCGCCGCCCCTCCTGGACAGAAGGGACAGGCCAACGCCACCCGAATCCCTGCCAAGACCCCTCCAGCCCCGAAGACCCCCCCTAGTTCCGGGGAACCTCCCAAGTCTGGAGACCGGTCCGGATATAGTTCACCAGGAAGCCCTGGGACCCCAGGATCTAGGTCCAGGACACCCTCTCTGCCTACTCCCCCTACAAGGGAGCCCAAAAAAGTCGCCGTGGTGAGAACCCCCCCTAAGTCACCCTCCTCCGCTAAATCTCGGCTGCAGACTGCTCCTGTGCCCATGCCTGACCTGAAAAATGTGAAGTCTAAAATCGGCTCCACCGAGAACCTGAAGCACCAGCCCGGGGGCGGCAAAGTGCAAATCATCAATAAGAAGCTGGATCTGTCCAACGTGCAGTCCAAATGCGGGTCCAAGGACAACATCAAGCATGTGCCTGGGGGTGGCTCCGTGCAGATTGTGTACAAGCCCGTGGATCTGAGCAAGGTTACCTCCAAGTGTGGGTCCCTGGGCAATATCCACCACAAGCCAGGAGGCGGACAGGTTGAGGTAAAATCCGAAAAGCTGGACTTTAAGGACCGGGT
GCAGAGCAAAATTGGCTCTCTGGATAATATCACCCACGTGCCAGGAGGCGGCAACAAGAAGATCGAAACCCATAAGCTGACTTTTCGCGAGAATGCCAAGGCAAAGACTGACCACGGGGCCGAGATCGTGTATAAAAGCCCGGTTGTCTCTGGGGATACATCTCCAAGGCACCTGTCCAACGTTAGTTCCACCGGGAGCATCGATATGGTGGATTCTCCTCAACTGGCAACACTGGCCGACGAGGTGTCCGCCTCCCTGGCTAAACAGGGGCTG
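A quick sanity check I find useful after codon optimization: the optimized DNA should still translate to the same protein, and its GC content will usually shift. The sketch below uses the standard genetic code and only the first 30 bases of each sequence above as toy inputs; `translate` and `gc_content` are hypothetical helper names, not part of any tool used here.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def gc_content(dna: str) -> float:
    """Fraction of bases that are G or C."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)


if __name__ == "__main__":
    naive = "atggcggaaccgcgccaggaatttgaagtg"      # start of the 3.2 sequence
    optimized = "ATGGCTGAGCCCCGGCAGGAGTTCGAAGTG"  # start of the 3.3 sequence
    print(translate(naive) == translate(optimized))
    print(round(gc_content(naive), 3), round(gc_content(optimized), 3))
```

Running the same comparison on the full sequences would confirm that optimization changed only synonymous codons.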

3.4 You have a sequence! Now what?

i) What technologies could be used to produce this protein from your DNA? Describe in your words the DNA sequence can be transcribed and translated into your protein. You may describe either cell-dependent or cell-free methods, or both.

Let’s remember our cornerstone here, the central dogma: DNA -> RNA -> protein. We just need to choose an environment and the methods by which to execute this workflow.

First, for research purposes, we can decide which expression system we want to use, for example E. coli, human iPSCs, or transgenic mice. Let’s go with stem cells, I think, which already contain the MAPT gene (the gene that encodes tau, the microtubule-associated protein tau). And let’s say we want to make pTau-217 specifically, a phosphorylated tau that is a really hot biomarker in Alzheimer’s research. It has recently shot up as a protein of interest, but in fact collaborators and I were researching it before the first commercial assay for it even came out (from Eli Lilly). It is a truly fascinating protein implicated in a lot of nervous system diseases, and I encourage interested parties to check it out.

First, we need to get the stem cells to express a neuronal identity. For AD research that is often forebrain cortical neurons. So, we need to turn off the pluripotency genes and turn on neural lineage transcription factors using chromatin remodeling complexes, neuronal transcription factors (to bind regulatory DNA near MAPT), and histone modifications to make the MAPT gene transcriptionally accessible. 

Inside the nucleus, RNA polymerase II binds to the MAPT promoter, the DNA unwinds, and the polymerase reads the template strand, bringing us closer to the second step in our central dogma through the production of pre-mRNA. Then, a 5’ cap is added, introns are removed, exons are joined, and a poly-A tail is added. Alternative splicing will determine whether we produce 3R or 4R tau based on the inclusion or exclusion of exon 10 (R represents the number of microtubule-binding repeats). Then, the mRNA exits the nucleus.

The ribosome assembles on the mRNA and begins translation at the start codon; tRNAs match codons with amino acids, and peptide bonds are formed between them. The polypeptide chain grows until a stop codon is reached.
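To make the DNA -> RNA -> protein flow concrete, here is a toy sketch under simplifying assumptions: the mRNA is written as the coding strand with T replaced by U (the polymerase actually reads the complementary template strand), splicing is skipped, and translation just scans from the first AUG to the first in-frame stop codon. The flanking bases and the stop codon in the example are made up for illustration; only the ATGGCTGAGCCC stretch comes from the optimized sequence above.

```python
# Standard genetic code for RNA codons, built in UCAG order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching the coding strand (T -> U)."""
    return coding_strand.upper().replace("T", "U")


def translate_mrna(mrna: str) -> str:
    """Start at the first AUG and add residues until an in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":  # stop codon: release the chain
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    # Hypothetical mini-gene: made-up 5' flank + first four tau codons + made-up stop and tail.
    dna = "GCCACC" + "ATGGCTGAGCCC" + "TGA" + "AAAA"
    mrna = transcribe(dna)
    print(mrna)
    print(translate_mrna(mrna))
```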

The final step in generating pTau-217 from endogenous MAPT in iPSC-derived neurons is going to be the phosphorylation of tau at threonine 217 by a proline-directed kinase, a type of kinase that phosphorylates serine or threonine residues that are immediately followed by a proline. After tau is translated, the kinase transfers a phosphate group from ATP to the hydroxyl group of Thr217 (the threonine at amino acid position 217, by convention numbered on the 441-residue 2N4R isoform rather than the longer UniProt sequence above), producing phosphorylated tau. pTau-217 is very strongly correlated with Alzheimer’s pathology and prognosis, and is detectable in cerebrospinal fluid and plasma, making it a fascinating AND accessible exploratory biomarker. We have also seen it to be highly specific in discerning severity of amyloid-associated tau pathology.
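Proline-directed kinases target serine or threonine residues that sit directly before a proline, so candidate sites can be found with a simple motif scan. The sketch below runs on a short fragment copied from the FASTA record above. Two things here are my own assumptions: the fragment's start position (524, counted at 60 residues per line in the 758-residue sequence) and the offset of 317 used to convert to the conventional 441-residue 2N4R numbering.

```python
import re

# Fragment of the tau sequence above (ninth FASTA line, residues 524-540 by my count).
FRAGMENT = "GSRSRTPSLPTPPTREP"
FRAGMENT_START = 524   # assumption: position of the first residue in the 758-aa sequence
OFFSET_TO_2N4R = 317   # assumption: shift between 758-aa and 441-aa (2N4R) numbering here


def proline_directed_sites(seq: str, first_position: int = 1) -> list[tuple[str, int]]:
    """Return (residue, position) for every S or T immediately followed by P."""
    return [
        (match.group(), first_position + match.start())
        for match in re.finditer(r"[ST](?=P)", seq.upper())
    ]


if __name__ == "__main__":
    for residue, position in proline_directed_sites(FRAGMENT, FRAGMENT_START):
        print(residue, position, "-> 2N4R position", position - OFFSET_TO_2N4R)
```

A motif match only says a site is a plausible substrate; whether it is actually phosphorylated in a given cell depends on kinase and phosphatase activity.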

3.5 [Optional] How does it work in nature/biological systems?

In nature, pTau-217 is made by the same steps outlined above, once we get past the turning-stem-cells-to-neurons part. However, I can explain how elevated levels arise in humans: through kinase overactivation and reduced phosphatase activity, which promote the aggregation of this protein.

i) Describe how a single gene codes for multiple proteins at the transcriptional level.

A single gene can code for multiple proteins at the transcriptional level through several regulatory mechanisms that alter how its RNA transcript is generated or processed before translation. The most common mechanism is alternative splicing, in which different combinations of exons are joined together from the same pre-mRNA by the spliceosome, producing mRNA variants. Another mechanism is alternative promoter usage, where transcription begins at different promoter regions within the same gene, generating transcripts with different 5′ exons and often different N-terminal protein sequences that can affect localization or regulation. Alternative polyadenylation also contributes to diversity by allowing transcription to terminate at different polyadenylation signals, producing mRNAs with different 3′ ends that can alter coding regions or influence mRNA stability and translation efficiency. Finally, RNA editing modifies specific nucleotides within the RNA after transcription, such as converting adenosine to inosine, which can change codons and result in amino acid substitutions not directly encoded in the DNA sequence. Together, these mechanisms enable a single gene to produce multiple distinct protein isoforms, greatly expanding functional diversity without increasing genome size.
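Tau itself is a nice example of alternative splicing: in the adult brain, including or skipping exons 2, 3, and 10 of MAPT gives six isoforms, named by the number of N-terminal inserts (0N, 1N, 2N) and microtubule-binding repeats (3R, 4R). The sketch below just enumerates the allowed combinations, using the rule that exon 3 is only included together with exon 2. The function name is made up for illustration.

```python
from itertools import product


def tau_isoforms() -> list[str]:
    """Enumerate tau isoform names from inclusion/exclusion of exons 2, 3, and 10."""
    names = []
    for exon2, exon3, exon10 in product([False, True], repeat=3):
        if exon3 and not exon2:
            continue  # exon 3 is not included without exon 2
        n_inserts = int(exon2) + int(exon3)   # 0N, 1N, or 2N
        repeats = 4 if exon10 else 3          # exon 10 encodes the extra repeat
        names.append(f"{n_inserts}N{repeats}R")
    return names


if __name__ == "__main__":
    print(tau_isoforms())
```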

ii) Try aligning the DNA sequence, the transcribed RNA, and also the resulting translated Protein!!! See example below.

[Did not complete this optional sub-question].

Part 4: Prepare a Twist DNA Synthesis Order

Part 4.2 Build your DNA Insert Sequence

Below are the steps I used to build my insert sequence: 

(Not sure why this turned out so dark, but alas, it went like this and so on and so forth, until…) 

Glorious! 

View it here: https://benchling.com/s/seq-14sPMo30CkZYN9taSAEg?m=slm-oQBXs48FU1YuPssOy0A2 

Note: I wanted to use SBOL but it was throwing me an error the moment I opened the site. 

4.3 / 4.4: Using Twist

Nice! Now back to Benchling…

We did it!!!

Part 5: DNA Read/Write/Edit

Part 5.1 DNA Read

(i) What DNA would you want to sequence (e.g., read) and why? This could be DNA related to human health (e.g. genes related to disease research), environmental monitoring (e.g., sewage waste water, biodiversity analysis), and beyond (e.g. DNA data storage, biobank).

I’ll start by talking about a project pretty close to my work: sequencing DNA in areas of political uprising or conflict zones to test for the use of novel biothreats. In this case, I would want to look at human DNA and RNA. Engineered or heavily mutated pathogens can evade direct identification on the battlefield due to novelty, low abundance, or deliberate obfuscation (being engineered to stay under certain detection thresholds or to resist easy categorization), which makes them hard to match against reference databases. This is a pickle, but biology is on our side, because the human immune response is impossible (for now, thankfully!) to conceal. Whole genome sequencing can reveal susceptibility loci and structural variants, while single-cell RNA sequencing can capture interferon signaling, inflammatory cascades, and isoform switching that signal abnormal biological stress, even if we don’t know which pathogen is causing these phenomena. Time is very much of the essence on the battlefield or in areas where resources are limited and situations are volatile, so it’s not always an option to sequence a novel pathogen and do detective work to figure out exactly what the hell is going on. Integrating host genomics with metagenomic reads gives us a pretty illuminating readout of how the body is responding to a foreign encounter. This host-centered approach, which is growing in popularity, shifts the focus from taxonomy to the impact on humans at a time when pathogen identification just might not be feasible.

(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?

Let’s use the current state of the art, Oxford Nanopore’s MinION, as an example; it enables this kind of portable, real-time, long-read sequencing of host (and pathogen) nucleic acids in low-infrastructure settings. It gives us resolution on structural variation, haplotypes, splice isoforms, and unknown microbial genomes without relying on a reference database, which we have already established can be impractical in these settings.

Also answer the following questions:

Is your method first-, second- or third-generation or other? How so?

Nanopore sequencing is a third-generation sequencing technology. It reads single, native DNA or RNA molecules directly and in real time (including full-length transcripts), without requiring PCR amplification.

What is your input? How do you prepare your input (e.g. fragmentation, adapter ligation, PCR)? List the essential steps.

Input: Genomic DNA from blood (could also use tissue but blood is easier).

  1. Extract genomic DNA.

  2. Optionally fragment the DNA if more uniform read lengths are desired. With long-read sequencing this is not necessary, and fragments can be left long.

  3. Perform end repair and dA-tailing to prepare DNA ends.

  4. Ligate sequencing adapters containing a motor protein.

  5. Load the prepared library onto the nanopore flow cell.

For RNA sequencing: 

  1. Extract RNA.

  2. Enrich for polyadenylated RNA if targeting mRNA.

  3. Ligate sequencing adapters directly to RNA.

  4. Attach motor protein and load onto flow cell.

cDNA sequencing (not recommended due to use of PCR):

  1. Extract RNA.

  2. Reverse transcribe RNA into cDNA.

  3. Optionally amplify via PCR.

  4. Perform end repair.

  5. Ligate sequencing adapters.

  6. Load onto flow cell.

What are the essential steps of your chosen sequencing technology, how does it decode the bases of your DNA sample (base calling)?

  1. A voltage is applied across a membrane containing protein nanopores.

  2. A motor protein feeds a single DNA or RNA strand through the nanopore.

  3. As short groups of bases pass through the pore, they disrupt the ionic current in characteristic ways. The MinION records this.

  4. Computational base-calling algorithms use models to translate electrical signal patterns into nucleotide sequences.

  5. In short, bases are decoded from these changes in electrical current rather than from fluorescence or synthesis chemistry.

What is the output of your chosen sequencing technology?

The output includes:

  • Raw electrical signal data files.

  • Basecalled sequences in FASTQ format.

  • Long sequencing reads ranging from kilobases to potentially megabases.

  • Quality scores for each base.

These reads can then be aligned to reference genomes, assembled de novo, analyzed for structural variants, used to determine splice isoforms, and screened for metagenomic content.
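As a rough illustration of working with that output, here is a small sketch that parses FASTQ text and summarizes read lengths and quality. It assumes plain four-line records with Phred+33 quality encoding; the two reads are toy data I made up, and averaging Phred scores arithmetically is a simplification.

```python
def parse_fastq(text: str) -> list[tuple[str, str, list[int]]]:
    """Parse four-line FASTQ records into (name, sequence, Phred scores)."""
    lines = text.strip().splitlines()
    records = []
    for i in range(0, len(lines) - 3, 4):
        name = lines[i][1:]                            # drop the leading '@'
        sequence = lines[i + 1]
        quals = [ord(ch) - 33 for ch in lines[i + 3]]  # Phred+33 decoding
        records.append((name, sequence, quals))
    return records


def mean_quality(quals: list[int]) -> float:
    """Simple arithmetic mean of per-base Phred scores."""
    return sum(quals) / len(quals)


def n50(lengths: list[int]) -> int:
    """Length L such that reads of length >= L hold at least half of all bases."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0


if __name__ == "__main__":
    toy = "@read1\nACGTACGTAC\n+\nIIIIIIIIII\n@read2\nACGT\n+\n5555\n"
    reads = parse_fastq(toy)
    for name, seq, quals in reads:
        print(name, len(seq), mean_quality(quals))
    print("N50:", n50([len(seq) for _, seq, _ in reads]))
```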

Part 5.2 DNA Write

(i) What DNA would you want to synthesize (e.g., write) and why? These could be individual genes, clusters of genes or genetic circuits, whole genomes, and beyond. As described in class thus far, applications could range from therapeutics and drug discovery (e.g., mRNA vaccines and therapies) to novel biomaterials (e.g. structural proteins), to sensors (e.g., genetic circuits for sensing and responding to inflammation, environmental stimuli, etc.), to art (DNA origamis). If possible, include the specific genetic sequence(s) of what you would like to synthesize! You will have the opportunity to actually have Twist synthesize these DNA constructs! :)

Back in the day (and still sometimes in my fantasies, though I don’t think I have the mechanical engineering chops for it) my goal was to work in next-generation bionics, engineering prosthetic, intelligent limbs for people who no longer had them. I’ve always been curious about the mechanical hardware, and as a neuroscientist had considered the biology of the nervous system—but never the meeting of the two in the molecular, biological hardware. 

If I were to choose anything, I would want to use this for a bionic application in amputees. This would look like a project to improve the long-term biological interface between the residual nerves and the implanted electrodes; a major limitation of advanced prosthetics is chronic inflammation and fibrotic encapsulation at the implant site, which degrades signal quality over time, leading to increasingly poor performance. To do this, I would want to introduce cassettes that reduce inflammation and promote nerve stability and regeneration. For example, I would take the human IL10 coding sequence (NM_000572.3; interleukins being a class of biomarkers I am familiar with from my work in Alzheimer’s Disease) and express it under an inflammation-responsive promoter, so that anti-inflammatory interleukin-10 (IL-10) is produced when inflammation is particularly active. This creates a feedback-mediated response that can reduce damaging immune responses without causing broad immune suppression.

For nerve stability and regeneration, conditional expression of the BDNF coding sequence under an injury state-responsive promoter might do the trick (only make BDNF when the nearby neurons are in an injury response state) (NM_001709.5). 

(ii) What technology or technologies would you use to perform this DNA synthesis and why?

Commercial gene synthesis should do the trick, like the kind that Twist offers: short oligonucleotides are synthesized on high-density silicon platforms, then assembled into double-stranded gene fragments using overlap-based assembly methods. This method should be precise and defined enough to accommodate my cassettes.

Also answer the following questions:

What are the essential steps of your chosen sequencing methods?

  1. Chemical oligonucleotide synthesis

  2. Cleavage and removal of protection groups 

  3. Oligo assembly into longer fragments

  4. Amplification and cloning

  5. Sanger sequencing or next-gen sequencing verification
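To picture the oligo assembly step, here is a simplified sketch that cuts a target sequence into overlapping tiles and stitches them back together through the shared overlaps. The oligo length and overlap are illustrative values I picked, not Twist's actual parameters, and real assembly designs alternate strands and balance melting temperatures, which this ignores.

```python
def tile_oligos(seq: str, oligo_len: int = 60, overlap: int = 20) -> list[str]:
    """Split seq into tiles of up to oligo_len that overlap their neighbor by `overlap` bases."""
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    tiles = []
    start = 0
    while True:
        tiles.append(seq[start:start + oligo_len])
        if start + oligo_len >= len(seq):
            break
        start += step
    return tiles


def reassemble(tiles: list[str], overlap: int = 20) -> str:
    """Rebuild the sequence by checking each overlap and appending the new bases."""
    assembled = tiles[0]
    for tile in tiles[1:]:
        if assembled[-overlap:] != tile[:overlap]:
            raise ValueError("adjacent tiles do not share the expected overlap")
        assembled += tile[overlap:]
    return assembled


if __name__ == "__main__":
    target = "ATGGCTGAGCCCCGGCAGGAGTTCGAAGTGATGGAAGACCATGCTGGAACCTATGGTCTGGGCGACAGGAAGGACCAGGGC"
    tiles = tile_oligos(target, oligo_len=40, overlap=15)
    print(len(tiles), "tiles")
    print(reassemble(tiles, overlap=15) == target)
```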

What are the limitations of your sequencing method (if any) in terms of speed, accuracy, scalability?

Speed: Larger constructs require assembly of multiple fragments, which increases time and complexity. Turnaround can be days to weeks (we provide these services at my job, and we’re looking at ways to make this shorter using automation).

Accuracy: While this method has a small per-base error rate, errors can accumulate during assembly. We can use sequence verification to reduce the likelihood of this. 
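The way errors accumulate with length can be estimated with a back-of-the-envelope model: if each base is wrong independently with probability e, a construct of length L is error-free with probability (1 - e)^L. The error rates below are illustrative placeholders, not measured values for any vendor.

```python
def p_error_free(length_bp: int, per_base_error: float) -> float:
    """Probability that a construct has no errors, assuming independent per-base errors."""
    return (1.0 - per_base_error) ** length_bp


if __name__ == "__main__":
    # 2274 bp is the length of the tau coding sequence above (758 codons).
    for length in (300, 1000, 2274):
        for error_rate in (1e-3, 1e-4):  # hypothetical per-base error rates
            print(length, error_rate, round(p_error_free(length, error_rate), 3))
```

Even a small per-base error rate leaves a meaningful fraction of long constructs with at least one mistake, which is why sequence verification of individual clones matters.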

Scalability: The scalability issue here mostly comes down to cost. Cost scales with length and complexity.

5.3 DNA Edit

(i) What DNA would you want to edit and why? In class, George shared a variety of ways to edit the genes and genomes of humans and other organisms. Such DNA editing technologies have profound implications for human health, development, and even human longevity and human augmentation. DNA editing is also already commonly leveraged for flora and fauna, for example in nature conservation efforts, (animal/plant restoration, de-extinction), or in agriculture (e.g. plant breeding, nitrogen fixation). What kinds of edits might you want to make to DNA (e.g., human genomes and beyond) and why?

I honestly think I’ll use the same example, because for the purposes of HTGAA it fascinatingly elucidates how completely different methods can be used to the same end; the distinction being whether we are adding an engineered construct or modifying DNA directly.

(ii) What technology or technologies would you use to perform these DNA edits and why?

We could use a CRISPR-based system to insert the IL10 or BDNF constructs into the genome. This could allow stable, long-term expression, which could be useful especially in these chronic applications where we want to maintain as close to peak efficacy as possible in the long-term.

Also answer the following questions:

How does your technology of choice edit DNA? What are the essential steps?

CRISPR-Cas9 edits DNA by creating a double-strand break at a specific location in the genome. A guide RNA directs the Cas9 nuclease to the matching target sequence next to a PAM, where Cas9 introduces the double-strand break. The break is then repaired either through non-homologous end joining or through homology-directed repair (HDR), which can insert a designed DNA sequence (the route we would use in this case). Finally, we would screen the edited cells to confirm that the edit worked correctly.

What preparation do you need to do (e.g. design steps) and what is the input (e.g. DNA template, enzymes, plasmids, primers, guides, cells) for the editing?

First we need to select our genomic target site, or a “safe harbor” locus that would support expression of the inserted sequence. Then we need to design our guide RNA. We also need the donor DNA template, of course: our cassette flanked by homology arms that match the sequence on either side of the cut site. Then we perform off-target analysis in silico.

For this, we need:

  1. The Cas9 nuclease 

  2. Synthetic guide RNA 

  3. Donor DNA

  4. Target cells 

  5. Delivery system (viral vectors or even nanoparticles would be fun)

Then we need to sequence to double check accuracy. 
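The guide design step can be sketched as a scan for candidate target sites. For the commonly used SpCas9, a site is a 20-nt protospacer immediately followed by an NGG PAM, and the cut falls 3 bp upstream of the PAM. This toy version only looks at the forward strand and does no scoring; the function name and the example sequence are made up for illustration.

```python
def find_guides(seq: str, guide_len: int = 20) -> list[tuple[int, str, str, int]]:
    """Return (start, protospacer, PAM, cut_index) for each forward-strand NGG site."""
    seq = seq.upper()
    hits = []
    for pam_start in range(guide_len, len(seq) - 2):
        if seq[pam_start + 1:pam_start + 3] == "GG":   # N-G-G PAM
            start = pam_start - guide_len
            protospacer = seq[start:pam_start]
            pam = seq[pam_start:pam_start + 3]
            cut_index = pam_start - 3                   # break 3 bp upstream of the PAM
            hits.append((start, protospacer, pam, cut_index))
    return hits


if __name__ == "__main__":
    # Hypothetical stretch of target locus, not a real safe-harbor sequence.
    locus = "ACGTACGTACGTACGTACGTAGGTTCATCGATCGATTACGCTAGCTAGGATCC"
    for start, protospacer, pam, cut_index in find_guides(locus):
        print(start, protospacer, pam, cut_index)
```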

What are the limitations of your editing methods (if any) in terms of efficiency or precision?

Off-target effects are a concern; Cas9 may cut unintended genomic sites with partial sequence similarity, creating unwanted mutations.
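A crude way to picture the off-target problem is to count mismatches between the guide and every same-length window in a sequence, flagging windows that are close matches. This is only a sketch: real off-target tools also check for a PAM, search both strands, and weight mismatches by position, and the sequences below are toy examples.

```python
def mismatches(a: str, b: str) -> int:
    """Hamming distance between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def near_matches(guide: str, genome: str, max_mismatches: int = 3) -> list[tuple[int, str, int]]:
    """Return (position, window, mismatch count) for windows within max_mismatches of the guide."""
    guide = guide.upper()
    genome = genome.upper()
    hits = []
    for i in range(len(genome) - len(guide) + 1):
        window = genome[i:i + len(guide)]
        distance = mismatches(guide, window)
        if distance <= max_mismatches:
            hits.append((i, window, distance))
    return hits


if __name__ == "__main__":
    guide = "ACGTACGT"  # shortened toy guide; real guides are about 20 nt
    genome = "TT" + "ACGTACGT" + "GG" + "ACGAACGT" + "TT"
    for position, window, distance in near_matches(guide, genome, max_mismatches=1):
        print(position, window, distance)
```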

Also, homology-directed repair is inefficient in non-dividing or slowly dividing cells like mature neurons, since it’s most active during the S and G2 phases of the cell cycle.