Week 2 HW: DNA Read, Write, & Edit

Class Assignment Homework 2

Part 1: Benchling & In-silico Gel Art

See this week’s lab protocol “Gel Art: Restriction Digests and Gel Electrophoresis” for details. Overview: Make a free account at benchling.com. Import the Lambda DNA. Simulate restriction enzyme digestion with the following enzymes: EcoRI, HindIII, BamHI, KpnI, EcoRV, SacI, SalI. Create a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks. You might find Ronan’s website a helpful tool for quickly iterating on designs!

This was my first time working with Benchling, as well as my first experience exploring the possibilities of DNA design. As a first step, I imported the Lambda DNA sequence and began experimenting with digestion using enzymes such as EcoRI, EcoRV, SalI, SacI, and KpnI, among others. This led me to explore single digests and combinations of these enzymes, which produced a wide variety of band (ladder) patterns. From that point, I began manipulating the patterns by rearranging the order of the tabs, which allowed me to create my own custom design.
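To make the digestion step more concrete, here is a minimal Python sketch of what a virtual digest does: find every recognition site and report the fragment lengths that would appear as bands. It assumes a linear molecule and uses a short made-up test sequence (`toy`) instead of the real Lambda genome; the function names are my own and this is not how Benchling implements it.

```python
# Recognition sites and top-strand cut offsets (number of bases before the cut).
ENZYMES = {
    "EcoRI":   ("GAATTC", 1),  # G^AATTC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "BamHI":   ("GGATCC", 1),  # G^GATCC
    "KpnI":    ("GGTACC", 5),  # GGTAC^C
    "EcoRV":   ("GATATC", 3),  # GAT^ATC
    "SacI":    ("GAGCTC", 5),  # GAGCT^C
    "SalI":    ("GTCGAC", 1),  # G^TCGAC
}


def cut_positions(seq, enzyme):
    """Return the top-strand cut positions of one enzyme in a linear sequence."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, enzymes):
    """Return fragment lengths (bp), largest first, after cutting with all enzymes."""
    cuts = sorted({p for e in enzymes for p in cut_positions(seq, e)})
    bounds = [0] + cuts + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


# Made-up 28 bp test sequence with one EcoRI site and one BamHI site.
toy = "AAAAGAATTCAAAAAAAAGGATCCAAAA"
print(digest(toy, ["EcoRI"]))           # [23, 5]
print(digest(toy, ["EcoRI", "BamHI"]))  # [14, 9, 5]
```

Because all seven recognition sites are palindromic, searching one strand is enough. Each list of fragment lengths corresponds to one lane of the gel, so trying different enzyme combinations is a quick way to iterate on a pattern.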

Now behold the first robot, made with 100% DNA and enzymes.

3.1. Choose your protein.

In recitation, we discussed that you will pick a protein for your homework that you find interesting. Which protein have you chosen and why? Using one of the tools described in recitation (NCBI, UniProt, Google), obtain the protein sequence for the protein you chose.

[Example from our group homework; you may notice the particular format. The example below came from UniProt]

sp|P03609|LYS_BPMS2 Lysis protein OS=Escherichia phage MS2 OX=12022 PE=2 SV=1
METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLL
EAVIRTVTTLQQLLT

For the second week of the HTGAA course, I chose a molluscan shell matrix protein involved in calcium carbonate biomineralization, called mantle protein N25 (N25). This protein is found in the calcifying organism Pinctada fucata, commonly known as the Akoya pearl oyster. N25 is part of the shell matrix and helps regulate calcium carbonate biomineralization, playing a key role in crystal growth, morphology, and the formation of microscopic calcifying structures. I want to explore this protein because my goal is to engineer a material for artificial reefs that can act as a “magnet” for dissolved calcium carbonate in seawater, creating localized microenvironments that favor CaCO₃ deposition and make carbonate more accessible for calcifying organisms affected by ocean acidification.

IMAGE FROM: https://www.newscientist.com/article/2151281-oysters-can-hear-the-ocean-even-though-they-dont-have-ears/

Yang, D., Yan, Y., Yang, X., Liu, J., Zheng, G., Xie, L., & Zhang, R. (2019). A basic protein, N25, from a mollusk modifies calcium carbonate morphology and shell biomineralization. Journal of Biological Chemistry, 294(21), 8371–8384. https://doi.org/10.1074/jbc.RA118.007338

Using UniProt, I obtained the amino acid sequence of mantle protein N25 (N25):

**FASTA**

tr|A0A0E3XA28|A0A0E3XA28_PINFU Mantle protein N25 OS=Pinctada fucata OX=50426 PE=2 SV=1
MKRIYVLVLLFILLVCIAEAQKKSKDSKKASSKSSSKSSGKSKSSPKSSGAKGKSPAPSA
PASKGPSEMQKLAEEMVALSNRLLKAIKAGEQMPPPMCPNGLPKADCSPLACDKWTCSNI
LNTVCKEQCHVCEPKFYIGGSEVTQFCELKPANMQPRATQSPPTSRNTATDQGPQNSGPS
SNGAPSNMPPMPGMPMMFSENPMPMGGPPGMEFMPNFENFPPGMSPMQFFHHLQNMNMPN
ENQGSRSQAN

3.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.

The Central Dogma discussed in class and recitation describes the process in which DNA sequence becomes transcribed and translated into protein. The Central Dogma gives us the framework to work backwards from a given protein sequence and infer the DNA sequence that the protein is derived from. Using one of the tools discussed in class, NCBI or online tools (google “reverse translation tools”), determine the nucleotide sequence that corresponds to the protein sequence you chose above.

[Example: Get to the original sequence of phage MS2 L-protein from its genome: phage MS2 genome - Nucleotide - NCBI]

Lysis protein DNA sequence
atggaaacccgattccctcagcaatcgcagcaaactccggcatctactaatagacgccggccattcaaacatgaggattacccatgtcgaagacaacaaagaagttcaactctttatgtattgatcttcctcgcgatctttctctcgaaatttaccaatcaattgcttctgtcgctactggaagcggtgatccgcacagtgacgactttacagcaattgcttacttaa

To reverse translate the mantle protein N25 amino acid sequence to DNA, I used the online reverse translation tool at novoprolabs.com/tools/revtrans.

N25 protein DNA sequence
atgaaacgcatttatgtgctggtgctgctgtttattctgctggtgtgcattgcggaagcgcagaaaaaaagcaaagatagcaaaaaagcgagcagcaaaagcagcagcaaaagcagcggcaaaagcaaaagcagcccgaaaagcagcggcgcgaaaggcaaaagcccggcgccgagcgcgccggcgagcaaaggcccgagcgaaatgcagaaactggcggaagaaatggtggcgctgagcaaccgcctgctgaaagcgattaaagcgggcgaacagatgccgccgccgatgtgcccgaacggcctgccgaaagcggattgcagcccgctggcgtgcgataaatggacctgcagcaacattctgaacaccgtgtgcaaagaacagtgccatgtgtgcgaaccgaaattttatattggcggcagcgaagtgacccagttttgcgaactgaaaccggcgaacatgcagccgcgcgcgacccagagcccgccgaccagccgcaacaccgcgaccgatcagggcccgcagaacagcggcccgagcagcaacggcgcgccgagcaacatgccgccgatgccgggcatgccgatgatgtttagcgaaaacccgatgccgatgggcggcccgccgggcatggaatttatgccgaactttgaaaactttccgccgggcatgagcccgatgcagttttttcatcatctgcagaacatgaacatgccgaacgaaaaccagggcagccgcagccaggcgaac
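A reverse-translated sequence is easy to sanity-check by translating it back and comparing it with the protein. The Python sketch below does that round trip with the standard genetic code. The codon choice in `FIRST_CODON` (the first codon listed for each amino acid) is an arbitrary assumption for illustration, not the codon preference of any real tool, and the function names are my own.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One codon per amino acid: simply the first one in table order (arbitrary).
FIRST_CODON = {}
for codon, aa in CODON_TABLE.items():
    FIRST_CODON.setdefault(aa, codon)


def translate(dna):
    """Translate DNA codon by codon; trailing bases that do not fill a codon are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def reverse_translate(protein, preferred):
    """Build a DNA sequence using one chosen codon per amino acid."""
    return "".join(preferred[aa] for aa in protein.upper())


def encodes(dna, protein):
    """True if dna has exactly three bases per residue and translates to protein."""
    return len(dna) == 3 * len(protein) and translate(dna) == protein


# First eight residues of the MS2 lysis protein example and its first 24 bases.
peptide = "METRFPQQ"
print(translate("atggaaacccgattccctcagcaa"))
print(encodes(reverse_translate(peptide, FIRST_CODON), peptide))
```

The length check in `encodes` is useful on its own: a coding sequence without a stop codon should contain exactly three bases per amino acid, so a lost or extra base from copy and paste shows up immediately.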

3.3 Codon optimization. Once the nucleotide sequence of your protein is determined, you need to codon optimize your sequence. You may, once again, utilize Google for a “codon optimization tool”. In your own words, describe why you need to optimize codon usage. Which organism have you chosen to optimize the codon sequence for and why?

I recently learned what codons are and the key role they play in protein synthesis. A codon is a group of three nucleotides that specifies one amino acid. There are 64 possible codons: 61 of them encode the 20 amino acids and 3 are stop signals, which means that several different codons can encode the same amino acid. Codon optimization is necessary because, although the genetic code is nearly universal, each organism uses some synonymous codons more often than others (codon usage bias), reflecting differences in tRNA abundance and translation efficiency. Replacing rare codons with the ones the host prefers, without changing the amino acid sequence, is therefore an important step for expressing a gene from one organism efficiently in another.

For this project, I optimized the codon sequence for expression in Bacillus subtilis. I selected Bacillus subtilis because:

- It is genetically well-characterized
- It is widely used in biotechnology
- It can secrete proteins extracellularly
- It tolerates moderately saline environments

This process was performed using an online codon optimization tool. The optimized sequence maintains the same amino acid sequence but improves translational efficiency in the chosen host organism.
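The core idea of codon optimization can be sketched in a few lines of Python: walk through the sequence codon by codon, swap each codon for the host's preferred synonym, and confirm that the protein is unchanged. The preference table `HYPOTHETICAL_PREFERRED` below is invented for illustration and is not measured Bacillus subtilis codon usage; real tools also weigh factors such as GC content and unwanted restriction sites, which this sketch ignores.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# HYPOTHETICAL host preferences, for illustration only (not real usage data).
HYPOTHETICAL_PREFERRED = {"K": "AAA", "L": "CTG", "S": "AGC", "A": "GCG"}


def codons(dna):
    """Split a DNA sequence into complete codons."""
    dna = dna.upper()
    return [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]


def translate(dna):
    return "".join(CODON_TABLE[c] for c in codons(dna))


def optimize(dna, preferred):
    """Replace each codon with the preferred synonym, if one is listed."""
    recoded = []
    for codon in codons(dna):
        amino_acid = CODON_TABLE[codon]
        recoded.append(preferred.get(amino_acid, codon))
    return "".join(recoded)


original = "ATGAAGTTATCTGCATAA"  # M K L S A stop
optimized = optimize(original, HYPOTHETICAL_PREFERRED)

# The whole point: different DNA, identical protein.
assert translate(optimized) == translate(original)
print(original, "->", optimized)
```

The final assertion is the important design choice: whatever table is used, the optimized sequence must translate to exactly the same amino acids as the input.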

3.4 You Have a sequence! Now What?

With the codon-optimized DNA sequence of N25, the protein can be produced using a cell-dependent system with Bacillus subtilis as the host. The optimized DNA is introduced into the bacteria through transformation. Inside the cells, the DNA is transcribed into mRNA and then translated by ribosomes into the N25 protein.

Bacillus subtilis can secrete the protein extracellularly, which simplifies collection and use. This approach leverages the host’s natural cellular machinery while ensuring that the codon optimization maximizes translation efficiency. By producing N25 in Bacillus subtilis, it is possible to generate sufficient protein for bioengineering applications, such as creating microenvironments that attract calcium carbonate for artificial reef materials.

**Part 4: Prepare a Twist DNA Synthesis Order**

This is a practice exercise, not necessarily your real Twist order!

4.1. Create a Twist account and a Benchling account

UFQ89828.1 mCherry [synthetic construct]
MVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQF
MYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPV
MQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSHN
EDYTIVEQYERAEGRHSTGGMDELYK

reverse translation of UFQ89828.1 mCherry [synthetic construct] to a 708 base sequence of most likely codons. atggtgagcaaaggcgaagaagataacatggcgattattaaagaatttatgcgctttaaa gtgcatatggaaggcagcgtgaacggccatgaatttgaaattgaaggcgaaggcgaaggc cgcccgtatgaaggcacccagaccgcgaaactgaaagtgaccaaaggcggcccgctgccg tttgcgtgggatattctgagcccgcagtttatgtatggcagcaaagcgtatgtgaaacat ccggcggatattccggattatctgaaactgagctttccggaaggctttaaatgggaacgc gtgatgaactttgaagatggcggcgtggtgaccgtgacccaggatagcagcctgcaggat ggcgaatttatttataaagtgaaactgcgcggcaccaactttccgagcgatggcccggtg atgcagaaaaaaaccatgggctgggaagcgagcagcgaacgcatgtatccggaagatggc gcgctgaaaggcgaaattaaacagcgcctgaaactgaaagatggcggccattatgatgcg gaagtgaaaaccacctataaagcgaaaaaaccggtgcagctgccgggcgcgtataacgtg aacattaaactggatattaccagccataacgaagattataccattgtggaacagtatgaa cgcgcggaaggccgccatagcaccggcggcatggatgaactgtataaa

4.1 (i) What DNA would you want to sequence and why?

As an industrial designer, I’m always looking for ways to improve urban lifestyles through innovative materials. With CO₂ emissions rising in cities, I would sequence DNA from engineered E. coli bacteria embedded in “bio-bricks” designed for future buildings and urban infrastructure. These bacteria would act as simple CO₂ sensors: when CO₂ levels are high, they would modify a specific part of their DNA to record exposure events. By sampling bacteria from different bricks across the city and sequencing this “memory region,” I could map pollution patterns over time and create bio-integrated technologies that literally store environmental data within our built environment.

IMAGE FROM: https://www.front-materials.com/news/biomason-front-biobasedtile/

4.1 (ii) What technology would you use to perform sequencing and why?

Is your method first-, second- or third-generation or other? How so?

For this project I would choose a second-generation method, specifically Illumina, since it is widely accessible and reads millions of short DNA fragments in parallel. In contrast, first-generation (Sanger) sequencing reads a single DNA template per reaction, which makes it slow for many samples, and third-generation methods (Nanopore, PacBio) read long single molecules, which I do not need here and which have typically had higher per-read error rates. For my bio-bricks I only need to read a specific short region of the bacterial DNA across many samples, so a second-generation method is the best fit.

What is your input? How do you prepare your input? List the essential steps.

As a first step in preparing the input, I would obtain samples of the bacteria in the bio-bricks by scraping their surface, and then extract the DNA from those cells. Next, I would use the Polymerase Chain Reaction (PCR) to amplify only the specific DNA region where the “CO₂ memory” is stored. As a third step, I would add short adapter sequences so that the fragments can bind to the sequencing flow cell. I would also add an identification tag (barcode) for each bio-brick in order to distinguish the samples. Finally, I would pool the samples so they can be sequenced simultaneously.

What are the essential steps of your chosen sequencing technology, and how does it decode the bases of your DNA sample (base calling)?

Since I chose the second-generation Illumina method, the process begins with attaching the DNA fragments to a glass surface called a flow cell. Once attached, each fragment is copied many times in place, forming small groups of identical molecules called clusters. Then, in each cycle, the machine adds fluorescently labeled nucleotides that allow only one base to be incorporated per strand, and each type of base gives a distinct fluorescent signal (for example, A is green, T is red, C is blue, and G is yellow). After each cycle, a camera takes an image of the entire flow cell, and software interprets the color detected in each cluster to determine which base was added. The label and the block are then removed so that the next cycle can begin. By repeating this process many times, the sequence of each fragment is read base by base.
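The base-calling idea can be illustrated with a toy Python example: for each cycle, pick the base whose channel is brightest, and report `N` when no channel clearly dominates. The intensity values and the `min_purity` threshold are made up, and this is a simplification under those assumptions, not the algorithm actually used by Illumina software.

```python
def call_bases(cycles, min_purity=0.6):
    """Call one base per cycle for a single cluster.

    cycles: list of dicts mapping each base to its signal intensity.
    A base is called only if its signal is at least min_purity of the
    total signal in that cycle; otherwise the call is "N" (unknown).
    """
    read = []
    for intensities in cycles:
        base = max(intensities, key=intensities.get)
        total = sum(intensities.values())
        purity = intensities[base] / total if total > 0 else 0.0
        read.append(base if purity >= min_purity else "N")
    return "".join(read)


# Made-up signals for one cluster over four cycles.
cluster = [
    {"A": 0.91, "C": 0.03, "G": 0.04, "T": 0.02},  # clearly A
    {"A": 0.05, "C": 0.02, "G": 0.88, "T": 0.05},  # clearly G
    {"A": 0.10, "C": 0.08, "G": 0.07, "T": 0.75},  # clearly T
    {"A": 0.40, "C": 0.35, "G": 0.15, "T": 0.10},  # ambiguous
]
print(call_bases(cluster))  # AGTN
```

Real instruments also attach a quality score to every call, which plays a role similar to the purity value here: it says how much to trust each base.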

What is the output of your chosen sequencing technology?

The result I get from Illumina is a set of files (typically in FASTQ format) that contain millions of short reads from the region where my bacteria store the “CO₂ memory.” Each read comes from a different cluster that was read by the machine, and thanks to the barcodes I added earlier, I can tell which bio-brick each read came from. With all this information, I can compare patterns between different bricks and analyze the history of CO₂ exposure in different parts of the city.
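Sorting reads back to their bio-brick of origin (demultiplexing) could look roughly like the Python sketch below. It assumes the barcode sits at the start of each read and must match exactly; the barcodes, sample names, and reads are all hypothetical. On a real run the index is usually read separately and the instrument software handles this step.

```python
from collections import defaultdict

# HYPOTHETICAL barcodes, one per bio-brick.
BARCODES = {"ACGT": "brick_01", "TGCA": "brick_02"}


def demultiplex(reads, barcodes):
    """Group reads by their leading barcode and strip the barcode off.

    Reads whose first bases match no known barcode go to "undetermined".
    All barcodes are assumed to have the same length.
    """
    length = len(next(iter(barcodes)))
    bins = defaultdict(list)
    for read in reads:
        sample = barcodes.get(read[:length], "undetermined")
        bins[sample].append(read[length:])
    return dict(bins)


# Made-up reads: barcode followed by a short piece of the "memory region".
reads = ["ACGTGGGTTT", "TGCAGGGTAT", "ACGTGGGTTA", "NNNNGGGTTT"]
result = demultiplex(reads, BARCODES)
for sample, sample_reads in sorted(result.items()):
    print(sample, sample_reads)
```

Once the reads are grouped per brick, comparing the memory region between groups is what would reveal different CO₂ exposure histories.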

4.2 DNA Write (i) What DNA would you want to synthesize (write) and why?

As a designer, I have always been interested in understanding why things have specific shapes. For that reason, I would like to synthesize simple genetic circuits related to morphogenesis, that is, how living tissues or cell populations grow into particular forms. During the first class, it was mentioned that bacteria can be given DNA fragments that make them generate shapes or patterns inspired by much larger organisms, within an in vitro environment.

In the long term, and in a more speculative way, I would like to explore applying this principle to the creation of architectural modules, potentially shaping urban landscapes, or even designing products that are grown by nature itself but guided by human intention. I am aware that this is not currently a mature technology, so I see it more as an exploratory and future-oriented design question rather than something that can be fully built today.

(ii) What technology would you use to perform this DNA synthesis and why? What are the essential steps of your chosen sequencing methods?

From what I’ve understood, DNA is synthesized chemically by adding one base at a time in a controlled cycle (phosphoramidite chemistry). First, the protecting group on the last added base is removed, then a new protected base is coupled to it, chains that failed to react are capped, the new bond is chemically stabilized by oxidation, the remaining reagents are washed away, and the cycle is repeated until the full sequence is completed. It is a repetitive process that requires a lot of precision to obtain the correct sequence.

4.2 (ii) What are the limitations of your sequencing method (if any) in terms of speed, accuracy, scalability?

One of the main limitations is speed: turnaround time is around 2-4 weeks, partly because chemical synthesis proceeds cycle by cycle. There are also limits on length (up to about 2000 bases) and on accuracy, with occasional errors of roughly 1 in every 100-300 bases.
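To get a feeling for what an error rate of about 1 in 100-300 bases means, the short Python calculation below estimates the fraction of molecules that come out with no errors at all. It assumes that errors happen independently at every base, which is a simplification, and it only uses the figures quoted above.

```python
def error_free_fraction(length, errors_per_base):
    """Probability that a molecule of the given length has zero errors,
    assuming independent errors with the same rate at every base."""
    return (1.0 - errors_per_base) ** length


for bases_per_error in (100, 300):
    for length in (300, 2000):
        fraction = error_free_fraction(length, 1 / bases_per_error)
        print(f"1 error per {bases_per_error} bases, {length} bases long: "
              f"{fraction:.2%} error-free")
```

Under this assumption the error-free fraction falls off exponentially with length, which is why long constructs usually need to be checked by sequencing before they are used.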

4.3 DNA Edit (i) What DNA would you want to edit and why?

I would like to edit a specific gene in the Peruvian scallop (Argopecten purpuratus) to make its shells more resistant to ocean acidification during the larval stage. Ocean acidification dissolves calcium carbonate shells of shellfish, and scallop larvae are especially vulnerable. This species is important for Peruvian aquaculture and coastal economies, so helping them survive better would support sustainable fisheries. The principle could also apply to oysters, mussels, and other shellfish facing the same threat.

(ii) What technology would you use and why? I would use CRISPR-Cas9, because I understand it is one of the most precise and accessible gene editing tools available today and it has been used successfully in shellfish for targeted edits.

https://www.frontiersin.org/journals/marine-science/articles/10.3389/fmars.2022.912409/full

How does your technology of choice edit DNA? What are the essential steps?

I still find it hard to fully understand the principles of genetic modification, but I understand that CRISPR-Cas9 is a tool that allows cutting and editing DNA at a specific location. It uses an RNA molecule as a guide to find the sequence we want to modify, and then the Cas9 protein works like scissors to cut the DNA there. Once the DNA is cut, the cell’s repair process can either deactivate the gene or insert a new sequence with different genetic information.

What preparation do you need to do (e.g. design steps) and what is the input (e.g. DNA template, enzymes, plasmids, primers, guides, cells) for the editing?

To change a gene that makes scallops vulnerable, you first need to identify which gene it is. Then you design a guide RNA that matches a site in that gene, so that the guide finds the gene and Cas9 cuts it like scissors. The inputs are therefore the guide RNA, the Cas9 enzyme, a DNA template carrying the new version of the gene, and the scallop embryos or larvae to be edited. After the cut, the cell repairs its DNA, and at that moment the new version of the gene, which doesn’t cause vulnerability, can be inserted.
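The guide design step can be sketched in Python. The commonly used Cas9 from Streptococcus pyogenes needs an "NGG" motif (the PAM) right after the target site, so a first pass is to list every 20-base stretch that is followed by NGG on either strand. The test sequence is made up, the function names are my own, and real design tools also score efficiency and off-target risk, which this sketch does not.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_guides(seq, guide_len=20):
    """List candidate target sites that are followed by an NGG PAM.

    Returns (strand, start, protospacer, pam) tuples. For the "-" strand,
    start is the index within the reverse-complemented sequence.
    """
    # The lookahead lets overlapping candidates all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for match in pattern.finditer(s):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


# Made-up 27-base sequence containing one 20-base site followed by "AGG".
toy = "TT" + "ACGTACGTACGTACGTACGT" + "AGG" + "TT"
for hit in find_guides(toy):
    print(hit)
```

For a real gene, the candidates returned by a search like this would then be filtered by position within the gene and by how unique they are in the genome.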

What are the limitations of your editing methods (if any) in terms of efficiency or precision?

CRISPR-Cas9 is not perfect. Sometimes it can cut in the wrong places (off-target sites) and damage healthy genes. Also, many tests are needed to confirm that the edit actually worked. In the case of scallops, the larvae are very fragile and some could die during the injection. In addition, it takes a lot of time, even months, to raise them and check the results.
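The risk of cutting in the wrong place can be illustrated with a small Python sketch that scans a sequence for sites that look almost like the guide: same length, followed by an NGG PAM, and differing in only a few bases. The sequences and the mismatch threshold are invented, only one strand is scanned, and real off-target prediction is considerably more involved.

```python
def mismatches(guide, site):
    """Count positions where two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(a != b for a, b in zip(guide.upper(), site.upper()))


def off_target_candidates(guide, genome, max_mismatches=3):
    """Scan one strand for NGG-adjacent sites close to the guide.

    Returns (position, mismatch_count) tuples; a count of 0 is the
    intended target itself (or a perfect duplicate of it).
    """
    genome = genome.upper()
    n = len(guide)
    hits = []
    for i in range(len(genome) - n - 2):
        pam = genome[i + n:i + n + 3]
        if pam[1:] != "GG":
            continue
        count = mismatches(guide, genome[i:i + n])
        if count <= max_mismatches:
            hits.append((i, count))
    return hits


# Made-up example: the intended site plus a look-alike with one changed base.
guide = "ACGTACGTACGTACGTACGT"
look_alike = "ACGTACGTACGAACGTACGT"
genome = guide + "TGG" + "CC" + look_alike + "AGG"
print(off_target_candidates(guide, genome))  # [(0, 0), (25, 1)]
```

Sites with very few mismatches are the ones most worth checking experimentally after editing.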