Week 2 — DNA Read, Write, & Edit
Part 1: Benchling & In-silico Gel Art
See this week’s lab protocol “Gel Art: Restriction Digests and Gel Electrophoresis” for details. Overview:
- Make a free account at benchling.com
- Import the Lambda DNA.
- Simulate Restriction Enzyme Digestion with the following Enzymes: EcoRI, HindIII, BamHI, KpnI, EcoRV, SacI, SalI.
- Create a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks.
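The digestion step above can also be sketched outside Benchling. The following is a minimal illustration, assuming a linear DNA molecule and the standard recognition sites of the seven enzymes listed; the `digest` helper and the toy sequence are hypothetical stand-ins (the toy sequence is not Lambda DNA), and the fragment lengths it returns are what would set the band positions on a simulated gel.

```python
# Recognition site and top-strand cut offset for each enzyme,
# e.g. EcoRI cuts G^AATTC, so the cut falls 1 base into the site.
ENZYMES = {
    "EcoRI":   ("GAATTC", 1),
    "HindIII": ("AAGCTT", 1),
    "BamHI":   ("GGATCC", 1),
    "KpnI":    ("GGTACC", 5),
    "EcoRV":   ("GATATC", 3),
    "SacI":    ("GAGCTC", 5),
    "SalI":    ("GTCGAC", 1),
}


def digest(seq, enzyme_names):
    """Return fragment lengths (bp) of a linear sequence, largest first."""
    seq = seq.upper()
    cuts = set()
    for name in enzyme_names:
        site, offset = ENZYMES[name]
        start = seq.find(site)
        while start != -1:
            cuts.add(start + offset)
            start = seq.find(site, start + 1)
    bounds = [0] + sorted(cuts) + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


# Toy 40-bp sequence (NOT Lambda DNA) with one EcoRI and one BamHI site.
toy = "ACGTACGTAC" + "GAATTC" + "TTTTTTTTTT" + "GGATCC" + "ACGTACGT"

print(digest(toy, ["EcoRI"]))           # [29, 11]
print(digest(toy, ["EcoRI", "BamHI"]))  # [16, 13, 11]
```

Running the same function on the real Lambda sequence with different enzyme combinations per lane gives the band patterns to arrange into an image.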
Part 3: DNA Design Challenge
- In recitation, we discussed that you will pick a protein for your homework that you find interesting. Which protein have you chosen and why? Using one of the tools described in recitation (NCBI, UniProt, google), obtain the protein sequence for the protein you chose.
I chose the protein SIRT1 for its potential in … and for its relevance to anti-aging research and its possible relationship to cancer development. I obtained its sequence from UniProt:
sp|Q96EB6|SIRT1_HUMAN NAD-dependent protein deacetylase sirtuin-1 OS=Homo sapiens OX=9606 GN=SIRT1 PE=1 SV=3 MADEAALALQPGGSPSAAGADREAASSPAGEPLRKRPRRDGPGLERSPGEPGGAAPEREVPAAARGCPGAAAAALWREAEAEAAAAGGEQEAQATAAAGEGDNGPGLQGPSREPPLADNLYDEDDDDEGEEEEEAAAAAIGYRDNLLFGDEIITNGFHSCESDEEDRASHASSSDWTPRPRIGPYTFVQQHLMIGTDPRTILKDLLPETIPPPELDDMTLWQIVINILSEPPKRKKRKDINTIEDAVKLLQECKKIIVLTGAGVSVSCGIPDFRSRDGIYARLAVDFPDLPDPQAMFDIEYFRKDPRPFFKFAKEIYPGQFQPSLCHKFIALSDKEGKLLRNYTQNIDTLEQVAGIQRIIQCHGSFATASCLICKYKVDCEAVRGDIFNQVVPRCPRCPADEPLAIMKPEIVFFGENLPEQFHRAMKYDKDEVDLLIVIGSSLKVRPVALIPSSIPHEVPQILINREPLPHLHFDVELLGDCDVIINELCHRLGGEYAKLCCNPVKLSEITEKPPRTQKELAYLSELPPTPLHVSEDSSSPERTSPPDSSVIVTLLDQAAKSNDDLDVSESKGCMEEKPQEVQTSRNVESIAEQMENPDLKNVGSSTGEKNERTSVAGTVRKCWPNRVAKEQISRRLDGNQYLFLPPNRYIFHGAEVYSDSEDDVLSSSSCGSNSDSGTCQSPSLEEPMEDESEIEEFYNGLEDEPDVPERAGGAGFGTDGDDQEAINEAISVKQEVTDMNYPSNKS
1.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.
The Central Dogma discussed in class and recitation describes the process in which DNA sequence becomes transcribed and translated into protein. The Central Dogma gives us the framework to work backwards from a given protein sequence and infer the DNA sequence that the protein is derived from. Using one of the tools discussed in class, NCBI or online tools (google “reverse translation tools”), determine the nucleotide sequence that corresponds to the protein sequence you chose above.
Using Benchling’s reverse translation tool, I converted the SIRT1 protein sequence into its corresponding coding DNA sequence (CDS) using the standard genetic code. The resulting DNA sequence is:
1.3. Codon optimization.
Once a nucleotide sequence of your protein is determined, you need to codon optimize your sequence. You may, once again, utilize google for a “codon optimization tool”. In your own words, describe why you need to optimize codon usage. Which organism have you chosen to optimize the codon sequence for and why?
Codon optimization is needed because the genetic code is degenerate: multiple codons encode the same amino acid, and different organisms prefer different synonymous codons (codon usage bias). When a human gene like SIRT1 is expressed in a foreign host such as E. coli, codons that are common in human genes but rare in E. coli (and are therefore read by low-abundance tRNAs) can cause translational pausing, premature termination, mRNA instability, or protein misfolding. Replacing those rare codons with synonymous codons preferred by the expression host can substantially increase protein yield, and often improves solubility and functional expression, without altering the amino acid sequence.
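As a rough illustration of the idea, the sketch below reverse-translates a protein by always picking one "preferred" codon per amino acid. The `PREFERRED` table is an assumption made for this example (codons commonly listed as frequent in E. coli); it is not the algorithm used by Twist or Benchling.

```python
# One "preferred" codon per amino acid. ASSUMPTION: these are codons commonly
# listed as frequent in E. coli; a real tool would use a full usage table.
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def reverse_translate(protein):
    """Map each amino acid to its single preferred codon."""
    return "".join(PREFERRED[aa] for aa in protein.upper())


def gc_content(dna):
    """Fraction of G and C bases in the sequence."""
    return (dna.count("G") + dna.count("C")) / len(dna)


n_term = "MADEAALALQ"  # first 10 residues of SIRT1 (UniProt Q96EB6)
dna = reverse_translate(n_term)
print(dna)             # ATGGCGGATGAAGCGGCGCTGGCGCTGCAG
print(len(dna))        # 30
print(round(gc_content(dna), 2))
```

Always choosing the single most frequent codon is a crude strategy. Real optimizers typically also spread usage across synonymous codons and avoid repeats, extreme GC content, and unwanted restriction sites.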
Organism chosen: Escherichia coli (E. coli)
I chose to optimize SIRT1 for E. coli for several reasons:
- Cell-free system compatibility: The Week 11 cell-free protein synthesis system uses E. coli BL21(DE3) Star lysate, which is naturally optimized for E. coli codon preferences.
- Speed and cost: E. coli is the fastest and most cost-effective microbial host for protein production.
- Well-characterized codon bias: E. coli K-12 codon usage tables are extensively documented.
- Twist synthesis standard: Twist Bioscience’s clonal gene service is optimized for E. coli expression vectors.
- Downstream applications: Optimized genes work efficiently in both in vivo (E. coli transformation) and in vitro (cell-free lysate) platforms.
Using the Twist Codon Optimization Tool, I obtained the following report:
| Name | Original sequence | Flank 5' | Optimized sequence | Flank 3' | Organism of expression | Type | Preserved regions | Restriction sites | Sites only | Length | Optimized | Issues |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SIRT1 | ATGGCTGATGAAGCTGCATTAGCGCTGCAGCCGGGCGGCTCACCATCAGCGGCGGGCGCGGACCGTGAAGCGGCCAGCTCCCCGGCGGGTGAGCCGCTGCGCAAGCGTCCGCGCCGTGACGGCCCGGGGCTGGAGCGCAGCCCGGGTGAGCCAGGCGGGGCTGCGCCGGAGCGTGAGGTGCCGGCTGCGGCGCGTGGCTGCCCGGGTGCGGCGGCGGCAGCGCTGTGGCGTGAAGCAGAAGCTGAAGCTGCGGCAGCAGGCGGTGAACAGGAAGCCCAGGCAACAGCTGCTGCTGGTGAAGGTGACAACGGCCCGGGGCTGCAGGGGCCGAGCCGTGAACCGCCGCTGGCTGACAACCTGTATGATGAAGATGACGATGATGAAGGTGAAGAAGAAGAAGAAGCGGCGGCGGCGGCGATTGGTTACCGTGACAACCTGCTGTTCGGTGATGAAATCATCACCAACGGTTTCCACAGCTGCGAGAGCGATGAGGAAGACCGTGCCTCTCATGCCAGCAGCTCTGACTGGACCCCGCGTCCGCGTATTGGTCCGTACACCTTTGTCCAGCAGCATCTGATGATCGGCACCGATCCGCGCACTATCCTGAAAGACCTGCTGCCGGAAACCATCCCGCCGCCGGAACTGGATGACATGACGCTGTGGCAGATTGTGATCAACATCCTGAGCGAACCGCCGAAAAGGAAGAAGCGCAAAGACATCAACACCATTGAAGATGCGGTGAAGCTGCTGCAGGAGTGCAAAAAAATCATCGTCCTGACCGGTGCGGGCGTCTCTGTCAGCTGCGGTATCCCGGATTTCCGCAGCCGTGATGGTATCTATGCGCGTCTGGCGGTTGATTTCCCGGATCTGCCGGACCCGCAGGCGATGTTTGACATCGAGTACTTCCGCAAAGATCCGCGTCCGTTCTTCAAATTTGCCAAAGAGATCTACCCGGGTCAGTTCCAGCCGAGCCTGTGTCACAAATTTATTGCGCTGAGCGACAAAGAAGGCAAACTGCTGCGTAACTACACCCAGAACATCGACACCCTGGAACAGGTGGCGGGTATTCAGCGCATCATTCAGTGCCATGGCAGCTTTGCGACTGCGAGCTGCCTGATCTGCAAATACAAAGTGGACTGCGAAGCGGTGCGTGGTGATATCTTCAACCAGGTGGTACCGCGCTGCCCGCGCTGCCCGGCTGATGAACCGCTGGCGATTATGAAACCGGAAATTGTGTTCTTTGGTGAGAACCTGCCGGAACAGTTCCACCGTGCGATGAAATATGACAAAGATGAGGTTGATCTGCTGATTGTGATCGGCAGCTCGCTGAAAGTGCGTCCGGTTGCGCTGATCCCATCCTCGATTCCGCATGAAGTACCGCAGATTCTGATCAACCGTGAGCCGCTGCCGCACCTGCACTTCGATGTTGAGCTGCTGGGTGACTGCGATGTCATCATCAACGAGCTGTGCCACCGTCTGGGTGGTGAGTATGCCAAGCTGTGCTGCAACCCGGTGAAACTCTCTGAAATCACCGAGAAGCCGCCGCGTACCCAGAAAGAGCTGGCTTATTTATCTGAACTGCCGCCGACACCGCTGCACGTCAGCGAAGACAGCTCTAGCCCGGAGCGCACCTCTCCGCCGGACTCCTCCGTGATTGTCACCCTGCTGGATCAGGCGGCGAAATCCAACGATGACCTGGATGTCTCTGAGAGCAAAGGTTGCATGGAAGAGAAACCGCAGGAGGTGCAGACCAGCCGTAACGTGGAGAGTATTGCTGAACAGATGGAAAACCCGGATCTGAAAAACGTTGGCAGCTCAACTGGTGAGAAAAATGAGCGTACCTCGGTGGCGGGTACTGTGCGTAAATGTTGGCCGAACCGCGTGGCGAAAGAACAGATTAGCCGCCGTCTGGATGGTAACCAGTATCTGTTCCTGCCGCCGAACCGTTATATTTTCCACGGTGCGGAAGTCTACTCCGACAGCGAAGATGACGTGCTGTCCTCCTCGAGCTGTGGTAGCAACTCCGATAGCGGTACCTGTCAGAGCCCGTCACTGGAAGAACCGATGGAAGATGAAAGTGAAATTGAGGAATTCTATAACGGTCTGGAAGATGAACCGGATGTTCCGGAGCGTGCGGGTGGTGCGGGTTTCGGTACCGACGGTGATGATCAGGAAGCGATTAACGAAGCGATCTCCGTGAAACAGGAAGTGACCGATATGAACTACCCGTCCAACAAGAGC | | | | Escherichia coli general (562) | Other protein type | | | false | 2241 | true | |
1.4. You have a sequence! Now what?
What technologies could be used to produce this protein from your DNA? Describe in your own words how the DNA sequence can be transcribed and translated into your protein. You may describe either cell-dependent or cell-free methods, or both.
Cell-dependent method (in vivo): The optimized SIRT1 gene would be cloned into an expression vector (e.g., pET-28a) containing a T7 promoter, ribosome binding site (RBS), and terminator. The plasmid is transformed into E. coli BL21(DE3). Adding IPTG induces expression of the host's T7 RNA polymerase, which transcribes the gene into mRNA; bacterial ribosomes then translate the mRNA into SIRT1 protein. The protein can be purified using an affinity tag (e.g., a 6xHis tag).
Cell-free method (in vitro): The DNA template (linear or plasmid) is added directly to an E. coli lysate-based cell-free system (as in Week 11). The lysate provides the necessary machinery (RNA polymerase, ribosomes, tRNAs, energy regeneration systems, and amino acids) to perform coupled transcription and translation in a test tube within hours, without living cells. This method is faster, allows precise control over reaction conditions, and avoids problems with proteins that are toxic to living cells.
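The layout of the expression construct described above (promoter, RBS, tagged coding sequence) can be sketched as simple string assembly. This is only an illustration under stated assumptions: the spacer and the `build_cassette` helper are hypothetical, the terminator is omitted, and a real pET-28a construct contains additional elements (lac operator, linker, protease site) that are not modeled here.

```python
T7_PROMOTER = "TAATACGACTCACTATAGGG"  # consensus T7 promoter
RBS = "AGGAGG"                        # Shine-Dalgarno core sequence
SPACER = "ATATACC"                    # hypothetical spacer between RBS and start codon
HIS6 = "CATCAC" * 3                   # six histidine codons (6xHis tag)
STOP = "TAA"                          # stop codon


def build_cassette(cds):
    """Assemble promoter + RBS + His-tagged ORF from a CDS (ATG..., no stop)."""
    cds = cds.upper()
    if not cds.startswith("ATG") or len(cds) % 3 != 0:
        raise ValueError("CDS must start with ATG and be a multiple of 3 nt")
    orf = "ATG" + HIS6 + cds + STOP   # N-terminal tag, native Met kept
    return T7_PROMOTER + RBS + SPACER + orf


# First three codons of the SIRT1 alignment table, used as a stand-in CDS.
cassette = build_cassette("ATGGCGGAC")
print(cassette)
```

The same linear string could serve as a template description for either route: cloned into a plasmid for in vivo expression, or added directly to a cell-free reaction.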
1.5. [Optional] How does it work in nature/biological systems?
A single gene can produce multiple protein variants through a process called alternative splicing. In higher eukaryotes like humans, the initial pre-mRNA transcript contains both exons (coding regions) and introns (non-coding regions). During splicing, different combinations of exons can be joined together, producing multiple mature mRNA isoforms from the same gene. Each isoform translates into a different protein variant with potentially distinct functions, localization, or regulatory properties.
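For SIRT1, alternative splicing has been reported to generate several isoforms. The table below lists five; apart from the canonical 747 aa form, the exon assignments and lengths are approximate and should be verified against the UniProt entry (Q96EB6):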
| Isoform | Exon combination | Protein length | Functional difference |
|---|---|---|---|
| Isoform 1 (canonical) | Exons 1-9 | 747 aa | Full-length, nuclear localization |
| Isoform 2 | Missing exon 3 | ~650 aa | Truncated, altered deacetylase activity |
| Isoform 3 | Alternative exon 5 | ~500 aa | Cytoplasmic, different substrate specificity |
| Isoform 4 | Missing exons 3 & 5 | ~400 aa | Catalytically inactive, dominant negative |
| Isoform 5 | Alternative 3’ end | 685 aa | Different C-terminus, altered stability |
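The exon-joining idea behind alternative splicing can be sketched with toy data. The exon sequences below are made up for illustration (they are not real SIRT1 exons), and `splice` is a hypothetical helper that simply concatenates the selected exons in genomic order.

```python
def splice(exons, keep):
    """Join the selected exons (by exon number) in genomic order."""
    return "".join(exons[i] for i in sorted(keep))


def in_frame(exon):
    """Skipping an exon preserves the reading frame if its length is a multiple of 3."""
    return len(exon) % 3 == 0


# Toy exons (made-up sequences, NOT real SIRT1 exons).
exons = {1: "ATGGCG", 2: "GACGAG", 3: "CTGCAG", 4: "CCGTAA"}

full_length = splice(exons, {1, 2, 3, 4})
exon3_skipped = splice(exons, {1, 2, 4})

print(full_length)         # ATGGCGGACGAGCTGCAGCCGTAA
print(exon3_skipped)       # ATGGCGGACGAGCCGTAA
print(in_frame(exons[3]))  # True
```

When a skipped exon is not a multiple of three nucleotides long, the downstream reading frame shifts, which usually produces a truncated or nonfunctional protein rather than a shorter functional isoform.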
Central Dogma: DNA → RNA → Protein alignment
Below is the alignment for the first 30 amino acids of SIRT1, showing the flow of information from DNA to RNA to protein.
Color legend:
- 🔵 Blue: DNA coding (sense) strand, which has the same sequence as the mRNA except for T in place of U
- 🟢 Green: RNA transcript (T → U)
- 🟡 Yellow: Protein (amino acids from codons)
Complete alignment for the first 30 amino acids in table format:
| DNA (5' → 3') | RNA (5' → 3') | Codon | Amino acid (name) | Amino acid (1-letter) |
|---|---|---|---|---|
| ATG | AUG | AUG | Methionine | M |
| GCG | GCG | GCG | Alanine | A |
| GAC | GAC | GAC | Aspartic acid | D |
| GAG | GAG | GAG | Glutamic acid | E |
| GCA | GCA | GCA | Alanine | A |
| GCA | GCA | GCA | Alanine | A |
| CTG | CUG | CUG | Leucine | L |
| GCA | GCA | GCA | Alanine | A |
| CTG | CUG | CUG | Leucine | L |
| CAG | CAG | CAG | Glutamine | Q |
| CCG | CCG | CCG | Proline | P |
| GGC | GGC | GGC | Glycine | G |
| GGC | GGC | GGC | Glycine | G |
| AGC | AGC | AGC | Serine | S |
| CCG | CCG | CCG | Proline | P |
| AGC | AGC | AGC | Serine | S |
| GCA | GCA | GCA | Alanine | A |
| GCC | GCC | GCC | Alanine | A |
| GGT | GGU | GGU | Glycine | G |
| GCA | GCA | GCA | Alanine | A |
| GAT | GAU | GAU | Aspartic acid | D |
| CGT | CGU | CGU | Arginine | R |
| GAA | GAA | GAA | Glutamic acid | E |
| GCA | GCA | GCA | Alanine | A |
| GCA | GCA | GCA | Alanine | A |
| TCT | UCU | UCU | Serine | S |
| TCT | UCU | UCU | Serine | S |
| CCG | CCG | CCG | Proline | P |
| GCA | GCA | GCA | Alanine | A |
| GGT | GGU | GGU | Glycine | G |
Key observations from the alignment:
| Feature | Description |
|---|---|
| Transcription | All “T” (thymine) in DNA are replaced by “U” (uracil) in RNA |
| Translation | Each set of 3 nucleotides (codon) specifies one amino acid |
| Reading frame | The sequence is read continuously from the start codon (ATG/AUG/Met) |
| Genetic code | Follows the standard genetic code, which human nuclear genes and E. coli share; the two differ in codon usage frequency, not in codon meaning |
| Start codon | ATG (DNA) / AUG (RNA) codes for Methionine (M) |
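The DNA → RNA → protein mapping in the alignment can be reproduced programmatically. The sketch below builds the standard genetic code (NCBI translation table 1) from its conventional TCAG ordering; the function names are chosen for this example and are not taken from any particular library.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def transcribe(coding_dna):
    """Coding-strand DNA to mRNA: replace every T with U."""
    return coding_dna.upper().replace("T", "U")


def translate(coding_dna):
    """Read codons from the first base and stop at the first stop codon."""
    dna = coding_dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def alignment_rows(coding_dna):
    """One (DNA codon, RNA codon, amino acid) tuple per codon."""
    dna = coding_dna.upper()
    return [(dna[i:i + 3], transcribe(dna[i:i + 3]), CODON_TABLE[dna[i:i + 3]])
            for i in range(0, len(dna) - 2, 3)]


first_ten = "ATGGCGGACGAGGCAGCACTGGCACTGCAG"  # first 10 codons of the table above
print(transcribe(first_ten))  # AUGGCGGACGAGGCAGCACUGGCACUGCAG
print(translate(first_ten))   # MADEAALALQ
for row in alignment_rows(first_ten)[:3]:
    print(row)
```

Comparing the output of `translate` with the UniProt sequence is a quick way to confirm that an alignment table has not dropped or duplicated a residue.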
Additional notes on SIRT1 biology:
In nature, SIRT1 expression is regulated at multiple levels:
- Transcriptional regulation: The SIRT1 promoter contains binding sites for transcription factors like p53, FOXO3a, and HIC1
- Alternative splicing: As shown above, multiple isoforms arise from the same gene
- Post-transcriptional regulation: miRNAs (e.g., miR-34a, miR-9) bind to the 3’ UTR and repress translation
- Post-translational modification: The protein itself can be phosphorylated, SUMOylated, or acetylated
This multi-level regulation allows a single gene to respond to diverse cellular signals (NAD+ levels, oxidative stress, caloric restriction) and produce context-specific functional outcomes.