Week 1 HW: Principles and Practices
Part 1: Benchling & In-silico Gel Art
Gel lane 1: LAMCG - MboI
Gel lane 2: LAMCG - MwoI
Gel lane 3: LAMCG - PshAI
Gel lane 4: LAMCG - MwoI
Gel lane 5: LAMCG - MboI
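Each lane above is the band pattern from cutting the same DNA with one enzyme. The sketch below shows how such a digest can be computed in silico. It is a minimal illustration, not what Benchling does internally: the function names are hypothetical, the DNA is treated as linear, the toy input is not the LAMCG sequence, and the recognition sites (MboI ^GATC, MwoI GCNNNNN^NNGC, PshAI GACNN^NNGTC) are the commonly listed ones and should be checked against the supplier's data.

```python
import re

# Recognition sites as regexes ('.' stands for N, any base) plus the offset,
# counted from the start of the site, at which the top strand is cut.
# All three sites are palindromic, so scanning one strand is enough.
ENZYMES = {
    "MboI": ("GATC", 0),         # ^GATC
    "MwoI": ("GC.{7}GC", 7),     # GCNNNNN^NNGC
    "PshAI": ("GAC.{4}GTC", 5),  # GACNN^NNGTC
}

def cut_positions(seq, enzyme):
    """Return 0-based cut positions of one enzyme on a linear sequence."""
    pattern, offset = ENZYMES[enzyme]
    # The lookahead lets overlapping sites be found as well.
    return [m.start() + offset
            for m in re.finditer(f"(?=({pattern}))", seq.upper())]

def fragment_lengths(seq, enzyme):
    """Return fragment sizes (largest first), i.e. the bands in one gel lane."""
    edges = [0] + cut_positions(seq, enzyme) + [len(seq)]
    return sorted((b - a for a, b in zip(edges, edges[1:]) if b > a),
                  reverse=True)

# Toy example (not the real template sequence):
print(fragment_lengths("AAAAGATCTTTTGATCCC", "MboI"))  # [8, 6, 4]
```

Larger fragments run more slowly, so sorting largest first mirrors the top-to-bottom order of bands in a lane.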

Part 3: DNA Design Challenge
I am very interested in Interleukin-6 (IL-6), a protein that indicates early inflammation and immune activation.
Here is the protein sequence. Note that this record, CAA31312.1, is the IL-6 receptor precursor rather than IL-6 itself:
>CAA31312.1 IL-6 receptor precursor (AA -19 to 449) [Homo sapiens]
MLAVGCALLAALLAAPGAALAPRRCPAQEVARGVLTSLPGDSVTLTCPGVEPEDNATVHWVLRKPAAGSH
PSRWAGMGRRLLLRSVQLHDSGNYSCYRAGRPAGTVHLLVDVPPEEPQLSCFRKSPLSNVVCEWGPRSTP
SLTTKAVLLVRKFQNSPAEDFQEPCQYSQESQKFSCQLAVPEGDSSFYIVSMCVASSVGSKFSKTQTFQG
CGILQPDPPANITVTAVARNPRWLSVTWQDPHSWNSSFYRLRFELRYRAERSKTFTTWMVKDLQHHCVIH
DAWSGLRHVVQLRAQEEFGQGEWSEWSPEAMGTPWTESRSPPAENEVSTPMQALTTNKDDDNILFRDSAN
ATSLPVQDSSSVPLPTFLVAGGSLAFGTLLCIAIVLRFKKTWKLRALKEGKTSMHPPYSLGQLVPERPRP
TPVLVPLISPPVSPSSLGSDNTSSHNRPDARDPRSPYDISNTDYFFPR
Here is the reverse-translated DNA (nucleotide) sequence, generated with both the “Most likely codons” option and the “Consensus codons” option:
reverse translation of CAA31312.1 IL-6 receptor precursor (AA -19 to 449) [Homo sapiens] to a 1404 base sequence of most likely codons. atgctggcggtgggctgcgcgctgctggcggcgctgctggcggcgccgggcgcggcgctg gcgccgcgccgctgcccggcgcaggaagtggcgcgcggcgtgctgaccagcctgccgggc gatagcgtgaccctgacctgcccgggcgtggaaccggaagataacgcgaccgtgcattgg gtgctgcgcaaaccggcggcgggcagccatccgagccgctgggcgggcatgggccgccgc ctgctgctgcgcagcgtgcagctgcatgatagcggcaactatagctgctatcgcgcgggc cgcccggcgggcaccgtgcatctgctggtggatgtgccgccggaagaaccgcagctgagc tgctttcgcaaaagcccgctgagcaacgtggtgtgcgaatggggcccgcgcagcaccccg agcctgaccaccaaagcggtgctgctggtgcgcaaatttcagaacagcccggcggaagat tttcaggaaccgtgccagtatagccaggaaagccagaaatttagctgccagctggcggtg ccggaaggcgatagcagcttttatattgtgagcatgtgcgtggcgagcagcgtgggcagc aaatttagcaaaacccagacctttcagggctgcggcattctgcagccggatccgccggcg aacattaccgtgaccgcggtggcgcgcaacccgcgctggctgagcgtgacctggcaggat ccgcatagctggaacagcagcttttatcgcctgcgctttgaactgcgctatcgcgcggaa cgcagcaaaacctttaccacctggatggtgaaagatctgcagcatcattgcgtgattcat gatgcgtggagcggcctgcgccatgtggtgcagctgcgcgcgcaggaagaatttggccag ggcgaatggagcgaatggagcccggaagcgatgggcaccccgtggaccgaaagccgcagc ccgccggcggaaaacgaagtgagcaccccgatgcaggcgctgaccaccaacaaagatgat gataacattctgtttcgcgatagcgcgaacgcgaccagcctgccggtgcaggatagcagc agcgtgccgctgccgacctttctggtggcgggcggcagcctggcgtttggcaccctgctg tgcattgcgattgtgctgcgctttaaaaaaacctggaaactgcgcgcgctgaaagaaggc aaaaccagcatgcatccgccgtatagcctgggccagctggtgccggaacgcccgcgcccg accccggtgctggtgccgctgattagcccgccggtgagcccgagcagcctgggcagcgat aacaccagcagccataaccgcccggatgcgcgcgatccgcgcagcccgtatgatattagc aacaccgattatttttttccgcgc
reverse translation of CAA31312.1 IL-6 receptor precursor (AA -19 to 449) [Homo sapiens] to a 1404 base sequence of consensus codons. atgytngcngtnggntgygcnytnytngcngcnytnytngcngcnccnggngcngcnytn gcnccnmgnmgntgyccngcncargargtngcnmgnggngtnytnacnwsnytnccnggn gaywsngtnacnytnacntgyccnggngtngarccngargayaaygcnacngtncaytgg gtnytnmgnaarccngcngcnggnwsncayccnwsnmgntgggcnggnatgggnmgnmgn ytnytnytnmgnwsngtncarytncaygaywsnggnaaytaywsntgytaymgngcnggn mgnccngcnggnacngtncayytnytngtngaygtnccnccngargarccncarytnwsn tgyttymgnaarwsnccnytnwsnaaygtngtntgygartggggnccnmgnwsnacnccn wsnytnacnacnaargcngtnytnytngtnmgnaarttycaraaywsnccngcngargay ttycargarccntgycartaywsncargarwsncaraarttywsntgycarytngcngtn ccngarggngaywsnwsnttytayathgtnwsnatgtgygtngcnwsnwsngtnggnwsn aarttywsnaaracncaracnttycarggntgyggnathytncarccngayccnccngcn aayathacngtnacngcngtngcnmgnaayccnmgntggytnwsngtnacntggcargay ccncaywsntggaaywsnwsnttytaymgnytnmgnttygarytnmgntaymgngcngar mgnwsnaaracnttyacnacntggatggtnaargayytncarcaycaytgygtnathcay gaygcntggwsnggnytnmgncaygtngtncarytnmgngcncargargarttyggncar ggngartggwsngartggwsnccngargcnatgggnacnccntggacngarwsnmgnwsn ccnccngcngaraaygargtnwsnacnccnatgcargcnytnacnacnaayaargaygay gayaayathytnttymgngaywsngcnaaygcnacnwsnytnccngtncargaywsnwsn wsngtnccnytnccnacnttyytngtngcnggnggnwsnytngcnttyggnacnytnytn tgyathgcnathgtnytnmgnttyaaraaracntggaarytnmgngcnytnaargarggn aaracnwsnatgcayccnccntaywsnytnggncarytngtnccngarmgnccnmgnccn acnccngtnytngtnccnytnathwsnccnccngtnwsnccnwsnwsnytnggnwsngay aayacnwsnwsncayaaymgnccngaygcnmgngayccnmgnwsnccntaygayathwsn aayacngaytayttyttyccnmgn
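The two outputs above differ only in which codon is written for each amino acid. The sketch below reproduces that mapping. The two tables were read off the outputs above, so they cover only the 20 amino acids in this protein and reflect whatever codon table the tool used. The function name reverse_translate is hypothetical.

```python
# One codon per amino acid, copied from the "most likely codons" output above.
MOST_LIKELY = {
    "A": "gcg", "C": "tgc", "D": "gat", "E": "gaa", "F": "ttt",
    "G": "ggc", "H": "cat", "I": "att", "K": "aaa", "L": "ctg",
    "M": "atg", "N": "aac", "P": "ccg", "Q": "cag", "R": "cgc",
    "S": "agc", "T": "acc", "V": "gtg", "W": "tgg", "Y": "tat",
}

# Degenerate (IUPAC) codons, copied from the "consensus codons" output above.
# Symbols such as n, y, r, w, s, m and h each stand for more than one base.
CONSENSUS = {
    "A": "gcn", "C": "tgy", "D": "gay", "E": "gar", "F": "tty",
    "G": "ggn", "H": "cay", "I": "ath", "K": "aar", "L": "ytn",
    "M": "atg", "N": "aay", "P": "ccn", "Q": "car", "R": "mgn",
    "S": "wsn", "T": "acn", "V": "gtn", "W": "tgg", "Y": "tay",
}

def reverse_translate(protein, table):
    """Write one codon per residue; whitespace in the input is ignored."""
    residues = "".join(protein.split()).upper()
    return "".join(table[aa] for aa in residues)

first_ten = "MLAVGCALLA"  # first ten residues of the protein above
print(reverse_translate(first_ten, MOST_LIKELY))  # atgctggcggtgggctgcgcgctgctggcg
print(reverse_translate(first_ten, CONSENSUS))    # atgytngcngtnggntgygcnytnytngcn
```

Because every residue becomes exactly one codon, the 468-residue protein gives 468 × 3 = 1404 bases under either option.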
Here is the codon optimized sequence:
based on reverse translation of most likely codons. ATGCTGGCCGTCGGCTGTGCACTGCTGGCCGCCCTGCTGGCAGCCCCCGGCGCCGCTCTGGCTCCCAGAAGGTGTCCCGCTCAGGAGGTGGCCAGAGGCGTGCTGACCTCCCTGCCAGGCGATTCCGTTACCCTGACCTGTCCAGGCGTGGAGCCCGAAGATAACGCCACCGTGCACTGGGTGCTGAGGAAACCCGCCGCCGGCTCCCATCCAAGCAGATGGGCAGGCATGGGGAGGAGACTGCTTCTGAGATCTGTCCAGCTGCACGACAGTGGGAACTATTCCTGTTACAGGGCTGGGAGGCCTGCCGGAACAGTGCATCTCCTGGTTGATGTGCCCCCAGAGGAGCCGCAGCTGTCTTGCTTCAGGAAGAGCCCCCTGAGCAATGTGGTGTGCGAATGGGGCCCCCGGAGTACCCCCAGCCTGACAACCAAAGCCGTGCTGCTGGTGAGGAAGTTTCAGAACAGCCCCGCTGAAGACTTTCAGGAGCCCTGCCAGTATAGCCAGGAGTCCCAGAAGTTTTCCTGCCAGCTGGCCGTGCCCGAGGGAGATAGCAGCTTCTACATCGTCTCCATGTGCGTGGCCTCTTCCGTGGGAAGTAAGTTTTCTAAGACCCAGACCTTCCAGGGCTGCGGCATCCTGCAGCCAGATCCTCCCGCCAATATCACAGTGACAGCCGTGGCCAGAAACCCCAGATGGCTGAGCGTGACCTGGCAGGACCCACACTCTTGGAATAGCTCTTTCTATAGGCTGAGGTTCGAGCTGAGATACAGAGCTGAGCGGTCCAAGACATTCACCACTTGGATGGTGAAGGACCTGCAGCATCACTGCGTGATTCACGATGCTTGGAGCGGCCTGAGGCATGTGGTGCAGCTGAGGGCCCAGGAAGAATTTGGGCAGGGGGAATGGAGTGAATGGTCCCCAGAGGCCATGGGGACACCCTGGACCGAGTCCAGGAGCCCACCAGCAGAAAATGAGGTGAGCACTCCCATGCAGGCTCTGACCACCAACAAAGACGATGATAACATTCTCTTCAGAGACTCTGCCAACGCCACCTCCCTGCCCGTGCAGGACAGCAGCTCCGTCCCTCTGCCAACCTTTCTGGTGGCCGGAGGCTCTCTGGCCTTTGGGACCCTGCTGTGTATCGCAATCGTGCTGAGGTTTAAGAAAACCTGGAAGCTGCGGGCCCTGAAGGAGGGAAAGACAAGCATGCACCCACCTTACTCCCTGGGACAGCTGGTGCCAGAGAGGCCCAGGCCTACACCTGTGCTGGTGCCCCTGATCTCCCCTCCTGTGTCTCCTTCCTCTCTGGGTTCTGACAATACAAGTAGCCACAACCGGCCAGACGCCAGGGACCCCCGAAGCCCCTATGATATCTCTAATACCGATTACTTCTTTCCAAGA
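based on reverse translation of consensus codons. (not working. The reason I noted was that the sequence length is not a multiple of 3, but 1404 bases is exactly 468 codons. A more likely cause is that the consensus sequence uses degenerate IUPAC symbols such as n, y, r, w, s, m and h, which do not specify a single codon.)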
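To narrow down why an input is rejected, a small check like the one below can be run first. It is only a sketch: the function name check_coding_sequence is hypothetical, and it tests just two conditions, whether the length is a multiple of 3 and whether any symbol other than A, C, G or T appears (the consensus output uses degenerate IUPAC symbols such as n, y and r).

```python
def check_coding_sequence(seq):
    """Return a list of problems that could stop a tool from reading seq as codons."""
    seq = "".join(seq.split()).upper()  # drop spaces and line breaks
    problems = []
    if len(seq) % 3 != 0:
        problems.append(f"length {len(seq)} is not a multiple of 3")
    ambiguous = sorted(set(seq) - set("ACGT"))
    if ambiguous:
        problems.append("non-ACGT symbols present: " + "".join(ambiguous))
    return problems

print(check_coding_sequence("atgctggcg"))  # []
print(check_coding_sequence("atgytngcn"))  # ['non-ACGT symbols present: NY']
```

Pasting only the bases, without the descriptive header text, also matters: any extra letters are counted toward the length and flagged as unexpected symbols.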
Codon optimization is needed because of codon bias. Although multiple codons can encode the same amino acid, different organisms prefer certain codons over others. This preference affects:
1/ Translation efficiency: codons that match abundant tRNAs in the host organism are translated faster.
2/ Protein expression levels: using preferred codons increases protein yield.
3/ mRNA stability: some codons help stabilize the mRNA, reducing degradation.
4/ Secondary structure: optimized sequences minimize hairpins and repetitive sequences that hinder transcription or translation.
5/ Protein folding: efficient translation reduces ribosome stalling, which can improve folding.
Without codon optimization, a gene might produce very little protein in the chosen host.
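The key property of codon optimization is that the DNA changes while the protein does not. The sketch below checks this on the first 30 bases of the reverse-translated and the optimized sequences above, using the standard genetic code. The helper names translate and gc_content are hypothetical, and GC content is shown only as one simple property that can shift between synonymous sequences.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate an unambiguous DNA sequence codon by codon."""
    dna = "".join(dna.split()).upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

def gc_content(dna):
    """Fraction of bases that are G or C."""
    dna = "".join(dna.split()).upper()
    return sum(base in "GC" for base in dna) / len(dna)

before = "atgctggcggtgggctgcgcgctgctggcg"  # start of the "most likely codons" sequence
after = "ATGCTGGCCGTCGGCTGTGCACTGCTGGCC"   # start of the optimized sequence

print(translate(before) == translate(after))  # True: both encode MLAVGCALLA
print(gc_content(before), gc_content(after))
```

The same comparison can be run on the full 1404-base sequences to confirm that the optimized gene still encodes the intended protein.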
Part 5
5.1 DNA Read
(i) What DNA would you want to sequence and why?
Chosen DNA
I would like to sequence DNA sampled from human bodily fluids, household sinks, and wastewater streams.
Why this DNA?
I am interested in how ubiquitous DNA samples can build up a genetic picture that supports healthcare and preventative monitoring.
(ii) Sequencing technology choice
Technology selected
[Name the sequencing technology (e.g., Sanger sequencing, Illumina sequencing, Oxford Nanopore, PacBio SMRT, etc.)]
Why this technology?
[Explain reasons such as read length, accuracy, cost, throughput, real-time analysis, portability, etc.]
Generation of sequencing
- [First-generation / Second-generation / Third-generation]
- Explain why it belongs to this category.
Input and sample preparation
Describe the required input and preparation steps, for example:
- DNA extraction
- Fragmentation (if required)
- Adapter ligation
- PCR amplification (if required)
- Library preparation
- Loading onto sequencing platform
Essential sequencing steps and base calling
[Explain how the technology reads DNA bases — e.g., fluorescence detection, nanopore electrical signal changes, synthesis-based imaging, etc.]
Output
[Describe output format — e.g., short reads, long reads, FASTQ files, sequence quality scores, aligned genomes.]
5.2 DNA Write
(i) What DNA would you want to synthesize and why?
Chosen DNA construct
[Describe the DNA you wish to synthesize — gene, operon, genetic circuit, DNA origami design, therapeutic construct, etc.]
Why synthesize this DNA?
[Explain the intended application — medicine, sensors, biomaterials, art, storage, etc.]