Week 2 HW: DNA Read, Write & Edit
Professor Jacobson:
The error rate of DNA polymerase varies between organisms and between the different types of polymerase enzyme. A typical figure is about 1 error per 100,000 bp before proofreading and error correction. (I used an AI prompt to confirm this. The prompt: What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?)
The length of the human genome is around 3.2 billion bp. At that rate, without error correction, replicating the entire genome once would introduce around 32,000 errors (3.2 billion / 100,000). To correct for this, biology employs proofreading and DNA repair mechanisms that massively reduce the error rate.
An average human protein is around 375 aa long (Brocchieri, 2005), and each position can be any 1 of 20 amino acids. Codons are 3-nucleotide sequences; there are 64 of them, 61 of which code for amino acids, so each amino acid is encoded by about 3 codons on average. For an average protein there are therefore 20^375 possible aa chains of length 375, and a given protein can be encoded by roughly 3^375 different DNA sequences.
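The arithmetic above can be checked with a short Python sketch. It assumes the round figures used in this answer (3.2 billion bp, 1 error per 100,000 bp, 375 aa, about 3 codons per amino acid); the constant and function names are only illustrative.

```python
# Figures taken from the answer above; all are approximations.
GENOME_BP = 3.2e9         # approximate human genome length in bp
RAW_ERROR_RATE = 1e-5     # 1 error per 100,000 bp, before proofreading
PROTEIN_LENGTH_AA = 375   # average human protein length (Brocchieri, 2005)
AMINO_ACIDS = 20          # choices per position in the protein
CODONS_PER_AA = 3         # rough average number of codons per amino acid


def expected_errors(genome_bp, error_rate):
    """Expected number of errors in one full replication with no correction."""
    return genome_bp * error_rate


def digit_count(n):
    """Number of decimal digits in a positive integer."""
    return len(str(n))


errors = expected_errors(GENOME_BP, RAW_ERROR_RATE)
protein_sequences = AMINO_ACIDS ** PROTEIN_LENGTH_AA   # 20^375
dna_encodings = CODONS_PER_AA ** PROTEIN_LENGTH_AA     # 3^375

print(f"Expected uncorrected errors per replication: {round(errors)}")
print(f"20^375 has {digit_count(protein_sequences)} digits")
print(f"3^375 has {digit_count(dna_encodings)} digits")
```

Both counts are far too large to enumerate: 20^375 is a 488-digit number and 3^375 a 179-digit number.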
In practice, not all of these encodings work equally well, partly because the cell has few tRNAs for some codons. tRNAs pair specific codons with specific amino acids and deliver those amino acids to the ribosome for assembly. If a codon has no matching tRNA, or only a scarce one, protein synthesis slows or stalls. (AI prompt: what are some of the reasons that all of these different codes don’t work to code for the protein of interest?).
Proteins also follow strict structural rules: if the amino acids change, the structure can become unstable, leading to degradation of the protein.
Dr LeProust:
Solid phase phosphoramidite synthesis.
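As the length of an oligonucleotide increases, the coupling efficiency of each step limits how much full-length product is made, because the fraction of full-length chains is roughly the coupling efficiency raised to the power of the number of nucleotides added. For a 200 nt oligo, a 99.5% coupling efficiency gives only about 37% full-length product (0.995^200 ≈ 0.37), and even a 99.9% coupling efficiency gives about 82%. This fraction decreases further as the length increases. At longer lengths, purification also becomes difficult. (Gene Synthesis: Methods and Applications, 2011) (AI prompt: what is coupling efficiency).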
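As mentioned, because of coupling efficiency, a 2000 nt oligo made by direct synthesis would be dominated by truncated and error-containing products: even at 99.9% coupling efficiency only about 13.5% of the chains would be full length (0.999^2000 ≈ 0.135). The growing strand may also not remain intact over so many synthesis cycles.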
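The effect of coupling efficiency on yield can be sketched with a simple model. This is an illustration only: it assumes every coupling step succeeds independently with the same probability, and it counts one coupling per nucleotide. The function name is hypothetical.

```python
def full_length_fraction(coupling_efficiency, length_nt):
    """Fraction of chains that received every nucleotide.

    Assumes each coupling succeeds independently with the same
    probability, and counts one coupling per nucleotide.
    """
    return coupling_efficiency ** length_nt


# Compare two coupling efficiencies at the two lengths discussed above.
for efficiency in (0.995, 0.999):
    for length_nt in (200, 2000):
        fraction = full_length_fraction(efficiency, length_nt)
        print(f"{efficiency:.1%} coupling, {length_nt} nt: "
              f"{fraction:.2%} full length")
```

Under this model the 37% figure for a 200 nt oligo corresponds to 99.5% coupling efficiency. At 99.9% the 200 nt yield is about 82%, and it falls to about 13.5% at 2000 nt.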
Prof. George Church:
The 10 essential amino acids are called essential because the body cannot make them, so they have to be obtained from external sources like food. They are necessary for normal functioning. They are phenylalanine, valine, tryptophan, threonine, isoleucine, methionine, histidine, leucine, lysine and arginine (arginine is usually counted as only conditionally essential in adult humans). (Lopez and Mohiuddin, 2024).
In my view, the Lysine Contingency sounded like a great concept in Jurassic Park, but, as the movies themselves show, it did not work. Lysine is already an essential amino acid, so animals cannot make it anyway and engineering in a lysine dependency would not change much. Even ignoring that, lysine is very abundant in the ecosystem, so unless the animal starves it would not experience lysine deficiency.
References:
Brocchieri, L. (2005). Protein length in eukaryotic and prokaryotic proteomes. Nucleic Acids Research, 33(10), pp.3390–3400. doi:https://doi.org/10.1093/nar/gki615.
Gene Synthesis: Methods and Applications. (2011). Methods in Enzymology, [online] 498, pp.277–309. doi:https://doi.org/10.1016/B978-0-12-385120-8.00012-7.
Lopez, M.J. and Mohiuddin, S.S. (2024). Biochemistry, Essential Amino Acids. [online] PubMed. Available at: https://www.ncbi.nlm.nih.gov/books/NBK557845/.
AI used: OpenAI’s ChatGPT 4.1