Week 2 pre-Lecture HW: DNA Read, Write and Edit

Questions from Professor Jacobson:

What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?

Response

DNA polymerase makes about 1 error per 10⁶ bases during DNA synthesis when proofreading is included (≈10⁻⁶ per base). The haploid human genome is ~3.2 × 10⁹ base pairs, so if replication relied on that error rate alone, each genome copy would pick up thousands of mutations (3.2 × 10⁹ × 10⁻⁶ ≈ 3,200 errors per replication), which would be biologically dangerous. Cells close this gap with layered error reduction: polymerase proofreading (3′→5′ exonuclease activity) removes most misincorporated bases during synthesis, and post-replicative mismatch repair (MMR) detects and fixes most of the mismatches that remain after replication. It is also important to remember that the genome tolerates some residual errors: diploidy provides a second copy of most genes, much of the sequence is noncoding, and badly damaged cells can be eliminated by built-in kill switches (for example, apoptosis).
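The arithmetic behind the ≈3,200 figure can be sketched in a few lines. The genome size and the 10⁻⁶ rate come from the response above; the post-MMR rate of 10⁻⁹ is an assumed order of magnitude added only for comparison, and the function name is hypothetical.

```python
# Expected replication errors per haploid genome copy at two fidelity levels.
GENOME_BP = 3.2e9  # haploid human genome size, from the response above

# Per-base error rates. The second value is an ASSUMED order of magnitude
# for fidelity after mismatch repair, not a figure from the response.
ERROR_RATES = {
    "polymerase with proofreading": 1e-6,
    "after mismatch repair (assumed)": 1e-9,
}


def expected_errors(genome_bp: float, error_rate: float) -> float:
    """Expected number of errors when copying genome_bp bases once."""
    return genome_bp * error_rate


for label, rate in ERROR_RATES.items():
    errors = expected_errors(GENOME_BP, rate)
    print(f"{label}: about {errors:,.1f} errors per replication")
```

Under these assumptions the second line of output drops from thousands of errors to a handful per replication, which is the gap that repair has to close.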

How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?

Response
An average human protein is about 350 amino acids long and, because of codon degeneracy, can be encoded by roughly 10¹⁸⁰–10¹⁹⁰ different DNA sequences. Why don't most of them work in practice? Mainly because of: codon bias or limited tRNA availability, which makes translation inefficient; mRNA secondary structure, which can impair ribosome loading or elongation; and extreme GC content, which leads to instability and to problems during synthesis and amplification.
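The count of synonymous encodings is the product of the codon degeneracies of each residue, so it is easiest to handle as a sum of logarithms. The sketch below uses the standard genetic code and ignores the stop codon. The function names and the uniform-composition estimate are illustrative assumptions. A uniform composition gives a lower exponent than the range quoted above, while proteins rich in six-codon residues (Leu, Ser, Arg) push it higher.

```python
import math

# Number of synonymous codons per amino acid in the standard genetic code.
CODON_DEGENERACY = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}


def log10_synonymous_encodings(protein: str) -> float:
    """log10 of the number of DNA sequences encoding this protein (no stop codon)."""
    return sum(math.log10(CODON_DEGENERACY[aa]) for aa in protein)


def log10_uniform_estimate(length: int) -> float:
    """Same quantity for a hypothetical protein with all 20 residues equally common."""
    mean_log = sum(math.log10(d) for d in CODON_DEGENERACY.values()) / len(CODON_DEGENERACY)
    return length * mean_log


# Tiny check: Met-Lys-Leu has 1 * 2 * 6 = 12 possible encodings.
print(round(10 ** log10_synonymous_encodings("MKL")))

# Order of magnitude for a 350-residue protein under the uniform assumption.
print(f"about 10^{log10_uniform_estimate(350):.0f} encodings")
```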

Questions from Dr. LeProust:

Most commonly used oligo synthesis method?

Solid phase phosphoramidite chemical synthesis

Why is it hard to make oligos >200 nt directly?

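Chemical synthesis is open-loop, with an error rate of about 1 per 10² bases. Each coupling step is less than 100% efficient, so losses compound: at roughly 99% success per step, only about 0.99²⁰⁰ ≈ 13% of strands reach full length at 200 nt, and both yield and accuracy drop sharply beyond that.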
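The compounding loss can be sketched as a simple power law. The 99% per-step coupling efficiency is an assumption taken from the 1-per-10² figure above, and the function name is hypothetical. Real efficiencies vary with chemistry and instrument.

```python
def full_length_yield(length_nt: int, coupling_efficiency: float = 0.99) -> float:
    """Fraction of strands that reach full length.

    Assumes every coupling succeeds independently with the same probability.
    An n-mer needs n - 1 couplings, since the first base starts on the support.
    """
    if length_nt < 1:
        raise ValueError("length_nt must be at least 1")
    return coupling_efficiency ** (length_nt - 1)


for length in (20, 60, 100, 200, 2000):
    print(f"{length:>5} nt: {full_length_yield(length):.2e} full-length fraction")
```

Under this assumption the full-length fraction falls to roughly one strand in eight at 200 nt and is effectively zero at 2000 nt.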

Why can't you make a 2000 bp gene by direct oligo synthesis?

Over thousands of coupling cycles, cumulative coupling failures and errors reduce the full-length, error-free product to almost zero. Long genes must instead be built by assembling many short oligos (for example, by Gibson assembly), combined with enzymatic error correction.
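As a rough illustration of the "many short oligos" idea, the sketch below tiles a long sequence into overlapping fragments that an assembly reaction could stitch together. The 60 nt oligo length, the 20 nt overlap, and the function name are assumptions chosen for illustration. Real designs also alternate strands and tune the overlaps, which this sketch leaves out.

```python
def tile_into_oligos(sequence: str, oligo_len: int = 60, overlap: int = 20) -> list[str]:
    """Split a sequence into oligos of at most oligo_len that overlap by `overlap` bases.

    All oligos are taken from the same strand (a simplification).
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and smaller than oligo_len")
    if not sequence:
        return []

    step = oligo_len - overlap  # new bases contributed by each additional oligo
    oligos = []
    start = 0
    while True:
        oligos.append(sequence[start:start + oligo_len])
        if start + oligo_len >= len(sequence):
            break  # this oligo reached the end of the gene
        start += step
    return oligos


# Hypothetical 2000-nt gene made of a repeated motif, for demonstration only.
gene = "ACGT" * 500
oligos = tile_into_oligos(gene)
print(len(oligos), "oligos, longest", max(len(o) for o in oligos), "nt")
```

Each fragment stays well inside the length range where chemical synthesis still gives a usable yield, which is the point of assembling rather than synthesizing the gene directly.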

Question from George Church:

What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?

The 10 essential amino acids in animals, which cannot be synthesized by their cells and must come from the diet, are: histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, valine, and arginine. All ten are required during growth and are functionally essential across animals, humans included. Lysine is both essential and commonly limiting, especially in cereal-based and plant-heavy diets. This means that protein synthesis, growth, and health can be constrained by lysine availability even when total calorie or protein intake is adequate. As a result, lysine availability disproportionately shapes nutrition, agriculture, and evolution, supporting the idea of a “lysine contingency” in which access to lysine strongly influences biological and societal outcomes.

Sources / prompts used: Google search: “essential amino acids animals”; Lehninger Principles of Biochemistry 6th edition, introduction; Prof. George Church, HTGAA slides (#4).