Week 2: Lecture Prep

Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?
1. 1 error per 10^6 additions, with a throughput of 10 ms per base addition.
2. If a human genome is 3.1 gigabase pairs (haploid), then
  1. 3.1 * 10^9 / 10^6 = 3100 errors (3.1 * 1000)
3. Biology fixes these errors during synthesis itself: a mismatched nucleotide physically doesn’t fit, so the polymerase pushes the error back out (proofreading) to unjam the system and then continues with further synthesis.
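The arithmetic in answer 2 can be checked with a minimal sketch. The function name `expected_errors` is made up for illustration; the two input figures are the ones quoted in the answers above.

```python
# Expected number of uncorrected polymerase errors in one pass over the genome.
# Inputs are the figures quoted in the answers above.
GENOME_LENGTH_BP = 3_100_000_000   # 3.1 gigabase pairs, haploid
BASES_PER_ERROR = 1_000_000        # 1 error per 10^6 additions


def expected_errors(genome_length_bp: int, bases_per_error: int) -> float:
    """Average number of errors if every base is copied exactly once."""
    return genome_length_bp / bases_per_error


errors = expected_errors(GENOME_LENGTH_BP, BASES_PER_ERROR)
print(f"Expected errors per genome copy: {errors:.0f}")
```

This treats each base addition as an independent event with the same error rate, which is a simplification.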
How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?
1. 1036 base pairs code for an average human protein, at 3 bp (one codon) per amino acid = 1036 / 3 = ~345 amino acids
  1. 3^345 (assuming roughly 3 synonymous codons per amino acid on average)
2. That’s a huge number, but these don’t all work physically because of codon bias, mRNA structure, regulatory interference, and translation/folding constraints.
  1. Note: Had to ChatGPT this one (“Why don’t all codon possibilities work at a protein site?”); couldn’t find it in the slides.
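To get a feel for how large 3^345 is, and how the rough "3 codons per amino acid" figure compares with the actual degeneracy of the standard genetic code, here is a small sketch. The helper `count_encodings` and the peptide `MKLSW` are made up for illustration; the per-amino-acid codon counts are those of the standard genetic code (61 sense codons across 20 amino acids).

```python
import math

# Synonymous codons per amino acid in the standard genetic code
# (one-letter amino acid codes, stop codons excluded).
CODONS_PER_AA = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}


def count_encodings(protein: str) -> int:
    """Number of distinct DNA coding sequences for the given protein."""
    return math.prod(CODONS_PER_AA[aa] for aa in protein)


# Rough estimate from the answer above: ~3 codons per residue, 345 residues.
rough_estimate = 3 ** 345
print(f"3^345 is a {len(str(rough_estimate))}-digit number")

# Average degeneracy across the 20 amino acids (61 sense codons / 20).
print(f"Average codons per amino acid: {sum(CODONS_PER_AA.values()) / 20:.2f}")

# Exact count for a short hypothetical peptide.
print(f"Encodings of MKLSW: {count_encodings('MKLSW')}")
```

The exact count for a real protein depends on its amino acid composition, so 3^345 is only an order-of-magnitude estimate.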

What’s the most commonly used method for oligo synthesis currently?
1. Solid-phase synthesis of oligos (dates from 1965), which today is run with phosphoramidite chemistry
  1. Had to search Google separately
Why is it difficult to make oligos longer than 200nt via direct synthesis?
1. Because errors accumulate with every step. Even though each addition in oligo synthesis succeeds about 99.5% of the time, the 0.5% failure rate compounds over the step-by-step additions. At 200 nucleotides, only about 37% of the molecules are full length (0.995^200 ≈ 0.37), and those can still be purified away from the rest. Beyond this, the exponential drop in yield makes it not worth the amount of effort.
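The compounding can be sketched in a few lines. This assumes every addition succeeds independently with the same probability; the function name `full_length_fraction` is made up for illustration. It compares per-step success rates of 99% and 99.5% to show how sensitive the full-length yield is to that figure.

```python
def full_length_fraction(step_yield: float, length: int) -> float:
    """Fraction of molecules that survive every one of `length` additions.

    Simplifying assumption: each addition succeeds independently with
    probability `step_yield`, and any failure ruins the molecule.
    """
    return step_yield ** length


for step_yield in (0.99, 0.995):
    fraction = full_length_fraction(step_yield, 200)
    print(f"step yield {step_yield:.1%}: {fraction:.1%} full length at 200 nt")
```

A half-percent change in per-step yield more than doubles the full-length fraction at 200 nt, which is why the exact coupling efficiency matters so much.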
Why can’t you make a 2000bp gene via direct oligo synthesis?
1. Running a step with a 99.5% success rate 2000 times in a row gives 0.995^2000 ≈ 0.000044 chance of success, because oligo synthesis is all or nothing: once a failure occurs, that molecule is lost.
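A minimal sketch of the 2000-step case, plus the inverse question of how long an oligo can get before the full-length fraction drops below a chosen cutoff. The 99.5% per-step figure comes from the answer above; the 10% cutoff and both function names are arbitrary choices for illustration.

```python
import math

STEP_YIELD = 0.995  # per-step success rate quoted in the answer above


def full_length_probability(length: int, step_yield: float = STEP_YIELD) -> float:
    """Chance that all `length` independent additions succeed."""
    return step_yield ** length


def max_length_for_yield(min_fraction: float, step_yield: float = STEP_YIELD) -> int:
    """Longest length whose full-length fraction is still >= min_fraction."""
    return math.floor(math.log(min_fraction) / math.log(step_yield))


p = full_length_probability(2000)
print(f"P(full-length 2000-mer) = {p:.2e}")
print(f"Roughly 1 in {round(1 / p):,} molecules is full length")

# Illustrative cutoff: keep at least 10% of molecules full length.
print(f"Longest oligo with >= 10% full length: {max_length_for_yield(0.10)} nt")
```

Under these assumptions the usable length tops out at a few hundred nucleotides, far short of a 2000 bp gene.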

Choose ONE of the following three questions to answer; and please cite AI prompts or paper citations used, if any.

[Using Google & Prof. Church’s slide #4] What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?
- Lysine contingency – the Jurassic Park failsafe of making the park animals unable to generate lysine themselves, so that it has to be provided to them. But in the story, the animals adapted in a way that got around this failsafe.
- The ten amino acids that animals can’t produce on their own, and so need to be supplied with, are: Arginine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Threonine, Tryptophan, and Valine
- Because lysine is already an essential amino acid that animals can’t make for themselves, we all have a lysine contingency, and they didn’t really need to do any gene editing to get this in Jurassic Park.
[Given slides #2 & 4 (AA:NA and NA:NA codes)] What code would you suggest for AA:AA interactions?
- NA:NA has a simple code (A:U) and so does NA:AA (AUG:Met), but for AA:AA this doesn’t make sense because the pairings are not clear. Amino acids are not “building blocks” as simple as the NA -> NA -> AA chain; because of their chemical composition, they’re discrete rather than modular units.
[(Advanced students)] Given the one paragraph abstracts for these real 2026 grant programs sketch a response to one of them or devise one of your own: