Week 2 HW: DNA Read, Write and Edit
Prep work for week 2 homework:
Questions from Professor Jacobson:
- Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?
DNA polymerase on its own makes roughly one mistake per 10⁵–10⁶ nucleotides. Proofreading improves this to about 1 in 10⁷, and post-replication mismatch repair brings the final error rate down to about 1 in 10⁹–10¹⁰ nucleotides.
Compared with the human genome size of ~3 × 10⁹ base pairs, polymerase by itself would still leave thousands of errors per cell division (3 × 10⁹ nucleotides at 1 error per 10⁵–10⁶ is about 3,000–30,000 errors), but with all of biology's defense mechanisms this drops to roughly 0–3 errors.
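The error counts above are simple multiplications of error rate by genome length. A minimal sketch of that arithmetic, using the rates quoted in this answer; the names `GENOME_BP`, `ERROR_RATES`, and `expected_errors` are made up for illustration:

```python
# Expected number of copying errors per genome replication at each fidelity level.
GENOME_BP = 3e9  # approximate human genome size in base pairs (from the text)

# Per-nucleotide error rates quoted above, as (best case, worst case).
ERROR_RATES = {
    "polymerase alone": (1e-6, 1e-5),
    "with proofreading": (1e-7, 1e-7),
    "with post-replication repair": (1e-10, 1e-9),
}


def expected_errors(error_rate, genome_bp=GENOME_BP):
    """Expected errors per replication = per-nucleotide error rate x genome length."""
    return error_rate * genome_bp


for stage, (low, high) in ERROR_RATES.items():
    print(f"{stage}: {expected_errors(low):g} to {expected_errors(high):g} errors")
```

This treats every position as equally error-prone, which is a simplification; real error rates vary with sequence context.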
- How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?
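Average human protein: ~1036 bp of coding sequence; at 3 bp per codon that is ~345 amino acids. Assuming about 3 synonymous codons per amino acid on average (61 sense codons shared among 20 amino acids), there are 3³⁴⁵ ≈ 10¹⁶⁵ theoretical ways to encode the same protein.
Reasons why not all of these codes work in practice: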
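As a rough check of that estimate, the sketch below computes the size of 3 raised to the 345th power as a power of ten, and also counts encodings exactly for a short peptide using the codon degeneracy of the standard genetic code. The average of 3 synonymous codons per amino acid is an assumption (61 sense codons spread over 20 amino acids), and the peptide `MKWL` is an arbitrary example, not a real protein of interest:

```python
import math

CODING_LENGTH_BP = 1036          # average coding length used in the text
CODONS = CODING_LENGTH_BP // 3   # 345 codons, one per amino acid
AVG_SYNONYMS = 3                 # assumed average synonymous codons per amino acid

# 3**345 is huge, so report its size as a power of ten instead.
log10_estimate = CODONS * math.log10(AVG_SYNONYMS)
print(f"about 10^{log10_estimate:.1f} possible encodings")

# Number of codons per amino acid in the standard genetic code (one-letter codes).
DEGENERACY = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}


def count_encodings(protein):
    """Exact number of DNA sequences encoding `protein` (stop codon not counted)."""
    return math.prod(DEGENERACY[aa] for aa in protein)


print(count_encodings("MKWL"))  # 1 * 2 * 1 * 6
```

The exact count depends on amino acid composition: a protein rich in leucine, serine, or arginine has more synonymous encodings than one rich in methionine or tryptophan.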
- Codon bias: Rare codons slow translation or reduce protein yield
- mRNA structure: Folding can block ribosomes
- Translation speed: Affects proper protein folding
- Regulatory signals: Splicing, miRNA, or promoter motifs can be disrupted
- GC content & stability: Extreme sequences can be silenced or unstable
- Quality control: Cells degrade faulty mRNA or misfolded proteins
Questions from Dr. LeProust:
- What’s the most commonly used method for oligo synthesis currently?
Phosphoramidite chemical synthesis
- Why is it difficult to make oligos longer than 200nt via direct synthesis?
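Every coupling step has a small chance of failing, and those errors accumulate with each added nucleotide, so the fraction of correct full-length product falls sharply as oligos get longer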
- Why can’t you make a 2000bp gene via direct oligo synthesis?
Per-step errors multiply across all ~2000 steps, leaving essentially no correct full-length product, so genes are built by assembling shorter oligos instead
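To see how quickly errors compound, here is a minimal sketch of the fraction of product that reaches full length. The 99% per-step coupling efficiency is an assumed, illustrative figure rather than a number from the lecture, and `full_length_fraction` is a hypothetical helper:

```python
def full_length_fraction(length_nt, coupling_efficiency=0.99):
    """Fraction of strands that reach full length if every coupling succeeds.

    Assumes each coupling succeeds independently with the same probability,
    and that an n-nt oligo needs n - 1 couplings (the first nucleotide is
    already attached to the solid support).
    """
    couplings = length_nt - 1
    return coupling_efficiency ** couplings


for length in (20, 200, 2000):
    fraction = full_length_fraction(length)
    print(f"{length} nt: {fraction:.2e} of strands are full length")
```

Under this assumption only a small fraction of a 200 nt synthesis is full length, and at 2000 nt the fraction is negligible. The model counts failed couplings only; other error types such as base damage would lower the usable yield further.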
Question from George Church:
- [Using Google & Prof. Church’s slide #4] What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?
- Histidine
- Isoleucine
- Leucine
- Lysine
- Methionine
- Phenylalanine
- Threonine
- Tryptophan
- Valine
- Arginine
I did not know about the “Lysine Contingency” before, so I am not sure how this affects my view. Still, it is really interesting that evolution locked us out of making something required at such a fundamental level. It also reveals a systemic dependency at massive scale that I was unaware of.