Week 2 HW: DNA Read Write and Edit
Week 1 Lecture Prep
Homework Questions from Professor Jacobson
1. Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?
Answer: Replicative DNA polymerases with proofreading (such as the human replicative polymerases) have an intrinsic error rate of approximately 10⁻⁶ to 10⁻⁷ errors per base per replication. After mismatch repair, the final in vivo mutation rate is approximately 10⁻⁹ to 10⁻¹⁰ per base per cell division. The human genome is ~3.2 × 10⁹ base pairs, so replication at a 10⁻⁶ error rate with no further repair would give 3.2 × 10⁹ bp × 10⁻⁶ ≈ 3,200 mutations per cell division. At the repaired rate of 10⁻⁹ to 10⁻¹⁰ per base, the expectation drops to roughly 0.3 to 3 mutations per genome per division. Biology resolves this discrepancy with a layered fidelity system: base selectivity of the polymerase, 3′→5′ exonuclease proofreading, the mismatch repair (MMR) system, and DNA damage repair pathways. Together these reduce the error rate by several orders of magnitude.
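The arithmetic in this answer can be checked with a short Python sketch. It assumes the figures quoted above (a genome of about 3.2 × 10⁹ bp and per-base error rates between 10⁻⁶ and 10⁻¹⁰); the function name `expected_errors` is illustrative, not from the lecture.

```python
# Expected errors per cell division = per-base error rate x genome length.
GENOME_BP = 3.2e9  # approximate human genome length in bp (figure quoted in the answer)

def expected_errors(error_rate_per_base, genome_bp=GENOME_BP):
    """Expected number of errors made while copying the genome once."""
    return error_rate_per_base * genome_bp

# Rates are the approximate values quoted in the answer, not measurements.
scenarios = [
    ("polymerase with proofreading (1e-6)", 1e-6),
    ("after mismatch repair, upper (1e-9)", 1e-9),
    ("after mismatch repair, lower (1e-10)", 1e-10),
]

for label, rate in scenarios:
    print(f"{label}: ~{expected_errors(rate):g} errors per division")
```

The three printed values span about four orders of magnitude, which is the gap the layered repair systems have to close.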
2. How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?
Answer:
An average human protein has a coding sequence of approximately 1,036 bp, or about 345 amino acids (AA), and every amino acid except methionine and tryptophan is encoded by multiple codons. If the average degeneracy is 3 codons per amino acid, the total number of possible DNA sequences is approximately 3³⁴⁵ ≈ 10¹⁶⁴, which is astronomically large. In practice, not all of these codes work, for several reasons. First, codon usage bias affects translation efficiency. Second, mRNA secondary structure matters: high GC content and hairpins can block ribosome binding. Third, a sequence can accidentally create regulatory motifs such as cryptic splice sites or polyA signals. Fourth, repetitive elements cause instability during synthesis. Fifth, GC content extremes affect stability and expression. Finally, codon choice sets the local translation speed, which influences co-translational protein folding.
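The size of this number is easier to see on a log scale. The sketch below assumes the same simplification as the answer (345 amino acids, a flat average of 3 codons each); a real estimate would weight each amino acid by its own codon count and frequency. The name `log10_codings` is illustrative.

```python
import math

def log10_codings(n_amino_acids, avg_codons_per_aa):
    """log10 of the number of DNA sequences encoding one protein,
    assuming every position has the same average codon degeneracy."""
    return n_amino_acids * math.log10(avg_codons_per_aa)

exponent = log10_codings(345, 3)
print(f"3^345 is roughly 10^{exponent:.1f}")

# Python integers are arbitrary precision, so the exact count is also available.
exact = 3 ** 345
print(f"exact value has {len(str(exact))} decimal digits")
```

Working with the exponent rather than the raw number keeps the calculation in ordinary floating point and makes it easy to try other degeneracy assumptions.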
Homework Questions from Dr. LeProust
1. What’s the most commonly used method for oligo synthesis currently?
Answer: Phosphoramidite solid-phase chemical synthesis is currently the most commonly used method for oligo synthesis. It was developed in the early 1980s.
2. Why is it difficult to make oligos longer than 200nt via direct synthesis?
Answer: Each base addition has a coupling efficiency of about 99–99.5%. Even at 99.5%, the fraction of full-length product after 200 cycles is (0.995)^200 ≈ 0.37, or about 37%, and this yield falls exponentially with length. Depurination, incomplete deprotection, and other side reactions add further errors, so purity collapses beyond ~200 nt.
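A minimal Python sketch of this yield calculation follows. It uses the same simplification as the answer, one coupling cycle per nucleotide with a constant efficiency, and ignores depurination and other side reactions. The function name `full_length_yield` is illustrative.

```python
def full_length_yield(coupling_efficiency, n_cycles):
    """Fraction of strands that received every base, assuming each
    coupling step succeeds independently with the same probability."""
    return coupling_efficiency ** n_cycles

# Compare the two ends of the 99-99.5% range at a few oligo lengths.
for length in (50, 100, 200, 300):
    low = full_length_yield(0.99, length)
    high = full_length_yield(0.995, length)
    print(f"{length:>4} nt: {low:6.1%} at 99.0%, {high:6.1%} at 99.5%")
```

Half a percentage point of coupling efficiency makes a large difference at 200 nt, which is why small chemistry improvements matter so much for long oligos.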
3. Why can’t you make a 2000bp gene via direct oligo synthesis?
Answer: Because (0.995)^2000 ≈ 0.00004, direct synthesis would yield essentially zero full-length product. Instead, genes are assembled from shorter oligos using methods such as Gibson Assembly, PCR assembly, and chip-based multiplex synthesis.
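The same relationship can be inverted to ask how long an oligo can be before the full-length yield drops below a chosen threshold. This is a sketch under the constant-efficiency assumption used above; the 50% threshold and the name `max_length_for_yield` are illustrative choices, not values from the lecture.

```python
import math

def max_length_for_yield(coupling_efficiency, min_yield):
    """Largest number of coupling cycles n with efficiency**n >= min_yield,
    found by solving n <= ln(min_yield) / ln(efficiency)."""
    return math.floor(math.log(min_yield) / math.log(coupling_efficiency))

print("2000-cycle yield at 99.5%:", 0.995 ** 2000)
print("longest oligo with >= 50% yield at 99.5%:", max_length_for_yield(0.995, 0.5))
print("longest oligo with >= 50% yield at 99.0%:", max_length_for_yield(0.99, 0.5))
```

Under these assumptions the usable length is in the low hundreds of nucleotides at best, which is why a 2000 bp gene is built by assembling many shorter oligos.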
Homework Questions from George Church
1. What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?
Answer: The 10 essential amino acids in animals are:
- Histidine
- Isoleucine
- Leucine
- Lysine
- Methionine
- Phenylalanine
- Threonine
- Tryptophan
- Valine
- Arginine
The Lysine Contingency is the idea of engineering organisms so that their survival depends on an external lysine supply. Because lysine is an essential amino acid that animals cannot synthesize, it can serve as a powerful control metabolite. If an engineered organism is made auxotrophic for lysine, it cannot survive outside a controlled environment, which prevents ecological escape and strengthens synthetic biocontainment. In this sense lysine acts as a biological kill switch, and lysine auxotrophy can also be used for selection.
Reference: Mandell, D., Lajoie, M., Mee, M. et al. Biocontainment of genetically modified organisms by synthetic protein design. Nature 518, 55–60 (2015). https://doi.org/10.1038/nature14121