Week 2 Pre-Lecture HW: DNA Read, Write, and Edit
Pre-lecture 2 homework questions:
Question 1: What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy? Response: The intrinsic error rate of DNA polymerase is roughly 1 in 10^4 to 1 in 10^5 bases. The human genome is about 3 * 10^9 base pairs long, so at that rate each round of replication would introduce tens to hundreds of thousands of errors. Biology closes the gap with proofreading (the polymerase's 3'-to-5' exonuclease activity), which lowers the error rate to about 1 in 10^7, and with further correction mechanisms such as mismatch repair and DNA damage checkpoints. If severe damage is still detected, the cell undergoes apoptosis.
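The comparison in this response is simple multiplication, and a short Python sketch makes the scale easier to see. It uses only the rates quoted in the response; the function name `expected_errors`, the haploid genome size of 3 * 10^9 bp, and the assumption of a uniform, independent per-base error rate are illustrative simplifications, not values from the lecture.

```python
# Rough expected number of misincorporated bases per genome copy.
# Assumption: haploid genome of 3e9 bp and a uniform per-base error rate.

GENOME_BP = 3_000_000_000  # human genome length used in the response


def expected_errors(error_rate: float, genome_bp: int = GENOME_BP) -> float:
    """Expected errors per replication = genome length * per-base error rate."""
    return genome_bp * error_rate


# Error rates quoted in the response (labels are illustrative).
rates = {
    "polymerase alone, 1 in 10^4": 1e-4,
    "polymerase alone, 1 in 10^5": 1e-5,
    "with proofreading, 1 in 10^7": 1e-7,
}

for label, rate in rates.items():
    print(f"{label}: ~{expected_errors(rate):,.0f} errors per genome copy")
```

Under these assumptions the polymerase alone would leave about 30,000 to 300,000 errors per genome copy, and proofreading brings that down to about 300, which is why additional repair pathways are still needed.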
Question 2: How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest? Response: The genetic code is degenerate, meaning multiple codons can encode the same amino acid. With 61 codons specifying 20 amino acids, each amino acid is encoded by about three codons on average. For a typical human protein of roughly 400 amino acids, this gives approximately 3^400 (about 10^190) possible DNA sequences that could produce the same protein. In practice, many of these sequences do not function effectively. Cells exhibit codon bias, preferring certain codons because the corresponding tRNAs are more abundant, which improves translation efficiency. Some sequences also form mRNA secondary structures that hinder ribosome movement or reduce stability, leading to lower protein production. Also, specific nucleotide patterns can unintentionally introduce regulatory signals such as premature termination signals or cryptic splice sites. For these reasons, although many DNA sequences are theoretically possible, only a subset will reliably generate the desired protein within a biological system.
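The 3^400 estimate can be checked, and refined for a specific sequence, with a small Python sketch. The codon counts per amino acid are those of the standard genetic code; the function name `count_encodings` and the five-residue peptide `MKWLS` are made up for illustration, and the stop codon is ignored.

```python
import math

# Number of codons per amino acid in the standard genetic code (sums to 61).
CODONS_PER_AA = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(protein: str) -> int:
    """Exact number of DNA coding sequences for `protein` (stop codon ignored)."""
    return math.prod(CODONS_PER_AA[aa] for aa in protein)


# Back-of-envelope estimate from the response: ~3 codons per residue, 400 residues.
estimate_log10 = 400 * math.log10(3)
print(f"3^400 is about 10^{estimate_log10:.1f}")

# Exact count for a short, hypothetical peptide.
peptide = "MKWLS"
print(f"{peptide}: {count_encodings(peptide)} possible coding sequences")
```

For the example peptide the count is 1 * 2 * 1 * 6 * 6 = 72. For a real protein the exact number depends on its composition, since methionine and tryptophan contribute a factor of 1 while leucine, serine, and arginine each contribute a factor of 6.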
Questions from Dr. LeProust:
- Most commonly used oligo synthesis method? Response: The most commonly used oligonucleotide synthesis method is solid-phase phosphoramidite synthesis.
- Why it’s hard to make oligos >200 nt directly? Response: It is difficult to synthesize oligos longer than about 200 nucleotides because each nucleotide addition is not perfectly efficient. Even with ~99% coupling efficiency, small errors accumulate at every step, causing many strands to be truncated or contain mistakes by the end of the process. Longer sequences also become harder to chemically handle and purify, reducing overall yield and quality.
- Why you can’t make a 2000 bp gene by direct oligo synthesis? Response: A 2000 bp gene cannot be made directly through oligo synthesis because the stepwise chemical process introduces small errors during each nucleotide addition. As the sequence length increases, these errors accumulate, leading to a very low proportion of full-length, correct DNA strands. The resulting mixture would contain many truncated or incorrect sequences, making purification impractical. Instead, long genes are typically assembled from shorter, high-quality oligos using methods such as PCR or gene assembly.
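The length limit described in the last two answers follows from compounding the per-step coupling efficiency. The Python sketch below is a simplified model: it uses the ~99% figure from the response, assumes every coupling step succeeds independently, and ignores side reactions and purification losses. The function name `full_length_fraction` is illustrative.

```python
def full_length_fraction(length_nt: int, coupling_efficiency: float = 0.99) -> float:
    """Fraction of strands in which every coupling step succeeded.

    An n-nt oligo needs n - 1 couplings, because the first base is already
    attached to the solid support (a simplifying assumption of this model).
    """
    return coupling_efficiency ** (length_nt - 1)


for n in (20, 100, 200, 2000):
    print(f"{n:>5} nt: {full_length_fraction(n):.2e} of strands are full length")
```

At 99% efficiency per step, only about 13% of 200-nt strands come out full length, and for 2000 nt the fraction drops to the order of 10^-9, which is why long genes are assembled from shorter oligos instead.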
Question from Dr. George Church: What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”? Response: The ten essential amino acids in animals are histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, valine, and arginine (arginine is needed mainly during growth). They are considered essential because animals cannot synthesize them in sufficient amounts and must obtain them from the diet. This weakens the idea behind the “Lysine Contingency,” which proposes engineering organisms to depend on supplied lysine so they cannot survive outside controlled environments. Because lysine is already on the essential list, every animal depends on dietary lysine anyway, so the engineered dependency adds nothing new. Lysine is also common in natural environments and diets, so an escaped organism could obtain it from surrounding biological material, which makes the safeguard unreliable. Effective biocontainment typically requires dependence on nutrients that are rare or entirely synthetic rather than on widely available amino acids.
Sources used for homework: Google searches and slides for Lecture 2