Lecture Preps
This section contains my weekly lecture preparation notes and faculty questions.
The human genome is about 3.2 Gbp long (Slide 10). Since 1 Gbp = 10⁹ bp, this is 3.2 × 10⁹ bp. Because DNA polymerase makes approximately one mistake every 10⁶ bp (Slide 8), a single round of replication is prone to thousands of potential mistakes (3.2 × 10⁹ / 10⁶ ≃ 3200). To deal with these errors, mismatch repair systems (Slide 14) detect and fix replication errors that escape polymerase proofreading.
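The estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the two slide values (3.2 Gbp genome, one mistake per 10⁶ bp); the function and constant names are my own.

```python
def expected_replication_errors(genome_bp: int, bp_per_error: int) -> float:
    """Expected number of polymerase mistakes in one full genome replication."""
    return genome_bp / bp_per_error


GENOME_BP = 3_200_000_000  # 3.2 Gbp, human genome length (Slide 10)
BP_PER_ERROR = 1_000_000   # roughly one mistake every 10^6 bp (Slide 8)

errors = expected_replication_errors(GENOME_BP, BP_PER_ERROR)
print(f"Expected errors per replication: {errors:.0f}")
```

The result is an order-of-magnitude figure, which is why rounding it to "about 3000" is also reasonable.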
The average human protein is encoded by around 1036 bp (Slide 6), which corresponds to approximately 1036/3 ≃ 345 amino acids. Most amino acids can be encoded by more than one codon (for example, leucine and serine have 6 codons, alanine and glycine have 4 codons, while methionine and tryptophan have only 1 codon). Because of this, depending on the amino acid composition of the protein, there can be many different DNA coding sequences for the same protein sequence. In practice, not all of these coding sequences work equally well, because different DNA sequences can have very different physical properties. Factors such as GC content (Slide 39) can interfere with DNA synthesis and proper expression by affecting secondary structure and minimum free energy. For this reason, coding sequences often need to be optimized for the host organism, since each organism has its own requirements in this respect.
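To get a feel for how quickly the number of possible coding sequences grows, here is a minimal sketch that multiplies the codon counts of each residue, assuming the standard genetic code. The peptide "MLSW" is a made-up example and the function name is my own.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (one-letter codes).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_synonymous_sequences(protein: str) -> int:
    """Number of distinct DNA coding sequences for a protein (stop codon excluded)."""
    return prod(CODON_DEGENERACY[residue] for residue in protein)


average_length_aa = 1036 // 3                 # about 345 amino acids (Slide 6)
example = count_synonymous_sequences("MLSW")  # hypothetical 4-residue peptide
print(average_length_aa, example)
```

Even this 4-residue peptide has 1 × 6 × 6 × 1 = 36 possible coding sequences, so for a protein of about 345 residues the number is astronomically large.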
Modern oligonucleotide synthesis is most commonly based on the phosphoramidite method (Slide 11; Hoose et al., 2023).
Since the elongation efficiency is below 100%, errors accumulate with each nucleotide addition cycle. Consequently, beyond about 200 nt, the product becomes dominated by incorrectly synthesized DNA fragments. Moreover, as the chain length increases, depurination, strand breakage, and stable hairpin formation become more likely and significantly reduce the final yield of pure DNA (Hoose et al., 2023).
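The compounding effect can be illustrated with a simple model in which every coupling step succeeds independently with the same probability, so the full-length fraction is the efficiency raised to the number of couplings. The 99% efficiency below is an assumed value for illustration, not a number from the slides or from Hoose et al., and the model ignores depurination, strand breakage, and hairpins.

```python
def full_length_fraction(length_nt: int, coupling_efficiency: float) -> float:
    """Fraction of chains that are full length after (length_nt - 1) couplings.

    Simplified model: each coupling succeeds independently with the same probability.
    """
    if length_nt < 1:
        raise ValueError("length_nt must be at least 1")
    return coupling_efficiency ** (length_nt - 1)


ASSUMED_EFFICIENCY = 0.99  # hypothetical per-step coupling efficiency

for length in (20, 100, 200):
    fraction = full_length_fraction(length, ASSUMED_EFFICIENCY)
    print(f"{length:>4} nt: {fraction:.1%} full-length product")
```

Under this assumption only a small minority of 200 nt chains are full length, which is consistent with the practical limit described above.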
Oligonucleotides longer than 200 nt already accumulate a high proportion of incorrectly synthesized DNA, for the reasons discussed in Question 2. Synthesizing a 2000 bp sequence directly is therefore even more difficult, because errors accumulate over ten times as many cycles and strand breakage and hairpin formation become much more likely. To build a 2000 bp gene, shorter oligonucleotide fragments must instead be synthesized and then assembled using methods such as Gibson assembly (Hoose et al., 2023).
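As a rough illustration of the assembly idea, the sketch below splits a 2000 bp target into overlapping fragments that stay within the 200 nt limit. The 200 nt fragment length and 30 bp overlap are assumed values chosen for the example, and the function name is my own. Real designs also have to check melting temperatures and secondary structure in the overlaps.

```python
def tile_fragments(total_len: int, frag_len: int = 200, overlap: int = 30) -> list[tuple[int, int]]:
    """Return (start, end) coordinates of overlapping fragments covering the target.

    Coordinates are 0-based and half-open; neighbouring fragments share `overlap` bp.
    """
    if total_len < 1:
        raise ValueError("total_len must be positive")
    if not 0 <= overlap < frag_len:
        raise ValueError("overlap must be non-negative and smaller than frag_len")
    step = frag_len - overlap
    fragments = []
    start = 0
    while True:
        end = min(start + frag_len, total_len)
        fragments.append((start, end))
        if end == total_len:
            break
        start += step
    return fragments


fragments = tile_fragments(2000)  # hypothetical 2000 bp gene
print(len(fragments), "fragments; first:", fragments[0], "last:", fragments[-1])
```

With these assumed parameters the gene is covered by 12 fragments, the last one shorter than the others.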
Essential amino acids for all animals are: arginine, histidine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan, valine, and lysine. According to https://jurassicpark.fandom.com/wiki/Lysine_contingency, the “lysine contingency” is a method of inserting a faulty enzyme gene into the dinosaurs so that lysine metabolism is disrupted and they are unable to biosynthesize the lysine they need. This was intended as a containment strategy, preventing them from surviving outside the island, because they are supposedly dependent on lysine-rich plants found there. However, this logic is flawed: in reality, like all animals, dinosaurs would not have been able to synthesize lysine in the first place and would have obtained it from dietary sources. Thus, lysine dependency is a natural condition in animals, including humans, rather than an effective synthetic containment mechanism.
Hoose, A., et al. (2023). DNA synthesis technologies to close the gene writing gap. Nature Reviews Chemistry, 7, 144-161.
Week #2 slides from Prof. Jacobson, Dr. LeProust, and Prof. George Church (https://pages.htgaa.org/2026a/course-pages/weeks/week-01/index.html)
AI was used for grammatical corrections and paraphrasing ideas.