Preparation Class 02

HOMEWORK QUESTIONS FROM STEVEN JACOBSEN

After proofreading, DNA polymerase has an error rate of about 1 in 10^6, i.e. 1 error per million base pairs. The human genome contains approx. 3 billion base pairs (3x10^9 bp) in haploid cells and thus 6 billion base pairs (6x10^9 bp) in diploid cells. Replicating a diploid genome therefore introduces on the order of 6,000 errors, but the cell's post-replication mismatch repair (MMR) system brings this down to only a few errors per division.
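The arithmetic behind this estimate can be sketched in a few lines of Python. This is only a back-of-the-envelope check using the figures quoted above; the function and constant names are illustrative, and the calculation assumes errors are spread uniformly over the genome.

```python
def expected_replication_errors(genome_size_bp: float, error_rate_per_bp: float) -> float:
    """Expected number of errors = number of bases copied x per-base error rate."""
    return genome_size_bp * error_rate_per_bp


HAPLOID_GENOME_BP = 3e9                 # approx. size of the haploid human genome
DIPLOID_GENOME_BP = 2 * HAPLOID_GENOME_BP
POLYMERASE_ERROR_RATE = 1e-6            # 1 error per 10^6 bp after proofreading

errors_before_mmr = expected_replication_errors(DIPLOID_GENOME_BP, POLYMERASE_ERROR_RATE)
print(f"Expected errors per diploid replication (before MMR): about {errors_before_mmr:,.0f}")
```

Comparing this figure with the "few errors per division" left after MMR shows how much of the final fidelity comes from the repair step rather than from the polymerase itself.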

Human proteins are built from 20 amino acids (aa) whose sequence is encoded in the DNA (A, C, G, T nucleotides). Ribosomes are macromolecular complexes that synthesize proteins by translating messenger RNA (mRNA) into amino acid chains. Translation is mediated by transfer RNA (tRNA) molecules, each of which delivers the single amino acid that matches an mRNA codon (three nucleotides from A, C, G, U) through codon/anticodon pairing. Because there are fewer amino acids than possible codons (4^3=64), several codons can encode the same amino acid: a phenomenon called codon redundancy (or degeneracy). Some codons also signal the start (AUG) and the end (UAA, UAG, UGA) of translation. According to the standard genetic code, each amino acid is encoded by between one and six codons (methionine and tryptophan have one; leucine, serine and arginine have six). So in theory there is a staggering number of possible DNA sequences coding for an average human protein (approx. 450-480 aa in length).
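To get a feel for how large this number is, the sketch below counts the synonymous DNA sequences for a given protein by multiplying the number of codons available for each residue. It uses the standard genetic code written in DNA letters; the function names and the short example peptide are hypothetical and only serve as illustration.

```python
from collections import Counter
from itertools import product
from math import log10

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons that encode each amino acid (1 for M and W, up to 6 for L, S, R).
DEGENERACY = Counter(CODON_TABLE.values())


def count_synonymous_sequences(protein: str) -> int:
    """Number of distinct coding sequences for a protein given in one-letter code."""
    total = 1
    for aa in protein:
        total *= DEGENERACY[aa]
    return total


def log10_synonymous_sequences(protein: str) -> float:
    """Same count on a log10 scale, convenient for full-length proteins."""
    return sum(log10(DEGENERACY[aa]) for aa in protein)


example_peptide = "MKWL"  # hypothetical 4-residue peptide: 1 * 2 * 1 * 6 encodings
print(count_synonymous_sequences(example_peptide))
```

For a real protein of several hundred residues the count has hundreds of digits, which is why the log10 version is the more practical one to report.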

But in practice, spatial configuration and kinetics can affect this process:

AI support: ChatGPT. Prompt: Please read this research article thoroughly and answer “In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?”: https://www.science.org/doi/10.1126/science.1241459

HOMEWORK QUESTIONS FROM EMILY LEPROUST

Solid-phase phosphoramidite chemical synthesis is the industry-standard, automated method for creating custom DNA/RNA oligonucleotides.

Direct synthesis of oligonucleotides (oligos) longer than 200 nucleotides (nt) is difficult mainly because the yield of full-length product declines exponentially with length, due to imperfect coupling efficiency and the accumulation of chemical errors.

Cumulative inefficiency: Standard oligo synthesis adds nucleotides one at a time. Even if each coupling step succeeds 99% of the time, only about 13% of the strands are full length at 200 nt (0.99^199), and the rest are truncated or incomplete products.

Accumulation of errors: Longer syntheses expose the growing strand to more chemical side reactions, leading to a higher rate of sequence errors such as deletions or misincorporations.

Purification challenges: As length increases, it becomes harder to separate the desired full-length, error-free product from the failed side products.

Steric hindrance: As the oligo grows, it can become tangled, making it harder for reagents to reach the reactive end.
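The cumulative effect of imperfect coupling can be sketched as a simple power law. The snippet below is a minimal illustration, assuming every coupling step has the same independent success probability and that an n-nt oligo needs n-1 couplings because the first nucleotide is already attached to the solid support; the function name is hypothetical.

```python
def full_length_yield(length_nt: int, coupling_efficiency: float = 0.99) -> float:
    """Fraction of strands that reach full length after (length_nt - 1) couplings."""
    if length_nt < 1:
        raise ValueError("length_nt must be at least 1")
    return coupling_efficiency ** (length_nt - 1)


# Compare a few oligo lengths at two per-step coupling efficiencies.
for efficiency in (0.99, 0.995):
    for length in (20, 60, 100, 200):
        fraction = full_length_yield(length, efficiency)
        print(f"efficiency={efficiency:.3f}  length={length:>3} nt  full-length={fraction:.1%}")
```

Running it with different efficiencies shows why small gains in per-step chemistry matter so much more for long oligos than for short ones.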

Making a 2000 bp (base pair) gene via direct synthesis is currently not possible because of these limitations, which result in a vanishingly low yield of the full-length, correct sequence.

Exponentially low yield: At the standard 99% coupling efficiency, only about 2 in a billion strands (0.99^1999) of a single-stranded DNA or RNA molecule 2000 bases long would reach full length, which is effectively zero usable product.

Error rate vs. length: The error rate is roughly one mistake per 200 bases, so a 2000-base strand would contain an average of 10 errors, making it highly unlikely that any given strand carries the correct sequence.

Physical limits of the support: The solid support material (e.g., controlled pore glass) becomes clogged by the growing DNA strands, preventing reagents from completing the synthesis.
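The error-rate argument can also be checked numerically. The sketch below takes the figure of roughly one error per 200 bases quoted above and, as a simplifying assumption, treats errors as independent and equally likely at every position; the function names are illustrative.

```python
def expected_errors(length_nt: int, bases_per_error: float = 200.0) -> float:
    """Average number of errors in a strand of the given length."""
    return length_nt / bases_per_error


def fraction_error_free(length_nt: int, bases_per_error: float = 200.0) -> float:
    """Probability that a strand contains no error at any position."""
    per_base_accuracy = 1.0 - 1.0 / bases_per_error
    return per_base_accuracy ** length_nt


gene_length = 2000
print(f"Expected errors in {gene_length} nt: {expected_errors(gene_length):.0f}")
print(f"Fraction of error-free strands: {fraction_error_free(gene_length):.1e}")
```

Under these assumptions only a few strands in 100,000 would be error-free even if every strand reached full length, so the length and error problems compound each other.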

AI support: ChatGPT, Gemini. Prompt: long oligonucleotide synthesis + “What’s the most commonly used method for oligo synthesis currently? Why is it difficult to make oligos longer than 200nt via direct synthesis? Why can’t you make a 2000bp gene via direct oligo synthesis?”

HOMEWORK QUESTIONS FROM GEORGE CHURCH

Lysine is one of the 10 amino acids that are essential in all animals: Arginine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Threonine, Tryptophan, and Valine. An amino acid is classified as essential for a species if the organism cannot synthesize it and must therefore obtain it from the diet (or another external supply) in order to survive.

In the movie Jurassic Park (1993), chief engineer Ray Arnold explains how the research team modified the genome of the dinosaurs to prevent them from surviving in the wild if they escaped the park: “The lysine contingency is intended to prevent the spread of the animals in case they ever get off the island. Dr. Wu inserted a gene that makes a single faulty enzyme in protein metabolism. The animals can’t manufacture the amino acid lysine. Unless they’re continually supplied with lysine by us, they’ll slip into a coma and die.”

Lysine is classified as an essential amino acid in all known animals, including vertebrates. The movie portrays the “Lysine Contingency” as an engineered weakness, but dinosaurs most likely had no endogenous lysine biosynthesis pathway to disable in the first place. The auxotrophic strategy behind the “Lysine Contingency” is also flawed in practice: lysine is widely available in nature, particularly in animal tissue but also in some plants. Carnivorous dinosaurs, which represent the main threat on the island, would likely have no difficulty finding a lysine supply in the wild. Making an organism dependent on a non-natural amino acid would have been more plausible than blocking the biosynthesis of a common nutrient like lysine.

In the real world, synthetic biologists use more robust strategies to design genetic safeguards: genetic kill switches (circuits that trigger cell death in certain environments); synthetic amino acid dependencies (organisms engineered to depend on non-natural amino acids that do not occur in nature); multiple overlapping dependencies (not just one but many safety constraints); and genetic firewalling (preventing horizontal gene transfer).

In conclusion, movies like Jurassic Park make synthetic biology look inherently dangerous, even though real scientists focus heavily on safety and careful regulation. The media shape how society feels about science and therefore also have a responsibility to spark curiosity without creating unnecessary stigma around technologies that can bring major benefits.

AI support: Gemini. Prompt: What could the scientist of the “Lysine Contingency” have proposed instead?