Week-2 Prep Questions

Homework Questions from Professor Jacobson

Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome, and how does biology deal with that discrepancy?

  • Human replicative DNA polymerases have an initial (pre-proofreading) error rate of approximately $10^{-4}$–$10^{-6}$ per base. Copying the ~3 × 10⁹-base human genome at that rate would introduce roughly 3,000 errors per replication cycle at the low end of the range and up to ~300,000 at the high end. Biology suppresses these errors through polymerase proofreading (3′→5′ exonuclease activity), mismatch repair systems, and cellular quality-control mechanisms.
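
The arithmetic behind that estimate can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the raw polymerase range and the genome size come from the answer above, while the post-proofreading and post-mismatch-repair rates are assumed ballpark values added purely for comparison, and the names `GENOME_BASES`, `ERROR_RATES`, and `expected_errors` are hypothetical.

```python
# Back-of-the-envelope estimate: expected errors = per-base error rate x bases copied.
GENOME_BASES = 3e9  # approximate human genome size used in the answer above

# The raw polymerase range is taken from the answer above; the last two
# rates are ASSUMED ballpark values, included only for illustration.
ERROR_RATES = {
    "raw polymerase (high end)": 1e-4,
    "raw polymerase (low end)": 1e-6,
    "after proofreading (assumed)": 1e-7,
    "after mismatch repair (assumed)": 1e-9,
}


def expected_errors(error_rate, bases=GENOME_BASES):
    """Expected number of miscopied bases in one pass over the genome."""
    return error_rate * bases


for stage, rate in ERROR_RATES.items():
    print(f"{stage:32s} {rate:.0e} per base -> ~{expected_errors(rate):,.0f} errors")
```

Under these assumptions, each additional layer of correction cuts the expected error count by one or more orders of magnitude, which is the sense in which biology closes the gap between polymerase fidelity and genome length.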

How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice, why don’t all of these codes work for the protein of interest?

  • The coding sequence for an average human protein is approximately 1,036 bp long, corresponding to ~345 amino acids. With roughly 3 synonymous codons per amino acid on average, this yields approximately $3^{345} \approx 4 \times 10^{164}$ possible DNA sequences that could encode the same protein.
  • In practice, many of these sequences fail due to:
    • Secondary structure: Certain nucleotide combinations form stable mRNA structures that stall ribosomes or reduce mRNA stability.
    • GC content: Extremely high or low GC content can lead to synthesis failures or poor expression.
    • Codon usage bias: Rare codons can slow translation or reduce protein yield due to limited tRNA availability.
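
As a rough check on the size of that sequence space, the sketch below computes the uniform estimate ($3^{345}$) and, for comparison, an exact count for a short peptide using the codon degeneracy of the standard genetic code. The function names and the example peptides are hypothetical and chosen only for illustration; the exact count for a real protein depends on its actual amino-acid composition.

```python
import math

# Number of codons per amino acid in the standard genetic code
# (one-letter amino-acid codes; stop codons excluded).
CODONS_PER_AA = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}


def count_encodings(peptide):
    """Exact number of DNA sequences that translate to the given peptide."""
    return math.prod(CODONS_PER_AA[aa] for aa in peptide)


def log10_uniform_estimate(n_residues, codons_per_residue=3):
    """log10 of the uniform estimate codons_per_residue ** n_residues."""
    return n_residues * math.log10(codons_per_residue)


# Short hypothetical peptides: Met-Lys-Trp and Met-Leu-Ser.
print(count_encodings("MKW"))
print(count_encodings("MLS"))

# Uniform estimate for a ~345-residue protein, reported as a power of ten.
print(f"3^345 is about 10^{log10_uniform_estimate(345):.1f}")
```

The uniform estimate treats every residue as having 3 synonymous codons; the per-residue table shows why the true count varies with composition, since methionine and tryptophan contribute a factor of 1 while leucine, serine, and arginine each contribute a factor of 6.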

Homework Questions from Dr. LeProust

What’s the most commonly used method for oligo synthesis currently?

  • Phosphoramidite solid-phase chemical synthesis is the most commonly used method for oligonucleotide synthesis today.

Why is it difficult to make oligos longer than 200 nt via direct synthesis?

  • The primary limitation is step-wise yield loss: overall yield is governed by the cumulative coupling efficiency at each synthesis step, following
    $\text{Yield} = (\text{efficiency})^{\text{length}}$. Even with a high coupling efficiency (e.g., 0.997 per base), a 200-nt oligo yields only ~55% full-length product, and yield decreases exponentially as length increases beyond this range.
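
The yield formula above is easy to evaluate directly. The sketch below is a minimal illustration, assuming the same simplified model (one coupling per base at a constant efficiency, no other loss mechanisms); the function name `full_length_yield` and the chosen lengths are hypothetical.

```python
def full_length_yield(coupling_efficiency, length_nt):
    """Fraction of full-length product under the simple model
    yield = efficiency ** length (constant per-step efficiency)."""
    return coupling_efficiency ** length_nt


EFFICIENCY = 0.997  # per-base coupling efficiency used in the answer above

# Lengths chosen for illustration: a short primer, the ~200-nt practical
# limit discussed above, and a gene-length 2,000-nt product.
for length in (20, 100, 200, 500, 2000):
    fraction = full_length_yield(EFFICIENCY, length)
    print(f"{length:5d} nt -> {fraction:.2%} full-length")
```

Under this model the 200-nt case comes out near 55%, matching the figure above, while the 2,000-nt case falls well below 1%, which illustrates why gene-length constructs are assembled from shorter oligos instead.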

Why can’t you make a 2000 bp gene via direct oligo synthesis?

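  • As oligo length increases, full-length yield decays exponentially: at a coupling efficiency of 0.997 per base, a 2,000-nt product would have a full-length yield of only about $0.997^{2000} \approx 0.25\%$, making it effectively unobtainable by direct chemical synthesis. Instead, long genes are assembled enzymatically from many shorter oligos.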

Homework Questions from George Church

[Using Google & Prof. Church’s slide #4] What are the 10 essential amino acids in all animals, and how does this affect your view of the “Lysine Contingency”?

  • The ten essential amino acids commonly cited for animals are: Phenylalanine, Valine, Threonine, Tryptophan, Isoleucine, Methionine, Histidine, Arginine, Leucine, and Lysine.
  • Coming from a non-biology background, the concept of the lysine contingency was new to me. Learning that animals already lack the ability to synthesize lysine makes it look like a fragile containment strategy on its own: lysine dependence is a natural and widespread biological condition, normally met through diet, rather than a unique engineered constraint.

[Given slides #2 & #4 (AA:NA and NA:NA codes)] What code would you suggest for AA:AA interactions?

  • (To be completed)