Week 2 HW: DNA Read-Write-Edit - Pre Lecture Prep

Pre-Lecture Prep

Homework Questions from Professor Jacobson:
1 Nature's machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?

The error rate of an error-correcting (proofreading) polymerase is approximately 1 in 10^6 bases. The human genome is about 3 billion base pairs in length, so without further correction a single round of replication would introduce approximately 3000 errors per cell division.
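
The arithmetic behind that estimate can be checked with a short sketch. The variable names are illustrative, and both numbers are the rounded figures quoted above rather than measured values.

```python
# Rounded figures quoted in the answer above (assumptions, not measurements).
GENOME_LENGTH_BP = 3_000_000_000   # ~3 billion base pairs
BASES_PER_ERROR = 10**6            # ~1 error per 10^6 bases copied

# Expected number of errors in one full copy of the genome.
expected_errors = GENOME_LENGTH_BP // BASES_PER_ERROR
print(expected_errors)
```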

Biology bridges this gap with several layers of correction, including the 3’-5’ proofreading exonuclease activity of the polymerase during replication and mismatch repair after replication. Proteins such as MutH, MutL and MutS bind mismatched DNA and help correct it, thus lowering the overall error rate.

2 How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes do not work to code for the protein of interest?

The coding sequence for an average human protein is about 1036 base pairs long. Because the genetic code is redundant (most amino acids are encoded by more than one codon), a huge number of different DNA sequences encode the same protein. However, not all of these sequences are functional or easy to produce, because of physical and chemical constraints such as:
1 Sequences with extreme GC content (>90%).
2 DNA sequences capable of folding into minimum free energy secondary structures (MFE-SS) such as hairpins.
3 In practice, synthesized sequences contain deletions and mismatches, which often require error correction.
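
To get a feel for how large the number of synonymous encodings is, here is a minimal Python sketch. It assumes the standard genetic code, treats the 1036 bp figure as coding length (about 345 codons), and, for the order-of-magnitude estimate only, assumes all 20 amino acids are used equally often, which real proteins do not do. The function names are hypothetical helpers for this illustration.

```python
import math

# Number of codons for each amino acid in the standard genetic code
# (one-letter amino acid codes; together these cover the 61 sense codons).
CODON_DEGENERACY = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2,
    "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}

def count_encodings(protein: str) -> int:
    """Exact number of DNA sequences encoding `protein` (stop codon ignored)."""
    total = 1
    for aa in protein:
        total *= CODON_DEGENERACY[aa]
    return total

def log10_encodings_estimate(n_codons: int) -> float:
    """Rough log10 of the number of encodings, assuming uniform amino acid usage."""
    mean_log = sum(math.log10(d) for d in CODON_DEGENERACY.values()) / len(CODON_DEGENERACY)
    return n_codons * mean_log

def gc_content(dna: str) -> float:
    """Fraction of G and C bases in a non-empty DNA sequence."""
    return sum(base in "GC" for base in dna.upper()) / len(dna)

print(count_encodings("MLK"))                # 1 * 6 * 2 = 12
print(log10_encodings_estimate(1036 // 3))   # exponent of 10 for ~345 codons
print(gc_content("GGGCCCGGAT"))              # candidate sequences can be screened like this
```

A screen such as gc_content could then be used to discard candidate encodings that violate the first constraint in the list above.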


Homework Questions from Dr. LeProust:
1 What is the most commonly used method for oligo synthesis currently?

The most widely used method is the phosphoramidite synthesis cycle, which involves deprotection, base coupling, capping and oxidation. Currently, high-throughput providers such as Twist Bioscience use this chemistry on silicon chips to produce massive oligo pools containing millions of unique sequences.

2 Why is it difficult to make oligos longer than 200nt via direct synthesis?

As the number of base additions increases, the cumulative yield of full-length, error-free product decreases exponentially. For N coupling steps with a per-step error rate of 1/N, the fraction of perfect product is:
(1 - 1/N)^N ≈ 1/e ≈ 37%
As oligos approach 200 nucleotides, the percentage of “perfect” sequences in the final mixture drops substantially. In addition, greater length increases the likelihood of containing difficult motifs such as homopolymers or inverted repeats that hinder synthesis.
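
The exponential decay can be made concrete with a small sketch. The per-step coupling efficiencies below (99%, 99.5%, 99.9%) are assumed values chosen for illustration, not figures from the lecture, and full_length_yield is a hypothetical helper name.

```python
import math

def full_length_yield(n_couplings: int, coupling_efficiency: float) -> float:
    """Fraction of chains that pass every coupling step without a failure."""
    return coupling_efficiency ** n_couplings

# With a per-step error rate of 1/N over N steps, the yield approaches 1/e.
for n in (50, 200, 1000):
    print(n, round((1 - 1 / n) ** n, 4), round(1 / math.e, 4))

# Yield of a 200-mer at several assumed per-step efficiencies.
for eff in (0.99, 0.995, 0.999):
    print(eff, full_length_yield(200, eff))
```

Small differences in per-step efficiency compound over 200 steps, which is why the fraction of perfect product is so sensitive to the chemistry.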

3 Why cannot you make a 2000bp gene via direct oligo synthesis?

The chemical efficiency of the coupling steps, combined with the factors discussed in the previous question, limits oligo length to approximately 170 nucleotides, although specialized platforms can reach up to 300-350 nucleotides.

To create a 2000 bp gene, we would need to synthesize multiple shorter fragments and join them with an enzymatic assembly method such as Gibson assembly, because a single, continuous chain of 2000 couplings would have a yield near zero.
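
As a rough illustration of both points, the sketch below computes the yield of 2000 consecutive couplings and then splits a 2000 bp sequence into overlapping fragments that an assembly reaction could join. The 99% coupling efficiency, the 300 nt fragment length, the 30 bp overlap and the tile_fragments helper are all assumptions made for this example; a real design would also tune the overlap sequences themselves.

```python
def tile_fragments(seq: str, max_len: int = 300, overlap: int = 30) -> list[str]:
    """Split seq into fragments of at most max_len that share `overlap` bases."""
    if not 0 < overlap < max_len:
        raise ValueError("overlap must be positive and smaller than max_len")
    if len(seq) <= max_len:
        return [seq]
    step = max_len - overlap
    fragments = []
    start = 0
    while True:
        end = start + max_len
        fragments.append(seq[start:end])
        if end >= len(seq):
            break
        start += step
    return fragments

# Yield of one continuous 2000-step synthesis at an assumed 99% per step.
direct_yield = 0.99 ** 2000
print(direct_yield)

# Placeholder 2000 bp sequence, used only to exercise the tiling.
gene = "ACGT" * 500
fragments = tile_fragments(gene)
print(len(fragments), [len(f) for f in fragments])

# Dropping each later fragment's leading overlap recovers the original sequence.
rebuilt = fragments[0] + "".join(f[30:] for f in fragments[1:])
print(rebuilt == gene)
```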


Homework Question from George Church:
Choose ONE of the following three questions to answer and please cite AI prompts or paper citations used if any.
1 What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?

The 10 essential amino acids are:
Arginine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Threonine, Tryptophan and Valine.

The Lysine Contingency:
This concept involves making an organism dependent on an external supply of a specific nutrient in order to survive. Because lysine is already one of the essential amino acids, animals must obtain it from their diet anyway, so a lysine dependency on its own is a weak containment barrier. In synthetic biology, if we instead engineer organisms, such as Genomically Recoded Organisms, to require non-standard amino acids that do not exist in nature, we obtain a stronger form of biocontainment than a simple lysine deficiency.

This also provides a biosecurity advantage: even if the organism escapes into the environment, it would be unable to survive, because neither the amino acid nor the building blocks to synthesize it are available there.