Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices


  • Week 2 HW: DNA Read, Write, & Edit


  • Week 3 HW: AUTOMATION


Subsections of Homework

Week 1 HW: Principles and Practices


AI Cite

Prompts were directly based on the homework questions provided.


Homework Questions from Professor Jacobson

1. What is the error rate of polymerase? How does this compare to the human genome length, and how does biology address the discrepancy?

Polymerase error rate:
~10⁻⁵ per base without proofreading; ~10⁻⁷–10⁻⁸ with proofreading; ~10⁻⁹–10⁻¹⁰ with mismatch repair.

Human genome size:
~3 × 10⁹ base pairs.

Biological solutions:
Proofreading, mismatch repair, diploidy, and natural selection.
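To make the comparison concrete, the expected number of errors per genome copy is simply the per-base error rate times the genome length. The sketch below uses one representative value from each range quoted above (the specific values chosen are an assumption for illustration).

```python
# Expected errors per genome copy = per-base error rate x genome length.
GENOME_SIZE_BP = 3e9  # human genome size quoted above

# One representative rate per fidelity stage, picked from the ranges above (assumed).
ERROR_RATES = {
    "polymerase only": 1e-5,
    "with proofreading": 1e-7,
    "with mismatch repair": 1e-9,
}


def expected_errors(rate_per_base: float, genome_size: float = GENOME_SIZE_BP) -> float:
    """Expected number of errors when copying `genome_size` bases once."""
    return rate_per_base * genome_size


for stage, rate in ERROR_RATES.items():
    print(f"{stage:>22}: ~{expected_errors(rate):,.0f} errors per genome copy")
```

With these values, an unassisted polymerase would leave tens of thousands of errors per copy, while the full repair stack brings this down to a handful.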


2. How many different DNA codes can encode an average human protein? Why don’t all of them work in practice?

Theoretical number of encodings:
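Due to codon degeneracy (about 3 synonymous codons per amino acid on average), a protein of N residues has on the order of 3ᴺ possible coding sequences. For a protein of a few hundred amino acids this is astronomically large (roughly 10¹⁹⁰ for N ≈ 400).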

Practical constraints:

  • Codon bias and tRNA availability
  • mRNA secondary structure
  • GC content and sequence stability
  • Regulatory motifs (e.g., splicing, translation signals)
  • Error accumulation during synthesis and replication
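The theoretical count can be computed directly as the product of the number of codons available for each residue. The sketch below uses the standard genetic code; the example peptide is hypothetical and chosen only to keep the numbers small.

```python
import math

# Number of codons per amino acid in the standard genetic code (stop codons excluded).
DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(protein: str) -> int:
    """Exact number of DNA sequences that encode `protein`."""
    return math.prod(DEGENERACY[aa] for aa in protein.upper())


def log10_encodings(protein: str) -> float:
    """Base-10 logarithm of the count, convenient for long proteins."""
    return sum(math.log10(DEGENERACY[aa]) for aa in protein.upper())


peptide = "MKLV"  # hypothetical 4-residue example
print(count_encodings(peptide), "encodings for", peptide)
```

For a full-length protein, `log10_encodings` is the more readable output, since the exact integer has hundreds of digits.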

Homework Questions from Dr. LeProust

3. What is the most commonly used method for oligo synthesis today?

Phosphoramidite solid-phase synthesis.


4. Why is it difficult to synthesize oligos longer than ~200 nt directly?

  • Each coupling step is less than 100% efficient
  • Errors (truncations, deletions, side products) accumulate with every added base
  • The yield and purity of full-length product drop exponentially with length

5. Why can’t a 2000 bp gene be made by direct oligo synthesis?

  • Error rates become prohibitive
  • Full-length product yield approaches zero
  • Long genes must be assembled from shorter oligos (e.g., Gibson assembly, PCA)
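The exponential drop behind questions 4 and 5 can be sketched with a one-line model: if every coupling succeeds with the same probability, the fraction of full-length strands is that probability raised to the number of couplings. The 99% efficiency used here is an assumed illustrative value, not a figure from the homework.

```python
def full_length_yield(length_nt: int, coupling_efficiency: float = 0.99) -> float:
    """Fraction of strands reaching full length after (length_nt - 1) coupling steps."""
    return coupling_efficiency ** (length_nt - 1)


for n in (20, 200, 2000):
    print(f"{n:>5} nt: full-length fraction = {full_length_yield(n):.3g}")
```

Under this assumption roughly 13% of strands are full length at 200 nt, and the fraction at 2000 nt is on the order of 10⁻⁹, which is why long genes are assembled from short oligos instead.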

Homework Question from George Church

What are the 10 essential amino acids in all animals?

The ten essential amino acids that animals cannot synthesize de novo and must obtain from diet are:

Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Threonine, Tryptophan, Valine, and Arginine
(Arginine is conditionally essential in adults but universally essential during growth.)


How does this affect the “Lysine Contingency”?

Lysine’s essentiality reflects a deep evolutionary constraint: animals universally lost lysine biosynthesis pathways, making them metabolically dependent on external sources. This supports the “lysine contingency” as a system-level lock-in rather than an arbitrary biochemical choice. Once lysine synthesis was abandoned, translational machinery, diet, and ecological dependencies co-evolved around its availability, making reversal highly unlikely. Thus, lysine exemplifies how early metabolic decisions constrain future evolutionary trajectories.

Week 2 HW: DNA Read, Write, & Edit


3.1 mCherry

I chose mCherry with observation under UV illumination in mind: red fluorescence seems easier to detect than blue or green light, and it has a stronger visual impact.

Protein Sequence

MVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSHNEDYTIVEQYERAEGRHSTGGMDELYK

Source: https://www.fpbase.org/protein/mcherry/


3.2 Original DNA Sequence

atggtgagcaaaggcgaagaagataacatggcgattattaaagaatttatgcgctttaaa
gtgcatatggaaggcagcgtgaacggccatgaatttgaaattgaaggcgaaggcgaaggc
cgcccgtatgaaggcacccagaccgcgaaactgaaagtgaccaaaggcggcccgctgccg
tttgcgtgggatattctgagcccgcagtttatgtatggcagcaaagcgtatgtgaaacat
ccggcggatattccggattatctgaaactgagctttccggaaggctttaaatgggaacgc
gtgatgaactttgaagatggcggcgtggtgaccgtgacccaggatagcagcctgcaggat
ggcgaatttatttataaagtgaaactgcgcggcaccaactttccgagcgatggcccggtg
atgcagaaaaaaaccatgggctgggaagcgagcagcgaacgcatgtatccggaagatggc
gcgctgaaaggcgaaattaaacagcgcctgaaactgaaagatggcggccattatgatgcg
gaagtgaaaaccacctataaagcgaaaaaaccggtgcagctgccgggcgcgtataacgtg
aacattaaactggatattaccagccataacgaagattataccattgtggaacagtatgaa
cgcgcggaaggccgccatagcaccggcggcatggatgaactgtataaa

3.3 Codon Optimization

Why Optimize?

The same amino acid can be encoded by multiple codons, but organisms differ in their codon usage preferences (reflecting tRNA abundance, translation efficiency, mRNA structure, etc.). Codon optimization rewrites the coding sequence with the host's preferred codons so that the host can express the protein more efficiently.

Organism Selected

E. coli K-12.

It is one of the most commonly used host strains. The technical maturity and widespread adoption of this system make it highly suitable for experimental work.
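As a minimal sketch of the idea, the simplest form of codon optimization maps each amino acid to a single preferred codon. The table below is the one-codon-per-residue choice that appears in the 3.2 sequence; cysteine does not occur in mCherry, so its codon (TGC) is an assumption. Real optimization tools also weigh codon frequencies, GC content, and unwanted motifs, which this sketch ignores.

```python
# One codon per amino acid, as used in the 3.2 sequence (cysteine assumed to be TGC).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGC", "N": "AAC", "D": "GAT", "C": "TGC", "Q": "CAG", "E": "GAA",
    "G": "GGC", "H": "CAT", "I": "ATT", "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT",
    "P": "CCG", "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein: str, add_stop: bool = True) -> str:
    """Encode `protein` using one fixed codon per residue, optionally adding a TAA stop."""
    cds = "".join(PREFERRED_CODON[aa] for aa in protein.upper())
    return cds + ("TAA" if add_stop else "")


def gc_content(dna: str) -> float:
    """Fraction of G and C bases in `dna`."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)


first_20 = "MVSKGEEDNMAIIKEFMRFK"  # first 20 residues of mCherry from 3.1
cds = back_translate(first_20, add_stop=False)
print(cds, f"GC = {gc_content(cds):.2f}")
```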


Optimized DNA Sequence

ATGGTGAGCAAAGGCGAAGAAGATAACATGGCGATTATTAAAGAATTTATGCGTTTTAAA
GTGCATATGGAAGGCAGCGTGAACGGCCATGAATTTGAAATTGAAGGCGAAGGCGAAGGC
CGTCCGTATGAAGGCACCCAGACCGCGAAACTGAAAGTGACCAAAGGCGGCCCGCTGCCG
TTTGCGTGGGATATTCTGAGCCCGCAGTTTATGTATGGCAAAGCGTATGTGAAACATCCG
GCGGATATTCCGGATTATCTGAAACTGAGCTTTCCGGAAGGCTTTAAATGGGAACGCGTG
ATGAACTTTGAAGATGGCGGCGTGGTGACCGTGACCCAGGATAGCAGCCTGCAGGATGGC
GAATTTATTTATAAAGTGAAACTGCGCGGCACCAACTTTCCGAGCGATGGCCCGGTGATG
CAGAAAAAAACCATGGGCTGGGAAGCGAGCAGCGAAACGCGTGTATCCGGAAGATGGCGCG
CTGAAAGGCGAAATTAAACAGCGCCTGAAACTGAAAGATGGCGGCCATTATGATGCGGAAG
TGAAAACCCACCTATAAAGCGAAAAAACCGGTGCAGCTGCCGGGCGCGTATAACGTGAACA
TTAAACTGGATATTACCAGCCATAACGAAGATTATACCATTGTGGAACAGTATGAACGTGC
GGAAGGCCGTCATAGCACCGGCGGCATGGATGAACTGTATAAATAA

Source: https://www.kazusa.or.jp/codon/cgi-bin/showcodon.cgi?aa=1&species=83333&style=N&utm


3.4 Cell-Dependent Expression

Goal

To have E. coli transcribe this DNA into mRNA and translate it into a fluorescent protein.

Method

Insert the optimized CDS into an expression plasmid including:

  • Promoter
  • RBS
  • Terminator

Transform the plasmid into E. coli.

In E. coli:

  • RNA polymerase recognizes the promoter
  • DNA is locally unwound
  • Complementary mRNA is synthesized

The ribosome binds via the Shine–Dalgarno sequence and begins translation at the start codon.

  • One codon (3 nucleotides) is read at a time
  • tRNAs deliver corresponding amino acids
  • Amino acids are linked into a polypeptide chain
  • Final product: fluorescent protein
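The transcription and ribosome-binding steps above can be sketched as a small search: transcribe the coding strand, find the Shine–Dalgarno motif, take the first AUG shortly after it, and read codons until a stop. The consensus motif AGGAGG, the 12-nt search window, and the toy cassette are assumptions for illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def transcribe(coding_strand: str) -> str:
    """The mRNA matches the coding strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def find_orf(mrna: str, sd: str = "AGGAGG", max_spacer: int = 12):
    """Return (start, end) of the ORF starting at the first AUG within
    `max_spacer` nt after the Shine-Dalgarno motif, or None if not found."""
    sd_pos = mrna.find(sd)
    if sd_pos < 0:
        return None
    search_from = sd_pos + len(sd)
    start = mrna.find("AUG", search_from, search_from + max_spacer + 3)
    if start < 0:
        return None
    # Read one codon (3 nt) at a time until the first in-frame stop codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return start, i + 3
    return None


# Hypothetical toy cassette: upstream bases, SD motif, spacer, short ORF, trailing bases.
cassette = "GGATCC" + "AGGAGG" + "ACAGCT" + "ATGGTGAGCAAATAA" + "GCTT"
mrna = transcribe(cassette)
orf = find_orf(mrna)
print(orf, mrna[orf[0]:orf[1]])
```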

4.1

/


4.2

https://benchling.com/s/seq-7VHbcY2Zp8vixGlBz1td?m=slm-7HxrEb6QjOZ58RtT8p96

(There is an unknown error in the codon-optimized version.)
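One way to investigate an error like this is to translate the optimized CDS back into protein and compare it with the mCherry sequence from 3.1, reporting the first residue where they diverge. The sketch below is a generic check under the standard genetic code; the function names are my own, and the short sequences in the example are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop line breaks and spaces
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def first_mismatch(dna: str, expected_protein: str):
    """0-based index of the first differing residue, or None if they match."""
    observed = translate(dna)
    for i, (a, b) in enumerate(zip(observed, expected_protein)):
        if a != b:
            return i
    if len(observed) != len(expected_protein):
        return min(len(observed), len(expected_protein))
    return None


print(first_mismatch("ATGGTGAGCAAATAA", "MVSK"))  # matching toy example
print(first_mismatch("ATGGTGAAATAA", "MVSK"))     # one codon missing
```

A dropped or extra base shifts the reading frame, so everything after the reported index will usually look wrong as well; checking that the sequence length is a multiple of 3 is a cheap first test.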


5.1 DNA Read

5.1(i) What DNA Would I Sequence?

A synthetic DNA library used for digital DNA data storage.

Artificially designed DNA fragments encoding digital information (text or images).

Why?

DNA here serves as an information storage medium rather than biological genetic material.

Sequencing verifies:

  • Whether the written digital information is preserved
  • Whether errors occurred during storage or amplification
  • The error rate (substitutions, insertions, deletions)
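The error-rate check could be sketched as an edit-distance comparison between the sequence that was written and the sequence that was read back, counting each error type along one minimal edit script. This is a simplified illustration (real pipelines align many reads and use error-correcting codes); the function name and example strings are hypothetical.

```python
def edit_profile(written: str, read: str) -> dict:
    """Count substitutions, insertions and deletions in one minimal edit
    script that turns `written` into `read` (Levenshtein alignment)."""
    n, m = len(written), len(read)
    # dist[i][j] = edit distance between written[:i] and read[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if written[i - 1] == read[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j - 1] + cost,  # match or substitution
                             dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1)         # insertion
    # Walk back from the corner to recover which operations were used.
    counts = {"substitutions": 0, "insertions": 0, "deletions": 0}
    i, j = n, m
    while i > 0 or j > 0:
        mismatch = i > 0 and j > 0 and written[i - 1] != read[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + int(mismatch):
            counts["substitutions"] += int(mismatch)
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletions"] += 1  # base present in `written`, missing in `read`
            i -= 1
        else:
            counts["insertions"] += 1  # extra base in `read`
            j -= 1
    return counts


def error_rate(written: str, read: str) -> float:
    """Total errors per written base."""
    return sum(edit_profile(written, read).values()) / len(written)


print(edit_profile("ACGTACGT", "ACGAACT"), error_rate("ACGTACGT", "ACGAACT"))
```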

This is effectively a biotechnology-based data integrity check.


5.1(ii) Sequencing Technology

First-generation sequencing: Sanger sequencing.

Why?

  • High accuracy
  • Suitable for validating single fragments

Characteristics

  • Reads one DNA template at a time
  • Read length ~700–900 bp
  • Very high accuracy

Inputs

  • Template DNA
  • Primer
  • DNA polymerase
  • dNTPs
  • Fluorescently labeled ddNTPs

Core Principle

DNA synthesis is terminated at random positions using ddNTPs.

Process:

  1. Polymerase copies the template
  2. ddNTP incorporation stops elongation
  3. Fragments of different lengths are produced
  4. Fragments are separated by size
  5. The fluorescent label on each fragment is read to reconstruct the sequence
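The process above can be illustrated with an idealized simulation: generate one terminated fragment for every position of the newly synthesized strand, mix them, sort by length, and read the last base of each. This assumes termination happens at every position at least once and ignores the primer, signal noise, and read-length limits.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def sanger_read(template: str, seed: int = 0) -> str:
    """Idealized chain-termination read of the strand copied from `template`."""
    # Step 1: the polymerase synthesizes the reverse complement of the template.
    new_strand = template.upper().translate(COMPLEMENT)[::-1]
    # Steps 2-3: a ddNTP stops elongation, giving one fragment per possible length.
    fragments = [new_strand[:n] for n in range(1, len(new_strand) + 1)]
    random.Random(seed).shuffle(fragments)  # fragments are mixed together in the tube
    # Step 4: electrophoresis orders the fragments by size.
    fragments.sort(key=len)
    # Step 5: the label on each fragment's final base spells out the sequence.
    return "".join(fragment[-1] for fragment in fragments)


print(sanger_read("ATGC"))
```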

Output

  • DNA sequence
  • Chromatogram

Limitations

  • Cannot sequence thousands of fragments simultaneously

5.2 DNA Write

5.2(i) What DNA Would I Synthesize?

An expression cassette expressing mCherry in E. coli.

Includes:

  • Promoter
  • RBS
  • Codon-optimized mCherry CDS
  • His tag
  • Terminator

Reason:

  • Produces visible red fluorescence
  • Strong contrast under blue light

5.2(ii) Synthesis Technology

Solid-phase chemical DNA synthesis (phosphoramidite method) + gene assembly.

Steps

  1. Design sequence computationally
  2. Split into short oligos
  3. Chemically synthesize oligos
  4. Assemble via PCR or Gibson Assembly
  5. Clone into vector
  6. Sequence verify
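Step 2 can be sketched as tiling the designed sequence with overlapping oligos and checking that the overlaps reassemble the original. The 60-nt length and 20-nt overlap are assumed values, and for simplicity all oligos are taken from the top strand (real assembly designs typically alternate strands and tune overlaps by melting temperature).

```python
def split_into_oligos(seq: str, oligo_len: int = 60, overlap: int = 20) -> list:
    """Tile `seq` with oligos of up to `oligo_len` nt; neighbours share `overlap` nt."""
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        oligos.append(seq[start:start + oligo_len])
        if start + oligo_len >= len(seq):
            break  # this oligo reaches the end of the sequence
        start += step
    return oligos


def reassemble(oligos: list, overlap: int = 20) -> str:
    """Join oligos through their shared overlaps, verifying each junction."""
    seq = oligos[0]
    for oligo in oligos[1:]:
        if not seq.endswith(oligo[:overlap]):
            raise ValueError("adjacent oligos do not overlap as expected")
        seq += oligo[overlap:]
    return seq


# First two lines of the 3.2 sequence, used here as a 120-nt example.
design = ("atggtgagcaaaggcgaagaagataacatggcgattattaaagaatttatgcgctttaaa"
          "gtgcatatggaaggcagcgtgaacggccatgaatttgaaattgaaggcgaaggcgaaggc")
oligos = split_into_oligos(design)
print(len(oligos), "oligos;", "reassembly OK:", reassemble(oligos) == design)
```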

Limitations

  • Error rate increases with length
  • Assembly required
  • Sequencing verification required
  • Cost scales with length

5.3 DNA Edit

5.3(i) What DNA Would I Edit?

A single-base mutation (e.g., disease-causing point mutation).

Reason:

  • Represents the most precise editing scenario
  • Relevant for therapeutic research

5.3(ii) Editing Technology

CRISPR-Cas9 + HDR repair template.

Principle

  1. Design gRNA
  2. Cas9 creates double-strand break
  3. Provide donor DNA
  4. HDR replaces base precisely
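Step 1 can be sketched as a scan for candidate guides near the base to be edited. The sketch assumes SpCas9 (20-nt protospacer followed by an NGG PAM, cutting 3 bp upstream of the PAM) and only searches the forward strand; the window size and the example sequence are hypothetical.

```python
import re


def find_guides(seq: str, target_pos: int, window: int = 10) -> list:
    """Forward-strand SpCas9 guide candidates whose cut site lies within
    `window` bp of `target_pos` (0-based index into `seq`)."""
    seq = seq.upper()
    guides = []
    # A lookahead is used so that overlapping candidates are all reported.
    for match in re.finditer(r"(?=([ACGT]{20})([ACGT]GG))", seq):
        start = match.start()
        cut_site = start + 17  # index of the first base after the cut
        if abs(cut_site - target_pos) <= window:
            guides.append({
                "protospacer": match.group(1),
                "pam": match.group(2),
                "cut_site": cut_site,
            })
    return guides


example = "GATTACAGATTACAGATTAC" + "AGG" + "TTTT"  # hypothetical target region
for guide in find_guides(example, target_pos=15, window=5):
    print(guide)
```

A fuller version would also scan the reverse complement and score off-target risk, which matters for the limitations listed below.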

Required Inputs

  • gRNA
  • Cas9 protein or plasmid
  • Donor DNA
  • Target cells

Limitations

  • Low HDR efficiency
  • Possible off-target effects
  • Complex delivery
  • Cell-type dependent precision

Week 3 HW: AUTOMATION

1) Published paper

Villanueva-Cañas et al., PLOS ONE (2021) built a multi-station SARS-CoV-2 RT-qPCR testing workflow using Opentrons OT-2 robots. The core novelty is a reusable software + station architecture that makes a complex diagnostic pipeline programmable, modular, and reproducible across setups.


2) Final project automation plan

Project: “Living Ice Cream”

A temperature-responsive dessert system with:

  • Slow “breathing” surface behavior (controlled micro-gas generation)
  • Visual shift (color / glow) near melt-adjacent temperatures

Why Ginkgo automation

I’m using Ginkgo’s autonomous / cloud-lab framing as an iteration engine for high-throughput design of experiments (DOE): a stable automation backbone, fast experimental loops, and standardized readouts for repeated screening rounds.


What I will automate

A) “Breathing” kinetics screening (high-throughput DOE)

Goal: Find enzyme/substrate + formulation conditions that yield slow, non-violent micro-gas behavior at roughly 15–25°C.

DOE axes (example)

  • enzyme concentration
  • substrate concentration
  • buffer / pH
  • capsule matrix composition
  • temperature + time

Minimal pseudocode

for well, cond in enumerate(DOE_grid):  # one well per condition
    dispense(cond.reagents, well)
    incubate(temp=cond.temp, time=cond.time)
    readout = measure_optical_bubble_proxy_or_pressure(well)
    log(readout, cond)
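A runnable sketch of the same loop is shown below, with a full-factorial grid built from the DOE axes and the hardware steps replaced by a stub. All factor levels, field names, and the random placeholder readout are hypothetical; capsule matrix composition is left out to keep the grid small.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    enzyme_mg_ml: float
    substrate_mm: float
    ph: float
    temp_c: float
    time_min: int


# Hypothetical factor levels for each DOE axis.
LEVELS = {
    "enzyme_mg_ml": [0.1, 0.5],
    "substrate_mm": [1.0, 10.0],
    "ph": [5.5, 7.0],
    "temp_c": [15.0, 20.0, 25.0],
    "time_min": [30],
}


def build_grid(levels: dict) -> list:
    """Full-factorial grid: one Condition per combination of factor levels."""
    keys = list(levels)
    return [Condition(**dict(zip(keys, combo)))
            for combo in itertools.product(*levels.values())]


def measure_stub(cond: Condition, rng: random.Random) -> float:
    """Placeholder readout; a real run would query a camera or pressure sensor."""
    return rng.random()


def run_screen(grid: list, seed: int = 0) -> list:
    """Assign one well per condition and log a readout for each."""
    rng = random.Random(seed)
    results = []
    for well, cond in enumerate(grid):
        # The dispense and incubate calls to the liquid handler would go here.
        results.append({"well": well, "condition": cond,
                        "readout": measure_stub(cond, rng)})
    return results


grid = build_grid(LEVELS)
results = run_screen(grid)
print(len(grid), "conditions screened; first record:", results[0])
```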