Homework

Weekly homework submissions:

  • Week 1 Homework: Principles and Practices

    Exchangeable Fabs. As an undergraduate student, I worked on a project that required the use of multiple antibodies to bind specific epitopes and stain them. However, antibodies are not cheap; they take a long time to reach Latin American laboratories, and they often undergo repeated freeze-thaw cycles, which reduce their efficacy. In addition, commercial antibodies are often not fully characterized: interactions with other proteins are not screened beforehand, which leads to cross-reactivity and ultimately to less reliable results [1].

  • Week 2 Homework Section 1: DNA Read, Write, and Edit

    Homework Questions from Professor Jacobson 1. Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy? The intrinsic error rate of DNA polymerase is approximately 1 error per 10⁶ nucleotides incorporated. Given that the human genome is approximately 3.2 × 10⁹ base pairs, this would imply on the order of ~3,200 errors per genome replication if no corrective mechanisms were present.

  • Week 2 Homework Section 2: DNA Read, Write and Edit

    Part 1: Benchling and in silico gel art Using Benchling, and after some time playing around with Ronan’s website for iteration, I decided to create the expression “OH”. Part 3: DNA Design Challenge 3.1. Choose your protein I decided to work with the E Protein of the Zika virus Brazil strain.

  • Week 3 Homework: Lab Automation

    Part 1: Python Script for Opentrons Artwork Using GUI opentrons-art.rcdonovan.com, I created two designs, one using a whole range of colors and the other using just red and green (https://opentrons-art.rcdonovan.com/?id=zk5r154b9r6q995). I submitted my design but I still have some inquiries about the python file.

Subsections of Homework

Week 1 Homework: Principles and Practices

Protein structure

Exchangeable Fabs

As an undergraduate student, I worked on a project that required the use of multiple antibodies to bind specific epitopes and stain them. However, antibodies are not cheap; they take a long time to reach Latin American laboratories, and they often undergo repeated freeze-thaw cycles, which reduce their efficacy. In addition, commercial antibodies are often not fully characterized: interactions with other proteins are not screened beforehand, which leads to cross-reactivity and ultimately to less reliable results [1].

Protein structure

Taking inspiration from research done at the UW Institute for Protein Design [2], I decided to develop a synthetic biology proposal for engineering “exchangeable Fabs”. Antibodies have two parts: the Fab (fragment antigen-binding) region at the arms, which recognizes the antigen, and the Fc (fragment crystallizable) region at the stem, which mediates species-specific interactions with the immune system [3]. The idea is to create a protocol for top-down design of the Fab region: design in silico a Fab that binds a chosen epitope, then build a plasmid to produce it in-house. I call it “exchangeable Fabs” because, in the long run, a researcher could select an epitope and obtain a plasmid coding for the matching Fab region, to be assembled in their own laboratory with a previously acquired plasmid for the Fc region.

Protein structure

Governance Goals and Proposed Actions

Considering what I learned in class and drawing on Baker and Church’s Protein Design Meets Biosecurity, I define a high-level governance goal and propose three complementary governance actions.

Protein structure

High-Level Goal

Ensure that computationally designed, exchangeable Fab fragments are developed and deployed in ways that maximize scientific reliability, equity, and safety while minimizing risks of misuse, experimental harm, and environmental burden.

This overarching goal can be broken down into the following sub-goals:

Non-maleficence and Biosecurity

  • Prevent the misuse of antibody design pipelines to generate harmful or deceptive biological tools.
  • Reduce the risk of poorly characterized binding agents entering experimental or clinical workflows.

Laboratory Safety and Scientific Integrity

  • Ensure that designed Fab fragments are traceable, validated, and reproducible.
  • Reduce experimental artifacts caused by cross-reactivity or undocumented binding behavior.

Equitable Access and Global Scientific Inclusion

  • Lower structural barriers for under-resourced laboratories (e.g., in Latin America) to access reliable affinity reagents.
  • Avoid creating new dependencies on proprietary platforms or centralized institutions.

Environmental Sustainability

  • Reduce waste associated with disposable commercial antibodies and repeated cold-chain shipping.
  • Minimize redundant production of poorly validated reagents.

Governance Actions

I propose three complementary governance actions, pursued by different actors and using distinct mechanisms.

Option 1: Standardized Fab Design and Validation Protocols (Academic Setting)

Purpose

Currently, antibodies, especially commercial ones, are often under-characterized, with limited disclosure of off-target interactions. I propose the development of community-wide standards for the computational design, experimental validation, and reporting of exchangeable Fab fragments.

Design

Actors: Academic researchers, journals, funding agencies.

Requirements:

  • Minimum in silico validation benchmarks (e.g., binding specificity screens).
  • Mandatory disclosure of design parameters and validation assays.
  • Journal and funder requirements for compliance as a condition of publication or funding.

Assumptions

  • Computational predictions meaningfully correlate with in vitro behavior.
  • Academic communities are willing to converge on shared standards.

Risks of Failure and “Success”

  • Failure risk: Standards may be unevenly adopted or reduced to box-checking exercises.
  • Success risk: Over-standardization could slow exploratory or unconventional research.

Option 2: Modular Licensing of Fab–Fc Systems (Incentive-Based Governance)

Purpose

To prevent misuse while promoting access, I propose a modular licensing framework in which validated Fc backbones are shared under open or semi-open licenses, while Fab designs are distributed with usage constraints.

Design

Actors: Universities, nonprofit consortia, biotechnology companies.

Mechanisms:

  • Fc regions distributed as standardized, safety-validated plasmids.
  • Fab designs tagged with metadata describing intended use, validation level, and constraints.
  • Governance structures analogous to open-source software licenses with biological usage clauses.

Assumptions

  • Laboratories will comply with licensing norms in the absence of strong enforcement mechanisms.
  • Existing plasmid-sharing infrastructures can support metadata-rich distribution.

Risks of Failure and “Success”

  • Failure risk: Informal sharing networks may bypass governance mechanisms.
  • Success risk: Licensing complexity may discourage adoption by smaller or under-resourced laboratories.

Option 3: Technical Safeguards Embedded in Design Pipelines

Purpose

Rather than relying solely on rules or norms, this option embeds technical constraints directly into antibody design tools to reduce the likelihood of harm.

Design

Actors: Tool developers, computational biologists, funding agencies.

Features:

  • Automated screening against databases of known toxin-related or pathogenic epitopes.
  • Flagging or blocking of designs with high-risk similarity profiles.
  • Audit logs for design requests, analogous to controls used in DNA synthesis.

Assumptions

  • Reliable and up-to-date databases of hazardous epitopes exist.
  • Researchers will tolerate a degree of friction in design workflows in exchange for increased safety.

Risks of Failure and “Success”

  • Failure risk: False positives may block benign or exploratory research.
  • Success risk: Users may migrate to unregulated or less transparent tools.

Scoring of Governance Options Against Policy Goals

Policy Goal / Criterion                             Option 1   Option 2   Option 3
Enhance Biosecurity                                 2          2          1
  • By preventing incidents                         2          2          1
  • By helping respond                              2          3          2
Foster Laboratory Safety                            1          2          2
  • By preventing incidents                         1          2          2
  • By helping respond                              2          3          2
Protect the Environment                             2          1          2
  • By preventing incidents                         2          1          2
  • By helping respond                              n/a        n/a        n/a
Other Considerations
  • Minimizing costs and burdens to stakeholders    2          3          2
  • Feasibility                                     1          2          2
  • Not impede research                             1          2          3
  • Promote constructive applications               1          1          2

Scoring scale: 1 = best performance; 3 = weakest performance; n/a = not applicable.

References

[1] Baker, M. Reproducibility crisis: Blame it on the antibodies. Nature (2019).
https://www.nature.com/articles/d42473-019-00273-7

[2] Institute for Protein Design. Designing antibodies with RFdiffusion (2025).
https://www.bakerlab.org/2025/02/28/designing-antibodies-with-rfdiffusion/

[3] Janeway, C. Immunobiology. Garland Science.
https://www.ncbi.nlm.nih.gov/books/NBK27144/

AI assistance was used to organize, clarify, and refine explanations

Week 2 Homework Section 1: DNA Read, Write, and Edit

Homework Questions from Professor Jacobson

1. Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?

The intrinsic error rate of DNA polymerase is approximately 1 error per 10⁶ nucleotides incorporated. Given that the human genome is approximately 3.2 × 10⁹ base pairs, this would imply on the order of ~3,200 errors per genome replication if no corrective mechanisms were present.

Biological systems mitigate this discrepancy through multiple layers of error correction. First, many DNA polymerases possess 3′→5′ exonuclease proofreading activity, which reduces the error rate to approximately 1 in 10⁷–10⁸ nucleotides. Second, post-replicative mismatch repair (MMR) pathways further correct misincorporated bases, ultimately reducing the effective mutation rate to approximately 1 error per 10⁹–10¹⁰ nucleotides. Together, these mechanisms ensure genome stability despite the large size of the human genome.
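The arithmetic behind these figures can be checked with a short calculation. This is only a sketch: the three error rates are taken from the less accurate end of each range quoted above, and the helper name `expected_errors` is an illustrative choice, not part of any library.

```python
GENOME_BP = 3.2e9  # approximate human genome size quoted above

# Errors per nucleotide, using the less accurate end of each quoted range
error_rates = {
    "polymerase alone": 1e-6,
    "with proofreading": 1e-7,
    "with mismatch repair": 1e-9,
}

def expected_errors(rate_per_nt, genome_bp=GENOME_BP):
    """Expected number of errors in one pass over the genome."""
    return rate_per_nt * genome_bp

for label, rate in error_rates.items():
    print(f"{label}: ~{expected_errors(rate):,.1f} errors per replication")
```

Under these assumptions the expected count drops from thousands of errors per replication to a handful once both correction layers act.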

2. How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice, why do all of these different codes not work equally well?

An average human protein is approximately 450 amino acids long. Due to the degeneracy of the genetic code, where most amino acids are encoded by multiple codons, the number of possible DNA sequences that could encode such a protein is astronomically large. With roughly three synonymous codons per amino acid on average, the count is on the order of 3⁴⁵⁰, which is about 10²¹⁴ distinct nucleotide sequences.

In practice, however, only a small subset of these sequences function efficiently. This is due to several biological constraints:

  • Codon usage bias: Different organisms preferentially use certain synonymous codons, reflecting tRNA abundance and translational efficiency.
  • mRNA stability and structure: Certain nucleotide sequences form secondary structures that reduce transcript stability or impair ribosome binding and elongation.
  • Translational accuracy and speed: Rare codons can cause ribosome stalling, leading to misfolding or premature termination.
  • Protein folding requirements: Translation kinetics influence co-translational folding; inappropriate codon choices can result in misfolded or nonfunctional proteins.

Thus, although many nucleotide sequences are theoretically valid, only a limited fraction produce functional protein in vivo.
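The size of this sequence space can be estimated directly from the degeneracy of the standard genetic code. The sketch below assumes the 450-residue length quoted above and an average of about three codons per residue for the rough estimate; `log10_encodings` is a hypothetical helper that gives the exact count for any specific protein sequence.

```python
import math

# Number of codons per amino acid in the standard genetic code (61 sense codons)
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

def log10_encodings(protein):
    """log10 of the number of distinct DNA sequences that encode `protein`."""
    return sum(math.log10(CODON_COUNTS[aa]) for aa in protein.upper())

# Rough estimate: 450 residues with ~3 synonymous codons each
rough = 450 * math.log10(3)
print(f"about 10^{rough:.0f} possible coding sequences")
```

Working in log space avoids handling the very large integer directly, and the same helper can be applied to a real sequence to replace the rough average with an exact figure.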

Homework Questions from Dr. LeProust

3. What is the most commonly used method for oligonucleotide synthesis currently?

The most commonly used method for oligonucleotide synthesis is the solid-phase phosphoramidite method, an automated chemical process that sequentially adds nucleotides in the 3′ to 5′ direction.

4. Why is it difficult to make oligonucleotides longer than ~200 nucleotides via direct synthesis?

Direct oligonucleotide synthesis is limited to approximately 200 nucleotides due to cumulative coupling inefficiencies. Even with stepwise coupling efficiencies greater than 99%, the fraction of chains that are still full length falls exponentially as oligo length increases (at 99% per step, only about 13% of 200-mers are full length), resulting in:

  • Decreased full-length product yield
  • Increased truncation products
  • Higher overall error rates

As a result, longer oligos become impractical to synthesize reliably via single-step chemical synthesis.
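The effect of cumulative coupling losses can be made concrete with a small calculation. This is a sketch under stated assumptions: the per-step efficiencies of 99% and 99.5% are illustrative values, and the model ignores depurination and other side reactions.

```python
def full_length_yield(length_nt, coupling_efficiency):
    """Fraction of chains that are full length after synthesis.

    An oligo of N nucleotides needs N - 1 coupling steps onto the first,
    support-bound nucleotide, each succeeding with the given efficiency.
    """
    return coupling_efficiency ** (length_nt - 1)

for length in (20, 100, 200, 2000):
    for eff in (0.99, 0.995):
        y = full_length_yield(length, eff)
        print(f"{length:>5} nt at {eff:.1%} per step: {y:.2e} full length")
```

At these efficiencies a 200-mer still gives a usable fraction of full-length product, while a 2000-mer gives essentially none, which is why gene-length constructs are assembled from shorter pieces.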

5. Why can’t you make a 2000 bp gene via direct oligonucleotide synthesis?

A 2000 bp gene cannot be synthesized directly because chemical synthesis error rates and truncation frequencies render long products unusable. Instead, long genes are constructed by assembling multiple shorter oligonucleotides using enzymatic methods such as:

  • PCR-based assembly
  • Gibson Assembly
  • Golden Gate cloning

These methods leverage biological fidelity mechanisms rather than purely chemical synthesis.

Homework Question from George Church

Chosen question: Using Google & Prof. Church’s slide #4 — What are the 10 essential amino acids in all animals, and how does this affect your view of the “Lysine Contingency”?

The ten essential amino acids in animals are: Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Threonine, Tryptophan, Valine, Arginine.

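The “Lysine Contingency” (the Jurassic Park plot device in which engineered animals are made dependent on supplied lysine) draws attention to the fact that lysine is already an essential amino acid: animals cannot synthesize it de novo and must obtain it from their diet or environment. This weakens the contingency as a containment strategy, since any animal already meets its lysine requirement through food. The dependency still represents a systemic vulnerability: any disruption to lysine availability, whether ecological, agricultural, or geopolitical, can have cascading effects on animal health and food security.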

From a synthetic biology perspective, the lysine contingency underscores how metabolic dependencies constrain evolutionary and technological design space. It also suggests that engineering alternative nutritional pathways or lysine-efficient systems could have outsized impacts on resilience, sustainability, and global food systems.

AI assistance was used to organize, clarify, and refine explanations; no external factual claims were introduced beyond standard molecular biology knowledge.

Week 2 Homework Section 2: DNA Read, Write and Edit

Part 1: Benchling and in silico gel art

Using Benchling, and after some time iterating on Ronan’s website, I decided to create the expression “OH”.

gel

Part 3: DNA Design Challenge

3.1. Choose your protein

I decided to work with the E Protein of the Zika virus Brazil strain.

>tr|A0A060H177|A0A060H177_ZIKV E protein (Fragment) OS=Zika virus OX=64320 GN=E PE=4 SV=1 IRCIGVSNRDFVEGMSGGTWVDVVLEHGGCVTVMAQDKPTVDIELVTTTVSNMAEVRSYCYEASISDMASDSRCPTQGEAYLDKQSDTQYVCKRTLVDRGWGNGCGLFGKGSLVTCAKFACSKKMTGKSIQPENLEYRIMLSVHGSQHSGMIVNDTGHETDENRAKVEITPNSPRAEATLGGFGSLGLDCEPRTGLDFSDLYYLTMNNKHWLVHKEWFHDIPLPWHAGADTGTPHWNNKEALVEFKDAHAKRQTVVVLGSQEGAVHTALAGALEAEMDGAKGRLSSGHLKCRLKMDKLRLKGVSYSLCTAAFTFTKIPAETLHGTVTVEVQYAGTDGPCKVPAQMAVDMQTLTPVGRLITANPVITESTENSKMMLELDPPFGDSYIVIGVGEKKITHHWHRSGSTIGKAFEATVRGAKRMAVLGDTAWDFGSVGGALNSLGKGIHQIFGAAFKSLFGGMSWFSQILIGTLLMWLGLNTKNGSISLMCLALGGVLIFLSTAVSA

3.2 Reverse Translate into DNA

First, I had to check for the correct codon table for Zika virus.


Then, using an online tool, I reverse translated the amino acid sequence into DNA.
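Reverse translation with "most likely codons" amounts to a lookup of one preferred codon per amino acid. The sketch below is not the online tool itself: the codon table was read off the tool's output sequence shown in this section, and the function name `reverse_translate` is an illustrative choice.

```python
# One preferred codon per amino acid, as observed in the reverse-translated
# sequence in this section (an inference from that output, not the tool's code)
PREFERRED_CODON = {
    "A": "gcc", "C": "tgt", "D": "gac", "E": "gag", "F": "ttc",
    "G": "gga", "H": "cac", "I": "atc", "K": "aag", "L": "ctg",
    "M": "atg", "N": "aac", "P": "cca", "Q": "cag", "R": "aga",
    "S": "tca", "T": "aca", "V": "gtg", "W": "tgg", "Y": "tac",
}

def reverse_translate(protein):
    """Return a DNA sequence using one fixed codon per amino acid."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())

# First 17 residues of the E protein fragment
print(reverse_translate("IRCIGVSNRDFVEGMSG"))
```

Because every residue maps to exactly three bases, the 504-residue fragment gives the 1512-base sequence reported by the tool.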

>reverse translation of tr|A0A060H177|A0A060H177_ZIKV E protein (Fragment) OS=Zika virus OX=64320 GN=E PE=4 SV=1 to a 1512 base sequence of most likely codons. atcagatgtatcggagtgtcaaacagagacttcgtggagggaatgtcaggaggaacatgggtggacgtggtgctggagcacggaggatgtgtgacagtgatggcccaggacaagccaacagtggacatcgagctggtgacaacaacagtgtcaaacatggccgaggtgagatcatactgttacgaggcctcaatctcagacatggcctcagactcaagatgtccaacacagggagaggcctacctggacaagcagtcagacacacagtacgtgtgtaagagaacactggtggacagaggatggggaaacggatgtggactgttcggaaagggatcactggtgacatgtgccaagttcgcctgttcaaagaagatgacaggaaagtcaatccagccagagaacctggagtacagaatcatgctgtcagtgcacggatcacagcactcaggaatgatcgtgaacgacacaggacacgagacagacgagaacagagccaaggtggagatcacaccaaactcaccaagagccgaggccacactgggaggattcggatcactgggactggactgtgagccaagaacaggactggacttctcagacctgtactacctgacaatgaacaacaagcactggctggtgcacaaggagtggttccacgacatcccactgccatggcacgccggagccgacacaggaacaccacactggaacaacaaggaggccctggtggagttcaaggacgcccacgccaagagacagacagtggtggtgctgggatcacaggagggagccgtgcacacagccctggccggagccctggaggccgagatggacggagccaagggaagactgtcatcaggacacctgaagtgtagactgaagatggacaagctgagactgaagggagtgtcatactcactgtgtacagccgccttcacattcacaaagatcccagccgagacactgcacggaacagtgacagtggaggtgcagtacgccggaacagacggaccatgtaaggtgccagcccagatggccgtggacatgcagacactgacaccagtgggaagactgatcacagccaacccagtgatcacagagtcaacagagaactcaaagatgatgctggagctggacccaccattcggagactcatacatcgtgatcggagtgggagagaagaagatcacacaccactggcacagatcaggatcaacaatcggaaaggccttcgaggccacagtgagaggagccaagagaatggccgtgctgggagacacagcctgggacttcggatcagtgggaggagccctgaactcactgggaaagggaatccaccagatcttcggagccgccttcaagtcactgttcggaggaatgtcatggttctcacagatcctgatcggaacactgctgatgtggctgggactgaacacaaagaacggatcaatctcactgatgtgtctggccctgggaggagtgctgatcttcctgtcaacagccgtgtcagcc

3.3. Codon Optimization

Using the Twist Bioscience Codon Optimization Tool, avoiding Type IIS enzyme sites (BsaI, BsmBI, and BbsI) and standard MCS sites (BamHI, EcoRI, HindIII, NcoI, and XhoI), and preserving the glycosylation sites and fusion loop for protein functionality, I obtained the following sequence:

>EProtein ATCAGATGTATCGGAGTGTCAAACAGAGACTTCGTGGAGGGAATGTCAGGAGGAACATGGGTGGACGTGGTGCTGGAGCACGGAGGATGTGTGACAGTGATGGCCCAGGACAAGCCAACAGTGGACATCGAGCTGGTGACAACAACAGTGTCAAACATGGCCGAGGTGAGATCATACTGTTACGAGGCCTCAATCTCAGACATGGCCTCAGACTCAAGATGTCCAACACAGGGAGAGGCCTACCTGGACAAGCAGTCAGACACACAGTACGTGTGTAAGAGAACACTGGTGGACAGAGGATGGGGAAACGGATGTGGACTGTTCGGAAAGGCATTACAGGAGATATGTGTCAGGTACGACTTTTCAAAGAGGACGATCGAAAGGTGAACCCTGCTCGTGAGCCCGGGGTTCAAAACCACGCAGTTAGCGCGCGGATCACAGCACTCAGGAATGATCGTGAACGACACAGGACACGAGACAGACGAGAACAGAGCCAGGGCGGCGACCATACGAAACTTACTAAATCCCGGGGACATACAGGCCGCATCCGAATTACGGGCACGGGTCTGTAGGCCAAGAACAGGACTGGACTTCTCAGACCTGTACTACCTGACAATGAACAACAAGCACTGGCTGGTGCACAAGGAGTGGTTCCACGACATCCCACTGCCATGGCACGCCGGAGCCGACACAGGAACACCACACTGGAACAACAAGGAGGCCCTGGTGGAGTTCAAGGACGCCCACGCCAAGAGACAGACAGTGGTGGTGCTGGGATCACAGGAGGGAGCCGTGCACACAGCCCTGGCCGGAGCCCTGGAGGCCGAGATGGACGGAGCCAAGGGAAGACTGTCATCAGGACACCTGAAGTGTAGACTGAAGATGGACAAGCTGAGACTGAAGGGAGTGTCATACTCACTGTGTACAGCCGCCTTCACATTCACAAAGATCCCAGCCGAGACACTGCACGGAACAGTGACAGTGGAGGTGCAGTACGCCGGAACAGACGGACCATGTAAGGTGCCAGCCCAGATGGCCGTGGACATGCAGACACTGACACCAGTGGGAAGACTGATCACAGCCAACCCAGTGATCACAGAGTCAACAGAGAACTCAAAGATGATGCTGGAGCTGGACCCACCATTCGGAGACTCATACATCGTGATCGGAGTGGGAGAGAAGAAGATCACACACCACTGGCACAGATCAGGATCAACAATCGGAAAGGCCTTCGAGGCCACAGTGAGAGGAGCCAAGAGAATGGCCGTGCTGGGAGACACAGCCTGGGACTTCGGATCAGTGGGAGGAGCCCTGAACTCACTGGGAAAGGGAATCCACCAGATCTTCGGAGCCGCCTTCAAGTCACTGTTCGGAGGAATGTCATGGTTCTCACAGATCCTGATCGGAACACTGCTGATGTGGCTGGGACTGAACACAAAGAACGGATCAATCTCACTGATGTGTCTGGCCCTGGGAGGAGTGCTGATCTTCCTGTCAACAGCCGTGTCAGCC

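Whether an optimized sequence is free of the excluded restriction sites can be checked independently of the optimization tool. The following is a minimal sketch: it scans both strands for the recognition sequences of the enzymes named above, and `find_sites` is a hypothetical helper, not part of the Twist tool.

```python
# Recognition sequences (5'->3') of the enzymes excluded during optimization
SITES = {
    "BsaI": "GGTCTC", "BsmBI": "CGTCTC", "BbsI": "GAAGAC",
    "BamHI": "GGATCC", "EcoRI": "GAATTC", "HindIII": "AAGCTT",
    "NcoI": "CCATGG", "XhoI": "CTCGAG",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_sites(seq):
    """Map enzyme name to 0-based start positions of its site on either strand."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in SITES.items():
        motifs = {site, reverse_complement(site)}
        positions = [
            i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] in motifs
        ]
        if positions:
            hits[enzyme] = positions
    return hits

print(find_sites("AAGGTCTCAA"))
```

Running `find_sites` on the full optimized sequence and getting an empty dictionary would confirm that none of the listed sites remain; any reported position points to a codon that needs a synonymous change.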

3.4. You have a sequence! Now what?

Because it is a glycosylated viral protein, I would transfect it into a mammalian cell system such as HEK293, with the optimized sequence cloned into a mammalian expression vector.

3.5. How does it work in nature/biological systems?

A single gene can code for multiple proteins at the transcript level: alternative splicing can leave out some exons, and different open reading frames can be read from the same sequence.


Part 4: Prepare a Twist DNA Synthesis Order

4.2-4.6 Build Your DNA Insert Sequence and expression system

Because the E protein is a viral surface protein, I would express it from the Twist CMV plasmid.


Part 5: DNA Read/Write/Edit

5.1 DNA Read

(i) What DNA would you want to sequence and why?

I would prioritize sequencing immunoglobulin G (IgG)–encoding genes, specifically the immunoglobulin heavy chain (IGH) and light chain (IGK or IGL) loci, as well as recombinant plasmids encoding engineered Fab fragments. IgG genes are particularly compelling because they undergo V(D)J recombination, somatic hypermutation, and class-switch recombination, processes that collectively generate extraordinary antibody diversity. Sequencing these loci enables detailed characterization of clonal expansion, affinity maturation, and mutational landscapes during immune responses. Such analyses are highly relevant to therapeutic antibody development, vaccine design, and the study of autoimmune or lymphoproliferative disorders. In parallel, plasmid sequencing is essential for molecular cloning workflows, as it confirms the correct insertion of coding sequences, preservation of reading frames, and absence of deleterious point mutations prior to protein expression.

(ii) What sequencing technology would you use and why?

For plasmid constructs and individual Ig variable regions, I would use Sanger sequencing, a first-generation sequencing technology. Because these targets are relatively short (typically under 3 kb), a few overlapping Sanger reads of roughly 700 to 1,000 bases each provide full coverage, high per-base accuracy, and straightforward validation.

The input material would consist of purified plasmid DNA or PCR-amplified variable regions. Preparation involves plasmid isolation, primer design (vector-specific or insert-specific), and a sequencing reaction containing template DNA, a single primer, DNA polymerase, deoxynucleotides (dNTPs), and fluorescently labeled dideoxynucleotides (ddNTPs). During the reaction, stochastic incorporation of ddNTPs terminates DNA elongation, generating fragments of varying lengths. These fragments are separated by capillary electrophoresis, and laser detection of fluorescent signals enables base calling.

The output consists of a chromatogram (electropherogram) with color-coded peaks corresponding to A, T, C, and G, along with a base-called sequence and associated quality scores. If full-length immunoglobulin loci or complex repertoires were to be analyzed, I would alternatively employ third-generation long-read sequencing, such as PacBio HiFi sequencing, which provides highly accurate circular consensus reads and resolves repetitive or structurally complex regions more effectively than short-read methods.

5.2 DNA Write

(i) What DNA would you want to synthesize and why?

I would synthesize engineered Fab fragments derived from human IgG, optimized for therapeutic applications. These constructs would include codon-optimized variable heavy (VH) and variable light (VL) domains fused to constant regions (CH1 and CL), with engineered complementarity-determining regions (CDRs) to enhance antigen affinity and specificity. Additional design features could include a secretion signal peptide, affinity purification tags (e.g., His-tag), and optimized linker sequences where appropriate. The rationale for synthesizing such constructs lies in their relevance to antibody therapeutics, where improved binding kinetics, reduced immunogenicity, and controlled expression are critical parameters. Synthetic gene construction allows precise sequence engineering beyond what is achievable through conventional cloning.

(ii) What technology would you use for DNA synthesis and why?

To synthesize these constructs, I would rely on phosphoramidite-based solid-phase DNA synthesis for short oligonucleotides, followed by enzymatic assembly methods such as Gibson Assembly. In this approach, short chemically synthesized oligonucleotides are generated sequentially through nucleotide coupling cycles, cleaved from the solid support, deprotected, and purified. Overlapping fragments are then assembled enzymatically into a full-length gene construct and cloned into an appropriate expression vector. The assembled plasmid is subsequently sequence-verified, typically by Sanger sequencing. Limitations of chemical DNA synthesis include increased error rates with longer sequences, challenges associated with repetitive or GC-rich regions, and cost scalability. However, high-fidelity assembly methods and post-synthesis sequence validation mitigate these concerns and allow reliable production of gene-length constructs.

5.3 DNA Edit

(i) What DNA would you want to edit and why?

I would focus on editing the genomes of animal models, such as mice, to introduce fluorescent reporters (e.g., EGFP) or precise disease-associated mutations. For example, inserting an EGFP reporter at an endogenous locus would enable real-time visualization of gene expression patterns during development. Alternatively, introducing defined point mutations into genes implicated in cancer or developmental disorders would facilitate mechanistic studies and disease modeling. Genome editing in animal systems provides powerful tools for functional genomics, lineage tracing, and in vivo investigation of gene regulation.

(ii) What technology would you use and how does it work?

To perform these edits, I would use CRISPR-Cas9 genome editing. This system employs a single-guide RNA (sgRNA) to direct the Cas9 endonuclease to a complementary genomic locus, where it induces a double-strand break (DSB). The cell subsequently repairs the DSB through either non-homologous end joining (NHEJ), which can introduce insertions or deletions, or homology-directed repair (HDR), which enables precise sequence insertion when a donor DNA template is supplied. Preparation requires rational guide RNA design to minimize off-target effects, construction or procurement of Cas9 expression systems (plasmid, mRNA, or ribonucleoprotein complex), and, when precise edits are desired, synthesis of a donor DNA template containing homology arms. Delivery methods may include electroporation, viral transduction, or microinjection into embryos. Edited cells or organisms must then be screened and validated via PCR and sequencing. Although CRISPR-Cas9 is highly versatile, limitations include variable editing efficiency, potential off-target cleavage, mosaicism in early embryos, and constraints associated with homology-directed repair efficiency. Despite these challenges, CRISPR-based editing remains the most precise and adaptable platform currently available for targeted genome engineering.
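The first step of guide RNA design, locating candidate target sites, can be sketched in a few lines. The example assumes the commonly used SpCas9 enzyme, which requires an NGG PAM immediately 3′ of a 20-nucleotide protospacer (the text above says only "CRISPR-Cas9"), and it does not score off-target risk. `find_guides` is a hypothetical helper.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_guides(seq, guide_len=20):
    """List (strand, start, protospacer, pam) for every NGG PAM site.

    `start` is the 0-based protospacer position on the strand being scanned;
    the minus strand is scanned as the reverse complement of `seq`.
    """
    seq = seq.upper()
    guides = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(guide_len, len(s) - 2):
            pam = s[i:i + 3]
            if pam[1:] == "GG":
                guides.append((strand, i - guide_len, s[i - guide_len:i], pam))
    return guides

target = "ACGT" * 5 + "TGG"
print(find_guides(target))
```

In practice each candidate would then be filtered for off-target matches elsewhere in the genome before a guide is chosen.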

Week 3 Homework: Lab Automation

Part 1: Python Script for Opentrons Artwork

Using GUI opentrons-art.rcdonovan.com, I created two designs, one using a whole range of colors and the other using just red and green (https://opentrons-art.rcdonovan.com/?id=zk5r154b9r6q995).

Full-color design:
mplum_points = [(-4.4, 24.2),(-2.2, 24.2),(0, 24.2),(-6.6, 22),(-4.4, 22),(2.2, 22),(-6.6, 19.8),(-4.4, 19.8),(2.2, 19.8),(-6.6, 17.6),(-4.4, 17.6),(-2.2, 17.6),(0, 17.6),(2.2, 17.6),(-6.6, 15.4),(-4.4, 15.4),(-2.2, 15.4),(-6.6, 13.2),(-4.4, 13.2),(-6.6, 11),(-4.4, 11),(-6.6, 8.8),(-4.4, 8.8),(-6.6, 6.6),(-4.4, 6.6),(-2.2, 6.6),(-6.6, 4.4),(-4.4, 4.4),(-2.2, 4.4),(-6.6, 2.2),(-4.4, 2.2),(-2.2, 2.2),(-6.6, 0),(-4.4, 0),(-2.2, 0),(-6.6, -2.2),(-4.4, -2.2),(-2.2, -2.2),(-6.6, -4.4),(-4.4, -4.4),(-2.2, -4.4),(-6.6, -6.6),(-4.4, -6.6),(-2.2, -6.6),(-6.6, -8.8),(-4.4, -8.8),(-2.2, -8.8),(-6.6, -11),(-4.4, -11),(-2.2, -11),(-6.6, -13.2),(-4.4, -13.2),(-2.2, -13.2),(-6.6, -15.4),(-4.4, -15.4),(-2.2, -15.4),(-4.4, -17.6)] 
mko2_points = [(0, 15.4),(2.2, 15.4),(-2.2, 13.2),(0, 13.2),(2.2, 13.2),(4.4, 13.2),(-2.2, 11),(0, 11),(2.2, 11),(4.4, 11),(-2.2, 8.8),(0, 8.8),(2.2, 8.8),(4.4, 8.8),(0, 6.6),(2.2, 6.6)] 
ultramarine_points = [(-2.2, 22),(0, 22),(-2.2, 19.8),(0, 19.8)] 
electra2_points = [(4.4, 2.2),(6.6, 2.2),(8.8, 2.2),(11, 2.2),(13.2, 2.2),(15.4, 2.2),(17.6, 2.2),(8.8, 0),(13.2, 0),(6.6, -2.2),(13.2, -2.2),(15.4, -2.2),(17.6, -2.2),(4.4, -4.4),(13.2, -4.4),(2.2, -6.6),(4.4, -6.6),(6.6, -6.6),(8.8, -6.6),(13.2, -6.6)]
Red and green design:
megfp_points = [(-4.4, 24.2),(-2.2, 24.2),(0, 24.2),(-6.6, 22),(-4.4, 22),(2.2, 22),(-6.6, 19.8),(-4.4, 19.8),(2.2, 19.8),(-6.6, 17.6),(-4.4, 17.6),(-2.2, 17.6),(0, 17.6),(2.2, 17.6),(-6.6, 15.4),(-4.4, 15.4),(-2.2, 15.4),(-6.6, 13.2),(-4.4, 13.2),(-6.6, 11),(-4.4, 11),(-6.6, 8.8),(-4.4, 8.8),(-6.6, 6.6),(-4.4, 6.6),(-2.2, 6.6),(-6.6, 4.4),(-4.4, 4.4),(-2.2, 4.4),(-6.6, 2.2),(-4.4, 2.2),(-2.2, 2.2),(-6.6, 0),(-4.4, 0),(-2.2, 0),(-6.6, -2.2),(-4.4, -2.2),(-2.2, -2.2),(-6.6, -4.4),(-4.4, -4.4),(-2.2, -4.4),(-6.6, -6.6),(-4.4, -6.6),(-2.2, -6.6),(-6.6, -8.8),(-4.4, -8.8),(-2.2, -8.8),(-6.6, -11),(-4.4, -11),(-2.2, -11),(-6.6, -13.2),(-4.4, -13.2),(-2.2, -13.2),(-6.6, -15.4),(-4.4, -15.4),(-2.2, -15.4),(-4.4, -17.6)] 
mrfp1_points = [(-2.2, 22),(0, 22),(-2.2, 19.8),(0, 19.8),(0, 15.4),(2.2, 15.4),(-2.2, 13.2),(0, 13.2),(2.2, 13.2),(4.4, 13.2),(-2.2, 11),(0, 11),(2.2, 11),(4.4, 11),(-2.2, 8.8),(0, 8.8),(2.2, 8.8),(4.4, 8.8),(0, 6.6),(2.2, 6.6)]
response

I submitted my design, but I still have some questions about the Python file.

from opentrons import types

import string

metadata = {
    'protocolName': 'Sarah Jimenez - Opentrons Art - HTGAA',
    'author': 'HTGAA',
    'source': 'HTGAA 2026',
    'apiLevel': '2.20'
}

Z_VALUE_AGAR = 2.0
POINT_SIZE = 1

megfp_points = [(-4.4,24.2), (-2.2,24.2), (0,24.2), (-6.6,22), (-4.4,22), (2.2,22), (-6.6,19.8), (-4.4,19.8), (2.2,19.8), (-6.6,17.6), (-4.4,17.6), (-2.2,17.6), (0,17.6), (2.2,17.6), (-6.6,15.4), (-4.4,15.4), (-2.2,15.4), (-6.6,13.2), (-4.4,13.2), (-6.6,11), (-4.4,11), (-6.6,8.8), (-4.4,8.8), (-6.6,6.6), (-4.4,6.6), (-2.2,6.6), (-6.6,4.4), (-4.4,4.4), (-2.2,4.4), (-6.6,2.2), (-4.4,2.2), (-2.2,2.2), (-6.6,0), (-4.4,0), (-2.2,0), (-6.6,-2.2), (-4.4,-2.2), (-2.2,-2.2), (-6.6,-4.4), (-4.4,-4.4), (-2.2,-4.4), (-6.6,-6.6), (-4.4,-6.6), (-2.2,-6.6), (-6.6,-8.8), (-4.4,-8.8), (-2.2,-8.8), (-6.6,-11), (-4.4,-11), (-2.2,-11), (-6.6,-13.2), (-4.4,-13.2), (-2.2,-13.2), (-6.6,-15.4), (-4.4,-15.4), (-2.2,-15.4), (-4.4,-17.6)]
mrfp1_points = [(-2.2,22), (0,22), (-2.2,19.8), (0,19.8), (0,15.4), (2.2,15.4), (-2.2,13.2), (0,13.2), (2.2,13.2), (4.4,13.2), (-2.2,11), (0,11), (2.2,11), (4.4,11), (-2.2,8.8), (0,8.8), (2.2,8.8), (4.4,8.8), (0,6.6), (2.2,6.6)]

point_name_pairing = [("megfp", megfp_points),("mrfp1", mrfp1_points)]

# Robot deck setup constants
TIP_RACK_DECK_SLOT = 9
COLORS_DECK_SLOT = 6
AGAR_DECK_SLOT = 5
PIPETTE_STARTING_TIP_WELL = 'A1'

# Place the PCR tubes in this order
well_colors = {
    'A1': 'sfGFP',
    'A2': 'mRFP1',
    'A3': 'mKO2',
    'A4': 'Venus',
    'A5': 'mKate2_TF',
    'A6': 'Azurite',
    'A7': 'mCerulean3',
    'A8': 'mClover3',
    'A9': 'mJuniper',
    'A10': 'mTurquoise2',
    'A11': 'mBanana',
    'A12': 'mPlum',
    'B1': 'Electra2',
    'B2': 'mWasabi',
    'B3': 'mScarlet_I',
    'B4': 'mPapaya',
    'B5': 'eqFP578',
    'B6': 'tdTomato',
    'B7': 'DsRed',
    'B8': 'mKate2',
    'B9': 'EGFP',
    'B10': 'mRuby2',
    'B11': 'TagBFP',
    'B12': 'mChartreuse_TF',
    'C1': 'mLychee_TF',
    'C2': 'mTagBFP2',
    'C3': 'mEGFP',
    'C4': 'mNeonGreen',
    'C5': 'mAzamiGreen',
    'C6': 'mWatermelon',
    'C7': 'avGFP',
    'C8': 'mCitrine',
    'C9': 'mVenus',
    'C10': 'mCherry',
    'C11': 'mHoneydew',
    'C12': 'TagRFP',
    'D1': 'mTFP1',
    'D2': 'Ultramarine',
    'D3': 'ZsGreen1',
    'D4': 'mMiCy',
    'D5': 'mStayGold2',
    'D6': 'PA_GFP'
}

volume_used = {
    'megfp': 0,
    'mrfp1': 0
}

def update_volume_remaining(current_color, quantity_to_aspirate):
    rows = string.ascii_uppercase
    for well, color in list(well_colors.items()):
        # Match case-insensitively: volume_used keys ('megfp') differ in case from well labels ('mEGFP')
        if color.lower() == current_color.lower():
            if (volume_used[current_color] + quantity_to_aspirate) > 250:
                # Switch to the well one row down in the same column (e.g. C3 -> D3);
                # this replaces whatever color was mapped to that well
                row = well[0]
                col = well[1:]

                # Find next row letter
                next_row = rows[rows.index(row) + 1]
                next_well = f"{next_row}{col}"

                del well_colors[well]
                well_colors[next_well] = current_color
                volume_used[current_color] = quantity_to_aspirate
            else:
                volume_used[current_color] += quantity_to_aspirate
            break

def run(protocol):
    # Load labware, modules and pipettes
    protocol.home()

    # Tips
    tips_20ul = protocol.load_labware('opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Opentrons 20uL Tips')

    # Pipettes
    pipette_20ul = protocol.load_instrument("p20_single_gen2", "right", [tips_20ul])

    # Deep Well Plate
    temperature_plate = protocol.load_labware('nest_96_wellplate_2ml_deep', COLORS_DECK_SLOT)

    # Agar Plate
    agar_plate = protocol.load_labware('htgaa_agar_plate', AGAR_DECK_SLOT, 'Agar Plate')
    agar_plate.set_offset(x=0.00, y=0.00, z=Z_VALUE_AGAR)

    # Get the top-center of the plate, make sure the plate was calibrated before running this
    center_location = agar_plate['A1'].top()

    pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)
    
    # Helper function (dispensing)
    def dispense_and_jog(pipette, volume, location):
        assert isinstance(volume, (int, float))
        # Go above the location
        above_location = location.move(types.Point(z=location.point.z + 2))
        pipette.move_to(above_location)
        # Go downwards and dispense
        pipette.dispense(volume, location)
        # Go upwards to avoid smearing
        pipette.move_to(above_location)

    # Helper function (color location)
    def location_of_color(color_string):
        for well,color in well_colors.items():
            if color.lower() == color_string.lower():
                return temperature_plate[well]
        raise ValueError(f"No well found with color {color_string}")

    # Print pattern by iterating over lists
    for current_color, point_list in point_name_pairing:
        # Skip the rest of the loop if the list is empty
        if not point_list:
            continue

        # Get a fresh tip for this color, then aspirate the bacteria of choice
        pipette_20ul.pick_up_tip()
        max_aspirate = int(18 // POINT_SIZE) * POINT_SIZE
        quantity_to_aspirate = min(len(point_list)*POINT_SIZE, max_aspirate)
        update_volume_remaining(current_color, quantity_to_aspirate)
        pipette_20ul.aspirate(quantity_to_aspirate, location_of_color(current_color))

        # Iterate over the current points list and dispense them, refilling along the way
        for i in range(len(point_list)):
            x, y = point_list[i]
            adjusted_location = center_location.move(types.Point(x, y))

            dispense_and_jog(pipette_20ul, POINT_SIZE, adjusted_location)
            
            # Refill only when the tip is empty and points remain after this one
            if pipette_20ul.current_volume == 0 and len(point_list[i+1:]) > 0:
                quantity_to_aspirate = min(len(point_list[i+1:])*POINT_SIZE, max_aspirate)
                update_volume_remaining(current_color, quantity_to_aspirate)
                pipette_20ul.aspirate(quantity_to_aspirate, location_of_color(current_color))

        # Drop tip between each color
        pipette_20ul.drop_tip()
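
The refill logic can be checked without a robot by reproducing the aspiration schedule in plain Python. This is only a sketch and not part of the protocol: `aspiration_plan` and `tip_capacity` are hypothetical names, and it assumes the same 18 µL working limit and `POINT_SIZE` of 1 as the script, with each refill drawing exactly what the remaining points need. The counts 57 and 20 are the lengths of `megfp_points` and `mrfp1_points`.

```python
def aspiration_plan(n_points, point_size=1, tip_capacity=18):
    """Return the volumes (in uL) aspirated in sequence to dispense n_points drops."""
    # Largest multiple of point_size that fits under the working tip limit
    max_aspirate = int(tip_capacity // point_size) * point_size
    plan = []
    remaining = n_points
    while remaining > 0:
        volume = min(remaining * point_size, max_aspirate)
        plan.append(volume)
        # Each aspiration covers volume / point_size drops
        remaining -= int(volume // point_size)
    return plan


megfp_plan = aspiration_plan(57)   # 57 points in megfp_points
mrfp1_plan = aspiration_plan(20)   # 20 points in mrfp1_points

print(megfp_plan, sum(megfp_plan))  # [18, 18, 18, 3] 57
print(mrfp1_plan, sum(mrfp1_plan))  # [18, 2] 20
```

Under these assumptions the design uses 57 µL of mEGFP and 20 µL of mRFP1, well below the 250 µL per-well threshold checked in `update_volume_remaining`, so the well-switching branch should never run for this artwork.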

Part 2: Opentrons on published science and project

2.1. Automated, high-throughput in-situ hybridization of Lytechinus pictus embryos

This study presents the development of an automated, high-throughput hybridization chain reaction (HT-HCR) pipeline optimized for whole-mount embryos of Lytechinus pictus, addressing a central technical limitation in developmental biology: the low scalability of conventional in situ hybridization. The authors engineered a miniaturized, robotics-based workflow capable of processing large probe sets in 96-well format with minimal reagent volumes and without manual intervention, followed by automated confocal imaging. Using this platform, they generated spatial expression data for over one hundred genes across multiple embryonic stages, encompassing transcription factors, signaling components, and physiological regulators. Importantly, the method preserves spatial resolution while dramatically increasing throughput, thereby enabling systematic construction of spatial gene expression atlases and facilitating integration with perturbation-based studies. Overall, this work represents a methodological advance that bridges classical embryological approaches with scalable spatial transcriptomics, significantly expanding the experimental capacity for gene regulatory network analysis in early development.

2.2. What about the project?

An Opentrons platform could be strategically integrated into the “exchangeable Fabs” workflow to automate and standardize the Design–Build–Test cycle of recombinant Fab production. At the molecular cloning stage, Opentrons could perform high-throughput Golden Gate or Gibson assemblies to systematically combine variable Fab regions with a standardized Fc backbone plasmid, followed by automated bacterial transformation, colony PCR setup, and plasmid prep normalization in 96-well format. For protein production, the system could prepare transfection mixes (e.g., HEK293 or CHO systems), seed cells, and manage media exchanges in small-scale expression screens to optimize Fab yield and stability. Downstream, Opentrons could automate affinity purification workflows (Protein A/G or Ni-NTA for tagged constructs) using magnetic bead–based protocols, ensuring reproducible wash and elution conditions while minimizing reagent waste—particularly relevant for resource-limited laboratories. Finally, in the validation phase, the robot could execute ELISA plate coating, blocking, serial dilutions, and detection antibody incubations to quantitatively screen binding affinity and specificity across multiple designed Fabs in parallel. Overall, implementing Opentrons would increase reproducibility, reduce hands-on time and antibody wastage, and enable scalable, semi-automated screening of in silico–designed Fab variants prior to functional characterization.
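
As a first step toward the ELISA validation stage, the plate layout for serial dilutions could be generated in Python before any robot commands are written. The sketch below is illustrative only: the function name `elisa_dilution_map`, the Fab names, the 100 nM starting concentration, and the two-fold dilution factor are all assumptions, not values from the project. It places one Fab per column of a 96-well plate and dilutes down rows A to H.

```python
import string

ROWS = string.ascii_uppercase[:8]   # rows A-H of a 96-well plate
COLUMNS = range(1, 13)              # columns 1-12


def elisa_dilution_map(fab_names, start_conc_nM=100.0, fold=2.0):
    """Map each well to (Fab name, concentration in nM), one Fab per column."""
    if len(fab_names) > len(COLUMNS):
        raise ValueError("at most 12 Fabs fit on one plate in this layout")
    plate = {}
    for column, fab in zip(COLUMNS, fab_names):
        for step, row in enumerate(ROWS):
            # Each row down is diluted by another factor of `fold`
            plate[f"{row}{column}"] = (fab, start_conc_nM / fold ** step)
    return plate


# Hypothetical Fab names, used only to show the layout
layout = elisa_dilution_map(["Fab_01", "Fab_02", "Fab_03"])
print(layout["A1"])  # ('Fab_01', 100.0)
print(layout["H3"])  # ('Fab_03', 0.78125)
```

A dictionary keyed by well name matches how Opentrons labware is indexed (for example `plate['A1']`), so a map like this could later drive the transfer steps of a protocol.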

Part 3: Project ideas!!
