Homework
Weekly homework submissions:
Week 1 HW: Principles and Practices
Application: An AI Agent for Protein and Molecular Design
I’m developing an AI agent for protein and molecular design: an autonomous system that can take a high-level design brief (e.g. “design a protein that binds target X with nanomolar affinity”) and execute the full computational design pipeline: searching structure databases, running generative models, evaluating candidates, iterating on designs, and preparing sequences for synthesis. Unlike standalone models, an agent orchestrates multiple tools and makes decisions across the design cycle with minimal human intervention.
Week 2 HW: DNA Read, Write, and Edit
Part 1: Benchling Gel Art
Virtual restriction digest of Lambda DNA (J02459) using EcoRI, HindIII, BamHI, PstI, SalI, and XhoI, visualized in NEBcutter.
Part 3: DNA Design Challenge
3.1 Choose Your Protein
I chose endoglucanase A (CelCCA) from the bacterium Clostridium cellulolyticum, since reclassified as Ruminiclostridium cellulolyticum. The crystal structure of its catalytic domain is deposited in the PDB as entry 1EDG, solved at 1.6 Å resolution.
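For the Part 1 gel art, a virtual digest amounts to locating each enzyme's recognition site and splitting the sequence at the cut position. The sketch below is only an illustration of that idea, not what NEBcutter or Benchling do internally: the `digest` helper and the 24-base `toy` sequence are made up for the example (it is not Lambda DNA), and the cut offsets are the standard top-strand positions for these six enzymes on a linear molecule.

```python
import re

# Recognition site and top-strand cut offset (bases from the start of the site).
# All six sites are palindromic, so scanning one strand is enough.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "PstI": ("CTGCAG", 5),     # CTGCA^G
    "SalI": ("GTCGAC", 1),     # G^TCGAC
    "XhoI": ("CTCGAG", 1),     # C^TCGAG
}


def digest(seq, enzyme_names):
    """Return top-strand fragment lengths for a linear sequence, in 5'->3' order."""
    cuts = set()
    for name in enzyme_names:
        site, offset = ENZYMES[name]
        # A lookahead pattern also finds overlapping occurrences of the site.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical toy sequence with one EcoRI site and one HindIII site.
toy = "ACGTGAATTCACGTAAGCTTACGT"
print(digest(toy, ["EcoRI", "HindIII"]))  # [5, 10, 9]
```

Sorting the fragment lengths from a real digest in descending order gives the band order expected on a gel, largest fragment nearest the well.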
Part 1: Opentrons Art
I designed my artwork using the Automation Art GUI at opentrons-art.rcdonovan.com. I uploaded a bat image, and the tool pixelated it into dispensing coordinates for three fluorescent proteins: mClover3 (green, 41 points), mRFP1 (red, 393 points), and Azurite (blue, 40 points). The design uses 0.75 µL droplets at 2.2 mm spacing.
Week 4 HW: Protein Design Part I
Part A: Conceptual Questions (Shuguang Zhang)
Q1: How many molecules of amino acids do you take with a piece of 500 grams of meat?
Meat is roughly 25% protein by weight, so 500 g of meat contains ~125 g of protein. The average amino acid residue has a molecular weight of ~110 Da (daltons), i.e. ~110 g/mol. So 125 g ÷ 110 g/mol ≈ 1.14 mol of amino acid residues. Multiplying by Avogadro’s number (6.022 × 10²³ per mole) gives roughly 6.8 × 10²³ amino acid molecules, just over one mole.
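The Q1 estimate can be checked with a few lines of arithmetic. This is a minimal sketch using the same rounded inputs as the answer above (25% protein by weight, ~110 g/mol per residue); the variable names are only for illustration.

```python
AVOGADRO = 6.022e23          # molecules per mole

meat_g = 500.0               # mass of meat in grams
protein_fraction = 0.25      # rough protein content of meat by weight
residue_mass = 110.0         # average amino acid residue mass in g/mol

protein_g = meat_g * protein_fraction
moles = protein_g / residue_mass
molecules = moles * AVOGADRO

print(f"{protein_g:.0f} g protein, {moles:.2f} mol, {molecules:.2e} residues")
# 125 g protein, 1.14 mol, 6.84e+23 residues
```

Because the inputs are rough, the result is good to about two significant figures at best.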
Week 5 HW: Protein Design Part II
Part 1: Generate Binders with PepMLM
Sequence Preparation
Retrieved WT SOD1 from UniProt P00441 and introduced A4V (position 5 in the full sequence, position 4 in the mature protein after Met cleavage):
WT:  M A T K A V C V L K …
A4V: M A T K V V C V L K …
             ^
PepMLM Results
Model: PepMLM-650M (ESM-2 based, Colab, T4 GPU) | Parameters: length = 12, num_binders = 4, top_k = 3
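The numbering shift between the mature protein and the full sequence is easy to get wrong, so here is a small sketch of the substitution step. The `apply_substitution` helper is hypothetical (it is not part of PepMLM), and only the ten-residue prefix shown above is used rather than the full UniProt sequence.

```python
def apply_substitution(seq, wild_type, mature_position, mutant, has_init_met=True):
    """Apply a point substitution given in mature-protein numbering.

    If the sequence still carries the initiator Met, mature position N
    corresponds to position N + 1 in the full sequence.
    """
    full_position = mature_position + (1 if has_init_met else 0)
    index = full_position - 1  # convert 1-based position to 0-based index
    if seq[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at full-sequence position {full_position}, "
            f"found {seq[index]}"
        )
    return seq[:index] + mutant + seq[index + 1:]


# First ten residues of SOD1 as shown above (initiator Met included).
wt_prefix = "MATKAVCVLK"
a4v_prefix = apply_substitution(wt_prefix, "A", 4, "V")
print(a4v_prefix)  # MATKVVCVLK
```

Checking the wild-type residue before substituting catches an off-by-one error immediately instead of silently mutating the wrong position.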
Week 6 HW: Genetic Circuits Part I
Assignment: DNA Assembly
1. What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?
Phusion DNA Polymerase: a high-fidelity polymerase with 3′→5′ exonuclease (proofreading) activity, so it corrects misincorporated bases during extension. Its error rate is much lower than that of standard Taq.
dNTPs (dATP, dTTP, dCTP, dGTP): the nucleotide building blocks the polymerase adds to the growing strand.
MgCl₂: magnesium ions are an essential cofactor for polymerase function. Mg²⁺ concentration affects the stringency of primer annealing.
Reaction buffer: maintains optimal pH and salt conditions across the three PCR temperature steps.
The user adds template DNA and two primers (forward and reverse) to complete the reaction.
Week 7 HW: Genetic Circuits Part II
Assignment Part 1: Intracellular Artificial Neural Networks (IANNs)
1. What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?
Boolean circuits force biological signals into binary (high/low), but real biomarkers exist at continuous concentrations. IANNs operate on analog values, performing weighted summation and nonlinear activation (ReLU), so they can compute complex continuous functions like bandpass filters and diagonal decision boundaries. These are the kinds of input-output shapes actually needed for problems like cancer classification, where you care about relative concentration levels, not just on/off.
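To make the bandpass and diagonal-boundary claims concrete, here is a purely numerical sketch of weighted summation followed by ReLU. It is not a model of any biological implementation: the weights, thresholds, and function names are assumptions chosen for illustration, and the inputs stand in for concentrations in arbitrary units.

```python
def relu(x):
    """Rectified linear unit: passes positive values, clips negatives to zero."""
    return max(0.0, x)


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, followed by ReLU."""
    return relu(sum(w * x for w, x in zip(weights, inputs)) + bias)


def bandpass(x, low=1.0, high=3.0):
    """Triangular bandpass built from three ReLU units and a linear readout.

    The output is zero below `low` and above `high`, and peaks at the midpoint.
    """
    mid = (low + high) / 2.0
    h1 = neuron([x], [1.0], -low)
    h2 = neuron([x], [1.0], -mid)
    h3 = neuron([x], [1.0], -high)
    return h1 - 2.0 * h2 + h3


def a_exceeds_b(a, b):
    """Diagonal decision boundary: positive only when input a is larger than b."""
    return neuron([a, b], [1.0, -1.0], 0.0)


for level in (0.0, 1.5, 2.0, 2.5, 4.0):
    print(level, bandpass(level))
```

A Boolean gate with a single threshold cannot produce either shape: the bandpass needs the output to fall again at high input, and the diagonal boundary depends on the ratio of two inputs rather than on each crossing its own cutoff.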