Homework

Weekly homework submissions:


Subsections of Homework

Week 1 HW: Principles and Practices

Homework 1: Governance & Ethics

1. Biological Engineering Application

Project: Recombinant production of BmK CT (Scorpion Peptide) via Cloud-Lab Automation.

Why: Venom-derived therapeutics are currently difficult to source sustainably. My goal is to engineer a microbial “cell factory” using E. coli to produce high-purity BmK CT for glioma (brain cancer) research, utilizing automated cloud-lab infrastructure for scalable access.

2. Governance/Policy Goals

My primary goal is to ensure Non-maleficence (preventing harm) while promoting Constructive Use.

  • Sub-goal A (Biosecurity): Preventing the diversion of synthesized neurotoxin sequences for harmful, non-therapeutic purposes.
  • Sub-goal B (Equity): Ensuring that the digital blueprints and production protocols are accessible to researchers in developing regions like Pakistan to promote autonomy.

3. Governance Actions Matrix

Aspect      | Action 1: Automated Screening    | Action 2: User Verification      | Action 3: Ethical Peer-Review
------------|----------------------------------|----------------------------------|----------------------------------------
Actor       | DNA Synthesis Companies          | Cloud Lab Platforms              | Academic Communities
Purpose     | Flags regulated toxin sequences. | Mandatory ID for remote users.   | Standardizing “Dual-Use” data sharing.
Assumptions | All toxins are indexed.          | Affiliation equals ethics.       | Users will follow protocols.
Risks       | False positives slow research.   | Excludes independent scientists. | Security breaches are possible.

4. Scoring Matrix

(1 = Best, 3 = Least)

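Does the option:         | Option 1 | Option 2 | Option 3
-------------------------|----------|----------|---------
Enhance Biosecurity      | 1        | 1        | 2
Foster Lab Safety        | 2        | 1        | 3
Not Impede Research      | 2        | 3        | 1
Promote Constructive App | 1        | 1        | 1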
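
As a quick check on the matrix, the scores can be totalled per option. This is only a sketch: it assumes every criterion carries equal weight, which the matrix itself does not state, and the variable names are mine.

```python
# Scores copied from the matrix above (1 = best, 3 = least favourable).
SCORES = {
    "Enhance Biosecurity": (1, 1, 2),
    "Foster Lab Safety": (2, 1, 3),
    "Not Impede Research": (2, 3, 1),
    "Promote Constructive App": (1, 1, 1),
}

# Assumption: an unweighted sum, so every criterion counts equally.
totals = [sum(column) for column in zip(*SCORES.values())]

for option, total in enumerate(totals, start=1):
    print(f"Option {option}: total score {total} (lower is better)")
```

Under that equal-weight assumption, Options 1 and 2 tie for the lowest total.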

5. Prioritization & Recommendation

I prioritize a combination of Action 1 and Action 2. This recommendation is intended for International Regulatory Bodies. We must implement “Know Your Customer” (KYC) rigor for Cloud Labs, similar to financial institutions. While this adds a burden to remote researchers, it is a necessary trade-off for handling potent neurotoxins safely.

6. Ethical Reflection

The concept of an “Information Hazard” was a significant new concern for me. Even if my physical lab work is safe, publishing a “perfect roadmap” for toxin production could be misused. I propose “Ethical Red-Teaming” as a governance action where students peer-review project documentation for potential dual-use risks before publication.


Week 2 Pre-Lecture Preparation

In preparation for the “DNA Read, Write, and Edit” lecture.

Part 1: Questions from Professor Jacobson

  • What is the error rate of polymerase? The error rate of DNA polymerase is approximately $10^{-7}$ to $10^{-8}$ per base pair.
  • How does this compare to the length of the human genome? The human genome is about 3 billion ($3 \times 10^9$) base pairs long, meaning mutations are inevitable without repair.
  • How does biology deal with that discrepancy? Biology utilizes Proofreading and Mismatch Repair (MMR) systems to reduce the final error rate to $10^{-9}$ or $10^{-10}$.
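  • How many different ways are there to code for an average human protein? Due to Codon Degeneracy, the number is astronomically large: with roughly three synonymous codons per amino acid, a protein of N residues has on the order of $3^N$ possible DNA sequences, far more than millions for a protein of a few hundred residues.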
  • Why don’t all of these different codes work? Factors like Codon Usage Bias, mRNA secondary structures, and cryptic splice sites can hinder protein expression.
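
The numbers in these answers can be put side by side with a short calculation. This is a rough sketch: the error rates are the ones quoted above, while the 400-residue protein length and the average of three synonymous codons per residue are illustrative assumptions of mine.

```python
import math

GENOME_BP = 3e9  # approximate human genome length quoted above


def expected_errors(error_rate_per_bp, genome_bp=GENOME_BP):
    """Expected number of miscopied bases in one pass over the genome."""
    return error_rate_per_bp * genome_bp


# Error rates quoted in the answers above.
for label, rate in [("polymerase", 1e-7), ("after repair", 1e-10)]:
    print(f"{label}: ~{expected_errors(rate):g} errors per replication")

# Assumed for illustration: 400 residues, ~3 synonymous codons per residue.
protein_length = 400
avg_codons_per_residue = 3
log10_encodings = protein_length * math.log10(avg_codons_per_residue)
print(f"roughly 10^{log10_encodings:.0f} possible coding sequences")
```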

Part 2: Questions from Dr. LeProust

  • Most common method for oligo synthesis: The Phosphoramidite method is the current standard.
  • Why is it difficult to make oligos > 200nt? Cumulative coupling inefficiencies lead to extremely low yields for long, pure sequences.
  • Why can’t you make a 2000bp gene via direct synthesis? The error rate and yield drop make direct synthesis of long genes unfeasible; they must be assembled from shorter oligos.
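
The effect of cumulative coupling inefficiency can be illustrated numerically. The sketch below assumes a 99% per-step coupling efficiency, which is a round number chosen for illustration and not a figure from the lecture.

```python
def full_length_fraction(n_nt, coupling_efficiency):
    """Fraction of strands still full length after n_nt - 1 coupling steps."""
    return coupling_efficiency ** (n_nt - 1)


ASSUMED_EFFICIENCY = 0.99  # illustrative assumption, not a measured value

for length in (20, 100, 200, 2000):
    fraction = full_length_fraction(length, ASSUMED_EFFICIENCY)
    print(f"{length:>5} nt: {fraction:.2e} of strands are full length")
```

With these assumptions only a small minority of 200 nt strands are full length, and essentially none at 2000 nt, which is why long genes are assembled from shorter oligos.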

Part 3: Question from George Church

  • 10 Essential Amino Acids: Arginine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Threonine, Tryptophan, and Valine.
  • The “Lysine Contingency”: This is a Biocontainment strategy where organisms are engineered to depend on external amino acids to prevent survival outside the lab environment.

Week 2 HW: DNA Read, Write, and Edit

Part 1: Benchling & In-silico Gel Art

For this assignment, I performed a virtual restriction digest on Lambda DNA using Benchling. My creative goal was to produce a pattern that mirrors the anatomy of a scorpion, inspired by my final project on the BmK CT scorpion peptide.

Creative Vision: The Genetic Scorpion 🦂

To achieve this symmetrical design, I strategically assigned different enzymes and double-digests across 7 lanes:

  • Lanes 1 & 7 (The Pincers): I used SalI to create distinct high-molecular-weight bands.
  • Lanes 2 & 6 (The Legs): A double digest of KpnI and SacI created a wider spread of bands, mimicking the scorpion’s legs.
  • Lanes 3 & 5 (The Torso): Combining BamHI and EcoRI resulted in a dense cluster of bands to represent the main body segments.
  • Lane 4 (The Stinger): I used EcoRV, which produces 21 fragments, creating a solid vertical pillar in the center that represents the scorpion’s iconic stinger.

Methodology

  1. Sequence: Imported the Lambda DNA (48,502 bp) sequence into Benchling.
  2. Analysis: Used the Restriction Analysis tool to map sites for EcoRI, HindIII, BamHI, KpnI, EcoRV, SacI, and SalI.
  3. Simulation: Conducted virtual gel electrophoresis using a NEB 2-Log Ladder as a size reference.
  4. Result: The final digest achieved a balanced, symmetrical pattern that visually aligns with my biotech research interests.

Figure: Scorpion Gel Art

Part 3: DNA Design Challenge

3.1. Choose your protein

The protein I chose for this assignment is BmK CT, a chlorotoxin-like peptide derived from the venom of the Chinese scorpion Olivierus martensii (previously called Mesobuthus martensii). It is a small bioactive peptide which functions primarily as a targeted anti-tumor agent against glioma cells, with key roles in inhibiting cell migration, invasion, and proliferation.

What makes BmK CT especially promising is that it can promote apoptosis and enhance the sensitivity of glioma cells to chemotherapeutic agents like temozolomide. Importantly, studies suggest that it exhibits minimal toxicity toward normal glial cells, highlighting its potential as a selective therapeutic candidate for high-grade gliomas.

The sequence was obtained from UniProt (Entry: Q9UAD0). I focused on the biologically active mature peptide consisting of 35 amino acids:

CGPCFTTDANMARKCRECCGGIGKCFGPQCLCNRI

This peptide is cysteine-rich (8 of its 35 residues are cysteines), consistent with the four disulfide bonds that stabilize the tertiary structure of chlorotoxin-like peptides.

3.2. Reverse Translate

Using the Reverse Translate tool from the Sequence Manipulation Suite, the following 105 bp DNA sequence (35 × 3 nucleotides) was generated:

tgcggcccgtgctttaccaccgatgcgaacatggcgcgcaaatgccgcgaatgctgcggcggcattggcaaatgctttggcccgcagtgcctgtgcaaccgcatt
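
One way to sanity-check a reverse translation is to translate the DNA back and compare it with the peptide. The following is a minimal sketch using the standard genetic code; the helper names are mine and it is not part of the Sequence Manipulation Suite.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PEPTIDE = "CGPCFTTDANMARKCRECCGGIGKCFGPQCLCNRI"
REVERSE_TRANSLATED = (
    "tgcggcccgtgctttaccaccgatgcgaacatggcgcgcaaatgccgcgaatgctgcggc"
    "ggcattggcaaatgctttggcccgcagtgcctgtgcaaccgcatt"
)


def translate(dna):
    """Translate a DNA string codon by codon (frame 1, no stop handling)."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


print(len(REVERSE_TRANSLATED), "bp")
print("matches peptide:", translate(REVERSE_TRANSLATED) == PEPTIDE)
```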

3.3. Codon Optimization

When expressing a foreign gene in a host organism, it’s important to consider codon usage bias. This refers to the preference of the host to translate certain synonymous codons more efficiently than others. Codon optimization replaces some codons in the sequence with those preferred by the host, without changing the amino acid sequence, to ensure efficient translation.

For BmK CT, I chose Escherichia coli (K-12 strain MG1655) as the host because it is a standard, well-characterized chassis in synthetic biology that grows rapidly and is easy to handle. Using the Codon Optimization Tool from VectorBuilder, I optimized the sequence while avoiding Type IIs restriction enzyme sites (BsaI, Esp3I, and BbsI) to make the sequence compatible with modular cloning workflows.

Figure: VectorBuilder Optimization Interface

Optimized Sequence (CAI: 0.99, GC Content: 59.05%):

TGCGGCCCGTGCTTTACCACCGATGCGAATATGGCGCGTAAATGCCGCGAATGCTGCGGCGGCATTGGCAAATGCTTTGGCCCGCAGTGCCTGTGCAATCGCATT
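
The reported GC content and the absence of the excluded Type IIS sites can be re-checked independently of VectorBuilder. This is a small sketch; the function names are mine, and the recognition sequences are the standard ones for BsaI, Esp3I, and BbsI.

```python
OPTIMIZED = (
    "TGCGGCCCGTGCTTTACCACCGATGCGAATATGGCGCGTAAATGCCGCGAATGCTGCGGC"
    "GGCATTGGCAAATGCTTTGGCCCGCAGTGCCTGTGCAATCGCATT"
)

# Recognition sequences (5'->3') of the Type IIS enzymes to avoid.
TYPE_IIS_SITES = {"BsaI": "GGTCTC", "Esp3I": "CGTCTC", "BbsI": "GAAGAC"}


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_percent(seq):
    return 100 * sum(base in "GC" for base in seq) / len(seq)


def find_sites(seq, sites):
    """Return {enzyme: [matching motifs]} for sites present on either strand."""
    hits = {}
    for enzyme, motif in sites.items():
        found = [m for m in (motif, reverse_complement(motif)) if m in seq]
        if found:
            hits[enzyme] = found
    return hits


print(f"GC content: {gc_percent(OPTIMIZED):.2f}%")
print("Type IIS sites found:", find_sites(OPTIMIZED, TYPE_IIS_SITES) or "none")
```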

Figure: Optimized DNA Results

3.4. You have a sequence! Now what?

Once the optimized gene is synthesized, the peptide can be produced using a hybrid approach that combines both cell-dependent and cell-free systems.

Cell-Dependent Recombinant Expression

In a traditional approach, the optimized gene is cloned into an expression vector (like pET-28a) under a strong promoter and transformed into E. coli.

  • Transcription: DNA is transcribed into mRNA by RNA polymerase.
  • Translation: mRNA is translated by ribosomes into the peptide.
  • Purification: The peptide is purified using chromatographic methods (e.g., IMAC via a His-tag).

Cell-Free Expression Systems

Alternatively, the optimized DNA can be directly introduced into a cell-free transcription–translation system. This offers:

  • Rapid prototyping and reduced toxicity constraints.
  • Precise control over reaction conditions.
  • Compatibility with high-throughput screening and automated cloud-lab paradigms.

This hybrid approach combines the scalability of living cells with the flexibility needed to produce potentially toxic bioactive peptides like BmK CT.

Part 4: Prepare a Twist DNA Synthesis Order (Practice)

In this practice exercise, I simulated the workflow for ordering a synthetic gene, moving from a custom expression cassette design in Benchling to a clonal gene construct on the Twist Bioscience portal.

Step 1: Account Creation

I successfully set up accounts on Twist Bioscience and Benchling to facilitate DNA design and synthesis simulation. ✅

Step 2: Build Your DNA Insert Sequence

I designed a complete Expression Cassette for my codon-optimized BmK CT sequence, optimized for an E. coli expression system. Using Benchling, I sequentially assembled the following components into a single linear DNA sequence:

  • Promoter: TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC
  • RBS: CATTAAAGAGGAGAAAGGTACC
  • Start Codon: ATG
  • Coding Sequence (CDS): Codon-optimized BmK CT
  • 7x His Tag: CATCACCATCACCATCATCAC (to enable protein purification)
  • Stop Codon: TAA
  • Terminator: CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA
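
The same assembly can be reproduced outside Benchling as a consistency check. The sketch below concatenates the parts listed above in order and reports their coordinates; the part labels and helper name are mine, and the CDS is the codon-optimized sequence from Part 3.

```python
PARTS = [
    ("promoter", "TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC"),
    ("rbs", "CATTAAAGAGGAGAAAGGTACC"),
    ("start", "ATG"),
    ("cds", "TGCGGCCCGTGCTTTACCACCGATGCGAATATGGCGCGTAAATGCCGCGAATGCTGCGGC"
            "GGCATTGGCAAATGCTTTGGCCCGCAGTGCCTGTGCAATCGCATT"),
    ("his_tag", "CATCACCATCACCATCATCAC"),
    ("stop", "TAA"),
    ("terminator", "CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTT"
                   "GTTTGTCGGTGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCT"
                   "GCGTTTATA"),
]


def annotate(parts):
    """Return {name: (start, end)} with 1-based, inclusive coordinates."""
    features, position = {}, 0
    for name, seq in parts:
        features[name] = (position + 1, position + len(seq))
        position += len(seq)
    return features


cassette = "".join(seq for _, seq in PARTS)
features = annotate(PARTS)
# The open reading frame runs from the start codon through the stop codon.
orf = cassette[features["start"][0] - 1:features["stop"][1]]

for name, (start, end) in features.items():
    print(f"{name:<10} {start:>4}-{end:<4}")
print("cassette length:", len(cassette), "bp; ORF in frame:", len(orf) % 3 == 0)
```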

Linear Map: View on Benchling

Figure: BmK CT Linear Sequence

Step 3: Simulation on Twist Bioscience

I simulated the ordering process by selecting the “Genes” category and choosing the “Clonal Genes” option. Unlike gene fragments, clonal genes arrive already inserted into a circular vector, which allows for direct transformation into E. coli.

Figure: Twist Genes Selection

Figure: Twist Clonal Genes Option

  1. Importing Sequence: I uploaded the FASTA file of the Benchling cassette.
  2. Vector Selection: I chose the pTwist Amp High Copy cloning vector as the circular backbone.
  3. Export: I downloaded the final construct as a GenBank (.gb) file for verification.

Figure: Twist Vector Selection

Figure: Download GenBank Sequence

Recombinant Vector Overview: Below is the visualization of the final construct as it appears in the Twist portal:

Figure: Twist Construct Visualization

Step 4: Final Plasmid Verification in Benchling

After re-importing the Twist-generated GenBank file back into Benchling, I verified the final circular plasmid containing my custom expression cassette.

Figure: Final Recombinant Plasmid Map

Part 5: DNA Read/Write/Edit

5.1 DNA Read

(i) Target for Sequencing

I want to sequence genomic DNA from the venom glands of Northern Pakistani scorpions (specifically species related to Mesobuthus martensii). Rationale: While BmK CT is documented, local Pakistani species might harbor unique genetic variants with higher therapeutic potency or different binding affinities for glioma cells. Sequencing this DNA supports the exploration of local biodiversity and bioprospecting for novel drug precursors.

(ii) Sequencing Technology: Oxford Nanopore (ONT)

I have chosen Oxford Nanopore Technologies (ONT), a third-generation, single-molecule sequencing platform.

Input & Preparation:

  • Material: High Molecular Weight (HMW) genomic DNA from venom gland tissue.
  • Steps: 1. Optional fragmentation (kept long for ONT), 2. End-repair and A-tailing, 3. Adapter ligation with motor proteins, and 4. Tethering to the flow cell membrane.

Mechanism & Base Calling:

  • DNA strands pass through a protein nanopore, causing characteristic disruptions in the ionic current.
  • AI/ML algorithms decode these electrical signals into base sequences (A, T, G, C) in real-time.

Output:

  • FASTQ files containing long-read sequences, ideal for variant analysis and de novo assembly.
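
A first look at such output usually starts with read-length statistics. The sketch below parses FASTQ text and computes the read N50; it assumes unwrapped four-line records, and the three toy reads are made-up placeholders, not real data.

```python
import io

# Toy records for illustration only (sequence and quality lines are invented).
TOY_FASTQ = (
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nACGT\n+\nIIII\n"
    "@read3\nAC\n+\nII\n"
)


def read_lengths(handle):
    """Collect sequence lengths, assuming four lines per FASTQ record."""
    return [len(line.strip()) for i, line in enumerate(handle) if i % 4 == 1]


def n50(lengths):
    """Length L such that reads of length >= L hold at least half the bases."""
    total, running = sum(lengths), 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


lengths = read_lengths(io.StringIO(TOY_FASTQ))
print("reads:", len(lengths), "total bases:", sum(lengths), "N50:", n50(lengths))
```

For a real run, the same functions could be applied to an opened FASTQ file instead of the in-memory string.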

5.2 DNA Write

(i) Target for Synthesis

I want to synthesize a codon-optimized genetic construct for expressing the BmK CT peptide in E. coli. Rationale: Synthesis allows the inclusion of a T7 promoter for high-level expression and a 6xHis-tag for streamlined purification. This is a direct therapeutic application for glioma drug discovery.

(ii) Synthesis Technology: Silicon-based Phosphoramidite Synthesis

I would utilize the technology implemented by Twist Bioscience.

Essential Steps:

  1. Phosphoramidite chemistry performed on a high-density silicon chip.
  2. Computer-controlled delivery of reagents to thousands of discrete sites on the chip, adding nucleotides one at a time.
  3. Short synthesized oligos are harvested and assembled into the full-length gene.

Limitations:

  • Error rates can increase with length, necessitating sequence verification.
  • Large constructs (>3 kb) are significantly more challenging to synthesize directly.

5.3 DNA Edit

(i) Target for Editing

I want to edit the genome of E. coli expression strains (e.g., BL21) to enhance the secretion of BmK CT. Rationale: Recombinant peptides normally accumulate in the cytoplasm, or in the periplasm when fused to a signal peptide; by editing secretion pathways or outer membrane proteins, the peptide could be secreted directly into the medium, simplifying downstream purification.

(ii) Editing Technology: CRISPR-Cas9

Mechanism & Essential Steps:

  1. Recognition: A custom Guide RNA (gRNA) targets the specific genomic sequence.
  2. Cleavage: The Cas9 enzyme generates a Double-Strand Break (DSB).
  3. Repair: Homology-Directed Repair (HDR) inserts the desired mutation using a provided donor template (in E. coli, typically assisted by a recombineering system such as λ-Red).

Preparation & Input:

  • Design: Designing gRNA via Benchling to minimize off-target effects.
  • Inputs: Cas9 enzyme/plasmid, custom gRNA, donor DNA template, and competent cells.
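
The first step of gRNA design, listing candidate target sites, can be sketched in a few lines. The example below scans both strands for 20-nt protospacers followed by an NGG PAM (the SpCas9 PAM); the toy sequence is hypothetical, and off-target scoring, which a tool like Benchling also performs, is not attempted here.

```python
import re


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_protospacers(seq, spacer_len=20):
    """List (strand, start, protospacer, pam) for every NGG PAM.

    Start is 0-based on the scanned strand, so minus-strand positions
    refer to the reverse complement of the input.
    """
    hits = []
    pattern = rf"(?=([ACGT]{{{spacer_len}}})([ACGT]GG))"  # lookahead: overlaps allowed
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for match in re.finditer(pattern, s):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


# Hypothetical toy target, not a real E. coli locus.
TOY_TARGET = "ATGCATGCATGCATGCATGCAGGTTT"

for strand, start, spacer, pam in find_protospacers(TOY_TARGET):
    print(strand, start, spacer, pam)
```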

Limitations:

  • Efficiency: HDR efficiency can be low in certain bacterial strains.
  • Precision: Potential for off-target effects requires careful validation.

Sources & Acknowledgments

I used a combination of published literature, AI tools (including ChatGPT and Google Gemini), and discussions with peers to compile and refine this assignment.

Week 3 HW: Lab Automation

Assignment: Python Script for Opentrons Artwork

This section documents my process of creating bio-art of the Markhor, Pakistan’s national animal, using the Opentrons OT-2 liquid handling robot.

1. Image Generation

I started by using Google Gemini to generate a high-quality image of a Markhor with a clean white background to serve as the reference for my design.

Figure: Markhor Original

2. Pixel Art Conversion

I then asked the AI to convert the high-resolution image into pixel art, which is necessary because the robot dispenses discrete “dots” of bacteria.

Figure: Markhor Pixel Art

3. Coordinate Generation via GUI

I uploaded the pixel art to the Opentrons Art GUI. The initial generation included many colors, so I tweaked the design to limit it to three primary fluorescent colors: Blue, Green, and Red. During this stage, I also manually added the Markhor’s ears, which were not correctly captured in the original AI image.

Figure: Opentrons GUI Preview

4. Python Scripting & Debugging

Using the Echo CSV coordinates from the GUI, I utilized Google Gemini (both the web interface and the integrated assistant in the HTGAA Colab) to write the final Python script.

The process involved significant debugging due to:

  • Color Swapping: Adjusting the mapping to ensure the robot pulled the correct color from the source wells (A1, B1, C1).
  • API Syntax: Fixing NameError and KeyError related to the labware setup and coordinate indexing in the simulator.

Finally, I successfully simulated the design, ensuring the Markhor pattern was centered and clearly defined.
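
The color-mapping and centering steps can be illustrated without the robot API. The following is a minimal sketch, not my final protocol script: the CSV column names (x_mm, y_mm, color), the toy points, and the assignment of colors to source wells A1, B1, and C1 are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Assumed mapping of colors to the source wells mentioned above.
SOURCE_WELLS = {"red": "A1", "green": "B1", "blue": "C1"}

# Hypothetical export format; the real GUI output may be laid out differently.
TOY_CSV = "x_mm,y_mm,color\n0,0,red\n10,0,green\n10,20,blue\n0,20,red\n"


def load_points(csv_text):
    """Group (x, y) points by color, rejecting colors without a source well."""
    by_color = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        color = row["color"].strip().lower()
        if color not in SOURCE_WELLS:
            raise KeyError(f"no source well defined for color {color!r}")
        by_color[color].append((float(row["x_mm"]), float(row["y_mm"])))
    return dict(by_color)


def center_points(by_color):
    """Shift all points so the bounding box of the design is centered on (0, 0)."""
    xs = [x for pts in by_color.values() for x, _ in pts]
    ys = [y for pts in by_color.values() for _, y in pts]
    cx, cy = (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2
    return {c: [(x - cx, y - cy) for x, y in pts] for c, pts in by_color.items()}


centered = center_points(load_points(TOY_CSV))
for color, points in centered.items():
    print(f"{color} from well {SOURCE_WELLS[color]}: {points}")
```

Validating the color names up front turns a silent color swap into an explicit error before any liquid is moved.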

Figure: Google Colab Simulation

Post-Lab Questions: Research & Automation Strategy

1. Published Research Paper Analysis

I have reviewed the following paper regarding advanced laboratory automation:

Paper Name: Automating life science labs at the single-cell level through precise ultrasonic liquid sample ejection: PULSE

Link: https://doi.org/10.1038/s41378-024-00798-y

General Overview

  • The paper introduces PULSE (Precise Ultrasonic Liquid Sample Ejection), a novel laboratory automation platform designed for biological experiments at the single-cell level.
  • While current lab automation excels at bulk samples, it often lacks the precision and biocompatibility required for single-cell work, which can mask critical cellular heterogeneity.
  • PULSE overcomes these limitations by using focused acoustic waves to eject nanoliter-sized droplets containing reagents or individual cells from a compact, disposable chip.
  • This technology transforms standard titer plates into high-density microdroplet arrays, allowing researchers to programmably print “nanodrop pixels” with specific numbers of cells.
  • The platform integrates software that synchronizes over 100 parameters, enabling automated and scalable workflows.

Major Applications

The authors demonstrate the versatility of PULSE across three major areas:

  1. Bio-fabrication: Precise deposition of different cell types to create 3D spheroids and patterned constructs.
  2. Precision Gating: Isolating single cells in nanodrops to track behavior over time (e.g., response to drugs) and retrieving them for downstream analysis.
  3. Deterministic Array Barcoding: Directly linking a single cell’s observable phenotype (via imaging) to its genotype (via sequencing) using pre-allocated, addressable primers.

Key Findings

  • High Precision: The platform can print single cells at speeds of 5-20 cells per second with an accuracy of 90.5-97.7%.
  • Biocompatibility: RNA sequencing revealed that the acoustic ejection process causes no significant differential gene regulation up to 24 hours post-treatment, outperforming FACS.
  • Accuracy: In deterministic barcoding experiments, the platform achieved a 95.6% accuracy in matching cell fluorescent phenotypes to genotypes.

2. Final Project Ideas

I am exploring three core directions for my final project, focusing on recombinant production, metabolic engineering, and industrial biocatalysis.

Idea 1: Recombinant Production of BmK CT Peptide for Glioma Therapy

  • The Concept: To engineer E. coli K-12 for the high-yield production of the 35-residue mature BmK CT scorpion peptide.
  • The Approach: Utilizing Twist Bioscience for codon-optimized gene synthesis and pET-series vectors for efficient recombinant expression.
  • Impact: Establishing a scalable bio-manufacturing pipeline for a peptide with high affinity for brain tumor cells to facilitate further oncological research.

Idea 2: CryoDesign Yak: AI-Engineered Cold-Active Enzymes

  • The Concept: Harnessing the unique metagenomic diversity of the Himalayan Yak microbiome to engineer psychrophilic enzymes for energy-efficient industrial bioprocessing.
  • The Approach: Utilizing ESM-2 Transformers for structural motif identification and ProteinMPNN for inverse folding to optimize catalytic loop flexibility.
  • Impact: Enabling room-temperature industrial catalysis (15°C–25°C) to slash heating energy demand, contributing to the EU Net Zero 2050 goal.

Idea 3: Automated Bio-manufacturing of Bryostatin-1 Precursors

  • The Concept: Engineering a microbial chassis for the sustainable production of precursors of Bryostatin-1, a potent marine-derived anti-cancer compound.
  • The Approach: Implementing the biosynthetic gene cluster into a laboratory-friendly host and using Opentrons automation for screening optimal metabolic flux.
  • Impact: Transitioning from destructive marine harvesting to a sustainable, automated bio-manufacturing paradigm.

Lab Automation Strategies

Automation Strategy for Idea 2: CryoDesign Yak (Cold-Active Enzymes)

My automation strategy focuses on a Closed-Loop ‘Design-Build-Test’ cycle for validating AI-designed cold-adapted enzymes. Precision in thermal management is the primary constraint for this project.

  • Build (Ginkgo Nebula):
    • I will utilize the Echo Acoustic Liquid Handler to dispense nanoliter volumes of synthetic DNA variants into 384-well plates.
    • This will be coupled with the Bravo Automated Liquid Handling Platform to stamp in chilled cell-free protein synthesis (CFPS) reagents.
  • Test (Local Opentrons):
    • Since psychrophilic enzymes are heat-labile, I will use a 3D-printed chilled-block holder (maintained at 10°C) on the Opentrons deck.
  • Process Flow:
    • Opentrons P20 will perform high-accuracy dispensing of chromogenic substrates into the expressed enzyme variants.
    • The Inheco Thermoshaker module will be used for controlled incubation at ambient ‘cold-start’ temperatures (15°C).
    • Data will be captured via an integrated plate reader to measure the kinetic rate ($\Delta$ Abs / $\Delta$ time).
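
The kinetic rate could be extracted from the plate-reader trace with an ordinary least-squares slope. This is a sketch under simple assumptions: the readings fall in the linear phase of the reaction, and the time and absorbance values below are invented placeholders, not measurements.

```python
def initial_rate(times_s, absorbances):
    """Least-squares slope of absorbance versus time (delta Abs / delta time)."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, absorbances))
    return sxy / sxx


# Placeholder readings for one well (seconds, absorbance units).
toy_times = [0, 30, 60, 90]
toy_abs = [0.10, 0.16, 0.22, 0.28]

rate = initial_rate(toy_times, toy_abs)
print(f"rate: {rate:.4f} Abs/s")
```

Fitting all points, instead of taking the difference between the first and last reading, makes the estimate less sensitive to noise in any single measurement.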

Automation Strategy for Idea 3: Bio-manufacturing of Bryostatin-1 Precursors

This strategy aims to automate the metabolic engineering and screening of Bryostatin-1 precursors in a laboratory-friendly microbial host. Since the biosynthetic gene cluster (BGC) is large and complex, automation is essential to find the optimal expression balance.

  • Design & Build (Ginkgo Nebula):
    • I will utilize the Echo Acoustic Liquid Handler to assemble combinatorial libraries of BGC components, including promoters, RBS, and biosynthetic modules.
    • The Bravo Automated Liquid Handling Platform will be used for rapid transformation and plating of the engineered chassis.
  • Test & Optimize (Local Opentrons):
    • I will use the Opentrons OT-2 to automate the metabolic screening of small-molecule precursors.
  • Process Flow:
    • Opentrons P300 will handle the precise serial dilution of feeding precursors (e.g., acetate/propionate units) into the culture media.
    • The Heater-Shaker Module will be programmed for automated induction and micro-fermentation at optimal temperatures.
    • Post-fermentation, the Opentrons will perform an automated extraction protocol (solvent addition and supernatant collection) for downstream analysis via an integrated HPLC or plate reader.
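
The serial dilution volumes can be planned before writing the robot protocol. The sketch below is only an illustration: the 10 mM stock, 2-fold factor, and 100 µL diluent volume are hypothetical values, and the 20 to 300 µL window is the nominal range of the P300 pipette.

```python
P300_RANGE_UL = (20, 300)  # nominal volume range of the P300


def transfer_volume(diluent_ul, fold):
    """Volume to carry into each well so that (transfer + diluent) / transfer = fold."""
    return diluent_ul / (fold - 1)


def dilution_series(stock_conc, fold, steps):
    """Concentrations of the stock followed by successive fold-dilutions."""
    return [stock_conc / fold ** i for i in range(steps)]


def fits_pipette(volume_ul, volume_range=P300_RANGE_UL):
    low, high = volume_range
    return low <= volume_ul <= high


# Hypothetical plan: 10 mM precursor stock, 2-fold steps, 100 uL diluent per well.
volume = transfer_volume(diluent_ul=100, fold=2)
series = dilution_series(stock_conc=10, fold=2, steps=4)

print(f"transfer {volume:.0f} uL per step; within P300 range: {fits_pipette(volume)}")
print("concentrations (mM):", series)
```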