Affiliation: Committed Listener at BioClub Tokyo Node (HTGAA 2026)
“I used to be a computer person, but I’ve realized that the most interesting programming language is DNA. Biology is moving from a science to an engineering discipline.” — Jensen Huang, CEO of NVIDIA
🚀 My Mission: Scaling Cancer Research
As a student in Pakistan, I have witnessed the challenges of accessing high-end therapeutics. My research is focused on bridging the gap between advanced synthetic biology and affordable healthcare.
Core Interests:
Oncolytic Viral Therapy: Building on my experience from the iGEM Venture Creation Lab (2025).
Automated Biomanufacturing: Leveraging Cloud Labs to produce therapeutic peptides like BmK CT for Glioma research.
Genetic Diversity: My FYP on the genetic characterization of immune genes in domestic Yak from Northern Pakistan has given me a deep appreciation for local biological assets.
🛠 Skills & Toolbox
Wet Lab: PCR, DNA Assembly (Gibson), Bacterial Transformation.
In-Silico: DNA Design (Benchling), Protein Folding (AlphaFold), Metabolic Modeling.
Leadership: Outreach Team Lead at ISCB RSG-Pakistan.
🎯 HTGAA 2026 Goals
During this semester, I aim to:
Master Cloud-Lab protocols to decentralize drug production.
Design Neuromorphic genetic circuits for smarter therapeutics.
Advocate for Ethical Governance in biotech across developing nations.
Week 1 HW: Principles and Practices
Homework 1: Governance & Ethics
1. Biological Engineering Application
Project: Recombinant production of BmK CT (Scorpion Peptide) via Cloud-Lab Automation.
Why: Venom-derived therapeutics are currently difficult to source sustainably. My goal is to engineer a microbial “cell factory” using E. coli to produce high-purity BmK CT for glioma (brain cancer) research, utilizing automated cloud-lab infrastructure for scalable access.
2. Governance/Policy Goals
My primary goal is to ensure Non-maleficence (preventing harm) while promoting Constructive Use.
Sub-goal A (Biosecurity): Preventing the diversion of synthesized neurotoxin sequences for harmful, non-therapeutic purposes.
Sub-goal B (Equity): Ensuring that the digital blueprints and production protocols are accessible to researchers in developing regions like Pakistan to promote autonomy.
3. Governance Actions Matrix
| Aspect | Action 1: Automated Screening | Action 2: User Verification | Action 3: Ethical Peer-Review |
| --- | --- | --- | --- |
| Actor | DNA Synthesis Companies | Cloud Lab Platforms | Academic Communities |
| Purpose | Flags regulated toxin sequences. | Mandatory ID for remote users. | Standardizing “Dual-Use” data sharing. |
| Assumptions | All toxins are indexed. | Affiliation equals ethics. | Users will follow protocols. |
| Risks | False positives slow research. | Excludes independent scientists. | Security breaches are possible. |
4. Scoring Matrix
(1 = Best, 3 = Least)
| Does the option: | Option 1 | Option 2 | Option 3 |
| --- | --- | --- | --- |
| Enhance Biosecurity | 1 | 1 | 2 |
| Foster Lab Safety | 2 | 1 | 3 |
| Not Impede Research | 2 | 3 | 1 |
| Promote Constructive App | 1 | 1 | 1 |
5. Prioritization & Recommendation
I prioritize a combination of Action 1 and Action 2. In the scoring matrix these two options tie for the lowest (best) total, 6 each versus 7 for Option 3, and both rank first on biosecurity. This recommendation is intended for International Regulatory Bodies. We must implement “Know Your Customer” (KYC) rigor for Cloud Labs, similar to financial institutions. While this adds a burden to remote researchers, it is a necessary trade-off for handling potent neurotoxins safely.
6. Ethical Reflection
The concept of an “Information Hazard” was a significant new concern for me. Even if my physical lab work is safe, publishing a “perfect roadmap” for toxin production could be misused. I propose “Ethical Red-Teaming” as a governance action where students peer-review project documentation for potential dual-use risks before publication.
Week 2 Pre-Lecture Preparation
In preparation for the “DNA Read, Write, and Edit” lecture.
Part 1: Questions from Professor Jacobson
What is the error rate of polymerase? The error rate of DNA polymerase is approximately $10^{-7}$ to $10^{-8}$ per base pair.
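How does this compare to the length of the human genome? The human genome is about 3 billion ($3 \times 10^9$) base pairs long. At an error rate of $10^{-7}$ per base pair, a single round of copying would introduce on the order of $3 \times 10^9 \times 10^{-7} = 300$ errors, meaning mutations are inevitable without repair.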
How does biology deal with that discrepancy? Biology utilizes Proofreading and Mismatch Repair (MMR) systems to reduce the final error rate to $10^{-9}$ or $10^{-10}$.
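How many different ways are there to code for an average human protein? Due to Codon Degeneracy, most amino acids are encoded by two to six synonymous codons (about three on average), so a protein of $N$ residues can be encoded by roughly $3^{N}$ different DNA sequences. For a protein of a few hundred residues this is an astronomically large number, far more than millions.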
Why don’t all of these different codes work? Factors like Codon Usage Bias, mRNA secondary structures, and cryptic splice sites can hinder protein expression.
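To make the scale of codon degeneracy concrete, here is a minimal Python sketch that counts the synonymous DNA encodings of a peptide using the standard genetic code. The function name `count_codings` is hypothetical, and the example input is the 35-residue BmK CT peptide quoted later on this page; the count ignores every practical constraint (codon bias, secondary structure, splice sites) mentioned above.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
# (the 20 values sum to the 61 sense codons).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_codings(protein: str) -> int:
    """Return how many distinct DNA sequences encode the given protein."""
    return prod(CODON_COUNTS[aa] for aa in protein.upper())


# Mature BmK CT peptide (35 residues), as quoted in the Week 2 homework.
peptide = "CGPCFTTDANMARKCRECCGGIGKCFGPQCLCNRI"
print(f"{count_codings(peptide):.3e} possible coding sequences")
```

Even for this short peptide the count is far beyond millions, and it grows multiplicatively with every added residue.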
Part 2: Questions from Dr. LeProust
Most common method for oligo synthesis: The Phosphoramidite method is the current standard.
Why is it difficult to make oligos > 200nt? Cumulative coupling inefficiencies lead to extremely low yields for long, pure sequences.
Why can’t you make a 2000bp gene via direct synthesis? The error rate and yield drop make direct synthesis of long genes unfeasible; they must be assembled from shorter oligos.
The “Lysine Contingency”: This is a Biocontainment strategy where organisms are engineered to depend on external amino acids to prevent survival outside the lab environment.
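The effect of cumulative coupling inefficiency on long oligos can be sketched with a short calculation. This is a simplified model, assuming every coupling step succeeds independently with the same probability; the 99% efficiency used below is an illustrative assumption, not a measured value, and `full_length_yield` is a hypothetical helper name.

```python
def full_length_yield(length_nt: int, coupling_efficiency: float) -> float:
    """Fraction of strands that reach full length in stepwise synthesis.

    A strand of `length_nt` bases needs (length_nt - 1) coupling steps, and
    each step is assumed to succeed with probability `coupling_efficiency`.
    """
    if length_nt < 1:
        raise ValueError("length_nt must be at least 1")
    return coupling_efficiency ** (length_nt - 1)


# Compare a primer-sized oligo, a long oligo, and a gene-sized strand.
for n in (20, 100, 200, 2000):
    print(f"{n:>5} nt: {full_length_yield(n, 0.99):.2e} full-length fraction")
```

Under this model the full-length fraction falls off exponentially with length, which is why gene-sized constructs are assembled from shorter oligos rather than synthesized in one pass.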
Week 2 HW: DNA Read, Write, and Edit
Part 1: Benchling & In-silico Gel Art
For this assignment, I performed a virtual restriction digest on Lambda DNA using Benchling. My creative goal was to produce a pattern that mirrors the anatomy of a scorpion, inspired by my final project on the BmK CT scorpion peptide.
Creative Vision: The Genetic Scorpion 🦂
To achieve this symmetrical design, I strategically assigned different enzymes and double-digests across 7 lanes:
Lanes 1 & 7 (The Pincers): I used SalI to create distinct high-molecular-weight bands.
Lanes 2 & 6 (The Legs): A double digest of KpnI and SacI created a wider spread of bands, mimicking the scorpion’s legs.
Lanes 3 & 5 (The Torso): Combining BamHI and EcoRI resulted in a dense cluster of bands to represent the main body segments.
Lane 4 (The Stinger): I used EcoRV, which cuts Lambda DNA at 21 sites (giving 22 fragments from the linear genome), creating a solid vertical pillar in the center that represents the scorpion’s iconic stinger.
Methodology
Sequence: Imported the Lambda DNA (48,502 bp) sequence into Benchling.
Analysis: Used the Restriction Analysis tool to map sites for EcoRI, HindIII, BamHI, KpnI, EcoRV, SacI, and SalI.
Simulation: Conducted virtual gel electrophoresis using a NEB 2-Log Ladder as a size reference.
Result: The final digest achieved a balanced, symmetrical pattern that visually aligns with my biotech research interests.
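The restriction analysis described above can be approximated outside Benchling with a few lines of Python. This is only a sketch: it uses a short hypothetical toy sequence rather than the 48,502 bp Lambda genome, treats the DNA as linear, and the helper names `cut_positions` and `digest_linear` are my own. The recognition sites and cut offsets are those of the seven enzymes listed in the methodology.

```python
import re

# Recognition site and top-strand cut offset for each enzyme.
# All seven sites are palindromic, so scanning one strand is sufficient.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "KpnI": ("GGTACC", 5),     # GGTAC^C
    "EcoRV": ("GATATC", 3),    # GAT^ATC (blunt)
    "SacI": ("GAGCTC", 5),     # GAGCT^C
    "SalI": ("GTCGAC", 1),     # G^TCGAC
}


def cut_positions(seq, enzymes):
    """Return sorted top-strand cut positions for the chosen enzymes."""
    positions = set()
    for name in enzymes:
        site, offset = ENZYMES[name]
        # A lookahead pattern also finds overlapping occurrences.
        for match in re.finditer(f"(?={site})", seq.upper()):
            positions.add(match.start() + offset)
    return sorted(positions)


def digest_linear(seq, enzymes):
    """Return fragment lengths (largest first) for a linear molecule."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


# Hypothetical toy sequence with one EcoRI site and one BamHI site.
toy = "AAAA" + "GAATTC" + "CCCCCCCC" + "GGATCC" + "TT"
print(digest_linear(toy, ["EcoRI", "BamHI"]))
```

Running a double digest is simply a matter of passing both enzyme names, which mirrors how lanes 2, 3, 5, and 6 were built.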
Part 3: DNA Design Challenge
3.1. Choose your protein
The protein I chose for this assignment is BmK CT, a chlorotoxin-like peptide derived from the venom of the Chinese scorpion Olivierus martensii (previously called Mesobuthus martensii). It is a small bioactive peptide which functions primarily as a targeted anti-tumor agent against glioma cells, with key roles in inhibiting cell migration, invasion, and proliferation.
What makes BmK CT especially promising is that it can promote apoptosis and enhance the sensitivity of glioma cells to chemotherapeutic agents like temozolomide. Importantly, studies suggest that it exhibits minimal toxicity toward normal glial cells, highlighting its potential as a selective therapeutic candidate for high-grade gliomas.
The sequence was obtained from UniProt (Entry: Q9UAD0). I focused on the biologically active mature peptide consisting of 35 amino acids:
CGPCFTTDANMARKCRECCGGIGKCFGPQCLCNRI
This peptide is cysteine-rich (8 of its 35 residues are cysteines), suggesting the formation of multiple disulfide bonds (up to four) that stabilize its tertiary structure.
3.2. Reverse Translate
Using the Reverse Translate tool from the Sequence Manipulation Suite, the following 105 bp DNA sequence (35 × 3 nucleotides) was generated:
When expressing a foreign gene in a host organism, it’s important to consider codon usage bias. This refers to the preference of the host to translate certain synonymous codons more efficiently than others. Codon optimization replaces some codons in the sequence with those preferred by the host, without changing the amino acid sequence, to ensure efficient translation.
For BmK CT, I chose Escherichia coli (K-12 strain MG1655) as the host because it is a standard, well-characterized chassis in synthetic biology that grows rapidly and is easy to handle. Using the Codon Optimization Tool from VectorBuilder, I optimized the sequence while avoiding Type IIs restriction enzyme sites (BsaI, Esp3I, and BbsI) to make the sequence compatible with modular cloning workflows.
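As a rough illustration of the two steps above (reverse translation and screening for Type IIS sites), here is a minimal Python sketch. It is not the VectorBuilder algorithm and does not reproduce the sequence ordered for this assignment: the one-codon-per-amino-acid table is an assumed, simplified set of commonly preferred E. coli codons, and the function names are hypothetical. The three recognition sequences are those of BsaI, Esp3I, and BbsI.

```python
# Assumed "one preferred codon per amino acid" table for E. coli.
# Real optimizers balance many more factors (GC content, repeats, structure).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}

TYPE_IIS_SITES = {"BsaI": "GGTCTC", "Esp3I": "CGTCTC", "BbsI": "GAAGAC"}


def reverse_translate(protein):
    """Map each residue to its (assumed) preferred codon."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())


def reverse_complement(dna):
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_type_iis_sites(dna):
    """Return (enzyme, strand, start) for each site found on either strand."""
    hits = []
    for name, site in TYPE_IIS_SITES.items():
        # These sites are not palindromic, so both orientations are checked.
        for strand, pattern in (("+", site), ("-", reverse_complement(site))):
            start = dna.find(pattern)
            while start != -1:
                hits.append((name, strand, start))
                start = dna.find(pattern, start + 1)
    return hits


peptide = "CGPCFTTDANMARKCRECCGGIGKCFGPQCLCNRI"
dna = reverse_translate(peptide)
print(len(dna), "bp;", "Type IIS sites:", find_type_iis_sites(dna))
```

If a site were found, one option would be to swap a codon inside it for a synonymous one and re-run the check.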
Once the optimized gene is synthesized, it can be produced using a hybrid approach that combines both cell-dependent and cell-free systems.
Cell-Dependent Recombinant Expression
In a traditional approach, the optimized gene is cloned into an expression vector (like pET-28a) under a strong promoter and transformed into E. coli.
Transcription: DNA is transcribed into mRNA by RNA polymerase.
Translation: mRNA is translated by ribosomes into the peptide.
Purification: The peptide is purified using chromatographic methods (e.g., IMAC via a His-tag).
Cell-Free Expression Systems
Alternatively, the optimized DNA can be directly introduced into a cell-free transcription–translation system. This offers:
Rapid prototyping and reduced toxicity constraints.
Precise control over reaction conditions.
Compatibility with high-throughput screening and automated cloud-lab paradigms.
This hybrid approach combines the scalability of cell-based expression with the flexibility of cell-free systems, which is particularly useful for potentially toxic bioactive peptides like BmK CT.
Part 4: Prepare a Twist DNA Synthesis Order (Practice)
In this practice exercise, I simulated the workflow for ordering a synthetic gene, moving from a custom expression cassette design in Benchling to a clonal gene construct on the Twist Bioscience portal.
Step 1: Account Creation
I successfully set up accounts on Twist Bioscience and Benchling to facilitate DNA design and synthesis simulation. ✅
Step 2: Build Your DNA Insert Sequence
I designed a complete Expression Cassette for my codon-optimized BmK CT sequence, optimized for an E. coli expression system. Using Benchling, I sequentially assembled the following components into a single linear DNA sequence:
Promoter: TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC
RBS: CATTAAAGAGGAGAAAGGTACC
Start Codon: ATG
Coding Sequence (CDS): Codon-optimized BmK CT
7x His Tag: CATCACCATCACCATCATCAC (to enable protein purification)
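The sequential assembly of these parts can be sketched in Python as simple string concatenation followed by FASTA export. The part sequences are the ones listed above; the CDS below is a short hypothetical placeholder, not the optimized BmK CT sequence, and `build_cassette` and `to_fasta` are illustrative helper names. Only the parts listed above are joined, so nothing is appended after the His tag.

```python
PROMOTER = "TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC"
RBS = "CATTAAAGAGGAGAAAGGTACC"
START = "ATG"
HIS_TAG = "CATCACCATCACCATCATCAC"  # 7 histidine codons

# Hypothetical placeholder CDS (3 codons), NOT the author's optimized sequence.
CDS_PLACEHOLDER = "TGCGGCCCG"


def build_cassette(cds):
    """Join promoter, RBS, start codon, CDS, and His tag in that order."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3 to stay in frame")
    return PROMOTER + RBS + START + cds + HIS_TAG


def to_fasta(name, seq, width=70):
    """Format a sequence as FASTA text with fixed-width lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


cassette = build_cassette(CDS_PLACEHOLDER)
print(to_fasta("BmK_CT_cassette_sketch", cassette))
```

The in-frame check matters because the His tag is fused directly after the CDS; a CDS length that is not a multiple of three would shift the tag out of frame.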
I simulated the ordering process by selecting the “Genes” category and choosing the “Clonal Genes” option. Unlike gene fragments, clonal genes arrive already inserted into a circular vector, which allows for direct transformation into E. coli.
Importing Sequence: I uploaded the FASTA file of the Benchling cassette.
Vector Selection: I chose the pTwist Amp High Copy cloning vector as the circular backbone.
Export: I downloaded the final construct as a GenBank (.gb) file for verification.
Recombinant Vector Overview:
Below is the visualization of the final construct as it appears in the Twist portal:
Step 4: Final Plasmid Verification in Benchling
After re-importing the Twist-generated GenBank file back into Benchling, I verified the final circular plasmid containing my custom expression cassette.
Part 5: DNA Read/Write/Edit
5.1 DNA Read
(i) Target for Sequencing
I want to sequence genomic DNA from the venom glands of Northern Pakistani scorpions (specifically species related to Mesobuthus martensii).
Rationale: While BmK CT is documented, local Pakistani species might harbor unique genetic variants with higher therapeutic potency or different binding affinities for glioma cells. Sequencing this DNA helps explore local biodiversity and bioprospecting for novel drug precursors.
(ii) Sequencing Technology: Oxford Nanopore (ONT)
I have chosen Oxford Nanopore Technologies (ONT), a third-generation, single-molecule sequencing platform.
Input & Preparation:
Material: High Molecular Weight (HMW) genomic DNA from venom gland tissue.
Steps: 1. Optional fragmentation (kept long for ONT), 2. End-repair and A-tailing, 3. Adapter ligation with motor proteins, and 4. Tethering to the flow cell membrane.
Mechanism & Base Calling:
DNA strands pass through a protein nanopore, causing characteristic disruptions in the ionic current.
AI/ML algorithms decode these electrical signals into base sequences (A, T, G, C) in real-time.
Output:
FASTQ files containing long-read sequences, ideal for variant analysis and de novo assembly.
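A first look at such FASTQ output usually starts with read-length statistics. The sketch below is a minimal, dependency-free example assuming the common four-lines-per-record FASTQ layout; the in-memory reads are invented for illustration and the function names are hypothetical.

```python
import io


def read_lengths(handle):
    """Collect read lengths from a four-line-per-record FASTQ stream."""
    lengths = []
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality string
        if header.startswith("@"):
            lengths.append(len(sequence))
    return lengths


def n50(lengths):
    """Length L such that reads of length >= L hold at least half the bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


# Invented example records; a real run would open the sequencer's FASTQ file.
demo = io.StringIO(
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nACGTAC\n+\nIIIIII\n"
    "@read3\nACGT\n+\nIIII\n"
    "@read4\nAC\n+\nII\n"
)
lengths = read_lengths(demo)
print("reads:", len(lengths), "N50:", n50(lengths))
```

N50 is a common summary for long-read data because a simple mean is dominated by the many short reads.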
5.2 DNA Write
(i) Target for Synthesis
I want to synthesize a codon-optimized genetic construct for expressing the BmK CT peptide in E. coli.
Rationale: Synthesis allows the inclusion of a T7 promoter for high-level expression and a 6xHis-tag for streamlined purification. This is a direct therapeutic application for glioma drug discovery.
(ii) Synthesis Technology: Silicon-based High-throughput Synthesis
I would utilize the technology implemented by Twist Bioscience.
Essential Steps:
Phosphoramidite chemistry performed on a high-density silicon chip.
Nucleotides are added one at a time at thousands of discrete, addressable sites on the chip, so many different oligos are built in parallel.
Short synthesized oligos are harvested and assembled into the full-length gene.
Limitations:
Error rates can increase with length, necessitating sequence verification.
Large constructs (>3 kb) are significantly more challenging to synthesize directly.
5.3 DNA Edit
(i) Target for Editing
I want to edit the genome of E. coli expression strains (e.g., BL21) to enhance the secretion of BmK CT.
Rationale: Normally, recombinant peptides accumulate inside the cell (in the cytoplasm, or in the periplasm when fused to a signal peptide); by editing secretion pathways or outer membrane proteins, the peptide could be secreted directly into the medium, simplifying downstream purification.
(ii) Editing Technology: CRISPR-Cas9
Mechanism & Essential Steps:
Recognition: A custom Guide RNA (gRNA) targets the specific genomic sequence.
Cleavage: The Cas9 enzyme generates a Double-Strand Break (DSB).
Repair: Homology-Directed Repair (HDR) inserts the desired mutation using a provided donor template.
Preparation & Input:
Design: Designing gRNA via Benchling to minimize off-target effects.
Inputs: Cas9 enzyme/plasmid, custom gRNA, donor DNA template, and competent cells.
Limitations:
Efficiency: HDR efficiency can be low in certain bacterial strains.
Precision: Potential for off-target effects requires careful validation.
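The recognition step can be illustrated with a small search for candidate target sites. This sketch assumes the commonly used SpCas9 rule of a 20-nt protospacer followed by an NGG PAM, scans both strands, and does no off-target scoring (which the design tool would handle). The function names and the example sequence are hypothetical.

```python
import re


def reverse_complement(dna):
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_protospacers(seq, guide_len=20):
    """Return (strand, start, protospacer, pam) for every NGG PAM candidate.

    `start` is the index within the scanned strand, read 5' to 3'; for the
    "-" strand that means an index into the reverse complement of `seq`.
    """
    seq = seq.upper()
    # Lookahead with capture groups so overlapping candidates are reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


# Hypothetical target region containing a single candidate on the + strand.
region = "A" * 20 + "TGG" + "AAAA"
for hit in find_protospacers(region):
    print(hit)
```

Each candidate would still need to be checked against the rest of the genome before use, which is the off-target validation noted in the limitations.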
Sources & Acknowledgments
I used a combination of published literature, AI tools (including ChatGPT and Google Gemini), and discussions with peers to compile and refine this assignment.
Week 3 HW: Lab Automation
Assignment: Python Script for Opentrons Artwork
This section documents my process of creating bio-art of the Markhor, Pakistan’s national animal, using the Opentrons OT-2 liquid handling robot.
1. Image Generation
I started by using Google Gemini to generate a high-quality image of a Markhor with a clean white background to serve as the reference for my design.
2. Pixel Art Conversion
I then requested the AI to convert the high-resolution image into pixel art, which is necessary for the robot to dispense discrete “dots” of bacteria.
3. Coordinate Generation via GUI
I uploaded the pixel art to the Opentrons Art GUI. The initial generation included many colors, so I tweaked the design to limit it to three primary fluorescent colors: Blue, Green, and Red. During this stage, I also manually added the Markhor’s ears, which were not correctly captured in the original AI image.
Using the Echo CSV coordinates from the GUI, I utilized Google Gemini (both the web interface and the integrated assistant in the HTGAA Colab) to write the final Python script.
The process involved significant debugging due to:
Color Swapping: Adjusting the mapping to ensure the robot pulled the correct color from the source wells (A1, B1, C1).
API Syntax: Fixing NameError and KeyError related to the labware setup and coordinate indexing in the simulator.
Finally, I successfully simulated the design, ensuring the Markhor pattern was centered and clearly defined.
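The color mapping and centering steps can be prototyped in plain Python before touching the robot protocol. This is a sketch under stated assumptions: the CSV layout (`x`, `y`, `color` columns), the assignment of colors to wells A1, B1, and C1, and the helper names are all hypothetical, and the actual dispensing calls from the Opentrons protocol are deliberately left out.

```python
import csv
import io
from collections import defaultdict

# Assumed color-to-source-well mapping; swap entries if the colors come out wrong.
SOURCE_WELLS = {"red": "A1", "green": "B1", "blue": "C1"}


def load_points(csv_text):
    """Group (x, y) points by color from CSV text with x, y, color columns."""
    points = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        color = row["color"].strip().lower()
        if color not in SOURCE_WELLS:
            # Fail early with a clear message instead of a bare KeyError later.
            raise KeyError(f"no source well defined for color {color!r}")
        points[color].append((float(row["x"]), float(row["y"])))
    return dict(points)


def center_points(points):
    """Shift all points so the bounding box of the design is centered at (0, 0)."""
    xs = [x for pts in points.values() for x, _ in pts]
    ys = [y for pts in points.values() for _, y in pts]
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    return {c: [(x - cx, y - cy) for x, y in pts] for c, pts in points.items()}


# Invented three-dot design for illustration.
demo_csv = "x,y,color\n0,0,Red\n10,0,green\n10,20,BLUE\n"
centered = center_points(load_points(demo_csv))
for color, pts in centered.items():
    print(color, SOURCE_WELLS[color], pts)
```

Normalizing color names in one place and validating them against the well map is one way to catch the color-swapping and `KeyError` issues before simulation.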
Post-Lab Questions: Research & Automation Strategy
1. Published Research Paper Analysis
I have reviewed the following paper regarding advanced laboratory automation:
Paper Name: Automating life science labs at the single-cell level through precise ultrasonic liquid sample ejection: PULSE
The paper introduces PULSE (Precise Ultrasonic Liquid Sample Ejection), a novel laboratory automation platform designed for biological experiments at the single-cell level.
While current lab automation excels at bulk samples, it often lacks the precision and biocompatibility required for single-cell work, and bulk measurements can mask critical cellular heterogeneity.
PULSE overcomes these limitations by using focused acoustic waves to eject nanoliter-sized droplets containing reagents or individual cells from a compact, disposable chip.
This technology transforms standard titer plates into high-density microdroplet arrays, allowing researchers to programmably print “nanodrop pixels” with specific numbers of cells.
The platform integrates software that synchronizes over 100 parameters, enabling automated and scalable workflows.
Major Applications
The authors demonstrate the versatility of PULSE across three major areas:
Bio-fabrication: Precise deposition of different cell types to create 3D spheroids and patterned constructs.
Precision Gating: Isolating single cells in nanodrops to track behavior over time (e.g., response to drugs) and retrieving them for downstream analysis.
Deterministic Array Barcoding: Directly linking a single cell’s observable phenotype (via imaging) to its genotype (via sequencing) using pre-allocated, addressable primers.
Key Findings
High Precision: The platform can print single cells at speeds of 5-20 cells per second with an accuracy of 90.5-97.7%.
Biocompatibility: RNA sequencing revealed that the acoustic ejection process causes no significant differential gene regulation up to 24 hours post-treatment, outperforming FACS.
Accuracy: In deterministic barcoding experiments, the platform achieved a 95.6% accuracy in matching cell fluorescent phenotypes to genotypes.
2. Final Project Ideas
I am exploring three core directions for my final project, focusing on recombinant production, metabolic engineering, and industrial biocatalysis.
Idea 1: Recombinant Production of BmK CT Peptide for Glioma Therapy
The Concept: To engineer E. coli K-12 for the high-yield production of the 35-residue mature BmK CT scorpion peptide.
The Approach: Utilizing Twist Bioscience for codon-optimized gene synthesis and pET-series vectors for efficient recombinant expression.
Impact: Establishing a scalable bio-manufacturing pipeline for a peptide with high affinity for brain tumor cells to facilitate further oncological research.
Idea 2: CryoDesign Yak: AI-Engineered Cold-Active Enzymes
The Concept: Harnessing the unique metagenomic diversity of the Himalayan Yak microbiome to engineer psychrophilic enzymes for energy-efficient industrial bioprocessing.
The Approach: Utilizing ESM-2 Transformers for structural motif identification and ProteinMPNN for inverse folding to optimize catalytic loop flexibility.
Impact: Enabling room-temperature industrial catalysis (15°C–25°C) to slash heating energy demand, contributing to the EU Net Zero 2050 goal.
Idea 3: Automated Bio-manufacturing of Bryostatin-1 Precursors
The Concept: Engineering a microbial chassis for the sustainable production of precursors of Bryostatin-1, a potent marine-derived anti-cancer compound.
The Approach: Implementing the biosynthetic gene cluster into a laboratory-friendly host and using Opentrons automation for screening optimal metabolic flux.
Impact: Transitioning from destructive marine harvesting to a sustainable, automated bio-manufacturing paradigm.
Lab Automation Strategies
Automation Strategy for Idea 2: CryoDesign Yak (Cold-Active Enzymes)
My automation strategy focuses on a Closed-Loop ‘Design-Build-Test’ cycle for validating AI-designed cold-adapted enzymes. Precision in thermal management is the primary constraint for this project.
Build (Ginkgo Nebula):
I will utilize the Echo Acoustic Liquid Handler to dispense nanoliter volumes of synthetic DNA variants into 384-well plates.
This will be coupled with the Bravo Automated Liquid Handling Platform to stamp in chilled cell-free protein synthesis (CFPS) reagents.
Test (Local Opentrons):
Since psychrophilic enzymes are heat-labile, I will use a 3D-printed chilled-block holder (maintained at 10°C) on the Opentrons deck.
Process Flow:
Opentrons P20 will perform high-accuracy dispensing of chromogenic substrates into the expressed enzyme variants.
The Inheco Thermoshaker module will be used for controlled incubation at ambient ‘cold-start’ temperatures (15°C).
Data will be captured via an integrated plate reader to measure the kinetic rate ($\Delta \mathrm{Abs} / \Delta t$).
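The kinetic rate from the plate reader can be estimated as the least-squares slope of absorbance against time. The sketch below assumes the readings fall in the linear (initial-rate) part of the reaction; the data points are invented for illustration and `kinetic_rate` is a hypothetical helper name.

```python
def kinetic_rate(times_s, absorbances):
    """Least-squares slope of absorbance vs. time (absorbance units per second)."""
    n = len(times_s)
    if n != len(absorbances) or n < 2:
        raise ValueError("need at least two paired time/absorbance readings")
    mean_t = sum(times_s) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, absorbances))
    return sxy / sxx


# Invented readings for one well, taken every 30 seconds.
times = [0, 30, 60, 90]
readings = [0.10, 0.16, 0.22, 0.28]
print(f"rate = {kinetic_rate(times, readings):.4f} Abs/s")
```

Fitting a slope across several reads is generally less sensitive to noise in a single measurement than taking the difference between only the first and last points.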
Automation Strategy for Idea 3: Bio-manufacturing of Bryostatin-1 Precursors
This strategy aims to automate the metabolic engineering and screening of Bryostatin-1 precursors in a laboratory-friendly microbial host. Since the biosynthetic gene cluster (BGC) is large and complex, automation is essential to find the optimal expression balance.
Design & Build (Ginkgo Nebula):
I will utilize the Echo Acoustic Liquid Handler to assemble combinatorial libraries of BGC components, including promoters, RBS, and biosynthetic modules.
The Bravo Automated Liquid Handling Platform will be used for rapid transformation and plating of the engineered chassis.
Test & Optimize (Local Opentrons):
I will use the Opentrons OT-2 to automate the metabolic screening of small-molecule precursors.
Process Flow:
Opentrons P300 will handle the precise serial dilution of feeding precursors (e.g., acetate/propionate units) into the culture media.
The Heater-Shaker Module will be programmed for automated induction and micro-fermentation at optimal temperatures.
Post-fermentation, the Opentrons will perform an automated extraction protocol (solvent addition and supernatant collection) for downstream analysis via an integrated HPLC or plate reader.
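Before writing the robot protocol, the serial dilution in the process flow can be planned numerically. This is a minimal sketch: the stock concentration, dilution factor, and well volume are invented example values, the 20 to 300 µL transfer range is an assumed working range for a P300 pipette, and `plan_serial_dilution` is a hypothetical helper name.

```python
def plan_serial_dilution(stock_conc, dilution_factor, steps, well_volume_ul):
    """Return per-step transfer volume, diluent volume, and concentration.

    Each well is pre-filled with diluent, then receives `transfer_ul` from the
    previous well (or the stock), so that transfer + diluent = well_volume_ul.
    """
    if dilution_factor <= 1:
        raise ValueError("dilution_factor must be greater than 1")
    transfer_ul = well_volume_ul / dilution_factor
    diluent_ul = well_volume_ul - transfer_ul
    # Assumed P300 working range; adjust for the pipette actually installed.
    if not 20 <= transfer_ul <= 300:
        raise ValueError(f"transfer of {transfer_ul} uL is outside the assumed range")
    plan = []
    conc = stock_conc
    for step in range(1, steps + 1):
        conc /= dilution_factor
        plan.append(
            {
                "step": step,
                "transfer_ul": transfer_ul,
                "diluent_ul": diluent_ul,
                "conc": conc,
            }
        )
    return plan


# Example: 100 mM stock, 2-fold dilutions, 3 steps, 200 uL per well.
for row in plan_serial_dilution(100.0, 2, 3, 200):
    print(row)
```

The resulting table of volumes can then be translated into aspirate, dispense, and mix steps in the protocol, one row per destination well.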