Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices

    This week emphasized that biological engineering is not only about what we can build, but how and why we choose to build it. The lectures and recitation highlighted that ethics, safety, and governance should not be treated as external constraints applied after a technology is developed, but rather as integral design dimensions from the earliest stages of a project.

  • Week 2 HW: DNA Read, Write, & Edit

    Part 0 — Gel Electrophoresis Basics (Concepts) This week, I reviewed how gel electrophoresis turns a DNA “mixture” into an interpretable pattern. In an agarose gel, DNA fragments migrate toward the positive electrode because DNA is negatively charged, and smaller fragments travel farther through the gel matrix than larger ones. A DNA ladder provides a size reference so unknown bands can be estimated in base pairs. When a restriction enzyme digest is performed, the DNA sequence is converted into a predictable set of fragment lengths, and those fragments appear as bands at specific positions. Band brightness is roughly related to how much DNA mass is in that fragment (longer fragments can look brighter if molar amounts are similar). Overall, the key idea is that restriction digests plus gels let you “read out” a cutting pattern, validate identity, and compare designs or conditions in a simple visual way.

  • Week 3 HW: Lab Automation

    Automated two-color agar art using Opentrons OT-2 and design validation with simulation.

Subsections of Homework

Week 1 HW: Principles and Practices

| Does the option: | Option 1 | Option 2 | Option 3 |
|---|---|---|---|
| Enhance biosecurity | | | |
| • By preventing incidents | | | |
| • By helping respond | | | |
| Foster lab safety | | | |
| • By preventing incidents | | | |
| • By helping respond | | | |
| Protect the environment | | | |
| • By preventing incidents | | | |
| • By helping respond | | | |
| Other considerations | | | |
| • Minimizing costs and burdens to stakeholders | | | |
| • Feasibility? | | | |
| • Not impede research | | | |
| • Promote constructive applications | | | |


Introduction and Motivation

This week emphasized that biological engineering is not only about what we can build, but how and why we choose to build it. The lectures and recitation highlighted that ethics, safety, and governance should not be treated as external constraints applied after a technology is developed, but rather as integral design dimensions from the earliest stages of a project.

Revisiting a previous biosensing project through the HTGAA framework allowed me to explicitly articulate design decisions that were originally motivated by technical performance, but which also carry strong ethical, safety, and governance implications. This exercise helped me move beyond a purely technical evaluation and reflect more deeply on responsibility, context, and downstream impact.


Biological Engineering Application

The biological engineering application I focus on is a cell-free biosensor based on a Pb²⁺-specific DNAzyme coupled to CRISPR-Cas12a, designed for the ultrasensitive detection of lead in water.

Lead contamination represents a serious public health concern, with no safe threshold for chronic exposure. While analytical techniques such as ICP-MS or atomic absorption spectroscopy provide high sensitivity, they require centralized laboratories, specialized equipment, and trained personnel, limiting their accessibility for frequent or decentralized monitoring.

Previous generations of biological sensors, including whole-cell bacterial biosensors, demonstrated the feasibility of biological detection but suffered from long response times, higher detection limits, and biosafety concerns related to the use of living genetically modified organisms. In contrast, this project deliberately adopts a cell-free, in vitro architecture, translating the presence of Pb²⁺ into a fluorescent signal in under one hour.

The motivation behind this application is to combine high sensitivity, portability, and safety by design, enabling environmental monitoring in settings where conventional laboratory infrastructure is unavailable, while minimizing biological risks.


Governance and Policy Goals

Reframing this project within the HTGAA framework led to the identification of several governance and policy goals that extend beyond technical performance.

Goal A – Prevent harm and misuse (Non-maleficence)

  • Avoid enabling biological manipulation or amplification of hazardous agents.
  • Prevent repurposing of the sensing platform for unintended or harmful biological activities.

Goal B – Enhance biosafety and biosecurity

  • Minimize risks associated with handling living organisms by using a fully cell-free system.
  • Reduce the likelihood of accidental environmental release or uncontrolled replication.

Goal C – Promote constructive and equitable use

  • Enable access to sensitive environmental monitoring tools without requiring advanced infrastructure.
  • Support public health and environmental decision-making rather than surveillance or coercive applications.

Option 1 – Safe-by-design, cell-free system architecture

Purpose
Many biosensing platforms rely on living cells, which introduce biosafety, containment, and regulatory challenges. This project replaces whole-cell systems with a fully cell-free, non-replicative architecture.

Design
This approach is implemented directly by academic researchers during the design phase and can be reinforced by funding agencies that prioritize safe-by-design technologies.

Assumptions

  • Eliminating living components significantly reduces biosafety risks.
  • Performance can be maintained or improved in vitro.

Risks of Failure and “Success”

  • Failure: reduced robustness in complex environmental matrices.
  • Success risk: overconfidence in technical safeguards without complementary governance measures.

Option 2 – Transparent documentation of limitations and failures

Purpose
Scientific reporting often emphasizes successful outcomes while underreporting failures. This project explicitly documents experimental failures, matrix effects, and design trade-offs.

Design
Implemented through detailed lab records and public documentation on the course website, supported by academic training and publication norms.

Assumptions

  • Transparency improves reproducibility, safety awareness, and ethical reflection.

Risks of Failure and “Success”

  • Failure: documentation becomes superficial or performative.
  • Success risk: increased reporting burden for early-stage researchers.

Option 3 – Context-specific deployment guidelines

Purpose
Environmental biosensors may be deployed in diverse contexts with different ethical implications. This option proposes context-aware guidelines distinguishing research, environmental monitoring, and regulatory use.

Design
Developed by public health and environmental agencies in collaboration with researchers and adapted to local regulatory frameworks.

Assumptions

  • Misuse risk depends strongly on deployment context.
  • Local institutions have the capacity to enforce guidelines.

Risks of Failure and “Success”

  • Failure: inconsistent enforcement across regions.
  • Success risk: delayed deployment in high-need environments.

Scoring Matrix

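| Policy Goal | Option 1 | Option 2 | Option 3 |
|---|---|---|---|
| Enhance biosecurity (prevention) | 1 | 2 | 2 |
| Foster lab safety | 1 | 1 | 2 |
| Protect the environment | 2 | 2 | 1 |
| Minimize costs and burdens | 1 | 3 | 2 |
| Feasibility | 1 | 2 | 2 |
| Not impede research | 1 | 2 | 3 |
| Promote constructive applications | 1 | 1 | 2 |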

Prioritization and Recommendation

Based on this analysis, the highest priority should be given to Option 1 (cell-free, safe-by-design architecture), complemented by Option 2 (transparent documentation). Together, these strategies embed ethical and governance considerations directly into technical design and research practice, rather than relying solely on downstream regulation.

This combined approach is particularly relevant for academic research institutions and funding agencies, where early design choices strongly influence future applications. While these decisions may introduce additional development effort, they significantly enhance safety, trust, and long-term societal benefit.
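
To make the prioritization traceable, the scores in the matrix can be tallied per option. This is only a sketch: it assumes that lower scores are better (1 = best) and that every goal carries equal weight. Neither assumption is stated in the matrix itself, so the totals are illustrative.

```python
# Scores copied from the scoring matrix (columns: Option 1, Option 2, Option 3).
# Assumption: 1 = best, 3 = worst, and every goal is weighted equally.
SCORES = {
    "Enhance biosecurity (prevention)": (1, 2, 2),
    "Foster lab safety": (1, 1, 2),
    "Protect the environment": (2, 2, 1),
    "Minimize costs and burdens": (1, 3, 2),
    "Feasibility": (1, 2, 2),
    "Not impede research": (1, 2, 3),
    "Promote constructive applications": (1, 1, 2),
}


def totals(scores):
    """Sum each option's column across all policy goals."""
    return [sum(column) for column in zip(*scores.values())]


def ranking(scores):
    """Return option numbers (1-based) ordered from lowest to highest total."""
    t = totals(scores)
    return sorted(range(1, len(t) + 1), key=lambda i: t[i - 1])


print(totals(SCORES))   # per-option totals
print(ranking(SCORES))  # options ordered best-first under the stated assumptions
```

Under these assumptions Option 1 has the lowest total, followed by Option 2, which is consistent with the recommendation above. A weighted sum would be a natural extension if some goals matter more than others.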


Weekly Reflection

A key insight from this week is that biosensing technologies are not ethically neutral, even when developed for public health or environmental protection. Portability and accessibility, while beneficial, can also enable misuse if deployment contexts are not carefully considered.

Engaging with the recitation examples reinforced the importance of situating my project at the detection and prevention end of the biological intervention spectrum. This week shifted my perspective from asking only “can this work?” to also asking “should it work this way, and under what conditions?”, a mindset I intend to maintain throughout the course and into the final project.


Documentation Practice

In alignment with the course emphasis on documentation, I am recording all in-silico design steps, experimental iterations, failed conditions, and troubleshooting decisions. This documentation is intended to support reproducibility, collaborative learning, and ethical transparency, and to make visible the full experimental journey rather than only successful outcomes.

George Church – Homework Question

Question chosen: (AA:AA and NA:NA codes) What code would you suggest for AA:AA interactions?

Why we need a code (and what it can/can’t do)

Protein–protein interactions are not “pairwise letters” like Watson–Crick base pairing. They depend on 3D context (distance, solvent exposure, orientation, dynamics, PTMs, local environment). Still, a useful AA:AA “code” can exist as a coarse-grained interaction alphabet: a compact way to describe which residue pairs are likely to attract/repel or stabilize contacts, similar in spirit to how other biological codes map chemistry into discrete symbols.

So the goal is not a perfect predictor of structure, but a portable interaction language that is:

  • symmetric (A–B = B–A),
  • composable (many contacts → one interface),
  • extendable (can include non-standard amino acids / PTMs),
  • and human-usable (a small alphabet rather than a 20×20 table).

Proposed AA:AA interaction code (two-layer)

Layer 1 — Assign each amino acid to an “interaction class”

Define a small set of classes that reflect dominant chemistry:

H = hydrophobic aliphatic (A, V, L, I, M)
Ar = aromatic (F, Y, W)
P = polar uncharged (S, T, N, Q)
D+ = cationic / H-bond donor-leaning (K, R, H, plus N-termini)
A− = acidic (D, E, plus C-termini)
S = sulfur/thiol special (C)
G = glycine (conformational special)
Pro = proline (conformational breaker)

Note: H and Ar are separated because π-stacking and cation-π interactions are distinct modes; Cys is treated separately because it can form disulfides and participate in redox/metal interactions.

Layer 2 — Use a compact “interaction operator” between classes

Use a small set of operators that describe the type of contact:

⊕ = favorable hydrophobic packing (H–H, H–Ar, Ar–Ar stacking)
± = electrostatic attraction (D+–A− salt bridge)
= electrostatic repulsion (D+–D+, A−–A−)
= hydrogen bonding (P–P, P–D+, P–A−; and some aromatic H-bonding cases)
π+ = cation-π (D+–Ar)
S–S = disulfide bond (S–S; context-dependent oxidation/geometry)
= conformational modulation (Pro/Gly effects; Pro–X, G–X)

This yields a compact grammar:

  • Contact = Class(residue1) OP Class(residue2)
  • Example: Lys–Glu → D+ ± A−
  • Example: Leu–Ile → H ⊕ H
  • Example: Arg–Trp → D+ π+ Ar
  • Example: Cys–Cys → S–S (only if oxidation state and geometry allow)
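
The two layers can be sketched as a pair of lookup tables. This is a minimal illustration, not a validated scoring function: the class labels are written in ASCII (`A-` for the acidic class), interaction types are spelled out as words rather than operator symbols, pairs that the code above does not list are returned as `unclassified`, and giving Pro/Gly precedence over the other rules is an assumption made here for simplicity.

```python
# Layer 1: residue -> interaction class (one-letter amino-acid codes).
CLASSES = {
    "H": "AVLIM",   # hydrophobic aliphatic
    "Ar": "FYW",    # aromatic
    "P": "STNQ",    # polar uncharged
    "D+": "KRH",    # cationic
    "A-": "DE",     # acidic
    "S": "C",       # cysteine
    "G": "G",       # glycine
    "Pro": "P",     # proline
}
CLASS_OF = {aa: cls for cls, residues in CLASSES.items() for aa in residues}

# Layer 2: unordered class pair -> interaction type.
# frozenset keys make the lookup symmetric (A-B = B-A).
PAIR_RULES = {
    frozenset(["H"]): "hydrophobic packing",
    frozenset(["H", "Ar"]): "hydrophobic packing",
    frozenset(["Ar"]): "hydrophobic packing",
    frozenset(["D+", "A-"]): "electrostatic attraction",
    frozenset(["D+"]): "electrostatic repulsion",
    frozenset(["A-"]): "electrostatic repulsion",
    frozenset(["P"]): "hydrogen bonding",
    frozenset(["P", "D+"]): "hydrogen bonding",
    frozenset(["P", "A-"]): "hydrogen bonding",
    frozenset(["D+", "Ar"]): "cation-pi",
    frozenset(["S"]): "disulfide (context-dependent)",
}


def contact(res1, res2):
    """Return (class1, interaction type, class2) for two residues."""
    c1, c2 = CLASS_OF[res1], CLASS_OF[res2]
    # Assumption: Pro/Gly conformational effects take precedence over other rules.
    if "Pro" in (c1, c2) or "G" in (c1, c2):
        return c1, "conformational modulation", c2
    return c1, PAIR_RULES.get(frozenset([c1, c2]), "unclassified"), c2


print(contact("K", "E"))
print(contact("R", "W"))
```

Keeping the rules in a dictionary means new classes or modified residues can be added without touching the lookup logic.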

Why this code is useful

  1. Small alphabet, big coverage: compresses 20×20 possibilities into a readable set of “interaction modes.”
  2. Extendable to non-standard amino acids / PTMs: you can add classes/operators for modified residues (e.g., phospho-Ser behaving more A−-like; methyl-Lys tuning D+ strength).
  3. Bridges to protein design: interface reasoning often uses these primitives (hydrophobic core + H-bond networks + salt bridges + cation-π + disulfides).

Known limitations (important)

  • Context dependence: the same pair can change behavior depending on burial, pH, dielectric, water mediation, and geometry.
  • Not a folding code: this is an interaction vocabulary, not a full structural specification.
  • Many-body effects: cooperative networks (packing + H-bond chains) are only approximated by pairwise labels.

Optional refinement (if more precision is needed)

Add an environment tag:

  • (B) buried, (E) exposed
    Example: D+ ± A−(B) often stronger than D+ ± A−(E).

AI / Prompt citation

I used ChatGPT to draft and structure this answer. Prompt: “Given Church’s lecture framing of codes beyond DNA→AA, propose a concise, extensible AA:AA interaction code that captures major interaction types (hydrophobic, salt bridges, H-bonds, cation-π, disulfide).”

Week 2 HW: DNA Read, Write, & Edit

Part 0 — Gel Electrophoresis Basics (Concepts)

This week, I reviewed how gel electrophoresis turns a DNA “mixture” into an interpretable pattern. In an agarose gel, DNA fragments migrate toward the positive electrode because DNA is negatively charged, and smaller fragments travel farther through the gel matrix than larger ones. A DNA ladder provides a size reference so unknown bands can be estimated in base pairs. When a restriction enzyme digest is performed, the DNA sequence is converted into a predictable set of fragment lengths, and those fragments appear as bands at specific positions. Band brightness is roughly related to how much DNA mass is in that fragment (longer fragments can look brighter if molar amounts are similar). Overall, the key idea is that restriction digests plus gels let you “read out” a cutting pattern, validate identity, and compare designs or conditions in a simple visual way.
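
A common way to turn ladder positions into a size estimate is a semi-log standard curve: over the useful range of a gel, log10(fragment size) is roughly linear in migration distance. The sketch below fits that line by least squares and interpolates an unknown band. The ladder readings are hypothetical values chosen for illustration, not measurements from this homework.

```python
import math


def fit_ladder(distances_mm, sizes_bp):
    """Least-squares line through (distance, log10(size)); returns (slope, intercept)."""
    n = len(distances_mm)
    logs = [math.log10(s) for s in sizes_bp]
    mean_x = sum(distances_mm) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_mm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances_mm, logs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_size(distance_mm, slope, intercept):
    """Convert a migration distance back to an approximate size in bp."""
    return 10 ** (slope * distance_mm + intercept)


# Hypothetical ladder readings (not measured data).
ladder_mm = [10.0, 20.0, 30.0]
ladder_bp = [10000, 1000, 100]

slope, intercept = fit_ladder(ladder_mm, ladder_bp)
print(round(estimate_size(25.0, slope, intercept)))  # band between the 1000 and 100 bp markers
```

The negative slope encodes the rule stated above: the farther a band has traveled, the smaller the fragment. Very large fragments typically deviate from the line, so estimates are most reliable between ladder bands.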



Restriction digest (lambda phage genome)

Sequence used: Escherichia phage lambda, complete genome
Database/Accession: NCBI Nucleotide (GenBank), J02459
Genome length: 48,502 bp
Tool: Benchling (Import from Database → Digest)

What I did (quick documentation)

  1. Imported the lambda phage genome from NCBI using accession J02459.
  2. Opened the Digest tool in Benchling.
  3. Ran single-enzyme digests with EcoRI, EcoRV, HindIII, KpnI, SacI, and SalI.
  4. Recorded the number of cut sites and the expected fragment sizes (in genome order).
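
The same single-enzyme digest can be sketched in a few lines of Python. This is not how Benchling implements its Digest tool; it is a minimal illustration that assumes a linear molecule (so n cuts give n + 1 fragments) and uses the standard recognition sites and top-strand cut positions of the six enzymes. The test sequence is a made-up toy example.

```python
# Recognition site and cut offset (0-based index within the site, top strand).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "EcoRV": ("GATATC", 3),    # GAT^ATC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "KpnI": ("GGTACC", 5),     # GGTAC^C
    "SacI": ("GAGCTC", 5),     # GAGCT^C
    "SalI": ("GTCGAC", 1),     # G^TCGAC
}


def cut_positions(seq, site, offset):
    """Return every top-strand cut position for one recognition site."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest_linear(seq, enzyme):
    """Fragment lengths, in sequence order, for a linear molecule."""
    site, offset = ENZYMES[enzyme]
    cuts = cut_positions(seq.upper(), site, offset)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Toy sequence with two EcoRI sites (hypothetical, for illustration only).
toy = "AAAGAATTCTTTTGAATTCGG"
print(digest_linear(toy, "EcoRI"))
```

All six sites are palindromic, so scanning one strand is enough to find every site.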

Results table (fragment sizes in bp)

| Enzyme | Cuts | Expected fragments | Fragment sizes (bp) | Cut ends (from Benchling) |
|---|---|---|---|---|
| EcoRI | 5 | 6 | 21226, 4878, 5643, 7421, 5804, 3530 | 5’ overhang (sticky) |
| EcoRV | 21 | 22 | 652, 1434, 4597, 1403, 738, 4613, 588, 3744, 618, 2884, 1679, 3873, 1377, 13, 5376, 5765, 1921, 268, 35, 655, | blunt |
| HindIII | 6 | 7 | 23130, 2027, 2322, 9416, 564, 6682, 4361 | 5’ overhang (sticky) |
| KpnI | 2 | 3 | 17057, 1503, 29942 | 3’ overhang (sticky) |
| SacI | 2 | 3 | 24776, 1105, 22621 | 3’ overhang (sticky) |
| SalI | 2 | 3 | 32745, 499, 15258 | 5’ overhang (sticky) |
[Screenshots: EcoRI digest; EcoRV digest]
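
A quick sanity check on a digest table is that the fragments of a linear genome must add up to its full length. The sketch below applies that check to the fragment sizes recorded above, using the 48,502 bp genome length stated earlier. The helper name `missing_bp` is introduced here for illustration.

```python
GENOME_LENGTH = 48502  # bp, lambda phage (J02459), as stated above

# Fragment sizes exactly as recorded in the results table.
FRAGMENTS = {
    "EcoRI": [21226, 4878, 5643, 7421, 5804, 3530],
    "EcoRV": [652, 1434, 4597, 1403, 738, 4613, 588, 3744, 618, 2884,
              1679, 3873, 1377, 13, 5376, 5765, 1921, 268, 35, 655],
    "HindIII": [23130, 2027, 2322, 9416, 564, 6682, 4361],
    "KpnI": [17057, 1503, 29942],
    "SacI": [24776, 1105, 22621],
    "SalI": [32745, 499, 15258],
}


def missing_bp(fragments, genome_length=GENOME_LENGTH):
    """Base pairs not accounted for by the listed fragments (0 = complete list)."""
    return genome_length - sum(fragments)


for enzyme, sizes in FRAGMENTS.items():
    print(f"{enzyme}: {len(sizes)} fragments listed, {missing_bp(sizes)} bp unaccounted for")
```

With the values as recorded, the EcoRI, HindIII, KpnI, SacI, and SalI rows each sum to 48,502 bp. The EcoRV row lists 20 of the expected 22 fragments and leaves 6,269 bp unaccounted for, so its size list is incomplete as recorded.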

Part 2 — Gel Art (Virtual Digest)

I created a “gel art” pattern inspired by the idea that restriction digests can produce recognizable visual signatures.
The design uses symmetry and band density as the main visual elements: enzymes with few cuts generate sparse lanes (lighter), while enzymes with many cuts generate dense lanes (darker).

Lane plan (left → right):
Ladder (Life 1 kb Plus), ApaI, EcoRI, HaeIII, EcoRI, ApaI.

HaeIII creates a high-density fragmentation pattern that acts as the “dark center,” while EcoRI and ApaI provide low-cut, high-molecular-weight bands that frame the pattern.

[Figure: Gel art virtual digest]

Part 3 — DNA Design Challenge

3.1 Protein choice

I chose sfGFP (superfolder GFP) as the target protein because it is a robust fluorescent reporter widely used to validate expression, folding, and cloning workflows. It provides an easy quantitative readout (fluorescence) and is a standard “sanity check” part in many synthetic biology builds.

3.2 Reverse translation (baseline CDS)

Starting from the sfGFP amino-acid sequence, I generated a DNA coding sequence (CDS) by back-translation using a codon-usage–matching approach (Benchling output). This produces a valid CDS encoding the same protein sequence.

  • Protein length: 246 aa
  • DNA CDS length (no stop codon): 738 bp

sfGFP amino-acid sequence (246 aa):

![sfGFP amino acids](<./sgGFP Aas.jpg>)
MSKGEELFTGVVPILVELDGDVNGHKFSVRGEGEGDATNGKLTLKFICTTGKLPVPWPTL
VTTLTYGVQCFSRYPDHMKRHDFFKSAMPEGYVQERTISFKDDGTYKTRAEVKFEGDTLV
NRIELKGIDFKEDGNILGHKLEYNFNSHNVYITADKQKNGIKANFKIRHNVEDGSVQLAD
HYQQNTPIGDGPVLLPDNHYLSTQSVLSKDPNEKRDHMVLLEFVTAAGITHGMDELYKGS
HHHHHH
![sgGFP Aas](/attachments/d12c476e-50db-469a-82d0-41c5496d9b00)
![sfGFP protein PDB](/attachments/8e400e42-cded-484d-97f0-8e28a82b2fdf)

Back-translated / codon-usage–matched CDS (low GC target):
ATGTCAAAAGGTGAGGAATTATTTACCGGAGTAGTACCAATACTGGTAGAATTAGATGGCG
ATGTTAATGGGCATAAGTTTTCAGTGCGTGGAGAAGGAGAAGGCGATGCTACAAATGGAAA
ATTAACGTTAAAATTTATTTGTACTACTGGGAAACTACCTGTACCTTGGCCAACTTTAGTT
ACAACCTTAACATATGGTGTACAATGTTTTTCTCGTTATCCAGATCATATGAAACGTCATG
ATTTTTTTAAAAGTGCGATGCCTGAAGGTTACGTTCAAGAAAGAACTATATCTTTTAAAGAT
GATGGTACATATAAAACACGAGCTGAAGTAAAATTTGAAGGTGATACTTTGGTTAATAGAAT
TGAACTTAAAGGGATTGATTTTAAGGAAGATGGAAATATTCTCGGACACAAATTAGAATACA
ATTTTAATTCACATAATGTTTACATAACAGCTGATAAACAAAAAAATGGCATAAAAGCAAAT
TTTAAAATAAGACATAATGTAGAAGATGGAAGTGTCCAATTAGCAGATCATTATCAGCAAAA
CACACCAATTGGTGATGGTCCTGTCCTTTTACCAGATAATCATTATTTATCAACCCAATCTG
TTTTGTCAAAAGATCCGAATGAAAAAAGAGATCATATGGTTTTATTGGAATTTGTAACAGCA
GCAGGTATTACTCATGGCATGGATGAATTATATAAAGGCTCTCATCATCATCATCATCAT


Codon optimization for E. coli

I then codon-optimized the CDS for Escherichia coli using a “use best codon” strategy. As expected, the amino-acid sequence is unchanged, but the nucleotide sequence changes due to synonymous codon choices that better match E. coli translation preferences.
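
The claim that only synonymous codons change can be checked directly: translate both sequences, confirm the proteins match, then compare the nucleotides. The sketch below uses the standard genetic code and, to stay short, only the first seven codons of the baseline and optimized sequences from this page. The function names are illustrative and are not Benchling's API.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def gc_content(seq):
    """Fraction of G and C bases."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a.upper(), b.upper())) / len(a)


# First 21 nt (7 codons) of each CDS on this page.
baseline = "ATGTCAAAAGGTGAGGAATTA"
optimized = "ATGAGCAAAGGCGAAGAACTG"

print(translate(baseline), translate(optimized))
print(f"identity: {identity(baseline, optimized):.2%}")
print(f"GC: {gc_content(baseline):.1%} vs {gc_content(optimized):.1%}")
```

Running the same functions on the full-length sequences is the natural next step; because `identity` compares position by position, it requires both sequences to have the same length.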

Nucleotide identity (baseline vs optimized): 76.96%

GC content (baseline, codon-usage–matched): 33.0%

GC content (optimized, best-codon): 50.0%

Rare codons: 11 (baseline) vs 0 (optimized)

Hairpins (reported by the tool): 0 in both

Thymine fraction (reported by the tool): 0.30 (baseline) vs 0.21 (optimized)

Codon-optimized CDS (best codons, medium GC target):
ATGAGCAAAGGCGAAGAACTGTTTACCGGCGTGGTGCCGATTCTGGTGGAACTGGATGGCGAT
GTGAACGGCCATAAATTTAGCGTGCGCGGCGAAGGCGAAGGCGATGCGACCAACGGCAAACT
GACCCTGAAATTTATTTGCACCACCGGCAAACTGCCGGTGCCGTGGCCGACCCTGGTGACCA
CCCTGACCTATGGCGTGCAGTGCTTTAGCCGCTATCCGGATCATATGAAACGCCATGATTTT
TTTAAAAGCGCGATGCCGGAAGGCTATGTGCAGGAACGCACCATTAGCTTTAAAGATGATGG
CACCTATAAAACCCGCGCGGAAGTGAAATTTGAAGGCGATACCCTGGTGAACCGCATTGAAC
TGAAAGGCATTGATTTTAAAGAAGATGGCAACATTCTGGGCCATAAACTGGAATATAACTTT
AACAGCCATAACGTGTATATTACCGCGGATAAACAGAAAAACGGCATTAAAGCGAACTTTAA
AATTCGCCATAACGTGGAAGATGGCAGCGTGCAGCTGGCGGATCATTATCAGCAGAACACCC
CGATTGGCGATGGCCCGGTGCTGCTGCCGGATAACCATTATCTGAGCACCCAGAGCGTGCTG
AGCAAAGATCCGAACGAAAAACGCGATCATATGGTGCTGCTGGAATTTGTGACCGCGGCGGGC
ATTACCCATGGCATGGATGAACTGTATAAAGGCAGCCATCATCATCATCATCATCAT

Best way to obtain the DNA

For a ~0.74 kb CDS like sfGFP, the most straightforward approach is gene synthesis (ordering a dsDNA fragment). It is fast, accurate, and does not require an existing template. If a plasmid template is already available, an alternative is PCR amplification + cloning (e.g., restriction cloning or Gibson), but synthesis avoids PCR-introduced mutations and simplifies the workflow.

## Part 4 — DNA Write (Ordering + Construct Design)

### 4.1 Expression cassette design (what I would build)
To express **sfGFP in *E. coli***, I would build a standard bacterial expression cassette:

- **Promoter:** T7 promoter (for high expression in BL21(DE3)-like strains) or a strong constitutive promoter if T7 is not desired  
- **RBS:** strong bacterial RBS (e.g., a consensus Shine–Dalgarno / gene10-like RBS)  
- **CDS:** sfGFP coding sequence, codon-optimized for *E. coli* (AA sequence unchanged)  
- **Tag / stop:** optional **C-terminal 6xHis** tag for purification + **stop codon**  
- **Terminator:** strong transcription terminator (e.g., T7 terminator / bacterial terminator)

This design is simple, robust, and makes fluorescence an immediate readout for “does expression work?”.

### 4.2 What I would order (DNA “write” step)
Because the sfGFP CDS is short (~0.7–0.8 kb), the most straightforward approach is **DNA synthesis** (a dsDNA fragment or a cloned gene). Concretely, I would order one of these:

**Option A — Gene fragment (fast + flexible)**
- Order the **sfGFP insert as dsDNA** with flanking overlaps for Gibson/HiFi assembly (or with restriction sites).
- Then clone into an expression plasmid in the lab.

**Option B — Cloned gene in a plasmid (one-step ready)**
- Order **sfGFP already cloned** into a high-copy plasmid backbone.

### 4.3 Twist Bioscience access limitation (Argentina) + workaround plan
From my location (Argentina), the Twist ordering portal is not accessible and prompts me to contact a local operator. In a real order scenario, I would do one of the following:

![Twist screenshot](<./twist.jpg>)

1) **Contact Twist local sales/support** (as requested) and place the order via email (sequence + vector + cloning format).  
2) Use an **alternative synthesis provider** that ships to my region (e.g., ordering a dsDNA fragment from another vendor) and then perform the same assembly into an equivalent plasmid backbone.

For the purposes of this homework, I describe the intended order and construct as if placing a standard synthesis + cloning order.

### 4.4 Vector choice and final construct
If using Twist’s catalog, I would choose a standard **high-copy AmpR plasmid backbone** (e.g., a pTwist Amp high-copy–type vector), and insert the sfGFP expression cassette into it.

Final construct conceptually looks like:

**[T7 promoter] – [RBS] – [sfGFP CDS (E. coli optimized)] – [6xHis] – [STOP] – [Terminator]**

### 4.5 How I would obtain protein from this DNA (high-level workflow)
1) **Assemble** the insert into the plasmid (Gibson/HiFi or restriction cloning).  
2) **Transform** into *E. coli* (expression strain if using T7).  
3) **Verify** by sequencing (to confirm sfGFP is correct and in-frame).  
4) **Express** and measure fluorescence as a fast functional readout.  
5) (Optional) **Purify** via His-tag if purification is required.

This approach separates “DNA write” (ordering/synthesis) from “DNA read” (sequencing verification) and “DNA function” (fluorescence output).
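
Before ordering, and again when reading back sequencing results, a few frame checks catch most construct mistakes. The sketch below is one way to express them: the function name `check_cds` and the specific checks are illustrative choices, not a vendor requirement.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_cds(seq, require_stop=True):
    """Return a list of problems found in a coding sequence (empty list = passes)."""
    seq = seq.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3")
    if not seq.startswith("ATG"):
        problems.append("does not start with ATG")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    # A terminal stop codon is expected; stops anywhere else truncate the protein.
    has_terminal_stop = bool(codons) and codons[-1] in STOP_CODONS
    internal = codons[:-1] if has_terminal_stop else codons
    if any(c in STOP_CODONS for c in internal):
        problems.append("internal stop codon")
    if require_stop and not has_terminal_stop:
        problems.append("missing terminal stop codon")
    return problems


print(check_cds("ATGAAATAA"))  # short toy CDS with start and stop
print(check_cds("ATGAAA"))     # same CDS without a stop codon
```

Because the CDS listed earlier is given without a stop codon, a check like this would flag it until the STOP element of the cassette is appended.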

## Part 5 — DNA Read / Write / Edit (Dengue focus: Argentina)

### 5.1 DNA Read

**(i) What DNA/RNA would I want to sequence and why?**  
I would focus on **genomic surveillance of Dengue virus (DENV) in Argentina**, integrating **clinical** and **environmental** sequencing to support public health decisions in real time.

Concretely, I would sequence:

1) **Clinical DENV genomes (RNA → cDNA)** from a **representative subset** of confirmed cases:
- **Across regions** (e.g., AMBA, the Buenos Aires metropolitan area, vs. northern provinces where dengue burden can be higher).
- **Across time** (weekly/biweekly sampling during season peaks).
- **Across epidemiological contexts** (outbreak clusters, travel-associated cases, and sporadic detections).

**Why:**  
- To track **serotype dynamics** (DENV-1/2/3/4) and detect shifts that may correlate with outbreak intensity.  
- To monitor **lineage introductions** (new clades entering a province) and infer **transmission connectivity** between regions.  
- To support **molecular epidemiology**: identify clusters, potential superspreading contexts, and genomic signatures associated with rapid spread (without overclaiming causality).  
- To generate local datasets that strengthen **regional capacity** and reduce dependence on external sequencing pipelines.

2) **Environmental DENV surveillance in Aedes aegypti pools** (and optionally wastewater as exploratory):
- **Mosquito pools** (RT-PCR confirmed) from vector surveillance programs: this can provide early hints of circulating serotypes/lineages even before clinical case counts surge.
- **Wastewater** is less standard for DENV than for enteric viruses, but could be explored as a research add-on; vector-based sampling is usually more direct for arboviruses.

**Why:**  
- To get **earlier warning signals** and a broader picture of circulation beyond who shows up at clinics.
- To link **vector circulation** with **human cases**, improving outbreak models.

---

**(ii) What sequencing technology would I use and why?**  
I would use a **two-tier strategy**:

- **Illumina short-read sequencing (2nd generation)** for routine surveillance:
  - High per-base accuracy, scalable multiplexing, strong variant calling.
  - Great for producing reliable consensus genomes and phylogenies.

- **Oxford Nanopore sequencing (3rd generation)** for rapid, field-forward situations:
  - Faster turnaround when you need same-week answers (e.g., suspected new introduction or unusual outbreak).
  - Useful for decentralized labs or mobile workflows, at the cost of higher raw read error (mitigated by coverage + consensus polishing).

This hybrid approach fits a realistic public health workflow: Illumina as the “gold standard backbone”, Nanopore as the “rapid response tool”.

---

**1) Is it first-, second-, or third-generation? How so?**  
- **Illumina = second-generation**: massively parallel short reads (sequencing-by-synthesis).  
- **Nanopore = third-generation**: single-molecule sequencing, long reads, electrical signal through nanopores.

---

**2) What is the input? How do you prepare your input? Essential steps.**  
**Input:** Dengue is an **RNA virus**, so the primary input is **viral RNA** extracted from samples, then converted to **cDNA**.

A practical pipeline:

**Clinical samples (serum/plasma/whole blood, depending on stage):**
1. **Sample + metadata collection** (date, location, Ct value, suspected serotype if known, etc.).  
2. **RNA extraction**.  
3. **RT step → cDNA**.  
4. **Target enrichment strategy** (choose one):
   - **Amplicon tiling PCR** (common for viral genomes; efficient and cheap).  
   - OR **capture-based enrichment** (more flexible but more expensive).  
5. **Library preparation**:
   - Illumina: adapter ligation + indexes (multiplexing), optional PCR.  
   - Nanopore: end-repair + adapter ligation, optional barcoding.  
6. **Sequencing run**.  
7. **Bioinformatics**: QC → mapping → consensus → variants → phylogeny.

**Mosquito pool samples:**
1. **Pool preparation** (Aedes aegypti pools, ideally with RT-qPCR confirmation).  
2. **RNA extraction** (often with inhibitors → extra QC).  
3. RT → cDNA, then same as above.

**Key practical note:** For DENV, sampling time matters: early infection tends to have higher viremia (better genome recovery). Also, using Ct thresholds to select samples improves success rate.
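
The "consensus" step of the bioinformatics stage can be illustrated with a simple majority vote per position, masking positions with too little coverage as `N`. This is a teaching sketch under strong assumptions: reads are already aligned and padded to the reference length, `-` means "no coverage at this position", and the `min_depth` default of 10 is a placeholder rather than a recommended threshold. Production pipelines use dedicated aligners and variant callers instead.

```python
from collections import Counter


def consensus(aligned_reads, min_depth=10, gap="-"):
    """Majority-vote consensus over equal-length aligned reads.

    Positions covered by fewer than min_depth reads are reported as 'N'.
    """
    if not aligned_reads:
        return ""
    out = []
    for i in range(len(aligned_reads[0])):
        column = [read[i] for read in aligned_reads if read[i] != gap]
        if len(column) < min_depth:
            out.append("N")
        else:
            out.append(Counter(column).most_common(1)[0][0])
    return "".join(out)


def genome_recovered(cons):
    """Fraction of consensus positions that are not masked as 'N'."""
    return 1 - cons.count("N") / len(cons) if cons else 0.0


# Toy alignment (hypothetical reads); the last position is covered only once.
reads = ["ACGT", "ACG-", "AAG-"]
cons = consensus(reads, min_depth=2)
print(cons, genome_recovered(cons))
```

The fraction returned by `genome_recovered` corresponds to the "% genome recovered" figure that low-viremia or high-Ct samples tend to drag down.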

---

**3) How does it decode the bases (base calling)?**  
- **Illumina**: fluorescent signals from nucleotide incorporation per cycle → base calls + quality scores.  
- **Nanopore**: ionic current shifts as molecules pass through the pore → signal-to-sequence base calling (model-based), then consensus polishing.

---

**4) What is the output?**  
- **FASTQ** reads (with quality scores).  
- **BAM/CRAM** alignments to a reference genome.  
- **Consensus genome FASTA** per sample.  
- **Variant calls (VCF)** (when appropriate).  
- **QC reports** (coverage depth, % genome recovered, contamination checks).  
- Downstream: **phylogenetic trees** and **lineage/cluster summaries** for epidemiological interpretation.
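
As a small illustration of the first output type, the sketch below reads FASTQ records and computes the mean Phred quality per read. It assumes the simple four-lines-per-record layout and Phred+33 quality encoding; the two example reads are made up.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality_string) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return
        seq = handle.readline().rstrip()
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return sum(ord(ch) - offset for ch in qual) / len(qual)


# Two hypothetical reads: 'I' encodes Q40 and '!' encodes Q0 under Phred+33.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nACGA\n+\n!!II\n")
for name, seq, qual in read_fastq(example):
    print(name, len(seq), mean_phred(qual))
```

A per-read quality filter like this is usually the first QC step before mapping.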

---

### 5.2 DNA Write

**(i) What DNA would I want to synthesize and why? (Dengue-focused)**  
I would “write” DNA that enables **faster and more deployable dengue diagnostics** and/or supports local R&D.

Three concrete synthesis targets:

1) **DENV diagnostic standards and controls** (safe, non-infectious):
- Synthetic **gene fragments** (e.g., conserved regions of DENV genome used in RT-qPCR/CRISPR assays).
- **Positive control templates** for assay development and QA/QC.
**Why:** robust controls are crucial for reliable diagnostics, especially across multiple labs and seasons.

2) **CRISPR-based dengue detection components** (research prototype):
- Synthetic DNA templates to generate **RNA targets** (IVT) or **reporter constructs** for assay benchmarking.
- If building cell-free or isothermal detection workflows, you can synthesize the necessary templates without needing infectious material.
**Why:** safer, faster iteration.

3) **Aedes-related biosensor modules** (optional):
- DNA parts for sensor chassis optimization (e.g., expression cassettes for reporters in E. coli cell-free systems).
**Why:** create modular “plug-and-play” parts to accelerate prototyping.

---

**(ii) What technology would I use for DNA synthesis and why?**  
- For ~0.3–3 kb fragments: **commercial gene synthesis** (dsDNA fragments or cloned gene in a plasmid).
- For many variants: **oligo pools** (array-based synthesis) + assembly.

**Why:** speed + reliability, avoids PCR errors, and supports rapid iteration (especially when you want multiple versions: different primers, target regions, or assay designs).

---

**1) Essential steps (high-level)**  
- Design sequence (include constraints: avoid repeats/extreme GC, include needed cloning sites/overlaps).  
- Order as dsDNA fragment (or oligos + assembly).  
- If needed: clone into plasmid backbone (Gibson/HiFi or restriction cloning).  
- Verify by sequencing (at least Sanger for inserts, or NGS for pools).  
- Use as template/control in downstream assays.

---

**2) Limitations (speed, accuracy, scalability)**  
- **Length & complexity**: longer sequences or high repeat content may fail or take longer.  
- **Error rate**: increases with length; sometimes error correction or clone screening is needed.  
- **Sequence constraints**: extreme GC, hairpins, homopolymers can reduce success.  
- **Regulatory/shipping**: international access can be limited; some vendors require regional sales contact.  
- **Cost**: scales with length and number of variants.
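
Some of the sequence constraints above can be screened before ordering. The sketch below flags long homopolymer runs and windows with extreme GC content. All thresholds (`gc_min`, `gc_max`, `max_homopolymer`, `window`) are hypothetical defaults chosen for illustration; real limits depend on the synthesis vendor.

```python
import re


def gc_fraction(seq):
    """Fraction of G and C bases (0.0 for an empty sequence)."""
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base."""
    return max((len(m.group(0)) for m in re.finditer(r"(.)\1*", seq)), default=0)


def synthesis_flags(seq, gc_min=0.25, gc_max=0.65, max_homopolymer=8, window=50):
    """Return human-readable warnings; thresholds are illustrative, not vendor rules."""
    seq = seq.upper()
    flags = []
    run = longest_homopolymer(seq)
    if run > max_homopolymer:
        flags.append(f"homopolymer run of {run}")
    # Slide a fixed-size window and report the first one outside the GC range.
    for start in range(0, max(len(seq) - window, 0) + 1):
        gc = gc_fraction(seq[start:start + window])
        if not gc_min <= gc <= gc_max:
            flags.append(f"GC {gc:.2f} outside range in window starting at {start}")
            break
    return flags


print(synthesis_flags("ACGT" * 25))  # balanced toy sequence
print(synthesis_flags("A" * 60))     # extreme toy sequence
```

Hairpins and repeats need structure-aware or alignment-based checks and are not covered by this sketch.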

---

### 5.3 DNA Edit

**(i) What DNA would I want to edit and why? (Dengue context)**  
I would focus on edits that are **ethically appropriate, feasible, and beneficial**, avoiding speculative or high-risk human germline scenarios.

Two realistic editing directions:

1) **Editing lab strains (E. coli or cell-free chassis) to improve dengue diagnostic prototyping**  
Examples (conceptual):
- Reduce background nuclease activity that can degrade reporters.  
- Improve expression stability of reporter proteins or enzymes used in readouts.  
**Why:** more robust, reproducible diagnostics and faster prototyping cycles.

2) **Vector biology research (Aedes aegypti) — in controlled research settings**  
Examples (high-level):
- Knock-in/knock-out genes to study **vector competence** or immune pathways relevant to arbovirus replication.  
**Why:** better understanding of transmission biology can support long-term control strategies (with strong oversight and biosafety/ethics review).

---

**(ii) What technology would I use and why?**  
- **CRISPR-Cas9** for knock-outs and knock-ins in model systems.  
- **Base editing** for precise point mutations, mainly C→T and A→G transitions (when you want to avoid double-strand breaks).  
- **Prime editing** for flexible small edits (insertions/deletions/substitutions) without double-strand breaks or a separate donor template.

Choice depends on the edit:
- Big insertions → Cas9 + HDR (or targeted integration strategies).  
- Single base changes → base editor.  
- Small flexible edits → prime editor.

---

**1) How does it edit DNA? (conceptual steps)**  
- Guide RNA targets a specific locus.  
- Editor performs cut or base conversion.  
- Cellular repair/processing results in the desired change.  
- Screen and validate clones/lines.

---

**2) What preparation is needed and what is the input?**  
- Target selection + guide design + off-target risk assessment.  
- Editor delivery strategy (plasmid, mRNA, RNP).  
- Optional donor template for HDR edits.  
- Validation plan:
  - PCR across the locus, Sanger/NGS confirmation,
  - phenotype/functional assay relevant to the edit,
  - off-target screening where appropriate.
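
The first step of guide design, listing candidate target sites, can be sketched as a PAM scan. The example assumes SpCas9 specifically (NGG PAM with a 20-nt protospacer immediately 5' of it), which is one common choice rather than something fixed by the text above. It does no off-target scoring, and the test sequence is a toy example.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_guides(seq, guide_len=20):
    """List (strand, protospacer, PAM) for every NGG PAM with a full-length protospacer.

    Assumes SpCas9: the protospacer sits immediately 5' of the PAM on the scanned strand.
    """
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for i in range(guide_len, len(s) - 2):
            pam = s[i:i + 3]
            if pam[1:] == "GG":
                hits.append((strand, s[i - guide_len:i], pam))
    return hits


# Toy target: a 20-nt stretch followed by a TGG PAM (hypothetical sequence).
target = "A" * 20 + "TGG"
for strand, protospacer, pam in find_guides(target):
    print(strand, protospacer, pam)
```

Candidates from a scan like this still need the off-target assessment and validation steps listed above before any of them is used.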

---

**3) Limitations (efficiency/precision)**  
- **Delivery** limitations (some cell types/organisms are difficult).  
- **Off-targets** and unintended edits (varies with editor/guide).  
- **HDR efficiency** can be low; requires careful design and screening.  
- Need for **strong controls**, replication, and transparent reporting.

Week 3 HW: Lab Automation