Homework

📚 Weekly Homework Submissions

Week 1 HW: Principles and Practices

Week 1 - Principles & Practices

Ethics, Safety, and Governance in Biotechnology


1. Proposed Biological Engineering Application / Tool

I am interested in the computational design of biomarker-integrated biomedical devices that combine localized therapeutic delivery with biological sensing to improve safety and precision in cancer treatment.

Specifically, I am working on the conceptual design of an antitumoral intrauterine device (IUD) for endometrial adenocarcinoma that leverages intrauterine molecular and transcriptomic biomarkers to support real-time monitoring and data-driven therapeutic decisions. By integrating localized drug delivery (potentially through drug repurposing strategies) with biomarker-informed feedback, such a system aims to reduce systemic toxicity while increasing treatment specificity.

While biomarker-driven therapeutic devices have the potential to significantly improve clinical outcomes, they also raise important ethical and governance concerns. These include risks related to patient safety, data governance, algorithmic bias, unequal access to precision technologies, and the challenge of managing systems where computational interpretation of biological data directly informs therapeutic action.

2. Governance / Policy Goals for an Ethical Future

High-level goal:
Ensure that biomarker-driven biomedical devices and computational biological design tools are developed and deployed in ways that are safe, clinically responsible, and equitable.

Sub-goals:

  • Patient Safety & Non-maleficence
    Prevent harm arising from incorrect biomarker interpretation, device malfunction, or inappropriate therapeutic responses.
  • Equity & Access
    Avoid reinforcing disparities in access to precision biomedical technologies across healthcare systems and regions.
  • Responsible Innovation
    Enable translational progress in biomarker-integrated medical devices without imposing governance frameworks that unnecessarily slow legitimate research or clinical adoption.
3. Governance Action - Option 1: Ethical Risk Assessment

Purpose
Require early-stage ethical and clinical risk assessment for biomarker-driven therapeutic devices prior to clinical testing or deployment.

Design

  • Implemented by universities, hospitals, ethics committees, and funding agencies
  • Short, standardized assessment focused on patient safety, data use, and system autonomy

Assumptions

  • Designers can anticipate plausible failure modes and misuse risks
  • Institutions are willing to meaningfully enforce early review

Risks

  • Assessments may become procedural rather than substantive
  • Overregulation could discourage early-stage innovation if applied rigidly
4. Governance Action - Option 2: Tiered Access & Monitoring

Purpose
Limit misuse or premature deployment of sensitive biomarker-driven design tools and therapeutic decision-support algorithms.

Design

  • Tiered access based on clinical training, certification, or institutional affiliation
  • Monitoring of high-risk functionalities, such as automated therapeutic adjustments

Assumptions

  • Access control reduces patient risk
  • Monitoring does not unduly hinder clinical research

Risks

  • Centralization of control over design and deployment tools
  • Exclusion of low-resource or underrepresented research groups
5. Governance Action - Option 3: Incentives for Responsible Design

Purpose
Encourage safety-by-design and equity-focused development of biomarker-integrated biomedical devices.

Design

  • Funding incentives for designs prioritizing patient safety, interpretability, and clinical robustness
  • Recognition programs implemented by funders, hospitals, and research institutions

Assumptions

  • Positive incentives influence research and design priorities

Risks

  • Unequal distribution of incentives
  • Difficulty in measuring long-term clinical and societal impact
6. Scoring Governance Options
| Does the option: | Option 1 | Option 2 | Option 3 |
|---|---|---|---|
| Enhance Patient Safety | | | |
| • By preventing incidents | 2 | 1 | 3 |
| • By helping respond | 2 | 2 | n/a |
| Improve Clinical Reliability | | | |
| • By preventing incidents | 2 | 2 | 3 |
| • By helping respond | 2 | 2 | 3 |
| Protect Patients & Data | | | |
| • By preventing incidents | 2 | 2 | 3 |
| • By helping respond | 2 | 2 | 3 |
| Other considerations | | | |
| • Minimizing costs and burdens | 2 | 3 | 1 |
| • Feasibility | 1 | 2 | 2 |
| • Not impede research | 2 | 3 | 1 |
| • Promote constructive applications | 2 | 2 | 1 |

(1 = best)

7. Recommended Governance Strategy

I would prioritize a combination of Option 1 and Option 3.

Early ethical and clinical risk assessment establishes a shared baseline of responsibility, while incentive-based mechanisms encourage designers to proactively integrate patient safety, interpretability, and equity into biomarker-driven biomedical devices.

Audience:
Academic institutions, funding agencies, hospital ethics boards, and translational research consortia.

8. Ethical Reflections from Week 1

This week highlighted that ethical responsibility in biotechnology increasingly emerges before physical deployment, particularly in systems where computational interpretation of biological data directly informs therapeutic action.

A key insight was the importance of anticipatory governance—especially for biomarker-driven medical devices—where risks arise from data interpretation, system autonomy, and unequal access rather than from biological materials alone.

Proposed Action:
Integrate ethics, governance, and clinical risk analysis into the early design phase of biomarker-informed biomedical devices as core engineering constraints rather than post hoc considerations.

📚 References
  1. Baker, D., & Church, G. (2024). Protein Design Meets Biosecurity. Science Editorial. (link)
  2. OpenAI. (2026). ChatGPT (GPT-5.2) [Large language model]. Used for drafting support and conceptual clarification. (link)
  3. Harvard–MIT HTGAA. (2026). Week 1: Principles & Practices. Course lecture materials.
  4. This topic is related to my thesis project.

Week 2 HW: DNA Read, Write, & Edit

Part 1: Benchling & In-silico Gel Art
  • Created a Benchling account and initialized a project for Week 2 focused on molecular cloning and restriction digest simulation.
  • Import Lambda DNA from NCBI
    The NEB product specifies Lambda DNA derived from the cI857ind1 Sam7 strain. While this represents a specific laboratory variant, the full genomic backbone corresponds to the canonical 48,502 bp lambda phage genome (NC_001416).
    For restriction digest simulations, the reference genome is appropriate, as restriction sites are conserved in the standard laboratory strain.


  • Upload FASTA to Benchling
  • Simulate Restriction Enzyme Digestion with the following Enzymes



  • Create a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks.
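The restriction digest simulated in Benchling can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not Benchling's method: the function name `digest`, the `cut_offset` parameter, and the 23 bp toy sequence are hypothetical, and the sequence is treated as linear with cuts considered on the top strand only. The EcoRI site (G^AATTC) is the one real-world detail used.

```python
def digest(seq, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every `site`.

    cut_offset is the number of bases from the start of the site to the cut
    on the top strand (1 for EcoRI, which cuts G^AATTC).
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    # Fragment boundaries: sequence start, each cut position, sequence end.
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Toy 23 bp sequence with two EcoRI sites (hypothetical, not lambda DNA).
toy = "AAAA" + "GAATTC" + "CCCCC" + "GAATTC" + "TT"
fragments = digest(toy, "GAATTC", 1)

# On a gel, larger fragments migrate more slowly, so list them largest first.
print(sorted(fragments, reverse=True))
```

Passing the full lambda sequence loaded from the FASTA file in place of `toy` would give the fragment sizes to compare against the Benchling virtual gel.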
Part 3: DNA Design Challenge
  • 3.1 Choose Your Protein: Alpha-Synuclein (SNCA)
    I chose human alpha-synuclein (SNCA) because of its central role in neurodegenerative diseases, particularly a group of disorders known as synucleinopathies.

    Synucleinopathies are a group of chronic and progressive neurodegenerative diseases characterized by the abnormal accumulation of misfolded alpha-synuclein protein. This misfolded protein aggregates into insoluble inclusions known as Lewy bodies, which are found in neurons and glial cells. The major synucleinopathies include Parkinson’s disease, Dementia with Lewy bodies, and Multiple system atrophy. These disorders typically appear in late adulthood and progressively worsen over time.

    Alpha-synuclein is a neuronal protein encoded by the SNCA gene in humans. Under normal physiological conditions, it plays a role in synaptic vesicle regulation and neurotransmitter release. However, in neurodegenerative conditions, the protein undergoes misfolding and aggregation, forming toxic oligomers and fibrils that disrupt cellular homeostasis and ultimately lead to neuronal death.

    Because protein misfolding and aggregation are central mechanisms in many neurodegenerative diseases, studying alpha-synuclein at the sequence level provides important insight into disease pathology and potential therapeutic strategies.

    `>sp|P37840|SYUA_HUMAN Alpha-synuclein OS=Homo sapiens OX=9606 GN=SNCA PE=1 SV=1`
    `MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEA`

    Sequence taken from UniProt.

  • 3.2 Reverse Translation: Protein (amino acid sequence) → DNA (nucleotide sequence)
    Because of the degeneracy of the genetic code, multiple nucleotide sequences can encode the same amino acid sequence. To obtain a DNA sequence corresponding to human alpha-synuclein, I used the Sequence Manipulation Suite Reverse Translate tool. Using the Escherichia coli codon usage table, the program generated a 420 base pair non-degenerate DNA sequence representing the most likely coding sequence for bacterial expression.

    The resulting optimized nucleotide sequence is:
    atggatgtgtttatgaaaggcctgagcaaagcgaaagaaggcgtggtggcggcggcggaa aaaaccaaacagggcgtggcggaagcggcgggcaaaaccaaagaaggcgtgctgtatgtg ggcagcaaaaccaaagaaggcgtggtgcatggcgtggcgaccgtggcggaaaaaaccaaa gaacaggtgaccaacgtgggcggcgcggtggtgaccggcgtgaccgcggtggcgcagaaa accgtggaaggcgcgggcagcattgcggcggcgaccggctttgtgaaaaaagatcagctg ggcaaaaacgaagaaggcgcgccgcaggaaggcattctggaagatatgccggtggatccg gataacgaagcgtatgaaatgccgagcgaagaaggctatcaggattatgaaccggaagcg

  • 3.3 Codon Optimization
    Although the genetic code is universal, different organisms exhibit codon usage bias, meaning that certain codons are used more frequently than others. Using rare codons in a host organism can reduce translation efficiency, slow protein synthesis, and negatively impact protein folding.

    To improve expression efficiency, I optimized the alpha-synuclein coding sequence for Escherichia coli, a commonly used bacterial expression system in molecular biology. Using the Reverse Translate tool with an E. coli codon usage table, the most frequently used codon was selected for each amino acid. This increases the likelihood of efficient translation when the gene is expressed in bacterial cells.

  • 3.4 How Can This DNA Be Used to Produce Alpha-Synuclein?

    The optimized alpha-synuclein DNA sequence can be used to produce protein through recombinant expression in a bacterial system. Below is a step-by-step explanation of how the DNA sequence is transcribed and translated into protein.

    Step 1: Gene Synthesis and Cloning
    The codon-optimized alpha-synuclein DNA sequence is first chemically synthesized and inserted into a plasmid vector. The plasmid contains a strong promoter (such as T7), a ribosome binding site, an antibiotic resistance gene for selection, and a transcription terminator. This plasmid serves as a vehicle to introduce and express the gene inside bacterial cells.

    Step 2: Transformation into Bacterial Cells
    The recombinant plasmid is introduced into competent Escherichia coli cells through heat shock or electroporation. The bacteria are then plated on antibiotic-containing media so that only cells that successfully incorporate the plasmid survive.

    Step 3: Transcription (DNA → mRNA)
    Inside the bacterial cell, RNA polymerase binds to the promoter and transcribes the alpha-synuclein DNA sequence into messenger RNA (mRNA). During transcription, thymine (T) in DNA is replaced by uracil (U) in RNA.

    Step 4: Translation (mRNA → Protein)
    The mRNA binds to ribosomes, which read the sequence in sets of three nucleotides (codons). Each codon specifies one amino acid. Transfer RNAs (tRNAs) deliver the corresponding amino acids, which are linked together to form the alpha-synuclein protein. Because the gene was codon-optimized for E. coli, translation efficiency is increased.

    Step 5: Protein Folding and Purification
    After translation, the protein adopts its native conformation (alpha-synuclein is largely intrinsically disordered in solution). The bacterial cells can then be lysed, and the protein purified using methods such as affinity chromatography, ion exchange chromatography, or size exclusion chromatography.

    In summary, the optimized DNA sequence is transcribed into RNA and translated into alpha-synuclein protein, enabling large-scale production for research on neurodegenerative diseases such as Parkinson’s disease.

  • 3.5 How Does It Work in Nature?

    In biological systems, genetic information flows according to the Central Dogma: DNA → RNA → Protein. A single gene can give rise to multiple protein products through mechanisms such as alternative splicing, alternative promoters, and post-transcriptional modifications.

    The human SNCA gene primarily encodes the 140-amino-acid alpha-synuclein protein. However, alternative splicing can generate shorter isoforms (e.g., 126 or 112 amino acids), demonstrating how one gene can produce multiple protein variants.

    Below is the alignment of the designed DNA sequence with its corresponding RNA and translated protein:

    DNA
    ATG GAT GTG TTT ATG AAA GGC CTG AGC
    ↓ Transcription
    mRNA
    AUG GAU GUG UUU AUG AAA GGC CUG AGC
    ↓ Translation
    Protein
    Met – Asp – Val – Phe – Met – Lys – Gly – Leu – Ser
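    This alignment of the first nine codons shows that the optimized DNA sequence starts in the correct reading frame. Translating the full 420-nucleotide sequence in the same way gives the 140-residue alpha-synuclein sequence without frame shifts or internal stop codons.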
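The reverse translation, transcription, and translation steps in Part 3 can be checked with a short round trip. This is a sketch under stated assumptions: the function names are hypothetical, and the `PREFERRED` codon map covers only the residues in the nine-residue example, with codons read off the designed sequence above rather than taken from a full E. coli usage table. The codon table itself is the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Preferred codons as they appear in the designed sequence (partial map,
# only the residues needed for this example).
PREFERRED = {"M": "ATG", "D": "GAT", "V": "GTG", "F": "TTT",
             "K": "AAA", "G": "GGC", "L": "CTG", "S": "AGC"}


def reverse_translate(protein):
    """Protein -> DNA, choosing one preferred codon per amino acid."""
    return "".join(PREFERRED[aa] for aa in protein)


def transcribe(dna):
    """Coding-strand DNA -> mRNA (T replaced by U)."""
    return dna.upper().replace("T", "U")


def translate(dna):
    """DNA -> protein, reading codons from position 0; '*' marks a stop."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


peptide = "MDVFMKGLS"  # first nine residues of alpha-synuclein
dna = reverse_translate(peptide)
print(dna)
print(transcribe(dna))
print(translate(dna))
```

Running `translate` on the full designed sequence (with spaces removed) and checking that the result equals the UniProt sequence and contains no `*` is a quick way to confirm the reading frame for the whole gene.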
Part 4: Prepare a Twist DNA Synthesis Order
Part 5: DNA Read/Write/Edit - First draft
  • 5.1 DNA Read

    (i) What DNA would you want to sequence and why?
    For this project, I would focus on genes commonly mutated in endometrial adenocarcinoma, such as TP53, PTEN, and PIK3CA. Sequencing these genes allows identification of mutations that drive tumor progression and informs targeted therapy strategies.

    Example source: NCBI Gene Database for TP53: https://www.ncbi.nlm.nih.gov/gene/7157

    (ii) Sequencing technologies to use and why:
    - I would use next-generation sequencing (NGS, second-generation) due to its high throughput and accuracy for detecting somatic mutations.
    - Input preparation: DNA extraction from tissue, fragmentation, adapter ligation, PCR amplification.
    - Sequencing steps: Cluster generation on flow cell, cyclic reversible termination (base incorporation), imaging, base calling.
    - Output: FASTQ files with read sequences, aligned to human reference genome to identify mutations.

  • 5.2 DNA Write

    (i) What DNA would you want to synthesize and why?
    To simulate or study therapeutic interventions, I could synthesize DNA fragments of TP53, PTEN, or PIK3CA with specific mutations. These could be used in cell line models to test gene expression, protein production, or biosensor detection in a controlled environment.

    Example applications:
    - Genetic circuits for biosensors embedded in the IUD to detect tumor markers.
    - DNA origami structures for controlled drug release or as scaffolds for sensor integration.

    (ii) Technology to perform DNA synthesis and why:
    - I would use oligonucleotide synthesis and assembly methods (Twist Bioscience, IDT), which are precise and scalable.
    - Essential steps: Phosphoramidite synthesis, purification, assembly into larger constructs.
    - Limitations: Sequence length constraints (usually <1–2 kb per construct), cost for large-scale synthesis, and potential errors in repetitive sequences.

  • 5.3 DNA Edit

    (i) What DNA would you want to edit and why?
    I would target mutated TP53 or PTEN alleles in endometrial cells. Editing these genes could theoretically restore normal tumor suppressor function, providing a model for localized gene therapy integrated with IUD-mediated monitoring.

    For example, the company Colossal Biosciences demonstrates genome engineering in animals; similarly, precise editing tools can model human therapeutic strategies.

    (ii) Technology to perform DNA edits and why:
    - I would use CRISPR/Cas9 or base editors for targeted nucleotide correction.
    - Essential steps: Design guide RNA complementary to the mutation, deliver Cas9 and guide RNA into cells, allow DNA repair mechanisms to introduce the correct sequence.
    - Input preparation: DNA template or plasmid with Cas9 and guide RNA, target cells (e.g., endometrial cell line), transfection reagents.
    - Limitations: Off-target effects, variable editing efficiency, delivery challenges in vivo.
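The guide RNA design step in 5.3 can be illustrated with a minimal candidate finder. This sketch assumes SpCas9, which recognizes a 20-nucleotide protospacer followed by an NGG PAM. The function name `find_spcas9_guides` and the toy sequence are hypothetical, only the given strand is scanned, and no off-target or efficiency scoring is attempted.

```python
import re


def find_spcas9_guides(seq, guide_len=20):
    """Return (start, protospacer, PAM) for every NGG PAM on the given strand."""
    seq = seq.upper()
    # A lookahead lets overlapping candidates be reported as well.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Toy sequence (hypothetical): a 20 nt protospacer, an AGG PAM, then filler.
toy = "GATTACAGATTACAGATTAC" + "AGG" + "TTT"
for start, protospacer, pam in find_spcas9_guides(toy):
    print(start, protospacer, pam)
```

A real design would also scan the reverse complement and filter candidates by distance to the target mutation and predicted off-target sites.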

📚 References
  1. NCBI. (n.d.). Lambda phage genome, complete sequence. (link)
  2. Benchling. (n.d.). Lambda phage genome project – Week 2. (link)
  3. UniProt. (n.d.). Alpha-synuclein (SNCA) – Homo sapiens. (link)
  4. Bioinformatics.org. (n.d.). Sequence Manipulation Suite – Reverse Translate. (link)
  5. NCBI Gene. (n.d.). TP53 tumor protein p53 [Homo sapiens]. (link)
  6. R.C. Donovan. (n.d.). Gel art. (link)
  7. HTGAA 2026a. (2026). Week 2 Lab: Benchling & In-silico Gel Art. Course materials. (link)
  8. OpenAI. (2026). ChatGPT (GPT-5.2) [Large language model]. Used for drafting support and conceptual clarification. (link)
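  9. This topic is related to my thesis project: Diseño Computacional de un DIU Antitumoral con Liberación Controlada y Monitoreo para Terapia Localizada en Adenocarcinoma Endometrial (Computational Design of an Antitumoral IUD with Controlled Release and Monitoring for Localized Therapy in Endometrial Adenocarcinoma).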

Week 2 Lecture Preparation Assignment

Assignment (Week 2 Lecture Prep): DNA Read, Write, and Edit

Professor Jacobson - DNA Copying & Coding

1. What is the error rate of DNA polymerase?

  • Nature’s biological machinery for copying DNA, error-correcting polymerase, has an error rate of approximately 1:10^6 (one in a million).
  • For comparison, the human genome is approximately 3 billion base pairs (3 Gb) long, so at this rate a single copy of the genome would carry on the order of thousands of errors.
  • To handle the discrepancy between this error rate and the vast size of the genome, biology utilizes multiple proofreading mechanisms, including 5’-3’ error-correcting exonucleases and 3’-5’ proofreading exonucleases during template-dependent primer extension.
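The comparison between error rate and genome size is a one-line calculation. The sketch below uses only the two figures quoted in the notes; the variable names are illustrative.

```python
genome_bp = 3_000_000_000   # approximate human genome length (3 Gb)
one_error_per = 1_000_000   # polymerase error rate of 1 in 10^6 bases

# Expected number of errors in one full copy of the genome at this rate.
expected_errors = genome_bp / one_error_per
print(expected_errors)
```

An expected count in the thousands per genome copy is why additional proofreading and repair layers are needed on top of the polymerase itself.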

2. Coding for a Human Protein

  • For an average human protein (approximately 1,036 bp or 345 amino acids), an enormous number of DNA sequences can code for it because of the redundancy of the genetic code, in which multiple codons often translate to the same amino acid.
  • In practice, many of these potential codes do not work because of technical “complexities” that hinder synthesis and biological function, such as extreme GC content (local content ≥90% or ≤10%), hairpins and secondary structures, long repeats (tandem or inverted), and homopolymers (runs of an identical base longer than 30 bp).
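Both points can be made concrete with a small sketch: counting how many DNA sequences encode a given protein, and screening a candidate sequence for two of the complexities listed above. The function names are hypothetical, the codon degeneracies come from the standard genetic code, and the short test inputs are toy examples rather than real gene fragments.

```python
from collections import Counter
from itertools import groupby
from math import prod

# Standard genetic code as a 64-letter string (codon order T, C, A, G).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
DEGENERACY = Counter(AMINO)  # number of codons per amino acid


def count_encodings(protein):
    """Number of distinct DNA sequences that encode `protein`."""
    return prod(DEGENERACY[aa] for aa in protein)


def gc_fraction(dna):
    """Fraction of G and C bases in a non-empty sequence."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)


def max_homopolymer(dna):
    """Length of the longest run of a single repeated base."""
    return max((len(list(run)) for _, run in groupby(dna.upper())), default=0)


print(count_encodings("MKL"))      # Met (1) x Lys (2) x Leu (6)
print(gc_fraction("GGCCAATT"))
print(max_homopolymer("AATTTTG"))
```

Because the count multiplies across residues, it grows exponentially with protein length, which is why screening for synthesis constraints still leaves many usable encodings.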
Dr. LeProust - DNA Writing & Oligo Synthesis

1. What is the most commonly used method for oligo synthesis today?

The dominant method is solid-phase phosphoramidite chemistry, which synthesizes DNA one nucleotide at a time in the 3’ → 5’ direction.

2. Why is it difficult to synthesize oligos longer than ~200 nt?

Direct oligonucleotide synthesis becomes increasingly difficult as sequence length increases because each synthesis cycle introduces a small probability of error. As the number of base additions grows, these errors accumulate, leading to an exponential decrease in yield. In practice, this means that truncated products and incorrect sequences begin to dominate the final mixture.

Beyond approximately 200 nucleotides, the fraction of full-length, sequence-correct oligos becomes very low. This is a direct consequence of step-wise coupling inefficiency: even when individual base addition steps are highly efficient, the cumulative probability of producing a perfect sequence decreases exponentially with length. If each coupling step succeeds with probability p, the full-length yield after N steps is approximately p^N. For example, a per-step failure rate of about 1/N gives (1−1/N)^N ≈ 1/e ≈ 37%, which for N = 200 corresponds to a coupling efficiency of 99.5%; at 99% efficiency the yield of the same 200-mer falls to roughly 13%.
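The exponential decay of yield with length can be tabulated directly. This sketch assumes the simple model in which every coupling step succeeds independently with the same probability, and it approximates the number of couplings by the oligo length. The coupling efficiencies shown are illustrative values, not measured figures from the lecture.

```python
def full_length_yield(coupling_efficiency, length_nt):
    """Fraction of molecules that are full length after `length_nt` couplings."""
    return coupling_efficiency ** length_nt


for p in (0.99, 0.995):
    for n in (60, 200, 2000):
        print(f"p={p}, N={n}: yield={full_length_yield(p, n):.2e}")
```

Under this model the yield at 2,000 nt is several orders of magnitude below the yield at 200 nt for either efficiency, which is the quantitative reason direct synthesis stops being practical at gene length.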

3. Why can’t a 2000 bp gene be made via direct oligo synthesis?

A 2,000 bp gene cannot be produced via direct oligonucleotide synthesis because the cumulative yield loss across roughly 2,000 coupling cycles would leave essentially no sequence-perfect molecules. As synthesis length increases, small inefficiencies at each base addition compound exponentially, making full-length, error-free products exceedingly rare.

As a result, direct synthesis of long genes would lead to extremely low yields of full-length DNA, high error rates, and prohibitive costs. Instead, long genes are constructed by assembling shorter, sequence-verified oligonucleotides or gene fragments using methods such as enzymatic assembly, PCR-based assembly, or Gibson Assembly, which allow errors to be corrected at intermediate steps and dramatically improve overall fidelity.
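The assembly strategy can be illustrated by splitting a long sequence into short overlapping oligos. This is a deliberately simplified sketch: the function name `tile_oligos` and the default lengths are hypothetical, all oligos are taken from the same strand, and real designs typically alternate strands and balance the melting temperatures of the overlaps.

```python
def tile_oligos(seq, oligo_len=60, overlap=20):
    """Split `seq` into oligos of at most `oligo_len` nt that overlap by `overlap` nt."""
    if not 0 <= overlap < oligo_len:
        raise ValueError("overlap must be smaller than oligo_len")
    step = oligo_len - overlap
    oligos = []
    for start in range(0, len(seq), step):
        oligos.append(seq[start:start + oligo_len])
        if start + oligo_len >= len(seq):
            break  # the last oligo already reaches the end of the sequence
    return oligos


# Toy 100 nt template (hypothetical repeat, used only to show the tiling).
template = "ACGT" * 25
oligos = tile_oligos(template)
print(len(oligos), [len(o) for o in oligos])
```

Each oligo stays well below the length where direct synthesis yield collapses, and the shared overlaps are what allow PCR-based or Gibson-style assembly to stitch the pieces back together in order.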

Professor George Church - Essential Amino Acids & Lysine Contingency

Question Chosen

What are the 10 essential amino acids in animals, and how does this affect the “Lysine Contingency”?

The 10 Essential Amino Acids

Animals cannot synthesize these amino acids and must obtain them from their diet:

  • Histidine
  • Isoleucine
  • Leucine
  • Lysine
  • Methionine
  • Phenylalanine
  • Threonine
  • Tryptophan
  • Valine
  • Arginine (essential in many animals, especially during growth)

Implications for the Lysine Contingency

Lysine is one of these essential amino acids, so every animal already depends on dietary lysine. This weakens the “Lysine Contingency” as a containment strategy: an animal engineered to be unable to make lysine is no different from a natural one and can obtain lysine from ordinary food. A dependency only acts as containment when the required nutrient is unavailable outside the controlled environment.

Metabolic dependencies designed around such a nutrient can still be used as safety mechanisms, embedding governance directly into biological design rather than relying solely on external regulation.

Advanced Section - ARPA-H BoSS: BioStabilization Systems

Program Chosen: BioStabilization Systems (BoSS)

Proposed Concept: Nature-Inspired Modular Biostabilization for Cell Therapies

The BoSS program addresses a critical bottleneck in modern medicine: the extreme fragility of biologics and cell therapies and their dependence on an expensive, failure-prone cold chain. I propose a modular biostabilization strategy inspired by naturally stress-tolerant organisms, combined with scalable materials engineering.

Core Idea

Develop a hybrid intracellular–extracellular stabilization platform that enables long-term room-temperature storage of cell therapies by combining:

  1. Intracellular protection

    • Transient expression or loading of protective molecules inspired by anhydrobiotic organisms (e.g., trehalose analogs, intrinsically disordered stress proteins).
    • These molecules stabilize membranes and proteins during dehydration and rehydration cycles without permanently altering cellular identity.
  2. Extracellular engineered matrices

    • Encapsulation of cells within biocompatible, glass-like polymer matrices that mimic vitrification without freezing.
    • Materials engineered to reversibly dissolve upon rehydration, enabling rapid cell recovery at the point of care.

Technical Approach

  • Screen and optimize protective molecules using high-throughput assays for post-rehydration viability and function.
  • Engineer scalable cell-processing workflows compatible with existing biomanufacturing pipelines.
  • Design restoration protocols that ensure rapid functional recovery without specialized infrastructure.

Why This Fits BoSS

This approach directly addresses both BoSS technical areas:

  • Cell interventions for stabilization and restoration
  • Scalable processing systems for deployment and commercialization

By eliminating ultra-cold storage requirements, this system would dramatically reduce costs, improve resilience to supply chain disruptions, and expand access to advanced biologics in remote and resource-limited settings.

Broader Impact

If successful, room-temperature biologics would:

  • Enable stockpiling of life-saving therapies
  • Reduce product loss due to logistics failures
  • Democratize access to advanced medicine beyond major medical centers

This proposal treats stability not as a logistical afterthought, but as a design constraint embedded directly into biological systems, aligning with BoSS’s vision of transformative, nature-inspired biotechnologies.

📚 References
  1. Jacobson, J. M. (2026). Lecture 2 Slides: DNA Read, Write, and Edit. HTGAA Spring 2026.
  2. LeProust, E. M. (2026). Lecture 2 Slides: DNA Synthesis Technologies. HTGAA Spring 2026.
  3. Church, G. M. (2026). Lecture 2 Slides: DNA Codes and Biological Design. HTGAA Spring 2026.
  4. OpenAI. (2026). ChatGPT (GPT-5.2) [Large language model]. Used for drafting support and conceptual clarification.
  5. Google. (2026). NotebookLM [AI-assisted research tool]. Used for document synthesis and note organization.
  6. ARPA-H. (2026). BioStabilization Systems (BoSS) Program [Online funding program]. Advanced Research Projects Agency for Health. https://arpa-h.gov/explore-funding/programs/boss