Alexandra — HTGAA Spring 2026


About me

I am a third-year genetics student in Peru 🇵🇪. I’m interested in understanding biological phenomena from different approaches and at multiple scales, with the aim of developing comprehensive and effective treatments and technologies (mainly for chronic diseases). I enjoy doing a variety of things; lately I’ve been exploring mathematical modeling, music, and quantum physics. I love playing volleyball, cycling, and open water swimming.

Contact info

LinkedIn

Weeks

Subsections of Alexandra — HTGAA Spring 2026

Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices

  • Week 2 HW: Read, write and edit DNA

    Includes question solving as preparation

  • Week 3 HW: Lab automation

    Using Python to run Opentrons liquid handling robot

  • Week 4 HW: Protein design - Part 1

    Exploring the world of AI

Subsections of Homework

Week 1 HW: Principles and Practices

1. “An application of synthetic biology that I recently found out about, and I’m really excited for, is partial cellular reprogramming.”

It’s achieved by inducing the expression of the so-called “Yamanaka factors”, which push a cell back toward a pluripotent state in which DNA methylation patterns and chromatin architecture are reset to a younger state, a “rejuvenation” process that enhances health and lifespan. (1, 2) Recently, a treatment for optic neuropathies based on this mechanism was cleared by the FDA for its first clinical trials; it consists of an AAV2 vector that carries the Oct4, Sox2, and Klf4 (OSK) factors. Systemic doxycycline administration is needed for almost two months to activate OSK expression. (3)

What if we could engineer a self-regulated genetic circuit that uses biosensors to detect aging or disease markers, such as transcription factors that drive the senescence-associated secretory phenotype (SASP), and then activates itself and begins reprogramming? To ensure non-maleficence, the system incorporates a failsafe kill switch: a safety module that induces apoptosis if the cell loses its differentiated identity or if markers of full pluripotency, such as Nanog, are detected, reducing the risk of cancer. By the time a “safe-by-design” technology like this is approved and available to the general public, some important regulations must be in place.
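The decision logic of the proposed circuit can be sketched in a few lines of Python. This is only an illustrative model, not a design: the function name and the three boolean inputs (`senescence_marker`, `nanog_detected`, `identity_lost`) are hypothetical stand-ins for the biosensor signals described above, and I am assuming the safety module always takes priority over the sensor module.

```python
def circuit_state(senescence_marker: bool, nanog_detected: bool, identity_lost: bool) -> str:
    """Return the action a cell carrying the (hypothetical) circuit would take."""
    # Safety module first: the kill switch overrides everything else.
    if nanog_detected or identity_lost:
        return "apoptosis"
    # Sensor module: aging or disease markers switch on OSK expression.
    if senescence_marker:
        return "reprogramming"
    # No marker detected: the circuit stays silent.
    return "idle"


if __name__ == "__main__":
    for inputs in [(False, False, False), (True, False, False), (True, True, False)]:
        print(inputs, "->", circuit_state(*inputs))
```

Checking the kill switch before the sensor is the design choice that matters here: a cell that is both senescent and Nanog-positive is removed rather than reprogrammed further.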

2. The policy goals

Biosafety:

  • Genetic safety and vectors: Ensure that the reprogramming does not result in oncogenesis and that the vectors used (such as AAV2) do not cause insertional mutagenesis or unforeseen infections (due to their viral nature).
  • Responsible usage: Restrict access to reprogramming vectors to authorized hospitals to prevent self-administration and to prevent the genetic circuit from being altered or replicated at home without adequate security controls. Nobody should be able to modify or hack the genetic circuit or the vectors.

Public accessibility and equity:

  • Universal access: Everyone who needs the treatment should be able to access it, so there should be financing or insurance programs that guarantee it.
  • Inclusive standards: The treatment must work across global genetic diversity, remaining effective regardless of ethnicity.
  • Transparency: People should be taught the nature of what is applied to them and informed of the possible risks.

3. The potential actions and the actors:

Actors: Researchers, Physicians, Industry, Government and Patients

1. Enhance safety by standard design:

Main actors: Researchers, Industry

  • Researchers must develop a genetic circuit design that induces apoptosis whenever there is a risk of oncogenesis or of full dedifferentiation; industry must standardize that design to reduce risk.
  • We are assuming that the pluripotency indicators (such as Nanog) are completely accurate.
  • Failure here results in the formation of teratomas or tumors.

2. Funding to ensure equity and access:

Main actors: Government, Physicians, Patients

  • The government must finance treatment for patients who need it but can’t afford it (for example, by funding the program with taxes on purely cosmetic longevity treatments) and incentivize the creation of genomic databases of different ethnic groups. Physicians must contribute to these databases by taking samples while serving in those groups.
  • We assume that the cost of such treatment decreases as time passes and advances are made in the field.
  • Failure here results in discrimination (economic and/or ethnic) and leaves people without support.

3. Development of transgene control systems

Main actors: Physicians, Researchers

  • Researchers should develop sophisticated control systems: not only a synthetic chemical compound (drug) that inhibits the circuit by marking its proteins for degradation, but also an inhalation-mediated mechanism in which a volatile compound triggers an inducible promoter driving a regulatory protein that rapidly degrades the Yamanaka factors. Physicians should prescribe these inhibitors to stop unwanted side effects that some patients might experience (such as oncogenesis).
  • We assume that the transgene control systems are effective and fast enough to reverse a dedifferentiation process before it becomes irreversible or tumorous.
  • Failure here results in possibly untreatable dysregulation of the system.

4. Scoring the options

| Does the option: | Option 1 | Option 2 | Option 3 |
|---|---|---|---|
| Enhance Biosecurity | | | |
| • By preventing incidents | 1 | - | - |
| • By helping respond | - | - | 1 |
| Foster Lab Safety | | | |
| • By preventing incidents | - | - | - |
| • By helping respond | - | - | - |
| Protect the environment | | | |
| • By preventing incidents | - | - | - |
| • By helping respond | - | - | - |
| Other considerations | | | |
| • Minimizing costs and burdens to stakeholders | 3 | 1 | 3 |
| • Feasibility? | 1 | 3 | 2 |
| • Not impede research | 1 | 2 | 1 |
| • Promote constructive applications | 3 | 2 | 3 |

5. What to prioritize

I believe that options 1 and 3 are the most important in this case, so I would recommend developing a standard circuit design that is self-regulating and also has an external regulation mechanism that a physician can use to stop the process in any patient when they judge it necessary. I would suggest that the FDA make this kind of approach a prerequisite for any autonomous reprogramming system intended for human use.

6. Bibliography

  1. Takahashi & Yamanaka (2006) Induction of Pluripotent Stem Cells from Mouse Embryonic and Adult Fibroblast Cultures by Defined Factors. Cell 126, 663–676. http://dx.doi.org/10.1016/j.cell.2006.07.024
  2. Schmidt & Plath (2012) The roles of the reprogramming factors Oct4, Sox2 and Klf4 in resetting the somatic cell epigenome during induced pluripotent stem cell generation. http://genomebiology.com/2012/13/10/251
  3. Ocampo et al. (2016) In Vivo Amelioration of Age-Associated Hallmarks by Partial Reprogramming. Cell 167, 1719–1733. http://dx.doi.org/10.1016/j.cell.2016.11.052
  4. Macip et al. (2024) Gene Therapy-Mediated Partial Reprogramming Extends Lifespan and Reverses Age-Related Changes in Aged Mice. http://dx.doi.org/10.1089/cell.2023.0072
  5. Life Biosciences (2026) Evaluating ER-100 for Safety in People With Glaucoma or Non-Arteritic Anterior Ischemic Optic Neuropathy (Optic Nerve Conditions). https://clinicaltrials.gov/study/NCT07290244

7. Searches:

  • Google: “Life Biosciences”
  • Gemini (prompts translated from Spanish):
  1. “I would like you to clarify the wording of the introduction to the synthetic biology application that I am going to discuss in my homework.”
  2. “I read these two papers, Takahashi & Yamanaka (2006) and Schmidt & Plath (2012). They discuss the OSK factors. Have there already been advances in therapies developed with them? What are the most recent advances?”

![cover image](Molly chiquita y grande.JPG)

Week 2 HW: Read, write and edit DNA

Preparation for Week 2 lecture:

About next generation gene synthesis

  1. What is the error rate of polymerase? 1:10^6

  2. How does this compare to the length of the human genome? The human genome consists of about 3.2 billion base pairs (3.2 Gbp), so at an error rate of 1 in 10^6, roughly 3,200 errors would be expected when a polymerase copies the entire genome once. But since human metabolic pathways are at most about 10 kbp long, the error rate is negligible in most contexts.

  3. How does biology deal with that discrepancy? Many polymerases work in parallel and the genome is divided into chromosomes, so each polymerase copies only a fraction of it and the chance of an error in any one stretch is low. In addition, there are DNA mismatch repair systems (such as the MutS repair system).

  4. How many different ways are there to code for an average human protein? The average human protein is encoded by about 1036 bp (roughly 345 codons), and the genetic code is degenerate: most amino acids can be encoded by several different codons. The number of possible coding sequences is the product of the number of codons available at each position, so it is astronomically large.

  5. In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest? Mainly because different codon choices change which bases can pair with each other. Some sequences form hairpin-like secondary structures (as happens in RNA), which can stall or terminate translation before the protein is complete. The error rate of synthesis also matters.
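The arithmetic behind answers 2 and 4 can be checked with a short Python sketch. The function names are my own, the genome size and error rate are the figures quoted above, and the codon counts are those of the standard genetic code; the six-residue peptide is just the start of the Oct4 sequence used later on this page, chosen to keep the example small.

```python
import math

# Number of codons that encode each amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def expected_errors(length_bp: float, error_rate: float) -> float:
    """Expected number of errors when copying `length_bp` bases once."""
    return length_bp * error_rate


def synonymous_sequences(protein: str) -> int:
    """Number of distinct DNA sequences that encode `protein` (stop codon not counted)."""
    return math.prod(CODON_COUNTS[aa] for aa in protein)


if __name__ == "__main__":
    print("Expected errors per genome copy:", expected_errors(3.2e9, 1e-6))
    print("Ways to encode MAGHLA:", synonymous_sequences("MAGHLA"))
```

Because the count is a product over positions, it grows exponentially with protein length, which is why "a lot" is an understatement for a protein of a few hundred residues.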

From DNA Synthesis Development and Application:

  1. What’s the most commonly used method for oligo synthesis currently? Phosphoramidite method
  2. Why is it difficult to make oligos longer than 200nt via direct synthesis? Errors accumulate: every coupling step of chemical synthesis has a small failure rate, so the fraction of correct full-length product drops with each added nucleotide.
  3. Why can’t you make a 2000bp gene via direct oligo synthesis? Because the accumulated error rate makes it, probabilistically speaking, almost impossible to obtain a correct full-length product. Gene assembly from shorter oligos is used instead.
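A rough way to see why length is the limit: if each coupling step succeeds with probability p, an oligo of n nucleotides needs n - 1 successful couplings, so the full-length fraction is about p^(n-1). The sketch below assumes a 99% coupling efficiency purely for illustration; real values depend on the chemistry and the instrument.

```python
def full_length_fraction(length_nt: int, coupling_efficiency: float) -> float:
    """Fraction of molecules that reach full length, assuming independent coupling steps."""
    if length_nt < 1:
        raise ValueError("length_nt must be at least 1")
    # The first nucleotide is attached to the support, so n nucleotides need n - 1 couplings.
    return coupling_efficiency ** (length_nt - 1)


if __name__ == "__main__":
    ASSUMED_EFFICIENCY = 0.99  # illustrative assumption, not a measured value
    for n in (20, 200, 2000):
        print(n, "nt:", full_length_fraction(n, ASSUMED_EFFICIENCY))
```

Under this assumption roughly 13% of 200 nt molecules are full length, while for 2000 nt the fraction is vanishingly small, which matches the reasoning in answers 2 and 3.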

From reading and writing life:

What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?

  • 10 essentials: Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Threonine, Tryptophan, Valine, Arginine.

    The lysine contingency would work well if dinosaurs were as easy to isolate from external resources as bacteria in a petri dish. After all, essential amino acids are those we can’t produce ourselves but obtain by eating other organisms that contain them. It would be very difficult to cut off every other source of lysine in the environment, so I would recommend another approach.

Google Search: “Lysine Contigency”

Week 2 assignments:

(Part 2 consisted of “Gel Art - Restriction Digests and Gel Electrophoresis”, intended to be carried out in the laboratory. Since I do not have access to one, I did not complete it.)

Subsections of Week 2 HW: Read, write and edit DNA

Part 1: Benchling & In silico gel art

My attempt at drawing a Teletubby

Part 3: DNA Design Challenge

1. Obtaining the protein amino acid sequence

I went to the UniProt page and retrieved the Homo sapiens PO5F1 sequence:

https://www.uniprot.org/uniprotkb/Q01860/entry#sequences

Q01860-1: This isoform has been chosen as the canonical sequence

>sp|Q01860|PO5F1_HUMAN POU domain, class 5, transcription factor 1 OS=Homo sapiens OX=9606 GN=POU5F1 PE=1 SV=1
MAGHLASDFAFSPPPGGGGDGPGGPEPGWVDPRTWLSFQGPPGGPGIGPGVGPGSEVWGI
PPCPPPYEFCGGMAYCGPQVGVGLVPQGGLETSQPEGEAGVGVESNSDGASPEPCTVTPG
AVKLEKEKLEQNPEESQDIKALQKELEQFAKLLKQKRITLGYTQADVGLTLGVLFGKVFS
QTTICRFEALQLSFKNMCKLRPLLQKWVEEADNNENLQEICKAETLVQARKRKRTSIENR
VRGNLENLFLQCPKPTLQQISHIAQQLGLEKDVVRVWFCNRRQKGKRSSSDYAQREDFEA
AGSPFSGGPVSFPLAPGPHFGTPGYGSPHFTALYSSVPFPEGEAFPPVSVTTLGSPMHSN

OCT4-SOX2-bound nucleosome - SHL-6 From: Science 368 1460-1465 (2020) PMID 32327602 DOI 10.1126/science.abb0074

2. DNA sequence:

PO5F1 protein DNA sequence ATGGCGGGACACCTGGCTTCGGATTTCGCCTTCTCGCCCCCTCCAGGTGGTGGAGGTGATGGGCCAGGGGGGCCGGAGCCGGGCTGGGTTGATCCTCGGACCTGGCTAAGCTTCCAAGGCCCTCCTGGAGGGCCAGGAATCGGGCCGGGGGTTGGGCCAGGCTCTGAGGTGTGGGGGATTCCCCCATGCCCCCCGCCGTATGAGTTCTGTGGGGGGATGGCGTACTGTGGGCCCCAGGTTGGAGTGGGGCTAGTGCCCCAAGGCGGCTTGGAGACCTCTCAGCCTGAGGGCGAAGCAGGAGTCGGGGTGGAGAGCAACTCCGATGGGGCCTCCCCGGAGCCCTGCACCGTCACCCCTGGTGCCGTGAAGCTGGAGAAGGAGAAGCTGGAGCAAAACCCGGAGGAGTCCCAGGACATCAAAGCTCTGCAGAAAGAACTCGAGCAATTTGCCAAGCTCCTGAAGCAGAAGAGGATCACCCTGGGATATACACAGGCCGATGTGGGGCTCACCCTGGGGGTTCTATTTGGGAAGGTATTCAGCCAAACGACCATCTGCCGCTTTGAGGCTCTGCAGCTTAGCTTCAAGAACATGTGTAAGCTGCGGCCCTTGCTGCAGAAGTGGGTGGAGGAAGCTGACAACAATGAAAATCTTCAGGAGATATGCAAAGCAGAAACCCTCGTGCAGGCCCGAAAGAGAAAGCGAACCAGTATCGAGAACCGAGTGAGAGGCAACCTGGAGAATTTGTTCCTGCAGTGCCCGAAACCCACACTGCAGCAGATCAGCCACATCGCCCAGCAGCTTGGGCTCGAGAAGGATGTGGTCCGAGTGTGGTTCTGTAACCGGCGCCAGAAGGGCAAGCGATCAAGCAGCGACTATGCACAACGAGAGGATTTTGAGGCTGCTGGGTCTCCTTTCTCAGGGGGACCAGTGTCCTTTCCTCTGGCCCCAGGGCCCCATTTTGGTACCCCAGGCTATGGGAGCCCTCACTTCACTGCACTGTACTCCTCGGTCCCTTTCCCTGAGGGGGAAGCCTTTCCCCCTGTCTCCGTCACCACTCTGGGCTCTCCCATGCATTCAAAC
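One way to confirm that a DNA sequence really encodes the intended protein is to translate it in silico and compare. The sketch below is a minimal translator built on the standard genetic code; the `translate` helper is my own, and the example input is only the first eight codons of the sequence above, so it illustrates the check rather than replacing the Benchling translation.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence from its first base, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # First eight codons of the POU5F1 coding sequence shown above.
    print(translate("ATGGCGGGACACCTGGCTTCGGAT"))  # MAGHLASD
```

Running the full coding sequence through `translate` and comparing the result with the UniProt entry would verify the whole reverse translation.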

3. Gene and CDS

https://www.ncbi.nlm.nih.gov/nuccore/NM_002701.6?report=genbank

I went to NCBI GenBank and retrieved the POU5F1 mRNA transcript variant, which is 1409 bp long. I learned that there are splice variants of the OCT4 gene that include some introns and can have a more pivotal role in inducing stemness properties (Yazd et al., 2011).

4. Central Dogma of biology

Part 4: DNA Synthesis order, building my first plasmid

1. Finding an appropriate plasmid backbone:

Reading the Yang et al. (2023) article, I saw in the “Materials and Methods” section that they transfected pLVX-EF1alpha 2xGFP:NES-IRES-2xRFP:NLS to generate NCC-stable cells, so I thought it would be a good backbone for the assembly. This “shuttle vector” was originally described by Mertens et al. (2015) (Addgene ID: 71396).

Mertens J, Paquola AC, Ku M, Hatch E, Bohnke L, Ladjevardi S, McGrath S, Campbell B, Lee H, Herdy JR, Goncalves JT, Toda T, Kim Y, Winkler J, Yao J, Hetzer MW, Gage FH. Directly Reprogrammed Human Neurons Retain Aging-Associated Transcriptomic Signatures and Reveal Age-Related Nucleocytoplasmic Defects. Cell Stem Cell. 2015 Oct 6. pii: S1934-5909(15)00408-7. doi: 10.1016/j.stem.2015.09.001. PubMed 26456686

2. Assembly

I selected the restriction enzymes BamHI and EcoRI to cut the GFP reporter out, and then I built my DNA insert sequence (the POU5F1 gene for Oct4) with sticky ends that match the backbone. I incorporated a Kozak consensus sequence (GCCACC) upstream of the ATG instead of an RBS, and added a 7xHis tag at the C-terminus of the protein, before the stop codon. The design relies on the backbone’s own promoter and polyadenylation signal, so I didn’t add those to the insert.
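The layout of the insert described above can be sketched as simple string assembly. Several details here are assumptions of mine and not taken from the Benchling file: that BamHI sits on the 5′ side and EcoRI on the 3′ side, that histidine is encoded as CAT, that TAA is the stop codon, and the `build_insert` helper itself. The recognition sequences (BamHI GGATCC, EcoRI GAATTC) and the Kozak sequence GCCACC are standard.

```python
BAMHI = "GGATCC"
ECORI = "GAATTC"
KOZAK = "GCCACC"
HIS_TAG = "CAT" * 7  # 7xHis; CAT encodes histidine (CAC would work as well)
STOP = "TAA"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_insert(cds: str, five_prime_site: str = BAMHI, three_prime_site: str = ECORI) -> str:
    """Assemble site + Kozak + CDS + 7xHis + stop + site for a coding sequence."""
    cds = cds.upper()
    if len(cds) % 3 != 0 or not cds.startswith("ATG"):
        raise ValueError("CDS must start with ATG and be a whole number of codons")
    # Drop an existing stop codon so the His tag stays in frame before the new stop.
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    core = KOZAK + cds + HIS_TAG + STOP
    # An internal recognition site would be cut during the digest.
    for site in (five_prime_site, three_prime_site):
        if site in core:
            raise ValueError(f"internal restriction site {site} found in insert")
    return five_prime_site + core + three_prime_site


if __name__ == "__main__":
    # Toy CDS: the first six codons of POU5F1, used only to show the layout.
    print(build_insert("ATGGCGGGACACCTGGCT"))
```

The internal-site check is the step most worth keeping in a real workflow, since a BamHI or EcoRI site inside the coding sequence would fragment the insert during digestion.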

And this is the result

https://benchling.com/s/seq-dcl08hA5mckl7k6SyMfX?m=slm-6JoFb5a3gBdbPekfEuDq

Twist order

I tried ordering it but couldn’t, so I made another insert with a Trp-less GFP, as in recitation, and got my High Copy cassette from Twist: https://benchling.com/s/seq-tiZifTIvfRfmji3ky9IP?m=slm-Xdd23sZE45BAObyU1yP8

Part 5: Read, Write and Edit

DNA Read

What DNA would you want to sequence (e.g., read) and why?

I would like to sequence the plasmid I modified (pLVX-EF1alpha-POU5F1-7xHis-IRES-mCherry) as a way of experimentally verifying that there were no mutations and that the junctions integrated properly.

In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?

I would use Oxford Nanopore sequencing because its long reads can cover the entire plasmid, which would let me detect accidental recombinations and deletions.

DNA Write

What DNA would you want to synthesize (e.g., write) and why?

I would like to synthesize the entire modified POU5F1 (Oct4) cassette de novo, because that way I can perform codon optimization and add the necessary elements (histidine tag and Kozak sequence) in the most efficient way.

What technology or technologies would you use to perform this DNA synthesis and why?

I would use silicon-based DNA synthesis on microchips. Given that the Oct4 cDNA is approximately 1 kb, I value the ability to synthesize thousands of oligonucleotides in parallel with high accuracy and at a much lower cost than traditional column synthesis.

DNA Edit

What DNA would you want to edit and why?

I would like to edit the genome of the plasmid recipient cells so that they possess a specific locus where the POU5F1 cassette can be inserted without activating oncogenes or disrupting vital genes.

What technology or technologies would you use to perform these DNA edits and why?

I would use Prime Editing because it would allow me to edit a small portion of the genome with considerable precision and turn it into a specific recognition site. Since it uses a reverse transcriptase (fused to a Cas9 nickase) to write the new sequence directly into the DNA, it can also work in non-dividing cells such as neurons or resting fibroblasts.

Subsections of Week 3 HW: Lab automation

Part 1: Opentrons artwork

My attempt

Part 2: Use of automation tools

Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications

Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details.

Well, of the three projects I’ve proposed, I’ll start with the second one, “Nostoc’s smart gummies”.

The objective here is to characterize the sensitivity of heavy metal biosensors in Nostoc colonies and to standardize the globular shape for the production of “gummies”.

Cloud Lab Workflow (Ginkgo Nebula)

Echo: Precise gradient transfer of heavy metal solutions (lead and arsenic) at parts-per-billion (ppb) concentrations to Nostoc culture plates.

Multiflo: Dispensing of culture media optimized for cyanobacteria growth (BG-11).

PHERAstar: Measurement of the color intensity (absorbance) of reporter chromoproteins to establish the biosensor calibration curve.
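As a first piece of planning code for the Echo step, the sketch below converts a list of target metal concentrations into transfer volumes using C1·V1 = C2·V2. Everything numeric is a placeholder: the stock concentration, the well volume, and the 2.5 nL droplet increment are assumptions to be replaced with the real instrument and plate specifications, and the small volume added by the transfer itself is ignored.

```python
def echo_transfer_volumes_nl(target_ppb, stock_ppb, well_volume_ul, droplet_nl=2.5):
    """Volume of stock (nL) to dispense per well for each target concentration (ppb)."""
    well_volume_nl = well_volume_ul * 1000
    volumes = []
    for target in target_ppb:
        if target > stock_ppb:
            raise ValueError("target concentration cannot exceed the stock concentration")
        # C1 * V1 = C2 * V2, neglecting the volume added by the transfer.
        ideal_nl = target * well_volume_nl / stock_ppb
        # Acoustic dispensers transfer whole droplets, so round to the nearest increment.
        volumes.append(round(ideal_nl / droplet_nl) * droplet_nl)
    return volumes


if __name__ == "__main__":
    # Hypothetical gradient: 0.5, 1 and 10 ppb from a 1000 ppb stock into 50 µL wells.
    print(echo_transfer_volumes_nl([0.5, 1, 10], stock_ppb=1000, well_volume_ul=50))
```

The resulting list could be written out as a pick list for the cloud lab, with the PHERAstar absorbance readings from the same wells later used to fit the calibration curve.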

Week 4 HW: Protein design - Part 1

Notes about key concepts for deeper understanding

About ESM

Key Concepts to Know:

  • Evolutionary Scale Modeling (ESM): A protein language model that generates likelihood values for each amino acid at a specific position within a protein sequence. It is trained with a Masked Language Modeling (MLM) objective.

  • Deep Mutational Scanning (DMS): Here, an in silico version: a 2D map (sequence position by amino acid) of the likelihood values that the ESM assigns to every possible single mutation in a sequence.

  • Latent Space: Upon inputting a FASTA sequence, the language model adds start and end tokens, generating a high-dimensional vector (320 dimensions in the case of the recitation model). This vector classifies the protein; when projected into a 3D space, it creates a map that clusters the protein alongside others from a provided database based on functional or family similarities.

  • Multi-Layer Perceptron (MLP): Consists of two linear transformations with a non-linear activation (GELU, https://www.ultralytics.com/glossary/gelu-gaussian-error-linear-unit) in between.
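The MLP definition above is compact enough to write out directly. This is a dependency-free sketch with toy weights that I made up for illustration; it is not the ESM implementation, which uses learned weight matrices and tensor libraries.

```python
import math


def gelu(x: float) -> float:
    """Gaussian Error Linear Unit: x times the standard normal CDF at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def linear(x, weights, bias):
    """Linear transformation; `weights` holds one row of coefficients per output."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def mlp(x, w1, b1, w2, b2):
    """Two linear transformations with a GELU activation in between."""
    hidden = [gelu(h) for h in linear(x, w1, b1)]
    return linear(hidden, w2, b2)


if __name__ == "__main__":
    # Toy weights: 2 inputs -> 2 hidden units -> 1 output.
    w1, b1 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
    w2, b2 = [[1.0, 1.0]], [0.5]
    print(mlp([1.0, -1.0], w1, b1, w2, b2))
```

Without the activation, the two linear steps would collapse into a single linear map, which is why the non-linearity in the middle is essential.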

Steps for 3D Protein Structure Prediction with ESM-2:

  1. Likelihood Generation: We utilize the fundamental property of the ESM (Masked Language Modeling) to calculate amino acid probabilities based on the sequence context.

  2. Attention Mechanism: These data points are processed as Queries (Q), Keys (K), and Values (V). This generates an Attention Matrix, where vectors for each amino acid indicate the relevance and position of other residues in the chain.

  3. Structural Extraction (Simple ESM): The raw attention matrix is symmetrized to ensure physically plausible distances and refined using Average Product Correction (APC) to eliminate correlation noise. Through regression, a 2D contact map is obtained.

  4. Model Optimization: This map is compared against structural databases (like the PDB) to adjust the model. This feedback loop updates the model’s weights, maximizing the log-likelihood of real-world protein structures found in nature.

  5. Enrichment via MLP: In parallel, the correlation vectors from the attention matrix are processed by a Multi-Layer Perceptron (MLP). This generates enriched vectors that describe the physicochemical implications of the attention data.

  6. 3D Prediction (ESM-2/ESMFold): In ESMFold, which builds on ESM-2, these MLP-enriched vectors are fed into a “Folding Trunk” module. This module applies learned rules to predict the 3D structure, including bond angles and atomic coordinates.
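Step 3 above (symmetrization followed by Average Product Correction) can be written out for a small matrix. This is a plain-Python sketch under two assumptions of mine: symmetrization is done by adding the matrix to its transpose, and APC subtracts (row sum × column sum) / total sum from each entry. The 2×2 input is a toy stand-in for an attention map, not real model output.

```python
def symmetrize(matrix):
    """Add the matrix to its transpose so that entry (i, j) equals entry (j, i)."""
    n = len(matrix)
    return [[matrix[i][j] + matrix[j][i] for j in range(n)] for i in range(n)]


def apc(matrix):
    """Average Product Correction: subtract the background expected from row and column sums."""
    n = len(matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    total = sum(row_sums)
    return [
        [matrix[i][j] - row_sums[i] * col_sums[j] / total for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    toy_attention = [[1.0, 2.0], [3.0, 4.0]]  # toy values only
    corrected = apc(symmetrize(toy_attention))
    for row in corrected:
        print(row)
```

After the correction, entries that were large only because their row or column was large overall are pushed toward zero, leaving the pair-specific signal that the regression step turns into a contact map.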

About AlphaFold and MSA

Key Concepts to Know:

  • Multiple Sequence Alignments (MSA): The different “dialects” of the same protein across the tree of life, aligned position by position.

Subsections of Labs

Week 1 Lab: Pipetting


Subsections of Projects

Individual Final Project


Group Final Project
