1. “An application of synthetic biology that I recently found out about, and I’m really excited for, is partial cellular reprogramming.”
It is achieved by inducing the expression of the “Yamanaka factors”, which move a cell toward a pluripotent state and reset DNA methylation patterns and chromatin architecture to a younger state, a “rejuvenation” process reported to enhance health and lifespan. (1, 2) Recently, a treatment for optic neuropathies based on this mechanism was approved by the FDA for its first clinical trials; it consists of an AAV2 vector that carries the Oct4, Sox2, and Klf4 (OSK) factors. Systemic doxycycline administration is needed for almost two months to activate OSK expression. (3) What if we could engineer a self-regulated genetic circuit that uses biosensors to detect aging or disease markers, such as transcription factors that drive the senescence-associated secretory phenotype (SASP), and then activates itself and begins reprogramming? To ensure non-maleficence, the system would incorporate a failsafe kill switch: a safety module that induces apoptosis if the cell loses its differentiated identity or if markers of full pluripotency, such as Nanog, are detected, to reduce the risk of cancer. Before a “safe-by-design” technology like this is approved and made available to the general public, some important regulations must be in place.
2. The policy goals
Biosafety:
Genetic safety and vectors: Ensure that reprogramming does not result in oncogenesis and that the vectors used (such as AAV2) do not cause insertional mutagenesis or unforeseen infections (due to their viral nature).
Responsible usage: Restrict access to reprogramming vectors to authorized hospitals to prevent self-administration, and prevent the genetic circuit from being altered or replicated at home without adequate security controls. No one should be able to modify or hack the genetic circuit or the vectors.
Public accessibility and equity:
Universal access: Everyone should be able to access such treatment if required, so there should be financing or insurance programs that guarantee it.
Inclusive standards: Treatment must work across global genetic diversity, being effective regardless of ethnicity.
Transparency: Patients should be taught the nature of the treatment applied to them and informed of its possible risks.
3. The potential actions and the actors:
Actors: Researchers, Physicians, Industry, Government and Patients
1. Enhance safety by standard design:
Main cast: Researchers, Industry
Researchers must develop a genetic circuit design that induces apoptosis whenever there is a risk of oncogenesis or full dedifferentiation; industry must standardize that design to diminish risk.
We are assuming that the pluripotency indicators (such as Nanog) are completely accurate.
Failure here results in the formation of teratomas or tumors.
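As a rough illustration of the decision logic described for Option 1, the circuit can be sketched as a small function in which the safety check always takes priority over activation. This is only a toy model: the marker names, the normalized 0 to 1 readouts, and all three thresholds are hypothetical placeholders, not measured values.

```python
from dataclasses import dataclass

# Hypothetical thresholds on normalized (0..1) biosensor readouts.
SASP_ON = 0.6        # SASP-driver level that would switch reprogramming on
NANOG_KILL = 0.2     # Nanog level treated as a sign of full pluripotency
IDENTITY_MIN = 0.5   # minimum level of a differentiated-identity marker


@dataclass
class CellState:
    sasp_signal: float      # aging/senescence biosensor readout
    nanog: float            # pluripotency marker readout
    identity_marker: float  # lineage (differentiated identity) readout


def circuit_output(cell: CellState) -> str:
    """Return the circuit's action for one cell; the kill switch is checked first."""
    if cell.nanog >= NANOG_KILL or cell.identity_marker < IDENTITY_MIN:
        return "apoptosis"
    if cell.sasp_signal >= SASP_ON:
        return "express_OSK"
    return "idle"


if __name__ == "__main__":
    print(circuit_output(CellState(sasp_signal=0.9, nanog=0.0, identity_marker=0.9)))
    print(circuit_output(CellState(sasp_signal=0.9, nanog=0.5, identity_marker=0.9)))
```

Checking the kill-switch condition before the activation condition reflects the assumption stated above: the design is only as safe as the pluripotency indicators are accurate.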
2. Funding to ensure equity and access:
Main cast: Government, Physicians, Patients
The government must finance treatment for patients who need it but can’t afford it (for example, by funding the program through taxes on purely cosmetic longevity treatments) and incentivize the creation of genomic databases covering different ethnic groups. Physicians must contribute to the genomic database by taking samples while serving those groups.
We assume that the cost of such treatment decreases as time passes and advances are made in the field.
Failure here results in discrimination (economic and/or ethnic) and leaves people without support.
3. Development of transgene control systems
Main cast: Physicians, Researchers
Researchers should develop sophisticated control systems: not only a synthetic chemical compound (drug) that inhibits the circuit by marking its proteins for degradation, but also an inhalation-mediated inhibition mechanism, in which a volatile compound triggers an inducible promoter driving a regulatory protein that rapidly degrades the Yamanaka factors. Physicians should prescribe this drug to stop any unwanted side effects that some patients might experience, such as oncogenesis.
We assume that the transgene control systems are effective and fast enough to reverse a dedifferentiation process before it becomes irreversible or tumorous.
Failure here results in possibly untreatable dysregulation of the system.
4. Scoring the options
| Does the option: | Option 1 | Option 2 | Option 3 |
|---|---|---|---|
| Enhance Biosecurity | | | |
| • By preventing incidents | 1 | - | - |
| • By helping respond | - | - | 1 |
| Foster Lab Safety | | | |
| • By preventing incidents | - | - | - |
| • By helping respond | - | - | - |
| Protect the environment | | | |
| • By preventing incidents | - | - | - |
| • By helping respond | - | - | - |
| Other considerations | | | |
| • Minimizing costs and burdens to stakeholders | 3 | 1 | 3 |
| • Feasibility? | 1 | 3 | 2 |
| • Not impede research | 1 | 2 | 1 |
| • Promote constructive applications | 3 | 2 | 3 |
5. What to prioritize
I believe that options 1 and 3 are the most important in this case, so I would recommend developing a standard circuit design that is self-regulating and also has an external regulation mechanism, in case a physician judges it necessary to stop the process in a patient. I would suggest that the FDA require this kind of approach as a prerequisite for any autonomous reprogramming system intended for human use.
6. Bibliography
Takahashi & Yamanaka (2006) Induction of Pluripotent Stem Cells from Mouse Embryonic and Adult Fibroblast Cultures by Defined Factors. Cell 126, 663–676. http://dx.doi.org/10.1016/j.cell.2006.07.024
Schmidt & Plath (2012) The roles of the reprogramming factors Oct4, Sox2 and Klf4 in resetting the somatic cell epigenome during induced pluripotent stem cell generation. Genome Biology 13, 251. http://genomebiology.com/2012/13/10/251
Macip et al. (2024) Gene Therapy-Mediated Partial Reprogramming Extends Lifespan and Reverses Age-Related Changes in Aged Mice. http://dx.doi.org/10.1089/cell.2023.0072
Life Biosciences (2026) Evaluating ER-100 for Safety in People With Glaucoma or Non-Arteritic Anterior Ischemic Optic Neuropathy (Optic Nerve Conditions). https://clinicaltrials.gov/study/NCT07290244
7. Searches:
Google: “Life Biosciences”
Gemini:
“I would like you to clarify the wording of the introduction to the synthetic biology application that I am going to discuss in my assignment.”
“I read these two papers, Takahashi & Yamanaka (2006) and Schmidt & Plath (2012). They discuss the OSK factors. Have there been advances in therapies developed with them? What are the most recent advances?”

Week 2 HW: Read, write and edit DNA
Preparation for Week 2 lecture:
About next generation gene synthesis
What is the error rate of polymerase?
1:10^6
How does this compare to the length of the human genome?
The human genome consists of about 3.2 giga base pairs (3.2 × 10^9 bp), so at 1 error per 10^6 bases, roughly 3,200 errors could be made in the synthesis of an entire human genome by one polymerase. But since human metabolic pathways are at most about 10 kbp long, this is negligible in most contexts.
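The arithmetic behind that answer can be checked in a couple of lines. This is just a sketch using the numbers quoted above (3.2 Gbp genome, 1:10^6 error rate, and the ~10 kbp pathway length assumed in the answer).

```python
GENOME_BP = 3.2e9     # human genome size quoted above
ERROR_RATE = 1e-6     # 1 error per 10^6 bases, as quoted above
PATHWAY_BP = 10_000   # the ~10 kbp pathway length assumed above


def expected_errors(length_bp: float, error_rate: float = ERROR_RATE) -> float:
    """Expected number of errors when copying length_bp bases once."""
    return length_bp * error_rate


if __name__ == "__main__":
    print("whole genome:", expected_errors(GENOME_BP))
    print("10 kbp pathway:", expected_errors(PATHWAY_BP))
```

The whole-genome figure is in the thousands, while the expected count for a 10 kbp stretch is far below one error per copy.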
How does biology deal with that discrepancy?
There are many polymerases, and the genome is divided into chromosomes, so the chance of an error in any one replicated stretch is low. There are also DNA mismatch repair systems (such as the MutS repair system).
How many different ways are there to code for an average human protein?
Considering that the average human protein is encoded by about 1036 bp (roughly 345 codons), and the genetic code is degenerate, the same protein can be made with different codons that encode the same amino acid, so there is an astronomically large number of different ways.
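To put a number on “a lot”, the count of synonymous DNA sequences for a protein is the product of the codon degeneracy of each residue. The sketch below builds the standard genetic code and returns that count as a base-10 logarithm; the short example peptide is a hypothetical placeholder, not a real protein.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
# Number of codons per amino acid (e.g. 6 for Leu, 1 for Met and Trp).
DEGENERACY = Counter(CODON_TABLE.values())


def log10_encodings(protein: str) -> float:
    """log10 of the number of distinct DNA sequences encoding this protein."""
    return sum(math.log10(DEGENERACY[aa]) for aa in protein.upper())


if __name__ == "__main__":
    toy_peptide = "MLSRW"  # hypothetical example peptide
    print(f"10^{log10_encodings(toy_peptide):.2f} possible encodings")
```

As a back-of-the-envelope check, if every one of ~345 residues had three synonymous codons (an illustrative assumption), the count would be 3^345, which is about 10^165.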
In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?
This is primarily because different codon choices change which bases can pair with each other, and the resulting mRNA can form hairpin-like secondary structures (as seen in structured RNAs) that stall or prematurely terminate translation when we want to obtain the protein.
There is also the error rate.
From DNA Synthesis Development and Application:
What’s the most commonly used method for oligo synthesis currently? Phosphoramidite method
Why is it difficult to make oligos longer than 200nt via direct synthesis? An accumulation of errors due to the error rate of chemical synthesis
Why can’t you make a 2000bp gene via direct oligo synthesis? Because the error rate makes it, probabilistically speaking, almost impossible to achieve. So instead gene assembly is used.
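The length limit can be illustrated with a simple per-step model: if each coupling cycle succeeds with some fixed probability, the fraction of full-length product falls off exponentially with length. This is a simplified sketch, and the 99% coupling efficiency is an assumed illustrative value, not a figure from the reading.

```python
def full_length_fraction(length_nt: int, coupling_efficiency: float = 0.99) -> float:
    """Fraction of strands that reach full length.

    The first base is already attached to the support, so an oligo of
    length_nt bases needs length_nt - 1 successful coupling steps.
    coupling_efficiency is an assumed, illustrative per-step value.
    """
    return coupling_efficiency ** (length_nt - 1)


if __name__ == "__main__":
    for n in (20, 200, 2000):
        print(f"{n:>5} nt: {full_length_fraction(n):.3g}")
```

Under this assumption a 200 nt oligo still gives a usable minority of full-length strands, while a 2000 nt product is essentially never obtained, which is why longer genes are assembled from shorter oligos.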
From reading and writing life:
What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?
10 essentials: Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Threonine, Tryptophan, Valine, Arginine
The lysine contingency would work well if dinosaurs were as easy to isolate from external resources as bacteria in a petri dish. After all, essential amino acids are those we can’t produce ourselves but obtain by eating other organisms that contain them. It would be very difficult to cut off every other available source of lysine in the environment, so I would recommend another approach.
Google Search: “Lysine Contigency”
Week 2 assignments:
(Part 2 consisted of “Gel Art - Restriction of Digests and Gel Electrophoresis”, intended to be carried out in the laboratory. Since I do not have access to one, I did not complete it.)
Using the Life 1 kb Plus ladder and different restriction enzymes (EcoRI, XhoI, SalI, EcoRV, NdeI, BamHI, HindIII and KpnI) on the Lambda phage genome, I tried to draw a Teletubby.
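A virtual digest like this can be approximated with plain string searching. The sketch below finds the recognition sites of the listed enzymes on a linear sequence and returns the fragment lengths that would appear as bands. The short test sequence is a hypothetical toy, not the Lambda genome, and cut positions are given on the top strand only.

```python
# Recognition site and top-strand cut offset (bases from the start of the site).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
    "HindIII": ("AAGCTT", 1),
    "XhoI": ("CTCGAG", 1),
    "SalI": ("GTCGAC", 1),
    "EcoRV": ("GATATC", 3),
    "NdeI": ("CATATG", 2),
    "KpnI": ("GGTACC", 5),
}


def fragment_lengths(seq: str, enzyme_names: list[str]) -> list[int]:
    """Fragment sizes (largest first) after digesting a linear sequence."""
    seq = seq.upper()
    cuts = set()
    for name in enzyme_names:
        site, offset = ENZYMES[name]
        start = seq.find(site)
        while start != -1:
            cuts.add(start + offset)
            start = seq.find(site, start + 1)
    bounds = [0] + sorted(cuts) + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


if __name__ == "__main__":
    toy = "AAAGAATTCTTTGGATCCAA"  # hypothetical 20 bp test sequence
    print(fragment_lengths(toy, ["EcoRI", "BamHI"]))
```

Running each enzyme (or pair of enzymes) in its own "lane" and comparing the fragment sizes against the ladder is one way to plan which bands form the picture.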
POU5F1 protein-coding DNA sequence
ATGGCGGGACACCTGGCTTCGGATTTCGCCTTCTCGCCCCCTCCAGGTGGTGGAGGTGATGGGCCAGGGGGGCCGGAGCCGGGCTGGGTTGATCCTCGGACCTGGCTAAGCTTCCAAGGCCCTCCTGGAGGGCCAGGAATCGGGCCGGGGGTTGGGCCAGGCTCTGAGGTGTGGGGGATTCCCCCATGCCCCCCGCCGTATGAGTTCTGTGGGGGGATGGCGTACTGTGGGCCCCAGGTTGGAGTGGGGCTAGTGCCCCAAGGCGGCTTGGAGACCTCTCAGCCTGAGGGCGAAGCAGGAGTCGGGGTGGAGAGCAACTCCGATGGGGCCTCCCCGGAGCCCTGCACCGTCACCCCTGGTGCCGTGAAGCTGGAGAAGGAGAAGCTGGAGCAAAACCCGGAGGAGTCCCAGGACATCAAAGCTCTGCAGAAAGAACTCGAGCAATTTGCCAAGCTCCTGAAGCAGAAGAGGATCACCCTGGGATATACACAGGCCGATGTGGGGCTCACCCTGGGGGTTCTATTTGGGAAGGTATTCAGCCAAACGACCATCTGCCGCTTTGAGGCTCTGCAGCTTAGCTTCAAGAACATGTGTAAGCTGCGGCCCTTGCTGCAGAAGTGGGTGGAGGAAGCTGACAACAATGAAAATCTTCAGGAGATATGCAAAGCAGAAACCCTCGTGCAGGCCCGAAAGAGAAAGCGAACCAGTATCGAGAACCGAGTGAGAGGCAACCTGGAGAATTTGTTCCTGCAGTGCCCGAAACCCACACTGCAGCAGATCAGCCACATCGCCCAGCAGCTTGGGCTCGAGAAGGATGTGGTCCGAGTGTGGTTCTGTAACCGGCGCCAGAAGGGCAAGCGATCAAGCAGCGACTATGCACAACGAGAGGATTTTGAGGCTGCTGGGTCTCCTTTCTCAGGGGGACCAGTGTCCTTTCCTCTGGCCCCAGGGCCCCATTTTGGTACCCCAGGCTATGGGAGCCCTCACTTCACTGCACTGTACTCCTCGGTCCCTTTCCCTGAGGGGGAAGCCTTTCCCCCTGTCTCCGTCACCACTCTGGGCTCTCCCATGCATTCAAAC
I went to NCBI GenBank and retrieved a POU5F1 mRNA transcript variant, 1409 bp long.
I learned that there are splice variants of the OCT4 gene that include some introns and can have a more pivotal role in the induction of stemness properties (Yazd et al., 2011).
4. Central Dogma of biology
Part 4: DNA Synthesis order, building my first plasmid
1. Finding an appropriate plasmid backbone:
Reading the Yang et al. (2023) article, I saw in their “Materials and Methods” section that they transfected pLVX-EF1alpha-2xGFP:NES-IRES-2xRFP:NLS to generate NCC-stable cells, so I thought it would be a good backbone for the assembly. This “shuttle vector” was originally described by Mertens et al. (2015) (Addgene ID: 71396).
Directly Reprogrammed Human Neurons Retain Aging-Associated Transcriptomic Signatures and Reveal Age-Related Nucleocytoplasmic Defects. Mertens J, Paquola AC, Ku M, Hatch E, Bohnke L, Ladjevardi S, McGrath S, Campbell B, Lee H, Herdy JR, Goncalves JT, Toda T, Kim Y, Winkler J, Yao J, Hetzer MW, Gage FH. Cell Stem Cell. 2015 Oct 6. pii: S1934-5909(15)00408-7. doi: 10.1016/j.stem.2015.09.001. PubMed 26456686
2. Assembly
I selected restriction enzymes BamHI and EcoRI to cut the GFP reporter out, and then I built my DNA insert sequence (POU5F1 gene for Oct4) with sticky ends compatible with the backbone.
I incorporated a Kozak consensus sequence (GCCACC) upstream of the ATG instead of an RBS.
I also added a 7xHis tag at the C-terminus of the protein, before the stop codon.
The design leverages the backbone’s endogenous promoter and polyadenylation signal, so I didn’t add those to the insert.
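The insert layout described above (Kozak, coding sequence, 7xHis, stop) can be sketched as a small helper. This is a minimal illustration under stated assumptions: the histidine codon (CAC) and the stop codon (TAA) are my choices, the flanking BamHI/EcoRI ends are left out because their orientation is not specified here, and the example CDS is a hypothetical toy rather than the real POU5F1 sequence.

```python
KOZAK = "GCCACC"
HIS_TAG = "CAC" * 7  # seven histidine codons; CAT would encode His as well
STOP_CODONS = {"TAA", "TAG", "TGA"}
SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}


def build_insert(cds: str, stop: str = "TAA") -> str:
    """Return Kozak + CDS (without its stop) + 7xHis + stop codon."""
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("CDS must start with ATG")
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]  # drop the native stop so the tag stays in frame
    return KOZAK + cds + HIS_TAG + stop


def internal_sites(seq: str) -> dict[str, int]:
    """Count cloning-enzyme sites inside a sequence (they should be absent)."""
    seq = seq.upper()
    return {name: seq.count(site) for name, site in SITES.items()}


if __name__ == "__main__":
    toy_cds = "ATGGCGGGACACTAA"  # hypothetical short CDS, not the real gene
    insert = build_insert(toy_cds)
    print(insert)
    print(internal_sites(insert))
```

Checking `internal_sites` on the final insert matters because an internal BamHI or EcoRI site would be cut during cloning along with the intended ends.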
What DNA would you want to sequence (e.g., read) and why?
I would like to sequence the plasmid I modified (pLVX-EF1alpha-POU5F1-7xHis-IRES-mCherry) as a way of experimentally verifying that there were no mutations and that the junctions integrated properly.
In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?
I would use the “Oxford Nanopore” sequencing method because it would allow me to find accidental recombinations and deletions by being able to sequence the entire plasmid.
DNA Write
What DNA would you want to synthesize (e.g., write) and why?
I would like to synthesize de novo the entire modified POU5F1 (Oct4) cassette, because that way I can perform codon optimization and add the necessary elements (histidine tag and Kozak sequence) in the most efficient way.
What technology or technologies would you use to perform this DNA synthesis and why?
I would use silicon-based DNA synthesis on microchips. Given that the Oct4 cDNA is approximately 1 kb, I really appreciate the ability to synthesize thousands of oligonucleotides in parallel with extremely high accuracy and at a much lower cost than traditional column synthesis.
DNA Edit
What DNA would you want to edit and why?
I would like to edit the genome of the plasmid recipient cells so that they possess a specific locus where the POU5F1 cassette can be inserted without activating oncogenes or disrupting vital genes.
What technology or technologies would you use to perform these DNA edits and why?
I would use Prime Editing because it would allow me to edit a small portion of the genome with considerable precision and turn it into a specific recognition site. Since it uses a Reverse Transcriptase (fused to Cas9 nickase) to write the new sequence directly into the DNA, it works perfectly in neurons or resting fibroblasts.
Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications
Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details.
Well, of the three projects I’ve proposed, I’ll start with the second one, “Nostoc’s smart gummies”.
The objective here is to characterize the sensitivity of heavy metal biosensors in Nostoc colonies and standardize the globular shape for the production of “gummies”.
Cloud Lab Workflow (Ginkgo Nebula)
Echo: Precise gradient transfer of heavy metal solutions (lead and arsenic) at parts-per-billion (ppb) concentrations to Nostoc culture plates.
Multiflo: Dispensing of culture media optimized for cyanobacteria growth (BG-11).
PHERAstar: Measurement of the color intensity (absorbance) of reporter chromoproteins to establish the biosensor calibration curve.
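For the last step of the workflow, the calibration curve could be obtained by fitting reporter absorbance against metal concentration. The sketch below uses an ordinary least-squares line and then inverts it to estimate an unknown concentration. It assumes the biosensor response is linear over the tested range, which may not hold near saturation, and the numbers in the example are made-up placeholders, not measurements.

```python
def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_concentration(absorbance: float, slope: float, intercept: float) -> float:
    """Invert the calibration line to get concentration from absorbance."""
    return (absorbance - intercept) / slope


if __name__ == "__main__":
    # Placeholder values for illustration only (ppb, absorbance units).
    conc_ppb = [0.0, 10.0, 20.0, 40.0]
    absorbance = [0.05, 0.15, 0.25, 0.45]
    slope, intercept = fit_line(conc_ppb, absorbance)
    print(slope, intercept)
    print(estimate_concentration(0.35, slope, intercept))
```

If the real response curve turns out to be sigmoidal, the same structure applies with a different fitting function in place of the straight line.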
Week 4 HW: Protein design - Part 1
Notes about key concepts for deeper understanding
About ESM
Key Concepts to Know:
Evolutionary Scale Model (ESM): A model that generates likelihood values for the existence of an amino acid at a specific position within a protein sequence. It is essentially a Masked Language Modeling (MLM) architecture.
Deep Mutational Scanning (DMS): A 2D map representing the probability values generated by the ESM for every possible mutation in a sequence.
Latent Space: Upon inputting a FASTA sequence, the language model adds start and end tokens, generating a high-dimensional vector (320 dimensions in the case of the recitation model). This vector classifies the protein; when projected into a 3D space, it creates a map that clusters the protein alongside others from a provided database based on functional or family similarities.
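To make the DMS idea concrete, the sketch below turns per-position amino acid probabilities into a mutation map. It does not call an actual ESM model: the probabilities are hypothetical inputs standing in for the model's output, and scoring each mutation as the log ratio of mutant to wild-type probability is one common convention that I am assuming here.

```python
import math


def dms_scores(
    wild_type: str, position_probs: list[dict[str, float]]
) -> dict[tuple[int, str, str], float]:
    """Score every substitution as log(p_mutant / p_wild_type).

    position_probs[i] maps each amino acid to the (assumed) model
    probability at position i. Negative scores mean the model finds the
    mutant less likely than the wild-type residue.
    """
    scores = {}
    for i, (wt, probs) in enumerate(zip(wild_type, position_probs)):
        for aa, p in probs.items():
            if aa != wt:
                scores[(i, wt, aa)] = math.log(p / probs[wt])
    return scores


if __name__ == "__main__":
    # Hypothetical probabilities over a 3-letter alphabet, for illustration only.
    probs = [
        {"A": 0.5, "G": 0.25, "V": 0.25},
        {"A": 0.1, "G": 0.8, "V": 0.1},
    ]
    for key, score in dms_scores("AG", probs).items():
        print(key, round(score, 3))
```

Arranging these scores with positions on one axis and amino acids on the other gives the 2D map described above.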
Steps for 3D Protein Structure Prediction with ESM-2:
Likelihood Generation: We utilize the fundamental property of the ESM (Masked Language Modeling) to calculate amino acid probabilities based on the sequence context.
Attention Mechanism: These data points are processed as Queries (Q), Keys (K), and Values (V). This generates an Attention Matrix, where vectors for each amino acid indicate the relevance and position of other residues in the chain.
Structural Extraction (Simple ESM): The raw attention matrix is symmetrized to ensure physically plausible distances and refined using Average Product Correction (APC) to eliminate correlation noise. Through regression, a 2D contact map is obtained.
Model Optimization: This map is compared against structural databases (like the PDB) to adjust the model. This feedback loop updates the model’s weights, maximizing the log-likelihood of real-world protein structures found in nature.
Enrichment via MLP: In parallel, the correlation vectors from the attention matrix are processed by a Multi-Layer Perceptron (MLP). This generates enriched vectors that describe the physicochemical implications of the attention data.
3D Prediction (ESM-2/ESMFold): In the ESM-2 architecture, these MLP-enriched vectors are fed into a “Folding Trunk” module. This module applies learned rules to predict the 3D structure, including bond angles and atomic coordinates.
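The symmetrization and Average Product Correction mentioned in the structural extraction step can be written out on a small matrix. This is a pure-Python sketch of those two operations only: the regression that turns the corrected maps into contact probabilities is omitted, and the 2×2 input is a toy example rather than real attention values.

```python
Matrix = list[list[float]]


def symmetrize(m: Matrix) -> Matrix:
    """Add the matrix to its transpose so that entry (i, j) equals (j, i)."""
    n = len(m)
    return [[m[i][j] + m[j][i] for j in range(n)] for i in range(n)]


def apc(m: Matrix) -> Matrix:
    """Average Product Correction: subtract row_sum * col_sum / total."""
    n = len(m)
    row = [sum(r) for r in m]
    col = [sum(m[i][j] for i in range(n)) for j in range(n)]
    total = sum(row)
    return [[m[i][j] - row[i] * col[j] / total for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    attention = [[0.0, 1.0], [3.0, 0.0]]  # toy attention map, not real data
    print(apc(symmetrize(attention)))
```

The correction removes the background signal that a residue contributes to all of its pairs, so what remains is closer to pair-specific coupling.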
About AlphaFold and MSA
Key Concepts to Know:
Multiple Sequence Alignments (MSA): The different dialects of the tree of life for the same protein