Subsections of <Fatima Fragoso Hernández> — HTGAA Spring 2026

Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices


  • Week 2 HW: Lecture Prep


  • Week 2 HW: DNA read write and edit


  • Week 3 HW: Lab automation


Subsections of Homework

Week 1 HW: Principles and Practices

| Does the option: | Option 1 | Option 2 | Option 3 |
|---|---|---|---|
| Enhance Biosecurity | | | |
| • By preventing incidents | 1 | 2 | NA |
| • By helping respond | 3 | NA | NA |
| Foster Lab Safety | | | |
| • By preventing incident | 3 | 1 | NA |
| • By helping respond | NA | 1 | NA |
| Protect the environment | | | |
| • By preventing incidents | NA | 1 | NA |
| • By helping respond | NA | 2 | NA |
| Other considerations | | | |
| • Minimizing costs and burdens to stakeholders | 2 | 2 | 3 |
| • Feasibility? | 2 | 1 | 3 |
| • Not impede research | 3 | 1 | 1 |
| • Promote constructive applications | 1 | 1 | 1 |

Subsections of Week 1 HW: Principles and Practices

Week 1 HW: Principles and Practices


1. Biological Engineering Application: “Bio-Hybrid Designer”

Description: I propose the development of Bio-Hybrid Designer, an AI-powered BioCAD software specifically engineered to design and optimize hybrid biosynthetic pathways for complex natural drugs, such as taxanes (paclitaxel).

Why develop it? Inspired by the “How to Grow (Almost) Anything” (HTGAA) philosophy and the need for sustainable pharmaceutical production, this tool addresses critical enzymatic “bottlenecks”. Currently, complete biosynthesis of complex drugs in microorganisms is hindered by poor expression of enzymes like cytochrome P450 in heterologous systems. The software would identify the optimal transition point where biological synthesis should stop (producing an advanced precursor like baccatin III) and where selective chemical synthesis should take over to finalize the drug.


2. Governance and Policy Goals

The primary goal is non-maleficence (preventing harm) to ensure the ethical deployment of this technology.

  • Sub-goal A: Prevention of Harmful Dual-Use. Ensuring the AI cannot be used to design pathways for known toxins or pathogens, thereby protecting international security.
  • Sub-goal B: Global Equity and Sustainability. Preventing abrupt “economic displacement”. If lab production replaces natural harvesting (such as from the Pacific yew tree), a just transition must be ensured for local communities that currently depend on these natural resources.

3. Proposed Governance Actions

Action 1: Mandatory AI Screening for “Critical Bio-Parts” (Federal Regulators & Companies)

  • Purpose: Currently, access to many biological databases is open. This action proposes that AI BioCAD tools must incorporate mandatory filters to block the design of sequences structurally similar to biological threat agents.
  • Design: Federal regulators would require companies providing BioCAD services to implement automated detection protocols (similar to internet content filters or anti-plagiarism software).
  • Assumptions: It assumes that biological threats have predictable genetic or structural “signatures” that AI can accurately identify.
  • Risks of Failure & “Success”: It could fail if attackers use sequence “obfuscation” techniques. Excessive “success” might hinder legitimate research on rare medicines that happen to share chemical precursors with toxins.

Action 2: Incentives for Chassis with “Kill-Switches” (Academic Researchers & Funding Agencies)

  • Purpose: To move from passive observation to technical biosecurity. All research grants for synthetic drug production would require the use of host organisms (chassis) equipped with programmed “kill-switches” to prevent environmental survival in case of a leak.
  • Design: Funding agencies (such as the NIH) would act as the primary actors, tying financial support to the use of validated containment protocols like “GeneGuard”.
  • Assumptions: It assumes these kill-switches are evolutionarily stable and will not be deactivated by natural mutations within the lab.
  • Risks of Failure & “Success”: Failure occurs if the microorganism survives environmental filters through horizontal gene transfer. “Success” might create a technological monopoly over “safe” chassis.

Action 3: Bioeconomy Transition Fund (International Organizations & Big Pharma)

  • Purpose: To mitigate global inequalities. Similar to energy transition funds, a levy would be placed on profits from drugs produced via synthetic hybrid pathways.
  • Design: The UN or WHO would coordinate with pharmaceutical giants to fund economic reconversion programs in regions where natural harvesting is displaced by industrial production.
  • Assumptions: It assumes synthetic production will be significantly cheaper, generating enough surplus to fund compensation.
  • Risks of Failure & “Success”: The main failure would be a lack of political will to tax new technologies. “Success” could potentially create a dependency on subsidies rather than fostering a new, sustainable local economy.

Week 2 HW: Lecture Prep

Homework Questions from Professor Jacobson

DNA Polymerase Error Rates and the Human Genome

  • Error Rate of Polymerase: In biological synthesis, error-correcting polymerase has an error rate of approximately $1:10^6$. This is significantly more accurate than raw chemical synthesis, which has an error rate of roughly $1:10^2$.
  • Comparison to the Human Genome: The human genome is approximately 3 billion base pairs ($3 \times 10^9$) in length. At an error rate of $1:10^6$, copying the entire human genome would result in roughly 3,000 errors per replication cycle.
  • How Biology Deals with the Discrepancy: Biology utilizes specific enzymatic functions to manage and correct these errors to ensure genomic integrity. This includes 3’-5’ proofreading exonuclease activity and 5’-3’ error-correcting exonuclease functions that work alongside template-dependent primer extension to identify and remove incorrect bases.
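The arithmetic behind the 3,000-error estimate can be sketched in a few lines of Python. The function name `expected_errors` is illustrative, and the inputs are the approximate figures quoted above, not measured values.

```python
def expected_errors(length_bp: int, bases_per_error: int) -> float:
    """Expected number of miscopied bases when copying `length_bp` bases
    at a rate of one error per `bases_per_error` bases."""
    return length_bp / bases_per_error


GENOME_BP = 3_000_000_000        # approximate human genome length (3 x 10^9)
POLYMERASE_BASES_PER_ERROR = 10**6   # error-correcting polymerase, ~1:10^6
CHEMICAL_BASES_PER_ERROR = 10**2     # raw chemical synthesis, ~1:10^2

print("polymerase:", expected_errors(GENOME_BP, POLYMERASE_BASES_PER_ERROR))
print("chemical:  ", expected_errors(GENOME_BP, CHEMICAL_BASES_PER_ERROR))
```

At the chemical-synthesis rate the same calculation gives tens of millions of errors, which is why the comparison between the two rates matters.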

Coding for Human Proteins

  • Ways to Code for an Average Human Protein: The coding sequence of an average human protein is about 1,036 base pairs long (roughly 345 codons). Because the genetic code is redundant (multiple different codons can code for the same amino acid), there are an astronomical number of possible DNA sequences that can result in the same protein sequence. The sources highlight that biology must find a balance between this codon redundancy and diversity to maintain “fabricational complexity”.
  • Reasons Some Codes Do Not Work: In practice, many DNA sequences that technically code for the correct protein are “impossible” or difficult to use for synthesis or expression due to several biological and mechanical factors:
    • Secondary Structures: Sequences that form hairpins or inverted repeats can interfere with replication and transcription machinery.
    • Extreme GC Content: Regions with very high (≥90%) or very low (≤10%) GC content are often unstable or difficult for polymerase to navigate.
    • Repetitive Sequences: Long terminal repeats, tandem repeats, or clusters of repeats can lead to “slippage” and errors during synthesis.
    • Homopolymers: Long runs of an identical base (e.g., more than 30bp of A) are particularly prone to errors.
    • RNA Cleavage and Stability: Certain nucleotide combinations may inadvertently trigger RNA cleavage rules (such as targets for RNase III), leading to the degradation of the mRNA before it can be translated.
    • Codon Optimization: Not all redundant codons are treated equally by the cell’s translational machinery; choosing the “wrong” codons can lead to inefficient protein production.
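Some of the factors listed above can be screened for automatically. The sketch below is a minimal illustration, assuming the thresholds quoted in the list (GC content ≤10% or ≥90%, homopolymer runs longer than 30 bp). The function names are hypothetical, GC content is computed over the whole sequence and not in sliding windows, and secondary structure, repeats and RNase sites are not checked.

```python
import re


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-empty DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of a single repeated base."""
    return max(len(m.group(0)) for m in re.finditer(r"(.)\1*", seq.upper()))


def synthesis_flags(seq: str, gc_low: float = 0.10, gc_high: float = 0.90,
                    max_run: int = 30) -> list:
    """Return a list of warnings for a candidate sequence."""
    flags = []
    gc = gc_fraction(seq)
    if gc <= gc_low or gc >= gc_high:
        flags.append("extreme GC content")
    if longest_homopolymer(seq) > max_run:
        flags.append("long homopolymer")
    return flags


print(synthesis_flags("ATGCATGC"))
print(synthesis_flags("A" * 31 + "GC" * 20))
```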

Homework Questions from Dr. LeProust

The most commonly used method for oligonucleotide synthesis currently is phosphoramidite chemical synthesis on a solid support. This process follows a cycle involving deprotection, base coupling, capping, and oxidation to add nucleotides sequentially to a growing chain.

Direct synthesis of oligos longer than 200 nucleotides is difficult because the cumulative yield drops exponentially with each coupling step. Chemical synthesis has a high raw error rate of approximately 1:10², meaning errors are introduced frequently as the chain grows. Because the efficiency of adding each base is not 100%, the probability of obtaining a perfect, full-length product decreases significantly as the length increases.

You cannot make a 2000bp gene via direct synthesis primarily because the error rate of chemical synthesis would result in approximately 20 errors in a sequence of that length. Furthermore, the physical yield of a 2000bp molecule would be vanishingly small due to the cumulative losses over 2000 coupling cycles. Instead, genes are constructed using enzymatic assembly to join many smaller, sequence-verified oligonucleotides into a single long fragment. While advanced chemistry has pushed direct synthesis limits to 700 nucleotides, creating a 2000bp gene still requires the assembly of multiple fragments to maintain accuracy and throughput.
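The exponential drop in yield can be illustrated with a short calculation. This is a sketch under a stated assumption: a per-step coupling efficiency of 99%, chosen only to mirror the 1:10² error rate quoted above (real instruments differ). The function names are illustrative.

```python
def full_length_fraction(n_couplings: int, efficiency: float = 0.99) -> float:
    """Fraction of chains still correct after `n_couplings` coupling steps,
    assuming each step succeeds independently with probability `efficiency`."""
    return efficiency ** n_couplings


def expected_errors(n_bases: int, bases_per_error: int = 100) -> float:
    """Expected number of errors at one error per `bases_per_error` bases."""
    return n_bases / bases_per_error


for length in (200, 2000):
    print(length, full_length_fraction(length), expected_errors(length))
```

Under this assumption only about 13% of 200-mers would be perfect, and the fraction of perfect 2000-mers is on the order of one in a billion, which matches the argument for assembling genes from shorter verified fragments.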

Homework Question from George Church

AI prompt: [Using Google & Prof. Church’s slide #4] What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?

Based on the list of essential amino acids and the provided lecture materials, here is an analysis of the “Lysine Contingency”:

The 10 Essential Amino Acids

According to general biological standards (verified via external search), the 10 essential amino acids that most animals (including humans) cannot synthesize and must obtain through their diet are:

  1. Phenylalanine
  2. Valine
  3. Threonine
  4. Tryptophan
  5. Isoleucine
  6. Methionine
  7. Histidine
  8. Arginine
  9. Leucine
  10. Lysine

In Professor Church’s slides, the standard genetic code chart identifies these amino acids by their single-letter codes (e.g., K for Lysine, L for Leucine, M for Methionine) as the building blocks for protein synthesis.

The “Lysine Contingency” and Biocontainment

The “Lysine Contingency” is a concept popularized by Jurassic Park, where engineered organisms are made unable to produce lysine, theoretically preventing them from surviving in the wild without human-provided supplements.

Knowing that lysine is already an essential amino acid for all animals significantly changes the view of this contingency:

  • Redundancy in Nature: Since wild animals (and humans) already cannot synthesize lysine, they must constantly find it in their environment (by eating plants or other animals). Therefore, a “lysine contingency” is not a robust biocontainment strategy because lysine is widely available in the natural world. An escaped organism would simply find lysine in the wild just as any other animal does.
  • Church’s Advanced Solution: Professor Church’s research proposes a much more effective version of this idea through Genomically Recoded Organisms (GROs). Instead of relying on a natural amino acid like lysine, Church’s team engineered organisms to be dependent on Non-Standard Amino Acids (NSAAs).
  • Synthetic Dependency: These NSAAs do not exist in the wild. This creates a true “metabolic isolation” or “biocontainment” because, unlike lysine, the organism cannot find these synthetic building blocks in the environment, effectively preventing survival outside a controlled laboratory setting.

In summary, while the “Lysine Contingency” is a natural reality for all animals, it is an ineffective tool for synthetic biology containment. Modern genetic engineering instead uses synthetic amino acid dependency to ensure organisms remain contained.

Week 2 HW: DNA read write and edit


Part 3. Choose a Protein

I chose glucokinase (GCK) because in my biochemistry classes I found it to be a very interesting enzyme due to its unique functions and its critical role as a glucose sensor.

According to the sources, what makes this enzyme particularly fascinating is that, unlike other members of the hexokinase family, it is not inhibited by its product (glucose-6-phosphate). This allows the enzyme to remain active even when glucose is abundant in the system.

Furthermore, glucokinase exhibits tissue-specific expression in the pancreas and the liver, which leads to different but equally vital functions:

  • In the pancreas, it is a key player in glucose-stimulated insulin secretion.
  • In the liver, it is essential for glucose uptake and its subsequent conversion into glycogen.

I also find it interesting from a clinical perspective, as mutations in this gene that alter enzyme activity are associated with several medical conditions, including Maturity-Onset Diabetes of the Young type 2 (MODY2) and hyperinsulinemic hypoglycemia. This demonstrates its fundamental importance in maintaining human glucose homeostasis.

Protein Sequence (Isoform 1 - Pancreatic): The following sequence corresponds to the pancreatic isoform 1 (NP_000153.1 / UniProt P35557), which has a distinct N-terminus compared to liver isoforms.

>sp|P35557|HXK4_HUMAN Glucokinase OS=Homo sapiens OX=9606 GN=GCK PE=1 SV=3
MLDDRARMEAAKKEKVEQILAEFQLQEEDLKKVMRRMQKEMDRGLRLETHEEASVKMLPT
YVRSTPEGSEVGDFLSLDLGGTNFRVMLVKVGEGEEGQWSVKTKHQMYSIPEDAMTGTAE
MLFDYISECISDFLDKHQMKHKKLPLGFTFSFPVRHEDIDKGILLNWTKGFKASGAEGNN
VVGLLRDAIKRRGDFEMDVVAMVNDTVATMISCYYEDHQCEVGMIVGTGCNACYMEEMQN
VELVEGDEGRMCVNTEWGAFGDSGELDEFLLEYDRLVDESSANPGQQLYEKLIGGKYMGE
LVRLVLLRLVDENLLFHGEASEQLRTRGAFETRFVSQVESDTGDRKQIYNILSTLGLRPS
TTDCDIVRRACESVSTRAAHMCSAGLAGVINRMRESRSEDVMRITVGVDGSVYKLHPSFK
ERFHASVRRLTPSCEITFIESEEGSGRGAALVSAVACKKACMLGQ

3.2. Reverse Translation: Protein sequence to DNA sequence

The Central Dogma (DNA → RNA → Protein) allows us to work backward from a protein sequence to identify the corresponding DNA sequence. By using the NCBI Gene database (Gene ID: 2645), we identified the Coding Sequence (CDS) for Glucokinase.

In accordance with HTGAA conventions, the sequence is presented in the 5’ to 3’ coding strand format, as found in GenBank or FASTA files.

DNA Sequence for Glucokinase (CDS - Variant 1): This nucleotide sequence corresponds to the mRNA RefSeq NM_000162.5, which encodes the pancreatic islet beta cell isoform.

atgttggatgacagagccaggatggaggccgccaagaaggagaaggttgagcagatcctggcagagttccagctgcaggaggaggacctgaagaaggtgatgagacggatgcagaaggagatggaccgcggcctgaggctggagacccatgaggaggccagtgtgaagatgctgcccacctacgtgcgctccaccccagaaggctcagaagtcggagacttcctctccctggacctgggtggcaccaacttcagggtgatgctggtgaaggtgggagaaggtgaggaggggcagtggagcgtgaagaccaaacaccagatgtactccatccccgaggacgccatgaccggcactgctgagatgctcttcgactacatctctgagtgcatctccgacttcctggacaagcatcagatgaaacacaagaagctgcccctgggcttcaccttctccttccctgtgaggcacgaagacatcgataagggcatccttctcaactggaccaagggcttcaaggcctcaggagcagaagggaacaatgtcgtggggcttctgcgagatgctatcaaacggagaggggactttgaaatggatgtggtggcaatggtgaatgacacggtggccacgatgatctcctgctactacgaagaccatcagtgcgaggtcggcatgatcgtgggcacgggctgcaatgcctgctacatggaggagatgcagaatgtggagctggtggagggggatgagggccgcatgtgcgtcaatacggagtggggcgccttcggggactccggcgagctggacgagttcctgctggagtatgaccggctggtggacgagagctctgcaaaccccggtcagcagctgtatgagaagctcataggtggcaagtatatgggcgagctggtgcgacttgtgctgctcaggctggtggacgagaacctgctcttccacggagaggcctccgagcagctgcgcacacgcggagccttcgagacgcgcttcgtgtcgcaggtggagagcgacacgggcgaccgcaagcagatctacaacatcctgagcacgctggggctgcgaccctcgaccaccgactgcgacatcgtgcgccgcgcctgcgagagcgtgtctacgcgcgctgcgcacatgtgctcggccgggctggcgggcgtcatcaatcgcatgcgcgagagccgcagcgaggacgtgatgcgcatcaccgtgggcgtggatggctccgtgtacaagctgcaccccagcttcaaggagcgcttccatgccagcgtgcgcaggctgacgcccagctgcgagatcaccttcatcgagtcggaggagggcagtggccggggcgctgccctggtctcggcggtggcctgtaagaaggcctgtatgctgggccagtga


3.3. Codon Optimization

Why do we need to optimize codon usage? Codon optimization is necessary because different organisms have distinct “preferences” or varying abundances of tRNA for the same amino acids. Since the genetic code is redundant (multiple codons can code for one amino acid), choosing the codons that the host organism uses most frequently, and for which more tRNAs are therefore available, makes translation more efficient and results in higher protein yields. Optimization also helps avoid technical synthesis issues such as extreme GC content, high repetition, or homopolymers (long strings of the same nucleotide) that can cause errors during DNA printing.

a) Which organism have you chosen to optimize the codon sequence for and why? I have chosen Escherichia coli (E. coli) as the target organism. According to the sources, E. coli is the preferred host for this course because it is the easiest and most accessible organism to work with in a laboratory setting.

Glucokinase DNA sequence with Codon Optimization (E. coli) (The following sequence is a simulated optimization of the GCK coding sequence for E. coli, avoiding common restriction enzyme sites like BsaI or BbsI to ensure compatibility with Twist Bioscience synthesis tools.)

ATGTTAGATGATCGTGCGCGTATGGAAGCGGCGAAAAAAGAAAAAGTTGAACAGATTCTGGCGGAATTTCAGCTGCAGGAAGAAGATCTGAAAAAAGTGATGCGTCGTATGCAGAAAGAAATGGATCGTGGCCTGCGTCTGGAAACCCATGAAGAAGCGAGCGTGAAAATGCTGCCGACCTATGTGCGTAGCACCCCGGAAGGTAGCGAAGTTGGCGATTTTCTGAGCCTGGATCTGGGTGGCACCAATTTTCGTGTGATGCTGGTGAAAGTTGGCGAAGGCGAAGAAGGCCAGTGGAGCGTGAAAACCAAACATCAGATGTATAGCATCCCGGAAGATGCGATGACCGGCACCGCGGAAATGCTGTTCGATTATATTAGCGAATGCATTAGCGATTTTCTGGATAAACATCAGATGAAACATAAAAAACTGCCGCTGGGCTTTACCTTTAGCTTTCCGGTGCGTCATGAAGATATTGATAAAGGCATTCTGCTGAACTGGACCAAAGGCTTTAAAGCGAGCGGCGCGGAAGGCAATAATGTTGTTGGCCTGCTGCGTGATGCGATTAAACGTCGTGGCGATTTTGAAATGGATGTGGTTGCGATGGTGAATGATACCGTTGCGACCATGATTAGCTGCTATTATGAAGATCATCAGTGCGAAGTTGGCATGATTGTTGGCACCGGTTGCAATGCGTGCTATATGGAAGAAATGCAGAACGTTGAACTGGTGGAAGGCGATGAAGGCCGTATGTGCGTGAATACCGAATGGGGCGCGTTTGGCGATAGCGGCGAACTGGATGAATTTCTGCTGGAATATGATCGTCTGGTTGATGAAAGCAGCGCGAATCCGGGCCAGCAGCTGTATGAAAAACTGATTGGCGGCAAATATATGGGCGAACTGGTGCGTCTGGTGCTGCTGCGTCTGGTTGATGAAAACCTGCTGTTTCATGGCGAAGCGAGCGAACAGCTGCGTACCCGTGGCGCGTTTGAAACCCGTTTTGTTAGCCAGGTTGAAAGCGATACCGGCGATCGTAAACAGATTTATAATATTCTGAGCACCCTGGGCCTGCGTCCGAGCACCACCGATTGCGATATTGTGCGTCGTGCGTGCGAAAGCGTGAGCACCCGTGCGGCGCATATGTGCAGCGCGGGCCTGGCGGGCGTTATTAATCGTATGCGTGAAAGCCGTAGCGAAGATGTGATGCGTATTACCGTTGGTGTTGATGGCAGCGTTTATAAACTGCATCCGAGCTTTAAAGAACGTTTTCATGCGAGCGTGCGTCGTCTGACCCCGAGCTGCGAAATTACCTTTATTGAAAGCGAAGAAGGCAGCGGTCGTGGCGCGGCGCTGGTTAGCGCGGTTGCGTGCAAAAAAGCGTGCATGCTGGGCCAGTGA
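The idea behind codon optimization can be sketched as a lookup from each amino acid to one preferred codon. This is a deliberately naive illustration. The table `PREFERRED_CODON` is an assumption assembled from codon choices visible in the optimized sequence above, not an authoritative E. coli usage table, and `naive_codon_optimize` is a hypothetical helper. Real tools also balance codon usage, GC content, repeats and restriction sites.

```python
# Illustrative one-codon-per-residue table (an assumption, not measured usage data).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TGA",  # stop codon
}


def naive_codon_optimize(protein: str) -> str:
    """Reverse-translate a protein using one fixed codon per amino acid."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())


# First residues of glucokinase as encoded by the CDS, plus a stop codon.
print(naive_codon_optimize("MLDDRARMEAAKKEK*"))
```

A fixed-codon scheme like this tends to create runs such as AAAAAA for consecutive lysines, which is one reason practical optimizers vary synonymous codons.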

3.4. You have a sequence! Now what?

What technologies could be used to produce this protein from your DNA? To produce Glucokinase from the optimized DNA sequence, two main approaches can be utilized:

  1. Cell-dependent methods: This is the most common approach, using E. coli as a “chassis”. The DNA sequence is inserted into a circular DNA molecule called a plasmid and then introduced into the bacteria through a process called transformation (often using heat shock). The living bacteria then act as biological factories to produce the protein.
  2. Cell-free methods: These methods involve using cell lysates (essentially the “guts” or internal machinery of a cell) to produce proteins without the need for a living organism. This allows for rapid prototyping and can even be incorporated into materials like textiles.

How the DNA sequence is transcribed and translated into your protein: The production follows the Central Dogma of molecular biology:

a) Transcription: The process begins when an enzyme called RNA polymerase binds to the DNA template and “reads” the sequence to create a complementary strand of mRNA.

b) Translation: This mRNA is then read by a ribosome in sets of three nucleotides called codons. Each codon corresponds to a specific amino acid. Transfer RNA (tRNA) molecules bring the correct amino acids to the ribosome, which links them together into a long chain that eventually folds into the functional Glucokinase protein.
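The two steps can be mimicked in a few lines of Python, which is also a handy way to check that two different DNA sequences encode the same protein. This is a minimal sketch using the standard genetic code. The helper names `transcribe` and `translate` are illustrative, and the two inputs are the first 45 nucleotides of the native CDS and of the E. coli-optimized sequence given earlier.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def transcribe(coding_dna: str) -> str:
    """Coding-strand DNA (5' to 3') to mRNA: same sequence with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA codon by codon until a stop codon or the end of the sequence."""
    dna = mrna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


native = "atgttggatgacagagccaggatggaggccgccaagaaggagaag"
optimized = "ATGTTAGATGATCGTGCGCGTATGGAAGCGGCGAAAAAAGAAAAA"
print(translate(transcribe(native)))
print(translate(transcribe(native)) == translate(transcribe(optimized)))
```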

PART 4. Prepare a Twist DNA synthesis order

link: https://benchling.com/s/seq-w4sU7egh8kNc9MQL1cVB?m=slm-dCvQA2Kij6DxaFgAHkwi

part 4.3 Link: https://benchling.com/s/seq-XorsFADOnAX2SybWLZmZ?m=slm-zfSOr0QI0HTnmSynYfuH

Part 5.

5.1 DNA Read

  1. What DNA would you want to sequence and why? I would want to sequence the human GCK (Glucokinase) gene variants in patients with atypical metabolic profiles. Sequencing this DNA is critical for human health research because mutations in GCK are directly linked to conditions like Maturity-Onset Diabetes of the Young type 2 (MODY2) and hyperinsulinemic hypoglycemia. By reading these sequences, researchers can characterize variant mechanisms and improve clinical diagnostics.

  2. Technology and Details

a) Technology: I would use Illumina Sequencing (Next-Generation Sequencing), as it is the standard mentioned in the sources for high-throughput analysis of human samples.

b) Generation: This is a second-generation technology. It is characterized by massive parallelism, allowing millions of DNA fragments to be sequenced simultaneously.

c) Input and Preparation: The input is genomic DNA or cDNA. Preparation involves fragmentation of the DNA into smaller pieces, adapter ligation to attach specific sequences to the ends of fragments, and PCR amplification to create clusters for signal detection.

d) Essential Steps and Base Calling: The technology uses “sequencing by synthesis” where fluorescently labeled nucleotides are added to the DNA template. During each cycle, the machine captures the fluorescence color (red, green, blue, or yellow) emitted as a base is incorporated, a process known as base calling.

e) Output: The primary output is a FASTQ file containing the reads (strings of A, T, G, and C) and their corresponding per-base quality scores.
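To make the output format concrete, here is a minimal sketch of reading one FASTQ record and decoding its quality string. It assumes the common Phred+33 encoding (quality = ASCII code minus 33) and a simple four-line record. The function names and the example read are made up for illustration.

```python
def parse_fastq_record(lines):
    """Parse one four-line FASTQ record into (read_id, sequence, phred_scores)."""
    header, seq, _plus, qual = (line.strip() for line in lines[:4])
    read_id = header[1:]                      # drop the leading '@'
    phred = [ord(ch) - 33 for ch in qual]     # Phred+33 decoding (assumed encoding)
    return read_id, seq, phred


def error_probability(q: int) -> float:
    """Probability that a base call with Phred score q is wrong: 10^(-q/10)."""
    return 10 ** (-q / 10)


record = ["@read1", "ACGT", "+", "II5!"]      # hypothetical example read
read_id, seq, phred = parse_fastq_record(record)
print(read_id, seq, phred)
print([error_probability(q) for q in phred])
```

A score of 20 corresponds to a 1% chance that the base call is wrong, and a score of 40 to 0.01%.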

5.2 DNA Write

  1. What DNA would you want to synthesize and why? I would want to synthesize an optimized expression cassette for human Glucokinase (GCK). This would be used for therapeutics and drug discovery, specifically to produce functional GCK protein in E. coli or mammalian cell lines to test new activators for diabetes treatment. I would use the codon-optimized sequence we generated previously to maximize protein yield.

  2. Technology and Details

a) Technology: I would use silicon-based DNA synthesis provided by companies like Twist Bioscience.

b) Essential Steps: The process involves designing the sequence digitally in tools like Benchling, performing codon optimization for the host organism, and then using a silicon chip to print thousands of tiny DNA fragments (oligonucleotides) simultaneously using chemical synthesis (phosphoramidite chemistry). These fragments are then assembled into full-length genes.

c) Limitations: The primary limitation is sequence complexity: sequences with high GC content, repetitive regions, or long homopolymers (e.g., many ‘A’s in a row) are very difficult to synthesize and may fail during the printing process.

5.3 DNA Edit

  1. What DNA would you want to edit and why? In alignment with the goals of Colossal Biosciences, I would want to edit the genome of an Asian Elephant to include specific genes from the Woolly Mammoth. The goal of this “de-extinction” project is to restore historic animals to their ecological roles, which can help in nature conservation and the restoration of Arctic ecosystems. These edits would focus on traits like cold tolerance, hair growth, and fat distribution.

  2. Technology and Details

a) Technology: I would use CRISPR-Cas9 technology, as it is the most precise and versatile tool for targeted genome editing discussed in class.

b) How it works: CRISPR-Cas9 acts as “molecular scissors.” A guide RNA (gRNA) is designed to match a specific target sequence in the DNA. The Cas9 enzyme then follows this guide to the exact location and creates a double-strand break. The cell then repairs this break, allowing us to delete or insert specific mammoth-like genetic information.

c) Preparation and Input: Preparation requires designing the guide RNA using digital tools to ensure specificity. The input includes the Cas9 enzyme, the gRNA, a DNA template for the desired mammoth traits, and the target host cells (elephant cells).

d) Limitations: Key limitations include efficiency (the edit may not happen in every cell) and precision (potential “off-target” effects where the enzyme cuts at unintended locations similar to the target).
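As a rough illustration of the guide-design step, the sketch below scans a DNA sequence for 20-nucleotide candidate targets that are immediately followed by an NGG PAM, the motif required by the commonly used Streptococcus pyogenes Cas9. It is a simplification: only the forward strand is searched, no off-target scoring is done, and `find_spcas9_sites` is a hypothetical helper name.

```python
import re


def find_spcas9_sites(seq: str, guide_len: int = 20):
    """Return (position, protospacer, PAM) for every forward-strand site
    where `guide_len` bases are followed by an NGG PAM."""
    seq = seq.upper()
    # The lookahead lets overlapping candidate sites be reported as well.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


toy_target = "A" * 20 + "TGG" + "CCC"   # made-up sequence with a single PAM
for position, protospacer, pam in find_spcas9_sites(toy_target):
    print(position, protospacer, pam)
```

Real design tools would additionally search the reverse strand and rank each candidate by similarity to other genomic sites, which addresses the off-target limitation mentioned above.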

Week 3 HW: Lab automation

PART 1. ![cover image](OpentronLab ART.png) artistic design using the GUI LINK: https://opentrons-art.rcdonovan.com/?id=98conne30870554

PART 2. ARTICLE “An Automated Versatile Diagnostic Workflow for Infectious Disease Detection in Low-Resource Settings” DOI: https://doi.org/10.3390/mi15060708

The article highlights how implementing Opentrons for automated workflows in hospital and clinical settings helps significantly reduce turnaround times and accelerates overall logistics. By increasing sample throughput and enabling the simultaneous processing of multiple samples, the system greatly enhances operational efficiency. Furthermore, automation reduces the risk of human error inherent in manual repetitive tasks and minimizes the possibility of sample contamination or compromising the diagnostic process, ensuring more reliable results.

The Opentrons-based system was demonstrated with a test for Neisseria meningitidis (a cause of meningitis) and consists of four integrated modules:

  1. DNA Isolation: Utilizing magnetic beads to purify pathogen DNA.
  2. DNA Amplification: Performing isothermal Recombinase Polymerase Amplification (RPA) at 37°C, which avoids the need for complex thermal cycling.
  3. DNA Digestion: Using exonucleases to convert double-stranded amplicons into single-stranded DNA.
  4. DNA Detection: Employing a vertical flow microarray (VFM) on paper, where gold nanoparticles create a colorimetric signal for visual results.

The automated process takes 110 minutes, making it approximately 18% faster than manual processing. Additionally, the cost is roughly $16 per sample, which is significantly more affordable than the estimated $94 for a standard PCR test.

PART 3. Final Project Automation Plan: “Bio-Hybrid Designer”

My plan is to use automation to optimize hybrid biosynthetic pathways for complex drugs like paclitaxel. The goal is to identify the most efficient “transition point” where biological synthesis (producing precursors like baccatin III) should end and selective chemical synthesis should begin.

  1. Cloud Lab Implementation (Ginkgo Nebula) To achieve high-throughput screening of enzymatic variants and chemical cofactors, I will utilize the Ginkgo Nebula platform. The workflow will follow these steps:
  • Acoustic Liquid Handling: Use the Echo 525 to transfer nanoliter-scale droplets of DNA constructs and cofactors into destination plates.
  • Reagent Stamping: Use the Bravo system to stamp cell-free protein synthesis (CFPS) master mixes into 384-well plates.
  • Incubation: Seal plates with PlateLoc and incubate in Inheco modules at 37°C to express the enzymes.
  • Detection: After unsealing with XPeel, use the PHERAstar to measure fluorescence from biosensors, providing data for the next Design-Build-Test-Learn (DBTL) cycle.
  2. Local Prototyping (Opentrons OT-2) I will use the Opentrons OT-2 for initial protocol validation and mixing of chemical precursors. I will use a standardized layout:
  • Slot 5: Agar plate or reaction plate for the final output.
  • Slot 6: Source plate for bacterial cultures or chemical reagents.
  • Slot 9: Tip rack for the P20 pipette.

a) Python Script / Pseudocode Example (tentative; I am not yet sure whether I will use it) My script will incorporate the dispense_and_jog function to ensure the pipette moves horizontally before dispensing vertically, preventing the tip from scratching the agar surface.

from opentrons import protocol_api

metadata = {
    'apiLevel': '2.13',  # Python Protocol API version (required by API v2)
    'protocolName': 'Bio-Hybrid Path Optimization',
    'author': 'User',
    'description': 'Automated mixing of chemical cofactors with biosynthetic precursors'
}


def run(protocol: protocol_api.ProtocolContext):
    # Load labware into the deck slots listed above
    tips = protocol.load_labware('opentrons_96_tiprack_20ul', '9')
    plate = protocol.load_labware('corning_96_wellplate_360ul_flat', '5')
    reagents = protocol.load_labware(
        'opentrons_24_tuberack_eppendorf_1.5ml_safelock_snapcap', '6')

    p20 = protocol.load_instrument('p20_single_gen2', 'right', tip_racks=[tips])

    # Transfer 5 µL of the reagent in tube A1 into each of the first eight wells
    for well in plate.wells()[:8]:
        p20.pick_up_tip()
        p20.aspirate(5, reagents['A1'])
        # Plain dispense for now; the dispense_and_jog helper would replace
        # this call when plating on agar
        p20.dispense(5, well)
        # Mix three times with 10 µL (assumes the well already holds liquid)
        p20.mix(3, 10, well)
        p20.drop_tip()

Subsections of Labs

Week 1 Lab: Pipetting


Subsections of Projects

Individual Final Project


Group Final Project
