Governance means different things to different people, but what truly matters to me is building systems that share power rather than seize it. I imagine resilient structures that adapt, nurture well-being in many forms, and thrive on transparency and shared commitment.
First, describe a biological engineering application or tool you want to develop and why
Plants can be reciprocal and help each other. For example, ectomycorrhizal (ECM) fungi, which form sheaths around roots, help restore soil and make plants more resilient, especially in the face of climate change. I want to create a system inspired by the ectomycorrhizal (ECM) fungi living model of resilience. By mimicking their metabolism and distribution, we could recover and manage resources more wisely, whether in organizations, food systems, or conservation efforts. Imagine a biological pump that naturally generates, transports, and processes information, strengthening our collective resilience.
Here is the reference of the assignment: https://2026a.htgaa.org/2026a/course-pages/weeks/week-02/lab/index.html
Basic Understanding DNA Gel, restriction enzymes, Benchling intro, Twist intro (▶️Recording | 💻Slides)
easy content in Spanish: ¿Cómo hacer EDICIÓN GENÉTICA con CRISPR? https://www.youtube.com/watch?v=UaxrYWCyLdY&t=1s
As a committed listener in distance, I can only make the benching :
First step: https://www.benchling.com/, make an account. Second Step: choose a DNA sequence: https://www.neb.com/en-gb/tools-and-resources/interactive-tools/dna-sequences-and-maps-tool Begin by importing your DNA sequence and use the Digests tool to test the effects of different restriction enzyme(s). Export your final design as a png and compare with your lab results on your Notion page. See the images below for where to find the Digests tool, selecting the “NEB 2-log” ladder in the Virtual Digest tab, and how to have multiple Digests appear in the same Virtual Digest.
basic Concepts
Your task this week is to Create a Python file to run on an Opentrons liquid handling robot.
Review this week’s recitation and this week’s lab for details on the Opentrons and programming it. Generate an artistic design using the GUI at opentrons-art.rcdonovan.com. Using the coordinates from the GUI, follow the instructions in the HTGAA26 Opentrons Colab to write your own Python script that draws your design using the Opentrons. I took the Elephant as a starting point for my art in OpenTrons. This was an experience we complemented by understanding th step by step how to set up the Opentrons machine
Basics
Questions
Amino Acids, Protein Structure, and β-Sheets 1. How many molecules of amino acids do you take with a piece of 500 grams of meat? It depends on the type and cut of meat, but a reasonable estimate is about 20–26 g of protein per 100 g of beef. Therefore, 500 g of meat would provide approximately 100–130 g of protein.
Still not really sure what I am doing :-)!
Basics
Design short peptides that bind mutant SOD1. “Design short peptides that bind mutant SOD1” means creating small, synthetic chains of amino acids (peptides) specifically engineered to attach to a deformed version of the Superoxide Dismutase 1 (SOD1) protein. This is a therapeutic strategy aimed at treating SOD1-related Amyotrophic Lateral Sclerosis (ALS).
Mutant SOD1: A faulty version of the SOD1 enzyme produced due to genetic mutations. Unlike healthy SOD1, mutant SOD1 misfolds, becomes unstable, and aggregates (clumps together), leading to toxicity in motor neurons.
basic Concepts
This week we learn core molecular biology tools and techniques for processing and assembling DNA, including PCR and Gibson Assembly.
Components of Phusion High-Fidelity PCR Master Mix Phusion Master Mix contains several key components:
Phusion Hot Start II DNA Polymerase — A high-fidelity polymerase with a proofreading (3’→5’ exonuclease) domain that corrects misincorporated bases, resulting in ~50× lower error rates than Taq. It also has a processivity-enhancing domain that speeds up elongation. dNTPs (dATP, dCTP, dGTP, dTTP) — The nucleotide building blocks incorporated during strand synthesis. MgCl₂ — Magnesium ions are an essential cofactor for DNA polymerase activity and also stabilize the dNTP substrates. Optimized reaction buffer — Maintains proper pH and ionic conditions for polymerase activity and primer/template annealing. Stabilizers/additives — Help maintain enzyme stability and can improve yield on difficult templates (e.g., GC-rich regions). 2. Factors Determining Primer Annealing Temperature Primer GC content — G·C pairs form 3 hydrogen bonds vs. 2 for A·T, so higher GC content raises the melting temperature (Tm). A rough formula is Tm = 4(G+C) + 2(A+T). Primer length — Longer primers have higher Tm values because more base-pair interactions must be disrupted. Salt/ion concentration — Higher Mg²⁺ or monovalent salt concentrations stabilize the DNA duplex and raise Tm. Primer secondary structure — Hairpins or self-dimers can reduce effective annealing efficiency. Template secondary structure — Highly structured templates may require higher annealing temperatures or additives like DMSO. Mismatches — Deliberate mismatches (e.g., for mutagenesis) lower Tm and require adjusted annealing temperatures. Annealing temperature rule of thumb — Typically set 5°C below the lower Tm of the two primers used. 3. PCR vs. Restriction Enzyme Digests Feature PCR Restriction Enzyme Digest Input template Any DNA (plasmid, genomic, cDNA) Usually plasmid or purified DNA Output Amplified, defined fragment Fragment(s) cut at specific recognition sites End type Blunt (Phusion) or 3’ A-overhang (Taq) Blunt or sticky (cohesive) ends depending on enzyme Precision Defined by primer design; any sequence Defined by restriction site locations in DNA Flexibility Very high — you design the fragment Limited to where restriction sites naturally exist Time ~1–3 hours ~1–2 hours Error risk Polymerase errors possible (mitigated by HiFi) No sequence errors; only wrong cut possible Requires sequence knowledge? Yes, for primer design Yes, to identify restriction sites When to prefer PCR You need to amplify a fragment from a complex mixture (e.g., genomic DNA). You want to add sequences (overhangs, restriction sites, Gibson overlaps) to the ends of a fragment. No convenient restriction sites flank your gene of interest. You are introducing a point mutation or modifying a sequence. When to prefer restriction enzyme digest You are sub-cloning between two vectors that already have compatible restriction sites. You need sticky ends for directional cloning. You want to cut a vector backbone without amplifying it (avoids PCR errors in the vector). Speed and simplicity are priorities when restriction sites are already present. 4. Ensuring Compatibility with Gibson Assembly Gibson Assembly requires fragments with overlapping homologous sequences (~15–30 bp) at their ends. To ensure compatibility:
What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions? IANNs have several advantages over traditional Boolean genetic circuits.
First, they can process continuous and graded inputs rather than only treating signals as ON or OFF. This is important because many biological signals, such as metabolite concentrations, transcription factor levels, or signaling gradients, are not binary.
Second, IANNs can perform weighted integration of multiple inputs. Instead of responding only when a rigid logical condition is met, they can combine signals with different strengths, similar to how neurons sum inputs.
Subsections of Homework
Week 1 HW: Principles and Practices
Governance
means different things to different people, but what truly matters to me is building systems that share power rather than seize it. I imagine resilient structures that adapt, nurture well-being in many forms, and thrive on transparency and shared commitment.
First, describe a biological engineering application or tool you want to develop and why
Plants can be reciprocal and help each other. For example, ectomycorrhizal (ECM) fungi, which form sheaths around roots, help restore soil and make plants more resilient, especially in the face of climate change.
I want to create a system inspired by the ectomycorrhizal (ECM) fungi living model of resilience. By mimicking their metabolism and distribution, we could recover and manage resources more wisely, whether in organizations, food systems, or conservation efforts. Imagine a biological pump that naturally generates, transports, and processes information, strengthening our collective resilience.
Arbuscular mycorrhizae (AM), a type of endomycorrhizal fungi, associate with 80% of terrestrial plants. AM supports soil stabilisation, nutrient cycling, microbial diversity, and soil organic matter (SOM) formation.
Figure 1. “Diagram showing the updated conceptual framework of arbuscular mycorrhizal (AM) fungi-mediated soil organic matter (SOM) dynamics. Plants fix carbon through photosynthesis, which is then delivered to AM fungi. Arbuscular mycorrhizal fungi influence SOM dynamics through four pathways classified as (1) generating, (2) reprocessing, (3) reorganizing, and (4) stabilizing” Wu et al. 2024. (Image credits: https://doi.org/10.1111/nph.19178)reference: https://cid-inc.com/blog/2024-research-insights-how-mycorrhizal-fungi-benefits-agriculture/
Wu et al. (2024) introduced a framework explaining AM’s role in soil organic matter formation, considering factors such as organic compound diversity, mineral weathering, chemical interactions, and hyphosphere microbial contributions. They identify four pathways: SOM generation, reprocessing, reorganisation, and stabilisation (see Figure 1).
Generating: AM fungi produce exudates, metabolites, mucilage, and necromass. Wu et al. refer to the diversity, composition, and properties of these compounds as chemodiversity. They argue that chemodiversity, rather than individual compounds, is key to SOM composition, microbial biodegradation rates, and persistence.
Reprocessing: AM fungi attract specific microbes to their hyphosphere, the soil area influenced by AM hyphal exudates. These microbes drive soil biochemistry by decomposing SOM components that AM fungi cannot process. Through internal and extracellular pathways, they break down and assimilate SOM, contributing to chemodiversity, persistence, and SOM resynthesis. This process is known as the hyphosphere “microbial carbon pump.”
Reorganising: The fungi’s mycelial growth, expansion, and colonisation change the soil’s physical porosity and hydraulic properties. While AM stabilises macro soil aggregates, the mycelial dynamics increase micro aggregate turnover, water infiltration, soil-water retention capacity, hydraulic conductivity, and redistribution of AM exudates. The changing soil conditions due to mycelial expansion also change nutrient availability, temperature, and oxygen. It results in SOM redistribution and transformation.
Stabilising: AM fungi cause mineral weathering and alter interactions that influence SOM formation and stabilisation. AM rock mineral weathering makes nitrogen, phosphorus, potassium, and magnesium available in soils that form secondary compounds with different sizes, surfaces, and reactivity. This can alter mineral absorption, catalysis, and oxidation of SOM. These processes are called the “soil mineral carbon pump”.
Takeaway: The new concept can explain AM’s role in small- to large-scale SOM dynamics, which can help develop mycorrhiza-based technologies to enhance soil health.
I am facinating of the concept of carbon cycling and the major processes and mechanisms involved in this process, and how it can create positive changes in the ecosystem:
Figure 2. The solubility carbon pump (SCP) is driven by the difference of CO2 partial pressures between the atmosphere and surface waters; exchanges of CO2 occur through dissolution into water or release into the air. Generally, the SCP refers to the pumping of CO2 from the atmosphere into the ocean driven by abiotic processes such as lowering temperature and downward mixing. The biological carbon pump (BCP) refers to a series of biogeochemical processes that transport organic carbon (mainly particulate organic carbon (POC)) from the surface to the ocean interior
Figure 3. Microorganism–dissolved organic matter (DOM) complex networks consist of two types of nodes: microbial and DOM. Connections are made between nodes based on correlations of data sets. Microbial diversity can be analysed using 16S rRNA amplicons, metagenomics, metatranscriptomics, and metaproteomics, as described in Jiao, N., Luo, T., Chen, Q. et al. The microbial carbon pump and climate change. Nat Rev Microbiol 22, 408–419 (2024). https://doi.org/10.1038/s41579-024-01018-0
Next, describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future, like ensuring non-malfeasance (preventing harm). Break big goals down into two or more specific sub-goals
Restoring governance in our ecosystems, especially within food systems and conservation, is crucial for protecting key species. By studying these organisms, we can build a deeper, more collaborative knowledge base and move toward distributed, systemic governance.
Purpose: Transform the soil by introducing this plant, aiming to regenerate and sustain a thriving, healthy food environment.
Design: Build partnerships among small businesses, farmers, and policymakers to expand these practices across diverse ecosystems.
Assumptions: it is not a law or regutlation is ian a mediation with the ecosystmen
Risks of Failure & Success: Outcomes depend on the ecosystem, but the ultimate goal is to design interventions that measure plant survival and resilience as the true indicators of success.
Although I am not entirely certain how the process functions, I believe in the potential of replicating plant-based or bioinspired systems to support regeneration and address deforestation within food systems.
First, implementing plant-based systems can address the irrigation and soil nutrition needs of various sectors. For example, cacao plantations, which are highly sensitive to extreme temperatures, could benefit from such approaches and support regenerative agriculture. Monitoring these plants could facilitate the development of sensing systems and enable the implementation of more complex networks for reforestation efforts.
Idea 2
My family’s cacao farm in Colombia has always sparked my curiosity about our crop’s roots and how we might make it more sustainable. For years, we have wondered about the true quality of our cacao and whether it belongs to the prized Criollo variety, since its genetics remain a mystery. Now, I am exploring how synthetic biology could unlock new possibilities for our farm, strengthen local leadership, and add value by revealing and celebrating the unique DNA of our cacao. hire a vision with a longer term
Step 1: Identify and document ancestral cacao varieties, which can be the first stage of a HTGAA
Map existing Criollo cacao varieties and their genetic diversity.
Record sensory profiles, cultural significance, and local cultivation knowledge. Connect with the idea 1 to produce regenerative practices
Collaborate with farmers, researchers, and local organisations. to contribut into a ecosytem DNA picture
Develop genetic repositories combining scientific methods with community participation.
Ensure fair access, ethical data governance, and benefit sharing with producer communities.
Integrate sustainable agricultural policies and biodiversity frameworks.
Step 3: Strengthen biodiversity conservation
Preserve endangered cacao genetic resources.
Promote agroecological cultivation models that maintain ecosystem health.
Support climate resilience through genetic diversity.
Step 4: Empower local farming communities
Recognise farmers as co-stewards of genetic heritage.
Provide training, technical support, and participatory decision-making spaces.
Foster economic opportunities linked to high-quality heritage cacao.
Step 5: Support sustainable certification and traceability
Use genetic data to strengthen transparency in cacao supply chains.
Enable certification schemes that reflect biodiversity conservation and ethical production.
Improve governance mechanisms linking agriculture, sustainability, and cultural heritage.
Does the option:
Option 1
Option 2
Option 3
Enhance Biosecurity
• By preventing incidents
x
• By helping respond
Foster Lab Safety
• By preventing incident
• By helping respond
Protect the environment
x
• By preventing incidents
• By helping respond
x
Other considerations
• Minimizing costs and burdens to stakeholders
• Feasibility?
• Not impede research
x
• Promote constructive applications
x
Bio Questions Week 1 <3 !
-> Question by J. Jacobson´s Presentation <-
Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome. How does biology deal with that discrepancy?
As all concepts are new for me, I might need a bit more explanation of all concepts
A polymerase is a type of enzyme, such as DNA or RNA polymerase, that synthesises long chains of nucleic acids by adding nucleotides to a template strand. These enzymes are essential for critical biological processes, including DNA replication, repair, and transcription. Polymerases are also vital in laboratory applications, most notably in the polymerase chain reaction (PCR) for amplifying DNA. https://en.wikipedia.org/wiki/Polymerase
Function: DNA polymerases catalyze the synthesis of DNA by adding nucleoside triphosphates, creating two identical DNA duplexes from one.
Types: DNA polymerase (replicates DNA) and RNA polymerase (transcribes DNA into RNA) are the primary types, found in all living organisms.
Mechanism: They require a primer and a template strand to function, adding nucleotides in a
to
direction.
Accuracy: DNA polymerases often have built-in proofreading abilities (
to
exonuclease activity) to ensure high-fidelity replication.
PCR Application: Thermostable polymerases, such as Taq polymerase, are used in PCR to automate DNA copying, allowing for billions of copies to be made in a few hours.
Structural Variation: Polymerases range from simple single proteins to complex, multi-subunit assemblies.
What is the error rate of polymerase?
Throughput Error Rate Product Differential: ~10⁹
based on a consensus DNA polymerase is extremely accurate, with typical error rates around 1 mistake per 10⁵–10⁷ bases before cellular repair, and ~10⁹–10¹¹ per base per replication after all proofreading and repair.
How does this compare to the length of the human genome?
3.2 Gbp
How does biology deal with that discrepancy?
By error correcton MutS Repair System
by consensus: Biology layers multiple error‑correction systems to shrink polymerase’s raw error rate down to ~1 mutation per genome copy.
How many different ways are there to code (DNA nucleotide code) for an average human protein?
I am not sure, but in this paper they present a method for constructing complex and diverse DNA sequences using DNA three-way junctions. Theoretically, because of genetic code degeneracy, an “average” human protein can be encoded by an astronomically large number of DNA sequences.
In practice, what are some of the reasons that all of these different codes don’t work to code for the protein of interest?
I guess because the fragmentation :S
-> Questions by LEPROUST´s Presentation <-
What’s the most commonly used method for oligo synthesis currently?
Oligonucleotide Synthesis
Why is it difficult to make oligos longer than 200nt via direct synthesis?
I didn´t find the aswer but I found this: phosphoramidite synthesis struggles beyond ~150–200 nt because stepwise inefficiency and side reactions make long, error‑free chains very rare. Filges, S., Mouhanna, P., & Ståhlberg, A. (2021). Digital Quantification of Chemical Oligonucleotide Synthesis Errors. Clinical chemistry. https://doi.org/10.1093/clinchem/hvab136.
Why can’t you make a 2000bp gene via direct oligo synthesis?
If I understand correctly, it is more efficient to work based on the Twist Silicon Platform, which can produce 9,600 genes. but not sure
-> Question by George Church´s Presentation <-
[Using Google & Prof. Church’s slide #4] What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?
Across vertebrates, most mammals (including humans) require nine essential amino acids:
histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, valine
Biochemically, a “lysine contingency” (as in Jurassic Park) is plausible only in the trivial sense that any essential amino acid could serve as a single‑point nutritional dependency. Lysine is important and often limiting, but it is not uniquely essential compared with the rest of the conserved EAA set, and species differ enough that no single “10‑EAA rule” applies universally. Mccann, J., & Rawls, J. (2023). Essential Amino Acid Metabolites as Chemical Mediators of Host-Microbe Interaction in the Gut.. Annual review of microbiology. https://doi.org/10.1146/annurev-micro-032421-111819.
As a committed listener in distance, I can only make the benching :
First step: https://www.benchling.com/, make an account. Second Step: choose a DNA sequence: https://www.neb.com/en-gb/tools-and-resources/interactive-tools/dna-sequences-and-maps-tool Begin by importing your DNA sequence and use the Digests tool to test the effects of different restriction enzyme(s). Export your final design as a png and compare with your lab results on your Notion page. See the images below for where to find the Digests tool, selecting the “NEB 2-log” ladder in the Virtual Digest tab, and how to have multiple Digests appear in the same Virtual Digest.
Part 0: Basics of Gel Electrophoresis
We did this call nicely online!
Thanks to the introduction from Digby Usher. Here are a couple of pics of the experience:
The work is saved in the following link: https://benchling.com/s/seq-33Yxt01UY8AwvjfPOIbd?m=slm-XZXh4GbXIU8Dh07RBlDZ
Attend or watch all lecture and recitation videos. Optionally watch bootcamp.
Part 2: Gel Art - Restriction Digests and Gel Electrophoresis ( just the play with the gel art interface)
Part of the assignment was to create a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks.
What was very useful for understanding the second part, at least the preparation for the lab, was Ronan’s website (Ronan’s website, a helpful tool for quickly iterating on designs!
For this interface, I play with the Enzimes:
EcoRI, EcoRV,Sall and SacI
Restriction Digest Parameters
37°C for 60 minutes (incubation)
80°C for 20 minutes (heat inactivation) (optional)
DNA Gel Electrophoresis
Hand-Cast Gel
Digest
14.7 μL Water
3.3 μL Loading Dye
2 μL Digest
Which protein have you chosen and why?
I selected the Protein Titin, which is related to muscle function and movement; I used UniProt to look up information about the protein and found fascinating functions.
Cell color indicative of number of GO terms
Aspect Term
Molecular Function actin bindingSource:UniProtKB-KW
Molecular Function alpha-actinin bindingSource:MGI1 publication
Molecular Function axon guidance receptor activitySource:GO_Central
Molecular Function structural constituent of muscleSource:ProtInc1 publication
Biological Process homophilic cell adhesion via plasma membrane adhesion moleculesSource:GO_Central
Biological Process muscle contractionSource:ProtInc1 publication
Biological Process synapse organizationSource:GO_Central
3.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.
This is the Exercise: The Central Dogma discussed in class and recitation describes the process by which the DNA sequence becomes transcribed and translated into protein. The Central Dogma gives us the framework to work backwards from a given protein sequence and infer the DNA sequence that the protein is derived from. Using one of the tools discussed in class, NCBI or online tools (google “reverse translation tools”), determine the nucleotide sequence that corresponds to the protein sequence you chose above.
LOCUS NW_023416316 74216526 bp DNA linear CON 30-SEP-2020
DEFINITION Myotis myotis isolate mMyoMyo1 unplaced genomic scaffold,
mMyoMyo1.p scaffold_m19_p_12, whole genome shotgun sequence.
ACCESSION NW_023416316
VERSION NW_023416316.1
DBLINK BioProject: PRJNA665501
BioSample: SAMN14734277
Assembly: GCF_014108235.1
KEYWORDS WGS; RefSeq.
SOURCE Myotis myotis
ORGANISM Myotis myotis
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Laurasiatheria; Chiroptera; Yangochiroptera;
Vespertilionidae; Myotis.
REFERENCE 1 (bases 1 to 74216526)
AUTHORS Jebb,D., Huang,Z., Pippel,M., Hughes,G.M., Lavrichenko,K.,
Devanna,P., Winkler,S., Jermiin,L.S., Skirmuntt,E.C.,
Katzourakis,A., Burkitt-Gray,L., Ray,D.A., Sullivan,K.A.M.,
Roscito,J.G., Kirilenko,B.M., Davalos,L.M., Corthals,A.P.,
Power,M.L., Jones,G., Ransome,R.D., Dechmann,D.K.N.,
Locatelli,A.G., Puechmaille,S.J., Fedrigo,O., Jarvis,E.D.,
Hiller,M., Vernes,S.C., Myers,E.W. and Teeling,E.C.
TITLE Six reference-quality genomes reveal evolution of bat adaptations
JOURNAL Nature 583 (7817), 578-584 (2020)
PUBMED 32699395
COMMENT REFSEQ INFORMATION: The reference sequence is identical to
JABWUV010000012.1.
Assembly name: mMyoMyo1.p
The genomic sequence for this RefSeq record is from the
whole-genome assembly released by the Bat1K on 2020/08/07. The
original whole-genome shotgun project has the accession
JABWUV000000000.1.
##Genome-Assembly-Data-START##
Assembly Provider :: Bat1K
Assembly Date :: 16-APR-2019
Assembly Method :: DAmar v. APRIL-2019; Bionano Solve DLS v.
3.3; Salsa2 HiC v. git commit: e3ae7d8;
GenomicConsensus v. git commit: 038de5c;
longranger align v. 2.2.0; Freebayes v.
1.2.0; HiGlass manual curation v. 1.5
Assembly Name :: mMyoMyo1.p
Genome Representation :: Full
Expected Final Version :: No
Genome Coverage :: 90.9x
Sequencing Technology :: PacBio Sequel CLR; 10X Genomics chromium
linked reads; Bionano Genomics; Phase
Genomics HiC; PacBio Sequel IsoSeq
##Genome-Assembly-Data-END##
##Genome-Annotation-Data-START##
Annotation Provider :: NCBI
Annotation Status :: Full annotation
Annotation Name :: Myotis myotis Annotation Release 100
Annotation Version :: 100
Annotation Pipeline :: NCBI eukaryotic genome annotation
pipeline
Annotation Software Version :: 8.5
Annotation Method :: Best-placed RefSeq; Gnomon
Features Annotated :: Gene; mRNA; CDS; ncRNA
##Genome-Annotation-Data-END##
Once a nucleotide sequence of your protein is determined, you need to codon optimize your sequence. You may, once again, utilize google for a “codon optimization tool”. In your own words, describe why you need to optimize codon usage. Which organism have you chosen to optimize the codon sequence for and why?
Lysis protein DNA sequence with Codon-Optimization ATGGAAACCCGCTTTCCGCAGCAGAGCCAGCAGACCCCGGCGAGCACCAACCGCCGCCGCCCGTTCAAACATGAAGATTATCCGTGCCGTCGTCAGCAGCGCAGCAGCACCCTGTATGTGCTGATTTTTCTGGCGATTTTTCTGAGCAAATTCACCAACCAGCTGCTGCTGAGCCTGCTGGAAGCGGTGATTCGCACAGTGACGACCCTGCAGCAGCTGCTGACCTAA
Week 3 Automation
basic Concepts
Your task this week is to Create a Python file to run on an Opentrons liquid handling robot.
Using the coordinates from the GUI, follow the instructions in the HTGAA26 Opentrons Colab to write your own Python script that draws your design using the Opentrons.
I took the Elephant as a starting point for my art in OpenTrons. This was an experience we complemented by understanding th step by step how to set up the Opentrons machine
I adapted the colors for the ones of the LifeFabs: purple Pink and Blue, which is actually a great Fluorescent
After exporting the code from the website, I ran the code in the Colab notebook as shown in the following picture
As our protocols didn’t work because we still had some issues in our programming, we did another practical part inoculating bacteria of the diferent colors, which we could set up in the future for the OpenTrons work.
Here is the Python code, and in the lab session you can find the pictures showing how to set up the OpenTrons software.
‘# -- coding: utf-8 --
“““Untitled0.ipynb
1. How many molecules of amino acids do you take with a piece of 500 grams of meat?
It depends on the type and cut of meat, but a reasonable estimate is about 20–26 g of protein per 100 g of beef. Therefore, 500 g of meat would provide approximately 100–130 g of protein.
If we assume an average amino acid residue has a mass of ~100 g/mol, this corresponds to about 1.0–1.3 moles of amino acid residues.
Since 1 mole = 6.022 × 10²³ molecules, this means:
1.0 mole ≈ 6 × 10²³ amino acid residues
1.3 moles ≈ 8 × 10²³ amino acid residues
So, eating 500 g of meat gives you on the order of 6 × 10²³ to 8 × 10²³ amino acid units.
2. Why do humans eat beef but do not become cows, eat fish but do not become fish?
Humans do not become the organisms they eat because food is first digested. Proteins from beef or fish are broken down into amino acids and small peptides in the digestive system. These small molecules are then absorbed and reused by the body to build human proteins, following the instructions encoded in human DNA.
In other words, the body does not copy the identity of the food organism. It only reuses its chemical building blocks.
3. Why are there only 20 natural amino acids?
There are not literally only 20 amino acids in nature, but there are 20 standard amino acids that are universally encoded by the genetic code in most proteins.
These 20 were likely selected during early evolution because they provide:
a broad range of chemical properties
good structural diversity
compatibility with the ribosome
efficient use in the genetic code
They include hydrophobic, polar, charged, aromatic, small, and flexible side chains, which together allow proteins to fold and function in many different ways.
There are also rare exceptions, such as selenocysteine and pyrrolysine, but the core set remains the same.
4. Can you make other non-natural amino acids? Design some new amino acids.
Yes, scientists can make non-natural or non-canonical amino acids. These are useful in chemistry, protein engineering, and synthetic biology.
Examples of designed amino acids
Fluoro-leucine Similar to leucine, but with a fluorine atom added to the side chain. This could change hydrophobicity and stability.
Photo-switch amino acid An amino acid with an azobenzene group in its side chain, allowing it to change shape when exposed to light.
Metal-binding amino acid An amino acid containing a bipyridine-like side chain that can bind metal ions such as copper or zinc.
Redox amino acid An amino acid with a quinone or ferrocene-like group that could participate in electron transfer.
Click-ready amino acid An amino acid containing an azide or alkyne group for bioorthogonal “click” chemistry.
These new amino acids could give proteins new properties such as:
light responsiveness
selective chemical reactivity
conductivity
catalytic activity
metal binding
5. Where did amino acids come from before enzymes that make them, and before life started?
Before life began, amino acids likely formed through prebiotic chemistry. This means they were produced by natural chemical reactions without enzymes or living cells.
Possible sources include:
reactions in the early Earth atmosphere
hydrothermal systems
lightning or UV-driven chemistry
meteorites and extraterrestrial delivery
This suggests that amino acids may have existed before life and later became incorporated into the first biological systems. Enzymes appeared later and made these processes faster and more controlled.
6. If you make an α-helix using D-amino acids, what handedness would you expect?
A normal α-helix made from L-amino acids is usually right-handed.
If the helix were made from D-amino acids, it would be expected to form a left-handed α-helix, which is the mirror image of the normal structure.
7. Can you discover additional helices in proteins?
Yes. Besides the classical α-helix, proteins and peptides can adopt other helical forms.
Examples include:
3₁₀-helices
π-helices
left-handed helices in special contexts
synthetic helical structures designed in peptides and foldamers
It is possible to discover or design additional helices by studying unusual protein structures, computational modeling, and synthetic peptide chemistry.
8. Why are most molecular helices right-handed?
In biology, most helices are right-handed because proteins are built mainly from L-amino acids. The stereochemistry of L-amino acids favors the formation of right-handed α-helices.
So the preference is not random: it arises from the chirality of the molecular building blocks.
9. Why do β-sheets tend to aggregate?
β-sheets tend to aggregate because their peptide backbones can form extensive hydrogen-bonding networks between neighboring strands. These interactions are repetitive and highly stabilizing.
Also, β-strands often expose side chains in an alternating pattern, which makes them good at packing together into larger assemblies such as fibrils.
10. What is the driving force for β-sheet aggregation?
The main driving forces are:
hydrogen bonding between peptide backbones
hydrophobic interactions between side chains
release of water molecules from the interface, which increases solvent entropy
Together, these effects make β-sheet assemblies, especially cross-β structures, very stable.
11. Why do many amyloid diseases form β-sheets?
Many amyloid diseases involve proteins that misfold and then assemble into β-sheet-rich fibrils. The cross-β structure is very stable and can grow by recruiting additional misfolded protein molecules.
This makes β-sheet aggregation a common structural feature in diseases such as:
Alzheimer’s disease
Parkinson’s disease
Huntington’s disease
other protein misfolding disorders
12. Can you use amyloid β-sheets as materials?
Yes. Amyloid β-sheet assemblies can be used as functional biomaterials because they are often:
strong
stable
self-assembling
nanoscale and highly ordered
Potential applications include:
tissue engineering scaffolds
nanomaterials
functional coatings
drug delivery systems
bio-inspired structural materials
So although amyloids are linked to disease, they can also be useful when carefully designed and controlled.
13. Design a β-sheet motif that forms a well-ordered structure.
A good β-sheet design should encourage:
β-strand formation
regular side-chain patterning
controlled intermolecular interactions
reduced disorder at the ends
Example 1: Amphipathic β-strand peptide
Sequence: Ac–Val-Lys-Val-Glu-Val-Lys-Val-Glu–NH2
Why this may work
Val promotes β-strand structure and creates a hydrophobic face.
Lys and Glu create a charged face.
Oppositely charged residues can form salt bridges.
The alternating arrangement supports ordered packing.
N-terminal acetylation and C-terminal amidation reduce end effects.
Example 2: More aggregation-prone fibril-forming motif
Sequence: Ac–Phe-Val-Phe-Val-Lys-Glu-Phe-Val–NH2
Why this may work
Phe and Val strongly favor packing and aggregation.
Aromatic residues may strengthen intermolecular interactions.
Lys/Glu improve some balance between solubility and assembly.
This sequence may form fibrils more easily, but it also carries a higher risk of uncontrolled aggregation.
Example 3: β-hairpin motif with defined turn
Sequence: RGKWTWQ–DPro-Gly–QWTVKGR
Why this may work
The DPro-Gly pair promotes a defined hairpin turn.
The strands can align in a controlled intramolecular β-sheet.
Aromatic and charged residues can help stabilize folding and packing.
This design is often more controlled than open-ended fibril-forming strands.
Briefly describe the protein you selected and why you selected it.
Identify the amino acid sequence of your protein.
I am interested in Proteins that can enable movement or have realtionship in designing soft robotics, so I was interested in the following proteins:
Silk-Elastin-Like Proteins (SELPs)
Silk-Elastin-Like Proteins (SELPs) are a class of genetically engineered, chimeric biopolymers that combine the structural, mechanical properties of silk (specifically Bombyx mori silk fibroin) with the elasticity and thermo-responsiveness of elastin. By leveraging recombinant DNA technology, these proteins can be precisely tailored for applications in biomedical engineering, drug delivery, and tissue engineering.
Silk-Elastin-Like Proteins (SELPs) are engineered block copolymers comprising repeating amino acid sequences of silk, typically GAGAGS (Gly-Ala-Gly-Ala-Gly-Ser), and elastin, often GVGVP (Val-Pro-Gly-Val-Gly). A common, highly studied monomer unit is, one that combines elastic and structural properties
Composition: SELPs are block copolymers, consisting of alternating silk-like motifs (typically GAGAGS) and elastin-like motifs (typically GVGVP).
Self-Assembly: In aqueous solutions, SELPs form micellar-like nanoparticles, with the hydrophobic silk blocks forming the core and the hydrophilic elastin blocks forming the corona.
Stimuli-Responsiveness: SELPs are “smart” materials that respond to environmental triggers, most notably temperature, but also pH, ionic strength, and light.
Mechanical Properties: The silk-to-elastin ratio determines the mechanical behavior. Higher silk content increases beta-sheet formation, resulting in stiffer materials, while higher elastin content increases flexibility.
Production: Produced through E. coli expression systems, allowing for high control over sequence, molecular weight, and monodispersity, which improves reproducibility compared to natural materials
Chambre L, Martín-Moldes Z, Parker RN, Kaplan DL. Bioengineered elastin- and silk-biomaterials for drug and gene delivery. Adv Drug Deliv Rev. 2020;160:186-198. doi: 10.1016/j.addr.2020.10.008. Epub 2020 Oct 17. PMID: 33080258; PMCID: PMC7736173.
Collagen
Collagen’s primary amino acid sequence is characterized by a repeating, unique motif, most commonly Glycine-Proline-X or Glycine-X-Hydroxyproline, where Glycine appears every third residue. These ~1,000 amino acid-long chains form a triple helix, rich in glycine, proline, and hydroxyproline.
Repeating Units: The primary sequence is defined by
repeats, making up a significant portion of the chain.
Glycine (Gly): Occurs at every third position, essential for the tight packing of the triple helix.
Proline (Pro) & Hydroxyproline (Hyp): The ‘X’ and ‘Y’ positions are frequently occupied by Proline (approx. 28%) and Hydroxyproline (approx. 38%).
Hydroxyproline and Hydroxylysine: These modified amino acids are crucial for stabilizing the triple helix structure via hydrogen bonding.
Structure: Three left-handed polyproline II helices intertwine to create a right-handed superhelical triple helix, known as tropocollagen.
Actin and myosin are highly conserved, complex proteins, with actin typically comprising 374-376 amino acids and myosin (specifically the heavy chain) being a much larger molecule (~2000+ residues). Due to their size and various isoforms, they are generally identified by their full sequences in protein databases (like UniProt) rather than a single short string.
Below are the key details regarding their amino acid sequences based on rabbit skeletal muscle, which is the standard reference:
Actin Amino Acid Sequence (Rabbit Skeletal Muscle)
Actin is a 374-residue protein with a highly conserved sequence. It includes a unique-methyl histidine residue.
Key Features: High proportion of proline and glycine.
Sequence Data Source: The complete sequence was first determined by Elzinga et al. (1973).
Isoforms: While highly conserved, differences occur between skeletal, cardiac, and cytoplasmic isoforms (e.g., about 25 amino acid differences between skeletal and cytoplasmic actin).
Myosin Amino Acid Sequence (Heavy Chain/S1 Fragment)
Myosin is a large motor protein (Hexamer: 2 heavy chains, 4 light chains). The functional motor domain is the S1 fragment (globular head).
Active Site Sequence: A key 20-residue peptide containing the active site in Acanthamoeba and rabbit skeletal myosin has been identified, with sequences such as Thr-Glu-Asn-Thr-Me2Lys-Lys.
Fragment Identification: A 92-residue fragment containing SH-1 and SH-2 groups in the globular head was identified by Maita et al..
Motor Domain: The motor domain of myosin II comprises approximately 700-800 amino acids at the N-terminus of the heavy chain.
Key Structural Sites (Interaction Points)
The interaction between actin and myosin involves specific binding sites on both proteins:
Actin Binding Site on Myosin: Located on the S1 head, this area involves multiple hinged segments that change shape to facilitate contraction.
Myosin Binding Site on Actin: The interaction involves specific residues that can be mapped using peptide fragments.
Loop 4/CM Loop: Specific loops on the myosin head are critical for binding to actin.
For the full, exact sequence, searching for “Rabbit skeletal muscle actin UniProt” or “Human Beta-Myosin Heavy Chain UniProt” in scientific databases is required.
Some images from :
Myotilin Monomer
AF-Q9UBF9-2-F1-v6
Protein: Myotilin
Gene: MYOT
Source organism: Homo sapiens search this organism
UniProt: Q9UBF9-2 go to UniProt
Experimental structures: 2 PDB structures for Q9UBF9-2go to PDBe-KB
Global quality average pLDDT 77.06 (High)
Sequence length 314
week 05 protein design part 2
Still not really sure what I am doing :-)!
Basics
Design short peptides that bind mutant SOD1.
“Design short peptides that bind mutant SOD1” means creating small, synthetic chains of amino acids (peptides) specifically engineered to attach to a deformed version of the Superoxide Dismutase 1 (SOD1) protein. This is a therapeutic strategy aimed at treating SOD1-related Amyotrophic Lateral Sclerosis (ALS).
Mutant SOD1: A faulty version of the SOD1 enzyme produced due to genetic mutations. Unlike healthy SOD1, mutant SOD1 misfolds, becomes unstable, and aggregates (clumps together), leading to toxicity in motor neurons.
Short Peptides: Small molecules, often consisting of only a few amino acids, designed to act as targeted “decoys.”
Bind: The peptides are engineered to stick to specific, exposed, or misfolded areas of the mutant SOD1 protein.
Goal: The binding stops the mutant SOD1 from interacting with, and damaging, vital parts of the cell—such as mitochondrial outer membranes (e.g., VDAC1) or Bcl-2 proteins—thereby preventing cell death and slowing the progression of ALS
Why this approach?
Mutant SOD1 binds to mitochondria, disrupting energy production and causing toxicity. Designed short peptides can mimic the normal binding partners of the mitochondria (like the VDAC1 N-terminus), acting as a decoy to prevent the toxic mutant protein from sticking to the mitochondria, as demonstrated in scientific studies where these peptides improved neuronal survival.
Then decide which ones are worth advancing toward therapy.
The goal of designing these peptides is typically to:
-> Block Aggregation: By binding to the “sticky” parts of the mutant protein, the peptides can prevent it from clumping into toxic aggregates.
-> Prevent Toxic Interactions: Mutant SOD1 often interferes with other vital cell parts, like mitochondria. Short “decoy” peptides can bind to the mutant protein first, blocking it from damaging these organelles.
-> Stabilize the Protein: Some peptides are designed to help the mutant protein keep its proper shape, making it less likely to become toxic.
In essence, “Design short peptides that bind mutant SOD1” means creating custom-made, small molecules to “clamped onto” the broken protein responsible for ALS to stop it from causing harm.
Would you like to see current examples of these peptides or learn more about how they are tested in the lab?
PART 1
Design short peptides that bind mutant SOD1.
Then decide which ones are worth advancing toward therapy.
PepMLM: target sequence-conditioned peptide generation via masked language modeling
This week we learn core molecular biology tools and techniques for processing and assembling DNA, including PCR and Gibson Assembly.
1. Components of Phusion High-Fidelity PCR Master Mix
Phusion Master Mix contains several key components:
Phusion Hot Start II DNA Polymerase — A high-fidelity polymerase with a proofreading (3’→5’ exonuclease) domain that corrects misincorporated bases, resulting in ~50× lower error rates than Taq. It also has a processivity-enhancing domain that speeds up elongation.
dNTPs (dATP, dCTP, dGTP, dTTP) — The nucleotide building blocks incorporated during strand synthesis.
MgCl₂ — Magnesium ions are an essential cofactor for DNA polymerase activity and also stabilize the dNTP substrates.
Optimized reaction buffer — Maintains proper pH and ionic conditions for polymerase activity and primer/template annealing.
Stabilizers/additives — Help maintain enzyme stability and can improve yield on difficult templates (e.g., GC-rich regions).
2. Factors Determining Primer Annealing Temperature
Primer GC content — G·C pairs form 3 hydrogen bonds vs. 2 for A·T, so higher GC content raises the melting temperature (Tm). A rough formula is Tm = 4(G+C) + 2(A+T).
Primer length — Longer primers have higher Tm values because more base-pair interactions must be disrupted.
Salt/ion concentration — Higher Mg²⁺ or monovalent salt concentrations stabilize the DNA duplex and raise Tm.
Primer secondary structure — Hairpins or self-dimers can reduce effective annealing efficiency.
Template secondary structure — Highly structured templates may require higher annealing temperatures or additives like DMSO.
Mismatches — Deliberate mismatches (e.g., for mutagenesis) lower Tm and require adjusted annealing temperatures.
Annealing temperature rule of thumb — Typically set 5°C below the lower Tm of the two primers used.
3. PCR vs. Restriction Enzyme Digests
Feature
PCR
Restriction Enzyme Digest
Input template
Any DNA (plasmid, genomic, cDNA)
Usually plasmid or purified DNA
Output
Amplified, defined fragment
Fragment(s) cut at specific recognition sites
End type
Blunt (Phusion) or 3’ A-overhang (Taq)
Blunt or sticky (cohesive) ends depending on enzyme
Precision
Defined by primer design; any sequence
Defined by restriction site locations in DNA
Flexibility
Very high — you design the fragment
Limited to where restriction sites naturally exist
Time
~1–3 hours
~1–2 hours
Error risk
Polymerase errors possible (mitigated by HiFi)
No sequence errors; only wrong cut possible
Requires sequence knowledge?
Yes, for primer design
Yes, to identify restriction sites
When to prefer PCR
You need to amplify a fragment from a complex mixture (e.g., genomic DNA).
You want to add sequences (overhangs, restriction sites, Gibson overlaps) to the ends of a fragment.
No convenient restriction sites flank your gene of interest.
You are introducing a point mutation or modifying a sequence.
When to prefer restriction enzyme digest
You are sub-cloning between two vectors that already have compatible restriction sites.
You need sticky ends for directional cloning.
You want to cut a vector backbone without amplifying it (avoids PCR errors in the vector).
Speed and simplicity are priorities when restriction sites are already present.
4. Ensuring Compatibility with Gibson Assembly
Gibson Assembly requires fragments with overlapping homologous sequences (~15–30 bp) at their ends. To ensure compatibility:
For PCR fragments: Design primers so that the 5’ overhang of each primer matches the end of the adjacent fragment. This way, after PCR, the amplified insert carries ~20–30 bp of sequence identical to the neighboring fragment or vector.
For restriction-digested fragments: After digestion, check that the sticky ends or blunt ends are located within the overlap region you plan to use — or add Gibson overlaps via a subsequent PCR step using primers that extend into the adjacent sequence.
Check orientation: Use Benchling or SnapGene to simulate the assembly and verify that all overlaps are in the correct orientation and reading frame.
Avoid internal repeat sequences in the overlap regions, as the exonuclease in Gibson mix can cause misannealing.
Ensure no unwanted restriction sites or stop codons are introduced at junctions.
Gel-purify or column-purify fragments after PCR or digest to remove enzymes, primers, and small fragments that could interfere.
5. How Plasmid DNA Enters E. coli During Transformation
The most common method in lab courses is heat-shock transformation of chemically competent cells:
Chemical competency preparation — Cells are treated with divalent cations (typically CaCl₂), which neutralize the negative charges on the LPS of the outer membrane and on the DNA, reducing electrostatic repulsion.
DNA binding — Plasmid DNA associates with the cell surface, facilitated by the Ca²⁺ ions.
Heat shock (42°C, ~30–45 sec) — The rapid temperature increase is thought to create a thermal imbalance that momentarily destabilizes the membrane and drives DNA into the cell, possibly through transient pores or membrane disruptions. The exact mechanism is still not fully understood.
Recovery on ice — Cells are rapidly cooled to stabilize the membrane after DNA entry.
Outgrowth in SOC/LB — Cells recover and begin expressing antibiotic resistance genes before plating on selective media.
Alternative method — Electroporation: A brief electrical pulse (~1.8–2.5 kV) creates transient pores in the membrane through which DNA passes. This is more efficient but requires electrocompetent cells and specialized equipment.
6. Golden Gate Assembly
6.1 Explanation in 5–7 sentences
Golden Gate Assembly is a DNA assembly technique that uses Type IIS restriction enzymes, such as BsaI or BsmBI, which cut outside of their recognition sites rather than within them. This makes it possible to design custom overhangs that determine the exact order in which DNA fragments join together. In a single reaction tube, the restriction enzyme cuts the DNA fragments and vector, and DNA ligase joins the matching overhangs. Because the recognition sites can be removed during the assembly process, the final DNA construct is often scarless, meaning no extra unwanted sequence remains at the junctions. Golden Gate Assembly is especially useful for assembling multiple DNA fragments in a defined order with high efficiency. It is widely used in modular cloning systems and synthetic biology workflows. Compared with Gibson Assembly, Golden Gate relies on restriction sites and short designed overhangs rather than long homologous overlaps.
NEB’s (New England Biolabs) explanation & protocols for Gibson Assembly®
General principle
Fragment 1 Fragment 2 Fragment 3
[BsaI] [BsaI] [BsaI]
| | |
v v v
Cut outside the recognition sequence to create custom overhangs
Overhangs designed as:
Fragment 1 ---> AATG
Fragment 2 ---> GCTT
Fragment 3 ---> CGGA
Matching overhangs guide ligation in the correct order:
Fragment 1 + Fragment 2 + Fragment 3
↓
Final assembled construct
1. Type IIS restriction enzyme cuts DNA outside its recognition site
2. Custom sticky ends are generated
3. Matching sticky ends anneal
4. DNA ligase seals the backbone
5. Final construct forms without the original restriction sites
week 7 genetic circuits part II
What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?
IANNs have several advantages over traditional Boolean genetic circuits.
First, they can process continuous and graded inputs rather than only treating signals as ON or OFF. This is important because many biological signals, such as metabolite concentrations, transcription factor levels, or signaling gradients, are not binary.
Second, IANNs can perform weighted integration of multiple inputs. Instead of responding only when a rigid logical condition is met, they can combine signals with different strengths, similar to how neurons sum inputs.
Third, they can generate nonlinear and more complex input-output behaviors, such as band-pass filters, threshold responses, or spatial patterns. This makes them more suitable for approximating real biological decision-making.
Fourth, multilayer IANNs can achieve greater design flexibility and generalization. By stacking regulatory layers, they can produce behaviors that would be difficult or inefficient to implement with simple Boolean gates alone.
Finally, IANNs are useful when the goal is not just logical control but also prediction, optimization, and adaptive design, especially when paired with AI-based modeling tools.
Describe a useful application for an IANN; include a detailed description of input/output behavior, as well as any limitations an IANN might face to achieve your goal.
A useful application for an IANN would be a smart therapeutic cell for liver disease detection and response.
Application idea
The engineered cell could sense several biomarkers associated with liver injury or inflammation and produce a therapeutic or reporter output only when a specific combination of signals is detected.
Inputs
The IANN could receive multiple intracellular or extracellular inputs, for example:
• X1: inflammatory cytokine level
• X2: oxidative stress signal
• X3: metabolite associated with liver dysfunction
• X4: hypoxia-related signal
Each of these inputs would not simply be present or absent, but could vary in concentration.
Output behavior
The output, Y, could be:
• expression of a fluorescent reporter for diagnosis, or
• release of a protective therapeutic protein
The IANN would integrate the four inputs using weighted biological regulation. For example:
• low inflammation alone would not activate the output
• moderate inflammation plus high oxidative stress might produce a medium output
• a specific disease-like combination of all four signals could trigger a strong output
• healthy or nonspecific combinations would remain below threshold
This would allow the system to distinguish a true pathological state from random fluctuations or isolated signals.
Why IANN is useful here
A Boolean circuit might require strict YES/NO cutoffs and could be too rigid. In contrast, an IANN could better handle noisy biological data and produce a more nuanced response.
Limitations
However, an IANN would face several limitations:
• biological noise: gene expression varies from cell to cell
• limited predictability: real cells may behave differently from the model
• cross-talk: regulators may unintentionally affect other components
• timing delays: transcription and translation are slower than electronic computation
• metabolic burden: large circuits can stress the cell
• safety and stability: long-term behavior may drift due to mutation or epigenetic changes
So while IANNs are powerful, achieving reliable therapeutic performance would require careful design, validation, and containment.