Sofya Rakitina — HTGAA Spring 2026
Homework
- Week 1 HW: Principles and Practices
- Week 2 HW: DNA Read, Write, & Edit
- Week 3 HW: Principles and Practices
Week 1 HW: Principles and Practices
First, describe a biological engineering application or tool you want to develop and why. This could be inspired by an idea for your HTGAA class project and/or something you are already doing in your research, or something you are just curious about.
I have worked with cellular automata (CA) before in design and generative 3D work, building tools that generate patterns and environments, so it was fascinating to see the concept come up in class. For my idea, I’d like to merge my previous digital experience with CA and synthetic biology tooling in the form of a computer-aided design tool for spatial synthetic biology.
Week 2 HW: DNA Read, Write, & Edit
Part 1: Benchling & In-silico Gel Art
Restriction Enzymes Used: EcoRI, EcoRV, HindIII, KpnI, BamHI, SacI, SalI
Week 3 HW: Principles and Practices
Post-Lab Questions
Ensuring biosafety: a) The ability to predict biological behaviour b) Testable behaviour (rather than false confidence in whatever is happening; this applies to both a and b) c) Safety protocols built into the UX/UI of the software d) Training provided
Transparency whilst preventing harm: a) Maintaining accountability for the created projects (i.e. not fully automated) b) Responsible use c) Ideally, some encouragement of socially beneficial applications

| Does the option: | Option 1 | Option 2 | Option 3 |
|---|---|---|---|
| Enhance Biosecurity | | | |
| • By preventing incidents | 2 | 1 | 2 |
| • By helping respond | 3 | 2 | 2 |
| Foster Lab Safety | | | |
| • By preventing incidents | 1 | 1 | 2 |
| • By helping respond | 2 | 1 | 3 |
| Protect the environment | | | |
| • By preventing incidents | 3 | 2 | 1 |
| • By helping respond | 2 | 2 | 1 |
| Other considerations | | | |
| • Minimizing costs and burdens to stakeholders | 3 | 2 | 1 |
| • Feasibility? | 2 | 2 | 1 |
| • Not impede research | 2 | 3 | 1 |
| • Promote constructive applications | 1 | 1 | 2 |
Homework Questions from Professor Jacobson: [Lecture 2 slides]
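1) Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase?
About 1 error per 10⁶ bases (1:10⁶)
How does this compare to the length of the human genome?
The human genome is approximately 3.2 Gbp, far longer than 10⁶ bases, so at this rate every copy of the genome would carry thousands of errors
How does biology deal with that discrepancy?
Through the polymerase’s 3’-5’ proofreading exonuclease and 5’-3’ error-correcting exonuclease, plus the MutS mismatch repair system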
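The comparison above is a one-line calculation. A minimal sketch, using only the two numbers from the answers (3.2 Gbp and 1 error per 10⁶ bases); the function name is just for illustration:

```python
# Expected number of copying errors per genome replication,
# before any proofreading or mismatch repair is applied.
GENOME_BP = 3_200_000_000   # approximate human genome length (3.2 Gbp)
BASES_PER_ERROR = 1_000_000  # polymerase error rate of 1 in 10^6


def expected_errors(genome_bp: int, bases_per_error: int) -> float:
    """Average errors per full copy if errors are independent per base."""
    return genome_bp / bases_per_error


print(expected_errors(GENOME_BP, BASES_PER_ERROR))  # 3200.0
```

A few thousand errors per copy is the discrepancy that the repair systems have to close.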
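2) How many different ways are there to code (DNA nucleotide code) for an average human protein?
At least 2³⁴⁵, and closer to 3³⁴⁵ (the average human protein is about 1,036 bp, or 345 amino acids, and the 20 amino acids share 61 sense codons, so about 3 codons each)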
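To get a feel for these numbers, here is a rough sketch that counts synonymous encodings using the codon degeneracy of the standard genetic code. The example peptide is an arbitrary ten-residue string chosen for illustration, and the uniform 2- or 3-codon estimates are simplifications, not exact counts for any real protein.

```python
import math

# Number of synonymous codons per amino acid in the standard genetic code.
DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(protein: str) -> int:
    """Exact number of DNA sequences that encode the given peptide."""
    return math.prod(DEGENERACY[aa] for aa in protein)


def log10_uniform_estimate(length: int, codons_per_residue: float) -> float:
    """log10 of codons_per_residue ** length, to avoid printing huge integers."""
    return length * math.log10(codons_per_residue)


print(count_encodings("MSKGEELFTG"))            # exact count for a 10-mer
print(round(log10_uniform_estimate(345, 2)))    # order of magnitude of 2^345
print(round(log10_uniform_estimate(345, 3)))    # order of magnitude of 3^345
```

Even the lower estimate is a number with more than a hundred digits, so the space of possible codings is far too large to enumerate.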
In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?
Homework Questions from Dr. LeProust: [Lecture 2 slides]
1. What’s the most commonly used method for oligo synthesis currently?
The phosphoramidite method
2. Why is it difficult to make oligos longer than 200nt via direct synthesis?
Because 1) errors accumulate with every coupling cycle, and 2) “enhanced chemistry” is required
3. Why can’t you make a 2000bp gene via direct oligo synthesis?
Error accumulation and length limitations mean that an assembly approach from shorter oligos is required
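Error accumulation can be illustrated with a simple yield model: if each coupling cycle succeeds with some fixed probability, the fraction of full-length product falls off exponentially with length. This is a sketch only; the 99% coupling efficiency below is an assumed illustrative value, not a figure from the lecture.

```python
def full_length_fraction(length_nt: int, coupling_efficiency: float) -> float:
    """Fraction of strands that reach full length.

    An n-mer needs n - 1 coupling cycles after the first,
    support-bound nucleotide, each succeeding independently.
    """
    return coupling_efficiency ** (length_nt - 1)


# Assumed per-cycle coupling efficiency (illustrative, not measured).
EFFICIENCY = 0.99

for length in (20, 200, 2000):
    fraction = full_length_fraction(length, EFFICIENCY)
    print(f"{length:>5} nt: {fraction:.2e} full-length")
```

Under this assumption only a small minority of 200 nt strands are complete, and essentially none at 2000 nt, which is why longer genes are assembled from shorter oligos.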
Homework Question from George Church: [Lecture 2 slides]
[Given slides #2 & 4 (AA:NA and NA:NA codes)] What code would you suggest for AA:AA interactions?
A protein–protein interaction code
| Enzyme | Number of Cuts | Number of Fragments | Fragment Sizes (bp) |
|---|---|---|---|
| EcoRI | 5 | 6 | 21,226 / 7,421 / 5,804 / 5,643 / 4,878 / 3,530 |
| HindIII | 6 | 7 | 23,130 / 9,416 / 6,682 / 4,361 / 2,322 / 2,027 / 564 |
| BamHI | 5 | 6 | 16,841 / 7,233 / 6,770 / 6,527 / 5,626 / 5,505 |
| KpnI | 2 | 3 | 29,942 / 17,057 / 1,503 |
| EcoRV | 21 | 22 | 5,765 / 5,376 / 4,613 / 3,873 / 3,744 / 3,595 / 2,884 / 2,674 / 1,921 / 1,679 / 1,434 / 1,403 / 1,377 / 1,313 / 738 / 655 / 618 / 597 / 588 / 268 / 52 / 35 |
| SacI | 2 | 3 | 24,776 / 22,621 / 1,105 |
| SalI | 2 | 3 | 32,745 / 15,258 / 499 |
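A quick way to sanity-check a row of this table is to confirm that the fragments add up to the full length of the template and that a linear molecule with n cuts gives n + 1 fragments. The sketch below assumes linear λ DNA of 48,502 bp and uses three rows copied from the table; the function name is hypothetical.

```python
LAMBDA_BP = 48_502  # assumed length of the linear λ DNA template

# Fragment sizes (bp) copied from three rows of the table.
digests = {
    "EcoRI": [21226, 7421, 5804, 5643, 4878, 3530],
    "KpnI": [29942, 17057, 1503],
    "SalI": [32745, 15258, 499],
}


def check_digest(fragments, total_bp=LAMBDA_BP):
    """Return (number_of_cuts, unaccounted_bp) for a linear molecule."""
    return len(fragments) - 1, total_bp - sum(fragments)


for enzyme, fragments in digests.items():
    cuts, missing = check_digest(fragments)
    print(f"{enzyme}: {cuts} cuts, {missing} bp unaccounted for")
```

A nonzero remainder would point to a mis-copied fragment size.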

Restriction Digest Setup

| Lane | Water | CutSmart Buffer | λ DNA | Enzyme(s) |
|---|---|---|---|---|
| M (Ladder) | 14 μL | 2 μL | 3 μL | - |
| 1 | 13 μL | 2 μL | 3 μL | 1 μL EcoRI |
| 2 | 13 μL | 2 μL | 3 μL | 1 μL KpnI + 1 μL BamHI |
| 3 | 14 μL | 2 μL | 3 μL | 1 μL EcoRI + 1 μL HindIII |
| 4 | 14 μL | 2 μL | 3 μL | 1 μL EcoRV |
| 5 | 13 μL | 2 μL | 3 μL | 1 μL EcoRI + 1 μL KpnI |
| 6 | 13 μL | 2 μL | 3 μL | 1 μL SacI + 1 μL HindIII |
| 7 | 13 μL | 2 μL | 3 μL | 1 μL SacI + 1 μL SacI |
| 8 | 13 μL | 2 μL | 3 μL | 1 μL SalI + 1 μL KpnI |
| 9 | 14 μL | 2 μL | 3 μL | 1 μL SacI + 1 μL SacI |
| 10 | 13 μL | 2 μL | 3 μL | 1 μL HindIII + 1 μL SacI |
Total volume per tube: 20 μL
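The water volume for each tube follows from the fixed 20 μL total. A small sketch of that calculation, assuming the buffer and DNA volumes from the table (2 μL and 3 μL) and 1 μL per enzyme; the function name and defaults are for illustration only.

```python
def water_volume(enzyme_ul=(), total_ul=20.0, buffer_ul=2.0, dna_ul=3.0):
    """Water (μL) needed to bring a digest up to the target total volume."""
    water = total_ul - buffer_ul - dna_ul - sum(enzyme_ul)
    if water < 0:
        raise ValueError("components already exceed the total volume")
    return water


print(water_volume())            # no enzyme
print(water_volume((1.0,)))      # single digest
print(water_volume((1.0, 1.0)))  # double digest
```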
Goal: 100-150 ng of DNA per lane
Digest Sample:
Ladder:
Digest Sample:
Ladder:
3.1 For this task, I’d like to use green fluorescent protein (GFP) because it is safe and well characterised (https://pubmed.ncbi.nlm.nih.gov/8303295/), it is essential for the CA mechanics to work (https://www.science.org/doi/10.1126/science.abb8205), and it connects directly to tool making (https://www.nature.com/articles/s41467-024-53078-8).
sp|P42212|GFP_AEQVI Green fluorescent protein OS=Aequorea victoria OX=6100 GN=GFP PE=1 SV=1
MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPT
LVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDT
LVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQ
LADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK
3.2 Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.
Based on this tool: https://www.bioinformatics.org/sms2/rev_trans.html
Reverse translation of sp|P42212|GFP_AEQVI Green fluorescent protein OS=Aequorea victoria OX=6100 GN=GFP PE=1 SV=1 to a 714 base sequence of consensus codons:
atgwsnaarggngargarytnttyacnggngtngtnccnathytngtngarytngayggn
gaygtnaayggncayaarttywsngtnwsnggngarggngarggngaygcnacntayggn
aarytnacnytnaarttyathtgyacnacnggnaarytnccngtnccntggccnacnytn
gtnacnacnttywsntayggngtncartgyttywsnmgntayccngaycayatgaarcar
caygayttyttyaarwsngcnatgccngarggntaygtncargarmgnacnathttytty
aargaygayggnaaytayaaracnmgngcngargtnaarttygarggngayacnytngtn
aaymgnathgarytnaarggnathgayttyaargargayggnaayathytnggncayaar
ytngartayaaytayaaywsncayaaygtntayathatggcngayaarcaraaraayggn
athaargtnaayttyaarathmgncayaayathgargayggnwsngtncarytngcngay
caytaycarcaraayacnccnathggngayggnccngtnytnytnccngayaaycaytay
ytnwsnacncarwsngcnytnwsnaargayccnaaygaraarmgngaycayatggtnytn
ytngarttygtnacngcngcnggnathacncayggnatggaygarytntayaar
3.3. Codon optimization.
based on this tool - https://en.vectorbuilder.com/tool/codon-optimization/51ebfb2e-a00a-4190-9fe9-c1578d3ecfbe.html
Pasted Sequence: GC=48.60%, CAI=1.00
ATGAGCAAAGGCGAAGAACTGTTTACCGGCGTGGTGCCGATTCTGGTGGAACTGGATGGCGATGTGAACGGCCATAAATTTAGCGTGAGCGGCGAAGGCGAAGGCGATGCGACCTATGGCAAACTGACCCTGAAATTTATTTGCACCACCGGCAAACTGCCGGTGCCGTGGCCGACCCTGGTGACCACCTTTAGCTATGGCGTGCAGTGCTTTAGCCGCTATCCGGATCATATGAAACAGCATGATTTTTTTAAAAGCGCGATGCCGGAAGGCTATGTGCAGGAACGCACCATTTTTTTTAAAGATGATGGCAACTATAAAACCCGCGCGGAAGTGAAATTTGAAGGCGATACCCTGGTGAACCGCATTGAACTGAAAGGCATTGATTTTAAAGAAGATGGCAACATTCTGGGCCATAAACTGGAATATAACTATAACAGCCATAACGTGTATATTATGGCGGATAAACAGAAAAACGGCATTAAAGTGAACTTTAAAATTCGCCATAACATTGAAGATGGCAGCGTGCAGCTGGCGGATCATTATCAGCAGAACACCCCGATTGGCGATGGCCCGGTGCTGCTGCCGGATAACCATTATCTGAGCACCCAGAGCGCGCTGAGCAAAGATCCGAACGAAAAACGCGATCATATGGTGCTGCTGGAATTTGTGACCGCGGCGGGCATTACCCATGGCATGGATGAACTGTATAAA
Improved DNA[1]: GC=48.88%, CAI=0.97
ATGAGCAAAGGCGAAGAACTGTTTACCGGCGTGGTGCCGATTCTGGTGGAACTGGATGGCGATGTGAATGGCCATAAATTTAGCGTGAGCGGCGAAGGTGAAGGCGATGCGACCTATGGCAAACTGACCCTGAAATTTATCTGCACCACCGGTAAACTGCCGGTGCCGTGGCCGACCCTGGTGACCACCTTCAGCTACGGCGTGCAGTGTTTTAGCCGCTACCCGGATCATATGAAACAGCATGATTTTTTTAAAAGCGCGATGCCGGAAGGCTATGTGCAGGAACGCACCATTTTTTTCAAAGATGATGGCAATTACAAAACCCGTGCCGAAGTGAAATTCGAAGGCGATACCCTGGTGAATCGCATTGAACTGAAAGGCATTGATTTTAAAGAAGATGGTAACATTCTGGGCCACAAACTGGAATACAACTATAACAGCCATAACGTGTACATTATGGCGGATAAACAGAAAAATGGCATTAAAGTGAACTTTAAAATTCGCCATAACATTGAAGATGGCTCAGTGCAGCTGGCGGATCACTATCAGCAGAACACCCCGATTGGCGATGGCCCGGTTCTGCTGCCGGATAACCACTATCTGAGCACCCAGAGCGCGCTGTCGAAAGATCCGAACGAAAAACGCGATCACATGGTGCTGCTGGAATTTGTGACCGCCGCGGGCATCACCCATGGTATGGATGAACTGTATAAA
Avoid cleavage sites of restriction enzymes: BbsI, BsaI
Codon optimization is done to get more reliable and consistent protein expression in the host organism.
I chose Escherichia coli K-12 substr. MG1655 because it is standard and safe to work with, and GFP expression is well established in it.
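A codon-optimized sequence can be checked by translating it back and comparing it with the protein, and by recomputing its GC content. The sketch below builds the standard genetic code table by hand and tests it on the first 30 bases of the pasted sequence; the helper names are illustrative, and this is not how the VectorBuilder tool itself works.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError(f"length {len(dna)} is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def gc_content(dna: str) -> float:
    """Fraction of bases that are G or C."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)


prefix = "ATGAGCAAAGGCGAAGAACTGTTTACCGGC"  # first 30 bases of the pasted sequence
print(translate(prefix))   # should match the start of the GFP protein
print(gc_content(prefix))
```

The length check matters in practice: a single stray base shifts the reading frame, and a sequence whose length is not a multiple of 3 is rejected before translation.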
3.4 With a cell-dependent method, the DNA sequence can be transcribed and translated into my protein through these steps: 1. Transformation 2. Transcription 3. Translation 4. Protein folding 5. Fluorescence
https://benchling.com/s/seq-TYyZiuVgAJUcWN179kJj?m=slm-SR2kt8YFMq4ngkmbdM2Q
Afterwards, I got a number of errors when importing into Twist: while building the sequence I had accidentally copy-pasted the article “a” into it as if it were part of the sequence. This is also visible in the length, since 931 is not divisible by 3.
Fixed version:
How it works:
final link https://benchling.com/s/seq-jNpnBifWoDGPy7RLJoHu?m=slm-nUfWbFSbXctEauGsTTaq

5.1 DNA Read
(i) What DNA would you want to sequence (e.g., read) and why?
So far, plasmids encoding CA-based pattern formation circuits in bacterial populations, so that I can design patterns with them for the final project tool
(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?
Probably Illumina, as it is widely used and therefore easier to get access to
5.2 DNA Write
(i) What DNA would you want to synthesize (e.g., write) and why?
GFP-based circuits, so that I can prototype programmable living pattern systems
(ii) What technology or technologies would you use to perform this DNA synthesis and why?
PCR or Gibson assembly, for precise construction of my custom genetic circuit
In terms of computer-based tools, Benchling and Twist
5.3 DNA Edit
(i) What DNA would you want to edit and why?
So far, based on my research for this project, E. coli K-12 MG1655, as it is safe, well documented, accessible, and well suited to my project
(ii) What technology or technologies would you use to perform these DNA edits and why?
I’m not fully sure yet what can be used for the projects, but if possible CRISPR-Cas9, as it can precisely edit genetic circuits
Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.
https://www.nature.com/articles/s41598-024-64938-0
This study by Norton-Baker et al. (2024) used an Opentrons OT-2 liquid-handling robot to efficiently characterise a large number of proteins. The authors also described a generalizable pipeline for high-throughput protein purification using small-scale expression in E. coli and an affordable liquid-handling robot. As a result, the automation significantly increased throughput, reduced manual labour, and improved consistency across samples, demonstrating how accessible robotics can accelerate biological research workflows. It also allowed the authors to confirm the validity of previous findings.
On a separate note, I really enjoyed their explanation for using the tool: “Therefore, we aimed to develop a protocol using the OT-2 to provide a low-cost option for the purification and analysis of enzymes, or other proteins, making high-throughput studies more accessible to a broader range of research laboratories. Beyond increasing efficiency, this automation-assisted approach reduces the labor burden on researchers and lowers the risk of repetitive use injuries.”
Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details.
For my final project, I plan on using Opentrons to support a design-build-test-learn pipeline for programmable living CA.
Things to do:
- Build libraries of sender-receiver circuits with a range of parameters (I’ve done a similar thing with chi.bio before, so it should be straightforward). I’d also like to see how I can integrate this with the pre-existing lab environment (Airtable) to get more transparent data for the future
- Screen spatial rule variants in plates to test pattern formation under different genetic and environmental conditions, depending on the available options
- In the later stages, use cloud lab integration (e.g., Ginkgo Nebula) to scale up, run large combinatorial libraries, and feed results back into the CA simulation engine for model fine-tuning
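As a first piece of pseudocode for the library and screening steps, the sketch below lays out every combination of a few circuit parameters onto a 96-well plate. The factor names and levels are hypothetical placeholders, and the output is a plain dictionary that could later drive an Opentrons protocol or be exported to a tracking table.

```python
from itertools import product
from string import ascii_uppercase


def plate_wells(rows=8, cols=12):
    """Well names in row-major order: A1..A12, B1..B12, ..."""
    return [f"{ascii_uppercase[r]}{c}" for r in range(rows) for c in range(1, cols + 1)]


def assign_conditions(factors, wells=None):
    """Map each combination of factor levels to one well."""
    wells = plate_wells() if wells is None else wells
    combos = list(product(*factors.values()))
    if len(combos) > len(wells):
        raise ValueError(f"{len(combos)} conditions do not fit in {len(wells)} wells")
    names = list(factors)
    return {well: dict(zip(names, combo)) for well, combo in zip(wells, combos)}


# Hypothetical screening factors (placeholder names and levels).
factors = {
    "sender_strength": ["low", "mid", "high"],
    "receiver_threshold": ["low", "high"],
    "inducer_uM": [0, 10, 100, 1000],
}

layout = assign_conditions(factors)
for well, condition in list(layout.items())[:3]:
    print(well, condition)
```

Keeping the layout as data, separate from the pipetting steps, would make it easier to log each run and trace results back to conditions.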
All in all, this automation will increase reproducibility, enable systematic exploration of rule space, and accelerate the feedback loop between digital design and wet-lab experimentation.
I would also like to explore further the idea of using Ginkgo’s cloud lab (Nebula) to run large combinatorial libraries of CA rule variants. That would allow me to test hundreds of spatial rule configurations and feed the experimental results back into the simulation engine!
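To make "rule variants" concrete, here is a minimal sketch of one update step of an elementary one-dimensional cellular automaton, where each of the 256 possible rules is identified by a number from 0 to 255. This is a stand-in for illustration only: the actual simulation engine and the spatial rules used in the project are not specified here and would likely be two-dimensional.

```python
def step(cells, rule):
    """Apply one update of an elementary CA with wrap-around edges.

    Each cell looks at (left, center, right); the three bits form an
    index 0..7, and the matching bit of `rule` gives the new state.
    """
    n = len(cells)
    new_cells = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right
        new_cells.append((rule >> index) & 1)
    return new_cells


def run(cells, rule, generations):
    """Return the list of successive states, starting with the input."""
    history = [list(cells)]
    for _ in range(generations):
        history.append(step(history[-1], rule))
    return history


# Rule 90 grown from a single live cell.
start = [0] * 15
start[7] = 1
for row in run(start, rule=90, generations=6):
    print("".join("#" if cell else "." for cell in row))
```

Sweeping `rule` over a range of values is the digital analogue of screening a library of rule variants in plates.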


