Homework

Weekly homework submissions:

Week 1 HW: Principles and Practices
1.First, describe a biological engineering application or tool you want to develop and why? Biosensors (living sensors) are not a new concept. They have already been used to detect metals in water, metabolites in bioreactors, pathogens in food, and plant stress signals in agriculture. However, an exciting future application would be designing biosensors that can detect tumors inside the human body and assist in cancer diagnosis.
Week 2 HW: DNA Read Write & Edit
Home Work Week 2.
Week 3 HW: lab automation
Home Work Week 3
Week 4 HW: protein design part 1
Home Work Week 4

Week 1 HW: Principles and Practices

1. First, describe a biological engineering application or tool you want to develop and why?

Biosensors (living sensors) are not a new concept. They have already been used to detect metals in water, metabolites in bioreactors, pathogens in food, and plant stress signals in agriculture. However, an exciting future application would be designing biosensors that can detect tumors inside the human body and assist in cancer diagnosis.

My interest lies in cancer biology and applying synthetic biology to combat this disease. I want to design and engineer bacteria that can sense tumor-specific signals in the body and produce a measurable response, such as fluorescence or secretion of a detectable biomarker.

This technology could:

Enable early cancer detection
Improve treatment outcomes
Complement existing diagnostic methods
Reduce invasive diagnostic procedures
Detect tumor-associated metabolites or surface markers

2. Next, describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future, like ensuring non-malfeasance (preventing harm). Break big goals down into two or more specific sub-goals.

Past global health events, such as COVID-19, demonstrate how biological systems—whether naturally mutated or engineered—can spread widely and have major unintended consequences. Therefore, the overarching governance goal is to ensure tumor-sensing engineered bacteria are developed and used ethically, prioritizing safety, environmental protection, and social responsibility.

Sub-Goals

1. Non-maleficence (prevent harm)

Minimize unintended pathogenicity
Prevent horizontal gene transfer
Reduce harmful immune responses in the host

2. Environmental protection

Prevent persistence outside intended environments
Avoid uncontrolled dissemination
Minimize ecological disruption

3. Equity and access

Ensure fair access to the technology
Distribute benefits across populations
Prevent socioeconomic disparities in availability

3. Next, describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”).

Option 1 — Biosafety Regulation

Engineered microorganisms are tested for laboratory safety, but additional requirements should include:

Genetic kill-switch testing
Immunogenicity testing
Containment validation
Standardized regulatory protocols (FDA, EMA)
Mandatory approval before clinical trials

Assumptions

Lab testing predicts in-body behavior accurately
Kill switches function reliably

Risks

Failure → infection or environmental persistence
Regulatory burden may slow research

Potential Success

Reduced patient risk and improved clinical safety

Option 2 — Incentives for Fair Access and Ethical Use

Commercialization is necessary for implementation, but profit incentives can limit accessibility.

Proposed actions:

Funding incentives tied to equity plans
Ethical research and commercialization guidelines
Grant approval contingent on fair-access strategies
Responsible research training in universities and industry

Assumption

Financial and policy incentives influence behavior

Risks

Companies prioritize profit over guidelines

Potential Success

More equitable distribution of benefits

Option 3 — Environmental Containment & Monitoring

Containment practices vary across laboratories, creating inconsistency.

Proposed actions:

Standardized containment protocols
Gut simulator testing models
Routine environmental monitoring
Mandatory reporting requirements
Implementation by regulators, biotech companies, and academic labs

Risks

Human error
System failure
False sense of security if protocols become outdated

Potential Success

Early detection of accidental release
Reduced environmental risk

Costs

Additional training and infrastructure

4. Next, score (from 1-3, with 1 as the best, or n/a) each of your governance actions against your rubric of policy goals. The following is one framework, but feel free to make your own.

Does the option:	Option 1	Option 2	Option 3
Enhance Biosecurity
• By preventing incidents	1	2	2
• By helping respond	1	2	2
Foster Lab Safety
• By preventing incidents	1	2	2
• By helping respond	1	2	2
Protect the environment
• By preventing incidents	2	2	1
• By helping respond	2	2	1
Other considerations
• Minimizing costs and burdens to stakeholders	3	1	2
• Feasibility?	2	1	2
• Not impede research	3	1	2
• Promote constructive applications	1	1	2
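
As a rough cross-check of the table, the scores can be summed per option. This is only a sketch: it assumes every criterion carries equal weight, which the rubric does not state, and the names `SCORES` and `rank_options` are made up for illustration.

```python
# Scores copied row by row from the table above (1 = best, 3 = worst).
SCORES = {
    "Option 1": [1, 1, 1, 1, 2, 2, 3, 2, 3, 1],
    "Option 2": [2, 2, 2, 2, 2, 2, 1, 1, 1, 1],
    "Option 3": [2, 2, 2, 2, 1, 1, 2, 2, 2, 2],
}


def rank_options(scores):
    """Return (option, total) pairs sorted from best (lowest total) to worst.

    Assumption: all criteria are weighted equally.
    """
    totals = {name: sum(values) for name, values in scores.items()}
    return sorted(totals.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for name, total in rank_options(SCORES):
        print(f"{name}: {total}")
```

With equal weights, Options 1 and 2 have the two lowest totals. A different weighting (for example, counting biosecurity rows more heavily) could change the order.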

5. Last, drawing upon this scoring, describe which governance option, or combination of options, you would prioritize, and why. Outline any trade-offs you considered as well as assumptions and uncertainties.

Based on the scoring, I would prioritize Option 1 (Biosafety Regulation) in combination with Option 2 (Fair Access and Ethical Use).

This combination provides the strongest protection for both human health and society, which are primary ethical priorities. Biosafety regulation ensures risk control, while equity-focused policies ensure responsible distribution of benefits.

Option 3 (environmental containment) remains important but serves more as a supporting framework.

Trade-offs

Increased regulatory oversight may slow research
Higher costs for labs and companies
Training and monitoring systems require infrastructure

Assumptions

Genetic kill switches function reliably
Lab testing predicts biological behavior accurately
Institutions comply with containment and monitoring protocols
No major ecological disruption occurs if accidental release happens

Overall Rationale

Maximizing safety and ethical use outweighs the potential slowdown in research progress.

References

Panteli JT, Forbes NS. Engineered bacteria detect spatial profiles in glucose concentration within solid tumor cell masses. Biotechnol Bioeng. 2016.
Zúñiga A et al. Engineered L-lactate responding promoter system. ACS Synth Biol. 2021.
Chien T et al. Enhancing bacterial tropism via genetically programmed biosensors. Nat Biomed Eng. 2022.
National Cancer Institute. Engineered bacteria detect cancer.
Ma X et al. Modular-designed engineered bacteria for precision tumor immunotherapy. Nat Commun. 2023.
Part 2
Homework Questions

Homework Questions from Professor Jacobson

1. DNA Polymerase Error Rate and Genome Replication

Nature’s machinery for copying DNA is called DNA polymerase.

Question:
What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?

Answer:
On its own (before proofreading), DNA polymerase makes roughly 1 error per 10,000–100,000 bp (an error rate of 10⁻⁴ to 10⁻⁵), whereas the human genome is around 3.2 billion (3.2 × 10⁹) bp long.

At that raw rate, copying the genome would introduce approximately 32,000–320,000 errors into a daughter cell at each cell division.

Estimated errors:

Haploid genome
(10⁻⁵ to 10⁻⁴) × 3.2 × 10⁹ ≈ 32,000–320,000 errors
Diploid genome
(10⁻⁵ to 10⁻⁴) × 6.4 × 10⁹ ≈ 64,000–640,000 errors

Error-correction mechanisms:

DNA polymerase has built-in error removal capabilities:

Nucleotide filtering (selection)
Proofreading

Nucleotide selection alone brings the error rate to about 1 in 10,000–100,000 bp (10⁻⁴ to 10⁻⁵), and proofreading lowers it further to about 10⁻⁷.

Cells also have repair systems such as:

Mismatch repair → reduces errors 100–1000 fold
DNA damage repair pathways

Together, these mechanisms reduce the overall error rate to approximately one error per 10⁹–10¹⁰ bp.

This ensures genetic stability: a cell copies its entire genome with, on the order of, one permanent mistake per genome copy, and often none.
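
The arithmetic behind these estimates can be sketched as below. The per-stage error rates are representative values chosen for illustration (10⁻⁵ for base selection, 10⁻⁷ after proofreading, and a 100-fold gain from mismatch repair); they are assumptions within the ranges quoted above, not measurements.

```python
HAPLOID_GENOME_BP = 3.2e9  # approximate human haploid genome length

# Representative per-base error rates after each fidelity layer (assumed).
STAGE_ERROR_RATES = {
    "base selection only": 1e-5,
    "plus proofreading": 1e-7,
    "plus mismatch repair (100-fold)": 1e-9,
}


def expected_errors(error_rate, genome_bp=HAPLOID_GENOME_BP):
    """Expected number of errors when copying genome_bp bases once."""
    return error_rate * genome_bp


if __name__ == "__main__":
    for stage, rate in STAGE_ERROR_RATES.items():
        print(f"{stage}: about {expected_errors(rate):,.1f} errors per haploid copy")
```

Using a 1000-fold gain from mismatch repair instead would lower the last figure by another factor of ten.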

2. Number of Possible DNA Codes for an Average Human Protein

Question:
How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice, why don’t all of these different codes work to produce the protein of interest?

Answer:
Due to the degeneracy of the genetic code, multiple codons can encode the same amino acid.

There are 20 amino acids encoded by 61 codons
Most amino acids are encoded by more than one codon
The average human protein is roughly 400 amino acids long
Assume each amino acid is encoded in 3 different ways on average (61 codons / 20 amino acids ≈ 3)

Then the number of possible DNA sequences encoding the same protein is approximately 3⁴⁰⁰ ≈ 10¹⁹⁰.
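
The order of magnitude can be checked directly. This is a minimal sketch that uses the same simplification as the answer (a flat average of 3 codons per residue and a 400-residue protein); the function name is made up for illustration.

```python
import math


def synonymous_sequence_count(protein_length=400, codons_per_residue=3):
    """Number of DNA sequences encoding one protein under a flat-average model."""
    return codons_per_residue ** protein_length


if __name__ == "__main__":
    count = synonymous_sequence_count()
    # math.log10 accepts arbitrarily large Python integers.
    print(f"log10(count) = {math.log10(count):.2f}")
    print(f"number of decimal digits = {len(str(count))}")
```

A more careful count would multiply the actual codon degeneracy of each residue (1 for Met and Trp, up to 6 for Leu, Ser, and Arg) rather than use an average.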

Why many sequences don’t work in practice:

Even if different DNA sequences produce the same amino acid chain, they may not produce the desired functional protein because biological function also depends on:

Protein folding
Gene regulation
Translational efficiency
RNA stability
Cellular compatibility

Protein sequence alone is not sufficient for proper function.

Homework Questions from Professor LeProust

Oligo Synthesis

Questions:

What is the most commonly used method for oligo synthesis currently?
Why is it difficult to make oligos longer than 200 nt via direct synthesis?
Why can’t you make a 2000 bp gene via direct oligo synthesis?

Answer:

The most common method is solid-phase phosphoramidite synthesis.

During chemical synthesis, each nucleotide addition (coupling step) is slightly less than 100% efficient. As the strand grows longer:

The fraction of strands that are still full length falls exponentially with each cycle
Side reactions and base-damage errors accumulate
Many strands become incomplete (truncated)

After about 150–200 bases, most DNA molecules contain errors or are truncated, which makes purifying the correct full-length product difficult.

It is difficult to directly synthesize long genes (e.g., 2000 bp) because:

Thousands of sequential chemical steps would be required
Error accumulation would make accurate full-length products extremely rare
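
The length limit follows from compounding the per-step efficiency. The sketch below assumes a coupling efficiency of 99% per cycle, which is an illustrative figure rather than a value from the lecture, and it ignores other error sources such as depurination.

```python
def full_length_fraction(length_nt, coupling_efficiency=0.99):
    """Fraction of strands that reach full length.

    The first base is already attached to the solid support, so a strand of
    length_nt bases needs (length_nt - 1) successful coupling steps.
    """
    return coupling_efficiency ** (length_nt - 1)


if __name__ == "__main__":
    for length in (20, 100, 200, 2000):
        print(f"{length} nt: {full_length_fraction(length):.3g} of strands are full length")
```

Under this assumption a 200-mer still yields a usable fraction of full-length product, while a 2000-mer yields essentially none, which is why long genes are assembled from shorter oligos.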

Solution used in practice:

Scientists synthesize many short DNA fragments (100–200 bases each) and assemble them using enzymatic methods to build the full gene.

Homework Questions from Professor George Church

Essential Amino Acids and the Lysine Contingency

Question:
What are the 10 essential amino acids in all animals, and how does this affect your view of the “Lysine Contingency”?

Answer:

The ten essential amino acids in animals are:

Histidine
Isoleucine
Leucine
Lysine
Methionine
Phenylalanine
Threonine
Tryptophan
Valine
Arginine

Lysine Contingency:

In synthetic biology biocontainment discussions, the lysine contingency refers to engineering organisms that require lysine to survive, preventing growth without external supply.

However:

Lysine is essential for all animals
It is widely present in tissues, food, and nutrient-rich environments
It is commonly available in biological systems

Conclusion:

Because lysine is widely available, dependence on lysine alone is not a reliable or strict biocontainment strategy.

References

Chan, C. T. Y., et al. (2016).
“Deadman” and “Passcode” microbial kill switches for bacterial containment.
Nature Chemical Biology, 12, 82–86.
https://doi.org/10.1038/nchembio.1979

Mandell, D. J., et al. (2015).
Biocontainment of genetically modified organisms by synthetic protein design.
Nature, 518, 55–60.
https://doi.org/10.1038/nature14121

Gallagher, R. R., et al. (2015).
Multilayered genetic safeguards limit growth of microorganisms to defined environments.
Nature Communications, 6, 1–9.
https://doi.org/10.1038/ncomms6823

Week 2 HW: DNA Read Write & Edit

Part 1: Benchling & In-silico Gel Art

See the Gel Art: Restriction Digests and Gel Electrophoresis protocol for details.

Overview:

Make a free account at benchling.com.
Import the Lambda DNA.
Simulate Restriction Enzyme Digestion with the following Enzymes:
- EcoRI
- HindIII
- BamHI
- KpnI
- EcoRV
- SacI
- SalI
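
What the digest simulation does can be sketched in a few lines: find each recognition site, cut at the enzyme's top-strand position, and list the fragment lengths. This is an illustrative sketch, not how Benchling implements it; it treats the molecule as linear (as Lambda DNA is), ignores methylation sensitivity, and the demo sequence is a made-up toy, not Lambda DNA.

```python
# Recognition site and top-strand cut offset (bases from the start of the site).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "HindIII": ("AAGCTT", 1),
    "BamHI": ("GGATCC", 1),
    "KpnI": ("GGTACC", 5),
    "EcoRV": ("GATATC", 3),
    "SacI": ("GAGCTC", 5),
    "SalI": ("GTCGAC", 1),
}


def digest(sequence, enzyme_names):
    """Return fragment lengths, in order along a linear molecule."""
    seq = sequence.upper()
    cut_positions = set()
    for name in enzyme_names:
        site, offset = ENZYMES[name]
        start = seq.find(site)
        while start != -1:
            cut_positions.add(start + offset)
            start = seq.find(site, start + 1)
    bounds = [0] + sorted(cut_positions) + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy = "AAAAGAATTCTTTTGGATCCAAAA"  # hypothetical 24-base example
    print(digest(toy, ["EcoRI", "BamHI"]))
```

Sorting the returned lengths from largest to smallest gives the band order expected on a gel lane.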

Part 3: DNA Design Challenge

I selected human mesothelin protein, which could be used as a biomarker for detecting cancers, as it is overexpressed in many cancers such as ovarian cancer and mesothelioma. Since I want to engineer a tumor-sensing bacterium, I want to use mesothelin as the target molecule for sensing.

For this, I took the protein sequence of mesothelin precursor from NCBI and uploaded it on Bioinformatics.org, an online tool for reverse translation. After reverse translation, I codon-optimized the gene sequence for E. coli through the Integrated DNA Technologies (IDT) tool. I want E. coli to produce the target protein for assay optimization and surface display experiments before proceeding with engineering E. coli as a biosensor for tumor detection.

For developing tumor-sensing bacteria, I plan to use a cell-dependent approach using E. coli, rather than a cell-free system. In this method:

The mesothelin precursor gene is expressed in E. coli, allowing the protein to be produced and displayed on the bacterial surface.
This is essential for live-cell tumor-sensing applications because the surface-displayed mesothelin can directly interact with tumor-associated markers or antibodies.
Cell-dependent expression enables:
- Real-time functional testing
- Control over protein expression levels
- Scalable production

Overall, this makes it the most practical and effective strategy for building a functional bacterial tumor sensor.

>AAV87530.1 mesothelin precursor [Homo sapiens]
MALPTARPLLGSCGTPALGSLLFLLFSLGWVQPSRTLAGETGQEAAPLDGVLANPPNISSLSPRQLLGFP
CAEVSGLSTERVRELAVALAQKNVKLSTEQLRCLAHRLSEPPEDLDALPLDLLLFLNPDAFSGPQACTHF
FSRITKANVDLLPRGAPERQRLLPAALACWGVRGSLLSEADVRALGGLACDLPGRFVAESAEVLLPRLVS
CPGPLDQDQQEAARAALQGGGPPYGPPSTWSVSTMDALRGLLPVLGQPIIRSIPQGIVAAWRQRSSRDPS
WRQPERTILRPRFRREVEKTACPSGKKAREIDESLIFYKKWELEACVDAALLATQMDRVNAIPFTYEQLD
VLKHKLDELYPQGYPESVIQHLGYLFLKMSPEDIRKWNVTSLETLKALLEVNKGHEMSPQVATLIDRFVK
GRGQLDKDTLDTLTAFYPGYLCSLSPEELSSVPPSSIWAVRPQDLDTCDPRQLDVLYPKARLAFQNMNGS
EYFVKIQSFLGGAPTEDLKALSQQNVSMDLATFMKLRTDAVLPLTVAEVQKLLGPHVEGLKAEERHRPVR
DWILRQRQDDLDTLGLGLQGGIPNGYLVLDLSVQEALSGTPCLLGPGPVLTVLALLLASTLA

Reverse translation of the 622-residue protein sequence (FASTA header line excluded) to an 1866-base sequence of most likely codons:

atggcgctgccgaccgcgcgcccgctgctgggcagc
tgcggcaccccggcgctgggcagcctgctgtttctgctgtttagcctgggctgggtgcag
ccgagccgcaccctggcgggcgaaaccggccaggaagcggcgccgctggatggcgtgctg
gcgaacccgccgaacattagcagcctgagcccgcgccagctgctgggctttccgtgcgcg
gaagtgagcggcctgagcaccgaacgcgtgcgcgaactggcggtggcgctggcgcagaaa
aacgtgaaactgagcaccgaacagctgcgctgcctggcgcatcgcctgagcgaaccgccg
gaagatctggatgcgctgccgctggatctgctgctgtttctgaacccggatgcgtttagc
ggcccgcaggcgtgcacccatttttttagccgcattaccaaagcgaacgtggatctgctg
ccgcgcggcgcgccggaacgccagcgcctgctgccggcggcgctggcgtgctggggcgtg
cgcggcagcctgctgagcgaagcggatgtgcgcgcgctgggcggcctggcgtgcgatctg
ccgggccgctttgtggcggaaagcgcggaagtgctgctgccgcgcctggtgagctgcccg
ggcccgctggatcaggatcagcaggaagcggcgcgcgcggcgctgcagggcggcggcccg
ccgtatggcccgccgagcacctggagcgtgagcaccatggatgcgctgcgcggcctgctg
ccggtgctgggccagccgattattcgcagcattccgcagggcattgtggcggcgtggcgc
cagcgcagcagccgcgatccgagctggcgccagccggaacgcaccattctgcgcccgcgc
tttcgccgcgaagtggaaaaaaccgcgtgcccgagcggcaaaaaagcgcgcgaaattgat
gaaagcctgattttttataaaaaatgggaactggaagcgtgcgtggatgcggcgctgctg
gcgacccagatggatcgcgtgaacgcgattccgtttacctatgaacagctggatgtgctg
aaacataaactggatgaactgtatccgcagggctatccggaaagcgtgattcagcatctg
ggctatctgtttctgaaaatgagcccggaagatattcgcaaatggaacgtgaccagcctg
gaaaccctgaaagcgctgctggaagtgaacaaaggccatgaaatgagcccgcaggtggcg
accctgattgatcgctttgtgaaaggccgcggccagctggataaagataccctggatacc
ctgaccgcgttttatccgggctatctgtgcagcctgagcccggaagaactgagcagcgtg
ccgccgagcagcatttgggcggtgcgcccgcaggatctggatacctgcgatccgcgccag
ctggatgtgctgtatccgaaagcgcgcctggcgtttcagaacatgaacggcagcgaatat
tttgtgaaaattcagagctttctgggcggcgcgccgaccgaagatctgaaagcgctgagc
cagcagaacgtgagcatggatctggcgacctttatgaaactgcgcaccgatgcggtgctg
ccgctgaccgtggcggaagtgcagaaactgctgggcccgcatgtggaaggcctgaaagcg
gaagaacgccatcgcccggtgcgcgattggattctgcgccagcgccaggatgatctggat
accctgggcctgggcctgcagggcggcattccgaacggctatctggtgctggatctgagc
gtgcaggaagcgctgagcggcaccccgtgcctgctgggcccgggcccggtgctgaccgtg
ctggcgctgctgctggcgagcaccctggcg

Codon-optimized sequence for E. coli (IDT tool):

ATG GCA CTG CCC ACA GCT CGC CCA CTG CTT GGC AGC TGC GGG ACA CCG GCA CTG GGG TCG CTT CTG TTT TTA TTA TTT TCC TTG GGT TGG GTT CAG CCG TCT AGA ACA CTG GCG GGT GAG ACG GGC CAA GAA GCG GCC CCA TTA GAC GGC GTT CTG GCA AAT CCG CCA AAC ATA TCA TCG CTG TCA CCT AGA CAG CTG CTT GGC TTT CCC TGT GCG GAA GTA AGC GGT TTA TCA ACT GAA CGG GTT CGT GAG CTG GCG GTG GCT CTG GCA CAA AAA AAT GTT AAG CTG AGC ACC GAA CAG CTT CGT TGT TTG GCC CAT CGT CTG TCT GAG CCG CCG GAA GAT CTG GAC GCC CTT CCC CTT GAC TTA CTG CTG TTC TTA AAC CCC GAC GCA TTT AGC GGT CCT CAG GCA TGT ACC CAT TTT TTT AGT CGT ATA ACT AAG GCT AAC GTT GAC CTG TTA CCT CGT GGG GCG CCG GAA CGC CAG CGC CTG TTA CCC GCT GCA CTG GCT TGC TGG GGT GTG CGT GGT TCT CTG CTG AGC GAA GCA GAC GTT AGA GCT TTA GGT GGC CTT GCT TGC GAC TTA CCG GGT CGT TTT GTA GCG GAG TCG GCT GAA GTA CTT CTG CCG AGA TTA GTT TCT TGT CCT GGT CCG TTA GAT CAG GAT CAG CAG GAA GCT GCC AGA GCG GCA CTG CAA GGT GGT GGA CCT CCA TAT GGC CCG CCG TCA ACA TGG TCA GTT AGT ACA ATG GAC GCT TTG CGT GGG CTG TTA CCT GTG TTA GGC CAA CCT ATC ATT CGC TCG ATT CCC CAG GGG ATA GTG GCT GCC TGG CGT CAA CGT TCT TCC CGT GAC CCG AGC TGG CGT CAG CCA GAG CGC ACG ATA CTT CGC CCC CGC TTT CGC AGA GAG GTT GAA AAA ACG GCT TGT CCG TCT GGG AAA AAA GCC CGC GAG ATC GAC GAA AGT CTG ATT TTC TAC AAA AAA TGG GAG TTG GAG GCC TGC GTG GAT GCC GCC CTT TTG GCC ACA CAG ATG GAC CGC GTA AAT GCG ATA CCA TTT ACT TAC GAA CAG CTG GAT GTG CTG AAG CAT AAA TTA GAC GAG CTG TAC CCG CAG GGC TAT CCG GAG TCC GTG ATT CAA CAT CTG GGT TAC CTG TTC CTG AAA ATG TCT CCA GAA GAT ATC CGT AAG TGG AAC GTG ACT AGT CTT GAA ACG CTG AAA GCG TTG CTG GAA GTC AAT AAA GGA CAT GAG ATG AGT CCC CAG GTA GCC ACC CTG ATA GAT CGT TTT GTT AAA GGT CGT GGC CAG CTG GAT AAG GAC ACA CTG GAT ACT TTA ACA GCA TTT TAT CCG GGT TAT CTG TGC TCT CTG AGC CCA GAA GAG TTG TCC TCT GTA CCC CCT TCA AGC ATT TGG GCC GTT CGG CCT CAG GAT CTT GAC ACT TGC GAT CCA AGA CAG CTT GAT GTG TTG TAT CCT AAA GCC CGC CTG GCG TTT CAG AAC ATG AAT GGC TCA GAG TAT TTT GTT AAA ATT CAA AGT TTT TTA GGT GGC GCT CCG ACC GAG GAT TTA AAA GCG CTT AGT CAA CAG AAC GTG TCC ATG GAC TTG GCA ACT TTT ATG AAG CTG CGG ACA GAT GCA GTA CTT CCT CTG ACT GTG GCC GAA GTT CAA AAG CTG TTG GGC CCG CAC GTG GAA GGC TTG AAA GCC GAA GAA CGT CAC CGC CCA GTG AGA GAC TGG ATT TTA CGT CAG CGC CAG GAT GAT TTG GAT ACT CTT GGA CTG GGT CTG CAA GGA GGT ATT CCG AAT GGA TAT CTG GTC CTG GAC CTT AGC GTT CAG GAA GCG CTG TCA GGA ACC CCA TGC TTA CTG GGC CCG GGC CCA GTT CTG ACA GTG CTG GCT TTA TTG TTG GCC AGC ACC CTT GCG
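
A useful sanity check for any reverse-translated or codon-optimized sequence is to translate it back and compare it with the protein. The sketch below uses the standard genetic code and strips the FASTA header line first, because a reverse-translation tool may otherwise read header letters as residues. The function names and the short demo peptide are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def read_fasta_sequence(fasta_text):
    """Join the sequence lines of a single FASTA record, skipping '>' headers."""
    lines = [line.strip() for line in fasta_text.splitlines()]
    residues = "".join(line for line in lines if line and not line.startswith(">"))
    return "".join(residues.split()).upper()


def translate(dna):
    """Translate a DNA string (whitespace allowed) codon by codon."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("DNA length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def round_trip_ok(protein_fasta, dna):
    """True if the DNA encodes exactly the protein in the FASTA record."""
    return translate(dna) == read_fasta_sequence(protein_fasta)


if __name__ == "__main__":
    demo_fasta = ">demo peptide (first ten residues)\nMALPT\nARPLL"
    demo_dna = "atg gcg ctg ccg acc gcg cgc ccg ctg ctg"
    print(round_trip_ok(demo_fasta, demo_dna))
```

A quick length check is also worthwhile: a protein of N residues should give exactly 3 × N bases before any stop codon is added.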

Part 4: Prepare a Twist DNA Synthesis Order

Following all the steps, I successfully prepared a Twist DNA synthesis order.

Sharing the final results from both Benchling and Twist here:

Part 5: DNA Read/Write/Edit

DNA read

(i) What DNA would you want to sequence (e.g., read) and why? This could be DNA related to human health (e.g. genes related to disease research), environmental monitoring (e.g., sewage waste water, biodiversity analysis), and beyond (e.g. DNA data storage, biobank).

I am currently working on the association of EBV with oral squamous cell carcinoma (OSCC). OSCC is a real problem in our region, and the majority of the population is also EBV positive, so I would like to sequence the BAMBI gene in tissue samples from patients with OSCC.

BAMBI is an interesting regulatory molecule in cancer biology. It acts as a pseudoreceptor (it looks like a receptor but does not signal) in the TGF-β signaling pathway: it resembles type I TGF-β receptors and blocks or modulates signaling by competing with functional receptors, which influences tumor progression and epithelial–mesenchymal transition (EMT).

Recent studies have identified it as a reliable marker of EBV infection that could serve as a potential biomarker of EBV-related oncogenic activity.

By sequencing BAMBI, I aim to assess its tissue-level variation in our population and whether it could serve as a prognostic biomarker for EBV-associated epithelial cancers. I also want to investigate the possibility that BAMBI mediates EBV-induced EMT in OSCC, a biologically reasonable but not yet established relationship.

(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why? Also answer the following questions: Is your method first-, second- or third-generation or other? How so? What is your input? How do you prepare your input (e.g. fragmentation, adapter ligation, PCR)? List the essential steps. What are the essential steps of your chosen sequencing technology, how does it decode the bases of your DNA sample (base calling)? What is the output of your chosen sequencing technology?

I would use Oxford Nanopore sequencing. It can sequence an entire human genome in a matter of hours, so results for a single gene would be available quickly, and its accuracy has improved markedly over the past few years. It has an advantage over short-read platforms because its long reads can detect structural variations, long repeats, and base modifications that other technologies could miss.

Oxford Nanopore is a third-generation sequencing technology: unlike second-generation platforms, which sequence PCR-amplified fragments by synthesis, it reads long, single DNA or RNA molecules directly without amplification. It uses nanopores embedded in a membrane and measures changes in electrical current as DNA or RNA molecules pass through, which allows base calling in real time.

Oxford Nanopore Input

For input we can use gDNA or cDNA, prepared through the following essential steps, depending on the size of the DNA or RNA we want to sequence:

DNA extraction: the first step, to obtain purified, high-quality, high-molecular-weight DNA.

Optional fragmentation: if needed, the DNA can be sheared into fragments of a target size to improve sequencing efficiency.

End repair and dA-tailing: the DNA ends are enzymatically repaired and given a single A overhang (dA-tailing) to prepare them for adapter ligation.

Adapter ligation: specialized sequencing adapters are ligated to the prepared DNA ends. Each adapter carries a motor protein that helps feed the DNA through the pore so that the sequencer can read it.

Library clean-up: after ligation, the DNA library is purified using beads so that clean DNA (free of unligated adapters and enzymes) can be loaded onto the flow cell.

Essential steps

The essential steps involved in Oxford Nanopore sequencing are as follows:

Adapter-ligated DNA molecules are delivered to nanopores embedded in a membrane.

As a molecule passes through a pore, it changes the ionic current flowing through that pore.

The changes in the current are measured in real time and converted into a nucleotide sequence by a computational step called base calling.

Output

The output consists of the nucleotide sequence in FASTQ format, which stores a quality score for each base (or in FASTA format when quality scores are not needed).
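
To make the per-base quality scores concrete, the sketch below reads FASTQ text and converts each quality character to a Phred score Q, where the estimated error probability is 10^(-Q/10). It assumes the common four-line record layout and Phred+33 encoding; the record shown is a made-up example, not real sequencer output.

```python
def parse_fastq(text):
    """Parse four-line FASTQ records into (name, sequence, phred_scores) tuples.

    Assumes each record is exactly four lines and qualities are Phred+33.
    """
    lines = text.strip().splitlines()
    records = []
    for i in range(0, len(lines), 4):
        name = lines[i][1:]                       # drop the leading '@'
        sequence = lines[i + 1]
        phred = [ord(ch) - 33 for ch in lines[i + 3]]
        records.append((name, sequence, phred))
    return records


def mean_error_probability(phred_scores):
    """Average per-base error probability implied by the Phred scores."""
    return sum(10 ** (-q / 10) for q in phred_scores) / len(phred_scores)


if __name__ == "__main__":
    example = "@read1\nACGT\n+\nIIII\n"   # hypothetical record; 'I' encodes Q40
    for name, sequence, phred in parse_fastq(example):
        print(name, sequence, phred, mean_error_probability(phred))
```

Averaging error probabilities, rather than averaging the Q values themselves, gives a more honest picture of read quality because the Phred scale is logarithmic.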

DNA write

(i) What DNA would you want to synthesize (e.g., write) and why? These could be individual genes, clusters of genes or genetic circuits, whole genomes, and beyond. As described in class thus far, applications could range from therapeutics and drug discovery (e.g., mRNA vaccines and therapies) to novel biomaterials (e.g. structural proteins), to sensors (e.g., genetic circuits for sensing and responding to inflammation, environmental stimuli, etc.), to art (DNA origamis). If possible, include the specific genetic sequence(s) of what you would like to synthesize! You will have the opportunity to actually have Twist synthesize these DNA constructs! :)

I would want to synthesize the patterning genes of a butterfly, which are responsible for the colours and motifs seen on butterfly wings. I want to assemble these genes into a genetic circuit such as a ring oscillator, which can control their expression in a specific pattern by turning them on and off in response to internal or external stimuli. This would replicate the natural patterns and colours seen on butterfly wings and function as a synthetic biological sensor-actuator circuit.

Applications of this method include synthetic morphogenesis, bio-inspired design, and developmental biology research. It may also shed light on how gene regulatory networks generate complex natural patterns.

(ii) What technology or technologies would you use to perform this DNA synthesis and why? Also answer the following questions: What are the essential steps of your chosen sequencing methods? What are the limitations of your sequencing method (if any) in terms of speed, accuracy, scalability?

For synthesizing butterfly pattern genes, I would use the silicon-chip-based DNA synthesis platform from Twist Bioscience, which uses standard phosphoramidite chemistry miniaturized on a silicon chip. This drastically reduces reagent use and increases efficiency.

I opted for this technology due to the following reasons:

It can synthesize a number of genes at once in parallel.

It uses phosphoramidite chemistry with error reducing methods, giving reliable sequences.

It can be easily scaled from short oligos to full-length genes.

It makes it possible to synthesise intended variations, which is helpful for understanding how mutations affect wing patterns.

Essential Steps

Design DNA sequence: First I need to design the exact nucleotide sequence using bioinformatic tools.

Synthesis: The system synthesises oligos for the submitted sequence on the silicon chip in parallel using phosphoramidite chemistry.

Collection of the oligos: The oligos will be cleaved from the chip and collected for the downstream processing.

Assembly: The collected oligos will be enzymatically assembled into full length genes.

Verification: The assembled gene sequence will be verified through sequencing technologies like Sanger sequencing or maybe Oxford nanopore.

Limitations

Speed: It can synthesize longer oligos but assembling those might take additional time.

Accuracy: For full-length genes or longer oligos, the chance of errors increases, so sequence verification is required.

Cost: Though Twist Bioscience is cost effective as compared to other competitors, it can be costly when synthesizing large constructs or thousands of genes.

DNA Edit

(i) What DNA would you want to edit and why? In class, George shared a variety of ways to edit the genes and genomes of humans and other organisms. Such DNA editing technologies have profound implications for human health, development, and even human longevity and human augmentation. DNA editing is also already commonly leveraged for flora and fauna, for example in nature conservation efforts, (animal/plant restoration, de-extinction), or in agriculture (e.g. plant breeding, nitrogen fixation). What kinds of edits might you want to make to DNA (e.g., human genomes and beyond) and why?

I would like to edit wing pattern genes of the butterfly, such as Optix, Distal-less, and WntA, which control the colour and patterns on the wings. For editing these genes I would use CRISPR-Cas9, a widely used and precise gene editing tool. It will let me modify the genes and study how changes in their expression affect pattern formation. This would allow me to understand the genetic basis of natural patterns, test how dynamic regulation of these genes could produce new patterns when combined in synthetic circuits like a ring oscillator, and explore applications in synthetic developmental biology, where spatial and temporal gene expression can be controlled for pattern engineering.

(ii) What technology or technologies would you use to perform these DNA edits and why? Also answer the following questions: How does your technology of choice edit DNA? What are the essential steps? What preparation do you need to do (e.g. design steps) and what is the input (e.g. DNA template, enzymes, plasmids, primers, guides, cells) for the editing? What are the limitations of your editing methods (if any) in terms of efficiency or precision?

I would use the CRISPR-Cas9 editing tool to edit these butterfly genes, as it is the most widely used precise genome editing tool for functional studies in organisms. For the butterfly genes, I will design complementary guide RNAs (gRNAs) and use the Cas9 protein to make a double-strand break at the target site. The break is then repaired either through non-homologous end joining (NHEJ), which introduces small insertions or deletions, or through homology-directed repair (HDR), which allows precise sequence edits.

Essential Steps

gRNA design: gRNAs complementary to the butterfly genes (i.e., Optix, WntA, or Dll sequences) will be designed.

Preparation of CRISPR components: CRISPR components such as Cas9 protein or mRNA, gRNA, and an optional donor DNA template will be assembled.

Delivery: The CRISPR machinery will be delivered into butterfly embryos or cell cultures via microinjection, electroporation, or other suitable methods.

Screening: After allowing the genome editing to occur, all individuals will be screened.

Validation: Sequencing will be used to confirm and verify the changes in the target genes.

Required Inputs

gRNA(s) for the target gene sequences
Cas9 nuclease
Donor DNA template (for precise HDR edits, if needed)
Butterfly embryos or cell cultures
Delivery tools (microinjection, electroporation, etc.)

Limitations

Efficiency: Editing efficiency in butterfly embryos may vary, and HDR-based precise edits are less efficient than simple knockouts.

Off-target effects: Cas9 may occasionally cut unintended sequences, requiring careful design and validation.

Ethical considerations: Editing living organisms requires regulatory approval and careful ethical review.

Developmental effects: Some edits may affect viability or other developmental traits.
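
The gRNA design step can be sketched as a scan for candidate target sites. The example below assumes the commonly used SpCas9 nuclease, which requires an NGG PAM immediately 3' of a 20-nt protospacer. It scans only the forward strand, does no off-target or efficiency scoring, and uses a made-up toy sequence rather than a real butterfly gene.

```python
def find_spcas9_targets(sequence, protospacer_len=20):
    """Return (start, protospacer, pam) for every NGG PAM on the forward strand.

    Assumption: SpCas9 with an NGG PAM directly 3' of the protospacer.
    """
    seq = sequence.upper()
    hits = []
    # The last valid start leaves room for the protospacer plus a 3-nt PAM.
    for start in range(len(seq) - protospacer_len - 2):
        pam = seq[start + protospacer_len:start + protospacer_len + 3]
        if pam[1:] == "GG":
            hits.append((start, seq[start:start + protospacer_len], pam))
    return hits


if __name__ == "__main__":
    toy = "ACGTACGTACGTACGTACGTAGGTT"  # hypothetical 25-base sequence
    for start, protospacer, pam in find_spcas9_targets(toy):
        print(start, protospacer, pam)
```

In practice the reverse strand would be scanned as well (by running the same function on the reverse complement), and candidates would be filtered against the rest of the genome to limit off-target cuts.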

References

Callahan, B., McMurdie, P., Rosen, M. et al. DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods 13, 581–583 (2016). https://doi.org/10.1038/nmeth.3869

Lin, B., Hui, J., & Mao, H. (2021). Nanopore Technology and Its Applications in Gene Sequencing. Biosensors, 11(7), 214. https://doi.org/10.3390/bios11070214

Kosuri, S., Church, G. Large-scale de novo DNA synthesis: technologies and applications. Nat Methods 11, 499–507 (2014). https://doi.org/10.1038/nmeth.2918

Kronforst MR, Papa R. The functional basis of wing patterning in Heliconius butterflies: the molecules behind mimicry. Genetics. 2015 May;200(1):1-19. doi: 10.1534/genetics.114.172387. PMID: 25953905; PMCID: PMC4423356.

Livraghi, L., Martin, A., Gibbs, M., Braak, N.V., Arif, S., & Breuker, C.J. (2018). CRISPR/Cas9 as the Key to Unlocking the Secrets of Butterfly Wing Pattern Development and Its Evolution.

Doudna, J. A., & Charpentier, E. The new frontier of genome engineering with CRISPR-Cas9. Science 346, 1258096 (2014). https://doi.org/10.1126/science.1258096

Week 3 HW: lab automation

Python Script for Opentrons Artwork

from opentrons import types
import math

metadata = {    # see https://docs.opentrons.com/v2/tutorial.html#tutorial-metadata
    'author': 'Saba Saeed',
    'protocolName': 'Opentrons Bacteriophage Structural Artwork',
    'description': '',
    'source': 'HTGAA 2026 Opentrons Lab',
    'apiLevel': '2.20'
}

##############################################################################
###   Robot deck setup constants - don't change these
##############################################################################

TIP_RACK_DECK_SLOT = 9
COLORS_DECK_SLOT = 6
AGAR_DECK_SLOT = 5
PIPETTE_STARTING_TIP_WELL = 'A1'

well_colors = {
    'A1' : 'Red',
    'B1' : 'Yellow',
    'C1' : 'Green',
    'D1' : 'Cyan',
    'E1' : 'Blue'
}

def run(protocol):

    ##############################################################################
    ###   Load labware, modules and pipettes
    ##############################################################################

    tips_20ul = protocol.load_labware(
        'opentrons_96_tiprack_20ul',
        TIP_RACK_DECK_SLOT,
        'Opentrons 20uL Tips'
    )

    pipette_20ul = protocol.load_instrument(
        "p20_single_gen2",
        "right",
        [tips_20ul]
    )

    temperature_module = protocol.load_module(
        'temperature module gen2',
        COLORS_DECK_SLOT
    )

    temperature_plate = temperature_module.load_labware(
        'opentrons_96_aluminumblock_generic_pcr_strip_200ul',
        'Cold Plate'
    )

    color_plate = temperature_plate

    agar_plate = protocol.load_labware(
        'htgaa_agar_plate',
        AGAR_DECK_SLOT,
        'Agar Plate'
    )

    center_location = agar_plate['A1'].top()

    pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)

    ##############################################################################
    ###   Helper functions for this lab
    ##############################################################################

    def location_of_color(color_string):
        for well, color in well_colors.items():
            if color.lower() == color_string.lower():
                return color_plate[well]
        raise ValueError(f"No well found with color {color_string}")

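    def dispense_and_detach(pipette, volume, location):
        # Approach from above so the tip does not smear earlier drops, dispense
        # at the target, then lift straight up to detach the drop.
        # Note: Location.move() adds the offset to the current point, so the
        # hover height is (location z + 5) mm above the target, not exactly 5 mm.
        above_location = location.move(types.Point(z=location.point.z + 5))
        pipette.move_to(above_location)
        pipette.dispense(volume, location)
        pipette.move_to(above_location)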

    ##############################################################################
    ###   Patterning & Auto-Scaling
    ##############################################################################

    # Raw GUI coordinate arrays
    azurite_points = [
        (-2.25,20.25),(2.25,20.25),(2.25,18.75),(2.25,15.75),(2.25,6.75),
        (-8.25,-3.75),(6.75,-3.75),(8.25,-3.75),(-9.75,-6.75),(-6.75,-6.75),
        (2.25,-6.75),(-6.75,-8.25),(-2.25,-8.25),(2.25,-9.75),(3.75,-9.75),
        (5.25,-9.75),(8.25,-9.75),(-12.75,-11.25),(9.75,-11.25),
        (-12.75,-12.75),(-11.25,-12.75),(5.25,-14.25),(11.25,-14.25),
        (12.75,-14.25),(-5.25,-15.75),(12.75,-15.75)
    ]

    mclover3_points = [
        (-0.75,20.25),(0.75,20.25),(3.75,18.75),(-6.75,17.25),(-3.75,17.25),
        (3.75,17.25),(-5.25,15.75),(-2.25,15.75),(-3.75,14.25),(3.75,14.25),
        (2.25,12.75),(-0.75,11.25),(0.75,11.25),(2.25,9.75),(-6.75,8.25),
        (3.75,8.25),(-2.25,6.75),(-0.75,6.75),(-2.25,5.25),(-2.25,3.75),
        (-2.25,2.25),(-2.25,0.75),(-0.75,0.75),(-2.25,-0.75),(-2.25,-2.25),
        (-9.75,-3.75),(-6.75,-3.75),(5.25,-3.75),(9.75,-3.75),
        (-11.25,-5.25),(-0.75,-5.25),(3.75,-5.25),(-5.25,-6.75),
        (-2.25,-6.75),(-0.75,-6.75),(5.25,-6.75),(-9.75,-8.25),
        (2.25,-8.25),(6.75,-8.25),(9.75,-8.25),(-5.25,-9.75),
        (-3.75,-9.75),(9.75,-9.75),(-5.25,-11.25),(5.25,-11.25),
        (-5.25,-12.75),(5.25,-12.75),(-12.75,-14.25),(-5.25,-14.25),
        (-12.75,-15.75),(-3.75,-17.25)
    ]

    mrfp1_points = [
        (-3.75,18.75),(-5.25,17.25),(5.25,17.25),(-6.75,15.75),
        (-0.75,15.75),(0.75,15.75),(5.25,15.75),(-6.75,14.25),
        (-6.75,12.75),(-2.25,12.75),(-6.75,11.25),(-6.75,9.75),
        (-3.75,9.75),(-2.25,9.75),(-0.75,9.75),(0.75,9.75),
        (-5.25,8.25),(5.25,8.25),(0.75,6.75),(-0.75,5.25),
        (0.75,5.25),(0.75,3.75),(0.75,2.25),(0.75,0.75),
        (0.75,-0.75),(-0.75,-2.25),(0.75,-2.25),(-5.25,-3.75),
        (-3.75,-3.75),(-2.25,-3.75),(-0.75,-3.75),(0.75,-3.75),
        (2.25,-3.75),(3.75,-3.75),(-9.75,-5.25),(-8.25,-5.25),
        (-6.75,-5.25),(-5.25,-5.25),(-3.75,-5.25),(-2.25,-5.25),
        (0.75,-5.25),(2.25,-5.25),(5.25,-5.25),(6.75,-5.25),
        (8.25,-5.25),(9.75,-5.25),(-11.25,-6.75),(-3.75,-6.75),
        (0.75,-6.75),(3.75,-6.75),(9.75,-6.75),(-8.25,-8.25),
        (-3.75,-8.25),(3.75,-8.25),(8.25,-8.25),(-11.25,-9.75),
        (-9.75,-9.75),(11.25,-9.75),(-11.25,-11.25),(-3.75,-11.25),
        (3.75,-11.25),(11.25,-11.25),(-3.75,-12.75),(3.75,-12.75),
        (11.25,-12.75),(-3.75,-14.25),(3.75,-14.25),
        (-3.75,-15.75),(3.75,-15.75),(3.75,-17.25)
    ]

    sfgfp_points = [
        (5.25,14.25),(5.25,12.75),(5.25,11.25),(5.25,9.75),
        (-3.75,6.75),(3.75,6.75),(-12.75,-8.25),
        (-11.25,-8.25),(-12.75,-9.75)
    ]

    # Color grouping
    color_groups = [
        ('Blue', azurite_points),
        ('Green', mclover3_points),
        ('Red', mrfp1_points),
        ('Yellow', sfgfp_points)
    ]

    # Auto-scaling logic: shrink the design only if its farthest point lies
    # more than 38 mm from the plate center; smaller designs are left as-is.
    all_points = azurite_points + mclover3_points + mrfp1_points + sfgfp_points
    max_r = max((math.hypot(x, y) for x, y in all_points), default=0.0)

    scale_factor = 38.0 / max_r if max_r > 38.0 else 1.0

    # Physical compilation loop
    for color_name, points in color_groups:

        if not points:
            continue

        pipette_20ul.pick_up_tip()

        for i, (x, y) in enumerate(points):

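            # Refill at the start of every batch of 20 drops: each drop is 1 uL,
            # so the tip is loaded with at most 20 uL (1 uL per remaining drop).
            if i % 20 == 0: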
                drops_remaining = len(points) - i
                volume_to_aspirate = min(20, drops_remaining)

                pipette_20ul.aspirate(
                    volume_to_aspirate,
                    location_of_color(color_name)
                )

            scaled_x = x * scale_factor
            scaled_y = y * scale_factor

            adjusted_location = center_location.move(
                types.Point(x=scaled_x, y=scaled_y)
            )

            dispense_and_detach(
                pipette_20ul,
                1,
                adjusted_location
            )

        pipette_20ul.drop_tip()
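
The auto-scaling and refill logic in the script above can be checked on a laptop before running anything on the robot. The sketch below is a minimal, robot-free restatement of those two steps, not part of the submitted protocol: the helper names scale_factor_for and aspirate_plan are hypothetical, and the constants simply mirror the values the script uses (a 38 mm limit, a 20 uL tip, 1 uL drops).

```python
import math

MAX_RADIUS_MM = 38.0   # assumed limit, mirroring the 38.0 used in the protocol
TIP_CAPACITY_UL = 20   # assumed tip capacity, mirroring the refill interval
DROP_VOLUME_UL = 1     # volume dispensed per point in the protocol


def scale_factor_for(points, max_radius=MAX_RADIUS_MM):
    """Return the factor that shrinks points to fit within max_radius.

    Designs that already fit are never enlarged (factor 1.0).
    """
    max_r = max((math.hypot(x, y) for x, y in points), default=0.0)
    return max_radius / max_r if max_r > max_radius else 1.0


def aspirate_plan(n_drops, capacity=TIP_CAPACITY_UL, drop_volume=DROP_VOLUME_UL):
    """List the volume aspirated at each refill when dispensing n_drops drops."""
    drops_per_fill = capacity // drop_volume
    plan = []
    for start in range(0, n_drops, drops_per_fill):
        drops = min(drops_per_fill, n_drops - start)
        plan.append(drops * drop_volume)
    return plan


if __name__ == "__main__":
    # A design whose farthest point is 50 mm out gets shrunk by 38/50.
    print(scale_factor_for([(-2.25, 20.25), (30.0, 40.0)]))
    # 70 drops need three full refills and one partial refill.
    print(aspirate_plan(70))
```

Keeping these two calculations as plain functions makes it easy to confirm that a new design fits on the plate, and how many refills each color needs, without loading the Opentrons simulator.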

Week 4 HW: protein design part 1
