Homework

Weekly homework submissions:

Week 1 HW: Principles and Practices
Week 10
Week 11
Week 12
Week 13
Week 14
Week 2 — DNA Read, Write, & Edit
Week 3 — Lab Automation
Week 4
Week 5
Week 6
Week 7
Week 9

Week 1 HW: Principles and Practices

1. Describe a biological engineering application or tool you want to develop and why:

I want to develop a diagnostics chip for neuropsychiatric disorders (microarray- or microfluidics-based).

2. Describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future:

Policy goals would prevent misuse of the diagnostic to measure features like “intelligence” or personality traits. Policies would need to limit interpretable results to areas of high variability in specific diseases and include diverse populations.

3. Three Governance Actions

One: Federal oversight of the use of genetic data, preventing use of the data for traits unrelated to psychiatric disease.

Two: Standardized, validated production of chips, with quality control.

Three: Ensuring equal representation of genetic information from diverse ancestries in the diagnostics.

Does the option:	Option 1	Option 2	Option 3
Enhance Biosecurity
• By preventing incidents	X	X
• By helping respond	X
Foster Lab Safety
• By preventing incidents			X
• By helping respond			X
Protect the environment
• By preventing incidents	X
• By helping respond	X
Other considerations
• Minimizing costs and burdens to stakeholders		X
• Feasibility?			X
• Not impede research			X
• Promote constructive applications	X		X

Right now, I would prioritize Option 3. Without adequate data on these diseases in diverse populations, the diagnostic tool would be of limited use across populations.

Slide questions:

Jacobsen: The error rate is about 1 in 10^6 bases (1x10^-6 per base). Given this rate and the length of the genome (3.2 billion base pairs), there would be roughly 3,200 errors per round of DNA replication. This is addressed through various DNA repair pathways in the cell.
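The arithmetic behind the 3,200 figure can be sketched as follows, reading the quoted rate as one error per 10^6 bases copied. The function and constant names are only illustrative.

```python
# Expected replication errors per genome copy, using the figures quoted above.
ERROR_RATE_PER_BASE = 1e-6   # one error per 10^6 bases copied
GENOME_LENGTH_BP = 3.2e9     # human genome length used in the answer


def expected_errors(error_rate: float, genome_length: float) -> float:
    """Expected errors = per-base error rate x number of bases copied."""
    return error_rate * genome_length


if __name__ == "__main__":
    errors = expected_errors(ERROR_RATE_PER_BASE, GENOME_LENGTH_BP)
    print(f"about {errors:.0f} errors per replication before repair")
```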

There are 10^623 ways to code for the average protein. The number of viable possibilities is dramatically reduced by 3D conformational constraints and the stability of the sequence.

Leproust: The most commonly used method is the phosphoramidite method. It is difficult to synthesize oligos longer than 200 bp because yield drops sharply beyond that length with traditional methods. A 2000 bp gene cannot be made by direct synthesis because current length limits for oligonucleotides are much shorter.
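One way to see why yield falls off with length: if each coupling step succeeds with some fixed probability, the fraction of full-length strands shrinks geometrically. The sketch below assumes a 99.5% per-step coupling efficiency, which is an illustrative value and not a figure from the lecture.

```python
def full_length_yield(length_nt: int, coupling_efficiency: float = 0.995) -> float:
    """Fraction of strands that reach full length.

    A strand of length_nt bases needs (length_nt - 1) successful couplings.
    The default efficiency is an ASSUMED, illustrative value.
    """
    return coupling_efficiency ** (length_nt - 1)


if __name__ == "__main__":
    for n in (20, 100, 200, 2000):
        print(f"{n:>5} nt: {full_length_yield(n):.4%} full-length product")
```

Under this assumption the full-length fraction at 2000 nt is vanishingly small, which is consistent with building genes from shorter pieces instead.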

Church: The ten essential amino acids are arginine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, and valine. Oddly, this validates the “lysine contingency,” since the dinosaurs are indeed dependent on this essential amino acid.

Week 10

Week 11

Week 12

Week 13

Week 14

Week 2 — DNA Read, Write, & Edit

3.1. Choose your protein.

In recitation, we discussed that you will pick a protein for your homework that you find interesting. Which protein have you chosen and why? Using one of the tools described in recitation (NCBI, UniProt, google), obtain the protein sequence for the protein you chose.

Chosen Protein: Chromodomain Helicase DNA-binding Protein 8 (CHD8). Why: CHD8 is a “master regulator” of gene expression. It functions as an ATP-dependent chromatin remodeler, meaning it physically moves nucleosomes to open up DNA for transcription. I find it interesting because mutations in CHD8 specifically impact the expression of hundreds of other genes linked to brain development, providing a clear molecular bridge between genetic architecture and neurobiology.

Protein Sequence (UniProt ID: Q9HCK8 - First 50 amino acids):

MADPGMMSLFGEDGNFGEIIMEPEVTYDGLDAPAKRPRVRSGVDIPEAIL (The full protein is 2,581 amino acids long)

3.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.

The Central Dogma discussed in class and recitation describes the process in which DNA sequence becomes transcribed and translated into protein. The Central Dogma gives us the framework to work backwards from a given protein sequence and infer the DNA sequence that the protein is derived from. Using one of the tools discussed in class, NCBI or online tools (google “reverse translation tools”), determine the nucleotide sequence that corresponds to the protein sequence you chose above.

Because the genetic code is degenerate (multiple codons can code for the same amino acid), there are many possible DNA sequences.

e.g. ATGGCCGACCCCGGCATGATGAGCCTGTTCGGCGAGGACGGCAACTTCGGCGAGATCATCATGGAGCCCGAGGTGACCTACGACGGCCTGGACGCCCCCGCCAAGCGCCCCCGGGTGCGGAGCGGCGTGGACATCCCCGAGGCCATCCTG
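As a quick sanity check, the example DNA can be translated back with the standard genetic code and compared with the 50-residue fragment above. The sketch also counts how many distinct DNA sequences would encode the same fragment, which makes the degeneracy concrete. The helper names (`translate`, `count_encodings`) are made up for this example.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (bases T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

PROTEIN = "MADPGMMSLFGEDGNFGEIIMEPEVTYDGLDAPAKRPRVRSGVDIPEAIL"
DNA = (
    "ATGGCCGACCCCGGCATGATGAGCCTGTTCGGCGAGGACGGCAACTTCGGCGAGATCATC"
    "ATGGAGCCCGAGGTGACCTACGACGGCCTGGACGCCCCCGCCAAGCGCCCCCGGGTGCGG"
    "AGCGGCGTGGACATCCCCGAGGCCATCCTG"
)


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence, reading complete codons only."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_encodings(protein: str) -> int:
    """Number of DNA sequences encoding the protein (product of codon degeneracies)."""
    degeneracy = {}
    for aa in CODON_TABLE.values():
        degeneracy[aa] = degeneracy.get(aa, 0) + 1
    return prod(degeneracy[aa] for aa in protein)


if __name__ == "__main__":
    print("Round trip matches:", translate(DNA) == PROTEIN)
    print("Possible DNA sequences for this fragment:", count_encodings(PROTEIN))
```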

3.3. Codon optimization.

Once a nucleotide sequence of your protein is determined, you need to codon optimize your sequence. You may, once again, utilize google for a “codon optimization tool”. In your own words, describe why you need to optimize codon usage. Which organism have you chosen to optimize the codon sequence for and why?

Why optimize?: Different organisms have different “preferences” for which codons they use most frequently (codon bias). If you try to express a human gene in a bacterium like E. coli without optimization, the bacterium might run out of the specific tRNAs needed for “human-preferred” codons, leading to stalled translation or misfolded proteins. Optimization replaces rare codons with the host’s preferred ones without changing the amino acid sequence.

Chosen Organism: Escherichia coli (K12). Reason: E. coli is the gold standard for high-yield protein production because it grows rapidly and is easy to manipulate. Since CHD8 is a large, complex protein, starting with a well-characterized bacterial system is the logical first step for initial structural studies.
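A minimal sketch of the simplest optimization strategy, "one preferred codon per amino acid," is shown below. The `PREFERRED` table is an assumption: it lists codons commonly described as frequent in E. coli, typed in by hand for illustration and not taken from a specific codon usage database. Real tools also balance GC content, repeats, and mRNA structure, which this sketch ignores.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# ASSUMPTION: illustrative "preferred" E. coli codons, one per amino acid.
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence, reading complete codons only."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def optimize(dna: str) -> str:
    """Swap every codon for the preferred synonym; the protein is unchanged."""
    return "".join(PREFERRED[aa] for aa in translate(dna))


if __name__ == "__main__":
    original = "ATGGCCGACCCCGGC"  # first five codons of the example sequence (MADPG)
    optimized = optimize(original)
    print(original, "->", optimized)
    print("Same protein:", translate(original) == translate(optimized))
```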

3.4. You have a sequence! Now what?

What technologies could be used to produce this protein from your DNA? Describe in your own words how the DNA sequence can be transcribed and translated into your protein. You may describe either cell-dependent or cell-free methods, or both.

To produce CHD8, I would use Cell-Dependent Expression (Mammalian Cells). While E. coli is great for simple proteins, CHD8 is a massive chromatin remodeler that likely requires specific post-translational modifications (like phosphorylation) to function correctly, which bacteria cannot provide.

The Process:

Transcription: The optimized DNA (carried on a plasmid vector) enters the nucleus of a CHO (Chinese Hamster Ovary) cell. RNA Polymerase binds to the promoter and “reads” the DNA to create a complementary mRNA strand.

Translation: The mRNA exits the nucleus and enters the cytoplasm. Ribosomes bind to the mRNA, and tRNAs bring the corresponding amino acids (following the triplet codon code) to build the CHD8 polypeptide chain.

Folding: Molecular chaperones within the cell assist the long CHD8 chain in folding into its complex 3D shape.

3.5. [Optional] How does it work in nature/biological systems?

Describe how a single gene codes for multiple proteins at the transcriptional level. Try aligning the DNA sequence, the transcribed RNA, and also the resulting translated Protein!!! See example below.

In nature, a single gene can code for multiple proteins through alternative splicing. This occurs as the transcript (pre-mRNA) is processed: different exons (coding regions) are joined together while introns (non-coding regions) are removed.

Alignment Example:

DNA: [Exon 1] - [Intron] - [Exon 2] - [Intron] - [Exon 3]

mRNA (Isoform A): [Exon 1][Exon 2][Exon 3] → Protein A

mRNA (Isoform B): [Exon 1][Exon 3] → Protein B (shorter version)
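The exon-skipping idea can be made concrete with a toy gene. The three exon sequences below are invented for illustration (they are not CHD8 exons), and each is a multiple of three bases long so that skipping exon 2 keeps the reading frame.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# HYPOTHETICAL toy exons (coding strand); introns are simply left out when splicing.
EXONS = {1: "ATGGCC", 2: "GACCCC", 3: "GGCTAA"}


def splice(exon_ids):
    """Join the chosen exons in order, as the spliceosome would after removing introns."""
    return "".join(EXONS[i] for i in exon_ids)


def transcribe(coding_dna: str) -> str:
    """mRNA has the coding-strand sequence with U in place of T."""
    return coding_dna.replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first base until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    for name, exon_ids in (("Isoform A", [1, 2, 3]), ("Isoform B", [1, 3])):
        mrna = transcribe(splice(exon_ids))
        print(f"{name}: mRNA {mrna} -> protein {translate(mrna)}")
```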

5.1 DNA Read

(i) What DNA would you want to sequence (e.g., read) and why? This could be DNA related to human health (e.g. genes related to disease research), environmental monitoring (e.g., sewage waste water, biodiversity analysis), and beyond (e.g. DNA data storage, biobank).

I would sequence the exomes (the protein-coding regions) of individuals with undiagnosed neurodevelopmental delays, ideally alongside their parents. This helps identify “de novo” (new) mutations in CHD8, meaning variants that are absent from both parents.

(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?

Also answer the following questions: Is your method first-, second- or third-generation or other? How so? What is your input? How do you prepare your input (e.g. fragmentation, adapter ligation, PCR)? List the essential steps. What are the essential steps of your chosen sequencing technology, how does it decode the bases of your DNA sample (base calling)? What is the output of your chosen sequencing technology?

Technology: Nanopore Sequencing (Oxford Nanopore Technologies).

Generation: 3rd Generation (Long-read sequencing).

Input: High-molecular-weight (HMW) genomic DNA.

Preparation: DNA is fragmented into large pieces. Adapters containing a “motor protein” are ligated to the ends. No PCR is required, which avoids amplification bias.

Essential Steps: The motor protein pulls the DNA strand through a microscopic protein pore (nanopore) embedded in a membrane.

Decoding (Base Calling): As DNA passes through, it disrupts an electrical current across the pore. The few bases sitting in the pore at any moment produce a characteristic “wiggle” in the current. Machine-learning basecallers (neural networks) decode these electrical shifts back into A, T, C, and G.

Output: FASTQ files containing long, continuous reads of the DNA, each with per-base quality scores.
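For a sense of what that output looks like, here is a small sketch that reads four-line FASTQ records and reports a simple per-read quality summary. It assumes the common Phred+33 quality encoding and unwrapped (single-line) sequences. The example read is made up.

```python
from io import StringIO


def read_fastq(handle):
    """Yield (read_id, sequence, quality_string) from four-line FASTQ records."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break
        sequence = handle.readline().rstrip()
        handle.readline()  # the '+' separator line
        quality = handle.readline().rstrip()
        yield header[1:], sequence, quality


def mean_phred(quality: str) -> float:
    """Arithmetic mean of Phred scores, assuming Phred+33 ASCII encoding."""
    return sum(ord(char) - 33 for char in quality) / len(quality)


# HYPOTHETICAL example record; 'I' encodes Phred 40 under Phred+33.
EXAMPLE = "@read_1\nATGGCCGACC\n+\nIIIIIIIIII\n"

if __name__ == "__main__":
    for read_id, sequence, quality in read_fastq(StringIO(EXAMPLE)):
        print(read_id, len(sequence), "bases, mean Q =", mean_phred(quality))
```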

5.2 DNA Write

(i) What DNA would you want to synthesize (e.g., write) and why? These could be individual genes, clusters of genes or genetic circuits, whole genomes, and beyond. As described in class thus far, applications could range from therapeutics and drug discovery (e.g., mRNA vaccines and therapies) to novel biomaterials (e.g. structural proteins), to sensors (e.g., genetic circuits for sensing and responding to inflammation, environmental stimuli, etc.), to art (DNA origamis). If possible, include the specific genetic sequence(s) of what you would like to synthesize! You will have the opportunity to actually have Twist synthesize these DNA constructs! :)

What to synthesize: I would synthesize a fluorescently tagged CHD8 genetic circuit. Specifically: [CMV Promoter] - [CHD8 cDNA] - [GFP Tag]. This allows us to see exactly where the CHD8 protein goes inside a living cell by looking for the green glow of the GFP (Green Fluorescent Protein) tag.

(ii) What technology or technologies would you use to perform this DNA synthesis and why? Also answer the following questions:

What are the essential steps of your chosen synthesis methods? What are the limitations of your synthesis method (if any) in terms of speed, accuracy, scalability?

Essential Steps: Chip-based DNA synthesis uses a silicon chip with thousands of tiny “wells.” Using ink-jet printing technology, the four nucleotide building blocks (A, T, C, G) are added one at a time to a growing chain attached to the surface.

Limitations:

Speed: Writing long genes (like the 7.7kb CHD8) takes time because it must be synthesized in smaller fragments and then assembled (e.g., via Gibson Assembly).

Accuracy: Chemical synthesis can have errors (roughly 1 in 200–300 bases), necessitating strict post-synthesis “error correction” or NGS verification.
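The fragment-and-assemble step can be sketched as splitting the gene into pieces that share short overlapping ends, which is what overlap-based methods such as Gibson Assembly rely on. The fragment length (1800 bp) and overlap (30 bp) are assumed values for illustration, and the sequence is a random placeholder of CHD8-like length, not the real cDNA.

```python
import random


def split_with_overlaps(seq: str, fragment_len: int = 1800, overlap: int = 30):
    """Split seq into fragments where each one starts with the previous one's last `overlap` bases."""
    if overlap >= fragment_len:
        raise ValueError("overlap must be shorter than the fragment length")
    step = fragment_len - overlap
    fragments, start = [], 0
    while True:
        fragments.append(seq[start:start + fragment_len])
        if start + fragment_len >= len(seq):
            break
        start += step
    return fragments


def reassemble(fragments, overlap: int = 30) -> str:
    """Rejoin fragments, checking that neighbouring ends really do match."""
    assembled = fragments[0]
    for fragment in fragments[1:]:
        if assembled[-overlap:] != fragment[:overlap]:
            raise ValueError("adjacent fragments do not share the expected overlap")
        assembled += fragment[overlap:]
    return assembled


if __name__ == "__main__":
    rng = random.Random(0)
    # PLACEHOLDER sequence: 2,581 codons plus a stop codon = 7,746 bases.
    gene = "".join(rng.choice("ACGT") for _ in range(7746))
    pieces = split_with_overlaps(gene)
    print(len(pieces), "fragments; reassembly ok:", reassemble(pieces) == gene)
    # Midpoint of the "1 in 200-300 bases" error range quoted above.
    print("expected raw synthesis errors:", round(len(gene) / 250))
```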

5.3 DNA Edit

(i) What DNA would you want to edit and why? In class, George shared a variety of ways to edit the genes and genomes of humans and other organisms. Such DNA editing technologies have profound implications for human health, development, and even human longevity and human augmentation. DNA editing is also already commonly leveraged for flora and fauna, for example in nature conservation efforts, (animal/plant restoration, de-extinction), or in agriculture (e.g. plant breeding, nitrogen fixation). What kinds of edits might you want to make to DNA (e.g., human genomes and beyond) and why?

What to edit: I would like to perform Base Editing in human induced pluripotent stem cells (iPSCs) to “correct” a known CHD8 mutation (e.g., changing a premature “Stop” codon back into the original amino acid). This would allow researchers to compare “diseased” and “rescued” neurons to see if the cellular defects are reversed.

(ii) What technology or technologies would you use to perform these DNA edits and why? Also answer the following questions:

How does your technology of choice edit DNA? What are the essential steps? What preparation do you need to do (e.g. design steps) and what is the input (e.g. DNA template, enzymes, plasmids, primers, guides, cells) for the editing? What are the limitations of your editing methods (if any) in terms of efficiency or precision?

Technology: CRISPR-Cas9 Base Editors.

How it works: Unlike standard CRISPR, which cuts both DNA strands, Base Editors use a catalytically impaired Cas9 (a nickase, nCas9, or a “deactivated” dCas9) fused to a deaminase enzyme. The complex is guided to the target site and chemically converts one base into another without making a double-strand break.

Essential Steps:

Design: Create a guide RNA (gRNA) that matches the CHD8 mutation site.

Input: A plasmid or mRNA encoding the Base Editor and the gRNA.

Process: The gRNA leads the Cas9 to the DNA. The deaminase enzyme converts a C to a T (or A to G) at the specific coordinate.

Limitations:

Precision: “Bystander editing” can occur if there are multiple target bases (e.g., two C’s) very close together in the “editing window.”

Efficiency: Not all cells will be successfully edited, requiring single-cell sorting to find the correctly modified clones.
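The bystander concern can be checked at design time by listing every editable base inside the editing window of a candidate guide. The sketch below assumes a window of protospacer positions 4 to 8 (counting from the PAM-distal end), a commonly cited range that varies between editors. The 20-nt protospacer is hypothetical and is not a real CHD8 guide.

```python
def editable_positions(protospacer: str, target_base: str, window=(4, 8)):
    """Return 1-based positions of target_base inside the editing window.

    The default window is an ASSUMPTION; check the chosen editor's documented window.
    """
    first, last = window
    return [pos for pos in range(first, last + 1) if protospacer[pos - 1] == target_base]


if __name__ == "__main__":
    guide = "GCTCATAGGTCCAGTTCAGG"  # HYPOTHETICAL protospacer
    hits = editable_positions(guide, "A")  # adenine base editor: A to G
    print("A positions in window:", hits)
    if len(hits) > 1:
        print("More than one A in the window: bystander edits are possible.")
```

A guide with exactly one target base in the window is the cleanest choice. If none is available, shifting the guide by a few bases may move the bystander out of the window.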

Week 3 — Lab Automation

One of the great parts about having an automated robot is being able to precisely mix, deposit, and run reactions without much intervention, and design and deploy experiments remotely.

For this week, we’d like for you to do the following:

Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications. Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details. While your description/project idea doesn’t need to be set in stone, we would like to see core details of what you would automate. This is due at the start of lecture and does not need to be tested on the Opentrons yet.

Example 1: You are creating a custom fabric, and want to deposit art onto specific parts that need to be intertwined in odd ways. You can design a 3D printed holder to attach this fabric to it, and be able to deposit bio art on top. Check out the Opentrons 3D Printing Directory.

Example 2: You are using the cloud laboratory to screen an array of biosensor constructs that you design, synthesize, and express using cell-free protein synthesis.

Echo transfer biosensor constructs and any required cofactors into specified wells.
Bravo stamp in CFPS reagent master mix into all wells of a 96-well / 384-well plate.
Multiflo dispense the CFPS lysate to all wells to start protein expression.
PlateLoc seal the plate.
Inheco incubate the plate at 37°C while the biosensor proteins are synthesized.
XPeel remove the seal.
PHERAstar measure fluorescence to compare biosensor responses.
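Before writing any instrument-specific code, the workflow above can be captured as plain data, which makes it easy to review the order of operations and generate a plate map. This is only a planning sketch: it does not call any real Opentrons or cloud-lab API, and the `WORKFLOW` list and `well_names` helper are names invented for this example.

```python
from string import ascii_uppercase

# Ordered (instrument, action) pairs taken from the example workflow above.
WORKFLOW = [
    ("Echo", "transfer biosensor constructs and cofactors into specified wells"),
    ("Bravo", "stamp CFPS reagent master mix into all wells"),
    ("Multiflo", "dispense CFPS lysate to all wells to start expression"),
    ("PlateLoc", "seal the plate"),
    ("Inheco", "incubate the plate at 37 C during protein synthesis"),
    ("XPeel", "remove the seal"),
    ("PHERAstar", "measure fluorescence to compare biosensor responses"),
]


def well_names(rows: int, cols: int):
    """Well labels in row-major order, e.g. A1..H12 for 8 x 12 (a 96-well plate)."""
    return [f"{ascii_uppercase[r]}{c}" for r in range(rows) for c in range(1, cols + 1)]


if __name__ == "__main__":
    for step, (instrument, action) in enumerate(WORKFLOW, start=1):
        print(f"{step}. {instrument}: {action}")
    plate = well_names(8, 12)
    print(f"{len(plate)} wells, from {plate[0]} to {plate[-1]}")
```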
