Part A: SOD1 Binder Peptide Design
Part 1: Generate Binders with PepMLM
Human SOD1 sequence from UniProt (no mutation): (https://www.uniprot.org/uniprotkb/P00441/entry)
>sp|P00441|SODC_HUMAN Superoxide dismutase [Cu-Zn] OS=Homo sapiens OX=9606 GN=SOD1 PE=1 SV=2
MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS
AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV
HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
A4V Mutation refers to Alanine being changed to Valine at the 4th amino acid in the protein.
DNA Assembly 1. What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?
According to the New England Biolabs product page, Phusion High-Fidelity PCR Master Mix consists of:
Phusion DNA Polymerase, deoxynucleotides, and a reaction buffer that has been optimized and includes MgCl2. The Phusion DNA polymerase synthesizes new DNA strands. It is able to “proofread” itself, which means it produces far fewer errors than other polymerases (hence why it is “high fidelity”). Deoxynucleotides are the molecules that make up DNA (i.e. A, T, G, C). These building blocks are used in the new DNA strands. The reaction buffer creates the right environment for this process to take place. It ensures there is the right pH and the right amount of ions.
Intracellular Artificial Neural Networks (IANNs) 1. What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?
IANNs are interesting and different from traditional genetic circuits because their inputs are continuous molecular concentrations. So a basic artificial neuron can take 2 inputs and produce an output that changes continuously with the concentrations of those inputs. They can therefore handle more complexity than traditional genetic circuits. You might need multiple genetic circuits to perform calculations that a single IANN can do.
General and Lecturer-Specific Questions 1. Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.
There are many advantages of cell-free protein synthesis over traditional in vivo methods. Cell-free systems give you more control over the conditions of your experiment (i.e. DNA, proteins, small molecules, etc.), unlike in vivo methods, where things are a bit more of a “black box”.
Final Project - Measurements Please identify at least one (ideally many) aspect(s) of your project that you will measure. It could be the mass or sequence of a protein, the presence, absence, or quantity of a biomarker, etc. Please describe all of the elements you would like to measure, and furthermore describe how you will perform these measurements. What are the technologies you will use (e.g., gel electrophoresis, DNA sequencing, mass spectrometry, etc.)? Describe in detail.
Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork So I forgot to take a photo but I submitted one green pixel that was above a red heart in the bottom left quadrant. This was early on, when I first received the link to the Pixel Artwork Canvas.
However I’m pretty sure the whole heart got erased and replaced multiple times over by the time the deadline ended! haha
Subsections of Homework
Week 1 HW: Principles and Practices
1. First, describe a biological engineering application or tool you want to develop and why.
I want to work with Geobacter bacteria to create a living soil contaminant sensor in which electric signals modulate based on the presence of heavy metals in the ground. Geobacter species are already well-studied bacteria that produce electric signals underground. This makes them a useful organism to modify and use as a biosensor.
Heavy metal soil tests exist, but with living bacteria that emit electric signals one can track the growth and spread of these metals over time. This can lead to better pollutant mapping and detection of the source of the issue. With genetic engineering we can also customize the Geobacter to respond to the specific heavy metals that are a local problem.
The specific mechanism I would study is how to add a metal-inducible promoter that drives expression of the Geobacter genes responsible for electron transfer (electricity). That way the electrical output would increase when heavy metal contaminants are present in the ground, which would be a measurable way of tracking soil health over time.
2. Next, describe one or more governance/policy goals related to ensuring that this application or tool contributes to an “ethical” future, like ensuring non-malfeasance (preventing harm). Break big goals down into two or more specific sub-goals.
Environmental Health: Non-malfeasance to ecosystems. This living soil contaminant sensor will improve environmental health by providing constant monitoring of soil health.
Reduced Inequality: Making the sensors accessible to all. A living biosensor that is all around us lowers the barrier to entry for learning about soil quality. Tracking the changes in electric signals in the modified Geobacter requires only limited technology.
Human Health: Protect people from heavy metal exposure. With constant, easy-to-understand living soil sensors, communities will be empowered with data and knowledge to advocate for better environmental protections from their lawmakers. The monitoring will also let them know which parts of their community have been most impacted and how to avoid harmful exposure.
3. Next, describe at least three different potential governance “actions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & “Success”).
Biocontainment and a Kill-Switch for Geobacters
Purpose: To have a mechanism in place to stop the modified Geobacter from spreading and upsetting the balance of the ecosystem they are introduced into, or to keep them contained in one specific area.
Design: This would involve engineering a specific kill-switch for the modified Geobacter bacteria. This could mean making the bacteria susceptible to a specific type of antibiotic, or making them dependent on a specific synthetic nutrient that has to be administered regularly for the Geobacter to survive.
Assumptions: That this kind of kill-switch would be effective, and that using a specific antibiotic against Geobacter would not affect other nearby living organisms.
Risks of Failure & Success: Failure would be if the engineered Geobacter spread uncontrollably and endangered the balance of the microbial ecosystem. Success looks like well-contained Geobacter that do not harm any other living organisms but instead work symbiotically with them.
Open Source Community Science
Purpose: To make sure that the information being collected from the Geobacter soil sensors is open to all members of the nearby communities and beyond.
Design: There would be public dashboards accessible to all that show the data collected by the sensors. There would also have to be open source documentation about how to create your own low-cost “readers” that would be able to evaluate the electronic signals emitted by the sensors.
Assumptions: The communities involved have basic knowledge of electronics and have access to digital media.
Risks of Failure & Success: Failure would be if this information became privatized and used in a way that would be harmful or exploitative of local communities. Success would be if all the data collected remained transparent and accessible to all.
Reporting System for High Heavy Metal Levels
Purpose: To ensure that authorities are notified when toxic metal detection reaches a dangerous threshold, and to make sure action is taken if a threat to human or animal health is detected.
Design: Create a policy where there is a threshold level that activates an alert system either to the local community, the relevant environmental authorities, etc. Also have a trigger that would alert a non-profit or similar body of scientists to verify the findings from the biosensors.
Assumptions: The biosensors are accurate and reporting correctly. There are competent authorities in the area that are able to help restrict dangerous high-contaminant areas, to prevent people from being affected.
Risks of Failure & Success: A failure would be if the system works improperly and gives false negatives or false positives, eroding trust in the biosensors. A success would be if the biosensors report accurately and their findings can be confirmed by a third party whenever heavy metals are detected, so that communities become more aware of the quality of their soil and of issues related to pollution.
4. Next, score (from 1-3, with 1 as the best, or n/a) each of your governance actions against your rubric of policy goals:
| Does the option: | Environmental Health | Reduced Inequality | Human Health |
| --- | --- | --- | --- |
| Biocontainment and Kill-Switch | 1 | 3 | 2 |
| Open Source Community Science | 3 | 1 | 2 |
| Reporting System for High Heavy Metal Levels | 1 | 1 | 1 |
5. Last, drawing upon this scoring, describe which governance option, or combination of options, you would prioritize, and why. Outline any trade-offs you considered as well as assumptions and uncertainties.
I would prioritize the Biocontainment and Kill-Switch action, since this is the foundational issue that could cause a great deal of environmental harm if not done right. None of the other options are meaningful if I can’t get the biosecurity issue right. I think I would focus on the US, since the laws and regulations there are more permissive. I would therefore have to deal with the Environmental Protection Agency (EPA) and try to get their approval to deploy my bacteria in the soil. I would have to prove that my kill-switch works correctly and efficiently. As an additional ethical step, I would want to meet with the nearby local communities and try to get their consent, since I am adding a new organism to their land and they would be most affected should there be any biocontainment issues.
Resources
Bazhenov, S. V., Novoyatlova, U. S., Scheglova, E. S., Prazdnova, E. V., Mazanko, M. S., Kessenikh, A. G., Kononchuk, O. V., Gnuchikh, E. Y., Liu, Y., Al Ebrahim, R., Zavilgelsky, G. B., Chistyakov, V. A., & Manukhov, I. V. (2023). Bacterial lux-biosensors: Constructing, applications, and prospects. Biosensors and Bioelectronics: X, 13, 100323. https://doi.org/10.1016/j.biosx.2023.100323
Webster, C. F., Kim, W.-J., Reguera, G., Friesen, M., & Beyenal, H. (2024). Can bioelectrochemical sensors be used to monitor soil microbiome activity and fertility? Current Opinion in Biotechnology, 90, 103222. https://doi.org/10.1016/j.copbio.2024.103222
Week 2 Lecture Prep
Homework Questions from Professor Jacobson:
1. Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome. How does biology deal with that discrepancy?
The error rate of polymerase is about 1:10^6. The length of the human genome is approximately 3.2 gigabase pairs (Gbp), so at that rate a single copy of the genome would contain on the order of a few thousand errors. Biology deals with this discrepancy in part through error-correcting polymerases that go back and “check their work” (proofreading). There is also the MutS repair system, in which a protein (MutS) detects mismatches in DNA and activates other proteins to cut out the stretch containing the mismatched base, allowing DNA polymerase to resynthesize the missing segment.
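To make the comparison concrete, here is a minimal back-of-the-envelope sketch in Python. It only uses the two figures from this answer (an error rate of 1 in 10^6 and a genome of about 3.2 Gbp); the variable names are my own.

```python
# Rough estimate of copying errors per genome replication if nothing
# corrected them, using the two figures quoted in the answer.
ERROR_RATE = 1e-6          # one error per 10^6 bases copied
GENOME_LENGTH_BP = 3.2e9   # approximate length of the human genome

expected_errors = ERROR_RATE * GENOME_LENGTH_BP
print(f"Expected errors per replication without repair: {expected_errors:.0f}")
```

This is why a repair step on top of the polymerase matters: the genome is much longer than the number of bases the polymerase can copy between mistakes.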
2. How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?
There are 20 amino acids in biology. An average human protein is about 300 to 400 amino acids long. Assuming each amino acid can be written with about 4 possible codons (the real number ranges from 1 to 6 depending on the amino acid), the number of different ways to code for an average human protein would be (using the upper limit) 4^400, which is huge: roughly 10^240.
There are many reasons why not all of these codings would work. One reason is that organisms prefer certain codons over others. Another is that protein folding is sensitive: codon choice can change how fast the ribosome translates and how the mRNA itself folds, so not every combination of codons works equally well.
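As a quick check on the size of that number, here is a small Python sketch. It keeps the simplifying assumption of 4 codons per amino acid and a 400 amino acid protein; the constant names are mine.

```python
import math

CODONS_PER_AMINO_ACID = 4   # simplifying assumption used in this answer
PROTEIN_LENGTH = 400        # upper end of the "average" protein length

# Python integers have arbitrary precision, so the exact count is computable.
n_encodings = CODONS_PER_AMINO_ACID ** PROTEIN_LENGTH
order_of_magnitude = math.floor(PROTEIN_LENGTH * math.log10(CODONS_PER_AMINO_ACID))

print(f"Number of digits in 4^400: {len(str(n_encodings))}")
print(f"4^400 is on the order of 10^{order_of_magnitude}")
```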
Homework Questions from Dr. LeProust:
1. What’s the most commonly used method for oligo synthesis currently?
Currently the most commonly used method for oligo synthesis is solid-phase phosphoramidite synthesis. The DNA strand is grown one base at a time while anchored to a solid support, using chemically protected nucleotide building blocks (phosphoramidites) so that each addition happens in a controlled way.
2. Why is it difficult to make oligos longer than 200nt via direct synthesis?
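It’s difficult to make oligos longer than 200nt via direct synthesis because errors accumulate with the length of the oligonucleotide: every added base is another chance for a failed or incorrect step, so the fraction of full-length, correct product drops as the strand gets longer.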
3. Why can’t you make a 2000bp gene via direct oligo synthesis?
According to slide 59, there is an error rate of 1:2000 nt. This means the likelihood of at least one error in a 2000bp gene is high, and it’s better to synthesize small fragments and stitch them together later.
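The effect of length can be sketched with a few lines of Python. This assumes the 1:2000 nt error rate quoted from the slide and, as a simplification of my own, that errors happen independently at each position; `fraction_error_free` is a name I made up for the illustration.

```python
ERROR_RATE_PER_NT = 1 / 2000   # error rate quoted from the lecture slide

def fraction_error_free(length_nt: int, error_rate: float = ERROR_RATE_PER_NT) -> float:
    """Probability that a strand of the given length has no errors at all,
    assuming each position fails independently with the same probability."""
    return (1 - error_rate) ** length_nt

for length in (200, 2000):
    share = fraction_error_free(length)
    print(f"{length} nt: {share:.1%} of strands expected to be error-free")
```

Under these assumptions most 200 nt strands come out clean, while for a 2000 nt strand the majority of molecules carry at least one error, which is why short fragments are synthesized and assembled afterwards.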
Homework Questions from George Church:
1. What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?
The 10 essential amino acids in all animals are: phenylalanine, valine, threonine, tryptophan, isoleucine, methionine, histidine, arginine (non-essential for mammals though), leucine and lysine.
The Lysine Contingency is an idea from the fictional book/movie Jurassic Park, in which the engineered dinosaurs could not make their own lysine, so they depended on humans supplementing it for them. It was meant as a sort of kill-switch for the engineered dinosaurs.
However, lysine is already an essential amino acid for animals, meaning they can’t make it on their own and need to get it from their diet. So the dinosaurs would naturally get lysine from their environment, which makes it not such a great fail-safe in the end!
Week 2 HW: DNA Read, Write, & Edit
Part 1: Benchling & In-silico Gel Art
I made a free account on Benchling and then imported the Lambda DNA (see below).
I simulated the restriction enzyme digestion with the following enzymes: EcoRI, EcoRV, SacI, SacII, KpnI, BamHI, HindIII.
I made an attempt at a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks. It was supposed to be a semi circle of sorts, I’ve decided it’s ski goggles now. I suppose it’s a bit of a Rorschach Test and you can see what you want to see.
Here are the enzymes and enzyme combinations used for the ski goggles:
Part 2: Gel Art - Restriction Digests and Gel Electrophoresis
Not applicable – No Lab Access
Part 3: DNA Design Challenge
3.1. Choose your protein.
In recitation, we discussed that you will pick a protein for your homework that you find interesting. Which protein have you chosen and why?
I am choosing the PilA protein from the bacterium Geobacter sulfurreducens. PilA is the main building block of the conductive nanowires (called e-pili) on Geobacter bacteria. Interestingly, in a lab setting it is easier to work with a water-friendly shortened version of the protein, called PilA19, which can then be used in bacteria such as E. coli. I will be choosing my protein based on what is studied in the “Bottom-Up Fabrication of Protein Nanowires via Controlled Self-Assembly of Recombinant Geobacter Pilins” research paper.
References: Cosert KM, Castro-Forero A, Steidl RJ, Worden RM, Reguera G. Bottom-Up Fabrication of Protein Nanowires via Controlled Self-Assembly of Recombinant Geobacter Pilins. mBio. 2019 Dec 10;10(6):e02721-19. doi: 10.1128/mBio.02721-19. PMID: 31822587; PMCID: PMC6904877.
3.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.
Using the reverse translate tool from bioinformatics.org I was able to get the most likely codons from my protein sequence.
>reverse translation of tr|Q74D23|Q74D23_GEOSL Geopilin domain 1 protein OS=Geobacter sulfurreducens (strain ATCC 51573 / DSM 12127 / PCA) OX=243231 GN=pilA-N PE=1 SV=1 to a 270 base sequence of most likely codons.
atggcgaactatccgcataccccgacccaggcggcgaaacgccgcaaagaaaccctgatg
ctgcagaaactgcgcaaccgcaaaggctttaccctgattgaactgctgattgtggtggcg
attattggcattctggcggcgattgcgattccgcagtttagcgcgtatcgcgtgaaagcg
tataacagcgcggcgagcagcgatctgcgcaacctgaaaaccgcgctggaaagcgcgttt
gcggatgatcagacctatccgccggaaagc
3.3. Codon optimization.
I used VectorBuilder’s online codon optimization tool to optimize the sequence above for expression in E. coli. I chose E. coli because it is one of the most commonly used chassis in genetic engineering, so it will make experimenting with the PilA protein easier. Codon optimization is important because different organisms prefer different codons. For example, some organisms have more tRNAs available for specific codons. By optimizing codon usage, you take advantage of what is more abundant in the organism, which speeds up protein production.
PilA protein DNA sequence with Codon-Optimization
ATGGCGAACTATCCGCATACCCCGACCCAGGCCGCGAAACGCCGCAAAGAAACCCTGATGCTGCAGAAACTGCGCAATCGTAAAGGCTTTACCCTGATTGAACTGCTGATTGTGGTGGCGATTATTGGCATTCTGGCGGCGATTGCGATTCCGCAGTTTAGCGCGTATCGCGTTAAAGCCTACAATAGCGCGGCGAGCAGCGATCTGCGTAATCTGAAAACCGCGCTGGAATCCGCCTTTGCGGATGATCAGACCTATCCGCCGGAAAGC
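Codon optimization should only swap synonymous codons, so a useful sanity check is to translate both DNA sequences and confirm they give the same protein. Here is a minimal Python sketch of that check. It uses the standard genetic code and the two sequences from sections 3.2 and 3.3; the helper names (`translate`, `count_changed_codons`) are my own and not part of any tool I used.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna: str) -> str:
    """Translate a DNA coding sequence (reading frame 1) into one-letter amino acids."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def count_changed_codons(a: str, b: str) -> int:
    """Count codon positions where two equal-length sequences differ."""
    a, b = a.upper(), b.upper()
    return sum(a[i:i + 3] != b[i:i + 3] for i in range(0, len(a), 3))

# Reverse-translated sequence from section 3.2.
ORIGINAL = (
    "atggcgaactatccgcataccccgacccaggcggcgaaacgccgcaaagaaaccctgatg"
    "ctgcagaaactgcgcaaccgcaaaggctttaccctgattgaactgctgattgtggtggcg"
    "attattggcattctggcggcgattgcgattccgcagtttagcgcgtatcgcgtgaaagcg"
    "tataacagcgcggcgagcagcgatctgcgcaacctgaaaaccgcgctggaaagcgcgttt"
    "gcggatgatcagacctatccgccggaaagc"
)
# Codon-optimized sequence from section 3.3.
OPTIMIZED = (
    "ATGGCGAACTATCCGCATACCCCGACCCAGGCCGCGAAACGCCGCAAAGAAACCCTGATG"
    "CTGCAGAAACTGCGCAATCGTAAAGGCTTTACCCTGATTGAACTGCTGATTGTGGTGGCG"
    "ATTATTGGCATTCTGGCGGCGATTGCGATTCCGCAGTTTAGCGCGTATCGCGTTAAAGCC"
    "TACAATAGCGCGGCGAGCAGCGATCTGCGTAATCTGAAAACCGCGCTGGAATCCGCCTTT"
    "GCGGATGATCAGACCTATCCGCCGGAAAGC"
)

same_protein = translate(ORIGINAL) == translate(OPTIMIZED)
print("Same protein:", same_protein)
print("Protein length:", len(translate(OPTIMIZED)))
print("Codons changed by optimization:", count_changed_codons(ORIGINAL, OPTIMIZED))
```

If the two translations ever differ, something went wrong either in the optimization or when copying the sequence.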
3.4. You have a sequence! Now what?
What technologies could be used to produce this protein from your DNA? Describe in your words the DNA sequence can be transcribed and translated into your protein. You may describe either cell-dependent or cell-free methods, or both.
To produce the PilA protein from its DNA sequence, I could use a cell-dependent method. The gene for PilA19 (the adapted version of PilA) would be carried on a plasmid in an E. coli host. Following the Central Dogma, the PilA DNA sits next to a promoter, which allows RNA polymerase to read the DNA sequence and make an mRNA copy.
This copy is read by a ribosome in sets of 3 bases (codons). Each codon represents an amino acid, which is fetched by a tRNA and added by the ribosome to a growing chain of amino acids. The chain then folds and becomes the PilA19 protein.
I would use cell-dependent production with E. coli because it can give higher protein yields than cell-free methods. It is also less expensive, and it is the method used in the research paper I referenced above.
Part 4: Prepare a Twist DNA Synthesis Order
I am creating a DNA sequence that would use GFP with PilA to create glowing green conductive nanowires that in theory could be visually detected and tested for conductivity via Scanning Tunneling Microscopes (STM).
I am basing this design on several papers that have worked with these proteins before, which I reference below.
Twist Bioscience optimized away some redundancies in my PilA-sfGFP DNA sequence. The GenBank file, which I’ve added to my assets folder, is the optimized version.
References:
Cosert, K. M., Castro-Forero, A., Steidl, R. J., Worden, R. M., & Reguera, G. (2019). Bottom-Up Fabrication of Protein Nanowires via Controlled Self-Assembly of Recombinant Geobacter Pilins. mBio, 10(6), 10.1128/mbio.02721-19. https://doi.org/10.1128/mbio.02721-19.
Ueki, T., Walker, D. J. F., Woodard, T. L., Nevin, K. P., Nonnenmann, S. S., & Lovley, D. R. (2020). An Escherichia coli Chassis for Production of Electrically Conductive Protein Nanowires. ACS Synthetic Biology, 9(3), 647–654. https://doi.org/10.1021/acssynbio.9b00506.
Wahlfors, J., Loimas, S., Pasanen, T., & Hakkarainen, T. (2001). Green fluorescent protein (GFP) fusion constructs in gene therapy research. Histochemistry and Cell Biology, 115(1), 59–65. https://doi.org/10.1007/s004180000219.
Part 5: DNA Read/Write/Edit
5.1 DNA Read
What DNA would you want to sequence (e.g., read) and why?
I have two projects that inspire me: one by Carolina Reyes about fungal batteries, and another by Tanguy Chotel about resurrecting ancestral proteins that were better suited to carbon capture and using them to do a “reboot” of the traditional Calvin Cycle. He uses Chlamydomonas reinhardtii, a microalga, as his chassis.
References:
Reyes, C., Fivaz, E., Sajó, Z., Schneider, A., Siqueira, G., Ribera, J., Poulin, A., Schwarze, F. W. M. R., & Nyström, G. (2024). 3D Printed Cellulose-Based Fungal Battery. ACS Sustainable Chemistry & Engineering, 12(43), 16001–16011. https://doi.org/10.1021/acssuschemeng.4c05494
Inckemann, R., Chotel, T., Brinkmann, C. K., Burgis, M., Andreas, L., Baumann, J., Sharma, P., Klose, M., Barrett, J., Ries, F., Paczia, N., Glatter, T., Willmund, F., Mackinder, L. C. M., & Erb, T. J. (2024). Advancing chloroplast synthetic biology through high-throughput plastome engineering of Chlamydomonas reinhardtii (p. 2024.05.08.593163). bioRxiv. https://doi.org/10.1101/2024.05.08.593163
(i) Building off the aforementioned fungal battery, I would be interested in sequencing the white-rot fungus Trametes pubescens, which serves as the cathode in the fungal battery. There are laccase enzymes in the white-rot fungus that capture the electrons and close the circuit. I’d be curious to sequence other fungi and compare them with Trametes pubescens to see whether laccase enzymes in other species could prove even better suited to bio-batteries. I could specifically sequence the laccase genes (lap1/2). Laccase enzymes have copper atoms that capture the electrons. Those electrons are then used to turn oxygen from the air into harmless water (no issue of hydrogen peroxide).
(ii) I think I would use NGS (Next Generation Sequencing) to read the DNA from Trametes pubescens because it is one of the more cost-effective solutions and it also handles multiple samples well (in case I want to compare laccase enzymes from different fungi). However, if I need long reads, for example to assemble a very long genome, I should use Oxford Nanopore Technologies, which can handle long reads.
The NGS method is second generation (massively parallel), which means that, as opposed to Sanger sequencing (first generation), it runs many sequencing reactions at once rather than one tube per reaction.
References
Jiang, S., Chen, Y., Han, S., Lv, L., & Li, L. (2022). Next-Generation Sequencing Applications for the Study of Fungal Pathogens. Microorganisms, 10(10), 1882. https://doi.org/10.3390/microorganisms10101882
5.2 DNA Write
(i) An example of DNA I would want to synthesize is what I described above in the Twist order: a GFP-tagged version of the pilA gene, in order to understand visually how and where the nanowires are constructed.
To take this project a step further (and align it with the project I investigated in my first homework), I could make a genetic circuit that uses a promoter from another bacterium that is activated by high levels of toxic metals. One such example is the czcCBA promoter from Pseudomonas putida. This promoter is activated by zinc, cadmium, or lead. I could place this promoter in front of the pilA-GFP gene so that the protein is expressed only when heavy metals are detected. The GFP would be useful for debugging the circuit, and the conductive nanowires could be hooked up to electrodes or other electric components to create a bio-hybrid sensor.
(ii) To perform this particular DNA synthesis I would use Twist Bioscience’s web platform. It would be a similar process to what I created in the Twist Example order above. There are other options, like using Golden Gate cloning to assemble DNA. But with Twist I can order the fully built plasmid. They verify the sequences and there’s a guarantee that the sequence delivered is 99.5% exact. Practically speaking this is the best option for me, even if it is more expensive.
5.3 DNA Edit
(i) What DNA would you want to edit and why?
In a similar theme to my earlier responses, I would edit DNA in order to boost the conductive capabilities of Geobacter sulfurreducens.
Based on a research paper I found, there is a cell-division gene called ftsZ which limits the length of the cell body. If this gene is repressed (according to the paper), the cell body becomes longer and more filamentous (resembling a conductive thread). This creates a higher density of nanowires, which improves conductivity.
(ii) I’d use CRISPRi (CRISPR interference) with a dead, or deactivated, Cas9 (dCas9). CRISPRi is a good tool for this because it can pause certain genes without having to cut the DNA, which makes it well suited for testing. It functions similarly to CRISPR in that there is a “guide RNA” attached to the Cas9 that helps it find the target DNA. However, instead of cutting the DNA, it blocks the transcription of that part of the DNA. In this way we can do gene silencing at the transcription level. Once I’ve tested with CRISPRi, I can then move to CRISPR to do true gene editing. CRISPR uses the same mechanism except that it uses an active Cas9 protein, which actually cuts the DNA at the site it is guided to.
In order to do this (for either CRISPR or CRISPRi) I would need to:
a. Find the sequence for the ftsZ gene. This is used to design the guide RNA that directs Cas9 to the gene.
b. Order the Cas9 (or dCas9) plasmids
c. Grow host chassis (Geobacter or E.coli)
I would have to clone the guide RNA into the Cas9 plasmids. Then I would have to deliver that plasmid into the host cell, something I could do with electroporation.
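Step (a) can be sketched in Python. This is only an illustration under stated assumptions: it assumes the commonly used Cas9 from Streptococcus pyogenes, which needs an NGG PAM directly after a 20 nt target; it scans only the forward strand; and `toy_sequence` is a made-up placeholder, not the real ftsZ sequence. The function name `find_guide_candidates` is my own.

```python
import re

def find_guide_candidates(dna: str, guide_length: int = 20):
    """Return (position, protospacer, PAM) for every forward-strand site where
    a guide_length-nt protospacer is immediately followed by an NGG PAM."""
    dna = dna.upper()
    # The lookahead lets overlapping candidate sites be reported too.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_length}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]

# Hypothetical toy sequence for illustration only; NOT the real ftsZ gene.
toy_sequence = "ATGACCGATTTACGCTAGCTAGGATCCGATCGATTACGGTACCGATCGTAGCTAGCTAGG"

for position, protospacer, pam in find_guide_candidates(toy_sequence):
    print(position, protospacer, pam)
```

In practice I would run a dedicated guide design tool on the real gene sequence and also check each candidate for off-target matches elsewhere in the genome.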
CRISPR has some limitations, particularly that it does not always cut at the right site (off-target effects). This is not only inefficient but can also cause bigger issues like cell death due to inaccurate cuts.
References:
Bird, L. J., Kundu, B. B., Tschirhart, T., Corts, A. D., Su, L., Gralnick, J. A., Ajo-Franklin, C. M., & Glaven, S. M. (2021). Engineering wired life: Synthetic biology for electroactive bacteria. ACS Synthetic Biology, 10(11), 2808–2823. https://doi.org/10.1021/acssynbio.1c00335
Week 3 HW: Lab Automation
Python Script for Opentrons Artwork
I was able to run my Opentrons script of a rainbow at Lifefabs Institute last year, using paint instead of bacteria.
I’m adding a couple images and short video of the process.
Python Code
This is the code that I used to run my protocol.
    from opentrons import types

    import math

    metadata = {    # see https://docs.opentrons.com/v2/tutorial.html#tutorial-metadata
        'author': 'Elsa Donovan',
        'protocolName': 'HGTAA Opentrons Lab',
        'description': 'The protocol will pipette 4 different types of genetically engineered e. coli bacteia into a rainbow design.',
        'source': 'HTGAA 2025 Opentrons Lab',
        'apiLevel': '2.20'
    }

    ##############################################################################
    ###   Robot deck setup constants - don't change these
    ##############################################################################

    TIP_RACK_DECK_SLOT = 9
    COLORS_DECK_SLOT = 6
    AGAR_DECK_SLOT = 5
    PIPETTE_STARTING_TIP_WELL = 'A1'

    well_colors = {
        'A1': 'Red',
        'B1': 'Yellow',
        'C1': 'Green',
        'D1': 'Cyan',
        'E1': 'Blue'  # if in a 24-well plate, this needs to be moved to e.g. D2
    }


    def run(protocol):
        ##############################################################################
        ###   Load labware, modules and pipettes
        ##############################################################################

        # Tips
        tips_20ul = protocol.load_labware('opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Opentrons 20uL Tips')

        # Pipettes
        pipette_20ul = protocol.load_instrument("p20_single_gen2", "right", [tips_20ul])

        # Modules
        temperature_module = protocol.load_module('temperature module gen2', COLORS_DECK_SLOT)

        # Temperature Module Plate
        temperature_plate = temperature_module.load_labware('opentrons_96_aluminumblock_generic_pcr_strip_200ul',
                                                            'Cold Plate')
        # Choose where to take the colors from
        color_plate = temperature_plate

        # Agar Plate
        agar_plate = protocol.load_labware('htgaa_agar_plate', AGAR_DECK_SLOT, 'Agar Plate')  ## TA MUST CALIBRATE EACH PLATE!
        # Get the top-center of the plate, make sure the plate was calibrated before running this
        center_location = agar_plate['A1'].top()

        pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)

        ##############################################################################
        ###   Patterning
        ##############################################################################

        ###
        ### Helper functions for this lab
        ###

        # pass this e.g. 'Red' and get back a Location which can be passed to aspirate()
        def location_of_color(color_string):
            for well, color in well_colors.items():
                if color.lower() == color_string.lower():
                    return color_plate[well]
            raise ValueError(f"No well found with color {color_string}")

        # For this lab, instead of calling pipette.dispense(1, loc) use this: dispense_and_jog(pipette, 1, loc)
        def dispense_and_jog(pipette, volume, location):
            """
            Dispense and then move up 5mm to shake all dispensed fluid off the tip;
            move back down to touch the agar to try to make sure the drop adheres;
            then go back up 5mm to avoid smearing the drop before moving laterally.
            """
            assert(isinstance(volume, (int, float)))
            pipette.dispense(volume, location)
            currLoc = pipette._get_last_location_by_api_version()
            pipette.move_to(currLoc.move(types.Point(z=5)))
            pipette.move_to(currLoc)
            pipette.move_to(currLoc.move(types.Point(z=5)))

        ###
        ### YOUR CODE HERE to create your design
        ###
        radius_step = 3  # Distance between concentric circles within a color layer
        color_layer_spacing = 6  # Distance between different color layers
        num_layers = 2
        base_drops_per_circle = 4  # Drops in the first circle
        color_layer = 1

        for color_name in well_colors.values():
            pipette_20ul.pick_up_tip()
            for layer in range(num_layers):
                radius = (layer + 1 + color_layer) * radius_step
                drops_per_circle = base_drops_per_circle * (color_layer + layer + 1)
                angle_step = math.pi / drops_per_circle  # only cover the top half
                pipette_20ul.aspirate(10, location_of_color(color_name))
                for i in range(drops_per_circle + 1):
                    if pipette_20ul.current_volume < 1:
                        pipette_20ul.aspirate(10, location_of_color(color_name))
                    angle = i * angle_step  # restrict angle to top half (0 to pi)
                    x_offset = radius * math.cos(angle)
                    y_offset = radius * math.sin(angle) - 10
                    drop_location = center_location.move(types.Point(x=x_offset, y=y_offset))
                    dispense_and_jog(pipette_20ul, 1, drop_location)
            pipette_20ul.drop_tip()
            color_layer += num_layers
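The arc geometry in the protocol can be previewed without a robot. The sketch below is a standalone approximation, not part of the protocol itself: it repeats the same nested loops with plain Python and returns the (x, y) offsets in mm from the plate center. The function name `rainbow_offsets` and the default of 5 colors (the number of entries in `well_colors`) are my own choices for the illustration.

```python
import math

def rainbow_offsets(n_colors=5, num_layers=2, radius_step=3,
                    base_drops_per_circle=4, y_shift=-10):
    """Reproduce the (color index, x, y) offsets generated by the protocol's
    nested loops, in mm relative to the plate center."""
    points = []
    color_layer = 1
    for color_index in range(n_colors):
        for layer in range(num_layers):
            radius = (layer + 1 + color_layer) * radius_step
            drops = base_drops_per_circle * (color_layer + layer + 1)
            angle_step = math.pi / drops  # half circle only (0 to pi)
            for i in range(drops + 1):
                angle = i * angle_step
                points.append((color_index,
                               radius * math.cos(angle),
                               radius * math.sin(angle) + y_shift))
        color_layer += num_layers
    return points

points = rainbow_offsets()
farthest = max(math.hypot(x, y) for _, x, y in points)
print(f"{len(points)} drops in total")
print(f"Farthest drop is {farthest:.1f} mm from the plate center")
```

Comparing the farthest distance with the usable radius of the agar plate is a quick way to check that the design fits before running it, and the drop count gives the total volume needed at 1 µL per drop.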
Post lab questions
1. Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.
I’m really interested in the concept of biofoundries. These are facilities that use robotics, machine learning, and automation to run through many protocols while reducing the possibility of human error. This article (Jun et al.) specifically discusses carrying out cell-free protein synthesis quickly and efficiently. By incorporating automation tools (like Opentrons), you can basically rapid-prototype the protein of your choice with minimal errors.
Paper:
Jun, J.S. et al. (2025). Automated and Programmable Cell-Free Systems for Scalable Synthetic Biology with a Focus on Biofoundry Integration. Journal of Microbiology and Biotechnology, 35, e2507019.
2. Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more. You may reference this week’s recitation slide deck for lab automation details.
I’m interested in creating a 3D printed flexible microwell array. For that I am thinking of 3D printing with TPU (a rubber-like filament) and creating custom wells that way. Alternatively I might make a classic 3D printed array and then use that as a mold for a silicone cast. The inspiration for this comes from flexible LED light strips, and I want to mimic that appearance. Of course for this I would also need a transparent or translucent flexible cover to encapsulate the genetically altered E. coli. I might use something like a polydimethylsiloxane (PDMS) membrane, which allows gas exchange but would keep the E. coli from leaking out if the microwell array gets turned upside down.
Final Project Ideas
Week 4 HW: Protein Design Part I
Part A: Conceptual Questions
1. How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons).
amino acid = ~100 Daltons
meat = 25% protein (according to google between 20 to 30% depending on meat)
Daltons = g/mol
So 500g/4 is 125g of protein.
We’re going to have to use: Avogadro's number (6.022 × 10²³)
Avogadro’s number helps us to convert moles to number of molecules.
125 g of amino acids (protein) divided by 100 g/mol = 1.25 mol
1.25 mol × 6.022 × 10^23 molecules/mol ≈ 7.5 × 10^23 molecules
So you take in approx 7.5 × 10^23 molecules of amino acid by eating 500 g of meat.
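The same estimate can be written as a short calculation. This is a minimal sketch using the assumptions stated above (25% protein by mass, about 100 g/mol per amino acid); the function name is just for illustration.

```python
AVOGADRO = 6.022e23  # molecules per mole

def amino_acids_in_meat(meat_grams, protein_fraction=0.25,
                        residue_mass_g_per_mol=100.0):
    """Estimate the number of amino acid molecules in a piece of meat."""
    protein_grams = meat_grams * protein_fraction       # 500 g -> 125 g
    moles = protein_grams / residue_mass_g_per_mol      # 125 g -> 1.25 mol
    return moles * AVOGADRO

count = amino_acids_in_meat(500)
print(f"{count:.2e} amino acid molecules")
```

Changing `protein_fraction` to 0.20 or 0.30 shows how much the answer moves across the 20 to 30% protein range.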
2. Why do humans eat beef but do not become a cow, eat fish but do not become fish?
Humans can eat beef and fish and not turn into either a cow or a fish because the human digestive system breaks food down into small building blocks before reusing them. Proteins are broken down into amino acids, fats into fatty acids, carbohydrates into simple sugars, and the DNA in the food into nucleotides. These building blocks are absorbed into our bloodstream and reassembled according to the instructions in human DNA, not the DNA of what was eaten.
In essence: eating something doesn’t transfer its DNA to us and therefore doesn’t transfer its identity. The human body just takes the useful parts and repurposes them.
3. Why are there only 20 natural amino acids?
This question is difficult to answer. Part of the explanation is the genetic code itself: amino acids are encoded by 3-letter codons, and with 4 bases there are 4 × 4 × 4 = 64 possible codons. As we learned, some amino acids have multiple codons, and 3 of the codons signal the ribosome to stop making the protein. So the number of possible combinations is limited, but it still exceeds 20. The rest of the explanation is that this is how life evolved: the system worked, so there wasn’t a need to evolve past it.
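The codon counting argument can be checked directly. This is a small sketch that enumerates every 3-letter RNA codon and removes the three standard stop codons.

```python
from itertools import product

BASES = "UCAG"
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard genetic code

# Every possible 3-letter combination of the 4 bases
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
sense_codons = [c for c in codons if c not in STOP_CODONS]

print(len(codons), "codons in total")
print(len(sense_codons), "codons left to encode the 20 amino acids")
```

With 61 sense codons for 20 amino acids, most amino acids end up with more than one codon, so the code has room to spare.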
4. Can you make other non-natural amino acids? Design some new amino acids.
Yes, you can definitely make non-natural amino acids. The 20 natural ones sort of developed by chance. To design a new amino acid you change what’s called the R-group (or side chain), which is the only part that differs between the natural amino acids. For example, you can attach atoms such as bromine (-Br) or iodine (-I) to the carbon skeleton of the side chain, and that way you create a new non-natural amino acid!
5. Where did amino acids come from before enzymes that make them, and before life started?
Before life started, amino acids were produced when simple gases (like methane and hydrogen) were hit with sources of energy (like radiation from the sun). This energy was enough to rearrange the atoms in the gases and form amino acids. This was demonstrated by the Miller-Urey experiment, in which amino acids formed after a mixture of such gases was zapped with electricity.
6. If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?
An α-helix made of D-amino acids would be left-handed. The D comes from dextro, which means right, which is confusing. But that refers to the arrangement of the groups around the central carbon of the amino acid, not to the direction of the helix. Natural L-amino acids form right-handed α-helices, and D-amino acids are their mirror images, so a helix built from D-amino acids is the mirror image too: a left-handed spiral.
7. Why do β-sheets tend to aggregate?
Beta-sheets have a flat shape with a zig-zag pattern. They tend to aggregate because this shape makes them very stackable. Also, large areas of the β-sheet surface are often hydrophobic, so β-sheets tend to “stick” together to avoid water.
8. What is the driving force for β-sheet aggregation?
I answered this in the previous question, but because so many of the amino acid side chains in a β-sheet are hydrophobic, there is a tendency for β-sheets to clump together. The strands at the edges of a sheet also have free hydrogen-bonding groups, which lets them easily attach to other β-sheets or other proteins.
9. Can you use amyloid β-sheets as materials?
Amyloid β-sheets are really useful as materials because they aggregate and bind strongly to each other. Spider silk, for example, contains β-sheets and is a very strong natural material. β-sheets are also used in some nanowires to conduct electricity.
Part B: Protein Analysis and Visualization
1. Briefly describe the protein you selected and why you selected it.
I am choosing pilA from my week 2 homework since I find its electron-carrying properties fascinating and would love to explore it more:
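Total Sequence Length: 270 (nucleotides)
-----------------------------------
Nucleotide | Count | Frequency (%)
-----------------------------------
C          | 75    | 27.78%
G          | 73    | 27.04%
A          | 66    | 24.44%
T          | 56    | 20.74%
This count was run on the pilA DNA sequence, so C, G, A and T here are the nucleotides cytosine, guanine, adenine and thymine rather than amino acids. The most frequent one is cytosine (C); the amino acid composition has to be counted on the translated protein sequence instead.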
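A composition table like the one above can be produced with a few lines of Python. The sketch below also checks whether the input uses only nucleotide letters, since running an amino acid counter on a DNA sequence gives a table that contains only A, C, G and T. The DNA string is a made-up example, and the protein string is just the first residues of the SOD1 sequence quoted elsewhere on this page.

```python
from collections import Counter

def composition(sequence):
    """Return (symbol, count, percent) tuples, most frequent first."""
    seq = sequence.upper()
    total = len(seq)
    return [(symbol, n, 100.0 * n / total)
            for symbol, n in Counter(seq).most_common()]

def looks_like_dna(sequence):
    """Heuristic: True if the sequence uses only nucleotide letters.

    A real peptide made only of Ala, Cys, Gly and Thr would also pass,
    so treat a True result as a warning rather than a proof.
    """
    return set(sequence.upper()) <= set("ACGTN")

hypothetical_dna = "ATGGCGACCAAAGCG"          # made-up example, not pilA
protein_fragment = "MATKAVCVLKGDGPVQGIINFEQ"  # start of human SOD1

for name, seq in [("dna", hypothetical_dna), ("protein", protein_fragment)]:
    print(name, "looks like DNA:", looks_like_dna(seq))
    for symbol, n, percent in composition(seq)[:3]:
        print(f"  {symbol} | {n} | {percent:.2f}%")
```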
2.b How many protein sequence homologs are there for your protein? Hint: Use Uniprot’s BLAST tool to search for homologs.
I found about 250 protein sequence homologs for my protein. I did, in fact, use Uniprot’s Blast tool to find this.
2.c Does your protein belong to any protein family?
Yes, pilA belongs to the type IV pilin protein family.
3. Identify the structure page of your protein in RCSB
3.a When was the structure solved? Is it a good quality structure? A good quality structure is one with good resolution; the smaller the better (e.g. Resolution: 2.70 Å).
The structure was solved in 2013. It appears to be a good quality structure, but it was solved by solution NMR, which does not report a resolution value the way X-ray crystallography does.
3.b Are there any other molecules in the solved structure apart from protein?
The 2M7G structure doesn’t have any ligands or cofactors, it’s just the protein.
3.c Does your protein belong to any structure classification family?
PilA belongs to the pilin structural family. It’s in the type IV pilus superfamily as mentioned above.
4. Open the structure of your protein in any 3D molecule visualization software: PyMol Tutorial Here (hint: ChatGPT is good at PyMol commands)
Visualize the protein as “cartoon”, “ribbon” and “ball and stick”
Color the protein by secondary structure. Does it have more helices or sheets?
Color the protein by residue type. What can you tell about the distribution of hydrophobic vs hydrophilic residues?
The hydrophobic and hydrophilic residues are mixed here; however, there seem to be more hydrophobic residues (the orange) on the outer parts.
Visualize the surface of the protein. Does it have any “holes” (aka binding pockets)?
There don’t seem to be any holes/binding pockets in this particular protein.
Part C: Using ML-Based Protein Design Tools
I’m choosing this protein from the PDB.
It’s called MtrC from Shewanella oneidensis and it transports electrons to the surface of the membrane of the conductive bacteria.
https://www.rcsb.org/structure/4LM8
>4LM8_1|Chain A|Extracellular iron oxide respiratory system surface decaheme cytochrome c component MtrC|Shewanella oneidensis (211586)
MKFKLNLITLALLANTGLAVAADGGGSDGNNGNDGSDGGEPAGSIQTLNLDITKVSYENGAPMVTVFATNEADMPVIGLANLEIKKALQLIPEGATGPGNSANWQGLGSSKSYVDNKNGSYTFKFDAFDSNKVFNAQLTQRFNVVSAAGKLADGTTVPVAEMVEDFDGQGNAPQYTKNIVSHEVCASCHVEGEKIYHQATEVETCISCHTQEFADGRGKPHVAFSHLIHNVHNANKAWGKDNKIPTVAQNIVQDNCQVCHVESDMLTEAKNWSRIPTMEVCSSCHVDIDFAAGKGHSQQLDNSNCIACHNSDWTAELHTAKTTATKNLINQYGIETTSTINTETKAATISVQVVDANGTAVDLKTILPKVQRLEIITNVGPNNATLGYSGKDSIFAIKNGALDPKATINDAGKLVYTTTKDLKLGQNGADSDTAFSFVGWSMCSSEGKFVDCADPAFDGVDVTKYTGMKADLAFATLSGKAPSTRHVDSVNMTACANCHTAEFEIHKGKQHAGFVMTEQLSHTQDANGKAIVGLDACVTCHTPDGTYSFANRGALELKLHKKHVEDAYGLIGGNCASCHSDFNLESFKKKGALNTAAAADKTGLYSTPITATCTTCHTVGSQYMVHTKETLESFGAVVDGTKDDATSAAQSETCFYCHTPTVADHTKVKMKGELKLEGKPIPNPLLGLDSTRTGHHHHHH
Protein Language Modeling
1. Deep Mutational Scans
What I can tell from this heatmap is that there are multiple positions in my protein where every single mutation is predicted to be harmful. These can be identified by the blue/purple vertical stripes. Interestingly there is also a horizontal blue/purple band towards the bottom, in the row for the amino acid C (cysteine). In essence this means that mutating almost any position to cysteine is predicted to be a bad idea.
2. Latent Space Analysis
3. Protein Folding
Unfortunately I crashed my RAM trying to run ESMFold 😨.
The MtrC is too long (671 aa) and has complex heme cofactors, which I think is why ESMFold is crashing.
The A4V mutation refers to alanine being changed to valine at the 4th amino acid of the mature protein.
It’s called A4V in the homework, but in the literature it also appears as A5V. A4V is the older name, which numbers the residues without the initiator methionine; counting that methionine as residue 1, as UniProt does, the same alanine is at position 5 (M-A-T-K-A).
So I found this link for the A5V mutation on Uniprot here:
https://web.expasy.org/variant_pages/VAR_007131.html
The entry with the mutation would now start with MATKVVC
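To avoid off-by-one mistakes between the A4V and A5V numbering, the mutation can be applied in code with a check on the wild-type residue. This is a minimal sketch; `apply_point_mutation` is a helper name I made up, and only the start of the UniProt SOD1 sequence is used here.

```python
import re

def apply_point_mutation(sequence, mutation):
    """Apply a mutation such as 'A5V' (1-based position) to a protein sequence.

    Raises ValueError if the wild-type residue does not match the sequence,
    which catches numbering mix-ups.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"Cannot parse mutation: {mutation}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"Position {position} is outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            f"Expected {wild_type} at position {position}, found {sequence[index]}")
    return sequence[:index] + mutant + sequence[index + 1:]

sod1_start = "MATKAVCVLKGDGPVQGIINFEQ"  # first residues of UniProt P00441
mutant_start = apply_point_mutation(sod1_start, "A5V")
print(mutant_start)
```

With the UniProt numbering (methionine as residue 1), asking for `A4V` raises an error because position 4 is a lysine, while `A5V` gives the expected `MATKVVC...` start.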
Using PepMLM colab: Generate four peptides of length 12 amino acids conditioned on the mutant SOD1 sequence.
SOD1-binding peptide FLYRWLPSRRGG put through the PepMLM colab:
Part 2: Evaluate Binders with AlphaFold3
Record the ipTM score and briefly describe where the peptide appears to bind. Does it localize near the N-terminus where A4V sits? Does it engage the β-barrel region or approach the dimer interface? Does it appear surface-bound or partially buried?
Note: Only 2 of my generated peptides could be used for the AlphaFold3 model. It didn’t accept the ones that ended with “X”.
These are the scores:
ipTM = 0.79
pTM = 0.83
One of the yellow ends of the protein is the N-terminus. The peptides are near each of the ends of the protein but not directly on it. I would have expected it to be closer to the N-terminus since that’s where the mutation is.
The peptides are surface-bound. The peptide doesn’t seem to bind to the β-barrel region, and it also doesn’t appear to bind at the dimer interface.
Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse
For the peptide: WHSYPAQLRHKE (a new peptide I generated)
The values for this peptide are:
ipTM = 0.29 pTM = 0.84
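This particular peptide has a low ipTM, meaning AlphaFold3 has low confidence in the predicted peptide-SOD1 interface. It also has a weak predicted binding affinity. It is soluble and not hemolytic.
My assumption is that the weaker the binding, the more soluble the peptide is, but ipTM and solubility are separate predictions, so this is only a guess.
Following that assumption, a less soluble peptide would be one with a high ipTM.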
This peptide: WLNYAVAVAWWE
has an ipTM of 0.48, which is slightly higher than the previous one.
Here are its PeptiVerse results:
This is a better peptide because it is medium-binding, so a bit stronger than the typical peptide I’ve been generating. It is also still water soluble and non-hemolytic, which is important and necessary for therapies. I will move forward with this peptide then.
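The selection logic I followed can be sketched as a small filter: keep peptides that are soluble and non-hemolytic, then take the one with the highest ipTM. The values are the ones reported above; using ipTM as the ranking score is my own simplification, not a rule from PeptiVerse or AlphaFold3.

```python
# ipTM values from AlphaFold3 and properties from PeptiVerse, as reported above
candidates = [
    {"peptide": "WHSYPAQLRHKE", "iptm": 0.29, "soluble": True, "hemolytic": False},
    {"peptide": "WLNYAVAVAWWE", "iptm": 0.48, "soluble": True, "hemolytic": False},
]

def pick_lead(peptides):
    """Return the soluble, non-hemolytic peptide with the highest ipTM, or None."""
    usable = [p for p in peptides if p["soluble"] and not p["hemolytic"]]
    if not usable:
        return None
    return max(usable, key=lambda p: p["iptm"])

lead = pick_lead(candidates)
print(lead["peptide"], lead["iptm"])
```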
Part 4: Generate Optimized Peptides with moPPIt
PepMLM generates peptides conditioned on the whole protein sequence, while moPPIt lets you target specific residues, giving you more control over where the peptide binds.
Open the moPPIt Colab linked from the HuggingFace moPPIt model card.
Make a copy and switch to a GPU runtime.
In the notebook:
Paste your A4V mutant SOD1 sequence.
Choose specific residue indices on SOD1 that you want your peptide to bind (for example, residues near position 4, the dimer interface, or another surface patch).
Set peptide length to 12 amino acids.
Enable motif and affinity guidance (and solubility/hemolysis guidance if available).
Generate peptides.
After generation, briefly describe how these moPPIt peptides differ from your PepMLM peptides. How would you evaluate these peptides before advancing them to clinical studies?
moPPIt peptides are more targeted than PepMLM peptides because they are designed to bind a specific region near the A4V mutation, rather than being conditioned on the whole protein. Before clinical studies, you would have to evaluate them using AlphaFold3 to check the predicted binding, then PeptiVerse for solubility and hemolysis, and eventually in vitro binding and toxicity tests.
Final Project: L-Protein Mutants
Objective: to improve the stability and autofolding of the lysis protein
Stage 1: Engineer novel L-protein mutants using protein design tools
Here, I ran the notebook to be able to “score” for what would happen to the protein if you mutated into another amino acid.
From the Colab:
I’ll be honest, I’m not seeing a lot of correlation between the 2 datasets. For example, position 50 has a high score for the mutation; however, in the provided dataset I’m seeing a 1 for protein level but not for lysis.
Stage 2: Synthesize the L-protein mutant gene via Twist
Stage 3: Clone the L-protein mutant gene into a plasmid using Gibson Assembly
Stage 4: Test the L-protein mutant’s structural integrity using the Nuclera system
Stage 5: Test the L-protein in E. coli with plaque assays
Week 6 HW: Genetic Circuits I
DNA Assembly
1. What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?
According to the New England Biolabs product page, Phusion High-Fidelity PCR Master Mix consists of:
Phusion DNA Polymerase
deoxynucleotides
a reaction buffer that has been optimized and includes MgCl2
The Phusion DNA polymerase synthesizes new DNA strands. It is able to “proofread” itself, which means it produces far fewer errors than other polymerases (hence why it is “high fidelity”). Deoxynucleotides are the molecules that make up DNA (i.e. A, T, G, C). These building blocks are used in the new DNA strands. The reaction buffer creates the right environment for this process to take place. It ensures there is the right pH and the right amount of ions.
2. What are some factors that determine primer annealing temperature during PCR?
The factors that determine primer annealing temperature during PCR are: primer length (the longer the primer, the more hydrogen bonds it forms with the template, and the higher the Tm) and the proportion of G-C base pairs (they form 3 hydrogen bonds as opposed to 2 for A-T, which means a higher Tm and therefore a higher annealing temperature). Mismatches are another factor: in the lab we are introducing some intentional mismatches, so there are fewer hydrogen bonds, the strands are less “sticky”, and we may need a lower annealing temperature.
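A rough way to see how length and GC content move the Tm is the Wallace rule of thumb, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C). This is only a quick estimate for short primers and is not the method used by vendor Tm calculators; the primer sequences below are made up for illustration.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)

def wallace_tm(primer):
    """Wallace rule estimate of Tm in degrees C: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc

# Hypothetical primers: same length, different GC content
for primer in ["ATATTAATTATATAATTA", "ATGCATGCATGCATGCAT", "GCGGCCGCGGGCCCGGCG"]:
    print(primer, f"GC={gc_fraction(primer):.0%}", f"Tm~{wallace_tm(primer)} C")
```

The three primers are all 18 nt long, so the spread in the estimates comes only from GC content; lengthening any of them raises the estimate further.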
3. There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.
PCR:
PCR is a more flexible process than restriction enzyme digests because you can design primers and introduce mutations as you exponentially amplify DNA.
Restriction enzyme digests:
This process is a bit more “rudimentary” than PCR because restriction enzymes can only cut DNA at their specific recognition sites. You can’t add new sequences or mutations; the enzymes just cut where their sites are. Restriction enzymes also only cut existing DNA; they don’t amplify anything, whereas PCR can both amplify and modify.
PCR is a powerful process that can be used if you want to make exponential copies of the DNA and introduce mutations or add Gibson overlaps for Gibson assembly. There can be polymerase errors though. A restriction enzyme digest is great if you know exactly where you want to cut and you want to avoid polymerase errors.
4. How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?
To ensure DNA sequences are appropriate for Gibson cloning, they have to carry the Gibson overlaps I mentioned above. As mentioned in the lab, the overlaps need to be “typically 20–40 bp of sequence identity between adjoining fragments” for the fragments to connect. And these, of course, need to match the cloning vector. You can also check the sequences to make sure the restriction enzymes have not cut any part of the DNA that is needed in the Gibson assembly, like the Gibson overlaps.
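The overlap requirement can be checked in silico before ordering anything. Below is a minimal sketch that finds the longest stretch shared between the end of one fragment and the start of the next and tests it against the 20 to 40 bp range quoted above. The function names and the example sequences are hypothetical.

```python
def shared_overlap(upstream, downstream, max_len=60):
    """Length of the longest suffix of `upstream` that equals a prefix of `downstream`."""
    limit = min(len(upstream), len(downstream), max_len)
    for size in range(limit, 0, -1):
        if upstream[-size:] == downstream[:size]:
            return size
    return 0

def gibson_ready(upstream, downstream, min_len=20, max_len=40):
    """True if the two fragments share an end overlap in the 20-40 bp range."""
    return min_len <= shared_overlap(upstream, downstream) <= max_len

# Made-up fragments that share a 25 bp overlap
overlap = "ACGTTGCAAGGCTTACCGATGCATC"
insert_fragment = "TTTTTTTTTT" + overlap
vector_fragment = overlap + "GGGGGGGGGG"

print(shared_overlap(insert_fragment, vector_fragment))  # length of shared end
print(gibson_ready(insert_fragment, vector_fragment))
```

For a circular assembly the same check has to be run for every junction, including the one between the last fragment and the first.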
5. How does the plasmid DNA enter the E. coli cells during transformation?
The plasmid DNA can enter the E. coli using different transformation methods. The most common are heat shock and electroporation. The heat or electric shock causes the cell membrane to temporarily “open up”, and the plasmid DNA is able to enter the cell through diffusion. Once the shock stops, the E. coli cells are able to “reseal” themselves and in that way trap the plasmid DNA inside.
6. Describe another assembly method in detail (such as Golden Gate Assembly)
a. Explain the other method in 5 - 7 sentences plus diagrams (either handmade or online).
Golden Gate Assembly is another method that uses restriction enzymes, specifically Type IIS enzymes, which cut the DNA next to the site that they recognize rather than inside it. This creates “sticky ends” which allow DNA sequences to attach together.
For a visual:
[Type IIS site] — [4-bp overhang A] — [Gene of your Choice] — [4-bp overhang B] — [Type IIS site]
After the restriction enzymes do their job you’re left with:
[sticky end A] — [Gene of your Choice] — [sticky end B]
To make sure the sticky ends of your choice properly bond with each other you have to use a ligase enzyme. Ligase isn’t involved in the hydrogen bonds that hold the sticky ends together; rather, it rebuilds the “spine” of the DNA that was cut by the enzyme (also called the sugar-phosphate backbone).
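To make the "cuts next to the site" idea concrete, here is a small sketch that reads off the 4-bp overhangs a Type IIS enzyme would leave. I am assuming BsaI as the enzyme (recognition site GGTCTC, cutting 1 nt downstream on the top strand and 5 nt on the bottom strand); the answer above does not name a specific enzyme, and the part sequence is made up.

```python
BSAI_SITE = "GGTCTC"     # BsaI recognition site, top strand
BSAI_SITE_RC = "GAGACC"  # the same site on the opposite strand

def bsai_overhangs(seq):
    """Return (orientation, 4-nt overhang) for each BsaI site, read on the top strand."""
    seq = seq.upper()
    overhangs = []
    # Forward site: GGTCTC N | NNNN, overhang starts 1 nt after the site
    start = seq.find(BSAI_SITE)
    while start != -1:
        cut = start + len(BSAI_SITE) + 1
        if cut + 4 <= len(seq):
            overhangs.append(("forward", seq[cut:cut + 4]))
        start = seq.find(BSAI_SITE, start + 1)
    # Reverse site: NNNN N GAGACC, overhang ends 1 nt before the site
    start = seq.find(BSAI_SITE_RC)
    while start != -1:
        if start - 5 >= 0:
            overhangs.append(("reverse", seq[start - 5:start - 1]))
        start = seq.find(BSAI_SITE_RC, start + 1)
    return overhangs

# Hypothetical part: [site] N [overhang A] [gene] [overhang B] N [site]
part = "GGTCTC" + "A" + "AATG" + "CATCATCAT" + "GCTT" + "T" + "GAGACC"
print(bsai_overhangs(part))
```

Because the overhang sequence is whatever sits next to the site, each junction can be given its own 4-bp overhang, which is what lets several parts assemble in a fixed order.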
Asimov Kernel
Note: Waiting for Asimov Kernel login
Week 7 HW: Genetic Circuits II - Neuromorphic Circuits
Intracellular Artificial Neural Networks (IANNs)
1. What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?
IANNs are interesting and different from traditional genetic circuits because their inputs are continuous molecular concentrations. So a basic artificial neuron can take 2 inputs and produce an output that keeps changing as the concentrations of the inputs change. They are therefore able to handle more complexity than traditional genetic circuits; you might need multiple Boolean genetic circuits to perform a calculation that a single IANN can do.
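The difference between a Boolean gate and an analog neuron can be sketched in a few lines. This is a toy model under my own assumptions: the weights, the units, and the ReLU-style cutoff are illustrative and are not taken from the lecture or from measured data. The negative weight stands in for an input that represses the output, such as an endoribonuclease that cleaves the output mRNA.

```python
def boolean_gate(x1_present, x2_present):
    """Boolean view: output is ON only if X2 is present and the repressor X1 is absent."""
    return x2_present and not x1_present

def analog_neuron(x1, x2, w1=-1.0, w2=1.0, bias=0.0):
    """Analog view: weighted sum of input concentrations, clipped at zero."""
    return max(0.0, w1 * x1 + w2 * x2 + bias)

# Sweep the repressing input X1 while X2 stays fixed (arbitrary units)
for x1 in [0.0, 0.2, 0.5, 0.8, 1.0]:
    print(f"x1={x1:.1f}",
          "boolean:", boolean_gate(x1 > 0, True),
          "analog:", round(analog_neuron(x1, 1.0), 2))
```

The Boolean gate flips off as soon as any X1 is present, while the analog neuron gives a graded output that shrinks as X1 rises.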
2. Describe a useful application for an IANN; include a detailed description of input/output behavior, as well as any limitations an IANN might face to achieve your goal.
Building off of what was discussed in the lecture, I think IANNs would be really useful in helping with autoimmune conditions. In autoimmune conditions, the immune system is overreactive and starts attacking the body’s own healthy cells. An IANN could be designed to differentiate between a normal immune response (which we would want to keep) and an unhealthy immune self-attack (which we would want to stop). At the moment, to alleviate certain autoimmune conditions we use immunosuppressants, which reduce the function of the whole immune system and leave people immunocompromised and more susceptible to illnesses. An IANN would be a way to be more specific: to identify problem areas without shutting down a person’s immune system as a whole.
Using AI and IANNs we could discover a combination or range of different signals/proteins in the body that would differentiate between an attack on a real pathogen and an attack on harmless human tissue. Candidate signals are already known and lie in differences in interferon signature, TNF/IFN ratio, and IL-10 resolution. Machine learning on autoimmune data would be helpful here to set the exact “weights” that would be used in the IANN.
I think the limitation is that these IANNs would have to be personalized to each person’s biology, and that would make this therapy expensive to create. Hopefully, as medical innovation progresses, the price of treatment would come down.
3. Below is a diagram depicting an intracellular single-layer perceptron where the X1 input is DNA encoding for the Csy4 endoribonuclease and the X2 input is DNA encoding for a fluorescent protein output whose mRNA is regulated by Csy4. Tx: transcription; Tl: translation.
Draw a diagram for an intracellular multilayer perceptron where layer 1 outputs an endoribonuclease that regulates a fluorescent protein output in layer 2.
In this diagram X1 is DNA encoding Csy4, which cleaves the mRNA of an additional endoribonuclease encoded by X2.
That second endoribonuclease is the output of layer 1, and it regulates the fluorescent protein, which is the output of layer 2.
References:
Cantaert T, Baeten D, Tak PP, van Baarsen LG. Type I IFN and TNFα cross-regulation in immune-mediated inflammatory disease: basic concepts and clinical relevance. Arthritis Res Ther. 2010;12(5):219. doi: 10.1186/ar3150. Epub 2010 Oct 28. PMID: 21062511; PMCID: PMC2991015. https://pmc.ncbi.nlm.nih.gov/articles/PMC2991015/
Kyogoku C, Smiljanovic B, Grün JR, Biesen R, Schulte-Wrede U, Häupl T, Hiepe F, Alexander T, Radbruch A, Grützkau A. Cell-specific type I IFN signatures in autoimmunity and viral infection: what makes the difference? PLoS One. 2013 Dec 31;8(12):e83776. doi: 10.1371/journal.pone.0083776. PMID: 24391825; PMCID: PMC3877094. https://pmc.ncbi.nlm.nih.gov/articles/PMC3877094/
Fungal Materials
1. What are some examples of existing fungal materials and what are they used for? What are their advantages and disadvantages over traditional counterparts?
Fungal materials can be used for many things! At the moment they are being used as eco-friendly packaging (mostly for luxury products), and there is also a company that uses them for acoustic panels, since their styrofoam-like properties give them the ability to absorb sound waves. Research is also being done on using them in water filtration systems, since they have a porous, mesh-like texture when dried. They are also being used in construction. Their advantage is that they form strong, lightweight materials which can easily be grown and formed into any shape. One disadvantage is that they take a long time to grow: you have to wait for the mycelium to colonize and fill out the container you’ve placed it in. Another disadvantage is that without any post-processing, fungal materials can easily disintegrate, especially when in contact with water.
2. What might you want to genetically engineer fungi to do and why? What are the advantages of doing synthetic biology in fungi as opposed to bacteria?
I think I would want to genetically engineer fungi to help them break down PET plastics a bit more efficiently. Oyster mushroom mycelium already has the ability to do this and it would be great to boost this ability. There is also research on fungal batteries, where different fungal strains can be used as anodes and cathodes and transfer electrons via their hyphae. One of the advantages of doing synthetic biology in fungi is that fungi are a bit more stable than bacteria, which need more specific environmental conditions to thrive. They also tend to be seen as “safer” organisms than bacteria. This might help with regulation and biosecurity issues.
First DNA Twist Order
Created a Twist Order that was added to the LifeFabs spreadsheet but ultimately not ordered.
It was a way to boost the electron transfer capabilities of Shewanella oneidensis by using a pLacI promoter to overexpress mtrC, a key protein in this process.
However, since Shewanella is understudied I was unable to cross-check the genetic sequence that was used in the paper, which made me not want to pursue this direction further.
This ended up being the mock Twist Order:
The insert is the mtrC gene sequence and the vector backbone is pET-28a, which was modified with pLacI for constitutive expression.
Week 8 HW: Cell-Free Systems
General and Lecturer-Specific Questions
1. Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.
There are many advantages of cell-free protein synthesis over traditional in vivo methods. Cell-free systems give you more control over the conditions of your experiment (i.e. DNA, proteins, small molecules, etc.), unlike in vivo methods, where things are a bit more of a “black box”.
Two cases where cell-free is more beneficial:
i. Cell-free is more beneficial when it comes to rapid prototyping of metabolic pathways. In the lecture it was mentioned that a four-enzyme pathway that would have taken weeks of cloning and transformation with an in vivo method could take about an hour in a cell-free system.
ii. Cell-free systems can be more affordable. For example, there are cell-free freeze dried systems that cost under a dollar and can therefore be used in low-resource communities to detect Ebola, Zika or other diseases.
2. Describe the main components of a cell-free expression system and explain the role of each component.
A cell-free expression system essentially mimics what is going on inside a bacterium. You scoop out the ribosomes, polymerases, and tRNAs, and then you add components like amino acids and an energy source.
Below are the main components in detail:
DNA template - the DNA for the protein you want to make.
Amino acids - the building blocks for protein synthesis.
tRNAs - small RNA molecule that decodes mRNA and delivers amino acids to ribosome.
Energy source - something that powers the transcription and translation (could be ATP regeneration)
Cell extract - this is what includes the ribosomes and RNA polymerases (PURExpress is a related system, reconstituted from purified components instead of a crude extract)
3. Why is energy provision regeneration critical in cell-free systems? Describe a method you could use to ensure continuous ATP supply in your cell-free experiment.
Transcription and translation cost energy. In a cell-free system the reaction quickly uses up its ATP and GTP; it’s designed more for one-time use. To avoid this, there are ways to create a regenerative ATP supply. One of them is a phosphoketolase (PKT) system, which uses a pathway of enzymes to phosphorylate ADP back into usable ATP.
4. Compare prokaryotic versus eukaryotic cell-free expression systems. Choose a protein to produce in each system and explain why.
A prokaryotic cell-free expression system is fairly straightforward, it can be E.coli-based which is already highly understood. It is high-yield and cheaper. You could produce GFP in this system and it would be quick and efficient.
A eukaryotic cell-free system is useful if you need post-translational modifications. There are extra components in eukaryotic cells, like the endoplasmic reticulum, which attaches sugar chains to the proteins being expressed. There are also chaperones that help the protein fold correctly. It is more expensive and has a lower yield than a prokaryotic cell-free system.
5. How would you design a cell-free experiment to optimize the expression of a membrane protein? Discuss the challenges and how you would address them in your setup.
Membrane proteins normally live inside cell membranes (of course), so the challenge when you try to make them in a cell-free system is that they can clump together because there’s no membrane for them to fold into. A way to fix this is to add lipids (like lipid vesicles) to create a sort of artificial membrane. This gives the membrane proteins somewhere to go.
6. Imagine you observe a low yield of your target protein in a cell-free system. Describe three possible reasons for this and suggest a troubleshooting strategy for each.
One of the issues could be degradation of mRNA by RNases. A way to get around this is to add more Murine RNase Inhibitor, as was mentioned in the lab. Another issue could be a deficiency in the energy supply. A way to fix this would be to increase energy like in the form of 3-PGA or PEP. A third issue could be plasmid quality, maybe there is an issue with the T7 promoter. It would be good in that case to verify the plasmid by running it in a gel.
Homework question from Kate Adamala
Design an example of a useful synthetic minimal cell as follows:
1. Pick a function and describe it.
a. What would your synthetic cell do? What is the input and what is the output?
I would want to make a synthetic cell that would detect heavy metals in the soil (inspired by my earlier homeworks about Shewanella and Geobacter bacteria). The input would be the heavy metal ions that diffuse into the cells. The output would be a chromoprotein or maybe GFP that would be expressed if the heavy metals were detected.
b. Could this function be realized by cell-free Tx/Tl alone, without encapsulation?
It could be. Without encapsulation the cell-free system is more one-shot (not reusable). So it would functionally still work, but it would be more useful if the cell-free system had a membrane to better protect it from the soil elements. Although technically it’s supposed to be used as more of a one shot because once the GFP proteins are expressed you can’t reuse it. But you might want the sensor to last a while until it does detect something.
c. Could this function be realized by genetically modified natural cell?
Totally. E. coli and Shewanella have been genetically modified with metal-detecting promoters to respond to heavy metals. Of course, when working with natural cells you have to make sure that the environmental conditions are right in order to keep the natural cell alive.
d. Describe the desired outcome of your synthetic cell operation.
The idea would be to mix these freeze-dried synthetic cells with soil and water. Then, if heavy metals are detected, the cells will change color. In the case of GFP you will have to shine blue or UV light on the sample in order to see the change. There could be a variety of synthetic cells that turn different colors in response to different toxic metals. Since the biosecurity risks are very limited compared with living natural cells, these detectors could be used by non-scientists (like farmers) to detect issues with their soil.
2. Design all components that would need to be part of your synthetic cell.
a. What would be the membrane made of?
The membrane would be a classic synthetic minimal cell membrane: so by that I mean that it would be a lipid membrane made with cholesterol and phospholipids. This would help protect the synthetic cell while still letting the heavy metal molecules enter.
b. What would you encapsulate inside? Enzymes, small molecules.
You would have to use all the mastermix components like ribosomes, enzymes, small molecules, tRNAs, etc. Then of course you would need a toxic metal responsive promoter and either the chromoprotein or a GFP.
c. Which organism your Tx/Tl system will come from? Is bacterial OK, or do you need a mammalian system for some reason? (hint: for example, if you want to use small molecule modulated promotors, like Tet-ON, you need mammalian)
A bacterial system will work fine here! I don’t have any specific need for a mammalian system. Metal-responsive promoters are very well understood in bacterial systems, so no need to switch it up.
d. How will your synthetic cell communicate with the environment? (hint: are substrates permeable? or do you need to express the membrane channel?)
I won’t need any special membrane channel proteins, like the ones Kate Adamala mentioned in their lecture. For my design the heavy metal ions are small enough to diffuse through the synthetic cell membrane.
Experimental details
a. List all lipids and genes. (bonus: find the specific genes; for example, instead of just saying “small molecule membrane channel” pick the actual gene.)
The setup is pretty straightforward, nothing crazy.
As mentioned in the previous questions here are my lipids and genes:
Lipids - phospholipids, cholesterol
Genes - chromoprotein or GFP under a metal-responsive promoter (MerR for mercury, ArsR for arsenic)
TX-TL system - a lyophilized E. coli cell extract with basic master mix
b. How will you measure the function of your system?
I would use a plate reader or a UV light to monitor the output of the system. If the sample is bright and fluorescent, that would show that toxic metals were detected in the soil.
References:
Lentini, R. et al., 2014. Communicating with the outside world from within a lipid vesicle. Nature Communications, 5, p.4012. Available at: https://www.nature.com/articles/ncomms5012
Homework question from Peter Nguyen
Summary Pitch: The development of freeze-dried cell-free systems that are embedded into building materials and are activated by water that seeps through cracks in ceilings or walls. The water triggers production of the urease enzyme, which starts a biocementation process that seals the crack before it spreads.
The freeze-dried cell-free systems will be mixed in with the concrete, so that they are distributed throughout the mixture. When water touches the freeze-dried systems, production of the urease enzyme begins. Urease breaks down urea (also in the material) into ammonia and CO2. This raises the pH and provides carbonate, which then bonds with calcium in the material to create limestone (calcium carbonate), sealing the crack.
This addresses a common construction issue that affects many urban areas. Building materials are constantly degrading due to environmental exposure, so this will help extend the longevity of materials.
The cell-free solution for this issue does have a big limitation in that it is a one-time use. Using freeze-dried cell free systems to create self-healing concrete will not create a perpetually self-healing material. If there is a recurring water issue that causes a crack then the cell-free system will increase the longevity of the material but not solve the issue permanently.
Homework question from Ally Huang
1. Provide background information that describes the space biology question or challenge you propose to address. Explain why this topic is significant for humanity, relevant for space exploration, and scientifically interesting.
A challenge in space biology is that DNA can degrade when exposed to the high levels of cosmic radiation that exist outside of Earth’s atmosphere. There is no easy way to detect how much radiation is affecting the DNA of astronauts. We could use the BioBits cell-free system to test DNA that has been exposed to radiation and compare it to regular control DNA to get a better understanding of how the astronauts are being impacted.
2. Name the molecular or genetic target that you propose to study. Examples of molecular targets include individual genes and proteins, DNA and RNA sequences, or broader -omics approaches. (Maximum 30 words)
The integrity of DNA templates will be studied by encoding for GFP after exposure to space radiation (and comparing with the control DNA), using fluorescence output from BioBits cell-free expression as the readout.
3. Describe how your molecular or genetic target relates to the space biology question or challenge your proposal addresses. (Maximum 100 words)
DNA damage caused by radiation in space would pose a big health risk to astronauts. By tracking how much radiation damage accumulates over time, we can better plan space missions to protect the health of space travelers. A great next step would be understanding how to conduct DNA repair while in space, but at the moment I am working within the constraints of the BioBits kit.
4. Clearly state your hypothesis or research goal and explain the reasoning behind it. (Maximum 150 words)
The hypothesis is that DNA exposed to space will produce less fluorescent protein in the BioBits cell-free system. I also believe that the longer the DNA is in space, the more radiation damage it will receive and the less protein it will be able to express over time. This knowledge of when radiation damage occurs to DNA could help researchers and scientists develop better preventative measures for astronauts’ protection.
5. Outline your experimental plan - identify the sample(s) you will test in your experiment, including any necessary controls, the type of data or measurements that will be collected, etc. (Maximum 100 words)
While on Earth, the idea would be to test a GFP-encoding DNA template using the BioBits cell-free system. This will create a fluorescence baseline as the control for the experiment. The same DNA template will then be put onboard a spacecraft. Then, at regular intervals throughout the mission (weekly?), astronauts will run the same reaction again. They will rehydrate the freeze-dried BioBits kit with water, then incubate, and finally measure the fluorescence using the P51 Molecular Fluorescence Viewer. A decrease in fluorescence compared to the Earth baseline will give a relative measure of DNA damage from cosmic radiation. It will help to give an idea of how DNA integrity changes the further and longer we travel from Earth.
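As a rough sketch of how the comparison to the Earth baseline could be done, here is a small Python example. It assumes the fluorescence can be turned into a number (for example from a photo of the tube in the viewer), and the readings, the `blank` value, and the function name `relative_fluorescence` are all made up for illustration.

```python
def relative_fluorescence(sample, baseline, blank=0.0):
    """Return blank-corrected sample fluorescence as a fraction of the baseline."""
    corrected_baseline = baseline - blank
    if corrected_baseline <= 0:
        raise ValueError("baseline must be brighter than the blank")
    return (sample - blank) / corrected_baseline


# Hypothetical readings in arbitrary fluorescence units (not real data).
earth_baseline = 1000.0
blank = 50.0
weekly_readings = [980.0, 905.0, 810.0]

fractions = [relative_fluorescence(r, earth_baseline, blank) for r in weekly_readings]
for week, frac in enumerate(fractions, start=1):
    print(f"week {week}: {frac:.1%} of Earth baseline")
```

A downward trend in these fractions over the mission would be the relative measure of damage described in the plan.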
Please identify at least one (ideally many) aspect(s) of your project that you will measure. It could be the mass or sequence of a protein, the presence, absence, or quantity of a biomarker, etc. Please describe all of the elements you would like to measure, and furthermore describe how you will perform these measurements. What are the technologies you will use (e.g., gel electrophoresis, DNA sequencing, mass spectrometry, etc.)? Describe in detail.
I’ll need a way to check that the Shewanella host actually carries my modified DNA. I’ll also need to measure the electron output/electrical current produced by the Shewanella (modified and unmodified). In a more basic sense, I’ll need to measure the amount of nitrate that I put in the soil for testing, and also have a way to test the nitrate in the soil, to check if it correlates with the changes in electron output by the Shewanella.
To check which Shewanella have been genetically modified I can use gel electrophoresis to make sure the DNA fragments are the lengths they should be, which will tell me that I have the correct DNA. I can also sequence the DNA using Sanger sequencing to confirm directly that it is the right DNA.
For the electron current measurement I should use a potentiostat, which can detect small levels of current. From that I can also calculate the power density. This should give me a good idea of what type of simple circuit could be powered by this.
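Here is a minimal sketch of the power density calculation, assuming power is current times voltage (P = I × V) divided by the electrode area. The current, voltage, and electrode area below are hypothetical numbers, not measurements.

```python
def power_density(current_a, voltage_v, electrode_area_m2):
    """Power density in W/m^2: P = I * V, divided by the electrode area."""
    if electrode_area_m2 <= 0:
        raise ValueError("electrode area must be positive")
    return current_a * voltage_v / electrode_area_m2


# Hypothetical example: 50 microamps at 0.3 V on a 10 cm^2 electrode.
example_density = power_density(current_a=50e-6, voltage_v=0.3, electrode_area_m2=10e-4)
print(f"{example_density * 1000:.1f} mW/m^2")
```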
Measuring the nitrate can be done with color-based chemical test kits called “colorimetric nitrate assay kits”, which are easy to purchase. One is pictured below.
Waters Part I — Molecular Weight
1. Based on the predicted amino acid sequence of eGFP (see below) and any known modifications, what is the calculated molecular weight?
Using the online calculator, the theoretical pI (isoelectric point) is 5.90 and the theoretical Mw (molecular weight) is 28,006.60 Da.
2. Calculate the molecular weight of the eGFP using the adjacent charge state approach described in the recitation
a. Determine z for each adjacent pair of peaks
I'm going to choose the 2 adjacent peaks:
903.7148 and 933.7349
By plugging that into the formula I get:
z = 903.7148 / (933.7349 − 903.7148)
z = 903.7148 / 30.0201
z = 30.10 ≈ 30 (this is the charge of the 933.7349 peak; the 903.7148 peak carries one more charge)
b. Determine the MW of the protein using the relationship between m/z_n, MW, and z.
1.008 Da is the mass of a single proton (H+)
So we take this formula: m/z_n = (MW + z × 1.008) / z
and then we can rearrange it to get: MW = z × (m/z_n) − z × 1.008
MW = 30 × 933.7349 − 30 × 1.008
= 28,012.05 − 30.24
= 27,981.81 Da
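The two steps above can be sketched in Python. This version uses a slightly more exact form of the charge formula (it subtracts the proton mass in the numerator), which is my own assumption about the derivation rather than the exact recitation formula; it still rounds to the same z. The function names are just for illustration.

```python
PROTON_MASS = 1.008  # Da, the value used in the calculation above


def charge_from_adjacent_peaks(mz_low, mz_high):
    """Estimate the charge z of the higher-m/z peak from two adjacent charge states.

    The lower-m/z peak carries z + 1 charges, which gives
    z = (mz_low - PROTON_MASS) / (mz_high - mz_low).
    """
    return (mz_low - PROTON_MASS) / (mz_high - mz_low)


def neutral_mass(mz, z):
    """Neutral molecular weight from m/z and charge: MW = z * (m/z) - z * proton."""
    return z * mz - z * PROTON_MASS


z_estimate = charge_from_adjacent_peaks(903.7148, 933.7349)
z = round(z_estimate)
mw = neutral_mass(933.7349, z)
print(f"z estimate = {z_estimate:.2f}, z = {z}, MW = {mw:,.2f} Da")
```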
c. Calculate the accuracy of the measurement using the deconvoluted MW from 2.2 and the predicted weight of the protein from 2.1 using the formula.
28,006.60 Da from the calculator and 27,981.81 Da from the calculations.
Accuracy = abs(27,981.81 − 28,006.60) / 28,006.60
= 24.79 / 28,006.60
= 0.00089
or an error of 0.089% (about 885 ppm).
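A quick way to double check this arithmetic is a small helper that returns the relative error, which can then be shown as a percentage or in ppm. The function name `relative_error` is just my own label.

```python
def relative_error(measured, theoretical):
    """Absolute difference between measured and theoretical, as a fraction of theoretical."""
    return abs(measured - theoretical) / theoretical


error = relative_error(measured=27981.81, theoretical=28006.60)
error_percent = error * 100
error_ppm = error * 1_000_000
print(f"{error_percent:.3f}% ({error_ppm:.0f} ppm)")
```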
Waters Part III — Peptide Mapping - primary structure
1. How many Lysines (K) and Arginines (R) are in eGFP? Please circle or highlight them in the eGFP sequence given in Waters Part I question 1 above.
By putting the eGFP sequence into Benchling we can see that there are 20 Lysines (K) and 6 Arginines (R):
However, by manually counting them I get 16 Lysines (K) and 6 Arginines (R).
2. How many peptides will be generated from tryptic digestion of eGFP?
a. Navigate to https://web.expasy.org/peptide_mass/
b. Copy/paste the sequence above into the input box in the PeptideMass tool to generate expected list of peptides.
c. Use Figure 4 below as a guide for the relevant parameters to predict peptides from eGFP.
d. Click “Perform the Cleavage” button in the PeptideMass tool and report the number of peptides generated when using trypsin to perform the digest.
Here is the list of generated peptides given by Expasy:
When using trypsin to perform the digest we end up with about 19 peptides.
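To cross-check the counts without a web tool, here is a minimal sketch of an in-silico tryptic digest. It assumes the common rule that trypsin cuts after K or R unless the next residue is P, with no missed cleavages. The PeptideMass tool may also filter peptides by mass, so its count can differ from this. The short `toy_sequence` is a made-up example, not eGFP.

```python
import re


def count_k_and_r(sequence):
    """Count lysine (K) and arginine (R) residues in a one-letter protein sequence."""
    return sequence.count("K"), sequence.count("R")


def tryptic_digest(sequence):
    """Split after every K or R, except when the next residue is P."""
    # Zero-width split points: preceded by K/R and not followed by P.
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if p]


toy_sequence = "MKAAARPGGKTTR"  # hypothetical peptide, for illustration only
lysines, arginines = count_k_and_r(toy_sequence)
peptides = tryptic_digest(toy_sequence)
print(lysines, arginines, peptides)
```

Pasting the real eGFP sequence in place of `toy_sequence` would give a count to compare against Benchling and the manual count.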
Based on the LC-MS data for the Peptide Map data generated in lab (please use Figure 5a as a reference) how many chromatographic peaks do you see in the eGFP peptide map between 0.5 and 6 minutes? You may count all peaks that are >10% relative abundance.
There are about 23 chromatographic peaks in the eGFP peptide map (between 0.5 and 6 minutes).
Assuming all the peaks are peptides, does the number of peaks match the number of peptides predicted from question 2 above? Are there more peaks in the chromatogram or fewer?
There are slightly more peaks than peptides! But this might also be because I counted a cluster of peaks as multiple separate peaks. If I didn’t do that and clumped them into one peak it would be a similar number to the count of peptides.
Identify the mass-to-charge (m/z) of the peptide shown in Figure 5b. What is the charge (z) of the most abundant charge state of the peptide (use the separation of the isotopes to determine the charge state). Calculate the mass of the singly charged form of the peptide ([M+H]+) based on its m/z and z.
The tallest peak (and therefore most abundant charge state) is at m/z = 525.76712.
Calculating the mass of the singly charged form of the peptide is a bit trickier.
[M+H]+ is basically just describing the peptide if it carried just one charge (the +).
Okay, this is a bit confusing, but in the zoomed-in part of the spectrum we can see that the isotope peaks are approximately 0.5 m/z units apart from each other. For a given charge state, isotope peaks are spaced 1/z apart.
We can then solve this equation by doing:
spacing = 1/z
0.5 = 1/z
z = 1/0.5 = 2
z = 2
(M + z)/z = 525.76712, where M is the neutral mass of the peptide and each of the z protons adds about 1 Da
(M + 2)/2 = 525.76712
(M + 2) = 1051.53
M = 1049.53
Therefore:
[M+H]+ = M+1 --> This is because H is one proton, and a proton weighs 1 Da.
As a result:
M + 1 = 1050.53 Da
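The same steps can be written as a short Python sketch. It runs the calculation twice: once with the proton mass rounded to 1 Da as in the working above, and once with the more precise 1.00728 Da, to show how much the rounding matters. The function names are my own.

```python
PROTON_MASS = 1.00728  # Da; the working above rounds this to 1


def charge_from_isotope_spacing(spacing):
    """Isotope peaks of one charge state sit 1/z apart, so z = 1 / spacing."""
    return round(1 / spacing)


def singly_charged_mass(mz, z, proton=PROTON_MASS):
    """Convert an observed m/z at charge z to the [M+H]+ mass."""
    neutral = z * mz - z * proton
    return neutral + proton


z = charge_from_isotope_spacing(0.5)
mh_rounded = singly_charged_mass(525.76712, z, proton=1.0)
mh_precise = singly_charged_mass(525.76712, z)
print(f"z = {z}, [M+H]+ = {mh_rounded:.2f} (proton = 1) or {mh_precise:.4f} (proton = 1.00728)")
```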
Identify the peptide based on comparison to expected masses in the PeptideMass tool. What is mass accuracy of measurement? Please calculate the error in ppm.
I found the closest matching peptide based on calculated mass:
From the fragmentation spectrum we have MW_Experiment = 1050.52438
From the PeptideMass tool we have MW_Theory = 1050.5214
By plugging the unrounded values into the formula we get:
|1050.52438 − 1050.5214| divided by 1050.5214
= 0.0000028
In ppm that would be about 2.8 ppm!
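Here is a small sketch of the matching step, using the unrounded experimental and theoretical values listed above (rounding to two decimals first makes the ppm number much less meaningful). Only the first candidate mass comes from the PeptideMass output quoted here; the other two candidates are made-up placeholders so the search has something to compare against.

```python
def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1_000_000


def closest_match(measured, candidates):
    """Return the candidate mass nearest to the measured mass."""
    return min(candidates, key=lambda mass: abs(mass - measured))


measured_mass = 1050.52438
# First value is from the text; the other two are hypothetical placeholders.
candidate_masses = [1050.5214, 982.4993, 1266.5783]

best = closest_match(measured_mass, candidate_masses)
error = ppm_error(measured_mass, best)
print(f"closest theoretical mass: {best}, error: {error:.1f} ppm")
```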
What is the percentage of the sequence that is confirmed by peptide mapping? (see Figure 6)
According to Figure 6 the percentage sequence that was detected by the peptide mapping is 88%.
So basically this means that the peptide mapping detected about 216 of 246 total amino acids that make up eGFP.
Waters Part IV — Oligomers
We will determine Keyhole Limpet Hemocyanin (KLH)’s oligomeric states using charge detection mass spectrometry (CDMS). CDMS single-particle measurements of KLH allow us to make direct mass measurements to determine what oligomeric states (that is, how many protein subunits combine) are present in solution. Using the known masses of the polypeptide subunits (Table 1) for KLH, identify where the following oligomeric species are on the spectrum shown below from the CDMS (Figure 7):
7FU Decamer
The 7FU subunit has a mass of 340 kDa
Decamer refers to 10 units, so you multiply 340 kDa by 10.
Which is 3,400 kDa or 3.4 MDa.
So it’s at intensity 50 on the graph.
8FU Didecamer
The 8FU has a mass of 400 kDa
Didecamer refers to 20 units, so multiply 400 kDa by 20:
20 × 400 kDa = 8,000 kDa
= 8.0 MDa
Which roughly corresponds to the tallest peak in the graph.
8FU 3-Decamer
400 kDa × 30
= 12,000 kDa
= 12.0 MDa
Which corresponds to the 12.67 peak, one of the last ones on the graph.
8FU 4-Decamer
400 kDa × 40
= 16,000 kDa = 16.0 MDa
It’s not labeled on the graph, but I’m pretty sure this is the last visible peak.
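The oligomer masses above all follow the same pattern (subunit mass times number of copies), so they can be tabulated with a few lines of Python. The subunit masses are the ones quoted above; the dictionary and function names are just my own labels.

```python
SUBUNIT_MASS_KDA = {"7FU": 340, "8FU": 400}  # subunit masses quoted above


def oligomer_mass_mda(subunit, copies):
    """Expected oligomer mass in MDa: subunit mass (kDa) times copy number."""
    return SUBUNIT_MASS_KDA[subunit] * copies / 1000


species = {
    "7FU decamer": ("7FU", 10),
    "8FU didecamer": ("8FU", 20),
    "8FU 3-decamer": ("8FU", 30),
    "8FU 4-decamer": ("8FU", 40),
}

expected = {name: oligomer_mass_mda(sub, n) for name, (sub, n) in species.items()}
for name, mass in expected.items():
    print(f"{name}: {mass:.1f} MDa")
```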
Waters Part V — Did I make GFP?
Please fill out this table with the data you acquired from the lab work done at the Waters Immerse Lab in Cambridge, or else the data screenshots in this document if you were unable to have lab work done at Waters.
| | Theoretical | Observed/measured on the Intact LC-MS | PPM Mass Error |
| --- | --- | --- | --- |
| Molecular weight (Da) | 27,988.96 | 27,981.81 | ~255 ppm |
From Expasy I can see that the theoretical (monoisotopic) mass is 27,988.96 Da.
From Question 2 I calculated 27,981.81 Da, so that’s what I’m using for the observed value since I didn’t get the lab work done at Waters.
abs(27,981.81 − 27,988.96) / 27,988.96 × 1,000,000
= 7.15 / 27,988.96 × 1,000,000
= ~255 ppm
Week 10 HW: Bioproduction & Cloud Labs
Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork
So I forgot to take a photo but I submitted one green pixel that was above a red heart in the bottom left quadrant. This was early on, when I first received the link to the Pixel Artwork Canvas.
However I’m pretty sure the whole heart got erased and replaced multiple times over by the time the deadline ended! haha
Here’s the final output (maybe my pixel ended up being used as some of the green background for “Love HTGAA”?):
It’s a cool idea but I think you should only be able to modify one pixel one time.
Some people wrote scripts to try to get a certain image they want and that kind of defeats the purpose of collaboratively making art.
Part B: Cell-Free Protein Synthesis | Cell-Free Reagents
Referencing the cell-free protein synthesis reaction composition (the middle box outlined in yellow on the image above, also listed below), provide a 1-2 sentence description of what each component’s role is in the cell-free reaction.
Component Roles
E. coli Lysate
BL21 (DE3) Star lysates - will contain the protein synthesizing components of E. coli cells such as ribosomes, tRNA, elongation factors and T7 RNA Polymerase.
Salts & Buffers
Potassium glutamate - is used to control ionic strength and stabilize ribosome complexes.
HEPES-KOH buffer - maintains optimal pH conditions for enzyme activity.
Magnesium glutamate - is an ion source for magnesium which serves as a cofactor in both ribosomal activity and enzyme activity.
Phosphoric acids (monobasic/phosphate) - serve as additional pH buffers and also contribute phosphate groups for the energy regeneration steps.
Energy/Nucleotide Regeneration Systems
Ribose - is fed into the pentose phosphate pathway to generate new NTPs from existing NMPs, allowing the system to run for longer time frames.
Glucose - can be converted by glycolytic pathways to produce ATP for the overall process.
AMP, CMP, GMP, and UMP - are available for conversion to their corresponding diphosphate or triphosphate forms to support transcription.
Guanine - is also provided in its free form to help circumvent limitations on generating GMP during the reaction.
Components of Translation Mixture (Amino Acids)
The 17 amino acid mix plus tyrosine and cysteine make up all of the necessary materials required for peptide bond formation during translation.
Separate additions of tyrosine and cysteine were made because they have limited solubility in aqueous solutions and cysteine was found to oxidize upon standing.
Additional Components
Nicotinic acid - has been included to maintain adequate levels of NAD+ in order to keep all redox dependent metabolic reactions proceeding as needed.
Backfilling
Nuclease-free water - used to bring the reaction mixture to its desired final volume
Part C: Planning the Global Experiment | Cell-Free Master Mix Design
Given the 6 fluorescent proteins we used for our collaborative painting, identify and explain at least one biophysical or functional property of each protein that affects expression or readout in cell-free systems. (Hint: options include maturation time, acid sensitivity, folding, oxygen dependence, etc) (1-2 sentences each)
sfGFP - sfGFP was engineered to fold better and aggregate less than normal GFP. This is an advantage in cell-free systems, where the chaperones that help misfolded proteins refold in living cells may be scarce or absent.
mRFP1 is a slow-maturing FP, therefore many of the translated proteins may never achieve their fluorescent state during a cell free reaction due to the consumption of O2 as the reaction progresses.
mKO2 is very sensitive to oxygen levels required to complete chromophore maturation. Therefore, in longer cell free reactions as the available oxygen levels decrease, this can lead to incomplete maturation of chromophores and reduced fluorescence levels.
mTurquoise2 - Due to a slow forming chromophore reaction with oxygen, mTurquoise2 matures slowly and therefore, some portion of the expressed protein may not exhibit fluorescence when measured at the end of the reaction.
mScarlet-I - The T74I mutation in mScarlet-I allows it to mature significantly faster than the original mScarlet. However, this comes at the cost of a lower quantum yield. While this is somewhat of a trade-off, in cell-free systems, where rapid maturation is advantageous, it is likely still a beneficial mutation.
Electra2 - Since blue FPs such as Electra2 produce inherently less light than green or red FPs, the fluorescence produced by Electra2 will be weaker, even if it is well-expressed relative to the other FPs on the plate.
Create a hypothesis for how adjusting one or more reagents in the cell-free mastermix could improve a specific biophysical or functional property you identified above, in order to maximize fluorescence over a 36-hour incubation. Clearly state the protein, the reagent(s), and the expected effect.
The time it takes for mKO2 to mature and begin emitting light is approximately 108 minutes, according to FPbase. It can’t be shortened. However, if the cell-free system produces more mKO2 protein, then more of it will have finished maturing and be emitting light at any point throughout the 36-hour experimental window.
So my hypothesis is that if you add more magnesium glutamate (to help ribosome function, which leads to more translation) and increase the 17 amino acid mix (additional building blocks for the proteins the ribosomes make) in the master mix, then you will record a higher fluorescence signal.
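The reasoning behind this hypothesis can be sketched with a toy model. It assumes protein is made at a constant rate and matures with first-order kinetics, and it treats the 108 minutes as a maturation half-time. Both are simplifying assumptions on my part (a real cell-free reaction slows down as resources run out), and the production rates are arbitrary units.

```python
import math

MATURATION_HALF_TIME_MIN = 108.0  # value from the text, treated here as a half-time (assumption)


def mature_protein(t_min, production_rate, half_time=MATURATION_HALF_TIME_MIN):
    """Mature protein after t_min minutes of constant production.

    Immature protein is made at `production_rate` and matures with rate
    lam = ln(2) / half_time, giving M(t) = k * (t - (1 - exp(-lam * t)) / lam).
    """
    lam = math.log(2) / half_time
    return production_rate * (t_min - (1 - math.exp(-lam * t_min)) / lam)


window_min = 36 * 60  # the 36-hour incubation
base = mature_protein(window_min, production_rate=1.0)
boosted = mature_protein(window_min, production_rate=1.1)  # hypothetical 10% more translation
print(f"fraction matured at 36 h: {base / window_min:.2f}, signal ratio: {boosted / base:.2f}")
```

In this model the maturation time stays fixed, but the mature (fluorescent) amount scales directly with how much protein is produced.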
To test this hypothesis I will be increasing the magnesium glutamate from 6.975 mM to 7.600 mM and the amino acids from 4.063 mM to 4.500 mM.
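To put these adjustments in perspective, here is a quick calculation of the relative increase for each reagent, using the concentrations stated above. The function name is my own.

```python
def percent_increase(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100


mg_glutamate_increase = percent_increase(6.975, 7.600)
amino_acid_increase = percent_increase(4.063, 4.500)
print(f"magnesium glutamate: +{mg_glutamate_increase:.1f}%")
print(f"amino acid mix: +{amino_acid_increase:.1f}%")
```

Both changes are on the order of a 10% increase, so they are modest adjustments to the master mix rather than a big reformulation.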
I ended up going with this:
As a committed listener I did not have access to the data for the last question.