Homework

Weekly homework submissions:

  • 𓃠 Week 1 Homework 𓃠 1. First, describe a biological engineering application or tool you want to develop and why. I want to build a Biological 3D Printer :D It is a quite crazy idea, basically a biological 3D printer takes in a DNA file and prints it and expresses it right away and delivers the target product in vial.
  • 𓃠 Week 2 Homework 𓃠 Part 1: Benchling & In-silico Gel Art Simulate Restriction Enzyme Digestion with the following Enzymes:
  • 𓃠 Week 3 Homework 𓃠 Assignment: Python Script for Opentrons Artwork This is an Ancient Egyptian Pharaoh Figure, this was made using this GUI Also here is the live link to check it out! :D
  • 𓃠 Week 4 Homework 𓃠 Part A. Conceptual Questions 1. How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons) Proteins make about 20%-30% of the composition of meat, so we can take 25% as the average, this means 500 x 0.25 = 125g protein, to determine molar mass, 1 Dalton is approximately equal to 1 gram per mole (g/mol), so an average amino acid weighs 100 g/mol which when divided by the molar mass of 100 daltons gives 1.25 moles of amino acids, then we can multiply this by Avogadro’s number (6.022x1023), we get 7.5275 x 1023 molecules of amino acids in the 500 grams of meat (this is extremely huge wow XD)
  • 𓃠 Week 5 Homework 𓃠 Part A: SOD1 Binder Peptide Design (From Pranam) Part 1: Generate Binders with PepMLM Here is the Human SOD1 sequence from Uniprot (P00441)
  • 𓃠 Week 6 Homework 𓃠 Assignment: DNA Assembly 1. What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose? Phusion High-Fidelity PCR Master Mix is basically a pre-optimized solution containing every chemical component needed for DNA amplification except for the DNA template and the primers.
  • 𓃠 Week 7 Homework 𓃠 Assignment Part 1: Intracellular Artificial Neural Networks (IANNs) 1. What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions? IANNs are more suited for biology since they are not constrained with the digital β€œ0 and 1” appraoch, but can follow an Analog appraoch that is more realistic and suited for biology, because biology is not in an On/Off stage but it varies with different values and expression levels and they are more flexible in that they help in designing more Decision Boundaries without having to create new parts from the scratch and as we have seen with the Neuromorphic wizard, it can also tap and work in Advanced areas like β€œDual Region” zones where a cell activates if inputs are strictly below a threshold Or strictly above a threshold, but remains totally inactive in the intermediate zone
  • 𓃠 Week 9 Homework 𓃠 Homework Part A: General and Lecturer-Specific Questions General homework questions Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production. Cell-free protein synthesis is basically taking the protein-making machinery out of a living cell and running the process in a test tube. The biggest advantage over traditional in vivo methods (which use living cells) is that you do not have to worry about keeping a cell alive. When you work with living cells, the cell membrane blocks you from easily changing the environment, and the cell’s natural life-cycles get in the way.
  • 𓃠 Week 10 Homework 𓃠 Homework: Final Project Ξ±-Pinene titer: quantity of our target molecule produced per chassis per carbon source. Measured by GC-MS with dodecane overlay extraction from 96-well deep plates. AgPS protein identity and mass: purified AgPS-His6 protein (via Ni-NTA) can be characterized by intact protein LC-MS to confirm correct molecular mass and verify the His6 tag is present. Cell growth (OD₆₀₀): optical density at 600nm across all chassis Γ— carbon source conditions. Measured on the PHERAstar FSX plate reader. Tells us if the strain is healthy while producing. Gene expression (mRNA level): RT-qPCR on the CFX Opus machine using primers targeting AgGPPS2 and AgPS transcripts. Confirms the genes are being transcribed in vivo. π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή Homework: Waters Part I β€” Molecular Weight The Theoritical Molecular weight of the eGFP Sequence is 28006.60
  • 𓃠 Week 11 Homework 𓃠 Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork I just want to say, this was soooo fun! :D and i truly love how it changed alot and all the virtual pixel fun wars i’ve had and truces and teamups, it was a really engaging and lovely experience
  • 𓃠 Week 12 Homework 𓃠 No Homework :D Working on Final Project :D π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή
  • HTGAA Committed Listener (CL) Agreement I am a HTGAA Committed Listener, my responsibilities are: Watching class lectures and recitations Participating in node reviews Developing and documenting my homework Actively communicating with other students and TAs on the forum Allowing HTGAA and BioClub to share my work (with attribution) Honestly reporting on my work, and appropriately attributing and citing the work of others (both human and non-human) Following locally applicable health and safety guidance Promoting a respectful environment free of harassment and discrimination Signed by committing this file to my documentation page/repository,

Subsections of Homework

Week 1 HW: Principles and Practices

cover image cover image

𓃠 Week 1 Homework 𓃠

1. First, describe a biological engineering application or tool you want to develop and why.

I want to build a Biological 3D Printer :D It is a quite crazy idea, basically a biological 3D printer takes in a DNA file and prints it and expresses it right away and delivers the target product in vial.

How it works: You download a DNA file (a plasmid sequence) for a specific function, like a molecule that smells like chocolate, or a protein that glows red, or a specific medication. You send the file to the printer, which synthesizes or assembles the genetic instruction, expresses it using a cell-free system or bacterial chassis, and delivers the purified product in a vial.

I want to build this because I want to democratize manufacturing. Right now, biology is locked in big labs. I want to bring ‘Bio-Production’ to the home, the remote clinic, or even a spaceship. If you can email a file, you should be able to print a cure (or a scent).

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Main Goals include:

A. Preventing Malicious Use (Biosecurity): to make sure such printers will not be used to create harmful, toxic or dangerous products or molecules, maybe this can achieved through:

  1. DNA Screening: We can screen the DNA sequence that is requested for printing, to check and prevent the process if it encodes for a dangerous molecule or a viral toxin

  2. Reagent Lock-in: To make sure the reagents, proteins and raw materials in the printer are only viable inside the printer environment and cannot be extracted and used else where

B. End User Safety & Reliability (Biosafety): we need to make sure the printed molecule is exactly what it claims to be, and to make sure no mutation or contamination has occured

  1. Proofreading Mechanism: we can apply a proofreading startegy during and after printing and synthesis to confirm similarity between the printed the DNA and the uploaded DNA file

  2. Proper Elimination of undesired or faulty molecules: we need to be able to safely and properly discard or eliminate mutated or contaminated sequences once identified

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

3. Next, describe at least three different potential governance β€œactions” by considering the four aspects below (Purpose, Design, Assumptions, Risks of Failure & β€œSuccess”).

A. Action 1: International Regulation & Monitoring: This action can be pursued & applied through international health organizations like WHO and local alikes

  1. Purpose: We can regulate what DNA sequences get printed through a digital signature that has to be approved by such organiztions, and the printer can only print these approved sequences

  2. Design: We can have a main website similiar to the iGEM registry where users can submit new DNA sequences that they want to print and Organizations like the WHO can have a team of scientists + AI that periodically screen new DNA sequences that get submitted and either give it a signature for public use or flag it as Dangerous/Toxic and prevents its printing

  3. Assumptions: We assume that bad people cannot manipulate the signature or bypass its checking before printing

  4. Risks of Failure & Success:

    4a. Failure: Risks include Hackers being able to bypass certain signatures and produce dangerous and harmful products through the printers or even hijacking the printers themselves to allow it to print without checking anything

    4b. Success: If these regulations get applied so strictly, it could possibly lead to a lack of innovation and creativity and less DNA sequences the end users are able to print, Also local regulations would also mean some molecules might be allowed to be printed in one country and not allowed in another

B. Action 2: Companies Regulation: This Regulation is applied through the parent company that sells the printers

  1. Purpose: We can regulate the usage printers reagents and chemicals and make sure it only uses certified or accepted reagents, more like printers cartridge system or coffee pods, mainly to make sure users can’t take the reagents and use it elsewhere

  2. Design: The company can design the printer and cartridge so that they become dependant on each other, maybe the reagents need an activation factor that is only in the printer system or hardware so that without it, the reagents are dormant and cant be used

  3. Assumptions: We assume the chemical “lock” is not breakable by other chemistry sets, we also assume users can afford the prices of the printer reagent cartridges

  4. Risks of Failure & Success:

    4a. Failure: Risks include bad people managaing to break the chemical formula and use the reagents in other probably malicious work

    4b. Success: Monopoly, if one company produces those reagents and it can increase prices or stop selling to specifc countries or competitors for company gains

C. Action 3: Community Integration & Bio Bug Bounty Programs: This action involves both the companies and the community for a shared safe future

  1. Purpose: Integrate the Community into the process through lectures and workshops about proper usage of the printers, how to deal with malfunctions, how to submit DNA designs and the application of Bug Bounty incentive programs to quickly and effictevly patch errors, faults or flaws in the printer’s software and hardware

  2. Design: Incentives can be offered to people who can bypass certain security checks in the printer or print a certain toxic molecule that the print is not supposed to print, this helps turn possible hackers into quality assurance testers.

  3. Assumptions: We assume the bounty hunters will report all findings and that the incentive given is enough to make sure they report to us.

  4. Risks of Failure & Success:

    4a. Failure: A bounty hunter discovers a major defect in the printer but recieves a bigger incentive from another competitor or bad groups and delivers the information to them instead

    4b. Success: It is actually hard to think of one. maybe all these trials and errors and bug hunting will make the printer model known and allow for competitors to start companies and compete with us

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

4. Next, score (from 1-3 with, 1 as the best, or n/a) each of your governance actions against your rubric of policy goals. The following is one framework but feel free to make your own:

Does the option:Action 1Action 2Action 3
Enhance Biosecurity
β€’ By preventing incidents111
β€’ By helping respond3n/a2
Foster Lab Safety
β€’ By preventing incident122
β€’ By helping respond3n/a3
Protect the environment
β€’ By preventing incidents122
β€’ By helping respond3n/a2
Other considerations
β€’ Minimizing costs and burdens to stakeholders331
β€’ Feasibility?112
β€’ Not impede research222
β€’ Promote constructive applications211
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

5. Last, drawing upon this scoring, describe which governance option, or combination of options, you would prioritize, and why. Outline any trade-offs you considered as well as assumptions and uncertainties

Based on the scoring matrix, I recommend a Hardware-First, Software-Verified Hybrid Strategy, prioritizing Action 1 (Digital Screening) as the immediate standard, supported by Action 2 (Reagent Lock).

Hardware is the Hardest Barrier, Software can be hacked, and Bounties are reactive and takes a while to build community knowledge. Physical reagent cartridges is a very robust fail-safe for preventing accidental contamination or malicious misuse by non-experts, however reagents can be found elsewhere and maybe chemically designed to imitate the proposed solutions, so having signatured DNA sequences remain the most reliable approach.

These actions however carry tradeoffs that include lack of creative space for other users to design DNA sequences they like and have to go through a rigrous application process to get approved and signatured, and the Reagent Lock risks Company Monopoly if it is allowed to be the sole producer of these types of reagents.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

6. Reflecting on what you learned and did in class this week, outline any ethical concerns that arose, especially any that were new to you. Then propose any governance actions you think might be appropriate to address those issues. This should be included on your class page for this week.

One of the ethical concerns that arose to me is how “Open Source” in this setting can actually do more harm than good, if a lab builds a toxic molecule or a virus using the printer in a closed setting, we can basically lock up and quarantine the lab, but with open source, if the process goes online on the internet, there is no going back, and now anyone can have it. this ties back to the importance of the proposed Signature only printing action, as it can help mitigate the effects of distributed dangerous DNA sequences and trying to print it, maybe also we will need to make these printers Online only, so proper monitoring of what is being printed and by whom, of course users must be informed of this and be allowed to either accept such terms to buy the printer and use our products or refuse, and in the case of refusal, and taking biosecurity into consideration, the products should not be sold to the user nor permitted for use in this case

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

𓃠 Week 2 Lecture Prep 𓃠

1. Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome. How does biology deal with that discrepancy?

The error rate for polymerase is 1:106. The Human Genome is ~ 3.2 Gbp, doing the math means that it makes 3200 errors each time which is alot. Biology deals with these errors through proofreading and corrections, one example is the MutS Repair System where it fixes the mismatches and repairs them using DNA polymerase and Ligase

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

2. How many different ways are there to code (DNA nucleotide code) for an average human protein? In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?

An Average human protein can be up to 1036 bp, each 3 (called a codon) codes for an amino acid, but amino acids can be coded by many combinations in those codons, making the different ways to code for the average human protein way too much and increases exponentially. In practice, the many of these theoretical coding sequences do not work well since the nucleotide sequence determines the physical behavior of the mRNA molecule. One of the reasons is formation of mRNA secondary structures, such as hairpins and loops, which are governed by the “Minimum Free Energy” of the sequence. Other reasons include that specific nucleotide patterns can create recognition sites for cellular enzymes like RNase, leading to in vivo cleavage and the destruction of the mRNA before it can be translated

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

3. What’s the most commonly used method for oligo synthesis currently?

The phosphoramidite method performed via Solid-Phase Synthesis of Oligos is the current most common method. It starts with phosphoramidite Coupling then Capping of unreacted sites followed by Oxidation and then Deblocking. These steps are then repeated for as much times needed for the synthesis

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

4. Why is it difficult to make oligos longer than 200nt via direct synthesis?

(Had to look up more info about this topic using AI) It mainly returns to difficulties regarding Yield, Truncation Products and Depurination. The longer the synthesized chain gets, the lower the yield gets, and it is not uniform but exponential. Additionally Long Chains accumalate Truncation Products which may happen because of errors or depurination from acids added during the synthesis stage. On the bright side, Twist Bioscience has came up with an Enhanced Process & Chemistry allowing PCR yield to go up to >10 fold increase, 1:2000 error rate and acheiving a 500bp Oligos

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

5. Why can’t you make a 2000bp gene via direct oligo synthesis?

This touches back to the previous question, with exponential Yield Decay, a 2000bp gene would have a very tiny yield, making most of the produced bases possibly junk, also the depurination issue arises, because the base pairs at the very beginning of the chain will need to resist and endure the acids added in the Deblocking phase up to 2000 times which is difficult. The current process of creating the 2000bp gene would creating multiple oligos (200bp - 500bp) and stitching them together using DNA ligase.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

6. [(Advanced students)] Given the one paragraph abstracts for these real 2026 grant programs sketch a response to one of them or devise one of your own:

I chose the Smart-RBC: Smart Red Blood Cells Grant

My Proposal: The “Oxygen Turbo Cell”

The Idea: I propose creating a “High-Performance” Red Blood Cell that acts like a backup oxygen tank for hikers, firefighters or Rescue teams in extreme conditions. We can engineer the RBCs’ Heamoglobin to be much more sensetive to markers that arise in low oxygen or hypoxic settings.

To work it should have a biosensor that can detect Lactic Acid, which is the acid that gets produced in low oxygen settings and causes the burning sensation in your muscles when you run and once the cell senses high acid levels, the engineered haemoglobin gets activated and lets go of even more oxygen that it normally would creating a higher oxygen surge instantly into the muscles or brain to properly and rapidly accomodate to the new environment. Since these cells have no nucleus or DNA (Enuculated), they are just temporary “smart delivery bags”, They circulate for a few weeks to keep the hiker safe, then naturally die off without changing the hiker’s permanent genetics.

This also solves the “Natural Acclimation Time” problem mentioned in the abstract. Instead of waiting weeks for the body to get used to high altitude, or low oxygen environments, a person can receive a transfusion of these “Oxygen Turbo Cells” and be ready immediately to tackle these extreme environments and this applies to what the abstract mentioned as “enhance physiological resilience”.

AI citation: i used Gemini to help with idea validation and properly tying it back to match the requirements of the Grant Program

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Week 2 HW: DNA Read, Write, & Edit

cover image cover image

𓃠 Week 2 Homework 𓃠

Part 1: Benchling & In-silico Gel Art

  • Simulate Restriction Enzyme Digestion with the following Enzymes: Virtual Digest of Lambda DNA using a bunch of Restriction Enzymes Virtual Digest of Lambda DNA using a bunch of Restriction Enzymes

  • Create a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks. A Smiley Face Drawn using Benchling’s Virtual Digest A Smiley Face Drawn using Benchling’s Virtual Digest

    • I tried to do a smiley face and it turned out so bad but i love it XD!
    • Many thanks to Ronan for his Website, it really helped alot make this much faster!
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Part 3: DNA Design Challenge

3.1. Choose your protein.

Hemocyanin. It is a giant, copper-based protein that functions as the respiratory pigment for many mollusks and arthropods, but the coolest thing is that when their blood is oxygenated it turns into Blue :D. Unlike our iron-based hemoglobin which is packed inside cells, hemocyanin floats freely in the hemolymph of animals like the Atlantic Horseshoe Crab and the Keyhole Limpet. It is a medical powerhouse for humans; its massive, alien structure provokes a strong immune response, making it an effective immunotherapy treatment for bladder cancer and a crucial carrier protein for vaccines (helping the body recognize small drug molecules) and more (1).

Horseshoe Crab (Limulus polyphemus) Horseshoe Crab (Limulus polyphemus)

Here is a picture of the Horseshoe Crab (Limulus polyphemus) from the National Wildlife Federation (NWF)

Cayenne keyhole limpet (Diodora cayenensis) Cayenne keyhole limpet (Diodora cayenensis)

Here is a picture of the Cayenne keyhole limpet (Diodora cayenensis) from the Bailey-Matthews National Shell Museum & Aquarium

Here is the Hemocyanin Protein Sequence >sp|P04253|HCY2_LIMPO Hemocyanin II OS=Limulus polyphemus OX=6850 PE=1 SV=2 TLHDKQIRVCHLFEQLSSATVIGDGDKHKHSDRLKNVGKLQPGAIFSCFHPDHLEEARHLYEVFWEAGDFNDFIEIAKEARTFVNEGLFAFAAEVAVLHRDDCKGLYVPPVQEIFPDKFIPSAAINEAFKKAHVRPEFDESPILVDVQDTGNILDPEYRLAYYREDVGINAHHWHWHLVYPSTWNPKYFGKKKDRKGELFYYMHQQMCARYDCERLSNGMHRMLPFNNFDEPLAGYAPHLTHVASGKYYSPRPDGLKLRDLGDIEISEMVRMRERILDSIHLGYVISEDGSHKTLDELHGTDILGALVESSYESVNHEYYGNLHNWGHVTMARIHDPDGRFHEEPGVMSDTSTSLRDPIFYNWHRFIDNIFHEYKNTLKPYDHDVLNFPDIQVQDVTLHARVDNVVHFTMREQELELKHGINPGNARSIKARYYHLDHEPFSYAVNVQNNSASDKHATVRIFLAPKYDELGNEIKADELRRTAIELDKFKTDLHPGKNTVVRHSLDSSVTLSHQPTFEDLLHGVGLNEHKSEYCSCGWPSHLLVPKGNIKGMEYHLFVMLTDWDKDKVDGSESVACVDAVSYCGARDHKYPDKKPMGFPFDRPIHTEHISDFLTNNMFIKDIKIKFHE

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

3.2. Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.

accctgcatgataaacagattcgcgtgtgccatctgtttgaacagctgagcagcgcgaccgtgattggcgatggcgataaacataaacatagcgatcgcctgaaaaacgtgggcaaactgcagccgggcgcgatttttagctgctttcatccggatcatctggaagaagcgcgccatctgtatgaagtgttttgggaagcgggcgattttaacgattttattgaaattgcgaaagaagcgcgcacctttgtgaacgaaggcctgtttgcgtttgcggcggaagtggcggtgctgcatcgcgatgattgcaaaggcctgtatgtgccgccggtgcaggaaatttttccggataaatttattccgagcgcggcgattaacgaagcgtttaaaaaagcgcatgtgcgcccggaatttgatgaaagcccgattctggtggatgtgcaggataccggcaacattctggatccggaatatcgcctggcgtattatcgcgaagatgtgggcattaacgcgcatcattggcattggcatctggtgtatccgagcacctggaacccgaaatattttggcaaaaaaaaagatcgcaaaggcgaactgttttattatatgcatcagcagatgtgcgcgcgctatgattgcgaacgcctgagcaacggcatgcatcgcatgctgccgtttaacaactttgatgaaccgctggcgggctatgcgccgcatctgacccatgtggcgagcggcaaatattatagcccgcgcccggatggcctgaaactgcgcgatctgggcgatattgaaattagcgaaatggtgcgcatgcgcgaacgcattctggatagcattcatctgggctatgtgattagcgaagatggcagccataaaaccctggatgaactgcatggcaccgatattctgggcgcgctggtggaaagcagctatgaaagcgtgaaccatgaatattatggcaacctgcataactggggccatgtgaccatggcgcgcattcatgatccggatggccgctttcatgaagaaccgggcgtgatgagcgataccagcaccagcctgcgcgatccgattttttataactggcatcgctttattgataacatttttcatgaatataaaaacaccctgaaaccgtatgatcatgatgtgctgaactttccggatattcaggtgcaggatgtgaccctgcatgcgcgcgtggataacgtggtgcattttaccatgcgcgaacaggaactggaactgaaacatggcattaacccgggcaacgcgcgcagcattaaagcgcgctattatcatctggatcatgaaccgtttagctatgcggtgaacgtgcagaacaacagcgcgagcgataaacatgcgaccgtgcgcatttttctggcgccgaaatatgatgaactgggcaacgaaattaaagcggatgaactgcgccgcaccgcgattgaactggataaatttaaaaccgatctgcatccgggcaaaaacaccgtggtgcgccatagcctggatagcagcgtgaccctgagccatcagccgacctttgaagatctgctgcatggcgtgggcctgaacgaacataaaagcgaatattgcagctgcggctggccgagccatctgctggtgccgaaaggcaacattaaaggcatggaatatcatctgtttgtgatgctgaccgattgggataaagataaagtggatggcagcgaaagcgtggcgtgcgtggatgcggtgagctattgcggcgcgcgcgatcataaatatccggataaaaaaccgatgggctttccgtttgatcgcccgattcataccgaacatattagcgattttctgaccaacaacatgtttattaaagatattaaaattaaatttcatgaa

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

3.3. Codon optimization.

I decided to use Benchling’s Codon Optimization for E. Coli K-12. Codon Optimization is important because it uses the amino acids more native to the chosen organism which boosts speed and efficiency of translation and also if it is not done the organism might not have enough complementary tRNA anti codon molecules to synthesize this specific amino acid.

ACCCTGCATGATAAACAGATTCGCGTGTGCCATCTGTTTGAACAGCTGAGCAGCGCGACCGTGATTGGCGATGGCGATAAACATAAACATAGCGATCGCCTGAAAAACGTGGGCAAACTGCAGCCGGGCGCGATTTTTAGCTGCTTTCATCCGGATCATCTGGAAGAAGCGCGCCATCTGTATGAAGTGTTTTGGGAAGCGGGCGATTTTAACGATTTTATTGAAATTGCGAAAGAAGCGCGCACCTTTGTGAACGAAGGCCTGTTTGCGTTTGCGGCGGAAGTGGCGGTGCTGCATCGCGATGATTGCAAAGGCCTGTATGTGCCGCCGGTGCAGGAAATTTTTCCGGATAAATTTATTCCGAGCGCGGCGATTAACGAAGCGTTTAAAAAAGCGCATGTGCGCCCGGAATTTGATGAAAGCCCGATTCTGGTGGATGTGCAGGATACCGGCAACATTCTGGATCCGGAATATCGCCTGGCGTATTATCGCGAAGATGTGGGCATTAACGCGCATCATTGGCATTGGCATCTGGTGTATCCGAGCACCTGGAACCCGAAATATTTTGGCAAAAAAAAAGATCGCAAAGGCGAACTGTTTTATTATATGCATCAGCAGATGTGCGCGCGCTATGATTGCGAACGCCTGAGCAACGGCATGCATCGCATGCTGCCGTTTAACAACTTTGATGAACCGCTGGCGGGCTATGCGCCGCATCTGACCCATGTGGCGAGCGGCAAATATTATAGCCCGCGCCCGGATGGCCTGAAACTGCGCGATCTGGGCGATATTGAAATTAGCGAAATGGTGCGCATGCGCGAACGCATTCTGGATAGCATTCATCTGGGCTATGTGATTAGCGAAGATGGCAGCCATAAAACCCTGGATGAACTGCATGGCACCGATATTCTGGGCGCGCTGGTGGAAAGCAGCTATGAAAGCGTGAACCATGAATATTATGGCAACCTGCATAACTGGGGCCATGTGACCATGGCGCGCATTCATGATCCGGATGGCCGCTTTCATGAAGAACCGGGCGTGATGAGCGATACCAGCACCAGCCTGCGCGATCCGATTTTTTATAACTGGCATCGCTTTATTGATAACATTTTTCATGAATATAAAAACACCCTGAAACCGTATGATCATGATGTGCTGAACTTTCCGGATATTCAGGTGCAGGATGTGACCCTGCATGCGCGCGTGGATAACGTGGTGCATTTTACCATGCGCGAACAGGAACTGGAACTGAAACATGGCATTAACCCGGGCAACGCGCGCAGCATTAAAGCGCGCTATTATCATCTGGATCATGAACCGTTTAGCTATGCGGTGAACGTGCAGAACAACAGCGCGAGCGATAAACATGCGACCGTGCGCATTTTTCTGGCGCCGAAATATGATGAACTGGGCAACGAAATTAAAGCGGATGAACTGCGCCGCACCGCGATTGAACTGGATAAATTTAAAACCGATCTGCATCCGGGCAAAAACACCGTGGTGCGCCATAGCCTGGATAGCAGCGTGACCCTGAGCCATCAGCCGACCTTTGAAGATCTGCTGCATGGCGTGGGCCTGAACGAACATAAAAGCGAATATTGCAGCTGCGGCTGGCCGAGCCATCTGCTGGTGCCGAAAGGCAACATTAAAGGCATGGAATATCATCTGTTTGTGATGCTGACCGATTGGGATAAAGATAAAGTGGATGGCAGCGAAAGCGTGGCGTGCGTGGATGCGGTGAGCTATTGCGGCGCGCGCGATCATAAATATCCGGATAAAAAACCGATGGGCTTTCCGTTTGATCGCCCGATTCATACCGAACATATTAGCGATTTTCTGACCAACAACATGTTTATTAAAGATATTAAAATTAAATTTCATGAA

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

3.4. You have a sequence! Now what?

When this protein is introduced to the E. coli K-12, it can start transcribing this DNA into an mRNA then this mRNA can be translated into a protein so that the bacteria can use it, and since we have codon optimized it for the e coli, we should expect a smooth translation process without any stalling.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

3.5. [Optional] How does it work in nature/biological systems?

1) Describe how a single gene codes for multiple proteins at the transcriptional level.

Alternative Splicing. A single Gene in Eukaryotes can code for multiple proteins through a process called Alternative Splicing, It takes on many forms, like Exon Skipping, where a certain exon might be skipped possibly altering the protein function, or Alternative 5’ or 3’ Splicing, where some exons can be longer or shorter hence affecting the number of produced amino acids altering the function too, another way is intron retention where some introns are not spliced out which can introduce a stop codon and can cause the protein to decay using the Nonsense mediated decay pathway.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

2) Try aligning the DNA sequence, the transcribed RNA, and also the resulting translated Protein!!! See example below.

DNA, RNA & AA DNA, RNA & AA

Here is a Picture of the very first few bases of Hemocyanin II DNA, RNA & Amino Acid

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Part 4: Prepare a Twist DNA Synthesis Order

Benchling Linear Map of a Constitutive sfGFP DNA Benchling Linear Map of a Constitutive sfGFP DNA

Here is a screenshot of my Linear map of Constitutive sfGFP DNA and here is the Benchling Link

Final Twist sfGFP Plasmid Final Twist sfGFP Plasmid

Here is a screenshot of my Twist Ready Plasmid :D

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Part 5: DNA Read/Write/Edit

5.1 DNA Read

(i) What DNA would you want to sequence (e.g., read) and why?

I’d like to sequence and read the genome of the Axolotl (Ambystoma mexicanum) to learn more about the process of its limb regeneration and how it happens, and figure out if other species have these genes too whether Human, Marine or Plants.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?

I would use PacBio SMRT sequencing. This is a third-generation technology, which is perfect because the Axolotl genome is very huge (32 billion bases!) and full of repetitive parts that confuse older, short-read machines. Third-generation sequencing reads single molecules of DNA in real-time, giving us the long reads necessary to bridge those gaps and locate the regeneration genes i care about.

For the process, I’d follow the SMRTbells process where I’d start with really long strands of high-quality DNA and attach hairpin adapters to both ends, turning them into circular loops. These loops go into tiny wells where a polymerase enzyme runs around the circle, adding bases. As each base (A, C, T, G) is added, it gives off a specific colored flash of light. The machine records these flashes to decode the sequence. The final output is HiFi reads: extremely long, highly accurate digital sequences that we can assemble into a full map.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

5.2 DNA Write

(i) What DNA would you want to synthesize (e.g., write) and why?

I want to synthesize the Dsup (Damage Suppressor) gene found in Tardigrades, Tardigrades are famous for surviving the vacuum of space and massive radiation because this specific protein wraps around their DNA like a physical shield to prevent damage. By synthesizing this gene and inserting it into human cells or gut bacteria or plants, I want to see if we can “borrow” this superpower to protect astronauts from lethal cosmic radiation on the long space journies, effectively genetically engineering a radiation protection.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

(ii) What technology or technologies would you use to perform this DNA synthesis and why?

For the DNA synthesis technology, maybe i will rely either on the current common Phosphoramidite DNA synthesis or use an Enzymatic Synthesis using Terminal deoxynucleotidyl transferase (TdT), or maybe an even more easier and more quick method is to just use Twist Bioscience and order the gene right away :D

Basically the steps for phosphoramidite synthesis process would start first with deblocking of the nucleotide then it couples using phosphoramidite, after that we do Capping for the unreacted sites to prevent any faulty chains to continue growing, and then oxidation to seal and empower the bond of our newly added nucleotide, then the process is repeated until we get a chain of N bases, then these Oligos are stitched and assembled together using methods like Gibson Assembly, Golden Gate Assembly or the recently announced Sidewinder (2) way (which is pretty cool :D)

Phosphoramidite Synthesis currently faces the major issue of Exponentially Decaying Yield when the synthesized chain gets longer however Twist Bioscience seems to be doing a really great job regarding this especially for their achievement of direct synthesis of the first 700mer Oligo using “Enhanced Chemistry”. For DNA Assembly currenty Sidewinder (2) seems like a very promising tool that can achieve high accuracy and avoids many of the errors that can happen from long, repititve or high GC content Oligos that methods like Gibson Assembly or Golden Gate Assembly used to suffer from.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

5.3 DNA Edit

(i) What DNA would you want to edit and why?

Quite a crazy idea but i would like to edit the DNA of the E. Coli K-12 and install regeneration genes from the Axolotl to allow the e coli to be able to regrow and detach vesicles containing molecules that it can produce or has been metabolically engineered to produce. this can allow it to match some Yeast ability for storing produced molecules especially hydrophobic molecules. These vesicles can be engineered to float to the top of flask or fermentation tanks which would allow easy extraction and purification.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

(ii) What technology or technologies would you use to perform these DNA edits and why?

I would use CRISPR-Cas9 to edit the genome. The process starts with designing a Guide RNA (gRNA) that acts as a GPS coordinate for the specific location in the E. coli genome where I want to make the edit, then introduce a plasmid into the cells containing the Cas9 enzyme, the gRNA, and a Donor DNA template (which carries the Axolotl genes). The Cas9 enzyme cuts the bacterial DNA at the target site, and the cell repairs this cut using the Donor template, successfully “pasting” the new regeneration genes, this process is called Homology-Directed Repair.

However, this method has limitations, mainly of efficiency and payload size. Inserting large, complex gene pathways is much harder than making small mutations, the larger the DNA insert, the less likely the bacteria are to accept it. There is also the risk of off-target effects, where the enzyme cuts the wrong part of the genome, potentially killing the cell, Also sometimes the Homology-Directed Repair may not be efficient and may introduce mutations that can render the insert non functional

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

References

  1. Keyhole limpet haemocyanin – a model antigen for human immunotoxicological studies

  2. Construction of complex and diverse DNA sequences using DNA three-way junctions

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Week 3 HW: Lab Automation

cover image cover image

𓃠 Week 3 Homework 𓃠

Assignment: Python Script for Opentrons Artwork

Ancient Egyptian Pharaoh Oepntrons GUI Art Ancient Egyptian Pharaoh Oepntrons GUI Art

This is an Ancient Egyptian Pharaoh Figure, this was made using this GUI Also here is the live link to check it out! :D

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή
Note

This section is going to be just documentation about how i managed to draw the Ancient Egyptian Pharaoh Figure, to view the rest of the homework, You can skip to Post-Lab Questions

So i started out with this image which i generated using Gemini a long while ago.

Ancient Egyptian Pharaoh Ancient Egyptian Pharaoh
  • I wanted to draw it using the Opentrons OT-2 but i had no idea how + i am not a very good pixel artist.
  • So i decided to use AI to help me with this, first i went to Ronan’s GUI (Thanks Again Ronan! :D) and i roughly counted there was (on default settings) 36x36 pixels (i was off by a little number btw XD)
  • So i went back to Gemini and prompted it to draw my the image on a 36x36 pixel grid and this was the output
Ancient Egyptian Pharaoh Pixel Ancient Egyptian Pharaoh Pixel
  • It was pretty good to me, but i need coordinates for the Opentrons API, i tried to roughly copy it on Ronan’s GUI but i failed misreably and wasn’t accurate
  • So when i was searching online for tools to like give some sort of coordinates to this (and i didn’t find XD), i stumbled on this website called pixilart, I went in and generated a 36x36 pixel grid and guess what? I drew it from scratch using Gemini’s image as the reference and i added some tweaks myself to it too :D
Ancient Egyptian Pharaoh Pixel Redraw Ancient Egyptian Pharaoh Pixel Redraw
  • It took alot of effort trying to focus on those pixels to get a picture perfect copy of it, but in the end the mission was a success :D
  • Now that i could somehow get x/y coordinates, i wanted to redo this on Ronan’s GUI so that the coordinates i give to Opentrons are accurate, so now i had to take my new reference and map it there.
  • I tried to upload the image but it didn’t map it well at all, maybe because of the pixel grids, so i had to do it manually XD
Ancient Egyptian Pharaoh Pixel Redraw On Ronan’s GUI Ancient Egyptian Pharaoh Pixel Redraw On Ronan’s GUI
  • Here is a screenshot of me mapping every single pixel on Ronan’s GUI and marking it in red on pixilart so i don’t mix something up :D
  • It took sometime but the results were totally worth it :D
  • This way i took the generated coordinates and went the Opentrons Google Colab Notebook and used Gemini to write the actual function because i was too tired after all the drawing XD, and this was the result! :D
Ancient Egyptian Pharaoh Opentrons OT-2 Python API Ancient Egyptian Pharaoh Opentrons OT-2 Python API

Here is some other designs that i did too while in the Autonomous Cloud Lab Stream where we were printing Fluorescent Artwork designs

Ancient Egyptian Eye of Ra Nebula Ancient Egyptian Eye of Ra Nebula

This is the Eye of Ra, a very common and famous Ancient Egyptian Symbol

Ancient Egyptian Winged Scarab Nebula Ancient Egyptian Winged Scarab Nebula

This is a Winged Scarab, which is also a very well known Ancient Egyptian Symbol

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Post-Lab Questions β€” DUE BY START OF FEB 24 LECTURE

Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.

Paper: Development of a Modular Lab Automation System with Applications to Animal and Bacteria Cell Culture(1)

This paper is a perfect example of why automation is critical for modern synthetic biology. They identify the key challenges of manual lab work which includes that complex protocols are tedious, prone to human error, and suffers from a lack of reproducibility. and To solve this, so they developed a modular lab automation system. They validated their system through automating cell culture, for both simple bacteria (prokaryotic) and complex animal cells (eukaryotic).

The most important part of their project isn’t just the robot, it’s their focus on transparency, quality control, and community reuse. They created a system that automatically generates Jupyter Notebooks as a β€œfull protocol execution report,” which makes the experiments perfectly documented and reusable. They also described how this modular system is the first step toward β€œself-driving labs,” where AI and machine learning models can design and run their own experiments, possibly creating a fully automated DBTL cycle.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Write a description about what you intend to do with automation tools for your final project. You may include example pseudocode, Python scripts, 3D printed holders, a plan for how to use Ginkgo Nebula, and more.

Automation Tools can help me quickly and efficiently do many tasks and test different scenarios with minimal errors, for example in Idea 1 i can use Automation Tools to test out different circuits with different promotors to test them and check out which on is the most optimal and which ones burden the cell too much, same with Idea 2 where i can test different Inducer Ratios and measure production titers in a quick way

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Final Project Ideas

The presentation slides can be found at the Slides Deck

Idea 1: The Malaria Machine: Computational Optimization of Artemisinic Acid Biosynthesis

  • The Problem: Artemisinin, the frontline treatment for malaria which still kills ~600,000 people annually, relies on plant extraction from Artemisia annua β€” an inherently slow, geographically limited, and climate-vulnerable supply chain that cannot meet global demand consistently.

  • The Mechanism: This project computationally engineers a microbial factory (E. coli or yeast) to overproduce artemisinic acid, the direct biosynthetic precursor to artemisinin, by modeling the mevalonate pathway using flux balance analysis, systematically evaluating knockout and overexpression strategies, and designing optimized genetic circuits for the winning candidate.

  • The Impact: By delivering a fully documented, wet-lab-ready computational blueprint for high-yield artemisinic acid biosynthesis, this project contributes toward a scalable, plant-independent, and globally accessible supply of the world’s most critical antimalarial compound.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Idea 2: The Bio-Propulsion Model: Engineering Off-World Propellant

  • The Problem: High-density aerospace propellants (like JP-10) are entirely petroleum-derived, making them unsustainable on Earth and physically impossible to drill for during deep space missions.

  • The Mechanism: This project computationally optimizes a microbial factory, utilizing Metabolic pathway, to metabolically convert raw carbon into alpha-pinene.

  • The Impact: Because alpha-pinene can be chemically dimerized into a direct, high-energy biological equivalent of JP-10 rocket fuel, this creates a scalable, closed-loop propulsion supply for advanced aerospace applications.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Idea 3: The Coral Distress Beacon: Early-Warning Marine Biosensors

  • The Problem: Corals release specific chemical stress markers, such as Reactive Oxygen Species (ROS), immediately before undergoing heat-induced bleaching.

  • The Mechanism: This project proposes engineering a marine microbe biosensor with a targeted genetic logic circuit designed to detect these exact ROS molecules and output a highly visible fluorescent signal.

  • The Impact: By deploying this living “distress beacon,” marine biologists gain a realtime, colorful early warning system, allowing them to intervene and protect the reef before the ecological damage becomes irreversible.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Extra Ideas

Idea 4: BioDesalination: Engineering a Living Portable Freshwater System

  • The Problem: Existing desalination technologies are energy-hungry, expensive and dependent on infrastructure which sometimes makes it inaccessible to coastal communities, small vessels and off-grid environments that need them the most

  • The Mechanism: This project aims to engineer a microorganism with optimized ion transport and osmoregulation pathways to actively convert saltwater into freshwater through biological membrane processes. It begins computationally, modeling genetic modifications and pathway optimization, with a clear roadmap toward experimental validation.

  • The Impact: A living, self-sustaining, portable biodesalination system that requires no external energy infrastructure, deployable anywhere there is saltwater and sunlight

References

  1. Development of a Modular Lab Automation System with Applications to Animal and Bacteria Cell Culture
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Week 4 HW: Protein Design Part i

cover image cover image

𓃠 Week 4 Homework 𓃠

Part A. Conceptual Questions

1. How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)

Proteins make about 20%-30% of the composition of meat, so we can take 25% as the average, this means 500 x 0.25 = 125g protein, to determine molar mass, 1 Dalton is approximately equal to 1 gram per mole (g/mol), so an average amino acid weighs 100 g/mol which when divided by the molar mass of 100 daltons gives 1.25 moles of amino acids, then we can multiply this by Avogadro's number (6.022x1023), we get 7.5275 x 1023 molecules of amino acids in the 500 grams of meat (this is extremely huge wow XD)

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

2. Why do humans eat beef but do not become a cow, eat fish but do not become fish?

Because when we eat them, we do not inherit their DNA or Genomic composition, we just digest them and use the amino acids and other nutrients from their catabolism to build our own self using our own Genome instructions

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

3. Why are there only 20 natural amino acids?

The currently settled on 20 Amino acids provide a sweet spot between the most chemical diversity with the fewest components to a build functioning complex system, however we dont seem to be only stuck with 20, as there are 2 other natural amino acids Selenocysteine is one of them, it is used by humans and bacteria for precise antioxidant enzymes, the other is Pyrrolysine which is found strictly in ancient methane-producing archaea.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

4. Can you make other non-natural amino acids? Design some new amino acids.

Yes, you definetly can, actually non natural amino acids have been created before, usually by chemically modifying the “R-group” side chain to create proteins with other powers like heat resistance, fluorescence, or adhesive properties. Maybe we can design new amino acids that can glow different colors depending on how powerful radiation is in a certain place.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

5. Where did amino acids come from before enzymes that make them, and before life started?

(Had to use AI for this one) Amino acids were already being mass produced by high energy physics and geology, forming spontaneously whenever simple carbon compounds interact with energy, some common sources were Atmospheric Sparks (Lightning), Cosmic Delivery (Meteorites), and Deep-Sea Vents (Geothermal Heat).

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

6. If you make an Ξ±-helix using D-amino acids, what handedness (right or left) would you expect?

it will form a Left-Handed helix, since D-amino Acids are mirrors of the L-amino acids that make Right handed helices

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

8. Why are most molecular helices right-handed?

Because more than 99% of the amino acids in the body are in the L-form (Left-handed), as they are the exclusive building blocks for proteins and enzymes, so most of the produced helices will be right handed.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

9. Why do Ξ²-sheets tend to aggregate? What is the driving force for Ξ²-sheet aggregation?

Unlike alpha helices, which are self contained cylinders, a single beta strand is structurally incomplete. Its edges have exposed hydrogen bond donors and acceptors that act like open Velcro hooks. If these edges are not protected by the protein’s own fold, they will inevitably recruit strands from neighboring proteins to satisfy these bonds, triggering a stacking event. The primary driving force for beta sheet aggregation is the thermodynamic stability gained from intermolecular hydrogen bonding combined with the hydrophobic effect.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

10. Why do many amyloid diseases form Ξ²-sheets? Can you use amyloid Ξ²-sheets as materials?

Proteins normally fold into delicate 3D shapes to do their jobs, but keeping those shapes takes constant effort. If a protein gets damaged, it can collapse into a “Ξ²-sheet”, a misfolded shape that acts like a microscopic zipper. These sheets lock together so tightly and squeeze out so much water that the body’s natural recycling system literally cannot break them apart. That extreme, indestructible stability is exactly why they clump up in the brain and cause diseases.

However, that same indestructible nature makes them an incredible raw material. Because these microscopic Ξ²-sheets are stronger than steel and completely waterproof, scientists are now programming cells to grow harmless, synthetic versions of them in the lab. We can harvest these lab grown amyloids to build strong biofibers.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Part B: Protein Analysis and Visualization

Briefly describe the protein you selected and why you selected it.

crtI (phytoene desaturase), it is a bacterial enzyme that handles the multi-step conversion of colorless phytoene into the bright red pigment lycopene. i mainly chose it because i was working on a metabolic engineering project recently on lycopene production so i thought it would be cool to pick crtI and explore it more

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Identify the amino acid sequence of your protein.

MKPTTVIGAGFGGLALAIRLQAAGIPVLLLEQRDKPGGRAYVYEDQGFTFDAGPTVITDPSAIEELFALAGKQLKEYVELLPVTPFYRLCWESGKVFNYDNDQTRLEAQIQQFNPRDVEGYRQFLDYSRAVFKEGYLKLGTVPFLSFRDMLRAAPQLAKLQAWRSVYSKVASYIEDEHLRQAFSFHSLLVGGNPFATSSIYTLIHALEREWGVWFPRGGTGALVQGMIKLFQDLGGEVVLNARVSHMETTGNKIEAVHLEDGRRFLTQAVASNADVVHTYRDLLSQHPAAVKQSNKLQTKRMSNSLFVLYFGLNHHHDQLAHHTVCFGPRYRELIDEIFNHDGLAEDFSLYLHAPCVTDSSLAPEGCGSYYVLAPVPHLGTANLDWTVEGPKLRDRIFAYLEQHYMPGLRSQLVTHRMFTPFDFRDQLNAYHGSAFSVEPVLTQSAWFRPHNRDKTITNLYLVGAGTHPGAGIPGVIGSAKATAGLMLEDLI

  • The length of the protein is 492 aminoacids, with Leucine (L) being the most common amino acid which appears 57 times.

  • Uniprot ID P21685 and it has 242 homologs.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Identify the structure page of your protein in RCSB

  • The structure for Phytoene Dehydrogenase was deposited on Jan 26th 2012, it has a good resolution of 2.4 Γ…
  • Apart from the Main Protein Polymer, it has other components including Ligands, Water and Ions.
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Open the structure of your protein in any 3D molecule visualization software.

crtI Cartoon crtI Cartoon

Cartoon View

crtI Ribbon crtI Ribbon

Ribbon View

crtI Ball and Sticks crtI Ball and Sticks

Ball & Sticks View

crtI Residues crtI Residues

crtI Residues, Blue are hydrophilic and red are hydrophobic, it seems to have more hydrophilic residues than hydrophobic ones

crtI Cavities crtI Cavities

crtI Cavities

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Part C. Using ML-Based Protein Design Tools

C1. Protein Language Modeling

Mutations Scan Heatmap Mutations Scan Heatmap
  • Some Patterns that I identified include that some amino acids in the crtI protein, especially those between positions 4-8 seem to have the lowest scores for mutations possibly highlighting how important or cruical these are for the protein’s stability and function.
  • On the other hand, some amino acids show high and positive scores when mutated to other amino acids, potentially showing better stability or favorable changes, such as positions 354 & 365 seem to be having a high score for each differtiated amino acid.
  • Position 269 seems to have the highest score when mutated to Valine (V) or Isoleucine (I).

3D TSNE map 3D TSNE map 3D TSNE map with crtI 3D TSNE map with crtI

Here is the TSNE map, i hightlighted my crtI protein in black.

3D TSNE map Neighborhood1 3D TSNE map Neighborhood1

I found this neighborhood cluster that hosts many dehydrogenase proteins like Retinal Dehydrogenase & Putative dehydrogenase, and other Reductase proteins too

3D TSNE map Neighborhood2 3D TSNE map Neighborhood2

Here in this cluster i found many microbial cytochrome proteins and some photosystem proteins, and other mitochondiria and ATP related ones

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

C2. Protein Folding

crtI Fold crtI Fold

This is how the normal crtI folded

crtI Fold 4Vs crtI Fold 4Vs

Here i tried swapping amino acids 10-13 to be all Valine (V) and here is the result

crtI Fold 2Es crtI Fold 2Es

Here i actually picked two of the lowest scores from the mutation scan heatmap and decided to switch positions 4 and 5 which were Threonine (T) and switched them with Glutamic acid (E) which had a very low negative scores of -9.97 and -9.39 for positions 4 and 5 respectively

crtI Fold 30 AA removal crtI Fold 30 AA removal

Here i tried to go a bit crazy and removed 30 amino acids, in sets of 10 randomly across the protein to see what effect this could have on it

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

C3. Protein Generation

Inverse Folded Protein Sequence: ALPVAVVDGGAGGLALAIRLKAAGLPVVLLESGXXXXXXXGSVEKDGFIFDTTDLIITDPSPIEALFALAGKKLEDYVKLLKVEPFYRMVFENGRTFDFNQDLAAILAQIAKFNPADVAGFQALMAALRARYAEGYPXXGPVPYLDFDRLLRVAPTLRESPAYKAIHAEIAKYIKDPFLQLALTTFHLLVSGRPXXDTDPYHLISYFTQDWDVYYPEGGYKALVEAMKTLLRDLGGTIVEGARVARFELEGNRVVAVVLEDGRVIPVSAVALPPAXXXXXXXXXXXXXXXXXXXXXXXXXXXRYDLLELFFATDKRYDHLAMYTLVFKPVXXXXXXXXXXXXXLDYSLALVIYNPNVVDPSLAPEGGNSLYVKAPVPALGSANIDWSVWGPEKAEELLAYLEAHLMPGLRASLVTHAIVTPADKLXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXELENLFVIGXXXXXXGGIPGAIAAAFEVADRILAALK

Inverse Folding AA Probabilities Inverse Folding AA Probabilities crtI Inverse Fold crtI Inverse Fold

I generated an AA Probaility Heat map and the ESM Fold for the ProteinMPNN produced sequence, it actually doesn’t really hold the same shape as the original one, when i checked the output of the ProteinMPNN on colab it was this: >T=0.1, sample=0, score=0.8096, seq_recovery=0.4642 ALPVAVVDGGAGGLALAIRLKAAGLPVVLLESGXXXXXXXGSVEKDGFIFDTTDLIITDPSPIEALFALAGKKLEDYVKLLKVEPFYRMVFENGRTFDFNQDLAAILAQIAKFNPADVAGFQALMAALRARYAEGYPXXGPVPYLDFDRLLRVAPTLRESPAYKAIHAEIAKYIKDPFLQLALTTFHLLVSGRPXXDTDPYHLISYFTQDWDVYYPEGGYKALVEAMKTLLRDLGGTIVEGARVARFELEGNRVVAVVLEDGRVIPVSAVALPPAXXXXXXXXXXXXXXXXXXXXXXXXXXXRYDLLELFFATDKRYDHLAMYTLVFKPVXXXXXXXXXXXXXLDYSLALVIYNPNVVDPSLAPEGGNSLYVKAPVPALGSANIDWSVWGPEKAEELLAYLEAHLMPGLRASLVTHAIVTPADKLXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXELENLFVIGXXXXXXGGIPGAIAAAFEVADRILAALK

The sequence recovery was 0.4642, i tried to regenerate it again and got 0.4840, still both are under 50% so that explains why the produced Fold is quite different

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Part D. Group Brainstorm on Bacteriophage Engineering

By: 2026a-nourelden-rihan, 2026a-ritika-saha, 2026a-rahul-yaji, 2026a-keerthana-gunaretnam

Goals and Strategy

  • Primary Goal: Increase the structural stability of the MS2 bacteriophage lysis protein (L) while preserving its ability to lyse bacterial cells.

  • Secondary Goal: Decrease the protein’s reliance on the host chaperone DnaJ, which could allow the lysis protein to function more efficiently and independently in engineered systems

  • Design Focus: The strategy involves stabilizing the transmembrane and oligomerization regions, protecting essential functional motifs (such as the L48–S49 motif), and modifying the N-terminal region to bypass DnaJ dependence

Computational Pipeline

The project utilizes a multi-step computational protein engineering pipeline to rationally design mutations:

  • Homolog Discovery (BLAST): Identifying related lysis proteins to find evolutionarily conserved residues and natural sequence variations

  • Multiple Sequence Alignment (Clustal Omega): Mapping essential structural regions and differentiating between highly conserved zones (to be protected) and mutable sites

  • In Silico Mutagenesis (ESM): Using protein language models to generate mutation heatmaps and rationally select amino acid substitutions that improve protein fitness and stability

  • Structure Prediction (ESMFold): Modeling the 3D structures of promising mutants to ensure the essential transmembrane helix is not distorted

  • Complex Prediction (AlphaFold Multimer): Evaluating whether mutated proteins can successfully form the required oligomeric pore complex (>10 subunits) and assessing if N-terminal mutations successfully reduce interactions with DnaJ

Expected Outcomes and Applications

  • The pipeline is expected to yield MS2 L variants with enhanced structural stability, proper transmembrane insertion, lower aggregation risks, and reduced DnaJ dependency

  • These optimized proteins have potential downstream applications in synthetic phage engineering, antimicrobial protein development, and bacterial ghost cell production

Challenges and Future Validation

  • Key computational challenges include the limited training data for small transmembrane toxins (as models primarily focus on globular proteins), the poor database annotation of single-gene lysis proteins (amurins), and the risk of over-stabilizing the protein, which could impede proper membrane insertion or functional oligomerization

  • Additionally, mutations might inadvertently expose protease cleavage sites, reducing stability inside the cell

  • Future steps will involve experimentally expressing the computationally identified mutants in E. coli to validate protein stability, lysis timing, and DnaJ independence

Schematic

Project Schematic Project Schematic

You can check out the fully detailed Project Proposal here

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Week 5 HW: Protein Design Part ii

cover image cover image

𓃠 Week 5 Homework 𓃠

Part A: SOD1 Binder Peptide Design (From Pranam)

Part 1: Generate Binders with PepMLM

Here is the Human SOD1 sequence from Uniprot (P00441)

MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

Here it is again after adding the A4V mutation

MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ

Here is the produced peptides:

IndexBinderPseudo Perplexity
1HRSYAVALRHGK15.152948
2HHSGPVAVRWKX11.715797
3WRSPAAAVEHWX9.166994
4WRYGVVGVRLWE14.868017
5FLYRWLPSRRGGn/a
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Part 2: Evaluate Binders with AlphaFold3

Some of the peptides had X in it and Alphafold seems to reject it, i asked Gemini and it mentioned it should be swapped, Safe Options include Alanine (A) or Glycine (G) or a Rational swap where i can choose a hydrophobic Leucine (L) or Valine (V) or a hydrophilic Lysine (K) or Arginine (R) swap that depends on the pocket i am binding to, for now though i have decided to go with an Alanine (A) swap for simplicity.

Peptide 1 (HRSYAVALRHGK)

AlphaFold Prediction of Peptide 1 AlphaFold Prediction of Peptide 1

The Peptide seems to be floating really close near a side of the Dimer Interface but it is not totally sticking to it

AlphaFold Prediction of Peptide 1 Surface View AlphaFold Prediction of Peptide 1 Surface View

Here in this Molecular Surface View, it shows that the Peptide is floating and not touching the protein surface, yet it looks to be fitting into the protein surface groove well.

Peptide 2 (HHSGPVAVRWKX)

AlphaFold Prediction of Peptide 2 AlphaFold Prediction of Peptide 2

This Peptide seems to be floating near the Dimer Interface but not actually stuck to it.

Peptide 3 (WRSPAAAVEHWX)

AlphaFold Prediction of Peptide 3 AlphaFold Prediction of Peptide 3

The Peptide seems to be floating away from the protein and not engaging with the N-terminus, the beta barrel or the Dimer Interface.

Peptide 4 (WRYGVVGVRLWE)

AlphaFold Prediction of Peptide 4 AlphaFold Prediction of Peptide 4

The Peptide seems to be buried into a groove into the protein, yet does not engaging with the N-terminus, the beta barrel or the Dimer Interface.

AlphaFold Prediction of Peptide 4 Surface View AlphaFold Prediction of Peptide 4 Surface View

This image shows clearly how the peptide seems to be holding into a groove into the protein.

Peptide 5 (Control SOD1-binding peptide) (FLYRWLPSRRGG)

AlphaFold Prediction of Peptide 5 AlphaFold Prediction of Peptide 5

The Peptide seems to be floating away from the protein and not engaging with the N-terminus, the beta barrel or the Dimer Interface.


IndexPeptideipTM Score
1HRSYAVALRHGK0.54
2HHSGPVAVRWKX0.3
3WRSPAAAVEHWX0.34
4WRYGVVGVRLWE0.47
5FLYRWLPSRRGG0.41

The ipTM score tells you how confident AlphaFold is that these two chains actually interact. It ranges from 0 to 1. Generally, and as noted in AlphaFold, Scores beyond 0.8 are confident high-quality predictions) while below 0.6 are a failed prediction, in between is considered a grey zone where predictions could be correct or incorrect.

Looking at my ipTM scores, it seems like all the generated peptides fall into the failed prediction zone, with Peptide 1 almost surpassing it with a score of 0.54, Peptides 1 (0.54) & 4 (0.47) seem to have a higher score than the control Peptide 5, while Peptides 2 (0.3) & 3 (0.34) have very low scores.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse

Peptide 1 (HRSYAVALRHGK)

PeptiVerse Property Prediction of Peptide 1 PeptiVerse Property Prediction of Peptide 1

Peptide 1 has a very high solubility score and is also non hemolytic so it is generally safe, however it shows a weak binding affinity, The results also show that it has a isoelectric point of 11.00 explaining its positive net charge (2.93), hydrophilic (-0.85) so it binds well to hydrophilic or polar pockets of the protein, and has a molecular weight of 1394.6 Daltons.

Peptide 2 (HHSGPVAVRWKX)

PeptiVerse Property Prediction of Peptide 2 PeptiVerse Property Prediction of Peptide 2

Peptide 2 has a very high solubility score and is also non hemolytic so it is generally safe, however it also shows a weak binding affinity, The results also show that it has a isoelectric point of 11.00 explaining its positive net charge (1.93), hydrophilic (-0.69) so it binds well to hydrophilic or polar pockets of the protein, and has a molecular weight of 1255.6 Daltons.

Peptide 3 (WRSPAAAVEHWX)

PeptiVerse Property Prediction of Peptide 3 PeptiVerse Property Prediction of Peptide 3

Peptide 3 has a very high solubility score and is also non hemolytic so it is generally safe, however it shows a weak binding affinity, The results also show that it has a isoelectric point of 6.75 explaining its negative (or almost neutral) net charge (-0.15), hydrophilic (-0.48) so it binds well to hydrophilic or polar pockets of the protein, and has a molecular weight of 1291.5 Daltons.

Peptide 4 (WRYGVVGVRLWE)

PeptiVerse Property Prediction of Peptide 4 PeptiVerse Property Prediction of Peptide 4

Peptide 4 has a very high solubility score and is also non hemolytic so it is generally safe, and it shows Medium binding affinity :D, The results also show that it has a isoelectric point of 8.75 explaining its positive net charge (0.77), amphiphilic (-0.00) so it binds well to both water or greasy non-polar pockets of the protein, and has a molecular weight of 1519.7 Daltons.


Overall it seems like all the generated Peptides are soluble and non hemolytic so they are safe to use, Peptide 4 seems to be the one with the highest chance of Binding because it has Medium Binding Affinity compared to all Weak Binding affinity of all other peptides, even though it didnt have the highest ipTM score, it got 0.47 while Peptide 1 got 0.54 which proves structural prediction and Shape complementarity is not the only factor affecting Protein-Peptide Binding.

So with that, i have decided to proceed with Peptide 4 WRYGVVGVRLWE since it has a higher Binding Affinity than all others

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Part 4: Generate Optimized Peptides with moPPIt

For this part i have decided to pick the amino acid residues from 2-6, since our mutation is A4V so i picked it with two amino acids upstream and 2 downstream to try to bind to this specific area.

Here is the moPPIt generated peptides

IndexPeptideHemolysisSolubilityAffinityMotif
6REYDQKQICKKL0.94282174480.83333331357.0317659380.8390983939
7KKSKKQKELTCG0.98300009030.91666668656.9224300380.7878084183
8IQQWETKGKRLK0.96037480980.75000000005.6749410630.5862649679

I gave them indexes starting from 6 to account for the previously generated 4 peptides and the control one

To level the playing field for comparison, i decided to take those Peptides, and run them through AlphaFold and PeptiVerse and compare them with the pepMLM ones.

Peptide 6 (REYDQKQICKKL)

AlphaFold Prediction of Peptide 6 AlphaFold Prediction of Peptide 6

Peptide 6 seems to be floating away from the protein but slightly close to the N-Terminus region where our A4V mutation is, but it is not binded to it and the ipTM score is 0.48 so it falls in the failed prediction zone

PeptiVerse Property Prediction of Binder 6 PeptiVerse Property Prediction of Binder 6

Peptide 6 has a very high solubility score and is also non hemolytic so it is generally safe, and it shows a medium binding affinity, The results also show that it has a isoelectric point of 9.11 explaining its positive net charge (1.75), hydrophilic (-1.72) so it binds well to hydrophilic or polar pockets of the protein, and has a molecular weight of 1551.8 Daltons.

Peptide 7 (KKSKKQKELTCG)

AlphaFold Prediction of Peptide 7 AlphaFold Prediction of Peptide 7

Peptide 7 seems to be floating away from the protein and also very slightly close to the N-Terminus and the mutation region as well, yet again it is not binded to it and the ipTM score is 0.38 so it falls in the failed prediction zone

PeptiVerse Property Prediction of Binder 7 PeptiVerse Property Prediction of Binder 7

Peptide 6 has a very high solubility score and is also non hemolytic so it is generally safe, and it shows a weak binding affinity, The results also show that it has a isoelectric point of 9.87 explaining its positive net charge (3.75), hydrophilic (-1.84) so it binds well to hydrophilic or polar pockets of the protein, and has a molecular weight of 1377.7 Daltons.

Peptide 8 (IQQWETKGKRLK)

AlphaFold Prediction of Peptide 8 AlphaFold Prediction of Peptide 8

Peptide 8 seems to be floating away from the protein and also very far away from to the N-Terminus and the mutation region (which is weird), and again it is not binded to it and the ipTM score is 0.45 so it falls in the failed prediction zone

PeptiVerse Property Prediction of Binder 8 PeptiVerse Property Prediction of Binder 8

Peptide 6 has a very high solubility score and is also non hemolytic so it is generally safe, and it shows a weak binding affinity, The results also show that it has a isoelectric point of 10.29 explaining its positive net charge (2.76), hydrophilic (-1.70) so it binds well to hydrophilic or polar pockets of the protein, and has a molecular weight of 1514.8 Daltons.


While pepMLM generates peptides randomly and tries to somehow bind to the protein through the protein-protein interactions it has learned, moPPIt seems to follow a more Rational Design when designing those peptides and tries to target a specific region/residue to achieve the binding there. With that said, it seems like all my generated peptides have failed surpassing the AlphaFold threshold of an ipTM score of at least more than 0.6 and most of them seem to show a weak binding affinity on PeptiVerse except for Peptides 4 and 6 which actually showed medium binding affinity.


Evaluation of the generated peptides before clinical studies is very crucial and it requires bridging the massive gap between computational biology and actual human physiology, first these peptides need to be synthesized and then tested on human cell cultures we mainly need to confirm that it does its expected job which is preventing SOD1 from misfolding or aggregating and most importantly not to kill the cell in the process, Then we may test these on In Vivo Animal Models like an ALS Mouse Model, to monitor if these peptides can actually cross the Blood Brain Barrier (BBB) and reach the affected Neurons.

Additional Tests and Validations may include Immunogenetic Testings to make sure the Human Body doesn’t mark this synthetic peptide as a foreign invader, which can trigger a dangerous immune response or cause the body to generate neutralizing antibodies against the drug, also we may need to check if it needs chemical modifications to survive cellular proteases since peptides can be an easy target for them and this negatively affects the peptide’s half life.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Part B: BRD4 Drug Discovery Platform Tutorial (Gabriele)

Boltz JQ1 Output

Boltz JQ1 Output Boltz JQ1 Output

Boltz Hit Output

Boltz Hit Output Boltz Hit Output

Boltz Lead Output

Boltz Lead Output Boltz Lead Output
CompoundBinding ConfidenceOptimization ScoreStructure Confidence
Hit0.420.210.97
Lead0.760.260.98
JQ10.970.440.98

Boltz Predicted Pocket Structure

Boltz Pocket Structure Boltz Pocket Structure

Boltz AI Generated Binders

Boltz AI Generated Binders Boltz AI Generated Binders
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Part C: Final Project: L-Protein Mutants

Option 1: Mutagenesis

Here is the Pictures of the Clustal Omega Multiple Sequence Alignment (MSA) Output

Clustal Omega Multiple Sequence Alignment (MSA) Output for the MS2 L Protein Part 1 Clustal Omega Multiple Sequence Alignment (MSA) Output for the MS2 L Protein Part 1 Clustal Omega Multiple Sequence Alignment (MSA) Output for the MS2 L Protein Part 2 Clustal Omega Multiple Sequence Alignment (MSA) Output for the MS2 L Protein Part 2

It seems like these Positions [21,25,28-29,33,35-37,40] have an “*” which means these have not changed at all and most probably are a totally conserved crucial region that should not be mutated, Positions [17,26,30] have a “:” which means they are highly conserved, but mutations that are very similar in shape, structure and chemical properties are tolerated, the rest seem to be more flexible.

Interesting Finding here, All the conserved regions seem to be in the Soluble Domain (1-40) that is responsible for DnaJ Interaction :D

This is the generated L Protein Mutation Heatmap

L Protein Mutation Heatmap L Protein Mutation Heatmap

I tried to cross reference with the experimental data sheet but it was quite hard to do manually so i wrote a script and ran it on Here on Colab to just find the common ones across the two files and came up with 3 common ones, however even though these 3 had quite a high LLR Score, the experimental data showed a Lysis score of 0, and i believe this shows the limitation in the structure based modeling, it predicted structural stability but failed to account for the functional mechanism of lysis.

Common Mutations Common Mutations

The common mutations identified by my algorithm

So Then i Filtered my generated mutations first to target the soluble domain (1-40) first while making sure to avoid the totally conserved regions [21,25,28-29,33,35-37,40], the highly conserved regions [17,26,30] and my 3 common matches to avoid choosing mutants that are either disrupting the protein or have been already experimented on with no lysis observed.

The two mutations i have decided to go with for the Soluble Domain are:

IndexPositionWild_Type_AAMutation_AALLR Score
139YL2.24177968502044
29SQ2.01432478427886

These have the highest LLR score in this region and avoid any conservative regions in the protein

Then i repeated the steps again, this time targeting the transmembrane domain (41-75) and followed the same criteria, and i picked these 2 mutations:

IndexPositionWild_Type_AAMutation_AALLR Score
350KL2.56146776676178
453NL1.86493206024169

And Again because These have the highest LLR score in this region and avoid any conservative regions in the protein

For the Final 5th mutant i decided to pick this one:

IndexPositionWild_Type_AAMutation_AALLR Score
552TL1.81396758556365

Because this one still has a relatively high LLR Score and is in the area where the L Protein does not overlap with either the Coat Protein or the replicase ones.

so here is the full 5 Mutations table i chose:

IndexPositionWild_Type_AAMutation_AALLR Score
139YL2.24177968502044
29SQ2.01432478427886
350KL2.56146776676178
453NL1.86493206024169
552TL1.81396758556365

AlphaFold Multimer Runs

Mutant 1 (Y39L)

The Monomer Sequence: METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLLVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT

The Multimer Sequence (the one i used for Alphafold Multimer): METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLLVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLLVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLLVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLLVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLLVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLLVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLLVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLLVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT

Mutant 1 Multimer Front View Mutant 1 Multimer Front View Mutant 1 Multimer Side View Mutant 1 Multimer Side View

The L Proteins do form a cylinder like shape mimicking the transmembrane pore but the piDDT score are very low (<50) so this probably won’t express well or won’t express at all if done in wet lab :(.

Mutant 1 Multimer Predicted Aligned Error (PAE) Mutant 1 Multimer Predicted Aligned Error (PAE)

This is the Predicted Aligned Error (PAE), i asked AI (Gemini) how to interpret this and it said this:

Gemini’s Interpretation of Predicted Aligned Error (PAE) Plots Gemini’s Interpretation of Predicted Aligned Error (PAE) Plots

So following this, it seems like each monomer folds correctly with high confidence yet their together-grouping is not reliable at all with very low confidence scores possibly hinting that the pore forming shape we saw might not actually happen (i hope i understood that correctly XD)

Mutant 2 (S9Q)

The Monomer Sequence: METRFPQQQQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT

The Multimer Sequence (the one i used for Alphafold Multimer): METRFPQQQQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQQQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQQQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQQQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQQQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQQQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQQQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQQQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT

Mutant 2 Multimer Front View Mutant 2 Multimer Front View Mutant 2 Multimer Side View Mutant 2 Multimer Side View

Same exact findings for this mutant too :(.

Mutant 2 Multimer Predicted Aligned Error (PAE) Mutant 2 Multimer Predicted Aligned Error (PAE)

Again, Same Here as well. (it is the same for all five mutants :(. )

Mutant 3 (K50L)

The Monomer Sequence: METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT

The Multimer Sequence (the one i used for Alphafold Multimer): METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSLFTNQLLLSLLEAVIRTVTTLQQLLT

Mutant 3 Multimer Front View Mutant 3 Multimer Front View Mutant 3 Multimer Side View Mutant 3 Multimer Side View Mutant 3 Multimer Predicted Aligned Error (PAE) Mutant 3 Multimer Predicted Aligned Error (PAE)

Mutant 4 (N53L)

The Monomer Sequence: METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTLQLLLSLLEAVIRTVTTLQQLLT

The Multimer Sequence (the one i used for Alphafold Multimer): METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTLQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTLQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTLQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTLQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTLQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTLQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTLQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTLQLLLSLLEAVIRTVTTLQQLLT

Mutant 4 Multimer Front View Mutant 4 Multimer Front View Mutant 4 Multimer Side View Mutant 4 Multimer Side View Mutant 4 Multimer Predicted Aligned Error (PAE) Mutant 4 Multimer Predicted Aligned Error (PAE)

Mutant 5 (T52L)

The Monomer Sequence: METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFLNQLLLSLLEAVIRTVTTLQQLLT

The Multimer Sequence (the one i used for Alphafold Multimer): METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFLNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFLNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFLNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFLNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFLNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFLNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFLNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFLNQLLLSLLEAVIRTVTTLQQLLT

Mutant 5 Multimer Front View Mutant 5 Multimer Front View Mutant 5 Multimer Side View Mutant 5 Multimer Side View Mutant 5 Multimer Predicted Aligned Error (PAE) Mutant 5 Multimer Predicted Aligned Error (PAE)

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

I Actually decided to pick a mutant from the Experimental Data Sheet and made sure it has been proven to have the Lysis Effect, my aim is to try to perform the AlphaFold Multimer step for it, to have a look at how different it might be, the mutant i picked was R30Q.

This is the monomer sequence: METRFPQQSQQTPASTNRRRPFKHEDYPCQRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT

This is the full Alphafold sequence i used: METRFPQQSQQTPASTNRRRPFKHEDYPCQRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCQRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCQRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCQRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCQRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCQRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCQRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCQRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT

Experimental Mutant Multimer Front View Experimental Mutant Multimer Front View Experimental Mutant Multimer Side View Experimental Mutant Multimer Side View Experimental Mutant Multimer Predicted Aligned Error (PAE) Experimental Mutant Multimer Predicted Aligned Error (PAE)

Well i got bad news and good news, the bad news is, the structre is also of very low confidence and the Predicted Aligned Error (PAE) Plot shows the same low confidence for the monomer’s interaction with each othe, the good news though is that this is the run of a mutant that has been experimentally validated and has the lysis effect and protein level determined, so maybe this gives me hope that my five mutants might actually stand a chance in a wet lab validation regardless of the very low confidence scores it got.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Week 6 HW: Genetic Circuits Part i

cover image cover image

𓃠 Week 6 Homework 𓃠

Assignment: DNA Assembly

1. What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?

Phusion High-Fidelity PCR Master Mix is basically a pre-optimized solution containing every chemical component needed for DNA amplification except for the DNA template and the primers.

It mainly consistes of:

  1. Phusion High-Fidelity DNA Polymerase: This is the main enzyme that synthesizes the new DNA strand
  2. Deoxynucleotide Triphosphates (dNTPs): These are the nucleotide building blocks that the DNA Polymerase use to synthesize the new DNA strand
  3. Magnesium Chloride (MgCl2): This is an essential cofactor for the DNA polymerase, helps it with stabilizing the negatively charged alpha-phosphate of the dNTPs, reducing electrostatic repulsion and allowing primers to anneal more effectively to the template
  4. Reaction Buffers: These maintain a stable chemical environment for the DNA Polymerase, it includes:
    1. High Fidelty Buffer: The default buffer optimized to provide the highest possible sequence accuracy.
    2. GC Buffer: This buffer is has special additives additives to help denature high GC content templates
    3. pH Buffer: This buffer includes buffering agents that keep the reaction at a specific pH (8.8–9.3) to prevent DNA damage and maintain enzyme activity.
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

2. What are some factors that determine primer annealing temperature during PCR?

The main two factors are the GC content of the primer, the more Gs and Cs, the higher the temperature, also the Primer Length, the longer the primer the more hydrogen bonds holding the strand and the higher the chance there will be GC regions too, one more factor is salt concentrations, High salt concentrations neutralize the negatively charged DNA backbone, stabilizing the bonds in the process and raising the melting temperature.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

3. There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.

They both do generate linear DNA, but the processs is different:

  • PCR generates the linear fragment through building DNA from scratch, you can pick the starting area you want through primer design, the end product usually has a blunt end and you end up with a huge amount of your target area since PCR also has exponential amplification
  • Restriction Enzyme Digests however generates the fragment through cutting existing DNA, you usually cant pick the starting area since each restriction enzyme has a specific area where it can make a cut, and depending on the used enzyme, you might end up with sticky or blunt ends, and you usually end up with just the amoung of your target fragment that depends on the starting amount.
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

4. How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?

For Gibson Assembly the main thing you need to focus on is having overlapping sequences (20-40 bp) on both ends of your target DNA with whatever backbone it is going to be inserted in, so that the Gibson mix can chew back these overlaps and anneal them and start stitching them together.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

5. How does the plasmid DNA enter the E. coli cells during transformation?

There is two ways:

  1. Chemical Transformation (Heat Shock): This is the most common method and is done using Calcium Chloride where the rapid heat change causes thermal imbalance across the membrane, causing pores to open in the bacterial cell wall so that the plasmid can enter.
  2. Electroporation: This method uses a quick pulse of electricity and is more efficient for large plasmid, the idea is similar, High voltage electric pulses are sent and it disrupts the cell membrane causing pores so the plasmid can enter.
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

6. Describe another assembly method in detail (such as Golden Gate Assembly)

I will talk in depth about Gibson Assembly

So Gibson Assembly is a molecular cloning method that allows seamless joining of multiple DNA fragments in a one pot reaction without requiring restriction enzymes or leaving behind scar sites, and it all mainly depends on having overlapping ends (20-40 bp) with the fragment next to it, and these overlapping ends can be designed using PCR primers.

Here is how the One Pot reaction happens:

  1. The Gibson Master Mix is added and the temperature is set to around 50 degrees
  2. T5 Exonuclease chews back the DNA from the 5’ ends exposing the overlaps we designed, this now makes them stick together and anneal
  3. Phusion DNA Polymerase then fills those gaps using the overlapping strand as a template
  4. Taq DNA Ligase it seals the gaps and nicks between the sugar-phosphate backbone to create a single continous strand of DNA

Gibson Assembly is very good in that it is Fast, Seamless and can assemble up to 15-20 fragments at once and doesnt care about internal restriction sites, however the overlapping primers is expensive, and the reaction can fail if the overlaps create stable hairpins or secondary structures

Benchling Workflow

I will do a simple experiment where i stitch a GFP Protein to a plasmid.

First we need to retreive the GFP DNA sequence from ENA.

This is the Link ot the Fasta file and Here is the DNA sequence: TACACACGAATAAAAGATAACAAAGATGAGTAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCACAAATTTTCTGTCAGTGGAGAGGGTGAAGGTGATGCAACATACGGAAAACTTACCCTTAAATTTATTTGCACTACTGGAAAACTACCTGTTCCATGGCCAACACTTGTCACTACTTTCTCTTATGGTGTTCAATGCTTTTCAAGATACCCAGATCATATGAAACAGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGGAAAGAACTATATTTTTCAAAGATGACGGGAACTACAAGACACGTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATAGAATCGAGTTAAAAGGTATTGATTTTAAAGAAGATGGAAACATTCTTGGACACAAATTGGAATACAACTATAACTCACACAATGTATACATCATGGCAGACAAACAAAAGAATGGAATCAAAGTTAACTTCAAAATTAGACACAACATTGAAGATGGAAGCGTTCAACTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCCACACAATCTGCCCTTTCGAAAGATCCCAACGAAAAGAGAGACCACATGGTCCTTCTTGAGTTTGTAACAGCTGCTGGGATTACACATGGCATGGATGAACTATACAAATAAATGTCCAGACTTCCAATTGACACTAAAGTGTCCGAACAATTACTAAAATCTCAGGGTTCCTGGTTAAATTCAGGCTGAGATATTATTTATATATTTATAGATTCATTAAAATTGTATGAATAATTTATTGATGTTATTGATAGAGGTTATTTTCTTATTAAACAGGCTACTTGGAGTGTATTCTTAATTCTATATTAATTACAATTTGATTTGACTTGCTCAAA

Now Let’s Create the DNA Entry for GFP on Benchling

Creating GRP_DNA in Benchling Creating GRP_DNA in Benchling

Now we need to find a proper plasmid to use, Gemini suggested i use pUC19 so i looked it up on Addgene

pUC19 Addgene Entry pUC19 Addgene Entry

It is a High Copy cloning vector, 2686 bp backbone, with Ampicillin resistance for selection

pUC19 Addgene Properties pUC19 Addgene Properties

using the Addgene Link https://www.addgene.org/50005/ for pUC19 i imported it into Benchling

Importing Plasmid on Benchling Importing Plasmid on Benchling

The Plasmid Looked cool with its annotations and was easily readible

Plasmid Annotations on Benchling Plasmid Annotations on Benchling

so i decided to go in and annotate my GFP too, and since it was just the GFP coding sequence, the process was striaghforward, i just selected it all and named the annotation GFP CDS.

Plasmid Annotations on Benchling Plasmid Annotations on Benchling Plasmid Annotations on Benchling Plasmid Annotations on Benchling

Now we need to cut and Linearize our vector to do Gibson Assembly, so i chose these two restricion enzymes, they are BamHI and BstAPI, why? because these only cut once and inside the Multiple Cloning Site (MCS)

Plasmid Restriction Enzymes Plasmid Restriction Enzymes

And this is the result of the digest, we have two parts, the discarded small part (239 bp) and our linearized backbone (2447 bp)

Plasmid Digest Plasmid Digest

And Now we do Gibson Assembly, first through the assembly wizard

Gibson Assembly Step 1 on Benchling Gibson Assembly Step 1 on Benchling

then we select our linearized backbone as the “Backbone”

Gibson Assembly Step 2 on Benchling Gibson Assembly Step 2 on Benchling

then we select all of our GFP as the “insert”

Gibson Assembly Step 3 on Benchling Gibson Assembly Step 3 on Benchling

and tadaaaa! here is our final construct

Gibson Assembly Step 4 on Benchling Gibson Assembly Step 4 on Benchling

Also Benchling has generated the Forward and Reverse PCR Primers needed for the Reaction, here is the forward primer

caggtcgactctagaggatcTACACACGAATAAAAGATAACAAAG Gibson Assembly FWD Primer Gibson Assembly FWD Primer

and here is the reverse primer ttgtactgagagtgcaccatTTTGAGCAAGTCAAATCAAATTG Gibson Assembly REV Primer Gibson Assembly REV Primer

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Assignment: Asimov Kernel

Here is the represilator workflow, i tried to follow the same model that Traci did in the Lecture

Asimov Kernel Represilator workflow Asimov Kernel Represilator workflow Asimov Kernel Represilator workflow Asimov Kernel Represilator workflow Asimov Kernel Represilator workflow Asimov Kernel Represilator workflow Asimov Kernel Represilator workflow Asimov Kernel Represilator workflow

This was the simulation i got which is totally different than the Bacterial Demos one :(

Asimov Kernel Represilator workflow Asimov Kernel Represilator workflow

I thought i did it wrong so i looked up this represilator visual

Represilator Visual Represilator Visual

i tried switching up my components to match it but as shown here, the graph is still the same :(

Asimov Kernel Represilator workflow Asimov Kernel Represilator workflow
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Construct 1: Constitutive GFP Circuit

This is a very simple GFP producing genetic circuit with a constitutive promoter for continous production

Asimov Kernel Constitutive GFP Circuit workflow Asimov Kernel Constitutive GFP Circuit workflow
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Construct 2: Inducible RFP Circuit

This is an Arabinose controlled RFP producing genetic circuit, in the simulation graph here, i added L-Arabinose at hour 12 so show the increasing production of RFP starting from this point

Asimov Kernel Inducible RFP Circuit Simulation Settings Asimov Kernel Inducible RFP Circuit Simulation Settings Asimov Kernel Inducible RFP Circuit workflow Asimov Kernel Inducible RFP Circuit workflow

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Construct 3: GFP-RFP Operon

This genetic circuit is an Operon of both GFP and RFP, using a single promotor and a single terminator and in between are both the GFP CDS and its RBS and the RFP CDS with its RBS, to make the simulation graph clearer in the production concentrations, i gave the RFP a weaker RBS than the GFP one so now in the graph you can see the protein concentration of RFP is a bit less than GFP

Asimov Kernel GFP-RFP Operon workflow Asimov Kernel GFP-RFP Operon workflow
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Week 7 HW: Genetic Circuits Part ii

cover image cover image

𓃠 Week 7 Homework 𓃠

Assignment Part 1: Intracellular Artificial Neural Networks (IANNs)

1. What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?

IANNs are more suited for biology since they are not constrained with the digital “0 and 1” appraoch, but can follow an Analog appraoch that is more realistic and suited for biology, because biology is not in an On/Off stage but it varies with different values and expression levels and they are more flexible in that they help in designing more Decision Boundaries without having to create new parts from the scratch and as we have seen with the Neuromorphic wizard, it can also tap and work in Advanced areas like “Dual Region” zones where a cell activates if inputs are strictly below a threshold Or strictly above a threshold, but remains totally inactive in the intermediate zone

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

2. Describe a useful application for an IANN; include a detailed description of input/output behavior, as well as any limitations an IANN might face to achieve your goal.

Useful Applications for IANNs include that it can detect multiple inputs with varying ranging differencies, allowing it to sense and act on subtle expression differences and The network’s decision boundaries can be uniquely tuned to trigger only when the inputs fall into a highly specific area or threshold matching the random, analog nature of biology and on the other hand the network would make no action or keep the outputs at 0 when outside the decision boundaries

Limitations include that using endoribonucleases (ERNs) that bound to the RNA to prevent translation can be biologically expensive since it doesn’t destroy the RNA, it just holds on to it for a long while until it degrades, another one is that the bigger the circuits and the logic gets, probably the more unique ERNs you will need, and this can be tough but it is very important to prevent different branches of the network from accidentally destroying the wrong RNA targets

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

3. Draw a diagram for an intracellular multilayer perceptron where layer 1 outputs an endoribonuclease that regulates a fluorescent protein output in layer 2.

Neuromorphic Circuit Canva Neuromorphic Circuit Canva
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Lab (Week 7) β€” Neuromorphic Circuits

For the Lab, After spending countless hours trying to understand how these work and the different components and how to interpret them on the graph, I created this summary to explain what i understood exactly.

Our two main building blocks are colors (positive weights) and Endoribonucleases or ERNs (negative weights). An ERN’s main job is to degrade whatever RNA has a motif/tag that is specific to it. So in the Part Names excel file, you will see parts named like CasE_rec_mNeonGreen. This means that this is an RNA instruction to produce the mNeonGreen protein, but it has a target tag for CasE. If CasE is present, it sticks to that tag and cleaves the RNA, preventing the translation of mNeonGreen. The same concept follows for the other parts as well!

Then comes the prediction/graph part. You can split the graph into 4 quadrants to make it easier to understand. The Upper Left is where our X2 circuit is highly expressed and X1 circuit is not. The Upper Right is where both X2 and X1 are highly expressed. The Lower Right is where X1 is expressed and X2 is not, and the Lower Left is where neither are expressed.

Here is a simple example to clarify everything:

  • In X1, we put CasE_rec_mNeonGreen: This is supposed to produce the mNeonGreen protein whenever CasE is not available.
  • In X2, we put CasE: This means in areas where X2 is available, the CasE will destroy the green RNA, so we won’t see the mNeonGreen protein.

If you plot this, because X1 brings the color and X2 brings the ERNs, the cells will only glow in the Lower Right quadrant (where X1 is high and X2 is low)!

Simple Circuit Setup Simple Circuit Setup Simple Circuit Graph Simple Circuit Graph


After That I tried to create this Circuit, my aim was to produce a graph that is totally blue on the right and fades to white on the left

Second Circuit Setup Second Circuit Setup

This is my Setup, let me explain it It relies on two main biases:

  • PgU_rec_mNeonGreen which produces the Green color all over the graph and has the PgU tag
  • Csy4_rec_PgU which cleaves the Green color RNA that has the PgU tag

so far we are supposed to have nothing produced since our second bias Csy4_rec_PgU sequesters our first bias PgU_rec_mNeonGreen, then comes in our X1 and X2.

Remember X1 controls the right half of the graph, which is the part we want filled with color, so X1 should probably break our second bias Csy4_rec_PgU to let our first bias PgU_rec_mNeonGreen produce the color

  • X1: CasE_rec_Csy4 this sequesters our second bias Csy4_rec_PgU so now the color is available on the right side

This effectively produces our desired graph, nothing else is needed but since we need to have an X2, we can make it so that it stops X1 letting the second bias be alive again Csy4_rec_PgU to stop the coloring, but we need to do it in an equal way so that X2 doesn’t overpower X1 and removes the color from the upper right quadrant

  • X2: CasE this effecitevly sequesters X1 on the left side and is as equal to it on the right side so it doesn’t whiten that part
Second Circuit Graph Second Circuit Graph

I hope i still remember how to explain this when the next BioClub Tokyo homework review comes lol XD

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Assignment Part 2: Fungal Materials

1. What are some examples of existing fungal materials and what are they used for? What are their advantages and disadvantages over traditional counterparts?

Fungal materials, such as mycelium, are currently used to create alternative leathers for fashion, luxury packaging, and household thermal insulation, Because they take the shape of the molds they grow in, they are also used to fabricate solid objects like planters, furniture, and building bricks for structures and remote habitats

Advantages include that they are incredibly lightweight and deeply insulating as Ren mentioned a thin sheet of mycelium can block the heat of a blowtorch and they are also sustainable and can be grown entirely on agricultural waste like wood chips, grain, or hay

Disadvantages include Vulnerability and Contamination during growth and deformation which is to make the final material safe to handle (inert), it must be baked, which can cause the material to dehydrate and shrink which might break the desired shape

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

2. What might you want to genetically engineer fungi to do and why? What are the advantages of doing synthetic biology in fungi as opposed to bacteria?

We Engineer Fungi for many reasons including Biomanufacturing therapeutics and enzymes, Grow sustainable materials and foods or building bricks and tools Advantages of fungi over bacteria includes fungi having secretion capacities 10 to 1,000 times higher than bacterial hosts, also they are able to survive harsh and extreme environmental conditions that would destroy bacterial cultures, including highly acidic pH levels, severe dryness, and toxic pollutant surges, also while bacteria typically require a continuous water phase to grow, filamentous fungi build intricate 3D networks of branching hyphae (mycelium) which in huge industrial bioreactors are more efficient to capture and degrade hydrophobic molecules that wet bacterial biofilms struggle to absorb

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Assignment Part 3: First DNA Twist Order

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Week 9 HW: Cell Free Systems

cover image cover image

𓃠 Week 9 Homework 𓃠

Homework Part A: General and Lecturer-Specific Questions

General homework questions

Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.

Cell-free protein synthesis is basically taking the protein-making machinery out of a living cell and running the process in a test tube. The biggest advantage over traditional in vivo methods (which use living cells) is that you do not have to worry about keeping a cell alive. When you work with living cells, the cell membrane blocks you from easily changing the environment, and the cell’s natural life-cycles get in the way.

By removing the cell entirely, you get an open system. This gives you much more flexibility and control over the experimental variables because you can easily add, remove, or change ingredients in the test tube without worrying about burdening the cell.

Because of this level of control, cell-free expression is much more beneficial than using live cells in a few specific cases like producing toxic proteins, If a protein is naturally toxic, trying to grow it in a living cell will just kill the cell before the process is finished, unlike cell-free systems that don’t have a life to lose, so it can successfully produce it, another example is using unnatural ingredients, If you want to build a protein using artificial or unnatural amino acids, living cells will usually reject them. In a cell-free test tube, you can easily force the system to accept and use these custom parts.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Describe the main components of a cell-free expression system and explain the role of each component.

A cell-free expression system is made up of five main components that work together to produce a protein outside of a living cell:

  • Cell Extract: This is the biological “machinery” (like ribosomes and enzymes) harvested from crushed cells. Its role is to actually read the instructions and assemble the protein.

  • Genetic Template (DNA or RNA): This acts as the blueprint. It provides the specific instructions for which protein the machinery needs to build.

  • Amino Acids: These are the raw materials or building blocks that the machinery links together to form the final protein.

  • Energy Source: The process requires power to run.

  • Reaction Buffer: This is a mixture of salts and chemicals (like magnesium) that maintains the correct pH and environment.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Why is energy provision regeneration critical in cell-free systems? Describe a method you could use to ensure continuous ATP supply in your cell-free experiment.

Energy regeneration is critical in cell-free systems because making proteins takes a massive amount of energy. If I just add a starting dose of ATP (the fuel) to my test tube, my experiment will fail quickly for two reasons:

  • I will run out of fuel almost instantly, causing the reaction to stop.

  • The “dead” fuel (ADP) builds up in the test tube and actually acts like toxic exhaust, slowing down the machinery.

To make sure I have a continuous ATP supply in my experiment, I would use an energy regeneration system like creatine phosphate and creatine kinase. I like to think of this as a backup battery charger. I would add the high-energy chemical (creatine phosphate) and a specific enzyme (creatine kinase) to my test tube. As my machinery burns through the ATP and turns it into dead ADP, the enzyme automatically uses the creatine phosphate to recharge the ADP back into fresh ATP, keeping my experiment running smoothly.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Compare prokaryotic versus eukaryotic cell-free expression systems. Choose a protein to produce in each system and explain why.

When comparing cell-free expression systems, the main difference comes down to speed, cost, and the complexity of the protein we want to make.

  • Prokaryotic Systems (e coli): These are fast, cheap, and produce high yields of proteins. However, they lack the machinery to properly fold complex proteins or add necessary chemical modifications.

  • Eukaryotic Systems (yeast): These are slower and more expensive, yielding less protein. But, they contain the advanced machinery needed to perfectly fold complex proteins and add necessary modifications (like attaching sugar molecules).

My Protein Choices:

For the Prokaryotic System: i’d choose to produce Green Fluorescent Protein (GFP), because GFP is a very simple, standard protein that folds easily on its own. Since it does not require complex modifications, we can take advantage of the prokaryotic system’s speed and cheap cost to produce massive amounts of it quickly.

For the Eukaryotic System: I’d choose to produce a Human Antibody, since Antibodies are highly complex proteins that require intricate folding and specific chemical modifications (like glycosylation) to actually work. A simple bacterial system would build it incorrectly, so we must use the more advanced eukaryotic system to ensure it functions properly.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

How would you design a cell-free experiment to optimize the expression of a membrane protein? Discuss the challenges and how you would address them in your setup.

When designing a cell-free experiment to produce a membrane protein, the biggest challenge we face is that these proteins are highly hydrophobic (water-repelling). Because a standard cell-free reaction mixture is mostly water-based, newly made membrane proteins will naturally clump together and fold incorrectly to hide from the water, becoming completely useless.

To optimize our setup and solve this problem, we need to provide a fatty environment directly in the test tube so the protein has somewhere to go as soon as it is built. We would design the experiment by adding either Liposomes which are artificial bubbles of cell membrane into the liquid. As the machinery builds our membrane protein, it will insert it directly into the liposome wall, mimicking a natural cell, another option is detergennts, We can use specific detergents that wrap around the protein to shield it from the water, keeping it dissolved and properly folded.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Imagine you observe a low yield of your target protein in a cell-free system. Describe three possible reasons for this and suggest a troubleshooting strategy for each.

If we observe a low yield of our target protein in our cell-free system, it can be either

Reason 1: Our genetic template is degraded. The DNA or RNA instructions might be breaking down before the machinery can read them, and to solve it we would use higher-quality, purified DNA/RNA and add special chemicals (like RNase inhibitors) to the mix to protect the blueprint from being destroyed.

Reason 2: We ran out of energy or building blocks. The system might have burned through all its ATP or used up all the amino acids, here we would add an energy regeneration system (like creatine phosphate) to keep the power on, or by adding a fresh batch of amino acids mid-reaction.

Reason 3: The buffer conditions are wrong. The environment in the test tube, specifically the salt or magnesium levels, might not be ideal for our specific protein, We would set up a few small, test reactions with slightly different magnesium concentrations to find the exact “sweet spot” where our protein grows best.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Homework question from Kate Adamala

Function and Description

I am designing a synthetic cell that acts as an environmental sensor for contaminated drinking water.

Input: Arsenic.

Output: Red Fluorescent Protein (RFP).

Could this function be realized by cell-free Tx/Tl alone, without encapsulation?

No, if we just poured our raw cell-free protein machinery directly into a river or a cup of tested water, the machinery would instantly dilute into the water and be destroyed by the environment. The cell membrane acts as a protective shield, keeping our machinery concentrated and safe in its own little bubble (very similar to the example answer given in homeowrk)

Could this function be realized by a genetically modified natural cell?

Yes, but there is a major flaw with living cells here: arsenic is highly toxic. If the water sample is heavily contaminated, the arsenic will likely kill the living bacteria before they even have a chance to produce the red warning color. Because our synthetic cell is basically “dead” chemical machinery, it cannot be killed by the toxin.

Desired Outcome of Operation

When we drop our synthetic cells into a water sample, they will glow bright red if the water is contaminated with arsenic, acting as a clear visual warning.

Design Components

The Membrane: We will use a basic mix of POPC (a standard, easy-to-use lipid) and cholesterol to make the bubble stable.

Inside the Cell (Encapsulation): We will pack it with our cell-free Tx/Tl machinery, amino acids, and the specific genetic instructions (DNA) needed to sense arsenic and build the red protein.

Tx/Tl System Origin We will use a Bacterial (E. coli) system since bacterial systems are much cheaper and easier for a basic sensor like this.

Communication with the Environment Because arsenic ions are very small, we will rely on them being naturally permeable to our simple lipid membrane. The arsenic will slip through the fat bubble into the inside. The output (the red protein) is too big to escape, so it stays trapped inside, making the entire bubble light up red.

Experimental Details

Lipids: POPC and Cholesterol.

Enzymes: E. coli cell-free extract (the Tx/Tl machinery).

Genes: We will use the Pars promoter and arsR gene (genetic switch that reacts to arsenic) connected to the mCherry gene (RFP).

Measurement We will measure the function of our system using a fluorometer (a machine that measures specific colors of light) to track how much red fluorescence is being produced. For a quick visual check, we could even just shine a UV flashlight on the sample to see if it glows red!

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Homework question from Peter Nguyen

Idk why but my idea kinda reminds me of Detroit Become Human XD

Infection-Detecting Smart Bandage

We are developing a smart, wearable bandage that uses freeze-dried cell-free sensors to instantly change color when it detects an early infection in a wound.

How the Idea Will Work

We will embed freeze-dried, dormant cell-free machinery directly into the cotton fibers of a standard bandage. When a person puts the bandage on a cut, the natural moisture from the wound (blood or plasma) will soak into the fabric and “wake up” the machinery. If the cut becomes infected, the harmful bacteria will release specific chemical toxins. Our built-in cell-free sensors are programmed to detect those specific toxins; if they find them, the machinery will produce a brightly colored protein, turning the bandage blue to warn the user of the infection.

The Societal Challenge and Market Need

This directly addresses the massive global problem of delayed medical care and severe infections. Right now, most people do not know a wound is infected until it becomes hot, swollen, and dangerous, often requiring heavy antibiotics or a hospital visit. Our smart bandage gives patients an easy, visual warning at home or disaster araes so they can get a simple treatment before the problem gets worse.

Addressing the Limitations of Cell-Free Systems Cell-free systems have limits, but we can actually use them to our advantage in this design:

  • Activation with water: The reaction stays completely paused on the shelf and is only activated by the natural moisture of the wound itself (unfortunately water can provide perfect medium for bacteria to grow).

  • Stability: To keep the freeze-dried machinery stable for months or years, we simply package our smart bandages in airtight, moisture-proof foil wrappers (just like normal sterile bandages).

  • One-time use: A cell-free reaction eventually runs out of energy and stops, meaning it only works once. However so does normal bandages :D

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Homework question from Ally Huang

After having a discussion of the idea with Gemini, i was lazy :( and had it do the writing instead because i was really busy sooo i am sorry for that XD

On-Demand Parathyroid Hormone Production

Background Information (Maximum 100 words)

Astronauts lose a dangerous amount of bone mass during long spaceflights due to zero gravity. Taking pre-made liquid medicines from Earth is difficult because they are heavy, take up precious cargo space, and expire quickly due to harsh space radiation. We need a way to manufacture fresh medicines directly on the spaceship. This is scientifically interesting and absolutely critical for keeping astronauts healthy on long missions to Mars where supply drops are impossible.

Molecular or Genetic Target (Maximum 30 words)

The target is the gene for Parathyroid Hormone (PTH), which is a specific protein that tells the human body to build new bone mass.

Relation to Space Biology Challenge (Maximum 100 words)

PTH is a well-known protein medicine used on Earth to treat severe bone loss. By targeting the PTH gene, we can use the BioBits cell-free system as a mini medical factory. Instead of bringing fragile, expiring liquid medicine into space, we just bring the lightweight, freeze-dried DNA instructions for PTH. When astronauts need the medicine, they simply add water, and the BioBits system builds the fresh PTH protein on demand to treat their decaying bones.

Hypothesis and Research Goal (Maximum 150 words)

Our goal is to prove that a freeze-dried cell-free system (BioBits) can successfully manufacture a complex human medicine (PTH) in a zero-gravity environment. We hypothesize that the microgravity on the space station will not stop the cell-free machinery from correctly reading the DNA and assembling the protein. Because cell-free systems are just chemical reactions and do not rely on living cells (which often get stressed or behave unpredictably in space), we believe the protein-building process will work just as perfectly in space as it does on Earth.

Experimental Plan (Maximum 100 words)

On the space station, we will add water to two sets of freeze-dried BioBits tubes and warm them using the miniPCR machine.

  • Test Sample: BioBits containing the DNA instructions for the PTH bone medicine.

  • Positive Control: BioBits containing DNA for a red fluorescent protein.

After a few hours, we will look at the control tube using the P51 Viewer; if it glows red, we know the BioBits machinery survived the trip and is working. Finally, we will test our main sample using a simple protein test strip to confirm the fresh PTH medicine was successfully built.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Homework Part B: Individual Final Project

cover image cover image

Full Project Documentation

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Week 10 HW: Imaging and Measurement

cover image cover image

𓃠 Week 10 Homework 𓃠

Homework: Final Project

  • Ξ±-Pinene titer: quantity of our target molecule produced per chassis per carbon source. Measured by GC-MS with dodecane overlay extraction from 96-well deep plates.
  • AgPS protein identity and mass: purified AgPS-His6 protein (via Ni-NTA) can be characterized by intact protein LC-MS to confirm correct molecular mass and verify the His6 tag is present.
  • Cell growth (OD₆₀₀): optical density at 600nm across all chassis Γ— carbon source conditions. Measured on the PHERAstar FSX plate reader. Tells us if the strain is healthy while producing.
  • Gene expression (mRNA level): RT-qPCR on the CFX Opus machine using primers targeting AgGPPS2 and AgPS transcripts. Confirms the genes are being transcribed in vivo.
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Homework: Waters Part I β€” Molecular Weight

Theoritical MW of eGFP Theoritical MW of eGFP

The Theoritical Molecular weight of the eGFP Sequence is 28006.60

I have decided to pick two peaks from the Figure 1 that was provided in the homework:

  • m/z_n (Right Peak): 903.7148 (Lower charge)
  • m/z_{n+1} (Left Peak): 875.4421 (Higher charge)
HW provided image of mass spec of egfp HW provided image of mass spec of egfp

1. Charge State Calculation ($z$) $$ z = \frac{875.4421 - 1}{903.7148 - 875.4421} $$ $$ z = \frac{874.4421}{28.2727} $$ $$ z \approx 30.93 \rightarrow \mathbf{31} $$

2. Experimental Molecular Weight ($MW_{exp}$) $$ MW = 31 \times (903.7148 - 1) $$ $$ MW = 31 \times 902.7148 $$ $$ MW_{exp} = \mathbf{27,984.16 \text{ Da}} $$

3. Mass Accuracy (ppm) $$ \text{Accuracy} = \frac{|27,984.16 - 28,006.60|}{28,006.60} \times 1,000,000 $$ $$ \text{Accuracy} = \frac{22.44}{28,006.60} \times 1,000,000 $$ $$ \text{Accuracy} \approx \mathbf{801 \text{ ppm}} $$

Also the zoomed in part is blurry and unclear but i guess i cant identify the charge state anyways, the more i try to look it seems the numbers are so close together mostly 1473 and 1474 with lots of decimal changes in between

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Homework: Waters Part III β€” Peptide Mapping - primary structure

MVSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL VTTLTYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT LGMDELYKLE HHHHHH

The eGFP sequence has 19 Lysines (K) and 4 Arginines (R)

tryptic digestion of eGFP tryptic digestion of eGFP

19 peptides will be generated from the tryptic digestion of eGFP.

tryptic digestion of eGFP tryptic digestion of eGFP

i drew a blue line to mimic the >10% relative abundance but that is probably not exactly 10% XD but anyways, it gives exactly 19 peptides like what was predicted before so they match but if i did it exactly 10%, it would probably not match and they are more peptide fragments here than predicted

The most abundant peak is The peak eluting at 2.78 minutes and has an $m/z$ of 525.76, the spacing between isotopes is mostly 0.5 so the charge will be $1/0.5 = 2$

Single Charged Mass $$ \text{[M+H]}^+ = (z \times m/z) - (z - 1) $$ $$ \text{[M+H]}^+ = (2 \times 525.76) - 1 $$ $$ \text{[M+H]}^+ = 1051.52 - 1 $$ $$ \text{[M+H]}^+ = \mathbf{1050.52 \text{ Da}} $$

Experimental Mass = 1050.52 Da

The peptide perfect match would be 1050.5214 FEGDTLVNR

Mass Accuracy (ppm) $$ \text{Accuracy} = \frac{|MW_{\text{experiment}} - MW_{\text{theory}}|}{MW_{\text{theory}}} \times 1,000,000 $$ $$ \text{Accuracy} = \frac{|1050.52 - 1050.5214|}{1050.5214} \times 1,000,000 $$ $$ \text{Accuracy} = \frac{0.0014}{1050.5214} \times 1,000,000 $$ $$ \text{Accuracy} \approx \mathbf{1.33 \text{ ppm}} $$

Accuracy $\approx$ 1.33 ppm

88% of the eGFP sequence is confirmed by peptide mapping in Figure 6

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Homework: Waters Part IV β€” Oligomers

Polypeptide Subunit NameSubunit Mass
7FU340 kDa
8FU400 kDa

first we need to calculate the Mass of each of our oligomeric species

Decamer is 10 so didecamer is 20 and 3 decamer is 30 etc

  • A) 7FU Decamer: Mass = 10 x 340 kDa = 3400 kDa (3.4 Megadaltons (MDa))
  • B) 8FU Didecamer: Mass = 20 x 400 kDa = 8000 kDa (8 Megadaltons (MDa))
  • C) 8FU 3-Decamer: Mass = 30 x 400 kDa = 3400 kDa (12 Megadaltons (MDa))
  • D) 8FU 4-Decamer: Mass = 40 x 400 kDa = 3400 kDa (16 Megadaltons (MDa))
Graph of the KHL Oligomeric Speices Graph of the KHL Oligomeric Speices
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Homework: Waters Part V β€” Did I make GFP?

TheoriticalObservedPPM Mass Error
28006.6027,984.16801
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Week 11 HW: Building Genomes

cover image cover image

𓃠 Week 11 Homework 𓃠

Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork

I just want to say, this was soooo fun! :D

and i truly love how it changed alot and all the virtual pixel fun wars i’ve had and truces and teamups, it was a really engaging and lovely experience

Here is some screenshots i took.

Pixel Artwork Pixel Artwork

This was 6 hours before the deadline hit, i was responsible for the Big EGYPT section, and it was getting erased constantly and i was trying to rebuild it again and again haha. i also took part helping in the other parts as well like the 2026 and the LOVE HTGAA ones.

Pixel Artwork Pixel Artwork

This was a couple minutes before it ended, i was actually asleep because it was 5 am my time, but i set a special alarm to wake up 10 mins before it ends to make sure EGYPT still stands and it wasn’t but in place was MIT and i actually didn’t want to erase that, i love MIT for the HTGAA course, so i decided i’d go on a smaller EGY scale in the lower quadrant

Pixel Artwork Pixel Artwork

aaaand that was it :D. i really loved it all and to prove that, look at this! :D

Pixel Artwork Top Contributors Pixel Artwork Top Contributors

I am the Top 1 Contributor :D, and many thanks to Gustavo and Anastasia, i’ve had so much fun coloring with them at times and against them at other times as well :D it was a very fun experience :D.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Part B: Cell-Free Protein Synthesis | Cell-Free Reagents

Referencing the cell-free protein synthesis reaction composition (the middle box outlined in yellow on the image above, also listed below), provide a 1-2 sentence description of what each component’s role is in the cell-free reaction.

E. coli Lysate:

  • BL21 (DE3) Star Lysate: Provides the essential molecular machinery, including ribosomes, tRNAs, and initiation/elongation factors, while the Star mutation enhances mRNA stability by reducing RNase activity. It also contains T7 RNA Polymerase to drive high-level transcription from T7 promoter-based DNA templates.

Salts and Buffer

  • Potassium Glutamate: Acts as the primary cytoplasmic salt to maintain osmotic balance and stabilize protein-nucleic acid interactions.
  • HEPES-KOH pH 7.5: Serves as a buffering agent to maintain a stable pH for the optimal activity of the enzymes.
  • Magnesium Glutamate: Provides ions that act as essential cofactors for RNA polymerase and ribosomes, stabilizing their structures.
  • Potassium Phosphate (Monobasic and Dibasic): Maintains pH stability and provides inorganic phosphate, which can assist in the regeneration of nucleoside triphosphates (NTPs) and ATP.

Energy / Nucleotide System

  • Ribose and Glucose: Serve as energy sources that can be metabolized to regenerate ATP.
  • AMP, CMP, GMP, UMP: Function as the building blocks (mononucleotides) for RNA synthesis and are interconverted into their triphosphate forms (ATP, CTP, GTP, UTP) to power the reaction.
  • Guanine: Acts as a precursor for the synthesis of GTP, which is specifically required for the initiation and translocation steps of translation.

Translation Mix (Amino Acids)

  • 17 Amino Acid Mix, Tyrosine, and Cysteine: Provide the full set of 20 proteinogenic amino acids required as the fundamental building blocks for synthesizing the polypeptide chain.

Additives and Backfill

  • Nicotinamide: Often used as a precursor to NAD+, a cofactor that supports metabolic pathways involved in energy regeneration within the lysate.
  • Nuclease Free Water: Used as a “backfill” to bring the reaction to its final volume while ensuring the environment remains free of contaminating enzymes that could degrade DNA or RNA.
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Describe the main differences between the 1-hour optimized PEP-NTP master mix and the 20-hour NMP-Ribose-Glucose master mix shown in the Google Slide above. (2-3 sentences)

The main difference between the two systems lies in the energy sources and metabolic pathways: the 1-hour PEP-NTP mix uses high-energy phosphoenolpyruvate (PEP) and pre-formed nucleoside triphosphates (NTPs) for rapid, high-intensity protein synthesis. In contrast, the 20-hour NMP-Ribose-Glucose mix utilizes cheaper nucleoside monophosphates (NMPs) and sugars (Ribose/Glucose) to fuel a slower, sustained reaction that significantly reduces costs by regenerating energy over a longer period.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Bonus question: How can transcription occur if GMP is not included but Guanine is?

Transcription can still occur because the E. coli lysate contains endogenous enzymes (such as phosphoribosyltransferases) that salvage Guanine and convert it into GMP. Once converted to GMP, the energy system further phosphorylates it into GTP, the active nucleotide required by RNA polymerase for transcription.

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Part C: Planning the Global Experiment | Cell-Free Master Mix Design

Given the 6 fluorescent proteins we used for our collaborative painting, identify and explain at least one biophysical or functional property of each protein that affects expression or readout in cell-free systems. (Hint: options include maturation time, acid sensitivity, folding, oxygen dependence, etc) (1-2 sentences each)

  • sfGFP: Engineered for robust folding and high solubility, it provides a very rapid readout even when fused to other proteins or in less-than-ideal reaction conditions.
  • mRFP1: As the first truly monomeric RFP, it features a moderate maturation time (half-time ~50 minutes) but can be less photostable than newer variants, potentially leading to signal fading during long observations.
  • mKO2: This orange protein is highly dependent on oxygen for chromophore maturation; under low-oxygen (hypoxic) conditions typical of some sealed cell-free reactions, its fluorescence recovery is significantly slower.
  • mTurquoise2: Known for its exceptional brightness and high quantum yield, and requires approximately 66 minutes for maturation.
  • mScarlet_I: rapid maturation (half-time ~31 minutes), making it one of the fastest-developing red reporters.
  • Electra2: This blue fluorescent protein is designed for high intracellular brightness and exhibits a tendency to form puncta or aggregates in certain environments.
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Create a hypothesis for how adjusting one or more reagents in the cell-free mastermix could improve a specific biophysical or functional property you identified above, in order to maximize fluorescence over a 36-hour incubation. Clearly state the protein, the reagent(s), and the expected effect.

Boosting Chromophore Maturation

  • Target Protein: mKO2 or mScarlet-I
  • Reagent Adjustment: Increase Oxygen availability and add Nicotinamide (NAD+).
  • Why: Since orange and red chromophores require multiple oxidation steps to mature, increasing NAD+ and oxygen transfer will accelerate the rate-limiting maturation step, ensuring that the protein synthesized in the final 20 hours actually becomes fluorescent before the reaction ends.
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Part D: Build-A-Cloud-Lab | (optional) Bonus Assignment

This is very cool, but i probably did a very bad job haha :D, i will use this chance though to learn more about the different parts of these machines :O

Cloud Lab Map Cloud Lab Map
π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Week 12 HW: Bioproduction

cover image cover image

𓃠 Week 12 Homework 𓃠

No Homework :D Working on Final Project :D

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Week 13 HW: Bio Design Living Materials

cover image cover image

𓃠 Week 13 Homework 𓃠

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

Week 14 HW: Biofabrication

cover image cover image

𓃠 Week 14 Homework 𓃠

π“‹Ή π“‹Ή π“‹Ή π“‹Ή π“‹Ή

BioClub Committed Listener MoU

HTGAA Committed Listener (CL) Agreement

I am a HTGAA Committed Listener, my responsibilities are:

  • Watching class lectures and recitations
  • Participating in node reviews
  • Developing and documenting my homework
  • Actively communicating with other students and TAs on the forum
  • Allowing HTGAA and BioClub to share my work (with attribution)
  • Honestly reporting on my work, and appropriately attributing and citing the work of others (both human and non-human)
  • Following locally applicable health and safety guidance
  • Promoting a respectful environment free of harassment and discrimination

Signed by committing this file to my documentation page/repository,

Nourelden Rihan

2.3.2026