Jason Ross — HTGAA Spring 2026

Andromeda

About me

Passionate about space-based biotechnology applications and strengthening biosecurity. Excited to grow my technical acumen through this course!

Contact info

Please feel free to reach out via the HTGAA Discourse forum at 2026a-jason-ross!

Homework

Labs

Projects

Subsections of Jason Ross — HTGAA Spring 2026

Homework

Weekly homework submissions:

  • Week X HW: Test

    Hello 1 World ↩︎

  • Week 1 HW: Principles and Practices

    A glorious space phage (with artistic license lol) Week 1 Biological Engineering Application Governance Exercise I’m interested in developing phage chassis capable of targeting bacteria that commonly cause infections during spaceflight. I’m interested in developing these phage chassis because:

  • Week 2 HW: DNA Read, Write, and Edit

    Part 0: Basics of Gel Electrophoresis Per instructions for this part, I attended the 02/10 lecture and 02/11. Additionally I attended all 3 Bootcamp sessions. Part 1: Benchling & In-silico Gel Art Make a free account at benchling.com Benchling Account Creation Confirmation

  • Week 3 HW: Lab Automation

    The Power of Lab Automation Assignment: Python Script for Opentrons Artwork 0: Attended this week’s recitation and reviewed the lab information on programming Opentrons 1: Generated an artistic design using Ronan’s Opentrons GUI 1 2: Artistic Design Python Script: See script in URL below: https://colab.research.google.com/drive/1-pgSJt_aF9MydtG0szxz2YKoogNRLRhH#scrollTo=PsOgJ2DndZzt 3: Listing my sfgfp point coordinates from Ronan’s Opentrons GUI below (the shape is a rightward-facing green arrow): [(6.6,11), (8.8,11), (11,11), (8.8,8.8), (11,8.8), (13.2,8.8), (11,6.6), (13.2,6.6), (15.4,6.6), (13.2,4.4), (15.4,4.4), (17.6,4.4), (15.4,2.2), (17.6,2.2), (19.8,2.2), (17.6,0), (19.8,0), (22,0), (-22,-2.2), (-19.8,-2.2), (-17.6,-2.2), (-15.4,-2.2), (-13.2,-2.2), (-11,-2.2), (-8.8,-2.2), (-6.6,-2.2), (-4.4,-2.2), (-2.2,-2.2), (0,-2.2), (2.2,-2.2), (4.4,-2.2), (6.6,-2.2), (8.8,-2.2), (11,-2.2), (13.2,-2.2), (15.4,-2.2), (17.6,-2.2), (19.8,-2.2), (22,-2.2), (24.2,-2.2), (-22,-4.4), (-19.8,-4.4), (-17.6,-4.4), (-15.4,-4.4), (-13.2,-4.4), (-11,-4.4), (-8.8,-4.4), (-6.6,-4.4), (-4.4,-4.4), (-2.2,-4.4), (0,-4.4), (2.2,-4.4), (4.4,-4.4), (6.6,-4.4), (8.8,-4.4), (11,-4.4), (13.2,-4.4), (15.4,-4.4), (17.6,-4.4), (19.8,-4.4), (22,-4.4), (24.2,-4.4), (26.4,-4.4), (-22,-6.6), (-19.8,-6.6), (-17.6,-6.6), (-15.4,-6.6), (-13.2,-6.6), (-11,-6.6), (-8.8,-6.6), (-6.6,-6.6), (-4.4,-6.6), (-2.2,-6.6), (0,-6.6), (2.2,-6.6), (4.4,-6.6), (6.6,-6.6), (8.8,-6.6), (11,-6.6), (13.2,-6.6), (15.4,-6.6), (17.6,-6.6), (19.8,-6.6), (22,-6.6), (24.2,-6.6), (17.6,-8.8), (19.8,-8.8), (22,-8.8), (15.4,-11), (17.6,-11), (19.8,-11), (13.2,-13.2), (15.4,-13.2), (17.6,-13.2), (11,-15.4), (13.2,-15.4), (15.4,-15.4), (8.8,-17.6), (11,-17.6), (13.2,-17.6), (6.6,-19.8), (8.8,-19.8), (11,-19.8)]

  • Week 4 HW: Protein Design Part 1

    South American Rattlesnakes (Crotalus durissus terrificus) with Crotamine protein

    Part A: Conceptual Questions

    How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)

    I take in approx. 5 × 10²³ amino acid molecules when eating 500 grams of meat. This is based on results indicating I ingest approx. 10²¹ amino acid molecules per gram of meat.

    Why do humans eat beef but do not become a cow, eat fish but do not become fish?

    Humans eat beef but don’t become cattle, and eat fish but don’t become fish, because genetic information from the lifeform being ingested isn’t transferred wholesale. Much of the genetic material being eaten is broken down during digestion and, more importantly, a human being’s cells follow instructions derived from their own DNA. Human cells use amino acids from the lifeform being ingested, but they do so according to their own specific genetic instructions. The lifeform being ingested and its amino acids are the raw materials the cell uses for various ends.

    Why are there only 20 natural amino acids?

    There are several broad reasons why there are only 20 standard natural amino acids. The first reason is that early on in the history of evolution, this group of amino acids became more or less ‘locked in’: once the basic relationship between three-letter codons and these 20 standard amino acids became widely distributed across the kingdoms of life, it became too risky from an evolutionary standpoint to alter this core set. Another reason is that the group of 20 gives enough range in structure and chemistry to build a large share of what evolution or directed evolution might require. The other reasons seem to amount to various types of evolutionary trade-offs. Adding more than 20 amino acids to this standard set would add potentially unwanted complexity, while shrinking the set might leave too little distinction between side chains, which would in turn limit the functional flexibility proteins need to do things like fold precisely.

    Can you make other non-natural amino acids? Design some new amino acids.

    Yes, you can. My attempts to design some new amino acids using SwissSideChain and the Cryo-EM structure of Receptor Tyrosine Kinase ROS1 PDB file in PyMOL (open-source) are shown below:

    Attempt at creating a non-natural amino acid residue mutation of Tyrosine Kinase ROS1 using cyclohexanecarboxylic acid

  • Week 5 HW: Protein Design Part 2

    Using AlphaFold for Protein Optimization Part A: SOD1 Binder Peptide Design Part 1: Generate Binders with PepMLM Retrieved human SOD1 sequence via UniProt (see photo below). Introduced A4V mutation via Gemini prompt (see sequence below). Human SOD1 sequence (A4V mutation not added) Human SOD1 sequence (A4V mutation added)

  • Week 6 HW: Genetic Circuits Part 1

    Robot Crafting Genetic Circuit (Stylized)

    DNA Assembly

    What are some components in the Phusion High-Fidelity (HF) PCR Master Mix and what is their purpose?

    • HF DNA Polymerase: This is the enzyme responsible for copying DNA, synthesizing the new strand in the 5’ to 3’ direction
    • Deoxynucleotide triphosphates (dNTPs): These are the DNA molecular building blocks, consisting of Adenine (A), Thymine (T), Cytosine (C), and Guanine (G) variants
    • HF Buffer: This contains magnesium chloride, a salt added to the reaction. It matters because it dissociates to release Mg²⁺, a cofactor the polymerase needs to join nucleotides during the reaction

    What are some factors that determine primer annealing temperature during PCR?

    Some factors that determine primer annealing temperature during PCR include:

    • Primer lengths
    • Primer melting temperatures
    • GC content/sequence content
    • Buffer components

    There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.

    PCR: PCR creates new linear DNA fragments via enzymatic amplification of a given region over n cycles. The PCR protocol essentially consists of setting up reaction mixes, denaturing the DNA into single strands, annealing so primers can bind to specific complementary sequences, extension so the polymerase can synthesize a new strand, and then repeating this cycle as many times as necessary. This method might be more useful when there is a specific fragment of DNA one wants to amplify for further use.

    Restriction Enzyme Digests: Restriction enzyme digests create new linear DNA fragments by cutting DNA at specific recognition sites. The restriction enzyme digest protocol consists of setting up a reaction mix, incubation, and then stopping the reaction. This method might be more useful when there is a specific fragment of DNA one wants to isolate for further analysis.

    How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?

    You can ensure the DNA sequences have the appropriate 5’ –> 3’ orientation with corresponding overlaps. Fragments also need to cover the relevant region for cloning, and need to be added at the appropriate molar ratio relative to the plasmid backbone (vector). This is usually a 2:1 ratio.

    How does the plasmid DNA enter the E. coli cells during transformation?

    The plasmid DNA enters the E. coli either via heat shock (temperature change) or electroporation (high electrical voltage). Both methods shock the E. coli cell, causing its cell membrane to open for the plasmid DNA to enter.

    Describe another assembly method in detail (such as Golden Gate Assembly)

    DNA topoisomerase I (TOPO) Cloning: TOPO cloning is traditionally used because it’s a fast, reliable method for cloning PCR products for later sequencing, etc. The first step in TOPO cloning is generating an insert with Taq polymerase via PCR. This creates inserts with an A-overhang, which sets up the second step. The second step is to combine this PCR product with the TOPO vector, usually for a couple of minutes. The insert’s 5’ OH/hydroxyl interacts with the TOPO DNA at its end, and as part of this process A and T base pairing occurs between the insert and the vector. The TOPO then religates the strands and dissociates, creating a closed circular plasmid with the given insert. See diagrams below:

  • Week 7 HW: Genetic Circuits Part 2

    Genetic Circuits Part 2 Assignment Part 1: Intracellular Artificial Neural Networks (IANNs) Unlike traditional genetic circuits, IANNs are analog, and as such correspond more closely to the nature of biological systems (i.e., we’re not always looking for strict 0/1 binary logic; sometimes we’re looking to establish control across a range of values or space/time). This analog nature means they are more responsive, efficient, and biocompatible.

  • Week 9 HW: Cell-Free Systems

    Homework Part A: General and Lecturer-Specific Questions

    Cell-Free Systems General Homework Questions

    Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.

    Cell-free systems allow for a broader range of potential chemistries than those given to us from natural biology, expanding flexibility. Cell-free protein synthesis also allows for greater control over experimental variables because the entire protein expression construct is designed from scratch (i.e., we have the opportunity to bypass a lot of the complexity of natural cells). Cell-free expression is more beneficial than cell production if you want to rapidly prototype gene pathways and if you want an expression mechanism that’s more amenable to consistent, predictable modeling and analysis.

    Describe the main components of a cell-free expression system and explain the role of each component.

    The main components of a cell-free expression system are (based on elements described in this hyperlink 1):

    • DNA template: Genetic code to begin the Tx/Tl process
    • Ribosomes: Assembling amino acids into polypeptides
    • Enzymes: Catalyzing certain important chemical reactions necessary for the appropriate functioning of that cell-free expression system (ex. transcription and translation, energy generation)
    • Amino Acids: The core chemical building blocks of the proteins the cell-free expression system will express
    • Polymerases: Synthesizing DNA and RNA

    Why is energy provision regeneration critical in cell-free systems? Describe a method you could use to ensure continuous ATP supply in your cell-free experiment

    Energy provision regeneration is critical in cell-free systems because cell-free systems don’t consume enzymes to produce energy. They also need external energy sources to remove waste products. A workaround might be to have analogous enzymatic reactions (possibly based off shared common charges) within the cell-free system to produce energy.

    Compare prokaryotic versus eukaryotic cell-free expression systems. Choose a protein to produce in each system and explain why.

    Prokaryotic cell-free expression systems allow for the colocation of transcription and translation. This might work well for proteins that need to be produced at high volume, like an industrial protease protein. Eukaryotic cell-free expression systems allow for more complex proteins to be built due to their nuclei. This might work well for the production of more advanced/technically complex proteins, like rabbit serum albumin.

    How would you design a cell-free experiment to optimize the expression of a membrane protein? Discuss the challenges and how you would address them in your setup.

    In a manner similar to Shuguang Zhang’s ‘molecular glove’ experiment, I’d try to essentially coat and/or surround the membrane protein with hydrophilic proteins to attract and/or absorb water in the cell-free environment, so the membrane protein can incorporate into the liposome 2. Challenges might include appropriate hydrophilic concentrations (which might be discerned via calculations or trial and error) or bonding between the hydrophilic proteins and the membrane proteins. This might be mitigated and/or the amount of error reduced through the use of computational modeling and simulation tools like AlphaFold.

    Imagine you observe a low yield of your target protein in a cell-free system. Describe three possible reasons for this and suggest a troubleshooting strategy for each.

    • Suboptimal Ribosome Function: Examine how the ribosomes translate the mRNA and modify as necessary
    • Suboptimal Transcription: Examine tRNAs for coding errors/misreads or inappropriate expression levels and modify as necessary
    • Suboptimal External Communication (i.e., yields cannot properly exit system at desired levels): Examine and modify membrane channel functionality as necessary

    Supporting prompts for this section listed below:

  • Week 10 HW: Advanced Imaging & Measurement Technology

    Waters Corporation Mass Spectrometer

    Homework: Final Project

    For your final project: Please identify at least one (ideally many) aspect(s) of your project that you will measure.

    • Lysis Rate
    • Efficiency of Plating

    Please describe all of the elements you would like to measure, and furthermore describe how you will perform these measurements.

    • Lysis Rate: This measures the rate at which the mutated M. smegmatis mycobacteriophage lyses or destroys bacteria. This would be measured in a wet lab setting by comparing percentages of bacteria across a control and another plate that has been exposed to a mutated form of M. smegmatis mycobacteriophage
    • Efficiency of Plating: This measures the rate at which the mutated M. smegmatis mycobacteriophage can begin initiating a host infection. I believe this would also be measured in a wet lab setting by comparing percentages of bacteria across a control and another plate that has been exposed to a mutated form of M. smegmatis mycobacteriophage

    What are the technologies you will use (e.g., gel electrophoresis, DNA sequencing, mass spectrometry, etc.)? Describe in detail.

    • Lysis Rate: I’d likely use a microplate reader as part of a wet lab extension of the final project
    • Efficiency of Plating: I’d use a plaque assay as part of a wet lab extension of the final project

    Supporting prompts for this section listed below:

  • Week 11 HW: Bioproduction and Cloud Labs

    Part 1: Global Pixel Artwork Cloud Lab Contribution Made the following contributions to the Global Pixel Artwork Cloud Lab (see screenshots below) Global Pixel Artwork Contributions (see above). Edited 4 pixels in the upper right hand corner of the image (changed them to sfGFP)

  • Week 12 HW: Bioproduction & Cloud Labs Part 2

    Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork Contribute at least one pixel to this global artwork experiment before the editing ends on Sunday 4/19 at 11:59 PM EST. Contributed 4 pixels to the global artwork experiment on Saturday 4/18 Make a note on your HTGAA webpages including:

  • Week 13 HW: Scaling Health Innovation

    Master Mix Concentrations See test Master Mix Concentrations below:
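
Two of the homework answers above rest on short calculations: the Week 4 amino acid count and the Week 6 Gibson molar ratio. The sketch below is one hedged way to check them. The 20% protein fraction, the function names, and the example vector and insert sizes are illustrative assumptions and do not come from the homework itself; only the ~100 Dalton average and the 2:1 ratio do.

```python
AVOGADRO = 6.022e23  # molecules per mole


def amino_acid_molecules(meat_g, protein_fraction=0.2, residue_da=100.0):
    """Estimate the number of amino acid molecules in a piece of meat.

    protein_fraction is an assumed value (meat is mostly water);
    residue_da is the ~100 Dalton average from the homework prompt.
    """
    protein_g = meat_g * protein_fraction
    moles = protein_g / residue_da  # 1 Dalton corresponds to 1 g/mol
    return moles * AVOGADRO


def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=2.0):
    """Mass of insert (ng) that gives molar_ratio insert:vector.

    Moles of double-stranded DNA scale with mass / length, so the
    insert mass is scaled by the length ratio and the molar ratio.
    """
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


if __name__ == "__main__":
    print(f"{amino_acid_molecules(1):.1e} molecules per gram of meat")
    print(f"{amino_acid_molecules(500):.1e} molecules per 500 g of meat")
    # Hypothetical example: 50 ng of a 5,000 bp vector, 1,000 bp insert
    print(f"{insert_mass_ng(50, 5000, 1000):.0f} ng of insert for a 2:1 ratio")
```

Under these assumptions the estimate comes out to roughly 1.2 × 10²¹ molecules per gram and 6 × 10²³ molecules for 500 grams, the same order of magnitude as the Week 4 answer. A different protein fraction shifts the result proportionally.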

Subsections of Homework

Week X HW: Test

  • Hello 1

  1. World ↩︎

Week 1 HW: Principles and Practices

Space Phage
A glorious space phage (with artistic license lol)

Week 1 Biological Engineering Application Governance Exercise

  • I’m interested in developing phage chassis capable of targeting bacteria that commonly cause infections during spaceflight. I’m interested in developing these phage chassis because:

    • They would give astronauts and future space travelers more autonomy in countering ad hoc novel infections that may occur during long-duration missions
    • They would help contribute to greater personalization of medical care for broader ranges/more diverse spacefaring populations, who will likely travel, work, and live in space for extended periods of time/longer than traditional space missions to date
    • This development may help counter novel terrestrial infections for medically underserved populations
  • Governance Goal 1 (G1): Preventing/Mitigating Malicious Dual-Use (i.e., ensuring)

    • Appropriate biosecurity (including cyberbiosecurity) controls
    • Safe deployment conditions
  • Governance Goal 2 (G2): Empowering Autonomous and Equitable Use

    • ‘Plug-and-play’ functionality (i.e., any mission or crew should be able to intuitively use solution – advanced technical acumen not required)
    • Solution costs and distribution should develop with access across wide range(s) of demographics and socio-economic strata in mind
  • Governance Action 1: Phage Safety Refusal (PSR)

    Purpose:

    • Current State: There are no proactive mechanisms preventing malicious dual-use phage development (i.e., developing a phage specifically designed to degrade host mRNA in healthy cells).
    • Proposed Changes: Easily implementable phage production host ‘kill switches’ that can be deployed if a threshold (nth) concentration of healthy cells is accidentally or deliberately targeted by phage therapy

    Design:

    • The existing global space health biotechnology base (consisting of private firms, academia, and government-affiliated research centers) needs to opt in and actively invest in space-based phage therapies at levels adequate to ensure this notion of a phage defensive ‘kill switch’ can be validated and deployed, or invalidated and no longer pursued
    • Diverse space health regulators (consisting of a variety of agencies across nation-state governments) would need to approve said ‘kill switches’ (ex. Food and Drug Administration in the United States)
    • Patients participating in clinical trials of the ‘kill switch’ would need to feel comfortable enough with the solution to participate
    • Investors and members of private industry must see enough potential in the ‘kill switch’ that they see it as worthy of their time and investment
    • Astronauts and future spacefarers must be comfortable enough with the ‘kill switches’ to consent to their use

    Assumptions

    • The underlying notion of malicious dual-use phage applications holds, particularly in a space biotechnology context. Assumptions underneath this assumption include:
      • Viable malicious dual-use phage applications exist within a space health context, and
      • Malicious actors would see and act on the benefits of phage interventions for targeted action against healthy cells in adversarial/antagonistic persons
    • That defensive anti-phage ‘kill switches’ even make sense to pursue technically

    Risks of Failure & “Success”

    • No viable malicious dual-use phage application concern (terrestrially or in a space context)
    • No interest in developing anti-phage ‘kill switches’
    • Developing anti-phage ‘kill switches’ is too technically intensive/not scalable to address the diverse needs of potential future users
    • Not enough funding to develop necessary research body of knowledge to make anti-phage ‘kill switches’ viable and safe to use in a space health context
    • Anti-phage ‘kill switches’ successfully developed and deployed and become another chapter in the never-ending story of phages v. bacteria (i.e., phages adapt to the ‘kill switches’ over time, rendering them ineffective)
  • Governance Action 2: Space Medicine Access Consortia (SMAC)

    Purpose

    • Current State: The field of space medicine is still in its relative nascency. While space health consortia exist, there are no consortia explicitly aimed at making space medicine advancements as broadly accessible as possible to both spacefaring and terrestrial populations.
    • Proposed Changes: Charter and develop a consortium explicitly aimed at making space medicine advancements as broadly accessible as possible to both spacefaring and terrestrial populations. Ideally this consortium should comprise members from private industry, academia, government-affiliated research centers, and allied partners

    Design:

    • Potential and future committed consortium partners from across private industry, academia, government-affiliated research centers, and allied partners need to see the value of the consortium’s mission (i.e., they need to see and align with how the consortium’s explicit charter to make space medicine advancements as broadly accessible as possible to both spacefaring and terrestrial populations would benefit their organization)
    • Consortium partners need to agree on:
      • Rules of the road (this will likely need to be contractual)
      • Research scope(s) for their organizations
    • Concrete milestones indicating the consortium’s success, stagnation, or failure in achieving its goals
    • Adequate funding to see the consortium from inception through the point where most of its charter has been fulfilled (a somewhat analogous example might be the group of institutions the Information Processing Technology Office (IPTO) at the Defense Advanced Research Projects Agency (DARPA) brought together when creating the ARPANET, which later evolved to become the modern Internet, and the shifts in IPTO’s mission over time as a result of these developments)

    Assumptions

    • A critical mass of stakeholders:
      • cares about making space medicine advancements accessible to spacefaring and terrestrial populations
      • can see the value of a dedicated consortium to make space medicine advancements more accessible for spacefaring and terrestrial populations
      • can agree to the rules of the road necessary to make the consortium viable
      • sees the value of making phage chassis capable of targeting bacteria that commonly cause infections during spaceflight ‘plug-and-play’ and cost-effective for end-users (i.e., they wouldn’t prioritize some other space medicine intervention)
    • The consortium can achieve its charter, with greater phage chassis accessibility as a step on its path to success

    Risks of Failure & “Success”

    • No or not enough interest in:
      • making space medicine advancements more accessible for spacefaring and terrestrial populations
      • developing a consortium to make space medicine advancements more accessible for spacefaring and terrestrial populations
    • SMAC not viable due to:
      • Too much burden/not enough return on investment for potential consortium members (i.e., they don’t see the value)
      • Disagreements over rules of engagement
      • Disagreement over research scope for participating members (i.e., who does what)
      • Inadequate implementation pathways (i.e., most of SMAC’s work languishes in the ‘valley of death’)
    • SMAC prioritizes other space medicine interventions over making phage chassis capable of targeting bacteria that commonly cause infections during spaceflight ‘plug-and-play’ and cost-effective for end-users
    • SMAC might ‘succeed’ too much, and as a result, the loudest or wealthiest SMAC members might eventually see value in exercising disproportionate control over the entity, diluting its charter
  • Governance Action 3: Space Applied Biomedicine Repository (SABR)

    Purpose

    • Current State: Numerous space medicine guides have been developed by NASA, including the NASA Astronaut Medical Operations Handbook, Advanced Diagnostic Ultrasound in Microgravity (ADUM) Protocols, and OCHMO-STD-100.1A, otherwise known as the NASA Medical Standard. Commercial space missions largely lean on these guides for their missions
    • Proposed Changes: While NASA’s precedent is useful and organizations like the Translational Research Institute for Space Health (TRISH) have proposed tailored guidance for commercial space travelers, there is no existing repository explicitly focused on lessons learned and best practices for applied biomedicine in a space context. As opposed to a static guide, this work would be more like a git-based repo that could be updated with lessons learned, remaining customizable for large numbers of future spacefarers and their medical needs

    Design

    • User base:
      • willing to contribute to and derive value from SABR
      • with enough technical know-how to contribute to and derive value from SABR
    • Stakeholder(s) willing to pay subscription costs to run (likely git-based) SABR technical back-end
    • Easy to understand guidance and lessons learned that can be easily ingested in text, video, or audio form on (likely) simple, non-bandwidth intensive network connected devices

    Assumptions:

    • A (likely git-based) repository is an optimal vehicle for distributing recursively improving or community-based applied biomedicine lessons-learned for the space medicine community and future missions
    • This repository would contain accumulated, useful insights regarding how phage chassis capable of targeting bacteria that commonly cause infections during spaceflight can be deployed or administered:
      • safely (i.e. without malicious dual-use) across a wide variety of space missions
      • in a ‘plug-and-play’ or affordable manner
    • Enough of a user base:
      • willing to contribute to and derive value from SABR
      • with enough technical know-how to contribute to and derive value from SABR
    • Stakeholder(s) willing to pay subscription costs to run (likely git-based) SABR technical back-end
    • Easy to understand guidance and lessons learned that can be easily ingested in text, video, or audio form on (likely) simple, non-bandwidth intensive network connected devices (i.e., SABR can adequately work even in very bandwidth constrained conditions)

    Risks of Failure & “Success”

    • Not enough interest in SABR (potential users or contributors don’t see its value)
    • SABR is:
      • too:
        • early/the timing’s not right
        • abstract and remote in its guidance (i.e., the actual users or contributors who might derive value from its content see its content as too over their heads, technical, jargon-filled, or not applicable for their specific use cases)
      • a useful application, just not for the safe, easy-to-use, or cost-effective deployment of phage chassis capable of targeting bacteria that commonly cause infections during spaceflight
      • bandwidth-constrained to such a degree that it’s untenable for all intents and purposes
    • Can’t find stakeholder(s) willing to pay subscription costs to run (likely git-based) SABR technical back-end
    • SABR works so well and grows to such an extent that discerning actual mission-relevant content of value from the repository becomes a challenge for uninitiated users (i.e., filtering is a challenge – hard to separate filler from useful content)
  • Biological Engineering Application Governance Scoring Rubric

  • Biological Engineering Application: Phage chassis capable of targeting bacteria that commonly cause infections during spaceflight

  • The rubric below works as follows: Policy goals and sub-goals are listed vertically, while each of the governance actions is listed in its own column under a header titled ‘Option’. Governance actions are scored from 1-3 based on how well they fulfil each policy goal and sub-goal. A score of 1 indicates a governance action does a poor job at fulfilling a policy goal, a 2 indicates a governance action does an OK job at fulfilling a policy goal, and a 3 indicates a governance action is the best at fulfilling a policy goal.

| Does the option: | Option 1: Phage Safety Refusal (PSR) | Option 2: Space Medicine Access Consortia (SMAC) | Option 3: Space Applied Biomedicine Repository (SABR) |
| --- | --- | --- | --- |
| Preventing/Mitigating Malicious Dual-Use | 3 | 1.5 | 1 |
| • By implementing appropriate biosecurity (including cyberbiosecurity) controls | 2 | 1 | 2 |
| • By promoting safe deployment conditions | 3 | 2 | 2 |
| Empowering Autonomous and Equitable Use | 1 | 3 | 3 |
| • By encouraging ‘plug-and-play’ functionality | 1 | 2 | 3 |
| • By promoting cost and distribution accessibility | 1 | 3 | 2 |
  • Based on the scoring above, I’d probably prioritize SABR. Given the difficulties around space-based governance or governance in remote conditions, I think an organization like TRISH or the Organization for Space Medicine, Engineering, and Design (OSMED) might be a good starting point, promoter, or convener to begin making something like SABR a reality. The more I think about it, the varied jurisdictions and varied, unclear regulatory regimes at the nation-state level make the idea of a more open-source/tribal knowledge-based self-governing solution appealing (or at bare minimum it shows a gap that could potentially be filled). I also think that if the timing were right (i.e., enough useful information could be added by a community of contributors) this could actually help fulfill some of the policy goals associated with the phage chassis project, not by a lot of sanctioned formal policy-making perhaps, but by community contribution, input, and/or agreed-upon best practices. That said, I can see and understand the trade-offs between formal policy-making at the nation-state level and more grassroots normative development of best practices as a result of doing this exercise. The glaring uncertainties that remain are whether or not the repository’s timing is right and/or if a dedicated user and contributor base could coalesce around it

  • Ethical Considerations: Given the mechanics of phages, specifically their ability to hijack host cell tRNAs, ribosomes, and amino acids, I was somewhat surprised that there wasn’t more mention of potential deliberate malicious dual-use of phages. Maybe it’s my relatively nascent understanding of the life sciences, the limitations of my research on this topic, or maybe the topic itself is either not researched or not considered extensively. If this is true and the notion of potential deliberate malicious dual-use of phages is a little bit out of left field, not well understood, or not well defined, perhaps convening working groups might be a sensible governance action, as these groups can often help map areas of concern for emerging dual-use technologies. Distributing outputs from these working groups (i.e., white papers) to relevant academic journals, technical standards bodies, or policymakers might also be a worthwhile governance action.
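
The rubric scores above can also be tallied programmatically. The sketch below is one possible reading, not the method used in the write-up: it assumes every goal and sub-goal row carries equal weight and simply sums each option's column. The names `SCORES`, `total_scores`, and `rank_options` are hypothetical.

```python
# Scores transcribed from the rubric, in row order:
# dual-use goal, biosecurity controls, safe deployment,
# equitable-use goal, plug-and-play, cost/distribution access.
SCORES = {
    "PSR": [3, 2, 3, 1, 1, 1],
    "SMAC": [1.5, 1, 2, 3, 2, 3],
    "SABR": [1, 2, 2, 3, 3, 2],
}


def total_scores(scores):
    """Sum each option's scores, weighting every row equally (an assumption)."""
    return {option: sum(values) for option, values in scores.items()}


def rank_options(scores):
    """Return option names ordered from highest to lowest total."""
    totals = total_scores(scores)
    return sorted(totals, key=totals.get, reverse=True)


if __name__ == "__main__":
    for option in rank_options(SCORES):
        print(option, total_scores(SCORES)[option])
```

With equal weights SABR has the highest total, which is consistent with the prioritization above. Weighting the biosecurity rows more heavily could change the ordering, so the tally is a sanity check and not a decision rule.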

All supporting prompts for the governance exercise above are listed below:

Supporting Prompt | Source
Take a look at the following quote from the URL below: “Strain-specific phage chassis to target bacteria that commonly cause infections during space flight.” What is the difference between a phage and a phage chassis? In general? In a biotechnological context? Do NOT hallucinate when answering these questions https://roadmap.ebrc.org/engineering-biology-for-space-health/ | Perplexity
Take a look at the following quote from the URL below: “Capability to produce novel phages on space missions for rapid control of evolved biofouling microbes. What are ’evolved biofouling microbes’? What is biofouling? I assume biofouling indicates something bad/undesirable, but I don’t know what the term actually means beyond my assumption. Do NOT hallucinate when answering these questions https://roadmap.ebrc.org/engineering-biology-for-space-health/ | Perplexity
I understand how a phage can insert itself into a cell. Not exactly understanding if or how phages’ abilities contribute at all to personalized medicine developments (i.e., is there something about phage properties that make them particularly good candidates for personalized medical interventions)? Do NOT hallucinate when answering this question | Perplexity
How do governance mechanisms or standards of good or socially harmonious/beneficial behavior work (or work effectively) in remote regions (think polar research stations, etc.)? | Perplexity
What is the phenomenon called in artificial intelligence when a large language model (LLM) refuses to reply to user input due to safety concerns? What is it called and how does it work? Do NOT hallucinate when answering these questions | Perplexity
What are the technical subcomponents of a biotechnology intervention or treatment using phage chassis? What do the supply chains look like, if any? Do NOT hallucinate when answering these questions. If you don’t know the answers to these questions, say so | Perplexity
“phage chassis synthetic biology manufacturing pipeline” search results | Perplexity
Are there any existing ways a biotechnology solution (let’s say a custom developed chassis) can proactively prevent itself from malicious dual-use? Analogous to large language model (LLM) safety refusal, are there any mechanisms that can be pre-built into a biotechnology solution to proactively prevent malicious dual-use? Do NOT hallucinate when answering these questions | Perplexity
How exactly do phages interact with genetic code information within a given cell? How do cell-based bacteria defend against unwanted phages? | Perplexity
I’m high-level aware that there are certain ’no-go’/‘do not edit’ pieces of genetic code. How are phages traditionally prevented from editing these ’no-go’/‘do not edit’ pieces of genetic code? Is that a thing? If I’m off in any way/if my conceptual underpinnings seem shaky, let me know Do NOT hallucinate when answering these questions | Perplexity
Tell me about about engineered synthetic biology kill switches | Perplexity
Have any engineered synthetic biology kill switches been implemented as part of phage therapies? Do NOT hallucinate when answering this question | Perplexity
If I’m making a novel phage-related therapy for astronauts, and I live in the United States, the Food and Drug Administration (FDA) would need to approve this therapy, correct? My assumption is yes. How does approval of a drug used outside of Earth’s atmosphere work from a regulatory perspective? Do NOT hallucinate when answering these questions | Perplexity
Are there any space health-related consortia specifically or explicitly aimed at making space medicine advancements as broadly accessible as possible to both spacefaring and terrestrial populations? If so, share information regarding said consortia Do NOT hallucinate. If you don’t know the answer to this question, say so | Perplexity
In medicine, what do we usually mean by ‘point of care’? What do we mean when we say that? | Perplexity
Do space medicine point of care guides exist? If so, are there any for commercial space tourists, astronauts, or future groups of spacefarers, including workers, etc.? | Perplexity
What is applied biomedicine? | Perplexity
What does trish stand for in a space health context | Google AI Mode
Tell me about the space health point of care guide TRISH is either developing or has developed | Google AI Mode
What is the TRISH POCUS training referred to in the answer to the last prompt? What does POCUS refer to? | Perplexity
How are most git-base repositories run? What is the underlying technical back-end powering them and how is this infrastructure paid for? | Perplexity

Week 1 Homework Questions

Professor Jacobson Questions

  • Two widely used polymerases come from Thermus aquaticus (Taq) and Pyrococcus furiosus (Pfu) [1]. Taq has error rates ranging from 1 × 10^-5 to 2 × 10^-4 errors per base pair per doubling, while Pfu has an error rate of about 1.3 × 10^-6 [2][3]. Compared to the length of the human genome, 3 × 10^9 base pairs, this comes out to approximately 3 × 10^4 to 6 × 10^5 expected errors per genome copy for Taq and about 3.9 × 10^3 for Pfu [4]. Biology deals with this discrepancy during DNA replication through proofreading. When the polymerase detects that an incorrect base has been added, the polymerase enzyme makes a cut in the chemical bond, releasing the incorrect nucleotide [5]. If errors remain after replication, mismatch repair is initiated. This is where enzymes recognize incorrectly added nucleotides and dispose of them. Nucleotide excision repair is another way nature corrects errors. This occurs when enzymes remove and replace damaged bases via cuts on the 3′ and 5′ sides of the damaged stretch [6].
  • An average human protein is approximately 375 amino acids long [7]. As each amino acid is encoded by roughly 3 synonymous codons on average (61 sense codons for 20 amino acids), rough math indicates there are approximately 3^375 (roughly 10^179) potential coding sequences, an extremely large number of combinations [8]. Some of the reasons not all of these coding sequences work equally well for producing the protein of interest are:
    • Codon Bias: Some codons are used far more often than others in a given organism, typically because their matching transfer RNAs (tRNAs) are more abundant, which supports higher levels of expression [9][10].
    • mRNA Structure: Codon choice can affect mRNA structure (e.g., make it less stable), leaving the transcript more susceptible to degradation [11].
    • Translation Accuracy Issues: Non-optimal codons decrease translation efficiency by slowing ribosomes, the molecular machines that produce proteins, which can cause a form of crowding along the mRNA [12].
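
The arithmetic behind the two answers above can be checked with a short script. This is a minimal sketch, not part of the original homework. It assumes the Pfu rate is 1.3 × 10^-6 errors per base pair per doubling, multiplies each error rate by a 3 × 10^9 bp genome to estimate errors per genome copy, and assumes an average of about 3 synonymous codons per amino acid. The names `GENOME_BP`, `expected_errors`, and `coding_sequence_count` are illustrative.

```python
# Rough arithmetic for the polymerase and codon questions (illustrative only).

GENOME_BP = 3e9  # approximate human genome length in base pairs

# Error rates in errors per base pair per doubling.
# Assumption: the Pfu value is read as 1.3e-6.
ERROR_RATES = {
    "Taq (low)": 1e-5,
    "Taq (high)": 2e-4,
    "Pfu": 1.3e-6,
}


def expected_errors(rate_per_bp, length_bp=GENOME_BP):
    """Expected number of errors when copying length_bp bases once."""
    return rate_per_bp * length_bp


def coding_sequence_count(protein_length, codons_per_residue=3):
    """Rough count of DNA sequences that encode a protein of this length.

    Assumption: every amino acid has about 3 synonymous codons on average.
    """
    return codons_per_residue ** protein_length


if __name__ == "__main__":
    for name, rate in ERROR_RATES.items():
        print(f"{name}: ~{expected_errors(rate):.2g} errors per genome copy")
    count = coding_sequence_count(375)
    print(f"3^375 has {len(str(count))} decimal digits")
```

Even the more accurate polymerase would leave thousands of errors per genome copy without proofreading and repair, which is why the repair pathways described above matter.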

Dr. LeProust Questions

  • Solid-phase phosphoramidite synthesis is the most common currently used oligo synthesis method [13].
  • It’s difficult to make oligos longer than 200nt via direct synthesis because coupling errors/inefficiencies compound to the point where one ends up with lots of short, incomplete fragments [14].
  • A 2000bp gene has 4000 nucleotides across its two strands [15]. Based on the answer to the previous question, creating a 2000bp gene by direct synthesis is not currently feasible due to the accumulation of coupling errors/inefficiencies; genes of this length are instead built by stitching together smaller oligos, and novel enzymatic methods are still emerging [16][17].
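
The compounding of coupling inefficiencies described above can be sketched numerically. This is a minimal illustration under stated assumptions: the per-step coupling efficiencies (99% and 99.5%) are hypothetical round numbers rather than values from the cited sources, and `full_length_fraction` is an illustrative helper name.

```python
def full_length_fraction(length_nt, coupling_efficiency):
    """Fraction of strands that reach full length after stepwise synthesis.

    An n-nucleotide oligo needs n - 1 coupling steps (the first base starts
    on the solid support), and every step must succeed.
    """
    return coupling_efficiency ** (length_nt - 1)


if __name__ == "__main__":
    # Hypothetical per-step coupling efficiencies, for illustration only.
    for efficiency in (0.99, 0.995):
        for length in (20, 200, 2000):
            fraction = full_length_fraction(length, efficiency)
            print(f"efficiency={efficiency}, {length} nt: {fraction:.3g} full length")
```

Under these assumptions a 200nt oligo already loses most strands to truncation, and a 2000nt strand is essentially never completed, which is the intuition behind assembling genes from shorter pieces.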

George Church Question

  • Question 3 Response: ARPA-H Biostabilization System (BoSS) Grant Response

    All supporting prompts are listed below the answer:

Innovative Solutions Opening | ARPA-H-SOL-26-136
Solution Summary Title | Pyrococcus furiosus-Inspired Molecular Staples (PFIMS)
Team Lead Organization | Federally Funded Research and Development Center (FFRDC)
Type of Organization | See above
Technical Point of Contact | Name: Jason Ross
Administrative Point of Contact | Name: Jason Ross
Total Basis of Estimate | $2,600,000
Places of Performance | McLean, VA
Other Team Members | MITRE Biotechnology Department (L271) Interns

Concept Summary

The team behind Pyrococcus furiosus-Inspired Molecular Staples (PFIMS) seeks to develop small organic molecules capable of binding to the ‘grooves’ of DNA and proteins across a variety of temperatures. If successfully developed, PFIMS would allow a biologic to be heat-proofed across a variety of temperatures without the need for dehydration. By locking protein folds through ionic pull, we can stabilize biologics for longer periods of time (TA1). Our work will then extend this system to scalable cell processing across an array of temperatures and use cases (TA2).

Innovation and Impact

PFIMS is inspired by extremophile biology. Pyrococcus furiosus can survive at temperatures of 100 degrees Celsius in ocean vents. By mimicking Pyrococcus furiosus’ molecular heat shield, we can keep cells alive and functioning for extended periods of time at refrigerator or room temperature. Unlike modern cryopreservation methods, which employ various forms of freezing that can harm cells during thawing, our solution stabilizes proteins using molecules that cells naturally produce. Not only are freezing temperatures avoided entirely, but once scaled this solution will slash costs for biologics shipping. Most importantly, PFIMS can work for any cell type, as it builds off fundamental biological features, such as protein folding and membrane strength.

Proposed Work

We plan to develop a novel, functioning bench-top bioprocessing system inspired by Pyrococcus furiosus. We will create a polyamine-based stabilization medium that will power this bioprocessing system for a standard biologic (e.g., a cell therapy or antibodies). PFIMS’ approach is grounded in existing literature on high-temperature DNA/protein stabilization via polyamines, small organic molecules with multiple amino groups (Bae, 2018) (Despotović, 2020) (Oshima, 2007). These polyamines act as a form of ‘molecular staples’ in preliminary modeling efforts (Vieille, 2001).

Key Milestones and Deliverables

Phase 1: Synthesize the branched polyamine formulation. Deliverable: Optimized medium prototype based on a standard biologic.

Phase 2: Integrate the stabilization medium into bioprocessing hardware, likely a single-use bioreactor for the initial prototype, followed by larger-scale testing and deployment within a media/buffer prep mixer. Deliverable: Prototyped biostabilization device.

Phase 3: Validate stability metrics for a model biologic using PFIMS at room temperature over time. Deliverable: Validation report and delivery of a biostabilization system capable of scaled biostabilization across n biologics.

Technical Risks and Mitigations

Risk: Polyamines may exhibit toxicity at necessary concentrations. Mitigation: Screening polyamines for reduced toxicity levels; introducing wash and resuspension steps into bioprocessing.

Risk: Stabilization mechanism may exhibit difficulty transferring from archaeal single-celled microbes like Pyrococcus furiosus to eukaryotic cells (cells containing nuclei and organelles). Mitigation: Tests across multiple cell types.

Use Case

PFIMS enables rapid delivery of life-saving biologics and therapeutics in low-resourced or contested conditions. This allows for dramatically cheaper shipment of biologics to locations such as rural communities without robust public health infrastructure, remote or relatively isolated geographies, or active conflict zones.

Sources

  • Bae DH, Lane DJR, Jansson PJ, Richardson DR. The old and new biochemistry of polyamines. Biochim Biophys Acta Gen Subj. 2018 Sep;1862(9):2053-2068. doi: 10.1016/j.bbagen.2018.06.004. Epub 2018 Jun 8. PMID: 29890242.
  • Despotović Dragana, Longo Liam, et al. Polyamines mediate folding of primordial hyperacidic helical peptides into stable amyloid-like fibrils. Biochemistry, 60(4), 257–267.
  • Oshima T. Unique polyamines produced by an extreme thermophile, Thermus thermophilus. Amino Acids. 2007 Aug;33(2):367-72. doi: 10.1007/s00726-007-0526-z. Epub 2007 Apr 12. PMID: 17429571.
  • Vieille C, Zeikus GJ. Hyperthermophilic enzymes: sources, uses, and molecular mechanisms for thermostability. Microbiol Mol Biol Rev. 2001 Mar;65(1):1-43. doi: 10.1128/MMBR.65.1.1-43.2001. PMID: 11238984; PMCID: PMC99017.

All supporting prompts are listed below:

Supporting Prompt | Source
Tell me about how very temperature resilient trees or plants achieve homeostasis despite large temperature fluctuations. If the Dsup protein in tardigrades could hypothetically be used to confer radiation resistance to humans traveling in space, how could an analogous protein or feature found in trees or plants be used to store biologics across a wider array of temperatures? Essentially I’m asking for analogous applicability if that makes sense | Google AI Mode
As a potential next step, do NOT look into synthetic biology startups that are already attempting to synthesize LEA proteins for “cold-chain-free” vaccine storage. Give me other, non-plant or tree examples across the kingdom of life where homeostasis is achieve despite large temperature fluctuations. I want to look where synthetic biology startups are NOT looking | Google AI Mode
Give me academic sources for each of the 3 non-plant examples and any biotechnology research in academia around these properties. Then tell me the basics of how the properties of each organism would transfer to a medium capable of biological stabilization for an array of biologics | Google AI Mode
What is Pyrococcus furiosus? Can you show me a picture? | Google AI Mode
“The Transfer Logic To create a stabilization medium, you would synthesize branched-chain polyamines (like thermo-spermine). Binding: These molecules are positively charged and naturally “wrap” around negatively charged biologics (like DNA/RNA or specific protein folds). The Medium: The medium would be a liquid concentrate of these polyamines. Instead of refrigeration, the “staples” provide enough ionic pull to prevent the biologic from “unzipping” when the temperature rises.” Based on the ‘Proposed Work’ section in the attached document, create 2 simple examples of final deliverables, and create some broad brushstrokes/bullet points I can use to create a ‘Proposed Work’ section based on the specifications provided in the attached document. | Google AI Mode
Based on the previous information provided, estimates of how long these processes would take, and the salary rates for an average employee at the MITRE Corporation, give me some Total Basis of Estimate numbers (i.e., ranges for how much this would cost)? | Google AI Mode
Looking to develop a Pyrococcus furiosus -inspired stabilization medium for cell preservation. Based on existing Pyrococcus furiosus literature, give me 5 bullet points describing why this approach would be novel and game-changing. Don;’t use a lot of jargon. One of the bullets should explain how this stabilization medium would differ from the current state of the art | Perplexity
What is a polyamine? | Perplexity
If I wanted to integrate a Pyrococcus furiosus -inspired stabilization medium into bioprocessing hardware, what type of hardware would we be talking about? What’s commonly used? | Perplexity
What does the term ‘cytotoxic’ mean? I assume it’s a form of toxicity, but I don’t know what it specifically refers to | Perplexity
What are archaea and how are they different from eukaryotic cells? | Perplexity

Week 2 HW: DNA Read, Write, and Edit

Astronaut Genomics Picture

Part 0: Basics of Gel Electrophoresis

  • Per instructions for this part, I attended the 02/10 lecture and 02/11. Additionally, I attended all 3 Bootcamp sessions.

Part 1: Benchling & In-silico Gel Art

  • Make a free account at benchling.com

Benchling Account Creation Confirmation

  • Import the Lambda DNA

Benchling Phage Lambda DNA Import Confirmation_02.12.26

  • Simulate Restriction Enzyme Digestion with the following Enzymes:

    • EcoRI: EcoRI Enzyme Digest Confirmation; EcoRI Enzyme Digest Confirmation_Virtual Digest

    Benchling EcoRI Enzyme Digest Confirmation

    • HindIII: HindIII Enzyme Digest Confirmation; HindIII Enzyme Digest Confirmation_Virtual Digest

    Benchling HindIII Enzyme Digest Confirmation

    • BamHI: BamHI Enzyme Digest Confirmation; BamHI Enzyme Digest Confirmation_Virtual Digest

    Benchling BamHI Enzyme Digest Confirmation

    • KpnI: KpnI Enzyme Digest Confirmation; KpnI Enzyme Digest Confirmation_Virtual Digest

    Benchling KpnI Enzyme Digest Confirmation

    • EcoRV: EcoRV Enzyme Digest Confirmation; EcoRV Enzyme Digest Confirmation_Virtual Digest

    Benchling EcoRV Enzyme Digest Confirmation

    • SacI: SacI Enzyme Digest Confirmation; SacI Enzyme Digest Confirmation_Virtual Digest

    Benchling SacI Enzyme Digest Confirmation

    • SalI: SalI Enzyme Digest Confirmation

    Benchling SalI Enzyme Digest Confirmation

  • Create a pattern/image in the style of Paul Vanouse’s Latent Figure Protocol artworks

    Mind the Gap (Or a Most Wondrous Cave)
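
The virtual digests above can be approximated outside Benchling with a few lines of code. This is a minimal sketch rather than Benchling's implementation: it treats the DNA as linear, uses the standard recognition sites and top-strand cut positions for the seven enzymes listed above, and runs on a short hypothetical test sequence instead of the full Lambda genome. The names `ENZYMES`, `digest`, and `toy` are illustrative.

```python
import re

# Recognition sites (5'->3') and top-strand cut offsets for the enzymes above.
# All seven sites are palindromic, so scanning one strand is sufficient.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "HindIII": ("AAGCTT", 1),
    "BamHI": ("GGATCC", 1),
    "KpnI": ("GGTACC", 5),
    "EcoRV": ("GATATC", 3),
    "SacI": ("GAGCTC", 5),
    "SalI": ("GTCGAC", 1),
}


def digest(sequence, enzyme_names):
    """Return fragment lengths (in 5'->3' order) for a linear sequence."""
    sequence = sequence.upper()
    cuts = set()
    for name in enzyme_names:
        site, offset = ENZYMES[name]
        # A lookahead pattern finds overlapping sites as well.
        for match in re.finditer(f"(?={site})", sequence):
            cuts.add(match.start() + offset)
    bounds = [0] + sorted(cuts) + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical 26 bp test sequence with one EcoRI and one BamHI site.
    toy = "AAAA" + "GAATTC" + "CCCCCC" + "GGATCC" + "TTTT"
    for enzymes in (["EcoRI"], ["BamHI"], ["EcoRI", "BamHI"]):
        print(enzymes, digest(toy, enzymes))
```

Each list of fragment lengths corresponds to one gel lane, so choosing enzymes lane by lane is what shapes the band pattern in the gel art.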

Part 2: Gel Art - Restriction Digests and Gel Electrophoresis

Part 3: DNA Design Challenge

3.1 Choose your protein
  • I chose the Mantis Fibroin 1 protein because for some reason when I received this assignment, my mind flipped to an insect protein, and then from there, a praying mantis. Upon further research, I was pleased with where my intuition led me. The Mantis Fibroin 1 protein helps make up the mantis’ ootheca, otherwise known as its egg casing. What’s fascinating about these proteins is that they create a coiled yet flexible foam-like structure around the mantis’ eggs. This protein piqued my interest, as it might have biomimetic potential. The Mantis Fibroin 1 protein is listed below [1][2]:

tr|I3PM87|I3PM87_9NEOP Mantis fibroin 1 OS=Pseudomantis albofimbriata OX=627833 GN=MF1 PE=2 SV=1
MDSKMLCVSLLLAVFCLWYTEASPLEEKYGEKYGDMEEYQRGTEDSRAVINDHTAKVASQ
SARGMVNKAKTTEAAARSNEQLSKDRQYYYREYLKKADYHKKKALEYEQLSAAENAKIAY
HESKQKDWETKARESDVQCRDAEAKYEQSYTRSRELKRESIIAYVQAAMHHAEASGDHMK
ADRAKDIARDMMRKAESLRGDASNHYQRSEEDKNKARSEKVKAHQNADNSQRHHTACRAY
DQEGLKTRLSSKANMMRQIHSSLLAERSHSLAREDGLAADLSHKLAEELARMSEESGAIS
KINSGEERGYSNKVRQDEVKAHELAVSKRMMGAEVADNSEMISLAQAKDGSLDEGENYKL
STFYADDSTKNMLPDSRGQMSYGDE

3.2 Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.
  • A reverse-translated Mantis Fibroin 1 nucleotide sequence using the most likely codons is below, along with evidence showing how I entered the Mantis Fibroin 1 UniProt information into the reverse translation tool

atggatagcaaaatgctgtgcgtgagcctgctgctggcggtgttttgcctgtggtatacc gaagcgagcccgctggaagaaaaatatggcgaaaaatatggcgatatggaagaatatcag cgcggcaccgaagatagccgcgcggtgattaacgatcataccgcgaaagtggcgagccag agcgcgcgcggcatggtgaacaaagcgaaaaccaccgaagcggcggcgcgcagcaacgaa cagctgagcaaagatcgccagtattattatcgcgaatatctgaaaaaagcggattatcat aaaaaaaaagcgctggaatatgaacagctgagcgcggcggaaaacgcgaaaattgcgtat catgaaagcaaacagaaagattgggaaaccaaagcgcgcgaaagcgatgtgcagtgccgc gatgcggaagcgaaatatgaacagagctatacccgcagccgcgaactgaaacgcgaaagc attattgcgtatgtgcaggcggcgatgcatcatgcggaagcgagcggcgatcatatgaaa gcggatcgcgcgaaagatattgcgcgcgatatgatgcgcaaagcggaaagcctgcgcggc gatgcgagcaaccattatcagcgcagcgaagaagataaaaacaaagcgcgcagcgaaaaa gtgaaagcgcatcagaacgcggataacagccagcgccatcataccgcgtgccgcgcgtat gatcaggaaggcctgaaaacccgcctgagcagcaaagcgaacatgatgcgccagattcat agcagcctgctggcggaacgcagccatagcctggcgcgcgaagatggcctggcggcggat ctgagccataaactggcggaagaactggcgcgcatgagcgaagaaagcggcgcgattagc aaaattaacagcggcgaagaacgcggctatagcaacaaagtgcgccaggatgaagtgaaa gcgcatgaactggcggtgagcaaacgcatgatgggcgcggaagtggcggataacagcgaa atgattagcctggcgcaggcgaaagatggcagcctggatgaaggcgaaaactataaactg agcaccttttatgcggatgatagcaccaaaaacatgctgccggatagccgcggccagatg agctatggcgatgaa

Mantis Fibroin 1 UniProt Information Inserted into Reverse Translation Tool
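
The reverse translation step can be sketched as a simple lookup from each amino acid to one codon. This is an illustration, not the tool's actual implementation: the one-codon-per-residue table below is read off the reverse-translated sequence shown above, and the names `MOST_LIKELY_CODON` and `reverse_translate` are hypothetical.

```python
# One codon per amino acid, as observed in the reverse-translated sequence above.
MOST_LIKELY_CODON = {
    "A": "gcg", "C": "tgc", "D": "gat", "E": "gaa", "F": "ttt",
    "G": "ggc", "H": "cat", "I": "att", "K": "aaa", "L": "ctg",
    "M": "atg", "N": "aac", "P": "ccg", "Q": "cag", "R": "cgc",
    "S": "agc", "T": "acc", "V": "gtg", "W": "tgg", "Y": "tat",
}


def reverse_translate(protein):
    """Map each residue to its single chosen codon and join the codons."""
    return "".join(MOST_LIKELY_CODON[residue] for residue in protein.upper())


if __name__ == "__main__":
    # First 20 residues of the Mantis Fibroin 1 sequence listed in section 3.1.
    print(reverse_translate("MDSKMLCVSLLLAVFCLWYT"))
```

Because every amino acid always maps to the same codon here, the result is only one of many possible DNA sequences for the protein, which is what codon optimization later revisits for a specific host.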

3.3 Codon Optimization
  • We need to optimize codon usage so a particular sequence can be expressed with greater fidelity, reliability, and efficiency in a host organism. I’ve chosen to optimize the codon sequence for Saccharomyces cerevisiae (baker’s yeast), because it:
    • is commonly used as a host in biotechnology applications
    • folds proteins in a manner closer to how insect cells fold them
    • is apparently easier to work with than mammalian cells

A codon-optimized Mantis Fibroin 1 nucleotide sequence is shown below, along with evidence showing how I entered the Mantis Fibroin 1 nucleotide sequence into the codon optimization tool

ATGGACAGTAAGATGTTATGTGTCTCCTTATTGTTGGCTGTTTTTTGTTTATGGTATACTGAAGCTTCCCCATTAGAAGAAAAGTATGGTGAAAAGTACGGTGACATGGAAGAGTACCAAAGAGGTACTGAAGATTCAAGAGCAGTTATTAACGATCATACTGCTAAAGTTGCTTCCCAATCCGCCAGAGGTATGGTTAATAAGGCTAAGACTACAGAAGCTGCTGCTAGAAGTAATGAACAATTATCTAAAGATAGACAATACTATTACAGAGAATATTTGAAAAAGGCTGATTATCATAAGAAGAAAGCTTTGGAATATGAACAGCTTTCAGCTGCTGAAAATGCAAAAATTGCTTATCATGAATCTAAACAAAAAGACTGGGAAACGAAAGCCAGAGAATCCGATGTTCAATGTCGTGATGCTGAAGCAAAATATGAACAATCTTACACAAGGTCCAGAGAACTGAAAAGGGAATCTATTATTGCTTATGTTCAAGCTGCTATGCATCATGCTGAAGCTAGCGGTGATCACATGAAAGCTGATAGAGCTAAAGATATCGCTAGAGATATGATGAGAAAGGCAGAATCCTTAAGGGGTGACGCTAGCAACCATTATCAGAGATCCGAAGAAGATAAGAATAAGGCCAGATCTGAAAAGGTTAAAGCTCATCAAAACGCTGATAATTCTCAAAGACATCATACTGCATGCAGAGCGTATGACCAAGAAGGTTTAAAGACGAGATTGAGCTCAAAAGCCAACATGATGAGACAAATTCACTCCTCACTACTGGCTGAAAGATCTCATTCATTAGCAAGAGAAGACGGTCTTGCGGCCGATTTATCACATAAGTTGGCTGAAGAATTAGCTAGAATGTCCGAAGAATCAGGTGCTATATCTAAAATAAACTCAGGTGAAGAAAGAGGCTATTCGAATAAAGTGAGACAAGATGAGGTTAAAGCACATGAATTGGCTGTTAGCAAAAGAATGATGGGTGCTGAAGTTGCTGATAATTCGGAGATGATTAGTTTGGCACAAGCTAAAGACGGTTCTTTAGATGAAGGTGAGAACTATAAATTATCCACTTTTTATGCAGACGATTCTACAAAAAATATGCTACCAGATTCTAGGGGTCAAATGTCTTACGGTGATGAA

Mantis Fibroin 1 nucleotide sequence codon optimized for Saccharomyces cerevisiae (baker’s yeast)
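
A quick way to sanity-check codon optimization is to confirm that the optimized DNA still translates to the same protein. The sketch below does this for the first 60 nucleotides of the two sequences above, using the standard genetic code. It is a minimal illustration, and the helper names (`CODON_TABLE`, `translate`) are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA codon by codon, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    # First 60 nt of the reverse-translated and yeast-optimized sequences.
    original = "atggatagcaaaatgctgtgcgtgagcctgctgctggcggtgttttgcctgtggtatacc"
    optimized = "ATGGACAGTAAGATGTTATGTGTCTCCTTATTGTTGGCTGTTTTTTGTTTATGGTATACT"
    print(translate(original))
    print(translate(optimized))
    print("same protein:", translate(original) == translate(optimized))
```

The two DNA strings differ at many positions, yet they encode the same residues, which is the property codon optimization is meant to preserve.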

3.4: You have a sequence! Now what?
  • Several cell-dependent technologies could be used to produce the codon-optimized Mantis Fibroin 1 protein. One such technology, a yeast system, has already been pursued in the previous steps of this section, since the sequence was optimized for Saccharomyces cerevisiae (baker’s yeast). Bacterial systems, such as E. coli, could also be used to produce the protein in host cell culture, although this would require different codon optimization. Other cell-dependent options include insect or mammalian-based systems, although I’m not sure of the value of expressing an insect-associated protein in another insect-based host (this may be a failure of imagination on my part), and mammalian cells are apparently more difficult to work with than bacterial or yeast-based hosts. Cell-free methods for producing the codon-optimized Mantis Fibroin 1 protein would involve breaking open cells, extracting the relevant ribosomes, enzymes, tRNAs, etc., and then combining these contents with a DNA template (in this case our Mantis Fibroin 1 nucleotide sequence), energy sources, relevant amino acids, and a reaction buffer. Cell-free methods generally offer a time advantage over cell-dependent methods of protein expression.
3.5 Optional: How does it work in nature/biological systems?
  • From my research in answering this question, I think the answer is that a post-transcriptional process called alternative splicing occurs, where non-coding regions of the pre-mRNA (introns) are cut out and removed, while coding regions (exons) remain and can be joined in different combinations [3]. It’s pretty fascinating because this splicing can create several different mRNA molecules from a single gene, and therefore different proteins. This increases the number of different proteins a particular organism can express from the same set of genes.
  • See below

Attempted Mantis Fibroin 1 Alignment
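
The alternative splicing idea from section 3.5 can be illustrated with a toy model in which some exons are always kept and others (cassette exons) are either included or skipped. Everything in this sketch is hypothetical: the exon names, their short RNA sequences, and the `isoforms` helper are made up for illustration and do not come from the Mantis Fibroin 1 gene.

```python
from itertools import product


def isoforms(exons, optional):
    """Enumerate mRNA isoforms from ordered exons.

    exons:    dict of exon name -> RNA sequence, in genomic order.
    optional: set of exon names that may be skipped (cassette exons).
    Returns a dict of isoform label -> spliced mRNA sequence.
    """
    names = list(exons)
    cassette = [name for name in names if name in optional]
    result = {}
    for choice in product([True, False], repeat=len(cassette)):
        keep = dict(zip(cassette, choice))
        kept = [name for name in names if keep.get(name, True)]
        result["-".join(kept)] = "".join(exons[name] for name in kept)
    return result


if __name__ == "__main__":
    # Hypothetical three-exon gene; E2 is a cassette exon.
    toy_exons = {"E1": "AUGGCU", "E2": "GAAUUC", "E3": "UGGUAA"}
    for label, mrna in isoforms(toy_exons, optional={"E2"}).items():
        print(label, mrna)
```

With n cassette exons this model yields 2^n isoforms, which gives a rough sense of how one gene can produce many distinct mRNAs.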

All Section 3 Prompts Listed Below

Supporting Prompt | Source
I want to make DNA gel art in the style of Paul Venouse’s gel electrophoresis works. I want to use the website below to help me:https://rcdonovan.com/gel-art How should I go about making this art virtually? I have a Benchling account, an Internet connection, and no access to a wet lab | Perplexity
Help me design a specific lane‑by‑lane “band plan” for your first virtual gel artwork. My knowledge of Benchling is very minimal and my desired art is a right-facing arrow | Perplexity
Not understanding the 3rd Step under Step 3. How do I run any digest that gives me 2 fragments? How do I know my target size for each copy of the Arrow_Base sequence? | Perplexity
I see a scissors icon. Write out a literal “click this, then this” sequence for one lane so I can duplicate it for the others | Perplexity
Ok I’ve found and made 5 digests of the requested bp sizes. When I’m seeing the combined virtual digest I’m not seeing an arrow. What are next steps I should take? | Perplexity
Ok – let’s take a step back. I’ve tried what you’ve said and am not seeing a right facing arrow. Think I need to learn how to do this from the ground up with some more tweaking. I know how to select the Digests option in Benching, I know how to save a Digest, and I know how to click on ‘Virtual Digest’ and select different kinds of ladders. What type of ‘Cut sites visible on maps’ should I select to begin compiling the right facing arrow? Under ‘Show enzymes that cut’ is there anything in particular I should select? Should I select ‘Highlight enzymes with compatible sticky ends’? How do I actually learn how to make the right facing arrow? | Perplexity
This seems like a good workflow. How do I create a new DNA with a 5000bp length? I assume I have to import a sequence of that length from an NCBI accession correct? | Perplexity
When I type in HindIII I see 0 cuts. I can’t make a ~4000bp band based on that, right? | Perplexity
Everything is coming up a 0 when I try to put in an enzyme. I think there’s hallucinating going on or there is something wrong when you started having me make the 5kb sequence | Perplexity
When I select an enzyme for a digest, do I then need to select a cut site to make the bio art or no? | Perplexity
Dumb question: How do I see the length of a digest for a given enzyme in a sequence? | Perplexity
Tell me how to search NCBI for an accession | Perplexity
When I look inside the Digest feature in Benchling, how do I find an enzyme that can give me a cut of a certain length? I see Name in one column, followed by Cuts, but am not exactly understanding what I’m seeing here | Perplexity
wondering if there are any special proteins found in the praying mantis insect and what exactly about these proteins make them special | Google AI Mode
Are there any proteins in nature that have unbelievable economy of space that would make them particularly useful for data storage? | Google AI Mode
Are there any proteins in nature that have unbelievable economy of space that would make them particularly useful for data storage? | Google AI Mode
i want to make a box where I can put some text in in markdown hugo relearn theme. I don’t want to create a table. What should i create? | Google AI Mode
When one does codon optimization and the sequence in question comes from a protein traditionally associated with a given species (let’s say an insect), does one traditionally optimize the codon sequence for that same species or its genus/family of species? How does this work given standard codon optimization practices? | Perplexity
So in essence, we perform codon optimization so our sequence in question can be expressed with greater fidelity or reliability in the host? Or so the host can receive or incorporate the sequence as efficiently as possible? Let me know if my thinking or terminology is off here | Perplexity
I have a nucleotide sequence for the Mantis Fibroin 1 protein. Have learned about some of the Mantis Fibroin 1 protein’s interesting properties, namely how it helps create a coiled yet flexible casing around Praying Mantis eggs. At this point, I think this protein might have some biomimetic potential, but am not sure what organism I should optimize the sequence for. Traditionally I know E. coli and Baker’s yeast are used a lot in synthetic biology applications, and I know mammalian cells are apparently more challenging to work with. For a use case like this one, where I have an insect-associated protein that may have biomimetic properties, let me know some traditional host organisms in biotechnology that are used for codon optimization in cases like this one. Do NOT hallucinate. Use existing sources. If you cannot provide anything whatsoever, say so | Perplexity
When we have a protein we find in the wild, and then we codon optimize its nucleotide sequence for expression in a host organism, what do cell-dependent or cell-free methods to produce this codon optimized protein from the sequence mean in a biotechnological context? What exactly are we talking about? | Perplexity
Based on the answer to the previous prompt, what is a promoter? What is a lysate mix? Do NOT hallucinate when answering these questions | Perplexity
What are some cell-dependent methods of producing proteins from DNA in biotechnology? Are there multiple types of cell-dependent methods? Do NOT hallucinate when answering these questions | Perplexity
Are cell-dependent methods for producing proteins from DNA in biotechnology distinct from cell-dependent technologies for producing proteins from DNA or are the terms ‘methods’ and ’technologies’ essentially interchangeable in a biotechnology context? If they’re not, describe some cell-dependent technologies for producing proteins from DNA. Do NOT hallucinate when addressing this query. If you cannot answer this question, say so | Perplexity
explain to me how cell-free expression of a protein works | Google AI Mode
Does the histone code have anything to do with the ability for a single gene in nature to code for multiple proteins at the transcriptional level? How does the histone code relate to the transcriptome? Are they one and the same?. Do NOT hallucinate when answering these questions. If you don’t know the answers to these questions, say so | Perplexity
Based on the answer to the last prompt, then what does allow for a single gene in nature to code for multiple proteins at the transcriptional level? In plain terms, what does “…different exon combinations produce distinct mRNA isoforms from one gene, which then translate into varied proteins” mean? What are exons again, and how do they produce different combinations? What is an mRNA isoform? Do NOT hallucinate when answering this query | Perplexity
Thank you for the answer to the last query. What exactly is being sliced (the gene itself or something else), and why is it being spliced? Why would nature/evolution create this ability? Do NOT hallucinate when answering this query. Go off existing literature | Perplexity
Does alternative splicing operate at the transcriptional level? From what I can see in the link below, it operates on the translational level. https://www.yourgenome.org/theme/what-is-rna-splicing/. If there was hallucination, or an error in answering the previous prompts, say so. Or, if I’m misreading or misunderstanding things, say so. Just wondering what at the transcriptional level allows for a single gene to code for multiple proteins | Perplexity
how does a single gene in nature code for multiple proteins at the transcriptional level? | Google AI Mode
I want to align a DNA sequence, its transcribed RNA, and a resulting translated protein. I believe I can capture the separate pieces (the DNA sequence, the transcribed RNA, and the resulting translated protein) in Benchling. If this is true, how can I go about doing this? | Perplexity
I have a codon optimized sequence for the Mantis Fibroin 1 protein in a Saccharomyces cerevisiae (baker’s yeast) host. I want to produce the RNA sequence and the final translated protein. What services online can I use to do this? | Perplexity
What does forward and reverse translation of a DNA sequence in Benchling mean? | Perplexity
I have a codon optimized nucleotide sequence (DNA). How can I find what the RNA sequence and translated protein look like for this sequence? What services can I use to see these items? | Google AI Mode

Part 4: Prepare a Twist DNA Synthesis Order

4.1: Create a Twist account, and Benchling account

Twist Account Creation

Benchling Account Creation Confirmation

4.2: Build Your DNA Insert Sequence

Original Sequence Insertion. NOTE: I think I may have started off by inserting the wrong sequence. This may have been fixed when I inserted an sfGFP sequence from NCBI.

Codon Optimization. NOTE: I think I may have started off by inserting the wrong sequence. This may have been fixed when I inserted an sfGFP sequence from NCBI.

Corrected NCBI Sequence Insertion

Corrected NCBI Sequence Codon Optimization

Start Codon Annotation

Stop Codon Annotation

Promoter BBa_J23106 Insertion

RBS Insertion

Coding Sequence Insertion

7x His Tag Insertion

Terminator BBa_B0015 Insertion

Sequence Linear View

  • Downloaded Sequence FASTA file (via Mac OS TextEdit). See below:

    HQ873313 (codon optimized) TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGCCATTAAAGAGGAGAAAGGTACCatgAGCAAAGGAGAAGAACTTT TCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCACAAATTTTCTGTCCGTGGAGAGGGTGA AGGTGATGCTACAAACGGAAAACTCACCCTTAAATTTATTTGCACTACTGGAAAACTACCTGTTCCGTGGCCAACACTT GTCACTACTCTGACCTATGGTGTTCAATGCTTTTCCCGTTATCCGGATCACATGAAACGGCATGACTTTTTCAAGAGTG CCATGCCCGAAGGTTATGTACAGGAACGCACTATATCTTTCAAAGATGACGGGACCTACAAGACGCGTGCTGAAGTCAA GTTTGAAGGTGATACCCTTGTTAATCGTATCGAGTTAAAGGGTATTGATTTTAAAGAAGATGGAAACATTCTTGGACAC AAACTCGAGTACAACTTTAACTCACACAATGTATACATCACGGCAGACAAACAAAAGAATGGAATCAAAGCTAACTTCA AAATTCGCCACAACGTTGAAGATGGTTCCGTTCAACTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCC TGTCCTTTTACCAGACAACCATTACCTGTCGACACAATCTGTCCTTTCGAAAGATCCCAACGAAAAGCGTGACCACATG GTCCTTCTTGAGTTTGTAACTGCTGCTGGGATTACACATGGCATGGATGAGCTCTACAAAcgtaaaggcgaggagctgt tcactggtgtcgtccctattctggtggaactggatggtgatgtcaacggtcataagttttccgtgcgtggcgagggtga aggtgacgcaactaatggtaaactgacgctgaagttcatctgtactactggtaaactgccggtaccttggccgactctg gtaacgacgctgacttatggtgttcagtgctttgctcgttatccggaccatatgaagcagcatgacttcttcaagtccg ccatgccggaaggctatgtgcaggaacgcacgatttcctttaaggatgacggcacgtacaaaacgcgtgcggaagtgaa atttgaaggcgataccctggtaaaccgcattgagctgaaaggcattgactttaaagaagacggcaatatcctgggccat aagctggaatacaattttaacagccacaatgtttacatcaccgccgataaacaaaaaaatggcattaaagcgaatttta aaattcgccacaacgtggaggatggcagcgtgcagctggctgatcactaccagcaaaacactccaatcggtgatggtcc tgttctgctgccagacaatcactatctgagcacgcaaagcgttctgtctaaagatccgaacgagaaacgcgatcatatg gttctgctggagttcgtaaccgcagcgggcatcacgcatggtatggatgaactgtacaaatgaCATCACCATCACCATC ATCACtaaCCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTG AACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA
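A quick way to sanity-check a construct like the one above is to confirm that its parts appear in the expected order (promoter, RBS, start codon, His tag) and to see where the first in-frame stop codon falls. The sketch below is a minimal illustration, not part of the Benchling workflow: the function names and the short toy construct are hypothetical, and the promoter, RBS, and His tag motifs are copied from the sequence above.

```python
# Motifs copied from the construct above; the labels are chosen for this sketch.
FEATURES = [
    ("promoter BBa_J23106", "TTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGC"),
    ("RBS", "CATTAAAGAGGAGAAAGGTACC"),
    ("start codon", "ATG"),
    ("7x His tag", "CATCACCATCACCATCATCAC"),
]
STOP_CODONS = {"TAA", "TAG", "TGA"}


def locate_features(construct, features=FEATURES):
    """Return [(name, start_index)], requiring the features to appear in order."""
    seq = construct.upper()
    hits = []
    cursor = 0
    for name, motif in features:
        pos = seq.find(motif, cursor)
        if pos == -1:
            raise ValueError(f"{name} not found after position {cursor}")
        hits.append((name, pos))
        cursor = pos + len(motif)
    return hits


def first_in_frame_stop(construct, start):
    """Index of the first stop codon in frame with `start`, or -1 if none."""
    seq = construct.upper()
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return -1


# Hypothetical toy construct: a 3-codon stand-in for the coding sequence.
toy = (FEATURES[0][1] + FEATURES[1][1] + "ATG" + "AGCAAAGGA"
       + FEATURES[3][1] + "TAA")
positions = dict(locate_features(toy))
stop_at = first_in_frame_stop(toy, positions["start codon"])
print(positions)
print("first in-frame stop at", stop_at)
```

If the first in-frame stop index is smaller than the His tag position, translation would end before the tag is reached, which is worth checking before placing an order.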

4.3: On Twist, Select the Genes Option

Selected Genes

4.4: Select the ‘Clonal Genes’ Option

Clonal Genes selected; see results in subsections 4.5 and 4.6 below

4.5: Import your sequence

Imported Sequence (Step 1)

Imported Sequence (Step 2)

4.6: Choose Your Vector

Chose Vector

Downloaded Construct

Imported sequence into Benchling and viewed resulting plasmid

All Section 4 Prompts Listed Below

Supporting Prompt | Source
Tell me how to add a Promoter to a sequence in Benchling | Perplexity
Found this information from the Registry of Standard Biological Parts: BBa_J23106 Can you break down what this naming convention means and how I can find the relevant Promoter information in a sequence based on this naming convention? Do NOT hallucinate. If you don’t know the answer, say so | Perplexity
What is an alignment in Benchling? In Benchling, how do I put a codon optimized sequence under or next to a sequence I originally imported? Do NOT hallucinate when answering this question | Perplexity
How do I replace a sequence in Benchling with a codon-optimized sequence? | Perplexity
Bit confused regarding how to find a Promoter in a sequence in Benchling. I tried Auto-Annotate and it doesn’t seem to be working. Where should I go from here? | Perplexity
What is an RBS in Benchling? | Perplexity
What is a 7x His Tag? What is a Terminator? How do I find these in Benchling? Where are these traditionally inserted into a sequence in Benchling? | Perplexity
How do I paste sequences into a Benchling file? | Perplexity
How do I know where to insert a Promoter into a given sequence in Benchling? | Perplexity
Not totally understanding. If the start codon (the ATG) represents the start of the sequence, how do I insert something before that in Benchling? | Perplexity
What is an RBS? Where are they traditionally inserted into a sequence? | Perplexity
What do spacers look like in Benchling? Is it literally just empty space with no letters/codons? Something tells? | Perplexity
Where is a coding sequence traditionally inserted in a codon optimized sequence in Benchling? If there’s something off in what I’m saying, let me know | Perplexity
Where is a C-terminus in a protein in Benchling? | Perplexity
How do I find an amino acid view for a sequence in Benchling? | Perplexity
In Benchling, if I’m inserting a 7x His Tag and a Terminator, and I have a stop codon in my sequence, what is the traditional sequence? Is it 7x His Tag, stop codon, Terminator? Something else? | Perplexity
Any way I can add a Schema to a sequence after the fact in Benchling? | Perplexity

Part 5: DNA Read/Write/Edit

5.1 DNA Read

  • (i) What DNA would you want to sequence (e.g., read) and why?

  • I want to sequence the DNA of a bdelloid rotifer of the genus Adineta, a microscopic invertebrate that looks somewhat like a worm. I’m fascinated by its ability to sustain cryptobiosis for thousands of years (a bdelloid rotifer recovered from Siberian permafrost in 2015 was radiocarbon-dated to about 24,000 years old!). Transgenesis of the bdelloid rotifer’s cryptobiotic abilities into mammals could have a profound impact on the future of our species, specifically on the ability of groups of Homo sapiens, or of future sapiens forms, to undertake interstellar travel across vast spans of time and space. More information on the bdelloid rotifer genus Adineta and its cryptobiotic abilities can be found in the footnote at the end of this sentence 4.

  • (ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?

    • I’d use Next-Generation Sequencing (NGS) on my DNA because it’s well-suited for transgenesis. It’s fast, high-resolution, and allows for massively parallel sequencing (if desired). More importantly, it’s highly precise, meaning it can pinpoint transgene locations within a host genome.
  • Is your method first-, second- or third-generation or other? How so?

    • It’s a second-generation sequencing method: it emerged in the 2000s, after the advent of Sanger sequencing in the 1970s and before the advent of single-molecule sequencing in the 2010s.
  • What is your input? How do you prepare your input (e.g. fragmentation, adapter ligation, PCR)? List the essential steps.

    • If the gene exists in an organism, the initial input is extracted genomic DNA from that particular organism. Otherwise, the initial input can be a plasmid (if the DNA’s already cloned) or complementary DNA (cDNA) if only mRNA is available. Essential preparation steps listed below (assuming gene exists in an organism):
      • Isolate or extract the gene; lyse cells or tissues from donor, remove proteins and contaminants
      • Fragment the isolated or extracted gene into pieces of approx. 200-600bp
      • Convert ragged ends into blunt or sticky ends
      • Attach adapter sequences to ends of each fragment
      • Enrich fragments of the intended size (i.e., the size you want), removing any remaining small artifacts and excess adapters
      • Review fragment size distribution
      • Convert double-stranded library into single strands if necessary
      • Load into sequencing instrument
  • What are the essential steps of your chosen sequencing technology, how does it decode the bases of your DNA sample (base calling)?

    • The essential NGS steps are listed below:
      • Preparation (see above)
      • Amplification: Many copies of each DNA fragment are created on a flow cell. One end of each fragment sticks to a primer on the surface, gets copied into a complementary strand, and bends over to stick to a neighboring primer. This bridging repeats many times, forming amplified (hence the name) clusters of identical fragments
      • Sequencing: A polymerase enzyme adds one colored fluorescent nucleotide to each strand in each cluster. A camera takes a picture, recording the color of the newly added nucleotide (revealing its A, T, G, or C base); chemicals then wash away any free-floating nucleotides that aren’t part of a cluster.
      • Base Decoding/Base Calling: A computer analyzes these colored cluster images, assigning each cluster a sequence of bases based on its colors. This analysis then becomes a text string of DNA code, with a confidence rating for each base call based on the resolution of the read 5
  • What is the output of your chosen sequencing technology?

    • See above. NGS outputs a text string of DNA code, with confidence ratings per base sequence
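To make the "text string of DNA code, with confidence ratings" concrete, here is a minimal sketch of how those ratings are commonly stored. It assumes the FASTQ file format with Phred+33 quality encoding, which the answer above does not name; the read itself and the function names are made up for illustration.

```python
def phred_scores(quality_line, offset=33):
    """Convert a FASTQ quality string into one Phred score per base.

    Assumes Phred+33 encoding (an assumption of this sketch).
    """
    return [ord(ch) - offset for ch in quality_line]


def error_probability(q):
    """Phred score Q corresponds to an error probability of 10^(-Q/10)."""
    return 10 ** (-q / 10)


# Hypothetical four-line FASTQ record: name, bases, separator, qualities.
record = ["@read_1", "GATTACA", "+", "IIII5+!"]
bases, qualities = record[1], record[3]

for base, q in zip(bases, phred_scores(qualities)):
    print(base, q, f"{error_probability(q):.4f}")
```

A higher score means a more trustworthy base call: under this encoding, a score of 20 corresponds to a 1-in-100 chance that the base is wrong.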

5.2 DNA Write

  • (i) What DNA would you want to synthesize (e.g., write) and why?

    • See answer to 5.1 (i). I’m fascinated by the potential of transgenesis of the bdelloid rotifer Adineta’s cryptobiotic abilities for interstellar travel, and therefore, am very interested in reading and editing its sequence.

    NOTE: I found a bdelloid rotifer Adineta sequence in the link in the footnote. However, the raw FASTA information is so long that upon insertion into this webpage, it seemingly broke the webpage, or caused it to freeze up (pardon the unintentional cryptobiosis pun) 6

  • (ii) What technology or technologies would you use to perform this DNA synthesis and why?

    • I’d use PCR amplification, chemical synthesis, and restriction enzymes and ligation to perform this DNA synthesis. I’d use PCR so I can make many copies of the original DNA; I’d use chemical synthesis so the DNA can be cloned into a given plasmid, packaging it for the desired level of expression in an appropriate vehicle; and I’d use restriction enzymes and ligation for precise assembly.
    • What are the essential steps of your chosen sequencing methods?
      • PCR amplification:
        • Denaturation: Heat DNA so it separates into single strands
        • Annealing: Cool the DNA, allowing primers to bind to sites in target gene
        • Extension: DNA polymerase adds nucleotides from the primer starting point, allowing each strand to fully copy
      • Chemical synthesis:
        • Deprotection: Remove protecting DMT group
        • Base Coupling: Add protected phosphoramidite nucleotide for phosphate linkage
        • Capping: Cap any chains that failed to couple so they can’t keep growing and introduce errors
        • Oxidation: Oxidize the new linkage into a stable phosphate triester bond
      • Restriction enzymes and ligation:
        • Stitch oligos from prior chemical synthesis step into a complete gene for insertion into plasmid
        • Add flanking restriction sites at ends of oligos as needed
        • Clean the isolated DNA via gel extraction
        • Incubate the gene insert and its plasmid vector with matching restriction enzymes. The enzymes recognize specific sequences and cut there, creating blunt or sticky ends
        • Use DNA ligase enzyme to join compatible ends
    • What are the limitations of your sequencing method (if any) in terms of speed, accuracy, scalability?
      • A scalability challenge is that it’s difficult to synthesize a sequence of more than 200nt via direct synthesis methods because at a certain point, too many errors accumulate in the synthesized sequence
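One simple way to see why direct synthesis gets harder with length is to model each coupling step as succeeding with a fixed probability, so the fraction of full-length product shrinks geometrically. The sketch below is a rough illustration under that assumption; the 99% coupling efficiency is an illustrative value, not a figure from the lecture or from any vendor, and the function name is hypothetical.

```python
def full_length_fraction(length_nt, coupling_efficiency=0.99):
    """Fraction of chains that reach full length.

    A chain of length_nt bases needs (length_nt - 1) successful couplings
    after the first base. The default efficiency is an assumed value.
    """
    if length_nt < 1:
        raise ValueError("length_nt must be at least 1")
    if not 0 < coupling_efficiency <= 1:
        raise ValueError("coupling_efficiency must be in (0, 1]")
    return coupling_efficiency ** (length_nt - 1)


for length in (20, 100, 200, 400):
    print(length, "nt:", round(full_length_fraction(length), 3))
```

Under these assumptions only a small minority of 200nt chains come out full length, which is one reason longer genes are usually stitched together from shorter oligos.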

5.3 DNA Edit

  • (i) What DNA would you want to edit and why?
    • See answer to 5.1 (i). I’m fascinated by the potential of transgenesis of the bdelloid rotifer Adineta’s cryptobiotic abilities for interstellar travel, and therefore, am very interested in reading and editing its sequence.
  • (ii) What technology or technologies would you use to perform these DNA edits and why?
    • I’d use CRISPR-Cas9 to perform these DNA edits because when performing transgenesis from an invertebrate to a mammalian vertebrate, it allows for non-random, precise insertion of large amounts of genetic data with reduced risk of unintended or off-target effects.
    • How does your technology of choice edit DNA? What are the essential steps?
      • CRISPR-Cas9 edits DNA through a multi-stage mechanism. This mechanism is broken down below:
        • Recognition: A single guide RNA (sgRNA) pairs with a Cas9 protein
        • This pair scans the genome for a 20bp DNA sequence that matches the sgRNA and sits next to a protospacer adjacent motif (PAM)
        • If there is a PAM next to the desired 20bp DNA sequence, the Cas9 makes a double-stranded break (DSB)
        • The DSB then triggers repair mechanisms (either non-homologous end joining [NHEJ] or homology-directed repair [HDR]). These repair mechanisms allow the desired edited DNA to be incorporated into the sequence
    • What preparation do you need to do (e.g. design steps) and what is the input (e.g. DNA template, enzymes, plasmids, primers, guides, cells) for the editing?
      • Preparation:
        • Select target site(s)
        • Design and synthesize gRNA
        • Build DNA donor templates
        • Create mixture of recombinant Cas9 protein and purified gRNA
        • Prepare cells/embryos
      • Inputs to this process are gRNA, Cas9 protein, donor DNA (usually linear templates), and a delivery vehicle (usually an injection buffer)
    • What are the limitations of your editing methods (if any) in terms of efficiency or precision?
      • HDR can have low efficiency in a transgenesis context
      • Off-target or unintended consequences can still occur
      • Need PAMs near target sequences for precise DSBs
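The PAM requirement described above can be sketched as a simple scan for candidate target sites. This is a minimal illustration only: it assumes the NGG PAM used by the common Streptococcus pyogenes Cas9, searches the forward strand only, and does no off-target or efficiency scoring. The function name and toy sequence are hypothetical.

```python
def find_cas9_targets(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for each site followed by an NGG PAM.

    Forward strand only; NGG is an assumption of this sketch.
    """
    seq = seq.upper()
    targets = []
    # Stop early enough that a full protospacer plus 3-base PAM still fits.
    for start in range(len(seq) - protospacer_len - 2):
        pam = seq[start + protospacer_len:start + protospacer_len + 3]
        if pam[1:] == "GG":
            targets.append((start, seq[start:start + protospacer_len], pam))
    return targets


# Toy sequence: 20 bases, then a TGG PAM, then filler with no further PAMs.
toy_sequence = "A" * 20 + "TGG" + "CCCC"
for start, protospacer, pam in find_cas9_targets(toy_sequence):
    print(start, protospacer, pam)
```

A real guide design would also check the reverse strand and screen each candidate against the rest of the genome, which is where the off-target limitation listed above comes in.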

All Section 5 Prompts Listed Below

Supporting Prompt | Source
Remind me what a digest is in a biotechnological context. I know it has something to do with subdividing DNA sequences into fragments based on enzymes, but there’s some additional information I know I’m missing. Do NOT hallucinate when answering this question | Perplexity
What is horizontal gene transfer? A separate question (perhaps): What is the technical term in biotechnology for transferring the abilities of one organism to another (ex. if I wanted to actually give a lizard the ability to fly like a bird by importing genetic properties that allow for the creation of wings for example)? | Perplexity
What is it called in biotechnology when traits from one organism are transferred or conferred to another via an engineered process or processes? | Google AI Mode
If I want to perform transgenesis in a biotechnological context (i.e., introduce a foreign gene into a new organism to confer a desired trait), and I want to start this process by sequencing the original foreign gene, what is considered the best practice in modern biotechnology for sequencing this original foreign gene? Is this sequencing method first, second, or third generation in the history of biotechnology? From some other period? What essential steps does it involve and how does it decode the bases of the original foreign gene? What is its output? Do NOT hallucinate when answering these questions. If you don’t know the answer to any of these questions, say so | Perplexity
How is Next-Generation Sequencing (NGS) considered second generation? Do NOT hallucinate when answering this question | Perplexity
If I want to perform transgenesis in a biotechnological context (i.e., introduce a foreign gene into a new organism to confer a desired trait), and I want to start this process by sequencing the original foreign gene via Next-Generation Sequencing (NGS), what is my input at the very beginning of the sequencing process? How is that input prepared for sequencing? Do NOT hallucinate when answering these questions. If you don’t know the answer to any of these questions, say so | Perplexity
What is cDNA? | Perplexity
I have the following information describing and Illumina Next-Generation Sequencing (NGS workflow: –Library Prep: Extract/isolate the gene (PCR amplify if needed), fragment to ~200-500 bp, add adapters/barcodes.–Amplification: Bridge amplification on flow cell creates clusters of identical fragments.–Sequencing: DNA polymerase incorporates fluorescent reversible terminator nucleotides (A/T/G/C); camera captures color/emission per base, cleaving terminator for next cycle.–Imaging/Analysis: Base calling from images, alignment to reference or de novo assembly. Explain the amplification through imaging/analysis steps to me as if I was a reasonably educated 16 year old without an advanced biotechnology background. Tell me what the terms in the amplification through imaging/analysis steps mean. Do NOT hallucinate when addressing this query | Perplexity
What is a polymerase enzyme? Explain this to me as if I was a reasonably educated 16 year old without an advanced biotechnology background | Perplexity
Found the following on NCBI: “Uncultured bdelloid rotifer isolate Undet.AN.1.3 cytochrome oxidase subunit I gene, partial cds; mitochondrial" Is that the same as the full bdelloid rotifer genome? Bit confused and am unsure whether or not it is Do NOT hallucinate. If you don’t know the answer to this question, say so | Google AI Mode
Yes, I’m looking for a complete nuclear genome of a bdelloid rotifer, although I’m not sure which species of bdelloid rotifer I’m looking for. I thought bdelloid rotifer was its own species. Open to having any misconceptions cleared up. If you can provide me links where I can find a complete bdelloid rotifer nuclear genome, that would be greatly appreciated | Google AI Mode
Am aware that there was a bdelloid rotifer that came out of long-term cryptobiosis in Russia back in 2015. Any chance we know/you can find the specific species of this rotifer that had this cryptobiotic ability? If so, I’d like the complete nuclear genome for that rotifer species | Google AI Mode
In the answer two prompts ago, there was a mention of “chemicals wash away extras” in the ‘Sequencing by synthesis’ step. What are ’extras’ in this context? Do NOT hallucinate when answering this question | Perplexity
Is the workflow sequence described in the answer to the prompt 2 prompts ago Polymerase Chain Reaction (PCR)? Believe so, but am not sure | Perplexity
If I want to perform transgenesis and I start by Next-Generation Sequencing (NGS) to read the DNA, what comes after (i.e., what technologies are traditionally used for writing and then editing the DNA or the original organism that has the abilities I want to confer in a host)? Do NOT hallucinate when answering this question | Perplexity
In the answer to the last prompt, can you give me the ‘So What?’ or ‘So What’? behind the ‘Writing the DNA (Gene Preparation & Cloning)’ section from a transgenesis perspective? What do gene synthesis steps like “clone it into a plasmid vector with promoter, terminator, and selection marker.” allow one to do in a transgenesis context? What does restriction enzyme digestion and ligation allow one to do? Do NOT hallucinate. If you don’t know the answers to these questions, say so | Perplexity
I have a DNA writing workflow consisting of the following steps:–PCR amplification: Use the sequenced gene info to design primers and amplify the gene from the original source DNA.–Gene synthesis: Chemically synthesize the DNA sequence (especially if optimized for the host), then clone it into a plasmid vector with promoter, terminator, and selection marker.–Restriction enzyme digestion & ligation: Cut the vector and gene insert with enzymes, then join (ligate) them using DNA ligase to create a recombinant plasmid Can you give me the ‘So What?’ or ‘So What’? behind this workflow from the perspective of someone who wants to perform transgenesis from one organism to another? What do gene synthesis steps like “clone it into a plasmid vector with promoter, terminator, and selection marker.” allow one to do in a transgenesis context? What does restriction enzyme digestion and ligation allow one to do? Do NOT hallucinate. If you don’t know the answers to these questions, say so | Google AI Mode
How does PCR amplification and gene synthesis actually work? What are the essential steps? | Perplexity
How does the content in the last prompt related to the Phosphoramidite DNA Synthesis Cycle? Do NOT hallucinate when answering this question | Perplexity
What is a dNTP in genomics? | Perplexity
What is a ‘phosphate triester’? Explain this to me in simple terms. Do NOT hallucinate | Perplexity
Understanding the phosphoramidite DNA synthesis cycle. How does this transition to the essential steps of restriction enzyme digestion and ligation? Do NOT hallucinate when answering this question | Perplexity
If I want to perform transgenesis from one organism to another in a biotechnological context, particularly if I want to confer a trait from an invertebrate organism to a vertebrate, mammalian organism, what is/are the recommended DNA technology or technologies for accomplishing this task? What are the benefits and drawbacks, and respective workflows of each ot these technologies? Where does CRISPR fit into the mix? | Perplexity
Can you elaborate on the CRISPR-Cas9 workflow from the answer to the last prompt, specifically describing how it edits DNA? Can you make a workflow just focused on that? Explain this workflow to me as if I was a reasonably educated person with some (not extensive) biology and biotechnology knowledge. Do NOT hallucinate when answering this query | Perplexity

  1. https://www.uniprot.org/uniprotkb/I3PM87/entry#sequences ↩︎

  2. https://rest.uniprot.org/uniprotkb/I3PM87.fasta ↩︎

  3. https://www.yourgenome.org/theme/what-is-rna-splicing/ ↩︎

  4. Excerpt from “The Great Siberian Thaw” (New Yorker Magazine; 2022-01-17): “Permafrost thaw has brought to the surface all sorts of mysteries from millennia past. In 2015, scientists from a Russian biology institute in Pushchino, a Soviet-era research cluster outside Moscow, extracted a sample of yedoma from a borehole in Yakutia. Back at their lab, they placed the piece of frozen sediment in a sterilized culture box. A month later, a microscopic, wormlike invertebrate known as a bdelloid rotifer was crawling around inside. Radiocarbon dating revealed the rotifer to be twenty-four thousand years old. In August, I drove out to Pushchino, where I was met by Stas Malavin, a researcher at the laboratory. “It’s one thing for a simple bacterium to come back to life after being buried in the permafrost,” he said. “But this creature has intestines, a brain, nervous cells, reproductive organs. We’re clearly dealing with a higher order.” The rotifer had survived the intervening years in a state of “cryptobiosis,” Malavin explained, “a kind of hidden life, where metabolism effectively slows down to zero.” The animal emerged from this geological “time machine,” as he put it, not just alive but able to reproduce. A rotifer lives for only a few weeks, but replicates itself multiple times through parthenogenesis, a type of asexual reproduction. Malavin removed from the lab fridge a direct descendant of the rotifer that had crawled out of the permafrost and placed it under a microscope. An oval-shaped plankton squirmed around; I imagined this blob, two-tenths of a millimetre in size, as a nervous explorer who awoke to find itself in a strange and unexpected future. “Why be modest?” Malavin asked. Unlocking the secret of how an animal with a complex anatomy was able to shut down for tens of thousands of years and then turn itself back on might, for example, offer hints for using cryogenic conditions to store organs for donation. Neuroscientists at M.I.T. have been in touch. 
“I’m obviously not saying our findings will lead to people being put into long-term cryogenic slumber tomorrow,” Malavin said. “But it’s a step in that direction.”” ↩︎

  5. I think this might mean the resolution of the image of the cluster ↩︎

  6. https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_021613535.1/ ↩︎

Week 3 HW: Lab Automation

The Power of Lab Automation

Assignment: Python Script for Opentrons Artwork

  • 0: Attended this week’s recitation and reviewed the lab information on programming Opentrons
  • 1: Generated an artistic design using Ronan’s Opentrons GUI 1
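  • 2: Artistic Design Python Script: See script at the URL below: https://colab.research.google.com/drive/1-pgSJt_aF9MydtG0szxz2YKoogNRLRhH#scrollTo=PsOgJ2DndZzt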
  • 3: Listing my sfgfp point coordinates from Ronan’s Opentrons GUI below (the shape is a rightward-facing green arrow):
    • [(6.6,11), (8.8,11), (11,11), (8.8,8.8), (11,8.8), (13.2,8.8), (11,6.6), (13.2,6.6), (15.4,6.6), (13.2,4.4), (15.4,4.4), (17.6,4.4), (15.4,2.2), (17.6,2.2), (19.8,2.2), (17.6,0), (19.8,0), (22,0), (-22,-2.2), (-19.8,-2.2), (-17.6,-2.2), (-15.4,-2.2), (-13.2,-2.2), (-11,-2.2), (-8.8,-2.2), (-6.6,-2.2), (-4.4,-2.2), (-2.2,-2.2), (0,-2.2), (2.2,-2.2), (4.4,-2.2), (6.6,-2.2), (8.8,-2.2), (11,-2.2), (13.2,-2.2), (15.4,-2.2), (17.6,-2.2), (19.8,-2.2), (22,-2.2), (24.2,-2.2), (-22,-4.4), (-19.8,-4.4), (-17.6,-4.4), (-15.4,-4.4), (-13.2,-4.4), (-11,-4.4), (-8.8,-4.4), (-6.6,-4.4), (-4.4,-4.4), (-2.2,-4.4), (0,-4.4), (2.2,-4.4), (4.4,-4.4), (6.6,-4.4), (8.8,-4.4), (11,-4.4), (13.2,-4.4), (15.4,-4.4), (17.6,-4.4), (19.8,-4.4), (22,-4.4), (24.2,-4.4), (26.4,-4.4), (-22,-6.6), (-19.8,-6.6), (-17.6,-6.6), (-15.4,-6.6), (-13.2,-6.6), (-11,-6.6), (-8.8,-6.6), (-6.6,-6.6), (-4.4,-6.6), (-2.2,-6.6), (0,-6.6), (2.2,-6.6), (4.4,-6.6), (6.6,-6.6), (8.8,-6.6), (11,-6.6), (13.2,-6.6), (15.4,-6.6), (17.6,-6.6), (19.8,-6.6), (22,-6.6), (24.2,-6.6), (17.6,-8.8), (19.8,-8.8), (22,-8.8), (15.4,-11), (17.6,-11), (19.8,-11), (13.2,-13.2), (15.4,-13.2), (17.6,-13.2), (11,-15.4), (13.2,-15.4), (15.4,-15.4), (8.8,-17.6), (11,-17.6), (13.2,-17.6), (6.6,-19.8), (8.8,-19.8), (11,-19.8)]

  • 4: Used Gemini 2.5 Flash (built into Google Colab) to assist with completing the coding portion of the homework. I have some rough Python knowledge via a Codecademy course, which helped get things started (i.e., I did do some of the coding for this assignment).

All Gemini 2.5 Flash prompts are listed below:

Supporting Prompt | Model
I want to create some code similar to the code in Examples 1-7. What are the core elements of the code I need to create? | Gemini 2.5 Flash
Been working on some code in the ‘Your Code’ module. Have made a single green dot so far. Looking to create a rightward-facing green arrow based on these coordinates in the attached .py file Tell me how the code in the ‘Your Code’ module under the #Aspirate subsection needs to be edited to output the rightward facing green arrow in the attached .py file | Gemini 2.5 Flash
Wondering if you could help explain something. Not seeing an actual visualization of a green arrow below. Where is it? Can you give me a picture output of the code similar to the picture output located in the examples in this URL below? https://colab.research.google.com/drive/1VoouRH0nqlk09g50rHxOElaLD-SVknYY#scrollTo=PsOgJ2DndZzt | Gemini 2.5 Flash
Ensure the ‘run’ function executes all arrow_points | Gemini 2.5 Flash
Still getting a single green dot when I run the simulator. Have inputted the coordinates for the rightward-facing green arrow based on the attached file and am aware this will need to likely need a for loop to aspirate the colors. | Gemini 2.5 Flash
Still getting a single green dot when I run the simulator. Have inputted the coordinates for the rightward-facing green arrow based on the attached file and am aware this will need to likely need a for loop to aspirate the colors. Recommendations on how to proceed? | Gemini 2.5 Flash
Explain why only 1uL was dispensed instead of 20uL | Gemini 2.5 Flash
Looking at the code and see that I inputted the points for the rightward-facing arrow under arrow_points. Not understanding what parts of the code need to be changed (if any) so that when the simulation runs, a rightward-facing green arrow is outputted | Gemini 2.5 Flash
Thank you. Can you please tell me where cell “pczDLwsq64mk” is located, so I can have an idea of the code used to create the rightward-facing arrow in the simulation? | Gemini 2.5 Flash
  • 5: Coordinating robot time slot with William & Mary node
  • 6: Submitted Python file via assignment form (see screenshot below):

Python Form Submission Confirmation Screenshot
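Several of the prompts above circle the same issue: drawing every point in arrow_points rather than a single dot. The sketch below shows the looping logic in plain Python. It is not the submitted script and does not call the Opentrons API; plan_dots, dot_volume, and capacity are hypothetical names, and the 1 µL dot and 20 µL capacity are assumed values. In a real protocol, each planned step would correspond to one of the pipette's aspirate or dispense calls.

```python
def plan_dots(points, dot_volume=1.0, capacity=20.0):
    """Plan aspirate/dispense steps so that every point receives one dot.

    Returns tuples: ("aspirate", volume) or ("dispense", volume, x, y).
    The default volumes are assumed values for this sketch.
    """
    if dot_volume <= 0:
        raise ValueError("dot_volume must be positive")
    dots_per_load = int(capacity // dot_volume)
    if dots_per_load < 1:
        raise ValueError("capacity must hold at least one dot")

    steps = []
    # Work through the points in batches that fit in one pipette load.
    for first in range(0, len(points), dots_per_load):
        batch = points[first:first + dots_per_load]
        steps.append(("aspirate", len(batch) * dot_volume))
        for x, y in batch:
            steps.append(("dispense", dot_volume, x, y))
    return steps


# First three sfGFP points from the coordinate list in item 3.
arrow_points = [(6.6, 11), (8.8, 11), (11, 11)]
for step in plan_dots(arrow_points, dot_volume=1.0, capacity=2.0):
    print(step)
```

The key design choice is that the dispense call sits inside the loop over points, while the aspirate happens once per batch; putting both outside the loop is one way to end up with a single dot.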

Post-Lab Questions

  • Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.

    • The paper An Automated Versatile Diagnostic Workflow for Infectious Disease Detection in Low-Resource Settings was published in the journal Micromachines in 2024. It describes using simple Commercial Off-the-Shelf (COTS) reagents and lab equipment, along with an Opentrons robot, to create an automated workflow for detecting diseases in low-resource settings 2. What’s interesting about the paper is that the workflow the researchers designed reduced the time for detecting a pathogen (in this case, for meningitis) by approx. 18% (total time of 118 min.) with an almost 5.8x reduction in cost for sample processing 3. The total cost to run 8 samples for meningitis detection was approx. $126 USD, a cost saving that matters in low-resource environments. The paper’s findings, and possible extensions of its workflow, show promise for decentralized disease detection in low-resource settings during future health security incidents. Except for the opening and closing of tube lids, the following four steps of the automated workflow were completed by the Opentrons robot:
      • DNA isolation with Dynabeads
        • Sample incubation
        • Washing step
        • DNA resuspension
      • DNA amplification
        • Recombinase Polymerase Amplification (RPA) mix preparation
        • RPA amplification
      • DNA digestion
        • Exonuclease digestion
      • DNA detection
        • Preparation of Vertical Flow Microarray (VFM) solutions
        • Addition of samples to VFM
        • Signal enhancement

    Associated AI prompts to address this question included below:

Supporting Prompt | Model
Can you find me biotechnology related papers from the past 5 years (ideally from Western sources) that incorporate Opentrons or another lab automation tool to create or research a novel biosecurity application | Google Scholar Labs
Give me a rundown of what primers are in genomics in simple terms (don’t get overly technical, if possible). Explain this to me as if I was a reasonably educated 15 year-old. Tell me if primers are found in nature or if they’re an artificial construct. Tell me how primers are used in biotechnology Do NOT hallucinate when answering this query | Perplexity
In a gene amplification context, what does vortexing mean? | Perplexity
What are amplicons? | Perplexity
What is a ctrA gene? | Perplexity
What is a reagent in a diagnostic or biotechnology context? | Perplexity
  • Write a description about what you intend to do with automation tools for your final project.
    • Phage isolation experiments generally require enriching bacterial strains, filtering out bacterial particles, pouring the phage-containing mixture in with fresh bacteria on agar, plating and re-plating phage plaques (areas of phage propagation and bacterial destruction on agar), and characterizing the resulting phage plaques on the agar. My working thought is that I might be able to create a partly automated workflow, à la the paper referenced in the previous question, to help with the filtration through characterization steps of a phage isolation experiment. This might involve using an Opentrons robot and COTS equipment to:
      • Handle bacterial liquids
      • (Maybe) use an Opentrons module to shake bacterial liquids and reagents
      • Operate centrifuge for spinning bacteria
      • Pour agar with phage plaques
      • Operate micropipettes for phage testing
      • Operate Thermocycler module for amplifying specific lysed areas
      • Rapidly run software to characterize novel phage DNA

Note: The material above is not set in stone. It’s an outline of potential automation options for a potential final project (Space (LEO and beyond) or Suborbital Phage Isolation)

Associated AI prompts to address this question included below:

Supporting Prompt | Model
What do phage isolation experiments usually entail? Are any elements of phage isolation experiments dangerous to human health and safety, and if so, why? | Perplexity
Take the steps in the “What phage isolation usually involves” section of the last prompt and break down the tools traditionally used for each step. Do NOT hallucinate when answering this question | Perplexity
What is supernatant? | Perplexity
What are plaques in a phage isolation experiment context? | Perplexity
What does it mean to ‘pellet’ bacteria? | Perplexity

Final Project Ideas

  • Submitted 3 Final Project ideas in my node’s section of the slide deck (see screenshot below):
[Figure: Final Project idea submission confirmation]

All supporting prompts for Final Project Ideas Slide listed below

Supporting Prompt | Model
What is the microbiome? It’s the gut right? If I’m oversimplifying with the second question, let me know Do NOT hallucinate when answering these questions | Perplexity
Does a biotechnological equivalent of ‘Build Your Own Phage’ a-la Build a Bear or Lego exist in the real world? If so, does this capability exist in a personalized medicine context, or is it operable in remote environments? Do NOT hallucinate when answering this question | Perplexity
Tell me about the adenita vaga bdelloid rotifer that was able to maintain cryptobiosis for 24,000 years | Google AI Mode
Tell me about the adenita vaga bdelloid rotifer that went into cryptobiosis for 24,000 years | Google AI Mode
Thinking about creating some sort of  real-time personalized medicine biological monitoring capability allowing phages to be identified, created, and disseminated for nascent infections. Create some type of logo image for this concept (the kind you might find on a sticker) that combines the elements of a phage and a time-keeping watch. Let’s not have it be too cartoony or effused with excessive color | Gemini
Nice. Change the writing on the top to PhageWatch as opposed to PhageGuard | Gemini

  1. https://opentrons-art.rcdonovan.com ↩︎

  2. https://www.mdpi.com/2072-666X/15/6/708#Introduction ↩︎

  3. This is for running samples through the workflow not for equipment like the Opentrons OT-One-Hood ↩︎

Week 4 HW: Protein Design Part 1

[Figure: South American Rattlesnakes (Crotalus durissus terrificus) with Crotamine protein]

Part A: Conceptual Questions

  1. How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)
  • I take in approx. 5 × 10^23 amino acid molecules when eating 500 grams of meat. This is based on results indicating approx. 10^21 amino acid molecules per gram of meat, scaled up by 500
  1. Why do humans eat beef but do not become a cow, eat fish but do not become fish?
  • Humans eat beef but don’t become cattle, and eat fish but don’t become fish, because genetic information from the organism being eaten isn’t transferred wholesale. Much of the genetic material in food is broken down during digestion and, more importantly, a human being’s cells follow instructions derived from their own DNA. Human cells use amino acids from the ingested organism, but assemble them according to human genetic instructions. The ingested organism and its amino acids are simply the raw materials the cell uses.
  1. Why are there only 20 natural amino acids?
  • There are several broad reasons why there are only 20 standard natural amino acids. The first is that early in the history of evolution, this group of amino acids became more or less ‘locked in’: once the basic relationship between three-letter codons and these 20 standard amino acids became widely distributed across all kingdoms of life, it became too risky from an evolutionary standpoint to alter this core set. Another reason is that the group of 20 gives enough range in structure and chemistry to build most of what evolution or directed evolution might require. The other reasons amount to various evolutionary trade-offs. Adding more amino acids to the standard set would add potentially unwanted complexity, while shrinking the set would leave too few distinct side chains, which would in turn limit the ability of proteins to do things like fold precisely.
  1. Can you make other non-natural amino acids? Design some new amino acids.
  • Yes you can. My attempts to design some new amino acids using SwissSidechain and the Cryo-EM structure of Receptor Tyrosine Kinase ROS1 PDB file in PyMOL (open-source) are shown below:

[Figure: Attempt at creating a non-natural amino acid residue mutation of Tyrosine Kinase ROS1 using cyclohexanecarboxylic acid]

[Figure: Attempt at creating a non-natural amino acid residue mutation of Tyrosine Kinase ROS1 using cyclopropanecarboxylic acid]

  1. Where did amino acids come from before enzymes that make them, and before life started?
  • In living cells, amino acids are built from metabolic intermediates: these supply a carbon backbone, which is combined with inorganic nitrogen through enzyme-facilitated chemical reactions. Before life as we know it emerged, amino acids originated from abiotic (not from living organisms) chemical reactions. These reactions occurred in Earth’s atmosphere and at hydrothermal and oceanic vents, and amino acids were also delivered by meteorites and comets (i.e., extraterrestrial delivery)
  1. If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?
  • If I made an α-helix using D-amino acids, I’d expect left-handedness. D-amino acids are the mirror images of L-amino acids, and helices built from L-amino acids (the vast majority of alpha helices) are right-handed, so the mirror-image helix would be left-handed.
  1. Can you discover additional helices in proteins?
  • Yes, additional helices can be discovered in proteins. These include non-alpha helices such as 3₁₀-helices and π-helices. They can be discovered via X-ray crystallography, structure databases, or with the aid of AI prediction tools.
  1. Why are most molecular helices right-handed?
  • Most molecular helices are right-handed because L-amino acids force the helix backbone to conform to right-handedness. Selection pressure also favors this handedness for hydrogen bonding and overall functionality
  1. Why do β-sheets tend to aggregate?
  • β-sheets tend to aggregate because the edges of β-sheets have exposed N-H groups (hydrogen-bond donors) and C=O groups (hydrogen-bond acceptors). When sheets come into close proximity, these donor and acceptor groups snap together, almost like a zipper.

    • What is the driving force for β-sheet aggregation?
      • The driving force for β-sheet aggregation is hydrogen bonding between these donor and acceptor groups
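
The order-of-magnitude estimate in the first question above (amino acid molecules in 500 grams of meat) can be reproduced with a short calculation. This is only a minimal sketch: the ~100 Da average amino acid mass comes from the question itself, but the ~20% protein-by-mass figure for meat is an assumption introduced here for illustration, and the function name is hypothetical.

```python
AVOGADRO = 6.022e23            # molecules per mole
AVG_AA_MASS_G_PER_MOL = 100.0  # ~100 Da per amino acid, i.e. ~100 g/mol (from the question)
PROTEIN_FRACTION = 0.20        # ASSUMPTION: meat is roughly 20% protein by mass


def amino_acid_molecules(meat_grams, protein_fraction=PROTEIN_FRACTION):
    """Estimate the number of amino acid molecules in a given mass of meat."""
    protein_grams = meat_grams * protein_fraction
    moles = protein_grams / AVG_AA_MASS_G_PER_MOL  # grams -> moles
    return moles * AVOGADRO                        # moles -> molecules


per_gram = amino_acid_molecules(1)
per_500_grams = amino_acid_molecules(500)
print(f"per gram:      {per_gram:.1e} molecules")
print(f"per 500 grams: {per_500_grams:.1e} molecules")
```

Under these assumptions the result is on the order of 10^21 molecules per gram and 10^23 to 10^24 molecules for 500 grams, which is the same order of magnitude as the rounded figure in the answer. A different protein fraction would shift the number proportionally.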

    Supporting prompts for this section listed below:

Supporting Prompt | Model
What is a Dalton as a unit of measurement? | Perplexity
How many molecules of amino acids are located within a single gram of edible meat, on average? Do NOT hallucinate when answering this question | Perplexity
What is Avogadro’s number and why is it relevant within the context of the answer to the previous prompt? What does mol mean in the context of g/mol? Do NOT hallucinate when answering these questions | Perplexity
10^{21} * 500 equals how much? | Perplexity
From a biological and genetic standpoint why is it that when an organism ingests food, they don’t become the organism they’re ingesting (i.e., if a human being eats salmon, why don’t they become a salmon)? Where do amino acids fit into this explanation? Do NOT hallucinate when answering these questions | Perplexity
Why are there only 20 amino acids found in nature? Do NOT hallucinate when answering this question | Perplexity
To elaborate on a point made in the 5th part of the answer to the previous prompt: what is a side chain in the context of amino acids? Do NOT hallucinate when answering this question | Perplexity
Share me resources for designing non-natural amino acids rather simply/with relatively little friction or extensive technical knowledge | Google AI
How can I quickly design a non-natural amino acids? What internet resources exist to help me do this? | Google AI
Having issues downloading SwissSidechain into PyMol. I’m on a MacOS. Do I need Python downloaded on my local machine for successful downloading of SwissSidechain? Do NOT hallucinate when answering these questions | ChatGPT
It’s telling me it’s installed but that Swiss Sidechain is Not Loaded. I downloaded the SwissSidechain package. Please tell me how to proceed | ChatGPT
I’m using ‘3.1.6.1’, 3.0, 3000000, 1749441553, ‘963e47f43382e009c2cd391f0747a8c20ef108e7’, 0’ of PyMol | ChatGPT
Getting ModuleNotFoundError: No module named ‘swisssidechain’ | ChatGPT
When I go to select the SwissSidechain folder from my Downloads, I only can open the file. I can’t select the entire thing. Is that an issue? | ChatGPT
Yeah when I try to just select the entire folder from PyMol I can’t do. How should I proceed with the install? | ChatGPT
Ok. Now I have only a .zip and am able to select it. Getting the following error: “Plugin “PySwissSidechain2” has been installed but initialization failed. Tell me how to proceed | ChatGPT
It says ‘Unable to find license file’. Not understanding what’s going on here | ChatGPT
No I don’t have a PyMol license | ChatGPT
How do I install Conda on Mac? Is Miniconda downloadable on the web? | ChatGPT
Should I download the graphical or command Miniconda install? | ChatGPT
Yes | ChatGPT
Getting “zsh: command not found: conda” | ChatGPT
For this code, how do I find my username? Users//miniconda3 | ChatGPT
Getting “zsh: no such file or directory: /Users/j_d_r/miniconda3” | ChatGPT
Yes | ChatGPT
Getting “Do you accept the Terms of Service (ToS) for https://repo.anaconda.com/pkgs/main? [(a)ccept/(r)eject/(v)iew]:” What should I write? | ChatGPT
Getting “PackagesNotFoundError: The following packages are not available from current channels: - pymol” | ChatGPT
Ok | ChatGPT
Ok. I’ve opened up the Open-Source version of PyMol and installed SwissSideChain. Now tell me how to mutate some residues from some Protein Data Bank (PDB) structures based on one of the non-natural L- or D-sidechains of the SwissSidechain database. Basically I want to create some graphical representations of non-natural amino acids using SwissSideChain in the Open-Source version of PyMol and I would like you to provide me a step by step process for how to get these visuals (when the visuals are created I also want to be able to know their names and other relevant information) | ChatGPT
What does a PDB structure mean within the context of proteins? | Perplexity
What’s a common protein downloaded from RCSB PDB? | Perplexity
Not totally understanding how SwissSidechain works. Do I just install random .pdb online, open that in PyMol, and then run SwissSidechain against it? | ChatGPT
Ok. At RCSB PDB and am looking to download a PyMol compatible file extension. What file extension should I go for? | ChatGPT
What is the name of a .pdb file? PDBx/mmCIF? | ChatGPT
Whenever I click ‘PySwissSidechain’ from Plugin –> Legacy Plugins my Open-Source PyMol window fails and I see this in Terminal: “libc++abi: terminating due to uncaught exception of type NSException”. What’s going on? How do I proceed with running the SwissSidechain extension? | ChatGPT
Yes do this | ChatGPT
Seeing this command on SwissSidechain: “Command line Use the command: Mutate Object//Chain/ResNumber/, Newres For instance, to mutate residue number ‘82’ on chain ‘E’ in object ‘complex’ into Homoleucine (HLEU), write: Mutate complex//E/82/, HLEU” What does each piece of this command mean in biological speak? How do I find residues and non-natural amino acids to choose from? | ChatGPT
In SwissSideChain, what is the short code for the non-natural amino acid ‘Cha’ and ‘Ca’? | Perplexity
Amino acids are produced by enzymes, correct? DO NOT hallucinate when answering this question | Perplexity
So based on the answer to the last prompt, where exactly do amino acids come from? Where exactly do they originate? DO NOT hallucinate when answering this question | Perplexity
When we say ’enzyme‑catalyzed pathway’ we mean a chemical reaction that an enzyme speeds up, correct? Do NOT hallucinate when answering this question | Perplexity
What is an enzyme pathway in the context converting a precursor into an amino acid? Is it simply a chemical reaction? Something else? Do NOT hallucinate when answering this question | Perplexity
In the answer 3 prompts ago, there was mention of “Carbon Skeleton” and “Intermediates of glycolysis, the citric acid (TCA) cycle, and the pentose phosphate pathway” when describing them. What does this mean in simple terms? Are we saying these provide the chemical structure of an amino acid to keep it fundamentally sound? Do NOT hallucinate when answering these questions | Perplexity
Ok. Based on available literature and if relevant/necessary, the information shared in response to the previous prompts, how did amino acids originate before the emergence of life as we know it on this planet? Was it via chemical process in cyanobacteria or another form of bacteria or archaea? Do NOT hallucinate when answering these questions | Perplexity
An a-helix within the context of genomics and synthetic biology is an alpha helix correct? How do D-amino acids relate to alpha helices? Do NOT hallucinate when answering these questions | Perplexity
Not understanding. In the answer to the last prompt you stated, " In genomics and synthetic biology, “α-helix” or “a-helix” refers to the standard protein alpha helix: a right‑handed secondary structure formed by regular backbone hydrogen bonding, almost always built from L‑amino acids in natural proteins.” Then you said, “A chain made entirely of L‑amino acids naturally forms a right‑handed α‑helix” Did hallucination occur here? If so, where and how? | Perplexity
If this is the case, how can an  α-helix using D-amino acids exhibit right-handedness? Not seeing how this would work? Do NOT hallucinate when answering this question | Perplexity
Explain the following: –The difference between how protein is described in a nutritional context and what proteins are/how they are described in a biological or biotechnological context –My hunch is additional helices in naturally-occurring proteins can be found, although I’m not sure how. Confirm or deny this hunch. If my hunch is correct, tell me why additional helices for proteins found in nature can be found and some methods for discovering these additional helices. Do NOT hallucinate when explaining these items. If you there’s risk of exaggeration or outputting something not confirmed or based on sources, don’t output it | Perplexity
What are molecular helices in a biological context? How do they differ from other types of helices (if at all)? Do NOT hallucinate when answering these questions | Perplexity
What’s the handedness of most molecular helices? What explains their handedness? Do NOT hallucinate when answering these questions | Perplexity
What does this mean in the answer to the last prompt? “L-amino acids dictate right-handed protein helices (opposite would clash sterically).” What are L-amino acids again? Does the L stand for lysate? Explain all this to me as if I was a reasonably educated 14 year-old. Do NOT hallucinate when answering these questionss | Perplexity
What are β-sheets in a biological context? Do we call them beta-sheets? What does the β stand for? What exactly are they? Do NOT hallucinate when answering these questions | Perplexity
What causes beta-sheets to come together as a larger aggregate? Is it molecular forces from difference ionic charges on atoms comprising the beta-sheets? Something else? Do NOT hallucinate when answering these questions | Perplexity
From the answer to the previous prompt, “H-bond donors (N-H) and acceptors (C=O)”" means hydrogen bond donors and carbon acceptors, correct? Do NOT hallucinate when answering this question. What does the C stand for? Do NOT hallucinate when answering these questions | Perplexity

Part B: Protein Analysis and Visualization

  • I selected the crotamine protein found in the South American rattlesnake. The protein’s ability to penetrate cells as a toxin allows it to serve as a template for antivenom and potentially targeted therapies.

  • The amino acid sequence of the crotamine protein is below 1

    • AAF34911.1 crotamine [Crotalus durissus terrificus] MKILYLLFAFLFLAFLSEPGNAYKQCHKKGGHCFPKEKICLPPSSDFGKMDCRWRWKCCKKGSGK

    • This protein is 65 amino acids long. Its most frequent amino acid is lysine (K), which appears 11 times
    • There are 250 protein sequence homologs for the crotamine protein after running it in UniProt BLAST (ID: Q9PWF3)
    • Yes, this protein belongs to the crotamine-myotoxin family.
  • Structural answers to this question are listed below:

    • The protein structure was solved in 2005 2. It appears to be a mixed bag from a structural quality standpoint based on the NMR Structure Validation Report below: [Figure: Crotamine protein RCSB NMR Structure Validation Report]
    • No, there are no other molecules in the solved structure apart from the protein
    • It’s a defensin-like protein of the myotoxin family
  • 3D visualization software (NGL Outputs)

    • “Cartoon”, “Ribbon”, and “Ball and Stick” combo [Figure: CrotamineNGLView2(All3)]
    • Protein coloring by secondary structure. I think it might have more sheets than helices, although I’m not sure [Figure: CrotamineNGLView3(SecondaryStructure)]
    • Protein by residue type. It seems like it has a good mix of hydrophobic vs hydrophilic residues [Figure: CrotamineNGLView4(Hydrophilic)]
    • Protein surface visualization. It doesn’t seem like it has a lot of holes, although I’m not sure [Figure: CrotamineNGLView5(Holes)]
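
The sequence statistics reported earlier in this part (length and most frequent amino acid) can be double-checked with a few lines of Python. This is a minimal sketch using only the standard library; the sequence is the AAF34911.1 entry quoted above, and the `composition` helper name is hypothetical.

```python
from collections import Counter

# Crotamine precursor sequence as quoted above (GenBank AAF34911.1)
CROTAMINE_PRECURSOR = (
    "MKILYLLFAFLFLAFLSEPGNAYKQCHKKGGHCFPKEKICLPPSSDFGKMDCRWRWKCCKKGSGK"
)


def composition(sequence):
    """Return the sequence length and a Counter of one-letter residue codes."""
    return len(sequence), Counter(sequence)


length, counts = composition(CROTAMINE_PRECURSOR)
most_common_residue, most_common_count = counts.most_common(1)[0]

print(f"length: {length}")
print(f"most frequent residue: {most_common_residue} ({most_common_count} times)")
print("full composition:", dict(counts.most_common()))
```

Running this on the sequence above gives a length of 65 with K (lysine) as the most frequent residue at 11 occurrences, which matches the numbers reported in the answer.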

All supporting prompts listed below:

Supporting Prompt | Model
I have a FASTA file and a GenBank identifier for a protein I found on NCBI.gov. How do I then find the amino acid sequence for that given protein? What will it look like? Do NOT hallucinate when answering this question | Perplexity
Confirming that the beta-keratin 2 [Gekko gecko] protein has a 3D structure, correct? | Perplexity
Does the beta-keratin 2 [Gekko gecko] protein have a 3D structure? | Google AI
Understood. Are there any slug mucus proteins that have a 3d structure? | Google AI
Gotcha. Tell me if there are any serpent-derived proteins that have a 3D structure | Google AI
What is the K amino acid? Is it potassium? | Perplexity
What are main protein sequence homologs in the context of genomics and biotechnology? What do they look like? | Perplexity
Show me the format of protein sequence homologs in UniProt database. Show me what they usually look like. Do NOT hallucinate when producing this output | Google AI
Find me resolution info depicting the quality of the structure described in this tab | Gemini
So I’ve heard a good quality structure has higher resolution. How can I obtain information on the quality o the structure from this tab? Is it possible? | Gemini
Based on the scores in the 2nd page of this tab, what is the quality of the structure? Where exactly does the black dot reside here? | Gemini
Based on this tab, tell me if there are any other molecules in the structure of the Crotamine protein beyond the protein itself? If there are, what are they? Do NOT hallucinate when answering this question | Gemini
Where can I find the structure classification family for this protein in this tab? Show me where I can find this | Gemini
Tell me what I’m looking at and how it helps determine the structure classification family of the crotamine protein | Gemini
Tell me if and/or how I can view this protein as a “cartoon” “ribbon” and “ball and stick” using this NGL viewer tool | Gemini
See the plus icon but can’t click on it | Gemini
I want to color the protein by secondary structure to determine whether or not it has more helices or shets. How can I do this in this viewer? | Gemini
Can you see the successful secondary structure output here? If not, say so | Gemini
I now want to color the protein by residue type to determine whether or not it has more helices or shets. How can I do this in this viewer? | Gemini
I no longer want to see secondary structures or helices or sheets. I want to see the protein’s residue types | Gemini
Right now my chain-related Color Scheme options are ChainID, ChainIndex, and ChainName. Which one should I choose? | Gemini
Yes it seems to be recognizing the entire structure as Chain A. Does this mean that most of the residue type is of 1 type? And does this mean that the residues are mostly hydrophobic or hydrophilic? | Gemini
Ok – now how would I visualize the surface of the protein. How can I see if it has holes or binding pockets? | Gemini

Part C: Using ML-Based Protein Design Tools

C1. Protein Language Modeling

  1. Deep Mutational Scans

    • See picture below: [Figure: Crotamine Protein Deep Mutational Scan]

    • There are some really interesting columns that appear as one moves rightward past position 20 in the heatmap. Each of these columns has a single, almost uniform top-to-bottom color, which according to Gemini indicates that the position is especially sensitive to mutation. These positions apparently correspond to the cysteines that form disulfide bridges, which are quite important for holding the protein together (these are referred to as ‘anchors’)

  2. Latent Space Analysis

    • See picture below [Figure: Non-Crotamine tagged point cloud showing embedded proteins]

    • Yes they do–there are approximations of similar proteins found throughout the cloud. [Figure: Crotamine tagged point cloud showing embedded proteins and Crotamine’s location among embeddings]

    • There are other proteins at the edge of the cloud that belong to either other viruses, or other organisms that might be found in similar locations/habitats to the South American rattlesnake (mice, chickens, humans), or other types of toxins or species carrying toxins. See picture below: [Figure: Crotamine tagged point cloud showing embedded proteins and Crotamine’s location relative to its neighborhood]

C2. Protein Folding

  • Approach 1: The predicted coordinates don’t appear to match the original structure in the PDB much at all (see pictures below). I think this might be because the original structure in the PDB for this protein was apparently a bit of a mixed bag according to its NMR Structure Validation Report results (see previous section). [Figure: Crotamine Protein PDB < > ESM Side-by-Side Structural Comparison (1)]

    [Figure: Crotamine Protein PDB < > ESM Side-by-Side Structural Comparison (2)]

    Approach 2: I asked Gemini about this discrepancy and it recommended inputting the mature sequence containing the last 42 residues. At first glance the result still shows some discrepancies compared with the original PDB visual, but far fewer than the outputs under Approach 1 (see photo below) [Figure: Crotamine Protein PDB < > ESM Side-by-Side Structural Comparison (3) – Post-Mature Sequence Input]

  • I tried the following mutations:

    • Cysteine Break: Per Gemini’s suggestion, replaced all the C (cysteine) residues with A (alanine). This notably decreased the predicted Local Distance Difference Test (pLDDT) score of the mutated sequence’s output, and the visual also seemed to indicate less structural integrity, implying the protein is less resilient to this mutation (see photo below) [Figures: Crotamine Cysteine Break Mutation Results]

    • Charge Swap: Per Gemini’s suggestion, replaced all the K (lysine) and R (arginine) residues with D (aspartic acid). This decreased the pLDDT score of the mutated sequence’s output a bit, but less than the Cysteine Break did, indicating the protein was comparatively more resilient to this mutation (see photo below) [Figures: Crotamine Charge Swap Mutation Results]
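
The two mutation sets described above can be generated programmatically instead of editing letters by hand. This is a minimal sketch: it assumes the full 65-residue precursor sequence was the input (the notes above don't say whether the full or the mature sequence was mutated), and the `substitute` helper is a hypothetical name, not part of the course notebook. The resulting strings would then be pasted into the notebook's ESMFold sequence field.

```python
# ASSUMPTION: the full precursor sequence was mutated; swap in the mature
# sequence here if that is what was actually folded.
CROTAMINE_PRECURSOR = (
    "MKILYLLFAFLFLAFLSEPGNAYKQCHKKGGHCFPKEKICLPPSSDFGKMDCRWRWKCCKKGSGK"
)


def substitute(sequence, targets, replacement):
    """Replace every residue listed in `targets` with `replacement` (one-letter codes)."""
    table = str.maketrans({residue: replacement for residue in targets})
    return sequence.translate(table)


# Cysteine Break: every C (cysteine) becomes A (alanine)
cysteine_break = substitute(CROTAMINE_PRECURSOR, "C", "A")

# Charge Swap: every K (lysine) and R (arginine) becomes D (aspartic acid)
charge_swap = substitute(CROTAMINE_PRECURSOR, "KR", "D")

print("original:      ", CROTAMINE_PRECURSOR)
print("cysteine break:", cysteine_break)
print("charge swap:   ", charge_swap)
```

Because both substitutions are one-for-one, the mutated sequences keep the original length, so position numbers in the heatmaps and pLDDT plots stay directly comparable.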

C3. Protein Generation

  • Based on the heatmap below, there seems to be a difference between the predicted sequence probabilities here and the original heatmap generated in the ‘Deep Mutational Scans’ subsection. 48 of the 65 positions were changed and the sequence recovery rate was around 0.26, which didn’t seem all that promising [Figures: Crotamine ProteinMPNN Results (1)]
  • Visually, the ESMFold structure predicted for this ProteinMPNN-designed sequence also looks structurally different than the original protein structure (see photo below) [Figure: Crotamine ProteinMPNN Results (2)]
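
The sequence recovery figure quoted above can be sanity-checked by hand. This is a minimal sketch that assumes recovery is defined as the fraction of positions where the designed residue matches the original one (a common definition, but the notebook's exact calculation is not shown here); the function name and the toy sequences are hypothetical.

```python
def sequence_recovery(original, designed):
    """Fraction of aligned positions where the designed residue equals the original."""
    if len(original) != len(designed):
        raise ValueError("sequences must be the same length to compare position by position")
    matches = sum(a == b for a, b in zip(original, designed))
    return matches / len(original)


# Toy example with made-up 4-residue sequences: 3 of 4 positions match
toy_recovery = sequence_recovery("ACDE", "ACDF")
print(f"toy recovery: {toy_recovery:.2f}")

# Figures reported above: 48 of 65 positions changed
changed, total = 48, 65
recovery_from_counts = (total - changed) / total
print(f"recovery from reported counts: {recovery_from_counts:.2f}")
```

With 48 of 65 positions changed, 17 positions are unchanged, and 17/65 rounds to 0.26, so the two reported numbers are consistent with each other under this definition.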

All supporting prompts for this section listed below:

Supporting Prompt | Model
Want to choose a GPU to run this. Believe the ability to select GPUs should be in the bottom right but am not seeing this. Direct me to where I should go on this page for GPU selection | Gemini
Looking over the heatmap under the ‘Mutation Scans’ section of the code, what stands out regarding the inputted protein sequence (MKILYLLFAFLFLAFLSEPGNAYKQCHKKGGHCFPKEKICLPPSSDFGKMDCRWRWKCCKKGSGK)? Are there any particular mutations or patterns that stand out and why? Do NOT hallucinate when answering this question | Gemini
I want to make sure I perform the Latent Space Analysis located under ‘Latent Space Analysis’ correctly, as in I want to make sure I perform Latent Space Analysis on the specific protein sequence I inputted under the previous section ‘Mutation Scans’. Is there anything I need to do or input? Does the code in ‘Latent Space Analysis’ just take the specific protein sequence I inputted under the previous section ‘Mutation Scans’ and run with it? Do NOT hallucinate when answering this question | Gemini
I want to make sure I perform the Latent Space Analysis located under ‘Latent Space Analysis’ correctly, as in I want to make sure I perform Latent Space Analysis on the specific protein sequence I inputted under the previous section ‘Mutation Scans’. It looks like I just inputted some incorrect code in the cell under ‘Latent Space Analysis’ that starts with ‘Latent Space Analysis’ Fix the errors in the code and make sure to input the appropriate information so the protein sequence from the previous ‘Mutation Scans’ cell can be analyzed here. Do NOT hallucinate when addressing this query | Gemini
Ok, so it looks like there’s output in the ‘3D T-SNE visualization of Protein Sequence Embeddings’ subsection of the ‘Latent Space Analysis’ section. Help me understand where my initially inputted Crotamine protein sequence (MKILYLLFAFLFLAFLSEPGNAYKQCHKKGGHCFPKEKICLPPSSDFGKMDCRWRWKCCKKGSGK) is located within the 3D plot. Do NOT hallucinate when addressing this query | Gemini
Ok, so it looks like there’s output in the ‘3D T-SNE visualization of Protein Sequence Embeddings’ subsection of the ‘Latent Space Analysis’ section. Help me understand where my initially inputted Crotamine protein sequence (MKILYLLFAFLFLAFLSEPGNAYKQCHKKGGHCFPKEKICLPPSSDFGKMDCRWRWKCCKKGSGK) is located within the 3D plot. It just seems like there are a lot of protein sequences in this plot and while I can drag over and find things manually, I’m not sure where to start or how to efficiently find the area in this plot where the inputted sequence is located. Any assistance would be useful. Do NOT hallucinate when addressing this query | Gemini
Yes let’s do that. Let’s give it a color that stands out. | Gemini
Now that the protein visualization has been generated in the ‘Run ESMFold’ section, I’m instructed to do the following: “Try changing the sequence, first try some mutations, then large segments. Is your protein structure resilient to mutations?” There’s a couple things here. The first is that the outputted protein structure in ESMFold doesn’t really seem to match up all that well with the original protein structure in the Protein Data Bank (PDB). This might be because the original protein structure in the PDB was apparently a bit of a mixed bag from a quality standpoint. Not sure how to proceed, and that hesitation is twofold: 1) Is there anything I can do regarding the discrepancy between the outputted protein structure in ESMFold and the original protein structure in the PDB? If this is most likely out of my control/if there’s nothing that can be changed from an input or coding perspective, say so 2) To accomplish the instructions listed above in the quotation marks, should I just go to the subsection that says ‘sequence’ in the first cell under the ‘Run ESMFold’ section and just start randomly changing letters? Does that make sense? Regardless of whether or not this is the best approach for fulfilling the instructions, how would I know if the outputted structure(s) are resilient (or not) to mutations? Would it just be discerned from degree of change in outputted structures (i.e., if the structures change a lot, it likely isn’t resilient and vice versa)? Do NOT hallucinate when addressing these questions | Gemini
What does the pLDDT acryonym stand for? | Gemini
My instructions for the ‘Inverse Folding with ProteinMPNN’ section are to ‘Analyze the predicted sequence probabilities and compare the predicted sequence vs the original one.’ I assume there’s some customization to be done here, either through coding or through inputting sequence information. What would be the most likely sensible plan to proceed? Success seems to be that the compared protein sequence and the original one look visually rather similar. Any next steps on how to begin working this workflow would be appreciated. Do NOT hallucinate when addressing this query | Gemini
My instructions for the ‘Inverse Folding with ProteinMPNN’ section are to ‘Analyze the predicted sequence probabilities and compare the predicted sequence vs the original one.’ From there I need to put results back into ESMFold to see if the visuals match up. I assume there’s some customization to be done here, either through coding or through inputting sequence information. What would be the most likely sensible plan to proceed? Any next steps on how to begin working this workflow would be appreciated. Do NOT hallucinate when addressing this query | Gemini
My instructions for the ‘Inverse Folding with ProteinMPNN’ section are to ‘Analyze the predicted sequence probabilities and compare the predicted sequence vs the original one.’ From there I need to put results back into ESMFold to see if the visuals match up. I assume there’s some customization to be done here, either through coding or through inputting sequence information. What would be the most likely sensible plan to proceed? Any next steps on how to begin working this workflow would be appreciated. Do NOT hallucinate when addressing this query | Gemini
Do these steps outputted from the answer to the previous prompt take into account earlier sequence information inputted into previous cells outside the ‘Inverse Folding with ProteinMPNN’ section, specifically information on the protein sequence in question we want to visually compare across the ‘Inverse Folding with Protein MPNN’ and ‘Run ESMFold’ sections (YKQCHKKGGHCFPKEKICLPPSSDFGKMDCRWRWKCCKKGSGK)? If not, how can this be addressed? Do NOT hallucinate when addressing these questions | Gemini
For the last step, just executed, I want a more 1-to-1 comparison between the results in the diagram under ‘Visualize Amino Acid Probabilities’ an the results located in the cell under ‘Mutation Scans’ earlier in the notebook (https://colab.research.google.com/drive/10EnA1imLYVVtWQYR-CsQIWU7tA-LliMj#scrollTo=09FwbZ6v1AUs&line=2&uniqifier=1) Can we make the X and Y axis of the heatmap just executed match this original heatmap with ‘Position in Protein Sequence’ with numbers left to right in the X axis and ‘Amino Acid Mutations’ in the Y axis? Also want ‘Model Scores’ scored from 2 (more yellow-ish) to -6 (more purple-ish) a-la this original heatmap If there are parts of this instruction you don’t understand or that are not possible to execute, say so. After this is resolved, we’ll return to the original workflow from the answer to the previous prompt | Gemini
Address the error found when running the ‘Run ESMFold’ cell | Gemini
Everything under ‘Configure ProteinMPNN with Crotamine Structure’ doesn’t appear to be working. What’s going on? | Gemini

Part D: Group Brainstorm on Bacteriophage Engineering

  • My William & Mary Node Bacteriophage Engineering Group Brainstorm Google Doc. can be found here 3.

Week 5 HW: Protein Design Part 2

[Image: Gemini_Generated_Image_oq0v2doq0v2doq0v] Using AlphaFold for Protein Optimization

Part A: SOD1 Binder Peptide Design

Part 1: Generate Binders with PepMLM

  1. Retrieved the human SOD1 sequence from UniProt (see photo below), then introduced the A4V mutation via a Gemini prompt (see sequence below). [Image: PartA_Part1_1] Human SOD1 sequence (A4V mutation not added)

Human SOD1 sequence (A4V mutation added)

MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
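The substitution above can also be reproduced and checked in code. The following is a minimal sketch, not the method used in the homework (that was a Gemini prompt): `apply_point_mutation` and `numbering_offset` are hypothetical names introduced here. It assumes the usual SOD1 convention in which mutation names are numbered without the initiator methionine, so A4 is the fifth letter of the Met-inclusive sequence quoted in this write-up.

```python
import re

# Wild-type human SOD1 as quoted in the supporting prompts (includes the initiator Met).
SOD1_WT = (
    "MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSR"
    "KHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGN"
    "AGSRLACGVIGIAQ"
)


def apply_point_mutation(seq, mutation, numbering_offset=0):
    """Apply a substitution written like 'A4V' and verify the reference residue.

    numbering_offset shifts the stated position to match the sequence;
    use 1 when the mutation name ignores a leading Met that the sequence contains.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"Unrecognized mutation format: {mutation}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    index = pos - 1 + numbering_offset  # convert to a 0-based string index
    if seq[index] != ref:
        raise ValueError(
            f"Expected {ref} at position {pos} but found {seq[index]}; "
            "check the numbering convention"
        )
    return seq[:index] + alt + seq[index + 1:]


# Assumption: A4V is numbered without Met1, hence numbering_offset=1.
SOD1_A4V = apply_point_mutation(SOD1_WT, "A4V", numbering_offset=1)
print(SOD1_A4V[:10])  # MATKVVCVLK
```

The reference-residue check is the useful part: with `numbering_offset=0` the function finds a K at position 4 and raises, which flags an off-by-one numbering mistake before it propagates into later tools.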

  2. See results in later questions (answers and photos below)

  3. See photos below [Image: PartA_Part1_3(1)]

    [Image: PartA_Part1_3(2)] Generated four peptides of length 12 amino acids conditioned on the mutant SOD1 sequence

  4. See photo below [Image: PartA_Part1_4] Added known SOD1-binding peptide FLYRWLPSRRGG for comparison

  5. Perplexity scores listed below

Binder | Perplexity
WRYGAAALAHKE | 8.976454
WRYYAAAVELGE | 12.659931
WRYGPAVLALGK | 9.556429
WLYYAVALALGE | 15.294134
FLYRWLPSRRGG | 20.635226
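To make the table easier to read, here is a small sketch that ranks the binders by perplexity. It relies only on the general definition of perplexity as the exponential of the average negative log-likelihood per token, so a lower value means the model found the peptide more likely given the target. How PepMLM computes the score internally is not shown here, and the variable names are illustrative.

```python
import math

# Perplexity values copied from the table above.
perplexity = {
    "WRYGAAALAHKE": 8.976454,
    "WRYYAAAVELGE": 12.659931,
    "WRYGPAVLALGK": 9.556429,
    "WLYYAVALALGE": 15.294134,
    "FLYRWLPSRRGG": 20.635226,  # known SOD1-binding peptide added for comparison
}

# Sort from lowest (most model-confident) to highest perplexity.
ranked = sorted(perplexity.items(), key=lambda item: item[1])

for rank, (peptide, ppl) in enumerate(ranked, start=1):
    # log(perplexity) recovers the average negative log-likelihood per residue.
    mean_nll = math.log(ppl)
    print(f"{rank}. {peptide}  perplexity={ppl:.2f}  mean NLL={mean_nll:.2f}")
```

By this ordering the four generated peptides all score lower than the known binder, which says the model finds its own outputs more likely, not that they bind better.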

NOTE: PepMLM Colab used to generate results above can be found here 1

Supporting prompts for this section listed below:

Supporting Prompt | Model
Why is protein design and models like AlphaFold important in the context of drug discovery and improvements in human health? If I were to describe its importance to a reasonably educated person on the street who doesn’t know much about the subject what would I say? Would something like “Since most diseases are caused by protein-related issues, and because proteins comprise an essential role in human health and physiology, knowing how proteins function and fold can help us design therapeutics with greater precision and efficacy”? What am I missing there and where am I off? Do NOT hallucinate when answering this question | Perplexity
In the context of proteins and/or the Superoxide dismutase (SOD1) protein found in Homo sapiens, what is the A4V mutation? What does it entail? What does A4V stand for? Do NOT hallucinate when answering this prompt | Perplexity
I have the following Superoxide dismutase (SOD1) protein sequence found in Homo sapiens: MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ. I want to introduce the A4V mutation in this sequence so I can eventually generate relevant peptides to bind to the mutated sequence. How does the sequence need to change to accurately represent this mutation? Do NOT hallucinate when addressing this prompt | Gemini
Based on the contents in the ‘Inputs and Parameters’ cell, if I want to generate 4 peptides that each have a 12 amino acid length, other than the ‘Peptide Length’ variable, the other variable I should alter is the ‘Number of Binders’ variable, correct? Or is it the ‘Top K Value’ variable? Not sure which variable I need to alter. Do NOT change any cell content as part of addressing this prompt and do NOT hallucinate | Gemini
Now that I’ve generated 4 peptides, each of which are 12 amino acids long, I now want to add the known SOD1-binding peptide FLYRWLPSRRGG for comparison. Without changing any code in any of the cells in this workbook, how can I go about doing this? Do NOT hallucinate when addressing this query | Gemini
Question regarding the third bullet point under 3. for Method 1. How would the benchmark (FLYRWLPSRRGG)’s properties be known for comparison by performing the actions listed under Method 1 in response to the last prompt? Do NOT hallucinate when answering this question | Gemini
Let’s create a new cell under the ‘Download Results’ tab where the following Superoxide dismutase (SOD1)-binding peptide FLYRWLPSRRGG will be analyzed in the exact same way the peptides generated from the ‘Inputs and Parameters’ cell were analyzed in the ‘Generate Peptides’ cell. Do not alter any of the underlying fundamental logic from code in prior cells. Just extend it so the FLYRWLPSRRGG can be analyzed with a Perplexity score in the same way the results from the ‘Inputs and Parameters’ cell were analyzed in the ‘Generate Peptides’ cell. Do NOT hallucinate when performing this task | Gemini

Part 2: Evaluate Binders with AlphaFold3

  1. Navigated to AlphaFold Server (see below):
  2. See peptide results (ipTM scores and binding information) below in 3.
  3. See ipTM and binding information results below:
  • WRYGAAALAHKE Peptide:

    • ipTM: 0.31. The peptide appears to bind near the dimer interface and looks surface-bound, although the low ipTM means this placement is a low-confidence prediction. See photo below: [Image: WRYGAAALAHKE_AlphaFold] WRYGAAALAHKE peptide AlphaFold Visualization Results
  • WRYYAAAVELGE Peptide:

    • ipTM: 0.24. Again the peptide appears to bind near the dimer interface and looks surface-bound, although the low ipTM again means this placement is a low-confidence prediction. See photo below: [Image: WRYYAAAVELGE_AlphaFold] WRYYAAAVELGE peptide AlphaFold Visualization Results
  • WRYGPAVLALGK Peptide:

    • ipTM: 0.33. Again the peptide appears to bind near the dimer interface and looks surface-bound, although once again the confidence of this assessment is not high based on the ipTM. See photo below: [Image: WRYGPAVLALGK_AlphaFold] WRYGPAVLALGK peptide AlphaFold Visualization Results
  • WLYYAVALALGE Peptide:

    • ipTM: 0.40. Again the peptide appears to bind near the dimer interface and looks surface-bound, although once again the confidence of this assessment is not high based on the ipTM. See photo below: [Image: WLYYAVALALGE_AlphaFold] WLYYAVALALGE peptide AlphaFold Visualization Results
  • FLYRWLPSRRGG Peptide:

    • ipTM: 0.29. The peptide appears to engage the β-barrel region somewhat and looks surface-bound. Again, the confidence of this assessment is not high based on the ipTM. See photo below: [Image: FLYRWLPSRRGG_AlphaFold] FLYRWLPSRRGG peptide AlphaFold Visualization Results
  4. All of the ipTM values were low (0.24 to 0.40), meaning AlphaFold expressed notable uncertainty regarding peptide placement. It’s interesting to note that three of the four PepMLM-generated peptides (all but WRYYAAAVELGE at 0.24) exceeded the 0.29 ipTM of the known binder FLYRWLPSRRGG. I’m not sure what that says about the techniques used to establish the relationship between FLYRWLPSRRGG and SOD1, although it may hint at PepMLM’s power
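The comparison in the last point can be tabulated with a few lines of code. This is a sketch under stated assumptions: the cutoffs of 0.8 (confident) and 0.6 (likely failed) are commonly quoted rules of thumb for ipTM and are treated here as an assumption, not as values taken from this homework, and `classify_iptm` is a hypothetical helper name.

```python
# ipTM values recorded above for each peptide modeled with A4V SOD1.
iptm = {
    "WRYGAAALAHKE": 0.31,
    "WRYYAAAVELGE": 0.24,
    "WRYGPAVLALGK": 0.33,
    "WLYYAVALALGE": 0.40,
    "FLYRWLPSRRGG": 0.29,  # known binder used as the reference
}
REFERENCE = "FLYRWLPSRRGG"


def classify_iptm(score, confident=0.8, failed=0.6):
    """Bucket an ipTM score using assumed rule-of-thumb cutoffs."""
    if score >= confident:
        return "confident"
    if score < failed:
        return "low confidence"
    return "ambiguous"


# Generated peptides whose ipTM is higher than the known binder's.
above_reference = [
    peptide
    for peptide, score in iptm.items()
    if peptide != REFERENCE and score > iptm[REFERENCE]
]

for peptide, score in sorted(iptm.items(), key=lambda item: item[1], reverse=True):
    print(f"{peptide}  ipTM={score:.2f}  {classify_iptm(score)}")
print("Above reference:", above_reference)
```

Under these assumed cutoffs every complex lands in the low-confidence bucket, so the ordering among the peptides should be read cautiously.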

Supporting prompts for this section listed below:

Supporting Prompt | Model
If I want to model a protein-peptide complex using this service, how should I proceed? I understand I’ll need to input a protein sequence, but not sure how to input a relevant peptide? What entity type would a peptide fall under? Do NOT hallucinate when outputting this result | Gemini
I already have a protein sequence that should be formatted appropriately. I do have peptides, and it would be great to see if there is any modification that needs to be made in their formats to make sure they’re being inputted according to the correct FASTA format. Here’s the first peptide sequence: WRYGAAALAHKE. If any modification need to be made in their format to make sure they’re being inputted according to the correct FASTA format, tell me what changes need to be made, why, and then make the changes. Otherwise, don’t change anything if everything already checks out. Do NOT hallucinate when addressing this query | Gemini
What does piDDT mean on this page? What do ipTM and pTM mean? | Gemini
Need to understand where the WRYGAAALAHKE peptide binds to the A4V mutated SOD 1 homo sapiens protein sequence (MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ). Not understanding how to interpret the 3D visual I’m seeing on this page. What is the N-terminus and where is it located? What is the β-barrel region or the dimer interface and where are they located here? When we talk about peptide binding sites and we say they are either ‘surface-bound’ or ‘partially-buried’, which of these make sense for this peptide and how can we visually discern this from the 3D graphic? Do NOT hallucinate when replying to this prompt | Gemini
How do I find Residue 1 in the 3d graphic? | Gemini
Ok – so is the WRYGAAALAHKE peptide displayed in orange and yellow in the 3D graphic? Do NOT hallucinate when answering this question | Gemini
So based on what I’m seeing here, it looks like the WRYGAAALAHKE peptide might (very tentatively) bind near/around the N-terminus, and appears to be surface-bound correct? If this is wrong, correct this tentative peptide location and binding type information and explain why. Do NOT hallucinate when addressing this prompt | Gemini
Ok. So does the peptide engage the β-barrel region or approach the dimer interface? Where exactly does the protein appear to bind, generally speaking? Do NOT hallucinate when answering this question | Gemini
So based on what I’m seeing here, it looks like the WRYYAAAVELGE peptide might (very tentatively) bind near/around the N-terminus, and appears to be surface-bound correct? If this is wrong, correct this tentative peptide location and binding type information and explain why. If it approaches the dimer interface, explain why. Do NOT hallucinate when addressing this prompt | Gemini
How do I read what I’m seeing in the 3D graphic? Understand the β-barrel region can be visually eyeballed because it looks like an actual barrel. Other areas like the N-terminus or the dimer interface are harder to visually discern. Essentially I’m asking how to read this visual map of the protein < > peptide interaction located in the graphic. Do NOT hallucinate when replying to this prompt | Gemini
Yeah when I hover over residues in Chrome I just get a cursor. Nothing is highlighting. How should I proceed with reading the structural “landmarks” of the SOD1 protein. | Gemini
The WRYGPAVLALGK peptide appears surface-bound and NOT partially buried, correct? This makes sense because it doesn’t interact with the β-barrel region much at all, right? Do NOT hallucinate when answering this prompt | Gemini
The WRYGPAVLALGK peptide appears surface-bound and NOT partially buried, correct? This makes sense because it doesn’t interact with the β-barrel region much at all, right? Do NOT hallucinate when answering this prompt | Gemini
Believe the level of confidence indicated by the 0.4 ipTM is still not quite high, correct? Would it be considered failing? What is the threshold for failing here? Do NOT hallucinate when answering this question | Gemini
The WLYYAVALALGE peptide appears surface-bound and NOT partially buried, correct? Believe so. Do NOT hallucinate when answering this question | Gemini
Can you explain what it means that most of the 3D graphic is colored dark blue? What is this color indicating exactly? Do NOT hallucinate when answering this question | Gemini
Wondering whether or not it would be fair to say that the FLYRWLPSRRGG binds near the dimer interface and appears surface bound and NOT partially buried. Do NOT hallucinate when addressing this prompt | Gemini
Would we say that the peptide engages the β-barrel region or approaches the N-terminus? Believe it doesn’t approach the N-terminus from my high-level understanding. Do NOT hallucinate when addressing this prompt | Gemini

Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse

  1. See pasted peptide sequences, A4V mutant SOD1 sequences in target fields, and checked boxes results below
  2. See pasted peptide sequences, A4V mutant SOD1 sequences in target fields, and checked boxes results below
  3. See results below:

WRYGAAALAHKE Peptide:

  • This peptide has weak predicted binding affinity, is soluble and non-hemolytic, and has a slightly positive net charge and a molecular weight of 1372.5 Da. See results below: [Image: WRYGAAALAHKE_PeptiVerse1] [Image: WRYGAAALAHKE_PeptiVerse2] WRYGAAALAHKE Peptide PeptiVerse Results

WRYYAAAVELGE Peptide:

  • This peptide has weak predicted binding affinity, is soluble and non-hemolytic, and has a slightly negative net charge and a molecular weight of 1427.6 Da. See results below: [Image: WRYYAAAVELGE_PeptiVerse1] [Image: WRYYAAAVELGE_PeptiVerse2] WRYYAAAVELGE Peptide PeptiVerse Results

WRYGPAVLALGK Peptide:

  • This peptide has weak predicted binding affinity, is soluble and non-hemolytic, and has a positive net charge and a molecular weight of 1330.6 Da. See results below: [Image: WRYGPAVLALGK_PeptiVerse1] [Image: WRYGPAVLALGK_PeptiVerse2] WRYGPAVLALGK Peptide PeptiVerse Results

WLYYAVALALGE Peptide:

  • This peptide has weak predicted binding affinity, is soluble but hemolytic, and has a negative net charge and a molecular weight of 1368.6 Da. See results below: [Image: WLYYAVALALGE_PeptiVerse1] [Image: WLYYAVALALGE_PeptiVerse2] WLYYAVALALGE Peptide PeptiVerse Results

FLYRWLPSRRGG Peptide:

  • This peptide has weak predicted binding affinity, is soluble and non-hemolytic, and has a positive net charge and a molecular weight of 1507.7 Da. See results below: [Image: FLYRWLPSRRGG_PeptiVerse1] [Image: FLYRWLPSRRGG_PeptiVerse2] FLYRWLPSRRGG Peptide PeptiVerse Results

  • There seems to be some relationship between higher ipTM scores and stronger predicted affinity across the PepMLM-generated peptides, but it is not 1-to-1 and is not strong enough to suggest causality. The WRYYAAAVELGE peptide had the lowest ipTM score (0.24), yet it has the 2nd highest predicted affinity of the group (6.07). None of the PepMLM-generated peptides is predicted to bind strongly. The two with the highest ipTM scores, WLYYAVALALGE (0.40) and WRYGPAVLALGK (0.33), are predicted to be soluble and hemolytic, and soluble and non-hemolytic, respectively.

Based on its predicted binding affinity (5.77 pKd/pKi), solubility (1.00), hemolysis (non-hemolytic; 0.036), net charge at pH 7 (1.76), and molecular weight (1330.6 Da), the WRYGPAVLALGK peptide appears to best balance predicted binding and therapeutic properties, so it is the one I would advance over the other PepMLM-generated peptides
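As a sanity check on the molecular weights reported above, they can be recomputed from sequence alone. This is a minimal sketch assuming a linear, unmodified peptide and standard average residue masses. PeptiVerse's own calculation is not documented here and may use a slightly different mass table, so agreement to about 0.1 Da is the most that should be expected. `molecular_weight` is a name introduced for illustration.

```python
# Average residue masses in Da (amino acid mass minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water is added back for the free N- and C-termini


def molecular_weight(peptide):
    """Average molecular weight of a linear, unmodified peptide in Da."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


# Molecular weights reported in the PeptiVerse results above.
reported = {
    "WRYGAAALAHKE": 1372.5,
    "WRYYAAAVELGE": 1427.6,
    "WRYGPAVLALGK": 1330.6,
    "WLYYAVALALGE": 1368.6,
    "FLYRWLPSRRGG": 1507.7,
}

for peptide, expected in reported.items():
    computed = molecular_weight(peptide)
    print(f"{peptide}  computed={computed:.2f} Da  reported={expected} Da")
```

All five computed values fall within 0.1 Da of the reported ones, which suggests the reported weights are plain sequence-derived averages with no modifications included.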

Supporting prompts for this section listed below:

Supporting Prompt | Model
Need to place the following A4V mutant SOD1 sequence “in the target field”: MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ. I’ve already inserted the peptide sequence in the ‘Peptide Sequence(s) / SMILES’ field. Where do I place the A4V mutant SOD1 sequence? Where is the ‘target field’? Do NOT hallucinate when answering this question | Gemini
Need to check a box off for ‘molecular weight’. Which unchecked box would that be and why? Do NOT hallucinate when answering this question | Gemini
If I have a ‘Net Charge (pH 7)’ value of 0.85, what does that mean in plain terms? Is it good or bad from a therapeutic perspective? Likewise, if I have an ‘Isoelectric Point’ value of 8.60, what does that mean in plain terms? Is it good or bad from a therapeutic perspective? And if I have a ‘Hydrophobicity (GRAVY)’ score of -0.56, what does that mean in plain terms? Is it good or bad from a therapeutic perspective? Do NOT hallucinate when answering these questions | Gemini
If I have a ‘Molecular Weight’ value of 1372.5, what does that mean in plain terms? Is it good or bad from a therapeutic perspective? | Gemini
Would it be fair to say that this peptide has a slightly positive ‘Net charge (pH 7)’ score or that it has a positive ‘Net charge (pH 7)’ score? What is the distinction? Do NOT hallucinate when answering this prompt | Gemini
What does ‘ug/m’ mean again? Also it’s definitely fair to say that any results showing a hemolytic peptide indicate the peptide is NOT safe for advancement into further therapeutic trials, correct (given the risk of red blood cell damage)? Do NOT hallucinate when answering this prompt | Gemini
If the WRYGPAVLALGK peptide has the following properties, would we say it has a decent or nice balance of predicted binding and therapeutic properties? See properties below:–Soluble: 1.00–Hemolysis: Non-Hemolytic (0.036) –Binding Affinity: Weak Binding Affinity (5.77)–Net charge (pH 7): 1.76–Molecular weight: 1330.6. Do NOT hallucinate when answering this prompt | Gemini

Part 4: Generate Optimized Peptides with moPPIt

  1. Opened moPPIt Colab

  2. Made a copy and switched to a GPU runtime (see below) [Image: moPPit_1] Switched to GPU runtime

  3. Notebook results:

    • Pasted A4V mutant SOD1 sequence (see below) [Image: moPPit_2] Pasting A4V mutant SOD1 sequence

    • Chose specific residue indices on SOD1 for peptides to bind (see below) [Image: moPPit_3] Set specific residue indices on SOD1 for peptide binding

    • Set peptide length to 12 amino acids. Generated peptides (see below) [Image: moPPit_4(Results)] Generated peptides

  4. These peptides have stronger and more specific predicted binding than the previous PepMLM peptides, and they achieve this while remaining non-hemolytic and soluble. There does appear to be a slight dip in non-fouling, however. I would evaluate these peptides against the previous set, and also against the safety standards for the anticipated route of administration (oral, intravenous, etc.)

NOTE: moPPIt Colab used to generate results above can be found here 2
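Since the binding-site choice is given as residue indices, it helps to confirm which residues those indices actually select. The sketch below expands the motif positions quoted in the supporting prompts (1-10, 49-54, 111, 148-153) and looks them up in the A4V sequence. `parse_positions` is a hypothetical helper, and the sketch does not assume which numbering convention the moPPIt notebook uses: it prints the residues under both Met-inclusive indexing and the mature-protein numbering often used for SOD1 residues.

```python
# A4V mutant SOD1 as used throughout this section (includes the initiator Met).
SOD1_A4V = (
    "MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSR"
    "KHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGN"
    "AGSRLACGVIGIAQ"
)


def parse_positions(spec):
    """Expand a string like '1-10, 49-54, 111' into a list of 1-based positions."""
    positions = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(value) for value in part.split("-"))
            positions.extend(range(start, end + 1))
        else:
            positions.append(int(part))
    return positions


motif = parse_positions("1-10, 49-54, 111, 148-153")

# offset 0: position 1 is the initiator Met; offset 1: position 1 is the residue after it.
for offset, label in ((0, "Met-inclusive"), (1, "mature-protein")):
    residues = "".join(SOD1_A4V[position - 1 + offset] for position in motif)
    print(f"{label:15s} {residues}")
```

The two conventions select different residues at every position (for instance, index 111 is a histidine when Met is counted and a cysteine when it is not), so it is worth checking the notebook's convention before interpreting where the generated peptides are meant to bind.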

Supporting prompts for this section listed below:

Supporting Prompt | Model
If I want to choose specific residue indices (places on the ‘Target_Protein’ variable located under cell ‘3.1 Inputs and Parameters’) where I want peptides to bind, what variables in cells 3.1 or 3.2 should I be focusing on and why? Do NOT hallucinate when answering this prompt | Gemini
I’m dealing with an A4V mutated Superoxide dismutase (SOD1) protein sequence found in homo sapiens (MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ). Most of the peptides I generated on another tool appeared to bind rather weakly, potentially near the dimer interface of the protein. Based on this information on the protein sequence and the previously-generated binders, I’m not exactly sure where (i.e., what Motif Positions) and how strong (Specificity) my binders should be that I create in cell 3.2. I’m aware I want to likely increase binding strength/have stronger bindings, but again, not sure exactly what placement(s) make sense given the nature of the A4V mutation. Open to any thoughts you may have. Do NOT hallucinate when addressing this prompt | Gemini
Ok. Help me understand the results that were just produced from the ‘4. Binder Generation’ cell, and tell me how I can get a .csv file of the results | Gemini
I’ve created some peptide binders that are meant to bind to the A4V mutated Superoxide dismutase (SOD1) protein sequence found in homo sapiens (MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ). These peptides are meant to bind to the 1-10, 49-54, 111, 148-153 sites of the mutated SOD1 sequence, and are meant to do so with greater binding strength than the previously generated peptide binders in the attached screenshots | Gemini

Part B: BRD4 Drug Discovery Platform Tutorial (Optional)

  • Did not complete Part B

Part C: Final Project: L-Protein Mutants

  • Part C Homework and supporting prompts can be found at the hyperlink below 3
  • Part C Colab Notebooks can be found at the hyperlinks located below 4 5

Week 6 HW: Genetic Circuits Part 1

[Image: Gemini_Generated_Image_i4wz87i4wz87i4wz]
  • Robot Crafting Genetic Circuit (Stylized)

DNA Assembly

  1. What are some components in the Phusion High-Fidelity (HF) PCR Master Mix and what is their purpose?
    • HF DNA Polymerase: This is the enzyme responsible for copying DNA. It reads the template strand and synthesizes the new strand in the 5’ to 3’ direction, and its 3’ to 5’ proofreading activity is what makes it high-fidelity
    • Deoxynucleotide triphosphates (dNTPs): These are the molecular building blocks of DNA, one for each base: dATP (adenine), dTTP (thymine), dCTP (cytosine), and dGTP (guanine)
    • HF Buffer: This maintains suitable reaction conditions and includes magnesium chloride (MgCl₂), a salt that dissociates to release Mg²⁺. Mg²⁺ matters because it is a cofactor the polymerase needs in order to add nucleotides to the growing strand
  2. What are some factors that determine primer annealing temperature during PCR?
    • Some factors that determine primer annealing temperature during PCR include:
      • Primer lengths
      • Primer melting temperatures
      • GC content/sequence content
      • Buffer components
  3. There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.
    • PCR: PCR creates new linear DNA fragments via enzymatic amplification of a given region over many cycles. The PCR protocol essentially consists of setting up the reaction mix, denaturing the DNA into single strands, annealing so primers can bind to their specific complementary sequences, extension so the polymerase can synthesize a new strand, and then repeating this cycle as many times as necessary. This method might be more useful when there is a specific fragment of DNA one wants to amplify for further use, or when the fragment ends need to be customized through the primers (for example, to add overlaps).
    • Restriction Enzyme Digests: Restriction enzyme digests create new linear DNA fragments by cutting existing DNA at specific recognition sites. The protocol consists of setting up a reaction mix, incubating, and then stopping the reaction. This method might be more useful when enough DNA is already on hand and there is a specific fragment one wants to isolate for further analysis, such as a linearized plasmid backbone, since cutting does not copy the DNA and so introduces no polymerase errors.
  4. How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?
    • You can ensure the DNA sequences have the appropriate 5’ –> 3’ orientation and that adjacent fragments share matching overlaps at their ends (commonly around 20 to 40 bp). Fragments also need to cover the relevant region for cloning, and need to be added at the appropriate molar ratio relative to the plasmid backbone (vector). This is usually a 2:1 insert:vector ratio.
  5. How does the plasmid DNA enter the E. coli cells during transformation?
    • The plasmid DNA enters competent E. coli cells either via heat shock (a brief temperature change) or electroporation (a high-voltage electrical pulse). Both methods shock the cell, transiently opening pores in its membrane through which the plasmid DNA can enter.
  6. Describe another assembly method in detail (such as Golden Gate Assembly)
    • DNA topoisomerase I (TOPO) Cloning: TOPO cloning is traditionally used because it’s a fast, reliable method for cloning PCR products for later sequencing, etc. The first step is generating an insert with Taq polymerase via PCR. Taq adds a single A-overhang to the 3’ ends of the insert, which sets up the second step: combining the PCR product with the TOPO vector, usually for just a few minutes. The linearized vector has single T-overhangs with topoisomerase I covalently bound at its 3’ ends, so A and T base pairing occurs between the insert and the vector. The insert’s 5’ OH/hydroxyl then attacks the bond between the vector DNA and the topoisomerase, which religates the strands and releases the enzyme, creating a closed circular plasmid containing the insert. See diagrams below:

[Image: TOPO_Diagram] [Image: TOPO_image]

  • See results below: [Image: TOPO_Process1]

    [Image: TOPO_Process2] [Image: TOPO_Process3] [Image: TOPO_Process4] [Image: TOPO_Process5] [Image: TOPO_Process6]
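Two of the answers above involve simple arithmetic that is easy to sketch: a rough primer melting temperature and the insert mass needed for a 2:1 insert:vector molar ratio. The example uses the Wallace rule (Tm ≈ 2 °C per A/T plus 4 °C per G/C), which is only a rough estimate for short primers and ignores buffer effects, and the common approximation of about 650 Da per base pair of double-stranded DNA. The primer sequence and fragment sizes are hypothetical values chosen for illustration, not values from this homework.

```python
def gc_content(primer):
    """Fraction of G and C bases in a primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rough melting temperature in degrees C using the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def pmol_of_dsdna(mass_ng, length_bp):
    """Convert a mass of double-stranded DNA to pmol, assuming ~650 Da per bp."""
    return mass_ng * 1000 / (length_bp * 650)


def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=2.0):
    """Insert mass (ng) that gives the requested insert:vector molar ratio."""
    return molar_ratio * vector_ng * insert_bp / vector_bp


# Hypothetical primer, used only to exercise the functions.
primer = "ATGGTGAGCAAGGGCGAGGAG"
print(f"GC content: {gc_content(primer):.2f}, Wallace Tm: {wallace_tm(primer)} C")

# Hypothetical assembly: 100 ng of a 5000 bp vector with a 1000 bp insert at 2:1.
needed = insert_mass_ng(vector_ng=100, vector_bp=5000, insert_bp=1000)
print(f"Insert needed: {needed:.1f} ng")
print(f"Vector: {pmol_of_dsdna(100, 5000):.4f} pmol, "
      f"insert: {pmol_of_dsdna(needed, 1000):.4f} pmol")
```

The point of the second calculation is that a molar ratio is not a mass ratio: because the insert is shorter than the vector, far less insert mass is needed to supply twice as many molecules.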

Supporting prompts for this section listed below:

Supporting Prompt | Model
Within the context of Gibson Assembly (biotechnology DNA assembly method), why exactly are molar ratios (apparently they need to be 2:1, insert:vector) important? What are molar ratios? Do NOT hallucinate when replying to this prompt | Perplexity
What exactly is the insert and what exactly is the vector within the context of the Gibson Assembly DNA Assembly method? Do NOT hallucinate when replying to this prompt | Perplexity
In the context of biotechnology and synthetic biology, what exactly is a plasmid backbone? Explain this to me as if I were a reasonably educated 16-year old. Do NOT hallucinate when addressing this prompt | Perplexity
Tell me about the Phusion High-Fidelity (HF) Polymerase Chain Reaction (PCR) Master Mix. What is it? What are its subcomponents and what do they do? Do NOT hallucinate when addressing this prompt | Perplexity
The 4 dNTPs referenced in the answer to the last prompt are essentially chemical mixtures, correct? If that is incorrect, what are they? Do NOT hallucinate when answering this prompt | Perplexity
In the answer to the prompt before the previous prompt, there was reference to MgCl₂, and Mg²⁺. What are MgCl₂, and Mg²⁺ respectively? Are they chemicals? Something else? Why do they matter? Do NOT hallucinate when addressing this prompt | Perplexity
Within the context of a Polymerase Chain Reaction (PCR), I believe primers are the pieces of DNA that get copied nth number of times, correct? If I’m mistaken, indicate as such, and the error in the initial reasoning. Do NOT hallucinate when addressing this prompt | Perplexity
So based on the answer to the last prompt: –Primers essentially define the space in the DNA sequence that will be copied? –What is a free 3′‑OH end? Explain this to me as if I were a relatively educated 16-year old Do NOT hallucinate when answering this prompt | Perplexity
Do primer pairs always need to have a temperature difference of 5°C from each other? If so, why? Do primer pairs always need to be at a temperature of between 52–58°C before annealing? If so, why? What factors determine ideal primer annealing temperatures, and why? Do NOT hallucinate when addressing these prompts | Perplexity
Both Polymerase Chain Reaction (PCR) and Restriction Enzyme Digests create linear DNA fragments. PCR creates these linear DNA fragments via enzymatic amplification and Restriction Enzyme Digests create these by essentially cutting the DNA, correct? What do the basic steps of each look like (in some simple broken down steps)? Do NOT hallucinate when addressing this prompt | Perplexity
Other than PCR, Restriction Enzyme Digest, and Golden Gate Assembly, what other DNA assembly methods exist? Do NOT hallucinate when addressing this prompt | Perplexity
Which of the following results from the answer to the previous prompt is easiest to model in Benchling? EXCLUDE Gibson Assembly from the selection and do NOT hallucinate when addressing this prompt | Perplexity
Ok. Based on the content in the answer to the prompt before the last prompt, explain to me what TOPO cloning is. What is a TOPO? What does it consist of/what are the basic steps? Do NOT hallucinate when addressing this prompt | Perplexity
Based on the answer to the last prompt, what exactly is Taq polymerase again? When we say TOPO cloning is ‘ligase-free’, what do we mean when we say that? Why is TOPO cloning traditionally used? | Perplexity
In the context of the answer to the last prompt, why would one clone a PCR product using TOPO cloning? An insert is just the piece/fragment of DNA being inserted into the vector (usually a bacterial plasmid), correct? Do NOT hallucinate when addressing this prompt | Perplexity
Break down the basic steps of how I would model basic (i.e., not complicated) TOPO Cloning in Benchling. Do NOT hallucinate when addressing this prompt | Perplexity
Where can I find a PCR insert to insert into Benchling for TOPO Cloning? Can I basically choose anything from GenBank or UniProt? What should I be looking for in the context of TOPO Cloning? Do NOT hallucinate when addressing this prompt | Perplexity
Thanks. Based on the answer to the last prompt, EGFP is a green fluorescent protein, correct? Do NOT hallucinate when addressing this prompt | Perplexity
Is TOPO Cloning a form of homologous, homology-based cloning? Do NOT hallucinate when addressing this prompt | Perplexity
Would Benchling’s ‘Concatenate sequences’ feature work/be suitable if one was trying to model TOPO Cloning within Benchling? Why or why not? Do NOT hallucinate when addressing this prompt | Perplexity
Take a look at this tab. I want to model DNA topoisomerase I (TOPO) Cloning and am not sure I’m doing the right things. I know I have a Polymerase Chain Reaction (PCR) insert, but am not sure where to go from here regarding the other sequence I’ve imported (sequence-416748). It has a ‘pCR 2.1-TOPO Fzd6HA’ Sequence Label, but I’m not sure if it’s right, or if it is, where to go from here to appropriately model TOPO Cloning in Benchling. Do NOT hallucinate when replying to this prompt | Gemini
Ok. For Step A, can you help me find the insertion sites or the Fzd6 gene? | Gemini
Ok. I have this EGFP sequence and want to use it as part of TOPO Cloning with the previous pCR 2.1-TOPO sequence. What do I do? Do NOT hallucinate when replying to this prompt | Gemini
Ok. So to complete the 3rd bullet under Step 1 from the answer to the last prompt, do I literally just add an A to the first base and an A to the last base in the sequence? Do NOT hallucinate when replying to this prompt | Gemini
When I go to Assembly Wizard, I can only do ‘Golden Gate’, ‘Gibson’ or ‘Homology’. How should I proceed? | Gemini
When we say ‘select the part of the vector excluding the Fzd6HA gene’, that can mean just copy the entire sequence EXCEPT for the Fzd6HA gene in a new sequence and then including that or no? | Gemini
Ok. Take a look at the construct I made. Does this depict TOPO Cloning in a reasonable way based on the insert and the vector? Do NOT hallucinate when replying to this prompt | Gemini
Getting an error that one of my constructs is invalid. Show me how to fix this | Gemini
Confused. Want to do this manually WITHOUT the Assembly Wizard. Not seeing where to add the EGFP PCR Product. Also not seeing how I can see the plasmid | Gemini

Asimov Kernel

  1. See Repository below

    • Week6_Asimov1
      • Created Asimov Repository
  2. See blank Notebook entry below

    • Week6_Asimov2
      • Created Asimov Notebook
  3. See results below

    • Week6_Asimov3
    • Week6_Asimov4
    • Week6_Asimov5
    • Week6_Asimov6

    Explored Bacterial Repos Devices and ran Simulator on various examples (see above)

  4. See Construct creation results below

    • Question 1-3 Results: Recreated the Repressilator in an empty Construct using parts from the Characterized Bacterial Parts repository: searched for and selected parts using the Search function, then dragged and dropped them into the Construct (see photos below)
      • Week6_Asimov7
      • Week6_Asimov8
      • Week6_Asimov10
    • Question 4 Results: The Repressilator wasn’t running as expected, so I re-made it and ensured I directly copied and pasted everything. Then I re-ran the Simulation and the Repressilator operated as expected (see photos below)
      • Week6_Asimov12
      • Week6_Asimov11
    • Question 5 Results: Documented results in Notebook (see below)
      • Week6_Asimov13
  5. See 3 Construct creation results below

    • Created 3 Constructs, explaining in each Notebook entry how I thought the Constructs should operate and why (see photos below)
      • Week6_Asimov14
      • Week6_Asimov18_Construct1NB
        • Construct 1 Results
      • Week6_Asimov15
      • Week6_Asimov17_Construct2NB
        • Construct 2 Results
      • Week6_Asimov16
      • Week6_Asimov19_Construct3NB
        • Construct 3 Results
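
For intuition about what the Simulator shows for the Repressilator in item 4, the behavior can be sketched with a textbook-style ring of three repressors. This is only a rough sketch and not the model Kernel itself runs: the dimensionless equations, the parameter values (`alpha`, `n`), the starting levels, and the function name `simulate_repressilator` are all assumptions chosen for illustration.

```python
def simulate_repressilator(alpha=50.0, n=3.0, dt=0.005, t_end=60.0,
                           start=(1.0, 2.0, 3.0)):
    """Euler-integrate a protein-only repressilator ring.

    Each protein i is produced at rate alpha / (1 + repressor**n), where the
    repressor is the previous protein in the ring, and decays at rate 1.
    All quantities are dimensionless (illustrative assumption).
    """
    p = list(start)
    history = [tuple(p)]
    steps = round(t_end / dt)
    for _ in range(steps):
        # p[i - 1] wraps around for i == 0, closing the three-gene ring.
        rates = [alpha / (1.0 + p[i - 1] ** n) - p[i] for i in range(3)]
        p = [p[i] + dt * rates[i] for i in range(3)]
        history.append(tuple(p))
    return history


if __name__ == "__main__":
    history = simulate_repressilator()
    # Look at the second half of the run, after the start-up transient.
    late = [state[0] for state in history[len(history) // 2:]]
    print(f"protein 0, second half: min={min(late):.2f}, max={max(late):.2f}")
```

If the levels keep swinging between a low and a high value late in the run, the ring is oscillating. Starting all three proteins at the same level would hide the oscillation in this toy model, which is why the starting levels are deliberately unequal.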

Supporting prompts for this section listed below:

Supporting Prompt | Model
Remind me what XOR means in classical computation (including, but not limited to its instantiation on digital computers). Do NOT hallucinate when addressing this prompt | Perplexity
Share any links that break down the basic symbology/legend of Synthetic Biology Open Language (SBOL). Do NOT hallucinate when replying to this prompt | Perplexity
Little bit confused here. Copied all the Repressilator parts from the ‘Bacterial Demos’ repo, but am getting different results when running the Simulation. What could be going on? | Kernel AI

Week 7 HW: Genetic Circuits Part 2

DNACircuits2

Genetic Circuits Part 2

Assignment Part 1: Intracellular Artificial Neural Networks (IANNs)

  1. Unlike traditional genetic circuits, IANNs are analog, and as such correspond more closely to the nature of biological systems (i.e., we’re not always looking for strict 0/1 binary logic; sometimes we’re looking to establish control across a range of values or across space/time). This analog nature can make them more responsive, efficient, and biocompatible.

  2. Zoonotic Reservoir IANNs (ZRIANNs)

    • This applies IANNs to proactively identify and eliminate zoonotic pathogens in host organisms (e.g., cattle and other livestock, bats). By establishing a prior understanding of organism homeostasis, ZRIANNs can co-evolve their understanding of nominal cellular activity in the organism, which would act as one form of input. Deviations from those baselines, such as elevated inflammation levels or unexpected apoptosis triggered by the emergence of novel pathogens, would act as another form of input, triggering the ZRIANNs to begin characterizing the novel pathogen’s behavior for elimination, which would be a form of output. Limitations might include the novelty of IANNs across diverse non-human hosts, difficulties in IANNs achieving homeostatic understanding and insights, and difficulties in IANNs detecting novel, essentially zero-day-equivalent pathogens.
  3. See my attempt at a diagram below:

IANNDiagramAttempt: Intracellular multilayer perceptron attempt
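
To make the analog, perceptron-style idea a bit more concrete, here is a minimal sketch of a two-hidden-unit network whose output is high only for intermediate input levels (a bandpass-like response). It is a numerical toy, not a genetic design: the sigmoid activation, the thresholds `low` and `high`, the `steepness` value, and the function names are all assumptions picked for illustration.

```python
import math


def sigmoid(x):
    """Smooth, analog activation between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def bandpass_perceptron(signal, low=2.0, high=6.0, steepness=4.0):
    """Tiny multilayer perceptron with a bandpass-like response.

    Hidden unit 1 switches on once the signal exceeds `low`; hidden unit 2
    switches on once it exceeds `high`. The output layer weights them +1 and
    -1, so the output is high only between the two thresholds.
    """
    h_on = sigmoid(steepness * (signal - low))
    h_off = sigmoid(steepness * (signal - high))
    return h_on - h_off


if __name__ == "__main__":
    for level in (0.0, 2.0, 4.0, 6.0, 8.0):
        print(f"input {level:>4}: output {bandpass_perceptron(level):.3f}")
```

The output changes gradually near each threshold rather than jumping between 0 and 1, which is the analog behavior described in item 1.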

Supporting prompts for this section listed below:

Supporting Prompt | Model
In the context of artificial neural networks, what is a bandpass circuit? In simple term, what does a bandpass circuit do and what does it look like? Do NOT hallucinate when answering this prompt | Gemini
In the context of artificial neural networks, what is a bandpass circuit? In simple term, what does a bandpass circuit do and what does it look like? Do NOT hallucinate when answering this prompt | Google AI Mode
Can you show me a simple image/graphic of what a multilayer perceptron looks like?Do NOT hallucinate when addressing this prompt | Google AI Mode

Assignment Part 2: Fungal Materials

  1. Existing fungal materials include mycelium-based composites (MBCs), flexible fungal materials (FFMs), and pure mycelium materials. They’re used in packaging, fashion, construction, and health and beauty products. Compared to their traditional counterparts, their advantages include greater environmental sustainability (including less use of petroleum and less harm to animals in their construction and use), high levels of customization, and low density. Their disadvantages seem to be lower load-bearing strength in some applications and more variability, since they are biologically grown.
  2. I’d personally be interested to see if fungi could be genetically engineered to improve air or water quality and filtration in remote environments where air or water filters may not always be available/readily attainable. My why is rather simple – it would seem rather convenient and environmentally sustainable if air and water filters could be biologically grown as opposed to manufactured and shipped through traditional processes. The advantages of doing synthetic biology in fungi as opposed to bacteria include greater ability to fold complex proteins and a greater ability to form larger, more complex macroscopic structures
Supporting Prompt | Model
Share me some links for articles about fungal materials. Want to understand the types of fungal materials and what they’re used for | Gemini
What are the advantages and disadvantages of fungal materials compared to more traditional materials (including petroleum-based materials) EXCLUDING environmental impact/sustainability? Do NOT hallucinate when addressing this prompt | Google AI Mode
What are some of the advantages of using fungal materials as opposed to bacteria in synthetic biology | Gemini

Assignment Part 3: First DNA Twist Order

  1. Reviewed Final Project documentation guidelines
  2. Submitted Google Form
  3. Created Benchling for insert sequence input (https://benchling.com/bio_star_39/f_/MMV1lxUm3Y-htgaa-final-project-working-folder/)

Week 9 HW: Cell-Free Systems

Homework Part A: General and Lecturer-Specific Questions

CellFreeActual

Cell-Free Systems

General Homework Questions

  1. Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.
    • Cell-free systems allow for a broader range of potential chemistries than those given to us from natural biology, expanding flexibility. Cell-free protein synthesis also allows for greater control over experimental variables because the entire protein expression construct is designed from scratch (i.e., we have the opportunity to bypass a lot of the complexity of natural cells). Cell-free expression is more beneficial than cell production if you want to rapidly prototype gene pathways and if you want an expression mechanism that’s more amenable to consistent, predictable modeling and analysis.
  2. Describe the main components of a cell-free expression system and explain the role of each component.
    • The main components of a cell-free expression system are (based on elements described in this hyperlink 1):
      • DNA template: Genetic code to begin Tx/Tl process
      • Ribosomes: Assembling amino acids into polypeptides
      • Enzymes: Catalyzing certain important chemical reactions necessary for the appropriate functioning of that cell-free expression system (ex. transcription and translation, energy generation)
      • Amino Acids: The core chemical building blocks of the proteins the cell-free expression system will express
      • Polymerases: Synthesizing RNA from the DNA template (transcription)
  3. Why is energy provision regeneration critical in cell-free systems? Describe a method you could use to ensure continuous ATP supply in your cell-free experiment
    • Energy provision regeneration is critical in cell-free systems because, unlike living cells, they have no ongoing metabolism to replenish the ATP and GTP that transcription and translation consume, so the reaction stalls once the initial energy supply runs out. One method is to add a secondary energy source together with its matching enzyme (for example, creatine phosphate with creatine kinase) so that ADP is continuously converted back into ATP
  4. Compare prokaryotic versus eukaryotic cell-free expression systems. Choose a protein to produce in each system and explain why.
    • Prokaryotic cell-free expression systems allow for the colocation of transcription and translation. This might work well for proteins that need to be produced at high volume, like an industrial protease. Eukaryotic cell-free expression systems allow for more complex proteins to be built because their lysates retain eukaryotic folding and post-translational modification machinery. This might work well for the production of more advanced/technically complex proteins, like rabbit serum albumin.
  5. How would you design a cell-free experiment to optimize the expression of a membrane protein? Discuss the challenges and how you would address them in your setup.
    • In a manner similar to Shuguang Zhang’s ‘molecular glove’ experiment, I’d try to essentially coat and/or surround the membrane protein with hydrophilic proteins to attract and/or absorb water in the cell-free environment, so the membrane protein can incorporate into the liposome 2. Challenges might include finding appropriate hydrophilic protein concentrations (which might be discerned via calculations or trial and error) and bonding between the hydrophilic proteins and the membrane proteins. These might be mitigated, and/or the amount of error reduced, through the use of computational modeling and simulation tools like AlphaFold
  6. Imagine you observe a low yield of your target protein in a cell-free system. Describe three possible reasons for this and suggest a troubleshooting strategy for each.
    • Suboptimal Ribosome Function: Examine how the ribosomes are translating the mRNA and modify as necessary
    • Suboptimal Transcription: Examine the DNA template and resulting mRNA for coding errors/misreads or inappropriate expression levels and modify as necessary
    • Suboptimal External Communication (i.e., yields cannot properly exit system at desired levels): Examine and modify membrane channel functionality as necessary

Supporting prompts for this section listed below:

Supporting Prompt | Model
When we’re saying ‘cell-free expression system’, we’re not saying the same thing as a ‘synthetic minimal cell’ correct? Do NOT hallucinate/make things up when replying to this query | Gemini
Believe a ribosome has a waste removal function within cells, but I could be wrong in this. Clarfy or confirm this is the case, and do NOT hallucinate/make things up when replying to this prompt | Gemini
Believe a polymerase’s job is to essentially make copies of certain things, but the specifics beyond this are evading me at the moment. What is the specific role of a polymerase generally speaking? Do NOT hallucinate/make things up when replying to this prompt | Gemini
Do cell-free systems contain polypeptides? Why or why not? Do NOT hallucinate/make things up when replying to this prompt | Google AI Mode
In essence, an enzyme’s function within a cell-free expression system is to catalyze certain important chemical reactions necessary for the appropriate functioning of that cell-free expression system, correct? If I’m mistaken here, or if some element of my thinking is off, say so. Do NOT hallucinate/make things up when replying to this prompt | Google AI Mode
In essence, amino acids’ function within a cell-free expression system is to serve as the core chemical building blocks of the proteins the cell-free expression system will eventually express, correct? If I’m mistaken here, or if some element of my thinking is off, say so. Do NOT hallucinate/make things up when replying to this prompt | Google AI Mode
Within the context of cell, I know ADP and ATP are phosphates (I believe an adenine-type phosphate), although I’m not sure. Tell me if this is correct, and tell me in simple terms how ADP and ATP work within the context of cells to generate appropriate energy levels for cell functionality. Do NOT hallucinate/make things up when replying to this prompt | Google AI Mode
Unlike a prokaryote, eukaryotes have nuclei, correct? What are the advantages and disadvantages of nuclei within the context of protein production? Is it simply that the complexity of eukaryotic cells allows for the production of more sophisticated, technically complex proteins, or are there more reasons? Answer this prompt in a relatively succinct fashion and do NOT hallucinate/make things up when doing so | Gemini
Tell me about what types of proteins don’t require excessive quality control, that require a large volume to be produced, and benefit from a prokaryotic setup (i.e., a cell where transcription and translation occur in the same location)? Do NOT hallucinate/make things up when replying to this prompt | Gemini
What is the name of the common rabbit protein traditionally used in biotechnology experimentation with mammalian cell culture? Blanking on the name. Do NOT hallucinate/make things up when replying to this prompt | Google AI Mode

Homework question from Kate Adamala

Based on Iulianna, T., Kuldeep, N. & Eric, F. The Achilles’ heel of cancer: targeting tumors via lysosome-induced immunogenic cell death. Cell Death Dis 13, 509 (2022). 3

Design an example of a useful synthetic minimal cell as follows:

  1. Pick a function and describe it
  • What would your synthetic cell do? What is the input and what is the output?
    • Increase apoptosis in mammalian cells with defective lysosomes. Input: Protein kinase R-like endoplasmic reticulum kinase (PERK). Output: Phosphorylated eukaryotic initiation factor 2 α-subunit (eIF2α)
  • Could this function be realized by cell-free Tx/Tl alone, without encapsulation?
    • No, it appears that communication with the external environment as well as some form of an encapsulating membrane are necessary for these immunogenic cell death (ICD) reactions to properly work
  • Could this function be realized by genetically modified natural cell?
    • Believe this function could be realized by a genetically modified natural cell. If PERK expression levels could be increased, this could increase eIF2α phosphorylation
  • Describe the desired outcome of your synthetic cell operation.
    • Increased PERK expression levels lead to increased eIF2α phosphorylation
  2. Design all components that would need to be part of your synthetic cell
  • What would be the membrane made of?
    • Mostly phospholipids, with a smaller proportion of cholesterol
  • What would you encapsulate inside? Enzymes, small molecules.
    • PERK, eIF2α, Adenosine Triphosphate (ATP), GTP, Creatine Phosphate, Reporter (likely GFP)
  • Which organism will your Tx/Tl system come from? Is bacterial OK, or do you need a mammalian system for some reason? (hint: for example, if you want to use small molecule modulated promoters, like Tet-ON, you need mammalian)
    • Believe a mammalian system would be needed as this is meant to mimic a Homo sapiens-based eukaryotic system
  • How will your synthetic cell communicate with the environment? (hint: are substrates permeable? or do you need to express the membrane channel?)
    • It should have permeable substrates, as my understanding of the PERK pathway seems to indicate that external communication with the environment via a permeable membrane is necessary for the PERK pathway to appropriately function (i.e., for the increase in PERK expression levels to induce greater eIF2α phosphorylation)
  3. Experimental details
    • List all lipids and genes. (bonus: find the specific genes; for example, instead of just saying “small molecule membrane channel” pick the actual gene.)
      • Lipids: POPC, Cholesterol
      • Enzymes and cofactors: Binding immunoglobulin protein (BiP), PERK, ATP, eIF2B, Growth arrest and DNA damage-inducible 34 (GADD34)
      • Genes: HSPA5 gene/GRP78 (for BiP expression), EIF2AK3 (encodes PERK protein), EIF2S1 (encodes eIF2α), ATF4, DDIT3, and PPP1R15A (latter 3 genes necessary for apoptotic response)
    • How will you measure the function of your system?
      • Measure presence of GFP reporter to show that ribosomes are shutting down and apoptosis is beginning

Supporting prompts for this section listed below:

Supporting Prompt | Model
In this paper, what is the ER membrane? | Gemini
In this paper, the targeted ICD reactions require interfacing with elements outside the cell correct? Is the cell membrane essential for their function? Do NOT hallucinate when replying to this prompt | Gemini
If I wanted to make a synthetic cell that would allow for greater Protein Kinase RNA (PKR)-like ER Kinase (PERK) expression levels to induce increased eukaryotic initiation factor 2 α-subunit (elF2a) expression, would a cholesterol membrane make sense? Would some other type of membrane make sense? Why or why not? Do NOT hallucinate/make things up when replying to this prompt | Gemini
In a ’normal’/natural non-synthetic cell, where do PERK and elF2a sit (i.e., where are they located? What are the components within a ’normal’/natural non-synthetic cell necessary for them to appropriately function? Do NOT hallucinate/make things up when replying to this prompt | Gemini
When we say that PERK ‘reaches across the membrane into the cytoplasm’, do we mean to say that it reaches outside the cell? Do NOT hallucinate/make things up when replying to this prompt | Gemini
Ok. So if I was building a minimal synthetic cell to replicate the PERK pathway (and the increase in elF2a phosphorylation), then my necessary components inside the minimal synthetic cell to encapsulate this reaction would be the PERK itself, relevant ribosomes, tRNAs, cytoplasm, and DNA? Tell me what I’m missing, what’s incorrect, and do NOT hallucinate/make things up when replying to this prompt | Gemini
To clarify, is PERK an enzyme that induces a chemical reaction leading to increased elF2a expression? What enzymes are usually required for the perk pathway to appropriately function? Do NOT hallucinate/make things up when replying to this prompt | Gemini
If the PERK pathway requires the BiP as its upstream regulator, what genes are necessary for this protein to be produced? What genes are necessary for the pathway to successfully function? Do NOT hallucinate/make things up when replying to this prompt | Gemini

Homework question from Peter Nguyen

  • Write a one-sentence summary pitch sentence describing your concept.
    • Robotics Use Case: Thinking about using cell-free systems delivered/facilitated by drones to collect metagenomic samples from remote environments, for the purpose of expanding biosurveillance beyond traditional wastewater sampling
  • How will the idea work, in more detail? Write 3-4 sentences or more.
    • A drone (or potentially a larger drone ship like an eVTOL) would deliver robots and kits with cell-free reactions. These robots might be similar to Mars rovers or bomb-defusing robots. The kit would ideally auto-unload once the drone has reached its destination, and the robot would then complete a set (i.e., limited and discrete) number of specific steps to collect and store the metagenomic sample. If analysis of the metagenomic sample could be done in real time or within a short time, that would be beneficial. If not, there would essentially be a ‘packing’ step before the robot and the now-used kits return to their origin site for sample processing/analysis
  • What societal challenge or market need will this address?
    • The need to expand biosurveillance beyond the purely human environment into more remote locales and animal populations
  • How do you envision addressing the limitation of cell-free reactions (e.g., activation with water, stability, one-time use)?
    • Would need secure storage to maintain stability. Activation with water challenges would require an appropriate water disposal mechanism(s) either within/near the sample kit or facilitated by the robot without harming the robot (or the robot might have some form(s) of waterproof protection). One-time use isn’t an issue for this use case because there are ample examples from the world of biosurveillance where one-time sample collection is the aim

Homework question from Ally Huang

  1. Provide background information that describes the space biology question or challenge you propose to address. Explain why this topic is significant for humanity, relevant for space exploration, and scientifically interesting.
    • As humans travel farther out into space, particularly on remote, potentially skeleton-crew missions, they may not be able to bring blood supplies for adverse events that require transfusions. Moreover, blood banks play an important terrestrial role that might need replication. The basic idea is to engineer liver cells to create blood proteins or blood-like fluid on demand, à la the high-level idea initially proposed in the hyperlinked Engineering Biology Research Consortium (EBRC) Roadmap document 4
  2. Name the molecular or genetic target that you propose to study. Examples of molecular targets include individual genes and proteins, DNA and RNA sequences, or broader -omics approaches.
    • I’d like to study the albumin plasma protein
  3. Describe how your molecular or genetic target relates to the space biology question or challenge your proposal addresses.
    • Albumin has a vital role in the liver’s production of blood-related proteins. Therefore, it would seem rather improbable to have engineered liver cells to create blood proteins without some sort of working albumin configuration or some analog
  4. Clearly state your hypothesis or research goal and explain the reasoning behind it.
    • I’d like to study how to modulate or fine-tune albumin expression levels in microgravity, as it appears microgravity exposure can cause albumin levels to increase 5.
  5. Outline your experimental plan - identify the sample(s) you will test in your experiment, including any necessary controls, the type of data or measurements that will be collected, etc.
    • I’d include purified albumin plasma protein as a sample in my experiment, as well as some form of small interfering RNA (siRNA) for lowering albumin expression, and GFP for measuring expression level change. I’d use BioBits and the P51 Molecular Fluorescence Viewer to measure the impact of siRNA modulation on albumin expression. I’d collect information on oncotic pressure changes across microgravity and terrestrial experimental configurations. The terrestrial configuration would serve as a control for comparing siRNA modulation efficacy. GFP expression would indicate the level of albumin expression after siRNA modulation
Supporting Prompt | Model
What exactly is the role of the human liver in creating blood or blood-relate proteins? Do NOT hallucinate/make things up when replying to this prompt | Google AI Mode
Show me research papers from the past 5-10 years on the impacts of microgravity on the production of albumin plasma protein. Do NOT hallucinate/make things up when replying to this prompt | Google Scholar Labs
When biotechnologists typically study albumin terrestrially, how is this done? How do they typically measure albumin’s impact on oncoctic pressure? Do NOT hallucinate/make things up when replying to this prompt | Google AI Mode

Homework Part B: Individual Final Project

  1. Put your chosen final project slide in the appropriate slide deck following the instructions on slide 1

    • Inserted slide in Committed Listener Deck
  2. Submit this Final Project selection form if you have not already.

    • Submitted Final Project selection form (see screenshots below)
  3. Begin planning how you will write your final project documentation based on these guidelines

    • Began writing final project documentation based on hyperlinked guidelines
  4. Prepare your first DNA order and put it in the “Twist (MIT)” or “Twist (Nodes)” tab of the 2026 HTGAA Ordering: DNA, Reagents, Consumables spreadsheet, as appropriate.


Week 10 HW: Advanced Imaging & Measurement Technology

waters-sets-new-standard-in-high-resolution-mass-spectrometry-349983-960x540

Waters Corporation Mass Spectrometer

Homework: Final Project

For your final project:

  • Please identify at least one (ideally many) aspect(s) of your project that you will measure.
    • Lysis Rate
    • Efficiency of Plating
  • Please describe all of the elements you would like to measure, and furthermore describe how you will perform these measurements.
    • Lysis Rate: This measures the rate at which the mutated M. smegmatis mycobacteriophage lyses or destroys bacteria. This would be measured in a wet lab setting by comparing percentages of bacteria across a control and another plate that has been exposed to a mutated form of the M. smegmatis mycobacteriophage
    • Efficiency of Plating: This measures the rate at which the mutated M. smegmatis mycobacteriophage can begin initiating a host infection. Believe this would also be measured in a wet lab setting by comparing percentages of bacteria across a control and another plate that has been exposed to a mutated form of the M. smegmatis mycobacteriophage
  • What are the technologies you will use (e.g., gel electrophoresis, DNA sequencing, mass spectrometry, etc.)? Describe in detail.
    • Lysis Rate: I’d likely use a microplate reader as part of a wet lab extension of the final project
    • Efficiency of Plating: I’d use a plaque assay as part of a wet lab extension of the final project
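
As a rough illustration of how a plaque assay turns into numbers, the sketch below computes a phage titer from a plaque count and then an Efficiency of Plating as a ratio of two titers. The plaque counts, dilutions, plated volume, and function names are made-up placeholders, not project data. It assumes the common convention that EOP is the titer of the phage being tested divided by the titer of a reference phage measured the same way.

```python
def titer_pfu_per_ml(plaque_count, dilution_factor, volume_plated_ml):
    """Titer in PFU/mL from one countable plate.

    dilution_factor is the fraction of the original stock on the plate,
    e.g. 1e-6 for a 10^-6 dilution.
    """
    return plaque_count / (dilution_factor * volume_plated_ml)


def efficiency_of_plating(test_titer, reference_titer):
    """EOP as the ratio of the test titer to the reference titer."""
    return test_titer / reference_titer


if __name__ == "__main__":
    # Hypothetical counts: 120 plaques (reference) vs 30 plaques (mutant),
    # both from 0.1 mL of a 10^-6 dilution.
    reference = titer_pfu_per_ml(120, 1e-6, 0.1)
    mutant = titer_pfu_per_ml(30, 1e-6, 0.1)
    print(f"reference titer: {reference:.2e} PFU/mL")
    print(f"mutant titer:    {mutant:.2e} PFU/mL")
    print(f"EOP:             {efficiency_of_plating(mutant, reference):.2f}")
```

An EOP near 1 would suggest the mutated phage plates about as well as the reference, while a value well below 1 would suggest reduced plating efficiency.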

Supporting prompts for this section listed below:

Supporting Prompt | Model
Does Efficiency of Plating (EOP) mean the same thing as rates of lysing? Believe so. Do NOT hallucinate/make things up when replying to this prompt | Gemini
Explain how mutations are measured in this paper. Do NOT hallucinate/make things up when replying to this prompt | Gemini
In experiments like the one referenced in this paper, how are metrics like Lysis Rate and Efficiency of Plating traditionally measured? What tools are used? Do NOT hallucinate/make things up when replying to this prompt | Gemini

Homework: Waters Part I — Molecular Weight

  1. Based on the predicted amino acid sequence of eGFP (see below) and any known modifications, what is the calculated molecular weight?
  • The calculated molecular weight is 28,006.60 Da
  1. Calculate the molecular weight of the eGFP using the adjacent charge state approach described in the recitation. Select two charge states from the intact LC-MS data and:

  2. Determine z for each adjacent pair of peaks (n, n+1) using:

    • Chose 933.7349 and 965.9684 from the Figure 1 chart. Based on the formula, z = ~28.96, which rounds to a charge state of 29 for the 965.9684 peak
  3. Determine the MW of the protein using the relationship between m/z, MW, and z

    • MW = 27,983.85 Daltons
  4. Calculate the accuracy of the measurement using the deconvoluted MW from 2.2 and the predicted weight of the protein from 2.1

    • The result of the measurement I got was about -812 ppm (a relative error of -0.0008)
  5. Can you observe the charge state for the zoomed-in peak in the mass spectrum for the intact eGFP? If yes, what is it? If no, why not?

  • Believe the answer’s no because the zoomed-in peak, from my understanding, is not uniform. Instead it comprises a variety of different charge states that cannot be discerned as a single discrete value. Absolutely open to being wrong on this
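
The adjacent charge state steps above can be sketched in a few lines. The recitation formula is not reproduced in this write-up, so the sketch assumes the standard relationship for two neighboring peaks that differ by one proton, with a proton mass of 1.00728 Da; the function names are hypothetical. The peak values and the predicted weight are the ones quoted in the answers above.

```python
PROTON_MASS = 1.00728  # Da, mass of one proton (assumed charge carrier)


def charge_from_adjacent_peaks(mz_lower, mz_higher):
    """Charge of the higher-m/z peak from two adjacent charge-state peaks.

    Assumes the lower-m/z peak carries exactly one more proton.
    """
    return (mz_lower - PROTON_MASS) / (mz_higher - mz_lower)


def neutral_mass(mz, z):
    """Neutral molecular weight from an m/z value and its integer charge."""
    return z * (mz - PROTON_MASS)


def ppm_error(measured, predicted):
    """Relative mass error scaled to parts per million."""
    return (measured - predicted) / predicted * 1e6


if __name__ == "__main__":
    mz_lower, mz_higher = 933.7349, 965.9684  # peaks chosen from Figure 1
    z = round(charge_from_adjacent_peaks(mz_lower, mz_higher))
    mw = neutral_mass(mz_higher, z)
    print(f"z = {z}, MW = {mw:.2f} Da")
    print(f"error = {ppm_error(mw, 28006.60):.0f} ppm")
```

Small differences from the hand calculation come from the exact proton mass used and from rounding z to an integer before computing the MW. Note that the relative error has to be multiplied by 10^6 to be expressed in ppm.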

Supporting prompts for this section listed below:

Supporting Prompt | Model
In Section 1, question 2, what does the variable z stand for again? Do NOT hallucinate/make things up when addressing this prompt | Gemini
In the formula in the answer to the last prompt, what does the numerator represent? How does it correlate with the spikes in this image? Am aware that each spike represents an m over z ratio but am unsure where/how to begin. Do NOT hallucinate/make things up when replying to this prompt | Gemini
Not understanding how a spike with an 800 value on the left hand side can have a higher number of charges than a right hand spike with a value of 1000. Explain this to me, and if there was any hallucination(s) that have any implications for the results of the answer to the last prompt, say so. Do NOT hallucinate/make things up when replying to this prompt | Gemini
So to be clear in the hypothetical calculation in the answer to the last prompt, to get z you divided 848.97 (the numerator) by 875.44-848.97 (the denominator). Do NOT hallucinate/make things up when replying to this prompt | Gemini
Where is the predicted weight of the protein in 2.1 that is referenced in Section 1, question 3? Not exactly following. Do NOT hallucinate/make things up when replying to this prompt | Gemini
Think I’m doing something wrong. Got a theoretical MW of 28006.60 and an experimental MW of 26,409,038. I chose the 966.0037 and 966.0390 peaks. Apparently the results from the last equation should be in the 30-50 range and I got approx. 941 when I ran the equation. What did I do wrong/what am I missing? Do NOT hallucinate/make things up when replying to this prompt | Gemini

Homework: Waters Part II — Secondary/Tertiary Structure

  1. Based on learnings in the lab, please explain the difference between native and denatured protein conformations. For example, what happens when a protein unfolds? How is that determined with a mass spectrometer? What changes do you see in the mass spectrum between the native and denatured protein analyses
    • Believe native proteins are not manipulated in any way (i.e., their properties are not altered via heat or other impacts), while denatured proteins are proteins whose folded structure has been disrupted, usually by applying something like heat or acidity. This is determined in a mass spectrometer via the distribution of charges across the respective proteins. In the denatured protein in Figure 2, there appears to be a somewhat more Gaussian charge distribution, whereas the native protein below has a more spread out charge distribution
  2. Zooming into the native mass spectrum of eGFP from the Waters Xevo G3 QTof MS (see Figure 3), can you discern the charge state of the peak at ~2800? What is the charge state? How can you tell?
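    • Observing a +10 charge state. The two labeled peaks near ~2800 (2799.4199 and 2799.6365) differ by 0.2166, which on its own would give 1/0.2166 ≈ 4.6 if they were adjacent isotope peaks, so I’m relying on the mass instead: dividing the ~27,984 Da MW from Part I by the peak’s m/z of ~2799.4 gives z ≈ 10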
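
One way to sanity-check a charge state is to divide the known molecular weight by the observed m/z. The sketch below does this with the MW from Part I and the lower of the two peaks quoted above. It assumes protonation as the only charge carrier (proton mass 1.00728 Da), and the function names are hypothetical. The isotope-spacing helper is included for comparison and is called with a hypothetical spacing of 0.1, not a value read from Figure 3.

```python
PROTON_MASS = 1.00728  # Da, mass of one proton (assumed charge carrier)


def charge_from_known_mass(neutral_mw, mz):
    """Charge implied by a known neutral mass and an observed m/z."""
    return neutral_mw / (mz - PROTON_MASS)


def charge_from_isotope_spacing(spacing):
    """Charge implied by the m/z gap between neighboring isotope peaks."""
    return 1.0 / spacing


if __name__ == "__main__":
    z_mass = charge_from_known_mass(27983.85, 2799.4199)
    print(f"z from known mass: {z_mass:.3f}")
    # Hypothetical spacing, for comparison only.
    print(f"z from a 0.1 isotope spacing: {charge_from_isotope_spacing(0.1):.1f}")
```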

Supporting prompts for this section listed below:

Supporting Prompt | Model
Looking over Section 2 question 2 and am pretty sure the answer is that there isn’t a single answer to the charge state of the peak at ~2800, based on the inset in the Figure 3 and its somewhat parabolic-looking curve. Feel free to tell me if/where or how my thinking is off and do NOT hallucinate/make things up when replying to this prompt | Gemini

Homework: Waters Part III - Peptide Mapping - Primary Structure

  1. How many Lysines (K) and Arginines (R) are in eGFP? Please circle or highlight them in the eGFP sequence given in Waters Part I question 1 above.
  • There are 18 Lysines (K) and 6 Arginines (R) in eGFP. (See screenshot below) ImageandMeasurementScreenshot1
  1. How many peptides will be generated from tryptic digestion of eGFP? Believe 25 peptides will be generated
  • Navigate to https://web.expasy.org/peptide_mass/
  • Copy/paste the sequence above into the input box in the PeptideMass tool to generate expected list of peptides
  • Use Figure 4 below as a guide for the relevant parameters to predict peptides from eGFP.
  • Click “Perform the Cleavage” button in the PeptideMass tool and report the number of peptides generated when using trypsin to perform the digest.
    • Confirmed 25 generated peptides (see screenshot below) ImageandMeasurementScreenshot3 ImageandMeasurementScreenshot3
  1. Based on the LC-MS data for the Peptide Map data generated in lab (please use Figure 5a as a reference) how many chromatographic peaks do you see in the eGFP peptide map between 0.5 and 6 minutes? You may count all peaks that are >10% relative abundance.

    • I count 25 peaks
  2. Assuming all the peaks are peptides, does the number of peaks match the number of peptides predicted from question 2 above? Are there more peaks in the chromatogram or fewer?

    • I see a chromatographic peak match
  3. Identify the mass-to-charge (m/z) of the peptide shown in Figure 5b. What is the charge (z) of the most abundant charge state of the peptide (use the separation of the isotopes to determine the charge state). Calculate the mass of the singly charged form of the peptide based on its (m/z) and z.

    • The charge (z) of the most abundant charge state of the peptide equals ~2.0323. The mass of the singly charged form of the peptide is 1,067.4760
  4. Identify the peptide based on comparison to expected masses in the PeptideMass tool. What is mass accuracy of measurement? Please calculate the error in ppm.

    • Think the peptide’s in position 115-123 (peptide sequence FEGDTLVNR). The mass accuracy of measurement’s ~0.0161 ppm
  5. What is the percentage of the sequence that is confirmed by peptide mapping?

    • According to the PeptideMass tool, 90.7% of the sequence is confirmed by peptide mapping
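
The charge, singly charged mass, and ppm calculations asked for in this part can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the example inputs (0.5 isotope spacing, m/z 525.765, theoretical value 1050.5214) are hypothetical numbers, not values read from Figure 5b or the PeptideMass output.

```python
PROTON = 1.007276  # Da, mass of a proton


def charge_from_isotope_spacing(spacing_mz):
    """Isotope peaks sit ~1/z apart in m/z, so z is the rounded inverse."""
    return round(1.0 / spacing_mz)


def singly_charged_mass(mz, z):
    """Convert the m/z of an [M+zH]z+ ion to the [M+H]+ mass."""
    neutral_mass = z * (mz - PROTON)  # strip all z protons
    return neutral_mass + PROTON      # add one proton back


def ppm_error(observed, theoretical):
    """Mass accuracy in parts per million (note the 1e6 factor)."""
    return (observed - theoretical) / theoretical * 1e6


# Hypothetical example values (not taken from the lab data)
z = charge_from_isotope_spacing(0.5)
mh = singly_charged_mass(525.765, z)
print(z, round(mh, 4), round(ppm_error(mh, 1050.5214), 2))
```

Note the 1e6 factor in `ppm_error`: a raw fractional difference has to be scaled by it before being reported in ppm.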

Supporting prompts for this section listed below:

Supporting Prompt | Model
What is the relationship between a ’tryptic digestion of eGFP’ and the ’trypsin’ enzyme identified in the screenshot in Section III? Do NOT hallucinate/make things up when replying to this prompt | Gemini
So if they’re 18 Lysines (K) and 6 Arginines (R), then that would mean that there would be 24 cuts made and 24 peptides generated correct? Do NOT hallcuinate/make things up when replying to this prompt | Gemini
Not sure where to begin in terms of breaking down/starting to work on Question 5 in Section III. Any thoughts on where or how to begin? Do NOT hallucinate/make things up when replying to this prompt | Gemini
Believe you’re looking at the wrong question, specifically question 4 in Section III as opposed to question 5. Do not hallucinate and provide guidance on how to begin tackling question 5 in Section III | Gemini
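
On the "24 cuts, 24 peptides?" question in the prompts above, a small Python sketch shows why n cuts give n + 1 pieces. It uses a made-up toy sequence rather than the eGFP sequence, the function name is hypothetical, and the optional "no cut before proline" rule is a common simplification rather than something taken from the PeptideMass settings.

```python
def tryptic_peptides(sequence, skip_before_proline=False):
    """Split a sequence after every K or R (optionally not when followed by P)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in "KR" and not is_last:
            if skip_before_proline and sequence[i + 1] == "P":
                continue  # leave the K/R-P bond uncut
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])  # whatever is left after the last cut
    return peptides


toy = "MAKGGRPLKVE"  # toy sequence with 3 K/R residues (not eGFP)
print(tryptic_peptides(toy))
print(tryptic_peptides(toy, skip_before_proline=True))
```

With three cut sites the toy sequence yields four peptides, the same way 24 cut sites would yield 25.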

Homework: Waters Part IV - Oligomers

  • 7FU Decamer
    • Sits directly to the left of the 4.013 peak between 0 and 5 MDa axis in Figure 7
  • 8FU Didecamer
    • Very tall 8.33 MDa peak sitting between 5 and 10 MDa on the MDa axis in Figure 7
  • 8FU 3-Decamer
    • The 12.67 MDa peak sitting between 10 and 15 MDa on the MDa axis in Figure 7
  • 8FU 4-Decamer
    • The tiny peak sitting between 15 and 20 MDa on the MDa axis in Figure 7

Supporting Prompt | Model
Looking over Section IV of this page, it appears as if the subunit masses are in kilo-Daltons and the Mass Spectrum readouts in Figure 7 are in Mega-Daltons, correct? Blanking on the relationship between kilo and Mega units of measurement. Do NOT hallucinate/make things up when replying to this prompt | Gemini

Homework: Waters Part V - Did I Make GFP?

  • See screenshots below:
 | Theoretical | Observed/measured on the Intact LC-MS | PPM
Molecular weight (kDa) | ImageandMeasurementScreenshot4 | 27,983.85 Daltons | -0.0008 ppm

Week 11 HW: Bioproduction and Cloud Labs

Gemini_Generated_Image_87y9eq87y9eq87y9

Part 1: Global Pixel Artwork Cloud Lab Contribution

Made the following contributions to the Global Pixel Artwork Cloud Lab (see screenshots below)

Week11Screenshot1 Week11Screenshot2 Week11Screnshot3 Week11Screenshot4

Global Pixel Artwork Contributions (see above). Edited 4 pixels in the upper right hand corner of the image (changed them to sfGFP)

Week 12 HW: Bioproduction & Cloud Labs Part 2

Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork

Gemini_Generated_Image_953e9b953e9b953e
  1. Contribute at least one pixel to this global artwork experiment before the editing ends on Sunday 4/19 at 11:59 PM EST.

    • Contributed 4 pixels to the global artwork experiment on Saturday 4/18
  2. Make a note on your HTGAA webpages including:

    • what you contributed to the community bioart project

      • I contributed 4 pixels to the community bioart project. I changed 4 pixels in the upper right plate to sfGFP (see screenshots below)

        Week11Screenshot1 Week11Screenshot2 Week11Screnshot3

        Week11Screenshot4 Community bioart project contributions – changed 4 upper right plate pixels to sfGFP (see above)

    • what you liked about the project

      • The project was a nice opportunity to contribute to a larger HTGAA effort. It was nice to see the creativity of the community at play! I also appreciated Ronan’s page for contributing pixels – very intuitive and easy to understand
    • what about this collaborative art experiment could be made better for next year

      • I’d probably say a bit more advance notice might have been useful. Perhaps a bit more clarity on ground rules. These are relatively minor nitpicks in the grand scheme of things

Part B: Cell-Free Protein Synthesis | Cell-Free Reagents

  1. Referencing the cell-free protein synthesis reaction composition (the middle box outlined in yellow on the image above, also listed below), provide a 1-2 sentence description of what each component’s role is in the cell-free reaction.
    • E. coli Lysate
      • BL21 (DE3) Star Lysate (includes T7 RNA Polymerase)
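        • I think BL21 (DE3) Star Lysate’s role is to provide the E. coli cellular machinery (ribosomes, tRNAs, translation factors, and metabolic enzymes) needed to turn the DNA template into fluorescent protein, along with the T7 RNA polymerase that transcribes that template into mRNA. Basically it serves as the engine of the reaction, for lack of a better word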
  • Salts/Buffer
    • Potassium Glutamate
      • Supplies the potassium ions the translation machinery needs, with glutamate as the counterion. It helps aid the elongation portion of the RNA –> protein translation process. In this context, elongation refers to the ribosome adding amino acids one at a time to the growing protein chain
    • HEPES-KOH pH 7.5
      • Important buffer. It holds the reaction near pH 7.5 and provides extra buffering capacity as byproducts build up over a prolonged/longer than normal reaction
    • Magnesium Glutamate
      • Helps stabilize ribosome structure. It also helps neutralize the negative charge of the mRNA and DNA backbones
    • Potassium phosphate monobasic
      • Potassium and phosphate source and buffer for the reaction, helping stabilize pH levels/keep them nominal during the reaction. Monobasic means one of phosphoric acid’s acidic hydrogens is replaced by potassium (KH2PO4)
    • Potassium phosphate dibasic
      • Another potassium and phosphate source and buffer for the reaction, helping stabilize pH levels/keep them nominal during the reaction. Dibasic means two of phosphoric acid’s acidic hydrogens are replaced by potassium (K2HPO4)
  • Energy / Nucleotide System
    • Ribose
      • Key cell energy source. It’s crucial for creating adenosine triphosphate (ATP) the primary form of energy within cells
    • Glucose
      • Another essential energy source for cell processes. Also helps produce ATP
    • AMP
      • Adenosine monophosphate (AMP) is a nucleotide precursor. Enzymes in the lysate phosphorylate it up to ATP, which powers the reaction and is also one of the building blocks of RNA
    • CMP
      • Cytidine monophosphate (CMP) assists with RNA synthesis. It’s phosphorylated to CTP, one of the building blocks polymerized into RNA
    • GMP
      • Guanosine monophosphate (GMP) is key for RNA synthesis. It’s phosphorylated to GTP, which is polymerized into RNA and also helps power translation
    • UMP
      • Uridine monophosphate (UMP) is a pyrimidine compound. It’s phosphorylated to UTP, which is also polymerized into RNA.
    • Guanine
      • Nucleic acid base that pairs with cytosine in double-stranded DNA. It can be converted into guanine nucleotides, which are used to build RNA during the transcription process.
  • Translation Mix (Amino Acids)
    • 17 Amino Acid Mix
      • The mix provides a group of compatible amino acids for translation by ribosomes. These are the materials ribosomes work with for translation into proteins
    • Tyrosine
      • Assists with protein synthesis as one of the amino acid building blocks. It’s also a site for phosphate group post-translational modification (PTM).
    • Cysteine
      • An amino acid used by ribosomes to build protein chains during the translation process. Helps with protein folding and stability.
  • Additives
    • Nicotinamide
      • Helps manage the process of cellular nutrients converting themselves to ATP and vice versa. Think this means it also might help with reaction energetic stability
  • Backfill
    • Nuclease Free Water
      • Ensures appropriate reaction concentration by bringing the reaction up to its final volume. Being nuclease free, it also ensures no contaminating nucleases degrade the DNA template or mRNA
  2. Describe the main differences between the 1-hour optimized PEP-NTP master mix and the 20-hour NMP-Ribose-Glucose master mix shown in the Google Slide above.
    • Believe the main difference between the 1-hour optimized PEP-NTP master mix and the 20-hour NMP-Ribose-Glucose master mix is that the PEP-NTP mix optimizes for the fast production of fluorescent proteins, while the NMP-Ribose-Glucose mix optimizes for fluorescent protein production across a longer timespan. While this might seem obvious based on the slide content, my understanding is that the PEP-NTP mix is more energy intensive (i.e., it burns through its energy supply faster to create fluorescent protein output), while the NMP-Ribose-Glucose mix is comparatively less energy intensive (i.e., it releases its energy more gradually across its 20-hour reaction timespan). Going by the mix names, the PEP-NTP mix starts from ready-to-use NTPs with PEP as the energy source, while the NMP-Ribose-Glucose mix starts from NMPs, ribose, and glucose that the lysate has to convert first. So, in essence, I think the main difference between these reactions comes down to how quickly they consume their respective energy supplies

Supporting prompts for this section listed below:

Supporting Prompt | Model
Under the ‘Salts/Buffer’ subsection under the 1st question in Part B, I think glutamates help with the creation of a given chemical (in this case potassium or magnesium). Not sure how/why these salts/buffers are relevant in a cell-free protein synthesis reaction. Any insights you might have into the roles of the glutamates here, as well as all the various types of potassium would be useful. Do NOT hallucinate/make things up when replying to this prompt | Gemini
When we say something is a buffer for a chemical reaction, what exactly do we mean when we say that? Do NOT hallucinate/make things up when replying to this prompt and keep the answer relatively succinct and plainspoken (4-5 sentences max) | Gemini
Looking at the ‘Energy/Nucleotide’ subsection under the 1st question in Part B, and given a passing understanding of genomics, I understand that guanine (G) pairs with Cytosine (C). In the context of this subsection, does this mean that a cell-free Guanine mix translates or outputs Cysteine in some may? Is that a relationship between these two things? If so, what’s the relationship? Do NOT hallucinate/make things up when replying to this prompt and keep the answer relatively succinct and plainspoken (5-6 sentences max) | Gemini
Looking at the ‘Backfill’ subsection under the 1st question in Part B, my understanding of nuclease free water’s function in a cell-free protein synthesis reaction is to basically provide a clean backdrop for the reaction to occur, or to determine what’s what post-reaction. Are either of those high-level explanations correct or sensible? Do NOT hallucinate/make things up when replying to this prompt and keep the answer relatively succinct and plainspoken (4-5 sentences max) | Gemini
Doing a sanity check: ribosomes turn amino acids into proteins as part of the translation process, correct? Do NOT hallucinate/make things up when replying to this prompt and keep the answer relatively succinct and plainspoken (2-3 sentences max) | Gemini

Part C: Planning the Global Experiment | Cell-Free Master Mix Design

  1. Given the 6 fluorescent proteins we used for our collaborative painting, identify and explain at least one biophysical or functional property of each protein that affects expression or readout in cell-free systems. (Hint: options include maturation time, acid sensitivity, folding, oxygen dependence, etc) (1-2 sentences each)

    • sfGFP
      • It has a relatively quick maturation time (13.6 min.). This means researchers can find out rather quickly whether or not the cell-free reaction occurred successfully if they were solely measuring this protein’s fluorescence
    • mRFP1
      • It has a comparably longer maturation time (60 min.). This means researchers might need to wait a bit to determine whether or not the cell-free reaction occurred successfully if they were solely measuring this protein’s fluorescence
    • mKO2
      • Its acid sensitivity (pKa) is 5.5. This means that if pH drops from the typical 7.5 of a common cell-free reaction toward 5.5, the fluorescence fades, with roughly half the signal lost at the pKa
    • mTurquoise2
      • Its acid sensitivity (pKa) is 5.5 and its maturation time is also relatively quick (33.5 min.). This means it’s relatively resistant to moderate drops in pH (i.e., a fluorescence readout will still occur until pH gets close to 5.5) and a researcher can discern relatively quickly whether or not a successful reaction occurred if they were solely measuring this protein’s fluorescence
    • mScarlet_1
      • Its acid sensitivity (pKa) is 5.3 and its maturation time is comparatively long (174 min.). This means the protein tolerates pH drops slightly better than the two proteins above (its signal only halves around pH 5.3), but it will take several hours for a researcher to discern whether or not a successful reaction occurred if they were solely measuring this protein’s fluorescence
    • Electra2
      • It’s the second brightest of all the proteins in this list. It has a 61.48 brightness readout (the brightest protein is mScarlet_1 with a 70.0 brightness readout)
  2. Create a hypothesis for how adjusting one or more reagents in the cell-free mastermix could improve a specific biophysical or functional property you identified above, in order to maximize fluorescence over a 36-hour incubation. Clearly state the protein, the reagent(s), and the expected effect.

    • I hypothesize that if I increase ribose and/or glucose reagent concentrations in the cell-free mastermix, it will increase sfGFP brightness over a 36-hour incubation period relative to its nominal brightness rate (54.15)
  3. The second phase of this lab will be to define the precise reagent concentrations for your cell-free experiment. You will be assigned artwork wells with specific fluorescent proteins and receive an email with instructions this week (by April 24). You can begin composing master mix compositions here 1

    • Test reagent master mix compositions based on the hypothesis above shown in the screenshots below

    Week12Cell-FreeRiboseIncreaseComposition Week12Cell-FreeRiboseIncreaseComposition Test master mix composition (increase ribose)

    Week12Cell-FreeGlucoseIncreaseComposition Week12Cell-FreeGlucoseIncreaseComposition Test master mix composition (increase glucose)

  4. The final phase of this lab will be analyzing the fluorescence data we collect to determine whether we can draw any conclusions about favorable reagent compositions for our fluorescent proteins. This will be due a week after the data is returned (date TBD!). The reaction composition for each well will be as follows:

    • 6 μL of Lysate
    • 10 μL of 2X Optimized Master Mix from above
    • 2 μL of assigned fluorescent protein DNA template
    • 2 μL of your custom reagent supplements
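
The per-well recipe above adds up to a 20 μL reaction, which also sets how much any custom supplement gets diluted. Here is a minimal Python sketch of that arithmetic; the 100 mM supplement stock is a hypothetical number chosen only for illustration, and the names are made up.

```python
# Per-well volumes (uL) as listed in the reaction composition
REACTION_UL = {
    "lysate": 6.0,
    "2X master mix": 10.0,
    "DNA template": 2.0,
    "custom supplements": 2.0,
}


def total_volume(components):
    """Sum of all component volumes in the well."""
    return sum(components.values())


def final_concentration(stock_conc, added_ul, total_ul):
    """C1*V1 = C2*V2 solved for the final concentration C2."""
    return stock_conc * added_ul / total_ul


total = total_volume(REACTION_UL)
# Hypothetical 100 mM supplement stock, diluted into the full well
print(total, final_concentration(100.0, REACTION_UL["custom supplements"], total))
# The 2X master mix lands at 1X in the final reaction
print(final_concentration(2.0, REACTION_UL["2X master mix"], total))
```

Because the supplement slot is 2 μL out of 20 μL, whatever goes into it ends up at one tenth of its stock concentration in the well.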

Supporting prompts for this section listed below:

Supporting Prompt | Model
In the ‘Attributes’ table in this tab, what does the ‘Maturation (min.)’ value mean? Why does it matter in practical terms? Do NOT hallucinate/make things up when replying to this prompt and keep the answer relatively succinct and plainspoken (3-4 sentences max) | Gemini
On this tab, when we say mK02 has moderate acid sensitivity, what does that actually mean? What acids is it sensitive to? Any type of acid? Why does acid sensitivity matter in practical terms? Do NOT hallucinate/make things up when replying to this prompt and keep the answer relatively succinct and plainspoken (5-6 sentences max) | Gemini
So based on mK02’s moderate acid sensitivity readout (5.5) would we say that that readout is suboptimal/undesirable relative to normal cell-free reaction pH? Do NOT hallucinate/make things up when replying to this prompt and keep the answer relatively succinct and plainspoken (2-3 sentences max) | Gemini
Looking over the subsections in Part B, does every single subcategory consist of reagents, or do only some subcategories consist of reagents? Do Do NOT hallucinate/make things up when replying to this prompt and keep the answer relatively succinct and plainspoken (3-4 sentences max) | Gemini
This might be a dumb question. If I give a cell-free reaction more power in the form of higher concentrations of some of the ATP/cellular energy-associated reagents under the ‘Energy/Nucleotide System’ subsection, could I expect a decreased maturation time for fluorescent protein readout/indication of fluorescence? Do Do NOT hallucinate/make things up when replying to this prompt and keep the answer relatively succinct and plainspoken (5-6 sentences max) | Gemini
In the response to the last prompt, the energy reagents listed were AMP, CMP, GMP, or UMP. What if ribose or glucose levels were increased? Could I expect a decreased maturation time for fluorescent protein readout/indication of fluorescence for sfGPF? Do Do NOT hallucinate/make things up when replying to this prompt and keep the answer relatively succinct and plainspoken (5-6 sentences max) | Gemini
Understood. What about the impact of increased energy reagents on sfGFP brightness? If I added increased concentrations of some of the ATP/cellular energy-associated reagents under the ‘Energy/Nucleotide System’ subsection, such as Ribose or Glucose, in principle could I expect to get brighter/more fluorescent sfGFP as an output? Do Do NOT hallucinate/make things up when replying to this prompt and keep the answer relatively succinct and plainspoken (5-6 sentences max) | Gemini

Week 13 HW: Scaling Health Innovation

Master Mix Concentrations

  • See test Master Mix Concentrations below:

Labs

Lab writeups:

  • Week 1 Lab: Introduction to Pipetting and Dilutions


  • Week 2: DNA Gel Art


  • Week 3 Lab: Opentrons Art


  • Week 4 Lab: Protein Design Part I

    Lab Information Lab work can be found within the Week 4 HW Assignment in the hyperlink below 1 https://pages.htgaa.org/2026a/jason-ross/homework/week-04-hw-protein-design-part-i/index.html ↩︎

  • Week 5 Lab: Protein Design Part II

    Lab Information Lab work can be found within the Week 5 HW Assignment in the hyperlink below 1 https://pages.htgaa.org/2026a/jason-ross/homework/week-05-hw-protein-design-part-ii/index.html ↩︎

  • Week 6 Lab: Gibson Assembly


  • Week 7 Lab: Neuromorphic Circuits

    LOREM IPSUM

  • Week 9 Lab: Cell-Free Systems

    LOREM IPSUM

Subsections of Labs

Week 1 Lab: Introduction to Pipetting and Dilutions

Overview

    • Date(s): 03/02/26 – 03/03/26
    • Notes: Reviewed lab materials outlined in ‘Overview’ protocol section (pipette types and tips, tubes, tube holders, and stock reagents) and concentration basics with Kate Carline (William & Mary Node TA). Discussed lab material functions and reviewed the basics of dilution math and pipetting technique.
    • Supporting Picture(s):
      • Pipetting_Tuesday_10

Part 1: Mixing Color

  • Prepared tubes with red, yellow, and blue food coloring solution
  • Marked 6 tubes with red, yellow, blue, red/yellow, blue/yellow, and red/blue combinations
  • Added 500 uL to each red, yellow, blue, red/yellow, blue/yellow, and red/blue combination solution tube
  • See above – made combinations by mixing colors
  • See above
  • See above
  • Dispersed concentrations onto wax paper to make design in lieu of petri plate
    • Supporting Part 1 and Part 2 photos below

      Pipetting_Mon_1 Pipetting_Tuesday_1 Pipetting_Tuesday_2 Pipetting_Tuesday_3 Pipetting_Tuesday_4 Pipetting_Tuesday_5 Pipetting_Tuesday_6 Pipetting_Tuesday_7 Pipetting_Tuesday_8 Pipetting_Tuesday_9

      Practiced basic pipetting, mixing colors, and performing serial dilution

Part 2: Performing Serial Dilution

  • Performed serial dilutions on MS/food coloring
  • Made a final serial dilution reaction based on the information in the pre-lab
    • See pictures above

Week 2: DNA Gel Art

DNA Gel Art

Protocol Part 0: Designing My Gel Art / Expected Results and Walkthrough

Created a virtual digest in Benchling as a basis for DNA Gel Art (see below)

virtual_digest_sequence_Mycobacterium-phage-Kampy-complete-sequence-51378-bp-including-10-base-3-overhang-CGGCCGGTAA-Cluster-A4_[HiddenHello]

Benchling Virtual Digest (A Hidden Hello)

Protocol Part 1a: Preparing a 1% Agarose Electrophoresis Gel

Prepared a 1% Agarose Electrophoresis Gel (see below)

DNAGelArt_Tuesday_8 DNAGelArt_Tuesday_10 DNAGelArt_Tuesday_11

Protocol Part 1a: Restriction Digest

Ran Restriction Digest (see images below)

DNAGelArt_Tuesday_2 DNAGelArt_Tuesday_3 DNAGelArt_Tuesday_4 DNAGelArt_Tuesday_5

Protocol Part 2: Gel Run

Performed Gel Run (see mp4s below)

output1_GelArt output2_GelArt

Protocol Part 3: Imaging My Results With a Transilluminator

Took gel and prepared to image results (see below)

DNAGelArt_Tuesday_12

Final Results

Final result (see below)

DNAGelArt_Tuesday_13

//

Benchling Protocol Notes (sourced from William & Mary Node TA, Kate Carline)

  • NotI-HF: rCutSmart, incubates at 37C, 20,000 U/ml = 10 U/ul
  • KpnI (Promega): Buffer J, incubates at 37C, 12,000 U/ml = 12 U/ul
  • SalI (Promega): Buffer D, incubates at 37C, 10,000 U/ml = 20 U/ul

1.5 ug DNA

  • 324 ng/ul of Kampy B: 4.62 ul DNA for NotI and KpnI
  • 141.4 ng/ul Kampy C (Nanodrop after running out of Kampy B): 10.61 ul for SalI

15 units of enzyme

  • 1.5 ul NotI-HF
  • 1.25 ul KpnI
  • 0.75 ul SalI

2 ul of each 10X Buffer

Remaining to 20 ul NFW

  • 11.88 ul NotI
  • 12.13 ul KpnI
  • 6.64 ul SalI

Spin down briefly in picofuge

Incubated for 30 min at 37C

1% agarose gel

  • 2 ul dye with 10 ul reaction
  • 40 min
  • 185 mA
  • 150V

//
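
The volumes in the protocol notes above follow from simple division and subtraction, sketched below in Python. The function name is hypothetical, and the enzyme concentrations are taken as the U/ul figures written in the notes; the U/ml figures listed next to NotI-HF and SalI do not convert to those same values, so treat the concentrations here as an assumption rather than a checked number.

```python
def digest_volumes(dna_ng, dna_ng_per_ul, enzyme_units, enzyme_u_per_ul,
                   buffer_ul=2.0, total_ul=20.0):
    """Volumes (uL) of DNA, enzyme, and nuclease-free water for one digest."""
    dna_ul = dna_ng / dna_ng_per_ul            # mass wanted / stock concentration
    enzyme_ul = enzyme_units / enzyme_u_per_ul  # units wanted / stock activity
    water_ul = total_ul - dna_ul - enzyme_ul - buffer_ul
    return {
        "dna": round(dna_ul, 2),
        "enzyme": round(enzyme_ul, 2),
        "water": round(water_ul, 2),
    }


# 1.5 ug DNA and 15 units of enzyme per reaction, U/ul as written in the notes
print(digest_volumes(1500, 324, 15, 10))    # NotI-HF with Kampy B
print(digest_volumes(1500, 324, 15, 12))    # KpnI with Kampy B
print(digest_volumes(1500, 141.4, 15, 20))  # SalI with Kampy C
```

The results agree with the recorded volumes to within about 0.01 ul, with the small differences coming from rounding.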

Week 3 Lab: Opentrons Art

Opentrons Art Lab

Part 1: Fluorescent Bacteria & Black Agar Script

See Fluorescent Bacteria & Black Agar Script Colab Notebook Script here 1 2

Part 2: Submission and Running Your Protocol

Traveled to William & Mary Node to complete this lab, as well as the Pipetting and DNA Gel Art Labs. During my time working this lab, I:

  • Selected plates for the Opentrons robot
  • Operated the Opentrons robot with the help of William & Mary students
  • Ran my Opentrons code
  • Dispensed Opentrons tips

Protocol photos and mp4 video loops shown below:

Opentrons_Monday_1 Opentrons_Monday_2 Opentrons_Monday_3 Opentrons_Monday_4 Opentrons_Monday_5 Opentrons_Monday_6 Opentrons_Monday_7 Opentrons_Monday_8 Opentrons_Monday_9 Opentrons_Monday_10 Opentrons_Monday_11 Opentrons_Monday_12 Opentrons_Monday_13 Opentrons_Monday_14 Opentrons_Monday_16 output2_Opentrons output1_Opentrons-2 Opentrons_Monday_17

Part 3: Final Result

Here’s the final result, showing my Opentrons Art!

IMG_4967

  1. https://colab.research.google.com/drive/1-pgSJt_aF9MydtG0szxz2YKoogNRLRhH?usp=sharing ↩︎

  2. Gemini was used to help code a good chunk of the Opentrons code. At William & Mary, we did need to re-configure the code slightly to make it work on the Opentrons ↩︎

Week 6 Lab: Gibson Assembly

Gibson Assembly Lab

Pre-Lab: Primer and PCR (Part 1 of 3)

  • Read this section and scanned the NuPack software hyperlink

Pre-Lab: Gibson Assembly (Part 2 of 3)

  • Read this section

Pre-Lab: DpnI

  • Read this section

Pre-Lab: Plasmid Transformation

  • Read this section

Part 1: Polymerase Chain Reaction (PCR)

Prepared PCR (see photos below)

GibsonLab_1 GibsonLab_2 GibsonLab_3 GibsonLab_4 GibsonLab_5 GibsonLab_6 GibsonLab_7 GibsonLab_8 GibsonLab_9 GibsonLab_10

Part 1a: DpnI Digest

Completed DpnI Digest (see photos below)

GibsonLab_11 GibsonLab_12

Part 1b: DNA Purification and Quantification

Purified and quantified DNA. It seems at this point that I did something wrong in one of the preceding protocol stages with my non-Blue chosen color, so instead of proceeding with both colors, I only proceeded with Blue, as the other color did not have an adequate concentration. See photos below for more documentation of this protocol step

GibsonLab_13 GibsonLab_14 GibsonLab_15 GibsonLab_16 GibsonLab_17

Part 2a: Gibson Assembly

Completed Gibson Assembly. Incubated reaction per protocol (see photo below)

GibsonLab_18

Part 2b: Transformation

Completed Transformation protocol step

GibsonLab_19 GibsonLab_20 GibsonLab_21

Final Results

LOREM

//

Supporting prompts for analyzing the lab protocol listed below for reference

Supporting Prompt | Model
There’s a part of this page that says “After PCR, we treat each reaction with DpnI to eliminate carryover of the original mUAV plasmid.”. That’s Dpnl, not DpnI right? Is it a lower-case l or a capital I? | Gemini 2.5 Flash
Remind me what an oligonucleotide is again in simple terms. Keep the response to this prompt short and do NOT hallucinate/make anything up | Gemini 2.5 Flash
Within the context of this lab, what is a ‘HiFi assembly method’? What is an ‘overhang’? Keep the response to this prompt short and do NOT hallucinate/make anything up | Gemini 2.5 Flash
How does the ’exonuclease “chews back” one strand of the double-stranded DNA. | Gemini 2.5 Flash
I guess I want to understand how this exonuclease works at the chemical level in relatively simple terms. Can you explain that for me? Do NOT hallucinate/make anything up when replying to this prompt | Gemini 2.5 Flash
When the lab refers to ‘Secondary Structures’, what does that mean? Explain in relatively simple terms and do NOT hallucinate/make things up | Gemini 2.5 Flash
Within the context of this lab, tell me what the ‘pUC19 backbone’ is? What does ‘pUC19’ stand for? Answer this prompt in relatively simple terms, keep the response relatively short, and do NOT hallucinate/make things up | Gemini 2.5 Flash
Believe when this lab refers to ‘molar ratios’ it means the ratio of molecules to one another. Is this correct or am I mistaken? Do NOT hallucinate when replying to this prompt and answer this prompt in relatively simple terms | Gemini 2.5 Flash
When the lab states, ‘DpnI recognizes the sequence GATC only when it is methylated’, the ‘GATC’ refers to a Guanine-Adenine-Thyme-Cytosine combination/piece of DNA, and methylation is a chemical process related to said combination/piece of DNA, correct? Do we refer to a 4-letter combination/piece of DNA as a codon, or am I mistaken? Explain in simple terms what methylation is, the actually technical name of this 4-letter combination/piece of DNA, and if there was anything about my thinking/statements so far that’s off. Do NOT hallucinate/make things up when replying to this prompt and keep things relatively simple and short whenever possible | Gemini 2.5 Flash
What does the ‘SOC’ in ‘SOC growth media’ stand for? Do NOT hallucinate/make things up when replying to this prompt and answer this prompt in relatively simple terms | Gemini 2.5 Flash
In the context of this lab, what does ‘uM’ stand for/mean? What does ‘uL’ stand for/mean? Which represents a larger volume? Explain this in relatively simple terms, show me where these terms fit on an overall scale of liqiuid volumes, and do NOT hallucinate/make things up when answering this prompt | Gemini 2.5 Flash
What is ‘silica adsorption’? Do NOT hallucinate/make things up when answering this prompt and keep the answer relatively concise | Gemini 2.5 Flash
When the protocol says to ‘gel at ~ 100 mV for 15 min.’, the ‘mV’ being referred to is some type of voltage, correct? Do NOT hallucinate/make things up when answering this prompt | Gemini 2.5 Flash
How does Chloramphenicol relate to amilCP? Are they the same? Clarify this relationship and do so in relatively simple terms and do NOT hallucinate/make things up when doing so | Gemini 2.5 Flash

Week 7 Lab: Neuromorphic Circuits

LOREM IPSUM

Week 9 Lab: Cell-Free Systems

LOREM IPSUM

Projects

Final projects:

Final Project - Ideas/Drafts

Idea: Space Phage Supreme

Section 1: Abstract

  • Phage therapy’s potential to treat novel bacterial infections has generated increased attention in recent years, both terrestrially and in space health research. Recent research from the University of Wisconsin-Madison demonstrated the unique impacts of microgravity on interactions between Escherichia coli bacteria and T7 bacteriophage, particularly on the distribution of genetic mutations across the T7 bacteriophage genome 1. Understanding how microgravity shapes bacteriophage mutations and bacteriophage-bacteria interactions could yield phage therapy insights terrestrially and for future space travelers. Accordingly, this research aims to extend the University of Wisconsin-Madison research by proactively applying analogous microgravity-derived phage mutations in a new bacteriophage. The working hypothesis of this research is that proactively applying analogous microgravity-derived phage mutations in a new bacteriophage will yield a similar increase in bacteriophage fitness and lysing.

To implement this analogous research, the plan is to:

  • Perform targeted, direct engineering involving Ribosome Binding Site (RBS) and tip-domain substitutions in Mycobacterium smegmatis (M. smegmatis) bacteriophage to improve phage fitness and lysing, via phage recombination or CRISPR-based techniques
  • Test the host range and infection efficiency of the mutated M. smegmatis bacteriophage

The methods for achieving the specific aims referenced above include:

  • Single-site mutagenesis of relevant RBS/tip domain proteins
  • Deep sequencing to map variant fitness pre/post-selection
  • Whole genome sequencing for de novo mutation identification
  • Host plating assays for lysing analysis

Section 2: Project Aims:

  • Experimental Aim: The first aim of this project is to determine whether and how mutations analogous to those observed in microgravity-exposed E. coli bacteriophage manifest in M. smegmatis bacteriophage, using single-site mutagenesis of relevant RBS/tip domain proteins followed by deep and whole genome sequencing as necessary. This will yield insights into M. smegmatis bacteriophage fitness and lysing.

  • Developmental Aim: Follow-on experiments showcasing comparable mutations across several additional bacteriophage species, with the aim of discerning which microgravity bacteriophage mutations can instigate positive terrestrial human health outcomes (specifically improved bacteriophage fitness and lysing)

  • Visionary Aim: ‘Plug and play’ (ideally bidirectional [terrestrial and space-based]) catalog of microgravity-derived high-fitness bacteriophages for use against many forms of bacterial infection. This would help confront the challenge of novel (or routine) space-based bacterial infections and the lack of corresponding antibiotics.

Section 3: Background:

  • Briefly summarize two peer-reviewed research citations relevant to your research
    • In ‘Microgravity reshapes bacteriophage–host coevolution aboard the International Space Station’, University of Wisconsin–Madison researchers reported on the dynamics between T7 bacteriophage and E. coli after microgravity exposure aboard the International Space Station (ISS) 2. Their results indicated delayed phage activity followed by the emergence of several novel mutations across the bacteriophage genome which, when replicated terrestrially, improved lysing. In ‘Impact of simulated microgravity in short-term evolution of an RNA bacteriophage’, researchers from the Centro de Astrobiología and Universidad Autónoma de Madrid observed similarly delayed phage activity when an RNA bacteriophage was exposed to a terrestrial simulated microgravity environment 3. Both studies indicate novel phage activity due to microgravity exposure.
  • Explain how your project is novel or innovative
    • My project seeks to extend findings on bacteriophage microgravity exposure by determining how microgravity-derived phage mutations can be proactively applied to improve terrestrial phage fitness (including but not limited to lysing). If successful, this project will help demonstrate the utility of microgravity in terrestrial phage therapy development. It might also help create a bidirectional virtuous cycle between microgravity-derived insights, terrestrial phage therapy, and non-terrestrial phage therapies for long-duration space missions.
  • Explain why your project matters and what impact it could have
    • This project attempts to solve the problem of proactively improving phage fitness. Improving phage fitness matters because it is crucial to making phage therapies a viable alternative to traditional antibiotics, particularly in remote, resource-constrained environments such as a long-term space exploration mission. Creating the bidirectional virtuous cycle described in the previous answer could advance public health and wellbeing in several ways. It could combat antimicrobial-resistant (AMR) bacteria while giving humanity a means of dealing with space-based infections when large volumes of antibiotics or standard pharmaceuticals are in short supply or logistically infeasible to transport. In helping establish this bidirectional virtuous cycle, the research will advance our knowledge of customizing terrestrial bacteriophage for improved fitness based on microgravity-derived insights.
  • Describe the ethical implications associated with your project and identify relevant ethical principles (i.e., non-maleficence, beneficence, justice, or responsibility)
    • Improving phage fitness might have unintended consequences, as highly fit phage could lyse bacteria that are important to the function of human microbiomes. Therefore, this project intends to follow the principle of non-maleficence. In practice, this means the research will be conducted in low biosafety level (BSL) environments on M. smegmatis and will focus on improving phage fitness to combat AMR bacteria. The research will also uphold the principle of beneficence by making its results publicly available. All researchers associated with this project will comply with all applicable statutes and maintain lab safety at all times. While publicly sharing this research could have unintended consequences, researchers will share only what is strictly necessary within the scope of the project’s research aims. The project will not involve any discussion of using bacteriophages to deliberately alter human microbiomes for adverse health outcomes.

Section 4: Experimental Design, Techniques, Tools, and Technology

  • Create a detailed experimental plan for your final project. Include a timeline for each part of your experimental plan (i.e., how long you expect each step in your final project to take)

NOTE: The plan below was generated with Claude using the tool at this HTGAA hyperlink 4 and is tentative


The experimental workflow spans 4–6 weeks and is divided into four phases: (1) construct design and ordering, (2) library generation and recombineering, (3) automated screening, and (4) validation and analysis.

Detailed Workflow (≥15 Steps)

Step 1 — Bioinformatic identification of D29 tail spike host-range determinant residues

  • Method: Structural homology modeling of D29 gp10 (tail spike) and minor tail proteins using AlphaFold2; alignment with T7 tip domain; identification of surface-exposed residues likely involved in glycolipid receptor contact
  • Automation: Computational (no lab automation required)
  • Expected result: 8–15 target residue positions for combinatorial diversification

Step 2 — Design of combinatorial tail spike variant library

  • Method: Design of 96–192 D29 tail spike variants encoding combinations of microgravity-inspired substitutions (hydrophilic/aromatic substitutions at receptor-contact positions, guided by Huss et al. microgravity fitness preferences); design includes full upstream/downstream homology arms (500 bp each) for recombineering
  • Automation: Computational design; Twist Bioscience order submission
  • Microplate: N/A (ordering stage)
  • Expected result: Finalized Twist order of combinatorial tail spike library as linear dsDNA fragments
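
As a rough illustration of how the variant list in Step 2 could be enumerated before ordering, the sketch below builds every single and double substitution from a table of candidate positions. The positions, residues, and the `max_mutations` cap are hypothetical placeholders and are not taken from the D29 sequence or from Huss et al.

```python
from itertools import combinations, product

def enumerate_variants(substitutions, max_mutations=2):
    """List variant names for every combination of up to max_mutations positions.

    substitutions maps a residue position to the replacement residues to try there.
    """
    positions = sorted(substitutions)
    variants = []
    for n in range(1, max_mutations + 1):
        for chosen in combinations(positions, n):
            # One replacement residue per chosen position
            for residues in product(*(substitutions[p] for p in chosen)):
                variants.append("+".join(f"{p}{aa}" for p, aa in zip(chosen, residues)))
    return variants

# Hypothetical positions and residues, for illustration only
example_substitutions = {101: ["S", "Y"], 145: ["T", "W"], 210: ["N", "F"]}
library = enumerate_variants(example_substitutions, max_mutations=2)
print(len(library), library[:4])
```

With 8 to 15 positions the count grows quickly, so capping the number of simultaneous substitutions is one way to stay within a 96 to 192 variant budget.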

Step 3 — Twist Bioscience DNA synthesis and delivery

  • Method: Order the combinatorial library as Twist Oligo Pools; order the top 12 priority variants separately as clonal, individual whole-plasmid constructs for validation; also order the pJV53 recombineering plasmid backbone as a Twist whole-plasmid synthesis
  • Automation: External (Twist Bioscience)
  • Expected result: Pooled oligo library + 12 individual constructs delivered

Step 4 — M. smegmatis mc²155 electrocompetent cell preparation

  • Method: Grow mc²155 to mid-log in 7H9 media, wash 3× with 10% glycerol, aliquot and freeze at −80°C
  • Automation: Manual preparation; HiG Centrifuge for pelleting
  • Microplate: N/A
  • Expected result: Electrocompetent mc²155 cells ready for recombineering

Step 5 — pJV53 recombineering plasmid transformation into mc²155

  • Method: Electroporate pJV53 (encodes Che9c gp60/gp61 recombinases under acetamide-inducible promoter) into mc²155; select on kanamycin plates (pJV53 carries a kanamycin resistance marker); confirm by colony PCR
  • Automation: Manual electroporation; ATC Thermal Cycler for colony PCR confirmation
  • Microplate: 96-Armadillo-PCR-AB2396X for colony PCR
  • Expected result: mc²155/pJV53 recombineering-competent strain confirmed

Step 6 — Recombineering: introduction of tail spike library into D29 phage genome

  • Method: Infect mc²155/pJV53 with wild-type D29 at MOI 0.1 in presence of tail spike library linear DNA (500 bp homology arms); induce recombinases with acetamide at time of infection; harvest phage lysate after 4h lysis
  • Automation: Inheco Plate Incubator for controlled 37°C incubation; BioshakeD3000 for mixing during induction
  • Microplate: 96-v-eppendorf-951033502-deep (deep well for liquid culture volumes)
  • Expected result: Recombinant phage library in lysate; estimated recombination frequency 10⁻⁴–10⁻³
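
The MOI 0.1 infection in Step 6 implies a simple volume calculation. A minimal sketch follows, assuming the culture density and phage stock titer have already been measured; the numbers in the example are hypothetical.

```python
def phage_volume_ul(moi, cell_density_cfu_per_ml, culture_volume_ml, phage_titer_pfu_per_ml):
    """Volume of phage stock (in microliters) needed to reach a target MOI."""
    total_cells = cell_density_cfu_per_ml * culture_volume_ml
    pfu_needed = moi * total_cells            # MOI = PFU added / cells present
    return pfu_needed / phage_titer_pfu_per_ml * 1000.0

# Hypothetical values: 1 mL of culture at 1e8 CFU/mL, phage stock at 1e9 PFU/mL
volume = phage_volume_ul(moi=0.1, cell_density_cfu_per_ml=1e8,
                         culture_volume_ml=1.0, phage_titer_pfu_per_ml=1e9)
print(f"{volume:.1f} uL of phage stock")
```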

Step 7 — Primary plaque assay screen on mc²155 (permissive host, library QC)

  • Method: Serial dilution of phage library lysate; plaque assay on mc²155 lawn in 7H9 top agar; count plaques to confirm library titer; pick 48 random plaques for Sanger sequencing of tail spike region to confirm library diversity
  • Automation: Echo525 for serial dilution transfers into 384-well plate; Multiflo for top agar dispensing
  • Microplate: 384-flat-corning-3640 for dilution series; omni plates for agar overlays
  • Expected result: Library titer ≥10⁸ PFU/mL; sequencing confirms >60% variant recovery
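
The titer check in Step 7 comes down to one formula: plaques counted divided by the product of the dilution and the volume plated. The sketch below uses hypothetical counts and compares the result against the 10⁸ PFU/mL target.

```python
def titer_pfu_per_ml(plaque_count, dilution, volume_plated_ml):
    """Phage titer from a countable plate: plaques / (dilution * volume plated)."""
    return plaque_count / (dilution * volume_plated_ml)

# Hypothetical plate: 50 plaques from 10 uL of the 1e-6 dilution
titer = titer_pfu_per_ml(plaque_count=50, dilution=1e-6, volume_plated_ml=0.01)
meets_target = titer >= 1e8
print(f"{titer:.2e} PFU/mL, meets target: {meets_target}")
```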

Step 8 — Automated host range screen: plaque assay across 3 M. smeg strains

  • Method: Plate library phage against (a) mc²155 (permissive control), (b) M. smeg ATCC 607 (naturally more resistant to D29), (c) M. smeg strain with modified cell wall glycolipids; score plaques at 24h and 48h; calculate efficiency of plating (EOP) for each variant on each strain, i.e. its titer on the test strain relative to its titer on mc²155
  • Automation: Echo525 for phage dilution transfers; Multiflo for bacterial lawn + top agar dispense; Spark Plate Reader for OD-based bacterial growth confirmation pre-plating; Plateloc with A4s breathable seal for incubation; Cytomat for 37°C shaking incubation of liquid cultures
  • Microplate: 384-flat-corning-3640 for plaque assay screens; 384 Greiner black-well clear-bottom for OD600 bacterial growth monitoring
  • Expected result: Identification of 3–10 variant plaques with EOP >10× WT D29 on resistant strains
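
The ">10× WT" criterion in Step 8 can be made concrete as follows. This sketch assumes EOP is defined as titer on the test strain divided by titer on the permissive mc²155 control; all titers shown are hypothetical.

```python
def eop(titer_on_test_strain, titer_on_permissive_strain):
    """Efficiency of plating: titer on the test strain relative to the permissive host."""
    return titer_on_test_strain / titer_on_permissive_strain

def fold_over_wild_type(variant_eop, wild_type_eop):
    """How many times better the variant plates on the test strain than WT does."""
    return variant_eop / wild_type_eop

# Hypothetical titers (PFU/mL) on a resistant strain vs. mc2155
wt_eop = eop(1e5, 1e9)
variant_eop = eop(4e6, 8e8)
fold = fold_over_wild_type(variant_eop, wt_eop)
print(f"WT EOP {wt_eop:.1e}, variant EOP {variant_eop:.1e}, fold {fold:.0f}x")
```

In this made-up example the variant would pass the 10× threshold.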

Step 9 — Plaque isolation and phage stock preparation of top variants

  • Method: Pick top 12 expanded-host-range plaques from Step 8; propagate individual phage stocks on mc²155; titer by plaque assay; extract phage genomic DNA for sequencing
  • Automation: Echo525 for phage propagation setup; PHERAstar FSX for high-sensitivity titer confirmation via absorbance
  • Microplate: 96-round-axygen-pdw11cs-halfdeep for phage propagation
  • Expected result: 12 purified expanded-host-range D29 variant stocks at ≥10⁹ PFU/mL

Step 10 — Adsorption rate assay for top 12 variants

  • Method: Mix each variant phage with each target M. smeg strain at MOI 0.01; centrifuge at defined time points (2, 5, 10, 20 min) to pellet bacteria + adsorbed phage; measure unadsorbed phage in supernatant by plaque assay; calculate adsorption rate constant k
  • Automation: Echo525 for phage + bacteria mixing in 384-well format; HiG Centrifuge for pelleting; Spark Plate Reader for OD600 monitoring of bacterial density
  • Microplate: 384-flat-perkin-elmer-60082XX-shallow for adsorption kinetics timepoints
  • Expected result: Top variants show adsorption rate constants 2–10× higher than WT D29 on resistant strains
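
One common way to obtain the adsorption rate constant k in Step 10 is to assume first-order kinetics, P(t) = P0·exp(−k·B·t), where B is the bacterial density, and fit the slope of ln(free phage) against time. The sketch below does that with a plain least-squares fit. It assumes B stays constant over the assay, and the data are synthetic, generated from a known k so the fit can be checked.

```python
import math

def adsorption_rate_constant(times_min, free_phage_pfu_per_ml, bacteria_cfu_per_ml):
    """Estimate k (mL/min) from the slope of ln(free phage) vs. time: slope = -k * B."""
    ys = [math.log(p) for p in free_phage_pfu_per_ml]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(ys) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, ys))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    slope = sxy / sxx
    return -slope / bacteria_cfu_per_ml

# Synthetic data: k = 2e-9 mL/min, B = 1e8 CFU/mL, P0 = 1e6 PFU/mL
times = [0, 2, 5, 10, 20]
bacteria = 1e8
free_phage = [1e6 * math.exp(-2e-9 * bacteria * t) for t in times]
k = adsorption_rate_constant(times, free_phage, bacteria)
print(f"k = {k:.2e} mL/min")
```

Real supernatant titers will be noisy, so comparing variants to WT D29 would likely need replicate assays.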

Step 11 — One-step growth curve for top 3 variants

  • Method: Synchronize infection of each target strain at MOI 5; wash away unadsorbed phage; resuspend in fresh media; sample every 10 min for 90 min; titer each sample by plaque assay; calculate latent period, rise period, and burst size
  • Automation: BioshakeD3000 for temperature-controlled mixing; Echo525 for timed sampling transfers; Inheco Plate Incubator for constant 37°C
  • Microplate: 96-v-eppendorf-951033502-deep for growth curve samples
  • Expected result: Top variants show burst sizes within 2× of WT D29 on mc²155, confirming no major replication fitness cost
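
The latent period and burst size in Step 11 can be read off the titer time course. The sketch below uses one simple convention, which is an assumption rather than the only option: the baseline is the mean of the first few samples, the plateau is the mean of the last few, the latent period is the last time point before the titer exceeds twice the baseline, and burst size is plateau divided by baseline. The data are hypothetical.

```python
def one_step_growth_summary(times_min, titers_pfu_per_ml,
                            rise_factor=2.0, n_baseline=3, n_plateau=3):
    """Return (latent_period_min, burst_size) from a one-step growth curve."""
    baseline = sum(titers_pfu_per_ml[:n_baseline]) / n_baseline
    plateau = sum(titers_pfu_per_ml[-n_plateau:]) / n_plateau
    latent_period = None
    for t, titer in zip(times_min, titers_pfu_per_ml):
        if titer > rise_factor * baseline:
            break                      # first sample clearly above baseline
        latent_period = t
    return latent_period, plateau / baseline

# Hypothetical curve sampled every 10 min for 90 min
times = list(range(0, 100, 10))
titers = [1e5, 1e5, 1e5, 1e5, 5e5, 3e6, 8e6, 1e7, 1e7, 1e7]
latent, burst = one_step_growth_summary(times, titers)
print(f"latent period ~{latent} min, burst size ~{burst:.0f}")
```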

Step 12 — Whole-genome sequencing of top variants

  • Method: Submit phage genomic DNA of top 3–5 variants for Illumina sequencing; align reads to D29 reference genome (GenBank accession AF022214); identify all mutations relative to WT; confirm tail spike variants match designed library sequences
  • Automation: External sequencing service; bioinformatic analysis computational
  • Expected result: Full mutation maps for top variants; confirmation that tail spike substitutions are responsible for host range expansion

Step 13 — qPCR validation of tail spike gene expression in top variants

  • Method: Infect mc²155 with top 3 variants; extract total RNA at 5, 15, and 30 min post-infection; reverse transcribe; run qPCR with primers flanking the engineered tail spike region; compare expression to WT D29 tail spike
  • Automation: CFX Opus qPCR machine; ATC Thermal Cycler for RT-PCR; Echo525 for qPCR reaction setup
  • Microplate: 384-pcr-eppendorf-9510207XX for qPCR reactions
  • Expected result: Tail spike gene expression confirmed in all top variants; no significant transcriptional defects introduced by library substitutions
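
For the expression comparison in Step 13, a common approach is the 2^(−ΔΔCt) method. The sketch below assumes a reference gene is measured alongside the tail spike target and that primer efficiency is close to 100%; neither is specified in the plan above, and the Ct values are hypothetical.

```python
def relative_expression(ct_target_variant, ct_reference_variant,
                        ct_target_wt, ct_reference_wt):
    """Fold change of target expression in the variant vs. WT by the 2^-ddCt method."""
    delta_variant = ct_target_variant - ct_reference_variant
    delta_wt = ct_target_wt - ct_reference_wt
    return 2 ** -(delta_variant - delta_wt)

# Hypothetical Ct values at one time point post-infection
fold_change = relative_expression(ct_target_variant=22.0, ct_reference_variant=18.0,
                                  ct_target_wt=21.0, ct_reference_wt=18.0)
print(f"variant expression is {fold_change:.2f}x WT")
```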

Step 14 — Data analysis and fitness landscape construction

  • Method: Compile EOP, adsorption rate, burst size, and sequence data for all variants; calculate fitness scores; generate heatmap of tail spike residue substitutions vs. host-range fitness; compare enriched positions to Huss et al. T7 microgravity-selected residues
  • Automation: Computational (Python/R analysis)
  • Expected result: Interpretable fitness landscape of D29 tail spike host-range residues; identification of convergent vs. divergent positions relative to T7 microgravity data
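
Step 14 does not say how the fitness scores are calculated. One possible scheme, shown here purely as an assumption, is to average the log2 fold change over WT across the measured metrics, so that a score of 0 means WT-like and positive values mean improvement. The metric names and values are hypothetical.

```python
import math

def fitness_score(variant, wild_type, metrics=("eop", "adsorption_k", "burst_size")):
    """Mean log2 fold change over WT across metrics (equal weights, an assumption)."""
    return sum(math.log2(variant[m] / wild_type[m]) for m in metrics) / len(metrics)

# Hypothetical measurements
wild_type = {"eop": 1e-4, "adsorption_k": 1e-9, "burst_size": 100}
variants = {
    "variant_A": {"eop": 1.6e-3, "adsorption_k": 4e-9, "burst_size": 50},
    "variant_B": {"eop": 2e-4, "adsorption_k": 1e-9, "burst_size": 100},
}
scores = {name: fitness_score(v, wild_type) for name, v in variants.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

Equal weighting treats a burst size penalty the same as an EOP gain; since host range is the stated top priority, weighting EOP more heavily may be more appropriate.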

Step 15 — Validation experiment: PCR + Sanger sequencing of final top clone

  • Method: PCR-amplify the full tail spike coding region of the top-performing D29 variant using flanking primers; run on 1% agarose gel for size confirmation; submit for Sanger sequencing; confirm exact amino acid substitutions
  • Automation: ATC Thermal Cycler; Spark Plate Reader for gel quantification
  • Microplate: 96-Armadillo-PCR-AB2396X for PCR setup
  • Expected result: Confirmed sequence identity of top variant tail spike; complete project validation
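
Confirming the exact amino acid substitutions in Step 15 amounts to translating the Sanger consensus and the WT coding sequence and listing the positions that differ. The sketch below uses the standard codon table and assumes both sequences are in frame, the same length, and free of indels; the short sequences are made up for illustration.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate an in-frame DNA sequence; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def substitutions(wt_dna, variant_dna):
    """List amino acid changes as e.g. 'K3Y' (WT residue, 1-based position, new residue)."""
    wt_protein, variant_protein = translate(wt_dna), translate(variant_dna)
    return [f"{a}{i}{b}"
            for i, (a, b) in enumerate(zip(wt_protein, variant_protein), start=1)
            if a != b]

# Made-up 4-codon example: Lys -> Tyr at position 3
changes = substitutions("ATGGCTAAAGGT", "ATGGCTTATGGT")
print(changes)
```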

Project Techniques

  • DNA design and synthesis — Combinatorial tail spike library design; whole-plasmid synthesis via Twist Bioscience
  • Recombineering / in-cell recombination — pJV53-mediated Che9c recombinase system in M. smegmatis mc²155
  • Combinatorial library synthesis — Twist oligo pool-based tail spike variant library
  • Plaque assay — Primary host range screening methodology
  • PCR and gel electrophoresis — Colony PCR confirmation; validation sequencing
  • qPCR — Tail spike gene expression validation (CFX Opus)
  • Microplate automation — Full Ginkgo Bioworks automation pipeline
  • Bioinformatics / sequence analysis — AlphaFold2 structural modeling; fitness landscape analysis
  • Liquid handling robotics — Echo525, Multiflo, Tempest, Floi8
  • Plate reader assays — Spark (OD600 growth curves), PHERAstar FSX (titer confirmation)

Supporting prompts for this output listed below:

Supporting Prompt | Model
Please follow the skill instructions below and begin Phase 1: | Claude
Q: What biological problem are you most interested in solving? A: I’m interested in bacteriophage engineering, specifically taking microgravity-derived phage fitness insights from an International Space Station (ISS) experiment and proactively applying these insights to a non-Escherichia coli terrestrial bacteriophage Q: What system or organism would you prefer to work in? A: Thinking a Mycobacterium smegmatis (M. smegma), but I’m not sure Q: Which synthetic biology approaches interest you most? (Select all that apply) A: ORACLE (Optimized Recombination, Accumulation, and Library Expression), In-cell recombination, and Combinatorial Library Synthesis (although I’m not sure and am open to other synthetic biology approaches for this research) | Claude
Q: The ISS microgravity experiments you’re referencing — do you have a specific phage or dataset in mind? A: Yes — a specific phage/study (I’ll describe below) Q: For your M. smegmatis host system, which mycobacteriophage are you thinking of engineering? A: Open to suggestions based on what’s best for the project. Please note that I’m working in a low-biosafety level (BSL) lab, so that’s the only major consideration that comes to mind here Q: What is the core hypothesis — what ‘fitness’ trait from microgravity are you trying to engineer in? A: The desired fitness traits would most likely be ‘Broader host range’ and ‘Enhanced infectivity / adsorption rate’ in that order of priority | Claude
Q: What type of phage genes do you expect carry the microgravity fitness signal? (This shapes your DNA construct design) A: Tail fiber / receptor binding protein (RBP) mutations Q: For measuring success, which primary assay makes most sense to you? A: Combination of the above Q: You mentioned ORACLE / in-cell recombination / combinatorial library synthesis — are you set on using all three, or open to a focused recommendation? A: I’d like your recommendation given the BSL constraint and M. smeg system | Claude
Microgravity reshapes bacteriophage–host coevolution aboard the International Space Station (title); Phil Huss, Chutikarn Chitboonthavisuk, Anthony Meger, Kyle Nishikawa, R. P. Oates, Heath Mills, Olivia Holzhaus, Srivatsan Raman (authors); 2026 (year) | Claude
Q: Does D29 with focus on the gp80 tail spike RBP sound right for your project? A: I don’t know enough to know enough so will defer to these findings Q: What host range panel would you like to test your engineered D29 variants against? A: Multiple M. smeg strains only (keep it BSL-1) Q: The paper used cell-free approaches for DMS. Do you want a cell-free component in your project? A: Do whatever will likely be logistically easier given the timeframes to complete the research (essentially 4 to maybe 6 weeks max.) | Claude
“generate proposal” | Claude

Subsections of Projects

Individual Final Project


Group Final Project
