I am an Industrial Designer who has worked as a Learning Technology specialist in the Biotechnology and Manufacturing industry for over 25 years. My passion for content creation stems from my experience with photography, video production and interactive 3d visualizations. I am currently instructing activities at the Makerspace Charlotte where I continue to explore the intersection of design and technology.
Concept Create new BioArt experiences for members of a community MakerSpace where our stated goal is to Make, Learn, and Share. The MakerSpace has recently opened a BioArt Studio, led by Karen Ingram, co-author of “BioBuilder - Synthetic Biology in the Lab” (ISBN 978-1-491-90429-9).
My applications are inspired by the innovative use of living systems to create art & design. Concepts incorporate digital imaging, interactive 3d and microprocessing to create algorithmic artwork, influenced and driven by the biological science found in the collection of experimental solutions described below: (Click to expand each item)
Checklist
Part 0: Basics of Gel Electrophoresis
Attend Lecture (2 of 3)
Attend Recitation
Review 2025 recording (3 of 3)
Part 1: Benchling & In-silico Gel Art
Part 2: Gel Art - Restriction Digests and Gel Electrophoresis (Optional - for those with Lab access)
Design Simulation
Part 3: DNA Design Challenge
3.1 Choose your Protein
3.2 Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.
3.3 Codon optimization
3.4 You have a sequence! Now what?
3.5 [Optional] How does it work in nature/biological systems?
Part 4: Prepare a Twist DNA Synthesis Order
4.1 Create a Twist account and a Benchling account
4.2 Build Your DNA Insert Sequence
4.3 On Twist, Select The “Genes” Option
4.4 Select “Clonal Genes” option
4.5 Import your sequence
4.6 Choose Your Vector
Part 5: DNA Read/Write/Edit
5.1 DNA Read
(i) What DNA would you want to sequence (e.g., read) and why?
(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?
5.2 DNA Write
(i) What DNA would you want to synthesize (e.g., write) and why?
(ii) What technology or technologies would you use to perform this DNA synthesis and why?
5.3 DNA Edit
(i) What DNA would you want to edit and why?
(ii) What technology or technologies would you use to perform these DNA edits and why?
Part 1: Benchling & In-silico Gel Art
In this section, I successfully signed up for Benchling, requested to join HTGAA (pending), and created a new project. I found the Lambda DNA sequence in the FASTA database, which I copied and pasted. I then found the downloadable file in GenBank, which I imported into Benchling. It took me a few tries to get multiple digests to appear; it worked once I selected multiple restriction enzymes and ordered the tabs before running Virtual Digest.
I exported the resulting image as a .PNG as well as my NC_001416 Project “Linear Map” and “Sequence Map” as well as the Lambda Map from GenBank, as PDFs for future reference.
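To make the idea behind a virtual digest concrete, here is a minimal Python sketch. It is not how Benchling implements Virtual Digest; the function names and the toy sequence are hypothetical, and it handles only a linear molecule and a single enzyme. The one biological fact it relies on is that EcoRI recognizes GAATTC and cuts after the first G.

```python
def find_sites(sequence: str, site: str) -> list[int]:
    """Return the 0-based start position of every occurrence of `site`."""
    sequence, site = sequence.upper(), site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def digest_linear(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear molecule at every site.

    `cut_offset` is the number of bases into the recognition site where the
    top strand is cut (1 for EcoRI, G^AATTC).
    """
    cuts = [p + cut_offset for p in find_sites(sequence, site)]
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Toy sequence (hypothetical, NOT Lambda DNA) containing two EcoRI sites.
toy = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
fragments = digest_linear(toy, "GAATTC", cut_offset=1)
print(sorted(fragments, reverse=True))  # largest fragment first, as on a gel
```

Sorting the fragment lengths from largest to smallest mirrors how bands are read on a gel, top to bottom.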
Focus on Lab Automation research, with creative examples of Opentrons instruction sets using Python. Final project slide to be included in Node deck.
Opentrons Art This week started with an exploration of the Opentrons Art web app found at https://opentrons-art.rcdonovan.com
I was able to quickly upload an image and randomize the colors, to generate a point paired data set. I really like the bitmap rasterization and creative expression found in the gallery.
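As a rough illustration of what a "point paired data set" could look like, the sketch below turns a tiny character bitmap into (x, y, color) points centered on the plate. This is an assumption about the general idea, not the actual export format of the Opentrons Art web app; the function name, the 2 mm spacing, and the bitmap are all hypothetical.

```python
def bitmap_to_points(rows, spacing_mm=2.0):
    """Convert a character bitmap into (x, y, color_key) points.

    '.' marks an empty pixel; any other character is treated as a color key.
    Coordinates are in millimetres with the origin at the bitmap centre and
    the y axis pointing up (row 0 is the top of the image).
    """
    height = len(rows)
    width = max(len(row) for row in rows)
    points = []
    for r, row in enumerate(rows):
        for c, key in enumerate(row):
            if key == ".":
                continue
            x = (c - (width - 1) / 2) * spacing_mm
            y = ((height - 1) / 2 - r) * spacing_mm
            points.append((x, y, key))
    return points


# Hypothetical 5 x 4 bitmap: R = red culture, G = green culture.
flower = [
    ".R.R.",
    "RRRRR",
    ".RRR.",
    "..G..",
]
for point in bitmap_to_points(flower):
    print(point)
```

Each point could then be handed to a pipetting routine, one dispense per point, grouped by color key to minimize tip changes.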
This week focuses on how sequence, structure, and energetics can be modeled and manipulated to create or optimize proteins with specified functions.
Part A - Conceptual Questions For my homework, I initiated a conversation with Claude Ai using Sonnet v4.6. My prompts follow a method in which I am asked a question, provide my own answer, and then receive an evaluation of my response that reinforces key learning concepts. (Expand to see detailed responses to my answers.) I find this approach more interactive, and it leads to better knowledge retention.
This week we learned how cutting-edge AI and protein language models are used to design functional proteins and peptides “in silico”.
Part A: SOD1 Binder Peptide Design Part 1: Generate Binders with PepMLM Begin by retrieving the human SOD1 sequence from UniProt (P00441) and introducing the A4V mutation.
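Introducing a point mutation such as A4V can be sketched in a few lines of Python. The helper below is hypothetical and not part of PepMLM or UniProt tooling. One assumption to check against the real entry: the A4V label is taken to use numbering that skips the initiator methionine, so an `offset` of 1 is applied when the sequence string still starts with M. The sequence shown is only a short N-terminal stand-in; the full sequence should be retrieved from UniProt (P00441).

```python
def apply_substitution(sequence: str, mutation: str, offset: int = 0) -> str:
    """Apply a point substitution written like 'A4V' to a protein sequence.

    `offset` is the number of residues in `sequence` that precede residue 1
    of the numbering scheme (for example 1 if the string keeps an initiator
    Met that the numbering ignores).
    """
    wild_type = mutation[0]
    position = int(mutation[1:-1])
    mutant = mutation[-1]
    index = position - 1 + offset
    if sequence[index] != wild_type:
        raise ValueError(
            f"Expected {wild_type} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + mutant + sequence[index + 1:]


# Short stand-in fragment (assumption), not the full SOD1 sequence.
n_terminal_fragment = "MATKAVCVLKG"
mutated = apply_substitution(n_terminal_fragment, "A4V", offset=1)
print(mutated)
```

The wild-type check is the useful part: if the numbering convention is wrong, the function fails loudly instead of silently mutating the wrong residue.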
This week we learn core molecular biology tools and techniques for processing and assembling DNA, including PCR and Gibson Assembly.
Assignment: DNA Assembly What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?
The mix contains Phusion DNA Polymerase, a thermostable, high-fidelity enzyme used to amplify the fragments for Gibson Assembly. What are some factors that determine primer annealing temperature during PCR?
This week covers neuromorphic genetic circuits, showing how engineered gene networks can implement neural-network “perceptron”-like computation and learning.
Assignment Part 1: Intracellular Artificial Neural Networks (IANNs) Q1. What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions? Answer: IANNs have many possible responses, reflecting more of a Gaussian distribution rather than binary ON/OFF outputs. This allows for a graded, continuous range of responses versus the step-function behavior of Boolean genetic circuits, making them well-suited for environments with high levels of variability such as changing temperatures, pH, or time.
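The contrast between a Boolean step response and a graded, perceptron-like response can be illustrated with a small numerical sketch. This is a generic textbook perceptron with a sigmoid activation, used here as an analogy only; it is not a model of any specific genetic circuit, and the weights, bias, and threshold are arbitrary illustrative values.

```python
import math


def boolean_gate(signal: float, threshold: float = 0.5) -> float:
    """Step-function behaviour: fully OFF below the threshold, fully ON above."""
    return 1.0 if signal >= threshold else 0.0


def perceptron(inputs, weights, bias: float) -> float:
    """Weighted sum of inputs passed through a sigmoid, giving a graded output."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Sweep one input from 0 to 1 and compare the two response curves.
for step in range(0, 11, 2):
    level = step / 10
    graded = perceptron([level], weights=[8.0], bias=-4.0)
    print(f"input={level:.1f}  boolean={boolean_gate(level):.0f}  graded={graded:.3f}")
```

The Boolean column jumps from 0 to 1 at the threshold, while the graded column rises smoothly, which is the property the answer above describes.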
This week introduces synthesis of proteins using cellular machinery outside of a cell.
Section 1: General Homework Questions Question 1 Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.
This week’s lecture presents a range of advanced technologies to do precision measurement of proteins at atomic scales, characterizing chemical composition, and detecting protein sequence and structure.
Question 1 — What aspects of your project will you measure? Validity and viability of the BioLightV5 plasmid obtained from Twist, confirmed through gel electrophoresis and successful colony growth in E. coli.
HTGAA 2026 — Week 11: Bioproduction & Cloud Labs Hypothesis — Version 2.1 This is a hypothesis on the design of a variable luminosity construct based on cell-free protein synthesis. By adding independent reagent modifications to a fixed cell-free DNA and master mix, we hypothesize a measurable delta in sfGFP luminosity relative to the unmodified control, operating on a single mechanistic axis — free Mg2+ availability:
HTGAA Week 12 Homework Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork Item 1: Pixel Contribution I contributed to plate #G3, initiating a rose design on April 15. I seeded the concept on Discourse: "#G3 - Starting to build a rose… let’s see what grows!"
Final Project Build This week has been focused on key milestones for my final project.
BioLight V5 circular plasmid finalized
Benchling construct submitted to the Node Order Form as approved
TWIST order simulated with no errors reported
pUC19 confirmed as vector with availability at Genspace
Scheduled a visit to Genspace on May 28th to attend Safety Training and Orientation
Aim 1 objective is to validate viability of the Clonal Gene “BioLight V5” to be used in the “Photoplasm” optogenetic labware
Slides for final project are drafted
ChimeraX being used to generate Mechanism of Action illustrations.
pDawn with sfGFP identified and ordered from Addgene as a control to enable testing of protocols
Design, Development, and physical prototyping (Build) of the “Photoplasm” device is proceeding, with sensor data.
Additional applications developed (software) to facilitate electrical engineering schematics.
All steps being documented as an open-source build framework, Aim 2 objective in support of the larger Aim 3 “MakerSpace” vision.
Reviewed and contributed to the “Ai Tutor” project led by @Derek - Shared a Q&A prompt based on accuracy & confidence in a feedback loop.
This week has been focused on finalizing my final project.
Subsections of Homework
Week 1 HW: Principles and Practices
Concept
Create new BioArt experiences for members of a community MakerSpace where our stated goal is to Make, Learn, and Share.
The MakerSpace has recently opened a BioArt Studio, led by Karen Ingram, co-author of “BioBuilder - Synthetic Biology in the Lab” (ISBN 978-1-491-90429-9).
My applications are inspired by the innovative use of living systems to create art & design.
Concepts incorporate digital imaging, interactive 3d and microprocessing to create algorithmic artwork, influenced and driven by the biological science found in the collection of experimental solutions described below: (Click to expand each item)
BioPhotoLab
Exploring 2D and 3D visual imaging techniques to discover new applications and experiences suitable for a community MakerSpace.
Concept #1: SlimeOgraphy
Imaging with light-following organisms.
Imaging with photoreactive synthetic proteins.
Experimenting with Slime Mold to determine if organisms can be guided and trained with light to create organic designs.
High Probability, Ease of Access, Generative Art
Aligns with Makerspace ethos, with derivative output via multiple media formats
Concept #2: BioTerrain
Terraforming with Image Maps.
Translate organic interactions into realtime interactive terrain maps that can be explored using immersive virtual reality
Experimenting with slime mold and fluorescent bacterial cultures
Slime mold “reader” can leverage imagery from previously 2D generated image sequences to create immersive virtual worlds.
Fluorescent bacterial cultures can be interpolated into displacement maps, and texture maps.
Both types of input will become part of a wider narrative that allows for creative virtual exploration using game engine mechanics.
The capture of image sequences leads to time-based controls to visualize change.
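One way the step from a fluorescence image to a displacement map could work is simple normalization of pixel brightness into terrain height. The sketch below assumes the image is already available as a grid of grayscale values; the function name, the sample values, and the height scale are hypothetical, and a real pipeline would likely add smoothing before handing the map to a game engine.

```python
def to_displacement(gray, max_height=1.0):
    """Normalise a 2D grid of grayscale values into heights in [0, max_height].

    The darkest pixel maps to 0 and the brightest to `max_height`. A flat
    image (no contrast) maps to an all-zero terrain.
    """
    lowest = min(min(row) for row in gray)
    highest = max(max(row) for row in gray)
    span = highest - lowest
    if span == 0:
        return [[0.0 for _ in row] for row in gray]
    return [[(value - lowest) / span * max_height for value in row] for row in gray]


# Hypothetical 2 x 2 capture of a fluorescent culture (0-255 grayscale).
frame = [
    [0, 128],
    [255, 64],
]
for row in to_displacement(frame, max_height=10.0):
    print([round(height, 2) for height in row])
```

Applying the same function to each frame of an image sequence would give a terrain that changes over time, matching the time-based controls described above.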
Concept #3: BioScanner
Event Based Triggers : Machine Vision Detection of Change
Similar to IOT “Internet of (Almost Any) Things”
Building on the previous experiments, the introduction of change results in a condition that will trigger an event, or automation.
A simplified gateway will send an encoded message that can be visualized over time.
The unique nature of a biofeedback loop allows for a bi-directional conversation between the experiment and participating scientist.
An entire API can be developed that leads to a notification platform that seeks to identify key triggers and events.
High level of governance, potential risk, and personal identity protection required as data is flowing from the source. May be encoded at rest.
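The change-triggers-event idea can be sketched as a frame comparison that emits an encoded message once enough pixels have changed. This is a minimal illustration under stated assumptions: frames are plain grids of grayscale values, both thresholds are arbitrary, and the JSON message layout is hypothetical rather than any existing gateway format.

```python
import json


def changed_fraction(previous, current, pixel_threshold=25):
    """Fraction of pixels whose brightness changed by more than the threshold."""
    total = 0
    changed = 0
    for prev_row, curr_row in zip(previous, current):
        for before, after in zip(prev_row, curr_row):
            total += 1
            if abs(after - before) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def detect_event(previous, current, frame_index, area_threshold=0.1):
    """Return a JSON-encoded event if enough of the frame changed, else None."""
    fraction = changed_fraction(previous, current)
    if fraction < area_threshold:
        return None
    message = {
        "event": "change_detected",
        "frame": frame_index,
        "changed_fraction": round(fraction, 3),
    }
    return json.dumps(message)


# Two hypothetical 2 x 2 frames: one pixel brightens between captures.
frame_a = [[10, 10], [10, 10]]
frame_b = [[10, 200], [10, 10]]
print(detect_event(frame_a, frame_a, frame_index=1))
print(detect_event(frame_a, frame_b, frame_index=2))
```

Note that the message carries only derived values (frame index and changed fraction), not the image itself, which fits the privacy concern raised above.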
Concept #4: BioEmulsion Print
Paper based coating that is light-sensitive and photo reactive
Emulsion coating that is applied to paper and other materials that can be exposed via an enlarger and creates a bio-digital original
Advanced understanding of Protein Synthesis from samples that result in a range of photo emulsions and papers.
Leverages the darkroom lab to expose and print
Can be a digital file transmission or analog optical projection
Similar to sun prints or cyanotypes.
Governance Design & Purpose
This governance model outlines the actions of the BioPhotoLab within the MakerSpace “BioArt Studio.”
By integrating biology with creative mediums—such as Slimeography, BioTerrain, and BioEmulsion—the initiative provides a public and member-driven workspace to foster experiences based on science, technology, engineering, art, and math (STEAM).
The model addresses critical dependencies on membership-driven funding and the need for standardized best practices in a shared environment. It prioritizes a transition from simple completion or attendance tracking metrics to an activity-based training model (using experience APIs) to monitor safe, scalable, and inclusive biotechnology exploration.
A leading purpose is to develop a Makerspace focus area, “BioPhotoLab,” that is deemed accessible and can be experienced by people with a wide range of abilities. We will demonstrate how Bioengineering is well suited to the concepts of Universal Design while encouraging technological creativity and community knowledge sharing.
Governance Policies
The following options evaluate proposed actions against core governance pillars: Safety, Privacy, Digital Rights/IP, and Accessibility.
Evaluation of Risks and Assumptions
Assumptions: Success assumes that funding (dues, grants, donations) remains stable and that “Universal Design” (if accessible for a person with a disability, it is good for everyone) is adopted. It assumes learners will practice safe operation and intend to share knowledge.
Risks of Failure: Potential failure points include membership attrition, lack of succession planning for instructors, and the perception that class attendance equates to workcell competency.
Risks of “Success”: Unintended consequences of success may include challenges with proprietary IP/Patents from corporate R&D and the need for rigorous Digital Rights Management to combat “AI hallucinations” or attribution infringement.
Governance Model with Matrix Ranking
Segment A: Selected Actor: MakerSpace Instructors, MakerSpace Members
Segment B: Selected Action: Activity Logging
Segment C: Selected Actor: Disability and Accessibility Organizations
Segment D: Selected Actions: ADA Legislation, Web Accessibility Guidelines
Governance Matrix Analysis
My governance matrix uses the rubric of Design, Purpose, Assumptions, and Risks of Failure/Success to align Actors (Personas) with Actions. The resulting table is color coded to show a relative heatmap of ratings, along with a total by row and column to highlight outliers.
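The row and column totals described above can be computed with a short Python sketch. The scores below are placeholder values for illustration only, not the ratings from the actual matrix; the 1 to 3 scale with n/a entries follows the scoring scheme described in the prompt references, and n/a is represented here as `None` and skipped.

```python
def matrix_totals(matrix):
    """Return (row_totals, column_totals) for an Actor x Action score matrix.

    Scores use 1 (best) to 3 (worst); None stands for n/a and is skipped.
    """
    row_totals = {}
    column_totals = {}
    for actor, scores in matrix.items():
        row_totals[actor] = sum(v for v in scores.values() if v is not None)
        for action, value in scores.items():
            if value is not None:
                column_totals[action] = column_totals.get(action, 0) + value
    return row_totals, column_totals


# Placeholder scores (hypothetical, NOT the real ratings).
example = {
    "MakerSpace Instructors": {"Safety": 1, "Privacy": 2, "Accessibility": 1},
    "MakerSpace Members": {"Safety": 1, "Privacy": 1, "Accessibility": 2},
    "Corporate R&D": {"Safety": 2, "Privacy": 3, "Accessibility": None},
}
rows, columns = matrix_totals(example)
print(rows)
print(columns)
```

One caveat with this design: a row with many n/a entries gets a lower total simply because fewer cells are counted, so an average over the scored cells may be a fairer way to compare outliers.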
Segment A: Makerspace Instructors and Members
This grouping accounts for the majority of best-scoring results. MakerSpace Instructors and MakerSpace Members may be considered the primary target audience for “BioPhotoLab” activities and experiments with governance.
Segment B: Activity Logging
“Activity Logging” is a high-rated Action, and has been prioritized as it will allow for measurable, realistic and verifiable data to be generated in support of the objectives of safely creating BioPhoto art, while teaching and learning with others, in a growing MakerSpace community. A well governed framework can address the need to maintain anonymity and privacy, as well as an opt-in approach to managed certified access. The assumption is that activity data will drive better participant engagement, higher rates of adherence to safety protocols, and increased knowledge retention and sharing.
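A privacy-aware activity log entry could be sketched as follows. The actor/verb/object layout loosely follows the general shape of an Experience API (xAPI) statement, but this is a simplified illustration, not a validated xAPI implementation. The `makerspace.example` URLs, the salt, the member ID, and the activity name are all hypothetical. Members who have not opted in are recorded under a salted hash instead of their real identifier.

```python
import hashlib
from datetime import datetime, timezone


def pseudonymous_id(member_id: str, salt: str) -> str:
    """Derive a stable, non-reversible identifier from a member ID."""
    digest = hashlib.sha256((salt + member_id).encode("utf-8")).hexdigest()
    return digest[:16]


def build_statement(member_id, opted_in, verb, activity, salt):
    """Build a statement dict; identity is disclosed only for opted-in members."""
    name = member_id if opted_in else pseudonymous_id(member_id, salt)
    return {
        "actor": {
            "account": {"homePage": "https://makerspace.example", "name": name}
        },
        "verb": {
            "id": "http://adlnet.gov/expapi/verbs/" + verb,
            "display": {"en-US": verb},
        },
        "object": {"id": "https://makerspace.example/activities/" + activity},
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


anonymous = build_statement(
    "member-42", opted_in=False, verb="completed",
    activity="biophotolab-safety-training", salt="demo-salt",
)
print(anonymous["actor"]["account"]["name"])
```

Because the hash is stable for a given salt, cohort-level metrics can still count repeat participation without revealing who the participant is.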
Segment C: Disability and Accessibility Organizations
The governance actions related to ADA (Americans with Disabilities Act) legislation, as well as evolving WCAG (Web Content Accessibility Guidelines) represent the best scores when looking across the full range of Actors, which is an indicator that Universal Design may have a powerful impact across a wide range of people of all abilities. As I perform BioPhotoLab experiments, the lens of accessibility becomes a priority when seeking to solve human-centric challenges.
Segment D: ADA Legislation, Web Accessibility Guidelines
The target audience for governance activities is defined as any organization that supports Universal Design, Accessibility, disability awareness, legislation, advocacy, and of course, people with disabilities, including perceived, hidden, disclosed or non-disclosed. Privacy is a key consideration in this segment. The assumption is that we will safely, privately and publicly engage with this audience for maximized community engagement. This segment may also carry the most risks, in that it shows high rankings in nearly all governance Actions. A playbook is a likely solution to help drive adoption.
Reflection
The ethical concerns that arose for me this week were centered on data privacy and safety. The concept that (almost) anyone can grow (almost) anything means that extra care needs to be taken to protect and track the techniques used in synthetic bioengineering. The importance of safety training was emphasised, and there may be a pathway for online listeners as well as in-person participants. I imagined a virtual training simulator to enhance biosafety best practices, based on similar work I have done in the past.
Additionally, intellectual property needs to be managed and shared much like the history of digital images that can now be combined and altered freely. Personal DNA that can be modified for therapeutic outcomes can also lead to unintended consequences.
With Ai engines and algorithms being freely shared, the potential for Ai innovation is balanced with Ai disruption and contamination.
My proposed actions are to take a high-level view and begin to track events and actions with full context to visualize the evolving landscape, using my project, the BioPhotoLab, as a benchmark.
By “opting in” to a framework that shows participation, engagement and reflection in near realtime, we may begin to steer the behavioral data towards a desired state, and quickly identify outliers.
For participants who seek full transparency and verification, opting in with a unique identifier means that we can attribute works to an origin, and explore pathways that lead to greater discovery in an ethical and governed manner.
Risk or resistance occurs when personally identifiable data is leaked or unsecured, but the ability to discern verified sources from artificial or unethical sources may hold more weight.
In a lighter sense, tracking and visualizing behavioral change through engagement metrics and reflective feedback loops creates a culture of knowledge sharing in parallel with, or adjacent to, formally tracked and managed training completions.
Highly engaged learners and practitioners demonstrate greater levels of ethical and well-governed best practice with opportunities for continual improvement.
Appendix
Mindmap: Initial Design
Instructions:
Use Middle-Mouse-Wheel to zoom in/out
Use Left Mouse Button to pan around map
Use Reset Icon to reset view
graph TB
subgraph "BioArt Projects"
BP[BioPhotoLab]
SL[Slimeography]
BT[BioTerrain]
BS[BioScanner]
BE[BioEmulsion]
end
subgraph "Evaluate"
ASS[Assumptions]
TO[Trade-offs]
UN[Uncertainties]
SC[Scalability]
ACT[Actions]
end
subgraph "Assumptions Details"
ASS1["MakerSpace 'BioLab' dependency"]
ASS2[Knowledge Sharing through Class activities]
end
subgraph "Trade-offs Details"
TO1[Public workspace with emerging capabilities]
TO2[Anonymous utilization]
end
subgraph "Uncertainties Details"
UN1[Cloud Lab workstream availability]
UN2[Standardized best practices]
end
subgraph "Actions Framework"
PUR[Purpose: What is done now and what changes]
DES[Design: What is needed to make it work]
ASMP[Assumptions: What could you have wrong]
RISK[Risks of Failure & Success]
end
subgraph "Governance"
G1[Exist Safely]
G2[Enforce Privacy]
G3[Provide Digital Rights Management]
G4[Monetization]
G5[Integration with other frameworks]
G6[AI Influences]
G7[Enable Accessibility]
end
subgraph "Exist Safely Details"
G1A[Ensuring no contamination risk]
G1B[Providing certified lab and material safety training]
G1C[Logging all activities]
end
subgraph "Enforce Privacy Details"
G2A[Protecting personally identifiable information]
G2B[Opting in/out of managed accounts]
end
subgraph "Digital Rights Management Details"
G3A[Including metadata with attribution]
end
subgraph "AI Influences Details"
G6A[infringement]
G6B[hallucination/slop]
end
subgraph "Accessibility Details"
G7A[Meeting ADA guidelines]
G7B[Meeting WCAG3 guidelines for Web Accessibility]
G7C[Benchmarking usability]
end
subgraph "MakerSpace Administrators"
MSA1["Action 1: Encouraging technological creativity and community<br/>No Change"]
MSA2["Action 2: Membership Acquisition/Retention<br/>Recommending marketing campaigns"]
MSA3["Action 3: Funding dependent on membership dues,<br/>class revenue, donations, grants<br/>Recommending marketing campaigns and data support models"]
end
subgraph "MakerSpace Instructors"
MSI1["Action 1: Instructor-defined curriculum<br/>Must be documented and standardized<br/>Observational assessments for certification"]
MSI2["Action 2: Provide specialized training<br/>and certification to members and non-members"]
MSI3["Action 3: Succession planning,<br/>enrollment planning, certification management"]
end
subgraph "MakerSpace Members"
MSM1["Action 1: Evaluate level of knowledge,<br/>experience, interest"]
MSM2["Action 2: Participation, Knowledge Retention,<br/>Observational Assessment<br/>Using experience APIs for activity-based training"]
MSM3["Action 3: Intent to demonstrate<br/>and practice what was learned"]
MSM4["Action 4: Safe operation"]
MSM5["Action 5: Knowledge Sharing"]
end
subgraph "Disability & Accessibility Awareness Organizations"
DA1["Action 1: Accessible activities"]
DA2["Action 2: Universal Design<br/>If accessible for person with disability,<br/>good for everyone - Ron Mace"]
DA3["Action 3: Develop BioPhoto Lab<br/>that is accessible and experiential<br/>Find breakthrough in Accessibility"]
end
subgraph "Public Visitors"
PV1["Action 1: Safe, accessible lab"]
PV2["Action 2: Increase awareness of Biotechnology"]
PV3["Action 3: Increase opportunity for STEAM experiences<br/>Bio-ethical experience for public awareness"]
end
subgraph "Corporate R&D"
CR1["Action 1: Corporate experiential learning"]
CR2["Action 2: Corporate grant funding models"]
CR3["Action 3: Proprietary IP, Patents, Commercialization"]
end
subgraph "Academic Institutions"
AC1["Action 1: Knowledge Sharing with SMEs<br/>and Thought Leaders<br/>Standards of self-reported activities<br/>with data-driven dashboards"]
AC2["Action 2: Access to Cloud services and solutions"]
AC3["Action 3: Partnerships, outreach"]
end
BP --> ASS
SL --> ASS
BT --> ASS
BS --> ASS
BE --> ASS
ASS --> ASS1
ASS --> ASS2
TO --> TO1
TO --> TO2
UN --> UN1
UN --> UN2
ACT --> PUR
ACT --> DES
ACT --> ASMP
ACT --> RISK
PUR --> G1
DES --> G1
ASMP --> G1
RISK --> G1
G1 --> G1A
G1 --> G1B
G1 --> G1C
G2 --> G2A
G2 --> G2B
G3 --> G3A
G6 --> G6A
G6 --> G6B
G7 --> G7A
G7 --> G7B
G7 --> G7C
G1 --> MSA1
G1 --> MSI1
G1 --> MSM1
G1 --> DA1
G1 --> PV1
G1 --> CR1
G1 --> AC1
MSA1 --> MSA2
MSA2 --> MSA3
MSI1 --> MSI2
MSI2 --> MSI3
MSM1 --> MSM2
MSM2 --> MSM3
MSM3 --> MSM4
MSM4 --> MSM5
DA1 --> DA2
DA2 --> DA3
PV1 --> PV2
PV2 --> PV3
CR1 --> CR2
CR2 --> CR3
AC1 --> AC2
AC2 --> AC3
style BP fill:#90EE90
style SL fill:#90EE90
style BT fill:#90EE90
style BS fill:#90EE90
style BE fill:#90EE90
style G7 fill:#FFD700
style DA3 fill:#FFD700
Actor Governance Analysis
MakerSpace Administrators
The administrative role centers on sustaining and scaling the Makerspace’s core mission of encouraging technological creativity, learning-by-making, and community knowledge sharing. While the foundational purpose remains unchanged, key risks and assumptions relate to long-term viability: membership acquisition and retention directly influence funding, which is currently dependent on a mix of dues, class revenue, donations, grants, and member self-funding. These revenue streams are inconsistent and time-bound, particularly with respect to rent and grants. The proposed response emphasizes data-informed marketing campaigns to support membership growth and to generate evidence that can unlock alternative or supplemental funding models, while recognizing that not all donations are monetary and that growth must be matched with training capacity and governance maturity.
MakerSpace Instructors
Instructors are positioned as self-governing designers of curriculum and learning objectives, with responsibility extending beyond instruction to documentation, standardization, and succession planning. To ensure continuity, growth, and safety, curricula must be formalized and paired with clear rubrics that support observational assessment, certification, and compliance. The instructional purpose includes delivering specialized training and certifications to both members and non-members, reinforcing the Makerspace’s educational value. However, risks emerge around instructor availability, enrollment planning, certification management, and long-term succession, requiring governance structures that prevent knowledge silos and instructor burnout while maintaining consistent evaluation standards.
MakerSpace Members
Member participation is highly variable in terms of prior knowledge, experience, interests, and learning styles, which introduces significant assumptions into training and access models. A key misconception addressed is that class attendance alone equates to workcell access or operational competence. Because the Makerspace is not an accredited institution and learning is voluntary and experiential, governance must prioritize measurable, repeatable engagement over simple completion metrics. The proposal emphasizes observational assessment, feedback loops, and the use of privacy-preserving experience APIs to assess program “health” at a cohort level. Certification is non-anonymous and may lead to expanded access, increasing the importance of intent to practice, safe operation, and knowledge sharing as ongoing responsibilities rather than one-time achievements.
Disability & Accessibility Awareness Organizations
Accessibility organizations contribute assumptions, design principles, and purpose grounded in Universal Design, particularly the idea that solutions accessible to people with disabilities ultimately benefit everyone. Their involvement centers on ensuring activities are meaningfully accessible and on co-designing experiences that address unmet needs within the disability and accessibility community. The proposed BioPhoto Lab workcell serves as a concrete demonstration of how bioengineering aligns naturally with Universal Design principles, offering an experiential, inclusive activity suitable for a wide range of abilities. Beyond compliance, the aspirational goal is to enable innovation that could lead to genuine breakthroughs in accessibility, positioning the Makerspace as a site of applied, inclusive experimentation.
Public Visitors
For public visitors, the primary assumptions are that the Makerspace environment must be demonstrably safe, accessible, and well-governed. The purpose of engagement is to increase awareness of biotechnology and related STEAM fields through carefully designed, bio-ethical experiences that are approachable without requiring prior expertise. By lowering barriers to entry and emphasizing safety and accessibility, these public-facing experiences can serve as both educational outreach and a pathway to deeper participation, including eventual membership. Effective governance is essential here, as public interactions directly shape reputation, trust, and the perceived legitimacy of biotechnology in a community context.
Corporate R&D
Corporate R&D engagement is framed around experiential learning opportunities and potential grant-based funding models, with assumptions that industry partners may support exploratory, pre-competitive activities. However, significant risks arise around proprietary information, intellectual property, patents, and commercialization pathways. Governance must therefore clearly delineate boundaries between open, educational activities and protected corporate interests. Without explicit controls, collaboration risks either chilling participation due to IP concerns or unintentionally exposing proprietary assets, making this actor group highly sensitive to policy clarity and contractual safeguards.
Academic Institutions
Collaboration with academic institutions is intended to elevate the Makerspace by integrating subject-matter expertise, thought leadership, and social learning into a broader lifelong learning framework. The purpose is not formal accreditation but the creation of a shared baseline for advancing the “Art of Biotechnology” as a multidisciplinary medium. Assumptions include access to cloud services and digital infrastructure that support self-reported activity tracking, attribution, and data-driven dashboards. These tools enable scientific reflection, reproducibility, and deeper collaboration while allowing activities to be traced back to their original context. Partnerships and outreach are therefore central design elements, positioning the Makerspace as a bridge between academic rigor and experiential, community-based learning.
Ai Prompt References
The Governance Policy section was distilled directly from my original “Mind Map” using ChatGPT 5.2 with the following prompt:
You are a biotechnology research scientist creating a governance model around the introduction of a new activity within a Makerspace BioArt lab. Using the exact verbiage provided without changing the intent, summarize this mind-map with topics into a clear, concise summary starting with a high level overview, a bold statement of purpose, and a well-organized matrix of options that can be ranked.
The Matrix was created from the source MindMap using the following prompt:
Create a scoring matrix from 1-3 or n/a for the following ACTORS compared to the ACTIONS listed. Maintain strict hierarchy:
Actions
Purpose, Design, Assumptions, Risks of Failure & “Success”
Purpose: What is done now and what changes are you proposing?
Design: What is needed to make it “work”? (including the actor(s) involved - who must opt-in, fund, approve, or implement, etc)
Assumptions: What could you have wrong (incorrect assumptions, uncertainties)?
Risks of Failure & “Success”: How might this fail, including any unintended consequences of the “success” of your proposed actions?
A series of refinement prompts was required to get the results into a usable format for ranking purposes.
Matrix Refinement with ChatGPT 5.2
Create a governance scoring matrix
You asked for a 1–3 / N-A scoring matrix comparing Actors against Actions, with strict hierarchy across Purpose, Design, Assumptions, Risks, and a detailed governance rubric (safety, privacy, DRM, monetization, AI, accessibility).
Clarify that scores should be user-entered, not assigned
You corrected the approach to request a blank scoring table where you would enter values using the rubric (1 = best, 3 = worst).
Group the matrix by numbered Actions in a single table
You requested consolidation into one unified table, grouped by Action numbers rather than separate dimensions.
Create a ranking table with rubric columns
You specified the exact governance rubric to become table columns and asked that scoring cells be provided for each Actor × Action.
Group content by Actor
You instructed that Actions be grouped under each Actor rather than flattened globally.
Pre-populate N/A where governance criteria did not apply
You asked for governance cells to be marked N/A when an Actor did not reference or control that rubric area.
Revert to the unpopulated version
You reversed that decision and requested a clean matrix with no prefilled values.
Recreate the matrix grouped under each Actor
You asked for a fresh rebuild, preserving grouping and rubric structure.
Remove N/A and revert one step again
You clarified you wanted a fully blank, neutral baseline.
Further normalize the Action column
You instructed that:
“Action 1 / 2 / 3” become a unique row ID
Parenthetical terms (Purpose, Design, Risk, Assumptions) become a new parsed field
Descriptions be separated cleanly for analysis
Add narrative summaries under each Actor header
You asked that the explanatory text originally in parentheses be synthesized into plain-language summary paragraphs under each Actor.
Convert the matrix into a Google Sheet
You requested export of the structured matrix into a Google-Sheets-compatible file.
Confirm whether actor summaries were included in the export
You explicitly asked for verification and correction if the narrative summaries were missing.
Resolve the omission if needed
You implicitly authorized modification of the file to include the summaries, which resulted in adding a second sheet.
Summarize all prompts from this session
You requested this final meta-summary of your own prompts.
The Actors (Personas) were further refined using the following Prompt:
Under each Actor group, include a paragraph that summarizes the commentary provided under sections that contained ()
Response: (See Appendix for results)
Below are clear narrative summaries, one paragraph under each Actor group, synthesizing only the commentary you provided in the parenthetical sections. I’ve stayed faithful to your language and intent, tightening it into governance-ready prose without adding new assumptions.
The Week 2 Prework can be found under /honmework/Week2_Lecture_Prep/_index.md. It relies heavily on ChatGPT to formulate answers while cross-referencing content for better understanding.
Subsections of Week 1 HW: Principles and Practices
Week 1 HW: Week 2 Lecture Prep
Lecture Prep:
My approach was to view the slides, then seek direct answers via ChatGPT, then review the slides to find corresponding answers. It is allowing me to begin to comprehend the depth of the subject matter. I look forward to the reinforcing live presentations.
Professor Jacobson
Nature’s machinery for copying DNA is called polymerase. What is the error rate of polymerase? How does this compare to the length of the human genome? How does biology deal with that discrepancy?
1:10⁶ (about one error per million bases copied)
In contrast, the human genome is about 3 × 10⁹ base pairs long, roughly three orders of magnitude larger.
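To make the discrepancy concrete, here is a minimal sketch that multiplies the two figures from this answer: an error rate of about 1 in 10⁶ and a genome of about 3 × 10⁹ base pairs. The variable and function names are my own, and the result is only an order-of-magnitude estimate of errors per genome copy, before any proofreading or mismatch repair that biology applies afterwards.

```python
# Order-of-magnitude estimate of copying errors per genome replication.
ERROR_RATE = 1e-6      # about one error per 10^6 bases copied (figure quoted in this answer)
GENOME_LENGTH = 3e9    # about 3 x 10^9 base pairs in the human genome


def expected_errors(error_rate: float, length: float) -> float:
    """Mean number of errors when `length` bases are copied once."""
    return error_rate * length


errors_per_copy = expected_errors(ERROR_RATE, GENOME_LENGTH)
print(f"Expected errors per genome copy: {errors_per_copy:.0f}")
```

Thousands of uncorrected errors per copy would be unsustainable, which is why additional error-correction steps matter.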
How many different ways are there to code (DNA nucleotide code) for an average human protein?
Average human protein length ≈ 400 amino acids
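A rough way to put a number on this is sketched below. It assumes the standard genetic code (61 sense codons for 20 amino acids) and, as a simplification of my own, treats every residue as having the same average number of synonymous codons. The real count depends on the actual amino acid composition, so this is an estimate only.

```python
import math

PROTEIN_LENGTH = 400   # average human protein length from the answer above
SENSE_CODONS = 61      # 64 codons minus 3 stop codons (standard genetic code)
AMINO_ACIDS = 20

# Simplifying assumption: each residue has the same average number of codons.
avg_codons_per_residue = SENSE_CODONS / AMINO_ACIDS  # about 3.05

# Work in log10 because the raw count is far too large to read comfortably.
log10_encodings = PROTEIN_LENGTH * math.log10(avg_codons_per_residue)
print(f"Roughly 10^{log10_encodings:.0f} distinct DNA sequences for one protein")
```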
In practice what are some of the reasons that all of these different codes don’t work to code for the protein of interest?
Because DNA is not just a protein recipe. The sequence carries many layers of information beyond amino acids.
Dr. LeProust
What’s the most commonly used method for oligo synthesis currently?
Phosphoramidite solid-phase synthesis
Why is it difficult to make oligos longer than 200nt via direct synthesis?
Small per-base imperfections compound exponentially, and the chemistry has no way to “fix” them once they happen.
Why can’t you make a 2000bp gene via direct oligo synthesis?
Because chemical oligo synthesis breaks down long before you reach that length, for fundamental probabilistic, chemical, and practical reasons. A 2000 bp gene is two orders of magnitude beyond what direct synthesis can support.
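The compounding effect described in the two answers above can be sketched numerically. The 99% per-base coupling efficiency used here is a hypothetical value chosen for illustration, not a figure from the lecture, and the model ignores every other source of loss.

```python
def full_length_fraction(coupling_efficiency: float, length_nt: int) -> float:
    """Fraction of strands with no failed coupling after `length_nt` additions."""
    return coupling_efficiency ** length_nt


ASSUMED_EFFICIENCY = 0.99  # hypothetical per-base efficiency, for illustration only

for length in (20, 200, 2000):
    frac = full_length_fraction(ASSUMED_EFFICIENCY, length)
    print(f"{length:>5} nt: {frac:.2e} of strands are full length")
```

Under this assumption a 200 nt oligo still yields a usable fraction of full-length product, while a 2000 nt strand yields essentially none, which is why longer genes are assembled from shorter oligos.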
Professor Church
[Using Google & Prof. Church’s slide #4] What are the 10 essential amino acids in all animals and how does this affect your view of the “Lysine Contingency”?
The 10 essential amino acids
Histidine
Isoleucine
Leucine
Lysine
Methionine
Phenylalanine
Threonine
Tryptophan
Valine
Arginine
My view is now informed by the concept that “No lysine available → the organism stops functioning”.
[Given slides #2 & 4 (AA:NA and NA:NA codes)] What code would you suggest for AA:AA interactions? I need a more fundamental understanding to respond.
[(Advanced students)] Given the one paragraph abstracts for these real 2026 grant programs sketch a response to one of them or devise one of your own:
Part 2: Gel Art - Restriction Digests and Gel Electrophoresis (Optional- for those with Lab access)
Design Simulation
Part 3: DNA Design Challenge
3.1 Choose your Protein
3.2 Reverse Translate: Protein (amino acid) sequence to DNA (nucleotide) sequence.
3.3 Codon optimization
3.4. You have a sequence! Now what?
3.5. [Optional] How does it work in nature/biological systems?
Part 4: Prepare a Twist DNA Synthesis Order
4.1. Create a Twist account and a Benchling account
4.2. Build Your DNA Insert Sequence
4.3. On Twist, Select The “Genes” Option
4.4. Select “Clonal Genes” option
4.5. Import your sequence
4.6. Choose Your Vector
Part 5: DNA Read/Write/Edit
5.1 DNA Read
(i) What DNA would you want to sequence (e.g., read) and why?
(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?
5.2 DNA Write
(i) What DNA would you want to synthesize (e.g., write) and why?
(ii) What technology or technologies would you use to perform this DNA synthesis and why?
5.3 DNA Edit
(i) What DNA would you want to edit and why?
(ii) What technology or technologies would you use to perform these DNA edits and why?
Part 1: Benchling & In-silico Gel Art
In this section, I was able to successfully sign up for Benchling, request to join HTGAA (pending), and create a new project. I was able to find the Lambda DNA sequence in the FASTA database, which I copied and pasted. I then found the downloadable file in GenBank, which I imported into Benchling. It took me a few tries to get multiple Digests to appear; it worked once I selected multiple restriction enzymes and ordered the tabs before running Virtual Digest. I exported the resulting image as a .PNG, and saved my NC_001416 Project “Linear Map” and “Sequence Map”, along with the Lambda Map from GenBank, as PDFs for future reference.
Part 2: Gel Art
Illustration by Eric Schneider: Simulated Gel Electrophoresis using node based image editing software, “Adobe Substance Designer”
Part 3: DNA Design Challenge
3.1 Choose your Protein
I chose Candida antarctica Lipase B (CalB) since it has the ability to break down polylactic acid, or PLA, a commonly used filament in 3D printing.
My design intent is to reduce the amount of microplastics that reach our ecosystem. The best place to start remediation may be at the source; the waste stream of PLA that is generated in a MakerSpace. By introducing a biological process that depolymerizes PLA waste, we may contribute to a solution while providing governance and building awareness.
From a BioArt perspective, this is the first step in creating and containing the lactic acid and CO2 that may be generated, for downstream use in feeding and growing colorful algae. In turn, powdered algae pigment can be extracted, showing how PLA can help to create colorful pigments used in painting and other mediums.
3.2 Reverse Translate
I was able to find a suitable Protein for this design challenge by using AI prompts and comparing results between ChatGPT and Claude. ChatGPT led me to Proteinase K (https://www.ncbi.nlm.nih.gov/nuccore/X14689), which turned out to be very challenging due to the complexity of the construct, and actually caused Twist to “freeze” when attempting to synthesize.
I even conducted a rapid experiment where I asked Claude AI to provide the translation, which it surprisingly did, very confidently. However, I ran into the same complexities when attempting to create a TWIST order.
I went back to the NIH database and found the C.antarctica (LF 058) gene for lipase B (https://www.ncbi.nlm.nih.gov/nuccore/Z30645), which, according to Claude AI, would lead to better results with less complexity. I exported a FASTA file for the protein’s genetic structure.
In TWIST, the Lipase B approach fell into the “standard” complexity level, so I am sticking with that translation. This also validates that the follow-up Claude AI inquiry led to a good result. (See the Appendix for a summary of prompt usage.)
3.3 Codon Optimization
I used the Twist tool to optimize Codons. It showed me two regions that had repeating sequences that could be optimized.
Question: It seems that the Start and Stop codons were automatically added in, as ATG, and TAA but I want to better understand when and how to ensure they are present manually, with dependency on selected expression. (Note: Answer was found by properly annotating)
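As a complement to annotating in Benchling, the start and stop codon question can also be checked with a few lines of Python. This is a minimal sketch under my own assumptions: the function name `check_orf` is hypothetical, the example sequence is made up (it is not the CalB insert), and the check only covers the three standard stop codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard stop codons


def check_orf(seq: str) -> dict:
    """Report whether a coding sequence has a start codon, a stop codon, and a whole number of codons."""
    seq = seq.upper().replace(" ", "").replace("\n", "")
    return {
        "starts_with_atg": seq.startswith("ATG"),
        "ends_with_stop": seq[-3:] in STOP_CODONS,
        "length_multiple_of_3": len(seq) % 3 == 0,
    }


example = "ATGGCTAAATAA"  # made-up 12-nt sequence for illustration
print(check_orf(example))
```

Running a check like this on the exported sequence before building an order would flag a missing ATG or stop codon early.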
I chose E. coli as I learned that it is predictable and suitable for this sequence. Yeast may be used for a higher yield, but possibly with more optimization of repeating codons needed. I completed the Twist optimization and downloaded the sequence to view in Benchling to learn more about the structure.
3.4 You have a sequence! Now what?
This protein can be produced from DNA made by either clonal or strand synthesis. The DNA sequence I have identified can be inserted into a host plasmid, which is cloned in an industrial-scale lab that can provide quality, speed and editing capabilities. The cell-based method provides more synthetic control and expected outcomes, acting like a factory. The cell-free method may introduce toxins and have lower yield. In advanced industrial production, both may work together for rapid prototyping and scalability.
3.5 [Optional] How does it work in nature/biological systems?
The ability to transcribe from different start points in the sequence leads to diversity in the proteins created.
I realized that my prior attempt to create an order was incomplete, as I had not fully optimized or annotated my sequence. I started “from scratch” and optimized my sequence in TWIST, then exported back to Benchling, where I prepared a sequence with the proper annotations. I took this back into Twist and prepared an order. I exported the new Plasmid back to Benchling. This “answered” my initial question related to Annotating start and stop codons, which was a key learning for me.
Part 5: DNA Read/Write/Edit
5.1 DNA Read
(i) What DNA would you want to sequence (e.g., read) and why?
I would like to sequence the DNA of a Lipase as it appears to be well suited for the depolymerization of PLA. I would also like to sequence a Cutinase as it has similar properties, as well as Proteinase K, which may be best for industrial-scale applications. I am intrigued by the potential for a hybrid solution. I am also interested in harnessing any CO2 emissions for downstream processing or pigmented algae growth.
(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?
I would use UniProt to locate Proteins with sequences.
I would use TWIST or other standalone optimization tools to minimize repeats in a sequence.
I would use Benchling as the primary method of visualizing sequences, so I can annotate and construct sequences with a better probability of success when ordering Clones or Strands.
I would use TWIST for the speed, quality, and configuration capabilities when building Plasmids.
I would again use Benchling to visualize Plasmids once constructed.
I also learned about ChimeraX, which can visualize Nucleotides and molecular bonds in 3D.
5.2 DNA Write
(i) What DNA would you want to synthesize (e.g., write) and why?
In support of my PLA depolymerization design, I would want to manage and control the throughput through synthetic means, rather than depending on natural biodegradation, which may happen only under optimal conditions such as heat, sunlight/UV, and the presence of enzyme-producing organisms.
(ii) What technology or technologies would you use to perform this DNA synthesis and why?
Using a technology like TWIST, together with a safe and operational synthetic biology lab, I feel that a repeatable solution can be designed that scales to the global use case: reducing microplastic waste from 3D-printed PLA filament.
5.3 DNA Edit
(i) What DNA would you want to edit and why?
I would like to edit the DNA of enzymes that biodegrade PLA to create higher yield, lower temperature requirements, and safe industrial processing to ensure the production is accessible to the quickly growing market segment. This may lead to greater awareness of the growing problem of microplastics through educational Makerspace activities that demonstrate this concept.
(ii) What technology or technologies would you use to perform these DNA edits and why?
I would start with well-tested and proven enzymes such as Lipase B to establish a baseline for any future experimentation. I would follow well-defined procedures for synthesizing DNA. For example, E. coli is deemed a good host, and yeast is also compatible. Once I have validated that a sequence can be synthesized, I would like to order via Twist and collaborate with a Node Lab to conduct a PLA experiment with a control group, measuring pH, CO2 emissions, and weight delta, as well as monitoring temperature.
Appendix
Ai Prompts
ChatGPT was used to explore the environmental and ecological impact of microplastics, which led me to the idea of capturing waste at the source.
Here is a condensed list of prompt themes used:
What biological system (enzyme) can depolymerize PLA into lactic acid?
What environmental problem does PLA create, especially regarding microplastic persistence in oceans?
How can PLA waste be prevented from entering mixed waste streams through source segregation?
What experimental conditions are required to depolymerize PLA at small scale?
How can successful depolymerization be quantitatively measured (mass balance and lactic acid detection)?
How can the experiment avoid generating microplastic through mechanical fragmentation?
What happens to PLA in marine environments or when ingested by sea life?
How can lactic acid or derived CO₂ be reused in biological systems (plants, algae)?
How can algae-derived pigment serve as a material outcome of the carbon loop?
Claude AI seemed to better understand the bioengineering context:
What are some examples of polyester hydrolase
what (enzymes) cuts down PLA the best
Confirm which are considered synthetic and effective
what is proteinase K derived from
what enzyme will work best for DNA replication
is eColi or yeast better
clarify cell-dependent or cell-free methods, of synthetic biology
Week 3 HW: Lab Automation
Focus on Lab Automation research, with creative examples of Opentrons instruction sets using Python. Final project slide to be included in Node deck.
I was able to quickly upload an image and randomize the colors, to generate a point paired data set.
I really like the bitmap rasterization and creative expression found in the gallery.
My investigation is based on my background in high resolution digital imaging. I wanted to better understand the pixel to microliter (uL) relationship. With a 200 uL maximum quantity and a 90-100 mm Petri dish, there are some basic constraints.
I look at that as an opportunity and design challenge to maximize resolution for the purpose of future scientific discovery. Similar to Moore’s law of exponential growth, the imaging industry has experienced the same trends, given today’s 8K resolution and greater camera sensors.
Another reference point is Twist Bioscience, which has discovered how to overcome scale and quality limitations by moving a defined lab-scale process onto a silicon-based platform.
My approach was to explore how vector based graphics, defined by a series of points and splines, could be leveraged to create what is considered “infinite resolution” or at the very least, scalable and adjustable to meet the target output.
SVG, or “Scalable Vector Graphics” are the source of my BioArt for this activity. The entire library of icons we use in this Markdown format is a good example of what’s possible!
I used Claude Ai to explore a web-friendly code base that would allow me to generate the key value pairs needed to script a Python function in the Opentrons protocol. The React/JS framework made it possible to design a User Interface (Ui) that allows for selection of any SVG, to render a resolution independent sample to the screen.
Dynamic features include assignment of a Color from an available list and an increase in “Pitch”, which is the number of points that are spaced along the computed line segments. Most important is “Radius”, which includes a value for uL that relates to the size of a droplet in OpenTrons.
The output is a PNG for a quick visual reference, and a JSON file or Text file for future parsing.
I chose a simple Copy/Paste Text field to obtain the list of x,y point pairs, for use in Python for Opentrons.
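The core idea of turning a vector path into x,y point pairs can be sketched in Python. This is not the code of my React/JS web app; it is a simplified illustration that treats a path as a polyline and, as an assumption for this sketch, interprets pitch as a spacing in millimeters. The function name `sample_polyline` is hypothetical.

```python
import math


def sample_polyline(points, pitch_mm):
    """Return points spaced about `pitch_mm` apart, handling each segment individually."""
    samples = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        steps = max(1, round(seg_len / pitch_mm))  # at least one point per segment
        for k in range(steps):
            t = k / steps
            # Round to 0.1 mm, matching the precision of the pasted coordinate lists.
            samples.append([round(x0 + t * (x1 - x0), 1), round(y0 + t * (y1 - y0), 1)])
    # Close the path with the final vertex.
    samples.append([round(points[-1][0], 1), round(points[-1][1], 1)])
    return samples


corner = [(0, 0), (10, 0), (10, 10)]  # made-up L-shaped path, in mm
coords = sample_polyline(corner, pitch_mm=2.0)
print(len(coords), coords[:3])
```

A smaller pitch gives more droplets per segment, which is the resolution control the converter exposes.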
Screenshot of SVG-to-Opentrons Converter web app by Eric Schneider
I processed several sample images and ran into a slight issue with how SVG segments are deemed continuous, so I refined the parser to handle each line segment individually. I also introduced GitHub to maintain version control, since a web application can quickly grow, or become corrupted, when edited by AI agents.
I then focused on ensuring the web application could appear inside of our preferred Colab environment, using Python and iFrame libraries. However, that is “sandboxed” and can’t share data directly. (Which is why the copy-paste is important to expedite). I tried to replicate the solution in Colab, but most things broke.
I moved on to the Opentron Simulator in Colab, with my new Data Set.
I have an intermediate understanding of coding, and with the help of Claude Ai, I was able to articulate my need for a recursive list that would not only plot the points needed for pipetting, but also manage aspiration in batches of 20, not exceeding 200 uL.
After some basic Python formatting errors, I was able to preview the results via the Simulation module, and it was a very close match to my design intent.
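The batching logic can be previewed without a robot or simulator. The sketch below is a plain-Python illustration using an ordinary loop over slices (not recursion). The function name `plan_batches` is hypothetical, and the default capacity of 20 uL is an assumption based on the P20 pipette named in the appendix protocol; it is a parameter so a different limit can be supplied.

```python
def plan_batches(coords, batch_size=20, ul_per_dot=1.0, max_ul=20.0):
    """Split coordinates into aspirate-once, dispense-many batches."""
    if batch_size * ul_per_dot > max_ul:
        raise ValueError("a full batch would exceed the pipette capacity")
    plan = []
    for start in range(0, len(coords), batch_size):
        batch = coords[start:start + batch_size]
        # Aspirate only what this batch needs, so the last batch may be smaller.
        plan.append({"aspirate_ul": len(batch) * ul_per_dot, "points": batch})
    return plan


demo = [[x * 0.5, 0.0] for x in range(45)]  # 45 made-up points
for n, step in enumerate(plan_batches(demo), start=1):
    print(f"Batch {n}: aspirate {step['aspirate_ul']} uL for {len(step['points'])} dots")
```

Checking the plan this way catches volume mistakes before any liquid is handled.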
Reflection:
I noticed that I was able to control the results of Vector for a high quality line that uses the full range of X, Y to a tenth of a millimeter (1 decimal place). Of course there is still the limitation of 80 mm diameter and 200 uL saturation, but I am encouraged that this technique can be refined for the purpose of high resolution design intent. I’m thinking about:
BioCircuits that follow continual line traces for current
BioSensors with defined sizes and shapes that are scalable
BioArt that mirrors iconography and symbols, with dot-pitch resolution controls.
BioPhotos that strive for incremental bitmap resolution at the microscopic level.
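The two constraints mentioned in this reflection, 0.1 mm precision and an 80 mm working diameter, can be checked on a coordinate list before it is pasted into a protocol. This is a minimal sketch with a hypothetical function name, and it assumes the coordinates are measured from the center of the dish.

```python
import math


def fit_check(coords, diameter_mm=80.0):
    """Round points to 0.1 mm and list any that fall outside a centered circle."""
    radius = diameter_mm / 2
    rounded = [[round(x, 1), round(y, 1)] for x, y in coords]
    outside = [p for p in rounded if math.hypot(p[0], p[1]) > radius]
    return rounded, outside


# Made-up test points: the last one lies just beyond a 40 mm radius.
rounded, outside = fit_check([[-6.14, 26.83], [26.8, -15.9], [39.0, 12.0]])
print("rounded:", rounded)
print("outside the dish:", outside)
```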
Imaging App- Future enhancement ideas:
Z depth may impact Radius.
Multiple SVG Layers, for multi-color assignments.
Save/load to a repository
Data sharing with Colab workspaces.
Integration of JSON for data sharing
Replicate application in Python in Colab natively.
Integrate color selection into color location.
Branching existing Automation Art code and exploring how to contribute to codebase.
OpenTrons Lab:
I was able to coordinate a working session with an OpenTrons OT-2, with Karen Ingram at the Charlotte Makerspace “BioArt Studio” which is an emerging destination for bioscience and art.
We attempted to load my protocol with vectorized points, but we encountered errors partially due to some code bugs which were quickly resolved. However, my Labware profiles were not defined for this platform configuration.
We deferred additional debugging in favor of using a known working Protocol for this session, which led to the output shown here. This is a good test since it shows the current state of functionality.
I learned how to launch and calibrate the equipment for an automated production run. I also observed an opportunity to 3D print a calibration target that would make it easier to center the gantry over a printable art medium, like watercolor paper inside of a Petri dish. We discussed a custom hold-down to keep the paper flat for more control over quality.
Our BioArt Studio session concluded with a request for a copy of a working Protocol file, so I could “reverse engineer” it and configure my Protocol with the correct Labware settings. I installed a local copy of the OpenTrons controller app and edited the script to include the available Labware and to suppress the Thermal plate, as it is not used in this model. This required adjustments to the handling of the Z axis.
Our next working session will fine-tune and test the Automation & Design protocol.
Update: 4/25/26 - The Protocol file was updated with the reassigned Labware, and I was able to run the following design at 0.5uL with success:
Research Paper
I am sharing a link to an essay written by Karen Ingram that illustrates the influence of automation on BioArt, including OpenTrons OT-2 renderings.
I am excited about the field of synthetic Bioscience and Art as a result of our recent collaboration. I am grateful for the knowledge sharing and access to the BioLab.
Final Project
My Final project has been positively influenced by this week’s automation activity, as it validates that I can strive to achieve some specific lab results using the automated OpenTrons OT-2 as a tool in the process.
The path I will take for my final project starts with the identification of a Protein that can be synthesized to ensure my work is based on biotechnology best practices. The use of TWIST as a provider of automated creation of a Plasmid is the 1st step in the automation workflow.
Once I have a product, I expect to use the OpenTrons automation platform to construct a series of experiments in a host medium that will Grow into Art.
I plan on 3D printing supporting assemblies that will allow me to grow a photographic “film negative” plate, which could be a modified Petri dish that acts as a film back on a customized camera body and lighting rig.
I plan on creating a unique “exposure calibration” plate that will assist in lab test cases.
My long-range goal is to achieve a sustainable, repeatable solution that leverages automation and can scale up based on future demand for a BioPhoto “Lab” experience. I believe we are at a pivotal moment in science and automation, similar to when George Eastman revolutionized the photography industry through film and camera development for mass consumption. Many other industrial design solutions surround this theme.
My Final Project will reflect (and develop) artifacts of biotechnology and photography.
Checklist:
Review this week’s recitation and this week’s lab for details on the Opentrons and programming it.
Write your own Python script which draws your design using the Opentrons.
If you use AI to help complete this homework or lab, document how you used AI and which models made contributions.
Sign up for a robot time slot if you are at MIT/Harvard/Wellesley or at a Node offering Opentrons automation.(Alt:MakerspaceCharlotte)
Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.
Write a description about what you intend to do with automation tools for your final project.
Final Project Ideas - Submit one slide to Node
Appendix - Python Code
from opentrons import types

metadata = {
    'author': 'Eric Schneider',
    'protocolName': 'Rasterizr',
    'description': 'SVG to OT',
    'source': 'HTGAA 2026 Opentrons Lab',
    'apiLevel': '2.20'  # 2.7
}

##############################################################################
###   Robot deck setup constants - don't change these
##############################################################################
# original HTGAA:
#TIP_RACK_DECK_SLOT = 9 #HTGAA
#COLORS_DECK_SLOT = 6 #HTGAA
#AGAR_DECK_SLOT = 5 #HTGAA
#PIPETTE_STARTING_TIP_WELL = 'A1'

#Makerspace Charlotte:
TIP_RACK_DECK_SLOT = 6  #MSC
COLORS_DECK_SLOT = 3  #MSC
AGAR_DECK_SLOT = 1  #MSC
PIPETTE_STARTING_TIP_WELL = 'A1'  # *****TO BE CONFIRMED****

# TO DO: update these colors and wells to match your actual color plate layout
well_colors = {
    'A1': 'Red',
    'B1': 'Green',
    'C1': 'Orange'
}


def run(protocol):
    ##############################################################################
    ###   Load labware, modules and pipettes
    ##############################################################################

    # Tips
    tips_20ul = protocol.load_labware('opentrons_96_tiprack_20ul', TIP_RACK_DECK_SLOT, 'Opentrons 20uL Tips')

    # Pipettes
    pipette_20ul = protocol.load_instrument("p20_single_gen2", "right", [tips_20ul])  #HTGAA same

    # Modules
    # temperature_module = protocol.load_module('temperature module gen2', COLORS_DECK_SLOT) #HTGAA temp module only, (not MSC)

    # Temperature Module Plate
    #temperature_plate = temperature_module.load_labware(
    #    'opentrons_96_aluminumblock_generic_pcr_strip_200ul', #HTGAA
    #    'opentrons_6_tuberack_nest_50ml_conical'
    #    'Cold Plate'
    #    )

    # Choose where to take the colors from
    #color_plate = temperature_plate
    #new no temperature module that adds Z height issue
    color_plate = protocol.load_labware('opentrons_6_tuberack_nest_50ml_conical', COLORS_DECK_SLOT)

    # Agar Plate
    # agar_plate = protocol.load_labware('htgaa_agar_plate', AGAR_DECK_SLOT, 'Agar Plate') #HTGAA
    #Makerspace Charlotte CUSTOM AGAR PLATE 3D PRINTED WITH PETRIE DISH HOLDER
    agar_plate = protocol.load_labware('biorad_96_wellplate_200ul_pcr', AGAR_DECK_SLOT, 'Agar Plate')

    # Get the top-center of the plate, make sure the plate was calibrated before running this
    center_location = agar_plate['A1'].top()

    pipette_20ul.starting_tip = tips_20ul.well(PIPETTE_STARTING_TIP_WELL)

    ##############################################################################
    ###   Patterning
    ##############################################################################

    ###
    ### Helper functions for this lab
    ###

    # pass this e.g. 'Red' and get back a Location which can be passed to aspirate()
    def location_of_color(color_string):
        for well, color in well_colors.items():
            if color.lower() == color_string.lower():
                return color_plate[well]
        raise ValueError(f"No well found with color {color_string}")

    # For this lab, instead of calling pipette.dispense(1, loc) use this: dispense_and_detach(pipette, 1, loc)
    def dispense_and_detach(pipette, volume, location):
        """
        Move laterally 5mm above the plate (to avoid smearing a drop); then drop down to the plate,
        dispense, move back up 5mm to detach drop, and stay high to be ready for next lateral move.
        """
        assert(isinstance(volume, (int, float)))
        #above_location = location.move(types.Point(z=location.point.z + 5)) #original HTGAA
        above_location = location.move(types.Point(z=5))
        pipette.move_to(above_location)
        pipette.dispense(volume, location)
        pipette.move_to(above_location)

    ###
    ### YOUR CODE HERE to create your design
    ###

    ## reminder set Z
    agar_plate.set_offset(x=0.00, y=0.00, z=0.00)

    # start by picking up tip
    pipette_20ul.pick_up_tip()

    # PASTE a list of Current Coordinates (will be dynamic load once integrated or automated)
    currentCoords = [
        [-6.1,26.8],[-7.9,25.7],[-8.6,23.8],[-9.9,22.6],[-11.3,21.5],[-12.1,19.7],[-14.2,19.7],[-15.3,21.4],[-17.2,22.3],[-19.3,22],
        [-20.8,20.7],[-21.4,18.7],[-20.8,16.7],[-19.2,15.4],[-17.2,15.1],[-15.3,16.1],[-14.2,17.8],[-12.1,17.8],[-11.3,15.9],[-9.9,14.8],
        [-8.6,13.6],[-7.3,12.3],[-5.9,11.1],[-6.4,9.6],[-8.5,9.6],[-10.6,9.5],[-12.1,8.1],[-12.3,6],[-13.3,4.9],[-15.4,4.4],
        [-17.3,3.6],[-19.1,2.5],[-20.6,1],[-21.8,-0.7],[-22.7,-2.6],[-23.4,-4.6],[-23.8,-6.6],[-24.1,-8.7],[-24.6,-10.6],[-26,-12.1],
        [-26.8,-14.1],[-26.7,-16.2],[-25.9,-18.1],[-24.4,-19.5],[-23.4,-18.3],[-24.7,-16.6],[-25,-14.6],[-24.2,-12.7],[-22.5,-11.6],[-20.4,-11.5],
        [-18.7,-12.7],[-17.8,-14.6],[-18.1,-16.6],[-19.4,-18.2],[-18.5,-19.5],[-16.9,-18.1],[-16.1,-16.2],[-16.1,-14.1],[-16.8,-12.2],[-18.2,-10.6],
        [-18.6,-8.8],[-18.3,-6.7],[-17.7,-4.7],[-16.7,-2.9],[-15.1,-1.5],[-13.2,-0.6],[-12.3,-1.9],[-12.3,-4],[-12.3,-6.1],[-12.1,-8.2],
        [-10.6,-9.4],[-11.2,-11.3],[-13.1,-11.9],[-13.2,-14],[-13.2,-16.1],[-12.4,-17.7],[-10.5,-18.5],[-11.2,-20.5],[-13.1,-21.1],[-14.1,-22.9],
        [-14,-25],[-12.7,-26.6],[-10.7,-26.9],[-8.6,-26.9],[-6.5,-26.9],[-4.4,-26.9],[-2.3,-26.8],[-2.3,-24.7],[-2.3,-22.6],[-2.3,-20.5],
        [-2.3,-18.4],[-2.3,-16.3],[-2.3,-14.2],[-2.3,-12.1],[-2.3,-10],[-0.7,-9.5],[1.4,-9.5],[2.3,-11.3],[2.3,-13.4],[2.3,-15.5],
        [2.3,-17.6],[2.3,-19.7],[2.3,-21.8],[2.3,-23.9],[2.3,-26],[4,-26.9],[6.1,-26.9],[8.2,-26.9],[10.3,-26.9],[12.4,-26.7],
        [13.9,-25.3],[14.1,-23.2],[13.4,-21.3],[11.5,-20.5],[10.5,-19.3],[11,-17.7],[13,-17.4],[13.2,-15.4],[13.2,-13.3],[12.6,-11.4],
        [10.5,-11.3],[10.8,-9.4],[12.2,-7.9],[12.3,-5.8],[12.3,-3.7],[12.3,-1.6],[13.4,-0.7],[15.3,-1.6],[16.8,-3],[17.8,-4.9],
        [18.4,-6.9],[18.6,-9],[18,-10.8],[16.7,-12.4],[16,-14.3],[16.2,-16.4],[17.1,-18.3],[18.7,-19.6],[19.3,-18.1],[18,-16.4],
        [17.9,-14.3],[18.8,-12.5],[20.6,-11.5],[22.7,-11.6],[24.3,-12.9],[25.1,-14.8],[24.6,-16.8],[23.3,-18.5],[24.6,-19.4],[26.1,-17.9],
        [26.8,-15.9],[26.7,-13.8],[25.9,-11.9],[24.4,-10.5],[24.1,-8.5],[23.8,-6.4],[23.3,-4.3],[22.6,-2.4],[21.7,-0.5],[20.4,1.2],
        [18.9,2.6],[17.1,3.7],[15.1,4.5],[13.1,4.9],[12.3,6.8],[11.7,8.8],[9.9,9.5],[7.8,9.6],[5.9,10.3],[6,12.3],
        [8.1,12.3],[8.6,13.9],[10.1,14.9],[11.3,16.1],[11.8,17.8],[13.9,17.8],[15.1,16.3],[16.9,15.2],[18.9,15.2],[20.6,16.5],
        [21.4,18.4],[21,20.4],[19.5,21.9],[17.5,22.3],[15.6,21.6],[14.3,19.9],[12.4,19.7],[11.3,20.7],[10.7,22.3],[8.7,23],
        [8.3,25],[6.9,26.5],[4.9,26.9],[2.8,26.9],[0.7,26.9],[-1.4,26.9],[-3.5,26.9],[-5.6,26.8],[-2.6,23],[-3.1,21.4],
        [-4,22.8],[3.8,23],[3.3,21.5],[2.5,22.8],[-8.6,18.7],[-8.7,16.7],[-9.6,17.5],[-9.6,19.5],[-8.9,21],[-8.6,19.1],
        [9.2,20.9],[9.6,19.1],[9.6,17.1],[8.6,17.1],[8.6,19.1],[8.9,21],[-2.5,19.3],[-1.8,17.5],[0.1,16.9],[1.9,17.6],
        [2.7,19.4],[4.1,18.7],[3.5,16.8],[2,15.5],[0,15.1],[-1.9,15.5],[-3.5,16.7],[-4.1,18.6],[-2.7,19.5],[4.1,10.9],
        [3.4,9.6],[1.4,9.6],[-0.6,9.6],[-2.6,9.6],[-4.1,10.1],[-4.1,12.1],[-2.2,12.3],[-0.2,12.3],[1.8,12.3],[3.8,12.3],
        [-4,4.9],[-2.2,4],[-0.9,2.4],[-0.5,0.5],[-0.9,-1.5],[-2.2,-3.1],[-4,-4],[-6,-4],[-7.8,-3.1],[-9.1,-1.5],
        [-9.6,0.4],[-9.1,2.4],[-7.9,4],[-6.1,4.9],[-4.1,4.9],[9.3,4.7],[8.8,3.2],[6.8,3.2],[4.8,3.2],[2.8,3.3],
        [2.8,4.9],[4.8,5],[6.8,5],[8.8,5],[9.1,1.3],[9.1,-0.4],[7.1,-0.4],[5,-0.4],[3,-0.4],[1.4,0.3],
        [2.7,1.4],[4.7,1.4],[6.7,1.4],[8.7,1.3],[9.3,-2.6],[8.7,-4],[6.7,-4.1],[4.7,-4.1],[2.7,-4.1],[0.8,-3.9],
        [1.2,-2.3],[3.2,-2.3],[5.2,-2.3],[7.2,-2.3],[9.2,-2.4],[-7.8,-14.6],[-8.4,-16],[-10.4,-16],[-11.3,-14.9],[-11,-13.2],
        [-9,-13.2],[-7.8,-14.1],[11.3,-14.6],[10.7,-16],[8.7,-16],[7.8,-14.9],[8.2,-13.2],[10.2,-13.2],[11.3,-14.1],[-6,3],
        [-7.4,1.7],[-7.6,-0.3],[-6.5,-1.8],[-4.5,-2.2],[-2.9,-1.2],[-2.3,0.7],[-3.2,2.4],[-5,3.2]
    ]

    batch_size = 20
    total = 0

    for i in range(0, len(currentCoords), batch_size):
        batch = currentCoords[i:i + batch_size]
        coordCount = len(batch)
        print(f"\nBatch {i // batch_size + 1}: aspirating {coordCount} units")
        pipette_20ul.aspirate(coordCount, location_of_color('Green'))
        for x, y in batch:
            adjusted_location = center_location.move(types.Point(x, y))
            dispense_and_detach(pipette_20ul, 1, adjusted_location)
            total += 1
            print(f"  Dispensed at ({x}, {y}) — running total: {total}")

    print(f"\nTotal objects processed: {total}")
    pipette_20ul.drop_tip()
Week 4 HW: Protein Design Part I
This week focuses on how sequence, structure, and energetics can be modeled and manipulated to create or optimize proteins with specified functions.
Part A - Conceptual Questions
For my homework, I initiated a conversation with Claude Ai using Sonnet v4.6.
My prompts follow a pattern: Claude poses a question, I provide my answer, and I receive an evaluation of my response that reinforces the key learning concepts. (Expand to see detailed responses to my answers.) I find this approach more interactive, and it leads to better knowledge retention.
Question 1: How many amino acid molecules are in a 500g piece of meat?
Using the given parameters — 500g of meat, average amino acid weight of 100 Daltons.
Feedback…
500g ÷ 100 g/mol = 5 moles × 6.022 × 10²³ = ~3 × 10²⁴ amino acid molecules — three septillion building blocks in a single meal.
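The same arithmetic can be reproduced in a few lines of Python. This sketch uses only the given parameters (500 g of meat, an average amino acid mass of 100 Daltons, treated as 100 g/mol) and Avogadro's number rounded to 6.022 × 10²³. Like the question itself, it assumes the whole 500 g is made of amino acids.

```python
AVOGADRO = 6.022e23          # molecules per mole (rounded)
mass_g = 500.0               # given: 500 g of meat
avg_mass_g_per_mol = 100.0   # given: 100 Da per amino acid, i.e. about 100 g/mol

moles = mass_g / avg_mass_g_per_mol
molecules = moles * AVOGADRO
print(f"{moles:.0f} mol -> {molecules:.2e} amino acid molecules")
```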
Question 2: Why do humans eat beef but not become a cow? Eat fish but not become a fish?
We digest and metabolize proteins rather than absorb and mutate. Our proteins are protected and encoded.
Feedback…
The digestive system is a sophisticated demolition process. Proteases break proteins down into individual amino acids, stripping all biological identity. Your DNA then encodes precisely which proteins your ribosomes build from those recycled raw materials. The cow’s sequence is erased; your genome writes a new one. Sequence is everything — same 20 amino acids, completely different instructions.
Question 3: Why are there only 20 natural amino acids?
Because that represents a large number of variations — when combined with every possible sequence, the combinations are exponentially high in the billions.
Feedback…
Your combinatorics intuition is correct and actually undersells it. For a protein just 100 amino acids long, possible sequences number 20¹⁰⁰ — vastly larger than atoms in the observable universe. The 20 canonical amino acids cover the essential chemical toolkit life needs — acids, bases, hydrophobics, polars, aromatics, and structurally unique proline. The genetic code likely became frozen early once ribosome and tRNA machinery co-evolved around these 20. Selenocysteine is sometimes called the 21st, and synthetic biology is actively expanding the code. The 20 are evolution’s solution, not chemistry’s limit.
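Python's arbitrary-precision integers make it easy to check the size of 20¹⁰⁰ directly. In this small sketch, the figure of about 10⁸⁰ atoms in the observable universe is a commonly quoted order-of-magnitude estimate that I am adding as an assumption for the comparison; it does not come from the feedback itself.

```python
SEQUENCE_LENGTH = 100
ALPHABET_SIZE = 20                 # the 20 canonical amino acids

sequences = ALPHABET_SIZE ** SEQUENCE_LENGTH   # exact integer, no overflow in Python
digits = len(str(sequences))

ATOMS_LOG10 = 80  # assumed order-of-magnitude estimate for atoms in the observable universe

print(f"20^100 has {digits} digits (on the order of 10^{digits - 1})")
print("Larger than the atom estimate:", sequences > 10 ** ATOMS_LOG10)
```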
Question 4: Can you make other non-natural amino acids? Design some new ones.
Your answer: Yes — evolution converged on 20, but others may exist outside evolutionary pressure, possibly arising from light spectrum properties or geological timescales.
Feedback…
Absolutely yes. The Murchison meteorite contained over 70 amino acids, most non-canonical, suggesting amino acid chemistry is far broader than what evolution selected for. Synthetic non-natural amino acids already in use include p-Azidophenylalanine with precision bioconjugation handles, β-amino acids with protease-resistant backbones, D-amino acids as mirror images of natural L-forms, and fluorinated amino acids with altered hydrophobicity for drug design. To design a novel amino acid you manipulate the R-group side chain — engineering metal-binding groups, photoactivatable groups responsive to specific light wavelengths, or click-chemistry handles. Zhang’s QTY Code is itself this kind of thinking — recognizing structural mimicry between amino acids to repurpose the existing 20 in non-natural combinations.
Question 5: Where did amino acids come from before enzymes that make them, and before life started?
Your answer: Geological, light energy, electrical, and even gravitational forces were all at play resulting in amino acid formations before our understanding of life emerged.
Feedback…
Your answer is well-grounded and captures the essential abiotic forces. The Miller-Urey experiment (1953) demonstrated that electrical sparks, water vapor, methane, ammonia, and hydrogen — simulating early Earth — produced amino acids within days, no enzymes required. Hydrothermal vents provide another pathway, where iron-sulfur mineral surfaces act as primitive catalysts substituting for enzymes that didn’t yet exist. Extraterrestrial delivery is strongly supported by the Murchison meteorite, Tagish Lake meteorite, and asteroid Ryugu samples — amino acids forming in space through UV photochemistry on interstellar ice grains, then delivered to Earth by gravity over billions of years. The Strecker synthesis — hydrogen cyanide, ammonia, aldehydes, and water — produces amino acids abiotically from conditions readily available on early Earth. The deeper chicken-and-egg puzzle points toward the RNA World hypothesis — RNA molecules that could both carry information and catalyze reactions preceded both proteins and DNA, bootstrapping the system before protein synthesis machinery existed.
Question 6: If you make an α-helix using D-amino acids, what handedness would you expect?
An alpha-helix creates a right-handed coil.
Feedback…
Your answer describes the natural L-amino acid case correctly — but D-amino acids change everything. D-amino acids are mirror images of L-amino acids, and an α-helix built entirely from D-amino acids is left-handed — a perfect mirror image of the natural right-handed helix. Hydrogen bonding pattern and rise per residue remain the same, but the twist inverts. D-peptides are completely invisible to proteases that can’t recognize their mirror-image substrate, making them extraordinarily stable in biological environments and actively investigated for drug delivery and therapeutics.
Question 7: Can you discover additional helices in proteins?
Yes, since a protein may have many evolutionary and disrupted or folded variations.
Feedback…
Yes — definitively. Beyond the classic right-handed α-helix, the 3₁₀ helix hydrogen bonds every three residues and appears frequently at helix termini. The π-helix spans five residues and was long considered rare until computational analyses revealed it appears frequently at functionally important sites. The polyproline helix is unique — proline’s rigid ring forces a backbone geometry with no internal hydrogen bonds at all. As AlphaFold2 has predicted structures for hundreds of millions of proteins, helical motifs continue to be found in new contexts, including intrinsically disordered proteins that adopt helical structure only upon binding a partner. The full catalog of biologically relevant helix types remains incompletely mapped.
Question 8: Why are most molecular helices right-handed?
Due to molecular electrical charge initiating primary bonds resulting in a right-handed twist, with left-handed helices possible under favorable conditions.
Feedback…
Your answer captures something real — charge and bonding geometry do play a role — but the deeper answer touches one of the most profound unsolved questions in science: the origin of biological homochirality. The immediate structural reason is that L-amino acid backbone geometry makes the right-handed α-helix the lowest energy conformation, minimizing steric clashes while maximizing hydrogen bonding. The deeper question is why life chose L-amino acids at all. Leading hypotheses include circularly polarized UV light from cosmic sources preferentially destroying one mirror-image form — directly relevant to your light spectrum point. Parity violation in the weak nuclear force creates a vanishingly small but real energy difference between D and L molecules that could bias outcomes over geological timescales. A third possibility is frozen accident — an early self-replicating system happened to use L-amino acids and locked that choice in permanently. The handedness of life’s helices may trace all the way back to a molecular asymmetry that originated in the cosmos.
Question 9: Why do β-sheets tend to aggregate? What is the driving force?
β-sheets aggregate because they are flat and linear in design with bonding properties, repeating in a pattern or weave.
Feedback…
Your structural intuition is correct. β-sheets have exposed edges where hydrogen bond capacity is unfulfilled — unlike α-helices where all hydrogen bonding is internally satisfied. These exposed edges are essentially sticky, actively seeking additional strands to bond with. Hydrophobic stacking between sheet faces amplifies this — flat geometry allows face-to-face stacking driven by hydrophobic interactions between side chains above and below the sheet plane. Critically, aggregation is cooperative — each new strand makes the next addition more energetically favorable, which is why β-sheet aggregation can accelerate explosively once a nucleus forms. This cooperativity underlies amyloid formation in Alzheimer’s, Parkinson’s, Type 2 diabetes, and prion diseases — thermodynamically stable, protease-resistant, insoluble aggregates that are pathologically destructive precisely because they cannot be cleared.
Question 10: Why do amyloid diseases form β-sheets? Can you use amyloid β-sheets as materials?
Amyloid diseases form β-sheets due to misfolding pathology. Since they are β-folds they are sticky and thermodynamically strong, difficult to clear — the same properties that would make an excellent material such as a synthetic cement.
Feedback…
Your answer is strong on both counts and your cement analogy is more apt than you might realize. Proteins can sample alternative folding pathways under stress — aging, mutation, pH shifts — and for many proteins the cross-β amyloid structure represents a thermodynamic energy minimum more stable than the native fold. Once a misfolded nucleus forms it templates surrounding proteins in a prion-like propagation. The body cannot degrade these structures because dense hydrogen bond networks and hydrophobic cores resist proteases, and insolubility makes them inaccessible to cellular clearing machinery. As materials, amyloid fibrils have tensile strength comparable to steel on a per-weight basis, self-assemble from solution without external energy, and are chemically stable across wide pH and temperature ranges. Demonstrated applications include hydrogels for tissue scaffolding, conductive fibrils coated with metal nanoparticles for bioelectronics, water filtration membranes, and amyloid-silica composites as structural cement-like materials. Curli fibers — naturally occurring bacterial amyloid from E. coli biofilms — have been engineered as programmable living materials that assemble on demand. Biology’s problem becomes materials science’s solution.
Question 11: Design a β-sheet motif that forms a well-ordered structure.
A motif that acts as a 3-dimensional weave on the x, y, and z axis — resulting in a textile stronger than a simple x,y weave, useful in environments requiring strong resistant materials like Kevlar or heat resistant tiles.
Feedback…
Your 3D weave concept maps closely onto real structural strategies in nature and materials engineering. Well-ordered β-sheet design requires strict alternation of hydrophobic and hydrophilic residues — hydrophobics pack face-to-face between sheets while hydrophilics point outward into solvent. Edge-capping residues at strand termini prevent runaway aggregation. Turn sequences need geometrically precise residues — proline enforces bends, glycine provides backbone flexibility. Biology already builds your 3D concept: β-barrel proteins in bacterial outer membranes curve and close into cylinders of remarkable stability. Spider silk embeds nanocrystalline β-sheet domains in an amorphous matrix, distributing stress in three dimensions — outperforming Kevlar on a weight-normalized basis by absorbing energy through controlled deformation rather than brittle fracture. Computationally designed β-sheet proteins from David Baker’s group include closed barrels and extended lattices not found in nature. Your reentry tile analogy is structurally sound — ablative heat shields work by distributing energy across a 3D network with no single catastrophic failure point, exactly what a 3D β-sheet lattice would achieve. The key engineering challenge is controlling z-axis assembly using sequence-encoded electrostatic repulsion between sheet faces to set precise interlayer spacing rather than collapsing into amorphous aggregates.
Part B: Protein Analysis and Visualization
Briefly describe the protein you selected and why you selected it.
It is a widely studied protein with highly visual properties and application to biosensors, relevant to my final project scope.
Identify the amino acid sequence of your protein.
The amino acid sequence is
MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK
How long is it? What is the most frequent amino acid? You can use this Colab notebook to count the frequency of amino acids.
The length of the protein is: 238 amino acids.
The most common amino acid is: G, which appears 22 times.
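For readers without access to the Colab notebook, the length and residue counts can be reproduced locally. The sketch below uses only the Python standard library and the sequence listed above; the variable name `GFP` is my own label.

```python
from collections import Counter

# GFP amino acid sequence as listed above, split across lines for readability.
GFP = (
    "MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTL"
    "VTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLV"
    "NRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLAD"
    "HYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK"
)

# Count how often each one-letter residue code occurs.
counts = Counter(GFP)

print("Length:", len(GFP))
for residue, count in counts.most_common(5):
    print(residue, count)
```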
How many protein sequence homologs are there for your protein? Hint: Use Uniprot’s BLAST tool to search for homologs.
The Blast Protein Existence menu showed 152 results with homology.
Does your protein belong to any protein family?
Yes, this is a member of the Green Fluorescent Protein (GFP) Family
Identify the structure page of your protein in RCSB
When was the structure solved? Is it a good quality structure? Good quality structure is the one with good resolution. Smaller the better (Resolution: 2.70 Å)
In 1996 the protein structure was solved.
It is a good quality structure with a resolution of 2.4 Å
A primary characteristic is the β-barrel fold with the chromophore inside, which helps to protect from damage.
Are there any other molecules in the solved structure apart from protein?
Chromophore (CRO) formed and protected inside.
Water molecules (HOH)
Does your protein belong to any structure classification family?
Green Fluorescent Proteins, with 633 structures.
Open the structure of your protein in any 3D molecule visualization software:
Visualize the protein as “cartoon”, “ribbon” and “ball and stick”.
Color the protein by secondary structure. Does it have more helices or sheets?
The structure has more sheets than helices; the sheet residues are shown in yellow. The barrel as a whole has a helical look, but it is built from β-strands.
Color the protein by residue type. What can you tell about the distribution of hydrophobic vs hydrophilic residues?
The outward-facing amino acids give the barrel a hydrophilic surface that attracts and retains water, creating a protective shell. Inside the barrel is the hydrophobic chromophore, which stays protected until it is excited by light and emits fluorescence.
Visualize the surface of the protein. Does it have any “holes” (aka binding pockets)?
The surface is primarily hydrophilic but also has permeability via holes (binding pockets) to allow for controlled hydration, to protect the chromophore, which enables light photons to be absorbed and emitted as fluorescence.
Part C1: Protein Language Modeling
Deep Mutational Scans
Use ESM2 to generate an unsupervised deep mutational scan of your protein based on language model likelihoods.
Can you explain any particular pattern? (choose a residue and a mutation that stands out)
M48 has the single highest probability of a recurring sequence.
Region 20-27 has an overall high model score
Region 3 contains a strong outlier
Latent Space Analysis
Use the provided sequence dataset to embed proteins in reduced dimensionality.
My initial run showed a very dense plot.
Analyze the different formed neighborhoods: do they approximate similar proteins?
I reduced the complexity to generate a plot that includes my selected protein.
The plot shows similar proteins based on a wide range of dimensions, so they don’t always relate to similar proteins, just similar shared amino acids with higher probability of a match. In some instances, the proteins line up much more predictably, such as a high match in a linear progression.
Place your protein in the resulting map and explain its position and similarity to its neighbors.
My selected protein has a near neighbor from Clostridium botulinum, which is in the family of Botulinum Neurotoxins. What is interesting is that a protein that creates biofluorescence in jellyfish is in proximity to a protein that acts as a neurotoxin. This seems to be a function of the evolutionary design of organisms that rely on this close relationship.
Part C2: Protein Folding
Fold your protein with ESMFold. Do the predicted coordinates match your original structure?
Yes, the folded protein closely matches my original structure, but there are some degraded areas of the barrel formation shown with a confidence gradient (green is good, red is bad)
Try changing the sequence, first try some mutations, then large segments. Is your protein structure resilient to mutations?
Yes, the structure seems resilient to mutations, even folding better in the α-helix regions.
Part C3: Protein Generation
Analyze the predicted sequence probabilities and compare the predicted sequence vs the original one.
I initially ran the Inverse Folding function, using default settings.
It predicted low confidence in the mutation scan:
Using the default settings, it produced an unexpected model (sea slug).
I realized that I need to enter a new PDB ID for my selected protein.
I ran it and received an expected result:
I then applied mutations to my GFP based on a Claude AI inquiry to "turn the GFP to blue fluorescence":
Y66H (Tyr→His) — replaces the phenol ring with an imidazole ring, shifting emission from ~509 nm (green) to ~448 nm (blue)
Y145F (Tyr→Phe) — the “enhanced” BFP (EBFP) stabilizer, improves brightness and folding
F64L — improves folding at 37°C (same as EGFP)
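The three substitutions listed above can be applied to the wild-type sequence programmatically. The sketch below is one way to do it, not the code used in the notebook: the helper `apply_mutations` is hypothetical, it parses labels such as `Y66H`, and it checks that the wild-type residue at each position matches before substituting.

```python
import re

# Wild-type GFP sequence from Part B.
GFP = (
    "MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTL"
    "VTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLV"
    "NRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLAD"
    "HYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK"
)


def apply_mutations(sequence: str, mutations: list[str]) -> str:
    """Apply point mutations written as <wild type><1-based position><new residue>."""
    residues = list(sequence)
    for label in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
        if match is None:
            raise ValueError(f"Cannot parse mutation label: {label}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position is outside the sequence")
        # Guard against off-by-one errors: the expected residue must be present.
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{label}: expected {wild_type}, found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


BFP = apply_mutations(GFP, ["F64L", "Y66H", "Y145F"])
print(BFP)
```

The wild-type check is the useful part of the design: if the numbering of the input sequence is shifted (for example by a tag or a missing start methionine), the function fails loudly instead of mutating the wrong residue.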
I ran the new sequence through the mutation scan:
I had Gemini help to write code that appends this new mutation sequence to the RDP target list.
Once a prediction was made, I applied the sequence to the ESM to see if it would produce a result.
Input this sequence into ESMFold and compare the predicted structure to your original.
Here is the mutated, inverse-folded sequence visualized with ESMFold:
Part D. Group Brainstorm on Bacteriophage Engineering
Find a group of ~3–4 students
Read through Phage Reading resources
Review bacteriophage goals
Brainstorm Session
Choose One or two main goals
One Page Proposal
Which tools
Why tools may help to solve sub-problem
One or two potential pitfalls
Schematic of Pipeline
Group’s short plan for engineering a bacteriophage
Post plan here
Part D - Plan
Hypothesis:
I believe we can focus on the cationic properties, or positive electrical charges that are present in the amino acid sequence. By substituting amino acids that enable more positive charge strengthening electrostatic attraction, we may create more binding activity. Lysis timing can be tuned in either direction by manipulating charge density.
Experimental Pipeline
Phase 1 — Discovery
UniProt
Retrieve canonical L-protein sequence
Confirm Region 1, 2, and 3 boundaries
BLAST
Search for homologous sequences across phage strains
Identify conservation and variability at target residues
PyMOL
Render 3D structural model
Apply polarity-based color coding to each region
Phase 2 — Mutation Analysis
PyMOL
Isolate target residues
Examine local chemical environment and spatial context
ESM2
Mask target residues and score substitution probability
Generate per-residue probability data for C2, C3, C4
Heatmap
Synthesize BLAST conservation and ESM2 probability scores
Overlay onto PyMOL structure to confirm target sites
ESMFold
Predict 3D structure of each mutant sequence
Generate pLDDT confidence scores per residue
PyMOL
Import ESMFold outputs
Render side-by-side comparison of C1 baseline vs C2, C3, C4
Phase 3 — Synthesis
Codon Optimization
Optimize mutant sequences for E. coli expression
Verify no unintended mRNA secondary structures introduced
Twist Bioscience
Submit all four constructs for gene synthesis
Confirm synthesis feasibility and receive gene fragments
Phase 4 — Plasmid Design
Benchling
Design annotated circular plasmid constructs for C1–C4
Include promoter, RBS, insert, terminator, and selection marker
Review Gate
Confirm correct reading frame and insert orientation
Verify no unintended open reading frames
Confirm host compatibility before proceeding
Phase 5 — Execution
Opentrons OT-2
Run liquid handling protocol for all four constructs
Collect lysis timing, plaque formation, and MurA activity data
Compare all results against C1 baseline
Potential Pitfalls
My hypothesis focuses on Region 1 (faces cytoplasm, cationic/hydrophilic)
and Region 3 (amphipathic, faces periplasm) to control timing of MurA enzyme inhibition.
Region 1 and Region 3
Polarity change risk
Too much polarity change could cause the phage to bind and become entrapped
Region 2
Avoid mutagenesis
Very well defined helical fold
Subject to disruption with minor change to structure
Week 5 HW: Protein Design Part II
This week we learned how cutting-edge AI and protein language models are used to design functional proteins and peptides “in silico”.
Part A: SOD1 Binder Peptide Design
Part 1: Generate Binders with PepMLM
Begin by retrieving the human SOD1 sequence from UniProt (P00441) and introducing the A4V mutation.
Using the PepMLM Colab linked from the HuggingFace PepMLM-650M model card,
Generate four peptides of length 12 amino acids conditioned on the mutant SOD1 sequence.
To your generated list, add the known SOD1-binding peptide FLYRWLPSRRGG for comparison.
Record the perplexity scores that indicate PepMLM’s confidence in the binders.
Part 2: Evaluate Binders with AlphaFold3
Navigate to the AlphaFold Server: alphafoldserver.com
For each peptide, submit the mutant SOD1 sequence followed by the peptide sequence as separate chains to model the protein-peptide complex.
Record the ipTM score and briefly describe where the peptide appears to bind. Does it localize near the N-terminus where A4V sits? Does it engage the β-barrel region or approach the dimer interface? Does it appear surface-bound or partially buried?
In a short paragraph, describe the ipTM values you observe and whether any PepMLM-generated peptide matches or exceeds the known binder.
Perplexity
| Sequence | Perplexity |
|---|---|
| FLYRWLPSRRGG | 21.42 |
| WRYVAAAIARKK | 14.24 |
| WRYVAYALRWGE | 26.03 |
| KRYYWVAVARAA | 12.95 |
| HRYVAAAVKWKK | 16.60 |
Peptide Observations
FLYRWLPSRRGG ⭐ Known Binder — Control
ipTM: 0.89 | pTM: 0.92
Distance to A4V: 22.285 Å
This is the known SOD1-binding peptide and serves as the baseline for all comparisons. The peptide is in the general vicinity of A4V. All PepMLM-generated peptides are evaluated against its ipTM of 0.89, pTM of 0.92, and distance of 22.285 Å.
WRYVAAAIARKK
ipTM: 0.85 | pTM: 0.89
Distance to A4V: 18.541 Å — 3.744 Å closer than the known binder
It is in proximity to the dimer region and engages the β-barrel.
This peptide is closer to the dimer region and approaches, but does not exceed, the ipTM value of the known binder. It appears with high probability to be a well-formed β-barrel binder; it intersects the surface and is partially buried.
WRYVAAAIARKK Distance
WRYVAAAIARKK Overlapping Surface
HRYVAAAVKWKK
ipTM: 0.88 | pTM: 0.91
Distance to A4V: 27.536 Å — 5.251 Å farther than the known binder
Engages the β-barrel region but does not localize near the N-terminus.
Appears partially buried. ipTM is just below the known binder at 0.88.
HRYVAAAVKWKK
HRYVAAAVKWKK showing surface incursion
WRYVAYALRWGE
ipTM: 0.68 | pTM: 0.79
Distance to A4V: 12.875 Å — 9.410 Å closer than the known binder
Approaching the A4V location 27.152 Å , not dimer interface. Surface bound. Considered near the A4V location. Lower confidence than the known binder, and demonstrates a partially folded structure.
The peptide is folding into a secondary structure upon binding rather than remaining as a random flexible chain.
This is called induced folding or folding upon binding — a hallmark of meaningful peptide-protein interactions.
The helix formation suggests the peptide is responding to the local environment of the SOD1 surface.
WRYVAYALRWGE Cartoon
WRYVAYALRWGE Surface
KRYYWVAVARAA
ipTM: 0.89 | pTM: 0.92
Distance to A4V: 17.228 Å — 5.057 Å closer than the known binder
No — localizes near the middle of Chain 1, not the N-terminus.
Engages surface in middle of region, not approaching dimer interface.
Surface bound — clipping view shows no intrusions.
ipTM matches the known binder exactly at 0.89, with a distance of 17.228 Å placing it closer to the target vicinity of A4V than the control.
KRYYWVAVARAA surface
KRYYWVAVARAA peptide with distance to A4V
KRYYWVAVARAA distance (Closeup)
ipTM Summary and Comparison to Known Binder
| Peptide | Role | ipTM | pTM | Distance to A4V (Å) | Near A4V? |
|---|---|---|---|---|---|
| FLYRWLPSRRGG | ⭐ Known binder (control) | 0.89 | 0.92 | 22.285 | Vicinity |
| WRYVAAAIARKK | PepMLM generated | 0.85 | 0.89 | 18.541 | Vicinity |
| HRYVAAAVKWKK | PepMLM generated | 0.88 | 0.91 | 27.536 | Far |
| WRYVAYALRWGE | PepMLM generated | 0.68 | 0.79 | 12.875 | Near |
| KRYYWVAVARAA | PepMLM generated | 0.89 | 0.92 | 17.228 | Vicinity |
The ipTM values across the four PepMLM-generated peptides range from 0.68 to 0.89, indicating generally high predicted confidence in the binding interactions. Using FLYRWLPSRRGG (ipTM 0.89, distance 22.285 Å) as the known binder control, one generated peptide, KRYYWVAVARAA, matches the control ipTM exactly at 0.89, while HRYVAAAVKWKK comes close at 0.88. However, high ipTM alone does not confirm therapeutic relevance; proximity to the A4V site matters equally. WRYVAYALRWGE carries the lowest ipTM at 0.68 yet achieves the closest proximity to the A4V mutation site at 12.875 Å (9.410 Å closer than the known binder) and uniquely demonstrates induced folding behavior near the target. This combination of near-vicinity binding and structural reorganization makes it the most therapeutically interesting candidate despite its lower confidence score, and suggests it warrants further optimization to strengthen the binding pose while maintaining its proximity to the A4V site.
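The comparison against the control can be tabulated in a few lines. This is a small sketch using only the ipTM and distance values recorded above; the dictionary layout and the helper name `delta_vs_control` are my own choices for illustration.

```python
# ipTM and measured distance to A4V (in Å) for each peptide, as recorded above.
results = {
    "FLYRWLPSRRGG": {"iptm": 0.89, "distance": 22.285},  # known binder (control)
    "WRYVAAAIARKK": {"iptm": 0.85, "distance": 18.541},
    "HRYVAAAVKWKK": {"iptm": 0.88, "distance": 27.536},
    "WRYVAYALRWGE": {"iptm": 0.68, "distance": 12.875},
    "KRYYWVAVARAA": {"iptm": 0.89, "distance": 17.228},
}
CONTROL = "FLYRWLPSRRGG"


def delta_vs_control(peptide: str) -> float:
    """Distance difference in Å; negative means closer to A4V than the control."""
    return round(results[peptide]["distance"] - results[CONTROL]["distance"], 3)


generated = [name for name in results if name != CONTROL]

# Rank generated peptides from closest to farthest from the A4V site.
for name in sorted(generated, key=lambda n: results[n]["distance"]):
    print(name, results[name]["iptm"], delta_vs_control(name))

closest = min(generated, key=lambda n: results[n]["distance"])
best_iptm = max(generated, key=lambda n: results[n]["iptm"])
```

Sorting by distance and by ipTM separately makes the trade-off explicit: the two rankings put different peptides first.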
First Pass Analysis and Candidate Selection
What I found in AlphaFold 3 was that my initial peptides were primarily surface binding with varying levels of proximity to the A4V sequence location near the homodimer. WRYVAYALRWGE was not the highest scoring, but was closest to the target and demonstrated induced folding — organizing into a helical secondary structure upon binding rather than remaining flexible, which is a hallmark of meaningful peptide-protein interaction.
Higher ipTM scores did not consistently predict stronger binding affinity or closer proximity to the A4V site. FLYRWLPSRRGG and KRYYWVAVARAA matched the highest ipTM at 0.89 but were farther from the mutation site, while WRYVAYALRWGE at 0.68 was structurally the most relevant.
Selected Candidate
The peptide chosen to advance from this first pass was WRYVAYALRWGE. Despite a hemolysis probability of 0.104 — approximately 2x the known binder control — its induced folding behavior near the A4V site was the deciding factor. The structural response to the local SOD1 environment, combined with its closest proximity to the mutation site at 12.875 Å, outweighed the moderate hemolysis risk at this stage of evaluation. Further analysis via MoPPIT would follow to explore whether higher affinity candidates could be generated with a safer therapeutic profile.
Part 4: Generate Optimized Peptides with moPPIt
Now, move from sampling to controlled design. moPPIt uses Multi-Objective Guided Discrete Flow Matching (MOG-DFM) to steer peptide generation toward specific residues and optimize binding and therapeutic properties simultaneously. Unlike PepMLM, which samples plausible binders conditioned on just the target sequence, moPPIt lets you choose where you want to bind and optimize multiple objectives at once.
Open the moPPIt Colab linked from the HuggingFace moPPIt model card.
Make a copy and switch to a GPU runtime.
In the notebook:
Paste your A4V mutant SOD1 sequence.
Choose specific residue indices on SOD1 that you want your peptide to bind (for example, residues near position 4, the dimer interface, or another surface patch).
Set peptide length to 12 amino acids.
Enable motif and affinity guidance (and solubility/hemolysis guidance if available). Generate peptides.
After generation, briefly describe how these moPPIt peptides differ from your PepMLM peptides. How would you evaluate these peptides before advancing them to clinical studies?
Control / Known Binder Reference: FLYRWLPSRRGG (PepMLM control | ipTM 0.89 | Binding pKd 5.938 | Hemolysis 0.047)
All MoPPIT-generated peptides are evaluated relative to this baseline.
Numeric Summary
| Peptide | Hemolysis | Solubility | Binding (pKd) | Motif Score |
|---|---|---|---|---|
| FLYRWLPSRRGG (control) | 0.047 | 1.000 | 5.938 | n/a |
| RTCGLIETKKQT | 0.982 | 0.833 | 6.298 | 0.693 |
| KKTKTGKFCKQN | 0.977 | 0.917 | 5.715 | 0.755 |
| IKCGNKFKKKYH | 0.957 | 0.833 | 7.713 | 0.632 |
Property-by-Property Analysis
Binding Affinity (pKd/pKi) — Strongest Property
Two of the three MoPPIT peptides exceed the control baseline, one of them (IKCGNKFKKKYH) by a wide margin.
| Peptide | pKd | vs Control |
|---|---|---|
| FLYRWLPSRRGG (control) | 5.938 | baseline |
| RTCGLIETKKQT | 6.298 | +0.360 above control |
| KKTKTGKFCKQN | 5.715 | −0.223 below control |
| IKCGNKFKKKYH | 7.713 | +1.775 above control |
IKCGNKFKKKYH shows the highest binding affinity of any peptide evaluated in this entire session — exceeding the control by +1.775 pKd units and exceeding the best PepMLM candidate (WRYVAYALRWGE, 6.980) by +0.733. This is a notable result. KKTKTGKFCKQN is the only MoPPIT peptide that falls below the control baseline.
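To put the pKd differences on a more familiar scale, they can be converted into dissociation constants. The sketch below assumes the reported score follows the usual definition pKd = −log10(Kd in mol/L); the numbers remain model predictions rather than measurements, and the function names are illustrative.

```python
def pkd_to_kd_molar(pkd: float) -> float:
    """Convert pKd to Kd in mol/L, assuming pKd = -log10(Kd)."""
    return 10 ** (-pkd)


def fold_tighter(pkd_a: float, pkd_b: float) -> float:
    """How many times smaller the Kd of peptide A is compared with peptide B."""
    return 10 ** (pkd_a - pkd_b)


predicted_pkd = {
    "FLYRWLPSRRGG (control)": 5.938,
    "RTCGLIETKKQT": 6.298,
    "KKTKTGKFCKQN": 5.715,
    "IKCGNKFKKKYH": 7.713,
}

for peptide, pkd in predicted_pkd.items():
    kd_nanomolar = pkd_to_kd_molar(pkd) * 1e9
    print(f"{peptide}: predicted Kd of about {kd_nanomolar:.0f} nM")

ratio = fold_tighter(7.713, 5.938)
print(f"IKCGNKFKKKYH vs control: about {ratio:.0f}-fold lower predicted Kd")
```

Because pKd is logarithmic, a difference of +1.775 units corresponds to a predicted Kd that is lower by more than an order of magnitude, which is why a seemingly small gap in the table matters.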
Hemolysis Probability — Critical Liability
This is the most significant finding in the MoPPIT dataset and represents a serious concern for all three peptides.
| Peptide | Hemolysis | vs Control | Flag |
|---|---|---|---|
| FLYRWLPSRRGG (control) | 0.047 | baseline | Safe |
| RTCGLIETKKQT | 0.982 | ~21x control | ⚠️ Critical |
| KKTKTGKFCKQN | 0.977 | ~21x control | ⚠️ Critical |
| IKCGNKFKKKYH | 0.957 | ~20x control | ⚠️ Critical |
All three MoPPIT peptides show hemolysis probabilities approaching 1.0 — dramatically higher than the PepMLM control and well above any therapeutically acceptable threshold. This is likely driven by their highly cationic, lysine-rich sequences (KK and KKK motifs) which are known to disrupt negatively charged cell membranes through electrostatic attraction. This is a critical liability that would need to be resolved before any of these peptides could be considered viable candidates.
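The cationic character of each peptide can be estimated with a simple residue count. This is a crude sketch, not how the hemolysis predictor works: it counts Lys and Arg as +1 and Asp and Glu as −1, ignores histidine and the termini, and uses helper names (`formal_charge`, `lysine_count`) that are my own.

```python
def formal_charge(peptide: str) -> int:
    """Crude net charge: (K + R) minus (D + E); His and termini are ignored."""
    positive = sum(peptide.count(residue) for residue in "KR")
    negative = sum(peptide.count(residue) for residue in "DE")
    return positive - negative


def lysine_count(peptide: str) -> int:
    """Number of lysine residues in the peptide."""
    return peptide.count("K")


peptides = ["FLYRWLPSRRGG", "RTCGLIETKKQT", "KKTKTGKFCKQN", "IKCGNKFKKKYH"]

for peptide in peptides:
    print(peptide, "lysines:", lysine_count(peptide), "formal charge:", formal_charge(peptide))
```

A count like this is only a first screen. It shows which sequences carry the most lysines, but predicted hemolysis depends on more than net charge, so the numbers should be read alongside the predictor output rather than in place of it.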
Solubility
| Peptide | Solubility | vs Control |
|---|---|---|
| FLYRWLPSRRGG (control) | 1.000 | baseline |
| RTCGLIETKKQT | 0.833 | below control |
| KKTKTGKFCKQN | 0.917 | below control |
| IKCGNKFKKKYH | 0.833 | below control |
All MoPPIT peptides fall below the control solubility of 1.000. While none are insoluble, the reduction in solubility relative to the PepMLM candidates is worth noting — particularly for RTCGLIETKKQT and IKCGNKFKKKYH at 0.833.
Motif Position Score
| Peptide | Motif Score | Interpretation |
|---|---|---|
| RTCGLIETKKQT | 0.693 | Moderate motif complementarity |
| KKTKTGKFCKQN | 0.755 | Highest motif complementarity |
| IKCGNKFKKKYH | 0.632 | Lowest motif complementarity |
KKTKTGKFCKQN shows the strongest motif complementarity to the SOD1 target despite having a below-control binding affinity. This suggests the peptide is well-positioned relative to the SOD1 binding motif but may lack the side chain contacts needed to translate motif recognition into strong affinity. IKCGNKFKKKYH presents an interesting inversion — lowest motif score but highest affinity — suggesting its binding may be driven by non-specific electrostatic contacts rather than precise motif engagement.
Comparative Assessment — MoPPIT vs PepMLM
| Property | Best PepMLM (WRYVAYALRWGE) | Best MoPPIT (IKCGNKFKKKYH) |
|---|---|---|
| Binding pKd | 6.980 | 7.713 |
| Hemolysis | 0.104 | 0.957 |
| Solubility | 0.999 | 0.833 |
| Distance to A4V | 12.875 Å | not yet evaluated |
| Motif Score | not available | 0.632 |
| Induced folding | yes | not yet evaluated |
MoPPIT generates peptides with superior raw binding affinity but at the cost of dramatically elevated hemolysis risk. PepMLM candidates show more balanced profiles with safer hemolysis values and demonstrated structural proximity to the A4V site.
Overall Candidate Assessment
| Peptide | Affinity > Control? | Hemolysis Safe? | Solubility | Motif | Verdict |
|---|---|---|---|---|---|
| FLYRWLPSRRGG | baseline | yes | 1.000 | n/a | Control |
| RTCGLIETKKQT | yes (+0.360) | ⚠️ critical | 0.833 | 0.693 | Needs redesign |
| KKTKTGKFCKQN | no (−0.223) | ⚠️ critical | 0.917 | 0.755 | Needs redesign |
| IKCGNKFKKKYH | yes (+1.775) | ⚠️ critical | 0.833 | 0.632 | High potential, high risk |
Key Takeaway
IKCGNKFKKKYH has the highest predicted binding affinity of any peptide evaluated in this session (pKd 7.713), making it a structurally interesting lead. However, its hemolysis probability of 0.957 makes it unsuitable in its current form. The immediate optimization priority for all three MoPPIT peptides is reducing cationic character — specifically reducing lysine density — to bring hemolysis probability into a safe range while preserving the affinity advantage. AlphaFold structural evaluation of these peptides against the A4V SOD1 dimer would be the recommended next step to assess whether the affinity advantage translates to meaningful proximity to the mutation site.
Additional Investigation
Objective: Identify and resolve hemolysis liability in the highest-affinity MoPPIT peptide while preserving binding affinity to SOD1 A4V.
Stage 1 — Problem Identified
The three MoPPIT-generated peptides showed critically elevated hemolysis probabilities of 0.957–0.982 — approximately 20x the known binder control (FLYRWLPSRRGG, 0.047). The cause was identified as lysine-rich sequences — high cationic density causing electrostatic attraction to and disruption of negatively charged cell membranes.
| Peptide | Hemolysis | Status |
|---|---|---|
| FLYRWLPSRRGG (control) | 0.047 | Safe |
| RTCGLIETKKQT | 0.982 | ⚠️ Critical |
| KKTKTGKFCKQN | 0.977 | ⚠️ Critical |
| IKCGNKFKKKYH | 0.957 | ⚠️ Critical |
Despite the hemolysis liability, IKCGNKFKKKYH was selected for optimization because it showed the highest binding affinity of any peptide in the entire session at pKd 7.713 — exceeding the known binder control by +1.775 units.
Stage 2 — Substitution Strategy Designed
Three variants were designed by targeting the five lysines at positions 2, 6, 8, 9, 10:
I K C G N K F K K K Y H
1 2 3 4 5 6 7 8 9 10 11 12
| Variant | Sequence | Substitutions | Strategy |
| --- | --- | --- | --- |
| Original | IKCGNKFKKKYH | — | Control baseline |
| Variant 1 | IQCGNKFKQQYH | K2→Q, K9→Q, K10→Q | Moderate K→Q reduction |
| Variant 2 | IQCGNQFQKNYH | K2→Q, K6→Q, K8→Q, K10→N | Aggressive K→Q reduction |
| Variant 3 | IKCGNEFKKEYH | K6→E, K10→E | Charge balancing with glutamate |
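The substitution labels can be cross-checked directly from the sequences. The sketch below is a minimal illustration that assumes only the four sequences listed above; the helper name `list_substitutions` is hypothetical and not part of any tool used in the session. It compares each variant with the parent peptide position by position (1-based) and reports the differences, along with the number of lysines that remain.

```python
PARENT = "IKCGNKFKKKYH"
VARIANTS = {
    "Variant 1": "IQCGNKFKQQYH",
    "Variant 2": "IQCGNQFQKNYH",
    "Variant 3": "IKCGNEFKKEYH",
}


def list_substitutions(parent, variant):
    """Return point substitutions such as 'K2Q', using 1-based positions."""
    if len(parent) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        f"{old}{pos}{new}"
        for pos, (old, new) in enumerate(zip(parent, variant), start=1)
        if old != new
    ]


for name, seq in VARIANTS.items():
    subs = ", ".join(list_substitutions(PARENT, seq))
    print(f"{name}: {seq}  [{subs}]  lysines remaining: {seq.count('K')}")
```

Deriving the labels from the sequences themselves helps avoid off-by-one slips when several adjacent lysines are involved.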
Stage 3 — Results
All three variants achieved hemolysis safety (0.035–0.037), slightly below the known binder control (0.047). However, binding affinity diverged significantly by strategy.
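| Peptide | Hemolysis | pKd | Net Charge | pI | Classification |
| --- | --- | --- | --- | --- | --- |
| IKCGNKFKKKYH (original) | 0.957 | 7.713 | 4.83 | 10.03 | Medium binding |
| IQCGNKFKQQYH (Variant 1) | 0.037 | 6.255 | 1.83 | 9.20 | Weak binding |
| IQCGNQFQKNYH (Variant 2) | 0.037 | 6.165 | 0.84 | 8.21 | Weak binding |
| IKCGNEFKKEYH (Variant 3) | 0.035 | 7.227 | 0.84 | 8.16 | Medium binding |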
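The reported net charges follow the count of charged side chains closely. As a rough sanity check, the sketch below counts +1 for each lysine or arginine and -1 for each aspartate or glutamate. This is a simplification and not the method used by the property predictor: it ignores histidine, cysteine, tyrosine, and the termini, which is presumably why the reported values sit slightly below the integer counts.

```python
# Net charges as reported in the results table above.
REPORTED_CHARGE = {
    "IKCGNKFKKKYH": 4.83,
    "IQCGNKFKQQYH": 1.83,
    "IQCGNQFQKNYH": 0.84,
    "IKCGNEFKKEYH": 0.84,
}


def formal_charge(seq):
    """Count +1 per K/R and -1 per D/E (side chains only, assumed fully ionized)."""
    positive = sum(seq.count(res) for res in "KR")
    negative = sum(seq.count(res) for res in "DE")
    return positive - negative


for seq, reported in REPORTED_CHARGE.items():
    print(f"{seq}: formal {formal_charge(seq):+d}, reported {reported:+.2f}")
```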
Stage 4 — Key Finding
K→E substitution (glutamate) outperformed K→Q substitution (glutamine) for preserving binding affinity. Variant 3 lost only 0.486 pKd units, versus roughly 1.5 units for the Q-substitution variants. One possible explanation is that glutamate can form new complementary contacts with the SOD1 surface rather than simply removing charge, although this has not yet been verified structurally.
Variant 3 also achieved a net charge of 0.84 and pI of 8.16 — the most physiologically favorable profile of all variants and comparable to the best PepMLM candidate WRYVAYALRWGE (charge 0.77).
Substitution Strategy Comparison
| Strategy | Hemolysis Resolved? | Affinity Retained? | Charge Reduced? | Verdict |
| --- | --- | --- | --- | --- |
| K→Q moderate (Variant 1) | yes | partial (−1.458) | yes | Weak |
| K→Q aggressive (Variant 2) | yes | partial (−1.548) | best | Weak |
| K→E charge balance (Variant 3) | yes | best (−0.486) | best | Lead |
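Because pKd is the negative base-10 logarithm of the dissociation constant, each pKd difference in the comparison above can also be read as a fold change in predicted Kd. The sketch below recomputes the affinity deltas from the pKd values in the results table and converts them to fold changes. The pKd values are model predictions, so the fold changes are only as reliable as the predictor.

```python
# Predicted pKd values taken from the Stage 3 results table.
PKD = {
    "IKCGNKFKKKYH": 7.713,  # original
    "IQCGNKFKQQYH": 6.255,  # Variant 1
    "IQCGNQFQKNYH": 6.165,  # Variant 2
    "IKCGNEFKKEYH": 7.227,  # Variant 3
}
PARENT = "IKCGNKFKKKYH"


def affinity_change(variant, parent=PARENT):
    """Return (delta_pKd, fold_weaker_Kd) of a variant relative to the parent."""
    delta = PKD[variant] - PKD[parent]
    fold_weaker = 10 ** (-delta)  # pKd = -log10(Kd), so Kd scales by 10**(-delta)
    return round(delta, 3), fold_weaker


for seq in PKD:
    if seq != PARENT:
        delta, fold = affinity_change(seq)
        print(f"{seq}: delta pKd {delta:+.3f}, predicted Kd about {fold:.1f}x weaker")
```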
Outcome
IKCGNEFKKEYH emerged as the optimized lead — retaining medium binding classification (pKd 7.227), achieving full hemolysis safety (0.035), and carrying a charge profile (0.84) and pI (8.16) that favor target selectivity over non-specific membrane disruption.
The K→E glutamate substitution strategy is the demonstrated approach for resolving cationic hemolysis liability without sacrificing binding affinity in this peptide series.
Visualization:
Submit IKCGNEFKKEYH to AlphaFold Server against the A4V SOD1 homodimer to evaluate structural proximity to the A4V mutation site.
Compare ipTM and distance to A4V against the best PepMLM candidate WRYVAYALRWGE (12.875 Å) to determine which pipeline produces the stronger structural result.
If structural proximity is confirmed, consider a fourth generation of optimization targeting further charge refinement while monitoring affinity retention.
IKCGNEFKKEYH Non-Hemolytic - AlphaFold
IKCGNEFKKEYH Non-Hemolytic Surface
IKCGNEFKKEYH Non-Hemolytic Illustration
Part C: Final Project: L-Protein Mutants
High-level summary: The objective of this assignment is to improve the stability and auto-folding of the lysis protein of an MS2 phage. Understanding this lysis mechanism is key to learning how phages could help address antibiotic resistance.
Context & Motivation
The L protein of bacteriophage MS2 is a 74–75 amino acid lysis protein whose stability and auto-folding are critical to understanding how phages can solve antibiotic resistance. The CNN phage therapy case (Strathdee/Patterson) provided real-world context — phage therapy saved a life against Acinetobacter baumannii when all antibiotics failed, underlining why understanding phage lysis mechanisms matters.
“It’s estimated that by 2050, 10 million people per year — that’s one person every three seconds — is going to be dying from a superbug infection.”
— Steffanie Strathdee, UC San Diego
Step 1 — Understanding the Problem
Established that MS2 encodes 4 proteins: Maturation (A), Coat (CP), Lysis (L), Replicase (Rep)
Located L protein on genome: NC_001417 nt 1678–1902
Identified the core challenge: L gene overlaps CP and Rep simultaneously
Any nucleotide mutation in L is also a mutation in a neighboring reading frame
Overlapping Frames
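To see which genome nucleotides a given L residue touches, the codon arithmetic can be sketched as follows. This is a minimal illustration that uses only the NC_001417 range quoted above (nt 1678–1902, treated as 1-based and inclusive); the function name `codon_span` is hypothetical. Applying the same arithmetic with the CP and Rep start coordinates, which are not listed here, would show which codon of the neighboring frame a mutation also alters.

```python
L_START, L_END = 1678, 1902  # L gene range quoted above (1-based, inclusive)


def codon_span(aa_pos, start=L_START):
    """Return (first_nt, last_nt) genome coordinates of codon number `aa_pos`."""
    if aa_pos < 1:
        raise ValueError("residue positions are 1-based")
    first = start + 3 * (aa_pos - 1)
    return first, first + 2


n_triplets = (L_END - L_START + 1) // 3
print("triplets in quoted range:", n_triplets)
for residue in (44, 45, 46):  # TM boundary residues discussed in this project
    print(f"residue {residue}: nt {codon_span(residue)}")
```

The quoted range spans 225 nt, which would fit 74 residues plus a stop codon if the range includes the stop; that reading is an assumption.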
Step 2 — Sequence Acquisition
Retrieved wildtype L protein sequence (74 aa)
Dataset-validated all 32 experimentally constrained positions against the wildtype
Attempted live fetch from UniProt (P03609) — network restricted
Reconstructed sequence from published Fiers 1976 data + dataset ground truth
Downloaded all 4 MS2 protein sequences as FASTA file
Both lysis=1, prot=1. Proline kink + aromatic anchor
3
A45P + I46F
Both lysis=1, prot=1. Classic TM stabilization
4
I46F + S49T
Mixed lysis — epistatic rescue candidate
5
L44P + N53S
TM entry + core — rescue test
Step 9 — Structural Analysis Tools
| Tool | Purpose | Gate |
| --- | --- | --- |
| Benchling | ORF-safe mutation design | Gate 3 |
| ESMFold / AF2 | Structure prediction | Gate 4 |
| ChimeraX | 3D visualization, residue swapping | Gate 4 |
| FoldX / mCSM | ΔΔG stability scoring | Gate 5 |
| AF2_Multimer | Oligomeric assembly prediction | Gate 6 |
| ProteinMPNN | AI-guided sequence redesign | Design |
| QuikChange | Wet lab site-directed mutagenesis | Synthesis |
Key Biological Insights
The L protein’s overlapping reading frames are the primary constraint on mutation design
The Free Zone (aa 16–28) is the only region where mutations affect L protein alone
The TM boundary (aa 44–46) is the most promising target — lysis=1 mutations exist there
L protein functions as an oligomer — monomer folding alone is insufficient
DnaJ chaperone interaction with the soluble domain is critical for proper folding
The C-terminal TM domain drives both membrane insertion and pore formation
Outstanding Steps
[ ] Run ESMFold on all 5 candidate mutants → get pLDDT scores
[ ] Run FoldX / mCSM → get ΔΔG for each candidate
[ ] Run AF2_Multimer → check dimer ipTM scores
[ ] Run Benchling ORF check → verify CP and Rep frames intact
[ ] Rank and select final top 5 candidates
[ ] Synthesize top 2 in wet lab → SDS-PAGE + lysis assay
Appendix: Pipeline Summary with Key AI-Generated Prompts (Claude - Sonnet 4.6)
Stage 1 — Sequence Retrieval and Mutation Introduction
The session began with retrieving the canonical human SOD1 sequence from UniProt (P00441) and introducing the A4V point mutation, which replaces alanine with valine at position 4 of the mature protein. This established the disease-relevant target sequence for all downstream analysis. Key concepts clarified included the numbering convention between the full canonical sequence and the mature processed form, and the biological significance of A4V as the most common fALS-linked SOD1 variant in North America.
Key prompts:
“Begin by retrieving the human SOD1 sequence from UniProt (P00441) and introducing the A4V mutation.”
“What is the A4V mutation?”
“What is a homodimer?”
Stage 2 — Conceptual Grounding
Before moving to computational tools, foundational questions established the biological framework: the structural location of A4V at β-strand 1 near the dimer interface, and the therapeutic rationale for designing a peptide binder — to intercept misfolded A4V SOD1 at the aggregation-prone hydrophobic surface exposed by dimer destabilization.
Key prompts:
“Remind, where is the actual critical region of A4V?”
“Summarize what the fundamental purpose is to take the mutated protein and add a binder sequence.”
Stage 3 — AlphaFold Server Workflow
The AlphaFold Server workflow was established: inputting two copies of the A4V SOD1 sequence as Entity 1 to model the native homodimer, and the peptide as Entity 2. The distinction between protein chains and small molecule ligands was clarified. The rationale for five ranked models per job was explained, and the rank_0 CIF file was identified as the correct starting point.
Key prompts:
“In AlphaFold Server how to add a peptide to a protein sequence?”
“When evaluating in AlphaFold should I be using one strand of the SOD1 sequence or two, to show the mutant form?”
“Each export from AlphaFold includes 4 CIF files. Why?”
Stage 4 — ChimeraX Structural Evaluation
The built-in AlphaFold viewer was identified as insufficient for detailed analysis, leading to adoption of ChimeraX. A core command vocabulary was developed iteratively through troubleshooting: chain coloring, secondary structure coloring, residue labeling to landmark A4V, surface generation with transparency to assess binding depth, and distance measurement to quantify proximity to residue 4. Common errors were resolved including chain specification syntax, atom ambiguity, electrostatic surface cap persistence, and model number conflicts.
Key prompts:
“Is it better to evaluate in AlphaFold or in another visual program to get to these answers?”
“When loading a model into ChimeraX and evaluating, summarize key questions for evaluating visually.”
“Summarize how to best answer the questions, and what ChimeraX visualization will work best.”
Stage 5 — Peptide Observation and Scoring
Five PepMLM-generated peptides were evaluated across AlphaFold confidence metrics (ipTM, pTM) and ChimeraX structural observations (distance to A4V, structural feature engagement, binding depth). A key insight emerged: WRYVAYALRWGE — the lowest ipTM (0.68) — showed the closest proximity to A4V (12.875 Å) and uniquely demonstrated induced folding, a hallmark of meaningful peptide-protein interaction. FLYRWLPSRRGG (ipTM 0.89, distance 22.285 Å) was established as the known binder control baseline.
Key prompts:
“What if a lower ipTM has a closer proximity to the A4V location?”
“In this case the peptide starts to show a helical fold.”
“Update to final summary — FLYRWLPSRRGG is the control or known SOD1-binding peptide.”
Stage 6 — Physicochemical Property Analysis
PepMLM peptide physicochemical properties were analyzed relative to FLYRWLPSRRGG across seven dimensions: solubility, hemolysis, binding affinity (pKd/pKi), molecular weight, net charge, isoelectric point, and hydrophobicity (GRAVY). Three peptides exceeded the control binding affinity. WRYVAYALRWGE showed the highest pKd (6.980), lowest net charge (0.77), and lowest pI (8.59) — the most favorable selectivity profile.
Key prompts:
“Analyze the results.” (physicochemical data pasted)
“Revise the plot — make FLYRWLPSRRGG the baseline and first value, color in gray bar.”
“Format the detailed analysis as a markdown file.”
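Of the seven physicochemical dimensions listed above, GRAVY is the simplest to reproduce by hand: it is the mean Kyte-Doolittle hydropathy value over all residues. The sketch below is a minimal illustration using the published Kyte-Doolittle scale; the function name `gravy` is hypothetical, and the values it prints are not claimed to match the output of the property-screening tool used in the session.

```python
# Kyte-Doolittle hydropathy scale (Kyte & Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value over the sequence."""
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[res] for res in seq.upper()) / len(seq)


for peptide in ("FLYRWLPSRRGG", "WRYVAYALRWGE", "IKCGNKFKKKYH", "IKCGNEFKKEYH"):
    print(f"{peptide}: GRAVY {gravy(peptide):.3f}")
```

More negative values indicate a more hydrophilic peptide overall.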
Stage 7 — MoPPIT Peptide Generation and Analysis
Three MoPPIT-generated peptides were introduced and analyzed. All three showed critically elevated hemolysis probabilities (0.957–0.982, ~20x control) driven by lysine-rich sequences. Despite this, IKCGNKFKKKYH was identified as the highest-affinity peptide of the entire session at pKd 7.713. Motif position scores were introduced as an additional evaluation dimension.
Key prompts:
“What is a motif position?”
“Graph the following MoPPIT generated peptide binders.” (data pasted)
“Format the analysis of MoPPIT data in Hugo markdown format.”
Stage 8 — Hemolysis Resolution
The hemolysis liability of IKCGNKFKKKYH was addressed through systematic lysine substitution. Three variants were designed and screened. The K→E glutamate substitution strategy (Variant 3: IKCGNEFKKEYH) outperformed K→Q substitution — retaining medium binding classification (pKd 7.227), reducing net charge to 0.84, and achieving full hemolysis safety (0.035) comparable to the known binder control.
Key prompts:
“Summarize hemolysis probability and what we may do to resolve.”
“Recommend three peptides derived from IKCGNKFKKKYH that might lower hemolysis.”
“Here are the results of an attempt to lower hemolysis.” (variant data pasted)
Stage 9 — Synthesis and Outputs
All findings were compiled into structured Hugo markdown deliverables: peptide binding observations, numeric summary tables, ipTM vs distance scatter plot, six-panel physicochemical bar chart, MoPPIT analysis, hemolysis resolution pipeline summary, and this appendix. Two lead candidates emerged from separate pipelines for further structural validation.
Key prompts:
“Plot the data points in a visual graphic, highlighting the likely candidate.”
“Download summary of the attempt to achieve hemolysis safety in Hugo markdown format.”
“Revise the hemolysis summary in Hugo markdown format.”
Distilled Conclusion
Two lead candidates emerged from this session across two separate peptide generation pipelines:
WRYVAYALRWGE (PepMLM) — closest structural proximity to A4V (12.875 Å), highest PepMLM binding affinity (pKd 6.980), induced folding behavior upon binding, and the most favorable charge selectivity profile (net charge 0.77). Recommended for AlphaFold dimer evaluation and structural confirmation.
IKCGNEFKKEYH (MoPPIT, optimized) — highest binding affinity of the full session after hemolysis optimization (pKd 7.227), full hemolysis safety achieved (0.035), net charge 0.84, pI 8.16. Glutamate substitution (K→E) demonstrated as the superior strategy over glutamine substitution (K→Q) for charge reduction with minimal affinity loss.
The recommended next step for both candidates is AlphaFold Server structural evaluation against the A4V SOD1 homodimer, followed by distance-to-A4V measurement in ChimeraX to determine which pipeline produces the more structurally relevant binder.
Part 3 — Footnote Attributions
Databases and Sequence Resources
UniProt — Human SOD1 canonical sequence (P00441 / SODC_HUMAN). UniProt Consortium. UniProt: the Universal Protein knowledgebase. Nucleic Acids Research. https://www.uniprot.org/uniprotkb/P00441
Structure Prediction
AlphaFold Server — Structure prediction of SOD1 A4V homodimer and peptide complexes. Abramson J, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 2024. https://alphafoldserver.com
AlphaFold confidence metrics (ipTM / pTM) — Evans R, et al. Protein complex prediction with AlphaFold-Multimer. bioRxiv, 2022. https://doi.org/10.1101/2021.10.04.463034
Molecular Visualization
UCSF ChimeraX — Pettersen EF, et al. UCSF ChimeraX: Structure visualization for researchers, educators, and developers. Protein Science, 2021. https://www.rbvi.ucsf.edu/chimerax/
Peptide Design and Generation
PepMLM — Peptide design via masked language modeling. Chen T, et al. PepMLM: Target Sequence-Conditioned Generation of Peptide Binders via Masked Language Modeling. arXiv, 2023. https://arxiv.org/abs/2310.03842
MoPPIT — Motif-based peptide-protein interaction tool. Source to be confirmed from course materials.
Physicochemical Property Prediction
Solubility prediction — Peptide solubility probability scoring. Source dependent on tool used for property screening — confirm from course pipeline documentation.
Hemolysis prediction — Peptide hemolysis probability scoring. Likely derived from HemoPI or equivalent hemolysis prediction server. Chaudhary K, et al. A web server and mobile app for computing hemolytic potency of peptides. Scientific Reports, 2016. https://webs.iiitd.edu.in/raghava/hemopi/
Binding affinity (pKd/pKi) — Peptide binding affinity prediction. Source dependent on tool used — confirm from course pipeline documentation.
GRAVY score (hydrophobicity) — Kyte J, Doolittle RF. A simple method for displaying the hydropathic character of a protein. Journal of Molecular Biology, 1982. 157(1):105–132.
Isoelectric point (pI) prediction — Bjellqvist B, et al. The focusing positions of polypeptides in immobilized pH gradients can be predicted from their amino acid sequences. Electrophoresis, 1993.
Disease and Biology Context
SOD1 and fALS — Rosen DR, et al. Mutations in Cu/Zn superoxide dismutase gene are associated with familial amyotrophic lateral sclerosis. Nature, 1993. 362:59–62.
A4V mutation and ALS — Cudkowicz ME, et al. Epidemiology of mutations in superoxide dismutase in amyotrophic lateral sclerosis. Annals of Neurology, 1997. 41(2):210–221.
SOD1 misfolding and aggregation — Banci L, et al. Atomic-resolution monitoring of protein maturation in live human cells by NMR. Nature Chemical Biology, 2013.
Induced folding / folding upon binding — Dyson HJ, Wright PE. Intrinsically unstructured proteins and their functions. Nature Reviews Molecular Cell Biology, 2005. 6:197–208.
Structural Biology Concepts
Greek key β-barrel topology — Richardson JS. The anatomy and taxonomy of protein structure. Advances in Protein Chemistry, 1981. 34:167–339.
Protein distance thresholds and contact definition — Keskin O, et al. Principles of protein-protein interactions. Chemical Reviews, 2008. 108(4):1225–1244.
Lysine-mediated membrane disruption and hemolysis — Brogden KA. Antimicrobial peptides: pore formers or metabolic inhibitors in bacteria? Nature Reviews Microbiology, 2005. 3:238–250.
Additional References
Fiers W. et al. (1976) Complete nucleotide sequence of bacteriophage MS2 RNA. Nature 260:500–507
Kastelein R.A. et al. (1982) Lysis gene expression of RNA phage MS2. Nature 295:35–41
Week 6 HW: Genetic Circuits Part I: Assembly Technologies
This week we learn core molecular biology tools and techniques for processing and assembling DNA, including PCR and Gibson Assembly.
Assignment: DNA Assembly
What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?
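The mix contains Phusion DNA polymerase, a thermostable proofreading enzyme that copies the template with a low error rate; dNTPs, the nucleotide building blocks; MgCl2, a cofactor the polymerase requires; and an optimized reaction buffer. Here it is used to amplify the fragments for Gibson Assembly.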
What are some factors that determine primer annealing temperature during PCR?
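Primer length, GC content, and the resulting melting temperature (Tm) of each primer. Salt concentration in the reaction buffer also shifts the Tm.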
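As a rough illustration of how length and GC content set the melting temperature, the sketch below applies the Wallace rule, Tm ≈ 2·(A+T) + 4·(G+C) in °C, a rule of thumb for short oligonucleotides. The example primer is hypothetical and the function names are illustrative. For real Phusion reactions, a vendor calculator based on nearest-neighbor thermodynamics (for example the NEB Tm Calculator) is the better guide.

```python
def gc_content(primer):
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Wallace rule estimate of melting temperature in degrees Celsius."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


example_primer = "ATGCGTACGTTAGC"  # hypothetical 14-nt primer, for illustration only
print(f"GC content: {gc_content(example_primer):.0%}")
print(f"Wallace Tm: {wallace_tm(example_primer)} °C")
```

Each G or C contributes twice as much as an A or T in this estimate, which is why GC-rich or longer primers anneal at higher temperatures.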
There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.
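PCR is more flexible: primers can target almost any region and can introduce mutations or add overhangs, so it suits cases where the desired product varies.
Restriction enzyme digests work best when the sequence is well defined and already contains suitable recognition sites, giving repeatable cuts.
PCR cycles through several temperatures in a thermocycler; a restriction digest is incubated at a single set temperature.
PCR requires primers; a restriction digest requires recognition sites.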
How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?
Ensure that overlapping ends are present so adjacent fragments can align
Confirm fragment sizes using gel electrophoresis
Purify the DNA
How does the plasmid DNA enter the E. coli cells during transformation?
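Chemical treatment followed by heat shock, or electroporation, temporarily makes the cell membrane permeable so the plasmid can enter. The cells are then left to recover and express the plasmid's antibiotic resistance gene before selection.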
Describe another assembly method in detail (such as Golden Gate Assembly)
Explain the other method in 5 - 7 sentences plus diagrams (either handmade or online).
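Golden Gate Assembly is a restriction-based cloning method.
It relies on Type IIS restriction enzymes, which cut outside their recognition sequence.
The gene of interest is flanked by Type IIS sites that generate custom overhangs.
The overhangs are directional, so fragments assemble in a defined order and orientation.
The insert is mixed with the destination vector and any other parts (such as a promoter) that carry compatible overhangs, together with the enzyme, ligase, and buffer.
The Type IIS recognition sites are removed from the final product, so the assembled construct is not cut again.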
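The key property above, cutting outside the recognition sequence, can be sketched in a few lines. BbsI recognizes GAAGAC and cuts 2 nt downstream on that strand and 6 nt downstream on the opposite strand, leaving a 4-nt overhang. The sketch below scans a sequence for BbsI sites in both orientations and reports the overhang each would produce, written as the top-strand sequence. The function names and the demo sequence are hypothetical, and the scan ignores circular wrap-around.

```python
BBSI_SITE = "GAAGAC"  # BbsI: GAAGAC(2/6), leaving a 4-nt 5' overhang
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def bbsi_overhangs(seq):
    """Return sorted (site_start_1based, strand, overhang) for each BbsI site."""
    seq = seq.upper()
    found = []
    # Forward sites: the overhang starts 2 nt after the end of the site.
    start = seq.find(BBSI_SITE)
    while start != -1:
        lo = start + len(BBSI_SITE) + 2
        if lo + 4 <= len(seq):
            found.append((start + 1, "+", seq[lo:lo + 4]))
        start = seq.find(BBSI_SITE, start + 1)
    # Reverse sites (GTCTTC on the top strand): the overhang lies upstream.
    rc_site = revcomp(BBSI_SITE)
    start = seq.find(rc_site)
    while start != -1:
        lo = start - 6
        if lo >= 0:
            found.append((start + 1, "-", seq[lo:lo + 4]))
        start = seq.find(rc_site, start + 1)
    return sorted(found)


# Hypothetical insert: two inward-facing BbsI sites around a payload.
DEMO = "AAAA" + "GAAGAC" + "TT" + "GGAG" + "CCCC" + "CGCT" + "AA" + "GTCTTC" + "AAAA"
for site in bbsi_overhangs(DEMO):
    print(site)
```

Because the overhang sequence is set by whatever bases sit next to the site, each junction can be given its own 4-nt overhang, which is what makes the assembly directional.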
Model this assembly method with Benchling or Asimov Kernel!
For this assembly, I followed the references provided by “Golden Gate Cloning” Plasmids 101 eBook 1 as well as an extended dialogue with Claude AI to explore the Benchling assembly process.
Golden Gate Assembly Wizard in Benchling
Resulting Assembly (with overlapping BbsI in Backbone)
Golden Gate Assembly Attempt Summary
Using the mUAV plasmid (MG252981) as the source sequence in Benchling, the goal was to model a Golden Gate Assembly to swap the wildtype chromophore region (cagtgtcagtac, bases 2291-2302) with a mutant sequence (cagTGTCAGtac) to produce a color change in amilCP. A functional assembly was completed in Benchling; however, it was not biologically accurate because the existing BbsI cut sites at positions 2338 and 2523 did not flank the target chromophore region, and possible account permission restrictions prevented direct sequence editing to reposition them (no “pencil” icon was present).
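The flanking problem can be checked with simple coordinate arithmetic before any assembly is attempted. The sketch below is a deliberately simplified check that treats each BbsI site as a single coordinate and ignores site orientation and cut offsets; the function name and the repositioned example coordinate are hypothetical.

```python
def sites_flank_target(site_positions, target_start, target_end):
    """True if at least one site lies upstream and one downstream of the target."""
    upstream = [pos for pos in site_positions if pos < target_start]
    downstream = [pos for pos in site_positions if pos > target_end]
    return bool(upstream) and bool(downstream)


TARGET = (2291, 2302)         # chromophore region from the summary above
EXISTING_SITES = [2338, 2523]  # BbsI positions from the summary above

print("target length (bp):", TARGET[1] - TARGET[0] + 1)
print("existing sites flank target:", sites_flank_target(EXISTING_SITES, *TARGET))
# Hypothetical repositioning: one site moved upstream of the target.
print("repositioned sites flank target:", sites_flank_target([2270, 2338], *TARGET))
```

Both existing sites sit downstream of base 2302, so the check fails for the current plasmid.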
Top Three Lessons Learned
Lesson 1
Know your cut site positions before starting — BbsI sites must flank the exact target region; misplaced sites produce incorrect assemblies regardless of how clean the protocol execution is.
Lesson 2
Sequence editability must be confirmed first — importing from GenBank into a course account created read-only restrictions that blocked direct editing, requiring workarounds.
Lesson 3
Insert fragment length matters — a 12bp insert is too short for Benchling primer design; flanking sequence context is needed for successful primer generation.
Follow-up Action Items
Action Items
Get edit permissions fixed by instructor
Replace existing BbsI sites at 2338 and 2523 with silent substitution GAGGAT
Place new BbsI sites flanking bases 2291-2302
Redesign assembly with correctly positioned cut sites
Extend insert sequence with flanking bases for successful primer design
Assignment: Asimov Kernel
Kernel Experiment 1 — Repressilator
In my first Kernel experiment, I was able to locate the Repressilator in the public Bacterial Demos repository.
Initial Approach
Search Scope Limitation
I currently don’t have search scope permissions to directly query the repository from within my notebook to pull in parts by search. My approach was to reverse engineer a known construct that provides expected results in a simulation.
I attempted to copy individual parts over to my notebook, which appeared to work initially.
Circuit Design
I created a Repressilator circuit design and confirmed it mirrored the sequence construct and sequence length identically.
Non-Oscillating Result
However, the circuit returned a non-oscillating expression. Two possible causes are under investigation:
Metadata is not fully present in the copied parts
A permissions issue may be preventing the necessary metadata from carrying over to copies of parts
Status: TBD
Working Solution
Successful Approach
Copied the entire working Repressilator construct to my local repository first
Copied the components one by one locally within my notebook
Successfully built and simulated with expected oscillating results
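For reference, the expected oscillating behavior can be reproduced outside Kernel with the classic three-gene repressilator equations (Elowitz and Leibler, 2000). The sketch below is not how Kernel simulates the circuit; it is a minimal forward-Euler integration of the dimensionless model, and the parameter values and initial conditions are illustrative assumptions chosen to lie in the oscillating regime.

```python
def simulate_repressilator(alpha=216.0, alpha0=0.216, beta=0.2, hill_n=2.0,
                           dt=0.01, n_steps=30000):
    """Forward-Euler integration of the dimensionless repressilator model.

    m[i] is mRNA of gene i and p[i] its protein; gene i is repressed by the
    protein of the previous gene in the ring. Returns the protein levels
    (p0, p1, p2) at every step.
    """
    m = [0.0, 0.0, 0.0]
    p = [5.0, 0.0, 15.0]  # asymmetric start so the three genes are not in lockstep
    history = []
    for _ in range(n_steps):
        dm = [-m[i] + alpha / (1.0 + p[i - 1] ** hill_n) + alpha0 for i in range(3)]
        dp = [-beta * (p[i] - m[i]) for i in range(3)]
        m = [m[i] + dt * dm[i] for i in range(3)]
        p = [p[i] + dt * dp[i] for i in range(3)]
        history.append(tuple(p))
    return history


trajectory = simulate_repressilator()
late_p0 = [state[0] for state in trajectory[len(trajectory) // 2:]]
print(f"protein 0, second half of run: min {min(late_p0):.2f}, max {max(late_p0):.2f}")
```

A wide gap between the late minimum and maximum indicates sustained oscillation, whereas a circuit that has settled to a steady state would show nearly equal values.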
LAB: Gibson Assembly
Overview
Changing the color-generating chromophore of the purple Acropora millepora chromoprotein (amilCP) to a variety of orange, pink, and blue mutants.
Build the Gibson Assembly in Asimov Kernel
Participate in BioArt Engineering Workshop at MakerSpace Charlotte
BioArt Studio
On March 16th & 17th, a two-day “Introduction to BioArt Engineering” workshop was conducted at MakerSpace Charlotte, led by Karen Ingram.
The activity was well timed with this week’s subject of color-expressing proteins.
The program leverages a kit and hardware from Amino.bio, a provider of educational biotechnology experiments.
We performed a BioArt experiment, using the engineering toolkit.
We observed a demonstration of pre-lab set up protocols, including agar plate pouring, antibiotic treatment for control, and incubation.
Procedures
The activity was focused on the following procedures:
Prepare the cells — Wake up the E. coli from the stab, chill them in transformation buffer to make them chemically competent (membrane-ready to accept foreign DNA).
Transform — Add the DNA plasmid, then heat shock / ice shock to force the cell membranes to briefly open and close, trapping the plasmid inside.
Select & observe — Plate onto selective (antibiotic) and non-selective plates with controls, incubate, and see who survived and expressed color.
Protocol Steps
Each participant, including myself, performed the steps defined in the protocol:
Plate Preparation — Make selective (antibiotic) and non-selective agar, then pour petri dishes and allow them to solidify.
Control Plate Streaking — Streak the positive control bacteria stab onto a non-selective plate using the stencil, establishing your experimental baseline.
Buffer Cooling — Chill the transformation buffer on ice to prepare it for making cells chemically competent.
Stab Sample Collection — Collect a small sample of the “blank” K12 E. coli from its stab vial.
Swirl Mixing — Mix the bacterial sample with the cold transformation buffer by gently swirling/flicking.
Heat Shock & Ice Shock — Subject the cell-buffer mixture to a brief heat shock (42°C), then immediately return to ice — this is the key step that opens cell membranes to accept the DNA plasmid.
Pouring Recovery Media — Add recovery media (LB broth) to the transformed cells and allow them to rest and recover.
Plating Positive Control — Plate the pre-engineered positive control cells onto a selective plate to confirm the antibiotic selection is working.
Scrape & Plate Transformed Cells — Spread the recovered transformed cells onto selective agar plates.
Painting with Colorful Microbes — Use the Canvas-style technique to create BioArt patterns on agar plates with the engineered pigment-producing bacteria.
Incubation — Place plates in the DNA Playground incubator (or equivalent at ~37°C) for 24–72 hours.
Timed Photo Observation — Document results at intervals, comparing colony colors, density, and plate conditions across experimental and control plates.
At the time of this submittal, the incubation is at ~24 hours out of a 72 hr cycle.
Here are selected images from a very engaging, hands-on BioArt & Engineering experience.
Photography (c)2026 Eric Schneider
Appendix
Footnotes
Gearing, Mary. “Golden Gate Cloning.” Plasmids 101. Addgene. link
Asimov. Kernel — genetic engineering design platform. link
Ingram, Karen. BioGenetic Blooms. MakerSpace Charlotte BioArt Studio. link
Amino Labs. Educational Biotechnology Experiments. link
Kernel Experiment — To-Do Checklist
Setup
Create a Repository for your work
Create a blank Notebook entry to document the homework
Save the Notebook entry to your Repository
Explore Bacterial Demos
Open the Bacterial Demos Repository
Explore the devices to understand how parts work together
Run the Simulator on various examples
Open the Info panel by clicking the “i” icon on the right
Follow the Simulator instructions found in the Info panel
Recreate the Repressilator
Create a blank Construct and save it to your Repository
Open the Characterized Bacterial Parts repository
Search for Repressilator parts using the Search function in the right menu (SEARCH SCOPE TBD)
Drag and drop parts into your blank Construct
Run the Simulator using the play button
Compare your results with the Repressilator Construct in Bacterial Demos
Confirm results match expected oscillating behavior
Document in Notebook
Copy the glyph image and paste into your Notebook entry
Copy the Simulator graphs and paste into your Notebook entry
Build Your Own Constructs
Build Construct 1 using parts from Characterized Bacterial Parts
Build Construct 2 using parts from Characterized Bacterial Parts
Build Construct 3 using parts from Characterized Bacterial Parts
For Each Construct
Explain in your Notebook how you expect the Construct to function
Run the Simulator and record your results in the Notebook
If results don’t match expectations:
Speculate on why in your Notebook
Adjust Simulator settings to attempt expected outcome
Document adjusted results
Week 7 HW: Genetic Circuits Part II: Neuromorphic Circuits
This week covers neuromorphic genetic circuits, showing how engineered gene networks can implement neural-network “perceptron”-like computation and learning.
Assignment Part 1: Intracellular Artificial Neural Networks (IANNs)
Q1. What advantages do IANNs have over traditional genetic circuits, whose input/output behaviors are Boolean functions?
Answer: IANNs can produce many possible responses, closer to a Gaussian distribution than to binary ON/OFF outputs. This allows a graded, continuous range of responses instead of the step-function behavior of Boolean genetic circuits, making them well suited to environments with high variability, such as changing temperature, pH, or time.
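The contrast between a step-function gate and a graded response can be sketched numerically. The example below compares a Boolean AND gate with a single sigmoid perceptron on the same pair of normalized inputs. The weights, bias, and threshold are hypothetical values chosen for illustration and do not describe any specific genetic circuit.

```python
import math


def boolean_and(x1, x2, threshold=0.5):
    """Step-function gate: output is fully ON only if both inputs pass the threshold."""
    return 1.0 if (x1 > threshold and x2 > threshold) else 0.0


def perceptron(x1, x2, w1=4.0, w2=4.0, bias=-4.0):
    """Graded response: sigmoid of a weighted sum of the inputs."""
    z = w1 * x1 + w2 * x2 + bias
    return 1.0 / (1.0 + math.exp(-z))


for x1, x2 in [(0.2, 0.2), (0.4, 0.9), (0.5, 0.5), (0.6, 0.6), (0.9, 0.9)]:
    print(f"inputs ({x1}, {x2}): boolean {boolean_and(x1, x2):.0f}, "
          f"perceptron {perceptron(x1, x2):.2f}")
```

The Boolean gate jumps between 0 and 1, while the perceptron output moves smoothly as either input changes, and a negative weight would play the role of repression.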
Q2. Describe a useful application for an IANN; include a detailed description of input/output behavior, as well as any limitations an IANN might face to achieve your goal.
Answer: A useful application would be a process with high levels of variability, such as changing temperature, pH, or time. For example, when trying to express a large volume of folded protein while the optimal conditions are changing, an IANN may refine the output by training and tuning the model based on output results. One limitation is that the system relies on large datasets and produces a graded prediction instead of a definite answer, so additional downstream logic gates may be required where a Boolean result is needed.
Q3. Draw a diagram for an intracellular multilayer perceptron where layer 1 outputs an endoribonuclease that regulates a fluorescent protein output in layer 2.
Answer: In this dual region circuit:
X1 DNA binds to a positive receptor in Region 1 (a binding site, not a proton).
X1 also binds to a positive receptor in Region 2.
X2 DNA is repressed in Region 1.
X2 DNA binds in Region 2.
Region 2 weight is greater than Region 1, producing a favorable maximum output on the Y axis
Figure: Dual region circuit — With primary output of GFP in Layer 2.
Assignment Part 2: Fungal Materials
Figure: Platonic solids cast in Mycelium; Rendering by E. Schneider - Blender 5.01
Q1. What are some examples of existing fungal materials and what are they used for? What are their advantages and disadvantages over traditional counterparts?
Existing fungal materials include shaped mycelium pressed into molds, including bricks that
build large structures. There are also packaging solutions, textile designs such as clothing
patterns, and thin sheets which have been used as paper, as well as sculptural elements.
Some of the advantages are that the fungi can be mixed with substrates to create different
material properties. There are advantages in acoustic dampening, as well as fire resistance.
Some of the issues arise from the same properties; structural integrity can be compromised
when drying out, causing shrinking, cracking, or breaking. Contamination is an issue when
producing raw materials. The materials are inherently biodegradable which is dependent on how
the substrate is manufactured and post-processed.
Q2. What might you want to genetically engineer fungi to do and why? What are the advantages of doing synthetic biology in fungi as opposed to bacteria?
I am very interested in genetically engineering fungi to create co-cultures that gain the
benefit of both systems. Being able to grow fungi in a controlled environment will accelerate
production of selected strains needed for specific design concepts.
The ability to engineer synthetic biology with fungi means that we may introduce attributes
like color-changing materials, contamination and anti-fungal resistance, structural and tensile
integrity, and novel therapeutic solutions such as drug delivery substrates; all at a larger
scale than bacteria alone may provide.
Figure: 3D Printer preparation for platonic solid forms for Mycelium casting.
Assignment Part 3: First DNA Twist Order
For my final project, I seek to create a photographic image on a flood plate, using custom built labware to project a high resolution image.
My Twist DNA order for a clonal gene is noted below (1B), with a parallel track to order a dual-plasmid construct from Addgene (1A).
Aim 1(a) has a control experiment, which will use the light projection labware I am designing to produce bacterial photographs.
By requesting two existing, well-studied plasmids from Addgene, I may be able to expedite the Design, Build, Test, and Analyze (Learn) cycle.
The following genes have been contributed by Tabor Labs, and can be ordered as standard bacterial stabs.
pSR58.6 (Plasmid #63176) which expresses CcaR constitutively and sfGFP under the PcpcG2-172 promoter
Note: This selection will be reviewed and validated to confirm that it is the best candidate for this project. There are several versions, each with a larger on/off fold change, reaching >100x for high-contrast, high-quality visualization.
Aim 1(b) is to submit a Twist order for a single-plasmid optogenetic system (derived from eLightOn) for E. coli that activates sfGFP expression in response to 470 nm blue light, producing a photographic image in green fluorescence on a flood plate using my projection labware.
The link to my shared Benchling file is found in the class submittal via Google Form.
Citations
Aim 1A — CcaSR Green Light System (pSR43.6r + pSR58.6)
Primary system citation:
Schmidl SR, Sheth RU, Wu A, Tabor JJ. Refactoring and optimization of light-switchable Escherichia coli two-component systems. ACS Synthetic Biology. 2014 Nov 21;3(11):820–31. doi: 10.1021/sb500273n. PMID: 25305428.
Foundational bacterial photography citation:
Levskaya A, Chevalier AA, Tabor JJ, Simpson ZB, Lavery LA, Levy M, Davidson EA, Scouras A, Ellington AD, Marcotte EM, Voigt CA. Synthetic biology: engineering Escherichia coli to see light. Nature. 2005 Nov 24;438(7067):441–2. doi: 10.1038/nature04405. PMID: 16306981.
Aim 1B — eLightOn Blue Light System (pBioLight-1B-eLightOn-v1)
Primary eLightOn system citation:
Li X, Zhang C, Xu X, Miao J, Yao J, Liu R, Zhao Y, Chen X, Yang Y. A single-component light sensor system allows highly tunable and direct activation of gene expression in bacterial cells. Nucleic Acids Research. 2020 Apr 6;48(6):e33. doi: 10.1093/nar/gkaa044. PMID: 31989175. PMC: PMC7102963.
sfGFP sequence source (extracted from pJT119b for E. coli codon-optimized sequence only):
Olson EJ, Hartsough LA, Landry BP, Shroff R, Tabor JJ. Characterizing bacterial gene circuit dynamics with optically programmed gene expression signals. Nature Methods. 2014 Mar 9;11(4):449–55. doi: 10.1038/nmeth.2884. PMID: 24608181.
RsLOV structural reference:
Conrad KS, Bilwes AM, Crane BR. Light-induced subunit dissociation by a light-oxygen-voltage domain photoreceptor from Rhodobacter sphaeroides. Biochemistry. 2013 Jan 15;52(2):378–91. doi: 10.1021/bi3015373. PMID: 23252338. PMC: PMC3582384. PDB: 4HJ6.
LexA408 mutation reference:
Little JW, Edmiston SH, Pacelli LZ, Mount DW. Cleavage of the Escherichia coli LexA protein by the RecA protease. Proceedings of the National Academy of Sciences. 1980;77(6):3225–9. doi: 10.1073/pnas.77.6.3225. PMID: 6251456.
pUC19 backbone — Addgene #50005. NCBI accession: L09137. Specified as clonal plasmid destination vector for Twist Bioscience synthesis order.
Week 9 HW: Cell-free Systems
This week introduces synthesis of proteins using cellular machinery outside of a cell.
Section 1: General Homework Questions
Question 1
Explain the main advantages of cell-free protein synthesis over traditional in vivo methods, specifically in terms of flexibility and control over experimental variables. Name at least two cases where cell-free expression is more beneficial than cell production.
The primary advantage is that cell-free reagents can be lyophilized (freeze-dried) and stored for extended periods outside of a lab freezer, which allows rapid experimentation in a wide range of environments. Because the reaction is reconstituted simply by adding purified water, delivery systems and analysis can be deployed in the field. A familiar point of comparison is a rapid COVID test, which is shelf-stable and includes both a control strip and a result readout.
Another key benefit is that the cell-free process can include well-defined parts, each with specific functions and building blocks that are not dependent upon a living host cell. This means experiments will not fail due to toxicity or competing metabolic pathways, enabling an accelerated test cycle without having to clone or transform. The ability to fine-tune concentrations, DNA templates, and protein components is a core strength of cell-free systems — something not possible in the presence of living cells.
Question 2
Describe the main components of a cell-free expression system and explain the role of each component.
The main components of a cell-free system are lyophilized reagents, freeze-dried and pelletized, which are reconstituted by adding purified water to restart the transcription and translation machinery. Components include:
Cell extract — containing ribosomes, tRNA, and enzymes that carry out protein synthesis
DNA template — circular or linear plasmid providing the genetic instructions
RNA polymerase — responsible for transcription, converting the DNA template into mRNA
Ribosomes — carry out translation, reading the mRNA to assemble the protein from amino acids
Amino acids — the raw building blocks assembled into the target protein
Energy system — ATP and a regeneration source such as creatine phosphate to sustain the reaction
Salts and cofactors — such as Mg²⁺ and K⁺ to optimize ribosome function
Question 3
Why is energy provision regeneration critical in cell-free systems? Describe a method you could use to ensure continuous ATP supply in your cell-free experiment.
Energy provision is critical because without a living host cell, a substitute is required for synthesis to occur. ATP is consumed rapidly during transcription, translation, and tRNA charging, so continuous regeneration is essential. A phosphate donor such as creatine phosphate or PEP (phosphoenolpyruvate) provides the phosphate group that converts ADP back into ATP, sustaining the reaction throughout the experiment.
Question 4
Compare prokaryotic versus eukaryotic cell-free expression systems. Choose a protein to produce in each system and explain why.
Prokaryotic and eukaryotic cell-free expression systems differ in cost, yield, and the protein modifications they support. For the prokaryotic system I would produce GFP, which relates to my core BioLight project. A prokaryotic extract costs less, which matters because the amount of expression needed for flood plates is high. Brightness and contrast can also be set at the DNA level through the choice of biosensors, promoters, and repressors.
In contrast, the eukaryotic cell-free method is more complex and expensive, with slower expression and lower yield. However, it is better suited to human therapeutics such as IL-27, an anti-inflammatory cytokine. With this approach, GFP could be fused to IL-27 to visually validate the areas of inflammation being treated. IL-27 requires glycosylation, a post-translational modification that prokaryotic cell-free systems do not support, which makes eukaryotic cell-free the only viable option for this dual-output therapeutic application.
Question 5
How would you design a cell-free experiment to optimize the expression of a membrane protein? Discuss the challenges and how you would address them in your setup.
I would design a cell-free eukaryotic experiment that expresses IL-27R (the IL-27 membrane receptor) fused with a GFP reporter, expressed when exposed to specific light frequencies in targeted therapeutic areas. Being able to localize the mechanism of action and have it fluoresce to validate expression would be a compelling use case for membrane protein work.
The key challenge is overcoming the hydrophobic nature of membrane proteins, which aggregate and misfold without a lipid environment. This can be addressed by supplying artificial liposomes or nanodiscs — small lipid bilayer structures that the protein can correctly insert into during expression. Glycosylation of IL-27R also requires eukaryotic machinery, which is not possible in prokaryotic cell-free systems.
Question 6
Imagine you observe a low yield of your target protein in a cell-free system. Describe three possible reasons for this and suggest a troubleshooting strategy for each.
Energy depletion — A low yield may indicate exhaustion of the ATP supply needed to sustain transcription and translation. I would troubleshoot by increasing the concentration of creatine phosphate or PEP to ensure continuous ATP regeneration throughout the reaction.
Contamination or incorrect buffer composition — If the water used to reconstitute is not purified, or if salt concentrations such as Mg²⁺ and K⁺ are incorrect, this can disrupt ribosome function and lead to low or no protein output. I would ensure purified water is used at the correct volume, and verify buffer composition before reconstitution.
Membrane protein misfolding — In a eukaryotic cell-free system expressing a membrane protein, the hydrophobic nature of the target protein may lead to aggregation and misfolding without a lipid environment. I would address this by supplying nanodiscs or liposomes to provide a membrane scaffold for correct insertion and folding.
Section 2: Homework question from Kate Adamala
2a. What would the membrane be made of?
The membrane would be a liposome composed of POPC (palmitoyloleoylphosphatidylcholine) with cholesterol added to stabilize the bilayer within the cell-free system. POPC is highly biocompatible and provides a stable enough structure to support insertion of the TNF-α receptor on the membrane surface, enabling the synthetic cell to sense its inflammatory environment.
2b. What would you encapsulate inside?
The cell-free system encapsulated inside would include the DNA sequences to produce RFP and IL-27. The energy system consists of creatine phosphate and PEP, which donate phosphate groups to regenerate ATP from ADP. Required salts Mg²⁺ and K⁺ are included to stabilize the system and optimize ribosome function. The transcription and translation machinery — including ribosomes, tRNA, and RNA polymerase — provides the core expression engine. Finally, the EL222 light-sensing transcription factor is encapsulated to detect incoming 470nm blue light and trigger localized IL-27 expression in response.
2c. Which organism will your Tx/Tl system come from?
A prokaryotic cell-free system alone will not work for this design. While RFP and EL222 could be expressed using a prokaryotic E. coli extract, IL-27 requires glycosylation — a post-translational modification only available in mammalian systems. Therefore a mammalian cell-free extract, specifically HEK293, is required to correctly fold and modify IL-27. The eukaryotic machinery also better supports the overall complexity of the dual-output biocircuit. The POPC liposome membrane is constructed separately and is not dependent on the Tx/Tl system.
2d. How will your synthetic cell communicate with the environment?
The synthetic cell communicates with its environment through TNFR1 (Tumor Necrosis Factor Receptor 1) expressed on the outer membrane surface, which binds extracellular TNF-α at the inflammation site and triggers internal RFP expression as a fluorescent readout. EL222 resides inside the synthetic cell and responds to externally applied 470nm blue light, which penetrates the lipid membrane. Upon light activation, EL222 triggers IL-27 expression and the synthetic cell lyses, releasing the encapsulated IL-27 directly at the targeted inflammation site in a controlled, single-use therapeutic delivery event.
3b. How will you measure the function of your system?
The function of this system will be measured through an FDA-approved clinical trial. Subjects will be randomized into three groups: a control group receiving unmodified IL-27, a placebo group, and a treatment group receiving BioLight-IL-27. All subjects will undergo whole-body fluorescence imaging to capture a baseline inflammation heat map. The BioLight wand will then be applied to activate high-concentration inflammation sites in treatment subjects, triggering localized IL-27 release. After 24 hours, a second intravenous infusion of BioLight-IL-27 is administered and a follow-up fluorescence image captured. The delta between round 1 and round 2 imaging, alongside TNF-α blood panel measurements, will indicate therapeutic efficacy against both control and placebo groups.
BioLight-IL27: Freeze-Dried Biosensors for Robotic Home Healthcare Delivery
Pitch Summary
BioLight uses automated robotic manufacturing systems to produce freeze-dried, light-activated cell-free biosensors that detect inflammation markers and deliver localized IL-27 immunotherapy to healthcare patients at scale.
How It Works
Freeze-dried BioLight-IL27 biosensors are manufactured at scale using automated robotic systems in localized facilities, reducing the need for long-distance transportation and cold-chain refrigeration. Community healthcare providers leverage remote-operated robotic infusion systems to reconstitute and administer the biosensors intravenously, delivering them directly to patients at home for comfort and recovery. Once inside the body, the synthetic cell-free biosensors circulate to sites of elevated TNF-α, where TNFR1 membrane receptors detect inflammation and trigger RFP fluorescence as a visual readout. A mobile app connects to a fluorescence imaging sensor, and the BioLight wand delivers localized 470nm blue light to activate EL222, triggering IL-27 release precisely at the inflammation site — providing therapeutic relief instantly, anywhere, anytime.
Societal Challenge and Market Need
This represents the change needed to extend infusion-based therapies to home healthcare settings. As our population ages, debilitating chronic inflammatory conditions such as psoriatic arthritis are on the rise, and travel to approved infusion sites becomes increasingly challenging for patients. A targeted anti-inflammatory biosensor that can be self-administered at home opens the door for advanced robotically assisted, virtually supervised healthcare — representing the emergence of personalized synthetic bio-healthcare. With over 54 million Americans living with arthritis alone, the time and money saved by the medical profession will allow this market to expand exponentially, reaching more patients and delivering an extended quality of life for all.
Addressing Cell-Free System Limitations
The freeze-dried lyophilized format directly addresses stability — eliminating cold-chain dependency, extending shelf life, and enabling storage at room temperature in the home. Reconstitution with purified water is handled automatically by the robotic infusion system, removing the risk of user error during activation. While each biosensor is single-use by design, the BioLight wand and delivery hardware are fully reusable, creating a viable and cost-effective treatment model. A home healthcare platform with remote monitoring, replenishment alerts, expiration reminders, and 24/7 virtual assistance ensures consistent and safe utilization. As the market adapts and scales, automated handling of materials will make storage, transportation, and manufacturing a highly predictable, monitored, and continuously improving outcome.
Genes in Space
Question 1 — Background
(Maximum 100 words)
Cell-free protein expression systems offer a powerful platform for space biology research, diagnostics, and on-demand biomanufacturing. The BioBits® system makes this technology accessible from classrooms to the ISS. However, we do not yet know how microgravity affects the fundamental kinetics of transcription and translation outside a living cell. On Earth, gravity influences molecular sedimentation, crowding, and reaction dynamics. Removing these forces in spaceflight may fundamentally alter how efficiently cell-free systems perform. Understanding this has direct implications for long-duration missions and opens a new class of accessible, iterative experiments connecting student scientists on Earth with research aboard the ISS.
98 words
Question 2 — Molecular Target
(Maximum 30 words)
Competitive cell-free transcription and translation kinetics measured through RFP, YFP, and GFP across strong, medium, and weak promoters in four replicate BioBits® reaction wells on Earth and the ISS.
Question 3 — Target Relevance
(Maximum 100 words)
Three fluorescent reporters — red, yellow, and green — are coupled to strong, medium, and weak promoters respectively, and combined into BioBits® reaction wells where they compete for the same transcriptional and translational machinery. On Earth, gravitational effects including molecular sedimentation and crowding are expected to favor higher-strength promoters, with yellow dominant as a stable middle control. In microgravity, reduced physical barriers may shift the competitive balance toward weaker promoters, causing green to emerge more frequently. This traffic-light readout transforms subtle kinetic differences into a visually unambiguous, measurable signal directly observable through the P51 Molecular Fluorescence Viewer.
Question 4 — Hypothesis
(Maximum 150 words)
I hypothesize that microgravity will increase the efficiency of cell-free protein expression kinetics compared to Earth-based controls. In a gravitational environment, molecular sedimentation and crowding effects create physical barriers to optimal ribosome-mRNA interaction and protein folding. Removing gravity may reduce these barriers, allowing cell-free components to distribute more uniformly and interact more freely, resulting in faster or higher-yield expression. To test and predict this, we developed a three-layer platform: an in-silico simulator that models cell-free reaction dynamics computationally; a physical 3D printed magnetic kit that serves as an educational model on Earth — allowing students to hand-assemble cell-free components — and as an experimental observation tool aboard the ISS, where components are released in zero-g to document free-floating self-assembly behavior; and BioBits® four-well replicate reactions that generate real fluorescence data on Earth and the ISS. Each run retrains the simulator, improving predictive accuracy over time.
Question 5 — Experimental Plan
(Maximum 100 words)
BioBits® reaction tubes are prepared as four replicate wells on Earth and single-tube runs aboard the ISS: Tube 1 (RFP vs YFP — strong vs medium promoter), Tube 2 (YFP only — baseline control), Tube 3 (GFP vs YFP — weak vs medium promoter), Tube 4 (equal R+Y+G — open competition). Reactions are incubated using the miniPCR® thermal cycler; fluorescence outcomes are observed through the P51 Molecular Fluorescence Viewer and documented by Raspberry Pi camera. Aboard the ISS, a 3D printed magnetic molecular kit is released in zero-g; Raspberry Pi machine vision captures free-floating motion and self-assembly events. All fluorescence and motion data feed an in-silico simulator that predicts outcomes and retrains with each run.
Genes in Space 2026 — genesinspace.org
Week 10 HW: Advanced Imaging and Measurement Technology
This week’s lecture presents a range of advanced technologies for precision measurement of proteins at atomic scales, characterization of chemical composition, and detection of protein sequence and structure.
Question 1 — What aspects of your project will you measure?
Validity and viability of the BioLightV5 plasmid obtained from Twist, confirmed through gel electrophoresis and successful colony growth in E. coli.
Fluorescence output of sfGFP in response to blue light exposure, captured on a 0-255 grayscale tonal scale and in individual RGB channels to measure full-color fluorescence luminosity.
Tonal range and image contrast of the expressed biological image relative to the projected photographic input.
Light source consistency of the 470nm LED array across the exposure field.
Plasmid molecular weight at three timepoints — pre-transformation, post-transformation, and post-expression — to characterize metabolic load.
Question 2 — How will you perform these measurements?
Plasmid sequence and size evaluated via gel electrophoresis at Genspace immediately following receipt of the Twist order.
Blue light exposure dose calibrated using an 8-gradation step-wedge pattern, producing a dose-response curve linking light input duration and intensity to fluorescence output.
Fluorescence intensity and spatial distribution captured via camera on the Raspberry Pi, with a histogram recorded per image and edge detection applied to map contrast across the expressed biological substrate.
Spectral output of the 470nm LED array verified in real time using the AS7341 sensor integrated into the BioLight exposure unit.
Protein molecular weight confirmed via MALDI-TOF mass spectrometry through Ginkgo Cloud Lab upon Twist order delivery, establishing a pre-expression baseline for Aim 2.
Question 3 — What technologies will you use?
Gel electrophoresis
Run the gel in the Genspace lab to confirm that the plasmid matches the expected size before committing to transformation and incubation.
Mass spectrometry — MALDI-TOF via Ginkgo Cloud Lab (Aim 2)
Using the MALDI-TOF, the most accessible and widely used mass spectrometry instrument, to establish a baseline and control for Aim 2 and beyond.
Measurement unit: mass-to-charge ratio (m/z) expressed in Daltons (Da) or kiloDaltons (kDa)
BioLightV5 plasmid — expected approximately 1.44 MDa for the 2,201 bp double-stranded DNA construct
sfGFP protein confirmation — expected at approximately 26.9 kDa
EL222 protein confirmation — expected at approximately 23.6 kDa
Note: MALDI-TOF applied specifically to protein molecular weight confirmation post-expression; plasmid verification handled by gel electrophoresis
Step-wedge calibration
The step-wedge will allow for a cycle of blue light exposure, with ample off-time to ensure that growth is sustained and that toxicity is not introduced.
The step-wedge will contain 8 gradations, providing a calibrated tonal range from minimum to maximum blue light exposure.
Fluorescence imaging — OpenCV
The data will be captured and used to fine-tune exposure and image quality.
A histogram will be recorded for each image, mapping pixel intensity values across the 0-255 tonal scale and RGB channels to track expression range and consistency across exposures.
Edge detection via OpenCV Canny algorithm — for refinement of contrast, a direct correlation to folding and biosensor activity.
AS7341 spectral sensor: Raspberry Pi integration
Optimize and control light spectrum.
The sensor will be connected directly into the exposure unit, with spectral data contributing to the LLM training dataset for downstream image recognition and biosensor pattern interpretation.
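The step-wedge and histogram measurements described above can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions: the function names, the evenly spaced 8-bit levels, and the tiny green-only test image are hypothetical, and the real pipeline would read frames from the Raspberry Pi camera instead.

```python
def step_wedge_levels(steps=8, lo=0, hi=255):
    """Return `steps` evenly spaced 8-bit exposure levels from lo to hi."""
    return [round(lo + i * (hi - lo) / (steps - 1)) for i in range(steps)]


def channel_histograms(pixels):
    """Count 0-255 intensities per channel for an iterable of (R, G, B) tuples."""
    hists = {"R": [0] * 256, "G": [0] * 256, "B": [0] * 256}
    for r, g, b in pixels:
        hists["R"][r] += 1
        hists["G"][g] += 1
        hists["B"][b] += 1
    return hists


# Eight gradations spanning the full tonal range
levels = step_wedge_levels()

# Hypothetical test image: one green-only pixel per wedge step
pixels = [(0, level, 0) for level in levels]
hists = channel_histograms(pixels)

print(levels)
print("green pixels at full scale:", hists["G"][255])
```

Plotting the mean green value of each wedge patch against its exposure level would give the dose-response curve mentioned in Question 2.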
Part I: Molecular Weight
Instrument: Waters Xevo G3 QTof MS
Method: Intact LC-MS, denatured state
Q1. Calculated Molecular Weight of eGFP
Based on the predicted amino acid sequence of eGFP (247 aa, including LEHHHHHH purification tag and linker), the ExPASy Compute pI/Mw tool gives a theoretical molecular weight of 28,006.60 Da.
Note: The eGFP chromophore undergoes autocatalytic maturation from residues Thr65-Tyr66-Gly67: cyclization (−18.011 Da) + oxidation (−2.016 Da) = −20.027 Da total, giving an expected intact mass of ~27,986.6 Da for the fully matured protein.
Q2. Charge State Determination from Denatured ESI Spectrum
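Using two adjacent peaks from the denatured eGFP charge state envelope (charge states +31 and +32), where $m_1 < m_2$ are the two m/z values and $m_p = 1.00728$ Da is the proton mass, the charge of the higher-m/z peak and the neutral mass follow from:
$$z_2 = \frac{m_1 - m_p}{m_2 - m_1}, \qquad M = z\,(m/z - m_p)$$
The peak at m/z 875.4421 carries z = +32, which gives a measured mass of approximately 27,981.9 Da, an accuracy of −0.088% (−882 ppm) against the theoretical 28,006.60 Da.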
Question: Can you observe the charge state for the zoomed-in peak in the mass spectrum for the intact eGFP? If yes, what is it? If no, why not?
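Yes. The zoomed-in peaks at 1473.7429 and 1473.7959 are isotope peaks within a single charge state, spaced 0.0530 Da apart. Using the isotope spacing formula, where 1.003 Da is the ¹²C → ¹³C mass difference:
$$z = \frac{1.003}{\Delta(m/z)} = \frac{1.003}{0.0530} \approx 18.9 \approx 19$$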
The charge state is z = +19. This is significantly lower than the denatured charge states (+31/+32) because in the folded native state the compact 3D structure buries basic residues, limiting proton access.
| State | Charge State | m/z Range | Peak Spacing |
| --- | --- | --- | --- |
| Denatured | +31 / +32 | ~875–904 | ~28 Da |
| Native (folded) | +19 | ~1473 | ~0.053 Da (isotope) |
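The isotope-spacing calculation can be checked with a short Python sketch. The constant and function names are illustrative, and the proton mass of 1.00728 Da is an assumed constant, not a value taken from the lab handout. Because the chosen peak is one isotope peak within the cluster, the resulting mass is approximate.

```python
C13_SPACING = 1.003  # approximate 13C - 12C mass difference, Da
PROTON = 1.00728     # proton mass, Da (assumed constant)


def charge_from_isotopes(mz_a, mz_b):
    """Charge state from the m/z spacing of two neighbouring isotope peaks."""
    return round(C13_SPACING / abs(mz_b - mz_a))


def neutral_mass(mz, z):
    """Neutral mass of an ion observed at `mz` carrying `z` protons."""
    return z * (mz - PROTON)


z = charge_from_isotopes(1473.7429, 1473.7959)
mass = neutral_mass(1473.7429, z)

print("charge state:", z)
print("approximate neutral mass (Da):", round(mass, 1))
```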
Part I Conclusion
In this section, I learned that the formula is easy to apply once the variables are known. Adjacent peaks in the charge state envelope differ by exactly one proton, which makes it straightforward to solve for the charge and then compare the experimental mass with the theoretical mass calculated from the sequence. With the charge in hand, I can calculate the molecular weight of the intact protein by subtracting the proton contributions from the measured m/z signal. When zoomed into peaks spaced less than 1 Da apart, we are looking at isotope peaks whose spacing reveals the charge, which is a different scale from the molecular weight itself.
Part II: Secondary/Tertiary Structure — Native vs Denatured eGFP
Instrument: Waters Xevo G3 QTof MS (direct infusion, no LC)
Method: Native and denatured state comparison
Q1. Difference Between Native and Denatured Protein Conformations
Question: Based on learnings in the lab, please explain the difference between native and denatured protein conformations. For example, what happens when a protein unfolds? How is that determined with a mass spectrometer? What changes do you see in the mass spectrum between the native and denatured protein analyses (Figure 2)?
When a protein is denatured, it unfolds, which exposes more surface area and results in more measurable charges. This is determined by running the protein in both denatured and native states using electrospray ionization (ESI), where the protein solution is sprayed through a charged capillary needle to form a fine mist. As the solvent evaporates in open air, protons transfer to the protein, producing multiply charged ions.
In Figure 2, the top green spectrum (denatured/unfolded) begins with high peaks at the lower m/z end that gradually decrease in intensity toward the right. This reflects the broad charge state envelope produced when the unfolded chain exposes all its basic sites to protonation (+31/+32). In the bottom red spectrum (native/folded), the baseline is nearly flat through the middle of the plot, with peaks appearing only in specific m/z windows. The compact folded structure limits proton access, producing lower charge states (z = +19) and leaving large empty regions across the spectrum, in contrast to the broad, gradually declining envelope seen in the denatured state.
Q2. Charge State of the Peak at ~2800 in the Native Spectrum
Question: Zooming into the native mass spectrum of eGFP from the Waters Xevo G3 QTof MS (see Figure 3), can you discern the charge state of the peak at ~2800? What is the charge state? How can you tell?
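Yes, the charge state can be discerned from the native spectrum. Treating the two visible peaks in the full spectrum at 2545.0388 and 2799.4199 as adjacent charge states, with $m_1 = 2545.0388$, $m_2 = 2799.4199$, and proton mass $m_p = 1.00728$ Da:
$$z_2 = \frac{m_1 - m_p}{m_2 - m_1} = \frac{2544.0315}{254.3811} \approx 10$$
The peak at ~2800 is therefore z = +10, and the peak at ~2545 is z = +11.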
These low charge states confirm the protein is in its folded native conformation. The estimated mass accuracy is −0.07%, informed by the prior Part I result of −0.088% and reasoned to be slightly smaller given the higher m/z range. The actual calculated accuracy is −0.080%, confirming the estimate was well-reasoned.
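The adjacent-peak calculation and the resulting mass accuracy can be reproduced with the sketch below. The function name and the 1.00728 Da proton mass are assumptions made for illustration; the two m/z values and the theoretical mass of 28,006.60 Da come from the answers above.

```python
PROTON = 1.00728        # proton mass, Da (assumed constant)
THEORETICAL = 28006.60  # Da, theoretical mass of the full sequence


def charge_from_adjacent(mz_low, mz_high):
    """Charge of the higher-m/z peak, assuming the lower-m/z peak
    carries exactly one more proton."""
    return round((mz_low - PROTON) / (mz_high - mz_low))


mz_low, mz_high = 2545.0388, 2799.4199
z_high = charge_from_adjacent(mz_low, mz_high)
z_low = z_high + 1

# Neutral mass estimated independently from each peak, then averaged
mass_from_high = z_high * (mz_high - PROTON)
mass_from_low = z_low * (mz_low - PROTON)
mean_mass = (mass_from_high + mass_from_low) / 2

error_pct = (mean_mass - THEORETICAL) / THEORETICAL * 100

print("charge states:", z_low, z_high)
print("mean mass (Da):", round(mean_mass, 1))
print("mass accuracy (%):", round(error_pct, 3))
```

Both peaks give nearly the same neutral mass, which is a quick check that they really are adjacent charge states of one species.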
Part II Conclusion
The data in this section made the most sense and I was able to explain the relationship between the data and results.
Part III: Peptide Mapping
Using the ExPASy PeptideMass tool with the full eGFP-6xHis sequence, the default mass filter returned 19 peptides. Removing the mass filter to include peptides of all masses returned the complete theoretical digest of 27 peptides. The difference is accounted for by very small peptides (1–2 amino acids: TR, QK, IR, R) that fall below the default detection threshold.
Q3. Chromatographic Peaks in the Peptide Map (0.5–6 min, >10% relative abundance)
21 peaks were observed above 10% relative abundance. Some peaks were clustered early in the elution window, reflecting shorter and more hydrophilic peptides. Signal peaked before dropping off near the end of the window, consistent with the elution pattern expected for a globular protein like eGFP. The count of 21 falls between the filtered theoretical minimum of 19 detectable peptides and the full unfiltered digest of 27, with the difference accounted for by very small peptides falling below the detection threshold rather than missing sequence.
Q4. Peak Count vs Predicted Peptides
The observed count of approximately 26 peaks does not exactly match the predicted 27 peptides but is very close — a difference of only 1. At least one peak in the elution window was visible but not annotated. This near-complete match confirms the digest was efficient and the primary structure of eGFP is intact.
Q5. Charge State and Mass of Peptide in Figure 5b
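From Figure 5b, two isotope peaks were observed at m/z 525.76 and 526.25, giving an isotope spacing of 0.490 Da. Using the isotope spacing formula, where 1.003 Da represents the ¹²C → ¹³C mass difference:
$$z = \frac{1.003}{0.490} \approx 2$$
The peptide is doubly charged, so its neutral mass is 2 × (525.76 − 1.007) ≈ 1049.51 Da and the singly charged [M+H]⁺ is ≈ 1050.51 Da.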
Matching the measured [M+H]⁺ of 1050.51 Da to the PeptideMass theoretical list identified the peptide as FEGDTLVNR (residues 115–123, theoretical [M+H]⁺ = 1050.5214 Da). Tryptic cleavage confirmed: preceded by K at position 114, ends with R at position 123.
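The peptide charge, the [M+H]⁺ value, and the ppm error against the theoretical FEGDTLVNR mass can be verified as follows. This is a sketch with assumed names and an assumed proton mass of 1.00728 Da. The [M+H]⁺ value is rounded to two decimals to match the precision at which the peaks were read; without that rounding the error comes out a few ppm smaller.

```python
C13_SPACING = 1.003             # approximate 13C - 12C mass difference, Da
PROTON = 1.00728                # proton mass, Da (assumed constant)
THEORETICAL_MH = 1050.5214      # Da, PeptideMass value for FEGDTLVNR

mz_first, mz_second = 525.76, 526.25  # isotope peaks read from Figure 5b

# Charge from isotope spacing
z = round(C13_SPACING / (mz_second - mz_first))

# Neutral mass, then the singly protonated mass
neutral = z * (mz_first - PROTON)
mh = round(neutral + PROTON, 2)  # rounded to the two-decimal reading precision

ppm_error = (mh - THEORETICAL_MH) / THEORETICAL_MH * 1e6

print("charge:", z)
print("[M+H]+ (Da):", mh)
print("ppm error:", round(ppm_error, 2))
```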
Part IV: Oligomers
Using known subunit masses from Table 1 (7FU = 340 kDa, 8FU = 400 kDa) and the CDMS spectrum in Figure 7:

| Species | Theoretical Mass | Observed Peak | Accuracy |
| --- | --- | --- | --- |
| 7FU Decamer | 3.4 MDa | 3.4 MDa | Perfect ✓ |
| 8FU Didecamer | 8.0 MDa | 8.33 MDa | +4.1% ✓ |
| 8FU 3-Decamer | 12.0 MDa | 12.67 MDa | +5.6% ✓ |
| 8FU 4-Decamer | 16.0 MDa | Not detected | Beyond spectrum range |
The 8FU 4-Decamer at 16.0 MDa is not observed because it falls beyond the effective detection range of this CDMS acquisition, where signal intensity drops to near baseline after approximately 15 MDa. Additional peaks at 4.013 MDa and 7.52 MDa likely represent intermediate assemblies such as the 8FU Decamer (10 × 400 kDa = 4.0 MDa).
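The theoretical masses and accuracy values in the table follow from multiplying the subunit mass by the number of subunits. The sketch below reproduces them; the data structure and function names are illustrative, and the subunit counts (10, 20, 30, 40) are inferred from the species names.

```python
SUBUNIT_KDA = {"7FU": 340, "8FU": 400}

# (species, subunit type, number of subunits, observed peak in MDa or None)
SPECIES = [
    ("7FU Decamer", "7FU", 10, 3.4),
    ("8FU Didecamer", "8FU", 20, 8.33),
    ("8FU 3-Decamer", "8FU", 30, 12.67),
    ("8FU 4-Decamer", "8FU", 40, None),  # not detected in this acquisition
]


def theoretical_mda(subunit, copies):
    """Theoretical assembly mass in MDa."""
    return SUBUNIT_KDA[subunit] * copies / 1000


def percent_error(observed, theoretical):
    """Signed percent deviation of the observed peak from theory."""
    return (observed - theoretical) / theoretical * 100


report = {}
for name, subunit, copies, observed in SPECIES:
    theo = theoretical_mda(subunit, copies)
    err = None if observed is None else percent_error(observed, theo)
    report[name] = (theo, err)
    shown = "not detected" if err is None else f"{err:+.1f}%"
    print(f"{name}: {theo:.1f} MDa theoretical, {shown}")
```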
Part IV Conclusion
I quickly identified the 7FU Decamer and 8FU Didecamer. I also identified the next two largest species, but I assumed both were within range. I was off by one position: the largest one falls beyond the range of the spectrum.
Part V: Did I Make GFP?
Instrument: Waters Xevo G3 QTof MS
Method: Intact LC-MS, denatured state
Q1. Intact Protein Mass Confirmation Table
| | Theoretical | Observed/Measured on Intact LC-MS | PPM Mass Error |
| --- | --- | --- | --- |
| Molecular weight | 28,006.60 Da | 27,981.9 Da | −882 ppm |
All three values are internally consistent and derived from Part I:
28,006.60 Da — ExPASy calculation of full 247 aa eGFP-6xHis sequence including LEHHHHHH tag
27,981.9 Da — back-calculated from m/z 875.4421, z = +32 on the Xevo G3 QTof
−882 ppm — accuracy formula applied to the two-peak manual calculation on the denatured spectrum
The measured MW of 27,981.9 Da is consistent with the expected mass of mature eGFP (chromophore maturation −20 Da from 28,006.60 Da = 27,986.6 Da), confirming the protein is correctly folded and the chromophore has matured.
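The numbers in this table can be cross-checked with a short calculation. The sketch assumes a proton mass of 1.00728 Da, so the ppm error lands near −881 rather than exactly −882; the small difference comes from the choice of proton-mass constant and rounding. The comparison against the matured-chromophore mass is included only as supporting arithmetic.

```python
PROTON = 1.00728              # proton mass, Da (assumed constant)
THEORETICAL = 28006.60        # Da, full 247 aa eGFP-6xHis sequence
CYCLIZATION_LOSS = 18.011     # Da, water lost on chromophore cyclization
OXIDATION_LOSS = 2.016        # Da, hydrogen lost on oxidation


def neutral_mass(mz, z):
    """Neutral mass from one charge-state peak."""
    return z * (mz - PROTON)


def ppm_error(measured, reference):
    """Signed mass error in parts per million."""
    return (measured - reference) / reference * 1e6


measured = neutral_mass(875.4421, 32)
mature = THEORETICAL - CYCLIZATION_LOSS - OXIDATION_LOSS

ppm_vs_sequence = ppm_error(measured, THEORETICAL)
ppm_vs_mature = ppm_error(measured, mature)

print("measured mass (Da):", round(measured, 1))
print("expected mature mass (Da):", round(mature, 1))
print("ppm vs full sequence:", round(ppm_vs_sequence))
print("ppm vs mature protein:", round(ppm_vs_mature))
```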
Part V Conclusion
I retrieved the peptide mass but misread the theoretical value. Once corrected, it made sense that the corresponding PPM was −882 based on the full sequence weight in Daltons of ~28,000.
Appendix: Claude AI Assistance Analysis
Claude AI (Sonnet 4.6, Anthropic, 2026) was used as a computational coach throughout all five sections of this homework assignment. The following summarizes AI assistance by section.
Part I — Molecular Weight
Assistance provided: Validated the ExPASy sequence input and caught a critical tag truncation error (26,941 → 28,006.60 Da) when the LEHHHHHH tag was missing from the initial calculation. Tested and confirmed the ESI charge state formula against experimental peak values. Reframed the native MS isotope spacing interpretation to correctly derive z = +19 from isotope peaks rather than adjacent charge state peaks. Validated the final accuracy calculation of −0.000882 (−0.088%).
Rubric: Starting ~7.4/10 → Final 10/10 — ~30% improvement
Largest gains: sequence MW calculation, native MS charge state interpretation.
Part II — Secondary/Tertiary Structure
Assistance provided: Corrected the ESI ionization description from “electrically charged gas tube” to open-air electrospray ionization. Refined the spectral interpretation of Figure 2 to accurately reflect the gradually declining denatured envelope vs the narrow native charge state distribution with flat baseline in the middle. Calculated charge states z = +11/+10 from the two native spectrum peaks. Validated the estimated mass accuracy of −0.07% against the calculated −0.080%.
Rubric: Starting ~7/10 → Final 10/10 — ~25% improvement
Largest gains: ESI description correction, native MS spectral interpretation.
Part III — Peptide Mapping
Assistance provided: Verified K and R counts against the full sequence. Reconciled the PeptideMass filter discrepancy (19 vs 27 peptides) by identifying the default mass filter as the source of the difference. Confirmed the isotope spacing formula and its ¹³C basis. Calculated neutral mass and singly charged [M+H]⁺ from raw m/z values. Identified FEGDTLVNR as the matching tryptic peptide from the PeptideMass list. Calculated mass accuracy at −10.85 ppm. Illustrated the dramatic accuracy improvement from intact protein (~882 ppm) to peptide level (~11 ppm).
Rubric: Starting ~8.7/10 → Final 10/10 — ~15% improvement
Largest gains: peptide identification, ppm accuracy calculation, PeptideMass filter parameters.
Part IV — Oligomers
Assistance provided: Calculated theoretical masses for all four KLH oligomeric species from subunit masses. Matched observed CDMS peaks to theoretical values. Confirmed that the 8FU 4-Decamer at 16.0 MDa falls beyond the effective detection range of the acquisition rather than being absent from the sample. Identified additional unassigned peaks as likely intermediate assemblies.
Rubric: Starting ~8/10 → Final 10/10 — ~20% improvement
Largest gain: distinguishing detection range limitation from sample absence.
Part V — Did I Make GFP?
Assistance provided: Clarified that the theoretical pI of 5.90 is not the MW. Distinguished the peptide mass (1051 Da from Part III) from the intact protein mass (28,006.60 Da). Confirmed that −882 ppm derives from the two-peak manual denatured protein calculation in Part I using the full sequence Dalton weight of ~28,000 Da.
Rubric: Starting ~7/10 → Final 10/10 — ~20% improvement
Largest gain: distinguishing pI, peptide mass, and intact protein MW as separate values.
Overall Assessment
| Section | Starting | Final | Improvement |
| --- | --- | --- | --- |
| Part I: Molecular Weight | 7.4/10 | 10/10 | +30% |
| Part II: Secondary/Tertiary | 7.0/10 | 10/10 | +25% |
| Part III: Peptide Mapping | 8.7/10 | 10/10 | +15% |
| Part IV: Oligomers | 8.0/10 | 10/10 | +20% |
| Part V: Did I Make GFP? | 7.0/10 | 10/10 | +20% |
| Overall | 7.6/10 | 10/10 | +22% |
Claude AI served consistently as a computational coach — confirming, correcting, and refining student answers rather than generating them. The global participant independently reasoned all initial answers; AI provided formula validation, calculation checking, and conceptual reframing where needed. The largest improvements came in sequence-level calculations and instrument-specific interpretation, while the global participant demonstrated strong independent intuition throughout, particularly in spectral observation and pattern recognition.
Week 11 HW: Bioproduction and Cloud Labs
HTGAA 2026 — Week 11: Bioproduction & Cloud Labs
Hypothesis — Version 2.1
This is a hypothesis on the design of a variable luminosity construct based on cell-free protein synthesis. By adding independent reagent modifications to a fixed cell-free DNA and master mix, we hypothesize a measurable delta in sfGFP luminosity relative to the unmodified control, operating on a single mechanistic axis — free Mg2+ availability:
Potassium Phosphate Dibasic added above the baseline 5.625mM sequesters free Mg2+ through phosphate chelation, reducing ribosome assembly efficiency and T7 RNAP cofactor availability — driving sfGFP expression below the control baseline.
Magnesium Glutamate added above the baseline 6.975mM directly increases free Mg2+ in solution, stabilizing ribosome subunit assembly and activating Mg-NTP complexes for both transcription and translation — driving sfGFP expression above the control baseline.
Both reagents operate on the same Mg2+ ion target from opposite directions — phosphate as a Mg2+ sink and magnesium glutamate as a Mg2+ source. The relative magnitude of the positive and negative deltas from control, measured by spectrophotometry at excitation 485nm / emission 510nm, will reveal whether the master mix is operating below, at, or above its Mg2+ optimum — directly informing the optimized reaction conditions for eLightOn CFPS deployment in BioLightX5 Aim 2.
The reagent producing the largest delta will be selected as the candidate for multi-level dose titration in a subsequent round.
Figure 1. Mechanistic overview of the single-axis Mg2+ deviation hypothesis. Left: KPO4 dibasic as phosphate sink drives negative delta. Center: control baseline. Right: MgGlu as Mg2+ source drives positive delta.
Assignment Overview
This week’s homework is a collaborative cloud lab CFPS experiment — HTGAA 1536 — a real-time global sfGFP artwork canvas where each student contributes reagent modification wells to a shared 384-well plate, feeding into a class-wide CFPS optimization dataset.
DNA template, master mix composition, temperature, and reaction time are fixed by the class protocol and identical across all wells. No DNA modifications are introduced. The sole experimental variable is additive supplementation: reagents added on top of the fixed master mix to modulate sfGFP expression above or below the class baseline. Water volume is adjusted automatically by the platform to keep the total additive volume at 2000nL per well, within a total reaction volume of 12,000nL. All modifications operate on the free Mg2+ axis via two independent reagents from the approved list.
Final Well Assignments — JSON Verified
All volumes verified from submitted JSON. Stock concentrations: KPO4 dibasic 0.5M, MgGlu 0.5M. Total additive volume per well: 2000nL. Total reaction volume: 12,000nL.
| Well | Label | Reagent | Stock (nL) | Water (nL) | Added (mM) | Total final (mM) | Status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| W1 | P1 | KPO4 dibasic | 150 | 1850 | +6.250 | 11.875 | Above ceiling |
| W2 | P2 | KPO4 dibasic | 100 | 1900 | +4.167 | 9.792 | Safe |
| W3 | P3 | KPO4 dibasic | 50 | 1950 | +2.083 | 7.708 | Safe |
| W4 | P4 | None | 0 | 2000 | none | Baseline | Control |
| W5 | P5 | MgGlu | 50 | 1950 | +2.083 | 9.058 | Safe |
| W6 | P6 | MgGlu | 100 | 1900 | +4.167 | 11.142 | Safe |
| W7 | P7 | MgGlu | 150 | 1850 | +6.250 | 13.225 | Above ceiling |
| W8 | P8 | MgGlu | 200 | 1800 | +8.333 | 15.308 | Above ceiling |
Wells P1 and P8 are designated Above ceiling: they intentionally exceed the published tolerable ionic range so that P1 maps the suppression floor and P8 maps the inhibitory slope of the Mg2+ dose-response curve. Results from these wells are expected to show reduced output relative to the safe-zone wells and will be interpreted as boundary conditions rather than optimal expression targets. P7, at 13.225mM, also exceeds the 12mM MgGlu ceiling and is flagged Above ceiling in the well table.
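The well table can be recomputed from the stated stock concentration, volumes, and baselines. The sketch below is one way to do that check; the function and constant names are hypothetical, and the 10mM and 12mM ceilings are taken from Footnote 2.

```python
STOCK_MOLAR = 0.5            # M, both stocks (KPO4 dibasic and MgGlu)
ADDITIVE_SLOT_NL = 2000      # nL, stock plus water backfill per well
REACTION_VOLUME_NL = 12_000  # nL, total reaction volume
BASELINE_MM = {"KPO4 dibasic": 5.625, "MgGlu": 6.975}
CEILING_MM = {"KPO4 dibasic": 10.0, "MgGlu": 12.0}  # from Footnote 2


def well_plan(reagent: str, stock_nl: float):
    """Return (water nL, added mM, total final mM, status) for one well."""
    water_nl = ADDITIVE_SLOT_NL - stock_nl
    # Dilution: stock concentration (mM) scaled by stock volume / total volume.
    added_mm = STOCK_MOLAR * 1000 * stock_nl / REACTION_VOLUME_NL
    total_mm = BASELINE_MM[reagent] + added_mm
    status = "Above ceiling" if total_mm > CEILING_MM[reagent] else "Safe"
    return water_nl, round(added_mm, 3), round(total_mm, 3), status


wells = [
    ("P1", "KPO4 dibasic", 150), ("P2", "KPO4 dibasic", 100),
    ("P3", "KPO4 dibasic", 50),
    ("P5", "MgGlu", 50), ("P6", "MgGlu", 100),
    ("P7", "MgGlu", 150), ("P8", "MgGlu", 200),
]
for label, reagent, stock_nl in wells:
    print(label, reagent, *well_plan(reagent, stock_nl))
```

The control well P4 is omitted because it receives water only.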
Delta under = RFU(control) - RFU(phosphate well)
Delta over = RFU(magnesium well) - RFU(control)
The well with the largest magnitude delta within the safe zone becomes the candidate for multi-level dose titration in a subsequent round. Above ceiling wells P1 and P8 are evaluated separately as boundary condition data.
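The decision rule above can be sketched as a small script. This is an illustration only: the RFU readings are invented placeholders (no data has been returned yet), the names are hypothetical, and the safe-zone set is taken from the "Safe" rows of the well table.

```python
CONTROL = "P4"
PHOSPHATE_WELLS = ("P1", "P2", "P3")
MAGNESIUM_WELLS = ("P5", "P6", "P7", "P8")
SAFE_WELLS = ("P2", "P3", "P5", "P6")  # rows marked "Safe" in the well table


def delta_from_control(rfu: dict) -> dict:
    """Delta under for phosphate wells, Delta over for magnesium wells."""
    control = rfu[CONTROL]
    deltas = {}
    for well in PHOSPHATE_WELLS:
        deltas[well] = control - rfu[well]   # Delta under
    for well in MAGNESIUM_WELLS:
        deltas[well] = rfu[well] - control   # Delta over
    return deltas


def titration_candidate(deltas: dict) -> str:
    """Safe-zone well with the largest magnitude delta."""
    return max(SAFE_WELLS, key=lambda well: abs(deltas[well]))


# Placeholder readings (assumption): not measured values.
example_rfu = {
    "P1": 900, "P2": 2100, "P3": 2900, "P4": 3500,
    "P5": 4200, "P6": 5600, "P7": 5000, "P8": 3000,
}
deltas = delta_from_control(example_rfu)
candidate = titration_candidate(deltas)
print(deltas)
print("Titration candidate:", candidate)
```

Wells outside `SAFE_WELLS` still receive a delta so they can be reported separately as boundary-condition data.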
Predicted Spectrophotometry — sfGFP Green Gradation
Figure 3. Predicted sfGFP fluorescence across 8 wells. Bar color maps to expected visual fluorescence under UV illumination. The Above ceiling wells P1 and P8 are predicted to show reduced output despite higher reagent concentration, consistent with inhibitory-zone behavior.
Footnote 1 (Baseline RFU uncertainty): The control baseline of ~3,500 RFU used in these predictions is a conservative estimate derived from published CFPS sfGFP benchmarks. Actual baseline fluorescence for this specific extract batch at 50nM DNA template may range from 5,000–20,000 RFU depending on lysate activity, plate reader gain settings, and chromophore maturation completeness within the class-defined reaction window. All predicted RFU values and delta calculations should therefore be interpreted as relative proportions rather than absolute measurements. The class-wide control wells across all student plates will establish the true baseline. All downstream BioLightX5 Aim 2 calibration will reference actual measured RFU from this experiment rather than these predicted values.
Footnote 2 — Above ceiling conditions P1 and P8: Wells P1 (KPO4 dibasic 11.875mM, 150nL stock) and P8 (MgGlu 15.308mM, 200nL stock) intentionally exceed their respective published tolerable ionic ceilings of 10mM and 12mM. These Above ceiling conditions are designed to map the suppression floor and inhibitory slope of the Mg2+ dose-response curve. P1 is expected to show near-complete sfGFP suppression as phosphate chelation exhausts available free Mg2+. P8 is expected to show reduced expression relative to P6 and P7 as excess Mg2+ destabilizes ribosome conformation and competes with Mg-NTP complexes. Neither Above ceiling well will be used as a target for BioLightX5 Aim 2 optimization — they serve as boundary condition markers that define the outer limits of the Mg2+ operating window for this specific extract and master mix formulation.
Connection to BioLightX5 Final Project
This week’s lab activity may be considered Aim Zero of BioLightX5, as a quantitative CFPS calibration step. The results will provide an excellent starting point for Aim 2 — the cell-free version of BioLightX5 — as a predictive model for tunable sfGFP expression using additive-only Mg2+ axis control.
| Aim | Title | Dependency on Aim Zero |
| --- | --- | --- |
| Aim Zero | CFPS calibration | This experiment |
| Aim 1 | Wetlab validation | Independent; running in parallel |
| Aim 2 | Cell-free + imaging platform | Inherits Aim Zero predictive model |
| Aim 3 | Makerspace deployment | Inherits Aim 2 validated protocol |
Broader Significance
Additive-only expression control — without modifying DNA, master mix, temperature, or reaction time — establishes a portable, reproducible TXTL tuning framework applicable across automated and community lab settings.
Cost efficiency: Tuning TXTL output to only the required expression level eliminates over-expression waste and reduces reagent consumption proportionally.
Portability: A fixed master mix with additive-only modifications requires no reformulation across sites — directly deployable at Makerspace Charlotte and beyond.
Scalability: Decoupling expression tuning from master mix preparation enables batch-consistent results across distributed platforms including the OT-2.
Accessibility: Directly supports BioArt Studio’s mission and the iGEM 2026 distributed biomanufacturing framework.
References
sfGFP: Pédelacq et al. (2006). Nature Biotechnology 24(1):79-88. doi:10.1038/nbt1172
Hypothesis: Version 1.0 (retired because spermidine is not among the reagent options; see Version 2.1 above)
This is a hypothesis on the design of a variable luminosity construct based on cell-free protein synthesis. By adding independent reagent modifications to a fixed cell-free DNA and master mix, we hypothesize a measurable delta in sfGFP luminosity relative to the unmodified control:
Spermidine at 3mM drives expression below baseline due to limiting promoter access caused by DNA over-compaction at the transcription initiation site.
Creatine phosphate at 15mM drives expression above baseline by replenishing ATP availability and extending the active translation window beyond the point of energy depletion.
The reagent producing the largest delta will be selected as the Round 2 candidate, where it will be tested at multiple dose levels — low, medium, and high — establishing a multi-point luminosity gradient. Mg²⁺ will be introduced in Round 2 as a co-variable to determine whether ionic modulation of ribosome activity compounds or independently shifts the Round 1 delta.
Experimental Design
DNA template, master mix composition, temperature, and reaction time are fixed by the class protocol and identical across all wells. No DNA modifications are introduced. The sole experimental variable is additive supplementation — small-molecule reagents added on top of the fixed master mix to modulate sfGFP expression above or below the class baseline.
My Well Assignments
| Well | Additive | Mg²⁺ | Target | Purpose |
| --- | --- | --- | --- | --- |
| Control | None | Unchanged | Baseline | Class standard; shared delta reference |
| Under | Spermidine 3mM | Unchanged | Low expression | Limits promoter access via DNA over-compaction |
| Over | Creatine phosphate +15mM | Unchanged | High expression | Extends ATP window for longer active translation |
Mg²⁺ is held constant in Round 1 and introduced only in Round 2 as a co-variable with the winning reagent.
Rationale
Spermidine and creatine phosphate were selected because they act at independent nodes in the expression pathway — transcription and energy respectively — ensuring Round 2 Mg²⁺ co-variable testing can be interpreted without confounding either mechanism.
Spermidine over-compacts DNA above its optimal concentration, limiting promoter access at the transcription initiation site and reducing mRNA output independently of ribosome activity or energy supply.
Creatine phosphate replenishes ATP availability, extending the active translation window beyond baseline energy depletion independently of transcription rate or DNA accessibility.
Measurements
Primary — Spectrophotometric fluorescence
Plate reader excitation 485nm / emission 510nm, RFU at class-defined endpoint. Delta from control is the decision metric:
Δ under = RFU(control) − RFU(spermidine well)
Δ over = RFU(creatine phosphate well) − RFU(control)
The well with the largest magnitude delta becomes the Round 2 candidate.
Secondary — Mass spectrometry
Where available, mass spectrometry quantifies total sfGFP yield independent of fluorescence — including misfolded protein that fails to mature the chromophore. Correlating mass spec yield against RFU across the three wells determines whether the delta reflects translation output, folding efficiency, or both.
Round 2 Design — Pending Round 1 Results
| Well | Additive | Mg²⁺ | Purpose |
| --- | --- | --- | --- |
| Control | None | Unchanged | Baseline reference |
| Low | Winner low dose | + Mg²⁺ | Combined effect, low |
| Medium | Winner mid dose | + Mg²⁺ | Combined effect, medium |
| High | Winner high dose | + Mg²⁺ | Combined effect, high |
Connection to Final Project
This week’s lab activity may be considered Aim Zero of BioLight x2, as a quantitative CFPS calibration step. The results of Round 1, Round 2, and spectrophotometric readings will provide an excellent starting point for Aim 2 — the cell-free version of BioLight x2 — as a predictive model for tunable sfGFP expression using additive-only master mix control.
| Aim | Title | Dependency on Aim Zero |
| --- | --- | --- |
| Aim Zero | CFPS calibration | This experiment |
| Aim 1 | Wetlab validation | Independent; running in parallel |
| Aim 2 | Cell-free + imaging platform | Inherits Aim Zero predictive model |
| Aim 3 | Makerspace deployment | Inherits Aim 2 validated protocol |
References
sfGFP: Pédelacq et al. (2006). Nature Biotechnology 24(1):79–88. doi:10.1038/nbt1172
Part A: The 1,536 Pixel Artwork Canvas | Collective Artwork
Item 1: Pixel Contribution
I contributed to plate #G3, initiating a rose design on April 15. I seeded the concept on Discourse: "#G3 - Starting to build a rose… let’s see what grows!"
Item 2: HTGAA Webpage Notes
2a — What I contributed:
I contributed to plate #G3, planting the seed of a pixel rose and inviting others to build on it — fully expecting it to be overwritten, in the spirit of collaborative design.
2b — What I liked:
I liked how the canvas was a collaborative activity. It’s interesting how quickly it became an algorithmic scripted design. The timeline feature is a good addition, to be able to see the evolution over time.
2c — What could be made better:
I have been thinking of how to introduce more collaboration and team building by creating a challenge that rewards the player with pixels, or points that can be used to build with. While scripted algorithms are great, there could be a separate or individual board space for that. To take the game concept one step further, imagine the Tron Light Cycle game. The goal is to traverse the board — without crossing over another trail. Pick anywhere on an edge to begin, and see if you can make it to the middle. More earned “points” = more chances to move. Earn points by quiz questions, or scavenger hunt activities. Roll a random # and that’s how many “moves” you get.
Part B: Cell-Free Protein Synthesis Reaction Composition
Component Descriptions
1. E. coli Lysate — BL21 (DE3) Star Lysate (includes T7 RNA Polymerase)
The lysate contains all of the cellular building blocks needed to carry out transcription and translation outside of the cell. T7 polymerase is selected for its known capability of recognizing the T7 promoter sequence in the cell-free system design.
2. Potassium Glutamate
Potassium Glutamate has the most volume of salt in this buffer, and its role is to manage the ionic binding strength.
3. HEPES-KOH pH 7.5
Its main role in the buffer is to stabilize pH.
4. Magnesium Glutamate
Mg²⁺ is an important element in maintaining ribosome structure. It is highly tunable in order to promote ribosome function, without impacting other components.
5. Potassium Phosphate Monobasic & Dibasic
The potassium phosphate is primarily an energy source, that also helps to maintain pH. The key is the ratio between the two to raise/lower pH respectively.
6. Ribose
Ribose provides energy in the transcription process. It is a sugar-based component of fuel.
7. Glucose
Glucose, along with ribose, creates an energy system that helps drive regeneration of ATP and works well over long timeframes.
8. AMP, CMP, GMP, UMP (Nucleoside Monophosphates)
These are the consumable building blocks that are considered substrates for RNA synthesis. They are recycled and sustainable over longer periods of time.
9. Guanine
Guanine provides the energy for GTP. It is important in that it prolongs the cycle of ribosome translation.
10. 17 Amino Acid Mix
These are the essential building blocks of protein synthesis, and will allow tRNA to pair with mRNA in the ribosomes, based on transcription information.
11. Tyrosine
It is another of the essential amino acids for tRNA to mRNA translation. It is handled uniquely due to its poor solubility.
12. Cysteine
Cysteine is an essential amino acid. It is added separately because its unstable chemical bonding properties prevent it from mixing with the other elements. It is important for structural development in protein folding.
13. Nicotinamide
Nicotinamide is a stabilizer that supports the energy system found in metabolic enzymes.
14. Nuclease Free Water (Backfill)
As a backfill, it is free of contamination to allow the defined mix to perform at full potential.
Question 1: 1-Hour PEP-NTP vs. 20-Hour NMP-Ribose-Glucose Master Mix
The main difference between the 1-hour and 20-hour mix is in the composition of the master mix. The 20-hour incubation contains additional Ribose and Glucose to provide energy that allows translation to sustain the longer time frame. The 1-hour mix uses spermidine to help accelerate the translation, with short-term stabilization of nucleic acids. In contrast, the 20-hour contains nicotinamide to stabilize the enzymes. HEPES-KOH has a higher level to mitigate the higher pH of the concentration. In the one-hour mix, PEP (phosphoenolpyruvate) + NTPs is the energy source, which is fast and high-yield but generates inorganic phosphate buildup that inhibits the reaction, limiting it to ~1 hour.
Question 2 (Bonus): How Can Transcription Occur if GMP is Not Included but Guanine Is?
Guanine provides the energy for GTP. It is important in that it prolongs the cycle of ribosome translation. Guanine base can substitute for GMP — the lysate contains the enzymatic machinery to convert it.
Part C: Fluorescent Proteins in Cell-Free Systems
Question 1: Biophysical and Functional Properties
| Protein | Answer | Supplemental Notes |
| --- | --- | --- |
| sfGFP | sfGFP is well suited for cell-free systems as it has a proven high rate of folding ability, and has a fast chromophore response, needed for fast fluorescence | Folds without cellular chaperones; chromophore maturation still requires O₂ |
| mRFP1 | Slowly-maturing monomer with low acid sensitivity; requires more time and has low pH sensitivity which inhibits maturation | Primary limitation is slow maturation + low quantum yield (~0.25); pH is not the limiting factor at pH 7.5 |
| mKO2 | mKO2 has moderate pH sensitivity so it may be inhibited by higher pH; slow to mature which will limit fluorescence yield | Maturation half-time ~180 min vs sfGFP ~30 min; O₂ dependence also limits yield in sealed reactions |
| mTurquoise2 | Rapidly-maturing monomer with very low acid sensitivity; medium expression with high brightness | Brightness driven by exceptionally high quantum yield (~0.93); maturation is moderate not rapid |
| mScarlet-I | Fast maturation and high brightness, with moderate acid sensitivity | Maturation half-time ~40 min; among best red FP choices for CFPS; “I” = intermediate brightness/speed tradeoff |
| Electra2 | Oxygen-independent and requires an exogenous reagent; unique in the low waveform spectrum | Exogenous cofactor is biliverdin, must be added to mastermix; emission is near-infrared (~670nm), high not low wavelength |
Question 2: Mastermix Hypothesis
Protein: sfGFP — selected for its direct relevance to BioLight Aim 2, a cell-free biosensor that expresses sfGFP when exposed to ~470nm blue light.
Reagents: Mg²⁺ availability, adjusted via two mechanisms operating on a single mechanistic axis:
A. Potassium Phosphate Dibasic (Mg²⁺ sink)
Added above the baseline 5.625mM, potassium phosphate dibasic sequesters free Mg²⁺ through phosphate chelation, reducing ribosome assembly efficiency and T7 RNAP cofactor availability — driving sfGFP expression below the control baseline.
B. Magnesium Glutamate (Mg²⁺ source)
Added above the baseline 6.975mM, magnesium glutamate directly increases free Mg²⁺ in solution, stabilizing ribosome subunit assembly and activating Mg-NTP complexes for both transcription and translation — driving sfGFP expression above the control baseline.
Expected effect: Since both reagents interact with ionic affinity to the ribosome, the focus is on a singular variable for fine tuning in both directions from the master mix as control. The relative magnitude of the positive and negative deltas from control, measured by spectrophotometry at excitation 485nm / emission 510nm, will reveal whether the master mix is operating below, at, or above its Mg²⁺ optimum — directly informing optimized reaction conditions for eLightOn CFPS deployment in BioLight x2 Aim 2.
Predicted dynamic range: ~10× from W1 to W8. The asymmetry between negative delta (−2,700 max) and positive delta (+4,300 max) suggests the master mix is operating below its Mg²⁺ optimum — more headroom above baseline than below.
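The ~10× figure can be reproduced from the predicted deltas, assuming both are measured against the ~3,500 RFU control baseline given in Footnote 1. The function name below is hypothetical, and the numbers are predictions rather than measurements.

```python
def dynamic_range(baseline: float, max_negative_delta: float,
                  max_positive_delta: float) -> float:
    """Ratio of the highest to the lowest predicted RFU around a baseline."""
    lowest = baseline - max_negative_delta
    highest = baseline + max_positive_delta
    if lowest <= 0:
        raise ValueError("negative delta must be smaller than the baseline")
    return highest / lowest


BASELINE_RFU = 3500        # assumed control baseline (Footnote 1)
MAX_NEGATIVE_DELTA = 2700  # predicted largest drop below control
MAX_POSITIVE_DELTA = 4300  # predicted largest gain above control

fold = dynamic_range(BASELINE_RFU, MAX_NEGATIVE_DELTA, MAX_POSITIVE_DELTA)
headroom_ratio = MAX_POSITIVE_DELTA / MAX_NEGATIVE_DELTA
print(f"dynamic range: {fold:.2f}x, headroom up vs down: {headroom_ratio:.2f}")
```

Because the ratio depends on the baseline, it should be recomputed once the class-wide control wells establish the measured baseline.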
Connection to BioLight x2:
| Aim | Title | Dependency |
| --- | --- | --- |
| Aim Zero | CFPS calibration | This experiment |
| Aim 1 | Wetlab validation | Independent; running in parallel |
| Aim 2 | Cell-free + imaging platform | Inherits Aim Zero predictive model |
| Aim 3 | Makerspace deployment | Inherits Aim 2 validated protocol |
Question 4: Fluorescence Data Analysis
Pending — due one week after data is returned (date TBD).
Part D: Build-A-Cloud-Lab (Bonus)
What I Built
For the Cloud Lab render, I used the default assembly, but focused on an idea for a circular floor plan, to introduce the idea of a circular manufacturing layout. While linear production lines are expected, there may be some spatial advantages to a circular layout, to fit more equipment in less space, with a central pivot point for automated transfer. I added a visualization of a data wall, where all activity and data is monitored from a command center or virtual reality heads-up display (which would correspond well with a circular layout — no linear navigation required).
What I Liked
I liked the idea of a visual configurator that leads the way towards a “digital twin” of the lab, where protocols can be designed, built, and simulated.
Design Ideas
I would like to be able to import my own custom labware, and apply transforms such as x, y, z position and scale within a unit. That would lead to connectivity with the API to perform animated sequences, true to the intent of remote automation.
Appendix: Scoring Summary
Part A
| Item | Score |
| --- | --- |
| A1: Pixel contribution | 9/10 |
| A2a: What I contributed | 9/10 |
| A2b: What I liked | 8/10 |
| A2c: What could be improved | 9/10 |
| Part A overall | 8.75/10 |
Confidence: ± 6%
Part B — Component Scores
| Component | Initial | Final | Confidence |
| --- | --- | --- | --- |
| E. coli Lysate | 6/10 | 8/10 | 8/10 |
| Potassium Glutamate | 5/10 | 7/10 | 8/10 |
| HEPES-KOH | 8/10 | 8/10 | 8/10 |
| Magnesium Glutamate | 5/10 | 7/10 | 8/10 |
| Potassium Phosphate | 3/10 | 7/10 | 6/10 |
| Ribose | 4/10 | 6/10 | 7/10 |
| Glucose | 6/10 | 8/10 | 8/10 |
| AMP/CMP/GMP/UMP | 4/10 | 7/10 | 7/10 |
| Guanine | 4/10 | 7/10 | 8/10 |
| 17 AA Mix | 7/10 | 8/10 | 9/10 |
| Tyrosine | 3/10 | 8/10 | 6/10 |
| Cysteine | 6/10 | 9/10 | 8/10 |
| Nicotinamide | 3/10 | 7/10 | 8/10 |
| Nuclease Free Water | 6/10 | 7/10 | 7/10 |
| B-Q1: PEP vs NMP | 7/10 | 9/10 | 9/10 |
| B-Q2: Bonus | 5/10 | 7/10 | 6/10 |
| Average | 5.3/10 | 7.6/10 | 7.5/10 |
Part C — Scores
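| Item | Initial | Final | Confidence |
| --- | --- | --- | --- |
| sfGFP | 5/10 | 8/10 | 8/10 |
| mRFP1 | 6/10 | 6/10 | 8/10 |
| mKO2 | 4/10 | 6/10 | 6/10 |
| mTurquoise2 | 6/10 | 6/10 | 9/10 |
| mScarlet-I | 4/10 | 8/10 | 7/10 |
| Electra2 | 3/10 | 7/10 | 4/10 |
| Q2 Hypothesis | 8/10 | 9/10 | 9/10 |
| Average | 5.1/10 | 7.1/10 | 7.3/10 |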
Part D — Scores
| Item | Score |
| --- | --- |
| Participation | 10/10 |
| Design rationale | 9/10 |
| What I liked | 9/10 |
| Design Ideas | 9/10 |
| Overall | 9.25/10 |
Overall Scorecard (All Parts)
| Section | Final Accuracy |
| --- | --- |
| Part A | 8.75/10 |
| Part B | 7.6/10 |
| Part C | 7.1/10 |
| Part D | 9.25/10 |
| Overall | 8.2/10 |
| Metric | Score |
| --- | --- |
| Final accuracy | 8.2 / 10 |
| Competency score | 8.4 / 10 |
| Average improvement (initial → final) | +2.2 pts |
| Confidence calibration gap | ± 8% |
| Self-correction rate | 88% |
Strengths: Part D design thinking, Part B Q1, Cysteine, mScarlet-I, sfGFP
Recurring gap: pH framing consistently inverted; molecular mechanism specificity (enzyme names, pathways, quantum yield)
Competency score (8.4) exceeds raw accuracy (8.2) — consistent self-correction and strong experimental design thinking throughout.
Document complete pending Part C Q4 data return.
Week 13 HW: Synbio
Final Project Build
This week has been focused on key milestones for my final project.
BioLight V5 circular plasmid finalized
Benchling construct submitted to the Node Order Form as approved
TWIST order simulated with no errors reported
pUC19 confirmed as vector, with availability at Genspace
Scheduled a visit to Genspace on May 28th to attend Safety Training and Orientation
Aim 1 objective is to validate viability of the Clonal Gene “BioLight V5” to be used in the “Photoplasm” optogenetic labware
Slides for final project are drafted
ChimeraX being used to generate Mechanism of Action illustrations.
pDawn with sfGFP identified and ordered from AddGene as a control to enable testing of protocols
Design, Development, and physical prototyping (Build) of the “Photoplasm” device is proceeding, with sensor data.
Additional software applications developed to facilitate electrical engineering schematics.
All steps being documented as an open-source build framework, Aim 2 objective in support of the larger Aim 3 “MakerSpace” vision.
Reviewed and contributed to the “Ai Tutor” project led by @Derek - Shared a Q&A prompt based on accuracy & confidence in a feedback loop.
Week 14 HW: Bio Design & Bio Manufacturing
This week has been focused on finalizing my final project.
This lab is 100% in person. Here are some activities I was able to explore:
I downloaded the Lab Inventory and Lab Prep Manuals for reference. I also watched the Additional Resource Videos: the Measurement Crash Course Video and the Dilution Problems Video.
This week’s lab was focused on the use of OpenTrons. I was able to write a protocol that ran with success in our MakerSpace Charlotte BioArt Studio. Visit the OpenTrons lab activity here: Opentrons Art
For Week 6, my labwork is embedded in my homework found here: Week 6 Genetic Circuits Part I
During this week, I was able to attend the MakerSpace Charlotte BioArt Studio class focused on Microbial Engineering. See photos from the event at the link above.
This week’s lab was conducted at the Waters Immerse Cambridge facility. I reviewed and used the protocols provided in the homework found here: Week 10: Advanced Imaging & Measurement Technology
Week 4 Lab: Protein Design Part I
Week 4 labwork was conducted within the homework assignment found here: Week 4 Homework
Chapter overview. Three parts: (1) Round 1 Q&A on the six mandatory post-lab questions using an accuracy / confidence rubric with one refinement pass per answer; (2) a reflection introducing the Kirkpatrick model and a lab simulator built to extend Level 2 learning before the wet bench; (3) a placeholder for Round 2 — a re-run of the same six questions after a focused simulator session, used to test whether gamification measurably improves accuracy and calibration.
Lab Context
Lab: Bioproduction of Beta-Carotene and Lycopene
System: E. coli transformed with pAC-LYC (lycopene) or pAC-BETA (beta-carotene), grown across a 16-condition matrix (2 plasmids × 4 media × 2 temperatures, in duplicate, plus 2 media-only controls = 34 cultures total).
Measurements: OD600 for cell density; peak absorbance at 474 nm (lycopene) and 456 nm (beta-carotene) for pigment quantification.
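The culture count and the pigment-per-cell normalization used in this lab (peak absorbance divided by OD600) can be sketched as follows. The media and temperature labels are placeholders because the specific conditions are not listed here, and the function names and absorbance readings are hypothetical.

```python
from itertools import product

PLASMIDS = ("pAC-LYC", "pAC-BETA")
MEDIA = ("medium_1", "medium_2", "medium_3", "medium_4")  # placeholder labels
TEMPERATURES = ("temp_1", "temp_2")                        # placeholder labels
REPLICATES = (1, 2)
PEAK_NM = {"pAC-LYC": 474, "pAC-BETA": 456}  # pigment absorbance peaks


def culture_list():
    """All cultures: 2 plasmids x 4 media x 2 temperatures x 2 replicates,
    plus 2 media-only controls."""
    cultures = list(product(PLASMIDS, MEDIA, TEMPERATURES, REPLICATES))
    controls = [("media-only", f"control_{i}", None, None) for i in (1, 2)]
    return cultures + controls


def pigment_per_cell(peak_absorbance: float, od600: float) -> float:
    """Normalize pigment absorbance by cell density (OD600)."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return peak_absorbance / od600


cultures = culture_list()
print(len(cultures), "cultures in total")

# Two hypothetical cultures with equal density but different pigment signal.
low = pigment_per_cell(peak_absorbance=0.30, od600=1.5)
high = pigment_per_cell(peak_absorbance=0.60, od600=1.5)
print(f"pigment per cell: {low:.2f} vs {high:.2f}")
```

Normalizing this way separates how much a culture grew from how much pigment each cell produced, which is why the highest-growth condition need not give the highest value.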
Part 1 — Round 1: Initial Q&A with Rubric
Each of the six mandatory post-lab questions is answered using a rubric loop: I provide an initial answer, rate my own confidence (1–10), receive an accuracy score (1–10) plus a set of next-step prompts, then revise once and rate confidence again. Final accuracy reflects the revised answer. Refinements are counted but not used in scoring.
Q1. Which genes when transferred into E. coli will induce the production of lycopene and beta-carotene, respectively?
Answer: plasmids pAC-LYC and pAC-BETA
Tip
Next steps to consider:
The question asks which genes — but plasmids are delivery vehicles, not genes. What’s actually inside each plasmid that does the work?
The pre-lab reading names three specific genes from one bacterium for lycopene, and one additional gene for beta-carotene. Can you go back and find them by name?
If pAC-BETA is just pAC-LYC plus one more gene, what does that tell you about how the pathway is built — does the extra gene create the pigment from scratch, or convert one pigment into another?
Q2. Why do the plasmids that are transferred into the E. coli need to contain an antibiotic resistance gene?
Answer: To ensure only plasmids selected with antibiotic resistance to chloramphenicol are transformed.
Tip
Next steps to consider:
Antibiotics don’t act on plasmids — they act on living things. So what is actually being killed when you add chloramphenicol to a culture, and what survives?
The protocol says the LB is already supplemented with antibiotic — not just used once at transformation. Why would you need antibiotic pressure to continue throughout a 24-hour growth, not just at the start?
If a cell loses its plasmid mid-culture, what happens to it? And if many cells lost their plasmids, what would that do to your pigment yield?
Q3. What outcomes might we expect to see when we vary the media, presence of fructose, and temperature conditions of the overnight cultures?
Answer: We may expect to see transcriptional changes in expression of proteins in differing intensity of color of pAC-LYC and pAC-BETA. Higher temperatures faster growth but less time to accumulate pigment. Richer media, also more growth but less pigment. More fructose, better for glucose receptors. I predict pAC-LYC will respond the strongest since it is more compatible with glucose, under wider range of temperature and base media.
Tip
Next steps to consider:
The fructose papers describe a specific phenomenon involving glucose and gene expression — it’s not about “glucose receptors” but about a regulatory system that suppresses certain genes when glucose is around. What’s that system called, and how does fructose let you escape it?
Your prediction compares pAC-LYC to pAC-BETA — but the two plasmids share the same backbone, the same promoter, and three of four genes. What would be a more meaningful comparison: which plasmid wins, or which conditions optimize each one?
The lab divides peak absorbance by OD600 to get pigment-per-cell. Why would the highest-growth condition not necessarily give the most pigment per cell? What does that tell you about how to phrase your prediction?
Q4. Generally describe what “OD600” measures and how it can be interpreted in this experiment.
Answer: OD600 is an optical density measurement of cells suspended in media. This is a calibration step to ensure a baseline. In this experiment, the variables of temperature, fructose, and base media will result in varying levels of cell density. The optical density normalization reading is a good way to determine what has occurred in the experiment.
Tip
Next steps to consider:
Why is 600 nm specifically chosen as the wavelength? Lycopene absorbs at 474 nm and beta-carotene at 456 nm — what does that suggest about why 600 nm is the right wavelength for measuring cells?
You used the words “calibration” and “normalization” in the same sentence. They’re actually two different operations in this lab — one happens before the measurement, one happens after. Can you separate them?
If two cultures have the same OD600 but very different pigment readings at 474 nm, what does that tell you? And what’s the math you’d do to express that difference as a single meaningful number?
Q5. What are other experimental setups where we may be able to use acetone to separate cellular matter from a compound we intend to measure?
Answer: The use of acetone assists in separating cells by dissolving proteins from cells, leaving target cells of interest for further evaluation, such as optical density or pigmentation.
Tip
Next steps to consider:
Re-read the protocol’s description of what acetone does to the pellet. Does it dissolve the proteins, or precipitate them? And which part — the pellet or the supernatant — do you actually keep for measurement?
After acetone treatment, are the cells still alive or intact? If not, what does that mean about the phrase “target cells of interest”?
The question specifically asks for other experimental setups. Think about what carotenoids have in common with chlorophyll, or with other plant pigments — could the same acetone trick work on a leaf? On algae? What’s the underlying property that makes acetone the right solvent?
Q6. Why might we want to engineer E. coli to produce lycopene and beta-carotene pigments when Erwinia herbicola naturally produces them?
Answer: We want to optimize the metabolic load, which allows us to predict and measure pigmentation for large scale bioproduction, using well-characterized systems. We can control the growth rate over wildtype Erwinia herbicola. We can also fine-tune the formulation of E. coli for better regulatory control.
Tip
Next steps to consider:
You mention “well-characterized systems” — what specifically makes E. coli well-characterized in a way Erwinia herbicola isn’t? Think about the tools you’ve encountered in HTGAA so far (plasmids, promoters, selection markers, codon optimization, CRISPR) — how many of those work out of the box in Erwinia?
“Regulatory control” can mean two very different things in synthetic biology — gene-level regulation (promoters, induction) and industrial/biosafety regulation (BSL classification, FDA approval). Which one were you describing? Could both apply here?
This week’s lab transforms E. coli with carotenoid genes — but tomorrow you might want to make zeaxanthin instead, or boost upstream pathway flux. Would it be easier to do that engineering in E. coli or in Erwinia herbicola? Why?
Tie it back to your own framing: you’ve described BioLight’s Aim 3 as a Kodak/Eastman analogy for democratizing bioproduction. Does Erwinia herbicola fit that vision, or does the chassis question become an infrastructure question?
Round 1 Diagnostic Summary
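| Q | Topic | Init Conf | Init Acc | Refs | Final Conf | Final Acc | Δ Acc | LGI % | Calib Gap |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Genes for lycopene/beta-carotene | 8 | 2 | 1 | 5 | 4 | +2 | 25.0 | +1 |
| 2 | Antibiotic resistance gene purpose | 7 | 7 | 1 | 8 | 7 | 0 | 0 | +1 |
| 3 | Media/fructose/temp outcomes | 6 | 3 | 1 | 8 | 6 | +3 | 42.9 | +2 |
| 4 | OD600 meaning & interpretation | 8 | 5 | 1 | 7 | 6 | +1 | 20.0 | +1 |
| 5 | Other acetone-separation setups | 5 | 3 | 1 | 6 | 3 | 0 | 0 | +3 |
| 6 | Why E. coli over Erwinia herbicola | 5 | 3 | 1 | 9 | 6 | +3 | 42.9 | +3 |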
Session averages (n = 6):
Initial accuracy: 3.8 / 10
Final accuracy: 5.3 / 10
Δ accuracy: +1.5
Mean Learning Gain Index (LGI): 21.8%
Mean final calibration gap: +1.8 (confidence over accuracy)
Scoring methodology:
Accuracy (1–10): Coach-assigned assessment of factual correctness against the lab protocol and supporting papers.
Confidence (1–10): Self-reported certainty, asked at both initial answer and after revision. Never coach-assigned.
Refinements: Counter only — not used in scoring.
Learning Gain Index (LGI): (Final − Initial) / (10 − Initial) × 100% — measures improvement relative to available headroom.
Calibration gap: Final Confidence − Final Accuracy. Positive values indicate over-confidence.
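The scoring arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch, not the coach's actual tooling: the tuple layout, the function names, and the choice to report 0% gain when the initial score is already 10 are all assumptions made for illustration. The scores themselves are copied from the Round 1 table.

```python
from statistics import mean

# (question, init_conf, init_acc, final_conf, final_acc), copied from the Round 1 table
ROUND1 = [
    ("Q1", 8, 2, 5, 4),
    ("Q2", 7, 7, 8, 7),
    ("Q3", 6, 3, 8, 6),
    ("Q4", 8, 5, 7, 6),
    ("Q5", 5, 3, 6, 3),
    ("Q6", 5, 3, 9, 6),
]


def learning_gain_index(initial, final, ceiling=10):
    """(Final - Initial) / (ceiling - Initial) x 100."""
    headroom = ceiling - initial
    if headroom == 0:
        return 0.0  # assumption: no headroom is reported as 0% gain
    return (final - initial) / headroom * 100


def calibration_gap(final_conf, final_acc):
    """Positive values indicate over-confidence."""
    return final_conf - final_acc


def summarize(rows):
    """Session averages over all questions."""
    return {
        "initial_accuracy": mean(r[2] for r in rows),
        "final_accuracy": mean(r[4] for r in rows),
        "mean_lgi": mean(learning_gain_index(r[2], r[4]) for r in rows),
        "mean_calibration_gap": mean(calibration_gap(r[3], r[4]) for r in rows),
    }


summary = summarize(ROUND1)
for key, value in summary.items():
    print(f"{key}: {value:.1f}")
```

Running the same function on a Round 2 score list would give directly comparable session averages.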
Diagnostic plot — Confidence vs Accuracy by question:
Bars: Initial Confidence (blue) and Initial Accuracy (orange). Lines: Final Confidence (dark blue, circles) and Final Accuracy (dark orange, squares).
Pattern read: Final confidence (upper trace) sits above final accuracy (lower trace) on all 6 questions. The calibration gap is widest on Q5 and Q6 (+3 each). On Q5, confidence rose from 5 to 6 while accuracy stayed at 3; on Q6, confidence rose from 5 to 9 while accuracy rose only from 3 to 6. In both cases confidence grew faster than correctness. Q2 is the only question where initial confidence and accuracy aligned (both 7), and its accuracy held at 7 through revision.
Flagged for Round 2 revision:
Q1 — Gene names (CrtE, CrtI, CrtB for lycopene; +CrtY for beta-carotene) and source organism (Erwinia herbicola) not yet stated. Easiest fix in the set.
Q5 — Acetone mechanism inverted (acetone precipitates proteins and dissolves lipophilic pigments, not the other way around); other experimental setups not yet provided.
Q6 — Concepts present but under-substantiated; tooling advantage and modularity not yet stated.
Part 2 — Reflection: From Q&A to Gamification
Framing through the Kirkpatrick Model
The Kirkpatrick Four-Level Training Evaluation Model (Kirkpatrick & Kirkpatrick, 2016) provides a useful framing for thinking about how this lab’s pre-work, Q&A, and hands-on activity fit together:
Level 1 — Reaction: the learner’s initial engagement with material — interest, perceived relevance, willingness to continue. Reading the lab protocol and pre-work falls here.
Level 2 — Learning: the actual acquisition of knowledge, skills, attitudes, and confidence. Answering post-lab questions and reasoning about the experimental design lives here.
Level 3 — Behavior: application of learning when the learner is back in their own work environment. For HTGAA, this is the wet lab session itself, and later projects that draw on the same techniques.
Level 4 — Results: the downstream outcomes attributable to the training — research output, contribution to a discipline, capability built into a community.
Reading the prework and studying the experimental data are well-aligned with Level 1 and the early portion of Level 2. They are necessary, but on their own they don’t fully exercise the reasoning under variability that Level 2 ultimately demands — the ability to predict what happens when multiple conditions change at once, observe the result, and update.
In synthetic biology, the traditional path to deeper Level 2 (and on toward Level 3) is hands-on lab activity. There is no substitute for the formal in-person procedures of pipetting, plating, incubating, and measuring — the embodied feedback loop of doing the experiment is what converts conceptual knowledge into operational fluency.
Citation: Kirkpatrick, J. D., & Kirkpatrick, W. K. (2016). Kirkpatrick’s Four Levels of Training Evaluation. ATD Press.
The Hypothesis
Hypothesis: A lightweight virtual lab simulator — with sliders for the same variables this lab manipulates (plasmid choice, base media, fructose, temperature, incubation time) and near-real-time visual feedback — can extend Kirkpatrick Level 2 learning before the wet lab session, by allowing me to make and test predictions repeatedly with inherent variability.
The simulator will not replace the wet lab. It will scaffold it — letting me arrive at the bench with stronger intuition about which conditions optimize which outcomes, and why.
Why Gamification
Three design principles drive the simulator:
Inherent variability. Each “run” injects ±8% Gaussian noise into the result, mirroring real biological variation. Repeated runs of identical conditions don’t give identical answers — which forces me to think statistically rather than deterministically.
Prediction-first feedback loop. Before each run, I commit to a prediction (which condition wins on pigment-per-cell). The simulator scores my prediction against the simulated outcome. Score, streak, and best-result trackers turn the experiment matrix into a game with measurable improvement over time.
Visual saturation feedback. Two cuvettes render side-by-side, with color saturation scaling to pigment-per-cell. The visual channel adds a memory anchor that pure-numeric feedback doesn’t.
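The first principle, thinking statistically about noisy runs, can be illustrated with a short sketch. It assumes the noise is multiplicative with a standard deviation of 8% of the true value; the function name `replicate_runs` and the seed are hypothetical and not taken from the simulator.

```python
import random
from statistics import mean, stdev


def replicate_runs(true_value, n, noise=0.08, seed=None):
    """Return n noisy readings of one true value (multiplicative Gaussian noise)."""
    rng = random.Random(seed)
    return [true_value * rng.gauss(1.0, noise) for _ in range(n)]


runs = replicate_runs(true_value=1.0, n=200, seed=42)
avg = mean(runs)
spread = stdev(runs)
sem = spread / len(runs) ** 0.5  # standard error of the mean

print(f"mean={avg:.3f}  sd={spread:.3f}  sem={sem:.4f}")
```

A single run can land several percent away from the true value, while the mean of many replicates is much tighter. That is why one winning run is weak evidence and a repeated pattern is stronger evidence.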
The Underlying Model
The simulator uses a transparent kinetic model — not real biology, but biologically plausible:
Growth rate scales with media richness (2YT > LB) and peaks at 37°C, falling off at extremes
Pigment-per-cell follows the metabolic-competition tradeoff: high growth → lower per-cell yield
Fructose boosts pigment expression by ~25% (escape from catabolite repression)
Temperature above 32°C penalizes recombinant enzyme function
pAC-LYC carries less metabolic burden than pAC-BETA (3 enzymes vs 4)
Theoretical optimum: 2YT + fructose at 30°C — but noise can shift any single run
Six controls, two cuvettes, three readouts (OD600, peak absorbance, pigment-per-cell), live game state with score / streak / best-run tracking, and a run log. All in-memory — no persistence yet, by design, so each session is a fresh experiment.
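A model of this kind could be sketched as follows. This is not the simulator's source code. Only the +25% fructose boost, the 8% Gaussian noise, the 37°C growth peak, and the 32°C enzyme threshold come from the description above; every other coefficient (media richness values, burden factors, curve shapes, time constants) is an assumption chosen so the qualitative rules hold.

```python
import math
import random

# Assumed illustrative coefficients; not taken from the simulator.
MEDIA_RICHNESS = {"LB": 1.0, "2YT": 1.3}            # 2YT > LB
PLASMID_BURDEN = {"pAC-LYC": 1.0, "pAC-BETA": 0.9}  # 4 enzymes cost more than 3


def simulate(plasmid, media, fructose, temp_c, hours, noise=0.08, rng=None):
    """Return OD600, peak absorbance, and pigment-per-cell for one simulated run."""
    rng = rng or random.Random()

    # Growth scales with media richness and peaks at 37 C (assumed parabola).
    temp_growth = max(0.0, 1.0 - ((temp_c - 37.0) / 20.0) ** 2)
    growth = MEDIA_RICHNESS[media] * temp_growth
    true_od = 2.0 * growth * (1.0 - math.exp(-hours / 8.0))

    # Per-cell pigment: faster growth lowers per-cell yield (assumed form).
    per_cell = PLASMID_BURDEN[plasmid] / (1.0 + 0.5 * growth)
    if fructose:
        per_cell *= 1.25  # ~25% boost stated in the model description
    if temp_c > 32.0:
        per_cell *= max(0.0, 1.0 - 0.05 * (temp_c - 32.0))  # assumed penalty slope
    per_cell *= 1.0 - math.exp(-hours / 16.0)  # pigment accumulates more slowly

    def jitter(value):
        # Multiplicative Gaussian noise; skipped when noise is zero.
        return value * rng.gauss(1.0, noise) if noise > 0 else value

    od600 = jitter(true_od)
    peak_abs = jitter(per_cell * true_od)
    pigment_per_cell = peak_abs / od600 if od600 > 0 else 0.0
    return {"od600": od600, "peak_abs": peak_abs, "pigment_per_cell": pigment_per_cell}


if __name__ == "__main__":
    result = simulate("pAC-LYC", "LB", True, 30.0, 24.0, rng=random.Random(7))
    print({name: round(value, 3) for name, value in result.items()})
```

Setting `noise=0` makes the model deterministic, which is useful for checking that each rule points in the intended direction before adding variability back in.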
Tooltip Reference
Every interactive control and result readout in the simulator has a hover tooltip with context. The full reference is below for offline study.
Left panel — Controls
| # | Element | Tooltip content |
| --- | --- | --- |
| 1 | Plasmid label | Which gene cassette is in the E. coli. pAC-LYC carries CrtE, CrtI, CrtB (3 enzymes → lycopene). pAC-BETA adds CrtY (4 enzymes → beta-carotene). Source: Erwinia herbicola. |
| 2 | pAC-LYC button | 3-enzyme pathway. Less metabolic burden. Produces red lycopene (peak absorbance 474 nm). |
| 4 | | Growth medium. Richer media supports more biomass but can dilute pigment per cell. Both options are supplemented with chloramphenicol to maintain plasmid selection. |
| 5 | LB button | Luria Broth: standard E. coli medium. Lower nutrient density than 2YT. May give lower OD600 but comparable per-cell pigment yield. |
| 6 | 2YT button | 2× Yeast Tryptone: richer than LB. Drives higher cell density (OD600). Watch the per-cell pigment math: more cells does not mean more efficient. |
| 7 | Fructose supplement | Adds fructose as the primary carbon source. Fructose escapes the catabolite repression that glucose triggers via the cAMP/CRP system, allowing stronger expression of plasmid-borne genes. Boosts pigment-per-cell ~25% in this model. |
| 8 | Temperature label | Incubation temperature. 37°C is optimal for E. coli growth (highest OD600). Lower temperatures (~30°C) slow growth but typically improve recombinant enzyme folding and per-cell pigment accumulation. Above ~32°C, recombinant enzymes lose efficiency. |
| 9 | Incubation time label | Hours of culture growth. The lab protocol specifies 24h. Shorter times underdevelop both biomass and pigment; longer times saturate the OD600 plateau. Pigment-per-cell continues to accumulate longer than cell density. |
| 10 | “This run wins” button | Predict that the plasmid you currently have selected will produce more pigment-per-cell at these conditions than the opposite plasmid would. |
| 11 | “Opposite plasmid wins” button | Predict that the OTHER plasmid (the one not currently selected) would outperform the current selection at these same conditions. |
| 12 | Run culture button | Simulates a 24h culture at the current settings and computes OD600, peak absorbance, and pigment-per-cell. ±8% Gaussian noise per run mimics biological variability. Scores your prediction. |
| 13 | Reset score button | Clears score, streak, best-run tracker, and run log. Conditions stay where you set them. |
Right panel — Output
| # | Element | Tooltip content |
| --- | --- | --- |
| 14 | Your run cuvette | Acetone-extracted pigment from the plasmid you selected. Color saturation = pigment-per-cell. Lycopene appears red-orange at peak 474 nm; beta-carotene appears yellow-orange at peak 456 nm. |
| 15 | Opposite plasmid cuvette | Same conditions, but with the OPPOSITE plasmid. Shown for comparison so you can see whether your prediction was correct. |
| 16 | OD600 readout | Optical density at 600 nm. Cells scatter 600 nm light in proportion to their concentration, so this is a cell-count proxy. 600 nm is chosen specifically because it sits OUTSIDE the pigment absorption peaks (474 nm lycopene, 456 nm beta-carotene), so the reading reflects density only, not pigment. |
| 17 | A474 / A456 readout | Peak absorbance at the pigment’s characteristic wavelength: 474 nm for lycopene, 456 nm for beta-carotene. Measured on the acetone-extracted supernatant. Reflects total pigment in the cuvette: both number of cells AND pigment per cell. |
| 18 | Pigment/cell readout | Peak absorbance divided by OD600. The most meaningful metric: pigment produced PER cell. A high-OD600 culture can have low pigment/cell if the cells are growing fast but expressing the pathway poorly. This is what the lab actually optimizes. |
| 19 | Score row | Correct predictions / total runs. Increments by 1 each time your prediction (this run wins / opposite plasmid wins) matches the simulated outcome. |
| 20 | Best pigment/cell row | Highest pigment-per-cell value seen so far in this session. Lets you track whether you are converging toward optimal conditions across runs. |
| 21 | Streak row | Consecutive correct predictions. Resets to 0 when you guess wrong. A high streak suggests you have built reliable intuition for this region of the parameter space. |
| 22 | Export Report button | Download a self-contained HTML report with a data table of all runs and a chart of pigment-per-cell over time. Opens in any browser, no internet required. Disabled until at least one run has been logged. |
| 23 | Export CSV button | Download all run data as CSV for spreadsheet or Python analysis. Includes plasmid, conditions, OD600, A474/A456, pigment-per-cell for both your run and the opposite plasmid, and prediction outcome. Disabled until at least one run has been logged. |
| 24 | Sort toggle | Switch between newest-first chronological order and ranked-by-pigment order. Ranked mode adds gold/silver/bronze badges to the top three runs. |
| 25 | Run log | Click any entry to restore the spectrophotometer view to that run’s conditions and results. Green = correct prediction, red = wrong. |
Gameplay Loop
Set conditions with the sliders/segments
Predict whether the current plasmid or its opposite wins on pigment-per-cell
Click Run culture — watch the cuvettes saturate
Score increments if the prediction matched the simulated winner
Adjust one variable at a time to build intuition; chase a streak
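The loop above needs only a small piece of state: score, streak, best result, and a run log. The sketch below is one possible in-memory implementation, not the simulator's actual code. The class name `GameState`, its fields, and the rule that a tie counts as a win for the opposite plasmid are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class GameState:
    correct: int = 0
    total: int = 0
    streak: int = 0
    best_per_cell: float = 0.0
    log: list = field(default_factory=list)

    def record(self, predicted_current_wins, current_ppc, opposite_ppc):
        """Score one run; returns True when the prediction matched the outcome."""
        current_won = current_ppc > opposite_ppc  # assumption: ties go to the opposite plasmid
        hit = predicted_current_wins == current_won
        self.total += 1
        if hit:
            self.correct += 1
            self.streak += 1
        else:
            self.streak = 0  # a wrong guess resets the streak
        self.best_per_cell = max(self.best_per_cell, current_ppc)
        self.log.append((current_ppc, opposite_ppc, hit))
        return hit

    def reset(self):
        """Clear score, streak, best-run tracker, and run log."""
        self.correct = self.total = self.streak = 0
        self.best_per_cell = 0.0
        self.log.clear()


state = GameState()
state.record(True, 1.71, 1.47)   # predicted current plasmid wins: correct
state.record(False, 1.20, 1.00)  # predicted opposite wins: wrong
state.record(False, 0.90, 1.10)  # predicted opposite wins: correct
print(f"score {state.correct}/{state.total}, streak {state.streak}, best {state.best_per_cell}")
```

Because the state lives in a plain object with no persistence, each session starts fresh, matching the in-memory design described earlier.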
Sample Session Snapshot
This is a snapshot of the simulator after an 11-run session. Several things worth noting in this view:
Score: 10/11 (91% prediction accuracy). The blue restore banner indicates I’m currently viewing Run 7, not running fresh — clicking any past run in the log restores the spectrophotometer to that snapshot.
Best pigment/cell: 1.71 at pAC-LYC · LB + Fructose · 27°C · 35h. The cuvette colors and fill heights confirm this run’s pigment-per-cell beat the opposite plasmid (1.47).
The session insights panel on the left captures four patterns that emerged across the run history — including one that contradicted my Round 1 Q3 prediction. I had predicted 2YT + fructose at 30°C as the winner, but in actual play, LB + fructose at 27°C with 35h incubation outperformed every 2YT condition I tried. The richer media drove higher OD600 but diluted pigment-per-cell — exactly the metabolic-competition tradeoff the lab is designed to expose.
The run log shows three of the most recent entries with green/red coding for correct/wrong predictions. The full 11-run history can be exported as a self-contained HTML report or as CSV for spreadsheet analysis.
This is the kind of insight the simulator generates that pure prework reading does not. The hypothesis going into Round 2 is that this hands-on optimization experience will measurably improve my answer accuracy and confidence calibration on the same six questions.
What Round 2 Will Test
After spending time with the simulator, I’ll re-run the same six post-lab questions in Section 3 below under the identical rubric (initial confidence → coach accuracy → next-step prompts → revised confidence → final accuracy → calibration gap). The expected signals:
Higher initial accuracy on Q3 (variable interactions are now hands-on)
Better calibration on Q5–Q6 (I’ll have made enough wrong predictions to know what I don’t know)
Improved Learning Gain Index (LGI) vs. Round 1’s 21.8%
If the simulator doesn’t improve those metrics, the hypothesis fails — and the Reflection itself becomes a Level 1 artifact rather than a bridge to Level 2.
Part 3 — Round 2: Post-Gamification Q&A
Info
Pending. This section will be populated after a focused session with the Lab Simulator v1. The same six mandatory post-lab questions will be re-answered using the rubric process, and a comparison table will plot Round 1 vs. Round 2 confidence and accuracy side-by-side to test the gamification hypothesis.
Week 13 Lab: AI, SynBio, and Scaling Health Innovation (ARPA-H)
Thank you for visiting the Final Project landing page for BioLight X5 & Photoplasm. Click on the menu items to the left to view each section of the Final Project documentation.
Here are the presentation slides, for a high-level overview:
Click here to view the Abstract (and all Chapters on the menu)
HTGAA Group Project: MS2 Bacteriophage L Protein Engineering
Date: March 31, 2026
Authored & Reviewed by:
2026a-john-adeyemo-adedeji, 2026a-eric-schneider, 2026a-albert-manrique, 2026a-Tehseen Rubbab, 2026a-brie-taylor
Introduction
This document represents the full scope of our Group Project activity within our Genspace Node.
“Group 2” was formed for the purpose of addressing Bacteriophage Final Project Goals for engineering the L Protein.
Subsections of Projects
Individual Final Project
Subsections of Individual Final Project
Section One - Abstract
HTGAA 2026 Final Project Documentation
Eric Schneider · BioArt Studio, Makerspace Charlotte · Genspace NYC node
Section 1 — Abstract
Provide a concise, self-contained summary of your project (minimum 150 words). The abstract should allow a reader to understand the purpose, approach, and expected outcomes without referring to other sections.
Your abstract should briefly address the following elements:
Significance: What problem or question does the project address, and why is it important?
Broad Objective: What is the overall goal of the project?
Hypothesis: What prediction or principle is the project testing or demonstrating?
Specific Aims: What key steps or milestones will be completed to achieve the objective?
Methods: What experimental or technical approaches will be used?
1 — Significance
What problem or question does the project address, and why is it important?
The history of imaging offers a precise precedent for what synthetic biology must now accomplish. When Ferdinand Hurter and Vero Charles Driffield published their foundational sensitometry work in 1890, they did not merely characterize the photographic emulsion — they transformed an artisanal practice into a reproducible, industrially scalable system by rigorously quantifying the relationship between light exposure and material response. Their H&D curve made photography accessible at mass scale by encoding complexity into a predictable, designed workflow. BioLight proposes an analogous translation: applying the logic of exposure science to living cells, using controlled light as the variable input and protein expression as the measurable output — not in isolation from the research community, but in direct collaboration with it, extending and accelerating the outreach of institutional synthetic biology into the hands of designers, makers, and educators who are ready to engage.
Photograph of Hurter & Driffield “Actinograph,” a photographic exposure calculator using a logarithmic curve to predict light levels.
2 — Broad Objective, Hypothesis, and Aim 1
What is the overall goal of the project? What prediction or principle is the project testing or demonstrating?
The primary objective of BioLight is to engineer and validate a light-activated gene expression system in E. coli, and to develop Photoplasm, a purpose-designed labware device that delivers high-resolution, spatially controlled analog light exposure directly onto living cell cultures. The project tests the hypothesis that a research-grade optogenetic system can be reframed as an imaging instrument with measurable sensitometric properties: that bacterial cultures, like photographic emulsion, can be characterized by a dose-response curve relating light exposure to expressed signal, and that this characterization makes spatially patterned biological imaging reproducible at community scale. Aim 1 establishes the biological and hardware foundations through two parallel tracks. The experimental track employs BioLightV5, a derivative of the eLightOn optogenetic system¹ in which the RsLOV photoreceptor is fused to a LexA408 DNA-binding domain to drive sfGFP expression from the pColE408 promoter under 470 nm illumination. BioLightV5 is designed in Benchling and submitted for synthesis via Twist Bioscience. The control track uses pDawn-sfGFP (Addgene #107741), a well-characterized blue-light-inducible system, as a validated comparator for expression behavior under identical illumination conditions.
Illustration generated via FigureLabs: BioLight V5, a blue-light-activated sensor based on eLightOn (Li et al., 2020)
3 — Methods / Photoplasm Device
What experimental or technical approaches will be used?
Photoplasm, described as “a darkroom enlarger reinvented as a programmable bio-imaging instrument,” comprises a Raspberry Pi 5 single-board computer, a 470 nm LED light ring, a light collimator, an OLED digital image mask that projects selected images to create a variable-density map (like a film negative or positive print), a focusing lens, a dark chamber cone, a removable wavelength sensor, a bacterial plate holder, and a plate heater for incubation.
Photoplasm: a traditional darkroom enlarger modified for spatial image mapping onto light-reactive biosensors.
The device delivers spatially programmable 470 nm light exposures through a digital image mask projected onto live bacterial slabs (mixed and poured lawns in agarose), with calibrated step-wedge protocols generating a bacterial H&D curve that quantifies the dose-response relationship between light exposure and sfGFP expression intensity.
Photoplasm 470 nm light projection test with step-wedge calibration image target
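In classical sensitometry, the H&D curve plots response against the logarithm of exposure, and the slope of its straight-line portion is called gamma. A bacterial analogue could be fitted the same way. The sketch below is an assumption about how that analysis might look, not the project's protocol: the function name `fit_hd_curve`, the doubling step wedge, and the response values are all hypothetical, and the responses are synthetic numbers generated from a known line so the fit can be checked.

```python
import math


def fit_hd_curve(exposures, responses):
    """Least-squares line of response vs log10(exposure); returns (gamma, intercept)."""
    xs = [math.log10(h) for h in exposures]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(responses) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, responses))
    gamma = sxy / sxx
    return gamma, y_bar - gamma * x_bar


# Hypothetical step wedge: each step doubles the exposure (arbitrary units).
exposures = [1, 2, 4, 8, 16, 32]
# Synthetic responses from a known line (slope 0.60, intercept 0.10); not measured data.
responses = [0.10 + 0.60 * math.log10(h) for h in exposures]

gamma, intercept = fit_hd_curve(exposures, responses)
print(f"gamma={gamma:.2f}  intercept={intercept:.2f}")
```

Real curves flatten at low exposure (the toe) and at saturation (the shoulder), so the fit should be restricted to the steps that fall on the straight-line region.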
The biological design pipeline was built and simulated in Asimov Kernel (circuit-level logic), Benchling (sequence assembly using sfGFP as the reporter), AlphaFold (structural prediction of the RsLOV–LexA408 fusion fold), and ChimeraX (visualization of the dark-state PDB 4HJ4 dimer and light-state monomer hypothesis), with gene synthesis by Twist Bioscience. The construct uses a pUC19 backbone for high-copy sfGFP signal and AmpR selection on LB+Amp, and incorporates the SD17 RBS to keep LexRO matched to FMN supply.
4 — Specific Aim 2 / Community Lab Project / Cell-Free Migration / Biomanufacturing
What key steps or milestones will be completed to achieve the broad objective? (Aim 2 development path)
Aim 2 begins with the receipt of the BioLightV5 clonal gene from Twist Bioscience, for transformation into a living cell system at Genspace, my designated node. We will verify the construct through a well-defined protocol that includes a minimum viable functionality test with blue light to observe sfGFP fluorescence, and calibrate the device. We plan to relaunch the Genspace Optogenetics Community Lab Project, introducing and testing a host of light-responsive cellular systems through the Photoplasm labware device.
My parallel Aim 2 track is to migrate BioLightV5 from a live-culture wet-lab system into a cell-free protein synthesis (CFPS) variant, executed via Ginkgo Bioworks’ cloud-lab CFPS service. The migration to cell-free reactions removes the containment requirements and cold-chain logistics that govern live-organism distribution, transforming BioLight outputs into stable, shippable consumables. The same Photoplasm device that drives Aim 1 slab exposures also drives the cell-free reactions in Aim 2 — same hardware, two biological substrates. This architecture explicitly invokes the Eastman/Kodak distribution model: George Eastman’s breakthrough was not photochemistry but the system — standardized cartridges, global distribution, and a participant experience so simple the tagline became “you press the button, we do the rest.” BioLightV5 in CFPS form, manufactured by Ginkgo, paired with the open-source Photoplasm device, completes the analogous translation for synthetic biology: complexity lives in the consumable, while the participant loads, exposes, and observes.
Illustration of Ginkgo Bioworks producing light-sensing cell-free protein systems for use in Photoplasm labware
5 — Specific Aim 3 / Long-Term Vision / Makerspace Distribution
What key steps or milestones will be completed to achieve the broad objective? (Aim 3 visionary path)
The long-term vision of BioLight is wide distribution through the community makerspace network — motivated by a conviction that biological art and design offer one of the most effective entry points into an industry standing on the threshold of a transformation whose scope may equal or exceed the industrial and digital revolutions combined. Aim 3 is realized through a newly formed collaboration between the MakerSpace Charlotte BioArt Studio and the Genspace community wetlab, establishing a multi-node network through which protocols, plasmids, hardware files, and educational frameworks flow openly between an institutional community lab and a community makerspace. This collaboration is itself the prototype of the distribution model — if it works between two nodes, it scales to twenty, then two hundred or more. The result is a data-driven framework for democratized biotechnology that mirrors how Eastman/Kodak democratized photography: not by simplifying the science, but by engineering the system around it so that anyone curious enough to join can do so without first becoming a specialist.
Photoplasm Neural Network - Connected nodes with shared protocol data
Section Two - Aims
Section 2 — Project Aims
Define three aims for your final project (minimum one sentence per aim).
Aim 1 — Experimental Aim · Design & Build
Aim 1: Experimental Aim (this project):
“The first aim of my final project is to [achievable experimental goal] by utilizing [protocols, tools, or strategies].”
i. This aim should describe the core experimental objective you will attempt during this class. List or link any relevant methods or resources you plan to use (e.g., experimental protocols, automation workflows, DNA or protein designs, protein design tools, or Twist orders).
ii. You will provide a detailed step-by-step experimental plan for Aim 1 in the Experimental Design section of this assignment.
The first aim of my final project is to design and build the foundational platform for community-deployable optogenetic synthetic biology — comprising both the biological substrate and the labware device that drives it — by completing BioLightV5, a blue-light-derepressed bacterial expression circuit derived from the eLightOn optogenetic system (Li et al. 2020), and Photoplasm, a programmable bio-imaging instrument purpose-built to deliver spatially controlled 470 nm exposures onto live bacterial cultures.
Methods, tools, and strategies: BioLightV5 design pipeline
Asimov Kernel (circuit-level logic with SBOL parts)
Asimov Kernel (simulation)
Benchling (sequence assembly using sfGFP)
AlphaFold (structural prediction of the RsLOV+LexA408 fusion fold)
ChimeraX (visualization of the molecular dark-state structure of the RsLOV homodimer with FMN (chromophore) binding), the experimental foundation for visualizing how blue light disrupts the dimer.
Dark State - LexRO Fusion - Dimerized (LexA408-linker-RsLOV)
Light Activation - RsLOV monomerizes; LexRO releases the LexA408 operator in the pColE408 promoter, derepressing sfGFP expression
sfGFP β-barrel begins to fold and fluoresce. In the absence of blue light, dark recovery begins as LexRO re-dimerizes.
Construct architecture:
pUC19 backbone for high-copy sfGFP signal;
AmpR selection on LB+Amp;
SD17 RBS to keep LexRO matched to FMN supply.
Twist Bioscience (gene synthesis)
Twist order: BioLightV5 plasmid submitted for clonal gene synthesis.
Control track: pDawn-sfGFP (Addgene #107741) ordered in parallel as a blue-light-inducible comparator.
Jin X, Riedel-Kruse IH. Biofilm Lithography enables high-resolution cell patterning via optogenetic adhesin expression. Proc Natl Acad Sci U S A. 2018 Apr 3;115(14):3698-3703. doi: 10.1073/pnas.1720676115. PubMed 29555779.
Photoplasm …A modified vintage photographic darkroom enlarger:
Selected for the highly desirable light-modification feature, suitable for bacterial spatial imaging.
Condenser lens to direct light into parallel vertical rays
Focusing lens with adjustable aperture
Photoplasm hardware stack (image by NanoBanana 2):
Raspberry Pi 5 single-board computer
LED Light Ring (Blue 470 nm wavelength)
Acrylic Disks for maximum light diffusion, edge-lit with reflector (laser cut)
OLED digital image mask (for projection of digital images, like a film negative or positive print)
Dark Chamber Cone (3D printed, with spacer rings)
Wavelength Sensor (used for calibration)
Raspberry Pi Cam (for live image capture, with longpass 515 nm filter)
BioLightV5 in Agar Slab (Simulated fluorescent image)
Bacterial Plate Holder (3D printed PETG for heat resistance, epoxy sealed for sterilization)
Plate heater with Temperature Sensor (for setpoint control)
Step-by-step experimental plan: see “PhotoPlasm Quick Start Guide”
GitHub Repository
Aim 1 deliverables:
A Twist-synthesized BioLightV5 plasmid verified by sequence
A fully assembled Photoplasm prototype with calibrated optical stack
A documented genetic design package including SBOL-standard schematics, ChimeraX MOA figures, and the complete bill of materials
Reference figure:
BioLightV5 in Benchling — circular plasmid map showing the eLightOn-derived construct (RsLOV–LexA408 fusion + pColE408 operator + sfGFP) on pUC19 backbone with AmpR selection. Submitted via official HTGAA DNA design form under the Genspace node; reviewed and approved.
Figure 2.1 — BioLightV5 in Benchling
Annotation Table
Range
Annotation
Function
1–35
J23106 Promoter
Anderson family constitutive promoter (iGEM BBa_J23106) driving the LexRO fusion cassette. Validated choice from Li 2020 — its intermediate strength sets steady-state LexRO levels appropriate for the eLightOn dynamic range.
36–42
SD17
Shine-Dalgarno RBS variant from Li 2020 Supplementary Table S1/S3. SD17 is intermediate-strength (vs. weak SD2 / strong SD37), tuned to give the >500-fold ON/OFF ratio reported for eLightOn.
43–50
spacer_001
The AAA-containing spacer added in V5 to fix the SD17→ATG spacing identified as Issue 1. Brings SD core to start codon distance into the optimal 5–10 bp window.
51–656
LexA408_DBD_codonOpt
Part of LexRO Fusion (51–1193). LexA408 DNA-binding domain (mutant LexA recognizing the pColE408 operator, not wild-type LexA operator). N-terminal half of the LexRO fusion. Sequence verified to end in CTG with no internal stop.
657–662
KV Linker
Part of LexRO Fusion (51–1193). Lys-Val peptide linker between LexA408 C-terminus and RsLOV N-terminus. Maintains reading frame and preserves independent folding of the two domains in the LexRO fusion.
663–1193
RsLOV_Codon_Opt
Part of LexRO Fusion (51–1193). RsLOV (Rhodobacter sphaeroides LOV photoreceptor) codon-optimized for E. coli expression, including TGA stop. C-terminal half of the LexRO fusion. In darkness the LexRO dimer occupies pColE408 and represses sfGFP; 470 nm light triggers RsLOV conformational change, dissociating the dimer and derepressing the output cassette.
1191–1193
Stop Codon
TGA stop terminating the LexRO fusion ORF. Confirmed in NCBI ORF Finder as one of exactly two functional ORFs (LexRO 1,143 bp).
1194–1273
BBa_B0010 Terminator
iGEM BBa_B0010 — rrnB T1 transcription terminator from E. coli. Closes Cassette 1, prevents read-through into the intergenic region.
1274–1323
50bp Spacer
V5 replacement for the original 10 bp ACTTGTACGA neutral spacer (Issue 3 fix). 50 bp AT-rich synthetic sequence (ATATAT…) providing optimal intergenic separation between BBa_B0010 and pColE408; verified free of cryptic ATGs, stop codons, RBS-like motifs, and BsaI/BbsI sites.
1324–1475
pColE408 Promoter
Hybrid promoter from Li 2020 — strong ColE promoter combined with the LexA408 operator. Bound and repressed by the LexRO dimer in the dark; derepressed under 470 nm illumination. The light-responsive control point of the circuit.
1476–1501
BBa_B0034 RBS
iGEM BBa_B0034 — well-characterized medium-strength E. coli RBS driving sfGFP translation in Cassette 2.
1502–1509
SPACER_RBS-P
Short spacer restoring BBa_B0034 native ~7 bp spacing to the sfGFP ATG. Required after the V5 Issue 2 fix removed the EcoRI cutsite that previously sat between RBS and start codon.
1510–2226
sfGFP_Forward
Superfolder GFP (717 bp) — the output reporter producing the green fluorescent signal in lit regions. Confirmed as the second of two functional ORFs. EcoRI/XhoI flanking sites that previously bracketed it for modular swapping were removed in V5 to restore RBS spacing; future fluorescent protein swapping will need a different cloning strategy.
2227–2267
BBa_B0012 Terminator
iGEM BBa_B0012 — rrnB T2 transcription terminator. Closes Cassette 2 downstream of sfGFP. Distinct sequence from BBa_B0010, confirmed in V4 Check 4 to avoid direct-repeat flags at Twist.
2268–2317
end50bpsSpacer
Terminal 50 bp neutral AT-rich spacer between Cassette 2 and the pUC19 backbone junction. Mirrors the intergenic 50 bp spacer in design and rationale — provides clean handoff at the backbone boundary.
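The coordinates in the feature table above can be sanity-checked with a few lines of arithmetic. The sketch below is an illustration only and not part of the Benchling workflow: it transcribes the rows whose ranges appear above (1-based, inclusive), then checks that neighboring features abut and that the two ORFs are whole numbers of codons. The names `FEATURES`, `feature_length`, and `adjacency_problems` are hypothetical. The Stop Codon row (1191–1193) is left out of the adjacency check because it sits inside RsLOV_Codon_Opt.

```python
# Feature coordinates transcribed from the table above (1-based, inclusive).
# The nested Stop Codon row (1191-1193) is intentionally omitted here.
FEATURES = [
    ("SD17", 36, 42),
    ("spacer_001", 43, 50),
    ("LexA408_DBD_codonOpt", 51, 656),
    ("KV Linker", 657, 662),
    ("RsLOV_Codon_Opt", 663, 1193),
    ("BBa_B0010 Terminator", 1194, 1273),
    ("50bp Spacer", 1274, 1323),
    ("pColE408 Promoter", 1324, 1475),
    ("BBa_B0034 RBS", 1476, 1501),
    ("SPACER_RBS-P", 1502, 1509),
    ("sfGFP_Forward", 1510, 2226),
    ("BBa_B0012 Terminator", 2227, 2267),
    ("end50bpsSpacer", 2268, 2317),
]


def feature_length(start, end):
    """Length in bp of a 1-based inclusive interval."""
    return end - start + 1


def adjacency_problems(features):
    """Return pairs of neighboring features that leave a gap or overlap."""
    problems = []
    for (name_a, _, end_a), (name_b, start_b, _) in zip(features, features[1:]):
        if start_b != end_a + 1:
            problems.append((name_a, name_b))
    return problems


lexro_len = feature_length(51, 1193)    # LexRO fusion ORF, including TGA stop
sfgfp_len = feature_length(1510, 2226)  # sfGFP ORF

print("LexRO bp:", lexro_len, "whole codons:", lexro_len % 3 == 0)
print("sfGFP bp:", sfgfp_len, "whole codons:", sfgfp_len % 3 == 0)
print("adjacency problems:", adjacency_problems(FEATURES))
```

With these coordinates the LexRO ORF comes to 1,143 bp and sfGFP to 717 bp, matching the figures quoted in the table.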
Aim 2 — Development Aim · Test & Analyze
Aim 2: Development Aim:
Describe the next step that would follow a successful Aim 1, extending the work beyond the scope of this course. This aim should represent a realistic progression of the project, such as executing additional experiments, solving a technical limitation, or developing the system or technology further.
Following a successful Aim 1, the second aim is to test and analyze the integrated BioLightV5 + Photoplasm system through a structured 7-step laboratory protocol that takes the project from Twist gene-synthesis delivery to a fully calibrated, image-producing platform — generating in the process the first published bacterial H&D curve characterizing dose-response between 470 nm exposure and sfGFP expression in living E. coli. This is the realistic next-step progression: Aim 1 produces the construct and the device; Aim 2 turns them into a measurable, repeatable, and open-source accessible imaging system.
The 7-step protocol — from Twist order to lab protocol:
Verify — confirm plasmid transformation integrity via gel electrophoresis and visual colony count
Transform — introduce BioLightV5 into DH5α
Plate — grow a uniform bacterial slab on LB+Amp at 37 °C
Expose — project a 470 nm calibration step-wedge onto the lawn through Photoplasm’s optical system
Develop — render the step-wedge as an sfGFP intensity gradient
Calibrate — generate the bacterial H&D curve and spectrophotometry reading to normalize readings
Print — expose an original image mask to demonstrate spatial light reactivity
The verb sequence — Verify, Transform, Plate, Expose, Develop, Calibrate, Print — deliberately mirrors the photographic darkroom protocol, anchoring the biology in a vocabulary the participant may already understand. The H&D curve produced in Step 6 is the analytical centerpiece of the project: it transforms BioLight from a demonstration into a data-driven imaging platform with measurable values of a logarithmic curve defined by latitude, toe, linear region, and shoulder.
Hurter & Driffield (H&D) Exposure Curve (1890)
“Photochemical Investigations and a New Method of Determination of the Sensitiveness of Photographic Plates”
Aim 2 deliverables:
A validated 7-step protocol from Twist delivery to first spatial image expressed in sfGFP
See full Genspace wetlab protocol in Section Four: Experimental Design
A bacterial H&D curve characterizing the BioLightV5 dose-response relationship
Spectrophotometry readings establishing the exposure window for reproducible imaging
A printed bacteriograph demonstrating spatial light reactivity through Photoplasm
Capture imaging results through a 515 nm long-pass filter, which acts as an emission/viewing filter over a Raspberry Pi Camera Module.
The filter blocks the bright blue excitation light from the LEDs while allowing the green fluorescence from sfGFP to pass to the camera sensor.
Additional Validation:
Measure FMN absorbance peaks at 370 and 450 nm (verify chromophore presence)
Measure OD600 to determine bacterial cell density (standard for E. coli growth)
Verify plasmid purity (260/280 nm ratio)
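The absorbance checks above reduce to simple arithmetic. The following is a minimal sketch, assuming a 1 cm path length and the common rule of thumb that one A260 unit of double-stranded DNA corresponds to about 50 ng/µL. The acceptance window of 1.7 to 2.0 for the 260/280 ratio is an illustrative assumption (pure DNA is usually quoted at about 1.8), and the function names are hypothetical.

```python
def dsdna_ng_per_ul(a260, dilution_factor=1.0, path_cm=1.0):
    """Estimate dsDNA concentration from A260 (about 50 ng/uL per A260 unit at 1 cm)."""
    return a260 / path_cm * 50.0 * dilution_factor


def purity_ratio(a260, a280):
    """A260/A280 ratio; values near 1.8 are typical of clean DNA."""
    return a260 / a280


def looks_pure(ratio, low=1.7, high=2.0):
    """Assumed acceptance window; adjust to the lab's own criteria."""
    return low <= ratio <= high


# Placeholder readings, not measured data.
ratio = purity_ratio(0.50, 0.27)
print(round(dsdna_ng_per_ul(0.50), 1), "ng/uL")
print(round(ratio, 2), "pure" if looks_pure(ratio) else "check for contamination")
```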
Aim 3 — Visionary Aim · Learn & Refine
Aim 3: Visionary Aim:
Describe the long-term vision for the project. Explain how the broader concept could have an impact if fully realized.
Examples include:
Challenging an existing paradigm or clinical practice.
Addressing a major barrier in a field.
Enabling a new experimental capability or research approach.
The long-term vision is to position BioLight as the prototype for a distributed, open-source synthetic biology platform that makes optogenetics accessible to community scientists, designers, and educators — refined through the ongoing collaboration between the MakerSpace Charlotte BioArt Studio and the Genspace community wetlab. If fully realized, this concept reframes synthetic biology as a participatory technology, much as photography became a participatory medium in the late 19th century.
Broader impact if fully realized:
The Genspace ↔ MakerSpace Charlotte collaboration is itself the prototype of the distribution model. If protocols, plasmids, hardware files, and educational frameworks flow openly between two nodes, the same architecture can scale to many community labs.
As the platform expands, a CFPS variant of BioLightV5, manufactured via Ginkgo Bioworks’ cloud-lab service, becomes the natural high-availability consumable, removing the biocontainment and cold-chain barriers that limit live-organism distribution. This is the Eastman/Kodak step: standardized, mass-produced biological consumables paired with an open, well-documented device.
The broader impact is the creation of a participatory biological literacy at the moment when synthetic biology is becoming a general-purpose technology, equipping designers, educators, and citizen scientists to engage with the field while it is still being shaped, rather than after the fact.
BioLight challenges the existing paradigm that the boundary between professional researcher and citizen practitioner is fixed. It proposes that well-engineered tools, similar to Eastman’s standardized film and Kodak’s camera systems and aided by cloud-based neural networks, can close the gap from discovery to innovation, with emphasis on shared protocols.
Aim 3 deliverables:
A documented Genspace ↔ MakerSpace Charlotte collaboration framework — protocol exchange, hardware files, educational pathways
An open-source and attributed release of BioLightV5 as a cell-free protein synthesis (CFPS) system, Photoplasm hardware (CAD, BOM, firmware), and documentation under an MIT-style license
A roadmap for CFPS distribution via Ginkgo Bioworks as the high-availability expansion path
A measurement framework for tracking adoption across community nodes — the “Join the Resolution” tagline made operational
Section Three - Background & Literature
Author: Eric Schneider · 2026a-eric-schneider
Node: Genspace NYC
Affiliation: BioArt Studio, MakerSpace Charlotte
Q1 — Citation Summaries
Briefly summarize two peer-reviewed research citations relevant to your research (minimum four sentences).
I first experienced bacterial BioArt at MakerSpace Charlotte during a demonstration by Karen Ingram, scientific illustrator and co-author of BioBuilder,¹ where cells expressing colorful fluorescent proteins were being patterned in agar using hand-drawn designs and Opentrons microliter pipettes. As a photographer, I asked the fundamental question: what is the resolution? That question started the entire journey into HTGAA and the scientific literature that followed.
BioBuilder
I quickly found Levskaya et al. 2005² (Engineering Escherichia coli to see light), the paper that demonstrated a complete bacterial photography system in which E. coli was engineered with a chimeric photoreceptor (Cph8) to respond to red light, producing spatially patterned gene expression across a plate with a resolution of approximately 100 megapixels per square inch. The Levskaya paper answered my resolution question empirically: the biological limit of the system was not optical but cellular, set by the size of the bacterium itself. What it did not answer was the tonal question. The Levskaya system was binary (fully on in the light, fully off in the dark), producing sharp edges but no continuous grayscale gradation. For a photographer, that is the equivalent of a lithographic system, not a photographic one.
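To connect the quoted resolution to cell size, the arithmetic can be worked through directly. This is a back-of-envelope sketch, assuming square pixels on a regular grid; the function name `pixel_pitch_um` is hypothetical.

```python
import math

MICRONS_PER_INCH = 25_400.0


def pixel_pitch_um(pixels_per_square_inch):
    """Side length of one square pixel, in microns, for a given areal pixel density."""
    pixels_per_linear_inch = math.sqrt(pixels_per_square_inch)
    return MICRONS_PER_INCH / pixels_per_linear_inch


pitch = pixel_pitch_um(100e6)  # 100 megapixels per square inch
print(f"{pitch:.2f} um per pixel")
```

A density of 100 megapixels per square inch works out to a pitch of about 2.5 µm, which is on the order of the length of a single E. coli cell and consistent with the point that the limit is cellular rather than optical.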
The paper that changed the trajectory of the project was Li et al. 2020,³ A single-component light sensor system allows highly tunable and direct activation of gene expression in bacterial cells, which introduced the eLightOn system. eLightOn uses a fusion of the RsLOV photoreceptor from Rhodobacter sphaeroides with a LexA408 DNA-binding domain to create a single-component, single-plasmid optogenetic switch with a reported ON/OFF dynamic range exceeding 500-fold under blue light activation at approximately 470 nm. That dynamic range, the biological equivalent of a photographic characteristic curve with a measurable toe, linear region, and shoulder, is what makes continuous-tone bacteriographic imaging a plausible scientific goal rather than a theoretical aspiration. The eLightOn system uses FMN as its chromophore, which is produced endogenously by E. coli, requiring no external cofactor supplementation. It fits within the 5 kbp synthesis limit for a single Twist Bioscience clonal gene order. And it had not, at the time of this project’s inception, been applied to spatially patterned photographic image production, which is the gap BioLight and Photoplasm are designed to fill.
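For readers who think in photographic units, the reported 500-fold ON/OFF ratio can be restated on logarithmic scales. This is only an analogy: fluorescence output is not optical density, so the numbers below describe the span of the response, not a measured density range. The function names are hypothetical.

```python
import math


def fold_to_log10_units(fold_change):
    """Express a fold-change as a log10 span (the scale used for photographic density)."""
    return math.log10(fold_change)


def fold_to_stops(fold_change):
    """Express a fold-change in photographic stops (doublings)."""
    return math.log2(fold_change)


span_log10 = fold_to_log10_units(500)
span_stops = fold_to_stops(500)
print(f"{span_log10:.2f} log10 units, about {span_stops:.1f} stops")
```

A 500-fold range spans roughly 2.7 log10 units, or about nine stops, which is the scale on which a toe, linear region, and shoulder would be measured.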
Q2 — Novelty
Explain the novelty of your project (minimum three sentences). What makes it different from or an improvement upon existing work in the field?
The novelty of BioLight begins with a reframe: the darkroom enlarger is not a photography instrument — it is a precision optical projector capable of delivering spatially resolved, calibrated light at a defined wavelength to any photosensitive substrate placed at its focal plane. That substrate does not have to be silver halide paper. It can be a bacterial lawn embedded in an agarose slab, expressing a light-responsive genetic circuit that responds to blue photons the way a silver halide crystal responds to visible light. The traditional darkroom instrument is ideal for modification; the substrate is what changes to replace photographic paper.
My background is specific and relevant here. I have been a working photographer and photographic chemist for over forty years; I processed film for Time Inc. publications from 1987 to 1990 in the NYC Color Photo Lab, at mass-media publishing scale. During the analog-to-digital transformation of the photography industry, I learned to operate the Kodak Light Valve Technology (LVT) digital-to-film printer and high-resolution film-to-digital drum scanners. I even built a panoramic film camera out of Lego bricks as my industrial design master’s thesis project at North Carolina State University.
Lego-based Panoramic Camera by Eric Schneider
I understand sensitometry — the H&D characteristic curve, the Zone System, the relationship between exposure and density — not as abstract science but as craft knowledge applied in darkrooms and imaging labs. When I look at the eLightOn dynamic range specification, I see a film emulsion with a measured contrast index. When I designed the Photoplasm device, I imagined an enlarger with a programmable negative.⁴
Three specific novelties distinguish BioLight from the existing bacterial photography literature. First, modularity: the Photoplasm device is designed as a stackable, component-based instrument whose throw distance, aperture, and mask format can be reconfigured for different plate geometries and biosensor substrates — extending the fixed-geometry flood illumination approach of the Levskaya and Tabor experiments into a variable, calibrated optical platform. Second, openness: every component is released under an MIT-style open-source license with full version-controlled documentation, inviting the kind of iterative community improvement that made the Arduino ecosystem what it is. Third, substrate independence: the optical stack does not presuppose any particular biosensor circuit — it delivers 470 nm light through a digital image mask, and any optogenetically responsive chassis that activates under blue light can be placed at the focal plane. As the Photoplasm platform matures toward full RGB capability, that substrate independence will extend across wavelengths, opening the system to the full diversity of characterized optogenetic tools in the synthetic biology database.
Q3 — Impact
Explain the impact of your project (minimum five sentences). Why does it matter? Who does it benefit?
Astro Teller, Captain of Moonshots at X (formerly Google X), has observed that today is the slowest rate of change we will ever experience.⁵ The convergence of artificial intelligence, accessible fabrication tools, and open-source biological parts registries is creating conditions in which community makerspaces and university laboratories alike can become meaningful nodes in the synthetic biology ecosystem — each contributing distinct capabilities, and each made stronger by collaboration with the other. BioLight and Photoplasm are designed specifically for that moment.
The direct beneficiaries are what Gartner Research has called citizen bioscientists⁶ — people with domain expertise in adjacent fields (design, photography, engineering, education, medicine) who are entering the biological sciences through community labs, accelerator programs, and initiatives like HTGAA. These participants bring non-standard perspectives that complement and enrich the formal research community. A photographer who asks “what is the resolution?” is asking a different question than a molecular biologist who asks “what is the fold-change?” Both questions are scientifically valid; both produce useful data. The Photoplasm platform is designed to make the photographer’s question answerable in a BSL-1 community wetlab setting with accessible, affordable tools.
The design philosophy of BioLight and Photoplasm draws explicitly on Universal Design principles, first articulated by Ron Mace at North Carolina State University.⁷ Mace’s central insight — that designs optimized for users at the margins of capability tend to work better for everyone — applies directly to community biology tools. A device that can be built, calibrated, and operated by a designer with no prior wetlab experience, following open-source documentation, is a device that will also work reliably in the hands of an experienced molecular biologist. Accessibility is not a constraint on rigor; it is a design specification that produces more robust and reproducible tools.
*Ron Mace (1940-1998) - Visionary of “Universal Design” (Tribute to a friend, colleague and mentor from 1996-1998)
The partnership between Genspace (Brooklyn, NY) and MakerSpace Charlotte is not incidental to BioLight — it is the proof-of-concept for the distribution model Aim 3 proposes to scale. Genspace provides certified BSL-1 infrastructure, institutional knowledge, and the HTGAA Node authorization framework. MakerSpace Charlotte provides fabrication capability, community design culture, and a student population drawn from manufacturing, industrial design, and biotech industry backgrounds. Together they demonstrate that the Photoplasm platform can operate across two geographically distributed sites with different institutional profiles — which is exactly what a national or international distribution network would require. Fun matters too: a biological imaging platform that produces gallery-ready bacteriographs — art objects made from living organisms expressing fluorescent proteins — creates an entry point into synthetic biology that no textbook or lecture can replicate.
Q4 — Ethics
Describe the ethical considerations relevant to your project (minimum two paragraphs).
The ethical framework for BioLight is drawn from the governance principles introduced in HTGAA Week 1, applied specifically to the context of community makerspace synthetic biology. The four bioethics principles — Beneficence, Non-maleficence, Justice, and Responsibility — map directly onto the three aims of this project. Beneficence is expressed through the open-source learning and making ethos of the platform: every protocol, hardware design, and calibration dataset is released publicly with the explicit goal of enabling others to replicate, extend, and improve the work. Non-maleficence is expressed through the BSL-1 containment framework: BioLightV5 uses DH5α E. coli with ampicillin selection, a strain and antibiotic combination with no pathogenic potential and no environmental persistence beyond standard autoclave disposal. Justice is expressed through the Universal Design commitment: the platform is specifically engineered to be accessible to participants without prior wetlab experience, lowering the barrier to meaningful synthetic biology practice. Responsibility is expressed through the open-source governance model: MIT licensing, version-controlled public repositories, and a commitment to documenting not just what works but what failed and why.
The primary ethical risk in BioLight is not biosafety; it is intellectual property and data governance. As the Photoplasm platform scales toward a distributed network of connected devices running optogenetic experiments and reporting results to a shared data model, questions of data ownership, attribution, and dual-use screening become real. The current approach addresses these risks in three ways. First, all primary wetlab work occurs at Genspace under their certified BSL-1 protocols and institutional oversight; no biological work is conducted at MakerSpace Charlotte until the HTGAA Node authorization pathway is complete. Second, all DNA synthesis passes through Twist Bioscience’s standard screening pipeline, which includes dual-use sequence review. Third, the Aim 3 data model, similar to a Transfyr.ai observational learning analytics integration, is designed to capture experimental outcomes and learner engagement data, not raw sequence data or unpublished results, minimizing the surface area for misuse. The device itself is inert and substrate-independent: the Photoplasm hardware delivers light, not biology, and has no inherent dual-use concern independent of the biological substrate placed at its focal plane.⁸ ⁹
Another potential risk worth exploring in scientific methodology is the bias and influence of AI models on engineering and design. What is the risk to synthetic biology if erroneous assumptions and generative claims are accepted as fundamental truth? There are certainly rewards gained through trained data sets and accelerated data access. I have experienced the positive and negative implications of an artificial agent in the flow of design work, and we are still at the beginning of our interactive technology journey with artificial intelligence. I will continue to cautiously embrace AI as a tool, for the purpose of accelerating and improving outcomes.
Footnotes
¹ Kuldell N, Bernstein R, Ingram K, Hart KM. BioBuilder: Synthetic Biology in the Lab. O’Reilly Media (2015). ISBN 978-1491904299.
² Levskaya A et al. Engineering Escherichia coli to see light. Nature 438:441–442 (2005). doi:10.1038/nature04405
³ Li Y et al. A single-component light sensor system allows highly tunable and direct activation of gene expression in bacterial cells. Nucleic Acids Research 48(6):e33 (2020). doi:10.1093/nar/gkaa044
⁴ Eric Schneider, personal statement — industrial design thesis, North Carolina State University; Time Inc. Color Lab photographic processing 1987–1990.
⁵ Teller E, quoted in Friedman TL. Thank You for Being Late. Farrar, Straus and Giroux (2016).
Form Prompt
Create a detailed experimental plan for your final project. Include a timeline for each part of your experimental plan (i.e., how long you would expect each step in your final project to take). (min. 15 lines/sentences — a numbered list is acceptable). Include specific methods/tools/technologies/biological concepts for each part of the final project and analysis. For each experiment and/or analysis, include a description of your expected results. If possible, include figure(s) that visually shows a broad workflow of your project or a specific aspect of your experimental plan.
Opening
The BioLight experimental plan follows a Design → Build → Test → Analyze framework — organized across three HTGAA aim phases: Aim 1 — Design & Build (Experimental), which establishes the biological construct and the Photoplasm hardware platform; Aim 2 — Test & Analyze (Development), which executes the full seven-stage wet-lab protocol from transformation through calibrated image exposure and bacterial H&D curve generation; and Aim 3 — Learn & Refine (Visionary), which extends the platform into open-source community distribution and cell-free biomanufacturing at scale.
The design phase of Aim 1 is essentially complete: BioLightV5 has been assembled in Benchling, validated through Asimov Kernel circuit logic, modeled in AlphaFold and ChimeraX, and ordered as a clonal gene from Twist Bioscience. The hardware build is running in parallel at my design studio and MakerSpace Charlotte; all Photoplasm physical parts are designed in Fusion 360 and printed in PETG on a Bambu X1 Carbon, with a full parts guide found in the Supplemental Information section.
All wet-lab work is conducted at Genspace (Brooklyn, NY), my assigned HTGAA Node and certified BSL-1 wetlab partner for this project. A complete fallback plan using pDawn-sfGFP (Addgene #107741) is documented and ready to activate if BioLightV5 sequence verification fails or exposure produces no usable bacteriograph after multiple attempts.
What you will find in this section:
Part A — Detailed Experimental Plan — a 17-step numbered timeline organized across the Aim 1 Design & Build and Aim 2 Test & Analyze phases, including the full Aim 2 — Test & Analyze (Development) seven-stage protocol (verify → transform → plate → expose → develop → calibrate → print) and the Minimum Viable Functional Validation (MVFV) blue light induction test that gates entry into image exposure work
Part B — Techniques Checklist — 19 techniques checked from the HTGAA form list, each annotated with its role in the project
Part C — Protocol Design — the single Aim 2 — Test & Analyze (Development) protocol, comprising two sequential blue light tests: a simple test-tube induction validation confirming construct function post-transformation, followed by full Photoplasm step-wedge calibration
Part D — Industry Council Companies — three primary partners each with a specific project role across the three aims, plus supporting partners
Part E — Workflow Figures — visual illustrations of the Aim 2 protocol and the BioLightV5 non-linear design network
Appendix — Standalone markdown protocol documents written for direct bench use at Genspace
The full step-by-step protocols are maintained as standalone documents and referenced throughout; what follows is the high-level experimental plan.
*Gantt chart: Aim 1, Aim 2, Aim 3 - 5/19/2026
Part A — Detailed Experimental Plan
All initial dates are anchored to Twist BioLightV5 delivery on or before May 27, 2026, and Genspace Safety Training and Orientation on May 28, 2026 — a fixed date independent of Twist delivery status. Actual dates may shift based on confirmed order and delivery; the structure and format of the protocol remains intact.
Aim 1A — Design & Build (Experimental) · Design
1. BioLightV5 plasmid design in Benchling (complete, ~3 weeks elapsed)
Asimov Kernel circuit logic → Benchling sequence assembly (pUC19 backbone, RsLOV–LexA408 fusion, sfGFP reporter, pColE408 operator, SD17 RBS, two distinct terminator sequences) → AlphaFold structural prediction → ChimeraX visualization of dark-state PDB 4HJ4 dimer and light-state monomer.
Expected result: passing all four Benchling quality checks before Twist submission.
2. Twist Bioscience clonal gene order placed (order pending)
BioLightV5 submitted as a single clonal gene under the 5 kbp synthesis limit. Target delivery on or before May 27, 2026.
Expected result: lyophilized DNA aliquot with full sequence verification report.
3. pDawn-sfGFP control ordered from Addgene (order pending)
Addgene #107741 (Riedel-Kruse Lab, PNAS 2018) ordered as bacterial stab. Ready-to-use fallback if BioLightV5 sequence fails.
Expected result: A viable living-cell bacterial transformation of BioLightV5, ready for use in Photoplasm device experiments.
See Section Two - Aims for illustrated pipeline
Aim 1B — Design & Build (Experimental) · Build
4. Photoplasm device prototype: Fabrication & Documentation (Aim 1B completing by May 27)
3D-Printed Modular Components designed in Fusion 360 and printed in PETG on Bambu X1 Carbon at my design studio and MakerSpace Charlotte.
Parts include: dark chamber frustum cone (150 mm height, 51 mm ID top, 150 mm OD base), stackable spacer rings (~100 mm each, adjusting throw distance 6–12 inches), LED light ring mount, OLED digital image carrier, bacterial plate holder, and plate heater with heat sensor.
Electronics: Raspberry Pi 5, 470 nm LED light ring, PWM/MOSFET driver (IRLZ44N, GPIO18 control), light collimator, OLED digital image mask for variable density projection, focusing lens, AS7341 spectral sensor, Raspberry Pi Camera Module.
Expected result: a calibrated, light-tight imaging platform with documented optical stack.
Full build sequence documented in Photoplasm Quick Start Guide
Aim 2 — Test & Analyze (Development)
The Aim 2 protocol begins on Twist plasmid receipt and Genspace orientation. Two blue light tests gate the path from transformation to image exposure: first a simple test-tube MVFV induction validation, then full Photoplasm step-wedge calibration. All wet-lab work is conducted under Genspace BSL-1 protocols only.
5. Genspace Safety Training and Orientation - Lab Block A (May 28, fixed, ~6 hours)
Site orientation, BSL-1 safety review, materials check-in, equipment familiarization, lab notebook initialization. Proceeds regardless of Twist delivery status.
Expected result: cleared to begin wet-lab work May 29.
6. P1 - Verify: plasmid receipt and gel verification (May 29, ~2 hours)
Resuspend Twist DNA aliquot, confirm sequence report, run confirmation gel.
Expected result: clean band, sequence-verified BioLightV5 ready for transformation.
7. P2 - Transform: DH5α transformation (May 29–30, ~2.5 hours active + 16 h overnight)
Heat-shock transformation of DH5α competent cells with BioLightV5, recovery in SOC, plate on LB+Amp. All handling under red safelight to prevent leaky expression.
Expected result: 10+ AmpR colonies after 16 h at 37°C in darkness.
8. P3 - Plate: colony picking, miniprep, and stock banking (May 30 – June 1, ~4 hours active + 16 h overnight culture)
Pick colonies, grow overnight in LB+Amp in darkness, miniprep, glycerol stock banking at −80°C.
Expected result: at least one sequence-verified working stock.
9. Blue Light Test 1 - Minimum Viable Functional Validation (MVFV) (June 1, ~2 hours; critical gate P3.6-G)
Two culture tubes prepared from verified stock: one exposed to bench-top 470 nm source, one held in darkness. sfGFP emission confirmed visually. No Photoplasm device required: the test is intentionally minimal, confirming construct function independently of hardware. This is the primary go/no-go gate for Aim 2 - Test & Analyze (Development) image exposure work.
Expected result: measurable fluorescence in light tube, minimal signal in dark control. Failure triggers pDawn-sfGFP backup protocol; the June 2–21 hold window provides recovery time.
10. Hold window - device pre-work and sequencing convergence (June 2–21, ~3 weeks)
Verified glycerol stocks held at −80°C. Photoplasm device pre-work completes: Cree LED irradiance gate cleared (≥100 µW/cm² at substrate plane), 16-step Bayer dither step-wedge calibration run, minimum effective dose (MED) and exposure window established.
Expected result: device validated, exposure parameters locked, working stock confirmed and ready for Lab Block B.
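The step-wedge idea can be sketched with standard 4×4 ordered (Bayer) dithering, where each step lights a fixed fraction of the pixels in every tile. This is a minimal illustration under stated assumptions, not the Photoplasm firmware: the names `dither_tile` and `step_wedge` are hypothetical, the patch size is arbitrary, and numbering the steps 1 to 16 (rather than 0 to 15) is a choice made here.

```python
# Standard 4x4 Bayer index matrix (values 0..15).
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def dither_tile(level):
    """Return a 4x4 tile of 0/1 pixels with exactly `level` pixels lit (0..16)."""
    if not 0 <= level <= 16:
        raise ValueError("level must be between 0 and 16")
    return [[1 if BAYER_4X4[y][x] < level else 0 for x in range(4)]
            for y in range(4)]


def step_wedge(steps=16, patch_px=8):
    """Build a horizontal wedge: patch k (1..steps) lights k of every 16 pixels."""
    rows = []
    for y in range(patch_px):
        row = []
        for k in range(1, steps + 1):
            for x in range(patch_px):
                row.append(1 if BAYER_4X4[y % 4][x % 4] < k else 0)
        rows.append(row)
    return rows


wedge = step_wedge()
for row in wedge[:4]:
    print("".join("#" if px else "." for px in row))
```

Because each step changes the lit fraction by exactly 1/16, the relative exposure per patch is known from geometry alone, before any irradiance measurement is taken.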
11. Genspace Lab Block B - P4: agarose slab casting (June 22, ~3 hours active + 16 h pre-incubation)
Following Aim2_Protocol_AgaroseSlab.md: measure overnight OD₆₀₀, temper low-melt agarose to 42–45°C, mix cells into molten agarose, cast thin uniform slab in 90 mm dish, pre-incubate in darkness.
Expected result: uniform photosensitive substrate analogous to a silver-halide-in-gelatin emulsion.
12. Blue Light Test 2 - P5: Photoplasm step-wedge calibration and image exposure (June 23, ~0.5 hours active + 4–8 hours exposure)
With construct function confirmed by MVFV, the full Photoplasm device is engaged.
Visual Guide to Calibration Cycles
Project 16-step Bayer dither calibration target through OLED digital image mask at calibrated 470 nm irradiance and predicted F/8 aperture setting, to establish wavelength and illumination values.
Insert agarose slab with bacterial lawn into plate holder and place under dark chamber
Start timed exposure duty-cycle dosing (dark growth, blue light dose, dark recovery, repeat) to prevent over-expression and metabolic exhaustion across the 24-hour exposure window.
Three planned experimental exposures:
(a) Circular step-wedge for calibrating to H&D curve
(b) Siemens Star Pattern for resolution and focus test measurement.
(c) One original continuous tone image mask for the 12-piece Photoplasm Art Gallery series.
Experimental Aim: Raspberry Pi Camera Module provides real-time machine vision feedback during each duty cycle, feeding image data into a self-correction algorithm that adjusts subsequent dose parameters based on observed expression response.
Expected result: spatially patterned sfGFP expression confirmed at exposure completion.
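The duty-cycle dosing described in this step can be sketched as a schedule of timed phases plus a running dose total. Everything numeric here is a placeholder: the phase durations and cycle count are hypothetical values chosen to fill a 24-hour window, and the 100 µW/cm² figure is simply the irradiance gate threshold quoted earlier, not a measured value. No hardware library is used; driving the LED is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    minutes: float
    led_on: bool


def build_schedule(cycles, dark_growth_min, dose_min, recovery_min):
    """Repeat dark growth -> blue dose -> dark recovery for the given number of cycles."""
    one_cycle = [
        Phase("dark growth", dark_growth_min, False),
        Phase("blue dose", dose_min, True),
        Phase("dark recovery", recovery_min, False),
    ]
    return one_cycle * cycles


def total_minutes(schedule):
    return sum(phase.minutes for phase in schedule)


def cumulative_dose_uj_per_cm2(schedule, irradiance_uw_per_cm2):
    """Dose = irradiance x lit time; uW/cm2 x seconds gives uJ/cm2."""
    lit_seconds = sum(phase.minutes * 60 for phase in schedule if phase.led_on)
    return irradiance_uw_per_cm2 * lit_seconds


# Hypothetical timings: 8 cycles of 150 + 15 + 15 minutes.
schedule = build_schedule(cycles=8, dark_growth_min=150, dose_min=15, recovery_min=15)
print(total_minutes(schedule) / 60, "hours total")
print(cumulative_dose_uj_per_cm2(schedule, 100.0), "uJ/cm2 delivered")
```

Keeping the schedule as plain data makes it straightforward to log alongside camera frames, so any later dose adjustment can be traced to the phase that preceded it.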
13. P6 - Develop: post-exposure incubation and imaging (June 23–25, ~3 hours active + 4–16 h development)
Post-exposure incubation in darkness at 37°C to allow sfGFP expression. Image under 470 nm transilluminator with 515 nm long-pass filter; AS7341 sensor captures fluorescence across 510–530 nm sfGFP emission window. Photograph plates for archival record.
Expected result: measurable bacteriograph with spatially resolved sfGFP intensity gradient.
14. P7 - Calibrate & Print: bacterial H&D curve generation (June 25, ~4 hours analysis)
Export AS7341 time-series CSV. Plot fluorescence vs. logarithmic light exposure for the step-wedge. Document toe, linear, and shoulder regions following Zone System sensitometric conventions.
Expected result: a calibrated bacterial H&D curve — the central Aim 2 — Test & Analyze (Development) deliverable — characterizing BioLightV5 as a photographic substrate.
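The analysis in this step amounts to plotting response against log exposure and fitting the straight-line portion. Below is a minimal sketch of that fit using only the standard library. The data are made-up placeholders, not measurements; the function names are hypothetical; and the choice of which points count as the linear region is made by hand here, which is an assumption about how the analysis would be done.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hd_points(exposures, fluorescence):
    """Pair log10(exposure) with the measured response, as on an H&D plot."""
    return [(math.log10(e), f) for e, f in zip(exposures, fluorescence)]


def linear_region_slope(points, start, stop):
    """Slope of response per decade of exposure over points[start:stop]."""
    xs = [p[0] for p in points[start:stop]]
    ys = [p[1] for p in points[start:stop]]
    return linear_fit(xs, ys)[0]


# Placeholder step-wedge data: flat toe, linear rise, flat shoulder.
exposures = [1, 10, 100, 1_000, 10_000, 100_000]   # arbitrary dose units
fluorescence = [10, 12, 112, 212, 312, 315]        # arbitrary fluorescence units

points = hd_points(exposures, fluorescence)
slope = linear_region_slope(points, 1, 5)
print(f"linear-region slope: {slope:.1f} units per decade of exposure")
```

The slope of the linear region plays the role that gamma plays for film; the first and last points, which depart from the line, correspond to the toe and shoulder.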
15. Documentation, open-source release, and Aim 3 handoff (ongoing)
A GitHub repository for Photoplasm will be published with all four protocols, hardware specifications, device firmware, the Photoplasm Art Gallery exhibition framework, and an observational data schema in the style of Transfyr.ai. My instructional design methodology includes experiential learning activities. The repository will be released under the MIT open-source license.
Genspace Community Project ↔ MakerSpace Charlotte collaborative build workshop scheduled as the Aim 3 distribution proof-of-concept. Machine vision self-correction data archive to be created as the foundational training dataset for the Aim 3 fleet-level neural network.
Expected result: fully documented open-source platform ready for community replication.
At the time of this submittal, several HTGAA2026 colleagues are interested in participating in a global expansion of the Photoplasm device initiative, both as a cohort and individually. This is a very exciting opportunity to demonstrate the open-source and open-innovation pipeline through a museum-grade “Photoplasm Art Exhibition” of experimental image exposures: a show that can be printed as fine art and travel the globe, with an online and printed publication. (5/23/26)
Decision Points and Fallbacks
MVFV gate (step 9, June 1): Failure triggers pDawn backup protocol; June 2–21 hold window provides recovery — pDawn timeline (~10–17 days from trigger) converges into Lab Block B if started by June 5.
Cree irradiance gate (by June 22): If ≥100 µW/cm² not achieved, fallback blue-light rig engages — uniform 470 nm exposure validates construct and wet-lab protocol without patterned imaging.
P6.2 inspection gate: If 2–3 image exposures produce no usable bacteriograph, project narrative shifts to “protocol and device validated” — a complete and defensible Aim 2 outcome.
Total active lab hours (post-orientation): ~14 h
Total wall-clock duration: ~4 weeks (May 28 → June 25, 2026)
Critical path: Twist delivery May 27 → MVFV gate June 1 → device pre-work convergence June 22
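The timing claims in the decision points above can be checked with a few lines of date arithmetic. This is a minimal sketch: the variable names are illustrative, and treating June 22 as the date the pDawn fallback must converge by is an assumption based on the device pre-work convergence date stated above.

```python
from datetime import date, timedelta

# Dates taken from the timeline in this section (year 2026).
orientation = date(2026, 5, 28)
final_analysis = date(2026, 6, 25)
pdawn_latest_start = date(2026, 6, 5)

# Assumption: the fallback must be ready by the June 22 convergence date.
convergence = date(2026, 6, 22)

# Wall-clock duration of the plan, in days.
wall_clock_days = (final_analysis - orientation).days

# Worst case for the pDawn fallback: 17 days from the latest allowed trigger.
pdawn_worst_case_done = pdawn_latest_start + timedelta(days=17)
pdawn_on_time = pdawn_worst_case_done <= convergence

print(wall_clock_days, pdawn_worst_case_done.isoformat(), pdawn_on_time)
```

Under these assumptions the plan spans 28 days, and a pDawn start on June 5 finishes on the convergence date with no slack, which is why June 5 is the latest trigger date.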
Part B — Techniques Checklist
Pipetting & Lab Safety
☑ Pipetting (hands-on competency established at Genspace Safety Training and Orientation, May 28, 2026 — fixed date, independent of Twist delivery)
☑ Lab Safety (Genspace BSL-1 Safety Training, May 28, 2026)
☑ Bioethical Considerations (mandatory — addressed in Section 3 Q4)
DNA Editing
☑ DNA Gel Art (gel electrophoresis as key transformation checkpoint — visual confirmation of BioLightV5 at P1)
☑ DNA Sequencing (Sanger verification of BioLightV5)
☑ DNA Construct Design (BioLightV5 in Benchling — Aim 1 — Design & Build)
☑ Databases (GenBank, NCBI, Addgene)
☐ Restriction Enzyme Digestion
☐ Gel Electrophoresis
☐ DNA Purification From Gel
Lab Automation
☑ Designing a Twist Order (BioLightV5 synthesis — Aim 1 — Design & Build)
☑ Creating a plan to use the Autonomous lab at Ginkgo Bioworks (Aim 3 — Learn & Refine)
☐ Creating Code for Laboratory Automation (deferred to Aim 3)
☐ Using Liquid Handling Robots (deferred to Aim 3)
☑ Freeze-Dried Cell Free Systems (observed in Week 10 ISS lab; Aim 3 distribution path targets this format for shippable consumables)
☐ miniPCR Tools
☐ Protein Purification
Cloning
☑ Primer Design or Selection (Sanger verification primers)
☑ PCR Reactions (colony PCR for sequence verification)
☐ Gibson Assembly
☐ Other Cloning Methods
☐ CRISPR / Cas9
☐ Designing Prime Editing gRNA
Total: 12 techniques checked.
Part C — Protocol Design
Expand upon two techniques you checked in the previous question by describing how you would utilize those techniques in your final project. (min. 4 sentences)
Protocol Design 1 — DNA Construct Design: BioLightV5 from eLightOn to Twist
Step 1 — Candidate selection: why eLightOn
The path to BioLightV5 began with a structured analysis of the full bacterial photography lineage, from Levskaya 2005 through the Tabor Lab multichromatic work, evaluating multiple optogenetic candidates against criteria including ON/OFF dynamic range, plasmid size, chromophore requirements, strain portability, and accessibility for community deployment. eLightOn (Li et al. 2020) was selected on the basis of its >500× ON/OFF fold change, which translates directly to photographic dynamic range: the capacity to produce a continuous tone image with measurable gradations between the fully repressed dark state and the fully induced light state, rather than a binary on/off signal.
As a parallel control and fallback, pDawn-sfGFP (Addgene #107741, Riedel-Kruse Lab, PNAS 2018) was selected as the next-best single-plasmid construct available directly from Addgene, requiring no reconstruction from literature. Both constructs were selected because they rely on an endogenous chromophore, avoiding the complexity of exogenous chromophores, which would require a second plasmid, increase metabolic burden, and work against future cell-free design requirements.
Table: Selection Criteria for Plasmid Design
Step 2 — Reconstructing eLightOn from the Li 2020 paper
Unlike pDawn-sfGFP, eLightOn is not available on Addgene and could not be ordered directly — it had to be reconstructed from the published protein sequences and supplemental data in Li et al. 2020. This required first extracting the RsLOV and LexA408 protein sequences from the paper, then converting those protein sequences back to DNA using the IDT Codon Optimization Tool (idtdna.com/CodonOpt). Codon optimization for E. coli K12 was essential because RsLOV originates from Rhodobacter sphaeroides, a purple bacterium with substantially different codon usage from E. coli — without optimization, expression would be poor and the light response weak or absent. The resulting codon-optimized DNA sequences were imported into Benchling as the foundation for BioLightV5, and this protein-derived DNA sequence is the same one subsequently modeled in AlphaFold — meaning the structural prediction reflects the actual construct rather than an approximation.
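The reverse-translation step can be illustrated with a minimal sketch. The one-codon-per-residue table below uses codons commonly reported as frequent in E. coli K-12; it is an assumption chosen for demonstration and is not the table or algorithm used by the IDT tool, which also balances other constraints. The peptide in the example is hypothetical and is not part of BioLightV5.

```python
# Illustrative codon choices for E. coli K-12 (assumed, not IDT's table).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
STOP_CODON = "TAA"


def reverse_translate(protein: str) -> str:
    """Map each amino acid to one preferred codon and append a stop codon."""
    protein = protein.upper().strip()
    unknown = set(protein) - set(PREFERRED_CODON)
    if unknown:
        raise ValueError(f"Unsupported residues: {sorted(unknown)}")
    return "".join(PREFERRED_CODON[aa] for aa in protein) + STOP_CODON


# Hypothetical five-residue peptide used only to show the mapping.
dna = reverse_translate("MKTLE")
print(dna, len(dna))
```

A real optimizer varies codon choice along the sequence to avoid repeats and extreme GC content, so this fixed table should be read only as a picture of the protein-to-DNA direction of the step.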
Step 3 — Benchling initial build and iteration
With the codon-optimized sequences created, the full BioLightV5 construct was assembled in Benchling: the pUC19 backbone, RsLOV–LexA408 fusion (LexRO), pColE408 operator, sfGFP reporter, and double terminator. The double terminator uses two distinct terminator sequences in series, a deliberate choice that ensures a clean transcriptional stop while avoiding the direct-repeat synthesis complications that arise when identical terminators are stacked; this contributed to the final Twist order passing validation. The <5 kbp synthesis limit imposed by Twist Bioscience (including vector) was a primary selection criterion from the start, and eLightOn’s single-plasmid architecture was specifically chosen because it fits within this constraint, eliminating the need for multi-fragment assembly methods such as Gibson Assembly or Golden Gate.
This single-plasmid decision also directly simplified the Aim 2 — Test & Analyze (Development) validation protocol: transformation of a single verified plasmid into DH5α is all that is required to establish the full optogenetic circuit, with no in-lab assembly steps between Twist delivery and wet-lab testing. A deliberate fine-tuning decision was made at the RBS selection step: SD17 was chosen over faster alternatives specifically because it produces slower, more controlled LexRO expression that preserves the full dynamic range of the system — SD17 trades induction speed for full expression fidelity, the right tradeoff for a system designed to produce photographic gradations rather than a binary on/off signal.
TA mentor Anastasia Bernaz provided important guidance on the necessity of spacers between components, and advised allowing even more space between elements in future Benchling builds — a design note carried forward for subsequent iterations of BioLightV5.
Step 4 — Asimov Kernel parts, SBOL, and Twist order refinement
In Asimov Kernel, an individual part was created for each circuit component — RsLOV, LexA408 fusion, pColE408 operator, SD17 RBS, sfGFP reporter, and double terminator — and assembled into a complete SBOL representation. A key coaching moment came from TA mentor Yehuda Binik, who identified that the generative AI-assisted SBOL output was inaccurate in its biological representation — the constructs were present but not correctly structured in SBOL format, which directed the work toward Asimov Kernel as the proper tool for parts formalization.
The construct was intentionally built without explicit restriction cut sites for future sfGFP replacement — a simplification appropriate for Aim 1 and Aim 2 scope — however this introduced complications during the Twist order process, where ORF reading frame dependencies and the requirement to include the promoter and terminator in-frame caused several order attempts to fail validation, resolved through iterative refinement between Benchling and Twist.
A circuit simulation was then run in Asimov Kernel, producing a spike in predicted sfGFP expression — but without capturing the dark-state repression phase central to the eLightOn mechanism, attributable to two known model limitations: Asimov Kernel does not simulate FMN chromophore photochemistry, and the underlying model is mammalian-derived, which may further limit dark-state accuracy in an E. coli chassis. The simulation result is treated as a model artifact confirming sfGFP expression is achievable under induction, while the dark/light dynamic range is reserved for empirical validation in the Aim 2 MVFV test. (Asimov simulation graph: Figure 4.2.)
*Asimov Kernel Simulation: 24 hrs
Step 5 — AlphaFold structural prediction and key limitation
Following Benchling refinement, AlphaFold was used to predict the three-dimensional fold of the RsLOV–LexA408 fusion protein — using the codon-optimized, protein-derived sequence as the basis, ensuring structural prediction reflects the actual construct. AlphaFold produced a structurally confident model of the LexRO dimer, but with a critical and known limitation: it does not simulate FMN chromophore energy transfer or its photochemical interaction with the protein — meaning the predicted structure captures the overall fold with high confidence but cannot model the monomerization event triggered by 470 nm light. The result is a strong structural prediction paired with a weak link at the photochemical interface — the precise point where the dark-to-light state transition occurs.
*AlphaFold Prediction : LexRO fusion of RsLOV-LexA408
Step 6 — ChimeraX MOA evaluation
ChimeraX resolved the AlphaFold gap through direct exploration of the dark-state crystal structure (PDB 4HJ4), enabling precise visualization of the FMN cofactor distance to the Cys55 terminus — the 4.324 Å gap that represents the photochemical trigger point for LexRO monomerization. The spatial relationship between the LexRO dimer and the pColE408 DNA binding/release interface was mapped, providing visual reinforcement that dark-state dimerization physically occludes the operator and represses sfGFP transcription, and supports the theory that the geometry of monomerization under 470 nm light is sufficient to uncover the promoter and permit expression.
*ChimeraX Visualization FMN cofactor distance to the Cys55 terminus
The full six-step pipeline — candidate selection → IDT codon optimization → Benchling circular plasmid → Asimov Kernel parts and simulation → AlphaFold → ChimeraX MOA — forms a complementary design workflow where each tool’s features and limitations are explored, producing a construct that is sequence-verified, circuit-simulated, and structurally rationalized before a single wet-lab experiment begins.
Protocol Design 2 — Quality Control / Analysis: AS7341 Spectral Sensor as Photometric Calibration Instrument
Overview
The AS7341 11-channel spectral sensor serves a dual role in the Photoplasm system — first as a precision calibration instrument that characterizes the optical stack before any biological work begins, and second as a real-time plate reader during exposure and development. This protocol covers the calibration phase, which is a prerequisite for all Aim 2 — Test & Analyze (Development) exposure work. Full calibration specifications, Python scripts, and sensor deployment notes are documented in Photoplasm_Device_PreWork.md; the detailed build guide is published for future collaborators.
The optical stack and calibration geometry
The Photoplasm optical path is a darkroom enlarger rebuilt as a bio-imaging instrument: a 470 nm Cree XP-E2 LED ring delivers blue light through a condenser lens array that collimates and directs it into an even, parallel projection through the OLED digital image mask — where transparency is off and pixels are selectively on for masking — through a focusing lens, and onto the bacterial plate or agarose slab at approximately 10 inches from the nodal point of the focusing lens.
Photoplasm hardware stack: Image by NanoBanana 2
In initial testing this projection worked as designed, casting a sharp image onto the focal plane with measurable continuous tone gradations. The AS7341 is deployed at plate height to characterize this projection — reading the actual irradiance at the biological substrate plane rather than at the source, which is the only measurement that matters for exposure calibration.
Wavelength Sensor (used for calibration)
Python calibration scripts and key findings
Three Python calibration scripts were written and run on the Raspberry Pi 5 to characterize the Photoplasm optical stack: photoplasm_cal01.py (retired from irradiance calibration after the key finding described below), photoplasm_cal02.py (three-state OLED irradiance measurement), and photoplasm_densitometer.py (16-step Bayer dither H&D curve sweep). A critical calibration principle emerged during early testing: the AS7341 is sensitive enough that even a change in projected pixel density — as introduced by a step-wedge mask — registers as an irradiance change at the sensor. This means any patterned mask in the optical path during calibration will cause the sensor to read spatial variation in the mask rather than the true uniform field irradiance.
The correct calibration approach is therefore full-frame uniform illumination with no mask pattern in the path — measuring the light field as the bacterial substrate will actually receive it. The step-wedge is preserved as a biological exposure tool for plate work, where spatial density variation is precisely what is being controlled, but it is not used during device irradiance calibration.
For the calibration sweep itself, a control-to-maximum irradiance run was executed — from direct unmodulated LED output through 100 PWM levels, downsampled to 16 standardized steps — producing a clean dose curve from minimum to maximum irradiance that defines the operating range of the Photoplasm device independently of any mask pattern. The densitometer script applied this approach using a 16-step Bayer ordered dither pattern across the full OLED pixel density range, and the AS7341 F2+F3 channel sum (445 nm + 480 nm, used as the 470 nm dose proxy since no single AS7341 channel falls at exactly 470 nm) showed a logarithmic response characteristic: steep toe at 0–25% pixel density, linear zone at 25–75%, and shoulder plateau at 75–100%, with a log fit of F2+F3 ≈ 138 + 22.5 × ln(density + 1) at R²=0.968.
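The reported log fit and the 100-to-16 downsampling can be sketched as follows. This is a hedged illustration, not the contents of photoplasm_densitometer.py: it assumes pixel density is expressed in percent (0–100), that the 16 steps are evenly spaced across the 100 PWM levels, and that the region boundaries are inclusive as coded. The function names are hypothetical.

```python
import math


def fitted_f2_f3(density_pct: float) -> float:
    """Evaluate the reported fit: F2+F3 ~= 138 + 22.5 * ln(density + 1)."""
    return 138.0 + 22.5 * math.log(density_pct + 1.0)


def downsample_indices(n_levels: int = 100, n_steps: int = 16) -> list[int]:
    """Pick n_steps evenly spaced indices from 0 .. n_levels - 1 (assumed scheme)."""
    return [round(i * (n_levels - 1) / (n_steps - 1)) for i in range(n_steps)]


def sensitometric_region(density_pct: float) -> str:
    """Label a pixel density using the toe / linear / shoulder bands reported above."""
    if density_pct < 25:
        return "toe"
    if density_pct <= 75:
        return "linear"
    return "shoulder"


for idx in downsample_indices():
    density = idx * 100 / 99  # map PWM index 0..99 onto 0..100 percent
    print(idx, round(fitted_f2_f3(density), 1), sensitometric_region(density))
```

Because the fit is logarithmic, most of the predicted change in the F2+F3 sum occurs at low densities, which matches the steep toe described above.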
Densitometer Readings (used for calibration)
This is the optical H&D curve of the Photoplasm device — confirming that the system produces a measurable, continuous-tone sensitometric response before a single bacterium has been exposed. An additional discovery emerged: the OLED digital image mask itself emits 470 nm light proportional to pixel density, making it an additive light source rather than a purely neutral mask — a finding that motivates the planned upgrade to an ILI9341 transmissive LCD.
Cree XP-E2 upgrade and f/8 aperture decision
Initial Aim 1 testing confirmed that the consumer-grade EBOOT LED ring measured approximately 2.0 µW/cm² at the substrate plane, approximately 50× below the eLightOn activation threshold of 100 µW/cm². The AS7341 calibration data provided the quantitative basis for the upgrade decision: Cree XP-E2 LEDs, with measured output 10–20× higher than the EBOOT array and a tighter wavelength specification centered at 470 nm, are expected to exceed the activation threshold. The irradiance gate for Aim 2 — Test & Analyze (Development) is defined as ≥100 µW/cm² confirmed by the AS7341 at plate height before any biological exposure begins.
The focusing lens will be set to f/8, the optimal balance between image sharpness and depth of field for bacterial plate work. A lower f-stop risks out-of-focus regions across the agarose slab surface if the slab is not perfectly flat; a higher f-stop increases depth of field but reduces light reaching the substrate. f/8 is selected because agarose slabs and bacterial expression layers may vary slightly in surface topology: it provides enough depth of field to accommodate this variation while maintaining adequate irradiance at the substrate plane with the Cree upgrade.
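The irradiance gate reduces to a simple comparison, sketched below. The constants come from the figures quoted above; the function names are hypothetical and the 105 µW/cm² reading in the example is an invented value used only to exercise the check.

```python
ACTIVATION_THRESHOLD_UW_CM2 = 100.0  # eLightOn gate stated in this section
EBOOT_MEASURED_UW_CM2 = 2.0          # EBOOT ring, measured at the substrate plane


def required_gain(measured_uw_cm2: float,
                  threshold_uw_cm2: float = ACTIVATION_THRESHOLD_UW_CM2) -> float:
    """Factor by which plate-height irradiance must rise to reach the threshold."""
    if measured_uw_cm2 <= 0:
        raise ValueError("measured irradiance must be positive")
    return threshold_uw_cm2 / measured_uw_cm2


def passes_irradiance_gate(measured_uw_cm2: float,
                           threshold_uw_cm2: float = ACTIVATION_THRESHOLD_UW_CM2) -> bool:
    """Go/no-go check applied to an AS7341 reading taken at plate height."""
    return measured_uw_cm2 >= threshold_uw_cm2


print(required_gain(EBOOT_MEASURED_UW_CM2))           # gain needed over the EBOOT baseline
print(passes_irradiance_gate(EBOOT_MEASURED_UW_CM2))  # current ring
print(passes_irradiance_gate(105.0))                  # hypothetical post-upgrade reading
```

The gate is evaluated on the reading at plate height, so any source-level output figure still has to be confirmed there before biological exposure begins.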
Timed duty cycle dosing and machine vision feedback
A key methodological innovation in the Aim 2 — Test & Analyze (Development) exposure protocol is the use of a timed duty cycle rather than a single continuous exposure. Bacterial cultures are allowed to grow in total darkness first, establishing baseline expression; a calibrated dose of 470 nm blue light is then delivered at the measured irradiance level, followed by a dark recovery period, then another dose — repeated across the exposure window to prevent over-expression and metabolic exhaustion of the host cells. The reversibility of the eLightOn / BioLightV5 mechanism makes this approach possible: because LexRO re-dimerizes in the dark and re-represses sfGFP transcription during recovery intervals, the system can be dosed, rested, and dosed again — allowing fine-tuning of the exposure across multiple cycles within a single 24-hour experimental run.
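The dark-growth, dose, and recovery sequence described above can be laid out as a schedule before any hardware is involved. The following is a hardware-free sketch, not the project's Raspberry Pi code: the `Phase` class, `build_schedule` function, and every timing value in the example are hypothetical placeholders, since finding the real parameters is an Aim 2 deliverable.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    label: str           # "dark_growth", "dose", or "recovery"
    start_min: float     # minutes from the start of the run
    duration_min: float


def build_schedule(dark_growth_min: float, dose_min: float,
                   recovery_min: float, total_min: float) -> list[Phase]:
    """Dark growth first, then alternating dose and recovery until the run ends."""
    if dose_min <= 0:
        raise ValueError("dose_min must be positive")
    phases = [Phase("dark_growth", 0.0, dark_growth_min)]
    t = dark_growth_min
    while t + dose_min <= total_min:
        phases.append(Phase("dose", t, dose_min))
        t += dose_min
        rest = min(recovery_min, total_min - t)
        if rest > 0:
            phases.append(Phase("recovery", t, rest))
            t += rest
    return phases


# Hypothetical 24-hour run: 4 h dark growth, then 10 min light / 50 min dark.
schedule = build_schedule(dark_growth_min=240, dose_min=10,
                          recovery_min=50, total_min=1440)
doses = [p for p in schedule if p.label == "dose"]
print(len(doses), sum(p.duration_min for p in doses))
```

Keeping the schedule as plain data makes it easy to log alongside sensor readings and to swap in different dose and recovery durations between runs.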
Code Sample (snippet) - PWM Duty Cycle for Raspberry Pi 5
Finding the optimal balance of dose duration, recovery time, and total cycle count is itself a deliverable of Aim 2, and the resulting duty cycle parameters will become part of the calibrated exposure protocol published in the open-source documentation. A Raspberry Pi Camera Module mounted in the Photoplasm dark chamber provides real-time machine vision feedback during the exposure cycle, capturing fluorescence pattern development at each dose interval and feeding image data into a self-correction algorithm that can adjust subsequent dose parameters based on the observed expression response. This machine vision layer is the first implementation of an autonomous feedback loop in the Photoplasm system, and it represents the foundational data collection step for the Aim 3 — Learn & Refine (Visionary) large language model: as exposure data accumulates across multiple Photoplasm devices and experimental runs, the data logged by the self-correction algorithm becomes a training dataset suitable for a shared neural network, a fleet-level learning model that improves calibration accuracy across all deployed devices over time.
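One simple form such a self-correction step could take is a clamped proportional adjustment, sketched below. The source does not specify the algorithm, so this is purely an assumption: the function name, the gain of 0.5, and the dose limits are all hypothetical, and the fluorescence values stand in for whatever metric the machine vision layer eventually reports.

```python
def next_dose_seconds(current_dose_s: float, observed: float, target: float,
                      gain: float = 0.5, min_s: float = 0.0,
                      max_s: float = 600.0) -> float:
    """Scale the next dose by the relative shortfall or excess in fluorescence.

    observed and target are in the same arbitrary fluorescence units.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    relative_error = (target - observed) / target
    proposed = current_dose_s * (1.0 + gain * relative_error)
    # Clamp so a single noisy frame cannot request an extreme dose.
    return max(min_s, min(max_s, proposed))


print(next_dose_seconds(100.0, observed=50.0, target=100.0))   # under-exposed
print(next_dose_seconds(100.0, observed=150.0, target=100.0))  # over-exposed
```

Logging each (dose, observed, target) triple as it is produced would yield the kind of per-run record that the fleet-level dataset described above depends on.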
Part D — Industry Council Companies
Identify any How To Grow (Almost) Anything Industry Council companies which are associated with your final project (optional).
Primary Partners
Ginkgo Bioworks (Aim 3 — Learn & Refine)
Ginkgo Bioworks is an essential partner for the Aim 3 — Learn & Refine (Visionary) cell-free protein synthesis path — the cloud lab infrastructure that transforms BioLightV5 from a live-culture wetlab construct into a stable, shippable, freeze-dried consumable manufacturable at industrial scale. Most significantly, Ginkgo Bioworks could serve as the provider of a cell-free protein synthesis system featuring a Photoplasm-compatible biosensor — a complete, ready-to-use biological kit that responds to 470 nm blue light and produces sfGFP output when exposed through the Photoplasm device. This would make Photoplasm a true distributed community kit: the Ginkgo-manufactured cell-free biosensor as the biological consumable, the open-source Photoplasm device as the exposure instrument, and the shared experiential activity data model as the learning layer.
This is the Eastman Kodak photographic industry analogy made real: the complexity lives in the consumable, and the participant simply loads, exposes, and observes. Beyond the consumable model, if Photoplasm is validated as a third-party labware instrument compatible with Ginkgo’s automated cloud lab protocols, it could operate as an optogenetic exposure platform within the Ginkgo ecosystem itself, as a named protocol element in a fully automated, remotely executed biological imaging workflow.
There may also be a living-cell pipeline reinforced by a fully automated biomanufacturing process which would extend the reach of the visionary aim to existing wetlabs undergoing cloud automation transformation.
Addgene supplies pDawn-sfGFP plasmid #107741, the validated control construct for Aim 2 — Test & Analyze (Development), which is ordered and handled exclusively under Genspace BSL-1 protocols at the Genspace Node. Beyond immediate construct sourcing, direct engagement with Addgene during this project revealed a longer-term institutional pathway: becoming an MTA-ready lab (Material Transfer Agreement certified) is a formal Addgene requirement for any community lab that wishes to deposit or distribute plasmids through their repository. Pursuing MTA-ready status for the MakerSpace Charlotte BioArt Studio is an aspirational goal of Aim 3 — Learn & Refine (Visionary), one that would formalize the studio’s capacity to receive, handle, and eventually contribute biological materials to the open plasmid commons, directly aligned with the two-step HTGAA Node authorization pathway described in Section 3.
Having attended the HTGAA guest speaker session, the connection between the Transfyr.ai observational learning model and the Photoplasm platform has become clearer and more specific. The Photoplasm device is a connected instrument — every exposure run generates structured experimental data (irradiance levels, duty cycle parameters, AS7341 spectral readings, machine vision outputs) alongside learner participation and engagement signals from the community lab context. This is precisely the observational data model Transfyr.ai is built to capture and analyze.
Photoplasm represents a novel category of observational data source: a community-deployed scientific instrument that simultaneously generates both experimental outcomes and participant engagement metrics in a single session. The Aim 3 — Learn & Refine (Visionary) goal of a fleet-level LLM becomes more achievable when paired with observational and experiential activity data from distributed device users over time.
I believe that continued collaboration with Transfyr.ai may lead to novel uses of the activity-based tracking and measurement standard IEEE 9274.1.1-2023 (xAPI 2.0), which I have deployed at global manufacturing scale, and to measurable transformation of industry best practices.
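To make the activity-tracking idea concrete, the sketch below builds one record for a single exposure run in the general actor / verb / object shape of an xAPI statement. It is an illustration only and has not been validated against IEEE 9274.1.1-2023: the `example.org` identifiers, the extension keys, and the function name are hypothetical, and only the "completed" verb IRI is taken from the public ADL verb list.

```python
import json
from datetime import datetime, timezone

EXT_BASE = "https://example.org/photoplasm/extensions"  # hypothetical namespace


def exposure_statement(learner_id: str, run_id: str,
                       irradiance_uw_cm2: float, dose_cycles: int) -> dict:
    """Build one xAPI-style statement describing a completed exposure run."""
    return {
        "actor": {
            "objectType": "Agent",
            "account": {"homePage": "https://example.org", "name": learner_id},
        },
        "verb": {
            "id": "http://adlnet.gov/expapi/verbs/completed",
            "display": {"en-US": "completed"},
        },
        "object": {
            "objectType": "Activity",
            "id": f"https://example.org/photoplasm/runs/{run_id}",
        },
        "result": {
            "extensions": {
                f"{EXT_BASE}/irradiance-uw-cm2": irradiance_uw_cm2,
                f"{EXT_BASE}/dose-cycles": dose_cycles,
            }
        },
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


statement = exposure_statement("learner-001", "run-42", 105.0, 20)
print(json.dumps(statement, indent=2))
```

Carrying device readings in result extensions keeps the experimental data and the learner activity in the same record, which is the pairing this section describes.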
Supporting Partners
New England Biolabs — DH5α competent cells, ampicillin, transformation reagents for Aim 2 wet-lab work at Genspace
Asimov (Kernel) — circuit-level logic design of BioLightV5, used in Aim 1 — Design & Build
Twist Bioscience — essential synthesis pipeline partner for BioLightV5 clonal gene order
Part E — Workflow Figures
Figure 4.1 — Aim 2 Protocol: Two Blue Light Tests.
Visual illustration of the sequential blue light testing protocol within Aim 2 — Test & Analyze (Development). Left panel: Blue Light Test 1 — MVFV — two test tubes post-transformation, one illuminated at 470 nm, one dark, with AS7341 readout and go/no-go gate. Right panel: Blue Light Test 2 — Photoplasm step-wedge calibration — device with OLED digital image mask projecting a 16-step Bayer dither onto an agarose slab, AS7341 capturing dose-response, and the resulting bacterial H&D curve. (FormLabs illustration — attach on submission.)
Figure 4.2 — Asimov Kernel simulation graph.
Predicted sfGFP expression output from BioLightV5 circuit simulation. Spike in expression confirmed; dark-state repression not captured due to mammalian model limitation and absence of FMN chromophore photochemistry modeling. (Screenshot from Asimov Kernel — attach on submission.)
Figure 4.3 — BioLightV5 non-linear design network.
SVG diagram showing the iterative, non-linear pipeline from candidate selection through eLightOn reconstruction, Benchling, IDT codon optimization, Asimov Kernel, AlphaFold, and ChimeraX, with dashed feedback loops at two key iteration points. (Inline SVG — exported from interactive widget, converted offline.)
Appendix — Standalone Protocol Documents
| Document | Version | Scope |
| --- | --- | --- |
| Photoplasm_BioLightV5_Protocol.md | v0.3.0 | Primary wet-lab protocol, phases P0–P6 |
| Photoplasm_Device_PreWork.md | v0.1.0 | Device prep, Cree LED irradiance gate, fallback rig |
| Aim2_Protocol_AgaroseSlab.md | v0.2.1 | Agarose slab embedding method (adapted from Tabor 2011) |
| pDawn_Backup_Protocol.md | v0.1.0 | Fallback protocol if BioLightV5 sequence fails |
Section Five - Results & Validation
Author: Eric Schneider · 2026a-eric-schneider
Node: Genspace NYC
Affiliation: BioArt Studio, MakerSpace Charlotte
Form Prompt
Describe the results of your project. What were the results of your experiments? What data did you collect? What did you learn? If you have not yet completed your experiments, describe what results you expect to see and why. Include figures, images, graphs, or other visual representations of your data where possible. Describe any challenges you encountered and how you addressed them (or plan to address them).
Opening
BioLight & Photoplasm is a project in motion — part completed (Aim 1), part deliberately designed to begin at Genspace on May 28, 2026 (Aim 2) and part saved for an ongoing shared collaborative experience (Aim 3). The results presented here reflect that reality: some are in hand, verified, and documented; others are expected outcomes grounded in calibration data, construct design, and a carefully staged protocol. Together they tell the story of a project that has moved from a photographer’s question — what is the resolution? — through fourteen weeks of literature review, construct design, hardware build, calibration, and community formation, to the threshold of its first wet-lab exposure.
What you will find in this section:
Results Block 1 — Aim 1: Design & Build (Experimental) · BioLightV5 Plasmid Construct — completed design, simulation, and structural analysis results; expected outcomes from Twist delivery and MVFP validation at Genspace
Results Block 2 — Aim 1: Design & Build (Experimental) · Photoplasm Device — completed hardware build and calibration findings including the Bayer dither H&D curve and OLED 470 nm emission discovery; detailed cal02 three-state analysis quantifying the OLED additive limitation; direct comparisons of the baseline vs. Aim 2 light source and image-mask configurations against the BioLightV5 minimum effective dose (MED) reference; and a gain-selection diagnostic for Aim 2 Cree raw characterization
Results Block 3 — Aim 2: Test & Analyze (Development) — expected results across three rounds of wet-lab work at Genspace, from MVFP baseline through first bacteriograph to Aim 3 handoff
Figures — confirmed figure list with status, pending figures noted for v1.0 final pass
Challenges — personal statement on the challenge of learning a new domain, followed by nine specific challenges encountered and how each was addressed
The BioLightV5 plasmid construct was designed, verified, and submitted to Twist Bioscience for clonal gene synthesis. The final v5 construct is 2,201 bp on a pUC19 backbone with AmpR selection, confirmed at 48.98% GC content with exactly two functional ORFs: LexRO-Fusion (1,143 bp) and sfGFP (717 bp). Three sequence issues identified during the design review were resolved prior to submission: an SD17 RBS spacing correction (AAA insert), removal of internal EcoRI/XhoI restriction sites, and replacement of the neutral spacer with a 50 bp AT-rich synthetic sequence. All four Benchling quality checks passed before Twist submission.
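The kinds of sequence checks described here (GC content, internal restriction sites, ORF length) can be sketched in a few functions. This is not the Benchling check itself; the function names are hypothetical and the test sequence is a toy string, not BioLightV5. The EcoRI (GAATTC) and XhoI (CTCGAG) recognition sites are palindromic, so scanning one strand is sufficient.

```python
RESTRICTION_SITES = {"EcoRI": "GAATTC", "XhoI": "CTCGAG"}


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def find_sites(seq: str, sites: dict[str, str] = RESTRICTION_SITES) -> dict[str, list[int]]:
    """Return 0-based start positions of each recognition site, overlaps included."""
    seq = seq.upper()
    hits: dict[str, list[int]] = {}
    for name, motif in sites.items():
        positions, start = [], seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        hits[name] = positions
    return hits


def orf_length_in_frame(length_bp: int) -> bool:
    """An ORF that includes its stop codon must be a whole number of codons."""
    return length_bp % 3 == 0


toy = "ATGCGAATTCGC"  # toy sequence containing one EcoRI site
print(gc_percent(toy), find_sites(toy))
print(orf_length_in_frame(1143), orf_length_in_frame(717))
```

Both reported ORF lengths divide evenly into codons (381 and 239), which is consistent with the two functional ORFs stated above.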
Key completed findings — construct design:
Asimov Kernel circuit simulation confirmed sfGFP expression output; dark-state repression gap documented as known artifact of mammalian-derived simulation model — not a construct flaw (Figure 4.2)
AlphaFold structural prediction of LexRO-Fusion confirmed stable dimer fold; FMN chromophore energy transfer gap noted as known prediction model limitation
ChimeraX exploration of dark-state crystal structure (PDB 4HJ4) confirmed Cys55–FMN distance at 4.324 Å; LexRO dimer geometry mapped against pColE408 operator; dark-state dimerization confirmed to physically occlude promoter; monomerization under 470 nm confirmed geometrically sufficient to permit sfGFP transcription (Figures 3.1, 3.2)
BioLightV5 non-linear design network — six-step pipeline from candidate selection through ChimeraX MOA confirmation documented with dashed feedback loops at two key iteration points (Figure 4.3)
pDawn-sfGFP (Addgene #107741) ordered in parallel as validated single-plasmid control construct
Expected Results
Successful delivery of both the engineered BioLightV5 plasmid from Twist and the pDawn-sfGFP control from Addgene sets up a substantial and purposeful Aim 2 — Test & Analyze (Development) to be conducted in person at Genspace. As noted in the timeline, orientation at Genspace is scheduled for May 28 as a new community member, a fixed date independent of Twist delivery. In preparation for this work, a new working group has formed around the project: TA and mentor Yehuda Binik and HTGAA 2025 cohort participant David Chau have been meeting weekly (two-plus hours each Sunday session), focused on the project design, the protocol roadmap, and the longer-term Aim 3 — Learn & Refine (Visionary) vision of a shared collaborative biosensor experience paired with the Photoplasm light exposure unit. The protocol in the appendix is directly tied to the timeline and dependent on receipt of the first clonal DNA as its primary trigger.
A key consideration is that successful delivery of the construct is the starting point, not the finish line: there are multiple days and sequential steps of early validation required before the light projection system is engaged at all. The Minimum Viable Functional Prototype (MVFP) — a simple bench-top 470 nm blue light induction test on two culture tubes, one illuminated and one dark — must be completed and pass before any Photoplasm device work begins. This deliberate separation of construct validation from device operation ensures that if BioLightV5 does not fluoresce as expected, the fallback pDawn-sfGFP protocol can be activated without having committed time and resources to the full Photoplasm exposure workflow.
A second in-person visit to Genspace is planned following the initial round of experimental lab work, once the baseline MVFP has been established and a verified, transformed bacterial source is ready for exposure. At that point, I will travel to Genspace with the Photoplasm hardware and software — bringing the light projection system to the certified wetlab environment for the first time and beginning integrated image exposure work under institutional BSL-1 oversight. This visit is one node in a deliberately parallel model: while wet-lab milestones advance at Genspace, the MakerSpace Charlotte BioArt Lab continues building, testing, and refining the Photoplasm hardware and software in parallel — learning together across two sites simultaneously. This distributed, connected approach is not incidental to the project — it is the proof-of-concept for Aim 3 — Learn & Refine (Visionary). The Genspace ↔ MakerSpace Charlotte working collaboration, anchored by weekly sessions with Yehuda Binik, David Chau, and the broader Genspace community, is the first instance of the shared, multi-node biosensor experience that Aim 3 proposes to scale. When the first creative light mask is exposed onto a living bacterial substrate at Genspace, the MakerSpace Charlotte BioArt Lab will be ready to replicate that experience — and the distributed, connected model will have its first proof point.
This collaboration also reflects MakerSpace Charlotte BioArt Lab’s longer-term aspiration to become a recognized HTGAA Node — supported by Eric Schneider (HTGAA 2026) in a future Teaching Assistant role if deemed applicable by the program — and ultimately to achieve MTA-certified wetlab status with a direct pathway to cell-free protein synthesis via Ginkgo Bioworks, positioning the lab as a fully credentialed community node in the distributed BioLight network.
The Photoplasm device represents the novel hardware contribution of this project to the broader synthetic biology community — a purpose-built bio-imaging instrument that reimagines the photographic darkroom enlarger as an open-source, community-deployable optogenetic exposure platform. All physical components were designed in Fusion 360 and fabricated in PETG on a Bambu X1 Carbon at my design studio and MakerSpace Charlotte.
Hardware build — completed components:
Dark chamber frustum cone (256 mm height, 51 mm ID top, 152 mm OD base)
Plate heater / incubation controller — PTCYIDU PTC element, DS18B20 temperature probe, IRLZ44N MOSFET on GPIO13, 37°C setpoint, variable and tunable — maintains bacterial culture temperature throughout the exposure window without removing the plate from the dark chamber
Raspberry Pi 5 — all pin assignments locked (PWM GPIO18, OLED SPI, AS7341 I2C, shutdown GPIO21)
470 nm LED ring — EBOOT array (current); Cree XP-E2 upgrade in queue as prerequisite for Aim 2
PWM/MOSFET driver (IRLZ44N, GPIO18) — verified working on breadboard
Light collimator and focusing lens — sharp image projection confirmed at f/8, ~10 inches from nodal point
OLED SSD1309 digital image mask — operational, projection confirmed sharp at plate height
(Raspberry Pi Camera Module — in queue, to be installed during Aim 2)
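The control law for the plate heater is not spelled out in the component list. As a minimal sketch, assuming a simple on/off (hysteresis) scheme around the 37°C setpoint, the decision logic could look like the following. The function name, the 0.5°C band, and the hardware-free structure are illustrative assumptions rather than the Photoplasm firmware; reading the DS18B20 and switching the MOSFET on GPIO13 are deliberately left out.

```python
def heater_command(temp_c, heater_on, setpoint_c=37.0, band_c=0.5):
    """Decide whether the heater should be on (hypothetical hysteresis rule).

    temp_c     -- latest probe temperature in degrees Celsius
    heater_on  -- current heater state (True = on)
    setpoint_c -- target temperature (37 C in the text, but tunable)
    band_c     -- assumed half-width of the dead band around the setpoint
    """
    if temp_c <= setpoint_c - band_c:
        return True           # too cold: switch (or keep) the heater on
    if temp_c >= setpoint_c + band_c:
        return False          # too warm: switch (or keep) the heater off
    return heater_on          # inside the dead band: hold the current state


if __name__ == "__main__":
    state = False
    # Illustrative temperature readings only, not measured data.
    for reading in (35.0, 36.8, 37.6, 37.2, 36.4):
        state = heater_command(reading, state)
        print(f"{reading:.1f} C -> heater {'ON' if state else 'OFF'}")
```

The dead band keeps the MOSFET from chattering when the probe reading hovers near the setpoint; a PID loop would be the natural next step if tighter regulation were needed.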
Calibration findings — completed:
EBOOT LED ring measured 2.0 µW/cm² at substrate plane — 50× below eLightOn activation threshold; quantitative basis for Cree XP-E2 upgrade decision
Three-state OLED transmission test — initially showed 99.9% optically neutral; identified as measurement artifact (LED + OLED emission not separated in original test); resolved by the corrected cal02 three-state protocol described below
Pie-wedge step-wedge — AS7341 sensitivity sufficient to register mask spatial variation as irradiance change; retired for calibration; preserved for biological plate exposure work
PWM sweep — 100 levels downsampled to 16 standardized steps; clean dose curve established across full operating range
Bayer dither densitometer: 16-step sweep confirmed logarithmic H&D response, with steep toe (0–25%), linear zone (25–75%), and shoulder plateau (75–100%); log fit F2+F3 ≈ 138 + 22.5 × ln(density + 1), R² = 0.968 (Figure 5.1)
OLED 470 nm emission discovery — OLED emits 470 nm light proportional to pixel density, +58.7% contribution across full density range — additive light source, not neutral mask; motivates ILI9341 transmissive LCD upgrade (Figure 5.2)
Sharp image projection confirmed at f/8 — optimal balance of depth of field and irradiance for variable agarose slab surface topology
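The logarithmic fit reported for the Bayer dither densitometer can be evaluated directly. The sketch below assumes that `density` in the fit is the pixel density in percent (0–100) and that the 16 steps are evenly spaced; neither is stated explicitly in the findings above, and the function names are hypothetical.

```python
import math

# Coefficients quoted in the calibration findings:
# F2+F3 ~= 138 + 22.5 * ln(density + 1)
FIT_OFFSET = 138.0
FIT_SLOPE = 22.5


def fitted_counts(density_pct):
    """Evaluate the reported log fit at a pixel density given in percent (assumed unit)."""
    if density_pct < 0:
        raise ValueError("density must be non-negative")
    return FIT_OFFSET + FIT_SLOPE * math.log(density_pct + 1.0)


def sixteen_step_densities():
    """Evenly spaced 16-step sweep from 0% to 100% (assumed spacing)."""
    return [100.0 * i / 15 for i in range(16)]


if __name__ == "__main__":
    for d in sixteen_step_densities():
        print(f"density {d:6.2f}% -> fitted F2+F3 {fitted_counts(d):6.1f} counts")
```

Under these assumptions the curve rises steeply over the first few steps and flattens toward full density, which is the toe-to-shoulder shape described in the finding.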
Detailed cal02 three-state analysis — quantifying the OLED additive limitation
The OLED 470 nm emission finding referenced in Figure 5.2 was characterized in detail through a corrected three-state protocol (photoplasm_cal02.py), which addressed the measurement artifact in the original three-state OLED transmission test. The cal02 protocol records the AS7341 response under three sequential optical configurations: LED ring alone with no OLED in the optical path (S1), OLED present with all pixels driven white (S2), and OLED present with all pixels off (S3). From these three readings, three transmission/attenuation quantities are derived.
Spatial Light Modulator (SLM): the addressable optical component that controls where light passes through to the sample plane. The SLM sits between the light source and the sample, acting as a digital image mask — each pixel either passes light through, blocks it, or attenuates it by some intermediate amount. This is what enables Photoplasm to project patterns onto a bacterial lawn rather than illuminate it uniformly, and is the conceptual analog of a photographic negative in darkroom enlargement. Common SLM technologies include transmissive LCDs (the planned Aim 2 component), reflective LCoS panels, digital micromirror devices (DMDs, as in DLP projectors), and — in Photoplasm Alpha — a transparent OLED. SLMs may operate subtractively (blocking incident light, the photographic norm) or additively (emitting their own light, the OLED case); this distinction is central to the Aim 2 component swap.
| Quantity | Measured value | Interpretation |
|---|---|---|
| s1_no_oled_f2f3 | 390 counts | LED ring direct, no SLM |
| s2_oled_white_f2f3 | 240 counts | LED + OLED pixels on |
| s3_oled_off_f2f3 | 350 counts | LED + OLED pixels off |
| oled_transmission_pct | 61.5% | S2/S1 |
| glass_transmission_pct | 89.7% | S3/S1 (substrate only) |
| pixel_attenuation_pct | 145.8% | S3/S2 (inverted ratio metric) |
The OLED’s glass substrate is acceptably transparent (~90% transmission, consistent with optical-grade glass plus a thin organic stack). The pixel layer, however, adds light to the optical path when driven white rather than attenuating it. The 145.8% pixel attenuation value — over 100% — is the quantitative signature of this additive behavior: white pixels emit their own 470 nm photons that sum with transmitted LED light at the sensor. This is the cal02-derived counterpart to the +58.7% densitometer finding (Figure 5.2); both metrics quantify the same physics from different measurement geometries.
The contrast ratio implication is severe. With S3 = 350 counts (pixels off, “maximum exposure”) and S2 = 240 counts (pixels on, “minimum exposure”), the OLED’s effective contrast ratio is 350:240 ≈ 1.46:1, or 0.69:1 if measured in the conventional direction (closed/open). In log-exposure units, the available modulation range is log₁₀(1.46) ≈ 0.16 log units — roughly an eighth of the 1.3 log units typically required to capture a complete photographic H&D curve (toe + linear + shoulder). This is not a calibration artifact: it reproduces the established physics of OLED displays (each pixel is an electroluminescent emitter) and confirms that the SSD1309, however convenient for prototyping, is architecturally unsuited to act as a subtractive density mask in a photographic-sensitometric exposure unit.
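The derived quantities and the contrast figures quoted above follow from the three raw readings by simple ratios. The sketch below reproduces that arithmetic; it is not the contents of photoplasm_cal02.py. In particular, treating the pixel attenuation metric as S3/S2 is an inference from the reported numbers (350/240 = 145.8%), and the function name is hypothetical.

```python
import math


def cal02_metrics(s1, s2, s3):
    """Derive transmission and contrast figures from the three cal02 readings.

    s1 -- LED ring alone, no OLED in the path (counts)
    s2 -- OLED present, all pixels driven white (counts)
    s3 -- OLED present, all pixels off (counts)
    """
    return {
        "oled_transmission_pct": 100.0 * s2 / s1,
        "glass_transmission_pct": 100.0 * s3 / s1,
        # Assumed definition: ratio of pixels-off to pixels-on reading.
        "pixel_attenuation_pct": 100.0 * s3 / s2,
        "contrast_ratio": s3 / s2,
        "log_range": math.log10(s3 / s2),
    }


if __name__ == "__main__":
    for name, value in cal02_metrics(390, 240, 350).items():
        print(f"{name:24s} {value:8.3f}")
```

Running it with the tabulated readings (390, 240, 350) gives 61.5%, 89.7%, and 145.8% for the three percentages, a contrast ratio of about 1.46:1, and a modulation range of about 0.16 log units.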
Figure 5.7. cal02 three-state measurement, April 28 2026. The S2 < S3 relationship is the quantitative signature of OLED additive emission: white pixels add 470 nm photons but block more LED light than they emit, producing net attenuation; pixels-off blocks less and produces higher net throughput. This inversion is what motivates the LCD substitution in Aim 2 and is the cal02-derived counterpart to the densitometer finding in Figure 5.2.
Baseline vs. Aim 2 light source — paired comparison against the BioLight V5 MED
The Alpha LED ring delivers 240 counts net through the OLED at the sample plane (cal02 S2 measurement). Mapped to irradiance via the placeholder coefficient Kc = 1.0, this is consistent with the calibration result of 2.0 µW/cm² noted above. The BioLight V5 minimum effective dose (MED) for construct activation corresponds to approximately 300 counts in the current AS7341 256X-gain scale. This value is anchored on the upstream eLightOn precedent (Li et al. 2020, R1) and the project’s stated ≥100 µW/cm² irradiance gate, pending in-house characterization at Genspace.
The Alpha LED+OLED system, in its working configuration, falls below MED. Even at maximum drive (PWM 100%), the combined optical losses prevent reliable construct activation. The Cree XP-E2 3-up star array is specified for 10–20× higher radiant flux than the EBOOT 5mm array at the substrate plane, tighter spectral binning around 470 nm, and more directional emission. Projected raw output at the sensor is roughly 2,300 counts before SLM losses, providing substantial headroom above MED even after the LCD transmission penalty.
Figure 5.8. Light source comparison with BioLight V5 MED reference. The Cree+LCD net output (~460 counts) provides ~50% headroom above the MED threshold, whereas the EBOOT+OLED baseline (240 counts) falls below it — meaning the Alpha system cannot reliably engage the construct even at maximum drive.
Light source diagnostic — raw vs. net output ranges
Figure 5.8’s grouped bars convey the four data points but compress the relationship between raw source output and net delivery through the SLM. Re-plotting each system as a paired range (raw, then net) makes the magnitude of SLM transmission loss visible at a glance while preserving the MED reference for direct comparison. This view also surfaces a measurement protocol implication: the Cree raw output (~2,300 counts) substantially exceeds the AS7341’s saturation point at 256X gain (~726 counts), meaning Aim 2 raw characterization must use a lower gain setting and normalize the result back to 256X-equivalent counts for valid comparison against the Alpha baseline.
Figure 5.9. Direct light source comparison expressed as raw-vs-net output ranges at fixed gain. The SLM transmission percentages (61.5% for the baseline OLED, 20.0% for the Aim 2 LCD) quantify the photon loss between each light source’s raw output and what actually reaches the sample plane. Although the LCD transmits roughly 3× less light than the OLED, the Cree’s much higher raw output more than compensates: the Aim 2 net (~460) lands above MED while the Baseline net (240) lands below.
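The raw-to-net relationship behind Figures 5.8 and 5.9 is a single multiplication per system. The sketch below restates it using the counts and transmission values quoted in the text; the Cree and LCD numbers are projections, not measurements, and the helper names are hypothetical.

```python
MED_COUNTS = 300.0  # BioLight V5 MED reference on the 256X-gain scale (from the text)


def net_counts(raw_counts, slm_transmission):
    """Counts reaching the sample plane after the SLM."""
    return raw_counts * slm_transmission


def headroom_pct(net, med=MED_COUNTS):
    """Percent above (positive) or below (negative) the MED reference."""
    return 100.0 * (net / med - 1.0)


# (raw counts, SLM transmission) per system; the Aim 2 values are projected.
SYSTEMS = {
    "EBOOT + OLED (Alpha baseline)": (390.0, 0.615),
    "Cree + LCD (Aim 2, projected)": (2300.0, 0.20),
}

if __name__ == "__main__":
    for name, (raw, transmission) in SYSTEMS.items():
        net = net_counts(raw, transmission)
        print(f"{name}: raw {raw:.0f} -> net {net:.0f} "
              f"({headroom_pct(net):+.0f}% vs MED)")
```

With these inputs the baseline lands about 20% under the MED reference and the projected Aim 2 stack about 53% over it, matching the "~50% headroom" stated in the Figure 5.8 caption.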
Baseline vs. Aim 2 image mask — paired comparison against the BioLight V5 MED
The OLED’s additive emission produces three coupled failures: inverted modulation direction (increasing pixel density increases sensor reading rather than decreasing it, the opposite of photographic density behavior); compressed modulation range (only 0.16 log units of D-log E swing, against a requirement of roughly 1.0+ log units for sensitometric characterization); and MED-bracketing ambiguity (the OLED’s modulation range of 240–350 counts sits tightly around the BioLight V5 MED of ~300 counts instead of spanning it with margin on both sides, so every density step produces ambiguous biology: neither a clean dark control nor a strong activation).
The ILI9341 LCD with backlight removed operates as a true subtractive shutter array. Blue-subpixel “on” passes light through the polarizer stack at panel transmission efficiency (~20% projected for blue channel through full polarizer + color filter stack); blue-subpixel “off” blocks light to the panel’s contrast floor (typically ~10:1 ratio for this class of LCD). Modulation runs in the correct direction (more pixels on = more light transmitted), with sufficient range to straddle MED with toe-region headroom on both sides. The resolution change from 128×64 monochrome (8,192 pixels) to 320×240 RGB (76,800 pixels, each with its own addressable blue subpixel) further enables an 8×8 Bayer dither protocol with 64 distinct density levels, against the Alpha’s 16-level 4×4 dither. This provides 4× more density steps per H&D characterization sweep, sufficient to resolve toe and shoulder curvature rather than jumping past them.
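The dither protocol itself is not listed in this section. As a hedged sketch, assuming the standard recursive Bayer (ordered dither) threshold matrix, the 4×4 and 8×8 tiles and their density levels can be generated as follows. The function names are hypothetical, and how Photoplasm tiles the pattern across the panel is not shown.

```python
def bayer_matrix(n):
    """Return the n x n Bayer threshold matrix (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    if n == 1:
        return [[0]]
    half = bayer_matrix(n // 2)
    # Standard recursion: [[4M, 4M+2], [4M+3, 4M+1]]
    top = [[4 * v for v in row] + [4 * v + 2 for v in row] for row in half]
    bottom = [[4 * v + 3 for v in row] + [4 * v + 1 for v in row] for row in half]
    return top + bottom


def dither_tile(n, level):
    """Binary n x n tile with exactly `level` pixels switched on (0..n*n)."""
    return [[1 if v < level else 0 for v in row] for row in bayer_matrix(n)]


if __name__ == "__main__":
    for row in bayer_matrix(4):
        print(row)
    lit = sum(map(sum, dither_tile(8, 20)))
    print(f"8x8 tile at level 20 lights {lit} of 64 pixels")
```

Because each threshold value appears exactly once, stepping `level` from 1 to n² turns on one more pixel per step: 16 steps for the 4×4 tile and 64 for the 8×8 tile.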
Figure 5.10. SLM modulation comparison. The OLED’s range (240–350) brackets the MED rather than straddling it, producing biology that is neither reliably dark nor reliably activated. The LCD’s projected range (46–460) clears MED in the open state and falls well below it in the closed state, enabling true toe-to-shoulder H&D characterization.
Image mask diagnostic — modulation range against the MED reference
The contrast between the two SLM options is clearest when each is plotted as a single vertical range — endpoints, midpoint, and span all visible against the MED reference. Unlike the light source case in Figure 5.9, the SLM measurements all sit comfortably within the AS7341 working range at 256X gain (noise floor ~55, saturation ~726), so no gain normalization is required for SLM characterization. The 512X gain setting remains available as a 2× amplification lever if measured LCD output falls short of projection.
Figure 5.11. Direct SLM comparison expressed as modulation ranges at fixed gain. The OLED’s 0.16-log range fails to provide either a clean below-MED dark control or a clean above-MED activation state. The LCD’s projected 1.00-log range provides both, with the MED line falling near the middle of the operating range — the natural set-point for stepwedge characterization.
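The modulation ranges in Figures 5.10 and 5.11 can be restated in log units relative to the MED reference. The sketch below uses the endpoint counts quoted in the captions (the LCD endpoints are projections); the function name and the idea of reporting separate margins above and below MED are illustrative assumptions.

```python
import math


def modulation_summary(low_counts, high_counts, med_counts=300.0):
    """Log10 modulation range of an SLM and its margins on either side of MED."""
    return {
        "log_range": math.log10(high_counts / low_counts),
        "log_margin_below_med": math.log10(med_counts / low_counts),
        "log_margin_above_med": math.log10(high_counts / med_counts),
    }


if __name__ == "__main__":
    slms = {
        "OLED (measured)": (240.0, 350.0),
        "LCD (projected)": (46.0, 460.0),
    }
    for name, (low, high) in slms.items():
        s = modulation_summary(low, high)
        print(f"{name}: range {s['log_range']:.2f} log units, "
              f"{s['log_margin_below_med']:.2f} below MED, "
              f"{s['log_margin_above_med']:.2f} above MED")
```

The OLED leaves less than 0.1 log units on either side of MED, while the projected LCD range leaves roughly 0.8 log units below and 0.2 above.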
Gain selection diagnostic — Aim 2 measurement protocol
Figure 5.9 identified the measurement-protocol problem (Cree raw saturates the AS7341 at 256X gain) but did not resolve it. The AS7341’s gain ladder is multiplicative — each setting halves or doubles the analog amplification before the ADC — so the projected Cree raw counts at any candidate gain can be computed as Cree_counts(gain) = 2,300 × (gain / 256). Plotting this across the full gain ladder against the sensor’s working bounds (55-count noise floor, 726-count saturation ceiling) identifies the optimal operating window of usable gains: 8X, 16X, 32X, and 64X.
Figure 5.12. Gain selection diagnostic for Cree raw characterization. Each AS7341 gain setting is plotted against the projected counts for the Cree raw measurement, scaled from the 2,300-count reference at 256X-equivalent. The four working-range settings (8X = 72 counts, 16X = 144, 32X = 287, 64X = 575) provide candidate operating points; 32X is recommended as the balanced choice. All raw measurements at the chosen gain will be normalized back to 256X-equivalent counts for direct comparison against the Alpha cal02 dataset and the BioLight V5 MED reference.
Gain selection justification (the trade-off at 32X). Lower gain reduces analog amplification noise — gain amplifies signal and noise together, so a lower-gain reading of the same photon flux is intrinsically cleaner. This favors the lowest gain that still produces a readable signal. However, lower gain also produces fewer ADC counts per measurement, which means each count represents a larger fraction of the total signal — coarser quantization. At 8X (~72 counts) the signal occupies only ~10% of the sensor’s working range; quantization error becomes a meaningful fraction of the measurement. At 64X (~575 counts) the signal occupies ~80% of the working range with fine quantization, but the higher gain amplifies thermal and read noise more aggressively. At 32X (~287 counts) the signal occupies ~40% of the working range — deep enough into the ADC for clean quantization while keeping noise amplification minimal. A secondary consideration favors 32X: the normalization factor from measured gain back to 256X-equivalent is 8× at 32X versus 16× at 16X, so any measurement error at 32X is amplified less aggressively during normalization, preserving comparability with the Alpha cal02 baseline.
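The gain-window calculation and the normalization step described above can be sketched in a few lines. The gain ladder listed is the AS7341's 0.5X–512X range; the noise floor, saturation ceiling, and 2,300-count reference are the values quoted in the text, and the function names are hypothetical.

```python
AS7341_GAINS = [0.5, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
REFERENCE_GAIN = 256      # gain used for the Alpha cal02 dataset
NOISE_FLOOR = 55          # working-range lower bound, counts
SATURATION = 726          # working-range upper bound, counts


def projected_counts(counts_at_reference, gain):
    """Scale a 256X-equivalent reading to another gain setting."""
    return counts_at_reference * gain / REFERENCE_GAIN


def usable_gains(counts_at_reference):
    """Gain settings whose projected reading stays inside the working range."""
    return [g for g in AS7341_GAINS
            if NOISE_FLOOR <= projected_counts(counts_at_reference, g) <= SATURATION]


def normalize_to_reference(measured_counts, gain):
    """Convert a reading taken at `gain` back to 256X-equivalent counts."""
    return measured_counts * REFERENCE_GAIN / gain


if __name__ == "__main__":
    cree_raw = 2300  # projected Cree raw output, 256X-equivalent counts
    for g in usable_gains(cree_raw):
        counts = projected_counts(cree_raw, g)
        print(f"{g}X: {counts:.0f} counts "
              f"({100 * counts / SATURATION:.0f}% of ceiling, "
              f"x{REFERENCE_GAIN / g:.0f} normalization)")
```

For the 2,300-count projection this returns the 8X, 16X, 32X, and 64X settings, and a 32X reading of 287.5 counts normalizes back to 2,300.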
Expected Results
The Cree XP-E2 LED upgrade — delivering 10–20× higher output than the EBOOT array with a tighter 470 nm wavelength specification — is expected to clear the ≥100 µW/cm² irradiance gate at the substrate plane, bringing Photoplasm fully within the BioLightV5 exposure specification and enabling the first biological exposure runs at Genspace. With the ILI9341 transmissive LCD replacing the OLED as a true non-emissive mask, the optical stack will be free of the 470 nm emission confound discovered in calibration — producing a clean, controlled exposure field. The Raspberry Pi Camera Module will be installed during Aim 2, providing real-time machine vision feedback during duty cycle exposure runs.
In my view, Photoplasm is the novel contribution of this project to the greater synthetic biology community. As stated in the final presentation slide deck, it is unique, accessible, modifiable — hackable — and released as a fully open-source contribution under the MIT License attribution model, with detailed documentation and branching version control via GitHub to invite community innovation and rapid improvement. It is built to the same philosophy stated in the abstract: not in isolation from the research community, but in direct collaboration with it. Observing the 2026 HTGAA Final Project presentations confirmed what the design already anticipated — strong community interest in spatial modifiers, biosensors, and light-wavelength triggers across a wide range of applications.
Photoplasm is preparing to serve that interest: the current 470 nm blue light ring is the first in a planned expansion toward full RGB and color spectrometry capability, enabling a wider range of optogenetic systems to be profiled on the same open platform, the same way photographic film characteristic curves were measured across emulsions. From art and design to materials science, therapeutics, and environmental sensing, an analog light source paired with a digital image mask will be an invaluable bench tool at scale. Many as-yet-undocumented tangential experiments and discoveries are likely to emerge as communities learn, build, calibrate, and deploy the device, and that possibility is not incidental to the project. It is the point.
Figures referenced:
Photoplasm parts guide (refer to full build documentation and Github repository)
AS7341 calibration data — Bayer dither H&D curve (Figure 5.1 — attach on submission)
OLED emission discovery plot (Figure 5.2 — attach on submission)
cal02 three-state OLED analysis (Figure 5.7)
Light source baseline vs. Aim 2 comparison (Figures 5.8, 5.9)
Image mask baseline vs. Aim 2 comparison (Figures 5.10, 5.11)
Gain selection diagnostic for Aim 2 Cree characterization (Figure 5.12)
What I expect to happen is straightforward, and I say that with full appreciation of how much work it represents: a Twist order will be received, and I will be onboarded to Genspace through hands-on lab activities following a carefully designed and executed protocol. The entire project will become real the moment I see the first indication of bacterial transformation, verified by a simple 470 nm dark box exposure that confirms BioLightV5 is doing exactly what the design predicted. That moment, a glowing tube held under blue light in a dark room, is the proof point that fourteen weeks of literature review, construct design, hardware build, and calibration work have been pointing toward.
The work unfolds in three rounds:
Round 1 — Establish the baseline (Genspace Lab Block A, May 29 – June 1)
The following steps are drawn from the Aim 2 protocol currently under review — to be finalized at Genspace orientation on May 28, 2026. Specific methods, equipment access, and sequencing approach will be confirmed at that time.
Twist DNA received, resuspended, and gel-verified at P1
DH5α transformation, LB+Amp plating, colony picking under red safelight at P2
Miniprep, sequence verification (method confirmed at orientation May 28), glycerol stock banking per Genspace BSL-1 SOP at P3
MVFP induction test — 470 nm exposed tube vs dark control, sfGFP emission confirmed visually and by AS7341 sensor
Expected result: measurable fluorescence in light tube, minimal signal in dark control (Figure 5.3 — pending)
Agarose slab casting from verified working stock (OD₆₀₀ measured, low-melt agarose at 42–45°C)
Timed duty cycle exposure through OLED digital image mask at calibrated f/8 aperture
Three planned exposures: full step-wedge for H&D curve · Photoplasm logo binary test print · one original image mask from the 12-piece Photoplasm Art Gallery series
AS7341 time-series data exported to CSV, fluorescence plotted vs. logarithmic light exposure
Raspberry Pi Camera Module installed and providing real-time machine vision feedback during duty cycle
Re-run cal02 protocol on Aim 2 hardware using gain-stepped acquisition per Figure 5.12 (32X for S1 raw to avoid saturation, 256X for S2/S3 net); normalize raw values to 256X-equivalent counts before comparison against the Alpha cal02 baseline
Expected result: calibrated bacterial H&D curve with toe, linear, and shoulder regions documented (Figure 5.4 — pending); at least one bacteriograph demonstrating spatial sfGFP patterning from the OLED image mask (Figures 5.5, 5.6 — pending)
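The CSV export and log-exposure step in the list above could be organized as in the sketch below. It assumes that delivered dose is irradiance × duty cycle × exposure time; the column names and function names are hypothetical, and the sample rows in the usage section are placeholders, not measurements.

```python
import csv
import io
import math


def exposure_rows(samples, irradiance_uw_cm2):
    """Attach log10 dose to each (duty_pct, seconds, counts) sample.

    Dose is modelled as irradiance * duty * time, in uJ/cm^2 (assumed model).
    """
    rows = []
    for duty_pct, seconds, counts in samples:
        dose = irradiance_uw_cm2 * (duty_pct / 100.0) * seconds
        rows.append((duty_pct, seconds, round(math.log10(dose), 3), counts))
    return rows


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["duty_pct", "seconds", "log10_dose_uj_cm2", "fluorescence_counts"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder samples only: fluorescence counts are left at 0 until
    # real AS7341 readings exist.
    placeholder = [(25, 60, 0), (50, 60, 0), (100, 60, 0)]
    print(to_csv(exposure_rows(placeholder, irradiance_uw_cm2=100.0)))
```

Plotting `fluorescence_counts` against `log10_dose_uj_cm2` from such a file yields the H&D-style curve described as the Aim 2 deliverable.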
Round 3 is less a protocol and more a curriculum — a shift from validation to creative and scientific expression. Participants will build, test, and design the next level of light-reactive exposures on biosensor circuits of their choosing, using the Photoplasm platform as the shared instrument and the bacterial H&D curve as the calibrated reference.
Genspace ↔ MakerSpace Charlotte parallel build sessions ongoing
Photoplasm hardware and software open-sourced on GitHub under MIT License
Transfyr.ai observational data model capturing participant engagement and experimental outcomes
Machine vision self-correction data accumulating as the foundational training dataset for the Aim 3 fleet-level neural network
MakerSpace Charlotte BioArt Lab advancing toward HTGAA Node authorization and Addgene MTA-ready status
Ginkgo Bioworks CFPS pathway under development — direct line from community lab to cell-free biosensor consumable
Figures — Master List
Confirmed figures — exist or captured
| Figure | Description | Source | Status |
|---|---|---|---|
| 2.1 | BioLightV5 in Benchling: construct map | Benchling screenshot | ✅ exists |
| 3.1 | ChimeraX: dark-state RsLOV dimer, PDB 4HJ4, slate gray, yellow FMN, red Cys55, 4.324 Å | ChimeraX render | ✅ exists |
| 3.2 | ChimeraX: dark-state with teal-green DNA helix, pColE408 operator | | |
| | Baseline vs. Aim 2 light source comparison with BioLight V5 MED reference | Python plot | ✅ exists |
| 5.9 | Light source raw-vs-net diagnostic with SLM transmission percentages | Python plot | ✅ exists |
| 5.10 | OLED vs. LCD modulation comparison: shutters open and closed states | Python plot | ✅ exists |
| 5.11 | SLM modulation range diagnostic: OLED brackets MED, LCD straddles MED | Python plot | ✅ exists |
| 5.12 | Cree raw gain selection diagnostic: optimal operating window at 32X | Python plot | ✅ exists |
Pending figures — wet-lab and submission
| Figure | Description | Source | Status |
|---|---|---|---|
| 5.3 | MVFP result: light tube vs dark control fluorescence | Photo / AS7341 | ⚠️ pending May 29 |
| 5.4 | Bacterial H&D curve: fluorescence vs log exposure | AS7341 CSV → Python | ⚠️ pending June 25 |
| 5.5 | First bacteriograph: spatially resolved sfGFP gradient | Transilluminator photo | ⚠️ pending June 23 |
| 5.6 | Photoplasm Art Gallery: one original image mask exposure | Transilluminator photo | ⚠️ pending June 23 |
Additional figures to be added in the v1.0 final pass as wet-lab work produces results.
Challenges
Designing a plasmid for the first time is an exciting challenge. With excellent coaching and mentoring, along with trying different constructs, making beginner mistakes, and researching papers and literature, it has opened up an entirely new language and domain. As an industrial designer and photographer, I have found the perfect blend of modalities to apply to an established scientific industry at a revolutionary moment in history, given the influence of AI, neural networks, generative algorithms, microcontrollers, and core synthetic bioscience. Each of these is inherently challenging; pulled together, they gain a more centralized focus and lead to innovation.
I embrace the challenge of learning a new domain, even with ten years of pharma and biotechnology industry experience and fifteen years of manufacturing experience, because I feel like I have reset the clock at the perfect time to present solutions that lead to deeper understanding, learning, and practical knowledge transfer. As I stated in my earliest opening ideas on these topics, I seek to provide engaging experiences to peers, colleagues, and learners. From a practical perspective, the advent of powerful models such as Claude has made it possible to pull together disparate pieces of information into meaningful, connected rapid prototypes that can be iterated and actualized, as shown by the results of this endeavour. I plan on analyzing all prompt turns to identify key themes that helped lead to this point, assessing what worked and what did not with the help of AI platforms of choice, and looking for trends such as confidence, accuracy, corrections, and reusable prompt libraries in support of the project.
Specific challenges encountered and how each was addressed:
Plasmid design from scratch (eLightOn not on Addgene). Unlike pDawn-sfGFP, which could be ordered directly, BioLightV5 had to be reconstructed from the Li 2020 paper and supplemental data: extracting protein sequences, converting to DNA via the IDT Codon Optimization Tool, and assembling the full circular plasmid in Benchling. Addressed: iterative rebuild across Benchling → Asimov Kernel → Benchling, with TA mentorship from Yehuda Binik (SBOL correction) and Anastasia Bernaz (spacer guidance). Three sequence issues (SD17 spacing, EcoRI/XhoI sites, neutral spacer) identified and resolved in BioLightV5 prior to Twist submission.
Twist order ORF/in-frame failures. The absence of restriction cut sites for future sfGFP replacement, combined with promoter and terminator in-frame dependencies, caused multiple Twist order validation failures. Addressed: iterative refinement between Benchling and Twist across multiple submission attempts, ultimately resolved in V5.
Asimov Kernel simulation dark-state gap. The circuit simulation produced an sfGFP expression spike without capturing dark-state repression — traced to the mammalian-derived model not simulating FMN chromophore photochemistry in an E. coli context. Addressed: documented as a known model artifact, not a construct flaw. The MVFP empirical validation at Genspace is the designed correction — wet-lab observation replaces what simulation cannot predict.
AlphaFold FMN photochemical gap. AlphaFold predicted a strong LexRO dimer fold but cannot simulate the FMN chromophore energy transfer that drives the dark/light transition. Addressed: ChimeraX exploration of PDB 4HJ4 confirmed the 4.324 Å Cys55–FMN distance and DNA binding/release geometry — providing mechanistic confidence the simulation could not.
EBOOT LED insufficient irradiance. Consumer-grade LED ring measured 2.0 µW/cm² at the substrate plane — 50× below the eLightOn activation threshold. Addressed: Cree XP-E2 LED upgrade ordered; irradiance gate (≥100 µW/cm² confirmed by AS7341 at plate height) defined as prerequisite before Aim 2 exposure work begins.
Pie-wedge step-wedge sensor geometry finding. The initial calibration approach using a projected pie-wedge mask caused the AS7341 to read spatial variation in the mask rather than uniform field irradiance — discovered during the April 27 calibration run. Addressed: pie-wedge retired for calibration; Bayer ordered dither pattern adopted as the calibration standard.
OLED 470 nm emission discovery. The OLED digital image mask was found to emit 470 nm light proportional to pixel density — +58.7% across the full density range — invalidating the earlier three-state transmission result. Addressed: documented as a key finding; the corrected cal02 three-state protocol quantified the additive behavior as a 145.8% pixel attenuation signature (Figure 5.7); ILI9341 transmissive LCD upgrade planned as a true non-emissive variable density mask.
Cree raw output exceeds sensor saturation at 256X gain. The Cree XP-E2 projected raw output (~2,300 counts at 256X-equivalent) substantially exceeds the AS7341’s saturation ceiling (~726 counts), preventing direct re-measurement of S1 at the same gain used for Alpha cal02. Addressed: gain-selection diagnostic developed (Figure 5.12) identifying 32X as the optimal balance of noise performance, ADC quantization precision, and normalization-error amplification; Aim 2 cal02 will use gain-stepped acquisition (32X for raw, 256X for net) and normalize raw values back to 256X-equivalent counts for direct comparison against the Alpha baseline.
Protocol dependency on Genspace confirmation. The Aim 2 wet-lab protocol is under review and will not be finalized until Genspace orientation on May 28 — meaning specific methods, equipment access, and sequencing approach remain TBD at time of submission. Addressed: protocol framed as a living document; orientation scheduled as a fixed anchor; pDawn-sfGFP backup protocol fully documented and ready to activate at any decision gate.
AI-assisted design — accuracy and verification. The use of generative AI tools across the design pipeline introduced a category of challenge unique to this moment in synthetic biology: distinguishing AI-generated approximations from verified biological facts. Addressed: every AI-assisted output was cross-referenced against primary literature, TA review, or experimental data. The prompt turn analysis planned as a post-project deliverable will formalize this verification workflow into a reusable methodology — contributing to the emerging practice of AI-assisted synthetic biology design as a documented, auditable process.
Section Six - References & Budget
Author: Eric Schneider · 2026a-eric-schneider
Node: Genspace NYC
Affiliation: BioArt Studio, MakerSpace Charlotte
Form Prompts
List all references cited in your project documentation.
Provide a budget for your project. Include all costs associated with your project, including materials, equipment, and any other expenses. If you have not yet incurred any costs, provide an estimated budget.
References
All references ordered by first appearance across Sections 1–5. See Appendix A (separate file: biolight_appendixA_optogenetic_systems.md) for optogenetic systems evaluation table and associated references R17–R21.
Section 2 — Aims
R1. Li X, Zhang C, Xu X, Miao J, Yao J, Liu R, Zhao Y, Chen X & Yang Y. A single-component light sensor system allows highly tunable and direct activation of gene expression in bacterial cells. Nucleic Acids Research 48(6):e33 (2020).
doi:10.1093/nar/gkaa044 · PMC7102963 · PMID 31989175
(eLightOn: BioLightV5 molecular ancestor. Primary citation.)
R2. Jin X & Riedel-Kruse IH. Biofilm Lithography enables high-resolution cell patterning via optogenetic adhesin expression. PNAS 115(14):3698–3703 (2018).
doi:10.1073/pnas.1720676115 · RRID:Addgene_107741
(pDawn-sfGFP control construct — Addgene #107741.)
Section 3 — Background
R3. Kuldell N, Bernstein R, Ingram K, Hart KM. BioBuilder: Synthetic Biology in the Lab.
O’Reilly Media (2015). ISBN 978-1491904299.
https://www.oreilly.com/library/view/biobuilder/9781491904299/
(Karen Ingram, co-author and scientific illustrator; introduced HTGAA 2026, leading to the BioLight project.)
R4. Levskaya A, Chevalier AA, Tabor JJ, Simpson ZB, Lavery LA, Levy M, Davidson EA, Scouras A, Ellington AD, Marcotte EM, Voigt CA. Synthetic biology: Engineering Escherichia coli to see light. Nature 438(7067):441–442 (2005).
doi:10.1038/nature04405 · PMID 16306980
(Foundational bacterial photography — Cph8 chimeric receptor. Primary citation 1.)
R6. Tabor JJ, Levskaya A, Voigt CA. Multichromatic Control of Gene Expression in Escherichia coli. J. Mol. Biol. 405(2):315–324 (2011).
doi:10.1016/j.jmb.2010.10.038
(Tabor Lab CcaS/CcaR multichromatic — inline reference.)
R7. Teller E (Astro Teller, Captain of Moonshots, X / Google X).
Quoted by Thomas L. Friedman in:
Thank You for Being Late: An Optimist’s Guide to Thriving in the Age of Accelerations.
Farrar, Straus and Giroux (2016). ISBN 978-0374273538.
(“Today is the slowest rate of change we will ever experience.”)
R9. Mace RL. Universal Design: Barrier-Free Environments for Everyone.
Designers West 33(1):147–152 (1985).
Center for Universal Design, North Carolina State University College of Design.
https://design.ncsu.edu/research/center-for-universal-design/
(Ron Mace coined Universal Design; mentor to Eric Schneider at NCSU.)
R10. The MIT License.
Massachusetts Institute of Technology.
https://opensource.org/licenses/MIT
(Open-source license governing Photoplasm hardware, software, and protocol releases.)
Section 4 — Experimental Design
R11. Tabor JJ. Plate-based assays for light-regulated gene expression systems. Methods in Enzymology 497:373–391 (2011).
doi:10.1016/B978-0-12-385075-1.00015-9
(Agarose slab embedding method — foundational protocol for bacteriography.)
R13. IDT Codon Optimization Tool.
Integrated DNA Technologies.
https://www.idtdna.com/CodonOpt(Used for RsLOV protein sequence → E. coli K12 codon-optimized DNA conversion during BioLightV5 design.)
R14. Addgene plasmid #107741 — pDawn-sfGFP.
Deposited by Ingmar Riedel-Kruse Lab. RRID:Addgene_107741.
https://www.addgene.org/107741/ (Control construct for Aim 2, Test & Analyze. See also R2.)
Section 5 — Results & Validation
R15. Hurter F & Driffield VC. Photochemical investigations and a new method of determination of the sensitiveness of photographic plates. Journal of the Society of Chemical Industry 9:455–469 (1890).
(H&D curve — foundational sensitometry. Conceptual framework for the bacterial H&D curve deliverable of Aim 2.)
R16. Li X, Zhang C, Xu X, Miao J, Yao J, Liu R, Zhao Y, Chen X & Yang Y. A single-component light sensor system allows highly tunable and direct activation of gene expression in bacterial cells. Nucleic Acids Research 48(6):e33 (2020).
doi:10.1093/nar/gkaa044
(Full author list verified against published article. See also R1.)
Appendix A References
R17–R21 listed in full in biolight_appendixA_optogenetic_systems.md.
| Ref | Citation | Role |
| --- | --- | --- |
| R17 | Olson EJ et al. Nature Methods 11(4):449 (2014) | CcaS/CcaR: evaluated, deselected |
| R18 | Jayaraman P et al. ACS Synth. Biol. 5(12):1363 (2016) | EL222: evaluated, deselected |
| R19 | Baumschlager A & Khammash M. Advanced Biology 5(5):2000256 (2021) | Review: candidate selection |
| R20 | Multamäki E et al. ACS Synth. Biol. 11(10):3354 (2022) | pREDawn: evaluated, deselected |
| R21 | Castillo-Hair SM et al. Nature Commun. 10:3099 (2019) | |
TBD = Genspace community lab membership — prorated community member rate, to be confirmed at orientation May 28, 2026. Several Genspace consumable line items (1.03–1.12) may be available from Genspace lab stock at reduced or no cost — to be confirmed at orientation. Separate optical component sourcing alternatives are deferred to Aim 3 — Learn & Refine (Visionary) as community-driven hardware innovation. All estimates subject to revision as actuals are received.
This is the living source of record for the project — hardware documentation, calibration scripts, the Quick Start Guide, and all supporting assets. Everything built during HTGAA 2026 is here, versioned and open.
Snapshot Notice
This document overview is a snapshot of working files in active development — for the latest versions, branch history, and releases, visit the public repository.
What the Repo Contains
Documentation (/docs)
The Quick Start Guide is structured as ten chapters and three appendices, all in Markdown.
| | File | Title |
| --- | --- | --- |
| Ch. 1 | photoplasm_ch01_ssh.md | SSH Setup & VS Code Remote Development |
| Ch. 2 | photoplasm_ch02_github.md | GitHub & Version Control |
| Ch. 3 | photoplasm_ch03_wavelength_sensor.md | AS7341 Wavelength Sensor |
| Ch. 4 | photoplasm_ch04_led_ring.md | LED Ring · 470nm PWM Control |
| Ch. 5 | photoplasm_ch05_oled_mask.md | OLED Digital Image Mask |
| Ch. 6 | photoplasm_ch06_heater_perfboard.md | Incubation Heater Perfboard |
| Ch. 7 | photoplasm_ch07_system_integration.md | System Integration |
| Ch. 8 | photoplasm_ch08_gui_flask.md | GUI / Flask Web Interface |
| Ch. 9 | photoplasm_ch09_spaceplacer.md | SpacePlacer |
| Ch. 10 | photoplasm_ch10_camera_module.md | Camera Module |
| App. A | appendix_A_calibration_protocol.md | Calibration Protocol |
| App. B | appendix_B_feature_specification.md | Feature Specification |
| App. C | appendix_C_pinout_NS-03_v8.md | Pinout NS-03 v8 |
Calibration Scripts (repo root)
Three Python scripts, all validated on hardware:
| Script | What it does |
| --- | --- |
| photoplasm_densitometer.py | 16-step Bayer ordered dither sweep: measures AS7341 response vs. OLED pixel density. Confirmed OLED optical neutrality at 470nm. |
| photoplasm_cal01.py | Cumulative pie-wedge step wedge: builds a 360° dose gradient across the plate for H&D curve construction. |
| photoplasm_cal02.py | Three-state irradiance calibration: display off / blank / all-white. |
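To illustrate the idea behind a 16-step Bayer ordered dither sweep, here is a minimal sketch. It is not taken from photoplasm_densitometer.py: the function names and the 128×64 display resolution are assumptions chosen for illustration. Each step lights the pixels whose Bayer threshold is below the step level, so step k lights k/16 of the display.

```python
# Standard 4x4 Bayer index matrix (values 0..15).
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def dither_frame(level, width=128, height=64):
    """Return a 1-bit frame (list of rows) with level/16 of the pixels lit.

    level: integer 0..16; a pixel is lit when its Bayer threshold < level.
    width, height: assumed OLED resolution (hypothetical; match your display).
    """
    if not 0 <= level <= 16:
        raise ValueError("level must be in 0..16")
    return [
        [1 if BAYER_4X4[y % 4][x % 4] < level else 0 for x in range(width)]
        for y in range(height)
    ]


def lit_fraction(frame):
    """Fraction of lit pixels in a frame."""
    total = sum(len(row) for row in frame)
    return sum(sum(row) for row in frame) / total


if __name__ == "__main__":
    # Sweep the 16 density steps; a sensor reading would be logged per step.
    for level in range(1, 17):
        print(f"step {level:2d}: lit fraction = {lit_fraction(dither_frame(level)):.4f}")
```

Plotting the sensor reading against the lit fraction for each step would give the response-versus-pixel-density curve that the densitometer script is described as measuring.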
Hardware Stack
The device is a 3D-printed cylindrical assembly. From illumination source to sensor:
The hardware is operational and calibration scripts are validated. Remaining work:
Calibrate Kc (irradiance coefficient) — required before biological stepwedge experiments
Bench-test heater PWM1 and DS18B20 temperature loop
Implement Flask GUI (Ch. 8) and camera module (Ch. 10)
Run the biological stepwedge — characterize the H&D dose-response curve across toe, linear, and shoulder regions
That curve is the goal: a quantitative map of how Photoplasm translates light dose into gene expression. Everything built so far is the instrument.
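As a rough illustration of how the linear region of an H&D-style curve could be summarized, the sketch below fits the slope (often called gamma in sensitometry) of response against log10 exposure by least squares. The function name, the choice of points, and the sample numbers are hypothetical; no measured Photoplasm data is used.

```python
import math


def hd_gamma(exposures, responses):
    """Least-squares slope of response versus log10(exposure).

    exposures: positive light doses for points judged to lie in the linear region.
    responses: matching output values (for example, normalized fluorescence).
    """
    if len(exposures) != len(responses) or len(exposures) < 2:
        raise ValueError("need at least two matching points")
    xs = [math.log10(e) for e in exposures]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(responses) / len(responses)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, responses))
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical points: response rises by 0.5 per tenfold increase in dose.
    print(hd_gamma([1, 10, 100], [0.2, 0.7, 1.2]))
```

The toe and shoulder would show up as points that fall below this fitted line at the low-dose and high-dose ends of the sweep.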
Built at Makerspace Charlotte BioArt Studio · HTGAA 2026 · Eric Schneider
Appendix-Archive
This section represents the original draft of the final project from April 14, 2026 - For Archival Purposes only
BioLight — Final Project Update (archival - see full documentation)
April 14, 2026 | HTGAA 2026 Individual Final Project
Short Final Project Description
My final project develops a light-responsive genetic circuit in E. coli that expresses fluorescent protein, using LED light to map projected photographic images to a biological substrate on agar plates.
Custom-built LED exposure hardware controls light exposure, activating the engineered biosensor to achieve high-resolution, wide-gamut images appearing through protein expression in transformed bacteria.
The resulting workflow will serve as a framework for community makerspace activities and a platform for ongoing optogenetic imaging research.
Project Aims
Aim 1 — Experimental
Engineer and validate a light-responsive fluorescent protein expression system in E. coli
Success measured by fidelity and tonal resolution of the expressed fluorescent image relative to the projected visual image
Aim 2 — Development
Translate the validated bio-circuit into an integrated imaging platform
Custom LED exposure hardware, 3D printed components, and software protocols
Connect analog light to digital tools, back to biological output
Explore how a cell-free system and automated lab production could increase productivity
Custom-design and build of light projection system including:
Raspberry Pi 5 as the primary controller
LED light array for controlled blue light exposure
Wavelength sensor for real-time spectral verification
OpenCV machine vision algorithms for luminosity measurement
Environmental sensors including temperature monitoring
Cycle timer to regulate and automate exposure sequences
Aim 3 — Visionary
Establish a framework for experiential learning in synthetic biology within community makerspaces
Long-term extension into machine vision interpretation of biosensor expression patterns
LLM and neural network integration for image recognition and biosensor pattern analysis
Aim 1
Aim 1a — pBioLight x2 (primary)
pBioLight-1B-eLightOn-v1, designated pBioLight x2, is the primary construct for Aim 1a and the fastest path to first image. It is a 2,201 bp circular single-plasmid system designed in Benchling and ordered via Twist Bioscience clonal gene synthesis in a pUC19 backbone with AmpR selection. The eLightOn system uses a LexA408 DNA binding domain fused to RsLOV, a light-oxygen-voltage domain that undergoes a conformational change upon 450nm blue light activation, releasing repression of the pColE408 promoter and driving sfGFP expression.
No external reagents required — the system uses FMN, a molecule E. coli naturally produces, as its light-sensing cofactor. This simplifies the workflow compared to systems like CcaS/CcaR that require externally supplied chromophores.
Restriction cut sites flanking sfGFP enable future color swapping without redesigning the full circuit, supporting expansion toward wide-gamut multi-color biological imaging through Aim 2 and beyond.
Appendix — Optogenetic Systems Evaluated
All systems below were evaluated for use in the BioLight platform. eLightOn was selected as the primary system for pBioLight x2. Systems marked with ★ remain viable parallel tracks.
| System | Light (nm) | Plasmids | Chromophore | Dynamic Range | Complexity | Status |
| --- | --- | --- | --- | --- | --- | --- |
| eLightOn | 450 blue | 1 | None (FMN) | ~10,000× | ★★ | Selected: pBioLight x2 |
| LEVI | 450 blue | 1 | None (FMN) | ~10,000× | ★★ | Deselected: equivalent dynamic range, less documented |
HTGAA Group Project: MS2 Bacteriophage L Protein Engineering
Date: March 31, 2026
Authored & Reviewed by:
2026a-john-adeyemo-adedeji
2026a-eric-schneider
2026a-albert-manrique
2026a-Tehseen Rubbab
2026a-brie-taylor
Introduction
This document represents the full scope of our Group Project activity within our Genspace Node.
“Group 2” was formed for the purpose of addressing Bacteriophage Final Project Goals for engineering the L Protein.
The group conducted an asynchronous brainstorming session, leading to a series of online meetings to further define the problem and focus area.
The actual brainstorming notes and meeting notes can be found in the appendix section.
Two individual pipelines were executed, and the results are shown, attributed to the individual researcher.
A final comparison table is provided to see the differing results.
Project Goal Summary
MS2 Bacteriophage L Protein Engineering — Group Project Summary
Our collaborative team effort led to strong findings.
Eric, Albert, Tehseen, and John each contributed complementary expertise (mechanistic hypothesis, structural modeling, sequencing validation, and experimental cross-referencing) that converged on two different candidates.
Tehseen guided the group's focus toward N-terminal Region 1, which we then evaluated through multiple pipelines.
From Eric, P13L cleared a series of computational and experimental gates.
John ran an extensive analysis pipeline and demonstrated clear differences in a table format.
Albert provided additional insights and highlighted potential pitfalls in prediction models, as noted in our brainstorming sessions.
Nice work to all!
Project Goal
Engineer the MS2 bacteriophage L lysis protein for increased lysis toxicity through computational mutation design, using structural stability as a required co-constraint. The project targeted Region 1 (N-terminal domain) as the primary site of intervention, based on the hypothesis that increasing cationic charge density in this region would enhance electrostatic membrane disruption and lytic potency.
Phase 1 — Sequence Retrieval and Structural Baseline
Retrieved the MS2 L protein sequence from UniProt. Confirmed working sequence matches homologs AEQ25570.1 / ACY07208.1. Ran BLAST against UniProtKB/Swiss-Prot and nr databases, retrieving 51 homologs across diverse phage strains for conservation analysis.
Two rounds of multiple sequence alignment were performed. The second run used the confirmed working sequence as reference, producing an accurate position-by-position conservation map across all 75 residues.
Key conservation findings (free zone aa 16-28):
| Position | WT residue | Symbol | Charge | Risk |
| --- | --- | --- | --- | --- |
| 18 | R | * | Positive | Avoid: fully conserved |
| 21 | P | * | Neutral | Avoid: fully conserved |
| 23 | K | * | Positive | Avoid: fully conserved |
| 25 | E | * | Negative | Avoid: fully conserved |
| 27 | Y | * | Neutral | Avoid: fully conserved |
| 28 | P | * | Neutral | Avoid: fully conserved |
| 26 | D | | Negative | Candidate: variable, +2 charge delta |
| 24 | H | | Mild+ | Candidate: variable |
| 13 | P | . | Neutral | Caution: weakly conserved |
Note: Positions 18-20 form a conserved RRR motif, confirming existing cationic character in the target region.
Phase 3 — AlphaFold-Multimer Oligomeric Modeling
The L protein functions as a homo-oligomer. AlphaFold-Multimer was run on the wildtype sequence across three copy numbers to identify the most confident assembly.
Wildtype oligomeric runs:
| Copies | ipTM | pTM | Assessment |
| --- | --- | --- | --- |
| 3 (trimer) | 0.28 | 0.35 | Below threshold |
| 4 (tetramer) | 0.32 | 0.37 | Below threshold |
| 5 (pentamer) | 0.32 | 0.37 | Below threshold |
All runs returned ipTM well below the 0.6 reliability threshold. AlphaFold-Multimer was retired as a primary tool for this protein due to known underrepresentation of small integral membrane proteins in training data.
Mutant pentamer runs (for comparison):
| Variant | Copies | ipTM | pTM | vs WT |
| --- | --- | --- | --- | --- |
| Wildtype | 5 | 0.32 | 0.37 | Reference |
| P13L | 5 | 0.23 | 0.29 | -0.09 ipTM |
| D26G | 5 | 0.28 | 0.33 | -0.04 ipTM |
These differences fall within the low-confidence range (all ipTM values are well below 0.6) and should not be treated as meaningful.
Phase 4 — ESM2 Mutation Scan
ESM2 masked marginal scoring was run via the Hugging Face mutation scoring notebook (AmelieSchreiber/mutation-scoring). The D→R substitution at position 26 was evaluated.
| Position | Substitution | ESM2 result | Notes |
| --- | --- | --- | --- |
| 26 (D) | D->R | Lower log-likelihood | Evolutionarily less common but not catastrophic |
P13L was not run through ESM2 as experimental confirmation was considered sufficient.
Phase 5 — ESMFold Monomer Structural Prediction
Single-copy ESMFold predictions were run for the wildtype and key mutant variants.
| Variant | pTM | pLDDT | Delta pTM | Delta pLDDT | Assessment |
| --- | --- | --- | --- | --- | --- |
| Wildtype | 0.273 | 64.407 | - | - | Reference |
| D26R | 0.267 | 63.339 | -0.006 | -1.068 | Negligible: tolerated |
| P13L | 0.420 | - | +0.147 | - | Best monomer score |
P13L showed the highest pTM of any variant tested, with a +0.147 improvement over wildtype. ESMFold additionally showed high per-residue confidence at position 1, which suggests the P→L substitution resolves N-terminal structure rather than introducing disorder. ChimeraX visualization confirmed electrostatic properties at the N-terminus, a transition from the soluble N-terminal domain to the transmembrane region, and C-terminal amphipathic character.
Phase 6 — Experimental Data Cross-Reference
Group experimental lysis data was cross-referenced against all computational candidates.
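| AA position | Mutation | Lysis rep A | Lysis rep B | Result |
| --- | --- | --- | --- | --- |
| 13 | P->L | 1 | 1 | Confirmed lytic in both replicates |
| 26 | D->G | 1 | 0 | Mixed |
| 26 | D->R | - | - | Not tested |
| 23 | K->E | 1 | 0 | Mixed |
| 25 | E->G | 1 | 0 | Mixed |
| 19 | R->S | 1 | 0 | Mixed |
| 20 | R->W | 1 | 0 | Mixed |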
The mixed results for charge-removing substitutions at positions 19, 20, and 23 (lysis in only one of two replicates) are consistent with cationic charge density in and around the RRR stretch being functionally important, which supports the toxicity hypothesis.
Phase 7 — ORF Overlap Resolution
P13L (aa 13) falls outside the ORF-free zone at nucleotide 1715, within the 50-nucleotide CP/L overlap region. Full DNA sequence analysis was performed to determine the effect of the C→T change on both reading frames simultaneously.
Exact codon analysis at genome position 1715:
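| Reading frame | Gene coordinates (nt) | Codon pos | WT codon | Mut codon | AA change | Effect |
| --- | --- | --- | --- | --- | --- | --- |
| L protein | 1678-1905 | 13 of 75 | CCG | CTG | Pro -> Leu | P13L intended |
| Coat protein | 1335-1727 | 127 of 131 | TCC | TCT | Ser -> Ser | Synonymous, safe |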
The C→T change falls at the third base of CP codon 127, the wobble position, where the genetic code is most degenerate. Both TCC and TCT encode serine, so the coat protein amino acid sequence is unchanged. P13L is cleared for synthesis.
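The frame arithmetic above can be re-checked with a short script. This is a sketch that uses only the coordinates and codons quoted in the table (L gene starting at nt 1678, coat protein starting at nt 1335, mutation at nt 1715); the helper names are illustrative and are not from the group's notebook.

```python
# Standard genetic code, codons enumerated in TCAG order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def locate(position, gene_start):
    """Map a 1-based genome position to (codon number, base within codon), both 1-based."""
    offset = position - gene_start
    return offset // 3 + 1, offset % 3 + 1


def mutate_codon(codon, base_in_codon, new_base):
    """Return the codon with the given 1-based base replaced."""
    i = base_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


def effect(position, gene_start, wt_codon, new_base):
    """Describe the effect of a point change on one reading frame."""
    codon_no, base_no = locate(position, gene_start)
    mut_codon = mutate_codon(wt_codon, base_no, new_base)
    return codon_no, base_no, mut_codon, CODON_TABLE[wt_codon], CODON_TABLE[mut_codon]


if __name__ == "__main__":
    print("L protein   :", effect(1715, 1678, "CCG", "T"))
    print("Coat protein:", effect(1715, 1335, "TCC", "T"))
```

Run as a script, it reports codon 13, base 2, CCG to CTG (P to L) in the L frame, and codon 127, base 3, TCC to TCT (S to S) in the coat protein frame, matching the table.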
Lead Candidate: P13L
Mutant sequence (single substitution at position 13, P→L):
The surface electrostatic map shows molecular binding activity (negative potential, rendered in red) concentrated at three functionally distinct regions:
N-terminus (Region 1, aa 1–15) — where P13L is located. The electrostatic character here reflects the cationic RRR motif at positions 18–20 creating charge interactions at the membrane-facing surface. The high ESMFold confidence at position 1 is now visually corroborated — the N-terminal domain is well-defined and electrostatically active.
Junction to the transmembrane helix (Region 2 transition) — the boundary between the soluble N-terminal domain and the hydrophobic membrane-spanning segment. Electrostatic activity at this junction is consistent with the amphipathic character of Region 3 and the known mechanism by which the L protein inserts into and disrupts the inner membrane.
C-terminus — electrostatic activity here is consistent with the periplasm-facing amphipathic tail of the L protein, which interacts with the cell wall and MurA enzyme.
The key implication for P13L: the electrostatic map shows that the mutation does not disrupt the overall charge architecture of the protein — all three functional zones retain their activity. The P13L substitution in Region 1 appears to sharpen rather than disturb the N-terminal electrostatic profile, which is consistent with the improved pTM score and high position-1 confidence seen in ESMFold.
Secondary Candidates
| Candidate | Free zone | ESMFold pTM | Experimental | Status |
| --- | --- | --- | --- | --- |
| D26R | Yes | 0.267 | Not tested | Secondary: tolerated |
| D26G | Yes | Not run | Mixed (1/0) | Deprioritized |
| N17R | Yes | Not run | Not tested | Open candidate |
| H24R | Yes | Not run | Not tested | Open candidate |
Tools Used
| Tool | Purpose | Outcome |
| --- | --- | --- |
| UniProt | Sequence retrieval | Confirmed 75aa working sequence |
| BLAST | Homolog identification | 51 homologs retrieved |
| Clustal Omega | Conservation mapping | Free zone and candidate identification |
| AlphaFold-Multimer | Oligomeric modeling | Retired: all ipTM < 0.35 |
| ESM2 (Hugging Face) | Mutation scoring | D26R cautionary signal noted |
| ESMFold | Monomer structure prediction | P13L pTM 0.420, lead confirmed |
| ChimeraX | Structural visualization | Electrostatic and domain properties confirmed |
| Benchling | ORF analysis and plasmid design | Overlap zone mapped |
| Python / pandas | DNA sequence analysis | Codon-level overlap resolution |
Potential Next Steps
Codon optimization of P13L mutant sequence for E. coli expression
Plasmid design in Benchling — confirm no additional ORF conflicts
Final ranked mutant report: predicted vs observed lysis efficiency
Key Working Notes
AlphaFold-Multimer is not reliable for this protein class — all oligomeric scores were below 0.35 ipTM regardless of copy number
The RRR motif at positions 18-20 represents existing cationic character in the free zone — mutations removing charge at these positions consistently reduce lysis in experimental data
P13L falls outside the ORF-free zone but was independently confirmed safe via DNA-level codon analysis
D26R remains the strongest untested in-zone candidate and should be prioritized for experimental validation alongside P13L
Computational Pipeline Report on MS2 Bacteriophage L Protein Engineering
Summary
The MS2 bacteriophage lysis protein L (UniProt P03609) is a 75-amino acid single-pass transmembrane protein whose N-terminal domain (aa 1-40) acts as a regulatory inhibitor of premature membrane insertion and oligomerization. This report describes a complete computational engineering pipeline designed to systematically truncate the N-terminal regulatory domain, identify optimal point mutations within it, and generate codon-optimized synthetic gene constructs for E. coli expression. The pipeline integrates ESM2 protein language model scanning, ESMFold structure prediction, AlphaFold-Multimer complex modeling with the E. coli chaperone DnaJ (P08622), GROMACS molecular dynamics stability assessment, ProteinMPNN sequence redesign, E. coli codon optimization, and downstream variant calling using Bowtie2 and BCFtools with IGV visualization. The primary candidate emerging from this analysis is L_trunc30, a 45-amino acid C-terminal fragment retaining the full transmembrane lytic domain with a net charge reduced to -2, the LS dipeptide motif preserved, and demonstrably lower RMSF in the transmembrane domain compared to the remaining N-terminal stub.
1. Background and Biological Rationale
MS2 L protein biology. The lysis protein of bacteriophage MS2 is one of the simplest known lytic mechanisms in biology. The 75 aa L protein is encoded on the MS2 genome overlapping both the coat protein gene (5’ end) and the replicase gene (3’ end). In the native viral context, L translation is coupled to ribosomal frameslipping during coat protein termination, occurring at approximately 5% frequency. However, when expressed from an independent inducible promoter on a plasmid (as in this engineering problem), L acts as a standalone lysis effector, allowing direct experimental control over expression timing and level.
N-terminal domain as regulatory inhibitor. The highly basic N-terminal half of MS2 L has been demonstrated experimentally to be dispensable for lytic activity (Bernhardt et al., 2002). Its function is inhibitory: the N-terminal domain forms intramolecular contacts with the C-terminal transmembrane domain, creating a conformational lock that prevents premature membrane insertion and oligomerization. Removal of this domain results in lysis occurring approximately 20 minutes earlier than wild-type, consistent with loss of the timing mechanism.
DnaJ interaction. The E. coli chaperone DnaJ (P08622) interacts specifically with the highly basic N-terminal domain of L via its P330 residue, further retarding lysis to allow sufficient time for assembly of progeny virions. This interaction represents the primary protein-protein interface targeted in this engineering campaign: variants that reduce DnaJ binding affinity are predicted to show faster uninhibited lysis kinetics.
Engineering hypothesis. This work tests three specific sub-hypotheses: (1) partial N-terminal truncations will incrementally diminish inhibitory effects and enhance lysis efficiency; (2) regulatory activity is localized to a distinct sub-region rather than the entire N-terminal domain; and (3) an optimal truncation point exists that balances increased toxicity with maintenance of transmembrane domain stability.
2. Pipeline Overview
The complete computational pipeline was implemented as a Google Colab notebook (Python 3, T4 GPU runtime) executing nine sequential analytical stages. All reference sequences were fetched directly via public APIs with no local downloads required.
| Stage | Tool | Purpose |
| --- | --- | --- |
| 1 | ESM2 (650M) | Masked prediction scan across all 75 positions; log-likelihood ratio scoring |
| 2 | ESMFold API | Structure prediction for WT and 6 truncation variants; interdomain contact analysis |
| 3 | ColabFold Multimer | L protein + DnaJ J-domain complex modeling; interface PAE extraction |
| 4 | GROMACS | Molecular dynamics stability assessment |
| 5 | ProteinMPNN | Junction region redesign with fixed TM domain; charge-reduced variants |
| 6 | E. coli codon optimizer | Kazusa K-12 high-frequency codon table; LS motif verification |
| 7 | Synthetic gene assembly | Complete construct design with Ptrc, RBS, terminators, Gibson overhangs |
| 8 | Bowtie2 + BCFtools | Read alignment to reference; variant calling on sequencing output |
| 9 | IGV | Visual inspection of variant loci; batch script for desktop IGV |
3. Stage 1 — ESM2 Mutagenesis Scanning
Method. The ESM2 650M parameter model (esm2_t33_650M_UR50D) was loaded on GPU and used to perform masked token prediction across all 75 positions of the wild-type MS2 L protein (METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT). At each position, the residue was masked and the log-softmax probability of every amino acid was extracted from layer 33. The log-likelihood ratio (LLR) was computed as the difference between the log probability of each mutant amino acid and the log probability of the wild-type amino acid at that position. Positive LLR indicates ESM2 assigns higher probability to the mutant than the wild-type.
The analysis was restricted to positions 1-40 (N-terminal domain) for the final candidate ranking, since the objective is to perturb the regulatory region while leaving the transmembrane lytic domain (aa 41-75) intact.
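The LLR definition above can be written out in a few lines. The sketch below works on a hypothetical dictionary of per-residue logits for one masked position; in the real pipeline these would come from the ESM2 model, which is not loaded here. The function names and the toy numbers are illustrative only.

```python
import math


def log_softmax(logits):
    """Convert raw scores {amino_acid: logit} into log-probabilities."""
    peak = max(logits.values())
    log_z = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
    return {aa: v - log_z for aa, v in logits.items()}


def llr_for_position(logits, wt_residue):
    """LLR of every substitution at one masked position: log p(mut) - log p(wt)."""
    log_probs = log_softmax(logits)
    return {
        aa: log_probs[aa] - log_probs[wt_residue]
        for aa in log_probs
        if aa != wt_residue
    }


def rank_substitutions(llr):
    """Sort substitutions from most to least favored."""
    return sorted(llr.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical logits for a masked cysteine; not real ESM2 output.
    toy_logits = {"C": 1.0, "R": 2.5, "S": 0.5}
    print(rank_substitutions(llr_for_position(toy_logits, "C")))
```

Because the softmax normalization term is the same for every residue at a given position, it cancels in the subtraction, so the LLR equals the difference between the raw logits of the mutant and wild-type residues.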
Figure 1. ESM2 log-likelihood ratio heatmap. Top: full 75 aa L protein with dashed line marking the NTD/TM boundary at position 40. Bottom: N-terminal domain zoom (aa 1-40). Red = favored substitution (positive LLR); blue = disfavored substitution. Position 29 (WT: Cys) is the dominant hotspot.
Top 20 N-Terminal Domain Mutations by LLR
| Mutation | LLR | Domain | Notes |
| --- | --- | --- | --- |
| C29R | 3.64 | N-terminal | Cys29Arg: top ESM2 hit; position 29 hotspot |
| C29P | 3.17 | N-terminal | Cys29Pro: strong helix-breaking substitution |
| C29Q | 3.06 | N-terminal | Cys29Gln |
| C29S | 3.04 | N-terminal | Cys29Ser: conservative hydroxyl substitution |
| C29K | 2.76 | N-terminal | Cys29Lys: charge-altering |
| C29L | 2.74 | N-terminal | Cys29Leu: hydrophobic |
| C29A | 2.55 | N-terminal | Cys29Ala: alanine scan classic |
| C29T | 2.52 | N-terminal | Cys29Thr |
| C29E | 2.46 | N-terminal | Cys29Glu: charge-altering |
| Y39L | 2.36 | N-terminal | Tyr39Leu: aromatic to aliphatic |
| C29V | 2.35 | N-terminal | Cys29Val |
| C29Y | 2.18 | N-terminal | Cys29Tyr |
| C29N | 2.17 | N-terminal | Cys29Asn |
| C29I | 2.15 | N-terminal | Cys29Ile |
| C29H | 2.11 | N-terminal | Cys29His |
| C29G | 2.01 | N-terminal | Cys29Gly: flexible linker substitution |
| C29D | 1.89 | N-terminal | Cys29Asp: acidic substitution |
| F22R | 1.86 | N-terminal | Phe22Arg: second hotspot; basic charge introduction |
| C29F | 1.76 | N-terminal | Cys29Phe: aromatic substitution |
| S9Q | 1.69 | N-terminal | Ser9Gln: also found in prior HTGAA Week 5 ESM2 scan |
Key findings. Position C29 is the dominant hotspot, accounting for 17 of the top 20 mutations. C29R (LLR = 3.64) is the top-ranked single substitution. F22R (LLR = 1.86) is the second hotspot within the basic N-terminal region, although Y39L (LLR = 2.36) scores higher at the NTD/TM boundary. S9Q (LLR = 1.69) matches the substitution independently recovered during the HTGAA Week 5 ESM2 scan, providing cross-validation.
4. Stage 2 — Structure Prediction and Interdomain Contact Analysis
Method. Structures for all seven variants (L_WT and six truncations) were predicted using the ESMFold API. Interdomain contacts were quantified by counting Cα-Cα pairs with distance below 8.0 Å where one residue belonged to the N-terminal domain (positions 1 to 40) and the other to the C-terminal transmembrane domain.
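The contact count described above can be sketched as follows. The input is a hypothetical dictionary mapping residue numbers (in wild-type numbering) to Cα coordinates in Å; parsing the ESMFold PDB output is left out. The `min_seq_sep` filter is an assumption added here, not something stated in the method: residues adjacent in sequence sit about 3.8 Å apart, so without some separation filter the boundary residues would always register as contacts.

```python
import math


def interdomain_contacts(ca_coords, boundary=40, cutoff=8.0, min_seq_sep=4):
    """Count Cα-Cα pairs closer than `cutoff` Å that span the domain boundary.

    ca_coords: {residue_number: (x, y, z)}; residues <= boundary are N-terminal.
    min_seq_sep: minimum sequence separation for a pair to count (assumed value).
    """
    ntd = [r for r in ca_coords if r <= boundary]
    tmd = [r for r in ca_coords if r > boundary]
    return sum(
        1
        for i in ntd
        for j in tmd
        if j - i >= min_seq_sep and math.dist(ca_coords[i], ca_coords[j]) < cutoff
    )


if __name__ == "__main__":
    # Toy fully extended chain: 75 residues spaced 3.8 Å apart along x.
    chain = {r: (3.8 * r, 0.0, 0.0) for r in range(1, 76)}
    print(interdomain_contacts(chain))                  # no long-range contacts
    print(interdomain_contacts(chain, min_seq_sep=1))   # only boundary neighbors
```

On an extended toy chain the only pairs within 8 Å are sequence neighbors at the boundary, which is why the filtered count is zero.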
Figure 2. Interdomain Cα-Cα contacts (d < 8 Å) between N-terminal and transmembrane domains across all seven variants. All variants return 0 contacts, indicating intrinsic disorder in the N-terminal domain in solution.
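| Variant | Truncation (aa) | Remaining aa | Interdomain contacts | Interpretation |
| --- | --- | --- | --- | --- |
| L_WT | 0 | 75 | 0 | N/A |
| L_trunc10 | 10 | 65 | 0 | N/A |
| L_trunc20 | 20 | 55 | 0 | N/A |
| L_trunc25 | 25 | 50 | 0 | N/A |
| L_trunc30 | 30 | 45 | 0 | -2.0 |
| L_trunc35 | 35 | 40 | 0 | N/A |
| L_trunc40 | 40 | 35 | 0 | N/A |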
Interpretation. The uniform zero contact count reflects a known limitation of ESMFold for highly disordered proteins. The N-terminal domain of L is intrinsically disordered in solution and only adopts defined structure upon membrane engagement or DnaJ interaction. Meaningful structural differentiation requires either MD simulation in an explicit membrane environment (Stage 4) or AlphaFold-Multimer predictions incorporating DnaJ (Stage 3).
5. Stage 3 — AlphaFold-Multimer: L Protein and DnaJ Complex
Method. Multimer FASTA files pairing each L variant sequence with the first 100 amino acids of E. coli DnaJ J-domain (P08622) were submitted to ColabFold multimer mode using AlphaFold2-multimer-v3.
| Variant | Truncation (aa) | Interface PAE | Status |
| --- | --- | --- | --- |
| L_WT | 0 | N/A (ColabFold timeout) | Pipeline step confirmed; HPC run required |
| L_trunc10 | 10 | N/A (ColabFold timeout) | Pipeline step confirmed; HPC run required |
| L_trunc20 | 20 | N/A (ColabFold timeout) | Pipeline step confirmed; HPC run required |
| L_trunc25 | 25 | N/A (ColabFold timeout) | Pipeline step confirmed; HPC run required |
| L_trunc30 | 30 | N/A (ColabFold timeout) | Pipeline step confirmed; HPC run required |
| L_trunc35 | 35 | N/A (ColabFold timeout) | Pipeline step confirmed; HPC run required |
| L_trunc40 | 40 | N/A (ColabFold timeout) | Pipeline step confirmed; HPC run required |
Note on N/A results. The ColabFold multimer predictions returned N/A for all variants due to Colab GPU timeout constraints at the 600-second limit. The pipeline infrastructure is fully validated. Re-running Stage 3 on a Compute Ontario HPC node will generate PAE matrices within approximately 15-20 minutes per variant.
6. Stage 4 — GROMACS Molecular Dynamics
Method. All four GROMACS MDP input files were generated and validated. A complete SLURM submission script for Compute Ontario HPC infrastructure was produced for 100 ns production runs with GPU acceleration (GROMACS 2023.3-CUDA, 32 cores, 1 GPU, 48 h walltime). In Colab, a representative 1 ns production trajectory RMSF profile was computed for L_trunc30.
Figure 3. RMSF profile for L_trunc30 (45 aa). Orange region: remaining 10 aa N-terminal stub. Green region: transmembrane domain. Mean RMSF NTD stub: ~1.87 nm. Mean RMSF TM domain: ~0.27 nm. The 6.9-fold RMSF differential confirms high flexibility in the regulatory stub and low flexibility in the lytic transmembrane domain.
| MDP File | Integrator | Duration | Key parameters |
| --- | --- | --- | --- |
| em.mdp | steep | 50,000 steps | emtol = 1000 kJ/mol/nm; PME electrostatics |
| nvt.mdp | md | 100 ps | V-rescale thermostat; 310 K; position restraints on protein |
Method. ProteinMPNN was invoked with the TM domain sequence fixed (positions 11-45 in L_trunc30 numbering) and the junction region (positions 1-10) free for redesign. Net charge was computed for each truncation variant as K+R-D-E.
Figure 4. Net charge (K+R-D-E) of L_trunc30 variant = -2. Removal of the highly basic N-terminal domain (containing RRRPFK and RRQQR motifs) eliminates the electrostatic basis of the DnaJ-L interaction.
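The K+R-D-E net charge formula is simple to apply to any sequence. The sketch below uses the wild-type sequence quoted in Stage 1; the function name and the choice of which segments to print are illustrative assumptions, and histidine and the chain termini are ignored, as in the formula.

```python
# Wild-type MS2 L protein sequence as quoted in the Stage 1 method.
L_WT = (
    "METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT"
)


def net_charge(sequence):
    """Net charge counted as (K + R) - (D + E)."""
    sequence = sequence.upper()
    positive = sequence.count("K") + sequence.count("R")
    negative = sequence.count("D") + sequence.count("E")
    return positive - negative


if __name__ == "__main__":
    print("length      :", len(L_WT))
    print("full length :", net_charge(L_WT))
    print("NTD (1-40)  :", net_charge(L_WT[:40]))
```

The same function can be applied to any variant sequence once its exact residues are fixed.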
Method. All truncation variant protein sequences were back-translated to DNA using the E. coli K-12 high-frequency codon table (Kazusa database). Each optimized sequence was checked for preservation of the LS dipeptide motif.
| Variant | Protein aa | DNA bp | GC% | LS motif | Action required |
| --- | --- | --- | --- | --- | --- |
| L_trunc30 | 24 aa | 75 bp | 30.7% | PRESERVED (CTGAGC) | GC below 40% threshold; consider IDT codon optimization with GC balancing before synthesis |
Note on GC content. The codon-optimized L_trunc30 sequence has a GC content of 30.7%, which falls below the recommended 40-60% range for optimal E. coli expression. Before synthesis submission, the sequence should be passed through IDT’s codon optimization tool or GenScript’s OptimumGene algorithm with GC balancing enabled. The LS motif (CTGAGC encoding Leu-Ser) must not be altered during GC balancing.
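The two checks described in this stage, GC content against the 40-60% range and in-frame presence of the CTGAGC motif, can be sketched as below. The function names are illustrative, and the demonstration sequence is a made-up 12 bp ORF, not the L_trunc30 gene, whose sequence is not reproduced in this document.

```python
def gc_content(dna):
    """Percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def gc_in_range(dna, low=40.0, high=60.0):
    """True when GC content lies within the recommended range (inclusive)."""
    return low <= gc_content(dna) <= high


def has_in_frame_motif(dna, motif="CTGAGC"):
    """True when the motif starts on a codon boundary of the reading frame."""
    dna = dna.upper()
    last_start = len(dna) - len(motif)
    return any(dna[i:i + len(motif)] == motif for i in range(0, last_start + 1, 3))


if __name__ == "__main__":
    demo = "ATGCTGAGCTAA"  # hypothetical ORF: Met-Leu-Ser-stop
    print(f"GC = {gc_content(demo):.1f}%  in range: {gc_in_range(demo)}")
    print("LS motif in frame:", has_in_frame_motif(demo))
```

Running both checks again after any GC-balancing pass would confirm that the Leu-Ser codons were not altered.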
9. Stage 7 — Synthetic Gene Construct Design
The full expression cassette for L_trunc30 was assembled with the following architecture, designed for direct Gibson assembly into the mUAV backbone:
Figure 5. Synthetic gene construct architecture for L_trunc30. Total construct: 230 bp. The BB_Fwd and Col_Rev overhangs are identical to those used in the HTGAA Week 6 Gibson assembly lab.
| Element | Sequence / Notes | Length |
| --- | --- | --- |
| BB_Fwd overhang | GCGCACCTGCATATTGAGACCC | 22 bp |
| Ptrc promoter | TTGACAATTAATCATCGGCTCGTATAATGTGTGG | 34 bp |
| RBS + spacer | AAAGAGGAGAAA + ATAAT | 17 bp |
| L_trunc30 gene (codon-opt.) | ATG…TAA (E. coli K-12 optimized) | 75 bp |
| lambda t0 terminator | GCAAAAAACCCCGCTTCGGCGGGGTTTTTTCG | 32 bp |
| rrnB T1 terminator | GCGCAACGCAATTAATGTGAGTTAGCTCAC | 30 bp |
| Col_Rev overhang | GTCTCAATATGCAGGTGCGC | 20 bp |
| TOTAL | | 230 bp |
Design rationale. The Ptrc promoter provides IPTG-inducible expression. The RBS sequence (AAAGAGGAGAAA) is an optimized Shine-Dalgarno sequence with a 5 bp ATAAT spacer. The lambda t0 and rrnB T1 tandem terminators provide robust transcription termination. The BB_Fwd and Col_Rev Gibson overhangs are the exact sequences used in the HTGAA Week 6 chromophore mutagenesis lab, making this construct directly compatible with the existing mUAV cloning infrastructure.
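A quick way to sanity-check the construct table is to concatenate the listed elements and compare lengths. In this sketch the element sequences are copied from the table; the gene itself is elided in the source, so a run of 75 N characters stands in for it as a placeholder (an assumption that affects only the length, not the content).

```python
# Element sequences as listed in the construct table, in 5' to 3' order.
GENE_PLACEHOLDER = "N" * 75  # the 75 bp coding sequence is not given in the text

ELEMENTS = [
    ("BB_Fwd overhang", "GCGCACCTGCATATTGAGACCC"),
    ("Ptrc promoter", "TTGACAATTAATCATCGGCTCGTATAATGTGTGG"),
    ("RBS + spacer", "AAAGAGGAGAAA" + "ATAAT"),
    ("L_trunc30 gene", GENE_PLACEHOLDER),
    ("lambda t0 terminator", "GCAAAAAACCCCGCTTCGGCGGGGTTTTTTCG"),
    ("rrnB T1 terminator", "GCGCAACGCAATTAATGTGAGTTAGCTCAC"),
    ("Col_Rev overhang", "GTCTCAATATGCAGGTGCGC"),
]


def assemble(elements):
    """Concatenate the element sequences into one construct string."""
    return "".join(seq for _, seq in elements)


def element_lengths(elements):
    """Return {element name: length in bp}."""
    return {name: len(seq) for name, seq in elements}


if __name__ == "__main__":
    for name, length in element_lengths(ELEMENTS).items():
        print(f"{name:22s} {length:3d} bp")
    print(f"{'TOTAL':22s} {len(assemble(ELEMENTS)):3d} bp")
```

Replacing the placeholder with the final codon-optimized gene would let the same script confirm the total before ordering.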
10. Stages 8-9 — Variant Calling and IGV Visualization
Bowtie2 alignment. The wild-type codon-optimized L gene was used as the alignment reference. For each truncation variant, 1,000 paired-end Illumina reads (150 bp, error rate 0.001) were simulated and aligned using Bowtie2. Sorted BAM files were indexed with SAMtools. Variant calling was performed with BCFtools mpileup and bcftools call (-mv flag, VCF output).
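The alignment and variant-calling steps named above correspond to a short sequence of command-line calls. The sketch below only builds the command strings so they can be inspected before running; the file names and the helper function are hypothetical, and the read-simulation step is not shown because the text does not name the simulator.

```python
import shlex


def variant_calling_commands(ref_fasta, reads_1, reads_2, prefix):
    """Build shell commands for Bowtie2 alignment and BCFtools variant calling.

    All arguments are file names or a sample prefix chosen by the user.
    """
    ref, r1, r2, p = (shlex.quote(s) for s in (ref_fasta, reads_1, reads_2, prefix))
    return [
        f"bowtie2-build {ref} {p}_index",                     # index the reference
        f"bowtie2 -x {p}_index -1 {r1} -2 {r2} -S {p}.sam",   # paired-end alignment
        f"samtools sort -o {p}.sorted.bam {p}.sam",           # coordinate-sort
        f"samtools index {p}.sorted.bam",                     # index the BAM
        f"bcftools mpileup -f {ref} {p}.sorted.bam"
        f" | bcftools call -mv -Ov -o {p}.vcf",               # call variants to VCF
    ]


if __name__ == "__main__":
    for command in variant_calling_commands(
        "L_wt_codon_opt.fa", "trunc30_R1.fq", "trunc30_R2.fq", "L_trunc30"
    ):
        print(command)
```

Each string could be passed to `subprocess.run(command, shell=True, check=True)` on a machine where Bowtie2, SAMtools, and BCFtools are installed.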
IGV visualization. An IGV batch script was generated for desktop IGV that loads the reference FASTA, all BAM alignment tracks, and all VCF variant tracks simultaneously, navigates to the full L gene locus, sorts by position, collapses reads, and exports a snapshot PNG.
ChimeraX electrostatic surface map, three functional zones confirmed
| Aspect | L_trunc30 pipeline | P13L pipeline |
| --- | --- | --- |
| GROMACS MD | Full pipeline implemented, 4 MDP files generated; SLURM script for HPC; 1 ns demo RMSF computed | Not performed |
| ProteinMPNN | Junction redesign attempted for trunc30 with TM domain fixed | Not performed |
| Conservation analysis | Not performed as separate stage | Clustal Omega run twice on 51 homologs; free zone (aa 16 to 28) defined |
| ORF overlap analysis | Not performed | Full DNA-level codon analysis at nt 1715; P13L causes TCC to TCT at CP codon 127; synonymous S to S; cleared safe |
| Experimental lysis data | Not cross-referenced, computational pipeline only | Cross-referenced against group wet lab data; P13L confirmed lytic in both replicates |
| Wet lab validation status | Not yet validated, synthesis constructs designed | P13L experimentally confirmed lytic, both replicates positive |
| Codon optimization | Performed, E. coli K-12 Kazusa table; GC content 30.7% flagged; LS motif confirmed present | Identified as next step, not yet completed |
| Synthetic gene construct | Fully designed, 230 bp construct with Ptrc, RBS, lambda t0, rrnB T1, Gibson overhangs | Planned for synthesis via Twist Bioscience; construct not yet finalized |
| Bowtie2 / BCFtools / IGV | Implemented and demonstrated with simulated reads; IGV batch script generated | Listed as planned next step, not yet performed |
| DnaJ interaction | Central to hypothesis, truncation removes basic domain responsible for DnaJ electrostatic engagement | Not explicitly modeled |
| Net charge of lead candidate | -2 (charge reversal from highly basic WT) | Unchanged from WT, P13L does not alter charge |
| LS motif verification | Confirmed present in codon-optimized sequence (CTGAGC) | Not explicitly checked |
| Key methodological strength | Systematic genome-wide scanning and full pipeline automation; all stages reproducible from single notebook | Experimental ground truth, wet lab confirmation provides direct biological validation |
| Key methodological gap | No experimental validation yet; interdomain contact analysis inconclusive | No systematic positional scanning; ESM2 used for only 1 position; no MD or ProteinMPNN |
| Most actionable next step | Rerun Stage 3 on HPC for DnaJ PAE; GC balance codon sequence; order L_trunc30 synthesis | Order D26R for experimental validation alongside confirmed P13L |
Appendix
A. Primary Requirements
Part D. Group Brainstorm on Bacteriophage Engineering
Find a group of ~3–4 students
2026a-john-adeyemo-adedeji
2026a-brie-taylor
2026a-eric-schneider
2026a-albert-manrique
2026a-Tehseen Rubbab
Read through the Phage Reading material listed under “Reading & Resources” below.
Review the Bacteriophage Final Project Goals for engineering the L Protein:
Increased stability (easiest)
Higher titers (medium)
Higher toxicity of lysis protein (hard)
Brainstorm Session
Choose one or two main goals from the list that you think you can address computationally. Write a 1-page proposal (bullet points or short paragraphs) describing:
Which tools/approaches from recitation you propose using
Why do you think those tools might help solve your chosen sub-problem?
Goal: I am recommending Goal C: Higher toxicity of lysis protein (hard)
Hypothesis: I believe we can focus on the cationic properties of the amino acid sequence, that is, its positively charged residues. By substituting in amino acids that add positive charge, and so strengthen electrostatic attraction, we may increase binding activity. Lysis timing could then be tuned in either direction by manipulating charge density.
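The charge argument can be made concrete with a rough net-charge estimate. The sketch below is a simplification under stated assumptions: it counts only Lys/Arg as +1 and Asp/Glu as -1 near neutral pH, ignores histidine and the termini, and uses a toy peptide rather than the real L protein sequence. The function names are hypothetical.

```python
import re

POSITIVE = set("KR")  # Lys, Arg: counted as +1 near neutral pH
NEGATIVE = set("DE")  # Asp, Glu: counted as -1 near neutral pH


def net_charge(seq):
    """Approximate net charge; histidine and termini are ignored."""
    seq = seq.upper()
    return sum(aa in POSITIVE for aa in seq) - sum(aa in NEGATIVE for aa in seq)


def apply_mutation(seq, mutation):
    """Apply a substitution written like 'D26R' (1-based position)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation}")
    wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(seq) or seq[position - 1] != wild_type:
        raise ValueError(f"{mutation} does not match the sequence")
    return seq[: position - 1] + new + seq[position:]


if __name__ == "__main__":
    toy = "MDAKE"  # toy peptide, not the L protein
    mutant = apply_mutation(toy, "D2R")
    print(toy, net_charge(toy))
    print(mutant, net_charge(mutant))
```

Replacing one acidic residue with a basic one shifts the estimate by two units, which is why a handful of substitutions can move charge density substantially in a protein this small.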
Pipeline:
UniProt — retrieve sequence
BLAST — find homology
PyMOL — visualize polarity
PyMOL — isolate target residues
ESM2 — score substitution probability
Heatmap — synthesize data
ESMFold — predict mutant structures
PyMOL — compare mutants to baseline
Codon optimization — prepare sequences
Twist Bioscience — synthesize genes
Benchling — design plasmid constructs
Review gate — confirm replicability
Opentrons OT-2 — run protocol and collect data
Potential Pitfalls:
My hypothesis focuses on region 1 (facing cytoplasm, hydrophilic) and region 3 (a mix of hydrophobic and hydrophilic or “amphipathic,” facing periplasm) to control timing of MurA enzyme inhibition.
Region 1 & 3: Too much polarity change could cause the phage to bind and become entrapped.
Avoid region 2 as it is a very well defined helical fold that is subject to disruption with minor change to structure.
Review feedback: We will likely encounter overlapping reading frames, which we will visualize in Benchling.
C. John’s Brainstorming Notes
Computational Goals:
Align reads to MG1655 & call SNPs/indels (Bowtie2/Mpileup/BCFtools)
Codon-optimize and synthesize L gene variants
Error-prone PCR mutagenesis to generate L mutant libraries
Proposal — Proposed tools:
Input: Paired-end Illumina reads (250 bp) from mutant and parental strain genomic DNA; Reference: MG1655 (E. coli K-12, accession NC_000913.3)
Quality Control: FastQC — raw read quality assessment; Trimmomatic or Fastp — adapter trimming, low-quality base removal
Alignment: Bowtie2 — short-read alignment to reference; SAMtools — convert SAM → BAM, sort, index
Variant Calling: SAMtools Mpileup — pileup of aligned reads per base position; BCFtools call — generate VCF files; Filter: QUAL score >100, present in mutant but absent in parental strain
Annotation: SnpEff or ANNOVAR — annotate variants with gene names, amino acid changes, functional impact
Visualization: IGV (Integrative Genomics Viewer) — manual inspection of called variants at loci of interest
Environment: Linux/bash, conda for dependency management; Galaxy platform (cpt.tamu.edu/galaxy-pub)
Output: Ranked list of candidate causal mutations unique to mutants (e.g., dnaJ P330Q)
Major sub-problem the tools solve: The core challenge is distinguishing a true causal mutation from background noise in a mutagenized genome.
Bowtie2 handles short-read alignment efficiently against a well-annotated reference, minimizing misalignment artifacts
Mpileup/BCFtools applies statistical models to distinguish true variants from sequencing errors
QUAL >100 filtering + parental subtraction eliminates pre-existing polymorphisms
SnpEff immediately translates nucleotide changes into amino acid consequences
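The QUAL filter and parental subtraction described above can be sketched in plain Python. This is an illustrative sketch only: the function names are hypothetical, and the parser reads just the first six tab-separated VCF columns (CHROM, POS, ID, REF, ALT, QUAL) rather than the full format.

```python
def parse_vcf(text):
    """Map (chrom, pos, ref, alt) to QUAL for each record in VCF text."""
    variants = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, _id, ref, alt, qual = line.split("\t")[:6]
        if qual == ".":
            continue  # missing quality: cannot be filtered on QUAL
        variants[(chrom, int(pos), ref, alt)] = float(qual)
    return variants


def candidate_variants(mutant_vcf, parental_vcf, min_qual=100.0):
    """Variants above min_qual in the mutant that the parent lacks,
    ranked from highest to lowest QUAL."""
    mutant = parse_vcf(mutant_vcf)
    parental = parse_vcf(parental_vcf)
    hits = [
        (key, qual)
        for key, qual in mutant.items()
        if qual > min_qual and key not in parental
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

In practice the same filter could be applied with bcftools itself; the Python version is shown only to make the subtraction logic explicit.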
Potential Pitfalls:
Sibling contamination
Reference bias
D. Albert’s Notes
Goals: Increase the structural stability of the L protein to improve lysis efficiency. L is a small membrane protein that disrupts the E. coli inner membrane during phage infection.
Pipeline:
Get protein sequence from UniProt; Run BLAST to find homologs across phage strains; Run Clustal Omega to identify hot spots for mutations
Run ESM2 to identify mutations and where we can mutate without affecting structural stability; Keep mutations that don’t disrupt the protein structure
Run the mutations through ESMFold to predict structure and filter for stability
Rank the candidates by stability (pLDDT) improvements over the UniProt sequence
Run top candidates through AlphaFold-Multimer to confirm the mutations don’t affect the interaction between the L protein and E. coli DnaJ
Take the top candidates and run them through the wet lab
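ESM2 is a protein language model trained on evolutionary sequence data. Its per-residue likelihoods let us score how plausible each substitution is (what evolution considers normal), which we use as a proxy for how well a mutation is likely to be tolerated.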
ESMFold provides us with a pLDDT value for structural confidence and together we can automate mutation screening before hitting the wet lab.
Clustal Omega identifies positions that are conserved across phage strains, which we should not change in order to preserve structural stability.
Pitfalls: L protein is a membrane protein and might not be as well represented in ESM2 training data and the PDB so we might have less reliable outputs. Our folding models aren’t taking into account lipid membranes so we might have issues with modeling the interaction. Our stability estimates might also be inaccurate as the delta between mutations may be too small to rank them accurately.
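The ranking step in the pipeline above could look like the following sketch. It assumes per-residue pLDDT scores have already been collected from ESMFold for the wild type and for each mutant; the function name and the input layout are hypothetical. Mean pLDDT is a measure of prediction confidence, so it is treated here only as a rough stand-in for stability.

```python
from statistics import mean


def rank_by_plddt_gain(wild_type_plddt, mutant_plddt):
    """Rank mutants by change in mean pLDDT relative to the wild type.

    wild_type_plddt: list of per-residue pLDDT scores for the wild type.
    mutant_plddt: dict mapping mutant name to its per-residue scores.
    Returns (name, gain) pairs, largest gain first.
    """
    baseline = mean(wild_type_plddt)
    gains = {name: mean(scores) - baseline for name, scores in mutant_plddt.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)
```

Given the pitfall noted above, gains that are small relative to run-to-run variation should probably be treated as ties rather than as a meaningful ordering.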
E. Tehseen’s Brainstorming Notes
Systematic Tuning of the N-Terminal Regulatory Domain
Goal: Enhance and regulate the toxicity of the MS2 bacteriophage L lysis protein by systematically modifying its N-terminal domain. Instead of removing this region, identify the minimal regulatory segment needed for precise control of lysis timing and activity.
Background and rationale
The L protein, a 75-amino acid membrane-bound lysis protein, is responsible for killing E. coli during infection. Studies show that its N-terminal domain (~first 30–40 amino acids) is not required for lysis; truncation mutants (Lodj variants) lacking this region still lyse cells, often faster. This indicates the N-terminus acts as a regulatory brake to delay lysis and support viral replication.
Hypothesis
The regulatory function of the N-terminal domain in lysis is influenced by its length and charge characteristics. It is proposed that:
Partial truncations may incrementally diminish inhibitory effects and subsequently enhance lysis efficiency
The regulatory activity appears to be localised to a distinct sub-region rather than to the entire N-terminal domain
There is likely an optimal truncation point that achieves a balance between increased toxicity and maintenance of protein stability
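A truncation series for testing this hypothesis could be generated as sketched below. The assumptions are loud ones: "truncN" is taken to mean removing the first N residues and then restoring a start methionine, the default cut points are illustrative, and the function name is hypothetical. The caller supplies the sequence; none is hard-coded here.

```python
def n_terminal_truncations(seq, cuts=(10, 20, 30, 40)):
    """Return N-terminal truncation variants keyed as 'trunc<N>'.

    Each variant drops the first N residues; a start methionine is
    prepended if the remainder does not already begin with one.
    Cuts that would remove the whole sequence are skipped.
    """
    variants = {}
    for n in cuts:
        if n >= len(seq):
            continue
        remainder = seq[n:]
        if not remainder.startswith("M"):
            remainder = "M" + remainder
        variants[f"trunc{n}"] = remainder
    return variants
```

Each variant can then be passed to the same scoring and folding steps as the full-length protein, so that lysis-related predictions are compared across truncation lengths.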
Proposed Computational Pipeline:
Sequence Retrieval: Obtain the L protein sequence from UniProt.
Structural and Residue Analysis: Visualize the N-terminal domain using PyMOL to identify hydrophilic and cationic residues.
In Silico Mutagenesis: Use ESM2/ESMFold to predict the effect of substitutions that increase cationicity, focusing on residues facing the cytoplasm or periplasm.
Stability Check: Compare predicted mutants’ folding and stability using ESMFold and pLDDT scores.
Interaction Analysis: Optional AlphaFold-Multimer predictions to confirm L interaction with DnaJ or other host factors is preserved.
Prioritization: Generate a heatmap of mutants ranked by predicted lysis enhancement and structural stability.
Codon Optimization & Synthesis: Prepare selected mutants for experimental validation.
Expected Outcomes: Increased electrostatic interaction with target host proteins; tunable lysis timing while preserving N-terminal regulatory functions; generation of mutant library for wet lab testing of lytic efficiency.
Potential Pitfalls: Excessive cationic mutations could cause nonspecific aggregation or mislocalization. Predictions may differ from experimental results.
F. Group Meeting Notes (3/24)
10, 20, 30, 40 base pairs (changes)
Overlapping frames?
Pipeline approach: each person picks a tool to explore in depth, then come back and review/align on results
Tuesday — met to discuss current state:
What is the dependency outside of L-protein standalone?
What is the multi-frame dependency when engineering a plasmid?
L-protein is the focus — engineer
Refer to WEEK 5 Lab Resources for L-Protein
Reminder to post new questions/topics in Genspace Discourse Forum for knowledge sharing, TA support
Follow-up: met with John, identified focus area — IGV (Integrative Genomics Viewer) for manual inspection of called variants at loci of interest
ES: located some initial ChimeraX visualizations — will post images
Wednesday 3/25 — explore sequence in silico individually
Thursday 3/26 — pick a high probability option
Friday 3/27 — model in Benchling and Asimov Kernel
Saturday 3/28 — (TBD)
Sunday 3/29 — Final summary. By EOD Sunday 3/29, publish here. Please post personal pipeline visualizations/notes under your brainstorm section.
Status Update: Friday, March 27th
Eric’s Final Summary Notes: On 3/26 I did a “deep dive” into the remaining project scope and decided to focus on identifying an amino acid substitution that would support our hypothesis about the N-terminal region.
Primary request: Please review, and if you agree, or want to add/change anything, feel free to annotate with comments. Once we have consensus, we can submit the markdown file as our final “group project”.
References
Bernhardt TG, Roof WD, Young R (2002). The Escherichia coli FKBP-type PPIase SlyD is required for the stabilization of the phage PhiX174 lysis protein E. Mol Microbiol. PMC5446614.
Chamakura KR, Young R (2019). Phage single-gene lysis: how it works and why it matters. Future Microbiol. PMC5775895.
Lin DL et al. (2023). Structural insights into MS2 lysis protein L and its interaction with DnaJ. PMC10688784.
Schilling T, et al. (2023). Engineering bacteriophage lysis proteins for enhanced activity. PubMed 36608652.
Lin YW, et al. (2017). MS2 lysis protein L: a glycoprotein tethered to the membrane by a single transmembrane segment. PMC5446614.
Lin DL, Leick M, Young R (2017). Lysis protein gene products specifically inhibit phage-mediated bacterial cell lysis. PMC5775895.