Week 13

Node participant note: I am a remote Genspace node listener based in Nigeria without onsite lab access. The Week 13 lab was a continuation of final project work at Genspace nodes. I engaged with the AI+SynBio lecture content fully and document my reflections below, with particular attention to how the tools covered in this lecture connect to my own computational work throughout the course.

Class Assignment — Week 13


Part A. AI and Synthetic Biology: Course Notes

Renee Wegrzyn and the AI-Biology Interface

The framing Renee Wegrzyn brought to this week, that AI is not replacing the biologist but is expanding the design space the biologist can responsibly explore, is one I have lived through concretely across this course. By the time I reached Week 13, I had run ESMFold on TolC, clustered 250 protein sequences in latent space, designed peptide binders for SOD1-A4V, and generated ProteinMPNN sequences against a fixed backbone. Every one of those steps would have been inaccessible to me four months ago, not because the biology was unknown, but because the computational infrastructure was unavailable, too slow, or too expensive for a student working from Nigeria on a consumer laptop and Google Colab.

What changed is not the biology. What changed is that the inference cost of structure prediction collapsed, and the tooling became accessible to remote participants without institutional compute. That shift is what makes AI+SynBio a genuinely global development rather than a tool that further concentrates capability in well-resourced institutions.

What These Tools Can and Cannot Do: Evidence from My Own Work

The Week 5 cross-reference between ESM2 log-likelihood ratio (LLR) scores and experimental lysis data for the MS2 L-protein remains the clearest demonstration of these limits I have encountered. K50 was the highest-scoring position in the entire ESM2 deep mutational scan, meaning the model rated it the most tolerant of substitution. Yet every experimentally tested K50 substitution abolished lysis. The language model had no access to the mechanistic information that made K50 functionally non-negotiable. It scored substitutability based on evolutionary co-occurrence patterns, which is a genuinely different question from biochemical necessity.
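To make concrete what an LLR score does and does not measure, here is a minimal sketch of one common definition: the log-probability the model assigns to the mutant residue minus the log-probability it assigns to the wild-type residue at the same position. The function name `llr` and the probabilities in `toy_probs` are hypothetical and chosen only for illustration; they are not values from my actual scan, and the scoring scheme I ran in Week 5 may differ in detail.

```python
import math

def llr(probs, wild_type, mutant):
    """Log-likelihood ratio of a substitution at one position.

    probs: dict mapping amino-acid letter -> model probability at that position.
    A value above zero means the model finds the mutant more plausible than
    the wild-type residue; a value below zero means less plausible.
    """
    return math.log(probs[mutant]) - math.log(probs[wild_type])

# Hypothetical probabilities for a single position (NOT values from the real scan).
toy_probs = {"K": 0.10, "R": 0.20, "A": 0.05}

score_r = llr(toy_probs, wild_type="K", mutant="R")  # log(0.20 / 0.10), above zero
score_a = llr(toy_probs, wild_type="K", mutant="A")  # log(0.05 / 0.10), below zero
print(f"K->R: {score_r:+.3f}  K->A: {score_a:+.3f}")
```

Nothing in this calculation refers to function. The score is a statement about sequence plausibility under the model, which is why a position can score as highly substitutable and still be essential for lysis.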

This is not a reason to distrust AI tools. It is a reason to use them correctly: as filters that reduce the experimental search space, not oracles that replace it. The ESM2 prediction at K50 is a false positive. But ESM2 also correctly identified positions 45, 46, and 63 as tolerant, and substitutions at all three were experimentally confirmed as lysis-competent. The tool is useful. It is not sufficient.
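The filter-not-oracle point can be expressed as a small cross-reference between predictions and experimental outcomes. The sketch below encodes only the four position-level results named in the paragraph above; the field names and the helper `split_predictions` are hypothetical, and the tally says nothing about the rest of the scan.

```python
# Position-level outcomes exactly as described in the text; nothing else is assumed.
# predicted_tolerant: ESM2 scored the position as substitutable.
# lysis_retained: tested substitutions at that position kept lysis activity.
observations = [
    {"position": 45, "predicted_tolerant": True, "lysis_retained": True},
    {"position": 46, "predicted_tolerant": True, "lysis_retained": True},
    {"position": 50, "predicted_tolerant": True, "lysis_retained": False},
    {"position": 63, "predicted_tolerant": True, "lysis_retained": True},
]

def split_predictions(rows):
    """Separate predicted-tolerant positions into confirmed hits and false positives."""
    confirmed = [r["position"] for r in rows
                 if r["predicted_tolerant"] and r["lysis_retained"]]
    false_positives = [r["position"] for r in rows
                       if r["predicted_tolerant"] and not r["lysis_retained"]]
    return confirmed, false_positives

confirmed, false_positives = split_predictions(observations)
print("confirmed tolerant:", confirmed)
print("false positives:", false_positives)
```

Only the experimental column can separate the two lists. The prediction column is identical for all four rows, which is exactly why the model output narrows the search but cannot close it.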

The same principle applies to ÌṢỌ. AlphaFold2/ESMFold gives me a structural model of the TolC-MccH47 export pathway. PeptiVerse gives me predicted solubility and haemolysis scores. Tellurium gives me ODE dynamics under assumed parameters. None of these replaces the experiment that would confirm whether MccH47 is actually exported and active in EcN at the tetrathionate concentrations I have modelled. The models are load-bearing design tools, not substitutes for wet-lab validation.
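To show what "ODE dynamics under assumed parameters" means in practice, here is a minimal sketch of an inducible-expression model of the general kind such a simulation might contain. It is not my Tellurium model: it uses plain Python with forward Euler integration so that it runs without dependencies, and every name and number in it (`beta`, `gamma`, `k_half`, `n`, and the signal levels) is a hypothetical placeholder. `signal` stands in for the inducer concentration and the returned value for the circuit output.

```python
def hill(signal, k_half, n):
    """Hill activation term, ranging from 0 to 1."""
    return signal**n / (k_half**n + signal**n)

def steady_state(signal, beta=1.0, gamma=0.1, k_half=1.0, n=2.0):
    """Analytical steady state of dP/dt = beta * hill(signal) - gamma * P."""
    return beta * hill(signal, k_half, n) / gamma

def simulate_output(signal, beta=1.0, gamma=0.1, k_half=1.0, n=2.0,
                    t_end=200.0, dt=0.01):
    """Integrate dP/dt = beta * hill(signal) - gamma * P with forward Euler.

    All parameter defaults are illustrative assumptions, not fitted values.
    """
    production = beta * hill(signal, k_half, n)  # constant for a fixed signal
    p = 0.0
    for _ in range(int(round(t_end / dt))):
        p += dt * (production - gamma * p)
    return p

low = simulate_output(signal=0.1)    # weak induction
high = simulate_output(signal=10.0)  # near-saturating induction
print(f"low signal: {low:.3f}  high signal: {high:.3f}")
```

The simulated response depends entirely on the assumed parameters. Changing `k_half` or `gamma` shifts the predicted output at any given signal level, which is why the model can guide the design but cannot confirm export or activity.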


Part B. ÌṢỌ — AI Tool Audit

A retrospective mapping of which AI tools shaped which design decisions:

| Tool | Week(s) used | Decision it shaped |
|---|---|---|
| ESMFold | 4, 7 | TolC structure validation; MccH47 fold confidence |
| ESM2 (mutational scan) | 4, 5 | TolC constraint mapping; L-protein mutation selection |
| ProteinMPNN | 4 | TolC backbone-compatible sequence design |
| AlphaFold3 | 5 | SOD1-A4V binder structural confidence (ipTM) |
| PepMLM | 5 | SOD1-A4V candidate generation |
| moPPIt | 5 | Multi-objective optimised SOD1 binders |
| PeptiVerse | 5 | Multi-property therapeutic evaluation |
| Tellurium (ODE) | 7 | ÌṢỌ biosensor response circuit dynamics |
| ColabFold AF2-Multimer | 5 | MS2 L-protein octameric pore modelling |

No single tool drove a design decision alone. Every row in this table represents a step in an integrated pipeline where the output of one tool was interrogated against independent evidence before being acted on.


Works Cited

Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., … Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2

Lin, Z., Akin, H., Rao, R., Hie, B., Zhu, Z., Lu, W., Smetanin, N., Verkuil, R., Kabeli, O., Shmueli, Y., dos Santos Costa, A., Fazel-Zarandi, M., Sercu, T., Candido, S., & Rives, A. (2023). Evolutionary-scale prediction of atomic-level protein structure with ESMFold. Science, 379(6637), 1123–1130. https://doi.org/10.1126/science.ade2574

Abramson, J., Adler, J., Dunger, J., Evans, R., Green, T., Pritzel, A., Ronneberger, O., Willmore, L., Ballard, A. J., Bambrick, J., Bodenstein, S. W., Evans, D. A., Hung, C.-C., O’Neill, M., Reiman, D., Tunyasuvunakool, K., Wu, Z., Žemgulytė, A., Arany, Z., … Jumper, J. M. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 630(8016), 493–500. https://doi.org/10.1038/s41586-024-07487-w

AI Prompts Employed (Claude AI)

  • Retrospectively map which AI tools shaped which design decisions in the ÌṢỌ project and what evidence each decision rested on
  • Explain the distinction between what ESM2 measures (evolutionary substitutability) and what experimental lysis data measures (biochemical necessity) using K50 as the specific example