Subsections of Group Final Project
First team brainstorm
My teammates and I decided to split up the reading and help each other understand it. All of us would read the two overview-style review papers, and each of us would then take the paper aligned with our specialty.
My paper was - Generative design of novel bacteriophages with genome language models
Notes from the review papers: Phage Therapy: From Biologic Mechanisms to Future Directions and Phage Therapy: Past, Present and Future
As of now, the HTGAA group is focused on improving the MS2 phage, specifically on making its lysis protein better.
Phage therapy: past, present and future
Phage therapy is old, dating back to the early 20th century. Poor understanding of phage biology and political preferences pushed the West away from phages, although Georgia and Poland kept doing a lot of work on them.
Phages allow extremely directed targeting of bacterial cells. The major application is treating bacterial infections. A lack of clinical trials showing their efficacy has held research back. Open question: why have trials not shown efficacy?
Two types - lytic and temperate (lysogenic). Only strictly lytic phages should be used for therapy.
How to administer phages is not clear. Their concentrations change because they reproduce inside their host bacteria. Pharmacokinetic and pharmacodynamic studies are needed to predict their behaviour.
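Because phages replicate wherever their host bacteria are present, a dose does not simply decay the way a conventional drug does. The block below is a minimal sketch of that idea, not a validated pharmacokinetic model: a toy two-variable model (bacteria and free phage) integrated with a fixed-step Euler scheme. The function name `simulate_phage_dose` and every parameter value (growth rate, carrying capacity, adsorption rate, burst size, clearance rate) are illustrative assumptions, not numbers from the review.

```python
def simulate_phage_dose(b0, p0, hours, dt=0.01,
                        growth=0.7, capacity=1e9,
                        adsorption=1e-9, burst=50.0, clearance=0.1):
    """Euler-integrate a toy bacteria (B) / free phage (P) model.

    All parameters are hypothetical, per-hour illustrative values:
      dB/dt = growth * B * (1 - B / capacity) - adsorption * B * P
      dP/dt = burst * adsorption * B * P - clearance * P
    Returns a list of (time_h, bacteria, phage) tuples.
    """
    steps = int(round(hours / dt))
    b, p = float(b0), float(p0)
    trajectory = [(0.0, b, p)]
    for i in range(1, steps + 1):
        infections = adsorption * b * p          # infected cells per hour
        db = growth * b * (1.0 - b / capacity) - infections
        dp = burst * infections - clearance * p  # new virions minus clearance
        b = max(b + db * dt, 0.0)                # clamp: counts cannot go negative
        p = max(p + dp * dt, 0.0)
        trajectory.append((i * dt, b, p))
    return trajectory


if __name__ == "__main__":
    # Same kind of dose, with and without host bacteria present.
    no_host = simulate_phage_dose(b0=0, p0=1e6, hours=10)
    with_host = simulate_phage_dose(b0=1e6, p0=1e4, hours=24)
    print("final phage, no host:", no_host[-1][2])
    print("peak phage, with host:", max(p for _, _, p in with_host))
```

Under these assumed parameters, the phage count only falls through clearance when no host is present, while with a growing host population it rises well above the administered dose. That is the qualitative reason fixed-dose reasoning does not carry over directly.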
Major player - Locus Biosciences
Phage therapy: from biologic mechanisms to future directions
Phage host range varies: a phage can infect a whole genus or several genera, or just a single species.
Designing new phages
Problems:
- The only available phages are temperate
- The phage doesn't kill efficiently
- Many phage protein functions are unknown
It is best to make a phage that modulates the host response or reduces horizontal gene transfer of resistance.
Phage genome engineering
2 steps:
- Build
- Select or recover
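First, a regular (wild-type) phage infects a bacterium that carries our required sequence, and recombination adds that sequence to some of the newly produced phages. Then CRISPR-Cas counterselection removes the unedited phages, so the edited ones are enriched. A related build method is BRED (bacteriophage recombineering of electroporated DNA), in which phage DNA and the editing template are co-electroporated into the host.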
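The select step can be pictured as a filter over a mixed pool of phage genomes: any genome that still carries the wild-type target site is cut, and only edited genomes survive. The sketch below is a deliberately simplified illustration of that logic. It ignores PAM requirements, mismatch tolerance, and escape mutants, and the function names and the short sequences in the usage example are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_targeted(genome, protospacer):
    """True if the wild-type target site occurs on either strand (toy rule)."""
    genome = genome.upper()
    site = protospacer.upper()
    return site in genome or reverse_complement(site) in genome


def counterselect(pool, protospacer):
    """Keep only genomes that no longer contain the targeted site."""
    return [g for g in pool if not is_targeted(g, protospacer)]


def edited_fraction(pool, insert_seq):
    """Fraction of genomes in the pool that carry the desired insert."""
    if not pool:
        return 0.0
    marker = insert_seq.upper()
    return sum(marker in g.upper() for g in pool) / len(pool)


if __name__ == "__main__":
    wild = "ATGCCGTACGATTAGC"     # hypothetical wild-type fragment
    edited = "ATGCGGAATTCTTAGC"   # hypothetical edit replacing the target site
    pool = [wild, wild, edited, wild]
    print(edited_fraction(pool, "GGAATTC"))
    print(edited_fraction(counterselect(pool, "CCGTACGA"), "GGAATTC"))
```

In this toy setup the edited genome starts as a minority of the pool and becomes the only survivor after counterselection, which is the enrichment described in the select step.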
Synthetic phage genomics
Generative design of novel bacteriophages
Genomic language models are a new strategy for designing biological systems. This paper tests their ability to generate functional sequences at the scale of whole genomes.
Models used - Evo 1 and Evo 2 as of 2024/2025 (see Evo 2: DNA Foundation Model | Arc Institute)
Chassis - lytic phage ΦX174
Results:
- AI-generated genomes gave rise to 16 viable phages with substantial evolutionary novelty.
- One of them was very evolutionarily distant from the commonly known phages.
This is important because phages are usually administered as cocktails of several phages, not as single ones.
Pros:
- Many of the phages were better at overcoming ΦX174 resistance in E. coli strains.
Page 25 is very useful for computational methods. It is a great example for the group project.
Method:
- Used special taxonomic labels as prompts
- The taxonomic labels used in pretraining generated phage-like sequences
- Prompts corresponded to 3 major viral realms
Example prompts: |r__Duplodnaviria;k__, |r__Monodnaviria;k__, and |r__Riboviria;k__
- The novelty was checked using nucleotide BLAST against the core_nt database.
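For the group project, the two computational pieces above could be prototyped roughly as follows. This is a sketch under stated assumptions, not the paper's pipeline: `realm_prompt` simply reproduces the prompt pattern from the examples in these notes, and `best_hits` assumes BLAST results were saved in the standard 12-column tabular format (BLAST+ `-outfmt 6`: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). Both function names are hypothetical.

```python
REALMS = ("Duplodnaviria", "Monodnaviria", "Riboviria")


def realm_prompt(realm):
    """Build a taxonomic-label prompt in the pattern noted above."""
    return f"|r__{realm};k__"


def best_hits(blast_tabular):
    """Parse BLAST tabular text and keep the top hit per query.

    Assumes the default 12 tab-separated columns of `-outfmt 6`.
    Returns {query_id: (subject_id, percent_identity, bitscore)},
    choosing the hit with the highest bitscore for each query.
    Queries with no hit at all do not appear in the result.
    """
    best = {}
    for line in blast_tabular.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        qseqid, sseqid = fields[0], fields[1]
        pident, bitscore = float(fields[2]), float(fields[11])
        if qseqid not in best or bitscore > best[qseqid][2]:
            best[qseqid] = (sseqid, pident, bitscore)
    return best


if __name__ == "__main__":
    for realm in REALMS:
        print(realm_prompt(realm))
```

A lower percent identity for the best hit suggests a more novel generated genome, and a query that is missing from the result had no hit in the database at all, so it is worth tracking the full list of query IDs separately.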
EVO Frontier model
Runs on a non-transformer architecture - StripedHyena. This architecture does better on extremely long-range tasks and takes up less memory.