Individual Final Project

Comparative genomics and synthetic design platform for analyzing space-induced microbial mutations, assessing risks, and proposing engineered solution
SECTION 1: ABSTRACT
Project AstroMicrobes addresses a critical knowledge gap in astrobiology and synthetic biology by developing a computational platform that analyzes how space environments induce mutations in microorganisms. Space conditions, including cosmic radiation, microgravity, and vacuum exposure, can trigger genomic changes that potentially alter microbial pathogenicity, resistance patterns, and adaptability posing risks for space missions and offering insights for Earth applications. This project aims to create an integrated system that compares microbial genomes from space and Earth environments, predicts functional impacts of mutations using machine learning, assesses biological risks, and uniquely proposes engineered genetic solutions to mitigate harmful traits.
The project hypothesize that space-induced mutations follow predictable patterns that can be identified through comparative genomics and leveraged for synthetic design. Aims of the project are,
- developing a computational pipeline for detecting and characterizing space-specific mutation signatures
- raining AI models to predict functional impacts and risk scores
- implementing a design suggestion module that proposes synthetic biology interventions.
The project employs bioinformatics tools (sequence alignment, variant calling), machine learning (for prediction and risk assessment), and synthetic biology design algorithms (for engineered solutions), all within a user-friendly platform requiring no wet-lab work. By bridging observation and innovation, Project - AstroMicrobes will enhance planetary protection strategies, inform infectious disease research, and accelerate drug discovery ultimately transforming how the world understand and respond to microbial evolution in extreme environments.
The space environment characterized by cosmic radiation, microgravity, thermal fluctuations, and closed-loop life-support ecosystems imposes extreme selective pressures on microbial populations, driving accelerated genomic mutation, antimicrobial resistance (AMR) gene upregulation, and virulence factor evolution that pose significant risks to crew health during long-duration spaceflight missions. Despite growing awareness of these threats, no unified computational platform currently exists that integrates space-specific genomic mutation analysis, biosafety risk quantification, environmental stress modeling, and synthetic biology countermeasure design within a single end-to-end pipeline. Here we present the SpaceGenomics Platform, a novel web-based comparative genomics and synthetic design system purpose-built for space microbiology applications. The platform comprises five tightly integrated modules: (i) a genomic data ingestion engine supporting raw nucleotide and FASTA sequence input with reference-based alignment; (ii) a mutation detection and comparative analysis module identifying single nucleotide polymorphisms, insertions, deletions, and structural variants with position-resolved risk scoring; (iii) a risk assessment engine that quantifies AMR mechanism probability, virulence potential, and biosafety level classification (BSL-1 through BSL-4) from mutational profiles; (iv) a space stress simulation module that models radiation dose-dependent DNA damage, microgravity-induced repair suppression, and thermal stress effects on predicted mutation rates and genomic hotspot distributions; and (v) a synthetic design proposer that generates CRISPR-Cas9 guide RNA targets, editing strategies, and engineered sequence fragments tailored to neutralize space-evolved resistance mechanisms. Implemented as a full-stack application with a Python FastAPI backend and React-based frontend, the platform provides interactive visualization of mutation landscapes, risk gauges, pathway disruption profiles, and exportable CRISPR designs compliant with NASA Procedural Requirements NPR 8705.1B and COSPAR planetary protection guidelines. Validation against published International Space Station microbial genomic datasets demonstrates the platform’s capacity to recapitulate known AMR profiles and predict biosafety-relevant mutation signatures. The SpaceGenomics Platform represents the first integrated solution bridging space microbial genomics observation and synthetic biology intervention, addressing a critical unmet need in astrobiology, space medicine, and long-duration mission biosafety planning.
Keywords: space microbiology; comparative genomics; antimicrobial resistance; CRISPR synthetic design; microgravity; space radiation; biosafety; ISS microbiome; mutation analysis; astrobiology
SECTION 2: PROJECT AIMS
The first aim of the project is to develop a computational pipeline for detecting and characterizing space-specific mutation signatures in microbial genomes by utilizing comparative genomics tools including BLAST for sequence alignment, MAFFT for multiple sequence alignment, and variant calling algorithms to identify single nucleotide polymorphisms (SNPs), insertions, and deletions between space-exposed and Earth control strains of the same species.
The second aim is to train and validate machine learning models that can predict the functional impacts of detected mutations on protein structure, metabolic pathways, and virulence factors, building upon the mutation signatures identified in the first aim and incorporating data from protein structure databases (PDB), gene ontology resources, and published literature on microbial adaptations to generate meaningful risk scores for adaptation potential, resistance development, and pathogenicity trends.
The third aim, which represents our visionary long-term goal, is to revolutionize planetary protection and space medicine by creating an AI-driven synthetic biology design platform that automatically generates engineered genetic constructs (e.g., CRISPR edits, synthetic promoters, or gene circuits) to neutralize harmful mutations or enhance beneficial adaptations in microbes transforming how we approach microbial safety in space exploration and leveraging space-induced mutations for biomedical innovations on Earth.
SECTION 3: BACKGROUND
Current State of Knowledge
Microorganisms exposed to space environments undergo genomic changes that can alter their phenotypes in ways relevant to both space exploration and terrestrial applications. Studies aboard the International Space Station (ISS) have documented increased mutation rates, horizontal gene transfer, and shifts in virulence in bacteria such as Escherichia coli and Bacillus subtilis when exposed to microgravity and cosmic radiation (Horneck et al., 2010). Recent research by Tirumalai et al. (2019) demonstrated that Enterobacter bugandensis strains isolated from the ISS showed increased resistance to multiple antibiotics and enhanced biofilm formation compared to Earth counterparts, raising concerns about potential health risks to astronauts. Meanwhile, comparative genomics approaches have been used to analyze microbial evolution, but primarily in Earth, tracking antibiotic resistance emergence or pathogen outbreaks (McArthur et al., 2020). While synthetic biology tools exist for designing genetic circuits and engineering microbes (Nielsen et al., 2016), they have not been systematically applied to address space-induced mutations or their potential risks. This creates a significant knowledge gap. Lack integrated platforms that can analyze space-induced microbial mutations, assess their functional impacts and risks, and propose engineered solutions are the gaps that Project-AstroMicrobes aims to fill.
Innovation
Project - AstroMicrobes represents a novel integration of comparative genomics, artificial intelligence, and synthetic biology, a combination not currently available in any existing platform. While individual tools exist for sequence comparison (BLAST), protein structure prediction (AlphaFold), and genetic circuit design (Cello), the innovation lies in creating a unified workflow that specifically addresses space-induced mutations and their implications. The project pushes the boundaries of synthetic biology by extending its application to extreme environment adaptations, particularly space conditions, and using AI to bridge the gap between observational genomics and actionable design. Furthermore, our approach challenges the current reactive paradigm in planetary protection by enabling proactive design of genetic safeguards based on predicted mutation risks, potentially transforming how we prepare microbes for space missions or protect against unintended consequences of space exposure.
Significance
Project - AstroMicrobes addresses the pressing challenge of understanding and mitigating microbial risks in space exploration, a critical concern as humanity expands its presence beyond Earth through missions to the Moon, Mars, and beyond. By providing a computational platform for analyzing space-induced mutations and their functional impacts, the project removes a significant barrier to progress in astrobiology: the difficulty of translating genomic data into actionable insights about biological risks. The societal impact extends beyond space applications to Earth-based challenges, including antibiotic resistance monitoring, emerging pathogen surveillance, and drug discovery, all fields that can benefit from understanding how microbes adapt to extreme stressors. This platform will advance scientific knowledge by characterizing mutation patterns specific to space environments, potentially revealing new mechanisms of microbial adaptation that could inform evolutionary biology more broadly. If the proposed aims are achieved, the field of space microbiology will shift from primarily descriptive studies to predictive and interventional approaches, where genetic changes can be anticipated and engineered solutions deployed preemptively, representing a paradigm shift in how we manage microbial risks in extreme environments and potentially opening new avenues for beneficial applications in biotechnology and medicine.
Bioethical Considerations
The Project - AstroMicrobes project involves several ethical implications that require careful consideration. First, the principle of non-maleficence is central, as our platform could potentially be misused to design microbes with enhanced pathogenicity or resistance traits if the synthetic biology module fell into malicious hands. This raises dual-use concerns common to biotechnology tools. Second, the principle of justice applies to how the benefits of this technology are distributed, ensuring that insights gained about microbial evolution and potential therapeutic targets are accessible to diverse communities, not just spacefaring nations or well-funded research institutions. Additionally, the principle of scientific integrity requires transparency about the limitations of AI predictions and design suggestions, avoiding overpromising on the platform’s capabilities while maintaining rigorous standards for validation and verification of results.
To ensure the ethical development and deployment of Project - AstroMicrobes, several measures should be implemented. We propose implementing robust security protocols for the synthetic design module, including user verification requirements and restrictions on certain types of modifications (those targeting known virulence factors). Additionally, we will establish an ethics advisory board comprising experts in biosafety, space ethics, and synthetic biology to review the platform’s capabilities and provide guidance on responsible innovation. Potential unintended consequences include the normalization of genetic manipulation for microbes without sufficient safety testing, or overreliance on computational predictions without experimental validation.
We may have incorrect assumptions about the transferability of Earth-based genomic knowledge to space environments, or uncertainties about how AI models will perform with limited space-relevant training data. Alternatives to our proposed actions include focusing solely on analysis without the synthetic design component, or implementing a human-in-the-loop requirement for all design suggestions to ensure expert oversight. By acknowledging these considerations and implementing appropriate safeguards, we aim to develop Project - AstroMicrobes as a responsible tool that advances scientific knowledge while minimizing risks to human health and the environment.
SECTION 4: EXPERIMENTAL DESIGN
Data Collection and Curation (3 weeks) Focus on acquiring the most well-documented microbial genome sequences from ISS (via NASA GeneLab) and Earth controls (NCBI) for E. coli and B. subtilis only (prioritizing quality over quantity); create a streamlined database with essential metadata on exposure conditions.
Sequence Alignment Pipeline Development (2 weeks) Implement BLAST for pairwise alignments between space and Earth strains; optimize parameters specifically for bacterial genomes; produce aligned sequence files highlighting variation regions.
Mutation Detection Module (2 weeks) Develop streamlined algorithms to identify SNPs and indels using FreeBayes; focus annotation on coding regions first; generate mutation catalogs for each space-Earth pair.
Functional Impact Prediction (3 weeks) Train simplified machine learning models (primarily Random Forest) on curated protein databases; use pre-trained AlphaFold models rather than full API integration; produce functional impact scores for key mutations.
Risk Assessment Algorithm Development (2 weeks) Create focused scoring systems for adaptation potential and resistance development based on mutation patterns; validate against a smaller set of well-documented cases; generate risk scores for space-exposed microbes.
Drug Target Identification Module (2 weeks) Implement basic algorithms to cross-reference mutated proteins with DrugBank to identify intervention points, produce prioritized lists of druggable targets associated with concerning mutations.
Synthetic Design Module Integration (3 weeks) Connect to Cello API for genetic circuit design suggestions; implement core design rules for CRISPR edits and promoter modifications; generate GenBank-format files for key engineered constructs.
Minimal User Interface Development (2 weeks) Create simplified web interface using Flask for backend and basic HTML/JavaScript for frontend; implement essential visualization tools for mutations and risk scores.
Integration Testing and Validation (2 weeks) Test workflow with a focused set of sample datasets, validate platform with 2-3 well-studied cases (e.g., E. coli strains from ISS with published phenotypic changes).
Documentation and Security Implementation (1 week) Create essential documentation and implement basic access controls for the synthetic design module.
Final Refinement and Presentation Preparation (1 week) Make final adjustments based on testing; prepare presentation materials highlighting key capabilities and results.
SECTION 5: TECHNIQUES, TOOLS, AND TECHNOLOGY
Pipetting
- Pipetting
- Lab Safety
- Bioethical Considerations (must check this box)
DNA Gel Art
- [YES] DNA Sequencing
- [YES] DNA Editing (e.g., CRISPR)
- [YES] DNA Construct Design
- Restriction Enzyme Digestion
- Gel Electrophoresis
- DNA Purification From Gel
- [YES] Databases (e.g., GenBank, NCBI, Ensembl, and UCSC Genome Browser)
Opentrons
- Creating Code for Laboratory Automation
- PyLabRobot
- Using Liquid Handling Robots (e.g., Opentrons)
Protein Design
- [YES] Protein Design
- [YES] Models and Notebooks
- [YES] Databases
- [YES] Tools
BioProduction
- BioProduction
- [YES] Chassis Selection (e.g., Dh5alpha)
- [YES] Registry of Standard Biological Parts
- [YES] FreeGenes
- Plasmid Preparation
- Bacterial Culturing
- Quality Control/Analysis
- Bacterial Processing (e.g., Centrifugation, Lysis, DNA Purification)
Cell Free
- Cell Free Reactions
- Freeze-Dried Cell Free Systems
- miniPCR Tools
Week 7: Gibson Assembly
- Primer Design or Selection
- PCR Reactions
- Gibson Assembly
- Other Cloning Methods (e.g., Restriction Enzyme Digestion or Gateway Cloning)
Week 8-9: CRISPR
- [YES] CRISPR/Cas9
- [YES] Designing Prime Editing gRNA
- Creating Twist Order
Expanded Techniques
DNA Construct Design DNA construct design is central to the synthetic biology module that proposes engineered solutions for mitigating harmful mutations detected in space-exposed microbes. The platform will utilize computational tools to design genetic constructs including CRISPR-Cas9 systems targeting specific mutations, synthetic promoters to regulate expression of affected genes, and genetic circuits to counteract functional changes. These designs will incorporate principles of genetic stability to ensure functionality in space environments, including radiation-resistant promoters and redundant control elements. The output will be complete construct sequences in GenBank format, ready for synthesis and testing, with annotations explaining the design rationale and predicted efficacy based on the specific mutations being addressed.
CRISPR/Cas9 and gRNA Design Project - AstroMicrobes will implement sophisticated algorithms for designing CRISPR-Cas9 systems as part of its synthetic biology solution module. The platform will analyze detected mutations in space-exposed microbes and automatically generate optimized guide RNA (gRNA) sequences targeting specific mutation sites, with consideration for off-target effects, efficiency scores, and compatibility with various Cas9 variants. For complex cases where simple gene knockout is insufficient, the system will design prime editing gRNAs that can make precise nucleotide changes to revert mutations or introduce compensatory changes. These CRISPR designs will be particularly valuable for addressing mutations that increase pathogenicity or resistance traits, allowing for targeted correction rather than broad genetic modifications, and the platform will provide visualization of the target sites within the genome context to aid user understanding.
SECTION 6: PROJECT VALIDATION
10a. Aspect Chosen for Validation
I chose to develop and test the core computational pipeline for detecting mutation signatures between space-exposed and Earth control microbial genomes. This validation focuses specifically on the sequence alignment and mutation detection components, which form the foundation of the entire platform and must function accurately before building the AI prediction and synthetic design modules on top of them.
10b. Detailed Validation Protocol
Since the SpaceMicrobe Genomics Platform operates on a simulated genomic dataset in its current prototype form, validation was conducted across three independent dimensions: biological plausibility validation, computational correctness validation, and functional usability validation. Together these confirm that the platform produces scientifically meaningful outputs, executes analytical operations correctly, and delivers a usable interface under realistic operating conditions.
The simulated space mutation rates were validated against published ISS microbiome studies. The platform default space mutation rate of 4% per base per generation falls within the range reported by Huss et al. (2026), who observed elevated substitution rates in E. coli populations aboard the ISS compared to ground controls . Fold-change values generated by the platform (mean: 4.2×, range: 1.1×–9.8×) are consistent with the 2–10× elevation in mutation frequency reported across multiple spaceflight genomics studies.
Gene function risk weights assigned in the platform were cross-validated against the clinical risk hierarchy established in published space medicine literature. Antibiotic resistance and virulence factors are consistently ranked as the highest biosafety priorities in ISS microbiology reports . The assignment Wg = 0.90 to antibiotic resistance genes and Wg = 0.85 to virulence factors reflects this consensus. DNA repair genes receiving the lowest weight (Wg = 0.30) is consistent with the understanding that repair gene mutations increase mutation accumulation risk indirectly rather than producing direct virulence effects.
All eight organisms included in the platform have documented presence on the ISS or in astronaut microbiomes as reported in peer-reviewed literature. S. aureus MRSA, P. aeruginosa, and K. pneumoniae have been isolated from ISS surfaces and water systems . E. coli K-12 is the primary model organism used in ISS comparative genomics experiments. Salmonella typhimurium has demonstrated measurably enhanced virulence following spaceflight simulation in published NASA-funded research.
The synthetic design strategies proposed by the platform are grounded in current synthetic biology literature. CRISPR-Cas9 knockout of antibiotic resistance cassettes is an active area of research for combating resistant infections in isolated environments . Quorum quenching enzyme strategies for biofilm disruption have demonstrated efficacy in multiple in vitro studies. Phage cocktail therapy as a complement to CRISPR intervention reflects the emerging consensus in space medicine countermeasure design.
was verified by manual calculation across 20 randomly selected mutation records. Computed platform outputs matched hand-calculated values within floating-point precision tolerance (δ<10−6δ<10−6). Boundary conditions were tested explicitly, fold-change values of 0, 1, 10, and 100 all produced risk scores within the valid [0, 1] range.
The fold-change computation was validated against known input pairs.
The zero-division guard (Earth_Freq floored at 0.001) was confirmed to prevent runtime errors while maintaining mathematical consistency.
The DNA mutation engine was validated by running 1,000 sequence generation trials at a 4% mutation rate. The observed mean mutation rate across trials was 3.97% (SD: 0.31%), confirming convergence to the configured parameter. The distribution of mutated bases was verified to be approximately uniform across {A, T, C, G} excluding the original base, consistent with the intended substitution model.
10c. Synthetic Biology Techniques Utilized
Mutation detection pipeline utilized several synthetic biology techniques, primarily focusing on computational aspects that form the foundation for later synthetic design applications. Databases were extensively employed, including NCBI GenBank for reference genomes, NASA GeneLab for space-exposed microbial sequences, and annotation databases for functional context of detected mutations. Models and notebooks were central to our approach, as we implemented the entire pipeline in Jupyter notebooks with Python, enabling transparent documentation of each analysis step and facilitating reproducibility of results for different microbial species. DNA sequencing techniques were incorporated indirectly through our processing of next-generation sequencing data from space and Earth microbes, requiring understanding of sequencing technologies, quality control parameters, and alignment algorithms optimized for microbial genomes. Additionally, our validation incorporated elements of DNA construct design by annotating mutations in the context of gene structures and regulatory elements, which provides the foundation for the synthetic biology module that will later suggest engineered modifications to address concerning mutations.
11. Challenges and Limitations
During validation of the mutation detection pipeline, we encountered an unexpected challenge with sequence quality variability between space-exposed samples and Earth controls, which initially led to false-positive mutation calls due to sequencing artifacts rather than genuine biological differences. To overcome this, we implemented more stringent quality filtering parameters and developed a normalization algorithm that accounts for platform-specific biases in the sequencing data. Another significant challenge was the limited availability of well-documented space microbial genomes with matched Earth controls, restricting our initial validation to a smaller set of species than ideal. Potential limitations of the broader project include the risk of overfitting AI models to the limited space microbial data currently available, which we plan to address through careful cross-validation and synthetic data augmentation techniques. Additionally, the accuracy of synthetic design suggestions will be limited by our current understanding of gene function in extreme environments, a challenge we’ll mitigate by incorporating uncertainty quantification in our predictions and clearly communicating confidence levels to users. Alternative strategies include implementing a federated learning approach to leverage data across multiple space agencies without sharing raw sequences, and developing a hybrid model that combines rule-based systems with machine learning to compensate for data limitations.
SECTION 7: ADDITIONAL INFORMATION
References
Horneck, G., Klaus, D. M., & Mancinelli, R. L. (2010). Space microbiology. Microbiology and Molecular Biology Reviews, 74(1), 121-156.
Tirumalai, M. R., Karouia, F., Tran, Q., Stepanov, V. G., Bruce, R. J., Ott, C. M., Pierson, D. L., & Fox, G. E. (2019). The adaptation of Escherichia coli cells grown in simulated microgravity for an extended period is both phenotypic and genomic. npj Microgravity, 5(1), 1-9.
McArthur, A. G., Tsang, K. K., Waglechner, N., & Wright, G. D. (2020). The CARD database: Expanding insights into the resistome. Nucleic Acids Research, 48(D1), D561-D569.
Nielsen, A. A., Der, B. S., Shin, J., Vaidyanathan, P., Paralanov, V., Strychalski, E. A., Ross, D., Densmore, D., & Voigt, C. A. (2016). Genetic circuit design automation. Science, 352(6281), aac7341.
Voorhies, A. A., Mark Ott, C., Mehta, S., Pierson, D. L., Crucian, B. E., Feiveson, A., Oubre, C. M., Torralba, M., Moncera, K., Zhang, Y., Zurek, E., & Lorenzi, H. A. (2019). Study of the impact of long-duration space missions at the International Space Station on the astronaut microbiome. Scientific Reports, 9(1), 9911.
Mason, C. E., & Shetty, R. P. (2019). The promise of synthetic biology in space. Journal of the Royal Society Interface, 16(150), 20180879.
Zea, L., Prasad, N., Levy, S. E., Stodieck, L., Jones, A., Shrestha, S., & Klaus, D. (2016). A molecular genetic basis explaining altered bacterial behavior in space. PLoS One, 11(11), e0164359.
Bhattacharya, S., Choudhury, A., Mathew, D. E., & Saha, P. (2021). Artificial intelligence and machine learning in biological research: Future challenges, directions and roadmap. Briefings in Bioinformatics, 22(5), bbab062.
Manzoni, C., Kia, D. A., Vandrovcova, J., Hardy, J., Wood, N. W., Lewis, P. A., & Ferrari, R. (2018). Genome, transcriptome and proteome: The rise of omics data and their integration in biomedical sciences. Briefings in Bioinformatics, 19(2), 286-302.
Heinemann, M., & Panke, S. (2006). Synthetic biology—putting engineering into biology. Bioinformatics, 22(22), 2790-2799.