Week 4 HW: Protein Design Part 1

South American Rattlesnakes (Crotalus durissus terrificus) with Crotamine protein

Part A: Conceptual Questions

How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)

I intake approx. 5 * 10²³ Daltons of amino acids when ingesting 500 grams of meat. This is based off results indicating I ingest approx. 10²¹ Daltons of amino acids when ingesting 1 gram of meat

Why do humans eat beef but do not become a cow, eat fish but do not become fish?

Humans eat beef but don’t become cattle, and eat fish but don’t become fish because genetic information from the lifeform being ingested isn’t transferred wholesale. Much of the genetic material being eaten is broken down during digestion and more importantly a human beings’ cells follow instructions derived from their DNA. Human beings’ cells utilize amino acids from the lifeform being ingestion, but perform this utilization according to specific genetic instructions. The lifeform being ingested and its amino acids are the raw materials the cell uses for various means.

Why are there only 20 natural amino acids?

There are several broad reasons why there are only 20 standard natural amino acids. The first reason is that early on in the history of evolution, this group of amino acids became more or less ’locked in’, meaning that once the basic relationship between three letter codons and these 20 standard natural amino acids became widely distributed across the kingdom of life, it becamde too risky/dangerous from an evolutionary standpoint to alter this core set. Another reason is that the group of 20 gives enough range in structure and chemistry to build a large chunk of what evolution or directed evolution might desire. The other reasons seem to amount to various types of evolutionary trade-offs. Adding more than 20 amino acids to this standard set would add additional, potentially unwanted complexity, while decreasing the number of amino acids in the set might lead to issues with a lack of uniqueness with amino acids side chain sharing, which would in turn limit the functional flexibility of amino acids to do things like fold precisely.

Can you make other non-natural amino acids? Design some new amino acids.

Yes you can. My attempts to design some new amino acids usng SwissSideChain and the Cryo-EM structure of Receptor Tyrosine Kinase ROS1 PDB file in PyMol (open-source) are shown below:

Attempt at creating a non-natural amino acid residue mutation of Tyrosine Kinase ROS1 using cyclohexanecarboxylic acid

Attempt at creating a non-natural amino acid residue mutation of Tyrosine Kinase ROS1 using cyclopropanecarboxylic acid

Where did amino acids come from before enzymes that make them, and before life started?

Amino acids come from metabolic molecules within the cell. These metabolic molecules consist of carbon atom chemical backbone, inorganic nitrogen, and enzyme-facilitated chemical reactions. Before life as we know it started, amino acids originated from abiotic (not from living organisms) chemical reactions on Earth before the emergence of life as we know it. The chemical reactions occurred in the atmosphere, hydrothermal and oceanic vents, and via meteorite and comet (i.e., extraterrestrial) delivery

If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?

If I made an a-helix using D-amino acids, I’d expect left handedness because the majority of alpha helices are built from L-amino acids, which exhibit right handedness.

Can you discover additional helices in proteins?

Yes additional helices can be discovered in proteins. These can include non-alpha helices and even π-helices. These can be discovered via x-ray crystallography, structure databases, or with the aid of AI prediction tools.

Why are most molecular helices right-handed?

Most molecular helices are right-handed because L-amino acides force the helix backbones to conform to right-handedness. Selection pressures also favors this form of handedness in the helix for hydrogen atom bonding and overall functionality

Why do β-sheets tend to aggregate?

β-sheets tend to aggregate because the edges of β-sheets have exposed nitrogen and hydrogen (donor) atomic pairs and carbon-oxygen pairs (acceptor) atoms. What happens is that when sheets get in close proximity these atomic pairs snap together, almost like a zipper.
- What is the driving force for β-sheet aggregation?
  - The driving force for β-sheet aggregation is the attraction of the donor and acceptor atomic pairs
Supporting prompts for this section listed below:

Supporting Prompt	Model
What is a Dalton as a unit of measurement?	Perplexity
How many molecules of amino acids are located within a single gram of edible meat, on average? Do NOT hallucinate when answering this question	Perplexity
What is Avogadro’s number and why is it relevant within the context of the answer to the previous prompt? What does mol mean in the context of g/mol? Do NOT hallucinate when answering these questions	Perplexity
10^{21} * 500 equals how much?	Perplexity
From a biological and genetic standpoint why is it that when an organism ingests food, they don’t become the organism they’re ingesting (i.e., if a human being eats salmon, why don’t they become a salmon)? Where do amino acids fit into this explanation? Do NOT hallucinate when answering these questions	Perplexity
Why are there only 20 amino acids found in nature? Do NOT hallucinate when answering this question	Perplexity
To elaborate on a point made in the 5th part of the answer to the previous prompt: what is a side chain in the context of amino acids? Do NOT hallucinate when answering this question	Perplexity
Share me resources for designing non-natural amino acids rather simply/with relatively little friction or extensive technical knowledge	Google AI
How can I quickly design a non-natural amino acids? What internet resources exist to help me do this?	Google AI
Having issues downloading SwissSidechain into PyMol. I’m on a MacOS. Do I need Python downloaded on my local machine for successful downloading of SwissSidechain? Do NOT hallucinate when answering these questions	ChatGPT
It’s telling me it’s installed but that Swiss Sidechain is Not Loaded. I downloaded the SwissSidechain package. Please tell me how to proceed	ChatGPT
I’m using ‘3.1.6.1’, 3.0, 3000000, 1749441553, ‘963e47f43382e009c2cd391f0747a8c20ef108e7’, 0’ of PyMol	ChatGPT
Getting ModuleNotFoundError: No module named ‘swisssidechain’	ChatGPT
When I go to select the SwissSidechain folder from my Downloads, I only can open the file. I can’t select the entire thing. Is that an issue?	ChatGPT
Yeah when I try to just select the entire folder from PyMol I can’t do. How should I proceed with the install?	ChatGPT
Ok. Now I have only a .zip and am able to select it. Getting the following error: “Plugin “PySwissSidechain2” has been installed but initialization failed. Tell me how to proceed	ChatGPT
It says ‘Unable to find license file’. Not understanding what’s going on here	ChatGPT
No I don’t have a PyMol license	ChatGPT
How do I install Conda on Mac? Is Miniconda downloadable on the web?	ChatGPT
Should I download the graphical or command Miniconda install?	ChatGPT
Yes	ChatGPT
Getting “zsh: command not found: conda”	ChatGPT
For this code, how do I find my username? Users//miniconda3	ChatGPT
Getting “zsh: no such file or directory: /Users/j_d_r/miniconda3”	ChatGPT
Yes	ChatGPT
Getting “Do you accept the Terms of Service (ToS) for https://repo.anaconda.com/pkgs/main? [(a)ccept/(r)eject/(v)iew]:” What should I write?	ChatGPT
Getting “PackagesNotFoundError: The following packages are not available from current channels: - pymol”	ChatGPT
Ok	ChatGPT
Ok. I’ve opened up the Open-Source version of PyMol and installed SwissSideChain. Now tell me how to mutate some residues from some Protein Data Bank (PDB) structures based on one of the non-natural L- or D-sidechains of the SwissSidechain database. Basically I want to create some graphical representations of non-natural amino acids using SwissSideChain in the Open-Source version of PyMol and I would like you to provide me a step by step process for how to get these visuals (when the visuals are created I also want to be able to know their names and other relevant information)	ChatGPT
What does a PDB structure mean within the context of proteins?	Perplexity
What’s a common protein downloaded from RCSB PDB?	Perplexity
Not totally understanding how SwissSidechain works. Do I just install random .pdb online, open that in PyMol, and then run SwissSidechain against it?	ChatGPT
Ok. At RCSB PDB and am looking to download a PyMol compatible file extension. What file extension should I go for?	ChatGPT
What is the name of a .pdb file? PDBx/mmCIF?	ChatGPT
Whenever I click ‘PySwissSidechain’ from Plugin –> Legacy Plugins my Open-Source PyMol window fails and I see this in Terminal: “libc++abi: terminating due to uncaught exception of type NSException”. What’s going on? How do I proceed with running the SwissSidechain extension?	ChatGPT
Yes do this	ChatGPT
Seeing this command on SwissSidechain: “Command line Use the command: Mutate Object//Chain/ResNumber/, Newres For instance, to mutate residue number ‘82’ on chain ‘E’ in object ‘complex’ into Homoleucine (HLEU), write: Mutate complex//E/82/, HLEU” What does each piece of this command mean in biological speak? How do I find residues and non-natural amino acids to choose from?	ChatGPT
In SwissSideChain, what is the short code for the non-natural amino acid ‘Cha’ and ‘Ca’?	Perplexity
Amino acids are produced by enzymes, correct? DO NOT hallucinate when answering this question	Perplexity
So based on the answer to the last prompt, where exactly do amino acids come from? Where exactly do they originate? DO NOT hallucinate when answering this question	Perplexity
When we say ’enzyme‑catalyzed pathway’ we mean a chemical reaction that an enzyme speeds up, correct? Do NOT hallucinate when answering this question	Perplexity
What is an enzyme pathway in the context converting a precursor into an amino acid? Is it simply a chemical reaction? Something else? Do NOT hallucinate when answering this question	Perplexity
In the answer 3 prompts ago, there was mention of “Carbon Skeleton” and “Intermediates of glycolysis, the citric acid (TCA) cycle, and the pentose phosphate pathway” when describing them. What does this mean in simple terms? Are we saying these provide the chemical structure of an amino acid to keep it fundamentally sound? Do NOT hallucinate when answering these questions	Perplexity
Ok. Based on available literature and if relevant/necessary, the information shared in response to the previous prompts, how did amino acids originate before the emergence of life as we know it on this planet? Was it via chemical process in cyanobacteria or another form of bacteria or archaea? Do NOT hallucinate when answering these questions	Perplexity
An a-helix within the context of genomics and synthetic biology is an alpha helix correct? How do D-amino acids relate to alpha helices? Do NOT hallucinate when answering these questions	Perplexity
Not understanding. In the answer to the last prompt you stated, " In genomics and synthetic biology, “α-helix” or “a-helix” refers to the standard protein alpha helix: a right‑handed secondary structure formed by regular backbone hydrogen bonding, almost always built from L‑amino acids in natural proteins.” Then you said, “A chain made entirely of L‑amino acids naturally forms a right‑handed α‑helix” Did hallucination occur here? If so, where and how?	Perplexity
If this is the case, how can an α-helix using D-amino acids exhibit right-handedness? Not seeing how this would work? Do NOT hallucinate when answering this question	Perplexity
Explain the following: –The difference between how protein is described in a nutritional context and what proteins are/how they are described in a biological or biotechnological context –My hunch is additional helices in naturally-occurring proteins can be found, although I’m not sure how. Confirm or deny this hunch. If my hunch is correct, tell me why additional helices for proteins found in nature can be found and some methods for discovering these additional helices. Do NOT hallucinate when explaining these items. If you there’s risk of exaggeration or outputting something not confirmed or based on sources, don’t output it	Perplexity
What are molecular helices in a biological context? How do they differ from other types of helices (if at all)? Do NOT hallucinate when answering these questions	Perplexity
What’s the handedness of most molecular helices? What explains their handedness? Do NOT hallucinate when answering these questions	Perplexity
What does this mean in the answer to the last prompt? “L-amino acids dictate right-handed protein helices (opposite would clash sterically).” What are L-amino acids again? Does the L stand for lysate? Explain all this to me as if I was a reasonably educated 14 year-old. Do NOT hallucinate when answering these questionss	Perplexity
What are β-sheets in a biological context? Do we call them beta-sheets? What does the β stand for? What exactly are they? Do NOT hallucinate when answering these questions	Perplexity
What causes beta-sheets to come together as a larger aggregate? Is it molecular forces from difference ionic charges on atoms comprising the beta-sheets? Something else? Do NOT hallucinate when answering these questions	Perplexity
From the answer to the previous prompt, “H-bond donors (N-H) and acceptors (C=O)”" means hydrogen bond donors and carbon acceptors, correct? Do NOT hallucinate when answering this question. What does the C stand for? Do NOT hallucinate when answering these questions	Perplexity

Part B: Protein Analysis and Visualization

I selected the crotamine protein found in the South American rattlesnake, the protein’s abilities to penetrate cells as a toxin allows it to serve as a template for antivenom and potentially targeted therapies.
The amino acid sequence of the crotamine protein protein is below ¹
- AAF34911.1 crotamine [Crotalus durissus terrificus] MKILYLLFAFLFLAFLSEPGNAYKQCHKKGGHCFPKEKICLPPSSDFGKMDCRWRWKCCKKGSGK
- This protein is 65 amino acids long. Its most frequent amino acid in this protein is lysine, which appears 11 times
- There are 250 protein sequence homologs for the crotamine protein after running it in UniProt BLAST (ID: Q9PWF3)
- Yes, this protein belongs to the crotamine-myotoxin family.
Structural answers to this question are listed below:
- The protein structure was discovered in 2005 ². It appears to be a mixed bag from a structural quality standpoint based on the NMR Structure Validation Report below: Crotamine protein RCSB NMR Structure Validation Report
- No, there are no other molecules in the solved structure apart from the protein
- It’s a defensin-like protein of the myotoxin family
3D visualization software (NGL Outputs)
- “Cartoon”, “Ribbon”, and “Ball and Stick” combo
- Protein coloring by secondary structure. I think it might have more sheets than helices, although I’m not sure
- Protein by residue type. It seems like it has a good mix of hydrophobic vs hydrophilic residues
- Protein surface visualization. It doesn’t seem like it has a lot of holes, although I’m not sure

All supporting prompts listed below:

Supporting Prompt	Model
I have a FASTA file and a GenBank identifier for a protein I found on NCBI.gov. How do I then find the amino acid sequence for that given protein? What will it look like? Do NOT hallucinate when answering this question	Perplexity
Confirming that the beta-keratin 2 [Gekko gecko] protein has a 3D structure, correct?	Perplexity
Does the beta-keratin 2 [Gekko gecko] protein have a 3D structure?	Google AI
Understood. Are there any slug mucus proteins that have a 3d structure?	Google AI
Gotcha. Tell me if there are any serpent-derived proteins that have a 3D structure	Google AI
What is the K amino acid? Is it potassium?	Perplexity
What are main protein sequence homologs in the context of genomics and biotechnology? What do they look like?	Perplexity
Show me the format of protein sequence homologs in UniProt database. Show me what they usually look like. Do NOT hallucinate when producing this output	Google AI
Find me resolution info depicting the quality of the structure described in this tab	Gemini
So I’ve heard a good quality structure has higher resolution. How can I obtain information on the quality o the structure from this tab? Is it possible?	Gemini
Based on the scores in the 2nd page of this tab, what is the quality of the structure? Where exactly does the black dot reside here?	Gemini
Based on this tab, tell me if there are any other molecules in the structure of the Crotamine protein beyond the protein itself? If there are, what are they? Do NOT hallucinate when answering this question	Gemini
Where can I find the structure classification family for this protein in this tab? Show me where I can find this	Gemini
Tell me what I’m looking at and how it helps determine the structure classification family of the crotamine protein	Gemini
Tell me if and/or how I can view this protein as a “cartoon” “ribbon” and “ball and stick” using this NGL viewer tool	Gemini
See the plus icon but can’t click on it	Gemini
I want to color the protein by secondary structure to determine whether or not it has more helices or shets. How can I do this in this viewer?	Gemini
Can you see the successful secondary structure output here? If not, say so	Gemini
I now want to color the protein by residue type to determine whether or not it has more helices or shets. How can I do this in this viewer?	Gemini
I no longer want to see secondary structures or helices or sheets. I want to see the protein’s residue types	Gemini
Right now my chain-related Color Scheme options are ChainID, ChainIndex, and ChainName. Which one should I choose?	Gemini
Yes it seems to be recognizing the entire structure as Chain A. Does this mean that most of the residue type is of 1 type? And does this mean that the residues are mostly hydrophobic or hydrophilic?	Gemini
Ok – now how would I visualize the surface of the protein. How can I see if it has holes or binding pockets?	Gemini

Part C: Using ML-Based Protein Design Tools

C1. Protein Language Modeling

Deep Mutational Scans
- See picture below: Crotamine Protein Deep Mutational Scan
- There seem to be these really interesting columns that occur as one moves rightward past position 20 in the heatmap. There is a single almost uniform top-down color for these columns, which according to Gemini, indicate a special sensitivity. These are apparently disulfide bridges, which are quite important for holding the protein together (these are referred to as ‘anchors’)
Latent Space Analysis
- See picture below Non-Crotamine tagged point cloud showing embedded proteins
- Yes they do–there are approximations of similar proteins found throughout the cloud. Crotamine tagged point cloud showing embedded proteins and Crotamine’s location among embeddings
- There are other proteins at the edge of the cloud that belong to either other viruses, or other organisms that might be found in similar locations/habitats to the South American rattlesnake (mice, chickens, humans), or other types of toxins or species carrying toxins. See picture below: Crotamine tagged point cloud showing embedded proteins and Crotamine’s location relative to its neighborhood

C2. Protein Folding

Approach 1: It doesn’t look like the predicted coordinates match the original structure in the PDB much at all (see pictures below). Think this might be due to the fact that the original inputted structures in the PDB for this protein were apparently a bit of a mixed bag according to its NMR Structure Validation Report results (see previous section). Crotamine Protein PDB < > ESM Side-by-Side Structural Comparison (1)
Crotamine Protein PDB < > ESM Side-by-Side Structural Comparison (2)
Approach 2: Asked Gemini about this discrepancy and was recommended to input the mature sequence containing the last 42 residues. The result still visually seems to have some discrepancies compared with the original PDB visual upon first glance, but far less than the outputs under Approach 1 (see photo below) Crotamine Protein PDB < > ESM Side-by-Side Structural Comparison (3) – Post-Mature Sequence Input
I tried the following mutations:
- Cytesine Break: Per Gemini suggestion, replaced all the C residues to A (alanine). This notably decreased the predicted Local Difference Test (pLDDT) score of the outputed mutated sequence and the visual also seemed to indicate less structural integrity to the protein, implying less resilience to this mutation (see photo below) Crotamine Cytesine Break Mutation Results
- Charge Swap: Per Gemini suggestion, replaced all the K (Lysine) and R (Arginine) residues to D (Aspartic Acid). This decreased the pLDDT score of the outputed mutated sequence a bit, but less than the Cytesine Break, indicating the protein was more comparably resilient to this mutation (see photo below) Crotamine Charge Swap Mutation Results

C3. Protein Generation

Based on the heatmap below, there seems to be a difference between the predicted sequence probabilities here and the original heatmap generated in the ‘Deep Mutational Scans’ subsection. 48 of the 65 positions were changed and the sequence recovery rate was around 0.26, which didn’t seem all that promising Crotamine ProteinMPNN Results (1)
Visually, this re-inserted ESMFold output also looks structurally different than the original protein structure (see photo below) Crotamine ProteinMPNN Results (2)

All supporting promopts for this section listed below:

Supporting Prompt	Model
Want to choose a GPU to run this. Believe the ability to select GPUs should be in the bottom right but am not seeing this. Direct me to where I should go on this page for GPU selection	Gemini
Looking over the heatmap under the ‘Mutation Scans’ section of the code, what stands out regarding the inputted protein sequence (MKILYLLFAFLFLAFLSEPGNAYKQCHKKGGHCFPKEKICLPPSSDFGKMDCRWRWKCCKKGSGK)? Are there any particular mutations or patterns that stand out and why? Do NOT hallucinate when answering this question	Gemini
I want to make sure I perform the Latent Space Analysis located under ‘Latent Space Analysis’ correctly, as in I want to make sure I perform Latent Space Analysis on the specific protein sequence I inputted under the previous section ‘Mutation Scans’. Is there anything I need to do or input? Does the code in ‘Latent Space Analysis’ just take the specific protein sequence I inputted under the previous section ‘Mutation Scans’ and run with it? Do NOT hallucinate when answering this question	Gemini
I want to make sure I perform the Latent Space Analysis located under ‘Latent Space Analysis’ correctly, as in I want to make sure I perform Latent Space Analysis on the specific protein sequence I inputted under the previous section ‘Mutation Scans’. It looks like I just inputted some incorrect code in the cell under ‘Latent Space Analysis’ that starts with ‘Latent Space Analysis’ Fix the errors in the code and make sure to input the appropriate information so the protein sequence from the previous ‘Mutation Scans’ cell can be analyzed here. Do NOT hallucinate when addressing this query	Gemini
Ok, so it looks like there’s output in the ‘3D T-SNE visualization of Protein Sequence Embeddings’ subsection of the ‘Latent Space Analysis’ section. Help me understand where my initially inputted Crotamine protein sequence (MKILYLLFAFLFLAFLSEPGNAYKQCHKKGGHCFPKEKICLPPSSDFGKMDCRWRWKCCKKGSGK) is located within the 3D plot. Do NOT hallucinate when addressing this query	Gemini
Ok, so it looks like there’s output in the ‘3D T-SNE visualization of Protein Sequence Embeddings’ subsection of the ‘Latent Space Analysis’ section. Help me understand where my initially inputted Crotamine protein sequence (MKILYLLFAFLFLAFLSEPGNAYKQCHKKGGHCFPKEKICLPPSSDFGKMDCRWRWKCCKKGSGK) is located within the 3D plot. It just seems like there are a lot of protein sequences in this plot and while I can drag over and find things manually, I’m not sure where to start or how to efficiently find the area in this plot where the inputted sequence is located. Any assistance would be useful. Do NOT hallucinate when addressing this query	Gemini
Yes let’s do that. Let’s give it a color that stands out.	Gemini
Now that the protein visualization has been generated in the ‘Run ESMFold’ section, I’m instructed to do the following: “Try changing the sequence, first try some mutations, then large segments. Is your protein structure resilient to mutations?” There’s a couple things here. The first is that the outputted protein structure in ESMFold doesn’t really seem to match up all that well with the original protein structure in the Protein Data Bank (PDB). This might be because the original protein structure in the PDB was apparently a bit of a mixed bag from a quality standpoint. Not sure how to proceed, and that hesitation is twofold: 1) Is there anything I can do regarding the discrepancy between the outputted protein structure in ESMFold and the original protein structure in the PDB? If this is most likely out of my control/if there’s nothing that can be changed from an input or coding perspective, say so 2) To accomplish the instructions listed above in the quotation marks, should I just go to the subsection that says ‘sequence’ in the first cell under the ‘Run ESMFold’ section and just start randomly changing letters? Does that make sense? Regardless of whether or not this is the best approach for fulfilling the instructions, how would I know if the outputted structure(s) are resilient (or not) to mutations? Would it just be discerned from degree of change in outputted structures (i.e., if the structures change a lot, it likely isn’t resilient and vice versa)? Do NOT hallucinate when addressing these questions	Gemini
What does the pLDDT acryonym stand for?	Gemini
My instructions for the ‘Inverse Folding with ProteinMPNN’ section are to ‘Analyze the predicted sequence probabilities and compare the predicted sequence vs the original one.’ I assume there’s some customization to be done here, either through coding or through inputting sequence information. What would be the most likely sensible plan to proceed? Success seems to be that the compared protein sequence and the original one look visually rather similar. Any next steps on how to begin working this workflow would be appreciated. Do NOT hallucinate when addressing this query	Gemini
My instructions for the ‘Inverse Folding with ProteinMPNN’ section are to ‘Analyze the predicted sequence probabilities and compare the predicted sequence vs the original one.’ From there I need to put results back into ESMFold to see if the visuals match up. I assume there’s some customization to be done here, either through coding or through inputting sequence information. What would be the most likely sensible plan to proceed? Any next steps on how to begin working this workflow would be appreciated. Do NOT hallucinate when addressing this query	Gemini
My instructions for the ‘Inverse Folding with ProteinMPNN’ section are to ‘Analyze the predicted sequence probabilities and compare the predicted sequence vs the original one.’ From there I need to put results back into ESMFold to see if the visuals match up. I assume there’s some customization to be done here, either through coding or through inputting sequence information. What would be the most likely sensible plan to proceed? Any next steps on how to begin working this workflow would be appreciated. Do NOT hallucinate when addressing this query	Gemini
Do these steps outputted from the answer to the previous prompt take into account earlier sequence information inputted into previous cells outside the ‘Inverse Folding with ProteinMPNN’ section, specifically information on the protein sequence in question we want to visually compare across the ‘Inverse Folding with Protein MPNN’ and ‘Run ESMFold’ sections (YKQCHKKGGHCFPKEKICLPPSSDFGKMDCRWRWKCCKKGSGK)? If not, how can this be addressed? Do NOT hallucinate when addressing these questions	Gemini
For the last step, just executed, I want a more 1-to-1 comparison between the results in the diagram under ‘Visualize Amino Acid Probabilities’ an the results located in the cell under ‘Mutation Scans’ earlier in the notebook (https://colab.research.google.com/drive/10EnA1imLYVVtWQYR-CsQIWU7tA-LliMj#scrollTo=09FwbZ6v1AUs&line=2&uniqifier=1) Can we make the X and Y axis of the heatmap just executed match this original heatmap with ‘Position in Protein Sequence’ with numbers left to right in the X axis and ‘Amino Acid Mutations’ in the Y axis? Also want ‘Model Scores’ scored from 2 (more yellow-ish) to -6 (more purple-ish) a-la this original heatmap If there are parts of this instruction you don’t understand or that are not possible to execute, say so. After this is resolved, we’ll return to the original workflow from the answer to the previous prompt	Gemini
Address the error found when running the ‘Run ESMFold’ cell	Gemini
Everything under ‘Configure ProteinMPNN with Crotamine Structure’ doesn’t appear to be working. What’s going on?	Gemini

Part D: Group Brainstorm on Bacteriophage Engineering

My William & Mary Node Bacteriophage Engineering Group Brainstorm Google Doc. can be found here ³.