Hi- I’m Robert C Beck, a visual artist merging environmental science with experimental technology to challenge social perceptions of global health, ecology, and mutualism between human and non-human entities.
Photosynthetic Image Generation (PIG) “Art cannot be separated from life. It is the expression of the greatest need of which life is capable, and we value art not because of the skilled product, but because of its revelation of a life’s experience.”
-Robert Henri-
Overview PIG is an experimental combination of biology, glassmaking, printmaking, and photography which produces a living, two-dimensional surface of bacteria cultured as graphic imagery. The research behind the process to date has produced a small series of images grown on a synthetic felt substrate.
Start Your Homework! Part 1: Benchling & In-silico Gel Art Johnny Got His Gun -The sequence has been rearranged into the shape of a pistol. I wanted to keep the number of columns restricted to the original seven enzymes for this composition, but I’m sure the image could be improved with additional columns. Question? How many columns can be introduced to the gel matrix?
Do Your Homework!!! Links to Final Project Slides
Photosynthetic Image Generation Surface Design Bryoculture Automation Reading Review: -Nature- Advancing sustainable agricultural transformation through the synergy of automated experimental platforms and living labs
Forget the Opentrons, it’s time for cranes, planes, and Agricultural Automation! Seriously though, not to hack on the Opentrons, what an amazing tool for working in the lab. But let’s not forget about the outdoor living lab which is part of arguably one of the most important industries, Agriculture. The scale of an agricultural system can range from a backyard garden to industrial complexes spanning thousands of hectares. How can automation help manage systems across the scales of industry, which ultimately translates into what does or doesn’t end up in our shopping carts?
Hi- It’s time to do your homework! Part A Questions:
How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons) Solve- How many Daltons are in 500 grams? 1 g = 6.022 x 10^23 Daltons, so (6.022 x 10^23 x 500)/100 = 3.01 x 10^24
Time to do your homework- It’s always time to do homework in this course!
Part A: SOD1 Binder Peptide Design Part 1: Generate Binders with PepMLM
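Starting with the human SOD1 sequence from UniProt (P00441), and then introducing the A4V mutation, we get the following sequence. A4V replaces an alanine (A) with valine (V); the position is counted without the initiator methionine, so in the full UniProt sequence the change falls on the fifth residue (the fourth is K):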
Guess what. What? It’s time to do your homework. OMG!!! It’s never ending… I’m always doing my homework for HTGAA. Answer these questions about the protocol in this week’s lab
What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?
Answer: The master mix contains several components: Phusion DNA polymerase, dNTPs, reaction buffer, and MgCl2. DNA polymerase is the enzyme which catalyzes the synthesis of new DNA: it reads the single-stranded template and extends the primer by adding the dNTP complementary to each template base. dNTP is short for deoxyribonucleotide triphosphate. dNTPs are the building blocks which allow DNA replication to take place. Each consists of a deoxyribose sugar molecule, a nitrogenous base (A, C, T, or G), and a triphosphate group. There are four dNTPs: dATP, dCTP, dTTP, and dGTP. The reaction buffer keeps the pH and salt concentration in the range where the polymerase works well. The MgCl2 is a magnesium salt which supplies Mg2+ ions, a cofactor the polymerase needs in order to be active.
What are some factors that determine primer annealing temperature during PCR?
Answer: PCR cycles through three temperature steps: denaturation, annealing, and extension. Denaturation is the hottest step; the heat separates the double-stranded template into single strands. The annealing temperature (Ta) is set relative to the melting temperature (Tm) of the primers, which depends on the length, sequence (especially GC content), and concentration of the primers, as well as the salt concentration of the buffer. Extension is the last step of each cycle: it runs cooler than denaturation, and the polymerase extends each primer from its 3’ end to build the finished PCR product.
There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.
Answer: PCR stands for Polymerase Chain Reaction and is widely used for its ability to amplify specific segments of DNA for study or testing. As stated above in question 2, it’s a thermocycling process using a designed pair of primers, short single-stranded segments of DNA usually 18-25 nucleotides long, which bind to the target DNA in specific regions determined by the primer sequence. It is a relatively cheap and practical way to amplify DNA and needs only a very basic “kitchen sink” lab to run the protocol. It requires only small trace samples of DNA, which can be sourced from basically any tissue, and with a proofreading polymerase such as Phusion it produces blunt-ended segments of DNA. Restriction enzymes are proteins found in bacteria which can be used to cut DNA at specific target sites. They leave either blunt or sticky ends, an advantageous quality for constructing recombinant DNA strands. There are three types.
Type I: Recognize specific sequences but do not cut at their recognition sites; they cut at variable positions some distance away.
Type II: Most widely used and available in many different forms. They cut at or near their recognition sites in a predictable pattern.
Type III: Recognize short asymmetric DNA sequences and cut a short fixed distance, roughly 25-28 nucleotides, away from the site.
The main difference between these two methods is in what they do to the DNA. PCR separates the strands and copies the target using DNA polymerase, dNTPs, and designed primers, and is very good at generating millions of copies for testing. Restriction enzymes cut existing double-stranded DNA, leaving either sticky or blunt ends. Sticky ends are ideal when assembling recombinant strands.
How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?
Answer: Gibson cloning works best when adjacent fragments share overlapping ends which are 20-40 base pairs long, with enough GC content to anneal stably. The overlaps can be built into the PCR primers. Optimize the PCR conditions so the product is clean and specific, then check the PCR product with gel electrophoresis before proceeding.
How does the plasmid DNA enter the E. coli cells during transformation?
Answer:
Citation:
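Returning to the Gibson question above, one quick sanity check before ordering primers is to confirm that two neighbouring fragments really do share an end overlap in the 20-40 bp range. The sketch below is only an illustration: the function names, the length limits passed as defaults, and the toy sequences in the usage note are all made up for this example and are not part of any lab protocol.

```python
def end_overlap(upstream: str, downstream: str) -> str:
    """Return the longest suffix of `upstream` that is also a prefix of `downstream`."""
    upstream, downstream = upstream.upper(), downstream.upper()
    for size in range(min(len(upstream), len(downstream)), 0, -1):
        if upstream[-size:] == downstream[:size]:
            return upstream[-size:]
    return ""


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_gibson_junction(upstream: str, downstream: str,
                          min_len: int = 20, max_len: int = 40) -> dict:
    """Summarise the shared end between two fragments (limits are assumptions)."""
    overlap = end_overlap(upstream, downstream)
    return {
        "overlap": overlap,
        "length": len(overlap),
        "gc": gc_fraction(overlap),
        "length_ok": min_len <= len(overlap) <= max_len,
    }
```

For example, with a hypothetical 24-base shared end `ATGCGTACCGGTTAGCCATGGCAT`, calling `check_gibson_junction("TTTTTTAAAA" + shared, shared + "CCCCGGGGAA")` reports a 24 bp overlap that passes the length check.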
Subsections of Homework
Week 1 HW: Principles and Practices
Photosynthetic Image Generation (PIG)
“Art cannot be separated from life. It is the expression of the greatest need of which life is capable, and we value art not because of the skilled product, but because of its revelation of a life’s experience.”
-Robert Henri-
Overview
PIG is an experimental combination of biology, glassmaking, printmaking, and photography which produces a living, two-dimensional surface of bacteria cultured as graphic imagery. The research behind the process to date has produced a small series of images grown on a synthetic felt substrate.
Background
The project emerged over the course of a two-year period of experimental interactions with biological material on a Brooklyn rooftop, which began in 2016. It was the site of a collaborative project with my partner, Sarah Max Beck, called studioHydrostatic. It was located on top of a four-story walkup in Bedford-Stuyvesant, where we had set up a small greenhouse to carry out visual media experiments using an artificial wetland ecosystem. The system was a hybridized combination of aquaponic and terraponic technologies chemically driven by urine, seashells, and chelated iron as nutrient sources. The elevated levels of nitrogen in the water led to cyanobacteria growing on wet surfaces exposed to sunlight. It was interesting to observe how the bacteria would form into very specific shapes and patterns depending on the qualities of whatever they were interacting with, and this marked the beginning of the project. It reveals the collaborative nature of life, the ongoing exchange of resources between organisms -which is more than one life’s experience- it’s the sum of many in a vibrant display of living energy.
Process
The live culture of cyanobacteria occurs in and around aquatic ecosystems, so I decided to set up a simple recirculating pond in a pasture to further experiment with the process. The main body of water was held by a fifty gallon, black plastic, water trough stocked with water hyacinth (Pontederia crassipes), which was harvested from a nearby body of freshwater. A small solar powered water pump moved water from the pond into a smaller processing tray which rested on top of the trough. Inside the tray was a piece of white polyester felt stretched over a foam board. I punched a small hole through the foam board and connected the water supply tube from the water pump. Resting on top of the felt was a plate of glass which had a negative image fused into its surface. The result was a water fed piece of felt sandwiched between foam and glass. As the sun shined, water moved from the pond up to the tray saturating the felt underneath the glass. The contraption exposed the felt to water, sunlight, and nutrients whereby, underneath the glass, cyanobacteria began to grow on the felt and produce an image pattern grown by photosynthetic bacteria.
Research with HTGAA
The beauty of this process is its ability to work with naturally occurring bio-cycles, organic matter, and the resources of its surroundings. These qualities do present a challenge to the idea of working with synthetic biology to augment a process which seems to work quite well without any manipulation. However, the current direction of the research is focused on the recyclability of the media, and since the current system utilizes a synthetic material to manifest the image on, I believe bacterial cellulose holds potential as an organic imaging substrate.
Perhaps different qualities of cellulose produced from certain strains of BC could be modified to obtain more desirable results, or perhaps change the process altogether. The possibility of creating something completely different also exists, and while my focus at this point is directed towards PIG, I am open to whatever I may discover through this course!
Strategies for Governance
The goal of my research with PIG is to produce a scalable, mobile technology for producing live surfaces as working media for visual artists in their studio practices as well as in exhibition environments. I believe it is important to research these potential pathways for humans to find new ways of interacting with ecology which are mutually beneficial, or symbiotic. However, there are many potential hazards associated with the manipulation of any environment or material. Listed below are three real hazards this project currently faces even without synthetic alteration of organism genomes. Actions or considerations for GMOs would be similar, but may be more procedurally specific.
Environmental protection from:
Release of foreign or exotic organisms from working displays into other ecosystems.
Equipment or device contamination from installation environments upon return to origin.
Release of foreign cultures from lab or experimental waste/water.
Self contamination from interaction with media:
Practitioners could become infected by or transmit biological organisms from content/media to themselves or others.
Engaged public interaction poses similar risks.
Spread of Misinformation:
Public opposition to content inciting the spread of “bad” or incorrect information.
Misinterpreted information by gallerists, volunteers, or public.
Environmental Protection Policy
Environmental contamination could result from the improper handling or disposal of materials associated with the work. While it is easier to safeguard in controlled laboratory settings, the potential still exists. There are additional risks once materials leave the lab for transportation between work sites, as well as for the work sites themselves.
Environmental Protection Goals
Protect localized environments from contamination by foreign organisms.
Protect people/personnel interacting with experimental media.
Protect organizations hosting experimental media from harmful content.
Protection of Localized Environments
Actions to be taken
1. Follow Laboratory Protocols (this project is not currently active in any laboratories or public spaces.)
• Follow BSL1 safety protocols and procedures.
• Create a list of ALL pertinent safety information specific to the research for lab technicians/researchers working with the media.
• Awareness of experimental media’s location within the lab space: use signs and markers to identify the project space.
• Be tidy and make sure surfaces are clean, organised, and clearly marked.
• Make visitors aware of active experiments. Remind them: please do not touch anything w/o permission.
• Post emergency contact information.
2. Design experiments and equipment for safe containment.
• Equipment should be easy to clean and properly sealed.
• Design traveling systems with primary and backup sealing/containment systems.
• Improperly sealed systems could leak: always double check containment before transport.
• Risk of containment breach is low with well maintained systems and adherence to protocols.
• Clearly label shipping containers with contents and emergency contact information.
3. Maintain equipment to ensure systems and safety mechanisms are working properly.
• Assess equipment before project initiation.
• Replace any worn parts or seals and make any repairs.
• Clean final assemblies.
• Schedule maintenance with time for service before initiating project.
| Does the Action: | Action 1 | Action 2 | Action 3 |
|---|---|---|---|
| Enhance Biosecurity | | | |
| • By preventing incidents | 1 | 1 | 1 |
| • By helping respond | 1 | 2 | 3 |
| Foster Lab Safety | | | |
| • By preventing incidents | 1 | 1 | 1 |
| • By helping respond | 1 | 2 | 3 |
| Protect the environment | | | |
| • By preventing incidents | 1 | 1 | 1 |
| • By helping respond | 2 | 1 | 3 |
| Other considerations | | | |
| • Minimizing costs and burdens to stakeholders | 3 | 2 | 1 |
| • Feasibility? | 1 | 2 | 1 |
| • Not impede research | 1 | 3 | 2 |
| • Promote constructive applications | 2 | 1 | 2 |
| Score | 14 | 16 | 18 |
Conclusion
While I cannot imagine conducting any experimental research without following all the actions listed above, the exercise did make it clear that the research is still very much in a developmental phase, and there needs to be a little more care taken when considering the project’s deployment in non-laboratory environments. Overall, I believe the risk to benefit ratio to be very low, especially in its research and development phases. The application of the technology could be interesting as a bioprinting technique, but I’m more interested in exploring its application as an experiential/phenomenological application within the visual arts. I’m not sure how synthetic biology will play into the research at this point, but I can see how this may change as I learn more about the organisms I’m working with, and the environmental challenges I may face when developing transport and display systems.
Week 2 HW: Reading and Writing DNA Sequences
Start Your Homework!
Part 1: Benchling & In-silico Gel Art
Johnny Got His Gun
-The sequence has been rearranged into the shape of a pistol. I wanted to keep the number of columns restricted to the original seven enzymes for this composition, but I’m sure the image could be improved with additional columns. Question? How many columns can be introduced to the gel matrix?
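Each column in the gel image is one lane, and each lane shows the fragment sizes left after a digest. A minimal sketch of that idea follows, assuming a linear DNA molecule and a small illustrative enzyme table (the recognition sites are the standard ones for these enzymes, but the table, function names, and toy sequence are mine and are not taken from the Benchling file).

```python
# Recognition site and cut offset (bases from the start of the site, top strand).
SITES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "EcoRV": ("GATATC", 3),    # GAT^ATC (blunt)
}


def digest(seq: str, enzyme: str) -> list[int]:
    """Return fragment lengths, in order, after cutting a linear sequence."""
    site, offset = SITES[enzyme]
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def gel_lane(fragments: list[int]) -> list[int]:
    """Order fragments as they appear on a gel: largest near the well."""
    return sorted(fragments, reverse=True)
```

On a toy sequence with two EcoRI sites, `"A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 5`, the digest gives fragments of 11, 26, and 10 bases. Adding a column to the composition amounts to adding another digest (a different enzyme or enzyme combination) as its own lane.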
Part 2: Gel Art - Restriction Digests and Gel Electrophoresis
NA
Part 3: DNA Design: Pick a Protein… Any Protein?
The organism - Chroococcidiopsis - contains the orange carotenoid protein (OCP)
As seen in the page from my notes above, I’m interested in Chroococcidiopsis, a type of cyanobacteria, which is the focal point of my research with PIG. A protein, OCP, is found in the phycobilisome (PBS), which is a large light-harvesting complex attached to the thylakoid membranes within the cyanobacterial cell body. It is an Orange Carotenoid Protein (OCP) with 6483 genes. At this point, I have no idea what any of this means, and I’m not sure if this protein - which was one of over 41,000 results to my search for “proteins commonly found in cyanobacteria” - will be of any importance to my project. Apparently, with its orange coloration it acts as a sunscreen for the thylakoid membrane, which is important for regulating photosynthesis within the cell. I chose this protein in particular for primarily two reasons: it’s a necessary part of a functional cyanobacterial cell, and it has been isolated as a plasmid, which seems to be important in context to the work we will be doing within HTGAA.
Reverse Translation of OCP Protein sequence to DNA sequence.
Reverse Translate results
Results for 319 residue sequence “URD53675.1 orange carotenoid-binding protein (plasmid) [Chroococcidiopsis sp. CCNUC1]” starting “MPYTIESARS”
reverse translation of URD53675.1 orange carotenoid-binding protein (plasmid) [Chroococcidiopsis sp. CCNUC1] to a 957 base sequence of most likely codons.
atgccgtataccattgaaagcgcgcgcagcatttttccggatacccaggtggcgagcgcg
gtgccgaccattgtggaaagctttgaacagctgagcgcggaagatcgcctggcgctgctg
tggtttgcgtataccgaaatgggcgtgaccattaccccggcggcgatgcaggtggcgaac
atgatgtttgcggaaaaaaccctggcgcagattgaacagattccggcggcggaacagacc
caggtgatgtgcgatctgattaaccataccgataccccgatttgccgcacctatagctat
tttggcatgaacgtgaaactgggcttttggtatcagctgggcgaatggatgaaacagggc
attgtggcgccgattccggaaggctataaactgagcgcgaaagcgagcaacgtgctgcag
accattcgccagctggaaggcggccagcagctgaccgtgctgcgcgatattgtggtgaac
atgggccatagcccgaccaccgcgacccagaaagtggaagaaccggtggtgccgccgaaa
gatctggcgccgcgcaccaaaattgtgattgaaggcattaacaacagcaccgtgctgagc
tatatggaaaacatgaacgcgtttgattttgaagcggcggtggcgctgtttgcggaagat
ggcgcgctgcagccgccgtttgaagaaccgattgtgggccaggaaagcattctggcgttt
atgcgcgaagaatgctatggcctgaaactgattccggaacgcggcattagcgaaccgggc
gaacgcggctttacccagattaaagtgatgggcaaagtgcagaccccgtgggcgggcgat
agcgtgggcattaacctggcgtggcgctttctgattaaccgccagggcaaaatttttttt
gtggcgattgatgtgctggcgagcccgcaggaactgctgaacctgggcctggtgaaa
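The "most likely codons" result above can be reproduced with a simple lookup: each amino acid is replaced by one fixed codon. The sketch below uses a codon table read directly off the output above (one codon per amino acid as it appears in that sequence); the function name is mine, and I am assuming the tool applies the same codon everywhere a given amino acid occurs.

```python
# One codon per amino acid, as observed in the reverse-translated sequence above.
MOST_LIKELY_CODON = {
    "A": "gcg", "R": "cgc", "N": "aac", "D": "gat", "C": "tgc",
    "Q": "cag", "E": "gaa", "G": "ggc", "H": "cat", "I": "att",
    "L": "ctg", "K": "aaa", "M": "atg", "F": "ttt", "P": "ccg",
    "S": "agc", "T": "acc", "W": "tgg", "Y": "tat", "V": "gtg",
}


def reverse_translate(protein: str) -> str:
    """Write one codon per residue using the single most likely codon."""
    return "".join(MOST_LIKELY_CODON[residue] for residue in protein.upper())


# The first ten residues of the OCP record quoted above.
print(reverse_translate("MPYTIESARS"))
```

The printed string matches the first thirty bases of the sequence above, and a 319-residue protein gives 319 x 3 = 957 bases, which agrees with the reported length.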
reverse translation of URD53675.1 orange carotenoid-binding protein (plasmid) [Chroococcidiopsis sp. CCNUC1] to a 957 base sequence of consensus codons.
atgccntayacnathgarwsngcnmgnwsnathttyccngayacncargtngcnwsngcn
gtnccnacnathgtngarwsnttygarcarytnwsngcngargaymgnytngcnytnytn
tggttygcntayacngaratgggngtnacnathacnccngcngcnatgcargtngcnaay
atgatgttygcngaraaracnytngcncarathgarcarathccngcngcngarcaracn
cargtnatgtgygayytnathaaycayacngayacnccnathtgymgnacntaywsntay
ttyggnatgaaygtnaarytnggnttytggtaycarytnggngartggatgaarcarggn
athgtngcnccnathccngarggntayaarytnwsngcnaargcnwsnaaygtnytncar
acnathmgncarytngarggnggncarcarytnacngtnytnmgngayathgtngtnaay
atgggncaywsnccnacnacngcnacncaraargtngargarccngtngtnccnccnaar
gayytngcnccnmgnacnaarathgtnathgarggnathaayaaywsnacngtnytnwsn
tayatggaraayatgaaygcnttygayttygargcngcngtngcnytnttygcngargay
ggngcnytncarccnccnttygargarccnathgtnggncargarwsnathytngcntty
atgmgngargartgytayggnytnaarytnathccngarmgnggnathwsngarccnggn
garmgnggnttyacncarathaargtnatgggnaargtncaracnccntgggcnggngay
wsngtnggnathaayytngcntggmgnttyytnathaaymgncarggnaarathttytty
gtngcnathgaygtnytngcnwsnccncargarytnytnaayytnggnytngtnaar
Question 1. The translation results above were produced using bioinformatics’ Sequence Analysis - Reverse Translate tool. In addition, it produced code for graphing the base probabilities. How is this code converted into a graph? Python?
Question 2. What is the difference between most likely codons and consensus codons?
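Question 3. In addition to the nitrogenous bases ACTG, there are also: n, y, h, r, w, s and m. Do these represent codon combos which form amino acids? - Not quite: they are IUPAC ambiguity codes, each standing for a set of possible bases at that position (for example n = any base, y = c or t, r = a or g). A consensus codon such as gcn therefore covers every codon that can encode that amino acid.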
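For Questions 2 and 3, it helps to expand a consensus codon back into the concrete codons it stands for. The sketch below uses the standard IUPAC nucleotide ambiguity codes; the function name is my own. As I understand it, "most likely" picks one codon per amino acid, while "consensus" writes a single degenerate codon that covers the synonymous ones.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (lower case, as in the output above).
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at",
    "k": "gt", "m": "ac", "b": "cgt", "d": "agt",
    "h": "act", "v": "acg", "n": "acgt",
}


def expand_codon(codon: str) -> list[str]:
    """List every concrete codon matched by a degenerate codon."""
    choices = [IUPAC[base] for base in codon.lower()]
    return sorted("".join(combo) for combo in product(*choices))


print(expand_codon("ath"))       # the consensus codon used for I (isoleucine)
print(len(expand_codon("wsn")))  # the consensus codon used for S (serine)
```

Note that `wsn` expands to 16 codons, not all of which encode serine, so the consensus line is a compact summary rather than a sequence that could be synthesized directly.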
Codon Optimization
Clearly there are advantages as well as disadvantages to Codon Optimization, and after reviewing the basic concept it seems there are two divergent camps, pro optimization vs. the purist. Let’s compare.
Some Advantages:
Optimization can increase protein production by replacing rare codons with synonymous codons that the host organism uses more often. This could help improve the efficiency of producing therapeutic drugs, and potentially reduce production cost.
Certain codon combinations can potentially increase the expression of a gene, thereby making more effective treatments.
Optimization helps to align a target gene’s codon usage with the preferred codons of the host organism.
Some Disadvantages:
Although synonymous changes keep the amino acid sequence the same, they can change how the protein folds, which can affect protein function or performance.
Potentially, there are deeper levels of code in specific codon combinations which could be related to rhythms associated with protein folding patterns during elongation, or functional aspects which have not yet been discovered.
Optimization can also affect wobble pairing with tRNA and potentially cause unwanted mutations or render proteins dysfunctional.
Based off this surface research, I can see reasons for and against codon optimization - and as with any experimental research - we must carefully weigh the potential benefits of optimization relative to the application. For instance, if the optimization is for use in animal research, which could potentially impact the wellbeing of an organism with unintended “side effects”, then comprehensive multi level testing should be conducted to exhaust or reduce error within the application. If the optimization is purely lab based and not able to modify DNA replication, and will improve the speed of data and research assimilation then I can see this as a definite advantage to employ the technology.
Codon Optimization- novoprolabs.com, I used this because the Twist link was broken. The sequence type was: DNA/RNA with E.coli as expression host which produced the following result from most likely codon group:
Question 4. In the optimized sequence above the 25th character is T of TAC which would indicate RNA start codon AUG?
Question 5. Then in the next line below I found the sequence ACTTAC, which would indicate a stop codon with another start directly after in the sequence. Everything in between the start and stop is a particular gene in the chain? Is that the promoter of the sequence?
Question 6. If AUG is the start codon, which contains uracil, then start and stop codons only exist in RNA?
You have a sequence! Now what?
I don’t know, this is where it falls apart for me… WAH Wah wah… GAME OVER… INSERT ANOTHER TOKEN!!!
I’m failing to grasp the connection between the Orange Carotenoid Protein (OCP), which I chose to explore, and how it relates to the host organism, E. coli, in the codon optimization stage of the homework.
Question 7. Am I trying to cut and paste the OCP into the E. coli genome using the Benchling software? Or am I looking for the expression genes within the OCP sequence by identifying the starts and stops?
Question 8. How do I determine where start and stop codons are within the sequence? I know Methionine (AUG) is start and there are three stop codons, UAA, UAG, UGA. Do I start reading in sets of three from the beginning of the optimized sequence looking for TAC, which would be on the DNA side of AUG?
Annotation of Sequence (I think I need a tutor!)
Prepare a Twist DNA Synthesis Order
I was able to create both Benchling and Twist accounts no problem. I imported the optimization sequence above as a new sequence, which marked all the restriction enzyme sites, but was not able to distinguish promoter region etc. for annotation.
Question 9. How do I determine promoter, RBS, etc.? -RBS stands for Ribosome Binding Site- It is a short sequence of nucleotides upstream of the start codon (AUG, Methionine) in mRNA, typically 5-7 nucleotides rich in A and G.
-On the coding strand of DNA the start codon is ATG, and the reading frame ends at a stop codon (TAA, TAG, or TGA). TAC and ATT, ATC, ACT are the complementary bases on the template strand.
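For Question 8, the usual approach is to read the coding strand in sets of three, in each of the three possible frames, from an ATG until the first stop codon. The sketch below is a minimal illustration of that scan; the function names, the `min_codons` cutoff, and the toy sequence in the usage note are assumptions of mine and are not how Benchling annotates a sequence.

```python
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(dna: str) -> str:
    """Base-by-base complement (template strand read alongside the coding strand)."""
    return dna.upper().translate(_COMPLEMENT)


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int]]:
    """Return (start, end) slices that run from an ATG to an in-frame stop codon."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a reading frame at the first ATG
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # include the stop codon
                start = None
    return orfs
```

On the toy sequence `CCATGGCGAAATAAGG`, `find_orfs` returns `[(2, 14)]`, the slice `ATGGCGAAATAA`. `complement("ATG")` gives `TAC`, which is why TAC shows up when looking at the opposite strand. A reverse-translated protein has no stop codon of its own, so a scan of frame 0 finds the opening ATG but no in-frame stop.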
PDF of my Benchling file (produced from the most likely codon sequence)
Question 10. Should I have fed the optimization results from the consensus codons into Benchling?
DNA Read
My primary goal with HTGAA is to build my understanding of cyanobacteria as the building block of not only my artwork, but also as a cellular organism which forms the foundational base of the ecological food chain. I cannot even begin to think or dream of how, what, or why I would want to alter the cellular systems of cyanobacteria at this point, but as my knowledge of these organisms develops I know my perspective will change.
DNA Write
DNA Edit
References
Image of Chroococcidiopsis -Villanueva, Chelsea & Hasler, Petr & Dvorak, Petr & Poulíčková, Aloisie & Casamatta, Dale. (2018). Brasilonema lichenoides sp. nov. and Chroococcidiopsis lichenoides sp. nov. (Cyanobacteria): Two novel cyanobacterial constituents isolated from a tripartite lichen of headstones. Journal of Phycology. 54. 10.1111/jpy.12621.
Sound JK, Bellamy-Carter J, Leney AC. The increasing role of structural proteomics in cyanobacteria. Essays Biochem. 2023 Mar 29;67(2):269-282. doi: 10.1042/EBC20220095. PMID: 36503929; PMCID: PMC10070481.
Forget the Opentrons, it’s time for cranes, planes, and Agricultural Automation! Seriously though, not to hack on the Opentrons, what an amazing tool for working in the lab. But let’s not forget about the outdoor living lab which is part of arguably one of the most important industries, Agriculture. The scale of an agricultural system can range from a backyard garden to industrial complexes spanning thousands of hectares. How can automation help manage systems across the scales of industry, which ultimately translates into what does or doesn’t end up in our shopping carts?
Essentially, this article draws into focus many of the gaps which exist between actors working with biotech. As the pace of technology grows and areas of specialization become more specific - in terms of applications, economics, required resources etc. - a system of governance should be codeveloped between stakeholders, policymakers, and consumers alike to ensure that a reliable and equitable framework grows with industries as they develop. This is especially important to ensure that new research has a way to interact with development and implementation within the overall framework of a rapidly transforming environment.
Citation: Hoffmann, M., Chen, C., Butterbach-Bahl, K. et al. Advancing sustainable agricultural transformation through the synergy of automated experimental platforms and living labs. Nat Commun 16, 8418 (2025). https://doi.org/10.1038/s41467-025-64450-7
Opentrons Demo
Colab code for points generated by the Ronan simulator:
Week 4 HW: Protein Design I
Hi- It’s time to do your homework!
Part A
Questions:
How many molecules of amino acids do you take with a piece of 500 grams of meat? (on average an amino acid is ~100 Daltons)
Solve- How many Daltons are in 500 grams? 1 g = 6.022 x 10^23 Daltons, so (6.022 x 10^23 x 500)/100 = 3.01 x 10^24
Answer- There are approximately 3.01 x 10^24 amino acid molecules in 500 grams of meat, treating the whole piece as amino acids.
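The conversion is easy to check in a few lines. This is only a sketch of the arithmetic: the function name is mine, and the `protein_fraction` parameter is a hypothetical extra for anyone who wants to count only the protein part of the meat (the default of 1.0 follows the simplification in the question).

```python
AVOGADRO = 6.02214076e23  # daltons in one gram


def amino_acid_count(mass_g: float, avg_residue_da: float = 100.0,
                     protein_fraction: float = 1.0) -> float:
    """Estimate the number of amino acid molecules in a given mass."""
    total_daltons = mass_g * protein_fraction * AVOGADRO
    return total_daltons / avg_residue_da


print(f"{amino_acid_count(500):.2e}")
```

With the defaults this comes to roughly 3 x 10^24 molecules, which is the same as 5 moles of amino acids (500 g at about 100 g per mole).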
Why do humans eat beef but do not become a cow, eat fish but do not become fish?
Answer- The process of digestion uses acids and enzymes to break tissues down into simpler molecules for use in cellular mechanics. This process helps provide the resources/energy to power our own cellular processes, which reassemble the digested amino acids, sugars and such into the forms required by each cell to perform its function, which is dictated by the organism’s genetic code.
Why are there only 20 natural amino acids?
Answer- According to Andrew Doig, a chemical biologist at the University of Manchester in the UK, it goes all the way back to LUCA. The twenty natural amino acids found in all living organisms are the building blocks we share in common. It is speculated that these twenty in particular are the most stable due to their structure, and able to maintain their functional forms after being buried, or exposed to the environmental conditions of Earth 3.5–3.8 billion years ago when LUCA first emerged, probably from some archaic form of RNA. They are more hydrophobic than hydrophilic, and that may have something to do with their role in LUCA’s evolution because of the way they are able to fold and work together structurally.
Can you make other non-natural amino acids? Design some new amino acids.
Answer- Yes, absolutely! The only problem, though, is that when trying to incorporate synthetic amino acids into cell function/DNA-RNA, the catalytic function of those new amino acids has to be incorporated into the cell’s chemical function of enzymatic activity.
Where did amino acids come from before enzymes that make them, and before life started?
Answer- Apparently, there are several theories which range from riding in on an asteroid to chemical reactions occurring through electrical activity, i.e., lightning striking in areas rich in oxygen, nitrogen, hydrogen, carbon and sulphur compounds. The asteroid theory has been proven viable by the discovery of at least 86 amino acids on the Murchison meteorite, which landed in Australia in 1969. I believe it was probably a combination of planetary chemical reactions and celestial seeding.
If you make an α-helix using D-amino acids, what handedness (right or left) would you expect?
Answer- Left. The α-helix is usually right-handed because the common amino acids have the L configuration at the α-carbon; D-amino acids are the mirror image of the L configuration, so a helix built from them is mirrored too.
Can you discover additional helices in proteins?
Answer- In theory, yes. Detecting structures outside the typical three is difficult, and helical structures are limited by their physical and chemical properties. But with synbio, discovering something new is certainly possible, maybe even likely.
Why are most molecular helices right-handed?
Answer- It’s a mystery… I think, but it has something to do with chirality. Perhaps polarity, too? Or maybe the ionic charge of particles, laws of attraction- opposites attract- who knows? Certainly, there must be an answer.
Why do β-sheets tend to aggregate?
What is the driving force for β-sheet aggregation?
Answer- It appears as though the protein is programmed to fold in such a way that it wants to form aggregate structures. This behavior is probably caused by the alternating arrangement of charged and hydrophobic amino acids, which makes them inclined to gather through attraction of oppositely charged particles?
Part B
URD53675.1 orange carotenoid-binding protein (plasmid) Chroococcidiopsis sp. CCNUC1
For the sake of consistency I decided to stick with the first protein from the week two homework. It’s interesting to see how many different organisms this protein exists in, all with their own unique functions using this molecule.
It is 319 amino acids long; the most common is A with a count of 29. According to UniProt homology, there are 239 homologs, and InterProScan identifies it as belonging to the NTF2-like domain superfamily and to the orange carotenoid-binding protein family.
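Length and residue counts like the ones above can be checked by hand with a short script. The sketch below is an illustration only: the function name is mine, and it is shown on just the first ten residues of the record ("MPYTIESARS", quoted in the week two reverse translation) rather than the full 319-residue sequence.

```python
from collections import Counter


def composition(protein: str) -> list[tuple[str, int]]:
    """Count each amino acid letter, most frequent first."""
    return Counter(protein.upper()).most_common()


fragment = "MPYTIESARS"  # first ten residues only, not the whole protein
print(len(fragment), composition(fragment)[0])
```

Pasting in the full sequence from the protein record in place of `fragment` would give the total length and the most common residue for comparison with the numbers reported above.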
There were quite a few hits for OCP in RCSB PDB. The model above is a 3-D rendering of OCP2 Gloeocapsa sp. PCC 7428 from RCSB PDB.
PyMol Renderings: Stick and Ribbon (below)
Pymol - Orange Carotenoid-binding Protein
Clockwise from top left: ribbon, secondary, stick, hydro
Number of helices: 10
Number of sheets: 6
Mostly hydrophobic: 2,363 atoms in hydrophobic residues vs. 782 atoms in hydrophilic residues
Binding Pocket Yes, definitely several pockets and even some holes leading to the core.
Part C - Using ML-Based Protein Design Tools
C1. Protein Language Modeling
Deep Mutational Scans- pdb_00008qx5
Helical Carotenoid Protein 4 (HCP4) from Anabaena with bound Canthaxanthin
• Heat map visuals indicate binding probabilities for:
Sites 43 and 112 exhibit high probability for the following amino acids: A, D, E, G, K, N, Q, R, S, and T.
C has low probability across all sites except 43 and 112, where it is medium.
Y and X are fairly low across entire sequence.
Latent Space Analysis In the examples above the closest related matches were:
Kirschning A. On the Evolutionary History of the Twenty Encoded Amino Acids. Chemistry. 2022 Oct 4;28(55):e202201419. doi: 10.1002/chem.202201419. Epub 2022 Jul 28. PMID: 35726786; PMCID: PMC9796705.
Huang, W., Wang, S., Wei, Y. et al. Design and evolution of artificial enzyme with in-situ biosynthesized non-canonical amino acid. Nat Commun 16, 8698 (2025). https://doi.org/10.1038/s41467-025-63733-3
Medeiros-Silva J, Dregni AJ, Hong M. Distinguishing Different Hydrogen-Bonded Helices in Proteins by Efficient 1H-Detected Three-Dimensional Solid-State NMR. Biochemistry. 2024 Jan 2;63(1):181-190. doi: 10.1021/acs.biochem.3c00589. Epub 2023 Dec 21. PMID: 38127783; PMCID: PMC10880114.
Greg Huber et al, Entropy and chirality in sphinx tilings, Physical Review Research (2024). DOI: 10.1103/PhysRevResearch.6.013227
Sigrist CJA, Cuche BA, de Castro E, Coudert E, Redaschi N, Bridge A. The PROSITE database for protein families, domains, and sites. Nucleic Acids Res. 2026. doi: 10.1093/nar/gkaf1188 [In press]. PMID: 41263099.
Week 5 HW: Protein Design Part II
Time to do your homework- It’s always time to do homework in this course!
Part A: SOD1 Binder Peptide Design
Part 1: Generate Binders with PepMLM Starting with the human SOD1 sequence from UniProt (P00441), and then introducing the A4V mutation, we get the following sequence, where the alanine (A) at position 4, counted without the initiator methionine, is changed to valine (V):
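The substitution step can also be sketched in code, which makes the numbering convention explicit. This is only an illustration: `apply_point_mutation` is a hypothetical helper, the 10-residue `fragment` is assumed to be the start of the UniProt P00441 sequence, and the full sequence should be used in practice. The check on the wild-type residue is there to catch off-by-one numbering mistakes.

```python
import re


def apply_point_mutation(sequence: str, mutation: str, has_initiator_met: bool = True) -> str:
    """Apply a mutation written like 'A4V' to a protein sequence.

    Assumption: the position uses mature-protein numbering, so when the
    sequence still carries its initiator Met the index is shifted by one.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"Unrecognized mutation format: {mutation}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1 + (1 if has_initiator_met else 0)
    if sequence[index] != wild_type:
        raise ValueError(f"Expected {wild_type} at position {position}, found {sequence[index]}")
    return sequence[:index] + mutant + sequence[index + 1:]


# Assumed N-terminal fragment for illustration; use the full P00441 sequence in practice.
fragment = "MATKAVCVLK"
mutated_fragment = apply_point_mutation(fragment, "A4V")
print(mutated_fragment)
```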
All the sidechain peptides generated had low to very low pLDDT scores, and did not appear to be close to either end of the SOD1. This seems to indicate that these peptides are not well suited for binding with SOD1. In addition, the low ipTM scores all point to low confidence in the interaction between the molecules. In the model below there are similar results when combining two SOD1 A4V molecules with the known binder. The pLDDT (predicted local distance difference test) score for the known peptide is low, which is confusing, and makes me think it may be an incorrect reading by the AlphaFold model, especially since the peptide is a known binder.
Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse
PeptiVerse graphic results table for Peptide 0
-PeptiVerse Results-
Peptide 0 - Sequence: WSYPAAALGLKK. Relatively weak binding (5.1 pKd/pKi) with the second-highest net charge and the second-lowest score for hydrophobicity. It’s non-hemolytic and soluble.
Peptide 1 - Sequence: WRYPAAAAGWGX. Relatively weak binding (6.2 pKd/pKi) with the third-highest net charge and the third-lowest score for hydrophobicity. It’s non-hemolytic and soluble.
Peptide 2 - Sequence: WLYYVVGLAHX. Relatively weak binding (6.9 pKd/pKi) with the lowest net charge and the highest score for hydrophobicity. It’s non-hemolytic and soluble.
Peptide 3 - Sequence: WRYPAVAARHGK. Relatively weak binding (5.5 pKd/pKi) with the highest net charge and the lowest score for hydrophobicity. It’s non-hemolytic and soluble.
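The pKd/pKi values are easier to compare once they are converted back to concentrations. The sketch below assumes the reported number can be read as pKd, meaning the negative base-10 log of the dissociation constant in mol/L. The helper name `pkd_to_kd_molar` is made up for this example.

```python
# Predicted binding affinities (pKd/pKi) as reported in the list above.
predicted_pkd = {
    "Peptide 0": 5.1,
    "Peptide 1": 6.2,
    "Peptide 2": 6.9,
    "Peptide 3": 5.5,
}


def pkd_to_kd_molar(pkd: float) -> float:
    """Convert pKd (-log10 of Kd in mol/L) back to Kd in mol/L."""
    return 10.0 ** (-pkd)


# Rank from strongest (highest pKd) to weakest predicted binder.
ranked = sorted(predicted_pkd.items(), key=lambda item: item[1], reverse=True)
for name, pkd in ranked:
    kd_micromolar = pkd_to_kd_molar(pkd) * 1e6
    print(f"{name}: pKd {pkd} -> Kd of about {kd_micromolar:.2f} uM")
```

Each whole unit of pKd is a tenfold change in Kd, so the gap between 5.1 and 6.9 is close to a hundredfold difference in predicted affinity.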
Based on the results of this group, I don’t think I would advance any of them for further testing. The second peptide generated (Peptide 1) seems closest to working, with the highest ipTM score, the third-highest charge, and the third-lowest hydrophobicity. It is difficult for me to interpret these data sets at this point, though, and if I were actually looking for a binder to test I would continue looking for a set which had better predictions in the visualization models, and which ideally would also match the PeptiVerse scores.
Part 4: Generate Optimized Peptides with moPPIt
Functional parameters: Hemolysis, Non-Fouling, Solubility, Half-Life, Affinity, Motif, Specificity. All were set with an objective importance of 1. Motif positions were 1-10 and 144-154 (first ten positions, last ten positions). Generated peptide sequences:
KKKKKDKTTKWM
EEVQKKQEWKTI
ELIRWLQQRRTD
The following results are for the first peptide generated, Sequence - KKKKKDKTTKWM:
Based on these results, I am no more confident in this binder generated by moPPIt than in the first sets generated by PepMLM. Again, I’m not sure if this is a result of the modeling, or if perhaps I’m not doing this exercise correctly. It seems odd that none of the binders have very high ipTM or pTM scores.
Part C: Final Project: MS2L-Protein Mutants
Option 1: Unfortunately, I had technical issues with the Colab notebook and was only able to run the first few sections. First, I generated the heat map and the top ten mutations. As the goal of the assignment is to adapt the MS2L protein to fold on its own without the DnaJ chaperone, I used this data with AlphaFold to try to improve its confidence in the model. I realize this is not how the assignment was intended to be done, but it was the best way I could proceed to generate five, hopefully better, mutations of the MS2L protein.
Working with AlphaFold to Generate New Sequences
Come up with 5 mutations along with how you came up with them and why you believe they would work. 2 of the variants you submit must have mutations in the transmembrane region (38-60, Sequence: LYVLIFLAIFLSKFTNQLLLSLL) and 2 of them must be in the soluble region.
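Because the assignment gives both the position range and the sequence of the transmembrane segment, it is easy to look up which wild-type residue sits at a given site and which region a site belongs to. The sketch below treats everything outside 38-60 as soluble, which is my reading of the assignment rather than something stated in it. The positions in the loop are only example inputs.

```python
# Transmembrane segment and its position range, as given in the assignment text.
TM_START, TM_END = 38, 60
TM_SEQUENCE = "LYVLIFLAIFLSKFTNQLLLSLL"


def tm_residue(position: int) -> str:
    """Return the wild-type residue at a 1-based MS2L position inside the TM segment."""
    if not TM_START <= position <= TM_END:
        raise ValueError(f"Position {position} is outside the transmembrane region")
    return TM_SEQUENCE[position - TM_START]


def region(position: int) -> str:
    """Label a position as transmembrane or soluble (assumed: anything outside 38-60)."""
    return "transmembrane" if TM_START <= position <= TM_END else "soluble"


# The segment length should match the stated range.
assert len(TM_SEQUENCE) == TM_END - TM_START + 1

for pos in (39, 50, 53):
    print(pos, tm_residue(pos), region(pos))
print(29, region(29))
```

A quick lookup like this is a useful sanity check that a proposed mutation names the right wild-type residue before it goes into a structure prediction run.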
AlphaFold of Original (non-mutated) MS2L Sequence
MS2L Heatmap of Likely Mutations
MS2L Top Ten Most Likely Mutations
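| Position | Wild_Type_AA | Mutation_AA | LLR_Score |
| --- | --- | --- | --- |
| 50 | K | L | 2.561468 |
| 29 | C | R | 2.395427 |
| 39 | Y | L | 2.241780 |
| 29 | C | S | 2.043150 |
| 9 | S | Q | 2.014325 |
| 29 | C | Q | 1.997049 |
| 29 | C | P | 1.971029 |
| 29 | C | L | 1.960646 |
| 50 | K | I | 1.928801 |
| 53 | N | L | 1.864932 |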
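Several of the top ten rows hit the same position, so it helps to collapse the list to the best-scoring substitution per site. The sketch below does that with the values copied from the table. As I understand it, the LLR score is a log-likelihood ratio from the language model (mutant versus wild type, higher meaning the model prefers the mutant), but that interpretation is an assumption on my part.

```python
# (position, wild-type residue, mutant residue, LLR score) rows from the table above.
top_mutations = [
    (50, "K", "L", 2.561468),
    (29, "C", "R", 2.395427),
    (39, "Y", "L", 2.241780),
    (29, "C", "S", 2.043150),
    (9, "S", "Q", 2.014325),
    (29, "C", "Q", 1.997049),
    (29, "C", "P", 1.971029),
    (29, "C", "L", 1.960646),
    (50, "K", "I", 1.928801),
    (53, "N", "L", 1.864932),
]

# Keep only the highest-scoring substitution at each position.
best_per_site = {}
for position, wild_type, mutant, llr in top_mutations:
    if position not in best_per_site or llr > best_per_site[position][2]:
        best_per_site[position] = (wild_type, mutant, llr)

for position in sorted(best_per_site):
    wild_type, mutant, llr = best_per_site[position]
    print(f"{wild_type}{position}{mutant}  LLR {llr:.3f}")
```

This leaves five distinct sites (9, 29, 39, 50, and 53) to choose from.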
Mutational Tests
Test 1 and 2 -Sites 50, 29, 39, and 9 were mutated using the top ten mutational LLR scores from ESM notebook.
-The first set of mutations were at sites 50, 29, and 39, which produced a more confident pLDDT at site 29 compared to the original non-mutated MS2L sequence. Interestingly, when I continued with the mutation and switched site 9 from S to Q, the pLDDT score dropped at site 29.
Above: Site 29 before S to Q mutation (lower-left), site 29 after S to Q (lower-right).
Test 3: The third mutation keeps the changes made to sites 50 and 29 from mutation 1, which yielded a higher pLDDT score, and modifies the aa at site 8 from Glutamine (Q) to Threonine (T). I chose Threonine because it has a neutral charge like Glutamine but is freely soluble, whereas Glutamine’s solubility score is 2.6. This gave the AlphaFold model a higher pLDDT score for that position, and seemed to positively affect other positions in the aa 1-29 region.
Test 4 and 5:
In mutation 4, site 8 was changed from its original aa residue of Q to T, which had a positive effect on the pLDDT score. In the fifth mutation I wanted to see if I could improve confidence around site 26. I went through several variations of changing the original residue, Aspartate (D), to G, then to E, and finally to Q, which improved confidence. However, in my final analysis of the fifth variation, I realized that the improved confidence at site 8 in the fourth mutation had changed back to a low-confidence state. It’s an interesting problem which I don’t understand. It would make sense for a change to affect residues in close proximity to it, but it’s odd to see a change at site 26 affect the confidence at site 8. Overall, by playing with MS2L in AlphaFold I was able to improve the model’s confidence in sites 1-29, but it still has a long way to go. And I just now realized it would have been interesting to incorporate moPPIt into testing these mutational problems, instead of solely relying on AF. AHHH!
Conclusions
This assignment was particularly challenging because it required me to consider which of the protein design tools I could use to approach the task of improving the folding characteristics of a protein, which would hopefully have a net positive effect on its functionality. Being a visual person, I found AlphaFold (AF) to be very helpful with its ability to model aa interaction. It’s certainly much faster than constructing a physical ball and stick model. However, I can see the value of those models too. In particular, it’s perplexing to see how AF interprets changes in design, where residues which are separated by several aa’s in sequence affect one another, and this is where (for me) a more traditional approach using physical models could help to work through compatibility issues. I understand the importance of these in silico techniques, and I appreciate all the hard work behind their development!
Final mutational design. Changes to sequence- site 8: Q->T, site 23: K->R, site 26: D->Q, site 29: C->R, site 50: K->L.
AlphaFold: Abramson J, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature (2024).
PeptiVerse: Zhang Y, Tang S, Chen T, Mahood E, Vincoff S, Chatterjee P. PeptiVerse: A Unified Platform for Therapeutic Peptide Property Prediction. bioRxiv (2026). doi: 10.64898/2025.12.31.697180. https://www.biorxiv.org/content/early/2026/01/03/2025.12.31.697180
UniProt: The UniProt Consortium. UniProt: the Universal Protein Knowledgebase in 2025. Nucleic Acids Res. 53:D609–D617 (2025).
Week 6 HW: Genetic Circuits Part I
Guess what. What? It’s time to do your homework. OMG!!! It’s never ending… I’m always doing my homework for HTGAA.
Answer these questions about the protocol in this week’s lab
What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose? Answer: There are several components in the master mix. It contains DNA polymerase, dNTPs, reaction buffer, and MgCl2.
DNA polymerase is an enzyme which catalyzes the replication of DNA by reading the single-stranded template DNA and pairing each base with its complementary dNTP to synthesize a new strand of target DNA.
dNTP is short for deoxyribonucleotide triphosphate. dNTPs are the building blocks which allow DNA replication to take place. Each consists of a deoxyribose sugar molecule, a nitrogenous base (A, C, T, or G), and a triphosphate group.
There are four dNTPs: dATP, dCTP, dTTP, and dGTP.
The reaction buffer keeps the pH and salt conditions in the range the polymerase needs to work.
The MgCl2 is a magnesium salt that supplies Mg2+ ions, which the polymerase requires as a cofactor to add dNTPs to the growing strand. The Mg2+ also helps stabilize the pairing between primer and template.
What are some factors that determine primer annealing temperature during PCR? Answer: There are three temperature steps in each PCR cycle: denaturation, annealing, and extension.
Denaturation comes first. The temperature is raised high enough to separate the double-stranded template DNA into single strands. The melting temperature (Tm) of a primer, the temperature at which half of the primer-template duplexes come apart, is the main factor in choosing the annealing temperature.
The annealing temperature (Ta) is set relative to the Tm, which depends on the length, sequence (especially GC content), and concentration of the primers.
Extension is the last step of the cycle. Its temperature is higher than the annealing temperature but lower than the denaturation temperature, and it is where the polymerase extends from the 3’ end of each primer to build the finished PCR product.
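To see how primer length and sequence feed into the temperature choice, here is a rough sketch using the Wallace rule, a common rule of thumb for short primers: Tm ≈ 2 °C for each A or T plus 4 °C for each G or C. This is not the calculation from the lab protocol. Real primer design usually relies on a nearest-neighbor calculator that also accounts for primer and salt concentration, and the primer below is made up.

```python
def wallace_tm(primer: str) -> int:
    """Estimate primer Tm in degrees C with the Wallace rule: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    if at + gc != len(primer):
        raise ValueError("Primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


# Hypothetical 20-nt primer, not taken from the lab protocol.
example_primer = "ATGCGTACCTGAGCTTAGCA"
print(wallace_tm(example_primer), round(gc_fraction(example_primer), 2))
```

Swapping an A or T for a G or C raises the estimate by 2 °C, which is why GC-rich primers anneal at higher temperatures.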
There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.
PCR stands for Polymerase Chain Reaction and is widely used for its ability to amplify specific segments of DNA for study or testing. As described above in question 2, it’s a thermocycling process using specific sets or a custom blend of primers, short single-stranded segments of DNA usually 18-25 nucleotides long, which bind to the target DNA strand in specific regions based on the primer sequence used. It is a relatively cheap and practical way to analyze DNA, and it requires only a very basic “kitchen sink” lab to run the protocol. It needs only small trace samples of DNA, which can be sourced from basically any tissue, and usually produces blunt-end segments of DNA.
Restriction enzymes are proteins found in bacteria which can be used to cut DNA at specific target sites. They leave either blunt or sticky ends, and sticky ends are an advantageous quality for constructing recombinant DNA strands. There are three main types.
Type I: Recognize specific sequences but do not cut at their recognition sites, cutting instead at variable sites some distance away.
Type II: Most widely used and available in many different forms. They cut at or near specific recognition sites, producing predictable fragments.
Type III: Recognize short asymmetric DNA sequences and cut about 25–28 nucleotides away from the recognition site.
The main difference between these two methods is in the way they produce DNA segments. PCR separates the strands and rebuilds them using DNA polymerase, dNTPs, and designed primers, and is very good at generating millions of copies for testing. Restriction enzymes cut double-stranded segments of DNA, leaving either sticky or blunt ends. Sticky ends are ideal when assembling recombinant strands.
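The cutting side of the comparison can be sketched as a simple string operation. The example uses EcoRI, a Type II enzyme that recognizes GAATTC and cuts after the first G, leaving sticky ends. The `digest` function and the template sequence are made up for illustration, and the sketch only follows the top strand of a linear fragment.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[str]:
    """Cut a linear top-strand sequence at every occurrence of a recognition site.

    cut_offset is the number of bases into the site where the top strand is cut.
    """
    sequence, site = sequence.upper(), site.upper()
    fragments, start = [], 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])
    return fragments


# EcoRI: G^AATTC, so the top strand is cut one base into the site.
# The template below is a made-up sequence for illustration.
template = "TTAGGAATTCCGATCGGAATTCAA"
print(digest(template, "GAATTC", 1))
```

Two recognition sites on a linear template give three fragments, and the two fragments that follow a cut begin with the AATT overhang bases.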
How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning? Answer: Gibson cloning works best when the PCR amplified fragments have overlapping ends which are 20-40 base pairs long with a high GC content. This can be achieved by optimizing the PCR conditions to ensure they are clean and specific, then checking the PCR product with gel electrophoresis before proceeding.
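One way to check fragments before ordering primers is to confirm that neighboring pieces really share an end of the right length. The sketch below looks for the longest 20-40 bp stretch that ends one fragment and starts the next, then reports its GC content. The function names and both fragments are made up for illustration, and the 20-40 bp window simply follows the range given in the answer above.

```python
def shared_overlap(upstream: str, downstream: str, min_len: int = 20, max_len: int = 40) -> str:
    """Return the longest suffix of `upstream` (within the limits) that is also a prefix of `downstream`."""
    upstream, downstream = upstream.upper(), downstream.upper()
    longest = min(max_len, len(upstream), len(downstream))
    for length in range(longest, min_len - 1, -1):
        if upstream[-length:] == downstream[:length]:
            return upstream[-length:]
    return ""  # no overlap of acceptable length


def gc_content(sequence: str) -> float:
    """Fraction of G and C bases in a sequence."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


# Made-up fragments that share a 20-bp end, for illustration only.
overlap_region = "GCTAGCCGGATCCGTACGCA"
fragment_a = "ATGAAACCTTTGGG" + overlap_region
fragment_b = overlap_region + "TTTAAACCCGGGTAA"

found = shared_overlap(fragment_a, fragment_b)
print(len(found), round(gc_content(found), 2))
```

An empty result would mean the two fragments cannot be joined as designed, which is worth catching before running the gel.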
How does the plasmid DNA enter the E. coli cells during transformation?
Answer:
Citation:
Lorenz TC. Polymerase chain reaction: basic protocol plus troubleshooting and optimization strategies. J Vis Exp. 2012 May 22;(63):e3998. doi: 10.3791/3998. PMID: 22664923; PMCID: PMC4846334.