<SILA ESTIGARRIBIA> — HTGAA Spring 2026

cover image cover image

About me

Hi! Im a junior biotechnology student from Paraguay, and i love synthetic biology. I discover it thanks to the DiyBio movement, and its was, for me, so cool.

I want to develop technologies that could expand all the possibilites that biotechnology enable. Im currently working on systems for synthetic multicellularity, in plants (and other organisms). From synthetic cell-to-cell communications, to a synthetic asymmetrical cell division system.

Hope to get along with everybody :PPP

Contact info

: silasopran@gmail.com

: X.com

Homework

Labs

Projects

Subsections of <SILA ESTIGARRIBIA> — HTGAA Spring 2026

Homework

Weekly homework submissions:

  • Week 1 HW: Principles and Practices

    Project Description I want to develop a synthetic assymetrical-cell-division system to enable synthetic cell differentiation; as a toolkit for engineer multicellular organization, development, pattern formation and others. To futher detail, see the page of the final project The project have two major subprojects: A mitotic counter: a circuit capable of counting as states how many cell division the linage have sense since the system has been activated (firts division, second, third, and so on). The system uses the natural fluctuations of the cell’s cycle regulators to induce or activate distinct proteins. When the system is activated, a TF would be activated by an cell-cycle-dependent kinase (at the G1 phase), that would induce the expression of a recombinase, that would inverse the sequence of its own promoter, inducing the expression of an inactivated, second, TF. This TF2 would be activated via a second cell-cycle-dependent kinase, one that would be activated in the final of the cell cycle and not be expressed at the same time with the firts kinase. The phosphorylation upon TF2 would, for example, link two homodymers together, in an activated form (an option, but not necessarily how it would work). When the kinase 2 activates TF2, gene expression would not be available because of chromatin condesation upon mitosis. After the cell divides, at G1, TF2 could induce the activation of other genes, that, in consenquence and, using the architecture, enables the activation of other genes at the second cell division, and so on, making possible to count cell division in the cell lineage. The system also needs other things such as a degradation system, repression for past states, etc. Assymetrical component: ¿How can we engineer an assymetrical distribution of molecules to induce the assymetrical cell division? It should be transferable to other organisms. The proposed one is to used an synthetic phased segregated condensate that is capable of create one individual and stable condensate upon the cell. It should carry an mRNA that would express a TF. It should be formated upon the activation of the scaffolds and dissegregated upon the cell division, enabling the translation of the mRNA selected. The mRNA should be sequestred in the condensate, where its translation-initiation site would be blocked, and it would be protected of degradations by directed nucleases over that mRNA. All of this via linking the sites of union of the scaffolds with this important sites of the mRNA (nuclease recognition site, translation initiation site, rybozome binding site, etc). The scaffolds should be inactivated upon cell division via phosphorylation at the binding sites, liberating the mRNA. But how we can make a single condensate exist? The idea is to use the pyrenoid (a rubisco aggregate condensate that makes CO2 fixation more efficient) as an example for this, where, in a dynamics between phosphatases and kinases, the algae maintains the pyrenoid as a single condensate or multiple when needed. ¿So how all of this is going to work in order to make the assymetrical cell division system function?

  • Week 2 HW: DNA read, write and edit

    Homework Part 1: Benchling and in-silico gel art This is the virtual digestion of the Lambda genome with this restriction enzymes: EcoRI, HindIII, BamHI, KpnI, EcoRV, SacI and SalI.

  • Week 3 HW: Lab automation

    Homework I create the following code by drawing at the opentrons art platform, coping the coordenates and asking chatgpt to generate the code below the code provided by the original opentrons HTGAA google colabs. Colabs link mclover3_points = [(-27, 24),(-24, 24),(-21, 24),(-18, 24),(-30, 21),(-27, 21),(-18, 21),(-15, 21),(-3, 21),(0, 21),(-33, 18),(-30, 18),(-15, 18),(-12, 18),(-3, 18),(0, 18),(-33, 15),(-30, 15),(-24, 15),(-21, 15),(-18, 15),(-12, 15),(-3, 15),(0, 15),(-33, 12),(-27, 12),(-24, 12),(-21, 12),(-18, 12),(-15, 12),(-3, 12),(0, 12),(-33, 9),(-27, 9),(-24, 9),(-12, 9),(-3, 9),(0, 9),(-30, 6),(-9, 6),(-6, 6),(-3, 6),(0, 6),(-33, 3),(-3, 3),(0, 3),(-6, 0),(-3, 0),(0, 0),(15, 0),(18, 0),(21, 0),(24, 0),(27, 0),(-3, -3),(0, -3),(12, -3),(15, -3),(18, -3),(24, -3),(27, -3),(30, -3),(-3, -6),(0, -6),(9, -6),(12, -6),(21, -6),(33, -6),(-3, -9),(0, -9),(9, -9),(15, -9),(18, -9),(21, -9),(24, -9),(27, -9),(30, -9),(-3, -12),(0, -12),(12, -12),(15, -12),(18, -12),(21, -12),(24, -12),(27, -12),(30, -12),(-3, -15),(0, -15),(3, -15),(6, -15),(12, -15),(15, -15),(18, -15),(21, -15),(24, -15),(27, -15),(-3, -18),(0, -18),(-3, -21),(0, -21),(-3, -24),(0, -24),(-3, -27),(0, -27),(-3, -30),(0, -30),(-3, -33),(0, -33),(-3, -36),(0, -36),(-3, -39),(0, -39)] sfgfp_points = [(-24, 21),(-21, 21),(-27, 18),(-24, 18),(-21, 18),(-18, 18),(-27, 15),(-15, 15),(-30, 12),(-12, 12),(-30, 9),(-9, 9),(-33, 6),(-6, 3), (21, -3),(15, -6),(18, -6),(24, -6),(27, -6),(30, -6),(12, -9),(33, -9),(6, -12),(9, -12),(3, -18)] electra2_points = [(-3, 24),(0, 24)] def draw_points(color, points): DISPENSE_VOL = 0.75 pipette_20ul.pick_up_tip() for x, y in points: pipette_20ul.aspirate(DISPENSE_VOL, location_of_color(color)) adjusted_location = center_location.move(types.Point(x=x, y=y)) dispense_and_detach(pipette_20ul, DISPENSE_VOL, adjusted_location) pipette_20ul.drop_tip() draw_points('Green', mclover3_points) draw_points('Red', sfgfp_points) draw_points('Orange', electra2_points) The simulation provide the following artwork.

  • Week 4 HW: Protein desing part I

    Homework Part A. Conceptual Questions Answer any NINE of the following questions from Shuguang Zhang: Why do humans eat beef but do not become a cow, eat fish but do not become fish? Because eating beef doesn´t make us incorporate the genetic information that make a cow, and even if we could, we don´t have machinery to process that information and make us a cow.

  • Week 5 HW: Protein design part II

    Homework Part A: SOD1 Binder Peptide Design (From Pranam) Part 1: Generate Binders with PepMLM 1- Begin by retrieving the human SOD1 sequence from UniProt (P00441) and introducing the A4V mutation.

  • Genetic Circuits Part I: Assembly Technologies

    Assignment: DNA Assembly What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose? The high Phusion High-fidelity polymerase, MgCl2, dNTPs, etc. The polymerase enable low error in the polymerization reaction. What are some factors that determine primer annealing temperature during PCR? how much Hidrogen bombs are present (type of bases and how much).

  • Genetic Circuits Part II: Neuromorphic Circuits

  • Cell-Free Systems

  • Advanced Imaging & Measurement Technology

  • Bioproduction & Cloud Labs

  • Biodesign & Engineered Living Materials

  • Building Genomes

Subsections of Homework

Week 1 HW: Principles and Practices

cover image cover image

Project Description

I want to develop a synthetic assymetrical-cell-division system to enable synthetic cell differentiation; as a toolkit for engineer multicellular organization, development, pattern formation and others. To futher detail, see the page of the final project

The project have two major subprojects:

  • A mitotic counter: a circuit capable of counting as states how many cell division the linage have sense since the system has been activated (firts division, second, third, and so on). The system uses the natural fluctuations of the cell’s cycle regulators to induce or activate distinct proteins. When the system is activated, a TF would be activated by an cell-cycle-dependent kinase (at the G1 phase), that would induce the expression of a recombinase, that would inverse the sequence of its own promoter, inducing the expression of an inactivated, second, TF. This TF2 would be activated via a second cell-cycle-dependent kinase, one that would be activated in the final of the cell cycle and not be expressed at the same time with the firts kinase. The phosphorylation upon TF2 would, for example, link two homodymers together, in an activated form (an option, but not necessarily how it would work). When the kinase 2 activates TF2, gene expression would not be available because of chromatin condesation upon mitosis. After the cell divides, at G1, TF2 could induce the activation of other genes, that, in consenquence and, using the architecture, enables the activation of other genes at the second cell division, and so on, making possible to count cell division in the cell lineage. The system also needs other things such as a degradation system, repression for past states, etc.
  • Assymetrical component: ¿How can we engineer an assymetrical distribution of molecules to induce the assymetrical cell division? It should be transferable to other organisms. The proposed one is to used an synthetic phased segregated condensate that is capable of create one individual and stable condensate upon the cell. It should carry an mRNA that would express a TF. It should be formated upon the activation of the scaffolds and dissegregated upon the cell division, enabling the translation of the mRNA selected. The mRNA should be sequestred in the condensate, where its translation-initiation site would be blocked, and it would be protected of degradations by directed nucleases over that mRNA. All of this via linking the sites of union of the scaffolds with this important sites of the mRNA (nuclease recognition site, translation initiation site, rybozome binding site, etc). The scaffolds should be inactivated upon cell division via phosphorylation at the binding sites, liberating the mRNA. But how we can make a single condensate exist? The idea is to use the pyrenoid (a rubisco aggregate condensate that makes CO2 fixation more efficient) as an example for this, where, in a dynamics between phosphatases and kinases, the algae maintains the pyrenoid as a single condensate or multiple when needed.

¿So how all of this is going to work in order to make the assymetrical cell division system function?

When both systems are activated, the mitotic counter and the phase-condensate system, the single condensate is going to be segregated to any of the son’s cell, and, after the division have been made, the TF2 of the mitotic counter could express a kinase for the dissegregation of the condensate, enabling the translation of the mRNA. Also the system could make possible the use of logic gates, whether both components are present or no, x gene is activated (in the other cell that would have only the TF2 expressed, AND, NOT or OR gates based on this could enable the differentiation in this cell also, taking another fate that the father cells haved. Or even more complex differentiation dynamics). Anyways, the now different cell could express an auto-induced transcription regulator, that, as follows, would induce the differentiation of its own self, and his son’s cells (between other feedbacks). Also the mRNA TF should repress the component of mRNA degradation.

¿And now what?

Those systems are toolkits for the multicellular engineering dicipline. To directly create organisms, we would need to have an autonomous cell differentiation system, a system that by its own is able to activate itself. The mitotic counter, also, for example could work as a signaling event upon certains cell divisions in the designated cell lineage, to activate cell-to-cell communications upon that moment, and only in that moment for example. Possibilities are endless and only the imagination is the limit.


Governance Context

Autonomous multicellular systems capable of growth, differentiation, and spatial organization introduce additional biosafety and biosecurity considerations compared to single-cell engineering. Risks include unintended environmental persistence, uncontrolled proliferation, ecological interaction, and potential dual-use of self-assembling biological structures. Governance must therefore balance intrinsic biological containment, institutional oversight, and research feasibility.

Options considered

Option 1 — Mandatory Genetic Containment and Fail-Safe Design
Intrinsic safeguards such as auxotrophy, kill-switches, replication limits, and environmental sensitivity.

Option 2 — Project Licensing, Design Review, and Traceability
Registration, documentation, biosafety review, strain tracking, training, and incident reporting.

Option 3 — Voluntary Standards and Community Best Practices
Non-binding guidelines and self-governance without formal enforcement.

Scoring scale: 1 (low contribution) to 5 (strong contribution).

Does the option:Option 1Option 2Option 3
Enhance Biosecurity
• By preventing incidents542
• By helping respond352
Foster Lab Safety
• By preventing incident543
• By helping respond353
Protect the environment
• By preventing incidents541
• By helping respond342
Other considerations
• Minimizing costs and burdens to stakeholders235
• Feasibility?345
• Not impede research245
• Promote constructive applications453

Homework pre HW2:

Answers to Professor Jacobson

DNA replication is carried out by DNA polymerases, which exhibit a raw error rate of approximately 10⁻⁵ errors per base incorporated. However, proofreading activity and mismatch-repair pathways dramatically improve fidelity, reducing the effective error rate to roughly 10⁻⁹–10⁻¹⁰ per base.

Given that the human genome contains about 3 × 10⁹ base pairs, an uncorrected replication process would introduce tens of thousands of mutations per cell division. In practice, only a few or fewer mutations accumulate per division. This discrepancy is resolved through polymerase exonuclease proofreading and post-replication DNA repair systems that detect and correct mismatches before they become permanent mutations.

Because the genetic code is degenerate, most amino acids are encoded by multiple synonymous codons (on average roughly three per amino acid). Consequently, a typical human protein of several hundred amino acids could theoretically be encoded by an astronomically large number of different DNA sequences. In practice, however, only a small subset of these sequences functions efficiently. Constraints include codon bias and tRNA availability, unfavorable mRNA secondary structures, extreme GC content, unintended splice or regulatory signals, reduced translation efficiency, and technical limitations in DNA synthesis or cloning. These factors strongly restrict the number of sequences that yield robust protein expression.


Answers to Professor LeProust

The most widely used method for oligonucleotide synthesis is solid-phase phosphoramidite chemical synthesis. In this approach, DNA is assembled one nucleotide at a time on a solid support through automated cycles of nucleotide coupling, washing, and deprotection. This chemistry has remained the industry standard for decades because it is fast, scalable, and reliable for producing short DNA fragments.

Oligos longer than approximately 150–200 nucleotides are difficult to synthesize efficiently because each coupling step has slightly less than perfect efficiency. A small fraction of strands fails to extend at every cycle, and these losses accumulate exponentially as the number of cycles increases. As a result, the final mixture contains many truncated or error-containing products, while the yield of full-length DNA becomes very low. Additional chemical side reactions and base damage further degrade quality.

For this reason, a 2000 base-pair gene cannot be synthesized directly. The cumulative inefficiency across thousands of addition steps makes the probability of obtaining a correct full-length product essentially zero. Instead, long genes are constructed by synthesizing many shorter oligonucleotides and assembling them enzymatically using methods such as PCR assembly or Gibson assembly.


Answer to Professor Church

In most animals, including humans, nine amino acids cannot be synthesized de novo and must be obtained from the diet: histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, and valine. Some sources include arginine as conditionally essential, particularly in growing organisms, yielding a list of ten.

Because lysine is already universally essential, engineering a synthetic “lysine contingency” does not create a fundamentally new biological vulnerability. Instead, it exploits an existing metabolic dependency. Organisms would already require external lysine, so forcing supplementation simply makes this requirement explicit rather than introducing a novel Achilles’ heel. Consequently, such a contingency is predictable and limited as a containment strategy rather than uniquely robust.

Week 2 HW: DNA read, write and edit

Homework

Part 1: Benchling and in-silico gel art

Digestion of the lambda genome Digestion of the lambda genome

This is the virtual digestion of the Lambda genome with this restriction enzymes: EcoRI, HindIII, BamHI, KpnI, EcoRV, SacI and SalI.

And here is some artwork:

Digestion of the lambda genome Digestion of the lambda genome

Part 3: DNA design challenge

3.1- I have choose for the DNA nucleotidyllexotransferase, because it is used in the enzymatic DNA synthesis process, and been able to engineer it could conduct to synthesize DNA in vivo, thing that would make advance the field tremendously.

sp|P04053|TDT_HUMAN DNA nucleotidylexotransferase OS=Homo sapiens OX=9606 GN=DNTT PE=1 SV=3 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKG FRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMT GKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCE FRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAV LNDERYQSFKLFTSVFGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRMQKAGFLYYEDLV SCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGGFRRGKKMGHDVDFLITSPGSTEDEEQL LQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALDHFQKCFLIFKLPRQRVDSDQSS WQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKT KRIFLKAESEEEIFAHLGLDYIEPWERNA

3.2-

reverse translation of sp|P04053|TDT_HUMAN DNA nucleotidylexotransferase OS=Homo sapiens OX=9606 GN=DNTT PE=1 SV=3 to a 1527 base sequence of most likely codons.

atggatccgccgcgcgcgagccatctgagcccgcgcaaaaaacgcccgcgccagaccggc gcgctgatggcgagcagcccgcaggatattaaatttcaggatctggtggtgtttattctg gaaaaaaaaatgggcaccacccgccgcgcgtttctgatggaactggcgcgccgcaaaggc tttcgcgtggaaaacgaactgagcgatagcgtgacccatattgtggcggaaaacaacagc ggcagcgatgtgctggaatggctgcaggcgcagaaagtgcaggtgagcagccagccggaa ctgctggatgtgagctggctgattgaatgcattcgcgcgggcaaaccggtggaaatgacc ggcaaacatcagctggtggtgcgccgcgattatagcgatagcaccaacccgggcccgccg aaaaccccgccgattgcggtgcagaaaattagccagtatgcgtgccagcgccgcaccacc ctgaacaactgcaaccagatttttaccgatgcgtttgatattctggcggaaaactgcgaa tttcgcgaaaacgaagatagctgcgtgacctttatgcgcgcggcgagcgtgctgaaaagc ctgccgtttaccattattagcatgaaagataccgaaggcattccgtgcctgggcagcaaa gtgaaaggcattattgaagaaattattgaagatggcgaaagcagcgaagtgaaagcggtg ctgaacgatgaacgctatcagagctttaaactgtttaccagcgtgtttggcgtgggcctg aaaaccagcgaaaaatggtttcgcatgggctttcgcaccctgagcaaagtgcgcagcgat aaaagcctgaaatttacccgcatgcagaaagcgggctttctgtattatgaagatctggtg agctgcgtgacccgcgcggaagcggaagcggtgagcgtgctggtgaaagaagcggtgtgg gcgtttctgccggatgcgtttgtgaccatgaccggcggctttcgccgcggcaaaaaaatg ggccatgatgtggattttctgattaccagcccgggcagcaccgaagatgaagaacagctg ctgcagaaagtgatgaacctgtgggaaaaaaaaggcctgctgctgtattatgatctggtg gaaagcacctttgaaaaactgcgcctgccgagccgcaaagtggatgcgctggatcatttt cagaaatgctttctgatttttaaactgccgcgccagcgcgtggatagcgatcagagcagc tggcaggaaggcaaaacctggaaagcgattcgcgtggatctggtgctgtgcccgtatgaa cgccgcgcgtttgcgctgctgggctggaccggcagccgccagtttgaacgcgatctgcgc cgctatgcgacccatgaacgcaaaatgattctggataaccatgcgctgtatgataaaacc aaacgcatttttctgaaagcggaaagcgaagaagaaatttttgcgcatctgggcctggat tatattgaaccgtgggaacgcaacgcg

3.3- Codon optimazed for Saccharomyces cerevisiae: ATGGATCCACCAAGAGCATCACATTTGTCCCCAAGAAAAAAAAGACCTAGGCAAACTGGTGCATTAATGGCTTCCTCTCCACAGGACATTAAATTCCAAGACTTAGTCGTTTTTATCCTCGAAAAAAAGATGGGTACTACAAGAAGAGCTTTTCTCATGGAATTGGCTAGACGTAAAGGTTTCAGAGTTGAAAATGAACTCAGTGATTCTGTTACACATATTGTTGCTGAAAATAATTCTGGTTCAGATGTTTTGGAATGGTTGCAAGCTCAAAAAGTTCAAGTCTCTTCTCAACCAGAATTGTTAGATGTCAGTTGGTTAATAGAATGCATTAGAGCTGGTAAGCCAGTTGAAATGACTGGCAAACATCAATTAGTAGTAAGAAGAGATTACTCAGATTCAACAAATCCAGGTCCACCAAAAACTCCACCAATAGCTGTTCAAAAAATTTCTCAATACGCTTGTCAACGTAGGACTACTTTAAATAATTGTAATCAAATTTTTACAGATGCTTTTGACATCTTGGCTGAAAACTGTGAATTTAGAGAAAATGAAGATAGCTGCGTTACGTTTATGAGAGCTGCCTCAGTTTTAAAAAGCTTACCTTTTACTATAATTTCTATGAAAGATACCGAAGGAATTCCATGTTTGGGTTCAAAAGTTAAGGGTATTATAGAAGAAATAATTGAAGATGGTGAATCATCAGAAGTTAAAGCAGTGTTGAATGATGAAAGATACCAATCTTTTAAATTATTTACTTCAGTCTTTGGTGTTGGTTTAAAAACCTCCGAAAAATGGTTTAGAATGGGCTTCAGAACCTTGAGTAAGGTGAGATCTGATAAATCATTGAAATTTACGAGAATGCAAAAAGCTGGTTTTTTGTACTACGAGGATTTGGTCAGCTGTGTTACTAGAGCTGAAGCTGAAGCTGTTTCTGTTCTGGTTAAAGAAGCAGTTTGGGCATTTTTGCCTGATGCTTTCGTTACAATGACAGGTGGTTTTAGAAGGGGTAAAAAAATGGGTCATGATGTAGATTTTTTGATTACTTCCCCTGGTTCTACTGAAGATGAAGAACAATTGCTACAAAAAGTGATGAACTTATGGGAGAAAAAAGGTTTGTTGTTGTACTATGATTTAGTTGAGTCTACTTTTGAAAAGTTGAGATTACCATCTCGCAAAGTTGATGCTTTAGATCATTTCCAAAAATGTTTTTTGATCTTTAAGTTGCCAAGACAAAGAGTAGATTCCGATCAGAGTTCCTGGCAAGAAGGTAAGACTTGGAAAGCCATTAGAGTCGACTTGGTTCTGTGTCCATACGAAAGAAGAGCTTTTGCATTATTAGGTTGGACAGGTTCCAGACAATTTGAAAGAGATTTGAGAAGATATGCCACACATGAGAGAAAGATGATTTTAGATAACCATGCATTATATGATAAAACTAAAAGAATCTTCTTGAAAGCCGAATCAGAGGAAGAAATTTTTGCCCATCTAGGTTTGGATTACATTGAACCATGGGAGAGAAATGCT

Avoid cleavage sites of restriction enzymes: BbsI BsaI

3.4- I can produce it in S. cerevisiae via putting the fragment into a vector with the corresponding parts (RBS, promoter, terminator, etc). I can add tags in the 3´ sequence if i wanted to purify it with a desire method. Also cell-free methods could be use to produce it.

Part 4: prepare a Twist DNA Synthesis Order

Dig DigDig Dig

Part 5: DNA Read/Write/Edit

Part 5.1: DNA Read

5.1- What DNA would you want to sequence (e.g., read) and why?

For the develop of my personal project, i would need to sequence a lot of the plasmids and designs that i will design in it. Also, genomic DNA sequencing to see the integration of the sequences in the corresponding site might be needed.

(ii) In lecture, a variety of sequencing technologies were mentioned. What technology or technologies would you use to perform sequencing on your DNA and why?

TGS. Nanopore, for example. To plasmid and genomic site-especific integration view i would amplify the concentration of the sequences by PCR they, and i would fragmentate (e.g via sonification) the sequences. Then, i would design primers to bind to specific regions (e.g adeine tails) and add adaptors to the sequencing machine and barcodes to aling the fragments, and PCR the sequences. What is next is to prepare the material for the sequencing machine (dilute, pipete). And for transcriptome reading the use of a reverse transcriptase in the firts PCR is needed. For nanopore sequencing, after the preparation of the library, the sample is loaded onto the flow cell, where a flow of ions create a current that passes throught pores. A protein motor guide that pass of the dna throught the pore, where each base disrupt the voltage and the machine is able to interpret it. The raw electrical signals are stored as FAST5 and by algorithms is converted into FASTQ.

5.2 DNA write

(i) What DNA would you want to synthesize (e.g., write) and why?

I would to synthesize the components of the system (e.g transcription factors, kinases, phosphatases) and their respective vectors and cassettes components (promoters, RBSs, terminators). Maybe, as the design process uses tools of protein design, some components may need to be prepared in libraries where certain AA sequences are changign for a more desired output. And, for a more broad characterization, libraries where the components of the cassettes are different. For now, the components are TF1,TF2, peptidase 1, kinase 1 , cas 13 and their gRNA, condensate andamio, condesate andamio 2, TF3, TF 4 and TF5.
All of this components are general because some would be engineered (modified domains, design of motives, etc).

5.3 DNA edit

(i) What DNA would you want to edit and why?

For my system to work i would need to integrate the components in the genome with, maybe, some recombinase-use engineering.

Week 3 HW: Lab automation

Homework

I create the following code by drawing at the opentrons art platform, coping the coordenates and asking chatgpt to generate the code below the code provided by the original opentrons HTGAA google colabs.

Colabs link

mclover3_points = [(-27, 24),(-24, 24),(-21, 24),(-18, 24),(-30, 21),(-27, 21),(-18, 21),(-15, 21),(-3, 21),(0, 21),(-33, 18),(-30, 18),(-15, 18),(-12, 18),(-3, 18),(0, 18),(-33, 15),(-30, 15),(-24, 15),(-21, 15),(-18, 15),(-12, 15),(-3, 15),(0, 15),(-33, 12),(-27, 12),(-24, 12),(-21, 12),(-18, 12),(-15, 12),(-3, 12),(0, 12),(-33, 9),(-27, 9),(-24, 9),(-12, 9),(-3, 9),(0, 9),(-30, 6),(-9, 6),(-6, 6),(-3, 6),(0, 6),(-33, 3),(-3, 3),(0, 3),(-6, 0),(-3, 0),(0, 0),(15, 0),(18, 0),(21, 0),(24, 0),(27, 0),(-3, -3),(0, -3),(12, -3),(15, -3),(18, -3),(24, -3),(27, -3),(30, -3),(-3, -6),(0, -6),(9, -6),(12, -6),(21, -6),(33, -6),(-3, -9),(0, -9),(9, -9),(15, -9),(18, -9),(21, -9),(24, -9),(27, -9),(30, -9),(-3, -12),(0, -12),(12, -12),(15, -12),(18, -12),(21, -12),(24, -12),(27, -12),(30, -12),(-3, -15),(0, -15),(3, -15),(6, -15),(12, -15),(15, -15),(18, -15),(21, -15),(24, -15),(27, -15),(-3, -18),(0, -18),(-3, -21),(0, -21),(-3, -24),(0, -24),(-3, -27),(0, -27),(-3, -30),(0, -30),(-3, -33),(0, -33),(-3, -36),(0, -36),(-3, -39),(0, -39)]

sfgfp_points = [(-24, 21),(-21, 21),(-27, 18),(-24, 18),(-21, 18),(-18, 18),(-27, 15),(-15, 15),(-30, 12),(-12, 12),(-30, 9),(-9, 9),(-33, 6),(-6, 3),  (21, -3),(15, -6),(18, -6),(24, -6),(27, -6),(30, -6),(12, -9),(33, -9),(6, -12),(9, -12),(3, -18)]
electra2_points = [(-3, 24),(0, 24)]
def draw_points(color, points):
DISPENSE_VOL = 0.75
pipette_20ul.pick_up_tip()

for x, y in points:
  pipette_20ul.aspirate(DISPENSE_VOL, location_of_color(color))
  adjusted_location = center_location.move(types.Point(x=x, y=y))
  dispense_and_detach(pipette_20ul, DISPENSE_VOL, adjusted_location)

pipette_20ul.drop_tip()


draw_points('Green', mclover3_points)
draw_points('Red', sfgfp_points)
draw_points('Orange', electra2_points)

The simulation provide the following artwork.

Dig Dig

Find and describe a published paper that utilizes the Opentrons or an automation tool to achieve novel biological applications.

A massively parallel reporter assay library to screen short synthetic promoters in mammalian cells The paper uses massive parallel reporter assays (MPRAs) as a high-throughput method to cuantify the barcoded mRNA produced by a series of synthetic promoters, and how the regulatory dynamics change over different compounds. They tested 6144 promoters in conventional and specific mammalian cell lines The results were obtein by doing NGS and counting the reads of the mRNAs presents in the cells

Write a description about what you intend to do with automation tools for your final project

For my final project i can use cell-free protein synthesis to screen the post-translational modifications with fluorescence based arrays, in a cloud laboratory. Im going to express the cdks-cyclins complexes, a peptidase, and the correspond TF that would have the post-translational modifications. If im going to use a peptidase that depends on those cdks/cyclins complexes i would be easy, but it firts needs to exist, and not have other activations, otherwise the system would be active at a different time, and i find this very difficult. Another option could be design it If i design the peptidase, firts i should test it in the same type of array. I can build libraries of different changes in the domains and test the perfomance in a well plate.

A possible automated workflow would be:

Echo transfer DNA templates encoding CDK-cyclin complexes, transcription factor variants, protease/peptidase candidates, and fluorescent reporter constructs into specified wells of a 96-well or 384-well plate. Echo transfer any required cofactors, kinase reagents, ATP supplementation conditions, or small molecules into selected wells depending on the test condition. Bravo dispense or stamp cell-free protein synthesis master mix into all wells. Multiflo add lysate or additional common reagents to initiate expression. PlateLoc seal the plate to prevent evaporation. Inheco incubate the plate under defined temperature conditions to allow protein synthesis and post-translational regulation to occur. XPeel remove the seal after incubation. PHERAstar measure fluorescence or time-course reporter output to compare activation, background signal, and dynamic range across designs.

 Example pseudocode for automated screening workflow
constructs = load_construct_library([
    "CDK_cyclin_variants",
    "TF_variants",
    "protease_variants",
    "reporter_constructs"
])

conditions = define_conditions([
"with_CDK_cyclin",
"without_CDK_cyclin",
"with_protease",
"without_protease",
"different_cofactors",
"different_temperatures"
])

plate = create_plate(format="384-well")

for well in plate:
    assign_construct_combination(well, constructs, conditions)
    echo_transfer_DNA(well)
    echo_transfer_cofactors(well)

bravo_add_mastermix(plate, reagent="CFPS_master_mix")
multiflo_add_lysate(plate)

seal_plate(plate)
incubate_plate(plate, temperature=37, time="defined incubation period")
unseal_plate(plate)

results = measure_fluorescence(plate)
analyze_dynamic_range(results)
rank_best_designs(results)

Week 4 HW: Protein desing part I

Homework

Part A. Conceptual Questions

Answer any NINE of the following questions from Shuguang Zhang:

Why do humans eat beef but do not become a cow, eat fish but do not become fish? Because eating beef doesn´t make us incorporate the genetic information that make a cow, and even if we could, we don´t have machinery to process that information and make us a cow.

Why are there only 20 natural amino acids? I think because they were the optimal to being building blocks and have catalytic function. Also maybe have to be with the abundance of those in early life and that maybe suited interactions with RNAs at that time.

Can you make other non-natural amino acids? Design some new amino acids. My naive try is to mix diketopyrrolopyrrole as the functional group l l

(Quiting the H close to the N and making there a covalent bound). Maybe it could work as organic photovoltaics for protein electrical processing

Another could use cyclopentadienyl as the functioanl group to bind rare earth minerals

d d

Or pentamethylcyclopentadiene

d d

Where did amino acids come from before enzymes that make them, and before life started? From abiotic reactions in special enviroments, with rich carbon, nitrogen and oxygen sources.

If you make an α-helix using D-amino acids, what handedness (right or left) would you expect? left handedness.

Can you discover additional helices in proteins? I found that proteins contain alternative helical structures—most notably -helices, -helices, and polyproline helices.

Why are most molecular helices right-handed? I think the repulsion between the CO from the second and third AA, when curving to the left is going to be too strong (they are going to be facing each other). Also, the H-bounds formed in the right handed wouldn´t be so stable in the left handed.

Why do β-sheets tend to aggregate? For the repetitiveness of the motif that form the initial bounds of the B-sheets that are exposed on the other side of the firts B-sheet. What is the driving force for β-sheet aggregation? The hydrophobic effect, because it drives a hiddeness of the functional hydrophobic groups from aqueos solvent.

Why do many amyloid diseases form β-sheets? I think because, being b-sheets a common motif, if there are problems in the folding process, they could thend to aggregate. And if we think about the a-helix, it could be very difficult to aggregate there, so if b-sheets are termodynamic stable, it would be easy to misfolded protein to aggregate in conjuction if the hydrophobic motif enable them. Can you use amyloid β-sheets as materials? If they are well designed, yeah why not.

Design a β-sheet motif that forms a well-ordered structure. What comes to my mind is a toroid-like estructure.

Part B: Protein Analysis and Visualization

Briefly describe the protein you selected and why you selected it. I selected TdT, because it polymerases DNA without any template, and i find it cool.

Identify the amino acid sequence of your protein. MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKG FRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMT GKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCE FRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAV LNDERYQSFKLFTSVFGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRMQKAGFLYYEDLV SCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGGFRRGKKMGHDVDFLITSPGSTEDEEQL LQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALDHFQKCFLIFKLPRQRVDSDQSS WQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKT KRIFLKAESEEEIFAHLGLDYIEPWERNA

It have 509 AA. Lysine is the most frequent amino acid with 50 counts (9.82%). Its homologous to 250 proteins (using uniprot blast tool). Belongs to the DNA polymerase type-X family.

Identify the structure page of your protein in RCSB. 4I27 | pdb_00004i27. It was released in 2013-07-24. It has a good quality resolution (2.60 Å). In the structure is involved 2’,3’-DIDEOXY-THYMIDINE-5’-TRIPHOSPHATE, magnesium ion and sodium ion. Using the SCOP 2 database, it belong to the DNA nucleotidylexotransferase structural family.

d d

It have more helices thant sheets.

d d

It have a hole where DNA binds to it.

Part C: Using ML-Based Protein Design Tools

Deep Mutational Scans

d d

A series of mutations that stands out is in the T from the 273 AA to D and E (+2 score). Another close one is in the L in the 274 to D and E (+3 score). Another one is 264 E to L (4.76 score). This mutations are close to the binding pocket of the DNA, affecting maybe the catalysis of the TdT or the affinity for the DNA.

d d

In red L 274, in yellow T 273.

d d

In magenta E 264, in cyan single strand DNA.

Latent Space Analysis

d d

The neighborhoods approximate similar proteins! I could find neighborhoods of polymerases, transcription factors, etc etc. But i couldn´t find my protein.

Folding a protein

Fold your protein with ESMFold. Do the predicted coordinates match your original structure? I couldn`t see it with py3Dmol, so i open it with NGLviewer

d d

ptm: 0.756 plddt: 89.294

Try changing the sequence, first try some mutations, then large segments. Is your protein structure resilient to mutations?

264 E to L (Non-mutated: cyan, mutated: white.):

d d

ptm: 0.756 plddt: 89.287. The structure is maintain. This mutation could improve the binding to DNA or other ligands.

Here i tried a desestabilitation mutation predicted by the deep mutational scan. 256 F to P:

d d

ptm: 0.755 plddt: 89.634 It have the form of the non mutated version, but i thing it could work worse than the original protein. The binding pocket changes significantly.

Protein Generation. Inverse-Folding a protein

Let’s now use the backbone of your chosen PDB to propose sequence candidates via ProteinMPNN.

tmp, score=1.6401, fixed_chains=[], designed_chains=[‘A’], model_name=v_48_020 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVFGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRMQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGGFRRGKKMGHDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALDHFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.9088, seq_recovery=0.4665 APPKPVVVRPPPPPPPPPPPPPPPPSPSLRKFKDIHVYILEKNLGKKNREKLKEEARKAGFTVSDELDNDVTHIVAENLTGSEVLELLKESGVKLKNTPKLLKISWLKESIKAGKPVPITPEHILKVEPPLEDPTLPPPPPPPPPPPPPLSPYACKRRTTLENKNQKFVDAFETLAEYYEYNNKEEEARRYRRAAAQLRSLPFRIRSMEDLEGIPHIGPEIRAIIEEILKNGYSSKVEAILNDPYFQTMKLFTSVYGFGLKRAKKFYKRGYTSIEEVKNDKSIKFTDEQKAGLKYLEDLRRPITKEEALAIHEIIKKAVHKFLPDAEVVLVGSFRRGHKTSKDVDFLISSPTWKGDQTLLGKVIEHLKAEGKLLYYKLTPSTYDPNQLPSEDINAPPKFARVDMILKLPLSSEKPAEKGRKPGVDYKAVKVDLTLVPYERKAYALLAFTASPQFLEDLRRYAKEEKGMLLDDSSLYDLKKGEFISAKTVKEIFDALGLEYIPPEEMNA T=0.2, sample=0, score=0.9001, seq_recovery=0.4370 PPPPPVVPIPPPPPPPPPPPPPPPPPPSLQRFADIVVYVLPENLGEAEKAALEAALRAAGFTVESELSNKTTHIVAKNLTGKEVLELLEKSGFKLENKPKLLKISWAEECLAAGKPVPITPEHILPIEPPPAVPLLPPPPPPPPPPRPPLSPYACKRKCSLKDYNQKFVDAFEVLARYYEFKGDREKAERYRRAAAILKSLPFKIEKLSDLEGIPHIGPEIRALIEEILKNGYSSEVQAVLADPYFQTYKLFLSVYGFGLERAKEFYERGYTSIEEVKEDPSIEFTPKQKAGLKYLEDLRRPVTREEAEQIHEIIKEAVHAVLPDAKVEIVGRFRRGHETARDVDFLISSPSLTGDQSVLGRVIERLKEQGLLLYAEYTPSTYQPDQLPSEDIDAPPPYARVDMIFKLPLKPEAPAEAGREPGKTYKAVKVDLYLVPYENYAYALLYFTGSRQFIRDLKRYAKEERGMLLDYTRLYDKKKKEFIEAKTEQEIYDALGLEYIPPELRNA T=0.2, sample=0, score=0.9000, seq_recovery=0.4567 PPPPPVVVVPPPPPAPPPPPPPPPPPPALQKFKDITVYILEKNLGEKEKEKLKKKARAAGFNVSDKLNNDVTHIVAKNLTGQEVLDLIKASKVKLTKTPTLLKISWLEECIKAGRPVPITPEHLLPVEPPLAAPDLPPPPPPPPPPPPPVSPYACKRRCSLEDKNQKFVDTFEILAKYYEFKNKEEEAERYRRAAAILRSLPFEIRTMEDLEGLPHIDEEIRAIIKEILENGYSSLVEKILNDPYFQTMKLFTSIYGVGLKTAKKFYKKGYTTVEQVRNDPTIKFTPKQKAGLKYLEDLRRPITREEAEKIHEIIKEAVHKHAPDAIVEVVGRFRRGHKESRDVDFLISSPTWTGDQSLLGRVLEDLKKRGLLLYAELTPSTYQPNQLPSTDPNAPIPFERVDMIFKLPKSGEEPEEEGEPPGRDWKAVKVDLVLVPYERRAYALLYFTGSPQFVRDLMEYARNERGMLLDAYRLYDTKTNQFLTAASEEEIYAALGLEYIPPEQRNC T=0.2, sample=0, score=0.9176, seq_recovery=0.4724 APKPPKIPIPPPPPPPPPPPPPPPPPPSLQRFADIVVYILEKNLGKERQEKLAEAARAAGFRVSDELNNDVTHIVAYNLTGSEVLKWLKESKVKLTNTPKLLKISWLEACLKAGKPVPITPEYILEIEPPEADPLVPPPPPPPPPPRPPLSPYACLRECGLEDKNKEFVDTFDILAKYYEFKNDKEKARKYRRAAAILRSLPFKIESMEDLEGIPHIGEEIRKIIEEILENGYSSEVQKILNDEYYQTYKLFTSIYGIGLKRAKKFYKKGYKTIEQVKADKSIKFSKKQKAGLKYYKDLVSPVTREEAEAIHKIIKEAVHKFLPDAEVELVGSFRRGHETSKDVDFLISSPTWTGDQSLLGKVLELLKEKGLLLYAEFTPSTYQPDQLPSTDPDAPLPFARVDMIFKLPLKGEEPSDKGREPGVKYKAVKVDLTLVPYERRAYALLWLTASPQFRRDLMEYAAKERGKLLSATALYDTKTNTFLEAETEQEIYDHLGLPYIPPELRNC T=0.2, sample=0, score=0.9219, seq_recovery=0.4665 APLPPVVPIPPPPPPPPPPPPPPPPPPSLQKFKDITVYILEKNLGSEKKKELQEKARAAGFNVSEELNDDVTHIVAENLSGTEVLELLEKSNFKLSNKPLLLKISWLEACIEAGEPVPITPEHILPVEEPPLLPLLPPPPPPPPPPPPPVSPYCCERRCSLEDKNQKFVDAFDILAEYARFRDDEATARRYRRAAAQLKSLPFRIESLEDLEGIPHIGPEIRAIIKEILENGYSSEVKAILNDPYFQTMRLFLSIYGIGLKRAKKFYRMGYTTVEQLKADKSIKFTEKQKAGLKYLKDLTRPVSREEALAIHEIIRAAVHAVLPDAEVELVGSFARGHETSRDVDFLISSPTWTEDQTILGKVIAALEADGLLLYHEYTPSTYKENDLPSTDVNAPIPFAKVDMILKLPRAPALPAEKGEEPGQDWRAVRVDLYLAPYDRRAYARLYLTGSPQFVADLREYAQEKRGMLLDYSRLYDLKEKKFIPAESEQEIYDHLGLEYIPPEERNC T=0.2, sample=0, score=0.9058, seq_recovery=0.4587 GPPRPVVIRPPPPPPPPPPPPPPPVPPSLNKFRNITVYILEKNLGKKRRKELEEKARKAGFNVSEELNNNVTHIVAENLTGTEVEELLKKSKFKLKNKPKLLKISWLLESIKAGKPVPITPEHILKIEPPKEDPSIPPPPPEPPPPRPPLSPYACKRRTTLEDKNKKFVDTFNVLAEYYEFRKDKEKARKYRAAAAQLKSLPFEIKSIEDLEGIPHIGPEIRKIIKEILENGYSTEVEKILNDPYFQTYKLFTSIYGIGLSRADKFYKLGYTTIEEVKNDKSIKFTPKQKAGLKYLEDLQRPITKKEAEQIHKIIKEAVHKFLPDAEVELVGRFARGHETAKDVDFLISSPTLKGDQSFLQKVIDYLKEKGLLLYYEYTPSTYDPNRLPSRDIDAPSPFARVDMILKLPLEPEEPDEKGRKPGVNYRAVKVDLVLAPYERKAYAKLYFTASPQVREDLMRYAEEERGYLLSDDSLYDLKKKEFLKANSEEEIYKHLGLPYIPPTERNG T=0.2, sample=0, score=0.9172, seq_recovery=0.4114 KPKRKKKLKKRKPKPPPPPPPPPPPPPALQKFADITVYVLEKNMGKAAVEALKAKARAAGFTVSDELNNDVTHIVAKDMTGTEVKELLKASGKELKNEPLLLKISWLERSIKAGKPVEITPEDILPIEEPLEDPSLPPPPPPPPPPRPPLSPYACQRACGLRDLNAPFVAAFETLAEYYEFKDDRDRALRYRRAAAVLRSLPFRVRTLADLEGLPHIGPEIRALIAEILANGYSSEVEAVKADEYYITYKTFLSIYGIGLKRAKYFYAKGYTSIEELKADKSIKFTEKQKAGLKYLKDLRRPISKEEAEAIAEIIKRAVHEFLPDAKVEVVGRFARGHEESKDVDFLISSPTWKDDQTLLGKVIDKLKAEGLILYYEYTPSTYNPNDLPSTDPDAPPKFARVDMILKLPLTPEAPEEKDRPPGRDYRAVKVDLTLCPYERYAYALLYFTASPQFRRDLMLYAKEERGMLLDYSRLYDTKTKKFIKAKTVQEIFDALGLPYIPPELQNA T=0.2, sample=0, score=0.9068, seq_recovery=0.4567 SLPPVVVVRPPPPPPPPPPPPPPPPPPSLRRFANINVYILPENLGEREREALAARARAAGFNVSDELNNDVTHIVARNLTGTEVLEILEKSGIELKNTPLLLKISWLEESIRAGRPVPITPEHILPIEEPLADPTLPPPPPPPPPPRKPLSPYACQRETSLVNKNQAFVDTFDILAEAAEFRNDRETARRYRRAAAQLRSLPFPIRSEADLEGIPHIGPEIRAIIKEILENGYSSRVAAILADPYFRTKRLFTSVYGFGLATADRLYKRGYRTIEEVIADKSIKFTEEQKAGLKYLEDLRAPITREEAEAIHKIIKEAVHKFLPDAKVEMVGSFARGHKESKDVDFLISSPTLKGDQSVLQKVIDYLKEEGLLLYSKYTPSTYKPDDLPSRDPDAPLKFARVDMILKLPLTPAAPAEAGRKPGVDYKAVKVDLTLVPYSRYAYALLWLTASPQVREDLRLYAEEERGMLLDASALYDKKTKKFLPANSVEEIYAHLGLPYIPPELLNG

tmp, score=1.6412, fixed_chains=[], designed_chains=[‘A’], model_name=v_48_020 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVFGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRMQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGGFRRGKKMGHDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALDHFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.9274, seq_recovery=0.4429 PPPPPKVIRPPRPPPPPPPSPPPPPPPSLQKFKDIVVYVLEDNLGEKKRKELKEKLRAAGFTVSDKLNNDVTHIVAYNLTGTEVLDLIKASGIKLDNTPKLLKISWAEDCIEAGKPVEITPEYILPVEPPPKDKSIPPPPPPPPPPREPLSPYACKRRCSLIDYNKKFVDTFNILAEYYRFLNNSEKADKYNRAAAQLKSLPFEIKSMEDLEGIPHIYPEIRKIIEEILKNGYSTEVEKILNDPYFKTKKLFTSIYGFGLATADKYYKAGYTSIEEVKSDKSIKFSEEQKAGLKYLKDLTRPITREEALRIHEIIKEAVHAFLPDAIVELVGSFARGAETSRDVDFLISSPTWKGDQTLLEKVIEYLKEKGLLLYYKLTPSTYDPNALPSTDVNAPSPFQRVDMIFKLPLEEEEEQLGDRPPGKKWRAVKVDLTLVPYDRFAYARLYFTASPQFRRDLIEYARDERGMLLSSTSLYDLKKKEFISASSVEEIYAALGLPYIPPEELNC T=0.2, sample=0, score=0.9223, seq_recovery=0.4724 GPAPPVVVRPPPPPPPPPPPPPPPPPPSLQKFKDITVYVIEENLGKENREELEEKLRKAGFNVSERLNNDVTHIVGKNLTGSEVEELLKKSKIKLKNKPLLLKISWAEECIKAGKPVPITPEYILKVEPKLSLPLLPPPPPPPPPPPPPLSPYCCQRRCTLENKNQKFVDAFEILAKYAEFKDDRETAEKYRRAAAQLRSLPFAIKSVKDLEGIPHIDEEIRAIIEEILKNGYSSKVRAIKNDPYYQTMKLFTSIYGFGIGTAKKLYRAGYTSIEEVRADKSIKFTEEQKAGLKYYEDLVAPITREEAEQIHEVIKKAVHKFLPDAIVEMVGSFRRGAKESKDVDFLISSPTWKGDETVLQKVIDYLKKEGLVLYYKYTPSTYDPNQLPSEDIDAPPPFAKVDMIFKLPLKGESPAEKNRKPGRDWKAVKVDLVLTPYERYAYALLYFTASRQTRRDLIEYAKNERGMLLDYSSLYDKKENKFLEAETEEEIYAHLGLPYIPPEERNC T=0.2, sample=0, score=0.8952, seq_recovery=0.4567 PPPPKVVIKPPKPPPPPPPPPPPPPPPGLRKFADITVYIIEKNLGKKKREKLKAAARAAGFNVADELSNDVTHIVGYNLTGTEVLELLKKSGINLKNKPLLLKISWLEESLKAGRPVPITPEHILPVEPKPELPLLPPPPPPPPPPRPPLSPYCCKRRCSLEDKNQKFVDAFNVLAEYYEFNNDEEKAEKYRKAAAQLRSLPFRIESMEDLEGIPNIGEEVREIIKEILENGYSSKVEEILNDPYYQTKKLFLSVYGFGKKTADKLYKAGYTTIEQVKNDKSIKFTPEQKAGLKYLKDLTAPVTREEAEAIAEIIREAVHAVLPDAEVEVVGSFRRGAETSKDVDFLISSPTWAPDQSVLSRVIDRLKQQGLLLYAKLTPSTYDPDALPSEDVNAPLPFARVDMILKLPLPPAAPERRGRPPGKDWKAVKVDLTLCPWERRAYALLWLTGSPQFVRDLREYAREERGMLLSASSLYDRKAGRFLPARSEEEIYAALGLPYIPPEQRNC T=0.2, sample=0, score=0.9096, seq_recovery=0.4508 PAPPPVVIVPPKPPPPPPPPPPPPPPPSLQRFADITVYILPENLGEARRAELKARARAAGFTVSDELNDDVTHIVAYNRTGTEVLDILKKSGFKLSNKPLLLKISWLEASLEAGRPVPVTPEHILPVEPPPQEPLVPPPPPPPPPPREPLSPYACQRETSLTDKNKKFVDTFKVLAEYARFKNDEELALRYERAAAQLKSLPFEIRSMEDLEGIPHIDEEIRAIIEEILKNGYSSKVQAILNDPYFQTHKLFTSIYGIGLKLADKFYKKGYTSIEQVIADKSIKFTPEQKAGLKYLEDLQAPVTREEAEQIHKIIKEAVHKHLPDAEVVLVGSFARGAETSDDVDFLISSPTWKDDQSLLLKVIEDLKEKGLLLYYKYTPSTYDPDALPSTDDNAPIPFARVDMILKLPLTGAEPASAGRKPGKDWKAVKVDLTLVPYERKAYALLWLTGSPQFRRDLRRYALEERGMLLDYSALYNLETQEFIEAKSVEEIFEALGLPYIPPELLNA T=0.2, sample=0, score=0.9305, seq_recovery=0.4469 APKPKKVIKKPKPPPPPPPLPPPPPPPSLQKFKDITVYVLPENLGEEEVKKLKKELREAGFNVSDELNNDVTHIVAKDLTGTEVLELLKKSNVKLTNTPLLLKISWALESIKAGKPVPITPEHILPIEEPLEDPTVPPPPPPPPPPPPPLSPYACKRKTTLKDYNQKFVDAFNILAEYYEMLGKKEEARKYRRAAAQLKSLPFRIKSLEDLEGIPHIGPEVRAIIKEILENGYSSKVKAILNDPYYQTMKLFLSVYGIGLSRAKRLYKKGYTTIEQAKNDKSIKFSKKQKAGFKYYEDLTAPITKEEALKIHKIIKEAVHKFLPDAEVELVGSFARGHDTSRDVDFLISSPTWKDDQTVLQKVIDLLQKEGLLLYYELTPSTYQPNQLPSTDIDAPPTFQRVDMIFKLPLDESKPDEKGRKPGRTYKAVLVDLVFVPYERKAYALLWLTMSRQFRADLIEYAKEERNYLLSYDSLYNLKEQKFIEAKSVQEIFDNLGLEYIPPTQRNG T=0.2, sample=0, score=0.9164, seq_recovery=0.4528 PPPPVVVVRPPPPPPPPPPPPPPPPDPSLQKFKDITVYILEENLGEEERKELEEKARAAGFNVSKELNNDVTHIVAKNLTGTEVLELLESSNVVLENEPLLLKISWLEECIEAGKEVPITPEYILPVEEPLEVPLLPPPPPPPPPPRPPLSPYACKRRTSLKDLNQKFVDAFDILSQYYRMKNDSENARRYRRAAAQLKSLPFKITSMEDLEGIPHIGPEVRALIQEILENGYSSKVREILNDPYYQTMKLFLSVYGFGLSRANRLYKKGYTSIEQVKADPSIKFTPKQKAGLKYLEDLTRPVTKEEAEKIGEIIKEAVHKVLPDAKVTIVGSFARGAKTSKDVDFLISSPTLKNDQTVLQRVIDILKADGLLLYAELTPSTYDPNQLPSLDPNAPIPFEKVDMILKLPLSGEEEKEKGRKPGKNYRAVKVDLVLTPYERYAYALLYFTGSPQFVEDLRLYAREEKGMLLDYSNLYDTKKGEFLEAKTEKEIYDHLGLPYIPPEERNA T=0.2, sample=0, score=0.9163, seq_recovery=0.4705 SKPKKRIVKKPKPPAPPPPPPPPPPPPRLQRFADITVYILEKNLGKEAQDELKARARAAGFNVSDELNNDVTHIVAENLTGTEVLNLIKASGVELKNTPKLLKISWLEECIKAGRPVPITPEHILPVKPPLAVPGLPPPPPPPPPPRPPLSPYACKRKCSLENKNQKFVDAFNILADYYSFKNNTEKEIAYRRAAAQLKSLPFKITSMEDLEGIPHIGPEIRAIIEEILKNGYSSEVEKIKNDPYYQTMKLFTSVFGFGLSRAKKLYKKGYKTIEEVKNDPSIKFTPKQRAGLKYLEDLTSPVTREEAEAIGKIIKEAVHSVLPDAKVEIVGSFARGHKTSRDVDFLISSPTLTGDQSVLEKVLEKLKADGLLLYYELTPSTYDPDALPSEDPDAPPPFQRVEMILKLPLEGKEPEEAGRKPGRDWKAVRVDLVLAPYERYAYALLYFTNSPQTRRDLMEYARTERGKLLDYSRLYDKEKGEFIKAETEEEIYAALGLEYIPPELRNA T=0.2, sample=0, score=0.9087, seq_recovery=0.4469 APPPPVVPIPPPPPPPPPPPPPPPPPPALRKFSDITVYILEKNLGAAARDALTARARAAGFTVSDELNNDVTHIVARNLTGSEVLSWLKASGFKLKNKPLLLRIDWLEACLKAGRPVPITPEHILPVEPPPLDPSVPPPPPPPPPPPPPLSPYACKRRTSLKDLNQAFVDAFNVLAEYARMQKDTARADAYDRAAAQLKSLPFRVRTLEDLEGIPHIGADIRAIIAEILANGASSEVAAVLADPYFQTMKTFLSIYGIGLNTAEKFYKKGYTSIEQVRADKSIKFTPEQKAGLEYLEDLTRPVTRAEAEAIHRIIRAAVHAVLPDAQVELVGSFARGHETSRSVDFLISSPTWKGDQSLLGRVLERLRAEGLLLYAKLTPSTYKPDQLPSTDPNAPPPFQRVDMILKLPLPGEAPEAAGRPPGRDWRAVKVDLVLAPYERRAYARLWFTGSPQYRRDLMDYALKEKNMLLSYDSLYDLEKKEFLKAQTEQEIFDHLGLEYIPPEDRNA

New Sequence:PPPPPKVIRPPRPPPPPPPSPPPPPPPSLQKFKDIVVYVLEDNLGEKKRKELKEKLRAAGFTVSDKLNNDVTHIVAYNLTGTEVLDLIKASGIKLDNTPKLLKISWAEDCIEAGKPVEITPEYILPVEPPPKDKSIPPPPPPPPPPREPLSPYACKRRCSLIDYNKKFVDTFNILAEYYRFLNNSEKADKYNRAAAQLKSLPFEIKSMEDLEGIPHIYPEIRKIIEEILKNGYSTEVEKILNDPYFKTKKLFTSIYGFGLATADKYYKAGYTSIEEVKSDKSIKFSEEQKAGLKYLKDLTRPITREEALRIHEIIKEAVHAFLPDAIVELVGSFARGAETSRDVDFLISSPTWKGDQTLLEKVIEYLKEKGLLLYYKLTPSTYDPNALPSTDVNAPSPFQRVDMIFKLPLEEEEEQLGDRPPGKKWRAVKVDLTLVPYDRFAYARLYFTASPQFRRDLIEYARDERGMLLSSTSLYDLKKKEFISASSVEEIYAALGLPYIPPEELNC

I went and use ESMFold to see how the New Sequence folded. It was interesting. ptm: 0.761 plddt: 93.088 d d

Its different, but no at the same time, idk..

Then i wanted to try to change only the AA of the binding site. I use pymol to select the AA 4 A around the ligand, and asked chatgpt to write me a code to make this possible.

The code is: after print(f"Length of chain {chain} is {l}"), put:

positions_to_design = [253, 255, 256, 257, 258, 259, 260, 261, 262,
                   288, 332, 333, 336, 338, 340, 341, 342, 343,
                   345, 381, 397, 398, 405, 432, 434, 449, 450,
                   452, 457, 461]

fixed_positions_dict = {}
name = pdb_dict_list[0]['name']
fixed_positions_dict[name] = {}

for chain in designed_chain_list:
chain_length = len(pdb_dict_list[0][f"seq_chain_{chain}"])
fixed_positions = [i for i in range(1, chain_length + 1) if i not in positions_to_design]
fixed_positions_dict[name][chain] = fixed_positions

print("fixed_positions_dict:")
print(fixed_positions_dict)

And run the code.

tmp, score=1.1791, fixed_chains=[], designed_chains=[‘A’], model_name=v_48_020 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVFGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRMQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGGFRRGKKMGHDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALDHFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.8505, seq_recovery=0.6000 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVYGIGLKTSEKWFRMGFRTLSKVRSDKSLKFTRRQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGHKTSRKVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALPEFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVELVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.7625, seq_recovery=0.7000 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVYGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRTQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGHKESRDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALSPFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.8259, seq_recovery=0.6667 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVYGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRSQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGHKTSREVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALPDFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.7350, seq_recovery=0.7000 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVYGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRSQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGHKTSRDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALSPFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.7073, seq_recovery=0.6667 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVYGIGLKTSEKWFRMGFRTLSKVRSDKSLKFTRRQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGHKTSRDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALSPFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.7011, seq_recovery=0.7000 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVYGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRTQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGHKTSRDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALSPFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.7647, seq_recovery=0.6667 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFRSVYGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRTQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGHKTSRDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALPPFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.6985, seq_recovery=0.7000 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVYGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRTQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGAKTSRDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALSPFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA

tmp, score=1.1870, fixed_chains=[], designed_chains=[‘A’], model_name=v_48_020 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVFGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRMQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGGFRRGKKMGHDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALDHFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.7722, seq_recovery=0.6667 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVYGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRTQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGAKESRDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALSPFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLTLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.7871, seq_recovery=0.7000 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVFGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRKQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGHKTSRSVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALSPFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.7977, seq_recovery=0.7000 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVYGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRRQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGAKESRDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALPDFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.6841, seq_recovery=0.7333 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVYGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRRQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGHKTSRDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALDPFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.6955, seq_recovery=0.6333 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVYGIGLKTSEKWFRMGFRTLSKVRSDKSLKFTRTQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGHKTSRDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALSPFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLTLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.7886, seq_recovery=0.6000 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVYGIGLKTSEKWFRMGFRTLSKVRSDKSLKFTRTQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGRFRRGHKTSRDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALPDFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVVLTLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.7773, seq_recovery=0.7000 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVYGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRTQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGHKTARDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALPPFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA T=0.2, sample=0, score=0.7904, seq_recovery=0.6667 MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVYGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRTQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGAKTSRDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALPPFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVVLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA

New Sequence:MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVYGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRTQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGSFRRGAKESRDVDFLITSPGSTEDEEQLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALSPFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLTLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLDYIEPWERNA

This is a library with only the binding site changed. Some will perform better, some worse. If i want to optimaze the structure, i can do the inverse.

Part D. Group Brainstorm on Bacteriophage Engineering

I think the no-dependency for DnaJ can be solved computationaly. Software like proteinMPNN to do inverse folding of the sequence or other sequences, EvolvePro to do in silico mutagenesis to explore variants, fooldseek for search of other natural ocurring sequences with the same structure, ESMFold to corroborate the 3D structure, etc.

An idea that i have is to anchor a synthetic domain into the L protein to help the conformational change. So that one L protein could help another without the need of Dnaj. The interactions may be difficult to stablish, but this could help the acumulation of phages upon oligomeration. The problem is size of the MS2 genome. This could make it more bigger, more inefficient translation and replication, and less RNA accumulated in the capsid.

Other is use the Loajd mutant but with a low translation efficienty, changing nucleotides involve in the translation of the peptide, to help acumulate enough Loadj upon for the lysis. Even replication rate could be manipulated.

About the firts idea, the pipeline could be: Know the residues in Dnaj that make the interaction with L protein, and their position in the tridimentional structure. Then, use RFdiffusion or other software to stablish those residues as constrains and make a backbone for that domain (small if possible), or maybe using the entire domain in the Dnaj. Check folding, stability, etc. Anchor that domain with a linker into the L protein. If the interactions between the domain an the N-terminal domain doesn`t occur in the same protein would be good. A good design could help to this. Then test for interactions. How the domain interacts with N-terminous domain vs Dnaj, different possittion to put it, optimazing, etc. Alphafold-Multimer and Boltz-1 to this, proteinMPNN to make the inverse folding, ESMFold to check 3D structure, EvolvePro and ESM-2 for mutational scoring for optimazing and for knowing the residues involve in the interactions use Alphafold-Multimer and Boltz-1 to get the pdb and a visualization system to see the residues.

A pitfall could be not getting enough data of the interactions between the Dnaj and the L protein, or the designed peptides don´t work well because of the limits of the tools.

For the second idea it could be identify the translational motif that interact with the ribosome or other translational structure, find the sequences and see if there are data about more or less efficient translation rate acording to the sequence, and design the new sequence.

Week 5 HW: Protein design part II

Homework

Part A: SOD1 Binder Peptide Design (From Pranam)

Part 1: Generate Binders with PepMLM

1- Begin by retrieving the human SOD1 sequence from UniProt (P00441) and introducing the A4V mutation.

MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
(SOD1)

MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
(SOD1 A4V)

2- Using the PepMLM Colab linked from the HuggingFace PepMLM-650M model card. 3- Generate four peptides of length 12 amino acids conditioned on the mutant SOD1 sequence:

0	WHYGVTGLELKE, 21.127040

1	WRVYAVALEWKX, 12.035041

2	KRYYAVAVAHK, 18.943191

3	WRYPVAAAEHGX, 8.500123

Control: FLYRWLPSRRGG (X were A in all the peptides)

Part 2: Evaluate Binders with AlphaFold3

1- Navigate to the AlphaFold Server: alphafoldserver.com. 2- For each peptide, submit the mutant SOD1 sequence followed by the peptide sequence as separate chains to model the protein-peptide complex. 3- Record the ipTM score and briefly describe where the peptide appears to bind. Does it localize near the N-terminus where A4V sits? Does it engage the β-barrel region or approach the dimer interface? Does it appear surface-bound or partially buried?

Results: P0: ipTM = 0.22 pTM = 0.8. It appears to bind to the dimer interface, surface-bound.

P1: ipTM = 0.2 pTM = 0.83. Binds between the N-terminus and the B-barrel, partially buried. (ipTM scores 0.3 in other run that i make accidentally)

P2: ipTM = 0.48 pTM = 0.89. Binds between the B-barrel and the dimer interface, partially buried?

P3: ipTM = 0.37 pTM = 0.83. Dimer interface, partially buried?

Control: ipTM = 0.38 pTM = 0.83. Dimer interface, partially buried.

4- In a short paragraph, describe the ipTM values you observe and whether any PepMLM-generated peptide matches or exceeds the known binder. The ipTM values where generally shorts, but the P2 is the most notorious of the generated peptides. This also exceeds the know binder.

Part 3: Evaluate Properties of Generated Peptides in the PeptiVerse

1- Paste the peptide sequence. 2- Paste the A4V mutant SOD1 sequence in the target field. 3- Check the boxes

(For the P0, as example) 💧 Solubility,Soluble 1.000 🔬 Permeability (Penetrance), Non-permeable 0.043 🩸 Hemolysis, Non-hemolytic 0.067 👯 Non-Fouling, Fouling 0.360 ⏱️ Half-Life 1.167 h 🔗 Binding Affinity, Weak binding 5.364 ⚖️ Molecular Weight 1431.6 ⚡ Net Charge (pH 7) -1.14 🎯 Isoelectric Point 5.55 💦 Hydrophobicity (GRAVY) -0.50

P0: 5.364,pKd/pKi

P1: 6.013,pKd/pKi

P2: 5.406,pKd/pKi

P3: 5.446,pKd/pKi

Control: 5.555,pKd/pKi

Compare these predictions to what you observed structurally with AlphaFold3. In a short paragraph, describe what you see. Do peptides with higher ipTM also show stronger predicted affinity? Are any strong binders predicted to be hemolytic or poorly soluble? Which peptide best balances predicted binding and therapeutic properties? Well, not necesarily. P2 wasn´t the best, as in comparison with P1. There is no correlation per se between AlphaFold 3 and PeptiVerse. No, there is no one with high hemolysis and poorly soluble. Im going to continue with P1 as has the best binding predicted, the other components seems to be okay, and in AlphaFold 3 the ipTM wasn´t so bad.

Part 4: Generate Optimized Peptides with moPPIt

After generation, briefly describe how these moPPit peptides differ from your PepMLM peptides. How would you evaluate these peptides before advancing them to clinical studies?

1: DPTVLLLDFNWL a: 5.3629 (AF: ipTM = 0.48 pTM = 0.86)

2: MLAGIFGATPGT a: 5.3119 (ipTM = 0.37 pTM = 0.87)

3: WTPEEQRQWREL a: 7.0440 (ipTM = 0.69 pTM = 0.88)

4: TWEEWMKIYYEA a: 7.6667 (ipTM = 0.67 pTM = 0.89)

5: CPSAAWVCGPIW a: 7.5019 (ipTM = 0.18 pTM = 0.69)

ipTM were very low. Some were far away from the 1-10 region selected. 1, 4 and 5 were closest to the N-terminus, all surface bounded.

Part C: Final Project: L-Protein Mutants

Im going for the Option 2: Mutagenesis using Af2-Multimer. Going to optimize the structure of the N-terminus domain while optimizing worst binding with Dnaj.

Useful data:

L protein sequence: METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT

1-30: METRFPQQSQQTPASTNRRRPFKHEDYPCR

31-75: RQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT

Dnaj sequence: MAKQDYYEILGVSKTAEEREIRKAYKRLAMKYHPDRNQGDKEAEAKFKEIKEAYEVLTDSQKRAAYDQYGHAAFEQGGMGGGGFGGGADFSDIFGDVFGDIFGGGRGRQRAARGADLRYNMELTLEEAVRGVTKEIRIPTLEECDVCHGSGAKPGTQPQTCPTCHGSGQVQMRQGFFAVQQTCPHCQGRGTLIKDPCNKCHGHGRVERSKTLSVKIPAGVDTGDRIRLAGEGEAGEHGAPAGDLYVQVQVKQHPIFEREGNNLYCEVPINFAMAALGGEIEVPTLDGRVKLKVPGETQTGKLFRMRGKGVKSVRGGAQGDLLCRVVVETPVGLNERQKQLLQELQESFGGPTGEHNSPRSKSFFDGVKKFFDDLTR

  Position Wild_Type_AA Mutation_AA  LLR_Score

989 50 K L 2.561464 574 29 C R 2.395427 769 39 Y L 2.241778 575 29 C S 2.043150 173 9 S Q 2.014323 573 29 C Q 1.997049 572 29 C P 1.971028 569 29 C L 1.960646 987 50 K I 1.928798 1049 53 N L 1.864930 1209 61 E L 1.818096 1029 52 T L 1.813965 984 50 K F 1.802066 576 29 C T 1.797247 568 29 C K 1.795877 93 5 F Q 1.795244 94 5 F R 1.659716 560 29 C A 1.648655 534 27 Y R 1.628060 434 22 F R 1.602028 92 5 F P 1.596889 997 50 K V 1.594573 995 50 K S 1.574555 96 5 F T 1.559023 95 5 F S 1.556416 889 45 A L 1.539248 775 39 Y S 1.517457 535 27 Y S 1.497052 789 40 V L 1.477630 529 27 Y L 1.474638 435 22 F S 1.423357 563 29 C E 1.383282 760 39 Y A 1.364997 571 29 C N 1.362601 980 50 K A 1.357792 567 29 C I 1.344121 89 5 F L 1.332615 334 17 N R 1.323652 767 39 Y I 1.320101 776 39 Y T 1.302803 514 26 D R 1.268762 566 29 C H 1.246106 764 39 Y F 1.245850 777 39 Y V 1.244389 454 23 K R 1.236555 494 25 E R 1.229349 474 24 H R 1.227778 996 50 K T 1.222128 533 27 Y Q 1.218850 536 27 Y T 1.215567

Amino Acid Position Score 0 L 50 2.561468 1 L 39 2.241780 2 I 50 1.928801 3 L 53 1.864932 4 L 52 1.813968 5 F 50 1.802069 6 V 50 1.594576 7 S 50 1.574557 8 L 45 1.539248 9 S 39 1.517457 10 L 40 1.477630 11 A 39 1.364999 12 A 50 1.357795 13 I 39 1.320103 14 T 39 1.302804 15 F 39 1.245851 16 V 39 1.244390 17 T 50 1.222131 18 L 54 1.120860 19 R 39 1.064191

 Position Wild_Type_AA Mutation_AA  LLR_Score

989 50 K L 2.561468 574 29 C R 2.395427 769 39 Y L 2.241780 575 29 C S 2.043150 173 9 S Q 2.014325 573 29 C Q 1.997049 572 29 C P 1.971029 569 29 C L 1.960646 987 50 K I 1.928801 1049 53 N L 1.864932 996 50 K T 1.222131 533 27 Y Q 1.218851 536 27 Y T 1.215567

8 copias: METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT:METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT

I select this mutations based on the scores predicted by the AIs and the N-terminus region utilize to model: C29R, C29S, S9Q, C29Q, C29P, C29L, C29T, C29K, F5Q, F5R, C29A, Y27R, F22R, F5P, F5T, F5S, Y27S, Y27L, F22S, C29E, C29N, C29I, F5L, N17R, D26R, C29H, K23R, E25R, H24R, Y27Q, Y27T

I make a code with chatgpt to from the mutations (using only 1-30 AA of the L protein) → ESM2 LLR → choose top candidates stables → ColabFold peptide only → ColabFold peptide + chaperon → score final by:

  • mono_pLDDT high
  • mono_pTM high
  • complex_ipTM low
  • contact peptide–chaperon low
  • final_score high
d d

Genetic Circuits Part I: Assembly Technologies

Assignment: DNA Assembly

What are some components in the Phusion High-Fidelity PCR Master Mix and what is their purpose?

The high Phusion High-fidelity polymerase, MgCl2, dNTPs, etc. The polymerase enable low error in the polymerization reaction.

What are some factors that determine primer annealing temperature during PCR?

how much Hidrogen bombs are present (type of bases and how much).

There are two methods from this class that create linear fragments of DNA: PCR, and restriction enzyme digests. Compare and contrast these two methods, both in terms of protocol as well as when one may be preferable to use over the other.

Restriction enzyme linearization could be more sequence dependent, and also not super escalable. In comparison, PCR linearization is more independent in those aspects, but have the downsides of primer design.

How can you ensure that the DNA sequences that you have digested and PCR-ed will be appropriate for Gibson cloning?

By amplifying with exacts overlaps fragments that flank the desire product.

How does the plasmid DNA enter the E. coli cells during transformation?

By making the cells competetent by (normally) two ways: electroporation and heatshock, inducing pores into the membrane that enables the DNA to entry into the cells.

Describe another assembly method in detail (such as Golden Gate Assembly). Explain the other method in 5 - 7 sentences plus diagrams (either handmade or online). Model this assembly method with Benchling or Asimov Kernel!

Golden Gate Assembly is a method of cloning that uses IIs restriction enzymes. Thesee enzymes cut not in but but close the recognition site, allowing to make specifics overhangs to assembly the parts. There entry vectors and overhangs to the parts, containing the correspond recognition sites and future overhangs.

This method allow to cloning up to 50 parts in one reaction, useful for multigene and from library-parts assembly. There is a common notation to the overhangs based on the type of parts, enabling MoClo (modular cloning) (leves 0: parts of a gene assembly; level 1: multigene assembly; level 2: multi-multigene assembly).

THe reaction usually is the following: (5 minutes 40 C, 5 minutes 16 C)x50, 5 minutes 50 C. The entry vectors usually have a sequence of Lacz for blue-white colony selection, where the whites have final insert.

For the Asimov (Kernel) i didn´t have the acces to the API to do the homework

Genetic Circuits Part II: Neuromorphic Circuits

Cell-Free Systems

Advanced Imaging & Measurement Technology

Bioproduction & Cloud Labs

Biodesign & Engineered Living Materials

Building Genomes

Subsections of Labs

Week 1 Lab: Pipetting

cover image cover image

Subsections of Projects

Individual Final Project

cover image cover image

Group Final Project

cover image cover image