Generative Interaction Protein Screening
1 Product Overview
This product uses the generative model RFdiffusion to design, de novo, artificial proteins predicted to interact strongly with a target, and integrates Foldseek, HDOCK, and AlphaFold 3 to screen and validate natural proteins. We call this computational workflow InterProDesign (IPD, Interaction Protein Design).
IPD makes screening for natural proteins that interact with a target bait protein more efficient and accurate, and yields a ranked set of candidates to guide further experimental validation.
2 Product Screening Workflow
2.1 Designing Interaction Proteins with RFdiffusion
First, RFdiffusion designs 100 de novo protein candidates predicted to interact strongly with the bait protein, based on the bait protein's known structure. The RFdiffusion model infers protein-binding hotspots and generates candidates with physically plausible structures.
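As a rough illustration of this step, the sketch below assembles a command line for RFdiffusion's `run_inference.py` script in binder-design mode. The bait file name, chain and residue range (`A1-150`), binder length range (70 to 100 residues), and output prefix are hypothetical placeholders, not values used by the IPD workflow. Hotspot residues are left optional here; the actual IPD settings are not specified in this document.

```python
def build_rfdiffusion_command(bait_pdb, out_prefix, target_contig,
                              binder_length, num_designs=100, hotspots=None):
    """Assemble an argument list for RFdiffusion binder design.

    All concrete values passed in are illustrative assumptions.
    """
    shortest, longest = binder_length
    args = [
        "python", "scripts/run_inference.py",
        f"inference.input_pdb={bait_pdb}",
        f"inference.output_prefix={out_prefix}",
        f"inference.num_designs={num_designs}",
    ]
    if hotspots:
        # Optional: bias designs toward specific bait residues, e.g. ["A59", "A83"]
        args.append(f"ppi.hotspot_res=[{','.join(hotspots)}]")
    # Keep the bait residues fixed, end that chain ("/0"),
    # then diffuse a new binder chain of the given length range.
    args.append(f"contigmap.contigs=[{target_contig}/0 {shortest}-{longest}]")
    return args


# Hypothetical inputs: a bait structure with chain A, residues 1-150.
cmd = build_rfdiffusion_command("bait.pdb", "designs/binder", "A1-150", (70, 100))
print(" ".join(cmd))
```

The list form can be passed directly to `subprocess.run` from within an RFdiffusion checkout, which avoids shell-quoting problems with the bracketed contig string.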
2.2 Searching for Natural Proteins with Foldseek
The 100 designed interaction proteins are input into the Foldseek tool to search for structurally similar natural proteins in the target species' protein 3D structure database. Foldseek compares protein 3D structures and calculates a bit score based on structural similarity. Only results with a bit score greater than 50 are retained, and they are ranked by score in descending order.
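The filtering and ranking rule described above can be sketched as follows, assuming the hits are stored in Foldseek's default tab-separated output (columns: query, target, fident, alnlen, mismatch, gapopen, qstart, qend, tstart, tend, evalue, bits). The example rows and identifiers are invented for illustration only.

```python
import csv
import io

# Default column order of Foldseek's tabular output
FOLDSEEK_COLUMNS = [
    "query", "target", "fident", "alnlen", "mismatch", "gapopen",
    "qstart", "qend", "tstart", "tend", "evalue", "bits",
]


def filter_foldseek_hits(handle, min_bits=50.0):
    """Keep hits with bit score strictly above min_bits, best first."""
    reader = csv.DictReader(handle, fieldnames=FOLDSEEK_COLUMNS, delimiter="\t")
    hits = []
    for row in reader:
        bits = float(row["bits"])
        if bits > min_bits:
            hits.append({"query": row["query"], "target": row["target"], "bits": bits})
    # Rank by bit score in descending order
    hits.sort(key=lambda hit: hit["bits"], reverse=True)
    return hits


# Invented example rows (design IDs and accessions are placeholders)
example = (
    "design_001\tP11111\t0.31\t95\t60\t2\t1\t95\t10\t104\t1.2e-08\t212\n"
    "design_001\tP22222\t0.18\t80\t62\t3\t5\t84\t20\t99\t3.0e-01\t50\n"
    "design_002\tP33333\t0.27\t90\t61\t1\t1\t90\t15\t104\t4.5e-12\t340\n"
)
ranked_hits = filter_foldseek_hits(io.StringIO(example))
for hit in ranked_hits:
    print(hit["query"], hit["target"], hit["bits"])
```

Note that a hit with a bit score of exactly 50 is dropped, because the rule retains only scores greater than 50.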
2.3 Quantitative Binding Energy Evaluation and Overall Structural Assessment
For the top-scoring natural proteins from the Foldseek screening, HDOCK is used to estimate the binding strength between each protein and the bait protein, and those with the most favorable (most negative) docking scores are selected. AlphaFold 3 is then used to assess the confidence of the overall complex structure formed by each selected protein and the bait protein.
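A minimal sketch of the hand-off from docking to structure prediction is shown below. It assumes HDOCK's convention that a more negative docking score indicates more favorable binding, and it assumes that the shortlist passed to AlphaFold 3 is simply the best-scoring candidates; the selection rule actually used by IPD is not stated in this document. The default of 20 mirrors the number of candidates evaluated with AlphaFold 3 in the deliverables, and the scores are invented.

```python
def shortlist_for_alphafold(docking_scores, top_n=20):
    """Return the top_n candidates with the most negative docking scores.

    docking_scores maps a candidate ID to its HDOCK docking score
    (assumed convention: more negative = more favorable).
    """
    ranked = sorted(docking_scores.items(), key=lambda item: item[1])
    return [candidate for candidate, _ in ranked[:top_n]]


# Invented docking scores for four placeholder candidates
scores = {"P11111": -245.3, "P22222": -310.8, "P33333": -198.0, "P44444": -276.1}
shortlist = shortlist_for_alphafold(scores, top_n=2)
print(shortlist)
```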
2.4 Final Screening of Natural Interaction Proteins
By ranking the results of the binding energy evaluation and the overall structural assessment, the natural proteins most likely to interact with the bait protein are identified. These final results provide key targets for subsequent experiments and reduce the workload and cost of experimental screening.
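One simple way to combine the two rankings is to average each candidate's rank under both criteria. The sketch below does this under stated assumptions: the mean-rank rule is an illustrative choice, not the documented IPD scoring scheme, and the AlphaFold 3 confidence is represented by an interface confidence value such as ipTM (higher is better). All numbers are invented.

```python
def rank_positions(values, reverse=False):
    """Map each key to its 1-based rank after sorting by value."""
    ordered = sorted(sorted(values), key=values.get, reverse=reverse)
    return {name: position + 1 for position, name in enumerate(ordered)}


def combined_ranking(docking_scores, confidence):
    """Order candidates by mean rank across docking score and confidence."""
    shared = sorted(set(docking_scores) & set(confidence))
    # Docking score: more negative is better, so rank ascending
    dock_rank = rank_positions({k: docking_scores[k] for k in shared})
    # Confidence: higher is better, so rank descending
    conf_rank = rank_positions({k: confidence[k] for k in shared}, reverse=True)
    mean_rank = {k: (dock_rank[k] + conf_rank[k]) / 2 for k in shared}
    # Ties are broken alphabetically by candidate ID
    return sorted(shared, key=lambda k: (mean_rank[k], k))


# Invented values for three placeholder candidates
docking = {"P11111": -280.0, "P22222": -250.0, "P33333": -310.0}
iptm = {"P11111": 0.82, "P22222": 0.40, "P33333": 0.85}
final_order = combined_ranking(docking, iptm)
print(final_order)
```

Rank averaging avoids having to put docking scores and confidence values on a common numeric scale, at the cost of ignoring how large the gaps between candidates are.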
3 Deliverables
1 | 3D Structural Model of the Bait Protein with Modeling Scores Provided |
2 | All Generated Binding Peptide Sequences and Corresponding PDB Files, Including Affinity Evaluation with the Bait Protein |
3 | Similarity Scores Between Binding Peptides and Natural Proteins from Foldseek Comparison |
4 | Raw HDOCK Docking Data for the Bait Protein and Top 200 Candidate Proteins |
5 | Raw AlphaFold 3 Evaluation Data for the Bait Protein and Top 20 Candidate Proteins |
6 | Detailed Annotation Information for the Top 200 Candidate Proteins |
7 | Complete Excel Dataset and Project Service Report |