Cb-dock: a web server for cavity detection-guided protein–ligand blind docking

Cb-dock: a web server for cavity detection-guided protein–ligand blind docking

Play all audios:

Loading...

ABSTRACT As the number of elucidated protein structures is rapidly increasing, the growing data call for methods to efficiently exploit the structural information for biological and


pharmaceutical purposes. Given the three-dimensional (3D) structure of a protein and a ligand, predicting their binding sites and affinity are a key task for computer-aided drug discovery.


To address this task, a variety of docking tools have been developed. Most of them focus on docking in the preset binding sites given by users. To automatically predict binding modes without


information about binding sites, we developed a user-friendly blind docking web server, named CB-Dock, which predicts binding sites of a given protein and calculates the centers and sizes


with a novel curvature-based cavity detection approach, and performs docking with a popular docking program, Autodock Vina. This method was carefully optimized and achieved ~70% success rate


for the top-ranking poses whose root mean square deviation (RMSD) were within 2 Å from the X-ray pose, which outperformed the state-of-the-art blind docking tools in our benchmark tests.


CB-Dock offers an interactive 3D visualization of results, and is freely available at http://cao.labshare.cn/cb-dock/. You have full access to this article via your institution. Download PDF


SIMILAR CONTENT BEING VIEWED BY OTHERS DRUGREP: AN AUTOMATIC VIRTUAL SCREENING SERVER FOR DRUG REPURPOSING Article 10 October 2022 AUTOMATION OF ABSOLUTE PROTEIN-LIGAND BINDING FREE ENERGY


CALCULATIONS FOR DOCKING REFINEMENT AND COMPOUND EVALUATION Article Open access 13 January 2021 SQM2.20: SEMIEMPIRICAL QUANTUM-MECHANICAL SCORING FUNCTION YIELDS DFT-QUALITY PROTEIN–LIGAND


BINDING AFFINITY PREDICTIONS IN MINUTES Article Open access 06 February 2024 INTRODUCTION Protein–ligand docking has been widely used to predict binding modes and affinities of ligands.


Protein–ligand docking is a powerful tool for computer-aided drug discovery (CADD). Currently, there are dozens of commercial and academic tools available for protein–ligand docking


[1,2,3,4,5,6,7,8,9,10,11,12]. Most docking tools require the ligand binding region (the rotation and translation of a ligand in this region) in advance to search for the most energy


favorable binding mode. The binding region is usually represented as a cubic box, so its size and center are critical for accurate docking because it defines the boundaries of the


conformational sampling space. In many application scenarios, the binding regions are unknown. To identify potential interactions between a given protein and a ligand, docking has to be


performed on the entire protein surface to find the most probable binding mode. This process is called blind docking [13,14,15,16]. Compared to regular docking, blind docking is less


reliable and stable as the docking space is usually too large to sufficiently sample using a limited number of random searches. Nevertheless, blind docking is particularly valuable for


discovering unexpected interactions that may occur in unidentified binding modes [17]. Traditionally, blind docking is performed on the entire protein surface. Alternatively, docking on


putative binding regions of the given protein usually improves the sampling efficiency and reduces the computational cost of blind docking [18]. Currently, many binding site detection tools


have been developed [19,20,21,22,23,24,25,26,27,28,29]. These methods help users find residues that potentially bind with ligands. However, users must cluster residues into groups and


estimate the parameters manually and then perform several rounds of protein–ligand docking to obtain the final result. Although this process is feasible, it is not efficient and has not been


systematically optimized. To address this problem, several blind docking tools have been developed in recent years that have integrated cavity detection with a focused docking module. For


example, popular software SwissDock [30, 31], QuickVina-W [15] and BSP-SLIM [32] provide particularly valuable services for blind docking. In this paper, we described a new blind docking


tool, named CB-Dock, which focuses on enhancing the docking accuracy. CB-Dock predicts binding regions of a given protein, calculates the centers and sizes with a curvature-based cavity


detection approach, and performs docking with the state-of-the-art docking software Autodock Vina [33]. CB-Dock also ranks the binding modes according to Vina scores and provides an


interactive 3D visualization of the binding modes. Our benchmark tests show an ~70% success rate for the top-ranking poses whose root mean squared deviation (RMSD) was within 2 Å from the


position in the X-ray crystal structure. It is notably higher than the traditional blind docking method (~40%) and outperformed other popular blind docking tools. The server of CB-Dock is


freely available at http://cao.labshare.cn/cb-dock/, together with additional documentation and tutorials. MATERIALS AND METHODS BENCHMARK DATASET PDBBIND SET A total of 1684 protein–ligand


structures were selected from PDBbind (v2018) [8, 34]. The molecular weights of the proteins were limited to 150~500 g/mol, and the numbers of rotatable bonds were within 10. In addition,


the proteins that share 60% or more similarity to the Astex Diverse Set or MTiAutoDock data were eliminated. The structures can be downloaded from our website


http://cao.labshare.cn/cb-dock/. ASTEX DIVERSE SET The Astex Diverse Set contains 85 protein–ligand complexes [35], which were downloaded from the Protein Data Bank [36]. The redundant


identical chains, water molecules, and heteroatoms were discarded. MTIAUTODOCK SET The test data are from the benchmark set of MTiOpenScreen [37]. The data contains 27 crystal structures


that cover important drug targets, including enzymes, GPCRs, nuclear receptors, and PPIs. APO STRUCTURE SET The above Astex Diverse Set is composed of protein–ligand complex (holo)


structures. To test the docking in the unbound state (apo) of proteins, we collected 19 apo protein structures [18] available in the Astex Diverse Set. Each apo structure corresponds to a


holo structure in the Astex Diverse Set. The sequence identity and coverage of each pair are greater than 95%. To compare the accuracy of the docking results, we superimposed each apo


structure onto its corresponding holo structure. THE TRADITIONAL BLIND DOCKING AND REDOCKING PROTOCOLS The parameters of traditional blind docking were customized as described by the


protocol from Di Muzio et al. [38]. The docking center is the spatial geometric center of all the heavy atoms of the protein. To obtain the sizes of the docking box, distances between the


center and each atom along the three axes (_x, y_, and _z_) were calculated. Then, the maximum value of the distance along each dimension is doubled and adds an additional 5 Å as the size of


the docking box [38]. Redocking was performed with known binding sites. The docking parameters were customized by following the method from Wei and Michal [39]. In general, the search box


size is equal to 2.857 times the radius of gyration of the ligand, which consistently obtains the highest prediction accuracy when using AutoDock Vina [39]. RESULTS DETECTING CAVITIES ON


PROTEINS Most small-molecule binding occurs in protein pockets or cavities because high affinity can only be gained by sufficiently large interaction interfaces [40]. CB-Dock searches for


concave surfaces to detect cavities. Briefly, CB-Dock generates a set of points to represent the solvent-accessible surface and calculates the curvature factor of each point using the method


from our previous work [41, 42]. These points at the concave surface (curvature factor > 8) are clustered by a density-peak-based clustering algorithm [43]. Thus, we obtained several


clusters of points that represent cavities on the protein surface. We present the example of aminopeptidase (PDB ID: 1TXR), whose cavities are highlighted in Fig. 1a. The cavities were


ranked according to their sizes. We compared our method (called CurPocket) with state-of-the-art protein–ligand binding site prediction methods using the benchmark set of COACH [23], which


is one of the best prediction methods. The results showed that our method is comparable to that of COACH in terms of Matthews correlation coefficient, precision, and recall (see


Supplementary Table S1). Unlike traditional binding site prediction methods, our method detected the real binding cavities as much as possible to offer options for blind docking. To


investigate its performance in detecting real binding cavities, we submitted 1684 structures from PDBbind to CurPocket (see the Materials and methods section) and examined their success


rates by comparing the top 10 cavities with the real binding cavities from the crystal structures. Test results showed that the predicted success rates [44] of the top 1 to 10 cavities


increased from 63.7% to 92.4%, respectively (Fig. 1b). From the top 10 to top 5, the success rate dropped only 2%. To balance the computational expense and cavity detection accuracy, we


selected the top 5 cavities as candidates for focused docking. CALCULATING CENTERS AND SIZES OF DOCKING BOXES For a putative cavity, CB-Dock needs to customize a docking box for the


following computation. A good docking box should enclose the native binding pose and exclude as many as possible irrelevant poses. The center and size of the docking box are the key


parameters in this process. The center of the ligand from the crystal structure is the best choice for the docking box; however, we can base these parameters only on the putative cavity and


unbound ligands to estimate the center and size. Hence, we first selected the center of the putative cavity, i.e., the center of points at the concave surface, as the docking center. To


quantify its deviation from the best center, we calculated distances between the two centers using the PDBbind data set (see the Materials and methods section). The distances between centers


of real and putative target cavities were distributed from 1 to 10 Å (Fig. 2a). For most of the data (76.6%), the distances were within 5 Å and up to 97.7% when distances were within 10 Å.


The result indicated that for the majority of the data, the center of the cavity was close to the ideal center. Second, we needed to determine the lengths of the docking box in each


dimension, which was related to the size of the cavity, the size of the ligand and the deviation of the putative center from the ideal center. After systematical examination of the outcome


from docking, we finally calculated the _i_ axis length _L__i_ of the docking box by a constant _x_ plus the maximum of the length _C__i_ of the putative cavity or gyration radius _R_ of the


given ligand as follows: $$L_i = x + \max \left( {R,C_i} \right)$$ The constant _x_ is used to compensate for the deviation of the putative center and to ensure that the ligand is enclosed


in the docking box. To determine _x_, we tested the above protein–ligand structures to investigate the proportion of docking boxes that enclosed the ligands by gradually increasing _x_ from


0 to 12 Å (Fig. 2b). The results showed that the proportion grows rapidly when _x_ increases from 0 to 5 Å. When _x_ is 10 Å, all the ligands are enclosed in the docking box. Thus, we choose


_x_ = 10 Å in our program. Detailed analysis shows that the sizes of the docking box by the above formula were mostly less than 30 Å, which was within the recommended upper limit


(http://vina.scripps.edu/manual.html#faq). THE GUIDANCE OF CAVITY DETECTION IMPROVED BLIND DOCKING To assess the performance of CB-Dock, we compared it with traditional blind docking using a


protein–ligand complex from Astex Diverse Set [35]. The docking parameters of traditional blind docking are described in section ‘The traditional blind docking and redocking protocols’. In


addition, to determine the upper limit of this blind docking, we also tested redocking the centers and sizes of docking boxes that were obtained from crystal structures [39]. We measured the


accuracy by RMSD between the predicted binding mode with the lowest docking score and the native mode in the crystal structures. The performances of these methods were quantified by the


percentage of correct predictions (RMSD < 2 Å) (Fig. 3a). The results show that for traditional blind docking, redocking, and CB-Dock, the prediction accuracies were 38.8%, 76.5%, and


69.4%, respectively. As we expected, CB-Dock had significant improvements (~30% higher) over traditional blind docking, and the overall accuracy was much closer to redocking and the upper


limit of docking using Autodock Vina. Particularly, when the prediction was correct, CB-Dock and redocking had nearly identical RMSD values (Fig. 3b). This result implies that the cavity


detection and docking parameters of CB-Dock work rather well. As AutoDock Vina is based on a random algorithm, whose results may be different from the repeat runs, we repeated the test for 3


rounds to investigate the stability of the three methods. The results showed that the RMSD variations of CB-Dock and redocking were less than 5%, while it was up to 10% for traditional


blind docking. We argued that CB-Dock appropriately decreased the sampling space and thereby reduced the randomness of the results. In all, cavity detection is a powerful approach to improve


blind docking. COMPARISON OF CB-DOCK WITH EXISTING BLIND DOCKING TOOLS To gain an overall performance of CB-Dock, we further compared it with four state-of-the-art docking tools, including


DockingApp [38], MTiAutoDock [37], rDock [45], and SwissDock [30, 31]. Though the tools provide multiple usages, we focused on their performance of blind docking. DockingApp searches for


binding sites over the whole protein surface by AutoDock Vina [33]. MTiAutoDock uses the same strategy but is powered by AutoDock 4.2.6 [5]. rDock and SwissDock perform docking in the


vicinity of predicted cavities. Unlike curvature-based cavity detection in CB-Dock, rDock uses a two-probe sphere method [45], and SwissDock employs a variant of the grid-based LIGSITE


algorithm [46] to identify cavities. In general, DockingApp and MTiAutoDock follow the traditional strategy, while rDock, SwissDock, and CB-Dock only allow docking in the putative binding


regions. We conducted the benchmarks on the Astex Diverse Set and MTiAutoDock data (see the Materials and methods section). In the first dataset, DockingApp, MTiAutoDock, rDock, SwissDock


(accurate mode) and CB-Dock achieved 42.4%, 42.4%, 41.2%, 53.0%, and 69.4% success rates of top-ranking poses within the RMSD of 2 Å from crystal structures, respectively (Fig. 4a). In the


second set, the five tools achieved 33.3%, 51.9%, 33.3%, 70.4% and 74.1% success rates, respectively (Fig. 4b). Both benchmarks illustrated that, in terms of success rates for top-ranking


poses, CB-Dock outperformed other blind docking tools. As blind docking strongly depends on the accuracy of predicted binding sites, we further compared the average percentage of correctly


predicted binding sites [44]. The results showed that the accuracies were 70.6%, 67.1%, 71.8%, 78.3%, and 88.2% for DockingApp, MTiAutoDock, rDock, SwissDock (accurate mode) and CB-Dock,


respectively, on the Astex Diverse Set data (Fig. 4c) and were 70.4%, 70.4%, 77.8%, 88.9%, and 100%, respectively, on the MTiAutoDock data (Fig. 4d). These results exhibited good


correlations with the above success rates of predicting the binding sites and indicated the significance of the binding site prediction in CB-Dock. The above tests benchmarked blind docking


on the ligand-bound states (holo) of receptors from the protein–ligand complex structures. The blind docking in unbound (apo) structures is much more challenging as the conformational


changes of proteins are difficult to predict. We performed blind docking using the 19 apo crystal structures available in Astex Diverse Set (see Apo Structure Set in the Materials and


methods section). The results showed that the average percentages of correctly predicted binding sites [44] of the top-one predictions are 47.4%, 36.8%, 47.4%, 31.6%, and 68.4% for


DockingApp, MTiAutoDock, rDock, SwissDock (accurate mode), and CB-Dock, respectively. The RMSDs exhibited a similar trend. The success rates of top-ranking sites within the RMSD of 5 Å are


36.8%, 31.6%, 42.1%, 26.3%, and 63.2%, respectively (see Table 1). CB-Dock achieved the highest accuracy in the Apo Structure Set. However, the success rate was notably lower than that on


the holo structure set. Analysis showed that the conformational differences between apo and holo structures may result in two types of inaccurate docking. One type is that CB-Dock identifies


accurate cavities for docking; however, the detailed conformation of cavities was different between apo and holo structures. If the differences were critical for binding, docking may not be


accurate because CB-Dock does not model the conformational changes between apo and holo structures. An example of this type is the PDB structure 1L2S (see Fig. S1a and S1b). The side chain


of Ser64 at the protein–ligand interface was turned 44.5° from the apo structure (PDB ID: 2BLS) to the holo structure (PDB ID: 1L2S) to avoid atomic clashes. This difference misled docking


on the apo structure. The other type of inaccurate docking was that the top five cavities of the apo structure do not include the real binding cavity. An example of this type is the PDB


structure 1YVF (see Fig. S1c and S1d). The real binding cavity was ranked in the top five cavities on the holo structure (PDB ID: 1YVF), while it was too small to rank in the top five


cavities on apo structure (PDB ID: 2GIR). Hence, the docking has a very large RMSD. Computational speed is another critical feature of docking in high-throughput virtual screening. Because


only DockingApp, rDock, and CB-Dock provided a stand-alone version, the time consumption was analyzed for the three blind docking tools. The results showed that the average running times of


DockingApp, rDock, and CB-Dock on Astex Diverse Set were 44.4, 75.8, and 62.7 s per blind docking, respectively, on an AMD Ryzen1700 processor (see Table S2). Detailed data showed that the


running time of CB-Dock and DockApp did not show any correlation with the size of the protein (number of residues) but was slightly related to the flexibility of ligand (quantified by the


number of rotatable bonds) (See Fig. S2). In contrast, the time consumption of rDock had a strong relationship with the size of the protein but not the flexibility of the ligand (see Fig. 


S2). Although the precise time consumption of MTiAutoDock and SwissDock was not available, based on our tests, their online usages took over 10 min on average to return a docking result.


Taken together, we argue that CB-Dock serves as a relatively rapid blind docking tool. In particular, the protein-size-independent feature of CB-Dock is suitable for docking-based inverse


virtual screening. CB-DOCK WEB SERVER To facilitate the use of CB-Dock, we constructed a web server at http://cao.labshare.cn/cb-dock/, which only requires the input of a protein file to be


in the PDB format and a ligand file in the MOL2, MOL, or SDF. After submission, CB-Dock checks the input files and converts them to pdbqt formatted files using OpenBabel [47] and MGLTools


[5]. Next, CB-Dock predicts cavities of the protein and calculates the centers and sizes of the top N (_n_ = 5 by default) cavities. Each center and size, as well as the pdbqt files, are


submitted to AutoDock Vina for docking. The final results are displayed after the computation of N rounds. Users can browse binding scores, cavity sizes, and docking parameters of the


predicted binding modes in a table. Moreover, users can inspect the 3D structures of any binding modes on the web page by clicking the structures in the related table. The interactive 3D


structures are drawn by NGL Viewer [48], which is supported by most modern browsers. Users are able to display atom-specific information, rotate and translate molecules, select models and


colors. For more details, users could refer to the manual on the CB-Dock homepage. Here, we present a case study of the software CB-Dock (Fig. 5). Nultin3a, a potential anti-cancer drug, is


able to bind with the E3 ubiquitin-protein ligase MDM2 and inhibit the MDM2–P53 interaction. The MDM2 protein structure was downloaded (PDB ID: 4HG7) from PDB. The Nutlin-3a mol2 file was


generated by the PRODRG software [49]. The two files were uploaded and submitted to the CB-Dock server by clicking the button “Submit”. While processing docking, a progress bar appeared to


indicate the status of the job. When the processing was complete (after approximately 2 min), the web page was updated with the results. The table listed Vina scores, cavity sizes, docking


centers, and sizes of predicted cavities. Once a ligand in the table is selected, the structure in the interactive 3D graphics is visualized. In our example, the top binding mode with a Vina


score of −8.4 also had the largest binding cavity. The binding mode was almost identical to the mode of ligand in the crystal structure (RMSD = 0.484 Å). DISCUSSION Discovering


protein–ligand binding sites and conformations are particularly important in drug discovery. Blind docking is a powerful method for obtaining that information. Blind docking is also one of


the key components in high-throughput screening and inverse docking [50,51,52,53]. Therefore, it is of great value to develop accurate blind docking tools. Thanks to the well-established


AutoDock Vina docking software, we focused on developing methods of cavity detection and docking parameter optimization, which are critical for blind docking. CB-Dock is the first cavity


detection-guided blind docking tool designed with AutoDock Vina among many popular Vina-based tools (http://vina.scripps.edu/manual.html#faq). The benchmark tests show that CB-Dock


outperforms other state-of-the-art blind docking tools in terms of predicting binding sites and binding conformations. This performance is attributed to the curvature-based cavity detection


that precisely narrows down the docking space as well as the optimized parameters for AutoDock Vina. Some shortcomings of CB-Dock were also observed in the test. First, compared to regular


docking, CB-Dock was more time expensive because the docking was performed iteratively in five cavities. To reduce time consumption, cavity detection should be further improved in the


future. Second, if the size of cavities was notably greater than that of the ligand, the accuracy of docking tends to decrease. A typical example is the huge cavity detected on nitric-oxide


synthase (PDB ID: 1MMV), in which the predicted docking position is at the opposite side of the cavity (see Fig. S3). This result is mainly related to the accuracy of the scoring function,


which is supposed to distinguish the global minimum from local minimums. Using an additional scoring function to rerank binding positions could be a solution to this problem. Third, CB-Dock


needs to improve the accuracy of docking in apo structures. Compared to holo structures, apo structures show conformational rearrangement in ligand binding sites, which has not been captured


in current CB-Dock software. In the following developments of CB-Dock, the protein conformation sampling method will be incorporated in CB-Dock to enhance docking in apo structures. Apart


from blind docking capabilities, user-friendly interfaces are also very important for docking tools. CB-Dock offers a convenient web service that allows even nonexpert users to perform


protein–ligand docking and visualize results in 3D. We believe that CB-Dock can contribute to the characterization of newly determined protein structures and suggest novel therapeutic


targets for biological and pharmaceutical studies. REFERENCES * Pagadala NS, Syed K, Tuszynski J. Software for molecular docking: a review. Biophys Rev. 2017;9:91–102. CAS  PubMed  PubMed


Central  Google Scholar  * Yuriev E, Holien J, Ramsland PA. Improvements, trends, and new ideas in molecular docking: 2012–2013 in review. J Mol Recognit. 2015;28:581–604. CAS  PubMed 


Google Scholar  * Meiler J, Baker D. ROSETTALIGAND: protein-small molecule docking with full side-chain flexibility. Proteins. 2006;65:538–48. CAS  PubMed  Google Scholar  * Marialke J,


Tietze S, Apostolakis J. Similarity based docking. J Chem Inf Model. 2008;48:186–96. CAS  PubMed  Google Scholar  * Morris G, Huey R. AutoDock4 and AutoDockTools4: automated docking with


selective receptor flexibility. J Comput Chem. 2010;30:2785–91. Google Scholar  * Bolia A, Ozkan SB. Adaptive BP-Dock: an induced fit docking approach for full receptor flexibility. J Chem


Inf Model. 2016;56:734–46. CAS  PubMed  Google Scholar  * Allen WJ, Balius TE, Mukherjee S, Brozell SR, Moustakas DT, Lang PT, et al. DOCK 6: impact of new features and current docking


performance. J Comput Chem. 2015;36:1132–56. CAS  PubMed  PubMed Central  Google Scholar  * Liu Z, Su M, Han L, Liu J, Yang Q, Li Y, et al. Forging the basis for developing protein-ligand


interaction scoring functions. Acc Chem Res. 2017;50:302–9. CAS  PubMed  Google Scholar  * Lam PCH, Abagyan R, Totrov M. Ligand-biased ensemble receptor docking (LigBEnD): a hybrid


ligand/receptor structure-based approach. J Comput Aided Mol Des. 2018;32:187–98. CAS  PubMed  Google Scholar  * Padhorny D, Hall DR, Mirzaei H, Mamonov AB, Moghadasi M, Alekseenko A, et al.


Protein–ligand docking using FFT based sampling: D3R case study. J Comput Aided Mol Des. 2018;32:225–30. CAS  PubMed  Google Scholar  * Jones G, Willett P, Glen RC, Leach AR, Taylor R.


Development and validation of a genetic algorithm for flexible docking. J Mol Biol. 1997;267:0–748. CAS  Google Scholar  * Verdonk ML, Cole JC, Hartshorn MJ, Murray CW, Taylor RD. Improved


protein-ligand docking using GOLD. Proteins. 2003;52:609–23. CAS  PubMed  Google Scholar  * Hetényi C, Van Der Spoel D. Blind docking of drug-sized compounds to proteins with up to a


thousand residues. FEBS Lett. 2006;580:0–1450. Google Scholar  * Hetényi C, van der Spoel D. Efficient docking of peptides to proteins without prior knowledge of the binding site. Protein


Sci. 2002;11:1729–37. PubMed  PubMed Central  Google Scholar  * Hassan NM, Alhossary AA, Mu Y, Kwoh CK. Protein-ligand blind docking using QuickVina-W with inter-process spatio-temporal


integration. Sci Rep 2017;7:15451. * Sánchez-Linares I, Pérez-Sánchez H, Cecilia JM, García JM. High-throughput parallel blind virtual screening using BINDSURF. BMC Bioinformatics


2012;13(Suppl 14):S13. PubMed  PubMed Central  Google Scholar  * Iorga B, Herlem D, Barré E, Guillou C. Acetylcholine nicotinic receptors: finding the putative binding site of allosteric


modulators using the “blind docking” approach. J Mol Model. 2006;12:366–72. CAS  PubMed  Google Scholar  * Ghersi D, Sanchez R. Improving accuracy and efficiency of blind protein-ligand


docking by focusing on predicted binding sites. Proteins. 2009;74:417–24. CAS  Google Scholar  * Dai W, Wu A, Ma L, Li YX, Jiang T, Li YY. A novel index of protein-protein interface


propensity improves interface residue recognition. BMC Syst Biol. 2016;10:381–92. Google Scholar  * Shin WH, Seok C. GalaxyDock: Protein-ligand docking with flexible protein side-chains. J


Chem Inf Model. 2012;52:3225–32. CAS  PubMed  Google Scholar  * Capra JA, Laskowski RA, Thornton JM, Singh M, Funkhouser TA. Predicting protein ligand binding sites by combining evolutionary


sequence conservation and 3D structure. PLoS Comput Biol. 2009. https://doi.org/10.1371/journal.pcbi.1000585. PubMed  PubMed Central  Google Scholar  * Xu Y, Wang S, Hu Q, Gao S, Ma X,


Zhang W, et al. CavityPlus: a web server for protein cavity detection with pharmacophore modelling, allosteric site identification and covalent ligand binding ability prediction. Nucleic


Acids Res. 2018;46:W374–W379. CAS  PubMed  PubMed Central  Google Scholar  * Yang J, Roy A, Zhang Y. Protein-ligand binding site recognition using complementary binding-specific substructure


comparison and sequence profile alignment. Bioinformatics. 2013;29:2588–95. CAS  PubMed  PubMed Central  Google Scholar  * Levitt DG, Banaszak LJ. POCKET: A computer graphies method for


identifying and displaying protein cavities and their surrounding amino acids. J Mol Graph. 1992;10:229. CAS  PubMed  Google Scholar  * Laskowski RA. SURFNET: A program for visualizing


molecular surfaces, cavities, and intermolecular interactions. J Mol Graph. 1995;13:323–30. CAS  PubMed  Google Scholar  * Brylinski M, Skolnick J. A threading-based method (FINDSITE) for


ligand-binding site prediction and functional annotation. Proc Natl Acad Sci USA. 2008;105:129–34. CAS  PubMed  Google Scholar  * Venkatachalam CM, Jiang X, Oldfield T, Waldman M. LigandFit:


a novel method for the shape-directed rapid docking of ligands to protein active sites. J Mol Graph Model. 2003;21:289–307. CAS  PubMed  Google Scholar  * Brylinski M, Feinstein WP.


EFindSite: improved prediction of ligand binding sites in protein models using meta-threading, machine learning and auxiliary ligands. J Comput Aided Mol Des. 2013;27:551–67. CAS  PubMed 


Google Scholar  * Wu Qi, Peng Zhenling, Yang Zhang JY. COACH-D: improved protein–ligand binding sites prediction with refined ligand-binding poses through molecular docking. Nucleic Acids


Res. 2018;46:313–38. Google Scholar  * Grosdidier A, Zoete V, Michielin O. Blind docking of 260 protein-ligand complexes with eadock 2.0. J Comput Chem. 2010;30:2021–30. Google Scholar  *


Grosdidier A, Zoete V, Michielin O. SwissDock, a protein-small molecule docking web service based on EADock DSS. Nucleic Acids Res. 2011;39:270–7. Google Scholar  * Lee HS, Zhang Y.


BSP-SLIM: a blind low-resolution ligand-protein docking approach using predicted protein structures. Proteins. 2012;80:93–110. CAS  Google Scholar  * Trott O, Olson AJ. Software news and


update AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem. 2009;31:455–61. Google Scholar  *


Liu Z, Li Y, Han L, Li J, Liu J, Zhao Z, et al. PDB-wide collection of binding data: current status of the PDBbind database. Bioinformatics. 2015;31:405–12. CAS  PubMed  Google Scholar  *


Hartshorn MJ, Verdonk ML, Chessari G, Brewerton SC, Mooij WTM, Mortenson PN, et al. Diverse, high-quality test set for the validation of protein-ligand docking performance. J Med Chem.


2007;50:726–41. CAS  PubMed  Google Scholar  * Burley SK, Berman HM, Christie C, Duarte JM, Feng Z, Westbrook J, et al. RCSB Protein Data Bank: Sustaining a living digital data resource that


enables breakthroughs in scientific research and biomedical education. Protein Sci. 2018;27:316–30. CAS  PubMed  Google Scholar  * Labbé CM, Rey J, Lagorce D, Vavruša M, Becot J, Sperandio


O, et al. MTiOpenScreen: A web server for structure-based virtual screening. Nucleic Acids Res. 2015;43:448–54. Google Scholar  * Di Muzio E, Toti D, Polticelli F. DockingApp: a user


friendly interface for facilitated docking simulations with AutoDock Vina. J Comput Aided Mol Des. 2017;31:213–8. PubMed  Google Scholar  * Feinstein WP, Brylinski M. Calculating an optimal


box size for ligand docking and virtual screening against experimental and predicted binding pockets. J Cheminform. 2015;7:1–10. CAS  Google Scholar  * Sotriffer C, Klebe G. Identification


and mapping of small-molecule binding sites in proteins: Computational tools for structure-based drug design. Farmaco. 2002;3:243–51. Google Scholar  * Cao Y, Li L. Improved protein-ligand


binding affinity prediction by using a curvature-dependent surface-area model. Bioinformatics. 2014;30:1674–80. CAS  PubMed  Google Scholar  * Cao Yang, Wentao Dai ZM. Evaluation of


protein–ligand docking by cyscore. Comput Drug Discov Des. 2018;1762:223–32. Google Scholar  * Rodriguez A, Laio A, Xu R, Wunsch D, Frey BJ, Dueck D. et al.Machine learning. Clustering by


fast search and find of density peaks. Science. 2014;344:1492–6. CAS  PubMed  Google Scholar  * Schmidt T, Haas J, Gallo Cassarino T, Schwede T. Assessment of ligand-binding residue


predictions in CASP9. Proteins. 2011;79:126–36. CAS  Google Scholar  * Ruiz-Carmona S, Alvarez-Garcia D, Foloppe N, et al. rDock: a fast, versatile and open source program for docking


ligands to proteins and nucleic acids. PLoS Comput Biol. 2014;10:e1003571 https://doi.org/10.1371/journal.pcbi.1003571. Article  CAS  PubMed  PubMed Central  Google Scholar  * Hendlich M,


Rippmann F, Barnickel G. LIGSITE: Automatic and efficient detection of potential small molecule-binding sites in proteins. J Mol Graph Model. 1997;15:359–63. CAS  PubMed  Google Scholar  *


O’Boyle NM, Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR. Open Babel: An Open chemical toolbox. J Cheminform. 2011;3:33. PubMed  PubMed Central  Google Scholar  * Rose AS,


Bradley AR, Valasatava Y, Jose M, Prli A, Rose PW. NGL Viewer : Web-based molecular graphics for large complexes. Bioinformatics. 2018;34:3755–8. CAS  PubMed  PubMed Central  Google Scholar


  * Schüttelkopf AW, Van Aalten DMF. PRODRG: A tool for high-throughput crystallography of protein-ligand complexes. Acta Crystallogr Sect D Biol Crystallogr. 2004;60:1355–63. Google Scholar


  * Sánchez-Linares I, Pérez-Sánchez H, Cecilia JM, García JM. High-Throughput parallel blind Virtual Screening using BINDSURF. BMC Bioinformatics. 2012;13:S13


https://doi.org/10.1186/1471-2105-13-S14-S13. Article  CAS  PubMed  PubMed Central  Google Scholar  * Pérot S, Sperandio O, Miteva MA, Camproux AC, Villoutreix BO. Druggable pockets and


binding site centric chemical space: A paradigm shift in drug discovery. Drug Discov Today. 2010;15:656–67. PubMed  Google Scholar  * Schwardt O, Cutting B, Kolb H, Ernst B. Drug discovery


today. Front Med Chem. 2005;3:1–9. Google Scholar  * Kharkar PS, Warrier S, Gaud RS. Reverse docking: A powerful tool for drug repositioning and drug rescue. Future Med Chem. 2014;6:333–42.


CAS  PubMed  Google Scholar  Download references ACKNOWLEDGEMENTS The authors thank Professor Jian-yi Yang of Nankai University for helping in running COACH-D and Dr. Holger Stitz of


Johannes Kepler University Linz for his invaluable editing of the manuscript. We also thank Professor Yang Zhang and Cheng-xin Zhang of the University of Michigan, Professor Xiang-jun Du of


Sun Yat-sen University and Dr. Zhi-chao Miao of Cambridge University for invaluable discussions. This work was supported by the National Natural Science Foundation of China (Grant numbers


31401130, 81830108, and 81672736), the National Key R&D Program of China (2018YFC0910500), the Shanghai Sailing Program (16YF1408600), the funding for prevention and control technology


of African swine fever (2018NZ0151) and the Shanghai Industrial Technology Institute (17CXXF008). AUTHOR INFORMATION Author notes * These authors contributed equally: Yang Liu, Maximilian


Grimm. AUTHORS AND AFFILIATIONS * Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan


University, Chengdu 610065, China Yang Liu, Maximilian Grimm, Mu-chun Hou, Zhi-Xiong Xiao & Yang Cao * Shanghai Center for Bioinformation Technology & Shanghai Engineering Research


Center of Pharmaceutical Translation, Shanghai Industrial Technology Institute, Shanghai 201203, China Wen-tao Dai Authors * Yang Liu View author publications You can also search for this


author inPubMed Google Scholar * Maximilian Grimm View author publications You can also search for this author inPubMed Google Scholar * Wen-tao Dai View author publications You can also


search for this author inPubMed Google Scholar * Mu-chun Hou View author publications You can also search for this author inPubMed Google Scholar * Zhi-Xiong Xiao View author publications


You can also search for this author inPubMed Google Scholar * Yang Cao View author publications You can also search for this author inPubMed Google Scholar CONTRIBUTIONS YL designed and


optimized the CB-Dock tool and wrote the manuscript. MG built the CB-Dock web server. WTD benchmarked the program. MCH tested the server. ZXX guided the experiments. YC designed the project


and wrote the manuscript. CORRESPONDING AUTHOR Correspondence to Yang Cao. ETHICS DECLARATIONS COMPETING INTERESTS The authors declare that they have no conflict of interest. ADDITIONAL


INFORMATION PUBLISHER’S NOTE: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. SUPPLEMENTARY INFORMATION SUPPLEMENTARY


FIGURE 1 SUPPLEMENTARY FIGURE 2 SUPPLEMENTARY FIGURE 3 SUPPLEMENTARY TABLE RIGHTS AND PERMISSIONS Reprints and permissions ABOUT THIS ARTICLE CITE THIS ARTICLE Liu, Y., Grimm, M., Dai, Wt.


_et al._ CB-Dock: a web server for cavity detection-guided protein–ligand blind docking. _Acta Pharmacol Sin_ 41, 138–144 (2020). https://doi.org/10.1038/s41401-019-0228-6 Download citation


* Received: 05 December 2018 * Accepted: 14 March 2019 * Published: 01 July 2019 * Issue Date: January 2020 * DOI: https://doi.org/10.1038/s41401-019-0228-6 SHARE THIS ARTICLE Anyone you


share the following link with will be able to read this content: Get shareable link Sorry, a shareable link is not currently available for this article. Copy to clipboard Provided by the


Springer Nature SharedIt content-sharing initiative KEYWORDS * bioinformatics * computer-aided design * computer-aided drug discovery