High-Quality Dataset of Protein-Bound Ligand Conformations and Its Application to Benchmarking Conformer Ensemble Generators
2017
We developed a cheminformatics pipeline for the fully automated selection and extraction of high-quality protein-bound ligand conformations from X-ray structural data. The pipeline evaluates the validity and accuracy of the 3D structures of small molecules according to multiple criteria, including their fit to the electron density and their physicochemical and structural properties. Using this approach, we compiled two high-quality datasets from the Protein Data Bank (PDB): a comprehensive dataset and a diversified subset of 4626 and 2912 structures, respectively. The datasets were applied to benchmarking seven freely available conformer ensemble generators: Balloon (two different algorithms), the RDKit standard conformer ensemble generator, the Experimental-Torsion basic Knowledge Distance Geometry (ETKDG) algorithm, Confab, Frog2 and Multiconf-DOCK. Substantial differences in the performance of the individual algorithms were observed, with RDKit and ETKDG generally achieving a favorable balance of accur...
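
To illustrate what benchmarking a conformer ensemble against a protein-bound conformation can look like, here is a minimal sketch that scores each generated conformer against the experimental reference and reports the closest one. It is not the procedure used in the study: the abstract does not state the accuracy measure, so the sketch assumes a "closest conformer in the ensemble" criterion and substitutes a distance-matrix RMSD (dRMSD) for a superposition-based RMSD, because dRMSD needs no alignment step and runs with the Python standard library alone. The function names `drmsd` and `best_match` and the toy coordinates are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt


def drmsd(coords_a, coords_b):
    """Distance-matrix RMSD between two conformations with identical atom order.

    Compares all intramolecular atom-pair distances, so the result does not
    depend on how either conformation is rotated or translated. Note that
    dRMSD cannot tell a conformation from its mirror image.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("conformations must have the same number of atoms")
    pairs = list(combinations(range(len(coords_a)), 2))
    if not pairs:
        raise ValueError("at least two atoms are required")
    squared = sum(
        (dist(coords_a[i], coords_a[j]) - dist(coords_b[i], coords_b[j])) ** 2
        for i, j in pairs
    )
    return sqrt(squared / len(pairs))


def best_match(reference, ensemble):
    """Return (index, dRMSD) of the ensemble member closest to the reference."""
    scores = [drmsd(reference, conformer) for conformer in ensemble]
    index = min(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]


# Toy data (assumed, not from the paper): a four-atom "ligand".
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0), (3.0, 1.5, 0.5)]
# Same geometry, shifted in space: should score as a perfect match.
translated = [(x + 1.0, y - 2.0, z + 3.0) for x, y, z in reference]
# Last atom moved to a different position: a genuinely different conformer.
distorted = reference[:3] + [(1.5, 3.0, 0.0)]

index, score = best_match(reference, [distorted, translated])
print(f"closest conformer: {index}, dRMSD = {score:.3f}")
```

In a real benchmark the coordinates would come from the curated PDB ligands and from each generator's output, heavy atoms would be matched with symmetry taken into account, and the per-ligand minimum would be aggregated over the whole dataset.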