____________________________________________________________

Please cite Xavier et al. (2016) when using SAnDReS.

Welcome to SAnDReS!            

SAnDReS is a free and open-source (GNU General Public License) computational environment for the development of machine-learning models for prediction of ligand-binding affinity. SAnDReS is also a tool for statistical analysis of docking simulations and evaluation of the predictive performance of computational models developed to calculate binding affinity. 

Stairway to SAnDReS (Windows) 

You need Python 3 installed on your computer to run SAnDReS. You also need NumPy, Matplotlib, scikit-learn, and SciPy. Installing Pyzo makes the installation process easier. While installing them, let the music play...

Step 1. Install the Pyzo IDE (download here)

Step 2. Install Python environment (download here)

Step 3. Install Scientific packages needed to run SAnDReS.

To start the Pyzo IEP, go to the c:\Program Files (x86)\pyzo directory and double-click on pyzo. In the Pyzo shell (IEP), type the following commands:

    conda install numpy

    conda install scipy pyqt matplotlib 

    conda install scikit-learn

Step 4. Download SAnDReS 1.0.2 (download here)

Step 5. Unzip the zipped file (sandres.zip) 

Step 6. Copy the sandres directory to c:\.

Step 7. Open a command prompt window and type: cd c:\sandres

then type: python sandres1_GUI.py

This launches the SAnDReS GUI window. That's it; have a good SAnDReS session. See the tutorial page for additional information about how to run SAnDReS. You can also start SAnDReS by double-clicking the sandres.bat file, or create a shortcut to SAnDReS by right-clicking the sandres.bat file. More details here.
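If the GUI does not start, one thing worth checking is whether the packages from Step 3 are visible to the Python interpreter. The short script below is only a sketch of such a check and is not part of SAnDReS; the function name missing_packages is made up for this example. It relies on the standard-library importlib module and on the usual import names of the packages (scikit-learn is imported as sklearn).

```python
import importlib.util

# Import names of the packages listed in the installation steps.
# Note that scikit-learn is imported under the name "sklearn".
REQUIRED = ["numpy", "scipy", "matplotlib", "sklearn"]


def missing_packages(names):
    """Return the names that cannot be found in the current Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it with the same Python interpreter that you use to start sandres1_GUI.py, since packages installed in a different environment will not be found.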

SAnDReS Programs 

SAnDReS is composed of three programs: the SAnDReS GUI, the main program, and scikit_regression_methods_v1.py. The SAnDReS GUI is a front end that calls the SAnDReS main program, which carries out most of the computing work. The GUI takes care of preparing the input files that are executed by the main program. The third program is an implementation of supervised machine-learning methods. SAnDReS was developed so that you can perform all its tasks from the GUI window, but you can also edit the input file and run it yourself. SAnDReS allows you to generate machine-learning models tailored to the biological system of interest, and it can produce high-quality graphs for publications or presentations.

SAnDReS was designed to analyze data from any protein-ligand docking program. The only requirements are protein structures in Protein Data Bank (PDB) format, ligands in Structure Data Format (SDF), and docking and scoring function data in comma-separated values (CSV) format.
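To illustrate the CSV part of these requirements, here is a minimal sketch that reads a small table of docking scores with Python's standard csv module. The column names (entry_id, ligand, docking_score, rmsd) and all values are hypothetical placeholders chosen for this example; they are not the layout that SAnDReS itself expects, which is described in the tutorial.

```python
import csv
import io

# Hypothetical CSV content. Column names and values are illustrative only
# and do not reproduce the file layout used by SAnDReS.
EXAMPLE_CSV = """entry_id,ligand,docking_score,rmsd
complex_A,LIG1,-7.5,1.2
complex_B,LIG2,-6.1,2.8
"""


def read_scores(handle):
    """Parse docking records from an open CSV file, converting numbers to float."""
    rows = []
    for record in csv.DictReader(handle):
        rows.append({
            "entry_id": record["entry_id"],
            "ligand": record["ligand"],
            "docking_score": float(record["docking_score"]),
            "rmsd": float(record["rmsd"]),
        })
    return rows


# io.StringIO stands in for a real file opened with open(path, newline="").
rows = read_scores(io.StringIO(EXAMPLE_CSV))
for row in rows:
    print(row["entry_id"], row["ligand"], row["docking_score"], row["rmsd"])
```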

Docking Hub   

The most recent version of SAnDReS (1.0.2) can natively run AutoDock 4 (Morris et al., 1998), AutoDock Vina (Trott & Olson, 2010), and Molegro Virtual Docker (MVD) (Thomsen & Christensen, 2006). MVD must already be installed on your computer. SAnDReS automatically creates the inputs needed to run these docking programs. In addition, SAnDReS can automatically read docking results generated by GemDock 2.1 (Yang & Chen, 2004) and SwissDock (Grosdidier et al., 2011).

Machine Learning Box  

SAnDReS makes use of supervised machine-learning techniques to generate polynomial equations to predict ligand-binding affinity. SAnDReS has a flexible interface (Machine Learning Box) that allows you to test the predictive power of regression models generated by machine-learning techniques such as LinearRegression, Lasso, Ridge, ElasticNet, Stochastic Gradient Descent Regressor (SGDRegressor), and Support Vector Regression.

All these methods are available in the scikit-learn library (Pedregosa et al., 2011) and are implemented as an intuitive workflow in SAnDReS. This approach makes it possible to use the terms of scoring functions implemented in programs such as AutoDock 4 (Morris et al., 1998), AutoDock Vina (Trott & Olson, 2010), GemDock 2.1 (Yang & Chen, 2004), Molegro Virtual Docker (Thomsen & Christensen, 2006), and SwissDock (Grosdidier et al., 2011) to develop new scoring functions tailored to the biological system being analyzed. SAnDReS is under continuing development, and we intend to add new functions to run other docking programs.
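The sketch below shows, outside of SAnDReS, how the scikit-learn regressors named above can be compared on the same data. It is not the code in scikit_regression_methods_v1.py. The data are synthetic: four made-up "scoring-function terms" and a made-up affinity built from them plus noise. The hyperparameter values are arbitrary, and the models here use only linear terms, without polynomial features.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge, SGDRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (NOT real docking results): 120 complexes,
# 4 invented energy terms, and an invented affinity with Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 4))
y = X @ np.array([1.5, -0.8, 0.3, 0.0]) + 6.0 + rng.normal(scale=0.2, size=120)

# Hold out 25% of the complexes to estimate predictive performance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hyperparameters are arbitrary choices for this illustration.
models = {
    "LinearRegression": LinearRegression(),
    "Lasso": Lasso(alpha=0.01),
    "Ridge": Ridge(alpha=1.0),
    "ElasticNet": ElasticNet(alpha=0.01, l1_ratio=0.5),
    "SGDRegressor": SGDRegressor(max_iter=1000, tol=1e-3, random_state=0),
    "SVR": SVR(kernel="rbf", C=10.0),
}

scores = {}
for name, model in models.items():
    # Standardize the terms first; SGDRegressor and SVR are sensitive to scale.
    pipeline = make_pipeline(StandardScaler(), model)
    pipeline.fit(X_train, y_train)
    scores[name] = pipeline.score(X_test, y_test)  # R^2 on the held-out set

for name, r2 in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: R^2 = {r2:.3f}")
```

With real data, the columns of X would be the scoring-function terms read from the docking results and y the experimental binding affinities.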

Protein-Ligand Docking Simulations

Progress in protein-ligand docking simulations and their application to drug development has produced new potential inhibitors of target proteins, many of which have had their efficacy confirmed experimentally (Shoichet et al., 2002; Schneider & Böhm, 2002). To increase the success of a molecular docking strategy, we have to test different docking protocols and determine their performance. For instance, we can test whether a docking search algorithm is able to find the position of the ligand in the binding pocket (the re-docking process), as shown in figure 1 for the structure of cyclin-dependent kinase in complex with roscovitine (De Azevedo et al., 1997). SAnDReS can evaluate different docking protocols and indicate which one has the best overall performance.

Figure 1. Re-docking result for the structure 2A4L. The docked pose is shown in red and the crystallographic position in white. The simulation was carried out with Molegro Virtual Docker (Thomsen & Christensen, 2006).
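A common way to quantify a re-docking result is the root-mean-square deviation (RMSD) between the atoms of the docked pose and those of the crystallographic ligand. The sketch below computes it for two coordinate lists. It assumes that both lists contain the same atoms in the same order and that no superposition is applied, which is the usual setup when the protein frame is kept fixed. The coordinates are toy values, and the function is an illustration rather than SAnDReS code.

```python
import math


def rmsd(pose, reference):
    """RMSD between two equally ordered lists of (x, y, z) coordinates."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("Both coordinate lists must be non-empty and of equal length.")
    squared = sum(
        (p - r) ** 2
        for atom_pose, atom_ref in zip(pose, reference)
        for p, r in zip(atom_pose, atom_ref)
    )
    return math.sqrt(squared / len(pose))


# Toy coordinates in angstroms: the pose is the crystal ligand shifted by 1 along z.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
pose = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]

print(f"Docking RMSD: {rmsd(pose, crystal):.2f}")
```

A threshold of about 2 Å is often used in the docking literature to call a re-docking successful; treat the exact cutoff as a convention to be checked against the protocol you follow.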


Linking Crystallography and Docking

To assess the quality of a crystallographic structure, it is common to use parameters such as resolution, R-factor, R-free, and B-values, to mention a few. SAnDReS makes it possible to evaluate the correlation of over 100 structural parameters with docking results, in order to investigate which parameters may influence the docking results.
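As an illustration of this kind of analysis, the sketch below computes the Pearson correlation coefficient between one structural parameter (crystallographic resolution) and one docking result (docking RMSD). The numbers are invented for the example, and Pearson's coefficient is only one possible choice; which correlation measures SAnDReS reports should be taken from its tutorial.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("Need two sequences of equal length with at least 2 values.")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("Correlation is undefined for a constant sequence.")
    return cov / math.sqrt(var_x * var_y)


# Invented values for five hypothetical structures (both in angstroms).
resolution = [1.2, 1.6, 1.9, 2.3, 2.8]
docking_rmsd = [0.6, 0.9, 1.4, 1.8, 2.9]

r = pearson(resolution, docking_rmsd)
print(f"Pearson r between resolution and docking RMSD: {r:.3f}")
```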

Scoring Functions   

In the analysis of docking results, we apply scoring functions to rank poses and/or evaluate binding affinity. These scoring functions can be classified into three types: force-field-based, knowledge-based, and empirical scoring functions. Force-field-based scoring functions use the non-bonded terms of classical mechanics force fields. Knowledge-based scoring functions are based on statistical observations of intermolecular contacts identified in structure databases. Empirical scoring functions make use of several intermolecular interaction terms that are calibrated through a regression procedure, in which theoretical values are fitted to be as close as possible to experimental data. After the docking procedure is complete, scoring functions are employed to rank each ligand found by the docking procedure. This ranking predicts the ligand with the highest affinity. For reviews, see Azevedo et al. (2012), de Avila & de Azevedo (2014), and De Azevedo (2010).


More About SAnDReS 

SAnDReS became operational on 12 January 2016 at the Laboratory of Computational Systems Biology in Porto Alegre, RS, Brazil, as version number 1.0.1. The SAnDReS GUI is shown in figure 2. SAnDReS was developed by Dr. Walter F. de Azevedo Jr. with help from Mariana Morrone Xavier, Mauricio B. de Avila, Nayara M. Bernhardt Levin, Val de Oliveira Pintro, and Nathalia L. Carvalho (Xavier et al., 2016).

Figure 2. SAnDReS 1.0.2 GUI window.


Contact

SAnDReS is in continuous development. Feel free to download the latest version and use it in the analysis of your docking results. If you have any questions regarding SAnDReS, please feel free to e-mail me: walter@azevedolab.net

Funding

Funding Agency: Conselho Nacional de Desenvolvimento Científico e Tecnológico - National Council for Scientific and Technological Development (www.cnpq.br)
Principal Investigator: Walter F. de Azevedo Jr., Ph.D.
Process Numbers: 472590/2012-0 and 308883/2014-4.

License

SAnDReS is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

References

-Azevedo LS, Moraes FP, Xavier MM, Pantoja EO, Villavicencio B, Finck JA, Proenca AM, Rocha KB, de Azevedo WF. Recent Progress of Molecular Docking Simulations Applied to Development of Drugs. Curr Bioinform. 2012; 7(4): 352-65.

-de Avila MB, de Azevedo WF. Data Mining of Docking Results. Application to 3-Dehydroquinate Dehydratase. Curr Bioinform. 2014; 9(4): 361-79.

-De Azevedo WF, Leclerc S, Meijer L, Havlicek L, Strnad M, Kim SH. Inhibition of cyclin-dependent kinases by purine analogues: crystal structure of human cdk2 complexed with roscovitine. Eur J Biochem. 1997; 243(1-2): 518-26.

-De Azevedo WF Jr. MolDock applied to structure-based virtual screening. Curr Drug Targets. 2010; 11(3): 327-34.

-Grosdidier A, Zoete V, Michielin O. SwissDock, a protein-small molecule docking web service based on EADock DSS. Nucleic Acids Res. 2011; 39(Web Server issue): W270-7.

-Morris G, Goodsell D, Halliday R, Huey R, Hart W, Belew R, Olson A. Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. J Comput Chem. 1998; 19: 1639-62.

-Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011; 12: 2825-30.

-Schneider G, Böhm HJ. Virtual screening and fast automated docking methods. Drug Discov Today. 2002; 7(1): 64-70.

-Shoichet BK, McGovern SL, Wei B, Irwin JJ. Lead discovery using molecular docking. Curr Opin Chem Biol. 2002; 6(4): 439-46.

-Thomsen R, Christensen MH. MolDock: a new technique for high-accuracy molecular docking. J Med Chem. 2006; 49: 3315-21.

-Trott O, Olson AJ. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem. 2010; 31(2): 455-61.

-Xavier MM, Heck GS, de Avila MB, Levin NM, Pintro VO, Carvalho NL, Azevedo WF Jr. SAnDReS a Computational Tool for Statistical Analysis of Docking Results and Development of Scoring Functions. Comb Chem High Throughput Screen. 2016; 19(10): 801-12.

-Yang JM, Chen CC. GEMDOCK: a generic evolutionary method for molecular docking. Proteins. 2004; 55(2): 288-304.