
“Greatest ideas are often met with violent opposition from mediocre minds.” Albert Einstein.

Welcome to SAnDReS!        




Make molecular docking reliable, fast, easy, free, and fun with SAnDReS. It is the easiest way to run dependable protein-ligand docking simulations. SAnDReS takes a different approach to docking: it focuses on the simulation of a system composed of an ensemble of crystallographic structures for which ligand-binding affinity data are available. These experimental data are used to train a scoring function specific to the biological system of interest (an ensemble of structures with binding affinity data). In doing so, SAnDReS explores the scoring function space, selecting an adequate scoring function to predict binding affinity and to analyze docking results.


Overview

SAnDReS is a free and open-source (GNU General Public License) computational environment for the development of machine-learning models for prediction of ligand-binding affinity. SAnDReS is also a tool for statistical analysis of docking simulations and evaluation of the predictive performance of computational models developed to calculate binding affinity. SAnDReS is an acronym for Statistical Analysis of Docking Results and Scoring Functions. We have successfully employed SAnDReS to study coagulation factor Xa (Xavier et al., 2016), cyclin-dependent kinases (de Ávila et al., 2017; Levin et al., 2018), HIV-1 protease (Pintro & de Azevedo, 2017), estrogen receptor (Amaral et al., 2018), cannabinoid receptor 1 (Russo & de Azevedo, 2018), and 3-dehydroquinate dehydratase (de Ávila & de Azevedo, 2018). Also, we used SAnDReS to develop a machine-learning model to predict Gibbs free energy of binding for protein-ligand complexes (Bitencourt-Ferreira & de Azevedo Jr., 2018).


Download from GitHub  

You may also download SAnDReS code from GitHub.

 

Installing SAnDReS without Installers (Windows)  

You need to have Python 3 installed on your computer to run SAnDReS. You also need to install NumPy, Matplotlib, scikit-learn, and SciPy.

You can make the installation process easier by installing Anaconda. 

Step 1. Install Anaconda 32 bits (download here)

Step 2. Download SAnDReS 1.1.0 from GitHub (here)       

Step 3. Unzip the zipped file (sandres.zip) 

Step 4. Copy sandres directory to c:\ .

Step 5. Open a command prompt window and type: cd c:\sandres

then type: python sandres1_GUI.py

This launches the SAnDReS GUI window. That's it; enjoy your SAnDReS session. See the tutorial page for additional information about how to run SAnDReS. You can also start SAnDReS by clicking on the sandres.bat file, or create a shortcut for SAnDReS by right-clicking on the sandres.bat file. More details here.
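Before launching the GUI, it can help to confirm that the four packages listed above are visible to your Python 3 interpreter. The following is a minimal sketch and is not part of SAnDReS; the function name missing_packages is a hypothetical helper introduced here for illustration.

```python
import importlib.util

# Import names of the packages listed above
# (scikit-learn is imported under the module name "sklearn").
REQUIRED = {
    "NumPy": "numpy",
    "Matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "SciPy": "scipy",
}


def missing_packages(required=REQUIRED):
    """Return the display names of the packages that cannot be found."""
    return [name for name, module in required.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

If the script reports missing packages, installing Anaconda as described in Step 1 should provide all of them.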

Download Installers   

We provide here the SAnDReS installers for Linux and Windows. These installers were developed by Mr. Amauri Duarte da Silva. You don't need to have Python installed on your computer to run it. Just go through the installation instructions below, and enjoy using SAnDReS, a new way to think about protein-ligand docking. Last updated on May 19, 2018.

      

Installing SAnDReS (Windows)   

You need administrator privileges to install SAnDReS. The easiest way to install SAnDReS (Windows) on your computer is via the stand-alone installers, which you can download from the above links. To install SAnDReS, unzip the zip file and click on the installer file. Keep the installation folder indicated by the installer (c:\sandres). Once the installation has finished, you will have a desktop icon for SAnDReS; click on it to start your SAnDReS session. We have tested this installer (Windows 64 bits) on computers running Windows 8.1, and it worked fine for us. If you have any questions regarding the installation process, please feel free to contact me by e-mail: walter@azevedolab.net.


Installing SAnDReS (Linux) 

To install SAnDReS (Linux) on your computer use the stand-alone installers, which you can download from the above links. To install SAnDReS, unzip the tar.gz file, and then you can run SAnDReS.  


SAnDReS Programs 

SAnDReS is composed of three main programs: the SAnDReS GUI, the main program, and scikit_regression_methods_v1.py. The SAnDReS GUI is a front-end that prepares the input files and calls the SAnDReS main program, which carries out most of the computing work. The third program is an implementation of supervised machine-learning methods. SAnDReS was developed so that you can perform all its tasks from the GUI window, but you can also edit the input file and run it directly. SAnDReS allows you to generate machine-learning models tailored to the biological system of interest, and it can produce high-quality graphs for publications or presentations.

SAnDReS was designed to analyze data from any protein-ligand docking program; the only requirement is to have protein structures in Protein Data Bank (PDB) format, ligands in Structure Data Format (SDF), and docking and scoring function data in comma-separated values (CSV) format.
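As an illustration of the CSV requirement, the sketch below reads a small table of docking results with the Python standard library. The column names (ligand, rmsd, score, log_ki) and the values are hypothetical; they are not the layout that SAnDReS expects, which is described on the tutorial page.

```python
import csv
import io

# Hypothetical docking output: column names and values are illustrative only.
SAMPLE_CSV = """ligand,rmsd,score,log_ki
LIG1,0.85,-9.2,-7.1
LIG2,1.40,-8.1,-6.3
LIG3,3.10,-6.5,-5.0
"""


def read_docking_table(handle):
    """Read CSV rows and convert every column except 'ligand' to float."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({key: (value if key == "ligand" else float(value))
                     for key, value in row.items()})
    return rows


table = read_docking_table(io.StringIO(SAMPLE_CSV))
for entry in table:
    print(entry["ligand"], entry["score"], entry["log_ki"])
```

To read a file on disk, pass an open file handle (for example, open("results.csv", newline="")) instead of the in-memory sample.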


Docking Hub   

The most recent version of SAnDReS (1.1.0) can natively run AutoDock 4 (Morris et al., 1998), AutoDock Vina (Trott & Olson, 2010), and Molegro Virtual Docker (MVD) (Thomsen & Christensen, 2006). For MVD, you must have it previously installed on your computer. SAnDReS can automatically create the inputs necessary to run the previously mentioned docking programs. 


Machine Learning Box    

SAnDReS makes use of supervised machine-learning techniques to generate polynomial equations to predict ligand-binding affinity. SAnDReS has a flexible interface (Machine Learning Box) that allows testing the predictive power of regression models created by machine-learning techniques, such as Linear Regression, Least Absolute Shrinkage and Selection Operator (Lasso), Ridge, Elastic Net, Stochastic Gradient Descent Regressor (SGDRegressor), and Support Vector Regression.

All these methods are available from the scikit-learn library (Pedregosa et al., 2011) and are implemented as an intuitive workflow in SAnDReS. This approach allows the terms available in scoring functions implemented in programs such as AutoDock 4 (Morris et al., 1998), AutoDock Vina (Trott & Olson, 2010), and Molegro Virtual Docker (Thomsen & Christensen, 2006) to be used for the development of new scoring functions targeted to the biological system being analyzed. SAnDReS is in continuing development, and we are applying it to different biological systems. Figure 1 shows the higher predictive power of the function Polscore #155, generated with SAnDReS to predict Gibbs free energy of binding for protein-ligand complexes (Bitencourt-Ferreira & de Azevedo Jr., 2018), when compared with other top-ranking scoring functions of the programs AutoDock 4 (Morris et al., 1998), AutoDock Vina (Trott & Olson, 2010), and Molegro Virtual Docker (Thomsen & Christensen, 2006).

Figure 1. Improvement of the predictive power of a scoring function generated with the program SAnDReS to predict Gibbs free energy of binding for protein-ligand complexes (Bitencourt-Ferreira & de Azevedo Jr., 2018). Figure created by Ms. Gabriela Bitencourt-Ferreira.
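The regression methods named in this section can be compared directly with scikit-learn. The sketch below is not the SAnDReS workflow: it uses synthetic data in place of scoring-function terms and experimental affinities, and the hyperparameters (alpha, C, the train/test split) are arbitrary assumptions chosen only to make the example run.

```python
import numpy as np
from sklearn.linear_model import (ElasticNet, Lasso, LinearRegression, Ridge,
                                  SGDRegressor)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic data standing in for scoring-function terms (X) and
# experimental binding affinity (y); this is not real docking data.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = (1.5 * X[:, 0] - 0.8 * X[:, 1] + 0.3 * X[:, 2]
     + rng.normal(scale=0.1, size=60))

models = {
    "LinearRegression": LinearRegression(),
    "Lasso": Lasso(alpha=0.01),
    "Ridge": Ridge(alpha=1.0),
    "ElasticNet": ElasticNet(alpha=0.01, l1_ratio=0.5),
    "SGDRegressor": SGDRegressor(max_iter=1000, tol=1e-3, random_state=0),
    "SVR": SVR(kernel="linear", C=1.0),
}

# Fit each model on the first 40 samples and compute R^2 on the remaining 20.
r2_scores = {}
for name, model in models.items():
    pipeline = make_pipeline(StandardScaler(), model)
    pipeline.fit(X[:40], y[:40])
    r2_scores[name] = pipeline.score(X[40:], y[40:])

# Print the models from best to worst held-out R^2.
for name, r2 in sorted(r2_scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: R^2 = {r2:.3f}")
```

Scaling the features before fitting matters mostly for SGDRegressor and SVR, which is why every model is wrapped in the same pipeline here.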

Protein-Ligand Docking Simulations

Progress in protein-ligand docking simulations and their application to drug development has resulted in new potential inhibitors of target proteins, many of which have had their efficiency experimentally confirmed (Shoichet et al., 2002; Schneider & Böhm, 2002). To increase the success of a molecular docking strategy, we have to test different docking protocols and determine their performance. For instance, we can test whether a docking search algorithm can find the position of the ligand in the binding pocket (the re-docking process), as shown in figure 2 for the structure of cyclin-dependent kinase in complex with roscovitine (De Azevedo et al., 1997). SAnDReS can evaluate different docking protocols and indicate which one has the better overall performance.

                  

Figure 2. Re-dock result for the structure 2A4L. The ligand in red is the pose, and the crystallographic position is in white. The simulation was carried out by the program Molegro Virtual Docker (Thomsen & Christensen, 2006). 
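Re-docking results such as the one in figure 2 are usually summarized by the root-mean-square deviation (RMSD) between the docked pose and the crystallographic position. The text above does not name the metric, so treating RMSD as the measure is an assumption of this sketch, and the coordinates are invented for illustration (they are not taken from 2A4L).

```python
import math


def rmsd(pose, reference):
    """RMSD between two equally ordered lists of (x, y, z) atom coordinates."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("Coordinate lists must be non-empty and of equal length.")
    squared = sum((p - r) ** 2
                  for atom_p, atom_r in zip(pose, reference)
                  for p, r in zip(atom_p, atom_r))
    return math.sqrt(squared / len(pose))


# Hypothetical coordinates in angstroms for a three-atom ligand.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]

value = rmsd(docked, crystal)
print(f"Docking RMSD: {value:.2f} A")
```

The function assumes that both lists contain the same atoms in the same order and that no superposition is needed, which holds for re-docking because the pose and the crystal structure share the same protein frame.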

Linking Crystallography and Docking

During the evaluation of a structure, it is common to use parameters such as resolution, R-factor, R-free, and B-values, to mention a few. SAnDReS opens the possibility of evaluating the correlation of over 100 structural parameters with docking results, to investigate which parameters may influence the docking results.
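The kind of correlation described here can be sketched with a Pearson coefficient between one structural parameter and one docking result. This is an illustration only: the choice of Pearson correlation, the pairing of resolution with docking RMSD, and the numbers themselves are assumptions, not SAnDReS output.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("Need two sequences of equal length (at least 2 values).")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("Correlation is undefined for a constant sequence.")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: crystallographic resolution (angstroms) of five
# structures and the docking RMSD (angstroms) obtained for each of them.
resolution = [1.2, 1.6, 1.9, 2.3, 2.8]
docking_rmsd = [0.6, 0.9, 1.1, 1.8, 2.4]

r = pearson(resolution, docking_rmsd)
print(f"Pearson r between resolution and docking RMSD: {r:.3f}")
```

Repeating the calculation for each structural parameter gives a simple way to see which ones track the docking results most closely.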


Scoring Functions        

In the analysis of docking results, we apply scoring functions to rank pose results and/or evaluate binding affinity. These scoring functions can be classified into three different types: force-field-based, knowledge-based, and empirical scoring functions. Force-field-based scoring functions use non-bonded terms of classical mechanics force fields. Knowledge-based scoring functions are based on statistical observations of intermolecular contacts identified from structure databases. Empirical scoring functions make use of several intermolecular interaction terms that are calibrated through a regression procedure, where theoretical values are fitted to be as close as possible to experimental data. After the docking procedure is complete, scoring functions are employed to rank each ligand found by the docking procedure. This ranking process will predict the best-affinity ligand. For reviews, see Azevedo et al. (2012), de Avila & de Azevedo (2014), and De Azevedo (2010).
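The ranking step described above amounts to sorting ligands by their score. In the sketch below, the ligand names and scores are hypothetical, and the convention that a lower (more negative) score means better predicted affinity is an assumption that depends on the scoring function in use.

```python
def rank_ligands(scores, lower_is_better=True):
    """Return ligand names ordered from best to worst predicted affinity."""
    return sorted(scores, key=scores.get, reverse=not lower_is_better)


# Hypothetical scores in energy-like units (more negative = better).
scores = {"ligand_A": -7.0, "ligand_B": -9.5, "ligand_C": -8.1}

ranking = rank_ligands(scores)
print("Best predicted ligand:", ranking[0])
print("Full ranking:", ranking)
```

For a scoring function that predicts a quantity where larger is better, call rank_ligands(scores, lower_is_better=False).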


Exploring the Scoring Function Space

We envisage protein-ligand interaction as a result of the relation between the protein space (Smith, 1970) and the chemical space (Bohacek et al., 1996), and we propose to approach these sets as a single complex system, where the application of computational methodologies could contribute to establishing the physical principles needed to understand the structural basis for the specificity of ligands for proteins. Such approaches have the potential to create novel semi-empirical force fields to predict binding affinity with superior predictive power when compared with standard methodologies. We propose to use the abstraction of a mathematical space composed of an infinite number of computational models to predict ligand-binding affinity, named here the scoring function space (figure 3). By using supervised machine-learning techniques, we can explore this scoring function space to build a computational model targeted to a specific biological system. For instance, we created targeted scoring functions for HIV-1 protease and cyclin-dependent kinases. We developed the programs SAnDReS and Taba to generate computational models to predict ligand-binding affinity. SAnDReS and Taba are integrated computational tools to explore the scoring function space.


Figure 3. A view of the scoring function space as a way to develop a computational model to predict ligand-binding affinity. Structures of proteins available with the following PDB access codes: 2OW4, 2OVU, 2IDZ, 2GSJ, 2G85, 2A4L, 1ZTB, 1Z99, 1WE2, 1M73, 1FLH, and 1FHJ.
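To give a feeling for how quickly the scoring function space grows, the sketch below enumerates candidate polynomial models built from a handful of energy terms. It is not the procedure SAnDReS uses to generate its polynomial equations: the variable names, the choice of linear, squared, and pairwise cross terms, and the limit of three terms per model are all assumptions made for illustration.

```python
from itertools import combinations


def polynomial_terms(variables):
    """Linear, squared, and pairwise cross terms for the given variable names."""
    linear = list(variables)
    squared = [f"{v}^2" for v in variables]
    cross = [f"{a}*{b}" for a, b in combinations(variables, 2)]
    return linear + squared + cross


def candidate_models(terms, max_terms):
    """Yield every non-empty subset of terms with at most max_terms members."""
    for size in range(1, max_terms + 1):
        yield from combinations(terms, size)


# Hypothetical names for three scoring-function terms.
variables = ["vdw", "hbond", "torsion"]
terms = polynomial_terms(variables)
candidates = list(candidate_models(terms, max_terms=3))

print(len(terms), "terms,", len(candidates), "candidate models")
print("Example model:", " + ".join(candidates[-1]))
```

Each candidate would then be fitted by one of the regression methods and compared on its predictive performance; even this small example already yields more than a hundred candidates.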


Biological Systems Analyzed by SAnDReS

Below you have a list of biological systems (datasets) that were analyzed using SAnDReS. Each zipped folder has the necessary files to reproduce the results reported for each dataset. See the tutorial page for additional information about these files.

     -Coagulation Factor Xa with Ki Information   ZIP   PubMed   
     -Cyclin-Dependent Kinases with IC50 Information   ZIP   PubMed   
     -Cyclin-Dependent Kinases with Ki Information   ZIP   (to be published)   
     -High-resolution Structures with Delta G Information   ZIP   Link     
     -High-resolution Structures with IC50 Information   ZIP   PubMed     
     -HIV-1 Protease with Ki Information   ZIP   PubMed   


More About SAnDReS 

SAnDReS became operational on 12 January 2016 at the Laboratory of Computational Systems Biology in Porto Alegre, RS, Brazil, as version number 1.0.1. The SAnDReS GUI is shown in figure 4. Dr. Walter F. de Azevedo Jr. developed SAnDReS, with help from Gabriela Bitencourt-Ferreira, Amauri Duarte da Silva, Maurício Boff de Ávila, and Mariana Morrone Xavier (Xavier et al., 2016).

                Figure 4. SAnDReS 1.1.0 GUI window.


Contact

SAnDReS is in continuous development. Feel free to download the latest version and use it in the analysis of your docking results. If you have any questions regarding SAnDReS, please e-mail me: walter@azevedolab.net


Funding

Funding Agency: Conselho Nacional de Desenvolvimento Científico e Tecnológico - National Council for Scientific and Technological Development (www.cnpq.br)
Principal Investigator: Walter F. de Azevedo Jr., Ph.D.
Process Numbers: 472590/2012-0 and 308883/2014-4. 


License

SAnDReS is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.  


References

-Amaral MEA, Nery LR, Leite CE, de Azevedo Junior WF, Campos MM. Pre-clinical effects of metformin and aspirin on the cell lines of different breast cancer subtypes. Invest New Drugs. 2018. doi: 10.1007/s10637-018-0568-y.   PubMed   PDF   

-Azevedo LS, Moraes FP, Xavier MM, Pantoja EO, Villavicencio B, Finck JA, Proenca AM, Rocha KB, de Azevedo WF. Recent Progress of Molecular Docking Simulations Applied to Development of Drugs. Curr Bioinform. 2012; 7(4): 352-65. Link to Journal

-Bitencourt-Ferreira G, de Azevedo Jr. WF. Development of a machine-learning model to predict Gibbs free energy of binding for protein-ligand complexes. Biophys Chem. 2018; 240: 63–69.   PubMed   PDF  

-Bohacek RS, McMartin C, Guida WC. The art and practice of structure-based drug design: a molecular modeling perspective. Med Res Rev. 1996; 16(1):3-50.   PubMed   

-de Avila MB, de Azevedo WF. Data Mining of Docking Results. Application to 3-Dehydroquinate Dehydratase. Curr Bioinform. 2014; 9(4): 361-79. Link to Journal

-de Ávila MB, Xavier MM, Pintro VO, de Azevedo WF. Supervised machine learning techniques to predict binding affinity. A study for cyclin-dependent kinase 2.  Biochem Biophys Res Commun. 2017; 494: 305-310.  PubMed   PDF 

-de Ávila MB, de Azevedo WF Jr. Development of machine learning models to predict inhibition of 3-dehydroquinate dehydratase. Chem Biol Drug Des. 2018;92:1468–1474.   PubMed   PDF      

-De Azevedo WF, Leclerc S, Meijer L, Havlicek L, Strnad M, Kim SH. Inhibition of cyclin-dependent kinases by purine analogues: crystal structure of human cdk2 complexed with roscovitine. Eur J Biochem. 1997; 243(1-2): 518-26.   PubMed  

-De Azevedo WF Jr. MolDock applied to structure-based virtual screening. Curr Drug Targets. 2010; 11(3):327-34. PubMed    

-Levin NMB, Pintro VO, Bitencourt-Ferreira G, Mattos BB, Silvério AC, de Azevedo Jr. WF. Development of CDK-targeted scoring functions for prediction of binding affinity. Biophys Chem. 2018; 235: 1–8.  PubMed   PDF        

-Morris G, Goodsell D, Halliday R, Huey R, Hart W, Belew R, Olson A. Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. J Comput Chem. 1998; 19:1639-1662.   Link   

-Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011; 12: 2825-30. PDF

-Pintro VO, Azevedo WF. Optimized Virtual Screening Workflow. Towards Target-Based Polynomial Scoring Functions for HIV-1 Protease. Comb Chem High Throughput Screen. 2017; 20(9): 820-827.   PubMed   PDF   

-Russo S, De Azevedo WF. Advances in the Understanding of the Cannabinoid Receptor 1 - Focusing on the Inverse Agonists Interactions. Curr Med Chem. 2018. doi: 10.2174/0929867325666180417165247   PubMed     

-Smith JM. Natural selection and the concept of a protein space. Nature. 1970; 225(5232): 563–564.

-Schneider G, Böhm HJ. Virtual screening and fast automated docking methods. Drug Discov Today. 2002; 7(1):64-70.   PubMed  

-Shoichet BK, McGovern SL, Wei B, Irwin JJ. Lead discovery using molecular docking. Curr Opin Chem Biol. 2002; 6(4):439-46.   PubMed    

-Thomsen R, Christensen MH. MolDock: a new technique for high-accuracy molecular docking. J Med Chem. 2006;49:3315-21.   PubMed   

-Trott O, Olson AJ. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem. 2010; 31(2):455-61.   PubMed      

-Xavier MM, Heck GS, de Avila MB, Levin NM, Pintro VO, Carvalho NL, Azevedo WF Jr. SAnDReS a Computational Tool for Statistical Analysis of Docking Results and Development of Scoring Functions. Comb. Chem. High Throughput Screen. 2016; 19(10): 801-12.   PubMed    PDF    GitHub         

      

This site was designed by Dr. Walter F. de Azevedo Jr.