Dataset

This study used the well-curated NuRA chemical dataset to train machine learning models for nine nuclear receptors [23]. The dataset and the KNIME workflow [29] used for data curation were downloaded from https://doi.org/10.5281/zenodo.3991561, and we carefully verified each step of the curation process. The dataset contains 15,247 combined entries for nine different receptors, annotated with three binding class types: 1) agonist, 2) antagonist, and 3) binder. Each type is further classified by activity as 1) active, 2) weakly active, 3) inactive, 4) inconclusive, or 5) data missing. Table 1 shows the composition of the classes for each receptor. Missing and inconclusive entries were removed from the dataset. Because the number of chemicals in the weakly active category is low, we then combined the active and weakly active entries into a single category within each binding class type, resulting in a binary (active vs inactive) designation for each of the agonists, antagonists, and binders. Our study therefore developed machine learning models to predict each of these binding class types using binary classification (Binding Class models).

 

Table 1: Number of chemicals by class for all receptors in the training and validation sets.

| Receptor | Class | Total Inactive | Total Active | Total Weakly Active | Training set Actives/Inactives | Validation set Actives/Inactives |
|---|---|---|---|---|---|---|
| PR | Agonist | 5670 | 349 | 27 | 290/4546 | 86/1124 |
| PR | Antagonist | 4400 | 741 | 548 | 1027/3524 | 262/876 |
| PR | Binder | 5040 | 1251 | 53 | 1057/4018 | 247/1022 |
| RXR | Agonist | 4549 | 130 | 133 | - | - |
| RXR | Antagonist | 3 | 115 | 1 | - | - |
| RXR | Binder | 4569 | 861 | 145 | - | - |
| GR | Agonist | 5384 | 737 | 41 | 613/4316 | 165/1068 |
| GR | Antagonist | 4577 | 657 | 190 | 666/3673 | 181/904 |
| GR | Binder | 5228 | 1815 | 84 | 1537/4164 | 362/1064 |
| AR | Agonist | 5578 | 513 | 121 | 517/4452 | 117/1126 |
| AR | Antagonist | 4942 | 776 | 391 | 926/3961 | 241/981 |
| AR | Binder | 5130 | 1419 | 104 | 1243/4079 | 280/1051 |
| ERA | Agonist | 5060 | 476 | 461 | 751/4046 | 186/1014 |
| ERA | Antagonist | 5160 | 362 | 322 | 544/4131 | 140/1029 |
| ERA | Binder | 4861 | 1287 | 177 | 1184/3876 | 280/985 |
| ERB | Agonist | 5744 | 286 | 48 | 270/4592 | 64/1152 |
| ERB | Antagonist | 5133 | 224 | 229 | 359/4109 | 94/1024 |
| ERB | Binder | 5554 | 1159 | 66 | 998/4425 | 227/1129 |
| FXR | Agonist | 5349 | 372 | 85 | 346/4298 | 111/1051 |
| FXR | Antagonist | 4829 | 124 | 143 | 219/3857 | 48/972 |
| FXR | Binder | 5272 | 550 | 108 | 530/4214 | 128/1058 |
| PPARD | Agonist | 5663 | 616 | 73 | - | - |
| PPARD | Antagonist | 5561 | 28 | 24 | - | - |
| PPARD | Binder | 5742 | 730 | 52 | - | - |
| PPARG | Agonist | 5223 | 1352 | 158 | 1200/4186 | 310/1037 |
| PPARG | Antagonist | 5249 | 88 | 153 | 203/4189 | 38/1060 |
| PPARG | Binder | 5458 | 1699 | 205 | 1529/4360 | 375/1098 |

 

Since we are also interested in identifying active (binding) vs inactive chemicals regardless of agonist, antagonist, or undefined binding class (Effector models), we additionally developed machine learning models by first merging the three binding classes and removing the inconclusive and missing data for each receptor, thereby increasing the sample size of the Effector types (active and inactive). Table 2 shows the composition of active and inactive chemicals for each receptor after merging the three binding types.

 

Table 2: Number of active and inactive chemicals for all receptors.

| Receptor | Total Actives | Total Inactives | Training set Total | Training set Actives/Inactives | Validation set Total | Validation set Actives/Inactives |
|---|---|---|---|---|---|---|
| RXR | 1008 | 4569 | 4461 | 807/3654 | 1116 | 201/915 |
| PR | 2078 | 5063 | 5712 | 1646/4066 | 1429 | 432/997 |
| GR | 2143 | 5232 | 5900 | 1720/4180 | 1475 | 423/1052 |
| AR | 2217 | 5179 | 5916 | 1782/4134 | 1480 | 435/1045 |
| ERA | 2327 | 4956 | 5826 | 1863/3963 | 1457 | 464/993 |
| ERB | 1552 | 5563 | 5692 | 1228/4464 | 1423 | 324/1099 |
| FXR | 837 | 5276 | 4890 | 662/4228 | 1223 | 175/1048 |
| PPARD | 848 | 5745 | 5274 | 678/4596 | 1319 | 170/1149 |
| PPARG | 2118 | 5469 | 6069 | 1693/4376 | 1518 | 425/1093 |

 

 

 

Training Dataset

 

For each of the nine nuclear receptors, the NR-specific curated chemical dataset was randomly divided into training (80%) and validation (20%) sets using the "train_test_split" function in the scikit-learn package (Tables 1 and 2). The validation set was used to estimate the performance of each developed model; its chemicals were not used at any stage of developing or optimizing any of our ML models.
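A minimal sketch of this split is shown below, assuming X holds one receptor's fingerprint matrix and y its binary activity labels; the toy data, the fixed random seed, and the use of stratification are illustrative assumptions, since the paper does not report these details.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins for one receptor's curated data:
# X holds 1024-bit fingerprint vectors, y the binary activity labels.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 1024))
y = rng.integers(0, 2, size=1000)

# 80% training / 20% validation split, as described in the text.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.20, random_state=42)  # seed is an assumption
```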

 

Molecular Features

 

In this investigation, we utilized molecular fingerprints as descriptor features. We employed two widely used fingerprinting methods: 1) Morgan fingerprints, also called extended-connectivity fingerprints (ECFP4), a circular substructure fingerprint for which we chose a radius of 3 and hashed binary vectors of length 1024 bits, and 2) Molecular ACCess System (MACCS) key fingerprints, which comprise 166 public keys implemented as SMARTS patterns. The Python-based RDKit [30] library was used to generate the molecular fingerprints from the SMILES data.
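As a sketch of this feature-generation step, the RDKit calls below produce both fingerprint types for a single SMILES string; the example molecule is arbitrary and not from the dataset.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, MACCSkeys

smiles = "CC(=O)Oc1ccccc1C(=O)O"  # arbitrary example molecule (aspirin)
mol = Chem.MolFromSmiles(smiles)

# Morgan (extended-connectivity) fingerprint hashed to 1024 bits,
# using the radius of 3 stated in the text.
morgan_fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=3, nBits=1024)

# MACCS key fingerprint (166 public keys; RDKit returns a 167-bit
# vector whose bit 0 is unused).
maccs_fp = MACCSkeys.GenMACCSKeys(mol)

# Convert to plain 0/1 lists for use as model features.
morgan_bits = list(morgan_fp)
maccs_bits = list(maccs_fp)
```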

 

 

Machine Learning Model Development 

 

As noted previously [31], there is no single machine learning algorithm that is optimal for all potential data problems. However, one can define an approach that is guaranteed to perform at least as well as the best of a set of explicit, competing algorithms. In our case, we used nine different machine learning techniques: 1) AdaBoost [32], a boosting algorithm that combines multiple "weak classifiers" into a single "strong classifier"; 2) logistic regression [33], which predicts the value of a categorical variable based on its relationship with predictor variables; 3) random forest [34], which merges a collection of independent decision trees to decrease both bias and variance; 4) support vector machine (SVM) [35], a classifier that finds an optimal hyperplane to maximize the margin between two classes; 5) the k-nearest neighbors (k-NN) algorithm [36], which assumes that similar data points lie near each other and makes predictions by computing the distance between a new data point and all data points in the training set; 6) the bagging classifier [37], an ensemble model that fits base classifiers on random subsets of the original dataset and then aggregates their predictions into a final prediction; 7) Gaussian Naïve Bayes [38], a variant of the Naïve Bayes algorithm based on Bayes' theorem; 8) the decision tree classifier [39], which uses a tree in which each node represents a feature, each branch a decision, and each leaf an outcome; and 9) the super learner [31], which combines the predictive probabilities of NR binding across many ML algorithms and finds the optimal combination of the collection of algorithms by minimizing the cross-validated risk. This approach improves over methods using a single ML algorithm because no one algorithm is universally optimal; the super learner has been shown in theory to be at least as good as the best-performing algorithm in the ensemble and often performs considerably better than its component models. For each of these methods, we used grid-search cross-validation (GridSearchCV), as implemented in scikit-learn [40], to tune the hyperparameters.
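The sketch below illustrates the two ingredients of this setup: hyperparameter tuning with GridSearchCV, and a super-learner-style ensemble approximated here with scikit-learn's StackingClassifier, which combines base models' cross-validated predicted probabilities through a meta-learner. The parameter grid, base models, and toy data are illustrative assumptions, not the authors' exact configuration or the cited super learner implementation.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

# Toy fingerprint data standing in for one receptor's training set.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 1024)).astype(float)
y = rng.integers(0, 2, size=200)

# Hyperparameter tuning via grid-search CV; the grid values are assumptions.
svm_grid = GridSearchCV(
    SVC(probability=True),
    param_grid={"C": [0.1, 1, 10], "gamma": ["scale", 0.01]},
    cv=5, scoring="balanced_accuracy")
svm_grid.fit(X, y)

# Super-learner-like ensemble: base models' predicted probabilities are
# combined by a cross-validated logistic-regression meta-learner.
stack = StackingClassifier(
    estimators=[("svm", svm_grid.best_estimator_),
                ("rf", RandomForestClassifier(n_estimators=100)),
                ("knn", KNeighborsClassifier())],
    final_estimator=LogisticRegression(),
    stack_method="predict_proba", cv=5)
stack.fit(X, y)
```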

 

Repeated k-Fold Cross-Validation

 

We assessed the performance of the classification models using stratified k-fold cross-validation, splitting the data so that the ratio of the classes was preserved in each fold. For each receptor, we evaluated classification performance by repeated stratified k-fold cross-validation with ten splits and 100 repeats, for a total of 1,000 folds.
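A minimal sketch of this protocol using scikit-learn's RepeatedStratifiedKFold is given below; the specific classifier, toy data, and seed are placeholders, and note that 1,000 folds means 1,000 model fits.

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Placeholder data: 166-bit MACCS-style feature vectors with binary labels.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(300, 166)).astype(float)
y = rng.integers(0, 2, size=300)

# Ten stratified splits repeated 100 times = 1,000 evaluation folds.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=100, random_state=42)
scores = cross_val_score(SVC(), X, y, cv=cv, scoring="accuracy")
print(f"accuracy: {scores.mean():.2f} ± {scores.std():.2f}")
```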

 

Applicability Domain

 

The applicability domain is defined as described by Chen et al. [44] and is measured by similarity to the molecules in the training set. Tanimoto similarity was calculated using ECFP4 fingerprints or MACCS key fingerprints, matching the feature space of the model. A test molecule is considered to be within the applicability domain if at least Nmin (default = 1) chemicals in the training dataset have a similarity to it greater than the cutoff Scutoff (default = 0.25). The applicability domain is thus defined by the combination of Scutoff and Nmin.
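A sketch of this criterion in the ECFP4 feature space, using RDKit's bulk Tanimoto routine, is shown below; the function name and toy molecules are our own illustration of the rule, not the authors' code.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def in_applicability_domain(query_smiles, train_smiles,
                            s_cutoff=0.25, n_min=1):
    """True if at least n_min training molecules have a Tanimoto
    similarity to the query greater than s_cutoff (ECFP4 space)."""
    fp = lambda s: AllChem.GetMorganFingerprintAsBitVect(
        Chem.MolFromSmiles(s), radius=3, nBits=1024)
    sims = DataStructs.BulkTanimotoSimilarity(
        fp(query_smiles), [fp(s) for s in train_smiles])
    return sum(sim > s_cutoff for sim in sims) >= n_min

# Toy usage: phenol against a two-molecule "training set".
print(in_applicability_domain("c1ccccc1O", ["c1ccccc1", "CCO"]))
```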

 

 

Models for AR

 

Binding Class Models for AR

 

Agonist, antagonist, and binder datasets were used to build three different machine learning models for AR. Prediction accuracies for the different types and algorithms on cross-validation with ECFP4 and MACCS key fingerprints are given in Tables 3 and 4, respectively. The algorithms achieved cross-validation prediction accuracies of >90% on the agonist and binder datasets. For the agonist, the best accuracy was obtained for both the super learner and SVM-based models: 87% on the validation set with ECFP4 fingerprints (Table 5). With MACCS key fingerprints, the best accuracy was obtained for the super learner (Table 6). For the binder dataset, SVM and super learner had similar performance, with 97% and 96% accuracy on the validation set for ECFP4 and MACCS key fingerprints, respectively. For the agonist dataset, the precision-recall AUC (PR AUC) values on validation for the super learner and SVM are 0.81 and 0.80 (Table 5), respectively, for ECFP4 fingerprints, and 0.81 and 0.79 for MACCS key fingerprints (Table 6). For the binder dataset, the validation PR AUC values are 0.98 and 0.97 for ECFP4 and MACCS key fingerprints, respectively. We applied the applicability domain to the validation set and removed the unreliable data points thus identified; we then evaluated the performance of the SVM and super learner models on the remaining reliable data points.

 

The AdaBoost, bagging, decision tree, k-NN, random forest, super learner, and SVM models achieved prediction accuracies of >85% for the antagonist model with both ECFP4 and MACCS key fingerprints as features. On the validation set with ECFP4 fingerprints, the super learner and SVM-based models achieved 83% and 84% accuracy, respectively (Table 5). Similar balanced accuracy was obtained for the super learner and SVM models with MACCS key fingerprints (Table 6). The PR AUC values on the validation set for the super learner and SVM are 0.83 and 0.84 (Table 5), respectively, for ECFP4 fingerprints, and 0.82 and 0.83 for MACCS key fingerprints (Table 6). The performance of these models is comparable to other published models [11, 28].

 

Effector Models for AR

 

For AR, four algorithms, k-NN, random forest, SVM, and super learner, with ECFP4 fingerprints all exhibited high predictive power: balanced accuracies of 85%, 86%, 87%, and 86%, respectively, with MCC scores of 0.77, 0.73, 0.78, and 0.75, respectively, on the validation dataset (Table 7). The accuracy scores on the k-fold CV for these four models are 0.90±0.01, 0.88±0.01, 0.90±0.01, and 0.89±0.01, respectively (Table 8). The effector AR model thus achieved a cross-validation prediction accuracy of 90% for SVM and k-NN and 89% for the super learner. Although k-NN and SVM achieved higher accuracy with MACCS key fingerprints, SVM with ECFP4 fingerprints performed best, with a higher MCC value; MCC provides a more informative and truthful score for evaluating binary classifications [47].

 

 

 

 

Table 3: Average accuracy of different algorithms for the three binding class types of six receptors, using ECFP4 fingerprints as input features, on repeated k-fold cross-validation.

| Receptor | Algorithm | Agonist Accuracy | Antagonist Accuracy | Binder Accuracy |
|---|---|---|---|---|
| AR | Super learner | 0.95±0.01 | 0.87±0.02 | 0.98±0.01 |
| AR | Support vector machine | 0.96±0.01 | 0.88±0.01 | 0.98±0.01 |
| ERA | Super learner | 0.84±0.02 | 0.86±0.02 | 0.94±0.01 |
| ERA | Support vector machine | 0.84±0.01 | 0.88±0.01 | 0.94±0.01 |
| ERB | Super learner | 0.96±0.01 | 0.86±0.02 | 0.98±0.01 |
| ERB | Support vector machine | 0.96±0.01 | 0.87±0.01 | 0.98±0.01 |
| FXR | Super learner | 0.98±0.01 | 0.88±0.02 | 0.97±0.01 |
| FXR | Support vector machine | 0.97±0.01 | 0.89±0.02 | 0.97±0.01 |
| GR | Super learner | 0.98±0.01 | 0.93±0.01 | 0.98±0.01 |
| GR | Support vector machine | 0.98±0.01 | 0.93±0.01 | 0.98±0.01 |
| PR | Super learner | 0.98±0.01 | 0.86±0.02 | 0.99±0.00 |
| PR | Support vector machine | 0.98±0.01 | 0.87±0.02 | 0.99±0.00 |

± = standard deviation.

 

Table 4: Average accuracy of different algorithms for the three binding class types of seven receptors, using MACCS key fingerprints as input features, on repeated k-fold cross-validation.

| Receptor | Algorithm | Agonist Accuracy | Antagonist Accuracy | Binder Accuracy |
|---|---|---|---|---|
| AR | Super learner | 0.95±0.01 | 0.87±0.02 | 0.98±0.01 |
| AR | Support vector machine | 0.96±0.01 | 0.89±0.01 | 0.97±0.01 |
| ERA | Super learner | 0.83±0.02 | 0.83±0.02 | 0.95±0.01 |
| ERA | Support vector machine | 0.85±0.01 | 0.91±0.01 | 0.95±0.01 |
| ERB | Super learner | 0.96±0.01 | 0.85±0.02 | 0.97±0.01 |
| ERB | Support vector machine | 0.98±0.01 | 0.84±0.02 | 0.98±0.01 |
| FXR | Super learner | 0.96±0.01 | 0.81±0.04 | 0.96±0.01 |
| FXR | Support vector machine | 0.98±0.01 | 0.78±0.02 | 0.97±0.01 |
| GR | Super learner | 0.85±0.02 | 0.83±0.02 | 0.86±0.01 |
| GR | Support vector machine | 0.98±0.01 | 0.94±0.01 | 0.97±0.01 |
| PPARG | Super learner | 0.86±0.01 | 0.92±0.01 | 0.84±0.01 |
| PPARG | Support vector machine | 0.96±0.01 | 0.82±0.02 | 0.96±0.01 |
| PR | Super learner | 0.98±0.01 | 0.86±0.02 | 0.98±0.01 |
| PR | Support vector machine | 0.98±0.01 | 0.88±0.01 | 0.98±0.01 |

± = standard deviation.

 

 

Table 5: Comparison of the performance of the different classifiers on the validation set for the three binding class types of AR, using ECFP4 fingerprints as input features.

| Class | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| Agonist | Super learner | 0.87 | 0.77 | 0.98 | 0.75 | 0.81 | 90 | 1099 | 27 | 27 |
| Agonist | Support vector machine | 0.87 | 0.77 | 0.97 | 0.74 | 0.80 | 90 | 1097 | 27 | 29 |
| Antagonist | Super learner | 0.83 | 0.72 | 0.94 | 0.67 | 0.83 | 173 | 922 | 68 | 59 |
| Antagonist | Support vector machine | 0.84 | 0.76 | 0.92 | 0.66 | 0.84 | 183 | 901 | 58 | 80 |
| Binder | Super learner | 0.97 | 0.95 | 0.99 | 0.95 | 0.98 | 265 | 1043 | 15 | 8 |
| Binder | Support vector machine | 0.97 | 0.94 | 0.99 | 0.94 | 0.98 | 264 | 1042 | 16 | 9 |

BA – balanced accuracy; Sn – sensitivity; Sp – specificity; MCC – Matthews correlation coefficient; PR AUC – area under the precision-recall curve; TP – true positives; TN – true negatives; FN – false negatives; FP – false positives.

 

Table 6: Comparison of the performance of the different classifiers on the validation set for the three binding class types of AR, using MACCS key fingerprints as input features.

| Class | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| Agonist | Super learner | 0.87 | 0.79 | 0.96 | 0.71 | 0.81 | 92 | 1084 | 25 | 42 |
| Agonist | Support vector machine | 0.86 | 0.79 | 0.94 | 0.64 | 0.79 | 92 | 1060 | 25 | 66 |
| Antagonist | Super learner | 0.84 | 0.75 | 0.93 | 0.66 | 0.82 | 180 | 910 | 61 | 71 |
| Antagonist | Support vector machine | 0.84 | 0.80 | 0.89 | 0.64 | 0.83 | 192 | 874 | 49 | 107 |
| Binder | Super learner | 0.96 | 0.94 | 0.98 | 0.92 | 0.97 | 264 | 1033 | 16 | 18 |
| Binder | Support vector machine | 0.96 | 0.94 | 0.98 | 0.92 | 0.97 | 262 | 1034 | 18 | 17 |

BA – balanced accuracy; Sn – sensitivity; Sp – specificity; MCC – Matthews correlation coefficient; PR AUC – area under the precision-recall curve; TP – true positives; TN – true negatives; FN – false negatives; FP – false positives.

 

Table 7: Comparison of the performance of the different classifiers on the validation set for the AR Effector dataset.

| Fingerprint | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| ECFP4 | Super learner | 0.86 | 0.76 | 0.96 | 0.75 | 0.89 | 332 | 1000 | 103 | 45 |
| ECFP4 | Support vector machine | 0.87 | 0.76 | 0.98 | 0.78 | 0.90 | 329 | 1019 | 106 | 26 |
| MACCS | Super learner | 0.86 | 0.79 | 0.93 | 0.73 | 0.89 | 345 | 969 | 90 | 76 |
| MACCS | Support vector machine | 0.87 | 0.79 | 0.94 | 0.75 | 0.90 | 345 | 982 | 90 | 63 |

BA – balanced accuracy; Sn – sensitivity; Sp – specificity; MCC – Matthews correlation coefficient; PR AUC – area under the precision-recall curve; TP – true positives; TN – true negatives; FN – false negatives; FP – false positives.

 

 

Table 8: Average accuracy of different algorithms for the effector datasets of all receptors, using ECFP4 and MACCS key fingerprints as input features, on repeated k-fold cross-validation.

| Fingerprint | Method | AR | ERA | ERB | FXR | GR | PPARD | PPARG | PR | RXR |
|---|---|---|---|---|---|---|---|---|---|---|
| ECFP4 | Super learner | 0.89±0.01 | 0.85±0.01 | 0.94±0.01 | 0.94±0.01 | 0.95±0.01 | 0.98±0.01 | 0.94±0.01 | 0.89±0.01 | 0.96±0.01 |
| ECFP4 | Support vector machine | 0.90±0.01 | 0.86±0.01 | 0.94±0.01 | 0.95±0.01 | 0.95±0.01 | 0.98±0.01 | 0.94±0.01 | 0.90±0.01 | 0.96±0.01 |
| MACCS | Super learner | 0.88±0.01 | 0.84±0.01 | 0.93±0.01 | 0.92±0.01 | 0.94±0.01 | 0.97±0.01 | 0.93±0.01 | 0.89±0.01 | 0.96±0.01 |
| MACCS | Support vector machine | 0.89±0.01 | 0.85±0.01 | 0.93±0.01 | 0.94±0.01 | 0.95±0.01 | 0.98±0.01 | 0.94±0.01 | 0.90±0.01 | 0.96±0.01 |

± = standard deviation.

 

Models for ERA and ERB

 

Binding Class Models for ERA and ERB

Machine learning models for agonists, antagonists, and binders of both ERA and ERB were evaluated using a validation dataset and repeated k-fold CV. The performance measures for the different algorithms on repeated k-fold CV are given in Tables 3 and 4 for ECFP4 and MACCS key fingerprints as input features, respectively. For ERA, the bagging classifier has average accuracies of 89%, 91%, and 94% for the agonist, antagonist, and binder datasets with ECFP4 fingerprints, and 88%, 91%, and 93% with MACCS key fingerprints, respectively. The performance measures of ERA and ERB using the binding class classifiers on the validation set are given in supporting information Tables S3 and S4, respectively, for ECFP4 fingerprints as input features, and in Tables S5 and S6, respectively, for MACCS key fingerprints. Even though the bagging classifier has better accuracy on CV, the SVM and super learner give more consistent prediction accuracy across both CV and the validation dataset (Tables S3 and S5). Similarly, for ERB, more consistent performance measures were obtained for SVM and super learner (see Tables S4 and S6).

 

Effector Models for ERA and ERB

 

The SVM model performed best (balanced accuracy 80%; MCC 0.66), followed by random forest (accuracy 79%; MCC 0.61), with ECFP4 fingerprints as descriptors on the ERA validation dataset (Table S7). For MACCS key fingerprints, SVM had comparable accuracy but a lower MCC (Table S7). The lower MCC is likely due to the promiscuous nature of ERA, which binds diverse chemicals and thus makes it harder for machine learning algorithms to discriminate between NR-binding and non-binding chemicals. For ERB, the accuracy scores on the k-fold CV are 85% and 86% for the super learner and SVM with ECFP4 fingerprints, and 84% and 85% with MACCS key fingerprints (Table S8). The model developed using SVM combined with ECFP4 fingerprints had the maximum MCC value of 0.82, with specificity, sensitivity, and balanced accuracy of 94%, 94%, and 89%, respectively. Similar performance was observed for the other classifiers with ECFP4 and MACCS key fingerprints.

 

 

 

Models for FXR and PPARG

 

Binding Class Models for FXR and PPARG

 

The average stratified k-fold CV accuracies of classifiers based on ECFP4 and MACCS key fingerprints for the different classes of FXR and PPARG are given in Tables 3 and 4, respectively, demonstrating that all of the classifiers achieved accuracies of >90% at identifying FXR agonists and binders. Specifically, the bagging, k-nearest neighbors, random forest, super learner, and SVM classifiers achieved accuracies of >95% at identifying FXR agonists and binders with MACCS key fingerprints. For the FXR antagonist dataset, the AdaBoost, bagging, decision tree, and random forest classifiers all have k-fold CV accuracies of >90% with both ECFP4 and MACCS keys. The performance of the different classifiers for the different classes of FXR and PPARG on the validation dataset is given in Tables S9 (ECFP4 fingerprints) and S10 (MACCS keys) and Tables S11 (ECFP4 fingerprints) and S12 (MACCS keys), respectively. The results demonstrate that the super learner attained better performance for agonists and binders of FXR with both fingerprint types. Similar performance was achieved for PPARG agonists and binders. Poor performance of the antagonist models on the validation set was obtained for all classifiers for both FXR and PPARG, owing to the small number of antagonists in the training dataset.

 

 

 

Models for GR and PR

 

Binding Class Models for GR and PR

 

The average stratified k-fold CV accuracies of classifiers based on ECFP4 and MACCS key fingerprints for the different classes of GR and PR are given in Tables 3 and 4, respectively. The results show that the SVM and super learner algorithms have higher accuracy in identifying agonists and binders for GR and PR based on k-fold CV. The performance of the different classifiers for the different classes of GR and PR on the validation dataset is given in Tables S13 (ECFP4 fingerprints) and S14 (MACCS keys) and Tables S15 (ECFP4 fingerprints) and S16 (MACCS keys), respectively. Random forest, super learner, and SVM have good performance scores for the three classes of GR and PR with both feature types.

 

Effector Models for FXR, GR, PR, PPARG, PPARD and RXR

 

Data availability for antagonists of PPARD and RXR is limited; hence we did not model their individual binding classes. We merged the datasets as described in the Materials and Methods to create an effector dataset for these receptors. Performance measures on the repeated k-fold CV for FXR, GR, PR, PPARG, PPARD, and RXR are given in Table 8 for ECFP4 and MACCS key fingerprints, and show high accuracy across these NRs for both fingerprint types. The performance measures on the validation dataset for FXR, GR, PR, PPARD, PPARG, and RXR are given in supporting information Tables S17 to S22, respectively. Table 8 shows that the super learner and SVM both attained accuracies of approximately 90% or higher for the effector datasets of these receptors. Supporting information Tables S17 and S19 for FXR and PR show that most of the classifiers attained high accuracy for both fingerprint types. Random forest, k-nearest neighbors, and support vector machine with ECFP4 fingerprints showed similar sensitivities/specificities of 94-95%/94-95%, with MCC values of 0.75-0.76, on the validation dataset. The results for the ligand binding predictions for GR, PPARD, PPARG, and RXR (see supporting information Tables S18, S20, S21, and S22) show that the support vector machine-based models achieved slightly higher accuracy and MCC scores than the other evaluated algorithms.

 

 

Applicability Domain on the Validation Set

 

The results on the validation dataset after filtering through the applicability domain for prediction reliability are given as CSV files in the supporting information (S23 to S81). The results show that applying the applicability domain to the SVM and super learner models with ECFP4 fingerprints improves model performance. The stringent setting of Scutoff = 0.6 and Nmin >= 5 reduces the number of chemicals within the applicability domain and gives the best prediction outcomes. A significant improvement in the performance of the FXR antagonist models was obtained using these strict applicability domain parameters.

 

Implementation of Web Server

 

Based on our trained and validated best-performing models, we developed a web-based application named NR-ToxPred with a user-friendly interface to assist the scientific community (Figure 1). The NR-ToxPred interface accepts small molecules in several formats: users can sketch the structure in a simple drawing interface, enter a SMILES string as text in the drawing interface, or search by CAS ID. For multiple-ligand predictions, users can upload a two-column comma-separated (CSV) file with SMILES strings and the corresponding names. We implemented the best-performing support vector machine-based model for all nine NRs on the web server. For single-structure input, in addition to the tabulated results for each receptor, if the chemical is a predicted ligand it is subsequently docked to the matching receptor(s). Users can also select the applicability domain criteria (Scutoff and Nmin). The NR-ToxPred web service can be accessed at http://nr-toxpred.cchem.berkeley.edu/.

 

Limitations of the models

 

In this study, we developed machine learning models for predicting agonists, antagonists, and binders (each binding class as a binary task: active vs inactive), as well as effectors (binding vs non-binding). As needed, we constrained these models to the applicability domain within each receptor, according to the number of chemicals available in each class of the dataset. We initially found poor predictive power for the FXR antagonist models, but this was overcome by setting stricter applicability domain criteria. For the PPARG, PPARD, and RXR models, we collapsed the agonist and antagonist classes into a single effector category because of the limited number of antagonists in the dataset. These models are thus limited to predicting only the binding of small molecules to these NRs; they cannot distinguish agonists from antagonists. However, this distinction is easily determined experimentally once binding candidates are identified, and such testing is much more tractable with a computationally shortlisted set than with the whole set of chemicals. For the other NRs, which have more abundant data, our predictions are well validated and robust.

 

Table S3: Comparison of the performance of the different classifiers on the validation set for the binding types of ERA, using ECFP4 fingerprints as input features.

| Class | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| Agonist | Super learner | 0.75 | 0.57 | 0.93 | 0.52 | 0.66 | 106 | 948 | 80 | 66 |
| Agonist | Support vector machine | 0.77 | 0.65 | 0.89 | 0.49 | 0.64 | 120 | 903 | 66 | 111 |
| Antagonist | Super learner | 0.75 | 0.58 | 0.92 | 0.47 | 0.63 | 81 | 948 | 59 | 81 |
| Antagonist | Support vector machine | 0.74 | 0.57 | 0.91 | 0.44 | 0.60 | 80 | 937 | 60 | 92 |
| Binder | Super learner | 0.94 | 0.91 | 0.97 | 0.86 | 0.95 | 254 | 951 | 26 | 34 |
| Binder | Support vector machine | 0.93 | 0.90 | 0.96 | 0.85 | 0.95 | 252 | 945 | 28 | 40 |

 

 

Table S4: Comparison of the performance of the different classifiers on the validation set for the binding types of ERB, using ECFP4 fingerprints as input features.

| Class | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| Agonist | Support vector machine | 0.94 | 0.91 | 0.97 | 0.71 | 0.84 | 58 | 1112 | 6 | 40 |
| Agonist | Super learner | 0.95 | 0.92 | 0.97 | 0.77 | 0.92 | 59 | 1123 | 5 | 29 |
| Antagonist | Support vector machine | 0.77 | 0.66 | 0.89 | 0.42 | 0.58 | 62 | 911 | 32 | 113 |
| Antagonist | Super learner | 0.77 | 0.62 | 0.92 | 0.45 | 0.61 | 58 | 940 | 36 | 84 |
| Binder | Super learner | 0.99 | 0.99 | 0.99 | 0.95 | 1.00 | 224 | 1114 | 3 | 15 |
| Binder | Support vector machine | 0.99 | 0.99 | 0.98 | 0.95 | 0.99 | 225 | 1111 | 2 | 18 |

Table S5: Comparison of the performance of the different classifiers on the validation set for the binding types of ERA, using MACCS key fingerprints as input features.

| Class | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| Agonist | Super learner | 0.74 | 0.59 | 0.90 | 0.47 | 0.62 | 109 | 915 | 77 | 99 |
| Agonist | Support vector machine | 0.74 | 0.61 | 0.87 | 0.43 | 0.60 | 114 | 878 | 72 | 136 |
| Antagonist | Super learner | 0.75 | 0.60 | 0.91 | 0.45 | 0.62 | 84 | 932 | 56 | 97 |
| Antagonist | Support vector machine | 0.77 | 0.66 | 0.88 | 0.46 | 0.60 | 93 | 910 | 47 | 119 |
| Binder | Super learner | 0.93 | 0.91 | 0.96 | 0.85 | 0.94 | 255 | 942 | 25 | 43 |
| Binder | Support vector machine | 0.93 | 0.91 | 0.95 | 0.84 | 0.95 | 256 | 936 | 24 | 49 |

 

 

Table S6: Comparison of the performance of the different classifiers on the validation set for the binding types of ERB, using MACCS key fingerprints as input features.

| Class | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| Agonist | Support vector machine | 0.93 | 0.91 | 0.96 | 0.68 | 0.78 | 58 | 1103 | 6 | 49 |
| Agonist | Super learner | 0.94 | 0.92 | 0.96 | 0.69 | 0.90 | 59 | 1102 | 5 | 50 |
| Antagonist | Super learner | 0.80 | 0.68 | 0.91 | 0.48 | 0.66 | 64 | 936 | 30 | 88 |
| Antagonist | Support vector machine | 0.85 | 0.85 | 0.86 | 0.49 | 0.54 | 80 | 879 | 14 | 145 |
| Binder | Support vector machine | 0.98 | 0.98 | 0.98 | 0.93 | 0.99 | 223 | 1104 | 4 | 25 |
| Binder | Super learner | 0.98 | 0.98 | 0.98 | 0.93 | 0.98 | 223 | 1105 | 4 | 24 |

 

 

 

Table S7: Comparison of the performance of the different classifiers on the validation set for ERA.

| Fingerprint | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| ECFP4 | Super learner | 0.80 | 0.65 | 0.94 | 0.64 | 0.83 | 300 | 937 | 164 | 56 |
| ECFP4 | Support vector machine | 0.80 | 0.63 | 0.96 | 0.66 | 0.83 | 293 | 954 | 171 | 39 |
| MACCS | Super learner | 0.79 | 0.68 | 0.91 | 0.61 | 0.73 | 314 | 904 | 150 | 89 |
| MACCS | Support vector machine | 0.80 | 0.69 | 0.91 | 0.62 | 0.82 | 321 | 903 | 143 | 90 |

 

 

Table S8: Comparison of the performance of the different classifiers on the validation set for ERB.

| Fingerprint | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| ECFP4 | Super learner | 0.88 | 0.79 | 0.98 | 0.81 | 0.89 | 256 | 1074 | 68 | 25 |
| ECFP4 | Support vector machine | 0.89 | 0.79 | 0.98 | 0.82 | 0.90 | 256 | 1082 | 68 | 17 |
| MACCS | Super learner | 0.88 | 0.81 | 0.96 | 0.79 | 0.89 | 261 | 1059 | 63 | 40 |
| MACCS | Support vector machine | 0.88 | 0.79 | 0.97 | 0.78 | 0.88 | 255 | 1061 | 69 | 38 |

 

 

 

Table S9: Comparison of the performance of the different classifiers on the validation set for the binding types of FXR, using ECFP4 fingerprints as input features.

| Class | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| Agonist | Super learner | 0.92 | 0.86 | 0.99 | 0.84 | 0.91 | 95 | 1036 | 16 | 15 |
| Agonist | Support vector machine | 0.90 | 0.82 | 0.99 | 0.82 | 0.89 | 91 | 1036 | 20 | 15 |
| Antagonist | Super learner | 0.75 | 0.58 | 0.92 | 0.36 | 0.49 | 28 | 899 | 20 | 73 |
| Antagonist | Support vector machine | 0.74 | 0.56 | 0.91 | 0.31 | 0.39 | 27 | 883 | 21 | 89 |
| Binder | Support vector machine | 0.94 | 0.90 | 0.99 | 0.89 | 0.91 | 115 | 1047 | 13 | 11 |
| Binder | Super learner | 0.94 | 0.89 | 0.99 | 0.89 | 0.92 | 114 | 1046 | 14 | 12 |

 

 

 

Table S10: Comparison of the performance of the different classifiers on the validation set for the binding types of FXR, using MACCS key fingerprints as input features.

| Class | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| Agonist | Super learner | 0.91 | 0.84 | 0.98 | 0.80 | 0.88 | 93 | 1027 | 18 | 24 |
| Agonist | Support vector machine | 0.89 | 0.80 | 0.98 | 0.80 | 0.83 | 89 | 1034 | 22 | 17 |
| Antagonist | Support vector machine | 0.79 | 0.71 | 0.88 | 0.35 | 0.43 | 34 | 856 | 14 | 116 |
| Antagonist | Super learner | 0.76 | 0.58 | 0.95 | 0.42 | 0.51 | 28 | 920 | 20 | 52 |
| Binder | Super learner | 0.94 | 0.89 | 0.98 | 0.86 | 0.87 | 114 | 1040 | 14 | 18 |
| Binder | Support vector machine | 0.94 | 0.89 | 0.98 | 0.85 | 0.91 | 114 | 1037 | 14 | 21 |

 

 

Table S11: Comparison of the performance of the different classifiers on the validation set for the binding types of PPARG, using ECFP4 fingerprints as input features.

| Class | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| Agonist | Super learner | 0.94 | 0.89 | 0.98 | 0.89 | 0.95 | 276 | 1021 | 34 | 16 |
| Agonist | Support vector machine | 0.94 | 0.88 | 0.99 | 0.90 | 0.95 | 273 | 1026 | 37 | 11 |
| Antagonist | Super learner | 0.67 | 0.50 | 0.84 | 0.16 | 0.17 | 19 | 889 | 19 | 171 |
| Antagonist | Support vector machine | 0.65 | 0.55 | 0.74 | 0.12 | 0.14 | 21 | 783 | 17 | 277 |
| Binder | Super learner | 0.93 | 0.89 | 0.98 | 0.88 | 0.95 | 332 | 1077 | 43 | 21 |
| Binder | Support vector machine | 0.93 | 0.87 | 0.99 | 0.89 | 0.95 | 328 | 1083 | 47 | 15 |

 

 

 

 

Table S12: Comparison of the performance of the different classifiers on the validation set for the binding types of PPARG, using MACCS key fingerprints as input features.

| Class | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| Agonist | Super learner | 0.93 | 0.88 | 0.98 | 0.87 | 0.94 | 272 | 1012 | 38 | 25 |
| Agonist | Support vector machine | 0.92 | 0.86 | 0.98 | 0.86 | 0.94 | 266 | 1017 | 44 | 20 |
| Antagonist | Super learner | 0.69 | 0.55 | 0.83 | 0.18 | 0.19 | 21 | 882 | 17 | 178 |
| Antagonist | Support vector machine | 0.70 | 0.61 | 0.80 | 0.18 | 0.11 | 23 | 843 | 15 | 217 |
| Binder | Super learner | 0.93 | 0.88 | 0.98 | 0.88 | 0.94 | 330 | 1077 | 45 | 21 |
| Binder | Support vector machine | 0.93 | 0.88 | 0.99 | 0.89 | 0.95 | 329 | 1083 | 46 | 15 |

 

Table S13: Comparison of the performance of the different classifiers on the validation set for the binding types of GR, using ECFP4 fingerprints as input features.

| Class | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| Agonist | Super learner | 0.95 | 0.91 | 0.99 | 0.92 | 0.95 | 150 | 1061 | 15 | 7 |
| Agonist | Support vector machine | 0.94 | 0.90 | 0.99 | 0.89 | 0.94 | 149 | 1054 | 16 | 14 |
| Antagonist | Super learner | 0.89 | 0.81 | 0.96 | 0.77 | 0.89 | 146 | 871 | 35 | 33 |
| Antagonist | Support vector machine | 0.88 | 0.80 | 0.95 | 0.74 | 0.87 | 144 | 863 | 37 | 41 |
| Binder | Super learner | 0.97 | 0.94 | 0.99 | 0.95 | 0.98 | 342 | 1055 | 20 | 9 |
| Binder | Support vector machine | 0.97 | 0.94 | 1.00 | 0.96 | 0.98 | 342 | 1061 | 20 | 3 |

 

Table S14: Comparison of the performance of the different classifiers on the validation set for the binding types of GR, using MACCS key fingerprints as input features.

| Class | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| Agonist | Super learner | 0.95 | 0.92 | 0.98 | 0.89 | 0.95 | 151 | 1050 | 14 | 18 |
| Agonist | Support vector machine | 0.95 | 0.92 | 0.99 | 0.91 | 0.94 | 151 | 1056 | 14 | 12 |
| Antagonist | Super learner | 0.89 | 0.83 | 0.95 | 0.77 | 0.89 | 151 | 862 | 30 | 42 |
| Antagonist | Support vector machine | 0.89 | 0.83 | 0.95 | 0.75 | 0.88 | 151 | 855 | 30 | 49 |
| Binder | Super learner | 0.97 | 0.95 | 0.99 | 0.94 | 0.97 | 343 | 1052 | 19 | 12 |
| Binder | Support vector machine | 0.97 | 0.94 | 0.99 | 0.95 | 0.98 | 342 | 1056 | 20 | 8 |

 

Table S15: Comparison of the performance of the different classifiers on the validation set for the binding types of PR, using ECFP4 fingerprints as input features.

| Class | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| Agonist | Super learner | 0.96 | 0.93 | 0.99 | 0.90 | 0.93 | 80 | 1114 | 6 | 10 |
| Agonist | Support vector machine | 0.96 | 0.93 | 0.99 | 0.87 | 0.93 | 80 | 1108 | 6 | 16 |
| Antagonist | Super learner | 0.83 | 0.76 | 0.91 | 0.65 | 0.83 | 199 | 795 | 63 | 81 |
| Antagonist | Support vector machine | 0.83 | 0.78 | 0.88 | 0.63 | 0.83 | 204 | 774 | 58 | 102 |
| Binder | Super learner | 0.98 | 0.96 | 0.99 | 0.95 | 0.97 | 238 | 1012 | 9 | 10 |
| Binder | Support vector machine | 0.98 | 0.97 | 0.99 | 0.96 | 0.99 | 240 | 1012 | 7 | 10 |

 

 

Table S16: Comparison of the performance of the different classifiers on the validation set for the binding types of PR, using MACCS key fingerprints as input features.

| Class | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| Agonist | Super learner | 0.96 | 0.93 | 0.98 | 0.87 | 0.92 | 80 | 1107 | 6 | 17 |
| Agonist | Support vector machine | 0.96 | 0.93 | 0.98 | 0.85 | 0.87 | 80 | 1103 | 6 | 21 |
| Antagonist | Super learner | 0.85 | 0.80 | 0.91 | 0.68 | 0.82 | 210 | 795 | 52 | 81 |
| Antagonist | Support vector machine | 0.86 | 0.84 | 0.87 | 0.66 | 0.82 | 221 | 765 | 41 | 111 |
| Binder | Super learner | 0.98 | 0.97 | 0.99 | 0.95 | 0.99 | 239 | 1010 | 8 | 12 |
| Binder | Support vector machine | 0.97 | 0.96 | 0.99 | 0.95 | 0.99 | 237 | 1011 | 10 | 11 |

 

 

Table S17: Comparison of the performance of the different classifiers on the validation set for the FXR effector dataset.

| Fingerprint | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| ECFP4 | Super learner | 0.86 | 0.74 | 0.97 | 0.75 | 0.85 | 129 | 1021 | 46 | 27 |
| ECFP4 | Support vector machine | 0.83 | 0.67 | 0.99 | 0.76 | 0.85 | 117 | 1039 | 58 | 9 |
| MACCS | Super learner | 0.85 | 0.75 | 0.95 | 0.68 | 0.82 | 131 | 995 | 44 | 53 |
| MACCS | Support vector machine | 0.84 | 0.71 | 0.97 | 0.71 | 0.83 | 125 | 1015 | 50 | 33 |

 

 

Table S18: Comparison of the performance of the different classifiers on the validation set for the GR effector dataset.

| Fingerprint | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| ECFP4 | Super learner | 0.94 | 0.90 | 0.98 | 0.90 | 0.95 | 379 | 1036 | 44 | 16 |
| ECFP4 | Support vector machine | 0.94 | 0.89 | 0.98 | 0.90 | 0.96 | 377 | 1036 | 46 | 16 |
| MACCS | Support vector machine | 0.94 | 0.91 | 0.98 | 0.90 | 0.96 | 383 | 1030 | 40 | 22 |
| MACCS | Super learner | 0.95 | 0.91 | 0.98 | 0.90 | 0.95 | 387 | 1030 | 36 | 22 |

 

Table S19: Comparison of the performance of the different classifiers on the validation set for the PR effector dataset.

| Fingerprint | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| ECFP4 | Super learner | 0.87 | 0.77 | 0.96 | 0.77 | 0.90 | 334 | 957 | 98 | 40 |
| ECFP4 | Support vector machine | 0.86 | 0.75 | 0.97 | 0.76 | 0.91 | 325 | 963 | 107 | 34 |
| MACCS | Super learner | 0.88 | 0.81 | 0.94 | 0.77 | 0.92 | 352 | 939 | 80 | 58 |
| MACCS | Support vector machine | 0.87 | 0.80 | 0.95 | 0.76 | 0.88 | 345 | 944 | 87 | 53 |

 

Table S20: Comparison of the performance of the different classifiers on the validation set for the PPARD effector dataset.

| Fingerprint | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| ECFP4 | Super learner | 0.94 | 0.88 | 0.99 | 0.91 | 0.95 | 149 | 1143 | 21 | 6 |
| ECFP4 | Support vector machine | 0.93 | 0.85 | 1.00 | 0.91 | 0.92 | 145 | 1149 | 25 | 0 |
| MACCS | Super learner | 0.93 | 0.86 | 0.99 | 0.87 | 0.93 | 147 | 1133 | 23 | 16 |
| MACCS | Support vector machine | 0.92 | 0.84 | 0.99 | 0.87 | 0.92 | 143 | 1138 | 27 | 11 |

 

Table S21: Comparison of the performance of the different classifiers on the validation set for the PPARG effector dataset.

| Fingerprint | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| ECFP4 | Super learner | 0.91 | 0.84 | 0.98 | 0.85 | 0.93 | 355 | 1071 | 70 | 22 |
| ECFP4 | Support vector machine | 0.90 | 0.80 | 0.99 | 0.85 | 0.92 | 340 | 1086 | 85 | 7 |
| MACCS | Super learner | 0.90 | 0.84 | 0.97 | 0.83 | 0.87 | 355 | 1063 | 70 | 30 |
| MACCS | Support vector machine | 0.90 | 0.83 | 0.98 | 0.83 | 0.94 | 352 | 1066 | 73 | 27 |

 

Table S22: Comparison of the performance of the different classifiers on the validation set for the RXR effector dataset.

| Fingerprint | Method | BA | Sn | Sp | MCC | PR AUC | TP | TN | FN | FP |
|---|---|---|---|---|---|---|---|---|---|---|
| ECFP4 | Super learner | 0.90 | 0.80 | 1.00 | 0.87 | 0.92 | 160 | 913 | 41 | 2 |
| ECFP4 | Support vector machine | 0.90 | 0.80 | 1.00 | 0.87 | 0.89 | 161 | 913 | 40 | 2 |
| MACCS | Super learner | 0.90 | 0.81 | 0.99 | 0.85 | 0.91 | 162 | 906 | 39 | 9 |
| MACCS | Support vector machine | 0.91 | 0.82 | 1.00 | 0.87 | 0.91 | 164 | 911 | 37 | 4 |