2D structures were standardized and ionized at pH 7.4. This yielded 95 unique molecules, which were divided into a modeling set and an external set (70/30).
RDKIT descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 1. All molecules with a value between 1 and 2 were removed to create an impute zone (an excluded range adjacent to the threshold). RDKIT descriptors were considered suitable for building ASBT_I classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for ASBT_I were selected if their p-value was higher than 0.02. Correlated descriptors were removed using a correlation threshold of 0.75. The random forest (RF) algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
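The labelling rule described for this model (a cutoff of 1, with values between 1 and 2 discarded) can be sketched as follows. This is only an illustration: the function name, the 0/1 class labels, which side of the cutoff counts as "active", and the treatment of the boundary values (1 and 2 are kept here) are all assumptions, since the text does not specify them.

```python
def assign_classes(values, cutoff=1.0, zone_upper=2.0):
    """Return (index, label) pairs, skipping values strictly inside the excluded zone.

    Assumption: values <= cutoff get label 0, values >= zone_upper get label 1.
    """
    labelled = []
    for index, value in enumerate(values):
        if cutoff < value < zone_upper:
            continue  # molecule falls in the excluded zone and is dropped
        label = 0 if value <= cutoff else 1
        labelled.append((index, label))
    return labelled


# Hypothetical activity values, for illustration only.
activities = [0.2, 1.0, 1.5, 2.0, 3.7, 1.9]
labels = assign_classes(activities)
```

With these toy values, the molecules at positions 2 and 5 fall inside the zone and are removed before modeling.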
2D structures were standardized and ionized at pH 7.4. This yielded 68 unique molecules, which were divided into a modeling set and an external set (70/30).
MORDRED descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 1. All molecules with a value between 1 and 2 were removed to create an impute zone. MORDRED descriptors were considered suitable for building ASBT_S classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for ASBT_S were selected if their p-value was higher than 0.01. Correlated descriptors were removed using a correlation threshold of 0.75. The RF algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
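The repeated 70/30 division of the modeling set can be illustrated with a small standard-library sketch. It assumes that each split is drawn at random and is stratified by class (one possible reading of "stratified models"); the function name, the seeding scheme and the toy labels are hypothetical.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split sample indices into train/test, keeping class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for members in by_class.values():
        members = members[:]          # copy so the caller's data is untouched
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)


# Toy labels: 10 molecules per class; 30 splits, one per seed.
toy_labels = [0] * 10 + [1] * 10
splits = [stratified_split(toy_labels, 0.3, seed=s) for s in range(30)]
```

Each of the 30 (train, test) pairs would then feed one model-building round.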
2D structures were used without preparation. This yielded 974 unique molecules, which were divided into a modeling set and an external set (70/30).
MORDRED descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 10. All molecules with a value between 10 and 20 were removed to create an impute zone. MORDRED descriptors were considered suitable for building BBB_class classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for BBB_class were selected if their p-value was higher than 0.01. Correlated descriptors were removed using a correlation threshold of 0.95. The support vector machine (SVM) algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
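The removal of correlated descriptors at a threshold of 0.95 could look like the sketch below. The text does not say which member of a correlated pair is dropped, so this example assumes a greedy pass that keeps the first descriptor encountered and uses the absolute Pearson coefficient; the helper names and toy descriptors are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def drop_correlated(columns, threshold=0.95):
    """Keep a descriptor only if |r| <= threshold with every descriptor kept so far."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy descriptors: d2 is almost exactly 2 * d1, d3 is weakly related to d1.
toy_descriptors = {
    "d1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "d2": [2.1, 4.0, 6.2, 8.1, 9.9],
    "d3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept_descriptors = drop_correlated(toy_descriptors, threshold=0.95)
```

Zero-variance descriptors must already have been removed, as in the text, otherwise the correlation is undefined.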
2D structures were used without preparation. This yielded 974 unique molecules, which were divided into a modeling set and an external set (70/30).
MORDRED descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 10. All molecules with a value between 10 and 20 were removed to create an impute zone. MORDRED descriptors were considered suitable for building BBB_class classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for BBB_class were selected if their p-value was higher than 0.03. Correlated descriptors were removed using a correlation threshold of 0.95. The SVM algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
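The class-balancing step is not detailed in the text. One common option, shown here purely as an assumption, is random undersampling of the majority class down to the size of the minority class; the function name and toy labels are hypothetical.

```python
import random


def undersample(labels, seed=0):
    """Return sorted indices with every class reduced to the minority class size."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    n_min = min(len(members) for members in by_class.values())
    kept = []
    for members in by_class.values():
        kept.extend(rng.sample(members, n_min))  # random subset of each class
    return sorted(kept)


# Toy training labels: 8 molecules in class 0, 3 in class 1.
unbalanced_labels = [0] * 8 + [1] * 3
balanced_indices = undersample(unbalanced_labels)
```

Other strategies (oversampling, class weights) would fit the same description equally well.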
2D structures were standardized and ionized at pH 7.4. This yielded 129 unique molecules, which were divided into a modeling set and an external set (70/30).
FCFP12 descriptors (1024-bit fingerprints) were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 1. All molecules with a value between 1 and 2 were removed to create an impute zone. FCFP12 descriptors were considered suitable for building BCRP_S classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for BCRP_S were selected if their p-value was higher than 0.05. Correlated descriptors were removed using a correlation threshold of 0.65. The SVM algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
2D structures were standardized and ionized at pH 7.4. This yielded 435 unique molecules, which were divided into a modeling set and an external set (70/30).
MORDRED descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 1. All molecules with a value between 1 and 2 were removed to create an impute zone. MORDRED descriptors were considered suitable for building CYP2C9_S classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for CYP2C9_S were selected if their p-value was higher than 0.01. Correlated descriptors were removed using a correlation threshold of 0.75. The SVM algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
2D structures were standardized only. This yielded 471 unique molecules, which were divided into a modeling set and an external set (70/30).
MORDRED descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 1. All molecules with a value between 1 and 2 were removed to create an impute zone. MORDRED descriptors were considered suitable for building CYP2D6_S classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for CYP2D6_S were selected if their p-value was higher than 0.02. Correlated descriptors were removed using a correlation threshold of 0.9. The SVM algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
2D structures were standardized only. This yielded 562 unique molecules, which were divided into a modeling set and an external set (70/30).
MORDRED descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 1. All molecules with a value between 1 and 2 were removed to create an impute zone. MORDRED descriptors were considered suitable for building CYP3A4_S classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for CYP3A4_S were selected if their p-value was higher than 0.04. Correlated descriptors were removed using a correlation threshold of 0.95. The SVM algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
2D structures were used without preparation. This yielded 324 unique molecules, which were divided into a modeling set and an external set (70/30).
RDKIT descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 5. All molecules with a value between 5 and 15 were removed to create an impute zone. RDKIT descriptors were considered suitable for building Fu_plasma classification models. After removing missing values and zero-variance descriptors, z-score normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for Fu_plasma were selected if their p-value was higher than 0.03. Correlated descriptors were removed using a correlation threshold of 0.9. The SVM algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
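The zero-variance filter and z-score normalization used for this model can be sketched as below. The text only states that normalization was performed on the modeling set; reusing the fitted means and standard deviations for the external set, and using the population standard deviation, are assumptions made here. All names and values are hypothetical.

```python
from statistics import mean, pstdev


def fit_zscore(columns):
    """Learn (mean, std) per descriptor on the modeling set, dropping constant ones."""
    params = {}
    for name, values in columns.items():
        std = pstdev(values)
        if std == 0:
            continue  # zero-variance descriptor: removed before scaling
        params[name] = (mean(values), std)
    return params


def apply_zscore(columns, params):
    """Scale each retained descriptor with the parameters learned by fit_zscore."""
    return {
        name: [(value - mu) / std for value in columns[name]]
        for name, (mu, std) in params.items()
    }


# Toy descriptor tables, for illustration only.
modeling_set = {"a": [1.0, 2.0, 3.0], "const": [7.0, 7.0, 7.0]}
external_set = {"a": [4.0], "const": [7.0]}
zscore_params = fit_zscore(modeling_set)
scaled_modeling = apply_zscore(modeling_set, zscore_params)
scaled_external = apply_zscore(external_set, zscore_params)
```

Min-max scaling, used for most of the other models, follows the same fit-then-apply pattern with (min, max) in place of (mean, std).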
2D structures were used without preparation. This yielded 5988 unique molecules, which were divided into a modeling set and an external set (70/30).
MORDRED descriptors were computed to build the models. MORDRED descriptors were considered suitable for building LogD74 regression models. After removing missing values and zero-variance descriptors, z-score normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). The descriptors most correlated with LogD74 were selected if their correlation coefficient was higher than 0.4. The multiple linear regression (MLR) algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
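For regression models, the selection of descriptors by their correlation with the target (here a coefficient above 0.4) might be implemented along these lines. Whether the signed or the absolute coefficient is compared with the threshold is not stated; this sketch assumes the absolute Pearson coefficient, and the names and toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cov / sqrt(
        sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y)
    )


def select_by_target_correlation(columns, target, threshold=0.4):
    """Keep descriptors whose |r| with the target exceeds the threshold (assumption)."""
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, target)) > threshold
    ]


# Toy target and descriptors, for illustration only.
toy_target = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_columns = {
    "strong": [1.1, 1.9, 3.2, 3.9, 5.1],   # tracks the target closely
    "weak": [5.0, 1.0, 4.0, 2.0, 3.0],     # r = -0.3 with the target
}
selected = select_by_target_correlation(toy_columns, toy_target, threshold=0.4)
```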
2D structures were standardized and ionized at pH 7.4. This yielded 3931 unique molecules, which were divided into a modeling set and an external set (75/25).
RDKIT descriptors were computed to build the models. RDKIT descriptors were considered suitable for building LogPwo regression models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (75/25). The descriptors most correlated with LogPwo were selected if their correlation coefficient was higher than 0.0. The MLR algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
2D structures were standardized and ionized at pH 7.4. This yielded 1037 unique molecules, which were divided into a modeling set and an external set (75/25).
MORDRED descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 20. MORDRED descriptors were considered suitable for building LogS_EtOH classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 10 times into training and test sets (75/25). After class balancing, the most discriminant descriptors for LogS_EtOH were selected if their p-value was higher than 0.04. Correlated descriptors were removed using a correlation threshold of 0.95. The SVM algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
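Y-scrambling, one of the validation stages listed above, refits the model on randomly permuted labels and checks that performance collapses relative to the true labels. The sketch below illustrates the idea with 10 scrambling rounds. To stay dependency-free it swaps in a one-descriptor nearest-centroid rule instead of the SVM, and it scores on the fitting data; both are simplifications, and every name and value is hypothetical.

```python
import random
from statistics import mean


def centroid_accuracy(x, y):
    """Fit a nearest-centroid rule on (x, y) and return its accuracy on the same data."""
    centroid_0 = mean([v for v, label in zip(x, y) if label == 0])
    centroid_1 = mean([v for v, label in zip(x, y) if label == 1])
    predicted = [0 if abs(v - centroid_0) <= abs(v - centroid_1) else 1 for v in x]
    return sum(p == t for p, t in zip(predicted, y)) / len(y)


def y_scrambling(x, y, n_rounds=10, seed=0):
    """Return the score obtained after each random permutation of the labels."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        shuffled = y[:]
        rng.shuffle(shuffled)  # breaks the descriptor/label relationship
        scores.append(centroid_accuracy(x, shuffled))
    return scores


# Toy data: one descriptor that separates the two classes perfectly.
toy_x = [i / 10 for i in range(10)] + [2 + i / 10 for i in range(10)]
toy_y = [0] * 10 + [1] * 10
true_score = centroid_accuracy(toy_x, toy_y)
scrambled_scores = y_scrambling(toy_x, toy_y)
```

A model is usually considered free of chance correlation when the true score clearly exceeds the distribution of scrambled scores.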
2D structures were standardized and ionized at pH 7.4. This yielded 2105 unique molecules, which were divided into a modeling set and an external set (75/25).
RDKIT descriptors were computed to build the models. RDKIT descriptors were considered suitable for building LogS_H2O regression models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (75/25). The descriptors most correlated with LogS_H2O were selected if their correlation coefficient was higher than 0.0. The SVM algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
2D structures were standardized and ionized at pH 7.4. This yielded 2340 unique molecules, which were divided into a modeling set and an external set (70/30).
MORDRED descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 1. All molecules with a value between 1 and 2 were removed to create an impute zone. MORDRED descriptors were considered suitable for building MDR1_I classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for MDR1_I were selected if their p-value was higher than 0.04. Correlated descriptors were removed using a correlation threshold of 0.95. The SVM algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
2D structures were standardized and ionized at pH 7.4. This yielded 517 unique molecules, which were divided into a modeling set and an external set (75/25).
MORDRED descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 1. All molecules with a value between 1 and 2 were removed to create an impute zone. MORDRED descriptors were considered suitable for building MDR1_S classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 10 times into training and test sets (75/25). Correlated descriptors were removed using a correlation threshold of 0.95. The logistic regression (LGR) algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
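The 5-fold cross-validation step partitions the training samples into five folds and uses each fold once for validation. A minimal index generator is sketched below; the interleaved fold assignment (without shuffling or stratification) and the function name are assumptions chosen for brevity.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for held_out in range(k):
        validation = folds[held_out]
        train = [
            index
            for fold_id, fold in enumerate(folds)
            if fold_id != held_out
            for index in fold
        ]
        yield train, validation


# Toy example: 23 training molecules split into 5 folds.
cv_folds = list(k_fold_indices(23, k=5))
```

Each molecule appears in exactly one validation fold, so the five validation scores together cover the whole training set.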
2D structures were standardized and ionized at pH 7.4. This yielded 65 unique molecules, which were divided into a modeling set and an external set (70/30).
MORDRED descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 1. All molecules with a value between 1 and 2 were removed to create an impute zone. MORDRED descriptors were considered suitable for building MRP2_I classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for MRP2_I were selected if their p-value was higher than 0.04. Correlated descriptors were removed using a correlation threshold of 0.8. The RF algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
2D structures were standardized and ionized at pH 7.4. This yielded 65 unique molecules, which were divided into a modeling set and an external set (70/30).
MORDRED descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 1. All molecules with a value between 1 and 2 were removed to create an impute zone. MORDRED descriptors were considered suitable for building MRP2_I classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for MRP2_I were selected if their p-value was higher than 0.04. Correlated descriptors were removed using a correlation threshold of 0.75. The SVM algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
2D structures were standardized and ionized at pH 7.4. This yielded 148 unique molecules, which were divided into a modeling set and an external set (70/30).
RDKIT descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 1. All molecules with a value between 1 and 2 were removed to create an impute zone. RDKIT descriptors were considered suitable for building OCT1_I classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for OCT1_I were selected if their p-value was higher than 0.05. Correlated descriptors were removed using a correlation threshold of 0.95. The SVM algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
2D structures were standardized and ionized at pH 7.4. This yielded 62 unique molecules, which were divided into a modeling set and an external set (70/30).
RDKIT descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 1. All molecules with a value between 1 and 2 were removed to create an impute zone. RDKIT descriptors were considered suitable for building OCT1_S classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for OCT1_S were selected if their p-value was higher than 0.03. Correlated descriptors were removed using a correlation threshold of 0.95. The SVM algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
2D structures were standardized and ionized at pH 7.4. This yielded 64 unique molecules, which were divided into a modeling set and an external set (70/30).
MORDRED descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 1. All molecules with a value between 1 and 2 were removed to create an impute zone. MORDRED descriptors were considered suitable for building PEPT1_I classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for PEPT1_I were selected if their p-value was higher than 0.05. Correlated descriptors were removed using a correlation threshold of 0.85. The SVM algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
2D structures were standardized and ionized at pH 7.4. This yielded 110 unique molecules, which were divided into a modeling set and an external set (70/30).
MORDRED descriptors were computed to build the models. Classification categories were defined using an activity threshold (cutoff) of 1. All molecules with a value between 1 and 2 were removed to create an impute zone. MORDRED descriptors were considered suitable for building PEPT1_S classification models. After removing missing values and zero-variance descriptors, min-max normalization was applied to the modeling set.
The modeling set was successively split 30 times into training and test sets (70/30). After class balancing, the most discriminant descriptors for PEPT1_S were selected if their p-value was higher than 0.04. Correlated descriptors were removed using a correlation threshold of 0.9. The SVM algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.
2D structures were used without preparation. This yielded 447 unique molecules, which were divided into a modeling set and an external set (70/30).
CDK descriptors were computed to build the models. CDK descriptors were considered suitable for building logKhsa regression models. After removing missing values and zero-variance descriptors, z-score normalization was applied to the modeling set.
The modeling set was successively split 10 times into training and test sets (70/30). The descriptors most correlated with logKhsa were selected if their correlation coefficient was higher than 0.1. Correlated descriptors were removed using a correlation threshold of 0.95. The MLR algorithm was used to generate stratified models. Models were obtained following a 5-fold cross-validation step and validation stages (Y-scrambling, test set validation and external set validation) repeated 10 times.