rdkit.ML.Scoring.Scoring module
Scoring - Calculate rank statistics
Created by Sereina Riniker, October 2012 after a file from Peter Gedeck, Greg Landrum
- param scores: ordered list with descending similarity containing active/inactive information
- param col: column index in scores where active/inactive information is stored
- param fractions: list of fractions at which the value shall be calculated
- param alpha: exponential weight
- rdkit.ML.Scoring.Scoring.CalcAUC(scores, col)
Determines the area under the ROC curve
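As a rough illustration of what this quantity measures, the sketch below computes the area under the ROC curve for a ranked list by counting misordered active/inactive pairs. It is an independent pure-Python sketch, not RDKit's implementation; the function name `auc_from_ranked` and the sample data are hypothetical, and the list is assumed to contain no tied scores.

```python
def auc_from_ranked(scores, col):
    """Area under the ROC curve for rows already sorted best-first."""
    n_actives = sum(1 for row in scores if row[col])
    n_inactives = len(scores) - n_actives
    if n_actives == 0 or n_inactives == 0:
        raise ValueError("need at least one active and one inactive row")

    inactives_seen = 0
    misordered_pairs = 0
    for row in scores:
        if row[col]:
            # Every inactive ranked above this active is a misordered pair.
            misordered_pairs += inactives_seen
        else:
            inactives_seen += 1
    # AUC is the fraction of (active, inactive) pairs in the correct order.
    return 1.0 - misordered_pairs / (n_actives * n_inactives)


# Hypothetical data: (similarity, is_active), sorted by descending similarity.
ranked = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.5, False)]
auc = auc_from_ranked(ranked, col=1)
```

With RDKit installed, the corresponding call would be `Scoring.CalcAUC(ranked, 1)` after `from rdkit.ML.Scoring import Scoring`.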
- rdkit.ML.Scoring.Scoring.CalcBEDROC(scores, col, alpha)
BEDROC was originally defined in: Truchon, J. & Bayly, C.I. Evaluating Virtual Screening Methods: Good and Bad Metrics for the “Early Recognition” Problem. J. Chem. Inf. Model. 47, 488-508 (2007).
Arguments
- scores: 2d list or numpy array
Sample data, where scores[sample_id] = vector of sample data. The 0th index runs over samples, which must be in sorted order so that lower indexes hold “better” scores.
- col: int
Index within each sample's data that holds the true label of the sample: scores[sample_id][col] = True iff that sample is active.
- alpha: float
Hyperparameter from the original paper that controls how strongly the top of the ranked list is weighted.
- Returns
float: BEDROC score
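The role of the arguments above can be sketched in plain Python. This minimal example follows the Truchon and Bayly formulation, rescaling RIE between its minimum and maximum possible values so the result lies in [0, 1]. It is a sketch under stated assumptions, not RDKit's code: the name `bedroc_from_ranked`, the return values for the no-actives and all-actives cases, and the sample data are all hypothetical choices.

```python
import math


def bedroc_from_ranked(scores, col, alpha):
    """BEDROC for rows sorted best-first; scores[i][col] is truthy for actives."""
    n_total = len(scores)
    if n_total == 0 or alpha <= 0:
        raise ValueError("need a non-empty list and a positive alpha")
    # 1-based ranks of the active rows.
    ranks = [i + 1 for i, row in enumerate(scores) if row[col]]
    n_actives = len(ranks)
    if n_actives == 0:
        return 0.0  # assumed convention for this sketch
    if n_actives == n_total:
        return 1.0  # assumed convention: min and max RIE coincide here

    ratio = n_actives / n_total
    # Mean exponential weight over all ranks (the random expectation).
    random_mean = (1 - math.exp(-alpha)) / (n_total * (math.exp(alpha / n_total) - 1))
    rie = sum(math.exp(-alpha * r / n_total) for r in ranks) / (n_actives * random_mean)
    # RIE when all actives are ranked first (max) or last (min).
    rie_max = (1 - math.exp(-alpha * ratio)) / (ratio * (1 - math.exp(-alpha)))
    rie_min = (1 - math.exp(alpha * ratio)) / (ratio * (1 - math.exp(alpha)))
    return (rie - rie_min) / (rie_max - rie_min)


# Hypothetical data: 10 rows of (similarity, is_active) with 2 actives.
best_first = [(1.0 - 0.1 * i, i < 2) for i in range(10)]
worst_first = [(1.0 - 0.1 * i, i >= 8) for i in range(10)]
bedroc_best = bedroc_from_ranked(best_first, col=1, alpha=20.0)
bedroc_worst = bedroc_from_ranked(worst_first, col=1, alpha=20.0)
```

A ranking with all actives at the top should score close to 1 and one with all actives at the bottom close to 0; larger values of alpha concentrate the weight on the earliest ranks.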
- rdkit.ML.Scoring.Scoring.CalcEnrichment(scores, col, fractions)
Determines the enrichment factor for a set of fractions
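For readers unfamiliar with the metric, the enrichment factor at a fraction f is commonly taken as the hit rate among the top f of the ranked list divided by the hit rate over the whole list. The sketch below illustrates that definition in plain Python. It is not RDKit's implementation; the name `enrichment_factors`, the use of `math.ceil` to turn a fraction into a row count, and the sample data are assumptions made for illustration.

```python
import math


def enrichment_factors(scores, col, fractions):
    """Enrichment factor at each fraction for rows sorted best-first."""
    n_total = len(scores)
    n_actives = sum(1 for row in scores if row[col])
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty list with at least one active")

    overall_rate = n_actives / n_total
    factors = []
    for fraction in fractions:
        if not 0 < fraction <= 1:
            raise ValueError("fractions must lie in (0, 1]")
        # Assumption: round the cutoff up to a whole number of rows.
        top_k = math.ceil(fraction * n_total)
        hits = sum(1 for row in scores[:top_k] if row[col])
        factors.append((hits / top_k) / overall_rate)
    return factors


# Hypothetical data: 10 rows, actives at positions 0, 1 and 5.
ranked = [(1.0 - 0.1 * i, i in (0, 1, 5)) for i in range(10)]
efs = enrichment_factors(ranked, col=1, fractions=[0.2, 0.5, 1.0])
```

At a fraction of 1.0 the factor is always 1, since the whole list has the overall hit rate by definition.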
- rdkit.ML.Scoring.Scoring.CalcRIE(scores, col, alpha)
RIE was originally defined in: Sheridan, R.P., Singh, S.B., Fluder, E.M. & Kearsley, S.K. Protocols for Bridging the Peptide to Nonpeptide Gap in Topological Similarity Searches. J. Chem. Inf. Comp. Sci. 41, 1395-1406 (2001).
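As a hedged illustration of the metric, the sketch below computes robust initial enhancement as the mean exponential weight of the actives' ranks divided by the mean weight over all ranks, following the closed form given by Truchon and Bayly. It is not RDKit's code; the name `rie_from_ranked`, the return value of 0.0 when there are no actives, and the sample data are assumptions.

```python
import math


def rie_from_ranked(scores, col, alpha):
    """RIE for rows sorted best-first; scores[i][col] is truthy for actives."""
    n_total = len(scores)
    if n_total == 0 or alpha <= 0:
        raise ValueError("need a non-empty list and a positive alpha")
    # 1-based ranks of the active rows.
    ranks = [i + 1 for i, row in enumerate(scores) if row[col]]
    if not ranks:
        return 0.0  # assumed convention for this sketch

    # Mean weight exp(-alpha * rank / N) over the actives.
    observed = sum(math.exp(-alpha * r / n_total) for r in ranks) / len(ranks)
    # Mean of the same weight over every rank 1..N (geometric series).
    expected = (1 - math.exp(-alpha)) / (n_total * (math.exp(alpha / n_total) - 1))
    return observed / expected


# Hypothetical data: 10 rows of (similarity, is_active), 2 actives ranked first.
ranked = [(1.0 - 0.1 * i, i < 2) for i in range(10)]
rie_value = rie_from_ranked(ranked, col=1, alpha=20.0)
```

Values above 1 indicate that actives sit earlier in the list than the average rank would predict. Unlike BEDROC, the attainable range of RIE depends on the ratio of actives to total rows.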
- rdkit.ML.Scoring.Scoring.CalcROC(scores, col)
Determines a ROC curve
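To make the idea concrete, the sketch below walks down a ranked list and records the cumulative false positive rate and true positive rate after each row. It is a plain-Python illustration, not RDKit's implementation; the name `roc_points`, the choice to return two parallel lists with no (0, 0) origin point, and the sample data are assumptions.

```python
def roc_points(scores, col):
    """Return (fpr, tpr) lists with one point per row of a best-first list."""
    n_actives = sum(1 for row in scores if row[col])
    n_inactives = len(scores) - n_actives
    if n_actives == 0 or n_inactives == 0:
        raise ValueError("need at least one active and one inactive row")

    fpr, tpr = [], []
    true_pos = false_pos = 0
    for row in scores:
        # Treat every row seen so far as predicted active.
        if row[col]:
            true_pos += 1
        else:
            false_pos += 1
        tpr.append(true_pos / n_actives)
        fpr.append(false_pos / n_inactives)
    return fpr, tpr


# Hypothetical data: (similarity, is_active), sorted by descending similarity.
ranked = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.5, False)]
fpr, tpr = roc_points(ranked, col=1)
```

Plotting `tpr` against `fpr` gives the ROC curve, and integrating it with the trapezoid rule gives the area under the curve.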