rdkit.ML.Data.Stats module

various statistical operations on data

rdkit.ML.Data.Stats.FormCorrelationMatrix(mat)

form and return the correlation matrix
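
To make the idea of a correlation matrix concrete, here is a minimal NumPy sketch. It assumes that rows of mat are observations and columns are variables, which this page does not state; correlation_sketch is a hypothetical helper written for illustration, not the library function.

```python
import numpy as np

def correlation_sketch(mat):
    """Hypothetical helper: Pearson correlation between the columns of mat."""
    data = np.asarray(mat, dtype=float)
    centered = data - data.mean(axis=0)            # remove column means
    cov = centered.T @ centered / (data.shape[0] - 1)
    std = np.sqrt(np.diag(cov))                    # per-column standard deviations
    return cov / np.outer(std, std)                # scale covariances to [-1, 1]

demo = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
corr = correlation_sketch(demo)
```

The diagonal of the result is 1 by construction, and the off-diagonal entries match what numpy.corrcoef(demo, rowvar=False) returns.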

rdkit.ML.Data.Stats.FormCovarianceMatrix(mat)

form and return the covariance matrix
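
As a rough illustration of what a covariance matrix contains, the sketch below computes one with plain NumPy. It assumes rows of mat are observations, columns are variables, and the sample (n - 1) normalization applies; none of this is confirmed by this page, and covariance_sketch is a hypothetical helper rather than the library function.

```python
import numpy as np

def covariance_sketch(mat):
    """Hypothetical helper: sample covariance between the columns of mat."""
    data = np.asarray(mat, dtype=float)
    n_obs = data.shape[0]
    centered = data - data.mean(axis=0)    # remove column means
    return centered.T @ centered / (n_obs - 1)

demo = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
cov = covariance_sketch(demo)
```

For this input the second column is exactly twice the first, so the variances are 1 and 4 and the covariance between the columns is 2.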

rdkit.ML.Data.Stats.GetConfidenceInterval(sd, n, level=95)

rdkit.ML.Data.Stats.MeanAndDev(vect, sampleSD=1)

returns the mean and standard deviation of a vector
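
The sketch below shows one plausible reading of the sampleSD flag, using only the standard library. It assumes that a true value selects the sample standard deviation (dividing by n - 1) and a false value selects the population form (dividing by n); this page does not document the flag, and mean_and_dev_sketch is a hypothetical helper.

```python
import statistics

def mean_and_dev_sketch(values, sample_sd=True):
    """Hypothetical helper: (mean, standard deviation) of a sequence."""
    values = [float(v) for v in values]
    if not values:
        return 0.0, 0.0
    mean = statistics.mean(values)
    if len(values) < 2:
        return mean, 0.0                   # deviation is undefined for one point
    # Assumption: sample_sd picks the n - 1 divisor, otherwise n is used.
    dev = statistics.stdev(values) if sample_sd else statistics.pstdev(values)
    return mean, dev

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, sample_dev = mean_and_dev_sketch(data)
_, population_dev = mean_and_dev_sketch(data, sample_sd=False)
```

The two deviations differ only in the divisor, so the sample value is always the larger of the two for n > 1.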

rdkit.ML.Data.Stats.PrincipalComponents(mat, reverseOrder=1)

do a principal components analysis
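
A principal components analysis can be sketched as an eigendecomposition of a symmetric matrix built from the data. The example below assumes the covariance matrix is the one decomposed and that components are returned largest eigenvalue first; this page specifies neither the matrix nor the meaning of reverseOrder, so treat pca_sketch as a hypothetical illustration only.

```python
import numpy as np

def pca_sketch(mat):
    """Hypothetical helper: eigenvalues and eigenvectors, largest first."""
    data = np.asarray(mat, dtype=float)
    centered = data - data.mean(axis=0)
    cov = centered.T @ centered / (data.shape[0] - 1)
    # eigh is intended for symmetric matrices and returns real results.
    eig_vals, eig_vects = np.linalg.eigh(cov)
    order = np.argsort(eig_vals)[::-1]     # sort by descending eigenvalue
    return eig_vals[order], eig_vects[:, order]

demo = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
eig_vals, eig_vects = pca_sketch(demo)
```

Because the demo points lie on a single line, all of the variance falls on the first component and the second eigenvalue is zero up to rounding.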

rdkit.ML.Data.Stats.R2(orig, residSum)

returns the R2 value for a set of predictions
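
The usual definition of R2 is one minus the ratio of the residual sum of squares to the total sum of squares of the original values. The sketch below assumes that residSum is the sum of squared residuals, which this page does not spell out; r2_sketch is a hypothetical helper.

```python
def r2_sketch(orig, resid_sum):
    """Hypothetical helper: R2 from original values and a residual sum of squares."""
    orig = [float(v) for v in orig]
    mean = sum(orig) / len(orig)
    total_ss = sum((v - mean) ** 2 for v in orig)   # spread around the mean
    return 1.0 - resid_sum / total_ss

observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]
# Assumption: the caller accumulates squared residuals before calling.
resid_sum = sum((o - p) ** 2 for o, p in zip(observed, predicted))
r2 = r2_sketch(observed, resid_sum)
```

Here the total sum of squares is 5 and the residual sum is 0.5, giving an R2 of 0.9.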

rdkit.ML.Data.Stats.StandardizeMatrix(mat)

Standardizes a matrix in the usual way: subtract the average, then divide by the standard deviation.

Arguments

  • mat: a numpy array

Notes

  • in addition to being returned, _mat_ is modified in place, so beware
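
The standardization step can be sketched in NumPy as follows. This version assumes the averages and deviations are taken per column with the sample (n - 1) divisor, which this page does not state, and it deliberately works on a copy so the input is left untouched; standardize_sketch is a hypothetical helper, not the library function.

```python
import numpy as np

def standardize_sketch(mat):
    """Hypothetical helper: return a standardized copy of mat."""
    data = np.array(mat, dtype=float)      # np.array copies, so mat is not modified
    data -= data.mean(axis=0)              # subtract off the column averages
    devs = data.std(axis=0, ddof=1)        # assumption: sample standard deviation
    return data / devs

original = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
standardized = standardize_sketch(original)
```

When calling the library function itself, passing mat.copy() is a simple way to keep the original data if it is needed afterwards.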

rdkit.ML.Data.Stats.TransformPoints(tFormMat, pts)

transforms a set of points using tFormMat

Arguments

  • tFormMat: a numpy array

  • pts: a list of numpy arrays (or a 2D array)

Returns

a list of numpy arrays
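
A minimal sketch of applying a transformation matrix to a set of points is shown below. It assumes a plain matrix-vector product per point with no centering or other preprocessing, which this page does not confirm; transform_points_sketch is a hypothetical helper.

```python
import numpy as np

def transform_points_sketch(tform_mat, pts):
    """Hypothetical helper: apply tform_mat to each point, return a list of arrays."""
    tform = np.asarray(tform_mat, dtype=float)
    return [tform @ np.asarray(p, dtype=float) for p in pts]

# A 90 degree counter-clockwise rotation in the plane.
rotate_90 = np.array([[0.0, -1.0], [1.0, 0.0]])
points = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
moved = transform_points_sketch(rotate_90, points)
```

Passing a 2D array for pts works the same way, since iterating over it yields one row per point.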