rdkit.ML.Data.MLData module

Classes to help work with data sets.

class rdkit.ML.Data.MLData.MLDataSet(data, nVars=None, nPts=None, nPossibleVals=None, qBounds=None, varNames=None, ptNames=None, nResults=1)

Bases: object

A data set for holding general data (floats, ints, and strings)

Note

this is intended to be a read-only data structure (i.e. it is not meant to be modified after the constructor has been called)

AddPoint(pt)
AddPoints(pts, names)
GetAllData()

returns a copy of the data

GetInputData()

returns the input data

Note

_inputData_ means the examples without their result fields
(the last _NResults_ entries)
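
The split between input fields and result fields described above can be pictured with plain Python lists. This is only an illustrative sketch, not RDKit's implementation: the names `examples`, `n_results`, and `split_example` are hypothetical, and the exact shape of the values returned by GetInputData() and GetResults() may differ.

```python
# Plain-Python illustration of "input data" vs. "result fields".
# Hypothetical names; this does not call RDKit.
n_results = 1  # plays the role of the nResults constructor argument

# Each example holds its input fields followed by its result fields.
examples = [
    [1.2, 0, "a", 1],
    [3.4, 1, "b", 0],
]


def split_example(example, n_results):
    """Return (input fields, result fields) for one example."""
    # Computing the cut point explicitly also handles n_results == 0.
    cut = len(example) - n_results
    return example[:cut], example[cut:]


input_data = [split_example(ex, n_results)[0] for ex in examples]
results = [split_example(ex, n_results)[1] for ex in examples]

print(input_data)
print(results)
```

Here the last `n_results` entries of every example are treated as results and everything before them as input.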
GetNPossibleVals()
GetNPts()
GetNResults()
GetNVars()
GetNamedData()

returns a list of named examples

Note

a named example is the result of prepending the example
name to the data list
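
A named example, as described in the note above, can be sketched in plain Python as follows. The variables `pt_names`, `data`, and `named` are hypothetical and only illustrate the idea of prepending the name; this is not RDKit's code.

```python
# Plain-Python illustration of a "named example".
# Hypothetical names; this does not call RDKit.
pt_names = ["mol-1", "mol-2"]
data = [
    [1.2, 0, 1],
    [3.4, 1, 0],
]

# Prepend each example's name to a copy of its data list.
named = [[name] + list(row) for name, row in zip(pt_names, data)]

print(named)
```

Building a new list for each row leaves the original `data` lists unchanged.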
GetPtNames()
GetQuantBounds()
GetResults()

Returns the result fields from each example

GetVarNames()
class rdkit.ML.Data.MLData.MLQuantDataSet(data, nVars=None, nPts=None, nPossibleVals=None, qBounds=None, varNames=None, ptNames=None, nResults=1)

Bases: rdkit.ML.Data.MLData.MLDataSet

a data set for holding quantized data

Note

this is intended to be a read-only data structure (i.e. it is not meant to be modified after the constructor has been called)

Main differences from MLDataSet

  1. data are stored in a numpy array since they are homogeneous
  2. results are assumed to be quantized (i.e. no qBounds entry is required)
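
To make the idea of quantized data concrete, the sketch below bins continuous input values against per-variable bounds and leaves an already-integer result untouched. It is a generic illustration under stated assumptions, not RDKit's quantization code: the helper `quantize`, the layout of `q_bounds` (one list of bounds per input variable), and the boundary convention (a value equal to a bound falls in the lower bin) are all assumptions made for this example.

```python
from bisect import bisect_left

# Generic binning sketch; hypothetical names, does not call RDKit.


def quantize(value, bounds):
    """Return the bin index: the number of bounds strictly below value."""
    return bisect_left(bounds, value)


# Assumed layout: one sorted list of bounds per input variable.
q_bounds = [[1.0, 2.0], [1.0]]

# Each row: two continuous inputs followed by one integer result.
raw = [
    [0.5, 2.5, 1],
    [1.5, 0.1, 0],
]

quantized = []
for row in raw:
    inputs, result = row[:-1], row[-1:]
    binned = [quantize(v, b) for v, b in zip(inputs, q_bounds)]
    # The result is already an integer class label, so no bounds are needed.
    quantized.append(binned + result)

print(quantized)
```

After this step every entry is an integer, which is what makes storage in a single homogeneous array possible.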
GetAllData()

returns a copy of the data

GetInputData()

returns the input data

Note

_inputData_ means the examples without their result fields
(the last _NResults_ entries)
GetNamedData()

returns a list of named examples

Note

a named example is the result of prepending the example
name to the data list
GetResults()

Returns the result fields from each example