Module ClusterMols

Utility functions for clustering molecules using fingerprints.
Includes a command-line application for clustering.


Sample Usage:
  python ClusterMols.py -d data.gdb -t daylight_sig --idName="CAS_TF" -o clust1.pkl --actTable="dop_test" --actName="moa_quant"

Functions
 
GetDistanceMatrix(data, metric, isSimilarity=1)
data should be a list of tuples with fingerprints in position 1 (the remaining elements of each tuple are not used)
 
ClusterPoints(data, metric, algorithmId, haveLabels=False, haveActs=True, returnDistances=False)
 
ClusterFromDetails(details)
Returns the cluster tree

Variables
  _cvsVersion = '$Id$'
  idx1 = 0
  idx2 = 3
  __VERSION_STRING = '$Id'
  _usageDoc = '\nUsage: ClusterMols.py [args] <fName>\n\n If <f...
  __package__ = 'rdkit.Chem.Fingerprints'

Imports: DbConnect, DbInfo, DbUtils, DataUtils, Clusters, Murtagh, sys, cPickle, FingerprintMols, MolSimilarity, DataStructs, numpy, message, error


Function Details

GetDistanceMatrix(data, metric, isSimilarity=1)

data should be a list of tuples with fingerprints in position 1
(the remaining elements of each tuple are not used).

Returns the symmetric distance matrix
(see ML.Cluster.Resemblance for layout documentation).
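
As a rough illustration of the kind of result described above, the following sketch builds a flattened pairwise distance list from (label, fingerprint) tuples. It is not the RDKit implementation: fingerprints are modelled as plain Python sets of on-bit indices, `tanimoto` and `condensed_distance_matrix` are hypothetical helpers, and both the column-by-column ordering of the pairs and the `1 - similarity` conversion are assumptions. ML.Cluster.Resemblance remains the authority on the actual layout.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of on-bit indices (illustrative)."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def condensed_distance_matrix(data, metric, is_similarity=True):
    """Return pairwise distances as a flat list of length n*(n-1)/2.

    `data` is a list of tuples with the fingerprint at index 1.
    ASSUMPTION: pairs are visited column by column, (0,1), (0,2), (1,2), ...
    ASSUMPTION: a similarity value s is converted to a distance as 1 - s.
    """
    n = len(data)
    res = []
    for col in range(1, n):
        for row in range(col):
            value = metric(data[col][1], data[row][1])
            res.append(1.0 - value if is_similarity else value)
    return res


# Hypothetical input: only the element at index 1 of each tuple is read.
data = [("mol_a", {1, 2, 3}), ("mol_b", {2, 3, 4}), ("mol_c", {7, 8})]
dists = condensed_distance_matrix(data, tanimoto)
print(dists)  # [0.5, 1.0, 1.0]
```

Because the matrix is symmetric with a zero diagonal, storing only one triangle loses no information and roughly halves the memory needed.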
 


Variables Details

_usageDoc

Value:
'''
Usage: ClusterMols.py [args] <fName>

  If <fName> is provided and no tableName is specified (see below),
  data will be read from the text file <fName>.  Text files delimited
  with either commas (extension .csv) or tabs (extension .txt) are
  supported. 

...
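
The extension-based choice of delimiter described in the usage text can be sketched as follows. The helper names `delimiter_for` and `read_rows` are hypothetical and are not part of ClusterMols; the script's own file handling may differ.

```python
import csv
import io
import os


def delimiter_for(fname):
    """Pick a field delimiter from the file extension (illustrative helper)."""
    ext = os.path.splitext(fname)[1].lower()
    if ext == ".csv":
        return ","
    if ext == ".txt":
        return "\t"
    raise ValueError("unsupported extension: %r" % ext)


def read_rows(fname, handle):
    """Parse an open text handle using the delimiter implied by fname."""
    return list(csv.reader(handle, delimiter=delimiter_for(fname)))


# An in-memory stand-in for a tab-delimited .txt file.
rows = read_rows("mols.txt", io.StringIO("id\tsmiles\n1\tCCO\n"))
print(rows)  # [['id', 'smiles'], ['1', 'CCO']]
```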