
Module Murtagh


Interface to the C++ Murtagh hierarchic clustering code

Functions
 
_LookupDist(dists, i, j, n)
*Internal Use Only*
 
_ToClusters(data, nPts, ia, ib, crit, isDistData=0)
*Internal Use Only*
 
ClusterData(data, nPts, method, isDistData=0)
Clusters the data points passed in and returns the cluster tree.
Variables
  WARDS = 1
  SLINK = 2
  CLINK = 3
  UPGMA = 4
  MCQUITTY = 5
  GOWER = 6
  CENTROID = 7
  methods = [('Ward\'s Minimum Variance', 1, 'Ward\'s Minimum Va...
  __package__ = 'rdkit.ML.Cluster'

Imports: Clusters, MurtaghCluster, MurtaghDistCluster, numpy


Function Details

_LookupDist(dists, i, j, n)

*Internal Use Only*

Returns the distance between points i and j in the symmetric
distance matrix _dists_.
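
As an illustration of this kind of lookup, here is a minimal sketch that assumes the packed layout described under the _isDistData_ argument of ClusterData (for i<j, d_ij is stored at index j*(j-1)/2 + i). The name `lookup_dist`, the bounds check, and the zero returned for i == j are assumptions made for the example; this is not the RDKit implementation.

```python
def lookup_dist(dists, i, j, n):
    """Illustrative lookup of d_ij in a packed symmetric distance array.

    Hypothetical helper, not the RDKit source. Assumes that for i < j
    the distance is stored at index j*(j-1)/2 + i.
    """
    if not (0 <= i < n and 0 <= j < n):
        raise IndexError("point index out of range")
    if i == j:
        return 0.0  # assumption: a point is at distance zero from itself
    if i > j:
        i, j = j, i  # the matrix is symmetric, so order the pair as i < j
    return dists[j * (j - 1) // 2 + i]


# Packed distances for n = 4 points, in the order
# (0,1), (0,2), (1,2), (0,3), (1,3), (2,3)
packed = [1.0, 2.0, 1.5, 4.0, 3.5, 2.5]
d_13 = lookup_dist(packed, 3, 1, 4)  # same entry as (1, 3)
```

Integer division (`//`) keeps the index an int under Python 3; j*(j-1) is always even, so nothing is lost.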

_ToClusters(data, nPts, ia, ib, crit, isDistData=0)

*Internal Use Only*

Converts the results of the Murtagh clustering code into
a cluster tree, which is returned in a single-entry list

ClusterData(data, nPts, method, isDistData=0)

Clusters the data points passed in and returns the cluster tree.

**Arguments**

  - data: a list of lists, an array, or a similar sequence of sequences
    holding the input data (see the discussion of the _isDistData_
    argument for the exception)

  - nPts: the number of points to be used

  - method: determines which clustering algorithm should be used.
      The defined constants for these are:
      'WARDS, SLINK, CLINK, UPGMA'
      (the module also defines MCQUITTY, GOWER and CENTROID)

  - isDistData: set this toggle when the data passed in is a
      distance matrix rather than raw data points.  Because the
      matrix is symmetric, each pair of points is stored once, in a
      flat sequence laid out so that _LookupDist (above) can
      retrieve the values:
        for i<j: d_ij = dists[j*(j-1)/2 + i]


**Returns**

  - a single-entry list containing the cluster tree
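
To make the _isDistData_ layout concrete, the following sketch packs pairwise distances in the order implied by the formula above. The helper name `pack_distances` and the choice of Euclidean distance are assumptions made for illustration; any symmetric distance could be substituted.

```python
import math


def pack_distances(points):
    """Pack pairwise distances so that, for i < j,
    d_ij ends up at index j*(j-1)/2 + i.

    Hypothetical helper for illustration; Euclidean distance is an
    assumption, not something the module requires.
    """
    dists = []
    for j in range(1, len(points)):
        for i in range(j):
            dists.append(math.dist(points[i], points[j]))
    return dists


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
dists = pack_distances(points)  # order: (0,1), (0,2), (1,2)

# With RDKit installed, the packed distances would then be passed along
# the lines of (signature as documented above):
#   tree = Murtagh.ClusterData(dists, len(points), Murtagh.UPGMA, isDistData=1)[0]
```

The outer loop runs over j and the inner loop over i < j, which reproduces the index j*(j-1)/2 + i without computing it explicitly.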


Variables Details

methods

Value:
[('Ward\'s Minimum Variance', 1, 'Ward\'s Minimum Variance'),
 ('Average Linkage', 4, 'Group Average Linkage (UPGMA)'),
 ('Single Linkage', 2, 'Single Linkage (SLINK)'),
 ('Complete Linkage', 3, 'Complete Linkage (CLINK)'),
 ('Centroid', 7, 'Centroid method')]
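
Each entry in _methods_ appears to be a (short name, method constant, description) triple. This reading is inferred from the values above and is not stated in the source. Under that assumption, a small sketch of mapping a method constant back to its description; the list is copied here so that the example runs without RDKit:

```python
# Copy of the value shown above, reproduced so the sketch is self-contained.
methods = [
    ("Ward's Minimum Variance", 1, "Ward's Minimum Variance"),
    ("Average Linkage", 4, "Group Average Linkage (UPGMA)"),
    ("Single Linkage", 2, "Single Linkage (SLINK)"),
    ("Complete Linkage", 3, "Complete Linkage (CLINK)"),
    ("Centroid", 7, "Centroid method"),
]

# Assumed field order: (short name, method constant, description).
description_by_id = {method_id: desc for _name, method_id, desc in methods}

UPGMA = 4  # matches the module-level constant listed under Variables
label = description_by_id[UPGMA]
```

MCQUITTY (5) and GOWER (6) have no entry in the list, so a lookup for either would raise a KeyError here.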