Package rdkit :: Package ML :: Package Descriptors :: Module CompoundDescriptors :: Class CompoundDescriptorCalculator

Class CompoundDescriptorCalculator


Descriptors.DescriptorCalculator --+
                                   |
                                  CompoundDescriptorCalculator

Used for calculating descriptors.

This is the central point for descriptor calculation.

**Notes**

- There are two kinds of descriptors this cares about:

   1) *Simple Descriptors* can be calculated solely using atomic descriptor
      values and the composition of the compound.  The full list of possible
      simple descriptors is determined by the types of *Calculator Methods*
      (see below) and the contents of an atomic database.

      Simple Descriptors can be marked as *nonZeroDescriptors*.  These are used
      to winnow out atom types where particular atomic descriptors are zero
      (usually indicating that the value is unknown)

      Simple Descriptors are maintained locally in the _simpleList_

   2) *Compound Descriptors* may rely upon more complicated computation schemes
      and descriptors for the compound as a whole (e.g. structural variables, etc.).
      The full list of compound descriptors is limitless.  They are calculated using
      the _ML.Descriptors.Parser_ module.

      Compound Descriptors are maintained locally in the _compoundList_

- This class has some special methods which are labelled as *Calculator Method*.
  These take atomic descriptors and reduce them to a single simple descriptor
  value for a composition.  They are primarily intended for internal use.

- A *composition vector* is a list of 2-tuples: '[(atom1name,atom1Num),...]'
  where atom1Num is the contribution of the atom to the stoichiometry of the
  compound. No assumption is made about the stoichiometries (i.e. they don't
  have to be either integral or all sum to one).
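
As a minimal illustration of the composition vector format described above, the
sketch below builds a few by hand. The atom names and amounts are made up for
illustration, and _total_amount_ is a hypothetical helper, not part of this class.

```python
# A composition vector is a list of (atom name, amount) 2-tuples.
# All values below are illustrative only.
integral = [("Fe", 2.0), ("O", 3.0)]        # integral amounts
fractional = [("Fe", 0.4), ("O", 0.6)]      # amounts that sum to one
arbitrary = [("La", 1.85), ("Sr", 0.15), ("Cu", 1.0), ("O", 4.0)]  # neither


def total_amount(composition):
    """Hypothetical helper: add up the amounts in a composition vector."""
    return sum(amount for _, amount in composition)
```

All three forms are valid inputs, since no normalization of the amounts is assumed.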

Instance Methods
 
SUM(self, desc, compos)
*Calculator Method*
 
MEAN(self, desc, compos)
*Calculator Method*
 
DEV(self, desc, compos)
*Calculator Method*
 
MIN(self, desc, compos)
*Calculator Method*
 
MAX(self, desc, compos)
*Calculator Method*
 
ProcessSimpleList(self)
Handles the list of simple descriptors
 
ProcessCompoundList(self)
Adds entries from the _compoundList_ to the list of _requiredDescriptors_
 
BuildAtomDict(self)
builds the local atomic dict
 
CalcSimpleDescriptorsForComposition(self, compos='', composList=None)
calculates all simple descriptors for a given composition
 
CalcCompoundDescriptorsForComposition(self, compos='', composList=None, propDict={})
calculates all compound descriptors for a given composition
 
CalcDescriptorsForComposition(self, composVect, propDict)
calculates all descriptors for a given composition
 
CalcDescriptors(self, composVect, propDict)
calculates all descriptors for a given composition
 
GetDescriptorNames(self)
returns a list of the names of the descriptors this calculator generates
 
__init__(self, simpleList, compoundList=None, dbName=None, dbTable='atomic_data', dbUser='sysdba', dbPassword='masterkey')
Constructor

Inherited from Descriptors.DescriptorCalculator: SaveState, ShowDescriptors

Method Details

SUM(self, desc, compos)

*Calculator Method*

sums the descriptor values across the composition

**Arguments**

  - desc: the name of the descriptor

  - compos: the composition vector

**Returns**

  a float

MEAN(self, desc, compos)

*Calculator Method*

averages the descriptor values across the composition

**Arguments**

  - desc: the name of the descriptor

  - compos: the composition vector

**Returns**

  a float
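
A minimal sketch of how a sum and a mean over a composition could be computed,
assuming each atom's descriptor value is weighted by its amount in the
composition vector (the text above does not state the weighting). _ATOM_DICT_,
the descriptor name "val", and both function names are hypothetical.

```python
# Hypothetical atomic data: {atom name: {descriptor name: value}}.
# The numbers are made up so the arithmetic is easy to follow.
ATOM_DICT = {
    "A": {"val": 2.0},
    "B": {"val": 4.0},
}


def weighted_sum(desc, compos, atom_dict=ATOM_DICT):
    """Sum of descriptor values, each multiplied by the atom's amount."""
    return sum(atom_dict[atom][desc] * num for atom, num in compos)


def weighted_mean(desc, compos, atom_dict=ATOM_DICT):
    """Weighted sum divided by the total amount in the composition."""
    total = sum(num for _, num in compos)
    return weighted_sum(desc, compos, atom_dict) / total
```

For the composition `[("A", 2), ("B", 3)]` this gives a sum of 16.0 and a mean of 3.2.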

DEV(self, desc, compos)

*Calculator Method*

average deviation of the descriptor values across the composition

**Arguments**

  - desc: the name of the descriptor

  - compos: the composition vector

**Returns**

  a float
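
One way to read "average deviation" is the amount-weighted mean of the absolute
differences from the weighted mean. The sketch below follows that reading, which
is an assumption; _ATOM_DICT_, "val", and the function name are hypothetical.

```python
# Hypothetical atomic data with made-up values.
ATOM_DICT = {
    "A": {"val": 2.0},
    "B": {"val": 4.0},
}


def average_deviation(desc, compos, atom_dict=ATOM_DICT):
    """Amount-weighted mean absolute deviation from the weighted mean."""
    total = sum(num for _, num in compos)
    mean = sum(atom_dict[atom][desc] * num for atom, num in compos) / total
    spread = sum(abs(atom_dict[atom][desc] - mean) * num for atom, num in compos)
    return spread / total
```

For `[("A", 2), ("B", 3)]` the weighted mean is 3.2, so the result is
(1.2 * 2 + 0.8 * 3) / 5 = 0.96.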

MIN(self, desc, compos)

*Calculator Method*

minimum of the descriptor values across the composition

**Arguments**

  - desc: the name of the descriptor

  - compos: the composition vector

**Returns**

  a float

MAX(self, desc, compos)

*Calculator Method*

maximum of the descriptor values across the composition

**Arguments**

  - desc: the name of the descriptor

  - compos: the composition vector

**Returns**

  a float
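
A minimal sketch of the minimum and maximum reductions, assuming the amounts in
the composition vector do not affect the result (the text above does not say
either way). _ATOM_DICT_, "val", and the function names are hypothetical.

```python
# Hypothetical atomic data with made-up values.
ATOM_DICT = {
    "A": {"val": 2.0},
    "B": {"val": 4.0},
    "C": {"val": 3.0},
}


def descriptor_min(desc, compos, atom_dict=ATOM_DICT):
    """Smallest descriptor value among the atoms present in the composition."""
    return min(atom_dict[atom][desc] for atom, _ in compos)


def descriptor_max(desc, compos, atom_dict=ATOM_DICT):
    """Largest descriptor value among the atoms present in the composition."""
    return max(atom_dict[atom][desc] for atom, _ in compos)
```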

ProcessSimpleList(self)

Handles the list of simple descriptors

This constructs the list of _nonZeroDescriptors_ and _requiredDescriptors_.

There's some other magic going on that I can't decipher at the moment.

ProcessCompoundList(self)

Adds entries from the _compoundList_ to the list of _requiredDescriptors_

Each compound descriptor is surveyed.  Any atomic descriptors it requires
are added to the list of _requiredDescriptors_ to be pulled from the database.
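
The survey described above can be sketched as follows, assuming each
_compoundList_ entry carries its atomic descriptor names in its second field
(as in the constructor's format notes). The function name and sample data are
hypothetical.

```python
def collect_required(required, compound_list):
    """Return `required` extended with atomic descriptors used by compound entries.

    Existing order is kept and duplicates are skipped.
    """
    result = list(required)
    for _name, atomic_names, _compound_names, _formula in compound_list:
        for atomic in atomic_names:
            if atomic not in result:
                result.append(atomic)
    return result
```

For example, starting from `["EN"]` with one compound descriptor that needs
"EN" and "mass", the result is `["EN", "mass"]`.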

BuildAtomDict(self)

builds the local atomic dict

We don't want to keep around all descriptor values for all atoms, so this
method takes care of only pulling out the descriptors in which we are
interested.

**Notes**

  - this uses _chemutils.GetAtomicData_ to actually pull the data
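
The filtering idea can be sketched without a database. The example below does
not call _chemutils.GetAtomicData_; it assumes the full atomic data is already
available as a nested dictionary. _FULL_TABLE_, its values, and
_build_atom_dict_ are hypothetical.

```python
# Hypothetical full atomic table; the values are illustrative only.
FULL_TABLE = {
    "Fe": {"mass": 55.845, "EN": 1.83, "radius": 1.26},
    "O": {"mass": 15.999, "EN": 3.44, "radius": 0.66},
}


def build_atom_dict(full_table, required):
    """Keep only the descriptors named in `required` for every atom."""
    return {
        atom: {name: values[name] for name in required}
        for atom, values in full_table.items()
    }
```

Calling `build_atom_dict(FULL_TABLE, ["EN"])` keeps a single value per atom and
drops the rest.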

CalcSimpleDescriptorsForComposition(self, compos='', composList=None)

calculates all simple descriptors for a given composition

**Arguments**

  - compos: a string representation of the composition

  - composList: a *composVect*

  The client must provide either _compos_ or _composList_.  If both are
  provided, _composList_ takes priority.

**Returns**
  the list of descriptor values

**Notes**

  - when _compos_ is provided, this uses _chemutils.SplitComposition_
    to split the composition into its individual pieces

  - if problems are encountered because of either an unknown descriptor or
    atom type, a _KeyError_ will be raised.
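
The argument handling above can be sketched as follows. _split_composition_ is
a toy stand-in for _chemutils.SplitComposition_, whose exact parsing rules are
not documented here; _ATOM_DICT_, "val", and _calc_simple_ are likewise
hypothetical, and the single weighted-sum descriptor is an assumption.

```python
import re

# Hypothetical atomic data with made-up values.
ATOM_DICT = {"Fe": {"val": 2.0}, "O": {"val": 4.0}}


def split_composition(text):
    """Toy parser: 'Fe2O3' -> [('Fe', 2.0), ('O', 3.0)]; a missing count means 1."""
    pairs = re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", text)
    return [(atom, float(num) if num else 1.0) for atom, num in pairs]


def calc_simple(compos="", compos_list=None, desc="val"):
    """Return a list with one value: the amount-weighted sum of `desc`."""
    # The composition list takes priority when both arguments are given.
    if compos_list is None:
        compos_list = split_composition(compos)
    # An unknown atom or descriptor surfaces as a KeyError from the lookup.
    return [sum(ATOM_DICT[atom][desc] * num for atom, num in compos_list)]
```

Passing an unknown atom type, such as `calc_simple(compos="Xx2")`, raises a
_KeyError_ from the dictionary lookup.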

CalcCompoundDescriptorsForComposition(self, compos='', composList=None, propDict={})

calculates all compound descriptors for a given composition

**Arguments**

  - compos: a string representation of the composition

  - composList: a *composVect*

  - propDict: a dictionary containing the properties of the composition
    as a whole (e.g. structural variables, etc.)

  The client must provide either _compos_ or _composList_.  If both are
  provided, _composList_ takes priority.

**Returns**
  the list of descriptor values

**Notes**

  - when _compos_ is provided, this uses _chemutils.SplitComposition_
    to split the composition into its individual pieces

CalcDescriptorsForComposition(self, composVect, propDict)

calculates all descriptors for a given composition

**Arguments**

  - composVect: a string representation of the composition

  - propDict: a dictionary containing the properties of the composition
    as a whole (e.g. structural variables, etc.). These are used to
    generate Compound Descriptors

**Returns**
  the list of all descriptor values

**Notes**

  - this uses _chemutils.SplitComposition_
    to split the composition into its individual pieces

CalcDescriptors(self, composVect, propDict)

calculates all descriptors for a given composition

**Arguments**

  - composVect: a string representation of the composition

  - propDict: a dictionary containing the properties of the composition
    as a whole (e.g. structural variables, etc.). These are used to
    generate Compound Descriptors

**Returns**
  the list of all descriptor values

**Notes**

  - this uses _chemutils.SplitComposition_
    to split the composition into its individual pieces

Overrides: Descriptors.DescriptorCalculator.CalcDescriptors

GetDescriptorNames(self)

returns a list of the names of the descriptors this calculator generates

    

Overrides: Descriptors.DescriptorCalculator.GetDescriptorNames

__init__(self, simpleList, compoundList=None, dbName=None, dbTable='atomic_data', dbUser='sysdba', dbPassword='masterkey')
(Constructor)

Constructor

**Arguments**

  - simpleList: list of simple descriptors to be calculated
        (see below for format)

  - compoundList: list of compound descriptors to be calculated
        (see below for format)

  - dbName: name of the atomic database to be used

  - dbTable: name of the table in _dbName_ which has atomic data

  - dbUser: user name for DB access

  - dbPassword: password for DB access

**Note**

  - format of simpleList:
     a list of 2-tuples containing:

        1) name of the atomic descriptor

        2) a list of operations on that descriptor (e.g. NonZero, Max, etc.)
           These must correspond to the *Calculator Method* names above.

  - format of compoundList:
     a list of 4-tuples containing:

        1) name of the descriptor to be calculated

        2) list of selected atomic descriptor names (define $1, $2, etc.)

        3) list of selected compound descriptor names (define $a, $b, etc.)

        4) text formula defining the calculation (see _Parser_)
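
To make the two formats concrete, the sketch below writes out one possible
_simpleList_ and _compoundList_ and checks their shapes. The descriptor names
("EN", "mass", "volume"), the upper-case operation names, and the formula
string are assumptions for illustration; see _Parser_ for the actual formula
syntax. _validate_ is a hypothetical helper, not part of this class.

```python
# simpleList: (atomic descriptor name, list of operations) 2-tuples.
simple_list = [
    ("EN", ["MEAN", "DEV"]),
    ("mass", ["SUM"]),
]

# compoundList: (name, atomic descriptor names, compound descriptor names, formula).
# In the formula, $1 refers to "EN" and $a refers to "volume".
compound_list = [
    ("EN_spread", ["EN"], ["volume"], "DEV($1) / $a"),
]


def validate(simple, compound):
    """Check entry lengths; return the number of entries in each list."""
    for entry in simple:
        if len(entry) != 2:
            raise ValueError("simpleList entries need 2 fields: %r" % (entry,))
    for entry in compound:
        if len(entry) != 4:
            raise ValueError("compoundList entries need 4 fields: %r" % (entry,))
    return len(simple), len(compound)
```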

Overrides: Descriptors.DescriptorCalculator.__init__