rdkit.Chem.GraphDescriptors module

Calculation of topological/topochemical descriptors.

rdkit.Chem.GraphDescriptors.AvgIpc(mol, dMat=None, forceDMat=False)

This returns the average information content of the coefficients of the characteristic polynomial of the adjacency matrix of a hydrogen-suppressed graph of a molecule.

From Eq 7 of D. Bonchev & N. Trinajstic, J. Chem. Phys. vol 67, 4517-4533 (1977)

rdkit.Chem.GraphDescriptors.BalabanJ(mol, dMat=None, forceDMat=0)

Calculate Balaban’s J value for a molecule

Arguments

  • mol: a molecule

  • dMat: (optional) a distance/adjacency matrix for the molecule; if this is not provided, one will be calculated

  • forceDMat: (optional) if this is set, the distance/adjacency matrix will be recalculated regardless of whether or not _dMat_ is provided or the molecule already has one

Returns

  • a float containing the J value

We follow the notation of Balaban’s paper:

Chem. Phys. Lett. vol 89, 399-404, (1982)
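As a minimal usage sketch of the arguments described above (the molecule, isobutanol, is an arbitrary choice for illustration and does not come from the original documentation), the following computes J once with default settings and once with forceDMat set, which asks for the matrix to be recalculated. For an unmodified molecule the two results are expected to agree.

```python
from rdkit import Chem
from rdkit.Chem import GraphDescriptors

# Isobutanol: an arbitrary example molecule chosen for illustration.
mol = Chem.MolFromSmiles("CC(C)CO")

# Default call: the distance matrix is calculated (or reused if the
# molecule already carries one).
j_default = GraphDescriptors.BalabanJ(mol)

# forceDMat=True requests recalculation of the matrix.
j_forced = GraphDescriptors.BalabanJ(mol, forceDMat=True)

print(f"BalabanJ (default):   {j_default:.4f}")
print(f"BalabanJ (forceDMat): {j_forced:.4f}")
```

Forcing recalculation is mainly of interest when the molecule has been edited after a matrix was first computed for it.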

rdkit.Chem.GraphDescriptors.BertzCT(mol, cutoff=100, dMat=None, forceDMat=1)

A topological index meant to quantify “complexity” of molecules.

Consists of a sum of two terms, one representing the complexity of the bonding, the other representing the complexity of the distribution of heteroatoms.

From S. H. Bertz, J. Am. Chem. Soc., vol 103, 3599-3601 (1981)

“cutoff” is an integer value used to limit the computational expense. A cutoff value tells the program to consider vertices topologically identical if their distance vectors (sets of distances to all other vertices) are equal out to the “cutoff”th nearest-neighbor.

NOTE The original implementation had the following comment:

> this implementation treats aromatic rings as the corresponding Kekule structure with alternating bonds, for purposes of counting “connections”.

Upon further thought, this is the WRONG thing to do. It results in the possibility of a molecule giving two different CT values depending on the kekulization. For example, in the old implementation, these two SMILES:

  CC2=CN=C1C3=C(C(C)=C(C=N3)C)C=CC1=C2C
  CC3=CN=C2C1=NC=C(C)C(C)=C1C=CC2=C3C

which correspond to different Kekule forms, yield different values.

The new implementation uses consistent (aromatic) bond orders for aromatic bonds.

THIS MEANS THAT THIS IMPLEMENTATION IS NOT BACKWARDS COMPATIBLE.

Any molecule containing aromatic rings will yield different values with this implementation. The new behavior is the correct one, so we’re going to live with the breakage.
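The Kekule-independence described in this note can be checked with a short sketch. It parses the two SMILES quoted in the note and compares their BertzCT values. Default arguments are used throughout; the numerical tolerance is an assumption made here to absorb floating-point differences from atom ordering.

```python
from rdkit import Chem
from rdkit.Chem import GraphDescriptors

# The two Kekule forms quoted in the note above.
kekule_a = "CC2=CN=C1C3=C(C(C)=C(C=N3)C)C=CC1=C2C"
kekule_b = "CC3=CN=C2C1=NC=C(C)C(C)=C1C=CC2=C3C"

mol_a = Chem.MolFromSmiles(kekule_a)
mol_b = Chem.MolFromSmiles(kekule_b)

ct_a = GraphDescriptors.BertzCT(mol_a)
ct_b = GraphDescriptors.BertzCT(mol_b)

print(f"BertzCT (form A): {ct_a:.4f}")
print(f"BertzCT (form B): {ct_b:.4f}")

# Tolerance is an illustrative choice, not part of the RDKit API.
same = abs(ct_a - ct_b) < 1e-6
print("Values agree:", same)
```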

NOTE this fails if the molecule contains a second (or nth) fragment that is one atom.

rdkit.Chem.GraphDescriptors.Chi0(mol)

From equations (1),(9) and (10) of Rev. Comp. Chem. vol 2, 367-422, (1991)

rdkit.Chem.GraphDescriptors.Chi0n(x)
rdkit.Chem.GraphDescriptors.Chi0v(x)
rdkit.Chem.GraphDescriptors.Chi1(mol)

From equations (1),(11) and (12) of Rev. Comp. Chem. vol 2, 367-422, (1991)
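As a rough illustration of what Chi0 and Chi1 measure, the sketch below recomputes both by hand for ethanol and compares them with the library values. It assumes the usual simple connectivity-index definitions: Chi0 sums 1/sqrt(degree) over heavy atoms, and Chi1 sums 1/sqrt(degree_i * degree_j) over bonds. That reading of the cited equations, the example molecule, and the variable names are assumptions made for this sketch.

```python
import math

from rdkit import Chem
from rdkit.Chem import GraphDescriptors

# Ethanol: heavy-atom degrees are 1 (C), 2 (C), 1 (O).
mol = Chem.MolFromSmiles("CCO")

# Hand-computed zeroth-order index: sum over atoms of 1/sqrt(degree).
chi0_manual = sum(
    1.0 / math.sqrt(atom.GetDegree())
    for atom in mol.GetAtoms()
    if atom.GetDegree() > 0
)

# Hand-computed first-order index: sum over bonds of
# 1/sqrt(degree of begin atom * degree of end atom).
chi1_manual = sum(
    1.0 / math.sqrt(bond.GetBeginAtom().GetDegree() * bond.GetEndAtom().GetDegree())
    for bond in mol.GetBonds()
)

chi0_lib = GraphDescriptors.Chi0(mol)
chi1_lib = GraphDescriptors.Chi1(mol)

print(f"Chi0: manual={chi0_manual:.4f} library={chi0_lib:.4f}")
print(f"Chi1: manual={chi1_manual:.4f} library={chi1_lib:.4f}")
```

For ethanol the hand calculation gives 2 + 1/sqrt(2) for Chi0 and 2/sqrt(2) for Chi1.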

rdkit.Chem.GraphDescriptors.Chi1n(x)
rdkit.Chem.GraphDescriptors.Chi1v(x)
rdkit.Chem.GraphDescriptors.Chi2n(x)
rdkit.Chem.GraphDescriptors.Chi2v(x)
rdkit.Chem.GraphDescriptors.Chi3n(x)
rdkit.Chem.GraphDescriptors.Chi3v(x)
rdkit.Chem.GraphDescriptors.Chi4n(x)
rdkit.Chem.GraphDescriptors.Chi4v(x)
rdkit.Chem.GraphDescriptors.ChiNn_(x, y)
rdkit.Chem.GraphDescriptors.ChiNv_(x, y)
rdkit.Chem.GraphDescriptors.HallKierAlpha(x)
rdkit.Chem.GraphDescriptors.Ipc(mol, avg=False, dMat=None, forceDMat=False)

This returns the information content of the coefficients of the characteristic polynomial of the adjacency matrix of a hydrogen-suppressed graph of a molecule.

‘avg = True’ returns the information content divided by the total population.

From Eq 6 of D. Bonchev & N. Trinajstic, J. Chem. Phys. vol 67, 4517-4533 (1977)
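A short sketch of how the avg flag relates Ipc to AvgIpc. The molecule (toluene) is an arbitrary example, and the expectation that Ipc with avg=True matches AvgIpc is inferred from the descriptions of the two functions rather than stated outright in the documentation.

```python
from rdkit import Chem
from rdkit.Chem import GraphDescriptors

# Toluene: an arbitrary example molecule chosen for illustration.
mol = Chem.MolFromSmiles("Cc1ccccc1")

# Total information content of the characteristic-polynomial coefficients.
ipc_total = GraphDescriptors.Ipc(mol)

# Information content divided by the total population.
ipc_avg = GraphDescriptors.Ipc(mol, avg=True)

# The dedicated function for the averaged value.
avg_ipc = GraphDescriptors.AvgIpc(mol)

print(f"Ipc:            {ipc_total:.4f}")
print(f"Ipc(avg=True):  {ipc_avg:.4f}")
print(f"AvgIpc:         {avg_ipc:.4f}")
```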

rdkit.Chem.GraphDescriptors.Kappa1(x)
rdkit.Chem.GraphDescriptors.Kappa2(x)
rdkit.Chem.GraphDescriptors.Kappa3(x)
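The functions listed above with a bare x argument carry no description here. The sketch below assumes that x is an RDKit molecule, as it is for the other descriptors in this module, and calls HallKierAlpha and the three Kappa functions on an arbitrary example molecule (isobutanol). No particular values are implied.

```python
from rdkit import Chem
from rdkit.Chem import GraphDescriptors

# Isobutanol: an arbitrary example molecule chosen for illustration.
mol = Chem.MolFromSmiles("CC(C)CO")

# Assumption: the "x" argument in the signatures above is a molecule.
results = {
    "HallKierAlpha": GraphDescriptors.HallKierAlpha(mol),
    "Kappa1": GraphDescriptors.Kappa1(mol),
    "Kappa2": GraphDescriptors.Kappa2(mol),
    "Kappa3": GraphDescriptors.Kappa3(mol),
}

for name, value in results.items():
    print(f"{name}: {value:.4f}")
```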