rdkit.Chem.rdRGroupDecomposition module

Module containing RGroupDecomposition classes and functions.

class rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment

Bases: enum

MCS = rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.MCS
NoAlignment = rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.NoAlignment
None = rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.None
names = {'MCS': rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.MCS, 'NoAlignment': rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.NoAlignment, 'None': rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.None}
values = {0: rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.NoAlignment, 1: rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.MCS}
rdkit.Chem.rdRGroupDecomposition.RGroupDecompose((AtomPairsParameters)cores, (AtomPairsParameters)mols[, (bool)asSmiles=False[, (bool)asRows=True[, (RGroupDecompositionParameters)options=<rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters object at 0x7259b41c7760>]]]) object :
Decompose a collection of molecules into their R-groups
ARGUMENTS:
  • cores: a set of cores from most to least specific.

    See RGroupDecompositionParameters for more details on how the cores can be labelled

  • mols: the molecules to be decomposed

  • asSmiles: if True return smiles strings, otherwise return molecules [default: False]

  • asRows: return the results as rows (default) otherwise return columns

RETURNS: row_or_column_results, unmatched

Row structure:

rows[idx] = {rgroup_label: molecule_or_smiles}

Column structure:

columns[rgroup_label] = [ mols_or_smiles ]

unmatched is a vector of indices in the input mols that were not matched.

C++ signature :

boost::python::api::object RGroupDecompose(boost::python::api::object,boost::python::api::object [,bool=False [,bool=True [,RDKit::RGroupDecompositionParameters=<rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters object at 0x7259b41c7760>]]])
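
A minimal sketch of the call described above, decomposing three molecules against a single benzene core with one labelled attachment point. The SMILES strings and variable names are illustrative assumptions, not part of the reference.

```python
from rdkit import Chem
from rdkit.Chem.rdRGroupDecomposition import RGroupDecompose

# Illustrative inputs: a benzene core with one labelled attachment point,
# two halobenzenes, and ethanol (which does not contain the core).
core = Chem.MolFromSmiles('[*:1]c1ccccc1')
mols = [Chem.MolFromSmiles(smi) for smi in ('Fc1ccccc1', 'Clc1ccccc1', 'CCO')]

# asSmiles=True returns SMILES strings instead of molecule objects;
# asRows=True (the default) returns one dict per matched molecule.
rows, unmatched = RGroupDecompose([core], mols, asSmiles=True, asRows=True)

for row in rows:
    print(row)  # a dict keyed by label, e.g. 'Core' and 'R1'

# Indices into `mols` of the molecules that matched no core.
print('unmatched indices:', list(unmatched))
```

Passing asRows=False instead would return the same data in the column structure, one list per R-group label.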

class rdkit.Chem.rdRGroupDecomposition.RGroupDecomposition((object)self, (AtomPairsParameters)cores)

Bases: instance

RGroupDecompositionParameters controls how the RGroupDecomposition sets labelling and matches structures.

OPTIONS:

  • RGroupCoreAlignment: can be one of RGroupCoreAlignment.NoAlignment or RGroupCoreAlignment.MCS

    If set to MCS, core labels are mapped to each other using their maximum common substructure overlap.

  • RGroupLabels: optionally set where the rgroup labels to use are encoded.

    RGroupLabels.IsotopeLabels - labels are stored on isotopes
    RGroupLabels.AtomMapLabels - labels are stored on atommaps
    RGroupLabels.MDLRGroupLabels - labels are stored on MDL R-groups
    RGroupLabels.DummyAtomLabels - labels are stored on dummy atoms
    RGroupLabels.AtomIndexLabels - use the atom index as the label
    RGroupLabels.RelabelDuplicateLabels - fix any duplicate labels
    RGroupLabels.AutoDetect - auto detect the label [default]

    Note: in all cases, any rgroups found on unlabelled atoms will be automatically labelled.

  • RGroupLabelling: choose where the rlabels are stored on the decomposition

    RGroupLabelling.AtomMap - store rgroups as atom maps (for smiles)
    RGroupLabelling.Isotope - store rgroups on the isotope
    RGroupLabelling.MDLRGroup - store rgroups as mdl rgroups (for molblocks)

    default: AtomMap | MDLRGroup

  • onlyMatchAtRGroups: only allow rgroup decomposition at the specified rgroups

  • removeAllHydrogenRGroups: remove all user-defined rgroups that only have hydrogens

  • removeAllHydrogenRGroupsAndLabels: remove all user-defined rgroups that only have hydrogens, and also remove the corresponding labels from the core

  • removeHydrogensPostMatch: remove all hydrogens from the output molecules

  • allowNonTerminalRGroups: allow labelled Rgroups of degree 2 or more

  • doTautomers: match all tautomers of a core against each input structure

  • doEnumeration: expand input cores into enumerated mol bundles

  • allowMultipleRGroupsOnUnlabelled: permit more than one rgroup to be attached to an unlabelled core atom

Construct from a molecule or sequence of molecules

C++ signature :

void __init__(_object*,boost::python::api::object)

__init__( (object)self, (AtomPairsParameters)cores, (RGroupDecompositionParameters)params) -> None :

Construct from a molecule or sequence of molecules and a parameters object

C++ signature :

void __init__(_object*,boost::python::api::object,RDKit::RGroupDecompositionParameters)

Add((RGroupDecomposition)self, (Mol)mol) int :
C++ signature :

int Add(RDKit::RGroupDecompositionHelper {lvalue},RDKit::ROMol)

GetMatchingCoreIdx((RGroupDecomposition)self, (Mol)mol[, (AtomPairsParameters)matches=None]) int :
C++ signature :

int GetMatchingCoreIdx(RDKit::RGroupDecompositionHelper {lvalue},RDKit::ROMol [,boost::python::api::object {lvalue}=None])

GetRGroupLabels((RGroupDecomposition)self) list :

Return the current list of rgroup labels found. Note: Process() must be called first.

C++ signature :

boost::python::list GetRGroupLabels(RDKit::RGroupDecompositionHelper {lvalue})

GetRGroupsAsColumns((RGroupDecomposition)self[, (bool)asSmiles=False]) dict :
Return the rgroups as columns (note: can be fed directly into a pandas DataFrame)
ARGUMENTS:
  • asSmiles: if True return smiles strings, otherwise return molecules [default: False]

Column structure:

columns[rgroup_label] = [ mols_or_smiles ]

C++ signature :

boost::python::dict GetRGroupsAsColumns(RDKit::RGroupDecompositionHelper {lvalue} [,bool=False])

GetRGroupsAsRows((RGroupDecomposition)self[, (bool)asSmiles=False]) list :
Return the rgroups as rows (note: can be fed directly into a pandas DataFrame)
ARGUMENTS:
  • asSmiles: if True return smiles strings, otherwise return molecules [default: False]

Row structure:

rows[idx] = {rgroup_label: molecule_or_smiles}

C++ signature :

boost::python::list GetRGroupsAsRows(RDKit::RGroupDecompositionHelper {lvalue} [,bool=False])

Process((RGroupDecomposition)self) bool :

Process the rgroups (must be done prior to GetRGroupsAsRows/Columns and GetRGroupLabels)

C++ signature :

bool Process(RDKit::RGroupDecompositionHelper {lvalue})

ProcessAndScore((RGroupDecomposition)self) tuple :

Process the rgroups and return the score (must be done prior to GetRGroupsAsRows/Columns and GetRGroupLabels)

C++ signature :

boost::python::tuple ProcessAndScore(RDKit::RGroupDecompositionHelper {lvalue})
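
The methods above are typically combined in an add, process, read sequence. The following is a sketch of that workflow under stated assumptions: the input SMILES are illustrative, and treating a negative return value from Add as "no core matched" is an assumption the reference text does not state.

```python
from rdkit import Chem
from rdkit.Chem.rdRGroupDecomposition import RGroupDecomposition

# Illustrative core with one labelled attachment point.
core = Chem.MolFromSmiles('[*:1]c1ccccc1')
rgd = RGroupDecomposition([core])

# Add molecules one at a time. Add returns an int; this sketch assumes
# a negative value means the molecule matched none of the cores.
smiles = ('Fc1ccccc1', 'Clc1ccccc1', 'CCO')
add_results = [rgd.Add(Chem.MolFromSmiles(smi)) for smi in smiles]

# Process() must run before any of the getters below are used.
processed = rgd.Process()

labels = rgd.GetRGroupLabels()                    # list of label strings
rows = rgd.GetRGroupsAsRows(asSmiles=True)        # [{label: smiles}, ...]
columns = rgd.GetRGroupsAsColumns(asSmiles=True)  # {label: [smiles, ...]}

print(add_results)
print(processed, labels)
print(rows)
print(columns)
```

Compared with the RGroupDecompose convenience function, the class form lets molecules be added incrementally and the results be read in both row and column layouts from the same decomposition.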

class rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters((object)self)

Bases: instance

RGroupDecompositionParameters controls how the RGroupDecomposition sets labelling and matches structures.

OPTIONS:

  • RGroupCoreAlignment: can be one of RGroupCoreAlignment.NoAlignment or RGroupCoreAlignment.MCS

    If set to MCS, core labels are mapped to each other using their maximum common substructure overlap.

  • RGroupLabels: optionally set where the rgroup labels to use are encoded.

    RGroupLabels.IsotopeLabels - labels are stored on isotopes
    RGroupLabels.AtomMapLabels - labels are stored on atommaps
    RGroupLabels.MDLRGroupLabels - labels are stored on MDL R-groups
    RGroupLabels.DummyAtomLabels - labels are stored on dummy atoms
    RGroupLabels.AtomIndexLabels - use the atom index as the label
    RGroupLabels.RelabelDuplicateLabels - fix any duplicate labels
    RGroupLabels.AutoDetect - auto detect the label [default]

    Note: in all cases, any rgroups found on unlabelled atoms will be automatically labelled.

  • RGroupLabelling: choose where the rlabels are stored on the decomposition

    RGroupLabelling.AtomMap - store rgroups as atom maps (for smiles)
    RGroupLabelling.Isotope - store rgroups on the isotope
    RGroupLabelling.MDLRGroup - store rgroups as mdl rgroups (for molblocks)

    default: AtomMap | MDLRGroup

  • onlyMatchAtRGroups: only allow rgroup decomposition at the specified rgroups

  • removeAllHydrogenRGroups: remove all user-defined rgroups that only have hydrogens

  • removeAllHydrogenRGroupsAndLabels: remove all user-defined rgroups that only have hydrogens, and also remove the corresponding labels from the core

  • removeHydrogensPostMatch: remove all hydrogens from the output molecules

  • allowNonTerminalRGroups: allow labelled Rgroups of degree 2 or more

  • doTautomers: match all tautomers of a core against each input structure

  • doEnumeration: expand input cores into enumerated mol bundles

  • allowMultipleRGroupsOnUnlabelled: permit more than one rgroup to be attached to an unlabelled core atom

Constructor, takes no arguments

C++ signature :

void __init__(_object*)

property alignment
property allowMultipleRGroupsOnUnlabelled
property allowNonTerminalRGroups
property chunkSize
property doEnumeration
property doTautomers
property gaMaximumOperations
property gaNumberOperationsWithoutImprovement
property gaNumberRuns
property gaParallelRuns
property gaPopulationSize
property gaRandomSeed
property includeTargetMolInResults
property labels
property matchingStrategy
property onlyMatchAtRGroups
property removeAllHydrogenRGroups
property removeAllHydrogenRGroupsAndLabels
property removeHydrogensPostMatch
property rgroupLabelling
property scoreMethod
property substructMatchParams
property timeout
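
The options are set as plain attributes on a parameters object, which is then passed through the options argument of RGroupDecompose (or as the second constructor argument of RGroupDecomposition). The sketch below sets only onlyMatchAtRGroups; the molecules are illustrative assumptions chosen so that one of them carries a substituent at an unlabelled position.

```python
from rdkit import Chem
from rdkit.Chem.rdRGroupDecomposition import (
    RGroupDecompose,
    RGroupDecompositionParameters,
)

params = RGroupDecompositionParameters()
# Only allow decomposition at the R-groups labelled on the core.
params.onlyMatchAtRGroups = True

# Illustrative core with a single labelled attachment point.
core = Chem.MolFromSmiles('[*:1]c1ccccc1')

# The second molecule has two substituents but the core labels only one
# position, so with onlyMatchAtRGroups it should be reported as unmatched.
mols = [Chem.MolFromSmiles(smi) for smi in ('Fc1ccccc1', 'Fc1ccccc1Cl')]

rows, unmatched = RGroupDecompose([core], mols, asSmiles=True, options=params)

print(rows)
print('unmatched indices:', list(unmatched))
```

Leaving onlyMatchAtRGroups at its default lets substituents on unlabelled core atoms be picked up as additional, automatically labelled R-groups instead.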
class rdkit.Chem.rdRGroupDecomposition.RGroupLabelling

Bases: enum

AtomMap = rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.AtomMap
Isotope = rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.Isotope
MDLRGroup = rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup
names = {'AtomMap': rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.AtomMap, 'Isotope': rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.Isotope, 'MDLRGroup': rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup}
values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.AtomMap, 2: rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.Isotope, 4: rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup}
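
The values 1, 2 and 4 are bit flags, which is why a default such as AtomMap | MDLRGroup can name two storage locations at once. A short sketch of combining the flags and looking members up through the names and values dictionaries listed above:

```python
from rdkit.Chem.rdRGroupDecomposition import RGroupLabelling

# Combine flags with bitwise OR; int() makes the numeric value explicit.
atom_map_and_mdl = int(RGroupLabelling.AtomMap) | int(RGroupLabelling.MDLRGroup)
print(atom_map_and_mdl)  # 5, i.e. 1 | 4

# All three flags together give 7, the value shown as RGroupLabelling(7)
# in the RelabelMappedDummies signature.
all_flags = (int(RGroupLabelling.AtomMap)
             | int(RGroupLabelling.Isotope)
             | int(RGroupLabelling.MDLRGroup))
print(all_flags)  # 7

# Look members up by numeric value or by name.
by_value = RGroupLabelling.values[2]
by_name = RGroupLabelling.names['Isotope']
print(by_value == by_name)  # True
```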
class rdkit.Chem.rdRGroupDecomposition.RGroupLabels

Bases: enum

AtomIndexLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomIndexLabels
AtomMapLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomMapLabels
AutoDetect = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AutoDetect
DummyAtomLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.DummyAtomLabels
IsotopeLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.IsotopeLabels
MDLRGroupLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.MDLRGroupLabels
RelabelDuplicateLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.RelabelDuplicateLabels
names = {'AtomIndexLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomIndexLabels, 'AtomMapLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomMapLabels, 'AutoDetect': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AutoDetect, 'DummyAtomLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.DummyAtomLabels, 'IsotopeLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.IsotopeLabels, 'MDLRGroupLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.MDLRGroupLabels, 'RelabelDuplicateLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.RelabelDuplicateLabels}
values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.IsotopeLabels, 2: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomMapLabels, 4: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomIndexLabels, 8: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.RelabelDuplicateLabels, 16: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.MDLRGroupLabels, 32: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.DummyAtomLabels, 255: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AutoDetect}
class rdkit.Chem.rdRGroupDecomposition.RGroupMatching

Bases: enum

Exhaustive = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Exhaustive
GA = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GA
Greedy = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Greedy
GreedyChunks = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GreedyChunks
NoSymmetrization = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.NoSymmetrization
names = {'Exhaustive': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Exhaustive, 'GA': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GA, 'Greedy': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Greedy, 'GreedyChunks': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GreedyChunks, 'NoSymmetrization': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.NoSymmetrization}
values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Greedy, 2: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GreedyChunks, 4: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Exhaustive, 8: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.NoSymmetrization, 16: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GA}
class rdkit.Chem.rdRGroupDecomposition.RGroupScore

Bases: enum

FingerprintVariance = rdkit.Chem.rdRGroupDecomposition.RGroupScore.FingerprintVariance
Match = rdkit.Chem.rdRGroupDecomposition.RGroupScore.Match
names = {'FingerprintVariance': rdkit.Chem.rdRGroupDecomposition.RGroupScore.FingerprintVariance, 'Match': rdkit.Chem.rdRGroupDecomposition.RGroupScore.Match}
values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupScore.Match, 4: rdkit.Chem.rdRGroupDecomposition.RGroupScore.FingerprintVariance}
rdkit.Chem.rdRGroupDecomposition.RelabelMappedDummies((Mol)mol[, (int)inputLabels=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling(7)[, (int)outputLabels=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup]]) None :

Relabel dummy atoms bearing an R-group mapping (as atom map number, isotope or MDLRGroup label) such that they will be displayed by the rendering code as R# rather than #*, :#, #:#, etc. By default, only the MDLRGroup label is retained on output; this may be configured through the outputLabels parameter. If there are multiple potential R-group mappings, the priority on input is Atom map number > Isotope > MDLRGroup. The inputLabels parameter configures which mappings are taken into consideration.

C++ signature :

void RelabelMappedDummies(RDKit::ROMol {lvalue} [,unsigned int=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling(7) [,unsigned int=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup]])