rdkit.Chem.rdRGroupDecomposition module

Module containing RGroupDecomposition classes and functions.

class rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment

Bases: enum

MCS = rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.MCS

NoAlignment = rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.NoAlignment

None = rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.None

names = {'MCS': rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.MCS, 'NoAlignment': rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.NoAlignment, 'None': rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.None}

values = {0: rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.NoAlignment, 1: rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.MCS}

rdkit.Chem.rdRGroupDecomposition.RGroupDecompose((AtomPairsParameters)cores, (AtomPairsParameters)mols[, (bool)asSmiles=False[, (bool)asRows=True[, (RGroupDecompositionParameters)options=<rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters object at 0x75d9f6f62e60>]]]) → object :

Decompose a collection of molecules into their R-groups

ARGUMENTS:

cores: a set of cores from most to least specific.
See RGroupDecompositionParameters for more details on how the cores can be labelled.
mols: the molecules to be decomposed
asSmiles: if True return SMILES strings, otherwise return molecules [default: False]
asRows: if True return the results as rows, otherwise return columns [default: True]
options: an RGroupDecompositionParameters object controlling the decomposition

RETURNS: row_or_column_results, unmatched

Row structure:
rows[idx] = {rgroup_label: molecule_or_smiles}

Column structure:
columns[rgroup_label] = [ mols_or_smiles ]

unmatched is a vector of indices in the input mols that were not matched.

C++ signature :

boost::python::api::object RGroupDecompose(boost::python::api::object,boost::python::api::object [,bool=False [,bool=True [,RDKit::RGroupDecompositionParameters=<rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters object at 0x75d9f6f62e60>]]])
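
The call described above can be sketched as follows. This is a minimal illustration rather than a canonical recipe: the unlabelled benzene core and the three input SMILES are assumptions chosen for the example, and the last input deliberately lacks the core so that it shows up in unmatched.

```python
from rdkit import Chem
from rdkit.Chem import rdRGroupDecomposition as rgd

# Illustrative inputs (not taken from the reference text): an unlabelled
# benzene core and three molecules, the last of which lacks the core.
core = Chem.MolFromSmiles("c1ccccc1")
smiles = ["Cc1ccccc1", "Oc1ccccc1", "CCO"]
mols = [Chem.MolFromSmiles(s) for s in smiles]

# asSmiles=True returns SMILES strings instead of molecules;
# asRows=True (the default) returns one dict per matched molecule.
rows, unmatched = rgd.RGroupDecompose([core], mols, asSmiles=True, asRows=True)

for row in rows:
    print(row)  # {rgroup_label: smiles} for one matched molecule
print("unmatched input indices:", list(unmatched))
```

Passing asRows=False instead should give the column layout, a dict keyed by R-group label whose values are lists with one entry per matched molecule.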

class rdkit.Chem.rdRGroupDecomposition.RGroupDecomposition((object)self, (AtomPairsParameters)cores)

Bases: instance

RGroupDecompositionParameters controls how the RGroupDecomposition sets labelling and matches structures.

OPTIONS:

RGroupCoreAlignment: can be one of RGroupCoreAlignment.NoAlignment or RGroupCoreAlignment.MCS.
If set to MCS, core labels are mapped to each other using their maximum common substructure overlap.

RGroupLabels: optionally set where the R-group labels to use are encoded.

RGroupLabels.IsotopeLabels - labels are stored on isotopes
RGroupLabels.AtomMapLabels - labels are stored on atom maps
RGroupLabels.MDLRGroupLabels - labels are stored on MDL R-groups
RGroupLabels.DummyAtomLabels - labels are stored on dummy atoms
RGroupLabels.AtomIndexLabels - use the atom index as the label
RGroupLabels.RelabelDuplicateLabels - fix any duplicate labels
RGroupLabels.AutoDetect - auto-detect the label [default]

Note: in all cases, any R-groups found on unlabelled atoms will be automatically labelled.

RGroupLabelling: choose where the R-group labels are stored on the decomposition.

RGroupLabelling.AtomMap - store R-groups as atom maps (for SMILES)
RGroupLabelling.Isotope - store R-groups on the isotope
RGroupLabelling.MDLRGroup - store R-groups as MDL R-groups (for molblocks)

default: AtomMap | MDLRGroup

onlyMatchAtRGroups: only allow R-group decomposition at the specified R-groups

removeAllHydrogenRGroups: remove all user-defined R-groups that only have hydrogens

removeAllHydrogenRGroupsAndLabels: remove all user-defined R-groups that only have hydrogens, and also remove the corresponding labels from the core

removeHydrogensPostMatch: remove all hydrogens from the output molecules

allowNonTerminalRGroups: allow labelled R-groups of degree 2 or more

doTautomers: match all tautomers of a core against each input structure

doEnumeration: expand input cores into enumerated mol bundles

allowMultipleRGroupsOnUnlabelled: permit more than one R-group to be attached to an unlabelled core atom

Construct from a molecule or sequence of molecules

C++ signature :
void __init__(_object*,boost::python::api::object)

__init__( (object)self, (AtomPairsParameters)cores, (RGroupDecompositionParameters)params) -> None :

Construct from a molecule or sequence of molecules and a parameters object

C++ signature :: void __init__(_object*,boost::python::api::object,RDKit::RGroupDecompositionParameters)

Add((RGroupDecomposition)self, (Mol)mol) → int :

C++ signature :: int Add(RDKit::RGroupDecompositionHelper {lvalue},RDKit::ROMol)

GetMatchingCoreIdx((RGroupDecomposition)self, (Mol)mol[, (AtomPairsParameters)matches=None]) → int :

C++ signature :: int GetMatchingCoreIdx(RDKit::RGroupDecompositionHelper {lvalue},RDKit::ROMol [,boost::python::api::object {lvalue}=None])

GetRGroupLabels((RGroupDecomposition)self) → list :

Return the current list of R-group labels found. Note: Process() must be called first.

C++ signature :: boost::python::list GetRGroupLabels(RDKit::RGroupDecompositionHelper {lvalue})

GetRGroupsAsColumns((RGroupDecomposition)self[, (bool)asSmiles=False]) → dict :

Return the R-groups as columns (note: can be fed directly into a pandas DataFrame)

ARGUMENTS:

asSmiles: if True return smiles strings, otherwise return molecules [default: False]

Column structure:
columns[rgroup_label] = [ mols_or_smiles ]

C++ signature :

boost::python::dict GetRGroupsAsColumns(RDKit::RGroupDecompositionHelper {lvalue} [,bool=False])
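
The row and column layouts hold the same information in transposed form. The pure-Python sketch below shows that relationship on placeholder data; rows_to_columns is a hypothetical helper written for this illustration and is not part of RDKit, and the placeholder strings stand in for real molecules or SMILES.

```python
def rows_to_columns(rows):
    """Transpose rows[idx] = {label: value} into columns[label] = [values].

    Hypothetical helper for illustration only. A label missing from a row
    is filled with None so that every column has one entry per row.
    """
    labels = []
    for row in rows:
        for label in row:
            if label not in labels:
                labels.append(label)  # keep first-seen label order
    return {label: [row.get(label) for row in rows] for label in labels}


# Placeholder values standing in for molecules or SMILES strings.
rows = [
    {"Core": "core_0", "R1": "r1_0"},
    {"Core": "core_1", "R1": "r1_1", "R2": "r2_1"},
]
columns = rows_to_columns(rows)
print(columns)
```

In practice GetRGroupsAsColumns already returns the column layout, so a helper like this is only useful for understanding how the two structures relate.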

GetRGroupsAsRows((RGroupDecomposition)self[, (bool)asSmiles=False]) → list :

Return the R-groups as rows (note: can be fed directly into a pandas DataFrame)

ARGUMENTS:

asSmiles: if True return smiles strings, otherwise return molecules [default: False]

Row structure:
rows[idx] = {rgroup_label: molecule_or_smiles}

C++ signature :

boost::python::list GetRGroupsAsRows(RDKit::RGroupDecompositionHelper {lvalue} [,bool=False])

Process((RGroupDecomposition)self) → bool :

Process the rgroups (must be done prior to GetRGroupsAsRows/Columns and GetRGroupLabels)

C++ signature :: bool Process(RDKit::RGroupDecompositionHelper {lvalue})

ProcessAndScore((RGroupDecomposition)self) → tuple :

Process the R-groups and return the score (must be done prior to GetRGroupsAsRows/Columns and GetRGroupLabels)

C++ signature :: boost::python::tuple ProcessAndScore(RDKit::RGroupDecompositionHelper {lvalue})
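
The methods above are meant to be called in a fixed order: Add each molecule, then Process, then read the results. A minimal sketch of that workflow follows, assuming an illustrative benzene core and three example SMILES. The comment about Add returning -1 for a molecule that matches no core reflects the behaviour I would expect from RDKit and is not stated in the reference text above.

```python
from rdkit import Chem
from rdkit.Chem import rdRGroupDecomposition as rgd

# Illustrative core and inputs; the last input does not contain the core.
core = Chem.MolFromSmiles("c1ccccc1")
decomp = rgd.RGroupDecomposition(core)

# Add() returns an int per molecule. Assumption: -1 marks an input that
# matched no core, otherwise the value is the index of the stored match.
added = [
    decomp.Add(Chem.MolFromSmiles(smi))
    for smi in ("Cc1ccccc1", "Oc1ccccc1", "CCO")
]

# Process() must run before GetRGroupLabels / GetRGroupsAsRows / AsColumns.
ok = decomp.Process()

labels = decomp.GetRGroupLabels()
rows = decomp.GetRGroupsAsRows(asSmiles=True)

print("Add results:", added)
print("processed:", ok, "labels:", list(labels))
for row in rows:
    print(row)
```

Compared with the RGroupDecompose convenience function, the class form lets molecules be added one at a time, for example while streaming them from a file.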

class rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters((object)self)

Bases: instance

RGroupDecompositionParameters controls how the RGroupDecomposition sets labelling and matches structures.

OPTIONS:

RGroupCoreAlignment: can be one of RGroupCoreAlignment.NoAlignment or RGroupCoreAlignment.MCS.
If set to MCS, core labels are mapped to each other using their maximum common substructure overlap.

RGroupLabels: optionally set where the R-group labels to use are encoded.

RGroupLabels.IsotopeLabels - labels are stored on isotopes
RGroupLabels.AtomMapLabels - labels are stored on atom maps
RGroupLabels.MDLRGroupLabels - labels are stored on MDL R-groups
RGroupLabels.DummyAtomLabels - labels are stored on dummy atoms
RGroupLabels.AtomIndexLabels - use the atom index as the label
RGroupLabels.RelabelDuplicateLabels - fix any duplicate labels
RGroupLabels.AutoDetect - auto-detect the label [default]

Note: in all cases, any R-groups found on unlabelled atoms will be automatically labelled.

RGroupLabelling: choose where the R-group labels are stored on the decomposition.

RGroupLabelling.AtomMap - store R-groups as atom maps (for SMILES)
RGroupLabelling.Isotope - store R-groups on the isotope
RGroupLabelling.MDLRGroup - store R-groups as MDL R-groups (for molblocks)

default: AtomMap | MDLRGroup

onlyMatchAtRGroups: only allow R-group decomposition at the specified R-groups

removeAllHydrogenRGroups: remove all user-defined R-groups that only have hydrogens

removeAllHydrogenRGroupsAndLabels: remove all user-defined R-groups that only have hydrogens, and also remove the corresponding labels from the core

removeHydrogensPostMatch: remove all hydrogens from the output molecules

allowNonTerminalRGroups: allow labelled R-groups of degree 2 or more

doTautomers: match all tautomers of a core against each input structure

doEnumeration: expand input cores into enumerated mol bundles

allowMultipleRGroupsOnUnlabelled: permit more than one R-group to be attached to an unlabelled core atom

Constructor, takes no arguments

C++ signature :: void __init__(_object*)
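
Because the constructor takes no arguments, options are set as attributes after construction and the object is then handed to RGroupDecompose or RGroupDecomposition. The sketch below is one possible configuration, not a recommended default: the labelled core and the single input molecule are assumptions made for the example.

```python
from rdkit import Chem
from rdkit.Chem import rdRGroupDecomposition as rgd

params = rgd.RGroupDecompositionParameters()
params.onlyMatchAtRGroups = True        # decompose only at labelled positions
params.removeHydrogensPostMatch = True  # strip hydrogens from the output
params.matchingStrategy = rgd.RGroupMatching.Greedy

# Illustrative core with one labelled attachment point ([*:1]).
core = Chem.MolFromSmiles("[*:1]c1ccccc1")
mols = [Chem.MolFromSmiles("Cc1ccccc1")]

# The parameters object is passed through the `options` keyword.
rows, unmatched = rgd.RGroupDecompose([core], mols, asSmiles=True, options=params)
print(rows, list(unmatched))
```

The same params object can instead be given as the second argument of the RGroupDecomposition constructor when molecules are added incrementally.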

property alignment

property allowMultipleRGroupsOnUnlabelled

property allowNonTerminalRGroups

property chunkSize

property doEnumeration

property doTautomers

property gaMaximumOperations

property gaNumberOperationsWithoutImprovement

property gaNumberRuns

property gaParallelRuns

property gaPopulationSize

property gaRandomSeed

property includeTargetMolInResults

property labels

property matchingStrategy

property onlyMatchAtRGroups

property removeAllHydrogenRGroups

property removeAllHydrogenRGroupsAndLabels

property removeHydrogensPostMatch

property rgroupLabelling

property scoreMethod

property substructMatchParams

property timeout

class rdkit.Chem.rdRGroupDecomposition.RGroupLabelling

Bases: enum

AtomMap = rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.AtomMap

Isotope = rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.Isotope

MDLRGroup = rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup

names = {'AtomMap': rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.AtomMap, 'Isotope': rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.Isotope, 'MDLRGroup': rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup}

values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.AtomMap, 2: rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.Isotope, 4: rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup}
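
The values 1, 2 and 4 are bit flags, so members can be combined with the bitwise OR operator. The following sketch shows how the documented default (AtomMap | MDLRGroup) and the combination of all three flags work out numerically, and how a combination might be assigned to the rgroupLabelling property. Wrapping the combination in int() before assignment is a cautious choice made for this example, not a documented requirement.

```python
from rdkit.Chem import rdRGroupDecomposition as rgd

Labelling = rgd.RGroupLabelling

# Members behave like integers, so flags combine with bitwise OR.
default_flags = Labelling.AtomMap | Labelling.MDLRGroup                 # 1 | 4
all_flags = Labelling.AtomMap | Labelling.Isotope | Labelling.MDLRGroup  # 1 | 2 | 4

params = rgd.RGroupDecompositionParameters()
print("default rgroupLabelling:", params.rgroupLabelling)

# Store R-group labels on isotopes and as MDL R-groups, but not as atom maps.
params.rgroupLabelling = int(Labelling.Isotope | Labelling.MDLRGroup)
print("custom rgroupLabelling:", params.rgroupLabelling)
```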

class rdkit.Chem.rdRGroupDecomposition.RGroupLabels

Bases: enum

AtomIndexLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomIndexLabels

AtomMapLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomMapLabels

AutoDetect = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AutoDetect

DummyAtomLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.DummyAtomLabels

IsotopeLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.IsotopeLabels

MDLRGroupLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.MDLRGroupLabels

RelabelDuplicateLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.RelabelDuplicateLabels

names = {'AtomIndexLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomIndexLabels, 'AtomMapLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomMapLabels, 'AutoDetect': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AutoDetect, 'DummyAtomLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.DummyAtomLabels, 'IsotopeLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.IsotopeLabels, 'MDLRGroupLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.MDLRGroupLabels, 'RelabelDuplicateLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.RelabelDuplicateLabels}

values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.IsotopeLabels, 2: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomMapLabels, 4: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomIndexLabels, 8: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.RelabelDuplicateLabels, 16: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.MDLRGroupLabels, 32: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.DummyAtomLabels, 255: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AutoDetect}

class rdkit.Chem.rdRGroupDecomposition.RGroupMatching

Bases: enum

Exhaustive = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Exhaustive

GA = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GA

Greedy = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Greedy

GreedyChunks = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GreedyChunks

NoSymmetrization = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.NoSymmetrization

names = {'Exhaustive': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Exhaustive, 'GA': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GA, 'Greedy': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Greedy, 'GreedyChunks': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GreedyChunks, 'NoSymmetrization': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.NoSymmetrization}

values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Greedy, 2: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GreedyChunks, 4: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Exhaustive, 8: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.NoSymmetrization, 16: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GA}

class rdkit.Chem.rdRGroupDecomposition.RGroupScore

Bases: enum

FingerprintVariance = rdkit.Chem.rdRGroupDecomposition.RGroupScore.FingerprintVariance

Match = rdkit.Chem.rdRGroupDecomposition.RGroupScore.Match

names = {'FingerprintVariance': rdkit.Chem.rdRGroupDecomposition.RGroupScore.FingerprintVariance, 'Match': rdkit.Chem.rdRGroupDecomposition.RGroupScore.Match}

values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupScore.Match, 4: rdkit.Chem.rdRGroupDecomposition.RGroupScore.FingerprintVariance}

rdkit.Chem.rdRGroupDecomposition.RelabelMappedDummies((Mol)mol[, (int)inputLabels=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling(7)[, (int)outputLabels=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup]]) → None :

Relabel dummy atoms bearing an R-group mapping (as atom map number, isotope or MDLRGroup label) such that they will be displayed by the rendering code as R# rather than #*, :#, #:#, etc. By default, only the MDLRGroup label is retained on output; this may be configured through the outputLabels parameter. If there are multiple potential R-group mappings, the priority on input is atom map number > isotope > MDLRGroup. The inputLabels parameter controls which mappings are taken into consideration.

C++ signature :: void RelabelMappedDummies(RDKit::ROMol {lvalue} [,unsigned int=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling(7) [,unsigned int=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup]])
