RDKit
Open-source cheminformatics and machine learning.
RDKit::MolOps Namespace Reference

Groups a variety of molecular query and transformation operations.

Classes

struct  AdjustQueryParameters
 

Functions

int countAtomElec (const Atom *at)
 returns the number of electrons available on an atom to donate for aromaticity
 
int getFormalCharge (const ROMol &mol)
 sums up all atomic formal charges and returns the result
 
bool atomHasConjugatedBond (const Atom *at)
 returns whether or not the given Atom is involved in a conjugated bond
 
unsigned int getMolFrags (const ROMol &mol, std::vector< int > &mapping)
 find fragments (disconnected components of the molecular graph)
 
unsigned int getMolFrags (const ROMol &mol, std::vector< std::vector< int > > &frags)
 find fragments (disconnected components of the molecular graph)
 
std::vector< boost::shared_ptr< ROMol > > getMolFrags (const ROMol &mol, bool sanitizeFrags=true, std::vector< int > *frags=0, std::vector< std::vector< int > > *fragsMolAtomMapping=0, bool copyConformers=true)
 splits a molecule into its component fragments
 
template<typename T >
std::map< T, boost::shared_ptr< ROMol > > getMolFragsWithQuery (const ROMol &mol, T(*query)(const ROMol &, const Atom *), bool sanitizeFrags=true, const std::vector< T > *whiteList=0, bool negateList=false)
 splits a molecule into pieces based on labels assigned using a query
 
double computeBalabanJ (const ROMol &mol, bool useBO=true, bool force=false, const std::vector< int > *bondPath=0, bool cacheIt=true)
 calculates Balaban's J index for the molecule
 
double computeBalabanJ (double *distMat, int nb, int nAts)
 
unsigned getNumAtomsWithDistinctProperty (const ROMol &mol, std::string prop)
 returns the number of atoms which have a particular property set
 
Ring finding and SSSR
int findSSSR (const ROMol &mol, std::vector< std::vector< int > > &res)
 finds a molecule's Smallest Set of Smallest Rings
 
int findSSSR (const ROMol &mol, std::vector< std::vector< int > > *res=0)
 
void fastFindRings (const ROMol &mol)
 use a DFS algorithm to identify ring bonds and atoms in a molecule
 
int symmetrizeSSSR (ROMol &mol, std::vector< std::vector< int > > &res)
 symmetrize the molecule's Smallest Set of Smallest Rings
 
int symmetrizeSSSR (ROMol &mol)
 
Shortest paths and other matrices
double * getAdjacencyMatrix (const ROMol &mol, bool useBO=false, int emptyVal=0, bool force=false, const char *propNamePrefix=0, const boost::dynamic_bitset<> *bondsToUse=0)
 returns a molecule's adjacency matrix
 
double * getDistanceMat (const ROMol &mol, bool useBO=false, bool useAtomWts=false, bool force=false, const char *propNamePrefix=0)
 Computes the molecule's topological distance matrix.
 
double * getDistanceMat (const ROMol &mol, const std::vector< int > &activeAtoms, const std::vector< const Bond * > &bonds, bool useBO=false, bool useAtomWts=false)
 Computes the molecule's topological distance matrix.
 
double * get3DDistanceMat (const ROMol &mol, int confId=-1, bool useAtomWts=false, bool force=false, const char *propNamePrefix=0)
 Computes the molecule's 3D distance matrix.
 
std::list< int > getShortestPath (const ROMol &mol, int aid1, int aid2)
 Find the shortest path between two atoms.
 
Stereochemistry
void cleanupChirality (RWMol &mol)
 removes bogus chirality markers (those on non-sp3 centers)
 
void assignChiralTypesFrom3D (ROMol &mol, int confId=-1, bool replaceExistingTags=true)
 Uses a conformer to assign ChiralType to a molecule's atoms.
 
void assignStereochemistry (ROMol &mol, bool cleanIt=false, bool force=false, bool flagPossibleStereoCenters=false)
 Assign stereochemistry tags to atoms (i.e. R/S) and bonds (i.e. Z/E)
 
void removeStereochemistry (ROMol &mol)
 Removes all stereochemistry information from atoms (i.e. R/S) and bonds.
 
void findPotentialStereoBonds (ROMol &mol, bool cleanIt=false)
 finds bonds that could be cis/trans in a molecule and marks them as Bond::STEREONONE
 

Dealing with hydrogens

enum  AdjustQueryWhichFlags {
  ADJUST_IGNORENONE = 0x0, ADJUST_IGNORECHAINS = 0x1, ADJUST_IGNORERINGS = 0x4, ADJUST_IGNOREDUMMIES = 0x2,
  ADJUST_IGNORENONDUMMIES = 0x8, ADJUST_IGNOREMAPPED = 0x10, ADJUST_IGNOREALL = 0xFFFFFFF
}
 
ROMol * addHs (const ROMol &mol, bool explicitOnly=false, bool addCoords=false, const UINT_VECT *onlyOnAtoms=NULL)
 returns a copy of a molecule with hydrogens added in as explicit Atoms
 
void addHs (RWMol &mol, bool explicitOnly=false, bool addCoords=false, const UINT_VECT *onlyOnAtoms=NULL)
 
ROMol * removeHs (const ROMol &mol, bool implicitOnly=false, bool updateExplicitCount=false, bool sanitize=true)
 returns a copy of a molecule with hydrogens removed
 
void removeHs (RWMol &mol, bool implicitOnly=false, bool updateExplicitCount=false, bool sanitize=true)
 
ROMol * mergeQueryHs (const ROMol &mol, bool mergeUnmappedOnly=false)
 
void mergeQueryHs (RWMol &mol, bool mergeUnmappedOnly=false)
 
ROMol * adjustQueryProperties (const ROMol &mol, const AdjustQueryParameters *params=NULL)
 returns a copy of a molecule with query properties adjusted
 
void adjustQueryProperties (RWMol &mol, const AdjustQueryParameters *params=NULL)
 
ROMol * renumberAtoms (const ROMol &mol, const std::vector< unsigned int > &newOrder)
 returns a copy of a molecule with the atoms renumbered
 

Sanitization

enum  SanitizeFlags {
  SANITIZE_NONE = 0x0, SANITIZE_CLEANUP = 0x1, SANITIZE_PROPERTIES = 0x2, SANITIZE_SYMMRINGS = 0x4,
  SANITIZE_KEKULIZE = 0x8, SANITIZE_FINDRADICALS = 0x10, SANITIZE_SETAROMATICITY = 0x20, SANITIZE_SETCONJUGATION = 0x40,
  SANITIZE_SETHYBRIDIZATION = 0x80, SANITIZE_CLEANUPCHIRALITY = 0x100, SANITIZE_ADJUSTHS = 0x200, SANITIZE_ALL = 0xFFFFFFF
}
 
enum  AromaticityModel { AROMATICITY_DEFAULT = 0x0, AROMATICITY_RDKIT = 0x1, AROMATICITY_SIMPLE = 0x2, AROMATICITY_CUSTOM = 0xFFFFFFF }
 Possible aromaticity models.
 
void sanitizeMol (RWMol &mol, unsigned int &operationThatFailed, unsigned int sanitizeOps=SANITIZE_ALL)
 carries out a collection of tasks for cleaning up a molecule and ensuring that it makes "chemical sense"
 
void sanitizeMol (RWMol &mol)
 
int setAromaticity (RWMol &mol, AromaticityModel model=AROMATICITY_DEFAULT, int(*func)(RWMol &)=NULL)
 Sets up the aromaticity for a molecule.
 
void cleanUp (RWMol &mol)
 Designed to be called by the sanitizer to handle special cases before the other sanitization steps.
 
void assignRadicals (RWMol &mol)
 Called by the sanitizer to assign radical counts to atoms.
 
void adjustHs (RWMol &mol)
 adjust the number of implicit and explicit Hs for special cases
 
void Kekulize (RWMol &mol, bool markAtomsBonds=true, unsigned int maxBackTracks=100)
 Kekulizes the molecule.
 
void setConjugation (ROMol &mol)
 flags the molecule's conjugated bonds
 
void setHybridization (ROMol &mol)
 calculates and sets the hybridization of all a molecule's Atoms
 

Detailed Description

Groups a variety of molecular query and transformation operations.

Enumeration Type Documentation

enum RDKit::MolOps::AdjustQueryWhichFlags

Enumerator
ADJUST_IGNORENONE 
ADJUST_IGNORECHAINS 
ADJUST_IGNORERINGS 
ADJUST_IGNOREDUMMIES 
ADJUST_IGNORENONDUMMIES 
ADJUST_IGNOREMAPPED 
ADJUST_IGNOREALL 

Definition at line 244 of file MolOps.h.

enum RDKit::MolOps::AromaticityModel

Possible aromaticity models.

  • AROMATICITY_DEFAULT at the moment always uses AROMATICITY_RDKIT
  • AROMATICITY_RDKIT is the standard RDKit model (as documented in the RDKit Book)
  • AROMATICITY_SIMPLE only considers 5- and 6-membered simple rings (it does not consider the outer envelope of fused rings)
  • AROMATICITY_CUSTOM uses a caller-provided function
Enumerator
AROMATICITY_DEFAULT: future proofing
AROMATICITY_RDKIT
AROMATICITY_SIMPLE
AROMATICITY_CUSTOM: use a function

Definition at line 383 of file MolOps.h.

enum RDKit::MolOps::SanitizeFlags

Enumerator
SANITIZE_NONE 
SANITIZE_CLEANUP 
SANITIZE_PROPERTIES 
SANITIZE_SYMMRINGS 
SANITIZE_KEKULIZE 
SANITIZE_FINDRADICALS 
SANITIZE_SETAROMATICITY 
SANITIZE_SETCONJUGATION 
SANITIZE_SETHYBRIDIZATION 
SANITIZE_CLEANUPCHIRALITY 
SANITIZE_ADJUSTHS 
SANITIZE_ALL 

Definition at line 316 of file MolOps.h.

Function Documentation

ROMol* RDKit::MolOps::addHs ( const ROMol & mol,
bool  explicitOnly = false,
bool  addCoords = false,
const UINT_VECT * onlyOnAtoms = NULL 
)

returns a copy of a molecule with hydrogens added in as explicit Atoms

Parameters
mol: the molecule to add Hs to
explicitOnly: (optional) if this is true, only explicit Hs will be added
addCoords: (optional) if this is true, estimates for the atomic coordinates of the added Hs will be used
onlyOnAtoms: (optional) if provided, this should be a vector of IDs of the atoms that will be considered for H addition
Returns
the new molecule

Notes:

  • it makes no sense to use the addCoords option if the molecule's heavy atoms don't already have coordinates.
  • the caller is responsible for deleting the pointer this returns.
void RDKit::MolOps::addHs ( RWMol & mol,
bool  explicitOnly = false,
bool  addCoords = false,
const UINT_VECT * onlyOnAtoms = NULL 
)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

void RDKit::MolOps::adjustHs ( RWMol & mol)

adjust the number of implicit and explicit Hs for special cases

Currently this:

  • modifies aromatic nitrogens so that, when appropriate, they have an explicit H marked (e.g. so that we get things like "c1cc[nH]cc1")
Parameters
mol: the molecule of interest

Assumptions

  • this is called after the molecule has been sanitized, aromaticity has been perceived, and the implicit valence of everything has been calculated.
ROMol* RDKit::MolOps::adjustQueryProperties ( const ROMol & mol,
const AdjustQueryParameters * params = NULL 
)

returns a copy of a molecule with query properties adjusted

Parameters
mol: the molecule to adjust
params: controls the adjustments made
Returns
the new molecule

Referenced by RDKit::MolOps::AdjustQueryParameters::AdjustQueryParameters().

void RDKit::MolOps::adjustQueryProperties ( RWMol & mol,
const AdjustQueryParameters * params = NULL 
)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

void RDKit::MolOps::assignChiralTypesFrom3D ( ROMol & mol,
int  confId = -1,
bool  replaceExistingTags = true 
)

Uses a conformer to assign ChiralType to a molecule's atoms.

Parameters
mol: the molecule of interest
confId: the conformer to use
replaceExistingTags: if this flag is true, any existing atomic chiral tags will be replaced

If the conformer provided is not a 3D conformer, nothing will be done.

void RDKit::MolOps::assignRadicals ( RWMol & mol)

Called by the sanitizer to assign radical counts to atoms.

void RDKit::MolOps::assignStereochemistry ( ROMol & mol,
bool  cleanIt = false,
bool  force = false,
bool  flagPossibleStereoCenters = false 
)

Assign stereochemistry tags to atoms (i.e. R/S) and bonds (i.e. Z/E)

Parameters
mol: the molecule of interest
cleanIt: toggles removal of stereo flags from double bonds that cannot have stereochemistry
force: forces the calculation to be repeated even if it has already been done
flagPossibleStereoCenters: set the _ChiralityPossible property on atoms that are possible stereocenters

Notes:

  • Throughout we assume that we're working with a hydrogen-suppressed graph.
bool RDKit::MolOps::atomHasConjugatedBond ( const Atom * at)

returns whether or not the given Atom is involved in a conjugated bond

void RDKit::MolOps::cleanUp ( RWMol & mol)

Designed to be called by the sanitizer to handle special cases before the other sanitization steps.

Currently this:

  • modifies nitro groups, so that the nitrogen does not have an unreasonable valence of 5, as follows:
    • the nitrogen gets a positive charge
    • one of the oxygens gets a negative charge and the double bond to this oxygen is changed to a single bond
    The net result is that nitro groups can be counted on to be: "[N+](=O)[O-]"
  • modifies halogen-oxygen containing species as follows:
    • [Cl,Br,I](=O)(=O)(=O)O -> [X+3]([O-])([O-])([O-])O
    • [Cl,Br,I](=O)(=O)O -> [X+3]([O-])([O-])O
    • [Cl,Br,I](=O)O -> [X+]([O-])O
  • converts the substructure [N,C]=P(=O)-* to [N,C]=[P+](-[O-])-*
Parameters
mol: the molecule of interest
void RDKit::MolOps::cleanupChirality ( RWMol & mol)

removes bogus chirality markers (those on non-sp3 centers)

double RDKit::MolOps::computeBalabanJ ( const ROMol & mol,
bool  useBO = true,
bool  force = false,
const std::vector< int > *  bondPath = 0,
bool  cacheIt = true 
)

calculates Balaban's J index for the molecule

Parameters
mol: the molecule of interest
useBO: toggles inclusion of the bond order in the calculation (when false, we're not really calculating the J value)
force: forces the calculation (instead of using cached results)
bondPath: when included, only paths using bonds whose indices occur in this vector will be included in the calculation
cacheIt: if this is true, the calculated value will be cached as a property on the molecule
Returns
the J index
double RDKit::MolOps::computeBalabanJ ( double *  distMat,
int  nb,
int  nAts 
)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

int RDKit::MolOps::countAtomElec ( const Atom * at)

returns the number of electrons available on an atom to donate for aromaticity

The result is determined using the default valency, number of lone pairs, number of bonds and the formal charge. Note that the atom may not donate all of these electrons to a ring for aromaticity (this count is also used in the conjugation and hybridization code).

Parameters
at: the atom of interest
Returns
the number of electrons
void RDKit::MolOps::fastFindRings ( const ROMol & mol)

use a DFS algorithm to identify ring bonds and atoms in a molecule

NOTE: though the RingInfo structure is populated by this function, the only really reliable calls that can be made are to check if mol.getRingInfo().numAtomRings(idx) or mol.getRingInfo().numBondRings(idx) return values >0
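
As a rough illustration of DFS-based ring membership (a sketch under stated assumptions, not RDKit's implementation): a bond lies in a ring exactly when it is not a bridge of the molecular graph, which one depth-first search with low-link values can decide. The `BondList` type and the `RingBondFinder` and `ringAtoms` names are hypothetical; the molecule is reduced to a list of atom-index pairs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::vector<std::pair<int, int> > BondList;  // hypothetical bond list

// Marks every bond that is not a bridge as a ring bond.
struct RingBondFinder {
  std::vector<std::vector<std::pair<int, int> > > nbrs;  // (neighbor, bond idx)
  std::vector<int> disc, low;    // DFS discovery time and low-link value
  std::vector<bool> isRingBond;  // assumed true until proven a bridge
  int timer;

  RingBondFinder(int nAtoms, const BondList &bonds)
      : nbrs(nAtoms), disc(nAtoms, -1), low(nAtoms, 0),
        isRingBond(bonds.size(), true), timer(0) {
    for (std::size_t b = 0; b < bonds.size(); ++b) {
      assert(bonds[b].first >= 0 && bonds[b].first < nAtoms);
      assert(bonds[b].second >= 0 && bonds[b].second < nAtoms);
      nbrs[bonds[b].first].push_back(
          std::make_pair(bonds[b].second, static_cast<int>(b)));
      nbrs[bonds[b].second].push_back(
          std::make_pair(bonds[b].first, static_cast<int>(b)));
    }
    for (int a = 0; a < nAtoms; ++a) {
      if (disc[a] < 0) dfs(a, -1);
    }
  }

  void dfs(int u, int parentBond) {
    disc[u] = low[u] = timer++;
    for (std::size_t i = 0; i < nbrs[u].size(); ++i) {
      const int v = nbrs[u][i].first;
      const int b = nbrs[u][i].second;
      if (b == parentBond) continue;  // do not walk back along the same bond
      if (disc[v] < 0) {
        dfs(v, b);
        low[u] = std::min(low[u], low[v]);
        if (low[v] > disc[u]) isRingBond[b] = false;  // bridge: not in a ring
      } else {
        low[u] = std::min(low[u], disc[v]);  // back edge closes a cycle
      }
    }
  }
};

// An atom is a ring atom if at least one of its bonds is a ring bond.
std::vector<bool> ringAtoms(int nAtoms, const BondList &bonds) {
  RingBondFinder finder(nAtoms, bonds);
  std::vector<bool> inRing(nAtoms, false);
  for (std::size_t b = 0; b < bonds.size(); ++b) {
    if (finder.isRingBond[b]) {
      inRing[bonds[b].first] = true;
      inRing[bonds[b].second] = true;
    }
  }
  return inRing;
}
```

This yields only a yes/no answer per atom or bond, which matches the kind of query the note above describes as reliable after fastFindRings.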

void RDKit::MolOps::findPotentialStereoBonds ( ROMol & mol,
bool  cleanIt = false 
)

finds bonds that could be cis/trans in a molecule and marks them as Bond::STEREONONE

Parameters
mol: the molecule of interest
cleanIt: toggles removal of stereo flags from double bonds that cannot have stereochemistry

This function is useful in two situations:

  • when parsing a mol file; for the bonds marked here, coordinate information on the neighbors can be used to identify cis or trans states
  • when writing a mol file; bonds that can be cis/trans but are not marked as either need to be specially marked in the mol file
int RDKit::MolOps::findSSSR ( const ROMol & mol,
std::vector< std::vector< int > > &  res 
)

finds a molecule's Smallest Set of Smallest Rings

Currently this implements a modified form of Figueras' algorithm (JCICS - Vol. 36, No. 5, 1996, 986-991)

Parameters
mol: the molecule of interest
res: used to return the vector of rings. Each entry is a vector with atom indices. This information is also stored in the molecule's RingInfo structure, so this argument is optional (see overload)
Returns
number of smallest rings found

Base algorithm:

  • The original algorithm starts by finding representative degree 2 nodes.
  • Representative because if a series of deg 2 nodes are found only one of them is picked.
  • The smallest ring around each of them is found.
  • The bonds that connect to this degree 2 node are then chopped off, yielding new deg 2 nodes
  • The process is repeated on the new deg 2 nodes.
  • If no deg 2 nodes are found, a deg 3 node is picked. The smallest ring with it is found. A bond from this is "carefully" (look in the paper) selected and chopped, yielding deg 2 nodes. The process is same as above once this is done.

Our Modifications:

  • If available, more than one smallest ring around a representative deg 2 node will be computed and stored
  • Typically 3 rings are found around a degree 3 node (when no deg 2s are available) and all the bonds to that node are chopped.
  • The extra rings that were found in this process are removed after all the nodes have been covered.

These changes were motivated by several factors:

  • We believe the original algorithm fails to find the correct SSSR (finds the correct number of them but the wrong ones) on some sample mols
  • Since SSSR may not be unique, a post-SSSR step to symmetrize may be done. The extra rings this process adds can be quite useful.
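
A quick sanity check on any SSSR result is its size: the number of rings in an SSSR is fixed by the graph alone and equals bonds - atoms + fragments. The sketch below is illustrative only; `sssrSize` is a hypothetical helper, not part of MolOps.

```cpp
#include <cassert>

// Hypothetical helper: the number of rings any SSSR must contain
// (the cyclomatic number of the molecular graph).
int sssrSize(int nAtoms, int nBonds, int nFragments) {
  assert(nAtoms >= 0 && nBonds >= 0 && nFragments >= 0);
  return nBonds - nAtoms + nFragments;
}
```

For cubane (8 atoms, 12 bonds, 1 fragment) this gives 5, the same count quoted in the symmetrizeSSSR description, even though the cage has six faces.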

Referenced by RDKit::Drawing::DrawMol().

int RDKit::MolOps::findSSSR ( const ROMol & mol,
std::vector< std::vector< int > > *  res = 0 
)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

double* RDKit::MolOps::get3DDistanceMat ( const ROMol & mol,
int  confId = -1,
bool  useAtomWts = false,
bool  force = false,
const char *  propNamePrefix = 0 
)

Computes the molecule's 3D distance matrix.

Parameters
mol: the molecule of interest
confId: the conformer to use
useAtomWts: sets the diagonal elements of the result to 6.0/(atomic number)
force: forces calculation of the matrix, even if already computed
propNamePrefix: used to set the cached property name (if set to an empty string, the matrix will not be cached)
Returns
the distance matrix.

Notes

  • The result of this is cached in the molecule's local property dictionary, which will handle deallocation. The caller should not delete this pointer.
double* RDKit::MolOps::getAdjacencyMatrix ( const ROMol & mol,
bool  useBO = false,
int  emptyVal = 0,
bool  force = false,
const char *  propNamePrefix = 0,
const boost::dynamic_bitset<> *  bondsToUse = 0 
)

returns a molecule's adjacency matrix

Parameters
mol: the molecule of interest
useBO: toggles use of bond orders in the matrix
emptyVal: sets the empty value (for non-adjacent atoms)
force: forces calculation of the matrix, even if already computed
propNamePrefix: used to set the cached property name
Returns
the adjacency matrix.

Notes

  • The result of this is cached in the molecule's local property dictionary, which will handle deallocation. The caller should not delete this pointer.
double* RDKit::MolOps::getDistanceMat ( const ROMol & mol,
bool  useBO = false,
bool  useAtomWts = false,
bool  force = false,
const char *  propNamePrefix = 0 
)

Computes the molecule's topological distance matrix.

Uses the Floyd-Warshall all-pairs-shortest-paths algorithm.

Parameters
mol: the molecule of interest
useBO: toggles use of bond orders in the matrix
useAtomWts: sets the diagonal elements of the result to 6.0/(atomic number) so that the matrix can be used to calculate Balaban J values. This does not affect the bond weights.
force: forces calculation of the matrix, even if already computed
propNamePrefix: used to set the cached property name
Returns
the distance matrix.

Notes

  • The result of this is cached in the molecule's local property dictionary, which will handle deallocation. The caller should not delete this pointer.
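
To make the Floyd-Warshall step concrete, here is a minimal sketch on a plain bond list, assuming every bond has length 1 (so neither useBO nor useAtomWts is modelled). The `topologicalDistances` name, the bond-list input and the `unreachable` sentinel are assumptions for illustration; this is not the library's code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns an nAtoms x nAtoms matrix stored row-major in a flat vector,
// so element (i, j) is at index i * nAtoms + j. Pairs of atoms in
// different fragments keep the sentinel value `unreachable`.
std::vector<double> topologicalDistances(
    int nAtoms, const std::vector<std::pair<int, int> > &bonds,
    double unreachable = 1e8) {
  assert(nAtoms >= 0);
  std::vector<double> d(static_cast<std::size_t>(nAtoms) * nAtoms,
                        unreachable);
  for (int i = 0; i < nAtoms; ++i) d[i * nAtoms + i] = 0.0;
  for (std::size_t b = 0; b < bonds.size(); ++b) {
    const int i = bonds[b].first, j = bonds[b].second;
    assert(i >= 0 && i < nAtoms && j >= 0 && j < nAtoms);
    d[i * nAtoms + j] = 1.0;  // unit bond length, symmetric
    d[j * nAtoms + i] = 1.0;
  }
  // Floyd-Warshall: allow each atom k in turn as an intermediate stop.
  for (int k = 0; k < nAtoms; ++k) {
    for (int i = 0; i < nAtoms; ++i) {
      for (int j = 0; j < nAtoms; ++j) {
        const double viaK = d[i * nAtoms + k] + d[k * nAtoms + j];
        if (viaK < d[i * nAtoms + j]) d[i * nAtoms + j] = viaK;
      }
    }
  }
  return d;
}
```

The sketch returns a vector by value, so the ownership rule in the note above (the molecule owns the cached matrix) has no counterpart here.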
double* RDKit::MolOps::getDistanceMat ( const ROMol & mol,
const std::vector< int > &  activeAtoms,
const std::vector< const Bond * > &  bonds,
bool  useBO = false,
bool  useAtomWts = false 
)

Computes the molecule's topological distance matrix.

Uses the Floyd-Warshall all-pairs-shortest-paths algorithm.

Parameters
mol: the molecule of interest
activeAtoms: only elements corresponding to these atom indices will be included in the calculation
bonds: only bonds found in this list will be included in the calculation
useBO: toggles use of bond orders in the matrix
useAtomWts: sets the diagonal elements of the result to 6.0/(atomic number) so that the matrix can be used to calculate Balaban J values. This does not affect the bond weights.
Returns
the distance matrix.

Notes

  • The results of this call are not cached; the caller should delete this pointer.
int RDKit::MolOps::getFormalCharge ( const ROMol & mol)

sums up all atomic formal charges and returns the result

unsigned int RDKit::MolOps::getMolFrags ( const ROMol & mol,
std::vector< int > &  mapping 
)

find fragments (disconnected components of the molecular graph)

Parameters
mol: the molecule of interest
mapping: used to return the mapping of Atoms->fragments. On return mapping will be mol.getNumAtoms() long and will contain the fragment assignment for each Atom
Returns
the number of fragments found.
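
The Atoms->fragments mapping can be pictured with a small connected-components sketch on a bond list. `labelFragments` is a hypothetical stand-in that mirrors the shape of this overload (output vector by reference, fragment count as the return value); it does not use RDKit types, and the fragment numbering it produces is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Fills `mapping` with one fragment index per atom and returns the number
// of fragments. Fragments are numbered in order of their lowest atom index.
unsigned int labelFragments(int nAtoms,
                            const std::vector<std::pair<int, int> > &bonds,
                            std::vector<int> &mapping) {
  assert(nAtoms >= 0);
  std::vector<std::vector<int> > nbrs(nAtoms);
  for (std::size_t b = 0; b < bonds.size(); ++b) {
    assert(bonds[b].first >= 0 && bonds[b].first < nAtoms);
    assert(bonds[b].second >= 0 && bonds[b].second < nAtoms);
    nbrs[bonds[b].first].push_back(bonds[b].second);
    nbrs[bonds[b].second].push_back(bonds[b].first);
  }
  mapping.assign(nAtoms, -1);  // -1 means "not assigned yet"
  unsigned int nFrags = 0;
  for (int start = 0; start < nAtoms; ++start) {
    if (mapping[start] >= 0) continue;
    // Flood-fill everything reachable from `start` with the current label.
    std::vector<int> stack(1, start);
    mapping[start] = static_cast<int>(nFrags);
    while (!stack.empty()) {
      const int a = stack.back();
      stack.pop_back();
      for (std::size_t i = 0; i < nbrs[a].size(); ++i) {
        const int nb = nbrs[a][i];
        if (mapping[nb] < 0) {
          mapping[nb] = static_cast<int>(nFrags);
          stack.push_back(nb);
        }
      }
    }
    ++nFrags;
  }
  return nFrags;
}
```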
unsigned int RDKit::MolOps::getMolFrags ( const ROMol & mol,
std::vector< std::vector< int > > &  frags 
)

find fragments (disconnected components of the molecular graph)

Parameters
mol: the molecule of interest
frags: used to return the Atoms in each fragment. On return frags will be numFrags long, and each entry will contain the indices of the Atoms in that fragment.
Returns
the number of fragments found.
std::vector<boost::shared_ptr<ROMol> > RDKit::MolOps::getMolFrags ( const ROMol & mol,
bool  sanitizeFrags = true,
std::vector< int > *  frags = 0,
std::vector< std::vector< int > > *  fragsMolAtomMapping = 0,
bool  copyConformers = true 
)

splits a molecule into its component fragments

Parameters
mol: the molecule of interest
sanitizeFrags: toggles sanitization of the fragments after they are built
frags: used to return the mapping of Atoms->fragments. If provided, frags will be mol.getNumAtoms() long on return and will contain the fragment assignment for each Atom
fragsMolAtomMapping: used to return the Atoms in each fragment. On return fragsMolAtomMapping will be numFrags long, and each entry will contain the indices of the Atoms in that fragment.
copyConformers: toggles copying conformers of the fragments after they are built
Returns
a vector of the fragments as smart pointers to ROMols
template<typename T >
std::map<T, boost::shared_ptr<ROMol> > RDKit::MolOps::getMolFragsWithQuery ( const ROMol & mol,
T(*)(const ROMol &, const Atom *)  query,
bool  sanitizeFrags = true,
const std::vector< T > *  whiteList = 0,
bool  negateList = false 
)

splits a molecule into pieces based on labels assigned using a query

Parameters
mol: the molecule of interest
query: the query used to "label" the molecule for fragmentation
sanitizeFrags: toggles sanitization of the fragments after they are built
whiteList: if provided, only labels in the list will be kept
negateList: if true, the white list logic will be inverted: only labels not in the list will be kept
Returns
a map of the fragments and their labels
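
The interplay of query, whiteList and negateList can be sketched without any RDKit types. In this hedged example atoms are reduced to atomic numbers and the result maps each label to atom indices rather than to molecules; `groupAtomsByLabel` and `isHeavy` are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Groups atom indices by the label that `query` assigns to each atom.
// With a whiteList, only listed labels are kept; negateList inverts that.
template <typename T>
std::map<T, std::vector<int> > groupAtomsByLabel(
    const std::vector<int> &atomicNums, T (*query)(int),
    const std::vector<T> *whiteList = 0, bool negateList = false) {
  assert(query != 0);
  std::map<T, std::vector<int> > res;
  for (std::size_t i = 0; i < atomicNums.size(); ++i) {
    const T label = query(atomicNums[i]);
    if (whiteList) {
      const bool listed = std::find(whiteList->begin(), whiteList->end(),
                                    label) != whiteList->end();
      if (listed == negateList) continue;  // this label is filtered out
    }
    res[label].push_back(static_cast<int>(i));
  }
  return res;
}

// Example query: label 1 for heavy atoms, 0 for hydrogens.
int isHeavy(int atomicNum) { return atomicNum > 1 ? 1 : 0; }
```

With atomic numbers {6, 1, 8, 1} and no white list, the sketch returns two groups; passing a white list containing only 1 keeps the heavy-atom group, and adding negateList=true keeps the hydrogen group instead.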
unsigned RDKit::MolOps::getNumAtomsWithDistinctProperty ( const ROMol & mol,
std::string  prop 
)

returns the number of atoms which have a particular property set

std::list<int> RDKit::MolOps::getShortestPath ( const ROMol & mol,
int  aid1,
int  aid2 
)

Find the shortest path between two atoms.

Uses the Bellman-Ford algorithm.

Parameters
mol: molecule of interest
aid1: index of the first atom
aid2: index of the second atom
Returns
an std::list with the indices of the atoms along the shortest path

Notes:

  • the starting and end atoms are included in the path
  • if no path is found, an empty path is returned
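
A minimal Bellman-Ford sketch that follows the two notes above (both end atoms are included; an empty list means no path). It assumes unit-length bonds and a plain bond list; `shortestPath` is a hypothetical function, and which of several equally short paths it returns is an artifact of the bond order, not a documented property of getShortestPath.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <list>
#include <utility>
#include <vector>

std::list<int> shortestPath(int nAtoms,
                            const std::vector<std::pair<int, int> > &bonds,
                            int aid1, int aid2) {
  assert(aid1 >= 0 && aid1 < nAtoms && aid2 >= 0 && aid2 < nAtoms);
  const int inf = std::numeric_limits<int>::max();
  std::vector<int> dist(nAtoms, inf), pred(nAtoms, -1);
  dist[aid1] = 0;
  // Bellman-Ford: relax every bond, in both directions, at most
  // nAtoms - 1 times; stop early once a full pass changes nothing.
  for (int pass = 0; pass + 1 < nAtoms; ++pass) {
    bool changed = false;
    for (std::size_t b = 0; b < bonds.size(); ++b) {
      const int ends[2][2] = {{bonds[b].first, bonds[b].second},
                              {bonds[b].second, bonds[b].first}};
      for (int dir = 0; dir < 2; ++dir) {
        const int from = ends[dir][0], to = ends[dir][1];
        if (dist[from] != inf && dist[from] + 1 < dist[to]) {
          dist[to] = dist[from] + 1;
          pred[to] = from;
          changed = true;
        }
      }
    }
    if (!changed) break;
  }
  std::list<int> path;
  if (dist[aid2] == inf) return path;  // no path: return an empty list
  // Walk predecessors back from aid2; pred of the start atom stays -1.
  for (int a = aid2; a != -1; a = pred[a]) path.push_front(a);
  return path;
}
```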
void RDKit::MolOps::Kekulize ( RWMol & mol,
bool  markAtomsBonds = true,
unsigned int  maxBackTracks = 100 
)

Kekulizes the molecule.

Parameters
mol: the molecule of interest
markAtomsBonds: if this is set to true, the isAromatic boolean settings on both the Bonds and Atoms are turned to false following the Kekulization; otherwise they are left alone in their original state.
maxBackTracks: the maximum number of attempts at back-tracking. The algorithm uses a back-tracking procedure to revisit a previous setting of a double bond if we hit a wall in the kekulization process

Referenced by RDKit::Drawing::MolToDrawing().

ROMol* RDKit::MolOps::mergeQueryHs ( const ROMol & mol,
bool  mergeUnmappedOnly = false 
)

returns a copy of a molecule with hydrogens removed and added as queries to the heavy atoms to which they are bound.

This is really intended to be used with molecules that contain QueryAtoms.

Parameters
mol: the molecule to remove Hs from
Returns
the new molecule

Notes:

  • Atoms that do not already have hydrogen count queries will have one added, other H-related queries will not be touched. Examples:
    • C[H] -> [C;!H0]
    • [C;H1][H] -> [C;H1]
    • [C;H2][H] -> [C;H2]
  • Hydrogens which aren't connected to a heavy atom will not be removed. This prevents molecules like "[H][H]" from having all atoms removed.
  • the caller is responsible for deleting the pointer this returns.
  • By default all hydrogens are removed, however if mergeUnmappedOnly is true, any hydrogen participating in an atom map will be retained
void RDKit::MolOps::mergeQueryHs ( RWMol & mol,
bool  mergeUnmappedOnly = false 
)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

ROMol* RDKit::MolOps::removeHs ( const ROMol & mol,
bool  implicitOnly = false,
bool  updateExplicitCount = false,
bool  sanitize = true 
)

returns a copy of a molecule with hydrogens removed

Parameters
mol: the molecule to remove Hs from
implicitOnly: (optional) if this is true, only implicit Hs will be removed
updateExplicitCount: (optional) if this is true, when explicit Hs are removed from the graph, the heavy atom to which they are bound will have its counter of explicit Hs increased.
sanitize: (optional) if this is true, the final molecule will be sanitized
Returns
the new molecule

Notes:

  • Hydrogens which aren't connected to a heavy atom will not be removed. This prevents molecules like "[H][H]" from having all atoms removed.
  • Labelled hydrogens (e.g. atoms with atomic number=1, but mass > 1) will not be removed.
  • two coordinate Hs, like the central H in C[H-]C, will not be removed
  • Hs connected to dummy atoms will not be removed
  • the caller is responsible for deleting the pointer this returns.

Referenced by RDKit::ForwardSDMolSupplier::ForwardSDMolSupplier(), RDKit::SDMolSupplier::SDMolSupplier(), and RDKit::SDMolSupplier::~SDMolSupplier().

void RDKit::MolOps::removeHs ( RWMol & mol,
bool  implicitOnly = false,
bool  updateExplicitCount = false,
bool  sanitize = true 
)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

void RDKit::MolOps::removeStereochemistry ( ROMol & mol)

Removes all stereochemistry information from atoms (i.e. R/S) and bonds.

Parameters
mol: the molecule of interest
ROMol* RDKit::MolOps::renumberAtoms ( const ROMol & mol,
const std::vector< unsigned int > &  newOrder 
)

returns a copy of a molecule with the atoms renumbered

Parameters
mol: the molecule to work with
newOrder: the new ordering of the atoms (should be numAtoms long). For example, if newOrder is [3,2,0,1], then atom 3 in the original molecule will be atom 0 in the new one
Returns
the new molecule

Notes:

  • the caller is responsible for deleting the pointer this returns.
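
The newOrder convention is easy to get backwards, so here is a small sketch that applies it to a plain vector of atom labels instead of a molecule. `renumber` is a hypothetical helper; the only rule it encodes is the one stated above: entry i of the result is the original atom newOrder[i].

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Applies the newOrder convention to a list of atom labels:
// result[i] = atoms[newOrder[i]].
std::vector<std::string> renumber(const std::vector<std::string> &atoms,
                                  const std::vector<unsigned int> &newOrder) {
  assert(newOrder.size() == atoms.size());  // should be numAtoms long
  std::vector<std::string> res;
  res.reserve(atoms.size());
  for (std::size_t i = 0; i < newOrder.size(); ++i) {
    assert(newOrder[i] < atoms.size());
    res.push_back(atoms[newOrder[i]]);
  }
  return res;
}
```

For atoms {"C", "N", "O", "S"} and newOrder [3,2,0,1], the sketch yields {"S", "O", "C", "N"}: the original atom 3 is now atom 0.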

Referenced by RDKit::MolOps::AdjustQueryParameters::AdjustQueryParameters().

void RDKit::MolOps::sanitizeMol ( RWMol & mol,
unsigned int &  operationThatFailed,
unsigned int  sanitizeOps = SANITIZE_ALL 
)

carries out a collection of tasks for cleaning up a molecule and ensuring that it makes "chemical sense"

This function calls the following in sequence:

  1. MolOps::cleanUp()
  2. mol.updatePropertyCache()
  3. MolOps::symmetrizeSSSR()
  4. MolOps::Kekulize()
  5. MolOps::assignRadicals()
  6. MolOps::setAromaticity()
  7. MolOps::setConjugation()
  8. MolOps::setHybridization()
  9. MolOps::cleanupChirality()
  10. MolOps::adjustHs()
Parameters
mol: the RWMol to be cleaned
operationThatFailed: the first (if any) sanitization operation that fails is set here. The values are taken from the SanitizeFlags enum. On success, the value is SanitizeFlags::SANITIZE_NONE
sanitizeOps: the bits here are used to set which sanitization operations are carried out. The elements of the SanitizeFlags enum define the operations.

Notes:

  • If there is a failure in the sanitization, a SanitException will be thrown.
  • in general the caller should cast the molecule to a ROMol after calling this function, so that new atoms and bonds cannot be added to the molecule and invalidate the sanitization that has been done here
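
Because sanitizeOps is a bit mask, individual steps can be switched off by clearing their bits. The sketch below copies the flag values from the SanitizeFlags listing on this page into a local enum so that it compiles on its own; `allExcept` and `isEnabled` are hypothetical helpers, not part of MolOps.

```cpp
#include <cassert>

// Values copied from the SanitizeFlags listing on this page.
enum SanitizeFlags {
  SANITIZE_NONE = 0x0,
  SANITIZE_CLEANUP = 0x1,
  SANITIZE_PROPERTIES = 0x2,
  SANITIZE_SYMMRINGS = 0x4,
  SANITIZE_KEKULIZE = 0x8,
  SANITIZE_FINDRADICALS = 0x10,
  SANITIZE_SETAROMATICITY = 0x20,
  SANITIZE_SETCONJUGATION = 0x40,
  SANITIZE_SETHYBRIDIZATION = 0x80,
  SANITIZE_CLEANUPCHIRALITY = 0x100,
  SANITIZE_ADJUSTHS = 0x200,
  SANITIZE_ALL = 0xFFFFFFF
};

// Hypothetical helper: every operation except those whose bits are in `skip`.
unsigned int allExcept(unsigned int skip) {
  const unsigned int all = static_cast<unsigned int>(SANITIZE_ALL);
  assert((skip & ~all) == 0);  // skip must only contain known flag bits
  return all & ~skip;
}

// Hypothetical helper: true if `op` would be carried out for mask `ops`.
bool isEnabled(unsigned int ops, SanitizeFlags op) {
  return (ops & static_cast<unsigned int>(op)) != 0;
}
```

A mask built this way, for example `allExcept(SANITIZE_KEKULIZE | SANITIZE_SETAROMATICITY)`, is the kind of value that would be passed as the sanitizeOps argument.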
void RDKit::MolOps::sanitizeMol ( RWMol & mol)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

int RDKit::MolOps::setAromaticity ( RWMol & mol,
AromaticityModel  model = AROMATICITY_DEFAULT,
int(*)(RWMol &)  func = NULL 
)

Sets up the aromaticity for a molecule.

This is what happens here:

  1. find all the simple rings by calling the findSSSR function
  2. loop over all the Atoms in each ring and mark them if they are candidates for aromaticity. A ring atom is a candidate if it can spare electrons to the ring and if it's from the first two rows of the periodic table.
  3. based on the candidate atoms, mark the rings to be either candidates or non-candidates. A ring is a candidate only if all its atoms are candidates
  4. apply Hückel's rule to each of the candidate rings to check if the ring can be aromatic
Parameters
mol: the RWMol of interest
model: the aromaticity model to use
func: a custom function for assigning aromaticity (only used when model=AROMATICITY_CUSTOM)
Returns
>0 on success, <= 0 otherwise
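
The final step of the procedure above, the Hückel test, comes down to checking whether a ring's pi-electron count has the form 4n + 2. The sketch below shows only that arithmetic; `satisfiesHueckel` is a hypothetical helper, and the electron count is assumed to have been worked out already (the library's actual criteria may include further conditions).

```cpp
#include <cassert>

// Hueckel's rule: true when nPiElectrons == 4n + 2 for some integer n >= 0.
bool satisfiesHueckel(int nPiElectrons) {
  assert(nPiElectrons >= 0);
  return nPiElectrons >= 2 && (nPiElectrons - 2) % 4 == 0;
}
```

Benzene, with 6 pi electrons, passes; a 4-electron ring such as cyclobutadiene does not.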

void RDKit::MolOps::setConjugation ( ROMol & mol)

flags the molecule's conjugated bonds

void RDKit::MolOps::setHybridization ( ROMol & mol)

calculates and sets the hybridization of all a molecule's Atoms

int RDKit::MolOps::symmetrizeSSSR ( ROMol & mol,
std::vector< std::vector< int > > &  res 
)

symmetrize the molecule's Smallest Set of Smallest Rings

SSSR rings obtained from "findSSSR" can be non-unique in some cases. For example, cubane has five SSSR rings, not six as one would hope.

This function adds additional rings to the SSSR list if necessary to make the list symmetric, e.g. all atoms in cubane will be part of the same number of SSSRs. This function chooses these extra rings from the extra rings computed and discarded during findSSSR. The new rings are chosen such that:

  • replacing a same sized ring in the SSSR list with an extra ring yields the same union of bond IDs as the original SSSR list
Parameters
mol: the molecule of interest
res: used to return the vector of rings. Each entry is a vector with atom indices. This information is also stored in the molecule's RingInfo structure, so this argument is optional (see overload)
Returns
the total number of rings = (new rings + old SSSRs)

int RDKit::MolOps::symmetrizeSSSR ( ROMol & mol)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.