Import all RDKit chemistry modules
|
Applies the transformation to a molecule and sets it up with a single conformer
|
Returns a generator for the virtual library defined by
a reaction and a sequence of sidechain sets
>>> import Chem
>>> from Chem import AllChem
>>> s1=[Chem.MolFromSmiles(x) for x in ('NC','NCC')]
>>> s2=[Chem.MolFromSmiles(x) for x in ('OC=O','OC(=O)C')]
>>> rxn = AllChem.ReactionFromSmarts('[O:2]=[C:1][OH].[N:3]>>[O:2]=[C:1][N:3]')
>>> r = AllChem.EnumerateLibraryFromReaction(rxn,[s2,s1])
>>> [Chem.MolToSmiles(x[0]) for x in list(r)]
['CNC=O', 'CCNC=O', 'CNC(=O)C', 'CCNC(=O)C']
Note that this is all done lazily, so "infinitely" large libraries can
be enumerated without worrying about running out of memory. Your patience will run out first:
Define a set of 10000 amines:
>>> amines = (Chem.MolFromSmiles('N'+'C'*x) for x in range(10000))
... a set of 10000 acids
>>> acids = (Chem.MolFromSmiles('OC(=O)'+'C'*x) for x in range(10000))
... now the virtual library (1e8 compounds in principle):
>>> r = AllChem.EnumerateLibraryFromReaction(rxn,[acids,amines])
... look at the first 4 compounds:
>>> [Chem.MolToSmiles(r.next()[0]) for x in range(4)]
['NC=O', 'CNC=O', 'CCNC=O', 'CCCNC=O']
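The lazy behaviour described above can be illustrated without RDKit. The following is a plain-Python sketch, not the library's implementation: `lazy_pairs` is a hypothetical helper that pairs SMILES strings instead of running a reaction, and it assumes the last reactant set varies fastest, as the output above suggests. The sets are kept smaller than in the doctest to keep the example light.

```python
from itertools import islice


def lazy_pairs(outer, inner):
    """Yield (outer, inner) combinations one at a time.

    Hypothetical helper: only the inner set is materialized, because it
    has to be traversed once for every item of the outer set.
    """
    inner = list(inner)
    for a in outer:
        for b in inner:
            yield (a, b)


# Building blocks in the style of the doctest above, kept as SMILES strings.
acids = ('OC(=O)' + 'C' * x for x in range(1000))
amines = ('N' + 'C' * x for x in range(1000))

library = lazy_pairs(acids, amines)    # no combination has been built yet
first_four = list(islice(library, 4))  # only four combinations are produced
print(first_four)
```

Because `library` is a generator, asking for four items does only four items' worth of pairing work, however large the full product would be.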
|
Adds hydrogens to the graph of a molecule.
ARGUMENTS:
- mol: the molecule to be modified
- explicitOnly: (optional) if this toggle is set, only explicit Hs will
be added to the molecule. Default value is 0 (add implicit and explicit Hs).
- addCoords: (optional) if this toggle is set, the Hs will have 3D coordinates
set. Default value is 0 (no 3D coords).
RETURNS: a new molecule with added Hs
NOTES:
- The original molecule is *not* modified.
- Much of the code assumes that Hs are not included in the molecular
topology, so be *very* careful with the molecule that comes back from
this function.
C++ signature:
AddHs(RDKit::ROMol mol, bool explicitOnly=False, bool addCoords=False) -> RDKit::ROMol*
|
Optimally (minimum RMSD) align a molecule to another molecule
The 3D transformation required to align the specified conformation in the probe molecule
to a specified conformation in the reference molecule is computed so that the root mean
squared distance between a specified set of atoms is minimized.
This transform is then applied to the specified conformation in the probe molecule.
ARGUMENTS
- prbMol molecule that is to be aligned
- refMol molecule used as the reference for the alignment
- prbCid ID of the conformation in the probe to be used
for the alignment (defaults to first conformation)
- refCid ID of the conformation in the ref molecule to which
the alignment is computed (defaults to first conformation)
- atomMap a vector of pairs of atom IDs (probe AtomId, ref AtomId)
used to compute the alignments. If this mapping is
not specified an attempt is made to generate one by
substructure matching
- weights Optionally specify weights for each of the atom pairs
- reflect if true reflect the conformation of the probe molecule
- maxIters maximum number of iterations used in minimizing the RMSD
RETURNS
RMSD value
C++ signature:
AlignMol(RDKit::ROMol {lvalue} prbMol, RDKit::ROMol refMol, int prbCid=-1, int refCid=-1, boost::python::api::object atomMap=[], boost::python::api::object weights=[], bool reflect=False, unsigned int maxIters=50) -> double
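The quantity being minimized can be written down in a few lines. The sketch below is plain Python and only evaluates the RMSD for a given atom map; it does not perform the alignment itself. The helper name `mapped_rmsd` is hypothetical, and normalizing by the sum of the weights is an assumption, since the text above does not say how weights enter the RMSD.

```python
import math


def mapped_rmsd(probe_xyz, ref_xyz, atom_map, weights=None):
    """RMSD over (probe index, ref index) pairs, optionally weighted.

    Assumption: the weighted sum of squared distances is divided by the
    sum of the weights.
    """
    if weights is None:
        weights = [1.0] * len(atom_map)
    total = 0.0
    for (p, r), w in zip(atom_map, weights):
        total += w * sum((a - b) ** 2 for a, b in zip(probe_xyz[p], ref_xyz[r]))
    return math.sqrt(total / sum(weights))


probe = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ref = [(1.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
# Probe atom 0 corresponds to reference atom 1, and probe atom 1 to reference atom 0.
rmsd = mapped_rmsd(probe, ref, atom_map=[(0, 1), (1, 0)])
print(rmsd)
```

Each mapped pair here is exactly one unit apart along z, so the unaligned RMSD is 1.0; an alignment step would translate the probe to bring it to zero.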
|
Align conformations in a molecule to each other
The first conformation in the molecule is used as the reference
ARGUMENTS
- mol molecule of interest
- atomIds List of atom ids to use as points for alignment - defaults to all atoms
- confIds Ids of conformations to align - defaults to all conformers
- weights Optionally specify weights for each of the atom pairs
- reflect if true reflect the conformation of the probe molecule
- maxIters maximum number of iterations used in minimizing the RMSD
RETURNS
RMSD value
C++ signature:
AlignMolConformers(RDKit::ROMol {lvalue} mol, boost::python::api::object atomIds=[], boost::python::api::object confIds=[], boost::python::api::object weights=[], bool reflect=False, unsigned int maxIters=50) -> void*
|
Does the CIP chirality assignment (R/S)
for the molecule's atoms.
Chiral atoms will have a property '_CIPCode' indicating
their chiral code.
ARGUMENTS:
- mol: the molecule to use
- cleanIt: (optional) if provided, atoms with a chiral specifier that aren't
actually chiral (e.g. atoms with duplicate substituents or only 2 substituents,
etc.) will have their chiral code set to CHI_UNSPECIFIED
- force: (optional) causes the calculation to be repeated, even if it has already
been done
C++ signature:
AssignAtomChiralCodes(RDKit::ROMol {lvalue} mol, bool cleanIt=False, bool force=False) -> void*
|
Does the CIP stereochemistry assignment (Z/E)
for the molecule's bonds.
Qualifying bonds will have a property '_CIPCode' indicating
their stereochemistry.
ARGUMENTS:
- mol: the molecule to use
- cleanIt: (optional) ignored
- force: (optional) causes the calculation to be repeated, even if it has already
been done
C++ signature:
AssignBondStereoCodes(RDKit::ROMol {lvalue} mol, bool cleanIt=False, bool force=False) -> void*
|
Construct a feature factory given a feature definition in a file
C++ signature:
BuildFeatureFactory(std::string) -> RDKit::MolChemicalFeatureFactory*
|
Construct a feature factory given a feature definition block
C++ signature:
BuildFeatureFactoryFromString(std::string) -> RDKit::MolChemicalFeatureFactory*
|
Canonicalize the orientation of a conformer so that its principal axes
around the specified center point coincide with the x, y, z axes
ARGUMENTS:
- conf : conformer of interest
- center : optional center point about which the principal axes are computed;
if not specified, the centroid of the conformer will be used
- normalizeCovar : Optionally normalize the covariance matrix by the number of atoms
C++ signature:
CanonicalizeConformer(RDKit::Conformer {lvalue} conf, RDGeom::Point3D const* center=None, bool normalizeCovar=False, bool ignoreHs=True) -> void*
|
Loop over the conformers in a molecule and canonicalize their orientation
C++ signature:
CanonicalizeMol(RDKit::ROMol {lvalue} mol, bool normalizeCovar=False, bool ignoreHs=True) -> void*
|
Compute 2D coordinates for a molecule.
The resulting coordinates are stored on each atom of the molecule
ARGUMENTS:
mol - the molecule of interest
canonOrient - orient the molecule in a canonical way
clearConfs - if true, all existing conformations on the molecule
will be cleared
coordMap - a dictionary mapping atom Ids -> Point2D objects
with starting coordinates for atoms that should
have their positions locked.
nFlipsPerSample - number of rotatable bonds that are
flipped at random at a time.
nSample - Number of random samplings of rotatable bonds.
sampleSeed - seed for the random sampling process.
permuteDeg4Nodes - allow permutation of bonds at a degree 4
node during the sampling process
RETURNS:
ID of the conformation added to the molecule
C++ signature:
Compute2DCoords(RDKit::ROMol {lvalue} mol, bool canonOrient=False, bool clearConfs=True, boost::python::dict {lvalue} coordMap={}, unsigned int nFlipsPerSample=0, unsigned int nSample=0, int sampleSeed=0, bool permuteDeg4Nodes=False) -> unsigned int
|
Compute 2D coordinates for a molecule such
that the inter-atom distances mimic those in a user-provided
distance matrix.
The resulting coordinates are stored on each atom of the molecule
ARGUMENTS:
mol - the molecule of interest
distMat - distance matrix that we want the 2D structure to mimic
canonOrient - orient the molecule in a canonical way
clearConfs - if true, all existing conformations on the molecule
will be cleared
weightDistMat - weight assigned in the cost function to mimicking
the distance matrix.
This must be in the range (0.0, 1.0). (1.0-weightDistMat)
is then the weight assigned to improving
the density of the 2D structure, i.e. trying to
make it spread out
nFlipsPerSample - number of rotatable bonds that are
flipped at random at a time.
nSample - Number of random samplings of rotatable bonds.
sampleSeed - seed for the random sampling process.
permuteDeg4Nodes - allow permutation of bonds at a degree 4
node during the sampling process
RETURNS:
ID of the conformation added to the molecule
C++ signature:
Compute2DCoordsMimicDistmat(RDKit::ROMol {lvalue} mol, boost::python::api::object distMat, bool canonOrient=False, bool clearConfs=True, double weightDistMat=0.5, unsigned int nFlipsPerSample=3, unsigned int nSample=100, int sampleSeed=100, bool permuteDeg4Nodes=True) -> unsigned int
|
Compute the transformation required to align a conformer so that
its principal axes line up with the x, y, z axes
The conformer itself is left unchanged
ARGUMENTS:
- conf : the conformer of interest
- center : optional center point to compute the principal axes around (defaults to the centroid)
- normalizeCovar : optionally normalize the covariance matrix by the number of atoms
C++ signature:
ComputeCanonicalTransform(RDKit::Conformer conf, RDGeom::Point3D const* center=None, bool normalizeCovar=False, bool ignoreHs=True) -> _object*
|
Compute the centroid of the conformation - hydrogens are ignored and no attention
is paid to the difference in sizes of the heavy atoms
C++ signature:
ComputeCentroid(RDKit::Conformer conf, bool ignoreHs=True) -> RDGeom::Point3D
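As an illustration of what "hydrogens are ignored" and "no attention is paid to sizes" mean in practice, here is a plain-Python sketch that averages heavy-atom positions with equal weight. The function `centroid` and its inputs (a list of element symbols and a list of coordinate tuples) are hypothetical stand-ins for a conformer, not the RDKit data structures.

```python
def centroid(symbols, coords, ignore_hs=True):
    """Unweighted mean position; hydrogens are skipped when ignore_hs is set."""
    kept = [xyz for sym, xyz in zip(symbols, coords)
            if not (ignore_hs and sym == 'H')]
    n = len(kept)
    # zip(*kept) regroups the points into x, y and z columns.
    return tuple(sum(axis) / n for axis in zip(*kept))


symbols = ['C', 'O', 'H']
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 10.0, 10.0)]

heavy_center = centroid(symbols, coords)                 # C and O only
all_center = centroid(symbols, coords, ignore_hs=False)  # H included
print(heavy_center, all_center)
```

Carbon and oxygen count equally here, which is the "no attention to size" part: the result is a geometric center, not a center of mass.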
|
Compute the lower and upper corners of a cuboid that will fit the conformer
C++ signature:
ComputeConfBox(RDKit::Conformer conf, boost::python::api::object trans=None, double padding=2.0) -> boost::python::tuple
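A minimal sketch of the box computation, in plain Python. It assumes that `padding` is added on every side of the tight bounding box and it ignores the optional `trans` argument; neither detail is stated above, so treat both as assumptions. `conf_box` is a hypothetical name.

```python
def conf_box(coords, padding=2.0):
    """Return (lower corner, upper corner) of an axis-aligned padded box."""
    xs, ys, zs = zip(*coords)
    lower = (min(xs) - padding, min(ys) - padding, min(zs) - padding)
    upper = (max(xs) + padding, max(ys) + padding, max(zs) + padding)
    return lower, upper


coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.0, -0.5)]
lower, upper = conf_box(coords)
print(lower, upper)
```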
|
Compute the size of the box that can fit the conformations, and offset
of the box from the origin
C++ signature:
ComputeConfDimsAndOffset(RDKit::Conformer conf, boost::python::api::object trans=None, double padding=2.0) -> boost::python::tuple
|
Compute Gasteiger partial charges for molecule
The charges are computed using an iterative procedure presented in
Ref : J. Gasteiger, M. Marsili, Iterative Partial Equalization of Orbital Electronegativity -
A Rapid Access to Atomic Charges, Tetrahedron Vol 36 p3219 1980
The computed charges are stored on each atom as a computed property (under the name
_GasteigerCharge). In addition, each atom also stores the total charge for the implicit hydrogens
on the atom (under the property name _GasteigerHCharge)
ARGUMENTS:
- mol : the molecule of interest
- nIter : number of iterations (defaults to 12)
- throwOnParamFailure : toggles whether or not an exception should be raised if parameters
for an atom cannot be found. If this is false (the default), all parameters for unknown
atoms will be set to zero. This has the effect of removing that atom from the iteration.
C++ signature:
ComputeGasteigerCharges(RDKit::ROMol const* mol, int nIter=12, bool throwOnParamFailure=False) -> void*
|
Compute the union of two boxes, so that all the points in both boxes are
contained in the new box
C++ signature:
ComputeUnionBox(boost::python::tuple, boost::python::tuple) -> boost::python::tuple
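The union is a per-axis minimum and maximum. The plain-Python sketch below assumes each box is a (lower corner, upper corner) pair of 3-tuples; the exact tuple layout used by the wrapped function is not given above, so this layout and the name `union_box` are assumptions.

```python
def union_box(box_a, box_b):
    """Smallest axis-aligned box containing both input boxes."""
    (lo_a, hi_a), (lo_b, hi_b) = box_a, box_b
    lower = tuple(min(a, b) for a, b in zip(lo_a, lo_b))
    upper = tuple(max(a, b) for a, b in zip(hi_a, hi_b))
    return lower, upper


box_1 = ((0.0, 0.0, 0.0), (2.0, 2.0, 2.0))
box_2 = ((-1.0, 1.0, 0.5), (1.0, 3.0, 1.5))
merged = union_box(box_1, box_2)
print(merged)
```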
|
Returns a "Daylight"-type fingerprint for a molecule
Explanation of the algorithm below.
ARGUMENTS:
- mol: the molecule to use
- minPath: (optional) minimum number of bonds to include in the subgraphs
Defaults to 1.
- maxPath: (optional) maximum number of bonds to include in the subgraphs
Defaults to 7.
- fpSize: (optional) number of bits in the fingerprint
Defaults to 2048.
- nBitsPerHash: (optional) number of bits to set per path
Defaults to 4.
- useHs: (optional) include information about number of Hs on each
atom when calculating path hashes.
Defaults to 1.
- tgtDensity: (optional) fold the fingerprint until this minimum density has
been reached
Defaults to 0.
- minSize: (optional) the minimum size the fingerprint will be folded to when
trying to reach tgtDensity
Defaults to 128.
RETURNS: a DataStructs.ExplicitBitVect with _fpSize_ bits
ALGORITHM:
This algorithm works by finding all paths between minPath and maxPath in
length. For each path:
1) The Balaban J value is calculated.
2) The 32 bit Balaban J value is used to seed a random-number generator
3) _nBitsPerHash_ random numbers are generated and used to set the corresponding
bits in the fingerprint
C++ signature:
DaylightFingerprint(RDKit::ROMol mol, unsigned int minPath=1, unsigned int maxPath=7, unsigned int fpSize=2048, unsigned int nBitsPerHash=4, bool useHs=True, double tgtDensity=0.0, unsigned int minSize=128) -> ExplicitBitVect*
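Steps 2 and 3 of the algorithm can be sketched in plain Python. This is an illustration only: the per-path integers are made-up stand-ins for the values computed in step 1, Python's `random` module is not the generator RDKit uses, and `set_path_bits` is a hypothetical name.

```python
import random


def set_path_bits(path_hashes, fp_size=2048, n_bits_per_hash=4):
    """Return the set of bit positions switched on by the given path values."""
    bits = set()
    for h in path_hashes:
        rng = random.Random(h)  # the per-path value seeds the generator
        for _ in range(n_bits_per_hash):
            bits.add(rng.randrange(fp_size))
    return bits


fp_a = set_path_bits([101, 202, 303])
fp_b = set_path_bits([303, 101, 202])  # same paths, different order
print(sorted(fp_a))
```

Seeding per path makes the bits for a path depend only on that path, so the fingerprint does not depend on the order in which paths are found, and two molecules sharing a path share its bits.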
|
Removes atoms matching a substructure query from a molecule
ARGUMENTS:
- mol: the molecule to be modified
- query: the molecule to be used as a substructure query
- onlyFrags: (optional) if this toggle is set, atoms will only be removed if
the entire fragment in which they are found is matched by the query.
See below for examples.
Default value is 0 (remove the atoms whether or not the entire fragment matches)
RETURNS: a new molecule with the substructure removed
NOTES:
- The original molecule is *not* modified.
EXAMPLES:
The following examples use SMILES/SMARTS strings as shorthand for molecules; in real
code you must pass molecule objects:
- DeleteSubstructs('CCOC','OC') -> 'CC'
- DeleteSubstructs('CCOC','OC',1) -> 'CCOC'
- DeleteSubstructs('CCOCCl.Cl','Cl',1) -> 'CCOCCl'
- DeleteSubstructs('CCOCCl.Cl','Cl') -> 'CCOC'
C++ signature:
DeleteSubstructs(RDKit::ROMol mol, RDKit::ROMol query, bool onlyFrags=False) -> RDKit::ROMol*
|
Use distance geometry to obtain initial
coordinates for a molecule
ARGUMENTS:
- mol : the molecule of interest
- maxAttempts : the maximum number of attempts to try embedding
- randomSeed : provide a seed for the random number generator
so that the same coordinates can be obtained
for a molecule on multiple runs. The default
(-1) uses a random seed
- clearConfs : clear all existing conformations on the molecule
- useRandomCoords : Start the embedding from random coordinates instead of
using eigenvalues of the distance matrix.
- boxSizeMult : determines the size of the box that is used for
random coordinates. If this is a positive number, the
side length will equal the largest element of the distance
matrix times boxSizeMult. If this is a negative number,
the side length will equal -boxSizeMult (i.e. independent
of the elements of the distance matrix).
- randNegEig : If the embedding yields a negative eigenvalue,
pick coordinates that correspond
to this component at random
- numZeroFail : fail embedding if we have at least this many zero eigenvalues
RETURNS:
ID of the new conformation added to the molecule
C++ signature:
EmbedMolecule(RDKit::ROMol {lvalue} mol, unsigned int maxAttempts=30, int randomSeed=-1, bool clearConfs=True, bool useRandomCoords=False, double boxSizeMult=2.0, bool randNegEig=True, unsigned int numZeroFail=1) -> int
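The boxSizeMult rule described above reduces to a two-branch function. The sketch is plain Python with a hypothetical name, `random_box_side`; the behaviour for a value of exactly zero is not described in the text, so the sketch simply lets it fall into the "multiply" branch.

```python
def random_box_side(dist_matrix, box_size_mult=2.0):
    """Side length of the box used for random starting coordinates."""
    if box_size_mult < 0:
        # Negative values give a fixed side length, independent of the matrix.
        return -box_size_mult
    # Otherwise scale the largest element of the distance matrix.
    return max(max(row) for row in dist_matrix) * box_size_mult


dist_matrix = [
    [0.0, 1.5, 2.5],
    [1.5, 0.0, 1.5],
    [2.5, 1.5, 0.0],
]
scaled_side = random_box_side(dist_matrix)        # 2.5 * 2.0
fixed_side = random_box_side(dist_matrix, -10.0)  # fixed at 10.0
print(scaled_side, fixed_side)
```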
|
Use distance geometry to obtain multiple sets of
coordinates for a molecule
ARGUMENTS:
- mol : the molecule of interest
- numConfs : the number of conformers to generate
- maxAttempts : the maximum number of attempts to try embedding
- randomSeed : provide a seed for the random number generator
so that the same coordinates can be obtained
for a molecule on multiple runs. The default
(-1) uses a random seed
- clearConfs : clear all existing conformations on the molecule
- useRandomCoords : Start the embedding from random coordinates instead of
using eigenvalues of the distance matrix.
- boxSizeMult : determines the size of the box that is used for
random coordinates. If this is a positive number, the
side length will equal the largest element of the distance
matrix times boxSizeMult. If this is a negative number,
the side length will equal -boxSizeMult (i.e. independent
of the elements of the distance matrix).
- randNegEig : If the embedding yields a negative eigenvalue,
pick coordinates that correspond
to this component at random
- numZeroFail : fail embedding if we have at least this many zero eigenvalues
- pruneRmsThresh : Retain only the conformations out of 'numConfs'
after embedding that are at least
this far apart from each other.
RMSD is computed on the heavy atoms.
Pruning is greedy; i.e. the first embedded conformation
is retained and from then on only those that are at
least pruneRmsThresh away from all retained conformations
are kept. The pruning is done after embedding and
bounds violation minimization. No pruning by default.
RETURNS:
List of new conformation IDs
C++ signature:
EmbedMultipleConfs(RDKit::ROMol {lvalue} mol, unsigned int numConfs=10, unsigned int maxAttempts=10, int randomSeed=-1, bool clearConfs=True, bool useRandomCoords=False, double boxSizeMult=2.0, bool randNegEig=True, unsigned int numZeroFail=1, double pruneRmsThresh=-1.0) -> std::vector<int, std::allocator<int> >
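The greedy pruning described for pruneRmsThresh can be sketched in plain Python. This is an illustration under simplifying assumptions: conformers are plain lists of coordinate tuples, and `plain_rms` compares them without any alignment step, which the text above does not specify either way. Both function names are hypothetical.

```python
import math


def plain_rms(conf_a, conf_b):
    """RMS deviation between two coordinate lists, with no alignment."""
    sq = sum((p - q) ** 2
             for atom_a, atom_b in zip(conf_a, conf_b)
             for p, q in zip(atom_a, atom_b))
    return math.sqrt(sq / len(conf_a))


def greedy_prune(confs, rms_thresh, rms=plain_rms):
    """Keep the first conformer, then each one far enough from all kept ones."""
    kept = []
    for idx, conf in enumerate(confs):
        if all(rms(conf, confs[k]) >= rms_thresh for k in kept):
            kept.append(idx)
    return kept


confs = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)],  # nearly identical to conformer 0
    [(0.0, 0.0, 0.0), (0.0, 2.0, 0.0)],  # clearly different
    [(0.0, 0.0, 0.0), (0.0, 2.1, 0.0)],  # nearly identical to conformer 2
]
kept = greedy_prune(confs, rms_thresh=0.5)
print(kept)
```

Because the procedure is greedy, the result depends on the order of the conformers: an earlier conformer always wins over a later one that is close to it.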
|
Encode the shape of a molecule (one of its conformers) onto a grid
ARGUMENTS:
- mol : the molecule of interest
- grid : grid onto which the encoding is written
- confId : id of the conformation of interest on mol (defaults to the first one)
- trans : any transformation that needs to be used to encode onto the grid (note the molecule remains unchanged)
- vdwScale : Scaling factor for the radius of the atoms to determine the base radius
used in the encoding - grid points inside this sphere carry the maximum occupancy
- stepSize : thickness of the layers outside the base radius, the occupancy value is decreased
from layer to layer from the maximum value
- maxLayers : the maximum number of layers - defaults to the number allowed by the number of bits
used per grid point - e.g. two bits per grid point will allow 3 layers
- ignoreHs : when set, the contribution of Hs to the shape will be ignored
C++ signature:
EncodeShape(RDKit::ROMol mol, RDGeom::UniformGrid3D {lvalue} grid, int confId=-1, boost::python::api::object trans=None, double vdwScale=0.80000000000000004, double stepSize=0.25, int maxLayers=-1, bool ignoreHs=True) -> void*
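The layer scheme can be pictured with a small plain-Python sketch for a single atom. It follows the description above (maximum occupancy inside the scaled radius, lower values in each shell of thickness stepSize), but two details are assumptions rather than documented behaviour: that the maximum occupancy equals the number of layers, and that the value drops by exactly one per shell. The name `occupancy` is hypothetical.

```python
import math


def occupancy(dist, vdw_radius, vdw_scale=0.8, step_size=0.25, bits_per_point=2):
    """Occupancy of a grid point at distance `dist` from an atom center."""
    max_layers = 2 ** bits_per_point - 1  # two bits allow 3 layers
    base_radius = vdw_scale * vdw_radius
    if dist <= base_radius:
        return max_layers                 # inside the base sphere
    # Number of shells of thickness step_size between the point and the sphere.
    shells_out = math.ceil((dist - base_radius) / step_size)
    return max(max_layers - shells_out, 0)


# An atom with a van der Waals radius of 1.7 has a base radius of 1.36 here.
values = [occupancy(d, vdw_radius=1.7) for d in (1.0, 1.5, 1.7, 3.0)]
print(values)
```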
|
Finds all paths of a particular length in a molecule
ARGUMENTS:
- mol: the molecule to use
- length: an integer with the target length for the paths.
- useBonds: (optional) toggles the use of bond indices in the paths.
Otherwise atom indices are used. *Note* this behavior is different
from that for subgraphs.
Defaults to 1.
RETURNS: a tuple of tuples with IDs for the bonds.
NOTES:
- Difference between _subgraphs_ and _paths_ ::
Subgraphs are potentially branched, whereas paths (in our
terminology at least) cannot be. So, the following graph:
C--0--C--1--C--3--C
      |
      2
      |
      C
has 3 _subgraphs_ of length 3: (0,1,2),(0,1,3),(2,1,3)
but only 2 _paths_ of length 3: (0,1,3),(2,1,3)
C++ signature:
FindAllPathsOfLengthN(RDKit::ROMol mol, unsigned int length, bool useBonds=True, bool useHs=False) -> std::list<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > >
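The subgraph/path distinction can be checked by brute force on the example graph from the NOTES above. This plain-Python sketch is not how RDKit searches; it simply tests every combination of three bonds. The bond table encodes the drawing, with bond 2 hanging off the second carbon, and the atom numbering is an arbitrary choice made for the sketch.

```python
from itertools import combinations

# bond index -> (atom, atom) for the example graph
BONDS = {0: (0, 1), 1: (1, 2), 2: (1, 4), 3: (2, 3)}


def is_connected(bond_ids):
    """True if the bonds form a single connected piece."""
    remaining = set(bond_ids)
    atoms = set(BONDS[remaining.pop()])
    grew = True
    while grew:
        grew = False
        for b in list(remaining):
            if atoms & set(BONDS[b]):
                atoms |= set(BONDS[b])
                remaining.discard(b)
                grew = True
    return not remaining


def is_path(bond_ids):
    """True if a connected bond set is unbranched and open."""
    degree = {}
    for b in bond_ids:
        for atom in BONDS[b]:
            degree[atom] = degree.get(atom, 0) + 1
    # No atom has more than two bonds, and n bonds span n + 1 atoms (no ring).
    return max(degree.values()) <= 2 and len(degree) == len(bond_ids) + 1


subgraphs = [c for c in combinations(sorted(BONDS), 3) if is_connected(c)]
paths = [c for c in subgraphs if is_path(c)]
print(subgraphs)
print(paths)
```

The bond set (0, 1, 2) is rejected as a path because the second carbon carries three of its bonds; the tuple written as (2,1,3) in the NOTES appears here in sorted order as (1, 2, 3).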
|
Finds all subgraphs of a particular length in a molecule
ARGUMENTS:
- mol: the molecule to use
- length: an integer with the target number of bonds for the subgraphs.
- useHs: (optional) toggles whether or not bonds to Hs that are part of the graph
should be included in the results.
Defaults to 0.
- verbose: (optional, internal use) toggles verbosity in the search algorithm.
Defaults to 0.
RETURNS: a tuple of tuples with bond IDs
NOTES:
- Difference between _subgraphs_ and _paths_ ::
Subgraphs are potentially branched, whereas paths (in our
terminology at least) cannot be. So, the following graph:
C--0--C--1--C--3--C
      |
      2
      |
      C
has 3 _subgraphs_ of length 3: (0,1,2),(0,1,3),(2,1,3)
but only 2 _paths_ of length 3: (0,1,3),(2,1,3)
C++ signature:
FindAllSubgraphsOfLengthN(RDKit::ROMol mol, unsigned int length, bool useHs=False) -> std::list<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > >
|
Finds unique subgraphs of a particular length in a molecule
ARGUMENTS:
- mol: the molecule to use
- length: an integer with the target number of bonds for the subgraphs.
- useHs: (optional) toggles whether or not bonds to Hs that are part of the graph
should be included in the results.
Defaults to 0.
- useBO: (optional) Toggles use of bond orders in distinguishing one subgraph from
another.
Defaults to 1.
RETURNS: a tuple of tuples with bond IDs
C++ signature:
FindUniqueSubgraphsOfLengthN(RDKit::ROMol mol, unsigned int length, bool useHs=False, bool useBO=True) -> std::list<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > >
|
Returns the molecule's adjacency matrix.
ARGUMENTS:
- mol: the molecule to use
- useBO: (optional) toggles use of bond orders in calculating the matrix.
Default value is 0.
- emptyVal: (optional) sets the elements of the matrix between non-adjacent atoms
Default value is 0.
- force: (optional) forces the calculation to proceed, even if there is a cached value.
Default value is 0.
- prefix: (optional, internal use) sets the prefix used in the property cache
Default value is '' (the empty string).
RETURNS: a Numeric array of floats containing the adjacency matrix
C++ signature:
GetAdjacencyMatrix(RDKit::ROMol {lvalue} mol, bool useBO=False, int emptyVal=0, bool force=False, char const* prefix='') -> _object*
|
Compute the transformation required to align a molecule
The 3D transformation required to align the specified conformation in the probe molecule
to a specified conformation in the reference molecule is computed so that the root mean
squared distance between a specified set of atoms is minimized.
ARGUMENTS
- prbMol molecule that is to be aligned
- refMol molecule used as the reference for the alignment
- prbCid ID of the conformation in the probe to be used
for the alignment (defaults to first conformation)
- refCid ID of the conformation in the ref molecule to which
the alignment is computed (defaults to first conformation)
- atomMap a vector of pairs of atom IDs (probe AtomId, ref AtomId)
used to compute the alignments. If this mapping is
not specified an attempt is made to generate one by
substructure matching
- weights Optionally specify weights for each of the atom pairs
- reflect if true reflect the conformation of the probe molecule
- maxIters maximum number of iterations used in minimizing the RMSD
RETURNS
a tuple of (RMSD value, transform matrix)
C++ signature:
GetAlignmentTransform(RDKit::ROMol prbMol, RDKit::ROMol refMol, int prbCid=-1, int refCid=-1, boost::python::api::object atomMap=[], boost::python::api::object weights=[], bool reflect=False, unsigned int maxIters=50) -> _object*
|
Returns an empty list if any of the features passed in share an atom.
Otherwise a list of lists of atom indices is returned.
C++ signature:
GetAtomMatch(boost::python::api::object featMatch, int maxAts=1024) -> boost::python::api::object
|
Returns the molecule's distance matrix.
ARGUMENTS:
- mol: the molecule to use
- useBO: (optional) toggles use of bond orders in calculating the distance matrix.
Default value is 0.
- useAtomWts: (optional) toggles using atom weights for the diagonal elements of the
matrix (to return a "Balaban" distance matrix).
Default value is 0.
- force: (optional) forces the calculation to proceed, even if there is a cached value.
Default value is 0.
- prefix: (optional, internal use) sets the prefix used in the property cache
Default value is '' (the empty string).
RETURNS: a Numeric array of floats with the distance matrix
C++ signature:
GetDistanceMatrix(RDKit::ROMol {lvalue} mol, bool useBO=False, bool useAtomWts=False, bool force=False, char const* prefix='') -> _object*
|
Returns the formal charge for the molecule.
ARGUMENTS:
- mol: the molecule to use
C++ signature:
GetFormalCharge(RDKit::ROMol) -> int
|
Finds the disconnected fragments from a molecule.
For example, for the molecule 'CC(=O)[O-].[NH3+]C' GetMolFrags() returns
((0, 1, 2, 3), (4, 5))
ARGUMENTS:
- mol: the molecule to use
RETURNS: a tuple of tuples with IDs for the atoms in each fragment.
C++ signature:
GetMolFrags(RDKit::ROMol) -> std::vector<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > >
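Finding disconnected fragments is a connected-components search over the bond graph. The plain-Python sketch below reproduces the example above by hand: the atom count and bond list for 'CC(=O)[O-].[NH3+]C' are typed in directly rather than parsed, and `mol_frags` is a hypothetical stand-in, not the RDKit function.

```python
def mol_frags(n_atoms, bonds):
    """Group atom indices into connected fragments using a depth-first search."""
    neighbors = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    seen = set()
    frags = []
    for start in range(n_atoms):
        if start in seen:
            continue
        stack, frag = [start], []
        seen.add(start)
        while stack:
            atom = stack.pop()
            frag.append(atom)
            for nbr in neighbors[atom]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        frags.append(tuple(sorted(frag)))
    return tuple(frags)


# CC(=O)[O-].[NH3+]C : atoms 0-3 form the acetate, atoms 4-5 the methylammonium.
frags = mol_frags(6, [(0, 1), (1, 2), (1, 3), (4, 5)])
print(frags)
```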
|
Returns the distance bounds matrix for a molecule
ARGUMENTS:
- mol : the molecule of interest
- set15bounds : set bounds for 1-5 atom distances based on
topology (otherwise stop at 1-4s)
- scaleVDW : scale down the sum of VDW radii when setting the
lower bounds for atoms less than 5 bonds apart
RETURNS:
the bounds matrix as a Numeric array with lower bounds in
the lower triangle and upper bounds in the upper triangle
C++ signature:
GetMoleculeBoundsMatrix(RDKit::ROMol {lvalue} mol, bool set15bounds=True, bool scaleVDW=False) -> _object*
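Reading a pair of bounds out of the combined matrix only requires ordering the two indices. The sketch below uses a nested list in place of the returned array, with made-up numbers chosen for illustration; `get_bounds` is a hypothetical helper.

```python
def get_bounds(bounds, i, j):
    """Return (lower, upper) distance bounds for atoms i and j."""
    lo_idx, hi_idx = min(i, j), max(i, j)
    lower = bounds[hi_idx][lo_idx]  # row > column: lower triangle
    upper = bounds[lo_idx][hi_idx]  # row < column: upper triangle
    return lower, upper


# Illustrative values only, not real bounds for any molecule.
bounds = [
    [0.0, 1.6, 2.7],
    [1.4, 0.0, 1.6],
    [2.2, 1.4, 0.0],
]
pair_bounds = get_bounds(bounds, 0, 2)
print(pair_bounds)
```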
|
Returns the application's PeriodicTable instance.
C++ signature:
GetPeriodicTable(void) -> RDKit::PeriodicTable*
|
Get the smallest set of simple rings for a molecule.
ARGUMENTS:
- mol: the molecule to use.
RETURNS: the number of rings found
This will be equal to NumBonds-NumAtoms+1 for single-fragment molecules.
C++ signature:
GetSSSR(RDKit::ROMol {lvalue}) -> int
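The ring-count relation quoted above is easy to check by hand. The sketch below is plain Python; the `n_frags` argument extends the single-fragment formula using the standard graph-theory result for the number of independent cycles, which is an addition to what the text states. The atom and bond counts are for heavy atoms only.

```python
def expected_ring_count(n_atoms, n_bonds, n_frags=1):
    """Number of independent rings: bonds - atoms + fragments."""
    return n_bonds - n_atoms + n_frags


benzene = expected_ring_count(n_atoms=6, n_bonds=6)
naphthalene = expected_ring_count(n_atoms=10, n_bonds=11)
cubane = expected_ring_count(n_atoms=8, n_bonds=12)
print(benzene, naphthalene, cubane)
```

Cubane gives 5 by this formula even though the cube has 6 equivalent faces, which is the kind of highly symmetric case where a symmetrized ring set can be larger than the SSSR.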
|
Get a symmetrized SSSR for a molecule.
The symmetrized SSSR is at least as large as the SSSR for a molecule.
In certain highly-symmetric cases (e.g. cubane), the symmetrized SSSR can be
a bit larger (i.e. the number of symmetrized rings is >= NumBonds-NumAtoms+1).
ARGUMENTS:
- mol: the molecule to use.
RETURNS: a sequence of integer sequences, one per ring found
C++ signature:
GetSymmSSSR(RDKit::ROMol {lvalue}) -> std::vector<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > >
|
Kekulizes the molecule
ARGUMENTS:
- mol: the molecule to use
- clearAromaticFlags: (optional) if this toggle is set, all atoms and bonds in the
molecule will be marked non-aromatic following the kekulization.
Default value is 0.
NOTES:
- The molecule is modified in place.
C++ signature:
Kekulize(RDKit::ROMol {lvalue} mol, bool clearAromaticFlags=False) -> void*
|
Construct a molecule from a Mol block.
ARGUMENTS:
- molBlock: string containing the Mol block
- sanitize: (optional) toggles sanitization of the molecule.
Defaults to 1.
- removeHs: (optional) toggles removing hydrogens from the molecule.
This only makes sense when sanitization is done.
Defaults to true.
RETURNS:
a Mol object, None on failure.
C++ signature:
MolFromMolBlock(std::string molBlock, bool sanitize=True, bool removeHs=True) -> RDKit::ROMol*
|
Construct a molecule from a Mol file.
ARGUMENTS:
- fileName: name of the file to read
- sanitize: (optional) toggles sanitization of the molecule.
Defaults to true.
- removeHs: (optional) toggles removing hydrogens from the molecule.
This only makes sense when sanitization is done.
Defaults to true.
RETURNS:
a Mol object, None on failure.
C++ signature:
MolFromMolFile(char const* molFileName, bool sanitize=True, bool removeHs=True) -> RDKit::ROMol*
|
Construct a molecule from a SMARTS string.
ARGUMENTS:
- SMARTS: the smarts string
- mergeHs: (optional) toggles the merging of explicit Hs in the query into the attached
atoms. So, for example, 'C[H]' becomes '[C;!H0]'.
Defaults to 0.
RETURNS:
a Mol object, None on failure.
C++ signature:
MolFromSmarts(char const* SMARTS, bool mergeHs=False) -> RDKit::ROMol*
|
Construct a molecule from a SMILES string.
ARGUMENTS:
- SMILES: the smiles string
- sanitize: (optional) toggles sanitization of the molecule.
Defaults to 1.
RETURNS:
a Mol object, None on failure.
C++ signature:
MolFromSmiles(std::string SMILES, bool sanitize=True) -> RDKit::ROMol*
|
Construct a molecule from a TPL block.
ARGUMENTS:
- tplBlock: string containing the TPL block
- sanitize: (optional) toggles sanitization of the molecule.
Defaults to True.
- skipFirstConf: (optional) skips reading the first conformer.
Defaults to False.
This should be set to True when reading TPLs written by
the CombiCode.
RETURNS:
a Mol object, None on failure.
C++ signature:
MolFromTPLBlock(std::string tplBlock, bool sanitize=True, bool skipFirstConf=False) -> RDKit::ROMol*
|
Construct a molecule from a TPL file.
ARGUMENTS:
- fileName: name of the file to read
- sanitize: (optional) toggles sanitization of the molecule.
Defaults to True.
- skipFirstConf: (optional) skips reading the first conformer.
Defaults to False.
This should be set to True when reading TPLs written by
the CombiCode.
RETURNS:
a Mol object, None on failure.
C++ signature:
MolFromTPLFile(char const* fileName, bool sanitize=True, bool skipFirstConf=False) -> RDKit::ROMol*
|
Returns a Mol block for a molecule
ARGUMENTS:
- mol: the molecule
- includeStereo: (optional) toggles inclusion of stereochemical
information in the output
- confId: (optional) selects which conformation to output (-1 = default)
RETURNS:
a string
C++ signature:
MolToMolBlock(RDKit::ROMol mol, bool includeStereo=False, int confId=-1) -> std::string
|
Returns a SMARTS string for a molecule
ARGUMENTS:
- mol: the molecule
- isomericSmarts: (optional) include information about stereochemistry in
the SMARTS. Defaults to false.
RETURNS:
a string
C++ signature:
MolToSmarts(RDKit::ROMol {lvalue} mol, bool isomericSmiles=False) -> std::string
|
Returns the canonical SMILES string for a molecule
ARGUMENTS:
- mol: the molecule
- isomericSmiles: (optional) include information about stereochemistry in
the SMILES. Defaults to false.
- kekuleSmiles: (optional) use the Kekule form (no aromatic bonds) in
the SMILES. Defaults to false.
- rootedAtAtom: (optional) if non-negative, this forces the SMILES
to start at a particular atom. Defaults to -1.
RETURNS:
a string
C++ signature:
MolToSmiles(RDKit::ROMol {lvalue} mol, bool isomericSmiles=False, bool kekuleSmiles=False, int rootedAtAtom=-1) -> std::string
|
Returns the Tpl block for a molecule.
ARGUMENTS:
- mol: the molecule
- partialChargeProp: name of the property to use for partial charges
Defaults to '_GasteigerCharge'.
- writeFirstConfTwice: Defaults to False.
This should be set to True when writing TPLs to be read by
the CombiCode.
RETURNS:
a string
C++ signature:
MolToTPLBlock(RDKit::ROMol mol, std::string partialChargeProp='_GasteigerCharge', bool writeFirstConfTwice=False) -> std::string
|
Writes a molecule to a TPL file.
ARGUMENTS:
- mol: the molecule
- fileName: name of the file to write
- partialChargeProp: name of the property to use for partial charges
Defaults to '_GasteigerCharge'.
- writeFirstConfTwice: Defaults to False.
This should be set to True when writing TPLs to be read by
the CombiCode.
C++ signature:
MolToTPLFile(RDKit::ROMol mol, std::string fileName, std::string partialChargeProp='_GasteigerCharge', bool writeFirstConfTwice=False) -> void*
|
Returns an RDKit topological fingerprint for a molecule
Explanation of the algorithm below.
ARGUMENTS:
- mol: the molecule to use
- minPath: (optional) minimum number of bonds to include in the subgraphs
Defaults to 1.
- maxPath: (optional) maximum number of bonds to include in the subgraphs
Defaults to 7.
- fpSize: (optional) number of bits in the fingerprint
Defaults to 2048.
- nBitsPerHash: (optional) number of bits to set per path
Defaults to 4.
- useHs: (optional) include information about number of Hs on each
atom when calculating path hashes.
Defaults to 1.
- tgtDensity: (optional) fold the fingerprint until this minimum density has
been reached
Defaults to 0.
- minSize: (optional) the minimum size the fingerprint will be folded to when
trying to reach tgtDensity
Defaults to 128.
RETURNS: a DataStructs.ExplicitBitVect with _fpSize_ bits
ALGORITHM:
This algorithm works by finding all paths between minPath and maxPath in
length. For each path:
1) A hash is calculated.
2) The hash is used to seed a random-number generator
3) _nBitsPerHash_ random numbers are generated and used to set the corresponding
bits in the fingerprint
C++ signature:
RDKFingerprint(RDKit::ROMol mol, unsigned int minPath=1, unsigned int maxPath=7, unsigned int fpSize=2048, unsigned int nBitsPerHash=4, bool useHs=True, double tgtDensity=0.0, unsigned int minSize=128) -> ExplicitBitVect*
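The tgtDensity and minSize arguments describe a folding loop. The plain-Python sketch below assumes that each fold halves the fingerprint and ORs the two halves together, and that density means the fraction of bits set; the text above does not spell out either point, so both are assumptions, and `fold_to_density` is a hypothetical name.

```python
def fold_to_density(bits, fp_size, tgt_density, min_size):
    """Fold a set of on-bit positions until it is dense enough or minimal."""
    bits = set(bits)
    while len(bits) / fp_size < tgt_density and fp_size // 2 >= min_size:
        fp_size //= 2
        # OR-ing the upper half onto the lower half maps bit b to b % fp_size.
        bits = {b % fp_size for b in bits}
    return bits, fp_size


on_bits = {0, 5, 1024}
folded, size = fold_to_density(on_bits, fp_size=2048, tgt_density=0.01, min_size=128)
unfolded, full_size = fold_to_density(on_bits, fp_size=2048, tgt_density=0.0, min_size=128)
print(sorted(folded), size)
```

With the default tgtDensity of 0 the loop never runs, so the fingerprint keeps its full fpSize. Folding can merge bits (0 and 1024 collide in this example), which raises density at the cost of resolution.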
|
construct a ChemicalReaction from a string in MDL rxn format
C++ signature:
ReactionFromRxnBlock(std::string) -> RDKit::ChemicalReaction*
|
construct a ChemicalReaction from an MDL rxn file
C++ signature:
ReactionFromRxnFile(std::string) -> RDKit::ChemicalReaction*
|
construct a ChemicalReaction from a reaction SMARTS string
C++ signature:
ReactionFromSmarts(std::string) -> RDKit::ChemicalReaction*
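A minimal usage sketch, reusing the amide-coupling SMARTS from the EnumerateLibraryFromReaction example earlier on this page. It assumes the `rdkit` package layout of current releases rather than the bare `import Chem` form, and the helper name `couple` is hypothetical. The import is guarded so the snippet does nothing when RDKit is not installed.

```python
try:
    from rdkit import Chem
    from rdkit.Chem import AllChem
except ImportError:  # RDKit is not installed; the demo below is skipped
    Chem = AllChem = None

AMIDE_SMARTS = "[O:2]=[C:1][OH].[N:3]>>[O:2]=[C:1][N:3]"


def couple(acid_smiles, amine_smiles):
    """Build the reaction from SMARTS and run it on one acid/amine pair."""
    rxn = AllChem.ReactionFromSmarts(AMIDE_SMARTS)
    reactants = (Chem.MolFromSmiles(acid_smiles),
                 Chem.MolFromSmiles(amine_smiles))
    # RunReactants returns a tuple of product sets; each set is itself a
    # tuple with one molecule per product template.
    return rxn.RunReactants(reactants)


if Chem is not None:
    products = couple("OC(=O)C", "NCC")
    print(len(products), [p.GetNumAtoms() for p in products[0]])
```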
|
Removes any hydrogens from the graph of a molecule.
ARGUMENTS:
- mol: the molecule to be modified
- implicitOnly: (optional) if this toggle is set, only implicit Hs will
be removed from the graph. Default value is 0 (remove implicit and explicit Hs).
RETURNS: a new molecule with the Hs removed
NOTES:
- The original molecule is *not* modified.
C++ signature:
RemoveHs(RDKit::ROMol mol, bool implicitOnly=False) -> RDKit::ROMol*
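A short sketch of the NOTES above: RemoveHs returns a new molecule and leaves its argument alone. It assumes the `rdkit` package layout of current releases; the helper name is hypothetical, and the import is guarded so the snippet does nothing when RDKit is not installed.

```python
try:
    from rdkit import Chem
except ImportError:  # RDKit is not installed; the demo below is skipped
    Chem = None


def atom_counts_around_remove_hs(smiles):
    """Return (atoms with Hs, atoms after RemoveHs, atoms in the original)."""
    with_hs = Chem.AddHs(Chem.MolFromSmiles(smiles))
    before = with_hs.GetNumAtoms()
    stripped = Chem.RemoveHs(with_hs)  # a new molecule is returned
    return before, stripped.GetNumAtoms(), with_hs.GetNumAtoms()


if Chem is not None:
    print(atom_counts_around_remove_hs("CCO"))
```

For ethanol the three counts should be 9, 3 and 9: the last value shows that the molecule passed in still carries its hydrogens.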
|
Removes the core of a molecule and labels the sidechains with dummy atoms.
ARGUMENTS:
- mol: the molecule to be modified
- coreQuery: the molecule to be used as a substructure query for recognizing the core
- replaceDummies: toggles replacement of atoms that match dummies in the query
RETURNS: a new molecule with the core removed
NOTES:
- The original molecule is *not* modified.
EXAMPLES:
The following examples use SMILES/SMARTS strings in place of molecules; real
calls must pass molecule objects:
- ReplaceCore('CCC1CCC1','C1CCC1') -> 'CC[Xa]'
- ReplaceCore('CCC1CC1','C1CCC1') -> ''
- ReplaceCore('C1CC2C1CCC2','C1CCC1') -> '[Xa]C1CCC1[Xb]'
- ReplaceCore('C1CNCC1','N') -> '[Xa]CCCC[Xb]'
- ReplaceCore('C1CCC1CN','C1CCC1[*]',False) -> '[Xa]CN'
C++ signature:
ReplaceCore(RDKit::ROMol mol, RDKit::ROMol coreQuery, bool replaceDummies=True) -> RDKit::ROMol*
|
Replaces sidechains in a molecule with dummy atoms for their attachment points.
ARGUMENTS:
- mol: the molecule to be modified
- coreQuery: the molecule to be used as a substructure query for recognizing the core
RETURNS: a new molecule with the sidechains removed
NOTES:
- The original molecule is *not* modified.
EXAMPLES:
The following examples use SMILES/SMARTS strings in place of molecules; real
calls must pass molecule objects:
- ReplaceSidechains('CCC1CCC1','C1CCC1') -> '[Xa]C1CCC1'
- ReplaceSidechains('CCC1CC1','C1CCC1') -> ''
- ReplaceSidechains('C1CC2C1CCC2','C1CCC1') -> '[Xa]C1CCC1[Xb]'
C++ signature:
ReplaceSidechains(RDKit::ROMol mol, RDKit::ROMol coreQuery) -> RDKit::ROMol*
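The first EXAMPLES entry of ReplaceCore and of ReplaceSidechains can be tried with real molecule objects roughly as follows. This is a sketch assuming the `rdkit` package layout of current releases; the helper names are hypothetical, and newer releases may print the dummy atoms differently from the `[Xa]` notation used above, so the check only counts dummy atoms (atomic number 0).

```python
try:
    from rdkit import Chem
except ImportError:  # RDKit is not installed; the demo below is skipped
    Chem = None


def split_on_core(smiles, core_smarts):
    """Return (sidechains, scaffold) for a molecule and a core query."""
    mol = Chem.MolFromSmiles(smiles)
    core = Chem.MolFromSmarts(core_smarts)
    sidechains = Chem.ReplaceCore(mol, core)       # core removed
    scaffold = Chem.ReplaceSidechains(mol, core)   # sidechains removed
    return sidechains, scaffold


def count_dummies(mol):
    """Count attachment-point dummy atoms (atomic number 0)."""
    return sum(1 for atom in mol.GetAtoms() if atom.GetAtomicNum() == 0)


if Chem is not None:
    sidechains, scaffold = split_on_core("CCC1CCC1", "C1CCC1")
    print(Chem.MolToSmiles(sidechains), Chem.MolToSmiles(scaffold))
```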
|
Replaces atoms matching a substructure query in a molecule
ARGUMENTS:
- mol: the molecule to be modified
- query: the molecule to be used as a substructure query
- replacement: the molecule to be used as the replacement
- replaceAll: (optional) if this toggle is set, all substructures matching
the query will be replaced in a single result, otherwise each result will
contain a separate replacement.
Default value is False (return multiple replacements)
RETURNS: a tuple of new molecules with the substructures replaced
NOTES:
- The original molecule is *not* modified.
EXAMPLES:
The following examples use SMILES/SMARTS strings in place of molecules; real
calls must pass molecule objects:
- ReplaceSubstructs('CCOC','OC','NC') -> ('CCNC',)
- ReplaceSubstructs('COCCOC','OC','NC') -> ('COCCNC','CNCCOC')
- ReplaceSubstructs('COCCOC','OC','NC',True) -> ('CNCCNC',)
C++ signature:
ReplaceSubstructs(RDKit::ROMol mol, RDKit::ROMol query, RDKit::ROMol replacement, bool replaceAll=False) -> _object*
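The effect of replaceAll can be checked with a single-atom query, which keeps the match count unambiguous. This sketch assumes the `rdkit` package layout of current releases; the helper names are hypothetical, and it deliberately uses a simpler query (`O` replaced by `N`) than the EXAMPLES above.

```python
try:
    from rdkit import Chem
except ImportError:  # RDKit is not installed; the demo below is skipped
    Chem = None


def swap_oxygen_for_nitrogen(smiles, replace_all=False):
    """Replace ether oxygens with nitrogen; returns a tuple of molecules."""
    mol = Chem.MolFromSmiles(smiles)
    query = Chem.MolFromSmarts("O")
    replacement = Chem.MolFromSmiles("N")
    return Chem.ReplaceSubstructs(mol, query, replacement,
                                  replaceAll=replace_all)


def count_symbol(mol, symbol):
    """Count atoms with the given element symbol."""
    return sum(1 for atom in mol.GetAtoms() if atom.GetSymbol() == symbol)


if Chem is not None:
    one_at_a_time = swap_oxygen_for_nitrogen("COCCOC")
    all_at_once = swap_oxygen_for_nitrogen("COCCOC", replace_all=True)
    print(len(one_at_a_time), len(all_at_once))
```

With two oxygens in the input, the default call should return one molecule per match, while replaceAll=True returns a single molecule with both positions replaced.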
|
Kekulize, check valencies, set aromaticity, conjugation and hybridization
ARGUMENTS:
- mol: the molecule to be modified
NOTES:
- The molecule is modified in place.
- If sanitization fails, an exception will be thrown
C++ signature:
SanitizeMol(RDKit::ROMol {lvalue}) -> void*
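Both behaviours (in-place modification and an exception on failure) can be observed by parsing without sanitization first. This sketch assumes the `rdkit` package layout of current releases and that `MolFromSmiles` accepts `sanitize=False`; the helper name is hypothetical. The exact exception class is not stated on this page, so the sketch catches `Exception` broadly.

```python
try:
    from rdkit import Chem
except ImportError:  # RDKit is not installed; the demo below is skipped
    Chem = None


def sanitize_report(smiles):
    """Return (any aromatic before, all aromatic after), or None on failure."""
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    before = any(atom.GetIsAromatic() for atom in mol.GetAtoms())
    try:
        Chem.SanitizeMol(mol)  # modifies mol in place
    except Exception:
        return None
    after = all(atom.GetIsAromatic() for atom in mol.GetAtoms())
    return before, after


if Chem is not None:
    print(sanitize_report("C1=CC=CC=C1"))     # Kekule form of benzene
    print(sanitize_report("C(C)(C)(C)(C)C"))  # five-valent carbon
```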
|
Compute the shape protrude distance between two molecules based on a predefined alignment
ARGUMENTS:
- mol1 : The first molecule of interest
- mol2 : The second molecule of interest
- confId1 : Conformer in the first molecule (defaults to first conformer)
- confId2 : Conformer in the second molecule (defaults to first conformer)
- gridSpacing : resolution of the grid used to encode the molecular shapes
- bitsPerPoint : number of bits used to encode the occupancy at each grid point
defaults to two bits per grid point
- vdwScale : Scaling factor for the radius of the atoms to determine the base radius
used in the encoding - grid points inside this sphere carry the maximum occupancy
- stepSize : thickness of each layer outside the base radius, the occupancy value is decreased
from layer to layer from the maximum value
- maxLayers : the maximum number of layers - defaults to the number allowed by the number of bits
used per grid point - e.g. two bits per grid point will allow 3 layers
- ignoreHs : when set, the contribution of Hs to the shape will be ignored
- allowReordering : when set, the order will be automatically updated so that the value calculated
is the protrusion of the smaller shape from the larger one.
C++ signature:
ShapeProtrudeDist(RDKit::ROMol mol1, RDKit::ROMol mol2, int confId1=-1, int confId2=-1, double gridSpacing=0.5, RDKit::DiscreteValueVect::DiscreteValueType bitsPerPoint=DataStructs.cDataStructs.DiscreteValueType.TWOBITVALUE, double vdwScale=0.80000000000000004, double stepSize=0.25, int maxLayers=-1, bool ignoreHs=True, bool allowReordering=True) -> double
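To make the grid encoding concrete, here is a plain-Python sketch of a protrude distance computed on two flattened occupancy grids. It is an assumption-laden illustration rather than RDKit's code: the overlap is taken as the point-wise minimum of the occupancies, and `max_layers` generalises the "two bits allow 3 layers" remark to 2**bits - 1, which this page does not state explicitly.

```python
def max_layers(bits_per_point):
    """Largest occupancy value that fits in the given number of bits."""
    return 2 ** bits_per_point - 1


def protrude_distance(occ1, occ2, allow_reordering=True):
    """Fraction of shape 1's occupancy that is not covered by shape 2."""
    if len(occ1) != len(occ2):
        raise ValueError("grids must have the same number of points")
    total1, total2 = sum(occ1), sum(occ2)
    # Optionally measure the protrusion of the smaller shape from the larger.
    if allow_reordering and total2 < total1:
        occ1, occ2, total1 = occ2, occ1, total2
    if total1 == 0:
        raise ValueError("the reference shape is empty")
    overlap = sum(min(a, b) for a, b in zip(occ1, occ2))
    return (total1 - overlap) / total1


# Toy two-bit grids (values 0..3), flattened to one dimension.
small = [3, 3, 0, 0]
large = [3, 2, 1, 1]
print(max_layers(2))
print(protrude_distance(large, small))
print(protrude_distance(large, small, allow_reordering=False))
```

With reordering enabled the argument order does not matter, which mirrors the description of allowReordering above.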
|
Compute the shape tanimoto distance between two molecules based on a predefined alignment
ARGUMENTS:
- mol1 : The first molecule of interest
- mol2 : The second molecule of interest
- confId1 : Conformer in the first molecule (defaults to first conformer)
- confId2 : Conformer in the second molecule (defaults to first conformer)
- gridSpacing : resolution of the grid used to encode the molecular shapes
- bitsPerPoint : number of bits used to encode the occupancy at each grid point
defaults to two bits per grid point
- vdwScale : Scaling factor for the radius of the atoms to determine the base radius
used in the encoding - grid points inside this sphere carry the maximum occupancy
- stepSize : thickness of each layer outside the base radius, the occupancy value is decreased
from layer to layer from the maximum value
- maxLayers : the maximum number of layers - defaults to the number allowed by the number of bits
used per grid point - e.g. two bits per grid point will allow 3 layers
- ignoreHs : when set, the contribution of Hs to the shape will be ignored
C++ signature:
ShapeTanimotoDist(RDKit::ROMol mol1, RDKit::ROMol mol2, int confId1=-1, int confId2=-1, double gridSpacing=0.5, RDKit::DiscreteValueVect::DiscreteValueType bitsPerPoint=DataStructs.cDataStructs.DiscreteValueType.TWOBITVALUE, double vdwScale=0.80000000000000004, double stepSize=0.25, int maxLayers=-1, bool ignoreHs=True) -> double
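A Tanimoto distance on multi-valued occupancy grids can be sketched in plain Python as one minus the ratio of overlap to union. The definitions used here (point-wise minimum for the overlap, point-wise maximum for the union) are a common choice and an assumption on my part, not a statement of how RDKit computes the value.

```python
def tanimoto_distance(occ1, occ2):
    """1 - overlap/union for two flattened occupancy grids."""
    if len(occ1) != len(occ2):
        raise ValueError("grids must have the same number of points")
    overlap = sum(min(a, b) for a, b in zip(occ1, occ2))
    union = sum(max(a, b) for a, b in zip(occ1, occ2))
    if union == 0:
        raise ValueError("both shapes are empty")
    return 1.0 - overlap / union


# Toy two-bit grids (values 0..3), flattened to one dimension.
shape_a = [3, 3, 0, 0]
shape_b = [3, 2, 1, 1]
print(tanimoto_distance(shape_a, shape_b))
```

Identical grids give 0 and grids with no common occupied point give 1, so the value behaves as a distance rather than a similarity.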
|
C++ signature:
SmilesMolSupplierFromText(std::string text, std::string delimiter=' ', int smilesColumn=0, int nameColumn=1, bool titleLine=True, bool sanitize=True) -> RDKit::SmilesMolSupplier*
|
Transform the coordinates of a conformer
C++ signature:
TransformConformer(RDKit::Conformer {lvalue}, boost::python::api::object) -> void*
|
returns a UFF force field for a molecule
ARGUMENTS:
- mol : the molecule of interest
- vdwThresh : used to exclude long-range van der Waals interactions
(defaults to 10.0)
- confId : indicates which conformer to optimize
C++ signature:
UFFGetMoleculeForceField(RDKit::ROMol {lvalue} mol, double vdwThresh=10.0, int confId=-1) -> ForceFields::PyForceField*
|
Use UFF to optimize a molecule's structure
ARGUMENTS:
- mol : the molecule of interest
- maxIters : the maximum number of iterations (defaults to 200)
- vdwThresh : used to exclude long-range van der Waals interactions
(defaults to 10.0)
- confId : indicates which conformer to optimize
C++ signature:
UFFOptimizeMolecule(RDKit::ROMol {lvalue} self, int maxIters=200, double vdwThresh=10.0, int confId=-1) -> int
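A typical call sequence is sketched below: add hydrogens, generate one conformer, then optimize it. It assumes the `rdkit` package layout of current releases and uses `AllChem.EmbedMolecule` to obtain starting coordinates; the helper name is hypothetical. The meaning of the integer return value is not given on this page; in current RDKit documentation 0 means the optimization converged and 1 means more iterations are needed, so treat that reading as an assumption here.

```python
try:
    from rdkit import Chem
    from rdkit.Chem import AllChem
except ImportError:  # RDKit is not installed; the demo below is skipped
    Chem = AllChem = None


def optimize_with_uff(smiles, max_iters=200):
    """Embed one conformer and relax it with UFF; returns (mol, status)."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    conf_id = AllChem.EmbedMolecule(mol, randomSeed=42)
    if conf_id < 0:
        raise RuntimeError("could not generate starting coordinates")
    # The conformer is modified in place; the return value is a status code.
    status = AllChem.UFFOptimizeMolecule(mol, maxIters=max_iters)
    return mol, status


if Chem is not None:
    mol, status = optimize_with_uff("CCO")
    print(status, mol.GetNumConformers())
```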
|
Set the wedging on single bonds in a molecule.
The wedging scheme used is that from Mol files.
ARGUMENTS:
- molecule: the molecule to update
C++ signature:
WedgeMolBonds(RDKit::ROMol {lvalue}, RDKit::Conformer const*) -> void*
|
C++ signature:
tossit(void) -> void*
|
| Generated by Epydoc 3.0beta1 on Sat May 24 08:37:00 2008 | http://epydoc.sourceforge.net |