rdkit.Chem.rdRascalMCES module¶
Module containing an implementation of the RASCAL Maximum Common Edge Substructure (MCES) algorithm.
- rdkit.Chem.rdRascalMCES.FindMCES((rdkit.Chem.rdchem.Mol)mol1, (rdkit.Chem.rdchem.Mol)mol2[, (object)opts=None]) list :¶
Find one or more MCESs between the 2 molecules given. Returns a list of RascalResult objects.
- mol1
- mol2 The two molecules for which to find the MCES.
- opts Optional RascalOptions object changing the default run mode.
- C++ signature :
boost::python::list FindMCES(RDKit::ROMol,RDKit::ROMol [,boost::python::api::object=None])
- rdkit.Chem.rdRascalMCES.RascalButinaCluster((object)mols[, (object)opts=None]) list :¶
Use the RASCAL MCES similarity metric to do Butina clustering (Butina JCICS 39 747-750 (1999)). Returns a list of lists of molecules, each inner list being a cluster. The last cluster is all the molecules that didn’t fit into another cluster (the singletons).
- mols List of molecules to be clustered.
- opts Optional RascalClusterOptions object changing the default run mode. Only similarityCutoff is used.
- C++ signature :
boost::python::list RascalButinaCluster(boost::python::api::object [,boost::python::api::object=None])
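To make the clustering scheme concrete, here is a pure-Python sketch of Butina-style clustering over an arbitrary similarity function. It is not RDKit's implementation: the function name butina_cluster, the use of >= for the cutoff comparison, and the decision to append the singleton group only when it is non-empty are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def butina_cluster(items: Sequence[T],
                   sim: Callable[[T, T], float],
                   cutoff: float) -> List[List[T]]:
    """Butina-style clustering sketch; singletons are collected last."""
    n = len(items)
    # For each item, the indices of all other items at or above the cutoff.
    neighbours = [
        [j for j in range(n) if j != i and sim(items[i], items[j]) >= cutoff]
        for i in range(n)
    ]
    # Candidate centroids are visited in order of decreasing neighbour count.
    order = sorted(range(n), key=lambda i: len(neighbours[i]), reverse=True)

    assigned = [False] * n
    clusters: List[List[T]] = []
    singletons: List[T] = []
    for i in order:
        if assigned[i]:
            continue
        # The centroid takes every neighbour not already claimed.
        members = [i] + [j for j in neighbours[i] if not assigned[j]]
        for j in members:
            assigned[j] = True
        if len(members) == 1:
            singletons.append(items[i])
        else:
            clusters.append([items[j] for j in members])

    # Mirror the documented layout: leftovers form the final cluster.
    if singletons:
        clusters.append(singletons)
    return clusters


# Toy data: integers with a distance-based similarity in (0, 1].
toy = [10, 11, 12, 50, 51, 90]
toy_sim = lambda a, b: 1.0 / (1.0 + abs(a - b) / 10.0)
print(butina_cluster(toy, toy_sim, 0.8))
```

In RascalButinaCluster the role of sim is played by the RASCAL MCES similarity and the role of cutoff by similarityCutoff.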
- rdkit.Chem.rdRascalMCES.RascalCluster((object)mols[, (object)opts=None]) list :¶
Use the RASCAL MCES similarity metric to do fuzzy clustering. Returns a list of lists of molecules, each inner list being a cluster. The last cluster is all the molecules that didn’t fit into another cluster (the singletons).
- mols List of molecules to be clustered.
- opts Optional RascalClusterOptions object changing the default run mode.
- C++ signature :
boost::python::list RascalCluster(boost::python::api::object [,boost::python::api::object=None])
- class rdkit.Chem.rdRascalMCES.RascalClusterOptions((object)arg1) None :¶
Bases: instance
RASCAL Cluster Options. Most of these pertain to RascalCluster calculations. Only similarityCutoff is used by RascalButinaCluster.
- C++ signature :
void __init__(_object*)
- property a¶
The penalty score for each unconnected component in the MCES. Default=0.05.
- property b¶
The weight of matched bonds over matched atoms. Default=2.
- property clusterMergeSim¶
Two clusters are merged if the fraction of molecules they have in common is greater than this. Default=0.6.
- property maxNumFrags¶
The maximum number of fragments allowed in the MCES for each pair of molecules. This stops the MCES from being a lot of small fragments scattered around the molecules, which would give an inflated estimate of similarity. Default=2.
- property minFragSize¶
The minimum number of atoms in a fragment for it to be included in the MCES. Default=3.
- property minIntraClusterSim¶
Two pairs of molecules are included in the same cluster if the similarity between their MCESs is greater than this. Default=0.9.
- property numThreads¶
Number of threads to use during clustering. Default=-1 means all the hardware threads less one.
- property similarityCutoff¶
Similarity cutoff for molecules to be in the same cluster. Between 0.0 and 1.0, default=0.7.
- class rdkit.Chem.rdRascalMCES.RascalOptions((object)arg1) None :¶
Bases: instance
RASCAL Options
- C++ signature :
void __init__(_object*)
- property allBestMCESs¶
If True, reports all MCESs found of the same maximum size. Default False means just report the first found.
- property completeAromaticRings¶
If True (default), partial aromatic rings won’t be returned.
- property completeSmallestRings¶
If True (default is False), only complete rings present in both input molecules’ RingInfo will be returned. Implies completeAromaticRings and ringMatchesRingOnly.
- property equivalentAtoms¶
SMARTS strings defining atoms that should be considered equivalent, e.g. [F,Cl,Br,I] so all halogens will match each other. Space-separated list allowing more than 1 class of equivalent atoms.
- property exactConnectionsMatch¶
If True (default is False), atoms will only match if they have the same number of explicit connections. E.g. the central atom of C(C)(C) won’t match either atom in CC.
- property ignoreAtomAromaticity¶
If True, matches atoms solely on atomic number. If False, will treat aromatic and aliphatic atoms as different. Default=True.
- property ignoreBondOrders¶
If True, will treat all bonds as the same, irrespective of order. Default=False.
- property maxBestMCESs¶
Some pathological cases produce huge numbers of equivalent solutions that can crash the program due to memory depletion. This caps the number of such solutions to prevent this happening. Default=10000.
- property maxBondMatchPairs¶
Too many matching bond (vertex) pairs can cause the process to run out of memory. The default of 1000 is fairly safe. Increase with caution, as memory use increases with the square of this number.
- property maxFragSeparation¶
Maximum number of bonds between fragments in the MCES for both to be reported. Default -1 means no maximum. If exceeded, the smaller fragment will be removed.
- property minCliqueSize¶
Normally, the minimum clique size is specified via the similarityThreshold. Sometimes it’s more convenient to specify it directly. If this is > 0, it will over-ride the similarityThreshold. Note that this refers to the minimum number of BONDS in the MCES. Default=0.
- property minFragSize¶
Imposes a minimum on the number of atoms in a fragment that may be part of the MCES. Default -1 means no minimum.
- property returnEmptyMCES¶
If the estimated similarity between the 2 molecules doesn’t meet the similarityThreshold, no results are returned. If you want to know what the estimates were, set this to True, and examine the tier1Sim and tier2Sim properties of the result then returned.
- property ringMatchesRingOnly¶
If True (default is False), ring bonds won’t match non-ring bonds.
- property similarityThreshold¶
Threshold below which MCES won’t be run. Between 0.0 and 1.0, default=0.7.
- property singleLargestFrag¶
Return just the single largest fragment of the MCES. This is equivalent to running with allBestMCESs=True, finding the result with the largest largestFragmentSize, and calling its largestFragmentOnly method. This option may not produce the largest possible single fragment that the molecules have in common. If you definitely want that, you may be better off using rdFMCS.
- property timeout¶
Maximum time (in seconds) to spend on an individual MCES determination. Default 60, -1 means no limit.
- class rdkit.Chem.rdRascalMCES.RascalResult¶
Bases: instance
Used to return RASCAL MCES results.
Raises an exception. This class cannot be instantiated from Python.
- atomMatches((RascalResult)self) list :¶
Like bondMatches, but for atoms: returns the matching atoms in the MCES as tuples of atom indices from mol1 and mol2.
- C++ signature :
boost::python::list atomMatches(RDKit::RascalMCES::RascalResult)
- bondMatches((RascalResult)self) list :¶
A function returning a list of lists of tuples, each inner list containing the matching bonds in the MCES as tuples of bond indices from mol1 and mol2.
- C++ signature :
boost::python::list bondMatches(RDKit::RascalMCES::RascalResult)
- largestFragmentOnly((RascalResult)self) None :¶
Function that cuts the MCES down to the single largest frag. This cannot be undone.
- C++ signature :
void largestFragmentOnly(RDKit::RascalMCES::RascalResult {lvalue})
- property largestFragmentSize¶
Number of atoms in largest fragment.
- property numFragments¶
Number of fragments in MCES.
- property similarity¶
Johnson similarity between 2 molecules.
- property smartsString¶
SMARTS string defining the MCES.
- property tier1Sim¶
The tier 1 similarity estimate.
- property tier2Sim¶
The tier 2 similarity estimate.
- property timedOut¶
Whether it timed out.
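For orientation, the Johnson similarity as it is usually defined in the RASCAL literature is the squared size of the common subgraph (atoms plus bonds) divided by the product of the two molecule sizes. The helper below is a pure-Python sketch of that formula. Its name and argument names are hypothetical, and whether the similarity property is computed in exactly this way is an assumption here.

```python
def johnson_similarity(atoms_common: int, bonds_common: int,
                       atoms_1: int, bonds_1: int,
                       atoms_2: int, bonds_2: int) -> float:
    """(V_c + E_c)^2 / ((V_1 + E_1) * (V_2 + E_2))."""
    numerator = (atoms_common + bonds_common) ** 2
    denominator = (atoms_1 + bonds_1) * (atoms_2 + bonds_2)
    return numerator / denominator


# Two molecules of 9 atoms and 9 bonds each, sharing 8 atoms and 8 bonds.
print(round(johnson_similarity(8, 8, 9, 9, 9, 9), 4))

# A molecule compared with itself scores exactly 1.
print(johnson_similarity(9, 9, 9, 9, 9, 9))
```

Because bonds count as much as atoms in this measure, a common substructure that is split into several fragments loses bonds and therefore scores lower than a connected one covering the same atoms.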