rdkit.Chem.EnumerateStereoisomers module

rdkit.Chem.EnumerateStereoisomers.EnumerateStereoisomers(m, options=<rdkit.Chem.EnumerateStereoisomers.StereoEnumerationOptions object>, verbose=False)

returns a generator that yields possible stereoisomers for a molecule

Arguments:
  • m: the molecule to work with

  • options: parameters controlling the enumeration

  • verbose: toggles how verbose the output is

If m has stereogroups, they will be expanded

A small example with 3 chiral atoms and 1 chiral bond (16 theoretical stereoisomers):

>>> from rdkit import Chem
>>> from rdkit.Chem.EnumerateStereoisomers import EnumerateStereoisomers, StereoEnumerationOptions
>>> m = Chem.MolFromSmiles('BrC=CC1OC(C2)(F)C2(Cl)C1')
>>> isomers = tuple(EnumerateStereoisomers(m))
>>> len(isomers)
16
>>> for smi in sorted(Chem.MolToSmiles(x, isomericSmiles=True) for x in isomers):
...     print(smi)
...
F[C@@]12C[C@@]1(Cl)C[C@@H](/C=C/Br)O2
F[C@@]12C[C@@]1(Cl)C[C@@H](/C=C\Br)O2
F[C@@]12C[C@@]1(Cl)C[C@H](/C=C/Br)O2
F[C@@]12C[C@@]1(Cl)C[C@H](/C=C\Br)O2
F[C@@]12C[C@]1(Cl)C[C@@H](/C=C/Br)O2
F[C@@]12C[C@]1(Cl)C[C@@H](/C=C\Br)O2
F[C@@]12C[C@]1(Cl)C[C@H](/C=C/Br)O2
F[C@@]12C[C@]1(Cl)C[C@H](/C=C\Br)O2
F[C@]12C[C@@]1(Cl)C[C@@H](/C=C/Br)O2
F[C@]12C[C@@]1(Cl)C[C@@H](/C=C\Br)O2
F[C@]12C[C@@]1(Cl)C[C@H](/C=C/Br)O2
F[C@]12C[C@@]1(Cl)C[C@H](/C=C\Br)O2
F[C@]12C[C@]1(Cl)C[C@@H](/C=C/Br)O2
F[C@]12C[C@]1(Cl)C[C@@H](/C=C\Br)O2
F[C@]12C[C@]1(Cl)C[C@H](/C=C/Br)O2
F[C@]12C[C@]1(Cl)C[C@H](/C=C\Br)O2

Because the molecule is constrained, not all of those isomers can actually exist. We can check that:

>>> opts = StereoEnumerationOptions(tryEmbedding=True)
>>> isomers = tuple(EnumerateStereoisomers(m, options=opts))
>>> len(isomers)
8
>>> for smi in sorted(Chem.MolToSmiles(x,isomericSmiles=True) for x in isomers):
...     print(smi)
...
F[C@@]12C[C@]1(Cl)C[C@@H](/C=C/Br)O2
F[C@@]12C[C@]1(Cl)C[C@@H](/C=C\Br)O2
F[C@@]12C[C@]1(Cl)C[C@H](/C=C/Br)O2
F[C@@]12C[C@]1(Cl)C[C@H](/C=C\Br)O2
F[C@]12C[C@@]1(Cl)C[C@@H](/C=C/Br)O2
F[C@]12C[C@@]1(Cl)C[C@@H](/C=C\Br)O2
F[C@]12C[C@@]1(Cl)C[C@H](/C=C/Br)O2
F[C@]12C[C@@]1(Cl)C[C@H](/C=C\Br)O2

Or we can force the output to only give us unique isomers:

>>> m = Chem.MolFromSmiles('FC(Cl)C=CC=CC(F)Cl')
>>> opts = StereoEnumerationOptions(unique=True)
>>> isomers = tuple(EnumerateStereoisomers(m, options=opts))
>>> len(isomers)
10
>>> for smi in sorted(Chem.MolToSmiles(x,isomericSmiles=True) for x in isomers):
...     print(smi)
...
F[C@@H](Cl)/C=C/C=C/[C@@H](F)Cl
F[C@@H](Cl)/C=C\C=C/[C@@H](F)Cl
F[C@@H](Cl)/C=C\C=C\[C@@H](F)Cl
F[C@H](Cl)/C=C/C=C/[C@@H](F)Cl
F[C@H](Cl)/C=C/C=C/[C@H](F)Cl
F[C@H](Cl)/C=C/C=C\[C@@H](F)Cl
F[C@H](Cl)/C=C\C=C/[C@@H](F)Cl
F[C@H](Cl)/C=C\C=C/[C@H](F)Cl
F[C@H](Cl)/C=C\C=C\[C@@H](F)Cl
F[C@H](Cl)/C=C\C=C\[C@H](F)Cl

By default the code only expands unspecified stereocenters:

>>> m = Chem.MolFromSmiles('BrC=C[C@H]1OC(C2)(F)C2(Cl)C1')
>>> isomers = tuple(EnumerateStereoisomers(m))
>>> len(isomers)
8
>>> for smi in sorted(Chem.MolToSmiles(x,isomericSmiles=True) for x in isomers):
...     print(smi)
...
F[C@@]12C[C@@]1(Cl)C[C@@H](/C=C/Br)O2
F[C@@]12C[C@@]1(Cl)C[C@@H](/C=C\Br)O2
F[C@@]12C[C@]1(Cl)C[C@@H](/C=C/Br)O2
F[C@@]12C[C@]1(Cl)C[C@@H](/C=C\Br)O2
F[C@]12C[C@@]1(Cl)C[C@@H](/C=C/Br)O2
F[C@]12C[C@@]1(Cl)C[C@@H](/C=C\Br)O2
F[C@]12C[C@]1(Cl)C[C@@H](/C=C/Br)O2
F[C@]12C[C@]1(Cl)C[C@@H](/C=C\Br)O2

But we can change that behavior:

>>> opts = StereoEnumerationOptions(onlyUnassigned=False)
>>> isomers = tuple(EnumerateStereoisomers(m, options=opts))
>>> len(isomers)
16

Since the result is a generator, we can allow exploring at least parts of very large result sets:

>>> m = Chem.MolFromSmiles('Br' + '[CH](Cl)' * 20 + 'F')
>>> opts = StereoEnumerationOptions(maxIsomers=0)
>>> isomers = EnumerateStereoisomers(m, options=opts)
>>> for x in range(5):
...   print(Chem.MolToSmiles(next(isomers),isomericSmiles=True))
F[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)Br
F[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@H](Cl)Br
F[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@H](Cl)[C@@H](Cl)Br
F[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@H](Cl)[C@H](Cl)Br
F[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@H](Cl)[C@@H](Cl)[C@@H](Cl)Br

Or randomly sample a small subset. Note that if we want that sampling to be consistent across runs we need to provide a random number seed:

>>> m = Chem.MolFromSmiles('Br' + '[CH](Cl)' * 20 + 'F')
>>> opts = StereoEnumerationOptions(maxIsomers=3,rand=0xf00d)
>>> isomers = EnumerateStereoisomers(m, options=opts)
>>> for smi in isomers: #sorted(Chem.MolToSmiles(x, isomericSmiles=True) for x in isomers):
...     print(Chem.MolToSmiles(smi))
F[C@@H](Cl)[C@H](Cl)[C@H](Cl)[C@H](Cl)[C@@H](Cl)[C@H](Cl)[C@H](Cl)[C@@H](Cl)[C@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@H](Cl)[C@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@H](Cl)[C@H](Cl)[C@H](Cl)[C@@H](Cl)Br
F[C@H](Cl)[C@H](Cl)[C@H](Cl)[C@@H](Cl)[C@H](Cl)[C@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@H](Cl)[C@@H](Cl)[C@H](Cl)[C@H](Cl)[C@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@H](Cl)[C@H](Cl)[C@@H](Cl)[C@@H](Cl)Br
F[C@@H](Cl)[C@@H](Cl)[C@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@H](Cl)[C@@H](Cl)Br
rdkit.Chem.EnumerateStereoisomers.GetStereoisomerCount(m, options=<rdkit.Chem.EnumerateStereoisomers.StereoEnumerationOptions object>)

returns an estimate (upper bound) of the number of possible stereoisomers for a molecule

Arguments:
  • m: the molecule to work with

  • options: parameters controlling the enumeration

>>> from rdkit import Chem
>>> from rdkit.Chem.EnumerateStereoisomers import EnumerateStereoisomers, StereoEnumerationOptions
>>> m = Chem.MolFromSmiles('BrC(Cl)(F)CCC(O)C')
>>> GetStereoisomerCount(m)
4
>>> m = Chem.MolFromSmiles('CC(Cl)(O)C')
>>> GetStereoisomerCount(m)
1

double bond stereochemistry is also included:

>>> m = Chem.MolFromSmiles('BrC(Cl)(F)C=CC(O)C')
>>> GetStereoisomerCount(m)
8
class rdkit.Chem.EnumerateStereoisomers.StereoEnumerationOptions(tryEmbedding=False, onlyUnassigned=True, maxIsomers=1024, rand=None, unique=True, onlyStereoGroups=False)

Bases: object

  • tryEmbedding: if set the process attempts to generate a standard RDKit distance geometry conformation for the stereisomer. If this fails, we assume that the stereoisomer is non-physical and don’t return it. NOTE that this is computationally expensive and is just a heuristic that could result in stereoisomers being lost.

  • onlyUnassigned: if set (the default), stereocenters which have specified stereochemistry will not be perturbed unless they are part of a relative stereo group.

  • maxIsomers: the maximum number of isomers to yield, if the number of possible isomers is greater than maxIsomers, a random subset will be yielded. If 0, all isomers are yielded. Since every additional stereo center doubles the number of results (and execution time) it’s important to keep an eye on this.

  • onlyStereoGroups: Only find stereoisomers that differ at the StereoGroups associated with the molecule.

maxIsomers
onlyStereoGroups
onlyUnassigned
rand
tryEmbedding
unique