Package rdkit :: Package Chem :: Module BRICS
[hide private]
[frames] | no frames]

Module BRICS

source code

Implementation of the BRICS algorithm from Degen et al. ChemMedChem *3* 1503-7 (2008)

Functions [hide private]
 
FindBRICSBonds(mol, randomizeOrder=False, silent=True)
returns the bonds in a molecule that BRICS would cleave
source code
 
BreakBRICSBonds(mol, bonds=None, sanitize=True, silent=True)
breaks the BRICS bonds in a molecule and returns the results
source code
 
BRICSDecompose(mol, allNodes=None, minFragmentSize=1, onlyUseReactions=None, silent=True, keepNonLeafNodes=False, singlePass=False, returnMols=False)
returns the BRICS decomposition for a molecule
source code
 
BRICSBuild(fragments, onlyCompleteMols=True, seeds=None, uniquify=True, scrambleReagents=True, maxDepth=3) source code
 
_test() source code
Variables [hide private]
  environs = {'#L8': '[C;!R;!D1]-;!@[#6]', 'L1': '[C;D3]([#0,#6,...
  reactionDefs = ([('1', '3', '-'), ('1', '5', '-'), ('1', '10',...
  smartsGps = (['[$([C;D3]([#0,#6,#7,#8])(=O)):1]-;!@[$([O;D2]-;...
  environMatchers = {}
  bondMatchers = [[('1', '3', '-', <rdkit.Chem.rdchem.Mol object...
  reactions = tuple([[Reactions.ReactionFromSmarts(y) for y in x...
  reverseReactions = []
  dummyPattern = Chem.MolFromSmiles('[*]')
  __package__ = 'rdkit.Chem'
  bType = '-'
  bnd = '-'
  compats = [('16', '16', '-')]
  defn = '[$([c;$(c(:c):c)]):1]-;!@[$([c;$(c(:c):c)]):2]>>[16*]-...
  e1 = '[c;$(c(:c):c)]'
  e2 = '[c;$(c(:c):c)]'
  env = 'L9'
  g1 = '16'
  g2 = '16'
  gp = ['[$([c;$(c(:c):c)]):1]-;!@[$([c;$(c(:c):c)]):2]>>[16*]-[...
  i = 13
  i1 = '16'
  i2 = '16'
  j = 0
  labels = ['16', '16']
  patt = <rdkit.Chem.rdchem.Mol object at 0x32940c0>
  ps = '[16*]-[*:1].[16*]-[*:2]'
  r1 = '[c;$(c(:c):c)]'
  r2 = '[c;$(c(:c):c)]'
  rs = '[$([c;$(c(:c):c)]):1]-;!@[$([c;$(c(:c):c)]):2]'
  rxn = <rdkit.Chem.rdChemReactions.ChemicalReaction object at 0...
  rxnSet = ['[$([c;$(c(:c):c)]):1]-;!@[$([c;$(c(:c):c)]):2]>>[16...
  sma = '[16*]-[*:1].[16*]-[*:2]>>[$([c;$(c(:c):c)]):1]-;!@[$([c...
  t = <rdkit.Chem.rdChemReactions.ChemicalReaction object at 0x3...
  tmp = [('16', '16', '-', <rdkit.Chem.rdchem.Mol object at 0x32...
  x = '16'
  y = '[$([c;$(c(:c):c)]):1]-;!@[$([c;$(c(:c):c)]):2]>>[16*]-[*:...

Imports: Chem, Reactions, sys, re, copy, random


Function Details [hide private]

FindBRICSBonds(mol, randomizeOrder=False, silent=True)

source code 
returns the bonds in a molecule that BRICS would cleave

>>> from rdkit import Chem
>>> m = Chem.MolFromSmiles('CCCOCC')
>>> res = list(FindBRICSBonds(m))
>>> res
[((3, 2), ('3', '4')), ((3, 4), ('3', '4'))]

a more complicated case:
>>> m = Chem.MolFromSmiles('CCCOCCC(=O)c1ccccc1')
>>> res = list(FindBRICSBonds(m))
>>> res
[((3, 2), ('3', '4')), ((3, 4), ('3', '4')), ((6, 8), ('6', '16'))]

we can also randomize the order of the results:
>>> random.seed(23)
>>> res = list(FindBRICSBonds(m,randomizeOrder=True))
>>> sorted(res)
[((3, 2), ('3', '4')), ((3, 4), ('3', '4')), ((6, 8), ('6', '16'))]

Note that this is a generator function :
>>> res = FindBRICSBonds(m)
>>> res
<generator object ...>
>>> res.next()
((3, 2), ('3', '4'))

>>> m = Chem.MolFromSmiles('CC=CC')
>>> res = list(FindBRICSBonds(m))
>>> sorted(res)
[((1, 2), ('7', '7'))]

make sure we don't match ring bonds:
>>> m = Chem.MolFromSmiles('O=C1NCCC1')
>>> list(FindBRICSBonds(m))
[]

another nice one, make sure environment 8 doesn't match something connected
to a ring atom:
>>> m = Chem.MolFromSmiles('CC1(C)CCCCC1')
>>> list(FindBRICSBonds(m))
[]

BreakBRICSBonds(mol, bonds=None, sanitize=True, silent=True)

source code 
breaks the BRICS bonds in a molecule and returns the results

>>> from rdkit import Chem
>>> m = Chem.MolFromSmiles('CCCOCC')
>>> m2=BreakBRICSBonds(m)
>>> Chem.MolToSmiles(m2,True)
'[3*]O[3*].[4*]CC.[4*]CCC'

a more complicated case:
>>> m = Chem.MolFromSmiles('CCCOCCC(=O)c1ccccc1')
>>> m2=BreakBRICSBonds(m)
>>> Chem.MolToSmiles(m2,True)
'[3*]O[3*].[4*]CCC.[4*]CCC([6*])=O.[16*]c1ccccc1'


can also specify a limited set of bonds to work with:
>>> m = Chem.MolFromSmiles('CCCOCC')
>>> m2 = BreakBRICSBonds(m,[((3, 2), ('3', '4'))])
>>> Chem.MolToSmiles(m2,True)
'[3*]OCC.[4*]CCC'

this can be used as an alternate approach for doing a BRICS decomposition by
following BreakBRICSBonds with a call to Chem.GetMolFrags:
>>> m = Chem.MolFromSmiles('CCCOCC')
>>> m2=BreakBRICSBonds(m)
>>> frags = Chem.GetMolFrags(m2,asMols=True)
>>> [Chem.MolToSmiles(x,True) for x in frags]
['[4*]CCC', '[3*]O[3*]', '[4*]CC']

BRICSDecompose(mol, allNodes=None, minFragmentSize=1, onlyUseReactions=None, silent=True, keepNonLeafNodes=False, singlePass=False, returnMols=False)

source code 
returns the BRICS decomposition for a molecule

>>> from rdkit import Chem
>>> m = Chem.MolFromSmiles('CCCOCc1cc(c2ncccc2)ccc1')
>>> res = list(BRICSDecompose(m))
>>> sorted(res)
['[14*]c1ccccn1', '[16*]c1cccc([16*])c1', '[3*]O[3*]', '[4*]CCC', '[4*]C[8*]']

>>> res = BRICSDecompose(m,returnMols=True)
>>> res[0]
<rdkit.Chem.rdchem.Mol object ...>
>>> smis = [Chem.MolToSmiles(x,True) for x in res]
>>> sorted(smis)
['[14*]c1ccccn1', '[16*]c1cccc([16*])c1', '[3*]O[3*]', '[4*]CCC', '[4*]C[8*]']

nexavar, an example from the paper (corrected):
>>> m = Chem.MolFromSmiles('CNC(=O)C1=NC=CC(OC2=CC=C(NC(=O)NC3=CC(=C(Cl)C=C3)C(F)(F)F)C=C2)=C1')
>>> res = list(BRICSDecompose(m))
>>> sorted(res)
['[1*]C([1*])=O', '[1*]C([6*])=O', '[14*]c1cc([16*])ccn1', '[16*]c1ccc(Cl)c([16*])c1', '[16*]c1ccc([16*])cc1', '[3*]O[3*]', '[5*]NC', '[5*]N[5*]', '[8*]C(F)(F)F']

it's also possible to keep pieces that haven't been fully decomposed:
>>> m = Chem.MolFromSmiles('CCCOCC')
>>> res = list(BRICSDecompose(m,keepNonLeafNodes=True))
>>> sorted(res)
['CCCOCC', '[3*]OCC', '[3*]OCCC', '[3*]O[3*]', '[4*]CC', '[4*]CCC']

>>> m = Chem.MolFromSmiles('CCCOCc1cc(c2ncccc2)ccc1')
>>> res = list(BRICSDecompose(m,keepNonLeafNodes=True))
>>> sorted(res)
['CCCOCc1cccc(-c2ccccn2)c1', '[14*]c1ccccn1', '[16*]c1cccc(-c2ccccn2)c1', '[16*]c1cccc(COCCC)c1', '[16*]c1cccc([16*])c1', '[3*]OCCC', '[3*]OC[8*]', '[3*]OCc1cccc(-c2ccccn2)c1', '[3*]OCc1cccc([16*])c1', '[3*]O[3*]', '[4*]CCC', '[4*]C[8*]', '[4*]Cc1cccc(-c2ccccn2)c1', '[4*]Cc1cccc([16*])c1', '[8*]COCCC']

or to only do a single pass of decomposition:
>>> m = Chem.MolFromSmiles('CCCOCc1cc(c2ncccc2)ccc1')
>>> res = list(BRICSDecompose(m,singlePass=True))
>>> sorted(res)
['CCCOCc1cccc(-c2ccccn2)c1', '[14*]c1ccccn1', '[16*]c1cccc(-c2ccccn2)c1', '[16*]c1cccc(COCCC)c1', '[3*]OCCC', '[3*]OCc1cccc(-c2ccccn2)c1', '[4*]CCC', '[4*]Cc1cccc(-c2ccccn2)c1', '[8*]COCCC']

setting a minimum size for the fragments:
>>> m = Chem.MolFromSmiles('CCCOCC')
>>> res = list(BRICSDecompose(m,keepNonLeafNodes=True,minFragmentSize=2))
>>> sorted(res)
['CCCOCC', '[3*]OCC', '[3*]OCCC', '[4*]CC', '[4*]CCC']
>>> m = Chem.MolFromSmiles('CCCOCC')
>>> res = list(BRICSDecompose(m,keepNonLeafNodes=True,minFragmentSize=3))
>>> sorted(res)
['CCCOCC', '[3*]OCC', '[4*]CCC']
>>> res = list(BRICSDecompose(m,minFragmentSize=2))
>>> sorted(res)
['[3*]OCC', '[3*]OCCC', '[4*]CC', '[4*]CCC']


Variables Details [hide private]

environs

Value:
{'#L8': '[C;!R;!D1]-;!@[#6]',
 'L1': '[C;D3]([#0,#6,#7,#8])(=O)',
 'L10': '[N;R;$(N(@C(=O))@[C,N,O,S])]',
 'L11': '[S;D2](-;!@[#0,#6])',
 'L12': '[S;D4]([#6,#0])(=O)(=O)',
 'L13': '[C;$(C(-;@[C,N,O,S])-;@[N,O,S])]',
 'L14': '[c;$(c(:[c,n,o,s]):[n,o,s])]',
 'L14b': '[c;$(c(:[c,n,o,s]):[n,o,s])]',
...

reactionDefs

Value:
([('1', '3', '-'), ('1', '5', '-'), ('1', '10', '-')],
 [('3', '4', '-'),
  ('3', '13', '-'),
  ('3', '14', '-'),
  ('3', '15', '-'),
  ('3', '16', '-')],
 [('4', '5', '-'), ('4', '11', '-')],
 [('5', '12', '-'), ('5', '14', '-'), ('5', '16', '-'), ('5', '13', '-\
...

smartsGps

Value:
(['[$([C;D3]([#0,#6,#7,#8])(=O)):1]-;!@[$([O;D2]-;!@[#0,#6,#1]):2]>>[1\
*]-[*:1].[3*]-[*:2]',
  '[$([C;D3]([#0,#6,#7,#8])(=O)):1]-;!@[$([N;!D1;!$(N=*);!$(N-[!#6;!#1\
6;!#0;!#1]);!$([N;R]@[C;R]=O)]):2]>>[1*]-[*:1].[5*]-[*:2]',
  '[$([C;D3]([#0,#6,#7,#8])(=O)):1]-;!@[$([N;R;$(N(@C(=O))@[C,N,O,S])]\
):2]>>[1*]-[*:1].[10*]-[*:2]'],
 ['[$([O;D2]-;!@[#0,#6,#1]):1]-;!@[$([C;!D1;!$(C=*)]-;!@[#6]):2]>>[3*]\
-[*:1].[4*]-[*:2]',
...

bondMatchers

Value:
[[('1', '3', '-', <rdkit.Chem.rdchem.Mol object at 0x3287c90>),
  ('1', '5', '-', <rdkit.Chem.rdchem.Mol object at 0x3287d00>),
  ('1', '10', '-', <rdkit.Chem.rdchem.Mol object at 0x3287d70>)],
 [('3', '4', '-', <rdkit.Chem.rdchem.Mol object at 0x3287de0>),
  ('3', '13', '-', <rdkit.Chem.rdchem.Mol object at 0x3287e50>),
  ('3', '14', '-', <rdkit.Chem.rdchem.Mol object at 0x3287ec0>),
  ('3', '15', '-', <rdkit.Chem.rdchem.Mol object at 0x3287f30>),
  ('3', '16', '-', <rdkit.Chem.rdchem.Mol object at 0x3287fa0>)],
...

reactions

Value:
tuple([[Reactions.ReactionFromSmarts(y) for y in x] for x in smartsGps\
])

defn

Value:
'[$([c;$(c(:c):c)]):1]-;!@[$([c;$(c(:c):c)]):2]>>[16*]-[*:1].[16*]-[*:\
2]'

gp

Value:
['[$([c;$(c(:c):c)]):1]-;!@[$([c;$(c(:c):c)]):2]>>[16*]-[*:1].[16*]-[*\
:2]']

rxn

Value:
<rdkit.Chem.rdChemReactions.ChemicalReaction object at 0x329a210>

rxnSet

Value:
['[$([c;$(c(:c):c)]):1]-;!@[$([c;$(c(:c):c)]):2]>>[16*]-[*:1].[16*]-[*\
:2]']

sma

Value:
'[16*]-[*:1].[16*]-[*:2]>>[$([c;$(c(:c):c)]):1]-;!@[$([c;$(c(:c):c)]):\
2]'

t

Value:
<rdkit.Chem.rdChemReactions.ChemicalReaction object at 0x3287440>

tmp

Value:
[('16', '16', '-', <rdkit.Chem.rdchem.Mol object at 0x32940c0>)]

y

Value:
'[$([c;$(c(:c):c)]):1]-;!@[$([c;$(c(:c):c)]):2]>>[16*]-[*:1].[16*]-[*:\
2]'