Package rdkit :: Package SimDivFilters :: Module SimilarityPickers :: Class TopNOverallPicker

Class TopNOverallPicker

   object --+    
            |    
GenericPicker --+
                |
               TopNOverallPicker

A class for picking the top N overall best matches across a library

Connect to a database and build molecules:
>>> from rdkit import Chem
>>> from rdkit import RDConfig
>>> from rdkit.Dbase.DbConnection import DbConnect
>>> from rdkit.SimDivFilters.SimilarityPickers import TopNOverallPicker
>>> dbName = RDConfig.RDTestDatabase
>>> conn = DbConnect(dbName,'simple_mols1')
>>> [x.upper() for x in conn.GetColumnNames()]
['SMILES', 'ID']
>>> mols = []
>>> for smi,id in conn.GetData():
...   mol = Chem.MolFromSmiles(str(smi))
...   mol.SetProp('_Name',str(id))
...   mols.append(mol)
>>> len(mols)
12

Calculate fingerprints:
>>> probefps = []
>>> for mol in mols:
...   fp = Chem.RDKFingerprint(mol)
...   fp._id = mol.GetProp('_Name')
...   probefps.append(fp)

Start by finding the top matches for a single probe.  This ether probe should
pull the other ethers from the database:
>>> mol = Chem.MolFromSmiles('COC')
>>> probeFp = Chem.RDKFingerprint(mol)
>>> picker = TopNOverallPicker(numToPick=2,probeFps=[probeFp],dataSet=probefps)
>>> len(picker)
2
>>> fp,score = picker[0]
>>> id = fp._id
>>> str(id)
'ether-1'
>>> score
1.0

The results come back in order of decreasing similarity:
>>> fp,score = picker[1]
>>> id = fp._id
>>> str(id)
'ether-2'

Now find the top matches for 2 probes.  With three picks we get two acids and one ether:
>>> fps = []
>>> fps.append(Chem.RDKFingerprint(Chem.MolFromSmiles('COC')))
>>> fps.append(Chem.RDKFingerprint(Chem.MolFromSmiles('CC(=O)O')))
>>> picker = TopNOverallPicker(numToPick=3,probeFps=fps,dataSet=probefps)
>>> len(picker)
3
>>> fp,score = picker[0]
>>> id = fp._id
>>> str(id)
'acid-1'
>>> fp,score = picker[1]
>>> id = fp._id
>>> str(id)
'ether-1'
>>> score
1.0
>>> fp,score = picker[2]
>>> id = fp._id
>>> str(id)
'acid-2'
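
The picking idea illustrated above can be sketched in plain Python. This is a minimal illustration, not RDKit's implementation: it assumes that each data-set fingerprint is scored by its best similarity to any probe and that the N highest-scoring entries are kept. The names `tanimoto` and `top_n_overall`, the use of sets of on-bit indices in place of BitVectors, and the toy data are all hypothetical.

```python
import heapq


def tanimoto(a, b):
    """Tanimoto similarity between two sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def top_n_overall(num_to_pick, probe_fps, data_set, sim_metric=tanimoto):
    """Return the num_to_pick (id, score) pairs with the highest best-probe score.

    data_set is a sequence of (id, fingerprint) pairs; probe_fps must be non-empty.
    """
    scored = []
    for fp_id, fp in data_set:
        # Score each entry by its best match against any of the probes.
        best = max(sim_metric(fp, probe) for probe in probe_fps)
        scored.append((fp_id, best))
    # Keep the N highest scores, best first.
    return heapq.nlargest(num_to_pick, scored, key=lambda pair: pair[1])


# Made-up bit sets standing in for real fingerprints.
data_set = [
    ("ether-1", {1, 2, 3}),
    ("ether-2", {1, 2, 3, 4}),
    ("acid-1", {7, 8, 9}),
    ("amine-1", {20, 21}),
]
probes = [{1, 2, 3}, {7, 8, 9, 10, 11}]
picks = top_n_overall(3, probes, data_set)
```

Because every entry is ranked by a single best score, one probe can contribute most of the picks if the library happens to contain many close matches to it.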

Instance Methods
 
__init__(self, numToPick=10, probeFps=None, dataSet=None, simMetric=DataStructs.TanimotoSimilarity)
 
MakePicks(self, force=False)

Inherited from GenericPicker: __getitem__, __len__

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Class Variables

Inherited from GenericPicker (private): _picks

Properties

Inherited from object: __class__

Method Details

__init__(self, numToPick=10, probeFps=None, dataSet=None, simMetric=DataStructs.TanimotoSimilarity)
(Constructor)

dataSet should be a sequence of BitVectors

Overrides: object.__init__
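
The simMetric argument is a callable that compares two fingerprints. The sketch below shows, under simplifying assumptions, how swapping the metric changes the score a fingerprint receives. It uses plain sets of on-bit indices rather than BitVectors, and the helpers `tanimoto`, `dice`, and `best_score` are hypothetical stand-ins, not RDKit functions.

```python
def tanimoto(a, b):
    """|A & B| / |A | B| for two sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice(a, b):
    """2 * |A & B| / (|A| + |B|) for two sets of on-bit indices."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def best_score(fp, probe_fps, sim_metric):
    """Best similarity of fp to any probe under the chosen metric."""
    return max(sim_metric(fp, probe) for probe in probe_fps)


fp = {2, 3, 4}
probes = [{1, 2, 3}, {4, 5}]
tanimoto_score = best_score(fp, probes, tanimoto)
dice_score = best_score(fp, probes, dice)
```

For any pair of fingerprints the Dice score is at least as large as the Tanimoto score, so absolute scores are not comparable across metrics.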

MakePicks(self, force=False)

Overrides: GenericPicker.MakePicks
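
The force argument, together with the inherited _picks variable and the inherited __getitem__ and __len__ methods, suggests a compute-once-and-cache pattern. The sketch below illustrates that pattern under that assumption. `LazyPicker` and its `compute_count` attribute are hypothetical and simplified, and the class ranks plain numbers rather than fingerprints.

```python
class LazyPicker:
    """Toy picker that computes its picks on first access and caches them."""

    _picks = None  # no picks computed yet

    def __init__(self, values, num_to_pick):
        self.values = list(values)
        self.num_to_pick = num_to_pick
        self.compute_count = 0  # counts real computations, for illustration

    def MakePicks(self, force=False):
        # Reuse cached picks unless the caller explicitly forces a recompute.
        if self._picks is not None and not force:
            return
        self.compute_count += 1
        self._picks = sorted(self.values, reverse=True)[: self.num_to_pick]

    def __len__(self):
        if self._picks is None:
            self.MakePicks()
        return len(self._picks)

    def __getitem__(self, which):
        if self._picks is None:
            self.MakePicks()
        return self._picks[which]


picker = LazyPicker([0.2, 0.9, 0.5], num_to_pick=2)
first = picker[0]                  # triggers the first computation
picker.MakePicks()                 # cached, so nothing is recomputed
count_after_cached_call = picker.compute_count
picker.values.append(1.0)          # the underlying data changes
picker.MakePicks(force=True)       # force=True refreshes the cached picks
```

In this sketch, indexing or calling len() never recomputes once the picks exist, so a caller who changes the underlying data has to pass force=True to see updated results.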