Package rdkit :: Package PySVD :: Module SVDSimilarity :: Class SimilarityCalculator

Class SimilarityCalculator

object --+
         |
        SimilarityCalculator

Instance Methods
 
__init__(self)
x.__init__(...) initializes x; see x.__class__.__doc__ for signature
 
SetVects(self, vects)
vects is a sequence of *sorted* sequences of bit IDs
 
UpdateSingularValues1(self, k=-1, cleanup=1)
Shares its signature and docstring with UpdateSingularValues
 
UpdateSingularValues(self, k=-1, cleanup=1)
Computes the singular values for the current vects, keeping k of them; raises ValueError if no vects have been set
 
ForceSingularValues(self, k, T, D, s, cleanup=1)
 
PackPoint(self, pt)
converts a point from the normal space to the reduced (packed) space
 
TransformPoint(self, pt)
Transforms a point into the SVD space
 
ScorePoint(self, pt, against=None, isTransformed=0, topN=0, threshold=-1.0, excludeThese=[])
return value is a sequence of 2-tuples: (score, index)

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Properties

Inherited from object: __class__

Method Details

__init__(self)
(Constructor)

x.__init__(...) initializes x; see x.__class__.__doc__ for signature

Overrides: object.__init__
(inherited documentation)

SetVects(self, vects)

vects is a sequence of *sorted* sequences of bit IDs


>>> calc = SimilarityCalculator()
>>> calc.SetVects( ((1,2),(3,100),(1,2,2,4)) )
>>> calc._vects
((0, 1), (2, 4), (0, 1, 3))
>>> calc._vals
(1, 1, 1, 1, 1, 2, 1)
>>> calc._idMap[100]
4
>>> calc._idMap[4]
3
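The doctest above suggests that SetVects renumbers the bit IDs onto a compact 0..n-1 range and collapses repeated bits into counts. A minimal sketch of that bookkeeping follows; `compact_bit_ids` is a hypothetical helper written for illustration, not the library's implementation, and it is only checked against the values shown in the doctest.

```python
from collections import Counter


def compact_bit_ids(vects):
    """Map arbitrary bit IDs onto 0..n-1 and collapse repeats into counts.

    Hypothetical helper: it reproduces the doctest values shown for
    SetVects, but it is not taken from the SimilarityCalculator source.
    """
    # Every distinct bit ID, in ascending order, gets the next free index.
    all_ids = sorted({bit for vect in vects for bit in vect})
    id_map = {bit: idx for idx, bit in enumerate(all_ids)}

    packed_vects = []
    values = []
    for vect in vects:
        counts = Counter(vect)   # how often each bit occurs in this vector
        ordered = sorted(counts)  # distinct bits, ascending
        packed_vects.append(tuple(id_map[bit] for bit in ordered))
        values.extend(counts[bit] for bit in ordered)
    return tuple(packed_vects), tuple(values), id_map


packed, values, id_map = compact_bit_ids(((1, 2), (3, 100), (1, 2, 2, 4)))
print(packed)                  # ((0, 1), (2, 4), (0, 1, 3))
print(values)                  # (1, 1, 1, 1, 1, 2, 1)
print(id_map[100], id_map[4])  # 4 3
```

The flat `values` tuple holds one count per entry of the packed vectors, which is why the repeated bit 2 in the third vector shows up as a 2.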

UpdateSingularValues1(self, k=-1, cleanup=1)

>>> calc = SimilarityCalculator()
>>> try:
...   calc.UpdateSingularValues()
... except ValueError:
...   ok=1
... else:
...   ok=0
>>> ok
1
>>> calc.SetVects( ((0,2),(1,3),(0,1,2)) )
>>> calc.UpdateSingularValues()
>>> calc._S.shape[0]
3

Unless the optional cleanup argument is set to 0, the local vects
  (untransformed data points) are destroyed once the singular values
  have been computed.  This can save significant memory, but a second
  call without new vects then raises ValueError:
>>> try:
...   calc.UpdateSingularValues(2)
... except ValueError:
...   ok=1
... else:
...   ok=0
>>> ok
1
  
SetVects has to be called again first:
>>> calc.SetVects( ((0,2),(1,3),(0,1,2)) )
>>> calc.UpdateSingularValues(2)
>>> calc._S.shape[0]
2
>>> print('%.4f' % calc._S[0])
2.1889
>>> print('%.4f' % calc._S[1])
1.4142

UpdateSingularValues(self, k=-1, cleanup=1)

>>> calc = SimilarityCalculator()
>>> try:
...   calc.UpdateSingularValues()
... except ValueError:
...   ok=1
... else:
...   ok=0
>>> ok
1
>>> calc.SetVects( ((0,2),(1,3),(0,1,2)) )
>>> calc.UpdateSingularValues()
>>> calc._S.shape[0]
3

Unless the optional cleanup argument is set to 0, the local vects
  (untransformed data points) are destroyed once the singular values
  have been computed.  This can save significant memory, but a second
  call without new vects then raises ValueError:
>>> try:
...   calc.UpdateSingularValues(2)
... except ValueError:
...   ok=1
... else:
...   ok=0
>>> ok
1
  
SetVects has to be called again first:
>>> calc.SetVects( ((0,2),(1,3),(0,1,2)) )
>>> calc.UpdateSingularValues(2)
>>> calc._S.shape[0]
2
>>> print('%.4f' % calc._S[0])
2.1889
>>> print('%.4f' % calc._S[1])
1.4142
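The singular values printed above can be cross-checked with plain NumPy. The sketch below assumes the three vects are laid out as rows of a 0/1 occurrence matrix over bits 0..3; that layout is an assumption made for illustration, and the code does not call SimilarityCalculator at all.

```python
import numpy as np

# Assumed layout: one row per data point, one column per bit index.
vects = ((0, 2), (1, 3), (0, 1, 2))
matrix = np.zeros((len(vects), 4))
for row, vect in enumerate(vects):
    matrix[row, list(vect)] = 1.0

# NumPy returns the singular values sorted from largest to smallest.
singular_values = np.linalg.svd(matrix, compute_uv=False)

# Keeping the k largest values mirrors the k=2 call in the doctest.
k = 2
kept = singular_values[:k]
print(['%.4f' % value for value in kept])  # ['2.1889', '1.4142']
```

Truncating to the k largest singular values is the usual way to build a reduced-rank space; whether the class truncates in exactly this way is not stated on this page.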

PackPoint(self, pt)

converts a point from the normal space to the reduced (packed) space

>>> calc = SimilarityCalculator()
>>> calc.SetVects( ((1,2),(3,100),(1,2,2,4)) )
>>> calc.PackPoint( (1,2) )
array([ 1.,  1.,  0.,  0.,  0.])
>>> calc.PackPoint( (1,2,2) )
array([ 1.,  2.,  0.,  0.,  0.])
>>> calc.PackPoint( (1,2,5) )
array([ 1.,  1.,  0.,  0.,  0.])
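The three calls above show that packing counts each known bit and silently drops bits that were never seen by SetVects (bit 5 here). A rough sketch of that behaviour, assuming an ID map like the one built in the SetVects doctest; `pack_point` and `id_map` are illustrative names, not part of the documented API.

```python
import numpy as np


def pack_point(pt, id_map):
    """Count occurrences of known bits; ignore bits missing from id_map.

    Hypothetical stand-in for PackPoint, written only to mirror the
    doctest output above.
    """
    packed = np.zeros(len(id_map))
    for bit in pt:
        idx = id_map.get(bit)
        if idx is not None:  # unknown bits are skipped
            packed[idx] += 1.0
    return packed


# ID map implied by SetVects( ((1,2),(3,100),(1,2,2,4)) ).
id_map = {1: 0, 2: 1, 3: 2, 4: 3, 100: 4}
print(pack_point((1, 2, 2), id_map).tolist())  # [1.0, 2.0, 0.0, 0.0, 0.0]
print(pack_point((1, 2, 5), id_map).tolist())  # [1.0, 1.0, 0.0, 0.0, 0.0]
```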

TransformPoint(self, pt)

Transforms a point into the SVD space

>>> calc = SimilarityCalculator()
>>> calc.SetVects( ((0,2),(1,3),(0,1,2)) )
>>> calc.UpdateSingularValues(k=3)

If we pass in a point that was used for the SVD, we should get the
transformed version of that point back:
>>> v2 = calc.TransformPoint( (0,2) )

#>>> v1 = transpose(calc._singularVects[0])
#>>> abs(max(v1-v2))<1e-6
#1
#>>> v1 = transpose(calc._singularVects[1])
#>>> abs(max(v1-v2))>1e-6
#1
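The property described above (a training point maps back onto its own singular vector) is what the standard "fold-in" projection from latent semantic indexing gives. The sketch below assumes that projection and a bits-by-points matrix layout; neither is confirmed by this page, and `fold_in` is a hypothetical name.

```python
import numpy as np

vects = ((0, 2), (1, 3), (0, 1, 2))

# Assumed layout: rows are bit indices, columns are data points.
matrix = np.zeros((4, len(vects)))
for col, vect in enumerate(vects):
    matrix[list(vect), col] = 1.0

# Thin SVD: matrix == term_vecs @ diag(sing_vals) @ point_vecs_t
term_vecs, sing_vals, point_vecs_t = np.linalg.svd(matrix, full_matrices=False)


def fold_in(packed_pt):
    """Project a packed point into the SVD space (assumed LSI fold-in)."""
    return packed_pt @ term_vecs / sing_vals


# Folding in the first training point recovers its column of point_vecs_t.
transformed = fold_in(matrix[:, 0])
print(np.allclose(transformed, point_vecs_t[:, 0]))  # True
```

This is the same comparison the commented-out doctest lines sketch with `calc._singularVects[0]`.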

ScorePoint(self, pt, against=None, isTransformed=0, topN=0, threshold=-1.0, excludeThese=[])

return value is a sequence of 2-tuples: (score, index)

>>> calc = SimilarityCalculator()
>>> calc.SetVects( ((0,2),(1,3),(0,1,2)) )
>>> calc.UpdateSingularValues(k=3)
>>> r = calc.ScorePoint((0,2),against=[0])[0]
>>> print('%.2f %d' % (r[0], r[1]))
1.00 0

The point can also be transformed in advance:
>>> pt = calc.TransformPoint( (0,2) )
>>> r = calc.ScorePoint(pt,against=[0],isTransformed=1)[0]
>>> print('%.2f %d' % (r[0], r[1]))
1.00 0

By default (against=None) the point is scored against all of the stored vectors at once:
>>> [abs(x[0])>1e-4 for x in calc.ScorePoint(pt,isTransformed=1)]
[True, False, True]

>>> [abs(x[0])>1e-4 for x in calc.ScorePoint((0,3,6))]
[True, True, True]

>>> [abs(x[0])>1e-4 for x in calc.ScorePoint((0,3,6),topN=2)]
[True, True]

You can also put a threshold on the similarity metric:
>>> len(calc.ScorePoint((0,3,6)))
3
>>> len(calc.ScorePoint((0,3,6),threshold=0.50))
2

"Extra" bits (those that were not in the training vectors) are
ignored:
>>> [abs(x[0])>1e-4 for x in calc.ScorePoint((0,2,12))]
[True, False, True]

# look at the indices:
>>> v = [x[1] for x in calc.ScorePoint((0,3,6),topN=2)]
>>> v.sort()
>>> v
[0, 1]
>>> [x[1] for x in calc.ScorePoint((0,3,6),topN=2,excludeThese=[1])]
[2, 0]
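The doctests above are consistent with a cosine-style score computed in the reduced space, followed by filtering with excludeThese, threshold and topN. The sketch below is one way to get the same hit counts; the cosine metric, the singular-value scaling and the name `score_point` are all assumptions, not the documented implementation.

```python
import numpy as np


def score_point(query, coords, top_n=0, threshold=-1.0, exclude=()):
    """Return (score, index) pairs, best score first.

    Hypothetical stand-in for ScorePoint: cosine similarity between
    `query` and each row of `coords`, then exclusion, threshold, top-N.
    """
    query_norm = np.linalg.norm(query)
    hits = []
    for index, row in enumerate(coords):
        if index in exclude:
            continue
        denom = np.linalg.norm(row) * query_norm
        score = float(row @ query / denom) if denom > 0 else 0.0
        if score >= threshold:
            hits.append((score, index))
    hits.sort(reverse=True)
    return hits[:top_n] if top_n > 0 else hits


vects = ((0, 2), (1, 3), (0, 1, 2))
matrix = np.zeros((4, len(vects)))  # assumed layout: bits x points
for col, vect in enumerate(vects):
    matrix[list(vect), col] = 1.0
term_vecs, sing_vals, point_vecs_t = np.linalg.svd(matrix, full_matrices=False)

coords = point_vecs_t.T * sing_vals  # one scaled row per stored point
query = np.array([1.0, 0.0, 0.0, 1.0]) @ term_vecs  # bits 0 and 3 set

print(len(score_point(query, coords)))                 # 3
print(len(score_point(query, coords, threshold=0.5)))  # 2
print([i for _, i in score_point(query, coords, top_n=2, exclude=(1,))])  # [0, 2]
```

The sketch sorts best-first, whereas the doctest above lists [2, 0] for a comparable call, so the real method may order its results differently. It also uses an immutable tuple as the default for `exclude`, which avoids the shared-state pitfall of a mutable default such as `excludeThese=[]`.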