
Class InfoBitRanker

 object --+    
          |    
??.instance --+
              |
             InfoBitRanker

A class to rank the bits from a series of labelled fingerprints by an information metric.
A simple demonstration may help clarify what this class does.
Here is a small set of bit vectors, each printed with its class label:
>>> for i,bv in enumerate(bvs): print(bv.ToBitString(), acts[i])
... 
0001 0
0101 0
0010 1
1110 1

Default ranker, using infogain:
>>> ranker = InfoBitRanker(4,2)  
>>> for i,bv in enumerate(bvs): ranker.AccumulateVotes(bv,acts[i])
... 
>>> for bit,gain,n0,n1 in ranker.GetTopN(3): print(int(bit), '%.3f'%gain, int(n0), int(n1))
... 
3 1.000 2 0
2 1.000 0 2
0 0.311 0 1
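
The gain column can be checked by hand. The following is a minimal pure-Python sketch, not the RDKit implementation: it assumes the metric is the ordinary information gain (class entropy minus the expected entropy after splitting on the bit) and that bit 0 is the leftmost character of each bit string. The names bit_strings, on_counts, entropy and info_gain are hypothetical helpers introduced only for this illustration.

```python
from math import log2

# The four example fingerprints (bit 0 is assumed to be the leftmost
# character) and their class labels, copied from the demonstration.
bit_strings = ["0001", "0101", "0010", "1110"]
acts = [0, 0, 1, 1]
n_bits, n_classes = 4, 2


def entropy(counts):
    """Shannon entropy, in bits, of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


# on_counts[bit][label]: number of vectors of class `label` with `bit` set.
on_counts = [[0] * n_classes for _ in range(n_bits)]
class_totals = [0] * n_classes
for bits, label in zip(bit_strings, acts):
    class_totals[label] += 1
    for bit, ch in enumerate(bits):
        if ch == "1":
            on_counts[bit][label] += 1


def info_gain(bit):
    """Class entropy minus the expected entropy after splitting on `bit`."""
    on = on_counts[bit]
    off = [t - c for t, c in zip(class_totals, on)]
    n = sum(class_totals)
    split = (sum(on) * entropy(on) + sum(off) * entropy(off)) / n
    return entropy(class_totals) - split


# One line per bit: bit id, gain, then the per-class "on" counts.
for bit in range(n_bits):
    print(bit, "%.3f" % info_gain(bit), *on_counts[bit])
```

Under these assumptions the per-bit gains and counts agree with the rows returned by GetTopN(3) above. The sketch lists every bit in index order and makes no claim about how the ranker orders tied bits.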

Using the biased infogain:
>>> ranker = InfoBitRanker(4,2,InfoTheory.InfoType.BIASENTROPY)
>>> ranker.SetBiasList((1,))
>>> for i,bv in enumerate(bvs): ranker.AccumulateVotes(bv,acts[i])
... 
>>> for bit,gain,n0,n1 in ranker.GetTopN(3): print(int(bit), '%.3f'%gain, int(n0), int(n1))
... 
2 1.000 0 2
0 0.311 0 1
1 0.000 1 1

A chi squared ranker is also available:
>>> ranker = InfoBitRanker(4,2,InfoTheory.InfoType.CHISQUARE)
>>> for i,bv in enumerate(bvs): ranker.AccumulateVotes(bv,acts[i])
... 
>>> for bit,gain,n0,n1 in ranker.GetTopN(3): print(int(bit), '%.3f'%gain, int(n0), int(n1))
... 
3 4.000 2 0
2 4.000 0 2
0 1.333 0 1
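
The chi squared values can be reproduced in the same way. This sketch assumes the metric is the standard Pearson chi-squared statistic on the 2 x nClasses contingency table (bit on/off versus class label). That assumption is consistent with the numbers printed above, but it is not taken from the RDKit source. All names in the block are hypothetical.

```python
# Example data from the demonstration: bit strings (bit 0 assumed leftmost)
# and the class label of each vector.
bit_strings = ["0001", "0101", "0010", "1110"]
acts = [0, 0, 1, 1]
n_bits, n_classes = 4, 2

# on_counts[bit][label]: number of vectors of class `label` with `bit` set.
on_counts = [[0] * n_classes for _ in range(n_bits)]
class_totals = [0] * n_classes
for bits, label in zip(bit_strings, acts):
    class_totals[label] += 1
    for bit, ch in enumerate(bits):
        if ch == "1":
            on_counts[bit][label] += 1


def chi_square(bit):
    """Pearson chi-squared for the (bit on/off) x (class) contingency table."""
    on = on_counts[bit]
    off = [t - c for t, c in zip(class_totals, on)]
    n = sum(class_totals)
    stat = 0.0
    for row in (on, off):
        row_total = sum(row)
        for observed, col_total in zip(row, class_totals):
            expected = row_total * col_total / n
            if expected > 0:  # skip empty rows or classes
                stat += (observed - expected) ** 2 / expected
    return stat


for bit in range(n_bits):
    print(bit, "%.3f" % chi_square(bit), *on_counts[bit])
```

Bits 2 and 3 separate the two classes perfectly, so they reach the maximum value for four samples; bit 1 is set equally often in both classes and scores zero.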

As is a biased chi squared:
>>> ranker = InfoBitRanker(4,2,InfoTheory.InfoType.BIASCHISQUARE)
>>> ranker.SetBiasList((1,))
>>> for i,bv in enumerate(bvs): ranker.AccumulateVotes(bv,acts[i])
... 
>>> for bit,gain,n0,n1 in ranker.GetTopN(3): print(int(bit), '%.3f'%gain, int(n0), int(n1))
... 
2 4.000 0 2
0 1.333 0 1
1 0.000 1 1

Instance Methods
 
AccumulateVotes(...)
AccumulateVotes( (InfoBitRanker)arg1, (object)arg2, (int)arg3) -> None : Accumulate the votes for all the bits turned on in a bit vector
 
GetTopN(...)
GetTopN( (InfoBitRanker)arg1, (int)arg2) -> object : Returns the top n bits ranked by the information metric. Most of the ranking work is done in this call.
 
SetBiasList(...)
SetBiasList( (InfoBitRanker)arg1, (object)arg2) -> None : Set the classes to which the entropy calculation should be biased
 
SetMaskBits(...)
SetMaskBits( (InfoBitRanker)arg1, (object)arg2) -> None : Set the mask bits for the calculation
 
Tester(...)
Tester( (InfoBitRanker)arg1, (object)arg2) -> None :
 
WriteTopBitsToFile(...)
WriteTopBitsToFile( (InfoBitRanker)arg1, (str)arg2) -> None : Write the bits that have been ranked to a file
 
__init__(...)
__init__( (object)arg1, (int)nBits, (int)nClasses) -> None :
 
__reduce__(...)
helper for pickle

Inherited from unreachable.instance: __new__

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Class Variables
  __instance_size__ = 128
Properties

Inherited from object: __class__

Method Details

AccumulateVotes(...)

 

AccumulateVotes( (InfoBitRanker)arg1, (object)arg2, (int)arg3) -> None :
    Accumulate the votes for all the bits turned on in a bit vector
    
    ARGUMENTS:
    
      - bv : the bit vector, either an ExplicitBitVect or a SparseBitVect
      - label : the class label for the bit vector. It is assumed that 0 <= label < nClasses
    

    C++ signature :
        void AccumulateVotes(RDInfoTheory::InfoBitRanker*,boost::python::api::object,int)
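
The bookkeeping behind vote accumulation can be pictured as a table of per-bit, per-class counters. The sketch below is only an illustration of that idea under the stated label assumption; VoteTable and accumulate_votes are hypothetical names, and the real class takes RDKit bit vectors rather than lists of bit indices.

```python
class VoteTable:
    """Hypothetical stand-in for the ranker's per-bit, per-class counters."""

    def __init__(self, n_bits, n_classes):
        self.n_bits = n_bits
        self.n_classes = n_classes
        # counts[bit][label]: vectors of class `label` that have `bit` set.
        self.counts = [[0] * n_classes for _ in range(n_bits)]
        self.class_totals = [0] * n_classes

    def accumulate_votes(self, on_bits, label):
        """Add one labelled vector, given the indices of its set bits."""
        if not 0 <= label < self.n_classes:
            raise ValueError("label must satisfy 0 <= label < n_classes")
        self.class_totals[label] += 1
        for bit in on_bits:
            self.counts[bit][label] += 1


table = VoteTable(4, 2)
# The four vectors from the class demonstration, as lists of set bits.
for on_bits, label in [([3], 0), ([1, 3], 0), ([2], 1), ([0, 1, 2], 1)]:
    table.accumulate_votes(on_bits, label)
print(table.counts)
```

The rows of table.counts correspond to the n0 and n1 columns printed in the class demonstration.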

GetTopN(...)

 

GetTopN( (InfoBitRanker)arg1, (int)arg2) -> object :
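    Returns the top n bits ranked by the information metric.
    Most of the ranking work is done in this call. As in the class demonstration,
    each returned row holds the bit id, the metric value, and the number of vectors
    of each class that have the bit set.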
    
    ARGUMENTS:
    
      - num : the number of top ranked bits that are required
    

    C++ signature :
        _object* GetTopN(RDInfoTheory::InfoBitRanker*,int)

SetBiasList(...)

 

SetBiasList( (InfoBitRanker)arg1, (object)arg2) -> None :
    Set the classes to which the entropy calculation should be biased
    
    This list contains the class ids used by the biased modes of ranking bits (BIASENTROPY and BIASCHISQUARE).
    In these modes, a bit must be more strongly correlated with one of the biased classes than with all the
    other classes. For example, in a two-class problem with actives and inactives that is biased towards the
    actives, the fraction of actives that hit the bit has to be greater than the fraction of inactives that hit the bit.
    
    ARGUMENTS: 
    
      - classList : list of class ids that we want a bias towards
    

    C++ signature :
        void SetBiasList(RDInfoTheory::InfoBitRanker*,boost::python::api::object)
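
The bias condition described above can be written as a small predicate. This is a sketch of the rule as stated in the text, not the RDKit code: it assumes a strict "greater than" comparison between hit fractions, and passes_bias is a hypothetical name.

```python
def passes_bias(on_counts, class_totals, bias_classes):
    """Return True if some biased class hits the bit more often than every
    non-biased class, comparing the fraction of each class with the bit set."""
    fractions = [c / t if t else 0.0 for c, t in zip(on_counts, class_totals)]
    best_biased = max(fractions[c] for c in bias_classes)
    others = [f for c, f in enumerate(fractions) if c not in bias_classes]
    return all(best_biased > f for f in others)


# Per-bit "on" counts (class 0, class 1) from the class demonstration,
# with two vectors in each class and a bias towards class 1.
class_totals = [2, 2]
demo_counts = {0: [0, 1], 1: [1, 1], 2: [0, 2], 3: [2, 0]}
for bit, counts in demo_counts.items():
    print(bit, passes_bias(counts, class_totals, (1,)))
```

With a bias towards class 1, bits 0 and 2 satisfy the condition in this sketch, while bit 3 (set only in class 0) and bit 1 (set equally often in both classes) do not.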

SetMaskBits(...)

 

SetMaskBits( (InfoBitRanker)arg1, (object)arg2) -> None :
    Set the mask bits for the calculation
    
    ARGUMENTS: 
    
      - maskBits : list of mask bits to use
    

    C++ signature :
        void SetMaskBits(RDInfoTheory::InfoBitRanker*,boost::python::api::object)

Tester(...)

 

Tester( (InfoBitRanker)arg1, (object)arg2) -> None :

    C++ signature :
        void Tester(RDInfoTheory::InfoBitRanker*,boost::python::api::object)

WriteTopBitsToFile(...)

 

WriteTopBitsToFile( (InfoBitRanker)arg1, (str)arg2) -> None :
    Write the bits that have been ranked to a file

    C++ signature :
        void WriteTopBitsToFile(RDInfoTheory::InfoBitRanker {lvalue},std::string)

__init__(...)
(Constructor)

 

__init__( (object)arg1, (int)nBits, (int)nClasses) -> None :

    C++ signature :
        void __init__(_object*,int,int)

__init__( (object)arg1, (int)nBits, (int)nClasses, (InfoType)infoType) -> None :

    C++ signature :
        void __init__(_object*,int,int,RDInfoTheory::InfoBitRanker::InfoType)

Overrides: object.__init__

__reduce__(...)

 
helper for pickle

Overrides: object.__reduce__
(inherited documentation)