Package ML :: Package InfoTheory :: Module rdInfoTheory :: Class InfoBitRanker

Class InfoBitRanker



 object --+    
          |    
??.instance --+
              |
             InfoBitRanker

A class to rank the bits from a series of labelled fingerprints.

A simple demonstration may help clarify what this class does.
Here's a small set of bit vectors, each printed with its class label (acts):
>>> for i,bv in enumerate(bvs): print bv.ToBitString(),acts[i]
... 
0001 0
0101 0
0010 1
1110 1

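Default ranker, using infogain (nBits=4, nClasses=2). Each row is printed as
bit id, gain, number of class 0 vectors with the bit set, number of class 1 vectors with the bit set: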
>>> ranker = InfoBitRanker(4,2)  
>>> for i,bv in enumerate(bvs): ranker.AccumulateVotes(bv,acts[i])
... 
>>> for bit,gain,n0,n1 in ranker.GetTopN(3): print int(bit),'%.3f'%gain,int(n0),int(n1)
... 
3 1.000 2 0
2 1.000 0 2
0 0.311 0 1
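
The gains listed above can be checked by hand. The following is a minimal pure-Python sketch of an information gain calculation on the same four vectors; it is not the library's implementation, and the helper names (entropy, info_gain) and the use of bit strings in place of bit vector objects are assumptions made for illustration.

```python
import math

# Fingerprints and labels from the demonstration; character i is bit i.
fingerprints = ["0001", "0101", "0010", "1110"]
labels = [0, 0, 1, 1]
n_classes = 2


def entropy(counts):
    """Shannon entropy, in bits, of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def info_gain(bit):
    """Entropy of the labels minus the weighted entropy after splitting on `bit`."""
    on = [0] * n_classes   # class counts for vectors with the bit set
    off = [0] * n_classes  # class counts for vectors with the bit unset
    for fp, label in zip(fingerprints, labels):
        (on if fp[bit] == "1" else off)[label] += 1
    n = len(fingerprints)
    overall = [a + b for a, b in zip(on, off)]
    split = (sum(on) / n) * entropy(on) + (sum(off) / n) * entropy(off)
    return entropy(overall) - split


gains = {bit: round(info_gain(bit), 3) for bit in range(4)}
print(gains)  # {0: 0.311, 1: 0.0, 2: 1.0, 3: 1.0}
```

Bits 2 and 3 split the two classes perfectly and so recover the full 1 bit of label entropy, while bit 0 only separates one active from the rest.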

Using the biased infogain:
>>> ranker = InfoBitRanker(4,2,InfoTheory.InfoType.BIASENTROPY)
>>> ranker.SetBiasList((1,))
>>> for i,bv in enumerate(bvs): ranker.AccumulateVotes(bv,acts[i])
... 
>>> for bit,gain,n0,n1 in ranker.GetTopN(3): print int(bit),'%.3f'%gain,int(n0),int(n1)
... 
2 1.000 0 2
0 0.311 0 1
1 0.000 1 1

A chi squared ranker is also available:
>>> ranker = InfoBitRanker(4,2,InfoTheory.InfoType.CHISQUARE)
>>> for i,bv in enumerate(bvs): ranker.AccumulateVotes(bv,acts[i])
... 
>>> for bit,gain,n0,n1 in ranker.GetTopN(3): print int(bit),'%.3f'%gain,int(n0),int(n1)
... 
3 4.000 2 0
2 4.000 0 2
0 1.333 0 1
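
The chi squared values above are consistent with a standard Pearson chi squared statistic on a 2 x nClasses table (bit off/on against class). The following is a minimal pure-Python sketch under that assumption, not the library's implementation; chi_square and the bit-string representation are hypothetical names and choices made for illustration.

```python
# Fingerprints and labels from the demonstration; character i is bit i.
fingerprints = ["0001", "0101", "0010", "1110"]
labels = [0, 0, 1, 1]
n_classes = 2


def chi_square(bit):
    """Pearson chi squared for the (bit off/on) x (class) contingency table."""
    # Row 0 holds class counts where the bit is off, row 1 where it is on.
    table = [[0] * n_classes for _ in range(2)]
    for fp, label in zip(fingerprints, labels):
        table[int(fp[bit])][label] += 1
    n = len(fingerprints)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    score = 0.0
    for r in range(2):
        for c in range(n_classes):
            expected = row_totals[r] * col_totals[c] / n
            if expected > 0:  # skip empty rows or columns
                score += (table[r][c] - expected) ** 2 / expected
    return score


scores = {bit: round(chi_square(bit), 3) for bit in range(4)}
print(scores)  # {0: 1.333, 1: 0.0, 2: 4.0, 3: 4.0}
```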

A biased chi squared ranker is available as well:
>>> ranker = InfoBitRanker(4,2,InfoTheory.InfoType.BIASCHISQUARE)
>>> ranker.SetBiasList((1,))
>>> for i,bv in enumerate(bvs): ranker.AccumulateVotes(bv,acts[i])
... 
>>> for bit,gain,n0,n1 in ranker.GetTopN(3): print int(bit),'%.3f'%gain,int(n0),int(n1)
... 
2 4.000 0 2
0 1.333 0 1
1 0.000 1 1



Instance Methods
 
AccumulateVotes(...)
Accumulate the votes for all the bits turned on in a bit vector. ARGUMENTS: bv, the bit vector (either an ExplicitBitVect or a SparseBitVect); label, the class label for the bit vector.
 
GetTopN(...)
Returns the top n bits ranked by the information metric...
 
SetBiasList(...)
Set the classes toward which the entropy calculation should be biased. This list contains a set of class ids used in the BIASENTROPY mode of ranking bits.
 
SetMaskBits(...)
Set the mask bits for the calculation...
 
Tester(...)
C++ signature:...
 
WriteTopBitsToFile(...)
Write the bits that have been ranked to a file...
 
__init__(...)
C++ signature:...

Inherited from unreachable.instance: __new__

Inherited from object: __delattr__, __getattribute__, __hash__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Class Variables
  __instance_size__ = 72
Properties

Inherited from object: __class__

Method Details

AccumulateVotes(...)

 
Accumulate the votes for all the bits turned on in a bit vector

ARGUMENTS:

  - bv : bit vector, either an ExplicitBitVect or a SparseBitVect
  - label : the class label for the bit vector. It is assumed that 0 <= label < nClasses
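
The bookkeeping described here can be pictured as a table of per-bit, per-class counts. The following is a minimal pure-Python sketch, not the library's implementation; accumulate_votes, the counts table, and the use of bit strings in place of bit vector objects are assumptions made for illustration.

```python
n_bits, n_classes = 4, 2

# counts[bit][label] = number of vectors of class `label` that have `bit` set
counts = [[0] * n_classes for _ in range(n_bits)]


def accumulate_votes(counts, fingerprint, label):
    """Add one vote for `label` to every bit that is on in `fingerprint`."""
    if not 0 <= label < len(counts[0]):
        raise ValueError("label must satisfy 0 <= label < nClasses")
    for bit, ch in enumerate(fingerprint):
        if ch == "1":
            counts[bit][label] += 1


# Same vectors and labels as in the class-level demonstration.
for fp, label in zip(["0001", "0101", "0010", "1110"], [0, 0, 1, 1]):
    accumulate_votes(counts, fp, label)

print(counts)  # [[0, 1], [1, 1], [0, 2], [2, 0]]
```

Each inner pair matches the last two columns printed in the demonstration, for example bit 3 is set in two class 0 vectors and no class 1 vectors.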

C++ signature:
    AccumulateVotes(RDInfoTheory::InfoBitRanker*, boost::python::api::object, int) -> void*

GetTopN(...)

 
Returns the top n bits ranked by the information metric.
Most of the work of ranking is done in this function.

ARGUMENTS:

  - num : the number of top ranked bits that are required
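
As a rough picture of the result, each returned row in the demonstration unpacks as (bit, score, n0, n1). The following pure-Python sketch selects the top rows from already-scored bits; it is an illustration only, top_n is a hypothetical helper, and breaking ties by higher bit id is an assumption chosen so that the order matches the demonstration, since the library's tie-breaking rule is not stated here.

```python
import heapq

# (bit, score, n0, n1) rows, using the infogain values from the demonstration.
rows = [
    (0, 0.311, 0, 1),
    (1, 0.000, 1, 1),
    (2, 1.000, 0, 2),
    (3, 1.000, 2, 0),
]


def top_n(rows, num):
    """Return the `num` highest-scoring rows, best first."""
    # Assumption: ties on score are broken in favour of the higher bit id.
    return heapq.nlargest(num, rows, key=lambda row: (row[1], row[0]))


top = top_n(rows, 3)
for bit, score, n0, n1 in top:
    print(bit, "%.3f" % score, n0, n1)
```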

C++ signature:
    GetTopN(RDInfoTheory::InfoBitRanker*, int) -> _object*

SetBiasList(...)

 
Set the classes toward which the entropy calculation should be biased.

This list contains a set of class ids used in the BIASENTROPY mode of ranking bits.
In this mode, a bit must be more strongly correlated with one of the biased classes than with
any of the other classes. For example, in a two-class problem with actives and inactives that is
biased toward the actives, the fraction of actives that hit the bit has to be greater than the
fraction of inactives that hit the bit.
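
The bias condition can be sketched in a few lines. The following pure-Python illustration is not the library's implementation; passes_bias and its arguments are hypothetical, the comparison is strict because the text says "greater than", and how the library treats ties or scores rejected bits is not stated here.

```python
def passes_bias(bit_counts, class_sizes, bias_classes):
    """True if some biased class hits the bit more often than every other class.

    bit_counts[c]  : number of vectors of class c with the bit set
    class_sizes[c] : total number of vectors of class c
    bias_classes   : class ids the ranking is biased toward
    """
    fractions = [
        hits / size if size else 0.0
        for hits, size in zip(bit_counts, class_sizes)
    ]
    best_biased = max(fractions[c] for c in bias_classes)
    others = [f for c, f in enumerate(fractions) if c not in bias_classes]
    # Assumption: a tie with a non-biased class does not count as a pass.
    return all(best_biased > f for f in others)


# Per-bit class counts from the demonstration (two inactives, two actives),
# biased toward class 1 as in SetBiasList((1,)).
counts = {0: [0, 1], 1: [1, 1], 2: [0, 2], 3: [2, 0]}
accepted = {bit: passes_bias(c, [2, 2], (1,)) for bit, c in counts.items()}
print(accepted)  # {0: True, 1: False, 2: True, 3: False}
```

Bit 3 is set only in inactives, which is consistent with it dropping out of the biased rankings in the demonstration.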

ARGUMENTS: 

  - classList : list of class ids that we want a bias towards

C++ signature:
    SetBiasList(RDInfoTheory::InfoBitRanker*, boost::python::api::object) -> void*

SetMaskBits(...)

 
Set the mask bits for the calculation

ARGUMENTS: 

  - maskBits : list of mask bits to use

C++ signature:
    SetMaskBits(RDInfoTheory::InfoBitRanker*, boost::python::api::object) -> void*

Tester(...)

 
C++ signature:
    Tester(RDInfoTheory::InfoBitRanker*, boost::python::api::object) -> void*

WriteTopBitsToFile(...)

 
Write the bits that have been ranked to a file

C++ signature:
    WriteTopBitsToFile(RDInfoTheory::InfoBitRanker {lvalue}, std::string) -> void*

__init__(...)
(Constructor)

 
C++ signature:
    __init__(_object*, int nBits, int nClasses) -> void*
C++ signature:
    __init__(_object*, int nBits, int nClasses, RDInfoTheory::InfoBitRanker::InfoType infoType) -> void*

Overrides: object.__init__