BitOps.h File Reference

Contains general bit-comparison and similarity operations.

#include "BitVects.h"
#include <string>

Functions

template<typename T>
double SimilarityWrapper (const T &bv1, const T &bv2, const double(*metric)(const T &, const T &), bool returnDistance=false)
template<typename T>
double SimilarityWrapper (const T &bv1, const T &bv2, double a, double b, const double(*metric)(const T &, const T &, double, double), bool returnDistance=false)
bool AllProbeBitsMatch (const char *probe, const char *ref)
bool AllProbeBitsMatch (const std::string &probe, const std::string &ref)
template<typename T1>
bool AllProbeBitsMatch (const T1 &probe, const std::string &pkl)
template<typename T1, typename T2>
int NumOnBitsInCommon (const T1 &bv1, const T2 &bv2)
 returns the number of on bits in common between two bit vectors
int NumOnBitsInCommon (const ExplicitBitVect &bv1, const ExplicitBitVect &bv2)
template<typename T1, typename T2>
const double TanimotoSimilarity (const T1 &bv1, const T2 &bv2)
 returns the Tanimoto similarity between two bit vects
template<typename T1, typename T2>
const double CosineSimilarity (const T1 &bv1, const T2 &bv2)
 returns the Cosine similarity between two bit vects
template<typename T1, typename T2>
const double KulczynskiSimilarity (const T1 &bv1, const T2 &bv2)
 returns the Kulczynski similarity between two bit vects
template<typename T1, typename T2>
const double DiceSimilarity (const T1 &bv1, const T2 &bv2)
 returns the Dice similarity between two bit vects
template<typename T1, typename T2>
const double TverskySimilarity (const T1 &bv1, const T2 &bv2, double a, double b)
 returns the Tversky similarity between two bit vects
template<typename T1, typename T2>
const double SokalSimilarity (const T1 &bv1, const T2 &bv2)
 returns the Sokal similarity between two bit vects
template<typename T1, typename T2>
const double McConnaugheySimilarity (const T1 &bv1, const T2 &bv2)
 returns the McConnaughey similarity between two bit vects
template<typename T1, typename T2>
const double AsymmetricSimilarity (const T1 &bv1, const T2 &bv2)
 returns the Asymmetric similarity between two bit vects
template<typename T1, typename T2>
const double BraunBlanquetSimilarity (const T1 &bv1, const T2 &bv2)
 returns the Braun-Blanquet similarity between two bit vects
template<typename T1, typename T2>
const double RusselSimilarity (const T1 &bv1, const T2 &bv2)
 returns the Russel similarity between two bit vects
template<typename T1, typename T2>
const double OnBitSimilarity (const T1 &bv1, const T2 &bv2)
 returns the on bit similarity between two bit vects
template<typename T1, typename T2>
const int NumBitsInCommon (const T1 &bv1, const T2 &bv2)
 returns the number of common bits (on and off) between two bit vects
template<typename T1, typename T2>
const double AllBitSimilarity (const T1 &bv1, const T2 &bv2)
 returns the common-bit similarity (on and off) between two bit vects
template<typename T1, typename T2>
IntVect OnBitsInCommon (const T1 &bv1, const T2 &bv2)
 returns an IntVect with indices of all on bits in common between two bit vects
template<typename T1, typename T2>
IntVect OffBitsInCommon (const T1 &bv1, const T2 &bv2)
 returns an IntVect with indices of all off bits in common between two bit vects
template<typename T1, typename T2>
DoubleVect OnBitProjSimilarity (const T1 &bv1, const T2 &bv2)
 returns the on-bit projected similarities between two bit vects
template<typename T1, typename T2>
DoubleVect OffBitProjSimilarity (const T1 &bv1, const T2 &bv2)
 returns the off-bit projected similarities between two bit vects
template<typename T1>
T1 * FoldFingerprint (const T1 &bv1, unsigned int factor=2)
 folds a bit vector factor times and returns the result
template<typename T1>
std::string BitVectToText (const T1 &bv1)
 returns a text representation of a bit vector (a string of 0s and 1s)


Detailed Description

Contains general bit-comparison and similarity operations.

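The notation used to document the similarity metrics is:
  • bv1_n : the total number of bits in bv1
  • bv1_o : the number of on bits in bv1
  • (bv1&bv2)_o : the number of on bits in the intersection of bv1 and bv2 (likewise for | and ^)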

Definition in file BitOps.h.


Function Documentation

template<typename T1, typename T2>
const double AllBitSimilarity ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns the common-bit similarity (on and off) between two bit vects

Returns:
[bv1_n - (bv1^bv2)_o] / bv1_n

template<typename T1>
bool AllProbeBitsMatch ( const T1 &  probe,
const std::string &  pkl 
) [inline]

bool AllProbeBitsMatch ( const std::string &  probe,
const std::string &  ref 
)

bool AllProbeBitsMatch ( const char *  probe,
const char *  ref 
)

template<typename T1, typename T2>
const double AsymmetricSimilarity ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns the Asymmetric similarity between two bit vects

Returns:
(bv1&bv2)_o / min(bv1_o,bv2_o)

template<typename T1>
std::string BitVectToText ( const T1 &  bv1  )  [inline]

returns a text representation of a bit vector (a string of 0s and 1s)

Parameters:
bv1 the vector to convert to text
Returns:
an std::string
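
As a rough illustration of the output format, the sketch below renders a bit vector as a string of 0s and 1s. It uses std::vector<bool> as a hypothetical stand-in for the library's bit-vector classes, and it assumes that bit 0 is written first; neither detail is confirmed by this page.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not the library function: std::vector<bool> stands in
// for a bit vector, and bit 0 is assumed to come first in the output.
inline std::string bitVectToTextSketch(const std::vector<bool> &bv) {
  std::string text(bv.size(), '0');
  for (std::size_t i = 0; i < bv.size(); ++i) {
    if (bv[i]) text[i] = '1';
  }
  return text;
}
```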

template<typename T1, typename T2>
const double BraunBlanquetSimilarity ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns the Braun-Blanquet similarity between two bit vects

Returns:
(bv1&bv2)_o / max(bv1_o,bv2_o)

template<typename T1, typename T2>
const double CosineSimilarity ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns the Cosine similarity between two bit vects

Returns:
(bv1&bv2)_o / sqrt(bv1_o * bv2_o)

template<typename T1, typename T2>
const double DiceSimilarity ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns the Dice similarity between two bit vects

Returns:
2*(bv1&bv2)_o / [bv1_o + bv2_o]

template<typename T1>
T1* FoldFingerprint ( const T1 &  bv1,
unsigned int  factor = 2 
) [inline]

folds a bit vector factor times and returns the result

Parameters:
bv1 the vector to be folded
factor (optional) the number of times to fold it
Returns:
a pointer to the folded fingerprint, which is bv1_n/factor long.
Note: The caller is responsible for deleting the result.

Referenced by SimilarityWrapper().
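
A minimal sketch of one common folding scheme follows. It assumes the folded length is bv1_n/factor, as stated above, and that an on bit at index i lands at index i mod (bv1_n/factor); this page does not spell out the mapping, so that detail is an assumption. The helper name and the use of std::vector<bool> are hypothetical, and the sketch returns by value instead of returning a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: folds a bit vector so that the result is
// bv.size() / factor bits long. Unlike FoldFingerprint(), it returns the
// folded vector by value, so there is nothing for the caller to delete.
inline std::vector<bool> foldSketch(const std::vector<bool> &bv,
                                    unsigned int factor = 2) {
  assert(factor > 0 && bv.size() % factor == 0);
  const std::size_t foldedSize = bv.size() / factor;
  std::vector<bool> folded(foldedSize, false);
  for (std::size_t i = 0; i < bv.size(); ++i) {
    // Assumed mapping: OR each on bit into position i mod foldedSize.
    if (bv[i]) folded[i % foldedSize] = true;
  }
  return folded;
}
```

With the real function, one way to meet the ownership requirement in the note above is to hand the returned pointer to a std::unique_ptr<T1> immediately, so that it is deleted automatically.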

template<typename T1, typename T2>
const double KulczynskiSimilarity ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns the Kulczynski similarity between two bit vects

Returns:
(bv1&bv2)_o * [bv1_o + bv2_o] / [2 * bv1_o * bv2_o]

template<typename T1, typename T2>
const double McConnaugheySimilarity ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns the McConnaughey similarity between two bit vects

Returns:
[(bv1&bv2)_o * (bv1_o + bv2_o) - (bv1_o * bv2_o)] / (bv1_o * bv2_o)

template<typename T1, typename T2>
const int NumBitsInCommon ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns the number of common bits (on and off) between two bit vects

Returns:
bv1_n - (bv1^bv2)_o
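
The formula can be read as "total bits minus the bits that differ". The sketch below spells that out with std::bitset standing in for the library's bit vectors; the function names are hypothetical, and the second helper applies the same count to the AllBitSimilarity formula.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical helpers using std::bitset in place of the library's bit
// vectors. Both arguments have length N, so bv1_n == bv2_n == N.
template <std::size_t N>
std::size_t numBitsInCommonSketch(const std::bitset<N> &bv1,
                                  const std::bitset<N> &bv2) {
  // bv1 ^ bv2 marks the positions where the two vectors differ.
  return N - (bv1 ^ bv2).count();
}

// [bv1_n - (bv1^bv2)_o] / bv1_n
template <std::size_t N>
double allBitSimilaritySketch(const std::bitset<N> &bv1,
                              const std::bitset<N> &bv2) {
  static_assert(N > 0, "the ratio is undefined for zero-length vectors");
  return static_cast<double>(numBitsInCommonSketch(bv1, bv2)) / N;
}
```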

int NumOnBitsInCommon ( const ExplicitBitVect &  bv1,
const ExplicitBitVect &  bv2 
)

template<typename T1, typename T2>
int NumOnBitsInCommon ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns the number of on bits in common between two bit vectors

Returns:
(bv1&bv2)_o

template<typename T1, typename T2>
DoubleVect OffBitProjSimilarity ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns the off-bit projected similarities between two bit vects

Returns:
two values, as a DoubleVect:
  • [bv1_n - (bv1|bv2)_o] / [bv1_n - bv1_o]
  • [bv2_n - (bv1|bv2)_o] / [bv2_n - bv2_o]
Note: bv1_n = bv2_n
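
The two ratios above can be sketched as follows, with std::bitset standing in for the library's bit vectors and std::vector<double> standing in for DoubleVect. The function name is hypothetical, and returning 0 for a vector with no off bits is an assumption made here to avoid dividing by zero.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

// Hypothetical helper: std::bitset replaces the bit-vector classes and
// std::vector<double> replaces DoubleVect.
template <std::size_t N>
std::vector<double> offBitProjSimilaritySketch(const std::bitset<N> &bv1,
                                               const std::bitset<N> &bv2) {
  // Positions that are off in both vectors: bv1_n - (bv1|bv2)_o.
  const double commonOff = static_cast<double>(N - (bv1 | bv2).count());
  const double off1 = static_cast<double>(N - bv1.count());
  const double off2 = static_cast<double>(N - bv2.count());
  // Assumption: a vector with no off bits gets a projection of 0.
  return {off1 > 0.0 ? commonOff / off1 : 0.0,
          off2 > 0.0 ? commonOff / off2 : 0.0};
}
```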

template<typename T1, typename T2>
IntVect OffBitsInCommon ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns an IntVect with indices of all off bits in common between two bit vects

template<typename T1, typename T2>
DoubleVect OnBitProjSimilarity ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns the on-bit projected similarities between two bit vects

Returns:
two values, as a DoubleVect:
  • (bv1&bv2)_o / bv1_o
  • (bv1&bv2)_o / bv2_o

template<typename T1, typename T2>
const double OnBitSimilarity ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns the on bit similarity between two bit vects

Returns:
(bv1&bv2)_o / (bv1|bv2)_o

template<typename T1, typename T2>
IntVect OnBitsInCommon ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns an IntVect with indices of all on bits in common between two bit vects

template<typename T1, typename T2>
const double RusselSimilarity ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns the Russel similarity between two bit vects

Returns:
(bv1&bv2)_o / bv1_o
Note: this operation is non-commutative: RusselSimilarity(bv1,bv2) != RusselSimilarity(bv2,bv1)

template<typename T>
double SimilarityWrapper ( const T &  bv1,
const T &  bv2,
double  a,
double  b,
const double(*)(const T &, const T &, double, double)  metric,
bool  returnDistance = false 
) [inline]

Definition at line 46 of file BitOps.h.

References FoldFingerprint().

template<typename T>
double SimilarityWrapper ( const T &  bv1,
const T &  bv2,
const double(*)(const T &, const T &)  metric,
bool  returnDistance = false 
) [inline]

general-purpose wrapper for calculating the similarity between two bit vectors that may be of unequal size (the larger one is folded automatically as appropriate)

Definition at line 26 of file BitOps.h.

References FoldFingerprint().

Referenced by RDDataManip::TanimotoDistanceMetric(), and RDDataManip::TanimotoSimilarityMetric().
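
The wrapper's behaviour might look roughly like the sketch below: fold the longer vector down to the length of the shorter one, apply the metric, and optionally convert the result to a distance. Everything here is an assumption layered on the one-line description above: the type alias, the helper names, the index-modulo folding scheme, and the use of 1 - similarity as the distance are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Bits = std::vector<bool>;  // hypothetical stand-in for a bit-vector class

// Fold bv down to targetSize bits by OR-ing bit i into position i % targetSize.
inline Bits foldToSketch(const Bits &bv, std::size_t targetSize) {
  assert(targetSize > 0 && bv.size() % targetSize == 0);
  Bits folded(targetSize, false);
  for (std::size_t i = 0; i < bv.size(); ++i) {
    if (bv[i]) folded[i % targetSize] = true;
  }
  return folded;
}

// Example metric with the (bv1&bv2)_o / (bv1|bv2)_o form of OnBitSimilarity.
inline double onBitSimilaritySketch(const Bits &bv1, const Bits &bv2) {
  assert(bv1.size() == bv2.size());
  double both = 0.0, either = 0.0;
  for (std::size_t i = 0; i < bv1.size(); ++i) {
    if (bv1[i] && bv2[i]) both += 1.0;
    if (bv1[i] || bv2[i]) either += 1.0;
  }
  // Assumption: two vectors with no on bits count as identical.
  return either > 0.0 ? both / either : 1.0;
}

inline double similarityWrapperSketch(const Bits &bv1, const Bits &bv2,
                                      double (*metric)(const Bits &, const Bits &),
                                      bool returnDistance = false) {
  double res;
  if (bv1.size() > bv2.size()) {
    res = metric(foldToSketch(bv1, bv2.size()), bv2);
  } else if (bv2.size() > bv1.size()) {
    res = metric(bv1, foldToSketch(bv2, bv1.size()));
  } else {
    res = metric(bv1, bv2);
  }
  // Assumption: the distance is taken to be 1 - similarity.
  return returnDistance ? 1.0 - res : res;
}
```

Passing the metric as a function pointer mirrors the signature listed above, so any two-argument similarity function with a matching type can be plugged in.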

template<typename T1, typename T2>
const double SokalSimilarity ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns the Sokal similarity between two bit vects

Returns:
(bv1&bv2)_o / [2*bv1_o + 2*bv2_o - 3*(bv1&bv2)_o]

template<typename T1, typename T2>
const double TanimotoSimilarity ( const T1 &  bv1,
const T2 &  bv2 
) [inline]

returns the Tanimoto similarity between two bit vects

Returns:
(bv1&bv2)_o / [bv1_o + bv2_o - (bv1&bv2)_o]

Referenced by RDDataManip::TanimotoDistanceMetric(), and RDDataManip::TanimotoSimilarityMetric().
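
For readers who want to check the formula by hand, here is a small sketch that evaluates it with std::bitset in place of the library's bit vectors. The function name is hypothetical, and returning 1 when neither vector has any on bits is an assumption; this page does not say how that case is handled.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical helper: (bv1&bv2)_o / [bv1_o + bv2_o - (bv1&bv2)_o]
template <std::size_t N>
double tanimotoSketch(const std::bitset<N> &bv1, const std::bitset<N> &bv2) {
  const double common = static_cast<double>((bv1 & bv2).count());
  const double denom = bv1.count() + bv2.count() - common;
  // Assumption: two vectors with no on bits are treated as identical.
  if (denom == 0.0) return 1.0;
  return common / denom;
}
```

For example, with bv1 = 1100 and bv2 = 1111 there are 2 on bits in common, so the result is 2 / (2 + 4 - 2) = 0.5.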

template<typename T1, typename T2>
const double TverskySimilarity ( const T1 &  bv1,
const T2 &  bv2,
double  a,
double  b 
) [inline]

returns the Tversky similarity between two bit vects

Returns:
(bv1&bv2)_o / [a*bv1_o + b*bv2_o + (1 - a - b)*(bv1&bv2)_o]
Notes:
  • 0 <= a,b <= 1
  • Tversky(a=1,b=1) = Tanimoto
  • Tversky(a=1/2,b=1/2) = Dice
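
The two special cases in the notes can be checked numerically with a sketch like the one below. It uses std::bitset as a stand-in for the library's bit vectors; the function name and the handling of a zero denominator are assumptions, not documented behaviour.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical helper:
// (bv1&bv2)_o / [a*bv1_o + b*bv2_o + (1 - a - b)*(bv1&bv2)_o]
template <std::size_t N>
double tverskySketch(const std::bitset<N> &bv1, const std::bitset<N> &bv2,
                     double a, double b) {
  const double common = static_cast<double>((bv1 & bv2).count());
  const double denom =
      a * bv1.count() + b * bv2.count() + (1.0 - a - b) * common;
  // Assumption: a zero denominator is reported as a similarity of 1.
  if (denom == 0.0) return 1.0;
  return common / denom;
}
```

With bv1 = 1100 and bv2 = 1111, a = b = 1 gives 2 / (2 + 4 - 2) = 0.5, which matches the Tanimoto formula. With bv1 = 1100 and bv2 = 0110, a = b = 1/2 gives 1 / (1 + 1) = 0.5, which matches the Dice formula 2*1 / (2 + 2).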


Generated on Fri Apr 3 06:03:02 2009 for RDCode by  doxygen 1.5.6