RDKit
Open-source cheminformatics and machine learning.
RDKit::FPBReader Class Reference

class for reading and searching FPB files

#include <FPBReader.h>

Public Member Functions

 FPBReader ()
 
 FPBReader (const char *fname, bool lazyRead=false)
 ctor for reading from a named file
 
 FPBReader (const std::string &fname, bool lazyRead=false)
 
 FPBReader (std::istream *inStream, bool takeOwnership=true, bool lazyRead=false)
 ctor for reading from an open istream
 
 ~FPBReader ()
 
void init ()
 Read the data from the file and initialize internal data structures.
 
void cleanup ()
 cleanup
 
boost::shared_ptr< ExplicitBitVect > getFP (unsigned int idx) const
 returns the requested fingerprint as an ExplicitBitVect
 
boost::shared_array< boost::uint8_t > getBytes (unsigned int idx) const
 returns the requested fingerprint as an array of bytes
 
std::string getId (unsigned int idx) const
 returns the id of the requested fingerprint
 
std::pair< boost::shared_ptr< ExplicitBitVect >, std::string > operator[] (unsigned int idx) const
 returns the fingerprint and id of the requested fingerprint
 
std::pair< unsigned int, unsigned int > getFPIdsInCountRange (unsigned int minCount, unsigned int maxCount)
 
unsigned int length () const
 returns the number of fingerprints
 
unsigned int nBits () const
 returns the number of bits in our fingerprints
 
double getTanimoto (unsigned int idx, const boost::uint8_t *bv) const
 
double getTanimoto (unsigned int idx, boost::shared_array< boost::uint8_t > bv) const
 
double getTanimoto (unsigned int idx, const ExplicitBitVect &ebv) const
 
std::vector< std::pair< double, unsigned int > > getTanimotoNeighbors (const boost::uint8_t *bv, double threshold=0.7, bool usePopcountScreen=true) const
 returns Tanimoto neighbors that are within a similarity threshold
 
std::vector< std::pair< double, unsigned int > > getTanimotoNeighbors (boost::shared_array< boost::uint8_t > bv, double threshold=0.7, bool usePopcountScreen=true) const
 
std::vector< std::pair< double, unsigned int > > getTanimotoNeighbors (const ExplicitBitVect &ebv, double threshold=0.7, bool usePopcountScreen=true) const
 
double getTversky (unsigned int idx, const boost::uint8_t *bv, double ca, double cb) const
 
double getTversky (unsigned int idx, boost::shared_array< boost::uint8_t > bv, double ca, double cb) const
 
double getTversky (unsigned int idx, const ExplicitBitVect &ebv, double ca, double cb) const
 
std::vector< std::pair< double, unsigned int > > getTverskyNeighbors (const boost::uint8_t *bv, double ca, double cb, double threshold=0.7, bool usePopcountScreen=true) const
 returns Tversky neighbors that are within a similarity threshold
 
std::vector< std::pair< double, unsigned int > > getTverskyNeighbors (boost::shared_array< boost::uint8_t > bv, double ca, double cb, double threshold=0.7, bool usePopcountScreen=true) const
 
std::vector< std::pair< double, unsigned int > > getTverskyNeighbors (const ExplicitBitVect &ebv, double ca, double cb, double threshold=0.7, bool usePopcountScreen=true) const
 
std::vector< unsigned int > getContainingNeighbors (const boost::uint8_t *bv) const
 returns indices of all fingerprints that completely contain this one
 
std::vector< unsigned int > getContainingNeighbors (boost::shared_array< boost::uint8_t > bv) const
 
std::vector< unsigned int > getContainingNeighbors (const ExplicitBitVect &ebv) const
 

Detailed Description

class for reading and searching FPB files

basic usage:

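FPBReader reader("foo.fpb");
reader.init();  // required before most other methods can be used
boost::shared_ptr<ExplicitBitVect> ebv = reader.getFP(95);
// (similarity, index) pairs with a Tanimoto similarity of at least 0.70
std::vector<std::pair<double, unsigned int> > nbrs =
    reader.getTanimotoNeighbors(*ebv, 0.70);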

Note: this functionality is experimental and the API may change in future releases.

Note on thread safety: operations that involve reading from the FPB file are not thread safe. This means that the init() method is not thread safe, and none of the search operations are thread safe when an FPBReader is initialized in lazyRead mode.
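One way to share a reader opened in lazyRead mode between threads is to serialize every call with a mutex. The following is only a sketch of that idea under the assumption that external locking is acceptable for the application; LockedReader and its with() method are hypothetical names and are not part of RDKit.

```cpp
#include <mutex>
#include <utility>

// Hypothetical helper (not part of RDKit): every call made through with()
// runs while a mutex is held, so calls from different threads never overlap.
template <typename Reader>
class LockedReader {
 public:
  explicit LockedReader(Reader &reader) : d_reader(reader) {}

  // Runs fn(reader) under the lock and returns whatever fn returns.
  template <typename Fn>
  auto with(Fn &&fn) -> decltype(fn(std::declval<Reader &>())) {
    std::lock_guard<std::mutex> lock(d_mutex);
    return fn(d_reader);
  }

 private:
  Reader &d_reader;    // the wrapped reader; must outlive this wrapper
  std::mutex d_mutex;  // serializes all access made through with()
};
```

With an FPBReader as the Reader type, a search could then be wrapped in a lambda passed to with(). Callers that bypass the wrapper are not protected.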

Definition at line 57 of file FPBReader.h.

Constructor & Destructor Documentation

RDKit::FPBReader::FPBReader ( )
inline

Definition at line 59 of file FPBReader.h.

RDKit::FPBReader::FPBReader ( const char *  fname,
bool  lazyRead = false 
)
inline

ctor for reading from a named file

Parameters
fname     the name of the file to read
lazyRead  if set to false, all fingerprints from the file will be read into memory when init() is called.

Definition at line 71 of file FPBReader.h.

RDKit::FPBReader::FPBReader ( const std::string &  fname,
bool  lazyRead = false 
)
inline

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Definition at line 75 of file FPBReader.h.

RDKit::FPBReader::FPBReader ( std::istream *  inStream,
bool  takeOwnership = true,
bool  lazyRead = false 
)
inline

ctor for reading from an open istream

Parameters
inStream       the stream to read from
takeOwnership  if set, we will take over ownership of the stream pointer
lazyRead       if set to false, all fingerprints from the file will be read into memory when init() is called.

Some additional notes:

  • if lazyRead is set, inStream must support the seekg() and tellg() operations.
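Before handing a stream to the istream constructor with lazyRead set, it can be useful to check that the stream actually reports and restores positions. The helper below is a hypothetical pre-flight check, not an RDKit function, and it assumes that a stream whose tellg() returns -1 cannot be used for lazy reading.

```cpp
#include <istream>
#include <sstream>

// Hypothetical check (not part of RDKit): returns true if the stream reports
// a valid read position and can seek back to that position.
inline bool looksSeekable(std::istream &in) {
  const std::istream::pos_type pos = in.tellg();
  if (pos == std::istream::pos_type(-1)) {
    return false;  // tellg() failed: no usable position information
  }
  in.seekg(pos);  // seek to where we already are; should be a no-op
  return !in.fail();
}
```

A std::istringstream or a std::ifstream opened on a regular file would normally pass this check. A stream that is already in a failed state does not.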

Definition at line 90 of file FPBReader.h.

RDKit::FPBReader::~FPBReader ( )
inline

Definition at line 97 of file FPBReader.h.

Member Function Documentation

void RDKit::FPBReader::cleanup ( )
inline

cleanup

Cleans up whatever memory was allocated during init()

Definition at line 120 of file FPBReader.h.

boost::shared_array<boost::uint8_t> RDKit::FPBReader::getBytes ( unsigned int  idx) const

returns the requested fingerprint as an array of bytes

std::vector<unsigned int> RDKit::FPBReader::getContainingNeighbors ( const boost::uint8_t *  bv) const

returns indices of all fingerprints that completely contain this one

(i.e. where all the bits set in the query are also set in the db molecule)
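The containment condition can be illustrated on raw byte arrays: a database fingerprint contains the query when ANDing the two leaves the query unchanged. This is a standalone sketch of the condition, not the RDKit implementation; Fingerprint, contains and containingNeighbors are names made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Fingerprint = std::vector<std::uint8_t>;  // illustrative type only

// True if every bit set in query is also set in candidate.
inline bool contains(const Fingerprint &candidate, const Fingerprint &query) {
  assert(candidate.size() == query.size());
  for (std::size_t i = 0; i < query.size(); ++i) {
    if ((candidate[i] & query[i]) != query[i]) {
      return false;  // the query has a bit here that the candidate lacks
    }
  }
  return true;
}

// Indices of all database fingerprints that contain the query.
inline std::vector<unsigned int> containingNeighbors(
    const std::vector<Fingerprint> &db, const Fingerprint &query) {
  std::vector<unsigned int> res;
  for (unsigned int i = 0; i < db.size(); ++i) {
    if (contains(db[i], query)) {
      res.push_back(i);
    }
  }
  return res;
}
```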

std::vector<unsigned int> RDKit::FPBReader::getContainingNeighbors ( boost::shared_array< boost::uint8_t >  bv) const
inline

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Definition at line 241 of file FPBReader.h.

std::vector<unsigned int> RDKit::FPBReader::getContainingNeighbors ( const ExplicitBitVect &  ebv) const

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

boost::shared_ptr<ExplicitBitVect> RDKit::FPBReader::getFP ( unsigned int  idx) const

returns the requested fingerprint as an ExplicitBitVect

std::pair<unsigned int, unsigned int> RDKit::FPBReader::getFPIdsInCountRange ( unsigned int  minCount,
unsigned int  maxCount 
)

returns beginning and end indices of fingerprints having on-bit counts within the range (including end points)
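A range lookup of this kind is cheap when the fingerprints are stored in order of increasing on-bit count, because both ends can be found by binary search. The sketch below assumes that ordering and returns a half-open [begin, end) index pair; both are assumptions of the example rather than statements about the RDKit implementation, and idsInCountRange is a hypothetical name.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// sortedCounts[i] is the on-bit count of fingerprint i and must be sorted in
// ascending order. Returns {begin, end} so that every index in [begin, end)
// has minCount <= count <= maxCount.
inline std::pair<unsigned int, unsigned int> idsInCountRange(
    const std::vector<unsigned int> &sortedCounts, unsigned int minCount,
    unsigned int maxCount) {
  const auto begin = sortedCounts.begin();
  const auto end = sortedCounts.end();
  // First fingerprint whose count is >= minCount.
  const auto first = std::lower_bound(begin, end, minCount);
  // One past the last fingerprint whose count is <= maxCount.
  const auto last = std::upper_bound(first, end, maxCount);
  return std::make_pair(static_cast<unsigned int>(first - begin),
                        static_cast<unsigned int>(last - begin));
}
```

An empty result shows up as a pair with equal begin and end values.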

std::string RDKit::FPBReader::getId ( unsigned int  idx) const

returns the id of the requested fingerprint

double RDKit::FPBReader::getTanimoto ( unsigned int  idx,
const boost::uint8_t *  bv 
) const

returns the Tanimoto similarity between the specified fingerprint and the provided fingerprint
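For fingerprints stored as byte arrays, the Tanimoto similarity is the number of bits set in both fingerprints divided by the number of bits set in either. The function below is a minimal sketch of that formula and not the RDKit code; tanimotoBytes is a hypothetical name, and returning 0.0 when both fingerprints are empty is a choice made for this example.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

// Tanimoto similarity of two fingerprints of nBytes bytes each:
// |A and B| / |A or B|.
inline double tanimotoBytes(const std::uint8_t *a, const std::uint8_t *b,
                            std::size_t nBytes) {
  std::size_t inCommon = 0;  // bits set in both fingerprints
  std::size_t inEither = 0;  // bits set in at least one fingerprint
  for (std::size_t i = 0; i < nBytes; ++i) {
    inCommon += std::bitset<8>(a[i] & b[i]).count();
    inEither += std::bitset<8>(a[i] | b[i]).count();
  }
  // Two empty fingerprints: this sketch reports 0.0 (an arbitrary choice).
  return inEither ? static_cast<double>(inCommon) / inEither : 0.0;
}
```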

double RDKit::FPBReader::getTanimoto ( unsigned int  idx,
boost::shared_array< boost::uint8_t >  bv 
) const
inline

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Definition at line 152 of file FPBReader.h.

double RDKit::FPBReader::getTanimoto ( unsigned int  idx,
const ExplicitBitVect &  ebv 
) const

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

std::vector<std::pair<double, unsigned int> > RDKit::FPBReader::getTanimotoNeighbors ( const boost::uint8_t *  bv,
double  threshold = 0.7,
bool  usePopcountScreen = true 
) const

returns Tanimoto neighbors that are within a similarity threshold

The result vector of (similarity,index) pairs is sorted in order of decreasing similarity

Parameters
bv                 the query fingerprint
threshold          the minimum similarity to return
usePopcountScreen  if this is true (the default), the popcount of the neighbors will be used to reduce the number of calculations that need to be done

std::vector<std::pair<double, unsigned int> > RDKit::FPBReader::getTanimotoNeighbors ( boost::shared_array< boost::uint8_t >  bv,
double  threshold = 0.7,
bool  usePopcountScreen = true 
) const
inline

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Definition at line 175 of file FPBReader.h.

std::vector<std::pair<double, unsigned int> > RDKit::FPBReader::getTanimotoNeighbors ( const ExplicitBitVect &  ebv,
double  threshold = 0.7,
bool  usePopcountScreen = true 
) const

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
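The popcount screen mentioned for getTanimotoNeighbors relies on a simple bound: for on-bit counts q and c, the Tanimoto similarity can never exceed min(q, c) / max(q, c), so candidates that fail this bound can be skipped without comparing bits. The sketch below shows one way to apply that bound with a brute-force scan; it is not the RDKit implementation, and all names in it are hypothetical.

```cpp
#include <algorithm>
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Fingerprint = std::vector<std::uint8_t>;  // illustrative type only
using Hit = std::pair<double, unsigned int>;    // (similarity, index)

inline std::size_t countOnBits(const Fingerprint &fp) {
  std::size_t n = 0;
  for (std::uint8_t byte : fp) n += std::bitset<8>(byte).count();
  return n;
}

inline double tanimoto(const Fingerprint &a, const Fingerprint &b) {
  std::size_t inCommon = 0, inEither = 0;
  for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
    inCommon += std::bitset<8>(a[i] & b[i]).count();
    inEither += std::bitset<8>(a[i] | b[i]).count();
  }
  return inEither ? static_cast<double>(inCommon) / inEither : 0.0;
}

inline std::vector<Hit> tanimotoNeighbors(const std::vector<Fingerprint> &db,
                                          const Fingerprint &query,
                                          double threshold = 0.7,
                                          bool usePopcountScreen = true) {
  std::vector<Hit> res;
  const double q = static_cast<double>(countOnBits(query));
  for (unsigned int i = 0; i < db.size(); ++i) {
    if (usePopcountScreen) {
      const double c = static_cast<double>(countOnBits(db[i]));
      // Upper bound min/max is below the threshold: cannot be a neighbor.
      if (std::min(q, c) < threshold * std::max(q, c)) continue;
    }
    const double sim = tanimoto(db[i], query);
    if (sim >= threshold) res.emplace_back(sim, i);
  }
  // Sort by decreasing similarity.
  std::sort(res.begin(), res.end(),
            [](const Hit &x, const Hit &y) { return x.first > y.first; });
  return res;
}
```

Turning the screen off should return the same hits; it only changes how many full comparisons are carried out.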

double RDKit::FPBReader::getTversky ( unsigned int  idx,
const boost::uint8_t *  bv,
double  ca,
double  cb 
) const

returns the Tversky similarity between the specified fingerprint and the provided fingerprint

Parameters
idx  the fingerprint to compare to
bv   the query fingerprint
ca   the Tversky a coefficient
cb   the Tversky b coefficient

double RDKit::FPBReader::getTversky ( unsigned int  idx,
boost::shared_array< boost::uint8_t >  bv,
double  ca,
double  cb 
) const
inline

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Definition at line 198 of file FPBReader.h.

double RDKit::FPBReader::getTversky ( unsigned int  idx,
const ExplicitBitVect &  ebv,
double  ca,
double  cb 
) const

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
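The Tversky index generalizes Tanimoto by weighting the bits unique to each fingerprint separately: with c bits in common, x bits only in the first fingerprint and y bits only in the second, the similarity is c / (c + ca*x + cb*y). The sketch below implements that formula on byte arrays. It is not the RDKit code, tverskyBytes is a hypothetical name, and which of the two fingerprints RDKit pairs with ca is not stated here, so the argument order is an assumption.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

// Tversky similarity of two fingerprints of nBytes bytes each.
// ca weights bits found only in a, cb weights bits found only in b.
inline double tverskyBytes(const std::uint8_t *a, const std::uint8_t *b,
                           std::size_t nBytes, double ca, double cb) {
  std::size_t inCommon = 0, onlyA = 0, onlyB = 0;
  for (std::size_t i = 0; i < nBytes; ++i) {
    inCommon += std::bitset<8>(a[i] & b[i]).count();
    onlyA += std::bitset<8>(a[i] & ~b[i]).count();
    onlyB += std::bitset<8>(b[i] & ~a[i]).count();
  }
  const double denom = inCommon + ca * onlyA + cb * onlyB;
  // Guard against division by zero; 0.0 is a choice made for this sketch.
  return denom > 0.0 ? inCommon / denom : 0.0;
}
```

With ca = cb = 1 the result equals the Tanimoto similarity, and with ca = cb = 0.5 it equals the Dice similarity.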

std::vector<std::pair<double, unsigned int> > RDKit::FPBReader::getTverskyNeighbors ( const boost::uint8_t *  bv,
double  ca,
double  cb,
double  threshold = 0.7,
bool  usePopcountScreen = true 
) const

returns Tversky neighbors that are within a similarity threshold

The result vector of (similarity,index) pairs is sorted in order of decreasing similarity

Parameters
bv                 the query fingerprint
ca                 the Tversky a coefficient
cb                 the Tversky b coefficient
threshold          the minimum similarity to return
usePopcountScreen  if this is true (the default), the popcount of the neighbors will be used to reduce the number of calculations that need to be done

std::vector<std::pair<double, unsigned int> > RDKit::FPBReader::getTverskyNeighbors ( boost::shared_array< boost::uint8_t >  bv,
double  ca,
double  cb,
double  threshold = 0.7,
bool  usePopcountScreen = true 
) const
inline

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Definition at line 224 of file FPBReader.h.

std::vector<std::pair<double, unsigned int> > RDKit::FPBReader::getTverskyNeighbors ( const ExplicitBitVect &  ebv,
double  ca,
double  cb,
double  threshold = 0.7,
bool  usePopcountScreen = true 
) const

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

void RDKit::FPBReader::init ( )

Read the data from the file and initialize internal data structures.

This must be called before most of the other methods of this class.

Some notes:

  • If lazyRead is not set, all fingerprints will be read into memory. This can require substantial amounts of memory for large files.
  • For large files, this can take a long time.
  • If lazyRead and takeOwnership are both false, it is safe to close and delete inStream after calling init().
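To get a feel for the memory cost of reading everything up front, the raw fingerprint data can be estimated as the number of fingerprints times the bytes per fingerprint. This is a rough lower bound only: it ignores ids, any index data and any padding the file format may use, and estimateFingerprintBytes is a hypothetical helper rather than an RDKit function.

```cpp
#include <cstdint>

// Rough lower bound on the bytes needed to hold the raw fingerprint bits.
inline std::uint64_t estimateFingerprintBytes(std::uint64_t numFingerprints,
                                              unsigned int nBits) {
  // Round up to whole bytes per fingerprint.
  const std::uint64_t bytesPerFp = (nBits + 7u) / 8u;
  return numFingerprints * bytesPerFp;
}
```

For example, one million 2048-bit fingerprints need at least 256,000,000 bytes for the bits alone, which is where lazyRead mode becomes attractive.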

Referenced by RDKit::MultiFPBReader::addReader().

unsigned int RDKit::FPBReader::length ( ) const

returns the number of fingerprints

unsigned int RDKit::FPBReader::nBits ( ) const

returns the number of bits in our fingerprints

std::pair<boost::shared_ptr<ExplicitBitVect>, std::string> RDKit::FPBReader::operator[] ( unsigned int  idx) const
inline

returns the fingerprint and id of the requested fingerprint

Definition at line 133 of file FPBReader.h.


The documentation for this class was generated from the following file: