Chem.BuildFragmentCatalog: command line utility for working with FragmentCatalogs (CASE-type analysis)
**Usage**
BuildFragmentCatalog [optional args] <filename>
filename, the name of a delimited text file containing InData, is required
for some modes of operation (see below)
**Command Line Arguments**
- -n *maxNumMols*: specify the maximum number of molecules to be processed
- -b: build the catalog and OnBitLists
*requires InData*
- -s: score compounds
*requires InData and a Catalog, can use OnBitLists*
- -g: calculate info gains
*requires Scores*
- -d: show details about high-ranking fragments
*requires a Catalog and Gains*
- --catalog=*filename*: filename with the pickled catalog.
Chem.Graphs: Python functions for manipulating molecular graphs
In theory much of the functionality in here should be migrating into the
C/C++ codebase.
Chem.Lipinski: Calculation of Lipinski parameters for molecules
Chem.MACCSkeys: SMARTS definitions for the publicly available MACCS keys...
Chem.Pharm2D: module with functionality for 2D pharmacophores
Chem.Pharm2D.Generate: generation of 2D pharmacophores
**Notes**
- The terminology for this gets a bit rocky, so here's a glossary of what
the terms used here mean:
1) *N-point pharmacophore*: a combination of N features along with
the distances between them.
Chem.Pharm2D.Matcher: functionality for finding pharmacophore matches in molecules
See Docs/Chem/Pharm2D.triangles.jpg for an illustration of the way
pharmacophores are broken into triangles and labelled.
Chem.Pharm2D.Signature: data structures for holding 2D pharmacophore signatures
See Docs/Chem/Pharm2D.triangles.jpg for an illustration of the way
pharmacophores are broken into triangles and labelled.
Chem.Pharm2D.Utils: utility functionality for the 2D pharmacophores code
See Docs/Chem/Pharm2D.triangles.jpg for an illustration of the way
pharmacophores are broken into triangles and labelled.
Dbase.DbResultSet: defines class _DbResultSet_ for lazy interactions with Db query results
**Note**
This uses the Python iterator interface, so you'll need Python 2.2 or above.
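To illustrate what lazy interaction through the iterator interface can look like, here is a minimal sketch. It is not the actual _DbResultSet_ implementation: the class name `LazyResultSet`, its constructor arguments, and the use of an in-memory `sqlite3` database are all assumptions made for the example.

```python
import sqlite3


class LazyResultSet:
    """Hypothetical stand-in for a lazy result set (not the real _DbResultSet_ API)."""

    def __init__(self, cursor, query):
        self._cursor = cursor
        self._query = query

    def __iter__(self):
        # The query is executed only when iteration starts,
        # and rows are pulled from the cursor one at a time.
        self._cursor.execute(self._query)
        while True:
            row = self._cursor.fetchone()
            if row is None:
                return
            yield row


def demo():
    # Build a small throwaway table so the example is self-contained.
    conn = sqlite3.connect(":memory:")
    conn.execute("create table mols (id integer, smiles text)")
    conn.executemany("insert into mols values (?, ?)",
                     [(1, "CCO"), (2, "c1ccccc1")])
    results = LazyResultSet(conn.cursor(),
                            "select id, smiles from mols order by id")
    return [row for row in results]
```

Because rows are produced on demand, a caller can stop iterating early without fetching the rest of the result.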
Dbase.DbUtils: a set of functions for interacting with databases...
Dbase.Pubmed.Searches: Tools for doing PubMed searches and processing the results
NOTE: much of the example code in the documentation here uses XML
files from the test_data directory in order to avoid having to call
out to PubMed itself.
Dbase.StorageUtils: Various storage (molecular and otherwise) functionality
ML.AnalyzeComposite: command line utility to report on the contributions of descriptors to
tree-based composite models
Usage: AnalyzeComposite [optional args] <models>
<models>: file name(s) of pickled composite model(s)
(this is the name of the db table if using a database)
Optional Arguments:
-n number: the number of levels of each model to consider
-d dbname: the database from which to read the models
-N Note: the note string to search for to pull models from the database
-X: Send the results to Excel.
ML.BuildComposite: command line utility for building composite models
**Usage**
BuildComposite [optional args] filename
Unless indicated otherwise (via command line arguments), _filename_ is
a QDAT file.
ML.Composite.BayesComposite: code for dealing with Bayesian composite models
For a model to be useable here, it should support the following API:
- _ClassifyExample(example)_, returns a classification
Other compatibility notes:
1) To use _Composite.Grow_ there must be some kind of builder
functionality which returns a 2-tuple containing (model,percent accuracy).
ML.Composite.Composite: code for dealing with composite models
For a model to be useable here, it should support the following API:
- _ClassifyExample(example)_, returns a classification
Other compatibility notes:
1) To use _Composite.Grow_ there must be some kind of builder
functionality which returns a 2-tuple containing (model,percent accuracy).
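The API described above can be sketched as follows. Both `MajorityModel` and `build_majority_model` are hypothetical names invented for this example. The sketch also assumes that each example is a sequence whose last element is the label and that "percent accuracy" is expressed on a 0-100 scale; neither convention is confirmed by the text above.

```python
class MajorityModel:
    """Hypothetical model that always predicts the most common training label."""

    def __init__(self, label):
        self.label = label

    def ClassifyExample(self, example):
        # The only method required by the API described above.
        return self.label


def build_majority_model(examples):
    """Hypothetical builder returning a 2-tuple (model, percent accuracy).

    Assumes each example is a sequence whose last element is its label.
    """
    labels = [ex[-1] for ex in examples]
    most_common = max(set(labels), key=labels.count)
    model = MajorityModel(most_common)
    n_correct = sum(1 for ex in examples
                    if model.ClassifyExample(ex) == ex[-1])
    # Accuracy on the training examples, assumed to be on a 0-100 scale.
    return model, 100.0 * n_correct / len(examples)
```

Any object with a compatible _ClassifyExample_ method could take the place of `MajorityModel` here.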
ML.CompositeRun: contains a class to store parameters for and results from...
ML.Data.DataUtils: Utilities for data manipulation
**FILE FORMATS:**
- *.qdat files* contain quantized data suitable for
feeding to learning algorithms.
ML.Data.MLData: classes to be used to help work with data sets
ML.Data.Quantize: Automatic search for quantization bounds
This uses the expected information gain to determine where quantization bounds
should lie.
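As a rough illustration of the idea, the sketch below picks a single quantization bound by maximizing information gain over candidate midpoints. It is a simplified stand-in, not the module's actual search: the function names are hypothetical, and only one bound is considered.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    result = 0.0
    for label in set(labels):
        p = labels.count(label) / n
        result -= p * math.log2(p)
    return result


def info_gain(values, labels, bound):
    """Entropy reduction obtained by splitting the data at `bound`."""
    below = [lab for v, lab in zip(values, labels) if v < bound]
    above = [lab for v, lab in zip(values, labels) if v >= bound]
    n = len(labels)
    remainder = (len(below) / n) * entropy(below) \
        + (len(above) / n) * entropy(above)
    return entropy(labels) - remainder


def best_single_bound(values, labels):
    """Return the midpoint between adjacent distinct values with the highest gain.

    Returns None when all values are identical (no candidate bounds exist).
    """
    sorted_vals = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(sorted_vals, sorted_vals[1:])]
    if not candidates:
        return None
    return max(candidates, key=lambda b: info_gain(values, labels, b))
```

For values `[1.0, 2.0, 3.0, 4.0]` with labels `[0, 0, 1, 1]`, the bound 2.5 separates the classes perfectly and so carries the full one bit of gain.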
ML.DecTree.CrossValidate: handles doing cross validation with decision trees
This is, perhaps, a little misleading.
ML.DecTree.DecTree: Defines the class _DecTreeNode_, used to represent decision trees...
ML.DecTree.Forest: code for dealing with forests (collections) of decision trees
**NOTE** This code should be obsolete now that ML.Composite.Composite is up and running.
ML.EnrichPlot: Command line tool to construct an enrichment plot from saved composite models
Usage: EnrichPlot [optional args] -d dbname -t tablename <models>
Required Arguments:
-d "dbName": the name of the database for screening
-t "tablename": the name of the table with the data to be screened
<models>: file name(s) of pickled composite model(s).
ML.GrowComposite: command line utility for growing composite models
**Usage**
_GrowComposite [optional args] filename_
**Command Line Arguments**
- -n *count*: number of new models to build
- -C *pickle file name*: name of file containing composite upon which to build.
ML.InfoTheory.BitRank: Functionality for ranking bits using info gains
**Definitions used in this module**
- *sequence*: an object capable of containing other objects which supports
__getitem__() and __len__().
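Under this definition, any object implementing the two methods qualifies, not just lists and tuples. The sketch below shows a minimal custom sequence and a function that relies only on those two methods. `BitColumn` and `count_on` are hypothetical names used for illustration and are not part of this module.

```python
class BitColumn:
    """Hypothetical read-only view of one bit position across several bit vectors."""

    def __init__(self, vectors, bit_index):
        self._vectors = vectors
        self._bit_index = bit_index

    def __len__(self):
        return len(self._vectors)

    def __getitem__(self, i):
        return self._vectors[i][self._bit_index]


def count_on(sequence):
    """Count truthy entries using only __len__() and __getitem__()."""
    return sum(1 for i in range(len(sequence)) if sequence[i])
```

The same `count_on` call works on a list, a tuple, or a `BitColumn`, since it never assumes more than the two methods named in the definition.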
ML.Neural.ActFuncs: Activation functions for neural network nodes
Activation functions should implement the following API:
- _Eval(input)_: returns the value of the function at a given point
- _Deriv(input)_: returns the derivative of the function at a given point
The current Backprop implementation also requires:
- _DerivFromVal(val)_: returns the derivative of the function when its
value is val
In all cases _input_ is a float as is the value returned.
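A minimal sketch of a class satisfying this API is shown below, using the standard logistic sigmoid. The class name `Sigmoid` and its lack of tunable parameters are assumptions made for the example; the activation functions shipped with the module may differ.

```python
import math


class Sigmoid:
    """Hypothetical logistic activation: f(x) = 1 / (1 + exp(-x))."""

    def Eval(self, x):
        # Value of the function at x.
        return 1.0 / (1.0 + math.exp(-x))

    def Deriv(self, x):
        # Derivative at x, using f'(x) = f(x) * (1 - f(x)).
        val = self.Eval(x)
        return val * (1.0 - val)

    def DerivFromVal(self, val):
        # Derivative expressed in terms of the function's value, which lets
        # backpropagation reuse an output that has already been computed.
        return val * (1.0 - val)
```

For the sigmoid, _DerivFromVal_ avoids recomputing the exponential when a node's output is already known.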
ML.Neural.CrossValidate: handles doing cross validation with neural nets
This is, perhaps, a little misleading.
ML.Neural.NetNode: Contains the class _NetNode_ which is used to represent nodes in neural nets
**Network Architecture:**
A tacit assumption in all of this stuff is that we're dealing with
feedforward networks.
ML.Neural.Network: Contains the class _Network_ which is used to represent neural nets
**Network Architecture:**
A tacit assumption in all of this stuff is that we're dealing with
feedforward networks.
ML.Neural.Trainers: Training algorithms for feed-forward neural nets
Unless noted otherwise, algorithms and notation are taken from:
"Artificial Neural Networks: Theory and Applications",
Dan W.
ML.ScreenComposite: command line utility for screening composite models
**Usage**
_ScreenComposite [optional args] modelfile(s) datafile_
Unless indicated otherwise (via command line arguments), _modelfile_ is
a file containing a pickled composite model and _datafile_ is a QDAT file.