Package ML :: Package DecTree :: Module CrossValidate

Module CrossValidate

Handles cross validation with decision trees.

The name is, perhaps, a little misleading.  For the purposes of this module,
cross validation means evaluating the accuracy of a tree on a set of test examples.



Functions
 
ChooseOptimalRoot(examples, trainExamples, testExamples, attrs, nPossibleVals, treeBuilder, nQuantBounds=[], **kwargs)
Loops through all possible tree roots and chooses the one that produces the best tree.
 
CrossValidate(tree, testExamples, appendExamples=0)
Determines the classification error for the testExamples.
 
CrossValidationDriver(examples, attrs, nPossibleVals, holdOutFrac=0.3, silent=0, calcTotalError=0, treeBuilder=ID3Boot, lessGreedy=0, startAt=None, nQuantBounds=[], maxDepth=-1, **kwargs)
Driver function for building trees and doing cross validation.
 
TestRun()
testing code...
Variables
  Complex0 = 'F'
  Complex16 = 'F'
  Complex32 = 'F'
  Complex64 = 'D'
  Complex8 = 'F'
  Float0 = 'f'
  Float16 = 'f'
  Float32 = 'f'
  Float64 = 'd'
  Float8 = 'f'
  Int0 = '1'
  Int16 = 's'
  Int32 = 'i'
  Int8 = '1'
  absolute = <ufunc 'absolute'>
  add = <ufunc 'add'>
  arccos = <ufunc 'arccos'>
  arccosh = <ufunc 'arccosh'>
  arcsin = <ufunc 'arcsin'>
  arcsinh = <ufunc 'arcsinh'>
  arctan = <ufunc 'arctan'>
  arctan2 = <ufunc 'arctan2'>
  arctanh = <ufunc 'arctanh'>
  bitwise_and = <ufunc 'bitwise_and'>
  bitwise_or = <ufunc 'bitwise_or'>
  bitwise_xor = <ufunc 'bitwise_xor'>
  ceil = <ufunc 'ceil'>
  conjugate = <ufunc 'conjugate'>
  cos = <ufunc 'cos'>
  cosh = <ufunc 'cosh'>
  divide = <ufunc 'divide'>
  divide_safe = <ufunc 'divide_safe'>
  e = 2.71828182846
  equal = <ufunc 'equal'>
  exp = <ufunc 'exp'>
  fabs = <ufunc 'fabs'>
  floor = <ufunc 'floor'>
  floor_divide = <ufunc 'floor_divide'>
  fmod = <ufunc 'fmod'>
  greater = <ufunc 'greater'>
  greater_equal = <ufunc 'greater_equal'>
  hypot = <ufunc 'hypot'>
  invert = <ufunc 'invert'>
  left_shift = <ufunc 'left_shift'>
  less = <ufunc 'less'>
  less_equal = <ufunc 'less_equal'>
  log = <ufunc 'log'>
  log10 = <ufunc 'log10'>
  logical_and = <ufunc 'logical_and'>
  logical_not = <ufunc 'logical_not'>
  logical_or = <ufunc 'logical_or'>
  logical_xor = <ufunc 'logical_xor'>
  maximum = <ufunc 'maximum'>
  minimum = <ufunc 'minimum'>
  multiply = <ufunc 'multiply'>
  negative = <ufunc 'negative'>
  not_equal = <ufunc 'not_equal'>
  pi = 3.14159265359
  power = <ufunc 'power'>
  remainder = <ufunc 'remainder'>
  right_shift = <ufunc 'right_shift'>
  sin = <ufunc 'sin'>
  sinh = <ufunc 'sinh'>
  sqrt = <ufunc 'sqrt'>
  subtract = <ufunc 'subtract'>
  tan = <ufunc 'tan'>
  tanh = <ufunc 'tanh'>
  true_divide = <ufunc 'true_divide'>
Function Details

ChooseOptimalRoot(examples, trainExamples, testExamples, attrs, nPossibleVals, treeBuilder, nQuantBounds=[], **kwargs)

Loops through all possible tree roots and chooses the one that produces the best tree

**Arguments**

  - examples: the full set of examples

  - trainExamples: the training examples

  - testExamples: the testing examples

  - attrs: a list of attributes to consider in the tree building

  - nPossibleVals: a list of the number of possible values each variable can adopt

  - treeBuilder: the function to be used to actually build the tree

  - nQuantBounds: an optional list.  If present, it's assumed that the builder
    algorithm takes this argument as well (for building QuantTrees)

**Returns**

  The best tree found
  
**Notes**

  1) Trees are built using _trainExamples_

  2) Testing of each tree (to determine which is best) is done using _CrossValidate_ and
     the entire set of data (i.e. all of _examples_)

  3) _testExamples_ is not used at all (trees are built from _trainExamples_ and
     scored on _examples_), which immediately raises the question of why it's
     even being passed in
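
The selection loop described in the notes can be sketched roughly as follows. This is an illustration only, not the module's implementation: `Stump`, `build_stump`, `error_fraction`, `choose_root_sketch`, and the `root` keyword are hypothetical names, and the sketch assumes that each example is a list whose last element is the class label. The _testExamples_ parameter is omitted because notes 1 and 2 only call for the training set and the full set.

```python
from collections import Counter, defaultdict


class Stump:
    """Hypothetical one-level tree that splits on a single attribute."""

    def __init__(self, attr, table, default):
        self.attr, self.table, self.default = attr, table, default

    def ClassifyExample(self, example, appendExamples=0):
        # Look up the prediction for this example's value of the root attribute.
        return self.table.get(example[self.attr], self.default)


def build_stump(trainExamples, attrs, nPossibleVals, root=None):
    """Hypothetical treeBuilder: majority vote per value of the root attribute."""
    votes = defaultdict(Counter)
    for ex in trainExamples:
        votes[ex[root]][ex[-1]] += 1
    table = {val: c.most_common(1)[0][0] for val, c in votes.items()}
    default = Counter(ex[-1] for ex in trainExamples).most_common(1)[0][0]
    return Stump(root, table, default)


def error_fraction(tree, examples):
    """Fraction of examples whose predicted label differs from the last element."""
    wrong = sum(tree.ClassifyExample(ex) != ex[-1] for ex in examples)
    return wrong / len(examples)


def choose_root_sketch(examples, trainExamples, attrs, nPossibleVals, treeBuilder):
    best_tree, best_err = None, float("inf")
    for attr in attrs:
        # Note 1: build on the training examples, forcing this attribute to the root.
        tree = treeBuilder(trainExamples, attrs, nPossibleVals, root=attr)
        # Note 2: score on the entire data set.
        err = error_fraction(tree, examples)
        if err < best_err:
            best_tree, best_err = tree, err
    return best_tree


# Attribute 1 predicts the label (last element) perfectly; attribute 0 does not.
examples = [[0, 0, 0], [0, 1, 1], [1, 0, 0], [1, 1, 1]]
best = choose_root_sketch(examples, examples[:3], [0, 1], [2, 2, 2], build_stump)
```

Because every candidate is scored on all of _examples_, the training data contributes to the score, so the winning root is chosen with some optimistic bias.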

CrossValidate(tree, testExamples, appendExamples=0)

Determines the classification error for the testExamples

**Arguments**

  - tree: a decision tree (or anything supporting a _ClassifyExample()_ method)

  - testExamples: a list of examples to be used for testing

  - appendExamples: a toggle which is passed along to the tree as it does
    the classification. The trees can use this to store the examples they
    classify locally.

**Returns**

  a 2-tuple consisting of:

    1) the percent error of the tree

    2) a list of misclassified examples
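
As a rough illustration of this contract, the sketch below scores any object that exposes a _ClassifyExample()_ method. It is not the module's code: `MajorityStub` and `cross_validate_sketch` are hypothetical names, the true class is assumed to be the last element of each example, and the error is returned as a fraction (the text above says "percent error" without stating the scale).

```python
class MajorityStub:
    """Hypothetical stand-in for a tree: always predicts one label."""

    def __init__(self, label):
        self.label = label

    def ClassifyExample(self, example, appendExamples=0):
        # A real tree would walk its nodes; this stub ignores the example.
        return self.label


def cross_validate_sketch(tree, testExamples, appendExamples=0):
    """Return (error, misclassified examples) for a tree-like object."""
    if not testExamples:
        raise ValueError("testExamples must not be empty")
    bad = []
    for example in testExamples:
        # Assumption: the true class is the last element of each example.
        if tree.ClassifyExample(example, appendExamples) != example[-1]:
            bad.append(example)
    # Assumption: the error is reported as a fraction of the test set.
    return len(bad) / len(testExamples), bad


examples = [[0, 1, 1], [1, 0, 0], [1, 1, 1], [0, 0, 0]]
err, missed = cross_validate_sketch(MajorityStub(1), examples)
```

Here the stub always answers 1, so the two examples labelled 0 come back in the misclassified list.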
    

CrossValidationDriver(examples, attrs, nPossibleVals, holdOutFrac=0.3, silent=0, calcTotalError=0, treeBuilder=ID3Boot, lessGreedy=0, startAt=None, nQuantBounds=[], maxDepth=-1, **kwargs)

Driver function for building trees and doing cross validation

**Arguments**

  - examples: the full set of examples

  - attrs: a list of attributes to consider in the tree building

  - nPossibleVals: a list of the number of possible values each variable can adopt

  - holdOutFrac: the fraction of the data which should be reserved for the hold-out set
     (used to calculate the error)

  - silent: a toggle used to control how much visual noise this makes as it goes.

  - calcTotalError: a toggle used to indicate whether the classification error
    of the tree should be calculated using the entire data set (when true) or just
    the hold-out set (when false)

  - treeBuilder: the function to call to build the tree

  - lessGreedy: toggles use of the less greedy tree growth algorithm (see
    _ChooseOptimalRoot_).

  - startAt: forces the tree to be rooted at this descriptor

  - nQuantBounds: an optional list.  If present, it's assumed that the builder
    algorithm takes this argument as well (for building QuantTrees)

  - maxDepth: an optional integer.  If present, it's assumed that the builder
    algorithm takes this argument as well

**Returns**

   a 2-tuple containing:

     1) the tree

     2) the cross-validation error of the tree
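
The roles of _holdOutFrac_ and _calcTotalError_ can be sketched as below. This is a minimal illustration under stated assumptions, not the driver's actual logic: `split_hold_out`, `evaluation_set`, and the `seed` parameter are hypothetical, and the way the driver shuffles and rounds the hold-out size is not documented here.

```python
import random


def split_hold_out(examples, holdOutFrac=0.3, seed=None):
    """Randomly partition examples into (training, hold-out) lists."""
    rng = random.Random(seed)  # seed is a hypothetical convenience for repeatability
    shuffled = list(examples)
    rng.shuffle(shuffled)
    # Assumption: the hold-out size is the fraction rounded to the nearest integer.
    nHold = round(len(shuffled) * holdOutFrac)
    return shuffled[nHold:], shuffled[:nHold]


def evaluation_set(examples, holdOut, calcTotalError=0):
    """Pick the data used for the error: everything, or only the hold-out set."""
    return examples if calcTotalError else holdOut


examples = [[i % 2, i % 3, i % 2] for i in range(10)]
train, held = split_hold_out(examples, holdOutFrac=0.3, seed=0)
scored_on = evaluation_set(examples, held, calcTotalError=0)
```

With ten examples and the default fraction, seven go to training and three are held out; setting _calcTotalError_ to a true value would score the tree on all ten.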
     

TestRun()

testing code