edu.harvard.seas.iis.abilities.classify
Class FeatureSelection

java.lang.Object
  extended by edu.harvard.seas.iis.abilities.classify.FeatureSelection

public class FeatureSelection
extends java.lang.Object

Tools for running the feature selection process

Author:
kgajos

Constructor Summary
FeatureSelection()
           
 
Method Summary
 ClassifierEvalStats crossvalidateOverUsers(weka.classifiers.Classifier c, DataSet dataSet, java.lang.String[] usersToEvaluateOn, boolean annotateDataSet)
          Performs per-user crossvalidation. Evaluation is performed only on the users listed in usersToEvaluateOn. The dataSet may contain data from additional users; data from those users is used for training but not for evaluation.
 ClassifierEvalStats evaluateFeatureSet(java.lang.String[] features, PositiveAndUnlabeledClassifier c, DataSet dataSet, boolean annotateDataSet)
          Uses crossvalidation to evaluate a given classifier on a given set of features and a given data set.
static void main(java.lang.String[] args)
           
 double overalEvaluationCriterion(ClassifierEvalStats results)
          Computes the score used to determine which feature set is best. What goes into this score depends on your priorities.
 double search(java.util.List<java.lang.String> startingFeatures, java.lang.String[] allowedFeatures, PositiveAndUnlabeledClassifier c, DataSet dataSet, java.lang.String prefix, double bestOEC)
          Given a starting set of features, searches for the best set of features to use with a particular classifier
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FeatureSelection

public FeatureSelection()
Method Detail

overalEvaluationCriterion

public double overalEvaluationCriterion(ClassifierEvalStats results)
Computes the score used to determine which feature set is best. What goes into this score depends on your priorities.

Parameters:
results - the evaluation statistics to score
Returns:
the score for the given results
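
The page does not say which statistics go into the score. As one illustration of how priorities can shape it, the sketch below computes an F-beta score from precision and recall. The Stats class, its fields, and the beta parameter are hypothetical stand-ins, not part of ClassifierEvalStats or of this class.

```java
class EvaluationCriterionSketch {

    /** Hypothetical stand-in for values a ClassifierEvalStats might expose. */
    static final class Stats {
        final double precision;
        final double recall;

        Stats(double precision, double recall) {
            this.precision = precision;
            this.recall = recall;
        }
    }

    /**
     * F-beta score (an assumed criterion, not the one used by FeatureSelection).
     * beta > 1 weights recall more heavily; beta < 1 weights precision more heavily.
     * Returns 0 when precision and recall are both 0.
     */
    static double overallEvaluationCriterion(Stats results, double beta) {
        double b2 = beta * beta;
        double denominator = b2 * results.precision + results.recall;
        if (denominator == 0.0) {
            return 0.0;
        }
        return (1.0 + b2) * results.precision * results.recall / denominator;
    }

    public static void main(String[] args) {
        Stats stats = new Stats(0.5, 1.0);
        System.out.println(overallEvaluationCriterion(stats, 1.0)); // balanced weighting
        System.out.println(overallEvaluationCriterion(stats, 2.0)); // recall-heavy weighting
    }
}
```

Changing beta changes which feature set scores highest, which is the kind of priority trade-off the method description refers to.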

crossvalidateOverUsers

public ClassifierEvalStats crossvalidateOverUsers(weka.classifiers.Classifier c,
                                                  DataSet dataSet,
                                                  java.lang.String[] usersToEvaluateOn,
                                                  boolean annotateDataSet)
                                           throws java.lang.Exception
Performs per-user crossvalidation. Evaluation is performed only on the users listed in usersToEvaluateOn. The dataSet may contain data from additional users; data from those users is used for training but not for evaluation.

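Parameters:
c - the classifier to crossvalidate
dataSet - the data set; may include data from users not listed in usersToEvaluateOn
usersToEvaluateOn - the users on whom evaluation is performed
annotateDataSet -
Returns:
the evaluation statistics from the crossvalidation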
Throws:
java.lang.Exception
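
The splitting rule described above can be sketched as follows. This is a minimal illustration that assumes each evaluated user is held out in turn, and it does not use Weka: the Sample class and the midpoint-threshold "classifier" are hypothetical stand-ins for DataSet and weka.classifiers.Classifier.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PerUserCrossvalidationSketch {

    /** Hypothetical stand-in for one labeled row of a data set. */
    static final class Sample {
        final String user;
        final double feature;
        final boolean label;

        Sample(String user, double feature, boolean label) {
            this.user = user;
            this.feature = feature;
            this.label = label;
        }
    }

    /** Toy training step: threshold halfway between the class means. */
    static double trainThreshold(List<Sample> training) {
        double posSum = 0, negSum = 0;
        int pos = 0, neg = 0;
        for (Sample s : training) {
            if (s.label) { posSum += s.feature; pos++; }
            else { negSum += s.feature; neg++; }
        }
        double posMean = pos == 0 ? 0 : posSum / pos;
        double negMean = neg == 0 ? 0 : negSum / neg;
        return (posMean + negMean) / 2.0;
    }

    /** Holds out each listed user in turn; all other users' data is used for training. */
    static Map<String, Double> crossvalidateOverUsers(List<Sample> data, String[] usersToEvaluateOn) {
        Map<String, Double> accuracyByUser = new LinkedHashMap<>();
        for (String heldOut : usersToEvaluateOn) {
            List<Sample> train = new ArrayList<>();
            List<Sample> test = new ArrayList<>();
            for (Sample s : data) {
                if (s.user.equals(heldOut)) test.add(s); else train.add(s);
            }
            if (test.isEmpty()) continue; // no data for this user, nothing to evaluate
            double threshold = trainThreshold(train);
            int correct = 0;
            for (Sample s : test) {
                boolean predicted = s.feature > threshold;
                if (predicted == s.label) correct++;
            }
            accuracyByUser.put(heldOut, (double) correct / test.size());
        }
        return accuracyByUser;
    }

    public static void main(String[] args) {
        List<Sample> data = List.of(
            new Sample("a", 1.0, false), new Sample("a", 5.0, true),
            new Sample("b", 1.5, false), new Sample("b", 4.5, true),
            new Sample("c", 2.0, false), new Sample("c", 6.0, true));
        // User "c" contributes training data only.
        System.out.println(crossvalidateOverUsers(data, new String[] {"a", "b"}));
    }
}
```

In the usage above, user "c" never appears in the results but is part of every training fold, matching the behavior described for users not listed in usersToEvaluateOn.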

evaluateFeatureSet

public ClassifierEvalStats evaluateFeatureSet(java.lang.String[] features,
                                              PositiveAndUnlabeledClassifier c,
                                              DataSet dataSet,
                                              boolean annotateDataSet)
                                       throws java.lang.Exception
Uses crossvalidation to evaluate a given classifier on a given set of features and a given data set.

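Parameters:
features - the set of features to evaluate
c - the classifier to evaluate
dataSet - the data set to evaluate on
annotateDataSet -
Returns:
the evaluation statistics from the crossvalidation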
Throws:
java.lang.Exception
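
Evaluating a particular feature set presumably involves restricting the data to those features before crossvalidating. The sketch below shows only that restriction step on a plain array; the project method and the array-based layout are assumptions for illustration and say nothing about how DataSet stores or selects features.

```java
import java.util.Arrays;
import java.util.List;

class FeatureSubsetSketch {

    /**
     * Keeps only the columns named in features, in the order requested.
     * Throws IllegalArgumentException if a requested feature is not a known column.
     */
    static double[][] project(String[] columnNames, double[][] rows, String[] features) {
        List<String> names = Arrays.asList(columnNames);
        int[] indices = new int[features.length];
        for (int i = 0; i < features.length; i++) {
            indices[i] = names.indexOf(features[i]);
            if (indices[i] < 0) {
                throw new IllegalArgumentException("Unknown feature: " + features[i]);
            }
        }
        double[][] projected = new double[rows.length][features.length];
        for (int r = 0; r < rows.length; r++) {
            for (int i = 0; i < indices.length; i++) {
                projected[r][i] = rows[r][indices[i]];
            }
        }
        return projected;
    }

    public static void main(String[] args) {
        String[] columns = {"speed", "errors", "noise"};
        double[][] rows = {{1, 2, 3}, {4, 5, 6}};
        double[][] subset = project(columns, rows, new String[] {"noise", "speed"});
        System.out.println(Arrays.deepToString(subset));
    }
}
```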

search

public double search(java.util.List<java.lang.String> startingFeatures,
                     java.lang.String[] allowedFeatures,
                     PositiveAndUnlabeledClassifier c,
                     DataSet dataSet,
                     java.lang.String prefix,
                     double bestOEC)
              throws java.lang.Exception
Given a starting set of features, searches for the best set of features to use with a particular classifier

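Parameters:
startingFeatures - the starting set of features
allowedFeatures - the features the search is allowed to use
c - the classifier the features are selected for
dataSet -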
Throws:
java.lang.Exception
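
The page does not state which search strategy is used. One common choice is greedy forward selection, sketched below: in each round, the single allowed feature that most improves the score is added, and the search stops when no addition helps. The evaluate function and toyScore are hypothetical stand-ins for crossvalidating a classifier and scoring the result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

class GreedyFeatureSearchSketch {

    /** Greedy forward selection (an assumed strategy, not necessarily the one used by search). */
    static List<String> search(List<String> startingFeatures,
                               String[] allowedFeatures,
                               ToDoubleFunction<List<String>> evaluate) {
        List<String> best = new ArrayList<>(startingFeatures);
        double bestScore = evaluate.applyAsDouble(best);
        boolean improved = true;
        while (improved) {
            improved = false;
            String bestCandidate = null;
            for (String feature : allowedFeatures) {
                if (best.contains(feature)) continue;
                // Try the current best set plus one extra feature.
                List<String> trial = new ArrayList<>(best);
                trial.add(feature);
                double score = evaluate.applyAsDouble(trial);
                if (score > bestScore) {
                    bestScore = score;
                    bestCandidate = feature;
                    improved = true;
                }
            }
            if (bestCandidate != null) {
                best.add(bestCandidate); // commit the best addition of this round
            }
        }
        return best;
    }

    /** Toy score: "speed" and "errors" help, anything else hurts. */
    static double toyScore(List<String> features) {
        double score = 0;
        for (String f : features) {
            if (f.equals("speed")) score += 3;
            else if (f.equals("errors")) score += 2;
            else score -= 1;
        }
        return score;
    }

    public static void main(String[] args) {
        List<String> selected = search(new ArrayList<>(),
                new String[] {"noise", "speed", "errors"},
                GreedyFeatureSearchSketch::toyScore);
        System.out.println(selected);
    }
}
```

Greedy selection evaluates far fewer feature sets than an exhaustive search, at the cost of possibly missing combinations that only help together.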

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Parameters:
args -
Throws:
java.lang.Exception