edu.harvard.seas.iis.abilities.classify
Class ClassifierEvaluator

java.lang.Object
  extended by edu.harvard.seas.iis.abilities.classify.ClassifierEvaluator

public class ClassifierEvaluator
extends Object


Constructor Summary
ClassifierEvaluator()
           
 
Method Summary
 DataSet annotateDataSet(DataSet dataSet, weka.classifiers.Classifier c, InstanceFilter filter)
          Fills in the "Prediction probability" and "Predicted class" values in the dataSet using classifier c.
static double[] compareFittsLawModels(double[] baseline, double[] comparison)
           
static double[] computeFittsLawCoefficients(DataSet dataSet)
           
 String crossvalidateOverUsers(weka.classifiers.Classifier c, DataSet dataSet, String[] users, boolean generateAnnotatedDataSet)
           
 double[] evaluate(weka.classifiers.Classifier c, DataSet testData)
          Evaluates a trained classifier c on the given test data set
 String evaluate(String[] features, File dataFile, boolean generateAnnotatedDataSet)
          Creates a positive-and-unlabeled classifier based on the globally set baseClassifier, evaluates it using cross-validation, and uses only the specified features for classification
 void evaluateDataRequirements(String[] users, String[] features, File cleanDataDirectory)
          A draft of a method for testing how the classifier's accuracy changes depending on the amount of training data available
 DataSet getDeliberateInstances(weka.classifiers.Classifier c, DataSet d)
          Returns a copy of the data set that only contains instances positively classified by c
 void getPerUserStDevs(File dataFile, weka.classifiers.Classifier baseClassifier, String featureToComputeStdevsFor, String[] featureList, String[] users)
          For a given feature, computes the per-user standard deviations on the experimental, natural, and filtered natural data sets
static void main(String[] args)
           
 double[] runStatisticalTests(DataSet baseLine, DataSet testSet, String[] users, String attributeForTesting, int minNumberRequiredForTesting)
          Runs pairwise statistical tests to look for statistically significant differences across users on several metrics.
 void runStatisticalTests(File dataFile, weka.classifiers.Classifier baseClassifier, String[] featureList, String[] users)
           
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ClassifierEvaluator

public ClassifierEvaluator()

Method Detail

computeFittsLawCoefficients

public static double[] computeFittsLawCoefficients(DataSet dataSet)
Parameters:
dataSet -
Returns:
the fitted Fitts' law coefficients, as [intercept, slope]
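
The source does not say how the coefficients are obtained. A minimal sketch, assuming an ordinary least-squares fit of movement time against index of difficulty (MT = intercept + slope * ID), is shown below. The class FittsLawFitSketch, its method names, and the use of the Shannon formulation for ID are illustrative assumptions, not part of ClassifierEvaluator.

```java
/** Hypothetical stand-alone sketch; not part of ClassifierEvaluator. */
final class FittsLawFitSketch {

    /** Index of difficulty, Shannon formulation (an assumption): log2(D / W + 1). */
    static double indexOfDifficulty(double distance, double width) {
        return Math.log(distance / width + 1.0) / Math.log(2.0);
    }

    /** Ordinary least-squares fit of MT = intercept + slope * ID. */
    static double[] fit(double[] ids, double[] movementTimes) {
        if (ids.length != movementTimes.length || ids.length < 2) {
            throw new IllegalArgumentException("need at least two paired observations");
        }
        int n = ids.length;
        double sumX = 0.0;
        double sumY = 0.0;
        for (int i = 0; i < n; i++) {
            sumX += ids[i];
            sumY += movementTimes[i];
        }
        double meanX = sumX / n;
        double meanY = sumY / n;

        double sxy = 0.0; // sum of cross-deviations
        double sxx = 0.0; // sum of squared deviations of ID
        for (int i = 0; i < n; i++) {
            double dx = ids[i] - meanX;
            sxy += dx * (movementTimes[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("all ID values are identical");
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[] {intercept, slope}; // same ordering as the documented return value
    }
}
```

The result uses the same [intercept, slope] ordering as the documented return value, so it could be passed to a comparison routine that expects that layout.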

compareFittsLawModels

public static double[] compareFittsLawModels(double[] baseline,
                                             double[] comparison)
Parameters:
baseline - parameters of the baseline model, as [intercept, slope]
comparison - parameters of the comparison model, as [intercept, slope]
Returns:
mean fraction by which the predictions of the comparison model differ from those of the baseline model
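
One way to read "mean fraction" is the average relative difference between the two models' predicted movement times over a set of index-of-difficulty values. The sketch below assumes that reading. The class FittsModelComparisonSketch and the ids argument are hypothetical, and the sketch returns a single double because the layout of the real double[] result is not documented.

```java
/** Hypothetical stand-alone sketch; not part of ClassifierEvaluator. */
final class FittsModelComparisonSketch {

    /** Predicted movement time for a model given as [intercept, slope]. */
    static double predict(double[] model, double id) {
        return model[0] + model[1] * id;
    }

    /**
     * Mean of (comparison - baseline) / baseline over the given ID values.
     * The choice of ID values is an assumption; the source does not state it.
     */
    static double meanFractionalDifference(double[] baseline, double[] comparison, double[] ids) {
        if (ids.length == 0) {
            throw new IllegalArgumentException("need at least one ID value");
        }
        double sum = 0.0;
        for (double id : ids) {
            double base = predict(baseline, id);
            sum += (predict(comparison, id) - base) / base;
        }
        return sum / ids.length;
    }
}
```

A positive value means the comparison model predicts longer movement times than the baseline on average, and a negative value means shorter ones.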

getDeliberateInstances

public DataSet getDeliberateInstances(weka.classifiers.Classifier c,
                                      DataSet d)
                               throws Exception
Returns a copy of the data set that only contains instances positively classified by c

Parameters:
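c - the classifier used to select instances
d - the data set to filter
Returns:
a copy of d that contains only the instances positively classified by c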
Throws:
Exception
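
The filtering idea can be sketched without Weka or the DataSet class, neither of which is shown in the source. In the sketch below, a java.util.function.Predicate stands in for "positively classified by c", and DeliberateInstancesSketch is a hypothetical name. The input list is left untouched, matching the documented "copy" behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-alone sketch; not part of ClassifierEvaluator. */
final class DeliberateInstancesSketch {

    /** Returns a new list holding only the instances the predicate accepts. */
    static <T> List<T> keepPositives(List<T> instances, Predicate<T> isPositive) {
        List<T> copy = new ArrayList<>();
        for (T instance : instances) {
            if (isPositive.test(instance)) {
                copy.add(instance); // the original list is never modified
            }
        }
        return copy;
    }
}
```

For example, with instances represented by their predicted positive-class probability, keepPositives(probabilities, p -> p >= 0.5) keeps the entries at or above an assumed 0.5 threshold.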

evaluate

public double[] evaluate(weka.classifiers.Classifier c,
                         DataSet testData)
                  throws Exception
Evaluates a trained classifier c on the given test data set

Parameters:
c - the trained classifier to evaluate
testData - the test data set
Returns:
Throws:
Exception

annotateDataSet

public DataSet annotateDataSet(DataSet dataSet,
                               weka.classifiers.Classifier c,
                               InstanceFilter filter)
                        throws Exception
Fills in the "Prediction probability" and "Predicted class" values in the dataSet using classifier c. If filter is not null, only instances that pass the filter are annotated; if filter is null, all instances are annotated.

Parameters:
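dataSet - the data set to annotate
c - the classifier used to produce the predictions
filter - selects which instances to annotate; may be null, in which case all instances are annotated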
Returns:
Throws:
Exception
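
The null-filter rule described above can be sketched as follows. The Row type, the "positive"/"negative" labels, and the 0.5 threshold are assumptions made for illustration; the real DataSet and InstanceFilter types are not shown in the source.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.function.ToDoubleFunction;

/** Hypothetical stand-alone sketch; not part of ClassifierEvaluator. */
final class AnnotationSketch {

    /** Hypothetical row type with the two prediction fields left empty until annotated. */
    static final class Row {
        final double feature;
        Double predictionProbability; // null until annotated
        String predictedClass;        // null until annotated

        Row(double feature) {
            this.feature = feature;
        }
    }

    /**
     * Annotates rows in place. A null filter means every row is annotated;
     * otherwise only rows accepted by the filter are touched.
     */
    static void annotate(List<Row> rows,
                         ToDoubleFunction<Row> positiveProbability,
                         Predicate<Row> filter) {
        for (Row row : rows) {
            if (filter != null && !filter.test(row)) {
                continue; // row does not pass the filter, leave it unannotated
            }
            double p = positiveProbability.applyAsDouble(row);
            row.predictionProbability = p;
            row.predictedClass = p >= 0.5 ? "positive" : "negative"; // assumed threshold
        }
    }
}
```

Rows skipped by the filter keep null prediction fields, which makes it easy to tell annotated rows from unannotated ones afterwards.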

crossvalidateOverUsers

public String crossvalidateOverUsers(weka.classifiers.Classifier c,
                                     DataSet dataSet,
                                     String[] users,
                                     boolean generateAnnotatedDataSet)
                              throws Exception
Throws:
Exception
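
This method has no description in the source. Its name suggests cross-validation in which each fold holds out one user's data, so the sketch below shows only that splitting step, under that assumption. UserFoldsSketch and its inputs (one user label per instance) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone sketch; not part of ClassifierEvaluator. */
final class UserFoldsSketch {

    /**
     * For each requested user, collects the indices of that user's instances.
     * In a leave-one-user-out scheme these indices form the test fold and all
     * remaining indices form the training set.
     */
    static Map<String, List<Integer>> heldOutIndices(String[] instanceUsers, String[] users) {
        Map<String, List<Integer>> folds = new LinkedHashMap<>();
        for (String user : users) {
            folds.put(user, new ArrayList<>());
        }
        for (int i = 0; i < instanceUsers.length; i++) {
            List<Integer> fold = folds.get(instanceUsers[i]);
            if (fold != null) { // instances of users not requested are ignored
                fold.add(i);
            }
        }
        return folds;
    }
}
```

Splitting by user rather than by instance keeps one person's data from appearing in both the training and the test fold.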

evaluate

public String evaluate(String[] features,
                       File dataFile,
                       boolean generateAnnotatedDataSet)
                throws Exception
Creates a positive-and-unlabeled classifier based on the globally set baseClassifier, evaluates it using cross-validation, and uses only the specified features for classification

Parameters:
features -
dataFile - file containing the data to use for training and validation in the cross-validation procedure
generateAnnotatedDataSet - if set to true, this method will produce a copy of the data set read from dataFile that has the classifier prediction fields filled in for each instance; the annotated data set is stored in the global variable annotatedDataSet
Returns:
Throws:
Exception

evaluateDataRequirements

public void evaluateDataRequirements(String[] users,
                                     String[] features,
                                     File cleanDataDirectory)
                              throws Exception
A draft of a method for testing how the classifier's accuracy changes depending on the amount of training data available

Parameters:
users -
features -
cleanDataDirectory -
Throws:
Exception

runStatisticalTests

public double[] runStatisticalTests(DataSet baseLine,
                                    DataSet testSet,
                                    String[] users,
                                    String attributeForTesting,
                                    int minNumberRequiredForTesting)
                             throws IllegalArgumentException,
                                    org.apache.commons.math.MathException
Runs pairwise statistical tests to look for statistically significant differences across users on several metrics.

Parameters:
baseLine -
testSet -
users -
attributeForTesting -
minNumberRequiredForTesting -
Returns:
Throws:
IllegalArgumentException
org.apache.commons.math.MathException

runStatisticalTests

public void runStatisticalTests(File dataFile,
                                weka.classifiers.Classifier baseClassifier,
                                String[] featureList,
                                String[] users)
                         throws Exception
Throws:
Exception

getPerUserStDevs

public void getPerUserStDevs(File dataFile,
                             weka.classifiers.Classifier baseClassifier,
                             String featureToComputeStdevsFor,
                             String[] featureList,
                             String[] users)
                      throws Exception
For a given feature, computes the per-user standard deviations on the experimental, natural, and filtered natural data sets

Parameters:
dataFile -
baseClassifier -
featureToComputeStdevsFor -
featureList -
users -
Throws:
Exception
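
The per-user computation can be sketched on plain arrays. The sketch below groups feature values by user and computes the sample standard deviation (n - 1 denominator) for each group. PerUserStDevSketch is a hypothetical name, and whether the real method uses the sample or the population formula is not stated in the source.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone sketch; not part of ClassifierEvaluator. */
final class PerUserStDevSketch {

    /** users[i] is the user who produced values[i]. */
    static Map<String, Double> perUserStDev(String[] users, double[] values) {
        if (users.length != values.length) {
            throw new IllegalArgumentException("users and values must have the same length");
        }
        Map<String, List<Double>> byUser = new LinkedHashMap<>();
        for (int i = 0; i < users.length; i++) {
            byUser.computeIfAbsent(users[i], k -> new ArrayList<>()).add(values[i]);
        }

        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Double>> entry : byUser.entrySet()) {
            List<Double> v = entry.getValue();
            if (v.size() < 2) {
                continue; // sample standard deviation is undefined for one value
            }
            double mean = 0.0;
            for (double x : v) {
                mean += x;
            }
            mean /= v.size();
            double sumSquares = 0.0;
            for (double x : v) {
                sumSquares += (x - mean) * (x - mean);
            }
            result.put(entry.getKey(), Math.sqrt(sumSquares / (v.size() - 1)));
        }
        return result;
    }
}
```

Running the same routine on each of the three data sets mentioned above would give one standard deviation per user per data set, which can then be compared side by side.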

main

public static void main(String[] args)
                 throws Exception
Parameters:
args -
Throws:
Exception