edu.harvard.seas.iis.abilities.classify
Class Clean

java.lang.Object
  extended by edu.harvard.seas.iis.abilities.classify.Clean

public class Clean
extends java.lang.Object

Tools for cleaning the data.

Author:
kgajos

Field Summary
static int CLEAN1
           
static int CLEAN2
           
 
Constructor Summary
Clean()
           
 
Method Summary
static void clean(java.io.File inputDirectory, java.io.File outputDirectory, int cleanLevel)
          A convenience method for cleaning an entire directory of data files
static DataSet cleanOnRawValues(DataSet fullDdataSet)
          Method for removing obviously problematic instances (very low IDs, missed clicks, clicks on unknown targets)
static DataSet cleanOnTransformedData(DataSet fullDdataSet)
          Method for removing outliers from the data.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CLEAN1

public static final int CLEAN1
See Also:
Constant Field Values

CLEAN2

public static final int CLEAN2
See Also:
Constant Field Values
Constructor Detail

Clean

public Clean()
Method Detail

cleanOnRawValues

public static DataSet cleanOnRawValues(DataSet fullDdataSet)
                                throws java.lang.Exception
Method for removing obviously problematic instances (very low IDs, missed clicks, clicks on unknown targets).

Parameters:
fullDdataSet - the full data set to clean
Returns:
the original data set with bad instances removed
Throws:
java.lang.Exception
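
The kind of raw-value filtering described above can be sketched as follows. This is only an illustration, not the library's implementation: the `Movement` class, its fields, and the `minId` threshold are hypothetical stand-ins for `DataSet`, and the sketch assumes that "ID" refers to a Fitts' law index of difficulty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RawValueFilterSketch {

    /** Hypothetical stand-in for one recorded movement; not the library's DataSet. */
    static final class Movement {
        final double indexOfDifficulty; // assumed meaning of "ID"
        final boolean missedClick;      // click landed outside the intended target
        final boolean targetKnown;      // false when the clicked target is unknown

        Movement(double indexOfDifficulty, boolean missedClick, boolean targetKnown) {
            this.indexOfDifficulty = indexOfDifficulty;
            this.missedClick = missedClick;
            this.targetKnown = targetKnown;
        }
    }

    /** Returns a new list without the problematic movements; the input is not modified. */
    static List<Movement> removeProblematic(List<Movement> all, double minId) {
        List<Movement> kept = new ArrayList<>();
        for (Movement m : all) {
            boolean problematic =
                    m.indexOfDifficulty < minId || m.missedClick || !m.targetKnown;
            if (!problematic) {
                kept.add(m);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Movement> data = Arrays.asList(
                new Movement(3.2, false, true),   // kept
                new Movement(0.4, false, true),   // dropped: ID below the assumed threshold
                new Movement(2.8, true, true),    // dropped: missed click
                new Movement(4.1, false, false)); // dropped: unknown target
        List<Movement> kept = removeProblematic(data, 1.0);
        System.out.println(kept.size() + " of " + data.size() + " movements kept");
    }
}
```

The threshold of 1.0 is arbitrary; the documentation does not state what counts as a "very low" ID.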

cleanOnTransformedData

public static DataSet cleanOnTransformedData(DataSet fullDdataSet)
                                      throws java.lang.Exception
Method for removing outliers from the data. Currently, explicit movements are discarded when MT/ID is more than 2 standard deviations away from the mean for that particular person.

Parameters:
fullDdataSet - the full data set to clean
Returns:
the original data set with bad instances removed
Throws:
java.lang.Exception
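
The per-person outlier rule described above (drop movements whose MT/ID lies more than 2 standard deviations from that person's mean) might look roughly like the sketch below. The `Movement` class and its fields are hypothetical, not part of the library, and the use of the population standard deviation is an assumption, since the documentation does not say which estimator is used.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OutlierFilterSketch {

    /** Hypothetical stand-in for one movement; not the library's DataSet. */
    static final class Movement {
        final String personId;
        final double movementTime;      // MT
        final double indexOfDifficulty; // ID (assumed to be non-zero)

        Movement(String personId, double movementTime, double indexOfDifficulty) {
            this.personId = personId;
            this.movementTime = movementTime;
            this.indexOfDifficulty = indexOfDifficulty;
        }

        double ratio() {
            return movementTime / indexOfDifficulty;
        }
    }

    /** Keeps movements whose MT/ID is within maxStdevs of the per-person mean. */
    static List<Movement> removeOutliers(List<Movement> all, double maxStdevs) {
        // Group the movements by person.
        Map<String, List<Movement>> byPerson = new HashMap<>();
        for (Movement m : all) {
            byPerson.computeIfAbsent(m.personId, k -> new ArrayList<>()).add(m);
        }

        // Compute {mean, stdev} of MT/ID for each person.
        Map<String, double[]> stats = new HashMap<>();
        for (Map.Entry<String, List<Movement>> e : byPerson.entrySet()) {
            List<Movement> ms = e.getValue();
            double sum = 0.0;
            for (Movement m : ms) {
                sum += m.ratio();
            }
            double mean = sum / ms.size();
            double squares = 0.0;
            for (Movement m : ms) {
                double d = m.ratio() - mean;
                squares += d * d;
            }
            // Population standard deviation (an assumption of this sketch).
            double stdev = Math.sqrt(squares / ms.size());
            stats.put(e.getKey(), new double[] {mean, stdev});
        }

        // Filter while preserving the original order.
        List<Movement> kept = new ArrayList<>();
        for (Movement m : all) {
            double[] s = stats.get(m.personId);
            if (Math.abs(m.ratio() - s[0]) <= maxStdevs * s[1]) {
                kept.add(m);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Movement> data = new ArrayList<>();
        for (int i = 0; i < 9; i++) {
            data.add(new Movement("p1", 2.0, 2.0)); // MT/ID = 1.0
        }
        data.add(new Movement("p1", 20.0, 2.0));    // MT/ID = 10.0, far from the rest
        List<Movement> kept = removeOutliers(data, 2.0);
        System.out.println(kept.size() + " of " + data.size() + " movements kept");
    }
}
```

Computing the statistics per person rather than over the whole data set means that a consistently slow participant is not discarded just for being slower than the others.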

clean

public static void clean(java.io.File inputDirectory,
                         java.io.File outputDirectory,
                         int cleanLevel)
                  throws java.lang.Exception
A convenience method for cleaning an entire directory of data files

Parameters:
inputDirectory - a directory containing parsed movement data from one or more users
outputDirectory - a directory where the cleaned data should be written
cleanLevel - the level of cleaning to apply (presumably CLEAN1 or CLEAN2)
Throws:
java.lang.Exception
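
As a rough illustration of what a directory-level convenience method like this involves, the sketch below reads every regular file in an input directory, applies a per-file cleaning step, and writes the result under the same file name in an output directory. It is not the library's code: `cleanDirectory` and the line-based `cleaner` function are hypothetical, and the actual data file format and the effect of `cleanLevel` are not documented here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

class DirectoryCleanSketch {

    /**
     * Applies the given cleaner to every regular file in inputDir and writes the
     * result to a file of the same name in outputDir. Returns the number of files written.
     */
    static int cleanDirectory(Path inputDir, Path outputDir,
                              UnaryOperator<List<String>> cleaner) throws IOException {
        Files.createDirectories(outputDir);
        int written = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(inputDir)) {
            for (Path in : files) {
                if (!Files.isRegularFile(in)) {
                    continue; // skip subdirectories and other non-file entries
                }
                List<String> cleaned = cleaner.apply(Files.readAllLines(in));
                Files.write(outputDir.resolve(in.getFileName()), cleaned);
                written++;
            }
        }
        return written;
    }

    /** Placeholder cleaning step for the demo: drops blank lines. */
    static List<String> dropBlankLines(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: DirectoryCleanSketch <inputDir> <outputDir>");
            return;
        }
        int n = cleanDirectory(Paths.get(args[0]), Paths.get(args[1]),
                DirectoryCleanSketch::dropBlankLines);
        System.out.println("Cleaned " + n + " file(s)");
    }
}
```

Passing the cleaning step in as a function keeps the directory traversal separate from the cleaning logic, so the same loop could serve different cleaning levels.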