====== Procedures for Running a Support Vector Machine on fMRI Data ======

===== Support Vector Machines Theory =====

==== Outside sources ====

  * A good tutorial on Support Vector Machine theory:\\ [[http://www.umiacs.umd.edu/~joseph/support-vector-machines4.pdf|Burges CJC (1998) A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 2, 121-167.]]
  * The first paper to apply SVM to fMRI data:\\ [[http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6WNP-48J44GM-D&_user=489256&_coverDate=06%2F30%2F2003&_fmt=full&_orig=search&_cdi=6968&view=c&_acct=C000022721&_version=1&_urlVersion=0&_userid=489256&md5=a1311ef2d141e0e726680ff965b910e7&ref=full|Cox DD, Savoy RL. (2003) Functional magnetic resonance imaging (fMRI) "brain reading": detecting and classifying distributed patterns of fMRI activity in human visual cortex. NeuroImage 19, 2, 261-270.]]
  * The first paper to use whole-brain data, with a comprehensive analysis of the effects of preprocessing on SVM analysis:\\ [[http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6WNP-4FSNXRC-2&_user=489256&_coverDate=06%2F30%2F2005&_fmt=full&_orig=search&_cdi=6968&view=c&_acct=C000022721&_version=1&_urlVersion=0&_userid=489256&md5=d4a2379710d42a35b26a66a9d267ad3c&ref=full|LaConte S, Strother S, Cherkassky V, Anderson J, Hu X. (2005) Support vector machines for temporal classification of block design fMRI. NeuroImage 26, 2, 317-329.]]
  * A good review of SVM analysis of fMRI data:\\ [[http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6VH9-4KKNNHN-8&_user=489256&_coverDate=09%2F30%2F2006&_alid=604369150&_rdoc=1&_fmt=full&_orig=search&_cdi=6061&_sort=d&_docanchor=&view=c&_ct=2&_acct=C000022721&_version=1&_urlVersion=0&_userid=489256&md5=42ffdb6866ba25732b94498a24b86205|Norman KA, Polyn SM, Detre GJ, Haxby JV. (2006) Beyond mind-reading: multi-voxel pattern analysis of fMRI data. Trends in Cognitive Sciences 10(9): 424-430.]]
  * A good website with links to information about SVMs, including software packages:\\ [[http://www.support-vector-machines.org/|Support Vector Machines.org]]
  * The website for the software package we used:\\ [[http://svmlight.joachims.org/|SVMlight, by Thorsten Joachims]]

==== Summary ====

=== Black Box View ===

The purpose of a Support Vector Machine (SVM) is to predict which class a test set of data belongs to, based on the characteristics of the training data it saw previously. This process is shown in the figure below:
{{:public:svm:fig1.png}}
From the training data, the SVM creates a high-dimensional space in which each voxel is a dimension. The SVM then calculates a hyperplane that separates the data into distinct classes. When it is given the test data, it determines which side of the hyperplane each test vector lies on.

=== Hands in the Cookie Jar View ===

In this case, we use SVM to make binary decisions between classes, because accuracy was quite unfavorable when we attempted to use various multi-class SVM software packages. Many of these packages simply reduce multi-class data to a one-vs-rest algorithm for each class. Instead, we choose to be more specific and perform our analysis through pairwise decisions between each pair of classes. This method requires a large amount of processing power and hard-drive space while the analysis is running; after the results are obtained, temporary files are deleted to conserve space.

In the ideal case, where all the data lies neatly on one side or the other of a hyperplane, the following Lagrangian is used to compute the equation of the hyperplane. This formula applies only to the linear hyperplane case, but it can easily be extended to higher-degree decision surfaces. The following figures are from (3).

{{ :public:svm:fig2.png }}

This yields the following picture:

{{ :public:svm:fig4.png }}

To find the hyperplane, one optimizes the Lagrangian with respect to **w** (maximizing the margin) subject to the Karush-Kuhn-Tucker (KKT) conditions listed below. For a higher-order decision surface, these equations are expanded using similar partial derivatives.

{{ :public:svm:fig3.png }}

Not all data fits neatly into the high-dimensional space, however. As a result, error terms must be incorporated into the Lagrangian.
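The side-of-the-hyperplane decision and the pairwise (one-vs-one) scheme described above can be sketched in a few lines of Python. This is an illustration with hand-set toy weights, not our MATLAB pipeline or SVMlight itself; the class names and the tiny two-"voxel" patterns are invented for the example.

```python
from itertools import combinations

def side_of_hyperplane(w, b, x):
    """Return +1 or -1 depending on which side of w . x + b = 0 the point x lies."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

def pairwise_predict(models, x):
    """Pairwise (one-vs-one) decision: each binary classifier votes, majority wins.

    models maps (class_a, class_b) -> (w, b), one linear model per class pair.
    """
    votes = {}
    for (cls_a, cls_b), (w, b) in models.items():
        winner = cls_a if side_of_hyperplane(w, b, x) >= 0 else cls_b
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)

# Hypothetical hand-set models for three classes (not real trained weights)
models = {
    ("faces", "houses"): ([1.0, -1.0], 0.0),
    ("faces", "tools"):  ([1.0, 0.5], -0.5),
    ("houses", "tools"): ([-1.0, 1.0], 0.0),
}
classes = ["faces", "houses", "tools"]
assert set(models) == set(combinations(classes, 2))  # one model per pair

x = [2.0, 0.5]  # a test pattern with two "voxels"
print(pairwise_predict(models, x))  # prints "faces" (2 votes vs 1)
```

With k classes this scheme trains k(k-1)/2 binary classifiers rather than k one-vs-rest classifiers, which is why the real analysis consumes so much processing time and disk space.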
This results in a Lagrangian that looks like this:

{{ :public:svm:fig6.png }}

These variables are defined in this more realistic figure:

{{ :public:svm:fig5.png }}

Because the Lagrangian has changed, the KKT conditions must also be modified. They now look like this:

{{ :public:svm:fig7.png }}

It is impractical for a human researcher to compute all of these partial derivatives for all of the training and test data, which is why software packages exist to do the necessary computations.

===== How to Run MATLAB Scripts and Necessities for Using These Scripts =====

There is now a completely automated MATLAB script for running SVM with Voxbo. All variables are defined in a prep file and can be changed easily for any purpose.

[[https://cfn.upenn.edu/aguirre/code/matlablib/svm/svmprep.m|svmprep]]\\ [[https://cfn.upenn.edu/aguirre/code/matlablib/svm/svmrun.m|svmrun]]

Our SVM script relies heavily on [[https://cfn.upenn.edu/aguirre/code/matlablib/|our various utility functions]], which in turn require [[http://www.fil.ion.ucl.ac.uk/spm/|SPM]] and [[http://svmlight.joachims.org/|SVMlight]]. We use a **modified** version of SVMlight that performs a balanced "hold two out" (h2o) accuracy computation instead of the default "hold one out". To build it, you will need [[https://cfn.upenn.edu/aguirre/code/matlablib/svm/svml/|this svml code]].

==== Data Prep ====

You will need a CUB for each stimulus exemplar. For example, if you have 5 subjects, 16 stimuli, and 5 exemplars per stimulus, you will have a total of 400 CUBs; read [[private:creating_direct_effect_and_training_cubs]] for details. The **svmprep** script defines the subjects, ROIs, and pathnames for your experiment, and calls **svmrun**. You should not need to modify the **svmrun** m-file.

==== SVMRUN ====

What does SVMRUN do?
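To give a feel for what those software packages optimize, here is a toy sketch (not what SVMlight actually does internally, and far less efficient) of the soft-margin idea: the slack terms allow some points to violate the margin, and a penalty parameter C trades margin width against those violations. The sketch minimizes (1/2)‖**w**‖² + C·Σᵢ max(0, 1 − yᵢ(**w**·xᵢ + b)) by simple sub-gradient descent; all data and parameter values here are invented for illustration.

```python
import random

def train_soft_margin(xs, ys, C=1.0, lr=0.01, epochs=200, seed=0):
    """Toy sub-gradient descent on the soft-margin SVM objective."""
    rng = random.Random(seed)
    dim = len(xs[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        order = list(range(len(xs)))
        rng.shuffle(order)
        for i in order:
            margin = ys[i] * (sum(wj * xj for wj, xj in zip(w, xs[i])) + b)
            for j in range(dim):
                grad = w[j] / len(xs)        # regularizer pulls w toward 0
                if margin < 1:               # hinge term is active: point
                    grad -= C * ys[i] * xs[i][j]  # violates the margin
                w[j] -= lr * grad
            if margin < 1:
                b += lr * C * ys[i]
    return w, b

# Two linearly separable toy classes in 2-D (labels +1 / -1)
xs = [[1.0, 1.0], [1.5, 0.8], [2.0, 1.2], [-1.0, -1.0], [-1.5, -0.7], [-2.0, -1.1]]
ys = [1, 1, 1, -1, -1, -1]
w, b = train_soft_margin(xs, ys)
preds = [1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1 for x in xs]
print(preds == ys)
```

On this separable toy set the learned hyperplane classifies every training point correctly; on real fMRI data the slack terms are what let the optimization cope with points that cannot be separated.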
  - Load all the training CUBs and write a training file for each pair of classes.
  - Submit all the training jobs to Voxbo.
    * vbbatch is used to distribute svm_learn_h2o processes across the cluster.
  - Wait for Voxbo to finish.
  - Create per-subject and average w-maps.
    * These are maps of the brain with the regions of high discrimination highlighted. The voxels displayed are those with the largest **w** values, and therefore show the greatest difference between the classes being differentiated.
  - Parse the output from svm_learn_h2o to obtain "hold two out" accuracies.
  - Clean up temporary files.

===== References and Acknowledgments =====

  - Joachims T, Learning to Classify Text Using Support Vector Machines. Dissertation, Kluwer, 2002.
  - [[http://svmlight.joachims.org/|SVMlight]] is a software package created by Thorsten Joachims.
  - [[http://www.umiacs.umd.edu/~joseph/support-vector-machines4.pdf|Burges CJC (1998) A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 2, 121-167.]]
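The balanced "hold two out" accuracy computed in the steps above can be sketched as follows: each fold holds out one exemplar from //each// of the two classes, trains on the remainder, and tests on the held-out pair, so every fold is balanced between classes. This is our own illustrative naming, not SVMlight code, and a simple nearest-mean rule stands in for the real SVM training done by svm_learn_h2o; the toy exemplars are invented.

```python
from itertools import product

def nearest_mean_classifier(train_a, train_b):
    """Stand-in trainer: label a point by the closer class mean."""
    mean = lambda pts: [sum(col) / len(pts) for col in zip(*pts)]
    ma, mb = mean(train_a), mean(train_b)
    def classify(x):
        da = sum((xi - mi) ** 2 for xi, mi in zip(x, ma))
        db = sum((xi - mi) ** 2 for xi, mi in zip(x, mb))
        return "a" if da <= db else "b"
    return classify

def hold_two_out_accuracy(class_a, class_b):
    """Balanced h2o: hold out one exemplar from each class per fold."""
    correct = total = 0
    for i, j in product(range(len(class_a)), range(len(class_b))):
        train_a = class_a[:i] + class_a[i + 1:]
        train_b = class_b[:j] + class_b[j + 1:]
        clf = nearest_mean_classifier(train_a, train_b)
        correct += (clf(class_a[i]) == "a") + (clf(class_b[j]) == "b")
        total += 2
    return correct / total

# Toy "exemplars" (each a short list of voxel values) for one class pair
class_a = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1], [1.1, 2.2]]
class_b = [[-1.0, 0.0], [-1.2, 0.2], [-0.8, -0.1], [-1.1, 0.1]]
print(hold_two_out_accuracy(class_a, class_b))  # 1.0 on this separable toy data
```

Because both classes contribute one held-out exemplar per fold, chance performance stays at 50% regardless of class sizes, which is the point of the balanced scheme.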