Charles Elkan | 151 | 12-12-2007 01:38 AM ET (US)
For time-complexity reasons, we were running the Viterbi algorithm on all the words together and storing the results in a matrix of size O(m*n*numwords).
If you are using the perceptron method (or stochastic gradient), this is not a good idea, because the weight vector changes after processing each word, so Viterbi results computed in advance for the remaining words are out of date. You have to run Viterbi one word at a time.
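To make the point about per-word decoding concrete, here is a minimal Python sketch of a perceptron pass that reruns Viterbi on each word with the current weights. It is only an illustration under assumptions, not the course code: the binary tag set, the toy `features` function, and the names `viterbi` and `perceptron_epoch` are all hypothetical, and words are assumed to be non-empty.

```python
from collections import defaultdict

TAGS = (0, 1)  # assumption: binary tags, as suggested by the 00/01/10/11 tag pairs


def features(word, i, prev_tag, tag):
    """Hypothetical feature function: keys of the features that fire at position i."""
    return [("emit", word[i], tag), ("trans", prev_tag, tag)]


def score(w, word, i, prev_tag, tag):
    """Sum of the current weights of the features firing at position i."""
    return sum(w[f] for f in features(word, i, prev_tag, tag))


def viterbi(w, word):
    """Best tag sequence for ONE non-empty word under the current weights w."""
    n = len(word)
    # best[i][t]: best score of a tagging of word[0..i] that ends with tag t
    best = [{t: score(w, word, 0, None, t) for t in TAGS}]
    back = [{}]
    for i in range(1, n):
        best.append({})
        back.append({})
        for t in TAGS:
            prev = max(TAGS, key=lambda p: best[i - 1][p] + score(w, word, i, p, t))
            best[i][t] = best[i - 1][prev] + score(w, word, i, prev, t)
            back[i][t] = prev
    # Follow the back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: best[n - 1][t])
    path = [tag]
    for i in range(n - 1, 0, -1):
        tag = back[i][tag]
        path.append(tag)
    return path[::-1]


def perceptron_epoch(w, data):
    """One pass over the data; returns the number of words tagged wrongly."""
    mistakes = 0
    for word, gold in data:
        # Decode here, inside the loop: w may have changed on the previous word.
        pred = viterbi(w, word)
        if pred != gold:
            mistakes += 1
            prev_g = prev_p = None
            for i in range(len(word)):
                for f in features(word, i, prev_g, gold[i]):
                    w[f] += 1.0  # reward features of the correct tagging
                for f in features(word, i, prev_p, pred[i]):
                    w[f] -= 1.0  # penalize features of the predicted tagging
                prev_g, prev_p = gold[i], pred[i]
    return mistakes
```

Typical use would be `w = defaultdict(float)` followed by repeated calls to `perceptron_epoch(w, data)`, where `data` is a list of `(word, gold_tags)` pairs. Precomputing Viterbi for all words before the loop would decode every word with the weights from the start of the pass, which is exactly what the perceptron update invalidates.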
We have to compute four more f matrices (one each for the tag pairs 00, 01, 10, and 11), each of size numwords*lengthOfLongestWord*numFeatures. These matrices can be stored as 'sparse' in Matlab, because most feature functions are 0 at every position of every word.
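The same idea can be sketched in Python by keeping only the nonzero entries in a dictionary keyed by (word, position, feature), one dictionary per tag pair. This is an analogue of Matlab's sparse storage, not a translation of it, and the feature function `toy_feature_fn`, its indexing scheme, and the value of `NUM_FEATURES` are made up for illustration.

```python
TAG_PAIRS = ("00", "01", "10", "11")
NUM_FEATURES = 26 * len(TAG_PAIRS)  # assumption: one feature per (letter, tag pair)


def toy_feature_fn(word, pos, pair):
    """Hypothetical feature function: (feature index, value) for the nonzero features.

    Assumes lowercase a-z words; exactly one feature fires per position and tag pair.
    """
    letter = ord(word[pos]) - ord("a")
    return [(letter * len(TAG_PAIRS) + TAG_PAIRS.index(pair), 1)]


def build_sparse_f(words, feature_fn, pair):
    """Store only the nonzero entries of the f array for one tag pair."""
    f = {}
    for w_idx, word in enumerate(words):
        for pos in range(len(word)):
            for feat_idx, value in feature_fn(word, pos, pair):
                if value != 0:
                    f[(w_idx, pos, feat_idx)] = value
    return f


def lookup(f, w_idx, pos, feat_idx):
    """Entries that were never stored are implicitly zero."""
    return f.get((w_idx, pos, feat_idx), 0)


words = ["cat", "horse"]
f_by_pair = {pair: build_sparse_f(words, toy_feature_fn, pair) for pair in TAG_PAIRS}

# Size of one dense array: numwords * lengthOfLongestWord * numFeatures
dense_size = len(words) * max(len(w) for w in words) * NUM_FEATURES
stored = len(f_by_pair["01"])
print(f"dense entries: {dense_size}, stored nonzeros: {stored}")
```

With this toy feature function each position contributes one nonzero entry per tag pair, so the stored size grows with the total number of letters rather than with numwords*lengthOfLongestWord*numFeatures.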