sameer agarwal | 3 | 04-30-2001 01:52 AM ET (US)
hello,
frankly, I was quite surprised by two things:
1. That the authors were able to get a correlation with trends using such a crude language model.
2. That a 5-10 hour window was sufficient to get the correlations (information does indeed seem to flow fast). Perhaps one explanation is that by the time the news agencies get their news, so do the traders on the floor (Yahoo biz news is not exactly the source that people doing real-time trading will depend on), so there is already an implicit time delay. I think the model will have to be calibrated for each news source separately; a single time window might not be sufficient.
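To make the per-source calibration idea concrete, here is a rough sketch of how one might pick a lag for each news source by scanning lags and keeping the one with the strongest correlation. This is only an illustration under my own assumptions, not what the authors do: the hourly `news` and `trend` series, the function names, and the source names `wire_a` and `portal_b` are all hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(vx * vy)

def best_lag(news, trend, max_lag):
    """Return (lag, r) such that news[t] lines up best with trend[t + lag]."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        ys = trend[lag:]
        n = min(len(news), len(ys))
        if n < 3:
            break  # too few overlapping points to say anything
        r = pearson(news[:n], ys[:n])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best

# Hypothetical hourly story indicators for two sources (1 = relevant story that hour).
wire_a = [0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0]
portal_b = [0, 0, 0] + wire_a[:-3]   # same stories, published 3 hours later
trend = [0, 0, 0, 0, 0] + wire_a[:-5]  # the market reacts 5 hours after wire_a

# One calibrated window per source instead of a single global window.
lags = {name: best_lag(series, trend, 8)[0]
        for name, series in {"wire_a": wire_a, "portal_b": portal_b}.items()}
```

In this toy data the slower source ends up with a shorter calibrated lag, which is the implicit delay mentioned above.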
Since the length of the trends is variable, it would be interesting to see whether the length of a run (trend) can also be correlated with the news and predicted.
Another thing is that there are stories which are relevant to an entire market, a sub-market, or a whole industry, but which may not be tagged to a particular stock. It might be of some use to build a hierarchical set of models that take news feeds tagged to specific industries, to the whole economy, and so on, and correlate their effects with the trends.
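As a small sketch of what I mean by a hierarchy, stories could be grouped by the level they were tagged at, so that a separate model can be trained per level. Everything here is made up for illustration: the `PARENT` table, the ticker `ACME`, and the tag names are hypothetical, and the paper does not describe such a scheme.

```python
# Hypothetical tag hierarchy: stock -> industry -> whole market.
PARENT = {"ACME": "chips", "chips": "market"}

def scopes_for(stock):
    """Return the stock itself followed by every broader scope above it."""
    chain = [stock]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def stories_for(stock, stories):
    """Group the stories relevant to a stock by the level they were tagged at."""
    by_scope = {scope: [] for scope in scopes_for(stock)}
    for tag, text in stories:
        if tag in by_scope:
            by_scope[tag].append(text)
    return by_scope

feed = [
    ("ACME", "ACME beats earnings estimates"),
    ("chips", "Chip demand slows"),
    ("market", "Fed cuts rates"),
    ("retail", "Holiday sales flat"),  # unrelated industry, ignored for ACME
]
grouped = stories_for("ACME", feed)
```

Each group in `grouped` would then feed its own model, and the per-level effects could be correlated with the stock's trends.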
Overall, I think the paper is rather well written. I especially like the use of t-tests to decide the splits for trends and clustering. But if I am not mistaken, the authors did not mention the significance level at which they tested their hypothesis.
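To show why the significance level matters, here is a minimal sketch of a split decision where the threshold is stated explicitly. It uses Welch's two-sample t statistic on the two halves of a segment, which is my own stand-in and not necessarily the exact test the authors used; `welch_t`, `should_split`, and the sample values are hypothetical. The critical value 2.306 is the two-sided 5% value for 8 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def should_split(left, right, t_critical):
    """Split only if the two halves differ beyond the chosen critical value."""
    t, _ = welch_t(left, right)
    return abs(t) > t_critical

# Hypothetical price changes on either side of a candidate split point.
left = [1.0, 1.1, 0.9, 1.0, 1.05]
right = [2.0, 2.1, 1.9, 2.0, 2.05]
t, df = welch_t(left, right)
split = should_split(left, right, t_critical=2.306)
```

A stricter level means a larger critical value and fewer, longer trends, so the reported results depend on which level was chosen.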
sameer