
TOPIC:

Self-dissimilarity in word-frequency identifies hot news

8
Reid E. Harward
02-19-2003
03:03 PM ET (US)

Of course, running the app is a much more rewarding experience. I need some help putting what I have up on the web in a downloadable format.
7
Reid E. Harward
02-19-2003
02:59 PM ET (US)

oops, one too many dubyas in that url:

http://www.well.com/user/reid/v2.html
6
Reid E. Harward
02-19-2003
02:57 PM ET (US)

I've got a demo of something that is very similar to what this does. It rocks. I find myself fooling with it more and more. It's a Visual Basic app I've titled spidercycle.

I'm an English graduate student right now and I'm going to base my thesis on this sort of text analysis.

Here's a graph of a keyword search from an amalgamated text file harvested from a series of popular blogs.

http://wwww.well.com/user/reid/v2.html


This is much more than just a text analysis program and I would like to develop it into an open source tool. Please feel free to email me at mudlab@adelphia.net for downloading instructions.
5
plugh
02-19-2003
02:42 PM ET (US)
Yes, but that's hardly science. That's just an untested theory.

I imagine if you had a corpus significantly larger than the state of the union addresses, by the time a word "moved up on the charts" enough to be noticed, it WOULD be obvious.

Also, calling this kind of a thing an "algorithm" is somewhat overblown...!
4
TimmyT
02-19-2003
01:31 PM ET (US)
Ben, I think the point is that if you apply the technique to current texts, you might pick up on issues and trends before they become obvious.
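
For anyone who wants to try this on current texts, here is a minimal sketch in Python. It is not the method from the article under discussion, just a simple stand-in: it compares each word's smoothed relative frequency in a recent text against a background corpus and ranks words by that ratio. The names `tokenize` and `rising_words` and the `min_count` and `top` parameters are made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def rising_words(background, recent, min_count=2, top=5):
    """Rank words that are more frequent in `recent` than in `background`.

    Assumption: a plain frequency ratio with add-one smoothing, not the
    self-dissimilarity measure discussed in this topic.
    """
    bg = Counter(tokenize(background))
    rc = Counter(tokenize(recent))
    bg_total = sum(bg.values())
    rc_total = sum(rc.values())
    vocab = len(set(bg) | set(rc))

    scores = {}
    for word, count in rc.items():
        if count < min_count:
            continue  # skip words too rare in the recent text to judge
        p_recent = (count + 1) / (rc_total + vocab)
        p_background = (bg[word] + 1) / (bg_total + vocab)
        scores[word] = p_recent / p_background

    # Highest ratio first: these are the words "moving up the charts".
    return sorted(scores, key=scores.get, reverse=True)[:top]


if __name__ == "__main__":
    old = "the economy and the budget and the taxes " * 10
    new = "the economy and the atomic bomb and the atomic age"
    print(rising_words(old, new, top=3))
```

With a tiny corpus like the one above the result is unsurprising, which is more or less Ben's objection; whether it flags anything early on a large, current corpus is the untested part.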
3
ernie
02-19-2003
12:47 PM ET (US)
I was just remarking the other day at the water cooler how I wish there was a more efficient way to extract trendy new words out of the blogosphere - then BAM, this comes along!
2
brux
02-19-2003
12:09 PM ET (US)
I just hope the word "burstiness" doesn't show up anywhere ever again.
1
Ben Skott
02-19-2003
11:54 AM ET (US)
While this idea has some value, why should we be interested in the fact that certain words appear in State of the Union addresses at points in history when those words have historical significance? Why would anyone be surprised that "atomic" appeared in State of the Union addresses in the 1950s? Aren't State of the Unions supposed to cover the major points of the time? I didn't need any algorithm to know this. In fact, I'm willing to bet that the word "terrorism" will show up if this algorithm is used on Bush's recent speech. I wish someone would give me a grant to study the obvious.