|
|
| Who | When |
Messages | |
(not accepting new messages)
|
|
| Jacob Shwirtz
|
62
|
 |
|
12-10-2001 01:26 PM ET (US)
|
|
Hello, In the next few weeks my website will be making the giant leap from ASP to PHP. In translating the existing code we will be adding a slew of new features. I am wondering if this process can include a migration to ThreadsML. Here is my non-programmer head speaking - is the standard ready to be implemented? Is it still "under construction"? Is it ready for me to actually use on my user-generated content website? Keep fighting the good fight, Jacob Shwirtz http://www.GAZM.org
|
| Steve Yost
|
63
|
 |
|
12-10-2001 11:48 PM ET (US)
|
|
The standard is still under construction, but don't let that stop you. I recommend early experimentation similar to what's implemented here (add .rss to any topic URL and you'll see). This represents individual linear threads. Tracking inter-thread relationships will likely be just a superset of that.
|
| Martin Waligorski
|
64
|
 |
|
02-28-2002 05:53 AM ET (US)
|
|
Edited by author 02-28-2002 06:27 AM
I know I'm late in the discussion, but I think David's article touches upon a great point. In my few years' work within KM, I have arrived at the conclusion that the survivability and portability of a discussion thread is a critical success factor of any online community. As a matter of fact, I've been working for some time on a discussion engine which would store everything in XML - and only in XML, one thread per XML file, in a human-readable file structure - to support that principle. The advantages are obvious:
1. Wanna move threads to another site? Just grab all your XML files and off you go.
2. Wanna move a thread to an FAQ? Just copy the thread file to the FAQ directory and it's done.
3. Wanna make a personal backup of a forum? FTP down your XML files. Ready!
4. Wanna start a thread with a news article? Just copy that article to the forum directory and you have your thread.
For those who are interested, there is a test version going live at http://ipmsstockholm.org/forum/test/ Most messages there are in Swedish, but please feel free to test and add your own messages if you like. If there is more interest, I can also show you my XML storage format.
Martin Waligorski mailto:martin.waligorski@guide.se
|
| Steve Q Yost
|
65
|
 |
|
02-28-2002 08:30 AM ET (US)
|
|
Edited by author 02-28-2002 09:07 AM
I like the sentiment behind your implementation, Martin. We're here to propose and work for agreement on an XML standard so threads can be exchanged between all complying apps. You'll see in this thread that we're heading towards an implementation based on RSS 1.0, which is RDF-based.
Any contributions you might have on the standard in particular are welcome.
This has nudged me to post a summary of where I think we are -- coming shortly.
|
| Steve Yost
|
66
|
 |
|
02-28-2002 10:11 AM ET (US)
|
|
Edited by author 05-22-2003 10:33 AM
It's time to go further with a concrete proposal. A while back, David Weinberger, Rael Dornfest and I had an impromptu conference call to gather agreement on the motivations and priorities for the standard. My notes are slim, but one key point was how to deal with threading. Some applications implement branched threads and others are linear or have pseudo-threading, like Quick Topic.
Technical summary:
- Use RSS 1.0 and require the standard Content module (for example, see Quick Topic's current RSS format). Optionally use the Dublin Core module for added details.
- For optional threading, use the proposed Threading module to represent parent-to-child relationships and use the Annotation module to represent child-to-parent relationships. [Its description answers any reservations about its name: "Provides support for resources that annotate, follow-up to, or reference other resources." (my italics)] If you want to support threading, both are required, but support for threading is optional.
Question: Where applicable we want to represent relationships within the XML file rather than pointing back to the original resource (so the file alone would be sufficient for an import operation). How does a child node refer to another item within the same file? How does it refer to a specific item in a different resource file? I guess it comes down to "what's the granularity of a resource in the RSS 1.0 framework?" (The same question applies to parent pointers.)
Use case
ThreadsML should be able to support the following: A user exports a thread from a source application that supports threading. He defines its boundaries in whatever way the application provides. The thread may include forked branches, i.e. it may have a non-linear, branched structure (though there is an underlying chronological sequence of messages). The export is saved to a file.
a. He then imports that thread file to a destination application. The destination application only supports a linear representation, so it presents the messages chronologically. At the thread boundaries (in this case before the first and after the last message), Previous and Next links in this application can optionally direct the user to particular messages in the source application from which the thread was originally exported (i.e. there should be enough information in ThreadsML to support this).
b. Later, the user exports another thread from the source application, picking up chronologically exactly where he left off before, and imports it to the destination application. The destination application recognizes the first item in the new import as the successor to the last item in the previous import and represents this appropriately to the user.
This is all an extension of the conversation I mentioned, and while it owes almost everything to the other participants, they aren't accountable for my extrapolations from it. This is all flexible and in the works. Feedback is welcome. I'm especially hoping to hear from you, Aaron, but all others too.
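The use case above can be sketched in a few lines. This is only an illustration under stated assumptions, not part of any ThreadsML draft: the `Message` fields, and in particular `prev_id` (a pointer to the chronological predecessor in the source application), are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Message:
    msg_id: str               # identifier assigned by the source application
    posted: datetime          # timestamp used for the chronological view
    parent_id: Optional[str]  # message this one replies to (None for a root)
    prev_id: Optional[str]    # chronological predecessor in the source (assumed field)
    body: str


def linearize(messages: List[Message]) -> List[Message]:
    """Flatten a branched thread into the chronological order a linear app shows."""
    return sorted(messages, key=lambda m: (m.posted, m.msg_id))


def continues(previous_import: List[Message], new_import: List[Message]) -> bool:
    """True if the new import picks up exactly where the previous one stopped."""
    if not previous_import or not new_import:
        return False
    last = linearize(previous_import)[-1]
    first = linearize(new_import)[0]
    return first.prev_id == last.msg_id


# A small branched thread: m2 and m3 both reply to m1; m4 replies to m2.
first_export = [
    Message("m1", datetime(2002, 1, 3, 9, 0), None, None, "root"),
    Message("m3", datetime(2002, 1, 5, 9, 0), "m1", "m2", "second reply to root"),
    Message("m2", datetime(2002, 1, 4, 9, 0), "m1", "m1", "first reply to root"),
]
second_export = [
    Message("m4", datetime(2002, 1, 6, 9, 0), "m2", "m3", "reply on a branch"),
]

print([m.msg_id for m in linearize(first_export)])  # ['m1', 'm2', 'm3']
print(continues(first_export, second_export))        # True
```

Keeping `parent_id` alongside the chronological pointer means a destination that does support branching could still rebuild the tree from the same file.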
|
| Marc M. Adkins
|
67
|
 |
|
03-21-2002 02:28 AM ET (US)
|
|
Hmm, actually looked (albeit briefly) at the specifications you reference.
So...is an RDF channel the equivalent of a QuickTopic? Then in a channel are elements...which (with the threading module) can have children. So at this point we have the single-level equivalent of QuickTopic. If I understand...and I'm really skimming this stuff quickly.
I'm not seeing the equivalent of a "Thread" object, but then perhaps that isn't actually necessary. You can always think of the threads as being defined by all messages that have no parent messages (roots of the lattice, as it were). But wouldn't there be a need for information attached to the threads themselves, necessitating actual thread objects?
Then I keep thinking that there absolutely must be some sort of unique ID for messages/threads. Otherwise what happens if a user imports the same message/thread twice into his/her repository? You don't want stuff duplicated, right?
But then I start thinking about what to use for a unique ID and my head starts to spin. The closest I get is some sort of URL (from the RDF/channel/BBS, whatever) coupled with a unique sequence number that can only be generated (supposedly) by the owner of the URL. So the user then imports messages with URLs not connected with the user. And if the user is bad and removes them...well then my head actually hurts.
Regarding the "how does a child node refer to another item in the same file question," it seems like there is plenty w/in XML to provide for that. I've approached XML by way of XSLT, mostly, but certainly the use of ID and IDREF attributes allows items in an XML file to be linked during XSLT processing. Shouldn't this be sufficient? Especially if we assume unique message IDs? Then there's the whole XLink thang, about which I understand little.
But it is late and I'm probably not thinking this through. Interested in any thoughts whatsoever, glad to see some progress continues.
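As a rough sketch of the intra-file linking idea: the element and attribute names below (`thread`, `message`, `id`, `parent`) are invented for illustration and are not taken from any ThreadsML or RSS module. Python's `xml.etree.ElementTree` does not validate ID/IDREF declarations, so the code simply treats the two attributes as identifiers and checks the references itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical thread file: "parent" plays the role of an IDREF to another "id".
SAMPLE = """
<thread>
  <message id="m1">First post</message>
  <message id="m2" parent="m1">A reply</message>
  <message id="m3" parent="m1">Another reply</message>
</thread>
"""


def index_thread(xml_text):
    """Return (messages by id, children ids by parent id) for one thread file."""
    root = ET.fromstring(xml_text.strip())
    by_id = {msg.get("id"): msg for msg in root.iter("message")}
    children = {}
    for msg_id, msg in by_id.items():
        parent = msg.get("parent")
        if parent is None:
            continue
        if parent not in by_id:
            # The reference points outside this file; a real format would
            # need a rule for that case.
            raise ValueError(f"{msg_id} refers to unknown parent {parent}")
        children.setdefault(parent, []).append(msg_id)
    return by_id, children


def roots(by_id):
    """Messages with no parent, i.e. the implicit thread starters."""
    return [msg_id for msg_id, msg in by_id.items() if msg.get("parent") is None]


by_id, children = index_thread(SAMPLE)
print(roots(by_id))    # ['m1']
print(children["m1"])  # ['m2', 'm3']
```

The same index doubles as a duplicate check on import: a message whose `id` is already in `by_id` does not need to be stored a second time.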
|
| Steve Yost
|
68
|
 |
|
04-01-2002 09:12 PM ET (US)
|
|
Edited by author 04-01-2002 09:15 PM
Marc, sorry it's taken a while to get back to you on this. A thread for our purposes is any arbitrary linear time-sequential series of messages in a particular forum on a particular topic. It can have zero or more branching points within it, and not all messages on all branches need be included. For example, let's say we have a tree-structured thread with many small branches and sub-branches. I can pick up all the messages from 3-Jan-2002 to 4-Apr-2002 on one arbitrarily selected depth-first traversal through the tree and call that a thread. I can also pick *all* the messages between those dates and call it a thread, though it has many branches. The idea is that I should be able to export and import either of these structures.
Yes, there's a need to identify each message in a thread. That's currently done via the identifiers in the <items><rdf:Seq> section of the document. Each item in the sequence has a URI (e.g. <rdf:li rdf:resource="http://www.quicktopic.com/7/H/rhSrjkWgjnvRq/p1.1" />). That works fine for Quick Topic, where each message does have a URI. But what about email, for example? In email, each message has a Message-ID, which would suffice for tracking successive import/exports.
The purpose for the ID is to manage successive export/imports from one particular source to one particular destination (e.g. email to Quick Topic), not to be able to (for example) transport threads arbitrarily to many services and gather them up again later. So a globally unique ID isn't necessary. Are we being too short-sighted or just expedient? If we need to, can we choose a unique naming later? Duplicate messages with different IDs will be floating around in various services. Is that OK?
Maybe we should say that the original ID (which may be service-specific) should at least be preserved and identified as such. I like that. So, say I start with email, move the thread to QT, then move it on to Topica. The export from QT should include the original email Message-IDs.
Oh, and regarding intra-file parent-child node references, yes XML has ID and IDREF. My question is what we include in our particular threadsML schema in this regard. Thanks for your thoughts, and for helping to keep *this* thread moving.
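For reference, here is a minimal sketch of reading the message identifiers out of the <items><rdf:Seq> section of an RSS 1.0 document. The sample feed and its example.org URLs are made up for the illustration; the two namespace URIs are the standard RDF and RSS 1.0 ones. The code accepts both `rdf:resource` (as in the example above) and a plain `resource` attribute, since feeds vary.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"

# Made-up feed for illustration only.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.org/topic">
    <title>Example topic</title>
    <link>http://example.org/topic</link>
    <description>A linear thread</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.org/topic/p1"/>
        <rdf:li rdf:resource="http://example.org/topic/p2"/>
      </rdf:Seq>
    </items>
  </channel>
</rdf:RDF>
"""


def item_uris(rss_text):
    """Return the item URIs listed in the channel's rdf:Seq, in document order."""
    root = ET.fromstring(rss_text.strip())
    seq = root.find(f"{{{RSS}}}channel/{{{RSS}}}items/{{{RDF}}}Seq")
    if seq is None:
        return []
    uris = []
    for li in seq.findall(f"{{{RDF}}}li"):
        uri = li.get(f"{{{RDF}}}resource") or li.get("resource")
        if uri:
            uris.append(uri)
    return uris


print(item_uris(SAMPLE))
# ['http://example.org/topic/p1', 'http://example.org/topic/p2']
```

An importer tracking successive exports from one source could compare the last URI it stored against this list to find where the new export begins.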
|
| Ben Hammersley
|
69
|
 |
|
05-06-2002 12:43 PM ET (US)
|
|
*thread-merging from offlist discussion - the formatting may go astray.*
> Have you had any more thoughts about the unique naming? I'm especially interested in the ideas around moving from one service to another. Say, taking a thread from email, to Quicktopic, to a blog, to IM and back to email again. Off the top of my coffee addled brain I can see all sorts of very interesting visualisation applications here - graphically travelling down threads, and so on. As the thread gets longer and more branched, the very structure gets as interesting as the content.
>
> Again, off the top of my head, giving a *topic* a guid would allow for multiple services to syndicate conversations off each other. Add in Publish and Subscribe, and you could follow a slashdot thread (which itself is a branching of a quicktopic discussion) using an IM client, which is talking to a usenet gateway... Ahh - braindumpy, that, but it could be very powerful. Especially if you add in enough dc and rdf stuff and make them searchable: "find me all the conversations where Rael and Dave discussed football" "find me all the conversation points, where Rael replied to Dave in Spanish, about the History of the Walkman, and where he was using Jabber"
>
> Gnutellaish searching between conversation services: ooooh.
|
| Steve Yost
|
70
|
 |
|
05-06-2002 12:56 PM ET (US)
|
|
And my email response:
Yes! In fact I've been mulling over the necessity of similar tools among weblogs: e.g. the ability to search in weblog-only space or to follow links in both directions. For weblogs these exist (and I think they ought to be snapped up by Google). For web-based message threads, they should exist, including the idea of thread-hopping you mention. The example of searching you give is a great (if extreme :) specimen of the intelligent search I'm thinking of.
Given your input in addition to Marc's, I'm starting to think that it's worth the trouble(?) of adding to the RSS spec to allow for unique IDs (unless there's a module that includes one now). The IDs could be just GUIDs or they could contain information about the originating site. Maybe a separate identifier should denote the originating site.
|
| Peter Kaminski
|
71
|
 |
|
05-06-2002 12:58 PM ET (US)
|
|
Ben Hammersley writes,
>very interesting visualisation applications here graphically travelling
>down threads
Alex Shapiro has a very nice applet for generalized visualization of networks that might be nice for a highly-branched, multi-service group of related threads: TouchGraph <http://www.touchgraph.com/>. It's inspirational, at least, to play with such a nice visualizer -- my favorite application so far is the Google Similar Pages Browser.
-- Pete http://www.istori.com/peterkaminski
|
| Marc M. Adkins
|
72
|
 |
|
05-06-2002 02:07 PM ET (US)
|
|
If one of the goals is to grab any arbitrary set of messages (say by topic and time frame) and call that a thread then I wonder if we _can_ assign an ID to a thread. It's like assigning a unique ID to a SQL query.
I think we can only assign an ID to a thread that is published somewhere. That is to say, if I have a forum and I decide that these 18 messages constitute a thread it can have an ID which is a combination of my forum's ID and some unique identifier in the context of my forum. Or I can allow any top-level message (not in response to an existing message) to define a new thread. Or whatever.
If you then come along and pull a subset of my thread, that query is not in itself a new thread. If you then publish the results of that query on your own forum then it becomes a new thread, but in the context of your forum.
A little like the distinction between an XML node or document and an XML node list, which is somewhat ephemeral.
---
I have also been thinking about graphical tools for viewing threads and messages. Here's an additional twist:
Consider mind maps and other graphical idea organizing tools. Wouldn't it be great if a set of messages on a forum (possibly/probably in multiple threads or even in multiple forums???) could be viewed/organized as a mind map?
Use case: we all decide to actually build the specification for ThreadsML. We go in and organize all the message traffic herein into a mind map (or whatever model makes the most sense). We then add messages to the nodes in the mind map, filling in the (now obvious) blanks. From there we create a second projection of the underlying data, which is an outline of the projected specification document. After that, writing the document should be easy (or at least not so hard).
I found a nice site with a lot of different "mind map-like" diagrams but I can't find the linkage right now.
|
| Ben Hammersley
|
73
|
 |
|
05-06-2002 02:45 PM ET (US)
|
|
> If one of the goals is to grab any arbitrary set of messages (say by topic and time frame) and call that a thread then I wonder if we _can_ assign an ID to a thread.
I think you can - you just have to make the ID extensible: the id represents not just the message's URI, and its position in the thread and the UID of the discussion itself (ie. the ultimate root post), but also the UID of the most root-post post on that system...
So if I grab a branch of a thread and use it elsewhere, it may look like a new thread - it may even act like a new thread, but the UIDs of new messages would contain the UID of their post-split root post, which in turn contains the UID of the original root post.
This would also allow the newly created thread, made from the cutting so to speak, to eventually be reassociated with the genuine root post, with everything in its right place.
Does that make sense?
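One way to picture such an extensible ID, purely as a sketch: the `LineageId` type, its field names, and the example.org-style URIs are all invented here, and no encoding scheme is implied. Each ID carries the chain of root posts, oldest first, so a cutting keeps a pointer back to the original discussion.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LineageId:
    message_uri: str
    roots: Tuple[str, ...]  # root posts, ultimate root first, local root last

    @property
    def ultimate_root(self) -> str:
        return self.roots[0]

    @property
    def local_root(self) -> str:
        return self.roots[-1]


def transplant(source: LineageId, new_root_uri: str) -> LineageId:
    """ID for the post that becomes the root of a cutting on another system."""
    return LineageId(new_root_uri, source.roots + (new_root_uri,))


def reply(thread_root: LineageId, message_uri: str) -> LineageId:
    """ID for a new message posted in the thread rooted at thread_root."""
    return LineageId(message_uri, thread_root.roots)


def same_conversation(a: LineageId, b: LineageId) -> bool:
    """Two messages belong together if they share the same ultimate root."""
    return a.ultimate_root == b.ultimate_root


original = LineageId("http://a.example/t/1", ("http://a.example/t/1",))
cutting = transplant(original, "http://b.example/t/9")
follow_up = reply(cutting, "http://b.example/t/10")

print(follow_up.roots)  # ('http://a.example/t/1', 'http://b.example/t/9')
print(same_conversation(original, follow_up))  # True
```

The sketch also makes the cost visible: every further transplant adds one more entry to `roots`, so the ID grows with each split.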
|
| Marc M. Adkins
|
74
|
 |
|
05-06-2002 03:08 PM ET (US)
|
|
> I think you can - you just have to make the ID extensible: the id represents not just the message's URI, and its position in the thread and the UID of the discussion itself (ie. the ultimate root post), but also the UID of the most root-post post on that system...
I'm probably not following you, but it sounds like the UIDs would keep getting longer and longer. Or that they would refer to previous UIDs (which would in turn refer to other UIDs) which would mean that the entire set of UIDs would need to exist forever or the ancestral data would be lost.
I agree that we want to keep the provenance [sic] of each message and thread and so forth. I'm just unsure how it happens efficiently.
Perhaps an example would clarify this? In your copious free time. ;)
|
| Ben Hammersley
|
75
|
 |
|
05-06-2002 03:21 PM ET (US)
|
|
> I'm probably not following you, but it sounds like the UIDs would keep getting longer and longer.
>
> Perhaps an example would clarify this? In your copious free time. ;)
You're right - they would keep getting longer and longer. But they would never be that long, and the additional utility would be of greater value than the cost of another 20, say, characters. I'll try and get an example going. But even with enormous amounts of thread branches, I've a feeling it could be done with a minimum of fuss. It just needs an *evil* encoding scheme. Well, it's worth a thought or two anyway. I'll give it a go.
Question for everyone: how many characters should a UID have max? 20? 50? 100?
|
| Marc M. Adkins
|
76
|
 |
|
05-06-2002 03:50 PM ET (US)
|
|
Edited by author 05-06-2002 03:51 PM
Try this on for size...
Let's say I have a forum at http://www.Doorways.org/Forum/DarkKnight. On the forum are a ton of messages which would be identified as http://www.Doorways.org/Forum/DarkKnight/0001 or some such. These are unique IDs. Thread IDs would look like http://www.Doorways.org/Forum/DarkKnight/Thread/0001. Again, unique IDs.
You come along and ask for all messages from http://www.Doorways.org/Forum/DarkKnight/Thread/0023 from April of 2002. You get back a structure like:
  <thread id="http://www.Doorways.org/Forum/DarkKnight/Thread/0023">
    <message id="http://www.Doorways.org/Forum/DarkKnight/0691">
      Yeah, well Batman could beat up the Green Lantern ANY day!
    </message>
  </thread>
Now you put the messages on your site, under your forum: http://www.ComicNerds.net/Forum/Batman/Thread/0187. Let me suggest that you store, for each message, the _original_ id (e.g. http://www.Doorways.org/Forum/DarkKnight/0056). This original ID should _always_ remain with each message since it uniquely identifies it across all time and space. So if a third party grabs your 0187 thread including my message they'll get something like:
  <thread id="http://www.ComicNerds.net/Forum/Batman/Thread/0187">
    <message id="http://www.Doorways.org/Forum/DarkKnight/0691">
      Yeah, well Batman could beat up the Green Lantern ANY day!
    </message>
    <message id="http://www.ComicNerds.net/Forum/Batman/0123">
      Green Lantern would whip his butt!
    </message>
  </thread>
So if the thread eventually makes its way back to the original http://www.Doorways.org/Forum/DarkKnight forum the message(s) that have originated there will be re-matched properly. What happens if the original forum is disbanded? Well, the URL is still unique. The message is uniquely identified. If it shows up on some other site from two different message pulls it will not be duplicated.
Keep in mind that XML namespaces work this way: they don't necessarily represent an actual data object on a site (though they often do point to a schema file), the URLs just specify unique IDs for the namespaces. The only problem I see here is if the forum is _restarted_ and the same unique URL is assigned to a new message. Of course, this could be easily handled by simply checking messages with matching IDs to see if they actually match.
Now I don't see this scaling to thread IDs. For one thing, the query may only pull part of a thread. For another, the thread changes over time (I'm assuming that the message doesn't, but that's probably open for discussion). So if you pull my entire Thread/0023 and place it on your site as Forum/Batman/Thread/1117 it shouldn't necessarily have an original ID of DarkKnight/Thread/0023.
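The import rule described in this post (keep the original ID, skip duplicates, and compare content when IDs collide) might look roughly like the following. It is a sketch only: the `Message` type and `merge` function are names made up for the example, and comparing message bodies is just one possible way of "checking messages with matching IDs to see if they actually match".

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Message:
    original_id: str  # URL assigned by the forum where the message first appeared
    body: str


def merge(store: Dict[str, Message],
          incoming: Iterable[Message]) -> Tuple[List[Message], List[Message]]:
    """Add unseen messages to store; return (added, conflicts).

    A conflict is an incoming message whose original ID is already known
    but whose content differs, e.g. after a forum restart reused the URL.
    """
    added, conflicts = [], []
    for msg in incoming:
        known = store.get(msg.original_id)
        if known is None:
            store[msg.original_id] = msg
            added.append(msg)
        elif known.body != msg.body:
            conflicts.append(msg)
        # Same ID and same body: a duplicate pull, silently ignored.
    return added, conflicts


batman = Message("http://www.Doorways.org/Forum/DarkKnight/0691",
                 "Yeah, well Batman could beat up the Green Lantern ANY day!")
lantern = Message("http://www.ComicNerds.net/Forum/Batman/0123",
                  "Green Lantern would whip his butt!")

store: Dict[str, Message] = {}
merge(store, [batman])                        # first pull from the original forum
added, conflicts = merge(store, [batman, lantern])  # second pull via another site
print(len(store), len(added), len(conflicts))  # 2 1 0
```

What to do with a conflict (reject it, store both, ask the user) is left open here, as it is in the discussion.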
|
| David Weinberger
|
77
|
 |
|
05-06-2002 04:04 PM ET (US)
|
|
Off topic, but easier...
Since blogthreads are neither simply chronological nor hierarchical, a map that shows all the links among them is likely to become less useful as it becomes more comprehensive. Once y'all have solved the hard problems, do you think it'd make sense to consider having an attribute that codes for "direct reference" or "main reference" or some such? For example, if C replies to the outrageous lies in A, but in passing mentions that B's blog entry also shows that A is lying through his teeth, it'd be nice to know that even though C mentions both A and B, C is really replying to A.
Other attributes that capture relevancy and popularity (e.g., "found this blog entry helpful" as per a discussion with Ben Hammersley) would be useful for whatever apps decide to make these threads visible. It might be useful to have some set of such attributes built in, in addition to of course having the standard be extensible. But this is why I don't write standards.
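To make the "main reference" idea concrete, here is one hypothetical shape for it. The `Entry` type and the role names `"main"` and `"mention"` are invented for this sketch and do not come from any existing module.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    uri: str
    # Maps each referenced entry's URI to a role: "main" or "mention" (assumed values).
    references: Dict[str, str] = field(default_factory=dict)


def reply_targets(entry: Entry) -> List[str]:
    """URIs this entry is really replying to, ignoring passing mentions."""
    return [uri for uri, role in entry.references.items() if role == "main"]


# C replies to A and mentions B in passing.
c = Entry("http://example.org/c", {
    "http://example.org/a": "main",
    "http://example.org/b": "mention",
})
print(reply_targets(c))  # ['http://example.org/a']
```

A map-drawing application could then render only the "main" links by default and show mentions on demand, which keeps a comprehensive map readable.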
|
|
|