QuickTopic (SM) free message boards
Topic: ThreadsML???
Introduction
This is a continuation of a discussion about ThreadsML started over a year ago. Marc Canter resurrected it with an introductory email and has been pushing it along since then.
 
Reference links:
ThreadsML.org
Early definitive statement in the older discussion
An early JOHO article by David Weinberger
Marc Canter  1
04-11-2003 12:55 PM ET (US)
Hey,

Pete Kaminski just alerted me to the effort and ideas behind something called ThreadsML http://www.quicktopic.com/7/H/rhSrjkWgjnvRq - but I can't find anything since Nov. '02. Discussions also seem to have gone silent on the RSS-DEV list about this idea.

I've been able to find this definitive statement by Steve Yost:
http://www.quicktopic.com/7/H/rhSrjkWgjnvRq?m1=66&mN=66

and an early article:
http://www.hyperorg.com/backissues/joho-jun17-01.html

a RDF proposal:
http://web.resource.org/rss/1.0/modules/threading/

and an excellent analysis by Ben Hammersley:
http://rss.benhammersley.com/archives/000035.html

there's even a domain - which has been abandoned:
http://www.threadsml.org/

I assume a lot of these thoughts and insights have been beaten to death and discussed (via QuickTopics of course) and that everyone is just sick of it by now. But now that we're about to embark on a whole new thing - thanks to Matt and Paolo with ENT (Easy News Topics.) Having topics in an RSS feed and compatible with RDF and XTM is a great step forward.

Whether it be a multimedia conversation (as I theorized), a traditional discussion forum (use.net, Brainstorms, Yahoo Groups, Slashdot, iVillage, whatever) or something like what SocialText is developing.....

.... having an interchange standard for threaded discussions - forums - would be killer!

We're building new kinds of tools - to create new kinds of communities and man, could we use an interchange standard for threads.

- Marc

P.S. BOAF at Emerging Tech?
Steve Yost  2
04-11-2003 01:02 PM ET (US)
Edited by author 04-11-2003 01:17 PM
As far as I know, the effort stands at the proposal you referred to, Marc (http://www.quicktopic.com/7/H/rhSrjkWgjnvRq). Can we get agreement on that? If so, it may be just a matter of specifying that more formally and taking the proposal to a larger group for approval. Can someone say how that's normally done, at least for reference later?

This is good timing for me, because once I get QT Pro released (a few weeks from now), I want to implement ThreadsML in QuickTopic. Once the standard is in place, there are lots of possibilities -- one I'm thinking of is providing very flexible viewing of any thread through XSL.

Thanks for resurrecting this, Marc. I'd let it lay idle for awhile while working on other things.

Whoops, the proposal is here: http://www.quicktopic.com/7/H/rhSrjkWgjnvRq?m1=66&mN=66
Danny Ayers  3
04-11-2003 01:42 PM ET (US)
re. QT : nifty footwork!

re. ThreadsML

Although I have some reservations about the spec in its current form, it certainly would be great to see some smarter applications appear!
Steve Cayzer has been working in the Semantic Blogging domain recently, and has surveyed what's out there - I'm cc'ing him in the hope he has comments re. ThreadsML.

My main reservation: personally I think ThreadsML tries to do too much in one place, and runs the risk of the semantics being spread thin (if you see what I mean ;-) In particular I think the cataloguing terms (Topic, hasTopics, categoryOf etc) could be better defined separately from the thread terms (Post, agreesWith, commentsOn etc). The overlap between these two sets and other schemas (e.g. the RSS taxo module, the ClaimMaker vocab) could make mixing messy.

Having said that, the way ThreadsML stands in its current form is probably already pretty close to the sweet spot for well-defined/easily-used. If there's a general hubbub of approval to the spec as it stands, I'll happily go with the flow and do what I can to support it in whatever apps I work on.
(just a random thought - I wonder if the Topic side could be linked to the RDF representation of Topic Maps work...that could pull in a whole community...)

btw, there seems to be loads of discussion about structured discussion, see for example:
http://collab.blueoxen.net/cgi-bin/wiki.pl

Evidence that it's an idea whose time has come, perhaps?

Cheers,
Danny.






< replied-to message removed by QT >
Marc Canter  4
04-11-2003 01:54 PM ET (US)
Cool.

I'm funneling David's reply - and mine - via this email - onto the QuickTopic site.

This would be ONE example of how a ThreadsML might provide cool new functionality.

I used today - as the day that Matt and Paolo shipped the first draft of ENT (Easy News Topics)

http://matt.blogs.it/specs/ENT/1.0/

as I believe THAT is how ThreadsML will get implemented! 'cause without categorizing and flowing threads into ontologies or facet maps - what's the point?

I also suggested a BOAF at ENTCON (Dr. Weinberger is into it) - anybody else interested in attending?

- Marc
=================

ThreadsML never picked up sufficient steam, but I believe that it or something like it is needed more than ever. A BOAF at Emerging Tech sounds like a great idea. Count me in.


-- David W.

=================

>
< replied-to message removed by QT >
Marc Canter  5
04-11-2003 01:58 PM ET (US)
wow 2 mentions of Blue Oxen in one day - quite a day for them.

I'd like to point to Matt and Paolo's new ENT spec:

http://matt.blogs.it/specs/ENT/1.0/

as much as I'd like to see RDF succeed, it's still pretty daunting and complicated - despite the W3C's upcoming efforts at education.

ENT is an RSS extension and something any software developer can add in very quickly. It would also be 100% compatible with any parallel RDF or XTM efforts. That's KEY!

I propose that "we" use ENT as a mechanism for flowing threads through. It would take an alteration of the current ThreadsML spec - but something well worth it.

- Marc

>
< replied-to message removed by QT >
Matt Mower  6
04-12-2003 06:20 AM ET (US)
Hi folks,

I was feeling pretty ill yesterday when Marc pointed me to this thread so I'm just catching up. Also I missed ThreadsML before so it's 106 messages I'm reading!

From what I can see though ThreadsML seems to be a good idea and, if it's based upon RSS, an idea whose time has come.

One thing I haven't come across so far is any explanation about why the previous initiative fell short. Can anyone summarise? Are there any lessons to be learned? Have we?

Regards,

Matt

p.s. The ENT spec URL should be

http://www.purl.org/NET/ENT/1.0/

which is its permanent URL. The URL via my blog is where the document lives right now but the above PURL will always point at the latest version of the spec.
Ben Hammersley  7
04-13-2003 09:01 AM ET (US)
I can do one better than that. I'm speaking there on this very sort of thing. I'm planning on talking about ThreadsML (and now ENT). Perhaps making it into more of a discussion type affair would be good? Also attending are some Latent Semantic Indexing people, and the guys from Waypath, so we might be able to touch on a great deal of good stuff regarding fitting threads into ontologies, and representing such.
Personally I think retrofitting existing content into an ontology is the Really Big Problem we're going to have to face sooner rather than later. Might as well start talking about it.

On Friday, April 11, 2003, at 07:54 PM, QT - Marc Canter wrote:
>
> I also suggested a BOAF at ENTCON (Dr. Weinberger is into it) -
> anybody else interested in attending?
>
> - Marc
> =================
>
> ThreadsML never picked up sufficient steam, but I believe that
> it or something like it is needed more than ever. a BOAF at
> Emerging Tech sounds like a great idea. Count me in.
>
>
> -- David W.
Marc Canter  8
04-13-2003 04:03 PM ET (US)
sounds like a plan - I'm trying to get either Matt or Paolo to show up.
this is during your mail bot session?

>
< replied-to message removed by QT >
Ben Hammersley  9
04-13-2003 04:30 PM ET (US)
On Sunday, April 13, 2003, at 10:03 PM, QT - Marc Canter wrote:

> sounds like a plan - I'm trying to get either Matt or Paolo to
> show up.
> this is during your mail bot session?

Yes, precisely. There's much intertwingley goodness to be had when you start treating email, and mailing lists specifically, as just-more-data. I'd wager that in total there is a great deal more meaningful signal in the collected listservs and majordomos of the world than the web. The more technical or complex a subject, it seems to me, the more it gravitates towards old school mailing lists, and the further it retreats from the web. Why this is I'm not sure, but releasing the trapped knowledge in all those old list archives seems a very noble cause. And to do that, we need something like ThreadsML.
Or I might be talking cobblers. Anyone?
Paolo Valdemarin  10
04-14-2003 08:56 AM ET (US)
Looks like we might be able to have some parts of the aggregator up and running by the conference time. We won't be able to participate (unless we win some money within the next few days), but I guess this is not going to be a big problem.

The basic idea is to start showing some practical applications of the protocol in order to have others thinking and possibly implementing.

There are a few interesting things that can be discussed about ENT, besides the protocol.

The first one is the usage of TopicRolls. They are basically a way to syndicate lists of types and topics and can be used in several different ways. We have thought of a couple of them, but most probably others will follow, maybe moving away from OPML, which is the format we are currently using.

The other front is on the aggregator side. Once we start collecting posts and organizing them according to topics, there's something more to do than simply sorting posts using topics. This is why we introduced Types as a way to create relations between posts.

In other words, if a post contains these topics:
  • places:San Francisco, Gradisca d'Isonzo, London, Somewhere in Sweden
  • people:Marc Canter, Ben Hammersley, Paolo Valdemarin, Matt Mower
  • companies:Broadband Mechanic, Evectors, Novissio


We could create relations between all these and all the other ones already existing in a directory, thus allowing navigation through posts in a completely new way (btw: this is what our aggregator will ultimately do).
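
A minimal sketch of the idea above, in Python. The post records, identifiers, and function names are all assumptions made for illustration; nothing here is taken from the ENT spec or from the aggregator itself. It indexes posts by (type, topic) and relates topics that appear together in the same post:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical post records: each maps a type (e.g. "places") to its topics.
posts = {
    "post-1": {
        "places": ["San Francisco", "London"],
        "people": ["Marc Canter", "Ben Hammersley"],
    },
    "post-2": {
        "places": ["London"],
        "people": ["Matt Mower"],
    },
}


def build_index(posts):
    """Map each (type, topic) pair to the set of posts that mention it."""
    index = defaultdict(set)
    for post_id, typed_topics in posts.items():
        for topic_type, topics in typed_topics.items():
            for topic in topics:
                index[(topic_type, topic)].add(post_id)
    return index


def related_topics(posts):
    """Relate every pair of (type, topic) entries that co-occur in a post."""
    relations = defaultdict(set)
    for typed_topics in posts.values():
        entries = [
            (topic_type, topic)
            for topic_type, topics in typed_topics.items()
            for topic in topics
        ]
        for a, b in combinations(entries, 2):
            relations[a].add(b)
            relations[b].add(a)
    return relations


index = build_index(posts)
relations = related_topics(posts)
```

With an index like this, following a topic to its posts and then on to the related people or places is a dictionary lookup, which is roughly the kind of navigation described above.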

But the kind of relations that can be created between topics and types is still to be completely explored and could probably create some very interesting results, especially if applied to existing archives.
Danny Ayers  11
04-14-2003 09:36 AM ET (US)
Ciao Paolo,

> Looks like we might be able have some parts of the aggregator up
> and running by the conference time. We won't be able to
> participate (unless we win some money within the next few days),
> but I guess this is not going to be a big problem.

This is great - any URLs to look at already?

...

> The first one is the usage of TopicRolls. They are basically a
> way to syndicate lists of types and topics and can be used in
> several different ways. We have thought of a couple of them, but
> most probably other will follow, maybe moving away from OPML
> which is the format which we are currently using.

It would be great to see TopicRolls appearing in a fashion comparable to blogrolls (especially with tools like blogrolling.com to help).
Personally I'd run a mile from OPML -
see http://dannyayers.com/archives/001119.html

[snip re. types]
I'll be interested in seeing what you've come up with for this. Your remark "...we introduced Types as a way to create relations between posts..." does make me wonder though - ThreadsML offers a way of creating relations between posts using an existing framework for expressing relations, even if you didn't find ThreadsML the language suitable you could still use the relationship framework (RDF). The ENT idea I can see as a neat way of introducing an important bit of semantics into RSS 2.0 (something else I'd run a mile from ;-) but extending this idea further really does sound like there might be wheel-reinvention going on. I hope & trust you'll convince me otherwise ;-)

Cheers,
Danny.
Steve Yost  12
04-14-2003 10:17 AM ET (US)
Matt wrote in /m6:
> One thing I haven't come across so far is any
> explanation about why the previous initiative fell
> short. Can anyone summarise? Are there any lessons
> to be learned? Have we?

I'd say it was nothing more than lack of a consistent push on my part, my attention being pulled back to QuickTopic core features. I think the participants were in agreement about the fundamental issues. Divergences happened around details, and there was a little defocusing around tangential issues and applications.

I'm even more confident now that with enough attention and agreement on (and focus on) the problem we're trying to solve, we can create an extremely useful standard here.
Steve Yost  13
04-14-2003 10:37 AM ET (US)
I like quite a few things I read in the ENT spec, especially the pragmatic approach revealed in lines like these:
===
The goals of ENT are to:
1. be as simple to implement as possible
2. represent topics sufficiently that they be useful in enabling smart aggregators (e.g. filtering, recombining feeds, etc...)
--
It is our position that the chief reason for the lack of widespread support of the existing standards is their perceived complexity. [To this I'd add the secondary result of lack of software library support. Has this changed recently?]
--
By using a namespace, non-ENT compliant RSS aggregators may safely ignore the ENT elements and attributes and the topic information contained within them.
===

I'm not yet up on RSS 2.0 (and BTW I'm certainly not the most qualified to push in any particular technical direction here.) Thinking about immediate applicability (a big boost for momentum), can anyone comment on current support for RSS 2.0 in software libraries, and in applications like RSS aggregators?
Danny Ayers  14
04-14-2003 11:04 AM ET (US)
> I like quite a few things I read in the ENT spec, especially the
> pragmatic approach revealed in lines like these:

I agree - this aspect is impressive.

But:
===
> It is our position that the chief reason for the lack of
> widespread support of the existing standards is their perceived
> complexity.
===

I reckon this analysis probably is spot on, but shouldn't the appropriate response be an attempt to change the perception, rather than creating a new standard - doesn't this just add to the (real) complexity? (I could be wrong, see below)

> [To this I'd add the secondary result of lack of
> software library support. Has this changed recently?]

A few people are trying to directly address the issue, but I don't think there have been any major revolutions, though the attention Movable Type's been getting is probably significant.

===
> By using a namespace, non-ENT compliant RSS aggregators may
> safely ignore the ENT elements and attributes and the topic
> information contained within them.
> ===

If you compare an RSS 1.0 feed with an RSS 2.0 feed, the scariest part for a newcomer is probably the use of namespaces in 1.0. Perhaps their use in places like ENT will sweeten this - so that a side effect of RSS 2.0 + ENT might be greater willingness to consider the RDF framework...
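
To make the "safely ignore" point concrete, here is a rough Python sketch using only the standard library. The namespace URI and the `ent:cloud` / `ent:topic` element names are assumptions for illustration and should be checked against the ENT spec; the feed itself is made up. A reader that only asks for the plain RSS 2.0 elements never sees the namespaced ones, while an ENT-aware reader asks for them explicitly:

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI and element names, for illustration only.
ENT_NS = "http://www.purl.org/NET/ENT/1.0/"

FEED = """<rss version="2.0" xmlns:ent="http://www.purl.org/NET/ENT/1.0/">
  <channel>
    <title>Example feed</title>
    <item>
      <title>ThreadsML revisited</title>
      <link>http://example.org/posts/1</link>
      <ent:cloud>
        <ent:topic>threading</ent:topic>
        <ent:topic>syndication</ent:topic>
      </ent:cloud>
    </item>
  </channel>
</rss>"""


def plain_items(xml_text):
    """What a non-ENT reader sees: only the un-namespaced title and link."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


def item_topics(xml_text):
    """What an ENT-aware reader can additionally pull out of each item."""
    root = ET.fromstring(xml_text)
    topic_tag = "{%s}topic" % ENT_NS
    return {
        item.findtext("title"): [t.text for t in item.iter(topic_tag)]
        for item in root.iter("item")
    }
```

The first function works unchanged whether or not the extension elements are present, which is the backwards-compatibility property being discussed.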

> I'm not yet up on RSS 2.0 (and BTW I'm certainly not the most
> qualified to push in any particular technical direction here.)
> Thinking about immediate applicability (a big boost for
> momentum), can anyone comment on current support for RSS 2.0 in
> software libraries, and in applications like RSS aggregators?

I think it can be checked somewhere around syndic8.com, but I think support for basic RSS 2.0 syndication/viewing is fairly universal (it varies little from the previous Userland specs). I don't know what the situation is regarding namespace-identified extensions (like ENT), though I'm fairly confident that 'rare' would be a good estimate.

Cheers,
Danny.
Steve Yost  15
04-14-2003 11:06 AM ET (US)
I should add the following to my enquiry about RSS 2.0, just to balance the issue. As the ENT spec mentions, we're always trying to balance simplicity and flexibility. The flexibility of RDF-based 1.0 was what sold me on that approach, but the likelihood of success is, I still think, greatly dictated by the availability of software libraries (in popular languages) for reading and writing it.

So I'll reiterate: what libraries are now available for reading/writing RDF (or specifically RSS 1.0)?
Steve Yost  16
04-14-2003 11:10 AM ET (US)
Looks like your post crossed mine on the wire, Danny. Thanks.

I wonder: regarding software libraries, is there a big leap from supporting the namespace extension of RSS 2.0 for ENT to supporting RDF in general? (I think you implied that there isn't).
Matt Mower  17
04-14-2003 11:23 AM ET (US)
Hi Danny,

[Note: I started drafting this message before the last flurry of posts by Danny & Steve - I think its contents are still pretty relevant]
> [snip re. types]
> I'll be interested in seeing what you've come up with for
> this. Your remark "...we introduced Types as a way to create
> relations between posts..." does make me wonder though -
> ThreadsML offers a way of creating relations between posts
> using an existing framework for expressing relations, even if
> you didn't find ThreadsML the language suitable you could
> still use the relationship framework (RDF). The ENT idea I
> can see as a neat way of introducing an important bit of
> semantics into RSS 2.0 (something else I'd run a mile from
> ;-) but extending this idea further really does sound like
> there might be wheel-reinvention going on. I hope & trust
> you'll convince me otherwise ;-)
>

I can see where you are coming from, it's a concern for us too. But I don't think there is too much to be worried about.

My view is that RDF is a great way of doing things as long as it is wrapped with 1st rate tool support and matched with applications that warrant it. So far as I can tell both of those are near to non-existent right now and RDF remains primarily the domain of "those people who are interested in RDF and think it is a good idea," with a few exceptions.
Back when Paolo and I were tossing this idea around we carefully thought over "are we re-inventing RDF" and came to the answer: "no." ENT is, by comparison, pretty simple/limited compared to RDF. But right now I think simplicity is more valuable than power. We are still in the early stages of building the semantic web and, really, the applications we have don't *need* RDF's power right now. We think ENT is "just good enough" to launch a raft of exciting applications.

I fully expect though, that those applications will grow too big for ENT and simple "hard-wired" standards like it. But this demand will, perhaps, lead to an awakening about RDF. Its time will have come because there will be applications that justify its complexity & the perceived benefits of those applications will be enough to overcome the inertia involved in getting started with it. And, hopefully, by that time the RDF folks will have delivered much more solidly on the tools front ;)

In short my belief is that simple, focused, standards now will pave the way for the adoption of more powerful standards later. Also, as I have stated before[1], I don't think that RDF will really hit a home run until OWL is ready for prime-time.

Regards,

Matt

[1] http://matt.blogs.it/2003/04/08.html#a856
Danny Ayers  18
04-14-2003 11:35 AM ET (US)
> Looks like your post crossed mine on the wire, Danny. Thanks.

heh, and again.

> I wonder: regarding software libraries, is there a big leap from
> supporting the namespace extension of RSS 2.0 for ENT to
> supporting RDF in general? (I think you implied that there
> isn't).

Aah, scratch any implication to be found along those lines. The point I was trying to make was that what probably scares most potential RSS 1.0 users away is the ugly syntax of RDF/XML, which usually begins with lots of namespace declarations, and features prefixed:elements and even the rare prefixed:attributes all over the place. These are actually artifacts of XML not RDF, it's just that they don't appear in RSS 2.0, until you add extensions.

To support RDF in general, you would really be talking about supporting the model. This could be a big leap. I guess it depends a lot on how people are modelling things already - e.g. if their RDBMS tables are fairly well normalised then it should refactor without too many tears.

Cheers,
Danny.
Steve Yost  19
04-14-2003 12:00 PM ET (US)
Edited by author 04-14-2003 12:10 PM
Would it be practical for someone to write a library that's specific to a particular RDF usage (like proposed ThreadsML) that doesn't cover RDF generally?

Also, a major concern is that an implementation doesn't break existing tools that might use it effectively.

As an example that covers both of these items, it seems that lots of RSS aggregators support RSS 1.0. [I'd really like a table of popular aggregators and their support. Here's one: http://blogspace.com/rss/readers ]

It's my understanding that any extension that's based on RSS 1.0 would not break existing apps that support RSS 1.0. True?
Steve Yost  20
04-14-2003 12:15 PM ET (US)
Edited by author 04-14-2003 01:17 PM
Libraries supporting RSS 1.0:

Perl: http://perl-rss.sourceforge.net/
PHP example, not library: http://www.sitepoint.com/article/560/1
Java: Jena:
  preview 2 released: http://www.hpl.hp.com/semweb/jena2.htm
  RSS 1.0 parsing: http://www-uk.hpl.hp.com/people/bwm/rdf/jena/rssinjena.htm

I'll update this post as I find more. I couldn't find one for c-sharp. Any other favorites?
Ben Hammersley  21
04-14-2003 12:45 PM ET (US)
Nothing would break, that's true. The namespacing does that trick - same as in 2.0, actually. The question, however, is whether any existing RSS aggregators actually aggregate 1.0 as RDF, i.e. by parsing it into triples, or building an RDF database, and then querying that database for the display. As far as I know, none do. They just parse 1.0 as if it were 2.0, by simple XML handling, or even RegExps.

The ramifications of this are quite plain: extensions are ignored unless the software is set up to see them, rather than the ideal situation where the data is thrown into an RDF soup, and it's just a matter of using a different query to get at it. It's this soup-ability that gives RDF the edge, imho, if only the documentation was there. Shelley Powers' new book should help a lot.
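
As a toy illustration of that "soup" idea (not a real RDF store), the Python sketch below keeps everything as (subject, predicate, object) triples and answers questions by pattern matching. The `post:`, `rss:` and `thr:` prefixes and the sample data are made up for the example; real RDF would use full URIs and a proper parser:

```python
# Assumed, simplified identifiers; real RDF would use full URIs.
triples = {
    ("post:1", "rss:title", "ThreadsML revisited"),
    ("post:2", "rss:title", "Re: ThreadsML revisited"),
    ("post:2", "thr:commentsOn", "post:1"),  # extension data, same soup
}


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# The core data and the extension data are reached the same way:
titles = match(triples, p="rss:title")
replies = match(triples, p="thr:commentsOn")
```

No new parsing code is needed to reach the extension data, only a different pattern, which is the contrast with per-element XML handling drawn above.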


On Monday, April 14, 2003, at 06:00 PM, QT - Steve Yost wrote:

>
< replied-to message removed by QT >
Steve Yost  22
04-14-2003 01:11 PM ET (US)
Ben, thanks -- this sheds a lot of light on the issue for me.

The situation sounds very good to me. It sounds like app writers were able to conform to reading RSS 1.0 fairly simply, without writing themselves into a corner -- they can add more semantic processing if they wish later (presumably by rewriting non-user-visible layers, or using libraries when they're available). Meanwhile, content producers can produce whatever richness they're motivated to. The great bottom line is that the spec becomes a non-issue in bridging the two, leaving it to the market to determine that.

It sounds like Shelley Powers' book will be good to help understand RDF. Hopefully it will motivate people to write software libraries. Oversimplifying, RDF needs its James Clark.
Marc Canter  23
04-14-2003 01:12 PM ET (US)
Good morning everyone - late night on the West Coast - but now I've caught up with all the traffic.

Here are my replies to all the subjects:

- by definition - interest in ThreadsML is back - thanks to this interchange.

- I think we can keep discussion of ENT's relevance to ThreadsML going, but it's obviously of tertiary importance to the core issues of each:

 - Matt and Paolo have just announced ENT - so they're heads down trying to get a new kind of aggregator released - their next tasks will be to evangelize ENT to the other aggregator tool vendors, all the while keeping a mind open to what's possible

 - TopicRolls and the concept of "clouds" are a clear way to bridge between any and all implementation using ENT today and any RDF - semantically inclined functionality and frameworks - moving forward. The key is end-user's investments in building their "digital lifestyles" around these new sort of software tools, and I think TopicRolls and Clouds are exactly appropriate places for transitioning forward - if and when that is required.
- that said - I'm fascinated by the pure functionality of what ThreadsML can enable.

 - but as usual, for us to take any conceptual discussion forward, get it implemented and then place that functionality into the hands of end-users - is the key goal
 - so to that end I proposed that the ThreadsML efforts be "flowed" through ENT
 - this might BOTH get us to our goals (see below) AND keep the channels open moving forward

==============

That said - here's what I think would be most powerful.

 - it's clear to me that RDF-RSS is a religious matter - and especially since this week is Passover, which then naturally leads to Easter, in the spirit of Jesus, and to avoid any Bosnias, Iran-Iraq, African genocide, terrorist incidents or major rolling of the tanks - I think we should all be especially careful not to let this discussion end up in some religious dispute. Each to his own. Vive la différence!

 - it's clear to me that RDF DOES now have momentum - and that the ball is in Danny's, OSAF, the W3C, T B-L and the rest of those (notice I didn't say REST-ful ones :-) folks - to put their money where their mouths are...... (I for one am pretty excited about the possibilities!)

 - and I'd venture to bet that these RDF folks are probably pretty busy doing just that

 - and Matt and Paolo are pretty busy too

 - so then it falls onto the shoulders of the message board folks - symbolically represented by Steve - to try and see if there's some overlap/cross-over possibilities between what ThreadsML is today as defined as a RDF framework - and what can be done with good-old simple, hardwired framework that it is - RSS/ENT.

  - also just as clearly is the goal to keep 100% backwards compatibility so all those aggregators keep on rocking the end-user's worlds, and keep on providing new areas of functionality to humans. This is key. So that's another reason why namespaces - as they're implemented in RSS 2.0 - make sense. Nobody here wants to break anything, and if we're to leverage all those aggregators and aggregator users - this is also a requirement.
 - as I implied above - the REAL goal would be to get ThreadsML to work with BOTH an RDF/RSS1.0 AND an ENT/RSS2.0 implementation scheme. THAT would really rock the house, make all parties happy and...... nullify any potential religious warfare.

Does that make sense?

- Marc

BTW Our (Broadband Mechanics) goal is to create new kinds of front-ends and new kinds of aggregators - that would pull in Threads from iVillage, Slashdot, QuickTopic and all sorts of message boards into one place - in other words message/thread aggregation.

Venturing into this area naively I thought 'all' we needed was an:
 - open data structure
 - open API that all tools would adopt (post, read, etc.)

But clearly we're all poised to go beyond that. I just wanna make sure the simple goal of aggregating messages happens - so we can do for messages, what RSS/RDF did for news feeds.

That's the overall goal - new functionality for humans - NOW!
Marc Canter  24
04-14-2003 01:15 PM ET (US)
GREAT!

This (I think) confirms my theory that we (collectively) could move forward implementing something new like ThreadsML - BOTH RDF/RSS1.0 AND ENT/RSS2.0.
Correct?

And that TopicRolls and Clouds (as ingeniously designed by Matt and Paolo) can be a bridge between these two implementation worlds.

Correct?

- Marc

>
< replied-to message removed by QT >
Steve Yost  25
04-14-2003 01:32 PM ET (US)
Edited by author 04-14-2003 01:33 PM
Sure sounds like the best possible outcome, Marc. I need to learn more about the diffs and similarities between RDF/RSS1.0 and ENT/RSS2.0. A little homework for me.
Danny Ayers  26
04-14-2003 01:38 PM ET (US)
Matt - just acknowledging you have a good point, I'm not 100% convinced, but then we can't be 100% certain of anything anyway ;-)


< replied-to message removed by QT >
Marc Canter  27
04-14-2003 01:45 PM ET (US)
I would say Matt is the expert on that.

- Marc

>
< replied-to message removed by QT >
Peter Kaminski  28
04-14-2003 11:20 PM ET (US)
A few of us are proposing a "Social Software Alliance" to help coordinate and evangelize the sorts of standards needed for social software (like, for instance, ThreadsML or ENT).

Here's a pre-release/draft Call For Discussion:

<http://www.socialtext.net/ssa/>

That's a wiki; if you'd like to edit, there's a free registration page at <http://www.socialtext.net/ssa-registration/>.

There's also a mailing list:

subscribe: blank email to social-subscribe@lists.polycot.com
archive: http://lists.polycot.com/cgi-bin/ezmlm-cgi/2/

Comments/edits welcome! We have a conference call / online chat tentatively set for this Friday, April 18th (time TBD) to hash it out some more.

Pete
Myk Melez  29
04-15-2003 06:29 PM ET (US)
QT - Steve Yost wrote:

>I should add the following to my enquiry about RSS 2.0, just to
>balance the issue. As the ENT spec mentions, we're always trying
>to balance simplicity and flexibility. The flexibility of
>RDF-based 1.0 was what sold me on that approach, but the
>likelihood of success is, I still think, greatly dictated by the
>availability of software libraries (in popular languages) for
>reading and writing it.
>
>So I'll reiterate: what libraries are now available for
>reading/writing RDF (or specifically RSS 1.0)?
>
The Mozilla Application Framework includes an RDF parser and aggregator:
http://www.mozilla.org/rdf/doc/

In future releases Mozilla will allow web scripts to access these RDF services:

http://bugzilla.mozilla.org/show_bug.cgi?id=122846

-myk
Cayzer, Steve  30
04-16-2003 10:09 AM ET (US)
Hi Danny et al,

I don't really have much to add to what you say.
We found the ThreadsML spec[1] pretty attractive for balancing usability & complexity.
OTOH, we felt that the 'fit' between the ontology and our use case was a bit ungainly.

In particular, we want to connect blog entries, items that the entries are about, and annotations on those entries.
Forcing all these entities to fit the concept of a 'Post' (especially in the sense of posts belonging to a well defined topic container) feels a little unnatural.

Our current inclination is to use ThreadsML as one of a number of pluggable options.
We'll certainly follow developments on this group with interest, and we're happy to share thoughts/ontologies as and when they become available
Cheers

Steve


< replied-to message removed by QT >
Marc Canter  31
04-16-2003 11:54 AM ET (US)
are there any standards (or a proposal) for comments? As a data structure?
Steve's needs seem clear (and backed by many other requests)

:How can we extend the concept of blog comments?

At a minimum:
 - store comments with associated blog posts
 - notify me when someone leaves a comment

And perhaps:
 - tie comments into a more general message board system
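
One way to picture the minimum above, as a hedged Python sketch: the class and field names are invented for illustration and are not part of ThreadsML or any existing comment standard. Comments are stored against the URL of the post they belong to, and anyone subscribed to that post gets a callback when a comment arrives:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Comment:
    author: str
    body: str


@dataclass
class CommentStore:
    """Hypothetical store: comments keyed by the URL of their blog post."""
    comments: Dict[str, List[Comment]] = field(default_factory=dict)
    subscribers: Dict[str, List[Callable[[str, Comment], None]]] = field(
        default_factory=dict
    )

    def subscribe(self, post_url: str, callback: Callable[[str, Comment], None]) -> None:
        """Register a callback to run when post_url receives a comment."""
        self.subscribers.setdefault(post_url, []).append(callback)

    def add(self, post_url: str, comment: Comment) -> None:
        """Store the comment with its post, then notify that post's subscribers."""
        self.comments.setdefault(post_url, []).append(comment)
        for callback in self.subscribers.get(post_url, []):
            callback(post_url, comment)
```

Tying this into a more general message board system would mostly mean agreeing on the shape of `Comment` across tools, which is the interchange question this thread keeps returning to.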

These simple needs should be covered by ThreadsML. I know there's a difference between a protocol and data structure and...

....WHAT YOU DO WITH IT.....

but perhaps we can break a few rules and actually put our minds into the needs of the end-users and say.....

YES - this is ONE of the things ThreadsML is all about.

:-)

You say tomato (RDF) - I say toMAHto (ENT/RSS) but it's all the same outcome.

:-)

- Marc


>
< replied-to message removed by QT >
Marc Canter  32
04-27-2003 04:57 AM ET (US)
Here's my post on Ben's session. It includes notes of what people think ThreadsML should do.
Shelley (Burningbird) P  33
04-27-2003 10:13 PM ET (US)
My my, look at what I stumbled on. Hope all don't mind butting in.

Gentlemen from all this activity, can I deduce that you're trying to decide whether to incorporate ThreadsML capability into RSS 2.0, or into RSS 1.0?

Personally, I think you'd have better luck going with ENT, as long as you can get RSS 2.0 pulled into a separate specification body. The use of RDF within RSS has always been overkill -- RSS by its nature is transitory, and RDF is the fundamental data model meant for building ontologies which persist. Ben knows wherefore I'm coming from with this. There is no true semantics with RSS -- it's a brain dead data model.

The ENT _vocabulary_ (this is not a model), seems simple, clean, easy to implement if you can get all the publication tools to generate the vocabulary items, and -- big and here -- the people who write the content to categorize and do all the work necessary to make this work. Folks have had trouble with Trackback, and that's also a dead simple concept. The concept behind ThreadsML -- or recording threaded communications -- is more complex.

Then you have the issue of storage, which was always the problem I had with ThreadNeedle -- I wanted something that was non-centralized. Your approach requires centralization, unless you want to restrict discovery of conversations by happening on to one node in the thread, such as we have with Trackback now. Even with bots, the bots have got to return home to some mama. You'd have to have someone willing to put up about a terabyte database to do this properly. Marc, I saw your goals -- man, even a terabyte won't cover it.

Remember that every time someone starts a conversation, you have to record it just in case someone else continues it. (Unless, again, you don't care except for certain conversations.) For instance, I have people trackbacking to items I created 5 months ago. That is a lot of data. And you have to persist it, because the RSS (RDF or not) file is not persistent. You could embed this XML vocabulary into an HTML document, but we all know what a pain that is. However, this would persist the info. But this might allow you to store just the first node in the thread, and save space. Lots of work, though.

Even with the use of bots, you're still centralizing this. Correct? I don't quite see here, there is a lot of side threads, how you all plan on handling the persistent data issue. Sorry if this is clear to you all. I know I'm a self-invitee late comer.

(As an aside on the book, and thanks to whoever mentioned this (Danny? Ben?), I covered over 60 APIs and what not focused on RDF -- the technology is there, we just haven't been promoting it. My bad, too, on this. What I hope comes out from the book is decent ontologies from many different domains; we have the APIs now. RDF doesn't need another inappropriate use to live down.)
Marc Canter  34
04-28-2003 12:34 AM ET (US)
wow - intelligent, in-depth analysis and criticism. Compared to the last time I saw your name - this is really refreshing!


>
> Gentlemen from all this activity, can I deduce that you're
> trying to decide whether to incorporate ThreadsML capability
> into RSS 2.0, or into RSS 1.0?


I'd say (idealistically) wouldn't it be cool to implement ThreadsML in both?


>
> Personally, I think you'd have better luck going with ENT, as
> long as you can get RSS 2.0 pulled into a separate specification
> body.

Do you mean 'technical' body or some sort of 'standards' body or a business entity?


> The use of RDF within RSS has always been overkill -- RSS
> by its nature is transitory, and RDF is the fundamental data
> model meant for building ontologies which persist. Ben knows
> wherefore I'm coming from with this. There is no true semantics
> with RSS -- its a brain dead data model..

Agreed - but you should have seen Jo Walsh's presentation at ETCON. There are LOTS of people using RDF - so vive la différence, god bless them, more power to them. There is PLENTY of functionality and flexibility in RDF to implement a ThreadsML type interop. But it also makes sense to use ENT as well. No reason not to make sure we float something that works with both.


>
> The ENT _vocabulary_ (this is not a model), seems simple, clean,
> easy to implement if you can get all the publication tools to
> generate the vocabulary items, and -- big and here -- the people
> who write the content to categorize and do all the work
> necessary to make this work. Folks have had trouble with
> Trackback, and that's also a dead simple concept. The concept
> behind ThreadsML -- or recording threaded communications -- is
> more complex.


Yes, and I think it's important to keep in mind that ThreadsML is not Trackback. We have the ability to bake this into a new generation of tools and completely hide all the technical mumbo jumbo. I still don't really understand Trackback - you know why? I don't use MT. And because of that - I don't have direct feedback on its functionality.

One thing we have to consider and conspire on - is how we'd introduce this to the world. Perhaps it's actually more than one name - more than one way that its benefits are received by the end-users. The same interop/protocol/data structure/service - can be utilized in many ways.

>
> Then you have the issues of storage, which was always the
> problems I had with ThreadNeedle -- I wanted something that was
> non-centralized. Your approach requires centralization, unless
> you want to restrict discovery of conversations by happening on
> to one node in the thread, such as we have with Trackback now.
> Even with bots, the bots have got to return home to some mama.
> You'd have to have someone willing to put up about a terabyte
> database to do this properly. Marc, I saw your goals -- man,
> even a terabyte won't cover it.


More like 20 Terabytes - just to start. Everywhere I look I see billionaire yuppies starting moon projects, buying houses, making charitable donations, etc. What would be more beneficial and charitable than to sponsor an open conversation server?

I figure between the Internet Archive, Google, Sun, Oracle, Macromedia, Adobe, HP, IBM, Microsoft and [insert here] we should be able to get as much storage and bandwidth as we need. Oh yeah - Sony, Philips, NEC, Fujitsu, Nokia, Samsung, etc.

And I think there's some sweet spot between centralized servers and LOTS of them - storing different KINDS and TYPES of conversations - based upon geography, scope, constituency, government, religious, commercial or education affiliation. In fact - it's a balance between decentralized and a central DNS-like interchange/registry - that I think will be required.

>
> Remember, that every time someone starts a conversation, you
> have to record it just in case someone else continues it.

HHmmmm - don't think we'll tackle audio - just right yet. But I DO think we can do this with that old fashioned stuff called text. And structured text and data structures - as well.


> (Unless you again, don't care except for certain conversations.)


The idea is - that new kinds of tools would enable these new kinds of conversations. All the old message boards, IM clients, email clients, blogs and text editors will remain "old school".

So I imagine that this sort of spooling, archiving, linking, hyperlocking, whatever that entails - will be on a case by case basis. But YES - in certain situations (school lectures, business meetings, flirting scenarios, affinity groups) I do see EVERY conversation being turned into a persistent, re-entrant file/structure.


> For instance, I have people trackbacking to items I created 5
> months ago. That is a lot of data. And you have to persist it,
> because the RSS (RDF or not) file is not persistent.


One good thing would be to get rid of redundancy and save only the diffs. Here's an example.

Joi Ito started playing with his RSS feed after he read Ben's book [thanks Ben.] He created an alternative feed - which includes both the comments and trackbacks in it. So every time someone leaves a comment or adds a trackback to one of Joi's posts - you get an entirely new copy of the whole thing - the post, ALL the comments and ALL the trackbacks.

This is obviously what we DON'T want.
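
A minimal sketch of the "save only the diffs" idea, assuming every post, comment and trackback carries a stable `id` (an assumption of this example, not something Joi's feed is known to provide). The function name `new_items` is made up for illustration.

```python
from typing import Dict, Iterable, List, Set


def new_items(thread: Iterable[Dict[str, str]], seen_ids: Set[str]) -> List[Dict[str, str]]:
    """Return only the items whose 'id' has not been delivered before."""
    fresh = [item for item in thread if item["id"] not in seen_ids]
    # Remember what was just delivered so the next call skips it.
    seen_ids.update(item["id"] for item in fresh)
    return fresh
```

With something like this, a new comment would produce one new entry for the reader instead of a fresh copy of the post plus every earlier comment and trackback.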

If you read this post:

http://blogs.it/0100198/2003/04/27.html#a989

...you'll see that what people MOST want is interop - to flow the threads between different tools, different platforms/machines or different usage scenarios. This is totally do-able with both RDF and RSS2.0/ENT.
> You could
> embed this XML vocabulary into an HTML document, but we all know
> what a pain that is. However, this would persist the info. But
> this might allow you to store just the first node in the thread,
> and save space. Lots of work, though.

I'm hoping to keep HTML as a display format - and stay away from it like the plague. Details like this will flow - when we can agree upon a range of scope and implementation spec.

>
> Even with the use of bots, you're still centralizing this.
> Correct? I don't quite see here, there is a lot of side threads,
> how you all plan on handling the persistent data issue. Sorry if
> this is clear to you all. I know I'm a self-invitee late comer.


I need to spend some time matrixing all the feature requests we got at ETCON - but I'll pick up on the work we've done already - on what we call Multimedia Conversations:

a) a conversation - thread of thought - starts as an email interchange, IM session, chat session or message board thread. That thread gets "converted" or input into a tool that supports threadsML. This is the point when the thread (which can be a single 'post' or entire sequence of posts) gets stored on a server. That may be an open, public server - or as locked up as possible. The technology should be security scheme agnostic.

b) at this point - many things can happen.
 - The thread can continue onto or shall I say 'into' many other tools, systems or environments.
 - The thread can be transformed into visualizations, media sequences, interactive interfaces - all sorts of new wacky stuff

c) at any point - a conversation should be re-entrant - so that anybody (or in a more controlled manner) can jump into 'the middle' of a thread - instead of always appending to the end.

d) Threads/Conversations should be sortable/searchable by name, topic, chronology, size, media type associated with it, etc.

e) These conversations would be stored on shared servers (ala TopicExchange or blaxm!.) A distributed model could be used to mirror or proxy the conversations throughout the WWW.

http://blogs.it/0100198/stories/2003/01/20...aConversations.html
NOTE: Many of the usage scenarios that people requested had to do with data interchange between different tools or systems. So putting the thread into some standard form - seems to me to be half the battle.
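
To make points c) and d) a bit more concrete, here is a small sketch of a re-entrant, sortable thread. All class and field names (`Post`, `Thread`, `parent_id`, and so on) are hypothetical; nothing here comes from an agreed spec.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Post:
    post_id: str
    author: str
    timestamp: int  # e.g. seconds since the epoch
    body: str
    parent_id: Optional[str] = None  # None for the post that starts the thread


@dataclass
class Thread:
    topic: str
    posts: Dict[str, Post] = field(default_factory=dict)

    def add(self, post: Post) -> None:
        # Re-entrant: the parent may be ANY existing post, not just the latest.
        if post.parent_id is not None and post.parent_id not in self.posts:
            raise KeyError(f"unknown parent {post.parent_id!r}")
        self.posts[post.post_id] = post

    def replies_to(self, post_id: str) -> List[Post]:
        # Direct replies to one post, oldest first.
        return sorted(
            (p for p in self.posts.values() if p.parent_id == post_id),
            key=lambda p: p.timestamp,
        )

    def chronological(self) -> List[Post]:
        # The whole thread sorted by time, ignoring the reply structure.
        return sorted(self.posts.values(), key=lambda p: p.timestamp)
```

Sorting by name, size or media type would just be other `key` functions over the same structure.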


>
>
> (As an aside on the book, and thanks to whomever mentioned this
> (Danny? Ben?), I covered over 60 APIs and what not focused on
> RDF -- the technology is there, we just haven't been promoting
> me. My bad, too, on this. What I hope comes out from the book is
> decent ontologies from many different domains, we have the APIs
> now. RDF doesn't need another inappropriate use to live down.)
Steve YostPerson was signed in when posted  35
04-28-2003 06:25 AM ET (US)
I'd like to take a stab at an ENT implementation, just to provide myself something concrete to work with, and to have an idea of what it would be to support both (for myself and others).

You've probably all seen the current placeholder RSS for QuickTopic:
http://www.quicktopic.com/em/H/mXbfHC2srY3.rss. I worked from an example provided by Aaron Swartz to quickly implement that.

Could someone provide an ENT example of this thread? Just the first few messages, even.
Steve Yost  36
04-28-2003 08:44 AM ET (US)
I should start this post by saying that I've just read the ENT and RSS2.0 docs in the last hour, and I think I get just about all of it. It's good that they're this easy to understand.
[For easy reference: http://www.purl.org/NET/ENT/1.0/ and
http://backend.userland.com/rss]

I see now that asking for an ENT implementation example of ThreadsML is premature, and probably just off base. That's because ENT itself, being the addition of topic-map data to RSS2.0, doesn't include all the kinds of data that ThreadsML needs. In particular, we need the inter-message linking and full-content that are covered here:
http://www.quicktopic.com/7/H/rhSrjkWgjnvRq/m66
So Marc, as I understand it, you're suggesting not that ENT handle ThreadsML, but rather that ThreadsML be implemented as an RSS2.0 extension like ENT is.

Since the current proposal for ThreadsML is a collection of RSS1.0 modules and specification on how to use them in that context, they can easily be used in the RSS2.0 context as well (right?). That makes it easy for the app that exports ThreadsML to support both.
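
For a feel of what "an RSS2.0 extension like ENT" could mean in practice, here is a sketch that writes an RSS 2.0 feed with one extra namespaced element per item for the inter-message link. The namespace URI and the `inReplyTo` element name are invented for this example; the actual proposal defines its own modules.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace, used only in this sketch.
TML_NS = "http://example.org/threadsml-sketch#"
ET.register_namespace("tml", TML_NS)


def build_feed(title, link, messages):
    """messages: list of dicts with 'guid', 'title', 'text', optional 'reply_to'."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Thread export (sketch)"
    for msg in messages:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "guid").text = msg["guid"]
        ET.SubElement(item, "title").text = msg["title"]
        ET.SubElement(item, "description").text = msg["text"]  # full content
        if msg.get("reply_to"):
            # Inter-message link: the guid of the message this one answers.
            ET.SubElement(item, f"{{{TML_NS}}}inReplyTo").text = msg["reply_to"]
    return ET.tostring(rss, encoding="unicode")
```

A reader that doesn't know the extra namespace can still treat the result as a plain RSS 2.0 feed and ignore the threading element.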

I like RSS2.0's simplicity and directness, which work because it's clearly targeted at specific applications. It seems like a slight stretch to apply this to discussion threads -- take for instance the <comments> element, which would be deprecated in this usage (similarly, maybe the ENT spec should suggest a usage for RSS2.0's <category> element when using the ENT extension). But again, I'm for the pragmatic approach: ease and likelihood of implementation.

This does seem like a good time to take a stab at both for QuickTopic.
Regarding terabytes of storage, etc: it's worth pointing out that central storage isn't necessary, given that many threads are stored in a "permanently" accessible place. The proposed spec deals with linkage from a portion of a thread stored somewhere (say a web-based email archive) to somewhere else like QuickTopic. That's not to say a vast repository wouldn't be useful, at least for Google-like caching or Internet-Archive-like archiving.

Speaking of Google, there's the What Will Google Do question. With respect to ThreadsML, it would be great if they and other search engines supported thread traversal (http://www.quicktopic.com/7/H/rhSrjkWgjnvRq/m103). Regarding ENT and Google, it would be great if Google supported that too. As I mentioned on my blog in February (http://www.quicktopic.com/blog/archives/000219.html#000219 see 'Blog topics'), I'm intrigued by such a prospect.
Steve YostPerson was signed in when posted  37
04-28-2003 10:32 AM ET (US)
Clarification: when I said
> This does seem like a good time to take a stab at both for QuickTopic.
I meant that I should try implementing the current proposal for ThreadsML for QuickTopic as both RSS1.0 and RSS2.0 extensions. Maybe someone could write/modify an app to import/consume one or both of these. This would help flush out any issues with the current proposal.
Marc Canter  38
04-28-2003 11:42 AM ET (US)
> Clarification: when I said
> > This does seem like a good time to take a stab at both for
> QuickTopic.
> I meant that I should try implementing the current proposal for
> ThreadsML for QuickTopic as both RSS1.0 and RSS2.0 extensions.
> Maybe someone could write/modify an app to import/consume one or
> both of these. This would help flush out any issues with the
> current proposal.


Cool dude - go for it - I'll make sure either Matt or Paolo is supporting you (if you need it.) eVectors (Paolo's company) has an ENT-aware reader/aggregator called K-Collector. It should be able to read your ENT feeds.
http://blogs.it/0100198/2003/04/27.html#a992

What you need to do is write the extensions/changes/additional features required for ThreadsML and hand that code to Matt and Paolo. They'll bake it into the two basic apps they've got:
 - LiveTopics
 - K-Collector

then we get that code to Phil Pearson - and his TopicExchange - and we'll have three ThreadsML aware apps! Meanwhile I really DO need to sit down and matrix out the requests - made last week - and act like a marketing guy and make sure it all works "from the user's point of view".

BTW, this is exactly what the folks at the SSA BoaF were insisting - that the engineers listen to the humans. So regard me as their advocate (well I guess Dr. Weinberger gets that title - more than me.) After all - he DOES have the domain name.

:-)
Marc Canter  39
04-28-2003 11:42 AM ET (US)
> I should start this post by saying that I've just read the ENT
> and RSS2.0 docs in the last hour, and I think I get just about
> all of it. It's good that they're this easy to understand.
> [For easy reference: http://www.purl.org/NET/ENT/1.0/ and
> http://backend.userland.com/rss]

Coolio

>
> I see now that asking for an ENT implementation example of
> ThreadsML is premature, and probably just off base. That's
> because ENT itself, being the addition of topic-map data to
> RSS2.0, doesn't include all the kinds of data that ThreadsML
> needs. In particular, we need the inter-message linking and
> full-content that are covered here:
> http://www.quicktopic.com/7/H/rhSrjkWgjnvRq/m66
> So Marc, as I understand it, you're suggesting not that ENT
> handle ThreadsML, but rather that ThreadsML be implemented as an
> RSS2.0 extension like ENT is.

Bingo - nail on the head - exact-de-mon

>
> Since the current proposal for ThreadsML is a collection of
> RSS1.0 modules and specification on how to use them in that
> context, they can easily be used in the RSS2.0 context as well
> (right?). That makes it easy for the app that exports ThreadsML
> to support both.

Bingo again - man you're matching all the #'s this morning!

>
> I like RSS2.0's simplicity and directness, which work because
> it's clearly targeted at specific applications.

I guess that's why it's called EASY News Topics!


> It seems like a
> slight stretch to apply this to discussion threads -- take for
> instance the <comments> element, which would be deprecated in
> this usage (similarly, maybe the ENT spec should suggest a usage
> for RSS2.0's <category> element when using the ENT extension).
> But again, I'm for the pragmatic approach: ease and likelihood
> of implementation.

Well dude - I know things are moving fast now - but believe it or not there MAY be some other usage scenarios and features that you MAY not have thought of - or have currently implemented. :-) So give me some time (like till next week - I have to go to Chicago this week) and we'll make sure all those requests made last week during Ben's session - can get implemented as well. [That's what I call 'matrixing' the features......]

Taking your current spec and "implementing" it with RSS2.0/ENT - is a great first step.

Who knows - maybe by then, we'll convince Dave Winer to help.

:-)

>
> This does seem like a good time to take a stab at both for
> QuickTopic.

Cool - as I said in the other post - K-Collector is a new kind of aggregator, LiveTopics brings topics to Radio and could be thought of as an ENT generator and TopicExchange is a - well I guess you'd call it a shared repository.

So we've got the makings of some cool interchange. This is exactly what Matt and Paolo anticipated. They've been doing the 'mundane' stuff first, but they hoped that cool and new things (like ThreadsML) could ALSO leverage ENT.

And you're doing it!

One thing to take into account - is the connection that we're making RIGHT NOW with this mail list splattering itself into QT. Make SURE that kind of functionality gets implemented FIRST!

:-)


> Regarding terabytes of storage, etc: it's worth pointing out
> that central storage isn't necessary, given that many threads
> are stored in a
> "permanently" accessible place. The proposed spec deals with
> linkage from a portion of a thread stored somewhere (say a
> web-based email archive) to somewhere else like QuickTopic.


Amen brother. We can eventually bake this sort of functionality into smart "virtual storage" systems (object stores) which will take care of all the moving around, decentralized, distributed nature kind of stuff - that should sit ON TOP of the ThreadsML interop.

But without the basic building blocks of flowing a thread between disparate tools, systems and environments - we can never have a virtual file storage - so GO FOR IT!

We're changing the world here :-)

Thanks to Ben again.....



> That's not to say a vast repository wouldn't be useful, at least
> for Google-like caching or Internet-Archive-like archiving.

Or many, many vast repositories!


> Speaking of Google, there's the What Will Google Do question.


That's why Joi has been making sure to befriend Evan - as we knew that when the Neoteny investment was announced, they're more or less raising the bar and taking the Blogosphere to a new level. So communication between vendors will be even MORE important.

Joi approached Winer too - BTW.


> With respect to ThreadsML, it would be great if they and other
> search engines supported thread traversal
> (http://www.quicktopic.com/7/H/rhSrjkWgjnvRq/m103). Regarding
> ENT and Google, it would be great if Google supported that too.
> As I mentioned on my blog in February
> (http://www.quicktopic.com/blog/archives/000219.html#000219 see
> 'Blog topics'), I'm intrigued by such a prospect.

:-)


>
Shelley (Burningbird) P  40
04-28-2003 11:45 AM ET (US)
First, clarification -- when I say conversation, I meant among weblogs, quick topic, wikis, etc. Not audio.

I'd say before you think about adding yet more data to RSS files that get consumed by a plethora of bots on a minute by minute basis, you think of a interim architecture for a prototype that you all think you can create without having to get Google and Sony involved.

Steve, you mention that centralized data store isn't necessary, because many of the threads are permanently accessible -- then how does one find them? Can you walk through an implementation scenario for say, a multi-post weblog topic with comments, perhaps throwing in QuickTopic and a yahoo newsgroup item.

Let's be real about this -- use specific tools. What would the tools need to do, and what would the people need to do. In your spec, you say 'export the thread' -- what does this mean? Exactly what would be the physical implementation of same?
Marc Canter  41
04-28-2003 12:21 PM ET (US)
You're being so damm pragmatic and specific - I love it!


> First, clarification -- when I say conversation, I meant among
> weblogs, quick topic, wikis, etc. Not audio.

Ok - so going back to your point - the issue is that as 10's of millions of conversations are flowing, and a lot of them (not necessarily ALL of them) will be stored - whether on a centralized or distributed (or both) series of repositories, THEN we gotta worry about RIGHT NOW the issue of......
I guess that's the one time (taking the dumb network, world of ends POV) that centralized servers come in handy - to navigate, discover and interconnect disparate conversations together. LOTS of rules, restrictions, constitutions, security layers, PETs and encryption realities will have to be instituted to pull this off - but first things first - let's get the base interop working.

>
> I'd say before you think about adding yet more data to RSS files
> that get consumed by a plethora of bots on a minute by minute
> basis, you think of a interim architecture for a prototype that
> you all think you can create without having to get Google and
> Sony involved.

Well I'd say that ENT (though not in flux) certainly isn't 'done' yet. It can embed topics fine and help flow blog posts into new kinds of aggregators - but I'd hate to see it stop at that. So as far as 'prototyping' goes, Steve is gonna take a stab at implementing a 'ThreadsML'-like approach with ENT and we'll see what happens after that.
When it comes to the meta-data, internal bits, data that Google is gonna scarf - I think we can spend some time (like a week or two) thinking about that issue as well.

Anything - in particular - we should "watch out for?"


>
> Steve, you mention that centralized data store isn't necessary,
> because many of the threads are permanently accessible -- then
> how does one find them? Can you walk through an implementation
> scenario for say, a multi-post weblog topic with comments,
> perhaps throwing in QuickTopic and a yahoo newsgroup item.

:-) This is fun!

>
> Let's be real about this -- use specific tools. What would the
> tools need to do, and what would the people need to do. In your
> spec, you say 'export the thread' -- what does this mean?
> Exactly what would be the physical implementation of same?


I vote for:

 - a DNS-like open repository of thread "pointers"/topics/meta-data - which enables folks to not only track the threads - but also discover them in the first place

 - a data structure that specifies how a "generic conversation" would look (OPML based?) - so whether it's a weblog post, comment on a weblog post, email excerpt, IM or chat session, or a particular piece of structured text (coming out of my new-fangled tool) - there's a common format where all this stuff gets converted into or out of.

 - clear usage scenario schemas. In other words - a) in the case of a single topic being converted into a message board item, this is the blah blah blah - while b) threaded IM or chat sessions should convert into blah blah blah to do blah blah blah, while on the other hand, c) doing remote posting (via IM or email) to a message board, would use the blah blah blah methodology......

 - IOW all centered around a common format, but using different kinds of schemas and APIs.
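
A toy version of the first bullet, just to show the shape of the idea: the registry keeps pointers (URLs) to threads filed under a topic and never stores the threads themselves. The class name `ThreadRegistry` and its methods are made up for this sketch, and a real DNS-like system would obviously need distribution, security and much more.

```python
from collections import defaultdict
from typing import Dict, List


class ThreadRegistry:
    """Stores pointers (URLs) to threads, never the threads themselves."""

    def __init__(self) -> None:
        self._by_topic: Dict[str, List[str]] = defaultdict(list)

    def register(self, topic: str, thread_url: str) -> None:
        # Normalise the topic so "ThreadsML" and " threadsml " match.
        key = topic.strip().lower()
        if thread_url not in self._by_topic[key]:
            self._by_topic[key].append(thread_url)

    def discover(self, topic: str) -> List[str]:
        # Return a copy of the pointers filed under this topic.
        return list(self._by_topic.get(topic.strip().lower(), []))
```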


BTW Steve - bug in QT - when I click on the button "25-35" to see Shelley's earlier point - that link is broken.

- Marc
David Weinberger  42
04-28-2003 12:50 PM ET (US)
I'm excited to see this thread picked up again, so to speak. Thanks to y'all (and a big hug to Marc for pushing this forward so enthusiastically).

See embedded comment.

-- David W.

> I vote for:
>
> - a DNS-like open repository of thread
> "pointers"/topics/meta-data - which enables folks to not only
> track the threads - but also discover them in the first place

 
> - a data structure that specifies how a "generic
> conversation" would look like (OPML based?) - so whether it's
> a weblog post, comment on a weblog post, email excerpt, IM or
> chat session, or a particular piece of structured text
> (coming out of my new-fangled tool) - there's a common format
> where all this stuff gets converted into or out of.

ThreadsML started out to be this data structure. With it, one could decide to build a DNS-like open repository. But even if no one decides to do that, threadsML would still accomplish (what I take to be) its primary objective: enabling two compliant apps (say, QuickTopic and Topica, both of which have expressed an interest in supporting the standard) to import/export threads. For example, each would add a button to enable a user to export the current thread (or perhaps the entire discussion board) in threadsML and a button to import a discussion that'd been saved in threadsML.
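
A bare-bones sketch of what those two buttons might do behind the scenes, assuming a thread is just a list of posts with an id, an author, a body and an optional parent. The element and attribute names are placeholders for illustration, not the ThreadsML vocabulary.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def export_thread(posts: List[Dict[str, str]]) -> str:
    """Serialise a thread to XML (hypothetical element names throughout)."""
    root = ET.Element("thread")
    for post in posts:
        el = ET.SubElement(root, "post", {"id": post["id"], "author": post["author"]})
        if post.get("parent"):
            el.set("parent", post["parent"])  # link to the message replied to
        el.text = post["body"]
    return ET.tostring(root, encoding="unicode")


def import_thread(xml_text: str) -> List[Dict[str, str]]:
    """Rebuild the list of posts from XML written by export_thread."""
    posts = []
    for el in ET.fromstring(xml_text).findall("post"):
        post = {"id": el.get("id"), "author": el.get("author"), "body": el.text or ""}
        if el.get("parent"):
            post["parent"] = el.get("parent")
        posts.append(post)
    return posts
```

If one app calls `export_thread` and the other calls `import_thread` on the result, both end up with the same posts and the same reply links, which is the whole point of the interchange.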

I point out this obviousness because I don't want to tie threadsML to the creation of a directory of threads. Much as I'd love the latter, it's a much bigger and riskier project than coming up with a threads interchange standard.

If I've got this wrong, I'm sure you'll straighten me out...

 
> - clear usage scenario schemas. In other words - a) in
> the case of a single topic being converted into a message
> board item, this is the blah blah blah - while b) threaded IM
> or chat sessions should convert into blah blah blah to do
> blah blah blah, while on the other hand, c) doing remote
> posting (via IM or email) to a message board, would use the
> blah blah blah methodology......
>
> - IOW all centered around a common format, but using
> different kinds of schemas and APIs.
>
>
> BTW Steve - bug in QT - when I click on the button "25-35"
> to see Shelley's earlier point - that link is broken.
>
> - Marc
Marc Canter  43
04-28-2003 01:04 PM ET (US)
> I'm excited to see this thread picked up again, so to speak.
> Thanks to y'all (and a big hug to Marc for pushing this forward
> so enthusiastically).

What I like is the balance of expertise and naivete - being contributed. Me - I'm just the marketing guy. But we're asking all the right questions right now!

>
> See embedded comment.
>
> -- David W.
>
> > I vote for:
> >
> > - a DNS-like open repository of thread
> > "pointers"/topics/meta-data - which enables folks to not only
> > track the threads - but also discover them in the first place
>
>
> > - a data structure that specifies how a "generic
> > conversation" would look like (OPML based?) - so whether it's
> > a weblog post, comment on a weblog post, email excerpt, IM or
> > chat session, or a particular piece of structured text
> > (coming out of my new-fangled tool) - there's a common format
> > where all this stuff gets converted into or out of.
>
> ThreadsML started out to be this data structure. With it, one
> could decide to build a DNS-like open repository. But even if no
> one decides to do that, threadsML would still accomplish (what I
> take to be) its primary objective: enabling two compliant apps
> (say, QuickTopic and Topica, both of which have expressed an
> interest in supporting the standard) to import/export threads.
> For example, each would add a button to enable a user to export
> the current thread (or perhaps the entire discussion board) in
> threadsML and a button to import a discussion that'd been saved
> in threadsML.
>
> I point out this obviousness because I don't want to tie
> threadsML to the creation of a directory of threads. Much as I'd
> love the latter, it's a much bigger and riskier project than
> coming up with a threads interchange standard.
>
> If I've got this wrong, I'm sure you'll straighten me out...


No - you're right on! I secretly hope that once Nic Nyholm is done baking a DNS-like system for identity for us, we'll be able to repurpose that for ThreadsML. In the mean time - you're right, we can't afford to wait for that.......

And I think we should approach Prospero - as they have a critical mass of message boards - and with one win, could get A WHOLE LOTTA folks tied in.
Shelley Powers  44
04-28-2003 01:14 PM ET (US)
> You're being so damn pragmatic and specific - I love it!
>
>
> > First, clarification -- when I say conversation, I meant among
> > weblogs, quick topic, wikis, etc. Not audio.
>
> Ok - so going back to your point - the issue is that as 10's of
> millions of conversations are flowing, and a lot of them (not
> necessarily ALL of them) will be stored - whether on a
> centralized or distributed (or both) series of repositories,
> THEN we gotta worry about RIGHT NOW the issue of......
> I guess that's the one time (taking the dumb network, world of
> ends POV) that centralized servers come in handy - to navigate,
> discover and interconnect disparate conversations together.
> LOTS of rules, restrictions, constitutions, security layers,
> PETs and encryption realities will have to be instituted to pull
> this off - but first things first - let's get the base interop
> working.
>

So if I understand this then, your interest at this time is just to define the elements within the XML vocabulary so that one could export a file with the XML for this, and import it. In other words, from this list, we would have thirty something knots in the thread, one for each message, and exporting this file would show these knots. Someone else could import this into something, and then do what, add more new knots?

That's what you're all looking for as a first implementation -- exporting and importing the XML? And then later you'll bake a DNS-like repository or system for these threads. Are those the goals right now?

This would mean, then, that there is no discovery of a conversation, which defeats the purpose of all this, doesn't it?

However, if the conversation is about embedding this as new info in RSS, which doesn't persist, why would you want to do this without some form of data persistence incorporated?


> >
> > I'd say before you think about adding yet more data to RSS files
> > that get consumed by a plethora of bots on a minute by minute
> > basis, you think of an interim architecture for a prototype that
> > you all think you can create without having to get Google and
> > Sony involved.
>
> Well I'd say that ENT (though not in flux) certainly isn't
> 'done' yet. It can embed topics fine and help flow blog posts
> into new kinds of aggregators - but I'd hate to see it stop at
> that. So as far as 'prototyping' goes, Steve is gonna take a
> stab at implementing a 'ThreadsML'-like approach with ENT and
> we'll see what happens after that.
> When it comes to the meta-data, internal bits, data that Google
> is gonna scarf - I think we can spend some time (like a week or
> two) thinking about that issue as well.
>

Try this once again, a bit more slowly please. What exactly do you mean by 'flow blog posts into new kinds of aggregators'? An aggregator is nothing more than a way of showing the RSS information in human format. These don't persist the data either. Were you thinking of something like conversation aggregators that show that a discussion about a certain topic is happening, but without concern about search on this and without persisting it?

Shelley
Danny Ayers  45
04-28-2003 01:22 PM ET (US)
[sorry if you get this twice, but my last couple of posts seem to have evaporated]

> Steve, you mention that centralized data store isn't necessary,
> because many of the threads are permanently accessible -- then
> how does one find them? Can you walk through an implementation
> scenario for say, a multi-post weblog topic with comments,
> perhaps throwing in QuickTopic and a yahoo newsgroup item.

May I?

A starter, anyway:

Person A makes a statement, which is recorded as a blog item. For convenience I'll label the URI of this statement A.

Person B comments on this item, directly on the site. This is recorded as B inReplyTo A in the RDF store maintained on A's site (in practice, right now, this would probably be another table in their Movable Type database installation dedicated to thread tracking).

Person C comments on A, on their own blog. When this is recorded a trackback ping is sent to A. A records this as a regular trackback AND as C inReplyTo A in the extra table.

This is the important bit : if C is threads-enabled, then C records the thread too, as C inReplyTo A. i.e. the connection is recorded at both ends.

This recording-at-both-ends means that the whole thread can be reconstructed from any point in the thread. Ok, if there are leaves that are 'threadbare' then they might be ignored, but existing trackback/link-following may well help here.
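
A minimal sketch of the recording-at-both-ends idea, under some loud assumptions: each site's store is modelled as a plain list of (reply, parent) pairs, the site names and the helper names `build_index` and `reconstruct_thread` are hypothetical, and the per-site stores are simply merged in memory rather than fetched over the web. It only illustrates why a link held at either end is enough to walk the whole thread from any item.

```python
from collections import defaultdict, deque


def build_index(stores):
    """Merge per-site (reply, parent) records into one undirected adjacency map."""
    neighbours = defaultdict(set)
    for records in stores.values():
        for reply, parent in records:
            # Record the link in both directions so it can be followed from either end.
            neighbours[reply].add(parent)
            neighbours[parent].add(reply)
    return neighbours


def reconstruct_thread(start, neighbours):
    """Breadth-first walk from any item; returns every item reachable in the thread."""
    seen = {start}
    queue = deque([start])
    while queue:
        item = queue.popleft()
        for other in neighbours[item]:
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


# Hypothetical stores: site A holds both replies, site C holds its own copy of C inReplyTo A.
stores = {
    "siteA": [("B", "A"), ("C", "A")],
    "siteC": [("C", "A")],
}
index = build_index(stores)
thread_from_c = reconstruct_thread("C", index)
thread_from_b = reconstruct_thread("B", index)
```

Starting from C or from B gives the same set of items, which is the property the paragraph above relies on.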

I think there are only a few tweaks needed to get a basic system working from MT as it stands, though it would be cool if the blog-entry form had some extra checkboxes that corresponded to the semantics available in ThreadsML (is there 'emphaticallyDisagree'? - whatever). This would almost certainly call for an extra MySQL table behind the scenes.

I think the natural interconnectivity of the web might be enough in itself, but you could also ping a centralised server with the RDF snippet corresponding to the join in the thread too.

Spidering/scraping email archives would produce a lot of threads thanks to the mail's inReplyTo, but this could be augmented if you could persuade people to add a special tag in their post somewhere e.g. thread:IAgree could be picked up by a spider and a triple generated:

thisMail IAgree previousMail

if your clients are really willing then you could refer to links in email in a scrapable fashion, e.g.

Regarding this post,
thread:IAgree http://dannyayers.com/archives/001231.html
I usually agree with myself.
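
A rough sketch of how a spider might pick up such a tag, assuming the tag sits on a line of its own in the form `thread:Relation URL`. The regular expression, the function name `extract_triples` and the `mid:` message URI are illustrative assumptions, not part of any ThreadsML specification.

```python
import re

# One tag per line: "thread:" + relation name, whitespace, then the target URI.
TAG_PATTERN = re.compile(r"^thread:(\w+)[ \t]+(\S+)[ \t]*$", re.MULTILINE)


def extract_triples(mail_uri, body):
    """Return (subject, relation, object) triples for every thread tag in a mail body."""
    return [(mail_uri, relation, target) for relation, target in TAG_PATTERN.findall(body)]


body = """Regarding this post,
thread:IAgree http://dannyayers.com/archives/001231.html
I usually agree with myself.
"""

# "mid:example@example.org" stands in for whatever URI identifies this mail.
triples = extract_triples("mid:example@example.org", body)
```

Each match yields one triple of the shape thisMail IAgree previousMail, with the scraped URL as the object.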

btw, I'm not sure if everyone in this topic has encountered it before, but a lot of work has been done around the Topic Map domain on dialog mapping - try searching on these terms and 'Jack Park', and I think there are some links in my IBIS doc - http://purl.org/ibis (a not-unrelated spec, that may be useful in joining domains together).

--- later ----

> > I vote for:
> >
> > - a DNS-like open repository of thread
> > "pointers"/topics/meta-data - which enables folks to not only
> > track the threads - but also discover them in the first place

I'd be skeptical about whether a predominantly centralised system would work, in terms of adoption and scalability. If the thread data was recorded local/nearby to the thread item's own content, then the load would be distributed, which is scalability dealt with. If it was done in a fashion that started simple, then the effort needed to enable this in blogging tools would be minimal, and the adoption aspect would be dealt with - people would get it with their tool.
 
Note that a centralised searchable directory of threads could be created as a side effect of the distributed system - tools could just ping the mother ship.

> > - a data structure that specifies how a "generic
> > conversation" would look like (OPML based?) - so whether it's
> > a weblog post, comment on a weblog post, email excerpt, IM or
> > chat session, or a particular piece of structured text
> > (coming out of my new-fangled tool) - there's a common format
> > where all this stuff gets converted into or out of.

OPML is a non-starter - it's effectively impossible to validate and would be hopeless at representing the general graph structures needed for threads. Before I'm accused of personal bias: although I think RDF would be the best and probably easiest solution, it certainly isn't the only possible solution. XTMs or even a completely new language could do the job.

> ThreadsML started out to be this data structure. With it, one
> could decide to build a DNS-like open repository. But even if no
> one decides to do that, threadsML would still accomplish (what I
> take to be) its primary objective: enabling two compliant apps
> (say, QuickTopic and Topica, both of which have expressed an
> interest in supporting the standard) to import/export threads.
> For example, each would add a button to enable a user to export
> the current thread (or perhaps the entire discussion board) in
> threadsML and a button to import a discussion that'd been saved
> in threadsML.
>
> I point out this obviousness because I don't want to tie
> threadsML to the creation of a directory of threads. Much as I'd
> love the latter, it's a much bigger and riskier project than
> coming up with a threads interchange standard.

I don't think it's either/or between a threads interchange standard and ThreadsML - ThreadsML does a fair job of modelling most/all the relationships that you'd be likely to want to represent. A basic t.i.s. could simply demand just a single, simple 'inReplyTo' (or whatever) relationship, formatted in whatever fashion was deemed appropriate. This would be mappable to the corresponding term in ThreadsML. I personally think the easiest way of implementing a t.i.s. *would* use a subset of ThreadsML, but again this isn't the only option.
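
To make the "single, simple inReplyTo" idea concrete, here is a minimal export/import round trip. The element and attribute names (`thread`, `message`, `uri`, `inReplyTo`) are invented for illustration and are not taken from the ThreadsML vocabulary; the point is only that one relationship is enough to carry the thread's shape between two tools.

```python
import xml.etree.ElementTree as ET


def export_thread(links):
    """Serialise (item, parent) pairs as XML; a parent of None marks the thread root."""
    root = ET.Element("thread")
    for item, parent in links:
        message = ET.SubElement(root, "message", {"uri": item})
        if parent is not None:
            message.set("inReplyTo", parent)
    return ET.tostring(root, encoding="unicode")


def import_thread(xml_text):
    """Read the same format back into (item, parent) pairs."""
    root = ET.fromstring(xml_text)
    return [(m.get("uri"), m.get("inReplyTo")) for m in root.findall("message")]


links = [("A", None), ("B", "A"), ("C", "A")]
round_tripped = import_thread(export_thread(links))
```

A richer vocabulary could be layered on later by mapping this one attribute to the corresponding ThreadsML term.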

Cheers,
Danny.
Shelley Powers  46
04-28-2003 01:39 PM ET (US)
> [sorry if you get this twice, but my last couple of posts seem
> to have evaporated]
>
> > Steve, you mention that centralized data store isn't necessary,
> > because many of the threads are permanently accessible -- then
> > how does one find them? Can you walk through an implementation
> > scenario for say, a multi-post weblog topic with comments,
> > perhaps throwing in QuickTopic and a yahoo newsgroup item.
>
> May I?
>
> A starter, anyway:
>
> Person A makes a statement, which is recorded as a blog item.
> For convenience I'll label the URI of this statement A.
>
> Person B comments on this item, directly on the site. This is
> recorded as B inReplyTo A in the RDF store maintained on A's
> site (in practice, right now, this would probably be another
> table in their Movable Type database installation dedicated to
> thread tracking).
>
> Person C comments on A, on their own blog. When this is recorded
> a trackback ping is sent to A. A records this as a regular
> trackback AND as C inReplyTo A in the extra table.
>
> This is the important bit : if C is threads-enabled, then C
> records the thread too, as C inReplyTo A. i.e. the connection is
> recorded at both ends.
>
> This recording-at-both-ends means that the whole thread can be
> reconstructed from any point in the thread. Ok, if there are
> leaves that are 'threadbare' then they might be ignored, but existing
> trackback/link-following may well help here.
>
> I think there are only a few tweaks needed to get a basic system
> working from MT as it stands, though it would be cool if the
> blog-entry form had some extra checkboxes that corresponded to
> the semantics available in ThreadsML (is there
> 'emphaticallyDisagree'? - whatever). This would almost certainly
> call for an extra MySQL table behind the scenes.
>

This has already been implemented via trackback in MT. I have trackback and backtrack which shows both directions. Sam Ruby took it further and allows one to traverse the full length of the threads in either direction.
However, based on this, there is nothing in this that allows one to designate a 'topic'. We all know that the category is useless. Would we use the title?

And, this just allows people who are thread aware to follow the thread. This doesn't allow for discovery of threads, except by stumbling across them while reading blogs, etc.

Still, why not follow through on Trackback, which is already 80% implemented?

> I think the natural interconnectivity of the web might be enough
> in itself, but you could also ping a centralised server with the
> RDF snippet corresponding to the join in the thread too.
>

Again, this is doable, if one has the space and a very efficient search engine. But it can be done with trackback too, as Ben Trott has demonstrated.


> Spidering/scraping email archives would produce a lot of threads
> thanks to the mail's inReplyTo, but this could be augmented if
> you could persuade people to add a special tag in their post
> somewhere e.g. thread:IAgree could be picked up by a spider and
> a triple generated:
>
> thisMail IAgree previousMail
>

Mail. What do you mean by mail? Do you mean QuickTopic mail? Outlook mail?

> if your clients are really willing then you could refer to links
> in email in a scrapable fashion, e.g.
>
> Regarding this post,
> thread:IAgree http://dannyayers.com/archives/001231.html
> I usually agree with myself.
>
> btw, I'm not sure if everyone in this topic has encountered it
> before, but a lot of work has been done around the Topic Map
> domain on dialog mapping - try searching on these terms and
> 'Jack Park', and I think there are some links in my IBIS doc -
> http://purl.org/ibis (a not-unrelated spec, that may be useful in
> joining domains together).
>
> --- later ----
>
> > > I vote for:
> > >
> > > - a DNS-like open repository of thread
> > > "pointers"/topics/meta-data - which enables folks to not only
> > > track the threads - but also discover them in the first place
>
> I'd be skeptical about whether a predominantly centralised system
> would work, in terms of adoption and scalability. If the thread
> data was recorded local/nearby to the thread item's own content,
> then the load would be distributed, which is scalability dealt
> with. If it was done in a fashion that started simple, then the
> effort needed to enable this in blogging tools would be minimal,
> and the adoption aspect would be dealt with - people would get
> it with their tool.
>
> Note that a centralised searchable directory of threads could be
> created as a side effect of the distributed system - tools could
> just ping the mother ship.
>
> > > - a data structure that specifies how a "generic
> > > conversation" would look like (OPML based?) - so whether it's
> > > a weblog post, comment on a weblog post, email excerpt, IM or
> > > chat session, or a particular piece of structured text
> > > (coming out of my new-fangled tool) - there's a common format
> > > where all this stuff gets converted into or out of.
>
> OPML is a non-starter - it's effectively impossible to validate
> and would be hopeless at representing the general graph
> structures needed for threads. Before I'm accused of personal
> bias: although I think RDF would be the best and probably
> easiest solution, it certainly isn't the only possible solution.
> XTMs or even a completely new language could do the job.
>

And I have to push back at RDF. Why do you see this needing to be RDF, Danny? If it's because we already have RSS 1.0 and we're used to it, and it's a data model that handles namespaces and has existing technology to support it, well then that's cool. But we all agree there isn't any semantic richness to this thread structure. Correct? Or do we want semantic richness? If so, then we don't have it from current possible vocabularies.

<snip>

Shelley
Danny Ayers  47
04-28-2003 02:52 PM ET (US)
[another mail went into the void - maybe I shouldn't be doing reply-to-all...]

> This has already been implemented via trackback in MT. I have
> trackback and backtrack which shows both directions. Sam Ruby
> took it further and allows one to traverse the distance all the
> threads in either direction.

heh, cool! But remind me again why we're talking about threads and not topics?

> However, based on this, there is nothing in this that allows one
> to designate a 'topic'. We all know that the category is
> useless. Would we use the title?

What am I missing here?

Ok, there are issues of detail - either use a globally defined topic for the whole thread, and have problems if the subject mutates along the way, or use per-poster recording of the topic:

item1 topic http://blogx.org#coycarp
item2 topic http://blogy.org#goldfish

but isn't this pretty much orthogonal to the threading dimension?
 
> And, this just allows people who are thread aware to follow the
> thread. This doesn't allow for discovery of threads, except by
> stumbling across them while reading blogs, etc.

Which is an indexing problem - orthogonal again! If you can discover one item in a thread and the thread is preserved then you can discover them all. A server ping somewhere along the line, perhaps?

> Still, why not follow through on Trackback, which is already 80%
> implemented?

Yep.

> > I think the natural interconnectivity of the web might be enough
> > in itself, but you could also ping a centralised server with the
> > RDF snippet corresponding to the join in the thread too.
> >
>
> Again, this is doable, if one has the space and a very efficient
> search engine.

The explicit data is available to the server thanks to the ping. The storage and indexing could take place at that point in time, so effectively all that has to be done to search is a DB query.
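
As a sketch of "store and index at ping time, then search is just a DB query": the table layout, column names and the functions `handle_ping` and `replies_to` below are assumptions made for illustration, using an in-memory SQLite database in place of whatever a real ping server would run.

```python
import sqlite3


def open_store():
    """Create an in-memory link table, indexed on the parent URI for fast lookups."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE links (reply TEXT, relation TEXT, parent TEXT, "
        "UNIQUE (reply, relation, parent))"
    )
    conn.execute("CREATE INDEX links_parent ON links (parent)")
    return conn


def handle_ping(conn, reply, relation, parent):
    """Store one thread join as it arrives; repeated pings for the same join are ignored."""
    conn.execute("INSERT OR IGNORE INTO links VALUES (?, ?, ?)", (reply, relation, parent))
    conn.commit()


def replies_to(conn, parent):
    """Search is a single indexed query over what the pings already delivered."""
    rows = conn.execute(
        "SELECT reply FROM links WHERE parent = ? ORDER BY reply", (parent,)
    )
    return [row[0] for row in rows]


conn = open_store()
handle_ping(conn, "http://example.org/c", "inReplyTo", "http://example.org/a")
handle_ping(conn, "http://example.org/b", "inReplyTo", "http://example.org/a")
handle_ping(conn, "http://example.org/c", "inReplyTo", "http://example.org/a")  # duplicate ping
found = replies_to(conn, "http://example.org/a")
```

No crawling or full-text search is involved: the server only ever looks up joins it was explicitly told about.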

> But it can be done with trackback too, as Ben Trott has
> demonstrated.

I don't follow..?
 
> > Spidering/scraping email archives would produce a lot of threads
> > thanks to the mail's inReplyTo, but this could be augmented if
> > you could persuade people to add a special tag in their post
> > somewhere e.g. thread:IAgree could be picked up by a spider and
> > a triple generated:
> >
> > thisMail IAgree previousMail
> >
>
> Mail. What do you mean by mail? Do you mean QuickTopic
> mail? Outlook mail?

Electronic mail. Email.
In the worst case, with generic mail clients, I'm suggesting that the user adds simple markup manually to their mail; in the best case - I don't know - the user picks from a drop-down list in their QuickTopic web interface.

[snip]
> > structures needed for threads. Before I'm accused of personal
> > bias: although I think RDF would be the best and probably
> > easiest solution, it certainly isn't the only possible solution.
> > XTMs or even a completely new language could do the job.
> >
>
> And I have to push back at RDF. Why do you see this needing to
> be RDF, Danny?

I said "it certainly isn't the only possible solution"!!

> If it's because we already have RSS 1.0 and we're
> used to it, and it's a data model that handles namespaces and
> has existing technology to support it, well then that's cool.

These are factors that make me think that RDF might be the easiest approach. Added to that, RDF is cute and cuddly ;-)

> But we all agree there isn't any semantic richness to this
> thread structure. Correct?

I have no idea. The talk of topics being one of the outstanding issues suggests it might be.

> Or do we want semantic richness? If
> so, then we don't have it from current possible vocabularies.

Again, I haven't a clue. I certainly misunderstood what the problem is. There is a lot of richness in existing vocabularies, but what are we trying to say?

Cheers,
Danny.
Shelley Powers  48