Re: [RSS-DEV] Who are we and what are we trying to accomplish anyway ?

  • Ian Graham
    Message 1 of 20, Sep 21, 2002
      On Fri, 20 Sep 2002, Seth Russell wrote:

      > Maybe there are two kinds of design strategies here. Maybe trying to
      > combine them in one group will mean that what we would produce would be
      > a bad compromise. I think the groups can be characterized as follows ....
      >
      > (1) One group (I count myself in this group) believes in RDF and its
      > ability to say just about anything in a manner where the meaning of the
      > data is carried by the syntax of the data, its vocabulary, and its
      > schema definitions. This group believes that RDF data will be
      > interoperable between diverse applications, as long as those
      > applications process the data according to the syntax and its schema
      > definitions. They believe that there will be a synergy at the end of
      > the tunnel when data will match up without being contrived to match up.
      > They use RDF tools designed to that belief ... and are developing more
      > as they speak. But, perhaps somebody can say that better.
      >
      >
      > (2) The other group of designers are coming in from the trenches of
      > actually syndicating information. They believe that the only way to
      > match up data is to contrive the matches. But perhaps somebody can
      > characterize that group better than I can.
      >
      > It seems to me that a combined group will not be able to come up with a
      > *winning* new format that goes beyond what RSS .9x, RSS 1.0 , and RSS
      > 2.0 already give us.
      >
      > Hey ... maybe group (1) should split off from group (2) .....

      Tappnel's commentary [1] reiterates this distinction, but also (IMO) makes
      the important but seemingly often glossed over point that the RDF
      formalism is sufficiently tricky and complex that it trips up and inhibits
      even very good programmers/developers.

      Bill Kearney's comment [2] on how current feeds rarely use mod_content,
      and how most that do implement this feature incorrectly, reiterates the
      need for simplicity of meaning and implementation in the specification.

      I too think that RDF has important value, but it adds complexity that can
      also lead to problems, by increasing the risk of error in designing
      feeds, and in reducing the pool of people who can practically work with
      RSS.

      IMO, the propagation of modules, namespaces, and complex rules for how
      these can be mixed inevitably makes it harder and harder to get things
      correct. And, once the RDF statements in a message are buggered up, then
      the benefits of using RDF are lost.

      I think a three-pronged approach is best:

      1) a pragmatic, simplified, limited functionality non-RDF approach
      (or at least not heavily RDFfed) that is easy to understand by
      competent (but not AI specialist) developers, and easy to implement
      using non-RDF tools. In this case, a developer should NEVER
      need to read RDF specs (or understand what 'reified' means ;-) )
      2) a richer, more complex RDF-based syntax that can express all the
      abstract concepts of the non-RDF approach, plus many others.
      3) a formal transformation process for converting data conforming to
      the first standard into RDF, and vice versa.

      (1) provides for a consistent lowest-common-denominator playing field for
      a large fraction of current users/developers, while (2) is an exciting
      playing ground for new technologies and features, while (3) links the two
      together.
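
To make prong (3) concrete, here is a minimal sketch of one direction of such a transformation: a plain, non-RDF feed turned into (subject, property, object) triples. It is only an illustration under stated assumptions. The `RSS_NS` property namespace, the `hasItem` property, and the choice of `<link>` as the identifier for channels and items are all hypothetical and come from no RSS or RDF specification. The reverse direction (RDF back to the simple format) is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical property namespace, invented for this illustration only.
RSS_NS = "http://example.org/simple-rss#"


def simple_rss_to_triples(xml_text):
    """Convert a plain (non-RDF) RSS document into a list of triples."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    # Assumption: the <link> element is used as the node identifier.
    channel_id = channel.findtext("link")
    triples = [(channel_id, "rdf:type", RSS_NS + "channel")]
    for child in channel:
        if child.tag == "item":
            item_id = child.findtext("link")
            triples.append((item_id, "rdf:type", RSS_NS + "item"))
            triples.append((channel_id, RSS_NS + "hasItem", item_id))
            for prop in child:
                value = (prop.text or "").strip()
                triples.append((item_id, RSS_NS + prop.tag, value))
        else:
            value = (child.text or "").strip()
            triples.append((channel_id, RSS_NS + child.tag, value))
    return triples


FEED = """<rss version="0.91">
  <channel>
    <title>Example channel</title>
    <link>http://example.org/</link>
    <item>
      <title>First post</title>
      <link>http://example.org/1</link>
    </item>
  </channel>
</rss>"""

for triple in simple_rss_to_triples(FEED):
    print(triple)
```

A well-formed input keeps this short. Feeds that are not well-formed XML would need extra handling before this step.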

      [1] http://groups.yahoo.com/group/rss-dev/message/3897
      [2] http://groups.yahoo.com/group/rss-dev/message/3959
      --
      Ian
    • Seth Russell
      Message 2 of 20, Sep 21, 2002
        Ian Graham wrote:

        > > Hey ... maybe group (1) should split off from group (2) .....
        >
        > Tappnel's commentary [1] reiterates this distinction, but also (IMO) makes
        > the important but seemingly often glossed over point that the RDF
        > formalism is sufficiently tricky and complex that it trips up and inhibits
        > even very good programmers/developers.

        Hmmm ... I'm sure this is true, but alas, why? A database of triples is
        almost the simplest concept in the world. These triples (subject,
        property, object) are all the application sees from the parser.

        > I think a three-pronged approach is best:
        >
        > 1) a pragmatic, simplified, limited functionality non-RDF approach
        >    (or at least not heavily RDFfed) that is easy to understand by
        >    competent (but not AI specialist) developers, and easy to implement
        >    using non-RDF tools. In this case, a developer should NEVER
        >    need to read RDF specs (or understand what 'reified' means ;-) )

        But what is the difference between that and Dave's RSS 2.0 ? I know
        it's fashionable not to ack Dave on this list, but what is the use of
        duplicating his effort over here just because we don't like him ?

        > 2) a richer, more complex RDF-based syntax that can express all the
        >    abstract concepts of the non-RDF approach, plus many others.

        Once we stop trying to compromise, this really just becomes RDF triples.
        Syndication needs two new node types: item and channel. Of course each
        of those kinds of nodes can be described with whatever properties the
        author wishes to use. Applications just select off the properties of
        channels and items which they want to process and ignore the rest. How
        could anything be simpler ?

        > 3) a formal transformation process for converting data conforming to
        >    the first standard into RDF, and vice versa.

        Agree. This should be an easy piece.

        Seth Russell

      • Phil Ringnalda
        Message 3 of 20, Sep 21, 2002
          Ian Graham wrote:
          > I too think that RDF has important value, but it adds complexity that can
          > also lead to problems, by increasing the risk of error in designing
          > feeds, and in reducing the pool of people who can practically work with
          > RSS.
          >
          > IMO, the propagation of modules, namespaces, and complex rules for how
          > these can be mixed inevitably makes it harder and harder to get things
          > correct. And, once the RDF statements in a message are buggered up, then the
          > benefits of using RDF are lost.

          Going back a couple of years, Dan Libby said [1] "However, to be
          *practical*, we must first create the tools for 1) validation, 2)
          processing, and 3) generation, pretty much in that order. With proper
          validation tools, people can begin writing processors and generators, or
          even producing files by hand. Without them, it is like shooting in the
          dark. [...] Validation is extremely important -- important enough to be
          listed apart from "tools". Someone publishing a document *must* be able to
          validate that the document is correct before sending it, particularly when
          setting up an automated system. Validation further helps prevent the format
          from splitting, particularly in areas where the spec may be unclear. For
          XML, validation requires minimally a DTD, and optimally XML-Schema and/or
          further application level processing. For RDF, validation requires an
          RDF-Schema aware processor (I believe)."

          I really don't know how much I'm asking for, but are we close to the point
          where we can have a useful-to-amateurs validator, one that will not only
          check syntax but also somehow show the RDF statements in the way they would
          probably be interpreted, so that, for example, someone using the default
          MovableType template would see that they are claiming that the
          admin:generatorAgent for their feed is
          "http://www.theirsite.com/index.rdf#MovableType/2.21"? I'm trying my
          damnedest to have my RDF say what I mean, but with only Leigh Dodds'
          validator [2] and the W3C RDF validator [3] I'm barely sure that I'm
          speaking in grammatically correct sentences, and I'm not at all clear on
          whether my sentences mean "Where is the bathroom?" or "Where is your
          mother's coccyx?"
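
As a rough sketch of what such a "show me what I actually said" tool might do, the snippet below resolves relative references against the feed's own URI and prints each statement in full. The input statement is hypothetical: it assumes, purely for illustration, that a template wrote the generator as a fragment-only reference. What the MovableType template really emits is not stated in this thread. Only the standard-library `urljoin` is used, and this is not a real RDF parser.

```python
from urllib.parse import urljoin


def resolve_statements(base_uri, raw_statements):
    """Resolve relative subject/object references against the feed's URI."""
    resolved = []
    for subject, prop, obj in raw_statements:
        resolved.append((urljoin(base_uri, subject), prop, urljoin(base_uri, obj)))
    return resolved


# Hypothetical input: an empty subject means "this document", and the
# object is written as a fragment-only reference rather than a full URI.
raw = [("", "admin:generatorAgent", "#MovableType/2.21")]

for s, p, o in resolve_statements("http://www.theirsite.com/index.rdf", raw):
    print(f"<{s}> {p} <{o}>")
```

Seeing the fully resolved object next to the property name is what would let an author notice that the statement points at their own site rather than at the tool's home page.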

          Phil Ringnalda


          [1] http://groups.yahoo.com/group/syndication/message/586
          [2] http://www.ldodds.com/rss_validator/1.0/validator.html
          [3] http://www.w3.org/RDF/Validator/
        • Ian Graham
          Message 4 of 20, Sep 21, 2002
            On Sat, 21 Sep 2002, Seth Russell wrote:

            > Ian Graham wrote:
            >
            > >> Hey ... maybe group (1) should split off from group (2) .....
            > >>
            > >>
            > >
            > >Tappnel's commentary [1] reiterates this distinction, but also (IMO) makes
            > >the important but seemingly often glossed over point that the RDF
            > >formalism is sufficiently tricky and complex that it trips up and inhibits
            > >even very good programmers/developers.
            > >
            > Hmmm ... I'm sure this is true, but alas, why? A database of triples is
            > almost the simplest concept in the world. These triples (subject,
            > property, object) are all the application sees from the parser.

            I believe the issue is that the triples add an extra abstraction layer
            that developers don't understand or want. They want to be doers, not
            thinkers, and the RDF approach asks them to first think through the
            underlying subtlety of their data semantics before developing an
            application. For the scope of most RSS applications, this is seen to be
            unnecessary baggage.

            It's a bit like OO -- you really don't need OO (and all the OO design
            abstractions) for small applications, because the extra layer of
            abstraction gets in the way of getting the simple tasks done.

            > >I think a three-pronged approach is best:
            > >
            > >1) a pragmatic, simplified, limited functionality non-RDF approach
            > > (or at least not heavily RDFfed) that is easy to understand by
            > > competent (but not AI specialist) developers, and easy to implement
            > > using non-RDF tools. In this case, a developer should NEVER
            > > need to read RDF specs (or understand what 'reified' means ;-) )
            > >
            > But what is the difference between that and Dave's RSS 2.0 ? I know
            > it's fashionable not to ack Dave on this list, but what is the use of
            > duplicating his effort over here just because we don't like him ?

            Well, I think some of Dave's ideas are poorly designed, but otherwise this
            is precisely the model (and rationale) he is following. OTOH, I am not in
            favor of specifications designed by fiat, as opposed to by some sort of WG
            consensus. This group, however, has never come to consensus on doing
            anything that follows model (1) -- which IMHO explains why this group
            continues to be fractured by this issue.

            > >2) a richer, more complex RDF-based syntax that can express all the
            > > abstract concepts of the non-RDF approach, plus many others.

            > Once we stop trying to compromise, this really just becomes RDF triples.
            > Syndication needs two new node types: item and channel. Of course each
            > of those kinds of nodes can be described with whatever properties the
            > author wishes to use. Applications just select off the properties of
            > channels and items which they want to process and ignore the rest. How
            > could anything be simpler ?

            Well, yes -- to you and those comfortable with the technology and tooling.
            This is a bit of a stretch for someone writing RSS code using perl and a
            simple XML parser (if that) ;-) Better RDF tooling would help greatly...

            > >3) a formal transformation process for converting data conforming to
            > > the first standard into RDF, and vice versa.
            > >
            > Agree. This should be an easy piece.

            One would hope so.. however, the 'muckiness' of type (1) RSS may make it
            hard to transform this into type (2) -- much in the way it is hard to
            write an HTML parser that handles all the messiness of HTML...

            Ian
          • Ian Graham
            Message 5 of 20, Sep 21, 2002
              On Sat, 21 Sep 2002, Phil Ringnalda wrote:

              > Ian Graham wrote:
              > > I too think that RDF has important value, but it adds complexity that can
              > > also lead to problems, by increasing the risk of error in designing
              > > feeds, and in reducing the pool of people who can practically work with
              > > RSS.
              > >
              > > IMO, the propagation of modules, namespaces, and complex rules for how
              > > these can be mixed inevitably makes it harder and harder to get things
              > > correct. And, once the RDF statements in a message are buggered up, then the
              > > benefits of using RDF are lost.
              >
              > Going back a couple of years, Dan Libby said [1] "However, to be
              > *practical*, we must first create the tools for 1) validation, 2)
              > processing, and 3) generation, pretty much in that order. With proper
              > validation tools, people can begin writing processors and generators, or
              > even producing files by hand. Without them, it is like shooting in the
              > dark. [...] Validation is extremely important -- important enough to be
              > listed apart from "tools". Someone publishing a document *must* be able to
              > validate that the document is correct before sending it, particularly when
              > setting up an automated system. Validation further helps prevent the format
              > from splitting, particularly in areas where the spec may be unclear. For
              > XML, validation requires minimally a DTD, and optimally XML-Schema and/or
              > further application level processing. For RDF, validation requires an
              > RDF-Schema aware processor (I believe)."

              A very good point. I believe most 'pragmatic' RSS developers (as opposed
              to the 'R&D developers' who dominate this group) would be happy with a
              simple toolset and API (perl, python, php, whatever) that lets them create
              outgoing or process incoming RSS without needing to understand any XML or
              RDF. The toolset should guarantee validity at the syntactic level. RSS
              feeds seem to still have lots of syntactic problems.
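
A minimal sketch of what such a toolset could look like on the generating side, assuming Python and its standard `xml.etree.ElementTree` module. The `build_feed` function and its parameters are hypothetical names chosen for this illustration, not an existing library API. The caller passes plain strings and dictionaries, and the serializer takes care of escaping, so the output is well-formed XML by construction.

```python
import xml.etree.ElementTree as ET


def build_feed(title, link, description, items):
    """Build an RSS 2.0-style document from plain Python values."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    for tag, value in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = value
    for entry in items:
        item = ET.SubElement(channel, "item")
        # Only the keys the caller supplied are written out.
        for tag in ("title", "link", "description"):
            if tag in entry:
                ET.SubElement(item, tag).text = entry[tag]
    # Special characters such as '&' and '<' are escaped by the serializer.
    return ET.tostring(rss, encoding="unicode")


xml_text = build_feed(
    "Example channel",
    "http://example.org/",
    "Q&A about <feeds>",
    [{"title": "Ben & Jerry", "link": "http://example.org/1"}],
)
print(xml_text)
```

This only addresses syntactic validity. Whether the feed also carries required elements or sensible values is a separate check.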

              Even for the R&D group, there seem to be too few good RDF tools, including
              validators and visualizers.

              > I really don't know how much I'm asking for, but are we close to the point
              > where we can have a useful-to-amateurs validator, one that will not only
              > check syntax but also somehow show the RDF statements in the way they would
              > probably be interpreted, so that, for example, someone using the default
              > MovableType template would see that they are claiming that the
              > admin:generatorAgent for their feed is
              > "http://www.theirsite.com/index.rdf#MovableType/2.21"? I'm trying my
              > damnedest to have my RDF say what I mean, but with only Leigh Dodds'
              > validator [2] and the W3C RDF validator [3] I'm barely sure that I'm
              > speaking in grammatically correct sentences, and I'm not at all clear on
              > whether my sentences mean "Where is the bathroom?" or "Where is your
              > mother's coccyx?"

              A nice example, although it does open up the doors to way too many bad
              jokes ;-)

              Ian
            • Seth Russell
              Message 6 of 20, Sep 21, 2002
                Ian Graham wrote:

                >I believe the issue is that the triples add an extra abstraction layer
                >that developers don't understand or want.
                >
                Balderdash! Programmers never had any problem with name_value pairs
                ... they loved and embraced them .. right? Well RDF is just name_value
                pairs *about things*. Programmers never had any problem with relational
                databases ... they loved and embraced them .... right? Well RDF is just
                a relational database with a fixed and simplified column structure ...
                i.e. just three columns. If you look at RDF as data and forget about
                all the abstract semantics, it actually is a much simpler solution to
                the problem of saying anything about anything. It's much simpler than
                contriving customized structures every time we want to say something new.
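
The "three-column table" view can be sketched in a few lines. This is an illustration of the idea only, not an RDF implementation: the `TripleStore` class and the property names are hypothetical, and nothing here touches RDF/XML syntax or RDF semantics.

```python
class TripleStore:
    """A single table with three fixed columns: subject, property, object."""

    def __init__(self):
        self.rows = []

    def add(self, subject, prop, obj):
        self.rows.append((subject, prop, obj))

    def select(self, subject=None, prop=None, obj=None):
        """Return the rows matching every column that was given."""
        return [
            (s, p, o)
            for (s, p, o) in self.rows
            if (subject is None or s == subject)
            and (prop is None or p == prop)
            and (obj is None or o == obj)
        ]


store = TripleStore()
store.add("http://example.org/1", "type", "item")
store.add("http://example.org/1", "title", "First post")
store.add("http://example.org/1", "mood", "cheerful")  # a property the reader ignores

# An application picks off only the properties it understands.
titles = [o for (_, _, o) in store.select(prop="title")]
print(titles)
```

The application never asks about `mood`, so an unknown property costs it nothing.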

                >Well, I think some of Dave's ideas are poorly designed, but otherwise this
                >is precisely the model (and rationale) he is following. OTOH, I am not in
                >favor of specifications designed by fiat, as opposed to by some sort of WG
                >consensus.
                >
                Well sometimes a single person can design a better structure ... where
                a committee will end up with an aberration of compromises trying to
                attain everybody's conflicting goals. I believe that RSS 1.0 is just
                such an aberration. But the 2.0 spec preserves compatibility with
                thousands of .9x feeds, yet allows for just the additional properties
                to be added which people are screaming for. It seems to me that the
                RSS 2.0 spec just reflects where the market is and where it wants to
                go. It's simple, uncontrived, and preserves the momentum of RSS. It
                is truly going to be difficult for a committee to come up with a
                better spec.

                >One would hope so.. however, the 'muckiness' of type (1) RSS may make it
                >hard to transform this into type (2)
                >
                Not at all. In fact we can transform any kind of items streaming in
                channel documents into RDF nodes and arrows streaming in whatever
                media. Emails, Usenet posts, XHTML marked up web pages, arbitrary XML,
                RSS .9x, RSS 2.0, RSS 1.0 etc .... all can be included. I bid 2000
                lines of code (or less) and a simple RDF description for each new kind
                of format.

                Seth Russell
                http://robustai.net/
              • Ian Graham
                Message 7 of 20, Sep 21, 2002
                  On Sat, 21 Sep 2002, Seth Russell wrote:

                  > Ian Graham wrote:
                  >
                  > >I believe the issue is that the triples add an extra abstraction layer
                  > >that developers don't understand or want.
                  > >
                  > Balderdash! Programmers never had any problem with name_value pairs
                  > ... they loved and embraced them .. right? Well RDF is just name_value
                  > pairs *about things*. Programmers never had any problem with relational
                  > databases ... they loved and embraced them .... right? Well RDF is just
                  > a relational database with a fixed and simplified column structure ...
                  > i.e. just three columns. If you look at RDF as data and forget about
                  > all the abstract semantics, it actually is a much simpler solution to
                  > the problem of saying anything about anything. It's much simpler than
                  > contriving customized structures every time we want to say something new.

                  If the triples were that simple, then everyone would be happy to use
                  them :-/

                  But people don't seem to be entirely happy (otherwise this continuing
                  discussion wouldn't be happening). I suspect this is because the
                  simplicity of raw triples gets lost in the complexity of the XML notation,
                  and in the complexity of the RDF semantics (the RDF specs are long for
                  good reason!).

                  I think RDF is very cool -- and I think it is an important and useful
                  technology. But I don't think you need the power of RDF for all use
                  cases of RSS.

                  It's true developers love relational databases. But then, if you're only
                  working with a few simple resources and only simple indexing requirements,
                  then I bet you dimes for dollars that most developers would just dump the
                  stuff into a filesystem, and use index files and hashtables.....

                  > >Well, I think some of Dave's ideas are poorly designed, but otherwise this
                  > >is precisely the model (and rationale) he is following. OTOH, I am not in
                  > >favor of specifications designed by fiat, as opposed to by some sort of WG
                  > >consensus.
                  > >

                  > Well sometimes a single person can design a better structure ... where
                  > a committee will end up with an aberration of compromises trying to
                  > attain everybody's conflicting goals. I believe that RSS 1.0 is just
                  > such an aberration. But the 2.0 spec preserves compatibility with
                  > thousands of .9x feeds, yet allows for just the additional properties
                  > to be added which people are screaming for. It seems to me that the
                  > RSS 2.0 spec just reflects where the market is and where it wants to
                  > go. It's simple, uncontrived, and preserves the momentum of RSS. It
                  > is truly going to be difficult for a committee to come up with a
                  > better spec.

                  This is quite true, and things like Relax and trex are good examples of
                  this. I'm not so convinced of all Dave's notions, however, although I do
                  admit that RSS 2.0 is a reasonable way forward along the 0.9x branch.

                  In these other cases, however, the contribution became an open standard,
                  with a great deal of community contribution / involvement / consensus. Why
                  that didn't happen with RSS is unclear to me. Perhaps it was because there
                  never was a true originator of RSS to lead things forward, or perhaps it's
                  because those going forward had two visions for where it should go. Or
                  perhaps it's both of these, and more.

                  > >One would hope so.. however, the 'muckiness' of type (1) RSS may make it
                  > >hard to transform this into type (2)
                  > >
                  > Not at all. In fact we can transform any kind of items streaming in
                  > channel documents into RDF nodes and arrows streaming in whatever
                  > media. Emails, Usenet posts, XHTML marked up web pages, arbitrary XML,
                  > RSS .9x, RSS 2.0, RSS 1.0 etc .... all can be included. I bid 2000
                  > lines of code (or less) and a simple RDF description for each new kind
                  > of format.

                  My muckiness referred to the need to add special handling to take care of
                  badly formed XML .... experience in other work I've done (albeit with
                  HTML) suggests to me that reliable information scraping from badly formed
                  input can get messy, and that up to 20% of the code needs to be customized
                  for each 'scraped' page/feed. If you've found that scraping into RDF is
                  relatively easy and reliable, then I think that's fantastic (and would've
                  made my life a whole lot easier, if I'd had such tools a few years back
                  ;-) )
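
The kind of special handling meant here can be sketched as a strict parse with a crude fallback. This is an assumption-laden illustration: the `extract_titles` function is hypothetical, the regular expression only handles the simplest case of a bare `<title>` element, and real-world breakage is far more varied than this.

```python
import re
import xml.etree.ElementTree as ET

# Crude fallback pattern; it knows nothing about attributes, CDATA or nesting.
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_titles(feed_text):
    """Return (titles, used_fallback) for a feed that may not be well-formed."""
    try:
        root = ET.fromstring(feed_text)
        return [el.text or "" for el in root.iter("title")], False
    except ET.ParseError:
        # The XML parser refused the input, so scrape what can be scraped.
        return [m.strip() for m in TITLE_RE.findall(feed_text)], True


good = "<rss><channel><title>Fine</title></channel></rss>"
# Unescaped '&' and a missing closing tag: not well-formed XML.
bad = "<rss><channel><title>Ben & Jerry</title></channel>"

print(extract_titles(good))
print(extract_titles(bad))
```

Each new kind of breakage tends to need its own fallback rule, which is where the per-feed customization cost comes from.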

                  Ian