Re: Decentralized Meta-Data Search Strategies

  • Aaron Swartz
    Message 1 of 4 , Aug 6, 2002
      [Please keep me cc-ed if you want a response.]

      > A question for somebody from Plesh: how does the emphasis on
      > serving and navigating RDF triples affect the topology of the
      > network? My guess is that you publish the hash of each element
      > of the triple, so users can address the elements separately or
      > together. A triple of
      >
      > :John :emailAddress <mailto:john@...> .
      >
      > allows the user to search for the hash of John and/or
      > emailAddress and/or mailto:john@....

      OK, so neither John (a human being) nor the abstract concept of
      "email address" has a real hash (since they're not really bits),
      so we give them URIs and then hash the URIs.
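
As a rough illustration of hashing the URIs rather than the things they name, here is a minimal Python sketch. The choice of SHA-1 and the example.org URIs are assumptions made for illustration only; the thread does not say which hash function Plesh uses or which URIs it would assign.

```python
import hashlib

def uri_key(uri: str) -> str:
    """Return a fixed-length hex key for a URI (SHA-1 is an assumed choice)."""
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()

# Hypothetical URIs standing in for :John and :emailAddress.
john = "http://example.org/people#John"
email_address = "http://example.org/terms#emailAddress"

# Each element of the triple gets its own key, so it can be addressed separately.
triple_keys = [uri_key(john), uri_key(email_address)]
```

The person and the concept never get hashed directly; only their URI strings do, which is what makes them addressable in a hash-based network.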

      Valid searches have the form of a triple except with zero or
      more portions wildcarded. So...

      :John :emailAddress * .

      for example. Then you search a number of DHTs (there are some
      gritty details in optimizing this, but you can think of just
      three DHTs, one handling queries if the first element is a
      wildcard, another if the second is and yet another if the third
      is) and you get back the answer (hopefully
      mailto:john@...).
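
The "three DHTs" simplification above might be sketched as follows. Plain Python dicts stand in for the DHTs, SHA-1 is an assumed hash, the class and function names are hypothetical, and the mailbox address is a made-up placeholder. The sketch only handles patterns with exactly one wildcard and skips the "gritty details" of optimization entirely.

```python
import hashlib
from collections import defaultdict

WILDCARD = "*"

def key(*parts: str) -> str:
    """Hash the known elements of a pattern into one lookup key (SHA-1 assumed)."""
    return hashlib.sha1("\x00".join(parts).encode("utf-8")).hexdigest()

class TripleIndex:
    """Three dicts stand in for three DHTs, one per wildcard position."""

    def __init__(self):
        # tables[i] answers patterns whose i-th element is the wildcard.
        self.tables = [defaultdict(set) for _ in range(3)]

    def publish(self, triple):
        # Register the triple in every table, keyed on the two other elements.
        for i in range(3):
            known = triple[:i] + triple[i + 1:]
            self.tables[i][key(*known)].add(triple)

    def query(self, pattern):
        wild = [i for i, part in enumerate(pattern) if part == WILDCARD]
        if len(wild) != 1:
            raise ValueError("this sketch handles exactly one wildcard")
        i = wild[0]
        known = pattern[:i] + pattern[i + 1:]
        return self.tables[i].get(key(*known), set())

index = TripleIndex()
index.publish((":John", ":emailAddress", "mailto:john@example.org"))
matches = index.query((":John", ":emailAddress", WILDCARD))
```

Here `matches` holds the one published triple, because the query hashes the two known elements and looks them up in the table responsible for a wildcard in the third position.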

      Does that make sense?
      --
      Aaron Swartz [http://www.aaronsw.com] I am large, I contain multitudes.
    • Lucas Gonze
      Message 2 of 4 , Aug 6, 2002
        Aaron Swartz wrote:
        > Valid searches have the form of a triple except with zero or
        > more portions wildcarded. So...
        >
        > :John :emailAddress * .
        >
        > for example. Then you search a number of DHTs (there are some
        > gritty details in optimizing this, but you can think of just
        > three DHTs, one handling queries if the first element is a
        > wildcard, another if the second is and yet another if the third
        > is) and you get back the answer (hopefully
        > mailto:john@...).
        >
        > Does that make sense?

        Yup. So there's no substring match (emailAddress=* is legal but
        emailAddress=*@... isn't) and that lets you address triples as
        documents. The impact of metadata on topology is that there is a fixed
        number of overlay networks, and Plesh could actually sit on top of any DHT
        that supports the same hash.
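
To see why a hashed key space rules out substring matching, consider this small Python sketch. A dict stands in for a DHT, SHA-1 is an assumed hash, and the address and function names are hypothetical. A lookup succeeds only when the whole value is known, because the hash of a partial pattern has no useful relationship to the hash of the full string.

```python
import hashlib

def dht_key(value: str) -> str:
    """Hash a whole value into a lookup key (SHA-1 is an assumed choice)."""
    return hashlib.sha1(value.encode("utf-8")).hexdigest()

# A dict standing in for a DHT that stores one value under its hash.
store = {dht_key("mailto:john@example.org"): "mailto:john@example.org"}

def exact_lookup(value: str):
    """Return the stored value if its exact hash is present, else None."""
    return store.get(dht_key(value))

found = exact_lookup("mailto:john@example.org")   # the full value is known
missed = exact_lookup("*@example.org")            # a pattern hashes elsewhere
```

The second lookup returns `None`: the pattern is hashed as a literal string, so it lands on an unrelated key rather than matching the stored address.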

        It strikes me that a Plaxton mesh would support full regex matching if you
        didn't hash the input identifiers. Normal Plaxton identifiers are a
        fixed-format hash. E.g., the plaintext "john" becomes AE 09 9W. But a mesh
        could just as easily be built on characters. The reason that most people
        don't use a Plaxton mesh is that the networks it builds are fairly brittle,
        given that they lead to a massive number of interconnections. A Plaxton
        mesh that didn't hash the string "john" would need four dimensions.
        Full-size books would need a dimension for every character.

        But non-Plaxton DHTs get into similar trouble, since being able to
        traverse them with metadata also leads to having a dimension per chunk of
        metadata. Plesh has three chunks of metadata, and it has three
        dimensions.

        RDF has an interesting position here. It's supposed to fix the problem
        that the browser web has to figure out the meaning of free text in HTML
        pages. It does it by saying that any searchable dimension has to be
        a discrete element of a triple. Substring matching is mainly disallowed.

        Thanks Aaron.

        - Lucas
      • samrhjoseph
        Message 3 of 4 , Aug 6, 2002
          --- In decentralization@y..., Lucas Gonze <lgonze@p...> wrote:

          > RDF has an interesting position here. It's supposed to fix the problem
          > that the browser web has to figure out the meaning of free text in HTML
          > pages. It does it by saying that any searchable dimension has to be
          > a discrete element of a triple. Substring matching is mainly disallowed.

          While I think this is true for Plesh, I don't think it is true
          generally for distributed search of RDF triples. NeuroGrid supports
          distributed RDF searches, and is perfectly happy with substring
          matching. It won't provide the same guarantees about finding
          everything that Chord might provide, but it's still distributed
          search.
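
The trade-off described here can be sketched in Python with a generic hop-limited flood over a small node graph. This is not NeuroGrid's actual routing; the `Node` class, the `flood_search` function, and the placeholder address are all hypothetical. Each node scans its local triples for the substring, so partial matches work, but anything beyond the hop limit is simply never seen.

```python
from collections import deque

class Node:
    """A peer holding some triples locally and knowing a few neighbors."""

    def __init__(self, name, triples):
        self.name = name
        self.triples = list(triples)
        self.neighbors = []

def flood_search(start, substring, max_hops):
    """Breadth-first search up to max_hops, matching substring in any element."""
    seen = {start.name}
    queue = deque([(start, 0)])
    hits = []
    while queue:
        node, hops = queue.popleft()
        # Local scan: substring matching is possible because values are not hashed.
        hits.extend(t for t in node.triples if any(substring in part for part in t))
        if hops < max_hops:
            for nb in node.neighbors:
                if nb.name not in seen:
                    seen.add(nb.name)
                    queue.append((nb, hops + 1))
    return hits

# A three-node chain a -> b -> c, with the only matching triple stored on c.
a = Node("a", [])
b = Node("b", [])
c = Node("c", [(":John", ":emailAddress", "mailto:john@example.org")])
a.neighbors, b.neighbors = [b], [c]

near = flood_search(a, "@example.org", max_hops=1)  # stops before reaching c
far = flood_search(a, "@example.org", max_hops=2)   # reaches c
```

With one hop the search comes back empty even though a match exists in the network, which is the missing completeness guarantee; with two hops the substring match is found.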

          CHEERS> SAM
        • Lucas Gonze
          Message 4 of 4 , Aug 7, 2002
            > While I think this is true for Plesh, I don't think it is true
            > generally for distributed search of RDF triples. NeuroGrid supports
            > distributed RDF searches, and is perfectly happy with substring
            > matching. It won't provide the same guarantees about finding
            > everything that Chord might provide, but it's still distributed
            > search.

            Yup -- it is only true for distributed hash tables.