use of # between '/'s

  • Tilman Hausherr
    Message 1 of 6 , Jan 13, 2010
      It seems that there's a bug in my software with links like this one:

      <a
      href="http://www.dctp.tv/#/meinungsmacher/udo-vetter-lawblog">interview</a>

      I just throw away everything after the #, so I would spider to
      http://www.dctp.tv/ , which shows different content.

      Does anybody know the meaning of a # that appears "deep inside" a URL,
      and what would be the correct logic to differentiate it from the classic
      '#' as explained in
      http://www.w3.org/Addressing/URL/uri-spec.html ? Could it be "it doesn't
      count if the '#' is before a '/'" ?

      If so, what about this URL
      http://www.ftd.de/auto/bilder/:galerie-die-fiatisierung-von-chrysler/50059172.html#utm_source=rss&utm_medium=rss_feed&utm_campaign=/
      where the content is identical to this URL
      http://www.ftd.de/auto/bilder/:galerie-die-fiatisierung-von-chrysler/50059172.html
      ?

      Tilman
    • Daniel Norton
      Message 2 of 6 , Jan 13, 2010
        That's not a bug in your software, it's a bug in the website. The hash sign (#) in a URI is a reserved character. A URI containing a hash sign should retrieve the same document as the URI without the hash sign and everything following it (the fragment identifier). From RFC 3986 (highlight added):

        4.4 Same-Document Reference

        When a URI reference refers to a URI that is, aside from its fragment component (if any), identical to the base URI (Section 5.1), that reference is called a "same-document" reference. The most frequent examples of same-document references are relative references that are empty or include only the number sign ("#") separator followed by a fragment identifier.

        When a same-document reference is dereferenced for a retrieval action, the target of that reference is defined to be within the same entity (representation, document, or message) as the reference; therefore, a dereference should not result in a new retrieval action.

        The specification does not provide for any exceptions for characters (such as "/") after the hash mark, so they must be considered to be part of the fragment identifier. The W3C document you referenced concurs.

        --
        Daniel
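
The rule described above (everything from the first '#' onward is the fragment, whatever characters follow it) can be sketched with Python's standard `urllib.parse.urldefrag`. This is only an illustration, not the code of the spider discussed in this thread; the helper name `retrieval_url` is hypothetical, and the two URLs are the ones quoted in the first message.

```python
from urllib.parse import urldefrag


def retrieval_url(url: str) -> str:
    """Return the URL to request: the input with its fragment removed.

    Hypothetical helper for illustration. The fragment starts at the
    first '#', even when a '/' appears after it.
    """
    return urldefrag(url).url


examples = [
    "http://www.dctp.tv/#/meinungsmacher/udo-vetter-lawblog",
    "http://www.ftd.de/auto/bilder/:galerie-die-fiatisierung-von-chrysler/"
    "50059172.html#utm_source=rss&utm_medium=rss_feed&utm_campaign=/",
]

for url in examples:
    # urldefrag returns a named tuple: (url without fragment, fragment)
    base, fragment = urldefrag(url)
    print("request: ", base)
    print("fragment:", fragment)
```

Under this reading, both example URLs are handled the same way: the first maps to http://www.dctp.tv/ and the second to the plain .html URL, which matches the behaviour of discarding everything after the #.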
