
Re: [xenu-usergroup] Feature Suggestion: Please provide a loadable website definition file.

  • Tilman Hausherr
    Message 1 of 6 , Jan 2, 2010
      On Sat, 02 Jan 2010 14:20:54 -0000, Rolf Hemmerling wrote:

      >Hello,
      >is Tilman reading here ?

      Yes, but it doesn't mean that you get more chances with your wishes :-)

      >
      >My
      >Feature Suggestion:
      >a)
      >Please provide a loadable website definition file ( = configuration file )
      >
      >as typing in the "exceptions" each time is annoying (thanks for offering exceptions, though). And thinking about proper exceptions is hard work, so you have to write them down anyhow - why not in a configuration file?!
      >
      >So that a non-technical expert might run XENU monthly/weekly etc., by just running Xenu and loading the website file prepared by an expert.

      The exceptions are loaded once you have typed in the URL or picked it
      from the drop-down box.

      >I think it's time now that Xenu gets serious about being used "automatically".
      >
      >b)
      >Please provide a way to kill processes QUICKLY, when a user wants to abort Xenu.
      >Now there is the modal window which explains that 1000 processes are running. This might be true, technically, but does not help the user to get rid of Xenu quickly.

      Press STOP; it is next to the PAUSE button.

      If the processes do not stop after that, it means there's a bug
      (this seems to happen with some FTP jobs, i.e. the background process
      dies and Xenu doesn't get notified).

      Tilman

      >
      >It helps to kill Xenu several times; then Windows asks if it may abort the whole thing :-). But this is of course no "clean" solution.
      >
      >Sincerely
      >Rolf
    • Thomas Fischer
      Message 2 of 6 , Jan 4, 2010
        Hello Tilman,

        > > On Sat, 02 Jan 2010 14:20:54 -0000, Rolf Hemmerling wrote:
        > > My Feature Suggestion:
        > > a) Please provide a loadable website definition file
        > > ( = configuration file )
        > >
        > > as typing in the "exceptions" each time is annoying ( after
        > > you offer exceptions, thanks for that ). And thinking about
        > > proper exceptions is hard work, so you have to write it down
        > > anyhow - why not in a configuration file ?!
        > >
        > > So that a non-technical expert might run XENU monthly/weekly
        > > etc., by just running Xenu and loading the website file
        > > prepared by an expert.
        >
        > They are loaded after you have typed in the URL, or got it
        > from the drop down box.

        I'd like to second Rolf on his request.
        There is the file Xenu.ini with all the information, and I prepared it for
        my users accordingly, so that they can start their linkchecks without
        worrying about restrictions.
        But it is a strange mixture of preferences, history, settings etc, and every
        time a check is run it is changed by Xenu.
        So while it serves some of the purposes that Rolf is asking for, it is not
        quite as straightforward as it could be.
        I would prefer a separation of the file into different files, in particular
        one file for each website that is checked with all the additional settings
        needed and which remains unchanged by Xenu (unless preferences are altered).
        My Xenu.ini is now 55 KB and is starting to get a little confusing.

        All the best
        Thomas
      • Tilman Hausherr
        Message 3 of 6 , Jan 4, 2010
          Hi,

          For those of you who are annoyed by the bug with mail URLs of the kind

          mailto:user@host.com?subject=xxx

          there's a new version that solves it:
          http://home.snafu.de/tilman/tmp/xenubeta.zip

          Tilman
        • Tilman Hausherr
          Message 4 of 6 , Jan 13, 2010
            It seems that there's a bug in my software with links like this one:

            <a
            href="http://www.dctp.tv/#/meinungsmacher/udo-vetter-lawblog">interview</a>

            I just throw away everything after the #, so I would spider to
            http://www.dctp.tv/ , which shows different content.

            Does anybody know the meaning of a # that appears "deep inside" a URL,
            and what the correct logic would be to differentiate it from the
            classic '#' as explained in
            http://www.w3.org/Addressing/URL/uri-spec.html ? Could it be "it doesn't
            count if the '#' is before a '/'" ?

            If so, what about this URL
            http://www.ftd.de/auto/bilder/:galerie-die-fiatisierung-von-chrysler/50059172.html#utm_source=rss&utm_medium=rss_feed&utm_campaign=/
            where the content is identical to this URL
            http://www.ftd.de/auto/bilder/:galerie-die-fiatisierung-von-chrysler/50059172.html
            ?

            Tilman
          • Daniel Norton
            Message 5 of 6 , Jan 13, 2010
              That's not a bug in your software, it's a bug in the website. The hash sign (#) in a URI is a reserved character and a URI with a hash sign (#) should retrieve the same document as the URI without the hash sign and everything following it (the fragment identifier). From RFC 3986 (highlight added):

              4.4 Same-Document Reference

              When a URI reference refers to a URI that is, aside from its fragment component (if any), identical to the base URI (Section 5.1), that reference is called a "same-document" reference. The most frequent examples of same-document references are relative references that are empty or include only the number sign ("#") separator followed by a fragment identifier.

              When a same-document reference is dereferenced for a retrieval action, the target of that reference is defined to be within the same entity (representation, document, or message) as the reference; therefore, a dereference should not result in a new retrieval action.

              The specification does not provide for any exceptions for characters (such as "/") after the hash mark, so they must be considered to be part of the fragment identifier. The W3 document you referenced concurs.
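
As a rough illustration of that rule (this is not Xenu's code, just a sketch in Python using the standard library's urllib.parse.urldefrag; the helper name retrieval_url is made up for the example), the URL to retrieve is everything before the first '#', no matter which characters follow it:

```python
from urllib.parse import urldefrag


def retrieval_url(url: str) -> str:
    """Return the part of the URL that is actually requested from the server.

    Everything from the first '#' onward is the fragment identifier and is
    dropped, even if it contains '/' or '&'.
    """
    return urldefrag(url).url


# The two URLs quoted earlier in this thread.
examples = [
    "http://www.dctp.tv/#/meinungsmacher/udo-vetter-lawblog",
    "http://www.ftd.de/auto/bilder/:galerie-die-fiatisierung-von-chrysler/"
    "50059172.html#utm_source=rss&utm_medium=rss_feed&utm_campaign=/",
]

for u in examples:
    print(retrieval_url(u))
```

With this logic both kinds of link are handled the same way: the first reduces to http://www.dctp.tv/ and the second to the plain 50059172.html URL. Whatever a page displays differently for the part after the '#' is presumably produced by script in the browser, which a link checker that only fetches documents would not see.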

              --
              Daniel
