
link to wikipedia

  • Lars
    Message 1 of 15, Apr 12 8:04 AM
      I have quite a few links to Wikipedia.

      They all give this message, which is a little annoying:
      error code: 403 (forbidden request)
    • daniel norton, teeny tiny websites
      Message 2 of 15, Apr 12 8:12 AM
        On Mon, Apr 12, 2010 at 10:04 AM, Lars <baloo5419@...> wrote:

        > I have quite a few links to Wikipedia.
        > They all give this message, which is a little annoying:
        > error code: 403 (forbidden request)

        From http://en.wikipedia.org/robots.txt:

        ...
        User-agent: Xenu
        Disallow: /
        ...

        --
        Daniel
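
        [That robots.txt entry appears to be why the Wikipedia links come back as
        forbidden: the site explicitly disallows the Xenu user agent. As a rough
        illustration only - this is not Xenu code - here is a minimal Python sketch
        using the standard urllib.robotparser module to show how such a rule is
        evaluated; the second user-agent string is just a made-up placeholder.]

        from urllib.robotparser import RobotFileParser

        # Fetch and parse Wikipedia's robots.txt (needs network access).
        rp = RobotFileParser()
        rp.set_url("http://en.wikipedia.org/robots.txt")
        rp.read()

        page = "http://en.wikipedia.org/wiki/Main_Page"
        print(rp.can_fetch("Xenu", page))          # expected False, given the Disallow rule above
        print(rp.can_fetch("SomeOtherBot", page))  # depends on the other rules in the file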


      • Vince Thacker
        Message 3 of 15, Apr 12 9:47 AM
          I've raised a similar query before about many Google links that return a
          403, especially their RSS feeds. It doesn't seem you can do much about it,
          as these sites have a policy of forbidding software such as Xenu getting
          access.

          Presumably if a page is generating a 403, you know at least that you have a
          valid link.

          Vince.

        • Fischer, Thomas
          Message 4 of 15, Apr 12 11:53 PM
            Hello,

            since checking links to Wikipedia seems to be a legitimate task for Xenu,
            shouldn't someone contact them and ask for the removal of the robots.txt
            exclusion? Or is there a reason that Xenu and Wikipedia don't work together
            smoothly, e.g. because of the internal redirects in Wikipedia?

            By the way,

            User-agent: Xenu
            Disallow: /

            is also contained in http://de.wikipedia.org/robots.txt.

            All the best
            Thomas


          • Andy Mabbett
            Message 5 of 15, Apr 13 3:11 AM
              There would be no point - Wikipedia always returns "200 OK" never "404
              page not found".

              On Tue, April 13, 2010 07:53, Fischer, Thomas wrote:
              > since checking links to Wikipedia seems to be a legitimate task for Xenu,
              > shouldn't someone contact them and ask for the removal of the robots.txt
              > exclusion?

              --
              Andy Mabbett
              @pigsonthewing
              http://pigsonthewing.org.uk
              ** via webmail **
            • Ron Jones
              Message 6 of 15, Apr 13 11:59 AM
                Fischer, Thomas wrote:
                > since checking links to Wikipedia seems to be a legitimate task for
                > Xenu, shouldn't someone contact them and ask for the removal of the
                > robots.txt exclusion?

                I would suggest it's all about load on their servers - at the end of the
                day they are still a charity, so they won't have the finest and fastest
                servers in the world. Also remember that pages are stored in wiki markup -
                each page you ask for has to be converted to HTML for your browser to
                display. One Wikipedia page is likely to lead to a huge tree of pages being
                requested - editors are always asked to make sure that a page links to
                plenty of other pages.
                They do allow Google to crawl the site; I'm not sure if anyone else can do so.

                Ron Jones
                Process Safety & Development Specialist
                Don't repeat history, unreported chemical lab/plant near misses at
                http://www.crhf.org.uk Only two things are certain: The universe and
                human stupidity; and I'm not certain about the universe. ~ Albert
                Einstein
              • Ron Jones
                Message 7 of 15, Apr 13 12:58 PM

                  Just remembered - the pages are all set to "nofollow" - this is to stop
                  unscrupulous companies adding spam links to build up their Google ranking.
                  Any links added have no effect on the Google ranking.

                  Ron Jones
                  Process Safety & Development Specialist
                  Don't repeat history, unreported chemical lab/plant near misses at
                  http://www.crhf.org.uk Only two things are certain: The universe and
                  human stupidity; and I'm not certain about the universe. ~ Albert
                  Einstein
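
                  [Ron's "nofollow" point refers to the rel="nofollow" attribute that
                  Wikipedia puts on external links so that they pass no search-engine
                  ranking credit. A tiny, purely illustrative Python sketch - standard
                  library only, with made-up sample HTML - showing how such links can
                  be spotted:]

                  from html.parser import HTMLParser

                  class NofollowSpotter(HTMLParser):
                      """Print the href of every <a> tag that carries rel="nofollow"."""
                      def handle_starttag(self, tag, attrs):
                          if tag == "a":
                              d = dict(attrs)
                              if "nofollow" in (d.get("rel") or ""):
                                  print("nofollow link:", d.get("href"))

                  sample = '<p>See <a rel="nofollow" href="http://example.com/">this</a>.</p>'
                  NofollowSpotter().feed(sample)   # prints: nofollow link: http://example.com/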
                • Tilman Hausherr
                  Message 8 of 15, Apr 18 4:03 AM
                    Although Xenu isn't an SEO tool, it is being "misused" as such. A guy
                    asked to get the duration in milliseconds, and Google has recently
                    announced that the loading time of websites would be taken into
                    consideration.

                    A new beta version is here:
                    http://home.snafu.de/tilman/tmp/xenubeta.zip

                    This is just a test so you can see how it looks and give feedback. The
                    milliseconds value isn't saved in the .XEN file, nor in the export file.
                    (This will be done at a later time.) If you need the milliseconds
                    feature, please test it and give feedback about whether it is usable
                    or annoying.

                    Below are all the changes since the last regular version. If you'd like
                    to support me, please test it and give feedback.

                    Tilman

                    =====================

                    Major improvements:
                    24.2.2010: Check the domains of mail addresses (DNS lookup for MX
                    record)

                    Minor improvements:
                    7.12.2009: Include PARSETEST4 section in general release (convert
                    characters >80H to %XX, for "international" URLs)
                    19.12.2009: For "international" characters in local files: Use Unicode
                    for local directory search, URL launch in browser, read/check local
                    files
                    20.12.2009: But not for Windows 95/98/ME
                    22.12.2009: add ".class" for applets if needed, replace "." with "/".
                    example:
                    http://www.colorado.edu/physics/2000/applets/bec.html
                    27.12.2009: updated to NSIS 2.46
                    10.1.2010: use version 6 list column sort arrows on XP and higher
                    14.1.2010: added Description column
                    15.1.2010: added warning when settings overwritten by profile
                    16.1.2010: attempt at decoding .jar files for APPLET ARCHIVE thanks to
                    http://www.codeguru.com/cpp/cpp/cpp_mfc/article.php/c4049/
                    However:
                    - only one .jar archive per applet
                    - no unicode in file names
                    - name of archive must end with .jar
                    - .jar file must be internal, or the class link will
                    remain broken
                    - .class "in Jar" property isn't saved in .XEN file
                    (which prevents standard access in favor of waiting for .jar lookup)
                    24.1.2010: added <video src=
                    27.1.2010: improved list control divider double click (title is the
                    minimum)
                    26.2.2010: improved extra text in domain mail check
                    13.3.2010: Get page body only if not redirection or redirection but no
                    "Location:" in header
                    (should make PARSETEST3 fix superfluous)
                    16.3.2010: ...
                    30.3.2010: Abort box for ftp orphan search
                    2.4.2010: [Options] Accept="*/*" (default value)
                    14.4.2010: milliseconds in duration
                    (in progress; missing: export, save/load)

                    Bug fixes:
                    15.12.2009: PARSETEST4 section: replaced "> 80X" with ">= 80X"
                    20.12.2009: added version check for Unicode Clipboard and Sitemap for
                    Windows 95/98/ME (like 27.1.2009)
                    21.12.2009: corrected broken banner links
                    22.12.2009: tell "anchor occurs multiple times" only once per URL
                    4.1.2010: remove stuff after "?" in mailto: due to Microsoft error in
                    AfxParseURLEx()
                    10.1.2010: fixed list column sort arrows wrongly displayed in unsorted
                    columns (on 7, but not on XP)
                    12.1.2010: fixed "//" bug in applet codebase in local url
                    15.1.2010: disabled and unchecked "Inactive" checkbox after loading new
                    profile
                    18.1.2010: fixed title line of tab export
                    20.1.2010: Don't assume URLs to be UTF-8, use current charset instead
                    However: this solution isn't perfect, because the correct
                    charset of an URL would be the referring URL
                    But in most cases it will work, because URLs usually
                    have the same charset
                    Known bug: Root URL with exotic characters
                    20.1.2010: Corrected exotic URLs in sitemap
                    26.1.2010: Fixed % in file: URLs, only convert %XX
                    27.1.2010: "Conversion to lowercase" option uses codepage for conversion
                    31.1.2010: Fixed bug in report (max size + max size url), probably
                    introduced on 15.1.2010
                    15.3.2010: vNormalizeURL() with conversion to UTF8 prior to
                    AfxMyParseURL()
                    store URLs in UTF8, unless already ANSI or ISO-8859-1 (1252)
                    vRemovePercents for display only
                    3.4.2010: prevent reentrant calls to vDoIdle();
                    set fileNotFound status if tmp URL content file deleted by
                    antivirus software
                    10.4.2010: replaced "> 80X" with ">= 80X" in vAnsi2EntityEscaped()
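
                    [For anyone curious what a per-URL duration in milliseconds involves,
                    here is a rough, hypothetical sketch in Python - it is not Xenu's
                    actual implementation, and the user-agent string and example URL are
                    just placeholders - that times a single HEAD request:]

                    import time
                    import urllib.error
                    import urllib.request

                    def head_status_and_ms(url, user_agent="link-check-illustration"):
                        """Send a HEAD request and return (HTTP status, elapsed milliseconds)."""
                        req = urllib.request.Request(url, method="HEAD",
                                                     headers={"User-Agent": user_agent})
                        start = time.perf_counter()
                        try:
                            with urllib.request.urlopen(req, timeout=30) as resp:
                                status = resp.status
                        except urllib.error.HTTPError as err:
                            status = err.code                    # e.g. 403 or 404
                        elapsed_ms = (time.perf_counter() - start) * 1000.0
                        return status, elapsed_ms

                    # Example call (any URL will do):
                    # print(head_status_and_ms("http://home.snafu.de/tilman/"))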
                  • Fischer, Thomas
                    Message 9 of 15, Apr 26 3:23 AM
                      Hi Andy,

                      > There would be no point - Wikipedia always returns "200 OK"
                      > never "404 page not found".

                      That is not true.
                      While this may hold for searches, trying to access specific pages will give error messages.
                      E.g. http://en.wikipedia.org/wiki/QWER
                      gives "HTTP/1.0 404 Not Found"
                      before starting a redirect.
                      I assume that this would be the kind of link somebody might want to check using Xenu.

                      Thomas
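
                      [Thomas's example is easy to reproduce by hand: a low-level request
                      that does not follow redirects shows the raw status line the server
                      sends first. A minimal, purely illustrative sketch with Python's
                      http.client - the user-agent string is a placeholder, and the exact
                      status Wikipedia returns may of course change over time:]

                      import http.client

                      conn = http.client.HTTPConnection("en.wikipedia.org")
                      conn.request("HEAD", "/wiki/QWER",
                                   headers={"User-Agent": "status-check-illustration"})
                      resp = conn.getresponse()
                      print(resp.status, resp.reason)   # Thomas reports 404 Not Found here
                      conn.close()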

                    • Jack Stringer
                      Message 10 of 15, Apr 26 3:40 AM

                        There are a couple of thousand users using Xenu; if they all started
                        sending requests to the Wikipedia site, the servers would soon get
                        bogged down trying to deliver the pages. It's the same as those people
                        using website-copying software. I have had my photography gallery go
                        very, very slow at times just because someone is trying to hoover up
                        the pictures.

                        What would be nice is to find out from Wikipedia what changes need to be
                        made to Xenu to make it nicer to their systems, e.g. some sort of delay
                        when getting pages from Wikipedia servers.


                        Jack Stringer
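
                        [The per-request delay Jack describes is easy to sketch. The snippet
                        below is only an illustration, not a Xenu feature mentioned anywhere
                        in this thread; the delay value and user-agent string are arbitrary.]

                        import time
                        import urllib.error
                        import urllib.request

                        DELAY_SECONDS = 2.0   # arbitrary politeness delay between requests

                        def check_urls_politely(urls, user_agent="polite-check-illustration"):
                            """HEAD-check each URL, pausing between requests to spread the load."""
                            results = {}
                            for url in urls:
                                req = urllib.request.Request(url, method="HEAD",
                                                             headers={"User-Agent": user_agent})
                                try:
                                    with urllib.request.urlopen(req, timeout=30) as resp:
                                        results[url] = resp.status
                                except urllib.error.HTTPError as err:
                                    results[url] = err.code
                                except urllib.error.URLError as err:
                                    results[url] = "error: %s" % err.reason
                                time.sleep(DELAY_SECONDS)   # be gentle with the remote server
                            return results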
                      • Tilman Hausherr
                        Message 11 of 15, Apr 26 9:17 AM
                          On Mon, 26 Apr 2010 11:40:55 +0100, Jack Stringer wrote:

                          > What would be nice is to find out from Wikipedia what changes need to be
                          > made to Xenu to make it nicer to their systems, e.g. some sort of delay
                          > when getting pages from Wikipedia servers.

                          Xenu is already "nice", i.e. it makes a HEAD request, not a GET request.
                          My opinion is that the wikipedia software is crappy. The organisation is
                          mostly concentrated on collecting money, enforcing censorship, altering
                          history, and being busy with itself (many of the admins are just very
                          intelligent kids with too much time), instead of delivering a high
                          quality product by running a Continuous Improvement Process.

                          Tilman (holder of a scarlet letter from the wikipedia arb board :-))
                          http://en.wikipedia.org/wiki/User:Tilman


                        • Ron Jones
                          Message 12 of 15, Apr 26 12:52 PM
                            Tilman Hausherr wrote:
                            > My opinion is that the Wikipedia software is crappy. The organisation
                            > is mostly concentrated on collecting money, enforcing censorship,
                            > altering history, and being busy with itself (many of the admins are
                            > just very intelligent kids with too much time), instead of delivering
                            > a high-quality product by running a Continuous Improvement Process.

                            There are plenty of old admins, I can assure you :-)
                            The software is probably rough - it *is* still a charity, and due to the
                            mindless antics of loads of juvenile vandals it needs a large team of
                            vandal fighters (not just admins - there are only about 1,000 regular
                            ones) to keep the pages more or less intact. English Wikipedia has around
                            150-200 page changes per minute, and around 10% of those have to be
                            reverted, so the servers are already very busy, and I think allowing Xenu
                            in would grind them to a halt. If the Dutch mirrors go down and I have to
                            connect directly (from the UK) to the US servers, it can take 30 seconds
                            plus for a medium-sized page to load.

                            Ron Jones
                            Process Safety & Development Specialist
                            Don't repeat history, unreported chemical lab/plant near misses at
                            http://www.crhf.org.uk Only two things are certain: The universe and
                            human stupidity; and I'm not certain about the universe. ~ Albert
                            Einstein
                          • Tilman Hausherr
                            Message 13 of 15, May 6, 2010
                              Now it does save/export and restore the milliseconds value.
                              http://home.snafu.de/tilman/tmp/xenubeta.zip

                              Tilman


                            • Ven. S. Upatissa (g)
                              Message 14 of 15, May 6, 2010
                                On my local hard disk I have a folder containing hundreds of html files,
                                and an index.html file that contains links to all of them.

                                When I run Xenu on the index file, it correctly reports no broken links,
                                but it also reports that all of the other files are orphans.

                                Why is this? What am I doing wrong?

                                -Thanks
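
                                [One rough way to cross-check this locally is to compare the files in
                                the folder against the hrefs that index.html actually contains. The
                                sketch below is purely illustrative - it is not how Xenu's orphan
                                search works internally, and "site" is a placeholder folder name.]

                                import os
                                import re
                                from urllib.parse import unquote, urlparse

                                FOLDER = "site"          # placeholder: the folder with the HTML files
                                INDEX = os.path.join(FOLDER, "index.html")

                                with open(INDEX, encoding="utf-8", errors="replace") as fh:
                                    html = fh.read()

                                # Very rough href extraction; a real parser handles more cases.
                                linked = set()
                                for href in re.findall(r'href\s*=\s*["\']([^"\']+)["\']', html, flags=re.I):
                                    path = unquote(urlparse(href).path)
                                    if path.lower().endswith(".html"):
                                        linked.add(os.path.basename(path))

                                pages = {n for n in os.listdir(FOLDER) if n.lower().endswith(".html")}
                                print(sorted(pages - linked - {"index.html"}))   # orphan candidates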
                              • Tilman Hausherr
                                Message 15 of 15, May 6, 2010
                                  Don't know.... send it to me in a zip, and send me a .XEN file in a ZIP
                                  too, at tilman at snafu dot de.


                                  Tilman
