
Re: [xenu-usergroup] Broken Link Question

  • Bruce Hartford
    Message 1 of 30 , Oct 3, 2010
      On 10/3/2010 3:25 PM, Shiner wrote:
      > On 08/07/2010 16:57, Bruce Hartford wrote:
      >> I have a different question regarding broken links reported by Xenu.
      >>
      >> My website is http://crmvet.org. When I run Xenu against it and produce
      >> a report, it lists broken links by the page they are on, which is great
      >> and very useful. For example:
      >>
      >> http://crmvet.org/biblio.htm
      >> http://www.kimopress.com/
      >> \_____ error code: 12002 (timeout)
      >>
      >> Which I interpret to mean that the link to www.kimopress.com on the
      >> biblio.htm page is not working. So far, so good.
      >>
      >> But it ALSO lists broken links that do not appear to be on my website at
      >> all. For example:
      >>
      >> http://archives.cnn.com/1999/US/12/08/king.assassination.01
      >> http://archives.cnn.com/1999/US/12/08/king.assassination.01/
      >> \_____ error code: 404 (not found)
      >>
      >> http://cnnstudentnews.cnn.com/2000/LAW/06/08/henry.avants
      >> http://www.cnn.com/studentnews/2000/LAW/06/08/henry.avants
      >> \_____ error code: 404 (not found)
      >>
      >> This looks to me as if Xenu is reporting a broken link on the
      >> archives.cnn.com and cnnstudentnews.cnn.com sites. Am I misunderstanding
      >> this? If not, why is Xenu reporting on broken links that are not on my
      >> site?
      >
      > Do you have check external domains enabled?

      Yes, if I don't have that checked, it appears to only check the internal links between pages within my site. I want to check my outgoing links to other sites, but I don't want to check links ON those other sites.

      Bruce

    • Bruce Hartford
      Message 2 of 30 , Oct 4, 2010
        Well, now that I know that this is expected behavior, I'm cool. I did
        try to find this issue in the documentation, but if it's there I must
        have missed it. Perhaps something like, "When you look at your list of
        broken links (by page or link) you may see broken links from sites other
        than your own. The reason for this is [explanation]."

        Anyway, thanks for responding and clarifying that those mysterious
        broken links are not because I did something wrong.

        Bruce


        On 10/4/2010 1:47 AM, Tilman Hausherr wrote:
        > Sigh... this issue has really come up often in all these years, so maybe
        > I should really do something. It's a design issue, mostly. I can't place
        > all the URLs that link to the "mysterious" URL because it could be many,
        > and there might be a whole chain of redirections.
        >
        > What would be a solution?
        >
        > A small note like this?
        >
        >
        > Broken links, ordered by link:
        > http://archives.cnn.com/1999/US/12/08/king.assassination.01/
        > error code: 404 (not found), linked from page(s):
        > http://archives.cnn.com/1999/US/12/08/king.assassination.01
        > (Attention: the URL above redirects)
        >
        > 1 broken link(s) reported
        >
        > Return to Top
        >
        > Broken links, ordered by page:
        > http://archives.cnn.com/1999/US/12/08/king.assassination.01
        > (Attention: the URL above redirects)
        > http://archives.cnn.com/1999/US/12/08/king.assassination.01/
        > \_____ error code: 404 (not found)
        >
        > 1 broken link(s) reported
        >
        > Return to Top
        >
        >
        > Or a different text, like
        >
        > (Attention: see "List of redirected URLs" to find out more about this URL)
        > or
        > (Attention: enable "List of redirected URLs" to find out more about this
        > URL)
        > with an internal link to the section that deals with that URL.
        >
        > Then I'll probably get many mails from people who disabled that list in
        > the report but don't remember how to enable it :-(
        >
        > Tilman
        >
        >
        >
        > On Mon, 4 Oct 2010 09:55:29 +0200, "Fischer, Thomas"
        > <fischer@...-goettingen.de> wrote:
        >
        >> Hello Tilman,
        >>
        >> this could be the same issue as in my mail from 2010-05-20 (Very
        >> External Links Checked).
        >> It might be due to the information in Xenu's report. E.g. I got
        >>
        >> http://new.oberlin.edu/arts-and-sciences/departments/mathematics
        >> redirected to: http://new.oberlin.edu/arts-and-sciences/departments/mathematics/
        >> status code: 302 (object temporarily moved)
        >> linked from page(s):
        >> http://www.oberlin.edu/math/
        >>
        >> and found that I linked to
        >> http://www.oberlin.edu/math/
        >> which in turn redirects to
        >> http://new.oberlin.edu/arts-and-sciences/departments/mathematics/
        >> while http://www.oberlin.edu/math/ is out of my range.
        >> Could Xenu's information somehow be changed to reflect this situation
        >> more clearly?
        >>
        >> Best regards
        >> Thomas
        >> (and thanks for a great product!)
        >>
        >> From: xenu-usergroup@yahoogroups.com
        >> [mailto:xenu-usergroup@yahoogroups.com] On behalf of Tilman Hausherr
        >> Sent: Monday, 4 October 2010 05:56
        >> To: xenu-usergroup@yahoogroups.com
        >> Subject: Re: [xenu-usergroup] A different question about broken links
        >>
        >> See in the report, the redirection list. You probably link to these CNN
        >> sites that redirect.
        >>
        >> Tilman
        >>
        >> On Tue, 06 Jul 2010 08:41:40 -0700, Bruce Hartford wrote:
        >>
        >> > I have a different question regarding broken links reported by Xenu.
        >> >
        >> > My website is http://crmvet.org. When I run Xenu against it and produce
        >> > a report, it lists broken links by the page they are on, which is great
        >> > and very useful. For example:
        >> >
        >> > http://crmvet.org/biblio.htm
        >> > http://www.kimopress.com/
        >> > \_____ error code: 12002 (timeout)
        >> >
        >> > Which I interpret to mean that the link to www.kimopress.com on the
        >> > biblio.htm page is not working. So far, so good.
        >> >
        >> > But it ALSO lists broken links that do not appear to be on my website at
        >> > all. For example:
        >> >
        >> > http://archives.cnn.com/1999/US/12/08/king.assassination.01
        >> > http://archives.cnn.com/1999/US/12/08/king.assassination.01/
        >> > \_____ error code: 404 (not found)
        >> >
        >> > http://cnnstudentnews.cnn.com/2000/LAW/06/08/henry.avants
        >> > http://www.cnn.com/studentnews/2000/LAW/06/08/henry.avants
        >> > \_____ error code: 404 (not found)
        >> >
        >> > This looks to me as if Xenu is reporting a broken link on the
        >> > archives.cnn.com and cnnstudentnews.cnn.com sites. Am I misunderstanding
        >> > this? If not, why is Xenu reporting on broken links that are not on my
        >> > site?
        >> >
        >> > Thanks.
        >> >
        >> > Bruce
      • Fischer, Thomas
        Message 3 of 30 , Oct 7, 2010
          Hello Tilman,

          sorry about the mail format, I wasn't aware of the mess it creates!
          Anyway, there should be a difference between
          -- links on my site that fail
          -- links on my site that are redirected
          -- links on my site that are redirected to a failing URL

          For me, it would be easiest to have the failing redirections clustered with the working redirections and not with the errors on my site, since I will have to work on the URL that is first redirected and can't do anything about the wrong redirection itself. In particular, I would need to know *my* URL that causes the trouble, rather than the one on the distant site.
          I don't know if it is possible to keep track of working and not working redirections when checking URLs, but something like that would be my preferred solution.
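
The three categories listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Xenu works internally: the `RESULTS` table, its example.com URLs and status codes, and the function names `follow`, `classify` and `report` are all invented for the example. The point is that each result is reported under the URL that appears on *my* site, with the redirect chain listed underneath it.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical crawl results: URL -> (status code, redirect target or None).
# Every URL and status code below is invented for illustration.
Results = Dict[str, Tuple[int, Optional[str]]]

RESULTS: Results = {
    "http://example.com/gone": (404, None),
    "http://example.com/moved": (301, "http://example.com/new"),
    "http://example.com/new": (200, None),
    "http://example.com/old-news": (301, "http://example.com/old-news/"),
    "http://example.com/old-news/": (404, None),
}


def follow(url: str, results: Results, max_hops: int = 10) -> Tuple[List[str], int]:
    """Follow redirects from url; return (chain of URLs visited, final status)."""
    chain = [url]
    status, target = results[url]
    # Stop on a non-redirect, on a redirect loop, or after too many hops.
    while target is not None and target not in chain and len(chain) <= max_hops:
        chain.append(target)
        status, target = results[target]
    return chain, status


def classify(url: str, results: Results) -> str:
    """Put a link into one of the categories from the message above."""
    chain, status = follow(url, results)
    ok = 200 <= status < 300
    if len(chain) == 1:
        return "ok" if ok else "fails"
    return "redirected" if ok else "redirected to a failing URL"


def report(my_links: List[str], results: Results) -> List[str]:
    """List each link under the URL used on my own site, then its redirect hops."""
    lines = []
    for url in my_links:
        chain, status = follow(url, results)
        lines.append(f"{url}: {classify(url, results)} (final status {status})")
        for hop in chain[1:]:
            lines.append(f"    -> {hop}")
    return lines


if __name__ == "__main__":
    links_on_my_site = [
        "http://example.com/gone",
        "http://example.com/moved",
        "http://example.com/old-news",
    ]
    for line in report(links_on_my_site, RESULTS):
        print(line)
```

Keyed this way, a failing redirection shows up next to the link I actually have to edit, rather than under the distant URL it was redirected to.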

          Cheers
          Thomas

        • Tilman Hausherr
          Message 4 of 30 , May 16, 2011
            I hereby declare that today's beta is a release candidate.
            http://home.snafu.de/tilman/tmp/xenubeta.zip

            Please support me by testing your website with it.

            I will probably release the 1.3.9 version at the end of the week.

            Tilman
          • Jonathan Crane
            Message 5 of 30 , May 17, 2011

              Tilman, mind if I ask you to post the changes to an email message to the group?

              j

               

              Jonathan Crane

               

            • ultra_blue
              Message 6 of 30 , May 17, 2011
                Hi, Tilman:

                Is a change log available?

                Thanks!
                Greg


                --- In xenu-usergroup@yahoogroups.com, Tilman Hausherr <tilman@...> wrote:
                >
                > I hereby declare that todays beta is a release candidate.
                > http://home.snafu.de/tilman/tmp/xenubeta.zip
                >
                > Please support me by testing your website with it.
                >
                > I will probably release the 1.3.9 version at the end of the week.
                >
                > Tilman
                >
              • Tilman Hausherr
                Message 7 of 30 , May 17, 2011
                  On Tue, 17 May 2011 16:26:08 -0000, ultra_blue wrote:

                  >Hi, Tilman:
                  >
                  >Is a change log available?
                  >
                  >Thanks!
                  >Greg

                  Here's the "whatsnew" file:

                  1.3.9

                  Major improvements:
                  16.4.2011-25.4.2011: Output duplicate content, title, description in the
                  manager section

                  Minor improvements:
                  4.9.2010: excludeMSO behaviour without "/" now
                  6.9.2010: remove "reset entry" from context menu when inactive
                  7.9.2010: ContextMenuManager for VS2010
                  29.9.2010: Report: "List of valid *internal* URLs you can submit to a
                  search engine"
                  12.10.2010: If-Modified-Since option in INI file
                  18.10.2010: Max depth to 9999 instead of 999
                  23.11.2010: clarify include/exclude text for wildcard version
                  30.12.2010: don't open in internet archive etc when URL from internet
                  archive
                  1.3.2011: remove percents for URLs in properties dialog
                  3.3.2011: remove percents in URLs in report
                  9.4.2011: -post querystring for command line version
                  15.4.2011: Keywords meta tag column (absolutely useless for google & co,
                  but people believe in it)
                  14.5.2011: Skip ldap:
                  16.5.2011: about box with "FAQ" instead of "Click me", corrected main
                  window title

                  Bug fixes:
                  10.9.2010: process CSS comments
                  5.10.2010: remove quotes for charset component in HTTP header from nginx
                  14.10.2010: mtime, not ctime for local files
                  7.11.2010: Unicode comparison for local orphan search
                  26.2.2011: flag ICU_NO_ENCODE in AfxParseURLEx() because of bug on
                  Hebrew systems with "tav" character in URL
                  28.2.2011: repaint visible line if charset changed
                  3.3.2011: handle &#nnnn; and &#xnnnn; as unicode in ProcessLink() and
                  others
                  3.3.2011: don't reset charset for redirections
                  4.3.2011: &#xnnnn; handling for anchors
                  8.3.2011: can now check mail domains with no MX record, but with A
                  record
                  14.3.2011: corrected license text
                  14.5.2011: later read jar contents marked as InJar; exclude paths of
                  these from orphan check
                  14.5.2011: Fixed abort box for local orphan search
                  16.5.2011: Fixed toolbar gripper paint problem in XP

                  Misc:
                  1.3.2011: remove side effect in csRemovePercents()
                  4.3.2011: &#nnnn; central conversion routine
                  9.4.2011: remove FORMTEST twice, focus on POST query string when
                  checkbox set
                  15.4.2011: CLinkInfo Archive format version 16 (Keywords)
                  16.4.2011: CLinkInfo Archive format version 17 (MD5 hash)
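
The 1.3.9 entries above mention both duplicate-content output and an MD5 hash stored per link. How Xenu combines the two is not stated here, so the following is only a minimal sketch of one common approach, assuming pages count as duplicates when their bodies hash to the same MD5 value. The function name `find_duplicates` and the sample pages are hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def find_duplicates(pages: Dict[str, bytes]) -> List[List[str]]:
    """Group URLs whose page bodies have the same MD5 digest.

    pages maps URL -> raw page body. Returns one sorted list of URLs per
    group of identical bodies; pages with unique content are left out.
    """
    by_hash: Dict[str, List[str]] = defaultdict(list)
    for url, body in pages.items():
        by_hash[hashlib.md5(body).hexdigest()].append(url)
    return [sorted(urls) for urls in by_hash.values() if len(urls) > 1]


if __name__ == "__main__":
    # Invented sample data: two URLs serve byte-identical content.
    sample = {
        "http://example.com/index.htm": b"<html>Welcome</html>",
        "http://example.com/": b"<html>Welcome</html>",
        "http://example.com/about.htm": b"<html>About</html>",
    }
    for group in find_duplicates(sample):
        print("duplicate content:", ", ".join(group))
```

Storing only the digest keeps the per-URL record small, at the cost of catching only byte-identical pages.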



                  4.9.2010 (1.3.8)

                  Major improvements:
                  19.6.2010: check css @import statements within <STYLE>...</STYLE>
                  check url() elements within <STYLE>...</STYLE>
                  check url() element within STYLE=
                  (dedicated to The gorgeous Princess Victoria of
                  Sweden, whose
                  wedding to Clark Kent contributed that there's really
                  nothing on
                  television besides herself and the soccer world cup :-) )
                  See also who's got his hand:
                  http://thausherr.blogspot.com/2010/06/prinzessin-im-griff.html
                  20.6.2010: parse css files similar to <STYLE>...</STYLE>

                  Minor improvements:
                  1.7.2010: sort "broken page-local links" section in report
                  3.7.2010: url property dialog now resizeable
                  6.7.2010: mailto with empty rest => "mailto:", not "mailto:@".
                  24.7.2010: mailto:name%40host.com => mailto:name@...
                  25.7.2010: all mailto: URLs of a host with successful DNS lookup are set
                  to "skip type"
                  27.7.2010: ditto also for previously failed mailto: URLs of that
                  successfully looked-up host
                  27.7.2010: light green color for "mail host ok", which replaces text
                  "skip type" for mailto:
                  7.8.2010: renamed "maximum level" to "maximum depth"
                  14.8.2010: GraphViz only for "ok" links

                  Misc:
                  20.6.2010: changed link counting method, now in AddUrl
                  4.7.2010: clean possible memory leaks when finishing; FreeLibrary() for
                  DNSAPI.DLL
                  7.7.2010: changed toolbars slightly, preparations for VS2010
                  20.7.2010: for VS2010, expand application class with virtual INI
                  functions because I hate the registry
                  15.8.2010: "#" as error (not in public release)
                  24.8.2010: DLL security: fully qualified path for LoadLibrary()

                  Bug fixes:
                  20.6.2010: Lower case in check for .gif, .png etc
                  23.7.2010: corrected bug in change from 25.5.2010 "set recent URL list
                  to 100 instead of 10"
                  1.8.2010: correct bug about CCriticalSection usage for ServerMap and
                  CharsetMap
                  2.9.2010: fix for false alert in VS2010 buffer overflow check


                  12.6.2010 (1.3.7)

                  Minor improvements:
                  12.6.2010: .class files that are in an external .jar file are marked as
                  skipped
                  ".class in Jar" property is now saved in .XEN file

                  Bug fixes:
                  14.6.2010: correct skip of ".class in Jar" property when choosing next
                  thread
                  set all unhandled ".class in Jar" URLs as "not found" when
                  all else done

                  Misc:
                  12.6.2010: CLinkInfo Archive format version 15 (".class in Jar"
                  property)



                  11.6.2010 (1.3.6)

                  Major improvements:
                  24.2.2010: Check the domains of mail addresses (DNS lookup for MX
                  record)

                  Minor improvements:
                  7.12.2009: Include PARSETEST4 section in general release (convert
                  characters >80H to %XX, for "international" URLs)
                  19.12.2009: For "international" characters in local files: Use Unicode
                  for local directory search, URL launch in browser, read/check local
                  files
                  20.12.2009: But not for Windows 95/98/ME
                  22.12.2009: add ".class" for applets if needed, replace "." with "/".
                  example:
                  http://www.colorado.edu/physics/2000/applets/bec.html
                  27.12.2009: updated to NSIS 2.46
                  10.1.2010: use version 6 list column sort arrows on XP and higher
                  14.1.2010: added Description column
                  15.1.2010: added warning when settings overwritten by profile
                  16.1.2010: attempt at decoding .jar files for APPLET ARCHIVE thanks to
                  http://www.codeguru.com/cpp/cpp/cpp_mfc/article.php/c4049/
                  However:
                  - only one .jar archive per applet
                  - no unicode in file names
                  - name of archive must end with .jar
                  - .jar file must be internal, or the class link will
                  remain broken
                  - .class "in Jar" property isn't saved in .XEN file
                  (which prevents standard access in favor of waiting for .jar lookup)
                  24.1.2010: added <video src=
                  27.1.2010: improved list control divider double click (title is the
                  minimum)
                  26.2.2010: improved extra text in domain mail check
                  13.3.2010: Get page body only if not redirection or redirection but no
                  "Location:" in header
                  (should make PARSETEST3 fix superfluous)
                  16.3.2010: ...
                  30.3.2010: Abort box for ftp orphan search
                  2.4.2010: [Options] Accept="*/*" (default value)
                  14.4.2010-6.5.2010: milliseconds in duration
                  12.5.2010: reset e-mail flag when loading .XEN file, because if set it
                  would mail and quit after loading a finished job
                  12.5.2010: include link text in report (LINKTEXT compile option)
                  25.5.2010: set recent URL list to 100 instead of 10
                  3.6.2010: version nr. in report
                  6.6.2010: show count of included / excluded URLs in the report
                  6.6.2010: Abort box for orphan search always

                  Bug fixes:
                  15.12.2009: PARSETEST4 section: replaced "> 80X" with ">= 80X"
                  20.12.2009: added version check for Unicode Clipboard and Sitemap for
                  Windows 95/98/ME (like 27.1.2009)
                  21.12.2009: corrected broken banner links
                  22.12.2009: tell "anchor occurs multiple times" only once per URL
                  4.1.2010: remove stuff after "?" in mailto: due to Microsoft error in
                  AfxParseURLEx()
                  10.1.2010: fixed list column sort arrows wrongly displayed in unsorted
                  columns (on 7, but not on XP)
                  12.1.2010: fixed "//" bug in applet codebase in local url
                  15.1.2010: disabled and unchecked "Inactive" checkbox after loading new
                  profile
                  18.1.2010: fixed title line of tab export
                  20.1.2010: Don't assume URLs to be UTF-8, use current charset instead
                  However: this solution isn't perfect, because the correct
                  charset of an URL would be the referring URL
                  But in most cases it will work, because URLs usually
                  have the same charset
                  Known bug: Root URL with exotic characters
                  20.1.2010: Corrected exotic URLs in sitemap
                  26.1.2010: Fixed % in file: URLs, only convert %XX
                  27.1.2010: "Conversion to lowercase" option uses codepage for conversion
                  31.1.2010: Fixed bug in report (max size + max size url), probably
                  introduced on 15.1.2010
                  15.3.2010: vNormalizeURL() with conversion to UTF8 prior to
                  AfxMyParseURL()
                  store URLs in UTF8, unless already ANSI or ISO-8859-1 (1252)
                  vRemovePercents for display only
                  3.4.2010: prevent reentrant calls to vDoIdle();
                  set fileNotFound status if tmp URL content file deleted by
                  antivirus software
                  10.4.2010: replaced "> 80X" with ">= 80X" in vAnsi2EntityEscaped()
                  30.4.2010: changed user agent with "/" as requested in
                  http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.43
                  and
                  http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.8
                  6.6.2010: add milliseconds in sum for manager statistics avg
                  calculation
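
The PARSETEST4 entries above (convert characters >80H to %XX, later corrected from "> 80X" to ">= 80X") describe percent-escaping of non-ASCII bytes in "international" URLs. A minimal sketch of that conversion, assuming the URL is first encoded in some charset and every byte at or above 0x80 is then escaped; the function name `escape_high_bytes` is hypothetical and this is not Xenu's code.

```python
def escape_high_bytes(url: str, charset: str = "utf-8") -> str:
    """Percent-escape every byte >= 0x80 of url, encoded in charset.

    ASCII bytes are passed through unchanged, including any '%' that is
    already present in the URL.
    """
    out = []
    for byte in url.encode(charset):
        if byte >= 0x80:  # ">=" matters: byte 0x80 itself must be escaped too
            out.append("%{:02X}".format(byte))
        else:
            out.append(chr(byte))
    return "".join(out)


if __name__ == "__main__":
    print(escape_high_bytes("http://example.com/caf\u00e9"))
    print(escape_high_bytes("http://example.com/caf\u00e9", "latin-1"))
```

The same URL escapes differently depending on the charset, which is why the 20.1.2010 entry about not assuming UTF-8 matters.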


                  Misc:
                  14.1.2010: CLinkInfo Archive format version 12 (Description)
                  15.1.2010: CLinkInfo Archive format version 13 (size now 64 bit value)
                  27.1.2010: OnNewDocument() with vNormalizeURL() instead of
                  AfxMyParseURL()
                  29.1.2010: OnNewDocument(): moved duplicate code to end
                  5.5.2010: CLinkInfo Archive format version 14 (milliseconds)
                  6.6.2010: MinSize, MaxSize unsigned


                  5.12.2009 (1.3.5)

                  Bug fixes:
                  4.12.2009: Skip xmpp: and others properly
                  4.12.2009: fixed another *.LNK file loss bug in NSIS script that would
                  occur when installing in existing folder

                  Misc:
                  5.11.2009: processorArchitecture="*" in manifest
                  28.11.2009: improved error messages for MultiByteToWideChar()
                  29.11.2009: updated to NSIS 2.45
                  1.12.2009: About box with correct spelling: "Xenu's"
                  5.12.2009: created this version on new PC


                  5.11.2009 (1.3.4)

                  Minor improvements:
                  30.5.2009: ignore "view-source:"
                  1.6.2009: set SECURITY_FLAG_IGNORE_REVOCATION after
                  ERROR_INTERNET_SEC_CERT_REV_FAILED (works only the first time, sadly)
                  1.6.2009: ErrorDlg for ERROR_INTERNET_SEC_CERT_REV_FAILED only if
                  SECURITY_FLAG_IGNORE_REVOCATION not set
                  5.6.2009: set up minimum status line segment widths
                  26.7.2009: Use local timezone when displaying date+time of website,
                  instead of GMT
                  29.8.2009: show time status every second
                  9.10.2009: mention empty URLs in report to avoid confusion

                  Bug fixes:
                  20.10.2009: ignore MIME type and charset when result not HTTP_STATUS_OK
                  5.11.2009: fixed /S setup.exe bug in NSIS script

                  Misc:
                  1.6.2009: ErrorDlg (certificates etc) now from app window, not desktop
                  window
                  9.10.2009: Test monocolor
                  15.10.2009: merged AFX_INET_SERVICE_HTTP and AFX_INET_SERVICE_HTTPS in
                  ThreadProcGET()
                  16.10.2009: tired of character in version number, now using digits
                  31.10.2009: VS2010 fixes: PFNCALLBACK, OnTimer (UINT_PTR), LRESULT
                  OnFindReplace, INT_PTR lHnd
                  3.11.2009: CLinkInfo Archive format version 11 (m_iThisURL ->
                  m_dwThisURL)

                  25.4.2009 (1.3c)

                  Bug fixes:
                  18.4.2009: Changed behaviour of google sitemap creation - only convert
                  five characters to ampersand
                  21.4.2009: Changed behaviour of google sitemap creation - Convert >80H
                  characters to %XX
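
A rough sketch of the two sitemap rules above, under stated assumptions: the five characters are taken to be `&'"<>` (as listed in the 11.3.2008 entry), bytes are taken from a UTF-8 encoding of the URL, and the threshold is assumed to be 0x80 and above. The function name is hypothetical and this is not Xenu's implementation.

```python
# The five characters escaped as XML entities (assumed from the 11.3.2008 entry).
_XML_ENTITIES = {
    "&": "&amp;",
    "'": "&apos;",
    '"': "&quot;",
    "<": "&lt;",
    ">": "&gt;",
}


def escape_sitemap_url(url):
    """Entity-escape five XML characters and percent-encode bytes >= 0x80."""
    out = []
    for byte in url.encode("utf-8"):
        ch = chr(byte)
        if byte >= 0x80:
            out.append("%{:02X}".format(byte))  # non-ASCII byte -> %XX
        elif ch in _XML_ENTITIES:
            out.append(_XML_ENTITIES[ch])       # XML-special -> entity
        else:
            out.append(ch)                      # plain ASCII passes through
    return "".join(out)


print(escape_sitemap_url("http://example.com/a?x=1&y=\u00fc"))
```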

                  Minor improvements:
                  21.4.2009: Only take the first TITLE, not later TITLEs


                  18.4.2009 (1.3b)
                  Install this version if you're in China, use Windows 95/98/ME, or check
                  sites with over a million URLs.

                  Minor improvements:
                  23.12.2008: ErrorDlg for ERROR_INTERNET_SEC_CERT_DATE_INVALID
                  24.12.2008: http:// within the path, like in archive.org URLs is never a
                  "//" error (see 30.11.2006)
                  25.12.2008: ErrorDlg for ERROR_INTERNET_SEC_CERT_CN_INVALID and
                  ERROR_INTERNET_SEC_CERT_REV_FAILED
                  27.12.2008: Optimized array growth
                  28.12.2008: Started IGNORETITLES compile option to save memory (ignores
                  titles and server name)
                  28.12.2008: Optimized charset, use ref in global hash table instead of
                  CString (saves memory)
                  29.12.2008: IGNORETITLES now ignores externals totally
                  29.12.2008: improved speed for collecting pending URLs within visible
                  section of xenu window
                  29.12.2008: CCriticalSection for CharsetMap
                  29.12.2008: Optimized server software name, use ref in global hash table
                  instead of CString (saves memory)
                  1.3.2009: Use PARSETEST version for general release
                  18.4.2009: New NSIS installer script with much help from Andrey
                  Aleksanyants

                  Bug fixes:
                  1.1.2009: Don't drop all input if </a> missing after <a...>
                  12.1.2009: Made own upper case conversion (for users in China)
                  27.1.2009: added version check, because Unicode API calls don't work for
                  Windows 95/98/ME
                  1.3.2009: corrected NL before charset in TAB export
                  31.3.2009: corrected bug in high port numbers
                  7.4.2009: fixed bug for PARSETEST version, moved replace "/./" with "/"
                  higher in AfxMyParseURL()
                  17.4.2009: report file not created message

                  Misc:
                  26.1.2009: replaced InitCommonControls() with InitCommonControlsEx()
                  17.4.2009: attempt at html FORMs with POST query string (for FORMTEST
                  version only)

                  20.12.2008 (1.3)
                  I've decided to call this version 1.3 instead of 1.2k. The international
                  charset, the google maps,
                  and the "ID=" anchor were widely requested, so I guess this is really a
                  good leap forward.

                  Major improvements:
                  - 22.1.2008: UTF-8 in Xenu Window and report
                  - 23.1.2008: all charsets in Xenu Window and report
                  - 2.2.2008: parse charset meta tag (note that header settings have
                  priority!)
                  - 2.2.2008: improved speed of charset handling by using hash table
                  - 29.2.2008: Google Sitemap
                  - 25.7.2008: parse ID= anchor
                  - 23.11.2008, 6.12.2008: GraphViz export

                  Minor improvements:
                  - 20.10.2007: Updated to InnoSetup 5.2.1
                  - 28.11.2007: decode "encrypted" mailto (Use parse result for
                  AFX_INET_SERVICE_MAILTO)
                  - 1.12.2007: Updated to InnoSetup 5.2.2
                  - 14.3.2007: Updated to InnoSetup 5.2.3
                  - 22.1.2008: CLinkInfo Archive format version 10 (charset)
                  - 8.3.2008: Accept-Language option (command line version only)
                  -language en / de / ...
                  - 10.4.2008: removed wsprintf() calls
                  - 8.5.2008: passive ftp option for ftp URLs too (previously just in
                  Orphan check) *** unfinished, not saved in .XEN; Version 11 ***
                  - 31.5.2008: new icon, sponsored by www.hitflip.de, designed by Dominic
                  Raths
                  - 8.6.2008: InternetCloseHandle() with trace, also trace in ftp stuff
                  - 9.6.2008: ShellExecute() separated in File and Path; new
                  vShellLaunchURL() function
                  - 29.6.2008: Google Sitemaps: higher priority for root URL
                  - 5.7.2008: Don't quit if mail fails
                  - 5.7.2008: xenulog.txt with date, too
                  - 5.7.2008: Improved email dialogbox (disable fields)
                  - 25.7.2008: Skip </...> in parser segment; changed bParseAnchorTag() to
                  be more general
                  - 5.8.2008: report: detect and report redirection loops
                  - 9.9.2008: IMG LONGDESC
                  - 10.10.2008: Abort Dialog for sitemap
                  - 15.10.2008: Fixed C++ language issues (scope of variables in 'for'
                  loop) for VS2008; #define _WIN32_IE 0x0400
                  - 17.10.2008: manifest for common control XP look and feel;
                  HOLLOW_BRUSH for Bitmap in Tip Dialog, solves problem in
                  2005 attempt
                  http://tech.groups.yahoo.com/group/xenu-usergroup/message/445
                  positive side effect: can now display exotic
                  charset in text control
                  - 18.10.2008: manifest resource
                  http://www.codeproject.com/KB/winsdk/xptheme.aspx
                  - 18.10.2008: sort list of redirections in report
                  - 31.10.2008: GetExitCodeThread() result when not STILL_ACTIVE
                  - 2.11.2008: added <OBJECT DATA="...">
                  - 6.12.2008: Added column to pagemap

                  Bug fixes:
                  - 13.12.2007: catch empty URL in HREF etc
                  - 14.2.2008: corrected WCHAR divider size bug in MakeShortStringW()
                  - 11.3.2008: Google Sitemap only for internal URLs; escape &'"<>
                  - 26.3.2008: CXenuDoc()::m_bCheckExternal set to profile value
                  - 9.9.2008: corrected bug in mailto (user name was missing)
                  - 9.9.2008: loop detection algorithm in redirection report sometimes
                  had an endless loop itself
                  - 24.9.2008: check for ";" removed in ParseImgTag() and ParseAnchorTag()
                  - 26.9.2008: reset charset only for HTML with bodies,
                  because of http://www.adventure-inn.com/ch/description/
                  - 1.10.2008: URLs are also UTF-8
                  - 1.10.2008: Clipboard URL copy in Unicode format
                  - 5.10.2008: IDC_URL in Property Dialog also UTF-8
                  - 11.10.2008: All fields in Property Dialog now in UTF-8
                  - 2.11.2008: No double separator in context menu for local files with
                  non-existing MIME types
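
The redirection loop entries (5.8.2008 and 9.9.2008) suggest a chain walk that must itself be guaranteed to stop. One common way to do that, shown here as a sketch and not as Xenu's actual algorithm, is to remember every URL already visited. The `redirects` mapping and the function name are hypothetical.

```python
from typing import Dict, List, Optional


def find_redirect_loop(redirects: Dict[str, str], start: str) -> Optional[List[str]]:
    """Follow redirects from `start`; return the URLs forming a loop, or None.

    `redirects` maps a URL to the URL it redirects to (illustrative data
    structure). The `seen` index guarantees termination: each URL is
    visited at most once before the walk stops.
    """
    seen = {}    # URL -> position in chain
    chain = []   # URLs in the order they were followed
    url = start
    while url in redirects:
        if url in seen:
            return chain[seen[url]:]  # the part of the chain that repeats
        seen[url] = len(chain)
        chain.append(url)
        url = redirects[url]
    return None  # chain ended at a URL that does not redirect


print(find_redirect_loop({"a": "b", "b": "c", "c": "b"}, "a"))
```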


                  Misc:
                  - 19.10.2008: ShellExecute with 0 as first param
                  - 7.11.2008: Orphan size as LONGLONG
                  - 19.11.2008: PARSETEST3 version for % stuff in redirections


                  8.10.2007 (1.2j)
                  Major improvements:
                  - 5.6.2007: second options pane with 7 "secret" settings
                  - 7.7.2007: up/down sort symbol on column header
                  http://www.codeguru.com/cpp/controls/listview/advanced/article.php/c4179/
                  Minor improvements:
                  - 4.10.2006: visible URLs are first in new threads
                  - 4.10.2006: update listctrl when "busy" is set
                  - 7.10.2006: 2nd part of report more efficient for huge sites
                  - 12.10.2006: REMOVEDOUBLESLASH compile option removes "/../" too
                  - 15.10.2006: application/xhtml+xml is hypertext, too
                  - 15.10.2006: Updated to InnoSetup 5.1.8
                  - 30.10.2006: Skip aim://, ymsgr://, rtsp://, xmpp://
                  - 30.11.2006: better error message for ShellExecute() errors
                  - 30.11.2006: "//" in URL after the host name is not "broken" when after
                  a "?"
                  - 8.1.2007: Max title length 1024
                  - 16.1.2007: ftp dialogbox wider
                  - 19.1.2007: [Options] MakeLowerCase=1 ==> converts all URLs to lower
                  case
                  (default is 0)
                  - 3.3.2007: [Options] ListLocalDirectories=1 ==> local directory
                  listing (default is 0)
                  - ??.3.2007: [Options] AllowLocalFilesInRemoteCheck=1 ==> Allow
                  file:// links in remote check
                  (default is 0)
                  - 16.3.2007: Skip callto:
                  - 25.3.2007: meta generator
                  - 31.3.2007: Upgraded to InnoSetup 5.1.11
                  - 31.3.2007: Title TrimRight()
                  - 31.3.2007: update listctrl when title becomes known
                  - 31.3.2007: convert titles in sitemap to &...; notation
                  - 1.4.2007: Added most of
                  http://www.htmlhelp.com/reference/html40/entities/special.html to
                  conversion table
                  - 29.5.2007: "asterisk" sound when done
                  - 2.6.2007: -save option for command line version to save .XEN file
                  (does overwrite)
                  - 2.6.2007: all command line options for command line version can now be
                  combined
                  - 5.6.2007: MakeLowerCase, vNormalizeURL() slightly changed internally
                  - 6.6.2007: .XEN Archive version 10
                  - 8.6.2007: "Autostart" feature when opening .XEN file
                  - 8.6.2007: all command line options for command line version can be
                  used when opening .XEN file
                  - 28.7.2007: retry feature in command line version (test)
                  - 3.8.2007: Upgraded to InnoSetup 5.1.13
                  - 15.8.2007: reset sort icon, and vUpdateColumnSortIcon() at InsertAll()

                  Bug fixes:
                  - 7.12.2006: check for iIndex < pList->GetItemCount()
                  - 13.2.2007: corrected bug in ListLocalDirectories feature (last file
                  ignored)
                  - 15.2.2007: wildcard version adds "*" at the end of each entry in
                  "Check URL list"
                  - 23.5.2007: aim: instead of aim://
                  - 20.8.2007: remove "file://" for ShellExecute()
                  - 21.9.2007: % size corrected in statistic (was % count!)
                  - 22.9.2007: fixed CFindFile security leak,
                  http://goodfellas.shellcode.com.ar/own/VULWKU200706142

                  1.10.2006 (1.2i)
                  Major improvements:
                  (none)
                  Minor improvements:
                  - 25.6.2006: Property dialogbox with count
                  - 25.7.2006: Added orphan size
                  - 19.8.2006: PARSETEST2 compile option (restore all %XX, like 23.7.2005)
                  - 19.8.2006: Updated to InnoSetup 5.1.6
                  - 9.9.2006: NEW Dialog Box wider
                  - 16.9.2006: [Options] MaxRetry
                  - file %TEMP%\XENULOG.TXT for people who have trouble launching the
                  browser
                  (the file is not automatically sent to anyone, this must be done
                  manually)
                  Bug fixes:
                  - 6.6.2006: vNormalizeURL (csBaseURL);
                  - 24.7.2006: Microsoft bug in CStdioFile::ReadString workaround
                  (happened with files with a multiple of 128 with no CR on
                  last line)
                  http://www.mpdvc.de/html.htm#Q71
                  http://avensoft.biz/kb/kbDetail.wsp?kb_id=162
                  - 26.7.2006: Added missing HTTP status codes (412-415)
                  - 30.7.2006 - 2.8.2006: corrected many HTML errors in report (Thanks
                  Spike!)
                  - 14.9.2006: corrected bug in 18.3.2006 feature that made Xenu slow when
                  unfinished URLs only at the bottom of huge URL list
                  - 18.9.2006: corrected error handling for smtp.Connect()
                  - 10.11.2006: total elapsed hours, instead of modulo 24 in status line
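
The 10.11.2006 fix is about hours wrapping at 24. A minimal sketch of the corrected idea, with a hypothetical function name and output format (the actual status line layout is not given in the changelog):

```python
def format_elapsed(total_seconds):
    """Format elapsed time with total hours, not hours modulo 24."""
    hours, remainder = divmod(total_seconds, 3600)  # hours may exceed 23
    minutes, seconds = divmod(remainder, 60)
    return "{}:{:02d}:{:02d}".format(hours, minutes, seconds)


print(format_elapsed(25 * 3600 + 61))
```

A run of 25 hours, 1 minute and 1 second is shown as 25 hours and not as 1 hour.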

                  2.6.2006 (1.2h)
                  Major improvements:
                  - Tip of the day
                  Minor improvements:
                  - ALT part of <IMG > used for the title column
                  - [Options] FailSimilarHosts=0 (current behaviour and default is 1)
                  - more statistics for managers (min size with link, max size with link,
                  avg size)
                  - "In Links" and "Out Links" in headings for better readability when
                  small
                  - correct error message for empty ftp orphan directories
                  - error message for empty local orphan directories
                  - error message for non existing local orphan directories
                  - orphan list sorted
                  - (Test / by request only) IGNOREFRONTPAGEORPHANS
                  - ftp host field allows port number, ftphostname:port
                  - ftp dialog fields stored in .INI file
                  - ftp default page (e.g. index.html, home.html, default.asp, etc)
                  - ftp dialog does not appear when Xenu is launched with "-url", but is
                  still available in "corporate" version
                  - ReportBroken2 more efficient
                  - 8.6.2005 Updated to InnoSetup 5
                  - slight change in .TAB format: Status-Code and Status-Text instead of
                  Status only
                  - prevent empty input in NEW dialog
                  - Ignore "error" HTTP_STATUS_ACCEPTED (for user with VMware, host Fedora
                  Core 9 who has NAT problems)
                  - changed handling of "%XX" with file:// orphan files
                  - 23.7.2005: AfxMyParseURL removes "%XX" with file:// URLs
                  - include/exclude wildcard test thanks to
                  http://www.codeproject.com/string/wildcmp.asp
                  - better text for ftp orphan dialog
                  - 18.3.2006: currently selected URL is first next new thread
                  - 19.3.2006: ftp/gopher segment only when such URLs exist
                  - 19.3.2006: put include/exclude settings into report
                  - 1.4.2006: link to Google Sitemaps in report
                  Bug fixes:
                  - in file://///UNC-Host/Share, leading "//" is not an error
                  - &#xnnn; now recognised (in addition to &#nnn;)
                  - vNormalizeURL() when reading URL List
                  - need space or semicolon before a "name", "href", etc
                  - process % when checking an ftp URL on an ftp server
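
The `&#xnnn;` entry above concerns hexadecimal numeric character references alongside the decimal `&#nnn;` form. A sketch of decoding both, with hypothetical names; it deliberately leaves named entities such as `&amp;` alone and is not Xenu's `vReplaceAmpStuff()`.

```python
import re

# Group 1: hexadecimal digits after "&#x"; group 2: decimal digits after "&#".
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def decode_numeric_refs(text):
    """Replace &#nnn; and &#xnnn; references with the characters they name."""
    def replace(match):
        if match.group(1) is not None:
            code = int(match.group(1), 16)
        else:
            code = int(match.group(2))
        try:
            return chr(code)
        except (ValueError, OverflowError):
            return match.group(0)  # out-of-range code point: keep as written
    return _NUMERIC_REF.sub(replace, text)


print(decode_numeric_refs("caf&#233; caf&#xE9; &amp;"))
```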


                  18.3.2005 (1.2g)
                  Major improvements:
                  - Attempt at javascript thanks to
                  http://www.codeguru.com/Cpp/Cpp/string/regex/article.php/c2779/
                  details explained at
                  http://home.snafu.de/tilman/xenulink.html#javascript
                  Minor improvements:
                  - [Options] ExcludeMSO=1 and Xenu ignores URLs that end with
                  /filelist.xml
                  /editdata.mso
                  /oledata.mso
                  - Show elapsed time in status bar [15.1.2005 changed archiving format]
                  - TARGET=_blank instead of TARGET=Xenu in report
                  - New Version 2.44 of CSMTPConnection http://www.naughter.com/smtp.html
                  - "//" in local files is always an error
                  - mailed report as "XXXX.htm" instead of "XXXX.tmp.htm"
                  - Version String in .XEN file
                  - vTimeoutSimilarHosts() more efficient with huge sites
                  - Faster local link checking (no copying to %temp% file)
                  - HTTP_STATUS_REDIRECT_KEEP_VERB (307)
                  - ERROR_INTERNET_CLIENT_AUTH_CERT_NEEDED (12044) error handling
                  - passive ftp mode in orphan dialog box
                  - Send XENU.INI file as mail test instead of CONFIG.SYS
                  - Orphan check also for https://
                  - "New" Dialogbox can be used to enter a ftp link (no crawling!)
                  - Cookies allowed when [Options] AllowCookies=1
                  don't use this if you have links that delete or change something!
                  - (Test / by request only) PARSETEST, ORPHANS_CASEINSENSIVE
                  Bug fixes:
                  - better error handling for error 12003 in FTP orphan check
                  - _findclose in local orphan check (to unlock directory!)
                  - /> bug fixed in META REFRESH
                  - &# handling in vReplaceAmpStuff() and in bProcessLink()
                  - handle redirection target as a possibly relative link
                  - No empty URLs in URL list
                  - date, size for file:///
                  - alexa, google cache and wayback only for http:// and https://
                  - offset in ParseTag as int instead of short for tags > 64K
                  - cut off after '?' in remote orphan check
                  - exclude excluded URLs in Orphan list
                  - WINVER 0x0400

                  6.8.2004 (1.2f)
                  Major improvements:
                  - Real setup (InnoSetup 4)
                  - Status code for redirections
                  - Context menu: Open in Google Cache
                  - Context menu: Open in Wayback Machine
                  - Context menu: Open Alexa
                  Minor improvements:
                  - report as "XXXX.htm" instead of "XXXX.tmp.htm"
                  - Max-Level also "connected" to the URL
                  - Compiled on Windows XP
                  - List of unfinished threads when closing
                  - Don't display ODP context menu for broken http://editors.dmoz.org
                  links
                  - "Display error" mentioned in Properties Box when too many links
                  - Look for subdirectories when doing orphan searches
                  - Remove "file:///" when launching local URLs without DDE
                  - Change "\" to "/" for "file://" URLs because of problems with Opera
                  7.5
                  Bug fixes:
                  - Deletes TGH*.* files also when limited number of levels
                  - Can work with http://www.dbdebunk.com: "location: " instead of
                  "Location: "
                  - Correct time in report (minute and second were mismatched)
                  - ReportStatistics and ReportOrphans flags in .XEN file
                  - No error message when click "not on a line"
                  - Prevent re-entrancy of vAttention() when e-mailing report
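
The "location: " versus "Location: " fix reflects that HTTP header names are case-insensitive. A minimal sketch of a case-insensitive lookup over a raw header block; the function name and the raw-text input are assumptions for illustration only.

```python
def get_header(raw_headers, name):
    """Return the value of the first header matching `name`, ignoring case."""
    wanted = name.lower()
    for line in raw_headers.splitlines():
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep and key.strip().lower() == wanted:
            return value.strip()
    return None  # header not present


response = "HTTP/1.1 302 Found\r\nlocation: http://example.com/new\r\n"
print(get_header(response, "Location"))
```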

                  28.9.2003 (1.2e)
                  Major improvements:
                  - Remote Orphans
                  - Bugfix for sites > 65535 links: m_FromTab set to 32bit
                  - timeout feature (default: 60 secs)
                  - STOP button in addition to the PAUSE toolbar button
                  - Scan https:// websites with bad certificate (ERROR_INTERNET_INVALID_CA
                  = 12045)
                  - Validate URL with right mouse click
                  Minor improvements:
                  - Skip irc://, mms://, rtsp://, pnm://, wtai://
                  - <hr> instead of "=========" in report
                  - </li> in report, so that it is correct HTML
                  - "Normalization" of URLs in include/exclude list
                  - Len = 0 when file error with http GET
                  - OpenRequest() with INTERNET_FLAG_NO_COOKIES
                  - Site Map recursion warning
                  - "//" in URL after the host name is not "broken" when part of
                  "http://" or "https://"
                  - empty line in report after local link error for a page
                  - Local orphans case insensitive
                  - Automatic retries only when m_bBusy
                  - CInternetSession local to thread, to make STOP possible
                  - "http://dmoz.org" instead of "http://dmoz.org/" comparison, to avoid
                  extra menu item for dmoz-internal links
                  - Properties at right-mouse-key always the last item
                  - Make current item visible after sort
                  - More random spidering to balance the load
                  - URL sort case-insensitive
                  - Buffer overflow bug in unknown errors removed
                  [29.5.2005 reinstalled VC++ after HD loss]

                  14.9.2002 (1.2d)
                  - "//" in URL after the host name is not "broken" when "http://" or
                  "https://" after a "?"
                  - Corrected bug that local non-HTML files would be downloaded in full
                  - Corrected GUI bug in "new" dialog
                  - Converted %5F to _
                  - Change in cmdline version about profile reading
                  (Matching now done before Normalization)

                  16.7.2002 (1.2c)
                  - <BLOCKQUOTE CITE
                  - Consider nonexistent types like "httttp" as "not found"
                  - Editing of ODP websites in the right-mouse-menu
                  (useful for editors at http://dmoz.org)
                  - For local files, launch related applications (e.g. viewer, editor)
                  with the right-mouse-menu
                  - Corrected bug that had root page twice in Xenu list
                  - "//" in URL after the host name is always an error
                  - Prevent closing when threads running
                  - "R" launches "Properties" in right-mouse-menu
                  - Save directory of "Browse" location
                  - Enlarged "New" Dialogbox
                  - Retry also for error 403
                  - Local checks for "#"
                  - HTTP_STATUS_PROXY_AUTH_REQ handling not dependent on password setting
                  - Ignore "error" HTTP_STATUS_RESET_CONTENT (at
                  http://www.vietnamthink.com/ )
                  - Corrected '%' bug with Orphan files
                  - "\" not a bug when after a "?"
                  - Correct # of Threads and URLs in status line when finished
                  - Corrected Bug with stuff like "nohref=" or "classname=" inside

                  30.11.2001 (1.2b)
                  - !!!!! Moved the xenu.ini file from
                  \windows or \winnt to the current working directory
                  - Corrected bug with </Script>
                  - <TR BACKGROUND
                  - <TH BACKGROUND

                  6.10.2001 (1.2a)
                  - extra column: time spent
                  - Correct count for broken links in report
                  - Can get size of some ftp files
                  - <TABLE BACKGROUND
                  - Append header information from redirected files even if a body exists,
                  because of http://wap.loop.de
                  - Look up MIME type for local files
                  - Unofficial Option in XENU.INI:
                  [Options] UseDDE=0 to disable DDE on some systems
                  - Combined html and wml (WAP) scanning
                  - <INPUT SRC="image.gif"> checked
                  - Skip <SCRIPT>...</SCRIPT>
                  - Logo in About-Box changed
                  - Min Level can be 0
                  - CTRL-Numpad-ADD to resize all columns
                  - Attempt at Orphan files
                  - Improved speed
                  - Better method for Url lookup
                  - no UrlTable search in ctor of CLinkInfo
                  - check for "txt", "jpg" etc more efficient
                  - m_csRootURL tested in bIncluded()
                  - CLinkInfo::vAddFromURL more efficient
                  - Internal function bHasBrokenToURLs() more efficient
                  - Corrected weird bug in initial Combo-Box
                  - Changed Text in NEW Dialogbox
                  - Compiled with VC++ 6

                  22.7.2001 (1.1f)
                  - Changed User-Agent string to
                  Xenu Link Sleuth
                  because of problems with many websites, e.g. www.sptimes.com

                  21.7.2001 (1.1e)
                  - CTRL-W and CTRL-Q shortcuts for Close and Exit
                  - Ability to consider hard redirections as errors
                  - Changed character in User-Agent string from ' to ´

                  2.7.2001 (1.1d)
                  - new error "no info to return" for empty web pages
                  - corrected bug about saving to tab file when file exists
                  - added statistics for managers :-)
                  - HEAD command also for .zip, .exe .swf (saves bandwidth)
                  - serializing requests for name/password
                  - changed include/exclude so as to work only on the *beginning* of URLs
                  (don't forget to start them with "http"!)
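
Matching only on the beginning of a URL amounts to a prefix test, which is why the entry reminds users to start each pattern with "http". The sketch below assumes that exclude rules win over include rules and that an empty include list means "everything"; neither detail is stated in the changelog, and the function name is hypothetical.

```python
def is_included(url, include_prefixes, exclude_prefixes):
    """Prefix-based include/exclude test (precedence rules are assumptions)."""
    if any(url.startswith(prefix) for prefix in exclude_prefixes):
        return False
    if not include_prefixes:
        return True  # assumed: no include list means nothing is filtered out
    return any(url.startswith(prefix) for prefix in include_prefixes)


# A pattern without the scheme never matches, because it is not a prefix.
print(is_included("http://example.com/docs/a.html", ["example.com/docs"], []))
print(is_included("http://example.com/docs/a.html", ["http://example.com/docs"], []))
```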

                  (1.1c)
                  - Added some extra error messages
                  - Saving columns width
                  - Adjusting column width with double-click
                  - e-mail feature
                  - removed mailto:www-request@... from report
                  - added LAYER SRC, IFRAME SRC and IMG LOWSRC
                  - sort URLs in broken link section of the report
                  - HEAD command also for .txt, .png, .rtf and .pdf (saves bandwidth)

                  (1.1b)
                  - Added <TD BACKGROUND="">
                  - file:/// instead of file://
                  - added BGSOUND
                  - Compiled with VC++ 5.0, smaller
                  - Can now launch URLs even with registry poorly configured
                  - URLs of the report open in new window
                  - Property box with Link Text / Title
                  - URLs for include/exclude are "bound" to the URL

                  (1.1a)
                  - [ and ] in URLs
                  - corrected bug in CODEBASE (must add "/" if not there)
                  - corrected bug that deleted include/exclude fields
                  - improved include/exclude dialog
                  - added text for error 300
                  - corrected bug about password sites

                  (1.0w 14.4.2000)
                  - PLUGINSPACE in EMBED tag now checked
                  - APPLET now checked, with CLASS and ARCHIVE, relative to CODEBASE

                  (1.0v 7.4.2000)
                  - EMBED tag now checked
                  - "Options" in the "New" dialog
                  - "Return to top" in Report
                  - Corrected bug in site map: broken links are not included
                  - Now converting more &blah; characters
                  - Titles get also converted
                  - converting &blah; characters before normalizing
                  - convert &# characters in URL
                  - can now handle URLs like http://user:password@host/ or ftp or https
                  - export always exports *all*, regardless of the view.
                  - sadly an old bug is back in: URLs with "\" are not recognised as
                  broken.
                  - Links that start with "/../" are considered to be broken

                  (1.0u 15.10.1999)
                  - "skip these" feature - this really excludes URLs
                  - &U for Check URL menu

                  (1.0t 9.9.1999)
                  - corrected /./ bug
                  - added CTRL-B to switch between views

                  (1.0s 12.8.1999)
                  - "normalizing" received URLs. Advantage: hostnames always converted
                  into lower case.
                  - considering all pending URLs with the same host as failed when
                  timeout, connection failed, or similar
                  - moved the "Browse..." button
                  - changed the URL combining method, now using Microsoft's
                  InternetCombineURL() instead of my own algorithm
                  - proxy authentication now supported
                  - corrected bug with '
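
The "normalizing" entry above notes that hostnames are always converted to lower case. A sketch of just that step using Python's standard `urllib.parse`; the function name is hypothetical, and Xenu's own normalization does more than this.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_host_case(url):
    """Lower-case the host part of a URL; path and query keep their case."""
    parts = urlsplit(url)  # urlsplit also lower-cases the scheme
    # Keep any "user:password@" prefix as written; only the host:port changes.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(normalize_host_case("http://WWW.Example.COM/Path/Index.HTML?Q=1"))
```

With hosts normalized this way, two links that differ only in hostname case compare equal and are checked once.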

                  (1.0r 29.5.1999)
                  - Corrected bug with image maps

                  (1.0q 29.5.1999)
                  - include titles of links
                  - include / exclude
                  - allowing the use of '
                  - corrected bug re: e.g. "src" being used *before* the actual "src" word
                  - new tags: link, script (the applet tag will come in a later version)
                  - removed empty <ul></ul> sequences in the report
                  - date in the title of the report
                  - corrected bug re: HTML pages with CR only
                  - set "text/html" for local files
                  - save size of columns

                  (1.0p 8.1.1999)
                  - corrected bug about URL-in-URL
                  - convert & when in URLs
                  - REFRESH META Tag
                  - Focus set to OK after entering local file
                  - remote URLs with "\" now always fail (because netscape cannot handle
                  them)

                  13.10.1998 (1.0o)
                  - corrected bug that prevented checking local files with a space in their names
                  - corrected bug that thread count was not updated when finished
                  - corrected bug that ignored http:/host/directory error
                  - added banners

                  If anyone has locations that offer banners, please e-mail me.
                  I would advertise for non-profit organisations that deal with
                  human rights or environmental topics. Attention - I will only
                  use banners that I like, and link to organisations that I like.

                  5.9.1998 (1.0n)
                  - Can check local files - useful for people who don't want to install
                  a local WWW server; simplified toolbar / initial window
                  - "Check External" in INI file for new windows
                  - "Show Broken Links Only" in INI for new windows
                  - Corrected "//" bug for www.workstation.digital.com
                  - Added random seed for banner (actually, uploaded this already on 17.7)
                  - included HTML file in the ZIP file
                  - RegisterShellFileTypes(FALSE) to prevent the "new" and "print"
                  in the registry for new users
                  - Errors between 1 and 199 are also "errors"
                  - maximize MDI child when opening
                  - Randomize checking, so that there is less volume on just one host
                  (reduces peak volume on the ISP who hosts the site being checked)
                  - Slight change in report because of OPERA bug with <PRE> after <H2>
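
The "Randomize checking" entry above describes spreading requests so that one host is not hit by a long run of consecutive checks. A minimal sketch of the idea in Python follows. How Xenu actually orders its queue is not stated in the changelog, so the shuffle below is an assumption, and randomized_order and its seed parameter are hypothetical names.

```python
import random

def randomized_order(urls, seed=None):
    """Return a shuffled copy of urls so that links to one host are spread out.

    seed is only there to make a run repeatable while testing.
    """
    rng = random.Random(seed)
    shuffled = list(urls)  # copy, so the caller's list keeps its order
    rng.shuffle(shuffled)
    return shuffled

queue = [
    "http://a.example/1", "http://a.example/2", "http://a.example/3",
    "http://b.example/1", "http://b.example/2",
]
for url in randomized_order(queue, seed=1):
    print(url)
```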

                  16.7.1998 (1.0m)
                  - Added banners in report
                  - corrected the "406" bug

                  24.6.1998 (1.0l)
                  - Added a column at the right (error text).
                  - removed "DELETE_ON_CLOSE" technique, didn't work on Windows NT
                  due to different OS behaviour. Sorry!

                  5.6.1998 (1.0k)
                  - Changed ftp access completely. It is now reliable, but won't work with
                  proxies.
                  - more than 32767 URLs
                  - Optimized HTML parser

                  18.4.1998 (1.0j)
                  (I was on vacation, and I am still behind in my other activities,
                  so no "big" new feature this time)
                  - no need to enter "http://" in the NEW dialog box
                  - Cool Xenu icon! See on the page above.
                  - CTRL-R for "retry broken links"
                  - Removed "search" from context menu (nothing was associated with it)

                  6.2.1998 (1.0i)
                  - URL launch should now work properly with Netscape Communicator

                  1.2.1998 (1.0h)
                  - added "export to TAB separated file" for Excel (for Marc)
                  - added max level
                  - 100% CPU usage problem solved (Miguelito) / changed idle processing
                  - Site Map
                  - URL launching improved (but still not perfect)

                  25.12.1997 (also 1.0g)
                  - corrected "%26" endless loop bug (Electronic Telegraph)

                  24.12.1997 (1.0g)
                  - added lots of new options (for Stu)
                  - choose what you want to have in the report
                  - choose to "fail" passworded sites
                  - changed the way that URLs are launched: now with DDE so that only
                  one instance (but another window) of Netscape comes up. Behaviour
                  with IE and Opera might be different
                  - corrected "text/html;...." bug (for Hanno)
                  - you can now launch URLs with ENTER
                  - you can now get the property box with ALT-ENTER
                  - force reload for every call --> INTERNET_FLAG_RELOAD (for Doug)
                  - changed initial dialog box, after two users didn't realize that one
                  has to input only one URL, and not every page of the site
                  - removed unused toolbar icons and menu elements

                  23.11.1997 (1.0f)
                  - corrected bug that made it difficult to check local or very fast sites
                  - corrected minor bug in Properties Dialog
                  - Added column with link level
                  - Added error message for wrong input
                  - Added different tries for image maps

                  12.10.1997 (1.0e)
                  - list of redirected URLs (useful because certain ISPs, e.g.
                  www.primenet.com do not provide proper error returns, instead they
                  redirect to an error page)
                  - checking of targets of redirected URLs (this often leads to more
                  broken links, as lots of sites make automatic redirection without
                  checking if the target site exists)
                  - ftp & gopher list for manual check
                  - added tips on how to repair broken links in the FAQ
                  - retry mechanism enhanced (for sites that fail with the HEAD command)
                  - error handler improved (open file problem)
                  - status line accuracy improved

                  7.9.1997 (1.0d)
                  - "Find" dialog box
                  - # of threads can be configured (watch your TCP/IP line glow!)
                  - corrected bug related to titles that do not end
                  - added authorization for "simple" password sites (HTTP error 401)
                  (will not work with web-based passwords, e.g. NY Times)

                  24.8.1997 (1.0c)
                  - HTML report, so that you can view with your browser
                  - % of checked URLs in the status bar
                  - URL list to choose from in "new" dialog box
                  - Automatic retry with GET when certain conditions are
                  met that suggest that the server cannot process the HEAD
                  command (www.amazon.com , www.wildkidz.com, www.dejanews.com )
                  - corrected display bug in "Reset Item" feature
                  - corrected bug when http:// in the middle of an URL
                  (www.sueddeutsche.de used this)
                  - corrected bug that incorrectly processed URLs that started
                  with a space
                  - corrected bug when saving while busy, that made reloading crash
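
The automatic retry with GET mentioned above can be sketched in Python as follows. The changelog does not say which conditions trigger the retry, so the status codes in RETRY_WITH_GET (405 and 501) are an assumption made for illustration, and check_url and http_status are hypothetical names. The fetch function is passed in so that the logic can be tried without network access.

```python
import urllib.error
import urllib.request

# Assumption: statuses taken to mean "this server cannot process HEAD"
RETRY_WITH_GET = {405, 501}

def http_status(method, url, timeout=10):
    """Send one request with the given method and return the HTTP status code."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses arrive as exceptions

def check_url(url, fetch=http_status):
    """Try HEAD first (cheap), then fall back to GET if HEAD looks unsupported."""
    status = fetch("HEAD", url)
    if status in RETRY_WITH_GET:
        status = fetch("GET", url)
    return status
```

HEAD is tried first because it transfers no body; the GET fallback costs more bandwidth but avoids reporting a working page as broken.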

                  15.8.1997 (1.0b)
                  - <BASE HREF="url"> now handled correctly (www.trancenet.org used it)
                  - "Reset Item" feature to recheck a single broken URL
                  - Automatic saving of window placement in INI file
                  - Error msg when trying to check non-http/https sites
                  - Reports are deleted when the next report is made
                  (*** Please go to your temp directory and delete all the TGH*.* files)
                  - "Scroll bug" found and removed!
                  - Now possibility to check your bookmark file
                  - found column click bug, corrected, implemented time sorting
                  - New column: server.
                  - New column: title.
                  - Properties Dialog Box

                  10.8.1997
                  - ability to save & restore
                  - complete list of URLs (good to submit to a search engine)
                  - new icons
                  - # of threads in status line
                  - correct size of dynamic html files
                  - "copy" and "launch URL" function in menu and popup menu
                  - launch report all the time
                • tarastockford
                  Message 8 of 30 , May 18, 2011
                    Hi Tilman

                    The duplicates feature is useful, thanks. It would be good to have separate subheadings for duplicate content, duplicate titles and duplicate descriptions if possible, instead of them being all mixed together.

                    The release candidate is working well for me so far.

                    Thanks

                    Tara
                  • Tilman Hausherr
                    Message 9 of 30 , May 18, 2011
                      Hi Tara,

                      While what you write does make sense, the problem is that I should
                      report dup titles only if the pages themselves are not duplicates, i.e. it
                      is all connected. So if I separated the three, I would have to go
                      through my URL lists three times instead of just once, i.e. it would be
                      even slower. For a website with 70000 URLs, the manager statistics
                      currently take about 15 seconds. Sure, I could just write the segments
                      into temporary files and then put them back together, but that would also
                      be more work...
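
To illustrate the point about the checks being connected, here is a rough Python sketch that collects duplicate content and duplicate titles in a single pass over the URL list, and reports a duplicate title only when the pages sharing it have different content. This is not Xenu's implementation; the (url, title, content_hash) tuples and the name find_duplicates are assumptions made for the example.

```python
from collections import defaultdict

def find_duplicates(pages):
    """pages: iterable of (url, title, content_hash) tuples.

    Returns (dup_content, dup_titles):
      dup_content: lists of URLs whose content is identical
      dup_titles:  title -> URLs that share the title but differ in content
    """
    by_content = defaultdict(list)  # content_hash -> all URLs with that content
    by_title = defaultdict(dict)    # title -> {content_hash: first URL seen}
    for url, title, content_hash in pages:  # the single pass
        by_content[content_hash].append(url)
        by_title[title].setdefault(content_hash, url)
    dup_content = [urls for urls in by_content.values() if len(urls) > 1]
    dup_titles = {
        title: list(seen.values())
        for title, seen in by_title.items()
        if len(seen) > 1  # same title on at least two different contents
    }
    return dup_content, dup_titles
```

Both results come out of the same loop, which is why splitting the report into separate sections would mean either more passes or buffering the output.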

                      So I'll keep this mail but wait to see whether more people complain about that :-)

                      I might also separate that part of the report from the manager section
                      and create a "SEO fans" section...

                      Tilman

                      On Wed, 18 May 2011 16:30:00 -0000, tarastockford wrote:

                      >Hi Tilman
                      >
                      >The duplicates feature is useful, thanks. It would be good to have separate subheadings for duplicate content, duplicate titles and duplicate descriptions if possible, instead of them being all mixed together.
                      >
                      >The release candidate is working well for me so far.
                      >
                      >Thanks
                      >
                      >Tara
                    • Tilman Hausherr
                      Message 10 of 30 , May 21, 2011
                        ...And another.
                        http://home.snafu.de/tilman/tmp/xenubeta.zip

                        One guy had a seemingly minor bug in the report that showed that a
                        change I made one year ago wasn't really good enough.

                        The bug was about foreign characters on pages done with ISO-8859-1 (this
                        is ANSI, sort of). If you have foreign characters in titles (even if you
                        use UTF-8), and especially in URLs, please try the current beta.

                        Tilman

                        On Mon, 16 May 2011 18:55:08 +0200, Tilman Hausherr wrote:

                        >I hereby declare that todays beta is a release candidate.
                        >http://home.snafu.de/tilman/tmp/xenubeta.zip
                        >
                        >Please support me by testing your website with it.
                        >
                        >I will probably release the 1.3.9 version at the end of the week.
                        >
                        >Tilman
                      • Bruce Hartford
                        Message 11 of 30 , Dec 16, 2011
                          Using the Xenu 1.3.8 I'm getting a ton of false reports of broken links
                          to external websites. They have error code 404 (not found), 12007 (no
                          such host), 12029 (no connection). Yet when I click on the supposedly
                          bad link in the HTML Broken Link Report, the page loads with no problem
                          or delay. Over the past few weeks, I estimate that 95% of the supposedly
                          broken links Xenu reports have actually been valid URLs.
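
The three codes above are of two different kinds, which may help when narrowing this down. 404 is an HTTP status, so a server was reached and answered. 12007 and 12029 are in the WinINet error range, which means the request failed on the checking machine before any HTTP response arrived (name lookup or connection). The Python sketch below only sorts codes into those two groups. It assumes Xenu passes WinINet error numbers through unchanged, and classify is a hypothetical helper.

```python
# Descriptions as they appear in the Xenu reports quoted in this thread
WININET_ERRORS = {
    12002: "timeout",
    12007: "no such host",
    12029: "no connection",
}

def classify(code):
    """Return (kind, detail) for an error code taken from a report."""
    if code in WININET_ERRORS:
        # Failed before any HTTP response: DNS, connection, or timeout
        return ("network", WININET_ERRORS[code])
    if 100 <= code <= 599:
        # A server answered with this HTTP status
        return ("http", "status %d" % code)
    return ("unknown", str(code))

for code in (404, 12007, 12029):
    print(code, classify(code))
```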

                          Any thoughts?

                          Bruce
                        • Tilman Hausherr
                          Message 12 of 30 , Dec 31, 2011
                            Try fewer threads. Also uncheck "fail all URLs of same failed host" in
                            the advanced options dialog.
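
Why that option matters can be shown with a small Python sketch: when it is on, one failed request marks every pending URL on the same host as failed, so a single transient error can produce many reported broken links. The sketch follows the wording of the option and of the 1.0s changelog entry; it is not Xenu's code, and mark_failures and fail_same_host are hypothetical names.

```python
from urllib.parse import urlsplit

def mark_failures(pending, failed_url, fail_same_host=True):
    """Split pending URLs into (failed, remaining) after failed_url has failed.

    With fail_same_host=True, every pending URL on the same host is
    marked as failed too, without being requested.
    """
    failed_host = urlsplit(failed_url).hostname
    failed, remaining = [failed_url], []
    for url in pending:
        if fail_same_host and urlsplit(url).hostname == failed_host:
            failed.append(url)
        else:
            remaining.append(url)
    return failed, remaining

pending = ["http://a.example/1", "http://b.example/1", "http://a.example/2"]
print(mark_failures(pending, "http://a.example/0", fail_same_host=True))
print(mark_failures(pending, "http://a.example/0", fail_same_host=False))
```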

                            Tilman

                            On Fri, 16 Dec 2011 13:25:14 -0800, Bruce Hartford wrote:

                            >Using the Xenu 1.3.8 I'm getting a ton of false reports of broken links
                            >to external websites. They have error code 404 (not found), 12007 (no
                            >such host), 12029 (no connection). Yet when I click on the supposedly
                            >bad link in the HTML Broken Link Report, the page loads with no problem
                            >or delay. Over the past few weeks, I estimate that 95% of the supposedly
                            >broken links Xenu reports have actually been valid URLs.
                            >
                            >Any thoughts?
                            >
                            >Bruce
                          • Bruce Hartford
                            Message 13 of 30 , Dec 31, 2011
                              Thanks, I tried your suggestion but no joy. I'm still getting a huge
                              number of false "error code: 12007 (no such host)" errors. When I click
                              on the URL of the supposedly unavailable site it pops right up.

                              Bruce



                              On 12/31/2011 2:17 AM, Tilman Hausherr wrote:
                              > Try less threads. Also uncheck "fail all URLs of same failed host" in
                              > the advanced options dialog.
                              >
                              > Tilman
                              >
                              > On Fri, 16 Dec 2011 13:25:14 -0800, Bruce Hartford wrote:
                              >
                              >> Using the Xenu 1.3.8 I'm getting a ton of false reports of broken links
                              >> to external websites. They have error code 404 (not found), 12007 (no
                              >> such host), 12029 (no connection). Yet when I click on the supposedly
                              >> bad link in the HTML Broken Link Report, the page loads with no problem
                              >> or delay. Over the past few weeks, I estimate that 95% of the supposedly
                              >> broken links Xenu reports have actually been valid URLs.
                              >>
                              >> Any thoughts?
                              >>
                              >> Bruce

                              --
                              Bruce Hartford
                              Webspinner: Civil Rights Movement Veterans website http://www.crmvet.org
                              Sojourner's Blog: http://ohfreedom.wordpress.com
                            • Fischer, Thomas
                              Message 14 of 30 , Jan 3, 2012
                                Hi Bruce,
                                 
                                Can you give examples of URLs that Xenu reports as failing but that work when clicked? Xenu is pickier than browsers about "correct URLs"; e.g. it will not follow erroneous relative paths and will regard directories without a trailing slash as errors.
                                 
                                All the best
                                Thomas


                                From: xenu-usergroup@yahoogroups.com [mailto:xenu-usergroup@yahoogroups.com] On Behalf Of Bruce Hartford
                                Sent: Saturday, December 31, 2011 22:24
                                To: xenu-usergroup@yahoogroups.com
                                Subject: Re: [xenu-usergroup] Loads of False Error

                                 

                                Thanks, I tried your suggestion but no joy. I'm still getting a huge
                                number of false "error code: 12007 (no such host)" errors. When I click
                                on the URL of the supposedly unavailable site it pops right up.

                                Bruce

                                On 12/31/2011 2:17 AM, Tilman Hausherr wrote:
                                > Try less threads. Also uncheck "fail all URLs of same failed host" in
                                > the advanced options dialog.
                                >
                                > Tilman
                                >
                                > On Fri, 16 Dec 2011 13:25:14 -0800, Bruce Hartford wrote:
                                >
                                >> Using the Xenu 1.3.8 I'm getting a ton of false reports of broken links
                                >> to external websites. They have error code 404 (not found), 12007 (no
                                >> such host), 12029 (no connection). Yet when I click on the supposedly
                                >> bad link in the HTML Broken Link Report, the page loads with no problem
                                >> or delay. Over the past few weeks, I estimate that 95% of the supposedly
                                >> broken links Xenu reports have actually been valid URLs.
                                >>
                                >> Any thoughts?
                                >>
                                >> Bruce

                                --
                                Bruce Hartford
                                Webspinner: Civil Rights Movement Veterans website http://www.crmvet.org
                                Sojourner's Blog: http://ohfreedom.wordpress.com

                              • Tilman Hausherr
                                Message 15 of 30 , Jan 3, 2012
                                  Were these all the same host? (I.e. the part between http:// and the
                                  third / )
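
One quick way to answer that question from a list of failing URLs is to count them per host. The Python sketch below does this with the standard library; count_hosts is a hypothetical helper, and the URLs in the example are ones that appear elsewhere in this thread.

```python
from collections import Counter
from urllib.parse import urlsplit

def count_hosts(urls):
    """Count URLs per host, i.e. the part between '//' and the next '/'."""
    return Counter(urlsplit(url).hostname for url in urls)

failing = [
    "http://www.crmvet.org/crmlinks.htm",
    "http://www.crmvet.org/biblio.htm",
    "http://www.kimopress.com/",
]
for host, count in count_hosts(failing).items():
    print(host, count)
```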

                                  Btw you also need to press CTRL-R to retry after making the changes I
                                  mentioned.

                                  Tilman

                                  On 31.12.2011 22:23, Bruce Hartford wrote:
                                  > Thanks, I tried your suggestion but no joy. I'm still getting a huge
                                  > number of false "error code: 12007 (no such host)" errors. When I click
                                  > on the URL of the supposedly unavailable site it pops right up.
                                  >
                                  > Bruce
                                  >
                                  >
                                  >
                                  > On 12/31/2011 2:17 AM, Tilman Hausherr wrote:
                                  >> Try less threads. Also uncheck "fail all URLs of same failed host" in
                                  >> the advanced options dialog.
                                  >>
                                  >> Tilman
                                  >>
                                  >> On Fri, 16 Dec 2011 13:25:14 -0800, Bruce Hartford wrote:
                                  >>
                                  >>> Using the Xenu 1.3.8 I'm getting a ton of false reports of broken links
                                  >>> to external websites. They have error code 404 (not found), 12007 (no
                                  >>> such host), 12029 (no connection). Yet when I click on the supposedly
                                  >>> bad link in the HTML Broken Link Report, the page loads with no problem
                                  >>> or delay. Over the past few weeks, I estimate that 95% of the supposedly
                                  >>> broken links Xenu reports have actually been valid URLs.
                                  >>>
                                  >>> Any thoughts?
                                  >>>
                                  >>> Bruce
                                • Bruce Hartford
                                  Message 16 of 30 , Jan 3, 2012
                                    On 1/3/2012 1:42 AM, Fischer, Thomas wrote:
                                    > Hi Bruce,
                                    >
                                    > can you give examples of the URLs that are reported as failing while
                                    > working when clicked? Xenu is pickier than browsers about "correct
                                    > URLs", e.g. it will not follow erroneous relative paths and will
                                    > regard directories without trailing slash as errors.

                                    http://www.crmvet.org/crmlinks.htm
                                    http://www.bobzellner.com/
                                    \_____ error code: 12007 (no such host)
                                    http://www.farmworkermovement.us/ufwarchives/index.shtml
                                    \_____ error code: 12007 (no such host)
                                    https://mycampus.asurams.edu/web/event-civil-rights/home
                                    \_____ error code: 12007 (no such host)
                                    http://www.pba.org/programming/programs/thisisatlanta/atlcivilrights
                                    \_____ error code: 12007 (no such host)
                                    http://www.atlantastudentmovement.org/
                                    \_____ error code: 12007 (no such host)
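
Since every entry above shows "no such host", one way to cross-check outside Xenu is to ask the operating system's resolver directly for each hostname, from the same machine that runs the check. The Python sketch below does that. It says nothing about how Xenu resolves names internally; host_resolves is a hypothetical helper, and its resolver parameter exists only so the function can be tried without network access.

```python
import socket
from urllib.parse import urlsplit

def host_resolves(url, resolver=socket.getaddrinfo):
    """Return True if the URL's hostname can be resolved to an address."""
    host = urlsplit(url).hostname
    try:
        resolver(host, None)  # raises socket.gaierror when the lookup fails
        return True
    except socket.gaierror:
        return False

if __name__ == "__main__":
    # Performs real DNS lookups when run as a script
    for url in (
        "http://www.bobzellner.com/",
        "http://www.atlantastudentmovement.org/",
    ):
        print(url, host_resolves(url))
```

If the names resolve here but fail inside a run with many threads, that would point at the lookups being overloaded rather than at the links themselves.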

                                    Bruce