  • Tilman Hausherr
    Message 1 of 6 , May 24, 2011
      On Tue, 24 May 2011 15:58:39 +0200, Fischer, Thomas wrote:

      >Hi Tilman,
      >thanks for the speedy reply!
      >> >1. On the page
      >> >http://www.mathguide.de/cgi-bin/ssgfi/navigator.pl?db=math&type=subj
      >> >
      >> >Xenu gives a "not found" error for a link
      >> "http://www.mathguide.de/db=math/type=form" which is linked
      >> from "Source Type Catalog".
      >> > While that URL really doesn't exist, the actual link (in the source
      >> >code) is
      >> >
      >> ><A HREF="db=math/type=subj"><IMG ALT="Subject Catalog"
      >> >SRC="/grafiken/new-left.gif" ALIGN=TOP BORDER=0 HEIGHT=15 WIDTH=15
      >> >HSPACE=5>Subject Catalog</A>
      >> No, it's
      >> <A HREF="db=math/type=form"><IMG ALT="Source Type Catalog"
      >> SRC="/grafiken/new-right.gif" ALIGN=TOP BORDER=0 HEIGHT=15
      >> WIDTH=15 HSPACE=5>Source Type Catalog</A>
      >> and that "Source Type Catalog" link is really broken, even
      >> with a browser. (Opera)
      >> >
      >> >with a base of that page as
      >> >
      >> ><BASE HREF="http://www.MathGuide.de/cgi-bin/ssgfi/navigator2.pl/">
      >> No, it is
      >> <BASE HREF="http://www.MathGuide.de/">
      >You are so right! Sorry for my blindness; these pages are assembled dynamically and I didn't look closely enough. It took me ages and three runs of Xenu to eradicate all the "navigator.pl" links that, while working, created the erroneous links. An additional back step in the error report would have helped:
      > http://www.mathguide.de/db=math/type=subj
      > \_____ error code: 404 (not found)
      >Where is the first page linked from?

      Search for it and press ALT-ENTER
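
For reference, the base-URL mix-up discussed above can be reproduced outside Xenu. This is a minimal sketch using Python's standard `urllib.parse.urljoin`; the `resolve` helper and the constant names are illustrative only, and it assumes Xenu and browsers resolve relative links in the usual way. It shows how the same relative HREF lands on different absolute URLs depending on which BASE HREF the page really declares.

```python
from urllib.parse import urljoin

# Relative link as it appears in the page source (from the thread).
HREF = "db=math/type=form"

# The BASE HREF the page actually declares, per Tilman's reply.
ACTUAL_BASE = "http://www.MathGuide.de/"
# The BASE HREF Thomas believed the page had.
ASSUMED_BASE = "http://www.MathGuide.de/cgi-bin/ssgfi/navigator2.pl/"


def resolve(base: str, href: str) -> str:
    """Resolve a relative link against a base URL (hypothetical helper)."""
    return urljoin(base, href)


if __name__ == "__main__":
    # With the real base, the link points at the site root: the broken URL.
    print(resolve(ACTUAL_BASE, HREF))
    # With the assumed base, the link would have stayed under navigator2.pl/.
    print(resolve(ASSUMED_BASE, HREF))
```

Note that `urljoin` keeps the host name's capitalisation as written, whereas the Xenu report shows it in lower case; host names are case-insensitive, so both refer to the same server.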

      >I eventually created a site map that helped a little.
      >> >2. I get the error message
      >> >http://www.liv.ac.uk/maths/
      >> > \_____ error code: 400 (no object data) and similarly,
      >> >http://www.google.com/Top/World/Deutsch/Wissenschaft/Mathematik/
      >> > \_____ error code: 404 (not found)
      >> >and
      >> >http://magma.maths.usyd.edu.au/magma/
      >> > \_____ error code: 401 (auth required) while the links seem to work
      >> >fine.
      >> >
      >> >I know of "forbidden requests" like
      >> >http://de.wikibooks.org/wiki/Regal:Mathematik
      >> > \_____ error code: 403 (forbidden request) but this seems to be
      >> >different and starts to get annoying.
      >> Yes, all these servers "hate" Xenu, as described here
      >> http://home.snafu.de/tilman/xenulink.html Nr. 20
      >I wasn't aware of that. But it *is* a nuisance.
      >The page "http://www.andilinks.com/linkckg.shtm" referred to on xenulink.html Nr. 20 says:
      >"Be sure to mark those that deny Xenu so they can be easily excluded or remembered on the next pass."
      >I have no clue how I can exclude dozens (50?) of websites from being checked. Filling them into the tiny space at the bottom of the "Check URL" form seems pretty cumbersome; some additional "Exclusionlist" file might be helpful.

      Yeah, that part sucks. I should replace this with a normal big text box
      some day.

      >Actually, I already have problems dealing with the 15 or so exclusions I use and have some trouble managing my Xenu.ini file (about 1900 lines by now). Is there a description of the specifics of this file somewhere? Most of the syntax can be guessed, but the connection between the [Recent URL List] and the specific inclusions and exclusions isn't quite clear to me.

      No, you should guess :-) [Recent URL List] is for the combo box. There
      is also a general include / exclude in the ini file, and there are the
      same entries with a URL in them; those are URL-specific.

      My own Xenu.ini file is 300K.

      >> >3. I am shown errors like
      >> >
      >> >http://www.mpib-berlin.mpg.de/DOK/metatagd.htm
      >> > http://www.mpib-berlin.mpg.de/de/DOK/metatagd.htm
      >> > \_____ error code: 404 (not found)
      >> >or
      >> >http://www.nist.gov/dads/
      >> > http://xlinux.nist.gov/dads//
      >> > \_____ error code: 404 (not found)
      >> >
      >> >Is there a way to find out how they relate to my website
      >> (www.MathGuide.de in this case)?
      >> Yes, see the redirection segment in the report
      >Thanks, I found that. I suppose it would be too complicated to bring these two bits of information together automatically? I have loads of redirects and would try to fix the ones which are broken *and* permanent first. But they are not so easily spotted.

      Tricky indeed.
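
Xenu does not do this, but once the redirect and error information has been copied out of the report by hand, picking out the "broken *and* permanent" redirects is a simple filter. The sketch below is only an illustration: the `Redirect` record, the `broken_permanent` function and the way the data is typed in are all hypothetical, and it does not parse Xenu's report. The 404 results and the 302 for purl.org come from the thread; the other status codes are assumed purely so the example has something to filter.

```python
from typing import List, NamedTuple


class Redirect(NamedTuple):
    """One hand-copied redirect entry (hypothetical record layout)."""
    source: str         # URL linked from the site
    target: str         # URL the server redirected to
    redirect_code: int  # status of the redirect itself
    final_code: int     # status of the redirect target


# 301 and 308 are the permanent redirect codes in HTTP.
PERMANENT_CODES = {301, 308}


def broken_permanent(redirects: List[Redirect]) -> List[Redirect]:
    """Keep redirects that are permanent and whose target returns an error."""
    return [
        r for r in redirects
        if r.redirect_code in PERMANENT_CODES and r.final_code >= 400
    ]


SAMPLE = [
    # redirect_code 301 is ASSUMED; the 404 is from the thread.
    Redirect("http://www.mpib-berlin.mpg.de/DOK/metatagd.htm",
             "http://www.mpib-berlin.mpg.de/de/DOK/metatagd.htm", 301, 404),
    # redirect_code 302 is ASSUMED; the 404 is from the thread.
    Redirect("http://www.nist.gov/dads/",
             "http://xlinux.nist.gov/dads//", 302, 404),
    # The 302 is from the thread; final_code 200 is ASSUMED.
    Redirect("http://purl.org/dc/elements/1.1/",
             "http://dublincore.org/2010/10/11/dcelements.rdf", 302, 200),
]

if __name__ == "__main__":
    for r in broken_permanent(SAMPLE):
        print(r.source, "->", r.target, r.final_code)
```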

      >Another thing (I might have mentioned before): we use Dublin Core Metadata, which requires the following header information (see http://dublincore.org/documents/dc-html/):
      > <head profile="http://dublincore.org/documents/2008/08/04/dc-html/">
      > <title>...</title>
      > <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" >
      > <meta name="DC.title" content="..." >
      > </head>
      >Unfortunately that means that I get a warning
      >redirected to: http://dublincore.org/2010/10/11/dcelements.rdf
      >status code: 302 (object temporarily moved)
      >for every single page on my site.
      >Would a "don't check 'http://purl.org/dc/elements/1.1/'" help in this case?

      Sure, simply exclude http://purl.org. Maybe I answered that last time.
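
To make the effect of such an exclusion concrete, here is a small sketch of prefix-based filtering. It assumes that an exclusion is a plain "URL begins with this" match; the function name, the list names and the sample URLs (all taken from this thread) are illustrative only, and nothing here reads or writes Xenu.ini.

```python
from typing import Iterable, List

# Prefixes to skip; http://purl.org is the one suggested in the thread.
EXCLUDED_PREFIXES = ["http://purl.org"]


def is_excluded(url: str, prefixes: Iterable[str] = EXCLUDED_PREFIXES) -> bool:
    """Return True if the URL starts with any excluded prefix (assumed rule)."""
    return any(url.startswith(prefix) for prefix in prefixes)


urls: List[str] = [
    "http://purl.org/dc/elements/1.1/",
    "http://dublincore.org/documents/dc-html/",
    "http://www.mathguide.de/db=math/type=subj",
]

# Only URLs that do not match an excluded prefix would still be checked.
to_check = [u for u in urls if not is_excluded(u)]

if __name__ == "__main__":
    for u in to_check:
        print(u)
```

Under that assumption, the Dublin Core schema link on every page is skipped, so the 302 warning would no longer appear once per page.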

      >BTW, it seems that Xenu can check Tono's URLs just fine, but my mail about that hasn't made it to the list yet.

      I'm not the moderator :)


      >Thanks again
