
AW: [xenu-usergroup] Link error

  • Fischer, Thomas
    Message 1 of 6, May 24, 2011
      Hi Tilman,

      thanks for the speedy reply!

      > >1. On the page
      > >http://www.mathguide.de/cgi-bin/ssgfi/navigator.pl?db=math&type=subj
      > >
      > >Xenu gives a "not found" error for a link
      > >"http://www.mathguide.de/db=math/type=form" which is linked
      > >from "Source Type Catalog".
      > > While that URL really doesn't exist, the actual link (in the source
      > >code) is
      > >
      > ><A HREF="db=math/type=subj"><IMG ALT="Subject Catalog"
      > >SRC="/grafiken/new-left.gif" ALIGN=TOP BORDER=0 HEIGHT=15 WIDTH=15
      > >HSPACE=5>Subject Catalog</A>
      >
      > No, it's
      >
      > <A HREF="db=math/type=form"><IMG ALT="Source Type Catalog"
      > SRC="/grafiken/new-right.gif" ALIGN=TOP BORDER=0 HEIGHT=15
      > WIDTH=15 HSPACE=5>Source Type Catalog</A>
      >
      > and that "Source Type Catalog" link is really broken, even
      > with a browser. (Opera)
      >
      > >
      > >with a base of that page as
      > >
      > ><BASE HREF="http://www.MathGuide.de/cgi-bin/ssgfi/navigator2.pl/">
      >
      > No, it is
      > <BASE HREF="http://www.MathGuide.de/">

      You are so right! Sorry for my blindness; these pages are assembled dynamically and I didn't look properly. It took me ages and three runs of Xenu to eradicate all the "navigator.pl" links that, although they worked themselves, produced the erroneous links. I would have needed one more step back in the error report:

      http://www.mathguide.de/cgi-bin/ssgfi/navigator.pl?db=math&type=form
      http://www.mathguide.de/db=math/type=subj
      \_____ error code: 404 (not found)

      Where is the first page linked from?
      I eventually created a site map that helped a little.
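
For reference, the way the relative link resolves can be reproduced outside Xenu. The following is a minimal sketch using Python's standard `urllib.parse.urljoin`, with the HREF and the two BASE values quoted above. It only illustrates standard relative-URL resolution and makes no claim about how Xenu implements it.

```python
from urllib.parse import urljoin

# Relative HREF as it appears in the page source.
href = "db=math/type=form"

# The BASE the page really declares, and the one assumed in the question.
actual_base = "http://www.MathGuide.de/"
assumed_base = "http://www.MathGuide.de/cgi-bin/ssgfi/navigator2.pl/"

# A relative reference is resolved against the BASE, not against the page URL.
resolved_actual = urljoin(actual_base, href)
resolved_assumed = urljoin(assumed_base, href)

print(resolved_actual)
print(resolved_assumed)
```

With the real BASE the link lands directly under the site root, which is the URL reported as "not found".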

      > >2. I get the error message
      > >http://www.liv.ac.uk/maths/
      > > \_____ error code: 400 (no object data) and similarly,
      > >http://www.google.com/Top/World/Deutsch/Wissenschaft/Mathematik/
      > > \_____ error code: 404 (not found)
      > >and
      > >http://magma.maths.usyd.edu.au/magma/
      > > \_____ error code: 401 (auth required) while the links seem to work
      > >fine.
      > >
      > >I know of "forbidden requests" like
      > >http://de.wikibooks.org/wiki/Regal:Mathematik
      > > \_____ error code: 403 (forbidden request) but this seems to be
      > >different and starts to get annoying.
      >
      > Yes, all these servers "hate" Xenu, as described here
      > http://home.snafu.de/tilman/xenulink.html Nr. 20

      I wasn't aware of that. But it *is* a nuisance.
      The page "http://www.andilinks.com/linkckg.shtm" referred to on xenulink.html Nr. 20 says:
      "Be sure to mark those that deny Xenu so they can be easily excluded or remembered on the next pass."
      I have no clue how I can exclude dozens (50?) of websites from being checked. Entering them in the tiny field at the bottom of the "Check URL" form seems pretty cumbersome; an additional "exclusion list" file might be helpful.
      Actually, I already have problems dealing with the 15 or so exclusions I use, and I have some trouble managing my Xenu.ini file (about 1900 lines by now). Is there a description of this file's format somewhere? Most of the syntax can be guessed, but the connection between the [Recent URL List] and the specific inclusions and exclusions isn't quite clear to me.
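
Until there is a better way to enter exclusions, the list of hosts to exclude could at least be collected automatically. The sketch below assumes the report is available as plain text in the layout quoted above (a URL line followed by a `\_____ error code: NNN` line). The function name `failing_hosts` and the choice of which codes count as "denied" are my own assumptions, not anything Xenu provides.

```python
import re
from urllib.parse import urlsplit

# Matches the error line of the report layout quoted in this thread.
ERROR_LINE = re.compile(r"\\_____ error code: (\d+)")

def failing_hosts(report_text, codes):
    """Return the sorted host names whose URLs failed with one of `codes`."""
    hosts = set()
    last_url = None
    for raw in report_text.splitlines():
        line = raw.strip()
        if line.startswith(("http://", "https://")):
            last_url = line          # remember the nearest preceding URL
            continue
        match = ERROR_LINE.search(line)
        if match and last_url and int(match.group(1)) in codes:
            hosts.add(urlsplit(last_url).hostname)
    return sorted(hosts)

# Sample entries copied from the report excerpts in this thread.
sample = r"""
http://www.liv.ac.uk/maths/
 \_____ error code: 400 (no object data)
http://magma.maths.usyd.edu.au/magma/
 \_____ error code: 401 (auth required)
http://de.wikibooks.org/wiki/Regal:Mathematik
 \_____ error code: 403 (forbidden request)
"""

denied = failing_hosts(sample, codes={400, 401, 403})
print("\n".join(denied))
```

The output is one host per line, which is easier to review by hand before pasting anything into the exclusion field. Each host should still be checked in a browser, since the same codes can also mean a link is really broken.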

      > >3. I am shown errors like
      > >
      > >http://www.mpib-berlin.mpg.de/DOK/metatagd.htm
      > > http://www.mpib-berlin.mpg.de/de/DOK/metatagd.htm
      > > \_____ error code: 404 (not found)
      > >or
      > >http://www.nist.gov/dads/
      > > http://xlinux.nist.gov/dads//
      > > \_____ error code: 404 (not found)
      > >
      > >Is there a way to find out how they relate to my website
      > >(www.MathGuide.de in this case)?
      >
      > Yes, see the redirection segment in the report

      Thanks, I found that. I suppose it would be too complicated to bring these two bits of information together automatically? I have loads of redirects and would like to fix the ones which are broken *and* permanent first. But they are not so easily spotted.

      Another thing (I might have mentioned before): we use Dublin Core Metadata, which requires the following header information (see http://dublincore.org/documents/dc-html/):

      <head profile="http://dublincore.org/documents/2008/08/04/dc-html/">
      <title>...</title>
      <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" >
      <meta name="DC.title" content="..." >
      </head>
      Unfortunately that means that I get a warning

      http://purl.org/dc/elements/1.1/
      redirected to: http://dublincore.org/2010/10/11/dcelements.rdf
      status code: 302 (object temporarily moved)

      for every single page on my site.
      Would a "don't check 'http://purl.org/dc/elements/1.1/'" help in this case?

      BTW, it seems that Xenu can check Tono's URLs just fine, but my mail about that hasn't made it to the list yet.

      Thanks again
      Thomas
    • Tilman Hausherr
      Message 2 of 6, May 24, 2011
        On Tue, 24 May 2011 15:58:39 +0200, Fischer, Thomas wrote:

        >Hi Tilman,
        >
        >thanks for the speedy reply!
        >
        >> >1. On the page
        >> >http://www.mathguide.de/cgi-bin/ssgfi/navigator.pl?db=math&type=subj
        >> >
        >> >Xenu gives a "not found" error for a link
        >> >"http://www.mathguide.de/db=math/type=form" which is linked
        >> >from "Source Type Catalog".
        >> > While that URL really doesn't exist, the actual link (in the source
        >> >code) is
        >> >
        >> ><A HREF="db=math/type=subj"><IMG ALT="Subject Catalog"
        >> >SRC="/grafiken/new-left.gif" ALIGN=TOP BORDER=0 HEIGHT=15 WIDTH=15
        >> >HSPACE=5>Subject Catalog</A>
        >>
        >> No, it's
        >>
        >> <A HREF="db=math/type=form"><IMG ALT="Source Type Catalog"
        >> SRC="/grafiken/new-right.gif" ALIGN=TOP BORDER=0 HEIGHT=15
        >> WIDTH=15 HSPACE=5>Source Type Catalog</A>
        >>
        >> and that "Source Type Catalog" link is really broken, even
        >> with a browser. (Opera)
        >>
        >> >
        >> >with a base of that page as
        >> >
        >> ><BASE HREF="http://www.MathGuide.de/cgi-bin/ssgfi/navigator2.pl/">
        >>
        >> No, it is
        >> <BASE HREF="http://www.MathGuide.de/">
        >
        >You are so right! Sorry for my blindness; these pages are assembled dynamically and I didn't look properly. It took me ages and three runs of Xenu to eradicate all the "navigator.pl" links that, although they worked themselves, produced the erroneous links. I would have needed one more step back in the error report:
        >
        >http://www.mathguide.de/cgi-bin/ssgfi/navigator.pl?db=math&type=form
        > http://www.mathguide.de/db=math/type=subj
        > \_____ error code: 404 (not found)
        >
        >Where is the first page linked from?

        Search it and press ALT-ENTER
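
For a whole chain of "linked from" steps rather than one lookup at a time, the same question can be answered offline. This is a minimal sketch, assuming you already have (linking page, linked URL) pairs from somewhere; how you obtain them is up to you. The function name `chain_to` is made up for illustration, and the first pair in the sample data is hypothetical.

```python
from collections import defaultdict, deque

def chain_to(start, broken, links):
    """Return one shortest chain of pages from `start` to `broken`, or None."""
    linked_from = defaultdict(list)      # target -> pages that link to it
    for source, target in links:
        linked_from[target].append(source)

    # Breadth-first search backwards from the broken URL.
    next_hop = {broken: None}            # page -> the page it links to on the chain
    queue = deque([broken])
    while queue:
        page = queue.popleft()
        if page == start:
            chain = []
            while page is not None:      # follow the recorded hops forward
                chain.append(page)
                page = next_hop[page]
            return chain
        for source in linked_from[page]:
            if source not in next_hop:
                next_hop[source] = page
                queue.append(source)
    return None

home = "http://www.mathguide.de/"
navigator = "http://www.mathguide.de/cgi-bin/ssgfi/navigator.pl?db=math&type=form"
broken = "http://www.mathguide.de/db=math/type=subj"

links = [
    (home, navigator),     # hypothetical: where navigator.pl is linked from
    (navigator, broken),   # the pair shown in the quoted error report
]

chain = chain_to(home, broken, links)
print("\n -> ".join(chain))
```

The search runs backwards from the broken URL, so it only visits pages that can actually reach it.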

        >I eventually created a site map that helped a little.
        >
        >> >2. I get the error message
        >> >http://www.liv.ac.uk/maths/
        >> > \_____ error code: 400 (no object data) and similarly,
        >> >http://www.google.com/Top/World/Deutsch/Wissenschaft/Mathematik/
        >> > \_____ error code: 404 (not found)
        >> >and
        >> >http://magma.maths.usyd.edu.au/magma/
        >> > \_____ error code: 401 (auth required) while the links seem to work
        >> >fine.
        >> >
        >> >I know of "forbidden requests" like
        >> >http://de.wikibooks.org/wiki/Regal:Mathematik
        >> > \_____ error code: 403 (forbidden request) but this seems to be
        >> >different and starts to get annoying.
        >>
        >> Yes, all these servers "hate" Xenu, as described here
        >> http://home.snafu.de/tilman/xenulink.html Nr. 20
        >
        >I wasn't aware of that. But it *is* a nuisance.
        >The page "http://www.andilinks.com/linkckg.shtm" referred to on xenulink.html Nr. 20 says:
        >"Be sure to mark those that deny Xenu so they can be easily excluded or remembered on the next pass."
        >I have no clue how I can exclude dozens (50?) of websites from being checked. Entering them in the tiny field at the bottom of the "Check URL" form seems pretty cumbersome; an additional "exclusion list" file might be helpful.

        Yeah, that part sucks. I should replace this with a normal big text box
        some day.

        >Actually, I already have problems dealing with the 15 or so exclusions I use, and I have some trouble managing my Xenu.ini file (about 1900 lines by now). Is there a description of this file's format somewhere? Most of the syntax can be guessed, but the connection between the [Recent URL List] and the specific inclusions and exclusions isn't quite clear to me.

        No, you should guess :-) [Recent URL List] is for the combo box. There
        is also a general include/exclude in the ini file, and there are the
        same entries with a URL in them; those are URL-specific.

        My own Xenu.ini file is 300K.
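
To get an overview of a file that size before editing it by hand, a rough section inventory can help. The sketch below only assumes the usual INI layout of `[section]` headers followed by lines. Only the section name "[Recent URL List]" comes from this thread; the other section name and all keys in the sample are invented, and `section_sizes` is a hypothetical helper.

```python
def section_sizes(ini_text):
    """Map each [section] name to its number of non-blank lines."""
    sizes = {}
    current = None
    for raw in ini_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sizes.setdefault(current, 0)
        elif line and current is not None:
            sizes[current] += 1
    return sizes

# Hypothetical excerpt: keys and the second section are made up.
sample = """\
[Recent URL List]
URL1=http://www.mathguide.de/
URL2=http://www.example.org/

[Hypothetical Section]
Key=value
"""

sizes = section_sizes(sample)
for name, count in sizes.items():
    print(f"{name}: {count}")
```

It deliberately reads the file line by line instead of using a strict INI parser, so it makes no assumption about duplicate keys or other details of the real format.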

        >
        >> >3. I am shown errors like
        >> >
        >> >http://www.mpib-berlin.mpg.de/DOK/metatagd.htm
        >> > http://www.mpib-berlin.mpg.de/de/DOK/metatagd.htm
        >> > \_____ error code: 404 (not found)
        >> >or
        >> >http://www.nist.gov/dads/
        >> > http://xlinux.nist.gov/dads//
        >> > \_____ error code: 404 (not found)
        >> >
        >> >Is there a way to find out how they relate to my website
        >> >(www.MathGuide.de in this case)?
        >>
        >> Yes, see the redirection segment in the report
        >
        >Thanks, I found that. I suppose it would be too complicated to bring these two bits of information together automatically? I have loads of redirects and would like to fix the ones which are broken *and* permanent first. But they are not so easily spotted.

        Tricky indeed.
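
Done outside Xenu, the join itself is small once both lists exist as data. A minimal sketch, assuming the redirects and the errors have been copied into Python structures by hand or by a script of your own. The URLs are taken from this thread, but the 301 status values for the first two redirects are assumptions for illustration; the thread only states the 302 for purl.org. `Redirect` and `broken_permanent` are hypothetical names.

```python
from typing import NamedTuple

class Redirect(NamedTuple):
    source: str   # URL as linked on the site
    target: str   # URL the server redirects to
    status: int   # HTTP status of the redirect (301 permanent, 302 temporary)

def broken_permanent(redirects, errors):
    """Return (redirect, error_code) pairs for permanent redirects whose target fails."""
    return [
        (r, errors[r.target])
        for r in redirects
        if r.status == 301 and r.target in errors
    ]

redirects = [
    # Status 301 is assumed here; the report excerpt does not show it.
    Redirect("http://www.mpib-berlin.mpg.de/DOK/metatagd.htm",
             "http://www.mpib-berlin.mpg.de/de/DOK/metatagd.htm", 301),
    Redirect("http://www.nist.gov/dads/",
             "http://xlinux.nist.gov/dads//", 301),
    # Status 302 as quoted in the warning above.
    Redirect("http://purl.org/dc/elements/1.1/",
             "http://dublincore.org/2010/10/11/dcelements.rdf", 302),
]

# Error codes per final URL, as shown in the report excerpts.
errors = {
    "http://www.mpib-berlin.mpg.de/de/DOK/metatagd.htm": 404,
    "http://xlinux.nist.gov/dads//": 404,
}

to_fix_first = broken_permanent(redirects, errors)
for redirect, code in to_fix_first:
    print(f"{redirect.source} -> {redirect.target} ({code})")
```

The temporary purl.org redirect is skipped because it is neither permanent nor broken, which matches the priority described in the question.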

        >Another thing (I might have mentioned before): we use Dublin Core Metadata, which requires the following header information (see http://dublincore.org/documents/dc-html/):
        >
        > <head profile="http://dublincore.org/documents/2008/08/04/dc-html/">
        > <title>...</title>
        > <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" >
        > <meta name="DC.title" content="..." >
        > </head>
        >Unfortunately that means that I get a warning
        >
        >http://purl.org/dc/elements/1.1/
        >redirected to: http://dublincore.org/2010/10/11/dcelements.rdf
        >status code: 302 (object temporarily moved)
        >
        >for every single page on my site.
        >Would a "don't check 'http://purl.org/dc/elements/1.1/'" help in this case?

        Sure, simply exclude http://purl.org. Maybe I answered that last time.

        >BTW, it seems that Xenu can check Tono's URLs just fine, but my mail about that hasn't made it to the list yet.

        I'm not the moderator :)

        Tilman

        >
        >Thanks again
        >Thomas
        >