Re: [xenu-usergroup] link to wikipedia

  • Ron Jones
    Message 1 of 15 , Apr 26, 2010
      Tilman Hausherr wrote:
      > On Mon, 26 Apr 2010 11:40:55 +0100, Jack Stringer wrote:
      >
      >>>>> since checking links to Wikipedia seems to be a legitimate task
      >>>>> for Xenu, shouldn't someone contact them and ask for the removal
      >>>>> of the robots.txt exclusion? Or is there a reason that Xenu and
      >>>>> Wikipedia don't work together smoothly, e.g. because of the
      >>>>> internal redirects in Wikipedia?
      >>>>>
      >>>>> By the way,
      >>>>>
      >>>>> User-agent: Xenu
      >>>>> Disallow: /
      >>>>>
      >>>>> is also contained in http://de.wikipedia.org/robots.txt.
      >>
      >>
      >> There are a couple of thousand users using Xenu. If they all started
      >> sending requests to the Wikipedia site, the server would soon get bogged
      >> down trying to deliver the pages. It's the same as those people using
      >> website copying software. I have had my photography gallery go very
      >> very slow at times just because someone is trying to hoover up the
      >> pictures.
      >>
      >> What would be nice is to find out from Wikipedia what changes need
      >> to be made to Xenu to make it nicer to their systems, e.g. some sort
      >> of delay when getting pages from Wikipedia servers.
      >
      > Xenu is already "nice", i.e. it makes a HEAD request, not a GET
      > request. My opinion is that the wikipedia software is crappy. The
      > organisation is mostly concentrated on collecting money, enforcing
      > censorship, altering history, and being busy with itself (many of the
      > admins are just very intelligent kids with too much time), instead of
      > delivering a high quality product by running a Continuous Improvement
      > Process.
      >
      > Tilman (holder of a scarlet letter from the wikipedia arb board :-))
      > http://en.wikipedia.org/wiki/User:Tilman
      >
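
The exclusion quoted above can be checked mechanically. The following is a minimal Python sketch, not Xenu's own code: it feeds the two robots.txt lines from this thread to the standard library parser and asks whether a given user agent may fetch a page. The helper name `is_allowed` and the "ExampleBot" agent are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The two lines quoted in this thread from de.wikipedia.org/robots.txt.
ROBOTS_TXT = """\
User-agent: Xenu
Disallow: /
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper: parses the rules from a string so that no
    network access is needed.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://de.wikipedia.org/wiki/Hauptseite"
    print("Xenu allowed:", is_allowed("Xenu", page))
    print("ExampleBot allowed:", is_allowed("ExampleBot", page))
```

With only these two lines present, the rule applies to Xenu alone; an agent that no entry names is allowed by default.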

      There are plenty of old admins, I can assure you :-)
      The software is probably rough - it *is* still a charity. Due to the
      mindless antics of loads of juvenile vandals, it needs a large team of
      vandal fighters (not just admins - there are only 1000 regular ones) to keep
      the pages more or less intact. English Wikipedia has around 150-200 page
      changes per minute, and around 10% of those have to be reverted, so the
      servers are already very busy, and I think allowing Xenu in will grind them to
      a halt. If the Dutch mirrors go down and I have to connect directly (from
      the UK) to the USA servers, it can take 30 seconds plus for a medium page
      to load.

      Ron Jones
      Process Safety & Development Specialist
      Don't repeat history, unreported chemical lab/plant near misses at
      http://www.crhf.org.uk Only two things are certain: The universe and
      human stupidity; and I'm not certain about the universe. ~ Albert
      Einstein
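
For readers wondering what "HEAD instead of GET" plus "some sort of delay" could look like in practice, here is a rough Python sketch. It is not how Xenu works internally; `PoliteChecker`, `build_head_request`, the two-second default and the user agent string are all assumptions made for illustration.

```python
import time
import urllib.request


class PoliteChecker:
    """Enforce a minimum pause between requests to the same host."""

    def __init__(self, delay_seconds=2.0, clock=time.monotonic, sleep=time.sleep):
        self.delay_seconds = delay_seconds
        self._clock = clock          # injectable so the logic can be tested
        self._sleep = sleep
        self._last_request = {}      # host -> time of the previous request

    def wait_turn(self, host):
        """Sleep if the host was contacted too recently; return seconds waited."""
        waited = 0.0
        last = self._last_request.get(host)
        if last is not None:
            remaining = self.delay_seconds - (self._clock() - last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last_request[host] = self._clock()
        return waited


def build_head_request(url, user_agent="ExampleLinkChecker/0.1"):
    """Build a HEAD request: the server returns headers only, no page body."""
    return urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": user_agent}
    )
```

A caller would run `checker.wait_turn(host)` before each `urllib.request.urlopen(build_head_request(url))`. A HEAD request saves the server from sending the page body, but the server may still have to do most of the work of producing the page, so the delay is what actually spreads the load.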
    • Tilman Hausherr
      Message 2 of 15 , May 6, 2010
        Now it does save/export and restore the milliseconds value.
        http://home.snafu.de/tilman/tmp/xenubeta.zip

        Tilman


        On Sun, 18 Apr 2010 13:03:46 +0200, Tilman Hausherr wrote:

        >Although Xenu isn't a SEO tool, it is being "misused" as such. A guy
        >asked to get the duration in milliseconds, and google has recently
        >announced that loading time of websites would be taken into
        >consideration.
        >
        >A new beta version is here:
        >http://home.snafu.de/tilman/tmp/xenubeta.zip
        >
        >This is just a test so you see how it looks and give feedback. The
        >milliseconds value isn't saved in the .XEN file, nor in the export file.
        >(This will be done at a later time). If you need the milliseconds
        >feature, please test it and give feedback about whether this is usable,
        >or annoying.
        >
        >Below are all the changes since the last regular version. If you'd like to
        >support me, please test it and give feedback.
        >
        >Tilman
        >
        >=====================
        >
        >Major improvements:
        >24.2.2010: Check the domains of mail addresses (DNS lookup for MX
        >record)
        >
        >Minor improvements:
        >7.12.2009: Include PARSETEST4 section in general release (convert
        >characters >80H to %XX, for "international" URLs)
        >19.12.2009: For "international" characters in local files: Use Unicode
        >for local directory search, URL launch in browser, read/check local
        >files
        >20.12.2009: But not for Windows 95/98/ME
        >22.12.2009: add ".class" for applets if needed, replace "." with "/".
        > example:
        >http://www.colorado.edu/physics/2000/applets/bec.html
        >27.12.2009: updated to NSIS 2.46
        >10.1.2010: use version 6 list column sort arrows on XP and higher
        >14.1.2010: added Description column
        >15.1.2010: added warning when settings overwritten by profile
        >16.1.2010: attempt at decoding .jar files for APPLET ARCHIVE thanks to
        > http://www.codeguru.com/cpp/cpp/cpp_mfc/article.php/c4049/
        > However:
        > - only one .jar archive per applet
        > - no unicode in file names
        > - name of archive must end with .jar
        > - .jar file must be internal, or the class link will
        >remain broken
        > - .class "in Jar" property isn't saved in .XEN file
        >(which prevents standard access in favor of waiting for .jar lookup)
        >24.1.2010: added <video src=
        >27.1.2010: improved list control divider double click (title is the
        >minimum)
        >26.2.2010: improved extra text in domain mail check
        >13.3.2010: Get page body only if not redirection or redirection but no
        >"Location:" in header
        > (should make PARSETEST3 fix superfluous)
        >16.3.2010: ...
        >30.3.2010: Abort box for ftp orphan search
        >2.4.2010: [Options] Accept="*/*" (default value)
        >14.4.2010: milliseconds in duration
        > (in progress; missing: export, save/load)
        >
        >Bug fixes:
        >15.12.2009: PARSETEST4 section: replaced "> 80X" with ">= 80X"
        >20.12.2009: added version check for Unicode Clipboard and Sitemap for
        >Windows 95/98/ME (like 27.1.2009)
        >21.12.2009: corrected broken banner links
        >22.12.2009: tell "anchor occurs multiple times" only once per URL
        >4.1.2010: remove stuff after "?" in mailto: due to Microsoft error in
        >AfxParseURLEx()
        >10.1.2010: fixed list column sort arrows wrongly displayed in unsorted
        >columns (on 7, but not on XP)
        >12.1.2010: fixed "//" bug in applet codebase in local url
        >15.1.2010: disabled and unchecked "Inactive" checkbox after loading new
        >profile
        >18.1.2010: fixed title line of tab export
        >20.1.2010: Don't assume URLs to be UTF-8, use current charset instead
        > However: this solution isn't perfect, because the correct
        >charset of an URL would be the referring URL
        > But in most cases it will work, because URLs usually
        >have the same charset
        > Known bug: Root URL with exotic characters
        >20.1.2010: Corrected exotic URLs in sitemap
        >26.1.2010: Fixed % in file: URLs, only convert %XX
        >27.1.2010: "Conversion to lowercase" option uses codepage for conversion
        >31.1.2010: Fixed bug in report (max size + max size url), probably
        >introduced on 15.1.2010
        >15.3.2010: vNormalizeURL() with conversion to UTF8 prior to
        >AfxMyParseURL()
        > store URLs in UTF8, unless already ANSI or ISO-8859-1 (1252)
        > vRemovePercents for display only
        >3.4.2010: prevent reentrant calls to vDoIdle();
        > set fileNotFound status if tmp URL content file deleted by
        >antivirus software
        >10.4.2010: replaced "> 80X" with ">= 80X" in vAnsi2EntityEscaped()
        >
        >
        >------------------------------------
        >
        >Yahoo! Groups Links
        >
        >
        >
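
Several changelog entries above concern converting characters at or above 80H to %XX, including the fix from "> 80X" to ">= 80X". The following Python sketch illustrates that idea only; it is not taken from Xenu's source, and the function name `escape_non_ascii` and its `charset` parameter are hypothetical.

```python
def escape_non_ascii(url: str, charset: str = "utf-8") -> str:
    """Percent-encode every byte >= 0x80 and leave ASCII bytes untouched.

    The charset decides which bytes a character becomes, which is why
    the same URL can escape differently under different encodings.
    """
    out = []
    for byte in url.encode(charset):
        # ">=" rather than ">": byte 0x80 itself is non-ASCII and must
        # be escaped too (it is the euro sign in Windows-1252).
        if byte >= 0x80:
            out.append(f"%{byte:02X}")
        else:
            out.append(chr(byte))
    return "".join(out)


if __name__ == "__main__":
    print(escape_non_ascii("http://example.org/café"))
    print(escape_non_ascii("http://example.org/café", "latin-1"))
    print(escape_non_ascii("€", "cp1252"))
```

With a strict "greater than" comparison, byte 80H would slip through unescaped, which is the kind of off-by-one the two bug-fix entries describe.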
      • Ven. S. Upatissa (g)
        Message 3 of 15 , May 6, 2010
          On my local hard disk I have a folder containing hundreds of html files,
          and an index.html file that contains links to all of them.

          When I run xenu on the index file, it correctly reports no broken links,
          but it also reports that all of the other files are orphans.

          Why is this? What am I doing wrong?

          -Thanks
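
One way to cross-check such a report independently of Xenu is sketched below in Python. It assumes "orphan" means an .html file in the folder that index.html does not link to, that all files sit in one flat folder, and that links are plain relative `<a href>` links. `LinkCollector` and `find_orphans` are names invented for this sketch, and nothing here reflects how Xenu itself searches for orphans.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urldefrag


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)


def find_orphans(folder, index_name="index.html"):
    """Return the .html files in folder that index_name does not link to."""
    folder = Path(folder)
    collector = LinkCollector()
    collector.feed((folder / index_name).read_text(encoding="utf-8"))

    linked = {index_name}
    for href in collector.links:
        path, _fragment = urldefrag(href)      # drop any "#anchor" part
        if path and "://" not in path:         # keep local relative links only
            linked.add(unquote(path))          # "my%20file.html" -> "my file.html"

    on_disk = {p.name for p in folder.glob("*.html")}
    return sorted(on_disk - linked)
```

If this returns an empty list while the tool still reports orphans, differences in how file names are compared (letter case, escaped characters) are one thing worth ruling out.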
        • Tilman Hausherr
          Message 4 of 15 , May 6, 2010
            Don't know.... send it to me in a zip, and send me a .XEN file in a ZIP
            too, at tilman at snafu dot de.


            Tilman

            On Fri, 07 May 2010 07:59:19 +0530, Ven. S. Upatissa (g) wrote:

            >On my local hard disk I have a folder containing hundreds of html files,
            >and an index.html file that contains links to all of them.
            >
            >When I run xenu on the index file, it correctly reports no broken links,
            >but it also reports that all of the other files are orphans.
            >
            >Why is this? What am I doing wrong?
            >
            >-Thanks