3108 patch for large logs, countries, referrer, 32 bit overflow, 12+ month logs, etc.

  • Landon Noll
    Feb 10, 2005
      I have updated comments related to my webalizer-2.01-10 patch based on
      previous feedback. Those who have been reading this group since July
      2004 will recognize this as my 4 part webalizer-2.01-10 patch. That
      code patch underwent a slight change to make it easier to apply to
      Solaris systems. Otherwise it is the same.

      The URL:


      contains 4 patches for webalizer-2.01-10, 3 of which I recommend to
      all webalizer users and the 4th is a re-package of the geolizer patch.

      =-= The 0.basic.patch =-=

      The 0.basic.patch is found at:


      does the following:

      * ability to process very large log files (> 2GB in size)

      * countries patch

      Some of the entries on the list are not countries. In some cases the
      nation state status is contested. In other cases the entry is related
      to a territory that does not claim to be a country. In some cases what
      some claim is a country is in dispute by another country. And things
      like .arpa are not a country.

      I recommend that one use the term 'location' instead of 'Nation' or
      'Country' to avoid the whole mess. ;-)

      Added are some missing locations (from the ISO UN codes and from
      GeoIP's list). Some location names have been corrected or changed to
      their official name. Added some more TLDs.

      * avoid referrer spamming

      Spammers and other low-life forms have been stuffing the "top N
      referrer" table in order to get webalizer to generate links to their
      sites ... (perhaps because they think this will improve their search
      engine placement or perhaps because they wish to direct people to a
      poisoned web page in an effort to exploit some browser bug?). Whatever
      the reason, we don't need to give them their links.

      This patch turns the entries in the "top N referrer" table into plain
      text values instead of links (A tags).

      * correctly process log entries made during a leap second

      * long referrer and search patch

      Quite a few referrer and search strings are between 128 and 256 chars
      in length. Avoid truncating them.
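
The leap-second item above can be illustrated with a small sketch. This is
not the code from the patch; valid_log_time is a hypothetical helper, and the
only point it makes is that a timestamp check must accept a seconds value of
60, since a leap second is logged as hh:mm:60.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not taken from webalizer or the patch.
 * Checks the HH:MM:SS fields of a log timestamp.
 * A leap second shows up as second 60, so the seconds field is
 * allowed to range over 0..60 rather than 0..59.
 */
bool valid_log_time(int hour, int min, int sec)
{
    if (hour < 0 || hour > 23) {
        return false;
    }
    if (min < 0 || min > 59) {
        return false;
    }
    /* A check of "sec > 59" here would reject leap-second entries. */
    return sec >= 0 && sec <= 60;
}
```

With this check, an entry stamped 23:59:60 is accepted, while 23:59:61 is
still rejected as malformed.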

      =-= The 1.64bit.patch =-=

      The 1.64bit.patch is found at the URL:


      does the following:

      * avoid 32 bit counter overflow

      For very busy sites, 32 bit signed counters can overflow. This is
      particularly likely when using webalizer to cover a long span of
      time. This patch converts a few values to be u_int64_t to avoid these
      numeric overflow problems.
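
As a rough illustration of the overflow problem (a sketch only, not code from
the patch), the following keeps the same running total in a 32 bit and a
64 bit counter. The struct and function names are made up for the example. It
uses unsigned types because signed overflow is undefined behavior in C, and
it uses the C99 spelling uint64_t where the patch uses u_int64_t.

```c
#include <stdint.h>

/* Hypothetical totals record; the field names are illustrative only. */
struct totals {
    uint32_t hits32;    /* wraps around after 4,294,967,295 */
    uint64_t hits64;    /* large enough for any realistic log span */
};

/* Add n hits to both counters. */
void add_hits(struct totals *t, uint64_t n)
{
    /* Unsigned arithmetic wraps modulo 2^32, silently losing the high bits. */
    t->hits32 += (uint32_t)n;
    t->hits64 += n;
}
```

Adding 3,000,000,000 hits twice leaves hits64 at 6,000,000,000, while hits32
wraps around to 1,705,032,704 (6,000,000,000 minus 2^32).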

      =-= The 2.hist.patch =-=

      The 2.hist.patch is found at the URL:


      does the following:

      * extend the summary page for longer than 12 months

      By default, webalizer only keeps the last 12 months of data. And at
      the start of a month, the oldest month is discarded resulting in only
      11+ months of data.

      This code gets around the 12 month limit by maintaining a history of
      older months in a parallel directory ../history.

      See the webalizer page:


      for an example of this effect.

      NOTE: After the 2.hist.patch has been applied, the
      track-hist tool:


      should be run on a monthly basis. See the comments in the 2.hist.patch
      file as well as the track-hist tool itself for details.

      =-= The optional 3.geolizer.patch =-=

      The optional 3.geolizer.patch is found at the URL:


      is for use if AND ONLY IF you use a MaxMind:


      GeoIP database. It is just a reapplication of the geolizer.patch
      that works for Un*x / Linux / GNU-Linux systems after the first 3
      patches have been applied.

      -=-=- in Summary -=-=-

      At a minimum, I'd highly recommend the 0.basic.patch patch.

      Large web sites will want the 1.64bit.patch patch (after the
      0.basic.patch has been applied). It doesn't hurt smaller sites to have
      it either.

      Sites that want to keep more than 12 months of webalizer stats need
      the 0.basic.patch, 1.64bit.patch and the 2.hist.patch as well as the
      track_hist tool.

      chongo (Landon Curt Noll) /\oo/\
      Share and enjoy! :-)