Re: [webalizer] incremental logs not processing

  • Dave Patton [DCP]
    Message 1 of 5 , Jan 21, 2005
      mlecha wrote:
      >
      > I need to know what naming convention Webalizer is looking for
      > the logs to be in

      There is no "naming convention" - Webalizer will try and process
      whatever logfile you tell it to process.

      > I gather that I'll need to decompress and concatenate these logs some
      > how, and then name them in a way that webalizer will recognize.

      You are talking about two different things:
      - whether or not to concatenate logfiles and process them monthly
      - how to tell webalizer the name of the logfile to process

      > My server seems to be rolling over the logs every 3 days.
      >
      > present filename format:
      > access_log
      > access_log-20040723.gz
      > access_log-20040726.gz
      > access_log-20040729.gz
      > access_log-20040801.gz
      > access_log-20040804.gz

      That doesn't explain what's in access_log. Is it the "live"
      Apache access log, or is it the result of unzipping
      access_log-20040720.gz?

      > webalizer.conf
      > LogFile /var/log/apache2/access_log
      > Incremental yes

      And your webalizer.conf should also have:
      ----------------------------------------------------------------------
      # LogFile defines the web server log file to use. If not specified
      # here or on the command line, input will default to STDIN. If
      # the log filename ends in '.gz' (ie: a gzip compressed file), it will
      # be decompressed on the fly as it is being read.
      LogFile /var/log/apache2/access_log
      ----------------------------------------------------------------------

      If you want it to process access_log-20040723.gz you would use:
      LogFile /var/log/apache2/access_log-20040723.gz
      or else supply the logfile data on STDIN.

      For example, you can run webalizer from a shell script
      that is run as a cron job, and do something like:
      ------------------------------------------------------------------
      # run from our Webalizer directory
      cd /path/to/webalizer

      # yesterday's log is named my_log.YYYYMMDD.bz2
      # logs are bzip compressed
      # yesterdayslog.pl is a Perl script that generates
      # the name of the logfile for "yesterday"
      bzcat /mydir/`perl ~/yesterdayslog.pl` | ./webalizer -q -c my.conf

      # example of processing a log manually
      # cd /path/to/webalizer
      # bzcat /mydir/my_log.20040922.bz2 | ./webalizer -q -c my.conf
      ------------------------------------------------------------------
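To catch up on archives that have already piled up (such as the access_log-YYYYMMDD.gz files listed in the original question), one option is to run Webalizer once per archive, oldest first, with "Incremental yes" in the configuration. The following is only a sketch under assumptions: the function name process_archives, the WEBALIZER variable, and the example paths are illustrative, and it relies on Webalizer accepting the log file name as its last command-line argument and decompressing '.gz' files itself.

```shell
#!/bin/sh
# Sketch: run Webalizer once per archived log, oldest first.
# WEBALIZER, process_archives, and the example paths are illustrative names.
WEBALIZER=${WEBALIZER:-webalizer}

process_archives() {
    logdir=$1
    conf=$2
    # YYYYMMDD suffixes sort chronologically, and the shell expands
    # globs in sorted order, so the oldest archive is processed first.
    for f in "$logdir"/access_log-*.gz; do
        [ -e "$f" ] || continue    # the glob matched nothing
        "$WEBALIZER" -q -c "$conf" "$f" || return 1
    done
}

# Example call (commented out so that sourcing this file does nothing):
# process_archives /var/log/apache2 /etc/webalizer.conf
```

Processing in date order matters because incremental mode carries state forward from one run to the next.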

      > I gather that Webalizer would prefer to find the logs divided into 1
      > month per file?

      No. And if you decide YOU prefer that, then you'll have
      to change your webalizer.conf file:
      Incremental no

      --
      Dave Patton
      Canadian Coordinator, Degree Confluence Project
      http://www.confluence.org/
      My website: http://members.shaw.ca/davepatton/
    • mlecha
      Message 2 of 5 , Jan 21, 2005
        > > I need to know what naming convention Webalizer is looking for
        > > the logs to be in
        >
        > There is no "naming convention" - Webalizer will try and process
        > whatever logfile you tell it to process.

        So it only attempts to process the single "access_log", as I have
        specified in the webalizer.conf file? Not the archived ones.

        > > present filename format:
        > > access_log
        > > access_log-20040723.gz
        > > access_log-20040726.gz
        > > access_log-20040729.gz
        > > access_log-20040801.gz
        > > access_log-20040804.gz
        >
        > That doesn't explain what's in access_log. Is it the "live"
        > Apache access log, or is it the result of unzipping
        > access_log-20040720.gz?

        Yes, access_log is the live Apache2 access log, and gets archived
        automatically every 3 days. I'll have to find out how to change this
        in Apache2. I think that we'll go with month long logs for our site.

        > If you want it to process access_log-20040723.gz you would use:
        > LogFile /var/log/apache2/access_log-20040723.gz
        > or else supply the logfile data on STDIN.

        Right. I understand.

        > For example, you can run webalizer from a shell script
        > that is run as a cron job, and do something like:
        > ------------------------------------------------------------------
        > # run from our Webalizer directory
        > cd /path/to/webalizer
        >
        > # yesterday's log is named my_log.YYYYMMDD.bz2
        > # logs are bzip compressed
        > # yesterdayslog.pl is a Perl script that generates
        > # the name of the logfile for "yesterday"
        > bzcat /mydir/`perl ~/yesterdayslog.pl` | ./webalizer -q -c my.conf
        >
        > # example of processing a log manually
        > # cd /path/to/webalizer
        > # bzcat /mydir/my_log.20040922.bz2 | ./webalizer -q -c my.conf
        > ------------------------------------------------------------------

        Could you publish ~/yesterdayslog.pl ? I'll need to see how to
        generate the filename with the correct date.

        I can see that I'd want to process the log (of whatever period) after
        it's been archived, or else I could lose some data, e.g. if the log is
        archived before the webalizer cron job is run.
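One way to guard against that gap, sketched here under assumptions, is to have the cron job feed Webalizer the newest archive first and the live log second. This leans on incremental mode ("Incremental yes") ignoring records older than the last one it has already processed, so re-reading an archive should not double count. The names process_latest_then_live and WEBALIZER are hypothetical, and the file layout is the one from the listing above.

```shell
#!/bin/sh
# Sketch: process the newest archive, then the live log, in one cron run.
# Assumes "Incremental yes" in the config so repeated records are skipped.
WEBALIZER=${WEBALIZER:-webalizer}

process_latest_then_live() {
    logdir=$1
    conf=$2
    latest=
    # Globs expand in sorted order, so the last match has the newest date.
    for f in "$logdir"/access_log-*.gz; do
        [ -e "$f" ] && latest=$f
    done
    if [ -n "$latest" ]; then
        "$WEBALIZER" -q -c "$conf" "$latest" || return 1
    fi
    "$WEBALIZER" -q -c "$conf" "$logdir/access_log"
}

# Example call (commented out so that sourcing this file does nothing):
# process_latest_then_live /var/log/apache2 /etc/webalizer.conf
```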

        > > I gather that Webalizer would prefer to find the logs divided into 1
        > > month per file?
        >
        > No. And if you decide YOU prefer that, then you'll have
        > to change your webalizer.conf file:
        > Incremental no

        Right. Ok, this is clearer now. I misunderstood what "incremental"
        processing would do for me. I was thinking that it would go through
        the directory and process all the archived logs.

        Thanks,

        Michael
      • Dave Patton [DCP]
        Message 3 of 5 , Jan 21, 2005
          mlecha wrote:
          >>For example, you can run webalizer from a shell script
          >>that is run as a cron job, and do something like:
          >>------------------------------------------------------------------
          >># run from our Webalizer directory
          >>cd /path/to/webalizer
          >>
          >># yesterday's log is named my_log.YYYYMMDD.bz2
          >># logs are bzip compressed
          >># yesterdayslog.pl is a Perl script that generates
          >># the name of the logfile for "yesterday"
          >>bzcat /mydir/`perl ~/yesterdayslog.pl` | ./webalizer -q -c my.conf
          >>
          >># example of processing a log manually
          >># cd /path/to/webalizer
          >># bzcat /mydir/my_log.20040922.bz2 | ./webalizer -q -c my.conf
          >>------------------------------------------------------------------

          > Could you publish ~/yesterdayslog.pl ? I'll need to see how to
          > generate the filename with the correct date.

          #!/usr/bin/perl -w
          # determine yesterday's log filename
          use strict;
          # call localtime once so all three fields come from the same instant
          # (note: 86400 seconds back can be off by a day around DST changes)
          my ($day, $month, $year) = (localtime(time() - 86400))[3, 4, 5];
          printf("my_log.%04d%02d%02d.bz2", $year + 1900, $month + 1, $day);

          --
          Dave Patton
          Canadian Coordinator, Degree Confluence Project
          http://www.confluence.org/
          My website: http://members.shaw.ca/davepatton/
        • Bradford L. Barrett
          Message 4 of 5 , Jan 21, 2005
            As an alternative that avoids the overhead of loading the Perl
            interpreter (which is quite large), you can just use the 'date'
            command, e.g.:

            > bzcat /mydir/my_log.`date +%Y%m%d -d yesterday`.bz2 | ./webalizer ...
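The "-d yesterday" form is a GNU date feature. A hedged sketch of a more portable variant follows: it tries the GNU form first and falls back to the BSD/macOS "-v-1d" adjustment if that fails. The function names yesterday_stamp and yesterday_log are made up for this example, and the my_log.YYYYMMDD.bz2 pattern is the one used earlier in the thread.

```shell
#!/bin/sh
# Sketch: build yesterday's log filename without Perl.

yesterday_stamp() {
    # GNU date understands "-d yesterday"; BSD/macOS date uses "-v-1d".
    # Options are placed before the +FORMAT operand for portability.
    date -d yesterday +%Y%m%d 2>/dev/null || date -v-1d +%Y%m%d
}

yesterday_log() {
    printf 'my_log.%s.bz2\n' "$(yesterday_stamp)"
}

# Example use in the cron script (commented out here):
# bzcat "/mydir/$(yesterday_log)" | ./webalizer -q -c my.conf
```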

            --

            On Fri, 21 Jan 2005, Dave Patton [DCP] wrote:

            >
            > mlecha wrote:
            > >>For example, you can run webalizer from a shell script
            > >>that is run as a cron job, and do something like:
            > >>------------------------------------------------------------------
            > >># run from our Webalizer directory
            > >>cd /path/to/webalizer
            > >>
            > >># yesterday's log is named my_log.YYYYMMDD.bz2
            > >># logs are bzip compressed
            > >># yesterdayslog.pl is a Perl script that generates
            > >># the name of the logfile for "yesterday"
            > >>bzcat /mydir/`perl ~/yesterdayslog.pl` | ./webalizer -q -c my.conf
            > >>
            > >># example of processing a log manually
            > >># cd /path/to/webalizer
            > >># bzcat /mydir/my_log.20040922.bz2 | ./webalizer -q -c my.conf
            > >>------------------------------------------------------------------
            >
            > > Could you publish ~/yesterdayslog.pl ? I'll need to see how to
            > > generate the filename with the correct date.
            >
            > #!/usr/bin/perl -w
            > # determine yesterdays log filename
            > $year = (localtime(time() - 86400))[5] + 1900;
            > $month = (localtime(time() - 86400))[4] + 1;
            > $day = (localtime(time() - 86400))[3];
            > printf("my_log.%04d%02d%02d.bz2", $year, $month, $day);
            >
            > --
            > Dave Patton
            > Canadian Coordinator, Degree Confluence Project
            > http://www.confluence.org/
            > My website: http://members.shaw.ca/davepatton/
            >
            > Webalizer homepage: http://www.webalizer.org
            --
            Bradford L. Barrett brad@...
            A free electron in a sea of neutrons DoD#1750 KD4NAW

            The only thing Micro$oft has done for society, is make people
            believe that computers are inherently unreliable.