Re: [webalizer] processing a large archive of logs

  • Enric Naval
    Message 1 of 6 , Feb 9, 2005
      Webalizer just looks inside the files. It will ignore
      the file's timestamp.

      "Ignoring history" and "setting incremental to off"
      are the same thing.

      Ignoring history means that if you process the same
      logfile twice, you will probably destroy your stats or
      totally screw them beyond recognition. Even a single
      line processed twice may cause strange effects in your
      stats.

      Ignoring history always processes every line in the
      logfiles, which makes it slower.

      On the other hand, keeping history means that you can
      happily process the same logfile as many times as you
      want, and it will only take into account the new
      lines, ignoring the old ones.

      Keeping history is also faster, because it does not
      process the old lines. The problem appears when the
      history file breaks. You should always keep backups of
      your old logfiles, on CDs or other media, so you can
      reprocess your logfiles if you can't recover your
      history file.
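
      The same caution can be applied to the state files
      themselves. A minimal sketch, assuming the state files are
      the 'webalizer.hist' and 'webalizer.current' files named
      elsewhere in this thread and that they sit in the output
      directory; the function name and both arguments are
      hypothetical:

```shell
#!/bin/bash
# Sketch: copy webalizer's state files aside before a run.
# Assumption: the state files are named webalizer.hist and
# webalizer.current and live in the output directory.
# backup_state, out_dir and backup_dir are illustrative names.
backup_state() {
    local out_dir="$1" backup_dir="$2" f
    mkdir -p "$backup_dir" || return 1
    for f in webalizer.hist webalizer.current; do
        # Copy only the files that actually exist.
        if [ -f "$out_dir/$f" ]; then
            cp -p "$out_dir/$f" "$backup_dir/$f" || return 1
        fi
    done
}
```

      It could be called as
      "backup_state /path/to/output /path/to/backup" before
      each webalizer run.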

      If you keep history, then changes in your
      webalizer.conf file will only apply to the new lines
      processed, not the old ones already counted as
      processed in the history file. To get the changes
      applied to all your stats, you have to ignore history
      and reprocess all logfiles from the start, so the
      stats get re-done from scratch, then keep history
      again.



      To loop the logs, if they are named so that they
      appear sorted when doing "ls", you can use:

      #!/bin/bash
      #
      # process_log.sh
      # process a logfile directory for webalizer
      #
      # The glob expands in sorted order, so no "ls" is needed.
      for i in /path/to/logfiles/*; do
          echo "Processing: $i"
          webalizer -more-options "$i"
      done


      Hum, I wrote a veeeery long email again :)

      --- Mark Steudel <msteudel@...> wrote:

      > I have a large archive of logs that I would like
      > to process. Can I
      > process the logs out of order? I'm just going to
      > write a little script to
      > list the content of a directory then loop through
      > each file and process it.
      > What are the best options for this? I'm a little
      > unclear as to what the
      > ramifications of ignoring history and using the
      > preserve incremental or not
      > using it. I've moved the files around so sometimes
      > the timestamps on the
      > files themselves are all the same, does that matter
      > or does webalizer just
      > look inside the files. Anyway any advice would be
      > appreciated.
      >
      > Thanks, Mark
      >
      >


      =====
      Enric Naval
      Management Informatics student at the UdL (Lleida)
      GRIHO webalizer.conf
      http://griho.udl.es/webalizer/webalizer.conf.txt



    • Bradford L. Barrett
      Message 2 of 6 , Feb 9, 2005
        > "Ignoring history" and "setting incremental to off"
        > are the same thing.

        This is incorrect. Ignoring history will lose all previous months'
        saved data, so you wind up with a main index showing only the month
        you just processed and none prior. It causes the program to ignore
        any existing 'webalizer.hist' file.

        Incremental mode allows you to process a month using multiple,
        partial log files. Setting it to off forces you to process
        whole months in a single log, and causes the program to ignore
        any existing 'webalizer.current' file.

        For the original question, as long as your log files are named
        correctly (i.e. they will list correctly in chronological order,
        typically named like YYYYMMDD-something), then you can use
        incremental mode and process all the logs like:

        for i in /path/to/logs/*; do webalizer "$i"; done

        Make sure your config file is set correctly for the output
        directory, hostname, etc... or specify them on the command
        line above.
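
        A slightly longer sketch of the same loop, under a few
        assumptions: log names sort chronologically, incremental mode
        is requested with webalizer's -p switch instead of in the
        config file, and the WEBALIZER variable and process_logs
        function are illustrative names that do not come from this
        thread:

```shell
#!/bin/bash
# Sketch: feed every log in a directory to webalizer in filename order.
# Assumes names sort chronologically (e.g. YYYYMMDD-something).
# WEBALIZER and process_logs are illustrative names.
WEBALIZER="${WEBALIZER:-webalizer}"

process_logs() {
    local dir="$1" log
    # Bash expands globs in sorted order, so date-prefixed names
    # come out oldest first.
    for log in "$dir"/*; do
        [ -f "$log" ] || continue    # skip subdirectories or an unmatched glob
        echo "Processing: $log"
        # -p asks webalizer to preserve state between runs (incremental mode).
        $WEBALIZER -p "$log" || return 1
    done
}
```

        Calling "process_logs /path/to/logs" stops at the first log
        that webalizer fails on, so later logs are not processed past
        a gap.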

        [...]

        Specifics:

        > > I have a large archive of logs that I would like
        > > to process. Can I
        > > process the logs out of order?

        No, you cannot process out of order as long as you use
        incremental mode. They must be in chronological order.

        > > I'm just going to
        > > write a little script to
        > > list the content of a directory then loop through
        > > each file and process it.
        > > What are the best options for this?

        See above..

        > > I'm a little
        > > unclear as to what the
        > > ramifications of ignoring history and using the
        > > preserve incremental or not
        > > using it.

        You should NEVER ignore history, and as long as you have partial
        logs (not full months), then you must use incremental mode.


        > > I've moved the files around so sometimes
        > > the timestamps on the
        > > files themselves are all the same, does that matter
        > > or does webalizer just
        > > look inside the files. Anyway any advice would be
        > > appreciated.

        The timestamps on the files themselves don't matter, unless
        you were going to rely on them to tell you what order to feed
        the files. If you named them correctly, then you already know,
        based on filename, in which order the files must be processed.
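
        One way to check the naming before a run is to list any files
        that would not sort by date. A rough sketch, assuming the
        YYYYMMDD-something convention mentioned above; the function
        name list_undated_logs is hypothetical:

```shell
#!/bin/bash
# Sketch: report log files whose names do not start with eight digits,
# since those would not sort chronologically in a plain glob.
# list_undated_logs is an illustrative name.
list_undated_logs() {
    local dir="$1" log name
    for log in "$dir"/*; do
        [ -f "$log" ] || continue
        name=$(basename "$log")
        case "$name" in
            [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]*) ;;  # looks like YYYYMMDD...
            *) echo "$name" ;;                             # needs renaming
        esac
    done
}
```

        The check only looks at the shape of the name; it does not
        confirm that the date in the name matches the records inside
        the file.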

        --
        Bradford L. Barrett brad@...
        A free electron in a sea of neutrons DoD#1750 KD4NAW

        The only thing Micro$oft has done for society, is make people
        believe that computers are inherently unreliable.
      • Mark Steudel
        Message 3 of 6 , Feb 11, 2005
          Thanks a bunch! One last question: I have a few logs that when they got
          downloaded to my local server the log files were empty. Will that cause a
          problem?
          Webalizer homepage: http://www.webalizer.org




        • Bradford L. Barrett
          Message 4 of 6 , Feb 11, 2005
            On Fri, 11 Feb 2005, Mark Steudel wrote:
            >
            > Thanks a bunch! One last question: I have a few logs that when they got
            > downloaded to my local server the log files were empty. Will that cause a
            > problem?

            No.. it will just tell you that no records were processed for that log,
            and no activity will be reported for that time period.
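
            To see in advance which periods will show no activity,
            the empty files can be listed before processing. A small
            sketch; the function name list_empty_logs is hypothetical:

```shell
#!/bin/bash
# Sketch: print the path of every zero-length file in a log directory.
# list_empty_logs is an illustrative name.
list_empty_logs() {
    local dir="$1" log
    for log in "$dir"/*; do
        [ -f "$log" ] || continue
        # -s is true only when the file exists and is larger than zero bytes.
        [ -s "$log" ] || echo "$log"
    done
}
```

            Running "list_empty_logs /path/to/logs" gives a list of
            files worth downloading again if the originals still exist.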

            --
            Bradford L. Barrett brad@...
            A free electron in a sea of neutrons DoD#1750 KD4NAW

            How do you give Microsoft the benefit of the doubt when you
            know that if you were to throw it in a room with truth, you'd
            risk a matter/anti-matter explosion? -- Nicholas Petreley IDG