Re: [webalizer] reports show less or no traffic

  • Bradford L. Barrett
    Message 1 of 11 , Jan 12, 2004
      The program only reports on what it's given.. the most common cause of
      what you describe is due to improper operations on the server, where logs
      are not processed. The proper sequence should be to rotate the log, then
      immediately process it with the webalizer before processing anything else.
      If you rotate a log and don't process it, that data will never be seen and
      therefore never reported, leaving 'holes' in the reporting similar to what
      you describe.

      --
      On Mon, 12 Jan 2004, Linux-Guru wrote:

      > Hi all,
      >
      > we are running webalizer on six Linux servers, each with about 400
      > virtual domains.
      > Until now there were no problems. For about 2 months, some domains
      > have had days with no or only minimal traffic (according to webalizer),
      > which is wrong (according to the Apache logs).
      > This behaviour appears on all servers, on some domains, which are always
      > the same. Usually it shows too little traffic for one to four days,
      > runs properly for some days, and then the errors come back.
      > Questions:
      > Does anybody have similar problems? (A friend of mine who is also in the
      > hosting business told me that he sees similar errors.)
      > Is more detailed information needed?
      > Does anybody know a solution?
      > Note: It is impossible to delete all reports and re-initialize
      > webalizer from scratch. Our customers don't allow such a procedure.
      >
      > Regards from Switzerland
      >
      > Tobias
      >
      >
      > Webalizer homepage: http://www.webalizer.org
      > Webalizer for NT: http://www.medasys-lille.com/webalizer/
      >
      > Yahoo! Groups Links
      >
      > To visit your group on the web, go to:
      > http://groups.yahoo.com/group/webalizer/
      >
      > To unsubscribe from this group, send an email to:
      > webalizer-unsubscribe@yahoogroups.com
      >
      > Your use of Yahoo! Groups is subject to:
      > http://docs.yahoo.com/info/terms/
      >
      >
      --
      Bradford L. Barrett brad@...
      A free electron in a sea of neutrons DoD#1750 KD4NAW

      The only thing Micro$oft has done for society, is make people
      believe that computers are inherently unreliable.
    • Linux-Guru
      Message 2 of 11 , Jan 12, 2004
        Hi,

        thanks for this fast answer.
        I took a close look at the config to find out whether this could be
        the reason and, if so, how to solve it.

        Current config is:
        httpd.conf:
        CustomLog "|/path/to/cronolog --symlink=/path/to/users_logdir/access_log
        --prev-symlink=/path/to/users_logdir/current_access_log
        /path/to/users_logdir/%Y/%m/%d/access_log" combined

        crontab:
        59 23 * * * /path/to/report.sh

        report.sh:
        #!/bin/bash
        cd /path/to/stats_configdir/
        ./start-report.sh
        [some other stuff to clean up Apache logs after 6 days]

        start_report.sh
        #!/bin/bash
        webalizer -c /path/to/stats_configdir/domain.conf

        As far as I can see, the problem could be that cronolog starts a new
        logfile at 00:00, and any log data not processed by then is "lost".
        If I am right, is there any way to tell webalizer to use
        /path/to/users_logdir/%Y/%m/%d/access_log, with /%Y/%m/%d calculated
        as "today - 1", and to start report.sh at 00:00?
        I cannot change start_report.sh, httpd.conf or domain.conf because they
        are written by our hosting automation tool, and I am not able to change
        anything there.

        Any help would be appreciated.

        Regards

        Tobias


      • Linux-Guru
        Message 3 of 11 , Jan 21, 2004
          Hi,

          I sent a similar mail to the list about a week ago, but didn't receive an answer.
          Here is my question/problem again:

          As you can see in the enclosed config, the logfiles are rotated daily (by cronolog, I believe at 00:00). At 23:59 a webalizer run is started.
          If it doesn't finish before 00:00, it keeps running, but on the new logfiles, which is not wanted.
          Is there any way to tell webalizer to use yesterday's logfiles (by wildcard or some kind of "today - 1 = yesterday" calculation)?
          How have other people solved this problem?

          Current config is:
          httpd.conf:
          CustomLog "|/path/to/cronolog --symlink=/path/to/users_logdir/access_log
          --prev-symlink=/path/to/users_logdir/current_access_log
          /path/to/users_logdir/%Y/%m/%d/access_log" combined

          crontab:
          59 23 * * * /path/to/report.sh

          report.sh:
          #!/bin/bash
          cd /path/to/stats_configdir/
          ./start-report.sh
          [some other stuff to clean up Apache logs after 6 days]

          start_report.sh
          #!/bin/bash
          webalizer -c /path/to/stats_configdir/domain.conf

          As far as I can see, the problem could be that cronolog starts a new
          logfile at 00:00, and any log data not processed by then is "lost".
          If I am right, is there any way to tell webalizer to use
          /path/to/users_logdir/%Y/%m/%d/access_log, with /%Y/%m/%d calculated
          as "today - 1", and to start report.sh at 00:00?
          I cannot change start_report.sh, httpd.conf or domain.conf because they
          are written by our hosting automation tool, and I am not able to change
          anything there.

          Any help would be appreciated

          Regards

          Tobias
        • enventa2000
          Message 4 of 11 , Mar 20, 2004
            My two cents. I concatenate all logs in /var/log/httpd to create a
            single log file. I recreate the file every night and then process it.
            This way, it doesn't matter whether the files have been rotated or
            not. It takes about five minutes to concatenate 200 MB of logs. Using
            incremental mode is advised, or webalizer will attempt to process the
            whole file.

            The only problem is the way logrotate names files.
            If I concatenate like this:

            cat $(ls /var/log/httpd/access_log*) > /tmp/logfile

            then the file called access_log.10 will come before access_log.1,
            causing all kinds of strange problems in the statistics.

            I'm sure there are tools to merge and sort files, but I use this
            small script in the same cron script where I call webalizer, just
            before calling webalizer:

            #!/bin/bash
            MERGED=/tmp/merged_httpd_logs
            LOGTMP=""
            n=200 # max number of rotated logs we presume we will ever have
            while [ "$n" -ne 0 ]; do
                # notice the extra space at the end to separate the names
                LOGTMP="$LOGTMP/var/log/httpd/access_log.$n "
                n=$((n - 1)) # n=n-1
            done
            # we add the last name (the live log) to the list
            LOGTMP="$LOGTMP/var/log/httpd/access_log"

            # start from an empty file so that each night's run recreates it
            : > "$MERGED"

            # we redirect 2 (the error output) to /dev/null so we won't
            # have to listen to complaints about non-existing files
            for i in $LOGTMP; do
                cat "$i" 2>/dev/null >> "$MERGED"
            done

            # Now you only need to configure webalizer to
            # process /tmp/merged_httpd_logs
            # We set its "nice" value to the lowest priority so it doesn't
            # steal resources from the httpd
            nice -n 19 webalizer
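
A possible alternative to the counting loop above, as a minimal sketch: let a version-aware sort put the rotated files in order, oldest first. It assumes GNU sort (for the -V option) and logrotate-style names of the form access_log.N. The function name merge_logs and its two arguments are made up for illustration and are not part of webalizer or logrotate.

```shell
#!/bin/bash
# merge_logs DIR OUT: concatenate DIR/access_log.N (highest N first, i.e.
# oldest first) followed by the live DIR/access_log into OUT.
# Assumption: GNU sort, whose -V option orders numeric suffixes numerically.
merge_logs() {
    local dir="$1" out="$2" f
    # Recreate the merged file on every run.
    : > "$out"
    printf '%s\n' "$dir"/access_log.[0-9]* | sort -rV |
    while IFS= read -r f; do
        # Skips the unexpanded pattern when there are no rotated logs.
        [ -f "$f" ] && cat "$f" >> "$out"
    done
    # The live log holds the newest entries, so it goes last.
    if [ -f "$dir/access_log" ]; then
        cat "$dir/access_log" >> "$out"
    fi
}
```

It could be called as `merge_logs /var/log/httpd /tmp/merged_httpd_logs` just before the webalizer run. Compressed rotations (for example access_log.2.gz) are not handled in this sketch.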



          • enventa2000
            Message 5 of 11 , Mar 20, 2004
              Hmm, the easiest solution could be to run webalizer at 23:00 instead
              of at 23:59, so it has plenty of time to process the logs?

              Or what about commenting out the cronolog and webalizer entries in
              the crontab and then making a script like this:

              #!/bin/bash
              # in crontab we have 59 23 * * * /path/to/this_script.sh

              webalizer parameter parameter parameter
              cronolog


              This way the logs will never be rotated before webalizer has finished
              running. Of course, this means that cronolog could start at 00:04 if
              webalizer took 5 minutes to run instead of one minute.

              (You can also change the start time for cron.daily and put the script
              there.)



            • Bradford L. Barrett
              Message 6 of 11 , Mar 20, 2004
                > Or what about commenting the cronolog and the webalizer entries in
                > the cronotab and then making a that says like this:
                >
                > #!/bin/bash
                > # in crontab we have 23 59 * * * /path/to/this_script.sh
                >
                > webalizer parameter parameter parameter
                > cronolog
                >
                >
                > This way the logs will never be rotated before webalizer has finished
                > running. Of course this means that cronolog could start at 00:04 if
                > webalizer took 5 minutes to run instead of one minute.

                Bad suggestion.. you should _never_ run the webalizer against the
                live log if you are using incremental mode (which you must use if
                you are rotating once a day). When you use incremental mode, you
                should always process the _rotated_ log. The README file contains
                an example that can be used on small/medium sized sites. You can
                also use a -USR1 signal to rotate your logs if you can't stand a
                couple seconds of downtime on your server to rotate the logs.
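
The "rotate first, then process the rotated log" sequence described above could look roughly like the following. This is only a sketch under stated assumptions, not the README example: the function name rotate_then_report and the SIGNAL_CMD / REPORT_CMD hooks are hypothetical, and the pid-file path in the usage note is a placeholder.

```shell
#!/bin/bash
# rotate_then_report LIVE ROTATED: rename the live log, ask the server to
# reopen its logs, then run the report on the rotated file only.
# SIGNAL_CMD and REPORT_CMD are hypothetical hooks, not webalizer options.
rotate_then_report() {
    local live="$1" rotated="$2"
    # Nothing to do if the live log is missing or empty.
    [ -s "$live" ] || return 0
    # A rename keeps the server's open file handle valid, so hits written
    # during the switch still land in the rotated file.
    mv "$live" "$rotated" || return 1
    # Tell the server to reopen its logs (for Apache, a USR1 signal).
    ${SIGNAL_CMD:-true}
    # Only now hand the rotated file to the webalizer (-p: incremental mode).
    ${REPORT_CMD:-webalizer -p} "$rotated"
}
```

A call might look like `SIGNAL_CMD="kill -USR1 $(cat /path/to/httpd.pid)" rotate_then_report /path/to/access_log /path/to/access_log.1`. After a USR1 restart, old server children can keep writing to the rotated file for a short while, so a pause before the report step may be needed on busy sites.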

                > (You can also change the start time for cron.daily and put the script
                > there)

                Most systems don't have a 'cron.daily'... (none of the 74 machines I admin
                have one :)

                --
                Bradford L. Barrett brad@...
                A free electron in a sea of neutrons DoD#1750 KD4NAW

                The only thing Micro$oft has done for society, is make people
                believe that computers are inherently unreliable.
              • enventa2000
                Message 7 of 11 , Mar 20, 2004
                  Then use:


                  #!/bin/bash
                  # this script is scheduled for 00:00 in crontab

                  # logs get rotated at exactly 00:00
                  cronolog

                  YEAR=$(date +%Y)
                  MONTH=$(date +%m)
                  DAY=$(date +%d)
                  YESTERDAY=$(($DAY - 1)) # n=n-1

                  USER_LOGFILE="/path/to/users_logdir/$YEAR/$MONTH/$YESTERDAY/access_log"

                  webalizer -c config_file -o output_dir "$USER_LOGFILE"


                  # of course you won't have bash on any of your machines, so
                  # you will have to use:
                  # #!/bin/sh
                  # YEAR=`date +%Y`
                  # etc ... :)
                  # by the way, is there an easier way to calculate n=n-1 in bash?
                  # how do you calculate it in sh?
                  # what machines are you running webalizer on?
                  #
                  #
                  # I don't use incremental mode. Every day I copy the logs (rotated
                  # or not) into /tmp, then I grep them and process them. This way
                  # any change in the cron script or the config files is
                  # propagated to all old stats pages (a pity the program only
                  # runs one year; in another six months I will have to download
                  # the source and see if I can change that). The process is
                  # surprisingly fast.
                • Bradford L. Barrett
                  Message 8 of 11 , Mar 20, 2004
                    On Sun, 21 Mar 2004, enventa2000 wrote:

                    > Then,use:
                    >
                    > #!/bin/bash
                    > # this script is schudled for 00:00 in crontab
                    [...]

                    Right.. process the _rotated_ log. Your script will work (sometimes,
                    see below) if you insist on using cronolog.. I prefer to rotate them
                    myself however :) There are about a zillion different ways to do the same
                    thing.

                    [...]
                    > # of course you won't have bash in any of your machines, so

                    Huh? Doesn't _every_ *nix box out there have bash?!?

                    > # by the way, is there an easier way to calculate n=n-1 in bash?

                    Umm, how about "let n=n-1" ?? If you mean for calculating the previous
                    day's value, then just use "date -d yesterday <format-string>".. In
                    your example:

                    [...]
                    > YEAR=$(date +%Y)
                    > MONTH=$(date +%m)
                    > DAY=$(date +%d)
                    > YESTERDAY=$(($DAY - 1)) # n=n-1
                    >
                    > # this is one line
                    > USER_LOGFILE="/path/to/users_logdir/$YEAR/$MONTH/$YESTERDAY/access_log"
                    [...]

                    This would fail on the first day of the month.. (i.e. on
                    January 1, 2004 your example would use year=2004, month=01 and
                    yesterday=0 instead of 2003, 12 and 31 like it should).
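
One way to sidestep the month and year rollover is to ask date for the whole path component at once. The following is a minimal sketch that assumes GNU date (for -d); the function name yesterday_log and its optional reference-date argument are illustrative only. Subtracting from $DAY has two further pitfalls besides the one noted above: bash arithmetic rejects 08 and 09 as invalid octal numbers, and a result such as 9 loses the leading zero that the %d directory name carries.

```shell
#!/bin/bash
# yesterday_log LOGDIR [REFDATE]: print the cronolog-style path of the log
# for the day before REFDATE (default: today).
# Assumption: GNU date, which understands "-d '<date> 1 day ago'".
yesterday_log() {
    local logdir="$1" ref="${2:-today}"
    # Year, month and day come from a single date call, so they roll over
    # together and keep their zero padding.
    printf '%s/%s/access_log\n' "$logdir" "$(date -d "$ref 1 day ago" +%Y/%m/%d)"
}
```

With a reference date of 2004-01-01, `yesterday_log /path/to/users_logdir 2004-01-01` gives /path/to/users_logdir/2003/12/31/access_log. In a cron script the second argument would simply be omitted.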

                    > # how do you calculate it in sh?

                    Dunno.. don't use it ;0

                    > # what machines are you running webalizer on?

                    Linux and solaris. Just an fyi, the primary development machine for
                    the webalizer is a Sun ultra 30 that dual boots Solaris and Linux,
                    and is where the main CVS tree is for the code.

                    > # I don't use incremental mode. Every day I copy the logs (rotated
                    > # or not) into /tmp, then I grep them and process them. This way
                    > # any change in the cron script or the config files is
                    > # propagated to all old stats pages (pity the program only
                    > # runs one year, in six months more I will have to download the
                    > # source and see if I can change that). The process is surpri-
                    > # singly fast.

                    Sounds like a lot of unnecessary work for a little gain! And as for
                    the 'only runs one year', that is incorrect. The program will process
                    as many days, months or years of logs that you can feed it. I have a
                    test log file I use all the time that goes back 5 years.

                    --
                    Bradford L. Barrett brad@...
                    A free electron in a sea of neutrons DoD#1750 KD4NAW

                    How do you give Microsoft the benefit of the doubt when you
                    know that if you were to throw it in a room with truth, you'd
                    risk a matter/anti-matter explosion? -- Nicholas Petreley IDG
                  • Enric Naval
                    Message 9 of 11 , Mar 20, 2004
                      --- "Bradford L. Barrett" <brad@...> wrote:
                      >
                      > On Sun, 21 Mar 2004, enventa2000 wrote:
                      >
                      > > Then,use:
                      > >
                      > > #!/bin/bash
                      > > # this script is schudled for 00:00 in crontab
                      > [...]
                      >
                      > Right.. process the _rotated_ log. Your script will work (sometimes,
                      > see below) if you insist on using cronolog.. I prefer to rotate them
                      > myself however :) There are about a zillion different ways to do the same
                      > thing.
                      >


                      I'd say at least one undecillion ways. I, for one,
                      prefer making a script that will process and munch
                      the data so that the program will find most of work
                      already done for it.

                      > [...]
                      > > # of course you won't have bash in any of your machines, so
                      >
                      > Huh? Doesn't _every_ *nix box out there have bash?!?
                      >

                      I don't know. But at the university we have a SunOS
                      SPARC box that lacks things like lynx, nmap, locate,
                      and so on. Of course, no webalizer.

                      Also, it doesn't have /etc/cron.daily :) Just
                      /etc/cron.d and one script inside called "logchecker"
                      with a 1997 copyright notice. And that script doesn't
                      rotate the logs. It CUTS them when they are too long.
                      Ugh. I believe they run the actual maintenance scripts
                      by hand every week. Ugh again. Oh well, it's not MY
                      server.


                      > > # by the way, is there an easier way to calculate n=n-1 in bash?
                      >
                      > Umm, how about "let n=n-1" ?? If you mean for calculating the previous
                      > day's value, then just use "date -d yesterday <format-string>".. In
                      > your example:
                      >
                      > [...]
                      > > YEAR=$(date +%Y)
                      > > MONTH=$(date +%m)
                      > > DAY=$(date +%d)
                      > > YESTERDAY=$(($DAY - 1)) # n=n-1
                      > >
                      > > # # this is one line
                      > > #
                      > > USER_LOGFILE="/path/to/users_logdir/$YEAR/$MONTH/$YESTERDAY/access_log"
                      > [...]
                      >
                      > Would fail on the first day of the month.. (ie: the day before
                      > january 1, 2004 in your example would use year=2004, month=01 and
                      > yesterday=0 instead of 2003, 12 and 31 like it should).


                      Ouch! Novice mistake. Me silly. Okay, we won't run
                      webalizer on day 1:

                      # DAY comes from $(date +%d), so the first of the month is "01"
                      if [[ "$DAY" != "01" ]]; then
                          : # run webalizer
                      else
                          : # no stats for today
                      fi

                      With a bit of luck, no customer will notice...

                      :) ok, ok, I'll use:

                      DATE=$( date -d yesterday +%Y%m%d )
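
A minimal sketch of the corrected path construction, assuming GNU date (the `-d yesterday` option is a GNU extension and is not available in every date implementation) and the placeholder /path/to/users_logdir layout from the quoted script. The function name is hypothetical. Asking date for yesterday's year, month and day in a single call avoids the first-of-the-month bug, and it also sidesteps a second one: in bash, `$((08 - 1))` is an error because a leading zero marks an octal literal.

```shell
#!/bin/bash
# Build the path of yesterday's log from a single date(1) call.
# Assumes GNU date; the directory layout is the placeholder from the thread.
yesterday_logfile() {
    local base=$1
    local ymd
    # One call, so year, month and day always belong to the same day.
    ymd=$(date -d yesterday +%Y/%m/%d) || return 1
    printf '%s/%s/access_log\n' "$base" "$ymd"
}

USER_LOGFILE=$(yesterday_logfile "/path/to/users_logdir")
echo "$USER_LOGFILE"
```

Run on January 1, 2004, this would point at the 2003/12/31 directory rather than 2004/01/0.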


                      >
                      > > # how do you calculate it in sh?
                      >
                      > Dunno.. don't use it ;0
                      >

                      I feel guilty for not being geek enough to be able to
                      use it XD I believe that bash is an acronym for
                      Bourne Again SHell (a pun on "born again"; the original
                      sh was written by Stephen Bourne), and it was developed
                      as a replacement for sh, so there must be very old
                      machines with just sh.
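
For what it's worth, here is a sketch of n=n-1 that should work without bash. Arithmetic expansion `$(( ))` is part of the POSIX shell language, and `expr` is the usual fallback for pre-POSIX Bourne shells (whether the /bin/sh on any particular old machine needs the fallback is an assumption to check). The variable names are illustrative only.

```shell
#!/bin/sh
# Decrement a counter without bash-only syntax such as "let".
n=10

# POSIX arithmetic expansion.
n=$((n - 1))
echo "$n"        # 9

# Fallback for pre-POSIX Bourne shells: expr(1) with backticks.
n=`expr "$n" - 1`
echo "$n"        # 8

# Values from date(1) such as "08" carry a leading zero, which $(( ))
# would read as an invalid octal number; strip the zero first.
day=08
day=${day#0}
echo $((day - 1))   # 7
```

Note that subtracting one from the day number still breaks on the first of the month, so this only answers the arithmetic question, not the date one.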

                      I have always taken bash as a luxury, as the SunOS box
                      here logs you into sh... or maybe it is bash with an
                      option set so it looks like sh. After all, there is
                      also bash on the machine. I believe they keep the sh
                      login to keep us awake and uncomfortable, and so that
                      we don't take things like bash for granted.

                      I have read a rant from one person who had many
                      systems and complained that sh was the only shell that
                      was available on most machines, and that there were
                      subtle differences among shells. Yes, he had machines
                      with neither sh nor bash! You know, csh, ksh, tcsh,
                      etc. Ugh. And me here complaining about webalizer
                      lacking features...


                      > > # what machines are you running webalizer on?
                      >
                      > Linux and solaris. Just an fyi, the primary development machine for
                      > the webalizer is a Sun ultra 30 that dual boots Solaris and Linux,
                      > and is where the main CVS tree is for the code.
                      >

                      I use a Pentium IV with 1024 MB of RAM and two 80 GB hard
                      disks. Mainly it sits there all day long, with 235
                      hits/hour on average (very occasionally it gets 3550
                      hits/hour). That's only 4 hits per minute! Life is
                      hard for this bored machine.

                      If it wasn't for the SETI@home screensaver, the hard
                      disk wouldn't even budge because everything is already
                      in cache. The hard disk would only be needed for
                      writing tiny entries to the database, rotating logs
                      and writing webalizer stats.


                      > > # I don't use incremental mode. Every day I copy the logs (rotated
                      > > # or not) into /tmp, then I grep them and process them. This way
                      > > # any change in the cron script or the config files is
                      > > # propagated to all old stats pages (pity the program only
                      > > # runs one year, in six months more I will have to download the
                      > > # source and see if I can change that). The process is
                      > > # surprisingly fast.
                      >
                      > Sounds like a lot of unnecessary work for a little gain!

                      Yes, it is too much work! However, it is fast and
                      anyway I'm still making changes to the config file
                      and the cron script. If I used incremental mode, then
                      the changes would only apply to the last data
                      collected (that is, the last 24 hours). That doesn't
                      let me see whether the changes had a nice visual output
                      or not, because the data would get fused with the
                      month totals.
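
The daily full reprocessing described above can be sketched roughly as follows, under loud assumptions: the function name, the log file names, the exclusion pattern and the paths are all placeholders, not the actual setup; rotated logs may or may not be gzipped; and the files are passed oldest first so the merged records stay in chronological order.

```shell
#!/bin/sh
# Merge rotated and current logs into one temporary file, dropping
# lines that match an exclusion pattern, ready for a from-scratch run.
# Usage: merge_logs OUTFILE EXCLUDE_REGEX LOG... (oldest log first)
merge_logs() {
    out=$1; exclude=$2; shift 2
    : > "$out"                      # start from an empty output file
    for f in "$@"; do
        # gzip -cdf decompresses .gz files and passes plain files through.
        gzip -cdf "$f" | grep -v -e "$exclude" >> "$out"
    done
}

# Example with hypothetical paths; the webalizer call is left commented
# out so the sketch can be tried without the program installed:
# merge_logs /tmp/all_access.log 'robots\.txt' access_log.2.gz access_log.1 access_log
# webalizer -c /etc/webalizer.conf -o /var/www/stats /tmp/all_access.log
```

Because the merged file is rebuilt every day, a change to the grep pattern or the config shows up in every month of the regenerated reports, which is the point of skipping incremental mode here.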

                      Doing this, of course, if I screw up the config, then all
                      stats for all months and sites will be screwed up too
                      until I mend the config and run the program again or
                      wait for cron to run it. That wouldn't be acceptable
                      on a commercial site, where I would have to process
                      stable incremental stats for the public, and then build
                      the non-incremental ones in a test folder. When the test
                      looked nice, I would put the changes in the
                      "commercial" processing.

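A hedged sketch of that public/test split. The function name and paths are hypothetical, and the options used (-p for incremental mode, -c for the config file, -o for the output directory) are as recalled from the webalizer documentation, so check `webalizer -h` on your version. It also assumes the test config file does not itself switch incremental mode on.

```shell
#!/bin/sh
# Run the stats program in "public" (incremental) or "test" (full) mode.
# WEBALIZER can be overridden for a dry run with a stub command.
run_stats() {
    mode=$1; conf=$2; outdir=$3; log=$4
    case $mode in
        public) "${WEBALIZER:-webalizer}" -p -c "$conf" -o "$outdir" "$log" ;;
        test)   "${WEBALIZER:-webalizer}" -c "$conf" -o "$outdir" "$log" ;;
        *)      echo "usage: run_stats public|test CONF OUTDIR LOG" >&2
                return 2 ;;
    esac
}

# Example (hypothetical paths):
# run_stats public /etc/webalizer.conf      /var/www/stats      /var/log/access_log.1
# run_stats test   /etc/webalizer-test.conf /var/www/stats-test /tmp/all_access.log
```

Keeping the two runs in separate output directories means a broken test config can never damage the pages customers see.
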

                      > And as for
                      > the 'only runs one year', that is incorrect. The program will process
                      > as many days, months or years of logs that you can feed it. I have a
                      > test log file I use all the time that goes back 5 years.
                      >

                      I didn't know that. Then again, when the logfile
                      reaches 2 GB (like the one on the old server before
                      it got deleted) then perhaps I'll have to archive some
                      old stats and some old logs in order to keep them from
                      becoming unmanageable. Meanwhile, that 5 years thing
                      just made me happy :)

                      I will also put the old logs in some place that the
                      backup script doesn't reach. I want my gzipped whole
                      system backups to fit on just one DVD! I will keep
                      separate backups for the gigantic logs.
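
A small sketch of that housekeeping, with hypothetical function names and paths: one helper reports whether a log has grown past a size limit, the other writes a gzipped copy into a directory that the backup job is assumed to skip.

```shell
#!/bin/sh
# Succeed (exit 0) when FILE is at least LIMIT bytes long.
# Usage: log_too_big FILE LIMIT_BYTES
log_too_big() {
    [ -f "$1" ] || return 2
    size=$(wc -c < "$1" | tr -d ' ')   # strip padding some wc versions add
    [ "$size" -ge "$2" ]
}

# Write a gzipped copy of LOG into ARCHIVE_DIR; the original is untouched.
# Usage: archive_log LOG ARCHIVE_DIR
archive_log() {
    mkdir -p "$2" && gzip -c "$1" > "$2/$(basename "$1").gz"
}

# Example (hypothetical paths, 2 GB limit):
# if log_too_big /var/log/access_log.old 2147483648; then
#     archive_log /var/log/access_log.old /archive/logs
# fi
```

The original file is only copied, never deleted, so removing it after checking the archive is left as a deliberate manual step.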



                      > --
                      > Bradford L. Barrett brad@...
                      > A free electron in a sea of neutrons DoD#1750 KD4NAW
                      >
                      > How do you give Microsoft the benefit of the doubt when you
                      > know that if you were to throw it in a room with truth, you'd
                      > risk a matter/anti-matter explosion? -- Nicholas Petreley IDG


                      =====
                      Enric Naval
                      Business Computing student (Informática de Gestión) at the UdL (Lleida)

                    • waldo kitty
                      Message 10 of 11 , Mar 21, 2004
                        Bradford L. Barrett wrote:

                        > Sounds like a lot of unnecessary work for a little gain! And as for
                        > the 'only runs one year', that is incorrect. The program will process
                        > as many days, months or years of logs that you can feed it. I have a
                        > test log file I use all the time that goes back 5 years.

                        i think the point here is that one can only see one year's worth of data in the reports... the initial page only shows one year's
                        worth... some would like to be able to compare this march with last march in the overall view... i actually have all the process
                        reports since dec 1997 for my stuff... i've thought about doing something where i can load the graphic files for a side by side
                        comparison of the months over the years... hummm... sounds like a little something for me to toss together today... manually, of
                        course... at least for now ;)
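
One way the side-by-side idea could be tossed together, as a sketch only. It assumes the monthly pages and graphs follow the usual usage_YYYYMM.html and daily_usage_YYYYMM.png naming and that they all sit in one output directory; check the actual file names in your own reports first. The function name and the example path are hypothetical.

```shell
#!/bin/sh
# Print a small HTML page showing the same month's daily-usage graph
# for several years, side by side, each linked to its monthly report.
# Usage: compare_month MM YEAR...
compare_month() {
    month=$1; shift
    echo "<html><body><table><tr>"
    for year in "$@"; do
        echo "<td><a href=\"usage_${year}${month}.html\"><img src=\"daily_usage_${year}${month}.png\" alt=\"${year}-${month}\"></a></td>"
    done
    echo "</tr></table></body></html>"
}

# Example (hypothetical output directory):
# compare_month 03 2003 2004 > /path/to/stats/compare_03.html
```

The page uses relative links, so it has to be written into the same directory as the reports it refers to.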

                        --
                        _\/
                        (@@) Waldo Kitty, Waldo's Place USA
                        __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
                        _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
                        ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
                        _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net