
How to understand the statistics?

  • Ric
    Message 1 of 16, Mar 15, 2004
      Hello,
      Is there anyone on the list that lives in Brisbane Australia who could go through my Usage Statistics with me and explain fully what it all means, and how I can use these statistics to improve my web site and therefore my business?
      My web site uses webalizer, but I do not really know how to make use of the valuable information it provides.
      If there is someone in Brisbane who would be willing to sit down with me and explain it all I would be very appreciative.
      Thank you,
      Ric
       
      Mr. Ric Willmot
      Professional Speaker & Management Consultant
      Executive Wisdom Consulting Group
      "Improving the effectiveness of individuals & organisations
      by Building Better Business Strategies and
      Creating Inquisitive Corporations by Developing High Performance Thinking!"
      Willbert House 68 Mayfield Road, Carina QLD 4152
      Ph: 07 3395 1050
      Fax: 07 3395 1805
      Mob: 0412 728 113
      Visit the web site to download useful articles & resources free! You can also subscribe for free to
      our electronic newsletter by visiting our web site.

       
    • Bob
      Message 2 of 16, Mar 16, 2004
        Most bots are logged with a Referer of "none", so to filter out
        bots and see how many real human visitors there are, we
        could filter on Referer "none", then add filters on bot names
        or bot Referer strings that get past that test.

        Which is better, ignore or hide?

        /etc/webalizer.conf

        HideReferer "none"

        IgnoreReferer "none"

        -Bob D
      • Bob
        Message 3 of 16, Mar 16, 2004
          Correction: the keywords are spelled "Referrer", with two r's (HideReferrer, IgnoreReferrer).

          Bob wrote:

          > Most bots are logged with a Referer "none" so to filter out
          > bots to see how many real human visitors there are, we
          > could filter on Referer "none", then add filters on bot names
          > or bot Referer strings that get past that test.
          >
          > Which is better, ignore or hide?
          >
          > /etc/webalizer.conf
          >
          > HideReferer "none"
          >
          > IgnoreReferer "none"
          >
          > -Bob D
          >
          >
          >
          > Webalizer homepage: http://www.webalizer.org
          > Webalizer for NT: http://www.medasys-lille.com/webalizer/
          > ------------------------------------------------------------------------
          > *Yahoo! Groups Links*
          >
          > * To visit your group on the web, go to:
          > http://groups.yahoo.com/group/webalizer/
          >
          > * To unsubscribe from this group, send an email to:
          > webalizer-unsubscribe@yahoogroups.com
          > <mailto:webalizer-unsubscribe@yahoogroups.com?subject=Unsubscribe>
          >
          > * Your use of Yahoo! Groups is subject to the Yahoo! Terms of
          > Service <http://docs.yahoo.com/info/terms/>.
          >
          >
        • waldo kitty
          Message 4 of 16, Mar 16, 2004
            Bob wrote:

            > Most bots are logged with a Referer "none" so to filter out
            > bots to see how many real human visitors there are, we
            > could filter on Referer "none", then add filters on bot names
            > or bot Referer strings that get past that test.

            i know what you are saying but i cannot confirm it with my experience... all the search engine bots and many others that visit my
            site carry valid useragent strings... that's how i use webalizer to track the bot activity...

            as for UA string "none"... i can't see that on my apache servers... the UA string carries a '-' in it if the UA is blank...

            > Which is better, ignore or hide?
            >
            > /etc/webalizer.conf
            >
            > HideReferer "none"
            >
            > IgnoreReferer "none"

            just on the general question, if you don't want it counted or taken into effect, i'd say ignore so webalizer will jump right on over
            that record and not store any info about it in memory...

            --
            _\/
            (@@) Waldo Kitty, Waldo's Place USA
            __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
            _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
            ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
            _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
          • Bob
            Message 5 of 16, Mar 17, 2004
              waldo kitty wrote:

              > Bob wrote:
              >
              > > Most bots are logged with a Referer "none" so to filter out
              > > bots to see how many real human visitors there are, we
              > > could filter on Referer "none", then add filters on bot names
              > > or bot Referer strings that get past that test.
              >
              > i know what you are saying but i cannot confirm it with my
              > experience... all the search engine bots and many others that visit my
              > site carry valid useragent strings... that's how i use webalizer to
              > track the bot activity...

              I filter them out by User Agent name if they get past null referrer filter.

              If you want, for apache, I guess you would do it like this--

              # in /etc/webalizer.conf
              IgnoreReferrer "-"

              I am discussing with a friend who thought he
              was getting 4000 hits a day, and after some
              pressing pedantic pestering I finally brought
              him to the realization that most of those are
              bots, and he ought to look into what log
              analyzer options he has at his webhost to
              filter out bots.


              >
              > as for UA string "none"... i can't see that on my apache servers...
              > the UA string carries a '-' in it if the UA is blank...
              >
              > > Which is better, ignore or hide?
              > >
              > > /etc/webalizer.conf
              > >
              > > HideReferrer "none"
              > >
              > > IgnoreReferrer "none"
              >
              > just on the general question, if you don't want it counted or taken
              > into effect, i'd say ignore so webalizer will jump right on over
              > that record and not store any info about it in memory...
              >
              > --
              > _\/
              > (@@) Waldo Kitty, Waldo's Place USA
              > __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
              > _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
              > ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
              > _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
            • Bob
              Message 6 of 16, Mar 17, 2004
                # http://www.scottkriebel.com/webalizer/webalizer.conf.txt

                # This one hides non-referrers ("-" Direct requests)
                HideReferrer Direct Request


                Bob wrote:

                > waldo kitty wrote:
                >
                > > Bob wrote:
                > >
                > > > Most bots are logged with a Referer "none" so to filter out
                > > > bots to see how many real human visitors there are, we
                > > > could filter on Referer "none", then add filters on bot names
                > > > or bot Referer strings that get past that test.
                > >
                > > i know what you are saying but i cannot confirm it with my
                > > experience... all the search engine bots and many others that visit my
                > > site carry valid useragent strings... that's how i use webalizer to
                > > track the bot activity...
                >
                > I filter them out by User Agent name if they get past null referrer
                > filter.
                >
                > If you want, for apache, I guess you would do it like this--
                >
                > # in /etc/webalizer.conf
                > IgnoreReferrer "-"
                >
                > I am discussing with a friend who thought he
                > was getting 4000 hits a day, and after some
                > pressing pedantic pestering I finally brought
                > him to the realization that most of those are
                > bots, and he ought to look into what log
                > analyzer options he has at his webhost to
                > filter out bots.
                >
                >
                > >
                > > as for UA string "none"... i can't see that on my apache servers...
                > > the UA string carries a '-' in it if the UA is blank...
                > >
                > > > Which is better, ignore or hide?
                > > >
                > > > /etc/webalizer.conf
                > > >
                > > > HideReferrer "none"
                > > >
                > > > IgnoreReferrer "none"
                > >
                > > just on the general question, if you don't want it counted or taken
                > > into effect, i'd say ignore so webalizer will jump right on over
                > > that record and not store any info about it in memory...
                > >
                > > --
                > > _\/
                > > (@@) Waldo Kitty, Waldo's Place USA
                > > __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
                > > _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
                > > ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
                > > _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
              • waldo kitty
                Message 7 of 16, Mar 17, 2004
                  Bob wrote:

                  > waldo kitty wrote:
                  >>i know what you are saying but i cannot confirm it with my
                  >>experience... all the search engine bots and many others that visit my
                  >>site carry valid useragent strings... that's how i use webalizer to
                  >>track the bot activity...
                  >
                  >
                  > I filter them out by User Agent name if they get past null referrer filter.
                  >
                  > If you want, for apache, I guess you would do it like this--
                  >
                  > # in /etc/webalizer.conf
                  > IgnoreReferrer "-"
                  >
                  > I am discussing with a friend who thought he
                  > was getting 4000 hits a day, and after some
                  > pressing pedantic pestering I finally brought
                  > him to the realization that most of those are
                  > bots, and he ought to look into what log
                  > analyzer options he has at his webhost to
                  > filter out bots.

                  i can see that... just make sure that the *referer* is understood to be where the user clicked to come to your site and also
                  understand that there is no referer if the user just manually types the URL into their browser... to eliminate search engine bots, i
                  would use the domain names and ignore them instead... ignoring the UA could also be done...

                  maybe i'm being a bit pedantic on this... it just seems that you're using referer for useragent and they are not the same...

                  --
                  _\/
                  (@@) Waldo Kitty, Waldo's Place USA
                  __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
                  _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
                  ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
                  _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
                • waldo kitty
                  Message 8 of 16, Mar 17, 2004
                    Bob wrote:

                    > # http://www.scottkriebel.com/webalizer/webalizer.conf.txt
                    >
                    > # This one hides non-referrers ("-" Direct requests)
                    > HideReferrer Direct Request

                    right... that includes search engine bots as well as users that manually type the URL into the destination field...

                    --
                    _\/
                    (@@) Waldo Kitty, Waldo's Place USA
                    __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
                    _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
                    ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
                    _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
                  • Bob
                    Message 9 of 16, Mar 20, 2004
                      waldo kitty wrote:

                      > Bob wrote:
                      >
                      > > # http://www.scottkriebel.com/webalizer/webalizer.conf.txt
                      > >
                      > > # This one hides non-referrers ("-" Direct requests)
                      > > HideReferrer Direct Request
                      >
                      > right... that includes search engine bots as well as users that
                      > manually type the URL into the destination field...

                      How about pasting the url instead of typing? I do that all
                      the time since thunderbird is not set up to send url's in
                      email to firefox. Setting up is pretty easy I think.

                      -bob
                    • Bob
                      Message 10 of 16, Mar 20, 2004
                        waldo kitty wrote:

                        > Bob wrote:
                        >
                        > > waldo kitty wrote:
                        > >>i know what you are saying but i cannot confirm it with my
                        > >>experience... all the search engine bots and many others that visit my
                        > >>site carry valid useragent strings... that's how i use webalizer to
                        > >>track the bot activity...
                        > >
                        > >
                        > > I filter them out by User Agent name if they get past null referrer filter.
                        > >
                        > > If you want, for apache, I guess you would do it like this--
                        > >
                        > > # in /etc/webalizer.conf
                        > > IgnoreReferrer "-"
                        > >
                        > > I am discussing with a friend who thought he
                        > > was getting 4000 hits a day, and after some
                        > > pressing pedantic pestering I finally brought
                        > > him to the realization that most of those are
                        > > bots, and he ought to look into what log
                        > > analyzer options he has at his webhost to
                        > > filter out bots.
                        >
                        > i can see that... just make sure that the *referer* is understood to
                        > be where the user clicked to come to your site and also
                        > understand that there is no referer if the user just manually types
                        > the URL into their browser... to eliminate search engine bots, i
                        > would use the domain names and ignore them instead... ignoring the UA
                        > could also be done...
                        >
                        > maybe i'm being a bit pedantic on this... it just seems that you're
                        > using referer for useragent and they are not the same...

                        That was not me. Also I have no problem filtering out
                        "legitimate" User Agents, I don't see the point there.
                        I just filter out "Slurp" and such. If I work up a good
                        list of User Agent names that are bots, some will
                        sneak through, but if I filter on null referrers those
                        individuals typing(or pasting?) urls will get filtered.
                        Neither way is perfect.

                        -Bob
                      • waldo kitty
                        Message 11 of 16, Mar 20, 2004
                          Bob wrote:
                          > waldo kitty wrote:
                          >
                          >
                          >>Bob wrote:
                          >>
                          >>
                          >>># http://www.scottkriebel.com/webalizer/webalizer.conf.txt
                          >>>
                          >>># This one hides non-referrers ("-" Direct requests)
                          >>>HideReferrer Direct Request
                          >>
                          >>right... that includes search engine bots as well as users that
                          >>manually type the URL into the destination field...
                          >
                          >
                          > How about pasting the url instead of typing? I do that all
                          > the time since thunderbird is not set up to send url's in
                          > email to firefox. Setting up is pretty easy I think.

                          same difference... if the url is manually entered in that field, then there is no referrer sent... i do it all the time when i want
                          to go from one site to another and not tell the destination that i found their link on another site...

                          --
                          _\/
                          (@@) Waldo Kitty, Waldo's Place USA
                          __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
                          _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
                          ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
                          _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
                        • waldo kitty
                          Message 12 of 16, Mar 20, 2004
                            Bob wrote:

                            > waldo kitty wrote:
                            >
                            >
                            >>Bob wrote:
                            >>>I am discussing with a friend who thought he
                            >>>was getting 4000 hits a day, and after some
                            >>>pressing pedantic pestering I finally brought
                            >>>him to the realization that most of those are
                            >>>bots, and he ought to look into what log
                            >>>analyzer options he has at his webhost to
                            >>>filter out bots.
                            >>
                            >>i can see that... just make sure that the *referer* is understood to
                            >>be where the user clicked to come to your site and also
                            >>understand that there is no referer if the user just manually types
                            >>the URL into their browser... to eliminate search engine bots, i
                            >>would use the domain names and ignore them instead... ignoring the UA
                            >>could also be done...
                            >>
                            >>maybe i'm being a bit pedantic on this... it just seems that you're
                            >>using referer for useragent and they are not the same...
                            >
                            >
                            > That was not me. Also I have no problem filtering out
                            > "legitimate" User Agents, I don't see the point there.
                            > I just filter out "Slurp" and such. If I work up a good
                            > list of User Agent names that are bots, some will
                            > sneak through,

                            they can be caught and added to the list over time... the only ones you'd really miss would be those that fly UAs appearing to be
                            regular users... email harvesters are the first of this genre that come to mind... if there are any search engine bots that do not
                            properly ID themselves, they should be able to be spotted by IP number and blocked that way... a firm note should also be sent to
                            their humans about them properly IDing themselves if they are a legitimate spider bot...

                            > but if I filter on null referrers those
                            > individuals typing(or pasting?) urls will get filtered.
                            > Neither way is perfect.

                            the main thing is to look thru your logs and see what patterns you see... for example...

                            i just grabbed a random block of lines from this month's log... it came out at some 25000+ lines...
                            then i extracted all entries with "-" as the referrer... that gave me some 5500 lines...
                            then i knocked everything with bot in it out to eliminate googlebot and a few others... down to 4300 lines...
                            then filtering out yahoo gave me about 3800 lines...
                            filtering ia_archiver drops me to 3700 lines...
                            filtering out overture's webcrawler drops me to 3500 lines...
                            and finally knocking out teoma takes me to 3100 lines...

                            so, out of 5500 with a blank referrer, 2400 were search engines and 3100 were legitimate visitors... yeah, there may still be a few
                            more bots in there but nothing obvious or very prolific... i'd have to use a finer toothed comb than my eyes on a rough run for this
                            exercise to filter out more... mind you, this was also just a "small" block of my log and not the entire log... the whole point of
                            this exercise being that if i were to just block on blank referrers, i'd miss counting some 3100 non-bot hits... more than 50%...
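
The manual pass described above can be sketched as a short script. This is only a rough illustration, assuming the log is in Apache's combined log format (referrer and user agent as the last two quoted fields). The function name, the token list, and the sample lines are made up for the example and are not taken from the log discussed in this message.

```python
import re

# Combined log format ends with: "request" status bytes "referrer" "user-agent"
LOG_RE = re.compile(
    r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"\s*$'
)

# Hypothetical list of user-agent substrings treated as bots (case-insensitive).
BOT_TOKENS = ("bot", "slurp", "ia_archiver", "teoma")


def split_blank_referrers(lines, bot_tokens=BOT_TOKENS):
    """Return (blank_total, bot_hits, other_hits) for records whose referrer is '-'."""
    blank_total = 0
    bot_hits = 0
    for line in lines:
        match = LOG_RE.search(line)
        if match is None or match.group("referrer") != "-":
            continue  # unparsable line, or the request carried a referrer
        blank_total += 1
        agent = match.group("agent").lower()
        if any(token in agent for token in bot_tokens):
            bot_hits += 1
    return blank_total, bot_hits, blank_total - bot_hits


# Invented sample records, for illustration only.
SAMPLE = [
    '10.0.0.1 - - [20/Mar/2004:10:00:00 -0500] "GET / HTTP/1.1" 200 512 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '10.0.0.2 - - [20/Mar/2004:10:00:01 -0500] "GET / HTTP/1.0" 200 512 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '10.0.0.3 - - [20/Mar/2004:10:00:02 -0500] "GET /a.html HTTP/1.1" 200 99 "http://example.com/" "Mozilla/5.0"',
    '10.0.0.4 - - [20/Mar/2004:10:00:03 -0500] "GET / HTTP/1.0" 200 512 "-" "ia_archiver"',
]

if __name__ == "__main__":
    print(split_blank_referrers(SAMPLE))
```

On the figures quoted above, 3100 of the 5500 blank-referrer lines were not obvious bots, which is roughly 56%, so filtering on a blank referrer alone would discard more human traffic than bot traffic in that sample.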



                            --
                            _\/
                            (@@) Waldo Kitty, Waldo's Place USA
                            __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
                            _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
                            ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
                            _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
                          • Bradford L. Barrett
                            Message 13 of 16, Mar 20, 2004
                              Just thought I'd throw in my 2 cents' worth on this subject, as I get
                              a lot of e-mails about it. First off, a hit is a hit, regardless
                              of its origin, be it a human or a program. So for the 'friend'
                              who thought he was getting 4000 hits a day, he really was, even if
                              some of those were non-human in origin. The web server doesn't
                              care at all; when it sees a request, it has to process it.

                              For myself, I don't see why people would worry about such things.
                              I want to know how hard my server is being pounded, and automated
                              processes hitting it count in that metric. That is why I wrote
                              the program, so I can gauge how hard my box is being hit, and to
                              aid in capacity planning. Ignoring traffic, for any reason, will
                              cause the numbers to be inaccurate and give a false reading when
                              trying to determine usage and growth patterns. In any event, if
                              someone feels the need to filter out bots (or any other type of
                              request), it is always best to use the Hide* keywords, so the
                              totals remain accurate, but the offending request type is kept
                              from being shown in the 'Top*' tables.
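
To make the difference concrete, here is a minimal Python sketch of the two behaviours described above. It is not Webalizer's code; the function name `summarize`, the sample agent strings, and the case-insensitive substring matching are all assumptions made for illustration. "Hidden" agents still count toward the total but are left out of the top table, while "ignored" agents are dropped before counting, which is what makes the totals shrink.

```python
from collections import Counter

def summarize(hits, hidden=(), ignored=()):
    """Return (total_hits, top_agents) for a list of user-agent strings.

    hidden:  substrings whose agents still count toward the total but are
             left out of the top table (the Hide* idea described above).
    ignored: substrings whose hits are dropped before counting, so the
             total itself shrinks (shown only for contrast).
    """
    def matches(agent, patterns):
        # Assumption: case-insensitive substring match, for illustration only.
        return any(p.lower() in agent.lower() for p in patterns)

    counted = [a for a in hits if not matches(a, ignored)]
    total = len(counted)
    table = Counter(a for a in counted if not matches(a, hidden))
    return total, table.most_common()

# Hypothetical sample: two browser hits and three bot hits.
hits = ["Mozilla/4.0", "Slurp", "Mozilla/4.0", "Googlebot", "Slurp"]
bots = ["slurp", "googlebot"]

print(summarize(hits, hidden=bots))   # total still reflects every request
print(summarize(hits, ignored=bots))  # total now excludes the bots
```

With hiding, the server-load figure stays honest and only the displayed table changes; with ignoring, both change.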

                              Cheers,
                              Brad

                              --

                              [...]
                              > > > I am discussing with a friend who thought he
                              > > > was getting 4000 hits a day, and after some
                              > > > pressing pedantic pestering I finally brought
                              > > > him to the realization that most of those are
                              > > > bots, and he ought to look into what log
                              > > > analyzer options he has at his webhost to
                              > > > filter out bots.
                              > >
                              > > i can see that... just make sure that the *referer* is understood to
                              > > be where the user clicked to come to your site and also
                              > > understand that there is no referer if the user just manually types
                              > > the URL into their browser... to eliminate search engine bots, i
                              > > would use the domain names and ignore them instead... ignoring the UA
                              > > could also be done...
                              > >
                              > > maybe i'm being a bit pedantic on this... it just seems that you're
                              > > using referer for useragent and they are not the same...
                              >
                              > That was not me. Also I have no problem filtering out
                              > "legitimate" User Agents, I don't see the point there.
                              > I just filter out "Slurp" and such. If I work up a good
                              > list of User Agent names that are bots, some will
                              > sneak through, but if I filter on null referrers those
                              > individuals typing(or pasting?) urls will get filtered.
                              > Neither way is perfect.
                              >
                              > -Bob
                              >
                              >
                              > Webalizer homepage: http://www.webalizer.org
                              > Webalizer for NT: http://www.medasys-lille.com/webalizer/
                              --
                              Bradford L. Barrett brad@...
                              A free electron in a sea of neutrons DoD#1750 KD4NAW

                              The only thing Micro$oft has done for society, is make people
                              believe that computers are inherently unreliable.
                            • waldo kitty
                              Message 14 of 16 , Mar 20, 2004
                                Bradford L. Barrett wrote:

                                > Just thought I'd throw in my 2cents worth on this subject, as I get
                                > a lot of e-mails about it.. First off, a hit is a hit, regardless
                                > of its origin, be it a human or a program. So for the 'friend'
                                > who thought he was getting 4000 hits a day, he really was, even if
                                > some of those were non-human in origin. The web server doesn't
                                > care at all, when it sees a request, it has to process it.

                                i agree completely with you on this... i think the thing is that they want to see how many humans are hitting their server(s)...
                                since they do this stuff for humans, if the bots are the only visitors and there are no humans, then something is wrong... either
                                they need to do some (heavy?) SEO or they need to rethink their layout or even their product line... in any case, their target isn't
                                being reached...

                                > For myself, I don't see why people would worry about such things.
                                > I want to know how hard my server is being pounded, and automated
                                > processes hitting it count in that metric. That is why I wrote
                                > the program, so I can gauge how hard my box is being hit, and to
                                > aid in capacity planning. Ignoring traffic, for any reason, will
                                > cause the numbers to be inaccurate and give a false reading when
                                > trying to determine usage and growth patterns. In any event, if
                                > someone feels the need to filter out bots (or any other type of
                                > request), it is always best to use the Hide* keywords, so the
                                > totals remain accurate, but the offending request type is kept
                                > from being shown in the 'Top*' tables.

                                agreed again... i'm kinda the opposite of those wanting to ignore or hide the bots... i like to know where my visitors are
                                coming from... if that means that i can get their town from their domain, that's great! it's like getting a postage stamp on a letter
                                from overseas... i like knowing that the majority of my french visitors are arriving via google's image search and what images they
                                are after... i like knowing that certain educational and governmental groups are interested in the floods and flood pictures we've
                                had in this area... i even like knowing where certain folk are that continually beat on a nonexistent formmail.pl script and its
                                variant names... for me, the details are nice... i just wish i could get more accuracy with the IP numbers without it costing me
                                even more $$$ than i already spend doing this stuff :)

                                it all comes down to numbers... that's for sure... which numbers do you want to see? seems that old adage about making statistics
                                say what you want them to say is at play ;) i've even started considering processing my logs with another config to try to
                                eliminate the bots and see what human traffic my site(s) are truly getting... it would be an interesting project...
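
A rough Python sketch of what such a second pass might look like, assuming the logs are in the combined log format (referrer and user agent as the last two quoted fields). The bot substrings and sample lines are invented for illustration, and the `null_referrer_is_bot` switch is there to show the trade-off discussed earlier in the thread: filtering on user agent lets unknown bots through, while filtering on empty referrers also throws out people who typed or pasted the URL.

```python
import re

# Combined log format: host ident user [time] "request" status size "referrer" "agent"
LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative list only; a real list would need regular upkeep.
BOT_AGENT_SUBSTRINGS = ("slurp", "googlebot", "msnbot")

def classify(line, null_referrer_is_bot=False):
    """Label one log line as 'bot', 'human', or 'unparsed'."""
    m = LINE.match(line)
    if m is None:
        return "unparsed"
    agent = m.group("agent").lower()
    if any(s in agent for s in BOT_AGENT_SUBSTRINGS):
        return "bot"
    if null_referrer_is_bot and m.group("referrer") in ("", "-"):
        return "bot"
    return "human"

sample = [
    '10.0.0.1 - - [20/Mar/2004:10:00:00 +0000] "GET / HTTP/1.0" 200 512 '
    '"http://www.google.com/search?q=floods" "Mozilla/4.0"',
    '10.0.0.2 - - [20/Mar/2004:10:00:01 +0000] "GET / HTTP/1.0" 200 512 '
    '"-" "Mozilla/5.0 (compatible; Yahoo! Slurp)"',
    '10.0.0.3 - - [20/Mar/2004:10:00:02 +0000] "GET /pics.html HTTP/1.0" 200 2048 '
    '"-" "Mozilla/4.0"',
]

by_agent = [classify(line) for line in sample]
by_agent_and_referrer = [classify(line, null_referrer_is_bot=True) for line in sample]
print(by_agent)
print(by_agent_and_referrer)
```

The third sample line is a browser with no referrer: the agent-only pass counts it as human, the stricter pass discards it.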

                                BTW: thanks for a great tool!


                                --
                                _\/
                                (@@) Waldo Kitty, Waldo's Place USA
                                __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
                                _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
                                ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
                                _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
                              • Bob
                                Message 15 of 16 , Mar 28, 2004
                                  waldo kitty wrote:

                                  > Bradford L. Barrett wrote:
                                  >
                                  > > Just thought I'd throw in my 2cents worth on this subject, as I get
                                  > > a lot of e-mails about it.. First off, a hit is a hit, regardless
                                  > > of its origin, be it a human or a program. So for the 'friend'
                                  > > who thought he was getting 4000 hits a day, he really was, even if
                                  > > some of those were non-human in origin. The web server doesn't
                                  > > care at all, when it sees a request, it has to process it.
                                  >
                                  > i agree completely with you on this... i think the thing is that they
                                  > want to see how many humans are hitting their server(s)...
                                  > since they do this stuff for humans, if the bots are the only visitors
                                  > and there are no humans, then something is wrong... either
                                  > they need to do some (heavy?) SEO or they need to rethink their layout
                                  > or even their product line... in any case, their target isn't
                                  > being reached...
                                  >
                                  > > For myself, I don't see why people would worry about such things.
                                  > > I want to know how hard my server is being pounded

                                  "I don't see why people would worry about people wanting
                                  to know" BOTH how many page requests their server is
                                  serving and how many people are clicking on a particular
                                  ad or content page, so as to inform not only the server
                                  administrator but A WHOLE CHAIN or team of people,
                                  including clients AND THEIR DESIGN EMPLOYEES
                                  AND CONTRACTORS and authors... I can't see why
                                  someone would worry or want to close off the feedback?!

                                  It's not as if knowing how many real human visitors clicked
                                  on which pages is mutually exclusive with knowing what
                                  the server load is.

                                  I just wrote a script to access the webserver log without
                                  webalizer and make a crude but effective html file telling
                                  each advertising client what their hits are and for which
                                  of several products they advertise. I don't see why anyone
                                  should "worry" or insist that I delete another type of log
                                  now, as if they're mutually exclusive.
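
For anyone curious what that kind of script can look like, here is a minimal Python sketch of the idea, not the script itself. It assumes, purely for illustration, that ad pages live under a hypothetical `/ads/<client>/<product>` path and that the request paths have already been pulled out of the log; the function names are made up.

```python
from collections import Counter, defaultdict
from html import escape

def count_ad_hits(paths):
    """Count hits per client and product for paths shaped /ads/<client>/<product>."""
    per_client = defaultdict(Counter)
    for path in paths:
        parts = path.strip("/").split("/")
        if len(parts) == 3 and parts[0] == "ads":
            client, product = parts[1], parts[2]
            per_client[client][product] += 1
    return per_client

def render_html(client, counts):
    """Build a crude one-table HTML page for a single client."""
    rows = "".join(
        f"<tr><td>{escape(product)}</td><td>{n}</td></tr>"
        for product, n in sorted(counts.items())
    )
    return (f"<html><body><h1>Hits for {escape(client)}</h1>"
            f"<table>{rows}</table></body></html>")

# Hypothetical request paths taken from a log.
paths = ["/ads/acme/widgets", "/ads/acme/gadgets", "/ads/acme/widgets",
         "/index.html", "/ads/bolt/nuts"]
report = count_ad_hits(paths)
page = render_html("acme", report["acme"])
print(page)
```

Paths outside `/ads/` are simply skipped, so the same log can still be fed to webalizer untouched for the server-load view.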

                                  Actually fnord, my webserver (in the process of being replaced
                                  by gatling), does not supply a time stamp, or else it's in a
                                  milliseconds-since-1980 format I don't care to interact
                                  with, so my script, which harvests fnord logs and translates
                                  them into a format webalizer can understand, is actually
                                  inserting a very coarse-grained time stamp at the interval
                                  the script runs. If I want to know how many hits per second
                                  are loading the webserver from a given source, I have to
                                  use NETFILTER logging for port 80. That would show
                                  me a DoS or DDoS attack and its source or sources. That's
                                  not a webalizer thing.
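
The translation step could be sketched roughly as below. The input layout here (`host method path status bytes`, whitespace separated) is an invented stand-in, not fnord's real log format, and the `HTTP/1.0` protocol string is likewise an assumption since the input carries no version. The output is the Common Log Format, with the harvest time stamped onto every line in the batch.

```python
from datetime import datetime, timezone

def to_common_log(line, stamp):
    """Turn 'host method path status bytes' into a Common Log Format line.

    stamp is the time the harvesting script ran; every line harvested in
    the same run gets the same (coarse) time stamp.
    """
    host, method, path, status, size = line.split()
    when = stamp.strftime("%d/%b/%Y:%H:%M:%S %z")
    # Assumption: the protocol version is not logged, so HTTP/1.0 is filled in.
    return f'{host} - - [{when}] "{method} {path} HTTP/1.0" {status} {size}'

# Hypothetical run time and input line.
run_time = datetime(2004, 3, 28, 12, 0, 0, tzinfo=timezone.utc)
converted = to_common_log("10.0.0.5 GET /index.html 200 1043", run_time)
print(converted)
```

Because every hit in a batch shares one time stamp, hourly and daily totals stay meaningful but anything finer than the harvest interval is lost, which is why per-second rates have to come from somewhere else.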

                                  -Bob
                                • Bob
                                  Message 16 of 16 , Mar 28, 2004
                                    I have a friend who said his organization's site was
                                    receiving 5000 hits a day. It took a while but I whittled
                                    that down to less than 800 human visitors, without
                                    having access myself to the workings or logs of
                                    his host's log analyzer. The 800 probably does not
                                    include script kiddies on a fishing expedition, since
                                    404s are filtered by any sane log analyzer, but I'm
                                    sure there are more indexing bots puffing up my
                                    friend's self-esteem, and I'm helping him to face
                                    reality.

                                    Beyond that, there is much more reason to care about
                                    shaping content to cater to visitor interests and successful
                                    search engine keywords and real human visits than to
                                    worry if a few thousand hits a day might necessitate
                                    a move to greater bandwidth, though as I said, there
                                    is no zero-sum mutual exclusivity--the more feedback
                                    the better, and the more tools the better! In addition
                                    to webalizer I have sed scripts filtering raw logs
                                    which I look at in the utility "less", and those are
                                    hard enough to use anyway without null referrers
                                    causing info glut--or I'll look at null referrers only
                                    to see what bots are indexing!

                                    I appreciate someone saying webalizer doesn't use
                                    regular expressions for User Agents or Referrers.
                                    I'll change my webalizer.conf accordingly, to just
                                    use a sub-string.
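
As a rough illustration of why a regular expression does nothing useful there, here is a small Python sketch of substring-style matching with an optional leading or trailing `*`. This is my reading of how the matching is described, not Webalizer's actual code; the function name is made up, and case-sensitive comparison is an assumption.

```python
def matches(pattern, value):
    """Match a pattern against a string without any regex handling.

    A leading '*' anchors the rest of the pattern to the end of the value,
    a trailing '*' anchors it to the start, and a pattern with no '*'
    matches anywhere in the value. Regex characters are taken literally.
    """
    if pattern.startswith("*"):
        return value.endswith(pattern[1:])
    if pattern.endswith("*"):
        return value.startswith(pattern[:-1])
    return pattern in value

agent = "Mozilla/5.0 (compatible; Yahoo! Slurp)"
print(matches("Slurp", agent))        # plain substring
print(matches("Mozilla/*", agent))    # trailing wildcard: prefix match
print(matches("^Slurp$", "Slurp"))    # regex syntax is just literal text
```

So a plain `Slurp` catches the agent wherever the word appears, while anything written as a regular expression would only match an agent that literally contains those characters.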

                                    -Bob D