
Re: [webalizer] Visit stats too low

  • William L. Thomson Jr.
    Message 1 of 10 , Aug 25, 2005
      On Thu, 2005-08-25 at 13:28 -0700, Colonel Angel wrote:
      > HI!
      >
      > Well, what I can tell you is that a hit is not a
      > visit,

      To expand: a hit is essentially a request. So if you have a page with
      4 images on it, then when someone requests that page it counts as
      5 hits. If the same person re-requests that page, you get 5 more hits,
      and so on.

      Hits alone are not a very accurate way to measure web site traffic.
      They are more a reflection of the number of requests made to the web
      server.

      --
      Sincerely,
      William L. Thomson Jr.
      Support Group
      Obsidian-Studios, Inc.
      http://www.obsidian-studios.com
    • Bradford L. Barrett
      Message 2 of 10 , Aug 25, 2005
        > There is no extension. I have set PageType in webalizer.conf to:
        >
        > PageType *
        > PageType .php

        URLs without extensions are automatically counted as a page. Your
        config option "PageType *" is invalid and should be removed. If you
        are using incremental mode, be sure to reset your stats and re-process
        after changing the config file.

        --
        Bradford L. Barrett brad@...
        A free electron in a sea of neutrons DoD#1750 KD4NAW

        The only thing Micro$oft has done for society, is make people
        believe that computers are inherently unreliable.
      • Enric Naval
        Message 3 of 10 , Aug 26, 2005
          --- "William L. Thomson Jr."
          <yahoogroups@...> wrote:

          > On Thu, 2005-08-25 at 13:28 -0700, Colonel Angel
          > wrote:
          > > HI!
          > >
          > > Well, what I can tell you is that a hit is not a
          > > visit,
          >
          > To expand a hit tends to be a request. So if you
          > have a page with 4
          > images on it. When someone requests that page, it
          > will equal 5 hits.
          > If the same person re-requests that page, you get 5
          > more hits, and so
          > on.
          >
          > Not a very accurate way to measure web site traffic
          > or etc. If solely
          > going by hits. It's more a reflection of the amount
          > of requests, or hits
          > to the web server or etc.

          (I'm sorry, after re-reading your email, I see that
          you already have the number of requests per page in
          mind when weighting the number of requests)


          It is also a measure of having too many images in your
          pages, like using images for your navigation bar
          instead of text links.

          I have a page on my server that has 29 images, 8
          external style sheets and one external javascript
          file. Including the HTML itself, that's 29+8+1+1=39
          requests for one visit, so I have to be careful when
          comparing the request total to other websites that
          generate only 5 or 6 elements per page, or when every
          page in a website has a different number of elements.


          I'm getting an average of 6.7 hits per visit for a
          website that has 39 hits per webpage. This is a very
          high average, mind you, because most people will come
          through a search engine, get a single file, and then
          go away. There are also people linking directly to
          .pdfs buried somewhere in the website, and the
          occasional image stealer, who shows YOUR images in
          THEIR pages, creating a hit on your server for every
          visit to THEIR pages.


          Also notice the following phenomena:


          - visitor opens a page on your server:
            - 1 visit
            - 39 requests (Code 200 - OK) for HTML, images, css
              and javascript


          - visitor opens a page on your server AFTER having
            opened a very similar page:
            - 0 visits
            - 1 request (Code 200 - OK) for HTML
            - 3 requests (Code 200 - OK) for new images
            - 38 requests (Code 304 - Not Modified) for images,
              css and javascript that were already in the other page



          - visitor from a search engine gets a .doc file:
            - 1 visit
            - 1 request (Code 200 - OK) for the .doc


          - visitor from a search engine gets a very big .avi
            file and requests to get it in many small pieces:
            - 1 visit
            - 80 requests (Code 206 - Partial Content), that is,
              1 request for every small piece of the .avi
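
The scenarios above can be checked against a real log by tallying hits per HTTP status code. Here is a minimal sketch in Python, assuming a standard Apache combined-format log in which the status code directly follows the quoted request line. The function name `hits_by_status` and the sample lines are made up for illustration and do not come from any real log.

```python
import re
from collections import Counter

# The status code is the 3-digit field right after the quoted request line.
STATUS_RE = re.compile(r'"[^"]*" (\d{3}) ')

def hits_by_status(lines):
    """Count hits per HTTP status code; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Hypothetical sample lines, invented for this example.
sample = [
    '10.0.0.1 - - [01/Aug/2005:10:00:00 +0200] "GET /page.html HTTP/1.1" 200 6966 "-" "Mozilla/4.0"',
    '10.0.0.1 - - [01/Aug/2005:10:00:01 +0200] "GET /logo.png HTTP/1.1" 304 - "-" "Mozilla/4.0"',
    '10.0.0.1 - - [01/Aug/2005:10:00:02 +0200] "GET /movie.avi HTTP/1.1" 206 65536 "-" "Mozilla/4.0"',
    '10.0.0.1 - - [01/Aug/2005:10:00:03 +0200] "GET /movie.avi HTTP/1.1" 206 65536 "-" "Mozilla/4.0"',
]

for status, count in sorted(hits_by_status(sample).items()):
    print(status, count)
```

A large share of 304 or 206 hits would suggest that the raw hit total overstates how many pages people actually looked at.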






          >
          > --
          > Sincerely,
          > William L. Thomson Jr.
          > Support Group
          > Obsidian-Studios, Inc.
          > http://www.obsidian-studios.com
          >
          >


          Enric Naval
          Student of Management Informatics at the UdL (Lleida)
          GRIHO webalizer.conf
          http://griho.udl.es/webalizer/webalizer.conf.txt

        • Enric Naval
          Message 4 of 10 , Aug 26, 2005
            Look at the Sites Top List. It only lists 3 different
            sites!

            Maybe your rewrite rules are preventing your webserver
            from writing the correct IPs of your visitors in the
            logs. Instead, it is writing your own host name as the
            visitor's IP.

            So, when I visit from telefonica.es, the rewrite
            causes the webserver to write "fr.wikinations.be" in
            the logfile instead of "telefonica.es". When I visit
            from udl.es, the webserver also writes
            "fr.wikinations.be". When I visit from ya.com, etc.

            (Can you post those rewrite rules here? I would like
            to see them.)

            In your logfile you must have something like this:


            fr.wikinations.be - - [01/Aug/ (...) "Bloglines/2.0
            (http://www.bloglines.com; 1 subscriber)"
            fr.wikinations.be - - [01/Aug/ (...)
            "Googlebot-Image/1.0"
            fr.wikinations.be - - [01/Aug/ (...)
            "Googlebot-Image/1.0"
            fr.wikinations.be - - [01/Aug/ (...) "Mozilla/5.0
            (compatible; Yahoo! Slurp; (...)

            According to this, Google, Yahoo and bloglines.com
            must have merged, because their bots come all from the
            same IP.....


            fr.wikinations.be [01/Aug/2005:21:02:06 +0200](...)
            "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;
            SV1; .NET CLR 1.1.4322)"

            fr.wikinations.be [01/Aug/2005:21:02:07 +0200](...)
            "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; es-es) Apple
            WebKit/412.6 (KHTML, like Gecko) Safari/412.2"


            Also, you would be using both Windows and Macintosh...
            at the same time.


            For example, for July 2005 you have 109419 hits from
            fr.wikinations.be, which would mean that you are
            visiting yourself about 2.5 times per minute on average
            (109419 hits over the 44640 minutes in July) :)

            Those are probably visits from comcast.net,
            telefonica.es, and a thousand different places, but
            your rewrite rules are causing the logfiles to claim
            that they all come from fr.wikinations.be.

            There are so few visits because a new visit only
            starts when one of these conditions becomes true:

            a) more than 30 minutes pass without a hit from that
            IP

            b) a different User Agent is used (I'M NOT SURE
            WEBALIZER IS USING THIS CONDITION)


            Condition "a" is probably almost never becoming true,
            because all hits appear to come from the same IP, so
            there is a continuous stream of hits that appears to
            come from that IP (remember that you have about 2.5
            hits per minute for fr.wikinations.be).

            So a new visit is only started when, for example,
            someone uses IE 5.0 instead of IE 6.0, and then
            another won't start until someone visits with something
            other than IE 5.0. Because about 50% of people will use
            IE 6.0 with exactly the same User Agent, there is a
            chance that you have 5 visitors in a row, all sharing
            the same visit because they use the same User Agent.


            If all visitors used the same User Agent, and you had
            a hit at least every 30 minutes, you would only get 3
            visits per month, because you have only 3 different
            sites!



            --- xirzon <moeller@...> wrote:

            > Hi,
            >
            > I'm the webmaster of:
            > http://fr.wikinations.org/
            >
            > There's a webalizer V2.01-10 generating daily stats
            > at:
            > http://fr.wikinations.be/stats/
            >
            > My problem is that the "Visits" statistics for every
            > day are much too
            > low. For example, on August 20, I am counting 274
            > unique IP addresses
            > and 1983 hits. Webalizer reports the number of hits
            > correctly, but
            > only reports 22 visits.
            >
            > I'm not sure why it is doing this. One theory is
            > that it's not
            > counting many of these hits as visits because of my
            > rewrite rules. For
            > example, a typical URL on the site will look like
            > this:
            >
            > http://fr.wikinations.be/Notaire
            >
            > There is no extension. I have set PageType in
            > webalizer.conf to:
            >
            > PageType *
            > PageType .php
            >
            > But this doesn't seem to do the trick. I also note
            > that the "Page"
            > counts seem accurate (i.e. slightly lower than the
            > number of hits)
            >
            > Any ideas how I could get an accurate visit count?
            >
            > Best,
            >
            > Erik
            >
            >
            >


            Enric Naval
            Student of Management Informatics at the UdL (Lleida)
            GRIHO webalizer.conf
            http://griho.udl.es/webalizer/webalizer.conf.txt

          • Bradford L. Barrett
            Message 5 of 10 , Aug 26, 2005
              > There so many little visits because a new visit only
              > starts when one of these conditions become true:
              >
              > a) more than 30 minutes pass without a hit from that
              > IP
              >
              > b) a different User Agent is used (I'M NOT SURE
              > WEBALIZER IS USING THIS CONDITION)

              The Webalizer doesn't use the user agent for visit calculations.. only the
              amount of time (or first time) between requests from the same IP address..
              and the amount of time is configurable, but defaults to 30 minutes as
              stated above. Also, only 'PageType' requests trigger visits.. if you
              host images, PDF's, or other non-pagetype URLs that are linked to from
              other sites, those won't be counted as visits (but the hits will still
              be counted).
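
The rule described above can be sketched in a few lines of Python. This is a simplified illustration, not Webalizer's actual code: the function name `count_visits`, the tuple layout and the `visitorN.example` host names are invented here, and the choice that only page requests refresh the per-host timer is an assumption of the sketch.

```python
from datetime import datetime, timedelta

VISIT_TIMEOUT = timedelta(minutes=30)  # the default mentioned above

def count_visits(requests, timeout=VISIT_TIMEOUT):
    """Count visits from (host, timestamp, is_page) tuples sorted by time.

    A page request starts a new visit when the host has no earlier page
    request, or when its previous page request is older than the timeout.
    Non-page requests (images, PDFs, ...) never start a visit.
    """
    last_page_seen = {}
    visits = 0
    for host, when, is_page in requests:
        if not is_page:
            continue
        previous = last_page_seen.get(host)
        if previous is None or when - previous > timeout:
            visits += 1
        # Assumption of this sketch: only page requests refresh the timer.
        last_page_seen[host] = when
    return visits

# Five hypothetical visitors, each requesting one page, 10 minutes apart.
start = datetime(2005, 8, 20, 7, 0)
real = [(f"visitor{i}.example", start + timedelta(minutes=10 * i), True)
        for i in range(5)]

# The same traffic, but with every request logged under one host name.
collapsed = [("fr.wikinations.be", when, is_page) for _, when, is_page in real]

print(count_visits(real))       # 5
print(count_visits(collapsed))  # 1
```

When every request is attributed to the same host and the gaps stay under the timeout, the whole stream counts as a single visit, which matches the kind of undercount reported in this thread.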

              --
              Bradford L. Barrett brad@...
              A free electron in a sea of neutrons DoD#1750 KD4NAW

              How do you give Microsoft the benefit of the doubt when you
              know that if you were to throw it in a room with truth, you'd
              risk a matter/anti-matter explosion? -- Nicholas Petreley IDG
            • xirzon
              Message 6 of 10 , Aug 29, 2005
                --- In webalizer@yahoogroups.com, Enric Naval <enventa2000@y...> wrote:
                > Look at the Sites Top List. It only lists 3 different
                > sites!
                >
                > Maybe your rewrite rules are preventing your webserver
                > from writing in the logs the correct IPs of your
                > visitors. It is writing instead your own IP as the
                > visitor's IP.
                >

                You're probably on to something. However, the logs appear to record
                the correct IPs, e.g. log entries look like this:

                fr.wikinations.be 213.189.XXX.XXX - - [20/Aug/2005:07:17:28 +0200]
                "GET /Sunpark_De_Haan
                HTTP/1.1" 200 6966
                "http://www.google.be/search?q=sunpark&hl=fr&cr=countryBE&lr=lang_fr
                &sa=X&oi=lrtip7" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

                (IP censored for privacy reasons). I suspect that perhaps my CustomLog
                format might not be recognized:

                CustomLog /var/log/nationsbe_access_log "%{Host}i %h %l %u %t \"%r\"
                %>s %b \"%{Referer}i\" \"%{User-Agent}i\""

                I will investigate that further. For the record, the rewrite rules are:

                RewriteCond %{REQUEST_URI} !^/(upload|style|skins|backup|adm|stats)/
                RewriteCond %{REQUEST_URI} !^/(redirect|texvc|index).php
                RewriteCond %{REQUEST_URI} !^/favicon.ico
                RewriteCond %{REQUEST_URI} !^/robots.txt
                RewriteRule ^/(.*)$ /index.php?title=$1 [L,QSA]

                Thanks for the help so far,

                Erik
              • Bradford L. Barrett
                Message 7 of 10 , Aug 29, 2005
                  > CustomLog /var/log/nationsbe_access_log "%{Host}i %h %l %u %t \"%r\"
                  > %>s %b \"%{Referer}i\" \"%{User-Agent}i\""

                  That's not a supported format.. drop the %{Host}i so you have a standard
                  'combined' format. You can stick it at the end of the log line if you
                  have to have it included.

                  --
                  Bradford L. Barrett brad@...
                  A free electron in a sea of neutrons DoD#1750 KD4NAW

                  The only thing Micro$oft has done for society, is make people
                  believe that computers are inherently unreliable.
                • Enric Naval
                  Message 8 of 10 , Aug 30, 2005
                    --- "Bradford L. Barrett" <brad@...> wrote:

                    >
                    > > CustomLog /var/log/nationsbe_access_log "%{Host}i
                    > %h %l %u %t \"%r\"
                    > > %>s %b \"%{Referer}i\" \"%{User-Agent}i\""
                    >
                    > Thats not a supported format.. drop the %{Host}i so
                    > you have a standard
                    > 'combined' format. You can stick it at the end of
                    > the log if you have
                    > to have it included.
                    >


                    Erik, I posted an explanation about CustomLog and
                    LogFormat some months ago. You can read it here:

                    http://groups.yahoo.com/group/webalizer/message/3128

                    In your logfile, you shouldn't have
                    "fr.wikinations.be" at the start. The first "word"
                    should be the visitor's IP (where a "word" is either a
                    space-delimited run of non-space, non-quote characters
                    or a quote-delimited run of characters). As Barrett
                    said, that part is wrong and the rest of the line is
                    correct.
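
To see why the leading host name matters, here is a small Python sketch that splits a log line into "words" roughly as defined above and returns the first one. It is only an illustration of the idea, not Webalizer's parser. The helper name `first_word` is invented, and treating the bracketed timestamp as a single word is an extra assumption, since the timestamp itself contains a space.

```python
import re

# A "word" is a quoted string, a bracketed timestamp, or a run of
# non-space characters. Handling [...] as one word is an assumption
# added for this sketch.
WORD_RE = re.compile(r'"[^"]*"|\[[^\]]*\]|\S+')

def first_word(line):
    """Return the first word of a log line, or None for an empty line."""
    match = WORD_RE.search(line)
    return match.group(0) if match else None

# Log line in the format posted earlier in this thread (IP censored there).
with_host = ('fr.wikinations.be 213.189.XXX.XXX - - [20/Aug/2005:07:17:28 +0200] '
             '"GET /Sunpark_De_Haan HTTP/1.1" 200 6966')

# The same line with the leading %{Host}i field dropped.
without_host = with_host.split(" ", 1)[1]

print(first_word(with_host))     # fr.wikinations.be
print(first_word(without_host))  # 213.189.XXX.XXX
```

With the host name in front, anything that reads the first word as the visitor sees "fr.wikinations.be" on every line, whatever the real client address was.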

                    Also, someone already had a problem similar to yours:

                    http://groups.yahoo.com/group/webalizer/message/3070

                    You can search more related messages where I posted
                    answers:

                    http://groups.yahoo.com/group/webalizer/messagesearch?query=customlog%20logformat%20enric





                    Enric Naval
                    Student of Management Informatics at the UdL (Lleida)
                    GRIHO webalizer.conf
                    http://griho.udl.es/webalizer/webalizer.conf.txt
