Re: [webalizer] inktomisearch causes lots of fake visits in my stats

  • gary hall
    Message 1 of 6 , May 14, 2005
      Dave,

      We don't seem to have this problem, but it is something to think about.

      Gary

      Enric Naval wrote:

      >(This is a long email, sorry.)
      >
      >I have a problem in my stats:
      >
      >Inktomi uses a different IP (and therefore a different hostname) for
      >each hit, so each hit from Inktomi counts as a separate visit,
      >instead of many hits counting as only one visit.
      >
      >For example, these two entries are counted as two different visits,
      >even though it is the same user agent fetching the same file from
      >the same domain less than 30 minutes apart:
      >
      >lj1124.inktomisearch.com - - [01/May/2005:00:14:07 +0200] "GET /robots.txt HTTP/1.0" 200 873 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
      >
      >lj2545.inktomisearch.com - - [01/May/2005:00:42:18 +0200] "GET /robots.txt HTTP/1.0" 200 873 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
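As a rough check of the point made with these two entries, the first field of each log line is the only thing that tells the two hits apart. A minimal sketch, assuming a log where the client host is the first space-separated field; the file name `sample_log` is hypothetical:

```shell
# Write the two example entries to a hypothetical sample file.
cat > sample_log <<'EOF'
lj1124.inktomisearch.com - - [01/May/2005:00:14:07 +0200] "GET /robots.txt HTTP/1.0" 200 873 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
lj2545.inktomisearch.com - - [01/May/2005:00:42:18 +0200] "GET /robots.txt HTTP/1.0" 200 873 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
EOF

# Count the distinct client hosts (first field) among the inktomisearch entries.
distinct_hosts=$(grep 'inktomisearch[.]com' sample_log | cut -d' ' -f1 | sort -u | wc -l | tr -d ' ')
echo "distinct inktomisearch hosts: $distinct_hosts"
```

Two hits, two distinct hosts: that is the situation the email describes as producing two visits.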
      >
      >On the 25th and 26th of every month, between 17:00 and 18:00,
      >inktomisearch makes most of the visits to my server, and webalizer
      >counts 500 to 1000 more visits than usual on each of those days.
      >
      >This causes a weird kind of camel's back in my graphs and leads me
      >to believe that I had more visits than usual for some reason, when
      >it was only inktomisearch crawling the sites on the server. It also
      >makes the other bars smaller, so it is more difficult to see trends
      >in visits.
      >
      >Usually it is not noticeable for individual sites, because it gets
      >lost in the noise, but I can see it when I mix together the logs
      >for every site on the server.
      >
      >If you look at this image, you will see that on the 25th and 26th
      >I'm getting 30% more visits than on the 27th and 28th, even though
      >all four days have about the same number of hits and sites, which
      >is highly suspicious. This happened in February, March and April,
      >so something had to be wrong. I have marked the suspicious-looking
      >part in red (this is April).
      >
      >http://griho.udl.es/naval/webalizer/inktomisearch.gif
      >
      >webalizer.conf has no option to prevent this from happening. If I
      >use, for example:
      >
      >GroupReferrer .inktomisearch.com Stupid inktomi
      >
      >then webalizer will still count each hit as a visit.
      >
      >To solve this I have written a one-line sed command, which I now
      >run on my logfiles before feeding them to webalizer (I explain it
      >below):
      >
      >sed 's/^[a-z][a-z][0-9][0-9][0-9][0-9][.]inktomisearch[.]com/inktomisearch.com/' access_log > access_log_sed
      >
      >This transforms all Inktomi hostnames this way. From:
      >
      >"lj2534.inktomisearch.com"
      >
      >or
      >
      >"fj3612.inktomisearch.com"
      >
      >to
      >
      >"inktomisearch.com"
      >
      >This way, all hits from inktomisearch.com are counted as coming
      >from one host rather than as one visit each, and you get a more
      >realistic count of visits.
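The effect of the substitution can be tried on a few lines before running it on real logs. A minimal sketch, assuming made-up log entries and hypothetical file names (`access_log_sample`, `access_log_sample_sed`); only the sed expression itself comes from the email:

```shell
# Two made-up Inktomi entries plus one ordinary visitor (all hypothetical).
cat > access_log_sample <<'EOF'
lj2534.inktomisearch.com - - [01/May/2005:00:14:07 +0200] "GET /robots.txt HTTP/1.0" 200 873
fj3612.inktomisearch.com - - [01/May/2005:00:42:18 +0200] "GET /index.html HTTP/1.0" 200 1024
host.example.org - - [01/May/2005:00:50:00 +0200] "GET /index.html HTTP/1.0" 200 1024
EOF

# Collapse the per-crawler hostnames into one; other lines pass through unchanged.
sed 's/^[a-z][a-z][0-9][0-9][0-9][0-9][.]inktomisearch[.]com/inktomisearch.com/' \
    access_log_sample > access_log_sample_sed

# Hits that now carry the collapsed hostname, and distinct hosts left in the file.
inktomi_hits=$(grep -c '^inktomisearch[.]com ' access_log_sample_sed)
distinct_hosts=$(cut -d' ' -f1 access_log_sample_sed | sort -u | wc -l | tr -d ' ')
echo "inktomi hits: $inktomi_hits, distinct hosts: $distinct_hosts"
```

The quotes around the sed expression keep the shell from treating the square brackets as a filename pattern.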
      >
      >This is a comparison of the daily visits graphs from my server
      >stats for April. As you can see, Inktomi was raising the maximum
      >number of visits and leveling all days to about the same height.
      >After using the script, it's easier to see that the server receives
      >far fewer visits on weekends.
      >
      >http://griho.udl.es/naval/webalizer/inktomi_difference.gif
      >
      >Notes for the sed command:
      >
      >s/ means "substitute"
      >
      >^ means the start of a line
      >
      >[a-z] means any one letter from a to z
      >
      >[0-9] means any one digit from 0 to 9
      >
      >[.] the dot character has a special meaning by itself, so I
      >surround it with square brackets
      >
      >Enric Naval
      >Student of Management Informatics at the UdL (Lleida)
      >GRIHO webalizer.conf
      >http://griho.udl.es/webalizer/webalizer.conf.txt
      >
      >Webalizer homepage: http://www.webalizer.org
    • gary hall
      Message 2 of 6 , May 14, 2005
        Oops!

        Sorry, I hit "reply" by mistake.

        Warm regards,

        Gary

      • Enric Naval
        Message 3 of 6 , May 14, 2005
          --- gary hall <gary.chris@...> wrote:

          > Dave,
          >
          > We don't seem to have this problem, but it is
          > something to think about.
          >
          > Gary

          You aren't crawled by inktomisearch? Are you sure?
          This domain is used by Slurp (the Yahoo! bot), so this
          would mean that your site won't appear in any search
          from Yahoo!, because they don't crawl your site. For
          commercial sites, that is a bad thing.

          If you are sure you have no visits from them (for
          example, because you have an intranet), then please
          forget the rest of this email.


          You should grep your access_log files for visits from
          inktomisearch, since you will probably have some. I
          believe it is not possible to see the problem in the
          stats unless you look directly at the logfiles. This
          command lists all hits from inktomisearch. Could you
          run it and tell us if it worked? (Remember that each
          line you see is counted as one separate visit.)

          grep '^[a-z][a-z][0-9][0-9][0-9][0-9][.]inktomisearch[.]com' access_log
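To get a number instead of a listing, grep's `-c` option prints the count of matching lines. A minimal sketch on a made-up log (the file name `access_log_demo` and its entries are hypothetical), using the two-letters-plus-four-digits hostname form shown in the log examples of this thread:

```shell
# Hypothetical sample log: two Inktomi crawler hits and one ordinary visitor.
cat > access_log_demo <<'EOF'
lj1124.inktomisearch.com - - [01/May/2005:00:14:07 +0200] "GET /robots.txt HTTP/1.0" 200 873
lj2545.inktomisearch.com - - [01/May/2005:00:42:18 +0200] "GET /robots.txt HTTP/1.0" 200 873
host.example.org - - [01/May/2005:00:50:00 +0200] "GET /index.html HTTP/1.0" 200 1024
EOF

# grep takes the pattern first, then the file; -c prints the number of matching lines.
inktomi_lines=$(grep -c '^[a-z][a-z][0-9][0-9][0-9][0-9][.]inktomisearch[.]com' access_log_demo)
echo "lines from inktomisearch: $inktomi_lines"
```

Note that the hostnames have no dot between the two letters and the four digits, so the pattern should not contain one there either.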


          You see, I use this line in webalizer.conf:

          GroupSite *inktomisearch.com Inktomi

          In the Top Sites list I had 148 visits from
          inktomisearch.com both before AND after using the
          script, BUT the total number of visits went down
          from 41292 to 31512!

          So:

          Using_script   Total_visits   Visits_from_inktomi
          No             41292          148
          Yes            31512          148

          All other totals in the "Top Sites" list remain the
          same.

          Mind you, this is the expected and absolutely correct
          behaviour.

          The "Top Sites" list and "total visits" are not related
          to each other; they use different algorithms and get
          very different results.
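For scale, the two totals quoted in this message can be compared directly. A small sketch using only those two figures; the variable names are illustrative:

```shell
total_before=41292   # total visits without the sed script
total_after=31512    # total visits with the sed script

removed=$(( total_before - total_after ))
# Integer percentage of the original total (shell arithmetic truncates).
percent=$(( removed * 100 / total_before ))

echo "visits removed: $removed (about ${percent}% of the original total)"
```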






          Enric Naval
          Student of Management Informatics at the UdL (Lleida)
          GRIHO webalizer.conf
          http://griho.udl.es/webalizer/webalizer.conf.txt



        • gary hall
          Message 4 of 6 , May 14, 2005
            Hi Eric,

            Currently my webmaster has installed "Webalizer Version 2.01 (with Geolizer patch)".

            When looking at "total referrers" I see Google (528 hits), Yahoo (155) and AOL (100) spelled out, but the traffic for the 25th and 26th looks "normal". I don't see any spikes like the ones you were showing. We show a total of 3,392 visits and 22,500 hits, which looks normal.

            I will pass this on to "Dave" and when I get his response (because he is the smart one of this effort) I will send you the results.

            Thanks.

            Warm regards,

            Gary

          • Enric Naval
            Message 5 of 6 , May 14, 2005
              --- gary hall <gary.chris@...> wrote:

              > Hi Eric,
              >
              > Currently my webmaster has installed "*Webalizer
              > Version 2.01*
              > <http://www.mrunix.net/webalizer/> (with *Geolizer*
              > <http://sysd.org/proj/log.php#glzr> patch)".
              >
              > When looking at "total referrers" I show Google (528
              > hits), Yahoo (155)
              > and AOL(100) spelled out, but the traffic for the 25
              > / 26 is "normal"
              > looking. Don't see any spikes like you were showing.
              > We show a total of
              > 3392 visits, 22,500 hits - looks normal.


              My sites get crawled by Inktomi on the 25th and 26th.
              For your site, it will be different days. I guess that
              Inktomi crawls every day of the month, and the 25th and
              26th are my turn to be re-crawled in depth, or the
              crawler happens to stumble upon a big website on those
              days of its monthly cycle.


              The referrers won't show you this problem, because
              inktomisearch (Yahoo! Slurp) sends an empty referrer
              ("-"). The referrers normally show you which search
              engines visitors used to reach you, but they won't
              show whether the engines' bots have crawled your site,
              because bots often send empty referrers.

              To find the engines' bot activity, you have to look
              in "Total Sites" or in "Total User Agents".

              For Inktomi, you should look for the string
              "inktomisearch.com" in "Total Sites" and the string
              "Slurp" in "Total User Agents".

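The same two strings can also be checked directly in the raw log. A sketch, assuming a combined-format log where the referrer and the user agent are the last two quoted fields; the file name `access_log_agents` and the second entry are hypothetical:

```shell
cat > access_log_agents <<'EOF'
lj1124.inktomisearch.com - - [01/May/2005:00:14:07 +0200] "GET /robots.txt HTTP/1.0" 200 873 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
host.example.org - - [01/May/2005:00:50:00 +0200] "GET /index.html HTTP/1.0" 200 1024 "http://search.example.com/?q=webalizer" "Mozilla/4.0 (compatible; ExampleBrowser)"
EOF

# Hits whose client host (first field) ends in inktomisearch.com.
site_hits=$(cut -d' ' -f1 access_log_agents | grep -c 'inktomisearch[.]com$')
# Hits whose line mentions the Slurp user agent.
agent_hits=$(grep -c 'Slurp' access_log_agents)
# Of the Slurp hits, how many carry the empty referrer "-".
empty_ref=$(grep 'Slurp' access_log_agents | grep -c ' "-" "')

echo "site=$site_hits agent=$agent_hits empty_referrer=$empty_ref"
```

On this sample the crawler hit is found by host and by user agent, and it is the one with the empty referrer, which is why the referrer report does not reveal it.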

              >
              > I will pass this on to "Dave" and when I get his
              > response (because he is
              > the smart one of this effort) I will send you the
              > results.
              >
              > Thanks.


              OK, thanks to you, too. Send this email to "Dave", if
              you can.


              >
              > Warm regards,
              >
              > Gary
              >
              > Enric Naval wrote:
              >
              > >--- gary hall <gary.chris@...> wrote:
              > >
              > >
              > >
              > >>Dave,
              > >>
              > >>We don't seem to have this problem, but it is
              > >>somrthing to think about.
              > >>
              > >>Gary
              > >>
              > >>
              > >
              > >You aren't crawled by inktomisearch? Are you sure?
              > >This domain is used by Slurp (the Yahoo! bot), so
              > this
              > >would mean that your site won't appear in any
              > search
              > >from Yahoo! because they don't crawl your site. For
              > >commercial sites, that's is a bad thing.
              > >
              > >If you are sure you have no visit from them (for
              > >example, because you have an intranet), then
              > please
              > >forget the rest of this email.
              > >
              > >
              > >You should grep your access_log files, looking for
              > >visits from inktomisearch, since you will probably
              > >have visits from them. I believe that it is not
              > >possible to see the problem in the stats unless you
              > >look directly the logfiles. This command lists all
              > >visits from inktomisearch. Could you run it and
              > tell
              > >us if it worked? (remember that each line you will
              > see
              > >is counted as one different visit)
              > >
              > >grep access_log
              >
              >^[a-z][a-z][.][0-9][0-9][0-9][0-9][.]inktomisearch[.]com
              > >
              > >
              > >You see, I use this line in webalizer.conf:
              > >
              > >"GroupSite *inktomisearch.com Inktomi"
              > >
              > >In the Top Site list I had 148 visits from
              > >inktomisearch.com both before AND after using the
              > >script BUT the total number of visits had gone down
              > >from 41292 to 31512!
              > >
              > >So:
              > >
              > >Using_script Total_visits Visits_from_inktomi
              > >Yes 41292 148
              > >No 31512 148
              > >
              > >All other totals in the "Top Sites" list remain the
              > >same.
              > >
              > >Mind you, this behaviour is the expected and
              > absolutly
              > >correct behaviour.
              > >
              > >"Top Site list" and "total visits" are not related
              > to
              > >each other and use different algorithms and get
              > very
              > >different results....
              > >
              > >
              > >
              > >
              > >
              > >
              > >>Enric Naval wrote:
              > >>
              > >>
              > >>
              > >>>(this is a long email, sorry)
              > >>>
              > >>>
              > >>>I have a problem im my stats:
              > >>>
              > >>>Inktomi uses a different IP for each hit. So,
              > each
              > >>>
              > >>>
              > >>hit
              > >>>from inktomi counts as a separate visit, instead
              > of
              > >>
              > >>
              > >>>many hits
              > >>>
              > >>>counting as only one visit.
              > >>>
              > >>>For example, these two entries are two different
              > >>>visits, despite it being the same user agent
              > >>>
              > >>>
              > >>fetching
              > >>
              > >>
              > >>>the same file
              > >>>
              > >>>
              > >>>
              > >>>from the same domain within less than 30 minutes
              > of
              > >>
              > >>
              > >>>difference:
              > >>>
              > >>>lj1124.inktomisearch.com - -
              > [01/May/2005:00:14:07
              > >>>+0200] "GET /robots.txt HTTP/1.0" 200 873 "-"
              > >>>"Mozilla/5.0
              > >>>
              > >>>(compatible; Yahoo! Slurp;
              > >>>http://help.yahoo.com/help/us/ysearch/slurp)"
              > >>>
              > >>>lj2545.inktomisearch.com - -
              > [01/May/2005:00:42:18
              > >>>+0200] "GET /robots.txt HTTP/1.0" 200 873 "-"
              > >>>"Mozilla/5.0
              > >>>
              > >>>(compatible; Yahoo! Slurp;
              > >>>http://help.yahoo.com/help/us/ysearch/slurp)"
              > >>>
              > >>>
              > >>>On the 25th and 26th of every month, between 17:00 and 18:00,
              > >>>inktomisearch accounts for most of the visits to my server, and
              > >>>webalizer counts 500 to 1000 more visits than usual on each of
              > >>>those days.
              > >>>
              > >>>This produces a strange camel-back shape in my graphs and leads me
              > >>>to believe that I had more visits than usual for some reason, when
              > >>>it was only inktomisearch crawling the sites on the server. It also
              > >>>makes the other bars smaller, so trends in visits are harder to see.
              > >>>
              > >>>It's usually not noticeable for individual sites because it gets
              > >>>lost in the noise, but I can see it when I combine the logs for
              > >>>every site on the server.
              > >>>
              > >>>
              > >>>If you look at this image, you will see that on days 25 and 26
              > >>>I'm getting 30% more visits than on days 27 and 28, yet they all
              > >>>have about the same number of hits and sites, which is highly
              > >>>suspicious. This happened in February, March and April, so
              > >>>something had to be wrong. I have marked the suspicious-looking
              > >>>part in red (this is April).
              > >>>
              > >>>http://griho.udl.es/naval/webalizer/inktomisearch.gif
              > >>>
              > >>>webalizer.conf has no option to prevent this from happening. If
              > >>>I use, for example:
              > >>>
              > >>>GroupReferrer .inktomisearch.com Stupid inktomi
              > >>>
              > >>>then webalizer will still count each hit as a visit.
              > >>>
              > >>>
              > >>>To solve this I have written a one-line sed command, which I now
              > >>>run on my log files before feeding them to webalizer (I explain
              > >>>it below):
              > >>>
              > >>>sed 's/^[a-z][a-z][0-9][0-9][0-9][0-9][.]inktomisearch[.]com/inktomisearch.com/' access_log > access_log_sed
              > >>>
              > >>>
              > >>>This rewrites every inktomi hostname of this form. From:
              > >>>
              > >>>"lj2534.inktomisearch.com"
              > >>>
              > >>>or
              > >>>
              > >>>"fj3612.inktomisearch.com"
              > >>>
              > >>>to
              > >>>
              > >>>"inktomisearch.com"
              > >>>
              > >>>
              > >>>This way, all the inktomi hits appear to come from one host and
              > >>>are no longer counted as separate visits, so you get a more
              > >>>realistic visit count.
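
As a quick sanity check of the substitution quoted above, here is a minimal sketch that wraps the same sed expression in a shell function and runs it on one log line. The function name `normalize_inktomi` is hypothetical (it is not part of webalizer or of the original command), and the sample line is shortened from the log excerpt in this thread.

```shell
#!/bin/sh
# Hypothetical helper: collapse per-crawler hostnames such as
# lj1124.inktomisearch.com into the single host inktomisearch.com.
# Reads log lines on stdin and writes the rewritten lines to stdout.
normalize_inktomi() {
    sed 's/^[a-z][a-z][0-9][0-9][0-9][0-9][.]inktomisearch[.]com/inktomisearch.com/'
}

# Shortened sample line (the user-agent field is replaced by "-").
sample='lj2545.inktomisearch.com - - [01/May/2005:00:42:18 +0200] "GET /robots.txt HTTP/1.0" 200 873 "-" "-"'

# Print the rewritten line; the host field should now read inktomisearch.com.
printf '%s\n' "$sample" | normalize_inktomi
```

Because the pattern is anchored at the start of the line, entries from any other host should pass through unchanged.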
              > >>>
              > >>>
              > >>>
              > >>>This is a comparison of the daily visits graphs from my server
              > >>>stats for April. As you can see, inktomi was raising the maximum
              > >>>number of visits and flattening all days to the same level. After
              > >>>using the script, it's easier to see that the server receives far
              > >>>fewer visits on weekends.
              > >>>
              > >>>http://griho.udl.es/naval/webalizer/inktomi_difference.gif
              > >>>
              > >>>
              > >>>
              > >>>Notes for the sed command:
              > >>>
              > >>>s/ means "substitute"
              > >>>
              > >>>^ means start of a line
              > >>>
              > >>>[a-z] means any one letter from a to z
              > >>>
              > >>>[0-9] means any one digit from 0 to 9
              > >>>
              > >>>[.] the dot character has a special meaning by itself, so I
              > >>>surround it with square brackets
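
Following on from the notes above, the repeated bracket expressions can also be written with interval expressions, which POSIX basic regular expressions spell as `\{n\}`. The sketch below shows an equivalent form, plus a looser variant. Both function names are hypothetical, and the looser pattern is an assumption: the original post only shows hostnames made of two letters and four digits, and the second sample host label is made up for illustration.

```shell
#!/bin/sh
# Same substitution as the original command, with \{n\} repeat counts
# in place of the repeated [a-z] and [0-9] bracket expressions.
normalize_inktomi_compact() {
    sed 's/^[a-z]\{2\}[0-9]\{4\}[.]inktomisearch[.]com/inktomisearch.com/'
}

# Looser variant (an assumption, not from the original post): match any
# single host label made of lowercase letters and digits before the domain.
normalize_inktomi_any() {
    sed 's/^[a-z0-9][a-z0-9]*[.]inktomisearch[.]com/inktomisearch.com/'
}

# The first host comes from the post; the second label is made up.
printf '%s\n' 'fj3612.inktomisearch.com - - "GET / HTTP/1.0"' | normalize_inktomi_compact
printf '%s\n' 'si3001a.inktomisearch.com - - "GET / HTTP/1.0"' | normalize_inktomi_any
```

The strict form only touches hosts that match the exact shape seen in the logs. The looser form would keep working if the crawler's naming scheme changed, at the cost of being less precise about what it rewrites.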
              > >>>
              > >>>
              > >>>Enric Naval
              > >>>Estudiante de Informática de Gestión en la Udl (Lleida)
              > >
              > >
              >


              Enric Naval
              Estudiante de Informática de Gestión en la Udl (Lleida)
              GRIHO webalizer.conf
              http://griho.udl.es/webalizer/webalizer.conf.txt


