RE: [webalizer] A question

  • Andy Brager
    Message 1 of 9 , Apr 19, 2007
      Clearly we have different ideas as to what constitutes "a full fledged statistics solution".  I'm talking about 1) filtering out certain data so that it's not counted and 2) summing a different portion of the data than is currently summed.
       
      If that's "a full fledged statistics solution" then clearly people paying for it are getting seriously ripped off.  Just my two cents.
       
       
      -----Original Message-----
      From: webalizer@yahoogroups.com [mailto:webalizer@yahoogroups.com] On Behalf Of Peter K Yanke
      Sent: Thursday, April 19, 2007 12:38 PM
      To: webalizer@yahoogroups.com
      Subject: RE: [webalizer] A question

      Just my two cents, but to my understanding the webalizer program is not meant to be a full fledged statistics solution. You may want to try paid services such as Statcounter and others that do what you are talking about and more, or look for a paid solution that can be implemented on your own server. I've never looked at webalizer as anything other than a 'rough idea' of what is going on as far as visitors go. However, a relatively simple spreadsheet using the right numbers from the webalizer output or even logfiles could get you the info you want without any kind of fancy web display.
       
      Like I said...just my two cents...:0)


      From: webalizer@yahoogroups.com [mailto:webalizer@yahoogroups.com] On Behalf Of Andy Brager
      Sent: Thursday, April 19, 2007 1:22 PM
      To: webalizer@yahoogroups.com
      Subject: RE: [webalizer] A question

      I wonder if the developers are listening.  I have some ideas for improvement I would dearly love to see implemented.  I hope nobody minds me expressing those ideas here.
       
      I would love to be able to do the following:
       
      1) Filter out bots from the stats.  A regular expression match on for example "*googlebot*" would do the trick.
       
      2) Filter out specific domains and/or IP addresses (in particular, I want to filter out myself, as I'm responsible for about 90% of the traffic.  I suspect this is not unusual during development and/or right around launch time - or at least until real traffic builds.  I don't need to know where I've been, I already know by virtue of having been there).  This would also help filter out certain bots, and/or certain users, and/or useless information from aggregated users - like AOL for instance if they're all using the same IP.
       
      3) See reports that show me which IP addresses are hitting/visiting/entering/exiting which pages - currently you get a summary, but it doesn't tell me that IP 1.2.3.4 entered on page foo.html and exited on page bar.html, and also clicked through to foobar.html and fubar.html.  You're summarizing URL info. as in URL x.html was visited 1000 times (without corresponding IP info.).  I want to see a summary of IP info. as in IP 1.2.3.4 visited URL x.html 4 times, y.html 6 times and z.html 9 times.  Don't limit me to the top 10 or top 50 or top 100, unless I ask to limit it.  Show me all of them if I want to see all of them, or just the bottom N IPs or top N IPs, etc.  Give me some flexibility in what I choose to see.
       
      4) In addition to item 3 above, I would like to see the exact time & date they visited those pages.
       
      5) I'd like to be able to filter out specific files from being reported.  For example, I know I have images on specific pages and that I use stylesheets, I don't need to know that image.jpg and/or style.css was hit when they visited index.html and/or page.html.  The fact that they visited index.html and/or page.html is sufficient for me to know that the images and stylesheets on those pages were hit, and providing that superfluous information doesn't add any value.  In fact, it substantially decreases the value because I can't get the information I want, it's overwhelmed by this other pointless info. which is packed into the "top 10".
       
      It tells me that foo.html was hit 500 times which is nice to know, but it doesn't tell me that IP 4.3.2.1 hit (visited?) foo.html file 100 times, and that IP 12.12.12.12 hit foo.html file 400 times.  That however, is the information I really want to know.  That shows me only 2 IPs are responsible for all of my 500 hits/visits and the 50 other sites listed were apparently hitting other files (which would also show up in the stats).  Now I can decide if those 500 hits have any true value to me and/or what that value is, based on the IP reported.
       
      All of this info. is in the raw log files.  It simply needs to be organized differently than is currently reported by webalizer.
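
A rough sketch of how items 1 to 5 above could be done directly against the raw logs, for anyone who can download them. It assumes the logs are in Apache Combined Log Format; the bot pattern, the "own IP" address, the file suffixes and the function name `per_ip_report` are all illustrative assumptions, not anything Webalizer provides.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Combined Log Format: host ident user [time] "request" status bytes "referrer" "agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

BOT_RE = re.compile(r"googlebot|slurp|msnbot", re.IGNORECASE)  # assumed bot list (item 1)
MY_IPS = {"10.0.0.5"}                                          # assumed: your own address (item 2)
SKIP_SUFFIXES = (".jpg", ".gif", ".png", ".css", ".js")        # assumed: files to ignore (item 5)


def per_ip_report(lines):
    """Return (counts, times): hits per IP per URL, and the timestamps behind them."""
    counts = defaultdict(Counter)   # counts[ip][url] -> number of hits (item 3)
    times = defaultdict(list)       # times[(ip, url)] -> list of datetimes (item 4)
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue                # skip lines that are not in the expected format
        ip, url, agent = m["ip"], m["url"], m["agent"] or ""
        if ip in MY_IPS or BOT_RE.search(agent):
            continue
        if url.lower().split("?", 1)[0].endswith(SKIP_SUFFIXES):
            continue
        when = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        counts[ip][url] += 1
        times[(ip, url)].append(when)
    return counts, times
```

Calling `per_ip_report(open("access.log"))` (the file name is a placeholder) gives a per-IP breakdown with no "top N" limit; `counts[ip].most_common(n)` can impose one when wanted.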
       
      Please note, as a simple user subject to my webhost's restrictions, I have no control over the compile time characteristics of the program, nor even startup characteristics.  These features would have to be accessible from the web page that webalizer prints when it sums up the stats (or a separate runtime configuration page if necessary).  I should be able to "lock in" my choices so I don't have to specify them each and every time I run webalizer.
       
      Thank you
       
       
      -----Original Message-----
      From: webalizer@yahoogroups.com [mailto:webalizer@yahoogroups.com] On Behalf Of Southerland, Adam
      Sent: Thursday, April 19, 2007 11:28 AM
      To: webalizer@yahoogroups.com
      Subject: RE: [webalizer] A question

      Have you looked at these documents yet?

      Stats Explained in general: http://www.webalizer.org/simpleton.html

      This has the stats in depth: http://www.mrunix.net/webalizer/webalizer_help.html  (Including what the words like Visits and Hits represent)

      From: webalizer@yahoogroups.com [mailto:webalizer@yahoogroups.com] On Behalf Of Steve Bryant
      Sent: Thursday, April 19, 2007 1:19 PM
      To: webalizer@yahoogroups.com
      Subject: [webalizer] A question

      Hello

      I have a couple of internet shops - www.celtic-fringe.com and www.worksofbeauty.co.uk and the company hosting them use Webalizer for the stats.

      I cannot find so far any understandable information on what the stats mean.  I'm told that 'visits' are important - but can someone tell me please - what are visits and what are hits?  How can I tell what is going on?  Is there any concise information available?

      Regards

      Steve


       

      Steve Bryant
      Cornwall UK
      TR2 5JP
      Telephone 07785 941781
      E-mail highbridgesb@ yahoo.co. uk


    • Southerland, Adam
      Message 2 of 9 , Apr 19, 2007

        I know there are some other web analysis programs out there that do what you want; I forget whether they are free or not. For my purposes, Webalizer is just what I need/want. You, on the other hand, may need something else, and as long as you can get your log files off the server (most web hosts I’ve seen allow this), you can try the other applications. Having the logs also means that you can run Webalizer locally to produce the same stats and then tweak the config.

         

        Web Druid (a web site with an orange color scheme), which looks like it is based on Webalizer, shows the flow of the users. There may be others out there that do what you want without programming involved. (I’ve looked at several, but it was about a year ago and I don’t remember names. A web search should give nice results.)

         

        I understand your concerns and they do differ from mine due to the site size. If I need other stats, I go and calculate them by hand (using tools).

         

        Adam Southerland

         

        From: webalizer@yahoogroups.com [mailto:webalizer@yahoogroups.com] On Behalf Of Andy Brager
        Sent: Thursday, April 19, 2007 3:38 PM
        To: webalizer@yahoogroups.com
        Subject: RE: [webalizer] A question

         

        Just additional info for you… Running stats like that could take days or weeks to generate. A few have tried, and on my log files (400-600 MB per day), well, I quit processing them manually after 18 hours because I wanted to do something else at that point.

         

        I could understand on a file that size.  However you could further break it down by hour and just process part of the file.  It's been my experience that most programs attempt to suck the entire file into memory, and therefore spend all the CPU's time paging & swapping instead of actually analyzing.  If webalizer (or whatever program) instead broke the file up into manageable chunks, chances are it would go substantially faster.
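
The "break it down by hour" idea can be sketched as a single streaming pass. This is only an illustration and says nothing about how Webalizer itself reads its input; it assumes timestamps in the usual `[19/Apr/2007:13:19:00 -0700]` form, and `hits_per_hour` is a made-up name. Iterating over a file object in Python reads one line at a time, so memory use stays small however large the log is.

```python
import io
import re
from collections import Counter

# Pulls the date and the hour out of a timestamp like [19/Apr/2007:13:19:00 -0700]
HOUR_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}):(\d{2}):")


def hits_per_hour(log_file, only_hour=None):
    """Count hits per (date, hour) bucket; optionally keep just one hour of the day."""
    hits = Counter()
    for line in log_file:           # streams line by line, never loads the whole file
        m = HOUR_RE.search(line)
        if m is None:
            continue
        day, hour = m.group(1), int(m.group(2))
        if only_hour is not None and hour != only_hour:
            continue                # process only part of the file
        hits[(day, hour)] += 1
    return hits


# Usage with an in-memory stand-in for a real log file:
demo = io.StringIO(
    '1.2.3.4 - - [19/Apr/2007:13:19:00 -0700] "GET /a HTTP/1.0" 200 1\n'
    '1.2.3.4 - - [19/Apr/2007:14:00:00 -0700] "GET /b HTTP/1.0" 200 1\n'
)
print(hits_per_hour(demo, only_hour=13))
```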

        However, I'm not really thinking about your scenario because 1) my log files are very small because it's a brand new website.  Newer sites are more concerned with these things than older sites with an established presence.  When I reach log files of that size I too will probably find this extra information less useful.  2) You don't HAVE to process them in this manner, it's an option, not a requirement.  3) Faster hardware is always coming out.  What used to take months years ago, now takes days.  Software should never be handicapped due to hardware limitations, other than to write them more efficiently.

         

        You can filter out some stats like googlebot by using the hidden tags in the config file. (I’m not sure if this completely removes all stats from googlebot or not though, but it would remove it from the agents section.)

         

        Key point:  To the best of my knowledge at this time, I do not have access to the config file.  I'm just a plain old user on a webhost that controls most everything I do.  Just about all tasks require that I click on an icon.  No shell access.  Sucks to be me.  Further, unless it does in fact remove it from all stats, it doesn't help me.

        [..snip..]

         

        Btw… Visits are like ‘Sessions’: a visit is one user hitting your pages 1 or more times within X minutes (30 by default, I think). If the user comes back 2 hours later, then that is a second visit.
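
To make the visit definition above concrete, here is a simplified sketch of timeout-based visit counting. It is an assumption-laden illustration, not Webalizer's actual code: it treats every hit as eligible, keys visitors by IP only, and uses the 30-minute figure mentioned above as the default. The name `count_visits` is made up.

```python
from datetime import datetime, timedelta


def count_visits(hits, timeout=timedelta(minutes=30)):
    """hits: iterable of (ip, datetime) pairs in time order. Returns {ip: visits}."""
    last_seen = {}   # ip -> time of that ip's most recent hit
    visits = {}      # ip -> number of visits so far
    for ip, when in hits:
        previous = last_seen.get(ip)
        # A new visit starts on the first hit, or after a gap longer than the timeout.
        if previous is None or when - previous > timeout:
            visits[ip] = visits.get(ip, 0) + 1
        last_seen[ip] = when
    return visits


hits = [
    ("1.2.3.4", datetime(2007, 4, 19, 13, 0)),
    ("1.2.3.4", datetime(2007, 4, 19, 13, 10)),  # same visit, 10 minutes later
    ("1.2.3.4", datetime(2007, 4, 19, 15, 10)),  # 2 hours later: second visit
]
print(count_visits(hits))
```

So three hits from one address can amount to two visits, which is why the hit and visit totals in the reports differ.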

         

        Useful to know thank you, but not as useful as what I outlined earlier - in my opinion.

         

        Like someone else suggested before I finished this, you should probably do your own analysis on top of what Webalizer gives you… I’d recommend shoving logs into SQL and then running queries on them there and displaying the results as HTML.
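
One way the "logs into SQL" suggestion might look, sketched with Python's built-in `sqlite3` module. The table layout, the regular expression (which assumes Common/Combined Log Format) and the function names `load_hits` and `hits_by_ip` are all assumptions made for illustration; any SQL database would do.

```python
import re
import sqlite3

# Captures ip, timestamp, url and status from a Common/Combined Log Format line.
LINE_RE = re.compile(r'(\S+) \S+ \S+ \[([^\]]+)\] "\S+ (\S+)[^"]*" (\d{3})')


def load_hits(conn, lines):
    """Parse log lines and insert one row per hit; unparseable lines are skipped."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits (ip TEXT, ts TEXT, url TEXT, status INTEGER)"
    )
    rows = (m.groups() for m in map(LINE_RE.match, lines) if m)
    conn.executemany("INSERT INTO hits VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def hits_by_ip(conn, url):
    """Which IPs hit this URL, and how often (busiest first)."""
    cur = conn.execute(
        "SELECT ip, COUNT(*) AS n FROM hits WHERE url = ? "
        "GROUP BY ip ORDER BY n DESC, ip",
        (url,),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")   # use a file path instead to keep the data
load_hits(conn, [
    '4.3.2.1 - - [19/Apr/2007:13:19:00 -0700] "GET /foo.html HTTP/1.0" 200 512',
    '12.12.12.12 - - [19/Apr/2007:13:20:00 -0700] "GET /foo.html HTTP/1.0" 200 512',
    '12.12.12.12 - - [19/Apr/2007:13:21:00 -0700] "GET /foo.html HTTP/1.0" 200 512',
])
print(hits_by_ip(conn, "/foo.html"))
```

Once the rows are in a table, the per-IP, per-URL and per-time questions raised earlier in the thread become ordinary `GROUP BY` queries.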

         

        I guess.  I find that I usually have to "roll my own" for virtually everything and I'm getting too old to keep on slicing and dicing.  It would just be nice if for once somebody's program had features I found truly useful.  The data is there.  The people/person who wrote the software did a really nice job of presenting some of it.   They clearly know how to write nice software.  Can't there be some thought  & consideration given to presenting the data in a different way that might be more meaningful to some people?

         

         


      • Peter K Yanke
        Message 3 of 9 , Apr 19, 2007
          My point was that you may want to look at solutions other than Webalizer...that is all...:0)


