
Newcomer hello and first question...

  • fephoo
    Message 1 of 6, May 30, 2008
      Hello,

      I am a newcomer in the web analytics world and really happy to start
      learning.

      Here is probably a newbie kind of question:

      I want to get an idea of the number of unique human visitors that
      visited a website for a given period.
      Unfortunately, the only data I have (I hope to be able to implement at
      least GA soon) are the AWStats logs, which I know represent all hits
      on the web server.

      Is there a way to apply filters to the logs to remove all request hits
      on images, robots, CSS...? Maybe a regex formula?

      Thanks a lot.
      Fabien
    • Julien Coquet
      Message 2 of 6, May 31, 2008
        Salut Fabien,

        When using Awstats (or any log-based analytics tool for that matter),
        I usually find it easier to pre-process my server logs using Textpad
        or any regexp-enabled text editor.
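
        For instance (a rough sketch, assuming an Apache combined-format
        log), searching for lines matching a pattern along the lines of

            \.(gif|jpe?g|png|css|js|ico)[ ?]

        and deleting them strips most static-asset hits, and a crude
        user-agent pattern such as

            bot|crawl|spider|slurp

        (searched case-insensitively) catches many of the common robots.
        The exact syntax varies from editor to editor.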

        Julien


      • ALEX BRASIL
        Message 3 of 6, May 31, 2008
          Following on that line of thought, if you have some time to burn and are familiar or comfortable with a command-line interface, grep and awk are worth learning and will pay dividends for many computer-related tasks.

          I'm sure if you dig around for grep and server log files, you'll find some ready-made search strings to test out and work with that relate directly to your query as well.
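
          For example (untested, and assuming a combined-format Apache log
          in access.log), something like this drops static assets and the
          obvious robots, then counts the distinct client IPs that remain:

              grep -vE '\.(gif|jpe?g|png|css|js|ico)[ ?]' access.log |
                grep -viE 'bot|crawl|spider|slurp' > humans.log

              # one very rough proxy for unique visitors
              awk '{print $1}' humans.log | sort -u | wc -l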

          Incidentally, they're also great tools for finding out which pages are tagged or not tagged on a website, or for hunting for malformed tags. As an example, someone on this list was lamenting that their code was reporting 2 visits for every 1 actual visit. With grep, a simple search string would easily have pinpointed where the problem was.
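
          As a quick illustration (assuming the urchin.js-style GA snippet;
          substitute whatever string your own tag uses), run over a set of
          saved pages:

              # pages with no tag at all
              grep -L 'urchinTracker' *.html

              # pages where the tag appears more than once - the likely
              # double-count culprits
              grep -c 'urchinTracker' *.html | grep -v ':[01]$'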




        • Julien Coquet
          Message 4 of 6, May 31, 2008
            Alex,

            your follow-up confirms that Pandora's Box of Nerdiness is indeed wide open,
            with no chance of ever putting the lid back on ;-)

            I've had my fair share of grep/awk scripting but keep in mind that most of
            us have little to no access to a Un*x/Linux system and cannot install Cygwin
            or any advanced command-line shell tools!

            So we started the discussion on regular expressions and then digressed into
            shell implementation of said filters and search patterns.
            Who feels like writing a tutorial? ;-)

            Cheers,

            Julien




            --
            Julien Coquet
            julien.coquet@...
            skype: juliencoquet


          • Steve McInerney
            Message 5 of 6, May 31, 2008
              Hi Fabien (and by implication Julien and Alex :-) )

              On Sat, May 31, 2008 at 6:32 AM, fephoo <fabienponson1@...> wrote:
              > I want to get an idea of the number of unique human visitors that
              > visited a website for a given period.

              Wouldn't we all.... :-)
              Fortunately, your use of the "idea" is about right. No tool or
              technology will give you an exact number. They are all
              approximations, either high or low.


              > Unfortunately, the only data I have (I hope to be able to implement at
              > least GA soon) are the AWStats logs, which I know represent all hits
              > on the web server.

              Clarification?
              Do you mean you have access to AWStats to do the processing,
              AWStats being a log-analyser program? Or, by implication, do you
              mean you have the raw logs out of Apache?


              > Is there a way to apply filters to the logs to remove all request hits
              > on images, robots, CSS...? Maybe a regex formula?

              If you're already using AWStats, I *understand* that it already
              makes some simple adjustments to the displayed numbers to
              account for robots. Or did; it's been a while since I used it.
              Assuming Alex's and Julien's suggestions prove difficult, the
              implication is that you do have access to Perl, in which case a
              manual pre-filter written in Perl itself, feeding into AWStats,
              is trivial.
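
              Something along these lines, say (untested, and the extension
              and user-agent patterns are only a crude first cut):

                  perl -ne 'print unless m{\.(gif|jpe?g|png|css|js|ico)[ ?]}i
                            or m{bot|crawl|spider|slurp}i' access.log > filtered.log

              Then point AWStats' LogFile directive at filtered.log instead
              of the raw log.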

              Flip side: I'm pretty sure AWStats enables user-defined
              filtering within itself.
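
              From memory, awstats.conf has Skip* directives roughly like the
              following (check the comments in the shipped config file for
              the exact syntax before relying on them):

                  SkipFiles="REGEX[\.(css|js|png|gif|ico)$]"
                  SkipUserAgents="REGEX[bot|crawl|spider]"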

              Julien, as to your thrown gauntlet of a tutorial, and given the
              opening of Pandora's box? ;-) Well, it's a starter, but my first-ever
              post to this forum (how sweet it was to resurrect... :-) ) may help:
              http://tech.groups.yahoo.com/group/webanalytics/message/5650

              I recognise this may sound a bit too much like advocacy, but if
              you do want simple access to a Linux PC to play around with and
              experiment with this sort of filtering, just grab a live Linux
              CD/DVD from... well... anywhere and boot it up, e.g. Ubuntu,
              Mandriva, OpenSUSE, etc. It won't touch the underlying system
              unless you tell it to, and it's a great way of dipping your
              toes in to see how fine the water actually is. :-)


              HTH?
              Cheers!
              - Steve
            • fephoo
              Message 6 of 6, Jun 2, 2008
                Thank you Julien, Alex and Steve for your answers.

                Yes, I meant that I have the raw Apache logs, and visualize
                them through AWStats ;).

                In my search for a ballpark figure for the number of human
                visits, I prefer to have an under-estimated number rather
                than an over-estimated one. Using the raw logs, even with
                filtering, it seems likely that I will get a number higher
                than reality (bots add a lot).

                I am planning on installing GA, letting it run for a while,
                and extrapolating the unique-visitor number given by GA to a
                longer period. I assume I will get an under-estimate of the
                real number of human visits.

                Do you think it's an OK approach?

                Thanks,
                Fabien
