Doubts about Unique Visitors

  • austinmurphy
    Message 1 of 10, Sep 27, 2007
      I have doubts about the information being supplied to me:

      My vendor tells me that he can use web log information to
      track 'unique visitors' coming to my website. He is defining a unique
      visitor as: the same IP address + same user agent (OS type + browser
      type + browser version).

      I'm not an expert, but this does not sound right. I referred to
      Kaushik's book, and it sounds like I should work with cookies +
      JavaScript tags.

      Could I get an opinion as to whether it is worthwhile to proceed with
      this definition to address business/marketing questions (it's the best
      we have for now)? And if yes, what types of factors should I watch out
      for?

      Thanks for any advice.
    • Stephen Turner
      Message 2 of 10, Sep 27, 2007
        There are two different issues here, which I think you're confusing.
        The first point is that you should definitely use cookies, not
        IP+user-agent, for identifying visitors. The second is whether to use
        logfiles or JavaScript, and here there are arguments on both sides:
        see http://en.wikipedia.org/wiki/Web_analytics for a summary. But
        cookies are available with either technology.

        Having said that, IP+user-agent isn't too bad, if that's all you have.
        You should switch to cookies because it's easy and because you will
        get more accurate results, but if you only have IP+user-agent in the
        meantime, you're still going to be able to do useful work. Whether you
        should trust the vendor who's trying to ignore the issue is a
        different matter...

        --
        Stephen Turner
        CTO, ClickTracks http://www.clicktracks.com/
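For readers who want to see what the IP+user-agent definition amounts to in practice, here is a minimal sketch. It assumes the logs are in the common Apache/NCSA "combined" format, which the thread does not confirm; the function name and the sample lines are made up for illustration.

```python
import re

# Apache/NCSA "combined" format (assumed): the last two quoted fields
# are the referrer and the user agent.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def count_unique_visitors(lines):
    """Count distinct (IP, user-agent) pairs; unparseable lines are skipped."""
    seen = set()
    for line in lines:
        m = LOG_LINE.match(line)
        if m:
            seen.add((m.group("ip"), m.group("ua")))
    return len(seen)

# Invented sample data: two hits from one IP/browser pair, a second browser
# on the same IP, and the same browser string on a different IP.
sample = [
    '192.0.2.1 - - [27/Sep/2007:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"',
    '192.0.2.1 - - [27/Sep/2007:10:01:00 +0000] "GET /a HTTP/1.1" 200 128 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"',
    '192.0.2.1 - - [27/Sep/2007:10:02:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; U; Linux i686) Firefox/2.0"',
    '198.51.100.7 - - [27/Sep/2007:10:03:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; U; Linux i686) Firefox/2.0"',
]
print(count_unique_visitors(sample))  # 3 distinct (IP, user-agent) pairs
```

Note that the key is the full user-agent string here; a vendor may well reduce it to OS, browser and version first, which changes the count.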
      • ju2ltd
        Message 3 of 10, Sep 27, 2007
          If you need to stick to log file analysis, you can still use a
          first-party cookie to track unique visitors. Log file analysis will
          under-report page views due to caching, but it does offer other
          advantages over JavaScript tagging in some circumstances; it depends
          on the kind of site and the web analytics solution your vendor is using.

          Jim.
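A rough sketch of cookie-based counting from log files follows. It assumes the server has been configured to append the raw Cookie request header as the last quoted field of each line (Apache can do this with `%{Cookie}i` in a custom LogFormat); the cookie name `visitor_id` and the sample lines are hypothetical.

```python
import re

# Assumed layout: the raw Cookie header is the last quoted field on the line.
LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')
COOKIE_NAME = "visitor_id"  # hypothetical first-party cookie name

def parse_cookie_header(header):
    """Split 'a=1; b=2' into {'a': '1', 'b': '2'}; parts without '=' are ignored."""
    pairs = {}
    for part in header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep:
            pairs[name] = value
    return pairs

def visitor_id(line):
    """Return the visitor cookie value from a log line, or None if absent."""
    m = LAST_QUOTED_FIELD.search(line)
    if not m:
        return None
    return parse_cookie_header(m.group(1)).get(COOKIE_NAME)

def count_unique_cookies(lines):
    """Return (distinct visitor cookies, number of hits carrying no cookie)."""
    ids, without_cookie = set(), 0
    for line in lines:
        vid = visitor_id(line)
        if vid is None:
            without_cookie += 1
        else:
            ids.add(vid)
    return len(ids), without_cookie

# Invented sample data: one visitor seen from two IPs, plus a cookie-less hit.
sample = [
    '192.0.2.1 - - [27/Sep/2007:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/4.0" "visitor_id=abc123; lang=de"',
    '203.0.113.9 - - [28/Sep/2007:09:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/4.0" "visitor_id=abc123"',
    '198.51.100.7 - - [28/Sep/2007:09:05:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0" "-"',
]
print(count_unique_cookies(sample))  # (1, 1)
```

How to treat the cookie-less hits (ignore them, or fall back to another key) is a separate decision that this sketch leaves open.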

        • Nick Arnett
          Message 4 of 10, Sep 27, 2007
            On 9/27/07, austinmurphy <austinmurphy@...> wrote:
            >
            >
            > I have doubts about the information being supplied to me:
            >
            > My vendor tells me that he can use web log information to
            > track 'unique visitors' coming to my website. He is defining a unique
            > visitor as: the same IP address + same user agent (OS type + browser
            > type + browser version).
            >

            To add to the good comments already made... your vendor's approach will
            never *over* count visitors, which is good. However, it may under-count. I
            recently compared that approach to a cookie-based one and found that both of
            them under-count quite a bit (don't have the bandwidth to grab the numbers,
            I'm on a train) and a combination of the two yields a significantly higher
            number than either one alone. Cookies undercount due to people blocking
            them; IP addresses undercount because of proxies and address re-use in DHCP.

            I'd be very curious to hear if others here have used any sort of hybrid
            system. I want to do some more analysis of our logged-in users, since
            logging in identifies them unambiguously (hackers notwithstanding... on my
            mind after yesterday's eBay craziness).
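One possible shape for such a hybrid, sketched under assumptions: use the cookie when a hit carries one, and fall back to IP+user-agent otherwise. This is not necessarily the combination described above; the record layout and the data are invented for illustration.

```python
def hybrid_key(rec):
    """Prefer the cookie ID; fall back to (IP, user-agent) for cookie-less hits."""
    if rec.get("cookie"):
        return ("cookie", rec["cookie"])
    return ("ip+ua", rec["ip"], rec["ua"])

# Invented records: two cookied visitors behind one proxy with identical
# browsers, plus two cookie-blocking visitors on different addresses.
records = [
    {"ip": "192.0.2.1", "ua": "MSIE 7.0", "cookie": "c1"},
    {"ip": "192.0.2.1", "ua": "MSIE 7.0", "cookie": "c2"},
    {"ip": "198.51.100.5", "ua": "Firefox/2.0", "cookie": None},
    {"ip": "203.0.113.8", "ua": "Firefox/2.0", "cookie": None},
]

cookie_only = len({r["cookie"] for r in records if r["cookie"]})
ip_ua_only = len({(r["ip"], r["ua"]) for r in records})
hybrid = len({hybrid_key(r) for r in records})
print(cookie_only, ip_ua_only, hybrid)  # 2 3 4
```

In this toy data the hybrid count is higher than either method alone, which matches the observation above. One caveat of this particular fallback rule: a visitor whose first hit arrives before the cookie is set would be counted once under each key.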

            Nick

            --
            Nick Arnett
            narnett@...
            Messages: 408-904-7198
          • Tim Wilson
            Message 5 of 10, Sep 27, 2007
              IP+user-agent is fairly common, isn't it? I know that that's what I
              tended to use when doing targeted analysis on raw log files. This always
              seemed like a "good 'nuf" approach for the sorts of things I was looking
              at.



              Nick -- I think you're oversimplifying a little bit, though. Both
              IP+user-agent-based and cookie-based approaches have reasons they would
              overcount *and* reasons they would undercount, don't they?



              With the IP+user-agent approach, the over-counting happens due to DHCP:
              I have one IP address today but a different one next week, so I look
              like two different visitors. Or I upgrade my browser (less of an
              issue). Or I install something that changes my user-agent string.



              The cookie-based approach can over-count when users delete their cookies
              and then return to the site. As for cookie blocking, do most WA tools
              not acknowledge these as visitors at all? Or do they just count the
              session as a unique visitor? I thought it was the latter.
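The over- and under-counting mechanisms can be made concrete with a toy data set in which the true person behind each hit is known. Everything here is invented for illustration; real logs carry no such ground-truth column.

```python
# Each hit: (true_person, ip, user_agent, cookie_id).
hits = [
    ("alice", "192.0.2.10", "Firefox/2.0", "c1"),
    ("alice", "192.0.2.55", "Firefox/2.0", "c1"),  # DHCP handed Alice a new address
    ("bob", "198.51.100.4", "MSIE 7.0", "c2"),
    ("bob", "198.51.100.4", "MSIE 7.0", "c3"),     # Bob deleted his cookies and returned
    ("carol", "198.51.100.4", "MSIE 7.0", "c4"),   # Carol shares Bob's proxy and browser
]

true_visitors = len({person for person, _, _, _ in hits})
by_ip_ua = len({(ip, ua) for _, ip, ua, _ in hits})
by_cookie = len({cookie for _, _, _, cookie in hits})
print(true_visitors, by_ip_ua, by_cookie)  # 3 3 4
```

Here IP+user-agent counts Alice twice and merges Bob with Carol, so the two errors happen to cancel, while the cookie count is inflated by Bob's deletion. Which effect dominates on a real site depends on the audience.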



              I always thought that, while there are factors pushing to both over- and
              undercounting, the net is that both approaches tend to overcount rather
              than undercount. No?



              Without putting an undue burden on the user to identify himself in some
              repeatable fashion on each visit, there is no perfect measure of unique
              visitors. But it's a reasonable assumption that, regardless of the
              approach, the error is consistent. I wouldn't throw the vendor out on
              this basis alone - there are other shortcomings in log-file-based WA
              (and...yes...there are shortcomings in tag-based solutions, too). It
              really depends on what information is most critical to you and then what
              approach will get you the most accurate data in that area.



              Tim
            • Steve
              Message 6 of 10, Sep 27, 2007
                To add onto all the other fine comments to date!

                On 9/28/07, austinmurphy <austinmurphy@...> wrote:
                > I have doubts about the information being supplied to me:
                >
                > My vendor tells me that he can use web log information to
                > track 'unique visitors' coming to my website. He is defining a unique
                > visitor as: the same IP address + same user agent (OS type + browser
                > type + browser version).

                Using his definition? Yes it probably is accurate. My definition of
                1+1=3 is correct for me too. I wouldn't suggest anyone else use it
                though. ;-)

                Having said that, IP alone, or IP+Agent, isn't *bad*. Rather, there
                are better options available.

                The other part of this equation is: what sort of automagic filtering
                for bots et al. do they have? How easy is it to add new filters? And how
                fast does the product run with massive numbers of filters? How easy is
                it to re-analyse with new filters?
                Filters are the principal key to log analysis.
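As a rough illustration of what "adding a filter and re-analysing" can look like, here is a small sketch that drops hits whose user-agent matches a list of patterns. The pattern list is illustrative only and far from complete; the function names and sample hits are made up.

```python
import re

# Illustrative patterns only; real bot lists are far longer and change often.
DEFAULT_PATTERNS = [r"bot\b", r"crawler", r"spider", r"slurp"]

def compile_filters(patterns):
    """Compile case-insensitive user-agent filters."""
    return [re.compile(p, re.IGNORECASE) for p in patterns]

def is_bot(user_agent, filters):
    return any(f.search(user_agent) for f in filters)

def human_hits(hits, filters):
    """Keep only the hits whose user-agent matches none of the filters."""
    return [h for h in hits if not is_bot(h["ua"], filters)]

hits = [
    {"ip": "192.0.2.1", "ua": "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"},
    {"ip": "192.0.2.2", "ua": "Googlebot/2.1 (+http://www.google.com/bot.html)"},
    {"ip": "192.0.2.3", "ua": "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"},
    {"ip": "192.0.2.4", "ua": "Wget/1.10.2"},
]

filters = compile_filters(DEFAULT_PATTERNS)
print(len(human_hits(hits, filters)))  # 2: Wget slips through the default list

# Re-analysis after adding one more filter.
filters = compile_filters(DEFAULT_PATTERNS + [r"^wget/"])
print(len(human_hits(hits, filters)))  # 1
```

User-agent matching only catches robots that identify themselves; behavioural filters (request rate, robots.txt fetches) are a separate topic not covered here.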


                > I'm not an expert, but this does not sound right. I referred to
                > Kaushik's book and it sounds like I should work with cookies + Java
                > Script Tags.

                Avinash is, in essence, trying to help people get the best bang for
                buck. When you're starting out in WA, tagging is *typically* easier to
                get good results that you can immediately work from.


                > Could I get an opinion as to whether it is worthwhile to proceed with
                > this definition to address business/marketing questions (it's the best
                > we have for now)? And if yes, what types of factors should I watch-out
                > for?

                See above. It is worthwhile, but you may find a better bang for buck
                by using e.g. Google Analytics (GA). It will depend on your website and
                your audience. There are market segments where tagging (e.g. GA) is all
                but useless.


                Tim, Nick:
                The additional point I'd make - it almost seems irrelevant these days
                whether logs and tagging give different numbers,[1] when the Stone
                Temple report showed quite nicely that even high-end tagging solutions
                can't remotely agree on what the numbers should be.
                I'd love to see the results from a proper scientific/repeatable test.
                *That* would be fascinating. :-)


                Cheers!
                - Steve

                [1] Don't anyone *dare* quote me back on this if I ever (yet again)
                state how much better than tagging, logs are. ;-)
              • austinmurphy
                Message 7 of 10, Sep 28, 2007
                  Thanks to Steve, Tim, Nick, Jim, and Stephen,

                  I'm just new to this field - so this forum is a lifesaver.

                  The feedback is pretty clear: using the IP address + user agent is
                  not the best solution, but one could still learn something by using
                  it.

                  Should I be concerned about page caching in a relatively small and
                  concentrated market like Switzerland? Alternatively, would there
                  also be a large effect from DHCP given high broadband and wireless
                  penetration? I imagine these issues wouldn't be a problem for sites
                  with millions of visitors - but the one I'm working on only had a
                  few thousand over the course of 14 months!

                  In my case, I am working with historical data that has been migrated
                  to a new analytical tool (WebTrends to XiTi). That is why I am stuck
                  with web logs and don't have the option of cookies right now (or so
                  I'm told). I have been asked to learn the maximum about 'what
                  happened' on the website during its lifetime before it is
                  overhauled.

                  To do this I am asking for reports on visitors, top pages, task
                  completion rates for 6 activities, as well as sales information -
                  usually segmented by new/returning, bounce/did not, and organic vs.
                  search. (Basically following Kaushik's advice as closely as possible.)

                  Could someone please send me a link to the Stone Temple report?
                  Sounds interesting.

                  Jon
                • Nick Arnett
                  Message 8 of 10, Sep 28, 2007
                    On 9/27/07, Tim Wilson <twilson@...> wrote:
                    >
                    > As for cookie blocking, do most WA tools
                    > not acknowledge these as visitors at all? Or do they just count the
                    > session as a unique visitor? I thought it was the latter.
                    >

                    I would hope not. That would be a very bad mistake.

                    > I always thought that, while there are factors pushing to both over- and
                    > undercounting, the net is that both approaches tend to overcount rather
                    > than undercount. No?
                    >

                    I'm not convinced that anyone knows... ;-)

                    Both approaches count some twice and some not at all... how much of a
                    problem it is depends somewhat on the time frame. I don't think IP plus
                    user-agent will over-count as a practical point, except over a very long
                    time period. Even though your address might change periodically, so that
                    you are counted as more than one visitor, there's likely to be somebody else
                    who, as a result, has your old IP address and likely to be using the same
                    browser, since there aren't that many browsers in use and the majority uses
                    just one.

                    However, this also depends on how much of the user-agent string is being
                    parsed. If it's the whole thing, including all the crud that software like
                    Funweb creates (seems like a unique id for every pageview sometimes!), then
                    it really becomes a mess.
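A sketch of reducing the user-agent string to the three parts in the vendor's definition (OS type, browser type, browser version) before using it as a key, so that toolbar tokens do not split one visitor into many. The regexes cover only a few browsers of the period and are assumptions, as is the `FunWebProducts` token used in the example.

```python
import re

# Checked in order; Opera first because it can also advertise itself as MSIE.
BROWSERS = [
    ("Opera", re.compile(r"Opera[/ ](\d+)")),
    ("MSIE", re.compile(r"MSIE (\d+)")),
    ("Firefox", re.compile(r"Firefox/(\d+)")),
]
# (token to look for, label to report)
OS_TOKENS = [("Windows", "Windows"), ("Mac OS X", "Mac"), ("Linux", "Linux")]

def normalize_user_agent(ua):
    """Reduce a raw user-agent string to (os, browser, major_version)."""
    browser, version = "other", ""
    for name, pattern in BROWSERS:
        m = pattern.search(ua)
        if m:
            browser, version = name, m.group(1)
            break
    os_name = next((label for token, label in OS_TOKENS if token in ua), "other")
    return (os_name, browser, version)

plain = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"
with_toolbar = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; FunWebProducts)"
print(normalize_user_agent(plain) == normalize_user_agent(with_toolbar))  # True
```

The trade-off is the one discussed above: a coarser key merges more distinct people behind a shared address, while the raw string splits one person into several.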

                    Nick

                    --
                    Nick Arnett
                    narnett@...
                    Messages: 408-904-7198
                  • Nick Arnett
                    Message 9 of 10, Sep 28, 2007
                      On 9/28/07, austinmurphy <austinmurphy@...> wrote:
                      >
                      > Thanks to Steve, Tim, Nick, Jim, and Stephen,
                      >
                      > I'm just new to this field - so this forum is a lifesaver.
                      >
                      > The feedback is pretty clear: using the IP address + user agent is
                      > not the best solution, but one could still learn something by using
                      > it.
                      >

                      Nobody said it, so I will. You'll get useful *trends* from it. And that's
                      about the best anybody can hope for with regard to unique visitors.
                      Absolute numbers are much harder.
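A small arithmetic sketch of why trends can survive a biased count. It assumes the counting error is a constant proportional factor from one period to the next, which is a simplification; the 15% figure and the visitor numbers are made up.

```python
def pct_change(old, new):
    """Period-over-period change in percent."""
    return (new - old) / old * 100.0

true_counts = [1000, 1200]  # invented "real" visitors in two periods
# Hypothetical constant 15% over-count, applied with integer arithmetic.
measured = [c * 115 // 100 for c in true_counts]  # [1150, 1380]

print(pct_change(*true_counts))
print(pct_change(*measured))
```

Both calls report a rise of about 20%, even though every measured absolute number is wrong. If the bias drifts over time (for example, cookie deletion becoming more common), the trend is distorted as well.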

                      Nick


                      --
                      Nick Arnett
                      narnett@...
                      Messages: 408-904-7198
                    • Craig Sullivan
                      Message 10 of 10, Oct 1, 2007
                        Tim,



                        There is one other reason for having a different IP address. Many ISPs
                        load-balance customer traffic through proxy servers that look like they
                        are 'you'.

                        These servers often switch you around depending on their load, etc., so
                        you could have several IP addresses within an hour, never mind a week.


                        Regards,



                        Craig Sullivan

                        Digital Development Manager

                        LOVEFiLM International

                        +44 (0)20 8896 8050

                        +44 (0)7711 657315
