Re: [webalizer] Problems with Truncating oversized referrer

  • Enric Naval
    Message 1 of 12 , Jul 29, 2005
      Hello:

      You can download the source code, change these
      values in webalizer.c, then compile Webalizer and
      run the compiled version:


      #define MAXREF 1024 /* Max referrer field size */
      #define MAXREFH 128 /* Max referrer field size in htab */

      You can set both MAXREF and MAXREFH to 2048 and
      recompile Webalizer. This way it will process 2048
      characters of the referrer field. Webalizer will run a
      little slower.


      #define MAXREF 2048 /* Max referrer field size */
      #define MAXREFH 2048 /* Max referrer field size in htab */
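
For illustration only, here is a minimal sketch of what a fixed-size limit like MAXREF implies. It is not Webalizer's actual code: the helper name copy_referrer and its return convention are assumptions made for this example. It only shows that a referrer longer than the buffer gets cut off, which is the situation the "Truncating oversized referrer field" warning reports.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAXREF 2048  /* value suggested above; the stock source uses 1024 */

/* Hypothetical helper, not taken from webalizer.c: copy a referrer into a
 * fixed-size buffer of `cap` bytes. Returns 1 if the value had to be
 * truncated to fit, 0 otherwise. */
static int copy_referrer(char *dst, size_t cap, const char *src)
{
    size_t len = strlen(src);
    int truncated = 0;

    assert(cap > 0);
    if (len >= cap) {      /* no room for the value plus the NUL byte */
        len = cap - 1;
        truncated = 1;
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
    return truncated;
}
```

With an 8-byte buffer, "http://cgi.ebay.com/" is cut to "http://" and the function returns 1; with a MAXREF-sized buffer the same string fits and the function returns 0.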



      With the new format for ebay, you can group like this:

      GroupReferrer itemZ4560520029 aluminum-rims-4-Harley
      HideReferrer itemZ4560520029

      to group all items like this one:

      http://cgi.ebay.com/ebaymotors/EP4606-19-16-aluminum-rims-4-Harley-Indian-wheels_W0QQitemZ4560520029QQcategoryZ35557QQrdZ1QQcmd



      And of course, you can group all those stupid blocked
      referrers, so they use only one line in the top
      referrer list:

      GroupReferrer XXXX:+++++++++++++ Blocked Referrer
      HideReferrer XXXX:+++++++++++++



      Enric Naval
      Estudiante de Informática de Gestión en la Udl (Lleida)
      GRIHO webalizer.conf
      http://griho.udl.es/webalizer/webalizer.conf.txt



    • waldo kitty
      Message 2 of 12 , Jul 29, 2005
        Colonel Angel wrote:
        > It is possible to group by domain name only without
        > the long reference line. It's in the webalizer.conf
        > file itself.
        >
        > I'm sure you don't need the full reference line, so
        > you should be able to group it by domain instead.

        i think that the actual intended result is to be able to tell which auction the hits are coming from so that they can
        manage the auctions better... i don't think it has anything to do with grouping hits by domain or such...

        > I could be wrong, but... I believe it's possible.
        >
        > Angel
        >
        >
        > --- "William L. Thomson Jr."
        > <yahoogroups@...> wrote:
        >
        >
        >>The problem is getting larger by the day, it seems. Today
        >>over 6000 truncated oversized referrers.
        >>
        >>On Thu, 2005-07-28 at 22:18 -0400, waldo kitty wrote:
        >>>William L. Thomson Jr. wrote:
        >>>>Ok I have a growing problem. One of the sites I host has a lot of eBay
        >>>>auctions. I am starting to see urls like this in the logs
        >>>>
        >>>>http://cgi.ebay.com/ebaymotors/EP4606-19-16-aluminum-rims-4-Harley-Indian-wheels_W0QQitemZ4560520029QQcategoryZ35557QQrdZ1QQcmd
        >>>>
        >>>>I believe even longer ones as well. I am getting
        >>>
        >>>where are you seeing these?? in the referrer field??
        >>
        >>Yes and I believe it's entirely eBay related. They seem to have changed
        >>their format recently. No more ? and params following a file, dll, etc.
        >>Seems they are parsing it via some other means but it's creating huge
        >>referrers.
        >>
        >>>>Warning: Truncating oversized referrer field
        >>>
        >>>TTBOMK, there's really only so much of the referrer field that webalizer uses... however, there have been posts with
        >>>info on how to enlarge the buffer(s) and recompile webalizer...
        >>
        >>I can look into doing that but I assume it will be affecting others as
        >>well. eBay + webalizer = problem. Just my assumptions so far. I will see
        >>if I can fit some time in tomorrow to run some tests.
        >>
        >>>>On avg around 3000 times when running Webalizer daily via cron.
        >>>>
        >>>>I am also seeing stuff like XXXX:+++++++ with a ton of ++++ but not as
        >>>>many and I am not sure if that is a problem or not.
        >>>
        >>>this is generally due to users using some sort of anonymizing service or proxy to block the referring site they came
        >>>from... on my site, i (try) to specifically block these with a rewrite rule in my apache... the rewrite rule basically
        >>>redirects them to a page that tells them to stop blocking their referrer so that they will be able to access the site as
        >>>intended... i've many pages that require they be accessed from internal urls and they are disallowed if accessed from
        >>>externals...
        >>
        >>Yes I will look into that, but it seems I have far fewer of those than the
        >>eBay ones. It's only affecting one site at the moment, mainly because
        >>that site does a lot with eBay.
        >>
        >>It's data I do want in webalizer because it allows me to track that they
        >>did come from eBay, etc. In fact with their new format, I will know
        >>even more. Instead of just
        >>
        >>http://cgi.ebay.com/ebaymotors/ws/eBayISAPI.dll
        >>
        >>I am pretty sure it's eBay because webalizer is starting to show
        >>more and more of the smaller ones. The larger, longer ones not so much.
        >>
        >>In fact it might be part of eBay's switch to Sun servers, etc.;
        >>obviously the above or old ones were Windows. Glad they made the switch,
        >>but it's wreaking havoc on webalizer :)
        >>
        >>
        >>--
        >>Sincerely,
        >>William L. Thomson Jr.
        >>Support Group
        >>Obsidian-Studios, Inc.
        >>http://www.obsidian-studios.com


        --
        _\/
        (@@) Waldo Kitty, Waldo's Place USA
        __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
        _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
        ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
        _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
      • waldo kitty
        Message 3 of 12 , Jul 29, 2005
          Enric Naval wrote:
          > Hello:
          >
          > You can download the source code, change these
          > values in webalizer.c, then compile Webalizer and
          > run the compiled version:
          >
          >
          > #define MAXREF 1024 /* Max referrer field size */
          > #define MAXREFH 128 /* Max referrer field size in htab */
          >

          exactly what i was thinking of... thanks for digging it out and showing it...

          > You can set both MAXREF and MAXREFH to 2048 and
          > recompile webalizer. This way it will process 2048
          > characters in the referrer field. Webalizer will go a
          > little bit slower.
          >
          >
          > #define MAXREF 2048 /* Max referrer field size */
          > #define MAXREFH 2048 /* Max referrer field size in htab */
          >
          >
          >
          > With the new format for ebay, you can group like this:
          >
          > GroupReferrer itemZ4560520029 aluminum-rims-4-Harley
          > HideReferrer itemZ4560520029
          >
          > to group all items like this one:
          >
          > http://cgi.ebay.com/ebaymotors/EP4606-19-16-aluminum-rims-4-Harley-Indian-wheels_W0QQitemZ4560520029QQcategoryZ35557QQrdZ1QQcmd

          interesting... i hadn't actually looked that deeply into the posted referrer line(s)... i also wasn't really that aware
          that you could group on substrings within the referrer field...

          >
          >
          >
          > And of course, you can group all those stupid blocked
          > referrers, so they use only one line in the top
          > referrer list:
          >
          > GroupReferrer XXXX:+++++++++++++ Blocked Referrer
          > HideReferrer XXXX:+++++++++++++

          i agree on this... however, the ones that i see are variable numbers of +'s and i've yet to find a way for webalizer to
          take regex's for pattern matching in several areas that their use would be a big help...

          --
          _\/
          (@@) Waldo Kitty, Waldo's Place USA
          __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
          _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
          ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
          _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
        • Enric Naval
          Message 4 of 12 , Jul 29, 2005
            > > With the new format for ebay, you can group like this:
            > >
            > > GroupReferrer itemZ4560520029 aluminum-rims-4-Harley
            > > HideReferrer itemZ4560520029
            > >
            > > to group all items like this one:
            > >
            > > http://cgi.ebay.com/ebaymotors/EP4606-19-16-aluminum-rims-4-Harley-Indian-wheels_W0QQitemZ4560520029QQcategoryZ35557QQrdZ1QQcmd
            >
            > interesting... i hadn't actually looked that deeply into the posted referrer line(s)... i also wasn't really that aware
            > that you could group on substrings within the referrer field...
            >

            You can match anything before the query string.
            Webalizer stops at the first "?" character in the
            referrer. That's why with the old format he could only
            see the ebay domain and the auction page name. The
            rest of the information was after the "?" character,
            and webalizer stripped it away.
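
The behaviour described above can be sketched in a few lines of C. This is an illustration, not code taken from Webalizer; strip_query is a hypothetical name chosen for the example. It cuts a referrer at the first "?" so that only the part before the query string is left to be counted or grouped.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not taken from webalizer.c: cut a referrer at the
 * first '?' in place and return the same pointer. A referrer without a
 * '?' is left untouched. */
static char *strip_query(char *ref)
{
    char *q;

    assert(ref != NULL);
    q = strchr(ref, '?');
    if (q != NULL)
        *q = '\0';         /* drop the query string */
    return ref;
}
```

The new-style eBay referrer quoted in this thread contains no "?", so a function like this would leave it whole. That fits what was reported: the full string is kept, and it is long enough to run into the MAXREF limit.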


            > >
            > >
            > >
            > > And of course, you can group all those stupid blocked
            > > referrers, so they use only one line in the top
            > > referrer list:
            > >
            > > GroupReferrer XXXX:+++++++++++++ Blocked Referrer
            > > HideReferrer XXXX:+++++++++++++
            >
            > i agree on this... however, the ones that i see are variable numbers of +'s and i've yet to find a way for webalizer to
            > take regex's for pattern matching in several areas that their use would be a big help...
            >

            GroupReferrer matches substrings, so, in this example,
            all blocked referrers containing at least 13 +'s will
            be matched. This will match all those pesky referrers,
            because all of them have more than 13 +'s :)
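
A rough sketch of the substring idea, assuming a plain "occurs anywhere in the string" test. The function referrer_matches is a hypothetical stand-in for illustration, not Webalizer's real matching code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a GroupReferrer check: returns 1 when
 * `pattern` occurs anywhere inside `referrer`, 0 otherwise. */
static int referrer_matches(const char *referrer, const char *pattern)
{
    assert(referrer != NULL && pattern != NULL);
    return strstr(referrer, pattern) != NULL;
}
```

Under that assumption, a pattern of "XXXX:" followed by 13 +'s is found inside any blocked referrer with 13 or more +'s after the colon, however many there are, while one with only a handful of +'s is not matched. The same test explains why itemZ4560520029 picks out the eBay auction URL quoted earlier.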

            I would also like to have regexp in webalizer :(




            Enric Naval
            Estudiante de Informática de Gestión en la Udl (Lleida)
            GRIHO webalizer.conf
            http://griho.udl.es/webalizer/webalizer.conf.txt



          • William L. Thomson Jr.
            Message 5 of 12 , Jul 29, 2005
              Well, I run Gentoo, so I am always installing Webalizer from source and
              compiling it. However, making that change is going to be a total pain. I
              wonder if, in the future, one could adjust that via the .conf file or the
              command line, without having to deal with changes in the binary
              itself.

              Much easier said than done, because I am sure it has everything to do
              with memory allocation, etc. However, I do think this will be a
              growing problem for others as well.

              I can hack and fix my install for now. But I think others will be affected
              as well. They just might not know it yet. I run webalizer via cron, and
              the output is emailed to me nightly. I am not sure others monitor
              webalizer like that, if at all.


              --
              Sincerely,
              William L. Thomson Jr.
              Support Group
              Obsidian-Studios, Inc.
              http://www.obsidian-studios.com
            • Colonel Angel
              Message 6 of 12 , Jul 29, 2005
                I'm running Mandrake 9.2 with the latest Apache. I
                have 'Webmin' installed and am able to access the server
                via the local area network.

                I'm not sure if it's able to complete the full
                functionality of what you are wanting or needing to
                do, but... It does have a file manager included that
                you can directly edit files on the server.

                While I don't know of anything else such as you are
                suggesting, at least you could look at it and see if
                it fits your needs. It's a great way to control the
                server remotely without TightVNC or another type of
                oIP program.

                You can Google search Webmin, it's the first site that
                appears.

                As far as the binary compiling of Webalizer, you can
                directly edit the 'webalizer.conf', and keep a backup
                copy of it so all you have to do is make the necessary
                changes, then upload it to the directory you have
                listed for the conf file.

                I have backup copies of all necessary conf files on my
                regular computer, in case I happen to do
                something that screws up the regular conf file.

                You may even be able to adjust the Webalizer Module on
                Webmin to assist you in making changes to the
                'webalizer.conf'. It may take some manual configuring
                of the CGI to handle the updated version of the
                Webalizer Module.

                Anyway... Just some thoughts...

                SwtDivaLove



                --- "William L. Thomson Jr."
                <yahoogroups@...> wrote:

                > Well, I run Gentoo, so I am always installing Webalizer from source and
                > compiling it. However, making that change is going to be a total pain. I
                > wonder if, in the future, one could adjust that via the .conf file or the
                > command line, without having to deal with changes in the binary
                > itself.
                >
                > Much easier said than done, because I am sure it has everything to do
                > with memory allocation, etc. However, I do think this will be a
                > growing problem for others as well.
                >
                > I can hack and fix my install for now. But I think others will be affected
                > as well. They just might not know it yet. I run webalizer via cron, and
                > the output is emailed to me nightly. I am not sure others monitor
                > webalizer like that, if at all.
                >
                >
                > --
                > Sincerely,
                > William L. Thomson Jr.
                > Support Group
                > Obsidian-Studios, Inc.
                > http://www.obsidian-studios.com
                >
                >


              • William L. Thomson Jr.
                Message 7 of 12 , Jul 29, 2005
                  On Fri, 2005-07-29 at 13:32 -0700, Colonel Angel wrote:
                  >
                  > I'm not sure if it's able to complete the full
                  > functionality of what you are wanting or needing to
                  > do, but... It does have a file manager included that
                  > you can directly edit files on the server.

                  I ssh into the box. I have total access to everything. :)

                  > While I don't know of anything else such as you are
                  > suggesting, at least you could look at it and see if
                  > it fits your needs. It's a great way to control the
                  > server remotely without TightVNC or another type of
                  > oIP program.

                  I admin my servers directly, so there is no need for Webmin. I run dedicated
                  servers, so Webmin is not really practical as it requires apache on all machines.
                  None of my servers run X, so there is no GUI to VNC into.

                  ssh is perfectly fine, no problems there.

                  > As far as the binary compiling of Webalizer, you can
                  > directly edit the 'webalizer.conf', and keep a backup
                  > copy of it so all you have to do is make the necessary
                  > changes, then upload it to the directory you have
                  > listed for the conf file.

                  The problem is I have to increase the buffer size. Gentoo fetches and
                  builds everything from source. So if I need to upgrade webalizer I just run

                  emerge -u webalizer

                  or just

                  emerge webalizer

                  Now I will have to interrupt that process, or tweak and repackage the sources,
                  so that portage can still do its job of compiling and installing. Not
                  impossible, but a bit difficult.

                  --
                  Sincerely,
                  William L. Thomson Jr.
                  Support Group
                  Obsidian-Studios, Inc.
                  http://www.obsidian-studios.com