
Re: [webalizer] Problems with Truncating oversized referrer

  • waldo kitty
    Message 1 of 12 , Jul 29, 2005
      Colonel Angel wrote:
      > It is possible to group by domain name only without
      > the long reference line. It's in the webalizer.conf
      > file itself.
      >
      > I'm sure you don't need the full reference line, so
      > you should be able to group it by domain instead.

      i think that the actual intended result is to be able to tell which auction the hits are coming from so that they can
      manage the auctions better... i don't think it has anything to do with grouping hits by domain or such...

      > I could be wrong, but... I believe it's possible.
      >
      > Angel
      >
      >
      > --- "William L. Thomson Jr."
      > <yahoogroups@...> wrote:
      >
      >
      >>The problem is getting larger by the day, it seems. Today
      >>over 6000 truncated oversized referrers.
      >>
      >>On Thu, 2005-07-28 at 22:18 -0400, waldo kitty wrote:
      >>
      >>>William L. Thomson Jr. wrote:
      >>>
      >>>>Ok I have a growing problem. One of the sites I host has a lot of eBay
      >>>>auctions. I am starting to see urls like this in the logs
      >>>>
      >>>>http://cgi.ebay.com/ebaymotors/EP4606-19-16-aluminum-rims-4-Harley-Indian-wheels_W0QQitemZ4560520029QQcategoryZ35557QQrdZ1QQcmd
      >>>>
      >>>>I believe even longer ones as well. I am getting
      >>>
      >>>where are you seeing these?? in the referrer field??
      >>
      >>Yes and I believe it's entirely eBay related. They seem to have changed
      >>their format recently. No more ? and params following a file, dll, etc.
      >>Seems they are parsing it via some other means but it's creating huge
      >>referrers.
      >>
      >>>>Warning: Truncating oversized referrer field
      >>>
      >>>TTBOMK, there's really only so much of the referrer field that webalizer
      >>>uses... however, there have been posts with info on how to enlarge the
      >>>buffer(s) and recompile webalizer...
      >>
      >>I can look into doing that but I assume it will be affecting others as
      >>well. eBay + webalizer = problem. Just my assumptions so far. I will see
      >>if I can fit some time in tomorrow to run some tests.
      >>
      >>>>On avg around 3000 times when running Webalizer daily via cron.
      >>>>
      >>>>I am also seeing stuff like XXXX:+++++++ with a ton of ++++ but not as
      >>>>many and I am not sure if that is a problem or not.
      >>>
      >>>this is generally due to users using some sort of anonymizing service or
      >>>proxy to block the referring site they came from... on my site, i (try)
      >>>to specifically block these with a rewrite rule in my apache... the
      >>>rewrite rule basically redirects them to a page that tells them to stop
      >>>blocking their referrer so that they will be able to access the site as
      >>>intended... i've many pages that require they be accessed from internal
      >>>urls and they are disallowed if accessed from externals...
      >>
      >>Yes I will look into that, but seems I have much less of those than the
      >>eBay ones. It's only affecting one site at the moment, mainly because
      >>that site does a lot with eBay.
      >>
      >>It's data I do want in webalizer because it allows me to track that they
      >>did come from eBay and etc. In fact with their new format, I will know
      >>even more. Instead of just
      >>
      >>http://cgi.ebay.com/ebaymotors/ws/eBayISAPI.dll
      >>
      >>I am pretty sure it's eBay because the webalizer is starting to show
      >>more and more of the smaller ones. The larger longer ones not so much.
      >>
      >>In fact it might be part of eBay's switch to Sun servers and etc,
      >>obviously the above or old ones were Windows. Glad they made the switch,
      >>but it's wreaking havoc on webalizer :)
      >>
      >>
      >>--
      >>Sincerely,
      >>William L. Thomson Jr.
      >>Support Group
      >>Obsidian-Studios, Inc.
      >>http://www.obsidian-studios.com


      --
      _\/
      (@@) Waldo Kitty, Waldo's Place USA
      __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
      _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
      ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
      _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
    • waldo kitty
      Message 2 of 12 , Jul 29, 2005
        Enric Naval wrote:
        > Hello:
        >
        > you can download the source code, then change these
        > values in webalizer.c and then compile webalizer and
        > execute the compiled version:
        >
        >
        > #define MAXREF 1024 /* Max referrer field size */
        > #define MAXREFH 128 /* Max referrer field size in htab */
        >

        exactly what i was thinking of... thanks for digging it out and showing it...
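
For readers wondering what a compile-time limit like MAXREF implies, here is a minimal sketch in C. It is not webalizer's actual code: the helper name copy_referrer and the exact truncation behaviour are assumptions made for illustration. Only the MAXREF value and the warning text come from this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXREF 1024  /* value quoted in the thread; used here for illustration */

/* Hypothetical helper, NOT webalizer's real code: copy a referrer into a
 * fixed MAXREF-byte buffer, truncating and warning when it does not fit.
 * Returns 1 if the referrer was truncated, 0 otherwise. */
static int copy_referrer(char dst[MAXREF], const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t len = strlen(src);
    int truncated = len >= MAXREF;   /* leave room for the terminating NUL */

    if (truncated) {
        fprintf(stderr, "Warning: Truncating oversized referrer field\n");
        len = MAXREF - 1;
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
    return truncated;
}
```

Because the buffer size is baked in at compile time, the only way to accept longer referrers in a design like this is to change the #define and rebuild, which is what the quoted advice amounts to.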

        > You can set both MAXREF and MAXREFH to 2048 and
        > recompile webalizer. This way it will process 2048
        > characters in the referrer field. Webalizer will go a
        > little bit slower.
        >
        >
        > #define MAXREF 2048 /* Max referrer field size */
        > #define MAXREFH 2048 /* Max referrer field size in htab */
        >
        >
        >
        > With the new format for ebay, you can group like this:
        >
        > GroupReferrer itemZ4560520029 aluminum-rims-4-Harley
        > HideReferrer itemZ4560520029
        >
        > to group all items like this one:
        >
        > http://cgi.ebay.com/ebaymotors/EP4606-19-16-aluminum-rims-4-Harley-Indian-wheels_W0QQitemZ4560520029QQcategoryZ35557QQrdZ1QQcmd

        interesting... i hadn't actually looked that deeply into the posted referrer line(s)... i also wasn't really that aware
        that you could group on substrings within the referrer field...

        >
        >
        >
        > And of course, you can group all those stupid blocked
        > referrers, so they use only one line in the top
        > referrer list:
        >
        > GroupReferrer XXXX:+++++++++++++ Blocked Referrer
        > HideReferrer XXXX:+++++++++++++

        i agree on this... however, the ones that i see are variable numbers of +'s and i've yet to find a way for webalizer to
        take regex's for pattern matching in several areas that their use would be a big help...

        --
        _\/
        (@@) Waldo Kitty, Waldo's Place USA
        __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
        _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
        ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
        _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
      • Enric Naval
        Message 3 of 12 , Jul 29, 2005
          > > With the new format for ebay, you can group like this:
          > >
          > > GroupReferrer itemZ4560520029 aluminum-rims-4-Harley
          > > HideReferrer itemZ4560520029
          > >
          > > to group all items like this one:
          > >
          > > http://cgi.ebay.com/ebaymotors/EP4606-19-16-aluminum-rims-4-Harley-Indian-wheels_W0QQitemZ4560520029QQcategoryZ35557QQrdZ1QQcmd
          >
          > interesting... i hadn't actually looked that deeply
          > into the posted referrer line(s)... i also wasn't
          > really that aware that you could group on substrings
          > within the referrer field...
          >

          You can match anything before the query string.
          Webalizer stops at the first "?" character in the
          referrer. That's why with the old format he could only
          see the ebay domain and the auction page name. The
          rest of the information was after the "?" character,
          and webalizer stripped it away.
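
A small C sketch of the behaviour described above, for illustration only. The function name strip_query is made up, and this is an assumption about the effect (cutting the referrer at the first "?"), not a copy of webalizer's source.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: cut a referrer string at the first '?' in place,
 * so that everything from the query string onward is discarded. */
static void strip_query(char *ref)
{
    assert(ref != NULL);

    char *q = strchr(ref, '?');
    if (q != NULL)
        *q = '\0';
}
```

With an old-style referrer such as http://cgi.ebay.com/ebaymotors/ws/eBayISAPI.dll followed by "?" and some parameters, only the part up to eBayISAPI.dll would survive. The new-style URLs contain no "?", so the item number stays in the string and remains available for GroupReferrer to match.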


          > > And of course, you can group all those stupid blocked
          > > referrers, so they use only one line in the top
          > > referrer list:
          > >
          > > GroupReferrer XXXX:+++++++++++++ Blocked Referrer
          > > HideReferrer XXXX:+++++++++++++
          >
          > i agree on this... however, the ones that i see are
          > variable numbers of +'s and i've yet to find a way
          > for webalizer to take regex's for pattern matching
          > in several areas that their use would be a big help...
          >

          GroupReferrer matches substrings, so, in this example,
          all blocked referrers containing at least 13 +'s will
          be matched. This will match all those pesky referrers,
          because all of them have more than 13 +'s :)
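
To make the substring point concrete, here is a minimal C sketch. The function group_matches is a hypothetical stand-in written for this note, assuming plain substring matching as described above. It is not taken from webalizer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a substring-style group match: the pattern
 * matches if it occurs anywhere inside the referrer. */
static int group_matches(const char *referrer, const char *pattern)
{
    assert(referrer != NULL && pattern != NULL);
    return strstr(referrer, pattern) != NULL;
}
```

Under this assumption, a pattern of XXXX: followed by 13 +'s matches a referrer with 13, 20 or 50 +'s, since the shorter run is always contained in the longer one. It would not match a referrer with only 5 +'s, which is why the pattern has to be no longer than the shortest blocked referrer you expect to see.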

          I would also like to have regexp in webalizer :(




          Enric Naval
          Estudiante de Informática de Gestión en la Udl (Lleida)
          GRIHO webalizer.conf
          http://griho.udl.es/webalizer/webalizer.conf.txt



        • William L. Thomson Jr.
          Message 4 of 12 , Jul 29, 2005
            Well I run Gentoo so I am always installing Webalizer via sources and
            compiling. However making that change is going to be a total pain. I
            wonder if in the future, either via the .conf files or the command line,
            one could adjust that without having to deal with changes in the binary
            itself.

            Much easier said than done, because I am sure it has everything to do
            with memory allocation and etc. However I do think this will be a
            growing problem for others as well.
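
As a rough idea of what a run-time limit could look like, here is a C sketch. Everything in it is hypothetical: the struct ref_buf and its functions are invented for illustration, and nothing here reflects how webalizer is actually structured. It only shows the general technique of replacing a fixed-size array with a heap buffer whose capacity comes from a setting.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: a referrer buffer whose capacity is chosen at run time
 * (for example from a config value) instead of a compile-time #define. */
struct ref_buf {
    char  *data;
    size_t cap;   /* bytes allocated, including the terminating NUL */
};

/* Allocate a buffer of max_ref bytes; returns 0 on success, -1 on failure. */
static int ref_buf_init(struct ref_buf *b, size_t max_ref)
{
    assert(b != NULL);
    if (max_ref == 0)
        return -1;
    b->data = malloc(max_ref);
    if (b->data == NULL)
        return -1;
    b->data[0] = '\0';
    b->cap = max_ref;
    return 0;
}

/* Copy src into the buffer, truncating if needed; returns 1 if truncated. */
static int ref_buf_set(struct ref_buf *b, const char *src)
{
    assert(b != NULL && b->data != NULL && src != NULL);

    size_t len = strlen(src);
    int truncated = len >= b->cap;
    if (truncated)
        len = b->cap - 1;
    memcpy(b->data, src, len);
    b->data[len] = '\0';
    return truncated;
}

/* Release the buffer and reset the struct. */
static void ref_buf_free(struct ref_buf *b)
{
    assert(b != NULL);
    free(b->data);
    b->data = NULL;
    b->cap = 0;
}
```

The catch is that every place that currently assumes a fixed array size would have to be changed to use the run-time capacity, which is likely why this is easier said than done.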

            I can hack, fix my install for now. But I think others will be affected
            as well. They just might not know it yet. I run webalizer via cron, and
            the output is emailed to me nightly. I am not sure others monitor
            webalizer like that if at all?


            --
            Sincerely,
            William L. Thomson Jr.
            Support Group
            Obsidian-Studios, Inc.
            http://www.obsidian-studios.com
          • Colonel Angel
            Message 5 of 12 , Jul 29, 2005
              I'm running Mandrake 9.2 with the latest Apache. I
              have 'Webmin' installed and able to access the server
              via local area network.

              I'm not sure if it's able to complete the full
              functionality of what you are wanting or needing to
              do, but... It does have a file manager included that
              you can directly edit files on the server.

              While I don't know of anything else such as you are
              suggesting, at least you could look at it and see if
              it fits your needs. It's a great way to control the
              server remotely without TightVNC or another type of
              oIP program.

              You can Google search Webmin, it's the first site that
              appears.

              As far as the binary compiling of Webalizer, you can
              directly edit the 'webalizer.conf', and keep a backup
              copy of it so all you have to do is make the necessary
              changes, then upload it to the directory you have
              listed for the conf file.

              I have backup copies of all necessary conf files on my
              regular computer that I use in case I happen to do
              something and screwup the regular conf file.

              You may even be able to adjust the Webalizer Module on
              Webmin to assist you in making changes to the
              'webalizer.conf'. It may take some manual configuring
              of the CGI to handle the updated version of the
              Webalizer Module.

              Anyway... Just some thoughts...

              SwtDivaLove



              --- "William L. Thomson Jr."
              <yahoogroups@...> wrote:

              > Well I run Gentoo so I am always installing
              > Webalizer via sources and
              > compiling. However making that change is going to be
              > a total pain. I
              > wonder if in the future either via the .conf files
              > or command line etc.
              > One can adjust that without having to deal with
              > changes in the binary
              > itself.
              >
              > Much easier said then done, because I am sure it has
              > everything to do
              > with memory allocation and etc. However I do think
              > this will be a
              > growing problem for others as well.
              >
              > I can hack, fix my install for now. But I think
              > others will be effected
              > as well. They just might not know it yet. I run
              > webalizer via cron, and
              > the output is emailed to me nightly. I am not sure
              > others monitor
              > webalizer like that if at all?
              >
              >
              > --
              > Sincerely,
              > William L. Thomson Jr.
              > Support Group
              > Obsidian-Studios, Inc.
              > http://www.obsidian-studios.com
              >
              >


            • William L. Thomson Jr.
              Message 6 of 12 , Jul 29, 2005
                On Fri, 2005-07-29 at 13:32 -0700, Colonel Angel wrote:
                >
                > I'm not sure if it's able to complete the full
                > functionality of what you are wanting or needing to
                > do, but... It does have a file manager included that
                > you can directly edit files on the server.

                I ssh into the box? I have total access to everything. :)

                > While I don't know of anything else such as you are
                > suggesting, at least you could look at it and see if
                > it fits your needs. It's a great way to control the
                > server remotely without TightVNC or another type of
                > oIP program.

                I admin my servers directly, so no need for Webmin. I run dedicated servers,
                so Webmin is not really practical as it requires apache on all machines.
                None of my servers run X, so there is no GUI to VNC with.

                ssh is perfectly fine no problems there.

                > As far as the binary compiling of Webalizer, you can
                > directly edit the 'webalizer.conf', and keep a backup
                > copy of it so all you have to do is make the necessary
                > changes, then upload it to the directory you have
                > listed for the conf file.

                The problem is I have to increase the buffer size. Gentoo fetches and
                builds everything from source. So if I needed to upgrade webalizer I just

                emerge -u webalizer

                or just

                emerge webalizer

                Now I will have to interrupt that process or tweak and package the sources
                to allow portage to do its job with compiling and installing. Not
                impossible but a bit difficult.

                --
                Sincerely,
                William L. Thomson Jr.
                Support Group
                Obsidian-Studios, Inc.
                http://www.obsidian-studios.com