RE: [webalizer] Error message from cron?

  • Edward Chase
    Message 1 of 9 , Oct 14 9:31 AM
      Lisa,

      I don't know specifically why, but I see it all the time... I suspect it's
      not considered abnormal.

      If I were to take a stab, I would say that your site may have some long URLs
      (ours does) and webalizer may have limits on the size of data that can be
      held in some of its variables.




      > -----Original Message-----
      > From: webalizer@yahoogroups.com
      > [mailto:webalizer@yahoogroups.com] On Behalf Of Lisa Casey
      > Sent: Friday, October 14, 2005 9:38 AM
      > To: webalizer@yahoogroups.com
      > Subject: [webalizer] Error message from cron?
      >
      >
      > Hi,
      >
      > I've been running webalizer on about 6 virtual domains for a couple
      > of weeks now. I run it from a cron job and it seems to be doing OK
      > (I get graphs that look accurate). My cron job keeps a log file
      > (/var/log/httpd/webalizer.log) in which it appears that Webalizer is
      > running without errors BUT I get e-mail from cron that says:
      >
      > Error: Skipping oversized log record
      > Error: Skipping oversized log record
      > Error: Skipping oversized log record
      > Error: Skipping oversized log record
      > Error: Skipping oversized log record
      > Error: Skipping oversized log record
      > (etc)
      >
      > Does anyone know why?
      >
      > Thanks,
      >
      > Lisa
    • Bradford L. Barrett
      Message 2 of 9 , Oct 14 12:20 PM
        > Error: Skipping oversized log record
        > Error: Skipping oversized log record
        > (etc)
        >
        > Does anyone know why?

        Oversized request fields, typically generated by malicious code in an
        attempt to exploit one of the zillions of holes on windoze servers
        (IIS and friends).

        --
        Bradford L. Barrett brad@...
        A free electron in a sea of neutrons DoD#1750 KD4NAW

        The only thing Micro$oft has done for society, is make people
        believe that computers are inherently unreliable.
      • Southerland, Adam
        Message 3 of 9 , Oct 14 12:24 PM
          I get this a lot too... sometimes; the data is just larger than
          Webalizer can handle... Especially with User Clients and Referral URLs
          for us...

          Some are trying to hack us as well, but most are legitimate requests.
          (for us; may not be for you)

          Adam Southerland
          Webmaster

        • William L. Thomson Jr.
          Message 4 of 9 , Oct 14 12:32 PM
            On Fri, 2005-10-14 at 14:24 -0500, Southerland, Adam wrote:
            > I get this a lot too... sometimes; the data is just larger than
            > Webalizer can handle... Especially with User Clients and Referral URLs
            > for us...
            >
            > Some are trying to hack us as well, but most are legitimate requests.
            > (for us; may not be for you)

            This is something I brought up some time back. It seems no one is paying
            attention to a new trend I first saw on eBay. eBay URLs no longer
            contain ? and variables. The URL is encoded in a way that the variables
            are part of the actual URL.

            Granted, in the past it was mostly malicious activity and URLs. However,
            these are valid URLs. The fix is to increase the buffer sizes in
            webalizer.c, but it has to be done on an individual basis.

            I really think the webalizer dev team should increase the size of these
            buffers. It's a trivial change, and the problem will slowly start to
            affect more and more people.

            Ideally some of the buffer sizes should be a configurable option in the
            config files. However, a hard-coded change to the buffer sizes would work
            at this point. In the meantime I have had to make the changes myself
            locally, which is a total pain given that Gentoo fetches and compiles
            sources. I almost have to repackage webalizer just for the change.

            Total pain, glad it's starting to affect others. Maybe when it affects
            enough people someone on the webalizer dev team will do something about
            it. Till then?

            --
            Sincerely,
            William L. Thomson Jr.
            Support Group
            Obsidian-Studios, Inc.
            http://www.obsidian-studios.com
          • Bradford L. Barrett
            Message 5 of 9 , Oct 14 12:38 PM
              The internal buffer size for a single log entry is 4K (4096 bytes). For a
              typical text screen with a width of 80 characters, this would represent a
              log entry that spans over 50 lines. I have never seen a legitimate log
              entry, even from some of the more verbose sites, that was anywhere close
              to that size. Note that the 'Skipping oversized log record' error only
              applies to the entire log record, not the user agent or referrer portions
              which can generate their own "truncating oversized..." warnings and are
              completely different and unrelated.
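
To make the "skipped record" case concrete, here is a minimal sketch of how a fixed BUFSIZE line buffer can lead to whole records being dropped. It is an illustration only, not Webalizer's actual parser: the function read_record and its return codes are hypothetical, and only the BUFSIZE value and the error text are taken from this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BUFSIZE 4096   /* value quoted from webalizer.h in this thread */

/*
 * Hypothetical helper (not Webalizer's real code): read one log record
 * from `in` into `buf`, which has room for `cap` bytes.
 * Returns  1 if a complete record was read,
 *          0 if the record did not fit and was skipped,
 *         -1 at end of input.
 */
int read_record(FILE *in, char *buf, size_t cap)
{
    assert(in != NULL && buf != NULL && cap >= 2);

    if (fgets(buf, (int)cap, in) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        return 1;                 /* the whole line fit, newline included */
    if (len < cap - 1)
        return 1;                 /* final line of the input, no newline */

    /* The buffer filled up before a newline: discard the rest of the line. */
    int c;
    while ((c = fgetc(in)) != EOF && c != '\n')
        ;
    fprintf(stderr, "Error: Skipping oversized log record\n");
    return 0;
}
```

A caller would loop on read_record() and simply move on to the next line whenever it returns 0, which is why the graphs can still look accurate while cron mails the error lines.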

              --
              Bradford L. Barrett brad@...
              A free electron in a sea of neutrons DoD#1750 KD4NAW

              How do you give Microsoft the benefit of the doubt when you
              know that if you were to throw it in a room with truth, you'd
              risk a matter/anti-matter explosion? -- Nicholas Petreley IDG
            • William L. Thomson Jr.
              Message 6 of 9 , Oct 14 12:47 PM
                On Fri, 2005-10-14 at 15:38 -0400, Bradford L. Barrett wrote:
                > The internal buffer size for a single log entry is 4K (4096 bytes). For a
                > typical text screen with a width of 80 characters, this would represent a
                > log entry that spans over 50 lines. I have never seen a legitimate log
                > entry, even from some of the more verbose sites, that was anywhere close
                > to that size. Note that the 'Skipping oversized log record' error only
                > applies to the entire log record, not the user agent or referrer portions
                > which can generate their own "truncating oversized..." warnings and are
                > completely different and unrelated.

                I think that size is OK; it's the sizes of the other buffers that should
                be increased. I mis-posted earlier when I said they were in webalizer.c.
                They are in the header file, webalizer.h:

                #define MAXHASH 2048 /* Size of our hash tables */
                #define BUFSIZE 4096 /* Max buffer size for log record */
                #define MAXHOST 128 /* Max hostname buffer size */
                #define MAXURL 1024 /* Max HTTP request/URL field size */
                #define MAXURLH 128 /* Max URL field size in htab */
                #define MAXREF 1024 /* Max referrer field size */
                #define MAXREFH 128 /* Max referrer field size in htab */
                #define MAXAGENT 64 /* Max user agent field size */
                #define MAXCTRY 48 /* Max country name size */
                #define MAXSRCH 256 /* Max size of search string buffer */
                #define MAXSRCHH 64 /* Max size of search str in htab */
                #define MAXIDENT 64 /* Max size of ident string (user) */

                The referrer size has been a problem for me. It's not the buffer size
                for the log record that's the problem. It's the other buffers, which
                I assume, combined into one, can't total more than the 4096. However,
                for URLs like eBay's new format, there is no search string, so those
                buffers are not used. It's almost as if, when there is no search
                string, the URL buffer should be longer?
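
The per-field limits act independently of BUFSIZE: a record can fit in 4096 bytes and still lose part of one field. The following is a rough sketch of that behaviour under stated assumptions. The helper copy_field and the exact warning wording are invented for illustration and are not taken from the Webalizer source; only the MAXREF value comes from the listing above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXREF 1024   /* value quoted from webalizer.h in this thread */

/*
 * Hypothetical helper: copy `src` into `dst`, which has room for `cap`
 * bytes including the terminating NUL. `what` names the field for the
 * warning. Returns 1 if the value had to be cut, 0 otherwise.
 */
int copy_field(char *dst, size_t cap, const char *src, const char *what)
{
    assert(dst != NULL && src != NULL && what != NULL && cap > 0);

    size_t len = strlen(src);
    int truncated = (len >= cap);
    if (truncated) {
        len = cap - 1;   /* keep only what fits, leaving room for the NUL */
        /* Illustrative wording, not the real message text. */
        fprintf(stderr, "Warning: truncating oversized %s field\n", what);
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
    return truncated;
}
```

With a 3000-character referrer and a MAXREF-sized destination, only the first 1023 characters survive, even though the whole log line is well under BUFSIZE.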


                --
                Sincerely,
                William L. Thomson Jr.
                Support Group
                Obsidian-Studios, Inc.
                http://www.obsidian-studios.com
              • Enric Naval
                Message 7 of 9 , Oct 15 1:56 AM
                  I have the exact same problem.


                  Actually, I would prefer being able to set the buffer
                  sizes from webalizer.conf.

                  This way, I could use a minimal webalizer.conf with
                  small buffers to process logs very quickly, or use my
                  monstrous webalizer.conf with very big buffer sizes
                  to catch as much information as possible.

                  People could use my webalizer.conf, which would
                  already contain the most adequate buffer sizes for
                  my options.




                  Meanwhile, I partially solved the problem by changing
                  these lines:

                  #define MAXAGENT 128

                  This is because some agents have very long names, and
                  their identifier is past the 64-character limit of the
                  default MAXAGENT. It is the only way to group Safari
                  with GroupAgent.

                  #define MAXREFH 512
                  #define MAXSRCH 388
                  #define MAXSRCHH 388


                  I increased the space for referrers and search strings
                  because some search engines have very long query
                  strings, and some of them also put their keywords at
                  the end of the query, so you have to parse the whole
                  query, not just the start of it.
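
One way to avoid editing the header for every rebuild would be to make each limit overridable from the compiler command line. The sketch below is hypothetical: the #ifndef guards are not in the webalizer.h quoted in this thread, print_limits is an invented helper, and the defaults are the stock values listed earlier.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical variant of the limits in webalizer.h. The #ifndef guards
 * are the assumed part; they let a build override a limit without
 * touching the header, for example:
 *     cc -DMAXAGENT=128 -DMAXREFH=512 -DMAXSRCH=388 -DMAXSRCHH=388 ...
 */
#ifndef MAXAGENT
#define MAXAGENT 64    /* Max user agent field size */
#endif
#ifndef MAXREFH
#define MAXREFH 128    /* Max referrer field size in htab */
#endif
#ifndef MAXSRCH
#define MAXSRCH 256    /* Max size of search string buffer */
#endif
#ifndef MAXSRCHH
#define MAXSRCHH 64    /* Max size of search str in htab */
#endif

/* Print the limits this binary was compiled with; returns 0 on success. */
int print_limits(FILE *out)
{
    assert(out != NULL);
    int n = fprintf(out, "MAXAGENT=%d MAXREFH=%d MAXSRCH=%d MAXSRCHH=%d\n",
                    MAXAGENT, MAXREFH, MAXSRCH, MAXSRCHH);
    return n < 0 ? -1 : 0;
}
```

This still requires a recompile, so it is only a halfway step towards setting the sizes from webalizer.conf.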



                  LONG TECHNICAL EXPLANATION

                  Also, you see, MAXREF is already big enough. Even if
                  you increase it, webalizer will still only look for
                  search phrases within the size indicated by
                  MAXREFH.

                  Even if a referrer is accepted by webalizer, it will
                  only parse it as far as MAXREFH indicates. The rest of
                  the referrer gets dropped silently, so part of the
                  query string may be lost anyway and you will see no
                  warning in the output of your cron job.

                  But that's not the end of it. Then it will only parse
                  query strings up to the size of MAXSRCH (I'm not sure
                  whether it gives a warning when the query string is
                  too long), and then it will only look for search
                  phrases within the size indicated by MAXSRCHH.

                  Hmm, I may have made some mistake there, but that's
                  the big picture, more or less. Increasing only the
                  referrer size achieves nothing unless you also
                  increase the values for the referrer hash, the query
                  string size and the query string hash.

                  That's programming for you.



                  MORE LOOOOONG DETAILS

                  Webalizer uses hash tables internally, defined in
                  "hashtab.h" and "hashtab.c". The logfile parser
                  uses "snode" structs to save the query
                  strings inside a hash table. Inside the "snode" struct
                  there is a string. This string has at most size
                  MAXSRCHH.

                  Now, you see, LONGER STRINGS GET SILENTLY SHORTENED TO
                  MAXSRCHH. That's where information gets lost. The
                  search phrase parser uses the strings inside the
                  "snode" structs, so anything that gets removed from
                  that string is never parsed, and no warning is
                  generated.

                  In hashtab.c, you can search for this line:

                  str[MAXSRCHH-1]=0;

                  This line, in the function new_snode(), puts a null
                  at position MAXSRCHH-1 so that the program will stop
                  reading the string there.

                  All the functions in hashtab.c named like new_snode,
                  new_inode, etc. have the same behaviour.
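
The effect of that line can be reproduced in a few lines of C. This is a simplified stand-in written for illustration, not the code from hashtab.c: the struct layout and the name new_snode_sketch are assumptions, and only the truncation statement and the MAXSRCHH value come from this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAXSRCHH 64   /* stock value quoted from webalizer.h in this thread */

/* Assumed, simplified layout; the real "snode" may hold other members. */
struct snode {
    char *string;          /* stored (possibly shortened) search string */
    struct snode *next;    /* next node in the same hash bucket */
};

/*
 * Hypothetical re-creation of the truncation described above: the input
 * is cut in place at MAXSRCHH-1 characters, then copied into a new node.
 * Returns NULL if memory cannot be allocated.
 */
struct snode *new_snode_sketch(char *str)
{
    assert(str != NULL);

    if (strlen(str) >= MAXSRCHH)
        str[MAXSRCHH - 1] = 0;   /* the statement quoted from hashtab.c */

    struct snode *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->string = malloc(strlen(str) + 1);
    if (node->string == NULL) {
        free(node);
        return NULL;
    }
    strcpy(node->string, str);
    node->next = NULL;
    return node;
}
```

If a search engine puts its keywords after the first 63 characters of the string, they never reach the node, so the search phrase parser cannot see them.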






                  Enric Naval
                  Estudiante de Informática de Gestión en la Udl (Lleida)
                  GRIHO webalizer.conf
                  http://griho.udl.es/webalizer/webalizer.conf.txt




                • Marek Simon
                  Message 8 of 9 , Oct 26 4:12 AM
                    Read the documentation or FAQ on the Webalizer page. They say it is
                    normal and that you can patch it by changing the source.
                    Marek

                    Lisa Casey wrote:

                    > Hi,
                    >
                    > I've been running webalizer on about 6 virtual domains for a couple
                    > of weeks now. I run it from a cron job and it seems to be doing OK
                    > (I get graphs that look accurate). My cron job keeps a log file
                    > (/var/log/httpd/webalizer.log) in which it appears that Webalizer is
                    > running without errors BUT I get e-mail from cron that says:
                    >
                    > Error: Skipping oversized log record
                    > Error: Skipping oversized log record
                    > Error: Skipping oversized log record
                    > Error: Skipping oversized log record
                    > Error: Skipping oversized log record
                    > Error: Skipping oversized log record
                    > (etc)
                    >
                    > Does anyone know why?
                    >
                    > Thanks,
                    >
                    > Lisa