
Increase the default buffer sizes (was Error message from cron?)

  • Enric Naval
    Message 1 of 9 , Oct 15, 2005
      I have the exact same problem.


      Actually, I would prefer being able to set the buffer
      sizes from webalizer.conf.

      This way, I could use a minimal webalizer.conf with
      small buffers to process logs very quickly, or use my
      monstrous webalizer.conf with very big buffer sizes
      to catch as much information as possible.

      People could use my webalizer.conf, which would
      already have on it the most adequate buffer sizes for
      my options.




      Meanwhile, I partially solved the problem by changing
      these lines in webalizer.h:

      #define MAXAGENT 128

      This is because some agents have very long names, and
      the identifying part of the string sits past the
      default 64-character limit. Raising the limit is the
      only way to group Safari with GroupAgent.
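
To illustrate the MAXAGENT point, here is a minimal sketch, not webalizer's own code. The helper agent_matches() and the SAMPLE_AGENT string are assumptions made up for this example (the string is only meant to resemble a Safari user agent of that period). It stores the agent in a fixed-size field and then checks whether a pattern can still be found in the stored copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed sample user agent; the "Safari" token comes late in the string. */
static const char SAMPLE_AGENT[] =
    "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) "
    "AppleWebKit/412.7 (KHTML, like Gecko) Safari/412.5";

/* Hypothetical helper: copy `agent` into a field of `cap` bytes
   (truncating and NUL-terminating), then report whether `pattern`
   still occurs in the stored copy. Returns 1 if found, 0 otherwise. */
int agent_matches(const char *agent, size_t cap, const char *pattern)
{
    char buf[256];

    assert(cap > 0 && cap <= sizeof buf);
    strncpy(buf, agent, cap - 1);   /* keep at most cap-1 characters */
    buf[cap - 1] = '\0';            /* always terminate the field    */
    return strstr(buf, pattern) != NULL;
}
```

With a 64-byte field the stored copy ends before the "Safari" token, so a pattern such as "Safari" cannot match; with a 128-byte field it can.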

      #define MAXREFH 512
      #define MAXSRCH 388
      #define MAXSRCHH 388


      I increased the space for referrers and search strings
      because some search engines have very long query
      strings, and some of them also put their keywords at
      the end of the query, so you have to parse the whole
      query, not just the start of it.
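
The keyword-at-the-end problem can be sketched as follows. This is an assumption-laden illustration, not the parser webalizer uses: find_param() and SAMPLE_QUERY are hypothetical names, and the query string is invented so that the "q" parameter falls after the first 64 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Invented query string: the search keywords ("q=") come last. */
static const char SAMPLE_QUERY[] =
    "client=firefox-a&rls=org.mozilla:en-US:official"
    "&hl=en&ie=UTF-8&oe=UTF-8&q=webalizer+buffer";

/* Hypothetical helper: truncate `query` to a field of `cap` bytes, then
   look for parameter `name` and copy its value into `out`.
   Returns 1 if the parameter survived truncation, 0 otherwise. */
int find_param(const char *query, size_t cap, const char *name,
               char *out, size_t outsz)
{
    char buf[1024];
    size_t nlen = strlen(name);
    const char *p;

    assert(cap > 0 && cap <= sizeof buf && outsz > 0);
    strncpy(buf, query, cap - 1);
    buf[cap - 1] = '\0';

    for (p = buf; *p != '\0'; ) {
        size_t seglen = strcspn(p, "&");   /* length of this name=value pair */
        if (seglen > nlen && strncmp(p, name, nlen) == 0 && p[nlen] == '=') {
            size_t vlen = seglen - nlen - 1;
            if (vlen >= outsz)
                vlen = outsz - 1;
            memcpy(out, p + nlen + 1, vlen);
            out[vlen] = '\0';
            return 1;
        }
        p += seglen;
        if (*p == '&')
            p++;                           /* skip the separator */
    }
    out[0] = '\0';
    return 0;
}
```

Calling find_param(SAMPLE_QUERY, 64, "q", ...) finds nothing because the keywords were cut off, while a cap of 388 returns the full value.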



      LONG TECHNICAL EXPLANATION

      Also, you see, MAXREF is already big enough. Even if
      you increase it, webalizer will still only search for
      the search phrases inside the size indicated by
      MAXREFH.

      Even if a referrer is accepted by webalizer, it will
      only parse it as far as MAXREFH indicates. The rest of
      the referrer gets dropped silently, so part of the
      query string may be lost anyways and you will see no
      warning in the output of your cron job.

      But that's not the end of it. Then it will only parse
      query strings up to the size of MAXSRCH (I'm not sure
      whether it gives a warning when the query string is
      too long), and then it will only search for search
      phrases within the size indicated by MAXSRCHH.

      Hmm, I may have made some mistake there, but that's
      the big picture, more or less. Increasing only the
      referrer size achieves nothing unless you also
      increase the values for the referrer hash, the query
      string size and the query string hash.

      That's programming for you.



      MORE LOOOOONG DETAILS

      Webalizer uses hash tables internally, defined in
      "hashtab.h" and "hashtab.c". The logfile parser
      algorithm uses "snode" structs to save the query
      strings inside a hash table. Inside the "snode" struct
      there is a string. This string has at most size
      MAXSRCHH.

      Now, you see, LONGER STRINGS GET SILENTLY SHORTENED TO
      MAXSRCHH. That's where information gets lost. The
      search phrase parser uses the strings inside the
      "snode" structs, so anything that gets removed from
      that string is never parsed, and no warning is
      generated.

      In hashtab.c, you can search for this line:

      str[MAXSRCHH-1]=0;

      This line puts a null in the MAXSRCHH-1 position so
      that the program will stop searching there. It is in
      the function new_snode().

      All the functions in hashtab.c named like new_snode,
      new_inode, etc. behave the same way.
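
For comparison, here is a minimal sketch of a copy that truncates the same way but tells the caller about it, so a warning could be printed. It is illustrative only: copy_checked() is a hypothetical function that does not exist in hashtab.c, and MAXSRCHH is redefined here with the default value quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAXSRCHH 64   /* default value quoted from webalizer.h in this thread */

/* Hypothetical helper: copy `src` into `dst` (capacity `cap` bytes),
   always NUL-terminating. Returns 1 if the string had to be shortened,
   0 if it fit completely. */
int copy_checked(char *dst, size_t cap, const char *src)
{
    size_t len = strlen(src);

    assert(cap > 0);
    if (len >= cap) {
        memcpy(dst, src, cap - 1);
        dst[cap - 1] = '\0';    /* same effect as str[MAXSRCHH-1]=0 */
        return 1;               /* caller can report the truncation */
    }
    memcpy(dst, src, len + 1);  /* fits, copy including the terminator */
    return 0;
}
```

A caller could test the return value and write a message to stderr, which would at least show up in the cron mail instead of the data vanishing silently.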




      --- "William L. Thomson Jr."
      <yahoogroups@...> wrote:

      > On Fri, 2005-10-14 at 15:38 -0400, Bradford L. Barrett wrote:
      > > The internal buffer size for a single log entry is 4K (4096 bytes).
      > > For a typical text screen with a width of 80 characters, this would
      > > represent a log entry that spans over 50 lines. I have never seen a
      > > legitimate log entry, even from some of the more verbose sites, that
      > > was anywhere close to that size. Note that the 'Skipping oversized
      > > log record' error only applies to the entire log record, not the
      > > user agent or referrer portions which can generate their own
      > > "truncating oversized..." warnings and are completely different and
      > > unrelated.
      >
      > I think that size is ok, it's the sizes of the other buffers that
      > should be increased. From webalizer.h, I mis-posted saying the stuff
      > was in webalizer.c, where it's not. It's in the header file,
      > webalizer.h.
      >
      > #define MAXHASH  2048   /* Size of our hash tables */
      > #define BUFSIZE  4096   /* Max buffer size for log record */
      > #define MAXHOST  128    /* Max hostname buffer size */
      > #define MAXURL   1024   /* Max HTTP request/URL field size */
      > #define MAXURLH  128    /* Max URL field size in htab */
      > #define MAXREF   1024   /* Max referrer field size */
      > #define MAXREFH  128    /* Max referrer field size in htab */
      > #define MAXAGENT 64     /* Max user agent field size */
      > #define MAXCTRY  48     /* Max country name size */
      > #define MAXSRCH  256    /* Max size of search string buffer */
      > #define MAXSRCHH 64     /* Max size of search str in htab */
      > #define MAXIDENT 64     /* Max size of ident string (user) */
      >
      > The referrer size has been a problem for me. It's not the buffer
      > size for the log record that's the problem; it's the other buffers,
      > which I assume, combined, can't add up to more than the 4096.
      > However, for URLs like eBay's new format, there is no search string,
      > so those buffers are not used. It's almost as if, when there is no
      > search string, the URL buffer should be longer?
      >
      >
      > --
      > Sincerely,
      > William L. Thomson Jr.
      > Support Group
      > Obsidian-Studios, Inc.
      > http://www.obsidian-studios.com
      >
      >


      Enric Naval
      Student of Management Informatics at the UdL (Lleida)
      GRIHO webalizer.conf
      http://griho.udl.es/webalizer/webalizer.conf.txt




    • Marek Simon
      Message 2 of 9 , Oct 26, 2005
        Read the documentation or FAQ on the webalizer page. They say it is
        normal and that you can patch it by changing the source.
        Marek

        Lisa Casey wrote:

        > Hi,
        >
        > I've been running webalizer on about 6 virtual domains for a couple
        > of weeks now. I run it from a cron job and it seems to be doing OK
        > (I get graphs that look accurate). My cron job keeps a log file
        > (/var/log/httpd/webalizer.log) in which it appears that Webalizer is
        > running without errors BUT I get e-mail from cron that says:
        >
        > Error: Skipping oversized log record
        > Error: Skipping oversized log record
        > Error: Skipping oversized log record
        > Error: Skipping oversized log record
        > Error: Skipping oversized log record
        > Error: Skipping oversized log record
        > (etc)
        >
        > Does anyone know why?
        >
        > Thanks,
        >
        > Lisa
        >
        >
        >
        >
        >
        >
        > Webalizer homepage: http://www.webalizer.org
        >
        >
        >
        >