Re: Apache::Clean worthwhile in addition to mod_gzip ?

  • Carl Johnstone
    Message 1 of 9, Feb 24, 2005
      > If you are already using a compression tool like mod_gzip, does it
      > tend to be worthwhile to add an Apache::Clean phase as well?
      >
      > I'm curious to know if other Apache::Clean users have felt there was
      > significant benefit or a noticeable performance penalty.
      >
      > It would seem the bandwidth is more of an issue than the processor
      > time, so my assumption is that a little extra processor time would be
      > a reasonable trade-off.

      It's one of those things that really depends on your circumstances.

      If your HTML is generated by tools that insert lots of whitespace and
      extra tags, then you'll see more benefit than if your HTML is
      hand-crafted or machine-generated cleanly.

      Looking at a typical hand-built homepage, I'm seeing around a 9-10%
      saving from cleaning (at level 9), both before and after compression. I
      reckon it's
      probably worthwhile _if_ you've got plenty of CPU to spare and would like to
      knock 10% off your bandwidth utilisation.

      Realistically it's one of those things you'll need to test for yourself.
      Monitor your system for a period of time, noting CPU load and bandwidth
      usage. Install Apache::Clean and perform the same monitoring. You can then
      make the decision based on your real-world situation.
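
      For reference, the test setup amounts to something like this (a sketch
      for mod_perl 1; the CleanLevel variable name is from the Apache::Clean
      docs as I remember them, so double-check there before relying on it):

        # httpd.conf -- run HTML responses through Apache::Clean at level 9;
        # mod_gzip then compresses the cleaned output as usual.
        PerlModule Apache::Clean

        <Location /clean>
            SetHandler  perl-script
            PerlHandler Apache::Clean
            PerlSetVar  CleanLevel 9
        </Location>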

      Carl
    • Mark Stosberg
      Message 2 of 9, Feb 24, 2005
        On 2005-02-24, Slava Bizyayev <sbizyaye@...> wrote:
        > Hi Mark,
        >
        > Regarding the implementation of Apache::Clean, the question is --
        > whether or not you can benefit one way or another from the fact that
        > your uncompressed responses are 5-20% smaller? I can really only
        > speak to the Light-Compression in Apache::Dynagzip, keeping in mind
        > that Apache::Clean provides many extra things, but I assume that
        > blank spaces make up the main "blank volume" of files not yet
        > prepared for transmission.
        >
        > Normal compression (gzip) usually makes files 3-20 times smaller.
        > The compression ratio depends very little on whether light
        > compression was applied prior to gzip or not.
        >

        Thanks Slava.

        I hadn't read closely about Dynagzip before. Now that I see it does
        whitespace compression, I think I may stop there and not try to add
        Apache::Clean to the mix as well.
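
        (For anyone else going this route, Dynagzip's whitespace stripping is
        a single switch. This is a sketch from my reading of the
        Apache::Dynagzip docs -- verify the directive names there:)

          # httpd.conf (mod_perl 1)
          PerlModule Apache::Dynagzip

          <Files ~ "\.html$">
              SetHandler  perl-script
              PerlHandler Apache::Dynagzip
              # "Light-Compression" strips redundant blank space
              # before the gzip pass:
              PerlSetVar  LightCompression On
          </Files>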

        > What question would you like to add to Web Content Compression FAQ?

        Well, I can tell you my question, but I can't tell you whether it has
        been frequent. :)

        Basically: Is it worth "cleaning" (safely modifying) HTML before it's
        compressed?

        I have a minor gripe about HTML::Clean, because it doesn't document
        which methods could affect the design of the page, versus which methods
        don't. This must start happening somewhere between level 1 and level 9,
        but it's not documented where. For example, removing 'blink' tags alters
        the design and perhaps other things do as well.
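
        (One workaround until then: skip the blanket level() call and enable
        only the transforms you've judged safe for your pages. A sketch --
        the constructor and option names are from my memory of HTML::Clean's
        interface, so verify them against its docs:)

          use HTML::Clean;

          my $html = '...';   # hypothetical: your page's markup
          my $h = HTML::Clean->new(\$html);

          # Pick transforms explicitly instead of trusting a level:
          $h->strip({
              whitespace => 1,   # safe: collapses runs of blanks
              comments   => 1,   # safe: drops HTML comments
              blink      => 0,   # leave blink alone -- it changes the design
          });

          my $cleaned = ${ $h->data() };   # data() returns a scalar ref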

        But I should really be directing this to the author, and not this list. :)

        Mark
      • Slava Bizyayev
        Message 3 of 9, Feb 24, 2005
          Hi Jonathan,

          On Thu, 2005-02-24 at 11:13, Jonathan Vanasco wrote:

          > a _ what is the typical overhead in terms of cpu use -- ie,
          > cpu/connection time saved by smaller/faster downloads vs those used by
          > zipping

          The short answer is: the typical CPU overhead arising from content
          compression is insignificant.

          Actually, in my observations of files up to 200K, it came to less
          than 60 ms on a P4 3 GHz processor. I could not measure the lower
          boundary reliably. As a rough performance estimate I would suggest
          counting some 10 ms per request for "regular" web pages, but I
          would refrain from betting on that figure.
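
          (If you want your own numbers rather than mine, a measurement along
          these lines is enough -- a sketch using Compress::Zlib and
          Time::HiRes, where 'page.html' stands in for one of your real
          pages:)

            use Compress::Zlib ();
            use Time::HiRes qw(gettimeofday tv_interval);

            # Slurp the page to be compressed:
            my $html = do {
                local $/;
                open my $fh, '<', 'page.html' or die $!;
                <$fh>;
            };

            # Time a single in-memory gzip pass:
            my $t0 = [gettimeofday];
            my $gz = Compress::Zlib::memGzip($html);
            my $ms = 1000 * tv_interval($t0);

            printf "%d -> %d bytes (%.0f%% of original), %.1f ms to gzip\n",
                length($html), length($gz),
                100 * length($gz) / length($html), $ms;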

          Estimating connection time is even more complicated, because of the
          variety of possible network conditions. The worst-case scenario is
          pretty impressive: a slow dial-up user connected via an ISP with no
          proxy/buffering holds your socket for a time proportional to the
          size of the requested file. Gzip can make that some 3-20 times
          shorter. However, if the ISP buffers responses, you might not feel
          so bad, apart from the fact that you are paying your telecom for
          the transmission of "blank data".
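
          (To put rough, illustrative numbers on that worst case: a 200K page
          over a 56 kbps dial-up line holds the socket for roughly
          200 KB x 8 / 56 kbps = ~29 seconds; gzipped at 10:1 that drops to
          about 3 seconds.)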

          > b_ just to clarify, mod_deflate is the only chain usable for apache 2
          > -- and the various Apache:: perlmods are unneeded or incompatible?

          Basically, this is true for Apache::Dynagzip at least, and I
          thought this was stated pretty clearly in the FAQ. Additional small
          filters can be prepended to mod_deflate on Apache 2 to provide
          extra features when required.
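
          (For the Apache 2 side, the baseline is just the standard
          mod_deflate directives -- nothing mod_perl-specific is needed for
          plain compression:)

            # httpd.conf (Apache 2)
            LoadModule deflate_module modules/mod_deflate.so

            # Compress the common text responses on the way out:
            AddOutputFilterByType DEFLATE text/html text/plain text/css text/xml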

          Thanks,
          Slava
          --
          http://www.lastmileisp.com/
        • Slava Bizyayev
          Message 4 of 9, Feb 24, 2005
            On Thu, 2005-02-24 at 11:17, Geoffrey Young wrote:
            > > b_ just to clarify, mod_deflate is the only chain usable for apache 2 --
            > > and the various Apache:: perlmods are unneeded or incompatible?
            >
            > there is an Apache::Clean designed for use with apache2 on cpan
            >
            > http://search.cpan.org/~geoff/Apache-Clean-2.00_5/
            >
            > you can also read about it here
            >
            > http://www.perl.com/pub/a/2003/04/17/filters.html
            > http://www.perl.com/pub/a/2003/05/22/testing.html
            >

            Thanks Geoff, I got it.

            Slava
          • Slava Bizyayev
            Message 5 of 9, Feb 24, 2005
              On Thu, 2005-02-24 at 11:53, Mark Stosberg wrote:

              > I hadn't read closely about Dynagzip before. Now that I see
              > it does whitespace compression, I think I may stop there and
              > not try to add Apache::Clean to the mix as well.

              However, please let me know if you decide to use it for some reason. It
              should be compatible with Apache::Dynagzip within the Apache::Filter
              chain. You can turn off the Light-Compression in that case, and use all
              features of Apache::Clean instead.
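
              (A sketch of that chain -- the Filter directive comes from the
              Apache::Filter docs and LightCompression from Apache::Dynagzip's,
              so verify both against the module documentation:)

                # httpd.conf (mod_perl 1, Apache::Filter chain)
                PerlModule Apache::Filter

                <Files ~ "\.html$">
                    SetHandler  perl-script
                    PerlSetVar  Filter On
                    # Apache::Clean tidies first, then Dynagzip gzips;
                    # Light-Compression would be redundant, so leave it off:
                    PerlHandler Apache::Clean Apache::Dynagzip
                    PerlSetVar  LightCompression Off
                </Files>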

              >
              > > What question would you like to add to Web Content Compression FAQ?
              >
              > Well, I can tell you my question, but I can't tell you
              > whether it has been frequent. :)
              >
              > Basically: Is it worth "cleaning" (safely modifying) HTML before it's
              > compressed?

              Thanks, I hope to update the FAQ shortly with all the questions
              mentioned.

              Regards,
              Slava