
Show hits based on query string?

  • Weidong Wang
    Message 1 of 10, Jan 12, 2004
      Hi, I am using Webalizer 2.01. On my site, there are quite a few
      links like the following that use a script (a Perl program) to do
      certain customization on the page before displaying it:

      /cgi-bin/process.pl?sponsor=abc&file=intro.htm
      /cgi-bin/process.pl?sponsor=abc&file=order.htm

      Webalizer shows them all under the URL /cgi-bin/process.pl, which is
      the correct URL. I understand that the query string part is not
      considered as part of URL.

      But I need to have a way to tell which real page (intro.htm or
      order.htm) is being used, the hits, etc.

      I could have a unique script for each real page, but that would
      require a lot of changes and is not that clean.

      Is there something I can do on the Webalizer side to be able to tell,
      of all the traffic hitting /cgi-bin/process.pl, how many hits are for
      intro.htm and how many for order.htm, etc.?

      Thanks.

      Weidong
    • blane_warrene
      Message 2 of 10, Jan 12, 2004
        Double-check your httpd.conf (this is, of course, assuming you are using
        Apache). You can add a directive to the CustomLog format to log the
        query string in the logfile by adding "%q" (no quotes) to the
        line, which might look something like this:

        LogFormat "%h %l %u %t \"%r\" %>s %b"

        Once it is in the log, Webalizer should pick it up under the full URLs
        section.



      • Weidong Wang
        Message 3 of 10, Jan 13, 2004
          Does it matter where I put %q? Here is what I have for LogFormat:

          LogFormat "%h %l %u %t \"%r\" %>s %b %q \"%{Referer}i\" \"%{User-Agent}i\"" combined

          But in my Webalizer report, I still don't see any URL with the query
          string. The top URLs list still shows only the program (process.pl),
          without any query string info.

          Thanks for the help.

          Weidong

        • Bradford L. Barrett
          Message 4 of 10, Jan 13, 2004
            > Does it matter where I put %q? Here is what I have for LogFormat:
            >
            > LogFormat "%h %l %u %t \"%r\" %>s %b %q \"%{Referer}i\" \"%{User-Agent}i\"" combined

            The above format is invalid and what you will wind up with is query
            strings being processed as referrers, and referrers being processed
            as user agents.

            Get rid of the '%q'

            The webalizer, by design, strips cgi query info from URLs and referrers.
            This is to make the URL/referrer counts more accurate. If you want them
            preserved, you need to edit webalizer.c, add '?', '&' and '=' to the
            "isurlchar()" function and re-compile. Please note that doing so will
            cause less accurate counts and will open up the possibility of a cross
            site scripting vulnerability (query strings are not checked since they
            are not supposed to be present).
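
For anyone who would rather not patch and recompile, a per-page breakdown can also be pulled straight from the access log, since Apache's %r already records the first line of the request, query string included. Below is a minimal sketch in Python, not part of Webalizer. The function name `count_hits_by_file` and the sample log lines are made up for illustration; the script path and the `file` parameter come from the example URLs earlier in the thread.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Matches the quoted request line, e.g. "GET /path?query HTTP/1.0",
# and captures the requested URL (path plus query string).
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+)[^"]*"')


def count_hits_by_file(lines, script="/cgi-bin/process.pl", param="file"):
    """Count hits on `script`, grouped by the value of query parameter `param`."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # no recognisable request line
        url = urlsplit(match.group(1))
        if url.path != script:
            continue  # some other URL; ignore it
        # parse_qs returns a list per key; take the first value if present.
        value = parse_qs(url.query).get(param, ["(none)"])[0]
        counts[value] += 1
    return counts


# Hypothetical sample lines in the common log format quoted in this thread.
sample = [
    '10.0.0.1 - - [12/Jan/2004:10:00:00 -0500] "GET /cgi-bin/process.pl?sponsor=abc&file=intro.htm HTTP/1.0" 200 512',
    '10.0.0.2 - - [12/Jan/2004:10:00:05 -0500] "GET /cgi-bin/process.pl?sponsor=abc&file=order.htm HTTP/1.0" 200 734',
    '10.0.0.3 - - [12/Jan/2004:10:00:09 -0500] "GET /cgi-bin/process.pl?sponsor=xyz&file=intro.htm HTTP/1.0" 200 512',
    '10.0.0.4 - - [12/Jan/2004:10:00:12 -0500] "GET /index.html HTTP/1.0" 200 1024',
]

for page, hits in count_hits_by_file(sample).most_common():
    print(page, hits)
```

To run it against a real log, pass an open file object instead of `sample`. This only counts hits; it does not replace the rest of the Webalizer report.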

            --
            Bradford L. Barrett brad@...
            A free electron in a sea of neutrons DoD#1750 KD4NAW

            The only thing Micro$oft has done for society, is make people
            believe that computers are inherently unreliable.
          • waldo kitty
            Message 5 of 10, Jan 13, 2004
              Bradford L. Barrett wrote:

              > The webalizer, by design, strips cgi query info from URLs and referrers.
              > This is to make the URL/referrer counts more accurate. If you want them
              > preserved, you need to edit webalizer.c, add '?', '&' and '=' to the
              > "isurlchar()" function and re-compile. Please note that doing so will
              > cause less accurate counts and will open up the possibility of a cross
              > site scripting vulnerability (query strings are not checked since they
              > are not supposed to be present).

              this cross site scripting vulnerability is really only a problem if anyone can access the stats url, correct?

              i protect my stats from outside viewers for several reasons... first, it's no one's business what my site does
              unless i want to let them know... second (and very important) is to thwart logfile spammers... there are other
              reasons but these two are at the very top of my list...

              --
              _\/
              (@@) Waldo Kitty, Waldo's Place USA
              __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
              _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
              ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
              _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
            • Bradford L. Barrett
              Message 6 of 10, Jan 13, 2004
                > this cross site scripting vulnerability is really only a problem if
                > anyone can access the stats url, correct?

                Yes, and only if the culprit can inject enough bogus data to show up in a
                'Top' table, and even then it might not be possible. I need to go look
                through the code to see the relationship between when the URL's escape
                sequences are decoded and when the 'isurlchar()' function is called, which
                should catch the angle brackets. I didn't bother scanning the cgi portion
                since it should have been trimmed by the code that calls 'isurlchar()', but
                angle brackets should be caught, so it might not be possible to exploit.
                Just be careful.
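
To make the concern concrete, here is a small Python illustration of the general problem. It is not Webalizer's code (Webalizer is written in C), and `render_url_cell` and the hostile URL are hypothetical. The point is the ordering Brad describes: if a logged URL is percent-decoded and then written into an HTML report without escaping, angle brackets hidden in a query string become live markup in the browser of whoever views the report. Escaping after decoding turns them into harmless text.

```python
import html
from urllib.parse import unquote


def render_url_cell(logged_url):
    """Return an HTML table cell for a URL taken from a log line.

    The URL is percent-decoded first and HTML-escaped second, so markup
    smuggled into the query string is displayed as text, not interpreted.
    """
    decoded = unquote(logged_url)
    return "<td>" + html.escape(decoded, quote=True) + "</td>"


# A made-up request whose query string hides a script tag behind
# percent-encoding (%3C is '<', %3E is '>').
hostile = "/cgi-bin/process.pl?file=%3Cscript%3Ealert(1)%3C/script%3E"

print(unquote(hostile))          # what an unescaped report would emit
print(render_url_cell(hostile))  # the escaped, safe version
```

Note that the script would run in the viewer's browser, which is why restricting who can read the report does not by itself remove the risk.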

              • Weidong Wang
                Message 7 of 10, Jan 13, 2004
                  --- In webalizer@yahoogroups.com, "Bradford L. Barrett" <brad@m...> wrote:
                  > The webalizer, by design, strips cgi query info from URLs and referrers.
                  > This is to make the URL/referrer counts more accurate. If you want them
                  > preserved, you need to edit webalizer.c, add '?', '&' and '=' to the
                  > "isurlchar()" function and re-compile. Please note that doing so will
                  > cause less accurate counts and will open up the possibility of a cross
                  > site scripting vulnerability (query strings are not checked since they
                  > are not supposed to be present).

                  Thanks. This works. I do have access control for who can read the
                  reports.

                  Weidong
                • Bradford L. Barrett
                  Message 8 of 10, Jan 13, 2004
                    > Thanks. This works. I do have access control for who can read the
                    > reports.

                    Access control does not prevent cross-site scripting :(

                  • Weidong Wang
                    Message 9 of 10, Jan 14, 2004
                      --- In webalizer@yahoogroups.com, "Bradford L. Barrett" <brad@m...>
                      wrote:
                      >
                      > > Thanks. This works. I do have access control for who can read the
                      > > reports.
                      >
                      > Access control does not prevent cross-site scripting :(

                      I suppose I don't really understand what cross-site scripting means.
                      The change to isurlchar() is only to Webalizer; nothing is changed on
                      the Apache server side.

                      You mentioned something about someone sending a bogus query string to
                      make it into the top 20. Let us further assume someone else gets to see
                      the report Webalizer makes. He sees the URL with the bogus
                      query string. What then?

                      Thanks for helping out.

                      Weidong
                    • Bradford L. Barrett
                    Message 10 of 10, Jan 14, 2004
                        > > > Thanks. This works. I do have access control for who can read the
                        > > > reports.
                        > >
                        > > Access control does not prevent cross-site scripting :(
                        >
                        > I suppose I don't really understand what cross-site scripting means.
                        > The change to isurlchar() is only to Webalizer; nothing is changed on
                        > the Apache server side.
                        >
                        > You mentioned something about someone sending a bogus query string to
                        > make it into the top 20. Let us further assume someone else gets to see
                        > the report Webalizer makes. He sees the URL with the bogus
                        > query string. What then?

                        See this CERT advisory: http://www.cert.org/advisories/CA-2000-02.html
