
Re: [webalizer] Multiple log entries

  • waldo kitty
    Message 1 of 8 , Jul 13, 2004
      John [aka VJ Scorpio] wrote:

      > My first posting, so hope this gets to everyone and I am doing this right..
      > I am a "newbie" really when it comes to servers and logs, and I recently
      > signed up for a dedicated server which has Webalizer installed for
      > stats, although I do have access to the raw logs and have used Weblog
      > Expert Lite also.
      > What is worrying me a bit is what appears to be multiple log entries for
      > the same files downloaded by the same person.
      > Here is an example from my logs, I have put ** = hidden for privacy/security
      >
      > 128.125.*.* - - [12/Jul/2004:23:23:44 +0200] "GET text1/text2/729.zip
      > HTTP/1.1" 206 2949639 "http://****" "DA 5.3"
      > 128.125.*.* - - [12/Jul/2004:23:23:44 +0200] "GET text1/text2/729.zip
      > HTTP/1.1" 206 2949639 "http://****" "DA 5.3"
      > 128.125.*.* - - [12/Jul/2004:23:23:44 +0200] "GET text1/text2/279.zip
      > HTTP/1.1" 206 2949639 "http://****" "DA 5.3"
      > 128.125.*.* - - [12/Jul/2004:23:23:44 +0200] "GET text1/text2/729.zip
      > HTTP/1.1" 206 2949642 "http://****" "DA 5.3"

      the DA 5.3 at the end of the line is the UserAgent being used for the connection... DA is Download Assistant... the 206 code means
      "the server has fulfilled the partial GET request for the resource". basically what's happening is that someone is using a download
      tool that breaks the file up into parts and then downloads all the parts at the same time... in most cases, this effectively pulls the
      file in faster than if they just downloaded it straight from the beginning to the end...
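
As a rough illustration of how to check this in the raw logs, the following Python sketch groups 206 responses by client, path and user agent and adds up the bytes column. The regular expression assumes Apache's combined log format; the function name `summarize_partial_requests` and the sample addresses, paths and referrer are made up for the example and are not taken from the logs above.

```python
import re
from collections import defaultdict

# Combined log format (assumed):
# host ident user [time] "method path protocol" status bytes "referer" "agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize_partial_requests(lines):
    """Return {(host, path, agent): (request_count, total_bytes)} for 206 responses."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = LOG_RE.match(line)
        if match is None or match.group("status") != "206":
            continue  # skip unparsable lines and anything that is not Partial Content
        key = (match.group("host"), match.group("path"), match.group("agent"))
        sent = 0 if match.group("bytes") == "-" else int(match.group("bytes"))
        totals[key][0] += 1
        totals[key][1] += sent
    return {key: tuple(value) for key, value in totals.items()}


# Hypothetical sample lines, shaped like the entries quoted above.
_PREFIX = '192.0.2.10 - - [12/Jul/2004:23:23:44 +0200] "GET /files/729.zip HTTP/1.1" '
_SUFFIX = ' "http://example.com/" "DA 5.3"'
SAMPLE = [
    _PREFIX + "206 2949639" + _SUFFIX,
    _PREFIX + "206 2949639" + _SUFFIX,
    _PREFIX + "206 2949639" + _SUFFIX,
    _PREFIX + "206 2949642" + _SUFFIX,
    _PREFIX + "200 512" + _SUFFIX,  # a full response, ignored by the summary
]

if __name__ == "__main__":
    for key, (count, total) in summarize_partial_requests(SAMPLE).items():
        print(key, count, total)
```

If the byte total for one client and file comes out close to the size of the file on disk, the segments together probably amount to one download; a total several times larger would point to repeated or overlapping requests.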

      if this is an apache server, there are mods that can be added to limit the number of connections one system may have on a url... you
      can also do things based on the useragent... it all depends on whether you mind this traffic or not...

      anyway, on the http status codes, i'd recommend dropping by http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html and printing
      yourself out a copy ;)

      --
      _\/
      (@@) Waldo Kitty, Waldo's Place USA
      __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
      _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
      ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
      _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
    • John [aka VJ Scorpio]
      Message 2 of 8 , Jul 13, 2004
        Thanks for your reply, and sorry, I should have mentioned that most of the downloads are protected by anti-leech [www.anti-leech.com], and the only download manager/accelerator that can be used is netpumper [www.netpumper.com].
        So if I may ask this question then: if a download manager is used, which as you say might break up downloads, would this actually increase the "real" traffic, given that the file is only being downloaded once? I have a limit on my traffic allowance and would be charged extra for exceeding it, hence my concern.
        I do also have a one-connection-per-user setup via anti-leech, although I am not sure if this prevents download managers from connecting several times; that is something I need to check with anti-leech.
        But anyway, it seems a bit "unfair", as from what I assume the actual traffic or bandwidth, or whatever you want to call it, is being shown wrong in the logs, isn't it?
         
        Thanks
         
        John


        From: waldo kitty [mailto:wkitty42@...]
        Sent: 13 July 2004 15:49
        To: webalizer@yahoogroups.com
        Subject: Re: [webalizer] Multiple log entries



        Webalizer homepage: http://www.webalizer.org
        Webalizer for NT: http://www.medasys-lille.com/webalizer/



      • oliver
        Message 3 of 8 , Jul 13, 2004
          I have a large log file from June 27th to July 11th. When I run
          webalizer, I see this:

          Webalizer V2.01-10 (SunOS 5.8) English
          Using logfile /data1/reports/VA/master_access.log (clf)
          Creating output in /opt/apache2/htdocs/webalizer/all_hits
          History file not found...
          Generating report for June 2004
          Generating report for July 2004
          Generating summary report
          Saving history information...
          2677440 records (263 ignored) in 249.19 seconds, 10744/sec

          All is good... I see reports for all days and the history file is created.

          Now, if I run the report again but only with data from July 4th to
          July 11th, it looks like the history file is not being read, because I
          only see data from the 4th --> 11th in webalizer.

          Here is the output from the second webalizer run:

          Webalizer V2.01-10 (SunOS 5.8) English
          Using logfile /data1/reports/VA/master_access.log (clf)
          Creating output in /opt/apache2/htdocs/webalizer/all_hits
          Reading history file... /opt/apache2/htdocs/webalizer/all_hits/webalizer.hist
          Generating report for July 2004
          Generating summary report
          Saving history information...
          1415133 records in 114.71 seconds, 12336/sec

          So I see the hist file is being read. Here's the conf file:

          HostName ******************** (removed by me)
          LogFile /data1/reports/VA/master_access.log
          OutputDir /opt/apache2/htdocs/webalizer/all_hits
          HistoryName /opt/apache2/htdocs/webalizer/all_hits/webalizer.hist

          Any ideas what could be causing this behaviour?

          Thanks in advance
        • Bradford L. Barrett
          Message 4 of 8 , Jul 13, 2004
            Hint: Incremental mode

            Follow precautions in README file.

            --
            On Tue, 13 Jul 2004, oliver wrote:

            > Now, If I run the report again but only with data from July 4th to
            > July 11th. It looks like the history file is not being read because I
            > only see data from 4th --> 11th in webalizer.
            --
            Bradford L. Barrett brad@...
            A free electron in a sea of neutrons DoD#1750 KD4NAW

            The only thing Micro$oft has done for society, is make people
            believe that computers are inherently unreliable.
          • oliver
            Message 5 of 8 , Jul 13, 2004
              On Tue, 13 Jul 2004 17:10:06 -0400 (EDT), Bradford L. Barrett
              <brad@...> wrote:
              >
              > Hint: Incremental mode
              >
              > Follow precautions in README file.

              yes... that was it. I cleared everything out, ran webalizer twice and
              all is good. Thanks so much
            • waldo kitty
              Message 6 of 8 , Jul 13, 2004
                John [aka VJ Scorpio] wrote:
                > Thanks for your reply, and sorry I should have mentioned that most of
                > the downloads are protected by anti-leech [www.anti-leech.com] and the
                > only download manager/accelerator that can be used is netpumper
                > [www.netpumper.com].

                i'm not familiar with that software... either of them...

                > So if I may ask this question then, if a download manager is used, which
                > as you say might break up downloads would this actually increase the
                > "real" traffic, as the file is only being downloaded once?

                ummm... what's being done is that they are able to download one file in 1/3rd the time, for example, if they are set for three
                connections with each being 1/3rd of the total filesize they are retrieving...
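
To make the arithmetic concrete, here is a small Python sketch of how a segmented download could divide a file into byte ranges. It assumes the client fetches each segment exactly once using HTTP `Range` requests; the helper names `split_ranges` and `range_header` are hypothetical, and real download managers may choose segment boundaries differently.

```python
def split_ranges(total_size, parts):
    """Split total_size bytes into `parts` contiguous, inclusive (start, end) ranges."""
    if parts < 1 or total_size < parts:
        raise ValueError("need at least one byte per part")
    base, extra = divmod(total_size, parts)
    ranges = []
    start = 0
    for index in range(parts):
        # The first `extra` segments take one additional byte each.
        length = base + (1 if index < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def range_header(start, end):
    """Format one segment as the value of an HTTP Range request header."""
    return f"bytes={start}-{end}"


if __name__ == "__main__":
    segments = split_ranges(10, 3)
    for start, end in segments:
        print(range_header(start, end))
    # The segment lengths add up to the original size.
    print(sum(end - start + 1 for start, end in segments))
```

Under that assumption the segments add up to exactly the file size, so the body bytes transferred match a single straight download; the extra cost is one set of request and response headers per segment, plus anything a client re-requests after a dropped connection.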

                > I have a
                > limit to my traffic allowance and would be charged extra for exceeding
                > it, hence my concern.

                i don't know if this will affect your traffic allowance... that would depend on the agreement you have with your hosting company...

                > I do also have a one connection per user setup via anti-leech, although
                > not sure if this prevents download managers from connecting several
                > times, something I need to check with anti-leech.

                yes, the log snippet you showed did not indicate that any prevention was being performed... at least, not to my eyes...

                > But anyway, seems a bit "unfair" as from what I assume actual traffic or
                > bandwidth or whatever you want to call it is being shown wrong on the
                > logs, isnt it?

                there is a difference between traffic and bandwidth... bandwidth is how fat the pipe is... if you have a 1.5meg download pipe, then
                your max bandwidth is going to be close to 1.5meg... sometimes you'll see bandwidth confused with data quantity... something like
                "you're allowed 50gig per month in bandwidth"... that could be called traffic... the main thing is to clarify these definitions with
                your hosting company and determine exactly what /they/ are working with and the terms they are using...
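
The distinction can be put into numbers with a short Python sketch. It assumes "1.5meg" means 1.5 megabits per second, "50gig" means 50 × 10^9 bytes, and a 30-day month; none of these definitions come from a hosting contract, and both function names are made up for the illustration.

```python
SECONDS_PER_DAY = 86_400


def max_transfer_bytes(rate_bits_per_sec, days):
    """Bytes moved if a link runs at rate_bits_per_sec without pause for `days` days."""
    return rate_bits_per_sec / 8 * days * SECONDS_PER_DAY


def average_rate_bits_per_sec(quota_bytes, days):
    """Sustained rate (bits per second) that uses up quota_bytes in exactly `days` days."""
    return quota_bytes * 8 / (days * SECONDS_PER_DAY)


if __name__ == "__main__":
    # Bandwidth to traffic: a 1.5 Mbit/s pipe kept full for 30 days.
    print(max_transfer_bytes(1.5e6, 30) / 1e9, "GB")
    # Traffic to bandwidth: 50 GB spread evenly over 30 days.
    print(average_rate_bits_per_sec(50e9, 30) / 1e6, "Mbit/s")
```

Bandwidth is a rate and traffic is a quantity; multiplying a rate by the time it is sustained gives the quantity, which is why a host can quote either figure and still be describing the same limit.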

                as for what's shown in the log? no, i don't see that that is wrong at all... the lines posted showed the conclusion of several
                connections all pulling different pieces of the same file by the same system at the same time... at least, from what i could see and
                guessing that the parts you "secured" were all the same...

                i think i'll take a peek at anti-leech myself... hopefully it'll run on my chosen OS and is not limited to only a few that i choose
                not to run as servers...

                --
                _\/
                (@@) Waldo Kitty, Waldo's Place USA
                __ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
                _|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
                ____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
                _|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
              • John [aka VJ Scorpio]
                Message 7 of 8 , Jul 13, 2004
                  Thanks again for your reply. I will need to look into this further with my hosting company. This is what their FAQ says, but it's a bit double Dutch to me, and YES, they are talking about traffic being bandwidth, as they say it's excess bandwidth they will charge extra for if my 700GB allowance per month is exceeded.
                  "Bandwidth is measured at three points in our network for the highest accuracy and redundancy. We use the TrafficProbe system to analyze the complete traffic flow in our data center and provide you the most accurate billing in the industry. In addition, we monitor all switch ports for bandwidth usage."
                  BTW, anti-leech is not server dependent afaik; it's just a service they provide so that no real direct links to files show up on a web page, as they go through anti-leech first. Users have to install a browser plugin, and the plugins are only available for MS OS afaik, although there are several versions for different browsers. You can control the number of downloads per hour, day or whatever, and other stuff like IP bans, speed limits etc. It's free, but the user does get some adware unless you pay to remove it, or the user pays for the full version of the netpumper download manager.
                   
                  Take Care
                   
                  John [aka VJ Scorpio]
                  The Nets Largest Music Video Collection
                  Tel/Fax USA: 8018495657
                  Tel/Fax UK: 07092034200
                  VIDEO ARCHIVE


                  From: waldo kitty [mailto:wkitty42@...]
                  Sent: 14 July 2004 00:16
                  To: webalizer@yahoogroups.com
                  Subject: Re: [webalizer] Multiple log entries
