Re: [ARRL-LOTW] Duplicates

  • David Levine
    Message 1 of 147, Feb 20, 2013
      Of course I pasted it into a fixed font, which would only help if spaces were used and not stripped by Yahoo. I can't view your table in Yahoo, or if I copy/paste it into plain old Notepad.

      But you didn't answer the question, which is: where are YOUR numbers coming from? I know you are fixated on the % you claim are duplicates, but how do you come to those results? I explained in my post what data I see available and asked the question, presented here again:

      +++ The only numbers I can see obvious to scrape from the pages each hour are a snapshot of the # of QSOs in the system, with the diff from the prev hour being the # of new QSOs over the past hour. The same LoTW page shows the # of files uploaded, which like the above can tell how many files were uploaded in the past hour. Where do any other numbers you use in calculations come from? The status page doesn't show any data that can provide info on the past hour unless the queue is backed up over 1 hour.
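
      That delta arithmetic is simple enough to sketch. Here is a minimal Python example, assuming hypothetical snapshot values scraped hourly from the LoTW status page (the field names are mine for illustration, not the page's actual labels):

          # Hypothetical hourly snapshots of the two figures the page exposes:
          # total QSO records in the system and total files uploaded.
          prev = {"total_qsos": 500_123_000, "total_files": 4_200_000}  # hour N
          curr = {"total_qsos": 500_188_000, "total_files": 4_201_450}  # hour N+1

          new_qsos_past_hour = curr["total_qsos"] - prev["total_qsos"]
          new_files_past_hour = curr["total_files"] - prev["total_files"]

          print(f"New QSO records in the past hour: {new_qsos_past_hour:,}")
          print(f"Files uploaded in the past hour:  {new_files_past_hour:,}")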


      What are the source pieces of data you are using for your calculations? It has to be mathematical, just as I described in the paragraph above. You come up with a bunch of data points, and I'm asking what your data points are and how you are determining their values. Once you have your data points, you can then insert them into a simple mathematical formula.

      What are the column headings for the 5 data elements in the table?
      How are you determining each?

      Should be a simple response.




      On Wed, Feb 20, 2013 at 7:13 PM, Joe Subich, W4TV <lists@...> wrote:


      > +++ I can't tell what the above represents. There are 5 columns of
      > data and the headings don't provide any obvious matching, which could
      > be a formatting issue with Yahoo.

      Use a fixed-space typeface and text-based e-mail. The table will
      be perfectly clear. The important numbers are in the rightmost
      column, which shows the percentage of Raw QSOs each week that had
      been processed previously. That ratio has been above 80% for three
      of the last four weeks and is at least 15% higher than the rate of
      "previously processed" QSO records before the new hardware went
      on-line.

      The rate of duplicate QSOs continues to increase more rapidly than
      the number of total QSOs.

      73,

      ... Joe, W4TV


      On 2/20/2013 6:02 PM, David Levine wrote:
      > I'm sorry 1 word tripped you up so much. See +++ below.
      >
      > On Wed, Feb 20, 2013 at 3:20 PM, Joe Subich, W4TV <lists@...> wrote:
      >>
      >>
      >>
      >>
      >>> Now if those 580 new QSOs contained 2 NEW QSOs each, there would be
      >>> no increase in duplicates and in fact the overall % of duplicates per
      >>> upload would have actually decreased.
      >>
      >> Your logic can't be correct if you start with false assumptions - QSOs
      >> containing QSOs?
      >
      > +++ Now if those 580 new UPLOADS contained 2 NEW QSOs each...
      >
      >>>> Let's use real numbers as an example...
      >
      > +++ All my numbers are perfectly valid and since you didn't point out any
      > math mistakes I must not have made any typos on my phone for that portion.
      >
      >>
      >> Here are the real numbers. I've adjusted the calculations slightly
      >> to more closely match the endpoints of each week (I had changed the
      >> time at which I captured the LotW Status data from 0600z to 2300z
      >> during the three months). Here is an adjusted *weekly* calculation
      >> of the percentage of reprocessed QSOs - as defined by uploaded QSOs
      >> (user files times average log size) minus New QSO Records - for the
      >> last 12 weeks:
      >>
      >>                           Logs       Average
      >> Week Ending    New QSOs   Processed  QSOs/Log  % Reprocessed
      >> ------------------------------------------------------------
      >> 12/03/2012    1,825,662     17,225      316       66.4%
      >> 12/10/2012    1,837,292     13,092      316       55.5%
      >> 12/17/2012    1,416,710     15,288      316       70.6%
      >> 12/24/2012    1,323,942     17,952      340       78.3%
      >> 12/31/2012    1,040,081     12,815      318       74.4%
      >> 01/07/2013      891,254     15,229      289       79.7%
      >> 01/14/2013    4,422,746     53,411      224       63.0%
      >> 01/21/2013    1,760,810     18,407      472       79.7%
      >> 01/28/2013    1,367,417     22,025      410       84.9%
      >> 02/04/2013    1,519,401     22,830      357       81.4%
      >> 02/11/2013    1,587,253     24,483      473       86.3%
      >> 02/18/2013    2,310,539     33,004      271       74.1%
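
      To make the arithmetic concrete, here is a minimal Python sketch of that calculation, checking the 02/11/2013 row against the definition above (uploaded QSOs = logs processed times average log size; the variable names are illustrative):

          # One row of the weekly table, per the definition above.
          logs_processed = 24_483
          avg_qsos_per_log = 473
          new_qso_records = 1_587_253

          uploaded_qsos = logs_processed * avg_qsos_per_log  # 11,580,459
          reprocessed = uploaded_qsos - new_qso_records      # 9,993,206

          pct_reprocessed = 100 * reprocessed / uploaded_qsos
          print(f"{pct_reprocessed:.1f}% reprocessed")       # 86.3%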
      >
      > +++ I can't tell what the above represents. There are 5 columns of data
      > and the headings don't provide any obvious matching, which could be a
      > formatting issue with Yahoo.
      >
      > +++ The only numbers I can see obvious to scrape from the pages each hour
      > are a snapshot of the # of QSOs in the system, with the diff from the prev
      > hour being the # of new QSOs over the past hour. The same LoTW page shows
      > the # of files uploaded, which like the above can tell how many files were
      > uploaded in the past hour. Where do any other numbers you use in
      > calculations come from? The status page doesn't show any data that can
      > provide info on the past hour unless the queue is backed up over 1 hour.
      >
      >>
      >> One can see the increase in duplicates as users uploaded all their old
      >> logs, following the bad advice from some individuals to do so rather
      >> than uploading only the QSOs that were missing after the discovery of
      >> the "lost upload" bug, as ARRL instructed. This is particularly clear
      >> in the week of 1/7, just before the new hardware went on-line, with
      >> the rate of reprocessing reaching nearly 80%.
      >>
      >> Although the reprocessing rate dropped slightly the following week as
      >> the new hardware went on-line, it has quickly reached and even exceeded
      >> the peak levels of the old system and the 80%+ reprocessing rate is
      >> more than 50% higher than the 60% average in early December before the
      >> push to "upload everything again."
      >>
      >> Fortunately, instead of a maximum processing rate around 750,000 QSOs
      >> per day (5 million per week) on the old server, the new one peaks out
      >> around 20 to 21 million QSOs per week. The roughly four-fold increase
      >> in capacity means that the 80% reprocessing rate does not result in
      >> resource exhaustion at the present upload rates. As another point of
      >> reference, 70% of uploads are processed in *one minute* or less and
      >> 89.5% are processed in less than 30 minutes.
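
      As a quick sanity check on the quoted throughput figures (a sketch only; using the midpoint of the "20 to 21 million" range is my assumption):

          # Old server: ~750,000 QSOs/day; new server: 20-21 million QSOs/week.
          old_per_week = 750_000 * 7          # ~5.25 million QSOs/week
          new_per_week = 20_500_000           # midpoint of the quoted range

          print(new_per_week / old_per_week)  # ~3.9 -- roughly four-fold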
      >>
      >> Duplicates represent a significant and growing drain on processing
      >> capacity. Fortunately, there is currently enough reserve capacity
      >> that most users no longer feel the impact of the bad behavior and
      >> processing time has remained below one hour for 94% of the time.
      >>
      >> 73,
      >>
      >> ... Joe, W4TV
      >>
      >> On 2/20/2013 9:29 AM, k2dsl wrote:
      >>> I'm not clear on what you wrote below and the assumptions you are
      >>> coming up with. The specific snippet is:
      >>>
      >>> 1) Uploads per day increased 58%
      >>> 2) New QSOs increased by 15%
      >>>
      >>> Why is there necessarily a conclusion that duplicates have increased?
      >>> If now the majority of uploads contained just 1-5 QSO records, there
      >>> wouldn't be a correlation that I can see between the 2 numbers you
      >>> posted that duplicates increased.
      >>>
      >>> Let's use real numbers as an example...
      >>>
      >>> Before Hardware Upgrade:
      >>> Daily Uploads: 1,000
      >>> Daily new QSOs: 10,000
      >>>
      >>> After Hardware Upgrade:
      >>> Uploads: 1,580
      >>> Daily new QSOs: 11,150
      >>>
      >>> The above represents a 58% increase in the number of uploads and a
      >>> 15% increase in the # of new QSOs.
      >>>
      >>> Now if those 580 new QSOs contained 2 NEW QSOs each, there would be
      >>> no increase in duplicates and in fact the overall % of duplicates
      >>> per upload would have actually decreased.
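
      Taking the example at face value, the arithmetic of the "2 NEW QSOs each" scenario can be checked with a short Python sketch (numbers from the example above):

          # Do ~2 new QSOs per additional upload account for the growth?
          extra_uploads = 1_580 - 1_000      # 580 additional daily uploads
          extra_new_qsos = 11_150 - 10_000   # 1,150 additional daily new QSOs

          print(extra_new_qsos / extra_uploads)  # ~1.98 -- about 2 per extra upload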
      >>>
      >>> Is there anything incorrect with my math? If not, is your statement a
      >>> guess vs fact, or is there some other info you have that wasn't
      >>> presented? If it's just a guess on your part, you certainly didn't
      >>> make it out to be that, and my numbers above, which are all possible,
      >>> indicate your assumption is incorrect.
      >>>
      >>> David - K2DSL
      >


    • David Cole
      Message 147 of 147, Aug 24, 2014
        I run ACLog... When you hit the "ALL SINCE" button, change the date to
        be something about a week prior to the LoTW failure...

        I "believe", ACLog got very confused as a result of the fail mode of
        LoTW. That corrected a very similar problem for me.
        --
        Thanks and 73's,
        For equipment, and software setups and reviews see:
        www.nk7z.net
        for MixW support see:
        http://groups.yahoo.com/neo/groups/mixw/info
        for Dopplergram information see:
        http://groups.yahoo.com/neo/groups/dopplergram/info
        for MM-SSTV see:
        http://groups.yahoo.com/neo/groups/MM-SSTV/info


        On Sun, 2014-08-24 at 09:05 -0700, reillyjf@... [ARRL-LOTW]
        wrote:
        >
        >
        > Thanks for the suggestion. I did a complete download, and beat the
        > number of duplicates down from 275 to 30. Not exactly sure why the
        > N3FJP ACL is missing this information.
        > - 73, John, N0TA
        >
        >
        >