- +++ Below from K2DSL and then I'm letting the subject go. I just find it disappointing that the truth, whatever it may be, is stifled because of one individual's closed mind and lack of acceptance of every other person's ideas.

On Sat, Feb 23, 2013 at 8:40 PM, Joe Subich, W4TV <lists@...> wrote:

> K2DSL: I didn't say you were using P to solve for T so why state
> that? I said there are 2 unknowns in the equation which is accurate.

You said it again ... there is only *one* unknown in that equation
since the two items you claim to be "unknown" are related to each
other by a constant (known). Thus, if you know one, you know both
- they can not both be unknown.

+++ Here are the simple formulas again that you cut out from your reply.

(Avg # QSOs per log * # uploaded logs) = Total Uploaded QSOs

Total Uploaded QSOs - Newly Uploaded QSOs = Previously Uploaded QSOs
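The two formulas can be sketched in a few lines to show how strongly the result depends on the assumed average. This is only an illustration: the function name is made up, and the input numbers are hypothetical round values, not LoTW data.

```python
def previously_uploaded_share(avg_qsos_per_log, uploaded_logs, newly_uploaded_qsos):
    """Apply the two formulas from the thread and return the previously-uploaded share."""
    # (Avg # QSOs per log * # uploaded logs) = Total Uploaded QSOs
    total_uploaded = avg_qsos_per_log * uploaded_logs
    # Total Uploaded QSOs - Newly Uploaded QSOs = Previously Uploaded QSOs
    previously_uploaded = total_uploaded - newly_uploaded_qsos
    return previously_uploaded / total_uploaded


# Hypothetical round numbers chosen for illustration only (not LoTW data).
uploaded_logs = 1_000
newly_uploaded_qsos = 82_000

# The same two known inputs give very different answers
# depending on which average log size is assumed.
for assumed_avg in (85, 100, 200):
    share = previously_uploaded_share(assumed_avg, uploaded_logs, newly_uploaded_qsos)
    print(f"assumed avg {assumed_avg:>3} QSOs/log -> previously uploaded share {share:.1%}")
```

With the log count and the newly uploaded count held fixed, the previously-uploaded share is driven entirely by the assumed average, which is the quantity both sides of the thread disagree about.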

+++ From the above you only have # uploaded logs & Newly Uploaded QSOs. In every equation you have 2 unknown data points. You cannot claim avg # QSOs per log as known because you don't have that as a fact but as an assumption relying on your calculation being a valid average # QSOs per log, which it is not.

> If I remove the top 20% of status records based on backed up QSOs
> leaving 80% of the samples, that removes 95% of the QSOs you are
> using from the samples.

You can't arbitrarily remove 80% of the samples - each sample is as
valid as any other. Picking and choosing data to suit your hypothesis
is wrong, wrong, wrong.

+++ I didn't remove 80% of the samples. I removed 20%. Why would you say I removed 80% of the samples when I wrote I removed 20% multiple times?

> Without the top 20%, the remaining 80% of the samples result in an average
> 85 QSOs/log. Knowing 125,261 logs uploaded as a fact within the sample
> period, Total QSOs in the sample period = 10,598,584 QSOs.

You don't know *as a fact* that the total QSOs are 10,598,584. You have
an inaccurate value for QSOs/log because you arbitrarily discarded the
largest logs. Any time you lop off the top 80% of the samples you are
going to understate the average log size and excessively reduce the
level of previously processed records - specifically because it is the
large non-contest logs that are the primary source of the duplicates
we're trying to measure! Of course you know that and are intentionally
trying to manipulate the data to support your "low dupes" mythology.

+++ I never said 10,598,584 is a fact so yet again why would you state I said it was a fact? And again, I didn't "lop off 80% of the samples" so why are you saying that's what I wrote? Are you reading what I and others write or just replying or are you making false statements on purpose? I excluded 20% of the samples as they alone represented 95% of the QSOs that were captured LEAVING 80% of the samples. If the % of samples you are basing your analysis on is enough, reducing it by 20% should be immaterial.
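The effect both sides are arguing over, dropping the top 20% of samples when those samples hold 95% of the QSOs, can be sketched on a tiny synthetic data set. The sample values below are invented so that the 20%/95% split from the thread holds exactly; they are not the actual status records, and the function name is hypothetical.

```python
def drop_top_fraction(qsos_per_sample, fraction):
    """Drop the largest `fraction` of samples; return (mean of the rest, share of QSOs kept)."""
    ordered = sorted(qsos_per_sample)
    n_dropped = int(len(ordered) * fraction)
    kept = ordered[:len(ordered) - n_dropped]
    mean_kept = sum(kept) / len(kept)
    share_kept = sum(kept) / sum(ordered)
    return mean_kept, share_kept


# Synthetic, deliberately skewed samples: 8 small logs and 2 very large ones.
# The 2 large logs (20% of the samples) hold 1520 of 1600 QSOs (95%).
samples = [10] * 8 + [760] * 2

full_mean = sum(samples) / len(samples)
trimmed_mean, share_kept = drop_top_fraction(samples, 0.20)

print(f"mean over all samples:        {full_mean:.1f} QSOs/sample")
print(f"mean without the top 20%:     {trimmed_mean:.1f} QSOs/sample")
print(f"share of QSOs still included: {share_kept:.0%}")
```

The sketch does not settle which average is the right one to use. It only shows that with data this skewed, the choice of whether to include the largest samples changes the average by more than an order of magnitude.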

+++ I have no mythology (why would you choose such a word) nor am I manipulating the data. I'm simply putting knowledge and logic against the data that is available and coming up with a different answer from what you have posted. I believe my knowledge and logic is no less plausible than yours. I don't care what the % of dupes is or isn't.

> that calculates out to:

> New QSO = 82%

> Previously uploaded QSOs = 18%

level given by K1MK. You're not going to have a 65% decline in the

rate of dupes between the first four weeks of December and February

with only a 15% increase in new QSOs - not when the number of logs

processed increased by 50% in the same time period. Your hypothesis

fails even the most basic examination.

+++ If the old % reported was somewhere around 50% or so, why is it not absurd that you are now claiming it is 50% higher but I must be wrong with my calculations showing it to be 65% less? That 15% makes a material difference? You should apply the same rigor to your own analysis.

There is nothing to support a 75% *decrease* in average log size -
which would be required to reach your 18% dupe rate - between December
and February. Instead, all of the signs point to a modest increase in
average upload size - on the order of 15% or so.

+++ How do you have any average log size from Dec or any other month for that matter? And there are no "signs of an increase" unless you are speaking to your assumptions which many don't agree with.

> That is how skewed the data is and also what others have been

> criticizing. With knowledge of the fundamental way the data will

> show in any one second, a single large file will have much greater

> impact as can be seen from this simple analysis.

one needs a larger sample to reach the same confidence level as one

would achieve with a less skewed (smaller standard deviation in terms

of "normal" distributions) population. Go study statistics instead of

spewing a bunch of stuff that is provably wrong.

+++ Why the need for statistics when simple common sense & knowledge can be applied? Facts are facts and in the case of this analysis, facts are limited and not able to communicate the entire picture. You don't have the facts to produce a valid result. An ever increasing # of samples which themselves, when applying knowledge of the situation, are not valid, doesn't make the average valid.
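The sample-size point raised in the quoted text can be sketched with the textbook formula n = (z * s / E)^2 for estimating a mean to within a margin E. This assumes independent samples and an approximately normal sampling distribution for the mean, which is itself doubtful for heavily skewed data; the standard deviations and margin below are hypothetical, and the function name is made up for illustration.

```python
import math


def required_sample_size(std_dev, margin, z=1.96):
    """Samples needed so the mean's confidence interval half-width is about `margin`.

    Uses n = (z * s / E)^2 with z = 1.96 for roughly 95% confidence.
    Assumes independent samples and a near-normal sampling distribution.
    """
    return math.ceil((z * std_dev / margin) ** 2)


# Hypothetical spreads: same target margin, ten times the standard deviation.
margin = 10  # QSOs/log
for std_dev in (100, 1000):
    n = required_sample_size(std_dev, margin)
    print(f"std dev {std_dev:>4} QSOs/log -> about {n} samples for +/-{margin}")
```

The required count grows with the square of the spread, so a wider distribution needs many more samples for the same margin. The formula says nothing about whether the individual samples measure the right thing, which is the objection raised in the reply above.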

> You don't have to agree with the approach (I didn't particularly

> agree with yours) and I know you won't like the outcome, but it shows

> that not having the facts and making general assumptions based on a

> limited set of data and one that is particularly skewed can result in

> potentially inaccurate analysis.

data that does not agree with your desired outcome and 2) results are

unreliable when you arbitrarily reduce the sample size. Both of those

are known principles of statistics/statistical analysis.

You're never going to agree with my analysis because it doesn't support

your belief system. There is obviously no use debating this further -

however, all of the equations to calculate the "expected value" of a

distribution, probability mass function, cumulative distribution

function, confidence intervals, and the other parameters on which to

base an unbiased analysis are available on-line for you to study and

maybe learn something.

+++ My "belief" is the truth and I believe your analysis to not represent the truth. If I felt your analysis was sound and took into account applying knowledge to the data, it wouldn't matter what the outcome is. There is no need for statistics on such a basic problem. A 6th grader would be more than capable of doing the addition, subtraction, multiplication and division necessary to perform any calculation required against this data. What's needed is to apply knowledge of the situation and not arbitrarily take the numbers and blindly process them.

- I run ACLog... When you hit the "ALL SINCE" button, change the date to
be something about a week prior to the LoTW failure...

I "believe", ACLog got very confused as a result of the fail mode of

LoTW. That corrected a very similar problem for me.

--

Thanks and 73's,

For equipment, and software setups and reviews see:

www.nk7z.net

for MixW support see;

http://groups.yahoo.com/neo/groups/mixw/info

for Dopplergram information see:

http://groups.yahoo.com/neo/groups/dopplergram/info

for MM-SSTV see:

http://groups.yahoo.com/neo/groups/MM-SSTV/info

On Sun, 2014-08-24 at 09:05 -0700, reillyjf@... [ARRL-LOTW] wrote:

>

>

> Thanks for the suggestion. I did a complete download, and beat the

> number of duplicates down from 275 to 30. Not exactly sure why the

> N3FJP ACL is missing this information.

> - 73, John, N0TA
