[XTalk] Re: Proof

  Bob Schacht
    Message 1 of 104 , Sep 1, 2002
      Thanks to Mike and David for this interesting thread. I'd like to jump in
      at a few points:

      At 11:48 PM 9/1/2002 -0400, David C. Hindley wrote:
      > >>I'm afraid you're mixing up two things, Dave. If a contradiction can be
      >found to a hypothesis H, then H is not possible, even if an advocate of H
      >doesn't select that contradictory evidence as part of their data. You're
      >mixing up the universe of available evidence with the sub-universe of
      >evidence selected by an advocate of H. Probability within the latter is of
      >no interest.<<
      >It appears we agree what a possibility is, but with regard to invalidating a
      >hypothesis through the existence of a contradiction, we seem to be talking
      >past one another. When I say that a critic selects a subset of the total
      >data available to them, I am not suggesting it makes a difference on the
      >validity. I am saying that a critic would not intentionally create a
      >hypothesis that had an invalid premise, but perhaps could do
      >so if s/he did not recognize that a contradiction existed in the data they
      >did not select. Then another critic might say "Ahah! This evidence, which
      >you did not see the significance of, is indeed significant, and invalidates
      >one of your key premises."

      There are several types of logical entities here that need to be kept
      straight, and you name two of them: "data" and "premises."
      Premises often come in the form of (often untestable) assertions, about
      which there can be disagreement based on theology, philosophy, or
      epistemology. For example, I can maintain that human behavior is subject to
      certain laws, basing that premise on an epistemological belief rather than
      knowledge of any specific laws, and you can claim that my premise is
      nonsense. You might challenge me to name such a law (a favorite tactic in
      such debates), whereupon I will either decline to name one (which you will
      interpret as invalidating my premise) or I will name one and you will find
      fault with it. In neither case is my actual premise invalidated: whether
      such laws exist is an entirely different question from whether I can
      articulate any such law accurately.

      The "rules of evidence" cited by Hoover in The Five Gospels are a good
      illustration of how confusing this issue can become, because some of their
      "rules of evidence" are empirical generalizations, while others are
      premises that are based on certain theories of human existence that are not
      directly testable.

      > >>You seem to be trading on an ambiguity in the word 'select'. So that we
      >can be clear about this, let's define the universe of "immediately
      >available" evidence as that which has so far been unearthed and publicized.
      >This evidence has been "objectively selected", because no one person or
      >cohesive group of people has been solely responsible for digging it up and
      >presenting it. If the word 'selected' is at all applicable in this
      >situation, I think it IS correct to say that this evidence has been
      >"randomly selected", though not "scientifically selected", because the
      >immediately available evidence is not a scientific sample of the totally
      >available evidence (some of which is still buried).<<

      Before getting to David's rejoinder, I want to point to a problem with the
      phrase "objectively selected," which post-modernists have taught us to
      distrust. The situation Mike describes is more akin to "haphazardly
      selected;" bias can be due to all kinds of things other than conscious intent.

      >We would not be in agreement here. By "universe population" I am referring
      >to the technical term used in statistics to describe the total number of
      >some set of things. When we draw inferences from a set of evidence that is a
      >subset of all the evidence, we are trying to predict the proportion of some
      >characteristics of the universe population by means of the proportions
      >existing in a sample. The degree to which the sample proportion represents
      >the universe proportion becomes important when these generalizations are
      >used to support an inductive hypothesis.

      Yes, but not all hypotheses are inductive. And it is important to
      distinguish between the evidence from which a hypothesis is generalized and
      the evidence on which a hypothesis is tested. For example, it is a
      methodological tautology to generate a hypothesis inductively from a set of
      data, and then turn around and "test" the hypothesis on the same data. All
      that this "tests" is whether or not the induction was logically consistent.
      Any decent inductive hypothesis should be tested on *new* data that is
      independent of the data on which the induction was based.
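
      A minimal sketch of that distinction in Python, using invented numbers
      rather than any real evidence. The names (fit_slope, build, holdout) and
      the linear form of the hypothesis are assumptions made purely for
      illustration: the hypothesis is induced from one half of the
      observations and then checked against the half that was set aside.

```python
import random

def fit_slope(pairs):
    """Induce the hypothesis y = slope * x by least squares through the origin."""
    return sum(x * y for x, y in pairs) / sum(x * x for x, _ in pairs)

def mean_abs_error(pairs, slope):
    """Average distance between the observations and the hypothesis."""
    return sum(abs(y - slope * x) for x, y in pairs) / len(pairs)

# Invented observations: y is roughly 2 * x plus random noise.
noise = random.Random(1)
observations = [(x, 2.0 * x + noise.gauss(0, 1)) for x in range(1, 41)]

# Set half of the observations aside before generalizing from any of them.
shuffled = observations[:]
random.Random(0).shuffle(shuffled)
build, holdout = shuffled[:20], shuffled[20:]

slope = fit_slope(build)                    # hypothesis generalized from `build`
same_data = mean_abs_error(build, slope)    # only checks the induction itself
new_data = mean_abs_error(holdout, slope)   # the actual test, on independent data

print(f"induced slope: {slope:.3f}")
print(f"error on the data behind the hypothesis: {same_data:.3f}")
print(f"error on independent data: {new_data:.3f}")
```

      Only the last figure says anything about whether the hypothesis holds
      beyond the observations that suggested it.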

      >However, I believe I am right in
      >saying that surviving evidence (your "available evidence") is not the same
      >thing as a random sample of the universe population,

      I agree.

      >and so we can never be
      >sure how much non-random factors have skewed the evidence. But, even if
      >surviving evidence can be equated with a random sample, we cannot determine
      >the confidence level we can ascribe to that sample proportion without
      >knowing the total number of things being described.
      > >>I'm not sure what you mean by "statistical syllogism", but in any case, "H
      >is probable" cannot mean that H "can be confirmed" in any way whatsoever,
      >since to "confirm" H is to _prove_ H, and if H can be proven, then it isn't
      >merely "probable" - it's quite certain.<<
      >I confess, I confused that term, which is related to drawing inferences by
      >simple enumeration, with an inductive hypothesis, which is sometimes also
      >known as a statistical hypothesis. To be honest, I tend to have trouble
      >discriminating between the two.

      "H is probable" is not a useful construction. Any hypothesis is "probable"
      in the sense of having a probability between 0 and 100 percent. A more
      useful construction is something of the form "The probability of H is > x."
      But have we adequately covered the difference between a statistical
      hypothesis and a formal hypothesis?
      In a formal hypothesis, in my understanding, *any* contrary data
      invalidates the hypothesis.
      In a statistical hypothesis, apparent counter-examples are expected, but
      the hypothesis is considered supported if the probability of the hypothesis
      is greater than some previously established benchmark or confidence
      level. What type of hypothesis is relevant depends on the nature of the
      variables involved. If the key variable is a numerical variable (e.g.,
      percentage of the population who live in poverty, according to some agreed
      standard), then that variable can be expected to have a statistical
      distribution (e.g., "normal") such that one might expect a contrary
      result "by chance" a certain percentage of the time. To put it into the
      form of an historical hypothesis grabbed out of thin air for purposes of
      illustration, suppose the hypothesis is that "The number of crucifixions
      per unit of time is proportional to the percentage of the population living
      in poverty." In principle, one could divide the first century into 5-year
      periods, count the number of reported crucifixions per period, and estimate
      the percentage living in poverty by some ingenious method (e.g., studies of
      burials, Roman records, etc.), and then see if the resulting sample of 20
      supports the hypothesis. One would not be too surprised to encounter one
      period that departs from expectation if the other 19 line up as expected.
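
      To make the contrast concrete, here is a rough sketch in Python. Every
      number in it is invented for illustration (there is no such dataset),
      and the choice of Pearson's correlation as the measure of support is an
      assumption, not the only possibility. Nineteen periods follow the
      proportionality exactly and one departs from it.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical figures for twenty 5-year periods (not real data).
poverty_pct = [30 + i for i in range(20)]       # percent living in poverty
crucifixions = [2 * p for p in poverty_pct]     # exactly proportional so far
crucifixions[7] = 60                            # one period departs (74 expected)

# Formal reading: a single contrary period invalidates the hypothesis.
formal_holds = all(c == 2 * p for p, c in zip(poverty_pct, crucifixions))

# Statistical reading: how strong is the association across all 20 periods?
departures = sum(1 for p, c in zip(poverty_pct, crucifixions) if c != 2 * p)
r = pearson_r(poverty_pct, crucifixions)

print(f"formal hypothesis survives: {formal_holds}")
print(f"periods departing from expectation: {departures} of {len(poverty_pct)}")
print(f"correlation across periods: {r:.3f}")
```

      On these made-up figures the formal hypothesis fails outright, while the
      correlation remains strong, which is the sense in which one deviant
      period out of 20 need not trouble a statistical hypothesis.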

      > >>Neither of these conjuncts strikes me as correct, as stated. As for "not
      >randomly selected", see earlier discussion. We aren't talking about the
      >evidence selected by an individual, but about the total evidence immediately
      >and publicly available. Surely THAT has been "randomly selected".

      Haphazard is NOT the same as random!
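
      A small simulation may help show why. This is only a sketch under
      stated assumptions: the universe of 10,000 items, the "durable" trait,
      and the survival probabilities are all invented. Nobody in the
      simulation selects anything on purpose, yet the surviving evidence
      misrepresents the universe, while a true random sample does not.

```python
import random

rng = random.Random(42)

# Invented universe: 10,000 items, 30% of them on a durable medium.
universe = [True] * 3000 + [False] * 7000   # True = durable

# Random sample: every item has the same chance of being drawn.
random_sample = rng.sample(universe, 500)

# Haphazard survival: no one chooses, but durable items survive far more often.
SURVIVAL = {True: 0.50, False: 0.05}        # assumed survival probabilities
survivors = [item for item in universe if rng.random() < SURVIVAL[item]]

def durable_share(items):
    """Fraction of the items that are durable."""
    return sum(items) / len(items)

print(f"durable share of the universe:      {durable_share(universe):.2f}")
print(f"durable share of the random sample: {durable_share(random_sample):.2f}")
print(f"durable share of the survivors:     {durable_share(survivors):.2f}")
```

      The random sample stays close to the true 30 percent; the survivors
      greatly overstate it, without any conscious intent behind the bias.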

      Dave concluded:

      >Now you can see why I need to give a lot of thought to this issue. Keeping
      >the terms straight alone will drive many a sane person mad. ...

      Nevertheless, it is important to think such things through, and to consult
      the experts who have puzzled on the highest levels about such matters, to
      avoid re-inventing the wheel. We have visited some of the places on this
      route before, and the issue is not one that has been settled by anyone, so
      it is important to keep thinking about it.

      Thanks to both of you!
