Sometimes the literature gives a proof, and then a reference,
but the reference is not really to that proof.
Here is an example.
The theorem in question is the following:
For any seq of n^{2}+1 distinct reals there is
either a dec subseq of length n
or an inc subseq of length n+1.
In
Proofs from the book
the authors attribute the following argument to
ErdosSzekeres
(paraphrased):
Let the seq be a(1),...,a(n^{2}+1).
Associate to each a(i) the number t(i)
which is the length of a longest inc subseq
starting at a(i).
If there is an i such that t(i) &ge n+1 then
we are done.
If not then the function mapping a(i) to t(i)
maps a set of size n^{2}+1 to a set of size n.
Hence there is some s such that n+1 elements of the seq
map to s. These n+1 elements form a dec seq.
The actual proof in the ErdosSzekeres paper is this (paraphased):
Let f(n) be the minimum number of points out of which we can
select n inc or dec subseq. We show
f(n+1) = f(n)+2n1.
Let a(1),...,a(f(n)+2n1) be a sequence of distinct reals.
Out of the first f(n) of them there is an inc
or dec subseq of length n. Call the last elements of that subseq b(1).
Remove it. (There is now a seq of length f(n)+2n2.)
Repeat the process to obtain b(1), b(2), ..., b(2n).
Let A be the b(i)'s that are from inc subseq.
Let B be the b(i)'s that are from dec subseq.
It is easy to show that A is itself a dec subseq
and that B is itself an inc subseq.
If A or B has n+1 elements then we are done.
The only case left is that A and B each have n elements.
Let a be the last element in A and b be the last element in b.
It is easy to show that a=b. But one of them was removed
before the other, so this is impossible.
Thoughts:
I suspect that in talks Erdos gave he did the proof now atributed
to ErdosSzekeres and hence people naturally assumed that this
is the paper it was in.

I am surprised that Proofs from the book gave this proof.
There is, IMHO, a better proof. I do not know whose it is.
Let the seq be a(1),...,a(n^{2}+1).
Map every 1 \le i \le n^{2}+1 to (x,y)
where x (y) is the length of the longest inc (dec) subseq
than ends with a(i). It is easy to show that this
map is 11. If all of the x,y that are mapped to
are in [n]x[n] then the domain is bigger than the range,
contradiction.

ErdosSzekeres first proved this theorem in 1935.
Martin and Joseph Kruskal proved it 15 years later
without knowing that ErdosSzekeres had proven it; though
by the time Joesph Kruskal published it he knew.
I have gathered up 5 proofs of the theorem that I know
here.

In those days it was harder to find out if someone else
had done what you had done since they didn't have google,
but it may have (in some cases) been easier
since there were so many fewer researchers you could
just call them. OH but long distance was expensive
back then.

Posted By GASARCH to Computational Complexity at 11/14/2008 01:51:00 PM