[Computational Complexity] Finding Duplicates
Here is an interesting problem given by Muthukrishnan during his talk in the New Horizons workshop.
Start with an array A of n+1 entries each consisting of an integer between 1 and n. By the pigeonhole principle there must be some i≠j and a w such that A(i)=A(j)=w. The goal is to find w. Depending on A there may be several such w, we want to find any one of them. The catch is that you only get to use O(log n) bits of memory.
First a warm-up puzzle: Find w using only O(n) queries to entries of A (remember you only get O(log n) space). Hint: Use pointer chasing.
Now suppose A is streamed, that is we get in order A(1),A(2),…,A(n+1) and then get another pass etc. How many passes do you need to find a w?
You can find w in n+1 passes just by trying each possible value for w. With a little work you can use O(log n) passes doing a binary search.
Muthukrishnan asks whether the number of passes needed is Ω(log n) or O(1) or something in between.
Posted by Lance to Computational Complexity at 3/4/2005 03:06:00 AM