- A grad student asked me the following question about how to measure running times.
A paper in POPL 08 claims to have broken the long-standing n

I remember complaining as a student that sorting actually takes O(n log^{3}barrier on a problem in programming languages, bringing it down to n^{3}/log n. The algorithm has a lot of details, but the main pillar it stands on is the so-called 'fast set' data structure.This data structure is used to represent sets as bitmaps of its elements (so a set {1,2} over the domain 0-7 is represented as 01100000), and supports three operations, where one is important for my question: set difference. Set difference can be performed by going through the bits of each set. Here's the claim I have a question about: The author says, assuming an architecture with word size Θ(log n) bits, we can split the bitmaps of each set into (n/log n) chunks of size (log n) bits each, and since the word size is Θ(log n), these chunks will be stored in a word each, and operating on each word is a constant. Hence, set difference can be performed in Θ(n/log n) time. This is the whole gist of the algorithm to break the previous known barrier.

I was appalled by this argument, since my primitive techniques would easily 'prove' that set difference has a lower bound Ω(n), since you have to 'touch' each element of the set. I contacted the author, and he said assuming this word size is standard in algorithms literature, such as saying that depth-first search takes O(e+v) time ignores the log v time needed to spend to retrieve each vertex.

I talked to a few people in my department with varied responses, some said it is cheating since algorithms such as DFS do not depend on the logarithmic word size, it is just ignored; and some said it is probably a good argument.

The simple question is, what do you think as a complexity researcher? Is the complexity given a valid one? I am worried that you will say no, because a lot of people are citing this paper and everybody thinks the n

^{3}barrier for that problem has been broken. I am also worried that you will say yes, as a person not particularly in the complexity field, but thinks he has a decent knowledge and sense of algorithms.^{2}n) time since one needs O(log n) time to do a comparison but my complaints fell on deaf ears.In algorithms you use the unit-cost RAM model where basic register operations over O(log n) bit registers count as a single computation step. There are some good arguments for this: As technology improves for us to handle larger input sizes, the size of the registers tend to increase as well. For example, registers have grown from 8 to 64 bits on microprocessors over the past few decades.

You can do set difference of two register bitmaps A and B by the bitwise AND of A with the bitwise complement of B, all standard register operations. So the author has a legitimate O(n

^{3}/log n) time algorithm in the standard algorithmic unit-cost model.I understand the student's frustration. The algorithm seems like a cheat and would not be in DTIME(O(n

^{3}/log n)) on a multi-tape Turing machine as I teach it in an introductory complexity course. But as an algorithmic result on the standard algorithmic model, the author stands on solid ground.

--

Posted By Lance to Computational Complexity at 5/13/2009 07:45:00 AM