Loading ...
Sorry, an error occurred while loading the content.

[Computational Complexity] Birthday Paradox Variance

Expand Messages
  • Lance
    First a message from David Johnson for proposals on locations for SODA 2012 both in and outside the US. Here s an interesting approach to the birthday paradox
    Message 1 of 1 , Nov 25, 2009
    • 0 Attachment
      First a message from David Johnson for proposals on locations for SODA 2012 both in and outside the US.

      Here's an interesting approach to the birthday paradox using variances.

      Suppose we have m people who have birthdays spread uniformly over n days. We want to bound m such that the probability that there are are at least two people with the same birthay is about one-half.

      For 1 ≤ i < j ≤ m, let Ai,j be a random variable taking value 1 if the ith and jth person share the same birthday and zero otherwise. Let A be the sum of the Ai,j. At least two people have the same birthday if A ≥ 1.

      E(Ai,j) = 1/n so by linearity of expectations, E(A) = m(m-1)/2n. By Markov's inequality, Prob(A &ge 1) ≤ E(A) so if m(m-1)/2n ≤ 1/2 (approximately m ≤ n1/2), the probability that two people have the same birthday is less than 1/2.

      How about the other direction? For that we need to compute the variance. Var(Ai,j) = E(Ai,j2)-E2(Ai,j) = 1/n-1/n2 = (n-1)/n2.

      Ai,j and Au,v are independent random variables, obvious if {i,j}∩{u,v} = ∅ but still true even if they share an index: Prob(Ai,jAi,v = 1) = Prob(The ith, jith and vth person all share the same birthday) = 1/n2 = Prob(Ai,j=1)Prob(Ai,v=1).

      The variance of a sum of pairwise independent random variables is the sum of the variances so we have Var(A) = m(m-1)(n-1)/2n2.

      Since A is integral we have
      Prob(A &ge 1) = Prob(A > 0) ≤ Prob(|A-E(A)| > E(A)) ≤ Var(A)/E2(A) by Chebyshev's inequality. After simplifying we get Prob(A &ge 1) ≤ 2(n-1)/(m(m-1)) or approximately 2n/m2. Setting that equal to 1/2 says that if m ≥ 2n1/2 the probability that everyone has different birthdays is at most 1/2.

      If m is the value that gives probability one-half that we have at least two people with the same birthday, we get n1/2 ≤ m ≤ 2n1/2, a factor of 2 difference. Not bad for a simple variance calculation.

      Plugging in n = 365 into the exact formulas we get 19.612 ≤ m ≤ 38.661 where the real answer is about m = 23.

      Enjoy the Thanksgiving holiday. We'll be back on Monday.

      --
      Posted By Lance to Computational Complexity at 11/25/2009 05:31:00 AM

    Your message has been successfully submitted and would be delivered to recipients shortly.