Birthday problem

The birthday paradox is the counterintuitive fact that only 23 people are needed for that probability to exceed 50%.

[3] From a permutations perspective, let the event A be the probability of finding a group of 23 people without any repeated birthdays.

, without repetitions and order matters (e.g. for a group of 2 people, mm/dd birthday format, one possible outcome is

For simplicity, leap years, twins, selection bias, and seasonal and weekly variations in birth rates[4] are generally disregarded, and instead it is assumed that there are 365 possible birthdays, and that each person's birthday is equally likely to be any of these days, independent of the other people in the group.

Therefore, its probability p(n) is The following table shows the probability for some other values of n (for this table, the existence of leap years is ignored, and each birthday is assumed to be equally likely): The Taylor series expansion of the exponential function (the constant e ≈ 2.718281828) provides a first-order approximation for ex for

Being independent would be equivalent to picking with replacement, any pair of people in the world, not just in a room.

A good rule of thumb which can be used for mental calculation is the relation which can also be written as which works well for probabilities less than or equal to ⁠1/2⁠.

For instance, to estimate the number of people required for a ⁠1/2⁠ chance of a shared birthday, we get Which is not too far from the correct answer of 23.

For comparison, 10−18 to 10−15 is the uncorrectable bit error rate of a typical hard disk.

[9] In theory, 128-bit hash functions, such as MD5, should stay within that range until about 8.2×1011 documents, even if its possible outputs are many more.

Incidentally, solving n2 − n = 730 ln 2 for n gives the approximate formula of Frank H. Mathis cited above.

This derivation only shows that at most 23 people are needed to ensure the chances of a birthday match are at least even; it leaves open the possibility that n is 22 or less could also work.

In other words, n(d) is the minimal integer n such that The classical birthday problem thus corresponds to determining n(365).

[10] For any d ≥ 1, the number n(d) satisfies[11] These bounds are optimal in the sense that the sequence n(d) − √2d ln 2 gets arbitrarily close to while it has as its maximum, taken for d = 43.

The theory behind the birthday problem was used by Zoe Schnabel[18] under the name of capture-recapture statistics to estimate the size of fish population in lakes.

[20] In the simplest extension there are two types of people, say m men and n women, and the problem becomes characterizing the probability of a shared birthday between at least one man and one woman.

This variation of the birthday problem is interesting because there is not a unique solution for the total number of people m + n. For example, the usual 50% probability value is realized for both a 32-member group of 16 men and 16 women and a 49-member group of 43 women and 6 men.

This number is significantly higher than ⁠365/2⁠ = 182.5: the reason is that it is likely that there are some birthday matches among the other people in the room.

The expected number of people needed until every birthday is achieved is called the Coupon collector's problem.

The probability that the kth integer randomly chosen from [1,d] will repeat at least one previous choice equals q(k − 1; d) above.

If we consider the probability function Pr[n people have at least one shared birthday], this average is determining the mean of the distribution, as opposed to the customary formulation, which asks for the median.

The problem is relevant to several hashing algorithms analyzed by Donald Knuth in his book The Art of Computer Programming.

It may be shown[23][24] that if one samples uniformly, with replacement, from a population of size M, the number of trials required for the first repeated sampling of some individual has expected value n = 1 + Q(M), where The function has been studied by Srinivasa Ramanujan and has asymptotic expansion: With M = 365 days in a year, the average number of people required to find a pair with the same birthday is n = 1 + Q(M) ≈ 24.61659, somewhat more than 23, the number required for a 50% chance.

[25] For each pair (i, j) for k people in a room, we define the indicator random variable Xij, for

[27] Further results showed that psychology students and women did better on the task than casino visitors/personnel or men, but were less confident about their estimates.

The question is whether one can usually (that is, with probability close to 1) transfer the weights between the left and right arms to balance the scale.

[citation needed] The reason is that the correct comparison is to the number of partitions of the weights into left and right.

As stated by a physicist passenger: "If you have a group of more than twenty-four people, the odds are better than even that two of them have the same birthday."

The birthday problem used to be a splendid illustration of the advantages of pure thought over mechanical manipulation; the inequalities can be obtained in a minute or two, whereas the multiplications would take much longer, and be much more subject to error, whether the instrument is a pencil or an old-fashioned desk computer.

What calculators do not yield is understanding, or mathematical facility, or a solid basis for more advanced, generalized theories.