Thursday, June 16, 2011

Collect them all!


Coupon Collector Problem
Have you ever heard of ads that say "Collect All 27" or "Collect All 50" or "Collect All Whatever"?? I'm pretty sure that almost all of you had a craze to collect all of those 'stuff' in your childhood days. But have you ever wondered how many items that you'd have to buy the item to collect all those 'stuff'? Thats the Coupon Collector Problem.

What is $e$?
Lets start answering this question by a popular joke.
Once $e^x$ and $logx$ were walking down the street. From the other end, they saw a differential walking towards them. $logx$reeched and started running away from the differential. But $e^x$ stood tall and said 'Haha, you are no match for me. No matter what you try to do, you cannot cause me any trouble!'. The differential replied, "I'm $\frac{d}{dy}$!".
But what is this $e$? $e$ is a convergent infinite series. By 'convergent', we mean that the infinite series tends to a particular value. $e$ is given by:
$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + ...$
Clearly, we can see that this value is greater than 2.
We know that $\frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \frac{1}{2^4} + ... = 1.$ Compare this series with $(\frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + ...)$. The latter must be lesser than the former. Hence $(\frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + ...) < 1$. Therefore, $e$ must be a value between 2 and 3. That is, $2 < e < 3$. By actual computation (upto a certain limit ofcourse), we find the value of $e$ happens to be 2.7182818284590451. Now thats $e$!

What has $e$ got to do with the coupon collector problem?
You'll see.

Consider a group of $n$ people. Lets say you know just 1 among them. If a person is picked randomly, the probability that the chosen one happens to be the person you know is $\frac{1}{n}$. But how many trials should you take until the randomly chosen person is the one you know? Its $n$.
Lets say you know 2 among the group. Now the the probability shifts to $\frac{2}{n}$. In this case, you'll have to take $\frac{n}{2}$ trials. Now we see a connection here, if the probabilty of of an event occuring is $k$, you'll have to perform $\frac{1}{k}$ trials to see the event happen.

Now that we're done with the prerequisites, lets start off with the actual problem.
Lets say you get a coupon for free whenever you buy an item. Let there be $n$ such coupons. You'll have to collect all the coupons.
At the start, you have no coupons with you. Any item bought will add a new coupon to your collection.
The second time you buy an item, the probability of you getting the same old coupon is $\frac{1}{n}$. But you need a new coupon, not the ones already collected. The probability of getting a new coupon is $1 - \frac{1}{n}$, that is $\frac{n-1}{n}$. Hence you'll have to make $\frac{n}{n-1}$ trials.
The third time you buy an item, the probability of you getting a new coupon is $1 - \frac{2}{n}$, that is $\frac{n-2}{n}$. This is because you have already collected 2 different coupons. To get the third coupon, you'll have to buy $\frac{n}{n-2}$ items.
Similarly, to get the $K$th coupon, you'll have to buy $\frac{n}{n-(k-1)}$. Summing this over the range $[1,n]$, we get the following:
Number of items purchased =
$\sum_{k=1}^{n}\frac{n}{n-(k-1)}$
=$1+\frac{n}{n-1}+\frac{n}{n-2}+\frac{n}{n-3}+\frac{n}{n-4}+...$
=$n*(\frac{1}{n}+\frac{1}{n-1}+\frac{1}{n-2}+\frac{1}{n-3}+\frac{1}{n-4}+...)$
=$n*\log_e n$

Could find out where $e$ popped into the picture? Its hidden in the last step of the proof. Thats right, $\log_e n$.






Blog Authors:





Vijay


Vijesh

1 comment:

  1. vijay and vijesh,
    great article.

    but some stuff was a little unclear to me on the first read.

    specifically,your colloquial definition of the coupon collector problem. i couldn't clearly understand the stuff about "collect everything". that part could use a little more explanation.

    other than that..
    it was just delicious..

    the math itself was simple and enjoyable.

    appreciated,
    chetan.

    ReplyDelete