Monday, August 1, 2011

In Quest of the Best !!


Jerry, a cute looking mouse, extremely intelligent(Stupid at times !! ) is eating cheese.. He hears some crackling noise ou

t and goes out to check. He reads a note that says,
“ Dear Jerry,
Please take care of Tuffy over the Thanksgiving holiday. P.S : he loves to eat! “ :P

Jerry being a little lazy frowns over the idea of spending a nice holiday with Tuffy. A little later, “A Nanny!!” ,he exclaims to himself and thinks “I can always hire a nanny as long as I have enough cheese to pay her ;)”. But Jerry is a perfectionist. He wants to hire ‘The Best’ nanny for his dear nephew and was quite excited about the idea. But on what basis would he know who the Best was? He had an answer for that too. He said “Whoever has scored the highest marks in ‘Mouse-keeping’ school will acquire this post”. By the way, you have to know this, Jerry loves bossing around. Ah...ha…you guessed it right!! That is another reason why he wanted to ‘hire’ someone :P

Much to Jerry’s delight, he saw a bunch of them waiting for the post. Soon he murmured to himself, “Dude it is not going to be easy to choose the best”. He planned to do the following:

(Are you wondering "Is Jerry not smart enough to find out the highest-scorer among the lot?". Well, let me tell you. Jerry has no clue about anyone's marks. Only when a person enters the room for interview process, he gets to know the score and either he has to be hired or fired with no other option left. Also assuming that ‘Mouse-keeping’ school doesn’t give out the highest –scorer)

Ask them to maintain a queue and let everyone in one by one in the order A,B,C and so on.
Blindly hire ‘A ‘and ‘pay’ some cheese.
Look at ‘B’. In case if ‘B’ has a score better than the previously chosen one i.e. ‘A’, then he fires ‘A’ (along with which, the paid cheese is gone :P) and hires ‘B’. Not to forget, he has to pay cheese to ‘B’ for hiring.

He plans to do this until he interviews everyone in order to find the best among the lot.

“Oh dear God!! Won’t I run out of cheese if I do that? What if I feel restless in the middle of the whole process?? ” . Yes…finally...he realizes how much he loves cheese and how lazy he is!! So he contacts his good old mathematician friend Mr.X to help him out with this problem.
Mr.X thinks over the problem a little and speaks to Jerry. The conversation goes like this.

Mr. X : I can help you if you are ready to compromise a little.
Jerry : Oh ! I don’t mind as long as it finds me The Best nanny and I get a larger share of the cheese.
Mr.X : Ok I can’t promise you that I will find the best Nanny, but I promise you to save large amount of your cheese and get you a Nanny closest to the Best.
Jerry : Nice. Alright then, give out the solution.
Mr.X : Given members, find the best among 1 to by hiring and firing process. Let us call her . Hire $\beta$. Find the first person among $k+1$ to $n$ who beats $\beta$ with her score, say $B1$. There you are!! She is the nanny you want to hire.
Jerry (with a weird look) : Really?? You think this is gonna work?? And how do I know $k$?
Mr.X : Well ,$ k$ = number of people attending the interview($n$) /exponential constant ($e$)
Jerry : Oh great !! I will try that out.
Hurray !! he verifies that the method actually works !! He hires $B1$ and she really does take very well care of tuffy .

As usual, happy Ending of the story. :) :) :)

But wait a minute, what exactly happened behind the scene? How was he so sure that this will work out? Any miracle? By the way,Who is this Mr.X?
Mr.X is none other than the ‘Probability’ !!!!!

Let us see how probability finds us a solution for this !! Summarizing the method :

1. A $k$ is found. This $k$ is the point at which you can AFFORD to feel restless.
2. What is the use of this $k$?
Hire the best person $\beta$ in the range 1 to $k$ by ‘hiring and firing’ process. Find the person in the range $k+1$ to $n$ who beats $\beta$. Let us call it $B1$. There are high chances of $B1$ being the Actual best. If this works, then u save a lot of money, time and yet you get a best or closest to best. But how?? Perplexed??

Points to be noted is that ‘best’ i.e. $Bi$ has to occur after $k$ and most of the times it does !! So proceeding with this is worth a try. But the question is how to find a $k$ so that this method works?

$Bi$ : Probability of best occurring in ith p
osition i.e. $\frac{1}{n}$
$Wi$ : Probability of $\beta$ occurring in right position $\frac{k}{i}$
$\beta$ can occur in any one of the k positions. k has to occur before i.

Hence for the method to be a success, both these conditions needs to be satisfied. As both the events are independent, the probabilities have to be multiplied. Hence, by the product rule, the probability of this happening is $B_i.W_i$

$i$ has to occur after $k$, i.e. anywhere between $k+1$ to $n$

$\sum_{i=k+1}^{n}\; W_i B_i$

$\sum_{i=k+1}^{n}\frac{k}{i-1} \frac{1}{n}$

$\frac{k}{n}\sum_{i=k}^{n}\; \frac{1}{i}$


We know tat ,
$\ln n= 1\: + \frac{1}{2}\: + \frac{1}{3}\: + \frac{1}{4}+\cdots +\frac{1}{n}$

$f(k) = \frac{k}{n}\;\left ( \ln n - ln k \right )$



(Click on the image to get a clear view)

A tangent is drawn at the peak. As the slope is 0, f'(k) is equated to 0 in order to find the value of k.

f '(k) = 0

f '(k)=$ \frac{ln n}{n} -\frac{1}{n}\;\left ( k.\:\frac{1}{k}\;+\; ln\:k \right )$

$\frac{ln n}{n} - \frac{1}{n} - \frac{ln k}{n} = 0$
$\frac{1}{n} \left(ln \frac{n}{k} - 1 \right) = 0$
$\left(ln \frac{n}{k} - 1 \right) = 0 $
$ln \frac{n}{k} = 1$
$\frac{n}{k} = e $
$k = \frac{n}{e}$

Wow is that it?? At $k = \frac{n}{e}$, there is a high probability of beta being very close to ‘Best’ which is our aim. You can afford to stop your interviewing process at ‘k’ and also not compromise with your principle of finding the ‘Best’.
If there are n candidates for interview, you can stop the process at n/e and still find the Best with high probability.


Not convinced?? Let us work out the problem given :
(A python generated output :P you can trust this :P )
Work out this on your own and tally the answers given. This way, you will get more acquainted with the method.
Let us say there are 100 candidates for the interview (assuming that there is only one ‘Best’). Their marks are as follows.(These marks are 'random'ly generated by python)

Lets follow Jerry's method i.e. hiring and firing process...That way, the total number of hiring is 5. But once you reach '100' you dunno if that is the best or not. You end up interviewing everyone with the hope that you can find someone better than the one you have already found.
The main aim of the method proposed is to avoid the interviewing process for everyone and yet find a best one. Basically what it means to tell is that you can afford to get restless at $k$
According to the method proposed,

1.K=n/e =100/2.714 =36 (represented by the red line)
2.Number of hirings before k = 4
3.The Obtained Best within k = $\beta$ = 98
4. The ‘Best’ chosen after k = 100 !!!! Which is the actual Best !!

*************************************************************************************

Consider this case where you a $B1$ closest to $Bi$
1. K=n/e =100/2.714 =36 (represented by the red line)
2. Number of hirings before k = 6
3. The Obtained Best within k = $\beta$ = 91
4. The ‘Best’ chosen after k = 98 (represented with yellow)!!!! Which is closest to the actual Best (represented with yellow)!!

*************************************************************************************
Worst Case: If the Actual Best occurs within the range 1 to k, then the method fails and you
1. K=n/e =100/2.714 =36 (represented by the red line)
2. Number of hirings before k = 6
3. The Obtained Best within k = $\beta$ = 100 (Disaster :P)
4. The ‘Best’ chosen after k = 17 (represented with blue)!!!! If none of the values beat $\beta$ then by default the last value is chosen to show that you have traversed the complete list.

*************************************************************************************

Probability at its 'Best' ;) Hope you have your eyebrows raised ;) If not, you need to go through it once again ;)

FYI : The class of algorithms where input is fed in serial fashion and you don't have the prior knowledge about the entire input at the time of processing are called Online Algorithms. Also, this problem is the same as 'The Secretary Problem'. You can google and take a look at it :) :)

Shruthi R Nayak



18 comments:

  1. This comment has been removed by the author.

    ReplyDelete
  2. Very well written.Starting with jerry, i was wondering how can anyone possibly link it to the online hiring problem. Artistically done.The beta and B1 quantities are clearly explained. Brilliant title!
    You could have shortened it a little bit. But that's not a problem."In the quest of Best!", is altogether an amazing job!

    ReplyDelete
  3. Creativity at the best!.projecting the online hiring problem this way was awesome idea:)
    Hey wont the reader get a doubt that when there is already a queue formed isnt it trivial to see who is the highest and pick them?I think just a assumption is to be added that
    they cant be seen all at once and as one person enters either he has to be hired or rejected with no other option :)"Mr.X is none other than the ‘Probability’ !!!!!" is too good:)hey rem the confusion that occurred trying to explain how Wi is K/i to other interns?May be a little explanation there would be apt:)The python generated outputs are well done :)Overall a very good job:)keep it up goh!;):)

    ReplyDelete
  4. i love cartoon stories..capturing the minds of ppl and explain the problem is fantastically done :).the question is made v clear .
    Coming to solution part introduction to MR.X (probability) is done v well.i guess the product of two probabilities need to be explained.as they both the events independent in nature occur simultaneously so we have to multiply it :). The graph suggests a good means of why f'(k) = 0 .Both the best and worst cases are explained properly.
    well done shruthi.Keep it up.:)
    "in the quest of best" is the best .

    ReplyDelete
  5. First of all, using Jerry and Tuffy for online hiring problem..Just wow!!..:):)Mouse keeping school..Creative mind!!:):)
    U hired them to act in this episode eh?..;):)
    Cheese was a gud substitute for cost..:)Problem statement very smartly introduced thru the conversations between Jerry and probability..:):)
    conditions nicely explained and k neatly derived..:)A graph too..
    Using Python for examples great..:)examples prove what u explained above..
    superb blog,superb title..:)

    ReplyDelete
  6. First of all, awed by the title. :)
    Mr. X says "Ok I can’t promise you that I will find the best Nanny, but I promise you to save large amount of your cheese and get you a Nanny closest to the Best.". I agree with the part about getting the best or closest-to-the-best nanny. But I cant seem to figure out how a large amount of cheese is being saved.

    Its just plane amazing to note that
    ln(x) = 1 + 1/2 + 1/3 + ... + 1/x (correction, approximately equal to)
    But how do we get such a relation? Here's the answer: I'm pretty sure that we all have mugged up the derivative of ln(x) as 1/x. Based on that assumption, we were also taught that integral of 1/x is ln(x). But what does this integral physically signify?

    Remember the good old days when we used to find the area of a rectangle way back in middle school? Thats exactly what an integral is. Area under the curve. In this case, the curve just happens to be 1/x. Integral of 1/n gives the area under the curve y = 1/x starting from x = 1 to x = n. By a quick look at the plot of y = 1/x, we see that the this integral can be approximated as 1 + 1/2 + 1/3 + ... + 1/n.
    And voila..
    ln(n) = 1 + 1/2 + 1/3 + ... + 1/n

    ReplyDelete
  7. MIND BLOWING...

    excellent article.

    especially liked the part where k is proved to be n/e.

    using maxima to find the value of k was clever.
    and the analysis was also well done. probability can go horribly wrong if initial assumptions are made incorrectly.

    the story is INGENIOUS. well done.
    it is often apparent to the inerudite that probability is just a complicated way of stating obvious truths....but this article proves the power of probability quite emphatically.

    ReplyDelete
  8. @nikitha- Nice catch :) i resolved the ambiguity :) Just go through the updated version :)
    @vijay - Thank u :) will add it :)
    @vijesh - Beautiful :) thank u for the derivation :)
    @shruti and shobitha - Thanks a lot :)

    ReplyDelete
  9. @Chetan.S.Kumar - Thank u so much :) And yes "Probability is just a complicated way of telling obvious truths"...Well said :) :)

    ReplyDelete
  10. Superb work shruthi.....
    Very simple and easy to understand......
    GREAT WORK :)

    ReplyDelete
  11. The blog is simply wonderful :)
    Tom and Jerry was my favorite cartoon in my childhood.
    Its always better when things are explained as a story rather than simply explaining how the algorithm works, people will be more interested to read.
    Things have been put well in creative way. Liked the part about Mr.X .
    The python generated code part explains the whole thing clearly.
    loved it :)

    ReplyDelete
  12. good job, you have managed to make it a fun little story. I will try to remember this one.. :)

    ReplyDelete
  13. concept well explained using jerry...
    the probability concept for a first timer could have been narrated more impressively in the Wi part... I'm quoting this because ... a cartoon analogy for d need for multiplying the two events... would have been too good :D

    for an above average... article is just Simple n Super

    ReplyDelete
  14. good one ,well explained and its very impressive, I liked it

    ReplyDelete
  15. Well i standby all the above comments :) Nice, creative way of beginning :) but wat i felt necessary was, the 'Hiring Firing' method shud be carried out 1st for randomly chosen 100 cadidates, n then with ur 'Efficient Algo' method for th same 100 cadidates.... Compare both, n later u can actually tell tht ur 'Efficient Algo' method is Better :)
    Hope u understood th reason behind this procedure :)

    ReplyDelete
  16. @preetam - Necessary changes has been made..Go through :) Thanks for the input :)

    ReplyDelete
  17. And one other case, where i felt 'Impractical', as far as ur xplanation is concerned, is- "The Worst case..!!! "
    U do not end up hiring a candidate with 17 marks,("The 'Best' chosen by default" as u have mentioned), cos u already have Hired th candidate with 100 marks..!!
    But i agree tht ur Efficient Algo Method fails to prove its 'Efficiency' in this particular case, but dont forget, our Purpose is Served :)

    ReplyDelete
  18. Agreed that the purpose is served..But what matters is how the purpose is served :) The Efficient algo ends up in the mere hiring and firing process in the worst case :) The last value is highlighted to show that you have traversed the complete list and there are no more inputs to be processed :) Will try to convey this clearly :)

    ReplyDelete