Wednesday, November 28, 2012

Interval Scheduling

By: Abhishek Arora


Disclaimer:

         All characters appearing in this article are fictitious. Any resemblance to real persons, living or dead, is purely coincidental.

Abstract:
 
A long time ago, a saint was born on this earth. Lord Shiva gifted him enormous powers in return for his prayers.

One day, on seeing the grieving family of a dead man, the saint grew very angry with death itself. He could not bear the pain and sorrow that death caused, so with his powers he cast a protective layer over the earth so that no death reaper could enter it and take a life.

This caused a great imbalance in the functioning of the earth, and all the gods grew worried, especially Yamraj, whose duty it was to keep the proper balance of life and death. The gods gathered together and requested the saint to remove the protective layer, but he did not agree. He did, however, allow Yamraj to send one death reaper to earth on every full-moon night to take lives and carry souls to the afterlife, one by one.


One night was a very short time in which to carry all the souls to the afterlife. Yamraj faced a dilemma: how should he schedule the night so as to carry the maximum number of souls, given that he could send only one death reaper to earth? The death time of every earthling is written in his book of life, and the time needed to take a soul depends on how far the person's place of death is from Yamraj's abode. The reaper carries souls to Yamraj one by one.

He had no solution to this problem. Goddess Saraswati is the goddess of knowledge, and Yamraj was sure she would have an answer, so he decided to seek her help.


Goddess Saraswati, being the goddess of knowledge, knew how to handle such problems. She picked up a stone and wrote the procedure on it, telling Yamraj that this procedure would one day be known as Interval Scheduling.



Interval Scheduling
 
Given a set of tasks, each with an arrival time and a running time, we need to schedule as many of them as possible within a given time period.

Let p1, p2, p3, …, pn be the tasks to be scheduled.
Let s1, s2, s3, …, sn be the starting times of the tasks.
Let r1, r2, r3, …, rn be the time required by each task, respectively.


Algorithm:

• Sort the intervals by their right-most ends (finishing times).
• For every task in this order:
  • If its left-most end (starting time) is at or after the right-most end of the last selected interval, select it.
  • Otherwise, skip it and move to the next interval.

This is a greedy approach.
Consider the following set of activities, represented graphically in non-decreasing order of finishing times:




Using the greedy strategy, an optimal solution is {1, 4, 8, 11}.
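The greedy procedure above can be sketched in a few lines of Python. The task list here is illustrative, not the one from the figure:

```python
# A minimal sketch of the greedy rule: sort by finishing time, then
# keep every task that starts at or after the last selected finish.
def interval_schedule(tasks):
    """tasks: list of (start, finish) pairs. Returns a maximum-size
    subset of mutually non-overlapping tasks."""
    selected = []
    last_finish = float("-inf")
    for start, finish in sorted(tasks, key=lambda t: t[1]):
        if start >= last_finish:
            selected.append((start, finish))
            last_finish = finish
    return selected

tasks = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 8), (5, 9), (6, 10), (8, 11)]
print(interval_schedule(tasks))  # [(1, 4), (5, 7), (8, 11)]
```

Because the list is scanned once after sorting, the whole procedure runs in O(n log n) time.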









Scenario of Yamraj:

But Yamraj could not understand a single word written above.
Then Goddess Saraswati smiled and said to Yamraj:
  • Taking a soul to the afterlife is a task for you.
  • The death time of a person is the starting time of your task.
  • The time to reach the dying person and carry his soul to your place is the time required to complete that task.
This approach worked for Yamraj, and he used it to bring souls to the afterlife on every full-moon night for many years.

But as the years passed, Yamraj noticed that sin and violence on earth had increased greatly, and he could do nothing about it. He had just one night to cause deaths on earth, and nothing ensured that he was picking the bad ones.

This worried Yamraj. He wanted to do something about the increasing sin and adharma on earth, but he did not know how to pick the maximum number of bad people in his given time slot. So he again went to Goddess Saraswati and asked for her help.

Goddess Saraswati smiled and asked Yamraj, “How will you decide who is bad and who is good?”

Yamraj answered, “Devi, we keep a count of all the deeds of humans. If a human sins throughout his life, he is punished in the afterlife.”

Goddess Saraswati smiled and said, “I can help you with this, but it does not guarantee that you will bring the maximum number of souls to the afterlife.”

This put Yamraj into deep thought. He wanted to take as many lives as possible on every full-moon night, but he also wanted to act against the increasing sin and adharma on earth. Finally he said, “Tell me the way to pick as many bad people as possible.”
 
Goddess Saraswati again picked up a stone and wrote the procedure on it, telling Yamraj that this procedure, solved by dynamic programming, would one day be known as Weighted Interval Scheduling.




Weighted Interval Scheduling

Suppose we are given n jobs. Each job i has a start time si , a finish time fi , and a weight wi .

We would like to find a set M of optimal jobs whose total weight is maximized.

Let us assume that our n jobs are sorted by non-decreasing finish times fi .

M(i)= The maximum weight of any set of compatible jobs, all of which finish by fi .

p(j) = the job with the largest index less than j that is compatible with job j; in other words, the largest k such that fk ≤ sj. If no such job exists, define p(j) = 0.

Then
M(i) = max{M(i − 1), wi + M(p(i))}.


Consider the optimal solution for M(i). Either job i is used or it is not.
  • If it is not, then we have a maximum weight set of compatible jobs from among jobs 1 through i − 1. By definition, this is M(i − 1).
  • Alternatively, suppose i is used for M(i). Since jobs p(i) + 1 through i − 1 all conflict with job i, the remaining jobs selected for M(i) are drawn from 1 through p(i). Removing i from the optimal solution for M(i) yields a compatible solution on jobs 1 . . . p(i). Therefore M(i) = wi + M(p(i)).

Finally, since M(i) is a maximization, the larger of these two terms is correct.
As base cases, we define M(0) = 0 and M(1) = w1.



The algorithm works as follows:

  • sort jobs by increasing finish times.
  • compute the function p(i) for i from 1 to n (this can clearly be done in O(n²) time; to do better, we can binary search for each i, or compute all p(i) values in a single linear pass if we have also created a second list of jobs sorted by increasing start times).

set M(0) = 0 and M(1) = w1
loop over i from 2 to n
            set M(i) = max{M(i − 1), wi + M(p(i))}
endloop

Sorting takes O(n log n) time, computing the p(i) values takes another O(n log n), and the main loop is linear, so the total running time is O(n log n).
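The loop above translates directly into Python, using binary search over the sorted finish times to find each p(i). The job triples here are illustrative:

```python
import bisect

# Dynamic program for the VALUE of an optimal weighted schedule.
# Each job is a (start, finish, weight) triple.
def max_weight(jobs):
    jobs = sorted(jobs, key=lambda j: j[1])      # sort by finish time
    finishes = [f for _, f, _ in jobs]
    n = len(jobs)
    M = [0] * (n + 1)                            # base case M(0) = 0
    for i in range(1, n + 1):
        s, f, w = jobs[i - 1]
        # p(i): number of earlier jobs finishing no later than s,
        # found by binary search among the first i-1 finish times.
        p = bisect.bisect_right(finishes, s, 0, i - 1)
        M[i] = max(M[i - 1], w + M[p])
    return M[n]

print(max_weight([(1, 4, 2), (3, 5, 4), (0, 6, 4), (4, 7, 7)]))  # 9
```

Here the optimum 9 comes from taking the jobs (1, 4, 2) and (4, 7, 7), which are compatible because the second starts exactly when the first finishes.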


Note that we have computed the value of an optimal schedule, but not the schedule itself. To recover the actual schedule, we have a few options. One is to use the computed values M(i) to reverse-engineer an optimal set A of jobs to select.


Alternatively, we can change the algorithms so we build lists as we go:
  • sort jobs by increasing finish times.
  • compute function p(i) for i from 1 to n
set M(0) = 0 and M(1) = w1
set A(0) = 0 and A(1) = {1}
loop over i from 2 to n
            if M(i − 1) > wi + M(p(i))
                         set A(i) = A(i − 1) and M(i) = M(i − 1)
             else    (i.e. M(i − 1) ≤ wi + M(p(i)))
                         set A(i) = {i} ∪ A(p(i)) and M(i) = wi + M(p(i))
             endif
endloop
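The list-building variant can be sketched as follows; same recurrence as before, but A[i] records the jobs realizing M[i], so the schedule itself comes back with its weight (job data illustrative):

```python
import bisect

def schedule(jobs):
    """jobs: (start, finish, weight) triples.
    Returns (best total weight, list of chosen jobs)."""
    jobs = sorted(jobs, key=lambda j: j[1])      # sort by finish time
    finishes = [f for _, f, _ in jobs]
    n = len(jobs)
    M = [0] * (n + 1)
    A = [[] for _ in range(n + 1)]               # A[i]: jobs realizing M[i]
    for i in range(1, n + 1):
        s, f, w = jobs[i - 1]
        p = bisect.bisect_right(finishes, s, 0, i - 1)   # p(i)
        if M[i - 1] > w + M[p]:                  # job i is skipped
            M[i], A[i] = M[i - 1], A[i - 1]
        else:                                    # A(i) = {i} ∪ A(p(i))
            M[i], A[i] = w + M[p], A[p] + [jobs[i - 1]]
    return M[n], A[n]

print(schedule([(1, 4, 2), (3, 5, 4), (0, 6, 4), (4, 7, 7)]))
# (9, [(1, 4, 2), (4, 7, 7)])
```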



Again Yamraj could not understand how to implement it, so Goddess Saraswati told him:
  • Taking a soul to the afterlife is a job for you.
  • The death time of a person is the starting time of your job.
  • The time to reach the dying person and carry his soul to your place is the time required to complete that job.
  • All the bad deeds done by that human are the weight of the job.

Yamraj was finally happy to have a solution. He continued using this approach to take the lives of bad people until the saint removed the death-protecting layer from the earth.

In the final moments of his life, the saint was old and miserable, with no one to support him. He waited eagerly for his death, but death could not come to him because of his own protective layer. He then realized how important it is to leave this life and move on to the afterlife. He removed the layer and apologized to all the gods. It was a relief for all of them, especially Yamraj.





Tuesday, January 24, 2012

Linked (A Summary)

Link 1

Introduction

Feb 7, 2000, should have been a big day for Yahoo. Instead of the millions of users who flock to it daily, billions tried to enter. Such exploding popularity could have made Yahoo the most valuable company of the emerging economy. There was, however, a problem: they all tried to enter at the same time, and not one of them asked for a stock quote or a pecan pie recipe. Instead they sent a message in a scripted computer language: “I heard you!” Hundreds of Yahoo computers in Santa Clara, California, were kept busy by these “ghosts,” while millions of legitimate users waited a few minutes before switching away.

The next day Amazon.com, eBay, CNN.com, ETrade, and Excite fell under the same spell. Of course, making billions of real users hit yahoo.com in their browsers at 10:20 p.m. is not a possibility; there simply weren't enough computers to do that at the time.
The consensus suggested it might be a group of sophisticated hackers, fascinated by the challenge of breaking security systems, who had hijacked thousands of computers at schools and research labs and turned them into zombies telling Yahoo “I heard you!” thousands of times. Every second, Yahoo was thrown far more data than it could handle.
The FBI, following leads based on a voice in a chat room bragging about the next targets, arrived at the suburban home of a Canadian teenager hiding behind the pseudonym MafiaBoy: a fifteen-year-old who had successfully halted the operations of billion-dollar companies. The attack succeeded with brute force, a lot of nerve, and a little sophistication.
What made this simple move so successful? And what could a group of trained and skilled individuals do if they set their minds to it?



Christianity began as a very small group of people, persecuted by both Roman and Jewish authorities for putting Jesus on the same level as God. Despite the odds, more than 2 billion people now call Christianity their religion. This is no different from the event above.
The credit goes to a young man, Paul, who never met Jesus.

Paul, who was carrying out the persecutions, underwent a major change at the age of 34 and became a follower of what he had opposed. He walked from land to land spreading the message, covering more than 10,000 miles in twelve years. He did not walk randomly, however: he reached out to the biggest communities of the era.

There is something Paul and MafiaBoy had in common in bringing about such ground-breaking change. Knowingly or not, they both used the same key: the structure of the network, which is defined by how it grows into what it is.

These two instances are governed by the same basic laws that govern all networks. You will see how the emergence of a new company, a trend, or the very idea of network study itself is also ruled by the laws of network formation.

This is the beginning of the pursuit of network theory..





Link 2

The Random Universe

Königsberg was a flourishing city in eastern Prussia whose healthy economy allowed city officials to build seven bridges across the river, connecting its different regions. The people of Königsberg, enjoying a time of peace and prosperity, amused themselves with mind puzzles, one of which was: “Can one walk across all seven bridges, never crossing the same one twice, and end where one started?”

In 1736 Euler offered a rigorous mathematical proof that, with the seven bridges, such a path does not exist. He not only solved the Königsberg problem but, in his brief paper, inadvertently started an immense branch of mathematics: graph theory. Graph theory is the basis of networks.

Paul Erdős, a mathematician from Budapest, wrote more than 1,500 papers before his death in 1996, a body of work unparalleled since Euler. Among them were eight articles published with another Hungarian mathematician, Alfréd Rényi. These eight papers addressed for the first time in history the most fundamental question in our understanding of the interconnected universe: how do networks form? Their solution laid the foundation of random networks.

Erdős–Rényi graph (network formation):
On a given set of n nodes:
1) Pick one of the nC2 pairs of nodes at random and toss a die;
if it shows one, put an edge between them, otherwise don't.
2) Repeat step one until all pairs of nodes have been considered.
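The die-tossing construction can be sketched in Python; p = 1/6 plays the role of the die showing one, and the function name is mine:

```python
import itertools
import random

def erdos_renyi(n, p=1/6):
    """Independently connect each of the nC2 node pairs with probability p."""
    return [pair for pair in itertools.combinations(range(n), 2)
            if random.random() < p]

random.seed(1)           # fixed seed so the example is repeatable
edges = erdos_renyi(10)  # a random graph on 10 nodes
```

Each pair is examined exactly once, so on average the graph ends up with p·n(n−1)/2 edges spread evenly over the nodes.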

As you can see, the resulting graph has an almost even distribution of edges. Its degree distribution is similar to the one in the following example:
list the heights of all your acquaintances and plot the number of people in each height range (e.g., 4–5 feet, 5–6 feet, and so on). You observe something like this:





This is known as a bell curve. The extensive work of Erdős and Rényi gives insight into a whole universe of properties with brilliant applications in fields yet to be discovered.

Barabási soon discovered that human networks, and the networks nature follows, are not formed this way at all. The journey had just begun.

(Post note: Reading on Paul Erdos might get you started with the random world)






Link Three

Six Degrees of Separation

In 1929, Frigyes Karinthy, a now-celebrated writer from Budapest, sitting in a coffee house, wrote that anyone in this world can reach anyone else through just a few handshakes. He offered a bet that the readers
“could name any person among earth's one and a half billion inhabitants and through at most five acquaintances, one of which he knew personally, he could link to the chosen one,” writes Karinthy in “Láncszemek.”

Stanley Milgram, whose letter-forwarding experiments later confirmed the idea with dozens of successfully completed chains, awakened us to the fact that not only are we connected, but we live in a world in which no one is more than a few handshakes from anyone else. That is, we live in a small world.

(PS. One of the brilliant observations of the century, we will come back to look at it)






Link 4

Small Worlds

Duncan Watts, working on his Ph.D. in applied mathematics at Cornell University in the mid-1990s, was asked to investigate a peculiar problem: how crickets synchronize their chirping. Male crickets attract females by chirping loudly. Unlike many humans, crickets eschew the spotlight, carefully listening to the other crickets around them and adjusting their chirp to match their neighbors'.
Put many of them together, and from the cacophony emerges the symphony we often enjoy on the back porch on humid summer nights. Possessing an agile mind, Watts has the rare ability to stop, step back, and reflect on his work, changing direction if he needs to. Indeed, the cricket study turned him into a student of social networks and eventually a sociologist. He turned to his advisor Steven Strogatz, an applied mathematics professor at Cornell University who studied chaos and synchronization. Soon they were off to uncharted territories, taking networks beyond the boundaries set by Erdős and Rényi.

They looked at patterns in terms of the clustering coefficient and how it changes as edges are added (not completely at random).
Clustering coefficient of a node: the number of edges among the node's neighbours divided by the number of edges in a complete graph on those neighbours; roughly, how many of your friends know each other.


They were surprised to find that the distance between any two vertices dropped to half of what it was when just a few edges were added. This matched something in the real world: we live in small, tightly knit groups (they called them small worlds) that are connected by a feeble number of links.
Intuition told them that everyone in this world can be connected within a few steps.
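The per-node clustering coefficient is easy to compute; a small Python sketch, where the adjacency-dict representation and the toy graph are my assumptions:

```python
from itertools import combinations

def clustering_coefficient(node, adj):
    """adj maps each node to the set of its neighbours. Returns the
    fraction of the node's neighbour pairs that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0                       # undefined for < 2 neighbours
    links = sum(1 for u, v in combinations(sorted(nbrs), 2) if v in adj[u])
    return links / (k * (k - 1) / 2)     # k*(k-1)/2 = complete-graph edges

# Triangle 0-1-2 plus a pendant node 3: of node 0's three neighbour
# pairs, only (1, 2) is linked, so its coefficient is 1/3.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(clustering_coefficient(0, adj))
```

A high average clustering coefficient combined with short average distances is exactly the small-world signature Watts and Strogatz measured.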







Link 5 (My Favorite)

The Hubs and Connectors

The Brilliant observation (Six degrees of separation):
Craig Fass, Brian Turtle, and Mike Ginelli were at Albright College in Reading, Pennsylvania, watching “The Air Up There” on television one night when they found something startling (genius, I would say, their observation was).

Full of excitement, in January 1994 they mailed a letter to the Jon Stewart Show, an irreverent celebrity talk show popular with college audiences: “We are three men on a mission. Our mission is to prove to the Jon Stewart audience, nay, the world, that Bacon is God.”
Much to their surprise, they got their fifteen minutes of fame. They were invited to appear on the show with Kevin Bacon and charmed the audience with their ability to connect Bacon to any actor whose name was thrown at them.

The Kevin Bacon game would have remained mere movie trivia had two computer science students not watched the Stewart show. Glen Wasson and Brett Tjaden, from the University of Virginia, immediately realized that determining the distance between any two actors was a viable computer science project, if one had access to a complete database of all actors and movies ever released. The Internet Movie Database, or IMDb.com, a cinephile powerhouse offering more information about actors and movies than one could ever need, was already in place. It took Wasson and Tjaden a few weeks of programming to set up The Oracle of Bacon Website, which became the unbeatable master of the game. If you type in the name of any two actors, in milliseconds it provides the shortest path between them, listing the chain of actors and movies through which they are connected. In no time the Website was receiving over 20,000 visits per day.
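At its core, what the Oracle computes is a shortest path by breadth-first search over the co-star graph. A toy sketch; the four-actor graph below is invented, not IMDb data:

```python
from collections import deque

def bacon_path(graph, source, target):
    """graph maps each actor to the set of actors they co-starred with.
    Returns a shortest chain from source to target, or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        actor = queue.popleft()
        if actor == target:              # walk parents back to the source
            path = []
            while actor is not None:
                path.append(actor)
                actor = parent[actor]
            return path[::-1]
        for costar in graph[actor]:
            if costar not in parent:
                parent[costar] = actor
                queue.append(costar)
    return None

graph = {"Bacon": {"A", "B"}, "A": {"Bacon", "C"},
         "B": {"Bacon"}, "C": {"A"}}
print(bacon_path(graph, "B", "C"))  # ['B', 'Bacon', 'A', 'C']
```

BFS explores actors in order of distance, which is why the first time it reaches the target the chain found is a shortest one; on a graph the size of IMDb this still answers in milliseconds.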

Kevin Bacon, though, was not at the centre of Hollywood; he was not even near it. Studies showed he was only the 876th most connected actor. It was luck that chose him to be the ambassador of the six degrees.

It was found that in networks there are a few nodes, like Kevin Bacon, that actually make the six degrees of separation possible. These are called hubs. They connect remote ends of the network in just a few steps, and they are what hold the network intact. Removing a few of them increases the distance between points so much that short paths can no longer be traced.

For example:
Erdős is a hub of the scientific community. Erdős himself was assigned Erdős number 0; anyone who co-authored a paper with him has Erdős number 1; a co-author of an Erdős number 1 has Erdős number 2, and so on. These numbers seem to map out the entire scientific community in just a few steps. Einstein has Erdős number 2.

It is amazing, though, that Erdős has a Kevin Bacon number of 4: he starred in the documentary “N Is a Number” with someone who starred with someone with a Kevin Bacon number of 2.







Link 6

The 80/ 20 Rule

Vilfredo Pareto, the influential economist, once pointed out the 80/20 rule (the exact fractions vary). In his garden, 80 percent of the peas were produced by 20 percent of the peapods, and he found this to be the case in many more real scenarios than we normally notice: 80 percent of Italy's land was owned by only 20 percent of the population; 80 percent of a company's profit is produced by 20 percent of its employees; 80 percent of customer-service problems are created by 20 percent of customers; 80 percent of crimes are committed by 20 percent of criminals.

When Hawoong Jeong started building a little robot to map the Web, he had naive expectations about what the network behind it would look like. Guided by the insights of Erdős and Rényi, he and Barabási expected to find that Web pages are connected to each other randomly.
They were most surprised when they plotted the histogram of the number of nodes (websites) against the number of incoming links. It looked like this:



About 80 percent of the websites had almost the same small number of links (around 3–4), while 20 percent had more than a thousand, and 2–3 had more than a million. Further findings returned an even more biased picture. The Web is not random, which means not every voice is equally heard.

A bell curve is close to a democratic setup, but scientists observed that nature occasionally produces quantities that follow a power-law distribution instead.

Each power law is characterized by a unique exponent, telling us, for example, how many very popular Web pages are out there relative to the less popular ones.



Kenneth Wilson, an assistant professor in the physics department at Cornell University working on renormalization, found the missing critical exponents of power laws, work that won him the 1982 Nobel Prize in Physics and put the finishing tip at the top of the pyramid.






Link 7

Rich get Richer

In 1999, while learning about the structure of other real networks, Barabási realized what kept the real world from fitting the Erdős–Rényi and Watts–Strogatz graphs: both models start with all their nodes already in place, and give every node an equal chance of gaining edges. In real-world networks, he argued, nodes are added over time.

So he simulated a new graph in which:
  • Nodes come in one at a time, each bringing two edges (the graph is created over time).
  • When a new node arrives, the probability that it links to any existing node is directly proportional to the number of edges that node already has (preferential attachment).

This means the node that arrives first has the most time to form edges, and the one that arrives last the least; preferential attachment in turn reinforces the edge-forming ability of the early nodes. The result was a new graph that more closely modeled real-world networks. He called it the scale-free model.
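The growth rule can be simulated in a few lines. Picking a uniformly random endpoint from the running edge list is a standard trick that makes the choice proportional to degree; the function name and parameters here are illustrative:

```python
import random

def scale_free(n, m=2, seed=None):
    """Grow a network node by node; each newcomer attaches m edges to
    existing nodes chosen with probability proportional to degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]          # seed network: two linked nodes
    endpoints = [0, 1]        # every node appears once per edge it has
    for new in range(2, n):
        targets = set()
        while len(targets) < min(m, new):
            targets.add(rng.choice(endpoints))   # degree-biased pick
        for t in targets:
            edges.append((t, new))
            endpoints += [t, new]
    return edges

edges = scale_free(100, seed=42)  # early nodes end up with high degree
```

Counting the degrees of such a simulated network shows a few heavily connected early nodes and many sparsely connected latecomers, the power-law signature of the scale-free model.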

It meant that a node arriving early in the network has the best chance of becoming a hub. As an example from his own experience, Barabási noted that he would turn to a New York Times review before CNN simply because he had been reading the New York Times since his early days.

As a result of this pattern, he observed, any new investment in the world flows toward the nodes with higher preference (degree): an advertisement for apple pie on the Yahoo recipes page is seen by far more people than the same ad on some small healthy-diet blog.

This followed the Pareto principle, and preferential attachment destroyed the random world of Erdős and Rényi, leading to a world where the rich get richer.

First, power laws gave legitimacy to the hubs. Then the scale-free model elevated the power laws seen in real networks to a mathematically backed conceptual advance. Supported by a sophisticated theory of evolving networks that allows us to precisely predict the scaling exponents and network dynamics, we have reached a new level of comprehension about our complex interconnected world, bringing us closer than ever to understanding the architecture of complexity. But the scale-free model raised new questions. One in particular kept resurfacing: how do latecomers make it in a world in which only the rich get richer? The quest for the answer took us to a very unlikely place: the birth of quantum mechanics at the beginning of the twentieth century.






Link 8

Einstein's legacy

In the scale-free model, the earliest nodes have an advantage over those that arrive late. Yet Google, a latecomer to the Web, grew into its biggest hub in a short time. Google intrigued Barabási because it violated the basic prediction of the scale-free model: that the first mover has the advantage. In the scale-free model the most connected nodes are those that appeared first; they have had the longest time to collect links and develop into hubs. Google, launched only in 1997, was a latecomer to the Web. Popular search engines like AltaVista and Inktomi had dominated the market long before Google's arrival, clearly making it a second mover. In less than three years, however, Google became the biggest node and the most popular search engine.

He did find that in the real world links are broken and formed, and that this is driven by the fitness of the nodes. A fitter node acquires links faster than every node less fit than it, and this happens recursively as nodes arrive with different fitness values. This is what happened with Google. Barabási is reminded of the famous line on the Google homepage: “I'm feeling lucky.”

Bianconi, a member of Barabási's research group with an unusual fascination for physics, treated the network quantum-mechanically as a Bose–Einstein condensate and was able to explain this behavior clearly. The nodes act as energy levels and the edges as particles; the mathematics shows that beyond a certain point (with the growth of a sufficiently fit node into a hub) all newly arriving edges fall to the same level (node). Searching for real-world examples, they found Bill Gates and Paul Allen's Microsoft Windows to be one such case. Beyond a certain time, with Windows usage above 94 percent, virtually every new computer ran Windows, whether its owner liked it or hated it.






Link 9

Achilles' heel


On learning about networks, Barabási investigated whether there is a way to collapse them. His pursuit led him to a big difference in the stability of Erdős–Rényi and scale-free networks.

An Erdős–Rényi network would collapse after a specific fraction of its nodes had been removed at random. When he tried the same with scale-free networks, it was amazing to find that the network remained intact even when as many as 80 percent of its nodes were randomly removed.

What is the source of this amazing topological robustness? The distinguishing feature of scale-free networks is the existence of hubs, the few highly connected nodes that keep these networks together. Failures, however, do not discriminate between nodes: they affect small nodes and large hubs with the same probability. If I blindly pick ten balls from a bag containing 10 red and 9,990 white balls, chances are about ninety-nine in a hundred that I will hold only white balls. Therefore, if failures affect all nodes with equal chance, small nodes are far more likely to be dismantled, since there are many more of them, and small nodes contribute little to a network's integrity.

He then tried hitting the hubs. Not surprisingly, the networks collapsed soon after he removed just a few hubs.
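The two removal experiments are easy to mimic. The sketch below builds a deliberately hub-heavy toy network (a star) and measures the largest surviving connected component; everything here is illustrative:

```python
from collections import defaultdict, deque

def giant_component(nodes, edges):
    """Size of the largest connected component among the given nodes."""
    adj = defaultdict(set)
    for u, v in edges:
        if u in nodes and v in nodes:    # ignore edges to removed nodes
            adj[u].add(v)
            adj[v].add(u)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        size, queue = 0, deque([start])
        while queue:                     # breadth-first flood fill
            x = queue.popleft()
            size += 1
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    queue.append(y)
        best = max(best, size)
    return best

# Hub 0 connected to 50 leaves.
nodes = set(range(51))
edges = [(0, i) for i in range(1, 51)]
print(giant_component(nodes, edges))        # 51: intact
print(giant_component(nodes - {7}, edges))  # 50: a random leaf barely matters
print(giant_component(nodes - {0}, edges))  # 1: losing the hub shatters it
```

Running the same measurement on a simulated scale-free graph, removing random nodes versus highest-degree nodes, reproduces the contrast Barabási found: robustness to failure, fragility to attack.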

This insight, in turn, became a basic strategy of attack and a valuable defense consideration for the US military.

So, how vulnerable are we?
Fortunately, our understanding of attacks indicates that cascading failures and local breakdowns can be addressed.






Link 10

Viruses and Fads

The search for the reasons behind the spread of AIDS led to the study of sexual networks and a startling discovery: sexual networks also follow a power law, meaning they are scale-free. Though the spread of the disease could not simply be stopped, treating the hubs first has indeed slowed it down.

Barabási observed that this same scale-free structure supports the spread of even a weak infection. It simply overthrows the epidemic threshold found in random networks, where a virus or bacterium dies out when its virulence is below the threshold. Once a virus reaches a hub, it spreads along all of the hub's links, and so on. Unconfirmed reports even blamed a single flight attendant, said to be a hub with 250 sexual partners a year, for carrying the AIDS virus to America.

The same holds on the Internet; the best example is the Love Bug (ILOVEYOU) virus, which spread around the world in days and still survives in a few corners of it despite cures being available.






Link 11

The Awakening Internet

From the deduced maps of the Internet, it was found to be a scale-free network growing at a huge rate. With an increasing number of resources online, the question arises of how to use them efficiently.
Experts predict that there will be around 10,000 telemetric devices for each human on the planet. This number is not particularly significant in and of itself; we've had sensors for a long time, ranging from surveillance cameras in supermarkets to car detectors buried in the pavement at traffic signals that switch the lights at the intersection. What is changing is that for the first time these various sensors are feeding information into a single integrated system. There will soon be over 3 billion Internet-connected cell phones and close to 16 billion Internet-connected computers embedded in everything from toasters to fashion designs. The tiny sensors of this planetary skin will spy on everything from the environment to our highways and bodies. Most importantly, however, they are all connected.

Our planet is evolving into a single vast computer made of billions of interconnected processors and sensors. The question being asked by many is, when will this computer become self-aware? When will a thinking machine, orders of magnitude faster than a human brain, emerge spontaneously from billions of interconnected modules? It is impossible to predict when the Internet will become self aware, but clearly it already lives a life of its own. It grows and evolves at an unparalleled rate while following the same laws that nature uses to spin its own webs. Indeed, it shows many similarities to real organisms. Just like the millions of reactions taking place in a cell, terabytes of information are passed along its links every day. The surprising thing is that some of this information is very difficult to find. That brings us to yet another network: the World Wide Web.







Link 12

The Fragmented Web

The Web is a directed graph. Mapping its directions and hubs reveals a Web fragmented into four continents: a central core, to which most links point; an OUT continent that the core points to, which is a dead end (no links lead anywhere outside it); an IN continent whose links point to the core (and a few to OUT); and a fourth continent of disconnected islands with few links to IN, OUT, or the core.


Far from being a homogeneous sea of nodes and links, the Web is fragmented into four continents, each of which hosts many villages and cities that appear as overlapping communities. Any of us willing to take up a virtual presence belongs to one or several of them. To be sure, we are far from fully understanding this fine structure of the Web. But many forces, from commercial interests to scientific curiosity, increasingly motivate us to do better. As we dig deeper, I am sure that we will encounter many surprises, offering us an even clearer view of this complex, amorphous, ever-changing online universe.
Governments trying to have a say in what websites should and should not show also affect the links formed by the hubs.






Link 13

The map of life (by the six degrees of separation)

In FEBRUARY 1987 the journal Nature reported a landmark discovery: the gene for manic depression, or by its more recent name, bipolar disorder. Manic depression affects 1 to 5 percent of adults in the
United States, and as many as 25 to 50 percent of those attempt suicide at least once. Because the risk of developing manic depression is five to ten times higher if first-degree relatives have the disease, the prevailing view is that manic depression is a genetic disorder. So as soon as methods for linking illnesses to specific genes emerged, the race was on to find the manic depression gene. The much coveted "first" seemed to have gone to the authors of the 1987 Nature paper, who located the gene on chromosome 11 while studying a large Amish family in Lancaster, Pennsylvania. Yet two years later the research group recanted the results. The blunder did not discourage other gene hunters, however. If anything, it gave them extra motivation to find the real gene. In 1996, almost a decade after the first published study, three independent research groups reported links to genes on other chromosomes. Another Amish study implicated chromosomes 6, 13, and 15; a study focusing on the isolated population of Costa Rica's Central Valley documented links to chromosome 18; and results derived from a large Scottish family indicated the involvement of chromosome 4. Research on another prominent mental disorder, schizophrenia, followed a similar pattern, linking the disease to two different regions of chromosome 1, with a different research group implicating chromosome 5 a few years later. Absentminded scientists? Bad research? Far from it.

These are not conflicting results. They simply demonstrate that most illnesses, ranging from manic depression to cancer, are not caused by a single malfunctioning gene. Rather, several genes interacting through a complex network hidden within our cells are simultaneously responsible. Faced with the gigantic task of figuring out the building blocks of the cell, from genes to proteins, scientists until recently focused on biology rather than networks. But with the pieces now in hand, post genomic
biology is taking a step back to grasp the big picture. New and exciting discoveries that are revolutionizing biology and medicine tell us loud and clear: If we want to understand life—and ultimately cure disease— we must think networks.

When studying the protein interaction network, researchers were surprised to find that they could reach every protein in three to four steps. One prominent hub is the p53 gene, whose protein product, p53, has been conserved through evolution and takes part in a great many protein interactions. When p53 fails to issue the command to destroy cells with damaged DNA, and those cells go on dividing, cancer occurs. The entire map of the network has not been charted yet, but the links between proteins involved in different diseases have already provided a vision of how diseases might be treated.






Link 14

The Network Economy

The economy, too, is driven by a scale-free form in which dominance is the mark of a hub. Changes to hubs greatly affect economies, while the failure or sale of minor companies matters far less.
A similar hierarchy is found in the structure of corporate boards: people who are directors of more than one firm act as hubs, carrying changes through the economy.






Link 15

Web without a Spider
In 1994, or even in early 1998, nobody could have anticipated the flood of discoveries that would completely reshape our understanding of our interconnected world in the following years. "Not even in my wildest dreams could I conjure power laws or scale-free networks," he says. Natural networks do, however, perform a self-balancing act, as was observed when sea otters were no longer hunted. When the nodes carry load and each link has a bearing capacity, a scale-free network can break down through a cascading effect once that hierarchy of capacities is breached.

Today we know that, though real networks are not as random as Erdos and Renyi envisioned, chance and randomness do play an important role in their construction. Real networks are not static, as all graph theoretical models were until recently. Instead, growth plays a key role in shaping their topology. They are not as centralized as a star network is. Rather, there is a hierarchy of hubs that keep these networks together, a heavily connected node closely followed by several less connected ones, trailed by dozens of even smaller nodes. No central node sits in the middle of the spider web, controlling and monitoring every link and node. There is no single node whose removal could break the web. A scale-free network is a web without a spider.

In the twentieth century we went as far as we could to uncover and describe the components of complex systems. Our quest to understand nature has hit a glass ceiling because we do not yet know how to fit the pieces together. The complex issues with which we are faced, in fields from communication systems to cell biology, demand a brand new framework. Embarking on the journey ahead without a map would be hopeless. Fortunately the ongoing network revolution has already provided many of the key maps. Though there are still many "dragons" ahead, the shape of a new world has become discernible, continent by continent. Most important, we have learned the laws of web cartography, allowing us to draw new maps whenever we are faced with new systems. Now we must follow these maps to complete the journey, fitting the pieces to one another, node by node and link by link, and capturing their dynamic interplay. We have ninety-eight years to succeed at this, and make the twenty-first the century of complexity.

- Barabasi
( Compiled by Arvind M )

Sunday, September 25, 2011

MARRY ALGORITHMICALLY :P

As soon as you see the title, you might wonder, "Why 'marriage' in algorithinking?!!". Well, before you end up thinking that I am going to describe how matrimonial sites work (!!), here is an excellent algorithm I came across recently, called the "Stable Marriage Algorithm".

According to Wikipedia, The Stable marriage problem is stated as, "Given n men and n women, where each person has ranked all members of the opposite sex with a unique number between 1 and n in order of preference, marry the men and women together such that there are no two people of opposite sex who would both rather have each other than their current partners. If there are no such people, all the marriages are "stable"."

If this sounds confusing, let me restate it for you. There are two disjoint sets $M$ and $W$, both of cardinality $n$. Each element of $M$ has ranked all the elements of $W$ from $1$ to $n$ (a unique number for each element of the set) and vice versa, and we have to find a stable matching between the two sets. Let $A \in M$ and $B \in W$, where $A$ and $B$ are not matched with each other. The matching is said to be stable if, for every such pair $(A,B)$, it is not the case that $A$ prefers $B$ over its current partner and $B$ also prefers $A$ over its current partner.

When I first went through the problem statement, I wondered how anyone could possibly find such a match, thinking of its real-life implications. I was flabbergasted to find out that this algorithm was originally framed to solve the problem of assigning graduating medical students to hospital jobs. And guess what, it worked!
In 1962, David Gale and Lloyd Shapley proved that, for any $n$, it is always possible to solve the SMP (Stable Marriage Problem) and hence find a stable match.

Here's how they explain their algorithm:

It consists of a number of rounds. In each round, every unengaged man proposes to the woman he prefers most among those he has not proposed to yet. Each woman then considers all her suitors and replies "maybe" to the one she prefers most and a straight "no" to all the rest. When a woman says "maybe", she is provisionally engaged to that suitor, and he to her.
In the first round, each man proposes to the woman he prefers most, and each woman replies "maybe" to the suitor she prefers most and "no" to the rest.
In the following rounds, each unengaged man proposes to the most preferred woman he has not yet proposed to, whether or not she is currently engaged. Each woman again replies "maybe" to the suitor she most prefers, even if that means dropping the man she is currently engaged to, and "no" to the rest.

However, there is a prerequisite: every participant's preference list must be complete, and no candidate may be declared unacceptable.
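The rounds above can be sketched in Python. The data layout (preference lists as dicts) and the extra names are my own illustrative assumptions, not code from Gale and Shapley's paper:

```python
def stable_marriage(men_prefs, women_prefs):
    """Gale-Shapley: men propose; each woman keeps one tentative 'maybe' suitor."""
    # rank[w][m]: position of man m in woman w's list (lower = more preferred)
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}  # index of next woman to propose to
    fiance = {}                              # woman -> her current "maybe" suitor
    free_men = list(men_prefs)
    while free_men:
        m = free_men.pop()
        w = men_prefs[m][next_choice[m]]     # his best woman not yet proposed to
        next_choice[m] += 1
        if w not in fiance:
            fiance[w] = m                    # her first proposal: she says "maybe"
        elif rank[w][m] < rank[w][fiance[w]]:
            free_men.append(fiance[w])       # she trades up; old suitor is free again
            fiance[w] = m
        else:
            free_men.append(m)               # a straight "no"; he tries again later
    return {m: w for w, m in fiance.items()}

# A tiny instance (names are hypothetical): both men prefer Aishwarya,
# but she prefers Salman, so Abhishek ends up with Rani.
men = {"Abhishek": ["Aishwarya", "Rani"], "Salman": ["Aishwarya", "Rani"]}
women = {"Aishwarya": ["Salman", "Abhishek"], "Rani": ["Abhishek", "Salman"]}
print(stable_marriage(men, women))
```

The `while` loop replaces explicit rounds: it does not matter in which order free men propose, since a woman only ever trades up.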

In the end,


  • Everyone gets married: A woman is always engaged once she has received a proposal, since she replies "maybe" to one suitor in each round. (If she is happy with her first choice, she may never change her decision and simply say "no" to the remaining candidates in every round; even then she stays engaged to her first choice.)
    A man proposes to every woman if necessary, so no man and no woman can remain unengaged.

  • The marriages are stable: Borrowing Wikipedia's argument, let me explain this with an example. Suppose Aishwarya and Abhishek are both married at the end, but not to each other, and suppose Abhishek prefers Aishwarya over his current wife. Then he must have proposed to Aishwarya before he proposed to his current partner. There are two cases:
    1. If Aishwarya had accepted his proposal at some point, she must later have rejected him for a suitor she preferred more, and that chain of trade-ups ends at her current partner. So Aishwarya can't prefer Abhishek over her current partner.
    2. If Aishwarya never accepted Abhishek's proposal, she already had a suitor she preferred more, so she never liked him enough in the first place.
Even though the algorithm works well, there are a few caveats regarding its optimality (it favours the proposing side). You can refer to the Wikipedia page for more info.

You can try out and play with this algorithm; code it in any language you like, it's quite easy. At the following page you are asked for the orders of preference and it gives you a stable match:
http://sephlietz.com/gale-shapley/.

The following is a link to a paper studying the Gale-Shapley algorithm. It focuses on proving how appropriate stable matching mechanisms are, and shows that even if a woman knows the preference lists of all other women, she cannot cheat.
www.columbia.edu/~js1353/pubs/tst-ms01.pdf

You can also find an interactive flash demonstration of this problem. After going through it you can easily understand the problem.

This algorithm is also currently employed by the New York and Boston public school systems to assign students to schools.

Shruti Ranade

Thursday, August 4, 2011

Maximum Subsequence

Six interns sat in front of their laptops with enthusiasm flashing on their faces brighter than the laptop light. And why not, it was the first day of their internship and they were eager to start off. Their wait ended when their professor entered and, without wasting a single moment, gave them a problem to solve. He called it the maximum subsequence problem. It sounded trivial enough.

Given a sequence of numbers in an array, can you find a contiguous run of numbers that gives you the maximum sum? If all the numbers are positive, the maximum subsequence is the whole array; the problem is when we encounter negative numbers. Completing the problem statement: if all the numbers turn out to be negative, the maximum subsequence sum is taken to be zero.
If you are still not clear about the problem, it is equivalent to placing two sticks in the array in such positions that the sum between them is higher than for any other pair of positions.
Within no time they had an algorithm ready. The only thing on their minds was to solve it as quickly as possible; they never really gave any thought to bettering the algorithm.
Again, brute force came to their rescue: simply try all pairs of positions, and the pair with the highest sum is the answer. It turns out the complexity is quadratic. Well, at least the problem was solved.
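Their quadratic brute force might look like this in Python (a sketch, assuming the zero-for-all-negative convention stated above):

```python
def max_subseq_brute(a):
    """Try every pair of 'stick' positions i..j: O(n^2) time."""
    best = 0                      # an all-negative array yields 0, per the problem
    for i in range(len(a)):
        running = 0
        for j in range(i, len(a)):
            running += a[j]       # running sum of a[i..j]; no third loop needed
            best = max(best, running)
    return best

print(max_subseq_brute([-2, 11, -4, 13, -5, 2]))  # 20 (from 11 - 4 + 13)
```

Keeping a running sum avoids re-adding `a[i..j]` from scratch, which is what would make a naive version cubic.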
But the professor was adamant that the problem could be tackled in linear time.
Yup, that's the professor! :)
How can one claim there is a linear time algorithm for this?
The awesome proof follows. Forget about algorithms for a while and think this way.
Say there is an oracle which gives you the answer for any array of numbers you hand it. But there's a catch: it takes only $n-1$ inputs. Oops, we have $n$ numbers in the array. But think for a while; there has to be a way in which it can be used.
I'll just give the oracle the whole array except for the last element, and the oracle returns the answer for the first $n-1$ elements. Now we have two things in hand: the last element, which was left out, and the maximum subsequence sum of the first $n-1$ elements. Can we, with these two components, infer the actual maximum subsequence sum of the whole array? And guess what, yes we can!
There are just two scenarios:
1. The last element is negative. This is the easiest one: whatever the oracle gave becomes the final answer.
2. The last element is positive. Two cases arise here:
  1. The maximum subsequence given by the oracle includes the $(n-1)$th element. In this case, since the last number is positive, we include the last ($n$th) element too in the maximum subsequence and arrive at the final answer.
  2. The maximum subsequence given by the oracle does not include the $(n-1)$th element. This is the trickiest one: the best subsequence that includes the $n$th element might be very different from the oracle's. So we find the maximum subsequence sum that includes the last ($n$th) element; in simpler words, the maximum sum you can obtain with one end fixed at the last position. Compare this with the subsequence sum given by the oracle; whichever is greater is the answer.
So the oracle is making our task very easy. But you may wonder where this discussion is heading. How do you think the oracle solves the problem? Don't you think the oracle, having $n-1$ elements, can in turn give all its elements except one to another oracle?
Doesn't this thought trigger something? It's nothing but decrease and conquer told differently! And, as you can see, this can be done in linear time.
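Unrolling the chain of oracles from the front of the array turns the idea into a single linear-time loop. Here is a sketch in Python (the variable names are mine): at each element we keep both the "oracle's" answer for the prefix so far and the best sum of a subsequence forced to end at the current position.

```python
def max_subseq_linear(a):
    """Decrease and conquer, unrolled: extend the oracle's answer one element at a time."""
    best = 0         # the oracle's answer for the prefix seen so far
    ending_here = 0  # best sum of a subsequence ending at the current element
    for x in a:
        ending_here = max(ending_here + x, 0)  # drop the run once it goes negative
        best = max(best, ending_here)          # compare with the oracle's answer
    return best

print(max_subseq_linear([-2, 11, -4, 13, -5, 2]))  # 20, same as brute force
```

One pass, two variables: $O(n)$ time and $O(1)$ extra space.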
Interestingly, this problem has a history too. It first arose in two-dimensional form and, to simplify it, was converted into one-dimensional form. First a cubic algorithm was formulated, after which a quadratic and then an $O(n \log n)$ algorithm followed. The best part: the linear algorithm was designed in less than a minute by a person attending a seminar where the problem was presented.

How does it really matter?
You may always say, "My computer is very fast, so it doesn't really matter." OK, what's the harm in comparing anyway? Let's say a computer is as fast as 10 GHz; that means it can perform $10^{10}$ steps in one second.

$N$        $N^{2}$
$10$       $10^{2}$
$100$      $10^{4}$
$1000$     $10^{6}$
$10^{4}$   $10^{8}$
$10^{5}$   $10^{10}$
$10^{9}$   $10^{18}$

Let's calculate the time for the last instance.
Time taken by the linear algorithm: $10^{9}/10^{10}$ = 0.1 seconds.
Time taken by the quadratic algorithm: $10^{18}/10^{10}$ = $10^{8}$ seconds.
Converting that into years, it is approximately 3.2 years!!!!!!
You see what a difference it makes: one takes less than a second to solve the problem whereas the other takes years. And this is just between linear and quadratic; imagine the scenario for cubic and higher ones.
Of course it is inevitable to use those when you can't find a linear time algorithm, or when no such algorithm exists. But effort has to be made to reduce the complexity, and one way of doing better is using the appropriate algorithm design techniques.
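The arithmetic above can be checked in a few lines (assuming the same hypothetical 10 GHz, one-step-per-cycle machine):

```python
steps_per_second = 10**10                  # the hypothetical 10 GHz machine
n = 10**9
linear_seconds = n / steps_per_second      # 10^9 steps -> 0.1 s
quadratic_seconds = n**2 / steps_per_second  # 10^18 steps -> 10^8 s
years = quadratic_seconds / (365 * 24 * 3600)
print(linear_seconds, quadratic_seconds, round(years, 1))  # 0.1 1e+17... about 3.2 years
```

So the quadratic algorithm needs roughly 3.2 years of nonstop computation where the linear one finishes in a tenth of a second.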
"Science is what we understand well enough to explain it to a computer".The technology may advance even more tomorrow and there will be new programming languages formed,but independent of all this the essence of algorithms will always be the same.Thats what makes it the most important part computer science.Dijkstra was right when he said "Computer Science is no more about computers than astronomy is about telescope".
What do you think is the real foundation of computer science?:)

So what happened to our six friends? Well, they had the best of beginnings and went on to what became two amazing months of learning.
Moral of the story: between a fast computer and a fast algorithm, always go for the latter. You know why :):)

Nikitha Shenoy