Poisson Distribution

In this post, I explain how to derive the Poisson distribution from the binomial distribution.

The Poisson distribution is a discrete probability distribution. It expresses the probability that a specified number of events occur in a fixed interval (in time and/or space). The crucial assumptions here are (1) the events occur independently of each other, and (2) the rate at which they occur is constant.

To understand the Poisson distribution, we first revisit the binomial distribution. This distribution is also discrete, and it gives the probability that a specified number $k$ out of $n$ independent Bernoulli trials, each with success probability $p$, are successes. Its frequency function is given by

$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \qquad k = 0, 1, \dots, n.$$
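As a quick numerical check, the binomial frequency function can be evaluated directly. This is a minimal sketch using only the Python standard library; the function name `binomial_pmf` and the example values (10 trials, success probability 0.3) are my own choices for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Illustrative values: 10 trials, success probability 0.3.
print(binomial_pmf(3, 10, 0.3))                          # about 0.2668
print(sum(binomial_pmf(k, 10, 0.3) for k in range(11)))  # close to 1
```

The second line confirms that the probabilities over all possible values of k add up to one, up to floating-point rounding.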

Now imagine that by some strange turn of fate, I am abandoned on a remote island and forced to drink stagnant water from a broken coconut shell. There are bacteria randomly distributed in this water, so the more of it I drink, the more bacteria I ingest, and the larger the risk that I succumb to some infection and die!

Being a statistician, I would like to quantify my risk. Assuming that this risk directly corresponds to the number of bacteria I consume, I could model the number of bacteria as a binomial random variable. A trial would be 1 ml of water, and a success is signified by the presence of at least one bacterium in that 1 ml of water. The probability of success is roughly the concentration of bacteria per ml.

If we assume that the bacteria are evenly distributed in the water, then we can assume that the number of bacteria in the next ml of water I drink is independent of the number in the water I have already drunk. This satisfies the independent trials assumption of the binomial distribution.

Now there’s a problem with the binomial model. The bacteria count will not be very accurate if there is often more than one bacterium in a ml of water, since these additional bacteria are not counted by the success/failure method. To reduce this undercounting, we can make each trial smaller, so that a single trial rarely contains more than one bacterium.
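The undercounting can be seen in a small simulation. The sketch below is only an illustration under assumptions of my own: 200 bacteria scattered uniformly at random in 100 ml of water, a fixed random seed, and a hypothetical helper named `count_successes`. It counts how many units contain at least one bacterium (the binomial "successes") for different unit sizes.

```python
import random

def count_successes(total_bacteria, volume_ml, unit_ml, seed=0):
    """Scatter bacteria uniformly over equal-sized units of water and
    return (number of units with at least one bacterium, number of units)."""
    rng = random.Random(seed)
    n_units = round(volume_ml / unit_ml)
    occupied = set()
    for _ in range(total_bacteria):
        # Each bacterium lands in a uniformly chosen unit.
        occupied.add(rng.randrange(n_units))
    return len(occupied), n_units

true_count = 200  # assumed number of bacteria, for illustration only
for unit_ml in (1.0, 0.1, 0.01):
    successes, n_units = count_successes(true_count, 100.0, unit_ml)
    print(f"unit = {unit_ml} ml, trials = {n_units}, "
          f"successes = {successes}, true count = {true_count}")
```

With 1 ml units there are only 100 trials, so the success count can never exceed 100 even though 200 bacteria are present. With smaller units, two bacteria rarely share a unit and the success count gets close to the true count.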

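Since the amount of water is finite, increasing the number of trials means decreasing the amount of water in a trial (say 0.1 ml, 0.01 ml, etc). Consequently, the bacteria concentration per unit of water, which plays the role of the success probability, will also decrease, because

$$\text{concentration per unit} = \frac{\text{total number of bacteria}}{\text{number of units}},$$

so a smaller unit size results in a larger number of units, and consequently a smaller concentration per unit.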

Therefore, to make our model more accurate, we need to find the limit of the binomial distribution as the number of trials $n$ approaches infinity. As $n \to \infty$, the success probability $p \to 0$, but $np$ stays the same and is finite.

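Let $\lambda$ denote the bacteria concentration in 1 ml of water, and $n$ the number of trials, each with $1/n$ ml of water ($1/n$ could be $0.1$, $0.01$, etc). So

$$p = \frac{\lambda}{n},$$

and we have

$$\lim_{n \to \infty} \binom{n}{k} p^k (1-p)^{n-k} = \lim_{n \to \infty} \frac{n!}{k!\,(n-k)!} \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n-k} = \frac{\lambda^k}{k!} \lim_{n \to \infty} \frac{n!}{(n-k)!\,n^k} \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-k}.$$

Because, for fixed $k$,

$$\lim_{n \to \infty} \frac{n!}{(n-k)!\,n^k} = \lim_{n \to \infty} \frac{n(n-1)\cdots(n-k+1)}{n^k} = 1, \qquad \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{n} = e^{-\lambda}, \qquad \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{-k} = 1,$$

the previous limit becomes

$$\frac{\lambda^k e^{-\lambda}}{k!},$$

which is the frequency function of the Poisson distribution.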
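The limit can also be checked numerically. The following is a rough sketch, with $\lambda = 2$ and $k = 3$ picked arbitrarily for illustration and with function names of my own. It holds $np = \lambda$ fixed while $n$ grows, and compares the binomial probability with the Poisson one.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson frequency function: lam^k * e^(-lam) / k!."""
    return lam**k * exp(-lam) / factorial(k)

lam, k = 2.0, 3  # arbitrary illustrative values
for n in (10, 100, 1000, 10000):
    p = lam / n  # success probability shrinks so that n * p stays equal to lam
    print(n, binomial_pmf(k, n, p), poisson_pmf(k, lam))
```

As n increases, the binomial column should approach the Poisson value in the last column.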

Thus if $X \sim \text{Poisson}(\lambda)$, then $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$ gives the probability that $k$ events will occur in an interval.
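Returning to the coconut shell, the Poisson frequency function lets me put a number on the risk of ingesting at least some number of bacteria. The sketch below assumes, purely for illustration, an average of $\lambda = 2$ bacteria in the water I drink; the helper `poisson_tail` is a hypothetical name.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)

def poisson_tail(m, lam):
    """P(X >= m), computed as 1 minus the probability of fewer than m events."""
    return 1.0 - sum(poisson_pmf(k, lam) for k in range(m))

lam = 2.0  # assumed average number of bacteria ingested
print(poisson_tail(1, lam))  # chance of ingesting at least one bacterium
print(poisson_tail(5, lam))  # chance of ingesting five or more
```

The first value equals $1 - e^{-\lambda}$, the probability that the count is not zero.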