**IMPROBABLE PROBABILITIES**

**Assorted comments on some uses and
misuses of probability theory**

**First posted on June 22, 1999;
updated in September 2001; the section "Probability Estimate Is Often
Tricky" updated in October 2006.**

*By Mark Perakh*

· **PROBABILITY ESTIMATE IS OFTEN TRICKY**

· **SOME SITUATIONS WITH SEEMINGLY UNEQUAL PROBABILITIES**

· **THE COGNITIVE ASPECTS OF PROBABILITY**

· **PSYCHOLOGICAL ASPECTS OF PROBABILITY**

· **THE CASE OF MULTIPLE WINS IN A LOTTERY**

· **APPENDIX**

**Las Vegas
is arguably one of the most famous (or infamous) places on the globe. The
glittering towers of its hotels rising to the sky in the middle of the desert
have been reproduced on millions of photographs distributed all over the world.
Since its first hotel/casino named The Flamingo sprang up from the sandy ground
of the**

**What is the basis of the casinos’ unbroken record of immense
profitability? There are two such bases. One is a human psychological frailty ensuring an
uninterrupted supply of fools hoping to catch Lady Luck’s attention. The other
is a science.**

**The science in question is mathematical statistics. Of course, the
casinos’ operators are by and large scientifically illiterate. They don’t need
to know mathematical statistics any more than an old lady driving her motorized
jalopy needs to know the physical chemistry of oil’s oxidation in the cylinders
of her engine. All she needs is some primitive skill in pushing certain pedals
and rotating the steering wheel. However, it is the physical chemistry of the
oil’s oxidation which makes her driving possible. Likewise, even though the
casinos’ operators normally have hardly any knowledge of mathematical
statistics, it is mathematical statistics which makes their business so
immensely profitable.**

**Another field where mathematical statistics is the basis of success is
the insurance industry, where professional statisticians are routinely employed
to determine the values of premiums and payoffs necessary to maintain the
insurer’s profitability.**

**In both cases, that of casinos and that of insurance, the success is
based on the proper (even if sometimes not quite conscious) use of mathematical statistics.**

**There are, however, situations where mathematical statistics is
used in a way that contradicts its own main concepts. When that happens, it often
results in claims that may look statistically sound but are actually
meaningless and often misleading.**

**At the core of mathematical statistics is probability theory. Besides
mathematical statistics, probability theory is also the foundation of
statistical physics. It deals with the quantity called probability. While the
concept of probability may seem to be rather simple for laymen, probability
theory reveals that that quantity is multi-faceted and its use must follow
certain precautions. When those precautions are not adhered to, the result is
often a meaningless conclusion.**

**While an incorrect application of mathematical statistics may involve any
part of that science, a large portion of the errors in question occur already
at the stage when its seminal quantity, probability, is miscalculated or
misinterpreted.**

**One example of an incorrect application of the probability concept is the
attempts by the proponents of the so-called Bible code to calculate the
probability of occurrence of certain letter sequences in various texts. Another
example is the often-proposed calculation of the probability of the spontaneous
emergence of life on earth. There are, of course, many other examples of
improper uses of the probability calculation.**

**There are many good textbooks on probability theory. Usually they make
use of a rather sophisticated mathematical apparatus. This article is not meant
to be one more discussion of probabilities on a rigorously mathematical level.
In this article I will discuss the concept of probability mostly without
resorting to mathematical formulas or to the axiomatic foundation of
probability theory. I will rather try to clarify the concept in question by considering
examples of various situations in which different facets of probability
manifest themselves and can be viewed in as simple a way as possible. Of
course, since probability theory is essentially a mathematical discipline, it
is only possible to discuss probability, without resorting to some mathematical
apparatus, to a very limited extent. Hence, this paper will stop at the point
where the further discussion without mathematical tools would become too crude.**

**PROBABILITY ESTIMATE IS OFTEN TRICKY**

**Calculation of probabilities is sometimes a tricky task even for qualified
mathematicians, not to mention laymen. Here are two examples of rather simple
probabilistic problems whose solutions have often escaped even some experienced
scientists.**

The first problem is as follows. Imagine that you watch buses arriving at a
certain stop. After watching them for a long time, you have determined that the
interval between the arrivals of any two sequential buses is, *on the
average,* one minute. The question you ask is: How long should you expect to
wait for the next bus if you start waiting at an arbitrary moment of time? Many
people asked to answer that question would confidently assert that the *average*
time of waiting is thirty seconds. This answer would be correct if all the
buses arrived at exactly the same interval of one minute. However, the
situation is different in that one minute is just the *average* interval
between any two consecutive bus arrivals. This number, one minute, is a *mean*
of a distribution over a range from zero to a maximum which is larger than one
minute. Therefore, the *average* waiting time, which in the case of a constant
inter-arrival interval equals half the inter-arrival interval, in the case of a
varying interval is always larger than half of the average inter-arrival
interval. While this result can be proven in a rigorously mathematical way, it
can easily be understood intuitively. If the inter-arrival intervals vary, some
of them being shorter than others, then obviously, on the average, more
passengers are expected to start waiting for the next bus within longer
inter-arrival intervals than within shorter ones. Therefore the average waiting
time happens to be longer than half of the average inter-arrival interval
(which is what it was in the case of constant intervals). For certain
inter-arrival distributions (for example, for the so-called Poisson process,
studied in mathematical statistics, which corresponds to perfectly random
arrivals), the average waiting time exactly equals the average inter-arrival
interval (i.e., in our example, 1 minute). For an arbitrary distribution of
arrivals the average waiting time is larger than half of the average
inter-arrival interval, and may even be larger than the entire average
inter-arrival interval.
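The effect is easy to check numerically. The following is only a minimal sketch: the function name `mean_wait`, the seed, and the sample sizes are illustrative choices, not part of the original discussion. It lets passengers arrive at uniformly random moments and compares buses arriving at constant one-minute intervals with buses whose intervals are drawn from an exponential distribution with the same one-minute mean (the Poisson-process case).

```python
import bisect
import random


def mean_wait(intervals, n_passengers, rng):
    """Average wait of passengers arriving at uniformly random moments.

    `intervals` is a list of inter-arrival intervals in minutes.
    """
    # Cumulative arrival times of the buses.
    arrivals = []
    t = 0.0
    for gap in intervals:
        t += gap
        arrivals.append(t)
    total = 0.0
    for _ in range(n_passengers):
        moment = rng.uniform(0.0, arrivals[-1])
        # First bus arriving at or after the passenger's moment.
        next_bus = arrivals[bisect.bisect_left(arrivals, moment)]
        total += next_bus - moment
    return total / n_passengers


rng = random.Random(12345)  # arbitrary fixed seed, for repeatability
n_buses = 100_000
constant = [1.0] * n_buses
poisson = [rng.expovariate(1.0) for _ in range(n_buses)]  # mean 1 minute

wait_constant = mean_wait(constant, 100_000, rng)
wait_poisson = mean_wait(poisson, 100_000, rng)
print(f"constant intervals: {wait_constant:.3f} min")
print(f"Poisson arrivals:   {wait_poisson:.3f} min")
```

With these settings the first figure should come out close to half a minute and the second close to a full minute, in line with the argument above.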

**The second problem was used in a popular TV game show hosted by Monty
Hall, wherein the players were offered a choice among three closed doors.
Behind one of the doors, there was a valuable prize, while behind the two other
doors there was nothing. Obviously, whichever door a player chose, the
probability of winning the prize would be 1/3. However, after the player chose
a certain door, the compere, who knew where the prize
was, would open one of the two doors not chosen by the player, and show that
there was nothing behind it. At that point, the player would be given a choice,
either to stick to the door he had already chosen, or to choose instead the
remaining closed door. The problem a player faced was to estimate whether or
not changing his original choice would provide a better chance of winning. Most
people, including some trained mathematicians, answered that the
probability of winning is exactly the same regardless of whether the participant
sticks to the originally chosen door or switches to the other yet unopened
door. Indeed, at first glance it seems that the chance of the prize being
behind either of the two yet unopened doors is the same. Such a conclusion
would be correct only if the compere chose at random
which door to open. In Monty Hall’s actual game, however, he knew precisely
where the prize was hidden and chose which door to open not at random, but with
certainty that the door he opened hid no prize. In this case changing the choice
from the door originally chosen by the player to the other yet unopened door
actually doubles the probability of winning.**

**To see why this is so, note that at the beginning of the game, there was only
one “winning” door and two “losing” doors. Hence, when a player chose
arbitrarily one of the doors, the probability of his choosing the “winning”
door was 1/3 while the probability of his choosing a “losing” door was 2/3,
i.e., twice as large. Now, if the player luckily chose the “winning” door, he
would win if he did not change his choice. This situation happens, on the
average, in 1/3 of the games, if the game is played many times. If, though, the
player happened to choose a “losing” door, he had to change his choice in
order to win. This situation happens, on the average, in 2/3 of the games if
they are played many times. Hence, to double his chance of winning, the player
had better change his original choice.**
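The "played many times" argument can be tried out directly. Below is a minimal simulation sketch; the names `play` and `win_rate`, the seed, and the number of games are illustrative assumptions. It also assumes that when the player's first choice happens to be the winning door, the host picks one of the two empty doors at random.

```python
import random


def play(switch, rng):
    """Play one game with a host who knows where the prize is."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    choice = rng.choice(doors)
    # The host opens an empty door that the player did not choose.
    opened = rng.choice([d for d in doors if d != choice and d != prize])
    if switch:
        # Move to the only door that is neither chosen nor opened.
        choice = next(d for d in doors if d != choice and d != opened)
    return choice == prize


def win_rate(switch, games=100_000, seed=2024):
    """Fraction of games won with the given strategy."""
    rng = random.Random(seed)  # arbitrary fixed seed, for repeatability
    return sum(play(switch, rng) for _ in range(games)) / games


print(f"stick:  {win_rate(False):.3f}")
print(f"switch: {win_rate(True):.3f}")
```

The two printed frequencies should land near 1/3 and 2/3 respectively.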

**I can suggest a simple semi-formal
proof of the above conclusion.**

**Denote the doors A, B, and C, and let A be the door chosen by the player. P(X) is the probability of X being the winning door.
Obviously P(A)=P(B)=P(C)=1/3 and P(A)+P(B)+P(C)=1. In our case P(A)=1/3 and P(~A)=P(B)+P(C)=2/3. Assume the compere opened door B and showed that it did not hide the
prize (as the compere already knew with 100%
certainty). Now we see that P(B)=0, hence P(A)+P(C)=1.
Since P(A)=1/3, P(C)=2/3. QED.**

**Instead
of 3, any number N of doors can be used. The calculation (left to readers)
shows that in such a case, changing the choice from the originally chosen door
to one specified door among the N-2 doors that were not originally chosen and are still closed
increases the probability of winning (N-1)/(N-2) times
(while the probability of the originally chosen door losing, that is, the
probability that one (unspecified) of the originally not chosen doors wins,
is N-1 times the probability of the originally chosen door winning).**
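For readers who prefer to check the N-door calculation rather than carry it out by hand, here is a hedged sketch that enumerates the cases exactly with fractions. The function name `n_door_probabilities` is hypothetical, and two assumptions are made that the text does not spell out: the host opens exactly one empty unchosen door, picked at random among those available, and a switching player picks the new door at random among the remaining closed ones (by symmetry this gives the same value as naming one specific door in advance).

```python
from fractions import Fraction


def n_door_probabilities(n):
    """Exact (stick, switch) winning probabilities for n doors.

    Assumes the host opens one empty, unchosen door at random and the
    switching player moves to a random remaining closed door.
    """
    if n < 3:
        raise ValueError("need at least three doors")
    chosen = 0  # by symmetry the player's first door can be fixed
    stick = Fraction(0)
    switch = Fraction(0)
    for prize in range(n):
        openable = [d for d in range(n) if d != chosen and d != prize]
        for opened in openable:
            weight = Fraction(1, n) * Fraction(1, len(openable))
            remaining = [d for d in range(n) if d not in (chosen, opened)]
            if prize == chosen:
                stick += weight
            else:
                # The prize is among the remaining doors; a random pick
                # hits it with probability 1/len(remaining).
                switch += weight * Fraction(1, len(remaining))
    return stick, switch


for n in (3, 4, 10):
    stick, switch = n_door_probabilities(n)
    print(n, stick, switch, switch / stick)
```

The last printed column is the ratio of the two probabilities, which can be compared with (N-1)/(N-2).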

**Comment
1: I refer to the above proof as “semi-formal” because it is simplified for the
sake of readers not well versed in probability theory; a more detailed proof
would use conditional probabilities; the result, however, would not change. The
most rigorous but substantially more complicated proof can be performed using
the so-called Bayesian approach.**
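As an illustration of the conditional-probability route mentioned in Comment 1, here is a sketch of the calculation. It keeps the notation P(X) for "door X hides the prize", takes A as the player's door, and introduces the symbol O_B (my label, not part of the proof above) for the event "the compere opens door B". It also assumes that when A is the winning door the compere opens B or C with equal probability.

$$
P(C \mid O_B)
= \frac{P(O_B \mid C)\,P(C)}{P(O_B \mid A)\,P(A) + P(O_B \mid B)\,P(B) + P(O_B \mid C)\,P(C)}
= \frac{1 \cdot \tfrac{1}{3}}{\tfrac{1}{2} \cdot \tfrac{1}{3} + 0 \cdot \tfrac{1}{3} + 1 \cdot \tfrac{1}{3}}
= \frac{2}{3}
$$

Under the same assumption P(A | O_B) = 1/3, in agreement with the semi-formal proof.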

**Comment
2: The above simple proof is based on the fact that the compere
knew precisely where the prize was and which door, B or C, was empty. In the
absence of such knowledge, that is, were Monty Hall to choose at random which
door (B or C) to open, the above proof would become invalid; indeed, in such a
case it does not matter whether the player sticks to the originally chosen door
or switches to the alternative door: the chance of winning in both cases will
be the same.**
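The random-host variant of Comment 2 can be sketched in the same way. The function name `random_host_rates` and the seed are illustrative. The sketch assumes that games in which the randomly opened door happens to reveal the prize are simply discarded, so that only games matching the described situation (an empty door was shown) are counted.

```python
import random


def random_host_rates(games=200_000, seed=7):
    """Win rates (stick, switch) when the host opens an unchosen door at random.

    Games in which the host accidentally reveals the prize are not counted.
    """
    rng = random.Random(seed)  # arbitrary fixed seed, for repeatability
    valid = stick_wins = switch_wins = 0
    for _ in range(games):
        prize = rng.randrange(3)
        choice = rng.randrange(3)
        opened = rng.choice([d for d in range(3) if d != choice])
        if opened == prize:
            continue  # prize revealed; this game is discarded
        valid += 1
        other = next(d for d in range(3) if d not in (choice, opened))
        stick_wins += choice == prize
        switch_wins += other == prize
    return stick_wins / valid, switch_wins / valid


stick, switch = random_host_rates()
print(f"stick:  {stick:.3f}")
print(f"switch: {switch:.3f}")
```

Both frequencies should come out near one half, so here switching offers no advantage.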

**If many people, including trained mathematicians, are sometimes confused by the
rather simple probabilistic situations above, it is no surprise that probabilities
are misused quite often in more complex cases. This shows the extreme caution
necessary when probabilities are used to arrive at important conclusions.**

**The above section was
substantially improved due to discussions with several colleagues.
Specifically, the part related to the “buses” problem was clarified thanks to
comments by Brendan McKay and Peter Olofsson; the
part related to the Monty Hall game was likewise edited thanks to comments by
Brendan McKay, Peter Olofsson, Jason Rosenhouse, and Douglas Theobald.**


**Consider a game with a coin. Each time we toss a coin it can result in
either tails or heads facing up. If we toss a die, it can result in any of six
numbers facing up, namely 1, 2, 3, 4, 5 and 6. If we want to choose one card
out of fifty cards scattered on a table, face down, and turn its face up, it
can result in any one of fifty cards facing up.**

**Let us now introduce certain terms. Each time we toss a coin or a die, or
turn over a card, this will be referred to as a trial. In the
case of a coin, the trial can have either of two possible outcomes, tails
(T) or heads (H). In the game with dice, each trial can result in any of six
possible outcomes, 1, 2, 3, 4, 5, or 6. In the case of 50 cards, each
trial can result in any one of fifty possible outcomes, such as five of
spades, or seven of diamonds, etc.**

**Now assume we conduct the game in sets of several trials. For example,
one of the players tosses a die five times in a row, resulting in a set of five
outcomes, for example 5, 3, 2, 4, and 4. Then his competitor also tosses the
die five times resulting in some other combination of five outcomes. The player whose five trials result in a larger sum of numbers
wins. The set of 5 (or 10, or 100, or 10,000 etc) trials
constitutes a test. The combination of 5 (or 10, or 100, or 10,000 etc) outcomes
obtained in a test constitutes an event. Obviously, if each test
comprises only a single trial, the terms trial and test, as well as
the terms outcome and event, become interchangeable.**

**For the further discussion we have to introduce the concept of an
"honest coin" (also referred to as a "fair coin"). It means
we postulate that the coin is perfectly round and its density is uniform all
over its volume, and that in no trial do the players consciously attempt to
favor either of the two possible outcomes. If our postulate conforms to
reality, what is our estimate of the probability that the outcome of an
arbitrary trial will be, for example T (or H)?**

**First, it seems convenient to assign to something that is certain, a
probability of 1 (or, alternatively, 100%). It is further convenient to assign
to something that is impossible a probability of zero. Then the probability of
an event that is neither certain nor impossible will always be
between 0 and 1 (or between 0% and 100%).**

**Now we can reasonably estimate the actual value of some probabilities in
the following way. For example, if we toss a fair coin, the outcomes H and T
actually differ only in the names we give them. Thus, in a long sequence of
coin tosses, H and T can be expected to happen almost equally often. In other
words, outcomes H and T can be reasonably assumed to have the same probability.
This is only possible if each has probability of ½ (or 50%).**

**Since we use the concept of probability, which by definition is
not certainty, it means that we do not predict the precise outcome of a
particular trial. We expect, though, that in a large number of trials the
number of occurrences of H will be roughly equal to that of T. For
example, if we conduct a million trials, we expect that in approximately one
half of trials (i.e. in close to 500,000 trials) the outcome will be T
and in about the same number of trials it will be H.**

**Was our postulate of an honest coin correct? Obviously it could not be
absolutely correct. No coin is perfect. Each coin has certain imprecision of
shape and mass distribution, which may make the T outcome slightly more
likely than the H outcome, or vice versa. A player may inadvertently
favor a certain direction, which may be due to some anatomical peculiarities of
his/her arm. There may be occasional wind affecting the coin’s fall, etc.
However, for our theoretical discussion we will ignore the listed possible
factors and assume an honest coin. Later we will return to the
discussion of the above possible deviations from a perfectly honest coin.**

**We see that our postulate of an honest coin led to another postulate,
that of the equal probability of possible outcomes of trials. In the
case of a coin there were two different possible outcomes, T and H,
equally probable. In some other situation there can be any number of possible
outcomes. In some situations those possible outcomes can be assumed to be all
equally probable, while in some other situations the postulate of their equal
probability may not hold. In each specific situation it is necessary to
clearly establish whether or not the postulate of equal probability of all
possible outcomes is reasonably acceptable or whether it must be dismissed. Ignoring
this requirement has been the source of many erroneous considerations of
probabilities. We will discuss specific examples of such errors later on.**

**Now consider one more important feature of probability. Suppose we have
conducted a trial and the outcome was T. Suppose
we proceed to conduct one more trial, tossing the coin once more. Can we
predict the outcome of the second trial given the known outcome of the first
trial? Obviously, if we accept the postulate of an honest coin and the
postulate of equal probability of outcomes, the outcome of the first trial has
no effect on the second trial. Hence, the postulates of an honest coin and of
equal probability of outcomes lead us to a third postulate, that of independence
of tests. The postulate of independence of tests is based on the
assumption that in each test the conditions are exactly the same, which means
that after each test the initial conditions of the first test are exactly restored.
The applicability of the postulate of tests’ independence must be ascertained
before any conclusions can be made in regard to the probabilities’ estimation.
If the independence of tests cannot be ascertained, the probability must be
calculated differently from the situation when the tests are independent.**

**We will discuss situations with independent and not independent tests in
more detail later on.**

**A discussion analogous to that of the honest coin can be also applied to those
cases where the number of possible outcomes of a trial is larger than two, be
this number three, six, or ten million. For example, if, instead of a coin, we
deal with a die, the postulate of an honest coin has to be replaced with the
similar postulate of an honest die, while the postulates of equal
probability of all possible outcomes (of which there now are six instead of
two) and of independent tests have to be verified as well before calculating
the probability.**

**The postulate of an honest coin, or one of its analogs, is
conventionally implied when probabilities are calculated. Except for some
infrequent situations, this postulate is usually reasonably valid. However,
some writers who calculate probabilities do not verify the validity of the
postulates of equal probability and of independence of tests. This is not an
uncommon source of erroneous estimation of probabilities. Pertinent examples
will be discussed later on.**

**Suppose we conduct our coin game in consecutive sets of 10 trials each.
Each set of 10 trials constitutes a test. In each ten-trial test the
result is a set of 10 outcomes, constituting an event. For example,
suppose that in the first test the event comprised the following 10 outcomes:
H, H, T, H, T, T, H, T, H, and H. Hence, the event in question included 6 heads
and 4 tails. Suppose that the next event comprised the following
outcomes: T, H, T, T, H, H, H, T, T, and T. This time the event included 6 tails
and 4 heads. In neither of the two events was the number of T
equal to the number of H, and, moreover, the ratio of H to T
was different in the two tests. Does this mean that, first, our estimate of the
probability of, say, H, as ½ was wrong, and, second, that our postulate
of equal probabilities of H and T was wrong? Of
course not.**

**We realize that the probability does not predict the exact outcome of
each trial and hence does not predict particular events. What is, then, the
meaning of probability?**

**If we accept the three postulates introduced earlier (honest coin, equal
probability of outcomes and independence of tests) then we can define
probability in the following manner. Let us suppose that the probability of a
certain event A is expressed as 1/N, where N is a positive
number. For example, if the event in question is the combination of two
outcomes of tossing a coin, the probability of each such event is 1/4, where
N=4. It means that in a large number X of tests event A will
occur, on the average, once in every N tests. For this prediction
to hold, X must be much larger than N. The larger the ratio X/N,
the closer the observed frequency of event A will be to the
probability value, i.e. to 1 occurrence in every N tests.**

**For example, as we concluded earlier, in a test comprising two consecutive
tosses of a coin the probability of each of the four possible events is the
same ¼, so N=4. It means that if we repeat the described test X times, where X
is much larger than 4 (say, one million times) each of the four possible
events, namely HH, HT, TT, and TH will happen, on the average once in
every four tests.**

**We have now actually introduced (not quite rigorously) one more
postulate, sometimes referred to as the law of large numbers. The gist
of that law is that the observed frequency of an event can be some accidental number
unless it is determined over a large number of tests. The value of probability
does not predict the outcome of any particular test, but in a certain sense we
can say that it "predicts" the results of a very large number of
tests in terms of the values averaged over all the tests.**
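The role of the number of tests can be illustrated with a small simulation. This is a sketch only: the function name `frequency_of_event`, the seed, and the test counts are arbitrary illustrative choices. Each test consists of two tosses of an honest coin, and the frequency of the event "TH" is measured over progressively larger numbers of tests.

```python
import random


def frequency_of_event(event, n_tests, rng):
    """Fraction of two-toss tests whose outcomes equal `event`, e.g. "TH"."""
    hits = 0
    for _ in range(n_tests):
        result = rng.choice("HT") + rng.choice("HT")
        hits += result == event
    return hits / n_tests


rng = random.Random(1999)  # arbitrary fixed seed, for repeatability
for n_tests in (10, 1_000, 100_000):
    print(n_tests, frequency_of_event("TH", n_tests, rng))
```

For small numbers of tests the printed frequency can differ noticeably from 1/4; for large numbers it should settle close to it.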

*If any one of the four postulates does not hold (the number of tests is
not much larger than N, the "coin" or "die" etc. is not
"honest," the outcomes are not equally probable, or the
tests are not independent), the value of probability calculated as 1/N has no
meaningful interpretation.*

**Ignoring the last statement is often the source of unfounded conclusions
from probability calculations.**

**Later we will also discuss situations when some of the above postulates
do not hold (in particular, the postulate of independence) but nevertheless the
probabilities of events comprising several trials each can be reasonably
estimated.**

**The above descriptive definition of probability is sometimes referred to
as the "classical" one.**

**There are in probability theory also some other definitions of
probability. They overcome certain logical shortcomings of the classical
definition and generalize it. In this paper we will not use explicitly (even
though they may be sometimes implied) those more rigorous definitions since the
above offered classical definition is sufficient for our purpose.**

**Let us now discuss the calculation of the probability of an event.
Remember that event is defined as the combination of outcomes in a set
of trials. For example, what is the probability that in a set of two trials
with a coin the event will be "T, H" i.e. that the outcome of
the first trial will be T and of the second trial will be H? We
know that the probability of T in the first trial was ½. This conclusion
stemmed from the fact that there were two equally probable outcomes. The
probability of ½ was estimated by dividing 1 by the number (which was 2) of all
possible equally probable outcomes. If the trials are conducted twice in a row,
how many possible equally probable events can be imagined? Here is the obvious
list of all such events: 1) T, T; 2) T, H; 3) H, H; 4) H, T. The
total of 4 possible results, all equally probable, covers all possible events.
Obviously, the probability of each of those four events is the same ¼. We see
that the probability of the event comprising the outcomes of two consecutive
trials equals the product of probabilities of each of the sequential outcomes.
This is one more postulate, which is based on the independence of tests: the
rule of multiplication of probabilities. The probability of an event is the
product of the probabilities of the outcomes of all sequential trials constituting that
event. As we will see later, this rule has certain limitations.**
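The agreement between counting equally probable events and multiplying the probabilities of outcomes can be verified mechanically. The following sketch uses exact fractions; the variable names are illustrative, and the honest-coin and independence postulates are assumed.

```python
from fractions import Fraction
from itertools import product

p_outcome = {"T": Fraction(1, 2), "H": Fraction(1, 2)}  # honest coin

# All events of a two-trial test, each treated as equally probable.
events = ["".join(e) for e in product("TH", repeat=2)]
by_counting = {e: Fraction(1, len(events)) for e in events}

# The same probabilities obtained by multiplying outcome probabilities.
by_multiplying = {e: p_outcome[e[0]] * p_outcome[e[1]] for e in events}

print(events)
print(by_counting == by_multiplying)
```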

**(In textbooks on the probability theory the independence of tests is
often treated in the opposite way, namely establishing that if the probability
of a combination of events equals the product of the probabilities of the
individual events, then these individual events are independent).**

**SOME SITUATIONS WITH
SEEMINGLY UNEQUAL PROBABILITIES**

**Let us discuss certain aspects of probability calculations which have
been a pivotal point in the dispute between "creationists" (who
assert that life could not have emerged spontaneously but only via a
divine act by the Creator) and "evolutionists" (who adhere to a
theory asserting that life emerged as a result of random interactions between
chemical compounds in the primeval atmosphere of our planet or of some other planet).**

**In particular, the creationists maintain that the probability of life’s
spontaneous emergence was so negligibly low that it must be dismissed as
improbable.**

**To support their view, the creationists use a number of various
arguments. Lest I be misunderstood, I would like to point out that I am not
discussing here whether the creationists or evolutionists are correct in their
assertions in regard to the origin of life. This question is very complex and
multi-faceted and the probabilistic argument often employed by the creationists
is only one aspect of their view. What I will show is that the probabilistic
argument itself, as commonly used by many creationists, is unfounded and cannot
be viewed as a proof of their views, regardless of whether those views are
correct or incorrect.**

**The probabilistic argument often used by the creationists is as follows.
Imagine tossing a die with six faces. Repeat it 100 times. There are many
possible combinations of the six numbers (we would say there are many possible events,
each comprising 100 outcomes of individual trials). The probability of each
event is exceedingly small (about one over 10^{77}) and is the same
for each combination of numbers, including, say, a combination of 100
"fours," that is 4,4,4,4,4,4,4,4,4,4 … etc, ("four"
repeated 100 times in a row). However, say the creationists, the probability
that the set of 100 numbers will be some random combination is much larger than
the probability of 100 "fours," which is a unique,
or "special" event. Likewise, the spontaneous emergence of life is a
special event whose probability is exceedingly small,
hence it could not happen spontaneously. Without discussing the ultimate
conclusion about the origin of life, let us discuss only the example with the
die.**

**Indeed, the probability that the event will be some random collection of
numbers is much larger than the probability of "all fours." It does
not mean anything. The larger probability of random sets of numbers is simply
due to the fact that it is a combined probability of many events, while
for "all fours" it is the probability of only one particular event.
From the standpoint of the probability value, there is nothing special about
the "all fours" event; it is an event which is exactly as probable as any
other individual combination of numbers, be it "all sixes,"
"half threes + half fives" or any arbitrary disordered set of 100
numbers made up of six symbols, like 2,5,3,6,1,3,3,2… etc. The probability
that 100 trials result in any particular set of numbers is always less
than the combined probability of all the rest of the possible sets of numbers,
exactly to the same extent as it is for "all fours." For example, the
probability that 100 consecutive trials will result in the following disordered
set of numbers: 2, 4, 1, 5, 2, 6, 2, 3, 3, 4, 4, 6, 1… etc., which is not a
"special" event, is less than the combined probability of all the other
approximately 10^{77} possible combinations of outcomes, including the
"all fours" event, to the same extent as this is true for the
"special" event of "all fours" itself.**

**The crucial fault of the creationists’ probabilistic argument is their
next step. They proceed to assert that the "special" event, whose
probability is extremely small, simply did not happen. However, this argument
can be equally applied to any competing event whose probability is equally
extremely small. In the case of a set of 100 trials, every one of about 10^{77}
possible events has the same exceedingly small probability. Nevertheless, one
of them must necessarily take place. If we accept the probabilistic argument of
the creationists, we will have to conclude that none of the 10^{77}
possible events could have happened, which is an obvious absurdity.**
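The numbers in the die example can be reproduced exactly, since Python integers and fractions have unlimited precision. This is a minimal sketch under the honest-die and independence postulates; the variable names are illustrative.

```python
from fractions import Fraction

n_trials = 100
n_events = 6 ** n_trials  # number of distinct 100-outcome events

p_single = Fraction(1, n_events)  # probability of any one particular event
p_all_fours = Fraction(1, 6) ** n_trials  # "all fours" via multiplication
p_anything_else = 1 - p_single  # combined probability of all other events

print(f"the number of events has {len(str(n_events))} digits")
print(p_all_fours == p_single)
print(float(p_anything_else))
```

The same `p_single` applies to every individual sequence, disordered or not, while the probabilities of all the events together still add up to exactly 1, so one of them is certain to occur.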

**Of course, nothing in probability theory forbids any event to be
"special" in some sense, and spontaneous emergence of life qualifies
very well for the title of a "special" event. Being
"special" from our human viewpoint in no way makes this or any other
event stand alone from the standpoint of probability estimation. Therefore
probabilistic arguments are simply irrelevant when the spontaneous emergence of
life is discussed.**

**Let us look once more, by way of a very simple example, at the argument
based on the very small probability of a "special" event versus
"non-special" ones. Consider a case when the events under discussion
are sets of three consecutive tosses of a coin. The possible events are as
follows: HHH, HHT, HTT, HTH, TTT, TTH, THH, THT. Let us
say that for some reason we view events HHH and TTT as "special"
while the rest of the possible events are not "special." If we adopt
the probabilistic arguments of creationists, we can assert that the probability
of a "special" event, say, HHH (which is in this case 1/8) is less
than the probability of event "Not HHH" (which is 7/8). This assertion
is true. However, it does not at all mean that event HHH is indeed special from
the standpoint of probability. Indeed, we can assert by the same token that the
probability of any other of the eight possible events, for example of event HTH
(which is also 1/8) is less than the probability of event "Not HTH"
(which is 7/8). There are no probabilistic reasons to see event HHH as
happening by miracle. Its probability is not less than that of any of the other
seven possible events. This conclusion is equally applicable to situations in
which not eight but billions of billions of alternative events are possible.**

**The "all fours" type of argument has no bearing whatsoever on
the question of the spontaneous emergence of life.**

**I will return to the discussion of supposedly "special" vs.
"non-special" events in a subsequent section of this essay.**

**Now I will discuss situations in which the probabilities calculated
before the first trial cannot be directly multiplied to calculate the
probability of an event.**

**Again, consider an example. Imagine a box containing six balls identical
in all respects except for their colors. Let one ball be white, two balls, red,
and three balls, green. We randomly pull out one ball. (The term
"randomly" in this context is equivalent to the previously introduced
concepts of an "honest coin" and an "honest die"). What is
the probability that the randomly chosen ball is of a certain color? Since all
balls are otherwise identical and are chosen randomly, each of the six balls
has the same probability of 1/6 to be chosen in the first trial. However, since
the number of balls of different colors varies, the probability that a certain
color is chosen, is different for white, red, and green. Since there is only
one white ball available, the probability that the chosen ball will be white is
1/6. Since there are two red balls available, the probability of a red ball to
be chosen is 2/6=1/3. Finally, since there are three green balls available, the
probability that the chosen ball happens to be green is 3/6=1/2.**

**Assume first that the ball chosen in the first trial happens to be red.
Now, unlike in the previous example, let us proceed to the second trial without
replacing the red ball. Hence, after the first trial there remain only five
balls in the box, one white, one red and three green. Since all these five
balls are identical except for their color, each of them has the same
probability of 1/5 to be randomly chosen in the second trial. What is the
probability that the ball chosen in the second trial is of a certain color?
Since there is still only one white ball available, the probability of that
ball to be randomly chosen is 1/5. There is now only one red ball available, so
the probability of a red ball to be randomly chosen is also 1/5. Finally, for a
green ball the probability is 3/5. So if in the first trial a red ball was
randomly chosen, the probabilities of balls of different colors to be randomly
chosen in the second trial are 1/5 (W), 1/5 (R), and 3/5 (G).**

**Assume now that in the first trial not a red, but a green ball was
randomly chosen. Again, adhering to the "no replacement" procedure,
we proceed to the second trial without replacing the green ball in the box. Now
there remain again only five balls available, one white, two red and two green.
What are the probabilities that in the second trial balls of specific colors
will be randomly chosen? Each of the five balls available has the same
probability to be randomly chosen, 1/5. Since, though, there are only one white
ball, two red and two green balls available, the probability that the ball
randomly chosen in the second trial happens to be white is 1/5, while for both
red and green balls it is 2/5.**

**Hence, if the ball chosen in the first trial happened to be red, then the
probabilities of the colors in the second trial would be 1/5 (W), 1/5 (R),
and 3/5 (G). If, though, the ball chosen in the first trial happened to
be green, then the probabilities in the second trial would change to 1/5 (W),
2/5 (R), and 2/5 (G).**

**The conclusion: in the case of trials without replacement, the
probabilities of outcomes in the second trial depend on the actual outcome
of the first trial; hence in this case the tests are not independent.**

**When the tests are not independent, the probabilities calculated
separately for each of the sequential trials cannot be directly multiplied.
Indeed, the probabilities calculated before the first trial
were as follows: 1/6 (W), 2/6=1/3 (R) and 3/6=1/2 (G).
If we multiplied the probabilities as in the case of independent tests, we
would obtain for the event (RR), for example, a probability of
1/3 times 1/3, which equals 1/9. Actually, though, the probability of that
event is 1/3 times 1/5, which is 1/15. Of course, probability theory provides an
excellent way to deal with the "no replacement" situation, using the
concept of so-called "conditional probabilities." However, some
writers utilizing probability calculations seem to be unaware of the
distinction between independent and non-independent tests. Ignoring that
distinction has been a source of crude errors.**
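
**The box example can be checked by direct calculation. The following is a minimal Python sketch under the assumptions of the example (one white, two red, and three green balls, drawn at random without replacement). The function names are hypothetical, chosen only for illustration; exact fractions are used so that the results can be compared with the numbers given in the text.**

```python
from collections import Counter
from fractions import Fraction


def draw_probs(box):
    """Probability of each color on the next random draw from `box`."""
    total = sum(box.values())
    return {color: Fraction(count, total)
            for color, count in box.items() if count > 0}


def second_draw_probs(box, first_color):
    """Color probabilities in the second trial when the first ball is not replaced."""
    remaining = Counter(box)          # copy, so the original box is untouched
    remaining[first_color] -= 1
    return draw_probs(remaining)


box = Counter({"W": 1, "R": 2, "G": 3})

first = draw_probs(box)                      # before the first trial
after_red = second_draw_probs(box, "R")      # first ball was red
after_green = second_draw_probs(box, "G")    # first ball was green

# Event (RR): naive product of first-trial probabilities vs. the correct
# product that uses the conditional probability of red in the second trial.
naive_rr = first["R"] * first["R"]
correct_rr = first["R"] * after_red["R"]

print("first trial:", first)
print("second trial after red:", after_red)
print("second trial after green:", after_green)
print("naive P(RR):", naive_rr, " correct P(RR):", correct_rr)
```

**The naive product treats the second trial as if the red ball had been returned to the box; the correct product uses the composition of the box as it actually is after the first trial.**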

**One example of such erroneous calculations of probabilities is how some
proponents of the so-called Bible code estimate the probability of the
appearance in a text of certain letter sequences.**

**The letter sequences in question are the so-called ELS which stands for "equidistant letter sequences." For
example, in the preceding sentence the word "question" includes the
letter "s" as the fourth letter from the left. Skip the preceding
letter "e" and there is the letter "u." Skip again the
preceding letter "q" and the space between the words (which is to be
ignored), and there is the letter "n." The three letters, s, u, and
n, separated by "skips" of 2, constitute the word "sun" if
read from right to left. This is an ELS with a
negative "skip" of –2. There are many such ELS, both read from right
to left and from left to right in any text.**
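
**The search for an ELS is easy to mechanize. The following Python sketch is only an illustration of the definition given above, not the procedure used by any particular proponent of the "code"; the function name find_els is hypothetical. It assumes that spaces and punctuation are ignored and that letter case does not matter, and it reports positions counted from zero in the letters-only text.**

```python
def find_els(text, word, skip):
    """Return start positions (in the letters-only text) where `word`
    can be read by stepping `skip` letters at a time (negative = right to left)."""
    if skip == 0:
        raise ValueError("skip must be nonzero")
    if not word:
        return []
    letters = [ch.lower() for ch in text if ch.isalpha()]
    word = word.lower()
    hits = []
    for start in range(len(letters)):
        end = start + skip * (len(word) - 1)
        if not 0 <= end < len(letters):
            continue  # the sequence would run off the text
        if all(letters[start + i * skip] == ch for i, ch in enumerate(word)):
            hits.append(start)
    return hits


sentence = "The letter sequences in question"
hits = find_els(sentence, "sun", -2)
print(hits)  # the "s" of "question", read backwards through "u" and "n"
```

**Running the same function over a range of skips and over a long text quickly shows how numerous such sequences are in any text.**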

**There are people who are busy looking for arrays of ELS in the Hebrew
Bible believing these arrays had been inserted into the text of the Bible by
the divine Creator and constitute a meaningful "code." As one of the
arguments in favor of their beliefs, the proponents of the "code"
attempt to show that the probability of such arrays of ELS happening in a text
by sheer chance is exceedingly small and therefore the presence of those arrays
of ELS must be attributed to the divine design.**

**There are a few publications in which attempts have been made to apply an
allegedly sound statistical test to the question of the Bible code. In
particular, D. Witztum, E. Rips, and Y. Rosenberg
(WRR) described such an attempt in a paper published in 1994 in
"Statistical Science" (v. 9, No 3, 429 – 438). The methodology by WRR
has been thoroughly analyzed in a number of critical publications and shown to
be deficient. This methodology goes further than the application of probability
theory, making use of some tools of mathematical statistics, and therefore is
not discussed here since this paper is only about probability calculations.
However, besides the paper by WRR and some other similar publications, there
are many publications where no real statistical analysis is attempted but only
"simple" calculations of probabilities are employed. There are common
errors in those publications, one being the multiplication of probabilities in
cases when the tests are not independent. (There are also many web publications
in which a supposedly deeper statistical approach is utilized to prove the
existence of the Bible code. These calculations purport to determine the
probability of appearance in the text not just of individual ELS, but of whole
clusters of them. Such analysis usually starts with the same erroneous
calculation of probabilities of individual words as examined in the following
paragraphs.)**

**Usually the calculations in question start by choosing a word whose
possible appearance as an ELS in the given text is
being explored. When such a word has been selected, its first letter becomes
thus determined. The next step is estimating the probability of the letter in
question to appear at arbitrary locations in a text. The procedure is repeated
for every letter of the chosen word. After having allegedly determined the
probabilities of occurrence of each letter of a word constituting an ELS, the proponents of the "code" then multiply
the calculated probabilities, thus supposedly finding the probability of the
occurrence of the given ELS.**

**Such multiplication is illegitimate. Indeed, a given text comprises a
certain set of letters. When the first letter of an ELS
has been chosen (and the probability of its occurrence anywhere in the text
has been calculated) this makes all the sites in the text occupied by that
letter inaccessible to any other letter. Let us assume that the first letter of
the word in question is X, and it happens x times in the entire
text, whose total length is N letters. The proponents of the code
calculate the probability of X occurring at any arbitrary site as x/N.
This calculation would be correct only for a random collection of N
letters, among which letter X happens x times. For a meaningful
text this calculation is wrong. However, since we wish at this time to address
only the question of the tests’ independence, let us accept the described
calculation for the sake of discussion. As soon as letter X has been
selected, and the probability of its occurrence at any location in the text
allegedly determined, the number of sites accessible for the second letter in
the chosen word decreases from N to N-x. Hence, even if we accept
the described calculation, then the probability of the second letter (let us
denote it Y) to appear at an arbitrary still accessible site is now y/(N-x)
where y is the number of occurrences of letter Y in the entire
text. It is well known that the frequencies of various letters in meaningful
texts are different. For example, in English the most frequent letter is e,
whose frequency (about 12.3%) is about 180 times larger than that of the least
frequent letter, which is z (about 0.07%).**

**Hence, depending on which letter is the first one in the chosen word,
i.e., on what the value of x is, the probability of the occurrence of
the second letter, estimated as y/(N-x), will
differ.**
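
**A small numerical illustration of this dependence follows. It is a Python sketch under loudly hypothetical assumptions: a text of N = 10,000 letters in which the second letter Y occurs 400 times, with the first letter X taken to be either e (about 12.3% of the text) or z (about 0.07%). None of these counts come from a real text; they serve only to show that the estimate y/(N-x) changes with the choice of X.**

```python
from fractions import Fraction

N = 10_000   # hypothetical total number of letters in the text
y = 400      # hypothetical number of occurrences of the second letter Y


def second_letter_probability(x):
    """Estimate y/(N-x) after the x sites occupied by the first letter are excluded."""
    return Fraction(y, N - x)


p_naive = Fraction(y, N)                             # y/N, ignoring the first letter
p_if_first_is_e = second_letter_probability(1_230)   # X = 'e', about 12.3% of N
p_if_first_is_z = second_letter_probability(7)       # X = 'z', about 0.07% of N

print(float(p_naive), float(p_if_first_is_e), float(p_if_first_is_z))
```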

**Therefore we have in the described case a typical situation "without
replacement" where the outcome of the second trial (the probability of Y)
depends on the outcome of the preceding trial (which in its turn depends on the
choice of X). Therefore the multiplication of calculated probabilities
performed by the code proponents as the second (as well as the third, the
fourth, etc.) step of their estimation of ELS probability is illegitimate and
produces meaningless numbers of alleged probabilities.**

**The probabilities of various individual letters appearing at an arbitrary
site in a text are not very small (mostly between about 1/8 and 1/100). If a
word consists of, say, six letters, the multiplication of six such fractions
results in a very small number which is then considered to be the probability
of an ELS but is actually far from the correct value of the probability in
question.**

**Using y/(N-x) instead of y/N, and thus
correcting one of the errors of such calculations, would not suffice to make
the estimation of the probability of an ELS reliable. The correct probability of an ELS could be calculated based on certain assumptions in
regard to the text’s structure, which distinguishes meaningful texts from
random conglomerates of letters. There is no mathematical model of meaningful
texts available, and therefore the estimations of the ELS probability, even if
calculated accounting for interdependence of tests,
would have little practical meaning until such a mathematical model is
developed.**

**Finally, the amply demonstrated presence of immense numbers of various
ELS in both biblical and any other texts, in Hebrew as well as in other
languages, is the simplest and also the most convincing proof that the
allegedly very small probabilities of ELS appearance, as calculated by the
proponents of the "code," are indeed of no evidential merit
whatsoever.**

**THE COGNITIVE ASPECTS OF
PROBABILITY**

**So far I have discussed the quantitative aspects of probability. I will
now discuss probability from a different angle, namely analyzing its cognitive
aspects. This discussion will be twofold. One side of the cognitive meaning of
probability is that it essentially reflects the amount of information available
about the possible events. The other side of the probability’s cognitive aspect
is the question of what the significance of this or that value of probability
essentially is.**

**I will start with the question of the relationship between the calculated
probability and the level of information available about the subject of the
probability analysis. I will proceed by considering certain examples
illustrating that feature of probability.**

**Imagine that you want to meet your friend who works for a company with
offices in a multistory building downtown. Close to 5 pm you are
on the opposite side of the street, waiting for your friend to come out of the
building. Let us imagine that you would like to estimate the probability that
the first person coming out will be male. You have never been inside that
building so you have no knowledge of the composition of the people working in that
building. Your estimate will necessarily be that the probability of the
first person coming out being male is ½, and the same probability for
female. Let us further imagine that your friend who works in that
building knows that among the people working there about 2/3 are female and
about 1/3 are male. Obviously his estimate will be that the probability of the
first person coming out to be male is 1/3 rather than ½. Obviously, the
objective likelihood of a male coming out first does not depend on who makes
the estimate. It is 1/3. The different estimates of probability are due
to something that has no relation to the subject of the probability estimation.
They are due to the different level of information about the subject possessed
by you and your friend. Because of your very limited knowledge about the
subject, you have to assume that the two possible events, a male or a female
coming out first, are equally probable. Your friend knew more;
in particular he knew that the probability of a female coming out first was
larger than that of a male coming out first.**

**This example illustrates an important property of the calculated
probability. It reflects the level of knowledge about a subject. If we possess
the full knowledge about the subject we know exactly, in advance, the outcome
of a test, so instead of probability we deal with certainty.**

**A common situation in which we
have full knowledge of the situation is when an event has actually
occurred. In such a situation the question of the probability of the
event is meaningless. After the first person had actually come out of the
building, the question of the probability of that event becomes moot. Of
course we still can calculate the probability of that event, but doing so we
necessarily deal with an imaginary situation assuming the event has not yet
actually occurred.**

**Being the reflection of the
level of knowledge about a subject is the ubiquitous and most essential feature
of the probability from the viewpoint of its cognitive essence.**

**What about the examples with a coin or a die, where we thought we
possessed the full knowledge of all possible outcomes and all those possible
outcomes definitely seemed to be equally probable?**

**We did not possess such knowledge! Our assumption of the equal
probability of either heads or tails, or of the equal probability of each of
the six possible outcomes of a trial with a die was due to our limited
knowledge about the actual properties of the coin or of the die. No coin and no
die are perfect. Therefore, in the tests with a coin, either head or tail may
have a slightly better chance of occurring. Likewise, in the test with a die,
some of the six facets of the die may have a slightly better chance to face
upward. In tests conducted by K. Pearson with a coin (1921), after it was
tossed 24,000 times, heads occurred in 12,012 trials and tails in 11,988
trials. Generally speaking, a slight difference between the numbers of heads
and tails is expected in a large sequence of truly random tests. On the other
hand, we cannot exclude that the described result was due, at least partially,
to a certain imperfection in the coin used, or in the procedure employed.**
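
**How slight is the difference in the Pearson data? One way to judge, sketched below in Python, is to compare the observed excess of heads with the standard deviation expected for a perfectly "honest coin." The binomial model with probability ½ for heads is an assumption made here for illustration; the two counts are those quoted above.**

```python
import math

tosses = 24_000
heads = 12_012

# For an honest coin the number of heads is binomial with p = 1/2.
expected_heads = tosses * 0.5
std_dev = math.sqrt(tosses * 0.5 * 0.5)

# Observed excess of heads, measured in standard deviations.
z_score = (heads - expected_heads) / std_dev

print(f"expected {expected_heads:.0f}, std dev {std_dev:.1f}, z = {z_score:.2f}")
```

**The excess of 12 heads is a small fraction of one standard deviation (which is about 77), so under this model the result is entirely compatible with a fair coin; by itself, of course, it cannot rule out a small imperfection either.**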

**Since we have no knowledge of the particular subtle imperfections of a
given coin or die, we have to postulate the equal probability of all possible
outcomes.**

**In the tests with a die or a coin, we at least know all possible
outcomes. There are many situations in which we have no such knowledge. If that
is the case, we have to assume the existence of some supposedly possible events
which actually are impossible, but we simply cannot rule them out.**

**For example, assume we wish to estimate the probability that upon
entering a property at **

**Quite often the very small calculated probabilities of certain events are
due to the lack of information and hence to an exaggerated number of supposedly
possible events many of which are actually impossible. One example of such a
greatly underestimated probability of an event is the alleged estimation of the
probability of life’s spontaneous emergence. The calculations in question are
based on a number of arbitrary assumptions and deal with a situation whose
details are largely unknown. Therefore, in such calculations the number of
possible events is greatly exaggerated, and all of them are assumed to be
equally probable, which leads to extremely small values of calculated
probability. Actually, many of the allegedly possible paths of chemical
interactions may be impossible, and those possible are by no means equally
probable. Therefore (and for some other reasons as well) the extremely small
probability of life’s spontaneous emergence must be viewed with the utmost
skepticism.**

**Of course, it is equally easy to give an example of a case in which
insufficient knowledge of the situation results not in an increased but rather
in a decreased number of supposedly possible outcomes of a test. Imagine that
you made an appointment over the phone to meet John Doe at the entrance to his
residence. You have never before seen his residence. When you arrive at his
address you discover that he lives in a large apartment house which seems to
have two entrances at the opposite corners of the building. You have to watch
both entrances. Your estimate of the probability that John would exit from the
eastern door is ½, as it is also that he would exit from the western door. The
estimated number, ½, results from your assumption of equal
probability of John’s choosing either of the exits and from your
knowledge that there are two exits. However, what if you don’t know that the
building has also one more exit in the rear? If you knew that fact, your
estimated probability would drop to 1/3 for each of the doors. Insufficient
knowledge (you knew only about two possible outcomes) led you to an increased
estimated probability compared with that calculated with a more complete knowledge
of the situation, accounting for all three possible outcomes.**

**The two described situations, one when the number of possible outcomes is
assumed to be larger than it actually is, and the other when the number of
supposedly possible outcomes is less than the actual number of them, may result
in two different types of misjudgment, leading to an underestimated probability
in the first case and to an exaggerated one in the second.**

**Now let us discuss the other side of the probability’s cognitive aspect.
What is the real meaning of probability’s calculated value if it happens to be
very small?**

**Consider first the situation when all possible outcomes of trials are
supposedly equally probable. Assume the probability of an event A was
calculated as 1/N where N is a very large number so the
probability of the event is very low. Often, such a result is interpreted as an
indication that the event in question should be considered, to all intents and
purposes, as practically impossible. However, such an interpretation, which may
be psychologically attractive, has no basis in probability theory. The actual
meaning of that value of 1/N is just that – the event in question is one
of N equally probable events. If event A has not occurred it
simply means that some other event B has occurred instead. But event B
had the same very low probability of occurring as event A. So why could
the low-probability event B actually occur but event A
which had the same probability as B, could not occur?**

**An extremely low value for a calculated probability has no cognitive
meaning in itself. Whichever one of N possible events has actually
occurred, it necessarily had the same very low probability as the others, but
has occurred nevertheless. Therefore the assertion of impossibility of such events
as the spontaneous emergence of life, based on its calculated very low
probability, has no merit.**

**If the possible events are actually not equally probable, which is a more
realistic approach, a very low calculated probability of an event has even less
of a cognitive meaning, since its calculation ignored the possible existence of
preferential chains of outcomes which could ensure a much higher probability
for the event in question.**

**The above discourse may produce in the minds of some readers an impression
that my thesis was to show that the concept of probability is really not very
useful since its cognitive content is very limited. This was by no means my
intention. When properly applied and if not expected to produce unrealistic
predictions, the concept of probability may be a very potent tool for shedding
light on many problems in science and engineering. When applied improperly and
if expected to be a magic bullet to produce predictions, it often becomes
misleading and a basis for a number of unfounded and sometimes ludicrous
conclusions. The real power of properly calculated and interpreted
probability lies, however, not in calculating the probability of this or that
event, where it is indeed of limited value, but in its use
as an integrated tool within the much more sophisticated framework of either
mathematical statistics or statistical physics.**

**PSYCHOLOGICAL ASPECTS OF
PROBABILITY**

**The scientific theories often seem to contradict common sense. When this
is the case, it is the alleged common sense that is deceptive, while the
assertions of science are correct. The whole science of quantum mechanics,
which is one of the most magnificent achievements of the human mind, seems to
be contrary to the "common sense" based on the everyday experience of
men.**

**One good example of the above contradiction is related to the motion of
spacecraft in orbit about a planet. If there are two spacecraft moving in the
same orbit, one behind the other, what should the pilot of the craft that is
behind do if he wishes to overtake the one ahead? "Common sense" tells us that the pilot in question has to increase the
speed of his craft along the orbital path. Indeed, that is what we do when we
wish to overtake a car that is ahead of us on a road. However, in the case of
an orbital flight the "common sense" is wrong. To overtake a
spacecraft that is ahead in the orbit, the pilot of the craft that lags behind
must decrease rather than increase his speed. This theoretical
conclusion of the science of mechanics has been decisively confirmed in
multiple flights of spacecraft and artificial satellites, despite its
seemingly contradicting the normal experience of car drivers, pedestrians,
runners, and horsemen, and "common sense" based on that experience.
Likewise, many conclusions of probability theory may seem to contradict common
sense, but nevertheless probability theory is correct while "common
sense" in those cases is wrong.**

**Consider an experiment with a die, where events in question are sets of 10
trials each. Recall that we assume an "honest die" and in addition
the independence of outcomes. If we toss the die once, each of the six possible
outcomes has the same chance of happening, the probability of each of the six
numbers to face up being the same 1/6. Assume that in the first trial the
outcome was, say, 3. Then we toss the die the second time. It is the same die,
tossed in the same way, with the same six equally probable outcomes. To get an
outcome of 3 is as probable as any of the five other outcomes. The tests are
independent, so the outcome of each subsequent trial does not depend on the
outcomes of any of the preceding trials.**

**Now toss the die in sets of 10 trials each. Assume that the first event
is as follows: A (3, 5, 6, 2, 6, 5, 6, 4, 1, 1).
We are not surprised in the least since we know that there are 6^10 (which
is 60,466,176) possible, equally probable events. Event A is just one of
them and does not stand alone in any respect among those over sixty million
events, so it could have happened in any set of 10 trials as well as any other
of those sixty million variations of numbers. Let us assume that in the second
set of 10 trials the event is B (6, 5, 4, 2, 6, 2, 3, 2, 1, 6). Again, we have no reason to be surprised by such a
result since it is just another of those millions of possible events and there is
no reason whatsoever for it not to happen. So far the probability theory seems
to agree with common sense.**

**Assume now that in the third set of 10 trials the event is C (4,
4, 4, 4, 4, 4, 4, 4, 4, 4). I am confident that in
such a case everybody would be amazed and the immediate explanation of that
seemingly "improbable" event would be the suspicion that either the
die has been tampered with or that it was tossed using some sleight of hand.**

**While cheating cannot be excluded, the event with all ten
"fours" does not necessarily require the assumption of cheating.**

**Indeed, what was the probability of event A? It was one in over
sixty million. Despite the exceedingly small probability of A, its occurrence
did not surprise anybody. What was the probability of event B? Again
only one in over sixty million but we were not amazed at all. What was the
probability of event C? The same one in over sixty million, but this
time we are amazed.**

**From the standpoint of probability theory there is no difference
whatsoever between any of the sixty million possible events, including events A,
B and C, and all other (60,466,176 – 3 = 60,466,173) possible variations of
a ten-number combination.**

**Is the ten-time repeat of "four" extremely unlikely? Yes, it
is. Indeed, its probability was only 1 in over sixty million! However, we
should remember that any other combination of ten numbers is as unlikely (or as
likely) to occur as is the "all fours" combination. The occurrence
of 10 identical outcomes in a row is very unlikely, but not less likely than
the occurrence of any other possible set of ten numbers.**
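
**The arithmetic behind these statements is short enough to check directly. The Python sketch below assumes, as the text does, an "honest die" and independent tosses; the function name sequence_probability is hypothetical. It computes the probability of each of the three ten-toss events exactly, and then enumerates a smaller case (sets of three tosses) to show that "all fours" appears exactly once among the equally probable sequences, just like any other sequence.**

```python
from fractions import Fraction
from itertools import product


def sequence_probability(seq):
    """Probability of one specific ordered sequence of independent honest-die tosses."""
    p = Fraction(1)
    for _ in seq:
        p *= Fraction(1, 6)
    return p


A = (3, 5, 6, 2, 6, 5, 6, 4, 1, 1)
B = (6, 5, 4, 2, 6, 2, 3, 2, 1, 6)
C = (4,) * 10                       # "all fours"

total = 6 ** 10                     # number of possible ten-toss sequences
print(total, sequence_probability(A), sequence_probability(B), sequence_probability(C))

# Smaller case that can be listed in full: all sequences of three tosses.
small_case = list(product(range(1, 7), repeat=3))
print(len(small_case), small_case.count((4, 4, 4)), small_case.count((3, 5, 6)))
```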

**The theory of probability asserts that if we repeat this ten-die-tossing
test, say a billion billion billion times, then each
of the about sixty million possible combinations of ten numbers will happen
approximately once in every 60,466,176 ten-tossing tests. This is true equally
for the "all-fours" combination and for any other of the over sixty
million competing combinations.**

**Why does the "all-fours" event seem amazing? Only
for psychological reasons. It seems easier to assume cheating on the
part of the dice-tossing player than the never before seen occurrence of
"all fours" in ten trials. What is not realized is that the
overwhelming majority of events other than "all fours" was never seen, either. There are so many possible
combinations of ten numbers, composed of six different unique numbers, that
each of them occurs extremely rarely. The set of 10 identical numbers seems
psychologically to be "special" among combinations of different
numbers. For probability theory, though, the set of "all fours" is
not special in any respect.**

**Of course, if the actual event is highly favorable to one of the players,
it justifies a suspicion of cheating. The reason for
that is our experience which tells us that cheating is rather highly probable
when a monetary or other award is in the offing. However, the probability of
cheating is actually irrelevant to our discussion. Indeed, the probability of
cheating is just a peculiar feature of the example with a game of dice. This
example is used, however, to illustrate the question of the spontaneous
emergence of life where no analog of cheating is present. Therefore, the proper
analogy is one in which cheating is excluded. When the possibility of cheating
is excluded, only the mathematical probability of any of the over sixty million
possible events has to be considered. In such a case, every one of those over
sixty million events is equally probable. Therefore the extremely low
probability of any of those events, including an ordered sequence of "all
fours," is of no cognitive significance. However special this ordered
sequence may be from a certain viewpoint, it is not special at all from the
standpoint of probability. The same must be said about the probability of
spontaneous emergence of life. However small it is, it is not less than the
probability of any of the competing events and therefore its extremely small
probability in no way means it could not have happened.**

**Probability theory is a part of science and has been overwhelmingly
confirmed to be a good theory of great power. There is no doubt that the
viewpoint of probability theory is correct. The psychological reaction to ten
identical outcomes in a set is as wrong as is the suggestion to a pilot of a
spacecraft lagging behind to increase his speed if he wishes to overtake a
craft ahead in orbit.**

**THE CASE OF MULTIPLE WINS IN A
LOTTERY**

**Another example of an erroneous attitude to an "improbable" event,
based on psychological reasons, is the case of multiple wins in a lottery.**

**Consider a simple raffle in which there are only 100 tickets on sale. To
determine the winner, numbers from 1 to 100 are written on small identical
pieces of paper, the pieces are rolled up and placed
in a rotating cylinder. After the cylinder has been rotated several times, a
child whose eyes are covered with a piece of cloth pulls one of the pieces out of
the cylinder. This procedure seems to ensure as complete an absence of bias as
humanly possible.**

**Obviously each of the tickets has the same probability of winning, namely
1/100. Let us assume John Doe is the lucky one. We congratulate him but nobody
is surprised by John’s win. Out of the hundred tickets one must necessarily
win, so why shouldn’t John be the winner?**

**Assume now that the raffle had not 100 but 10,000 tickets sold. In this
case the probability of winning was the same for each ticket, namely 1/10,000.
Assume Jim Jones won in that lottery. Are we surprised? Of
course not. One ticket out of 10,000 had to win, so why shouldn’t it be
that of Jim?**

**The same discussion is applicable to any big lottery where there are hundreds
of thousands or even millions of tickets. Regardless of the number of tickets
available, one of them, either sold or unsold, must
necessarily win, so why shouldn’t it be that of Jim or John?**

**Now let us return to the small raffle with only 100 tickets sold. Recall
that John Doe won it. Assume now that, encouraged by his win,
John decides to play once again. John has already won once; the other 99
players have not yet won at all. What is the probability of winning in the
second run? For every one of the 100 players, including John, it is again the
same 1/100. Does John’s previous win provide him with any advantages or
disadvantages compared to the other 99 players? None whatsoever.
All one hundred players are in the same position, including John.**

**Assume now that John wins again. It is as probable as any of the
other 99 players winning this time, so why shouldn’t it be John? However, if
John wins the second time in a row, everybody is amazed by his luck. Why the
amazement?**

**Let us calculate the probability of a double win, based on the assumption
that no cheating was possible. The probability of winning in the first run was
1/100. The probability of winning in the second run was again 1/100. The events
are independent, therefore the probability of winning
twice in a row is 1/100 times 1/100 which is 1 in 10,000. It is exactly the
same probability as it was in the raffle with 10,000 tickets played in one run.
When Jim won that raffle, we were not surprised at all, despite the probability
of his win being only 1 in 10,000, nor should we have been. So why should we be
amazed at John’s double win whose probability was exactly the same 1 in 10,000?**

**Let us clarify the difference between the cases of a large raffle played
only once and a small raffle played several times in a row.**

**If a raffle is played only once and N tickets have
been distributed, covering all N possible versions of numbers, of which
each one has the same chance to win, then the probability that a particular
player wins is p(P)=1/N while the probability that someone
out of N players (whoever he or she might be) wins is p(S)=1 (i.e.
100%).**

**If though the raffle is played k times, and each
time n players participate, where n^k=N,
the probability that a particular player wins k times in a row is
again the same 1/N. Indeed, in each game the probability of winning for a
particular player now is 1/n. The games are independent of each other.
Hence the probability of winning k times equals the product of
probabilities of winning in each game, i.e. it is (1/n)^k=1/N.**

**However, the probability that someone (whoever
he/she happens to be) wins k times in a row is now not 1, but not more
than n/N, that maximum value corresponding to the situation in which the
same n players play in all k games. Indeed, for each particular
player the probability of winning k times in a row is 1/N. Since
there are n players, each with the same chance to win k times,
the probability of someone in that group winning k times in a row
is n times 1/N i.e. n/N . In other words, in a big raffle
played only once somebody necessarily wins (p=1). On the other hand, in a
small raffle played k times, it is likely that nobody wins k
times in a row, as the probability of such a multiple win is small.**
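
**The two probabilities just discussed can be checked with exact fractions. The following is only a minimal sketch, assuming that the same n players take part in all k games; the function names are my own illustrative choices.**

```python
from fractions import Fraction


def particular_player_streak(n: int, k: int) -> Fraction:
    """Probability that one named player wins all k games of an n-ticket raffle."""
    return Fraction(1, n) ** k


def anyone_streak_upper_bound(n: int, k: int) -> Fraction:
    """Probability that some player wins all k games (same n players every game).

    The n events "player i wins all k games" are mutually exclusive,
    so their probabilities add up: n * (1/n)^k = n/N with N = n^k.
    """
    return n * particular_player_streak(n, k)


if __name__ == "__main__":
    n, k = 100, 2
    print(particular_player_streak(n, k))   # 1/10000
    print(anyone_streak_upper_bound(n, k))  # 1/100
```

**If the group of players changes from game to game, the second value is only an upper bound, as stated above.**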

**Here is a numerical example. Let the big raffle be such
that N=1,000,000. If all N tickets are distributed, the probability that
John Doe wins is one in a million. However, the probability that somebody
(whoever he/she happens to be) wins is 1 (i.e. 100%).**

**If the raffle is small, such that only n=100 tickets are
distributed, the probability of any particular player winning in a given
game is 1/100. If k=3 games are played, the probability that a
particular John Doe wins 3 times in a row is (1/100)^3 which is again one
in a million, exactly as it was in a one-game raffle with N=1,000,000 tickets.**

**However, the probability that someone wins three times in a row,
whoever he or she happens to be, is now not 100% but at most n/N = 100/1,000,000 = 0.0001
(or less, if the composition of the players group changes from game to game),
i.e. 10,000 times less than in
a one-game raffle with one million tickets. Hence, such a raffle may be
played time after time after time, without anybody winning k times in a
row. Actually such a multiple win must be very rare.**
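
**The rarity of a multiple win can also be seen in a simple simulation. This is a rough sketch rather than a definitive model: it assumes the same n players in every game and a fair, independent draw each time, and it uses a smaller raffle (n=10, k=3, so n/N=0.01) only to keep the run short. The function name and the seed are arbitrary choices of mine.**

```python
import random


def estimate_streak_probability(n: int, k: int, trials: int, seed: int = 0) -> float:
    """Estimate how often the same player wins all k games of an n-player raffle."""
    rng = random.Random(seed)
    streaks = 0
    for _ in range(trials):
        # Draw the winner of each of the k games independently and uniformly.
        winners = [rng.randrange(n) for _ in range(k)]
        # A streak means a single player won every game of the series.
        if len(set(winners)) == 1:
            streaks += 1
    return streaks / trials


if __name__ == "__main__":
    n, k = 10, 3
    print("exact n/N :", n / n**k)
    print("simulated :", estimate_streak_probability(n, k, trials=100_000))
```

**The simulated frequency should come out close to n/N, which means that in the great majority of the simulated series nobody wins k times in a row.**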

**When John Doe wins three times in a row, we are amazed
not because the probability of that event was one in a million (which is the
same as for a single win in a big one-game raffle) but
because the probability of anyone winning three times in a row is ten
thousand times less than it is in a one-game big raffle.**

**Hence, while in the big raffle played just once, the fact
that somebody wins is 100% probable (i.e. is a certain event), in the case of
a small raffle played three times a triple win is a rare event of low
probability (in our example 1 in 10,000).**

**However, if we adhere to the postulate of a fair game, a
triple win is not a special event despite its low probability. It is as
probable as any other combination of three winning tickets, namely in our
example one in a million. To suspect fraud means to abolish the postulate
of a fair game. Indeed, if we know that fraud is possible, intuitively, we
compare the probability of an honest triple win with the probability of fraud.
Our estimate is that the probability of an honest triple win (in our case 1 in
10,000) is less than the probability of fraud (which in some cases may be quite
high).**

**The above discussion related only to a raffle-type lottery. If the lottery
is what sometimes is referred to as the Irish-type lottery, the situation is
slightly different. In this type of a lottery, the players themselves choose a
set of numbers for their tickets. For example, I believe that in the **

**From the above we can conclude that when a particular player wins more than
once in consecutive games, we are amazed not because the probability of winning
for that particular player is very low, but because the probability of anybody
(whoever he/she happens to be) winning consecutively in more than one game is
much less than the probability of someone winning only once, even in a much
larger lottery. We intuitively estimate the difference between the two
situations. However, the important point is that what impresses us is not
the sheer small probability of someone winning against enormous odds. This
probability is equally small in the case of winning only once in a big lottery,
but in that case we are not amazed. This illustrates the psychological
aspect of probability.**

**Let us
briefly discuss the meaning of the term "special event." When stating
that none of the N possible, equally probable events was in any way
special, I only meant to say that it was not special from the standpoint of its
probability. Any event, while not special in the above sense, may be very
special in some other sense.**

**Consider an example. Let
us imagine a die whose six faces bear, instead of
numbers from 1 to 6, six letters A, B, C, D, E, and F. Let us imagine further
that we toss the die in sets of six trials each. In such a case there are 6^6
= 46,656 possible, equally probable events. Among those events are the following
three: ABCDEF, AAAAAA, and FDCABE. Each of these three events has the same
probability of 1 in 46,656. Hence, from the standpoint of probability none of
these three events is special in any sense.**
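
**The count of 46,656 is small enough to verify by brute force. The sketch below, which assumes a fair die and independent tosses, simply lists every possible sequence of six tosses; the variable names are illustrative only.**

```python
from fractions import Fraction
from itertools import product

FACES = "ABCDEF"

# Every ordered sequence of six tosses of the six-letter die.
outcomes = ["".join(seq) for seq in product(FACES, repeat=6)]

# With a fair die, each sequence is equally probable.
probability = Fraction(1, len(outcomes))

if __name__ == "__main__":
    print(len(outcomes))  # 46656
    for event in ("ABCDEF", "AAAAAA", "FDCABE"):
        # Each of the three sequences appears exactly once in the list.
        print(event, outcomes.count(event), probability)
```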

**However, each of these three events may be special in a sense other
than probabilistic. Indeed, for somebody interested in alphabets the first of
the three events may seem to be very special since the six outcomes are in
alphabetical order. Of course, alphabetical order in itself has no intrinsic special
meaning; thus a person whose language is, for example, Chinese, would hardly
see anything special in that particular order of symbols. The second event,
with its six identical outcomes, may seem miraculous to a person inclined to see
miracles everywhere and to attach some special significance to coincidences,
many of which happen all the time. The third event seems to be not special but
rather just one of the large number of possible events. However, imagine a
person whose first name is Franklin, middle name
is Delano, and whose last name is (no, not **

**Whatever special significance this or that person may be inclined to
attribute to any of the possible, equally probable events, none of them is
special from the standpoint of probability. This conclusion is equally valid
regardless of the value of the probability, however small it happens to be.**

**In particular, the extremely small probability of the spontaneous
emergence of intelligent life, as calculated (usually not quite correctly) by
the opponents of the hypothesis of life’s spontaneous emergence, by no means
indicates that the spontaneous emergence of life must be ruled out. (There are
many non-probabilistic arguments both in favor of creationism and against it
which we will not discuss in this essay). The spontaneous emergence of life was
an extremely unlikely event, but all other alternatives were extremely unlikely
as well. One out of N possible events did occur, and there is nothing special
in that from the standpoint of probability, even though it may be very special from
your or my personal viewpoint.**

**In this
appendix, I will calculate the probability of more than one player
simultaneously winning the Irish-type lottery.**

**Let N be the number of possible
combinations of numbers to be chosen by players (of which one combination,
chosen randomly, will be the winning set). Let T <= N be the number of tickets sold.**

**Now calculate P(L),
the probability that exactly L players select the winning combination.**

**The number of choices of L
tickets out of T is given by the binomial coefficient:**

**bin(T,L) = T!/(L!(T-L)!)**

**For those L tickets to be
the only winners, they must all select the winning combination.
The probability of that is (1/N)^L. All the
other T-L players must select a non-winning combination, the probability of that
being (1-1/N)^(T-L).**

**We multiply those three
quantities, which yields the formula **

**P(L) = bin(T,L) (1/N)^L (1-1/N)^(T-L).**
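
**The formula is easy to evaluate directly. Here is a minimal sketch, assuming that every ticket is an independent, uniformly random choice among the N combinations; the function name and the small values T=N=100 are purely illustrative.**

```python
from math import comb


def exact_winner_probability(L: int, T: int, N: int) -> float:
    """P(L) = bin(T, L) * (1/N)^L * (1 - 1/N)^(T - L)."""
    return comb(T, L) * (1 / N) ** L * (1 - 1 / N) ** (T - L)


if __name__ == "__main__":
    T = N = 100  # hypothetical small lottery with all combinations' worth of tickets sold
    probs = [exact_winner_probability(L, T, N) for L in range(T + 1)]
    print(probs[:3])   # probabilities of 0, 1 and 2 winners
    print(sum(probs))  # the probabilities over all possible L sum to 1 (up to rounding)
```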

**This formula can be simplified,
preserving a good precision. Since usually N and T are very large and L
is very small, we can use the following approximations: **

**T!/(T-L)! ≈ T^L ; (1-1/N)^(T-L) ≈ exp(-T/N).**

**Now the formula becomes**

**P(L) ≈ (T/N)^L exp(-T/N) / L!**

**This approximate (but quite accurate) formula is the
Poisson distribution with mean T/N. In the case when T=N (i.e. when all
available tickets are sold) we have a simpler
formula: **

**P(L) ≈ exp(-1) / L!.**
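
**How good the approximation is can be checked numerically. The sketch below compares the exact binomial expression with the Poisson formula for a hypothetical lottery with T=N=1,000,000, again assuming independent, uniformly random tickets; the function names are my own.**

```python
from math import comb, exp, factorial


def exact(L: int, T: int, N: int) -> float:
    """Exact binomial probability that exactly L of T tickets hit the winning combination."""
    return comb(T, L) * (1 / N) ** L * (1 - 1 / N) ** (T - L)


def poisson(L: int, T: int, N: int) -> float:
    """Poisson approximation with mean T/N."""
    mean = T / N
    return mean ** L * exp(-mean) / factorial(L)


if __name__ == "__main__":
    T = N = 1_000_000
    for L in range(4):
        print(L, exact(L, T, N), poisson(L, T, N))
```

**For T=N the two columns agree to about six decimal places, and the values for L=0 and L=1 are both close to exp(-1), i.e. about 0.368.**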

**(A complication in practice may
be that when one person buys more than one ticket he/she certainly makes sure
that all the combinations of numbers he/she chooses are different. However, the
approximate formula will still be very accurate unless someone is buying a
large fraction of all tickets, which is unlikely).**

**When T=N, the probability that exactly one player wins in
this type of lottery is therefore not 100% but (setting L=1)
P(1) ≈ 1/e ≈ 0.368, or close to 37%, which is still thousands of times more than the
probability of the same player winning consecutively in more than one
drawing.**

**Mark Perakh's main page: http://members.cox.net/marperak .**
