Next: Basic rules of probability: Up: Cryptology Class Notes Previous: Counting and choosing things

Probability and statistics

Randomness is an essential consideration in cryptology. The goal in designing a cryptosystem is to make the ciphertext look as much as possible like a random jumble of letters, which cannot be deciphered without knowledge of the secret order that lies within. On the other hand, a common method of cryptanalysis is to collect statistics about the ciphertext in order to detect the underlying pattern.

Probability is based upon the notion of an upcoming ``experiment'' which may have any number of possible outcomes. For instance, the experiment of flipping a coin may have two plausible outcomes, landing with its ``tails'' side up, or with its ``heads'' side up. (Nitpickers may argue in favor of an ``edge'' outcome; but as it is our job to model the experiment, it is also our prerogative to decide what the possible outcomes are.)

We collect all the outcomes in a set called the space of outcomes (or sample space). Each outcome is assigned a ``probability'', which is abstractly just a number between 0 and 1, sometimes expressed as a percentage. For example, a ``fair coin'' is said to have probability 1/2, or $50\%$, of landing with its heads side showing.

One very important condition is that the sum of the probabilities over all possible outcomes must turn out to be 1. That is, some outcome must certainly happen.
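
As a small illustration of this condition, the sketch below models a sample space as a Python dictionary mapping each outcome to its probability and checks the two requirements stated above: each probability lies between 0 and 1, and the probabilities sum to 1. The names `fair_coin` and `is_valid_distribution` are hypothetical, chosen only for this example, and exact fractions are used so that the sum can be compared to 1 without rounding error.

```python
from fractions import Fraction

# Hypothetical model of a fair coin: each outcome mapped to its probability.
fair_coin = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}


def is_valid_distribution(space):
    """Return True if every probability is in [0, 1] and they sum to exactly 1."""
    in_range = all(0 <= p <= 1 for p in space.values())
    return in_range and sum(space.values()) == 1


print(is_valid_distribution(fair_coin))
```

A model that included an ``edge'' outcome would simply add a third entry, with the other two probabilities reduced so that the total remains 1.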

Probabilities are estimated or modelled by collecting statistics from large amounts of data. For instance, suppose we count all the letters in 100,000 pages of English text and find that $12.702\%$ of them are E's. We would then find it very plausible to say that the probability of any given letter in English writing being an E is approximately 0.12702. The larger the quantity of data on which this conclusion is based, the more plausible we find it. This is the intuitive meaning of what is called the Law of Large Numbers.
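
The effect of sample size can be seen in a short simulation. The sketch below assumes, purely for illustration, that each letter is an E independently with probability exactly 0.12702 (real English text is not a sequence of independent letters), and then estimates that probability from samples of increasing size. The function name `estimate_e_probability` and the seed value are hypothetical choices.

```python
import random

P_E = 0.12702  # assumed "true" probability of the letter E, taken from the text


def estimate_e_probability(num_letters, rng):
    """Draw num_letters simulated letters; return the observed fraction of E's."""
    e_count = sum(1 for _ in range(num_letters) if rng.random() < P_E)
    return e_count / num_letters


rng = random.Random(2000)  # fixed seed so that repeated runs give the same numbers
for n in (100, 10_000, 1_000_000):
    estimate = estimate_e_probability(n, rng)
    print(n, estimate, abs(estimate - P_E))
```

The printed error typically shrinks as the sample grows, although any single small sample may happen to land close to the assumed value by chance.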

When we determine the probabilities for all 26 letters, and let $p_i$ be the probability of letter number $i$ occurring, our basic condition is that

\begin{displaymath}p_1+p_2+\dotsb +p_{26} = \sum_{i=1}^{26} p_i = 1
\end{displaymath}
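
As a rough check of this condition on actual data, the sketch below counts the letters A through Z in a piece of text, converts the counts to relative frequencies, and sums them. The helper `letter_frequencies` and the sample string are hypothetical; a text this short gives poor estimates of the true letter probabilities, but the frequencies still sum to 1 up to floating-point rounding.

```python
from collections import Counter
import string


def letter_frequencies(text):
    """Return the relative frequency of each letter A-Z in text.

    Case is ignored, and characters other than A-Z are skipped.
    """
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    if not letters:
        raise ValueError("text contains no letters")
    counts = Counter(letters)
    total = len(letters)
    # Letters that never occur get frequency 0 (Counter returns 0 for missing keys).
    return {letter: counts[letter] / total for letter in string.ascii_uppercase}


sample = "Randomness is an essential consideration in cryptology."
freqs = letter_frequencies(sample)
print(freqs["E"])
print(abs(sum(freqs.values()) - 1.0) < 1e-9)
```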



 
David J. Wright
2000-09-11