next up previous contents
Next: Probability and statistics Up: Cryptology Class Notes Previous: Modular Arithmetic

Counting and choosing things

Counting is pretty close to the first piece of math we learn, and yet is very likely to be the hardest math there is. The subject of counting combinations of things is called combinatorics, and has produced many elegant pieces of reasoning.

As a first example, lets count the number of possible bigrams made out of the letters A, B and C. That means we have to choose two letters like BC, AA or AB. There are 3 choices for each of the first and second letters. Now we use the

  Fundamental Principle of Counting: If we make a choice of two things, and there are m possibilities for the first thing and n possibilities for the second thing, then altogether there are $m\times n$ possible choices.

In our example there are $3\times 3=9$ bigrams. The validity of the fundamental principle is verified by writing all the choices out in a table:

  A B C
A AA AB AC
B BA BB BC
C CA CB CC
The first choice is marked in the left column, and the second choice is on the top row. The rest of the table shows the nine possible bigrams.

As another example consider all bigrams where the first thing is a letter (26 possibilities) and the second thing is a decimal digit (0-9). Then there are $26\times 10=260$ possible bigrams, including A4, Q8, etc.

Now we can say that the number of possible bigrams of English letters is 262=676, the number of English trigrams is 263 = 17,576, etc.

It gets more interesting when we put conditions on the combinations being considered. For example, we might ask for combinations that avoid all duplicates. The key to a monoalphabetic cipher is a list of all 26 letters, just not in the same order. An example key is

QAZXSWEDCVFRTGBNHYUJMKIOPL

This means that A is enciphered as Q, B is enciphered as A, C is enciphered as Z, etc. What is the number of possible monoalphabetic keys? We start considering the choices from left to right. There are 26 choices for the cipher equivalent of A. After choosing one, we cannot use that again to encipher any letter other than A. Hence, there will be only 25 remaining choices for the cipher equivalent of B. After choosing that one, there will be 24 choices for the cipher equivalent of C, and so on. Once we say that, the fundamental principle of counting tells us that altogether there are

\begin{displaymath}26\cdot 25 \cdot 24\cdot \dotsb \cdot 3\cdot 2\cdot 1 = 26!\doteq
4\times 10^26
\end{displaymath}

possible monoalphabetic keys.

Many questions involving no duplicates may be handled by factorials. For instance, what is the number of five card hands that may be drawn from a standard 52 card deck? In general, when we pick up our hand, it does not matter what order the cards were dealt in, but for the first example lets suppose it does matter (in Stud Poker, order is important, at least for betting purposes). For the first card dealt, there are 52 choices. After dealing that card, there are only 51 cards remaining in the deck. Thus, there are 51 choices for the second card dealt. Continuing in that way, there are 50, 49 and 48 choices for the third, fourth and fifth cards dealt, respectively. The fundamental principle now says there are

\begin{displaymath}52\cdot 51 \cdot 50 \cdot 49\cdot 48 = 311,875,200
\end{displaymath}

total choices.

Let's do this with a smaller deal and work out what the combinations are. Suppose we choose 2 different letters in order from A, B, C, D and E. Then there should be $5\cdot 4 = 20$ possible choices. This can also be fit into a table:

  A B C D E
A   AB AC AD AE
B BA   BC BD BE
C CA CB   CD CE
D DA DB DC   DE
E EA EB EC ED  
You can now see there are 20 boxes filled in, with the 5 diagonals missing since they correspond to duplicates. That is another way to answer this question. The number of two-letter choices is the total number $5\cdot 5=25$ less the number with duplicates, which is 5.

That is a bit of a hint for Problem 4(b) in Chapter 3. In Problem 4(a), you will have determined the number of passwords of 4 alphanumeric characters. That includes A-Z and 0-9 (forget about upper versus lower case). So all told there are 36 choices for each character. Problem 4(b) asks for the number where we require that we have to have at least one digit and one letter. Think about what the passwords are that are excluded. They would be all four digits or all four letters. Those two possibilities are disjoint, i.e. they cannot happen at the same time, and they are easier to count. Put these ideas together and you will have the answer to 4(b).

Next, let's discuss how to count combinations where we are unconcerned about the order of appearance, as in a five-card hand. Suppose the five cards are A, B, C, D and E, just for the sake of discussion. When they are dealt, the cards could appear in many orders, for instance:

A B C D E
A B C E D
A E C D B
E A D B C
etc.        
Thus, we have to account for the number of ways of reordering five cards. Fortunately, we have already solved that problem! There are 5 possible choices for the first card, then 4 remaining choices for the second, after that 3 remaining choices for the third and so on. Altogether there are $5\cdot 4\cdot 3\cdot 2\cdot 1 = 5! =120$ reorderings of any five-card hand. Therefore, to obtain the total number of hands possible, we divide the total number of hands listed in order by 5! to account for all the different reorderings of the same hand. Thus, the number of five-card hands is

\begin{displaymath}\dfrac{52 \cdot 51\cdot 50 \cdot 49 \cdot 48}
{5\cdot 4\cdot 3 \cdot 2 \cdot 1}
\end{displaymath}

We used this reasoning in counting the number of pairs of distinct letters selected from a text of n letters. There are n choices for the first letter and n-1 choices for the second. Since we were only interested in whether these pairs matched, we did not care about the order in which the letters were selected. Thus, we had to divide by the number of reorderings of two letters, which is only $2=2!=2\cdot
1$. So the number of pairs of letters is

\begin{displaymath}\dfrac{n(n-1)}{2}
\end{displaymath}

which is the formula that came up in our computation of the index of coincidence.

These are examples of a general formula for the number of choices of k different things from a set of n things. The formula is just

\begin{displaymath}\dbinom{n}{k} = \dfrac{n(n-1)(n-2)\dotsb (n-k+1)}{k(k-1)(k-2) \dotsb
2\cdot 1}
\end{displaymath}

This is called the binomial symbol, pronounced ``nchoose k,'' also sometimes written nCk.

The binomial can be written tersely in terms of factorials. We just have to fill out the product in the numerator all the way out to 1, and then cancel the new terms in the denominator. For example,

\begin{displaymath}\dfrac{52\cdot 51\cdot 50\cdot 49\cdot 48}{5\cdot 4\cdot 3\cd...
...(5!)(47\cdot 46\cdot \dotsb 2\cdot 1)}
=\dfrac{52!}{5! \, 47!}
\end{displaymath}

In general,

\begin{displaymath}\dbinom{n}{k} = \dfrac{n!}{k!(n-k)!}
\end{displaymath}

This can be used to give one answer to problem 23, part (c) in Chapter 1. That concerned a ``tableaux'' cipher where the key consists of a selection of say 17 transparent boxes out 48 possible. The number of keys is then

\begin{displaymath}\dbinom{48}{17} = \dfrac{48!}{17! 31!}
\end{displaymath}

You might suspect that computing with factorials is very difficult, but in fact there are many tricks and very good approximations for calculating with these. Binomial formulas are very effective and practical ways of answering problems of this kind.

Finally, there is one small wrinkle in counting code wheels used in various ciphers. When we fill in 26 letters in a circle, there is no way to specify where the ``beginning'' of a circle is. The letter Amust appear somewhere. Thus, we might as well start just after A and make subsequent choices of how to write in the remaining letters. Then, surprisingly, there are only 25! code wheels of this kind, not 26!.

To really convince yourself of this, consider only a three letter alphabet A, B, and C. If we write the letters in order on a wheel as ABC, or BCA or CAB, the resulting wheel is the same! Thus, there are only two such code wheels, with orders ABC and ACB(which is not a rotation of ABC).


next up previous contents
Next: Probability and statistics Up: Cryptology Class Notes Previous: Modular Arithmetic
David J. Wright
2000-09-11