8.   Strings


This chapter studies basic combinatorial properties of strings, sequences of characters or letters drawn from a fixed alphabet, and introduces algorithms that process strings ranging from fundamental methods at the heart of the theory of computation to practical text-processing methods with a host of important applications.


8.1 String Searching


8.2 Combinatorial properties of bitstrings


8.3 Regular expressions


8.4 Finite-State Automata and the Knuth-Morris-Pratt Algorithm


8.5 Context-Free Grammars


8.6 Tries


8.7 Trie Algorithms


8.8 Combinatorial Properties of Tries


8.9 Larger alphabets

Selected Exercises

8.1

Give two recurrences satisfied by $[z^N]B_P(z)$.


8.2

How long a string of random bits should be taken to be 99% sure that there are at least three consecutive 0s?


8.3

How long a string of random bits should be taken to be 50% sure that there are at least 32 consecutive 0s?


8.6

By considering bitstrings with no runs of two consecutive 0s, evaluate the following sum involving Fibonacci numbers: $\sum_{j\ge0} F_j/2^j$.


8.14

Suppose that a monkey types randomly at a 32-key keyboard. What is the expected number of characters typed before the monkey hits upon the phrase THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG?


8.15

Suppose that a monkey types randomly at a 32-key keyboard. What is the expected number of characters typed before the monkey hits upon the phrase TO BE OR NOT TO BE?


8.22

Suppose that a monkey types randomly at a 2-key keyboard. What is the expected number of bits typed before the monkey hits upon a string of $2k$ alternating 0s and 1s?


8.40

Suppose that a monkey types randomly at a 32-key keyboard that has 26 letters A through Z, the symbols +, *, (, and ), a space key, and a period. What is the expected number of characters typed before the monkey hits upon a legal regular expression? Assume that spaces can appear anywhere in a regular expression and that a legal regular expression must be enclosed in parentheses and have exactly one period, at the end.


8.42

There are ${8\choose5}=56$ different sets of five three-bit bitstrings. Which trie is associated with the most of these sets? The least?


8.44

How many different tries are there with $N$ external nodes?


8.45

What proportion of the external nodes are void in a "random" trie (assuming each different trie structure to be equally likely to occur)?


8.52

Show that $A_N/N$ is equal to $1/\ln 2$ plus a fluctuating term.


8.53

Write a program to compute $A_N$ to within $10^{-9}$ for $N<10^6$ and explore the oscillatory nature of $A_N/N$.


8.57

Solve the recurrence for $p_N$ given in the proof of Theorem 8.9, to within the oscillating term. $$p_N= {1\over 2^{N}} \sum_k{N\choose k}p_k\quad \hbox{for}\ N>1\quad \hbox{with}\ p_0=0\ \hbox{and}\ p_1 = 1$$