8. Strings
This chapter studies basic combinatorial properties of
strings, sequences of characters or letters
drawn from a fixed alphabet, and introduces algorithms that process strings
ranging from fundamental methods at the heart of
the theory of computation to practical text-processing methods with a host
of important applications.
8.1 String Searching
8.2 Combinatorial properties of bitstrings
8.3 Regular expressions
8.4 Finite-State Automata and the Knuth-Morris-Pratt Algorithm
8.5 Context-Free Grammars
8.6 Tries
8.7 Trie Algorithms
8.8 Combinatorial Properties of Tries
8.9 Larger alphabets
Selected Exercises
8.1
Give two recurrences satisfied by $[z^N]B_P(z)$.8.2
How long a string of random bits should be taken to be 99% sure that there are at least three consecutive 0s?8.3
How long a string of random bits should be taken to be 50% sure that there are at least 32 consecutive 0s?8.6
By considering bitstrings with no runs of two consecutive 0s, evaluate the following sum involving Fibonacci numbers: $\sum_{j\ge0} F_j/2^j$.8.14
Suppose that a monkey types randomly at a 32-key keyboard. What is the expected number of characters typed before the monkey hits upon the phrase THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG?8.15
Suppose that a monkey types randomly at a 32-key keyboard. What is the expected number of characters typed before the monkey hits upon the phrase TO BE OR NOT TO BE?8.22
Suppose that a monkey types randomly at a 2-key keyboard. What is the expected number of bits typed before the monkey hits upon a string of $2k$ alternating 0s and 1s?8.40
Suppose that a monkey types randomly at a 32-key keyboard that has 26 letters A through Z, the symbols +, *, (, and ), a space key, and a period. What is the expected number of characters typed before the monkey hits upon a legal regular expression? Assume that spaces can appear anywhere in a regular expression and that a legal regular expression must be enclosed in parentheses and have exactly one period, at the end.8.42
There are ${8\choose5}=56$ different sets of five three-bit bitstrings. Which trie is associated with the most of these sets? The least?8.44
How many different tries are there with $N$ external nodes?8.45
What proportion of the external nodes are void in a "random" trie (assuming each different trie structure to be equally likely to occur)?8.52
Show that $A_N/N$ is equal to $1/\ln 2$ plus a fluctuating term.8.53
Write a program to compute $A_N$ to within $10^{-9}$ for $N<10^6$ and explore the oscillatory nature of $A_N/N$.8.57
Solve the recurrence for $p_N$ given in the proof of Theorem 8.9, to within the oscillating term. $$p_N= {1\over 2^{N}} \sum_k{N\choose k}p_k\quad \hbox{for}\ N>1\quad \hbox{with}\ p_0=0\ \hbox{and}\ p_1 = 1$$
Review Questions
Q8.1
Rank these patterns by expected wait time in a random bit string (lowest to highest). (a) 00000 (b) 00001 (c) 01000 (d) 01010 (e) 10101