# 8. Strings

This chapter studies basic combinatorial properties of
*strings*, sequences of characters or letters
drawn from a fixed alphabet, and introduces algorithms that process strings
ranging from fundamental methods at the heart of
the theory of computation to practical text-processing methods with a host
of important applications.

## 8.1 String Searching

## 8.2 Combinatorial properties of bitstrings

## 8.3 Regular expressions

## 8.4 Finite-State Automata and the Knuth-Morris-Pratt Algorithm

## 8.5 Context-Free Grammars

## 8.6 Tries

## 8.7 Trie Algorithms

## 8.8 Combinatorial Properties of Tries

## 8.9 Larger alphabets

#### Selected Exercises

## 8.1

Give two recurrences satisfied by $[z^N]B_P(z)$.## 8.2

How long a string of random bits should be taken to be 99% sure that there are at least three consecutive 0s?## 8.3

How long a string of random bits should be taken to be 50% sure that there are at least 32 consecutive 0s?## 8.6

By considering bitstrings with no runs of two consecutive 0s, evaluate the following sum involving Fibonacci numbers: $\sum_{j\ge0} F_j/2^j$.## 8.14

Suppose that a monkey types randomly at a 32-key keyboard. What is the expected number of characters typed before the monkey hits upon the phrase`THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG`?

## 8.15

Suppose that a monkey types randomly at a 32-key keyboard. What is the expected number of characters typed before the monkey hits upon the phrase`TO BE OR NOT TO BE`?

## 8.22

Suppose that a monkey types randomly at a 2-key keyboard. What is the expected number of bits typed before the monkey hits upon a string of $2k$ alternating 0s and 1s?## 8.40

Suppose that a monkey types randomly at a 32-key keyboard that has 26 letters`A`through

`Z`, the symbols

`+`,

`*`,

`(`, and

`)`, a space key, and a period. What is the expected number of characters typed before the monkey hits upon a legal regular expression? Assume that spaces can appear anywhere in a regular expression and that a legal regular expression must be enclosed in parentheses and have exactly one period, at the end.

## 8.42

There are ${8\choose5}=56$ different sets of five three-bit bitstrings. Which trie is associated with the most of these sets? The least?## 8.44

How many different tries are there with $N$ external nodes?## 8.45

What proportion of the external nodes are void in a "random" trie (assuming each different trie structure to be equally likely to occur)?## 8.52

Show that $A_N/N$ is equal to $1/\ln 2$ plus a fluctuating term.## 8.53

Write a program to compute $A_N$ to within $10^{-9}$ for $N<10^6$ and explore the oscillatory nature of $A_N/N$.## 8.57

Solve the recurrence for $p_N$ given in the proof of Theorem 8.9, to within the oscillating term. $$p_N= {1\over 2^{N}} \sum_k{N\choose k}p_k\quad \hbox{for}\ N>1\quad \hbox{with}\ p_0=0\ \hbox{and}\ p_1 = 1$$

#### Inclass Exam Questions

## Q8.1

Give the OGF for the number of bitstrings of length $n$ not containing 01010.## Q8.2

Rank these patterns by expected wait time in a random bit string (lowest to highest). (*a*) 00000 (

*b*) 00001 (

*c*) 01000 (

*d*) 01010 (

*e*) 10101