When non-uniformity doesn't help

Posted on February 19, 2017

Today I want to write about a simple proof that certain problems that can be computed by polynomial-sized non-uniform read-once branching programs (also called non-uniform automata) can actually be computed uniformly just as efficiently. That is, in some cases, a degree of non-uniformity doesn't actually help solve the problem any more efficiently.

What is a non-uniform automaton? The basic idea is that of a deterministic finite automaton (i.e., a DFA, a.k.a. a finite state machine), but which is allowed to vary based on the input length. That is, a non-uniform automaton for a language $L$ is a sequence of DFAs $M_0, M_1, \dots$ such that for $x$ of length $n$, $x \in L$ iff $M_n$ accepts.

The fact is the following:

Fact: If $(G, A)$ is a finitely presented group and the word problem for $G$ can be computed by non-uniform automata with polynomially many states, then the word problem for $G$ has a linear time (multi-tape Turing machine) algorithm.

Proof. Suppose the word problem for $(G, A)$ has a non-uniform automaton $M_0, M_1, \dots$ with polynomially many states. Say $S(n)$ is a polynomial upper bound for the number of states of $M_n$ (assume it is monotonically increasing WLOG). Let $R(\ell)$ denote the set of elements of $G$ with a word of length exactly $\ell$.

I claim that $|R(n / 2)| \leq S(n)$ I.e., there are at least as many states in $M_n$ as there are group elements with words of length exactly $n/2$. Why? The map from $R(n/2)$ to the states of $M_n$ given by running $M_n$ will be an injection: Take $x, y \in R(n/2)$ and take words $w_x, w_y$ of length $n/2$ representing them. Suppose $M_n$ is in the same state $s$ after running on these words. $M_n$ takes $n/2$ more input characters, so from this state, we can feed it the word $w_x^{-1}$, which has length $n / 2$. Since it should accept $w_x w_x^{-1}$ (since that's a trivial word), this will put it in an accepting state. But that means $M_n$ will accept on $w_y w_x^{-1}$ as well since it will first run on $w_y$ and enter state $s$, and then proceed as with the word $w_x w_x^{-1}$ (since $w_x$ also puts $M_n$ in state $s$). Thus the word $w_y w_x^{-1}$ is trivial, which means $x = y$ in $G$.

Now let $B(e, r)$ denote the ball of radius $r$ in the Cayley graph of $(G, A)$. We have \begin{align*} |B(e, r)| &= \sum_{\ell \leq r} |R(\ell)| \\ &\leq \sum_{\ell \leq r} S(2 \ell) \\ &\leq r S(2 r) \end{align*} $r S(2 r)$ is a polynomial in $r$, and thus $G$ has polynomial growth. Thus, by Gromov's theorem, $G$ is virtually nilpotent. By the work in this paper of Goodman and Shapiro, $G$ then has a linear time algorithm.