# Tighter representation theory lower bounds from stronger assumptions

Let me preface this post by saying that the things proved here are more easily proved by the simplest counting arguments. I think the proofs are still worthwhile, however, as they give indications of how complexity theoretic assumptions can be used to prove purely mathematical statements, and the way in which stronger assumptions allow you to prove stronger statements.

Last time, I talked about how complexity theoretic assumptions give you lower bounds on the dimensions of irreducible representations of the symmetric group. This time, I want to show how certain stronger cryptographic assumptions give better lower bounds on such dimensions.

Namely, we'll assume that one way functions exist. I.e., there is some polytime computable function \(f : 2^k \to 2^m\) such that for any non-uniform PPT algorithm \(A\), we have \[ \mathrm{Pr}_{x \sim U(2^k)} \left[ A(f(x)) \in f^{-1}(f(x)) \right ] < \frac{1}{p(k)} \] for every polynomial \(p\) and all large enough \(k\). What this says is that \(f\) cannot be inverted on a generic input, except with vanishingly small probability.

Now let's prove the following theorem under this assumption:

Theorem: Fix a sequence of finite fields \(F_k\) with \(\log |F_k| = \mathrm{poly}(k)\) and operations computable in \(\mathrm{poly}(k)\) time. Let \(d_k\) be the dimension of the minimal \(F_k\)-representation of \(S(2^k)\). Then for any polynomial \(q(k)\), for all sufficiently large \(k\) we have \(d_k > q(k)\).

*Proof:* Let \(S(2^k)\) be the symmetric group on binary strings of length \(k\), equipped with a generating set \(X_k\) corresponding to a collection of universal gates.

Fix a polynomial \(q(k)\). Suppose to the contrary that for infinitely many \(k\) we have \(d_k \leq q(k)\). Let \(\rho_k\) be the corresponding representation.

Then for such \(k\), we can give a non-uniform polytime algorithm for deciding the word problem in \(S(2^k)\). The advice string is \(\rho_k(x_1), \rho_k(x_2), \dots\) where the \(x_i\) are the generators. There are \(\mathrm{poly}(k)\) many generators and \(d_k\) is bounded by the polynomial \(q\), so the total advice is \(\mathrm{poly}(k)\). Now, we can solve the word problem by taking our word, replacing each generator with the corresponding matrix, multiplying them together, and checking whether the result is the identity matrix. All this can be done in polynomial time.

The word problem for \(S(2^k)\) on our given generating set is NP-complete (this is somewhat imprecise; what I really mean is described in the last post). We note that inverting any one-way function can be done in polytime with an NP oracle. Since we non-uniformly solve NP-hard problems on arbitrarily large input sizes, we can non-uniformly invert one-way functions infinitely often (come to think of it, one has to do a bit more work, but it should work out). This contradicts our cryptographic assumption.

Examining the proof, we can actually get away with a significantly weaker assumption. Namely, that there is some NP-hard problem \(L\) such that for any \(\mathrm{P/poly}\) algorithm \(A\), there is some constant \(N\) such that for all \(k > N\), some \(x\) of length \(k\) has \(A(x) \neq L(x)\).

# A more quantitative assumption

If we make a more quantitative assumption, such as the following non-uniform exponential time hypothesis:

There is a universal constant \(\delta\) such that any circuit deciding 3-SAT on inputs of size \(k\) has size at least \(2^{\delta k}\).

Analyzing the circuit given in the above proof, using the fact that multiplying two \(d_k \times d_k\) matrices can be done in time \(d_k^3 \mathrm{poly}(k)\), we get the inequality \[ d_k^3 \mathrm{poly}(k) \geq 2^{\delta k} \] and thus for any \(\epsilon > 0\), \[ d_k \geq 2^{\frac{1}{3} (\delta - \epsilon) k} \] for large enough \(k\).
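Taking logarithms makes the step from the circuit-size inequality to the dimension bound explicit:

```latex
\begin{align*}
d_k^3 \,\mathrm{poly}(k) &\geq 2^{\delta k} \\
3 \log_2 d_k + O(\log k) &\geq \delta k \\
\log_2 d_k &\geq \tfrac{1}{3}\bigl(\delta k - O(\log k)\bigr)
            \geq \tfrac{1}{3}(\delta - \epsilon) k
\end{align*}
```

where the last step holds for any \(\epsilon > 0\) once \(k\) is large enough that the \(O(\log k)\) term is below \(\epsilon k\).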

Sadly this is still far from the truth, which is that \(d_k = 2^k - 1\). Still, a purely complexity theoretic assumption lets us get pretty close.

# Using complexity theory to prove things in representation theory

The other day, Evan Shapiro and I figured something interesting out about how complexity theoretic assumptions can help to prove asymptotic statements about the representation theory of symmetric groups! At a high level, we proved that \(S_{2^k}\) can't have very small faithful representations over any finite field (given a certain complexity-theoretic assumption). It's a very crude lower bound and the minimal size of a faithful representation of the symmetric group is already known, which makes this proof a bit of a nuclear fly-swatter.

But! I think the proof is interesting because it involves basically nothing about representation theory and just uses a complexity theoretic argument. It's really weird! To me this seems like a fact about geometry of Euclidean space but you can prove it just with complexity theory in a super hands off way. (The argument itself feels a bit like using a cardinality argument to establish that not all reals are computable or something.) Also, the proof generalizes to broader classes of groups to conclude something about their rep theory apparently out of nothing.

Specifically, we proved the following:

Theorem: Fix a sequence of finite fields \(F_k\) whose elements can be represented in \(\mathrm{poly}(k)\) space and whose operations can be done in time \(\mathrm{poly}(k)\). Let \((G_1, S_1), (G_2, S_2), \dots\) be a sequence of groups equipped with generating sets with \(|S_k| = \mathrm{poly}(k)\) such that the language \[L := \left\{ (1^k, w) : k \in \mathbb{N},\ w \text{ a word in the generators } S_k,\, w =_{G_k} 1 \right\}\] is coNP-hard. Suppose further that each \((G_k, S_k)\) includes into \((G_{k+1}, S_{k+1})\). Let \(d_k\) be the dimension of the minimal faithful \(F_k\)-representation of \(G_k\).

Then if \(\mathrm{coNP} \not\subset \mathrm{P/poly}\), \(d_k\) is not bounded by any polynomial in \(k\).

Here's a proof: Suppose \(d_k\) is bounded by some polynomial in \(k\). Then we can give a \(\mathrm{P/poly}\) algorithm for the problem \(L\) as follows: for inputs of length \(n\), our advice string will be the \(d_n\)-dimensional faithful representation of \(G_n\), encoded as a list of \(d_n \times d_n\) matrices, one matrix \(S_i\) for each generator \(s_i\).

Now say our input is \((1^k, w)\). Since the input contains \(1^k\), we have \(k \leq n\), so \((G_k, S_k)\) includes into \((G_n, S_n)\). Thus we may decide whether \(w =_{G_k} 1\) by checking whether \(w =_{G_n} 1\). This we do by applying our representation to \(w\). I.e., \(w = s_{i_1} \dots s_{i_r}\), so we compute the matrix \(W = S_{i_1} \cdots S_{i_r}\), which we can do in time, say, \(d_n^4\) (at most \(r \leq n\) multiplications of \(d_n \times d_n\) matrices). Then, since the representation is faithful, \(W\) is the identity matrix iff \(w = 1\), so we can just check whether \(W = I\), which takes \(d_n^2\) time. In total the runtime of our algorithm is roughly \(d_n^4 + d_n^2\), which is polynomial in \(n\) since by assumption \(d_n\) is.
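The core of the algorithm is easy to sketch in code. Here is a minimal Python version (the name `decide_word` and the toy choice of \(S_3\) with transposition generators are mine, not from the post): the advice is a table of generator matrices, and a word is decided by multiplying them out mod \(p\) and comparing with the identity.

```python
import numpy as np

def decide_word(word, rep, dim, p):
    """Decide whether a word equals 1 using a faithful representation.

    word: list of generator indices i_1, ..., i_r
    rep:  dict mapping each generator index to a dim x dim integer
          matrix over F_p (this table plays the role of the advice string)
    """
    W = np.eye(dim, dtype=np.int64)
    for i in word:
        W = (W @ rep[i]) % p  # one d x d matrix multiplication per letter
    # faithfulness: the word is trivial iff the product is the identity
    return np.array_equal(W, np.eye(dim, dtype=np.int64))

# Toy check: represent S_3 by 3x3 permutation matrices over F_2.
s = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]])  # transposition (1 2)
t = np.array([[1, 0, 0], [0, 0, 1], [0, 1, 0]])  # transposition (2 3)
rep = {0: s, 1: t}
print(decide_word([0, 0], rep, 3, 2))              # s^2 = 1 -> True
print(decide_word([0, 1, 0, 1, 0, 1], rep, 3, 2))  # (st)^3 = 1 -> True
print(decide_word([0, 1], rep, 3, 2))              # st is a 3-cycle -> False
```

The runtime is dominated by the \(r\) matrix multiplications, matching the \(d_n^4\)-ish bound in the proof.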

This contradicts the assumption that \(\mathrm{coNP}\) is not contained in \(\mathrm{P / poly}\), so \(d_k\) is not bounded by any polynomial in \(k\).

You can strengthen this a little bit in a few ways, but I won't talk about that here. To conclude that the dimension of the smallest faithful rep of \(S_{2^k}\) isn't bounded by a polynomial in \(k\), we can notice that reversible circuits are basically words in a particular generating set of \(S_{2^k}\). Namely, the generating set consisting of

- apply CNOT on the first two bits
- apply a Toffoli gate on the first 3 bits
- swap the \(i^{\text{th}}\) bit and the \(j^{\text{th}}\) bit

The corresponding language \(L\) essentially consists of the reversible circuits which compute the identity map. Deciding whether a reversible circuit is the identity is coNP hard, so the word problem \(L\) is too and our theorem applies. The conclusion that \(S_{2^k}\) doesn't always have small representations then follows.
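To make the "reversible circuits are words in \(S_{2^k}\)" identification concrete, here is a small Python sketch (gate encodings and names are my own choices): each generator is a permutation of \(\{0,1\}^k\), encoded as a function on integers \(0, \dots, 2^k - 1\), and checking identity is done by brute force, which of course takes exponential time in \(k\).

```python
from functools import reduce

K = 3  # number of bits; states are the integers 0 .. 2**K - 1

def cnot(x):
    """CNOT on the first two bits: flip bit 1 if bit 0 is set."""
    return x ^ 0b010 if x & 0b001 else x

def toffoli(x):
    """Toffoli on the first three bits: flip bit 2 if bits 0 and 1 are set."""
    return x ^ 0b100 if (x & 0b011) == 0b011 else x

def swap(i, j):
    """Swap the i-th and j-th bits."""
    def g(x):
        if ((x >> i) & 1) != ((x >> j) & 1):
            x ^= (1 << i) | (1 << j)
        return x
    return g

def is_identity(circuit):
    """Check whether a word in the generators computes the identity
    permutation of {0,1}^K (brute force: exponential in K)."""
    return all(reduce(lambda x, g: g(x), circuit, x0) == x0
               for x0 in range(2 ** K))

print(is_identity([cnot, cnot]))              # CNOT is an involution -> True
print(is_identity([swap(0, 2), swap(0, 2)]))  # True
print(is_identity([toffoli]))                 # False
```

The coNP-hardness of \(L\) is exactly the statement that no polynomial-time procedure is expected to replace this brute-force loop.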

I think this result is interesting because it shows how complexity theoretic information can be used as a very general tool for proving things about finitely representable mathematical objects. Certain things have to get very complicated or else they could be used to give efficient algorithms for hard problems! This is also interesting because it says that if we had a more exact complexity theory (which talked about specific polynomials or at least leading terms rather than just P, etc) then we could get even more specific information about the representation theory of such groups.

I think it gives an indication as to why separations in complexity theory are so hard to prove: you need to rule out all algorithms, including ones based on properties of arbitrary mathematical objects, which could plausibly be involved in algorithms given the ubiquity of coding problems in seemingly unrelated objects. So, separating classes ends up proving properties of lots of different objects.

A big thanks to Mario Sanchez, German Stefanich, and especially Nic Brody and Patrick Lutz for thinking about related stuff with me. Thanks also to Brandon Rayhaun for pointing out this is a total nuclear flyswatter.

# Existence of global sections is NP-complete

\(\newcommand{\Set}[1]{\left\{#1\right\}}\) This may be well known, but I found the following pretty interesting.

If \(\pi : E \to B\) is a map (in any category, here I'm thinking about topological spaces though), a section of \(\pi\) is a map \(s : B \to E\) such that \[ \array{ B & \stackrel{s}{\to} & E & \stackrel{\pi}{\to} & B &&=&& B & \stackrel{\mathrm{id}_B}{\to} & B } \] I.e., \(\pi \circ s = \mathrm{id}_B\). Given \(\pi\), we can ask if it has any sections. If \(E\) and \(B\) are sets, this amounts to asking whether the map \(\pi\) is surjective. Checking whether a map of finite sets is surjective is easy. Make an array of bits with \(|B|\) elements, loop over \(e \in E\) and set the \(\pi(e)\) bit to \(1\). If the array is all \(1\)s at the end, then \(\pi\) is surjective.
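The bit-array surjectivity check described above is a one-liner to implement; here is a sketch in Python (the function name is mine), using a dictionary of flags in place of the bit array:

```python
def has_section_of_sets(pi, E, B):
    """A map of finite sets has a section iff it is surjective:
    mark each element of B hit by pi, then check all were hit."""
    hit = {b: False for b in B}
    for e in E:
        hit[pi(e)] = True
    return all(hit.values())

E = ["a", "b", "c"]
B = [0, 1]
print(has_section_of_sets(lambda e: {"a": 0, "b": 1, "c": 1}[e], E, B))  # True
print(has_section_of_sets(lambda e: 0, E, B))  # 1 is never hit -> False
```

This runs in linear time, which is the point of the contrast with the topological case below.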

If \(E\) and \(B\) are topological spaces however and \(\pi\) is a continuous map, the problem becomes NP complete. Before discussing the NP-completeness of this problem, let me say a bit about the way people think about sections.

Here is how people often picture a surjective function \(\pi : E \to B\): one draws the preimage of each point as sitting visually above that point, with \(\pi\) understood to be "projecting" the points straight down. A section assigns to each point of \(B\) a point in its preimage, so it can be drawn either as a map going back up, or as a copy of \(B\) sitting inside \(E\). The same pictures apply to topological spaces, like a cube \(E\) projecting onto a square \(B\). But checking whether a map of topological spaces has a section is trickier even if the map is surjective: the topology of \(B\) creates global constraints which may make it impossible to find a copy of it sitting inside \(E\). For example, consider the covering map \(p\) from the real line to the circle: a section would have to wrap around the circle and couldn't connect back up with itself.

# The problem

This part is a bit less accessible, I apologize. But let's be precise about the decision problem, let's call it SEC, that I have in mind:

SEC: Given simplicial complexes \(E\) and \(B\) and a simplicial map \(\pi : E \to B\), does there exist a simplicial map \(s : B \to E\) such that \(\pi \circ s = \mathrm{id}_B\)?

It is not hard to see that this is in NP. Given a map \(s\) from the vertex set of \(B\) to the vertex set of \(E\), checking that it is simplicial amounts to looping over the simplices \(\Set{b_1, \dots, b_k}\) of \(B\) and checking that \(\Set{s(b_1), \dots, s(b_k)}\) is a simplex of \(E\). Checking that \(\pi \circ s = \mathrm{id}_B\) just amounts to looping over the vertices \(b \in B\) and checking that \(\pi(s(b)) = b\).
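The NP verifier just described can be written out directly; a minimal Python sketch (names and the tiny single-edge example are my own):

```python
def is_valid_section(s, pi, simplices_B, simplices_E, vertices_B):
    """NP verifier for SEC: check that a candidate s (a dict from
    vertices of B to vertices of E) is simplicial and pi o s = id_B."""
    # pi o s = id on vertices
    if any(pi[s[b]] != b for b in vertices_B):
        return False
    # s is simplicial: the image of every simplex of B is a simplex of E
    E_set = {frozenset(sig) for sig in simplices_E}
    return all(frozenset(s[b] for b in sig) in E_set for sig in simplices_B)

# Tiny example: B is a single edge, E is an edge sitting over it.
vertices_B = [0, 1]
simplices_B = [{0}, {1}, {0, 1}]
simplices_E = [{"x0"}, {"x1"}, {"x0", "x1"}]
pi = {"x0": 0, "x1": 1}
s = {0: "x0", 1: "x1"}
print(is_valid_section(s, pi, simplices_B, simplices_E, vertices_B))  # True
```

Both loops are polynomial in the size of the input complexes, which is all that membership in NP requires.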

Let's show that it is NP hard by reducing graph \(3\)-coloring to this problem. Let's say we are given a graph \(G\). \(G\) can be considered as a simplicial complex. We will construct a space \(E\) and a simplicial map \(\pi : E \to G\) such that \(\pi\) admits a global section iff \(G\) has a \(3\)-coloring. \(E\) will in fact just be a graph. The vertex set of \(E\) is \(G \times \Set{0, 1, 2}\) (where \(\Set{0, 1, 2}\) is our set of "colors"). There is an edge between \((u, i)\) and \((v, j)\) iff there is an edge between \(u\) and \(v\) in \(G\) and \(i \neq j\). This graph is clearly computable in polytime. \(\pi\) is just given by \(\pi(v, i) = v\).

Let's prove that there is a correspondence between \(3\)-colorings of \(G\) and sections of \(E\). Suppose \(s : G \to E\) is a section. Define \(c : G \to \Set{0, 1, 2}\) by

```
c(v) = let (v, i) = s(v) in i
```

I claim that \(c\) is a \(3\)-coloring of \(G\). If \((u, v)\) is an edge in \(G\), then \(s\) lifts it to an edge between \((u, c(u))\) and \((v, c(v))\) in \(E\). By the definition of the edge set of \(E\), this means \(c(u) \neq c(v)\), so \(c\) is a \(3\)-coloring.

Conversely, suppose \(G\) has a \(3\)-coloring \(c\). Then \(s(v) = (v, c(v))\) will be a section of \(E\) since for any edge \((u, v)\) in \(G\), there will be an edge between \((u, c(u))\) and \((v, c(v))\) in \(E\). Thus SEC is NP-complete.
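The whole reduction is short enough to run. Here is a Python sketch (function names are mine) that builds \(E\) and \(\pi\) from \(G\), and then brute-forces both sides of the correspondence on tiny graphs to sanity-check it:

```python
from itertools import product

def reduction(G_vertices, G_edges):
    """Build the graph E and projection pi from G, so that sections of
    pi : E -> G correspond to 3-colorings of G."""
    E_vertices = [(v, i) for v in G_vertices for i in range(3)]
    E_edges = {frozenset([(u, i), (v, j)])
               for (u, v) in G_edges
               for i in range(3) for j in range(3) if i != j}
    pi = {(v, i): v for (v, i) in E_vertices}
    return E_vertices, E_edges, pi

def colorings(G_vertices):
    """All maps G_vertices -> {0, 1, 2}."""
    for cs in product(range(3), repeat=len(G_vertices)):
        yield dict(zip(G_vertices, cs))

def has_3_coloring(G_vertices, G_edges):
    return any(all(c[u] != c[v] for (u, v) in G_edges)
               for c in colorings(G_vertices))

def has_section(G_vertices, G_edges):
    _, E_edges, _ = reduction(G_vertices, G_edges)
    # every vertex-wise section has the form v |-> (v, c(v)); it is a
    # graph map iff it sends each edge of G to an edge of E
    return any(all(frozenset([(u, c[u]), (v, c[v])]) in E_edges
                   for (u, v) in G_edges)
               for c in colorings(G_vertices))

triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
K4 = ([0, 1, 2, 3], [(a, b) for a in range(4) for b in range(a + 1, 4)])
print(has_3_coloring(*triangle), has_section(*triangle))  # True True
print(has_3_coloring(*K4), has_section(*K4))              # False False
```

The triangle is \(3\)-colorable while \(K_4\) is not, and the section-existence answers agree, as the proof predicts.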

# Conclusions

This is particularly interesting since in addition to the existence of sections having witnesses (namely the sections themselves), the *non*-existence of sections of certain maps \(\pi\) also has witnesses, namely characteristic classes. I came up with a discrete version of the problem of checking whether a vector bundle is trivial, which was easy to see to be in \(\text{NP} \cap \text{coNP}\). After thinking about it for longer, I realized that it was in fact in \(\mathrm{P}\).

# A "type theoretic" principle for reasoning about smooth bundles over manifolds

\[ \newcommand{\br}{\mathbb{R}} \newcommand{\angled}[1]{\langle{} {#1} \rangle{}} \newcommand{\mrm}{\mathrm} \] I'm now taking a Riemannian geometry course, and one of the problems on the first problem set was the following:

Let \(p\) be any point in a Riemannian \(n\)-manifold \((M,g)\). Show that there is a local orthonormal frame near \(p\), that is, a local frame \(E_1,\dots,E_n\) defined in a neighborhood of \(p\) that forms an orthonormal basis for the tangent space at each point.

The way one does this problem is to pick a chart \((U, \varphi : U \to \br^n)\) with \(p \in U\), use that to pull back the standard orthonormal frame on \(\br^n\) to obtain a frame \(E_1, \dots, E_n\) on \(U\), and then essentially do Gram Schmidt pointwise with the \(E_i\) by defining \begin{align*} \mrm{normed}(X)(x) &:= \frac{X(x)}{\sqrt{\angled{X(x),X(x)}}} \\ G_1 &:= E_1 \\ F_1 &:= \mrm{normed}(G_1) \\ G_{k+1}(x) &:= E_{k+1}(x) - \sum_{i=1}^k \angled{E_{k+1}(x), F_i(x)} F_i(x) \\ F_{k+1} &:= \mrm{normed}(G_{k+1}) \end{align*}

and we could go and verify that this does indeed produce an orthonormal basis at each point, just as Gram Schmidt does.
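This pointwise construction is easy to simulate numerically. Below is a sketch (all names are mine, and the "manifold" is just a finite sample of points with a metric matrix at each one) that runs Gram Schmidt at every point with respect to the metric there:

```python
import numpy as np

def pointwise_gram_schmidt(E, g):
    """Gram Schmidt at each point of a sampled 'manifold'.

    E: array of shape (num_points, n, n); E[x, i] is the i-th frame
       vector at the point x.
    g: array of shape (num_points, n, n); g[x] is the (symmetric,
       positive definite) metric at the point x.
    Returns F with <F[x,i], F[x,j]>_{g[x]} = delta_{ij} at each x.
    """
    num_points, n, _ = E.shape
    F = np.zeros_like(E, dtype=float)
    inner = lambda x, u, v: u @ g[x] @ v  # the inner product at x
    for x in range(num_points):
        for k in range(n):
            # subtract projections onto the already-built F_i, then norm
            G = E[x, k] - sum(inner(x, E[x, k], F[x, i]) * F[x, i]
                              for i in range(k))
            F[x, k] = G / np.sqrt(inner(x, G, G))
    return F

# toy example: 5 points, with a metric that varies from point to point
E = np.stack([np.eye(2)] * 5)
g = np.stack([np.array([[1.0 + t, 0.0], [0.0, 2.0]]) for t in range(5)])
F = pointwise_gram_schmidt(E, g)
# at each point, F is orthonormal with respect to the metric there:
print(np.allclose(np.einsum('xia,xab,xjb->xij', F, g, F),
                  np.stack([np.eye(2)] * 5)))  # True
```

Smoothness of the output frame in \(x\) comes for free from the formulas, which is exactly the point of the reasoning principle below.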

This suggests a more general reasoning principle:

Principle for inner product space bundles: If \(M\) is a manifold with a bundle of inner product spaces (i.e., a vector bundle \(E\) on \(M\) with a smoothly varying inner product) then any construction or theorem in the language of inner product spaces can be carried out in \(E\), interpreting vectors as global sections of \(E\) and the inner product of vectors as a pointwise inner product of sections.

The point is that any construction carried out generically for an inner product space can be interpreted as talking about a bundle of inner product spaces instead.

For example, the Gram Schmidt construction is

Gram Schmidt: For any basis \(e_1, \dots, e_n\) of \(V\), there exists a basis \(f_1, \dots, f_n\) of \(V\) such that \(\angled{f_i, f_j} = \delta_{i,j}\).

Interpreting vectors as vector fields and the inner product as the pointwise inner product, this gives us the construction:

Gram Schmidt for Riemannian manifolds: For any frame \(E_1, \dots, E_n\) of the bundle, there exists a frame \(F_1, \dots, F_n\) such that \((x \mapsto \angled{F_i(x), F_j(x)}) = (x \mapsto \delta_{i,j})\).

Note that this does not say that any vector bundle has an orthonormal frame! That would be false because we cannot in general get global frames. It merely says that if we *already have* a global frame, we can get an orthonormal global frame. Moreover, it is not possible to obtain a basis in the language of vector spaces (although to be clear, this should be some kind of language of vector spaces as a type theory rather than in FOL, so that constructions may be defined and given semantics).

I think there should be a somewhat more general reasoning principle. Let \(\mathcal{C}\) be a category. If, waving hands vigorously, one defines a \(\mathcal{C}\)-bundle on \(M\) to be a "smoothly varying" family of objects of \(\mathcal{C}\) over \(M\), then any construction or theorem in some kind of internal language for \(\mathcal{C}\) will give a corresponding construction for \(\mathcal{C}\)-bundles, in which points of objects correspond to global sections of bundles over \(M\). (Perhaps we should think of a global section of a bundle \(E\) as a bundle map from the "point bundle" over \(M\) (isomorphic to \(M\)) to \(E\).)

**Edit:** Andrej Bauer points out on Reddit that a lot of this is explained in "Sheaves in Geometry and Logic" by Mac Lane and Moerdijk.

# Parallel transport and version control

One of the primary problems of version control is that two people can edit the same version of the same file, and the two sets of changes they made have to be reconciled somehow.

One of the ways of solving this in many version control systems is the notion of "rebase". Say \(v_0\) is a version of a repository and \(u\) is a child version. That is, there is some patch \(c\) taking \(v_0\) to \(u\), which we can depict diagrammatically as \[
\array{
v_0 & \stackrel{c}{\to} & u
}
\] Now say while we were writing version \(u\), the project we're working on progressed upstream from version \(v_0\) to version \(v_1\) with a patch \(p\). \[
\array{
v_0 & \stackrel{p}{\to} & v_1
}
\] The *rebase* of \(c\) along \(p\) is a patch \(R_p(c)\) going from \(v_1 \to u'\) which is somehow obtained from \(c\) in the most generic possible way. And moreover, if the main version progresses from \(v_0\) to \(v_1\) to \(v_2\) as follows \[
\array{
v_0 & \stackrel{p_1}{\to} & v_1 & \stackrel{p_2}{\to} & v_2
}
then we should have \[\begin{align}R_{p_2}(R_{p_1}(c)) = R_{p_2 \circ p_1}(c)\end{align}\] where \(p_2 \circ p_1\) denotes the patch obtained by first applying the patch \(p_1\) and then applying the patch \(p_2\). In other words, I should be able to rebase incrementally, or all in one step, and it shouldn't make a difference. We also demand that rebasing against the empty patch \(\mathrm{id}_{v_0}\) does nothing. Or in other words, \[\begin{align}R_{\mathrm{id}_{v_0}}(c) = c\end{align}\].

One way of summarizing this is that rebasing should be functorial. To me, this looks a lot like parallel transport of (tangent) vectors along paths in manifolds (which is also functorial in the same way). This analogy is even more plausible since we can think of a changeset \(c : v_0 \to u\) as being like a tangent vector at \(v_0\) (if the changeset \(c\) is small enough, and maybe think of it like a path if it's bigger). Making this analogy rigorous and potentially useful for designing a version control system could perhaps involve a combinatorial description of a flat connection for some discrete group (or possibly even monoid) \(G\) which connects the versions of a file.

There have been several formalisms trying to talk about version control. I think it is well known for example that merging should be something like a pushout in an appropriate category, and of course there was Homotopical Patch Theory, which formalized a version control system in Homotopy Type Theory. This is similar to the idea I've mentioned here in that patches are interpreted as a kind of path.