Izaak Meckler

Using complexity theory to prove things in representation theory

Posted on December 11, 2016

The other day, Evan Shapiro and I figured something interesting out about how complexity theoretic assumptions can help to prove asymptotic statements about the representation theory of symmetric groups! At a high level, we proved that \(S_{2^k}\) can't have very small faithful representations over any finite field (given a certain complexity-theoretic assumption). It's a very crude lower bound and the minimal size of a faithful representation of the symmetric group is already known, which makes this proof a bit of a nuclear fly-swatter.

But! I think the proof is interesting because it involves basically nothing about representation theory and just uses a complexity theoretic argument. It's really weird! To me this seems like a fact about geometry of Euclidean space but you can prove it just with complexity theory in a super hands off way. (The argument itself feels a bit like using a cardinality argument to establish that not all reals are computable or something.) Also, the proof generalizes to broader classes of groups to conclude something about their rep theory apparently out of nothing.

Specifically, we proved the following:

Theorem: Fix a sequence of finite fields \(F_k\) whose elements can be represented in \(\mathrm{poly}(k)\) space and where operations can be done in time \(\mathrm{poly}(k)\).

Let \((G_1, S_1), (G_2, S_2), \dots\) be a sequence of groups equipped with generating sets with \(|S_k| = \mathrm{poly}(k)\) such that the language \[L := \left\{ (1^k, w) : k \in \mathbb{N}, w \text{ a word in the } S_k,\, w =_{G_k} 1 \right\}\] is coNP-hard. Suppose further that each \((G_k, S_k)\) includes into \((G_{k+1}, S_{k+1})\). Let \(d_k\) be the dimension of the minimal faithful \(F_k\)-representation of \(G_k\).

Then if \(\mathrm{coNP} \not\subset \mathrm{P/poly}\), \(d_k\) is not bounded by any polynomial in \(k\).

Here's a proof: Suppose \(d_k\) is bounded by some polynomial in \(k\). Then we can give a \(\mathrm{P/poly}\) algorithm for the problem \(L\) as follows: Specifically, for the inputs of length \(n\), our advice string will be the \(d_n\)-dimensional faithful representation of \(G_n\), encoded as a list of \(d_n\)-dimensional matrices, one matrix \(S_i\) for each generator \(s_i\).

Now say our input is \((1^k, w)\). We have \(k \leq n\), so \((G_k, S_k)\) includes into \((G_n, S_n)\). Thus we may decide if \(w =_{G_k} 1\) by checking if \(w =_{G_n} 1\). This we do by applying our representation to \(w\). I.e., \(w = s_{i_1} \dots s_{i_r}\), so we compute the matrix \(W = S_{i_1} \dots S_{i_r}\) which we can do in time, say \(d_n^4\). Then, since the representation is faithful, \(W\) is the identity matrix iff \(w = 1\), so we can just check if \(W = I\), which is done in \(d_n^2\) time. In total the runtime of our algorithm is roughly \(d_n^4 + d_n^2\), which is polynomial in \(n\) since by assumption \(d_n\) is.

This contradicts the assumption that \(\mathrm{NP}\) is not contained in \(\mathrm{P / poly}\), so \(d_k\) is not bounded by any polynomial in \(k\).

You can strengthen this a little bit in a few ways, but I won't talk about that here. To conclude that the dimension of the smallest faithful rep of \(S_{2^k}\) isn't bounded by a polynomial in \(k\), we can notice that reversible circuits are basically words in a particular generating set of \(S_{2^k}\). Namely, the generating set consisting of

  • apply CNOT on the first two bits
  • apply a Toffoli gate on the first 3 bits
  • swap the \(i^{\text{th}}\) bit and the \(j^{\text{th}}\) bit

The corresponding language \(L\) essentially consists of the reversible circuits which compute the identity map. Deciding whether a reversible circuit is the identity is coNP hard, so the word problem \(L\) is too and our theorem applies. The conclusion that \(S_{2^k}\) doesn't always have small representations then follows.

I think this result is interesting because it shows how complexity theoretic information can be used as a very general tool for proving things about finitely representable mathematical objects. Certain things have to get very complicated or else they could be used to give efficient algorithms for hard problems! This is also interesting because it says that if we had a more exact complexity theory (which talked about specific polynomials or at least leading terms rather than just P, etc) then we could get even more specific information about the representation theory of such groups.

I think it gives an indication as to why separations in complexity theory are so hard to prove: You need to rule out all algorithms, ones that may based on properties of arbitrary mathematical objects, which could plausibly involved in algorithms given the ubiquity of coding problems in seemingly unrelated objects. So, separating classes ends up proving properties of lots of different objects.

A big thanks to Mario Sanchez, German Stefanich, and especially Nic Brody and Patrick Lutz for thinking about related stuff with me. Thanks also to Brandon Rayhaun for pointing out this is a total nuclear flyswatter.

Existence of global sections is NP-complete

Posted on October 31, 2016

\(\newcommand{\Set}[1]{\left\{#1\right\}}\) This may be well known, but I found the following pretty interesting.

If \(\pi : E \to B\) is a map (in any category, here I'm thinking about topological spaces though), a section of \(\pi\) is a map \(s : B \to E\) such that \[ \array{ B & \stackrel{s}{\to} & E & \stackrel{\pi}{\to} & B &&=&& B & \stackrel{\mathrm{id}_B}{\to} & B } \] I.e., \(\pi \circ s = \mathrm{id}_B\). Given \(\pi\), we can ask if it has any sections. If \(E\) and \(B\) are sets, this amounts to asking whether the map \(\pi\) is surjective. Checking whether a map of finite sets is surjective is easy. Make an array of bits with \(|B|\) elements, loop over \(e \in E\) and set the \(\pi(e)\) bit to \(1\). If the array is all \(1\)s at the end, then \(\pi\) is surjective.

If \(E\) and \(B\) are topological spaces however and \(\pi\) is a continuous map, the problem becomes NP complete. Before discussing the NP-completeness of this problem, let me say a bit about the way people think about sections.

Here is a diagram of a function \(\pi : E \to B\).
\(\pi\) is surjective. We can witness the surjectivity with a section, illustrated below as the dotted blue line.
A section takes every point to a point in its preimage. Often people draw these diagrams so that the preimage of each point is sitting visually above that point, and the map \(\pi\) is understood to be "projecting" the points straight down, as below.
Here it is more clear that a section assigns each point to a point in its preimage.
Another way of drawing a section is by drawing the space \(X\) as sitting inside of \(E\), like so:
Now for topological spaces, like a cube \(E\) projecting onto a square \(B\), this looks like the following.
Checking whether maps of topological spaces have sections is trickier even if the map is surjective, the topology of \(B\) creates global constraints which may make it impossible to find a copy of it sitting inside \(E\). For example, consider the following map \(p\) from the real line to the circle:
Image credit: Hatcher's Algebraic Topology
A section would have to wrap around and couldn't connect back up with itself:

The problem

This part is a bit less accessible, I apologize. But let's be precise about the decision problem, let's call it SEC, that I have in mind:

SEC: Given simplicial complexes \(E\) and \(B\) and a simplicial map \(\pi : E \to B\), does there exist a simplicial map \(s : E \to B\) such that \(\pi \circ s = \mathrm{id}_B\)?

It is not hard to see that this is NP. Given a map \(s\) from the vertex set of \(B\) to the vertex set of \(E\), checking that it is simplicial amounts to looping over the simplices \(\Set{b_1, \dots, b_k}\) in \(B\) and checking that \(\Set{s(b_1), \dots, s(b_k)}\) is a simplex in \(E\). Checking that \(\pi \circ s = \mathrm{id}_B\) just amounts to looping over the vertices \(b \in B\) and checking that \(\pi(s(b)) = b\).

Let's show that it is NP hard by reducing graph \(3\)-coloring to this problem. Let's say we are given a graph \(G\). \(G\) can be considered as a simplicial complex. We will construct a space \(E\) and a simplicial map \(\pi : E \to G\) such that \(\pi\) admits a global section iff \(G\) has a \(3\)-coloring. \(E\) will in fact just be a graph. The vertex set of \(E\) is \(G \times \Set{0, 1, 2}\) (where \(\Set{0, 1, 2}\) is our set of "colors"). There is an edge between \((u, i)\) and \((v, j)\) iff there is an edge betwen \(u\) and \(v\) in \(G\) and \(i \neq j\). This graph is clearly computable in polytime. \(\pi\) is just given by \(\pi(v, i) = v\).

Let's prove that there is a correspondence between \(3\)-colorings of \(G\) and sections of \(E\). Suppose \(s : G \to E\) is a section. Define \(c : G \to \Set{0, 1, 2}\) by

$c(v)$ = let (v, i) = $s(v)$ in i

I claim that \(c\) is a \(3\)-coloring of \(G\). If \((u, v)\) is an edge in \(G\), then \(s\) lifts it to an edge between \((u, c(u))\) and \((v, c(v))\) in \(E\). By the definition of the edge set of \(E\), this means \(c(u) \neq c(v)\), so \(c\) is a \(3\)-coloring.

Conversely, suppose \(G\) has a \(3\)-coloring \(c\). Then \(s(v) = (v, c(v))\) will be a section of \(E\) since for any edge \((u, v)\) in \(G\), there will be an edge between \((u, c(u))\) and \((v, c(v))\) in \(E\). Thus SEC is NP-complete.

Conclusions

This is particularly interesting since in addition to the existence of sections having witnesses (namely the sections themselves) the non-existence of sections of certain maps \(\pi\) also has witnesses, namely characteristic classes. I came up with a discrete version of the problem of checking whether a vector bundle is trivial, which it was easy to see was in \(\text{NP} \cap \text{coNP}\). After thinking about it for longer, I realized that it was in fact in \(P\).

A "type theoretic" principle for reasoning about smooth bundles over manifolds

Posted on September 20, 2016

\[ \newcommand{\br}{\mathbb{R}} \newcommand{\angled}[1]{\langle{} {#1} \rangle{}} \newcommand{\mrm}{\mathrm} \] I'm now taking a Riemannian geometry course, and one of the problems on the first problem set was the following:

Let \(p\) be any point in a Riemannian n-manifold \((M,g)\). Show that there is a local orthonormal frame near \(p\), that is, a local frame \(E_1,\dots,E_n\) defined in a neighborhood of \(p\) that forms an orthonormal basis for the tangent space at each point.

The way one does this problem is to pick a chart \((U, \varphi : U \to \br^n)\) with \(p \in U\), use that to pull back the standard orthonormal frame on \(\br^n\) to obtain a frame \(E_1, \dots, E_n\) on \(U\), and then essentially do Gram Schmidt pointwise with the \(E_i\) by defining

and we could go and verify that this does indeed produce an orthormal basis at each point, just as Gram Schmidt does.

This suggests a more general reasoning principle:

Principal for inner product space bundles: If \(M\) is a manifold with a bundle of inner product spaces (i.e., a vector bundle \(E\) on \(M\) with a smoothly varying inner product) then any construction or theorem in the language of inner product spaces can be carried out in \(E\), interpreting vectors as global sections of \(E\) and the inner product of vectors as a pointwise inner product of sections.

The point is that any construction carried out generically for an inner product space can be interpreted as talking about a bundle of inner product spaces instead.

For example, the Gram Schmidt construction is

Gram Schmidt: For any basis \(e_1, \dots, e_n\) of \(V\), there exists a basis \(f_1, \dots, f_n\) of \(V\) such that \(\angled{f_i, f_j} = \delta_{i,j}\).

Which, interpreting vectors as vector fields and the inner product as the pointwise inner product gives us the construction:

Gram Schmidt for Riemannian manifolds: For any frame \(E_1, \dots, E_n\) of \(V\), there exists a frame \(F_1, \dots, F_n\) of \(V\) such that \((x \mapsto \angled{F_i(x), F_j(x)}) = (x \mapsto \delta_{i,j})\)

Note that this does not say that any vector bundle has an orthonormal frame! That would be false because we cannot in general get global frames. It merely says that if we already have a global frame, we can get an orthonormal global frame. Moreover, it is not possible to obtain a basis in the language of vector spaces (although to be clear, this should be some kind of language of vector spaces as a type theory rather than in FOL, so that constructions may be defined and given semantics).

I think there should be a somewhat more general reasoning principle. Let \(\mathcal{C}\) be a category. If, waving hands vigorously, one defines a \(\mathcal{C}\)-bundle on \(M\) to be a "smoothly varying" family of objects of \(\mathcal{C}\) over \(M\), then any construction or theorem in some kind of internal language for \(\mathcal{C}\) will give a corresponding construction for \(\mathcal{C}\)-bundles in which points of objects correspond to global sections of bundles \(M\). (Perhaps we should think of a global section of a bundle \(E\) as a bundle map from the "point bundle" over \(M\) (isomorphic to \(M\)) to \(E\)).

Edit: Andrej Bauer points out on Reddit that a lot of this is explained in "Sheaves in Geometry and Logic" by Mac Lane and Moerdijk.

Parallel transport and version control

Posted on September 14, 2016
Parallel transport

One of the primary problems of version control is that two people can edit the same version of the same file, and the two sets of changes they made have to be reconciled somehow.

One of the ways of solving this in many version control systems is the notion of "rebase". Say \(v_0\) is a version of a repository and \(u\) is a child version. That is, there is some patch \(c\) taking \(v_0\) to \(u\), which we can depict diagrammatically as \[ \array{ v_0 & \stackrel{c}{\to} & u } \] Now say while we were writing version \(u\), the project we're working on progressed upstream from version \(v_0\) to version \(v_1\) with a patch \(p\). \[ \array{ v_0 & \stackrel{p}{\to} & v_1 } \] The rebase of \(c\) along \(p\) is a patch \(R_p(c)\) going from \(v_1 \to u'\) which is somehow obtained from \(c\) in the most generic possible way. And moreover, if the main version progresses from \(v_0\) to \(v_1\) to \(v_2\) as follows \[ \array{ v_0 & \stackrel{p_1}{\to} & v_1 & \stackrel{p_2}{\to} & v_2 } \] then we should have \[\begin{align}R_{p_2}(R_{p_1}(c)) = R_{p_2 \circ p_1}(c)\end{align}\] where \(p_2 \circ p_1\) denotes the patch obtained by first applying the patch \(p_1\) and then appling the patch \(p_2\). In other words, I should be able to rebase incrementally, or all in one step, and it shouldn't make a difference. We also demand that rebasing against the empty patch \(\mathrm{id}{v_0}\) does nothing. Or in other words, \[\begin{align}R_{\mathrm{id}{v_0}}(c) = c\end{align}\].

One way of summarizing this is that rebasing should be functorial. To me, this looks a lot like parallel transport of (tangent) vectors along paths in manifolds (which is also functorial in the same way). This analogy is even more plausible since we can think of a changeset \(c : v_0 \to u\) as being like a tangent vector at \(v_0\) (if the changeset \(c\) is small enough, and maybe think of it like a path if it's bigger). Making this analogy rigorous and potentially useful for designing a version control system could perhaps involve a combinatorial description of a flat connection for some discrete group (or possibly even monoid) \(G\) which connects the versions of a file.

There have been several formalisms trying to talk about version control. I think it is well known for example that merging should be something like a pushout in an appropriate category, and of course there was Homotopical Patch Theory, which formalized a version control system in Homotopy Type Theory. This is similar to the idea I've mentioned here in that patches are interpreted as a kind of path.

Rhetorical Dishonesty

Posted on May 31, 2016

I was really struck by John McWhorter's response to this question. He barely argued substantively against the idea raised by the questioner, instead using rhetoric to discredit it on other grounds.

I've transcribed the response below, along with my interpretations of the (just barely) subtextual meanings of his statements.

[Laughs]
He begins by signalling that the idea is so ridiculous that it merits immediate ridicule.
It's brilliant in its way, and I actually discuss it in the book
"This is a terrible idea and I've already discredited it thoroughly in writing"

"If you don't have future marking in your language then that makes you more sensitive to things like saving money and keeping healthy; if you have it then you're less likely to do it."

That is the idea of Keith Chan who was a brilliant economist at Yale and this...there's a TED talk, and it got around, and the media just ate it up with a spoon
"This guy is not a linguist, and his idea was popular in venues with large audiences, which means it's crap."

so if you don't have future marking such as Chinese does not in the sense of the European language has, then that makes you more likely to save money.

And Chen used statistics and you've got to respect them
"This guy uses fancy numbers which often convey authority, but I'm going to imply that such use really evidences an absence of true holistic understanding."
and apparently that showed that all of this made sense and you know I, um, this is not name dropping but I have spoken to, to him, we have corresponded, there is no battle, but no and I can just say very simply this. It's so easy
He firmly proclaims his opinion, long apparent from other signals in such a way that he seems to believe it's almost self-evident. He declares "It's so easy", implying that the opponent's opinion is so wrong, so shoddily supported, that anyone could see it.

Czech. Slovak. There used to be a country called Czechoslovakia and there's a reason. Czech and Slovak are really, this is going to offend somebody but, they are really the same language. I mean they are variations on a theme. The savings rates of Czechs and Slovaks are vastly different and yet they're speaking the same language with the same lack of a future. I give you that one. If you look at Chen's chart it's full of that it. It just doesn't work. It's not necessarily that he did bad linguistic work. That would be easy, that would be fun.

He slipped in that way in about three places where nobody could help it. I mean you simply have to know Russian well to understand whether or not it has a future (it's not his fault).
"Chen was simply oblivious to certain understandings which I'm privy to, and that resulted in his incorrect conclusion."
But there's a little of that, and then really the chart just doesn't work
"I only have that one minor criticism, so I have to vaguely assert that his chart is just wrong and quickly conclude...
and so it's a gorgeous idea, TED Talks live forever, he is a great presenter, but that idea is utterly bereft of hope.
...by reasserting that this is a superficially attractive idea, presented in popular and thus non-credible forums by a substanceless showman, which is ultimately wofeully mistaken."