The Co-Mathematician's Question

In late April, an Oxford topologist named Marc Lackenby fed a problem from a battered Russian notebook to an AI that DeepMind had been quietly building for the better part of a year. The notebook is the Kourovka Notebook, and it has been collecting unsolved questions in group theory since 1965, passed mathematician to mathematician across continents and editions, like an open-mic list nobody quite knows how to close. The question Lackenby chose, problem 21.10, had outlived two generations of mathematicians. A few days later, after one flawed proof had been caught and repaired, the problem was closed. The strange part isn’t that the machine solved it. The strange part is what happened in between.

Most reporting on the result settled into the predictable register. Machine cracks human problem. The expected think pieces filed themselves. What got less airtime, and what is actually the story, is the workflow that did the cracking, and a small philosophical pinch point it produced almost as a by-product. The proof is correct. Whether anyone, including Lackenby, understands it in the sense we used to mean by understanding is now a live question. I want to argue here that this is not a technicality. It might be the question that decides what mathematics looks like by the end of the decade.

SECTION 1

A Sixty-Year-Old Question, Answered by a Committee of Bots

The Kourovka Notebook is one of those mathematical artefacts that sounds like a Borges premise but is real and a little embarrassing in its physicality. It started in 1965 in Novosibirsk, at a group theory conference, when the Soviet algebraist Mikhail Kargapolov produced a small ruled notebook and asked everyone present to write down a problem they could not solve. They did. The notebook went home with someone. The next year it had more problems. It got typed up, mailed around, photocopied, re-edited. It is now in its 21st edition, published this past January, and it remains the longest-running open list of unsolved problems in any branch of mathematics.

Problem 21.10 had the quality that makes a question survive that many editions. It is short. It is crisp. It asks whether every finite group has what the literature calls a just-infinite presentation, which means a description so economical that if you remove any single rule the group ceases to be finite and blows up to infinity. You can hand it to a graduate student in twenty seconds. You can fail to crack it for a career.

Lackenby did not hand it to a graduate student. He handed it to DeepMind’s AI Co-Mathematician, a system the lab described in a preprint released on the first of May. The Co-Mathematician is not a chatbot in a costume. The architecture matters, so it is worth slowing down.

A coordinator agent dispatches workstreams in parallel. Reviewers check each step. The human picks the thread that survives.

At the top sits a project coordinator agent whose only job is to keep score. It receives the problem, divides the attack into workstreams, and spawns sub-agents to chase each one. Some of those agents try to prove the conjecture. Others, working in parallel, try to disprove it. The point of running both is that the system does not need to know in advance which side is correct. It can converge on the truth by elimination. Beneath each strand of attack, reviewer agents read everything that gets produced, looking for the small logical gaps that human referees are paid to spot a few months after a paper goes up on arXiv.
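The shape of that arrangement can be sketched in a few lines of Python. What follows is a toy model, not DeepMind's code: every name in it (Draft, sloppy_prover, disprover, review, coordinate) is invented for illustration, and the "conjecture" is a schoolroom one, the old claim that n² + n + 41 is always prime. The reviewer here can afford to re-check every case because the domain is finite, which real mathematics rarely allows. The sketch assumes only what the paragraph above describes: workers on both sides run in parallel, a reviewer reads every draft, and the coordinator keeps whatever survives.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

# Hypothetical names throughout; a toy model, not DeepMind's code or API.
Claim = Callable[[int], bool]


@dataclass
class Draft:
    stance: str                    # "prove" or "disprove"
    witness: Optional[int] = None  # the counterexample, for a disproof


Worker = Callable[[Claim, Sequence[int]], Optional[Draft]]


def sloppy_prover(claim: Claim, domain: Sequence[int]) -> Optional[Draft]:
    """Worker that declares the claim proved after checking only ten cases."""
    if all(claim(n) for n in domain[:10]):
        return Draft("prove")
    return None


def disprover(claim: Claim, domain: Sequence[int]) -> Optional[Draft]:
    """Worker that hunts for a single counterexample."""
    for n in domain:
        if not claim(n):
            return Draft("disprove", witness=n)
    return None


def review(draft: Draft, claim: Claim, domain: Sequence[int]) -> bool:
    """Reviewer: independently re-check whatever the worker asserted."""
    if draft.stance == "disprove":
        return draft.witness is not None and not claim(draft.witness)
    return all(claim(n) for n in domain)  # a proof must cover every case


def coordinate(claim: Claim, domain: Sequence[int],
               workers: Sequence[Worker]) -> List[Draft]:
    """Coordinator: run both sides in parallel, keep drafts that pass review."""
    with ThreadPoolExecutor() as pool:
        drafts = list(pool.map(lambda worker: worker(claim, domain), workers))
    return [d for d in drafts if d is not None and review(d, claim, domain)]


def is_prime(n: int) -> bool:
    return n > 1 and all(n % d for d in range(2, math.isqrt(n) + 1))


def euler_claim(n: int) -> bool:
    """Toy conjecture: n*n + n + 41 is prime."""
    return is_prime(n * n + n + 41)


if __name__ == "__main__":
    survivors = coordinate(euler_claim, list(range(50)),
                           [sloppy_prover, disprover])
    print(survivors)
```

Run on the numbers 0 to 49, the sloppy prover files a confident proof, the reviewer throws it out, and the only draft left standing is the counterexample at n = 40, where the polynomial gives 1681, which is 41 squared. Neither the coordinator nor the reviewer needed to know in advance which side was right. That is the elimination the design relies on.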

What happened with problem 21.10 was instructive. The first attempted proof had a hole in it. It was not a hole that any of the worker agents noticed. It was the reviewer agent that flagged the step where the argument quietly conflated congruence with conjugacy, two relations on groups that look similar on a chalkboard and behave very differently once you push on them. Lackenby read both the attempted proof and the critique, recognised the shape of the strategy, and redirected the system toward a cleaner formulation. He told the agents, in effect, to stop treating presentations as isolated objects and start treating them as a poset, a partially ordered family with a top and a bottom. That suggestion did the work. The system rewrote the argument, the reviewers passed it, and problem 21.10 went into the back of the next Kourovka edition with a citation rather than a question mark.

The AI is not playing the role of mathematician. It is playing the role of an extremely fast, tireless graduate student who has read everything.

Lackenby has been careful in interviews about what the experience actually was. The system, he said, works best when the human is already familiar with the area. He used the AI the way a senior researcher uses a postdoc with infinite stamina, except this postdoc reads the entire literature again before lunch. It is a useful framing because it cuts through the marketing language. Nobody was replaced. A workflow was multiplied.

The wider context is the part that gets buried in the press release. The same week the Lackenby result was announced, DeepMind also reported that the Co-Mathematician had scored forty-eight percent on FrontierMath Tier 4, a benchmark Epoch AI explicitly built to be unreachable by machine learning systems for, in their words, decades. The base model, by itself, scored nineteen percent. The twenty-nine-point jump came almost entirely from the agentic scaffolding sitting on top. Which is to say, the lift was not in the model. It was in how the model was made to argue with itself.

SECTION 2

A Different QR Code, A Different Kind of Tool

Two weeks before the Co-Mathematician story broke, Quanta Magazine ran a piece by Erica Klarreich about a new tool for distinguishing knots. The tool, built by a small group of geometric topologists, is being described as a powerful new QR code for knots. The headline undersells it slightly. What the team did was construct a computable, visualisable approximation to the Kontsevich integral, an object that has hovered in knot theory since the 1990s as the theoretical fingerprint of every knot in the universe. The integral is so detailed it might, in principle, tell apart any two different knots, however alike they look. But it is written in a language no human can read off a real piece of string.

Think of it this way. The Kontsevich integral is the perfect identification photo of every knot you will ever tie or untie. Crisp, unambiguous, complete. The catch is that it is stored in a file format no camera can render. For thirty years it has been a beautiful object that nobody can hold up to the light. The new QR code, in effect, is the first rendering of fragments of that file that the human eye can actually parse. It is a piece of mathematics built by humans that lets humans see something they could not see before.

Now place the two stories next to each other. The QR-code knot tool extends human sight. It lets a person, a working topologist, look at a knot and read something off the page that was always there but invisible. The Co-Mathematician, by contrast, extends the machine’s sight. It lets a system carry out the seeing on behalf of the mathematician, who then signs off on the result. Both expand the reach of the field. They are doing very different epistemic work.