AlphaGeometry and the threat of AI’s takeover of mathematics

Google’s machine was able to solve 25 out of 30 Olympiad-level geometry problems and could also write human-readable proofs and draw diagrams to explain a proof. According to a U.S. Mathematical Olympiad coach, this performance exceeds that of an average silver medallist.

2024-03-14 - Mohan R.

A few weeks ago, an animated discussion unfolded in a WhatsApp group whose members are mathematicians interested in the Indian Mathematical Olympiad. The spark was a Nature paper announcing that a Google DeepMind artificial intelligence (AI) system named AlphaGeometry had achieved a milestone: it could solve geometry problems at the level of the International Mathematical Olympiad, nearly matching the prowess of gold medallists.

The news evoked a mix of awe, fear, and wonder among us, especially in light of how AI tools like ChatGPT have started to reshape education. Some mathematicians wondered if the advent of AlphaGeometry signals the start of AI’s ascendancy in mathematics.

Is this truly the beginning of an AI takeover in mathematics? To answer this question, let’s take a look at the inner workings of AlphaGeometry.

How does mathematical logic work?

The Nature paper was coauthored by two computer scientists at New York University and two DeepMind researchers. AlphaGeometry is one of DeepMind’s array of AI systems, perhaps the most popular of which is AlphaZero, a deep-learning algorithm that excels at playing chess. Programs like these are part of researchers’ efforts to work up a ladder of complexity, building tools that can perform more complex tasks with better reliability.

The AlphaGeometry team has published supplementary information describing the proofs generated by AlphaGeometry for some geometry problems, showcasing its ability to create hundreds of logical steps in proof construction.

Let’s start with a simple example from school mathematics. Suppose we only know that for any number a, a + 0 = a. From this, we will be able to prove that for any number a, a × 0 = 0. How? If a + 0 = a for any number a, then we should have 0 + 0 = 0. Thus a × 0 can be written as a × (0 + 0), which is the same as a × 0 + a × 0. So we have the equality a × 0 = (a × 0) + (a × 0). Cancelling a × 0 on both sides of the equation, we can conclude that a × 0 = 0.
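The same argument can be laid out line by line with a reason attached to each step. Note that, besides the stated fact that adding 0 changes nothing, it quietly relies on the distributive law and on cancellation, which are assumed here as ordinary rules of arithmetic.

```latex
\begin{align*}
a \times 0 &= a \times (0 + 0) && \text{since } 0 + 0 = 0 \\
           &= (a \times 0) + (a \times 0) && \text{distributive law} \\
\Rightarrow\quad (a \times 0) + 0 &= (a \times 0) + (a \times 0) && \text{since } x + 0 = x \text{ for } x = a \times 0 \\
\Rightarrow\quad 0 &= a \times 0 && \text{cancelling } a \times 0 \text{ on both sides}
\end{align*}
```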

Here, the entire proof is simply derived from the hypothesis using the rules of logic. Many computer programs can execute such a process but AlphaGeometry stands apart because of its ‘Deductive Database’ — a method that significantly reduces the number of steps in a proof.

Suppose we are given a statement A, and we want to deduce the statement Z. The program can spit out all possible next steps (let’s call them B) that can be deduced from A using the rules of logic. Then it will spit out all possible next steps C that can be deduced from B, and so on. If there are only finitely many steps possible, then it should reach the conclusion Z at some point. But once it reaches Z, it will perform a ‘traceback’ process to find the proof that takes the minimum number of steps.
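The idea of expanding every possible next step and then tracing back can be sketched as a breadth-first search. This is a toy illustration, not AlphaGeometry’s actual code: the function name `shortest_proof`, the rule table `toy_rules` and the statement labels are all made up for the example, and each rule here has a single premise, whereas real deduction rules usually combine several.

```python
from collections import deque

def shortest_proof(rules, start, goal):
    """Breadth-first forward chaining with traceback.

    rules maps a statement to the statements deducible from it in one step.
    Returns the shortest chain from start to goal, or None if goal is unreachable.
    """
    parent = {start: None}          # records how each statement was first reached
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            # Traceback: follow the parent links from the goal back to the start.
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for nxt in rules.get(current, []):
            if nxt not in parent:   # the first visit in BFS is along a shortest route
                parent[nxt] = current
                queue.append(nxt)
    return None

# Hypothetical rule set: A yields B1 and B2, B1 yields C1, and so on.
toy_rules = {
    "A": ["B1", "B2"],
    "B1": ["C1"],
    "B2": ["Z"],
    "C1": ["Z"],
}
print(shortest_proof(toy_rules, "A", "Z"))  # ['A', 'B2', 'Z']
```

Although Z can also be reached through B1 and C1, the traceback returns the two-step route through B2 because breadth-first search reaches Z by the shortest chain first.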

So much for arithmetic and logic; geometry requires something more. In geometry, we use algebraic relations between different kinds of measures to find new relations. For example, most of us have used simple techniques in school geometry called ‘angle chasing’, ‘ratio chasing’ and ‘distance chasing’.

To illustrate the meaning of these ideas, let us take an example from school geometry. Let a, b, and c be three lines on a plane. If we know the angle between a and b and the angle between b and c, we can immediately determine the angle between a and c (see figure 1). This is an example of ‘angle chasing’. Similarly, AlphaGeometry can quickly discover all possible algebraic relationships between some given quantities using its ‘Algebraic Rules’ program.
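Angle chasing of this kind amounts to adding up known angles along a chain of lines. The sketch below is only an illustration under stated assumptions: angles are treated as directed angles in degrees, taken modulo 180 so that the sum is well defined, and the function `chase_angle` and the sample values are invented for the example rather than taken from AlphaGeometry.

```python
def chase_angle(known, start, end):
    """Derive the directed angle from line `start` to line `end`, in degrees mod 180.

    known maps a pair (x, y) to the directed angle from line x to line y.
    Returns None if the two lines are not linked by any chain of known angles.
    """
    # Each known angle can also be used in reverse, with its sign flipped.
    edges = {}
    for (x, y), deg in known.items():
        edges.setdefault(x, []).append((y, deg % 180))
        edges.setdefault(y, []).append((x, -deg % 180))
    # Depth-first search, accumulating angles along the chain of lines.
    stack, seen = [(start, 0)], {start}
    while stack:
        line, total = stack.pop()
        if line == end:
            return total % 180
        for nxt, deg in edges.get(line, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, total + deg))
    return None

known = {("a", "b"): 30, ("b", "c"): 45}
print(chase_angle(known, "a", "c"))  # 75
```

Asking for the angle in the other direction, `chase_angle(known, "c", "a")`, gives 105, which is the same angle measured the other way round (180 minus 75).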

When it combines its ‘Deductive Database’ and ‘Algebraic Rules’ programs, AlphaGeometry can write complete proofs for most school-level geometry problems.

For example, let A, B, C, and D be any four points on a plane (see figure 2). Suppose by angle chasing we know that the angle between the lines AB and BD is equal to the angle between the lines AC and CD.

Then ‘Deductive Database’ can immediately figure out that all four points lie on a circle, while ‘Algebraic Rules’ can determine that the angle between the lines BC and CA is equal to the angle between the lines BD and DA.
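The link between equal angles and four points lying on a circle can be checked numerically. The sketch below is not a proof and is not how AlphaGeometry works; it simply places four points on a unit circle (the positions 10, 80, 200 and 310 degrees are arbitrary choices) and compares directed angles between lines, measured modulo 180 degrees. All function names are invented for the example.

```python
import math

def line_angle(p, q):
    """Direction of the line through p and q, in degrees mod 180."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0])) % 180

def directed_angle(p1, q1, p2, q2):
    """Directed angle from line p1q1 to line p2q2, in degrees mod 180."""
    return (line_angle(p2, q2) - line_angle(p1, q1)) % 180

def same_angle(x, y, tol=1e-9):
    """Compare two angles mod 180, allowing for wrap-around near 0 and 180."""
    d = abs(x - y) % 180
    return min(d, 180 - d) < tol

def on_unit_circle(deg):
    """Point on the unit circle at the given angle in degrees."""
    t = math.radians(deg)
    return (math.cos(t), math.sin(t))

A, B, C, D = (on_unit_circle(d) for d in (10, 80, 200, 310))

# Angle between AB and BD versus angle between AC and CD.
print(same_angle(directed_angle(A, B, B, D), directed_angle(A, C, C, D)))  # True
# Angle between BC and CA versus angle between BD and DA.
print(same_angle(directed_angle(B, C, C, A), directed_angle(B, D, D, A)))  # True
```

Moving D off the circle, for instance to the centre (0, 0), breaks the first equality, which is what lets a program read the equal angles as a sign that the four points are concyclic.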

The combination of these two programs makes AlphaGeometry a very powerful tool. The AlphaGeometry team could solve 14 of the 30 geometry problems in the International Mathematical Olympiad in this way.

This achievement also reveals that a significant part of the difficulty in these problems lay not in the ingenuity required to solve them but in the ability to deduce the largest number of relations, and computers are better at this than humans.

Fortunately, this ability is not sufficient to solve all problems in geometry, but AlphaGeometry seems to have summited this peak as well.

Mathematics is a creative field because mathematicians often come up with clever constructions to solve a problem. Their name for such a construction is an auxiliary construction. Auxiliary constructions are neither part of what is ‘given’ to us nor part of what we want to prove, and they illustrate what makes automatic theorem proving difficult. There are infinitely many ways to build constructions, and human intelligence is required to judge which one to choose for a given problem and how to use it.

There is a classic example: some 2,000 years ago, Euclid proved that there are infinitely many prime numbers. His proof goes as follows: suppose there are only finitely many prime numbers, say p1, p2, …, pn. Take the product of all these primes and add 1 to the product. Let’s call this new number p. That is, p = p1 × p2 × … × pn + 1. The question now is whether p is a prime.

If p is a prime, then since p is bigger than all the other primes, we have a new prime. However, this shouldn’t be possible because we assumed originally that there is only a finite number of primes. If p is not a prime, then one of the primes p1, p2, …, pn must divide p; since that prime also divides the product p1 × p2 × … × pn, we will be forced to conclude that it divides 1, which is absurd. In sum, assuming there is a finite number of primes leads us to absurdity, which means there have to be infinitely many primes.

The auxiliary construction in this proof is constructing the number p.
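Euclid’s construction is easy to try out on a computer. The sketch below multiplies a given list of primes, adds 1, and finds the smallest prime factor of the result by trial division; that factor can never be one of the primes we started with. The function names are invented for this illustration.

```python
from math import prod

def smallest_prime_factor(n):
    """Return the smallest prime factor of n (n >= 2) by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to the square root, so n itself is prime

def new_prime_from(primes):
    """Euclid's construction: return a prime that is not in the given list."""
    p = prod(primes) + 1
    return smallest_prime_factor(p)

print(new_prime_from([2, 3, 5]))             # 31, since 2*3*5 + 1 = 31 is prime
print(new_prime_from([2, 3, 5, 7, 11, 13]))  # 59, since 30031 = 59 * 509
```

The two calls show both cases of the argument: the new number is sometimes a prime itself, and sometimes a product of primes that were missing from the original list.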

There are no particular restrictions on how we can come up with different constructions, and thus on the different ways to solve the problem. They simply require experience and deep insight.

Most geometry proofs require auxiliary constructions. Large language models like GPT-4, which is behind ChatGPT, can be taught to come up with possible constructions. One can train them to use rulesets from different fields to build auxiliary constructions and use them to write proofs. However, there is no guarantee that the new constructions they devise will lead to new proofs.

But when the AlphaGeometry team combined its own language model, trained on synthetic, machine-generated proofs, with ‘Deductive Database’ and ‘Algebraic Rules’, the program could produce auxiliary constructions for geometry problems, with no prior human demonstration. This is a new development in the field, and in this sense, AlphaGeometry seems like a big step towards AI’s takeover of mathematics, which has thus far been a very human enterprise.

In all, AlphaGeometry could solve 11 more Olympiad geometry problems, bringing its tally to 25 out of 30 problems. It is also commendable that AlphaGeometry can write human-readable proofs and can draw diagrams to explain a proof. Once it did so, the team asked a coach of the U.S. Mathematical Olympiad to evaluate the proofs and grade them. The result: AlphaGeometry performed better than an average silver medallist.

The architecture developed for AlphaGeometry may not have been able to solve the other Olympiad problems, but the techniques it developed are directly useful for solving problems from other areas of mathematics. The success of this project will certainly lead to the development of AI programs that can efficiently do mathematics at least at the school level.

(Mohan R. is a mathematician at Azim Premji University, Bengaluru.)