# ImProver: Agent-Based Automated Proof Optimization

Automated Proof Optimization is the task of rewriting a formal proof so that it is *both* correct and optimized for an arbitrary criterion.

We present ImProver, a general LLM agent-based framework that solves this task. Here's an example of ImProver making a lemma from the 2019 IMO Q2 more concise and efficient:

Original (Human-Written)

```
lemma lemma0 {α : Type} {p : α → α → Prop}
(h1 : ∀ x, ∃! y, p x y)
(h2 : ∀ x y, p x y ↔ p y x) :
∀ x, Classical.choose
(h1 (Classical.choose (h1 x).exists)).exists=x := by
-- PROOF START
intro x
obtain ⟨y, h1e, h1u⟩ := h1 x
have h2' : Classical.choose (h1 x).exists = y :=
h1u _ (Classical.choose_spec (h1 x).exists)
rw [h2']
obtain ⟨w, h1e', h1u'⟩ := h1 y
have h4 := Classical.choose_spec (h1 y).exists
have hxw : x = w := by
apply h1u'
rw [h2]
exact h1e
rw [hxw]
exact h1u' _ h4
```

ImProver (Length-Optimized)

```
lemma lemma0 {α : Type} {p : α → α → Prop}
(h1 : ∀ x, ∃! y, p x y)
(h2 : ∀ x y, p x y ↔ p y x) :
∀ x, Classical.choose
(h1 (Classical.choose (h1 x).exists)).exists=x := by
-- PROOF START
intro x
obtain ⟨y, h1e, h1u⟩ := h1 x
rw [h1u _ (Classical.choose_spec _)]
obtain ⟨w, h1e', h1u'⟩ := h1 y
rw [h1u' _ ((h2 _ _).mpr h1e)]
exact h1u' _ (Classical.choose_spec _)
```

We find that naively applying LLMs to proof optimization falls short, and we incorporate various improvements into ImProver, such as the use of symbolic Lean context in a novel Chain-of-States technique, as well as error-correction and retrieval.

We test ImProver on rewriting real-world undergraduate, competition, and research-level mathematics problems, finding that ImProver is capable of rewriting proofs so that they are substantially shorter and more readable compared to the raw LLM baseline.

Metric | Model | Improvement | Nonempty Improvement | Accuracy | Improved Acc. |

Length | GPT-4o | 3.7 | 15.15 | 26.36% | 8.31% |

Length |
ImProver |
20.96 |
55.29 |
100.0% |
35.44% |

Readability | GPT-4o | 2.21 | 8.02 | 18.75% | 6.13% |

Readability |
ImProver |
9.34 |
30.53 |
100.0% |
24.56% |

Check out our code and run ImProver yourself here, or keep exploring to learn more about ImProver.

## Recent Activity

### Compfiles

ImProver is the first AI software to contribute to a major Lean dataset!

### Official Release

ImProver has been officially released!

### ArXiV

ImProver is officially in preprint

### ICLR

ImProver has been submitted for ICLR 2025

### MAI @ NeurIPS2025

ImProver has been submitted for the Math for AI workshop at NeurIPS 2025

### Github

ImProver is now Open-Source