ImProver Theorem Proving

Overview

For completeness, we evaluate ImProver as a neural theorem prover on a subset of Mathematics in Lean (MIL).

We evaluate on 23 MIL exercises, 12 in group theory and 11 in set theory, each starting from an empty input proof.

Results

MIL        Set Theory   Group Theory   Overall
GPT-4o     18.18%       25%            21.73%
ImProver   45.45%       33.33%         39.13%
Each cell reports percent accuracy.
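
As a sanity check on the table, the sketch below shows how the overall column can be recovered from the per-topic columns and the exercise counts (11 set theory, 12 group theory). The solved-exercise counts are not reported in the source. They are inferred here by rounding, and the helper names are hypothetical.

```python
from fractions import Fraction

# Exercise counts per topic, as stated in the overview.
TOTALS = {"set_theory": 11, "group_theory": 12}

# Per-topic accuracies (percent) copied from the results table.
REPORTED = {
    "GPT-4o": {"set_theory": 18.18, "group_theory": 25.0},
    "ImProver": {"set_theory": 45.45, "group_theory": 33.33},
}


def solved_counts(percentages):
    """Infer solved-exercise counts from percentages.

    Assumption: each percentage is (solved / total) * 100 rounded or
    truncated to two decimals, so rounding back recovers the integer count.
    """
    return {
        topic: round(pct * TOTALS[topic] / 100)
        for topic, pct in percentages.items()
    }


def overall_accuracy(counts):
    """Overall accuracy as an exact fraction: total solved / total exercises."""
    return Fraction(sum(counts.values()), sum(TOTALS.values()))


for model, pcts in REPORTED.items():
    counts = solved_counts(pcts)
    overall = overall_accuracy(counts)
    print(f"{model}: {counts}, overall = {overall} = {float(overall) * 100:.3f}%")
```

Under these assumptions the overall figures come out to 5/23 (about 21.739%) for GPT-4o and 9/23 (about 39.130%) for ImProver, which agrees with the table if its 21.73% entry was truncated rather than rounded.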

Raw data can be found and downloaded here.