ImProver Theorem Proving
Overview
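For completeness, we also evaluate ImProver as a neural theorem prover on a subset of Mathematics in Lean (MIL).
The subset consists of 23 MIL exercises, 12 in group theory and 11 in set theory. Each is given with an empty input proof, so the model must produce the whole proof rather than rewrite an existing one.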
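To make the "empty input proof" setting concrete, here is a minimal sketch in Lean 4 with Mathlib. The statement is a hypothetical group theory exercise chosen for illustration, not necessarily one of the 23 evaluated. Using `sorry` to stand in for the empty proof is also an assumption; the exact input format is not specified in this section.

```lean
import Mathlib

variable {G : Type*} [Group G]

-- Hypothetical input: only the statement is given, the proof is left empty.
-- `sorry` is used here as a placeholder for the missing proof (assumption).
example (a b : G) : (a * b)⁻¹ = b⁻¹ * a⁻¹ := by
  sorry

-- One complete proof a prover could return for the same statement,
-- using the Mathlib lemma `mul_inv_rev`.
example (a b : G) : (a * b)⁻¹ = b⁻¹ * a⁻¹ := by
  exact mul_inv_rev a b
```

An exercise counts as solved only if the returned proof compiles without `sorry`.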
Results
| Model (MIL) | Set Theory | Group Theory | Overall |
|---|---|---|---|
| GPT-4o | 18.18% | 25.00% | 21.73% |
| ImProver | 45.45% | 33.33% | 39.13% |
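The overall column can be cross-checked against the per-topic columns. The sketch below, in Python, assumes the exercise counts stated in the overview (11 set theory, 12 group theory) and infers the number of solved exercises from each reported rate. These solved counts are inferred for illustration and are not taken from the raw data.

```python
# Exercise counts stated in the overview.
TOTALS = {"Set Theory": 11, "Group Theory": 12}

# Reported success rates (percent) from the results table.
REPORTED = {
    "GPT-4o": {"Set Theory": 18.18, "Group Theory": 25.0},
    "ImProver": {"Set Theory": 45.45, "Group Theory": 33.33},
}


def implied_solved(rate_percent, total):
    """Nearest whole number of solved exercises for a reported rate (inferred)."""
    return round(rate_percent / 100 * total)


def overall_rate(model):
    """Overall success rate in percent, pooled over both topics."""
    solved = sum(
        implied_solved(REPORTED[model][topic], total)
        for topic, total in TOTALS.items()
    )
    return 100 * solved / sum(TOTALS.values())


for model in REPORTED:
    print(f"{model}: {overall_rate(model):.3f}%")
# prints:
# GPT-4o: 21.739%
# ImProver: 39.130%
```

Under these assumptions the rates correspond to 2/11 and 3/12 solved for GPT-4o (5/23 overall) and 5/11 and 4/12 for ImProver (9/23 overall). The reported 21.73% matches 5/23 truncated to two decimals rather than rounded.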
Raw data can be found and downloaded here.