Equality Saturation Guided by Large Language Models
One critical issue with LLMs is their inability to guarantee correctness. Although this problem can be addressed by applying LLMs to formal rewrite systems, the capability of LLMs is still far from adequate to generate sound rewrite chains. To bridge this gap, this paper proposes \textit{LLM-guided equality saturation}, dubbed as LGuess, by incorporating e-graphs as an intermediate layer between LLMs and rewrite systems. LGuess queries LLMs for only high-level rewrite checkpoints and uses e-graphs to supply low-level rewrite chains between these checkpoints. In this procedure, the key technical challenge lies in effectively extracting a suitable checkpoint from a saturated e-graph, and LGuess addresses this by learning a probabilistic model from the LLM. The model predicts probable checkpoints while remaining simple enough for effective extraction.
We have implemented a prototype of LGuess and evaluated it on the problem of factorizing multi-variable polynomials. The results demonstrate a significant advantage of LGuess compared to both straightforward equality saturation and the approach that queries the LLM directly for the rewrite chain.