Slotted E-Graphs: First-Class Support for (Bound) Variables in E-Graphs
Equality saturation has gained significant interest as a powerful optimization and reasoning technique. At its heart is the e-graph data structure, which space-efficiently represents equal sub-terms uniquely.
An important open problem in this context is extending this efficient representation to languages featuring (bound) variables.
Regardless of whether we represent variables in e-graphs by name or namelessly (using de Bruijn indices), sharing is broken: sub-terms that differ only in the names of their variables are represented separately.
This results in aggressive e-graph growth, poor performance, and reduced expressiveness.
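The loss of sharing can be seen in a small sketch. The code below is illustrative only and is not taken from the paper's implementation: the tuple encoding of terms and the helper name `to_de_bruijn` are assumptions made for this example. It shows that named terms keep alpha-equivalent terms apart, while de Bruijn indices give the same sub-term `f x` a different encoding depending on how many binders enclose it.

```python
# Toy lambda terms as nested tuples (an assumed encoding, for illustration):
#   ("var", name), ("lam", name, body), ("app", function, argument)

def to_de_bruijn(term, env=()):
    """Replace bound names by de Bruijn indices; free names are kept as names."""
    kind = term[0]
    if kind == "var":
        name = term[1]
        # Index 0 refers to the innermost enclosing binder.
        return ("idx", env.index(name)) if name in env else ("free", name)
    if kind == "lam":
        return ("lam", to_de_bruijn(term[2], (term[1],) + env))
    if kind == "app":
        return ("app", to_de_bruijn(term[1], env), to_de_bruijn(term[2], env))
    raise ValueError(f"unknown term kind: {kind}")

f_x = ("app", ("var", "f"), ("var", "x"))
f_y = ("app", ("var", "f"), ("var", "y"))

t1 = ("lam", "x", f_x)                # \x. f x
t2 = ("lam", "y", f_y)                # \y. f y   (alpha-equivalent to t1)
t3 = ("lam", "x", ("lam", "y", f_x))  # \x. \y. f x

# With names, alpha-equivalent terms are distinct values: no sharing.
named_terms_differ = t1 != t2

# With de Bruijn indices, t1 and t2 coincide ...
db1, db2, db3 = to_de_bruijn(t1), to_de_bruijn(t2), to_de_bruijn(t3)

# ... but the sub-term "f x" is encoded with index 0 under one binder
# and with index 1 under two binders, so it is still not shared.
body_under_one_binder = db1[1]
body_under_two_binders = db3[1][1]
```

Here `body_under_one_binder` and `body_under_two_binders` both stand for `f x`, yet they are different values, so a structure that shares sub-terms by structural equality would store them separately.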
In this paper, we present a novel approach to representing bound variables in e-graphs by making them a first-class built-in feature of the data structure.
Our <em>slotted e-graph</em> represents terms that differ only by (bound or free) variable names uniquely.
To do so, e-classes that represent equivalent terms via e-nodes are parameterized by <em>slots</em>, abstracting over free variables of the represented terms.
Referring to an e-class from an e-node now requires relating the variables from its context to the slots of the e-class.
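As a rough intuition for slots, consider the following minimal sketch. It is a toy model under stated assumptions, not the slotted e-graph of the paper: it handles only free variables in plain terms, ignores binders and equivalences between different terms, and the names `slotify` and `SlottedTable` are hypothetical. Variables are renamed to numbered slots in order of first occurrence, terms are stored once per resulting shape, and every reference carries the variable names that instantiate the slots.

```python
# Toy terms as nested tuples (an assumed encoding, for illustration):
#   ("var", name) or (operator, child, child, ...)

def slotify(term, names):
    """Rename variables to slots numbered by first occurrence.

    `names` is filled so that names[i] is the variable that slot i stands for.
    """
    if term[0] == "var":
        if term[1] not in names:
            names.append(term[1])
        return ("slot", names.index(term[1]))
    return (term[0],) + tuple(
        slotify(child, names) if isinstance(child, tuple) else child
        for child in term[1:]
    )

class SlottedTable:
    """Stores each name-independent shape once (hypothetical toy structure)."""

    def __init__(self):
        self.classes = {}  # shape -> class id

    def add(self, term):
        names = []
        shape = slotify(term, names)
        class_id = self.classes.setdefault(shape, len(self.classes))
        # A reference is the class id plus the names filling its slots.
        return class_id, tuple(names)

table = SlottedTable()
ref_xy = table.add(("add", ("var", "x"), ("var", "y")))  # x + y
ref_ab = table.add(("add", ("var", "a"), ("var", "b")))  # a + b
ref_xx = table.add(("add", ("var", "x"), ("var", "x")))  # x + x
```

In this sketch `x + y` and `a + b` map to the same class and differ only in the names attached to the reference, whereas `x + x` has a different shape (one slot used twice) and gets its own class.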
Our evaluation of slotted e-graphs uses two case studies from compiler optimization and theorem proving to show that performing equality saturation for languages with bound variables is greatly simplified and that we can solve practically relevant problems that cannot be solved with e-graphs using de Bruijn indices.