First-Class Verification Dialects for MLIR (PLDI 2025 - PLDI Research Papers)

Who

Mathieu Fehr, Yuyou Fan, Hugo Pompougnac, John Regehr, Tobias Grosser

Track

PLDI 2025 PLDI Research Papers

Time Zone

The program is currently displayed in (GMT+09:00) Seoul.

When

Wed 18 Jun 2025 14:40 - 15:00 at Orchid - Semantics. Chair(s): Alexa VanHattum

Abstract

MLIR is a toolkit supporting the development of extensible and composable intermediate representations (IRs) called dialects; it was created in response to rapid changes in hardware platforms, programming languages, and application domains such as machine learning. MLIR supports development teams creating compilers and compiler-adjacent tools by factoring out common infrastructure such as parsers and printers. A major limitation of MLIR is that it is syntax-focused: it has no support for directly encoding the semantics of operations in its dialects. Thus, at present, the parts of MLIR tools that depend on semantics (optimizers, analyzers, verifiers, transformers) must all be engineered by hand.

Our work makes formal semantics a first-class citizen in the MLIR ecosystem. We designed and implemented a collection of semantics-supporting MLIR dialects for encoding the semantics of compiler IRs. These dialects support a separation of concerns between three domains of expertise when building formal-methods-based tooling for compilers. First, compiler developers define their dialect’s semantics as a lowering (compilation transformation) from their dialect to one or more of ours. Second, SMT solver experts provide tools to optimize domain-specific high-level semantics and lower them to SMT queries. Third, tool builders create dialect-independent verification tools.

We validate our work by defining semantics for five key MLIR dialects, defining a state-of-the-art SMT encoding for memory-based semantics, and building three dialect-agnostic tools, which we used to find five miscompilation bugs in upstream MLIR, verify a canonicalization pass, and also formally verify transfer functions for two dataflow analyses: “known bits” (that finds individual bits that are always zero or one in all executions) and “demanded bits” (that finds don’t-care bits). The transfer functions that we verify are improved versions of those in upstream MLIR; they detect, on average, 36.6% more known bits in real-world MLIR programs compared to the upstream implementation.
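
To make the notion of a "known bits" transfer function concrete, the following is a minimal illustrative sketch and not the implementation described in the paper. It assumes a 4-bit width and a single operation (bitwise AND), and it checks soundness by exhaustive enumeration rather than by the SMT-based verification the abstract refers to. All names (`KnownBits`, `known_bits_and`, `is_sound`) are hypothetical and chosen only for this example.

```python
from typing import Callable, Iterator, List, NamedTuple

WIDTH = 4                      # assumed bit width, small enough to enumerate
MASK = (1 << WIDTH) - 1


class KnownBits(NamedTuple):
    """Abstract value: bit masks of positions known to be 0 and known to be 1."""
    zeros: int
    ones: int

    def contains(self, value: int) -> bool:
        # A concrete value is described if it agrees with every known bit.
        return (value & self.zeros) == 0 and (value & self.ones) == self.ones


def known_bits_and(a: KnownBits, b: KnownBits) -> KnownBits:
    # A result bit is 0 if it is known 0 in either input,
    # and 1 only if it is known 1 in both inputs.
    return KnownBits(zeros=a.zeros | b.zeros, ones=a.ones & b.ones)


def known_bits_and_buggy(a: KnownBits, b: KnownBits) -> KnownBits:
    # Deliberately wrong: claims a bit is 1 if it is known 1 in either input.
    return KnownBits(zeros=a.zeros | b.zeros, ones=a.ones | b.ones)


def all_abstract_values() -> Iterator[KnownBits]:
    # Every consistent abstract value: no bit is both known 0 and known 1.
    for zeros in range(MASK + 1):
        for ones in range(MASK + 1):
            if zeros & ones == 0:
                yield KnownBits(zeros, ones)


def concretize(k: KnownBits) -> List[int]:
    # All concrete values described by the abstract value.
    return [v for v in range(MASK + 1) if k.contains(v)]


def is_sound(transfer: Callable[[KnownBits, KnownBits], KnownBits],
             concrete_op: Callable[[int, int], int]) -> bool:
    # Sound means: for all inputs described by (a, b), the concrete result
    # is described by transfer(a, b).
    for a in all_abstract_values():
        for b in all_abstract_values():
            out = transfer(a, b)
            for x in concretize(a):
                for y in concretize(b):
                    if not out.contains(concrete_op(x, y) & MASK):
                        return False
    return True


if __name__ == "__main__":
    print(is_sound(known_bits_and, lambda x, y: x & y))
    print(is_sound(known_bits_and_buggy, lambda x, y: x & y))
```

Exhaustive enumeration only works at toy bit widths; a solver-based check of the kind the abstract describes would state the same soundness condition as a query over symbolic bit vectors instead.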

DOI

https://doi.org/10.1145/3729309

Mathieu Fehr

The University of Edinburgh

United Kingdom

Yuyou Fan

University of Utah

United States

Hugo Pompougnac

Université Grenoble Alpes; Inria; CNRS; Grenoble INP

France

John Regehr