PLDI 2025
Mon 16 - Fri 20 June 2025 Seoul, South Korea

This program is tentative and subject to change.

Wed 18 Jun 2025 17:00 - 17:20 at Grand Ball Room 1 - Big Red KATs

This paper is about semantic regular expressions (SemREs). This is a concept that was recently proposed by Smore (Chen et al. 2023) in which classical regular expressions are extended with a primitive to query external oracles such as databases and large language models (LLMs). SemREs can be used to identify lines of text containing references to semantic concepts such as cities, celebrities, political entities, etc. The focus in their paper was on automatically synthesizing semantic regular expressions from positive and negative examples.

In this paper, we study the \emph{membership testing problem}.

First, we present a two-pass NFA-based algorithm to determine whether a string $w$ matches a SemRE $r$ in $O(|r|^2 |w|^2 + |r| |w|^3)$ time, assuming the oracle responds to each query in unit time. In common situations, where oracle queries are not nested, we show that this procedure runs in $O(|r|^2 |w|^2)$ time. Experiments with a prototype implementation of this algorithm validate our theoretical analysis, and show that the procedure massively outperforms a dynamic programming-based baseline, and incurs a $\approx 2 \times$ overhead over the time needed for interaction with the oracle.

Second, we establish connections between SemRE membership testing and the triangle finding problem from graph theory, which suggest that developing algorithms which are simultaneously practical and asymptotically faster might be challenging. Furthermore, algorithms for classical regular expressions primarily aim to optimize their time and memory consumption. In contrast, an important consideration in our setting is to minimize the cost of invoking the oracle. We demonstrate an $\Omega(|w|^2)$ lower bound on the number of oracle queries necessary to make this determination.

This program is tentative and subject to change.

Wed 18 Jun

Displayed time zone: Seoul change

16:00 - 17:20
16:00
20m
Talk
Active Learning of Symbolic NetKAT Automata
PLDI Research Papers
Mark Moeller Cornell University, Tiago Ferreira University College London, Thomas Lu Cornell University, Nate Foster Cornell University; Jane Street, Alexandra Silva Cornell University
DOI
16:20
20m
Talk
StacKAT: Infinite State Network Verification
PLDI Research Papers
Jules Jacobs Cornell University, Nate Foster Cornell University; Jane Street, Tobias Kappé Leiden University, Dexter Kozen Cornell University, Lily Saada Cornell University, Alexandra Silva Cornell University, Jana Wagemaker Radboud University Nijmegen
DOI
16:40
20m
Talk
Probabilistic Kleene Algebra with Angelic Nondeterminism
PLDI Research Papers
Shawn Ong Cornell University, Stephanie Ma Cornell University, Dexter Kozen Cornell University
DOI
17:00
20m
Talk
Membership Testing for Semantic Regular Expressions
PLDI Research Papers
Yifei Huang University of Southern California, Matin Amini University of Southern California, Alexis Le Glaunec Rice University, Konstantinos Mamouras Rice University, Mukund Raghothaman University of Southern California
DOI