Type-Constrained Code Generation with Language Models (PLDI 2025 - PLDI Research Papers)

Who

Niels Mündler, Jingxuan He, Hao Wang, Koushik Sen, Dawn Song, Martin Vechev

Track

PLDI 2025 PLDI Research Papers

Time Zone

The program is currently displayed in (GMT+09:00) Seoul.

When

Wed 18 Jun 2025 16:00 - 16:20 at Cosmos, Violet & Tulip (session: Machine Learning). Chair(s): Feras A. Saad

Abstract

Large language models (LLMs) have achieved notable success in code generation. However, they still frequently produce uncompilable output because their next-token inference procedure does not model formal aspects of code. Although constrained decoding is a promising approach to alleviate this issue, it has only been applied to handle either domain-specific languages or syntactic features of general-purpose programming languages. However, LLMs frequently generate code with typing errors, which are beyond the domain of syntax and generally hard to adequately constrain. To address this challenge, we introduce a type-constrained decoding approach that leverages type systems to guide code generation. We develop novel prefix automata for this purpose and introduce a sound approach to enforce well-typedness based on type inference and a search over inhabitable types. We formalize our approach on a foundational simply-typed language and extend it to TypeScript to demonstrate practicality. Our evaluation on the HumanEval and MBPP datasets shows that our approach reduces compilation errors by more than half and significantly increases functional correctness in code synthesis, translation, and repair tasks across LLMs of various sizes and model families, including state-of-the-art open-weight models with more than 30B parameters. The results demonstrate the generality and effectiveness of our approach in constraining LLM code generation with formal rules of type systems.
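The decoding idea in the abstract can be illustrated with a deliberately tiny sketch. This is not the authors' implementation: the type table, the token set, the `mock_scores` stand-in for an LLM, and every function name below are hypothetical and chosen only for illustration. The sketch tracks the type of the expression emitted so far, and before each step it masks any token after which the goal type could no longer be reached (checked by a breadth-first search over types), so the decoder never commits to a prefix that cannot be completed into a well-typed expression.

```python
from collections import deque

# Toy type table (an assumption for illustration, not TypeScript's real one):
# literal token -> type, and (type, member token) -> result type.
LITERALS = {'"hi"': "string", "42": "number", "null": "null"}
MEMBERS = {
    ("string", ".length"): "number",
    ("number", ".toString()"): "string",
}
EOS = "<eos>"


def reachable(src, goal):
    """Breadth-first search: can a chain of member accesses turn src into goal?"""
    seen, queue = {src}, deque([src])
    while queue:
        current = queue.popleft()
        if current == goal:
            return True
        for (owner, _member), result in MEMBERS.items():
            if owner == current and result not in seen:
                seen.add(result)
                queue.append(result)
    return False


def step(state, token):
    """Advance the prefix state: None means nothing emitted yet, else the
    type of the expression so far. Returns None for an invalid token."""
    if state is None:
        return LITERALS.get(token)
    return MEMBERS.get((state, token))


def allowed(state, goal):
    """Tokens that keep the prefix completable to an expression of type goal."""
    ok = set()
    for token in list(LITERALS) + [member for _owner, member in MEMBERS]:
        nxt = step(state, token)
        if nxt is not None and reachable(nxt, goal):
            ok.add(token)
    if state == goal:
        ok.add(EOS)  # stopping is only legal once the goal type is reached
    return ok


def mock_scores(prefix):
    """Hypothetical stand-in for an LLM: always prefers `null`, then stopping."""
    return {"null": 3.0, EOS: 2.5, '"hi"': 2.0,
            ".length": 1.5, "42": 1.0, ".toString()": 0.5}


def decode(score_fn, goal, max_steps=8):
    """Greedy decoding restricted to the tokens returned by allowed()."""
    state, out = None, []
    for _ in range(max_steps):
        candidates = allowed(state, goal)
        if not candidates:
            raise ValueError("no token keeps the prefix completable")
        scores = score_fn(out)
        token = max(candidates, key=lambda t: scores.get(t, float("-inf")))
        if token == EOS:
            break
        out.append(token)
        state = step(state, token)
    return "".join(out)


print(decode(mock_scores, "number"))
```

In this toy setting the mock model's favorite token, `null`, is masked at the first step because no chain of member accesses leads from it to `number`, and stopping is masked until the expression actually has the goal type. A real system would have to handle full syntax, incremental parsing of partial tokens, and a far richer type system, none of which this sketch attempts.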

Link to Preprint

https://arxiv.org/abs/2504.09246

DOI

https://doi.org/10.1145/3729274

Niels Mündler

ETH Zurich

Switzerland

Jingxuan He

University of California at Berkeley

United States

Hao Wang

University of California at Berkeley

United States

Koushik Sen

University of California at Berkeley

United States

Dawn Song

University of California at Berkeley

United States

Martin Vechev

ETH Zurich

Switzerland

Session Program

Wed 18 Jun
Displayed time zone: Seoul

16:00 - 17:20	Machine Learning (PLDI Research Papers) at Cosmos, Violet & Tulip. Chair(s): Feras A. Saad, Carnegie Mellon University

16:00 20m Talk	Type-Constrained Code Generation with Language Models (PLDI Research Papers). Niels Mündler (ETH Zurich), Jingxuan He (University of California at Berkeley), Hao Wang (University of California at Berkeley), Koushik Sen (University of California at Berkeley), Dawn Song (University of California at Berkeley), Martin Vechev (ETH Zurich)
16:20 20m Talk	Reductive Analysis with Compiler-Guided Large Language Models for Input-Centric Code Optimizations (PLDI Research Papers, recorded). Xiangwei Wang (North Carolina State University), Xinning Hui (North Carolina State University), Chunhua Liao (Lawrence Livermore National Laboratory), Xipeng Shen (North Carolina State University)
16:40 20m Talk	Scalable, Validated Code Translation of Entire Projects using Large Language Models (PLDI Research Papers). Hanliang Zhang (University of Bristol), Cristina David (University of Bristol), Meng Wang (University of Bristol), Brandon Paulsen (Amazon), Daniel Kroening (Amazon)
17:00 20m Talk	Guided Tensor Lifting (PLDI Research Papers). Yixuan Li (University of Edinburgh), José Wesley De Souza Magalhães (University of Edinburgh), Alexander Brauckmann (University of Edinburgh), Michael F. P. O'Boyle (University of Edinburgh), Elizabeth Polgreen (University of Edinburgh)