This program is tentative and subject to change.
Large language models (LLMs) have achieved notable success in code generation. However, they still frequently produce uncompilable output because their next-token inference procedure does not model formal aspects of code. Although constrained decoding is a promising approach to alleviate this issue, it has so far been applied only to domain-specific languages or to the syntactic features of general-purpose programming languages. Yet LLMs frequently generate code with typing errors, which lie beyond the domain of syntax and are generally hard to constrain adequately. To address this challenge, we introduce a type-constrained decoding approach that leverages type systems to guide code generation. We develop novel prefix automata for this purpose and introduce a sound approach to enforce well-typedness based on type inference and a search over inhabitable types. We formalize our approach on a foundational simply-typed language and extend it to TypeScript to demonstrate practicality. Our evaluation on the HumanEval and MBPP datasets shows that our approach reduces compilation errors by more than half and significantly increases functional correctness in code synthesis, translation, and repair tasks across LLMs of various sizes and model families, including state-of-the-art open-weight models with more than 30B parameters. These results demonstrate the generality and effectiveness of our approach in constraining LLM code generation with the formal rules of type systems.
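The core idea behind constrained decoding can be sketched as follows (a hypothetical Python toy, not the paper's implementation): at each decoding step, candidate next tokens are filtered so that the generated prefix always remains extendable to a valid program. Here a simple balanced-parenthesis check stands in for the paper's type-based prefix automaton; the names `is_valid_prefix` and `constrained_step` are illustrative assumptions, not from the paper.

```python
def is_valid_prefix(tokens):
    """Toy stand-in for a prefix automaton: a token sequence is a valid
    prefix iff its parentheses never close more than they open."""
    depth = 0
    for t in tokens:
        depth += t.count("(") - t.count(")")
        if depth < 0:
            return False
    return True

def constrained_step(prefix, candidates):
    """Mask out candidate tokens that would make the prefix invalid,
    mimicking how constrained decoding restricts the LLM's sampling."""
    return [c for c in candidates if is_valid_prefix(prefix + [c])]

# After generating "f ( x )", a closing paren would unbalance the prefix,
# so it is masked; "(" and "+" remain sampleable.
prefix = ["f", "(", "x", ")"]
allowed = constrained_step(prefix, ["(", ")", "+"])  # ["(", "+"]
```

The paper's approach replaces this toy validity check with type inference and a search over inhabitable types, so that every retained token keeps the prefix completable to a well-typed program.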
Wed 18 Jun (displayed time zone: Seoul)
16:00 - 17:20 | Machine Learning (PLDI Research Papers) at Cosmos, Violet & Tulip. Chair(s): Feras Saad (Carnegie Mellon University)

16:00 (20m, Talk) | Type-Constrained Code Generation with Language Models. PLDI Research Papers. Niels Mündler (ETH Zurich), Jingxuan He (University of California at Berkeley), Hao Wang (University of California at Berkeley), Koushik Sen (University of California at Berkeley), Dawn Song (University of California at Berkeley), Martin Vechev (ETH Zurich). DOI | Pre-print

16:20 (20m, Talk) | Reductive Analysis with Compiler-Guided Large Language Models for Input-Centric Code Optimizations (Recorded). PLDI Research Papers. Xiangwei Wang (North Carolina State University), Xinning Hui (North Carolina State University), Chunhua Liao (Lawrence Livermore National Laboratory), Xipeng Shen (North Carolina State University). DOI

16:40 (20m, Talk) | Scalable, Validated Code Translation of Entire Projects using Large Language Models. PLDI Research Papers. Hanliang Zhang (University of Bristol), Cristina David (University of Bristol), Meng Wang (University of Bristol), Brandon Paulsen (Amazon), Daniel Kroening (Amazon). DOI

17:00 (20m, Talk) | Guided Tensor Lifting. PLDI Research Papers. Yixuan Li (University of Edinburgh), José Wesley De Souza Magalhães (University of Edinburgh), Alexander Brauckmann (University of Edinburgh), Michael F. P. O'Boyle (University of Edinburgh), Elizabeth Polgreen (University of Edinburgh). DOI