PLDI 2025
Mon 16 - Fri 20 June 2025 Seoul, South Korea

Actors are lightweight reactive processes that communicate by asynchronous message-passing. Programmers use actors to solve common problems like concurrency control and fault tolerance, but resource management remains challenging: in all four of the most popular actor frameworks (Pekko, Akka, Erlang, and Elixir) programmers are responsible for manually killing actors and freeing their resources. To simplify resource management, researchers devised actor garbage collectors (actor GCs) that monitor the system and detect when actors are safe to kill. However, actor GCs are not yet practical for distributed systems, where actors run on nodes that can fail. The simplest actor GCs do not collect cyclic garbage, whereas more complex actor GCs are not fault-recovering: they leave floating garbage when messages are dropped and when nodes crash.

We present Conflict-free Replicated Garbage Collection (CRGC): the first fault-recovering cyclic actor GC. In CRGC, actors and nodes record information locally and broadcast updates to the garbage collector process on each node. The approach does not require locks, barriers, or any assumptions about message delivery order, except for reliable FIFO channels from actors to their local garbage collector. CRGC is also simple: we formalize it with a TLA+ specification and prove it to be sound (live actors are never killed) and complete (garbage actors are eventually killed). To evaluate whether CRGC is practical, we develop a preliminary implementation in Apache Pekko and measure its performance using two popular actor benchmark suites. Our results show that CRGC's performance overhead is competitive with that of simpler approaches like weighted reference counting, even though CRGC also collects cyclic garbage and recovers from faults.