CRGC: Fault-Recovering Actor Garbage Collection in Pekko
This program is tentative and subject to change.
Actors are lightweight reactive processes that communicate by asynchronous message-passing. Actors address common problems like concurrency control and fault tolerance, but resource management remains challenging: in all four of the most popular actor frameworks (Pekko, Akka, Erlang, and Elixir) programmers must explicitly kill actors to free up resources. To simplify resource management, researchers have devised actor garbage collectors (actor GCs) that monitor the application and detect when actors are safe to kill. However, existing actor GCs are impractical for distributed systems where the network is unreliable and nodes can fail. The simplest actor GCs do not collect cyclic garbage, whereas more sophisticated actor GCs are not fault-recovering: dropped messages and crashed nodes can cause actors to become garbage that never gets collected.
We present Conflict-free Replicated Garbage Collection (CRGC): the first fault-recovering cyclic actor GC. In CRGC, actors and nodes record information locally and broadcast updates to the garbage collectors running on each node. CRGC does not require locks, explicit memory barriers, or any assumptions about message delivery order, except for reliable FIFO channels from actors to their local garbage collector. Moreover, CRGC is simple: we concisely present its operational semantics, which has been formalized in TLA+, and prove both soundness (non-garbage actors are never killed) and completeness (all garbage actors are eventually killed, under reasonable assumptions). We also present a preliminary implementation in Apache Pekko and measure its performance using two actor benchmark suites. Our results show the performance overhead of CRGC is competitive with simpler approaches like weighted reference counting, while also being much more powerful.
Fri 20 Jun (displayed time zone: Seoul)
10:30 - 12:10
10:30 | 20m | Talk | Verifying General-Purpose RCU for Reclamation in Relaxed Memory Separation Logic | PLDI Research Papers | Jaehwang Jung (Rebellions Inc), Sunho Park (KAIST), Janggun Lee (KAIST), Jeho Yeon (KAIST), Jeehoon Kang (KAIST)
10:50 | 20m | Talk | Leveraging Immutability to Validate Hazard Pointers for Optimistic Traversals | PLDI Research Papers
11:10 | 20m | Talk | Iso: Request-Private Garbage Collection | PLDI Research Papers | Tianle Qiu (Australian National University), Stephen M. Blackburn (Google; Australian National University)
11:30 | 20m | Talk | CRGC: Fault-Recovering Actor Garbage Collection in Pekko | PLDI Research Papers | Dan Plyukhin (University of Southern Denmark), Gul Agha (University of Illinois at Urbana-Champaign), Fabrizio Montesi (University of Southern Denmark)
11:50 | 20m | Talk | RRR-SMR: Reduce, Reuse, Recycle: Better Methods for Practical Lock-Free Data Structures | PLDI Research Papers