The growing adoption of AI applications has led to an increased demand for deploying neural networks on diverse device platforms. However, even modest networks now require specialized hardware for efficient execution due to their rising computational cost. To address this, distributed execution across connected, resource-constrained devices is gaining importance. While prior work relies on empirical models or supports only limited partitioning, we present ADaPS, a novel framework for distributing Convolutional Neural Network (CNN) inference workloads across heterogeneous embedded devices. Our analytical model partitions the height and width dimensions of 4D tensors and explores layer fusion opportunities, accounting for compute, memory, and communication constraints. ADaPS efficiently explores the vast partitioning space using a tree-based hybrid optimization algorithm combining Alpha-Beta pruning and dynamic programming. Evaluations on multiple CNNs and device configurations show that ADaPS improves average inference latency by up to 1.2x while significantly reducing data transfers compared to state-of-the-art methods.
Xinmiao Zhang (SKLP, Institute of Computing Technology, CAS), Zheng Feng (Institute of Computing Technology, CAS), Shengwen Liang (SKLP, Institute of Computing Technology, CAS), Xinyu Chen (Hong Kong University of Science and Technology), Lei Zhang (Institute of Computing Technology, CAS), Cheng Liu (Institute of Computing Technology, CAS)
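To make the height/width partitioning named in the abstract concrete, below is a minimal sketch, not the paper's implementation: it splits a 4D activation tensor along the height dimension into per-device slices, assuming an NCHW layout and a one-row halo of overlap so each device can apply a 3x3 convolution to its slice without further communication. The function name `partition_height` and its parameters are illustrative assumptions, not identifiers from ADaPS.

```python
import numpy as np

def partition_height(tensor, num_devices, halo=1):
    """Split a 4D NCHW tensor along the height axis into per-device
    slices, adding `halo` overlapping rows on each side so a local
    convolution (e.g., 3x3 with halo=1) needs no extra communication.
    Illustrative sketch only; ADaPS itself is an analytical framework."""
    n, c, h, w = tensor.shape
    # Evenly spaced row boundaries; a heterogeneity-aware scheme would
    # instead size each slice to the device's compute capacity.
    bounds = np.linspace(0, h, num_devices + 1, dtype=int)
    slices = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        lo = max(start - halo, 0)   # extend upward for the halo rows
        hi = min(end + halo, h)     # extend downward for the halo rows
        slices.append(tensor[:, :, lo:hi, :])
    return slices

# Example: a 1x3x32x32 activation split across 4 devices.
x = np.random.rand(1, 3, 32, 32).astype(np.float32)
parts = partition_height(x, num_devices=4, halo=1)
print([p.shape for p in parts])  # slice heights 9/10/10/9 including halos
```

Partitioning the width dimension is symmetric (slice axis 3 instead of axis 2), and the halo rows exchanged between neighboring slices are an example of the communication cost that a partitioning framework must weigh against compute balance.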