PLDI 2025
Mon 16 - Fri 20 June 2025 Seoul, South Korea
Mon 16 Jun 2025 16:20 - 16:40 at Cosmos - Session 4 Chair(s): Kazem Cheshmi

As Artificial Intelligence (AI) algorithms grow increasingly complex, sparse computing plays a crucial role in their evolution, because sparsity is a key technique for compressing neural network models and reducing computational workload. Furthermore, generative algorithms such as Large Language Models (LLMs) are ushering AI into the 2.0 era, and the immense computational cost of LLMs makes it even more critical to exploit sparsity to reduce the workload. This talk, centered on sparse computing and hardware-software co-design, presents a series of works from both the system level and the hardware level, targeting applications such as sparse Graph Neural Networks (GNNs), LLMs, and multimodal large models. At the system level, we first introduce sparse kernel optimization strategies on GPU systems and present an open-source sparse kernel library, dgSPARSE, which outperforms commercial libraries across a range of GNN models and sparse operators. At the hardware level, we implement FlightLLM, an efficient large-model inference solution on FPGAs, achieving a 6.0x improvement in energy efficiency over V100S GPUs built on a comparable process node. For video generation tasks, by exploiting the inter-frame and intra-frame correlations unique to video, we propose an architecture based on spatial-temporal sparsification and mixed-precision computation; implemented on an AMD V80 FPGA, it achieves a 1.3x speedup over a GPU implementation even though the GPU has more than 21x the peak compute throughput.
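
For readers unfamiliar with the sparse operators discussed at the system level, the sketch below is a minimal, generic CUDA kernel for sparse-dense matrix multiplication (SpMM) over a CSR matrix, the core operator behind GNN neighborhood aggregation. It is only an illustration of the kind of kernel such libraries optimize; it is not the dgSPARSE API, and all names and signatures here are hypothetical.

// Illustrative CSR SpMM sketch: C = A * B, where A (num_rows x K) is sparse in CSR form
// and B (K x num_cols_b), C (num_rows x num_cols_b) are dense, row-major.
// Hypothetical example code, not taken from dgSPARSE.
__global__ void csr_spmm(int num_rows, int num_cols_b,
                         const int *row_ptr, const int *col_idx, const float *vals,
                         const float *B, float *C) {
    int row = blockIdx.x * blockDim.y + threadIdx.y;   // one sparse row per thread row
    int col = blockIdx.y * blockDim.x + threadIdx.x;   // one dense output column per thread
    if (row >= num_rows || col >= num_cols_b) return;

    float acc = 0.0f;
    for (int k = row_ptr[row]; k < row_ptr[row + 1]; ++k) {
        // Gather along the nonzeros of row 'row' of A and accumulate into C[row][col].
        acc += vals[k] * B[col_idx[k] * num_cols_b + col];
    }
    C[row * num_cols_b + col] = acc;
}

Assigning adjacent threads to adjacent output columns keeps accesses to B and C coalesced; the optimization strategies presented in the talk (e.g., load balancing across rows with very different nonzero counts) go well beyond this naive mapping.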
