Low-Overhead General-Purpose Near-Data Processing in CXL Memory Expanders
The paper presents Memory-Mapped NDP (M²NDP), a low-overhead, general-purpose near-data processing architecture for CXL memory. It targets memory-bound applications, aiming to cut latency and energy consumption by processing data inside the memory expander, and reports significant speedups and energy savings over conventional CPU/GPU systems.
- CXL enables cost-efficient memory expansion beyond local DRAM, but frequent accesses to CXL memory can slow down memory-bound applications.
- The proposed M²NDP architecture combines memory-mapped functions with lightweight μthreads for efficient processing.
- M²NDP achieves speedups of up to 128x and energy reductions of up to 87.9% compared to baseline systems.
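To put the headline figures in the same units: a 128x speedup corresponds to about 0.78% of the baseline runtime, and an 87.9% energy reduction to roughly 8.3 times less energy. The sketch below only restates that arithmetic. The helper names are hypothetical, and because both figures are "up to" maxima, they need not come from the same workload.

```python
def runtime_fraction(speedup: float) -> float:
    """Fraction of baseline runtime that remains after a given speedup."""
    return 1.0 / speedup


def energy_ratio(reduction_percent: float) -> float:
    """Convert an energy reduction in percent into a baseline/new energy ratio."""
    remaining = 1.0 - reduction_percent / 100.0
    return 1.0 / remaining


if __name__ == "__main__":
    # Figures quoted in the summary above (best-case values).
    print(f"runtime remaining: {runtime_fraction(128.0):.2%}")
    print(f"energy ratio: {energy_ratio(87.9):.2f}x")
```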
Opening excerpt (first ~120 words)
Computer Science > Hardware Architecture
arXiv:2404.19381 (cs)
Submitted on 30 Apr 2024 (v1), last revised 23 Sep 2024 (this version, v3)
Title: Low-overhead General-purpose Near-Data Processing in CXL Memory Expanders
Authors: Hyungkyu Ham, Jeongmin Hong, Geonwoo Park, Yunseon Shin, Okkyun Woo, Wonhyuk Yang, Jinhoon Bae, Eunhyeok Park, Hyojin Sung, Euicheol Lim, Gwangsun Kim
Abstract: Emerging Compute Express Link (CXL) enables cost-efficient memory expansion beyond the local DRAM of processors.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv.org.