Towards Efficient Load Balancing BFS on GPUs: One Code for AMD, Intel & Nvidia

K. Olgu, T. Kenter, J. Nunez-Yanez, S. McIntosh-Smith, T. Deakin, in: Proceedings of the SC ’25 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, ACM, 2025.

Conference Paper | Published | English
Author
Olgu, Kaan; Kenter, Tobias; Nunez-Yanez, Jose; McIntosh-Smith, Simon; Deakin, Tom
Abstract
Efficient graph processing is essential for a wide range of applications. Scalability and memory access patterns are still a challenge, especially with the Breadth-First Search algorithm. This work focuses on leveraging HPC systems with multiple GPUs available in a single node with peer-to-peer functionality of the Intel oneAPI implementation of SYCL. We propose three GPU-based load-balancing methods: work-group localisation for efficient data access, even workload distribution for higher GPU occupancy, and a hybrid strided-access approach for heuristic balancing. These methods ensure performance, portability, and productivity with a unified codebase. Our proposed methodologies outperform state-of-the-art single-GPU implementations based on CUDA on synthetic RMAT graphs. We analysed BFS performance across NVIDIA A100, Intel Max 1550, and AMD MI300X GPUs, achieving a peak performance of 153.27 GTEPS on an RMAT25-64 graph using 8 GPUs on the NVIDIA A100. Furthermore, our work demonstrates the capability to handle RMAT graphs up to scale 29, achieving superior performance on synthetic graphs and competitive results on real-world datasets.
Publishing Year
2025
Proceedings Title
Proceedings of the SC '25 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis
LibreCat-ID

Cite this

Olgu K, Kenter T, Nunez-Yanez J, McIntosh-Smith S, Deakin T. Towards Efficient Load Balancing BFS on GPUs: One Code for AMD, Intel & Nvidia. In: Proceedings of the SC ’25 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis. ACM; 2025. doi:10.1145/3731599.3767570
Olgu, K., Kenter, T., Nunez-Yanez, J., McIntosh-Smith, S., & Deakin, T. (2025). Towards Efficient Load Balancing BFS on GPUs: One Code for AMD, Intel & Nvidia. Proceedings of the SC ’25 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis. https://doi.org/10.1145/3731599.3767570
@inproceedings{Olgu_Kenter_Nunez-Yanez_McIntosh-Smith_Deakin_2025,
  title={Towards Efficient Load Balancing BFS on GPUs: One Code for AMD, Intel \& Nvidia},
  DOI={10.1145/3731599.3767570},
  booktitle={Proceedings of the SC ’25 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis},
  publisher={ACM},
  author={Olgu, Kaan and Kenter, Tobias and Nunez-Yanez, Jose and McIntosh-Smith, Simon and Deakin, Tom},
  year={2025}
}
Olgu, Kaan, Tobias Kenter, Jose Nunez-Yanez, Simon McIntosh-Smith, and Tom Deakin. “Towards Efficient Load Balancing BFS on GPUs: One Code for AMD, Intel & Nvidia.” In Proceedings of the SC ’25 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 2025. https://doi.org/10.1145/3731599.3767570.
K. Olgu, T. Kenter, J. Nunez-Yanez, S. McIntosh-Smith, and T. Deakin, “Towards Efficient Load Balancing BFS on GPUs: One Code for AMD, Intel & Nvidia,” 2025, doi: 10.1145/3731599.3767570.
Olgu, Kaan, et al. “Towards Efficient Load Balancing BFS on GPUs: One Code for AMD, Intel & Nvidia.” Proceedings of the SC ’25 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, ACM, 2025, doi:10.1145/3731599.3767570.