---
_id: '46120'
abstract:
- lang: eng
text: The rise of exascale supercomputers has fueled competition among GPU vendors,
driving lattice QCD developers to write code that supports multiple APIs. Moreover,
new developments in algorithms and physics research require frequent updates to
existing software. These challenges have to be balanced against constantly changing
personnel. At the same time, there is a wide range of applications for HISQ fermions
in QCD studies. This situation encourages the development of software featuring
a HISQ action that is flexible, high-performing, open source, easy to use, and
easy to adapt. In this technical paper, we explain the design strategy, provide
implementation details, list available algorithms and modules, and show key performance
indicators for SIMULATeQCD, a simple multi-GPU lattice code for large-scale QCD
calculations, mainly developed and used by the HotQCD collaboration. The code
is publicly available on GitHub.
author:
- first_name: Lukas
full_name: Mazur, Lukas
id: '90492'
last_name: Mazur
orcid: 0000-0001-6304-7082
- first_name: Dennis
full_name: Bollweg, Dennis
last_name: Bollweg
- first_name: David A.
full_name: Clarke, David A.
last_name: Clarke
- first_name: Luis
full_name: Altenkort, Luis
last_name: Altenkort
- first_name: Olaf
full_name: Kaczmarek, Olaf
last_name: Kaczmarek
- first_name: Rasmus
full_name: Larsen, Rasmus
last_name: Larsen
- first_name: Hai-Tao
full_name: Shu, Hai-Tao
last_name: Shu
- first_name: Jishnu
full_name: Goswami, Jishnu
last_name: Goswami
- first_name: Philipp
full_name: Scior, Philipp
last_name: Scior
- first_name: Hauke
full_name: Sandmeyer, Hauke
last_name: Sandmeyer
- first_name: Marius
full_name: Neumann, Marius
last_name: Neumann
- first_name: Henrik
full_name: Dick, Henrik
last_name: Dick
- first_name: Sajid
full_name: Ali, Sajid
last_name: Ali
- first_name: Jangho
full_name: Kim, Jangho
last_name: Kim
- first_name: Christian
full_name: Schmidt, Christian
last_name: Schmidt
- first_name: Peter
full_name: Petreczky, Peter
last_name: Petreczky
- first_name: Swagato
full_name: Mukherjee, Swagato
last_name: Mukherjee
citation:
ama: 'Mazur L, Bollweg D, Clarke DA, et al. SIMULATeQCD: A simple multi-GPU lattice
code for QCD calculations. Computer Physics Communications. Published online
2023. doi:10.48550/ARXIV.2306.01098'
apa: 'Mazur, L., Bollweg, D., Clarke, D. A., Altenkort, L., Kaczmarek, O., Larsen,
R., Shu, H.-T., Goswami, J., Scior, P., Sandmeyer, H., Neumann, M., Dick, H.,
Ali, S., Kim, J., Schmidt, C., Petreczky, P., & Mukherjee, S. (2023). SIMULATeQCD:
A simple multi-GPU lattice code for QCD calculations. Computer Physics Communications.
https://doi.org/10.48550/ARXIV.2306.01098'
bibtex: '@article{Mazur_Bollweg_Clarke_Altenkort_Kaczmarek_Larsen_Shu_Goswami_Scior_Sandmeyer_et
al._2023, title={SIMULATeQCD: A simple multi-GPU lattice code for QCD calculations},
DOI={10.48550/ARXIV.2306.01098},
journal={Computer Physics Communications}, author={Mazur, Lukas and Bollweg, Dennis
and Clarke, David A. and Altenkort, Luis and Kaczmarek, Olaf and Larsen, Rasmus
and Shu, Hai-Tao and Goswami, Jishnu and Scior, Philipp and Sandmeyer, Hauke and
et al.}, year={2023} }'
chicago: 'Mazur, Lukas, Dennis Bollweg, David A. Clarke, Luis Altenkort, Olaf Kaczmarek,
Rasmus Larsen, Hai-Tao Shu, et al. “SIMULATeQCD: A Simple Multi-GPU Lattice Code
for QCD Calculations.” Computer Physics Communications, 2023. https://doi.org/10.48550/ARXIV.2306.01098.'
ieee: 'L. Mazur et al., “SIMULATeQCD: A simple multi-GPU lattice code for
QCD calculations,” Computer Physics Communications, 2023, doi: 10.48550/ARXIV.2306.01098.'
mla: 'Mazur, Lukas, et al. “SIMULATeQCD: A Simple Multi-GPU Lattice Code for QCD
Calculations.” Computer Physics Communications, 2023, doi:10.48550/ARXIV.2306.01098.'
short: L. Mazur, D. Bollweg, D.A. Clarke, L. Altenkort, O. Kaczmarek, R. Larsen,
H.-T. Shu, J. Goswami, P. Scior, H. Sandmeyer, M. Neumann, H. Dick, S. Ali, J.
Kim, C. Schmidt, P. Petreczky, S. Mukherjee, Computer Physics Communications (2023).
date_created: 2023-07-24T10:55:25Z
date_updated: 2023-07-26T09:21:35Z
department:
- _id: '27'
doi: 10.48550/ARXIV.2306.01098
language:
- iso: eng
publication: Computer Physics Communications
status: public
title: 'SIMULATeQCD: A simple multi-GPU lattice code for QCD calculations'
type: journal_article
user_id: '90492'
year: '2023'
...
---
_id: '46119'
article_number: '014503'
author:
- first_name: Luis
full_name: Altenkort, Luis
last_name: Altenkort
- first_name: Alexander M.
full_name: Eller, Alexander M.
last_name: Eller
- first_name: Anthony
full_name: Francis, Anthony
last_name: Francis
- first_name: Olaf
full_name: Kaczmarek, Olaf
last_name: Kaczmarek
- first_name: Lukas
full_name: Mazur, Lukas
id: '90492'
last_name: Mazur
orcid: 0000-0001-6304-7082
- first_name: Guy D.
full_name: Moore, Guy D.
last_name: Moore
- first_name: Hai-Tao
full_name: Shu, Hai-Tao
last_name: Shu
citation:
ama: Altenkort L, Eller AM, Francis A, et al. Viscosity of pure-glue QCD from the
lattice. Physical Review D. 2023;108(1). doi:10.1103/physrevd.108.014503
apa: Altenkort, L., Eller, A. M., Francis, A., Kaczmarek, O., Mazur, L., Moore,
G. D., & Shu, H.-T. (2023). Viscosity of pure-glue QCD from the lattice. Physical
Review D, 108(1), Article 014503. https://doi.org/10.1103/physrevd.108.014503
bibtex: '@article{Altenkort_Eller_Francis_Kaczmarek_Mazur_Moore_Shu_2023, title={Viscosity
of pure-glue QCD from the lattice}, volume={108}, DOI={10.1103/physrevd.108.014503},
number={1}, journal={Physical Review D}, publisher={American Physical Society
(APS)}, author={Altenkort, Luis and Eller, Alexander M. and Francis, Anthony and
Kaczmarek, Olaf and Mazur, Lukas and Moore, Guy D. and Shu, Hai-Tao}, year={2023}
}'
chicago: Altenkort, Luis, Alexander M. Eller, Anthony Francis, Olaf Kaczmarek, Lukas
Mazur, Guy D. Moore, and Hai-Tao Shu. “Viscosity of Pure-Glue QCD from the Lattice.”
Physical Review D 108, no. 1 (2023). https://doi.org/10.1103/physrevd.108.014503.
ieee: 'L. Altenkort et al., “Viscosity of pure-glue QCD from the lattice,”
Physical Review D, vol. 108, no. 1, Art. no. 014503, 2023, doi: 10.1103/physrevd.108.014503.'
mla: Altenkort, Luis, et al. “Viscosity of Pure-Glue QCD from the Lattice.” Physical
Review D, vol. 108, no. 1, 014503, American Physical Society (APS), 2023,
doi:10.1103/physrevd.108.014503.
short: L. Altenkort, A.M. Eller, A. Francis, O. Kaczmarek, L. Mazur, G.D. Moore,
H.-T. Shu, Physical Review D 108 (2023).
date_created: 2023-07-24T10:54:18Z
date_updated: 2023-07-26T09:23:32Z
department:
- _id: '27'
doi: 10.1103/physrevd.108.014503
intvolume: ' 108'
issue: '1'
language:
- iso: eng
publication: Physical Review D
publication_identifier:
issn:
- 2470-0010
- 2470-0029
publication_status: published
publisher: American Physical Society (APS)
quality_controlled: '1'
status: public
title: Viscosity of pure-glue QCD from the lattice
type: journal_article
user_id: '90492'
volume: 108
year: '2023'
...
---
_id: '38041'
abstract:
- lang: eng
text: "While FPGA accelerator boards and their respective high-level design
tools are maturing, there is still a lack of multi-FPGA applications, libraries,
and not least, benchmarks and reference implementations towards sustained HPC
usage of these devices. As in the early days of GPUs in HPC, for workloads that
can reasonably be decoupled into loosely coupled working sets, multi-accelerator
support can be achieved by using standard communication interfaces like MPI on
the host side. However, for performance and productivity, some applications can
profit from a tighter coupling of the accelerators. FPGAs offer unique opportunities
here when extending the dataflow characteristics to their communication interfaces.\r\nIn
this work, we extend the HPCC FPGA benchmark suite by multi-FPGA
support and three missing benchmarks that particularly characterize or stress
inter-device communication: b_eff, PTRANS, and LINPACK. With all benchmarks implemented
for current boards with Intel and Xilinx FPGAs, we established a baseline for
multi-FPGA performance. Additionally, for the communication-centric benchmarks,
we explored the potential of direct FPGA-to-FPGA communication with a circuit-switched
inter-FPGA network that is currently only available for one of the boards. The
evaluation with parallel execution on up to 26 FPGA boards makes use of one of
the largest academic FPGA installations."
author:
- first_name: Marius
full_name: Meyer, Marius
id: '40778'
last_name: Meyer
- first_name: Tobias
full_name: Kenter, Tobias
id: '3145'
last_name: Kenter
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
citation:
ama: Meyer M, Kenter T, Plessl C. Multi-FPGA Designs and Scaling of HPC Challenge
Benchmarks via MPI and Circuit-Switched Inter-FPGA Networks. ACM Transactions
on Reconfigurable Technology and Systems. Published online 2023. doi:10.1145/3576200
apa: Meyer, M., Kenter, T., & Plessl, C. (2023). Multi-FPGA Designs and Scaling
of HPC Challenge Benchmarks via MPI and Circuit-Switched Inter-FPGA Networks.
ACM Transactions on Reconfigurable Technology and Systems. https://doi.org/10.1145/3576200
bibtex: '@article{Meyer_Kenter_Plessl_2023, title={Multi-FPGA Designs and Scaling
of HPC Challenge Benchmarks via MPI and Circuit-Switched Inter-FPGA Networks},
DOI={10.1145/3576200}, journal={ACM
Transactions on Reconfigurable Technology and Systems}, publisher={Association
for Computing Machinery (ACM)}, author={Meyer, Marius and Kenter, Tobias and Plessl,
Christian}, year={2023} }'
chicago: Meyer, Marius, Tobias Kenter, and Christian Plessl. “Multi-FPGA Designs
and Scaling of HPC Challenge Benchmarks via MPI and Circuit-Switched Inter-FPGA
Networks.” ACM Transactions on Reconfigurable Technology and Systems, 2023.
https://doi.org/10.1145/3576200.
ieee: 'M. Meyer, T. Kenter, and C. Plessl, “Multi-FPGA Designs and Scaling of HPC
Challenge Benchmarks via MPI and Circuit-Switched Inter-FPGA Networks,” ACM
Transactions on Reconfigurable Technology and Systems, 2023, doi: 10.1145/3576200.'
mla: Meyer, Marius, et al. “Multi-FPGA Designs and Scaling of HPC Challenge Benchmarks
via MPI and Circuit-Switched Inter-FPGA Networks.” ACM Transactions on Reconfigurable
Technology and Systems, Association for Computing Machinery (ACM), 2023, doi:10.1145/3576200.
short: M. Meyer, T. Kenter, C. Plessl, ACM Transactions on Reconfigurable Technology
and Systems (2023).
date_created: 2023-01-23T08:40:42Z
date_updated: 2023-07-28T08:02:05Z
department:
- _id: '27'
- _id: '518'
doi: 10.1145/3576200
keyword:
- General Computer Science
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://dl.acm.org/doi/10.1145/3576200
oa: '1'
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
- _id: '4'
name: 'SFB 901 - C: SFB 901 - Project Area C'
- _id: '1'
grant_number: '160364472'
name: 'SFB 901: SFB 901'
- _id: '14'
grant_number: '160364472'
name: 'SFB 901 - C2: SFB 901 - Subproject C2'
publication: ACM Transactions on Reconfigurable Technology and Systems
publication_identifier:
issn:
- 1936-7406
- 1936-7414
publication_status: published
publisher: Association for Computing Machinery (ACM)
quality_controlled: '1'
status: public
title: Multi-FPGA Designs and Scaling of HPC Challenge Benchmarks via MPI and Circuit-Switched
Inter-FPGA Networks
type: journal_article
user_id: '24135'
year: '2023'
...
---
_id: '45893'
author:
- first_name: Tim
full_name: Hansmeier, Tim
id: '49992'
last_name: Hansmeier
orcid: 0000-0003-1377-3339
- first_name: Tobias
full_name: Kenter, Tobias
id: '3145'
last_name: Kenter
- first_name: Marius
full_name: Meyer, Marius
id: '40778'
last_name: Meyer
- first_name: Heinrich
full_name: Riebler, Heinrich
id: '8961'
last_name: Riebler
- first_name: Marco
full_name: Platzner, Marco
id: '398'
last_name: Platzner
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
citation:
ama: 'Hansmeier T, Kenter T, Meyer M, Riebler H, Platzner M, Plessl C. Compute Centers
I: Heterogeneous Execution Environments. In: Haake C-J, Meyer auf der Heide F,
Platzner M, Wachsmuth H, Wehrheim H, eds. On-The-Fly Computing -- Individualized
IT-Services in Dynamic Markets. Vol 412. Verlagsschriftenreihe des Heinz Nixdorf
Instituts. Heinz Nixdorf Institut, Universität Paderborn; 2023:165-182. doi:10.5281/zenodo.8068642'
apa: 'Hansmeier, T., Kenter, T., Meyer, M., Riebler, H., Platzner, M., & Plessl,
C. (2023). Compute Centers I: Heterogeneous Execution Environments. In C.-J. Haake,
F. Meyer auf der Heide, M. Platzner, H. Wachsmuth, & H. Wehrheim (Eds.), On-The-Fly
Computing -- Individualized IT-services in dynamic markets (Vol. 412, pp.
165–182). Heinz Nixdorf Institut, Universität Paderborn. https://doi.org/10.5281/zenodo.8068642'
bibtex: '@inbook{Hansmeier_Kenter_Meyer_Riebler_Platzner_Plessl_2023, place={Paderborn},
series={Verlagsschriftenreihe des Heinz Nixdorf Instituts}, title={Compute Centers
I: Heterogeneous Execution Environments}, volume={412}, DOI={10.5281/zenodo.8068642},
booktitle={On-The-Fly Computing -- Individualized IT-services in dynamic markets},
publisher={Heinz Nixdorf Institut, Universität Paderborn}, author={Hansmeier,
Tim and Kenter, Tobias and Meyer, Marius and Riebler, Heinrich and Platzner, Marco
and Plessl, Christian}, editor={Haake, Claus-Jochen and Meyer auf der Heide, Friedhelm
and Platzner, Marco and Wachsmuth, Henning and Wehrheim, Heike}, year={2023},
pages={165–182}, collection={Verlagsschriftenreihe des Heinz Nixdorf Instituts}
}'
chicago: 'Hansmeier, Tim, Tobias Kenter, Marius Meyer, Heinrich Riebler, Marco Platzner,
and Christian Plessl. “Compute Centers I: Heterogeneous Execution Environments.”
In On-The-Fly Computing -- Individualized IT-Services in Dynamic Markets,
edited by Claus-Jochen Haake, Friedhelm Meyer auf der Heide, Marco Platzner, Henning
Wachsmuth, and Heike Wehrheim, 412:165–82. Verlagsschriftenreihe Des Heinz Nixdorf
Instituts. Paderborn: Heinz Nixdorf Institut, Universität Paderborn, 2023. https://doi.org/10.5281/zenodo.8068642.'
ieee: 'T. Hansmeier, T. Kenter, M. Meyer, H. Riebler, M. Platzner, and C. Plessl,
“Compute Centers I: Heterogeneous Execution Environments,” in On-The-Fly Computing
-- Individualized IT-services in dynamic markets, vol. 412, C.-J. Haake, F.
Meyer auf der Heide, M. Platzner, H. Wachsmuth, and H. Wehrheim, Eds. Paderborn:
Heinz Nixdorf Institut, Universität Paderborn, 2023, pp. 165–182.'
mla: 'Hansmeier, Tim, et al. “Compute Centers I: Heterogeneous Execution Environments.”
On-The-Fly Computing -- Individualized IT-Services in Dynamic Markets,
edited by Claus-Jochen Haake et al., vol. 412, Heinz Nixdorf Institut, Universität
Paderborn, 2023, pp. 165–82, doi:10.5281/zenodo.8068642.'
short: 'T. Hansmeier, T. Kenter, M. Meyer, H. Riebler, M. Platzner, C. Plessl, in:
C.-J. Haake, F. Meyer auf der Heide, M. Platzner, H. Wachsmuth, H. Wehrheim (Eds.),
On-The-Fly Computing -- Individualized IT-Services in Dynamic Markets, Heinz Nixdorf
Institut, Universität Paderborn, Paderborn, 2023, pp. 165–182.'
date_created: 2023-07-07T08:15:45Z
date_updated: 2023-07-28T09:38:14Z
ddc:
- '004'
department:
- _id: '7'
- _id: '27'
- _id: '518'
doi: 10.5281/zenodo.8068642
editor:
- first_name: Claus-Jochen
full_name: Haake, Claus-Jochen
last_name: Haake
- first_name: Friedhelm
full_name: Meyer auf der Heide, Friedhelm
last_name: Meyer auf der Heide
- first_name: Marco
full_name: Platzner, Marco
last_name: Platzner
- first_name: Henning
full_name: Wachsmuth, Henning
last_name: Wachsmuth
- first_name: Heike
full_name: Wehrheim, Heike
last_name: Wehrheim
file:
- access_level: open_access
content_type: application/pdf
creator: florida
date_created: 2023-07-07T08:15:35Z
date_updated: 2023-07-07T11:17:33Z
file_id: '45894'
file_name: C2-Chapter-SFB-Buch-Final.pdf
file_size: 2288788
relation: main_file
file_date_updated: 2023-07-07T11:17:33Z
has_accepted_license: '1'
intvolume: ' 412'
language:
- iso: eng
oa: '1'
page: 165-182
place: Paderborn
project:
- _id: '1'
grant_number: '160364472'
name: 'SFB 901: SFB 901: On-The-Fly Computing - Individualisierte IT-Dienstleistungen
in dynamischen Märkten'
- _id: '4'
name: 'SFB 901 - C: SFB 901 - Project Area C'
- _id: '14'
grant_number: '160364472'
name: 'SFB 901 - C2: SFB 901 - On-The-Fly Compute Centers I: Heterogene Ausführungsumgebungen
(Subproject C2)'
publication: On-The-Fly Computing -- Individualized IT-services in dynamic markets
publisher: Heinz Nixdorf Institut, Universität Paderborn
series_title: Verlagsschriftenreihe des Heinz Nixdorf Instituts
status: public
title: 'Compute Centers I: Heterogeneous Execution Environments'
type: book_chapter
user_id: '3145'
volume: 412
year: '2023'
...
---
_id: '46190'
author:
- first_name: Jan-Oliver
full_name: Opdenhövel, Jan-Oliver
last_name: Opdenhövel
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
- first_name: Tobias
full_name: Kenter, Tobias
id: '3145'
last_name: Kenter
citation:
ama: 'Opdenhövel J-O, Plessl C, Kenter T. Mutation Tree Reconstruction of Tumor
Cells on FPGAs Using a Bit-Level Matrix Representation. In: Proceedings of
the 13th International Symposium on Highly Efficient Accelerators and Reconfigurable
Technologies. ACM; 2023. doi:10.1145/3597031.3597050'
apa: Opdenhövel, J.-O., Plessl, C., & Kenter, T. (2023). Mutation Tree Reconstruction
of Tumor Cells on FPGAs Using a Bit-Level Matrix Representation. Proceedings
of the 13th International Symposium on Highly Efficient Accelerators and Reconfigurable
Technologies. https://doi.org/10.1145/3597031.3597050
bibtex: '@inproceedings{Opdenhövel_Plessl_Kenter_2023, title={Mutation Tree Reconstruction
of Tumor Cells on FPGAs Using a Bit-Level Matrix Representation}, DOI={10.1145/3597031.3597050},
booktitle={Proceedings of the 13th International Symposium on Highly Efficient
Accelerators and Reconfigurable Technologies}, publisher={ACM}, author={Opdenhövel,
Jan-Oliver and Plessl, Christian and Kenter, Tobias}, year={2023} }'
chicago: Opdenhövel, Jan-Oliver, Christian Plessl, and Tobias Kenter. “Mutation
Tree Reconstruction of Tumor Cells on FPGAs Using a Bit-Level Matrix Representation.”
In Proceedings of the 13th International Symposium on Highly Efficient Accelerators
and Reconfigurable Technologies. ACM, 2023. https://doi.org/10.1145/3597031.3597050.
ieee: 'J.-O. Opdenhövel, C. Plessl, and T. Kenter, “Mutation Tree Reconstruction
of Tumor Cells on FPGAs Using a Bit-Level Matrix Representation,” 2023, doi: 10.1145/3597031.3597050.'
mla: Opdenhövel, Jan-Oliver, et al. “Mutation Tree Reconstruction of Tumor Cells
on FPGAs Using a Bit-Level Matrix Representation.” Proceedings of the 13th
International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies,
ACM, 2023, doi:10.1145/3597031.3597050.
short: 'J.-O. Opdenhövel, C. Plessl, T. Kenter, in: Proceedings of the 13th International
Symposium on Highly Efficient Accelerators and Reconfigurable Technologies, ACM,
2023.'
date_created: 2023-07-28T09:49:23Z
date_updated: 2023-07-28T09:58:06Z
department:
- _id: '27'
- _id: '518'
doi: 10.1145/3597031.3597050
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://dl.acm.org/doi/pdf/10.1145/3597031.3597050
oa: '1'
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: Proceedings of the 13th International Symposium on Highly Efficient Accelerators
and Reconfigurable Technologies
publication_status: published
publisher: ACM
quality_controlled: '1'
status: public
title: Mutation Tree Reconstruction of Tumor Cells on FPGAs Using a Bit-Level Matrix
Representation
type: conference
user_id: '3145'
year: '2023'
...
---
_id: '46188'
author:
- first_name: Jennifer
full_name: Faj, Jennifer
id: '78722'
last_name: Faj
- first_name: Tobias
full_name: Kenter, Tobias
id: '3145'
last_name: Kenter
- first_name: Sara
full_name: Faghih-Naini, Sara
last_name: Faghih-Naini
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
- first_name: Vadym
full_name: Aizinger, Vadym
last_name: Aizinger
citation:
ama: 'Faj J, Kenter T, Faghih-Naini S, Plessl C, Aizinger V. Scalable Multi-FPGA
Design of a Discontinuous Galerkin Shallow-Water Model on Unstructured Meshes.
In: Proceedings of the Platform for Advanced Scientific Computing Conference.
ACM; 2023. doi:10.1145/3592979.3593407'
apa: Faj, J., Kenter, T., Faghih-Naini, S., Plessl, C., & Aizinger, V. (2023).
Scalable Multi-FPGA Design of a Discontinuous Galerkin Shallow-Water Model on
Unstructured Meshes. Proceedings of the Platform for Advanced Scientific Computing
Conference. https://doi.org/10.1145/3592979.3593407
bibtex: '@inproceedings{Faj_Kenter_Faghih-Naini_Plessl_Aizinger_2023, title={Scalable
Multi-FPGA Design of a Discontinuous Galerkin Shallow-Water Model on Unstructured
Meshes}, DOI={10.1145/3592979.3593407},
booktitle={Proceedings of the Platform for Advanced Scientific Computing Conference},
publisher={ACM}, author={Faj, Jennifer and Kenter, Tobias and Faghih-Naini, Sara
and Plessl, Christian and Aizinger, Vadym}, year={2023} }'
chicago: Faj, Jennifer, Tobias Kenter, Sara Faghih-Naini, Christian Plessl, and
Vadym Aizinger. “Scalable Multi-FPGA Design of a Discontinuous Galerkin Shallow-Water
Model on Unstructured Meshes.” In Proceedings of the Platform for Advanced
Scientific Computing Conference. ACM, 2023. https://doi.org/10.1145/3592979.3593407.
ieee: 'J. Faj, T. Kenter, S. Faghih-Naini, C. Plessl, and V. Aizinger, “Scalable
Multi-FPGA Design of a Discontinuous Galerkin Shallow-Water Model on Unstructured
Meshes,” 2023, doi: 10.1145/3592979.3593407.'
mla: Faj, Jennifer, et al. “Scalable Multi-FPGA Design of a Discontinuous Galerkin
Shallow-Water Model on Unstructured Meshes.” Proceedings of the Platform for
Advanced Scientific Computing Conference, ACM, 2023, doi:10.1145/3592979.3593407.
short: 'J. Faj, T. Kenter, S. Faghih-Naini, C. Plessl, V. Aizinger, in: Proceedings
of the Platform for Advanced Scientific Computing Conference, ACM, 2023.'
date_created: 2023-07-28T09:42:14Z
date_updated: 2023-07-28T09:48:19Z
department:
- _id: '27'
- _id: '518'
doi: 10.1145/3592979.3593407
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://dl.acm.org/doi/pdf/10.1145/3592979.3593407
oa: '1'
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: Proceedings of the Platform for Advanced Scientific Computing Conference
publication_status: published
publisher: ACM
quality_controlled: '1'
status: public
title: Scalable Multi-FPGA Design of a Discontinuous Galerkin Shallow-Water Model
on Unstructured Meshes
type: conference
user_id: '3145'
year: '2023'
...
---
_id: '46189'
author:
- first_name: Charles
full_name: Prouveur, Charles
last_name: Prouveur
- first_name: Matthieu
full_name: Haefele, Matthieu
last_name: Haefele
- first_name: Tobias
full_name: Kenter, Tobias
id: '3145'
last_name: Kenter
- first_name: Nils
full_name: Voss, Nils
last_name: Voss
citation:
ama: 'Prouveur C, Haefele M, Kenter T, Voss N. FPGA Acceleration for HPC Supercapacitor
Simulations. In: Proceedings of the Platform for Advanced Scientific Computing
Conference. ACM; 2023. doi:10.1145/3592979.3593419'
apa: Prouveur, C., Haefele, M., Kenter, T., & Voss, N. (2023). FPGA Acceleration
for HPC Supercapacitor Simulations. Proceedings of the Platform for Advanced
Scientific Computing Conference. https://doi.org/10.1145/3592979.3593419
bibtex: '@inproceedings{Prouveur_Haefele_Kenter_Voss_2023, title={FPGA Acceleration
for HPC Supercapacitor Simulations}, DOI={10.1145/3592979.3593419},
booktitle={Proceedings of the Platform for Advanced Scientific Computing Conference},
publisher={ACM}, author={Prouveur, Charles and Haefele, Matthieu and Kenter, Tobias
and Voss, Nils}, year={2023} }'
chicago: Prouveur, Charles, Matthieu Haefele, Tobias Kenter, and Nils Voss. “FPGA
Acceleration for HPC Supercapacitor Simulations.” In Proceedings of the Platform
for Advanced Scientific Computing Conference. ACM, 2023. https://doi.org/10.1145/3592979.3593419.
ieee: 'C. Prouveur, M. Haefele, T. Kenter, and N. Voss, “FPGA Acceleration for HPC
Supercapacitor Simulations,” 2023, doi: 10.1145/3592979.3593419.'
mla: Prouveur, Charles, et al. “FPGA Acceleration for HPC Supercapacitor Simulations.”
Proceedings of the Platform for Advanced Scientific Computing Conference,
ACM, 2023, doi:10.1145/3592979.3593419.
short: 'C. Prouveur, M. Haefele, T. Kenter, N. Voss, in: Proceedings of the Platform
for Advanced Scientific Computing Conference, ACM, 2023.'
date_created: 2023-07-28T09:46:25Z
date_updated: 2023-07-28T09:58:16Z
department:
- _id: '27'
- _id: '518'
doi: 10.1145/3592979.3593419
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://dl.acm.org/doi/pdf/10.1145/3592979.3593419
oa: '1'
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: Proceedings of the Platform for Advanced Scientific Computing Conference
publication_status: published
publisher: ACM
quality_controlled: '1'
status: public
title: FPGA Acceleration for HPC Supercapacitor Simulations
type: conference
user_id: '3145'
year: '2023'
...
---
_id: '43228'
abstract:
- lang: eng
text: "The computation of electron repulsion integrals (ERIs) over Gaussian-type
orbitals (GTOs) is a challenging problem in quantum-mechanics-based atomistic
simulations. In practical simulations, several trillions of ERIs may have to be computed
for every time step.\r\nIn this work, we investigate FPGAs as accelerators for
the ERI computation. We use template parameters, here within the Intel oneAPI
tool flow, to create customized designs for 256 different ERI quartet classes,
based on their orbitals. To maximize data reuse, all intermediates are buffered
in FPGA on-chip memory with customized layout. The pre-calculation of intermediates
also helps to overcome data dependencies caused by multi-dimensional recurrence relations.
The involved loop structures are partially or even fully unrolled for high throughput
of FPGA kernels. Furthermore, a lossy compression algorithm utilizing arbitrary
bitwidth integers is integrated in the FPGA kernels. To the best of our knowledge,
this is the first work on ERI computation on FPGAs that supports more than just
the single most basic quartet class. Also, the integration of ERI computation
and compression is a novelty that is not even covered by CPU or GPU libraries
so far.\r\nOur evaluation shows that using 16-bit integers for the ERI compression,
the fastest FPGA kernels exceed the performance of 10 GERIS ($10 \\times 10^9$
ERIs per second) on one Intel Stratix 10 GX 2800 FPGA, with maximum absolute errors
around $10^{-7}$ - $10^{-5}$ Hartree. The measured throughput can be accurately
explained by a performance model. The FPGA kernels deployed on 2 FPGAs outperform
similar computations using the widely used libint reference on a two-socket server
with 40 Xeon Gold 6148 CPU cores of the same process technology by factors up
to 6.0x and on a new two-socket server with 128 EPYC 7713 CPU cores by up to 1.9x."
author:
- first_name: Xin
full_name: Wu, Xin
id: '77439'
last_name: Wu
- first_name: Tobias
full_name: Kenter, Tobias
id: '3145'
last_name: Kenter
- first_name: Robert
full_name: Schade, Robert
id: '75963'
last_name: Schade
orcid: 0000-0002-6268-539
- first_name: Thomas
full_name: Kühne, Thomas
id: '49079'
last_name: Kühne
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
citation:
ama: 'Wu X, Kenter T, Schade R, Kühne T, Plessl C. Computing and Compressing Electron
Repulsion Integrals on FPGAs. In: 2023 IEEE 31st Annual International Symposium
on Field-Programmable Custom Computing Machines (FCCM). 2023:162-173. doi:10.1109/FCCM57271.2023.00026'
apa: Wu, X., Kenter, T., Schade, R., Kühne, T., & Plessl, C. (2023). Computing
and Compressing Electron Repulsion Integrals on FPGAs. 2023 IEEE 31st Annual
International Symposium on Field-Programmable Custom Computing Machines (FCCM),
162–173. https://doi.org/10.1109/FCCM57271.2023.00026
bibtex: '@inproceedings{Wu_Kenter_Schade_Kühne_Plessl_2023, title={Computing and
Compressing Electron Repulsion Integrals on FPGAs}, DOI={10.1109/FCCM57271.2023.00026},
booktitle={2023 IEEE 31st Annual International Symposium on Field-Programmable
Custom Computing Machines (FCCM)}, author={Wu, Xin and Kenter, Tobias and Schade,
Robert and Kühne, Thomas and Plessl, Christian}, year={2023}, pages={162–173}
}'
chicago: Wu, Xin, Tobias Kenter, Robert Schade, Thomas Kühne, and Christian Plessl.
“Computing and Compressing Electron Repulsion Integrals on FPGAs.” In 2023
IEEE 31st Annual International Symposium on Field-Programmable Custom Computing
Machines (FCCM), 162–73, 2023. https://doi.org/10.1109/FCCM57271.2023.00026.
ieee: 'X. Wu, T. Kenter, R. Schade, T. Kühne, and C. Plessl, “Computing and Compressing
Electron Repulsion Integrals on FPGAs,” in 2023 IEEE 31st Annual International
Symposium on Field-Programmable Custom Computing Machines (FCCM), 2023, pp.
162–173, doi: 10.1109/FCCM57271.2023.00026.'
mla: Wu, Xin, et al. “Computing and Compressing Electron Repulsion Integrals on
FPGAs.” 2023 IEEE 31st Annual International Symposium on Field-Programmable
Custom Computing Machines (FCCM), 2023, pp. 162–73, doi:10.1109/FCCM57271.2023.00026.
short: 'X. Wu, T. Kenter, R. Schade, T. Kühne, C. Plessl, in: 2023 IEEE 31st Annual
International Symposium on Field-Programmable Custom Computing Machines (FCCM),
2023, pp. 162–173.'
date_created: 2023-03-30T11:15:40Z
date_updated: 2023-08-02T15:05:42Z
department:
- _id: '27'
- _id: '518'
doi: 10.1109/FCCM57271.2023.00026
external_id:
arxiv:
- '2303.13632'
language:
- iso: eng
main_file_link:
- url: https://ieeexplore.ieee.org/document/10171537
page: 162-173
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: 2023 IEEE 31st Annual International Symposium on Field-Programmable Custom
Computing Machines (FCCM)
quality_controlled: '1'
status: public
title: Computing and Compressing Electron Repulsion Integrals on FPGAs
type: conference
user_id: '75963'
year: '2023'
...
---
_id: '45361'
abstract:
- lang: eng
text: The non-orthogonal local submatrix method applied to electronic structure–based
molecular dynamics simulations is shown to exceed 1.1 EFLOP/s in FP16/FP32-mixed
floating-point arithmetic when using 4400 NVIDIA A100 GPUs of the Perlmutter system.
This is enabled by a modification of the original method that pushes the sustained
fraction of the peak performance to about 80%. Example calculations are performed
for SARS-CoV-2 spike proteins with up to 83 million atoms.
article_number: '109434202311776'
article_type: original
author:
- first_name: Robert
full_name: Schade, Robert
id: '75963'
last_name: Schade
orcid: 0000-0002-6268-539
- first_name: Tobias
full_name: Kenter, Tobias
id: '3145'
last_name: Kenter
- first_name: Hossam
full_name: Elgabarty, Hossam
id: '60250'
last_name: Elgabarty
orcid: 0000-0002-4945-1481
- first_name: Michael
full_name: Lass, Michael
id: '24135'
last_name: Lass
orcid: 0000-0002-5708-7632
- first_name: Thomas
full_name: Kühne, Thomas
id: '49079'
last_name: Kühne
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
citation:
ama: Schade R, Kenter T, Elgabarty H, Lass M, Kühne T, Plessl C. Breaking the exascale
barrier for the electronic structure problem in ab-initio molecular dynamics.
The International Journal of High Performance Computing Applications. Published
online 2023. doi:10.1177/10943420231177631
apa: Schade, R., Kenter, T., Elgabarty, H., Lass, M., Kühne, T., & Plessl, C.
(2023). Breaking the exascale barrier for the electronic structure problem in
ab-initio molecular dynamics. The International Journal of High Performance
Computing Applications, Article 109434202311776. https://doi.org/10.1177/10943420231177631
bibtex: '@article{Schade_Kenter_Elgabarty_Lass_Kühne_Plessl_2023, title={Breaking
the exascale barrier for the electronic structure problem in ab-initio molecular
dynamics}, DOI={10.1177/10943420231177631},
number={109434202311776}, journal={The International Journal of High Performance
Computing Applications}, publisher={SAGE Publications}, author={Schade, Robert
and Kenter, Tobias and Elgabarty, Hossam and Lass, Michael and Kühne, Thomas and
Plessl, Christian}, year={2023} }'
chicago: Schade, Robert, Tobias Kenter, Hossam Elgabarty, Michael Lass, Thomas Kühne,
and Christian Plessl. “Breaking the Exascale Barrier for the Electronic Structure
Problem in Ab-Initio Molecular Dynamics.” The International Journal of High
Performance Computing Applications, 2023. https://doi.org/10.1177/10943420231177631.
ieee: 'R. Schade, T. Kenter, H. Elgabarty, M. Lass, T. Kühne, and C. Plessl, “Breaking
the exascale barrier for the electronic structure problem in ab-initio molecular
dynamics,” The International Journal of High Performance Computing Applications,
Art. no. 109434202311776, 2023, doi: 10.1177/10943420231177631.'
mla: Schade, Robert, et al. “Breaking the Exascale Barrier for the Electronic Structure
Problem in Ab-Initio Molecular Dynamics.” The International Journal of High
Performance Computing Applications, 109434202311776, SAGE Publications, 2023,
doi:10.1177/10943420231177631.
short: R. Schade, T. Kenter, H. Elgabarty, M. Lass, T. Kühne, C. Plessl, The International
Journal of High Performance Computing Applications (2023).
date_created: 2023-05-30T09:19:09Z
date_updated: 2023-08-02T15:04:53Z
department:
- _id: '27'
- _id: '518'
doi: 10.1177/10943420231177631
keyword:
- Hardware and Architecture
- Theoretical Computer Science
- Software
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://journals.sagepub.com/doi/10.1177/10943420231177631
oa: '1'
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: The International Journal of High Performance Computing Applications
publication_identifier:
issn:
- 1094-3420
- 1741-2846
publication_status: published
publisher: SAGE Publications
quality_controlled: '1'
status: public
title: Breaking the exascale barrier for the electronic structure problem in ab-initio
molecular dynamics
type: journal_article
user_id: '75963'
year: '2023'
...
---
_id: '50172'
abstract:
- lang: eng
text: "Viscous hydrodynamics serves as a successful mesoscopic description of the
Quark-Gluon Plasma produced in relativistic heavy-ion collisions. To investigate
how such an effective description emerges from the underlying microscopic dynamics,
we calculate the hydrodynamic and non-hydrodynamic modes of linear response in
the sound channel from a first-principles calculation in kinetic theory. We do
this with a new approach wherein we discretize the collision kernel to directly
calculate eigenvalues and eigenmodes of the evolution operator. This allows us
to study the Green's functions at any point in the complex frequency space. Our
study focuses on scalar theory with quartic interaction, and we find that the
analytic structure of Green's functions in the complex plane is far more complicated
than just poles or cuts. This is a first step towards an equivalent study in QCD
kinetic theory."
author:
- first_name: Stephan
full_name: Ochsenfeld, Stephan
last_name: Ochsenfeld
- first_name: Sören
full_name: Schlichting, Sören
last_name: Schlichting
citation:
ama: Ochsenfeld S, Schlichting S. Hydrodynamic and Non-hydrodynamic Excitations
in Kinetic Theory -- A Numerical Analysis in Scalar Field Theory. arXiv:2308.04491.
Published online 2023.
apa: Ochsenfeld, S., & Schlichting, S. (2023). Hydrodynamic and Non-hydrodynamic
Excitations in Kinetic Theory -- A Numerical Analysis in Scalar Field Theory.
In arXiv:2308.04491.
bibtex: '@article{Ochsenfeld_Schlichting_2023, title={Hydrodynamic and Non-hydrodynamic
Excitations in Kinetic Theory -- A Numerical Analysis in Scalar Field Theory},
journal={arXiv:2308.04491}, author={Ochsenfeld, Stephan and Schlichting, Sören},
year={2023} }'
chicago: Ochsenfeld, Stephan, and Sören Schlichting. “Hydrodynamic and Non-Hydrodynamic
Excitations in Kinetic Theory -- A Numerical Analysis in Scalar Field Theory.”
ArXiv:2308.04491, 2023.
ieee: S. Ochsenfeld and S. Schlichting, “Hydrodynamic and Non-hydrodynamic Excitations
in Kinetic Theory -- A Numerical Analysis in Scalar Field Theory,” arXiv:2308.04491.
2023.
mla: Ochsenfeld, Stephan, and Sören Schlichting. “Hydrodynamic and Non-Hydrodynamic
Excitations in Kinetic Theory -- A Numerical Analysis in Scalar Field Theory.”
ArXiv:2308.04491, 2023.
short: S. Ochsenfeld, S. Schlichting, ArXiv:2308.04491 (2023).
date_created: 2024-01-04T08:47:38Z
date_updated: 2024-01-04T08:47:47Z
department:
- _id: '27'
external_id:
arxiv:
- '2308.04491'
language:
- iso: eng
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: arXiv:2308.04491
status: public
title: Hydrodynamic and Non-hydrodynamic Excitations in Kinetic Theory -- A Numerical
Analysis in Scalar Field Theory
type: preprint
user_id: '67287'
year: '2023'
...