---
_id: '53663'
abstract:
- lang: eng
  text: 'Noctua 2 is a supercomputer operated at the Paderborn Center for Parallel
    Computing (PC2) at Paderborn University in Germany. Noctua 2 was inaugurated in
    2022 and is an Atos BullSequana XH2000 system. It consists mainly of three node
    types: 1) CPU Compute nodes with AMD EPYC processors in different main memory
    configurations, 2) GPU nodes with NVIDIA A100 GPUs, and 3) FPGA nodes with Xilinx
    Alveo U280 and Intel Stratix 10 FPGA cards. While CPUs and GPUs are known off-the-shelf
    components in HPC systems, the operation of a large number of FPGA cards from
    different vendors and a dedicated FPGA-to-FPGA network are unique characteristics
    of Noctua 2. This paper describes in detail the overall setup of Noctua 2 and
    gives insights into the operation of the cluster from a hardware, software and
    facility perspective.'
article_type: original
author:
- first_name: Carsten
  full_name: Bauer, Carsten
  id: '90082'
  last_name: Bauer
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Lukas
  full_name: Mazur, Lukas
  id: '90492'
  last_name: Mazur
  orcid: 0000-0001-6304-7082
- first_name: Marius
  full_name: Meyer, Marius
  id: '40778'
  last_name: Meyer
- first_name: Holger
  full_name: Nitsche, Holger
  id: '15272'
  last_name: Nitsche
- first_name: Heinrich
  full_name: Riebler, Heinrich
  id: '8961'
  last_name: Riebler
- first_name: Robert
  full_name: Schade, Robert
  id: '75963'
  last_name: Schade
  orcid: 0000-0002-6268-5397
- first_name: Michael
  full_name: Schwarz, Michael
  id: '5312'
  last_name: Schwarz
- first_name: Nils
  full_name: Winnwa, Nils
  id: '61189'
  last_name: Winnwa
- first_name: Alex
  full_name: Wiens, Alex
  id: '23522'
  last_name: Wiens
  orcid: 0000-0003-1764-9773
- first_name: Xin
  full_name: Wu, Xin
  id: '77439'
  last_name: Wu
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
- first_name: Jens
  full_name: Simon, Jens
  id: '15273'
  last_name: Simon
citation:
  ama: Bauer C, Kenter T, Lass M, et al. Noctua 2 Supercomputer. <i>Journal of large-scale
    research facilities</i>. 2024;9. doi:<a href="https://doi.org/10.17815/jlsrf-8-187">10.17815/jlsrf-8-187</a>
  apa: Bauer, C., Kenter, T., Lass, M., Mazur, L., Meyer, M., Nitsche, H., Riebler,
    H., Schade, R., Schwarz, M., Winnwa, N., Wiens, A., Wu, X., Plessl, C., &#38;
    Simon, J. (2024). Noctua 2 Supercomputer. <i>Journal of Large-Scale Research Facilities</i>,
    <i>9</i>. <a href="https://doi.org/10.17815/jlsrf-8-187">https://doi.org/10.17815/jlsrf-8-187</a>
  bibtex: '@article{Bauer_Kenter_Lass_Mazur_Meyer_Nitsche_Riebler_Schade_Schwarz_Winnwa_et
    al._2024, title={Noctua 2 Supercomputer}, volume={9}, DOI={<a href="https://doi.org/10.17815/jlsrf-8-187">10.17815/jlsrf-8-187</a>},
    journal={Journal of large-scale research facilities},
    author={Bauer, Carsten and Kenter, Tobias and Lass, Michael and Mazur, Lukas and
    Meyer, Marius and Nitsche, Holger and Riebler, Heinrich and Schade, Robert and
    Schwarz, Michael and Winnwa, Nils and et al.}, year={2024} }'
  chicago: Bauer, Carsten, Tobias Kenter, Michael Lass, Lukas Mazur, Marius Meyer,
    Holger Nitsche, Heinrich Riebler, et al. “Noctua 2 Supercomputer.” <i>Journal
    of Large-Scale Research Facilities</i> 9 (2024). <a href="https://doi.org/10.17815/jlsrf-8-187">https://doi.org/10.17815/jlsrf-8-187</a>.
  ieee: 'C. Bauer <i>et al.</i>, “Noctua 2 Supercomputer,” <i>Journal of large-scale
    research facilities</i>, vol. 9, 2024, doi: <a href="https://doi.org/10.17815/jlsrf-8-187">10.17815/jlsrf-8-187</a>.'
  mla: Bauer, Carsten, et al. “Noctua 2 Supercomputer.” <i>Journal of Large-Scale
    Research Facilities</i>, vol. 9, 2024, doi:<a href="https://doi.org/10.17815/jlsrf-8-187">10.17815/jlsrf-8-187</a>.
  short: C. Bauer, T. Kenter, M. Lass, L. Mazur, M. Meyer, H. Nitsche, H. Riebler,
    R. Schade, M. Schwarz, N. Winnwa, A. Wiens, X. Wu, C. Plessl, J. Simon, Journal
    of Large-Scale Research Facilities 9 (2024).
date_created: 2024-04-26T07:39:41Z
date_updated: 2024-04-26T08:44:30Z
ddc:
- '004'
department:
- _id: '27'
- _id: '518'
doi: '10.17815/jlsrf-8-187'
file:
- access_level: open_access
  content_type: application/pdf
  creator: deffel
  date_created: 2024-04-26T07:30:20Z
  date_updated: 2024-04-26T08:35:17Z
  file_id: '53664'
  file_name: Noctua2_Supercomputer.pdf
  file_size: 3825480
  relation: main_file
file_date_updated: 2024-04-26T08:35:17Z
has_accepted_license: '1'
intvolume: '         9'
keyword:
- Noctua 2
- Supercomputer
- FPGA
- PC2
- Paderborn Center for Parallel Computing
language:
- iso: eng
oa: '1'
project:
- _id: '52'
  name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: Journal of large-scale research facilities
publication_status: published
status: public
title: Noctua 2 Supercomputer
type: journal_article
user_id: '8961'
volume: 9
year: '2024'
...
---
_id: '56607'
author:
- first_name: Abdul Rehman
  full_name: Tareen, Abdul Rehman
  id: '76938'
  last_name: Tareen
- first_name: Marius
  full_name: Meyer, Marius
  id: '40778'
  last_name: Meyer
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
citation:
  ama: 'Tareen AR, Meyer M, Plessl C, Kenter T. HiHiSpMV: Sparse Matrix Vector Multiplication
    with Hierarchical Row Reductions on FPGAs with High Bandwidth Memory. In: <i>2024
    IEEE 32nd Annual International Symposium on Field-Programmable Custom Computing
    Machines (FCCM)</i>. Vol 35. IEEE; 2024. doi:<a href="https://doi.org/10.1109/fccm60383.2024.00014">10.1109/fccm60383.2024.00014</a>'
  apa: 'Tareen, A. R., Meyer, M., Plessl, C., &#38; Kenter, T. (2024). HiHiSpMV: Sparse
    Matrix Vector Multiplication with Hierarchical Row Reductions on FPGAs with High
    Bandwidth Memory. <i>2024 IEEE 32nd Annual International Symposium on Field-Programmable
    Custom Computing Machines (FCCM)</i>, <i>35</i>. <a href="https://doi.org/10.1109/fccm60383.2024.00014">https://doi.org/10.1109/fccm60383.2024.00014</a>'
  bibtex: '@inproceedings{Tareen_Meyer_Plessl_Kenter_2024, title={HiHiSpMV: Sparse
    Matrix Vector Multiplication with Hierarchical Row Reductions on FPGAs with High
    Bandwidth Memory}, volume={35}, DOI={<a href="https://doi.org/10.1109/fccm60383.2024.00014">10.1109/fccm60383.2024.00014</a>},
    booktitle={2024 IEEE 32nd Annual International Symposium on Field-Programmable
    Custom Computing Machines (FCCM)}, publisher={IEEE}, author={Tareen, Abdul Rehman
    and Meyer, Marius and Plessl, Christian and Kenter, Tobias}, year={2024} }'
  chicago: 'Tareen, Abdul Rehman, Marius Meyer, Christian Plessl, and Tobias Kenter.
    “HiHiSpMV: Sparse Matrix Vector Multiplication with Hierarchical Row Reductions
    on FPGAs with High Bandwidth Memory.” In <i>2024 IEEE 32nd Annual International
    Symposium on Field-Programmable Custom Computing Machines (FCCM)</i>, Vol. 35.
    IEEE, 2024. <a href="https://doi.org/10.1109/fccm60383.2024.00014">https://doi.org/10.1109/fccm60383.2024.00014</a>.'
  ieee: 'A. R. Tareen, M. Meyer, C. Plessl, and T. Kenter, “HiHiSpMV: Sparse Matrix
    Vector Multiplication with Hierarchical Row Reductions on FPGAs with High Bandwidth
    Memory,” in <i>2024 IEEE 32nd Annual International Symposium on Field-Programmable
    Custom Computing Machines (FCCM)</i>, 2024, vol. 35, doi: <a href="https://doi.org/10.1109/fccm60383.2024.00014">10.1109/fccm60383.2024.00014</a>.'
  mla: 'Tareen, Abdul Rehman, et al. “HiHiSpMV: Sparse Matrix Vector Multiplication
    with Hierarchical Row Reductions on FPGAs with High Bandwidth Memory.” <i>2024
    IEEE 32nd Annual International Symposium on Field-Programmable Custom Computing
    Machines (FCCM)</i>, vol. 35, IEEE, 2024, doi:<a href="https://doi.org/10.1109/fccm60383.2024.00014">10.1109/fccm60383.2024.00014</a>.'
  short: 'A.R. Tareen, M. Meyer, C. Plessl, T. Kenter, in: 2024 IEEE 32nd Annual International
    Symposium on Field-Programmable Custom Computing Machines (FCCM), IEEE, 2024.'
date_created: 2024-10-14T07:59:08Z
date_updated: 2024-10-14T12:27:55Z
department:
- _id: '27'
- _id: '518'
doi: 10.1109/fccm60383.2024.00014
intvolume: '        35'
language:
- iso: eng
publication: 2024 IEEE 32nd Annual International Symposium on Field-Programmable Custom
  Computing Machines (FCCM)
publication_status: published
publisher: IEEE
quality_controlled: '1'
status: public
title: 'HiHiSpMV: Sparse Matrix Vector Multiplication with Hierarchical Row Reductions
  on FPGAs with High Bandwidth Memory'
type: conference
user_id: '3145'
volume: 35
year: '2024'
...
---
_id: '62067'
abstract:
- lang: eng
  text: Most FPGA boards in the HPC domain are well-suited for parallel scaling because
    of the direct integration of versatile and high-throughput network ports. However,
    the utilization of their network capabilities is often challenging and error-prone
    because the whole network stack and communication patterns have to be implemented
    and managed on the FPGAs. Also, this approach conceptually involves a trade-off
    between the performance potential of improved communication and the impact of
    resource consumption for communication infrastructure, since the utilized resources
    on the FPGAs could otherwise be used for computations. In this work, we investigate
    this trade-off, firstly, by using synthetic benchmarks to evaluate the different
    configuration options of the communication framework ACCL and their impact on
    communication latency and throughput. Finally, we use our findings to implement
    a shallow water simulation whose scalability heavily depends on low-latency communication.
    With a suitable configuration of ACCL, good scaling behavior can be shown to all
    48 FPGAs installed in the system. Overall, the results show that the availability
    of inter-FPGA communication frameworks as well as the configurability of framework and
    network stack are crucial to achieve the best application performance with low
    latency communication.
author:
- first_name: Marius
  full_name: Meyer, Marius
  id: '40778'
  last_name: Meyer
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
- first_name: Lucian
  full_name: Petrica, Lucian
  last_name: Petrica
- first_name: Kenneth
  full_name: O’Brien, Kenneth
  last_name: O’Brien
- first_name: Michaela
  full_name: Blott, Michaela
  last_name: Blott
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: 'Meyer M, Kenter T, Petrica L, O’Brien K, Blott M, Plessl C. Optimizing Communication
    for Latency Sensitive HPC Applications on up to 48 FPGAs Using ACCL. In: <i>Lecture
    Notes in Computer Science</i>. Springer Nature Switzerland; 2024. doi:<a href="https://doi.org/10.1007/978-3-031-69766-1_9">10.1007/978-3-031-69766-1_9</a>'
  apa: Meyer, M., Kenter, T., Petrica, L., O’Brien, K., Blott, M., &#38; Plessl, C.
    (2024). Optimizing Communication for Latency Sensitive HPC Applications on up
    to 48 FPGAs Using ACCL. In <i>Lecture Notes in Computer Science</i>. Springer
    Nature Switzerland. <a href="https://doi.org/10.1007/978-3-031-69766-1_9">https://doi.org/10.1007/978-3-031-69766-1_9</a>
  bibtex: '@inbook{Meyer_Kenter_Petrica_O’Brien_Blott_Plessl_2024, place={Cham}, title={Optimizing
    Communication for Latency Sensitive HPC Applications on up to 48 FPGAs Using ACCL},
    DOI={<a href="https://doi.org/10.1007/978-3-031-69766-1_9">10.1007/978-3-031-69766-1_9</a>},
    booktitle={Lecture Notes in Computer Science}, publisher={Springer Nature Switzerland},
    author={Meyer, Marius and Kenter, Tobias and Petrica, Lucian and O’Brien, Kenneth
    and Blott, Michaela and Plessl, Christian}, year={2024} }'
  chicago: 'Meyer, Marius, Tobias Kenter, Lucian Petrica, Kenneth O’Brien, Michaela
    Blott, and Christian Plessl. “Optimizing Communication for Latency Sensitive HPC
    Applications on up to 48 FPGAs Using ACCL.” In <i>Lecture Notes in Computer Science</i>.
    Cham: Springer Nature Switzerland, 2024. <a href="https://doi.org/10.1007/978-3-031-69766-1_9">https://doi.org/10.1007/978-3-031-69766-1_9</a>.'
  ieee: 'M. Meyer, T. Kenter, L. Petrica, K. O’Brien, M. Blott, and C. Plessl, “Optimizing
    Communication for Latency Sensitive HPC Applications on up to 48 FPGAs Using ACCL,”
    in <i>Lecture Notes in Computer Science</i>, Cham: Springer Nature Switzerland,
    2024.'
  mla: Meyer, Marius, et al. “Optimizing Communication for Latency Sensitive HPC Applications
    on up to 48 FPGAs Using ACCL.” <i>Lecture Notes in Computer Science</i>, Springer
    Nature Switzerland, 2024, doi:<a href="https://doi.org/10.1007/978-3-031-69766-1_9">10.1007/978-3-031-69766-1_9</a>.
  short: 'M. Meyer, T. Kenter, L. Petrica, K. O’Brien, M. Blott, C. Plessl, in: Lecture
    Notes in Computer Science, Springer Nature Switzerland, Cham, 2024.'
date_created: 2025-11-04T09:50:24Z
date_updated: 2025-11-04T09:51:22Z
department:
- _id: '27'
- _id: '518'
doi: 10.1007/978-3-031-69766-1_9
language:
- iso: eng
main_file_link:
- open_access: '1'
oa: '1'
place: Cham
project:
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: Lecture Notes in Computer Science
publication_identifier:
  isbn:
  - '9783031697654'
  - '9783031697661'
  issn:
  - 0302-9743
  - 1611-3349
publication_status: published
publisher: Springer Nature Switzerland
quality_controlled: '1'
status: public
title: Optimizing Communication for Latency Sensitive HPC Applications on up to 48
  FPGAs Using ACCL
type: book_chapter
user_id: '3145'
year: '2024'
...
---
_id: '45893'
author:
- first_name: Tim
  full_name: Hansmeier, Tim
  id: '49992'
  last_name: Hansmeier
  orcid: 0000-0003-1377-3339
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
- first_name: Marius
  full_name: Meyer, Marius
  id: '40778'
  last_name: Meyer
- first_name: Heinrich
  full_name: Riebler, Heinrich
  id: '8961'
  last_name: Riebler
- first_name: Marco
  full_name: Platzner, Marco
  id: '398'
  last_name: Platzner
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: 'Hansmeier T, Kenter T, Meyer M, Riebler H, Platzner M, Plessl C. Compute Centers
    I: Heterogeneous Execution Environments. In: Haake C-J, Meyer auf der Heide F,
    Platzner M, Wachsmuth H, Wehrheim H, eds. <i>On-The-Fly Computing -- Individualized
    IT-Services in Dynamic Markets</i>. Vol 412. Verlagsschriftenreihe des Heinz Nixdorf
    Instituts. Heinz Nixdorf Institut, Universität Paderborn; 2023:165-182. doi:<a
    href="https://doi.org/10.5281/zenodo.8068642">10.5281/zenodo.8068642</a>'
  apa: 'Hansmeier, T., Kenter, T., Meyer, M., Riebler, H., Platzner, M., &#38; Plessl,
    C. (2023). Compute Centers I: Heterogeneous Execution Environments. In C.-J. Haake,
    F. Meyer auf der Heide, M. Platzner, H. Wachsmuth, &#38; H. Wehrheim (Eds.), <i>On-The-Fly
    Computing -- Individualized IT-services in dynamic markets</i> (Vol. 412, pp.
    165–182). Heinz Nixdorf Institut, Universität Paderborn. <a href="https://doi.org/10.5281/zenodo.8068642">https://doi.org/10.5281/zenodo.8068642</a>'
  bibtex: '@inbook{Hansmeier_Kenter_Meyer_Riebler_Platzner_Plessl_2023, place={Paderborn},
    series={Verlagsschriftenreihe des Heinz Nixdorf Instituts}, title={Compute Centers
    I: Heterogeneous Execution Environments}, volume={412}, DOI={<a href="https://doi.org/10.5281/zenodo.8068642">10.5281/zenodo.8068642</a>},
    booktitle={On-The-Fly Computing -- Individualized IT-services in dynamic markets},
    publisher={Heinz Nixdorf Institut, Universität Paderborn}, author={Hansmeier,
    Tim and Kenter, Tobias and Meyer, Marius and Riebler, Heinrich and Platzner, Marco
    and Plessl, Christian}, editor={Haake, Claus-Jochen and Meyer auf der Heide, Friedhelm
    and Platzner, Marco and Wachsmuth, Henning and Wehrheim, Heike}, year={2023},
    pages={165–182}, collection={Verlagsschriftenreihe des Heinz Nixdorf Instituts}
    }'
  chicago: 'Hansmeier, Tim, Tobias Kenter, Marius Meyer, Heinrich Riebler, Marco Platzner,
    and Christian Plessl. “Compute Centers I: Heterogeneous Execution Environments.”
    In <i>On-The-Fly Computing -- Individualized IT-Services in Dynamic Markets</i>,
    edited by Claus-Jochen Haake, Friedhelm Meyer auf der Heide, Marco Platzner, Henning
    Wachsmuth, and Heike Wehrheim, 412:165–82. Verlagsschriftenreihe des Heinz Nixdorf
    Instituts. Paderborn: Heinz Nixdorf Institut, Universität Paderborn, 2023. <a
    href="https://doi.org/10.5281/zenodo.8068642">https://doi.org/10.5281/zenodo.8068642</a>.'
  ieee: 'T. Hansmeier, T. Kenter, M. Meyer, H. Riebler, M. Platzner, and C. Plessl,
    “Compute Centers I: Heterogeneous Execution Environments,” in <i>On-The-Fly Computing
    -- Individualized IT-services in dynamic markets</i>, vol. 412, C.-J. Haake, F.
    Meyer auf der Heide, M. Platzner, H. Wachsmuth, and H. Wehrheim, Eds. Paderborn:
    Heinz Nixdorf Institut, Universität Paderborn, 2023, pp. 165–182.'
  mla: 'Hansmeier, Tim, et al. “Compute Centers I: Heterogeneous Execution Environments.”
    <i>On-The-Fly Computing -- Individualized IT-Services in Dynamic Markets</i>,
    edited by Claus-Jochen Haake et al., vol. 412, Heinz Nixdorf Institut, Universität
    Paderborn, 2023, pp. 165–82, doi:<a href="https://doi.org/10.5281/zenodo.8068642">10.5281/zenodo.8068642</a>.'
  short: 'T. Hansmeier, T. Kenter, M. Meyer, H. Riebler, M. Platzner, C. Plessl, in:
    C.-J. Haake, F. Meyer auf der Heide, M. Platzner, H. Wachsmuth, H. Wehrheim (Eds.),
    On-The-Fly Computing -- Individualized IT-Services in Dynamic Markets, Heinz Nixdorf
    Institut, Universität Paderborn, Paderborn, 2023, pp. 165–182.'
date_created: 2023-07-07T08:15:45Z
date_updated: 2024-05-02T10:33:00Z
ddc:
- '004'
department:
- _id: '7'
- _id: '27'
- _id: '518'
- _id: '78'
doi: 10.5281/zenodo.8068642
editor:
- first_name: Claus-Jochen
  full_name: Haake, Claus-Jochen
  last_name: Haake
- first_name: Friedhelm
  full_name: Meyer auf der Heide, Friedhelm
  last_name: Meyer auf der Heide
- first_name: Marco
  full_name: Platzner, Marco
  last_name: Platzner
- first_name: Henning
  full_name: Wachsmuth, Henning
  last_name: Wachsmuth
- first_name: Heike
  full_name: Wehrheim, Heike
  last_name: Wehrheim
file:
- access_level: open_access
  content_type: application/pdf
  creator: florida
  date_created: 2023-07-07T08:15:35Z
  date_updated: 2023-07-07T11:17:33Z
  file_id: '45894'
  file_name: C2-Chapter-SFB-Buch-Final.pdf
  file_size: 2288788
  relation: main_file
file_date_updated: 2023-07-07T11:17:33Z
has_accepted_license: '1'
intvolume: '       412'
language:
- iso: eng
oa: '1'
page: 165-182
place: Paderborn
project:
- _id: '1'
  grant_number: '160364472'
  name: 'SFB 901: SFB 901: On-The-Fly Computing - Individualisierte IT-Dienstleistungen
    in dynamischen Märkten'
- _id: '4'
  name: 'SFB 901 - C: SFB 901 - Project Area C'
- _id: '14'
  grant_number: '160364472'
  name: 'SFB 901 - C2: SFB 901 - On-The-Fly Compute Centers I: Heterogene Ausführungsumgebungen
    (Subproject C2)'
publication: On-The-Fly Computing -- Individualized IT-services in dynamic markets
publisher: Heinz Nixdorf Institut, Universität Paderborn
series_title: Verlagsschriftenreihe des Heinz Nixdorf Instituts
status: public
title: 'Compute Centers I: Heterogeneous Execution Environments'
type: book_chapter
user_id: '398'
volume: 412
year: '2023'
...
---
_id: '38041'
abstract:
- lang: eng
  text: "While FPGA accelerator boards and their respective high-level design
    tools are maturing, there is still a lack of multi-FPGA applications, libraries,
    and not least, benchmarks and reference implementations towards sustained HPC
    usage of these devices. As in the early days of GPUs in HPC, for workloads that
    can reasonably be decoupled into loosely coupled working sets, multi-accelerator
    support can be achieved by using standard communication interfaces like MPI on
    the host side. However, for performance and productivity, some applications can
    profit from a tighter coupling of the accelerators. FPGAs offer unique opportunities
    here when extending the dataflow characteristics to their communication interfaces.\nIn
    this work, we extend the HPCC FPGA benchmark suite by multi-FPGA
    support and three missing benchmarks that particularly characterize or stress
    inter-device communication: b_eff, PTRANS, and LINPACK. With all benchmarks implemented
    for current boards with Intel and Xilinx FPGAs, we established a baseline for
    multi-FPGA performance. Additionally, for the communication-centric benchmarks,
    we explored the potential of direct FPGA-to-FPGA communication with a circuit-switched
    inter-FPGA network that is currently only available for one of the boards. The
    evaluation with parallel execution on up to 26 FPGA boards makes use of one of
    the largest academic FPGA installations."
author:
- first_name: Marius
  full_name: Meyer, Marius
  id: '40778'
  last_name: Meyer
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: Meyer M, Kenter T, Plessl C. Multi-FPGA Designs and Scaling of HPC Challenge
    Benchmarks via MPI and Circuit-Switched Inter-FPGA Networks. <i>ACM Transactions
    on Reconfigurable Technology and Systems</i>. Published online 2023. doi:<a href="https://doi.org/10.1145/3576200">10.1145/3576200</a>
  apa: Meyer, M., Kenter, T., &#38; Plessl, C. (2023). Multi-FPGA Designs and Scaling
    of HPC Challenge Benchmarks via MPI and Circuit-Switched Inter-FPGA Networks.
    <i>ACM Transactions on Reconfigurable Technology and Systems</i>. <a href="https://doi.org/10.1145/3576200">https://doi.org/10.1145/3576200</a>
  bibtex: '@article{Meyer_Kenter_Plessl_2023, title={Multi-FPGA Designs and Scaling
    of HPC Challenge Benchmarks via MPI and Circuit-Switched Inter-FPGA Networks},
    DOI={<a href="https://doi.org/10.1145/3576200">10.1145/3576200</a>}, journal={ACM
    Transactions on Reconfigurable Technology and Systems}, publisher={Association
    for Computing Machinery (ACM)}, author={Meyer, Marius and Kenter, Tobias and Plessl,
    Christian}, year={2023} }'
  chicago: Meyer, Marius, Tobias Kenter, and Christian Plessl. “Multi-FPGA Designs
    and Scaling of HPC Challenge Benchmarks via MPI and Circuit-Switched Inter-FPGA
    Networks.” <i>ACM Transactions on Reconfigurable Technology and Systems</i>, 2023.
    <a href="https://doi.org/10.1145/3576200">https://doi.org/10.1145/3576200</a>.
  ieee: 'M. Meyer, T. Kenter, and C. Plessl, “Multi-FPGA Designs and Scaling of HPC
    Challenge Benchmarks via MPI and Circuit-Switched Inter-FPGA Networks,” <i>ACM
    Transactions on Reconfigurable Technology and Systems</i>, 2023, doi: <a href="https://doi.org/10.1145/3576200">10.1145/3576200</a>.'
  mla: Meyer, Marius, et al. “Multi-FPGA Designs and Scaling of HPC Challenge Benchmarks
    via MPI and Circuit-Switched Inter-FPGA Networks.” <i>ACM Transactions on Reconfigurable
    Technology and Systems</i>, Association for Computing Machinery (ACM), 2023, doi:<a
    href="https://doi.org/10.1145/3576200">10.1145/3576200</a>.
  short: M. Meyer, T. Kenter, C. Plessl, ACM Transactions on Reconfigurable Technology
    and Systems (2023).
date_created: 2023-01-23T08:40:42Z
date_updated: 2023-07-28T08:02:05Z
department:
- _id: '27'
- _id: '518'
doi: 10.1145/3576200
keyword:
- General Computer Science
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://dl.acm.org/doi/10.1145/3576200
oa: '1'
project:
- _id: '52'
  name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
- _id: '4'
  name: 'SFB 901 - C: SFB 901 - Project Area C'
- _id: '1'
  grant_number: '160364472'
  name: 'SFB 901: SFB 901'
- _id: '14'
  grant_number: '160364472'
  name: 'SFB 901 - C2: SFB 901 - Subproject C2'
publication: ACM Transactions on Reconfigurable Technology and Systems
publication_identifier:
  issn:
  - 1936-7406
  - 1936-7414
publication_status: published
publisher: Association for Computing Machinery (ACM)
quality_controlled: '1'
status: public
title: Multi-FPGA Designs and Scaling of HPC Challenge Benchmarks via MPI and Circuit-Switched
  Inter-FPGA Networks
type: journal_article
user_id: '24135'
year: '2023'
...
---
_id: '27364'
author:
- first_name: Marius
  full_name: Meyer, Marius
  id: '40778'
  last_name: Meyer
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: Meyer M, Kenter T, Plessl C. In-depth FPGA Accelerator Performance Evaluation
    with Single Node Benchmarks from the HPC Challenge Benchmark Suite for Intel and
    Xilinx FPGAs using OpenCL. <i>Journal of Parallel and Distributed Computing</i>.
    Published online 2022. doi:<a href="https://doi.org/10.1016/j.jpdc.2021.10.007">10.1016/j.jpdc.2021.10.007</a>
  apa: Meyer, M., Kenter, T., &#38; Plessl, C. (2022). In-depth FPGA Accelerator Performance
    Evaluation with Single Node Benchmarks from the HPC Challenge Benchmark Suite
    for Intel and Xilinx FPGAs using OpenCL. <i>Journal of Parallel and Distributed
    Computing</i>. <a href="https://doi.org/10.1016/j.jpdc.2021.10.007">https://doi.org/10.1016/j.jpdc.2021.10.007</a>
  bibtex: '@article{Meyer_Kenter_Plessl_2022, title={In-depth FPGA Accelerator Performance
    Evaluation with Single Node Benchmarks from the HPC Challenge Benchmark Suite
    for Intel and Xilinx FPGAs using OpenCL}, DOI={<a href="https://doi.org/10.1016/j.jpdc.2021.10.007">10.1016/j.jpdc.2021.10.007</a>},
    journal={Journal of Parallel and Distributed Computing}, author={Meyer, Marius
    and Kenter, Tobias and Plessl, Christian}, year={2022} }'
  chicago: Meyer, Marius, Tobias Kenter, and Christian Plessl. “In-Depth FPGA Accelerator
    Performance Evaluation with Single Node Benchmarks from the HPC Challenge Benchmark
    Suite for Intel and Xilinx FPGAs Using OpenCL.” <i>Journal of Parallel and Distributed
    Computing</i>, 2022. <a href="https://doi.org/10.1016/j.jpdc.2021.10.007">https://doi.org/10.1016/j.jpdc.2021.10.007</a>.
  ieee: 'M. Meyer, T. Kenter, and C. Plessl, “In-depth FPGA Accelerator Performance
    Evaluation with Single Node Benchmarks from the HPC Challenge Benchmark Suite
    for Intel and Xilinx FPGAs using OpenCL,” <i>Journal of Parallel and Distributed
    Computing</i>, 2022, doi: <a href="https://doi.org/10.1016/j.jpdc.2021.10.007">10.1016/j.jpdc.2021.10.007</a>.'
  mla: Meyer, Marius, et al. “In-Depth FPGA Accelerator Performance Evaluation with
    Single Node Benchmarks from the HPC Challenge Benchmark Suite for Intel and Xilinx
    FPGAs Using OpenCL.” <i>Journal of Parallel and Distributed Computing</i>, 2022,
    doi:<a href="https://doi.org/10.1016/j.jpdc.2021.10.007">10.1016/j.jpdc.2021.10.007</a>.
  short: M. Meyer, T. Kenter, C. Plessl, Journal of Parallel and Distributed Computing
    (2022).
date_created: 2021-11-10T14:36:27Z
date_updated: 2023-09-26T10:26:56Z
department:
- _id: '27'
- _id: '518'
doi: 10.1016/j.jpdc.2021.10.007
language:
- iso: eng
project:
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: Journal of Parallel and Distributed Computing
publication_identifier:
  issn:
  - 0743-7315
publication_status: published
quality_controlled: '1'
status: public
title: In-depth FPGA Accelerator Performance Evaluation with Single Node Benchmarks
  from the HPC Challenge Benchmark Suite for Intel and Xilinx FPGAs using OpenCL
type: journal_article
user_id: '15278'
year: '2022'
...
---
_id: '27365'
author:
- first_name: Marius
  full_name: Meyer, Marius
  id: '40778'
  last_name: Meyer
citation:
  ama: 'Meyer M. Towards Performance Characterization of FPGAs in Context of HPC using
    OpenCL Benchmarks. In: <i>Proceedings of the 11th International Symposium on Highly
    Efficient Accelerators and Reconfigurable Technologies</i>. 2021. doi:<a href="https://doi.org/10.1145/3468044.3468058">10.1145/3468044.3468058</a>'
  apa: Meyer, M. (2021). Towards Performance Characterization of FPGAs in Context
    of HPC using OpenCL Benchmarks. <i>Proceedings of the 11th International Symposium
    on Highly Efficient Accelerators and Reconfigurable Technologies</i>. <a href="https://doi.org/10.1145/3468044.3468058">https://doi.org/10.1145/3468044.3468058</a>
  bibtex: '@inproceedings{Meyer_2021, title={Towards Performance Characterization
    of FPGAs in Context of HPC using OpenCL Benchmarks}, DOI={<a href="https://doi.org/10.1145/3468044.3468058">10.1145/3468044.3468058</a>},
    booktitle={Proceedings of the 11th International Symposium on Highly Efficient
    Accelerators and Reconfigurable Technologies}, author={Meyer, Marius}, year={2021}
    }'
  chicago: Meyer, Marius. “Towards Performance Characterization of FPGAs in Context
    of HPC Using OpenCL Benchmarks.” In <i>Proceedings of the 11th International Symposium
    on Highly Efficient Accelerators and Reconfigurable Technologies</i>, 2021. <a
    href="https://doi.org/10.1145/3468044.3468058">https://doi.org/10.1145/3468044.3468058</a>.
  ieee: 'M. Meyer, “Towards Performance Characterization of FPGAs in Context of HPC
    using OpenCL Benchmarks,” 2021, doi: <a href="https://doi.org/10.1145/3468044.3468058">10.1145/3468044.3468058</a>.'
  mla: Meyer, Marius. “Towards Performance Characterization of FPGAs in Context of
    HPC Using OpenCL Benchmarks.” <i>Proceedings of the 11th International Symposium
    on Highly Efficient Accelerators and Reconfigurable Technologies</i>, 2021, doi:<a
    href="https://doi.org/10.1145/3468044.3468058">10.1145/3468044.3468058</a>.
  short: 'M. Meyer, in: Proceedings of the 11th International Symposium on Highly
    Efficient Accelerators and Reconfigurable Technologies, 2021.'
date_created: 2021-11-10T14:42:17Z
date_updated: 2022-01-06T06:57:38Z
department:
- _id: '27'
doi: 10.1145/3468044.3468058
language:
- iso: eng
project:
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: Proceedings of the 11th International Symposium on Highly Efficient Accelerators
  and Reconfigurable Technologies
publication_status: published
status: public
title: Towards Performance Characterization of FPGAs in Context of HPC using OpenCL
  Benchmarks
type: conference
user_id: '40778'
year: '2021'
...
---
_id: '21632'
abstract:
- lang: eng
  text: FPGAs have found increasing adoption in data center applications since a new
    generation of high-level tools have become available which noticeably reduce development
    time for FPGA accelerators and still provide high-quality results. There is, however,
    no high-level benchmark suite available, which specifically enables a comparison
    of FPGA architectures, programming tools, and libraries for HPC applications.
    To fill this gap, we have developed an OpenCL-based open-source implementation
    of the HPCC benchmark suite for Xilinx and Intel FPGAs. This benchmark can serve
    to analyze the current capabilities of FPGA devices, cards, and development tool
    flows, track progress over time, and point out specific difficulties for FPGA
    acceleration in the HPC domain. Additionally, the benchmark documents proven performance
    optimization patterns. We will continue optimizing and porting the benchmark for
    new generations of FPGAs and design tools and encourage active participation to
    create a valuable tool for the community.
author:
- first_name: Marius
  full_name: Meyer, Marius
  id: '40778'
  last_name: Meyer
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: 'Meyer M, Kenter T, Plessl C. Evaluating FPGA Accelerator Performance with
    a Parameterized OpenCL Adaptation of Selected Benchmarks of the HPCChallenge Benchmark
    Suite. In: <i>2020 IEEE/ACM International Workshop on Heterogeneous High-Performance
    Reconfigurable Computing (H2RC)</i>. ; 2020. doi:<a href="https://doi.org/10.1109/h2rc51942.2020.00007">10.1109/h2rc51942.2020.00007</a>'
  apa: Meyer, M., Kenter, T., &#38; Plessl, C. (2020). Evaluating FPGA Accelerator
    Performance with a Parameterized OpenCL Adaptation of Selected Benchmarks of the
    HPCChallenge Benchmark Suite. <i>2020 IEEE/ACM International Workshop on Heterogeneous
    High-Performance Reconfigurable Computing (H2RC)</i>. <a href="https://doi.org/10.1109/h2rc51942.2020.00007">https://doi.org/10.1109/h2rc51942.2020.00007</a>
  bibtex: '@inproceedings{Meyer_Kenter_Plessl_2020, title={Evaluating FPGA Accelerator
    Performance with a Parameterized OpenCL Adaptation of Selected Benchmarks of the
    HPCChallenge Benchmark Suite}, DOI={<a href="https://doi.org/10.1109/h2rc51942.2020.00007">10.1109/h2rc51942.2020.00007</a>},
    booktitle={2020 IEEE/ACM International Workshop on Heterogeneous High-performance
    Reconfigurable Computing (H2RC)}, author={Meyer, Marius and Kenter, Tobias and
    Plessl, Christian}, year={2020} }'
  chicago: Meyer, Marius, Tobias Kenter, and Christian Plessl. “Evaluating FPGA Accelerator
    Performance with a Parameterized OpenCL Adaptation of Selected Benchmarks of the
    HPCChallenge Benchmark Suite.” In <i>2020 IEEE/ACM International Workshop on Heterogeneous
    High-Performance Reconfigurable Computing (H2RC)</i>, 2020. <a href="https://doi.org/10.1109/h2rc51942.2020.00007">https://doi.org/10.1109/h2rc51942.2020.00007</a>.
  ieee: 'M. Meyer, T. Kenter, and C. Plessl, “Evaluating FPGA Accelerator Performance
    with a Parameterized OpenCL Adaptation of Selected Benchmarks of the HPCChallenge
    Benchmark Suite,” 2020, doi: <a href="https://doi.org/10.1109/h2rc51942.2020.00007">10.1109/h2rc51942.2020.00007</a>.'
  mla: Meyer, Marius, et al. “Evaluating FPGA Accelerator Performance with a Parameterized
    OpenCL Adaptation of Selected Benchmarks of the HPCChallenge Benchmark Suite.”
    <i>2020 IEEE/ACM International Workshop on Heterogeneous High-Performance Reconfigurable
    Computing (H2RC)</i>, 2020, doi:<a href="https://doi.org/10.1109/h2rc51942.2020.00007">10.1109/h2rc51942.2020.00007</a>.
  short: 'M. Meyer, T. Kenter, C. Plessl, in: 2020 IEEE/ACM International Workshop
    on Heterogeneous High-Performance Reconfigurable Computing (H2RC), 2020.'
date_created: 2021-04-16T10:17:22Z
date_updated: 2023-09-26T11:42:53Z
department:
- _id: '27'
- _id: '518'
doi: 10.1109/h2rc51942.2020.00007
keyword:
- FPGA
- OpenCL
- High Level Synthesis
- HPC benchmarking
language:
- iso: eng
main_file_link:
- url: https://ieeexplore.ieee.org/document/9306963
project:
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: 2020 IEEE/ACM International Workshop on Heterogeneous High-performance
  Reconfigurable Computing (H2RC)
publication_identifier:
  isbn:
  - '9781665415927'
publication_status: published
quality_controlled: '1'
related_material:
  link:
  - description: Official repository of the benchmark suite on GitHub
    relation: supplementary_material
    url: https://github.com/pc2/HPCC_FPGA
status: public
title: Evaluating FPGA Accelerator Performance with a Parameterized OpenCL Adaptation
  of Selected Benchmarks of the HPCChallenge Benchmark Suite
type: conference
user_id: '15278'
year: '2020'
...
