---
_id: '16277'
abstract:
- lang: eng
  text: CP2K is an open source electronic structure and molecular dynamics software
    package to perform atomistic simulations of solid-state, liquid, molecular, and
    biological systems. It is especially aimed at massively parallel and linear-scaling
    electronic structure methods and state-of-the-art ab initio molecular dynamics
    simulations. Excellent performance for electronic structure calculations is achieved
    using novel algorithms implemented for modern high-performance computing systems.
    This review revisits the main capabilities of CP2K to perform efficient and accurate
    electronic structure simulations. The emphasis is put on density functional theory
    and multiple post–Hartree–Fock methods using the Gaussian and plane wave approach
    and its augmented all-electron extension.
article_number: '194103'
author:
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
- first_name: Marcella
  full_name: Iannuzzi, Marcella
  last_name: Iannuzzi
- first_name: Mauro
  full_name: Del Ben, Mauro
  last_name: Del Ben
- first_name: Vladimir V.
  full_name: Rybkin, Vladimir V.
  last_name: Rybkin
- first_name: Patrick
  full_name: Seewald, Patrick
  last_name: Seewald
- first_name: Frederick
  full_name: Stein, Frederick
  last_name: Stein
- first_name: Teodoro
  full_name: Laino, Teodoro
  last_name: Laino
- first_name: Rustam Z.
  full_name: Khaliullin, Rustam Z.
  last_name: Khaliullin
- first_name: Ole
  full_name: Schütt, Ole
  last_name: Schütt
- first_name: Florian
  full_name: Schiffmann, Florian
  last_name: Schiffmann
- first_name: Dorothea
  full_name: Golze, Dorothea
  last_name: Golze
- first_name: Jan
  full_name: Wilhelm, Jan
  last_name: Wilhelm
- first_name: Sergey
  full_name: Chulkov, Sergey
  last_name: Chulkov
- first_name: Mohammad Hossein
  full_name: Bani-Hashemian, Mohammad Hossein
  last_name: Bani-Hashemian
- first_name: Valéry
  full_name: Weber, Valéry
  last_name: Weber
- first_name: Urban
  full_name: Borstnik, Urban
  last_name: Borstnik
- first_name: Mathieu
  full_name: Taillefumier, Mathieu
  last_name: Taillefumier
- first_name: Alice Shoshana
  full_name: Jakobovits, Alice Shoshana
  last_name: Jakobovits
- first_name: Alfio
  full_name: Lazzaro, Alfio
  last_name: Lazzaro
- first_name: Hans
  full_name: Pabst, Hans
  last_name: Pabst
- first_name: Tiziano
  full_name: Müller, Tiziano
  last_name: Müller
- first_name: Robert
  full_name: Schade, Robert
  id: '75963'
  last_name: Schade
  orcid: 0000-0002-6268-539
- first_name: Manuel
  full_name: Guidon, Manuel
  last_name: Guidon
- first_name: Samuel
  full_name: Andermatt, Samuel
  last_name: Andermatt
- first_name: Nico
  full_name: Holmberg, Nico
  last_name: Holmberg
- first_name: Gregory K.
  full_name: Schenter, Gregory K.
  last_name: Schenter
- first_name: Anna
  full_name: Hehn, Anna
  last_name: Hehn
- first_name: Augustin
  full_name: Bussy, Augustin
  last_name: Bussy
- first_name: Fabian
  full_name: Belleflamme, Fabian
  last_name: Belleflamme
- first_name: Gloria
  full_name: Tabacchi, Gloria
  last_name: Tabacchi
- first_name: Andreas
  full_name: Glöß, Andreas
  last_name: Glöß
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Iain
  full_name: Bethune, Iain
  last_name: Bethune
- first_name: Christopher J.
  full_name: Mundy, Christopher J.
  last_name: Mundy
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
- first_name: Matt
  full_name: Watkins, Matt
  last_name: Watkins
- first_name: Joost
  full_name: VandeVondele, Joost
  last_name: VandeVondele
- first_name: Matthias
  full_name: Krack, Matthias
  last_name: Krack
- first_name: Jürg
  full_name: Hutter, Jürg
  last_name: Hutter
citation:
  ama: 'Kühne T, Iannuzzi M, Ben MD, et al. CP2K: An electronic structure and molecular
    dynamics software package - Quickstep: Efficient and accurate electronic structure
    calculations. <i>The Journal of Chemical Physics</i>. 2020;152(19). doi:<a href="https://doi.org/10.1063/5.0007045">10.1063/5.0007045</a>'
  apa: 'Kühne, T., Iannuzzi, M., Ben, M. D., Rybkin, V. V., Seewald, P., Stein, F.,
    Laino, T., Khaliullin, R. Z., Schütt, O., Schiffmann, F., Golze, D., Wilhelm,
    J., Chulkov, S., Mohammad Hossein Bani-Hashemian, M. H. B.-H., Weber, V., Borstnik,
    U., Taillefumier, M., Jakobovits, A. S., Lazzaro, A., … Hutter, J. (2020). CP2K:
    An electronic structure and molecular dynamics software package - Quickstep: Efficient
    and accurate electronic structure calculations. <i>The Journal of Chemical Physics</i>,
    <i>152</i>(19), Article 194103. <a href="https://doi.org/10.1063/5.0007045">https://doi.org/10.1063/5.0007045</a>'
  bibtex: '@article{Kühne_Iannuzzi_Ben_Rybkin_Seewald_Stein_Laino_Khaliullin_Schütt_Schiffmann_et
    al._2020, title={CP2K: An electronic structure and molecular dynamics software
    package - Quickstep: Efficient and accurate electronic structure calculations},
    volume={152}, DOI={<a href="https://doi.org/10.1063/5.0007045">10.1063/5.0007045</a>},
    number={19}, journal={The Journal of Chemical Physics}, author={Kühne, Thomas
    and Iannuzzi, Marcella and Ben, Mauro Del and Rybkin, Vladimir V. and Seewald,
    Patrick and Stein, Frederick and Laino, Teodoro and Khaliullin, Rustam Z. and
    Schütt, Ole and Schiffmann, Florian and et al.}, year={2020} }'
  chicago: 'Kühne, Thomas, Marcella Iannuzzi, Mauro Del Ben, Vladimir V. Rybkin, Patrick
    Seewald, Frederick Stein, Teodoro Laino, et al. “CP2K: An Electronic Structure
    and Molecular Dynamics Software Package - Quickstep: Efficient and Accurate Electronic
    Structure Calculations.” <i>The Journal of Chemical Physics</i> 152, no. 19 (2020).
    <a href="https://doi.org/10.1063/5.0007045">https://doi.org/10.1063/5.0007045</a>.'
  ieee: 'T. Kühne <i>et al.</i>, “CP2K: An electronic structure and molecular dynamics
    software package - Quickstep: Efficient and accurate electronic structure calculations,”
    <i>The Journal of Chemical Physics</i>, vol. 152, no. 19, Art. no. 194103, 2020,
    doi: <a href="https://doi.org/10.1063/5.0007045">10.1063/5.0007045</a>.'
  mla: 'Kühne, Thomas, et al. “CP2K: An Electronic Structure and Molecular Dynamics
    Software Package - Quickstep: Efficient and Accurate Electronic Structure Calculations.”
    <i>The Journal of Chemical Physics</i>, vol. 152, no. 19, 194103, 2020, doi:<a
    href="https://doi.org/10.1063/5.0007045">10.1063/5.0007045</a>.'
  short: T. Kühne, M. Iannuzzi, M.D. Ben, V.V. Rybkin, P. Seewald, F. Stein, T. Laino,
    R.Z. Khaliullin, O. Schütt, F. Schiffmann, D. Golze, J. Wilhelm, S. Chulkov, M.H.B.-H.
    Mohammad Hossein Bani-Hashemian, V. Weber, U. Borstnik, M. Taillefumier, A.S.
    Jakobovits, A. Lazzaro, H. Pabst, T. Müller, R. Schade, M. Guidon, S. Andermatt,
    N. Holmberg, G.K. Schenter, A. Hehn, A. Bussy, F. Belleflamme, G. Tabacchi, A.
    Glöß, M. Lass, I. Bethune, C.J. Mundy, C. Plessl, M. Watkins, J. VandeVondele,
    M. Krack, J. Hutter, The Journal of Chemical Physics 152 (2020).
date_created: 2020-03-10T15:12:31Z
date_updated: 2023-08-02T14:56:21Z
ddc:
- '540'
department:
- _id: '27'
- _id: '518'
- _id: '304'
doi: 10.1063/5.0007045
external_id:
  arxiv:
  - '2003.03868'
file:
- access_level: closed
  content_type: application/pdf
  creator: lass
  date_created: 2020-05-25T15:21:56Z
  date_updated: 2020-05-25T15:21:56Z
  file_id: '17061'
  file_name: 5.0007045.pdf
  file_size: 4887650
  relation: main_file
  success: 1
file_date_updated: 2020-05-25T15:21:56Z
has_accepted_license: '1'
intvolume: '       152'
issue: '19'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://aip.scitation.org/doi/pdf/10.1063/5.0007045?download=true
oa: '1'
project:
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
- _id: '32'
  grant_number: PL 595/2-1 / 320898746
  name: Performance and Efficiency in HPC with Custom Computing
publication: The Journal of Chemical Physics
publication_status: published
quality_controlled: '1'
status: public
title: 'CP2K: An electronic structure and molecular dynamics software package - Quickstep:
  Efficient and accurate electronic structure calculations'
type: journal_article
user_id: '75963'
volume: 152
year: '2020'
...
---
_id: '16898'
abstract:
- lang: eng
  text: "Electronic structure calculations based on density-functional theory (DFT)\r\nrepresent
    a significant part of today's HPC workloads and pose high demands on\r\nhigh-performance
    computing resources. To perform these quantum-mechanical DFT\r\ncalculations on
    complex large-scale systems, so-called linear scaling methods\r\ninstead of conventional
    cubic scaling methods are required. In this work, we\r\ntake up the idea of the
    submatrix method and apply it to the DFT computations\r\nin the software package
    CP2K. For that purpose, we transform the underlying\r\nnumeric operations on distributed,
    large, sparse matrices into computations on\r\nlocal, much smaller and nearly
    dense matrices. This allows us to exploit the\r\nfull floating-point performance
    of modern CPUs and to make use of dedicated\r\naccelerator hardware, where performance
    has been limited by memory bandwidth\r\nbefore. We demonstrate both functionality
    and performance of our implementation\r\nand show how it can be accelerated with
    GPUs and FPGAs."
author:
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Robert
  full_name: Schade, Robert
  id: '75963'
  last_name: Schade
  orcid: 0000-0002-6268-539
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: 'Lass M, Schade R, Kühne T, Plessl C. A Submatrix-Based Method for Approximate
    Matrix Function Evaluation in the Quantum Chemistry Code CP2K. In: <i>Proc. International
    Conference for High Performance Computing, Networking, Storage and Analysis (SC)</i>.
    IEEE Computer Society; 2020:1127-1140. doi:<a href="https://doi.org/10.1109/SC41405.2020.00084">10.1109/SC41405.2020.00084</a>'
  apa: Lass, M., Schade, R., Kühne, T., &#38; Plessl, C. (2020). A Submatrix-Based
    Method for Approximate Matrix Function Evaluation in the Quantum Chemistry Code
    CP2K. <i>Proc. International Conference for High Performance Computing, Networking,
    Storage and Analysis (SC)</i>, 1127–1140. <a href="https://doi.org/10.1109/SC41405.2020.00084">https://doi.org/10.1109/SC41405.2020.00084</a>
  bibtex: '@inproceedings{Lass_Schade_Kühne_Plessl_2020, place={Los Alamitos, CA,
    USA}, title={A Submatrix-Based Method for Approximate Matrix Function Evaluation
    in the Quantum Chemistry Code CP2K}, DOI={<a href="https://doi.org/10.1109/SC41405.2020.00084">10.1109/SC41405.2020.00084</a>},
    booktitle={Proc. International Conference for High Performance Computing, Networking,
    Storage and Analysis (SC)}, publisher={IEEE Computer Society}, author={Lass, Michael
    and Schade, Robert and Kühne, Thomas and Plessl, Christian}, year={2020}, pages={1127–1140}
    }'
  chicago: 'Lass, Michael, Robert Schade, Thomas Kühne, and Christian Plessl. “A Submatrix-Based
    Method for Approximate Matrix Function Evaluation in the Quantum Chemistry Code
    CP2K.” In <i>Proc. International Conference for High Performance Computing, Networking,
    Storage and Analysis (SC)</i>, 1127–40. Los Alamitos, CA, USA: IEEE Computer Society,
    2020. <a href="https://doi.org/10.1109/SC41405.2020.00084">https://doi.org/10.1109/SC41405.2020.00084</a>.'
  ieee: 'M. Lass, R. Schade, T. Kühne, and C. Plessl, “A Submatrix-Based Method for
    Approximate Matrix Function Evaluation in the Quantum Chemistry Code CP2K,” in
    <i>Proc. International Conference for High Performance Computing, Networking,
    Storage and Analysis (SC)</i>, Atlanta, GA, US, 2020, pp. 1127–1140, doi: <a href="https://doi.org/10.1109/SC41405.2020.00084">10.1109/SC41405.2020.00084</a>.'
  mla: Lass, Michael, et al. “A Submatrix-Based Method for Approximate Matrix Function
    Evaluation in the Quantum Chemistry Code CP2K.” <i>Proc. International Conference
    for High Performance Computing, Networking, Storage and Analysis (SC)</i>, IEEE
    Computer Society, 2020, pp. 1127–40, doi:<a href="https://doi.org/10.1109/SC41405.2020.00084">10.1109/SC41405.2020.00084</a>.
  short: 'M. Lass, R. Schade, T. Kühne, C. Plessl, in: Proc. International Conference
    for High Performance Computing, Networking, Storage and Analysis (SC), IEEE Computer
    Society, Los Alamitos, CA, USA, 2020, pp. 1127–1140.'
conference:
  location: Atlanta, GA, US
  name: 'SC20: International Conference for High Performance Computing, Networking,
    Storage and Analysis (SC)'
date_created: 2020-04-28T14:44:21Z
date_updated: 2023-08-02T14:55:59Z
department:
- _id: '27'
- _id: '518'
- _id: '304'
doi: 10.1109/SC41405.2020.00084
external_id:
  arxiv:
  - '2004.10811'
language:
- iso: eng
main_file_link:
- url: https://ieeexplore.ieee.org/document/9355245
page: 1127-1140
place: Los Alamitos, CA, USA
project:
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
- _id: '32'
  grant_number: PL 595/2-1 / 320898746
  name: Performance and Efficiency in HPC with Custom Computing
publication: Proc. International Conference for High Performance Computing, Networking,
  Storage and Analysis (SC)
publisher: IEEE Computer Society
quality_controlled: '1'
status: public
title: A Submatrix-Based Method for Approximate Matrix Function Evaluation in the
  Quantum Chemistry Code CP2K
type: conference
user_id: '75963'
year: '2020'
...
---
_id: '12878'
abstract:
- lang: eng
  text: In scientific computing, the acceleration of atomistic computer simulations
    by means of custom hardware is finding ever-growing application. A major limitation,
    however, is that the high efficiency in terms of performance and low power consumption
    entails the massive usage of low precision computing units. Here, based on the
    approximate computing paradigm, we present an algorithmic method to compensate
    for numerical inaccuracies due to low accuracy arithmetic operations rigorously,
    yet still obtaining exact expectation values using a properly modified Langevin-type
    equation.
article_number: '39'
author:
- first_name: Varadarajan
  full_name: Rengaraj, Varadarajan
  last_name: Rengaraj
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
citation:
  ama: Rengaraj V, Lass M, Plessl C, Kühne T. Accurate Sampling with Noisy Forces
    from Approximate Computing. <i>Computation</i>. 2020;8(2). doi:<a href="https://doi.org/10.3390/computation8020039">10.3390/computation8020039</a>
  apa: Rengaraj, V., Lass, M., Plessl, C., &#38; Kühne, T. (2020). Accurate Sampling
    with Noisy Forces from Approximate Computing. <i>Computation</i>, <i>8</i>(2),
    Article 39. <a href="https://doi.org/10.3390/computation8020039">https://doi.org/10.3390/computation8020039</a>
  bibtex: '@article{Rengaraj_Lass_Plessl_Kühne_2020, title={Accurate Sampling with
    Noisy Forces from Approximate Computing}, volume={8}, DOI={<a href="https://doi.org/10.3390/computation8020039">10.3390/computation8020039</a>},
    number={2}, journal={Computation}, publisher={MDPI}, author={Rengaraj, Varadarajan
    and Lass, Michael and Plessl, Christian and Kühne, Thomas}, year={2020} }'
  chicago: Rengaraj, Varadarajan, Michael Lass, Christian Plessl, and Thomas Kühne.
    “Accurate Sampling with Noisy Forces from Approximate Computing.” <i>Computation</i>
    8, no. 2 (2020). <a href="https://doi.org/10.3390/computation8020039">https://doi.org/10.3390/computation8020039</a>.
  ieee: 'V. Rengaraj, M. Lass, C. Plessl, and T. Kühne, “Accurate Sampling with Noisy
    Forces from Approximate Computing,” <i>Computation</i>, vol. 8, no. 2, Art. no.
    39, 2020, doi: <a href="https://doi.org/10.3390/computation8020039">10.3390/computation8020039</a>.'
  mla: Rengaraj, Varadarajan, et al. “Accurate Sampling with Noisy Forces from Approximate
    Computing.” <i>Computation</i>, vol. 8, no. 2, 39, MDPI, 2020, doi:<a href="https://doi.org/10.3390/computation8020039">10.3390/computation8020039</a>.
  short: V. Rengaraj, M. Lass, C. Plessl, T. Kühne, Computation 8 (2020).
date_created: 2019-07-23T12:03:07Z
date_updated: 2023-09-26T11:43:52Z
department:
- _id: '27'
- _id: '518'
- _id: '304'
doi: 10.3390/computation8020039
external_id:
  arxiv:
  - '1907.08497'
intvolume: '         8'
issue: '2'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://www.mdpi.com/2079-3197/8/2/39/pdf
oa: '1'
project:
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
- _id: '32'
  grant_number: PL 595/2-1 / 320898746
  name: Performance and Efficiency in HPC with Custom Computing
publication: Computation
publisher: MDPI
quality_controlled: '1'
status: public
title: Accurate Sampling with Noisy Forces from Approximate Computing
type: journal_article
user_id: '15278'
volume: 8
year: '2020'
...
---
_id: '15478'
abstract:
- lang: eng
  text: Stratix 10 FPGA cards have a good potential for the acceleration of HPC workloads
    since the Stratix 10 product line introduces devices with a large number of DSP
    and memory blocks. The high level synthesis of OpenCL codes can play a fundamental
    role for FPGAs in HPC, because it allows implementing different designs with lower
    development effort compared to hand optimized HDL. However, Stratix 10 cards are
    still hard to fully exploit using the Intel FPGA SDK for OpenCL. The implementation
    of designs with thousands of concurrent arithmetic operations often suffers from
    place and route problems that limit the maximum frequency or entirely prevent
    a successful synthesis. In order to overcome these issues for the implementation
    of the matrix multiplication, we formulate Cannon's matrix multiplication algorithm
    with regard to its efficient synthesis within the FPGA logic. We obtain a two-level
    block algorithm, where the lower level sub-matrices are multiplied using our Cannon's
    algorithm implementation. Following this design approach with multiple compute
    units, we are able to get maximum frequencies close to and above 300 MHz with
    high utilization of DSP and memory blocks. This allows for performance results
    above 1 TeraFLOPS.
author:
- first_name: Paolo
  full_name: Gorlani, Paolo
  id: '72045'
  last_name: Gorlani
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: 'Gorlani P, Kenter T, Plessl C. OpenCL Implementation of Cannon’s Matrix Multiplication
    Algorithm on Intel Stratix 10 FPGAs. In: <i>Proceedings of the International Conference
    on Field-Programmable Technology (FPT)</i>. IEEE; 2019. doi:<a href="https://doi.org/10.1109/ICFPT47387.2019.00020">10.1109/ICFPT47387.2019.00020</a>'
  apa: Gorlani, P., Kenter, T., &#38; Plessl, C. (2019). OpenCL Implementation of
    Cannon’s Matrix Multiplication Algorithm on Intel Stratix 10 FPGAs. In <i>Proceedings
    of the International Conference on Field-Programmable Technology (FPT)</i>. IEEE.
    <a href="https://doi.org/10.1109/ICFPT47387.2019.00020">https://doi.org/10.1109/ICFPT47387.2019.00020</a>
  bibtex: '@inproceedings{Gorlani_Kenter_Plessl_2019, title={OpenCL Implementation
    of Cannon’s Matrix Multiplication Algorithm on Intel Stratix 10 FPGAs}, DOI={<a
    href="https://doi.org/10.1109/ICFPT47387.2019.00020">10.1109/ICFPT47387.2019.00020</a>},
    booktitle={Proceedings of the International Conference on Field-Programmable Technology
    (FPT)}, publisher={IEEE}, author={Gorlani, Paolo and Kenter, Tobias and Plessl,
    Christian}, year={2019} }'
  chicago: Gorlani, Paolo, Tobias Kenter, and Christian Plessl. “OpenCL Implementation
    of Cannon’s Matrix Multiplication Algorithm on Intel Stratix 10 FPGAs.” In <i>Proceedings
    of the International Conference on Field-Programmable Technology (FPT)</i>. IEEE,
    2019. <a href="https://doi.org/10.1109/ICFPT47387.2019.00020">https://doi.org/10.1109/ICFPT47387.2019.00020</a>.
  ieee: P. Gorlani, T. Kenter, and C. Plessl, “OpenCL Implementation of Cannon’s Matrix
    Multiplication Algorithm on Intel Stratix 10 FPGAs,” in <i>Proceedings of the
    International Conference on Field-Programmable Technology (FPT)</i>, 2019.
  mla: Gorlani, Paolo, et al. “OpenCL Implementation of Cannon’s Matrix Multiplication
    Algorithm on Intel Stratix 10 FPGAs.” <i>Proceedings of the International Conference
    on Field-Programmable Technology (FPT)</i>, IEEE, 2019, doi:<a href="https://doi.org/10.1109/ICFPT47387.2019.00020">10.1109/ICFPT47387.2019.00020</a>.
  short: 'P. Gorlani, T. Kenter, C. Plessl, in: Proceedings of the International Conference
    on Field-Programmable Technology (FPT), IEEE, 2019.'
conference:
  name: International Conference on Field-Programmable Technology (FPT)
date_created: 2020-01-09T12:54:48Z
date_updated: 2022-01-06T06:52:26Z
ddc:
- '004'
department:
- _id: '27'
- _id: '518'
doi: 10.1109/ICFPT47387.2019.00020
file:
- access_level: closed
  content_type: application/pdf
  creator: plessl
  date_created: 2020-01-09T12:53:57Z
  date_updated: 2020-01-09T12:53:57Z
  file_id: '15479'
  file_name: gorlani19_fpt.pdf
  file_size: 250559
  relation: main_file
  success: 1
file_date_updated: 2020-01-09T12:53:57Z
has_accepted_license: '1'
language:
- iso: eng
project:
- _id: '33'
  grant_number: 01IH16005
  name: HighPerMeshes
- _id: '32'
  grant_number: PL 595/2-1
  name: Performance and Efficiency in HPC with Custom Computing
publication: Proceedings of the International Conference on Field-Programmable Technology
  (FPT)
publisher: IEEE
quality_controlled: '1'
status: public
title: OpenCL Implementation of Cannon's Matrix Multiplication Algorithm on Intel
  Stratix 10 FPGAs
type: conference
user_id: '3145'
year: '2019'
...
---
_id: '21'
abstract:
- lang: eng
  text: "We address the general mathematical problem of computing the inverse p-th\r\nroot
    of a given matrix in an efficient way. A new method to construct iteration\r\nfunctions
    that allow calculating arbitrary p-th roots and their inverses of\r\nsymmetric
    positive definite matrices is presented. We show that the order of\r\nconvergence
    is at least quadratic and that adaptively adjusting a parameter q\r\nalways leads
    to an even faster convergence. In this way, a better performance\r\nthan with
    previously known iteration schemes is achieved. The efficiency of the\r\niterative
    functions is demonstrated for various matrices with different\r\ndensities, condition
    numbers and spectral radii."
author:
- first_name: Dorothee
  full_name: Richters, Dorothee
  last_name: Richters
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Andrea
  full_name: Walther, Andrea
  last_name: Walther
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
citation:
  ama: Richters D, Lass M, Walther A, Plessl C, Kühne T. A General Algorithm to Calculate
    the Inverse Principal p-th Root of Symmetric Positive Definite Matrices. <i>Communications
    in Computational Physics</i>. 2019;25(2):564-585. doi:<a href="https://doi.org/10.4208/cicp.OA-2018-0053">10.4208/cicp.OA-2018-0053</a>
  apa: Richters, D., Lass, M., Walther, A., Plessl, C., &#38; Kühne, T. (2019). A
    General Algorithm to Calculate the Inverse Principal p-th Root of Symmetric Positive
    Definite Matrices. <i>Communications in Computational Physics</i>, <i>25</i>(2),
    564–585. <a href="https://doi.org/10.4208/cicp.OA-2018-0053">https://doi.org/10.4208/cicp.OA-2018-0053</a>
  bibtex: '@article{Richters_Lass_Walther_Plessl_Kühne_2019, title={A General Algorithm
    to Calculate the Inverse Principal p-th Root of Symmetric Positive Definite Matrices},
    volume={25}, DOI={<a href="https://doi.org/10.4208/cicp.OA-2018-0053">10.4208/cicp.OA-2018-0053</a>},
    number={2}, journal={Communications in Computational Physics}, publisher={Global
    Science Press}, author={Richters, Dorothee and Lass, Michael and Walther, Andrea
    and Plessl, Christian and Kühne, Thomas}, year={2019}, pages={564–585} }'
  chicago: 'Richters, Dorothee, Michael Lass, Andrea Walther, Christian Plessl, and
    Thomas Kühne. “A General Algorithm to Calculate the Inverse Principal P-Th Root
    of Symmetric Positive Definite Matrices.” <i>Communications in Computational Physics</i>
    25, no. 2 (2019): 564–85. <a href="https://doi.org/10.4208/cicp.OA-2018-0053">https://doi.org/10.4208/cicp.OA-2018-0053</a>.'
  ieee: 'D. Richters, M. Lass, A. Walther, C. Plessl, and T. Kühne, “A General Algorithm
    to Calculate the Inverse Principal p-th Root of Symmetric Positive Definite Matrices,”
    <i>Communications in Computational Physics</i>, vol. 25, no. 2, pp. 564–585, 2019,
    doi: <a href="https://doi.org/10.4208/cicp.OA-2018-0053">10.4208/cicp.OA-2018-0053</a>.'
  mla: Richters, Dorothee, et al. “A General Algorithm to Calculate the Inverse Principal
    P-Th Root of Symmetric Positive Definite Matrices.” <i>Communications in Computational
    Physics</i>, vol. 25, no. 2, Global Science Press, 2019, pp. 564–85, doi:<a href="https://doi.org/10.4208/cicp.OA-2018-0053">10.4208/cicp.OA-2018-0053</a>.
  short: D. Richters, M. Lass, A. Walther, C. Plessl, T. Kühne, Communications in
    Computational Physics 25 (2019) 564–585.
date_created: 2017-07-25T14:48:26Z
date_updated: 2023-09-26T11:45:02Z
department:
- _id: '27'
- _id: '518'
- _id: '304'
- _id: '104'
doi: 10.4208/cicp.OA-2018-0053
external_id:
  arxiv:
  - '1703.02456'
intvolume: '        25'
issue: '2'
language:
- iso: eng
page: 564-585
project:
- _id: '32'
  grant_number: PL 595/2-1 / 320898746
  name: Performance and Efficiency in HPC with Custom Computing
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: Communications in Computational Physics
publisher: Global Science Press
quality_controlled: '1'
status: public
title: A General Algorithm to Calculate the Inverse Principal p-th Root of Symmetric
  Positive Definite Matrices
type: journal_article
user_id: '15278'
volume: 25
year: '2019'
...
---
_id: '20'
abstract:
- lang: eng
  text: "Approximate computing has been shown to provide new ways to improve performance\r\nand
    power consumption of error-resilient applications. While many of these\r\napplications
    can be found in image processing, data classification or machine\r\nlearning,
    we demonstrate its suitability to a problem from scientific\r\ncomputing. Utilizing
    the self-correcting behavior of iterative algorithms, we\r\nshow that approximate
    computing can be applied to the calculation of inverse\r\nmatrix p-th roots which
    are required in many applications in scientific\r\ncomputing. Results show great
    opportunities to reduce the computational effort\r\nand bandwidth required for
    the execution of the discussed algorithm, especially\r\nwhen targeting special
    accelerator hardware."
author:
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: Lass M, Kühne T, Plessl C. Using Approximate Computing for the Calculation
    of Inverse Matrix p-th Roots. <i>Embedded Systems Letters</i>. 2018;10(2):33-36.
    doi:<a href="https://doi.org/10.1109/LES.2017.2760923">10.1109/LES.2017.2760923</a>
  apa: Lass, M., Kühne, T., &#38; Plessl, C. (2018). Using Approximate Computing for
    the Calculation of Inverse Matrix p-th Roots. <i>Embedded Systems Letters</i>,
    <i>10</i>(2), 33–36. <a href="https://doi.org/10.1109/LES.2017.2760923">https://doi.org/10.1109/LES.2017.2760923</a>
  bibtex: '@article{Lass_Kühne_Plessl_2018, title={Using Approximate Computing for
    the Calculation of Inverse Matrix p-th Roots}, volume={10}, DOI={<a href="https://doi.org/10.1109/LES.2017.2760923">10.1109/LES.2017.2760923</a>},
    number={2}, journal={Embedded Systems Letters}, publisher={IEEE}, author={Lass,
    Michael and Kühne, Thomas and Plessl, Christian}, year={2018}, pages={33–36} }'
  chicago: 'Lass, Michael, Thomas Kühne, and Christian Plessl. “Using Approximate
    Computing for the Calculation of Inverse Matrix P-Th Roots.” <i>Embedded Systems
    Letters</i> 10, no. 2 (2018): 33–36. <a href="https://doi.org/10.1109/LES.2017.2760923">https://doi.org/10.1109/LES.2017.2760923</a>.'
  ieee: M. Lass, T. Kühne, and C. Plessl, “Using Approximate Computing for the Calculation
    of Inverse Matrix p-th Roots,” <i>Embedded Systems Letters</i>, vol. 10, no. 2,
    pp. 33–36, 2018.
  mla: Lass, Michael, et al. “Using Approximate Computing for the Calculation of Inverse
    Matrix P-Th Roots.” <i>Embedded Systems Letters</i>, vol. 10, no. 2, IEEE, 2018,
    pp. 33–36, doi:<a href="https://doi.org/10.1109/LES.2017.2760923">10.1109/LES.2017.2760923</a>.
  short: M. Lass, T. Kühne, C. Plessl, Embedded Systems Letters 10 (2018) 33–36.
date_created: 2017-07-25T14:41:08Z
date_updated: 2022-01-06T06:54:18Z
department:
- _id: '27'
- _id: '518'
- _id: '304'
doi: 10.1109/LES.2017.2760923
external_id:
  arxiv:
  - '1703.02283'
intvolume: '        10'
issue: '2'
language:
- iso: eng
page: 33-36
project:
- _id: '32'
  grant_number: PL 595/2-1
  name: Performance and Efficiency in HPC with Custom Computing
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: Embedded Systems Letters
publication_identifier:
  eissn:
  - 1943-0671
  issn:
  - 1943-0663
publication_status: published
publisher: IEEE
status: public
title: Using Approximate Computing for the Calculation of Inverse Matrix p-th Roots
type: journal_article
user_id: '16153'
volume: 10
year: '2018'
...
---
_id: '1590'
abstract:
- lang: eng
  text: "We present the submatrix method, a highly parallelizable method for the approximate
    calculation of inverse p-th roots of large sparse symmetric matrices which are
    required in different scientific applications. Following the idea of Approximate
    Computing, we allow imprecision in the final result in order to utilize the sparsity
    of the input matrix and to allow massively parallel execution. For an n x n matrix,
    the proposed algorithm allows distributing the calculations over n nodes with
    only little communication overhead. The result matrix exhibits the same sparsity
    pattern as the input matrix, allowing for efficient reuse of allocated data structures.\r\n\r\nWe
    evaluate the algorithm with respect to the error that it introduces into calculated
    results, as well as its performance and scalability. We demonstrate that the error
    is relatively limited for well-conditioned matrices and that results are still
    valuable for error-resilient applications like preconditioning even for ill-conditioned
    matrices. We discuss the execution time and scaling of the algorithm on a theoretical
    level and present a distributed implementation of the algorithm using MPI and
    OpenMP. We demonstrate the scalability of this implementation by running it on
    a high-performance compute cluster comprised of 1024 CPU cores, showing a speedup
    of 665x compared to single-threaded execution."
author:
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Stephan
  full_name: Mohr, Stephan
  last_name: Mohr
- first_name: Hendrik
  full_name: Wiebeler, Hendrik
  last_name: Wiebeler
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: 'Lass M, Mohr S, Wiebeler H, Kühne T, Plessl C. A Massively Parallel Algorithm
    for the Approximate Calculation of Inverse p-th Roots of Large Sparse Matrices.
    In: <i>Proc. Platform for Advanced Scientific Computing (PASC) Conference</i>.
    ACM; 2018. doi:<a href="https://doi.org/10.1145/3218176.3218231">10.1145/3218176.3218231</a>'
  apa: Lass, M., Mohr, S., Wiebeler, H., Kühne, T., &#38; Plessl, C. (2018). A Massively
    Parallel Algorithm for the Approximate Calculation of Inverse p-th Roots of Large
    Sparse Matrices. <i>Proc. Platform for Advanced Scientific Computing (PASC) Conference</i>.
    Platform for Advanced Scientific Computing Conference (PASC), Basel, Switzerland.
    <a href="https://doi.org/10.1145/3218176.3218231">https://doi.org/10.1145/3218176.3218231</a>
  bibtex: '@inproceedings{Lass_Mohr_Wiebeler_Kühne_Plessl_2018, place={New York, NY,
    USA}, title={A Massively Parallel Algorithm for the Approximate Calculation of
    Inverse p-th Roots of Large Sparse Matrices}, DOI={<a href="https://doi.org/10.1145/3218176.3218231">10.1145/3218176.3218231</a>},
    booktitle={Proc. Platform for Advanced Scientific Computing (PASC) Conference},
    publisher={ACM}, author={Lass, Michael and Mohr, Stephan and Wiebeler, Hendrik
    and Kühne, Thomas and Plessl, Christian}, year={2018} }'
  chicago: 'Lass, Michael, Stephan Mohr, Hendrik Wiebeler, Thomas Kühne, and Christian
    Plessl. “A Massively Parallel Algorithm for the Approximate Calculation of Inverse
    P-Th Roots of Large Sparse Matrices.” In <i>Proc. Platform for Advanced Scientific
    Computing (PASC) Conference</i>. New York, NY, USA: ACM, 2018. <a href="https://doi.org/10.1145/3218176.3218231">https://doi.org/10.1145/3218176.3218231</a>.'
  ieee: 'M. Lass, S. Mohr, H. Wiebeler, T. Kühne, and C. Plessl, “A Massively Parallel
    Algorithm for the Approximate Calculation of Inverse p-th Roots of Large Sparse
    Matrices,” presented at the Platform for Advanced Scientific Computing Conference
    (PASC), Basel, Switzerland, 2018, doi: <a href="https://doi.org/10.1145/3218176.3218231">10.1145/3218176.3218231</a>.'
  mla: Lass, Michael, et al. “A Massively Parallel Algorithm for the Approximate Calculation
    of Inverse P-Th Roots of Large Sparse Matrices.” <i>Proc. Platform for Advanced
    Scientific Computing (PASC) Conference</i>, ACM, 2018, doi:<a href="https://doi.org/10.1145/3218176.3218231">10.1145/3218176.3218231</a>.
  short: 'M. Lass, S. Mohr, H. Wiebeler, T. Kühne, C. Plessl, in: Proc. Platform for
    Advanced Scientific Computing (PASC) Conference, ACM, New York, NY, USA, 2018.'
conference:
  end_date: 2018-07-04
  location: Basel, Switzerland
  name: Platform for Advanced Scientific Computing Conference (PASC)
  start_date: 2018-07-02
date_created: 2018-03-22T10:53:01Z
date_updated: 2023-09-26T11:48:12Z
department:
- _id: '27'
- _id: '518'
- _id: '304'
doi: 10.1145/3218176.3218231
external_id:
  arxiv:
  - '1710.10899'
keyword:
- approximate computing
- linear algebra
- matrix inversion
- matrix p-th roots
- numeric algorithm
- parallel computing
language:
- iso: eng
place: New York, NY, USA
project:
- _id: '32'
  grant_number: PL 595/2-1 / 320898746
  name: Performance and Efficiency in HPC with Custom Computing
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: Proc. Platform for Advanced Scientific Computing (PASC) Conference
publication_identifier:
  isbn:
  - 978-1-4503-5891-0
publisher: ACM
quality_controlled: '1'
status: public
title: A Massively Parallel Algorithm for the Approximate Calculation of Inverse p-th
  Roots of Large Sparse Matrices
type: conference
user_id: '15278'
year: '2018'
...
---
_id: '1592'
abstract:
- lang: eng
  text: Compared to classical HDL designs, generating FPGA designs with high-level synthesis
    from an OpenCL specification promises easier exploration of different design alternatives
    and, through ready-to-use infrastructure and common abstractions for host and
    memory interfaces, easier portability between different FPGA families. In this
    work, we evaluate the extent of this promise. To this end, we present a parameterized
    FDTD implementation for photonic microcavity simulations. Our design can trade off
    different forms of parallelism and works for two independent OpenCL-based FPGA
    design flows. Hence, we can target FPGAs from different vendors and different
    FPGA families. We describe how we used pre-processor macros to achieve this flexibility
    and to work around different shortcomings of the current tools. Choosing the right
    design configurations, we are able to present two extremely competitive solutions
    for very different FPGA targets, reaching up to 172 GFLOPS sustained performance.
    With the portability and flexibility demonstrated, code developers not only avoid
    vendor lock-in, but can even make best use of real trade-offs between different
    architectures.
author:
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
- first_name: Jens
  full_name: Förstner, Jens
  id: '158'
  last_name: Förstner
  orcid: 0000-0001-7059-9862
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: 'Kenter T, Förstner J, Plessl C. Flexible FPGA design for FDTD using OpenCL.
    In: <i>Proc. Int. Conf. on Field Programmable Logic and Applications (FPL)</i>.
    IEEE; 2017. doi:<a href="https://doi.org/10.23919/FPL.2017.8056844">10.23919/FPL.2017.8056844</a>'
  apa: Kenter, T., Förstner, J., &#38; Plessl, C. (2017). Flexible FPGA design for
    FDTD using OpenCL. <i>Proc. Int. Conf. on Field Programmable Logic and Applications
    (FPL)</i>. <a href="https://doi.org/10.23919/FPL.2017.8056844">https://doi.org/10.23919/FPL.2017.8056844</a>
  bibtex: '@inproceedings{Kenter_Förstner_Plessl_2017, title={Flexible FPGA design
    for FDTD using OpenCL}, DOI={<a href="https://doi.org/10.23919/FPL.2017.8056844">10.23919/FPL.2017.8056844</a>},
    booktitle={Proc. Int. Conf. on Field Programmable Logic and Applications (FPL)},
    publisher={IEEE}, author={Kenter, Tobias and Förstner, Jens and Plessl, Christian},
    year={2017} }'
  chicago: Kenter, Tobias, Jens Förstner, and Christian Plessl. “Flexible FPGA Design
    for FDTD Using OpenCL.” In <i>Proc. Int. Conf. on Field Programmable Logic and
    Applications (FPL)</i>. IEEE, 2017. <a href="https://doi.org/10.23919/FPL.2017.8056844">https://doi.org/10.23919/FPL.2017.8056844</a>.
  ieee: 'T. Kenter, J. Förstner, and C. Plessl, “Flexible FPGA design for FDTD using
    OpenCL,” 2017, doi: <a href="https://doi.org/10.23919/FPL.2017.8056844">10.23919/FPL.2017.8056844</a>.'
  mla: Kenter, Tobias, et al. “Flexible FPGA Design for FDTD Using OpenCL.” <i>Proc.
    Int. Conf. on Field Programmable Logic and Applications (FPL)</i>, IEEE, 2017,
    doi:<a href="https://doi.org/10.23919/FPL.2017.8056844">10.23919/FPL.2017.8056844</a>.
  short: 'T. Kenter, J. Förstner, C. Plessl, in: Proc. Int. Conf. on Field Programmable
    Logic and Applications (FPL), IEEE, 2017.'
date_created: 2018-03-22T11:10:23Z
date_updated: 2023-09-26T13:24:38Z
ddc:
- '000'
department:
- _id: '27'
- _id: '518'
- _id: '61'
doi: 10.23919/FPL.2017.8056844
file:
- access_level: closed
  content_type: application/pdf
  creator: ups
  date_created: 2018-11-02T15:02:28Z
  date_updated: 2018-11-02T15:02:28Z
  file_id: '5291'
  file_name: 08056844.pdf
  file_size: 230235
  relation: main_file
  success: 1
file_date_updated: 2018-11-02T15:02:28Z
has_accepted_license: '1'
keyword:
- tet_topic_hpc
language:
- iso: eng
project:
- _id: '1'
  grant_number: '160364472'
  name: SFB 901
- _id: '4'
  name: SFB 901 - Project Area C
- _id: '14'
  grant_number: '160364472'
  name: SFB 901 - Subproject C2
- _id: '33'
  grant_number: 01|H16005A
  name: HighPerMeshes
- _id: '32'
  grant_number: PL 595/2-1 / 320898746
  name: Performance and Efficiency in HPC with Custom Computing
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: Proc. Int. Conf. on Field Programmable Logic and Applications (FPL)
publisher: IEEE
quality_controlled: '1'
status: public
title: Flexible FPGA design for FDTD using OpenCL
type: conference
user_id: '15278'
year: '2017'
...
---
_id: '24'
author:
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: 'Kenter T, Plessl C. Microdisk Cavity FDTD Simulation on FPGA using OpenCL.
    In: <i>Proc. Workshop on Heterogeneous High-Performance Reconfigurable Computing
    (H2RC)</i>. 2016.'
  apa: Kenter, T., &#38; Plessl, C. (2016). Microdisk Cavity FDTD Simulation on FPGA
    using OpenCL. <i>Proc. Workshop on Heterogeneous High-Performance Reconfigurable
    Computing (H2RC)</i>.
  bibtex: '@inproceedings{Kenter_Plessl_2016, title={Microdisk Cavity FDTD Simulation
    on FPGA using OpenCL}, booktitle={Proc. Workshop on Heterogeneous High-performance
    Reconfigurable Computing (H2RC)}, author={Kenter, Tobias and Plessl, Christian},
    year={2016} }'
  chicago: Kenter, Tobias, and Christian Plessl. “Microdisk Cavity FDTD Simulation
    on FPGA Using OpenCL.” In <i>Proc. Workshop on Heterogeneous High-Performance
    Reconfigurable Computing (H2RC)</i>, 2016.
  ieee: T. Kenter and C. Plessl, “Microdisk Cavity FDTD Simulation on FPGA using OpenCL,”
    2016.
  mla: Kenter, Tobias, and Christian Plessl. “Microdisk Cavity FDTD Simulation on
    FPGA Using OpenCL.” <i>Proc. Workshop on Heterogeneous High-Performance Reconfigurable
    Computing (H2RC)</i>, 2016.
  short: 'T. Kenter, C. Plessl, in: Proc. Workshop on Heterogeneous High-Performance
    Reconfigurable Computing (H2RC), 2016.'
date_created: 2017-07-26T15:00:43Z
date_updated: 2023-09-26T13:26:17Z
ddc:
- '004'
department:
- _id: '27'
- _id: '518'
file:
- access_level: closed
  content_type: application/pdf
  creator: kenter
  date_created: 2018-11-14T12:38:45Z
  date_updated: 2018-11-14T12:38:45Z
  file_id: '5602'
  file_name: paper_26.pdf
  file_size: 129552
  relation: main_file
  success: 1
file_date_updated: 2018-11-14T12:38:45Z
has_accepted_license: '1'
language:
- iso: eng
project:
- _id: '32'
  grant_number: PL 595/2-1 / 320898746
  name: Performance and Efficiency in HPC with Custom Computing
- _id: '1'
  grant_number: '160364472'
  name: SFB 901
- _id: '4'
  name: SFB 901 - Project Area C
- _id: '14'
  grant_number: '160364472'
  name: SFB 901 - Subproject C2
publication: Proc. Workshop on Heterogeneous High-performance Reconfigurable Computing
  (H2RC)
quality_controlled: '1'
status: public
title: Microdisk Cavity FDTD Simulation on FPGA using OpenCL
type: conference
user_id: '15278'
year: '2016'
...