---
_id: '62034'
abstract:
- lang: eng
  text: 'Effective single-particle theories, such as Hartree–Fock, density functional
    theory, and tight-binding, are limited by the computational cost of the self-consistent
    field (SCF) procedure, which typically scales cubically with the system size.
    This makes large-scale applications impractical without specialized algorithms
    and hardware. Here, we present the submatrix and graphics processing unit (GPU)-accelerated
    software implementation of the PTB tight-binding potential, realized in the open-source
    ptb codebase [M. Mueller, A. Katbashev, and S. Ehlert (2025). “grimme-lab/ptb:
    v3.8.1,” Zenodo. https://zenodo.org/records/17015872]. We first benchmark a traditional
    diagonalization-based SCF solver against density-matrix-based purification approaches,
    systematically varying both system size and computer hardware. Our findings show
    that the usage of GPUs permits shifting the boundaries to much larger systems
    than previously thought feasible, achieving an overall 10–15-fold performance
    speedup. Second, we introduce the implementation of a decomposition-type submatrix
    method, specifically designed for efficient operation on mid- to large-sized systems,
    to address the computational overhead associated with full-system diagonalization.
    We demonstrate that, from a certain dimension (≈10^4 basis functions) on, our submatrix
    method reduces the overall computational cost while maintaining acceptable numerical
    accuracy. Our study demonstrates the significance of the interplay between modern
    hardware, algorithmic considerations, and novel tight-binding methods, paving
    the way for further development in this direction.'
article_number: '132501'
author:
- first_name: Abylay
  full_name: Katbashev, Abylay
  last_name: Katbashev
- first_name: Robert
  full_name: Schade, Robert
  id: '75963'
  last_name: Schade
  orcid: 0000-0002-6268-5397
- first_name: Michael
  full_name: Laß, Michael
  id: '24135'
  last_name: Laß
  orcid: 0000-0002-5708-7632
- first_name: Marcel
  full_name: Müller, Marcel
  last_name: Müller
- first_name: Stefan
  full_name: Grimme, Stefan
  last_name: Grimme
- first_name: Andreas
  full_name: Hansen, Andreas
  last_name: Hansen
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
citation:
  ama: Katbashev A, Schade R, Laß M, et al. Submatrix and GPU-accelerated implementation
    of density matrix tight-binding. <i>The Journal of Chemical Physics</i>. 2025;163(13).
    doi:<a href="https://doi.org/10.1063/5.0271379">10.1063/5.0271379</a>
  apa: Katbashev, A., Schade, R., Laß, M., Müller, M., Grimme, S., Hansen, A., &#38;
    Kühne, T. (2025). Submatrix and GPU-accelerated implementation of density matrix
    tight-binding. <i>The Journal of Chemical Physics</i>, <i>163</i>(13), Article
    132501. <a href="https://doi.org/10.1063/5.0271379">https://doi.org/10.1063/5.0271379</a>
  bibtex: '@article{Katbashev_Schade_Laß_Müller_Grimme_Hansen_Kühne_2025, title={Submatrix
    and GPU-accelerated implementation of density matrix tight-binding}, volume={163},
    DOI={<a href="https://doi.org/10.1063/5.0271379">10.1063/5.0271379</a>}, number={13},
    journal={The Journal of Chemical Physics}, publisher={AIP Publishing}, author={Katbashev,
    Abylay and Schade, Robert and Laß, Michael and Müller, Marcel and Grimme, Stefan
    and Hansen, Andreas and Kühne, Thomas}, year={2025} }'
  chicago: Katbashev, Abylay, Robert Schade, Michael Laß, Marcel Müller, Stefan Grimme,
    Andreas Hansen, and Thomas Kühne. “Submatrix and GPU-Accelerated Implementation
    of Density Matrix Tight-Binding.” <i>The Journal of Chemical Physics</i> 163,
    no. 13 (2025). <a href="https://doi.org/10.1063/5.0271379">https://doi.org/10.1063/5.0271379</a>.
  ieee: 'A. Katbashev <i>et al.</i>, “Submatrix and GPU-accelerated implementation
    of density matrix tight-binding,” <i>The Journal of Chemical Physics</i>, vol.
    163, no. 13, Art. no. 132501, 2025, doi: <a href="https://doi.org/10.1063/5.0271379">10.1063/5.0271379</a>.'
  mla: Katbashev, Abylay, et al. “Submatrix and GPU-Accelerated Implementation of
    Density Matrix Tight-Binding.” <i>The Journal of Chemical Physics</i>, vol. 163,
    no. 13, 132501, AIP Publishing, 2025, doi:<a href="https://doi.org/10.1063/5.0271379">10.1063/5.0271379</a>.
  short: A. Katbashev, R. Schade, M. Laß, M. Müller, S. Grimme, A. Hansen, T. Kühne,
    The Journal of Chemical Physics 163 (2025).
date_created: 2025-11-01T00:41:50Z
date_updated: 2025-11-01T00:43:19Z
department:
- _id: '27'
doi: 10.1063/5.0271379
intvolume: '       163'
issue: '13'
language:
- iso: eng
project:
- _id: '52'
  name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: The Journal of Chemical Physics
publication_identifier:
  issn:
  - 0021-9606
  - 1089-7690
publication_status: published
publisher: AIP Publishing
status: public
title: Submatrix and GPU-accelerated implementation of density matrix tight-binding
type: journal_article
user_id: '75963'
volume: 163
year: '2025'
...
---
_id: '53474'
abstract:
- lang: eng
  text: We present a novel approach to characterize and quantify microheterogeneity
    and microphase separation in computer simulations of complex liquid mixtures.
    Our post-processing method is based on local density fluctuations of the different
    constituents in sampling spheres of varying size. It can be easily applied to
    both molecular dynamics (MD) and Monte Carlo (MC) simulations, including periodic
    boundary conditions. Multidimensional correlation of the density distributions
    yields a clear picture of the domain formation due to the subtle balance of different
    interactions. We apply our approach to the example of force field molecular dynamics
    simulations of imidazolium-based ionic liquids with different side chain lengths
    at different temperatures, namely 1-ethyl-3-methylimidazolium chloride, 1-hexyl-3-methylimidazolium
    chloride, and 1-decyl-3-methylimidazolium chloride, which are known to form distinct
    liquid domains. We put the results into the context of existing microheterogeneity
    analyses and demonstrate the advantages and sensitivity of our novel method. Furthermore,
    we show how to estimate the configuration entropy from our analysis, and we investigate
    voids in the system. The analysis has been implemented into our program package
    TRAVIS and is thus available as free software.
article_number: '322'
author:
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
- first_name: Martin
  full_name: Brehm, Martin
  id: '100167'
  last_name: Brehm
citation:
  ama: Lass M, Kenter T, Plessl C, Brehm M. Characterizing Microheterogeneity in Liquid
    Mixtures via Local Density Fluctuations. <i>Entropy</i>. 2024;26(4). doi:<a href="https://doi.org/10.3390/e26040322">10.3390/e26040322</a>
  apa: Lass, M., Kenter, T., Plessl, C., &#38; Brehm, M. (2024). Characterizing Microheterogeneity
    in Liquid Mixtures via Local Density Fluctuations. <i>Entropy</i>, <i>26</i>(4),
    Article 322. <a href="https://doi.org/10.3390/e26040322">https://doi.org/10.3390/e26040322</a>
  bibtex: '@article{Lass_Kenter_Plessl_Brehm_2024, title={Characterizing Microheterogeneity
    in Liquid Mixtures via Local Density Fluctuations}, volume={26}, DOI={<a href="https://doi.org/10.3390/e26040322">10.3390/e26040322</a>},
    number={4}, journal={Entropy}, publisher={MDPI AG}, author={Lass, Michael and
    Kenter, Tobias and Plessl, Christian and Brehm, Martin}, year={2024} }'
  chicago: Lass, Michael, Tobias Kenter, Christian Plessl, and Martin Brehm. “Characterizing
    Microheterogeneity in Liquid Mixtures via Local Density Fluctuations.” <i>Entropy</i>
    26, no. 4 (2024). <a href="https://doi.org/10.3390/e26040322">https://doi.org/10.3390/e26040322</a>.
  ieee: 'M. Lass, T. Kenter, C. Plessl, and M. Brehm, “Characterizing Microheterogeneity
    in Liquid Mixtures via Local Density Fluctuations,” <i>Entropy</i>, vol. 26, no.
    4, Art. no. 322, 2024, doi: <a href="https://doi.org/10.3390/e26040322">10.3390/e26040322</a>.'
  mla: Lass, Michael, et al. “Characterizing Microheterogeneity in Liquid Mixtures
    via Local Density Fluctuations.” <i>Entropy</i>, vol. 26, no. 4, 322, MDPI AG,
    2024, doi:<a href="https://doi.org/10.3390/e26040322">10.3390/e26040322</a>.
  short: M. Lass, T. Kenter, C. Plessl, M. Brehm, Entropy 26 (2024).
date_created: 2024-04-12T18:31:39Z
date_updated: 2024-04-12T18:34:32Z
department:
- _id: '27'
- _id: '518'
- _id: '803'
doi: 10.3390/e26040322
intvolume: '        26'
issue: '4'
language:
- iso: eng
project:
- _id: '52'
  name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: Entropy
publication_identifier:
  issn:
  - 1099-4300
publication_status: published
publisher: MDPI AG
status: public
title: Characterizing Microheterogeneity in Liquid Mixtures via Local Density Fluctuations
type: journal_article
user_id: '24135'
volume: 26
year: '2024'
...
---
_id: '53663'
abstract:
- lang: eng
  text: 'Noctua 2 is a supercomputer operated at the Paderborn Center for Parallel
    Computing (PC2) at Paderborn University in Germany. Noctua 2 was inaugurated in
    2022 and is an Atos BullSequana XH2000 system. It consists mainly of three node
    types: 1) CPU Compute nodes with AMD EPYC processors in different main memory
    configurations, 2) GPU nodes with NVIDIA A100 GPUs, and 3) FPGA nodes with Xilinx
    Alveo U280 and Intel Stratix 10 FPGA cards. While CPUs and GPUs are known off-the-shelf
    components in HPC systems, the operation of a large number of FPGA cards from
    different vendors and a dedicated FPGA-to-FPGA network are unique characteristics
    of Noctua 2. This paper describes in detail the overall setup of Noctua 2 and
    gives insights into the operation of the cluster from a hardware, software and
    facility perspective.'
article_type: original
author:
- first_name: Carsten
  full_name: Bauer, Carsten
  id: '90082'
  last_name: Bauer
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Lukas
  full_name: Mazur, Lukas
  id: '90492'
  last_name: Mazur
  orcid: 0000-0001-6304-7082
- first_name: Marius
  full_name: Meyer, Marius
  id: '40778'
  last_name: Meyer
- first_name: Holger
  full_name: Nitsche, Holger
  id: '15272'
  last_name: Nitsche
- first_name: Heinrich
  full_name: Riebler, Heinrich
  id: '8961'
  last_name: Riebler
- first_name: Robert
  full_name: Schade, Robert
  id: '75963'
  last_name: Schade
  orcid: 0000-0002-6268-5397
- first_name: Michael
  full_name: Schwarz, Michael
  id: '5312'
  last_name: Schwarz
- first_name: Nils
  full_name: Winnwa, Nils
  id: '61189'
  last_name: Winnwa
- first_name: Alex
  full_name: Wiens, Alex
  id: '23522'
  last_name: Wiens
  orcid: 0000-0003-1764-9773
- first_name: Xin
  full_name: Wu, Xin
  id: '77439'
  last_name: Wu
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
- first_name: Jens
  full_name: Simon, Jens
  id: '15273'
  last_name: Simon
citation:
  ama: Bauer C, Kenter T, Lass M, et al. Noctua 2 Supercomputer. <i>Journal of large-scale
    research facilities</i>. 2024;9. doi:<a href="https://doi.org/10.17815/jlsrf-8-187">10.17815/jlsrf-8-187</a>
  apa: Bauer, C., Kenter, T., Lass, M., Mazur, L., Meyer, M., Nitsche, H., Riebler,
    H., Schade, R., Schwarz, M., Winnwa, N., Wiens, A., Wu, X., Plessl, C., &#38;
    Simon, J. (2024). Noctua 2 Supercomputer. <i>Journal of Large-Scale Research Facilities</i>,
    <i>9</i>. <a href="https://doi.org/10.17815/jlsrf-8-187">https://doi.org/10.17815/jlsrf-8-187</a>
  bibtex: '@article{Bauer_Kenter_Lass_Mazur_Meyer_Nitsche_Riebler_Schade_Schwarz_Winnwa_et
    al._2024, title={Noctua 2 Supercomputer}, volume={9}, DOI={<a href="https://doi.org/10.17815/jlsrf-8-187">10.17815/jlsrf-8-187</a>},
    journal={Journal of large-scale research facilities},
    author={Bauer, Carsten and Kenter, Tobias and Lass, Michael and Mazur, Lukas and
    Meyer, Marius and Nitsche, Holger and Riebler, Heinrich and Schade, Robert and
    Schwarz, Michael and Winnwa, Nils and et al.}, year={2024} }'
  chicago: Bauer, Carsten, Tobias Kenter, Michael Lass, Lukas Mazur, Marius Meyer,
    Holger Nitsche, Heinrich Riebler, et al. “Noctua 2 Supercomputer.” <i>Journal
    of Large-Scale Research Facilities</i> 9 (2024). <a href="https://doi.org/10.17815/jlsrf-8-187">https://doi.org/10.17815/jlsrf-8-187</a>.
  ieee: 'C. Bauer <i>et al.</i>, “Noctua 2 Supercomputer,” <i>Journal of large-scale
    research facilities</i>, vol. 9, 2024, doi: <a href="https://doi.org/10.17815/jlsrf-8-187">10.17815/jlsrf-8-187</a>.'
  mla: Bauer, Carsten, et al. “Noctua 2 Supercomputer.” <i>Journal of Large-Scale
    Research Facilities</i>, vol. 9, 2024, doi:<a href="https://doi.org/10.17815/jlsrf-8-187">10.17815/jlsrf-8-187</a>.
  short: C. Bauer, T. Kenter, M. Lass, L. Mazur, M. Meyer, H. Nitsche, H. Riebler,
    R. Schade, M. Schwarz, N. Winnwa, A. Wiens, X. Wu, C. Plessl, J. Simon, Journal
    of Large-Scale Research Facilities 9 (2024).
date_created: 2024-04-26T07:39:41Z
date_updated: 2024-04-26T08:44:30Z
ddc:
- '004'
department:
- _id: '27'
- _id: '518'
doi: 10.17815/jlsrf-8-187
file:
- access_level: open_access
  content_type: application/pdf
  creator: deffel
  date_created: 2024-04-26T07:30:20Z
  date_updated: 2024-04-26T08:35:17Z
  file_id: '53664'
  file_name: Noctua2_Supercomputer.pdf
  file_size: 3825480
  relation: main_file
file_date_updated: 2024-04-26T08:35:17Z
has_accepted_license: '1'
intvolume: '         9'
keyword:
- Noctua 2
- Supercomputer
- FPGA
- PC2
- Paderborn Center for Parallel Computing
language:
- iso: eng
oa: '1'
project:
- _id: '52'
  name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: Journal of large-scale research facilities
publication_status: published
status: public
title: Noctua 2 Supercomputer
type: journal_article
user_id: '8961'
volume: 9
year: '2024'
...
---
_id: '56604'
abstract:
- lang: eng
  text: This manuscript makes the claim of having computed the 9th Dedekind number,
    D(9). This was done by accelerating the core operation of the process with an
    efficient FPGA design that outperforms an optimized 64-core CPU reference by 95x.
    The FPGA execution was parallelized on the Noctua 2 supercomputer at Paderborn
    University. The resulting value for D(9) is 286386577668298411128469151667598498812366.
    This value can be verified in two steps. We have made the data file containing
    the 490 M results available, each of which can be verified separately on CPU,
    and the whole file sums to our proposed value. The paper explains the mathematical
    approach in the first part, before putting the focus on a deep dive into the FPGA
    accelerator implementation followed by a performance analysis. The FPGA implementation
    was done in Register-Transfer Level using a dual-clock architecture and shows
    how we achieved an impressive FMax of 450 MHz on the targeted Stratix 10 GX 2800
    FPGAs. The total compute time used was 47,000 FPGA hours.
author:
- first_name: Lennart
  full_name: Van Hirtum, Lennart
  id: '100210'
  last_name: Van Hirtum
- first_name: Patrick
  full_name: De Causmaecker, Patrick
  last_name: De Causmaecker
- first_name: Jens
  full_name: Goemaere, Jens
  last_name: Goemaere
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
- first_name: Heinrich
  full_name: Riebler, Heinrich
  id: '8961'
  last_name: Riebler
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: Van Hirtum L, De Causmaecker P, Goemaere J, et al. A Computation of the Ninth
    Dedekind Number Using FPGA Supercomputing. <i>ACM Transactions on Reconfigurable
    Technology and Systems</i>. 2024;17(3):1-28. doi:<a href="https://doi.org/10.1145/3674147">10.1145/3674147</a>
  apa: Van Hirtum, L., De Causmaecker, P., Goemaere, J., Kenter, T., Riebler, H.,
    Lass, M., &#38; Plessl, C. (2024). A Computation of the Ninth Dedekind Number
    Using FPGA Supercomputing. <i>ACM Transactions on Reconfigurable Technology and
    Systems</i>, <i>17</i>(3), 1–28. <a href="https://doi.org/10.1145/3674147">https://doi.org/10.1145/3674147</a>
  bibtex: '@article{Van Hirtum_De Causmaecker_Goemaere_Kenter_Riebler_Lass_Plessl_2024,
    title={A Computation of the Ninth Dedekind Number Using FPGA Supercomputing},
    volume={17}, DOI={<a href="https://doi.org/10.1145/3674147">10.1145/3674147</a>},
    number={3}, journal={ACM Transactions on Reconfigurable Technology and Systems},
    publisher={Association for Computing Machinery (ACM)}, author={Van Hirtum, Lennart
    and De Causmaecker, Patrick and Goemaere, Jens and Kenter, Tobias and Riebler,
    Heinrich and Lass, Michael and Plessl, Christian}, year={2024}, pages={1–28} }'
  chicago: 'Van Hirtum, Lennart, Patrick De Causmaecker, Jens Goemaere, Tobias Kenter,
    Heinrich Riebler, Michael Lass, and Christian Plessl. “A Computation of the Ninth
    Dedekind Number Using FPGA Supercomputing.” <i>ACM Transactions on Reconfigurable
    Technology and Systems</i> 17, no. 3 (2024): 1–28. <a href="https://doi.org/10.1145/3674147">https://doi.org/10.1145/3674147</a>.'
  ieee: 'L. Van Hirtum <i>et al.</i>, “A Computation of the Ninth Dedekind Number
    Using FPGA Supercomputing,” <i>ACM Transactions on Reconfigurable Technology and
    Systems</i>, vol. 17, no. 3, pp. 1–28, 2024, doi: <a href="https://doi.org/10.1145/3674147">10.1145/3674147</a>.'
  mla: Van Hirtum, Lennart, et al. “A Computation of the Ninth Dedekind Number Using
    FPGA Supercomputing.” <i>ACM Transactions on Reconfigurable Technology and Systems</i>,
    vol. 17, no. 3, Association for Computing Machinery (ACM), 2024, pp. 1–28, doi:<a
    href="https://doi.org/10.1145/3674147">10.1145/3674147</a>.
  short: L. Van Hirtum, P. De Causmaecker, J. Goemaere, T. Kenter, H. Riebler, M.
    Lass, C. Plessl, ACM Transactions on Reconfigurable Technology and Systems 17
    (2024) 1–28.
date_created: 2024-10-14T07:38:29Z
date_updated: 2025-11-04T09:53:26Z
department:
- _id: '27'
- _id: '518'
doi: 10.1145/3674147
intvolume: '        17'
issue: '3'
language:
- iso: eng
main_file_link:
- open_access: '1'
oa: '1'
page: 1-28
project:
- _id: '52'
  name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: ACM Transactions on Reconfigurable Technology and Systems
publication_identifier:
  issn:
  - 1936-7406
  - 1936-7414
publication_status: published
publisher: Association for Computing Machinery (ACM)
quality_controlled: '1'
status: public
title: A Computation of the Ninth Dedekind Number Using FPGA Supercomputing
type: journal_article
user_id: '3145'
volume: 17
year: '2024'
...
---
_id: '53202'
abstract:
- lang: eng
  text: At large scales, quantum systems may become advantageous over their classical
    counterparts at performing certain tasks. Developing tools to analyze these systems
    at the relevant scales, in a manner consistent with quantum mechanics, is therefore
    critical to benchmarking performance and characterizing their operation. While
    classical computational approaches cannot perform like-for-like computations of
    quantum systems beyond a certain scale, classical high-performance computing (HPC)
    may nevertheless be useful for precisely these characterization and certification
    tasks. By developing open-source customized algorithms using high-performance
    computing, we perform quantum tomography on a megascale quantum photonic detector
    covering a Hilbert space of 10^6. This requires finding 10^8 elements of the matrix
    corresponding to the positive operator valued measure (POVM), the quantum description
    of the detector, and is achieved in minutes of computation time. Moreover, by
    exploiting the structure of the problem, we achieve highly efficient parallel
    scaling, paving the way for quantum objects up to a system size of 10^12 elements
    to be reconstructed using this method. In general, this shows that a consistent
    quantum mechanical description of quantum phenomena is applicable at everyday
    scales. More concretely, this enables the reconstruction of large-scale quantum
    sources, processes and detectors used in computation and sampling tasks, which
    may be necessary to prove their nonclassical character or quantum computational
    advantage.
author:
- first_name: Timon
  full_name: Schapeler, Timon
  id: '55629'
  last_name: Schapeler
  orcid: 0000-0001-7652-1716
- first_name: Robert
  full_name: Schade, Robert
  id: '75963'
  last_name: Schade
  orcid: 0000-0002-6268-5397
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
- first_name: Tim
  full_name: Bartley, Tim
  id: '49683'
  last_name: Bartley
citation:
  ama: Schapeler T, Schade R, Lass M, Plessl C, Bartley T. Scalable quantum detector
    tomography by high-performance computing. <i>Quantum Science and Technology</i>.
    2024;10(1). doi:<a href="https://doi.org/10.1088/2058-9565/ad8511">10.1088/2058-9565/ad8511</a>
  apa: Schapeler, T., Schade, R., Lass, M., Plessl, C., &#38; Bartley, T. (2024).
    Scalable quantum detector tomography by high-performance computing. <i>Quantum
    Science and Technology</i>, <i>10</i>(1). <a href="https://doi.org/10.1088/2058-9565/ad8511">https://doi.org/10.1088/2058-9565/ad8511</a>
  bibtex: '@article{Schapeler_Schade_Lass_Plessl_Bartley_2024, title={Scalable quantum
    detector tomography by high-performance computing}, volume={10}, DOI={<a href="https://doi.org/10.1088/2058-9565/ad8511">10.1088/2058-9565/ad8511</a>},
    number={1}, journal={Quantum Science and Technology}, publisher={IOP Publishing},
    author={Schapeler, Timon and Schade, Robert and Lass, Michael and Plessl, Christian
    and Bartley, Tim}, year={2024} }'
  chicago: Schapeler, Timon, Robert Schade, Michael Lass, Christian Plessl, and Tim
    Bartley. “Scalable Quantum Detector Tomography by High-Performance Computing.”
    <i>Quantum Science and Technology</i> 10, no. 1 (2024). <a href="https://doi.org/10.1088/2058-9565/ad8511">https://doi.org/10.1088/2058-9565/ad8511</a>.
  ieee: 'T. Schapeler, R. Schade, M. Lass, C. Plessl, and T. Bartley, “Scalable quantum
    detector tomography by high-performance computing,” <i>Quantum Science and Technology</i>,
    vol. 10, no. 1, 2024, doi: <a href="https://doi.org/10.1088/2058-9565/ad8511">10.1088/2058-9565/ad8511</a>.'
  mla: Schapeler, Timon, et al. “Scalable Quantum Detector Tomography by High-Performance
    Computing.” <i>Quantum Science and Technology</i>, vol. 10, no. 1, IOP Publishing,
    2024, doi:<a href="https://doi.org/10.1088/2058-9565/ad8511">10.1088/2058-9565/ad8511</a>.
  short: T. Schapeler, R. Schade, M. Lass, C. Plessl, T. Bartley, Quantum Science
    and Technology 10 (2024).
date_created: 2024-04-04T08:43:18Z
date_updated: 2025-12-16T11:32:12Z
department:
- _id: '27'
- _id: '623'
- _id: '15'
doi: 10.1088/2058-9565/ad8511
external_id:
  arxiv:
  - '2404.02844'
intvolume: '        10'
issue: '1'
language:
- iso: eng
main_file_link:
- open_access: '1'
oa: '1'
project:
- _id: '52'
  name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
- _id: '239'
  name: 'ERC-Grant: QuESADILLA: Quantum Engineering Superconducting Array Detectors
    in Low-Light Applications'
- _id: '191'
  name: 'PhoQuant: Photonische Quantencomputer -  Quantencomputing Testplattform'
publication: Quantum Science and Technology
publisher: IOP Publishing
status: public
title: Scalable quantum detector tomography by high-performance computing
type: journal_article
user_id: '55629'
volume: 10
year: '2024'
...
---
_id: '43439'
abstract:
- lang: eng
  text: "This preprint makes the claim of having computed the $9^{th}$ Dedekind
    Number. This was done by building an efficient FPGA Accelerator for the core
    operation of the process, and parallelizing it on the Noctua 2 Supercluster at
    Paderborn University. The resulting value is 286386577668298411128469151667598498812366.
    This value can be verified in two steps. We have made the data file containing
    the 490M results available, each of which can be verified separately on CPU,
    and the whole file sums to our proposed value."
author:
- first_name: Lennart
  full_name: Van Hirtum, Lennart
  last_name: Van Hirtum
- first_name: Patrick
  full_name: De Causmaecker, Patrick
  last_name: De Causmaecker
- first_name: Jens
  full_name: Goemaere, Jens
  last_name: Goemaere
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
- first_name: Heinrich
  full_name: Riebler, Heinrich
  id: '8961'
  last_name: Riebler
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: Van Hirtum L, De Causmaecker P, Goemaere J, et al. A computation of D(9) using
    FPGA Supercomputing. <i>arXiv:2304.03039</i>. Published online 2023.
  apa: Van Hirtum, L., De Causmaecker, P., Goemaere, J., Kenter, T., Riebler, H.,
    Lass, M., &#38; Plessl, C. (2023). A computation of D(9) using FPGA Supercomputing.
    In <i>arXiv:2304.03039</i>.
  bibtex: '@article{Van Hirtum_De Causmaecker_Goemaere_Kenter_Riebler_Lass_Plessl_2023,
    title={A computation of D(9) using FPGA Supercomputing}, journal={arXiv:2304.03039},
    author={Van Hirtum, Lennart and De Causmaecker, Patrick and Goemaere, Jens and
    Kenter, Tobias and Riebler, Heinrich and Lass, Michael and Plessl, Christian},
    year={2023} }'
  chicago: Van Hirtum, Lennart, Patrick De Causmaecker, Jens Goemaere, Tobias Kenter,
    Heinrich Riebler, Michael Lass, and Christian Plessl. “A Computation of D(9) Using
    FPGA Supercomputing.” <i>ArXiv:2304.03039</i>, 2023.
  ieee: L. Van Hirtum <i>et al.</i>, “A computation of D(9) using FPGA Supercomputing,”
    <i>arXiv:2304.03039</i>. 2023.
  mla: Van Hirtum, Lennart, et al. “A Computation of D(9) Using FPGA Supercomputing.”
    <i>ArXiv:2304.03039</i>, 2023.
  short: L. Van Hirtum, P. De Causmaecker, J. Goemaere, T. Kenter, H. Riebler, M.
    Lass, C. Plessl, ArXiv:2304.03039 (2023).
date_created: 2023-04-08T11:05:29Z
date_updated: 2024-01-22T09:56:42Z
department:
- _id: '27'
- _id: '518'
external_id:
  arxiv:
  - '2304.03039'
language:
- iso: eng
project:
- _id: '52'
  name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: arXiv:2304.03039
status: public
title: A computation of D(9) using FPGA Supercomputing
type: preprint
user_id: '3145'
year: '2023'
...
---
_id: '45361'
abstract:
- lang: eng
  text: The non-orthogonal local submatrix method applied to electronic structure–based
    molecular dynamics simulations is shown to exceed 1.1 EFLOP/s in FP16/FP32-mixed
    floating-point arithmetic when using 4400 NVIDIA A100 GPUs of the Perlmutter system.
    This is enabled by a modification of the original method that pushes the sustained
    fraction of the peak performance to about 80%. Example calculations are performed
    for SARS-CoV-2 spike proteins with up to 83 million atoms.
article_number: '109434202311776'
article_type: original
author:
- first_name: Robert
  full_name: Schade, Robert
  id: '75963'
  last_name: Schade
  orcid: 0000-0002-6268-5397
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
- first_name: Hossam
  full_name: Elgabarty, Hossam
  id: '60250'
  last_name: Elgabarty
  orcid: 0000-0002-4945-1481
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: Schade R, Kenter T, Elgabarty H, Lass M, Kühne T, Plessl C. Breaking the exascale
    barrier for the electronic structure problem in ab-initio molecular dynamics.
    <i>The International Journal of High Performance Computing Applications</i>. Published
    online 2023. doi:<a href="https://doi.org/10.1177/10943420231177631">10.1177/10943420231177631</a>
  apa: Schade, R., Kenter, T., Elgabarty, H., Lass, M., Kühne, T., &#38; Plessl, C.
    (2023). Breaking the exascale barrier for the electronic structure problem in
    ab-initio molecular dynamics. <i>The International Journal of High Performance
    Computing Applications</i>, Article 109434202311776. <a href="https://doi.org/10.1177/10943420231177631">https://doi.org/10.1177/10943420231177631</a>
  bibtex: '@article{Schade_Kenter_Elgabarty_Lass_Kühne_Plessl_2023, title={Breaking
    the exascale barrier for the electronic structure problem in ab-initio molecular
    dynamics}, DOI={<a href="https://doi.org/10.1177/10943420231177631">10.1177/10943420231177631</a>},
    number={109434202311776}, journal={The International Journal of High Performance
    Computing Applications}, publisher={SAGE Publications}, author={Schade, Robert
    and Kenter, Tobias and Elgabarty, Hossam and Lass, Michael and Kühne, Thomas and
    Plessl, Christian}, year={2023} }'
  chicago: Schade, Robert, Tobias Kenter, Hossam Elgabarty, Michael Lass, Thomas Kühne,
    and Christian Plessl. “Breaking the Exascale Barrier for the Electronic Structure
    Problem in Ab-Initio Molecular Dynamics.” <i>The International Journal of High
    Performance Computing Applications</i>, 2023. <a href="https://doi.org/10.1177/10943420231177631">https://doi.org/10.1177/10943420231177631</a>.
  ieee: 'R. Schade, T. Kenter, H. Elgabarty, M. Lass, T. Kühne, and C. Plessl, “Breaking
    the exascale barrier for the electronic structure problem in ab-initio molecular
    dynamics,” <i>The International Journal of High Performance Computing Applications</i>,
    Art. no. 109434202311776, 2023, doi: <a href="https://doi.org/10.1177/10943420231177631">10.1177/10943420231177631</a>.'
  mla: Schade, Robert, et al. “Breaking the Exascale Barrier for the Electronic Structure
    Problem in Ab-Initio Molecular Dynamics.” <i>The International Journal of High
    Performance Computing Applications</i>, 109434202311776, SAGE Publications, 2023,
    doi:<a href="https://doi.org/10.1177/10943420231177631">10.1177/10943420231177631</a>.
  short: R. Schade, T. Kenter, H. Elgabarty, M. Lass, T. Kühne, C. Plessl, The International
    Journal of High Performance Computing Applications (2023).
date_created: 2023-05-30T09:19:09Z
date_updated: 2023-08-02T15:04:53Z
department:
- _id: '27'
- _id: '518'
doi: 10.1177/10943420231177631
keyword:
- Hardware and Architecture
- Theoretical Computer Science
- Software
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://journals.sagepub.com/doi/10.1177/10943420231177631
oa: '1'
project:
- _id: '52'
  name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: The International Journal of High Performance Computing Applications
publication_identifier:
  issn:
  - 1094-3420
  - 1741-2846
publication_status: published
publisher: SAGE Publications
quality_controlled: '1'
status: public
title: Breaking the exascale barrier for the electronic structure problem in ab-initio
  molecular dynamics
type: journal_article
user_id: '75963'
year: '2023'
...
---
_id: '32414'
author:
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
citation:
  ama: Lass M. <i>Bringing Massive Parallelism and Hardware Acceleration to Linear
    Scaling Density Functional Theory Through Targeted Approximations</i>. Universität
    Paderborn; 2022. doi:<a href="https://doi.org/10.17619/UNIPB/1-1281">10.17619/UNIPB/1-1281</a>
  apa: Lass, M. (2022). <i>Bringing Massive Parallelism and Hardware Acceleration
    to Linear Scaling Density Functional Theory Through Targeted Approximations</i>.
    Universität Paderborn. <a href="https://doi.org/10.17619/UNIPB/1-1281">https://doi.org/10.17619/UNIPB/1-1281</a>
  bibtex: '@book{Lass_2022, place={Paderborn}, title={Bringing Massive Parallelism
    and Hardware Acceleration to Linear Scaling Density Functional Theory Through
    Targeted Approximations}, DOI={<a href="https://doi.org/10.17619/UNIPB/1-1281">10.17619/UNIPB/1-1281</a>},
    publisher={Universität Paderborn}, author={Lass, Michael}, year={2022} }'
  chicago: 'Lass, Michael. <i>Bringing Massive Parallelism and Hardware Acceleration
    to Linear Scaling Density Functional Theory Through Targeted Approximations</i>.
    Paderborn: Universität Paderborn, 2022. <a href="https://doi.org/10.17619/UNIPB/1-1281">https://doi.org/10.17619/UNIPB/1-1281</a>.'
  ieee: 'M. Lass, <i>Bringing Massive Parallelism and Hardware Acceleration to Linear
    Scaling Density Functional Theory Through Targeted Approximations</i>. Paderborn:
    Universität Paderborn, 2022.'
  mla: Lass, Michael. <i>Bringing Massive Parallelism and Hardware Acceleration to
    Linear Scaling Density Functional Theory Through Targeted Approximations</i>.
    Universität Paderborn, 2022, doi:<a href="https://doi.org/10.17619/UNIPB/1-1281">10.17619/UNIPB/1-1281</a>.
  short: M. Lass, Bringing Massive Parallelism and Hardware Acceleration to Linear
    Scaling Density Functional Theory Through Targeted Approximations, Universität
    Paderborn, Paderborn, 2022.
date_created: 2022-07-25T18:13:51Z
date_updated: 2022-07-25T18:14:23Z
department:
- _id: '27'
- _id: '518'
doi: 10.17619/UNIPB/1-1281
language:
- iso: eng
place: Paderborn
publisher: Universität Paderborn
status: public
supervisor:
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
title: Bringing Massive Parallelism and Hardware Acceleration to Linear Scaling Density
  Functional Theory Through Targeted Approximations
type: dissertation
user_id: '24135'
year: '2022'
...
---
_id: '33684'
article_number: '102920'
author:
- first_name: Robert
  full_name: Schade, Robert
  id: '75963'
  last_name: Schade
  orcid: 0000-0002-6268-5397
- first_name: Tobias
  full_name: Kenter, Tobias
  id: '3145'
  last_name: Kenter
- first_name: Hossam
  full_name: Elgabarty, Hossam
  id: '60250'
  last_name: Elgabarty
  orcid: 0000-0002-4945-1481
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Ole
  full_name: Schütt, Ole
  last_name: Schütt
- first_name: Alfio
  full_name: Lazzaro, Alfio
  last_name: Lazzaro
- first_name: Hans
  full_name: Pabst, Hans
  last_name: Pabst
- first_name: Stephan
  full_name: Mohr, Stephan
  last_name: Mohr
- first_name: Jürg
  full_name: Hutter, Jürg
  last_name: Hutter
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: Schade R, Kenter T, Elgabarty H, et al. Towards electronic structure-based
    ab-initio molecular dynamics simulations with hundreds of millions of atoms. <i>Parallel
    Computing</i>. 2022;111. doi:<a href="https://doi.org/10.1016/j.parco.2022.102920">10.1016/j.parco.2022.102920</a>
  apa: Schade, R., Kenter, T., Elgabarty, H., Lass, M., Schütt, O., Lazzaro, A., Pabst,
    H., Mohr, S., Hutter, J., Kühne, T., &#38; Plessl, C. (2022). Towards electronic
    structure-based ab-initio molecular dynamics simulations with hundreds of millions
    of atoms. <i>Parallel Computing</i>, <i>111</i>, Article 102920. <a href="https://doi.org/10.1016/j.parco.2022.102920">https://doi.org/10.1016/j.parco.2022.102920</a>
  bibtex: '@article{Schade_Kenter_Elgabarty_Lass_Schütt_Lazzaro_Pabst_Mohr_Hutter_Kühne_et
    al._2022, title={Towards electronic structure-based ab-initio molecular dynamics
    simulations with hundreds of millions of atoms}, volume={111}, DOI={<a href="https://doi.org/10.1016/j.parco.2022.102920">10.1016/j.parco.2022.102920</a>},
    number={102920}, journal={Parallel Computing}, publisher={Elsevier BV}, author={Schade,
    Robert and Kenter, Tobias and Elgabarty, Hossam and Lass, Michael and Schütt,
    Ole and Lazzaro, Alfio and Pabst, Hans and Mohr, Stephan and Hutter, Jürg and
    Kühne, Thomas and et al.}, year={2022} }'
  chicago: Schade, Robert, Tobias Kenter, Hossam Elgabarty, Michael Lass, Ole Schütt,
    Alfio Lazzaro, Hans Pabst, et al. “Towards Electronic Structure-Based Ab-Initio
    Molecular Dynamics Simulations with Hundreds of Millions of Atoms.” <i>Parallel
    Computing</i> 111 (2022). <a href="https://doi.org/10.1016/j.parco.2022.102920">https://doi.org/10.1016/j.parco.2022.102920</a>.
  ieee: 'R. Schade <i>et al.</i>, “Towards electronic structure-based ab-initio molecular
    dynamics simulations with hundreds of millions of atoms,” <i>Parallel Computing</i>,
    vol. 111, Art. no. 102920, 2022, doi: <a href="https://doi.org/10.1016/j.parco.2022.102920">10.1016/j.parco.2022.102920</a>.'
  mla: Schade, Robert, et al. “Towards Electronic Structure-Based Ab-Initio Molecular
    Dynamics Simulations with Hundreds of Millions of Atoms.” <i>Parallel Computing</i>,
    vol. 111, 102920, Elsevier BV, 2022, doi:<a href="https://doi.org/10.1016/j.parco.2022.102920">10.1016/j.parco.2022.102920</a>.
  short: R. Schade, T. Kenter, H. Elgabarty, M. Lass, O. Schütt, A. Lazzaro, H. Pabst,
    S. Mohr, J. Hutter, T. Kühne, C. Plessl, Parallel Computing 111 (2022).
date_created: 2022-10-11T08:17:02Z
date_updated: 2023-08-02T15:03:55Z
department:
- _id: '613'
- _id: '27'
- _id: '518'
doi: 10.1016/j.parco.2022.102920
intvolume: '       111'
keyword:
- Artificial Intelligence
- Computer Graphics and Computer-Aided Design
- Computer Networks and Communications
- Hardware and Architecture
- Theoretical Computer Science
- Software
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://www.sciencedirect.com/science/article/pii/S0167819122000242
oa: '1'
project:
- _id: '52'
  name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: Parallel Computing
publication_identifier:
  issn:
  - 0167-8191
publication_status: published
publisher: Elsevier BV
quality_controlled: '1'
status: public
title: Towards electronic structure-based ab-initio molecular dynamics simulations
  with hundreds of millions of atoms
type: journal_article
user_id: '75963'
volume: 111
year: '2022'
...
---
_id: '16277'
abstract:
- lang: eng
  text: CP2K is an open source electronic structure and molecular dynamics software
    package to perform atomistic simulations of solid-state, liquid, molecular, and
    biological systems. It is especially aimed at massively parallel and linear-scaling
    electronic structure methods and state-of-the-art ab initio molecular dynamics
    simulations. Excellent performance for electronic structure calculations is achieved
    using novel algorithms implemented for modern high-performance computing systems.
    This review revisits the main capabilities of CP2K to perform efficient and accurate
    electronic structure simulations. The emphasis is put on density functional theory
    and multiple post–Hartree–Fock methods using the Gaussian and plane wave approach
    and its augmented all-electron extension.
article_number: '194103'
author:
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
- first_name: Marcella
  full_name: Iannuzzi, Marcella
  last_name: Iannuzzi
- first_name: Mauro Del
  full_name: Ben, Mauro Del
  last_name: Ben
- first_name: Vladimir V.
  full_name: Rybkin, Vladimir V.
  last_name: Rybkin
- first_name: Patrick
  full_name: Seewald, Patrick
  last_name: Seewald
- first_name: Frederick
  full_name: Stein, Frederick
  last_name: Stein
- first_name: Teodoro
  full_name: Laino, Teodoro
  last_name: Laino
- first_name: Rustam Z.
  full_name: Khaliullin, Rustam Z.
  last_name: Khaliullin
- first_name: Ole
  full_name: Schütt, Ole
  last_name: Schütt
- first_name: Florian
  full_name: Schiffmann, Florian
  last_name: Schiffmann
- first_name: Dorothea
  full_name: Golze, Dorothea
  last_name: Golze
- first_name: Jan
  full_name: Wilhelm, Jan
  last_name: Wilhelm
- first_name: Sergey
  full_name: Chulkov, Sergey
  last_name: Chulkov
- first_name: Mohammad Hossein
  full_name: Bani-Hashemian, Mohammad Hossein
  last_name: Bani-Hashemian
- first_name: Valéry
  full_name: Weber, Valéry
  last_name: Weber
- first_name: Urban
  full_name: Borstnik, Urban
  last_name: Borstnik
- first_name: Mathieu
  full_name: Taillefumier, Mathieu
  last_name: Taillefumier
- first_name: Alice Shoshana
  full_name: Jakobovits, Alice Shoshana
  last_name: Jakobovits
- first_name: Alfio
  full_name: Lazzaro, Alfio
  last_name: Lazzaro
- first_name: Hans
  full_name: Pabst, Hans
  last_name: Pabst
- first_name: Tiziano
  full_name: Müller, Tiziano
  last_name: Müller
- first_name: Robert
  full_name: Schade, Robert
  id: '75963'
  last_name: Schade
  orcid: 0000-0002-6268-5397
- first_name: Manuel
  full_name: Guidon, Manuel
  last_name: Guidon
- first_name: Samuel
  full_name: Andermatt, Samuel
  last_name: Andermatt
- first_name: Nico
  full_name: Holmberg, Nico
  last_name: Holmberg
- first_name: Gregory K.
  full_name: Schenter, Gregory K.
  last_name: Schenter
- first_name: Anna
  full_name: Hehn, Anna
  last_name: Hehn
- first_name: Augustin
  full_name: Bussy, Augustin
  last_name: Bussy
- first_name: Fabian
  full_name: Belleflamme, Fabian
  last_name: Belleflamme
- first_name: Gloria
  full_name: Tabacchi, Gloria
  last_name: Tabacchi
- first_name: Andreas
  full_name: Glöß, Andreas
  last_name: Glöß
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Iain
  full_name: Bethune, Iain
  last_name: Bethune
- first_name: Christopher J.
  full_name: Mundy, Christopher J.
  last_name: Mundy
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
- first_name: Matt
  full_name: Watkins, Matt
  last_name: Watkins
- first_name: Joost
  full_name: VandeVondele, Joost
  last_name: VandeVondele
- first_name: Matthias
  full_name: Krack, Matthias
  last_name: Krack
- first_name: Jürg
  full_name: Hutter, Jürg
  last_name: Hutter
citation:
  ama: 'Kühne T, Iannuzzi M, Ben MD, et al. CP2K: An electronic structure and molecular
    dynamics software package - Quickstep: Efficient and accurate electronic structure
    calculations. <i>The Journal of Chemical Physics</i>. 2020;152(19). doi:<a href="https://doi.org/10.1063/5.0007045">10.1063/5.0007045</a>'
  apa: 'Kühne, T., Iannuzzi, M., Ben, M. D., Rybkin, V. V., Seewald, P., Stein, F.,
    Laino, T., Khaliullin, R. Z., Schütt, O., Schiffmann, F., Golze, D., Wilhelm,
    J., Chulkov, S., Bani-Hashemian, M. H., Weber, V., Borstnik,
    U., Taillefumier, M., Jakobovits, A. S., Lazzaro, A., … Hutter, J. (2020). CP2K:
    An electronic structure and molecular dynamics software package - Quickstep: Efficient
    and accurate electronic structure calculations. <i>The Journal of Chemical Physics</i>,
    <i>152</i>(19), Article 194103. <a href="https://doi.org/10.1063/5.0007045">https://doi.org/10.1063/5.0007045</a>'
  bibtex: '@article{Kühne_Iannuzzi_Ben_Rybkin_Seewald_Stein_Laino_Khaliullin_Schütt_Schiffmann_et
    al._2020, title={CP2K: An electronic structure and molecular dynamics software
    package - Quickstep: Efficient and accurate electronic structure calculations},
    volume={152}, DOI={<a href="https://doi.org/10.1063/5.0007045">10.1063/5.0007045</a>},
    number={19}, journal={The Journal of Chemical Physics}, author={Kühne, Thomas
    and Iannuzzi, Marcella and Ben, Mauro Del and Rybkin, Vladimir V. and Seewald,
    Patrick and Stein, Frederick and Laino, Teodoro and Khaliullin, Rustam Z. and
    Schütt, Ole and Schiffmann, Florian and et al.}, year={2020} }'
  chicago: 'Kühne, Thomas, Marcella Iannuzzi, Mauro Del Ben, Vladimir V. Rybkin, Patrick
    Seewald, Frederick Stein, Teodoro Laino, et al. “CP2K: An Electronic Structure
    and Molecular Dynamics Software Package - Quickstep: Efficient and Accurate Electronic
    Structure Calculations.” <i>The Journal of Chemical Physics</i> 152, no. 19 (2020).
    <a href="https://doi.org/10.1063/5.0007045">https://doi.org/10.1063/5.0007045</a>.'
  ieee: 'T. Kühne <i>et al.</i>, “CP2K: An electronic structure and molecular dynamics
    software package - Quickstep: Efficient and accurate electronic structure calculations,”
    <i>The Journal of Chemical Physics</i>, vol. 152, no. 19, Art. no. 194103, 2020,
    doi: <a href="https://doi.org/10.1063/5.0007045">10.1063/5.0007045</a>.'
  mla: 'Kühne, Thomas, et al. “CP2K: An Electronic Structure and Molecular Dynamics
    Software Package - Quickstep: Efficient and Accurate Electronic Structure Calculations.”
    <i>The Journal of Chemical Physics</i>, vol. 152, no. 19, 194103, 2020, doi:<a
    href="https://doi.org/10.1063/5.0007045">10.1063/5.0007045</a>.'
  short: T. Kühne, M. Iannuzzi, M.D. Ben, V.V. Rybkin, P. Seewald, F. Stein, T. Laino,
    R.Z. Khaliullin, O. Schütt, F. Schiffmann, D. Golze, J. Wilhelm, S. Chulkov, M.H.
    Bani-Hashemian, V. Weber, U. Borstnik, M. Taillefumier, A.S.
    Jakobovits, A. Lazzaro, H. Pabst, T. Müller, R. Schade, M. Guidon, S. Andermatt,
    N. Holmberg, G.K. Schenter, A. Hehn, A. Bussy, F. Belleflamme, G. Tabacchi, A.
    Glöß, M. Lass, I. Bethune, C.J. Mundy, C. Plessl, M. Watkins, J. VandeVondele,
    M. Krack, J. Hutter, The Journal of Chemical Physics 152 (2020).
date_created: 2020-03-10T15:12:31Z
date_updated: 2023-08-02T14:56:21Z
ddc:
- '540'
department:
- _id: '27'
- _id: '518'
- _id: '304'
doi: 10.1063/5.0007045
external_id:
  arxiv:
  - '2003.03868'
file:
- access_level: closed
  content_type: application/pdf
  creator: lass
  date_created: 2020-05-25T15:21:56Z
  date_updated: 2020-05-25T15:21:56Z
  file_id: '17061'
  file_name: 5.0007045.pdf
  file_size: 4887650
  relation: main_file
  success: 1
file_date_updated: 2020-05-25T15:21:56Z
has_accepted_license: '1'
intvolume: '       152'
issue: '19'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://aip.scitation.org/doi/pdf/10.1063/5.0007045?download=true
oa: '1'
project:
- _id: '32'
  grant_number: PL 595/2-1 / 320898746
  name: Performance and Efficiency in HPC with Custom Computing
- _id: '52'
  name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: The Journal of Chemical Physics
publication_status: published
quality_controlled: '1'
status: public
title: 'CP2K: An electronic structure and molecular dynamics software package - Quickstep:
  Efficient and accurate electronic structure calculations'
type: journal_article
user_id: '75963'
volume: 152
year: '2020'
...
---
_id: '16898'
abstract:
- lang: eng
  text: "Electronic structure calculations based on density-functional theory (DFT)
    represent a significant part of today's HPC workloads and pose high demands on
    high-performance computing resources. To perform these quantum-mechanical DFT
    calculations on complex large-scale systems, so-called linear scaling methods
    instead of conventional cubic scaling methods are required. In this work, we
    take up the idea of the submatrix method and apply it to the DFT computations
    in the software package CP2K. For that purpose, we transform the underlying
    numeric operations on distributed, large, sparse matrices into computations on
    local, much smaller and nearly dense matrices. This allows us to exploit the
    full floating-point performance of modern CPUs and to make use of dedicated
    accelerator hardware, where performance has been limited by memory bandwidth
    before. We demonstrate both functionality and performance of our implementation
    and show how it can be accelerated with GPUs and FPGAs."
author:
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Robert
  full_name: Schade, Robert
  id: '75963'
  last_name: Schade
  orcid: 0000-0002-6268-5397
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: 'Lass M, Schade R, Kühne T, Plessl C. A Submatrix-Based Method for Approximate
    Matrix Function Evaluation in the Quantum Chemistry Code CP2K. In: <i>Proc. International
    Conference for High Performance Computing, Networking, Storage and Analysis (SC)</i>.
    IEEE Computer Society; 2020:1127-1140. doi:<a href="https://doi.org/10.1109/SC41405.2020.00084">10.1109/SC41405.2020.00084</a>'
  apa: Lass, M., Schade, R., Kühne, T., &#38; Plessl, C. (2020). A Submatrix-Based
    Method for Approximate Matrix Function Evaluation in the Quantum Chemistry Code
    CP2K. <i>Proc. International Conference for High Performance Computing, Networking,
    Storage and Analysis (SC)</i>, 1127–1140. <a href="https://doi.org/10.1109/SC41405.2020.00084">https://doi.org/10.1109/SC41405.2020.00084</a>
  bibtex: '@inproceedings{Lass_Schade_Kühne_Plessl_2020, place={Los Alamitos, CA,
    USA}, title={A Submatrix-Based Method for Approximate Matrix Function Evaluation
    in the Quantum Chemistry Code CP2K}, DOI={<a href="https://doi.org/10.1109/SC41405.2020.00084">10.1109/SC41405.2020.00084</a>},
    booktitle={Proc. International Conference for High Performance Computing, Networking,
    Storage and Analysis (SC)}, publisher={IEEE Computer Society}, author={Lass, Michael
    and Schade, Robert and Kühne, Thomas and Plessl, Christian}, year={2020}, pages={1127–1140}
    }'
  chicago: 'Lass, Michael, Robert Schade, Thomas Kühne, and Christian Plessl. “A Submatrix-Based
    Method for Approximate Matrix Function Evaluation in the Quantum Chemistry Code
    CP2K.” In <i>Proc. International Conference for High Performance Computing, Networking,
    Storage and Analysis (SC)</i>, 1127–40. Los Alamitos, CA, USA: IEEE Computer Society,
    2020. <a href="https://doi.org/10.1109/SC41405.2020.00084">https://doi.org/10.1109/SC41405.2020.00084</a>.'
  ieee: 'M. Lass, R. Schade, T. Kühne, and C. Plessl, “A Submatrix-Based Method for
    Approximate Matrix Function Evaluation in the Quantum Chemistry Code CP2K,” in
    <i>Proc. International Conference for High Performance Computing, Networking,
    Storage and Analysis (SC)</i>, Atlanta, GA, US, 2020, pp. 1127–1140, doi: <a href="https://doi.org/10.1109/SC41405.2020.00084">10.1109/SC41405.2020.00084</a>.'
  mla: Lass, Michael, et al. “A Submatrix-Based Method for Approximate Matrix Function
    Evaluation in the Quantum Chemistry Code CP2K.” <i>Proc. International Conference
    for High Performance Computing, Networking, Storage and Analysis (SC)</i>, IEEE
    Computer Society, 2020, pp. 1127–40, doi:<a href="https://doi.org/10.1109/SC41405.2020.00084">10.1109/SC41405.2020.00084</a>.
  short: 'M. Lass, R. Schade, T. Kühne, C. Plessl, in: Proc. International Conference
    for High Performance Computing, Networking, Storage and Analysis (SC), IEEE Computer
    Society, Los Alamitos, CA, USA, 2020, pp. 1127–1140.'
conference:
  location: Atlanta, GA, US
  name: 'SC20: International Conference for High Performance Computing, Networking,
    Storage and Analysis (SC)'
date_created: 2020-04-28T14:44:21Z
date_updated: 2023-08-02T14:55:59Z
department:
- _id: '27'
- _id: '518'
- _id: '304'
doi: 10.1109/SC41405.2020.00084
external_id:
  arxiv:
  - '2004.10811'
language:
- iso: eng
main_file_link:
- url: https://ieeexplore.ieee.org/document/9355245
page: 1127-1140
place: Los Alamitos, CA, USA
project:
- _id: '32'
  grant_number: PL 595/2-1 / 320898746
  name: Performance and Efficiency in HPC with Custom Computing
- _id: '52'
  name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: Proc. International Conference for High Performance Computing, Networking,
  Storage and Analysis (SC)
publisher: IEEE Computer Society
quality_controlled: '1'
status: public
title: A Submatrix-Based Method for Approximate Matrix Function Evaluation in the
  Quantum Chemistry Code CP2K
type: conference
user_id: '75963'
year: '2020'
...
---
_id: '12878'
abstract:
- lang: eng
  text: In scientific computing, the acceleration of atomistic computer simulations
    by means of custom hardware is finding ever-growing application. A major limitation,
    however, is that the high efficiency in terms of performance and low power consumption
    entails the massive usage of low precision computing units. Here, based on the
    approximate computing paradigm, we present an algorithmic method to compensate
    for numerical inaccuracies due to low accuracy arithmetic operations rigorously,
    yet still obtaining exact expectation values using a properly modified Langevin-type
    equation.
article_number: '39'
author:
- first_name: Varadarajan
  full_name: Rengaraj, Varadarajan
  last_name: Rengaraj
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
citation:
  ama: Rengaraj V, Lass M, Plessl C, Kühne T. Accurate Sampling with Noisy Forces
    from Approximate Computing. <i>Computation</i>. 2020;8(2). doi:<a href="https://doi.org/10.3390/computation8020039">10.3390/computation8020039</a>
  apa: Rengaraj, V., Lass, M., Plessl, C., &#38; Kühne, T. (2020). Accurate Sampling
    with Noisy Forces from Approximate Computing. <i>Computation</i>, <i>8</i>(2),
    Article 39. <a href="https://doi.org/10.3390/computation8020039">https://doi.org/10.3390/computation8020039</a>
  bibtex: '@article{Rengaraj_Lass_Plessl_Kühne_2020, title={Accurate Sampling with
    Noisy Forces from Approximate Computing}, volume={8}, DOI={<a href="https://doi.org/10.3390/computation8020039">10.3390/computation8020039</a>},
    number={2}, journal={Computation}, publisher={MDPI}, author={Rengaraj, Varadarajan
    and Lass, Michael and Plessl, Christian and Kühne, Thomas}, year={2020} }'
  chicago: Rengaraj, Varadarajan, Michael Lass, Christian Plessl, and Thomas Kühne.
    “Accurate Sampling with Noisy Forces from Approximate Computing.” <i>Computation</i>
    8, no. 2 (2020). <a href="https://doi.org/10.3390/computation8020039">https://doi.org/10.3390/computation8020039</a>.
  ieee: 'V. Rengaraj, M. Lass, C. Plessl, and T. Kühne, “Accurate Sampling with Noisy
    Forces from Approximate Computing,” <i>Computation</i>, vol. 8, no. 2, Art. no.
    39, 2020, doi: <a href="https://doi.org/10.3390/computation8020039">10.3390/computation8020039</a>.'
  mla: Rengaraj, Varadarajan, et al. “Accurate Sampling with Noisy Forces from Approximate
    Computing.” <i>Computation</i>, vol. 8, no. 2, 39, MDPI, 2020, doi:<a href="https://doi.org/10.3390/computation8020039">10.3390/computation8020039</a>.
  short: V. Rengaraj, M. Lass, C. Plessl, T. Kühne, Computation 8 (2020).
date_created: 2019-07-23T12:03:07Z
date_updated: 2023-09-26T11:43:52Z
department:
- _id: '27'
- _id: '518'
- _id: '304'
doi: 10.3390/computation8020039
external_id:
  arxiv:
  - '1907.08497'
intvolume: '         8'
issue: '2'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://www.mdpi.com/2079-3197/8/2/39/pdf
oa: '1'
project:
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
- _id: '32'
  grant_number: PL 595/2-1 / 320898746
  name: Performance and Efficiency in HPC with Custom Computing
publication: Computation
publisher: MDPI
quality_controlled: '1'
status: public
title: Accurate Sampling with Noisy Forces from Approximate Computing
type: journal_article
user_id: '15278'
volume: 8
year: '2020'
...
---
_id: '21'
abstract:
- lang: eng
  text: "We address the general mathematical problem of computing the inverse p-th
    root of a given matrix in an efficient way. A new method to construct iteration
    functions that allow calculating arbitrary p-th roots and their inverses of
    symmetric positive definite matrices is presented. We show that the order of
    convergence is at least quadratic and that adaptively adjusting a parameter q
    always leads to an even faster convergence. In this way, a better performance
    than with previously known iteration schemes is achieved. The efficiency of the
    iterative functions is demonstrated for various matrices with different
    densities, condition numbers and spectral radii."
author:
- first_name: Dorothee
  full_name: Richters, Dorothee
  last_name: Richters
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Andrea
  full_name: Walther, Andrea
  last_name: Walther
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
citation:
  ama: Richters D, Lass M, Walther A, Plessl C, Kühne T. A General Algorithm to Calculate
    the Inverse Principal p-th Root of Symmetric Positive Definite Matrices. <i>Communications
    in Computational Physics</i>. 2019;25(2):564-585. doi:<a href="https://doi.org/10.4208/cicp.OA-2018-0053">10.4208/cicp.OA-2018-0053</a>
  apa: Richters, D., Lass, M., Walther, A., Plessl, C., &#38; Kühne, T. (2019). A
    General Algorithm to Calculate the Inverse Principal p-th Root of Symmetric Positive
    Definite Matrices. <i>Communications in Computational Physics</i>, <i>25</i>(2),
    564–585. <a href="https://doi.org/10.4208/cicp.OA-2018-0053">https://doi.org/10.4208/cicp.OA-2018-0053</a>
  bibtex: '@article{Richters_Lass_Walther_Plessl_Kühne_2019, title={A General Algorithm
    to Calculate the Inverse Principal p-th Root of Symmetric Positive Definite Matrices},
    volume={25}, DOI={<a href="https://doi.org/10.4208/cicp.OA-2018-0053">10.4208/cicp.OA-2018-0053</a>},
    number={2}, journal={Communications in Computational Physics}, publisher={Global
    Science Press}, author={Richters, Dorothee and Lass, Michael and Walther, Andrea
    and Plessl, Christian and Kühne, Thomas}, year={2019}, pages={564–585} }'
  chicago: 'Richters, Dorothee, Michael Lass, Andrea Walther, Christian Plessl, and
    Thomas Kühne. “A General Algorithm to Calculate the Inverse Principal P-Th Root
    of Symmetric Positive Definite Matrices.” <i>Communications in Computational Physics</i>
    25, no. 2 (2019): 564–85. <a href="https://doi.org/10.4208/cicp.OA-2018-0053">https://doi.org/10.4208/cicp.OA-2018-0053</a>.'
  ieee: 'D. Richters, M. Lass, A. Walther, C. Plessl, and T. Kühne, “A General Algorithm
    to Calculate the Inverse Principal p-th Root of Symmetric Positive Definite Matrices,”
    <i>Communications in Computational Physics</i>, vol. 25, no. 2, pp. 564–585, 2019,
    doi: <a href="https://doi.org/10.4208/cicp.OA-2018-0053">10.4208/cicp.OA-2018-0053</a>.'
  mla: Richters, Dorothee, et al. “A General Algorithm to Calculate the Inverse Principal
    P-Th Root of Symmetric Positive Definite Matrices.” <i>Communications in Computational
    Physics</i>, vol. 25, no. 2, Global Science Press, 2019, pp. 564–85, doi:<a href="https://doi.org/10.4208/cicp.OA-2018-0053">10.4208/cicp.OA-2018-0053</a>.
  short: D. Richters, M. Lass, A. Walther, C. Plessl, T. Kühne, Communications in
    Computational Physics 25 (2019) 564–585.
date_created: 2017-07-25T14:48:26Z
date_updated: 2023-09-26T11:45:02Z
department:
- _id: '27'
- _id: '518'
- _id: '304'
- _id: '104'
doi: 10.4208/cicp.OA-2018-0053
external_id:
  arxiv:
  - '1703.02456'
intvolume: '        25'
issue: '2'
language:
- iso: eng
page: 564-585
project:
- _id: '32'
  grant_number: PL 595/2-1 / 320898746
  name: Performance and Efficiency in HPC with Custom Computing
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: Communications in Computational Physics
publisher: Global Science Press
quality_controlled: '1'
status: public
title: A General Algorithm to Calculate the Inverse Principal p-th Root of Symmetric
  Positive Definite Matrices
type: journal_article
user_id: '15278'
volume: 25
year: '2019'
...
---
_id: '20'
abstract:
- lang: eng
  text: "Approximate computing has been shown to provide new ways to improve performance
    and power consumption of error-resilient applications. While many of these
    applications can be found in image processing, data classification or machine
    learning, we demonstrate its suitability for a problem from scientific
    computing. Utilizing the self-correcting behavior of iterative algorithms, we
    show that approximate computing can be applied to the calculation of inverse
    matrix p-th roots, which are required in many applications in scientific
    computing. Results show great opportunities to reduce the computational effort
    and bandwidth required for the execution of the discussed algorithm, especially
    when targeting special accelerator hardware."
author:
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: Lass M, Kühne T, Plessl C. Using Approximate Computing for the Calculation
    of Inverse Matrix p-th Roots. <i>Embedded Systems Letters</i>. 2018;10(2):33-36.
    doi:<a href="https://doi.org/10.1109/LES.2017.2760923">10.1109/LES.2017.2760923</a>
  apa: Lass, M., Kühne, T., &#38; Plessl, C. (2018). Using Approximate Computing for
    the Calculation of Inverse Matrix p-th Roots. <i>Embedded Systems Letters</i>,
    <i>10</i>(2), 33–36. <a href="https://doi.org/10.1109/LES.2017.2760923">https://doi.org/10.1109/LES.2017.2760923</a>
  bibtex: '@article{Lass_Kühne_Plessl_2018, title={Using Approximate Computing for
    the Calculation of Inverse Matrix p-th Roots}, volume={10}, DOI={<a href="https://doi.org/10.1109/LES.2017.2760923">10.1109/LES.2017.2760923</a>},
    number={2}, journal={Embedded Systems Letters}, publisher={IEEE}, author={Lass,
    Michael and Kühne, Thomas and Plessl, Christian}, year={2018}, pages={33–36} }'
  chicago: 'Lass, Michael, Thomas Kühne, and Christian Plessl. “Using Approximate
    Computing for the Calculation of Inverse Matrix P-Th Roots.” <i>Embedded Systems
    Letters</i> 10, no. 2 (2018): 33–36. <a href="https://doi.org/10.1109/LES.2017.2760923">https://doi.org/10.1109/LES.2017.2760923</a>.'
  ieee: M. Lass, T. Kühne, and C. Plessl, “Using Approximate Computing for the Calculation
    of Inverse Matrix p-th Roots,” <i>Embedded Systems Letters</i>, vol. 10, no. 2,
    pp. 33–36, 2018.
  mla: Lass, Michael, et al. “Using Approximate Computing for the Calculation of Inverse
    Matrix P-Th Roots.” <i>Embedded Systems Letters</i>, vol. 10, no. 2, IEEE, 2018,
    pp. 33–36, doi:<a href="https://doi.org/10.1109/LES.2017.2760923">10.1109/LES.2017.2760923</a>.
  short: M. Lass, T. Kühne, C. Plessl, Embedded Systems Letters 10 (2018) 33–36.
date_created: 2017-07-25T14:41:08Z
date_updated: 2022-01-06T06:54:18Z
department:
- _id: '27'
- _id: '518'
- _id: '304'
doi: 10.1109/LES.2017.2760923
external_id:
  arxiv:
  - '1703.02283'
intvolume: '        10'
issue: '2'
language:
- iso: eng
page: '33-36'
project:
- _id: '32'
  grant_number: PL 595/2-1
  name: Performance and Efficiency in HPC with Custom Computing
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: Embedded Systems Letters
publication_identifier:
  eissn:
  - 1943-0671
  issn:
  - 1943-0663
publication_status: published
publisher: IEEE
status: public
title: Using Approximate Computing for the Calculation of Inverse Matrix p-th Roots
type: journal_article
user_id: '16153'
volume: 10
year: '2018'
...
---
_id: '1590'
abstract:
- lang: eng
  text: "We present the submatrix method, a highly parallelizable method for the approximate
    calculation of inverse p-th roots of large sparse symmetric matrices which are
    required in different scientific applications. Following the idea of Approximate
    Computing, we allow imprecision in the final result in order to utilize the sparsity
    of the input matrix and to allow massively parallel execution. For an n x n matrix,
    the proposed algorithm allows to distribute the calculations over n nodes with
    only little communication overhead. The result matrix exhibits the same sparsity
    pattern as the input matrix, allowing for efficient reuse of allocated data structures.\r\n\r\nWe
    evaluate the algorithm with respect to the error that it introduces into calculated
    results, as well as its performance and scalability. We demonstrate that the error
    is relatively limited for well-conditioned matrices and that results are still
    valuable for error-resilient applications like preconditioning even for ill-conditioned
    matrices. We discuss the execution time and scaling of the algorithm on a theoretical
    level and present a distributed implementation of the algorithm using MPI and
    OpenMP. We demonstrate the scalability of this implementation by running it on
    a high-performance compute cluster comprised of 1024 CPU cores, showing a speedup
    of 665x compared to single-threaded execution."
author:
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Stephan
  full_name: Mohr, Stephan
  last_name: Mohr
- first_name: Hendrik
  full_name: Wiebeler, Hendrik
  last_name: Wiebeler
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: 'Lass M, Mohr S, Wiebeler H, Kühne T, Plessl C. A Massively Parallel Algorithm
    for the Approximate Calculation of Inverse p-th Roots of Large Sparse Matrices.
    In: <i>Proc. Platform for Advanced Scientific Computing (PASC) Conference</i>.
    ACM; 2018. doi:<a href="https://doi.org/10.1145/3218176.3218231">10.1145/3218176.3218231</a>'
  apa: Lass, M., Mohr, S., Wiebeler, H., Kühne, T., &#38; Plessl, C. (2018). A Massively
    Parallel Algorithm for the Approximate Calculation of Inverse p-th Roots of Large
    Sparse Matrices. <i>Proc. Platform for Advanced Scientific Computing (PASC) Conference</i>.
    Platform for Advanced Scientific Computing Conference (PASC), Basel, Switzerland.
    <a href="https://doi.org/10.1145/3218176.3218231">https://doi.org/10.1145/3218176.3218231</a>
  bibtex: '@inproceedings{Lass_Mohr_Wiebeler_Kühne_Plessl_2018, place={New York, NY,
    USA}, title={A Massively Parallel Algorithm for the Approximate Calculation of
    Inverse p-th Roots of Large Sparse Matrices}, DOI={<a href="https://doi.org/10.1145/3218176.3218231">10.1145/3218176.3218231</a>},
    booktitle={Proc. Platform for Advanced Scientific Computing (PASC) Conference},
    publisher={ACM}, author={Lass, Michael and Mohr, Stephan and Wiebeler, Hendrik
    and Kühne, Thomas and Plessl, Christian}, year={2018} }'
  chicago: 'Lass, Michael, Stephan Mohr, Hendrik Wiebeler, Thomas Kühne, and Christian
    Plessl. “A Massively Parallel Algorithm for the Approximate Calculation of Inverse
    P-Th Roots of Large Sparse Matrices.” In <i>Proc. Platform for Advanced Scientific
    Computing (PASC) Conference</i>. New York, NY, USA: ACM, 2018. <a href="https://doi.org/10.1145/3218176.3218231">https://doi.org/10.1145/3218176.3218231</a>.'
  ieee: 'M. Lass, S. Mohr, H. Wiebeler, T. Kühne, and C. Plessl, “A Massively Parallel
    Algorithm for the Approximate Calculation of Inverse p-th Roots of Large Sparse
    Matrices,” presented at the Platform for Advanced Scientific Computing Conference
    (PASC), Basel, Switzerland, 2018, doi: <a href="https://doi.org/10.1145/3218176.3218231">10.1145/3218176.3218231</a>.'
  mla: Lass, Michael, et al. “A Massively Parallel Algorithm for the Approximate Calculation
    of Inverse P-Th Roots of Large Sparse Matrices.” <i>Proc. Platform for Advanced
    Scientific Computing (PASC) Conference</i>, ACM, 2018, doi:<a href="https://doi.org/10.1145/3218176.3218231">10.1145/3218176.3218231</a>.
  short: 'M. Lass, S. Mohr, H. Wiebeler, T. Kühne, C. Plessl, in: Proc. Platform for
    Advanced Scientific Computing (PASC) Conference, ACM, New York, NY, USA, 2018.'
conference:
  end_date: 2018-07-04
  location: Basel, Switzerland
  name: Platform for Advanced Scientific Computing Conference (PASC)
  start_date: 2018-07-02
date_created: 2018-03-22T10:53:01Z
date_updated: 2023-09-26T11:48:12Z
department:
- _id: '27'
- _id: '518'
- _id: '304'
doi: 10.1145/3218176.3218231
external_id:
  arxiv:
  - '1710.10899'
keyword:
- approximate computing
- linear algebra
- matrix inversion
- matrix p-th roots
- numeric algorithm
- parallel computing
language:
- iso: eng
place: New York, NY, USA
project:
- _id: '32'
  grant_number: PL 595/2-1 / 320898746
  name: Performance and Efficiency in HPC with Custom Computing
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: Proc. Platform for Advanced Scientific Computing (PASC) Conference
publication_identifier:
  isbn:
  - 978-1-4503-5891-0
publisher: ACM
quality_controlled: '1'
status: public
title: A Massively Parallel Algorithm for the Approximate Calculation of Inverse p-th
  Roots of Large Sparse Matrices
type: conference
user_id: '15278'
year: '2018'
...
---
_id: '18'
abstract:
- lang: eng
  text: "Branch and bound (B&B) algorithms structure the search space as a tree and
    eliminate infeasible solutions early by pruning subtrees that cannot lead to a
    valid or optimal solution. Custom hardware designs significantly accelerate the
    execution of these algorithms. In this article, we demonstrate a high-performance
    B&B implementation on FPGAs. First, we identify general elements of B&B algorithms
    and describe their implementation as a finite state machine. Then, we introduce
    workers that autonomously cooperate using work stealing to allow parallel execution
    and full utilization of the target FPGA. Finally, we explore advantages of instance-specific
    designs that target a specific problem instance to improve performance.\r\n\r\nWe
    evaluate our concepts by applying them to a branch and bound problem, the reconstruction
    of corrupted AES keys obtained from cold-boot attacks. The evaluation shows that
    our work stealing approach is scalable with the available resources and provides
    speedups proportional to the number of workers. Instance-specific designs allow
    us to achieve an overall speedup of 47 × compared to the fastest implementation
    of AES key reconstruction so far. Finally, we demonstrate how instance-specific
    designs can be generated just-in-time such that the provided speedups outweigh
    the additional time required for design synthesis."
author:
- first_name: Heinrich
  full_name: Riebler, Heinrich
  id: '8961'
  last_name: Riebler
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Robert
  full_name: Mittendorf, Robert
  last_name: Mittendorf
- first_name: Thomas
  full_name: Löcke, Thomas
  last_name: Löcke
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: Riebler H, Lass M, Mittendorf R, Löcke T, Plessl C. Efficient Branch and Bound
    on FPGAs Using Work Stealing and Instance-Specific Designs. <i>ACM Transactions
    on Reconfigurable Technology and Systems (TRETS)</i>. 2017;10(3):24:1-24:23. doi:<a
    href="https://doi.org/10.1145/3053687">10.1145/3053687</a>
  apa: Riebler, H., Lass, M., Mittendorf, R., Löcke, T., &#38; Plessl, C. (2017).
    Efficient Branch and Bound on FPGAs Using Work Stealing and Instance-Specific
    Designs. <i>ACM Transactions on Reconfigurable Technology and Systems (TRETS)</i>,
    <i>10</i>(3), 24:1-24:23. <a href="https://doi.org/10.1145/3053687">https://doi.org/10.1145/3053687</a>
  bibtex: '@article{Riebler_Lass_Mittendorf_Löcke_Plessl_2017, title={Efficient Branch
    and Bound on FPGAs Using Work Stealing and Instance-Specific Designs}, volume={10},
    DOI={<a href="https://doi.org/10.1145/3053687">10.1145/3053687</a>}, number={3},
    journal={ACM Transactions on Reconfigurable Technology and Systems (TRETS)}, publisher={Association
    for Computing Machinery (ACM)}, author={Riebler, Heinrich and Lass, Michael and
    Mittendorf, Robert and Löcke, Thomas and Plessl, Christian}, year={2017}, pages={24:1-24:23}
    }'
  chicago: 'Riebler, Heinrich, Michael Lass, Robert Mittendorf, Thomas Löcke, and
    Christian Plessl. “Efficient Branch and Bound on FPGAs Using Work Stealing and
    Instance-Specific Designs.” <i>ACM Transactions on Reconfigurable Technology and
    Systems (TRETS)</i> 10, no. 3 (2017): 24:1-24:23. <a href="https://doi.org/10.1145/3053687">https://doi.org/10.1145/3053687</a>.'
  ieee: 'H. Riebler, M. Lass, R. Mittendorf, T. Löcke, and C. Plessl, “Efficient Branch
    and Bound on FPGAs Using Work Stealing and Instance-Specific Designs,” <i>ACM
    Transactions on Reconfigurable Technology and Systems (TRETS)</i>, vol. 10, no.
    3, p. 24:1-24:23, 2017, doi: <a href="https://doi.org/10.1145/3053687">10.1145/3053687</a>.'
  mla: Riebler, Heinrich, et al. “Efficient Branch and Bound on FPGAs Using Work Stealing
    and Instance-Specific Designs.” <i>ACM Transactions on Reconfigurable Technology
    and Systems (TRETS)</i>, vol. 10, no. 3, Association for Computing Machinery (ACM),
    2017, p. 24:1-24:23, doi:<a href="https://doi.org/10.1145/3053687">10.1145/3053687</a>.
  short: H. Riebler, M. Lass, R. Mittendorf, T. Löcke, C. Plessl, ACM Transactions
    on Reconfigurable Technology and Systems (TRETS) 10 (2017) 24:1-24:23.
date_created: 2017-07-25T14:17:32Z
date_updated: 2023-09-26T13:23:58Z
ddc:
- '000'
department:
- _id: '27'
- _id: '518'
doi: 10.1145/3053687
file:
- access_level: closed
  content_type: application/pdf
  creator: ups
  date_created: 2018-11-02T16:04:14Z
  date_updated: 2018-11-02T16:04:14Z
  file_id: '5322'
  file_name: a24-riebler.pdf
  file_size: 2131617
  relation: main_file
  success: 1
file_date_updated: 2018-11-02T16:04:14Z
has_accepted_license: '1'
intvolume: '        10'
issue: '3'
keyword:
- coldboot
language:
- iso: eng
page: 24:1-24:23
project:
- _id: '1'
  grant_number: '160364472'
  name: SFB 901
- _id: '4'
  name: SFB 901 - Project Area C
- _id: '14'
  grant_number: '160364472'
  name: SFB 901 - Subproject C2
- _id: '34'
  grant_number: '610996'
  name: Self-Adaptive Virtualisation-Aware High-Performance/Low-Energy Heterogeneous
    System Architectures
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: ACM Transactions on Reconfigurable Technology and Systems (TRETS)
publication_identifier:
  issn:
  - 1936-7406
publication_status: published
publisher: Association for Computing Machinery (ACM)
quality_controlled: '1'
status: public
title: Efficient Branch and Bound on FPGAs Using Work Stealing and Instance-Specific
  Designs
type: journal_article
user_id: '15278'
volume: 10
year: '2017'
...
---
_id: '19'
abstract:
- lang: eng
  text: "Version Control Systems (VCS) are a valuable tool for software development and
    document management. Both client/server and distributed (Peer-to-Peer) models
    exist, with the latter (e.g., Git and Mercurial) becoming increasingly popular.
    Their distributed nature introduces complications, especially concerning security:
    it is hard to control the dissemination of contents stored in distributed VCS
    as they rely on replication of complete repositories to any involved user.\r\n\r\nWe
    overcome this issue by designing and implementing a concept for cryptography-enforced
    access control which is transparent to the user. Use of field-tested schemes
    (end-to-end encryption, digital signatures) allows for strong security, while
    adoption of convergent encryption and content-defined chunking retains storage
    efficiency. The concept is seamlessly integrated into Mercurial---respecting
    its distributed storage concept---to ensure practical usability and compatibility
    to existing deployments."
author:
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Dominik
  full_name: Leibenger, Dominik
  last_name: Leibenger
- first_name: Christoph
  full_name: Sorge, Christoph
  last_name: Sorge
citation:
  ama: 'Lass M, Leibenger D, Sorge C. Confidentiality and Authenticity for Distributed
    Version Control Systems - A Mercurial Extension. In: <i>Proc. 41st Conference
    on Local Computer Networks (LCN)</i>. IEEE; 2016. doi:<a href="https://doi.org/10.1109/lcn.2016.11">10.1109/lcn.2016.11</a>'
  apa: Lass, M., Leibenger, D., &#38; Sorge, C. (2016). Confidentiality and Authenticity
    for Distributed Version Control Systems - A Mercurial Extension. In <i>Proc. 41st
    Conference on Local Computer Networks (LCN)</i>. IEEE. <a href="https://doi.org/10.1109/lcn.2016.11">https://doi.org/10.1109/lcn.2016.11</a>
  bibtex: '@inproceedings{Lass_Leibenger_Sorge_2016, title={Confidentiality and Authenticity
    for Distributed Version Control Systems - A Mercurial Extension}, DOI={<a href="https://doi.org/10.1109/lcn.2016.11">10.1109/lcn.2016.11</a>},
    booktitle={Proc. 41st Conference on Local Computer Networks (LCN)}, publisher={IEEE},
    author={Lass, Michael and Leibenger, Dominik and Sorge, Christoph}, year={2016}
    }'
  chicago: Lass, Michael, Dominik Leibenger, and Christoph Sorge. “Confidentiality
    and Authenticity for Distributed Version Control Systems - A Mercurial Extension.”
    In <i>Proc. 41st Conference on Local Computer Networks (LCN)</i>. IEEE, 2016.
    <a href="https://doi.org/10.1109/lcn.2016.11">https://doi.org/10.1109/lcn.2016.11</a>.
  ieee: M. Lass, D. Leibenger, and C. Sorge, “Confidentiality and Authenticity for
    Distributed Version Control Systems - A Mercurial Extension,” in <i>Proc. 41st
    Conference on Local Computer Networks (LCN)</i>, 2016.
  mla: Lass, Michael, et al. “Confidentiality and Authenticity for Distributed Version
    Control Systems - A Mercurial Extension.” <i>Proc. 41st Conference on Local Computer
    Networks (LCN)</i>, IEEE, 2016, doi:<a href="https://doi.org/10.1109/lcn.2016.11">10.1109/lcn.2016.11</a>.
  short: 'M. Lass, D. Leibenger, C. Sorge, in: Proc. 41st Conference on Local Computer
    Networks (LCN), IEEE, 2016.'
date_created: 2017-07-25T14:36:16Z
date_updated: 2022-01-06T06:53:56Z
department:
- _id: '27'
- _id: '518'
doi: 10.1109/lcn.2016.11
keyword:
- access control
- distributed version control systems
- mercurial
- peer-to-peer
- convergent encryption
- confidentiality
- authenticity
language:
- iso: eng
publication: Proc. 41st Conference on Local Computer Networks (LCN)
publication_identifier:
  isbn:
  - 978-1-5090-2054-6
publication_status: published
publisher: IEEE
status: public
title: Confidentiality and Authenticity for Distributed Version Control Systems -
  A Mercurial Extension
type: conference
user_id: '24135'
year: '2016'
...
---
_id: '25'
author:
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
- first_name: Thomas
  full_name: Kühne, Thomas
  id: '49079'
  last_name: Kühne
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
citation:
  ama: 'Lass M, Kühne T, Plessl C. Using Approximate Computing in Scientific Codes.
    In: <i>Workshop on Approximate Computing (AC)</i>. ; 2016.'
  apa: Lass, M., Kühne, T., &#38; Plessl, C. (2016). Using Approximate Computing in
    Scientific Codes. <i>Workshop on Approximate Computing (AC)</i>.
  bibtex: '@inproceedings{Lass_Kühne_Plessl_2016, title={Using Approximate Computing
    in Scientific Codes}, booktitle={Workshop on Approximate Computing (AC)}, author={Lass,
    Michael and Kühne, Thomas and Plessl, Christian}, year={2016} }'
  chicago: Lass, Michael, Thomas Kühne, and Christian Plessl. “Using Approximate Computing
    in Scientific Codes.” In <i>Workshop on Approximate Computing (AC)</i>, 2016.
  ieee: M. Lass, T. Kühne, and C. Plessl, “Using Approximate Computing in Scientific
    Codes,” 2016.
  mla: Lass, Michael, et al. “Using Approximate Computing in Scientific Codes.” <i>Workshop
    on Approximate Computing (AC)</i>, 2016.
  short: 'M. Lass, T. Kühne, C. Plessl, in: Workshop on Approximate Computing (AC),
    2016.'
date_created: 2017-07-26T15:02:20Z
date_updated: 2023-09-26T13:25:17Z
department:
- _id: '27'
- _id: '518'
- _id: '304'
language:
- iso: eng
project:
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: Workshop on Approximate Computing (AC)
quality_controlled: '1'
status: public
title: Using Approximate Computing in Scientific Codes
type: conference
user_id: '15278'
year: '2016'
...
---
_id: '1794'
abstract:
- lang: eng
  text: Demands for computational power and energy efficiency of computing devices
    are steadily increasing. At the same time, following classic methods to increase
    speed and reduce energy consumption of these devices becomes increasingly difficult,
    bringing alternative methods into focus. One of these methods is approximate computing
    which utilizes the fact that small errors in computations are acceptable in many
    applications in order to allow acceleration of these computations or to increase
    energy efficiency. This thesis develops elements of a workflow that can be followed
    to apply approximate computing to existing applications. It proposes a novel heuristic
    approach to the localization of code paths that are suitable to approximate computing
    based on findings in recent research. Additionally, an approach to identification
    of approximable instructions within these code paths is proposed and used to implement
    simulation of approximation. The parts of the workflow are implemented with the
    goal to lay the foundation for a partly automated toolflow. Evaluation of the
    developed techniques shows that the proposed methods can help providing a convenient
    workflow, facilitating the first steps into the application of approximate computing.
author:
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
citation:
  ama: 'Lass M. <i>Localization and Analysis of Code Paths Suitable for Acceleration
    Using Approximate Computing</i>. Paderborn: Paderborn University; 2015.'
  apa: 'Lass, M. (2015). <i>Localization and Analysis of Code Paths Suitable for Acceleration
    using Approximate Computing</i>. Paderborn: Paderborn University.'
  bibtex: '@book{Lass_2015, place={Paderborn}, title={Localization and Analysis of
    Code Paths Suitable for Acceleration using Approximate Computing}, publisher={Paderborn
    University}, author={Lass, Michael}, year={2015} }'
  chicago: 'Lass, Michael. <i>Localization and Analysis of Code Paths Suitable for
    Acceleration Using Approximate Computing</i>. Paderborn: Paderborn University,
    2015.'
  ieee: 'M. Lass, <i>Localization and Analysis of Code Paths Suitable for Acceleration
    using Approximate Computing</i>. Paderborn: Paderborn University, 2015.'
  mla: Lass, Michael. <i>Localization and Analysis of Code Paths Suitable for Acceleration
    Using Approximate Computing</i>. Paderborn University, 2015.
  short: M. Lass, Localization and Analysis of Code Paths Suitable for Acceleration
    Using Approximate Computing, Paderborn University, Paderborn, 2015.
date_created: 2018-03-26T15:24:10Z
date_updated: 2022-01-06T06:53:23Z
department:
- _id: '27'
- _id: '518'
place: Paderborn
publisher: Paderborn University
status: public
supervisor:
- first_name: Christian
  full_name: Plessl, Christian
  id: '16153'
  last_name: Plessl
  orcid: 0000-0001-5728-9982
title: Localization and Analysis of Code Paths Suitable for Acceleration using Approximate
  Computing
type: mastersthesis
user_id: '24135'
year: '2015'
...
---
_id: '1795'
abstract:
- lang: eng
  text: Distributed revision control is widespread throughout the software industry.
    Systems like git and mercurial gained a lot of users over the last years and started
    to supersede central systems like Subversion or CVS in some projects. While restricting
    access to those central systems is basically possible, it is difficult to control
    the propagation of contents in a distributed revision control system because every
    user has a local copy of the whole repository. In this thesis a concept is developed
    and implemented that allows secure storage of confidential data in a distributed
    revision control system and enables users to manage read and write permissions
    on single confidential files. Therefore different cryptographic methods are used,
    such as asymmetric encryption, digital signatures and convergent encryption. These
    techniques are applied in a manner that fits the special requirements of a revision
    control system and allows a space efficient storage of changes to the encrypted
    files.
author:
- first_name: Michael
  full_name: Lass, Michael
  id: '24135'
  last_name: Lass
  orcid: 0000-0002-5708-7632
citation:
  ama: 'Lass M. <i>Sichere Speicherung Vertraulicher Daten in Verteilten Versionskontrollsystemen</i>.
    Paderborn: Paderborn University; 2013.'
  apa: 'Lass, M. (2013). <i>Sichere Speicherung vertraulicher Daten in verteilten
    Versionskontrollsystemen</i>. Paderborn: Paderborn University.'
  bibtex: '@book{Lass_2013, place={Paderborn}, title={Sichere Speicherung vertraulicher
    Daten in verteilten Versionskontrollsystemen}, publisher={Paderborn University},
    author={Lass, Michael}, year={2013} }'
  chicago: 'Lass, Michael. <i>Sichere Speicherung Vertraulicher Daten in Verteilten
    Versionskontrollsystemen</i>. Paderborn: Paderborn University, 2013.'
  ieee: 'M. Lass, <i>Sichere Speicherung vertraulicher Daten in verteilten Versionskontrollsystemen</i>.
    Paderborn: Paderborn University, 2013.'
  mla: Lass, Michael. <i>Sichere Speicherung Vertraulicher Daten in Verteilten Versionskontrollsystemen</i>.
    Paderborn University, 2013.
  short: M. Lass, Sichere Speicherung Vertraulicher Daten in Verteilten Versionskontrollsystemen,
    Paderborn University, Paderborn, 2013.
date_created: 2018-03-26T15:25:01Z
date_updated: 2022-01-06T06:53:23Z
place: Paderborn
publisher: Paderborn University
status: public
supervisor:
- first_name: Christoph
  full_name: Sorge, Christoph
  last_name: Sorge
title: Sichere Speicherung vertraulicher Daten in verteilten Versionskontrollsystemen
type: bachelorsthesis
user_id: '24135'
year: '2013'
...
