---
_id: '32177'
abstract:
- lang: eng
text: "We investigate the early time development of the anisotropic transverse flow and
spatial eccentricities of a fireball with various particle-based transport approaches
using a fixed initial condition. In numerical simulations ranging from the
quasi-collisionless case to the hydrodynamic regime, we find that the onset
of $v_n$ and of related measures of anisotropic flow can be described with
a simple power-law ansatz, with an exponent that depends on the amount of rescatterings
in the system. In the few-rescatterings regime we perform semi-analytical calculations,
based on a systematic expansion in powers of time and the cross section, which
can reproduce the numerical findings."
author:
- first_name: Nicolas
full_name: Borghini, Nicolas
last_name: Borghini
- first_name: Marc
full_name: Borrell, Marc
last_name: Borrell
- first_name: Hendrik
full_name: Roch, Hendrik
last_name: Roch
citation:
ama: Borghini N, Borrell M, Roch H. Early time behavior of spatial and momentum
anisotropies in kinetic theory across different Knudsen numbers. arXiv:2201.13294.
Published online 2022.
apa: Borghini, N., Borrell, M., & Roch, H. (2022). Early time behavior of spatial
and momentum anisotropies in kinetic theory across different Knudsen numbers.
In arXiv:2201.13294.
bibtex: '@article{Borghini_Borrell_Roch_2022, title={Early time behavior of spatial
and momentum anisotropies in kinetic theory across different Knudsen numbers},
journal={arXiv:2201.13294}, author={Borghini, Nicolas and Borrell, Marc and Roch,
Hendrik}, year={2022} }'
chicago: Borghini, Nicolas, Marc Borrell, and Hendrik Roch. “Early Time Behavior
of Spatial and Momentum Anisotropies in Kinetic Theory across Different Knudsen
Numbers.” ArXiv:2201.13294, 2022.
ieee: N. Borghini, M. Borrell, and H. Roch, “Early time behavior of spatial and
momentum anisotropies in kinetic theory across different Knudsen numbers,” arXiv:2201.13294.
2022.
mla: Borghini, Nicolas, et al. “Early Time Behavior of Spatial and Momentum Anisotropies
in Kinetic Theory across Different Knudsen Numbers.” ArXiv:2201.13294,
2022.
short: N. Borghini, M. Borrell, H. Roch, ArXiv:2201.13294 (2022).
date_created: 2022-06-27T09:08:04Z
date_updated: 2022-06-27T09:35:53Z
department:
- _id: '27'
external_id:
arxiv:
- '2201.13294'
language:
- iso: eng
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: arXiv:2201.13294
status: public
title: Early time behavior of spatial and momentum anisotropies in kinetic theory
across different Knudsen numbers
type: preprint
user_id: '67287'
year: '2022'
...
---
_id: '32178'
abstract:
- lang: eng
text: "We test the ability of the \"escape mechanism\" to create the anisotropic
flow observed in high-energy nuclear collisions. We compare the flow harmonics
$v_n$ in the few-rescatterings regime from two types of transport simulations,
with $2\\to 2$ and $2\\to 0$ collision kernels respectively, and from analytical calculations
neglecting the gain term of the Boltzmann equation. We find that the even flow
harmonics are similar in the three approaches, while the odd harmonics differ
significantly."
author:
- first_name: Benedikt
full_name: Bachmann, Benedikt
last_name: Bachmann
- first_name: Nicolas
full_name: Borghini, Nicolas
last_name: Borghini
- first_name: Nina
full_name: Feld, Nina
last_name: Feld
- first_name: Hendrik
full_name: Roch, Hendrik
last_name: Roch
citation:
ama: Bachmann B, Borghini N, Feld N, Roch H. Even anisotropic-flow harmonics are
from Venus, odd ones are from Mars. arXiv:2203.13306. Published online 2022.
apa: Bachmann, B., Borghini, N., Feld, N., & Roch, H. (2022). Even anisotropic-flow
harmonics are from Venus, odd ones are from Mars. In arXiv:2203.13306.
bibtex: '@article{Bachmann_Borghini_Feld_Roch_2022, title={Even anisotropic-flow
harmonics are from Venus, odd ones are from Mars}, journal={arXiv:2203.13306},
author={Bachmann, Benedikt and Borghini, Nicolas and Feld, Nina and Roch, Hendrik},
year={2022} }'
chicago: Bachmann, Benedikt, Nicolas Borghini, Nina Feld, and Hendrik Roch. “Even
Anisotropic-Flow Harmonics Are from Venus, Odd Ones Are from Mars.” ArXiv:2203.13306,
2022.
ieee: B. Bachmann, N. Borghini, N. Feld, and H. Roch, “Even anisotropic-flow harmonics
are from Venus, odd ones are from Mars,” arXiv:2203.13306. 2022.
mla: Bachmann, Benedikt, et al. “Even Anisotropic-Flow Harmonics Are from Venus,
Odd Ones Are from Mars.” ArXiv:2203.13306, 2022.
short: B. Bachmann, N. Borghini, N. Feld, H. Roch, ArXiv:2203.13306 (2022).
date_created: 2022-06-27T09:12:26Z
date_updated: 2022-06-27T09:35:34Z
department:
- _id: '27'
external_id:
arxiv:
- '2203.13306'
language:
- iso: eng
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: arXiv:2203.13306
status: public
title: Even anisotropic-flow harmonics are from Venus, odd ones are from Mars
type: preprint
user_id: '67287'
year: '2022'
...
---
_id: '33493'
abstract:
- lang: eng
text: "Electronic structure calculations have been instrumental in providing many important
insights into a range of physical and chemical properties of various molecular
and solid-state systems. Their importance to various fields, including materials
science, chemical sciences, computational chemistry and device physics, is
underscored by the large fraction of available public supercomputing resources
devoted to these calculations. As we enter the exascale era, exciting new opportunities
to increase simulation numbers, sizes, and accuracies present themselves. In
order to realize these promises, the community of electronic structure software
developers will however first have to tackle a number of challenges pertaining
to the efficient use of new architectures that will rely heavily on massive
parallelism and hardware accelerators. This roadmap provides a broad overview
of the state-of-the-art in electronic structure calculations and of the various
new directions being pursued by the community. It covers 14 electronic structure
codes, presenting their current status, their development priorities over the
next five years, and their plans towards tackling the challenges and leveraging
the opportunities presented by the advent of exascale computing."
author:
- first_name: Vikram
full_name: Gavini, Vikram
last_name: Gavini
- first_name: Stefano
full_name: Baroni, Stefano
last_name: Baroni
- first_name: Volker
full_name: Blum, Volker
last_name: Blum
- first_name: David R.
full_name: Bowler, David R.
last_name: Bowler
- first_name: Alexander
full_name: Buccheri, Alexander
last_name: Buccheri
- first_name: James R.
full_name: Chelikowsky, James R.
last_name: Chelikowsky
- first_name: Sambit
full_name: Das, Sambit
last_name: Das
- first_name: William
full_name: Dawson, William
last_name: Dawson
- first_name: Pietro
full_name: Delugas, Pietro
last_name: Delugas
- first_name: Mehmet
full_name: Dogan, Mehmet
last_name: Dogan
- first_name: Claudia
full_name: Draxl, Claudia
last_name: Draxl
- first_name: Giulia
full_name: Galli, Giulia
last_name: Galli
- first_name: Luigi
full_name: Genovese, Luigi
last_name: Genovese
- first_name: Paolo
full_name: Giannozzi, Paolo
last_name: Giannozzi
- first_name: Matteo
full_name: Giantomassi, Matteo
last_name: Giantomassi
- first_name: Xavier
full_name: Gonze, Xavier
last_name: Gonze
- first_name: Marco
full_name: Govoni, Marco
last_name: Govoni
- first_name: Andris
full_name: Gulans, Andris
last_name: Gulans
- first_name: François
full_name: Gygi, François
last_name: Gygi
- first_name: John M.
full_name: Herbert, John M.
last_name: Herbert
- first_name: Sebastian
full_name: Kokott, Sebastian
last_name: Kokott
- first_name: Thomas
full_name: Kühne, Thomas
id: '49079'
last_name: Kühne
- first_name: Kai-Hsin
full_name: Liou, Kai-Hsin
last_name: Liou
- first_name: Tsuyoshi
full_name: Miyazaki, Tsuyoshi
last_name: Miyazaki
- first_name: Phani
full_name: Motamarri, Phani
last_name: Motamarri
- first_name: Ayako
full_name: Nakata, Ayako
last_name: Nakata
- first_name: John E.
full_name: Pask, John E.
last_name: Pask
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
- first_name: Laura E.
full_name: Ratcliff, Laura E.
last_name: Ratcliff
- first_name: Ryan M.
full_name: Richard, Ryan M.
last_name: Richard
- first_name: Mariana
full_name: Rossi, Mariana
last_name: Rossi
- first_name: Robert
full_name: Schade, Robert
id: '75963'
last_name: Schade
orcid: 0000-0002-6268-539
- first_name: Matthias
full_name: Scheffler, Matthias
last_name: Scheffler
- first_name: Ole
full_name: Schütt, Ole
last_name: Schütt
- first_name: Phanish
full_name: Suryanarayana, Phanish
last_name: Suryanarayana
- first_name: Marc
full_name: Torrent, Marc
last_name: Torrent
- first_name: Lionel
full_name: Truflandier, Lionel
last_name: Truflandier
- first_name: Theresa L.
full_name: Windus, Theresa L.
last_name: Windus
- first_name: Qimen
full_name: Xu, Qimen
last_name: Xu
- first_name: Victor W.-Z.
full_name: Yu, Victor W.-Z.
last_name: Yu
- first_name: Danny
full_name: Perez, Danny
last_name: Perez
citation:
ama: Gavini V, Baroni S, Blum V, et al. Roadmap on Electronic Structure Codes in
the Exascale Era. arXiv:2209.12747. Published online 2022.
apa: Gavini, V., Baroni, S., Blum, V., Bowler, D. R., Buccheri, A., Chelikowsky,
J. R., Das, S., Dawson, W., Delugas, P., Dogan, M., Draxl, C., Galli, G., Genovese,
L., Giannozzi, P., Giantomassi, M., Gonze, X., Govoni, M., Gulans, A., Gygi, F.,
… Perez, D. (2022). Roadmap on Electronic Structure Codes in the Exascale Era.
In arXiv:2209.12747.
bibtex: '@article{Gavini_Baroni_Blum_Bowler_Buccheri_Chelikowsky_Das_Dawson_Delugas_Dogan_et
al._2022, title={Roadmap on Electronic Structure Codes in the Exascale Era}, journal={arXiv:2209.12747},
author={Gavini, Vikram and Baroni, Stefano and Blum, Volker and Bowler, David
R. and Buccheri, Alexander and Chelikowsky, James R. and Das, Sambit and Dawson,
William and Delugas, Pietro and Dogan, Mehmet and et al.}, year={2022} }'
chicago: Gavini, Vikram, Stefano Baroni, Volker Blum, David R. Bowler, Alexander
Buccheri, James R. Chelikowsky, Sambit Das, et al. “Roadmap on Electronic Structure
Codes in the Exascale Era.” ArXiv:2209.12747, 2022.
ieee: V. Gavini et al., “Roadmap on Electronic Structure Codes in the Exascale
Era,” arXiv:2209.12747. 2022.
mla: Gavini, Vikram, et al. “Roadmap on Electronic Structure Codes in the Exascale
Era.” ArXiv:2209.12747, 2022.
short: V. Gavini, S. Baroni, V. Blum, D.R. Bowler, A. Buccheri, J.R. Chelikowsky,
S. Das, W. Dawson, P. Delugas, M. Dogan, C. Draxl, G. Galli, L. Genovese, P. Giannozzi,
M. Giantomassi, X. Gonze, M. Govoni, A. Gulans, F. Gygi, J.M. Herbert, S. Kokott,
T. Kühne, K.-H. Liou, T. Miyazaki, P. Motamarri, A. Nakata, J.E. Pask, C. Plessl,
L.E. Ratcliff, R.M. Richard, M. Rossi, R. Schade, M. Scheffler, O. Schütt, P.
Suryanarayana, M. Torrent, L. Truflandier, T.L. Windus, Q. Xu, V.W.-Z. Yu, D.
Perez, ArXiv:2209.12747 (2022).
date_created: 2022-09-28T05:25:10Z
date_updated: 2023-07-28T08:03:41Z
department:
- _id: '27'
- _id: '518'
external_id:
arxiv:
- '2209.12747'
language:
- iso: eng
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: arXiv:2209.12747
status: public
title: Roadmap on Electronic Structure Codes in the Exascale Era
type: preprint
user_id: '24135'
year: '2022'
...
---
_id: '46275'
abstract:
- lang: eng
text: "Electronic structure calculations have been instrumental in providing many important
insights into a range of physical and chemical properties of various molecular
and solid-state systems. Their importance to various fields, including materials
science, chemical sciences, computational chemistry and device physics, is
underscored by the large fraction of available public supercomputing resources
devoted to these calculations. As we enter the exascale era, exciting new opportunities
to increase simulation numbers, sizes, and accuracies present themselves. In
order to realize these promises, the community of electronic structure software
developers will however first have to tackle a number of challenges pertaining
to the efficient use of new architectures that will rely heavily on massive
parallelism and hardware accelerators. This roadmap provides a broad overview
of the state-of-the-art in electronic structure calculations and of the various
new directions being pursued by the community. It covers 14 electronic structure
codes, presenting their current status, their development priorities over the
next five years, and their plans towards tackling the challenges and leveraging
the opportunities presented by the advent of exascale computing."
author:
- first_name: Vikram
full_name: Gavini, Vikram
last_name: Gavini
- first_name: Stefano
full_name: Baroni, Stefano
last_name: Baroni
- first_name: Volker
full_name: Blum, Volker
last_name: Blum
- first_name: David R.
full_name: Bowler, David R.
last_name: Bowler
- first_name: Alexander
full_name: Buccheri, Alexander
last_name: Buccheri
- first_name: James R.
full_name: Chelikowsky, James R.
last_name: Chelikowsky
- first_name: Sambit
full_name: Das, Sambit
last_name: Das
- first_name: William
full_name: Dawson, William
last_name: Dawson
- first_name: Pietro
full_name: Delugas, Pietro
last_name: Delugas
- first_name: Mehmet
full_name: Dogan, Mehmet
last_name: Dogan
- first_name: Claudia
full_name: Draxl, Claudia
last_name: Draxl
- first_name: Giulia
full_name: Galli, Giulia
last_name: Galli
- first_name: Luigi
full_name: Genovese, Luigi
last_name: Genovese
- first_name: Paolo
full_name: Giannozzi, Paolo
last_name: Giannozzi
- first_name: Matteo
full_name: Giantomassi, Matteo
last_name: Giantomassi
- first_name: Xavier
full_name: Gonze, Xavier
last_name: Gonze
- first_name: Marco
full_name: Govoni, Marco
last_name: Govoni
- first_name: Andris
full_name: Gulans, Andris
last_name: Gulans
- first_name: François
full_name: Gygi, François
last_name: Gygi
- first_name: John M.
full_name: Herbert, John M.
last_name: Herbert
- first_name: Sebastian
full_name: Kokott, Sebastian
last_name: Kokott
- first_name: Thomas
full_name: Kühne, Thomas
id: '49079'
last_name: Kühne
- first_name: Kai-Hsin
full_name: Liou, Kai-Hsin
last_name: Liou
- first_name: Tsuyoshi
full_name: Miyazaki, Tsuyoshi
last_name: Miyazaki
- first_name: Phani
full_name: Motamarri, Phani
last_name: Motamarri
- first_name: Ayako
full_name: Nakata, Ayako
last_name: Nakata
- first_name: John E.
full_name: Pask, John E.
last_name: Pask
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
- first_name: Laura E.
full_name: Ratcliff, Laura E.
last_name: Ratcliff
- first_name: Ryan M.
full_name: Richard, Ryan M.
last_name: Richard
- first_name: Mariana
full_name: Rossi, Mariana
last_name: Rossi
- first_name: Robert
full_name: Schade, Robert
id: '75963'
last_name: Schade
orcid: 0000-0002-6268-539
- first_name: Matthias
full_name: Scheffler, Matthias
last_name: Scheffler
- first_name: Ole
full_name: Schütt, Ole
last_name: Schütt
- first_name: Phanish
full_name: Suryanarayana, Phanish
last_name: Suryanarayana
- first_name: Marc
full_name: Torrent, Marc
last_name: Torrent
- first_name: Lionel
full_name: Truflandier, Lionel
last_name: Truflandier
- first_name: Theresa L.
full_name: Windus, Theresa L.
last_name: Windus
- first_name: Qimen
full_name: Xu, Qimen
last_name: Xu
- first_name: Victor W.-Z.
full_name: Yu, Victor W.-Z.
last_name: Yu
- first_name: Danny
full_name: Perez, Danny
last_name: Perez
citation:
ama: Gavini V, Baroni S, Blum V, et al. Roadmap on Electronic Structure Codes in
the Exascale Era. arXiv:2209.12747. Published online 2022.
apa: Gavini, V., Baroni, S., Blum, V., Bowler, D. R., Buccheri, A., Chelikowsky,
J. R., Das, S., Dawson, W., Delugas, P., Dogan, M., Draxl, C., Galli, G., Genovese,
L., Giannozzi, P., Giantomassi, M., Gonze, X., Govoni, M., Gulans, A., Gygi, F.,
… Perez, D. (2022). Roadmap on Electronic Structure Codes in the Exascale Era.
In arXiv:2209.12747.
bibtex: '@article{Gavini_Baroni_Blum_Bowler_Buccheri_Chelikowsky_Das_Dawson_Delugas_Dogan_et
al._2022, title={Roadmap on Electronic Structure Codes in the Exascale Era}, journal={arXiv:2209.12747},
author={Gavini, Vikram and Baroni, Stefano and Blum, Volker and Bowler, David
R. and Buccheri, Alexander and Chelikowsky, James R. and Das, Sambit and Dawson,
William and Delugas, Pietro and Dogan, Mehmet and et al.}, year={2022} }'
chicago: Gavini, Vikram, Stefano Baroni, Volker Blum, David R. Bowler, Alexander
Buccheri, James R. Chelikowsky, Sambit Das, et al. “Roadmap on Electronic Structure
Codes in the Exascale Era.” ArXiv:2209.12747, 2022.
ieee: V. Gavini et al., “Roadmap on Electronic Structure Codes in the Exascale
Era,” arXiv:2209.12747. 2022.
mla: Gavini, Vikram, et al. “Roadmap on Electronic Structure Codes in the Exascale
Era.” ArXiv:2209.12747, 2022.
short: V. Gavini, S. Baroni, V. Blum, D.R. Bowler, A. Buccheri, J.R. Chelikowsky,
S. Das, W. Dawson, P. Delugas, M. Dogan, C. Draxl, G. Galli, L. Genovese, P. Giannozzi,
M. Giantomassi, X. Gonze, M. Govoni, A. Gulans, F. Gygi, J.M. Herbert, S. Kokott,
T. Kühne, K.-H. Liou, T. Miyazaki, P. Motamarri, A. Nakata, J.E. Pask, C. Plessl,
L.E. Ratcliff, R.M. Richard, M. Rossi, R. Schade, M. Scheffler, O. Schütt, P.
Suryanarayana, M. Torrent, L. Truflandier, T.L. Windus, Q. Xu, V.W.-Z. Yu, D.
Perez, ArXiv:2209.12747 (2022).
date_created: 2023-08-02T14:59:18Z
date_updated: 2023-08-02T15:00:47Z
department:
- _id: '27'
external_id:
arxiv:
- '2209.12747'
language:
- iso: eng
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: arXiv:2209.12747
status: public
title: Roadmap on Electronic Structure Codes in the Exascale Era
type: preprint
user_id: '75963'
year: '2022'
...
---
_id: '46194'
author:
- first_name: Tobias
full_name: Kenter, Tobias
id: '3145'
last_name: Kenter
- first_name: Adesh
full_name: Shambhu, Adesh
last_name: Shambhu
- first_name: Sara
full_name: Faghih-Naini, Sara
last_name: Faghih-Naini
- first_name: Vadym
full_name: Aizinger, Vadym
last_name: Aizinger
citation:
ama: 'Kenter T, Shambhu A, Faghih-Naini S, Aizinger V. Algorithm-hardware co-design
of a discontinuous Galerkin shallow-water model for a dataflow architecture on
FPGA. In: Proceedings of the Platform for Advanced Scientific Computing Conference.
ACM; 2021. doi:10.1145/3468267.3470617'
apa: Kenter, T., Shambhu, A., Faghih-Naini, S., & Aizinger, V. (2021). Algorithm-hardware
co-design of a discontinuous Galerkin shallow-water model for a dataflow architecture
on FPGA. Proceedings of the Platform for Advanced Scientific Computing Conference.
https://doi.org/10.1145/3468267.3470617
bibtex: '@inproceedings{Kenter_Shambhu_Faghih-Naini_Aizinger_2021, title={Algorithm-hardware
co-design of a discontinuous Galerkin shallow-water model for a dataflow architecture
on FPGA}, DOI={10.1145/3468267.3470617},
booktitle={Proceedings of the Platform for Advanced Scientific Computing Conference},
publisher={ACM}, author={Kenter, Tobias and Shambhu, Adesh and Faghih-Naini, Sara
and Aizinger, Vadym}, year={2021} }'
chicago: Kenter, Tobias, Adesh Shambhu, Sara Faghih-Naini, and Vadym Aizinger. “Algorithm-Hardware
Co-Design of a Discontinuous Galerkin Shallow-Water Model for a Dataflow Architecture
on FPGA.” In Proceedings of the Platform for Advanced Scientific Computing
Conference. ACM, 2021. https://doi.org/10.1145/3468267.3470617.
ieee: 'T. Kenter, A. Shambhu, S. Faghih-Naini, and V. Aizinger, “Algorithm-hardware
co-design of a discontinuous Galerkin shallow-water model for a dataflow architecture
on FPGA,” 2021, doi: 10.1145/3468267.3470617.'
mla: Kenter, Tobias, et al. “Algorithm-Hardware Co-Design of a Discontinuous Galerkin
Shallow-Water Model for a Dataflow Architecture on FPGA.” Proceedings of the
Platform for Advanced Scientific Computing Conference, ACM, 2021, doi:10.1145/3468267.3470617.
short: 'T. Kenter, A. Shambhu, S. Faghih-Naini, V. Aizinger, in: Proceedings of
the Platform for Advanced Scientific Computing Conference, ACM, 2021.'
date_created: 2023-07-28T11:58:14Z
date_updated: 2023-07-28T12:03:19Z
department:
- _id: '27'
- _id: '518'
doi: 10.1145/3468267.3470617
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://dl.acm.org/doi/pdf/10.1145/3468267.3470617
oa: '1'
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: Proceedings of the Platform for Advanced Scientific Computing Conference
publication_status: published
publisher: ACM
quality_controlled: '1'
status: public
title: Algorithm-hardware co-design of a discontinuous Galerkin shallow-water model
for a dataflow architecture on FPGA
type: conference
user_id: '3145'
year: '2021'
...
---
_id: '20886'
author:
- first_name: Tobias
full_name: Nickchen, Tobias
last_name: Nickchen
- first_name: Stefan
full_name: Heindorf, Stefan
last_name: Heindorf
- first_name: Gregor
full_name: Engels, Gregor
last_name: Engels
citation:
ama: 'Nickchen T, Heindorf S, Engels G. Generating Physically Sound Training Data
for Image Recognition of Additively Manufactured Parts. In: Proceedings of
the IEEE/CVF Winter Conference on Applications of Computer Vision. ; 2021:1994-2002.'
apa: Nickchen, T., Heindorf, S., & Engels, G. (2021). Generating Physically
Sound Training Data for Image Recognition of Additively Manufactured Parts. In
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision
(pp. 1994–2002). Hawaii.
bibtex: '@inproceedings{Nickchen_Heindorf_Engels_2021, title={Generating Physically
Sound Training Data for Image Recognition of Additively Manufactured Parts}, booktitle={Proceedings
of the IEEE/CVF Winter Conference on Applications of Computer Vision}, author={Nickchen,
Tobias and Heindorf, Stefan and Engels, Gregor}, year={2021}, pages={1994–2002}
}'
chicago: Nickchen, Tobias, Stefan Heindorf, and Gregor Engels. “Generating Physically
Sound Training Data for Image Recognition of Additively Manufactured Parts.” In
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision,
1994–2002, 2021.
ieee: T. Nickchen, S. Heindorf, and G. Engels, “Generating Physically Sound Training
Data for Image Recognition of Additively Manufactured Parts,” in Proceedings
of the IEEE/CVF Winter Conference on Applications of Computer Vision, Hawaii,
2021, pp. 1994–2002.
mla: Nickchen, Tobias, et al. “Generating Physically Sound Training Data for Image
Recognition of Additively Manufactured Parts.” Proceedings of the IEEE/CVF
Winter Conference on Applications of Computer Vision, 2021, pp. 1994–2002.
short: 'T. Nickchen, S. Heindorf, G. Engels, in: Proceedings of the IEEE/CVF Winter
Conference on Applications of Computer Vision, 2021, pp. 1994–2002.'
conference:
end_date: 2021-01-09
location: Hawaii
name: IEEE/CVF Winter Conference on Applications of Computer Vision
start_date: 2021-01-05
date_created: 2021-01-07T15:32:45Z
date_updated: 2022-01-06T06:54:41Z
department:
- _id: '66'
- _id: '534'
- _id: '624'
- _id: '219'
- _id: '27'
language:
- iso: eng
page: 1994-2002
publication: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer
Vision
publication_status: published
status: public
title: Generating Physically Sound Training Data for Image Recognition of Additively
Manufactured Parts
type: conference
user_id: '27340'
year: '2021'
...
---
_id: '46195'
author:
- first_name: Martin
full_name: Karp, Martin
last_name: Karp
- first_name: Artur
full_name: Podobas, Artur
last_name: Podobas
- first_name: Niclas
full_name: Jansson, Niclas
last_name: Jansson
- first_name: Tobias
full_name: Kenter, Tobias
id: '3145'
last_name: Kenter
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
- first_name: Philipp
full_name: Schlatter, Philipp
last_name: Schlatter
- first_name: Stefano
full_name: Markidis, Stefano
last_name: Markidis
citation:
ama: 'Karp M, Podobas A, Jansson N, et al. High-Performance Spectral Element Methods
on Field-Programmable Gate Arrays : Implementation, Evaluation, and Future Projection.
In: 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
IEEE; 2021. doi:10.1109/ipdps49936.2021.00116'
apa: 'Karp, M., Podobas, A., Jansson, N., Kenter, T., Plessl, C., Schlatter, P.,
& Markidis, S. (2021). High-Performance Spectral Element Methods on Field-Programmable
Gate Arrays : Implementation, Evaluation, and Future Projection. 2021 IEEE
International Parallel and Distributed Processing Symposium (IPDPS). https://doi.org/10.1109/ipdps49936.2021.00116'
bibtex: '@inproceedings{Karp_Podobas_Jansson_Kenter_Plessl_Schlatter_Markidis_2021,
title={High-Performance Spectral Element Methods on Field-Programmable Gate Arrays :
Implementation, Evaluation, and Future Projection}, DOI={10.1109/ipdps49936.2021.00116},
booktitle={2021 IEEE International Parallel and Distributed Processing Symposium
(IPDPS)}, publisher={IEEE}, author={Karp, Martin and Podobas, Artur and Jansson,
Niclas and Kenter, Tobias and Plessl, Christian and Schlatter, Philipp and Markidis,
Stefano}, year={2021} }'
chicago: 'Karp, Martin, Artur Podobas, Niclas Jansson, Tobias Kenter, Christian
Plessl, Philipp Schlatter, and Stefano Markidis. “High-Performance Spectral Element
Methods on Field-Programmable Gate Arrays : Implementation, Evaluation, and Future
Projection.” In 2021 IEEE International Parallel and Distributed Processing
Symposium (IPDPS). IEEE, 2021. https://doi.org/10.1109/ipdps49936.2021.00116.'
ieee: 'M. Karp et al., “High-Performance Spectral Element Methods on Field-Programmable
Gate Arrays : Implementation, Evaluation, and Future Projection,” 2021, doi: 10.1109/ipdps49936.2021.00116.'
mla: 'Karp, Martin, et al. “High-Performance Spectral Element Methods on Field-Programmable
Gate Arrays : Implementation, Evaluation, and Future Projection.” 2021 IEEE
International Parallel and Distributed Processing Symposium (IPDPS), IEEE,
2021, doi:10.1109/ipdps49936.2021.00116.'
short: 'M. Karp, A. Podobas, N. Jansson, T. Kenter, C. Plessl, P. Schlatter, S.
Markidis, in: 2021 IEEE International Parallel and Distributed Processing Symposium
(IPDPS), IEEE, 2021.'
date_created: 2023-07-28T12:04:27Z
date_updated: 2023-07-28T12:05:15Z
department:
- _id: '27'
- _id: '518'
doi: 10.1109/ipdps49936.2021.00116
language:
- iso: eng
publication: 2021 IEEE International Parallel and Distributed Processing Symposium
(IPDPS)
publication_status: published
publisher: IEEE
quality_controlled: '1'
status: public
title: 'High-Performance Spectral Element Methods on Field-Programmable Gate Arrays
: Implementation, Evaluation, and Future Projection'
type: conference
user_id: '3145'
year: '2021'
...
---
_id: '29937'
author:
- first_name: Martin
full_name: Karp, Martin
last_name: Karp
- first_name: Artur
full_name: Podobas, Artur
last_name: Podobas
- first_name: Niclas
full_name: Jansson, Niclas
last_name: Jansson
- first_name: Tobias
full_name: Kenter, Tobias
id: '3145'
last_name: Kenter
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
- first_name: Philipp
full_name: Schlatter, Philipp
last_name: Schlatter
- first_name: Stefano
full_name: Markidis, Stefano
last_name: Markidis
citation:
ama: 'Karp M, Podobas A, Jansson N, et al. High-Performance Spectral Element Methods
on Field-Programmable Gate Arrays : Implementation, Evaluation, and Future Projection.
In: 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
IEEE; 2021. doi:10.1109/ipdps49936.2021.00116'
apa: 'Karp, M., Podobas, A., Jansson, N., Kenter, T., Plessl, C., Schlatter, P.,
& Markidis, S. (2021). High-Performance Spectral Element Methods on Field-Programmable
Gate Arrays : Implementation, Evaluation, and Future Projection. 2021 IEEE
International Parallel and Distributed Processing Symposium (IPDPS). https://doi.org/10.1109/ipdps49936.2021.00116'
bibtex: '@inproceedings{Karp_Podobas_Jansson_Kenter_Plessl_Schlatter_Markidis_2021,
title={High-Performance Spectral Element Methods on Field-Programmable Gate Arrays :
Implementation, Evaluation, and Future Projection}, DOI={10.1109/ipdps49936.2021.00116},
booktitle={2021 IEEE International Parallel and Distributed Processing Symposium
(IPDPS)}, publisher={IEEE}, author={Karp, Martin and Podobas, Artur and Jansson,
Niclas and Kenter, Tobias and Plessl, Christian and Schlatter, Philipp and Markidis,
Stefano}, year={2021} }'
chicago: 'Karp, Martin, Artur Podobas, Niclas Jansson, Tobias Kenter, Christian
Plessl, Philipp Schlatter, and Stefano Markidis. “High-Performance Spectral Element
Methods on Field-Programmable Gate Arrays : Implementation, Evaluation, and Future
Projection.” In 2021 IEEE International Parallel and Distributed Processing
Symposium (IPDPS). IEEE, 2021. https://doi.org/10.1109/ipdps49936.2021.00116.'
ieee: 'M. Karp et al., “High-Performance Spectral Element Methods on Field-Programmable
Gate Arrays : Implementation, Evaluation, and Future Projection,” 2021, doi: 10.1109/ipdps49936.2021.00116.'
mla: 'Karp, Martin, et al. “High-Performance Spectral Element Methods on Field-Programmable
Gate Arrays : Implementation, Evaluation, and Future Projection.” 2021 IEEE
International Parallel and Distributed Processing Symposium (IPDPS), IEEE,
2021, doi:10.1109/ipdps49936.2021.00116.'
short: 'M. Karp, A. Podobas, N. Jansson, T. Kenter, C. Plessl, P. Schlatter, S.
Markidis, in: 2021 IEEE International Parallel and Distributed Processing Symposium
(IPDPS), IEEE, 2021.'
date_created: 2022-02-21T14:26:37Z
date_updated: 2024-01-22T09:59:13Z
department:
- _id: '27'
- _id: '518'
doi: 10.1109/ipdps49936.2021.00116
language:
- iso: eng
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: 2021 IEEE International Parallel and Distributed Processing Symposium
(IPDPS)
publication_status: published
publisher: IEEE
quality_controlled: '1'
status: public
title: 'High-Performance Spectral Element Methods on Field-Programmable Gate Arrays
: Implementation, Evaluation, and Future Projection'
type: conference
user_id: '3145'
year: '2021'
...
---
_id: '32245'
abstract:
- lang: eng
text: "Optical travelling wave antennas offer unique opportunities to control and selectively
guide light into a specific direction which renders them as excellent candidates
for optical communication and sensing. These applications require state of
the art engineering to reach optimized functionalities such as high directivity
and radiation efficiency, low side lobe level, broadband and tunable capabilities,
and compact design. In this work we report on the numerical optimization of
the directivity of optical travelling wave antennas made from low-loss dielectric
materials using full-wave numerical simulations in conjunction with a particle
swarm optimization algorithm. The antennas are composed of a reflector and
a director deposited on a glass substrate and an emitter placed in the feed
gap between them serves as an internal source of excitation. In particular,
we analysed antennas with rectangular- and horn-shaped directors made of either
Hafnium dioxide or Silicon. The optimized antennas produce highly directional
emission due to the presence of two dominant guided TE modes in the director
in addition to leaky modes. These guided modes dominate the far-field emission
pattern and govern the direction of the main lobe emission which predominately
originates from the end facet of the director. Our work also provides a comprehensive
analysis of the modes, radiation patterns, parametric influences, and bandwidths
of the antennas that highlights their robust nature."
author:
- first_name: Henna
full_name: Farheen, Henna
last_name: Farheen
- first_name: Till
full_name: Leuteritz, Till
last_name: Leuteritz
- first_name: Stefan
full_name: Linden, Stefan
last_name: Linden
- first_name: Viktor
full_name: Myroshnychenko, Viktor
last_name: Myroshnychenko
- first_name: Jens
full_name: Förstner, Jens
last_name: Förstner
citation:
ama: Farheen H, Leuteritz T, Linden S, Myroshnychenko V, Förstner J. Optimization
of optical waveguide antennas for directive emission of light. arXiv:210602468.
Published online 2021.
apa: Farheen, H., Leuteritz, T., Linden, S., Myroshnychenko, V., & Förstner,
J. (2021). Optimization of optical waveguide antennas for directive emission of
light. In arXiv:2106.02468.
bibtex: '@article{Farheen_Leuteritz_Linden_Myroshnychenko_Förstner_2021, title={Optimization
of optical waveguide antennas for directive emission of light}, journal={arXiv:2106.02468},
author={Farheen, Henna and Leuteritz, Till and Linden, Stefan and Myroshnychenko,
Viktor and Förstner, Jens}, year={2021} }'
chicago: Farheen, Henna, Till Leuteritz, Stefan Linden, Viktor Myroshnychenko, and
Jens Förstner. “Optimization of Optical Waveguide Antennas for Directive Emission
of Light.” ArXiv:2106.02468, 2021.
ieee: H. Farheen, T. Leuteritz, S. Linden, V. Myroshnychenko, and J. Förstner, “Optimization
of optical waveguide antennas for directive emission of light,” arXiv:2106.02468.
2021.
mla: Farheen, Henna, et al. “Optimization of Optical Waveguide Antennas for Directive
Emission of Light.” ArXiv:2106.02468, 2021.
short: H. Farheen, T. Leuteritz, S. Linden, V. Myroshnychenko, J. Förstner, ArXiv:2106.02468
(2021).
date_created: 2022-06-28T08:01:09Z
date_updated: 2022-06-28T08:01:39Z
department:
- _id: '27'
external_id:
arxiv:
- '2106.02468'
language:
- iso: eng
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: arXiv:2106.02468
status: public
title: Optimization of optical waveguide antennas for directive emission of light
type: preprint
user_id: '15278'
year: '2021'
...
---
_id: '32236'
abstract:
- lang: eng
text: "The interaction between quantum light and matter is being intensively studied for
systems that are enclosed in high-$Q$ cavities which strongly enhance the light-matter
coupling. However, for many applications, cavities with lower $Q$-factors are
preferred due to the increased spectral width of the cavity mode. Here, we
investigate the interaction between quantum light and matter represented by
a $\\Lambda$-type three-level system in lossy cavities, assuming that cavity
losses are the dominant loss mechanism. We demonstrate that cavity losses lead
to non-trivial steady states of the electronic occupations that can be controlled
by the loss rate and the initial statistics of the quantum fields. The mechanism
of formation of such steady states can be understood on the basis of the equations
of motion. Analytical expressions for steady states and their numerical simulations
are presented and discussed."
author:
- first_name: H.
full_name: Rose, H.
last_name: Rose
- first_name: O. V.
full_name: Tikhonova, O. V.
last_name: Tikhonova
- first_name: T.
full_name: Meier, T.
last_name: Meier
- first_name: P.
full_name: Sharapova, P.
last_name: Sharapova
citation:
ama: Rose H, Tikhonova OV, Meier T, Sharapova P. Steady states of $Λ$-type three-level
systems excited by quantum light in lossy cavities. arXiv:210900842. Published
online 2021.
apa: Rose, H., Tikhonova, O. V., Meier, T., & Sharapova, P. (2021). Steady states
of $Λ$-type three-level systems excited by quantum light in lossy cavities. In
arXiv:2109.00842.
bibtex: '@article{Rose_Tikhonova_Meier_Sharapova_2021, title={Steady states of $Λ$-type
three-level systems excited by quantum light in lossy cavities}, journal={arXiv:2109.00842},
author={Rose, H. and Tikhonova, O. V. and Meier, T. and Sharapova, P.}, year={2021}
}'
chicago: Rose, H., O. V. Tikhonova, T. Meier, and P. Sharapova. “Steady States
of $Λ$-Type Three-Level Systems Excited by Quantum Light in Lossy Cavities.”
ArXiv:2109.00842, 2021.
ieee: H. Rose, O. V. Tikhonova, T. Meier, and P. Sharapova, “Steady states of $Λ$-type
three-level systems excited by quantum light in lossy cavities,” arXiv:2109.00842.
2021.
mla: Rose, H., et al. “Steady States of $Λ$-Type Three-Level Systems Excited by
Quantum Light in Lossy Cavities.” ArXiv:2109.00842, 2021.
short: H. Rose, O.V. Tikhonova, T. Meier, P. Sharapova, ArXiv:2109.00842 (2021).
date_created: 2022-06-28T07:03:29Z
date_updated: 2023-02-10T16:00:12Z
department:
- _id: '27'
external_id:
arxiv:
- '2109.00842'
language:
- iso: eng
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: arXiv:2109.00842
status: public
title: Steady states of $Λ$-type three-level systems excited by quantum light in
lossy cavities
type: preprint
user_id: '14931'
year: '2021'
...
---
_id: '32244'
abstract:
- lang: eng
text: "We push the boundaries of electronic structure-based \\textit{ab-initio} molecular
dynamics (AIMD) beyond 100 million atoms. This scale is otherwise barely reachable
with classical force-field methods or novel neural network and machine learning
potentials. We achieve this breakthrough by combining innovations in linear-scaling
AIMD, efficient and approximate sparse linear algebra, low and mixed-precision
floating-point computation on GPUs, and a compensation scheme for the errors
introduced by numerical approximations. The core of our work is the non-orthogonalized
local submatrix method (NOLSM), which scales very favorably to massively parallel
computing systems and translates large sparse matrix operations into highly
parallel, dense matrix operations that are ideally suited to hardware accelerators.
We demonstrate that the NOLSM method, which is at the center point of each
AIMD step, is able to achieve a sustained performance of 324 PFLOP/s in mixed
FP16/FP32 precision corresponding to an efficiency of 67.7% when running on
1536 NVIDIA A100 GPUs."
author:
- first_name: Robert
full_name: Schade, Robert
last_name: Schade
- first_name: Tobias
full_name: Kenter, Tobias
last_name: Kenter
- first_name: Hossam
full_name: Elgabarty, Hossam
last_name: Elgabarty
- first_name: Michael
full_name: Lass, Michael
last_name: Lass
- first_name: Ole
full_name: Schütt, Ole
last_name: Schütt
- first_name: Alfio
full_name: Lazzaro, Alfio
last_name: Lazzaro
- first_name: Hans
full_name: Pabst, Hans
last_name: Pabst
- first_name: Stephan
full_name: Mohr, Stephan
last_name: Mohr
- first_name: Jürg
full_name: Hutter, Jürg
last_name: Hutter
- first_name: Thomas D.
full_name: Kühne, Thomas D.
last_name: Kühne
- first_name: Christian
full_name: Plessl, Christian
last_name: Plessl
citation:
ama: Schade R, Kenter T, Elgabarty H, et al. Towards Electronic Structure-Based
Ab-Initio Molecular Dynamics Simulations with Hundreds of Millions of Atoms.
arXiv:210408245. Published online 2021.
apa: Schade, R., Kenter, T., Elgabarty, H., Lass, M., Schütt, O., Lazzaro, A., Pabst,
H., Mohr, S., Hutter, J., Kühne, T. D., & Plessl, C. (2021). Towards Electronic
Structure-Based Ab-Initio Molecular Dynamics Simulations with Hundreds of Millions
of Atoms. In arXiv:2104.08245.
bibtex: '@article{Schade_Kenter_Elgabarty_Lass_Schütt_Lazzaro_Pabst_Mohr_Hutter_Kühne_et
al._2021, title={Towards Electronic Structure-Based Ab-Initio Molecular Dynamics
Simulations with Hundreds of Millions of Atoms}, journal={arXiv:2104.08245}, author={Schade,
Robert and Kenter, Tobias and Elgabarty, Hossam and Lass, Michael and Schütt,
Ole and Lazzaro, Alfio and Pabst, Hans and Mohr, Stephan and Hutter, Jürg and
Kühne, Thomas D. and et al.}, year={2021} }'
chicago: Schade, Robert, Tobias Kenter, Hossam Elgabarty, Michael Lass, Ole Schütt,
Alfio Lazzaro, Hans Pabst, et al. “Towards Electronic Structure-Based Ab-Initio
Molecular Dynamics Simulations with Hundreds of Millions of Atoms.” ArXiv:2104.08245,
2021.
ieee: R. Schade et al., “Towards Electronic Structure-Based Ab-Initio Molecular
Dynamics Simulations with Hundreds of Millions of Atoms,” arXiv:2104.08245.
2021.
mla: Schade, Robert, et al. “Towards Electronic Structure-Based Ab-Initio Molecular
Dynamics Simulations with Hundreds of Millions of Atoms.” ArXiv:2104.08245,
2021.
short: R. Schade, T. Kenter, H. Elgabarty, M. Lass, O. Schütt, A. Lazzaro, H. Pabst,
S. Mohr, J. Hutter, T.D. Kühne, C. Plessl, ArXiv:2104.08245 (2021).
date_created: 2022-06-28T07:48:31Z
date_updated: 2022-06-28T07:49:31Z
department:
- _id: '27'
external_id:
arxiv:
- '2104.08245'
language:
- iso: eng
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: arXiv:2104.08245
status: public
title: Towards Electronic Structure-Based Ab-Initio Molecular Dynamics Simulations
with Hundreds of Millions of Atoms
type: preprint
user_id: '15278'
year: '2021'
...
---
_id: '27365'
author:
- first_name: Marius
full_name: Meyer, Marius
id: '40778'
last_name: Meyer
citation:
ama: 'Meyer M. Towards Performance Characterization of FPGAs in Context of HPC using
OpenCL Benchmarks. In: Proceedings of the 11th International Symposium on Highly
Efficient Accelerators and Reconfigurable Technologies. ; 2021. doi:10.1145/3468044.3468058'
apa: Meyer, M. (2021). Towards Performance Characterization of FPGAs in Context
of HPC using OpenCL Benchmarks. Proceedings of the 11th International Symposium
on Highly Efficient Accelerators and Reconfigurable Technologies. https://doi.org/10.1145/3468044.3468058
bibtex: '@inproceedings{Meyer_2021, title={Towards Performance Characterization
of FPGAs in Context of HPC using OpenCL Benchmarks}, DOI={10.1145/3468044.3468058},
booktitle={Proceedings of the 11th International Symposium on Highly Efficient
Accelerators and Reconfigurable Technologies}, author={Meyer, Marius}, year={2021}
}'
chicago: Meyer, Marius. “Towards Performance Characterization of FPGAs in Context
of HPC Using OpenCL Benchmarks.” In Proceedings of the 11th International Symposium
on Highly Efficient Accelerators and Reconfigurable Technologies, 2021. https://doi.org/10.1145/3468044.3468058.
ieee: 'M. Meyer, “Towards Performance Characterization of FPGAs in Context of HPC
using OpenCL Benchmarks,” 2021, doi: 10.1145/3468044.3468058.'
mla: Meyer, Marius. “Towards Performance Characterization of FPGAs in Context of
HPC Using OpenCL Benchmarks.” Proceedings of the 11th International Symposium
on Highly Efficient Accelerators and Reconfigurable Technologies, 2021, doi:10.1145/3468044.3468058.
short: 'M. Meyer, in: Proceedings of the 11th International Symposium on Highly
Efficient Accelerators and Reconfigurable Technologies, 2021.'
date_created: 2021-11-10T14:42:17Z
date_updated: 2022-01-06T06:57:38Z
department:
- _id: '27'
doi: 10.1145/3468044.3468058
language:
- iso: eng
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: Proceedings of the 11th International Symposium on Highly Efficient Accelerators
and Reconfigurable Technologies
publication_status: published
status: public
title: Towards Performance Characterization of FPGAs in Context of HPC using OpenCL
Benchmarks
type: conference
user_id: '40778'
year: '2021'
...
---
_id: '16898'
abstract:
- lang: eng
text: "Electronic structure calculations based on density-functional theory (DFT) represent
a significant part of today's HPC workloads and pose high demands on high-performance
computing resources. To perform these quantum-mechanical DFT calculations on
complex large-scale systems, so-called linear scaling methods instead of conventional
cubic scaling methods are required. In this work, we take up the idea of the
submatrix method and apply it to the DFT computations in the software package
CP2K. For that purpose, we transform the underlying numeric operations on distributed,
large, sparse matrices into computations on local, much smaller and nearly
dense matrices. This allows us to exploit the full floating-point performance
of modern CPUs and to make use of dedicated accelerator hardware, where performance
has been limited by memory bandwidth before. We demonstrate both functionality
and performance of our implementation and show how it can be accelerated with
GPUs and FPGAs."
author:
- first_name: Michael
full_name: Lass, Michael
id: '24135'
last_name: Lass
orcid: 0000-0002-5708-7632
- first_name: Robert
full_name: Schade, Robert
id: '75963'
last_name: Schade
orcid: 0000-0002-6268-539
- first_name: Thomas
full_name: Kühne, Thomas
id: '49079'
last_name: Kühne
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
citation:
ama: 'Lass M, Schade R, Kühne T, Plessl C. A Submatrix-Based Method for Approximate
Matrix Function Evaluation in the Quantum Chemistry Code CP2K. In: Proc. International
Conference for High Performance Computing, Networking, Storage and Analysis (SC).
IEEE Computer Society; 2020:1127-1140. doi:10.1109/SC41405.2020.00084'
apa: Lass, M., Schade, R., Kühne, T., & Plessl, C. (2020). A Submatrix-Based
Method for Approximate Matrix Function Evaluation in the Quantum Chemistry Code
CP2K. Proc. International Conference for High Performance Computing, Networking,
Storage and Analysis (SC), 1127–1140. https://doi.org/10.1109/SC41405.2020.00084
bibtex: '@inproceedings{Lass_Schade_Kühne_Plessl_2020, place={Los Alamitos, CA,
USA}, title={A Submatrix-Based Method for Approximate Matrix Function Evaluation
in the Quantum Chemistry Code CP2K}, DOI={10.1109/SC41405.2020.00084},
booktitle={Proc. International Conference for High Performance Computing, Networking,
Storage and Analysis (SC)}, publisher={IEEE Computer Society}, author={Lass, Michael
and Schade, Robert and Kühne, Thomas and Plessl, Christian}, year={2020}, pages={1127–1140}
}'
chicago: 'Lass, Michael, Robert Schade, Thomas Kühne, and Christian Plessl. “A Submatrix-Based
Method for Approximate Matrix Function Evaluation in the Quantum Chemistry Code
CP2K.” In Proc. International Conference for High Performance Computing, Networking,
Storage and Analysis (SC), 1127–40. Los Alamitos, CA, USA: IEEE Computer Society,
2020. https://doi.org/10.1109/SC41405.2020.00084.'
ieee: 'M. Lass, R. Schade, T. Kühne, and C. Plessl, “A Submatrix-Based Method for
Approximate Matrix Function Evaluation in the Quantum Chemistry Code CP2K,” in
Proc. International Conference for High Performance Computing, Networking,
Storage and Analysis (SC), Atlanta, GA, US, 2020, pp. 1127–1140, doi: 10.1109/SC41405.2020.00084.'
mla: Lass, Michael, et al. “A Submatrix-Based Method for Approximate Matrix Function
Evaluation in the Quantum Chemistry Code CP2K.” Proc. International Conference
for High Performance Computing, Networking, Storage and Analysis (SC), IEEE
Computer Society, 2020, pp. 1127–40, doi:10.1109/SC41405.2020.00084.
short: 'M. Lass, R. Schade, T. Kühne, C. Plessl, in: Proc. International Conference
for High Performance Computing, Networking, Storage and Analysis (SC), IEEE Computer
Society, Los Alamitos, CA, USA, 2020, pp. 1127–1140.'
conference:
location: Atlanta, GA, US
name: 'SC20: International Conference for High Performance Computing, Networking,
Storage and Analysis (SC)'
date_created: 2020-04-28T14:44:21Z
date_updated: 2023-08-02T14:55:59Z
department:
- _id: '27'
- _id: '518'
- _id: '304'
doi: 10.1109/SC41405.2020.00084
external_id:
arxiv:
- '2004.10811'
language:
- iso: eng
main_file_link:
- url: https://ieeexplore.ieee.org/document/9355245
page: 1127-1140
place: Los Alamitos, CA, USA
project:
- _id: '32'
grant_number: PL 595/2-1 / 320898746
name: Performance and Efficiency in HPC with Custom Computing
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: Proc. International Conference for High Performance Computing, Networking,
Storage and Analysis (SC)
publisher: IEEE Computer Society
quality_controlled: '1'
status: public
title: A Submatrix-Based Method for Approximate Matrix Function Evaluation in the
Quantum Chemistry Code CP2K
type: conference
user_id: '75963'
year: '2020'
...
---
_id: '21632'
abstract:
- lang: eng
text: FPGAs have found increasing adoption in data center applications since a new
generation of high-level tools have become available which noticeably reduce development
time for FPGA accelerators and still provide high-quality results. There is, however,
no high-level benchmark suite available, which specifically enables a comparison
of FPGA architectures, programming tools, and libraries for HPC applications.
To fill this gap, we have developed an OpenCL-based open-source implementation
of the HPCC benchmark suite for Xilinx and Intel FPGAs. This benchmark can serve
to analyze the current capabilities of FPGA devices, cards, and development tool
flows, track progress over time, and point out specific difficulties for FPGA
acceleration in the HPC domain. Additionally, the benchmark documents proven performance
optimization patterns. We will continue optimizing and porting the benchmark for
new generations of FPGAs and design tools and encourage active participation to
create a valuable tool for the community.
author:
- first_name: Marius
full_name: Meyer, Marius
id: '40778'
last_name: Meyer
- first_name: Tobias
full_name: Kenter, Tobias
id: '3145'
last_name: Kenter
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
citation:
ama: 'Meyer M, Kenter T, Plessl C. Evaluating FPGA Accelerator Performance with
a Parameterized OpenCL Adaptation of Selected Benchmarks of the HPCChallenge Benchmark
Suite. In: 2020 IEEE/ACM International Workshop on Heterogeneous High-Performance
Reconfigurable Computing (H2RC). ; 2020. doi:10.1109/h2rc51942.2020.00007'
apa: Meyer, M., Kenter, T., & Plessl, C. (2020). Evaluating FPGA Accelerator
Performance with a Parameterized OpenCL Adaptation of Selected Benchmarks of the
HPCChallenge Benchmark Suite. 2020 IEEE/ACM International Workshop on Heterogeneous
High-Performance Reconfigurable Computing (H2RC). https://doi.org/10.1109/h2rc51942.2020.00007
bibtex: '@inproceedings{Meyer_Kenter_Plessl_2020, title={Evaluating FPGA Accelerator
Performance with a Parameterized OpenCL Adaptation of Selected Benchmarks of the
HPCChallenge Benchmark Suite}, DOI={10.1109/h2rc51942.2020.00007},
booktitle={2020 IEEE/ACM International Workshop on Heterogeneous High-performance
Reconfigurable Computing (H2RC)}, author={Meyer, Marius and Kenter, Tobias and
Plessl, Christian}, year={2020} }'
chicago: Meyer, Marius, Tobias Kenter, and Christian Plessl. “Evaluating FPGA Accelerator
Performance with a Parameterized OpenCL Adaptation of Selected Benchmarks of the
HPCChallenge Benchmark Suite.” In 2020 IEEE/ACM International Workshop on Heterogeneous
High-Performance Reconfigurable Computing (H2RC), 2020. https://doi.org/10.1109/h2rc51942.2020.00007.
ieee: 'M. Meyer, T. Kenter, and C. Plessl, “Evaluating FPGA Accelerator Performance
with a Parameterized OpenCL Adaptation of Selected Benchmarks of the HPCChallenge
Benchmark Suite,” 2020, doi: 10.1109/h2rc51942.2020.00007.'
mla: Meyer, Marius, et al. “Evaluating FPGA Accelerator Performance with a Parameterized
OpenCL Adaptation of Selected Benchmarks of the HPCChallenge Benchmark Suite.”
2020 IEEE/ACM International Workshop on Heterogeneous High-Performance Reconfigurable
Computing (H2RC), 2020, doi:10.1109/h2rc51942.2020.00007.
short: 'M. Meyer, T. Kenter, C. Plessl, in: 2020 IEEE/ACM International Workshop
on Heterogeneous High-Performance Reconfigurable Computing (H2RC), 2020.'
date_created: 2021-04-16T10:17:22Z
date_updated: 2023-09-26T11:42:53Z
department:
- _id: '27'
- _id: '518'
doi: 10.1109/h2rc51942.2020.00007
keyword:
- FPGA
- OpenCL
- High Level Synthesis
- HPC benchmarking
language:
- iso: eng
main_file_link:
- url: https://ieeexplore.ieee.org/document/9306963
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: 2020 IEEE/ACM International Workshop on Heterogeneous High-performance
Reconfigurable Computing (H2RC)
publication_identifier:
isbn:
- '9781665415927'
publication_status: published
quality_controlled: '1'
related_material:
link:
- description: Official repository of the benchmark suite on GitHub
relation: supplementary_material
url: https://github.com/pc2/HPCC_FPGA
status: public
title: Evaluating FPGA Accelerator Performance with a Parameterized OpenCL Adaptation
of Selected Benchmarks of the HPCChallenge Benchmark Suite
type: conference
user_id: '15278'
year: '2020'
...
---
_id: '32242'
abstract:
- lang: eng
text: "We consider a resource-aware variant of the classical multi-armed bandit problem:
In each round, the learner selects an arm and determines a resource limit.
It then observes a corresponding (random) reward, provided the (random) amount
of consumed resources remains below the limit. Otherwise, the observation is
censored, i.e., no reward is obtained. For this problem setting, we introduce
a measure of regret, which incorporates the actual amount of allocated resources
of each learning round as well as the optimality of realizable rewards. Thus,
to minimize regret, the learner needs to set a resource limit and choose an
arm in such a way that the chance to realize a high reward within the predefined
resource limit is high, while the resource limit itself should be kept as low
as possible. We derive the theoretical lower bound on the cumulative regret
and propose a learning algorithm having a regret upper bound that matches the
lower bound. In a simulation study, we show that our learning algorithm outperforms
straightforward extensions of standard multi-armed bandit algorithms."
author:
- first_name: Viktor
full_name: Bengs, Viktor
last_name: Bengs
- first_name: Eyke
full_name: Hüllermeier, Eyke
last_name: Hüllermeier
citation:
ama: Bengs V, Hüllermeier E. Multi-Armed Bandits with Censored Consumption of Resources.
arXiv:201100813. Published online 2020.
apa: Bengs, V., & Hüllermeier, E. (2020). Multi-Armed Bandits with Censored
Consumption of Resources. In arXiv:2011.00813.
bibtex: '@article{Bengs_Hüllermeier_2020, title={Multi-Armed Bandits with Censored
Consumption of Resources}, journal={arXiv:2011.00813}, author={Bengs, Viktor and
Hüllermeier, Eyke}, year={2020} }'
chicago: Bengs, Viktor, and Eyke Hüllermeier. “Multi-Armed Bandits with Censored
Consumption of Resources.” ArXiv:2011.00813, 2020.
ieee: V. Bengs and E. Hüllermeier, “Multi-Armed Bandits with Censored Consumption
of Resources,” arXiv:2011.00813. 2020.
mla: Bengs, Viktor, and Eyke Hüllermeier. “Multi-Armed Bandits with Censored Consumption
of Resources.” ArXiv:2011.00813, 2020.
short: V. Bengs, E. Hüllermeier, ArXiv:2011.00813 (2020).
date_created: 2022-06-28T07:26:54Z
date_updated: 2022-06-28T07:27:19Z
department:
- _id: '27'
external_id:
arxiv:
- '2011.00813'
language:
- iso: eng
project:
- _id: '52'
name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: arXiv:2011.00813
status: public
title: Multi-Armed Bandits with Censored Consumption of Resources
type: preprint
user_id: '15278'
year: '2020'
...
---
_id: '15478'
abstract:
- lang: eng
text: Stratix 10 FPGA cards have a good potential for the acceleration of HPC workloads
since the Stratix 10 product line introduces devices with a large number of DSP
and memory blocks. The high level synthesis of OpenCL codes can play a fundamental
role for FPGAs in HPC, because it allows to implement different designs with lower
development effort compared to hand optimized HDL. However, Stratix 10 cards are
still hard to fully exploit using the Intel FPGA SDK for OpenCL. The implementation
of designs with thousands of concurrent arithmetic operations often suffers from
place and route problems that limit the maximum frequency or entirely prevent
a successful synthesis. In order to overcome these issues for the implementation
of the matrix multiplication, we formulate Cannon's matrix multiplication algorithm
with regard to its efficient synthesis within the FPGA logic. We obtain a two-level
block algorithm, where the lower level sub-matrices are multiplied using our Cannon's
algorithm implementation. Following this design approach with multiple compute
units, we are able to get maximum frequencies close to and above 300 MHz with
high utilization of DSP and memory blocks. This allows for performance results
above 1 TeraFLOPS.
author:
- first_name: Paolo
full_name: Gorlani, Paolo
id: '72045'
last_name: Gorlani
- first_name: Tobias
full_name: Kenter, Tobias
id: '3145'
last_name: Kenter
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
citation:
ama: 'Gorlani P, Kenter T, Plessl C. OpenCL Implementation of Cannon’s Matrix Multiplication
Algorithm on Intel Stratix 10 FPGAs. In: Proceedings of the International Conference
on Field-Programmable Technology (FPT). IEEE; 2019. doi:10.1109/ICFPT47387.2019.00020'
apa: Gorlani, P., Kenter, T., & Plessl, C. (2019). OpenCL Implementation of
Cannon’s Matrix Multiplication Algorithm on Intel Stratix 10 FPGAs. In Proceedings
of the International Conference on Field-Programmable Technology (FPT). IEEE.
https://doi.org/10.1109/ICFPT47387.2019.00020
bibtex: '@inproceedings{Gorlani_Kenter_Plessl_2019, title={OpenCL Implementation
of Cannon’s Matrix Multiplication Algorithm on Intel Stratix 10 FPGAs}, DOI={10.1109/ICFPT47387.2019.00020},
booktitle={Proceedings of the International Conference on Field-Programmable Technology
(FPT)}, publisher={IEEE}, author={Gorlani, Paolo and Kenter, Tobias and Plessl,
Christian}, year={2019} }'
chicago: Gorlani, Paolo, Tobias Kenter, and Christian Plessl. “OpenCL Implementation
of Cannon’s Matrix Multiplication Algorithm on Intel Stratix 10 FPGAs.” In Proceedings
of the International Conference on Field-Programmable Technology (FPT). IEEE,
2019. https://doi.org/10.1109/ICFPT47387.2019.00020.
ieee: P. Gorlani, T. Kenter, and C. Plessl, “OpenCL Implementation of Cannon’s Matrix
Multiplication Algorithm on Intel Stratix 10 FPGAs,” in Proceedings of the
International Conference on Field-Programmable Technology (FPT), 2019.
mla: Gorlani, Paolo, et al. “OpenCL Implementation of Cannon’s Matrix Multiplication
Algorithm on Intel Stratix 10 FPGAs.” Proceedings of the International Conference
on Field-Programmable Technology (FPT), IEEE, 2019, doi:10.1109/ICFPT47387.2019.00020.
short: 'P. Gorlani, T. Kenter, C. Plessl, in: Proceedings of the International Conference
on Field-Programmable Technology (FPT), IEEE, 2019.'
conference:
name: International Conference on Field-Programmable Technology (FPT)
date_created: 2020-01-09T12:54:48Z
date_updated: 2022-01-06T06:52:26Z
ddc:
- '004'
department:
- _id: '27'
- _id: '518'
doi: 10.1109/ICFPT47387.2019.00020
file:
- access_level: closed
content_type: application/pdf
creator: plessl
date_created: 2020-01-09T12:53:57Z
date_updated: 2020-01-09T12:53:57Z
file_id: '15479'
file_name: gorlani19_fpt.pdf
file_size: 250559
relation: main_file
success: 1
file_date_updated: 2020-01-09T12:53:57Z
has_accepted_license: '1'
language:
- iso: eng
project:
- _id: '33'
grant_number: 01IH16005
name: HighPerMeshes
- _id: '32'
grant_number: PL 595/2-1
name: Performance and Efficiency in HPC with Custom Computing
publication: Proceedings of the International Conference on Field-Programmable Technology
(FPT)
publisher: IEEE
quality_controlled: '1'
status: public
title: OpenCL Implementation of Cannon's Matrix Multiplication Algorithm on Intel
Stratix 10 FPGAs
type: conference
user_id: '3145'
year: '2019'
...
---
_id: '22'
abstract:
- lang: eng
text: This paper describes a data structure and a heuristic to plan and map arbitrary
resources in complex combinations while applying time dependent constraints. The
approach is used in the planning based workload manager OpenCCS at the Paderborn
Center for Parallel Computing (PC\(^2\)) to operate heterogeneous clusters with
up to 10000 cores. We also show performance results derived from four years of
operation.
author:
- first_name: Axel
full_name: Keller, Axel
id: '15274'
last_name: Keller
citation:
ama: 'Keller A. A Data Structure for Planning Based Workload Management of Heterogeneous
HPC Systems. In: Klusáček D, Cirne W, Desai N, eds. Proc. Workshop on Job Scheduling
Strategies for Parallel Processing (JSSPP). Vol 10773. Lecture Notes in Computer
Science. Springer; 2018:132-151. doi:10.1007/978-3-319-77398-8_8'
apa: 'Keller, A. (2018). A Data Structure for Planning Based Workload Management
of Heterogeneous HPC Systems. In D. Klusáček, W. Cirne, & N. Desai (Eds.),
Proc. Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP)
(Vol. 10773, pp. 132–151). Orlando, FL, USA: Springer. https://doi.org/10.1007/978-3-319-77398-8_8'
bibtex: '@inproceedings{Keller_2018, series={Lecture Notes in Computer Science},
title={A Data Structure for Planning Based Workload Management of Heterogeneous
HPC Systems}, volume={10773}, DOI={10.1007/978-3-319-77398-8_8},
booktitle={Proc. Workshop on Job Scheduling Strategies for Parallel Processing
(JSSPP)}, publisher={Springer}, author={Keller, Axel}, editor={Klusáček, D. and
Cirne, W. and Desai, N.}, year={2018}, pages={132–151}, collection={Lecture
Notes in Computer Science} }'
chicago: Keller, Axel. “A Data Structure for Planning Based Workload Management
of Heterogeneous HPC Systems.” In Proc. Workshop on Job Scheduling Strategies
for Parallel Processing (JSSPP), edited by D. Klusáček, W. Cirne, and N. Desai,
10773:132–51. Lecture Notes in Computer Science. Springer, 2018. https://doi.org/10.1007/978-3-319-77398-8_8.
ieee: A. Keller, “A Data Structure for Planning Based Workload Management of Heterogeneous
HPC Systems,” in Proc. Workshop on Job Scheduling Strategies for Parallel Processing
(JSSPP), Orlando, FL, USA, 2018, vol. 10773, pp. 132–151.
mla: Keller, Axel. “A Data Structure for Planning Based Workload Management of Heterogeneous
HPC Systems.” Proc. Workshop on Job Scheduling Strategies for Parallel Processing
(JSSPP), edited by D. Klusáček et al., vol. 10773, Springer, 2018, pp. 132–51,
doi:10.1007/978-3-319-77398-8_8.
short: 'A. Keller, in: D. Klusáček, W. Cirne, N. Desai (Eds.), Proc. Workshop on
Job Scheduling Strategies for Parallel Processing (JSSPP), Springer, 2018, pp.
132–151.'
conference:
end_date: 2017-06-02
location: Orlando, FL, USA
name: 21st Workshop on Job Scheduling Strategies for Parallel Processing
start_date: 2017-06-02
date_created: 2017-07-25T14:54:08Z
date_updated: 2022-01-06T06:55:22Z
department:
- _id: '27'
doi: 10.1007/978-3-319-77398-8_8
editor:
- first_name: D.
full_name: Klusáček, D.
last_name: Klusáček
- first_name: W.
full_name: Cirne, W.
last_name: Cirne
- first_name: N.
full_name: Desai, N.
last_name: Desai
intvolume: ' 10773'
keyword:
- Scheduling
- Planning
- Mapping
- Workload management
language:
- iso: eng
page: 132-151
publication: Proc. Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP)
publication_identifier:
isbn:
- 978-3-319-77398-8
- 978-3-319-77397-1
publication_status: published
publisher: Springer
series_title: Lecture Notes in Computer Science
status: public
title: A Data Structure for Planning Based Workload Management of Heterogeneous HPC
Systems
type: conference
user_id: '15274'
volume: 10773
year: '2018'
...
---
_id: '1590'
abstract:
- lang: eng
text: "We present the submatrix method, a highly parallelizable method for the approximate
calculation of inverse p-th roots of large sparse symmetric matrices which are
required in different scientific applications. Following the idea of Approximate
Computing, we allow imprecision in the final result in order to utilize the sparsity
of the input matrix and to allow massively parallel execution. For an n x n matrix,
the proposed algorithm allows to distribute the calculations over n nodes with
only little communication overhead. The result matrix exhibits the same sparsity
pattern as the input matrix, allowing for efficient reuse of allocated data structures.\r\n\r\nWe
evaluate the algorithm with respect to the error that it introduces into calculated
results, as well as its performance and scalability. We demonstrate that the error
is relatively limited for well-conditioned matrices and that results are still
valuable for error-resilient applications like preconditioning even for ill-conditioned
matrices. We discuss the execution time and scaling of the algorithm on a theoretical
level and present a distributed implementation of the algorithm using MPI and
OpenMP. We demonstrate the scalability of this implementation by running it on
a high-performance compute cluster comprised of 1024 CPU cores, showing a speedup
of 665x compared to single-threaded execution."
author:
- first_name: Michael
full_name: Lass, Michael
id: '24135'
last_name: Lass
orcid: 0000-0002-5708-7632
- first_name: Stephan
full_name: Mohr, Stephan
last_name: Mohr
- first_name: Hendrik
full_name: Wiebeler, Hendrik
last_name: Wiebeler
- first_name: Thomas
full_name: Kühne, Thomas
id: '49079'
last_name: Kühne
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
citation:
ama: 'Lass M, Mohr S, Wiebeler H, Kühne T, Plessl C. A Massively Parallel Algorithm
for the Approximate Calculation of Inverse p-th Roots of Large Sparse Matrices.
In: Proc. Platform for Advanced Scientific Computing (PASC) Conference.
ACM; 2018. doi:10.1145/3218176.3218231'
apa: Lass, M., Mohr, S., Wiebeler, H., Kühne, T., & Plessl, C. (2018). A Massively
Parallel Algorithm for the Approximate Calculation of Inverse p-th Roots of Large
Sparse Matrices. Proc. Platform for Advanced Scientific Computing (PASC) Conference.
Platform for Advanced Scientific Computing Conference (PASC), Basel, Switzerland.
https://doi.org/10.1145/3218176.3218231
bibtex: '@inproceedings{Lass_Mohr_Wiebeler_Kühne_Plessl_2018, place={New York, NY,
USA}, title={A Massively Parallel Algorithm for the Approximate Calculation of
Inverse p-th Roots of Large Sparse Matrices}, DOI={10.1145/3218176.3218231},
booktitle={Proc. Platform for Advanced Scientific Computing (PASC) Conference},
publisher={ACM}, author={Lass, Michael and Mohr, Stephan and Wiebeler, Hendrik
and Kühne, Thomas and Plessl, Christian}, year={2018} }'
chicago: 'Lass, Michael, Stephan Mohr, Hendrik Wiebeler, Thomas Kühne, and Christian
Plessl. “A Massively Parallel Algorithm for the Approximate Calculation of Inverse
P-Th Roots of Large Sparse Matrices.” In Proc. Platform for Advanced Scientific
Computing (PASC) Conference. New York, NY, USA: ACM, 2018. https://doi.org/10.1145/3218176.3218231.'
ieee: 'M. Lass, S. Mohr, H. Wiebeler, T. Kühne, and C. Plessl, “A Massively Parallel
Algorithm for the Approximate Calculation of Inverse p-th Roots of Large Sparse
Matrices,” presented at the Platform for Advanced Scientific Computing Conference
(PASC), Basel, Switzerland, 2018, doi: 10.1145/3218176.3218231.'
mla: Lass, Michael, et al. “A Massively Parallel Algorithm for the Approximate Calculation
of Inverse P-Th Roots of Large Sparse Matrices.” Proc. Platform for Advanced
Scientific Computing (PASC) Conference, ACM, 2018, doi:10.1145/3218176.3218231.
short: 'M. Lass, S. Mohr, H. Wiebeler, T. Kühne, C. Plessl, in: Proc. Platform for
Advanced Scientific Computing (PASC) Conference, ACM, New York, NY, USA, 2018.'
conference:
end_date: 2018-07-04
location: Basel, Switzerland
name: Platform for Advanced Scientific Computing Conference (PASC)
start_date: 2018-07-02
date_created: 2018-03-22T10:53:01Z
date_updated: 2023-09-26T11:48:12Z
department:
- _id: '27'
- _id: '518'
- _id: '304'
doi: 10.1145/3218176.3218231
external_id:
arxiv:
- '1710.10899'
keyword:
- approximate computing
- linear algebra
- matrix inversion
- matrix p-th roots
- numeric algorithm
- parallel computing
language:
- iso: eng
place: New York, NY, USA
project:
- _id: '32'
grant_number: PL 595/2-1 / 320898746
name: Performance and Efficiency in HPC with Custom Computing
- _id: '52'
name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: Proc. Platform for Advanced Scientific Computing (PASC) Conference
publication_identifier:
isbn:
- 978-1-4503-5891-0
publisher: ACM
quality_controlled: '1'
status: public
title: A Massively Parallel Algorithm for the Approximate Calculation of Inverse p-th
Roots of Large Sparse Matrices
type: conference
user_id: '15278'
year: '2018'
...
---
_id: '1204'
author:
- first_name: Heinrich
full_name: Riebler, Heinrich
id: '8961'
last_name: Riebler
- first_name: Gavin Francis
full_name: Vaz, Gavin Francis
id: '30332'
last_name: Vaz
- first_name: Tobias
full_name: Kenter, Tobias
id: '3145'
last_name: Kenter
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
citation:
ama: 'Riebler H, Vaz GF, Kenter T, Plessl C. Automated Code Acceleration Targeting
Heterogeneous OpenCL Devices. In: Proc. ACM SIGPLAN Symposium on Principles
and Practice of Parallel Programming (PPoPP). ACM; 2018. doi:10.1145/3178487.3178534'
apa: Riebler, H., Vaz, G. F., Kenter, T., & Plessl, C. (2018). Automated Code
Acceleration Targeting Heterogeneous OpenCL Devices. Proc. ACM SIGPLAN Symposium
on Principles and Practice of Parallel Programming (PPoPP). https://doi.org/10.1145/3178487.3178534
bibtex: '@inproceedings{Riebler_Vaz_Kenter_Plessl_2018, title={Automated Code Acceleration
Targeting Heterogeneous OpenCL Devices}, DOI={10.1145/3178487.3178534},
booktitle={Proc. ACM SIGPLAN Symposium on Principles and Practice of Parallel
Programming (PPoPP)}, publisher={ACM}, author={Riebler, Heinrich and Vaz, Gavin
Francis and Kenter, Tobias and Plessl, Christian}, year={2018} }'
chicago: Riebler, Heinrich, Gavin Francis Vaz, Tobias Kenter, and Christian Plessl.
“Automated Code Acceleration Targeting Heterogeneous OpenCL Devices.” In Proc.
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP).
ACM, 2018. https://doi.org/10.1145/3178487.3178534.
ieee: 'H. Riebler, G. F. Vaz, T. Kenter, and C. Plessl, “Automated Code Acceleration
Targeting Heterogeneous OpenCL Devices,” 2018, doi: 10.1145/3178487.3178534.'
mla: Riebler, Heinrich, et al. “Automated Code Acceleration Targeting Heterogeneous
OpenCL Devices.” Proc. ACM SIGPLAN Symposium on Principles and Practice of
Parallel Programming (PPoPP), ACM, 2018, doi:10.1145/3178487.3178534.
short: 'H. Riebler, G.F. Vaz, T. Kenter, C. Plessl, in: Proc. ACM SIGPLAN Symposium
on Principles and Practice of Parallel Programming (PPoPP), ACM, 2018.'
date_created: 2018-03-08T14:45:18Z
date_updated: 2023-09-26T11:47:23Z
ddc:
- '000'
department:
- _id: '27'
- _id: '518'
doi: 10.1145/3178487.3178534
file:
- access_level: closed
content_type: application/pdf
creator: ups
date_created: 2018-11-02T14:43:37Z
date_updated: 2018-11-02T14:43:37Z
file_id: '5281'
file_name: p417-riebler.pdf
file_size: 447769
relation: main_file
success: 1
file_date_updated: 2018-11-02T14:43:37Z
has_accepted_license: '1'
keyword:
- htrop
language:
- iso: eng
project:
- _id: '1'
grant_number: '160364472'
name: SFB 901
- _id: '4'
name: SFB 901 - Project Area C
- _id: '14'
grant_number: '160364472'
name: SFB 901 - Subproject C2
publication: Proc. ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
(PPoPP)
publication_identifier:
isbn:
- '9781450349826'
publication_status: published
publisher: ACM
quality_controlled: '1'
status: public
title: Automated Code Acceleration Targeting Heterogeneous OpenCL Devices
type: conference
user_id: '15278'
year: '2018'
...
---
_id: '1588'
abstract:
- lang: eng
text: The exploration of FPGAs as accelerators for scientific simulations has so
far mostly been focused on small kernels of methods working on regular data structures,
for example in the form of stencil computations for finite difference methods.
In computational sciences, often more advanced methods are employed that promise
better stability, convergence, locality and scaling. Unstructured meshes are shown
to be more effective and more accurate, compared to regular grids, in representing
computation domains of various shapes. Using unstructured meshes, the discontinuous
Galerkin method preserves the ability to perform explicit local update operations
for simulations in the time domain. In this work, we investigate FPGAs as target
platform for an implementation of the nodal discontinuous Galerkin method to find
time-domain solutions of Maxwell's equations in an unstructured mesh. When maximizing
data reuse and fitting constant coefficients into suitably partitioned on-chip
memory, high computational intensity allows us to implement and feed wide data
paths with hundreds of floating point operators. By decoupling off-chip memory
accesses from the computations, high memory bandwidth can be sustained, even for
the irregular access pattern required by parts of the application. Using the Intel/Altera
OpenCL SDK for FPGAs, we present different implementation variants for different
polynomial orders of the method. In different phases of the algorithm, either
computational or bandwidth limits of the Arria 10 platform are almost reached,
thus outperforming a highly multithreaded CPU implementation by around 2x.
author:
- first_name: Tobias
full_name: Kenter, Tobias
id: '3145'
last_name: Kenter
- first_name: Gopinath
full_name: Mahale, Gopinath
last_name: Mahale
- first_name: Samer
full_name: Alhaddad, Samer
id: '42456'
last_name: Alhaddad
- first_name: Yevgen
full_name: Grynko, Yevgen
id: '26059'
last_name: Grynko
- first_name: Christian
full_name: Schmitt, Christian
last_name: Schmitt
- first_name: Ayesha
full_name: Afzal, Ayesha
last_name: Afzal
- first_name: Frank
full_name: Hannig, Frank
last_name: Hannig
- first_name: Jens
full_name: Förstner, Jens
id: '158'
last_name: Förstner
orcid: 0000-0001-7059-9862
- first_name: Christian
full_name: Plessl, Christian
id: '16153'
last_name: Plessl
orcid: 0000-0001-5728-9982
citation:
ama: 'Kenter T, Mahale G, Alhaddad S, et al. OpenCL-based FPGA Design to Accelerate
the Nodal Discontinuous Galerkin Method for Unstructured Meshes. In: Proc.
Int. Symp. on Field-Programmable Custom Computing Machines (FCCM). IEEE; 2018.
doi:10.1109/FCCM.2018.00037'
apa: Kenter, T., Mahale, G., Alhaddad, S., Grynko, Y., Schmitt, C., Afzal, A., Hannig,
F., Förstner, J., & Plessl, C. (2018). OpenCL-based FPGA Design to Accelerate
the Nodal Discontinuous Galerkin Method for Unstructured Meshes. Proc. Int.
Symp. on Field-Programmable Custom Computing Machines (FCCM). https://doi.org/10.1109/FCCM.2018.00037
bibtex: '@inproceedings{Kenter_Mahale_Alhaddad_Grynko_Schmitt_Afzal_Hannig_Förstner_Plessl_2018,
title={OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin
Method for Unstructured Meshes}, DOI={10.1109/FCCM.2018.00037},
booktitle={Proc. Int. Symp. on Field-Programmable Custom Computing Machines (FCCM)},
publisher={IEEE}, author={Kenter, Tobias and Mahale, Gopinath and Alhaddad, Samer
and Grynko, Yevgen and Schmitt, Christian and Afzal, Ayesha and Hannig, Frank
and Förstner, Jens and Plessl, Christian}, year={2018} }'
chicago: Kenter, Tobias, Gopinath Mahale, Samer Alhaddad, Yevgen Grynko, Christian
Schmitt, Ayesha Afzal, Frank Hannig, Jens Förstner, and Christian Plessl. “OpenCL-Based
FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured
Meshes.” In Proc. Int. Symp. on Field-Programmable Custom Computing Machines
(FCCM). IEEE, 2018. https://doi.org/10.1109/FCCM.2018.00037.
ieee: 'T. Kenter et al., “OpenCL-based FPGA Design to Accelerate the Nodal
Discontinuous Galerkin Method for Unstructured Meshes,” presented at the
Int. Symp. on Field-Programmable Custom Computing Machines (FCCM), 2018, doi:
10.1109/FCCM.2018.00037.'
mla: Kenter, Tobias, et al. “OpenCL-Based FPGA Design to Accelerate the Nodal Discontinuous
Galerkin Method for Unstructured Meshes.” Proc. Int. Symp. on Field-Programmable
Custom Computing Machines (FCCM), IEEE, 2018, doi:10.1109/FCCM.2018.00037.
short: 'T. Kenter, G. Mahale, S. Alhaddad, Y. Grynko, C. Schmitt, A. Afzal, F. Hannig,
J. Förstner, C. Plessl, in: Proc. Int. Symp. on Field-Programmable Custom Computing
Machines (FCCM), IEEE, 2018.'
conference:
name: Proc. Int. Symp. on Field-Programmable Custom Computing Machines (FCCM)
date_created: 2018-03-22T10:48:01Z
date_updated: 2023-09-26T11:47:52Z
ddc:
- '000'
department:
- _id: '27'
- _id: '518'
- _id: '61'
doi: 10.1109/FCCM.2018.00037
file:
- access_level: closed
content_type: application/pdf
creator: ups
date_created: 2018-11-02T14:45:05Z
date_updated: 2018-11-02T14:45:05Z
file_id: '5282'
file_name: 08457652.pdf
file_size: 269130
relation: main_file
success: 1
file_date_updated: 2018-11-02T14:45:05Z
has_accepted_license: '1'
keyword:
- tet_topic_hpc
language:
- iso: eng
project:
- _id: '33'
grant_number: 01IH16005A
name: HighPerMeshes
- _id: '1'
grant_number: '160364472'
name: SFB 901
- _id: '4'
name: SFB 901 - Project Area C
- _id: '14'
grant_number: '160364472'
name: SFB 901 - Subproject C2
publication: Proc. Int. Symp. on Field-Programmable Custom Computing Machines (FCCM)
publisher: IEEE
quality_controlled: '1'
status: public
title: OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method
for Unstructured Meshes
type: conference
user_id: '15278'
year: '2018'
...