TY  - JOUR
AU  - Schade, Robert
AU  - Kenter, Tobias
AU  - Elgabarty, Hossam
AU  - Lass, Michael
AU  - Schütt, Ole
AU  - Lazzaro, Alfio
AU  - Pabst, Hans
AU  - Mohr, Stephan
AU  - Hutter, Jürg
AU  - Kühne, Thomas
AU  - Plessl, Christian
ID  - 33684
JF  - Parallel Computing
KW  - Artificial Intelligence
KW  - Computer Graphics and Computer-Aided Design
KW  - Computer Networks and Communications
KW  - Hardware and Architecture
KW  - Theoretical Computer Science
KW  - Software
SN  - 0167-8191
TI  - Towards electronic structure-based ab-initio molecular dynamics simulations with hundreds of millions of atoms
VL  - 111
ER  - 
TY  - JOUR
AU  - Meyer, Marius
AU  - Kenter, Tobias
AU  - Plessl, Christian
ID  - 27364
JF  - Journal of Parallel and Distributed Computing
SN  - 0743-7315
TI  - In-depth FPGA Accelerator Performance Evaluation with Single Node Benchmarks from the HPC Challenge Benchmark Suite for Intel and Xilinx FPGAs using OpenCL
ER  - 
TY  - JOUR
AB  - Hole polarons and defect-bound exciton polarons in lithium niobate are investigated by means of density-functional theory, where the localization of the holes is achieved by applying the +U approach to the oxygen 2p orbitals. We find three principal configurations of hole polarons: (i) self-trapped holes localized at displaced regular oxygen atoms and (ii) two other configurations bound to a lithium vacancy either at a threefold coordinated oxygen atom above or at a two-fold coordinated oxygen atom below the defect. The latter is the most stable and is in excellent quantitative agreement with measured g factors from electron paramagnetic resonance. Due to the absence of mid-gap states, none of these hole polarons can explain the broad optical absorption centered between 2.5 and 2.8 eV that is observed in transient absorption spectroscopy, but such states appear if a free electron polaron is trapped at the same lithium vacancy as the bound hole polaron, resulting in an exciton polaron. The dielectric function calculated by solving the Bethe–Salpeter equation indeed yields an optical peak at 2.6 eV in agreement with the two-photon experiments. The coexistence of hole and exciton polarons, which are simultaneously created in optical excitations, thus satisfactorily explains the reported experimental data.
AU  - Schmidt, Falko
AU  - Kozub, Agnieszka L.
AU  - Gerstmann, Uwe
AU  - Schmidt, Wolf Gero
AU  - Schindlmayr, Arno
ID  - 44088
IS  - 11
JF  - Crystals
TI  - A density-functional theory study of hole and defect-bound exciton polarons in lithium niobate
VL  - 12
ER  - 
TY  - JOUR
AB  - Multimode integrated interferometers have great potential for both spectral engineering and metrological applications. However, the material dispersion of integrated platforms constitutes an obstacle that limits the performance and precision of such interferometers. At the same time, two-colour nonlinear interferometers present an important tool for metrological applications, when measurements in a certain frequency range are difficult. In this manuscript, we theoretically developed and investigated an integrated multimode two-colour SU(1,1) interferometer operating in a supersensitive mode. By ensuring the proper design of the integrated platform, we suppressed the dispersion, thereby significantly increasing the visibility of the interference pattern. The use of a continuous wave pump laser provided the symmetry between the spectral shapes of the signal and idler photons concerning half the pump frequency, despite different photon colours. We demonstrate that such an interferometer overcomes the classical phase sensitivity limit for wide parametric gain ranges, when up to 3×10^4 photons are generated.
AU  - Ferreri, Alessandro
AU  - Sharapova, Polina R.
ID  - 40371
IS  - 3
JF  - Symmetry
KW  - Physics and Astronomy (miscellaneous)
KW  - General Mathematics
KW  - Chemistry (miscellaneous)
KW  - Computer Science (miscellaneous)
SN  - 2073-8994
TI  - Two-Colour Spectrally Multimode Integrated SU(1,1) Interferometer
VL  - 14
ER  - 
TY  - JOUR
AB  - N-body methods are one of the essential algorithmic building blocks of high-performance and parallel computing. Previous research has shown promising performance for implementing n-body simulations with pairwise force calculations on FPGAs. However, to avoid challenges with accumulation and memory access patterns, the presented designs calculate each pair of forces twice, along with both force sums of the involved particles. Also, they require large problem instances with hundreds of thousands of particles to reach their respective peak performance, limiting the applicability for strong scaling scenarios. This work addresses both issues by presenting a novel FPGA design that uses each calculated force twice and overlaps data transfers and computations in a way that allows to reach peak performance even for small problem instances, outperforming previous single precision results even in double precision, and scaling linearly over multiple interconnected FPGAs. For a comparison across architectures, we provide an equally optimized CPU reference, which for large problems actually achieves higher peak performance per device, however, given the strong scaling advantages of the FPGA design, in parallel setups with few thousand particles per device, the FPGA platform achieves highest performance and power efficiency.
AU  - Menzel, Johannes
AU  - Plessl, Christian
AU  - Kenter, Tobias
ID  - 28099
IS  - 1
JF  - ACM Transactions on Reconfigurable Technology and Systems
SN  - 1936-7406
TI  - The Strong Scaling Advantage of FPGAs in HPC for N-body Simulations
VL  - 15
ER  - 
TY  - CONF
AU  - Meyer, Marius
ID  - 27365
T2  - Proceedings of the 11th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies
TI  - Towards Performance Characterization of FPGAs in Context of HPC using OpenCL Benchmarks
ER  - 
TY  - CONF
AU  - Nickchen, Tobias
AU  - Heindorf, Stefan
AU  - Engels, Gregor
ID  - 20886
T2  - Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision
TI  - Generating Physically Sound Training Data for Image Recognition of Additively Manufactured Parts
ER  - 
TY  - JOUR
AB  - The defining feature of active particles is that they constantly propel themselves by locally converting chemical energy into directed motion. This active self-propulsion prevents them from equilibrating with their thermal environment (e.g. an aqueous solution), thus keeping them permanently out of equilibrium. Nevertheless, the spatial dynamics of active particles might share certain equilibrium features, in particular in the steady state. We here focus on the time-reversal symmetry of individual spatial trajectories as a distinct equilibrium characteristic. We investigate to what extent the steady-state trajectories of a trapped active particle obey or break this time-reversal symmetry. Within the framework of active Ornstein–Uhlenbeck particles we find that the steady-state trajectories in a harmonic potential fulfill path-wise time-reversal symmetry exactly, while this symmetry is typically broken in anharmonic potentials.
AU  - Dabelow, Lennart
AU  - Bo, Stefano
AU  - Eichhorn, Ralf
ID  - 32243
IS  - 3
JF  - Journal of Statistical Mechanics: Theory and Experiment
KW  - Statistics
KW  - Probability and Uncertainty
KW  - Statistics and Probability
KW  - Statistical and Nonlinear Physics
SN  - 1742-5468
TI  - How irreversible are steady-state trajectories of a trapped active particle?
VL  - 2021
ER  - 
TY  - GEN
AB  - We push the boundaries of electronic structure-based ab-initio
molecular dynamics (AIMD) beyond 100 million atoms. This scale is otherwise
barely reachable with classical force-field methods or novel neural network and
machine learning potentials. We achieve this breakthrough by combining
innovations in linear-scaling AIMD, efficient and approximate sparse linear
algebra, low and mixed-precision floating-point computation on GPUs, and a
compensation scheme for the errors introduced by numerical approximations. The
core of our work is the non-orthogonalized local submatrix method (NOLSM),
which scales very favorably to massively parallel computing systems and
translates large sparse matrix operations into highly parallel, dense matrix
operations that are ideally suited to hardware accelerators. We demonstrate
that the NOLSM method, which is at the center point of each AIMD step, is able
to achieve a sustained performance of 324 PFLOP/s in mixed FP16/FP32 precision
corresponding to an efficiency of 67.7% when running on 1536 NVIDIA A100 GPUs.
AU  - Schade, Robert
AU  - Kenter, Tobias
AU  - Elgabarty, Hossam
AU  - Lass, Michael
AU  - Schütt, Ole
AU  - Lazzaro, Alfio
AU  - Pabst, Hans
AU  - Mohr, Stephan
AU  - Hutter, Jürg
AU  - Kühne, Thomas D.
AU  - Plessl, Christian
ID  - 32244
T2  - arXiv:2104.08245
TI  - Towards Electronic Structure-Based Ab-Initio Molecular Dynamics Simulations with Hundreds of Millions of Atoms
ER  - 
TY  - CONF
AU  - Karp, Martin
AU  - Podobas, Artur
AU  - Jansson, Niclas
AU  - Kenter, Tobias
AU  - Plessl, Christian
AU  - Schlatter, Philipp
AU  - Markidis, Stefano
ID  - 29937
T2  - 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
TI  - High-Performance Spectral Element Methods on Field-Programmable Gate Arrays: Implementation, Evaluation, and Future Projection
ER  - 
TY  - CONF
AU  - Kenter, Tobias
AU  - Shambhu, Adesh
AU  - Faghih-Naini, Sara
AU  - Aizinger, Vadym
ID  - 46194
T2  - Proceedings of the Platform for Advanced Scientific Computing Conference (PASC)
TI  - Algorithm-hardware co-design of a discontinuous Galerkin shallow-water model for a dataflow architecture on FPGA
ER  - 
TY  - GEN
AB  - The interaction between quantum light and matter is being intensively studied
for systems that are enclosed in high-$Q$ cavities which strongly enhance the
light-matter coupling. However, for many applications, cavities with lower
$Q$-factors are preferred due to the increased spectral width of the cavity
mode. Here, we investigate the interaction between quantum light and matter
represented by a $\Lambda$-type three-level system in lossy cavities, assuming
that cavity losses are the dominant loss mechanism. We demonstrate that cavity
losses lead to non-trivial steady states of the electronic occupations that can
be controlled by the loss rate and the initial statistics of the quantum
fields. The mechanism of formation of such steady states can be understood on
the basis of the equations of motion. Analytical expressions for steady states
and their numerical simulations are presented and discussed.
AU  - Rose, H.
AU  - Tikhonova, O. V.
AU  - Meier, T.
AU  - Sharapova, P.
ID  - 32236
T2  - arXiv:2109.00842
TI  - Steady states of $\Lambda$-type three-level systems excited by quantum light in lossy cavities
ER  - 
TY  - JOUR
AU  - Kaczmarek, Olaf
AU  - Mazur, Lukas
AU  - Sharma, Sayantan
ID  - 46122
IS  - 9
JF  - Physical Review D
SN  - 2470-0010
TI  - Eigenvalue spectra of QCD and the fate of $U_A(1)$ breaking towards the chiral limit
VL  - 104
ER  - 
TY  - JOUR
AU  - Altenkort, Luis
AU  - Eller, Alexander M.
AU  - Kaczmarek, O.
AU  - Mazur, Lukas
AU  - Moore, Guy D.
AU  - Shu, H.-T.
ID  - 46124
IS  - 1
JF  - Physical Review D
SN  - 2470-0010
TI  - Heavy quark momentum diffusion from the lattice using gradient flow
VL  - 103
ER  - 
TY  - JOUR
AU  - Altenkort, Luis
AU  - Eller, Alexander M.
AU  - Kaczmarek, O.
AU  - Mazur, Lukas
AU  - Moore, Guy D.
AU  - Shu, H.-T.
ID  - 46123
IS  - 11
JF  - Physical Review D
SN  - 2470-0010
TI  - Sphaleron rate from Euclidean lattice correlators: An exploration
VL  - 103
ER  - 
TY  - CONF
AU  - Karp, Martin
AU  - Podobas, Artur
AU  - Jansson, Niclas
AU  - Kenter, Tobias
AU  - Plessl, Christian
AU  - Schlatter, Philipp
AU  - Markidis, Stefano
ID  - 46195
T2  - 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
TI  - High-Performance Spectral Element Methods on Field-Programmable Gate Arrays: Implementation, Evaluation, and Future Projection
ER  - 
TY  - CHAP
AB  - Solving partial differential equations on unstructured grids is a cornerstone of engineering and scientific computing. Nowadays, heterogeneous parallel platforms with CPUs, GPUs, and FPGAs enable energy-efficient and computationally demanding simulations. We developed the HighPerMeshes C++-embedded Domain-Specific Language (DSL) for bridging the abstraction gap between the mathematical and algorithmic formulation of mesh-based algorithms for PDE problems on the one hand and an increasing number of heterogeneous platforms with their different parallel programming and runtime models on the other hand. Thus, the HighPerMeshes DSL aims at higher productivity in the code development process for multiple target platforms. We introduce the concepts as well as the basic structure of the HighPerMeshes DSL, and demonstrate its usage with three examples, a Poisson and monodomain problem, respectively, solved by the continuous finite element method, and the discontinuous Galerkin method for Maxwell’s equation. The mapping of the abstract algorithmic description onto parallel hardware, including distributed memory compute clusters, is presented. Finally, the achievable performance and scalability are demonstrated for a typical example problem on a multi-core CPU cluster.
AU  - Alhaddad, Samer
AU  - Förstner, Jens
AU  - Groth, Stefan
AU  - Grünewald, Daniel
AU  - Grynko, Yevgen
AU  - Hannig, Frank
AU  - Kenter, Tobias
AU  - Pfreundt, Franz-Josef
AU  - Plessl, Christian
AU  - Schotte, Merlind
AU  - Steinke, Thomas
AU  - Teich, Jürgen
AU  - Weiser, Martin
AU  - Wende, Florian
ID  - 21587
KW  - tet_topic_hpc
SN  - 0302-9743
T2  - Euro-Par 2020: Parallel Processing Workshops
TI  - HighPerMeshes – A Domain-Specific Language for Numerical Algorithms on Unstructured Grids
ER  - 
TY  - CHAP
AU  - Ramaswami, Arjun
AU  - Kenter, Tobias
AU  - Kühne, Thomas
AU  - Plessl, Christian
ID  - 29936
SN  - 0302-9743
T2  - Applied Reconfigurable Computing. Architectures, Tools, and Applications
TI  - Evaluating the Design Space for Offloading 3D FFT Calculations to an FPGA for High-Performance Computing
ER  - 
TY  - JOUR
AU  - Alhaddad, Samer
AU  - Förstner, Jens
AU  - Groth, Stefan
AU  - Grünewald, Daniel
AU  - Grynko, Yevgen
AU  - Hannig, Frank
AU  - Kenter, Tobias
AU  - Pfreundt, Franz-Josef
AU  - Plessl, Christian
AU  - Schotte, Merlind
AU  - Steinke, Thomas
AU  - Teich, Jürgen
AU  - Weiser, Martin
AU  - Wende, Florian
ID  - 24788
JF  - Concurrency and Computation: Practice and Experience
KW  - tet_topic_hpc
SN  - 1532-0626
TI  - The HighPerMeshes framework for numerical algorithms on unstructured grids
ER  - 
TY  - JOUR
AB  - The effect of traces of ethanol in supercritical carbon dioxide on the mixture's thermodynamic properties is studied by molecular simulations and Taylor dispersion measurements.
AU  - Chatwell, René Spencer
AU  - Guevara-Carrion, Gabriela
AU  - Gaponenko, Yuri
AU  - Shevtsova, Valentina
AU  - Vrabec, Jadran
ID  - 32240
IS  - 4
JF  - Physical Chemistry Chemical Physics
KW  - Physical and Theoretical Chemistry
KW  - General Physics and Astronomy
SN  - 1463-9076
TI  - Diffusion of the carbon dioxide–ethanol mixture in the extended critical region
VL  - 23
ER  -