---
res:
  bibo_abstract:
  - The exploration of FPGAs as accelerators for scientific simulations has so far
    mostly been focused on small kernels of methods working on regular data structures,
    for example in the form of stencil computations for finite difference methods.
    In computational sciences, often more advanced methods are employed that promise
    better stability, convergence, locality and scaling. Unstructured meshes are shown
    to be more effective and more accurate, compared to regular grids, in representing
    computation domains of various shapes. Using unstructured meshes, the discontinuous
    Galerkin method preserves the ability to perform explicit local update operations
    for simulations in the time domain. In this work, we investigate FPGAs as a target
    platform for an implementation of the nodal discontinuous Galerkin method to find
    time-domain solutions of Maxwell's equations in an unstructured mesh. When maximizing
    data reuse and fitting constant coefficients into suitably partitioned on-chip
    memory, high computational intensity allows us to implement and feed wide data
    paths with hundreds of floating point operators. By decoupling off-chip memory
    accesses from the computations, high memory bandwidth can be sustained, even for
    the irregular access pattern required by parts of the application. Using the Intel/Altera
    OpenCL SDK for FPGAs, we present different implementation variants for different
    polynomial orders of the method. In different phases of the algorithm, either
    computational or bandwidth limits of the Arria 10 platform are almost reached,
    thus outperforming a highly multithreaded CPU implementation by around 2x.@eng
  bibo_authorlist:
  - foaf_Person:
      foaf_givenName: Tobias
      foaf_name: Kenter, Tobias
      foaf_surname: Kenter
      foaf_workInfoHomepage: http://www.librecat.org/personId=3145
  - foaf_Person:
      foaf_givenName: Gopinath
      foaf_name: Mahale, Gopinath
      foaf_surname: Mahale
  - foaf_Person:
      foaf_givenName: Samer
      foaf_name: Alhaddad, Samer
      foaf_surname: Alhaddad
      foaf_workInfoHomepage: http://www.librecat.org/personId=42456
  - foaf_Person:
      foaf_givenName: Yevgen
      foaf_name: Grynko, Yevgen
      foaf_surname: Grynko
      foaf_workInfoHomepage: http://www.librecat.org/personId=26059
  - foaf_Person:
      foaf_givenName: Christian
      foaf_name: Schmitt, Christian
      foaf_surname: Schmitt
  - foaf_Person:
      foaf_givenName: Ayesha
      foaf_name: Afzal, Ayesha
      foaf_surname: Afzal
  - foaf_Person:
      foaf_givenName: Frank
      foaf_name: Hannig, Frank
      foaf_surname: Hannig
  - foaf_Person:
      foaf_givenName: Jens
      foaf_name: Förstner, Jens
      foaf_surname: Förstner
      foaf_workInfoHomepage: http://www.librecat.org/personId=158
    orcid: 0000-0001-7059-9862
  - foaf_Person:
      foaf_givenName: Christian
      foaf_name: Plessl, Christian
      foaf_surname: Plessl
      foaf_workInfoHomepage: http://www.librecat.org/personId=16153
    orcid: 0000-0001-5728-9982
  bibo_doi: 10.1109/FCCM.2018.00037
  dct_date: 2018^xs_gYear
  dct_language: eng
  dct_publisher: IEEE@
  dct_subject:
  - tet_topic_hpc
  dct_title: OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin
    Method for Unstructured Meshes@
...
