{"doi":"10.1109/FCCM.2018.00037","file":[{"success":1,"file_id":"5282","file_name":"08457652.pdf","access_level":"closed","content_type":"application/pdf","relation":"main_file","date_created":"2018-11-02T14:45:05Z","date_updated":"2018-11-02T14:45:05Z","creator":"ups","file_size":269130}],"ddc":["000"],"file_date_updated":"2018-11-02T14:45:05Z","keyword":["tet_topic_hpc"],"date_created":"2018-03-22T10:48:01Z","project":[{"name":"HighPerMeshes","grant_number":"01|H16005A","_id":"33"},{"name":"SFB 901","_id":"1","grant_number":"160364472"},{"_id":"4","name":"SFB 901 - Project Area C"},{"_id":"14","grant_number":"160364472","name":"SFB 901 - Subproject C2"}],"has_accepted_license":"1","quality_controlled":"1","abstract":[{"text":"The exploration of FPGAs as accelerators for scientific simulations has so far mostly been focused on small kernels of methods working on regular data structures, for example in the form of stencil computations for finite difference methods. In computational sciences, often more advanced methods are employed that promise better stability, convergence, locality and scaling. Unstructured meshes are shown to be more effective and more accurate, compared to regular grids, in representing computation domains of various shapes. Using unstructured meshes, the discontinuous Galerkin method preserves the ability to perform explicit local update operations for simulations in the time domain. In this work, we investigate FPGAs as target platform for an implementation of the nodal discontinuous Galerkin method to find time-domain solutions of Maxwell's equations in an unstructured mesh. When maximizing data reuse and fitting constant coefficients into suitably partitioned on-chip memory, high computational intensity allows us to implement and feed wide data paths with hundreds of floating point operators. 
By decoupling off-chip memory accesses from the computations, high memory bandwidth can be sustained, even for the irregular access pattern required by parts of the application. Using the Intel/Altera OpenCL SDK for FPGAs, we present different implementation variants for different polynomial orders of the method. In different phases of the algorithm, either computational or bandwidth limits of the Arria 10 platform are almost reached, thus outperforming a highly multithreaded CPU implementation by around 2x.","lang":"eng"}],"publication":"Proc. Int. Symp. on Field-Programmable Custom Computing Machines (FCCM)","conference":{"name":"Proc. Int. Symp. on Field-Programmable Custom Computing Machines (FCCM)"},"date_updated":"2023-09-26T11:47:52Z","type":"conference","language":[{"iso":"eng"}],"department":[{"_id":"27"},{"_id":"518"},{"_id":"61"}],"_id":"1588","title":"OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes","citation":{"ieee":"T. Kenter et al., “OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes,” presented at the Proc. Int. Symp. on Field-Programmable Custom Computing Machines (FCCM), 2018, doi: 10.1109/FCCM.2018.00037.","mla":"Kenter, Tobias, et al. “OpenCL-Based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes.” Proc. Int. Symp. on Field-Programmable Custom Computing Machines (FCCM), IEEE, 2018, doi:10.1109/FCCM.2018.00037.","bibtex":"@inproceedings{Kenter_Mahale_Alhaddad_Grynko_Schmitt_Afzal_Hannig_Förstner_Plessl_2018, title={OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes}, DOI={10.1109/FCCM.2018.00037}, booktitle={Proc. Int. Symp. 
on Field-Programmable Custom Computing Machines (FCCM)}, publisher={IEEE}, author={Kenter, Tobias and Mahale, Gopinath and Alhaddad, Samer and Grynko, Yevgen and Schmitt, Christian and Afzal, Ayesha and Hannig, Frank and Förstner, Jens and Plessl, Christian}, year={2018} }","ama":"Kenter T, Mahale G, Alhaddad S, et al. OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes. In: Proc. Int. Symp. on Field-Programmable Custom Computing Machines (FCCM). IEEE; 2018. doi:10.1109/FCCM.2018.00037","apa":"Kenter, T., Mahale, G., Alhaddad, S., Grynko, Y., Schmitt, C., Afzal, A., Hannig, F., Förstner, J., & Plessl, C. (2018). OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes. Proc. Int. Symp. on Field-Programmable Custom Computing Machines (FCCM). https://doi.org/10.1109/FCCM.2018.00037","short":"T. Kenter, G. Mahale, S. Alhaddad, Y. Grynko, C. Schmitt, A. Afzal, F. Hannig, J. Förstner, C. Plessl, in: Proc. Int. Symp. on Field-Programmable Custom Computing Machines (FCCM), IEEE, 2018.","chicago":"Kenter, Tobias, Gopinath Mahale, Samer Alhaddad, Yevgen Grynko, Christian Schmitt, Ayesha Afzal, Frank Hannig, Jens Förstner, and Christian Plessl. “OpenCL-Based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes.” In Proc. Int. Symp. on Field-Programmable Custom Computing Machines (FCCM). IEEE, 2018. 
https://doi.org/10.1109/FCCM.2018.00037."},"user_id":"15278","status":"public","year":"2018","publisher":"IEEE","author":[{"first_name":"Tobias","id":"3145","last_name":"Kenter","full_name":"Kenter, Tobias"},{"full_name":"Mahale, Gopinath","last_name":"Mahale","first_name":"Gopinath"},{"last_name":"Alhaddad","id":"42456","full_name":"Alhaddad, Samer","first_name":"Samer"},{"full_name":"Grynko, Yevgen","id":"26059","last_name":"Grynko","first_name":"Yevgen"},{"full_name":"Schmitt, Christian","last_name":"Schmitt","first_name":"Christian"},{"first_name":"Ayesha","last_name":"Afzal","full_name":"Afzal, Ayesha"},{"full_name":"Hannig, Frank","last_name":"Hannig","first_name":"Frank"},{"orcid":"0000-0001-7059-9862","first_name":"Jens","full_name":"Förstner, Jens","id":"158","last_name":"Förstner"},{"id":"16153","last_name":"Plessl","full_name":"Plessl, Christian","first_name":"Christian","orcid":"0000-0001-5728-9982"}]}