[{"ddc":["000"],"user_id":"15278","abstract":[{"lang":"eng","text":"Branch and bound (B&B) algorithms structure the search space as a tree and eliminate infeasible solutions early by pruning subtrees that cannot lead to a valid or optimal solution. Custom hardware designs significantly accelerate the execution of these algorithms. In this article, we demonstrate a high-performance B&B implementation on FPGAs. First, we identify general elements of B&B algorithms and describe their implementation as a finite state machine. Then, we introduce workers that autonomously cooperate using work stealing to allow parallel execution and full utilization of the target FPGA. Finally, we explore advantages of instance-specific designs that target a specific problem instance to improve performance.\r\n\r\nWe evaluate our concepts by applying them to a branch and bound problem, the reconstruction of corrupted AES keys obtained from cold-boot attacks. The evaluation shows that our work stealing approach is scalable with the available resources and provides speedups proportional to the number of workers. Instance-specific designs allow us to achieve an overall speedup of 47 × compared to the fastest implementation of AES key reconstruction so far. 
Finally, we demonstrate how instance-specific designs can be generated just-in-time such that the provided speedups outweigh the additional time required for design synthesis."}],"volume":10,"date_created":"2017-07-25T14:17:32Z","has_accepted_license":"1","status":"public","publication":"ACM Transactions on Reconfigurable Technology and Systems (TRETS)","file_date_updated":"2018-11-02T16:04:14Z","keyword":["coldboot"],"quality_controlled":"1","author":[{"first_name":"Heinrich","full_name":"Riebler, Heinrich","last_name":"Riebler","id":"8961"},{"id":"24135","last_name":"Lass","full_name":"Lass, Michael","orcid":"0000-0002-5708-7632","first_name":"Michael"},{"first_name":"Robert","full_name":"Mittendorf, Robert","last_name":"Mittendorf"},{"first_name":"Thomas","full_name":"Löcke, Thomas","last_name":"Löcke"},{"first_name":"Christian","full_name":"Plessl, Christian","orcid":"0000-0001-5728-9982","last_name":"Plessl","id":"16153"}],"publisher":"Association for Computing Machinery (ACM)","file":[{"success":1,"relation":"main_file","content_type":"application/pdf","date_updated":"2018-11-02T16:04:14Z","creator":"ups","file_id":"5322","file_size":2131617,"access_level":"closed","date_created":"2018-11-02T16:04:14Z","file_name":"a24-riebler.pdf"}],"issue":"3","intvolume":" 10","_id":"18","page":"24:1-24:23","citation":{"short":"H. Riebler, M. Lass, R. Mittendorf, T. Löcke, C. Plessl, ACM Transactions on Reconfigurable Technology and Systems (TRETS) 10 (2017) 24:1-24:23.","ieee":"H. Riebler, M. Lass, R. Mittendorf, T. Löcke, and C. Plessl, “Efficient Branch and Bound on FPGAs Using Work Stealing and Instance-Specific Designs,” ACM Transactions on Reconfigurable Technology and Systems (TRETS), vol. 10, no. 3, p. 24:1-24:23, 2017, doi: 10.1145/3053687.","ama":"Riebler H, Lass M, Mittendorf R, Löcke T, Plessl C. Efficient Branch and Bound on FPGAs Using Work Stealing and Instance-Specific Designs. ACM Transactions on Reconfigurable Technology and Systems (TRETS). 
2017;10(3):24:1-24:23. doi:10.1145/3053687","apa":"Riebler, H., Lass, M., Mittendorf, R., Löcke, T., & Plessl, C. (2017). Efficient Branch and Bound on FPGAs Using Work Stealing and Instance-Specific Designs. ACM Transactions on Reconfigurable Technology and Systems (TRETS), 10(3), 24:1-24:23. https://doi.org/10.1145/3053687","chicago":"Riebler, Heinrich, Michael Lass, Robert Mittendorf, Thomas Löcke, and Christian Plessl. “Efficient Branch and Bound on FPGAs Using Work Stealing and Instance-Specific Designs.” ACM Transactions on Reconfigurable Technology and Systems (TRETS) 10, no. 3 (2017): 24:1-24:23. https://doi.org/10.1145/3053687.","bibtex":"@article{Riebler_Lass_Mittendorf_Löcke_Plessl_2017, title={Efficient Branch and Bound on FPGAs Using Work Stealing and Instance-Specific Designs}, volume={10}, DOI={10.1145/3053687}, number={3}, journal={ACM Transactions on Reconfigurable Technology and Systems (TRETS)}, publisher={Association for Computing Machinery (ACM)}, author={Riebler, Heinrich and Lass, Michael and Mittendorf, Robert and Löcke, Thomas and Plessl, Christian}, year={2017}, pages={24:1-24:23} }","mla":"Riebler, Heinrich, et al. “Efficient Branch and Bound on FPGAs Using Work Stealing and Instance-Specific Designs.” ACM Transactions on Reconfigurable Technology and Systems (TRETS), vol. 10, no. 3, Association for Computing Machinery (ACM), 2017, p. 
24:1-24:23, doi:10.1145/3053687."},"type":"journal_article","year":"2017","title":"Efficient Branch and Bound on FPGAs Using Work Stealing and Instance-Specific Designs","publication_identifier":{"issn":["1936-7406"]},"publication_status":"published","project":[{"_id":"1","grant_number":"160364472","name":"SFB 901"},{"name":"SFB 901 - Project Area C","_id":"4"},{"_id":"14","grant_number":"160364472","name":"SFB 901 - Subproject C2"},{"grant_number":"610996","name":"Self-Adaptive Virtualisation-Aware High-Performance/Low-Energy Heterogeneous System Architectures","_id":"34"},{"name":"Computing Resources Provided by the Paderborn Center for Parallel Computing","_id":"52"}],"department":[{"_id":"27"},{"_id":"518"}],"doi":"10.1145/3053687","date_updated":"2023-09-26T13:23:58Z","language":[{"iso":"eng"}]},{"language":[{"iso":"eng"}],"citation":{"ieee":"J. Schumacher, C. Plessl, and W. Vandelli, “High-Throughput and Low-Latency Network Communication with NetIO,” Journal of Physics: Conference Series, vol. 898, Art. no. 082003, 2017, doi: 10.1088/1742-6596/898/8/082003.","short":"J. Schumacher, C. Plessl, W. Vandelli, Journal of Physics: Conference Series 898 (2017).","mla":"Schumacher, Jörn, et al. “High-Throughput and Low-Latency Network Communication with NetIO.” Journal of Physics: Conference Series, vol. 898, 082003, IOP Publishing, 2017, doi:10.1088/1742-6596/898/8/082003.","bibtex":"@article{Schumacher_Plessl_Vandelli_2017, title={High-Throughput and Low-Latency Network Communication with NetIO}, volume={898}, DOI={10.1088/1742-6596/898/8/082003}, number={082003}, journal={Journal of Physics: Conference Series}, publisher={IOP Publishing}, author={Schumacher, Jörn and Plessl, Christian and Vandelli, Wainer}, year={2017} }","chicago":"Schumacher, Jörn, Christian Plessl, and Wainer Vandelli. “High-Throughput and Low-Latency Network Communication with NetIO.” Journal of Physics: Conference Series 898 (2017). 
https://doi.org/10.1088/1742-6596/898/8/082003.","ama":"Schumacher J, Plessl C, Vandelli W. High-Throughput and Low-Latency Network Communication with NetIO. Journal of Physics: Conference Series. 2017;898. doi:10.1088/1742-6596/898/8/082003","apa":"Schumacher, J., Plessl, C., & Vandelli, W. (2017). High-Throughput and Low-Latency Network Communication with NetIO. Journal of Physics: Conference Series, 898, Article 082003. https://doi.org/10.1088/1742-6596/898/8/082003"},"year":"2017","type":"journal_article","article_number":"082003","doi":"10.1088/1742-6596/898/8/082003","date_updated":"2023-09-26T13:24:19Z","_id":"1589","intvolume":" 898","status":"public","date_created":"2018-03-22T10:51:20Z","volume":898,"author":[{"last_name":"Schumacher","first_name":"Jörn","full_name":"Schumacher, Jörn"},{"first_name":"Christian","orcid":"0000-0001-5728-9982","full_name":"Plessl, Christian","last_name":"Plessl","id":"16153"},{"last_name":"Vandelli","full_name":"Vandelli, Wainer","first_name":"Wainer"}],"quality_controlled":"1","publisher":"IOP Publishing","department":[{"_id":"27"},{"_id":"518"}],"publication":"Journal of Physics: Conference Series","user_id":"15278","title":"High-Throughput and Low-Latency Network Communication with NetIO"},{"department":[{"_id":"27"},{"_id":"518"}],"publication_identifier":{"issn":["0045-7906"]},"project":[{"_id":"1","grant_number":"160364472","name":"SFB 901"},{"name":"SFB 901 - Subprojekt C2","grant_number":"160364472","_id":"14"},{"name":"SFB 901 - Project Area C","_id":"4"},{"_id":"34","name":"Self-Adaptive Virtualisation-Aware High-Performance/Low-Energy Heterogeneous System Architectures","grant_number":"610996"}],"title":"Potential and Methods for Embedding Dynamic Offloading Decisions into Application Code","language":[{"iso":"eng"}],"date_updated":"2023-09-26T13:26:38Z","doi":"10.1016/j.compeleceng.2016.04.021","publication":"Computers and Electrical 
Engineering","file_date_updated":"2018-03-21T12:45:47Z","publisher":"Elsevier","author":[{"full_name":"Vaz, Gavin Francis","first_name":"Gavin Francis","id":"30332","last_name":"Vaz"},{"first_name":"Heinrich","full_name":"Riebler, Heinrich","last_name":"Riebler","id":"8961"},{"id":"3145","last_name":"Kenter","full_name":"Kenter, Tobias","first_name":"Tobias"},{"last_name":"Plessl","id":"16153","first_name":"Christian","orcid":"0000-0001-5728-9982","full_name":"Plessl, Christian"}],"quality_controlled":"1","file":[{"access_level":"closed","date_created":"2018-03-21T12:45:47Z","file_name":"165-1-s2.0-S0045790616301021-main.pdf","content_type":"application/pdf","date_updated":"2018-03-21T12:45:47Z","relation":"main_file","success":1,"file_size":3037854,"creator":"florida","file_id":"1544"}],"volume":55,"date_created":"2017-10-17T12:41:24Z","has_accepted_license":"1","status":"public","abstract":[{"text":"A broad spectrum of applications can be accelerated by offloading computation intensive parts to reconfigurable hardware. However, to achieve speedups, the number of loop iterations (trip count) needs to be sufficiently large to amortize offloading overheads. Trip counts are frequently not known at compile time, but only at runtime just before entering a loop. Therefore, we propose to generate code for both the CPU and the coprocessor, and defer the offloading decision to the application runtime. We demonstrate how a toolflow, based on the LLVM compiler framework, can automatically embed dynamic offloading decisions into the application code. We perform in-depth static and dynamic analysis of popular benchmarks, which confirm the general potential of such an approach. We also propose to optimize the offloading process by decoupling the runtime decision from the loop execution (decision slack). 
The feasibility of our approach is demonstrated by a toolflow that automatically identifies suitable data-parallel loops and generates code for the FPGA coprocessor of a Convey HC-1. We evaluate the integrated toolflow with representative loops executed for different input data sizes.","lang":"eng"}],"ddc":["040"],"user_id":"15278","page":"91-111","citation":{"ieee":"G. F. Vaz, H. Riebler, T. Kenter, and C. Plessl, “Potential and Methods for Embedding Dynamic Offloading Decisions into Application Code,” Computers and Electrical Engineering, vol. 55, pp. 91–111, 2016, doi: 10.1016/j.compeleceng.2016.04.021.","short":"G.F. Vaz, H. Riebler, T. Kenter, C. Plessl, Computers and Electrical Engineering 55 (2016) 91–111.","mla":"Vaz, Gavin Francis, et al. “Potential and Methods for Embedding Dynamic Offloading Decisions into Application Code.” Computers and Electrical Engineering, vol. 55, Elsevier, 2016, pp. 91–111, doi:10.1016/j.compeleceng.2016.04.021.","bibtex":"@article{Vaz_Riebler_Kenter_Plessl_2016, title={Potential and Methods for Embedding Dynamic Offloading Decisions into Application Code}, volume={55}, DOI={10.1016/j.compeleceng.2016.04.021}, journal={Computers and Electrical Engineering}, publisher={Elsevier}, author={Vaz, Gavin Francis and Riebler, Heinrich and Kenter, Tobias and Plessl, Christian}, year={2016}, pages={91–111} }","ama":"Vaz GF, Riebler H, Kenter T, Plessl C. Potential and Methods for Embedding Dynamic Offloading Decisions into Application Code. Computers and Electrical Engineering. 2016;55:91-111. doi:10.1016/j.compeleceng.2016.04.021","apa":"Vaz, G. F., Riebler, H., Kenter, T., & Plessl, C. (2016). Potential and Methods for Embedding Dynamic Offloading Decisions into Application Code. Computers and Electrical Engineering, 55, 91–111. https://doi.org/10.1016/j.compeleceng.2016.04.021","chicago":"Vaz, Gavin Francis, Heinrich Riebler, Tobias Kenter, and Christian Plessl. 
“Potential and Methods for Embedding Dynamic Offloading Decisions into Application Code.” Computers and Electrical Engineering 55 (2016): 91–111. https://doi.org/10.1016/j.compeleceng.2016.04.021."},"type":"journal_article","year":"2016","intvolume":" 55","_id":"165"},{"issue":"9","doi":"doi:10.1515/teme-2015-0031","_id":"1769","intvolume":" 82","date_updated":"2022-01-06T06:53:17Z","page":"440-450","citation":{"ieee":"S. Hegler et al., “Simulative Ultraschall-Untersuchung von Pitch-Catch-Messanordnungen für große zylindrische Stahl-Prüflinge und gradientenbasierte Bildgebung,” tm - Technisches Messen, vol. 82, no. 9, pp. 440–450, 2015.","short":"S. Hegler, C. Statz, M. Mütze, H. Mooshofer, M. Goldammer, K. Fendt, S. Schwarzer, K. Feldhoff, M. Flehmig, U. Markwardt, W. E. Nagel, M. Schütte, A. Walther, M. Meinel, A. Basermann, D. Plettemeier, Tm - Technisches Messen 82 (2015) 440–450.","mla":"Hegler, Sebastian, et al. “Simulative Ultraschall-Untersuchung von Pitch-Catch-Messanordnungen Für Große Zylindrische Stahl-Prüflinge Und Gradientenbasierte Bildgebung.” Tm - Technisches Messen, vol. 82, no. 9, Walter de Gruyter, 2015, pp. 440–50, doi:doi:10.1515/teme-2015-0031.","bibtex":"@article{Hegler_Statz_Mütze_Mooshofer_Goldammer_Fendt_Schwarzer_Feldhoff_Flehmig_Markwardt_et al._2015, title={Simulative Ultraschall-Untersuchung von Pitch-Catch-Messanordnungen für große zylindrische Stahl-Prüflinge und gradientenbasierte Bildgebung}, volume={82}, DOI={doi:10.1515/teme-2015-0031}, number={9}, journal={tm - Technisches Messen}, publisher={Walter de Gruyter}, author={Hegler, Sebastian and Statz, Christoph and Mütze, Marco and Mooshofer, Hubert and Goldammer, Matthias and Fendt, Karl and Schwarzer, Stefan and Feldhoff, Kim and Flehmig, Martin and Markwardt, Ulf and et al.}, year={2015}, pages={440–450} }","apa":"Hegler, S., Statz, C., Mütze, M., Mooshofer, H., Goldammer, M., Fendt, K., … Plettemeier, D. (2015). 
Simulative Ultraschall-Untersuchung von Pitch-Catch-Messanordnungen für große zylindrische Stahl-Prüflinge und gradientenbasierte Bildgebung. Tm - Technisches Messen, 82(9), 440–450. https://doi.org/doi:10.1515/teme-2015-0031","ama":"Hegler S, Statz C, Mütze M, et al. Simulative Ultraschall-Untersuchung von Pitch-Catch-Messanordnungen für große zylindrische Stahl-Prüflinge und gradientenbasierte Bildgebung. tm - Technisches Messen. 2015;82(9):440-450. doi:doi:10.1515/teme-2015-0031","chicago":"Hegler, Sebastian, Christoph Statz, Marco Mütze, Hubert Mooshofer, Matthias Goldammer, Karl Fendt, Stefan Schwarzer, et al. “Simulative Ultraschall-Untersuchung von Pitch-Catch-Messanordnungen Für Große Zylindrische Stahl-Prüflinge Und Gradientenbasierte Bildgebung.” Tm - Technisches Messen 82, no. 9 (2015): 440–50. https://doi.org/doi:10.1515/teme-2015-0031."},"type":"journal_article","year":"2015","user_id":"24135","title":"Simulative Ultraschall-Untersuchung von Pitch-Catch-Messanordnungen für große zylindrische Stahl-Prüflinge und gradientenbasierte Bildgebung","abstract":[{"lang":"eng","text":"Large cylindrical steel test specimens are investigated in simulation using the finite-difference time-domain (FDTD) method. Pitch-catch measurement setups are used. 
Two imaging approaches are presented: the first is based on Claerbout's imaging principle, the second on gradient-based optimization of an objective functional."}],"date_created":"2018-03-23T14:01:39Z","status":"public","volume":82,"publication":"tm - Technisches Messen","department":[{"_id":"27"},{"_id":"104"}],"publisher":"Walter de Gruyter","author":[{"first_name":"Sebastian","full_name":"Hegler, Sebastian","last_name":"Hegler"},{"last_name":"Statz","first_name":"Christoph","full_name":"Statz, Christoph"},{"first_name":"Marco","full_name":"Mütze, Marco","last_name":"Mütze"},{"full_name":"Mooshofer, Hubert","first_name":"Hubert","last_name":"Mooshofer"},{"first_name":"Matthias","full_name":"Goldammer, Matthias","last_name":"Goldammer"},{"last_name":"Fendt","first_name":"Karl","full_name":"Fendt, Karl"},{"last_name":"Schwarzer","first_name":"Stefan","full_name":"Schwarzer, Stefan"},{"last_name":"Feldhoff","first_name":"Kim","full_name":"Feldhoff, Kim"},{"last_name":"Flehmig","full_name":"Flehmig, Martin","first_name":"Martin"},{"full_name":"Markwardt, Ulf","first_name":"Ulf","last_name":"Markwardt"},{"last_name":"E. Nagel","full_name":"E. 
Nagel, Wolfgang","first_name":"Wolfgang"},{"last_name":"Schütte","full_name":"Schütte, Maria","first_name":"Maria"},{"last_name":"Walther","full_name":"Walther, Andrea","first_name":"Andrea"},{"last_name":"Meinel","first_name":"Michael","full_name":"Meinel, Michael"},{"last_name":"Basermann","first_name":"Achim","full_name":"Basermann, Achim"},{"last_name":"Plettemeier","full_name":"Plettemeier, Dirk","first_name":"Dirk"}]},{"title":"Self-Aware and Self-Expressive Systems – Guest Editor's Introduction","project":[{"_id":"1","name":"SFB 901"},{"_id":"4","name":"SFB 901 - Project Area C"},{"_id":"14","name":"SFB 901 - Subproject C2"},{"_id":"34","grant_number":"610996","name":"Self-Adaptive Virtualisation-Aware High-Performance/Low-Energy Heterogeneous System Architectures"}],"department":[{"_id":"27"},{"_id":"518"},{"_id":"78"}],"doi":"10.1109/MC.2015.205","date_updated":"2022-01-06T06:53:19Z","language":[{"iso":"eng"}],"ddc":["000"],"user_id":"16153","volume":48,"date_created":"2018-03-23T14:06:12Z","status":"public","has_accepted_license":"1","file_date_updated":"2018-11-02T15:47:45Z","publication":"IEEE Computer","keyword":["self-awareness","self-expression"],"publisher":"IEEE Computer Society","author":[{"last_name":"Torresen","full_name":"Torresen, Jim","first_name":"Jim"},{"id":"16153","last_name":"Plessl","orcid":"0000-0001-5728-9982","full_name":"Plessl, Christian","first_name":"Christian"},{"first_name":"Xin","full_name":"Yao, Xin","last_name":"Yao"}],"file":[{"access_level":"closed","date_created":"2018-11-02T15:47:45Z","file_name":"07163237.pdf","date_updated":"2018-11-02T15:47:45Z","content_type":"application/pdf","relation":"main_file","success":1,"file_size":5605009,"file_id":"5313","creator":"ups"}],"issue":"7","intvolume":" 48","_id":"1772","page":"18-20","citation":{"short":"J. Torresen, C. Plessl, X. Yao, IEEE Computer 48 (2015) 18–20.","ieee":"J. Torresen, C. Plessl, and X. 
Yao, “Self-Aware and Self-Expressive Systems – Guest Editor’s Introduction,” IEEE Computer, vol. 48, no. 7, pp. 18–20, 2015.","apa":"Torresen, J., Plessl, C., & Yao, X. (2015). Self-Aware and Self-Expressive Systems – Guest Editor’s Introduction. IEEE Computer, 48(7), 18–20. https://doi.org/10.1109/MC.2015.205","ama":"Torresen J, Plessl C, Yao X. Self-Aware and Self-Expressive Systems – Guest Editor’s Introduction. IEEE Computer. 2015;48(7):18-20. doi:10.1109/MC.2015.205","chicago":"Torresen, Jim, Christian Plessl, and Xin Yao. “Self-Aware and Self-Expressive Systems – Guest Editor’s Introduction.” IEEE Computer 48, no. 7 (2015): 18–20. https://doi.org/10.1109/MC.2015.205.","bibtex":"@article{Torresen_Plessl_Yao_2015, title={Self-Aware and Self-Expressive Systems – Guest Editor’s Introduction}, volume={48}, DOI={10.1109/MC.2015.205}, number={7}, journal={IEEE Computer}, publisher={IEEE Computer Society}, author={Torresen, Jim and Plessl, Christian and Yao, Xin}, year={2015}, pages={18–20} }","mla":"Torresen, Jim, et al. “Self-Aware and Self-Expressive Systems – Guest Editor’s Introduction.” IEEE Computer, vol. 48, no. 7, IEEE Computer Society, 2015, pp. 18–20, doi:10.1109/MC.2015.205."},"type":"journal_article","year":"2015"},{"page":"613-614","year":"2015","citation":{"mla":"Peitz, Sebastian, and Michael Dellnitz. “Multiobjective Optimization of the Flow Around a Cylinder Using Model Order Reduction.” PAMM, vol. 15, no. 1, WILEY-VCH Verlag, 2015, pp. 613–14, doi:10.1002/pamm.201510296.","bibtex":"@article{Peitz_Dellnitz_2015, title={Multiobjective Optimization of the Flow Around a Cylinder Using Model Order Reduction}, volume={15}, DOI={10.1002/pamm.201510296}, number={1}, journal={PAMM}, publisher={WILEY-VCH Verlag}, author={Peitz, Sebastian and Dellnitz, Michael}, year={2015}, pages={613–614} }","chicago":"Peitz, Sebastian, and Michael Dellnitz. “Multiobjective Optimization of the Flow Around a Cylinder Using Model Order Reduction.” PAMM 15, no. 
1 (2015): 613–14. https://doi.org/10.1002/pamm.201510296.","ama":"Peitz S, Dellnitz M. Multiobjective Optimization of the Flow Around a Cylinder Using Model Order Reduction. PAMM. 2015;15(1):613-614. doi:10.1002/pamm.201510296","apa":"Peitz, S., & Dellnitz, M. (2015). Multiobjective Optimization of the Flow Around a Cylinder Using Model Order Reduction. PAMM, 15(1), 613–614. https://doi.org/10.1002/pamm.201510296","ieee":"S. Peitz and M. Dellnitz, “Multiobjective Optimization of the Flow Around a Cylinder Using Model Order Reduction,” PAMM, vol. 15, no. 1, pp. 613–614, 2015.","short":"S. Peitz, M. Dellnitz, PAMM 15 (2015) 613–614."},"type":"journal_article","doi":"10.1002/pamm.201510296","issue":"1","_id":"1774","intvolume":" 15","date_updated":"2022-01-06T06:53:19Z","publication_identifier":{"issn":["1617-7061"]},"volume":15,"date_created":"2018-03-23T14:14:24Z","status":"public","department":[{"_id":"27"},{"_id":"101"}],"publication":"PAMM","publisher":"WILEY-VCH Verlag","author":[{"first_name":"Sebastian","full_name":"Peitz, Sebastian","last_name":"Peitz"},{"last_name":"Dellnitz","first_name":"Michael","full_name":"Dellnitz, Michael"}],"title":"Multiobjective Optimization of the Flow Around a Cylinder Using Model Order Reduction","user_id":"24135","abstract":[{"text":"In this article an efficient numerical method to solve multiobjective optimization problems for fluid flow governed by the Navier Stokes equations is presented. In order to decrease the computational effort, a reduced order model is introduced using Proper Orthogonal Decomposition and a corresponding Galerkin Projection. A global, derivative free multiobjective optimization algorithm is applied to compute the Pareto set (i.e. the set of optimal compromises) for the concurrent objectives minimization of flow field fluctuations and control cost. 
The method is illustrated for a 2D flow around a cylinder at Re = 100.","lang":"eng"}]},{"title":"Exploring Tradeoffs between Specialized Kernels and a Reusable Overlay in a Stereo-Matching Case Study","project":[{"_id":"1","name":"SFB 901","grant_number":"160364472"},{"_id":"14","grant_number":"160364472","name":"SFB 901 - Subprojekt C2"},{"_id":"4","name":"SFB 901 - Project Area C"},{"grant_number":"610996","name":"Self-Adaptive Virtualisation-Aware High-Performance/Low-Energy Heterogeneous System Architectures","_id":"34"}],"department":[{"_id":"27"},{"_id":"518"},{"_id":"78"}],"doi":"10.1155/2015/859425","date_updated":"2023-09-26T13:29:08Z","language":[{"iso":"eng"}],"ddc":["040"],"user_id":"15278","abstract":[{"text":"FPGAs are known to permit huge gains in performance and efficiency for suitable applications but still require reduced design efforts and shorter development cycles for wider adoption. In this work, we compare the resulting performance of two design concepts that in different ways promise such increased productivity. As common starting point, we employ a kernel-centric design approach, where computational hotspots in an application are identified and individually accelerated on FPGA. By means of a complex stereo matching application, we evaluate two fundamentally different design philosophies and approaches for implementing the required kernels on FPGAs. In the first implementation approach, we designed individually specialized data flow kernels in a spatial programming language for a Maxeler FPGA platform; in the alternative design approach, we target a vector coprocessor with large vector lengths, which is implemented as a form of programmable overlay on the application FPGAs of a Convey HC-1. We assess both approaches in terms of overall system performance, raw kernel performance, and performance relative to invested resources. 
After compensating for the effects of the underlying hardware platforms, the specialized dataflow kernels on the Maxeler platform are around 3x faster than kernels executing on the Convey vector coprocessor. In our concrete scenario, due to trade-offs between reconfiguration overheads and exposed parallelism, the advantage of specialized dataflow kernels is reduced to around 2.5x.","lang":"eng"}],"volume":2015,"date_created":"2017-10-17T12:41:49Z","status":"public","has_accepted_license":"1","file_date_updated":"2018-03-20T07:47:56Z","publication":"International Journal of Reconfigurable Computing (IJRC)","quality_controlled":"1","author":[{"last_name":"Kenter","id":"3145","first_name":"Tobias","full_name":"Kenter, Tobias"},{"first_name":"Henning","full_name":"Schmitz, Henning","last_name":"Schmitz"},{"last_name":"Plessl","id":"16153","first_name":"Christian","orcid":"0000-0001-5728-9982","full_name":"Plessl, Christian"}],"publisher":"Hindawi","file":[{"file_name":"296-859425.pdf","date_created":"2018-03-20T07:47:56Z","access_level":"closed","creator":"florida","file_id":"1444","file_size":2993898,"relation":"main_file","success":1,"date_updated":"2018-03-20T07:47:56Z","content_type":"application/pdf"}],"article_number":"859425","intvolume":" 2015","_id":"296","type":"journal_article","year":"2015","citation":{"bibtex":"@article{Kenter_Schmitz_Plessl_2015, title={Exploring Tradeoffs between Specialized Kernels and a Reusable Overlay in a Stereo-Matching Case Study}, volume={2015}, DOI={10.1155/2015/859425}, number={859425}, journal={International Journal of Reconfigurable Computing (IJRC)}, publisher={Hindawi}, author={Kenter, Tobias and Schmitz, Henning and Plessl, Christian}, year={2015} }","mla":"Kenter, Tobias, et al. “Exploring Tradeoffs between Specialized Kernels and a Reusable Overlay in a Stereo-Matching Case Study.” International Journal of Reconfigurable Computing (IJRC), vol. 
2015, 859425, Hindawi, 2015, doi:10.1155/2015/859425.","apa":"Kenter, T., Schmitz, H., & Plessl, C. (2015). Exploring Tradeoffs between Specialized Kernels and a Reusable Overlay in a Stereo-Matching Case Study. International Journal of Reconfigurable Computing (IJRC), 2015, Article 859425. https://doi.org/10.1155/2015/859425","ama":"Kenter T, Schmitz H, Plessl C. Exploring Tradeoffs between Specialized Kernels and a Reusable Overlay in a Stereo-Matching Case Study. International Journal of Reconfigurable Computing (IJRC). 2015;2015. doi:10.1155/2015/859425","chicago":"Kenter, Tobias, Henning Schmitz, and Christian Plessl. “Exploring Tradeoffs between Specialized Kernels and a Reusable Overlay in a Stereo-Matching Case Study.” International Journal of Reconfigurable Computing (IJRC) 2015 (2015). https://doi.org/10.1155/2015/859425.","ieee":"T. Kenter, H. Schmitz, and C. Plessl, “Exploring Tradeoffs between Specialized Kernels and a Reusable Overlay in a Stereo-Matching Case Study,” International Journal of Reconfigurable Computing (IJRC), vol. 2015, Art. no. 859425, 2015, doi: 10.1155/2015/859425.","short":"T. Kenter, H. Schmitz, C. Plessl, International Journal of Reconfigurable Computing (IJRC) 2015 (2015)."}},{"language":[{"iso":"eng"}],"page":"396-399","year":"2015","citation":{"chicago":"Plessl, Christian, Marco Platzner, and Peter J. Schreier. “Aktuelles Schlagwort: Approximate Computing.” Informatik Spektrum, no. 5 (2015): 396–99. https://doi.org/10.1007/s00287-015-0911-z.","ama":"Plessl C, Platzner M, Schreier PJ. Aktuelles Schlagwort: Approximate Computing. Informatik Spektrum. 2015;(5):396-399. doi:10.1007/s00287-015-0911-z","apa":"Plessl, C., Platzner, M., & Schreier, P. J. (2015). Aktuelles Schlagwort: Approximate Computing. Informatik Spektrum, 5, 396–399. 
https://doi.org/10.1007/s00287-015-0911-z","bibtex":"@article{Plessl_Platzner_Schreier_2015, title={Aktuelles Schlagwort: Approximate Computing}, DOI={10.1007/s00287-015-0911-z}, number={5}, journal={Informatik Spektrum}, publisher={Springer}, author={Plessl, Christian and Platzner, Marco and Schreier, Peter J.}, year={2015}, pages={396–399} }","mla":"Plessl, Christian, et al. “Aktuelles Schlagwort: Approximate Computing.” Informatik Spektrum, no. 5, Springer, 2015, pp. 396–99, doi:10.1007/s00287-015-0911-z.","short":"C. Plessl, M. Platzner, P.J. Schreier, Informatik Spektrum (2015) 396–399.","ieee":"C. Plessl, M. Platzner, and P. J. Schreier, “Aktuelles Schlagwort: Approximate Computing,” Informatik Spektrum, no. 5, pp. 396–399, 2015, doi: 10.1007/s00287-015-0911-z."},"type":"journal_article","issue":"5","doi":"10.1007/s00287-015-0911-z","date_updated":"2023-09-26T13:30:22Z","_id":"1768","date_created":"2018-03-23T13:58:34Z","status":"public","publication":"Informatik Spektrum","department":[{"_id":"27"},{"_id":"518"},{"_id":"263"},{"_id":"78"}],"keyword":["approximate computing","survey"],"publisher":"Springer","quality_controlled":"1","author":[{"id":"16153","last_name":"Plessl","full_name":"Plessl, Christian","orcid":"0000-0001-5728-9982","first_name":"Christian"},{"full_name":"Platzner, Marco","first_name":"Marco","id":"398","last_name":"Platzner"},{"last_name":"Schreier","full_name":"Schreier, Peter J.","first_name":"Peter J."}],"user_id":"15278","title":"Aktuelles Schlagwort: Approximate Computing"},{"abstract":[{"text":"The ATLAS experiment at CERN is planning full deployment of a new unified optical link technology for connecting detector front end electronics on the timescale of the LHC Run 4 (2025). It is estimated that roughly 8000 GBT (GigaBit Transceiver) links, with transfer rates up to 10.24 Gbps, will replace existing links used for readout, detector control and distribution of timing and trigger information. 
A new class of devices will be needed to interface many GBT links to the rest of the trigger, data-acquisition and detector control systems. In this paper FELIX (Front End LInk eXchange) is presented, a PC-based device to route data from and to multiple GBT links via a high-performance general purpose network capable of a total throughput up to O(20 Tbps). FELIX implies architectural changes to the ATLAS data acquisition system, such as the use of industry standard COTS components early in the DAQ chain. Additionally the design and implementation of a FELIX demonstration platform is presented and hardware and software aspects will be discussed.","lang":"eng"}],"user_id":"15278","title":"FELIX: a High-Throughput Network Approach for Interfacing to Front End Electronics for ATLAS Upgrades","publication":"Journal of Physics: Conference Series","department":[{"_id":"27"},{"_id":"518"}],"author":[{"last_name":"Anderson","first_name":"J","full_name":"Anderson, J"},{"last_name":"Borga","full_name":"Borga, A","first_name":"A"},{"first_name":"H","full_name":"Boterenbrood, H","last_name":"Boterenbrood"},{"full_name":"Chen, H","first_name":"H","last_name":"Chen"},{"first_name":"K","full_name":"Chen, K","last_name":"Chen"},{"last_name":"Drake","full_name":"Drake, G","first_name":"G"},{"last_name":"Francis","first_name":"D","full_name":"Francis, D"},{"full_name":"Gorini, B","first_name":"B","last_name":"Gorini"},{"first_name":"F","full_name":"Lanni, F","last_name":"Lanni"},{"last_name":"Lehmann Miotto","full_name":"Lehmann Miotto, G","first_name":"G"},{"first_name":"L","full_name":"Levinson, L","last_name":"Levinson"},{"last_name":"Narevicius","first_name":"J","full_name":"Narevicius, J"},{"first_name":"Christian","full_name":"Plessl, Christian","orcid":"0000-0001-5728-9982","last_name":"Plessl","id":"16153"},{"full_name":"Roich, A","first_name":"A","last_name":"Roich"},{"full_name":"Ryu, S","first_name":"S","last_name":"Ryu"},{"last_name":"Schreuder","full_name":"Schreuder, 
F","first_name":"F"},{"last_name":"Schumacher","first_name":"Jörn","full_name":"Schumacher, Jörn"},{"last_name":"Vandelli","full_name":"Vandelli, Wainer","first_name":"Wainer"},{"first_name":"J","full_name":"Vermeulen, J","last_name":"Vermeulen"},{"last_name":"Zhang","first_name":"J","full_name":"Zhang, J"}],"quality_controlled":"1","publisher":"IOP Publishing","date_created":"2018-03-23T14:19:27Z","status":"public","volume":664,"_id":"1775","date_updated":"2023-09-26T13:31:23Z","intvolume":" 664","doi":"10.1088/1742-6596/664/8/082050","article_number":"082050","language":[{"iso":"eng"}],"type":"journal_article","year":"2015","citation":{"ieee":"J. Anderson et al., “FELIX: a High-Throughput Network Approach for Interfacing to Front End Electronics for ATLAS Upgrades,” Journal of Physics: Conference Series, vol. 664, Art. no. 082050, 2015, doi: 10.1088/1742-6596/664/8/082050.","short":"J. Anderson, A. Borga, H. Boterenbrood, H. Chen, K. Chen, G. Drake, D. Francis, B. Gorini, F. Lanni, G. Lehmann Miotto, L. Levinson, J. Narevicius, C. Plessl, A. Roich, S. Ryu, F. Schreuder, J. Schumacher, W. Vandelli, J. Vermeulen, J. Zhang, Journal of Physics: Conference Series 664 (2015).","bibtex":"@article{Anderson_Borga_Boterenbrood_Chen_Chen_Drake_Francis_Gorini_Lanni_Lehmann Miotto_et al._2015, title={FELIX: a High-Throughput Network Approach for Interfacing to Front End Electronics for ATLAS Upgrades}, volume={664}, DOI={10.1088/1742-6596/664/8/082050}, number={082050}, journal={Journal of Physics: Conference Series}, publisher={IOP Publishing}, author={Anderson, J and Borga, A and Boterenbrood, H and Chen, H and Chen, K and Drake, G and Francis, D and Gorini, B and Lanni, F and Lehmann Miotto, G and et al.}, year={2015} }","mla":"Anderson, J., et al. “FELIX: A High-Throughput Network Approach for Interfacing to Front End Electronics for ATLAS Upgrades.” Journal of Physics: Conference Series, vol. 
664, 082050, IOP Publishing, 2015, doi:10.1088/1742-6596/664/8/082050.","apa":"Anderson, J., Borga, A., Boterenbrood, H., Chen, H., Chen, K., Drake, G., Francis, D., Gorini, B., Lanni, F., Lehmann Miotto, G., Levinson, L., Narevicius, J., Plessl, C., Roich, A., Ryu, S., Schreuder, F., Schumacher, J., Vandelli, W., Vermeulen, J., & Zhang, J. (2015). FELIX: a High-Throughput Network Approach for Interfacing to Front End Electronics for ATLAS Upgrades. Journal of Physics: Conference Series, 664, Article 082050. https://doi.org/10.1088/1742-6596/664/8/082050","ama":"Anderson J, Borga A, Boterenbrood H, et al. FELIX: a High-Throughput Network Approach for Interfacing to Front End Electronics for ATLAS Upgrades. Journal of Physics: Conference Series. 2015;664. doi:10.1088/1742-6596/664/8/082050","chicago":"Anderson, J, A Borga, H Boterenbrood, H Chen, K Chen, G Drake, D Francis, et al. “FELIX: A High-Throughput Network Approach for Interfacing to Front End Electronics for ATLAS Upgrades.” Journal of Physics: Conference Series 664 (2015). https://doi.org/10.1088/1742-6596/664/8/082050."}},{"intvolume":" 38","_id":"363","issue":"8, Part B","page":"911-919","citation":{"short":"A. Agne, H. Hangmann, M. Happe, M. Platzner, C. Plessl, Microprocessors and Microsystems 38 (2014) 911–919.","ieee":"A. Agne, H. Hangmann, M. Happe, M. Platzner, and C. Plessl, “Seven Recipes for Setting Your FPGA on Fire – A Cookbook on Heat Generators,” Microprocessors and Microsystems, vol. 38, no. 8, Part B, pp. 911–919, 2014, doi: 10.1016/j.micpro.2013.12.001.","ama":"Agne A, Hangmann H, Happe M, Platzner M, Plessl C. Seven Recipes for Setting Your FPGA on Fire – A Cookbook on Heat Generators. Microprocessors and Microsystems. 2014;38(8, Part B):911-919. doi:10.1016/j.micpro.2013.12.001","apa":"Agne, A., Hangmann, H., Happe, M., Platzner, M., & Plessl, C. (2014). Seven Recipes for Setting Your FPGA on Fire – A Cookbook on Heat Generators. Microprocessors and Microsystems, 38(8, Part B), 911–919. 
https://doi.org/10.1016/j.micpro.2013.12.001","chicago":"Agne, Andreas, Hendrik Hangmann, Markus Happe, Marco Platzner, and Christian Plessl. “Seven Recipes for Setting Your FPGA on Fire – A Cookbook on Heat Generators.” Microprocessors and Microsystems 38, no. 8, Part B (2014): 911–19. https://doi.org/10.1016/j.micpro.2013.12.001.","bibtex":"@article{Agne_Hangmann_Happe_Platzner_Plessl_2014, title={Seven Recipes for Setting Your FPGA on Fire – A Cookbook on Heat Generators}, volume={38}, DOI={10.1016/j.micpro.2013.12.001}, number={8, Part B}, journal={Microprocessors and Microsystems}, publisher={Elsevier}, author={Agne, Andreas and Hangmann, Hendrik and Happe, Markus and Platzner, Marco and Plessl, Christian}, year={2014}, pages={911–919} }","mla":"Agne, Andreas, et al. “Seven Recipes for Setting Your FPGA on Fire – A Cookbook on Heat Generators.” Microprocessors and Microsystems, vol. 38, no. 8, Part B, Elsevier, 2014, pp. 911–19, doi:10.1016/j.micpro.2013.12.001."},"type":"journal_article","year":"2014","abstract":[{"lang":"eng","text":"Due to the continuously shrinking device structures and increasing densities of FPGAs, thermal aspects have become the new focus for many research projects over the last years. Most researchers rely on temperature simulations to evaluate their novel thermal management techniques. However, these temperature simulations require a high computational effort if a detailed thermal model is used and their accuracies are often unclear. In contrast to simulations, the use of synthetic heat sources allows for experimental evaluation of temperature management methods. In this paper we investigate the creation of significant rises in temperature on modern FPGAs to enable future evaluation of thermal management techniques based on experiments. To that end, we have developed seven different heat-generating cores that use different subsets of FPGA resources. 
Our experimental results show that, according to external temperature probes connected to the FPGA’s heat sink, we can increase the temperature by an average of 81 °C. This corresponds to an average increase of 156.3 °C as measured by the built-in thermal diodes of our Virtex-5 FPGAs in less than 30 min by only utilizing about 21 percent of the slices."}],"user_id":"15278","ddc":["040"],"file":[{"date_created":"2018-03-20T07:20:31Z","file_name":"363-plessl13_micpro.pdf","access_level":"closed","file_size":1499996,"file_id":"1408","creator":"florida","content_type":"application/pdf","date_updated":"2018-03-20T07:20:31Z","relation":"main_file","success":1}],"file_date_updated":"2018-03-20T07:20:31Z","publication":"Microprocessors and Microsystems","quality_controlled":"1","publisher":"Elsevier","author":[{"last_name":"Agne","full_name":"Agne, Andreas","first_name":"Andreas"},{"last_name":"Hangmann","first_name":"Hendrik","full_name":"Hangmann, Hendrik"},{"last_name":"Happe","first_name":"Markus","full_name":"Happe, Markus"},{"first_name":"Marco","full_name":"Platzner, Marco","last_name":"Platzner","id":"398"},{"last_name":"Plessl","id":"16153","first_name":"Christian","orcid":"0000-0001-5728-9982","full_name":"Plessl, Christian"}],"date_created":"2017-10-17T12:42:02Z","status":"public","has_accepted_license":"1","volume":38,"date_updated":"2023-09-26T13:33:06Z","doi":"10.1016/j.micpro.2013.12.001","language":[{"iso":"eng"}],"title":"Seven Recipes for Setting Your FPGA on Fire – A Cookbook on Heat Generators","department":[{"_id":"27"},{"_id":"518"},{"_id":"78"}],"project":[{"name":"SFB 901","grant_number":"160364472","_id":"1"},{"_id":"14","name":"SFB 901 - Subprojekt C2","grant_number":"160364472"},{"name":"SFB 901 - Project Area C","_id":"4"},{"grant_number":"257906","name":"Engineering Proprioception in Computing Systems","_id":"31"}]}]