TY - THES AB - In the race between the development of new hardware Trojans and corresponding countermeasures, adversaries are pursuing ever more sophisticated ways to infect circuit designs while evading even advanced test and verification methods. Apart from the conventional methods of inserting a Trojan into a design for a field-programmable gate array (FPGA), the design tools themselves can also be covertly compromised to help an attacker carry out a successful attack that can, for example, cause malfunctions or unintended information leakage. This dissertation mainly addresses the two perspectives on hardware Trojans in reconfigurable systems: on the one hand the defender's perspective, with a method for detecting Trojans at the bitstream level, and on the other hand the attacker's perspective, with a novel attack method for FPGA Trojans. For the defense against the ``Malicious LUT'' Trojan, we present the very first successful countermeasure, which can be applied through verification by means of proof-carrying hardware (PCH) at the bitstream level directly before the hardware is configured, and we present a complete scheme for the design and verification of circuits for iCE40 FPGAs. For the opposing side, we introduce a new attack that exploits malicious routing in the inserted Trojan to remain in a dormant state even in the final bitstream. As a result, this novel attack can currently be detected neither by conventional test and verification methods nor by our previously presented verification at the bitstream level. 
AU - Ahmed, Qazi Arbab ID - 29769 KW - FPGA Security KW - Hardware Trojans KW - Bitstream-level Trojans KW - Bitstream Verification TI - Hardware Trojans in Reconfigurable Computing ER - TY - CONF AB - FPGAs have found increasing adoption in data center applications since a new generation of high-level tools has become available that noticeably reduces development time for FPGA accelerators while still providing high-quality results. There is, however, no high-level benchmark suite available that specifically enables a comparison of FPGA architectures, programming tools, and libraries for HPC applications. To fill this gap, we have developed an OpenCL-based open-source implementation of the HPCC benchmark suite for Xilinx and Intel FPGAs. This benchmark can serve to analyze the current capabilities of FPGA devices, cards, and development tool flows, track progress over time, and point out specific difficulties for FPGA acceleration in the HPC domain. Additionally, the benchmark documents proven performance optimization patterns. We will continue optimizing and porting the benchmark for new generations of FPGAs and design tools and encourage active participation to create a valuable tool for the community. 
AU - Meyer, Marius AU - Kenter, Tobias AU - Plessl, Christian ID - 21632 KW - FPGA KW - OpenCL KW - High Level Synthesis KW - HPC benchmarking SN - 9781665415927 T2 - 2020 IEEE/ACM International Workshop on Heterogeneous High-performance Reconfigurable Computing (H2RC) TI - Evaluating FPGA Accelerator Performance with a Parameterized OpenCL Adaptation of Selected Benchmarks of the HPCChallenge Benchmark Suite ER - TY - JOUR AB - Advances in electromyographic (EMG) sensor technology and machine learning algorithms have led to an increased research effort into high density EMG-based pattern recognition methods for prosthesis control. With the goal set on an autonomous multi-movement prosthesis capable of performing training and classification of an amputee’s EMG signals, the focus of this paper lies in the acceleration of the embedded signal processing chain. We present two Xilinx Zynq-based architectures for accelerating two inherently different high density EMG-based control algorithms. The first hardware accelerated design achieves speed-ups of up to 4.8 over the software-only solution, allowing for a processing delay lower than the sample period of 1 ms. The second system achieved a speed-up of 5.5 over the software-only version and operates at a still satisfactory low processing delay of up to 15 ms while providing a higher reliability and robustness against electrode shift and noisy channels. 
AU - Boschmann, Alexander AU - Agne, Andreas AU - Thombansen, Georg AU - Witschen, Linus Matthias AU - Kraus, Florian AU - Platzner, Marco ID - 11950 JF - Journal of Parallel and Distributed Computing KW - High density electromyography KW - FPGA acceleration KW - Medical signal processing KW - Pattern recognition KW - Prosthetics SN - 0743-7315 TI - Zynq-based acceleration of robust high density myoelectric signal processing VL - 123 ER - TY - GEN AB - Molecular Dynamic (MD) simulations are computationally intensive and accelerating them using specialized hardware is a topic of investigation in many studies. One of the routines in the critical path of MD simulations is the three-dimensional Fast Fourier Transformation (FFT3d). The potential in accelerating FFT3d using hardware is usually bound by bandwidth and memory. Therefore, designing a high throughput solution for an FPGA that overcomes this problem is challenging. In this thesis, the feasibility of offloading FFT3d computations to FPGA implemented using OpenCL is investigated. In order to mask the latency in memory access, an FFT3d that overlaps computation with communication is designed. The implementation of this design is synthesized for the Arria 10 GX 1150 FPGA and evaluated with the FFTW benchmark. Analysis shows a better performance using FPGA over CPU for larger FFT sizes, with the 64³ FFT showing a 70% improvement in runtime using FPGAs. This FFT3d design is integrated with CP2K to explore the potential in accelerating molecular dynamic simulations. Evaluation of CP2K simulations using FPGA shows a 41% improvement in runtime in FFT3d computations over CPU for larger FFT3d designs. 
AU - Ramaswami, Arjun ID - 5417 KW - FFT: FPGA KW - CP2K KW - OpenCL TI - Accelerating Molecular Dynamic Simulations by Offloading Fast Fourier Transformations to FPGA ER - TY - CONF AU - Ho, Nam AU - Ahmed, Abdullah Fathi AU - Kaufmann, Paul AU - Platzner, Marco ID - 10673 KW - cache storage KW - field programmable gate arrays KW - multiprocessing systems KW - parallel architectures KW - reconfigurable architectures KW - FPGA KW - dynamic reconfiguration KW - evolvable cache mapping KW - many-core architecture KW - memory-to-cache address mapping function KW - microarchitectural optimization KW - multicore architecture KW - nature-inspired optimization KW - parallelization degrees KW - processor KW - reconfigurable cache mapping KW - reconfigurable computing KW - Field programmable gate arrays KW - Software KW - Tuning T2 - Proc. NASA/ESA Conf. Adaptive Hardware and Systems (AHS) TI - Microarchitectural optimization by means of reconfigurable and evolvable cache mappings ER - TY - CONF AU - Anwer, Jahanzeb AU - Meisner, Sebastian AU - Platzner, Marco ID - 10620 KW - fault tolerant computing KW - field programmable gate arrays KW - logic design KW - reliability KW - BYU-LANL tool KW - DRM tool flow KW - FPGA based hardware designs KW - avionic application KW - device technologies KW - dynamic reliability management KW - fault-tolerant operation KW - hardware designs KW - reconfiguring reliability levels KW - space applications KW - Field programmable gate arrays KW - Hardware KW - Redundancy KW - Reliability engineering KW - Runtime KW - Tunneling magnetoresistance T2 - Reconfigurable Computing and FPGAs (ReConFig), 2013 International Conference on TI - Dynamic reliability management: Reconfiguring reliability-levels of hardware designs at runtime ER - TY - JOUR AB - Reconfigurable architectures that tightly integrate a standard CPU core with a field-programmable hardware structure have recently been receiving impact of these design decisions on the overall system 
performance is a challenging task. In this paper, we first present a framework for the cycle-accurate performance evaluation of hybrid reconfigurable processors on the system level. Then, we discuss a reconfigurable processor for data-streaming applications, which attaches a coarse-grained reconfigurable unit to the coprocessor interface of a standard embedded CPU core. By means of a case study we evaluate the system-level impact of certain design features for the reconfigurable unit, such as multiple contexts, register replication, and hardware context scheduling. The results illustrate that a system-level evaluation framework is of paramount importance for studying the architectural trade-offs and optimizing design parameters for reconfigurable processors. AU - Enzler, Rolf AU - Plessl, Christian AU - Platzner, Marco ID - 2412 IS - 2-3 JF - Microprocessors and Microsystems KW - FPGA KW - reconfigurable computing KW - co-simulation KW - Zippy TI - System-level performance evaluation of reconfigurable processors VL - 29 ER - TY - CONF AB - This paper presents TKDM, a PC-based high-performance reconfigurable computing environment. The TKDM hardware consists of an FPGA module that uses the DIMM (dual inline memory module) bus for high-bandwidth and low-latency communication with the host CPU. The system's firmware is integrated with the Linux host operating system and offers functions for data communication and FPGA reconfiguration. The intended use of TKDM is that of a dynamically reconfigurable co-processor for data streaming applications. The system's firmware can be customized for specific application domains to facilitate simple and easy-to-use programming interfaces. AU - Plessl, Christian AU - Platzner, Marco ID - 2418 KW - coprocessor KW - DIMM KW - memory bus KW - FPGA KW - high performance computing T2 - Proc. Int. Conf. 
on Field Programmable Technology (ICFPT) TI - TKDM – A Reconfigurable Co-processor in a PC's Memory Slot ER - TY - CONF AB - In contrast to processors, current reconfigurable devices totally lack programming models that would allow for device-independent compilation and forward compatibility. The key to overcoming this limitation is hardware virtualization. In this paper, we resort to a macro-pipelined execution model to achieve hardware virtualization for data streaming applications. As a hardware implementation we present a hybrid multi-context architecture that attaches a coarse-grained reconfigurable array to a host CPU. A co-simulation framework enables cycle-accurate simulation of the complete architecture. As a case study we map an FIR filter to our virtualized hardware model and evaluate different designs. We discuss the impact of the number of contexts and the feature of context state on the speedup and the CPU load. AU - Enzler, Rolf AU - Plessl, Christian AU - Platzner, Marco ID - 2421 KW - Zippy KW - multi-context KW - FPGA T2 - Proc. Int. Conf. on Field Programmable Logic and Applications (FPL) TI - Virtualizing Hardware with Multi-Context Reconfigurable Arrays VL - 2778 ER -