TY - CONF
AU - Awais, Muhammad
AU - Ghasemzadeh Mohammadi, Hassan
AU - Platzner, Marco
ID - 21610
T2 - Proceedings of the ACM Great Lakes Symposium on VLSI (GLSVLSI) 2021
TI - LDAX: A Learning-based Fast Design Space Exploration Framework for Approximate Circuit Synthesis
ER -
TY - GEN
AU - Rehnen, Jakob Werner
ID - 22216
TI - Decomposition of Arithmetic Components for the Approximate Circuit Synthesis with EvoApproxLib
ER -
TY - CONF
AB - Approximate computing (AC) has acquired significant maturity in recent years as a promising approach to obtain energy and area-efficient hardware. Automated approximate accelerator synthesis involves a great deal of complexity owing to the size of the design space, which grows exponentially with the number of possible approximations. Design space exploration of approximate accelerator synthesis is usually targeted via heuristic-based search methods. The majority of existing frameworks prune a large part of the design space using a greedy-based approach to keep the problem tractable. Therefore, they result in inferior solutions since many potential solutions are neglected in the pruning process without the possibility of backtracking of removed approximate instances. In this paper, we address the aforementioned issue by adopting Monte Carlo Tree Search (MCTS), as an efficient stochastic learning-based search algorithm, in the context of automated synthesis of approximate accelerators. This enables the synthesis frameworks to deeply subsample the design space of approximate accelerator synthesis toward the most promising approximate instances based on the required performance goals, i.e., power consumption, area, and/or delay. We investigated the challenges of providing an efficient open-source framework that benefits analytical and search-based approximation techniques simultaneously to both speed up the synthesis runtime and improve the quality of obtained results. Besides, we studied the utilization of machine learning algorithms to improve the performance of several critical steps, i.e., accelerator quality testing, in the synthesis framework. The proposed framework can help the community to rapidly generate efficient approximate accelerators in a reasonable runtime.
AU - Awais, Muhammad
AU - Platzner, Marco
ID - 22309
KW - Approximate computing
KW - Design space exploration
KW - Accelerator synthesis
T2 - Proceedings of IEEE Computer Society Annual Symposium on VLSI
TI - MCTS-Based Synthesis Towards Efficient Approximate Accelerators
ER -
TY - GEN
AB - This bachelor thesis presents a C/C++ implementation of the XCS algorithm for an embedded system and profiling results concerning the execution time of the functions. These are then analyzed in relation to the input characteristics of the examined learning environments and compared with related work. Three main conclusions can be drawn from the measured results. First, the maximum size of the population of the classifiers influences the runtime of the genetic algorithm; second, the size of the input space has a direct effect on the execution time of the matching function; and last, a larger action space results in a longer runtime generating the prediction for the possible actions. The dependencies identified here can serve to optimize the computational efficiency and make XCS more suitable for embedded systems.
AU - Brede, Mathis
ID - 22483
TI - Implementation and Profiling of XCS in the Context of Embedded Systems
ER -
TY - CONF
AU - Witschen, Linus Matthias
AU - Wiersema, Tobias
AU - Raeisi Nafchi, Masood
AU - Bockhorn, Arne
AU - Platzner, Marco
ED - Hannig, Frank
ED - Derrien, Steven
ED - Diniz, Pedro
ED - Chillet, Daniel
ID - 21953
T2 - Proceedings of International Symposium on Applied Reconfigurable Computing (ARC'21)
TI - Timing Optimization for Virtual FPGA Configurations
ER -
TY - JOUR
AB - Background: Hand amputation can have a truly debilitating impact on the life of the affected person. A multifunctional myoelectric prosthesis controlled using pattern classification can be used to restore some of the lost motor abilities. However, learning to control an advanced prosthesis can be a challenging task, but virtual and augmented reality (AR) provide means to create an engaging and motivating training. Methods: In this study, we present a novel training framework that integrates virtual elements within a real scene (AR) while allowing the view from the first-person perspective. The framework was evaluated in 13 able-bodied subjects and a limb-deficient person divided into intervention (IG) and control (CG) groups. The IG received training by performing a simulated clothespin task and both groups conducted a pre- and posttest with a real prosthesis. When training with the AR, the subjects received visual feedback on the generated grasping force. The main outcome measure was the number of pins that were successfully transferred within 20 min (task duration), while the numbers of dropped and broken pins were also registered. The participants were asked to score the difficulty of the real task (posttest), fun-factor and motivation, as well as the utility of the feedback. Results: The performance (median/interquartile range) consistently increased during the training sessions (4/3 to 22/4). While the results were similar for the two groups in the pretest, the performance improved in the posttest only in IG. In addition, the subjects in IG transferred significantly more pins (28/10.5 versus 14.5/11), and dropped (1/2.5 versus 3.5/2) and broke (5/3.8 versus 14.5/9) significantly fewer pins in the posttest compared to CG. The participants in IG assigned (mean ± std) significantly lower scores to the difficulty compared to CG (5.2 ± 1.9 versus 7.1 ± 0.9), and they highly rated the fun factor (8.7 ± 1.3) and usefulness of feedback (8.5 ± 1.7). Conclusion: The results demonstrated that the proposed AR system allows for the transfer of skills from the simulated to the real task while providing a positive user experience. The present study demonstrates the effectiveness and flexibility of the proposed AR framework. Importantly, the developed system is open source and available for download and further development.
AU - Boschmann, Alexander
AU - Neuhaus, Dorothee
AU - Vogt, Sarah
AU - Kaltschmidt, Christian
AU - Platzner, Marco
AU - Dosen, Strahinja
ID - 30906
IS - 1
JF - Journal of NeuroEngineering and Rehabilitation
KW - Health Informatics
KW - Rehabilitation
SN - 1743-0003
TI - Immersive augmented reality system for the training of pattern classification control with a myoelectric prosthesis
VL - 18
ER -
TY - JOUR
AU - Rodriguez, Alfonso
AU - Otero, Andres
AU - Platzner, Marco
AU - De la Torre, Eduardo
ID - 30907
JF - IEEE Transactions on Computers
KW - Computational Theory and Mathematics
KW - Hardware and Architecture
KW - Theoretical Computer Science
KW - Software
SN - 0018-9340
TI - Exploiting Hardware-Based Data-Parallel and Multithreading Models for Smart Edge Computing in Reconfigurable FPGAs
ER -
TY - CONF
AU - Hansmeier, Tim
ID - 29137
T2 - HEART '21: Proceedings of the 11th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies
TI - Self-aware Operation of Heterogeneous Compute Nodes using the Learning Classifier System XCS
ER -
TY - GEN
AB - Autonomous mobile robots are becoming increasingly capable and widespread. Reliable obstacle avoidance is an integral part of autonomous navigation. This involves real-time interpretation and processing of a complex environment. Strict time and energy constraints of a mobile autonomous system make efficient computation extremely desirable. The benefits of employing hardware/software co-designed applications are obvious and significant. Hardware accelerators are used for efficient processing of the algorithms by exploiting parallelism. FPGAs are a class of hardware accelerators, which can contain hundreds of small execution units, and can be used for hardware/software co-designed applications. However, there is a reluctance when it comes to adoption of these devices in well-established application domains, such as robotics, due to the steep learning curve of FPGA application design. ReconROS has successfully bridged the gap between robotic and FPGA application development by providing an intuitive, common development platform for robotic application development for FPGAs. It does so by integrating the Robot Operating System (ROS), which is an industry and academia standard for robotics application development, with ReconOS, an operating system for reconfigurable hardware. In this thesis, an obstacle avoidance system is designed and implemented for an autonomous vehicle using ReconROS. The objectives of the thesis are to demonstrate and explore ReconROS integration within the ROS ecosystem, to explore the design process within the ReconROS framework, and to demonstrate the effectiveness of hardware acceleration in robotics by analysing the resulting architectures for latency and power consumption.
AU - Sheikh, Muhammad Aamir
ID - 29540
TI - Design and Implementation of a ReconROS-based Obstacle Avoidance System
ER -
TY - GEN
AB - Robotics applications process large amounts of data in real-time and require compute platforms that provide high performance and energy-efficiency. FPGAs are well-suited for many of these applications, but there is a reluctance in the robotics community to use hardware acceleration due to increased design complexity and a lack of consistent programming models across the software/hardware boundary. In this paper we present ReconROS, a framework that integrates the widely-used robot operating system (ROS) with ReconOS, which features multithreaded programming of hardware and software threads for reconfigurable computers. This unique combination gives ROS2 developers the flexibility to transparently accelerate parts of their robotics applications in hardware. We elaborate on the architecture and the design flow for ReconROS and report on a set of experiments that underline the feasibility and flexibility of our approach.
AU - Lienen, Christian
AU - Platzner, Marco
ID - 22764
T2 - arXiv:2107.07208
TI - Design of Distributed Reconfigurable Robotics Systems with ReconROS
ER -
TY - CONF
AU - Hansmeier, Tim
AU - Platzner, Marco
ID - 21813
SN - 978-1-4503-8351-6
T2 - GECCO '21: Proceedings of the Genetic and Evolutionary Computation Conference Companion
TI - An Experimental Comparison of Explore/Exploit Strategies for the Learning Classifier System XCS
ER -
TY - JOUR
AB - Verification of software and processor hardware usually proceeds separately, software analysis relying on the correctness of processors executing machine instructions. This assumption is valid as long as the software runs on standard CPUs that have been extensively validated and are in wide use. However, for processors exploiting custom instruction set extensions to meet performance and energy constraints the validation might be less extensive, challenging the correctness assumption. In this paper we present a novel formal approach for hardware/software co-verification targeting processors with custom instruction set extensions. We detail two different approaches for checking whether the hardware fulfills the requirements expected by the software analysis. The approaches are designed to explore a trade-off between generality of the verification and computational effort. Then, we describe the integration of software and hardware analyses for both techniques and describe a fully automated tool chain implementing the approaches. Finally, we demonstrate and compare the two approaches on example source code with custom instructions, using state-of-the-art software analysis and hardware verification techniques.
AU - Jakobs, Marie-Christine
AU - Pauck, Felix
AU - Platzner, Marco
AU - Wehrheim, Heike
AU - Wiersema, Tobias
ID - 27841
JF - IEEE Access
KW - Software Analysis
KW - Abstract Interpretation
KW - Custom Instruction
KW - Hardware Verification
TI - Software/Hardware Co-Verification for Custom Instruction Set Processors
ER -
TY - CONF
AU - Ahmed, Qazi Arbab
ID - 29138
T2 - 2021 IFIP/IEEE 29th International Conference on Very Large Scale Integration (VLSI-SoC)
TI - Hardware Trojans in Reconfigurable Computing
ER -
TY - CONF
AB - The battle of developing hardware Trojans and corresponding countermeasures has taken adversaries towards ingenious ways of compromising hardware designs by circumventing even advanced testing and verification methods. Besides conventional methods of inserting Trojans into a design by a malicious entity, the design flow for field-programmable gate arrays (FPGAs) can also be surreptitiously compromised to assist the attacker in performing a successful malfunctioning or information leakage attack. The advanced stealthy malicious look-up-table (LUT) attack activates a Trojan only when generating the FPGA bitstream and thus cannot be detected by register transfer and gate level testing and verification. However, even this attack was recently revealed by a bitstream-level proof-carrying hardware (PCH) approach. In this paper, we present a novel attack that leverages malicious routing of the inserted Trojan circuit to acquire a dormant state even in the generated and transmitted bitstream. The Trojan's payload is connected to primary inputs/outputs of the FPGA via a programmable interconnect point (PIP). The Trojan is detached from inputs/outputs during place-and-route and re-connected only when the FPGA is being programmed, thus activating the Trojan circuit without any need for a trigger logic. Since the Trojan is injected in a post-synthesis step and remains unconnected in the bitstream, the presented attack can currently be prevented neither by conventional testing and verification methods nor by recent bitstream-level verification techniques.
AU - Ahmed, Qazi Arbab
AU - Wiersema, Tobias
AU - Platzner, Marco
ID - 20681
T2 - 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE)
TI - Malicious Routing: Circumventing Bitstream-level Verification for FPGAs
ER -
TY - CONF
AU - Clausing, Lennart
ID - 30909
T2 - Proceedings of the 11th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies
TI - ReconOS64: High-Performance Embedded Computing for Industrial Analytics on a Reconfigurable System-on-Chip
ER -
TY - CONF
AU - Ghasemzadeh Mohammadi, Hassan
AU - Jentzsch, Felix
AU - Kuschel, Maurice
AU - Arshad, Rahil
AU - Rautmare, Sneha
AU - Manjunatha, Suraj
AU - Platzner, Marco
AU - Boschmann, Alexander
AU - Schollbach, Dirk
ID - 30908
T2 - Machine Learning and Principles and Practice of Knowledge Discovery in Databases
TI - FLight: FPGA Acceleration of Lightweight DNN Model Inference in Industrial Analytics
ER -
TY - CONF
AU - Guettatfi, Zakarya
AU - Kaufmann, Paul
AU - Platzner, Marco
ID - 3583
T2 - Proceedings of the International Workshop on Applied Reconfigurable Computing (ARC)
TI - Optimal and Greedy Heuristic Approaches for Scheduling and Mapping of Hardware Tasks to Reconfigurable Computing Devices
ER -
TY - GEN
AU - Chandrakar, Khushboo
ID - 21324
TI - Comparison of Feature Selection Techniques to Improve Approximate Circuit Synthesis
ER -
TY - GEN
AB - Robots are becoming increasingly autonomous and more capable. Because of the limited portable energy budget provided by, e.g., batteries, and because of more demanding algorithms, efficient computation is of interest. Field Programmable Gate Arrays (FPGAs), for example, can provide fast and efficient processing, and the Robot Operating System (ROS) is a popular middleware used for robotic applications. The novel ReconROS combines version 2 of the Robot Operating System with ReconOS, a framework for integrating reconfigurable hardware. It provides a unified interface between software and hardware. ReconROS is evaluated in this thesis by implementing a Sobel filter as the video processing application, running on a Zynq-7000 series System on Chip. Timing measurements of execution and transfer times were taken and compared to theoretical values. The hardware implementation is designed in C code using High Level Synthesis and with the interface and functionality provided by ReconROS. An important aspect is the publish/subscribe mechanism of ROS. The operating system interface functions for publishing and subscribing are reasonably fast at below 10 ms for a 1 MB color VGA image. The main memory interface performs well at higher data sizes, crossing 100 MB/s at 20 kB and increasing to a maximum of around 150 MB/s. Furthermore, the hardware implementation introduces consistency to the execution times and performs twice as fast as the software implementation.
AU - Henke, Luca-Sebastian
ID - 21432
TI - Evaluation of a ReconOS-ROS Combination based on a Video Processing Application
ER -
TY - CONF
AU - Gatica, Carlos Paiz
AU - Platzner, Marco
ID - 21584
SN - 2522-8579
T2 - Machine Learning for Cyber Physical Systems (ML4CPS 2017)
TI - Adaptable Realization of Industrial Analytics Functions on Edge-Devices using Reconfigurable Architectures
ER -