TY - GEN AB - This bachelor thesis presents a C/C++ implementation of the XCS algorithm for an embedded system and profiling results concerning the execution time of the functions. These are then analyzed in relation to the input characteristics of the examined learning environments and compared with related work. Three main conclusions can be drawn from the measured results. First, the maximum size of the population of the classifiers influences the runtime of the genetic algorithm; second, the size of the input space has a direct effect on the execution time of the matching function; and last, a larger action space results in a longer runtime generating the prediction for the possible actions. The dependencies identified here can serve to optimize the computational efficiency and make XCS more suitable for embedded systems. AU - Brede, Mathis ID - 22483 TI - Implementation and Profiling of XCS in the Context of Embedded Systems ER - TY - GEN AB - Autonomous mobile robots are becoming increasingly more capable and widespread. Reliable Obstacle avoidance is an integral part of autonomous navigation. This involves real time interpretation and processing of a complex environment. Strict time and energy constraints of a mobile autonomous system make efficient computation extremely desirable. The benefits of employing Hardware/Software co-designed applications are obvious and significant. Hardware accelerators are used for efficient processing of the algorithms by exploiting parallelism. FPGAs are a class of hardware accelerators, which can contain hundreds of small execution units, and can be used for Hardware/Software co-designed application. However, there is a reluctance when it comes to adoption of these devices in well established application domains, such as Robotics, due to a steep learning curve needed for FPGA application design. ReconROS has successfully bridged the gap between robotic and FPGA application development, by providing an intuitive, common development platform for robotic application development for FPGA. It does so by integrating Robotics Operating System(ROS) which is an industry and academia standard for robotics application development, with ReconOS, an operating system for re-configurable hardware. In this thesis an obstacle avoidance system is designed and implemented for an autonomous vehicle using ReconROS. The objectives of the thesis is to demonstrate and explore ReconROS integration within the ROS ecosystem and explore the design process within ReconROS framework, and to demonstrate the effectiveness of Hardware Acceleration in Robotics, by analysing the resulting architectures for Latency and Power Consumption. AU - Sheikh, Muhammad Aamir ID - 29540 TI - Design and Implementation of a ReconROS-based Obstacle Avoidance System ER - TY - GEN AU - Chandrakar, Khushboo ID - 21324 TI - Comparison of Feature Selection Techniques to Improve Approximate Circuit Synthesis ER - TY - GEN AB - Robots are becoming increasingly autonomous and more capable. Because of a limited portable energy budget by e.g. batteries, and more demanding algorithms, an efficient computation is of interest. Field Programmable Gate Arrays (FPGAs) for example can provide fast and efficient processing and the Robot Operating System (ROS) is a popular middleware used for robotic applications. The novel ReconROS combines version 2 of the Robot Operating System with ReconOS, a framework for integrating reconfigurable hardware. It provides a unified interface between software and hardware. ReconROS is evaluated in this thesis by implementing a Sobel filter as the video processing application, running on a Zynq-7000 series System on Chip. Timing measurements were taken of execution and transfer times and were compared to theoretical values. Designing the hardware implementation is done by C code using High Level Synthesis and with the interface and functionality provided by ReconROS. An important aspect is the publish/subscribe mechanism of ROS. The Operating System interface functions for publishing and subscribing are reasonably fast at below 10 ms for a 1 MB color VGA image. The main memory interface performs well at higher data sizes, crossing 100 MB/s at 20 kB and increasing to a maximum of around 150 MB/s. Furthermore, the hardware implementation introduces consistency to the execution times and performs twice as fast as the software implementation. AU - Henke, Luca-Sebastian ID - 21432 TI - Evaluation of a ReconOS-ROS Combination based on a Video Processing Application ER - TY - GEN AU - Thiele, Simon ID - 20820 TI - Implementing Machine Learning Functions as PYNQ FPGA Overlays ER - TY - GEN AU - Jaganath, Vivek ID - 20821 TI - Extension and Evaluation of Python-based High-Level Synthesis Tool Flows ER - TY - GEN AB - Modern machine learning (ML) techniques continue to move into the embedded system space because traditional centralized compute resources do not suit certain application domains, for example in mobile or real-time environments. Google’s TensorFlow Lite (TFLite) framework supports this shift from cloud to edge computing and makes ML inference accessible on resource-constrained devices. While it offers the possibility to partially delegate computation to hardware accelerators, there is no such “delegate” available to utilize the promising characteristics of reconfigurable hardware. This thesis incorporates modern platform FPGAs into TFLite by implementing a modular delegate framework, which allows accelerators within the programmable logic to take over the execution of neural network layers. To facilitate the necessary hardware/software codesign, the FPGA delegate is based on the operating system for reconfigurable computing (ReconOS), whose partial reconfiguration support enables the instantiation of model-tailored accelerator architectures. In the hardware back-end, a streaming-based prototype accelerator for the MobileNet model family showcases the working order of the platform, but falls short of the desired performance. Thus, it indicates the need for further exploration of alternative accelerator designs, which the delegate could automatically synthesize to meet a model’s demands. AU - Jentzsch, Felix P. ID - 21433 TI - Design and Implementation of a ReconOS-based TensorFlow Lite Delegate Architecture ER - TY - GEN AB - Secure hardware design is the most important aspect to be considered in addition to functional correctness. Achieving hardware security in today’s globalized Integrated Cir- cuit(IC) supply chain is a challenging task. One solution that is widely considered to help achieve secure hardware designs is Information Flow Tracking(IFT). It provides an ap- proach to verify that the systems adhere to security properties either by static verification during design phase or dynamic checking during runtime. Proof-Carrying Hardware(PCH) is an approach to verify a functional design prior to using it in hardware. It is a two-party verification approach, where the target party, the consumer requests new functionalities with pre-defined properties to the producer. In response, the producer designs the IP (Intellectual Property) cores with the requested functionalities that adhere to the consumer-defined properties. The producer provides the IP cores and a proof certificate combined into a proof-carrying bitstream to the consumer to verify it. If the verification is successful, the consumer can use the IP cores in his hardware. In essence, the consumer can only run verified IP cores. Correctly applied, PCH techniques can help consumers to defend against many unintentional modifications and malicious alterations of the modules they receive. There are numerous published examples of how to use PCH to detect any change in the functionality of a circuit, i.e., pairing a PCH approach with functional equivalence checking for combinational or sequential circuits. For non-functional properties, since opening new covert channels to leak secret information from secure circuits is a viable attack vector for hardware trojans, i.e., intentionally added malicious circuitry, IFT technique is employed to make sure that secret/untrusted information never reaches any unclassified/trusted outputs. This master thesis aims to explore the possibility of adapting Information Flow Tracking into a Proof-Carrying Hardware scenario. It aims to create a method that combines Infor- mation Flow Tracking(IFT) with a PCH approach at bitstream level enabling consumers to validate the trustworthiness of a module’s information flow without the computational costs of a complete flow analysis. AU - Keerthipati, Monica ID - 15920 TI - A Bitstream-Level Proof-Carrying Hardware Technique for Information Flow Tracking ER - TY - GEN AU - Sabu, Nithin S. ID - 14831 TI - FPGA Acceleration of String Search Techniques in Huge Data Sets ER - TY - GEN AU - Hansmeier, Tim ID - 14546 TI - Autonomous Operation of High-Performance Compute Nodes through Self-Awareness and Learning Classifiers ER - TY - GEN AU - Lienen, Christian ID - 15874 TI - Implementing a Real-time System on a Platform FPGA operated with ReconOS ER - TY - GEN AU - Schnuer, Jan-Philip ID - 3365 TI - Static Scheduling Algorithms for Heterogeneous Compute Nodes ER - TY - GEN AU - Croce, Marcel ID - 3366 TI - Evaluation of OpenCL-based Compilation for FPGAs ER - TY - THES AB - Traditional cache design uses a consolidated block of memory address bits to index a cache set, equivalent to the use of modulo functions. While this module-based mapping scheme is widely used in contemporary cache structures due to the simplicity of its hardware design and its good performance for sequences of consecutive addresses, its use may not be satisfactory for a variety of application domains having different characteristics.This thesis presents a new type of cache mapping scheme, motivated by programmable capabilities combined with Nature-inspired optimization of reconfigurable hardware. This research has focussed on an FPGA-based evolvable cache structure of the first level cache in a multi-core processor architecture, able to dynamically change cache indexing. To solve the challenge of reconfigurable cache mappings, a programmable Boolean circuit based on a combination of Look-up Table (LUT) memory elements is proposed. Focusing on optimization aspects at the system level, a Performance Measurement Infrastructure is introduced that is able to monitor the underlying microarchitectural metrics, and an adaptive evaluation strategy is presented that leverages on Evolutionary Algorithms, that is not only capable of evolving application-specific address-to-cache-index mappings for level one split caches but also of reducing optimization times. Putting this all together and prototyping in an FPGA for a LEON3/Linux-based multi-core processor, the creation of a system architecture reduces cache misses and improves performance over the use of conventional caches. AU - Ho, Nam ID - 3720 TI - FPGA-based Reconfigurable Cache Mapping Schemes: Design and Optimization ER - TY - GEN AU - Hansmeier, Tim ID - 3580 TI - An FPGA Accelerator for Checking Resolution Proofs ER - TY - GEN AU - Witschen, Linus Matthias ID - 1157 TI - A Framework for the Synthesis of Approximate Circuits ER - TY - GEN AU - Knorr, Christoph ID - 74 TI - OpenCL-basierte Videoverarbeitung auf heterogenen Rechenknoten ER - TY - GEN AU - Knorr, Christoph ID - 3364 TI - Evaluation von Bildverarbeitungsalgorithmen in heterogenen Rechenknoten ER - TY - GEN AU - Koch, Benjamin ID - 10701 TI - Hardware Acceleration of Mechatronic Controllers on a Zynq Platform FPGA ER - TY - THES AB - Monte-Carlo Tree Search (MCTS) is a class of simulation-based search algorithms. It brought about great success in the past few years regarding the evaluation of deterministic two-player games such as the Asian board game Go. In this thesis, we present a parallelization of the most popular MCTS variant for large HPC compute clusters that efficiently shares a single game tree representation in a distributed memory environment and scales up to 128 compute nodes and 2048 cores. It is hereby one of the most powerful MCTS parallelizations to date. In order to measure the impact of our parallelization on the search quality and remain comparable to the most advanced MCTS implementations to date, we implemented it in a state-of-the-art Go engine Gomorra, making it competitive with the strongest Go programs in the world. We further present an empirical comparison of different Bayesian ranking systems when being used for predicting expert moves for the game of Go and introduce a novel technique for automated detection and analysis of evaluation uncertainties that show up during MCTS searches. AU - Schäfers, Lars ID - 10733 SN - 978-3-8325-3748-7 TI - Parallel Monte-Carlo Tree Search for HPC Systems and its Application to Computer Go ER - TY - GEN AU - Surmund, Sebastian ID - 10744 TI - Multithreaded Parallelization of Mechatronic Controllers on a Zynq Platform FPGA ER - TY - THES AB - Reconfigurable circuit devices have opened up a fundamentally new way of creating adaptable systems. Combined with artificial evolution, reconfigurable circuits allow an elegant adaptation approach to compensating for changes in the distribution of input data, computational resource errors, and variations in resource requirements. Referred to as ``Evolvable Hardware'' (EHW), this paradigm has yielded astonishing results for traditional engineering challenges and has discovered intriguing design principles, which have not yet been seen in conventional engineering. In this thesis, we present new and fundamental work on Evolvable Hardware motivated by the insight that Evolvable Hardware needs to compensate for events with different change rates. To solve the challenge of different adaptation speeds, we propose a unified adaptation approach based on multi-objective evolution, evolving and propagating candidate solutions that are diverse in objectives that may experience radical changes. Focusing on algorithmic aspects, we enable Cartesian Genetic Programming (CGP) model, which we are using to encode Boolean circuits, for multi-objective optimization by introducing a meaningful recombination operator. We improve the scalability of CGP by objectives scaling, periodization of local- and global-search algorithms, and the automatic acquisition and reuse of subfunctions using age- and cone-based techniques. We validate our methods on the applications of adaptation of hardware classifiers to resource changes, recognition of muscular signals for prosthesis control and optimization of processor caches. AU - Kaufmann, Paul ID - 11619 SN - 978-3-8325-3530-8 TI - Adapting Hardware Systems by Means of Multi-Objective Evolution ER - TY - THES AB - Handling run-time dynamics on embedded system-on-chip architectures has become more challenging over the years. On the one hand, the impact of workload and physical dynamics on the system behavior has dramatically increased. On the other hand, embedded architectures have become more complex as they have evolved from single-processor systems over multi-processor systems to hybrid multi-core platforms.Static design-time techniques no longer provide suitable solutions to deal with the run-time dynamics of today's embedded systems. Therefore, system designers have to apply run-time solutions, which have hardly been investigated for hybrid multi-core platforms.In this thesis, we present fundamental work in the new area of run-time management on hybrid multi-core platforms. We propose a novel architecture, a self-adaptive hybrid multi-core system, that combines heterogeneous processors, reconfigurable hardware cores, and monitoring cores on a single chip. Using self-adaptation on thread-level, our hybrid multi-core systems can effectively perform performance and thermal management autonomously at run-time. AU - Happe, Markus ID - 501 SN - 978-3-8325-3425-7 TI - Performance and thermal management on self-adaptive hybrid multi-cores ER - TY - THES AB - FPGAs, systems on chip and embedded systems are nowadays irreplaceable. They combine the computational power of application specific hardware with software-like flexibility. At runtime, they can adjust their functionality by downloading new hardware modules and integrating their functionality. Due to their growing capabilities, the demands made to reconfigurable hardware grow. Their deployment in increasingly security critical scenarios requires new ways of enforcing security since a failure in security has severe consequences. Aside from financial losses, a loss of human life and risks to national security are possible. With this work I present the novel and groundbreaking concept of proof-carrying hardware. It is a method for the verification of properties of hardware modules to guarantee security for a target platform at runtime. The producer of a hardware module delivers based on the consumer's safety policy a safety proof in combination with the reconfiguration bitstream. The extensive computation of a proof is a contrast to the comparatively undemanding checking of the proof. I present a prototype based on open-source tools and an abstract FPGA architecture and bitstream format. The proof of the usability of proof-carrying hardware provides the evaluation of the prototype with the exemplary application of securing combinational and bounded sequential equivalence of reference monitor modules for memory safety. AU - Drzevitzky, Stephanie ID - 586 TI - Proof-Carrying Hardware: A Novel Approach to Reconfigurable Hardware Security ER - TY - THES AB - The paradigm shift towards many-core parallelism is accompanied by two fundamental questions: how should the many processors on a single die communicate to each other and what are suitable programming models for these novel architectures? In this thesis, the author tackles both questions by reviewing the reconfigurable mesh model of massively parallel computation for many-cores. The book presents the design, implementation and evaluation of a many-core architecture that is based on the execution principles and communication infrastructure of the reconfigurable mesh. This work fundamentally rests on FPGA implementations and shows that reconfigurable mesh processors with hundreds of autonomous cores are feasible. Several case studies demonstrate the effectiveness of programming and illustrate why the reconfigurable mesh is a promising model for many-cores. AU - Giefers, Heiner ID - 10652 SN - 978-3-8325-3165-2 TI - Design and Programming of Reconfigurable Mesh based Many-Cores ER -