TY - CONF AB - In the next decades, hybrid multi-cores will be the predominant architecture for reconfigurable FPGA-based systems. Temperature-aware thread mapping strategies are key for providing dependability in such systems. These strategies rely on measuring the temperature distribution and redicting the thermal behavior of the system when there are changes to the hardware and software running on the FPGA. While there are a number of tools that use thermal models to predict temperature distributions at design time, these tools lack the flexibility to autonomously adjust to changing FPGA configurations. To address this problem we propose a temperature-aware system that empowers FPGA-based reconfigurable multi-cores to autonomously predict the on-chip temperature distribution for pro-active thread remapping. Our system obtains temperature measurements through a self-calibrating grid of sensors and uses area constrained heat-generating circuits in order to generate spatial and temporal temperature gradients. The generated temperature variations are then used to learn the free parameters of the system's thermal model. The system thus acquires an understanding of its own thermal characteristics. We implemented an FPGA system containing a net of 144 temperature sensors on a Xilinx Virtex-6 LX240T FPGA that is aware of its thermal model. Finally, we show that the temperature predictions vary less than 0.72 degree C on average compared to the measured temperature distributions at run-time. AU - Happe, Markus AU - Agne, Andreas AU - Plessl, Christian ID - 656 T2 - Proceedings of the 2011 International Conference on Reconfigurable Computing and FPGAs (ReConFig) TI - Measuring and Predicting Temperature Distributions on FPGAs at Run-Time ER - TY - CONF AU - Kenter, Tobias AU - Platzner, Marco AU - Plessl, Christian AU - Kauschke, Michael ID - 2200 KW - design space exploration KW - LLVM KW - partitioning KW - performance KW - estimation KW - funding-intel SN - 978-1-4503-0554-9 T2 - Proc. Int. Symp. on Field-Programmable Gate Arrays (FPGA) TI - Performance Estimation Framework for Automated Exploration of CPU-Accelerator Architectures ER - TY - JOUR AU - Schumacher, Tobias AU - Süß, Tim AU - Plessl, Christian AU - Platzner, Marco ID - 2201 JF - Int. Journal of Recon- figurable Computing (IJRC) KW - funding-altera TI - FPGA Acceleration of Communication-bound Streaming Applications: Architecture Modeling and a 3D Image Compositing Case Study ER - TY - CONF AU - Grad, Mariusz AU - Plessl, Christian ID - 2198 T2 - Proc. Reconfigurable Architectures Workshop (RAW) TI - Just-in-time Instruction Set Extension – Feasibility and Limitations for an FPGA-based Reconfigurable ASIP Architecture ER - TY - CONF AU - Lübbers, Enno AU - Platzner, Marco AU - Plessl, Christian AU - Keller, Ariane AU - Plattner, Bernhard ID - 2223 SN - 1-60132-140-6 T2 - Proc. Int. Conf. on Engineering of Reconfigurable Systems and Algorithms (ERSA) TI - Towards Adaptive Networking for Embedded Devices based on Reconfigurable Hardware ER - TY - CONF AU - Grad, Mariusz AU - Plessl, Christian ID - 2216 T2 - Proc. Int. Conf. on ReConFigurable Computing and FPGAs (ReConFig) TI - Pruning the Design Space for Just-In-Time Processor Customization ER - TY - CONF AU - Grad, Mariusz AU - Plessl, Christian ID - 2224 SN - 1-60132-140-6 T2 - Proc. Int. Conf. on Engineering of Reconfigurable Systems and Algorithms (ERSA) TI - An Open Source Circuit Library with Benchmarking Facilities ER - TY - CONF AU - Andrews, David AU - Plessl, Christian ID - 2220 SN - 1-60132-140-6 T2 - Proc. Int. Conf. on Engineering of Reconfigurable Systems and Algorithms (ERSA) TI - Configurable Processor Architectures: History and Trends ER - TY - GEN ED - Plaks, Toomas P. ED - Andrews, David ED - DeMara, Ronald ED - Lam, Herman ED - Lee, Jooheung ED - Plessl, Christian ED - Stitt, Greg ID - 2222 SN - 1-60132-140-6 TI - Proc. Int. Conf. on Engineering of Reconfigurable Systems and Algorithms (ERSA) ER - TY - CONF AU - Beisel, Tobias AU - Niekamp, Manuel AU - Plessl, Christian ID - 2226 SN - 978-1-4244-6965-9 T2 - Proc. Int. Conf. on Application-Specific Systems, Architectures, and Processors (ASAP) TI - Using Shared Library Interposing for Transparent Acceleration in Systems with Heterogeneous Hardware Accelerators ER - TY - CONF AU - Keller, Ariane AU - Plattner, Bernhard AU - Lübbers, Enno AU - Platzner, Marco AU - Plessl, Christian ID - 2206 SN - 978-1-4244-8864-3 T2 - Proc. IEEE Globecom Workshop on Network of the Future (FutureNet) TI - Reconfigurable Nodes for Future Networks ER - TY - CONF AU - Woehrle, Matthias AU - Plessl, Christian AU - Thiele, Lothar ID - 2227 SN - 978-1-4244-7911-5 T2 - Proc. Int. Conf. Networked Sensing Systems (INSS) TI - Rupeas: Ruby Powered Event Analysis DSL ER - TY - CONF AU - Kenter, Tobias AU - Platzner, Marco AU - Plessl, Christian AU - Kauschke, Michael ED - Hammami, Omar ED - Larrabee, Sandra ID - 2228 T2 - Proc. Workshop on Architectural Research Prototyping (WARP), International Symposium on Computer Architecture (ISCA) TI - Performance Estimation for the Exploration of CPU-Accelerator Architectures ER - TY - GEN AB - Wireless Sensor Networks (WSNs) are unique embedded computation systems for distributed sensing of a dispersed phenomenon. While being a strongly concurrent distributed system, its embedded aspects with severe resource limitations and the wireless communication requires a fusion of technologies and methodologies from very different fields. As WSNs are deployed in remote locations for long-term unattended operation, assurance of correct functioning of the system is of prime concern. Thus, the design and development of WSNs requires specialized tools to allow for testing and debugging the system. To this end, we present a framework for analyzing and checking WSNs based on collected events during system operation. It allows for abstracting from the event trace by means of behavioral queries and uses assertions for checking the accordance of an execution to its specification. The framework is independent from WSN test platforms, applications and logging semantics and thus generally applicable for analyzing event logs of WSN test executions. AU - Woehrle, Matthias AU - Plessl, Christian AU - Thiele, Lothar ID - 2353 KW - Rupeas KW - DSL KW - WSN KW - testing TI - Rupeas: Ruby Powered Event Analysis DSL ER - TY - CONF AB - Mapping applications that consist of a collection of cores to FPGA accelerators and optimizing their performance is a challenging task in high performance reconfigurable computing. We present IMORC, an architectural template and highly versatile on-chip interconnect. IMORC links provide asynchronous FIFOs and bitwidth conversion which allows for flexibly composing accelerators from cores running at full speed within their own clock domains, thus facilitating the re-use of cores and portability. Further, IMORC inserts performance counters for monitoring runtime data. In this paper, we first introduce the IMORC architectural template and the on-chip interconnect, and then demonstrate IMORC on the example of accelerating the k-th nearest neighbor thinning problem on an XD1000 reconfigurable computing system. Using IMORC's monitoring infrastructure, we gain insights into the data-dependent behavior of the application which, in turn, allow for optimizing the accelerator. AU - Schumacher, Tobias AU - Plessl, Christian AU - Platzner, Marco ID - 2350 KW - IMORC KW - interconnect KW - performance SN - 978-1-4244-4450-2 T2 - Proc. Int. Symp. on Field-Programmable Custom Computing Machines (FCCM) TI - IMORC: Application Mapping, Monitoring and Optimization for High-Performance Reconfigurable Computing ER - TY - CONF AB - In this work we present EvoCache, a novel approach for implementing application-specific caches. The key innovation of EvoCache is to make the function that maps memory addresses from the CPU address space to cache indices programmable. We support arbitrary Boolean mapping functions that are implemented within a small reconfigurable logic fabric. For finding suitable cache mapping functions we rely on techniques from the evolvable hardware domain and utilize an evolutionary optimization procedure. We evaluate the use of EvoCache in an embedded processor for two specific applications (JPEG and BZIP2 compression) with respect to execution time, cache miss rate and energy consumption. We show that the evolvable hardware approach for optimizing the cache functions not only significantly improves the cache performance for the training data used during optimization, but that the evolved mapping functions generalize very well. Compared to a conventional cache architecture, EvoCache applied to test data achieves a reduction in execution time of up to 14.31% for JPEG (10.98% for BZIP2), and in energy consumption by 16.43% for JPEG (10.70% for BZIP2). We also discuss the integration of EvoCache into the operating system and show that the area and delay overheads introduced by EvoCache are acceptable. AU - Kaufmann, Paul AU - Plessl, Christian AU - Platzner, Marco ID - 2262 KW - EvoCache KW - evolvable hardware KW - computer architecture T2 - Proc. NASA/ESA Conference on Adaptive Hardware and Systems (AHS) TI - EvoCaches: Application-specific Adaptation of Cache Mapping ER - TY - CONF AU - Beutel, Jan AU - Gruber, Stephan AU - Hasler, Andi AU - Lim, Roman AU - Meier, Andreas AU - Plessl, Christian AU - Talzi, Igor AU - Thiele, Lothar AU - Tschudin, Christian AU - Woehrle, Matthias AU - Yuecel, Mustafa ID - 2352 KW - WSN KW - PermaSense SN - 978-1-4244-5108-1 T2 - Proc. Int. Conf. on Information Processing in Sensor Networks (IPSN) TI - PermaDAQ: A Scientific Instrument for Precision Sensing and Data Recovery in Environmental Extremes ER - TY - CONF AU - Schumacher, Tobias AU - Süß, Tim AU - Plessl, Christian AU - Platzner, Marco ID - 2238 KW - IMORC KW - graphics SN - 978-0-7695-3917-1 T2 - Proc. Int. Conf. on ReConFigurable Computing and FPGAs (ReConFig) TI - Communication Performance Characterization for Reconfigurable Accelerator Design on the XD1000 ER - TY - CONF AU - Schumacher, Tobias AU - Plessl, Christian AU - Platzner, Marco ID - 2261 KW - IMORC KW - NOC KW - KNN KW - accelerator SN - 1946-1488 T2 - Proc. Int. Conf. on Field Programmable Logic and Applications (FPL) TI - An Accelerator for k-th Nearest Neighbor Thinning Based on the IMORC Infrastructure ER - TY - CONF AB - In this paper, we introduce the Woolcano reconfigurable processor architecture. The architecture is based on the Xilinx Virtex-4 FX FPGA and leverages the Auxiliary Processing Unit (APU) as well as the partial reconfiguration capabilities to provide dynamically reconfigurable custom instructions. We also present a hardware tool flow that automatically translates software functions into custom instructions and a software tool flow that creates binaries using these instructions. While previous research on processors with reconfigurable functional units has been performed predominantly with simulation, the Woolcano architecture allows for exploring dynamic instruction set extension with commercially available hardware. Finally, we present a case study demonstrating a custom floating-point instruction generated with our approach, which achieves a 40x speedup over software-emulated floating-point operations and a 21% speedup over the Xilinx hardware floating-point unit. AU - Grad, Mariusz AU - Plessl, Christian ID - 2263 SN - 1-60132-101-5 T2 - Proc. Int. Conf. on Engineering of Reconfigurable Systems and Algorithms (ERSA) TI - Woolcano: An Architecture and Tool Flow for Dynamic Instruction Set Extension on Xilinx Virtex-4 FX ER - TY - CONF AU - Woehrle, Matthias AU - Plessl, Christian AU - Lim, Roman AU - Beutel, Jan AU - Thiele, Lothar ID - 2370 KW - WSN KW - testing KW - verification SN - 978-0-7695-3158-8 T2 - IEEE Int. Conf. on Sensor Networks, Ubiquitous, and Trustworthy Computing (SUTC) TI - EvAnT: Analysis and Checking of event traces for Wireless Sensor Networks ER - TY - CONF AU - Schumacher, Tobias AU - Meiche, Robert AU - Kaufmann, Paul AU - Lübbers, Enno AU - Plessl, Christian AU - Platzner, Marco ID - 2364 SN - 1-60132-064-7 T2 - Proc. Int. Conf. on Engineering of Reconfigurable Systems and Algorithms (ERSA) TI - A Hardware Accelerator for k-th Nearest Neighbor Thinning ER - TY - CONF AU - Schumacher, Tobias AU - Plessl, Christian AU - Platzner, Marco ID - 2372 KW - IMORC KW - IP core KW - interconnect T2 - Many-core and Reconfigurable Supercomputing Conference (MRSC) TI - IMORC: An infrastructure for performance monitoring and optimization of reconfigurable computers ER - TY - GEN AU - Beutel, Jan AU - Plessl, Christian AU - Woehrle, Matthias ID - 2394 TI - Increasing the Reliability of Wireless Sensor Networks with a Unit Testing Framework ER - TY - CONF AU - Woehrle, Matthias AU - Plessl, Christian AU - Beutel, Jan AU - Thiele, Lothar ID - 2392 KW - WSN KW - testing KW - distributed KW - embedded SN - 978-1-59593-694-3 T2 - Proc. Workshop on Embedded Networked Sensors (EmNets) TI - Increasing the Reliability of Wireless Sensor Networks with a Distributed Testing Framework ER - TY - CONF AU - Beutel, Jan AU - Dyer, Matthias AU - Lim, Roman AU - Plessl, Christian AU - Woehrle, Matthias AU - Yuecel, Mustafa AU - Thiele, Lothar ID - 2393 KW - WSN KW - testing KW - verification SN - 1-4244-1231-5 T2 - Proc. Int. Conf. Networked Sensing Systems (INSS) TI - Automated Wireless Sensor Network Testing ER - TY - THES AB - In this thesis, we propose to use a reconfigurable processor as main computation element in embedded systems for applications from the multi-media and communications domain. A reconfigurable processor integrates an embedded CPU core with a Reconfigurable Processing Unit (RPU). Many of our target applications require real-time signal-processing of data streams and expose a high computational demand. The key challenge in designing embedded systems for these applications is to find an implementation that satisfies the performance goals and is adaptable to new applications, while the system cost is minimized. Implementations that solely use an embedded CPU are likely to miss the performance goals. Application-Specific Integrated Circuit (ASIC)-based coprocessors can be used for some high-volume products with fixed functions, but fall short for systems with varying applications. We argue that a reconfigurable processor with a coarse-grained, dynamically reconfigurable array of modest size provides an attractive implementation platform for our application domain. The computational intensive application kernels are executed on the RPU, while the remaining parts of the application are executed on the CPU. Reconfigurable hardware allows for implementing application specific coprocessors with a high performance, while the function of the coprocessor can still be adapted due to the programmability. So far, reconfigurable technology is used in embedded systems primarily with static configurations, e.g., for implementing glue-logic, replacing ASICs, and for implementing fixed-function coprocessors. Changing the configuration at runtime enables a number of interesting application modes, e.g., on-demand loading of coprocessors and time-multiplexed execution of coprocessors, which is commonly denoted as hardware virtualization. While the use of static configurations is well understood and supported by design-tools, the role of dynamic reconfiguration is not well investigated yet. Current application specification methods and design-tools do not provide an end-to-end tool-flow that considers dynamic reconfiguration. A key idea of our approach is to reduce system cost by keeping the size of the reconfigurable array small and to use hardware virtualization techniques to compensate for the limited hardware resources. The main contribution of this thesis is the codesign of a reconfigurable processor architecture named ZIPPY, the corresponding hardware and software implementation tools, and an application specification model which explicitly considers hardware virtualization. The ZIPPY architecture is widely parametrized and allows for specifying a whole family of processor architectures. The implementation tools are also parametrized and can target any architectural variant. We evaluate the performance of the architecture with a system-level, cycle-accurate cosimulation framework. This framework enables us to perform design-space exploration for a variety of reconfigurable processor architectures. With two case studies, we demonstrate, that hardware virtualization on the Zippy architecture is feasible and enables us to trade-off performance for area in embedded systems. Finally, we present a novel method for optimal temporal partitioning of sequential circuits, which is an important form of hardware virtualization. The method based on Slowdown and Retiming allows us to decompose any sequential circuit into a number of smaller, communicating subcircuits that can be executed on a dynamically reconfigurable architecture. AU - Plessl, Christian ID - 2404 KW - Zippy SN - 978-3-8322-5561-3 TI - Hardware virtualization on a coarse-grained reconfigurable processor ER - TY - CONF AB - This paper presents a novel method for optimal temporal partitioning of sequential circuits for time-multiplexed reconfigurable architectures. The method bases on slowdown and retiming and maximizes the circuit's performance during execution while restricting the size of the partitions to respect the resource constraints of the reconfigurable architecture. We provide a mixed integer linear program (MILP) formulation of the problem, which can be solved exactly. In contrast to related work, our approach optimizes performance directly, takes structural modifications of the circuit into account, and is extensible. We present the application of the new method to temporal partitioning for a coarse-grained reconfigurable architecture. AU - Plessl, Christian AU - Platzner, Marco AU - Thiele, Lothar ID - 2401 KW - temporal partitioning KW - retiming KW - ILP T2 - Proc. Int. Conf. on Field Programmable Technology (ICFPT) TI - Optimal Temporal Partitioning based on Slowdown and Retiming ER - TY - CONF AB - This paper motivates the use of hardware virtualization on coarse-grained reconfigurable architectures. We introduce Zippy, a coarse-grained multi-context hybrid CPU with architectural support for efficient hardware virtualization. The architectural details and the corresponding tool flow are outlined. As a case study, we compare the non-virtualized and the virtualized execution of an ADPCM decoder. AU - Plessl, Christian AU - Platzner, Marco ID - 2411 KW - Zippy T2 - Proc. Int. Conf. on Application-Specific Systems, Architectures, and Processors (ASAP) TI - Zippy – A coarse-grained reconfigurable array with support for hardware virtualization ER - TY - JOUR AB - Reconfigurable architectures that tightly integrate a standard CPU core with a field-programmable hardware structure have recently been receiving impact of these design decisions on the overall system performance is a challenging task. In this paper, we first present a framework for the cycle-accurate performance evaluation of hybrid reconfigurable processors on the system level. Then, we discuss a reconfigurable processor for data-streaming applications, which attaches a coarse-grained reconfigurable unit to the coprocessor interface of a standard embedded CPU core. By means of a case study we evaluate the system-level impact of certain design features for the reconfigurable unit, such as multiple contexts, register replication, and hardware context scheduling. The results illustrate that a system-level evaluation framework is of paramount importance for studying the architectural trade-offs and optimizing design parameters for reconfigurable processors. AU - Enzler, Rolf AU - Plessl, Christian AU - Platzner, Marco ID - 2412 IS - 2-3 JF - Microprocessors and Microsystems KW - FPGA KW - reconfigurable computing KW - co-simulation KW - Zippy TI - System-level performance evaluation of reconfigurable processors VL - 29 ER - TY - CONF AB - In this paper we introduce to virtualization of hardware on reconfigurable devices. We identify three main approaches denoted with temporal partitioning, virtualized execution, and virtual machine. For each virtualization approach, we discuss the application models, the required execution architectures, the design tools and the run-time systems. Then, we survey a selection of important projects in the field. AU - Plessl, Christian AU - Platzner, Marco ID - 2415 KW - hardware virtualization T2 - Proc. Int. Conf. on Engineering of Reconfigurable Systems and Algorithms (ERSA) TI - Virtualization of Hardware – Introduction and Survey ER - TY - CONF AB - This paper presents TKDM, a PC-based high-performance reconfigurable computing environment. The TKDM hardware consists of an FPGA module that uses the DIMM (dual inline memory module) bus for high-bandwidth and low-latency communication with the host CPU. The system's firmware is integrated with the Linux host operating system and offers functions for data communication and FPGA reconfiguration. The intended use of TKDM is that of a dynamically reconfigurable co-processor for data streaming applications. The system's firmware can be customized for specific application domains to facilitate simple and easy-to-use programming interfaces. AU - Plessl, Christian AU - Platzner, Marco ID - 2418 KW - coprocessor KW - DIMM KW - memory bus KW - FPGA KW - high performance computing T2 - Proc. Int. Conf. on Field Programmable Technology (ICFPT) TI - TKDM – A Reconfigurable Co-processor in a PC's Memory Slot ER - TY - JOUR AB - Wearable computers are embedded into the mobile environment of their users. A design challenge for wearable systems is to combine the high performance required for tasks such as video decoding with the low energy consumption required to maximise battery runtimes and the flexibility demanded by the dynamics of the environment and the applications. In this paper, we demonstrate that reconfigurable hardware technology is able to answer this challenge. We present the concept and the prototype implementation of an autonomous wearable unit with reconfigurable modules (WURM). We discuss experiments that show the uses of reconfigurable hardware in WURM: ASICs-on-demand and adaptive interfaces. Finally, we present an experiment with an operating system layer for WURM. AU - Plessl, Christian AU - Enzler, Rolf AU - Walder, Herbert AU - Beutel, Jan AU - Platzner, Marco AU - Thiele, Lothar AU - Tröster, Gerhard ID - 2419 IS - 5 JF - Personal and Ubiquitous Computing TI - The Case for Reconfigurable Hardware in Wearable Computing VL - 7 ER - TY - JOUR AB - This paper presents the acceleration of minimum-cost covering problems by instance-specific hardware. First, we formulate the minimum-cost covering problem and discuss a branch \& bound algorithm to solve it. Then we describe instance-specific hardware architectures that implement branch \& bound in 3-valued logic and use reduction techniques similar to those found in software solvers. We further present prototypical accelerator implementations and a corresponding design tool flow. Our experiments reveal significant raw speedups up to five orders of magnitude for a set of smaller unate covering problems. Provided that hardware compilation times can be reduced, we conclude that instance-specific acceleration of hard minimum-cost covering problems will lead to substantial overall speedups. AU - Plessl, Christian AU - Platzner, Marco ID - 2420 IS - 2 JF - Journal of Supercomputing KW - reconfigurable computing KW - instance-specific acceleration KW - minimum covering SN - 0920-8542 TI - Instance-Specific Accelerators for Minimum Covering VL - 26 ER - TY - CONF AB - In contrast to processors, current reconfigurable devices totally lack programming models that would allow for device independent compilation and forward compatibility. The key to overcome this limitation is hardware virtualization. In this paper, we resort to a macro-pipelined execution model to achieve hardware virtualization for data streaming applications. As a hardware implementation we present a hybrid multi-context architecture that attaches a coarse-grained reconfigurable array to a host CPU. A co-simulation framework enables cycle-accurate simulation of the complete architecture. As a case study we map an FIR filter to our virtualized hardware model and evaluate different designs. We discuss the impact of the number of contexts and the feature of context state on the speedup and the CPU load. AU - Enzler, Rolf AU - Plessl, Christian AU - Platzner, Marco ID - 2421 KW - Zippy KW - multi-context KW - FPGA T2 - Proc. Int. Conf. on Field Programmable Logic and Applications (FPL) TI - Virtualizing Hardware with Multi-Context Reconfigurable Arrays VL - 2778 ER - TY - CONF AB - Reconfigurable computing architectures aim to dynamically adapt their hardware to the application at hand. As research shows, the time it takes to reconfigure the hardware forms an overhead that can significantly impair the benefits of hardware customization. Multi-context devices are one promising approach to overcome the limitations posed by long reconfiguration times. In contrast to more traditional reconfigurable architectures, multi-context devices hold several configurations on-chip. On demand, the device can quickly switch to another context. In this paper we present a co-simulation environment to investigate design trade-offs for hybrid multi-context architectures. Our architectural model comprises a reconfigurable unit closely coupled to a CPU core. As a case study, we discuss the implementation of a FIR filter partitioned into several contexts. We outline the mapping process and present simulation results for single- and multi-context reconfigurable units coupled with both embedded and high-end CPUs. AU - Enzler, Rolf AU - Plessl, Christian AU - Platzner, Marco ID - 2422 KW - Zippy KW - co-simulation SN - 1-932415-05-X T2 - Proc. Int. Conf. on Engineering of Reconfigurable Systems and Algorithms (ERSA) TI - Co-simulation of a Hybrid Multi-Context Architecture ER - TY - CONF AB - Wearable computers are embedded into the mobile environment of the human body. A design challenge for wearable systems is to combine the high performance required for tasks such as video decoding with low energy consumption required to maximize battery runtimes and the flexibility demanded by the dynamics of the environment and the applications. In this paper, we demonstrate that reconfigurable hardware technology is able to answer this challenge. We present the concept and the prototype implementation of an autonomous wearable unit with reconfigurable modules (WURM). We discuss two experiments that show the uses of reconfigurable hardware in WURM: ASICs-on-demand and adaptive interfaces. Finally, we develop and evaluate task placement techniques used in the operating system layer of WURM. AU - Plessl, Christian AU - Enzler, Rolf AU - Walder, Herbert AU - Beutel, Jan AU - Platzner, Marco AU - Thiele, Lothar ID - 2423 KW - wearable computing SN - 0-7695-1816-8 T2 - Proc. Int. Symp. on Wearable Computers (ISWC) TI - Reconfigurable Hardware in Wearable Computing Nodes ER - TY - CONF AB - Recent generations of high-density and high-speed FPGAs provide a sufficient capacity for implementing complete configurable systems on a chip (CSoCs). Hybrid CPUs that combine standard CPU cores with reconfigurable coprocessors are an important subclass of CSoCs. With partially reconfigurable FPGAs, coprocessors can be loaded on demand while the CPU remains running. However, the lack of high-level design tools for partial reconfiguration makes practical implementations a challenging task. In this paper, we introduce a design flow to implement hybrid processors on Xilinx Virtex. The design flow is based on two techniques, virtual sockets and feed-through components, and can efficiently generate partial configurations from industry-quality cores. We discuss the design flow and present a fully operational audio streaming prototype to demonstrate its feasibility. AU - Dyer, Matthias AU - Plessl, Christian AU - Platzner, Marco ID - 2424 KW - partial reconfiguration T2 - Proc. Int. Conf. on Field Programmable Logic and Applications (FPL) TI - Partially Reconfigurable Cores for Xilinx Virtex VL - 2438 ER - TY - CONF AB - We present instance-specific custom computing machines for the set covering problem. Four accelerator architectures are developed that implement branch \& bound in 3-valued logic and many of the deduction techniques found in software solvers. We use set covering benchmarks from two-level logic minimization and Steiner triple systems to derive and discuss experimental results. The resulting raw speedups are in the order of four magnitudes on average. Finally, we propose a hybrid solver architecture that combines the raw speed of instance-specific reconfigurable hardware with flexible bounding schemes implemented in software. AU - Plessl, Christian AU - Platzner, Marco ID - 2425 T2 - Proc. Int. Symp. on Field-Programmable Custom Computing Machines (FCCM) TI - Custom Computing Machines for the Set Covering Problem ER - TY - CONF AB - In this paper we present instance-specific accelerators for minimum-cost covering problems. We first define the covering problem and discuss a branch&bound algorithm to solve it. Then we describe an instance-specific hardware architecture that implements branch&bound in 3-valued logic and uses reduction techniques usually found in software solvers. Results for small unate covering problems reveal significant raw speedups. AU - Plessl, Christian AU - Platzner, Marco ID - 2428 KW - minimum covering KW - accelerator KW - funding-sundance T2 - Proc. Int. Conf. on Engineering of Reconfigurable Systems and Algorithms (ERSA) TI - Instance-Specific Accelerators for Minimum Covering ER - TY - JOUR AU - Plessl, Christian AU - Wilde, Erik ID - 2429 JF - iX TI - Server-Side-Techniken im Web – ein Überblick ER - TY - GEN AB - In this report the design and implementation of an instance-specific accelerator for solving minimum covering problems will be presented. After an introduction to configurable computing in general, the minimum covering problem is defined and a branch and bound algorithm to solve it in software is presented. The remainder of the report shows how this branch and bound algorithm can be adopted to hardware. Specifically it is stressed how the various sophisticated strategies for deducing conditions for variables used by software solvers can be adopted to hardware and how a system which uses 3-valued logic to solve this problem can be designed. In addition to these considerations focusing on the architecture of the system, some important details of the actual implementation are given. A prototype has been implemented for showing the feasibility of the concept and for gaining information about speed and size of the hardware implementation. Cycle-accurate simulations for a set of benchmark problems have been done for determining the performance of the accelerator. The speed of the resulting accelerators has been compared to the time a reference software solver (espresso) needs and the resulting speedups have been calculated. I have shown that a raw speedup of several orders of maginitude can be achieved for many problems; for some problems no speedup is achieved yet. After a discussion of the results, ideas for future work are presented. AU - Plessl, Christian ID - 2430 TI - Reconfigurable Accelerators for Minimum Covering ER - TY - CONF AB - In this paper, we present the analysis of applications from the domain of handheld and wearable computing. This analysis is the first step to derive and evaluate design parameters for dynamically reconfigurable processors. We discuss the selection of representative benchmarks for handhelds and wearables and group the applications into multimedia, communications, and cryptography programs. We simulate the applications on a cycle-accurate processor simulator and gather statistical data such as instruction mix, cache hit rates and memory requirements for an embedded processor model. A breakdown of the executed cycles into different functions identifies the most compute-intensive code sections - the kernels. Then, we analyze the applications and discuss parameters that strongly influence the design of dynamically reconfigurable processors. Finally, we outline the construction of a parameterizable simulation model for a reconfigurable unit that is attached to a processor core. AU - Enzler, Rolf AU - Platzner, Marco AU - Plessl, Christian AU - Thiele, Lothar AU - Tröster, Gerhard ID - 2432 KW - benchmark T2 - Reconfigurable Technology: FPGAs and Reconfigurable Processors for Computing and Communications III TI - Reconfigurable Processors for Handhelds and Wearables: Application Analysis VL - 4525 ER - TY - GEN AU - Plessl, Christian AU - Maurer, Simon ID - 2433 KW - co-design KW - speech processing TI - Hardware/Software Codesign in Speech Compression Applications ER -