TY - CONF AU - Djemame, Karim AU - Gourlay, Iain AU - Padgett, James AU - Birkenheuer, Georg AU - Hovestadt, Matthias AU - Kao, Odej AU - Voss, Kerstin ID - 2402 SN - 0-7695-2734-5 T2 - Proc. Int. Conf. on e-Science and Grid Computing TI - Introducing Risk Management into the Grid ER - TY - CONF AU - Hovestadt, Matthias AU - Kao, Odej AU - Voss, Kerstin ID - 2403 SN - 0-7695-2670-5 T2 - Proc. Int. Conf. on Services Computing (SCC) TI - The First Step of Introducing Risk Management for Prepossessing SLAs ER - TY - JOUR AU - Groppe, Sven AU - Böttcher, Stefan AU - Birkenheuer, Georg AU - Höing, André ID - 2405 IS - 1 JF - Data & Knowledge Engineering TI - Reformulating XPath queries and XSLT queries on XSLT views VL - 57 ER - TY - CONF AU - Voss, Kerstin ID - 2406 SN - 0-7695-2622-5 T2 - Proc. Int. Conf. on Networking and Services (ICNS) TI - Risk Aware Migrations for Prepossessing SLAs ER - TY - CONF AU - Lietsch, Stefan AU - Zabel, Henning AU - Berssenbruegge, Jan AU - Wittenberg, Veit AU - Eikermann, Martin ID - 2407 SN - 3-540-48628-3 T2 - Proc. Int. Symp. on Visual Computing (ISVC) TI - Light Simulation in a Distributed Driving Simulator VL - 4291 ER - TY - CONF AU - Lerch, Nicolas AU - Nitsche, Holger AU - Voss, Kerstin AU - Hovestadt, Matthias ID - 2408 T2 - Proc. Cracow Grid Workshop (CGW) TI - First Steps of a Monitoring Framework to Empower Risk Assessment on Grids ER - TY - CONF AU - Birkenheuer, Georg AU - Döhre, Sven AU - Hovestadt, Matthias AU - Kao, Odej AU - Voss, Kerstin ID - 2409 T2 - Proc. Cracow Grid Workshop (CGW) TI - On Similarities of Grid Resources for Identifying Potential Migration Targets ER - TY - CONF AU - Birkenheuer, Georg AU - Djemame, Karim AU - Gourlay, Iain AU - Kao, Odej AU - Padgett, James AU - Voß, Kerstin ID - 2410 T2 - Proc. 
WS-Agreement Workshop (Open Grid Forum 18) TI - Using WS-Agreement for Risk Management in the Grid ER - TY - CHAP AB - Grid Computing promises an efficient sharing of world-wide distributed resources, ranging from hardware, software, expert knowledge to special I/O devices. However, although the main Grid mechanisms are already developed or are currently addressed by tremendous research effort, the Grid environment still suffers from a low acceptance in different user communities. Beside difficulties regarding an intuitive and comfortable resource access, various problems related to the reliability and the Quality-of-Service while using the Grid exist. Users should be able to rely, that their jobs will have certain priority at the remote Grid site and that they will be finished upon the agreed time regardless of any provider problems. Therefore, QoS issues have to be considered in the Grid middleware but also in the local resource management systems at the Grid sites. However, most of the currently used resource management systems are not suitable for SLAs, as they do not support resource reservation and do not offer mechanisms for job checkpointing/migration respectively. The latter are mandatory for Grid providers as rescue anchor in case of system failures or system overload. This paper focuses on SLA-aware job migration and presents a work, which is being performed in the EU supported project HPC4U. AU - Heine, Felix AU - Hovestadt, Matthias AU - Kao, Odej AU - Keller, Axel ED - Grandinetti, Lucio ID - 1990 T2 - Grid Computing: New Frontiers of High Performance Computing TI - SLA-aware Job Migration in Grid Environments VL - 14 ER - TY - CONF AB - The next generation grid applications demand grid middleware for a flexible negotiation mechanism supporting various ways of quality-of-service (QoS) guarantees. 
In this context, a QoS guarantee covers simultaneous allocations of various kinds of different resources, such as processor runtime, storage capacity, or network bandwidth, which are specified in the form of service level agreements (SLA). Currently, a gap exists between the capabilities of grid middleware and the underlying resource management systems concerning their support for QoS and SLA negotiation. In this paper we present an approach which closes this gap. Introducing the architecture of the virtual resource manager, we highlight its main QoS management features like run-time responsibility, co-allocation, and fault tolerance. AU - Burchard, Lars-Olof AU - Heine, Felix AU - Hovestadt, Matthias AU - Kao, Odej AU - Keller, Axel AU - Linnert, Barry ID - 1992 T2 - Proc. IEEE Int. Parallel & Distributed Processing Symposium (IPDPS) TI - A Quality-of-Service Architecture for Future Grid Computing Applications. ER - TY - CONF AU - Lietsch, Stefan AU - Kao, Odej ID - 2413 T2 - Proc. Intelligence in Communication Systems (INTELLCOMM) TI - CoLoS - A System for Device Unaware and Position Dependent Communication Based on the Session Initiation Protocol VL - 190 ER - TY - CONF AU - Birkenheuer, Georg AU - Hagelweide, Wilke AU - Hagemeier, Björn AU - Japs, Viktor AU - Keller, Matthias AU - Mayr, Nikolas AU - Meyer, Jan AU - Schumacher, Tobias AU - Voß, Kerstin AU - Zajac, Markus ID - 2414 T2 - Proc. GI Informatiktage TI - PIRANHA – Hunter of Idle Resources VL - 2 ER - TY - CONF AU - Kao, Odej AU - Hovestadt, Matthias AU - Keller, Axel ID - 1993 T2 - Proc. 
Advanced Research Workshop on High Performance Computing: Technology and Applications TI - SLA-aware Job Migration in Grid Environments ER - TY - CONF AU - Burchard, Lars-Olof AU - Heiss, Hans-Ulrich AU - Hovestadt, Matthias AU - Kao, Odej AU - Keller, Axel AU - Linnert, Barry ID - 1994 T2 - Proceedings of the GI-Meeting on Operating Systems TI - An Architecture for SLA-aware Resource Management ER - TY - CONF AB - The next generation Grid will demand the Grid middleware to provide flexibility, transparency, and reliability. This implies the appliance of service level agreements to guarantee a negotiated level of quality of service. These requirements also affect the local resource management systems providing resources for the Grid. At this a gap between these demands and the features of today's resource management systems becomes apparent. In this paper we present an approach which closes this gap. Introducing the architecture of the virtual resource manager we highlight its main features of runtime responsibility, resource virtualization, information hiding, autonomy provision, and smooth integration of existing resource management system installations. AU - Burchard, Lars-Olof AU - Hovestadt, Matthias AU - Kao, Odej AU - Keller, Axel AU - Linnert, Barry ID - 1995 T2 - Proc. Int. Symposium on Cluster Computing and the Grid (CCGRID) TI - Virtual Resource Manager: An Architecture for SLA-aware Resource Management ER - TY - CONF AU - Groppe, Sven AU - Böttcher, Stefan AU - Birkenheuer, Georg ID - 2416 T2 - Proc. Int. Conf. on Enterprise Information Systems (ICEIS) TI - Efficient Querying of Transformed XML Documents ER - TY - CONF AU - Groppe, Sven AU - Böttcher, Stefan AU - Heckel, Reiko AU - Birkenheuer, Georg ID - 2417 T2 - Proc. East-European Conf. 
on Advances in Databases and Information Systems (ADBIS) TI - Using XSLT Stylesheets to Transform XPath Queries ER - TY - CONF AB - Nearly all existing HPC systems are operated by resource management systems based on the queuing approach. With the increasing acceptance of grid middleware like Globus, new requirements for the underlying local resource management systems arise. Features like advanced reservation or quality of service are needed to implement high level functions like co-allocation. However it is difficult to realize these features with a resource management system based on the queuing concept since it considers only the present resource usage. In this paper we present an approach which closes this gap. By assigning start times to each resource request, a complete schedule is planned. Advanced reservations are now easily possible. Based on this planning approach functions like diffuse requests, automatic duration extension, or service level agreements are described. We think they are useful to increase the usability, acceptance and performance of HPC machines. In the second part of this paper we present a planning based resource management system which already covers some of the mentioned features. AU - Hovestadt, Matthias AU - Kao, Odej AU - Keller, Axel AU - Streit, Achim ID - 1998 KW - High Performance Computing KW - Service Level Agreement KW - Grid Resource KW - Resource Management System KW - Advance Reservation T2 - Proc. Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP) TI - Scheduling in HPC Resource Management Systems: Queuing vs. Planning VL - 2862 ER - TY - CONF AU - Miller, Barton P. AU - Labarta, Jesús AU - Schintke, Florian AU - Simon, Jens ID - 2426 SN - 978-3-540-45706-0 T2 - Proc. European Conf. 
on Parallel Processing (Euro-Par) TI - Performance Evaluation, Analysis and Optimization VL - 2400 ER - TY - JOUR AB - Workstation clusters are often not only used for high-throughput computing in time-sharing mode but also for running complex parallel jobs in space-sharing mode. This poses several difficulties to the resource management system, which must be able to reserve computing resources for exclusive use and also to determine an optimal process mapping for a given system topology. On the basis of our CCS software, we describe the anatomy of a modern resource management system. Like Codine, Condor, and LSF, CCS provides mechanisms for the user-friendly system access and management of clusters. But unlike them, CCS is targeted at the effective support of space-sharing parallel computers and even metacomputers. Among other features, CCS provides a versatile resource description facility, topology-based process mapping, pluggable schedulers, and hooks to metacomputer management. AU - Keller, Axel AU - Reinefeld, Alexander ID - 1999 JF - Annual Review of Scalable Computing TI - Anatomy of a Resource Management System for HPC Clusters VL - 3 ER - TY - CONF AB - The Testbed and Applications working group of the European Grid Forum (EGrid) is actively building and experimenting with a grid infrastructure connecting several research-based supercomputing sites located in Europe. The paper reports on our first feasibility study: running a self-migrating version of the Cactus simulation code across the European grid testbed, including "live" remote data visualization and steering from different demonstration booths at Supercomputing 2000, in Dallas, TX. We report on the problems that had to be resolved for this endeavour and identify open research challenges for building production-grade grid environments. AU - Gehring, Jörn AU - Keller, Axel AU - Reinefeld, Alexander AU - Streit, Achim ID - 2000 T2 - Proc. Int. 
Symposium on Cluster Computing and the Grid (CCGRID) TI - Early Experiences with the EGrid Testbed ER - TY - CONF AB - The availability of commodity high performance components for workstations and networks made it possible to build up large, PC based compute clusters at modest costs. These clusters seem to be a realistic alternative to proprietary, massively parallel systems with respect to the price/performance ratio. However, from the administration point of view, those systems are still often solely a collection of autonomous nodes, connected by a fast short area network. Therefore, aiming at providing the best possible performance in daily work to all users, a lot of work has to be done before obtaining the expected result. The paper describes the problem areas we had to cope with during the integration of two large SCI clusters (one with 64 and one with 192 processors) in the environment of the Paderborn Center for Parallel Computing. AU - Keller, Axel AU - Krawinkel, Andreas ID - 2002 T2 - Proc. Int. Symposium on Cluster Computing and the Grid (CCGRID) TI - Lessons Learned While Operating Two Large SCI Clusters ER - TY - GEN AU - Hungershöfer, Jan AU - Streit, Achim AU - Wierum, Jens-Michael ID - 2427 TI - Efficient Resource Management for Malleable Applications ER - TY - CONF AU - Schintke, Florian AU - Simon, Jens AU - Reinefeld, Alexander ID - 2431 T2 - Proc. Int. Conf. on Computational Science (ICCS) TI - A Cache Simulator for Shared Memory Systems VL - 2074 ER - TY - CONF AB - RsdEditor is a graphical user interface which produces specifications of computational resources. It is used in the RSD (Resource and Service Description) environment for specifying, registering, requesting and accessing resources and services in a metacomputer. RsdEditor was designed to be used by the administrators and users of metacomputing environments. At the administrator level, the GUI is used to describe the available computing and networking components of a metacomputer. 
At the user level, RsdEditor can be used to specify which characteristics of the computational resources are needed to execute a meta-application. This paper is organized as follows: it first introduces RsdEditor. It then briefly describes the RSD environment, and finally, it highlights various features and implementation issues of RsdEditor. AU - Baraglia, Ranieri AU - Keller, Axel AU - Laforenza, Domenico AU - Reinefeld, Alexander ID - 2003 T2 - Proc. Heterogeneous Computing Workshop HCW at IPDPS TI - RsdEditor: A Graphical User Interface for Specifying Metacomputer Components ER - TY - THES AU - Simon, Jens ID - 2434 SN - 3-934445-03-9 TI - Werkzeugunterstützte effiziente Nutzung von Hochleistungsrechnern ER - TY - CONF AB - With the recent availability of cost-effective network cards for the PCI bus, researchers have been tempted to build up large compute clusters with standard PCs. Many of them are operated with workstation cluster management software in high-throughput or single user mode. For very large clusters with more than 100 PEs, however, it becomes necessary to implement a full fledged resource management software that allows to partition the system for multi-user access. In this paper, we present our Computing Center Software (CCS), which was originally designed for managing massively parallel high-performance computers, and now adapted to modern workstation clusters. It provides - partitioning of exclusive and non-exclusive resources, - hardware-independent scheduling of interactive and batch jobs, - open, extensible interfaces to other resource management systems, - a high degree of reliability. AU - Brune, Matthias AU - Keller, Axel AU - Reinefeld, Alexander ID - 2004 T2 - Proc. Int. Conf. on High-Performance Computing and Networking (HPCN) TI - Resource Management for High-Performance PC Clusters ER - TY - CHAP AB - With a steadily increasing number of services, metacomputing is now gaining importance in science and industry. 
Virtual organizations, autonomous agents, mobile computing services, and high-performance client–server applications are among the many examples of metacomputing services. For all of them, resource description plays a major role in organizing access, use, and administration of the computing components and software services. We present a generic Resource and Service Description (RSD) for specifying the hardware and software components of (meta-) computing environments. Its graphical interface allows metacomputer users to specify their resource requests. Its textual counterpart gives service providers the necessary flexibility to specify topology and properties of the available system and software resources. Finally, its internal object-oriented representation is used to link different resource management systems and service tools. With these three representations, our generic RSD approach is a key component for building metacomputer environments. AU - Brune, Matthias AU - Gehring, Jörn AU - Keller, Axel AU - Reinefeld, Alexander ED - Buyya, R. ID - 2005 T2 - High-Performance Cluster Computing: Architecture and Systems TI - Specifying Resources and Services in Metacomputing Systems ER - TY - JOUR AB - We present a software system for the management of geographically distributed high‐performance computers. It consists of three components: 1. The Computing Center Software (CCS) is a vendor‐independent resource management software for local HPC systems. It controls the mapping and scheduling of interactive and batch jobs on massively parallel systems; 2. The Resource and Service Description (RSD) is used by CCS for specifying and mapping hardware and software components of (meta‐)computing environments. It has a graphical user interface, a textual representation and an object‐oriented API; 3. The Service Coordination Layer (SCL) co‐ordinates the co‐operative use of resources in autonomous computing sites. 
It negotiates between the applications' requirements and the available system services. AU - Brune, Matthias AU - Gehring, Jörn AU - Keller, Axel AU - Reinefeld, Alexander ID - 2007 JF - Concurrency: Practice and Experience TI - Managing Clusters of Geographically Distributed High-Performance Computers VL - 11(15) ER - TY - CHAP AB - The growing maturity of hardware and software components has tempted researchers to build very large SCI clusters with several hundred processors that are operated as high-performance compute servers in multi-user mode. In this chapter, we present a resource management software for the user access and system administration of high-performance compute clusters named Computing Center Software (CCS). It is in day-to-day use since 1992 on various parallel systems and has recently been adapted to the management of SCI clusters. CCS provides pluggable schedulers, optimal space partitioning for multiple users, reliable user access, and powerful tools for specifying resources and services by means of a specification language and a graphical user interface. After a brief introduction in the remainder of this section, we describe the CCS system architecture and the characteristics of its resource description facilities. AU - Brune, Matthias AU - Keller, Axel AU - Reinefeld, Alexander ED - Hellwagner, Hermann ED - Reinefeld, Alexander ID - 2008 T2 - SCI - Scalable Coherent Interface: Architecture and Software for High Performance Compute Clusters TI - Multi-User System Management on SCI Cluster ER - TY - CHAP AU - Simon, Jens AU - Reinefeld, Alexander AU - Heinz, Oliver ED - Hellwagner, Hermann ED - Reinefeld, Alexander ID - 2435 SN - 0302-9743 T2 - SCI: Scalable Coherent Interface. Architecture and Software for High-Performance Compute Clusters TI - Large-Scale SCI Clusters in Practice: Architecture and Performance in SCI VL - 1734 ER - TY - CONF AU - Brune, Matthias AU - Reinefeld, Alexander AU - Varnholt, Jörg ID - 2436 T2 - Proc. Int. Symp. 
High-Performance Distributed Computing (HPDC) TI - A Resource Description Environment for Distributed Computing Systems ER - TY - CONF AB - RSD (Resource and Service Description) is a scheme for specifying resources and services in complex heterogeneous computing systems and metacomputing environments. At the system administrator level, RSD is used to specify the available system components, such as the number of nodes, their interconnection topology, CPU speeds, and available software packages. At the user level, a GUI provides a comfortable, high-level interface for specifying system requests. A textual editor can be used for defining repetitive and recursive structures. This gives service providers the necessary flexibility for fine-grained specification of system topologies, interconnection networks, system and software dependent properties. All these representations are mapped onto a single, coherent internal object-oriented resource representation. Dynamic aspects (like network performance, availability of compute nodes, and compute node loads) are traced at runtime and included in the resource description to allow for optimal process mapping and dynamic task load balancing at runtime at the metacomputer level. This is done in a self-organizing way, with human system operators becoming only involved when new hardware/software components are installed. AU - Brune, Matthias AU - Gehring, Jörn AU - Keller, Axel AU - Reinefeld, Alexander ID - 2009 T2 - Proc. Int. Conf. on High-Performance Computing Systems (HPCS) TI - RSD - Resource and Service Description ER - TY - CONF AB - CCS is a resource management system for parallel high-performance computers. At the user level, CCS provides vendor-independent access to parallel systems. At the system administrator level, CCS offers tools for controlling (i.e, specifying, configuring and scheduling) the system components that are operated in a computing center. Hence the name "Computing Center Software". 
CCS provides: hardware-independent scheduling of interactive and batch jobs; partitioning of exclusive and non-exclusive resources; open, extensible interfaces to other resource management systems; a high degree of reliability (e.g. automatic restart of crashed daemons); fault tolerance in the case of network breakdowns. The authors describe CCS as one important component for the access, job distribution, and administration of networked HPC systems in a metacomputing environment. AU - Keller, Axel AU - Reinefeld, Alexander ID - 2011 T2 - Proc. Heterogeneous Computing Workshop (HCW) at IPPS TI - CCS Resource Management in Networked HPC Systems ER - TY - JOUR AB - With a steadily increasing number of services, metacomputing is now gaining importance in science and industry. Virtual organizations, autonomous agents, mobile computing services, and high-performance client–server applications are among the many examples of metacomputing services. For all of them, resource description plays a major role in organizing access, use, and administration of the computing components and software services. We present a generic Resource and Service Description (RSD) for specifying the hardware and software components of (meta-) computing environments. Its graphical interface allows metacomputer users to specify their resource requests. Its textual counterpart gives service providers the necessary flexibility to specify topology and properties of the available system and software resources. Finally, its internal object-oriented representation is used to link different resource management systems and service tools. With these three representations, our generic RSD approach is a key component for building metacomputer environments. 
AU - Brune, Matthias AU - Gehring, Jörn AU - Keller, Axel AU - Monien, Burkhard ID - 2012 JF - Parallel Computing TI - Specifying Resources and Services in Metacomputing Environments VL - 24 ER - TY - JOUR AU - Simon, Jens AU - Wierum, Jens-Michael ID - 2437 IS - 5 JF - Information Processing Letters - Special Issue on Models of Computation SN - 0020-0190 TI - The Latency-of-Data-Access model for Analyzing Parallel Computation VL - 66 ER - TY - CONF AU - Brune, Matthias AU - Hellmann, Christian AU - Keller, Axel ID - 2013 T2 - Proc. Workshop Hypercomputing at ITG/GI-Conference Architektur von Rechensystemen TI - A Closer Step towards Management of Metacomputing-Resources ER - TY - CONF AU - Simon, Jens AU - Weicker, Reinhold AU - Vieth, Marco ID - 2438 SN - 978-3-540-69549-3 T2 - Proc. European Conf. on Parallel Processing (Euro-Par) TI - Workload Analysis of Computation Intensive Tasks: Case Study on SPEC CPU95 Benchmarks VL - 1300 ER - TY - CONF AU - Heinz, Oliver AU - Simon, Jens ID - 2439 T2 - Proc. Int. Conf. on Architecture of Computing Systems (ARCS) TI - Experiences with a SCI Multiprocessor Workstation Cluster ER - TY - CONF AU - Simon, Jens AU - Heinz, Oliver ID - 2440 T2 - Proc. Workshops im Rahmen der 14. ITG/GI-Fachtagung Architektur von Rechensystemen TI - SCI multiprocessor PC cluster in a WindowsNT environment ER - TY - CONF AU - Fischer, Markus AU - Simon, Jens ID - 2441 T2 - Proc. European Parallel Virtual Machine / Message Passing Interface Users’ Group Meeting (EuroPVM/MPI) TI - Embedding SCI into PVM VL - 1332 ER - TY - CONF AU - Reinefeld, Alexander AU - Baraglia, Ranieri AU - Decker, Thomas AU - Gehring, Jörn AU - Laforenza, Domenico AU - Ramme, Friedhelm AU - Römke, Thomas AU - Simon, Jens ID - 2442 T2 - Proc. Heterogeneous Computing Workshop (HCW) TI - The MOL Project: An Open, Extensible Metacomputer ER - TY - CONF AU - Simon, Jens AU - Wierum, Jens-Michael ID - 2443 SN - 978-3-540-61142-4 T2 - Proc. Int. Conf. 
on High-Performance Computing and Networking (HPCN-Europe) TI - Sequential Performance versus Scalability: Optimizing Parallel LU-Decomposition VL - 1067 ER - TY - CONF AU - Simon, Jens AU - Wierum, Jens-Michael ID - 2444 T2 - Proc. Annual Int. Conf. on High-Performance Computers (HPCS) TI - Performance Prediction of Benchmark Programs for Massively Parallel Architectures ER - TY - CONF AU - Simon, Jens AU - Wierum, Jens-Michael ID - 2445 T2 - Proc. European Conf. on Parallel Processing (Euro-Par) TI - Accurate Performance Prediction for Massively Parallel Systems and its Applications VL - 1124 ER - TY - GEN AU - Simon, Jens AU - Wierum, Jens-Michael ID - 2446 TI - On Accurate Performance Prediction for Massively Parallel Systems and its Applications ER - TY - CONF AU - Röttger, Markus AU - Schroeder, Ulf-Peter AU - Simon, Jens ID - 2447 T2 - Proc. Int. Conf. on High-Performance Computing and Networking TI - Implementation of a Parallel and Distributed Mapping Kernel for PARIX VL - 919 ER - TY - CONF AU - Römke, Thomas AU - Röttger, Markus AU - Schroeder, Ulf-Peter AU - Simon, Jens ID - 2448 T2 - Proc. ZEUS Workshop on Par. Programming and Computation TI - An Efficient Mapping Library for Parix ER - TY - CONF AU - Römke, Thomas AU - Röttger, Markus AU - Schroeder, Ulf-Peter AU - Simon, Jens ID - 2449 SN - 978-3-540-44769-6 T2 - Proc. European Conf. on Parallel Processing (Euro-Par) TI - On Efficient Embeddings of Grids into Grids in PARIX VL - 966 ER - TY - GEN AU - Gehring, Jörn AU - Simon, Jens ID - 2450 TI - SparcStation SCI-Interface ER -