TY - CONF AU - Dou, Feng AU - Wang, Lin AU - Chen, Shutong AU - Liu, Fangming ID - 50066 T2 - Proceedings of the IEEE International Conference on Computer Communications (INFOCOM) TI - X-Stream: A Flexible, Adaptive Video Transformer for Privacy-Preserving Video Stream Analytics ER - TY - CONF AU - Blöcher, Marcel AU - Nedderhut, Nils AU - Chuprikov, Pavel AU - Khalili, Ramin AU - Eugster, Patrick AU - Wang, Lin ID - 50065 T2 - Proceedings of the IEEE International Conference on Computer Communications (INFOCOM) TI - Train Once Apply Anywhere: Effective Scheduling for Network Function Chains Running on FUMES ER - TY - CONF AU - Hu, Haichuan AU - Liu, Fangming AU - Pei, Qiangyu AU - Yuan, Yongjie AU - Xu, Zichen AU - Wang, Lin ID - 50807 T2 - Proceedings of the ACM Web Conference (WWW) TI - 𝜆Grapher: A Resource-Efficient Serverless System for GNN Serving through Graph Sharing ER - TY - CONF AU - Razavi, Kamran AU - Ghafouri, Saeid AU - Mühlhäuser, Max AU - Jamshidi, Pooyan AU - Wang, Lin ID - 53095 T2 - Proceedings of the 4th Workshop on Machine Learning and Systems (EuroMLSys), colocated with EuroSys 2024 TI - Sponge: Inference Serving with Dynamic SLOs Using In-Place Vertical Scaling ER - TY - THES AU - Schneider, Stefan Balthasar ID - 29672 TI - Network and Service Coordination: Conventional and Machine Learning Approaches ER - TY - CONF AB - Recent reinforcement learning approaches for continuous control in wireless mobile networks have shown impressive results. But due to the lack of open and compatible simulators, authors typically create their own simulation environments for training and evaluation. This is cumbersome and time-consuming for authors and limits reproducibility and comparability, ultimately impeding progress in the field. To this end, we propose mobile-env, a simple and open platform for training, evaluating, and comparing reinforcement learning and conventional approaches for continuous control in mobile wireless networks. 
mobile-env is lightweight and implements the common OpenAI Gym interface and additional wrappers, which allows connecting virtually any single-agent or multi-agent reinforcement learning framework to the environment. While mobile-env provides sensible default values and can be used out of the box, it also has many configuration options and is easy to extend. We therefore believe mobile-env to be a valuable platform for driving meaningful progress in autonomous coordination of wireless mobile networks. AU - Schneider, Stefan Balthasar AU - Werner, Stefan AU - Khalili, Ramin AU - Hecker, Artur AU - Karl, Holger ID - 30236 KW - wireless mobile networks KW - network management KW - continuous control KW - cognitive networks KW - autonomous coordination KW - reinforcement learning KW - gym environment KW - simulation KW - open source T2 - IEEE/IFIP Network Operations and Management Symposium (NOMS) TI - mobile-env: An Open Platform for Reinforcement Learning in Wireless Mobile Networks ER - TY - CONF AB - The decentralized nature of multi-agent systems requires continuous data exchange to achieve global objectives. In such scenarios, Age of Information (AoI) has become an important metric of the freshness of exchanged data due to the error-proneness and delays of communication systems. Communication systems usually possess dependencies: the process describing the success or failure of communication is highly correlated when these attempts are ``close'' in some domain (e.g. in time, frequency, space or code as in wireless communication) and is, in general, non-stationary. To study AoI in such scenarios, we consider an abstract event-based AoI process $\Delta(n)$, expressing time since the last update: If, at time $n$, a monitoring node receives a status update from a source node (event $A(n-1)$ occurs), then $\Delta(n)$ is reset to one; otherwise, $\Delta(n)$ grows linearly in time. This AoI process can thus be viewed as a special random walk with resets. 
The event process $A(n)$ may be nonstationary and we merely assume that its temporal dependencies decay sufficiently, described by $\alpha$-mixing. We calculate moment bounds for the resulting AoI process as a function of the mixing rate of $A(n)$. Furthermore, we prove that the AoI process $\Delta(n)$ is itself $\alpha$-mixing from which we conclude a strong law of large numbers for $\Delta(n)$. These results are new, since AoI processes have not been studied so far in this general strongly mixing setting. This opens up future work on renewal processes with non-independent interarrival times. AU - Redder, Adrian AU - Ramaswamy, Arunselvan AU - Karl, Holger ID - 32811 T2 - Proceedings of the 58th Allerton Conference on Communication, Control, and Computing TI - Age of Information Process under Strongly Mixing Communication -- Moment Bound, Mixing Rate and Strong Law ER - TY - CONF AU - Redder, Adrian AU - Ramaswamy, Arunselvan AU - Karl, Holger ID - 30793 T2 - Proceedings of the 14th International Conference on Agents and Artificial Intelligence TI - Multi-agent Policy Gradient Algorithms for Cyber-physical Systems with Lossy Communication ER - TY - GEN AB - Iterative distributed optimization algorithms involve multiple agents that communicate with each other, over time, in order to minimize/maximize a global objective. In the presence of unreliable communication networks, the Age-of-Information (AoI), which measures the freshness of data received, may be large and hence hinder algorithmic convergence. In this paper, we study the convergence of general distributed gradient-based optimization algorithms in the presence of communication that neither happens periodically nor at stochastically independent points in time. We show that convergence is guaranteed provided the random variables associated with the AoI processes are stochastically dominated by a random variable with finite first moment. 
This improves on previous requirements of boundedness of more than the first moment. We then introduce stochastically strongly connected (SSC) networks, a new stochastic form of strong connectedness for time-varying networks. We show: If for any $p \ge 0$ the processes that describe the success of communication between agents in an SSC network are $\alpha$-mixing with $n^{p-1}\alpha(n)$ summable, then the associated AoI processes are stochastically dominated by a random variable with finite $p$-th moment. In combination with our first contribution, this implies that distributed stochastic gradient descent converges in the presence of AoI, if $\alpha(n)$ is summable. AU - Redder, Adrian AU - Ramaswamy, Arunselvan AU - Karl, Holger ID - 30790 T2 - arXiv:2201.11343 TI - Distributed gradient-based optimization in the presence of dependent aperiodic communication ER - TY - GEN AB - We present sufficient conditions that ensure convergence of the multi-agent Deep Deterministic Policy Gradient (DDPG) algorithm. It is an example of one of the most popular paradigms of Deep Reinforcement Learning (DeepRL) for tackling continuous action spaces: the actor-critic paradigm. In the setting considered herein, each agent observes a part of the global state space in order to take local actions, for which it receives local rewards. For every agent, DDPG trains a local actor (policy) and a local critic (Q-function). The analysis shows that multi-agent DDPG using neural networks to approximate the local policies and critics converges to limits with the following properties: The critic limits minimize the average squared Bellman loss; the actor limits parameterize a policy that maximizes the local critic's approximation of $Q_i^*$, where $i$ is the agent index. The averaging is with respect to a probability distribution over the global state-action space. It captures the asymptotics of all local training processes. 
Finally, we extend the analysis to a fully decentralized setting where agents communicate over a wireless network prone to delays and losses; a typical scenario in, e.g., robotic applications. AU - Redder, Adrian AU - Ramaswamy, Arunselvan AU - Karl, Holger ID - 30791 T2 - arXiv:2201.00570 TI - Asymptotic Convergence of Deep Multi-Agent Actor-Critic Algorithms ER - TY - JOUR AU - Redder, Adrian AU - Ramaswamy, Arunselvan AU - Karl, Holger ID - 32854 IS - 13 JF - IFAC-PapersOnLine TI - Practical Network Conditions for the Convergence of Distributed Optimization VL - 55 ER - TY - CONF AB - Modern services often comprise several components, such as chained virtual network functions, microservices, or machine learning functions. Providing such services requires deciding how often to instantiate each component, where to place these instances in the network, how to chain them and route traffic through them. To overcome limitations of conventional, hardwired heuristics, deep reinforcement learning (DRL) approaches for self-learning network and service management have emerged recently. These model-free DRL approaches are more flexible but typically learn tabula rasa, i.e., disregard existing understanding of networks, services, and their coordination. Instead, we propose FutureCoord, a novel model-based AI approach that leverages existing understanding of networks and services for more efficient and effective coordination without time-intensive training. FutureCoord combines Monte Carlo Tree Search with a stochastic traffic model. This allows FutureCoord to estimate the impact of future incoming traffic and effectively optimize long-term effects, taking fluctuating demand and Quality of Service (QoS) requirements into account. Our extensive evaluation based on real-world network topologies, services, and traffic traces indicates that FutureCoord clearly outperforms state-of-the-art model-free and model-based approaches with up to 51% higher flow success ratios. 
AU - Werner, Stefan AU - Schneider, Stefan Balthasar AU - Karl, Holger ID - 29220 KW - network management KW - service management KW - AI KW - Monte Carlo Tree Search KW - model-based KW - QoS T2 - IEEE/IFIP Network Operations and Management Symposium (NOMS) TI - Use What You Know: Network and Service Coordination Beyond Certainty ER - TY - CONF AB - Datacenter applications have different resource requirements from the network, and developing flow scheduling heuristics for every workload is practically infeasible. In this paper, we show that deep reinforcement learning (RL) can be used to efficiently learn flow scheduling policies for different workloads without manual feature engineering. Specifically, we present LFS, which learns to optimize a high-level performance objective, e.g., maximize the number of flow admissions while meeting the deadlines. The LFS scheduler is trained through deep RL to learn a scheduling policy on continuous online flow arrivals. The evaluation results show that the trained LFS scheduler admits 1.05x more flows than the greedy flow scheduling heuristics under varying network load. AU - Hasnain, Asif AU - Karl, Holger ID - 20125 KW - Flow scheduling KW - Deadlines KW - Reinforcement learning T2 - 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC) TI - Learning Flow Scheduling ER - TY - THES AU - Hasnain, Asif ID - 27503 TI - Automating Network Resource Allocation for Coflows with Deadlines ER - TY - CONF AB - Data-parallel applications are developed using different data programming models, e.g., MapReduce, partition/aggregate. These models represent diverse resource requirements of applications in a datacenter network, which can be represented by the coflow abstraction. The conventional method of creating hand-crafted coflow heuristics for admission or scheduling for different workloads is practically infeasible. 
In this paper, we propose a deep reinforcement learning (DRL)-based coflow admission scheme -- LCS -- that can learn an admission policy for a higher-level performance objective, i.e., maximize successful coflow admissions, without manual feature engineering. LCS is trained on a production trace, which has online coflow arrivals. The evaluation results show that LCS is able to learn a reasonable admission policy that admits more coflows than the state-of-the-art Varys heuristic while meeting their deadlines. AU - Hasnain, Asif AU - Karl, Holger ID - 21005 KW - Coflow scheduling KW - Reinforcement learning KW - Deadlines T2 - IEEE INFOCOM 2021 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS) TI - Learning Coflow Admissions ER - TY - CONF AB - Services often consist of multiple chained components such as microservices in a service mesh, or machine learning functions in a pipeline. Providing these services requires online coordination including scaling the service, placing instances of all components in the network, scheduling traffic to these instances, and routing traffic through the network. Optimized service coordination is still a hard problem due to many influencing factors such as rapidly arriving user demands and limited node and link capacity. Existing approaches to solve the problem are often built on rigid models and assumptions, tailored to specific scenarios. If the scenario changes and the assumptions no longer hold, they easily break and require manual adjustments by experts. Novel self-learning approaches using deep reinforcement learning (DRL) are promising but still have limitations as they only address simplified versions of the problem and are typically centralized and thus do not scale to practical large-scale networks. To address these issues, we propose a distributed self-learning service coordination approach using DRL. 
After centralized training, we deploy a distributed DRL agent at each node in the network, making fast coordination decisions locally in parallel with the other nodes. Each agent only observes its direct neighbors and does not need global knowledge. Hence, our approach scales independently from the size of the network. In our extensive evaluation using real-world network topologies and traffic traces, we show that our proposed approach outperforms a state-of-the-art conventional heuristic as well as a centralized DRL approach (60% higher throughput on average) while requiring less time per online decision (1 ms). AU - Schneider, Stefan Balthasar AU - Qarawlus, Haydar AU - Karl, Holger ID - 21543 KW - network management KW - service management KW - coordination KW - reinforcement learning KW - distributed T2 - IEEE International Conference on Distributed Computing Systems (ICDCS) TI - Distributed Online Service Coordination Using Deep Reinforcement Learning ER - TY - CONF AB - In practical, large-scale networks, services are requested by users across the globe, e.g., for video streaming. Services consist of multiple interconnected components such as microservices in a service mesh. Coordinating these services requires scaling them according to continuously changing user demand, deploying instances at the edge close to their users, and routing traffic efficiently between users and connected instances. Network and service coordination is commonly addressed through centralized approaches, where a single coordinator knows everything and coordinates the entire network globally. While such centralized approaches can reach global optima, they do not scale to large, realistic networks. In contrast, distributed approaches scale well, but sacrifice solution quality due to their limited scope of knowledge and coordination decisions. 
To this end, we propose a hierarchical coordination approach that combines the good solution quality of centralized approaches with the scalability of distributed approaches. In doing so, we divide the network into multiple hierarchical domains and optimize coordination in a top-down manner. We compare our hierarchical approach with a centralized approach in an extensive evaluation on a real-world network topology. Our results indicate that hierarchical coordination can find close-to-optimal solutions in a fraction of the runtime of centralized approaches. AU - Schneider, Stefan Balthasar AU - Jürgens, Mirko AU - Karl, Holger ID - 20693 KW - network management KW - service management KW - coordination KW - hierarchical KW - scalability KW - nfv T2 - IFIP/IEEE International Symposium on Integrated Network Management (IM) TI - Divide and Conquer: Hierarchical Network and Service Coordination ER - TY - JOUR AB - Modern services consist of interconnected components, e.g., microservices in a service mesh or machine learning functions in a pipeline. These services can scale and run across multiple network nodes on demand. To process incoming traffic, service components have to be instantiated and traffic assigned to these instances, taking capacities, changing demands, and Quality of Service (QoS) requirements into account. This challenge is usually solved with custom approaches designed by experts. While this typically works well for the considered scenario, the models often rely on unrealistic assumptions or on knowledge that is not available in practice (e.g., a priori knowledge). We propose DeepCoord, a novel deep reinforcement learning approach that learns how to best coordinate services and is geared towards realistic assumptions. It interacts with the network and relies on available, possibly delayed monitoring information. 
Rather than defining a complex model or an algorithm on how to achieve an objective, our model-free approach adapts to various objectives and traffic patterns. An agent is trained offline without expert knowledge and then applied online with minimal overhead. Compared to a state-of-the-art heuristic, DeepCoord significantly improves flow throughput (up to 76%) and overall network utility (more than 2x) on real-world network topologies and traffic traces. It also supports optimizing multiple, possibly competing objectives, learns to respect QoS requirements, generalizes to scenarios with unseen, stochastic traffic, and scales to large real-world networks. For reproducibility and reuse, our code is publicly available. AU - Schneider, Stefan Balthasar AU - Khalili, Ramin AU - Manzoor, Adnan AU - Qarawlus, Haydar AU - Schellenberg, Rafael AU - Karl, Holger AU - Hecker, Artur ID - 21808 JF - Transactions on Network and Service Management KW - network management KW - service management KW - coordination KW - reinforcement learning KW - self-learning KW - self-adaptation KW - multi-objective TI - Self-Learning Multi-Objective Service Coordination Using Deep Reinforcement Learning ER - TY - GEN AB - Macrodiversity is a key technique to increase the capacity of mobile networks. It can be realized using coordinated multipoint (CoMP), simultaneously connecting users to multiple overlapping cells. Selecting which users to serve by how many and which cells is NP-hard but needs to happen continuously in real time as users move and channel state changes. Existing approaches often require strict assumptions about or perfect knowledge of the underlying radio system, its resource allocation scheme, or user movements, none of which is readily available in practice. Instead, we propose three novel self-learning and self-adapting approaches using model-free deep reinforcement learning (DRL): DeepCoMP, DD-CoMP, and D3-CoMP. 
DeepCoMP leverages central observations and control of all users to select cells almost optimally. DD-CoMP and D3-CoMP use multi-agent DRL, which allows distributed, robust, and highly scalable coordination. All three approaches learn from experience and self-adapt to varying scenarios, reaching 2x higher Quality of Experience than other approaches. They have very few built-in assumptions and do not need prior system knowledge, making them more robust to change and better applicable in practice than existing approaches. AU - Schneider, Stefan Balthasar AU - Karl, Holger AU - Khalili, Ramin AU - Hecker, Artur ID - 33854 KW - mobility management KW - coordinated multipoint KW - CoMP KW - cell selection KW - resource management KW - reinforcement learning KW - multi agent KW - MARL KW - self-learning KW - self-adaptation KW - QoE TI - DeepCoMP: Coordinated Multipoint Using Multi-Agent Deep Reinforcement Learning ER - TY - GEN AB - Network and service coordination is important to provide modern services consisting of multiple interconnected components, e.g., in 5G, network function virtualization (NFV), or cloud and edge computing. In this paper, I outline my dissertation research, which proposes six approaches to automate such network and service coordination. All approaches dynamically react to the current demand and optimize coordination for high service quality and low costs. The approaches range from centralized to distributed methods and from conventional heuristic algorithms and mixed-integer linear programs to machine learning approaches using supervised and reinforcement learning. I briefly discuss their main ideas and advantages over other state-of-the-art approaches and compare strengths and weaknesses. 
AU - Schneider, Stefan Balthasar ID - 35889 KW - nfv KW - coordination KW - machine learning KW - reinforcement learning KW - phd KW - digest TI - Conventional and Machine Learning Approaches for Network and Service Coordination ER - TY - CONF AB - Modern services consist of modular, interconnected components, e.g., microservices forming a service mesh. To dynamically adjust to ever-changing service demands, service components have to be instantiated on nodes across the network. Incoming flows requesting a service then need to be routed through the deployed instances while considering node and link capacities. Ultimately, the goal is to maximize the successfully served flows and Quality of Service (QoS) through online service coordination. Current approaches for service coordination are usually centralized, assuming up-to-date global knowledge and making global decisions for all nodes in the network. Such global knowledge and centralized decisions are not realistic in practical large-scale networks. To solve this problem, we propose two algorithms for fully distributed service coordination. The proposed algorithms can be executed individually at each node in parallel and require only very limited global knowledge. We compare and evaluate both algorithms with a state-of-the-art centralized approach in extensive simulations on a large-scale, real-world network topology. Our results indicate that the two algorithms can compete with centralized approaches in terms of solution quality but require less global knowledge and are orders of magnitude faster (more than 100x). 
AU - Schneider, Stefan Balthasar AU - Klenner, Lars Dietrich AU - Karl, Holger ID - 19607 KW - distributed management KW - service coordination KW - network coordination KW - nfv KW - softwarization KW - orchestration T2 - IEEE International Conference on Network and Service Management (CNSM) TI - Every Node for Itself: Fully Distributed Service Coordination ER - TY - CONF AB - Modern services comprise interconnected components, e.g., microservices in a service mesh, that can scale and run on multiple nodes across the network on demand. To process incoming traffic, service components have to be instantiated and traffic assigned to these instances, taking capacities and changing demands into account. This challenge is usually solved with custom approaches designed by experts. While this typically works well for the considered scenario, the models often rely on unrealistic assumptions or on knowledge that is not available in practice (e.g., a priori knowledge). We propose a novel deep reinforcement learning approach that learns how to best coordinate services and is geared towards realistic assumptions. It interacts with the network and relies on available, possibly delayed monitoring information. Rather than defining a complex model or an algorithm how to achieve an objective, our model-free approach adapts to various objectives and traffic patterns. An agent is trained offline without expert knowledge and then applied online with minimal overhead. Compared to a state-of-the-art heuristic, it significantly improves flow throughput and overall network utility on real-world network topologies and traffic traces. It also learns to optimize different objectives, generalizes to scenarios with unseen, stochastic traffic patterns, and scales to large real-world networks. 
AU - Schneider, Stefan Balthasar AU - Manzoor, Adnan AU - Qarawlus, Haydar AU - Schellenberg, Rafael AU - Karl, Holger AU - Khalili, Ramin AU - Hecker, Artur ID - 19609 KW - self-driving networks KW - self-learning KW - network coordination KW - service coordination KW - reinforcement learning KW - deep learning KW - nfv T2 - IEEE International Conference on Network and Service Management (CNSM) TI - Self-Driving Network and Service Coordination Using Deep Reinforcement Learning ER - TY - CONF AB - Data-parallel applications run on a cluster of servers in a datacenter, and their communication triggers correlated resource demand on multiple links that can be abstracted as a coflow. They often desire predictable network performance, which can be passed to the network via the coflow abstraction for application-aware network scheduling. In this paper, we propose a heuristic and an optimization algorithm for predictable network performance such that they guarantee coflow completion within their deadlines. The algorithms also ensure high network utilization, i.e., they are work-conserving, and avoid starvation of coflows. We evaluate both algorithms via trace-driven simulation and show that they admit 1.1x more coflows than the Varys scheme while meeting their deadlines. AU - Hasnain, Asif AU - Karl, Holger ID - 17082 KW - Coflow KW - Scheduling KW - Deadlines KW - Data centers T2 - 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID) TI - Coflow Scheduling with Performance Guarantees for Data Center Applications ER - TY - CONF AB - Network function virtualization (NFV) proposes to replace physical middleboxes with more flexible virtual network functions (VNFs). To dynamically adjust to ever-changing traffic demands, VNFs have to be instantiated and their allocated resources have to be adjusted on demand. Deciding the amount of allocated resources is non-trivial. 
Existing optimization approaches often assume fixed resource requirements for each VNF instance. However, this can easily lead to either waste of resources or bad service quality if too many or too few resources are allocated. To solve this problem, we train machine learning models on real VNF data, containing measurements of performance and resource requirements. For each VNF, the trained models can then accurately predict the required resources to handle a certain traffic load. We integrate these machine learning models into an algorithm for joint VNF scaling and placement and evaluate their impact on resulting VNF placements. Our evaluation based on real-world data shows that using suitable machine learning models effectively avoids over- and underallocation of resources, leading to up to 12 times lower resource consumption and better service quality with up to 4.5 times lower total delay than using standard fixed resource allocation. AU - Schneider, Stefan Balthasar AU - Satheeschandran, Narayanan Puthenpurayil AU - Peuster, Manuel AU - Karl, Holger ID - 16219 T2 - IEEE Conference on Network Softwarization (NetSoft) TI - Machine Learning for Dynamic Resource Allocation in Network Function Virtualization ER - TY - CONF AU - Zafeiropoulos, A. AU - Fotopoulou, E. AU - Peuster, Manuel AU - Schneider, Stefan Balthasar AU - Gouvas, P. AU - Behnke, D. AU - Müller, M. AU - Bök, P. AU - Trakadas, P. AU - Karkazis, P. AU - Karl, Holger ID - 16222 T2 - IEEE Conference on Network Softwarization (NetSoft) TI - Benchmarking and Profiling 5G Verticals' Applications: An Industrial IoT Use Case ER - TY - JOUR AB - Currently, the coexistence of multiple users and devices challenges the network's ability to reliably connect them. This article proposes a novel communication architecture that satisfies the requirements of fifth-generation (5G) mobile network applications. 
In particular, this architecture extends and combines ultra-dense networking (UDN), multi-access edge computing (MEC), and virtual infrastructure manager (VIM) concepts to provide a flexible network of moving radio access (RA) nodes, flying or moving to areas where users and devices struggle for connectivity and data rate. Furthermore, advances in radio communications and non-orthogonal multiple access (NOMA), virtualization technologies and energy-awareness mechanisms are integrated towards a mobile UDN that not only allows RA nodes to follow the user but also enables the virtualized network functions (VNFs) to adapt to user mobility by migrating from one node to another. Performance evaluation shows that the underlying network improves connectivity of users and devices through the flexible deployment of moving RA nodes and the use of NOMA. AU - Nomikos, Nikolaos AU - Michailidis, Emmanouel T. AU - Trakadas, Panagiotis AU - Vouyioukas, Demosthenes AU - Karl, Holger AU - Martrat, Josep AU - Zahariadis, Theodore AU - Papadopoulos, Konstantinos AU - Voliotis, Stamatis ID - 16278 JF - Vehicular Communications SN - 2214-2096 TI - A UAV-based moving 5G RAN for massive connectivity of mobile users and IoT devices ER - TY - JOUR AB - Assigning bands of the wireless spectrum as resources to users is a common problem in wireless networks. Typically, frequency bands were assumed to be available in a stable manner. Nevertheless, in recent scenarios where wireless networks may be deployed in unknown environments, spectrum competition is considered, making it uncertain whether a frequency band is available at all or at what quality. To fully exploit such resources with uncertain availability, the multi-armed bandit (MAB) method, a representative online learning technique, has been applied to design spectrum scheduling algorithms. This article surveys such proposals. 
We describe the following three aspects: how to model spectrum scheduling problems within the MAB framework, what the main thread is following which prevalent algorithms are designed, and how to evaluate algorithm performance and complexity. We also give some promising directions for future research in related fields. AU - Li, Feng AU - Yu, Dongxiao AU - Yang, Huan AU - Yu, Jiguo AU - Karl, Holger AU - Cheng, Xiuzhen ID - 16280 JF - IEEE Wireless Communications SN - 1536-1284 TI - Multi-Armed-Bandit-Based Spectrum Scheduling Algorithms in Wireless Networks: A Survey ER - TY - CONF AB - Softwarization facilitates the introduction of smart manufacturing applications in the industry. Manifold devices such as machine computers, Industrial IoT devices, tablets, smartphones and smart glasses are integrated into factory networks to enable shop floor digitalization and big data analysis. To handle the increasing number of devices and the resulting traffic, a flexible and scalable factory network is necessary which can be realized using softwarization technologies like Network Function Virtualization (NFV). However, the security risks increase with the increasing number of new devices, so that cyber security must also be considered in NFV-based networks. Therefore, extending our previous work, we showcase threat detection using a cloud-native NFV-driven intrusion detection system (IDS) that is integrated in our industrial-specific network services. As a result of the threat detection, the affected network service is put into quarantine via automatic network reconfiguration. We use the 5GTANGO service platform to deploy our developed network services on Kubernetes and to initiate the network reconfiguration. 
AU - Müller, Marcel AU - Behnke, Daniel AU - Bök, Patrick-Benjamin AU - Schneider, Stefan Balthasar AU - Peuster, Manuel AU - Karl, Holger ID - 16400 T2 - IEEE Conference on Network Softwarization (NetSoft) Demo Track TI - Cloud-Native Threat Detection and Containment for Smart Manufacturing ER - TY - JOUR AU - Karl, Holger AU - Kundisch, Dennis AU - Meyer auf der Heide, Friedhelm AU - Wehrheim, Heike ID - 13770 IS - 6 JF - Business & Information Systems Engineering TI - A Case for a New IT Ecosystem: On-The-Fly Computing VL - 62 ER - TY - CONF AB - For optimal placement and orchestration of network services, it is crucial that their structure and semantics are specified clearly and comprehensively and are available to an orchestrator. Existing specification approaches are either ambiguous or miss important aspects regarding the behavior of virtual network functions (VNFs) forming a service. We propose to formally and unambiguously specify the behavior of these functions and services using Queuing Petri Nets (QPNs). QPNs are an established method that allows to express queuing, synchronization, stochastically distributed processing delays, and changing traffic volume and characteristics at each VNF. With QPNs, multiple VNFs can be connected to complete network services in any structure, even specifying bidirectional network services containing loops. We discuss how management and orchestration systems can benefit from our clear and comprehensive specification approach, leading to better placement of VNFs and improved Quality of Service. Another benefit of formally specifying network services with QPNs are diverse analysis options, which allow valuable insights such as the distribution of end-to-end delay. We propose a tool-based workflow that supports the specification of network services and the automatic generation of corresponding simulation code to enable an in-depth analysis of their behavior and performance. 
AU - Schneider, Stefan Balthasar AU - Sharma, Arnab AU - Karl, Holger AU - Wehrheim, Heike ID - 3287 T2 - 2019 IFIP/IEEE International Symposium on Integrated Network Management (IM) TI - Specifying and Analyzing Virtual Network Services Using Queuing Petri Nets ER - TY - CONF AB - As 5G and network function virtualization (NFV) are maturing, it becomes crucial to demonstrate their feasibility and benefits by means of vertical scenarios. While 5GPPP has identified smart manufacturing as one of the most important vertical industries, there is still a lack of specific, practical use cases. Using the experience from a large-scale manufacturing company, Weidmüller Group, we present a detailed use case that reflects the needs of real-world manufacturers. We also propose an architecture with specific network services and virtual network functions (VNFs) that realize the use case in practice. As a proof of concept, we implement the required services and deploy them on an emulation-based prototyping platform. Our experimental results indicate that a fully virtualized smart manufacturing use case is not only feasible but also reduces machine interconnection and configuration time and thus improves productivity by orders of magnitude. AU - Schneider, Stefan Balthasar AU - Peuster, Manuel AU - Behnke, Daniel AU - Müller, Marcel AU - Bök, Patrick-Benjamin AU - Karl, Holger ID - 9270 KW - 5g KW - vertical KW - smart manufacturing KW - nfv T2 - European Conference on Networks and Communications (EuCNC) TI - Putting 5G into Production: Realizing a Smart Manufacturing Vertical Scenario ER - TY - JOUR AB - The ongoing softwarization of networks creates a big need for automated testing solutions to ensure service quality. This becomes even more important if agile environments with short time to market and high demands, in terms of service performance and availability, are considered. 
In this paper, we introduce a novel testing solution for virtualized, microservice-based network functions and services, which we base on TTCN-3, a well-known testing language defined by the European Telecommunications Standards Institute (ETSI). We use TTCN-3 not only for functional testing but also answer the question whether TTCN-3 can be used for performance profiling tasks as well. Finally, we demonstrate the proposed concepts and solutions in a case study using our open-source prototype to test and profile a chained network service. AU - Peuster, Manuel AU - Dröge, Christian AU - Boos, Clemens AU - Karl, Holger ID - 8113 JF - ICT Express SN - 2405-9595 TI - Joint testing and profiling of microservice-based network services using TTCN-3 ER - TY - CONF AU - Dräxler, Sevil AU - Karl, Holger ID - 8240 T2 - 5th IEEE International Conference on Network Softwarization (NetSoft) 2019 TI - SPRING: Scaling, Placement, and Routing of Heterogeneous Services with Flexible Structures ER - TY - CONF AB - 5G together with software defined networking (SDN) and network function virtualisation (NFV) will enable a wide variety of vertical use cases. One of them is the smart manufacturing case which utilises 5G networks to interconnect production machines, machine parks, and factory sites to enable new possibilities in terms of flexibility, automation, and novel applications (industry 4.0). However, the availability of realistic and practical proof-of-concepts for those smart manufacturing scenarios is still limited. This demo fills this gap by not only showing a real-world smart manufacturing application entirely implemented using NFV concepts, but also a lightweight prototyping framework that simplifies the realisation of vertical NFV proof-of-concepts. During the demo, we show how an NFV-based smart manufacturing scenario can be specified, on-boarded, and instantiated before we demonstrate how the presented NFV services simplify machine data collection, aggregation, and analysis. 
AU - Peuster, Manuel AU - Schneider, Stefan Balthasar AU - Behnke, Daniel AU - Müller, Marcel AU - Bök, Patrick-Benjamin AU - Karl, Holger ID - 8792 T2 - 5th IEEE International Conference on Network Softwarization (NetSoft 2019) TI - Prototyping and Demonstrating 5G Verticals: The Smart Manufacturing Case ER - TY - JOUR AB - Softwarized networks are the key enabler for elastic, on-demand service deployments of virtualized network functions. They allow traffic to be steered dynamically through the network when new network functions are instantiated, or old ones are terminated. These scenarios become particularly challenging when stateful functions are involved, necessitating state management solutions to migrate state between the functions. The problem with existing solutions is that they typically embrace state migration and flow rerouting jointly, imposing a huge set of requirements on the on-boarded virtualized network functions (VNFs), e.g., solution-specific state management interfaces. To change this, we introduce the seamless handover protocol (SHarP), an easy-to-use, loss-less, and order-preserving flow rerouting mechanism that is not fixed to a single state management approach. Using SHarP, VNF vendors are empowered to implement or use the state management solution of their choice. SHarP supports these solutions with additional information when flows are migrated. In this paper, we present SHarP's design, its open source prototype implementation, and show how SHarP significantly reduces the buffer usage at a central (SDN) controller, which is a typical bottleneck in state-of-the-art solutions. Our experiments show that SHarP uses a constant amount of controller buffer, irrespective of the time taken to migrate the VNF state. 
AU - Peuster, Manuel AU - Küttner, Hannes AU - Karl, Holger ID - 8795 JF - International Journal of Network Management SN - 1055-7148 TI - A flow handover protocol to support state migration in softwarized networks ER - TY - JOUR AU - Soenen, Thomas AU - Tavernier, Wouter AU - Peuster, Manuel AU - Vicens, Felipe AU - Xilouris, George AU - Kolometsos, Stavros AU - Kourtis, Michail-Alexandros AU - Colle, Didier ID - 9823 JF - IEEE Communications Magazine SN - 0163-6804 TI - Empowering Network Service Developers: Enhanced NFV DevOps and Programmable MANO ER - TY - JOUR AU - Peuster, Manuel AU - Schneider, Stefan Balthasar AU - Zhao, Mengxuan AU - Xilouris, George AU - Trakadas, Panagiotis AU - Vicens, Felipe AU - Tavernier, Wouter AU - Soenen, Thomas AU - Vilalta, Ricard AU - Andreou, George AU - Kyriazis, Dimosthenis AU - Karl, Holger ID - 9824 JF - IEEE Communications Magazine SN - 0163-6804 TI - Introducing Automated Verification and Validation for Virtualized Network Functions and Services ER - TY - CONF AU - Afifi, Haitham AU - Karl, Holger ID - 6860 T2 - 2019 16th IEEE Annual Consumer Communications & Networking Conference (CCNC2019) TI - Power Allocation with a Wireless Multi-cast Aware Routing for Virtual Network Embedding ER - TY - CONF AB - By distributing the computational load over the nodes of a Wireless Acoustic Sensor Network (WASN), the real-time capability of the TRINICON (TRIple-N-Independent component analysis for CONvolutive mixtures) framework for Blind Source Separation (BSS) can be ensured, even if the individual network nodes are not powerful enough to run TRINICON in real-time by themselves. To optimally utilize the limited computing power and data rate in WASNs, the MARVELO (Multicast-Aware Routing for Virtual network Embedding with Loops in Overlays) framework is expanded for use with TRINICON, while a feature-based selection scheme is proposed to exploit the most beneficial parts of the input signal for adapting the demixing system. 
The simulation results of realistic scenarios show only a minor degradation of the separation performance even in heavily resource-limited situations. AU - Guenther, Michael AU - Afifi, Haitham AU - Brendel, Andreas AU - Karl, Holger AU - Kellermann, Walter ID - 12880 T2 - 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) (WASPAA 2019) TI - Sparse Adaptation of Distributed Blind Source Separation in Acoustic Sensor Networks ER - TY - CONF AB - Internet of Things (IoT) applications witness an exceptional evolution of traffic demands, while existing protocols, as seen in wireless sensor networks (WSNs), struggle to cope with these demands. Traditional protocols rely on finding a routing path between sensors generating data and sinks acting as gateway or databases. Meanwhile, the network will suffer from high collisions in case of high data rates. In this context, in-network processing solutions are used to leverage the wireless nodes' computations, by distributing processing tasks on the nodes along the routing path. Although in-network processing solutions are very popular in wired networks (e.g., data centers and wide area networks), there are many challenges to adopt these solutions in wireless networks, due to the interference problem. In this paper, we solve the problem of routing and task distribution jointly using a greedy Virtual Network Embedding (VNE) algorithm, and consider power control as well. Through simulations, we compare the proposed algorithm to optimal solutions and show that it achieves good results in terms of delay. Moreover, we discuss its sub-optimality by deriving tight lower bounds and loose upper bounds. We also compare our solution with another wireless VNE solution to show the trade-off between delay and symbol error rate. 
AU - Afifi, Haitham AU - Karl, Holger ID - 12881 T2 - 2019 12th IFIP Wireless and Mobile Networking Conference (WMNC) (WMNC'19) TI - An Approximate Power Control Algorithm for a Multi-Cast Wireless Virtual Network Embedding ER - TY - CONF AB - One of the major challenges in implementing wireless virtualization is resource discovery. This is particularly important for the embedding algorithms that are used to distribute the tasks to nodes. MARVELO is a prototype framework for executing different distributed algorithms on top of a wireless (802.11) ad-hoc network. The aim of MARVELO is to select the nodes for running the algorithms and to define the routing between the nodes. Hence, it also supports monitoring functionalities to collect information about the available resources and to assist in profiling the algorithms. The objective of this demo is to show how MARVELO distributes tasks in an ad-hoc network, based on feedback from our monitoring tool. Additionally, we explain the work-flow, composition and execution of the framework. AU - Afifi, Haitham AU - Karl, Holger AU - Eikenberg, Sebastian AU - Mueller, Arnold AU - Gansel, Lars AU - Makejkin, Alexander AU - Hannemann, Kai AU - Schellenberg, Rafael ID - 12882 KW - WSN KW - virtualization KW - VNE T2 - 2019 IEEE Wireless Communications and Networking Conference (WCNC) (IEEE WCNC 2019) (Demo) TI - A Rapid Prototyping for Wireless Virtual Network Embedding using MARVELO ER - TY - CONF AU - Müller, Marcel AU - Behnke, Daniel AU - Bök, Patrick-Benjamin AU - Peuster, Manuel AU - Schneider, Stefan Balthasar AU - Karl, Holger ID - 15369 T2 - IEEE 17th International Conference on Industrial Informatics (IEEE-INDIN) TI - 5G as Key Technology for Networked Factories: Application of Vertical-specific Network Services for Enabling Flexible Smart Manufacturing ER - TY - CONF AB - More and more management and orchestration approaches for (software) networks are based on machine learning paradigms and solutions. 
These approaches depend not only on their program code to operate properly, but also require enough input data to train their internal models. However, such training data is barely available for the software networking domain and most presented solutions rely on their own, sometimes not even published, data sets. This makes it hard, or even infeasible, to reproduce and compare many of the existing solutions. As a result, it ultimately slows down the adoption of machine learning approaches in softwarised networks. To this end, we introduce the "softwarised network data zoo" (SNDZoo), an open collection of software networking data sets aiming to streamline and ease machine learning research in the software networking domain. We present a general methodology to collect, archive, and publish those data sets for use by other researchers and, as an example, eight initial data sets, focusing on the performance of virtualised network functions. AU - Peuster, Manuel AU - Schneider, Stefan Balthasar AU - Karl, Holger ID - 15371 T2 - IEEE/IFIP 15th International Conference on Network and Service Management (CNSM) TI - The Softwarised Network Data Zoo ER - TY - CONF AU - Nuriddinov, Askhat AU - Tavernier, Wouter AU - Colle, Didier AU - Pickavet, Mario AU - Peuster, Manuel AU - Schneider, Stefan Balthasar ID - 15372 T2 - IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN) TI - Reproducible Functional Tests for Multi-scale Network Services ER - TY - CONF AB - Offloading packet processing tasks to programmable switches and/or to programmable network interfaces, so-called “SmartNICs”, is one of the key concepts to prepare softwarized networks for the high traffic demands of the future. However, implementing network functions that make use of those offloading technologies is still challenging and usually requires the availability of specialized hardware. 
It becomes even harder if heterogeneous services, making use of different offloading and network virtualization technologies, should be developed. In this paper, we introduce FOP4 (Function Offloading Prototyping with P4), a novel prototyping platform that allows prototyping heterogeneous software network scenarios, including container-based, P4-switch-based, and SmartNIC-based network functions. The presented work substantially extends our existing Containernet platform with the means to prototype offloading scenarios. Besides presenting the platform’s system design, we evaluate its scalability and show that it can run scenarios with more than 64 P4 switch or SmartNIC nodes on a single laptop. Finally, we present a case study in which we use the presented platform to prototype an extended in-band network telemetry use case. AU - Moro, Daniele AU - Peuster, Manuel AU - Karl, Holger AU - Capone, Antonio ID - 15373 T2 - IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN) TI - FOP4: Function Offloading Prototyping in Heterogeneous and Programmable Network Scenarios ER - TY - CONF AB - Emulation platforms supporting Virtual Network Functions (VNFs) allow developers to rapidly prototype network services. None of the available platforms, however, supports experimenting with programmable data planes to enable VNF offloading. In this demonstration, we show FOP4, a flexible platform that provides support for Docker-based VNFs, and VNF offloading, by means of P4-enabled switches. The platform provides interfaces to program the P4 devices and to deploy network functions. We demonstrate FOP4 with two complex example scenarios, demonstrating how developers can exploit data plane programmability to implement network functions. 
AU - Moro, Daniele AU - Peuster, Manuel AU - Karl, Holger AU - Capone, Antonio ID - 15374 T2 - IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN) TI - Demonstrating FOP4: A Flexible Platform to Prototype NFV Offloading Scenarios ER - TY - CONF AU - Müller, Marcel AU - Behnke, Daniel AU - Bök, Patrick-Benjamin AU - Schneider, Stefan Balthasar AU - Peuster, Manuel AU - Karl, Holger ID - 15375 T2 - IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN) TI - Putting NFV into Reality: Physical Smart Manufacturing Testbed ER - TY - CONF AU - Behnke, Daniel AU - Müller, Marcel AU - Bök, Patrick-Benjamin AU - Schneider, Stefan Balthasar AU - Peuster, Manuel AU - Karl, Holger ID - 15376 T2 - IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN) TI - NFV-driven intrusion detection for smart manufacturing ER - TY - JOUR AB - In many cyber–physical systems, we encounter the problem of remote state estimation of geographically distributed and remote physical processes. This paper studies the scheduling of sensor transmissions to estimate the states of multiple remote, dynamic processes. Information from the different sensors has to be transmitted to a central gateway over a wireless network for monitoring purposes, where typically fewer wireless channels are available than there are processes to be monitored. For effective estimation at the gateway, the sensors need to be scheduled appropriately, i.e., at each time instant one needs to decide which sensors have network access and which ones do not. To address this scheduling problem, we formulate an associated Markov decision process (MDP). This MDP is then solved using a Deep Q-Network, a recent deep reinforcement learning algorithm that is at once scalable and model-free. We compare our scheduling algorithm to popular scheduling algorithms such as round-robin and reduced-waiting-time, among others. 
Our algorithm is shown to significantly outperform these algorithms for many example scenarios. AU - Leong, Alex S. AU - Ramaswamy, Arunselvan AU - Quevedo, Daniel E. AU - Karl, Holger AU - Shi, Ling ID - 15741 JF - Automatica SN - 0005-1098 TI - Deep reinforcement learning for wireless sensor scheduling in cyber–physical systems ER - TY - CONF AB - Given the recent development in embedded devices, wireless sensor nodes are no longer limited to data collection but they can also do processing (e.g., smartphones). Accordingly, new types of applications take advantage of the processing and flexibility provided by the wireless network. A common property between these applications is that the processing is not running on only one single node, but it is broken down into smaller tasks that can run over multiple nodes, i.e., exploiting the in-network processing. We study a special variant of in-network processing, where the application is given by a graph; the processing tasks have predefined connections to be executed in a predefined sequence. The problem of embedding an application graph into a network is commonly known as Virtual Network Embedding (VNE). In this paper, we present a Genetic Algorithm (GA) solution to solve this wireless VNE problem, where we take into account the interference and multi-cast properties. We show that the GA has a good performance and fast execution compared to the optimization problem. AU - Afifi, Haitham AU - Horbach, Konrad AU - Karl, Holger ID - 13123 T2 - 2019 International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob) (WiMob 2019) TI - A Genetic Algorithm Framework for Solving Wireless Virtual Network Embedding ER -