TY - CHAP
AU - Hanselle, Jonas Manuel
AU - Hüllermeier, Eyke
AU - Mohr, Felix
AU - Ngonga Ngomo, Axel-Cyrille
AU - Sherif, Mohamed
AU - Tornede, Alexander
AU - Wever, Marcel Dominik
ED - Haake, Claus-Jochen
ED - Meyer auf der Heide, Friedhelm
ED - Platzner, Marco
ED - Wachsmuth, Henning
ED - Wehrheim, Heike
ID - 45884
T2 - On-The-Fly Computing -- Individualized IT-services in dynamic markets
TI - Configuration and Evaluation
VL - 412
ER -
TY - THES
AU - Tornede, Alexander
ID - 45780
TI - Advanced Algorithm Selection with Machine Learning: Handling Large Algorithm Sets, Learning From Censored Data, and Simplifying Meta Level Decisions
ER -
TY - BOOK
AB - In the proposal for our CRC in 2011, we formulated a vision of markets for
IT services that describes an approach to the provision of such services
that was novel at that time and, to a large extent, remains so today:
“Our vision of on-the-fly computing is that of IT services individually and
automatically configured and brought to execution from flexibly combinable
services traded on markets. At the same time, we aim at organizing
markets whose participants maintain a lively market of services through
appropriate entrepreneurial actions.”
Over the last 12 years, we have developed methods and techniques to
address problems critical to the convenient, efficient, and secure use of
on-the-fly computing. Among other things, we have made the description
of services more convenient by allowing natural language input,
increased the quality of configured services through (natural language)
interaction and more efficient configuration processes and analysis
procedures, made the quality of (the products of) providers in the
marketplace transparent through reputation systems, and increased the
resource efficiency of execution through reconfigurable heterogeneous
computing nodes and an integrated treatment of service description and
configuration. We have also developed network infrastructures that have
a high degree of adaptivity, scalability, efficiency, and reliability, and
provide cryptographic guarantees of anonymity and security for market
participants and their products and services.
To demonstrate the pervasiveness of the OTF computing approach, we
have implemented a proof-of-concept for OTF computing that can run
typical scenarios of an OTF market. We illustrated the approach using
a cutting-edge application scenario – automated machine learning (AutoML).
Finally, we have been pushing our work for the perpetuation of
On-The-Fly Computing beyond the SFB and sharing the expertise gained
in the SFB in events with industry partners as well as transfer projects.
This work required a broad spectrum of expertise. Computer scientists
and economists with research interests such as computer networks and
distributed algorithms, security and cryptography, software engineering
and verification, configuration and machine learning, computer engineering
and HPC, microeconomics and game theory, business informatics
and management have successfully collaborated here.
AU - Haake, Claus-Jochen
AU - Meyer auf der Heide, Friedhelm
AU - Platzner, Marco
AU - Wachsmuth, Henning
AU - Wehrheim, Heike
ID - 45863
TI - On-The-Fly Computing -- Individualized IT-services in dynamic markets
VL - 412
ER -
TY - GEN
AB - Algorithm configuration (AC) is concerned with the automated search of the
most suitable parameter configuration of a parametrized algorithm. There is
currently a wide variety of AC problem variants and methods proposed in the
literature. Existing reviews do not take into account all derivatives of the AC
problem, nor do they offer a complete classification scheme. To this end, we
introduce taxonomies to describe the AC problem and features of configuration
methods, respectively. We review existing AC literature through the lens of our
taxonomies, outline relevant design choices of configuration approaches,
contrast methods and problem variants against each other, and describe the
state of AC in industry. Finally, our review provides researchers and
practitioners with a look at future research directions in the field of AC.
AU - Schede, Elias
AU - Brandt, Jasmin
AU - Tornede, Alexander
AU - Wever, Marcel Dominik
AU - Bengs, Viktor
AU - Hüllermeier, Eyke
AU - Tierney, Kevin
ID - 30868
T2 - arXiv:2202.01651
TI - A Survey of Methods for Automated Algorithm Configuration
ER -
TY - CONF
AB - It is well known that different algorithms perform differently well on an
instance of an algorithmic problem, motivating algorithm selection (AS): Given
an instance of an algorithmic problem, which is the most suitable algorithm to
solve it? As such, the AS problem has received considerable attention resulting
in various approaches - many of which either solve a regression or ranking
problem under the hood. Although both of these formulations yield very natural
ways to tackle AS, they have considerable weaknesses. On the one hand,
correctly predicting the performance of an algorithm on an instance is a
sufficient, but not a necessary condition to produce a correct ranking over
algorithms and in particular ranking the best algorithm first. On the other
hand, classical ranking approaches often do not account for concrete
performance values available in the training data, but only leverage rankings
composed from such data. We propose HARRIS - Hybrid rAnking and RegRessIon
foreSts - a new algorithm selector leveraging special forests, combining the
strengths of both approaches while alleviating their weaknesses. HARRIS'
decisions are based on a forest model, whose trees are created based on splits
optimized on a hybrid ranking and regression loss function. As our preliminary
experimental study on ASLib shows, HARRIS improves over standard algorithm
selection approaches on some scenarios showing that combining ranking and
regression in trees is indeed promising for AS.
AU - Fehring, Lukas
AU - Hanselle, Jonas Manuel
AU - Tornede, Alexander
ID - 34103
T2 - Workshop on Meta-Learning (MetaLearn 2022) @ NeurIPS 2022
TI - HARRIS: Hybrid Ranking and Regression Forests for Algorithm Selection
ER -
TY - CONF
AB - The creation of an RDF knowledge graph for a particular application commonly involves a pipeline of tools that transform a set of input data sources into an RDF knowledge graph in a process called dataset augmentation. The components of such augmentation pipelines often require extensive configuration to lead to satisfactory results. Thus, non-experts are often unable to use them. We present an efficient supervised algorithm based on genetic programming for learning knowledge graph augmentation pipelines of arbitrary length. Our approach uses multi-expression learning to learn augmentation pipelines able to achieve a high F-measure on the training data. Our evaluation suggests that our approach can efficiently learn a larger class of RDF dataset augmentation tasks than the state of the art while using only a single training example. Even on the most complex augmentation problem we posed, our approach consistently achieves an average F1-measure of 99% in under 500 iterations with an average runtime of 16 seconds.
AU - Dreßler, Kevin
AU - Sherif, Mohamed
AU - Ngonga Ngomo, Axel-Cyrille
ID - 31806
KW - 2022 RAKI SFB901 deer dice kevin knowgraphs limes ngonga sherif simba
T2 - Proceedings of the 33rd ACM Conference on Hypertext and Hypermedia
TI - ADAGIO - Automated Data Augmentation of Knowledge Graphs Using Multi-expression Learning
ER -
TY - GEN
AU - Fehring, Lukas
ID - 33033
TI - Combined Ranking and Regression Trees for Algorithm Selection
ER -
TY - GEN
AB - In online algorithm selection (OAS), instances of an algorithmic problem
class are presented to an agent one after another, and the agent has to quickly
select a presumably best algorithm from a fixed set of candidate algorithms.
For decision problems such as satisfiability (SAT), quality typically refers to
the algorithm's runtime. As the latter is known to exhibit a heavy-tail
distribution, an algorithm is normally stopped when exceeding a predefined
upper time limit. As a consequence, machine learning methods used to optimize
an algorithm selection strategy in a data-driven manner need to deal with
right-censored samples, a problem that has received little attention in the
literature so far. In this work, we revisit multi-armed bandit algorithms for
OAS and discuss their capability of dealing with the problem. Moreover, we
adapt them towards runtime-oriented losses, allowing for partially censored
data while keeping a space- and time-complexity independent of the time
horizon. In an extensive experimental evaluation on an adapted version of the
ASlib benchmark, we demonstrate that theoretically well-founded methods based
on Thompson sampling perform particularly well and improve in comparison to
existing methods.
AU - Tornede, Alexander
AU - Bengs, Viktor
AU - Hüllermeier, Eyke
ID - 30867
T2 - Proceedings of the 36th AAAI Conference on Artificial Intelligence
TI - Machine Learning for Online Algorithm Selection under Censored Feedback
ER -
TY - GEN
AB - The problem of selecting an algorithm that appears most suitable for a
specific instance of an algorithmic problem class, such as the Boolean
satisfiability problem, is called instance-specific algorithm selection. Over
the past decade, the problem has received considerable attention, resulting in
a number of different methods for algorithm selection. Although most of these
methods are based on machine learning, surprisingly little work has been done
on meta learning, that is, on taking advantage of the complementarity of
existing algorithm selection methods in order to combine them into a single
superior algorithm selector. In this paper, we introduce the problem of meta
algorithm selection, which essentially asks for the best way to combine a given
set of algorithm selectors. We present a general methodological framework for
meta algorithm selection as well as several concrete learning methods as
instantiations of this framework, essentially combining ideas of meta learning
and ensemble learning. In an extensive experimental evaluation, we demonstrate
that ensembles of algorithm selectors can significantly outperform single
algorithm selectors and have the potential to form the new state of the art in
algorithm selection.
AU - Tornede, Alexander
AU - Gehring, Lukas
AU - Tornede, Tanja
AU - Wever, Marcel Dominik
AU - Hüllermeier, Eyke
ID - 30865
T2 - Machine Learning
TI - Algorithm Selection on a Meta Level
ER -
TY - JOUR
AB - Heated tool butt welding is a method often used for joining thermoplastics, especially when the components are made out of different materials. The quality of the connection between the components crucially depends on a suitable choice of the parameters of the welding process, such as heating time, temperature, and the precise way in which the parts are then welded. Moreover, when different materials are to be joined, the parameter values need to be tailored to the specifics of the respective material. To this end, in this paper, three approaches to tailor the parameter values to optimize the quality of the connection are compared: a heuristic by Potente, statistical experimental design, and Bayesian optimization. With the suitability for practice in mind, a series of experiments are carried out with these approaches, and their capabilities of proposing well-performing parameter values are investigated. As a result, Bayesian optimization is found to yield peak performance, but the costs for optimization are substantial. In contrast, the Potente heuristic does not require any experimentation and recommends parameter values with competitive quality.
AU - Gevers, Karina
AU - Tornede, Alexander
AU - Wever, Marcel Dominik
AU - Schöppner, Volker
AU - Hüllermeier, Eyke
ID - 33090
JF - Welding in the World
KW - Metals and Alloys
KW - Mechanical Engineering
KW - Mechanics of Materials
SN - 0043-2288
TI - A comparison of heuristic, statistical, and machine learning methods for heated tool butt welding of two different materials
ER -
TY - CONF
AB - In recent years, we observe an increasing amount of software with machine learning components being deployed. This poses the question of quality assurance for such components: how can we validate whether specified requirements are fulfilled by a machine learned software? Current testing and verification approaches either focus on a single requirement (e.g., fairness) or specialize on a single type of machine learning model (e.g., neural networks).
In this paper, we propose property-driven testing of machine learning models. Our approach MLCheck encompasses (1) a language for property specification, and (2) a technique for systematic test case generation. The specification language is comparable to property-based testing languages. Test case generation employs advanced verification technology for a systematic, property-dependent construction of test suites, without additional user-supplied generator functions. We evaluate MLCheck using requirements and data sets from three different application areas (software discrimination, learning on knowledge graphs and security). Our evaluation shows that despite its generality MLCheck can even outperform specialised testing approaches while having a comparable runtime.
AU - Sharma, Arnab
AU - Demir, Caglar
AU - Ngonga Ngomo, Axel-Cyrille
AU - Wehrheim, Heike
ID - 28350
T2 - Proceedings of the 20th IEEE International Conference on Machine Learning and Applications (ICMLA)
TI - MLCHECK–Property-Driven Testing of Machine Learning Classifiers
ER -
TY - JOUR
AB - Automated machine learning (AutoML) supports the algorithmic construction and data-specific customization of machine learning pipelines, including the selection, combination, and parametrization of machine learning algorithms as main constituents. Generally speaking, AutoML approaches comprise two major components: a search space model and an optimizer for traversing the space. Recent approaches have shown impressive results in the realm of supervised learning, most notably (single-label) classification (SLC). Moreover, first attempts at extending these approaches towards multi-label classification (MLC) have been made. While the space of candidate pipelines is already huge in SLC, the complexity of the search space is raised to an even higher power in MLC. One may wonder, therefore, whether and to what extent optimizers established for SLC can scale to this increased complexity, and how they compare to each other. This paper makes the following contributions: First, we survey existing approaches to AutoML for MLC. Second, we augment these approaches with optimizers not previously tried for MLC. Third, we propose a benchmarking framework that supports a fair and systematic comparison. Fourth, we conduct an extensive experimental study, evaluating the methods on a suite of MLC problems. We find a grammar-based best-first search to compare favorably to other optimizers.
AU - Wever, Marcel Dominik
AU - Tornede, Alexander
AU - Mohr, Felix
AU - Hüllermeier, Eyke
ID - 21004
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
KW - Automated Machine Learning
KW - Multi Label Classification
KW - Hierarchical Planning
KW - Bayesian Optimization
SN - 0162-8828
TI - AutoML for Multi-Label Classification: Overview and Empirical Evaluation
ER -
TY - JOUR
AB - Automated Machine Learning (AutoML) seeks to automatically find so-called machine learning pipelines that maximize the prediction performance when being used to train a model on a given dataset. One of the main and yet open challenges in AutoML is an effective use of computational resources: An AutoML process involves the evaluation of many candidate pipelines, which are costly but often ineffective because they are canceled due to a timeout.
In this paper, we present an approach to predict the runtime of two-step machine learning pipelines with up to one pre-processor, which can be used to anticipate whether or not a pipeline will time out. Separate runtime models are trained offline for each algorithm that may be used in a pipeline, and an overall prediction is derived from these models. We empirically show that the approach increases successful evaluations made by an AutoML tool while preserving or even improving on the previously best solutions.
AU - Mohr, Felix
AU - Wever, Marcel Dominik
AU - Tornede, Alexander
AU - Hüllermeier, Eyke
ID - 21092
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
TI - Predicting Machine Learning Pipeline Runtimes in the Context of Automated Machine Learning
ER -
TY - CONF
AU - Tornede, Tanja
AU - Tornede, Alexander
AU - Wever, Marcel Dominik
AU - Hüllermeier, Eyke
ID - 21570
T2 - Proceedings of the Genetic and Evolutionary Computation Conference
TI - Coevolution of Remaining Useful Lifetime Estimation Pipelines for Automated Predictive Maintenance
ER -
TY - CONF
AU - Hüllermeier, Eyke
AU - Mohr, Felix
AU - Tornede, Alexander
AU - Wever, Marcel Dominik
ID - 22913
TI - Automated Machine Learning, Bounded Rationality, and Rational Metareasoning
ER -
TY - GEN
AB - Automated machine learning (AutoML) strives for the automatic configuration
of machine learning algorithms and their composition into an overall (software)
solution - a machine learning pipeline - tailored to the learning task
(dataset) at hand. Over the last decade, AutoML has developed into an
independent research field with hundreds of contributions. While AutoML offers
many prospects, it is also known to be quite resource-intensive, which is one
of its major points of criticism. The primary cause for a high resource
consumption is that many approaches rely on the (costly) evaluation of many
machine learning pipelines while searching for good candidates. This problem is
amplified in the context of research on AutoML methods, due to large scale
experiments conducted with many datasets and approaches, each of them being run
with several repetitions to rule out random effects. In the spirit of recent
work on Green AI, this paper is written in an attempt to raise the awareness of
AutoML researchers for the problem and to elaborate on possible remedies. To
this end, we identify four categories of actions the community may take towards
more sustainable research on AutoML, i.e. Green AutoML: design of AutoML
systems, benchmarking, transparency and research incentives.
AU - Tornede, Tanja
AU - Tornede, Alexander
AU - Hanselle, Jonas Manuel
AU - Wever, Marcel Dominik
AU - Mohr, Felix
AU - Hüllermeier, Eyke
ID - 30866
T2 - arXiv:2111.05850
TI - Towards Green Automated Machine Learning: Status Quo and Future Directions
ER -
TY - THES
AU - Wever, Marcel Dominik
ID - 27284
TI - Automated Machine Learning for Multi-Label Classification
ER -
TY - CONF
AU - Hanselle, Jonas Manuel
AU - Tornede, Alexander
AU - Wever, Marcel Dominik
AU - Hüllermeier, Eyke
ID - 21198
TI - Algorithm Selection as Superset Learning: Constructing Algorithm Selectors from Imprecise Performance Data
ER -
TY - CONF
AU - Tornede, Alexander
AU - Wever, Marcel Dominik
AU - Hüllermeier, Eyke
ID - 17407
T2 - Discovery Science
TI - Extreme Algorithm Selection with Dyadic Feature Representation
ER -
TY - CONF
AU - Hanselle, Jonas Manuel
AU - Tornede, Alexander
AU - Wever, Marcel Dominik
AU - Hüllermeier, Eyke
ID - 17408
T2 - KI 2020: Advances in Artificial Intelligence
TI - Hybrid Ranking and Regression for Algorithm Selection
ER -