TY - CONF
AU - Richter, Cedric
AU - Haltermann, Jan Frederik
AU - Jakobs, Marie-Christine
AU - Pauck, Felix
AU - Schott, Stefan
AU - Wehrheim, Heike
ID - 35426
T2 - 37th IEEE/ACM International Conference on Automated Software Engineering
TI - Are Neural Bug Detectors Comparable to Software Developers on Variable Misuse Bugs?
ER -
TY - CONF
AU - Pauck, Felix
ID - 35427
T2 - 37th IEEE/ACM International Conference on Automated Software Engineering
TI - Scaling Arbitrary Android App Analyses
ER -
TY - CONF
AU - Ahmed, Qazi Arbab
AU - Awais, Muhammad
AU - Platzner, Marco
ID - 44194
T2 - The 24th International Symposium on Quality Electronic Design (ISQED'23), San Francisco, California, USA
TI - MAAS: Hiding Trojans in Approximate Circuits
ER -
TY - THES
AU - Pauck, Felix
ID - 43108
TI - Cooperative Android App Analysis
ER -
TY - THES
AB - Reading between the lines has so far been reserved for humans. The present dissertation addresses this research gap using machine learning methods.
Implicit expressions are not comprehensible by computers and cannot be localized in the text. However, many texts arise on interpersonal topics that, unlike commercial evaluation texts, often imply information only by means of longer phrases. Examples are the kindness and the attentiveness of a doctor, which are only paraphrased (“he didn’t even look me in the eye”). The analysis of such data, especially the identification and localization of implicit statements, is a research gap (1). This work uses so-called Aspect-based Sentiment Analysis as a method for this purpose. It remains open how the aspect categories to be extracted can be discovered and thematically delineated based on the data (2). Furthermore, it has not yet been explored what a collection of tools for identifying implicit phrases, and thus making them explicit, should look like (3). Last, it is an open question how to correlate the identified phrases from the text data with other data, including the investigation of the relationship between quantitative scores (e.g., school grades) and the thematically related text (4). Based on these research gaps, the research question is posed as follows: Using text mining methods, how can implicit rating content be properly interpreted and thus made explicit before it is automatically categorized and quantified?
The uniqueness of this dissertation is based on the automated recognition of implicit linguistic statements alongside explicit statements. These are identified in unstructured text data so that features expressed only in the text can later be compared across data sources, even though they were not included in rating categories such as stars or school grades. German-language physician ratings from websites in three countries serve as the sample domain. The solution approach consists of data creation, a pipeline for text processing, and analyses based on it. In the data creation, aspect classes are identified and delineated across platforms and marked in text data. This results in six datasets with over 70,000 annotated sentences and detailed guidelines. The models that were created based on the training data extract and categorize the aspects. In addition, the sentiment polarity and the evaluation weight, i.e., the importance of each phrase, are determined. The models, which are combined in a pipeline, are used in a prototype in the form of a web application. The analyses built on the pipeline quantify the rating contents by linking the obtained information with further data, thus allowing new insights.
As a result, a toolbox is provided to identify quantifiable rating content and categories using text mining for a sample domain. This is used to evaluate the approach, which in principle can also be adapted to any other domain.
AU - Kersting, Joschka
ID - 44323
TI - Identifizierung quantifizierbarer Bewertungsinhalte und -kategorien mittels Text Mining
ER -
TY - CHAP
AU - Wehrheim, Heike
AU - Platzner, Marco
AU - Bodden, Eric
AU - Schubert, Philipp
AU - Pauck, Felix
AU - Jakobs, Marie-Christine
ED - Haake, Claus-Jochen
ED - Meyer auf der Heide, Friedhelm
ED - Platzner, Marco
ED - Wachsmuth, Henning
ED - Wehrheim, Heike
ID - 45888
T2 - On-The-Fly Computing -- Individualized IT-services in dynamic markets
TI - Verifying Software and Reconfigurable Hardware Services
VL - 412
ER -
TY - CHAP
AU - Bäumer, Frederik Simon
AU - Chen, Wei-Fan
AU - Geierhos, Michaela
AU - Kersting, Joschka
AU - Wachsmuth, Henning
ED - Haake, Claus-Jochen
ED - Meyer auf der Heide, Friedhelm
ED - Platzner, Marco
ED - Wachsmuth, Henning
ED - Wehrheim, Heike
ID - 45882
T2 - On-The-Fly Computing -- Individualized IT-services in dynamic markets
TI - Dialogue-based Requirement Compensation and Style-adjusted Data-to-text Generation
VL - 412
ER -
TY - CHAP
AU - Hanselle, Jonas Manuel
AU - Hüllermeier, Eyke
AU - Mohr, Felix
AU - Ngonga Ngomo, Axel-Cyrille
AU - Sherif, Mohamed
AU - Tornede, Alexander
AU - Wever, Marcel Dominik
ED - Haake, Claus-Jochen
ED - Meyer auf der Heide, Friedhelm
ED - Platzner, Marco
ED - Wachsmuth, Henning
ED - Wehrheim, Heike
ID - 45884
T2 - On-The-Fly Computing -- Individualized IT-services in dynamic markets
TI - Configuration and Evaluation
VL - 412
ER -
TY - CHAP
AU - Wehrheim, Heike
AU - Hüllermeier, Eyke
AU - Becker, Steffen
AU - Becker, Matthias
AU - Richter, Cedric
AU - Sharma, Arnab
ED - Haake, Claus-Jochen
ED - Meyer auf der Heide, Friedhelm
ED - Platzner, Marco
ED - Wachsmuth, Henning
ED - Wehrheim, Heike
ID - 45886
T2 - On-The-Fly Computing -- Individualized IT-services in dynamic markets
TI - Composition Analysis in Unknown Contexts
VL - 412
ER -
TY - CHAP
AB - We present a concept for quantifying evaluative phrases to later compare rating texts numerically instead of just relying on stars or grades. We achieve this by combining deep learning models in an aspect-based sentiment analysis pipeline along with sentiment weighting, polarity, and correlation analyses that combine deep learning results with metadata. The results provide new insights for the medical field. Our application domain, physician reviews, shows that there are millions of review texts on the Internet that cannot yet be comprehensively analyzed because previous studies have focused on explicit aspects from other domains (e.g., products). We identify, extract, and classify implicit and explicit aspect phrases equally from German-language review texts. To do so, we annotated aspect phrases representing reviews on numerous aspects of a physician, medical practice, or practice staff. We apply the best performing transformer model, XLM-RoBERTa, to a large physician review dataset and correlate the results with existing metadata. As a result, we can show different correlations between the sentiment polarity of certain aspect classes (e.g., friendliness, practice equipment) and physicians’ professions (e.g., surgeon, ophthalmologist). As a result, we have individual numerical scores that contain a variety of information based on deep learning algorithms that extract textual (evaluative) information and metadata from the Web.
AU - Kersting, Joschka
AU - Geierhos, Michaela
ED - Cuzzocrea, Alfredo
ED - Gusikhin, Oleg
ED - Hammoudi, Slimane
ED - Quix, Christoph
ID - 46205
SN - 1865-0929
T2 - Data Management Technologies and Applications
TI - Towards Comparable Ratings: Quantifying Evaluative Phrases in Physician Reviews
VL - 1860
ER -
TY - THES
AU - Tornede, Alexander
ID - 45780
TI - Advanced Algorithm Selection with Machine Learning: Handling Large Algorithm Sets, Learning From Censored Data, and Simplifying Meta Level Decisions
ER -
TY - BOOK
AB - In the proposal for our CRC in 2011, we formulated a vision of markets for
IT services that describes an approach to the provision of such services
that was novel at that time and, to a large extent, remains so today:
„Our vision of on-the-fly computing is that of IT services individually and
automatically configured and brought to execution from flexibly combinable
services traded on markets. At the same time, we aim at organizing
markets whose participants maintain a lively market of services through
appropriate entrepreneurial actions.“
Over the last 12 years, we have developed methods and techniques to
address problems critical to the convenient, efficient, and secure use of
on-the-fly computing. Among other things, we have made the description
of services more convenient by allowing natural language input,
increased the quality of configured services through (natural language)
interaction and more efficient configuration processes and analysis
procedures, made the quality of (the products of) providers in the
marketplace transparent through reputation systems, and increased the
resource efficiency of execution through reconfigurable heterogeneous
computing nodes and an integrated treatment of service description and
configuration. We have also developed network infrastructures that have
a high degree of adaptivity, scalability, efficiency, and reliability, and
provide cryptographic guarantees of anonymity and security for market
participants and their products and services.
To demonstrate the pervasiveness of the OTF computing approach, we
have implemented a proof-of-concept for OTF computing that can run
typical scenarios of an OTF market. We illustrated the approach using
a cutting-edge application scenario – automated machine learning (AutoML).
Finally, we have been pushing our work for the perpetuation of
On-The-Fly Computing beyond the SFB and sharing the expertise gained
in the SFB in events with industry partners as well as transfer projects.
This work required a broad spectrum of expertise. Computer scientists
and economists with research interests such as computer networks and
distributed algorithms, security and cryptography, software engineering
and verification, configuration and machine learning, computer engineering
and HPC, microeconomics and game theory, business informatics
and management have successfully collaborated here.
AU - Haake, Claus-Jochen
AU - Meyer auf der Heide, Friedhelm
AU - Platzner, Marco
AU - Wachsmuth, Henning
AU - Wehrheim, Heike
ID - 45863
TI - On-The-Fly Computing -- Individualized IT-services in dynamic markets
VL - 412
ER -
TY - THES
AU - König, Jürgen
ID - 47833
TI - On the Membership and Correctness Problem for State Serializability and Value Opacity
ER -
TY - CONF
AU - Witschen, Linus Matthias
AU - Wiersema, Tobias
AU - Reuter, Lucas David
AU - Platzner, Marco
ID - 29945
T2 - 2022 59th ACM/IEEE Design Automation Conference (DAC)
TI - Search Space Characterization for Approximate Logic Synthesis
ER -
TY - CONF
AU - Witschen, Linus Matthias
AU - Wiersema, Tobias
AU - Artmann, Matthias
AU - Platzner, Marco
ID - 29865
T2 - Design, Automation and Test in Europe (DATE)
TI - MUSCAT: MUS-based Circuit Approximation Technique
ER -
TY - GEN
AB - Algorithm configuration (AC) is concerned with the automated search of the
most suitable parameter configuration of a parametrized algorithm. There is
currently a wide variety of AC problem variants and methods proposed in the
literature. Existing reviews do not take into account all derivatives of the AC
problem, nor do they offer a complete classification scheme. To this end, we
introduce taxonomies to describe the AC problem and features of configuration
methods, respectively. We review existing AC literature through the lens of our
taxonomies, outline relevant design choices of configuration approaches,
contrast methods and problem variants against each other, and describe the
state of AC in industry. Finally, our review provides researchers and
practitioners with a look at future research directions in the field of AC.
AU - Schede, Elias
AU - Brandt, Jasmin
AU - Tornede, Alexander
AU - Wever, Marcel Dominik
AU - Bengs, Viktor
AU - Hüllermeier, Eyke
AU - Tierney, Kevin
ID - 30868
T2 - arXiv:2202.01651
TI - A Survey of Methods for Automated Algorithm Configuration
ER -
TY - CONF
AB - Testing is one of the most frequent means of quality assurance for software. Property-based testing aims at generating test suites for checking code against user-defined properties. Test input generation is, however, most often independent of the property to be checked, and is instead based on random or user-defined data generation. In this paper, we present property-driven unit testing of functions with numerical inputs and outputs. Like property-based testing, it allows users to define the properties to be tested for. Contrary to property-based testing, it also uses the property for a targeted generation of test inputs. Our approach is a form of learning-based testing where we first of all learn a model of a given black-box function using standard machine learning algorithms, and in a second step use model and property for test input generation. This allows us to test both predefined functions as well as machine learned regression models. Our experimental evaluation shows that our property-driven approach is more effective than standard property-based testing techniques.
AU - Sharma, Arnab
AU - Melnikov, Vitaly
AU - Hüllermeier, Eyke
AU - Wehrheim, Heike
ID - 32311
T2 - Proceedings of the 10th IEEE/ACM International Conference on Formal Methods in Software Engineering (FormaliSE)
TI - Property-Driven Testing of Black-Box Functions
ER -
TY - CONF
AB - It is well known that different algorithms perform differently well on an
instance of an algorithmic problem, motivating algorithm selection (AS): Given
an instance of an algorithmic problem, which is the most suitable algorithm to
solve it? As such, the AS problem has received considerable attention resulting
in various approaches - many of which either solve a regression or ranking
problem under the hood. Although both of these formulations yield very natural
ways to tackle AS, they have considerable weaknesses. On the one hand,
correctly predicting the performance of an algorithm on an instance is a
sufficient, but not a necessary condition to produce a correct ranking over
algorithms and in particular ranking the best algorithm first. On the other
hand, classical ranking approaches often do not account for concrete
performance values available in the training data, but only leverage rankings
composed from such data. We propose HARRIS - Hybrid rAnking and RegRessIon
foreSts - a new algorithm selector leveraging special forests, combining the
strengths of both approaches while alleviating their weaknesses. HARRIS'
decisions are based on a forest model, whose trees are created based on splits
optimized on a hybrid ranking and regression loss function. As our preliminary
experimental study on ASLib shows, HARRIS improves over standard algorithm
selection approaches on some scenarios showing that combining ranking and
regression in trees is indeed promising for AS.
AU - Fehring, Lukas
AU - Hanselle, Jonas Manuel
AU - Tornede, Alexander
ID - 34103
T2 - Workshop on Meta-Learning (MetaLearn 2022) @ NeurIPS 2022
TI - HARRIS: Hybrid Ranking and Regression Forests for Algorithm Selection
ER -
TY - JOUR
AB - Many critical codebases are written in C, and most of them use preprocessor directives to encode variability, effectively encoding software product lines. These preprocessor directives, however, challenge any static code analysis. SPLlift, a previously presented approach for analyzing software product lines, is limited to Java programs that use a rather simple feature encoding and to analysis problems with a finite and ideally small domain. Other approaches that allow the analysis of real-world C software product lines use special-purpose analyses, preventing the reuse of existing analysis infrastructures and ignoring the progress made by the static analysis community. This work presents VarAlyzer, a novel static analysis approach for software product lines. VarAlyzer first transforms preprocessor constructs to plain C while preserving their variability and semantics. It then solves any given distributive analysis problem on transformed product lines in a variability-aware manner. VarAlyzer’s analysis results are annotated with feature constraints that encode in which configurations each result holds. Our experiments with 95 compilation units of OpenSSL show that applying VarAlyzer enables one to conduct inter-procedural, flow-, field- and context-sensitive data-flow analyses on entire product lines for the first time, outperforming the product-based approach for highly-configurable systems.
AU - Schubert, Philipp
AU - Gazzillo, Paul
AU - Patterson, Zach
AU - Braha, Julian
AU - Schiebel, Fabian
AU - Hermann, Ben
AU - Wei, Shiyi
AU - Bodden, Eric
ID - 30511
IS - 1
JF - Automated Software Engineering
KW - inter-procedural static analysis
KW - software product lines
KW - preprocessor
KW - LLVM
KW - C/C++
SN - 0928-8910
TI - Static data-flow analysis for software product lines in C
VL - 29
ER -
TY - CONF
AU - Richter, Cedric
AU - Wehrheim, Heike
ID - 32590
T2 - 2022 IEEE Conference on Software Testing, Verification and Validation (ICST)
TI - Learning Realistic Mutations: Bug Creation for Neural Bug Detectors
ER -
TY - CONF
AU - Richter, Cedric
AU - Wehrheim, Heike
ID - 32591
T2 - 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR)
TI - TSSB-3M: Mining single statement bugs at massive scale
ER -
TY - CONF
AB - The creation of an RDF knowledge graph for a particular application commonly involves a pipeline of tools that transform a set of input data sources into an RDF knowledge graph in a process called dataset augmentation. The components of such augmentation pipelines often require extensive configuration to lead to satisfactory results. Thus, non-experts are often unable to use them. We present an efficient supervised algorithm based on genetic programming for learning knowledge graph augmentation pipelines of arbitrary length. Our approach uses multi-expression learning to learn augmentation pipelines able to achieve a high F-measure on the training data. Our evaluation suggests that our approach can efficiently learn a larger class of RDF dataset augmentation tasks than the state of the art while using only a single training example. Even on the most complex augmentation problem we posed, our approach consistently achieves an average F1-measure of 99% in under 500 iterations with an average runtime of 16 seconds.
AU - Dreßler, Kevin
AU - Sherif, Mohamed
AU - Ngonga Ngomo, Axel-Cyrille
ID - 31806
KW - 2022 RAKI SFB901 deer dice kevin knowgraphs limes ngonga sherif simba
T2 - Proceedings of the 33rd ACM Conference on Hypertext and Hypermedia
TI - ADAGIO - Automated Data Augmentation of Knowledge Graphs Using Multi-expression Learning
ER -
TY - CONF
AU - Chen, Wei-Fan
AU - Chen, Mei-Hua
AU - Mudgal, Garima
AU - Wachsmuth, Henning
ID - 33274
T2 - Proceedings of the 9th Workshop on Argument Mining (ArgMining 2022)
TI - Analyzing Culture-Specific Argument Structures in Learner Essays
ER -
TY - CHAP
AB - This work addresses the automatic resolution of software requirements. In the vision of On-The-Fly Computing, software services should be composed on demand, based solely on natural language input from human users. To enable this, we build a chatbot solution that works with human-in-the-loop support to receive, analyze, correct, and complete their software requirements. The chatbot is equipped with a natural language processing pipeline and a large knowledge base, as well as sophisticated dialogue management skills to enhance the user experience. Previous solutions have focused on analyzing software requirements to point out errors such as vagueness, ambiguity, or incompleteness. Our work shows how apps can collaborate with users to efficiently produce correct requirements. We developed and compared three different chatbot apps that can work with built-in knowledge. We rely on ChatterBot, DialoGPT and Rasa for this purpose. While DialoGPT provides its own knowledge base, Rasa is the best system to combine the text mining and knowledge solutions at our disposal. The evaluation shows that users accept 73% of the suggested answers from Rasa, while they accept only 63% from DialoGPT or even 36% from ChatterBot.
AU - Kersting, Joschka
AU - Ahmed, Mobeen
AU - Geierhos, Michaela
ED - Stephanidis, Constantine
ED - Antona, Margherita
ED - Ntoa, Stavroula
ID - 32179
KW - On-The-Fly Computing
KW - Chatbot
KW - Knowledge Base
SN - 1865-0929
T2 - HCI International 2022 Posters
TI - Chatbot-Enhanced Requirements Resolution for Automated Service Compositions
VL - 1580
ER -
TY - CONF
AB - This paper aims at discussing past limitations set in sentiment analysis research regarding explicit and implicit mentions of opinions. Previous studies have regularly neglected this question in favor of methodical research on standard-datasets. Furthermore, they were limited to linguistically less-diverse domains, such as commercial product reviews. We face this issue by annotating a German-language physician review dataset that contains numerous implicit, long, and complex statements that indicate aspect ratings, such as the physician’s friendliness. We discuss the nature of implicit statements and present various samples to illustrate the challenge described.
AU - Kersting, Joschka
AU - Bäumer, Frederik Simon
ED - Kersting, Joschka
ID - 31054
KW - Sentiment analysis
KW - Natural language processing
KW - Aspect phrase extraction
T2 - Proceedings of the Fourteenth International Conference on Pervasive Patterns and Applications (PATTERNS 2022): Special Track AI-DRSWA: Maturing Artificial Intelligence - Data Science for Real-World Applications
TI - Implicit Statements in Healthcare Reviews: A Challenge for Sentiment Analysis
ER -
TY - GEN
AU - Chen, Mei-Hua
AU - Mudgal, Garima
AU - Chen, Wei-Fan
AU - Wachsmuth, Henning
ID - 31068
T2 - EUROCALL
TI - Investigating the argumentation structures of EFL learners from diverse language backgrounds
ER -
TY - GEN
AU - Fehring, Lukas
ID - 33033
TI - Combined Ranking and Regression Trees for Algorithm Selection
ER -
TY - GEN
AB - In online algorithm selection (OAS), instances of an algorithmic problem
class are presented to an agent one after another, and the agent has to quickly
select a presumably best algorithm from a fixed set of candidate algorithms.
For decision problems such as satisfiability (SAT), quality typically refers to
the algorithm's runtime. As the latter is known to exhibit a heavy-tail
distribution, an algorithm is normally stopped when exceeding a predefined
upper time limit. As a consequence, machine learning methods used to optimize
an algorithm selection strategy in a data-driven manner need to deal with
right-censored samples, a problem that has received little attention in the
literature so far. In this work, we revisit multi-armed bandit algorithms for
OAS and discuss their capability of dealing with the problem. Moreover, we
adapt them towards runtime-oriented losses, allowing for partially censored
data while keeping a space- and time-complexity independent of the time
horizon. In an extensive experimental evaluation on an adapted version of the
ASlib benchmark, we demonstrate that theoretically well-founded methods based
on Thompson sampling perform particularly well and improve in comparison to
existing methods.
AU - Tornede, Alexander
AU - Bengs, Viktor
AU - Hüllermeier, Eyke
ID - 30867
T2 - Proceedings of the 36th AAAI Conference on Artificial Intelligence
TI - Machine Learning for Online Algorithm Selection under Censored Feedback
ER -
TY - GEN
AB - The problem of selecting an algorithm that appears most suitable for a
specific instance of an algorithmic problem class, such as the Boolean
satisfiability problem, is called instance-specific algorithm selection. Over
the past decade, the problem has received considerable attention, resulting in
a number of different methods for algorithm selection. Although most of these
methods are based on machine learning, surprisingly little work has been done
on meta learning, that is, on taking advantage of the complementarity of
existing algorithm selection methods in order to combine them into a single
superior algorithm selector. In this paper, we introduce the problem of meta
algorithm selection, which essentially asks for the best way to combine a given
set of algorithm selectors. We present a general methodological framework for
meta algorithm selection as well as several concrete learning methods as
instantiations of this framework, essentially combining ideas of meta learning
and ensemble learning. In an extensive experimental evaluation, we demonstrate
that ensembles of algorithm selectors can significantly outperform single
algorithm selectors and have the potential to form the new state of the art in
algorithm selection.
AU - Tornede, Alexander
AU - Gehring, Lukas
AU - Tornede, Tanja
AU - Wever, Marcel Dominik
AU - Hüllermeier, Eyke
ID - 30865
T2 - Machine Learning
TI - Algorithm Selection on a Meta Level
ER -
TY - JOUR
AB - Heated tool butt welding is a method often used for joining thermoplastics, especially when the components are made out of different materials. The quality of the connection between the components crucially depends on a suitable choice of the parameters of the welding process, such as heating time, temperature, and the precise way in which the parts are then welded. Moreover, when different materials are to be joined, the parameter values need to be tailored to the specifics of the respective material. To this end, in this paper, three approaches to tailor the parameter values to optimize the quality of the connection are compared: a heuristic by Potente, statistical experimental design, and Bayesian optimization. With the suitability for practice in mind, a series of experiments are carried out with these approaches, and their capabilities of proposing well-performing parameter values are investigated. As a result, Bayesian optimization is found to yield peak performance, but the costs for optimization are substantial. In contrast, the Potente heuristic does not require any experimentation and recommends parameter values with competitive quality.
AU - Gevers, Karina
AU - Tornede, Alexander
AU - Wever, Marcel Dominik
AU - Schöppner, Volker
AU - Hüllermeier, Eyke
ID - 33090
JF - Welding in the World
KW - Metals and Alloys
KW - Mechanical Engineering
KW - Mechanics of Materials
SN - 0043-2288
TI - A comparison of heuristic, statistical, and machine learning methods for heated tool butt welding of two different materials
ER -
TY - THES
AU - Witschen, Linus Matthias
ID - 34041
TI - Frameworks and Methodologies for Search-based Approximate Logic Synthesis
ER -
TY - CONF
AU - Ahmed, Qazi Arbab
AU - Platzner, Marco
ID - 32342
TI - On the Detection and Circumvention of Bitstream-Level Trojans in FPGAs
ER -
TY - GEN
AB - This thesis aims to provide a bidirectional chatbot solution for the requirement engineering process. The Sonderforschungsbereich (SFB) 901 intends to provide the composition of software services On-the-Fly (OTF). The sub-project (B1) of the SFB 901 project deals with the parameters of service configuration. OTF Computing aims to eradicate the dependency on the requirement engineers for the software development process. However, there is no existing bidirectional chatbot solution that analyses user software requirements and provides viable suggestions to the user regarding their service. Previously, the CORDULA chatbot was developed to analyze software requirements, but it cannot keep the conversation’s context. To solve this issue, the Rasa framework is integrated with a knowledge base, which provides domain-specific knowledge to the chatbot. The software description is passed through the natural language understanding process to give consciousness to the chatbot. This process involves various machine learning models, including app family classification, to correctly identify the domain for user OTF service. Statistical models such as naïve Bayes, kNN and SVM are compared with transformer models for this classification task. Furthermore, the entities (functional requirements) are also separated from the user description.
The chatbot suggests requirements from the preliminary service template with the support of the knowledge base. Furthermore, the generated response is compared with the state-of-the-art DialoGPT transformer model and the ChatterBot conversational library. These models are trained on a conversational dataset related to software development. All responses are ranked using the DialoRPT model, and the BLEU score is used to evaluate the models’ responses. Moreover, the chatbot models are tested with human participants, who used and scored the chatbot responses based on effectiveness, efficiency and satisfaction. The overall response accuracy is also measured by averaging the user approval over the generated responses.
AU - Ahmed, Mobeen
ID - 29000
TI - Knowledge Base Enhanced & User-centric Dialogue Design for OTF Computing
ER -
TY - GEN
AU - Palushi, Juela
ID - 45790
TI - Domain-aware Text Professionalization using Sequence-to-Sequence Neural Networks
ER -
TY - GEN
AU - Budanurmath, Vinaykumar
ID - 45789
TI - Propaganda Technique Detection Using Connotation Frames
ER -
TY - CONF
AU - Dongol, Brijesh
AU - Schellhorn, Gerhard
AU - Wehrheim, Heike
ED - Klin, Bartek
ED - Lasota, Slawomir
ED - Muscholl, Anca
ID - 45248
T2 - 33rd International Conference on Concurrency Theory, CONCUR 2022, September 12-16, 2022, Warsaw, Poland
TI - Weak Progressive Forward Simulation Is Necessary and Sufficient for Strong Observational Refinement
VL - 243
ER -
TY - CONF
AB - In recent years, we observe an increasing amount of software with machine learning components being deployed. This poses the question of quality assurance for such components: how can we validate whether specified requirements are fulfilled by a machine learned software? Current testing and verification approaches either focus on a single requirement (e.g., fairness) or specialize on a single type of machine learning model (e.g., neural networks).
In this paper, we propose property-driven testing of machine learning models. Our approach MLCheck encompasses (1) a language for property specification, and (2) a technique for systematic test case generation. The specification language is comparable to property-based testing languages. Test case generation employs advanced verification technology for a systematic, property dependent construction of test suites, without additional user supplied generator functions. We evaluate MLCheck using requirements and data sets from three different application areas (software discrimination, learning on knowledge graphs and security). Our evaluation shows that despite its generality MLCheck can even outperform specialised testing approaches while having a comparable runtime.
AU - Sharma, Arnab
AU - Demir, Caglar
AU - Ngonga Ngomo, Axel-Cyrille
AU - Wehrheim, Heike
ID - 28350
T2 - Proceedings of the 20th IEEE International Conference on Machine Learning and Applications (ICMLA)
TI - MLCHECK–Property-Driven Testing of Machine Learning Classifiers
ER -
TY - CONF
AB - Content is the new oil. Users consume billions of terabytes a day while surfing on news sites or blogs, posting on social media sites, and sending chat messages around the globe. While content is heterogeneous, the dominant form of web content is text. There are situations where more diversity needs to be introduced into text content, for example, to reuse it on websites or to allow a chatbot to base its models on the information conveyed rather than on the language used. In order to achieve this, paraphrasing techniques have been developed: One example is text spinning, a technique that automatically paraphrases text while leaving the intent intact. This makes it easier to reuse content, or to make the language generated by the bot more human. One method for modifying texts is a combination of translation and back-translation. This paper presents NATTS, a naive approach that uses transformer-based translation models to create diversified text, combining translation steps in one model. An advantage of this approach is that it can be fine-tuned and handle technical language.
AU - Bäumer, Frederik Simon
AU - Kersting, Joschka
AU - Denisov, Sergej
AU - Geierhos, Michaela
ID - 26049
KW - Software Requirements
KW - Natural Language Processing
KW - Transfer Learning
KW - On-The-Fly Computing
T2 - PROCEEDINGS OF THE INTERNATIONAL CONFERENCES ON WWW/INTERNET 2021 AND APPLIED COMPUTING 2021
TI - IN OTHER WORDS: A NAIVE APPROACH TO TEXT SPINNING
ER -
TY - THES
AB - Previous research in proof-carrying hardware has established the feasibility and utility of the approach, and provided a concrete solution for employing it for the certification of functional equivalence checking against a specification, but fell short in connecting it to state-of-the-art formal verification insights, methods and tools. Due to the immense complexity of modern circuits, and verification challenges such as the state explosion problem for sequential circuits, this restriction of readily-available verification solutions severely limited the applicability of the approach in wider contexts.
This thesis closes the gap between the PCH approach and current advances in formal hardware verification, provides methods and tools to express and certify a wide range of circuit properties, both functional and non-functional, and presents for the first time prototypes in which circuits that are implemented on actual reconfigurable hardware are verified with PCH methods. Using these results, designers can now apply PCH to establish trust in more complex circuits, by using more diverse properties which they can express using modern, efficient property specification techniques.
AU - Wiersema, Tobias
ID - 26746
KW - Proof-Carrying Hardware
KW - Formal Verification
KW - Sequential Circuits
KW - Non-Functional Properties
KW - Functional Properties
TI - Guaranteeing Properties of Reconfigurable Hardware Circuits with Proof-Carrying Hardware
ER -
TY - JOUR
AB - Due to the lack of established real-world benchmark suites for static taint analyses of Android applications, evaluations of these analyses are often restricted and hard to compare. Even in evaluations that do use real-world apps, details about the ground truth in those apps are rarely documented, which makes it difficult to compare and reproduce the results. To push Android taint analysis research forward, this paper thus recommends criteria for constructing real-world benchmark suites for this specific domain, and presents TaintBench, the first real-world malware benchmark suite with documented taint flows. TaintBench benchmark apps include taint flows with complex structures, and address static challenges that are commonly agreed on by the community. Together with the TaintBench suite, we introduce the TaintBench framework, whose goal is to simplify real-world benchmarking of Android taint analyses. First, a usability test shows that the framework improves experts’ performance and perceived usability when documenting and inspecting taint flows. Second, experiments using TaintBench reveal new insights for the taint analysis tools Amandroid and FlowDroid: (i) They are less effective on real-world malware apps than on synthetic benchmark apps. (ii) Predefined lists of sources and sinks heavily impact the tools’ accuracy. (iii) Surprisingly, up-to-date versions of both tools are less accurate than their predecessors.
AU - Luo, Linghui
AU - Pauck, Felix
AU - Piskachev, Goran
AU - Benz, Manuel
AU - Pashchenko, Ivan
AU - Mory, Martin
AU - Bodden, Eric
AU - Hermann, Ben
AU - Massacci, Fabio
ID - 27045
JF - Empirical Software Engineering
SN - 1382-3256
TI - TaintBench: Automatic real-world malware benchmarking of Android taint analyses
ER -
TY - JOUR
AB - Automated machine learning (AutoML) supports the algorithmic construction and data-specific customization of machine learning pipelines, including the selection, combination, and parametrization of machine learning algorithms as main constituents. Generally speaking, AutoML approaches comprise two major components: a search space model and an optimizer for traversing the space. Recent approaches have shown impressive results in the realm of supervised learning, most notably (single-label) classification (SLC). Moreover, first attempts at extending these approaches towards multi-label classification (MLC) have been made. While the space of candidate pipelines is already huge in SLC, the complexity of the search space is raised to an even higher power in MLC. One may wonder, therefore, whether and to what extent optimizers established for SLC can scale to this increased complexity, and how they compare to each other. This paper makes the following contributions: First, we survey existing approaches to AutoML for MLC. Second, we augment these approaches with optimizers not previously tried for MLC. Third, we propose a benchmarking framework that supports a fair and systematic comparison. Fourth, we conduct an extensive experimental study, evaluating the methods on a suite of MLC problems. We find a grammar-based best-first search to compare favorably to other optimizers.
AU - Wever, Marcel Dominik
AU - Tornede, Alexander
AU - Mohr, Felix
AU - Hüllermeier, Eyke
ID - 21004
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
KW - Automated Machine Learning
KW - Multi Label Classification
KW - Hierarchical Planning
KW - Bayesian Optimization
SN - 0162-8828
TI - AutoML for Multi-Label Classification: Overview and Empirical Evaluation
ER -
TY - JOUR
AB - Automated Machine Learning (AutoML) seeks to automatically find so-called machine learning pipelines that maximize the prediction performance when being used to train a model on a given dataset. One of the main and yet open challenges in AutoML is an effective use of computational resources: An AutoML process involves the evaluation of many candidate pipelines, which are costly but often ineffective because they are canceled due to a timeout.
In this paper, we present an approach to predict the runtime of two-step machine learning pipelines with up to one pre-processor, which can be used to anticipate whether or not a pipeline will time out. Separate runtime models are trained offline for each algorithm that may be used in a pipeline, and an overall prediction is derived from these models. We empirically show that the approach increases successful evaluations made by an AutoML tool while preserving or even improving on the previously best solutions.
AU - Mohr, Felix
AU - Wever, Marcel Dominik
AU - Tornede, Alexander
AU - Hüllermeier, Eyke
ID - 21092
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
TI - Predicting Machine Learning Pipeline Runtimes in the Context of Automated Machine Learning
ER -
TY - CONF
AU - Tornede, Tanja
AU - Tornede, Alexander
AU - Wever, Marcel Dominik
AU - Hüllermeier, Eyke
ID - 21570
T2 - Proceedings of the Genetic and Evolutionary Computation Conference
TI - Coevolution of Remaining Useful Lifetime Estimation Pipelines for Automated Predictive Maintenance
ER -
TY - CHAP
AB - This chapter concentrates on aspect-based sentiment analysis, a form of opinion mining where algorithms detect sentiments expressed about features of products, services, etc. We especially focus on novel approaches for aspect phrase extraction and classification trained on feature-rich datasets. Here, we present two new datasets, which we gathered from the linguistically rich domain of physician reviews, as other investigations have mainly concentrated on commercial reviews and social media reviews so far. To give readers a better understanding of the underlying datasets, we describe the annotation process and inter-annotator agreement in detail. In our research, we automatically assess implicit mentions or indications of specific aspects. To do this, we propose and utilize neural network models that perform the here-defined aspect phrase extraction and classification task, achieving F1-score values of about 80% and accuracy values of more than 90%. As we apply our models to a comparatively complex domain, we obtain promising results.
AU - Kersting, Joschka
AU - Geierhos, Michaela
ED - Loukanova, Roussanka
ID - 17905
T2 - Natural Language Processing in Artificial Intelligence -- NLPinAI 2020
TI - Towards Aspect Extraction and Classification for Opinion Mining with Deep Sequence Networks
VL - 939
ER -
TY - GEN
AU - Schott, Stefan
ID - 22304
TI - Android App Analysis Benchmark Case Generation
ER -
TY - CONF
AU - Hüllermeier, Eyke
AU - Mohr, Felix
AU - Tornede, Alexander
AU - Wever, Marcel Dominik
ID - 22913
TI - Automated Machine Learning, Bounded Rationality, and Rational Metareasoning
ER -
TY - CONF
AU - Derrick, John
AU - Doherty, Simon
AU - Dongol, Brijesh
AU - Schellhorn, Gerhard
AU - Wehrheim, Heike
ID - 22927
T2 - Proceedings of the 35th International Symposium on Distributed Computing (DISC)
TI - On Strong Observational Refinement and Forward Simulation
ER -
TY - CONF
AU - Kersting, Joschka
AU - Geierhos, Michaela
ID - 22051
T2 - Proceedings of the 10th International Conference on Data Science, Technology and Applications (DATA 2021)
TI - Well-being in Plastic Surgery: Deep Learning Reveals Patients' Evaluations
ER -
TY - CONF
AU - Witschen, Linus Matthias
AU - Wiersema, Tobias
AU - Raeisi Nafchi, Masood
AU - Bockhorn, Arne
AU - Platzner, Marco
ED - Hannig, Frank
ED - Derrien, Steven
ED - Diniz, Pedro
ED - Chillet, Daniel
ID - 21953
T2 - Proceedings of International Symposium on Applied Reconfigurable Computing (ARC'21)
TI - Timing Optimization for Virtual FPGA Configurations
ER -
TY - CONF
AB - Static analysis is used to automatically detect bugs and security breaches, and aids compiler optimization. Whole-program analysis (WPA) can yield high precision, however causes long analysis times and thus does not match common software-development workflows, making it often impractical to use for large, real-world applications. This paper thus presents the design and implementation of ModAlyzer, a novel static-analysis approach that aims at accelerating whole-program analysis by making the analysis modular and compositional. It shows how to compute lossless, persisted summaries for callgraph, points-to and data-flow information, and it reports under which circumstances this function-level compositional analysis outperforms WPA. We implemented ModAlyzer as an extension to LLVM and PhASAR, and applied it to 12 real-world C and C++ applications. At analysis time, ModAlyzer modularly and losslessly summarizes the analysis effect of the library code those applications share, hence avoiding its repeated re-analysis. The experimental results show that the reuse of these summaries can save, on average, 72% of analysis time over WPA. Moreover, because it is lossless, the module-wise analysis fully retains precision and recall. Surprisingly, as our results show, it sometimes even yields precision superior to WPA. The initial summary generation, on average, takes about 3.67 times as long as WPA.
AU - Schubert, Philipp
AU - Hermann, Ben
AU - Bodden, Eric
ID - 21598
T2 - European Conference on Object-Oriented Programming (ECOOP)
TI - Lossless, Persisted Summarization of Static Callgraph, Points-To and Data-Flow Analysis
ER -