TY  - CONF
AU  - Richter, Cedric
AU  - Haltermann, Jan Frederik
AU  - Jakobs, Marie-Christine
AU  - Pauck, Felix
AU  - Schott, Stefan
AU  - Wehrheim, Heike
ID  - 35426
T2  - 37th IEEE/ACM International Conference on Automated Software Engineering
TI  - Are Neural Bug Detectors Comparable to Software Developers on Variable Misuse Bugs?
ER  -

TY  - CONF
AU  - Schott, Stefan
AU  - Pauck, Felix
ID  - 36848
T2  - 2022 IEEE 22nd International Working Conference on Source Code Analysis and Manipulation (SCAM)
TI  - Benchmark Fuzzing for Android Taint Analyses
ER  -

TY  - CONF
AU  - Pauck, Felix
ID  - 35427
T2  - 37th IEEE/ACM International Conference on Automated Software Engineering
TI  - Scaling Arbitrary Android App Analyses
ER  -

TY  - THES
AU  - Pauck, Felix
ID  - 43108
TI  - Cooperative Android App Analysis
ER  -

TY  - THES
AU  - König, Jürgen
ID  - 47833
TI  - On the Membership and Correctness Problem for State Serializability and Value Opacity
ER  -

TY  - CONF
AU  - Richter, Cedric
AU  - Wehrheim, Heike
ID  - 32590
T2  - 2022 IEEE Conference on Software Testing, Verification and Validation (ICST)
TI  - Learning Realistic Mutations: Bug Creation for Neural Bug Detectors
ER  -

TY  - CONF
AU  - Richter, Cedric
AU  - Wehrheim, Heike
ID  - 32591
T2  - 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR)
TI  - TSSB-3M: Mining single statement bugs at massive scale
ER  -

TY  - CONF
AU  - Dongol, Brijesh
AU  - Schellhorn, Gerhard
AU  - Wehrheim, Heike
ED  - Klin, Bartek
ED  - Lasota, Slawomir
ED  - Muscholl, Anca
ID  - 45248
T2  - 33rd International Conference on Concurrency Theory, CONCUR 2022, September 12-16, 2022, Warsaw, Poland
TI  - Weak Progressive Forward Simulation Is Necessary and Sufficient for Strong Observational Refinement
VL  - 243
ER  -

TY  - CONF
AB  - In recent years, we observe an increasing amount of software with machine learning components being deployed. This poses the question of quality assurance for such components: how can we validate whether specified requirements are fulfilled by machine-learned software? Current testing and verification approaches either focus on a single requirement (e.g., fairness) or specialize in a single type of machine learning model (e.g., neural networks). In this paper, we propose property-driven testing of machine learning models. Our approach MLCheck encompasses (1) a language for property specification, and (2) a technique for systematic test case generation. The specification language is comparable to property-based testing languages. Test case generation employs advanced verification technology for a systematic, property-dependent construction of test suites, without additional user-supplied generator functions. We evaluate MLCheck using requirements and data sets from three different application areas (software discrimination, learning on knowledge graphs and security). Our evaluation shows that despite its generality MLCheck can even outperform specialised testing approaches while having a comparable runtime.
AU  - Sharma, Arnab
AU  - Demir, Caglar
AU  - Ngonga Ngomo, Axel-Cyrille
AU  - Wehrheim, Heike
ID  - 28350
T2  - Proceedings of the 20th IEEE International Conference on Machine Learning and Applications (ICMLA)
TI  - MLCHECK–Property-Driven Testing of Machine Learning Classifiers
ER  -

TY  - JOUR
AB  - Due to the lack of established real-world benchmark suites for static taint analyses of Android applications, evaluations of these analyses are often restricted and hard to compare. Even in evaluations that do use real-world apps, details about the ground truth in those apps are rarely documented, which makes it difficult to compare and reproduce the results. To push Android taint analysis research forward, this paper thus recommends criteria for constructing real-world benchmark suites for this specific domain, and presents TaintBench, the first real-world malware benchmark suite with documented taint flows. TaintBench benchmark apps include taint flows with complex structures, and address static challenges that are commonly agreed on by the community. Together with the TaintBench suite, we introduce the TaintBench framework, whose goal is to simplify real-world benchmarking of Android taint analyses. First, a usability test shows that the framework improves experts’ performance and perceived usability when documenting and inspecting taint flows. Second, experiments using TaintBench reveal new insights for the taint analysis tools Amandroid and FlowDroid: (i) They are less effective on real-world malware apps than on synthetic benchmark apps. (ii) Predefined lists of sources and sinks heavily impact the tools’ accuracy. (iii) Surprisingly, up-to-date versions of both tools are less accurate than their predecessors.
AU  - Luo, Linghui
AU  - Pauck, Felix
AU  - Piskachev, Goran
AU  - Benz, Manuel
AU  - Pashchenko, Ivan
AU  - Mory, Martin
AU  - Bodden, Eric
AU  - Hermann, Ben
AU  - Massacci, Fabio
ID  - 27045
JF  - Empirical Software Engineering
SN  - 1382-3256
TI  - TaintBench: Automatic real-world malware benchmarking of Android taint analyses
ER  -

TY  - GEN
AU  - Schott, Stefan
ID  - 22304
TI  - Android App Analysis Benchmark Case Generation
ER  -

TY  - CONF
AU  - Pauck, Felix
AU  - Wehrheim, Heike
ID  - 28199
T2  - 2021 IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM)
TI  - Jicer: Simplifying Cooperative Android App Analysis Tasks
ER  -

TY  - CONF
AU  - Pauck, Felix
AU  - Wehrheim, Heike
ED  - Koziolek, Anne
ED  - Schaefer, Ina
ED  - Seidl, Christoph
ID  - 21238
T2  - Software Engineering 2021
TI  - Cooperative Android App Analysis with CoDiDroid
ER  -

TY  - CONF
AU  - Sharma, Arnab
AU  - Wehrheim, Heike
ID  - 19656
T2  - Proceedings of the 32nd IFIP International Conference on Testing Software and Systems (ICTSS)
TI  - Automatic Fairness Testing of Machine Learning Models
ER  -

TY  - GEN
AU  - Mayer, Stefan
ID  - 19999
TI  - Optimierung von JMCTest beim Testen von Inter Method Contracts
ER  -

TY  - CONF
AU  - Bila, Eleni
AU  - Doherty, Simon
AU  - Dongol, Brijesh
AU  - Derrick, John
AU  - Schellhorn, Gerhard
AU  - Wehrheim, Heike
ED  - Gotsman, Alexey
ED  - Sokolova, Ana
ID  - 20274
T2  - Formal Techniques for Distributed Objects, Components, and Systems - 40th IFIP WG 6.1 International Conference, FORTE 2020, Held as Part of the 15th International Federated Conference on Distributed Computing Techniques, DisCoTec 2020, Valletta, Malta, June 15-19, 2020, Proceedings
TI  - Defining and Verifying Durable Opacity: Correctness for Persistent Software Transactional Memory
VL  - 12136
ER  -

TY  - CONF
AU  - Beringer, Steffen
AU  - Wehrheim, Heike
ED  - van Sinderen, Marten
ED  - Fill, Hans-Georg
ED  - Maciaszek, Leszek A.
ID  - 20275
T2  - Proceedings of the 15th International Conference on Software Technologies, ICSOFT 2020, Lieusaint, Paris, France, July 7-9, 2020
TI  - Consistency Analysis of AUTOSAR Timing Requirements
ER  -

TY  - CONF
AU  - Beyer, Dirk
AU  - Wehrheim, Heike
ED  - Margaria, Tiziana
ED  - Steffen, Bernhard
ID  - 20276
T2  - Leveraging Applications of Formal Methods, Verification and Validation: Verification Principles - 9th International Symposium on Leveraging Applications of Formal Methods, ISoLA 2020, Rhodes, Greece, October 20-30, 2020, Proceedings, Part I
TI  - Verification Artifacts in Cooperative Verification: Survey and Unifying Component Framework
VL  - 12476
ER  -

TY  - GEN
ED  - Wehrheim, Heike
ED  - Cabot, Jordi
ID  - 20277
SN  - 978-3-030-45233-9
TI  - Fundamental Approaches to Software Engineering - 23rd International Conference, FASE 2020, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020, Dublin, Ireland, April 25-30, 2020, Proceedings
VL  - 12076
ER  -

TY  - GEN
ED  - Ahrendt, Wolfgang
ED  - Wehrheim, Heike
ID  - 20278
SN  - 978-3-030-50994-1
TI  - Tests and Proofs - 14th International Conference, TAP@STAF 2020, Bergen, Norway, June 22-23, 2020, Proceedings [postponed]
VL  - 12165
ER  -

TY  - JOUR
AU  - Sharma, Arnab
AU  - Wehrheim, Heike
ID  - 20279
JF  - CoRR
TI  - Testing Monotonicity of Machine Learning Models
VL  - abs/2002.12278
ER  -

TY  - JOUR
AU  - Dalvandi, Sadegh
AU  - Doherty, Simon
AU  - Dongol, Brijesh
AU  - Wehrheim, Heike
ID  - 21016
IS  - 2
JF  - Dagstuhl Artifacts Ser.
TI  - Owicki-Gries Reasoning for C11 RAR (Artifact)
VL  - 6
ER  -

TY  - CONF
AU  - Dalvandi, Sadegh
AU  - Doherty, Simon
AU  - Dongol, Brijesh
AU  - Wehrheim, Heike
ED  - Hirschfeld, Robert
ED  - Pape, Tobias
ID  - 21017
T2  - 34th European Conference on Object-Oriented Programming, ECOOP 2020, November 15-17, 2020, Berlin, Germany (Virtual Conference)
TI  - Owicki-Gries Reasoning for C11 RAR
VL  - 166
ER  -

TY  - CONF
AU  - Richter, Cedric
AU  - Wehrheim, Heike
ID  - 21018
T2  - 35th IEEE/ACM International Conference on Automated Software Engineering, ASE 2020, Melbourne, Australia, September 21-25, 2020
TI  - Attend and Represent: A Novel View on Algorithm Selection for Software Verification
ER  -

TY  - GEN
ED  - Ahrendt, Wolfgang
ED  - Wehrheim, Heike
ID  - 21019
SN  - 978-3-030-50994-1
TI  - Tests and Proofs - 14th International Conference, TAP@STAF 2020, Bergen, Norway, June 22-23, 2020, Proceedings [postponed]
VL  - 12165
ER  -

TY  - GEN
AB  - Software verification has recently made enormous progress due to the development of novel verification methods and the speed-up of supporting technologies like SMT solving. To keep software verification tools up to date with these advances, tool developers keep on integrating newly designed methods into their tools, almost exclusively by re-implementing the method within their own framework. While this allows for a conceptual re-use of methods, it requires novel implementations for every new technique. In this paper, we employ cooperative verification in order to avoid reimplementation and enable usage of novel tools as black-box components in verification. Specifically, cooperation is employed for the core ingredient of software verification, which is invariant generation. Finding an adequate loop invariant is key to the success of a verification run. Our framework named CoVerCIG allows a master verification tool to delegate the task of invariant generation to one or several specialized helper invariant generators. Their results are then utilized within the verification run of the master verifier, allowing in particular for crosschecking the validity of the invariant. We experimentally evaluate our framework on an instance with two masters and three different invariant generators using a number of benchmarks from SV-COMP 2020. The experiments show that the use of CoVerCIG can increase the number of correctly verified tasks without increasing the used resources.
AU  - Haltermann, Jan Frederik
AU  - Wehrheim, Heike
ID  - 17825
T2  - arXiv:2008.04551
TI  - Cooperative Verification via Collective Invariant Generation
ER  -

TY  - CONF
AU  - Sharma, Arnab
AU  - Wehrheim, Heike
ID  - 16724
T2  - Proceedings of the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA)
TI  - Higher Income, Larger Loan? Monotonicity Testing of Machine Learning Models
ER  -

TY  - JOUR
AU  - Richter, Cedric
AU  - Hüllermeier, Eyke
AU  - Jakobs, Marie-Christine
AU  - Wehrheim, Heike
ID  - 16725
JF  - Journal of Automated Software Engineering
TI  - Algorithm Selection for Software Validation Based on Graph Kernels
ER  -

TY  - JOUR
AU  - Karl, Holger
AU  - Kundisch, Dennis
AU  - Meyer auf der Heide, Friedhelm
AU  - Wehrheim, Heike
ID  - 13770
IS  - 6
JF  - Business & Information Systems Engineering
TI  - A Case for a New IT Ecosystem: On-The-Fly Computing
VL  - 62
ER  -

TY  - CONF
AU  - Pauck, Felix
AU  - Bodden, Eric
AU  - Wehrheim, Heike
ED  - Felderer, Michael
ED  - Hasselbring, Wilhelm
ED  - Rabiser, Rick
ED  - Jung, Reiner
ID  - 16214
T2  - Software Engineering 2020, Fachtagung des GI-Fachbereichs Softwaretechnik, 24.-28. Februar 2020, Innsbruck, Austria
TI  - Reproducing Taint-Analysis Results with ReproDroid
ER  -

TY  - CONF
AB  - For optimal placement and orchestration of network services, it is crucial that their structure and semantics are specified clearly and comprehensively and are available to an orchestrator. Existing specification approaches are either ambiguous or miss important aspects regarding the behavior of virtual network functions (VNFs) forming a service. We propose to formally and unambiguously specify the behavior of these functions and services using Queuing Petri Nets (QPNs). QPNs are an established method that allows expressing queuing, synchronization, stochastically distributed processing delays, and changing traffic volume and characteristics at each VNF. With QPNs, multiple VNFs can be connected to complete network services in any structure, even specifying bidirectional network services containing loops. We discuss how management and orchestration systems can benefit from our clear and comprehensive specification approach, leading to better placement of VNFs and improved Quality of Service. Another benefit of formally specifying network services with QPNs is the diverse analysis options, which allow valuable insights such as the distribution of end-to-end delay. We propose a tool-based workflow that supports the specification of network services and the automatic generation of corresponding simulation code to enable an in-depth analysis of their behavior and performance.
AU  - Schneider, Stefan Balthasar
AU  - Sharma, Arnab
AU  - Karl, Holger
AU  - Wehrheim, Heike
ID  - 3287
T2  - 2019 IFIP/IEEE International Symposium on Integrated Network Management (IM)
TI  - Specifying and Analyzing Virtual Network Services Using Queuing Petri Nets
ER  -

TY  - GEN
AU  - Sharma, Arnab
AU  - Wehrheim, Heike
ID  - 7752
SN  - 978-3-88579-686-2
T2  - Proceedings of the Software Engineering Conference (SE)
TI  - Testing Balancedness of ML Algorithms
VL  - P-292
ER  -

TY  - GEN
AU  - Zhang, Shikun
ID  - 7623
TI  - Combining Android Apps for Analysis Purposes
ER  -

TY  - CONF
AU  - Sharma, Arnab
AU  - Wehrheim, Heike
ID  - 7635
T2  - IEEE International Conference on Software Testing, Verification and Validation (ICST)
TI  - Testing Machine Learning Algorithms for Balanced Data Usage
ER  -

TY  - GEN
AU  - Haltermann, Jan Frederik
ID  - 12885
TI  - Analyzing Data Usage in Array Programs
ER  -

TY  - CONF
AB  - In the field of software analysis a trade-off between scalability and accuracy always exists. In this respect, Android app analysis is no exception; in particular, analyzing large or many apps can be challenging. Dealing with many small apps is a typical challenge when facing micro-benchmarks such as DROIDBENCH or ICC-BENCH. These particular benchmarks are not only used for the evaluation of novel tools but also in continuous integration pipelines of existing mature tools to maintain and guarantee a certain quality-level. Considering this latter usage it becomes very important to be able to achieve benchmark results as fast as possible. Hence, benchmarks have to be optimized for this purpose. One approach to do so is app merging. We implemented the Android Merge Tool (AMT) following this approach and show that its novel aspects can be used to produce scaled-up and accurate benchmarks. For such benchmarks Android app analysis tools do not suffer from the scalability-accuracy trade-off anymore. We show this through detailed experiments on DROIDBENCH employing three different analysis tools (AMANDROID, ICCTA, FLOWDROID). Benchmark execution times are largely reduced without losing benchmark accuracy. Moreover, we argue why AMT is an advantageous successor of the state-of-the-art app merging tool (APKCOMBINER) in analysis lift-up scenarios.
AU  - Pauck, Felix
AU  - Zhang, Shikun
ID  - 15838
KW  - Program Analysis
KW  - Android App Analysis
KW  - Taint Analysis
KW  - App Merging
KW  - Benchmark
SN  - 9781728141367
T2  - 2019 34th IEEE/ACM International Conference on Automated Software Engineering Workshop (ASEW)
TI  - Android App Merging for Benchmark Speed-Up and Analysis Lift-Up
ER  -

TY  - CONF
AU  - Derrick, John
AU  - Doherty, Simon
AU  - Dongol, Brijesh
AU  - Schellhorn, Gerhard
AU  - Wehrheim, Heike
ED  - ter Beek, Maurice H.
ED  - McIver, Annabelle
ED  - Oliveira, José N.
ID  - 16215
T2  - Formal Methods - The Next 30 Years - Third World Congress, FM 2019, Porto, Portugal, October 7-11, 2019, Proceedings
TI  - Verifying Correctness of Persistent Concurrent Data Structures
VL  - 11800
ER  -

TY  - JOUR
AU  - Russo, Alessandra
AU  - Schürr, Andy
AU  - Wehrheim, Heike
ID  - 16216
IS  - 5
JF  - Formal Asp. Comput.
TI  - Editorial
VL  - 31
ER  -

TY  - JOUR
AU  - Fränzle, Martin
AU  - Kapur, Deepak
AU  - Wehrheim, Heike
AU  - Zhan, Naijun
ID  - 16217
IS  - 1
JF  - Formal Asp. Comput.
TI  - Editorial
VL  - 31
ER  -

TY  - CHAP
AU  - Beyer, Dirk
AU  - Jakobs, Marie-Christine
ID  - 13872
SN  - 0302-9743
T2  - Fundamental Approaches to Software Engineering
TI  - CoVeriTest: Cooperative Verifier-Based Testing
ER  -

TY  - CONF
AU  - Derrick, John
AU  - Doherty, Simon
AU  - Dongol, Brijesh
AU  - Schellhorn, Gerhard
AU  - Wehrheim, Heike
ID  - 13993
T2  - Formal Methods - The Next 30 Years - Third World Congress, FM 2019, Porto, Portugal, October 7-11, 2019, Proceedings
TI  - Verifying Correctness of Persistent Concurrent Data Structures
ER  -

TY  - JOUR
AU  - Fränzle, Martin
AU  - Kapur, Deepak
AU  - Wehrheim, Heike
AU  - Zhan, Naijun
ID  - 10011
IS  - 1
JF  - Formal Asp. Comput.
TI  - Editorial
VL  - 31
ER  -

TY  - CONF
AU  - König, Jürgen
AU  - Wehrheim, Heike
ED  - Badger, Julia M.
ED  - Rozier, Kristin Yvonne
ID  - 10091
T2  - NASA Formal Methods - 11th International Symposium, NFM 2019, Houston, TX, USA, May 7-9, 2019, Proceedings
TI  - Data Independence for Software Transactional Memory
VL  - 11460
ER  -

TY  - CONF
AU  - Doherty, Simon
AU  - Dongol, Brijesh
AU  - Wehrheim, Heike
AU  - Derrick, John
ED  - Hollingsworth, Jeffrey K.
ED  - Keidar, Idit
ID  - 10092
T2  - Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2019, Washington, DC, USA, February 16-20, 2019
TI  - Verifying C11 programs operationally
ER  -

TY  - CONF
AU  - Beyer, Dirk
AU  - Jakobs, Marie-Christine
AU  - Lemberger, Thomas
AU  - Wehrheim, Heike
ED  - Becker, Steffen
ED  - Bogicevic, Ivan
ED  - Herzwurm, Georg
ED  - Wagner, Stefan
ID  - 10093
T2  - Software Engineering and Software Management (SE/SWM 2019), Stuttgart, Germany, February 18-22, 2019
TI  - Combining Verifiers in Conditional Model Checking via Reducers
VL  - P-292
ER  -

TY  - CONF
AU  - Sharma, Arnab
AU  - Wehrheim, Heike
ED  - Becker, Steffen
ED  - Bogicevic, Ivan
ED  - Herzwurm, Georg
ED  - Wagner, Stefan
ID  - 10094
T2  - Software Engineering and Software Management, SE/SWM 2019, Stuttgart, Germany, February 18-22, 2019
TI  - Testing Balancedness of ML Algorithms
VL  - P-292
ER  -

TY  - CONF
AU  - Richter, Cedric
AU  - Wehrheim, Heike
ED  - Beyer, Dirk
ED  - Huisman, Marieke
ED  - Kordon, Fabrice
ED  - Steffen, Bernhard
ID  - 10095
T2  - Tools and Algorithms for the Construction and Analysis of Systems - 25 Years of TACAS: TOOLympics, Held as Part of ETAPS 2019, Prague, Czech Republic, April 6-11, 2019, Proceedings, Part III
TI  - PeSCo: Predicting Sequential Combinations of Verifiers - (Competition Contribution)
VL  - 11429
ER  -

TY  - GEN
AU  - Haltermann, Jan
ID  - 10105
TI  - Analyzing Data Usage in Array Programs
ER  -

TY  - CONF
AB  - Recent years have seen the development of numerous tools for the analysis of taint flows in Android apps. Taint analyses aim at detecting data leaks, accidentally or on purpose programmed into apps. Often, such tools specialize in the treatment of specific features impeding precise taint analysis (like reflection or inter-app communication). This multitude of tools, their specific applicability and their various combination options complicate the selection of a tool (or multiple tools) when faced with an analysis instance, even for knowledgeable users, and hence hinder the successful adoption of taint analyses. In this work, we thus present CoDiDroid, a framework for cooperative Android app analysis. CoDiDroid (1) allows users to ask questions about flows in apps in varying degrees of detail, (2) automatically generates subtasks for answering such questions, (3) distributes tasks onto analysis tools (currently DroidRA, FlowDroid, HornDroid, IC3 and two novel tools) and (4) at the end merges tool answers on subtasks into an overall answer. Thereby, users are freed from having to learn about the use and functionality of all these tools while still being able to leverage their capabilities. Moreover, we experimentally show that cooperation among tools pays off with respect to effectiveness, precision and scalability.
AU  - Pauck, Felix
AU  - Wehrheim, Heike
ID  - 10108
KW  - Android Taint Analysis
KW  - Cooperation
KW  - Precision
KW  - Tools
SN  - 978-1-4503-5572-8
T2  - Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
TI  - Together Strong: Cooperative Android App Analysis
ER  -

TY  - CONF
AU  - Isenberg, Tobias
AU  - Jakobs, Marie-Christine
AU  - Pauck, Felix
AU  - Wehrheim, Heike
ID  - 13874
T2  - Tests and Proofs - 13th International Conference, TAP 2019, Held as Part of the Third World Congress on Formal Methods 2019, Porto, Portugal, October 9-11, 2019, Proceedings
TI  - When Are Software Verification Results Valid for Approximate Hardware?
ER  -