TY  - GEN
AB  - Context
Static analyses are well-established aids for understanding bugs or vulnerabilities during the development process or in large-scale studies. A low false-positive rate is essential for adoption in practice and for precise results in empirical studies. Unfortunately, static analyses tend to report where a vulnerability manifests rather than where it should be fixed. This can cause presumed false positives or imprecise results.
Method
To address this problem, we designed an adaptation of an existing static analysis algorithm that can distinguish between a manifestation and a fix location, and that reports error chains. An error chain represents at least two interconnected errors that occur successively, thus building the connection between the fix and manifestation location. We used our tool CogniCryptSUBS in a case study on 471 GitHub repositories and in a performance benchmark comparing different analysis configurations, and we conducted an expert interview.
Result
We found that 50 % of the projects with a report had at least one error chain. Our runtime benchmark demonstrated that our improvement caused only a minimal runtime overhead of less than 4 %. The results of our expert interview indicate that, with our adapted version, participants require fewer executions of the analysis.
Conclusion
Our results indicate that error chains occur frequently in real-world projects, and ignoring them can lead to imprecise evaluation results. The runtime benchmark indicates that our tool is a feasible and efficient solution for detecting error chains in real-world projects. Further, our results suggest that the usability of static analyses may benefit from supporting error chains.
AU  - Wickert, Anna-Katharina
AU  - Schlichtig, Michael
AU  - Vogel, Marvin
AU  - Winter, Lukas
AU  - Mezini, Mira
AU  - Bodden, Eric
ID  - 52663
KW  - Static analysis
KW  - error chains
KW  - false positive reduction
KW  - empirical studies
TI  - Supporting Error Chains in Static Analysis for Precise Evaluation Results and Enhanced Usability
ER  - 
TY  - JOUR
AU  - Bodden, Eric
AU  - Pottebaum, Jens
AU  - Fockel, Markus
AU  - Gräßler, Iris
ID  - 52587
IS  - 1
JF  - IEEE Security & Privacy
KW  - Law
KW  - Electrical and Electronic Engineering
KW  - Computer Networks and Communications
SN  - 1540-7993
TI  - Evaluating Security Through Isolation and Defense in Depth
VL  - 22
ER  - 
TY  - CONF
AB  - Previous work has shown that one can often greatly speed up static analysis by computing data flows not for every edge in the program’s control-flow graph but instead only along definition-use chains. This yields a so-called sparse static analysis. Recent work on SparseDroid has shown that specifically taint analysis can be “sparsified” with extraordinary effectiveness because the taint state of one variable does not depend on those of others. This allows one to soundly omit more flow-function computations than in the general case. In this work, we now assess whether this result carries over to the more generic setting of so-called Interprocedural Distributive Environment (IDE) problems. As opposed to taint analysis, IDE comprises distributive problems with large or even infinitely broad domains, such as typestate analysis or linear constant propagation. Specifically, this paper presents Sparse IDE, a framework that realizes sparsification for any static analysis that fits the IDE framework. We implement Sparse IDE in SparseHeros, as an extension to the popular Heros IDE solver, and evaluate its performance on real-world Java libraries by comparing it to the baseline IDE algorithm. To this end, we design, implement and evaluate a linear constant propagation analysis client on top of SparseHeros. Our experiments show that, although IDE analyses can only be sparsified with respect to symbols and not (numeric) values, Sparse IDE can nonetheless yield significantly lower runtimes and often also lower memory consumption compared to the original IDE.
AU  - Karakaya, Kadiray
AU  - Bodden, Eric
ID  - 53938
T2  - Proceedings of the IEEE/ACM 46th International Conference on Software Engineering
TI  - Symbol-Specific Sparsification of Interprocedural Distributive Environment Problems
ER  - 
TY  - CONF
AB  - To detect security vulnerabilities, static analysis tools need to be configured with security-relevant methods. Current approaches can automatically identify such methods using binary relevance machine learning approaches. However, they ignore dependencies among security-relevant methods, over-generalize and perform poorly in practice. Additionally, users nevertheless have to manually configure static analysis tools using the detected methods. Based on feedback from users and our observations, the excessive manual steps can often be tedious, error-prone and counter-intuitive.
In this paper, we present Dev-Assist, an IntelliJ IDEA plugin that detects security-relevant methods using a multi-label machine learning approach that considers dependencies among labels. The plugin can automatically generate configurations for static analysis tools, run the static analysis, and show the results in IntelliJ IDEA. Our experiments reveal that Dev-Assist's machine learning approach has a higher F1-Measure than related approaches. Moreover, the plugin reduces and simplifies the manual effort required when configuring and using static analysis tools.
AU  - Johnson, Oshando
AU  - Piskachev, Goran
AU  - Krishnamurthy, Ranjith
AU  - Bodden, Eric
ID  - 53958
T2  - Proceedings of the 46th International Conference on Software Engineering, IDE Workshop
TI  - Detecting Security-Relevant Methods using Multi-label Machine Learning
ER  - 
TY  - CONF
AB  - In light of the growing interest in type inference research for Python, both researchers and practitioners require a standardized process to assess the performance of various type inference techniques. This paper introduces TypeEvalPy, a comprehensive micro-benchmarking framework for evaluating type inference tools. TypeEvalPy contains 154 code snippets with 845 type annotations across 18 categories that target various Python features. The framework manages the execution of containerized tools, transforms inferred types into a standardized format, and produces meaningful metrics for assessment. Through our analysis, we compare the performance of six type inference tools, highlighting their strengths and limitations. Our findings provide a foundation for further research and optimization in the domain of Python type inference.
AU  - Shivarpatna Venkatesh, Ashwin Prasad
AU  - Sabu, Samkutty
AU  - Wang, Jiawei
AU  - Mir, Amir M.
AU  - Li, Li
AU  - Bodden, Eric
ID  - 53959
SN  - 9798400705021
T2  - Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings
TI  - TypeEvalPy: A Micro-benchmarking Framework for Python Type Inference Tools
ER  - 
TY  - CONF
AU  - Shivarpatna Venkatesh, Ashwin Prasad
AU  - Sabu, Samkutty
AU  - Mir, Amir M.
AU  - Reis, Sofia
AU  - Bodden, Eric
ID  - 55516
T2  - Proceedings of the 2024 IEEE/ACM First International Conference on AI Foundation Models and Software Engineering
TI  - The Emergence of Large Language Models in Static Analysis: A First Look through Micro-Benchmarks
ER  - 
TY  - CONF
AB  - Android applications collecting data from users must protect it according to the current legal frameworks. Such data protection has become even more important since the European Union rolled out the General Data Protection Regulation (GDPR). Since app developers are not legal experts, they find it difficult to write privacy-aware source code. Moreover, they have limited tool support to reason about data protection throughout their app development process.
This paper motivates the need for a static analysis approach to diagnose and explain data protection in Android apps. The analysis will recognize personal data sources in the source code, and aims to further examine the data flow originating from these sources. App developers can then address key questions about data manipulation, derived data, and the presence of technical measures. Despite challenges, we explore to what extent one can realize this analysis through static taint analysis, a common method for identifying security vulnerabilities. This is a first step towards designing a tool-based approach that aids app developers and assessors in ensuring data protection in Android apps, based on automated static program analysis. 
AU  - Khedkar, Mugdha
AU  - Bodden, Eric
ID  - 52235
KW  - static program analysis
KW  - data protection and privacy
KW  - GDPR compliance
T2  - Proceedings of the IEEE/ACM 11th International Conference on Mobile Software Engineering and Systems (MOBILESoft '24)
PB  - Association for Computing Machinery
CY  - New York, NY, USA
SP  - 65
EP  - 68
TI  - Toward an Android Static Analysis Approach for Data Protection
ER  - 
TY  - CONF
AU  - Schiebel, Fabian
AU  - Sattler, Florian
AU  - Schubert, Philipp Dominik
AU  - Apel, Sven
AU  - Bodden, Eric
ED  - Aldrich, Jonathan
ED  - Salvaneschi, Guido
ID  - 56863
SN  - 1868-8969
T2  - 38th European Conference on Object-Oriented Programming (ECOOP 2024)
TI  - Scaling Interprocedural Static Data-Flow Analysis to Large C/C++ Applications: An Experience Report
VL  - 313
ER  - 
TY  - JOUR
AB  - As our lives, our businesses, and indeed our world economy become increasingly reliant on the secure operation of many interconnected software systems, the software engineering research community is faced with unprecedented research challenges, but also with exciting new opportunities. In this roadmap paper, we outline our vision of Software Security Analysis for the systems of the future. Given the recent advances in generative AI, we need new methods to assess and maximize the security of code co-written by machines. As our systems become increasingly heterogeneous, we need practical approaches that work even if some functions are automatically generated, e.g., by deep neural networks. As software systems depend evermore on the software supply chain, we need tools that scale to an entire ecosystem. What kind of vulnerabilities exist in future systems and how do we detect them? When all the shallow bugs are found, how do we discover vulnerabilities hidden deeply in the system? Assuming we cannot find all security flaws, how can we nevertheless protect our system? To answer these questions, we start our roadmap with a survey of recent advances in software security, then discuss open challenges and opportunities, and conclude with a long-term perspective for the field.
AU  - Böhme, Marcel
AU  - Bodden, Eric
AU  - Bultan, Tevfik
AU  - Cadar, Cristian
AU  - Liu, Yang
AU  - Scanniello, Giuseppe
ID  - 59411
JF  - ACM Transactions on Software Engineering and Methodology
SN  - 1049-331X
TI  - Software Security Analysis in 2030 and Beyond: A Research Roadmap
ER  - 
TY  - CONF
AB  - Many Android applications collect data from users. The European Union's General Data Protection Regulation (GDPR) requires vendors to faithfully disclose which data their apps collect. This task is complicated because many apps use third-party code for which the same information is not readily available. Hence we ask: how accurately do current Android apps fulfill these requirements?
In this work, we first expose a multi-layered definition of privacy-related data to correctly report data collection in Android apps. We further create a dataset of privacy-sensitive data classes that may be used as input by an Android app. This dataset takes into account data collected both through the user interface and system APIs.
We manually examine the data safety sections of 70 Android apps to observe how data collection is reported, identifying instances of over- and under-reporting. Additionally, we develop a prototype to statically extract and label privacy-related data collected via app source code, user interfaces, and permissions. Comparing the prototype's results with the data safety sections of 20 apps reveals reporting discrepancies. Using the results from two Messaging and Social Media apps (Signal and Instagram), we discuss how app developers under-report and over-report data collection, respectively, and identify inaccurately reported data categories.
Our results show that app developers struggle to accurately report data collection, either due to Google's abstract definition of collected data or insufficient existing tool support. 
AU  - Khedkar, Mugdha
AU  - Mondal, Ambuj Kumar
AU  - Bodden, Eric
ID  - 56137
T2  - Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW ’24)
TI  - Do Android App Developers Accurately Report Collection of Privacy-Related Data?
ER  - 
TY  - CONF
AB  - Android apps collecting data from users must comply with legal frameworks to ensure data protection. This requirement has become even more important since the implementation of the General Data Protection Regulation (GDPR) by the European Union in 2018. Moreover, with the proposed Cyber Resilience Act on the horizon, stakeholders will soon need to assess software against even more stringent security and privacy standards. Effective privacy assessments require collaboration among groups with diverse expertise to function effectively as a cohesive unit.
This paper motivates the need for an automated approach that enhances understanding of data protection in Android apps and improves communication between the various parties involved in privacy assessments. We propose the Assessor View, a tool designed to bridge the knowledge gap between these parties, facilitating more effective privacy assessments of Android applications.
AU  - Khedkar, Mugdha
AU  - Schlichtig, Michael
AU  - Bodden, Eric
ID  - 56140
T2  - Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW ’24)
TI  - Advancing Android Privacy Assessments with Automation
ER  - 
TY  - GEN
AB  - The increased complexity of modern software has led to much more sophisticated attack vectors. As a result, we require newer vulnerability detection methods to ensure software security without compromising efficiency.
The Code Property Graph (CPG) is a program representation that provides a comprehensive overview of program behavior, combining abstract syntax trees, control flow graphs, and program dependence graphs. With such a detailed data structure, we can detect patterns that characterize known vulnerabilities and identify various security threats. Querying the combined data structure instead of the individual graphs enables the detection of multidimensional scenarios.
This work aims to integrate the advantages of CPGs into software systems that utilize the Jimple intermediate representation. We introduce JimNode, a novel approach for generating CPGs specifically tailored to Jimple. Despite the model incompatibility, our evaluation, which covered approximately 50,800 methods, reveals an 88.07% similarity of the inter-statement edges compared to Joern, the state-of-the-art tool for CPG generation. We provide a detailed analysis of our methodology and discuss why it is better suited for Jimple programs than Joern’s language-agnostic approach.
AU  - Youkeim, Michael Hany Fawzy
ID  - 57416
TI  - Tailoring Code Property Graphs to Jimple
ER  - 
TY  - CHAP
AB  - Since its inception two decades ago, Soot has become one of the most widely used open-source static analysis frameworks. Over time it has been extended with the contributions of countless researchers. Yet, at the same time, the requirements for Soot have changed over the years and become increasingly at odds with some of the major design decisions that underlie it. In this work, we thus present SootUp, a complete reimplementation of Soot that seeks to fulfill these requirements with a novel design, while at the same time keeping elements that Soot users have grown accustomed to.
AU  - Karakaya, Kadiray
AU  - Schott, Stefan
AU  - Klauke, Jonas
AU  - Bodden, Eric
AU  - Schmidt, Markus
AU  - Luo, Linghui
AU  - He, Dongjie
ID  - 53942
SN  - 0302-9743
T2  - Tools and Algorithms for the Construction and Analysis of Systems
TI  - SootUp: A Redesign of the Soot Static Analysis Framework
ER  - 
TY  - CONF
AU  - Schott, Stefan
AU  - Ponta, Serena Elisa
AU  - Fischer, Wolfram
AU  - Klauke, Jonas
AU  - Bodden, Eric
ID  - 57550
T2  - 38th European Conference on Object-Oriented Programming (ECOOP 2024)
TI  - Java Bytecode Normalization for Code Similarity Analysis
ER  - 
TY  - JOUR
AU  - Torres, Adriano
AU  - Costa, Pedro
AU  - Amaral, Luis
AU  - Pastro, Jonata
AU  - Bonifácio, Rodrigo
AU  - d'Amorim, Marcelo
AU  - Legunsen, Owolabi
AU  - Bodden, Eric
AU  - Dias Canedo, Edna
ID  - 46816
IS  - 10
JF  - IEEE Transactions on Software Engineering
KW  - Software
SN  - 0098-5589
TI  - Runtime Verification of Crypto APIs: An Empirical Study
VL  - 49
ER  - 
TY  - JOUR
AB  - The use of static analysis security testing (SAST) tools has been increasing in recent years. However, previous studies have shown that, when shipped to end users such as development or security teams, the findings of these tools are often unsatisfying. Users report high numbers of false positives or long analysis times, making the tools unusable in the daily workflow. To address this, SAST tool creators provide a wide range of configuration options, such as customization of rules through domain-specific languages or specification of the application-specific analysis scope. In this paper, we study the configuration space of selected existing SAST tools when used within the integrated development environment (IDE). We focus on the configuration options that impact three dimensions, for which a trade-off is unavoidable, i.e., precision, recall, and analysis runtime. We perform a between-subjects user study with 40 users from multiple development and security teams - to our knowledge, the largest population for this kind of user study in the software engineering community. The results show that users who configure SAST tools are more effective in resolving security vulnerabilities detected by the tools than those using the default configuration. Based on post-study interviews, we identify common strategies that users have while configuring the SAST tools to provide further insights for tool creators. Finally, an evaluation of the configuration options of two commercial SAST tools, Fortify and CheckMarx, reveals that a quarter of the users do not understand the configuration options provided. The configuration options that are found most useful relate to the analysis scope.
AU  - Piskachev, Goran
AU  - Becker, Matthias
AU  - Bodden, Eric
ID  - 49439
IS  - 5
JF  - Empirical Software Engineering
KW  - Software
SN  - 1382-3256
TI  - Can the configuration of static analyses make resolving security vulnerabilities more effective? - A user study
VL  - 28
ER  - 
TY  - JOUR
AB  - The reliable operation of technical products is increasingly threatened by deliberate attacks. Complete security is not achievable, and successful attacks are unavoidable (Assume Breach). This calls for a paradigm shift in the security-oriented development of mechatronic and cyber-physical systems toward Defense-in-Depth. Systems must be designed so that they ensure the highest possible reliability and security even under targeted attacks. The approach described here extends the system model with attack scenarios and lines of defense. These are illustrated using the example of an industrial locking system for plant security. Developers are sensitized to consider attacks systematically and to specify, across disciplines, defense elements against threats and attacks.
AU  - Gräßler, Iris
AU  - Bodden, Eric
AU  - Wiechel, Dominik
AU  - Pottebaum, Jens
ID  - 48946
IS  - 11-12
JF  - Konstruktion
KW  - Mechanical Engineering
KW  - Mechanics of Materials
KW  - General Materials Science
KW  - Theoretical Computer Science
SN  - 0720-5953
TI  - Defense-in-Depth als neues Paradigma der sicherheitsgerechten Produktentwicklung: interdisziplinäre, bedrohungsbewusste und lösungsorientierte Security
VL  - 75
ER  - 
TY  - CHAP
AB  - Static analysis tools support developers in detecting potential coding issues, such as bugs or vulnerabilities. Research emphasizes technical challenges of such tools but also mentions severe usability shortcomings. These shortcomings hinder the adoption of static analysis tools, and user dissatisfaction may even lead to tool abandonment. To comprehensively assess the state of the art, we present the first systematic usability evaluation of a wide range of static analysis tools. We derived a set of 36 relevant criteria from the literature and used them to evaluate a total of 46 static analysis tools complying with our inclusion and exclusion criteria - a representative set of mainly non-proprietary tools. The evaluation against the usability criteria in a multiple-raters approach shows that two thirds of the considered tools offer poor warning messages, while about three-quarters provide hardly any fix support. Furthermore, the integration of user knowledge is strongly neglected, which could be used, for instance, to improve handling of false positives. Finally, issues regarding workflow integration and specialized user interfaces are revealed. These findings should prove useful in guiding and focusing further research and development in user experience for static code analyses.
AU  - Nachtigall, Marcus
AU  - Schlichtig, Michael
AU  - Bodden, Eric
ID  - 52662
KW  - Automated static analysis
KW  - Software usability
SN  - 978-3-88579-726-5
T2  - Software Engineering 2023
TI  - Evaluation of Usability Criteria Addressed by Static Analysis Tools on a Large Scale
ER  - 
TY  - CHAP
AB  - Application Programming Interfaces (APIs) are the primary mechanism developers use to obtain access to third-party algorithms and services. Unfortunately, APIs can be misused, which can have catastrophic consequences, especially if the APIs provide security-critical functionalities like cryptography. Understanding what API misuses are, and how they are caused, is important to prevent them, e.g., with API misuse detectors. However, definitions for API misuses and related terms in literature vary. This paper presents a systematic literature review to clarify these terms and introduces FUM, a novel Framework for API Usage constraint and Misuse classification. The literature review revealed that API misuses are violations of API usage constraints. To address this, we provide unified definitions and use them to derive FUM. To assess the extent to which FUM aids in determining and guiding the improvement of an API misuse detector’s capabilities, we performed a case study on the state-of-the-art misuse detection tool CogniCrypt. The study showed that FUM can be used to properly assess CogniCrypt’s capabilities, identify weaknesses and assist in deriving mitigations and improvements.
AU  - Schlichtig, Michael
AU  - Sassalla, Steffen
AU  - Narasimhan, Krishna
AU  - Bodden, Eric
ID  - 52660
KW  - API misuses
KW  - API usage constraints
KW  - classification framework
KW  - API misuse detection
KW  - static analysis
SN  - 978-3-88579-726-5
T2  - Software Engineering 2023
TI  - Introducing FUM: A Framework for API Usage Constraint and Misuse Classification
ER  - 
TY  - CONF
AU  - Krüger, Stefan
AU  - Reif, Michael
AU  - Wickert, Anna-Katharina
AU  - Nadi, Sarah
AU  - Ali, Karim
AU  - Bodden, Eric
AU  - Acar, Yasemin
AU  - Mezini, Mira
AU  - Fahl, Sascha
ID  - 49438
T2  - 2023 IEEE Secure Development Conference (SecDev)
TI  - Securing Your Crypto-API Usage Through Tool Support - A Usability Study
ER  - 
TY  - CONF
AB  - The security of Industrial Control Systems is relevant both for reliable production system operations and for high-quality throughput in terms of manufactured products. Security measures are designed, operated and maintained by different roles along product and production system lifecycles. Defense-in-Depth as a paradigm builds upon the assumption that breaches are unavoidable. The paper at hand provides an analysis of roles, corresponding Human Factors and their relevance for data theft and sabotage attacks. The resulting taxonomy is reflected by an example related to Additive Manufacturing. The results assist in both designing and redesigning Industrial Control Systems as part of an entire production system so that Defense-in-Depth with regard to Human Factors is built in by design.
AU  - Pottebaum, Jens
AU  - Rossel, Jost
AU  - Somorovsky, Juraj
AU  - Acar, Yasemin
AU  - Fahr, René
AU  - Arias Cabarcos, Patricia
AU  - Bodden, Eric
AU  - Gräßler, Iris
ID  - 46500
KW  - Defense-in-Depth
KW  - Human Factors
KW  - Production Engineering
KW  - Product Design
KW  - Systems Engineering
T2  - 2023 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW)
TI  - Re-Envisioning Industrial Control Systems Security by Considering Human Factors as a Core Element of Defense-in-Depth
ER  - 
TY  - CONF
AU  - Shivarpatna Venkatesh, Ashwin Prasad
AU  - Wang, Jiawei
AU  - Li, Li
AU  - Bodden, Eric
ID  - 41813
T2  - IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)
TI  - Enhancing Comprehension and Navigation in Jupyter Notebooks with Static Analysis
ER  - 
TY  - CONF
AU  - Karakaya, Kadiray
AU  - Bodden, Eric
ID  - 45312
T2  - 2023 IEEE Conference on Software Testing, Verification and Validation (ICST)
TI  - Two Sparsification Strategies for Accelerating Demand-Driven Pointer Analysis
ER  - 
TY  - CONF
AB  - Many Android applications collect data from users. When they do, they must protect this collected data according to the current legal frameworks. Such data protection has become even more important since the European Union rolled out the General Data Protection Regulation (GDPR). App developers have limited tool support to reason about data protection throughout their app development process. Although many Android applications state a privacy policy, privacy policy compliance checks are currently manual, expensive, and prone to error. One of the major challenges in privacy audits is the significant gap between legal privacy statements (in English text) and technical measures that Android apps use to protect their users' privacy. In this thesis, we will explore to what extent we can use static analysis to answer important questions regarding data protection. Our main goal is to design a tool-based approach that aids app developers and auditors in ensuring data protection in Android applications, based on automated static program analysis.
AU  - Khedkar, Mugdha
ID  - 44146
KW  - static analysis
KW  - data protection and privacy
KW  - GDPR compliance
T2  - 2023 IEEE/ACM 45th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion)
CY  - Melbourne, Australia
PY  - 2023
SP  - 197
EP  - 199
TI  - Static Analysis for Android GDPR Compliance Assurance
ER  - 
TY  - CONF
AU  - Luo, Linghui
AU  - Piskachev, Goran
AU  - Krishnamurthy, Ranjith
AU  - Dolby, Julian
AU  - Schäf, Martin
AU  - Bodden, Eric
ID  - 41812
T2  - IEEE International Conference on Software Testing, Verification and Validation (ICST)
TI  - Model Generation For Java Frameworks
ER  - 
TY  - CONF
AU  - Dann, Andreas Peter
AU  - Hermann, Ben
AU  - Bodden, Eric
ID  - 35083
TI  - UpCy: Safely Updating Outdated Dependencies
ER  - 
TY  - JOUR
AB  - Encrypting data before sending it to the cloud ensures data confidentiality but requires the cloud to compute on encrypted data. Trusted execution environments, such as Intel SGX enclaves, promise to provide a secure environment in which data can be decrypted and then processed. However, vulnerabilities in the executed program give attackers ample opportunities to execute arbitrary code inside the enclave. This code can modify the dataflow of the program and leak secrets via SGX side channels. Fully homomorphic encryption would be an alternative to compute on encrypted data without data leaks. However, due to its high computational complexity, its applicability to general-purpose computing remains limited. Researchers have made several proposals for transforming programs to perform encrypted computations on less powerful encryption schemes. Yet current approaches do not support programs making control-flow decisions based on encrypted data.
We introduce the concept of dataflow authentication (DFAuth) to enable such programs. DFAuth prevents an adversary from arbitrarily deviating from the dataflow of a program. Our technique hence offers protections against the side-channel attacks described previously. We implemented two flavors of DFAuth, a Java bytecode-to-bytecode compiler, and an SGX enclave running a small and program-independent trusted code base. We applied DFAuth to a neural network performing machine learning on sensitive medical data and a smart charging scheduler for electric vehicles. Our transformation yields a neural network with encrypted weights, which can be evaluated on encrypted inputs in 12.55 ms. Our protected scheduler is capable of updating the encrypted charging plan in approximately 1.06 seconds.
AU  - Fischer, Andreas
AU  - Fuhry, Benny
AU  - Kußmaul, Jörn
AU  - Janneck, Jonas
AU  - Kerschbaum, Florian
AU  - Bodden, Eric
ID  - 31844
IS  - 3
JF  - ACM Transactions on Privacy and Security
KW  - Safety
KW  - Risk
KW  - Reliability and Quality
KW  - General Computer Science
SN  - 2471-2566
TI  - Computation on Encrypted Data Using Dataflow Authentication
VL  - 25
ER  - 
TY  - GEN
AB  - Context: Cryptographic APIs are often misused in real-world applications. Therefore, many cryptographic API misuse detection tools have been introduced. However, there exists no established reference benchmark for a fair and comprehensive comparison and evaluation of these tools. While there are benchmarks, they often only address a subset of the domain or were only used to evaluate a subset of existing misuse detection tools. Objective: To fairly compare cryptographic API misuse detection tools and to drive future development in this domain, we will devise such a benchmark. Openness and transparency in the generation process are key factors to fairly generate and establish the needed benchmark. Method: We propose an approach where we derive the benchmark generation methodology from the literature which consists of general best practices in benchmarking and domain-specific benchmark generation. A part of this methodology is transparency and openness of the generation process, which is achieved by pre-registering this work. Based on our methodology we design CamBench, a fair "Cryptographic API Misuse Detection Tool Benchmark Suite". We will implement the first version of CamBench limiting the domain to Java, the JCA, and static analyses. Finally, we will use CamBench to compare current misuse detection tools and compare CamBench to related benchmarks of its domain.
AU  - Schlichtig, Michael
AU  - Wickert, Anna-Katharina
AU  - Krüger, Stefan
AU  - Bodden, Eric
AU  - Mezini, Mira
ID  - 32409
KW  - cryptography
KW  - benchmark
KW  - API misuse
KW  - static analysis
TI  - CamBench -- Cryptographic API Misuse Detection Tool Benchmark Suite
ER  - 
TY  - CONF
AB  - Static analysis tools support developers in detecting potential coding issues, such as bugs or vulnerabilities. Research on static analysis emphasizes its technical challenges but also mentions severe usability shortcomings. These shortcomings hinder the adoption of static analysis tools, and in some cases, user dissatisfaction even leads to tool abandonment.
To comprehensively assess the current state of the art, this paper presents the first systematic usability evaluation of a wide range of static analysis tools. We derived a set of 36 relevant criteria from the scientific literature and gathered a collection of 46 static analysis tools complying with our inclusion and exclusion criteria - a representative set of mainly non-proprietary tools. Then, we evaluated how well these tools fulfill the aforementioned criteria.
The evaluation shows that more than half of the considered tools offer poor warning messages, while about three-quarters of the tools provide hardly any fix support. Furthermore, the integration of user knowledge, which could be used to improve the handling of false positives and to tune the results for the corresponding developer, is strongly neglected. Finally, issues regarding workflow integration and specialized user interfaces are further confirmed.
These findings should prove useful in guiding and focusing further research and development in the area of user experience for static code analyses.
AU  - Nachtigall, Marcus
AU  - Schlichtig, Michael
AU  - Bodden, Eric
ID  - 32410
KW  - Automated static analysis
KW  - Software usability
SN  - 9781450393799
T2  - Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis
TI  - A Large-Scale Study of Usability Criteria Addressed by Static Analysis Tools
ER  - 
TY  - CONF
AB  - Application Programming Interfaces (APIs) are the primary mechanism that developers use to obtain access to third-party algorithms and services. Unfortunately, APIs can be misused, which can have catastrophic consequences, especially if the APIs provide security-critical functionalities like cryptography. Understanding what API misuses are, and for what reasons they are caused, is important to prevent them, e.g., with API misuse detectors. However, the definitions and names used for API misuses and related terms vary widely in the literature. This paper addresses the problem of scattered knowledge and definitions of API misuses by presenting a systematic literature review on the subject and introducing FUM, a novel Framework for API Usage constraint and Misuse classification. The literature review revealed that API misuses are violations of API usage constraints. To capture this, we provide unified definitions and use them to derive FUM. To assess the extent to which FUM aids in determining and guiding the improvement of an API misuse detector's capabilities, we performed a case study on CogniCrypt, a state-of-the-art misuse detector for cryptographic APIs. The study showed that FUM can be used to properly assess CogniCrypt's capabilities, identify weaknesses, and assist in deriving mitigations and improvements. More generally, FUM also appears able to aid the development and improvement of misuse detection tools.
AU  - Schlichtig, Michael
AU  - Sassalla, Steffen
AU  - Narasimhan, Krishna
AU  - Bodden, Eric
ID  - 31133
KW  - API misuses
KW  - API usage constraints
KW  - classification framework
KW  - API misuse detection
KW  - static analysis
T2  - 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)
TI  - FUM - A Framework for API Usage constraint and Misuse Classification
ER  - 
TY  - CONF
AU  - Pasic, Faruk
AU  - Becker, Matthias
ID  - 34057
T2  - 2022 IEEE 27th International Conference on Emerging Technologies and Factory Automation (ETFA)
TI  - Domain-specific Language for Condition Monitoring Software Development
ER  - 
TY  - JOUR
AB  - Many critical codebases are written in C, and most of them use preprocessor directives to encode variability, effectively encoding software product lines. These preprocessor directives, however, challenge any static code analysis. SPLlift, a previously presented approach for analyzing software product lines, is limited to Java programs that use a rather simple feature encoding and to analysis problems with a finite and ideally small domain. Other approaches that allow the analysis of real-world C software product lines use special-purpose analyses, preventing the reuse of existing analysis infrastructures and ignoring the progress made by the static analysis community. This work presents VarAlyzer, a novel static analysis approach for software product lines. VarAlyzer first transforms preprocessor constructs to plain C while preserving their variability and semantics. It then solves any given distributive analysis problem on transformed product lines in a variability-aware manner. VarAlyzer’s analysis results are annotated with feature constraints that encode in which configurations each result holds. Our experiments with 95 compilation units of OpenSSL show that applying VarAlyzer enables one to conduct inter-procedural, flow-, field- and context-sensitive data-flow analyses on entire product lines for the first time, outperforming the product-based approach for highly-configurable systems.
AU  - Schubert, Philipp
AU  - Gazzillo, Paul
AU  - Patterson, Zach
AU  - Braha, Julian
AU  - Schiebel, Fabian
AU  - Hermann, Ben
AU  - Wei, Shiyi
AU  - Bodden, Eric
ID  - 30511
IS  - 1
JF  - Automated Software Engineering
KW  - inter-procedural static analysis
KW  - software product lines
KW  - preprocessor
KW  - LLVM
KW  - C/C++
SN  - 0928-8910
TI  - Static data-flow analysis for software product lines in C
VL  - 29
ER  - 
TY  - JOUR
AB  - Nowadays, an increasing number of applications use deserialization. This technique, based on rebuilding the instance of objects from serialized byte streams, can be dangerous since it can open the application to attacks such as remote code execution (RCE) if the data to deserialize originates from an untrusted source. Deserialization vulnerabilities are so critical that they are in OWASP’s list of top 10 security risks for web applications. This is mainly caused by faults in the development process of applications and by flaws in their dependencies, i.e., flaws in the libraries used by these applications. No previous work has studied deserialization attacks in-depth: How are they performed? How are weaknesses introduced and patched? And for how long are vulnerabilities present in the codebase? To yield a deeper understanding of this important kind of vulnerability, we perform two main analyses: one on attack gadgets, i.e., exploitable pieces of code, present in Java libraries, and one on vulnerabilities present in Java applications. For the first analysis, we conduct an exploratory large-scale study by running 256 515 experiments in which we vary the versions of libraries for each of the 19 publicly available exploits. Such attacks rely on a combination of gadgets present in one or multiple Java libraries. A gadget is a method that uses objects or fields that can be attacker-controlled. Our goal is to precisely identify library versions containing gadgets and to understand how gadgets have been introduced and how they have been patched. We observe that the modification of one innocent-looking detail in a class – such as making it public – can already introduce a gadget. Furthermore, we noticed that among the studied libraries, 37.5% are not patched, leaving gadgets available for future attacks.
For the second analysis, we manually analyze 104 deserialization vulnerability CVEs to understand how vulnerabilities are introduced and patched in real-life Java applications. Results indicate that the vulnerabilities are not always completely patched or that a workaround solution is proposed. With a workaround solution, applications are still vulnerable since the code itself is unchanged.
AU  - Sayar, Imen
AU  - Bartel, Alexandre
AU  - Bodden, Eric
AU  - Le Traon, Yves
ID  - 33835
JF  - ACM Transactions on Software Engineering and Methodology
KW  - Software
SN  - 1049-331X
TI  - An In-depth Study of Java Deserialization Remote-Code Execution Exploits and Vulnerabilities
ER  - 
TY  - JOUR
AU  - Piskachev, Goran
AU  - Späth, Johannes
AU  - Budde, Ingo
AU  - Bodden, Eric
ID  - 33836
IS  - 5
JF  - Empirical Software Engineering
TI  - Fluently specifying taint-flow queries with fluentTQL
VL  - 27
ER  - 
TY  - CONF
AU  - Krishnamurthy, Ranjith
AU  - Piskachev, Goran
AU  - Bodden, Eric
ID  - 33838
TI  - To what extent can we analyze Kotlin programs using existing Java taint analysis tools?
ER  - 
TY  - CONF
AU  - Piskachev, Goran
AU  - Dziwok, Stefan
AU  - Koch, Thorsten
AU  - Merschjohann, Sven
AU  - Bodden, Eric
ID  - 33837
TI  - How far are German companies in improving security through static program analysis tools?
ER  - 
TY  - GEN
AB  - Recent studies have revealed that 87 % to 96 % of the Android apps using cryptographic APIs have a misuse which may cause security vulnerabilities. As previous studies did not conduct a qualitative examination of the validity and severity of the findings, our objective was to understand the findings in more depth. We analyzed a set of 936 open-source Java applications for cryptographic misuses. Our study reveals that 88.10 % of the analyzed applications fail to use cryptographic APIs securely. Through our manual analysis of a random sample, we gained new insights into effective false positives. For example, every fourth misuse of the frequently misused JCA class MessageDigest is an effective false positive due to its occurrence in a non-security context. As we wanted to gain deeper insights into the security implications of these misuses, we created an extensive vulnerability model for cryptographic API misuses. Our model includes previously undiscussed attacks in the context of cryptographic APIs such as DoS attacks. This model reveals that nearly half of the misuses are of high severity, e.g., hard-coded credentials and potential Man-in-the-Middle attacks.
AU  - Wickert, Anna-Katharina
AU  - Baumgärtner, Lars
AU  - Schlichtig, Michael
AU  - Mezini, Mira
ID  - 33959
TI  - To Fix or Not to Fix: A Critical Study of Crypto-misuses in the Wild
ER  - 
TY  - JOUR
AU  - Massacci, Fabio
AU  - Sabetta, Antonino
AU  - Mirkovic, Jelena
AU  - Murray, Toby
AU  - Okhravi, Hamed
AU  - Mannan, Mohammad
AU  - Rocha, Anderson
AU  - Bodden, Eric
AU  - Geer, Daniel E.
ID  - 53952
IS  - 5
JF  - IEEE Security & Privacy
SN  - 1540-7993
TI  - “Free” as in Freedom to Protest?
VL  - 20
ER  - 
TY  - JOUR
AB  - Due to the lack of established real-world benchmark suites for static taint analyses of Android applications, evaluations of these analyses are often restricted and hard to compare. Even in evaluations that do use real-world apps, details about the ground truth in those apps are rarely documented, which makes it difficult to compare and reproduce the results. To push Android taint analysis research forward, this paper thus recommends criteria for constructing real-world benchmark suites for this specific domain, and presents TaintBench, the first real-world malware benchmark suite with documented taint flows. TaintBench benchmark apps include taint flows with complex structures and address static challenges that are commonly agreed on by the community. Together with the TaintBench suite, we introduce the TaintBench framework, whose goal is to simplify real-world benchmarking of Android taint analyses. First, a usability test shows that the framework improves experts’ performance and perceived usability when documenting and inspecting taint flows. Second, experiments using TaintBench reveal new insights for the taint analysis tools Amandroid and FlowDroid: (i) They are less effective on real-world malware apps than on synthetic benchmark apps. (ii) Predefined lists of sources and sinks heavily impact the tools’ accuracy. (iii) Surprisingly, up-to-date versions of both tools are less accurate than their predecessors.
AU  - Luo, Linghui
AU  - Pauck, Felix
AU  - Piskachev, Goran
AU  - Benz, Manuel
AU  - Pashchenko, Ivan
AU  - Mory, Martin
AU  - Bodden, Eric
AU  - Hermann, Ben
AU  - Massacci, Fabio
ID  - 27045
JF  - Empirical Software Engineering
SN  - 1382-3256
TI  - TaintBench: Automatic real-world malware benchmarking of Android taint analyses
ER  - 
TY  - THES
AU  - Luo, Linghui
ID  - 27158
TI  - Improving Real-World Applicability of Static Taint Analysis
ER  - 
TY  - JOUR
AU  - Stockmann, Lars
AU  - Laux, Sven
AU  - Bodden, Eric
ID  - 21595
JF  - Journal of Automotive Software Engineering
SN  - 2589-2258
TI  - Using Architectural Runtime Verification for Offline Data Analysis
ER  - 
TY  - THES
AU  - Fischer, Andreas
ID  - 21596
TI  - Computing on Encrypted Data using Trusted Execution Environments
ER  - 
TY  - JOUR
AU  - Holzinger, Philipp
AU  - Bodden, Eric
ID  - 21597
JF  - International Symposium on Advanced Security on Software and Systems (ASSS)
TI  - A Systematic Hardening of Java's Information Hiding
ER  - 
TY  - JOUR
AU  - Bonifacio, Rodrigo
AU  - Krüger, Stefan
AU  - Narasimhan, Krishna
AU  - Bodden, Eric
AU  - Mezini, Mira
ID  - 21599
JF  - European Conference on Object-Oriented Programming (ECOOP)
TI  - Dealing with Variability in API Misuse Specification
ER  - 
TY  - CONF
AU  - Kummita, Sriteja
AU  - Piskachev, Goran
AU  - Späth, Johannes
AU  - Bodden, Eric
ID  - 23374
T2  - 2021 International Conference on Code Quality (ICCQ)
TI  - Qualitative and Quantitative Analysis of Callgraph Algorithms for Python
ER  - 
TY  - CONF
AU  - Karakaya, Kadiray
AU  - Bodden, Eric
ID  - 30084
T2  - 2021 IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM)
TI  - SootFX: A Static Code Feature Extraction Tool for Java and Android
ER  - 
TY  - CONF
AB  - Static analysis is used to automatically detect bugs and security breaches, and aids compiler optimization. Whole-program analysis (WPA) can yield high precision, however causes long analysis times and thus does not match common software-development workflows, making it often impractical to use for large, real-world applications. This paper thus presents the design and implementation of ModAlyzer, a novel static-analysis approach that aims at accelerating whole-program analysis by making the analysis modular and compositional. It shows how to compute lossless, persisted summaries for callgraph, points-to and data-flow information, and it reports under which circumstances this function-level compositional analysis outperforms WPA. We implemented ModAlyzer as an extension to LLVM and PhASAR, and applied it to 12 real-world C and C++ applications. At analysis time, ModAlyzer modularly and losslessly summarizes the analysis effect of the library code those applications share, hence avoiding its repeated re-analysis. The experimental results show that the reuse of these summaries can save, on average, 72% of analysis time over WPA. Moreover, because it is lossless, the module-wise analysis fully retains precision and recall. Surprisingly, as our results show, it sometimes even yields precision superior to WPA. The initial summary generation, on average, takes about 3.67 times as long as WPA.
AU  - Schubert, Philipp
AU  - Hermann, Ben
AU  - Bodden, Eric
ID  - 21598
T2  - European Conference on Object-Oriented Programming (ECOOP)
TI  - Lossless, Persisted Summarization of Static Callgraph, Points-To and Data-Flow Analysis
ER  - 
TY  - CONF
AU  - Piskachev, Goran
AU  - Krishnamurthy, Ranjith
AU  - Bodden, Eric
ID  - 26407
T2  - 2021 IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM)
TI  - SecuCheck: Engineering configurable taint analysis for software developers
ER  - 
TY  - CONF
AU  - Luo, Linghui
AU  - Schäf, Martin
AU  - Sanchez, Daniel
AU  - Bodden, Eric
ID  - 22463
T2  - Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
TI  - IDE Support for Cloud-Based Static Analyses
ER  -