---
_id: '63834'
abstract:
- lang: eng
  text: "<jats:title>Abstract</jats:title><jats:p>Many
    Android apps collect data from users, and the European Union’s General Data Protection
    Regulation (GDPR) mandates clear disclosures of such data collection. However,
    apps often use third-party code, complicating accurate disclosures. This paper
    investigates how accurately current Android apps fulfill these requirements. In
    this work, we present a multi-layered definition of privacy-related data to correctly
    report data collection in Android apps. We further create a dataset of privacy-sensitive
    data classes that may be used as input by an Android app. This dataset takes into
    account data collected both through the user interface and system APIs. Based
    on this, we implement a semi-automated prototype that detects and labels privacy-related
    data collected by a given Android app. We manually examine the data safety sections
    of 70 Android apps to observe how data collection is reported, identifying instances
    of over- and under-reporting. We compare our prototype’s results with the data
    safety sections of 20 apps, revealing reporting discrepancies. Using the results
    from two Messaging and Social Media apps (Signal and Instagram), we discuss how
    app developers under-report and over-report data collection, respectively, and
    identify inaccurately reported data categories. A broader study of 7,500 Android
    apps reveals that apps most frequently collect data that can <jats:italic>partially
    identify</jats:italic> users. Although system APIs consistently
    collect large amounts of privacy-related data, user interfaces exhibit more
    diverse data collection patterns. A more focused study on various domains of apps
    reveals that the largest fraction of apps collecting personal data belong to the
    domain of <jats:italic>Messaging and Social Media</jats:italic>. Our findings
    show that location is collected frequently by apps, especially from the
    <jats:italic>E-commerce and Shopping</jats:italic> domain. However, it is often under-reported
    in app data safety sections. Our results highlight the need for greater consistency
    in privacy-aware app development and reporting practices.</jats:p>"
article_number: '45'
author:
- first_name: Mugdha
  full_name: Khedkar, Mugdha
  id: '88024'
  last_name: Khedkar
- first_name: Ambuj
  full_name: Kumar Mondal, Ambuj
  last_name: Kumar Mondal
- first_name: Eric
  full_name: Bodden, Eric
  id: '59256'
  last_name: Bodden
  orcid: 0000-0003-3470-3647
citation:
  ama: Khedkar M, Kumar Mondal A, Bodden E. A study of privacy-related data collected
    by Android apps. <i>Automated Software Engineering</i>. 2026;33(2). doi:<a href="https://doi.org/10.1007/s10515-025-00589-3">10.1007/s10515-025-00589-3</a>
  apa: Khedkar, M., Kumar Mondal, A., &#38; Bodden, E. (2026). A study of privacy-related
    data collected by Android apps. <i>Automated Software Engineering</i>, <i>33</i>(2),
    Article 45. <a href="https://doi.org/10.1007/s10515-025-00589-3">https://doi.org/10.1007/s10515-025-00589-3</a>
  bibtex: '@article{Khedkar_Kumar Mondal_Bodden_2026, title={A study of privacy-related
    data collected by Android apps}, volume={33}, DOI={<a href="https://doi.org/10.1007/s10515-025-00589-3">10.1007/s10515-025-00589-3</a>},
    number={2}, journal={Automated Software Engineering}, publisher={Springer Science
    and Business Media LLC}, author={Khedkar, Mugdha and Kumar Mondal, Ambuj and Bodden,
    Eric}, year={2026} }'
  chicago: Khedkar, Mugdha, Ambuj Kumar Mondal, and Eric Bodden. “A Study of Privacy-Related
    Data Collected by Android Apps.” <i>Automated Software Engineering</i> 33, no.
    2 (2026). <a href="https://doi.org/10.1007/s10515-025-00589-3">https://doi.org/10.1007/s10515-025-00589-3</a>.
  ieee: 'M. Khedkar, A. Kumar Mondal, and E. Bodden, “A study of privacy-related data
    collected by Android apps,” <i>Automated Software Engineering</i>, vol. 33, no.
    2, Art. no. 45, 2026, doi: <a href="https://doi.org/10.1007/s10515-025-00589-3">10.1007/s10515-025-00589-3</a>.'
  mla: Khedkar, Mugdha, et al. “A Study of Privacy-Related Data Collected by Android
    Apps.” <i>Automated Software Engineering</i>, vol. 33, no. 2, 45, Springer Science
    and Business Media LLC, 2026, doi:<a href="https://doi.org/10.1007/s10515-025-00589-3">10.1007/s10515-025-00589-3</a>.
  short: M. Khedkar, A. Kumar Mondal, E. Bodden, Automated Software Engineering 33
    (2026).
date_created: 2026-02-02T12:36:22Z
date_updated: 2026-02-11T18:33:12Z
ddc:
- '006'
department:
- _id: '76'
doi: 10.1007/s10515-025-00589-3
file:
- access_level: closed
  content_type: application/pdf
  creator: khedkarm
  date_created: 2026-02-11T18:32:52Z
  date_updated: 2026-02-11T18:32:52Z
  file_id: '64127'
  file_name: s10515-025-00589-3-1.pdf
  file_size: 3363479
  relation: main_file
  success: 1
file_date_updated: 2026-02-11T18:32:52Z
has_accepted_license: '1'
intvolume: '        33'
issue: '2'
language:
- iso: eng
publication: Automated Software Engineering
publication_identifier:
  issn:
  - 0928-8910
  - 1573-7535
publication_status: published
publisher: Springer Science and Business Media LLC
status: public
title: A study of privacy-related data collected by Android apps
type: journal_article
user_id: '88024'
volume: 33
year: '2026'
...
---
_id: '30511'
abstract:
- lang: eng
  text: <jats:title>Abstract</jats:title><jats:p>Many critical codebases are written
    in C, and most of them use preprocessor directives to encode variability, effectively
    encoding software product lines. These preprocessor directives, however, challenge
    any static code analysis. SPLlift, a previously presented approach for analyzing
    software product lines, is limited to Java programs that use a rather simple feature
    encoding and to analysis problems with a finite and ideally small domain. Other
    approaches that allow the analysis of real-world C software product lines use
    special-purpose analyses, preventing the reuse of existing analysis infrastructures
    and ignoring the progress made by the static analysis community. This work presents
    <jats:sc>VarAlyzer</jats:sc>, a novel static analysis approach for software product
    lines. <jats:sc>VarAlyzer</jats:sc> first transforms preprocessor constructs to
    plain C while preserving their variability and semantics. It then solves any given
    distributive analysis problem on transformed product lines in a variability-aware
    manner. <jats:sc>VarAlyzer</jats:sc>’s analysis results are annotated with feature
    constraints that encode in which configurations each result holds. Our experiments
    with 95 compilation units of OpenSSL show that applying <jats:sc>VarAlyzer</jats:sc>
    enables one to conduct inter-procedural, flow-, field- and context-sensitive data-flow
    analyses on entire product lines for the first time, outperforming the product-based
    approach for highly-configurable systems.</jats:p>
alternative_title:
- Revoking the preprocessor’s special role
article_number: '35'
article_type: original
author:
- first_name: Philipp
  full_name: Schubert, Philipp
  id: '60543'
  last_name: Schubert
  orcid: 0000-0002-8674-1859
- first_name: Paul
  full_name: Gazzillo, Paul
  last_name: Gazzillo
- first_name: Zach
  full_name: Patterson, Zach
  last_name: Patterson
- first_name: Julian
  full_name: Braha, Julian
  last_name: Braha
- first_name: Fabian Benedikt
  full_name: Schiebel, Fabian Benedikt
  id: '55745'
  last_name: Schiebel
  orcid: 0009-0008-6867-9802
- first_name: Ben
  full_name: Hermann, Ben
  id: '66173'
  last_name: Hermann
  orcid: 0000-0001-9848-2017
- first_name: Shiyi
  full_name: Wei, Shiyi
  last_name: Wei
- first_name: Eric
  full_name: Bodden, Eric
  id: '59256'
  last_name: Bodden
  orcid: 0000-0003-3470-3647
citation:
  ama: Schubert P, Gazzillo P, Patterson Z, et al. Static data-flow analysis for software
    product lines in C. <i>Automated Software Engineering</i>. 2022;29(1). doi:<a
    href="https://doi.org/10.1007/s10515-022-00333-1">10.1007/s10515-022-00333-1</a>
  apa: Schubert, P., Gazzillo, P., Patterson, Z., Braha, J., Schiebel, F. B., Hermann,
    B., Wei, S., &#38; Bodden, E. (2022). Static data-flow analysis for software product
    lines in C. <i>Automated Software Engineering</i>, <i>29</i>(1), Article 35. <a
    href="https://doi.org/10.1007/s10515-022-00333-1">https://doi.org/10.1007/s10515-022-00333-1</a>
  bibtex: '@article{Schubert_Gazzillo_Patterson_Braha_Schiebel_Hermann_Wei_Bodden_2022,
    title={Static data-flow analysis for software product lines in C}, volume={29},
    DOI={<a href="https://doi.org/10.1007/s10515-022-00333-1">10.1007/s10515-022-00333-1</a>},
    number={1}, journal={Automated Software Engineering}, publisher={Springer Science
    and Business Media LLC}, author={Schubert, Philipp and Gazzillo, Paul and Patterson,
    Zach and Braha, Julian and Schiebel, Fabian Benedikt and Hermann, Ben and Wei,
    Shiyi and Bodden, Eric}, year={2022} }'
  chicago: Schubert, Philipp, Paul Gazzillo, Zach Patterson, Julian Braha, Fabian
    Benedikt Schiebel, Ben Hermann, Shiyi Wei, and Eric Bodden. “Static Data-Flow
    Analysis for Software Product Lines in C.” <i>Automated Software Engineering</i>
    29, no. 1 (2022). <a href="https://doi.org/10.1007/s10515-022-00333-1">https://doi.org/10.1007/s10515-022-00333-1</a>.
  ieee: 'P. Schubert <i>et al.</i>, “Static data-flow analysis for software product
    lines in C,” <i>Automated Software Engineering</i>, vol. 29, no. 1, Art. no. 35,
    2022, doi: <a href="https://doi.org/10.1007/s10515-022-00333-1">10.1007/s10515-022-00333-1</a>.'
  mla: Schubert, Philipp, et al. “Static Data-Flow Analysis for Software Product Lines
    in C.” <i>Automated Software Engineering</i>, vol. 29, no. 1, 35, Springer Science
    and Business Media LLC, 2022, doi:<a href="https://doi.org/10.1007/s10515-022-00333-1">10.1007/s10515-022-00333-1</a>.
  short: P. Schubert, P. Gazzillo, Z. Patterson, J. Braha, F.B. Schiebel, B. Hermann,
    S. Wei, E. Bodden, Automated Software Engineering 29 (2022).
date_created: 2022-03-25T07:41:26Z
date_updated: 2025-12-04T10:42:38Z
department:
- _id: '76'
doi: 10.1007/s10515-022-00333-1
intvolume: '        29'
issue: '1'
keyword:
- inter-procedural static analysis
- software product lines
- preprocessor
- LLVM
- C/C++
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://link.springer.com/article/10.1007/s10515-022-00333-1
oa: '1'
project:
- _id: '12'
  name: 'SFB 901 - B4: SFB 901 - Subproject B4'
- _id: '3'
  name: 'SFB 901 - B: SFB 901 - Project Area B'
- _id: '1'
  name: 'SFB 901: SFB 901'
publication: Automated Software Engineering
publication_identifier:
  issn:
  - 0928-8910
  - 1573-7535
publication_status: published
publisher: Springer Science and Business Media LLC
status: public
title: Static data-flow analysis for software product lines in C
type: journal_article
user_id: '15249'
volume: 29
year: '2022'
...
