A study of privacy-related data collected by Android apps

M. Khedkar, A. Kumar Mondal, E. Bodden, Automated Software Engineering 33 (2026).

Journal Article | Published | English
Author
Mugdha Khedkar, Ambuj Kumar Mondal, Eric Bodden
Abstract
Many Android apps collect data from users, and the European Union’s General Data Protection Regulation (GDPR) mandates clear disclosures of such data collection. However, apps often use third-party code, complicating accurate disclosures. This paper investigates how accurately current Android apps fulfill these requirements. In this work, we present a multi-layered definition of privacy-related data to correctly report data collection in Android apps. We further create a dataset of privacy-sensitive data classes that may be used as input by an Android app. This dataset takes into account data collected both through the user interface and system APIs. Based on this, we implement a semi-automated prototype that detects and labels privacy-related data collected by a given Android app. We manually examine the data safety sections of 70 Android apps to observe how data collection is reported, identifying instances of over- and under-reporting. We compare our prototype’s results with the data safety sections of 20 apps, revealing reporting discrepancies. Using the results from two Messaging and Social Media apps (Signal and Instagram), we discuss how app developers under-report and over-report data collection, respectively, and identify inaccurately reported data categories. A broader study of 7,500 Android apps reveals that apps most frequently collect data that can partially identify users. Although system APIs consistently collect large amounts of privacy-related data, user interfaces exhibit more diverse data collection patterns. A more focused study on various domains of apps reveals that the largest fraction of apps collecting personal data belongs to the domain of Messaging and Social Media. Our findings show that location is collected frequently by apps, especially those from the E-commerce and Shopping domain. However, it is often under-reported in app data safety sections. Our results highlight the need for greater consistency in privacy-aware app development and reporting practices.
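The comparison between detected data collection and an app's data safety section can be pictured as a set comparison. The following is only a minimal sketch under stated assumptions, not the authors' prototype: the function name `compare_reporting`, the category labels, and the example inputs are all hypothetical and chosen for illustration.

```python
def compare_reporting(detected: set[str], declared: set[str]) -> dict[str, set[str]]:
    """Compare data categories detected in an app with those it declares.

    Hypothetical helper: `detected` stands for categories a detection tool
    found in the app, `declared` for categories listed in its data safety
    section. Neither reflects the paper's actual data model.
    """
    return {
        # Collected according to the analysis, but not declared.
        "under_reported": detected - declared,
        # Declared, but not found by the analysis.
        "over_reported": declared - detected,
        # Found by the analysis and declared.
        "consistent": detected & declared,
    }


# Illustrative inputs only; these are not results from the study.
detected = {"location", "email address", "device ID"}
declared = {"email address", "photos"}

result = compare_reporting(detected, declared)
for label, categories in result.items():
    print(label, sorted(categories))
```

A set-based comparison like this treats categories as exact matches. A real comparison would also need a mapping between the categories a tool detects and the wording used in data safety sections, which this sketch does not attempt.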
Publishing Year
2026
Journal Title
Automated Software Engineering
Volume
33
Issue
2
Article Number
45
LibreCat-ID

Cite this

Khedkar M, Kumar Mondal A, Bodden E. A study of privacy-related data collected by Android apps. Automated Software Engineering. 2026;33(2). doi:10.1007/s10515-025-00589-3
Khedkar, M., Kumar Mondal, A., & Bodden, E. (2026). A study of privacy-related data collected by Android apps. Automated Software Engineering, 33(2), Article 45. https://doi.org/10.1007/s10515-025-00589-3
@article{Khedkar_Kumar Mondal_Bodden_2026, title={A study of privacy-related data collected by Android apps}, volume={33}, DOI={10.1007/s10515-025-00589-3}, number={2}, journal={Automated Software Engineering}, publisher={Springer Science and Business Media LLC}, author={Khedkar, Mugdha and Kumar Mondal, Ambuj and Bodden, Eric}, year={2026} }
Khedkar, Mugdha, Ambuj Kumar Mondal, and Eric Bodden. “A Study of Privacy-Related Data Collected by Android Apps.” Automated Software Engineering 33, no. 2 (2026). https://doi.org/10.1007/s10515-025-00589-3.
M. Khedkar, A. Kumar Mondal, and E. Bodden, “A study of privacy-related data collected by Android apps,” Automated Software Engineering, vol. 33, no. 2, Art. no. 45, 2026, doi: 10.1007/s10515-025-00589-3.
Khedkar, Mugdha, et al. “A Study of Privacy-Related Data Collected by Android Apps.” Automated Software Engineering, vol. 33, no. 2, 45, Springer Science and Business Media LLC, 2026, doi:10.1007/s10515-025-00589-3.
