• Franz-Xaver Geiger
• Ivano Malavolta
• Luca Pascarella
• Fabio Palomba
• Dario Di Nucci
• Alberto Bacchelli
Empirical studies on the engineering of Android apps need to be based on open datasets and tools to allow comparisons, improve generalizability, and enable replicability. However, obtaining a good dataset is difficult, and this difficulty slows down empirical research on the topic.
In this paper, we contribute to overcoming this challenge by presenting the first self-contained, publicly available dataset weaving together spread-out data sources about real-world, open-source Android apps. Our dataset is encoded as a graph-based database and contains the following information about 8,431 real open-source Android apps: (i) metadata about their GitHub projects, (ii) Git repositories with full commit history, and (iii) metadata extracted from the Google Play store, such as app ratings and permissions. The dataset is available as Docker images to ease adoption.
Original language: English
Title of host publication: Proceedings of the 15th ACM/IEEE International Conference on Mining Software Repositories, Data Showcase
Publisher: ACM / IEEE
Number of pages: 4
State: Accepted/In press - 2 Mar 2018

ID: 36663690