The Mobility Laws of Location-Based Games: Carat Dataset

This dataset is published as part of the article:

Leonardo Tonetto, Eemil Lagerspetz, Aaron Yi Ding, Jörg Ott, Sasu Tarkoma, Petteri Nurmi: The Mobility Laws of Location-Based Games. Host publication information TBD, 2020.

This dataset is released to the public for research purposes only. Commercial use is strictly prohibited, as is any attempt to discover the identities of participating users. Works that use the dataset must cite the article "The Mobility Laws of Location-Based Games" as the source of the data, and must also cite the original Carat article (also available online):

A. J. Oliner, A. P. Iyer, I. Stoica, E. Lagerspetz, and S. Tarkoma.
Carat: Collaborative Energy Diagnosis for Mobile Devices.
In Proceedings of the 11th ACM Conference on Embedded Networked Sensor Systems, (10 pages), 2013, ACM.

By downloading and using the dataset, you agree to the above conditions. To download the dataset, use the link at the bottom of this page. The zip archive containing the dataset is password protected; the password is


How the Dataset was Collected

The dataset consists of data collected from volunteers who installed the Carat energy awareness application ( ). The app was developed in a collaboration between the University of Helsinki and the University of California, Berkeley. The app and all of its servers and data analytics services have been managed by the University of Helsinki since the end of 2013.

The app recorded application usage and battery level information every time the battery level changed by 1%, as far as the mobile operating system allowed. The app focused on energy improvement: it did not sample data at regular intervals, and it sent data to our servers only when the user opened the app. As a result, the data is sparse, and long time periods may be missing for some users.

Because the app focused on energy use, the users who installed it likely did so to solve energy issues on their device, were interested in contributing to research, and/or were keen followers of technology news. This introduces a possible selection bias, which must be considered when using the dataset. However, the dataset as a whole is very diverse and contains users from many countries, which may mitigate the bias. See also:

What the Dataset Contains

The dataset contains anonymized records of smartphone usage, with fields relevant to the study of Pokemon GO usage and its effects on mobility. The dataset is formatted as a semicolon-separated CSV file with the following fields:

  1. Anonymized user ID (number from 0 to 4984)
  2. Timestamp when the sample was recorded, in Unix Timestamp format (seconds since 00:00:00 Jan 1 1970 UTC). Recorded on the device, dependent on the accuracy of the device's clock.
  3. The number of metres moved since the last sample
  4. Time zone name in Continent/City format (see Java time zone IDs)
  5. Two-letter country code, e.g. fi or us
  6. A binary field, 1 if Pokemon GO was running at the time, 0 otherwise
  7. A binary field, 1 if Clash Royale was running at the time, 0 otherwise
  8. A text field with the version name of Clash Royale if installed, "-" otherwise. Example: 0.33
  9. A text field with the version name of Pokemon GO if installed, "-" otherwise. Example: 0.33
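
The field list above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: the names Sample and read_samples, as well as the attribute names, are hypothetical and chosen only for illustration. It assumes each data row has exactly nine semicolon-separated fields in the order listed, and it skips any row whose first field is not a number, since this page does not say whether the file has a header line. The example rows are made up and are not taken from the dataset.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterator, Optional, TextIO


@dataclass
class Sample:
    """One row of the dataset; attribute names are illustrative, not official."""
    user_id: int
    timestamp: datetime                  # device clock, converted to aware UTC
    metres_moved: float                  # metres moved since the last sample
    time_zone: str                       # Continent/City
    country: str                         # two-letter code, lower-cased here
    pokemon_go_running: bool
    clash_royale_running: bool
    clash_royale_version: Optional[str]  # None when the field is "-"
    pokemon_go_version: Optional[str]    # None when the field is "-"


def _version(field: str) -> Optional[str]:
    """Map the "-" placeholder (game not installed) to None."""
    field = field.strip()
    return None if field == "-" else field


def read_samples(handle: TextIO) -> Iterator[Sample]:
    """Yield Sample objects from a semicolon-separated text stream."""
    for row in csv.reader(handle, delimiter=";"):
        # Assumption: skip anything that is not a 9-field row starting
        # with a numeric user ID (for example a possible header line).
        if len(row) != 9 or not row[0].strip().isdigit():
            continue
        yield Sample(
            user_id=int(row[0]),
            timestamp=datetime.fromtimestamp(float(row[1]), tz=timezone.utc),
            metres_moved=float(row[2]),
            time_zone=row[3].strip(),
            country=row[4].strip().lower(),
            pokemon_go_running=row[5].strip() == "1",
            clash_royale_running=row[6].strip() == "1",
            clash_royale_version=_version(row[7]),
            pokemon_go_version=_version(row[8]),
        )


# Made-up rows for illustration only; not taken from the dataset.
example = io.StringIO(
    "12;1469000000;35.5;Europe/Helsinki;fi;1;0;-;0.33\n"
    "12;1469000600;0;Europe/Helsinki;fi;0;0;-;0.33\n"
)
samples = list(read_samples(example))
print(len(samples), samples[0].timestamp.isoformat())
```

To read the real file, pass an open file handle instead of the in-memory example, e.g. open(path, newline="", encoding="utf-8"), where path points to the extracted CSV. Timestamps are converted to UTC here because the time zone is stored separately in field 4; since they come from the device clock, they may still be inaccurate for some samples.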

Questions regarding the dataset and its use can be addressed to .

The dataset can be downloaded here: