Our Research Methodology

Building on our findings from our previous research, we sought to investigate the data management and sharing practices of menstruation apps with third parties beyond Facebook, as well as to assess whether some of the apps we looked at the first time around had improved their practices as they have claimed.

A screenshot of PI's Data Interception Environment


Methodology 

We looked at the most-downloaded period-tracking apps in the Google Play Store, some of which we had examined in our original research and some of which are newer apps that have since grown in popularity. These included Flo and Period Tracker by Simple Design; apps we had tested in our previous research that still exist, such as Maya and Period Tracker by GP Apps; and several apps popular in other global markets, such as WomanLog and Wocute. We also included an app that saw an uptick in downloads after the overturning of Roe v Wade and has claimed to be 'privacy-enhancing' (Stardust), and finally an open-source period tracker developed by non-profit researchers (Euki).

To conduct our research, we first ran every app through Exodus Privacy, a static analysis tool that allows anyone to check the trackers (e.g., ad trackers) and permissions (e.g., access to precise GPS location) embedded in Android apps. We entered a number of popular period-tracking apps beyond those listed above into Exodus, with the goal of narrowing down our list and selecting the apps with the greatest number of trackers and permissions enabled. This static analysis allowed us to reach our final list of apps to test: Flo, Period Tracker by Simple Design, Maya by Plackal Tech, Period Tracker by GP Apps, WomanLog, Wocute, Stardust and Euki. Note that we found no trackers or permissions enabled for Euki, but we decided to test Euki anyway for the purposes of demonstration and comparison.
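The shortlisting step can be illustrated with a minimal sketch. It assumes each static analysis report has been reduced by hand to a tracker count and a permission count, and that apps are ranked by the simple sum of the two; that ranking rule, the `StaticReport` type, the `shortlist` function and all app names and counts below are placeholders for illustration, not our actual data or an Exodus Privacy interface.

```python
from typing import NamedTuple


class StaticReport(NamedTuple):
    """Hypothetical summary of one app's static analysis report."""
    app: str
    trackers: int
    permissions: int


def shortlist(reports: list[StaticReport], limit: int) -> list[str]:
    """Return the names of the `limit` apps with the most trackers plus permissions.

    Assumption: the two counts are weighted equally; ties are broken by app name.
    """
    ranked = sorted(reports, key=lambda r: (-(r.trackers + r.permissions), r.app))
    return [r.app for r in ranked[:limit]]


# Placeholder counts, not real findings.
reports = [
    StaticReport("app_a", trackers=2, permissions=10),
    StaticReport("app_b", trackers=7, permissions=21),
    StaticReport("app_c", trackers=0, permissions=0),
    StaticReport("app_d", trackers=5, permissions=14),
]
print(shortlist(reports, limit=2))
```

A purely count-based ranking would never select an app like `app_c`, which is why an app with no trackers or permissions has to be added to the list deliberately when it is wanted as a point of comparison.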

Then, we deployed a two-pronged analysis: first, we carried out a dynamic web traffic analysis via our in-house Data Interception as a Service (DIAAS) tool; second, we compared these findings with the apps' privacy policies.

The DIAAS is a web traffic interception tool that allows us to view the requests and responses sent by apps over the web (note that the DIAAS only shows client-server interactions, not server-to-server interactions). When running an app in the DIAAS environment, we can view its web traffic, for example the web requests it sends to URLs, which can help to reveal any calls to third parties. The first step in the research involved running each app through the DIAAS to view the web requests being sent to various third parties as we interacted with the app, from the moment we set up our accounts to logging our period cycle data.
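To illustrate how intercepted request URLs can point to third parties, here is a minimal sketch that separates first-party from third-party hosts. It assumes the request URLs have already been copied out of the interception window into a list, and that the app developer's own domains are known; the function names and every domain below are hypothetical, and this is not how the DIAAS itself is implemented.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Extract the lower-cased hostname from a URL ('' if there is none)."""
    return (urlsplit(url).hostname or "").lower()


def is_first_party(host: str, first_party_domains: set[str]) -> bool:
    """True if the host is one of the developer's domains or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in first_party_domains)


def third_party_hosts(urls: list[str], first_party_domains: set[str]) -> list[str]:
    """Return the sorted, de-duplicated hosts that are not first-party."""
    hosts = {host_of(u) for u in urls}
    return sorted(h for h in hosts if h and not is_first_party(h, first_party_domains))


# Hypothetical requests observed during one session.
requests_seen = [
    "https://api.example-period-app.test/v1/cycle",
    "https://cdn.example-period-app.test/img/logo.png",
    "https://collect.example-analytics.test/event?id=1",
    "https://ads.example-adnetwork.test/bid",
]
print(third_party_hosts(requests_seen, {"example-period-app.test"}))
```

A host-based split like this is only a starting point: a host that looks first-party may still be operated by a third party, so each request and response still needs to be read manually.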

The second step, which we conducted alongside our dynamic analysis, was to compare our web traffic findings (e.g., which third parties appeared) with the information provided in the apps' privacy policies about their data management and third-party processing practices. 
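This comparison amounts to a set comparison, sketched below under the assumption that both the third parties observed in the web traffic and those named in a privacy policy have been written down by hand as lists of names. The `compare_with_policy` function and the vendor names are hypothetical placeholders.

```python
def compare_with_policy(observed: list[str], disclosed: list[str]) -> dict[str, list[str]]:
    """Compare third parties seen in traffic with those named in a privacy policy.

    Names are normalised (trimmed, lower-cased) before comparison.
    """
    observed_n = {name.strip().lower() for name in observed}
    disclosed_n = {name.strip().lower() for name in disclosed}
    return {
        "observed_but_undisclosed": sorted(observed_n - disclosed_n),
        "disclosed_but_not_observed": sorted(disclosed_n - observed_n),
        "observed_and_disclosed": sorted(observed_n & disclosed_n),
    }


# Placeholder names, not findings about any real app.
result = compare_with_policy(
    observed=["Vendor A", "vendor b"],
    disclosed=["vendor b ", "Vendor C"],
)
for category, names in result.items():
    print(category, names)
```

A third party that is disclosed but not observed is not necessarily unused: because the DIAAS only shows client-server interactions, any server-to-server sharing would not appear in the observed list.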

Definitions 

Before diving into the details of the methodology and our findings, here are some definitions of technical vocabulary we will be using throughout this report: 

Software Development Kit (SDK): a set of software tools provided to developers that can be used for building applications for specific platforms (essentially the building blocks for a software application); SDKs include documentation, APIs (below), libraries and other tools 

Application Programming Interface (API): a software intermediary that enables two software programs to communicate with each other; API documentation describes the calls and functions developers can use to send requests and receive responses across applications (e.g., a developer making an API call to its third-party cloud service provider, which responds with the infrastructure requested to power the application) 

Cloud computing: computing services like servers, storage, databases, networking, etc. delivered across the “cloud” (the Internet), generally describing objects abstracted from the underlying infrastructure 

Content delivery networks (CDNs): a network of proxy servers (intermediaries between client requests and the servers providing the requested resource) and their data centres that deliver the requested content (e.g., videos, images, web and mobile content, etc.) at a higher speed (via caching) and with the scalability necessary for complex apps 

Caching: Storing data temporarily to improve performance 

Data minimalist: collecting only the data that is needed for the functioning of the app 

Open source: non-proprietary software programmes whose source code is publicly available for anyone to use, modify, or collaborate on 

Mitmproxy: An open-source tool for analysing encrypted datastreams by sitting in the middle of a connection between a client and a server, allowing the data to be examined 

Server-side: the processing takes place on an external web server 

Client-side: the processing takes place on the user’s device 

The app setup 

1. First, we created a Google account for our research subject ('User'), which was required to download apps from the Google Play Store. 

2. Then we downloaded each app from the Play Store onto our virtual Android machine deployed through the DIAAS. 

3. We opened each app and completed their individual onboarding processes with the required personal information, the degrees of which varied across apps. Some apps required logging in with our Google or other email account, while others required providing basic information such as the user’s name and date of birth. Some of the apps did not require any personal information before directing us to the cycle dashboard. 

In setting up our user on the apps and populating their personal data and period cycle data, we took a data minimalist approach by providing the minimum amount of information needed for the proper functioning of the app. This meant we would not provide personal data like our birthday, height or weight where doing so was optional. We also would not allow in-app notifications, location-sharing or other optional requests from the app. The purpose of this approach is to showcase how much data each app requires of the user and how functional (or not functional) an app is when a user does not share everything. 

Setting up the DIAAS 

General practices we abided by when using the DIAAS to maintain a clean research environment are as follows: 

  • The mitmproxy window in the DIAAS displays the URL path on the left side, which shows where the request is being sent (e.g., the app's API, a third party website, etc.), and on the right side it shows the request and response information, such as what information in the app is being passed through (e.g., user input data about their cycle, date of birth during the onboarding questionnaire, device dimensions, device location).
  • Every time we finished a web traffic analysis session for an app, we went into Android's Settings and clicked 'Force Stop' for the specific app. We then cleared the mitmproxy window via 'File > Clear all' to maintain a clean page.
  • If we needed to restart an app with a fresh slate (such as to redo the onboarding process), we went to Android's Settings, selected the specific app and clicked on 'Clear storage', which would completely clear the app's storage and cache to facilitate a fresh start. 
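We carried out these resets through Android's Settings, as described above. For readers who would rather script them, the sketch below builds the Android Debug Bridge (adb) commands that have the same effect. It assumes adb is installed and connected to the test device, which is not something our setup relied on, and the package name is a hypothetical placeholder. The sketch only builds and prints the commands; it does not run them.

```python
import shlex


def force_stop_cmd(package: str) -> list[str]:
    """adb command that force-stops an app (the 'Force Stop' button in Settings)."""
    return ["adb", "shell", "am", "force-stop", package]


def clear_storage_cmd(package: str) -> list[str]:
    """adb command that wipes an app's data and cache ('Clear storage' in Settings)."""
    return ["adb", "shell", "pm", "clear", package]


# Hypothetical package name; substitute the app under test.
package = "com.example.periodtracker"
for cmd in (force_stop_cmd(package), clear_storage_cmd(package)):
    print(shlex.join(cmd))
```

To actually execute a command, the list could be passed to `subprocess.run(cmd, check=True)`. Clearing the mitmproxy window would still be a separate manual step.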

We developed the DIAAS in house as open-source software, and we will be releasing publicly available documentation about it in the near future.
