The IDAC investigative team has found a prevalent and alarming trend among mobile applications worldwide: Android apps are secretly inferring users’ location without using the device’s location services. Collecting user location data raises privacy concerns, including how and why that data is used.
There are multiple ways to infer a phone’s location without using location services. Here, we focus on two methods we frequently observed: WiFi scanning and the use of unprotected side channels. We note that we were not able to test these behaviors on Android 10 devices, which, as of February 2020, made up less than 10 percent of the market.
1. WiFi Scanning
Our investigation revealed that apps were secretly obtaining the BSSID (basic service set identifier) of users’ WiFi networks. The BSSID is the MAC address of a wireless access point, such as a nearby WiFi router, and it can serve as a surrogate for location because WiFi router maps allow router locations to be looked up by that address. Mobile apps can determine the MAC address of the router to which the device is currently or was recently connected, and use these router location maps to determine, with relative precision, where the device is located.
This behavior was not observed on Android 10 apps because WiFi scanning is no longer an option for developers to covertly obtain location data, as a result of Android’s permission changes.
In order to obtain even more precise location estimates, apps may perform a calculation known as multilateration, in which they combine the known locations of each of the networks with the distance to each network, using signal strength as a proxy. See the Appendix for more information on multilateration.
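To illustrate how signal strength can stand in for distance, the sketch below uses the log-distance path loss model. This model is our choice for illustration and is not necessarily what any observed app uses. The function name and the default values (a reference strength of -40 dBm at 1 metre and a path loss exponent of 3) are assumptions; real values depend on the router and the environment.

```python
def rssi_to_distance(rssi_dbm, rssi_at_1m=-40.0, path_loss_exponent=3.0):
    """Estimate the distance in metres to a network from its signal strength.

    Log-distance path loss model: d = 10 ** ((P0 - RSSI) / (10 * n)),
    where P0 is the signal strength at 1 metre and n is the path loss
    exponent. Both defaults are illustrative assumptions.
    """
    if path_loss_exponent <= 0:
        raise ValueError("path_loss_exponent must be positive")
    return 10 ** ((rssi_at_1m - rssi_dbm) / (10.0 * path_loss_exponent))


if __name__ == "__main__":
    # Weaker signals (more negative dBm) map to larger estimated distances.
    for rssi in (-40, -55, -70, -85):
        print(rssi, "dBm ->", round(rssi_to_distance(rssi), 1), "m")
```

Estimates of this kind are noisy, since walls and interference change the signal strength, which is one reason several networks are combined rather than relying on one.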
2. Unprotected Side Channels
The following are unprotected ways in which apps can gain access to the router’s MAC address, which is ultimately used to discreetly obtain the phone’s location.
- Reading the ARP Table
Apps may infer location by reading the ARP table, that is, by parsing the contents of the arp file (/proc/net/arp) on the phone. From this file, the app can retrieve the IP and MAC addresses of devices on the phone’s network, including the router. The app then uses the router’s MAC address to look up location, as described above for the BSSID. This method of obtaining location data is no longer possible on Android 10, as Google has now addressed the issue.
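The parsing step can be sketched as follows. The sample table and its addresses are invented for illustration, and the function name is hypothetical; the column layout follows the usual Linux /proc/net/arp format (IP address, HW type, Flags, HW address, Mask, Device).

```python
SAMPLE_ARP_TABLE = """\
IP address       HW type     Flags       HW address            Mask     Device
192.168.1.1      0x1         0x2         aa:bb:cc:dd:ee:ff     *        wlan0
192.168.1.23     0x1         0x0         00:00:00:00:00:00     *        wlan0
"""


def parse_arp_table(text):
    """Return a list of {ip, mac, device} entries from ARP table text."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore blank or malformed rows
        ip, mac, device = fields[0], fields[3].lower(), fields[5]
        if mac == "00:00:00:00:00:00":
            continue  # incomplete entry: no hardware address resolved
        entries.append({"ip": ip, "mac": mac, "device": device})
    return entries


if __name__ == "__main__":
    # On a device where the file is readable, the same function could be
    # applied to open("/proc/net/arp").read() instead of the sample text.
    for entry in parse_arp_table(SAMPLE_ARP_TABLE):
        print(entry["ip"], entry["mac"], entry["device"])
```

No permission prompt is involved in reading such a file, which is what makes it a side channel rather than a use of location services.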
- Using UNIX’s Input/Output Controller
Using UNIX’s input/output control system call (ioctl), apps are able to list the network interfaces and read their details, including the MAC address of the phone. As mentioned previously, having access to the MAC address allows the app to then obtain location from the BSSID. This method is not unique or specific to Android.
This side-channel comes from the Linux kernel* within Android, as part of UNIX's system calls. Limiting this side-channel can result in significant performance and security risks, so it’s not expected to be removed anytime soon.
* A kernel is the "core" part of an operating system and acts as an intermediary between the hardware and the applications.
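As a rough sketch of what such a call looks like on Linux, the example below asks the kernel for the hardware address of a named interface through ioctl. We assume the request in question is SIOCGIFHWADDR (0x8927 on Linux); the function names and the interface name "wlan0" are illustrative, and whether the call succeeds depends on the platform and its restrictions.

```python
import socket
import struct

SIOCGIFHWADDR = 0x8927  # Linux ioctl request: get an interface's hardware address


def mac_from_ifreq(ifreq):
    """Extract the MAC address from a raw `struct ifreq` reply.

    Layout: 16 bytes of interface name, a 2-byte address family, then the
    address data, whose first 6 bytes are the MAC address.
    """
    return ":".join(f"{byte:02x}" for byte in ifreq[18:24])


def interface_mac(ifname):
    """Ask the kernel for the MAC address of `ifname` (Linux only)."""
    import fcntl  # Unix-only module, imported here so the parser works anywhere

    request = struct.pack("256s", ifname.encode()[:15])
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        reply = fcntl.ioctl(sock.fileno(), SIOCGIFHWADDR, request)
    return mac_from_ifreq(reply)


if __name__ == "__main__":
    try:
        print(interface_mac("wlan0"))  # "wlan0" is an assumed interface name
    except (OSError, ImportError) as error:
        print("could not query interface:", error)
```

The socket here is never used to send traffic; it only provides a file descriptor through which the kernel answers the request.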
- Using Router UPnP
UPnP (Universal Plug and Play) is a set of protocols that automates the complex steps required for devices (e.g., laptops, phones, and IoT devices) to communicate with each other. UPnP is typically seen in homes and small offices, where devices discover and receive information about the available network services. An app could make use of these UPnP protocols to obtain the MAC address from a WiFi router, which it can then use to obtain location from the BSSID.
This method cannot be easily fixed on the Android side, since UPnP runs over standard internet protocols (e.g., DHCP, TCP/IP, HTTP) and it is difficult to tell whether the activity is malicious. On the router side, UPnP can be disabled in the router’s configuration, but doing so removes the ability to connect smart devices around the home.
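The discovery step of UPnP can be sketched as follows: a device multicasts an SSDP search request, and UPnP devices such as routers reply with a URL to their description document. The function names are hypothetical, and the sketch stops at discovery; whether and where a MAC address appears in a router's replies depends on the router and is not shown here.

```python
import socket

SSDP_ADDR = ("239.255.255.250", 1900)  # standard SSDP multicast address and port


def build_msearch(search_target="upnp:rootdevice", mx=2):
    """Build an SSDP M-SEARCH discovery request."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        f"HOST: {SSDP_ADDR[0]}:{SSDP_ADDR[1]}",
        'MAN: "ssdp:discover"',
        f"MX: {mx}",  # maximum seconds a device may wait before replying
        f"ST: {search_target}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_ssdp_headers(response):
    """Parse the headers of an SSDP reply into a dict with upper-case keys."""
    headers = {}
    text = response.decode("utf-8", errors="replace")
    for line in text.split("\r\n")[1:]:  # skip the HTTP status line
        name, separator, value = line.partition(":")
        if separator:
            headers[name.strip().upper()] = value.strip()
    return headers


def discover(timeout=3.0):
    """Send one search request and collect replies until the timeout."""
    replies = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_msearch(), SSDP_ADDR)
        try:
            while True:
                data, sender = sock.recvfrom(65507)
                replies.append((sender[0], parse_ssdp_headers(data)))
        except socket.timeout:
            pass
    return replies


if __name__ == "__main__":
    for address, headers in discover():
        print(address, headers.get("LOCATION"))
```

Because this is ordinary UDP and HTTP traffic on the local network, it looks the same whether it comes from a media app finding a speaker or from an app profiling the router.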
- Reading Metadata via Photos
Most digital images, audio, and video files contain Exif data. This metadata contains information about the file itself, such as when it was created, what type of device created it, and other details about the file. Most smartphones also encode the GPS coordinates of the location where the image was taken. An app that has access to your photo library can read the Exif information in your photos to obtain location data. Users on Android devices can, through the phone’s camera settings, stop location information from being added to the file, and can additionally revoke the camera’s location access permission.
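Exif stores each GPS coordinate as three rational numbers (degrees, minutes, seconds) plus a reference letter for the hemisphere. A minimal sketch of turning those values into a decimal coordinate is shown below; the function name and the sample values are illustrative, and reading the tags out of an image file is left to an Exif library.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert an Exif GPS coordinate to signed decimal degrees.

    `dms` is three (numerator, denominator) pairs for degrees, minutes and
    seconds; `ref` is "N", "S", "E" or "W" from the matching Ref tag.
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):
        value = -value  # southern and western hemispheres are negative
    return float(value)


if __name__ == "__main__":
    # Invented sample values, in the form Exif stores them.
    latitude = dms_to_decimal(((40, 1), (26, 1), (46, 1)), "N")
    longitude = dms_to_decimal(((79, 1), (58, 1), (56, 1)), "W")
    print(round(latitude, 6), round(longitude, 6))
```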
The high degree to which Android apps are inferring location through unconventional workarounds suggests that there has been a lack of enforcement and accountability by Google to stop this misconduct. Quietly collecting location data without user consent is a privacy infringement. Google should therefore be implementing controls, educating app developers, and penalizing bad actors to prevent this behavior from occurring.
Appendix: Multilateration
Multilateration is used to obtain more precise location data. The following is an illustration of how a device’s location can be inferred from three networks by using their known location and the signal strength on the device:
Multilateration in use: the device is known to be at a distance r_i from network i, each of which has a known location (x_i, y_i).
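The calculation can be sketched in a few lines. Each network i places the device on a circle, (x - x_i)^2 + (y - y_i)^2 = r_i^2; subtracting the first circle's equation from the others leaves linear equations in x and y, which the sketch below solves by least squares. The function name and the flat two-dimensional coordinates are simplifying assumptions, and real distance estimates would be far noisier than in this example.

```python
import math


def multilaterate(anchors, distances):
    """Estimate (x, y) from network locations and distances to each network.

    `anchors` is a list of (x_i, y_i) network locations and `distances` the
    matching r_i values. At least three non-collinear networks are needed.
    """
    if len(anchors) != len(distances) or len(anchors) < 3:
        raise ValueError("need at least three networks with one distance each")
    (x1, y1), r1 = anchors[0], distances[0]
    saa = sab = sbb = sac = sbc = 0.0
    for (xi, yi), ri in zip(anchors[1:], distances[1:]):
        # Circle i minus circle 1 gives the linear equation a*x + b*y = c.
        a = 2.0 * (xi - x1)
        b = 2.0 * (yi - y1)
        c = r1**2 - ri**2 + xi**2 - x1**2 + yi**2 - y1**2
        saa += a * a
        sab += a * b
        sbb += b * b
        sac += a * c
        sbc += b * c
    # Solve the 2x2 normal equations with Cramer's rule.
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("networks are collinear; position is ambiguous")
    return (sac * sbb - sab * sbc) / det, (saa * sbc - sab * sac) / det


if __name__ == "__main__":
    networks = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    device = (3.0, 4.0)  # invented true position, used to generate distances
    radii = [math.hypot(device[0] - x, device[1] - y) for x, y in networks]
    print(multilaterate(networks, radii))
```

With exact distances, three networks pin down a single point. With noisy distances the circles do not meet exactly, and additional networks help the least-squares estimate settle near the true position.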