Boston: Twitterati, take note! Just a handful of your tweets over the course of a single day may be enough to disclose the location of your home and workplace even to a relatively low-tech snooper, scientists have found.
Twitter’s location-reporting service is off by default, but many Twitter users choose to activate it. The study, by researchers at the Massachusetts Institute of Technology (MIT) in the US and Oxford University in the UK, may help raise awareness of just how much privacy people give up when they use social media.
“Many people have this idea that only machine-learning techniques can discover interesting patterns in location data,” said Ilaria Liccardi, a research scientist at MIT. “With this study, what we wanted to show is that when you send location data as a secondary piece of information, it is extremely simple for people with very little technical knowledge to find out where you work or live.”
In their study, researchers used real tweets from Twitter users in the Boston area in the US. The users consented to the use of their data, and they also confirmed their home and work addresses, their commuting routes, and the locations of various leisure destinations from which they had tweeted.
The time and location data associated with the tweets were then presented to a group of 45 study participants, who were asked to try to deduce whether the tweets had originated at the Twitter users’ homes, their workplaces, leisure destinations, or locations along their commutes.
The participants were not recruited on the basis of any particular expertise in urban studies or the social sciences; they simply drew what conclusions they could from the clustering of locations. They were also recruited in Oxford, to eliminate any bias that might result from familiarity with Boston geography.
Similarly, they had no information about the content of the tweets. The data were presented in three different forms. One was a static Google map, in which tweet locations were marked with virtual pins; one was an animated version of the map, in which the pins appeared on-screen in chronological order; and the third—the resolutely low-tech version—was a table listing geographical coordinates, street names and times of day.
The maps featured only street names, with no names of businesses, parks, schools, or other landmarks. Pins and table rows were, however, colour coded to indicate general time of day—morning, afternoon, or evening.
The researchers also varied the volume of data that the participants were asked to consider: one day’s, three days’, or five days’ worth. To avoid bias, there was no overlap between data sets of different sizes.
Predictably, participants fared better with the map-based representations, correctly identifying Twitter users’ homes roughly 65% of the time and their workplaces closer to 70% of the time. Even the tabular representation was informative, however, with accuracy rates of just under 50% for homes and a surprisingly high 70% for workplaces.