Facebook’s Data for Good initiative aims to use the vast amounts of data the social network collects to help address various humanitarian issues. Chief among these is disaster management and relief, and the company has worked with Indian organisations such as SEEDS India and Wadhwani AI in this area. Mint spoke to Alex Pompe, Research Manager at Facebook, to understand how these technologies work. Edited excerpts.
Is Facebook’s user-base in India vast enough across the country to help with disaster management efforts?
If there are fewer than 10 unique Facebook users in a 0.6 square kilometre radius, we do not show that data because it would put the identification of a user at risk. The caveat here is that it has to be 10 Facebook users who have opted into the location history setting, which is roughly 15% of the users.
We don’t publish this data to humanitarian organisations as the absolute truth. They use this alongside data they receive from other sources, like governments and telcos.
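The suppression rule described in this answer can be illustrated with a minimal Python sketch. It is not Facebook's implementation: the function name `aggregate_counts`, the `(user_id, cell_id)` input format and the idea of pre-assigned grid cells are assumptions made for illustration. Only the threshold of 10 unique users comes from the interview.

```python
from collections import Counter

# Threshold quoted in the interview: cells with fewer than 10 unique users are hidden.
MIN_USERS_PER_CELL = 10


def aggregate_counts(user_cells, min_users=MIN_USERS_PER_CELL):
    """Count unique users per cell and drop cells below the threshold.

    user_cells: iterable of (user_id, cell_id) pairs. This input format is
    hypothetical; each cell stands in for one map tile.
    """
    unique_pairs = set(user_cells)  # count each user at most once per cell
    counts = Counter(cell for _, cell in unique_pairs)
    # Keep only cells that meet the minimum, so small groups are never published.
    return {cell: n for cell, n in counts.items() if n >= min_users}
```

A cell with exactly 10 opted-in users would be published under this sketch, while one with nine would be dropped entirely rather than reported as a small number.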
Since you use convolutional neural networks, have you seen the artificial intelligence (AI) used in these disaster maps change over time? Has the algorithm evolved by itself?
The algorithm isn’t running all the time. The update frequency for each country is roughly six months, so we get an improved satellite image layer and we’re adding new data on top of that layer. We’re training the algorithm to be smart.
Right now it just detects whether there are any buildings, but we’d like it to learn how to tell the difference between (residential and) industrial buildings. So, we run a job and it takes the algorithm a day or two to generate data for a country. It is still learning from each job, but I’d say it isn’t continuous. The improvements come much more from better satellite imagery and improved census data.
When there’s a disaster like the Kerala Floods, do the disaster maps work in real time during that time?
The default is that, from the moment a disaster strikes, the map is generated for 14 days and updated every eight hours. But Cyclone Phani, for example, was a much longer disaster, so in such cases we extend the window until the crisis is over. Real time to us means every eight hours.
The algorithm is triggered by looking at aggregated feeds of CAP (common alerting protocol) alerts from government agencies etc. If one of our partners, like SEEDS India, alerts us we can manually kick it off as well.
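The default cadence described in this answer (a 14-day window, refreshed every eight hours, extendable for longer crises) works out as follows. This is a rough Python sketch under stated assumptions: the function `update_times` is hypothetical, and it assumes the first update happens at the moment the map is triggered, which the interview does not specify.

```python
from datetime import datetime, timedelta


def update_times(start, days=14, interval_hours=8):
    """Return the update timestamps for a disaster map.

    Assumes (hypothetically) that the first update is at `start` and that
    updates stop once the window of `days` days has elapsed.
    """
    step = timedelta(hours=interval_hours)
    end = start + timedelta(days=days)
    times = []
    t = start
    while t < end:
        times.append(t)
        t += step
    return times


# Example: default window versus an extended one for a longer crisis.
trigger = datetime(2019, 5, 3, 0, 0)  # illustrative trigger time only
default_schedule = update_times(trigger)
extended_schedule = update_times(trigger, days=21)
```

Under these assumptions the default window yields three updates a day, and extending it for a longer event only requires passing a larger `days` value.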
Do you think you will be able to predict disasters like floods etc. in future?
It isn’t a goal for us, but there are academic researchers who use this data for their predictive models. We’re contributing to that work but it isn’t our goal to make such predictions.
How do these maps work in the case of diseases?
So, this one started only in May. We don’t have any health data that we’re providing. We provide underlying geospatial data that they use in their models. In the past, they have used census data and anonymised call record data from telecoms.
Are the input sources for the algorithm behind the population density maps based only on users’ location history?
Those aren’t made using the AI algorithm. The data there is just anonymized and aggregated counts of the geospatial location of our users in a different format. There’s no algorithm learning or predicting anything in these maps.
Could you take more data from the user (with their permission) to improve what you do?
I think we have a vision to try to improve our accuracy in the future, but most of our effort right now goes into the data sets we already have. Based on feedback from partners, we’re investing more in finding use cases where this data can have more social impact than in new data sources.
What other areas do you plan to work on, apart from diseases, population density, etc.?
We’re certainly open to it. We rely on our partners to make us aware of that.
Do any governments use this data?
We only allow NGOs, academic researchers, etc. Those partners though can share derivatives, maps, visualisations etc. We provide raw data, which most government agencies would struggle to work with. As a policy, we do not share this data with governments.