When bots become bigots
Algorithms that prefer light skin, Twitter bots that turn sexist—Artificial Intelligence easily adopts human prejudice
When Beauty.ai was announced as a beauty contest judged by Artificial Intelligence (AI), the point was to take human bias out of the equation. The project’s creators wanted to study what constituted beauty objectively. More than 60,000 people from around the world sent in pictures of their faces to be judged by bots. But when the results were released in August, 37 of the 44 winners were white, six were Asian and only one had dark skin. It appeared that the AI was racist.
One explanation for the lopsided results, according to Anastasia Georgievskaya, one of Beauty.ai’s project managers, was that most of the entrants were white—only 1.3% were black, 2% Indian and 6% Asian. But this is not the first time AI has shown signs of bigotry. When Microsoft launched a Twitter bot named Tay in March, it began spewing hatred within hours, saying, among other things, that feminists should burn in hell and Adolf Hitler was right in hating Jews. Last year, the Google Photos app, which was supposed to be able to identify objects, activities and events on its own, with no need for humans to tag them, labelled a black couple as “gorillas”.
To understand how this could be happening, you have to think of an AI bot as a child. Much of AI today undergoes a process called deep learning, which means the AI, during what is called its “training”, is exposed to reams of data and human behaviour and then forms its own ways of deciding how to act, much as a child develops a nature from its environment. Now, if the data the AI sees reflects, in some way, human prejudices, then it will pick those up. So, if an algorithm is being trained to identify things in photographs and it sees a lot of photographs in which people of a certain race have been labelled with a racist slur, it will begin to use that same slur against that race.
Abhinav Aggarwal, co-founder of Fluid.ai, which, among other things, builds bots to replace customer executives in banks and call centres, explains that, like a child, a bot will learn more from the data it is exposed to early on than from what it sees later. “The Microsoft bot probably happened to stumble upon hateful material on Twitter early on and developed those traits. Then it would have been hard to correct them by exposing it to cleaner material,” he says.
Georgievskaya says the problem Beauty.ai had was that the data the algorithms trained with was not diverse enough. If a bot being trained to study how many wrinkles a person has does not see enough pictures of Indians, then it may not develop a good enough methodology to determine how many wrinkles they have and may appear to be biased against them. The contest was an initiative of Russia and Hong Kong-based Youth Laboratories. Youth Laboratories is now working on a project called Diversity.ai, whose aim is to adjust data sets so that all demographics of people are well represented.
It is possible to correct some of the flaws in AI by managing the data it is exposed to. Fluid.ai, for example, does not let the AI it puts in banks know what the customers it is interacting with look like. But as we see greater integration of AI in our everyday lives, we must consider that some prejudices are intrinsic to our society, and it will take a proactive effort to stop bots from perpetuating them.
In the Beauty.ai contest, one algorithm judged how similar entrants looked to actors or models. “Models or actors are considered to be beautiful and hence we have an algorithm that compares entrants to them,” Georgievskaya says. “And we are not comparing all people to white models, but to those within their own ethnic groups.” What is not acknowledged in this process is that some cultures are biased in the way they select their models and actors. In India’s film industry, for instance, there is a clear bias towards fairer skin, so would a dark-skinned Indian person be judged less “model-like” than a lighter-skinned one by one of Beauty.ai’s algorithms?
Bias has been a part of our lives for so long that various older technologies that AI now banks on as reliable betray shades of it. Cameras, for example, have been developed to best capture the skin tones of white people. So the Google Photos app and Beauty.ai’s algorithms are studying photographs that don’t display dark skin as well as they do light skin.
Our language itself is filled with prejudice. All the literature, advertising and other written material that exists on the Internet simmers with it. So when AI tries to learn our language, it absorbs the prejudice too. When computer scientists at Princeton University, US, did a word-association test with GloVe, an unsupervised algorithm that learns human language from the Web, they found that it associated words such as “management” and “salary” more with male names and ones such as “home” and “family” with female ones. It also found names common among whites “pleasant” and those common among blacks “unpleasant”.
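To make the idea of a word-association test concrete, here is a minimal sketch in Python. It is not the Princeton researchers' actual procedure, and the two-dimensional vectors are invented for illustration; they are not real GloVe embeddings. The sketch only shows the general principle: each word is a vector, and a word is "associated" with a group of names if its vector points in a more similar direction to that group's vectors than to another group's.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(word_vec, group_a, group_b):
    """Mean similarity to group_a minus mean similarity to group_b.

    A positive value means the word sits closer to group_a.
    """
    mean_a = sum(cosine(word_vec, v) for v in group_a) / len(group_a)
    mean_b = sum(cosine(word_vec, v) for v in group_b) / len(group_b)
    return mean_a - mean_b

# Hypothetical toy vectors, invented for this example (NOT real GloVe output).
toy_vectors = {
    "salary": [0.9, 0.1],
    "home": [0.1, 0.9],
    "male_name_1": [0.8, 0.2],
    "male_name_2": [0.7, 0.3],
    "female_name_1": [0.2, 0.8],
    "female_name_2": [0.3, 0.7],
}

male_names = [toy_vectors["male_name_1"], toy_vectors["male_name_2"]]
female_names = [toy_vectors["female_name_1"], toy_vectors["female_name_2"]]

salary_bias = association(toy_vectors["salary"], male_names, female_names)
home_bias = association(toy_vectors["home"], male_names, female_names)

print("salary leans male" if salary_bias > 0 else "salary leans female")
print("home leans male" if home_bias > 0 else "home leans female")
```

With real embeddings learnt from Web text, nobody types in the vectors by hand. They come out of the training data, which is how the prejudice gets in.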
The worry is that AI’s racism is not just being revealed in controlled experiments such as Beauty.ai or the Princeton one. It is influencing real decisions.
Earlier this year, non-profit news organization ProPublica released a report that it said exposed how software used in the US prison system was discriminating against black convicts. In Florida, US, an algorithm created by a company called Northpointe is used to predict how likely a convict is to commit another crime on release. This information is used to decide bail amounts, lengths of parole and even, in some cases, a felon’s jail sentence. But what ProPublica found was that the algorithm was more likely to flag black criminals as re-offenders and wrongly predicted blacks would commit future crimes twice as often as it made the same mistake about whites.
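The kind of error ProPublica describes is what statisticians call a false positive: a person flagged as a likely re-offender who does not go on to re-offend. The Python sketch below shows how such a rate could be compared across two groups. The records are invented for illustration and are not ProPublica's or Northpointe's data, and the function name is hypothetical.

```python
def false_positive_rate(records):
    """Share of people who did NOT re-offend but were flagged as high risk.

    records: list of (flagged_high_risk, reoffended) pairs of booleans.
    """
    flags_for_non_reoffenders = [
        flagged for flagged, reoffended in records if not reoffended
    ]
    if not flags_for_non_reoffenders:
        raise ValueError("no non-reoffenders in the records")
    return sum(flags_for_non_reoffenders) / len(flags_for_non_reoffenders)

# Invented records (assumption): each group has 10 people who did not
# re-offend and 5 who did. Only the first group is wrongly flagged more.
group_a = (
    [(True, False)] * 4 + [(False, False)] * 6   # 4 of 10 wrongly flagged
    + [(True, True)] * 5
)
group_b = (
    [(True, False)] * 2 + [(False, False)] * 8   # 2 of 10 wrongly flagged
    + [(True, True)] * 5
)

fpr_a = false_positive_rate(group_a)
fpr_b = false_positive_rate(group_b)
print(f"Group A false positive rate: {fpr_a:.0%}")
print(f"Group B false positive rate: {fpr_b:.0%}")
print(f"Ratio: {fpr_a / fpr_b:.1f}")
```

In this toy data the algorithm catches every actual re-offender in both groups, so it could look equally accurate for both. The disparity shows up only when the mistakes are counted separately for each group.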
In a 2013 study, Latanya Sweeney, a professor of government and technology at Harvard University, found that Google’s AdSense was far more likely to show advertisements suggestive of arrests when a black name was googled than when a white one was, regardless of how many people with the searched name had actually been arrested. This could easily play a role in an employer’s decision to hire a particular person.
Though these kinds of prejudices already exist among people, they are harder to question when it is a machine making the biased decision. Northpointe, for example, refuses to disclose how the algorithm that rates prisoners in Florida works. So it is harder to call the algorithm racist than it is to say the same of a prison guard with a history of discrimination.
The key problem seems to be that several developers continue to refer to AI as “impartial” when it is clear that it tends to adopt our prejudices.
Before unleashing bots into the world, it would be a good idea for their makers to set ethical standards for them, training them not just to learn from humans but also to guard against picking up human biases.
How Beauty.ai works
The purpose of this robot-judged contest is to gain insights into what constitutes human beauty, for the benefit of the health, skincare and wellness sectors. People send in pictures and are judged in five age categories: 18-29, 30-39, 40-49, 50-59 and 60-plus. Men and women are judged separately. The photographs are analysed and rated by five different algorithms:
AntiAgeist: This algorithm measures two things. First, how old it thinks the person in the photo is. Second, how old it believes people will think the person is. Based on the difference between these and the person’s actual age, it comes up with an AntiAgeist score, which, essentially, defines how “young” a person looks.
PIMPL: This algorithm detects acne and allergic rashes and scores people based on how clear their skin is.
RYNKL: It counts the number of wrinkles a person has and gives them a score based on it.
MADIS: It compares photographs of people with those of models and actors in the person’s ethnic group and tries to determine how similar the person looks to models and actors. The more you look like a celebrity, the higher the score you get.
Symmetry Master score: This algorithm measures the distance between various key points on your face and determines how symmetrical it is.
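Beauty.ai has not published how these algorithms compute their scores, so the following Python sketch is only a guess at how a symmetry measure of this general kind might work. It assumes, hypothetically, that the face has been reduced to pairs of matching left and right key points (such as the outer corners of the eyes) and a vertical midline. Each left point is mirrored across the midline and compared with its right-hand counterpart.

```python
import math

def asymmetry(landmark_pairs, midline_x):
    """Mean distance between mirrored left points and their right partners.

    landmark_pairs: list of ((left_x, left_y), (right_x, right_y)).
    Returns 0.0 for a perfectly symmetrical set of points; larger values
    mean less symmetry. (Hypothetical measure, not Beauty.ai's formula.)
    """
    total = 0.0
    for (left_x, left_y), (right_x, right_y) in landmark_pairs:
        mirrored_x = 2 * midline_x - left_x  # reflect across the midline
        total += math.hypot(mirrored_x - right_x, left_y - right_y)
    return total / len(landmark_pairs)

# Invented pixel coordinates (assumption), with the midline at x = 50.
symmetric_face = [((30, 40), (70, 40)), ((35, 60), (65, 60))]
lopsided_face = [((30, 40), (73, 44)), ((35, 60), (65, 60))]

print(asymmetry(symmetric_face, midline_x=50))
print(asymmetry(lopsided_face, midline_x=50))
```

Even a measure this simple depends on the key points being located accurately in the photograph, which is one place where poorly captured skin tones or an unrepresentative training set could skew the result.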