Date: 2024-01-11

DEI is down but not out. Several states, universities and corporations signaled their retreat from diversity, equity and inclusion programs last year. Yet despite these developments, the movement remains alive and well thanks in part to the implicit-association test: a popular tool that has been used to claim that most of the U.S. is racially prejudiced.

Three scholars first introduced the test in a 1998 article in the Journal of Personality and Social Psychology titled “Measuring Individual Differences in Implicit Cognition: The Implicit Association Test.” The article describes how researchers measured how quickly respondents associated “good” and “bad” words with white or black faces. When people more quickly paired good words with “white” and bad words with “black,” they were deemed implicitly biased.

In a 2007 article in the European Review of Social Psychology, another group of academics found that approximately 68% of whites manifested a biased result. This quickly morphed into the widespread assertion, in some of the press, that 68% of whites are racially prejudiced.

The test has since become a fixture in DEI training programs nationwide, with universities and businesses often using it during mandatory training sessions for students and employees. Harvard University’s Project Implicit reports that more than 20 million people have taken its online administration of the test on various topics, including implicit racial bias.

The tool, it turns out, is highly suspect. To see why, we must examine it with the two most important measures used for any other psychological test: reliability and validity.

Test-retest reliability refers to the extent to which a person who takes the test twice obtains the same result. If test takers’ scores on two administrations of the same test are totally unrelated, the test’s reliability is zero. If the two sets of scores are identical, its reliability is 1. According to measurement scholars Jum Nunnally and Ira Bernstein, the threshold for acceptable test-retest reliability is 0.7. In 2015 one of the authors of the original 1998 article reported that the reliability of the test is between 0.5 and 0.6, rendering it unreliable as a psychological test.
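To make the 0-to-1 scale concrete, here is a minimal sketch that treats test-retest reliability as the Pearson correlation between two administrations of a test. That reading is a common convention and an assumption here, not something the studies above spell out. The scores are made-up numbers for five hypothetical test takers, and the function and variable names are illustrative only.

```python
import math


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_a = sum(first) / len(first)
    mean_b = sum(second) / len(second)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    return cov / math.sqrt(var_a * var_b)


# Threshold the article attributes to Nunnally and Bernstein.
ACCEPTABLE_RELIABILITY = 0.7

# Hypothetical scores (made-up data, not real test results).
first_sitting = [0.2, 0.5, 0.1, 0.8, 0.4]
identical_retest = list(first_sitting)       # everyone scores exactly the same again
noisy_retest = [0.6, 0.3, 0.4, 0.5, 0.1]     # second scores bear little relation to the first

r_identical = pearson_r(first_sitting, identical_retest)
r_noisy = pearson_r(first_sitting, noisy_retest)

for label, r in [("identical retest", r_identical), ("noisy retest", r_noisy)]:
    verdict = "acceptable" if r >= ACCEPTABLE_RELIABILITY else "below threshold"
    print(f"{label}: r = {r:.2f} ({verdict})")
```

On this scale, a reported reliability of 0.5 to 0.6 would fall between the two hypothetical cases and below the 0.7 cutoff.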

The next measure is validity, or whether the test measures what it’s supposed to measure. If you’re trying to analyze the factors that affect a baseball player’s salary, you would identify such metrics as his batting average, fielding errors, runs batted in, etc., and then compare them with the same statistics and salaries of his fellow players in a regression equation. Using this method in meta-analyses of the implicit-association test, several scholars have found that “implicit bias” accounts for between 2% and 5.6% of prejudicial behavior. Even Mahzarin Banaji and Anthony Greenwald, two of the test’s most prominent advocates, have written that “attempts to diagnostically use such measures risk undesirably high rates of erroneous classifications.” The test, in other words, is an extremely feeble predictor of behavior.
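One way to get a feel for those percentages is sketched below. It assumes that “accounts for” means the share of variance explained in the regression sense (R squared), which the article does not state explicitly, and it converts that share into the correlation it would imply for a single predictor. The function name is hypothetical.

```python
import math


def implied_correlation(variance_explained):
    """Correlation implied by a share of variance explained (single predictor).

    Assumes variance_explained is an R-squared value between 0 and 1.
    """
    if not 0.0 <= variance_explained <= 1.0:
        raise ValueError("variance_explained must be between 0 and 1")
    return math.sqrt(variance_explained)


# The two figures quoted in the article, read as shares of variance explained.
low_share, high_share = 0.02, 0.056

low_r = implied_correlation(low_share)
high_r = implied_correlation(high_share)
unexplained_at_best = 1.0 - high_share

print(f"implied correlation: {low_r:.2f} to {high_r:.2f}")
print(f"share of behavior left unexplained at best: {unexplained_at_best:.1%}")
```

Under that assumption, even the higher figure leaves more than 94% of the variation in behavior unaccounted for by the test score.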

Consider these problems in another context. Suppose someone invented a guilt detector to replace the jury system in American courtrooms. Defendants would be administered the test, and those deemed guilty would be sent to prison.

Say the detector deemed 68% of people to be guilty, but the tool’s reliability was a mere 0.5 or 0.6 and it identified only 2% to 5.6% of the reasons someone may be guilty. America’s jails would be jam-packed with innocent people. So it is with the implicit-association test, which falsely classifies huge numbers of white Americans as bigots.

Despite the tool’s inadequacies, one group has benefited enormously from its popularity: the companies that perform antibias training. In 2018 I asked a “debiasing trainer” how she measures the success of her company’s training. She said it assesses knowledge about implicit bias before training and then again after it. But that isn’t useful. Trainees necessarily know much more about any topic after they’ve learned it. What matters is whether the training caused people to be less biased, which isn’t measured.

I asked if there was any assessment of whether the training helped group morale, hiring practices, and personal interactions. She said no, because the training company doesn’t return to measure such factors.

Does it use a control group that receives no training or alternate training to ascertain if the test-based training is superior? “Of course not,” she said. “The company doesn’t want to pay for a no-training group.” With such techniques, it would be virtually impossible to provide a truly rigorous approach to measuring, much less fighting, actual bias.

Though the implicit-association test says very little, governments, universities, companies and nonprofits continue to force it on countless people. It makes sense for those who stand to profit from it, but it’s ruinous for the health of our polity. So long as the test remains in circulation, the divisive and destructive DEI enterprise will be alive and well.

Mr. Arkes is an emeritus professor of psychology at Ohio State University.