New research indicates that Artificial Intelligence, or AI, as it is defined and practised today, has several limits. New buzzwords serve only to mystify the populace, and it is increasingly clear to me that many technologists and information technology (IT) managers are simply groping about in the dark. They throw out terms such as “neural networks”, “deep learning”, “big data” and “black box systems”, hoping to mask the fact that they know very little about how this technology may evolve over the next several years.
As an observer, I can’t help but think there is an important question in front of us: are the ramblings of these pundits a case of the one-eyed man becoming king in the land of the blind? Or are they more akin to the parable of the blind men who each inspected a different part of an elephant by touch, and came away with entirely different definitions of what an elephant is like?
The vital premise in today’s AI is that the computer program itself learns as it goes along, building a database of information and then using that database to automatically generate additional code as it ‘learns’ more, without the need for human programmers. These AI programs then become “black boxes”, since even their original human programmers have no way of knowing what code the machine has generated on its own.
These computer programs, however, need copious amounts of carefully categorized data to make themselves smarter. Anything that is sloppily labelled can easily cause the machine to draw the wrong conclusions. I have mentioned before in this column that researchers have shown that changing just a few pixels in an image can make an AI image-recognition program conclude that a car is in fact an elephant, a mistake that an ordinarily intelligent human eye would never make.
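To make that failure mode concrete, here is a deliberately over-simplified sketch. It is not the study referred to above: the classifier, its weights and the pixel values are all invented for illustration, and real attacks target deep networks with perturbations far too small for the eye to see. The idea, though, is the same: nudge the pixels the model leans on hardest, and its answer flips.

```python
import numpy as np

# A hypothetical "trained" linear classifier over a flattened 2x2
# grayscale image: score > 0 means "car", otherwise "elephant".
w = np.array([0.9, -0.2, 0.4, -0.1])   # invented weights
b = -0.5

def classify(img):
    return "car" if img @ w + b > 0 else "elephant"

img = np.array([1.0, 0.2, 0.8, 0.1])
print(classify(img))    # -> car  (score = 0.67)

# Adversarial tweak: alter only the single most influential pixel
# (the one with the largest weight), just enough to cross the boundary.
adv = img.copy()
adv[0] = 0.2
print(classify(adv))    # -> elephant  (score = -0.05)
```

A human looking at the two images would see essentially the same picture; the classifier, which has no notion of what a car *is*, sees only a weighted sum that has changed sign.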
Thus, many firms that are trying to chart out a path in AI are scrambling to go out and acquire vast stores of data that have already been neatly characterized. IBM, for instance, has bought firms that own billions of medico-radiological images—in the hope of feeding this vast acquired data to the medical diagnosis components of IBM’s Watson product. The idea is that this data, collected over many years of digital medico-radiological imaging, will enable Watson to become cannier in diagnosing diseases. When quizzed about these acquisitions, a senior IBM executive said to me recently: “If you’re not at the table, you can be sure you’ll be on the menu.”
In another example of the use of categorized data, a firm called Cambridge Analytica created a sinister way to profile people from psychometric tests that show up as ostensibly harmless quizzes on Facebook and other social networking sites, luring people into taking them and posting their individual results online. Cambridge Analytica claims it used these psychometric analyses to accurately predict the personality types and preferences of individual voters. The firm was apparently retained by both the Brexit “leave” campaign and Donald Trump’s presidential election campaign to target voters who were likely to vote for them, and to lure more of these supportive voters out to the polling booths.
Trained psychologists take a dim view of psychometric and other personality-profiling tests. When I asked my sister, who holds a doctorate in psychology from Harvard, about the efficacy of such methods, she said that there are dozens of such psychometric rubrics out there; while some have a degree of utility, most are quite flawed, and many have been debunked as predictors of behaviour.
The accuracy of diagnostics and psychometrics aside, the fact remains that without reams of carefully categorized data, AI as we know it today is dead on arrival. In areas where data does not yet exist, such as crash data for self-driving cars, we must therefore look elsewhere for models that can stand in for the missing data. Where does one go to find out under what circumstances self-driving automobiles, like the Tesla that killed its occupant in 2016, might have other such accidents? Mercifully, not enough such crashes have occurred, and so the data simply does not exist. Building predictive models here without data is not “neural”; it is neurotic, and dangerous!
This brings us to the fields of pure mathematics and theoretical physics, which are the way forward. In an informative blog post last year, Wale Akinfaderin, a Ph.D. candidate in physics at Florida State University, enumerated the types of mathematics that an aspiring AI specialist must be familiar with, if not master, to be effective. A partial list from his post: Principal Component Analysis, eigendecomposition, combinatorics, Bernoulli and Gaussian distributions, Hessians, Jacobians, Laplacians, Lagrangians, entropy, and manifolds. I’ll stop here; I’m sure you get the idea!
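For the curious, here is a taste of the first two items on that list working together: Principal Component Analysis done “by hand” via eigendecomposition of a covariance matrix. The ten two-dimensional data points are invented for illustration; the point is that the heavy lifting is linear algebra, not programming.

```python
import numpy as np

# Ten invented 2-D observations (e.g. two correlated measurements).
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2],
              [3.1, 3.0], [2.3, 2.7], [2.0, 1.6], [1.0, 1.1],
              [1.5, 1.6], [1.1, 0.9]])

Xc = X - X.mean(axis=0)                  # centre the data
cov = np.cov(Xc, rowvar=False)           # 2x2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)   # eigendecomposition (symmetric)

# Sort components by explained variance, largest first.
order = np.argsort(eigvals)[::-1]
components = eigvecs[:, order]
explained = eigvals[order] / eigvals.sum()

# Project onto the first principal component: 2-D data reduced to 1-D
# while keeping most of the variance.
projected = Xc @ components[:, :1]
print(f"variance explained by first component: {explained[0]:.1%}")
```

On this toy data the first component captures well over 90% of the variance, which is exactly why PCA is a workhorse for compressing the “copious amounts of data” discussed earlier.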
“Don’t panic,” says Neil Sheffield, an AI researcher at Amazon, in a blog. “By bringing our mathematical tools to bear on the new wave of deep learning methods, we can ensure they remain mostly harmless.”
It is time for us amateur pundits and pedestrian programmers to make way, and to let the pure mathematicians and theoretical physicists lead the charge. They have long used mathematical theory to contemplate the unsolvable where data does not exist. Visionaries like Stephen Hawking, Albert Einstein and Srinivasa Ramanujan have been feted for their ability to posit plausible models for hitherto unsolvable problems, such as the workings of the universe.
One-eyed they may well be, but all hail the new kings of AI!
Siddharth Pai is a world-renowned technology consultant who has led over $20 billion in complex, first-of-a-kind outsourcing transactions.