Opinion | Covid has exposed the limitations of machine learning
Updated: 09 Jun 2020, 06:40 AM IST
Its algorithms have proven unable to deal with volatility and still need much human intervention
Last Friday, the U.S. Dow Jones index climbed by almost 1,000 points. The U.S. Labor Department said that the economy had unexpectedly added 2.5 million jobs in May. This followed a depressing April, when the country shed as many as 20 million jobs. The May gains lowered the unemployment rate to roughly 13%, from the 15% it had hit in April. The report also surprised economists and analysts who had forecast that millions more would lose their jobs. Their machine learning (ML) models were predicting that the jobless rate would continue to rise to over 20%.
This isn’t the first time that the technology around ML has failed. In 2016, sophisticated ML algorithms failed to predict the outcomes of both the Brexit vote and the US presidential election. Some argue that algorithm-driven machine prediction was in its infancy in 2016. If that’s the case, then what have the intervening four years of computer programming and an explosion of data available to “train” deep-learning algorithms really achieved?
As a concept, ML represents the idea that a computer, when fed with enough raw data, can begin on its own to see patterns and rules in these numbers. It can also learn to recognize, categorize and feed new data upon arrival into the patterns and rules already created by the computer program. As more data is received, it adds to the “intelligence” of the computer by making its patterns and rules ever more refined and reliable.
There is still a small but pertinent inconvenience that deserves our attention. Despite the great advances in computing, it is still very difficult to teach computers both human context and basic common sense. The brute-force approach of Artificial Intelligence (AI) behemoths does not rely on well-codified rules based on common sense. It relies instead on the raw computing power of machines to sift thousands upon thousands of potential combinations before selecting the best answer using pattern-matching. This applies as much to questions that are intuitively answered by five-year-olds as it does to a medical image diagnosis.
These same algorithms have been guiding decisions made by businesses for a while now—especially strategic and other shifts in corporate direction based on consumer behaviour. In a world where corporations make binary choices (either path X or path Y, but not both), these algorithms still fall short.
The pandemic has exposed their insufficiency further. This is especially true with ML systems at e-commerce retailers that were initially programmed to make sense of our online behaviour. During the pandemic, our online behaviour has been volatile. News reports in various Western countries that kept e-commerce alive during their lockdowns have focused on retailers trying to optimize toilet paper stocks one week and stay-at-home board games the next.
The disruption in ML is widespread. Our online buying behaviour influences a whole host of subsidiary computer systems. These are in areas such as inventory and supply chain management, marketing, pricing, fraud detection and so on.
To an interested observer, it would appear that many of these algorithms base themselves on stationary assumptions about data. A detailed explanation of how stationary processes are used for statistical data modeling and predictions can be found here. Very simply put, this means that algorithms assume that the rules haven’t changed, and won’t change due to some event in the future. Surprisingly, this goes against the basic admonition that almost all professional investors bake into their fine print: “Past performance is no predictor of future performance.”
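To make the stationarity point concrete, here is a minimal sketch in Python. It is not any retailer's actual system: the demand figures, the function names and the "forecast by historical average" model are all assumptions invented for illustration. The model is tuned on a calm year and then asked to predict both more calm weeks and a sudden panic-buying spell.

```python
import random
import statistics


def simulate_weekly_demand(weeks, mean, spread, seed):
    """Hypothetical weekly unit sales scattered around a fixed mean."""
    rng = random.Random(seed)
    return [rng.gauss(mean, spread) for _ in range(weeks)]


def mean_absolute_error(forecast, actuals):
    """Average size of the miss when one forecast is used for every week."""
    return statistics.fmean(abs(forecast - actual) for actual in actuals)


# A "normal" year: demand hovers around 100 units a week (made-up numbers).
history = simulate_weekly_demand(weeks=52, mean=100, spread=5, seed=1)

# A stationary model: assume every future week looks like the past average.
forecast = statistics.fmean(history)

# Eight weeks under the old rules, and eight weeks after the rules change.
calm_weeks = simulate_weekly_demand(weeks=8, mean=100, spread=5, seed=2)
panic_weeks = simulate_weekly_demand(weeks=8, mean=300, spread=40, seed=3)

error_calm = mean_absolute_error(forecast, calm_weeks)
error_panic = mean_absolute_error(forecast, panic_weeks)

print(f"Average miss in calm weeks:  {error_calm:.1f} units")
print(f"Average miss in panic weeks: {error_panic:.1f} units")
```

The forecast is serviceable for as long as the future resembles the past, and badly wrong once it does not. Nothing in the model itself signals that the rules have changed.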
The paradox is that finding patterns and then using them to make useful predictions is what ML is all about in the first place. But static assumptions have meant that the data sets used to train ML models haven’t included anything more than elementary “worst case” information. They didn’t expect a pandemic.
Also, bias, even when it is not informed by such negative qualities as racism, is often added into these algorithms long before they spit out computer code. The bias enters through the manner in which an ML solution is framed, the presence of “unknown unknowns” in data sets, and in how the data is prepared before it is fed into a computer.
Compounding such biases is the phenomenon of an “echo chamber” created by the finely targeted algorithms that these companies use. The original algorithms induced users to stay online longer and bombarded them with an echo-chamber overload of information that served to reinforce what the algorithm thought the searcher needed to know. For instance, if I search for a particular type of phone on an e-commerce site, future searches are likely to auto-complete with that phone showing up even before I key in my entire search string. The algorithm gets thrown off when I search for toilet paper instead.
The situation brought about by the covid pandemic is still volatile and fluid. The training data sets, and the computer code they produce to adjust predictive ML algorithms, are unequal to the volatility. These algorithms need constant manual supervision and tweaking so that they do not throw themselves and other sophisticated downstream automated processes out of gear. It appears that consistent human involvement in automated systems will be around for quite some time.
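One simple form such supervision can take is a tripwire that hands the model back to a person when its recent errors drift far from their usual level. The sketch below is only an illustration of that idea: the function name, the error figures and the threefold `tolerance` threshold are hypothetical, not drawn from any real system.

```python
import statistics


def needs_human_review(baseline_errors, recent_errors, tolerance=3.0):
    """Flag a model whose recent average error far exceeds its usual level.

    `tolerance` is an arbitrary illustrative threshold, not a recommendation.
    """
    baseline = statistics.fmean(baseline_errors)
    recent = statistics.fmean(recent_errors)
    return recent > tolerance * baseline


# Made-up forecast errors, in units sold, for illustration only.
usual_errors = [4.0, 5.5, 3.8, 6.1, 4.9]
quiet_week = [5.2, 4.4, 6.0]
lockdown_week = [180.0, 210.5, 195.3]

print(needs_human_review(usual_errors, quiet_week))     # False
print(needs_human_review(usual_errors, lockdown_week))  # True
```

A check like this does not repair the model. It only decides when a human should step in, which is the kind of involvement the volatility calls for.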
Siddharth Pai is founder of Siana Capital, a venture fund management company focused on deep science and tech in India