# Math in the gap

Historians have long known that the idea of zero came from India, but its exact origins remained unclear until recently. The first text to discuss zero in the numerical sense is the Indian astronomer Brahmagupta’s work *Brahmasphutasiddhanta*, written in 628 AD, though many believe that the concept of zero itself originated much earlier than its appearance in this work.

The oldest known Indian reference to the digit zero has now been identified in a manuscript dating back to the third or fourth century, according to scholars at the University of Oxford, who recently announced the results of radiocarbon dating of an ancient Indian manuscript in their possession. This manuscript predates Brahmagupta’s work by at least three centuries.

The University of Oxford’s Bodleian Libraries has held the famous Indian Bakhshali manuscript since 1902, just another example of the untold number of Indian objects of antiquity squirreled away to Britain during the Raj. A farmer dug up the text in 1881 in the village of Bakhshali, near Peshawar in north-west India, in present-day Pakistan. The manuscript consists of birch bark and contains hundreds of placeholder zeros in the form of dots.

Marcus du Sautoy, a professor of mathematics at the University of Oxford, said in a statement that the placeholder zero in the Bakhshali manuscript is nonetheless “exciting” because it is “the seed from which the concept of zero as a number in its own right emerged some centuries later, something many regard as one of the great moments in the history of mathematics.” The concept of zero as the number representing nothingness paved the way for algebra, calculus and, of course, computer science, which still depends on 1s and 0s as the basic values of the transistor bits on the electronic chips that power every computer we know today.

Interestingly, it also turns out that mathematics must still step in where today’s computers cannot go. Even the greatest computers of today fall short of the computing power needed to analyse the vast data throw-off caused by our incessant use of the internet. This proliferation of largely useless data has caused a justifiable fear of “data inundation”. The investment advisory outfit ARK Invest predicts that these data will grow at an average annual rate of almost 40% over the next half-decade, from 8.5 zettabytes at the end of 2015 to 44 zettabytes by 2020, a number that beggars the imagination. Computing technology has simply not kept pace to the point where it can meaningfully process this “data exhaust”.
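The growth figures quoted above can be sanity-checked with a few lines of arithmetic. A short Python sketch of my own (a back-of-the-envelope check, not a calculation from ARK Invest’s report) confirms that going from 8.5 to 44 zettabytes in five years does indeed imply an annual growth rate of just under 40%:

```python
# Back-of-the-envelope check of the figures quoted in the text:
# 8.5 zettabytes at the end of 2015, projected to reach 44 by 2020.
start_zb = 8.5   # zettabytes, end of 2015
end_zb = 44.0    # zettabytes, projected for 2020
years = 5

# Compound annual growth rate implied by those two endpoints.
cagr = (end_zb / start_zb) ** (1 / years) - 1
print(f"Implied annual growth rate: {cagr:.1%}")
# prints: Implied annual growth rate: 38.9%
```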

This is where mathematics-based statistical modelling steps in to help us bridge the gap between useless data and useful, actionable information. Where brute-force computer processing fails, the only way to make sense of data is to use mathematical tools built to understand the trade-offs between data and uncertainty. In a recent edition of *DukEngineer*, a publication from Duke University from which I will quote to make my point, Galen Reeves, a professor at Duke, says: “Although the early applications of information theory were focused on signal processing and communication applications, the mathematical foundations apply much more broadly to problems in data analysis and statistical inference”.

According to Reeves, these data have created critical and complex questions. “Data is powerful,” he says. “The information in medical data could greatly improve medicine. But how do you get it out, and what happens if you make a mistake? If you have the right data but conduct the wrong analysis, you can make the wrong conclusions. And that can result in a detrimental outcome.”

True, that. I have pointed out in this column before how today’s tools for “big data” analysis failed spectacularly in predicting the outcome of the last US presidential election, and have also dwelt on the limitations of data-crunching, machine learning-based computing systems used in the service of medicine.

Statistical inference problems, which must control for many mathematical variables, sometimes exhibit “phase transitions”, in which a small change in information leads to large changes in measures of uncertainty, says *DukEngineer*. The study of where and why these phase transitions occur provides new ways to characterise problems and to analyse trade-offs between information, computation and structure. For example, says the publication, social networks follow this phase-transition behaviour. When observing which celebrities, politicians and other notable figures influence one another, the data are initially chaotic. But with a small bit of information, key players can suddenly be identified.
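A classic illustration of such a phase transition comes from random-graph theory rather than from the article itself: in a random network, the largest connected cluster stays tiny while nodes have fewer than one link each on average, then abruptly swallows most of the network just above that threshold. A minimal sketch in Python, using a union-find structure to track clusters:

```python
import random

def largest_cluster_fraction(n, avg_degree, seed=0):
    """Build an Erdos-Renyi random network with n nodes and the given
    average degree, then return the fraction of nodes sitting in the
    largest connected cluster."""
    rng = random.Random(seed)
    p = avg_degree / n          # probability of each possible link
    parent = list(range(n))     # union-find parent pointers

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj         # merge the two clusters

    sizes = {}
    for i in range(n):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n

# Below one link per node on average, the network stays fragmented;
# just above it, one giant cluster abruptly comes to dominate.
for c in (0.5, 1.0, 1.5, 2.0):
    frac = largest_cluster_fraction(1000, c)
    print(f"average degree {c:.1f}: largest cluster holds {frac:.0%} of nodes")
```

The jump around an average degree of one is the same qualitative behaviour the article describes: a small bit of extra information (a few more observed links) suddenly reveals global structure.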

The significance of these phase transitions extends into every area that entails massive amounts of data. Reeves calls this phenomenon the “curse of dimensionality.” The phrase was coined by the mathematician Richard Bellman, and refers to how data can “blow up” to a million unknown values with a million observed variables. So, the “new oil” of data isn’t necessarily all it is made out to be, at least not as long as we lack computing power sufficient to draw reliable information from these data.
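One concrete way to see this blow-up, in a sketch of my own rather than Reeves’s, is how distances behave as dimensions multiply: in a few dimensions some points are near and some far, but in a thousand dimensions every point sits at almost the same distance, so naïve notions of “closeness” stop carrying information:

```python
import math
import random

def nearest_to_farthest_ratio(dim, n_points=500, seed=1):
    """Scatter n_points uniformly in the unit cube [0, 1]^dim and
    return the ratio of the nearest to the farthest distance from the
    cube's centre. A ratio near 1 means every point looks roughly
    equally far away -- distances no longer tell points apart."""
    rng = random.Random(seed)
    centre = [0.5] * dim
    dists = [math.dist([rng.random() for _ in range(dim)], centre)
             for _ in range(n_points)]
    return min(dists) / max(dists)

# In two dimensions, near and far are very different; in a thousand,
# they are almost indistinguishable.
for dim in (2, 10, 100, 1000):
    ratio = nearest_to_farthest_ratio(dim)
    print(f"{dim:>4} dimensions: nearest/farthest ratio = {ratio:.2f}")
```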

In contrast, there is also the “blessing of dimensionality.” As data are scaled up, the number of random interactions becomes so large that the macro-level behaviour becomes predictable and non-random. “The analysis can become beautiful when the data are large and complex,” Reeves says. “But much of this math cannot be brute-forced via numerical simulation. Instead, one sometimes needs to go back to the pencil and paper to understand the mathematical properties of data.”
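The blessing of dimensionality rests on the same law-of-large-numbers intuition. A small simulation (again my own sketch, not from the article) shows how the average of many random ±1 “interactions” fluctuates less and less as their number grows:

```python
import random
import statistics

def typical_fluctuation(n, trials=200, seed=2):
    """Average n random +1/-1 'interactions', repeat over many trials,
    and return the standard deviation of that average. The smaller it
    is, the more predictable the macro-level behaviour."""
    rng = random.Random(seed)
    averages = []
    for _ in range(trials):
        total = sum(rng.choice((-1, 1)) for _ in range(n))
        averages.append(total / n)
    return statistics.stdev(averages)

# The average of 10 interactions swings wildly; the average of 10,000
# barely moves -- the macro behaviour becomes predictable.
for n in (10, 100, 10000):
    print(f"n = {n:>5}: typical fluctuation of the average ≈ "
          f"{typical_fluctuation(n):.3f}")
```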

Just as today’s theoretical physicists, contemplating the theory of the universe, must fall back on mathematical equations to plot cosmic activity when their most powerful telescopes fall short, today’s computer scientists appear to be back where our ancients were, the only difference being that the ancients used birch bark and quill while our contemporaries use pencil and paper to codify abstract ideas.

*Siddharth Pai is a world-renowned technology consultant who has personally led over $20 billion in complex, first-of-a-kind outsourcing transactions.*