Grasping how neural nets work
If research and advisory firm Gartner Inc. is right in its forecast, Artificial Intelligence (AI) technologies will become pervasive in almost every new software product and service by the year 2020. The growth in AI, broadly a set of computational technologies and methodologies aimed at helping machines emulate human intelligence, is being driven primarily by sophisticated algorithms, the availability of huge data sets, greater computing power, and advances in machine learning as well as deep learning.
Machine learning, a subset of AI, is broadly about teaching a computer to spot patterns and draw connections from mountains of data without being explicitly programmed for the specific task. A recommendation engine is a good example. Deep learning, an advanced machine learning technique, uses layered (hence “deep”) neural networks (neural nets) that are loosely modelled on the human brain. Neural nets enable image recognition, speech recognition, self-driving cars and smart-home automation devices, among other things.
A neural net comprises thousands or even millions of simple processing nodes that are densely interconnected. An individual node might be connected to several nodes in the layer beneath it, from which it receives data, and several nodes in the layer above it, to which it sends data.
Broadly, this is how neural nets work: data passes through the network layer by layer. The neurons in the first layer, for instance, each perform a calculation on their inputs and send the result (the output) to the neurons in the next layer, and so on, until the final layer produces the overall output. A process known as back-propagation then tweaks the calculations of individual neurons, allowing the network to learn to produce a desired output.
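The layer-by-layer forward pass and back-propagation described above can be sketched in a few lines of Python. This is a toy two-layer network learning the XOR function; the architecture, learning rate and training data are illustrative assumptions, not details of any system discussed in this column.

```python
import numpy as np

rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
y = np.array([[0], [1], [1], [0]], dtype=float)              # desired outputs

W1 = rng.normal(size=(2, 8))   # weights: input layer -> hidden layer
W2 = rng.normal(size=(8, 1))   # weights: hidden layer -> output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):
    # Forward pass: each layer performs its calculation and sends
    # the result on to the layer above it.
    h = sigmoid(X @ W1)        # hidden-layer activations
    out = sigmoid(h @ W2)      # overall output

    # Back-propagation: push the error backwards and tweak the
    # weights so the network learns to produce the desired output.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out
    W1 -= 0.5 * X.T @ d_h

print(np.round(out).ravel())   # after training, typically close to [0, 1, 1, 0]
```

The same two steps, a forward calculation followed by a backward weight adjustment, scale up to the million-node networks the column describes.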
Researchers, though, continue to be troubled by the fact that neural nets are “black boxes”—once they have been trained on the data sets, even their designers rarely have any idea how the results are generated. In a 2 November 2016 paper, Rationalizing Neural Predictions, Massachusetts Institute of Technology (MIT) researchers attempted to address these issues by proposing a neural network that would be forced to explain why it reached a particular conclusion.
Moreover, “unsupervised” or “adaptive” learning—wherein you can run a deep learning algorithm with no desired output in mind but let it start evaluating results and adjusting itself as it desires—can lead to undesired results. A case in point of an unsupervised neural network going rogue is that of Microsoft Corp.’s Tay AI chatbot, which made racist tweets, forcing the company to apologize and pull the bot down.
Researchers are trying their best to address this issue. Two years ago, a team of computer-vision researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (Csail) described a method for peering into the black box of a neural net trained to identify visual scenes. But the method required data to be sent to human reviewers recruited through Amazon’s Mechanical Turk crowdsourcing service, according to a 30 June press note.
At this year’s four-day Computer Vision and Pattern Recognition conference, which started on 22 July, Csail researchers presented a fully automated version of the same system. While the previous paper reported the analysis of one type of neural network trained to perform one task, the new paper reports the analysis of four types of neural networks trained to perform more than 20 tasks, including recognizing scenes and objects, colouring grey images, and solving puzzles.
In both papers, the MIT researchers doctored neural networks trained to perform computer-vision tasks to ensure they disclosed the strength with which individual nodes fired in response to different input images. Then they selected the 10 input images that provoked the strongest response from each node.
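The selection step described above amounts to ranking every input image by how strongly it fires a given node and keeping the top 10. A minimal sketch of that idea, using a random stand-in activation matrix rather than activations from the MIT models:

```python
import numpy as np

# Stand-in data: one row per input image, one column per node; each
# entry is the strength with which that node fired on that image.
# These are random numbers purely for illustration.
rng = np.random.default_rng(1)
n_images, n_nodes = 500, 6
activations = rng.random((n_images, n_nodes))

top10 = {}  # node index -> indices of its 10 strongest images
for node in range(n_nodes):
    order = np.argsort(activations[:, node])[::-1]  # strongest first
    top10[node] = order[:10].tolist()

print(top10[0])  # the 10 image indices that most excited node 0
```

Inspecting what those 10 images have in common, first via human reviewers and later automatically, is what lets the researchers guess what each node has learned to detect.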
In the earlier paper, the researchers sent the images to workers recruited through online marketplace Amazon’s Mechanical Turk (Mturk.com), asking them to identify what the images had in common. In the new paper, they use a computer system instead.
Similar attempts are being made by researchers from the department of computer science at Brown University. In their paper, published on 2 June, they too acknowledge that as “deep neural networks continue to find application to a growing collection of tasks, understanding their decision-making processes becomes increasingly important”.
Cutting Edge is a monthly column that explores the melding of science and technology.