Context and common sense
Machine learning represents the idea that a computer, when fed enough raw data, can begin, on its own, to discern patterns and rules in those data, and learn to recognize and categorize new data as they arrive according to the patterns and rules it has already built. And as more data arrive, they too add to the ‘knowledge base’ of the computer, making its patterns and rules ever more refined and therefore more reliable.
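The idea in the paragraph above can be sketched in a few lines of code. This is a minimal, hypothetical illustration (a nearest-neighbour classifier with made-up data, not any particular system): the program ‘learns’ from labelled examples, categorizes a new data point into the patterns it already has, and then folds that point back into its knowledge base to refine future answers.

```python
def nearest_label(examples, point):
    """Return the label of the training example closest to `point`."""
    def dist(a, b):
        # Squared Euclidean distance between two feature tuples.
        return sum((x - y) ** 2 for x, y in zip(a, b))
    closest = min(examples, key=lambda ex: dist(ex[0], point))
    return closest[1]

# 'Raw data' the computer is fed: (features, label) pairs.
examples = [((1.0, 1.0), "cat"), ((9.0, 9.0), "dog")]

# A new data point is categorized into the patterns already learned...
label = nearest_label(examples, (2.0, 1.5))   # -> "cat"

# ...and then added to the knowledge base, refining future answers.
examples.append(((2.0, 1.5), label))
```

Nothing here was told to the program explicitly about cats or dogs; the ‘rule’ emerges entirely from proximity to past data, which is precisely the brute-force, pattern-matching character of the approach the column goes on to discuss.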
Some fear machine learning because they think that the rules created by these machines will reduce the need for human knowledge and capability to drive them. I shall dwell on the ramifications of machine learning in a future column, but in a recent column on bots I explored how firms focused on automation are reducing the number of humans needed to manage and operate computer systems. Bots accomplish this by recording and replicating the repetitive tasks that humans perform. The creators of these bots believe that adding machine learning components can greatly increase the efficiency and scope of their bots. In essence, they think that machines generating knowledge bases automatically will allow them to pull data from the internet and assemble vast storehouses of human-like knowledge at warp speed.
There is, however, a small but pertinent inconvenience. Despite the great advances in computing and the rhetoric around them, it is still very difficult to teach computers both human context and basic common sense. The brute-force approach of the AI (artificial intelligence) behemoths does not rely on well-codified, common-sense-based rules; it relies instead on the raw computing power of the machine to run through thousands upon thousands of potential combinations before selecting the best answer by pattern-matching. This applies as much to questions that a five-year-old answers intuitively as it does to powering medico-radiological diagnosis.
Douglas Lenat, one of the fathers of modern AI, began addressing the problem of making machines more human-like in a very different way from the top-down approach of today’s machine learning efforts. He and his colleagues have been working for decades to codify many aspects of human learning, especially common sense and the understanding of context, into a knowledge base that computers can use. He took a ground-up approach, at first employing PhDs in philosophy, experts in logic, to codify knowledge into simple rules that a computer could understand.
These bit-parts of knowledge could be as simple as ‘Every room has a door. A door can be open or closed. One can only enter the room if the door is open. A room may also have windows, but these are usually smaller than doors. It is usually not possible to enter a room through a window’. While the codification of this sort of common sense may seem laughable, and Lenat has had many detractors, even his detractors would agree that it is not something a computer would ever know unless it had been told so. By contrast, a human being already has thousands of such rules embedded in their thinking by the time they are five years old.
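To see what such hand-codified rules buy a machine, here is an illustrative and vastly simplified rendering of the door rule in code. This is not Cyc’s actual representation (Cyc uses its own formal language and a far richer logic); it is only a sketch of the principle that a rule, once stated explicitly, lets a program draw a conclusion no amount of raw data would hand it.

```python
def can_enter(room):
    """Codified common-sense rule: one can only enter a room
    if its door is open (windows don't count as entrances)."""
    return room.get("door") == "open"

# A hypothetical room described as simple facts.
room = {"door": "closed", "windows": 2}

can_enter(room)      # False: the door is closed

room["door"] = "open"
can_enter(room)      # True: now entry is possible
```

The point of the sketch is the asymmetry the column describes: the rule is trivial for a five-year-old, but the computer knows it only because someone wrote it down.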
Lenat had originally estimated that this endeavour would take more than 1,000 person-years of effort before his knowledge base would have the common sense needed for computers to build truly intelligent behaviour on it. His labour of love has lasted over three decades, and has probably gone beyond the 1,000 person-years, but is still a work in progress. It was first heavily backed by the US government and industry in an attempt to keep the country ahead of Japan as that country gained ascendancy in electronics and allied manufacturing.
Lenat and his team of researchers created a computer knowledge base called Cyc, from “encyclopaedia”, which for many years was owned by the Microelectronics and Computer Technology Corporation, or MCC. MCC was founded with the direct help of Admiral Bobby Ray Inman, who recruited Lenat away from Stanford University, where he was a professor of computer science and AI. Inman had previously headed the National Security Agency and served as deputy director of the Central Intelligence Agency, the signals intelligence and foreign intelligence arms of the US government, and was hand-picking the country’s best brains for MCC to help beat back the Japanese.
MCC was finally dissolved early in this century, and Lenat and his team went on to found Cycorp, a company dedicated to keeping Cyc alive and growing. The company markets its common-sense knowledge base, and part of it is available as OpenCyc under an open-source licence, giving programmers and others access to the knowledge base so that they can make their programs ready for the real world as AI ups the stakes.
Some in the industry today feel that Lenat’s piece-by-piece, hand-codified semantic approach is outdated, and that machine learning is now capable of reaching the level of semantic understanding that has been painstakingly codified into Cyc since 1984. In my opinion, machine learning and Lenat’s breed of codified knowledge will end up needing each other, even if all the brute-force computing engines of today do is treat this treasure trove as just more data to crunch.
Lenat’s dream is that computers will end up becoming more than idiot savants. Paraphrasing his words from an interview he gave to IBM’s developerWorks platform some years ago: if all we are looking for is information retrieval, which is like a dog bringing the newspaper to its master despite not understanding a word of what is in it, then knowledge of context is not important. But if we want our computers truly to reason, then context and common sense are king and queen.
Siddharth Pai is a world-renowned technology consultant who has personally led over $20 billion in complex, first-of-a-kind outsourcing transactions.