Semiconductor company Intel has added a new feather to its AI cap in the form of two new Nervana NNPs (neural network processors) designed for AI (artificial intelligence) in the cloud. The duo are based on the NNP-T (for training deep learning models at any scale) and NNP-I (for deep learning inference to work out new insights) architectures, which were announced in August at the Hot Chips event. In a telephone conversation with Mint, Naveen Rao, Intel corporate vice president and general manager, Artificial Intelligence Products Group, elaborates on Intel’s AI strategy and the relevance of India to it.
Q. Intel has been pushing for AI everywhere. What does it mean and how is Intel geared for it?
NR: AI everywhere means AI will be one of the dominant capabilities of computing. How does that work with our roadmap? We have segmented it in a few different ways. We have the Intel Nervana neural network processors for inference and training in the datacenter, and then we have the Movidius VPU (Vision Processing Unit) at the Edge. Movidius is under 5 watts, inference is between 10 and 40 watts, and training is 100 to 300 watts. So these are the three broad segments we look at for AI acceleration today. We are also working on bringing those AI capabilities, or some part of them, to our other products like the FPGA (field programmable gate array) and CPU (central processing unit). For the CPU, we actually just announced a new variant called Cooper Lake, which will support VNNI instructions for deep learning.
These are 8-bit vector instructions which are very relevant to inference capabilities in the datacenter or in the PC. That's an example of how, across multiple use cases all the way from laptops and desktops to servers, we have increased inference capabilities by 2 to 3x. In addition to this, we're also building a graphics roadmap. We've had integrated graphics for many years, and we're also going to be building discrete graphics which will have AI capabilities as well.
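To see why 8-bit instructions matter for inference, here is an illustrative sketch (not Intel code, and not how VNNI is actually programmed): deep-learning inference often quantizes weights and activations to 8-bit integers, so a dot product runs in cheap integer arithmetic and is rescaled to floating point at the end. The function names and scale values below are invented for the example.

```python
# Illustrative 8-bit quantization, the kind of integer arithmetic
# that int8 vector instructions accelerate in hardware.

def quantize(values, scale):
    """Map floats to the int8 range [-128, 127] with a fixed scale."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def int8_dot(a_q, b_q, a_scale, b_scale):
    """Dot product in integer arithmetic, rescaled back to float."""
    acc = sum(x * y for x, y in zip(a_q, b_q))  # wide integer accumulator
    return acc * a_scale * b_scale

# Hypothetical weights and activations for one neuron.
weights = [0.12, -0.48, 0.33, 0.91]
activations = [1.5, 0.2, -0.7, 0.05]

w_scale, a_scale = 0.01, 0.02
w_q = quantize(weights, w_scale)
a_q = quantize(activations, a_scale)

approx = int8_dot(w_q, a_q, w_scale, a_scale)
exact = sum(w * a for w, a in zip(weights, activations))
print(approx, exact)  # the quantized result closely tracks the float one
```

The small accuracy loss from quantization is usually acceptable for inference, which is why hardware support for 8-bit vector maths can deliver the 2–3x speedups mentioned above.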
Q. What role can AI play in graphics?
NR: There is an overlap between AI and graphics computing on what we call the GPU (graphics processing unit). Nvidia is really pushing that with their programming model and all of that. There was some overlap between how that's used for graphics and how it is used for AI. But what we're seeing now is that even within a classical Nvidia GPU, there is AI acceleration right inside the die itself. So we're going to be infusing our graphics processors, which also handle graphics workloads, with AI capabilities. They're built for graphics, media and AI.
Q. Is it right to say that Intel has been a bit late to the NPU party, considering that Google’s TPU (Tensor Processing Unit) and the NVDLA (Nvidia Deep Learning Accelerator) hit the market earlier?
NR: I don't think it's late. I think it's still a very new market. Nvidia is really just getting started, and Google is one of the companies at the forefront, so they saw the need before many others. Intel definitely would have been happier to get to the market earlier, but I think the market is only really emerging now. We see this ramping from a single-digit-billion-dollar market next year to double-digit billions over the next several years. So this is really the inflection point where the market takes off. At the moment, only very large players like Google, Facebook, Microsoft, Baidu and Alibaba can take advantage, because they have made large investments and have teams of 1,000 experts who can really exploit the capabilities. The broader enterprise cannot yet do that; they don't have the ability to invest large amounts of money. What we're seeing now is a maturing of the software stack and the capabilities, which allows bigger markets to flourish.
Q. Why would a customer choose Intel Nervana over rivals?
NR: Nervana has an edge in several ways, but it depends on which kind of rival you're talking about. Nvidia, like I said, is building a graphics processor that can also do some AI computing. The advantage we have over them is that our dedicated AI accelerators don't have to support all the other workloads, so we can offer more efficiency, higher performance and much higher scalability. Scalability is a very important factor, as our neural network models are getting bigger and bigger. The rate of growth is so high that Moore's law, packaging and memory technologies cannot keep up, so we have to use multiple chips to work on one singular problem. As for Google's TPU, it isn't really a product that other companies can use; they can use the service, the Google Cloud service. At Intel, we are using our silicon engineering capabilities to build something with higher performance and slightly better capabilities than what they have.
There's also a whole new set of competitors emerging from some of the traditional silicon players, like Qualcomm and AMD (Advanced Micro Devices). And then the cloud service providers themselves are investing more as a whole. Just like with the TPU, we're seeing that happen in other places like Amazon and Alibaba. So that is a competitive threat for us.
Q. How mature is the India market for NPUs and what are Intel’s expectations from it?
NR: The India market is still relatively small compared to North America, which also services Europe, and China. The India market is important from the perspective of developers, however. I look at it much more as a place where we need to enable the ecosystem, because the developer community in India can have a huge impact upon the world beyond just pure market size.
Q. We have seen what AI in the cloud can do. How will AI fare with Edge computing, and how is Intel placed in this new scenario?
NR: I think this is a huge opportunity for AI. The Edge opportunity is growing faster than the datacenter opportunity. However, it's very fragmented across many different industries, like industrial, automotive, manufacturing and healthcare. The way we look at it, again, is to take a platform approach. We have Intel Xeon for higher-power applications and Atom-based products for lower power. Both have AI capabilities that are enabled through a software stack we have built from the ground up for very easily deploying models to the Edge platform.
So we give our customers lots of different choices, because there's a diverse set of use cases, and what we call the Edge can be anywhere from half a watt all the way to 20 or 30 watts, with very different limitations in terms of thermal cooling and latency. We have a variety of solutions, but they're all AI-enabled through a unified software stack. I don't think any one player can cover everything, because the Edge has such a diverse set of use cases.