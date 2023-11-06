NEW DELHI: Generative artificial intelligence (AI) is at the peak of its hype cycle right now, but the hype could cool off soon. According to Juergen Mueller, chief technology officer and member of the executive board at German enterprise technology company SAP, this is because many businesses currently experimenting with the technology will find that certain business use cases do not work out. In an interview, Mueller also spoke about how regulations can be good for AI, and detailed SAP’s work on building its own LLM for businesses. Edited excerpts:

Where does adoption of generative AI stand right now? It is at the peak of its hype cycle. Every day, the peak gets higher. But what we also see is that it is hard to get the technology into production. Large language models (LLMs) have obvious shortcomings. Even if you give additional context, such as company-specific data when generating job descriptions through generative AI, a lot of our prompts are like statements given by a child. The interactions are still rudimentary, and that leads to misleading answers. It takes page after page of instructions just to get this right.

There will be a dip in the hype cycle. A lot of the use cases that companies are trying right now will not work. Further, a lot of generative AI deployments will happen within end-to-end business processes, and a lot of those processes will improve as a result. But it's not easy to get there in the business context, even though it is already possible in the basic consumer context right now.

Generative AI has four ingredients: algorithms, which are well understood now; compute, which is still expensive, although prices fall with every new generation; data; and finally, business process awareness. The final two will be the key.

Everyone has to relearn when it comes to understanding AI. And whenever this happens, it leaves us with a level playing field.

How would regulations hinder information flow and sensitivity of data? In theory, regulation can stop anything. The cheapest way to run a factory is to stop production entirely. But of course, that doesn't meet the needs of consumers, and all jobs would be lost. The same holds true for generative AI.

Of course, there should be guardrails against misusing this tool and doing harm, but that’s true for other tools as well. Then, of course, there are new nuances, such as when training data is involved. Here, there will be some regulations on reducing the risks associated with training data. This can be done by working with all the leading LLM providers.

If one of the LLM providers does not work out, and regulation later finds that it is not acceptable on bias or other grounds, you can simply switch to another model. That is how we can de-risk the external use of LLMs.
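The switching approach described here can be pictured as a thin routing layer in front of several interchangeable models. The following is only a minimal sketch under stated assumptions, not SAP's implementation: `ModelRouter`, `provider_a` and `provider_b` are hypothetical names, and each provider is stood in for by a plain Python function rather than a real vendor API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

# Assumption: a provider is any callable that maps a prompt to a completion.
Provider = Callable[[str], str]


@dataclass
class ModelRouter:
    """Routes prompts to the first preferred provider that is not blocked."""

    providers: Dict[str, Provider]
    preference: List[str]
    blocked: Set[str] = field(default_factory=set)

    def block(self, name: str) -> None:
        # Called when a provider is ruled out, e.g. after a regulatory finding.
        self.blocked.add(name)

    def active(self) -> str:
        # Walk the preference order and return the first usable provider name.
        for name in self.preference:
            if name in self.providers and name not in self.blocked:
                return name
        raise RuntimeError("no permitted LLM provider is available")

    def complete(self, prompt: str) -> str:
        return self.providers[self.active()](prompt)


# Hypothetical stand-ins for two external LLM providers.
def provider_a(prompt: str) -> str:
    return f"[A] {prompt}"


def provider_b(prompt: str) -> str:
    return f"[B] {prompt}"


router = ModelRouter(
    providers={"provider_a": provider_a, "provider_b": provider_b},
    preference=["provider_a", "provider_b"],
)
```

Calling `router.block("provider_a")` would make every later `router.complete(...)` call fall through to `provider_b` without any change to the calling code, which is the de-risking idea in miniature.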

Do governments understand generative AI, and have the early regulations helped? Yes. As long as they are reasonable and risk-based, regulations are good, and that's what we're seeing in most places. But there are limitations. Internally, for instance, we have an AI ethics council, and all AI use cases go through it. There is also an independent external body, which includes professors, among others. They check the AI use cases that we develop, and they have stopped some of them from time to time. Such checks are relevant in use cases such as hiring and firing solutions.

The point is that most regulations we’ve seen so far are not as sophisticated as the steps we’ve already taken internally. As a result, we haven’t yet seen any instance where we’ve had to adjust our practices because of external regulations.

Would you consider building your own large AI model, then? We are already building our own LLM. We have a lot of critical data, but that belongs to customers, and we’re only custodians. So, when we started working on our own LLM, we used metadata and structural information from end-to-end business processes to codify best practices. We collected data from various sectors such as finance, supply chain, quality management, human resources, procurement, and customer interactions and experience, across 26 different industries.

We've codified all of this into our LLM. All of it is anonymized. In the structural information layer, for example, we know the purchase order details of a particular industry, such as the chemical industry.

Isn’t all this data actually your customers’? Yes. But five years ago, we also started changing our contracts with customers. We included an optional clause that allows us to anonymize their data and use it to train our machine learning (ML) models. Customers can access this permission from their dashboard, and even revoke it at a later stage. They can also break down permissions for data by segment.
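The opt-in, per-segment and revocable permissions described here could be modelled roughly as follows. This is an illustrative sketch only: `ConsentRegistry`, `trainable`, the customer name and the segment names are all hypothetical, and the interview does not describe how SAP actually stores or enforces these permissions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ConsentRegistry:
    """Tracks which data segments each customer has opted in for training."""

    # Assumption: no entry means no consent, since the clause is optional.
    granted: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, customer: str, segment: str) -> None:
        self.granted.setdefault(customer, set()).add(segment)

    def revoke(self, customer: str, segment: Optional[str] = None) -> None:
        # Revoke one segment, or every segment when none is given.
        if segment is None:
            self.granted.pop(customer, None)
        else:
            self.granted.get(customer, set()).discard(segment)

    def allowed(self, customer: str, segment: str) -> bool:
        return segment in self.granted.get(customer, set())


def trainable(records: List[dict], registry: ConsentRegistry) -> List[dict]:
    """Keep only the records whose customer consented for that segment."""
    return [r for r in records if registry.allowed(r["customer"], r["segment"])]


registry = ConsentRegistry()
registry.grant("acme", "finance")  # hypothetical customer and segment

records = [
    {"customer": "acme", "segment": "finance", "value": 1},
    {"customer": "acme", "segment": "procurement", "value": 2},
]
```

With this setup, `trainable(records, registry)` keeps only the finance record, and a later `registry.revoke("acme")` removes that customer's data from any subsequent selection.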

Beyond permissions, how difficult is it to build an LLM? It definitely requires significant investments. We have also built a team, and we’re expanding it as well. We already have our first prototype, and it is being applied to certain use cases—such as payment predictions. This can be used by suppliers of a large group to understand payment cycles, so as to fund the supplies.

This and three other use cases are already applying our LLM internally at an early stage. This is better than working with individual ML models, as we did before, which was very tedious. We've done this for over 8.5 years, and have more than 130 use cases. This helped us understand how to operate the data pipeline, and learn how to work with AI models.

We don’t disclose a specific investment amount, and we haven’t disclosed a timeline for when we’d make it live, because we don’t want to put any artificial pressure on ourselves. We’ll do it when we see that it is adding value for our customers. Until then, we’re already offering generative AI applications and use cases to our customers.

Does this LLM go into trillions of data parameters? This would be in the billions, as it would be for most other LLMs. We have used at least 200,000 database tables here.

