Date: 2023-10-23

You can’t go very far in Silicon Valley without hitting an “AI for X” startup. AI for enterprise tech. AI for medicine. AI for dating. And on and on.

Some of these startups, no doubt, are pure marketing hype. But even most of the others are simply applying existing AI to a given category of human need or desire—licensing big AI systems from well-capitalized startups and tech giants, such as OpenAI’s ChatGPT, Google’s Bard and Anthropic’s Claude, and applying them to whatever area of human endeavor their founders think hasn’t had enough AI thrown at it yet.

The sudden ubiquity of these startups and services suggests that the AIs they are leveraging are ready for prime time. In many ways, though, they are not. Not yet, anyway. But the good news (at least for AI enthusiasts) is that the underlying AIs upon which all this hype rests are getting better, fast. And that means that today’s hype could quickly become tomorrow’s reality.

To understand all of this—why the AIs aren’t ready for prime time, how they are getting better, and what that can tell us about where we’re heading—we have to go on a bit of an intellectual journey.

Today’s steam engines

To start, it helps to understand how these AIs work. Two terms it’s imperative to know: “generative AI” and “foundation models.” The current generation of AIs that have people so excited (the ones doing things that until a couple of years ago it seemed only humans could) are what are known as generative AIs. They are based on foundation models, which are gigantic systems trained on enormous corpuses of data, in many cases terabytes of information representing everything readily available on the internet.

Generative AIs are the AIs that generate eerily humanlike responses to written prompts, or surprisingly convincing images, or artificial voices that sound just like the humans they copy.

The best way to understand where these AIs might take us, and why predictions are only so useful, is to compare them to other transformative technologies in their earliest stages of development. Take the steam engine. No one in the early 18th century could have known that the primitive steam-powered pumps invented by Thomas Savery and Thomas Newcomen, used for removing water from mines, would someday evolve into highly efficient steam turbines essential for generating electricity. (For one thing, electricity had yet to be discovered.)

The first steam engines were the product of intuition and tinkering, not a robust understanding of the science of thermodynamics, says George Musser, author of a forthcoming book about how scientists are pioneering new ways to probe the nature of human and machine intelligence.

In a pattern repeated over and over in the history of technology, first there was the thing—in this case the steam engine—and only later did we come to understand it. That understanding, which we call thermodynamics, took on a life of its own, becoming one of the most universally applicable branches of physics.

Well, it’s happening again. In an almost perfect recapitulation of that history, today’s AIs have been the products of intuition and tinkering, and we don’t understand how they work, says Musser. But like the earliest steam engines, today’s generative AIs contain within them the seeds of countless future applications. Unlocking those applications will require something that is only just getting started—an understanding of how foundation models and generative AI actually work.

To that end, computer scientists, mathematicians, physicists, neuroscientists and engineers are all coming together to create a new area of study: a universal science of machine intelligence. And as they develop it, we’re gaining useful insights into what AIs might one day be capable of.

In search of reason

Some researchers, for instance, are convinced that one kind of foundation model is already capable of something that is, for all intents and purposes, reasoning.

Here, we have to introduce a third term—large language model. A large language model is a type of generative AI, and one representative of the class of foundation models that is trained exclusively on text. (ChatGPT, Bard and Meta’s new chatbots are all examples.)

Whether large language models have crossed the threshold from merely memorizing and regurgitating information about the world, to synthesizing it in completely novel ways—that is, reasoning about it—is a matter of debate.

Blaise Aguera y Arcas, an AI researcher at Google Research, cites the ability of today’s large language models to handle tricky tasks, when prompted with enough information about them, as evidence of their ability to reason. For example, with proper coaxing, it’s possible to get a large language model to give a correct answer to basic mathematical questions, even though, say, the product of two four-digit numbers isn’t anywhere in its training data.

“Figuring that out means having had to have learned what the algorithm for multiplication actually is; there is no other way to get that right,” says Aguera y Arcas.
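For readers who want to see what “the algorithm for multiplication” means in practice, here is a minimal sketch of the digit-by-digit method taught in school, written in Python. It is only an illustration of the procedure itself: nothing here is drawn from how any language model actually computes a product, and the function name long_multiply is made up for this example.

```python
def long_multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers digit by digit, as taught in school."""
    # Store digits least-significant first, so index i stands for 10**i.
    a_digits = [int(ch) for ch in reversed(str(a))]
    b_digits = [int(ch) for ch in reversed(str(b))]

    # An m-digit number times an n-digit number has at most m + n digits.
    result = [0] * (len(a_digits) + len(b_digits))

    for i, da in enumerate(a_digits):
        carry = 0
        for j, db in enumerate(b_digits):
            # Add the single-digit product to what is already in this column.
            total = result[i + j] + da * db + carry
            result[i + j] = total % 10
            carry = total // 10
        # Whatever carry is left over goes into the next column.
        result[i + len(b_digits)] += carry

    return int("".join(str(d) for d in reversed(result)))


if __name__ == "__main__":
    print(long_multiply(1234, 5678))
```

The point of the sketch is that the procedure works for any pair of numbers, including pairs nobody has ever written down, which is the property Aguera y Arcas is pointing to.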

Other researchers think Aguera y Arcas is overstating the amount of reasoning today’s large language models are capable of. Sarah Hooker, director of Cohere for AI, the nonprofit research wing of AI company Cohere, says that some of what people think is reasoning by large language models could just be things they’ve memorized. Memorization could explain why these models gain new capabilities as they grow bigger; it need not be that teaching them language gives them the ability to reason.

“A lot of the mystery is that we just don’t know what’s in our pretraining data,” says Hooker. That lack of knowledge comes from two factors. First, many AI companies are no longer revealing what’s in their pretraining data. Also, these pretraining data sets are so big (think: all the text available on the open web) that when we ask the AIs trained on them any given question, it’s difficult to know if the answer just happens to be in that ocean of data already.

In any case, there is now ample evidence that these large language models are capable of some form of reasoning, however primitive by human standards, says Sayash Kapoor, a third-year Ph.D. student at Princeton who researches and writes about the limitations of today’s AIs. “But there is also evidence that in many cases memorization in these models is leading to performance claims that may be exaggerated,” he adds.

What’s next

If you’ve gotten this far, here’s the payoff: If today’s large language models are capable of some amount of reasoning, however elementary, that could lead to years of rapid advances in the abilities of generative AIs.

In part, that’s because language isn’t just another medium of communication, like pictures or sound. It’s a technology humans developed for describing absolutely everything in the world we can conceive of, and how it all relates. Language gives us the ability to build models of the world, even absent any other stimuli, like vision or hearing, says Aguera y Arcas. That is why a large language model can write fluently about the relationship between, say, two colors, even though it has never “seen” either of them, he says.

In addition, language is the interface for countless other systems on the internet that were designed for use by humans, but which can be repurposed by these generative AIs—such as search engines.

The synthesis of all of these observations about large language models is that, for example, we might soon have AI-based assistants that are completely personalized to data specific to us. Google is already attempting a first version of this—an update to its Bard generative AI allows it to search and synthesize across all of your emails, calendar items and documents, as long as they are already in Google’s system—but it’s primitive and prone to error.

In the not-too-distant future, such systems might be better able to refashion themselves when fed our personal data, in a way that is analogous to how humans continuously form new memories, says Aguera y Arcas. Within, say, two to five years, this could make future AI assistants much better at personalizing their responses to every one of us.

When I asked Aguera y Arcas if such hyper-personalized AI assistants are coming, he said that while he can’t comment on any future products from Google, the trajectory of today’s AIs means that the existence of such an assistant is “a very obvious implication.”

Another implication is that future AIs will gain new abilities in much the same way humans do: by being given access to the cloud-based software, originally intended for humans, that offers those services.

The simplest example of this is giving chat-based AIs access to search engines like Google. But of course the internet has many more search engines on it than just Google’s—there are repositories for code, for legal decisions, for academic papers, and on and on.

One way that generative AIs are being connected to services originally intended for humans is through “plug-ins.” For example, the travel search services Kayak and Expedia can be accessed from ChatGPT through plug-ins, as can the shopping services Instacart and Shop.

Large language models need such plug-ins for three reasons. Although they have been trained on huge volumes of information, they might not have access to things that aren’t available to be scraped from the web. They are only as up-to-date as the body of information they were last trained on. And even with all of that data inside them, they can struggle with some kinds of reasoning, such as mathematics.
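The idea of handing off work to a plug-in can be sketched in a few lines of Python. What follows is a toy illustration only, not a description of how ChatGPT plug-ins are actually built. The two “plug-ins” (calculator and today), the PLUGINS table and the "tool_name: argument" request format are all assumptions made up for this example, and the request is typed by hand where a real system would have the model produce it.

```python
import ast
import operator
from datetime import date

# Arithmetic operators the hypothetical calculator plug-in will accept.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str):
    """Evaluate +, -, *, / arithmetic exactly instead of guessing the answer."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported expression")

    return evaluate(ast.parse(expression, mode="eval").body)


def today(_: str) -> str:
    """Return the current date, which a model frozen at training time cannot know."""
    return date.today().isoformat()


# Hypothetical registry mapping plug-in names to plain Python functions.
PLUGINS = {"calculator": calculator, "today": today}


def run_tool_request(request: str):
    """Dispatch a request of the form 'tool_name: argument' to a plug-in."""
    name, _, argument = request.partition(":")
    tool = PLUGINS.get(name.strip())
    if tool is None:
        raise KeyError(f"no plug-in named {name.strip()!r}")
    return tool(argument.strip())


if __name__ == "__main__":
    print(run_tool_request("calculator: 1234 * 5678"))
    print(run_tool_request("today:"))
```

Each toy plug-in covers one of the gaps described above: the calculator handles math the model might get wrong, and the date lookup supplies information newer than the training data.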

When large language models are given access to the same kinds of resources humans already have access to, the real potential of future versions of “AI for X” services and startups becomes apparent. Rather than just offering access to what’s really a licensed and rebranded version of an existing foundation model, these startups can start to integrate all kinds of other data and services. “AI for legal advice” would integrate databases of legal decisions, say, or “AI for diagnoses” would tap in to databases of medical literature. These systems would leverage the primitive reasoning ability of large language models to get people answers to questions that are much more reliable than the frequently flawed and made-up answers that they are currently capable of.

Hard to imagine

What the world will be like when we all have these new kinds of cognitive aids is as difficult to predict as railroads, cars, jet planes and rockets would have been from the perspective of those first builders of steam engines.

What’s more, there remain many barriers to this promised nirvana of plain-language interfaces for AI assistants that can tap in to the superpowers of the internet on our behalf. One of them is the cost of running today’s generative AIs, which needs to come down before hundreds of millions of us are able to maintain a continuous dialogue with our future AI assistants, rather than just early adopters occasionally asking them targeted questions.

Another barrier is that even the near-future systems that fuse large language models and specialized systems for making them better at particular tasks are, in the words of Douwe Kiela, CEO of Contextual AI, a kind of “Frankenstein’s monster.” Fixing the cost issues that arise in such cobbled-together systems, and making them more useful, may require many years of continuous improvements in which engineers optimize every part of these systems to work together harmoniously, while shaving off whatever doesn’t serve the customer.

Between the invention of the steam engine and the debut of the locomotive, more than a century elapsed. Meanwhile, a new science was born, which in turn became the midwife of countless other advancements essential to the Industrial Revolution. If the development of generative AIs conforms to this pattern at all, its near future will include transformative inventions—AIs expert in different subjects, truly personal assistants—followed by years of refinement, mad scrambles to harness and benefit from these new technologies, and possibly another sort of Industrial Revolution. But rather than a revolution predicated on energy and matter, this one will be based on the manipulation of data and insight.

We can only begin to imagine what that will look like.

Christopher Mims writes The Wall Street Journal’s Keywords column. Email him at christopher.mims@wsj.com.