Samir Abboud, chief of emergency radiology for Northwestern Medicine, thought he was already working at maximum speed. In a carefully honed routine, aided by voice dictation, he could finish writing an X-ray report in as little as 75 seconds.
Then the Chicago-based health system rolled out generative artificial intelligence in 2024 that can analyze patient scans and write reports. Abboud, who checks the AI work for potential changes, said reviews have sped up to about 45 seconds.
The result was breathtaking—and startling. “It was the first time I felt like there was a clock on my career,” Abboud said.
Still, he said humans are a necessary part of the process. And reading scans faster came with benefits, too.
“You’d feel guilty getting up to use the restroom,” Abboud said. “There’s hundreds of patients waiting for our read, and any one of them could be one that’s actively dying.”
Big hospital systems have become the proving ground for widespread AI adoption, testing what the technology can do, but also revealing—sometimes via alarming mishaps—where it falls flat.
Among health systems, 27% are paying for commercial AI licenses, triple the rate across the U.S. economy, according to a recent survey by Menlo Ventures and Morning Consult.
As the aging population’s healthcare needs rise, hospitals are looking for ways to deal with persistent worker shortages that can burn out clinicians and delay care. They are also looking for efficiency wherever they can find it while cuts to Medicaid loom.
AI has especially broken through in some of the least-flashy, but most labor-intensive, tasks hospitals deal with daily: taking notes, fielding patient phone calls and dealing with insurance claims.
These tasks are often “labor-dependent, with the same rote process done thousands of times,” said Rupal Malani, a senior partner at consulting firm McKinsey who advises health systems on AI implementation.
Doctors still make medical decisions, though AI can aid the process. A University of California, Los Angeles, study last year, for example, found that AI was better able to identify subtle signs of breast cancer that can develop and grow undetected between routine screenings. The study estimated using AI to help screen patients could reduce such breast cancers by 30%.
At the same time, there are reasons for caution.
Mayo Clinic cardiologist Paul A. Friedman turned to ChatGPT when he needed to weigh in on the case of a patient who needed a defibrillator implantation a few days after having heart surgery. Friedman thought such a procedure was feasible and safe, but wanted to know whether there were case studies. ChatGPT responded by giving him references to several reports published in medical journals that it said showed such a procedure was “safe and effective.” Friedman said “it looked very realistic” until a colleague tried searching for the studies only to discover they were completely fabricated.
After that experience, Friedman said, he takes a “trust but verify” approach. “It’s not that I don’t ask ChatGPT medical questions but when I do, I always look for the references, click on them and read the abstracts at a minimum,” he said. The hospital’s cardiology department is testing out alternative in-house AI tools.
A spokesperson for OpenAI, the company behind ChatGPT, said that its teams run “ongoing evaluations to reduce harmful or misleading responses,” and that its latest models are much more able to provide accurate health information than previous versions such as the one Friedman would have used. ChatGPT wasn’t intended to be a substitute for guidance from health professionals, the company added.
An October study in the Lancet Gastroenterology & Hepatology found that physicians who used AI for three months to aid them in spotting growths during colonoscopies were able to detect significantly fewer such growths once the tool was taken away.
“I’m constantly worried about myself with deskilling,” said Anthony Cardillo, a pathologist based in New York City who directs a Memorial Sloan Kettering laboratory specializing in blood samples. “Any time I outsource my thoughts to something that isn’t my own brain, I’m worried I’m going to lose that muscle memory.”
Cardillo said he and his colleagues use generative AI to review specimens, but that they do so only as a second pair of eyes after already coming up with their own diagnoses.
Despite such concerns, health systems say that they see tremendous promise—and necessity.
“When you think about the tsunami of need that’s coming as a society, technology is one of the only levers we have to pull,” said Doug King, Northwestern Medicine’s chief digital and innovation officer.
At Northwestern, an AI review of a million scans taken over a year highlighted 70 that humans hadn’t flagged for further review. A manual check then showed five instances in which physicians deemed more follow-up was needed. Northwestern is also using another AI tool to schedule operating-room time more efficiently, which means more patients can be treated, officials there say.
Hospitals were early AI users, long before massive data centers started sprouting across the U.S. landscape. Predictive algorithms have powered early-warning systems for sepsis, flagged high-risk patients and helped manage scheduling for years.
In Northern California, Kaiser Permanente’s 21 hospitals use a system that analyzes all patients’ vitals and charts and scores them every hour to determine which patients are at highest risk. A study in the New England Journal of Medicine found the system saves more than 500 lives a year.
On a recent day, the system determined a heart-failure patient required more scrutiny, leading physicians to learn he was also suffering from a severe respiratory virus and needed steroids for his lungs, said Vincent Liu, a pulmonary critical-care physician at Kaiser Permanente Santa Clara Medical Center.
A continuing trial at Jefferson Health, a Philadelphia-based health system, is evaluating whether large-language AI models, such as ChatGPT, can provide breast-cancer patients with tailored nutrition advice. Factors include patients’ cancer stage, their other health issues, budgets and access to nearby stores.
In Augusta, Ga., family medicine doctor Dean Seehusen said he uses generative AI to check the latest standard of care for various conditions, particularly if it is something he doesn’t usually encounter. The tool he uses draws only from vetted medical sources, so he said he feels comfortable trusting it and checks its references only about 25% of the time.
Still, he has reservations about AI’s overall effects on medicine. “My biggest fear is that it further degrades mainstream confidence in medicine, and actually leads to a kind of Wild West for patients,” he said. He added that he sees more patients who have come up with self-diagnoses after using AI, including inaccurate ones.
Hospitals are aggressively adopting AI for tasks that are less flashy but take up a significant amount of time and resources. Electronic medical-record provider Epic Systems in 2024 launched a tool that uses generative AI to mine patient records and draft appeal letters to insurance companies. About 1,000 hospitals are already using the system, the company said.
Northwestern routinely has to appeal roughly 5-10% of the millions of claims it processes every year, said David Blahnik, vice president of information technology there.
“You’re spending so much staff overhead and work trying to fight them and appeal to them and justify why we should get paid,” he said. But, after adopting Epic’s tool, staffers now spend about 23% less time processing each denied claim, Blahnik said.
A similar effort at New York’s Mount Sinai has led to a 3% increase in insurance denials getting overturned, helping net the health system an added $12 million a year, said Lisa Stump, the chief digital information officer there.
Mount Sinai recently paused use of an Epic generative AI tool, which aimed to analyze messages patients sent to doctors and create personalized draft responses. After trying it for a few weeks, doctors said the drafts weren’t helpful and required too much rewriting.
There were some very specific mishaps, according to Ankit Sakhuja, director of Mount Sinai’s AI assurance lab. In one case, the system told a patient who asked for a walker or cane that it couldn’t help. In another, a patient reporting a headache was given a verbose response that said the patient could have anything from something minor to a brain tumor.
Epic said a small minority of hospitals have also paused use of the feature, and that it is working to make improvements. The tool has helped nurses save as much as 30 seconds per exchange with patients through the system, Epic said, and has rolled out to roughly 1,700 hospitals. Still, the company said it requires human oversight.
“Clinicians have full control of the message that goes to the patient,” said Seth Hain, Epic’s senior vice president of research and development.
Cheryl Wilkes, an internal-medicine doctor with Northwestern Medicine, said she has spent years typing away during patient visits, occasionally turning to ask a question.
She started using AI in 2024 to transcribe and summarize her patients’ visits. Instead of spending two to three hours a day after work on electronic records, she now spends just half an hour checking the AI tool’s work and making any necessary edits.
Nurse practitioner Jeremy Lapham, who works at an outpatient clinic in Ann Arbor, Mich., said he has tried using AI to transcribe some of his visits, but that editing its output has taken too much time. He occasionally checks AI-powered databases for clinical information, but he said he is careful to click on references, even if they are to medical journals, to make sure they weren’t retracted.
He spends a lot of time thinking about what treatments can work for patients given other life constraints, such as housing or mental-health challenges—criteria, he said, that AI wouldn’t necessarily know how to factor in. “I’m still skeptical,” he said.
