Microsoft claims its AI Diagnostic Orchestrator outperformed 21 doctors, got 85.5% of diagnoses right

Published 30 Jun 2025
Microsoft has introduced a new artificial intelligence (AI) system that it says can diagnose some of the most difficult medical cases more accurately and at a lower cost than human doctors.

The system, called the Microsoft AI Diagnostic Orchestrator (MAI-DxO), was tested using case studies published by the New England Journal of Medicine (NEJM). These cases are known for being particularly complex and usually involve teams of specialists. According to Microsoft, the AI got the correct diagnosis 85.5 per cent of the time, compared to just 20 per cent for a group of experienced doctors from the US and UK.

How the technology works

More people are turning to digital tools for medical advice: Microsoft says it sees more than 50 million health-related searches every day across services such as Bing and Copilot.

To test the system, Microsoft created a new challenge called the Sequential Diagnosis Benchmark (SD Bench), based on 304 real NEJM cases. The cases were turned into step-by-step scenarios, where the AI or a human doctor could ask questions or order tests before making a diagnosis. Each test had a virtual cost, helping to measure both accuracy and how wisely resources were used.
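The scoring idea described above, where each ordered test adds a virtual cost and the final diagnosis is checked against a reference, can be illustrated with a minimal Python sketch. This is not Microsoft's implementation: the class names, fields, exact-match scoring rule, and the placeholder test names and costs are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    """A hypothetical benchmark case; all field names are illustrative."""
    findings: dict[str, str]      # test name -> result revealed when ordered
    test_costs: dict[str, float]  # test name -> virtual cost (made-up units)
    diagnosis: str                # reference diagnosis


@dataclass
class Episode:
    """Tracks what one diagnostician (AI or human) does on one case."""
    case: Case
    spent: float = 0.0
    revealed: dict[str, str] = field(default_factory=dict)

    def order_test(self, name: str) -> str:
        # Each ordered test adds its virtual cost and reveals its result.
        self.spent += self.case.test_costs[name]
        self.revealed[name] = self.case.findings[name]
        return self.revealed[name]

    def diagnose(self, guess: str) -> bool:
        # Simplified exact-match scoring; an assumption for this sketch.
        return guess.strip().lower() == self.case.diagnosis.strip().lower()


def summarize(results: list[tuple[bool, float]]) -> tuple[float, float]:
    """Return (accuracy, mean cost per case) over finished episodes."""
    correct = sum(1 for ok, _ in results if ok)
    total_cost = sum(cost for _, cost in results)
    return correct / len(results), total_cost / len(results)


if __name__ == "__main__":
    # Placeholder data, not taken from any real case.
    case = Case(
        findings={"test_a": "positive", "test_b": "negative"},
        test_costs={"test_a": 50.0, "test_b": 1200.0},
        diagnosis="Condition X",
    )
    episode = Episode(case)
    episode.order_test("test_a")
    ok = episode.diagnose("condition x")
    print(summarize([(ok, episode.spent)]))  # (1.0, 50.0)
```

Tracking accuracy and cost separately, as in this sketch, is what lets a benchmark of this kind compare diagnosticians on both how often they are right and how many resources they use to get there.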

Microsoft tested several top AI models, including OpenAI’s o3, Claude, Gemini, Llama, and DeepSeek, both alone and as part of MAI-DxO. The orchestrator system works by combining different models to act like a team of doctors, sharing ideas and narrowing down possible diagnoses. The best results came from using MAI-DxO with OpenAI’s o3 model, the tech giant stated.

Reportedly, the results showed that the AI not only diagnosed more cases correctly but also did so with fewer and more cost-effective tests than the doctors involved in the study.

Limitations of the Microsoft Medical AI

However, Microsoft admitted the research has its limits. The tests focused on rare and complex cases, not everyday health problems. Also, the doctors were not allowed to use any support tools like books or the internet during the test, unlike in real-world situations where such resources are often used.

Other tools developed by the company include RAD-DINO, which helps improve radiology processes, and Dragon Copilot, a voice assistant for doctors.

Microsoft says it is now working with health organisations to test its AI in real clinics and hospitals. Before any wider use, the technology will need to meet safety standards and get approval from regulators.

