
Microsoft Says Its New AI System Diagnosed Patients 4 Times More Accurately Than Human Doctors

Microsoft has introduced a groundbreaking artificial intelligence system designed to diagnose illnesses with significantly greater accuracy and efficiency than human physicians. The development marks what the company calls “a genuine step toward medical superintelligence,” according to Mustafa Suleyman, CEO of Microsoft AI. The initiative involves a team of top-tier AI researchers, including several recently recruited from Google, underlining the fierce competition for AI talent among tech giants.

The new system, known as the MAI Diagnostic Orchestrator (MAI-DxO), was tested using 304 medical case studies from the New England Journal of Medicine. Microsoft created a benchmark known as the Sequential Diagnosis Benchmark, which simulates the diagnostic reasoning process a doctor would follow—breaking down each case into symptom analysis, testing, and diagnosis.

Rather than relying on a single AI model, MAI-DxO coordinates multiple state-of-the-art AI systems, including OpenAI’s GPT, Google’s Gemini, Anthropic’s Claude, Meta’s Llama, and xAI’s Grok, in a collaborative “chain-of-debate” format that emulates a panel of human experts. The results were striking: the AI system achieved an 80% diagnostic accuracy rate, while the human doctors in the study averaged just 20%. It also identified less expensive tests and treatments, reducing projected health care costs by 20%.

“This orchestration mechanism, multiple agents working together, is what brings us closer to true medical superintelligence,” said Suleyman, who previously worked on AI at Google.

AI is already used in the U.S. health care system, particularly in radiology, but Microsoft’s tool represents a shift toward broader diagnostic use. However, the company acknowledges that AI in medicine brings concerns, especially regarding biases in training data that may not reflect diverse populations.

While Microsoft has yet to decide whether it will commercialize the technology, it may integrate diagnostic features into platforms like Bing or develop tools to assist medical professionals. “Over the next couple of years, you’ll see us doing more real-world testing of these systems,” Suleyman added.

This project adds to a growing body of research demonstrating the diagnostic potential of large language models (LLMs). Unlike previous studies, Microsoft’s research emphasizes a more realistic replication of how doctors reach diagnoses—by ordering tests, interpreting results, and refining their assessments. The company referred to this multi-model collaboration as a “path to medical superintelligence” in a blog post detailing the initiative.

The possibility that AI could drive down medical costs is particularly significant in the U.S., where health care expenses are a major concern. “Our model not only gets to the diagnosis accurately but also does so in a highly cost-effective manner,” said Dominic King, a Microsoft vice president involved in the project.

Experts in the medical AI space have responded positively but cautiously. David Sontag, an MIT researcher and cofounder of medical AI firm Layer Health, praised the study’s rigorous methodology and its attempt to mirror real clinical workflows. However, he pointed out that the doctors in the study were restricted from using external tools, which doesn’t reflect how they typically operate. He also noted that cost savings observed in simulations may not directly translate into real-world results, where physicians often consider factors the AI cannot, such as patient preferences and equipment availability.

Eric Topol, a scientist at the Scripps Research Institute, called the report “impressive,” especially for tackling complex diagnostic cases. He highlighted the novelty of showing that AI could potentially lower health care costs.

Both Sontag and Topol emphasized the need for clinical trials to validate the AI tool’s performance in real-world settings. “Only then can we get a truly rigorous evaluation of its cost and clinical effectiveness,” Sontag said.

Microsoft’s latest research signals a new phase in the evolution of AI in health care—one where the focus is not only on diagnostic accuracy but also on replicating the nuanced decision-making process of experienced physicians while reducing the financial burden on health systems.
