Academic Publication

Large Language Model Influence on Diagnostic Reasoning

Citations: 589

Published Date: October 28, 2024

Research Abstract & Technology Focus

Importance: Large language models (LLMs) have shown promise in their performance on both multiple-choice and open-ended medical reasoning examinations, but it remains unknown whether the use of such tools improves physician diagnostic reasoning.

Objective: To assess the effect of an LLM on physicians’ diagnostic reasoning compared with conventional resources.

Design, Setting, and Participants: A single-blind randomized clinical trial was conducted from November 29 to December 29, 2023. Using remote video conferencing and in-person participation across multiple academic medical institutions, physicians with training in family medicine, internal medicine, or emergency medicine were recruited.

Intervention: Participants were randomized to either access the LLM in addition to conventional diagnostic resources or conventional resources only, stratified by career stage. Participants were allocated 60 minutes to review up to 6 clinical vignettes.

Main Outcomes and Measures: The primary outcome was performance on a standardized rubric of diagnostic performance based on differential diagnosis accuracy, appropriateness of supporting and opposing factors, and next diagnostic evaluation steps, validated and graded via blinded expert consensus. Secondary outcomes included time spent per case (in seconds) and final diagnosis accuracy. All analyses followed the intention-to-treat principle. A secondary exploratory analysis evaluated the standalone performance of the LLM by comparing the primary outcomes between the LLM alone group and the conventional resource group.

Results: Fifty physicians (26 attendings, 24 residents; median years in practice, 3 [IQR, 2-8]) participated virtually as well as at 1 in-person site. The median diagnostic reasoning score per case was 76% (IQR, 66%-87%) for the LLM group and 74% (IQR, 63%-84%) for the conventional resources-only group, with an adjusted difference of 2 percentage points (95% CI, −4 to 8 percentage points; P = .60). The median time spent per case for the LLM group was 519 (IQR, 371-668) seconds, compared with 565 (IQR, 456-788) seconds for the conventional resources group, with a time difference of −82 (95% CI, −195 to 31; P = .20) seconds. The LLM alone scored 16 percentage points (95% CI, 2-30 percentage points; P = .03) higher than the conventional resources group.

Conclusions and Relevance: In this trial, the availability of an LLM to physicians as a diagnostic aid did not significantly improve clinical reasoning compared with conventional resources. The LLM alone demonstrated higher performance than both physician groups, indicating the need for technology and workforce development to realize the potential of physician-artificial intelligence collaboration in clinical practice.

Trial Registration: ClinicalTrials.gov Identifier: NCT06157944

AI Semantic Synergy Context

Connecting this academic literature to real-world market discussions and products.

Evaluation and mitigation of the limitations of large language models in clinical decision-making

Abstract Clinical decision-making is one of the most impactful parts of a physician’s responsibilities and stands to benefit greatly from artificial intelligence solutions and lar...

Large Language Models in Healthcare and Medical Domain: A Review

The deployment of large language models (LLMs) within the healthcare sector has sparked both enthusiasm and apprehension. These models exhibit the remarkable ability to provide proficient responses...

Large Language Models and User Trust: Consequence of Self-Referential Learning Loop and the Deskilling of Health Care Professionals

As the health care industry increasingly embraces large language models (LLMs), understanding the consequence of this integration becomes crucial for maximizing benefits while mitigating potential ...

A Survey on Evaluation of Large Language Models

Large language models (LLMs) are gaining increasing popularity in both academia and industry, owing to their unprecedented performance in various applications. As LLMs continue to play a vital role...
