Results for: ChatGPT shows "impressive" accuracy in clinical decision-making
Mo1273 QUALITATIVELY EVALUATING CHATGPT'S ACCURACY IN PROVIDING ANSWERS ON GASTROESOPHAGEAL REFLUX DISEASE RELATED QUESTIONS
Su1990 EVALUATING THE UTILITY OF CHATGPT OVER TRADITIONAL SEARCH ENGINE QUERY FOR SAFETY OF INFLAMMATORY BOWEL DISEASE THERAPEUTICS IN PREGNANCY AND BREASTFEEDING
Su1985 BEYOND HUMAN EXPERTISE: COMPARING CHATGPT 4.0 AND GLASS AI IN GASTROENTEROLOGICAL INNOVATION
Su1984 ASSESSING CHATGPT VS. STANDARD MEDICAL RESOURCES FOR ENDOSCOPIC SLEEVE GASTROPLASTY EDUCATION: A MEDICAL PROFESSIONAL EVALUATION STUDY
540 UTILIZING CHATGPT FOR THE DEVELOPMENT OF A NATURAL LANGUAGE PROCESSING (NLP) ALGORITHM FOR THE AUTOMATED CALCULATION OF COLONOSCOPY QUALITY METRICS
Su1962 GI-COPILOT: AUGMENTING CHATGPT WITH GUIDELINE-BASED KNOWLEDGE
Tu1997 THE POTENTIAL UTILITY OF CHATGPT 4.0 AS AN ARTIFICIAL INTELLIGENCE ASSISTANT IN THE EDUCATION AND MANAGEMENT OF PATIENTS WITH BARRETT'S ESOPHAGUS
Tu2033 EVALUATING THE UTILITY OF CHATGPT OVER TRADITIONAL SEARCH ENGINE QUERY FOR POST-UPPER ENDOSCOPY PATIENT CONCERNS
Artificial intelligence accurately predicts treatment outcomes
Transplants: clinical project with Univaq and the Centro Nazionale
Pilot initiative for clinical care training
Can ChatGPT pass the MRCP (UK) written examinations? Analysis of performance and errors using a clinical decision-reasoning framework
Objective
Large language models (LLMs) such as ChatGPT are being developed for use in research, medical education and clinical decision systems. However, as their usage increases, LLMs face ongoing regulatory concerns. This study aims to analyse ChatGPT’s performance on a postgraduate examination to identify areas of strength and weakness, which may provide further insight into their role in healthcare.
Design
We evaluated the performance of ChatGPT 4 (24 May 2023 version) on official MRCP (Membership of the Royal College of Physicians) parts 1 and 2 written examination practice questions. Statistical analysis was performed using Python. Spearman rank correlation assessed the relationship between the probability of correctly answering a question and two variables: question difficulty and question length. Incorrectly answered questions were analysed further using a clinical reasoning framework to assess the errors made.
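A minimal sketch of this correlation step is shown below, using scipy.stats.spearmanr in Python; the arrays are invented placeholders for illustration, not the study's per-question data.

# Sketch of the Spearman rank correlation analysis; the arrays are
# placeholder values, not the study's data.
from scipy.stats import spearmanr

correct    = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]               # 1 = ChatGPT answered correctly
length     = [45, 60, 120, 50, 140, 55, 70, 150, 40, 65]  # question length (words)
difficulty = [0.90, 0.80, 0.40, 0.85, 0.30, 0.75, 0.70, 0.35, 0.95, 0.80]  # difficulty proxy

for name, values in [("question length", length), ("question difficulty", difficulty)]:
    r, p = spearmanr(correct, values)
    print(f"accuracy vs {name}: r = {r:.2f}, p = {p:.4f}")

A negative r for question length, as reported in the Results below, would indicate that longer questions were answered correctly less often.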
Setting
Online, via the ChatGPT web interface.
Primary and secondary outcome measures
Primary outcome was the score (percentage questions correct) in the MRCP postgraduate written examinations. Secondary outcomes were qualitative categorisation of errors using a clinical decision-making framework.
Results
ChatGPT achieved accuracy rates of 86.3% (part 1) and 70.3% (part 2). Weak but statistically significant correlations were found between ChatGPT's accuracy and both question difficulty (the just-passing rate) in part 2 (r=0.34, p=0.0001) and question length in part 1 (r=−0.19, p=0.008). Eight types of error were identified; the most frequent were factual errors, context errors and omission errors.
Conclusion
ChatGPT's performance greatly exceeded the pass mark for both exams. Multiple-choice examinations provide a benchmark of LLM performance that is directly comparable to human demonstrations of knowledge, while also highlighting the errors LLMs make. Understanding the reasons behind ChatGPT's errors allows us to develop strategies to prevent them in medical devices that incorporate LLM technology.
First precision-medicine protocol against a childhood tumor
Developed in Italy and genome-based, it targets neuroblastoma
REVOLUTIONIZING IBD MANAGEMENT: HOW DO CHATGPT & GOOGLE BARD STAND UP IN OFFERING COMPREHENSIVE MANAGEMENT SOLUTIONS?
Artificial intelligence (AI) has notably transformed healthcare, especially the diagnosis and treatment of inflammatory bowel disease (IBD) and other digestive disorders. AI tools such as ChatGPT and Google Bard can interpret endoscopic imagery, analyze diverse samples, simplify administrative duties, and assist in assessing medical images and automating devices. By individualizing treatments and forecasting adverse reactions, these applications have markedly improved the management of digestive diseases.
ChatGPT could create fake data to rig clinical trials
Italian study: the program makes fabricated results appear credible
Abstract 16401: Optimizing ChatGPT to Detect VT Recurrence From Complex Medical Notes
Circulation, Volume 148, Issue Suppl_1, Page A16401-A16401, November 6, 2023.

Introduction: Large language models (LLMs), such as ChatGPT, have a remarkable ability to interpret natural language using text questions (prompts) applied to gigabytes of data on the World Wide Web. However, the performance of ChatGPT is less impressive when addressing nuanced questions from finite repositories of lengthy, unstructured clinical notes (Fig A).

Hypothesis: The performance of ChatGPT in identifying sustained ventricular tachycardia (VT) or fibrillation after ablation from free-text medical notes is improved by optimizing the question and adding in-context sample notes with correct responses ('prompt engineering').

Methods: We curated a dataset of N = 125 patients with implantable defibrillators (32.0% female, LVEF 48.9±13.9%, 61.7±14.0 years), split into development (N = 75) and testing (N = 50) sets of 307 and 337 notes, with 256.8±95.1 and 289.8±103 words, respectively. Notes were deidentified. Gold-standard labels for recurrent VT (Yes, No, Unknown) were provided by experts. We applied GPT-3.5 to the test set (N = 337 notes), using 1 of 3 prompts ("Does the patient have sustained VT or VF after ablation" or 2 others), systematically adding 1-5 "training" examples, and repeating experiments 10 times (51,561 inquiries).

Results: At baseline, GPT achieved an F1 score of 38.6%±19.4% (mean across 3 prompts; Fig B). Increasing the number of examples progressively improved mean accuracy and reduced variance. The optimal result was the illustrated prompt plus 5 in-context examples, with an F1 score of 84.6%±6.4% (p
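A minimal sketch of the few-shot 'prompt engineering' loop this abstract describes follows. The question wording is taken from the abstract; the openai client call pattern, the helper name classify_note, the example note texts, and the scoring via scikit-learn's f1_score are assumptions for illustration, not the authors' code.

# Hypothetical sketch: few-shot prompting of GPT-3.5 to label clinical
# notes for recurrent VT/VF, scored against expert gold labels with F1.
from openai import OpenAI               # assumes openai>=1.0 and OPENAI_API_KEY set
from sklearn.metrics import f1_score

client = OpenAI()

QUESTION = "Does the patient have sustained VT or VF after ablation?"

# In-context "training" examples (the study added 1-5 of these);
# the note texts here are invented placeholders.
FEW_SHOT = [
    ("Device interrogation 6 months post-ablation: no VT/VF episodes logged.", "No"),
    ("ICD shock for sustained VT at 180 bpm three months after ablation.", "Yes"),
]

def classify_note(note: str) -> str:
    """Ask GPT-3.5 to label one note as Yes, No, or Unknown."""
    messages = [{"role": "system",
                 "content": "Answer with exactly one word: Yes, No, or Unknown."}]
    for example_note, label in FEW_SHOT:   # prompt engineering: prepend examples
        messages.append({"role": "user",
                         "content": f"{QUESTION}\nNote: {example_note}"})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": f"{QUESTION}\nNote: {note}"})
    resp = client.chat.completions.create(model="gpt-3.5-turbo",
                                          messages=messages, temperature=0)
    return resp.choices[0].message.content.strip()

# Placeholder evaluation set; the study used 337 deidentified test notes.
notes = ["...deidentified note text..."]
gold  = ["Yes"]
preds = [classify_note(n) for n in notes]
print("macro F1:", f1_score(gold, preds, average="macro"))

Sweeping this setup over 0-5 in-context examples and the three prompt wordings, with repeated runs, would reproduce the experimental grid the abstract reports.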