Fig. 14 | BMC Medical Informatics and Decision Making

Fig. 14

From: Exploring the potential of ChatGPT in medical dialogue summarization: a study on consistency with human preferences

Judged by ROUGE-1 score alone, the BART summary here appears highly similar to the manual summary. However, the BART summary has significant problems. First, in the “Diagnosis” part, it incorrectly states the diagnosis as “Upper respiratory infection”, whereas the manual summary gives the correct diagnosis, “Diarrhea”. Second, the summary as a whole is too brief and omits potentially important information. For instance, in the “Recommendation” part, it mentions only “Oral montmorillonite powder”. Although ChatGPT’s ROUGE-1 score is lower than BART’s, its summary is highly detailed and semantically consistent with the original conversation, including details such as “routine stool examination and other relevant examinations” and “avoid eating greasy, spicy and irritating food, and feed more liquid food”.
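The gap between ROUGE-1 and semantic correctness described in this caption can be illustrated with a minimal sketch. It assumes a simplified ROUGE-1 F1 (lowercased, whitespace-tokenized unigram overlap, no stemming or punctuation handling), which is not necessarily the exact scoring setup used in the article. The function name `rouge1_f1` and the three toy strings are hypothetical and are not the summaries shown in the figure.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on lowercased whitespace tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipped matches).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy strings (assumptions for illustration only).
reference = "diagnosis: diarrhea. recommendation: oral montmorillonite powder"
wrong_but_similar = (
    "diagnosis: upper respiratory infection. "
    "recommendation: oral montmorillonite powder"
)
right_but_reworded = (
    "the child most likely has diarrhea and should take "
    "montmorillonite powder by mouth"
)

score_wrong = rouge1_f1(wrong_but_similar, reference)
score_right = rouge1_f1(right_but_reworded, reference)

print(f"wrong diagnosis, similar wording: {score_wrong:.3f}")
print(f"right diagnosis, different wording: {score_right:.3f}")
```

In this toy case the summary with the wrong diagnosis scores higher, because it shares more surface tokens with the reference. This is the same pattern the caption reports for BART versus ChatGPT.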
