Details

AI-Driven Evidence Synthesis: Data Extraction of Randomized Controlled Trials with Large Language Models (EI indexed)

Document type: Journal article

English title: AI-Driven Evidence Synthesis: Data Extraction of Randomized Controlled Trials with Large Language Models

Authors: Liu, Jiayi[1,2]; Ge, Long[1,2,6]; Lai, Honghao[1,2]; Zhao, Weilong[1,2]; Huang, Jiajie[3]; Xia, Danni[1,2]; Liu, Hui[4]; Luo, Xufei[4,6,7]; Wang, Bingyi[4]; Pan, Bei[4]; Hou, Liangying[4,5]; Chen, Yaolong[4,7,8]

First author: Liu, Jiayi

Affiliations: [1] Department of Health Policy and Health Management, School of Public Health, Lanzhou University, Lanzhou, China; [2] Evidence-Based Social Science Research Center, School of Public Health, Lanzhou University, Lanzhou, China; [3] College of Nursing, Gansu University of Chinese Medicine, Lanzhou, China; [4] Evidence-Based Medicine Center, School of Basic Medical Sciences, Lanzhou University, Lanzhou, China; [5] Department of Health Research Methods, Evidence, and Impact, McMaster University, ON, Canada; [6] Key Laboratory of Evidence-Based Medicine of Gansu Province, Lanzhou University, Lanzhou, China; [7] Research Unit of Evidence-Based Evaluation and Guidelines, Chinese Academy of Medical Sciences [2021RU017], School of Basic Medical Sciences, Lanzhou University, Lanzhou, China; [8] WHO Collaborating Center for Guideline Implementation and Knowledge Translation, Lanzhou, China

First affiliation: Department of Health Policy and Health Management, School of Public Health, Lanzhou University, Lanzhou, China

Year: 2024

Journal: SSRN

Indexed in: EI (accession number: 20240256229)

Language: English

Keywords: Computational linguistics - Data mining - Extraction - Handbooks - Machine learning

Abstract: Background: The advancement of large language models (LLMs) has the potential to improve the quality and efficiency of evidence synthesis, especially for labor-intensive processes such as data extraction. However, effective prompts have not yet been established, and the feasibility and reliability of this approach remain uncertain. Objective: To develop structured prompts for guiding LLMs and to explore the feasibility and accuracy of using an LLM to extract data from randomized controlled trials (RCTs). Design: We conducted a survey study between August 10, 2023, and October 30, 2023. We developed structured prompts to guide Claude (Claude-2) in extracting data from RCTs, covering the following six domains as outlined in the Cochrane Handbook: "Methods," "Participants," "Baseline characteristics," "Outcomes," "Data and analysis," and "Others". We randomly selected 10 RCTs from published Cochrane reviews as the sample. Against a gold standard, we assessed accuracy using correct rates at the overall, study-specific, domain-specific, and item-specific levels. In addition, we estimated efficiency by the mean time spent. Results: We established structured prompts containing 58 items for data extraction. Across the data extraction for the 10 RCTs, Claude guided by the prompts achieved an overall correct rate of 94.77% (95% CI: 93.66% to 95.73%). At the domain-specific level, the "Others" domain (funding and conflicts of interest) showed the highest mean correct rate at 100% (95% CI: 83.16% to 100%), while the "Baseline characteristics" domain showed the poorest performance, with a mean correct rate of 77.97% (95% CI: 72.72% to 82.64%). The mean correct rates of the remaining domains all exceeded 95%. At the item-specific level, 51.72% (38/58) items achieved 100% correct rates, while 20.68% (12/58) had correct rates over 90%. Only 8 items (13.79%) had correct rates below 90%. Incorrect extractions occurred predominantly in the "Baseline characteristics" domain, specifically in the items that report participant numbers. The mean time needed for data extraction was 88 seconds for each RCT, and the mean ratio of correct extractions to time spent was 1.98. Conclusion: Using structured prompts, Claude can extract data efficiently and accurately, demonstrating the feasibility and value of applying LLMs in evidence synthesis. Funding: This study was jointly funded by the Fundamental Research Funds for the Central Universities, the National Natural Science Foundation of China (No. 82204931) and the Scientific and Technological Innovation Project of the China Academy of Chinese Medical Sciences (No. CI2021A05502). Declaration of Interest: None. Ethical Approval: The Medical Ethics Review Committee of the School of Public Health at Lanzhou University deemed the study exempt from review, as all data originated from published research. © 2024, The Authors. All rights reserved.
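Note on the reported interval for the "Others" domain: the abstract does not say how the confidence intervals were computed. As a minimal sketch, assuming an exact (Clopper-Pearson) binomial interval and assuming the domain comprised 2 items checked in each of the 10 RCTs (20 checks in total, all correct), the lower bound has a closed form. The function name and the count of 20 are illustrative assumptions, not details taken from the record.

```python
def exact_ci_all_successes(n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Clopper-Pearson interval for the special case where all n trials succeed.

    With x == n the lower bound solves p**n = alpha / 2, and the upper bound is 1.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    lower = (alpha / 2) ** (1 / n)
    return lower, 1.0


# Assumption (not stated in the abstract): 2 items x 10 RCTs = 20 checks.
n_checks = 2 * 10
lower, upper = exact_ci_all_successes(n_checks)
print(f"{lower:.2%} to {upper:.0%}")  # prints "83.16% to 100%"
```

Under these assumptions the result agrees with the 83.16% to 100% interval quoted in the abstract, which suggests, but does not confirm, that an exact binomial method was used.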

