A post-processing algorithm for building longitudinal medication dose data from extracted medication information using natural language processing from electronic health records
ABSTRACT

Objective: We developed a post-processing algorithm that converts raw natural language processing (NLP) output from electronic health records into a format usable for analysis. The algorithm was developed specifically for creating datasets for medication-based studies.

Materials and Methods: The algorithm was developed using output from two NLP systems, MedXN and medExtractR. We extracted medication information from deidentified clinical notes in Vanderbilt’s electronic health record system for two medications, tacrolimus and lamotrigine, which have widely different prescribing patterns. The algorithm consists of two parts: Part I parses the raw output and connects related entities, and Part II removes redundancies and calculates dose intake and daily dose. We evaluated both parts by comparing their output with gold standards generated from approximately 300 records from 10 subjects for each medication and each NLP system.

Results: Both parts of the algorithm performed well. For MedXN, the F-measures were at or above 0.94 for Part I and at or above 0.98 for Part II. For medExtractR, the F-measures were at or above 0.98 for Part I and at or above 0.91 for Part II.

Discussion: Our post-processing algorithm is useful for drug-based studies because it converts NLP output into analyzable data. It performed well, although it cannot handle highly complicated cases, which usually occurred when an NLP system extracted dose information incorrectly. Future work will focus on identifying the most likely correct dose when conflicting doses are extracted on the same day.
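As a rough illustration of the Part II step described in the abstract (removing redundancies, then calculating dose intake and daily dose), the sketch below may help. It is not the authors' implementation. The `Mention` schema, the function name `build_daily_doses`, and the formulas (dose intake = strength × dose amount; daily dose = dose intake × intakes per day) are assumptions made for illustration only, and "redundancy" is simplified here to mean exact duplicate mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One linked drug mention, as Part I might emit it (hypothetical schema)."""
    subject_id: str
    date: str             # ISO date of the clinical note
    strength_mg: float    # strength of one unit, e.g. 1.0 for a 1 mg capsule
    dose_amount: float    # number of units taken per intake
    freq_per_day: float   # intakes per day, e.g. 2.0 for "twice daily"


def build_daily_doses(mentions):
    """Drop exact duplicate mentions, then compute dose intake and daily dose.

    Assumed formulas (not confirmed by the abstract):
      dose intake = strength * dose amount
      daily dose  = dose intake * intakes per day
    """
    # frozen dataclasses are hashable, so set() removes exact duplicates
    unique = sorted(
        set(mentions),
        key=lambda m: (m.subject_id, m.date, m.strength_mg,
                       m.dose_amount, m.freq_per_day),
    )
    rows = []
    for m in unique:
        dose_intake = m.strength_mg * m.dose_amount
        rows.append({
            "subject_id": m.subject_id,
            "date": m.date,
            "dose_intake_mg": dose_intake,
            "daily_dose_mg": dose_intake * m.freq_per_day,
        })
    return rows


# Example: the same tacrolimus instruction extracted twice from one note.
example = [
    Mention("S01", "2019-01-05", 1.0, 3.0, 2.0),
    Mention("S01", "2019-01-05", 1.0, 3.0, 2.0),  # redundant copy
    Mention("S01", "2019-02-10", 1.0, 2.0, 2.0),
]
for row in build_daily_doses(example):
    print(row)
```

Conflicting doses recorded on the same day are deliberately left as separate rows in this sketch, since the abstract names their resolution as future work.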
Denny Joshua C., Choi Leena, McNeer Elizabeth, Weeks Hannah L., Williams Michael L., Beck Cole, Bejan Cosmin Adrian
Department of Biomedical Informatics, Vanderbilt University Medical Center || Department of Medicine, Vanderbilt University Medical Center; Department of Biostatistics, Vanderbilt University Medical Center; Department of Biostatistics, Vanderbilt University Medical Center; Department of Biostatistics, Vanderbilt University Medical Center; Department of Biostatistics, Vanderbilt University Medical Center; Department of Biostatistics, Vanderbilt University Medical Center; Department of Biomedical Informatics, Vanderbilt University Medical Center
Medical research methods; Pharmacy; Computing technology, computer technology
medication extraction; electronic health records; natural language processing; post processing algorithm
Denny Joshua C.,Choi Leena,McNeer Elizabeth,Weeks Hannah L.,Williams Michael L.,Beck Cole,Bejan Cosmin Adrian.A post-processing algorithm for building longitudinal medication dose data from extracted medication information using natural language processing from electronic health records[EB/OL].(2025-03-28)[2025-05-23].https://www.biorxiv.org/content/10.1101/775015.