Automated Generation of ICD-11 Cluster Codes for Precision Medical Record Classification

Authors

  • Jiayi Feng, Department of Information Management, Beijing Jiaotong University, China
  • Runtong Zhang, Department of Information Management, Beijing Jiaotong University, China
  • Donghua Chen, Department of Information Management, Beijing Jiaotong University, China
  • Lei Shi, School of Computing, Newcastle University, United Kingdom
  • Zhaoxing Li, Department of Electronics and Computer Science, University of Southampton, United Kingdom

DOI:

https://doi.org/10.15837/ijccc.2024.1.6251

Keywords:

ICD-11; ICD code; machine learning; text similarity; clinical coding

Abstract

Accurate clinical coding using the International Classification of Diseases (ICD) standard is essential for healthcare analytics. ICD-11 introduces new coding guidelines and cluster structures that pose challenges for existing coding tools. This research presents an automated approach for generating valid ICD-11 cluster codes from medical text. Natural-language records are represented as vectors and compared against an ICD-11 corpus using cosine similarity. A bidirectional matching technique then refines the similarity estimates. Experiments show that the method achieves an F1 score of up to 0.91, significantly outperforming a baseline tool. This work enables efficient, high-quality ICD-11 coding in support of healthcare informatics.
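
The vector comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the TF-IDF weighting, the whitespace tokenizer, the function names, and the three-entry toy corpus (with made-up labels rather than real ICD-11 codes) are all assumptions chosen for illustration. The bidirectional matching refinement is not sketched because the abstract does not specify it.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems need more)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF so that every token gets a positive weight.
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1.0 for tok in df}
    return [
        {tok: count * idf[tok] for tok, count in Counter(doc).items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_entries(record, corpus):
    """Rank corpus entries by cosine similarity to a free-text record."""
    labels = list(corpus)
    docs = [tokenize(corpus[label]) for label in labels] + [tokenize(record)]
    vectors = tfidf_vectors(docs)
    query = vectors[-1]
    scores = [(label, cosine(query, vec)) for label, vec in zip(labels, vectors[:-1])]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy corpus: labels are placeholders, not ICD-11 codes.
TOY_CORPUS = {
    "entry_1": "type 2 diabetes mellitus",
    "entry_2": "essential hypertension",
    "entry_3": "bacterial pneumonia",
}

ranking = rank_entries("patient with type 2 diabetes", TOY_CORPUS)
for label, score in ranking:
    print(f"{label}: {score:.3f}")
```

In this toy setting only the first entry shares tokens with the record, so it ranks first and the other two score zero. A production system would need clinical tokenization, synonym handling, and the full ICD-11 corpus.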

Published

2024-01-04
