Tuberculosis (TB) is an infectious disease caused by Mycobacterium tuberculosis. It can affect any organ in the body; pulmonary TB is the most common form and the one that causes the most deaths. According to the World Health Organization (WHO), TB is among the top 10 causes of death worldwide. In Colombia, TB is a disease of public health interest because of the high number of cases reported in the country relative to other communicable diseases. One of the main problems in TB management lies in the diagnostic methods, which require personnel and infrastructure that are not always available in places with weak health systems. According to the national protocol for TB detection, pulmonary TB must be diagnosed through microbiological confirmation, for which there are three types of tests: smear microscopy, molecular tests, and cultures. All of these tests have an associated cost and limited availability, so tools that support the diagnosis of TB can help improve control of the disease. Artificial intelligence (AI) is an area of computing that seeks to give machines intelligent behaviors in order to carry out specific tasks. One application of AI is decision support systems (DSS). Applied to health, these systems seek to generate models based on large volumes of data and prior clinical knowledge to help physicians make better decisions about patients. To generate tools that help in the management of TB, this work uses AI techniques to develop a DSS that supports the diagnosis of TB using the information contained in electronic medical records (EHR).
EHRs are sources of information widely used by physicians, in which the health status of patients is recorded, so the information they contain is expected to support a computational tool that helps healthcare professionals in the management of TB. For this work, a database was built from the EHRs of 151 patients suspected of having pulmonary TB. The database contains the patients' clinical reports dated before the diagnostic tests were performed, so the reports contain no information about the final TB diagnosis. To create the diagnostic tool, the clinical reports were preprocessed to clean the text, and features were then extracted using two methods: TF-IDF (term frequency-inverse document frequency) and Word2Vec. Machine learning models were subsequently used to predict TB. The models were explored by cross-validation, which showed that the best results are obtained by reducing the dimensionality of the TF-IDF features and using the random trees algorithm for classification. The performance metrics obtained on the test sets with this model are 0.721 accuracy, 0.802 sensitivity, 0.462 specificity, and 0.723 F1-score. This work was developed within the project "Generation of alternative models based on computational intelligence for screening and diagnosis of pulmonary tuberculosis" (Minciencias, Universidad del Rosario, Universidad Antonio Nariño, Integrated Subnet of Health Services Centro-Oriente – Hospital Santa Clara), carried out by a joint team of physicians and engineers, whose objective is to generate computational tools that can be used for the diagnosis of pulmonary TB in places with poor infrastructure.
Within the project, computational models are also being developed using clinical, epidemiological, and sociodemographic variables. In the future, this work is expected to be integrated with the other strategies generated within the project to build a more robust system that can support physicians in the diagnosis of pulmonary TB.
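As an illustration of two steps named in the abstract, the following is a minimal sketch of TF-IDF weighting and of the four reported metrics. It is not the implementation used in this work: the function names `tf_idf` and `binary_metrics`, the whitespace tokenization, the smoothed IDF formula, and the toy inputs are all assumptions made for the example. Dimensionality reduction, Word2Vec, and the classifier are left out.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document (hypothetical helper).

    Assumptions: lowercase whitespace tokenization, term frequency
    normalised by document length, and a smoothed inverse document
    frequency idf = ln((1 + N) / (1 + df)) + 1.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty reports
        vectors.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and F1 for labels 1 (TB) / 0 (no TB)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of failing when a class is absent.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)  # recall on TB-positive cases
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),  # recall on TB-negative cases
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


if __name__ == "__main__":
    # Toy, made-up report snippets and labels, for illustration only.
    reports = ["cough fever", "cough weight loss"]
    print(tf_idf(reports))
    print(binary_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]))
```

Sensitivity and specificity are computed separately because they measure performance on different groups: the first on patients who have TB, the second on patients who do not. This is why a model can have high sensitivity and low specificity at the same time, as in the results reported above.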
publication date
June 2, 2021 3:42 PM
Research
keywords
Decision support systems (DSS) based on Artificial Intelligence (AI) for the diagnosis of tuberculosis (TB)
Electronic medical record (EHR) processing system as a diagnostic tool in TB
Medical technology
TF-IDF and Word2Vec methods for data analysis in diagnostic medical AI
Computer program for diagnosis based on Natural Language Processing of electronic medical records (EHR)