Prediction of venous thromboembolism using semantic and sentiment analyses of clinical narratives
Description
Venous thromboembolism (VTE) is the third most common cardiovascular disorder. It affects people of both
genders at ages as young as 20 years. The increased number of VTE cases with a high fatality rate of 25% at first
occurrence makes preventive measures essential. Clinical narratives are a rich source of knowledge and should be
included in the diagnosis and treatment processes, as they may contain critical information on risk factors. It is
very important to make such narrative blocks of information usable for searching, health analytics, and decision-making.
This paper proposes a Semantic Extraction and Sentiment Assessment of Risk Factors (SESARF) framework.
Unlike traditional machine-learning approaches, SESARF, which consists of two main algorithms, namely,
ExtractRiskFactor and FindSeverity, prepares a feature vector as the input to a support vector machine (SVM)
classifier to make a diagnosis. SESARF matches and maps the concepts of VTE risk factors and finds adjectives and
adverbs that reflect their levels of severity. SESARF uses a semantic- and sentiment-based approach to analyze
clinical narratives of electronic health records (EHR) and then predict a diagnosis of VTE.
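The idea of turning matched risk factors and their severity modifiers into a feature vector can be illustrated with a minimal sketch. It is not the authors' ExtractRiskFactor or FindSeverity algorithm: the risk-factor list, the severity weights, and the function names below are hypothetical, and plain string matching stands in for the semantic concept mapping the paper describes.

```python
import re

# Hypothetical lexicons, for illustration only (not taken from the paper).
RISK_FACTORS = ["immobility", "obesity", "surgery", "smoking"]
SEVERITY_WEIGHTS = {"mild": 1, "moderate": 2, "severe": 3, "severely": 3}
DEFAULT_WEIGHT = 1  # risk factor mentioned without a severity modifier


def tokenize(text):
    """Lowercase the narrative and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_risk_factors(tokens):
    """Return (position, term) for each token that matches a known risk factor."""
    return [(i, tok) for i, tok in enumerate(tokens) if tok in RISK_FACTORS]


def find_severity(tokens, position, window=2):
    """Check up to `window` tokens before the risk factor for a severity modifier."""
    for tok in reversed(tokens[max(0, position - window):position]):
        if tok in SEVERITY_WEIGHTS:
            return SEVERITY_WEIGHTS[tok]
    return DEFAULT_WEIGHT


def feature_vector(text):
    """One entry per risk factor: 0 if absent, otherwise its highest severity weight."""
    tokens = tokenize(text)
    scores = dict.fromkeys(RISK_FACTORS, 0)
    for position, term in extract_risk_factors(tokens):
        scores[term] = max(scores[term], find_severity(tokens, position))
    return [scores[term] for term in RISK_FACTORS]


note = "Patient reports severe obesity and recent surgery; denies smoking."
print(feature_vector(note))
```

A vector of this kind is what a classifier such as an SVM would take as input. The sketch ignores negation, so "denies smoking" is still counted as a mention; a real pipeline would need to handle that.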
We use a dataset of 150 clinical narratives, 80% of which are used to train our support vector machine
prediction classifier, with the remaining 20% used for testing. Semantic extraction and sentiment analysis
yielded precisions of 81% and 70%, respectively. Using the support vector machine, prediction of patients with VTE
yielded precision and recall values of 54.5% and 85.7%, respectively.
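The split sizes and the two classifier metrics follow from simple arithmetic, sketched below. The confusion-matrix counts are an assumption: they were chosen only because they reproduce 54.5% and 85.7% on a 30-narrative test set, and the source does not report the actual counts.

```python
def precision(tp, fp):
    """Fraction of predicted VTE cases that are truly VTE."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of true VTE cases that the classifier finds."""
    return tp / (tp + fn)


total = 150                      # narratives in the dataset (from the abstract)
train_size = total * 80 // 100   # 80% for training
test_size = total - train_size   # remaining 20% for testing

# Hypothetical counts (not reported in the source) that are consistent
# with the stated precision and recall on a test set of this size.
tp, fp, fn = 6, 5, 1
tn = test_size - (tp + fp + fn)

print(train_size, test_size)
print(f"precision = {precision(tp, fp):.1%}")
print(f"recall    = {recall(tp, fn):.1%}")
```

High recall alongside lower precision means a classifier misses few true VTE cases but raises a number of false alarms.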
Citation
Sabra, S., Malik, K. M., & Alobaidi, M. (2018). Prediction of venous thromboembolism using semantic and sentiment analyses of clinical narratives. Computers in Biology and Medicine, 94, 1-10.
Date
2018
Subject
Venous thromboembolism
Risk factor assessment
Natural language processing
Semantic enrichment
Sentiment analysis
Prediction through classification
Support vector machine