The Effects of Telenursing on Readmission of Patients with Chronic Heart Failure: A Literature Review

Abstract

Nurses act as educators: they help clients understand their condition and the health care procedures needed to restore or maintain their health. Telenursing is one of the communication methods used to educate clients, especially those with heart failure. These patients are often readmitted to the hospital because of a lack of knowledge and poor adherence to their regimen, yet the most efficient methods for implementing telenursing are rarely discussed. This study describes the effective implementation of telenursing programs in relation to readmission among patients with heart failure. Articles were retrieved from Science Direct, ProQuest, Scopus, EBSCO, BMJ, Research Gate, Clinical Key, Taylor & Francis, PubMed, Wiley Online Library, and SAGE Journal, published in 2014–2019 and matched with specific keywords. Nurses’ telephone communication skills can be improved to meet patients’ needs. Combined with systematic, comprehensive monitoring, such communication can be used to build a professional relationship between patients and nurses, and it is easy to implement. Telenursing methods are still being developed to improve communication after the client’s hospitalization and so decrease readmission. Telenursing requires an integrated system supported by telemonitoring so that clients receive continuous monitoring to maintain their health.

Keywords: telenursing, nurse telephone follow-up, heart failure, readmission

Introduction

Noncommunicable diseases (NCDs), such as heart disease, stroke, cancer, chronic respiratory diseases, and diabetes, are the leading cause of mortality in the world. The burden is growing: the number of people, families, and communities afflicted is increasing (WHO, 2016). The prevalence of heart failure is projected to increase by 46% from 2012 to 2030, resulting in >8 million people ≥18 years of age with heart failure. Additionally, the total percentage of the population with heart failure is predicted to increase from 2.42% in 2012 to 2.97% in 2030. Because most forms of heart failure tend to present in older age, and the population is aging, the lifetime risk for heart failure in the community is high. The prevalence estimates for heart failure across Asia range from 1.26% to 6.7% (Benjamin et al., 2019).

Previous research has shown that nonadherence to self-care behaviors among heart failure patients is still a major challenge, leading to exacerbation of symptoms, complications, repeated visits to specialists, and unwanted readmission (Jaarsma et al., 2003). As a result, heart failure has the highest rate of readmission. The readmission rate due to heart failure (HF), 30 to 60 days after discharge, has been reported to be 30% (Ahmadi et al., 2014).

Reductions in readmission rates among patients with heart failure have been demonstrated with both home-based programs and telephone monitoring. The effects of multidisciplinary home-based interventions in the population with heart failure have also been shown to be sustained for periods of at least 18 months, resulting in both reduced hospital-based costs and reduced mortality (Stewart, Marley, & Horowitz, 1999). Structured telephone support included regular telephone contacts between patients and health care providers, with discussions about patient-reported symptoms and physiological data, including reminders about the importance of adherence to treatment recommendations and proper self-management. This support was provided by trained nurses or through interactive voice interviews with validated standard questionnaires (Chaudhry et al., 2010).

Telenursing has been applied since the early 1990s, with nurses relied upon to facilitate the transition period after hospital discharge. Although not always explicitly labeled as telenursing, the nurse’s role is central to telephone programs and even telemedicine. The 2013 ACCF/AHA guidelines recommend early postdischarge follow-up, because it may help minimize gaps in understanding of changes to the care plan or knowledge of test results and has been associated with a lower risk of subsequent rehospitalization (Yancy et al., 2013). A follow-up visit within 7 to 14 days and/or a telephone follow-up within 3 days of hospital discharge are reasonable goals of care (Ponikowski et al., 2016). In adults discharged to home after hospitalization for heart failure, outpatient follow-up with a cardiology or general medicine provider within 7 days was associated with a lower chance of 30-day readmission (Lee et al., 2016).

Many studies have shown the benefits of telenursing. Specific types of telephone support have been developed and tested, such as telephone case management for patients with heart failure (Riegel, 2002). Telenursing can improve knowledge about signs and symptoms of disease progression, the treatment plan, and actions to take to manage symptoms, which has been reported to minimize exacerbation and frequent hospitalization. Based on this, the author was interested in finding out more about how to implement effective telenursing methods to prevent the readmission of patients with chronic heart failure.

Objective

The aim of this literature review was to describe the effective implementation of telenursing in readmission among patients with heart failure.

Methods

This literature review used PRISMA as the process for article selection, as shown in Figure 1.1. The author collected articles from both qualitative and quantitative studies to describe the effective implementation of telenursing in readmission among patients with heart failure. Articles related to the purpose of this study were collected through several stages of searching using the keywords “telenursing”, “nurse telephone follow-up”, “heart failure”, and “readmission”. The search covered English-language articles published in 2014–2019 from 11 databases and search engines: Science Direct, ProQuest, Scopus, EBSCO, BMJ, Research Gate, Clinical Key, Taylor & Francis, PubMed, Wiley Online Library, and SAGE Journal. The results of this literature review are presented according to the type of method used and how it was implemented in each study.

Results

In the last two decades, there has been considerable focus on post-discharge telephone calls to address preventable readmissions (Harrison, Auerbach, Quinn, Kynoch, & Mourad, 2014). In a retrospective study in the USA in 2014, patients who received a call and completed the intervention were significantly less likely to be readmitted than those who did not: 155 patients (5.8%) vs 123 patients (8.6%), with p 65 years of age.

Weekly NP phone management and access to NP consultative services around the clock allowed home care staff to continue to monitor and treat CHF patients in their home environment and prevented unnecessary emergency room visits and hospital readmissions.

Daily interactions with professional healthcare staff (RNs, physical therapists, and/or occupational therapists) improved patient monitoring and early detection of patient decompensation.

Spinsante (2014). Systematic overview, Italy.

To provide evidence for a simple but effective paradigm upon which a telehealth system may be built, and to highlight how such a model may successfully apply to heart failure management, to improve patients’ quality of life after discharge, increase independence, and reduce readmissions and costs for public health institutions.

A significant share (almost 20%) of all hospital admissions are actually readmissions, and a majority of these could be avoided through more effective management of health care after discharge.

Factors that contribute to excessive hospital readmissions include service fragmentation and poor communication among and between health care settings and care providers and poorly delivered and/or understood discharge instructions and follow-up. By improving coordination across the continuum of care and promoting seamless transitions from the hospital to home, skilled nursing care, or home health care, avoidable readmission rates can be decreased.

Harrison et al. (2014). Retrospective observational study, USA.

To determine the specific effects of receiving a post-discharge telephone call on all-cause 30-day readmission, and to describe the post-discharge issues addressed by the calls.

The effectiveness of post-discharge phone call programs may be more related to whether patients are able to answer a phone call than to the care delivered by the phone call. Programs would benefit from improving their ability to perform phone outreach while simultaneously improving on the care delivered during the calls.

Black et al. (2014). Multi-center randomized controlled trial, USA.

To evaluate the effectiveness of this remote care transition intervention in reducing all-cause 180-day hospital readmissions for older adults hospitalized with heart failure.

Better Effectiveness After Transition-Heart Failure (BEAT-HF) is poised to serve as an important research resource to understand how best to use telehealth approaches to improve key healthcare processes and outcomes, including care transitions and hospital readmissions, and to set the stage for future comparative effectiveness research on chronic disease management for heart failure.

References

  1. Abdelkader, Y. I., Elshamy, K. F., & M, H. A. (2019). Efficacy of Supportive Educational Package on Self Care among Heart Failure Patients. (July).
  2. Ahmadi, A., Soori, H., Mobasheri, M., Etemad, K., & Khaledifar, A. (2014). Heart Failure, the Outcomes, Predictive and Related Factors in Iran. Journal of Mazandaran University of Medical Sciences, 24(118), 180–188.
  3. Baptiste, D. L., Davidson, P., Groff Paris, L., Becker, K., Magloire, T., & Taylor, L. A. (2016). Feasibility study of a nurse-led heart failure education program. Contemporary Nurse, 52(4), 499–510. https://doi.org/10.1080/10376178.2016.1229577
  4. Benjamin, E. J., Muntner, P., Alonso, A., Bittencourt, M. S., Callaway, C. W., Carson, A. P., … Virani, S. S. (2019). Heart Disease and Stroke Statistics-2019 Update: A Report From the American Heart Association. In Circulation (Vol. 139). https://doi.org/10.1161/CIR.0000000000000659
  5. Black, J. T., Romano, P. S., Sadeghi, B., Auerbach, A. D., Ganiats, T. G., Greenfield, S., … Ong, M. K. (2014). A remote monitoring and telephone nurse coaching intervention to reduce readmissions among patients with heart failure: Study protocol for the Better Effectiveness After Transition – Heart Failure (BEAT-HF) randomized controlled trial. Trials, 15(1). https://doi.org/10.1186/1745-6215-15-124
  6. Chaudhry, S. I., Mattera, J. A., Curtis, J. P., Spertus, J. A., Herrin, J., Lin, Z., … Krumholz, H. M. (2010). Telemonitoring in patients with heart failure. New England Journal of Medicine, 363(24), 2301–2309. https://doi.org/10.1056/NEJMoa1010029
  7. Fors, A. (2018). Effects of a self-management program on patient participation in patients with chronic heart failure or chronic obstructive pulmonary disease: A randomized controlled trial. European Journal of Cardiovascular Nursing, 18(3), 185–193. https://doi.org/10.1177/1474515118804126
  8. Harrison, J. D., Auerbach, A. D., Quinn, K., Kynoch, E., & Mourad, M. (2014). Assessing the Impact of Nurse Post-Discharge Telephone Calls on 30-Day Hospital Readmission Rates. Journal of General Internal Medicine, 29(11), 1519–1525. https://doi.org/10.1007/s11606-014-2954-2
  9. Jaarsma, T., Stromberg, A., Martensson, J., & Dracup, K. (2003). Development and Testing of The European Heart Failure Self-Care Behaviour Scale. European Journal of Heart Failure, 5(5), 621–627. https://doi.org/10.1016/S1388-9842
  10. Jenkins, R. L., & White, P. (2001). Telehealth advancing nursing practice. Nursing Outlook, 49(2), 100–105. https://doi.org/10.1067/mno.2001.111933
  11. Kasem, S. M., Mahmoud, N., & El-aziz, A. (2017). Effect of medical and nursing teaching program on awareness and adherence among elderly patients with chronic heart failure in Assiut , Egypt. 47–53. https://doi.org/10.4103/ejim.ejim
  12. Lee, K. K., Yang, J., Hernandez, A. F., Steimle, A. E., & Go, S. A. (2016). Post-Discharge Follow-up Characteristics Associated with 30- Day Readmission After Heart Failure Hospitalization. 25(5), 1032–1057. https://doi.org/10.1111/mec.13536.Application
  13. Liberati, A., Altman, D. G., Tetzlaff, J., Mulrow, C., Gøtzsche, P. C., Ioannidis, J. P. A., … Moher, D. (2009). The PRISMA Statement for Reporting Systematic Reviews and Meta-Analyses of Studies That Evaluate Health Care Interventions: Explanation and Elaboration. PLoS Medicine, 6(7), 50931. https://doi.org/10.1371/journal.pmed.1000100
  14. Liljeroos, M., Ågren, S., Jaarsma, T., & Stromberg, A. (2017). Dialogues between nurses, patients with heart failure and their partners during a dyadic psychoeducational intervention: A qualitative study. BMJ Open, 7(12). https://doi.org/10.1136/bmjopen-2017-018236
  15. Moore, J. A. M. (2016). Evaluation of the efficacy of a nurse practitioner-led home-based congestive heart failure clinical pathway. Home Health Care Services Quarterly, 35(1), 39–51. https://doi.org/10.1080/01621424.2016.1175992
  16. Murphy, N., Shanks, M., & Alderman, P. (2019). Management of Heart Failure With Outpatient Technology. Journal for Nurse Practitioners, 15(1), 12–18. https://doi.org/10.1016/j.nurpra.2018.07.004
  17. Najafi, S. S., Shaabani, M., Momennassab, M., & Aghasadeghi, K. (2016). The nurse-led telephone follow-up on medication and dietary adherence among patients after myocardial infarction: A randomized controlled clinical trial. International Journal of Community Based Nursing and Midwifery, 4(3), 199–208.
  18. Negarandeh, R., Zolfaghari, M., Bashi, N., & Kiarsi, M. (2019). Evaluating the Effect of Monitoring through Telephone (Tele-Monitoring) on Self-Care Behaviors and Readmission of Patients with Heart Failure after Discharge. Applied Clinical Informatics, 10(2), 261–268. https://doi.org/10.1055/s-0039-1685167
  19. Ponikowski, P., Poland, C., Voors, A. A., Germany, S. D. A., Uk, J. G. F. C., Uk, A. J. S. C., … Rutten, F. H. (2016). 2016 ESC Guidelines for the diagnosis and treatment of acute and chronic heart failure. 2129–2200. https://doi.org/10.1093/eurheartj/ehw128
  20. Raphael, D., Waterworth, S., & Gott, M. (2017). Telephone communication between practice nurses and older patients with long term conditions – a systematic review. Journal of Telemedicine and Telecare, 23(1), 142–148. https://doi.org/10.1177/1357633X15625398
  21. Record, J. D. (2015). Telephone calls to patients after discharge from the hospital: an important part of transitions of care. Medical Education Online, 19, 4–6.
  22. Riegel, B. (2002). Effect of a Standardized Nurse Case-Management Telephone Intervention on Resource Use in Patients With Chronic Heart Failure. 162.
  23. Simorangkir, H., & McGuire, S. J. J. (2017). Training in Readmission Reduction in an Indonesian Hospital. Hospital Topics, 95(2), 40–50. https://doi.org/10.1080/00185868.2017.1300477
  24. Souza-Junior, V. D., Mendes, I. A. C., Mazzo, A., & Godoy, S. (2016). Application of telenursing in nursing practice: An integrative literature review. Applied Nursing Research, 29, 254–260. https://doi.org/10.1016/j.apnr.2015.05.005
  25. Spinsante, S. (2014). Home telehealth in older patients with heart failure – costs, adherence, and outcomes. Smart Homecare Technology and TeleHealth, 93. https://doi.org/10.2147/shtt.s45318
  26. Stewart, S., Marley, J. E., & Horowitz, J. D. (1999). Effects of a multidisciplinary, home-based intervention on unplanned readmissions and survival among patients with chronic congestive heart failure: A randomised controlled study. Lancet, 354(9184), 1077–1083. https://doi.org/10.1016/S0140-6736(99)03428-5
  27. Yancy, C. W. (2013). 2013 ACCF/AHA Guideline for the Management of Heart Failure: A Report of the American College of Cardiology Foundation/American Heart Association Task Force on Practice Guidelines. Circulation, 128(16), e240-327. https://doi.org/10.1161/CIR.0b013e31829e8776
  28. WHO. (2016). Global Health Observatory (GHO) data: Non Communicable Disease. Retrieved from https://www.who.int/gho/ncd/en/
  29. WHO. (2017). Cardiovascular diseases (CVDs). Retrieved from https://www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds)

Analysis of Machine Learning Techniques for Heart Failure Readmissions

Problem Statement

High rates of readmission after hospitalization for heart failure put a tremendous burden on patients and the healthcare system. Predictive models are used to identify patients at high risk for hospital readmission and, by identifying key risk factors, potentially to direct specific interventions toward those who might benefit most. The current ability to predict readmissions in patients with heart failure is modest at best. Including a richer set of predictor variables encompassing patients’ clinical, social, and demographic domains has improved discrimination in some internally validated studies, but it does not necessarily improve discrimination. This richer set of predictors might not contain the predictive domain of variables required, but it does represent a large set of data not routinely collected in other studies. It is unclear whether machine learning techniques that address higher-dimensional, nonlinear relationships among variables would enhance prediction. The authors seek to compare the effectiveness of several machine learning algorithms for predicting readmissions.

Proposed Solution

Data for this study were drawn from Tele-HF, which enrolled 1653 patients within 30 days of their discharge after an index hospitalization for heart failure. In addition to the clinical data from the index admission, Tele-HF used validated instruments to collect data on patients’ socioeconomic, psychosocial, and health status. Of the 1653 enrolled patients, the authors excluded 36 who were readmitted or died before the interview, 574 whose interviews were completed after 30 days from discharge, and 39 who were missing data on >15 of the 236 baseline features, creating a study sample of 1004 patients for the 30-day readmission analysis set. 472 variables were used as input. Models were built using both traditional statistical methods and machine learning (ML) methods to predict readmission, and the models were compared on the discrimination and predictive range of the various techniques. A logistic regression (LR) model and a Poisson regression were used as the traditional statistical models. Three ML methods were used for readmission prediction: random forests (RF), boosting, and support vector machines (SVM).

For comparison purposes, the authors built the LR model from the variables selected in a recent Tele-HF study, which provides the most accurate representation of an LR model on the Tele-HF data set. This choice fits the aim of the current analysis, which is concerned with finding improved analytic algorithms for predicting 30-day readmissions rather than primarily with variable selection. Given the flexibility of nonlinear methods, the complexity of the desired models might overwhelm the available data, resulting in overfitting. All the available variables can be used in ML techniques such as RF and boosting, which are robust to this overfitting, but some form of feature selection may be required to help prevent overfitting in less robust techniques like SVM.

To overcome the potential for overfitting in LR and SVM, a hierarchical method with RF was developed. Previous hierarchical methods used RF as a feature selection method, because it is well suited to high-dimensional data sets with varied data types, to identify a subset of features to feed into methods such as LR and SVM. RF uses out-of-bag estimates and an internal bootstrap to help select only predictive variables and avoid overfitting, in a manner similar to AdaBoost.

To construct the derivation and validation data sets, the cohort was split into 2 equally sized groups, ensuring equal percentages of readmitted patients in each group. To account for the significant difference between the numbers of readmitted and not-readmitted patients in each group, the ML algorithms were weighted. The weight selected for the readmitted patients was the ratio of not-readmitted patients to readmitted patients in the derivation set. Once the derivation and validation sets were created, a traditional LR model was trained. The generated models were run on the validation set, and the area under the receiver operating characteristic curve (C statistic) was calculated as a measure of model discrimination. The analysis was run 100 times to provide robustness against a potentially poor random split of patients and to generate a mean C statistic with a 95% confidence interval (CI). The probabilities of readmission generated over the 100 iterations were then sorted into deciles. Finally, the observed readmission rate for each decile was calculated to determine the predictive range of the algorithms.
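The evaluation protocol described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study authors' code: the patient data are synthetic, the "model" is a placeholder that scores patients by a single made-up feature, and every function name (stratified_half_split, positive_class_weight, c_statistic, mean_with_ci, decile_rates) is hypothetical. The 95% CI uses a normal approximation, which is also an assumption, since the text does not say how the interval was computed.

```python
import random
import statistics


def stratified_half_split(labels, rng):
    """Split patient indices into two halves with equal readmission rates."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    derivation = pos[: len(pos) // 2] + neg[: len(neg) // 2]
    validation = pos[len(pos) // 2:] + neg[len(neg) // 2:]
    return derivation, validation


def positive_class_weight(labels):
    """Weight for readmitted patients: not-readmitted count / readmitted count."""
    n_pos = sum(labels)
    return (len(labels) - n_pos) / n_pos


def c_statistic(scores, labels):
    """Chance that a readmitted patient outranks a non-readmitted one (ties = 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_with_ci(values):
    """Mean and normal-approximation 95% CI of the mean (an assumption)."""
    m = statistics.mean(values)
    half = 1.96 * statistics.stdev(values) / len(values) ** 0.5
    return m, m - half, m + half


def decile_rates(scores, labels):
    """Observed readmission rate within each decile of predicted risk."""
    ranked = sorted(zip(scores, labels))
    n = len(ranked)
    rates = []
    for d in range(10):
        chunk = ranked[d * n // 10: (d + 1) * n // 10]
        rates.append(sum(y for _, y in chunk) / len(chunk))
    return rates


# Synthetic cohort (assumption): 80 readmitted, 320 not, one noisy risk feature.
rng = random.Random(0)
labels = [1] * 80 + [0] * 320
features = [rng.gauss(1.0 * y, 1.0) for y in labels]

c_stats, pooled_scores, pooled_labels = [], [], []
for _ in range(100):  # 100 random splits, as in the text
    deriv, valid = stratified_half_split(labels, rng)
    # A real learner would receive this weight; the placeholder below ignores it.
    weight = positive_class_weight([labels[i] for i in deriv])
    # Placeholder "model": orient the feature using derivation-set class means.
    mean_pos = statistics.mean(features[i] for i in deriv if labels[i] == 1)
    mean_neg = statistics.mean(features[i] for i in deriv if labels[i] == 0)
    sign = 1.0 if mean_pos >= mean_neg else -1.0
    scores = [sign * features[i] for i in valid]
    truth = [labels[i] for i in valid]
    c_stats.append(c_statistic(scores, truth))
    pooled_scores += scores
    pooled_labels += truth

mean_c, ci_low, ci_high = mean_with_ci(c_stats)
rates = decile_rates(pooled_scores, pooled_labels)
print(f"C statistic: {mean_c:.3f} (95% CI, {ci_low:.3f}-{ci_high:.3f})")
print("Observed readmission rate, lowest vs highest decile:", rates[0], rates[-1])
```

The gap between the lowest and highest decile rates is what the text calls the predictive range; a model with good discrimination but a narrow range separates patients in rank order without identifying a clearly high-risk group.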

Results

Thirty-Day All-Cause Model Discrimination

LR had a low C statistic of 0.533 (95% CI, 0.527–0.538). Boosting on the input data had the highest C statistic (0.601; 95% CI, 0.594–0.607) in a 30-day binary outcome with a 30-day training case. Boosting also had the highest C statistic for the 30-day binary outcome with 180-day binary training (0.613; 95% CI, 0.607–0.618). For the 30-day outcomes with 180-day counts training, the RF technique had the highest C statistic (0.628; 95% CI, 0.624–0.633).

One Hundred Eighty–Day All-Cause Model Discrimination

LR again showed a low C statistic (0.574; 95% CI, 0.571–0.578) for the 180-day binary case. The RF into SVM hierarchical method had the highest achieved C statistic across all methods in the 180-day binary outcome case (0.654; 95% CI, 0.650–0.657) and in the 180-day count case (0.649; 95% CI, 0.648–0.651).

Readmission Because Of Heart Failure Discrimination

For readmissions because of heart failure, the LR model again had a low C statistic for the 30-day binary case (0.543; 95% CI, 0.536–0.550) and the 180-day binary case (0.566; 95% CI, 0.562–0.570). Boosting had the best C statistic for the 30-day binary-only case (0.615; 95% CI, 0.607–0.622) and for the 30-day with 180-day binary case training (0.678; 95% CI, 0.670–0.687). The highest C statistics for the other prediction cases were RF for the 30-day with 180-day counts case training (0.669; 95% CI, 0.661–0.676); RF into SVM for the 180-day binary-only case (0.657; 95% CI, 0.652–0.661); and RF into SVM for the 180-day counts case (0.651; 95% CI, 0.646–0.656).

Readmission Because Of Heart Failure Predictive Range

When the deciles of risk prediction were plotted against the observed readmission rate, RF and boosting each had the biggest differences between the first and tenth deciles of risk (1.8–11.9% and 1.4–12.2%, respectively).

Prediction Results

Thirty-day all-cause readmission had a PPV of 0.22 (95% CI, 0.21–0.23), a sensitivity of 0.61 (0.59–0.64), and a specificity of 0.61 (0.58–0.63) at a maximal f-score of 0.32 (0.31–0.32); 180-day all-cause readmission had a PPV of 0.51 (0.51–0.52), a sensitivity of 0.92 (0.91–0.93), and a specificity of 0.18 (0.16–0.21) at a maximal f-score of 0.66 (0.65–0.66); 30-day readmission because of heart failure had a PPV of 0.15 (0.13–0.16), a sensitivity of 0.45 (0.41–0.48), and a specificity of 0.79 (0.76–0.82) at a maximal f-score of 0.20 (0.19–0.21); 180-day readmission because of heart failure had a PPV of 0.51 (0.50–0.51), a sensitivity of 0.94 (0.93–0.95), and a specificity of 0.15 (0.13–0.17) at a maximal f-score of 0.66 (0.65–0.66). The 30-day predictions, in general, were better at identifying negative cases, whereas the 180-day predictions were better able to correctly identify positive cases.
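To make these metrics concrete, the following sketch shows how PPV, sensitivity, specificity, and the f-score can be computed from predicted risks at a given threshold, and how a threshold with maximal f-score can be selected. It is an illustration only: the scores and labels are made up, and the function names classification_metrics and best_f_threshold are hypothetical, not taken from the study.

```python
def classification_metrics(scores, labels, threshold):
    """PPV, sensitivity, specificity, and f-score when score >= threshold is positive."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    ppv = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    # The f-score is the harmonic mean of PPV (precision) and sensitivity (recall).
    f_score = 2 * ppv * sensitivity / (ppv + sensitivity) if ppv + sensitivity else 0.0
    return {"ppv": ppv, "sensitivity": sensitivity,
            "specificity": specificity, "f_score": f_score}


def best_f_threshold(scores, labels):
    """Try each observed score as a threshold and return one with maximal f-score."""
    return max(sorted(set(scores)),
               key=lambda t: classification_metrics(scores, labels, t)["f_score"])


# Made-up predicted risks and outcomes (1 = readmitted), for illustration only.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0, 0]

threshold = best_f_threshold(scores, labels)
metrics = classification_metrics(scores, labels, threshold)
print(threshold, metrics)
```

Because the f-score combines only PPV and sensitivity, the reported values can be cross-checked: for 30-day all-cause readmission, 2 × 0.22 × 0.61 / (0.22 + 0.61) ≈ 0.32, which matches the reported maximal f-score.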

Critique

The results of this study support the hypothesis that ML methods, with the ability to leverage all available data and their complex relationships, can improve both discrimination and range of prediction over traditional statistical techniques. The performance of the ML methods varies in complex ways, including discrimination and predictive range.

While we know how the data were obtained, their quality is unknown. Collected data almost always contain errors such as outliers, missing values, and incorrect data types, and it is unknown whether such errors were removed. Once they are removed, the accuracy of the model could be expected to increase, possibly by as much as 10%, depending on the type of data and the model being used.

While simple Random Forest, Boosting and SVM are good techniques, Neural Networks can also be used when the data set tends to get larger. Here, with over 450 different parameters, SVM and Random Forest would slow down. Simpler models like logistic regression can also be tried out to identify the accuracy of that model. With the addition of these two models, the comparison between the 5 different classification models would help better understand how these techniques work.

These techniques work on this set of data. While there has been some diversity mentioned in this paper, the research would need to include more diversity in the results. The risk of heart disease readmission could be different for people from different ethnicities and different origins depending on the type of food they eat and the climate they grow up in and the changes to their lifestyle over a period.

Ideas for Follow-On Work

Future work might be to focus on further improvement of predictive ability through advanced methods and more discerning data, to facilitate better targeting of interventions to subgroups of patients at highest risk for adverse outcomes. Newer models are being developed every day, which can improve the predictability of these types of data sets.

While it is not mentioned whether the processing is performed in the cloud, this work could be extended with edge analytics using wearable devices. Edge analytics and cloud processing can be combined through federated learning. Federated learning is a machine learning technique that trains an algorithm across multiple decentralized edge devices or servers holding local data samples, without exchanging those samples. This approach stands in contrast to traditional centralized machine learning techniques, where all data samples are uploaded to one server, as well as to more classical decentralized approaches, which assume that local data samples are identically distributed.

Federated learning enables multiple actors to build a common, robust machine learning model without sharing data, thus addressing critical issues such as data privacy, data security, data access rights, and access to heterogeneous data which makes it perfect for this kind of analysis.
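The idea can be illustrated with federated averaging, one common federated learning scheme (named here as an example; the essay does not prescribe a specific algorithm). In this minimal sketch, each hypothetical client fits a one-variable linear model on its own data and sends only its updated parameters to the server, which averages them weighted by sample count. The data, the function names (local_update, federated_average, train_federated), and the learning-rate settings are all assumptions made for illustration.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one client's private data."""
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            error = (w * x + b) - y
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Server step: average client parameters, weighted by local sample count."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def train_federated(clients, rounds=500):
    """Only model parameters travel between clients and server, never raw data."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights


# Two hypothetical devices whose data cover different ranges (not identically
# distributed) but follow the same underlying relation y = 2x + 1.
client_a = [(x, 2 * x + 1) for x in (0.0, 0.25, 0.5, 0.75, 1.0)]
client_b = [(x, 2 * x + 1) for x in (1.0, 1.25, 1.5, 1.75, 2.0)]

w, b = train_federated([client_a, client_b])
print(f"learned w={w:.3f}, b={b:.3f}")
```

In a readmission setting, the clients would be hospitals or wearable devices and the model would be a risk classifier, but the communication pattern is the same: local training followed by server-side aggregation.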

Analytical Essay on the Hospital Readmissions in Case of Chronic Obstructive Pulmonary Disease

An NLP (Natural Language Processing) Framework to perform risk identification using feature engineering from unstructured data

Abstract

Hospital readmissions in cases of COPD (Chronic Obstructive Pulmonary Disease) increase medical expenses and also require intensive care for patients. Natural Language Processing (NLP) is the art and science of extracting information from text and using it in computations and algorithms. We aim to develop a Natural Language Processing framework to analyze clinical notes, physician entries, x-ray reports, and other unstructured hospital data and to predict hospital readmissions. The framework is to be built with the natural language processing techniques of text preprocessing and feature analysis, and with a neural network for the classification process.

Keywords: Natural Language Processing, Chronic Obstructive Pulmonary Disease, Bio-Informatics, Preprocessing of unstructured data, Hospital readmissions, Feature analysis, Classification.

I. Introduction

One in five patients with COPD (Chronic Obstructive Pulmonary Disease) requires re-hospitalization within 30 days. In the United States, COPD is part of Medicare’s Hospital Readmissions Reduction Program (HRRP), which penalizes hospitals for excess 30-day, all-cause readmissions after a hospitalization for an acute exacerbation of COPD, despite minimal evidence to guide hospitals on how to reduce readmissions. This review outlines challenges for improving overall COPD care quality and specifically for HRRP. There is limited evidence available on readmission risk factors and reasons for readmission to guide hospitals in initiating programs to reduce COPD readmissions. Over the study period, there were 26,798,404 inpatient admissions, of which 3.5% were index COPD admissions. At 30 days, 20.2% were readmitted to the hospital. Respiratory-related diseases accounted for only one-half of the reasons for readmission, and COPD was the most common diagnosis, explaining 27.6% of all readmissions. The HRRP was enacted to address rising costs and quality concerns, initially targeting inpatient discharges in the Medicare fee-for-service population for congestive heart failure (CHF), acute myocardial infarction (AMI), and pneumonia. The HRRP mandates up to a 3% reduction in all Medicare reimbursements should hospitals fail to stay below their expected readmission rates. In October 2014, the HRRP was expanded to include COPD.

A. Problem Statement

The US government penalizes hospitals for excess readmissions. Most of the information held by hospitals is unstructured data, including clinical notes, physician notes, x-ray reports, and so on. An NLP-based approach to extract such information from hospital records is being developed. Because of the variation and complexity of this unstructured information, a protocol is required that standardizes the records by converting the unstructured data into a structured form.

B. Approaches

The raw unstructured data must be processed before it can be analyzed. This process involves natural language processing techniques such as tokenization, stemming, noise reduction, and removal of stop words. A few approaches to preprocessing, analysis, and classification are discussed below.

  • Preprocessing

Preprocessing is the process of extracting the relevant information or records from the unstructured data. It commonly focuses on three components: tokenization, normalization, and substitution. Tokenization splits a string of words into minimal meaningful units called tokens. During tokenization, stop words such as ‘the’, ‘is’, ‘are’, and ‘an’ are eliminated. Normalization converts the words into a consistent form. Under normalization, stemming is the process of removing affixes (suffixes, prefixes, infixes, circumfixes) from a word. Noise removal, also called text cleaning, removes data such as text file headers, footers, and metadata, and it also extracts records from other formats.
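The preprocessing steps above can be sketched with the Python standard library. This is a minimal illustration, not the proposed framework itself: the stop-word list is a small hand-picked sample, the stemmer is a deliberately crude suffix stripper (a real system would more likely use an established stemmer such as Porter's), and the function names tokenize, crude_stem, and preprocess are hypothetical.

```python
import re

# Small illustrative stop-word list (assumption); real lists are much longer.
STOP_WORDS = {"the", "is", "are", "an", "a", "and", "was", "with", "of", "to"}

# Suffixes removed by the crude stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text (normalization) and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def crude_stem(token):
    """Strip one common suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, drop stop words, and stem what is left."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


note = "The patient is wheezing and was admitted with COPD."
print(preprocess(note))  # ['patient', 'wheez', 'admitt', 'copd']
```

The stems are not dictionary words ("wheez", "admitt"); that is typical of stemming, whose purpose is to map related word forms to one key rather than to produce readable output.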

  • Feature Analysis

Feature extraction reduces the resources needed to describe a large set of data. Analyzing a large data set takes considerable time and effort, but feature extraction keeps only the vital keywords by removing unwanted words. Two feature extraction methods are used here: BOW and cTAKES. BOW (Bag-of-Words) represents a text by how often each word occurs in the data set that is fed to the system. Bag-of-Words is very useful in producing efficient solutions to complex problems, and it is used to extract features from text documents. The higher-level feature extraction is done using Apache cTAKES. cTAKES stands for clinical Text Analysis and Knowledge Extraction System, which is used to extract clinical information from the unstructured text of electronic health records.
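As a small illustration of the Bag-of-Words representation, the sketch below turns tokenized documents into count vectors over a shared vocabulary. The example documents and the function names build_vocabulary and bow_vector are made up for illustration; cTAKES is not shown because it is a separate Apache system with its own pipeline.

```python
from collections import Counter


def build_vocabulary(token_lists):
    """Collect every distinct token across all documents, in sorted order."""
    return sorted({token for tokens in token_lists for token in tokens})


def bow_vector(tokens, vocabulary):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


# Two tiny, already-preprocessed documents (made up for illustration).
documents = [
    ["copd", "cough", "cough"],
    ["copd", "fever"],
]

vocabulary = build_vocabulary(documents)
vectors = [bow_vector(doc, vocabulary) for doc in documents]
print(vocabulary)  # ['copd', 'cough', 'fever']
print(vectors)     # [[1, 2, 0], [1, 0, 1]]
```

Each document becomes a fixed-length numeric vector, which is the form a classifier such as a neural network expects as input. Word order is discarded, which is the main limitation of this representation.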

  • Classification

Text classification can be done with natural language processing algorithms such as Naïve Bayes, Random Forest, and k-nearest neighbors (KNN). We instead propose a framework that uses a neural network for classification. One drawback of these algorithms for text classification is that they provide a score, whereas what we need is a probability. Moreover, they learn from what is present in a class but not from what is absent. As a result, they do not capture the context of a sentence and instead classify it based on the scores. Hence neural networks are used to obtain high performance on NLP tasks. Neural networks have the advantage over these classification algorithms of providing more accurate results. The accuracy is improved by two primary methods: the first is a supervised neural network, which runs the input through various classifications, and the second is an unsupervised neural network, which optimizes the feature selection.

In simple terms, a neural network is fed data and learns to analyze it in order to solve complex tasks. A basic neural network has three layers: an input layer, a hidden layer, and an output layer. The classifier in a neural network is the softmax layer, which is the final layer. Neurons can therefore be modeled to perform the classification computation. A neural network with multiple neurons can be viewed as providing the same data to several classification functions, with each neuron representing a different regression function. A large set of data is fed in to train these networks, and training is achieved through backpropagation. Each layer passes the previous layer’s output on to another function.
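The three-layer structure, the softmax output, and training by backpropagation can be illustrated with a small pure-Python sketch. Everything here is an assumption made for the example: the class name `TinyNetwork`, the tanh activation, the layer sizes, the learning rate, and the toy two-feature data (imagined as counts of two keywords). It is not the architecture proposed in the text.

```python
import math
import random


def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyNetwork:
    """Input layer -> one tanh hidden layer -> softmax output layer."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        # Each hidden neuron applies its own weighted sum and non-linearity.
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        # The output layer receives the hidden layer's output.
        scores = [sum(w * h for w, h in zip(row, hidden)) + b
                  for row, b in zip(self.w2, self.b2)]
        return hidden, softmax(scores)

    def train_step(self, x, label, lr=0.1):
        hidden, probs = self.forward(x)
        # Cross-entropy gradient w.r.t. output scores: probs - one_hot(label).
        d_scores = [p - (1.0 if k == label else 0.0)
                    for k, p in enumerate(probs)]
        # Backpropagate to the hidden layer (derivative of tanh is 1 - h^2).
        d_hidden = [(1.0 - h * h)
                    * sum(d * self.w2[k][j] for k, d in enumerate(d_scores))
                    for j, h in enumerate(hidden)]
        for k, d in enumerate(d_scores):
            for j, h in enumerate(hidden):
                self.w2[k][j] -= lr * d * h
            self.b2[k] -= lr * d
        for j, d in enumerate(d_hidden):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * d * xi
            self.b1[j] -= lr * d

    def predict(self, x):
        return self.forward(x)[1]


# Hypothetical training data: [count of keyword A, count of keyword B] -> class.
data = [([2, 0], 1), ([3, 0], 1), ([1, 0], 1),
        ([0, 2], 0), ([0, 3], 0), ([0, 1], 0)]
net = TinyNetwork(n_in=2, n_hidden=3, n_out=2)
for _ in range(200):
    for features, label in data:
        net.train_step(features, label)
print(net.predict([2, 0]))
```

The output of `predict` is a list of class probabilities rather than an unbounded score, which is the property the text asks of the classifier.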

II. Literature survey

  1. Natural language processing is a computer technology that mainly concentrates on human-computer interaction. Most data today is in unstructured form, which makes it hard for computers to understand for further use and analysis. This unstructured text needs to be converted into structured form by clearly defining the sentence boundaries, word boundaries, and context-dependent character boundaries. The key steps draw on many algorithms from data mining and machine learning, so a framework for component selection is created to select the best components. NLP is then applied, followed by processing techniques such as tokenization, stop-word removal, stemming, pruning, semantic analysis, and POS tagging.
  2. An intelligent system is developed for the analysis and real-time evaluation of a patient’s condition. A hybrid classifier has been implemented on a personal digital assistant, combining a support vector machine, a random forest, and a rule-based system to provide a more advanced categorization scheme for the early, real-time characterization of a COPD episode. This is followed by a severity estimation algorithm, which classifies the identified pathological situation into different levels and triggers an alerting mechanism that provides an informative and instructive message or advice to the patient.
  3. cTAKES builds on existing open-source technologies: the Unstructured Information Management Architecture framework and the OpenNLP natural language processing toolkit. Its components, specifically trained for the clinical domain, create rich linguistic and semantic annotations. The cTAKES annotations are the foundation for methods and modules for higher-level semantic processing of clinical free text. cTAKES is a modular system of pipelined components combining rule-based and machine learning techniques, aimed at information extraction from the clinical narrative.
  4. The goal of this study is to use machine learning methods and patients’ medical records from a reputed Indian hospital to analyze the key factors that affect all-cause readmission of patients with diabetes, to compare different classification models that predict readmission, and to identify the best model. It proposes an architecture for this prediction model and identifies various risk factors using text mining techniques. The study not only discovers risk factors that predict the risk of readmission but also identifies individual factors and groups of factors that are strong indicators of a low risk of readmission, along with a cost analysis using real-world data.
  5. We provide a voice-based Android application in which the user can interact with the system and obtain an inference of diseases and their remedies by giving symptoms as input. To process the input, we normalize the data using noun phrase extraction and a medical term identifier. The preprocessing system and the question-answer system are the crucial elements of the proposed system. Question generation is performed using a QA matrix. Finally, the system’s response is delivered to the user in voice format.
  6. Hospital readmission rates are considered an important indicator of the quality of care because they may be a consequence of acts of commission or omission during the patient’s initial hospitalization. The framework considers specific COPD-related laboratory test results as part of the structured patient data. These data types are used to develop appropriate regression models for predicting hospital readmission risk for COPD using EHR information.
  7. In this paper, we are interested in analyzing and preprocessing tweets for NLP and machine learning applications such as machine translation and classification. We propose a preprocessing pipeline for tweets consisting of part-of-speech filtering, named entity recognition, hashtag segmentation, and disambiguation. Our approach is also based on graph theory: it groups the words of tweets using the semantic relations of WordNet and the idea of connected components. Integrating WordNet into preprocessing transforms our corpus into a bag of words. We keep the frequency of each word found in the corpus, and each tweet is represented as a sequence of the frequencies of the words it contains.
  8. This project creates a novel natural language processing (NLP) pipeline for the extraction and classification of temporal information as historic, current, or planned from free-text eligibility criteria. The pipeline uses pattern learning algorithms to extract temporal information and a trained Random Forest classifier for classification. The pipeline achieved an accuracy of 0.82 in temporal data detection and classification, with an average precision of 0.83 and a recall of 0.80 in temporal data classification. The accuracy of the classifier was further tuned by training the Random Forest classifier with different numbers of decision trees.
  9. Fake reviews are considered spam reviews, which may have a great impact on online marketplace behavior. Many types of features can be extracted, such as Bag-of-Words, linguistic features, word counts, and n-gram features. We investigate the effects of two different feature selection methods on spam review detection: Bag-of-Words and word count. Different machine learning algorithms are applied: Support Vector Machine, Decision Tree, Naïve Bayes, and Random Forest. There are two categories of spam review detection methods. The first is supervised techniques, which require labeled datasets to classify unseen reviews, such as Naïve Bayes (NB), Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF). The second is unsupervised techniques, which find hidden patterns in unlabeled data.
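Several of the surveyed papers report accuracy, precision, and recall. As a reminder of how these figures are derived, the sketch below computes them for one class from lists of true and predicted labels. The function name and the toy labels (borrowing the historic/current/planned classes from item 8) are assumptions for illustration; the numbers it produces are not results from any paper.

```python
def classification_metrics(true_labels, predicted_labels, positive):
    """Accuracy over all classes, plus precision and recall for one class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        # Of everything predicted as the class, how much was right.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Of everything truly in the class, how much was found.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical labels for five temporal expressions.
truth = ["historic", "current", "current", "planned", "current"]
predicted = ["historic", "current", "planned", "planned", "current"]
metrics = classification_metrics(truth, predicted, positive="current")
print(metrics)
```

An average precision or recall, as reported for multi-class tasks, would be obtained by calling the function once per class and averaging the results.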

III. Comparison table

Each entry lists the paper name, authors, and year, followed by a description.
  • NLP Based Clinical Data Analysis for Assessing Readmissions of Patients with COPD. Priyanka V. Medhe, Dinesh D. Puri, 2017

They analyze patients’ hospital lab reports and discharge summaries, classifying the factors as primary or secondary. In preprocessing, they convert documents into data schemes and divide them into clusters. A predictive model then lists the proposed factors and produces the prediction, using cTAKES, a prediction model technique, text analysis, and UMA models.

  • Identification of COPD Patients’ Health Status Using an Intelligent System in the CHRONIOUS Wearable Platform. Christos C. Bellos, Athanasios Papadopoulos, Roberto Rosso, Dimitrios I. Fotiadi, 2014

The data come from patients whose likelihood of having COPD has been recorded. The dataset is collected from sources such as sensors, external devices, and a database. Preprocessing removes baseline wander noise and high-frequency noise; feature extraction then processes the preprocessed functional data and yields the sensor-acquired data. Heterogeneous data fusion combines data from different sources in various formats.

  • Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation, and applications. Guergana K Savova, James J Masanz, Philip V Ogren, Jiaping Zheng, Sunghwan Sohn, Karin C Kipper-Schuler, Christopher G Chute, 2010

They used cTAKES, which consists of components executed in sequence, each incrementally contributing to the cumulative annotation dataset. cTAKES accepts plain text or XML documents. The sentence boundary detector predicts whether a period or question mark marks the end of a sentence. The tokenizer splits the sentence into smaller tokens. The cTAKES POS tagger and shallow parser are wrappers around OpenNLP’s modules that extract data from the system.

  • Predictive risk modeling for early hospital readmission of patients with diabetes in India. Reena Duggal, Suren Shukla, Sarika Chandra, Balvinder Shukla, Sunil Kumar Khatri, 2016

This system takes raw data on 7100 diabetes patients collected over a period of 2 years; noisy and inconsistent data are removed during preprocessing. Predictive features include demographics and illness severity. cTAKES is adopted to create one or more pipelines that process clinical notes and entities such as diseases and disorders, signs and symptoms, and drugs and procedures. Five classification models were used to build the system: Naïve Bayes, Logistic Regression, Random Forest, AdaBoost, and Neural Network classifiers.

  • An Interactive Medical Assistant using Natural Language Processing. Shamli Deshmukh, Ritika Balani, Vijayalaxmi Rohane, Asmita Singh, 2016

In this NLP-based medical assistant, the given input is tokenized with the help of a POS tagger. The input statement mentions only the disease and its duration. Medical terms and keywords are extracted from the input separately by a feature extraction technique; the Question-Generation System then checks the similarities and plots them in the QA map entity, which helps in accurately understanding what is required to predict the patient’s disease.

  • Predicting Hospital Readmission Risk for COPD Using EHR Information. Ravi Behara, Ankur Agarwal, Faiz Fatteh and Borko Furht, 2013

In this study, Mayo Clinic’s Text Analysis and Knowledge Extraction System is adopted to create one or more pipelines that process clinical notes and entities such as diseases and disorders, signs and symptoms, anatomical sites and procedures, and drugs. The Lexical Analyzer Layer parses all tokenized words from preprocessing. Assertion determines whether the text discussed relates to the patient, and Structured Data Analysis then develops a predictive statistical model that predicts readmission.

  • Efficient Natural Language Pre-processing for Analyzing Large Data Sets. Belainine Billal, Alexsandro Fonseca and Fatiha Sadat, 2016

They collected content from Twitter (tweets), which is considered unstructured and highly noisy text. To extract information from such data, they used traditional NLP and machine learning techniques. Once the words are selected, they cluster the words with their synsets in a graph. Tweet normalization rewrites words in a standard form, which helps to process them efficiently.

  • Natural Language Processing Pipeline for Temporal Information Extraction and Classification from Free Text Eligibility Criteria. Gayathri Parthasarathy, Aspen Olmsted, Paul Anderson, 2016

This project creates a novel natural language processing pipeline for the extraction and classification of temporal information. The pipeline uses pattern learning algorithms to extract data. The initial step extracts the trained temporal patterns from the data set, generating them with the TEXer algorithm. The pipeline then uses the trained temporal patterns to detect temporal expressions in sentences. As a next step, the labeled fragments are used to create a bag of words, and the classifier extracts data.

  • The Effects of Features Selection Methods on Spam Review Detection Performance. Wael Etaiwi, Arafat Awajan, 2017

They used feature selection methods and classification algorithms to detect spam reviews. The feature selection method extracts features from the text of a review and measures words within a review by their frequency of occurrence. Other characteristic features, such as lexical and syntactic features, are also included to enhance detection performance. They used four classification algorithms: Naïve Bayes, Decision Tree, Support Vector Machine, and Random Forest.

References

  1. Priyanka V. Medhe, Dinesh D. Puri, “NLP Based Clinical Data Analysis for Assessing Readmissions of Patients with COPD”. International Conference Proceeding ICGTETM Dec 2017 | ISSN: 2320-2882.
  2. Christos C. Bellos, Athanasios Papadopoulos, Roberto Rosso, Dimitrios I. Fotiadi, “Identification of COPD Patients’ Health Status Using an Intelligent System in the CHRONIOUS Wearable Platform”. IEEE Journal of Biomedical and Health Informatics, vol. 18, no. 3, May 2014.
  3. Guergana K Savova, James J Masanz, Philip V Ogren, Jiaping Zheng, Sunghwan Sohn, Karin C Kipper-Schuler, Christopher G Chute, “Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation, and application” Application of information technology.
  4. Reena Duggal, Suren Shukla, Sarika Chandra, Balvinder Shukla, Sunil Kumar Khatri, “Predictive risk modeling for early hospital readmission of patients with diabetes in India” Int J Diabetes Dev Ctries.
  5. Shamli Deshmukh, Ritika Balani, Vijayalaxmi Rohane, Asmita Singh, “Sia: An Interactive Medical Assistant using Natural Language Processing”. 2016 International Conference on Global Trends in Signal Processing, Information Computing, and Communication.
  6. Ravi Behara, Ankur Agarwal, Faiz Fatteh and Borko Furht, “Predicting Hospital Readmission Risk for COPD Using EHR Information”. Handbook of Medical and Healthcare Technologies.
  7. Belainine Billal, Alexsandro Fonseca and Fatiha Sadat, “Efficient Natural Language Pre-processing for Analyzing Large DataSets” 978-1-4673-9005-7/1
  8. Gayathri Parthasarathy, Aspen Olmsted, Paul Anderson, “Natural Language Processing Pipeline for Temporal Information Extraction and Classification from Free Text Eligibility Criteria” International Conference on Information Society (i-Society 2016).
  9. Wael Etaiwi, Arafat Awajan, “The Effects of Features Selection Methods on Spam Review Detection Performance”. 2017 International Conference on New Trends in Computing Sciences.