Healthcare analytics

A brief description of select completed or ongoing analytics projects in the healthcare sector is provided below. If you need further information, please contact us.

Re-hospitalization analytics: Modeling and reducing the risks of re-hospitalization

Large amounts of heterogeneous medical data have become available in various healthcare organizations (payers, providers, pharmaceutical companies). These data could be an enabling resource for deriving insights that improve care delivery and reduce waste in healthcare spending. The enormity and complexity of these datasets present great challenges for analysis and for subsequent application in a practical clinical environment. Such clinical data are not only massive, but also heterogeneous and longitudinal in nature, with some noisy and missing information. Estimating the predictive power of the clinical data collected during a patient's hospitalization, and effectively making predictions from such diverse patient records, requires new analytical models. This project develops a 'rehospitalization analytics' framework that can identify, characterize, and reduce the risks of rehospitalization for patients using a wide range of electronic health records. It aims to provide a comprehensive, accurate, and timely assessment of the risk of rehospitalization, and has the potential to direct more aggressive treatments towards specific high-risk patients. Specifically, the project aims to develop integrated predictive models that can effectively leverage multiple heterogeneous patient information sources and transfer the acquired knowledge about rehospitalization between different hospitals and patient groups when only a few patient records are available. Providing special care for a targeted group of patients who are at high risk of rehospitalization can significantly improve the chances of avoiding it. This has the potential to improve the lives of patients by reducing exacerbations, and to reduce overall healthcare costs by reducing the number of hospitalizations.

Streamlining patient flow in emergency departments

Emergency departments (EDs) in hospitals have experienced severe crowding and prolonged patient waiting times in recent years. While the major factor in many cases is a shortage of capacity, a significant concern is “boarding” delays, where admitted ED patients are held in the ED until an inpatient bed is identified and readied in the admitting wards. Recent research has suggested that if hospital admissions of ED patients can be predicted during patient triage, then bed requests and preparations can be triggered early to reduce patient boarding time. We are developing a comprehensive decision support and operations tool kit to help triage staff proactively manage ED patient flow, reduce costs, and improve patient satisfaction. The tool kit employs state-of-the-art data mining and machine learning algorithms to 1) predict patient admission at triage, 2) predict the target wards for patients to be admitted, 3) estimate each patient’s length of stay (LOS) in the ED, and 4) derive cost-sensitive bed reservation policies that recommend optimal ward-bed reservation times for patients. This research is funded by the Veterans Administration (VA) and is being done in collaboration with VA CASE. The proposed methods and tools are currently being tested and validated at a VA Medical Center.
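As a rough illustration of what a cost-sensitive reservation rule can look like, the sketch below reserves a bed at triage when the expected cost of an unneeded reservation is lower than the expected cost of boarding. It is a simplified, single-decision example and not the project's actual policy, which also recommends reservation times. The function names and the two cost parameters (cost_idle_bed, cost_boarding) are assumptions introduced here for illustration.

```python
def reservation_threshold(cost_idle_bed: float, cost_boarding: float) -> float:
    """Admission probability above which reserving a bed is cheaper in expectation.

    Reserving costs (1 - p) * cost_idle_bed in expectation (bed held for a
    patient who is not admitted); not reserving costs p * cost_boarding
    (admitted patient waits in the ED). Setting the two equal gives the
    threshold p* = cost_idle_bed / (cost_idle_bed + cost_boarding).
    Both cost parameters are hypothetical inputs.
    """
    if cost_idle_bed < 0 or cost_boarding < 0 or cost_idle_bed + cost_boarding == 0:
        raise ValueError("costs must be non-negative and not both zero")
    return cost_idle_bed / (cost_idle_bed + cost_boarding)


def should_reserve(p_admit: float, cost_idle_bed: float, cost_boarding: float) -> bool:
    """Return True when the predicted admission probability exceeds the threshold."""
    return p_admit > reservation_threshold(cost_idle_bed, cost_boarding)


if __name__ == "__main__":
    # Illustrative numbers only: boarding is assumed three times as costly
    # as holding an idle bed, so the threshold is 0.25.
    print(reservation_threshold(1.0, 3.0))
    print(should_reserve(0.30, 1.0, 3.0))
    print(should_reserve(0.20, 1.0, 3.0))
```

The more costly boarding is relative to an idle bed, the lower the admission probability at which an early bed request becomes worthwhile.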

Near real-time decision support system for patient flow and supply optimization in healthcare systems

The efficient utilization of healthcare system resources depends on multiple factors such as long-term planning (e.g., block time allocations in surgical departments), medium-term planning (e.g., surgery scheduling), uncontrollable operational uncertainty (e.g., surgery durations), and controllable operational uncertainty (e.g., delays in pre- and peri-operative as well as turnover processes). While medium- to long-term planning decisions attempt to hedge against these uncertainties weeks to months in advance, there remain considerable opportunities for near real-time decision (re-)making through operational predictive analytics and real-time optimization. Working with Veterans Affairs, we have developed a near real-time decision support system (NRT DSS) that can optimize patient flow and supply using real-time predictions of operational uncertainties. The predictive analytical models integrate statistical models and simulation models that are calibrated and validated with historical patient demand, supply usage, and flow and duration information. The optimization models and algorithms are custom developed to (1) exploit the predictive models’ results and (2) generate high-quality solutions in near real-time. The NRT DSS models patient flow and supply operations in Operating Rooms, PACUs and ICUs, Emergency Departments, Labs, Clinics (e.g., Dental, OB/GYN), Inpatient Wards, Logistics, and the Central Sterilization Service, as well as the flow of reusable medical supplies (case carts, scopes, etc.). Results from pilot implementations at the Detroit VAMC are very promising: reductions in patient and procedure delays, appointment cancellations, and missed opportunities; lower costs (e.g., reduced overtime and emergency supplies); and improved patient and staff satisfaction.

Discovering relational saliency in electronic medical records

Data in an EMR system are usually highly complex and share one or more prominent characteristics: they are large, with tens of thousands of records; they are unstructured, with heterogeneous features; they are intrinsically dynamic; and the evidence that can guide diagnosis and treatment is often embedded in large amounts of noisy, even confusing data. Thus, in an EMR system, identifying the important interplay between patients, record types, and features is of the utmost importance for "smart" healthcare. In this project, we define a new concept in EMR, relational saliency. Discovering relational saliency in EMR can help physicians tailor diagnosis and treatment to an individual patient, a step closer to personalized medicine.

Toward personalized medicine

Treatments based on the genetic profiles of diseases have been shown to be much more successful than the older one-size-fits-all approach. However, even targeted treatments fail for some patients for whom they should work. Furthermore, many patients who initially respond well to treatment develop resistance later. This is because the evolution of the disease is dictated by the interplay between the tumor and the immune system of the host, both of which are unique in every case. To realize the much-heralded promise of personalized medicine, it is necessary to (a) accurately recognize and distinguish disease subtypes, (b) apply these diagnostic signatures at the level of the individual patient (and tissue type) over time, and (c) identify therapeutic interventions that target them specifically. The aim of this project is to identify the specific disease mechanisms in action in every individual suffering from a given condition. The big data challenge here comes from the added dimension of the individual patients. Instead of comparing 30,000 genes and 100,000 proteins between two groups, we now need to consider hundreds and potentially thousands of individuals separately, as well as over time. The approach developed here will allow us to predict which patients are likely to respond to a given treatment, based on their genetic profile. This would greatly increase the ability to design successful clinical trials and develop drugs that are effective in subgroups of patients sharing the same mechanisms.

The rate of acquiring biological data has greatly surpassed our ability to interpret it. At the same time, we have started to understand that the evolution of many diseases, such as cancer, is the result of the interplay between the disease itself and the immune system of the host. It is now well accepted that cancer is not a single disease, but a “complex collection of distinct genetic diseases united by common hallmarks”. Understanding the differences between such disease subtypes is key not only to providing adequate treatments for known subtypes but also to identifying new ones. Unforeseen disease subtypes are one of the main reasons high-profile clinical trials fail. To identify such cases, we proposed a classification technique, based on Support Vector Machines, that is able to automatically identify samples that are dissimilar from the classes used for training. In a leukemia experiment, our method was able to identify 65% of MLL patients when it had no prior knowledge of the existence of such a group, as it was trained only on AML vs. ALL. In addition, we need to understand the disease mechanism specific to each subgroup. For this purpose, we proposed a systems biology approach able to consider all measured gene expression changes, thus eliminating the possibility that small but important gene changes (e.g., transcription factors) are omitted from the analysis. Our approach provides consistent results that do not depend on the choice of an arbitrary threshold for differential regulation. In a multiple sclerosis study, our approach obtained consistent results across multiple experiments performed by different groups on different technologies, a consistency that could not be achieved using differential expression alone.
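The idea of flagging samples that are dissimilar from every training class can be sketched with a much simpler stand-in than the SVM-based technique described above: a nearest-centroid classifier with a distance-based reject option. This is not the method used in the leukemia experiment. The function names, the slack parameter, and the toy two-dimensional points are all assumptions made for illustration.

```python
import math


def fit(training):
    """Compute a centroid and radius per class.

    training: dict mapping a class label to a list of feature tuples.
    The radius is the largest distance from the centroid to any
    training sample of that class.
    """
    model = {}
    for label, points in training.items():
        n = len(points)
        centroid = tuple(sum(coords) / n for coords in zip(*points))
        radius = max(math.dist(centroid, p) for p in points)
        model[label] = (centroid, radius)
    return model


def predict(model, x, slack=1.5):
    """Return the nearest class label, or "unknown" if x lies too far away.

    slack is a hypothetical tolerance: a sample is rejected when its
    distance to the nearest centroid exceeds slack times that class's radius.
    """
    label, (centroid, radius) = min(
        model.items(), key=lambda item: math.dist(item[1][0], x)
    )
    if math.dist(centroid, x) > slack * radius:
        return "unknown"
    return label


if __name__ == "__main__":
    # Toy points invented for this example; they are not study data.
    training = {
        "class_a": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
        "class_b": [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0)],
    }
    model = fit(training)
    print(predict(model, (0.5, 0.5)))
    print(predict(model, (10.2, 10.2)))
    print(predict(model, (5.0, 5.0)))
```

A sample from a subtype absent from training would ideally fall outside every class region and be returned as "unknown" instead of being forced into one of the known classes.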

Computational systems biology of human diseases

This funded research develops and applies probabilistic models to understand biological networks, human diseases, and their associations. We develop methods to integrate heterogeneous datasets collected from diverse biological sources (e.g., microarray, next-generation sequencing, genotype, and pathway data) for clinical problems such as cancer, autism, and aging. Our group is developing systematic learning methods for knowledge discovery in population-based analysis of human diseases. We are also interested in developing learning methods for understanding genotype-phenotype associations (e.g., diseases, adverse drug reactions).