Tuesday, 22 May 2012

Healthcare Big Data


Part 2: The method

The fun part of designing a big data system is that both ends of the tunnel have a fuzzy light: the data is not in a stable, normalized form, and the intended presentation is itself not clear, mostly because we only get a better idea of what we want once we have something to look at.
Coming up with a universal data model that covers all the cases is not possible, for the simple reason that the goals are many, and they are growing and changing.

To address this challenge, I started using a simple analysis methodology:
I separate the scenarios from the use cases, and they both evolve hand in hand.
  • Use Cases are the intended actions that rely on the data, independent of whether the data is available.
  • Scenarios are the conditions and data that can be available, regardless of their use.

According to W. H. Inmon (if I’m not misquoting; I lost the reference), the data analysis does not present the information needed for use cases. It represents what we could possibly do with it (in a later post I’ll make an analogy that should make this clear).
For big data, we can iteratively build up the data model as more use cases are found; and iteratively build up possible uses for the data, as more data becomes available and is integrated / federated into the data model.

These two aspects are synergistic: new use cases can create a need for new data, and the availability (or unavailability) of data can determine the uses for that data.
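One way to picture a single iteration is the small Python sketch below. It is only an illustration, not the tooling I actually use: the names `UseCase`, `Scenario` and `review`, and the example systems and data items, are all made up for this post. Each use case lists the data it would need, each scenario lists the data it can provide, and the review shows which use cases are feasible today, which data is missing, and which data is still waiting for a use.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    """An intended action, described only by the data it would need."""
    name: str
    needs: set


@dataclass
class Scenario:
    """A data source, described only by the data it can provide."""
    name: str
    provides: set


def review(use_cases, scenarios):
    """One iteration: compare what is wanted with what is available."""
    available = set().union(*(s.provides for s in scenarios))
    wanted = set().union(*(u.needs for u in use_cases))
    return {
        "feasible": sorted(u.name for u in use_cases if u.needs <= available),
        "missing": sorted(wanted - available),  # data worth acquiring next
        "unused": sorted(available - wanted),   # data still waiting for a use
    }


# Hypothetical examples, not taken from a real system.
use_cases = [
    UseCase("radiation dose check", {"imaging_history", "flight_history"}),
    UseCase("allergy alert", {"allergies", "prescriptions"}),
]
scenarios = [
    Scenario("radiology system", {"imaging_history"}),
    Scenario("pharmacy system", {"allergies", "prescriptions", "dispensing_log"}),
]

result = review(use_cases, scenarios)
print(result)
```

The "missing" list suggests which data to look for next, and the "unused" list suggests where new use cases might come from; both lists change with every iteration.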

This applies to data and metadata:
The unavailability of data can mean different things: a missing report of an injection administered in a hospital has a different impact and meaning than a patient not reporting that they take antidepressant medication.

The information quality may depend on who provides it (for example, some cultural factors will influence the customer’s report of use of drugs, contraceptives, etc.).

When building the model (and very carefully separating it from the use cases), my rule is: keep everything just in case. For example, a cell phone can have an attached device that monitors the patient’s glucose levels. This is not the same as a measurement by a nurse, but we don’t want to waste this information just because a hospital currently does not want to use it.
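The "keep everything" rule could look something like the Python sketch below. This is a simplified illustration under my own assumptions: the `Observation` fields, the source names and the values are hypothetical. The point is that the model stores every observation together with who produced it and how, and that the decision about which sources to trust is made by each use case when it reads the data, not by the model when it stores it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    patient_id: str
    kind: str               # what was observed, e.g. "glucose"
    value: Optional[float]  # None means nothing was reported
    source: str             # who or what produced the observation
    method: str             # how it was produced


store = []  # the model keeps everything, whatever its source


def record(obs):
    """Store an observation without judging its usefulness."""
    store.append(obs)


def select(kind, trusted_sources):
    """A use case picks what it trusts at read time."""
    return [
        o for o in store
        if o.kind == kind and o.source in trusted_sources and o.value is not None
    ]


# Hypothetical data for illustration only.
record(Observation("p1", "glucose", 5.4, "nurse", "bedside measurement"))
record(Observation("p1", "glucose", 6.1, "phone_device", "attached sensor"))
record(Observation("p1", "antidepressant_use", None, "patient", "self-report"))

hospital_view = select("glucose", {"nurse"})
wider_view = select("glucose", {"nurse", "phone_device"})
print(len(store), len(hospital_view), len(wider_view))
```

A hospital that only trusts nurse measurements gets its narrow view, while the phone readings and the empty self-report stay in the store for whoever wants to interpret them later.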

Thursday, 10 May 2012

Healthcare Big Data

 

Part 1: Introduction
When it comes to big data, healthcare is an excellent place to find it.
Data is structured and captured in different ways, there is a lot of it, and it is hard to get sensible information out of all this data put together.
In healthcare, traditionally, data (and models) are split by department; data needed for a department were modelled according to the needs of that department. Most of this data is not shared. But data that represent patient or workflow information could be shared: patient demographics, patient location/status, current attending physician, allergies, wound documentation, which user accesses the patient record...
The need for cross-department (and cross-institution, and cross-domain) data is emerging as important, e.g. for decision making. This data is important for clinical reasons (informed decision making) and operational reasons (e.g. Meaningful Use in the US, and the growing interest in Europe in operational/efficiency improvements).
Even breaking the boundaries between healthcare domains will be insufficient: Healthcare data is not a silo. Institutions and patients collaborate on health data, but also other institutions can play a role. Work and social environment, demographics information, etc. can be potentially relevant.




  • Should a person that lives in the mountains and takes many intercontinental flights mention that to the physician that is evaluating the possibility of an ionizing radiation procedure?
  • Should a patient’s demographics be taken into account in deciding the sensitivity of a procedure?
  • Should a patient’s social habits be considered when recommending a continued treatment?
  • When prescribing a drug, should we consider the habits of the patient and his surroundings?

Coming down to spaceship earth: within a healthcare organization, information can be captured anywhere, and it is increasingly stretched to be viewed from different perspectives.
Data interfaces (messages) are usually designed to provide just enough data for the continuation of a workflow, and these data are modelled for the specific needs of the originating system. If we want to know a patient’s full history, there will be some collage work, especially if we want the information to be in a usable form.
Here I propose a look at the problem, the possibilities and the goals, and I present a way of jumping across these water lilies. The keyword is: Iterate.
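To give a first taste of the collage work, here is a minimal Python sketch. The message fields, system names and values are invented for illustration and do not follow any real interface standard. Each message carries only what its originating workflow needed, and the history is whatever we can assemble by grouping the messages per patient and ordering them in time.

```python
from collections import defaultdict

# Hypothetical messages; each carries only what its own workflow needed.
messages = [
    {"patient_id": "p1", "system": "admissions",
     "time": "2012-05-01T08:00", "location": "Ward 3"},
    {"patient_id": "p1", "system": "pharmacy",
     "time": "2012-05-01T09:30", "allergy": "penicillin"},
    {"patient_id": "p2", "system": "admissions",
     "time": "2012-05-01T10:00", "location": "ER"},
    {"patient_id": "p1", "system": "admissions",
     "time": "2012-05-02T14:00", "location": "Ward 5"},
]


def build_histories(msgs):
    """Group messages per patient, ordered by their timestamp."""
    histories = defaultdict(list)
    # Timestamps in this one ISO-like format sort correctly as plain strings.
    for msg in sorted(msgs, key=lambda m: m["time"]):
        histories[msg["patient_id"]].append(msg)
    return dict(histories)


histories = build_histories(messages)
for entry in histories["p1"]:
    print(entry["time"], entry["system"])
```

Even this toy version shows the gaps: the pharmacy message says nothing about location, and the admissions messages say nothing about allergies, so a usable history has to be pieced together from fragments.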


     

Next:

  • Part 2: The approach
  • Part 3: The delicious analogy