Wes Rishel

A member of the Gartner Blog Network

Wes Rishel
VP Distinguished Analyst
12 years at Gartner
45 years IT industry

Wes Rishel is a vice president and distinguished analyst in Gartner's healthcare provider research practice. He covers electronic medical records, interoperability, health information exchanges and the underlying technologies of healthcare IT, including application integration and standards. Read Full Bio

Coverage Areas:

PCAST Opportunity: Documents vs. “Atomic Data Elements”

by Wes Rishel  |  February 13, 2011  |  22 Comments

The PCAST report represents an opportunity to redirect resources in a way that meets the short-term goals associated with HITECH, supports more nimble development of standards for clinical data that are less arcane, and sets a long-term direction that enables more innovative use of IT in healthcare. In this post, I talk about how that could work and what ONC should do.

This is a complex issue, so I will start and end with a summary.

Tell ‘Em What You’re Gonna Tell Them

Here are the conclusions and recommendations included in this post.

  • Documents will continue to be at the heart of information flow for patient care and one primary way of bundling clinical information about people.
  • It is appropriate to deal with smaller snippets of clinical data outside of such bundles for some uses. However, each such use will have to accept and account for the lost context that comes when snippets are extracted from bundles.
  • Although the PCAST report refers to these snippets as “atoms,” a more appropriate metaphor is to think of them as “molecules.”
  • An evolving universal exchange language (UEL) should first target agreed-on definitions of the molecules.
  • The UEL should be used to encode data within documents (or document formats should include embedded UEL).
  • No document, however, should consist solely of coded data. They should all include a human-readable representation of the information that has been created.
  • The UEL should equally support searching, conveying and reasoning on clinical data that is not bundled into documents.
  • The UEL should not be as arcane as the current coded form of HL7 CDA documents.
  • There is a large reservoir of definitions of molecules available in as many as seven different projects world-wide, including some that are being used as tools in the fourth SHARP grant.
  • These definitions can lead directly to XML data representations that are far less arcane than current HL7 CDA.
  • HL7 Green CDA is also a step towards less arcane XML (although it still includes the dreaded OIDs). However, HL7’s recent position statement on Green CDA indicates an unwillingness to do anything but permit experimentation with it.
  • Consistent with PCAST recommendations, every system that has legitimate access to healthcare data may encode it using whatever vocabulary and ontology it wants. Sharing the data, vocabulary or ontology is a business decision made by the operators of a system.
  • Contrary to PCAST recommendations, the government should sponsor specific vocabularies and ontologies that serve as baseline capabilities for a UEL.
  • ONC should start an intense project similar to The Direct Project to establish molecular definitions in support of Stage 2 Meaningful Use requirements. This project should leverage the reservoir of molecular definitions that are available to the government now. It could be a fly-off between Green CDA and another less-arcane format.

Tell ‘Em

On page 72, the PCAST report says:

As mentioned, [HL7’s] CDA is a foundational step in the right direction. However, the thrust of CDA seems largely that it be an extensible wrapper that can hold a variety of structured reports or documents, each with vocabulary-controlled metadata. While this shares many features with the universal exchange language that we envisage, it lacks many others. In particular, it perpetuates the record-centric notion that data elements should “live” inside documents (albeit metadata tagged). We think that a universal exchange language must facilitate the exchange of metadata tagged elements at a more atomic and disaggregated level, so that their varied assembly into documents or reports can itself be a robust, entrepreneurial marketplace of applications. [Emphasis added.]

This statement has created a great deal of controversy.

  • On the one hand, there is a real danger to patients when individual “atomic” statements about them are taken out of context.
  • On the other hand, there are many situations where the risk-value ratio of pulling such statements out of a document favors doing so.

Without techniques for extracting specific facts from documents we would not be able to plot a patient’s cholesterol trend (documented in separate visits), have automated surveillance, or perform population health studies. If there is ever to be substantial secondary use of data collected as a part of giving care, this extraction is necessary.

The PCAST expresses faith that making data available in a more atomic form than documents will enable innovative new applications. This faith seems well founded to the extent that

  • the data is truly deidentified or made available under truly informed and revocable consent, and
  • the applications do not fall into the trap of making false inferences based on data pulled out of context.

So far this is easy. There might not have been any furor but for the phrase “perpetuates the record-centric notion.” One thinks of perpetuating myths, and the importance of the context implicit in a document is not a myth.

In fact, all of these statements apply:

  • documents will continue to be at the heart of information flow for patient care, AND
  • a UEL should be used to encode structured clinical information within documents, AND
  • the UEL should equally support searching, conveying and reasoning on clinical information in clusters that are smaller than entire documents.

Atoms and Molecules

Clearly the notion of what “atomic” means must be teased out. In a recent presentation Stan Huff gave an example that illustrates the danger of being overly atomic.

  • A stack of coded items is ambiguous. (SNOMED CT)
    • Numbness of right arm and left leg
      • Numbness (44077006)
      • Right (24028007)
      • Arm (40983000)
      • Left (7771000)
      • Leg (3021000)
    • Numbness of left arm and right leg
      • Numbness (44077006)
      • Left (7771000)
      • Arm (40983000)
      • Right (24028007)
      • Leg (3021000)

Without enough structure to understand the sequence, and whether a code modifies the code before it or the code after it, these “atomic” concepts are meaningless.
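Stan’s example can be made concrete with a short sketch. The SNOMED CT codes are the ones from the slide; the grouped structure is purely a hypothetical illustration of what “enough structure” could look like, not any standard format.

```python
# Flat "atomic" code lists for two clinically different findings.
# SNOMED CT codes are taken from Stan Huff's example above.
numb_right_arm_left_leg = ["44077006",  # Numbness
                           "24028007",  # Right
                           "40983000",  # Arm
                           "7771000",   # Left
                           "3021000"]   # Leg
numb_left_arm_right_leg = ["44077006",  # Numbness
                           "7771000",   # Left
                           "40983000",  # Arm
                           "24028007",  # Right
                           "3021000"]   # Leg

# As unordered collections of atoms, the two findings are indistinguishable.
print(set(numb_right_arm_left_leg) == set(numb_left_arm_right_leg))  # True

# A "molecular" structure binds each laterality to its body site, so the
# two findings can no longer be confused. (This structure is hypothetical.)
finding_a = {"finding": "44077006",
             "sites": [{"laterality": "24028007", "site": "40983000"},   # right arm
                       {"laterality": "7771000",  "site": "3021000"}]}   # left leg
finding_b = {"finding": "44077006",
             "sites": [{"laterality": "7771000",  "site": "40983000"},   # left arm
                       {"laterality": "24028007", "site": "3021000"}]}   # right leg
print(finding_a == finding_b)  # False
```

The point is not the particular structure but that some binding of laterality to body site must travel with the codes.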

So, let’s assume that the PCAST intended its concept of “atomic” to follow Einstein’s rule, “Make everything as simple as possible, but not simpler.” I prefer to think of the Einstein level of simplicity as “molecular” rather than atomic. The structure includes multiple heterogeneous atoms joined together in a very stylized way. Molecules create a context in which the atoms are meaningful. Simple molecules such as chemistry lab results or blood pressure tests may have as few as a dozen atoms, although many of the atoms are seldom used so the number of vital atoms in common molecules is lower. More complex molecules, such as an initial assessment of a new pregnancy, may have hundreds of atoms. One can carry the metaphor further by pointing out that certain combinations of atoms will exist as chemical radicals, fitting into the larger molecule in the same way a single atom would. For example, a blood pressure “radical” may be a stylized part of many assessment molecules.

To define a molecule it is necessary to achieve consensus on several things:

  • What are the fields of data described in the molecule? For example, for a blood pressure reading molecule the data fields include the diastolic and systolic readings, the posture and the method (automatic or manual). Other situations may involve more than a dozen data fields.
  • How is the data represented: a number, a string, an image, a code, or something else?
  • How are the units of measure represented? Are they assumed or explicitly stated? For example, must the systolic and diastolic pressures be stated in millimeters of mercury or may they also be expressed in millibars? If the latter, how are the units of measure expressed in the molecule?
  • What codes from what coding system are used to represent the overall molecule and each of its constituents?
  • What codes from what coding system are used to represent any observed values or other data?
  • How are the various bits of information structured together logically? For example, a blood pressure reading consists of exactly one systolic and one diastolic reading. A separate molecule might be a differential blood pressure. It might include several pairs of systolic-diastolic readings with specific postures. Here the basic blood pressure molecule may serve as a “radical” in forming the more complex differential blood pressure molecule.
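To make the blood pressure example concrete, here is a minimal sketch of what such a consensus template might look like. The LOINC codes 8480-6 (systolic) and 8462-4 (diastolic) are real; everything else, including the field names and the template structure, is invented for illustration.

```python
# A hypothetical "molecule" template for a blood pressure reading.
# LOINC 8480-6/8462-4 are real codes; the template shape is illustrative.
BP_TEMPLATE = {
    "name": "blood_pressure_reading",
    "fields": {
        "systolic":  {"code": "8480-6", "type": "number", "unit": "mm[Hg]", "required": True},
        "diastolic": {"code": "8462-4", "type": "number", "unit": "mm[Hg]", "required": True},
        "posture":   {"type": "code", "required": False},
        "method":    {"type": "code", "required": False},  # automatic vs. manual
    },
}

def validate(instance, template):
    """Check that an instance supplies every required field of the template."""
    return all(field in instance
               for field, spec in template["fields"].items() if spec["required"])

reading = {"systolic": 120, "diastolic": 80, "posture": "sitting"}
print(validate(reading, BP_TEMPLATE))  # True
```

A template like this is one half of the molecule; the `reading` dict is an instance constructed according to it, which is the two-way split described under “What names do the serious folks use” below.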

It is important to note that the choice of codes interacts with the definition of structure. Choosing a pre-coordinated code to identify the molecule results in fewer data fields than is the case with post-coordinated codes. (The Wikipedia writeup on SNOMED CT describes pre-coordination and post-coordination. A Google search will uncover more rigorous discussions in academic articles and books.)
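The trade-off can be shown in a few lines. The post-coordinated codes are the SNOMED CT codes from Stan’s example above; the pre-coordinated code is explicitly hypothetical.

```python
# Post-coordinated: the meaning is assembled from several codes
# (SNOMED CT codes from the numbness example above).
post_coordinated = {
    "finding":    "44077006",  # Numbness
    "laterality": "24028007",  # Right
    "site":       "40983000",  # Arm
}

# Pre-coordinated: one code carries the whole meaning, so the molecule
# needs fewer data fields. (This code is HYPOTHETICAL, for illustration.)
pre_coordinated = {"finding": "HYPOTHETICAL-numbness-of-right-arm"}

print(len(pre_coordinated) < len(post_coordinated))  # True
```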

How many molecules and radicals are there?

That is, how many ways can you define combinations of atoms that make sense and are useful for communicating patient data? Frankly, nobody knows. For one thing, even as they are identified and brought to consensus, new ones become needed through advances in science and care approaches. I have heard estimates from 20,000 to 100,000, but a number of physicians seem to agree that a very useful collection of molecules and radicals would contain many fewer than 20,000. Stan’s presentation describes several different parallel efforts to enumerate the molecules using siloed methodologies. The one he is working on has identified more than 4,000. These are being used as part of the toolset in the fourth SHARP grant.

On Monday 14 Feb John Halamka will add a post to his blog describing five efforts to create molecule definitions.

What names do the serious folks use for molecules?

“Clinical Element Model” and “detailed clinical models” are common terms. HL7 uses the term “Clinical Statements.” OpenEHR and CEN 13606 call them archetypes. Each molecular form is expressed in two ways: as a template that describes how instances are constructed, and in many, many instances containing specific data about a subject and constructed according to the template.

I will use the term clinical statement here, meaning an expression of a discrete item of clinical (or clinically related) information that is recorded because of its relevance to the health or care of a subject. (This is less precise than the HL7 definition of the same term and it is generalized to apply beyond caregiving processes and to apply to people who may not be under care as patients.)

Molecules as Aids to Interoperability

In a recent post John Halamka described a clinically important glitch in sharing allergies by way of the CCR and personal health records.

BIDMC’s EHR considers an allergy list entry to be the substance, the reaction, the observer (doctor, nurses, your mom), and the level of certainty. Google considers an allergy to be the substance and a mild/severe indicator. Thus, a transmission of an allergy “Penicillin, Hives, Doctor, Very Certain” to Google results in “Penicillin” with no other information. Use of an agreed upon list of data elements (i.e. what constitutes an allergy list) for data exchange would resolve this problem.

A physician viewing the allergy as filtered through the data model of the PHR has significantly less information when deciding how to deal with the patient.
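The glitch John describes is easy to reproduce in miniature. The field names below are illustrative, modeled on his example; the mapping function shows how a narrower receiving model silently drops data.

```python
# A BIDMC-style allergy entry (richer model) vs. a narrower PHR model.
# Field names are illustrative, modeled on the example above.
ehr_allergy = {"substance": "Penicillin", "reaction": "Hives",
               "observer": "Doctor", "certainty": "Very Certain"}

PHR_FIELDS = {"substance", "severity"}  # the narrower receiving model

def map_to_phr(entry):
    """Keep only the fields the receiving PHR understands; the rest are lost."""
    return {k: v for k, v in entry.items() if k in PHR_FIELDS}

phr_allergy = map_to_phr(ehr_allergy)
print(phr_allergy)                      # {'substance': 'Penicillin'}

dropped = set(ehr_allergy) - set(phr_allergy)
print(sorted(dropped))  # ['certainty', 'observer', 'reaction'] -- silently lost
```

An agreed molecule definition for “allergy list entry” would make the dropped fields a visible violation rather than a silent truncation.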

There is a strong argument that getting agreement on the important molecules is more important to interoperability than getting agreement on complete bundles such as the CCD/CCR or a discharge summary.

Molecules and Documents

What is the relationship of a molecule to a document?

Just as it is not possible to interpret the atoms without the context of a molecule, there are strong arguments that important meaning is lost if you interpret the molecules without the context of a document.

Clinical statements in isolation.

A lot of physicians believe that in patient care, most clinical statements cannot be properly or safely interpreted in isolation. A statement such as “the patient has been prescribed 10 mg of Crestor qd” may be subject to a different interpretation if another statement is that the patient is not taking the drug because he believes it is causing his myalgia. “History of smoking” means something different under “family history” than it does under “patient history.”

One very relevant piece of context is the purpose for which a set of clinical statements was prepared. A health assessment on a patient with no obvious cardiovascular problems is different from a preop workup on a patient by a cardiologist and that is different from a cardiologist’s assessment of a patient who has pain that may be angina.

In medical practice there is a well-established expectation that a collection of clinical statements created for a purpose and signed by a clinician does contain enough relevant context to be useful to another clinician by itself. Common practice refers to those collections of statements as a document. The physical act of signing (with a pen or electronically) represents a point at which the signer believes that the collection of clinical statements, taken together, represents a useful set of information about the patient.

(So far, we make no assumption about the form of the document or the method by which the clinical statements are entered. What’s important is that a clinician prepared it with a purpose in mind and, in performing some signing ceremony, took responsibility for it being accurate and not missing glaring context information. Four forms of documents that illustrate the range are: a bit image of a fax of a dictated report; an electronically transmitted dictated report that contains all the natural language as computer text; an electronically transmitted text report that contains images of EKG strips, rashes, audio recordings of chest sounds or other multimedia content; and reports prepared in computer software so that all the data is structured and coded using clinical standards.)

Secondary uses of data often aggregate clinical statements about many subjects extracted from many encounters or other data sources. These uses involve a controlled loss of context that represents an acceptable level of risk for the purpose. For example, in a study of Crestor vs. Lipitor it may be acceptable to extract the prescribed amounts without regard to reports of the patients’ compliance. (It may even be beneficial.)

To the maximum extent possible all data about a patient should be encoded using a UEL that enables straightforward extraction of their clinical statements. To the maximum extent possible that “exchange language” should be common in all documents (universal). The PCAST report does a service in calling for an exchange language that is universal and supports straightforward extraction of clinical statements (appropriately sized molecules). This can be very important to support searching and secondary use of data, although natural language processing of documents without coded clinical statements will continue to be equally important for many years.

I believe that the UEL should not be as arcane as the coded form of HL7 CDA documents. However, the need for a simpler UEL in no way implies that maintaining the source document structure is not critical when the data is passed from one clinician to another in the course of patient care.

The Importance of Partially Coded Documents

As a practical matter it is unlikely that all the data in most documents can be encoded. The UEL must include the ability to represent partially coded data. (Or, conversely, the document must contain the ability to include some of its data encoded using the UEL.)

In testimony for the Enrollment Working Group last June, I expressed very real concerns about “incremental interoperability” to avoid “frozen interface syndrome,” a phenomenon where industry gets so comfortable with an existing standard that it becomes economically infeasible to move on. Clearly the PCAST has prescribed an approach to meeting these concerns, but it assumes that the starting point is coded data. In fact, a pragmatic strategy for reaching the PCAST goals must start from mixed text and coded data. As a practical matter, documents will be prepared on systems with varying ability to encode the clinical statements. They will be consumed on systems with varying ability to decode the structured clinical statements. Documents should all meet a lowest-common-denominator format where they can be read and displayed by systems that have little ability to interpret the coded and structured data. Even sophisticated systems that receive documents that are extensively coded will need to present the document as it was seen by the signing clinician.
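A sketch of what such a partially coded, lowest-common-denominator document might look like, with the caveat that the element names are invented for illustration and are not CDA or any other standard: every entry carries human-readable text, and some entries additionally carry coded data, so a minimally capable system can display everything while a sophisticated one can also extract the molecules.

```python
import xml.etree.ElementTree as ET

# A hypothetical partially coded document. Every entry has narrative text;
# some entries also carry a coded "molecule". Names are illustrative only.
doc = """
<document>
  <entry>
    <text>Blood pressure 120/80 mmHg, sitting.</text>
    <coded system="LOINC" systolic="120" diastolic="80"/>
  </entry>
  <entry>
    <text>Patient reports improved sleep since last visit.</text>
  </entry>
</document>
"""

root = ET.fromstring(doc)

# A minimally capable consumer can always render the narrative...
narrative = [e.findtext("text") for e in root.findall("entry")]

# ...while a more capable consumer also extracts the coded molecules.
coded = [e.find("coded").attrib for e in root.findall("entry")
         if e.find("coded") is not None]

print(len(narrative))        # 2: every entry is human-readable
print(coded[0]["systolic"])  # 120: structured data where it exists
```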

Is it More Important To Standardize Molecules or Documents?

Most standards efforts have been oriented towards defining documents as they come closer to fulfilling the needs of specific use cases. However, the reality is that definitions of use cases are only approximations of real-world needs, and many real-world use cases are going to be fulfilled by some combination of structured and unstructured data. The benefit of document definitions as standards is that they provide a convenient way to enumerate what must be sent (usually pretty minimal) and what may be sent (usually wide open).

It is unfortunate that HL7 has produced arcane XML and currently seems uninterested in producing wire standards based on Green CDA other than to “encourage experimentation.” However, ONC has an alternative that is arguably more closely in line with the PCAST recommendations, seems to lead directly to much less arcane XML and doesn’t involve starting over. ONC should leverage the excellent work that has been done by the organizations that have been defining molecules to create straightforward XML representations of the most important molecules. It should define documents as having a fully human-readable representation of what is transmitted and, to the maximum extent possible, having coded molecules according to one of the fine libraries of molecule definitions that are out there. With a project comparable in interest and intensity to The Direct Project, it might have specifications available for the draft notice of proposed rulemaking for Stage 2 of meaningful use.

Who Defines the Molecules: Government or “The Market”?

Directly after the statement cited above the PCAST Report says:

In a similar vein, we view the semantics of metadata tags as an arena in which new players can participate (by “publishing”), not as one limited to a vocabulary controlled by the government.

This statement is important and may invalidate my assertion that there is no real controversy. The PCAST may have been asserting that all clinical statements created about subjects can be interpreted post hoc, and that the software doing the interpretation could use its own molecular definitions or whatever third-party definitions it finds appropriate. This represents a major opportunity for innovation and also a major danger that one proprietary interest could ultimately control healthcare data in the U.S. Consider the current market for generalized search engines, such as Google and Bing. The “secret sauce” that establishes competitive differentiation among them is the vocabulary and related techniques that they employ to optimize a user’s success in finding what they want. Neither firm publishes its approach. The AND statements below describe a balanced way to enable innovation while guarding against an information monopoly:

  • Every system that has legitimate access to healthcare data may encode it using whatever vocabulary and ontology it wants. Sharing the data, vocabulary or ontology is a business decision made by the operators of a system, AND
  • The government should sponsor specific vocabularies and ontologies that serve as baseline capabilities for a UEL.

Tell ‘Em What You Told ‘Em

Here it is again. If you read all the way through to here, think of these as the Cliff Notes.

  • Documents will continue to be at the heart of information flow for patient care and one primary way of bundling clinical information about people.
  • It is appropriate to deal with smaller snippets of clinical data outside of such bundles for some uses. However, each such use will have to accept and account for the lost context that comes when snippets are extracted from bundles.
  • Although the PCAST report refers to these snippets as “atoms,” a more appropriate metaphor is to think of them as “molecules.”
  • An evolving universal exchange language (UEL) should first target agreed-on definitions of the molecules.
  • The UEL should be used to encode data within documents (or document formats should include embedded UEL).
  • No document, however, should consist solely of coded data. They should all include a human-readable representation of the information that has been created.
  • The UEL should equally support searching, conveying and reasoning on clinical data that is not bundled into documents.
  • The UEL should not be as arcane as the current coded form of HL7 CDA documents.
  • There is a large reservoir of definitions of molecules available in as many as seven different projects world-wide, including some that are being used as tools in the fourth SHARP grant.
  • These definitions can lead directly to XML data representations that are far less arcane than current HL7 CDA.
  • HL7 Green CDA is also a step towards less arcane XML (although it still includes the dreaded OIDs). However, HL7’s recent position statement on Green CDA indicates an unwillingness to do anything but permit experimentation with it.
  • Consistent with PCAST recommendations, every system that has legitimate access to healthcare data may encode it using whatever vocabulary and ontology it wants. Sharing the data, vocabulary or ontology is a business decision made by the operators of a system.
  • Contrary to PCAST recommendations, the government should sponsor specific vocabularies and ontologies that serve as baseline capabilities for a UEL.
  • ONC should start an intense project similar to The Direct Project to establish molecular definitions in support of Stage 2 Meaningful Use requirements. This project should leverage the reservoir of molecular definitions that are available to the government now. It could be a fly-off between Green CDA and another less-arcane format.


Category: Healthcare Providers

22 responses so far

  • 1 Steven Davidson   February 13, 2011 at 8:13 pm

As a former chemist, I very much enjoyed your analogy and found greater understanding than previously. Still, some e-patients and privacy activists tell me they want a meta-tag on every data element to secure their privacy rights, putting even the model of clinical statements at risk. Elevating the discussion around reasonable clinical information (documents or clinical statements as appropriate to the use) in today’s anxious patient privacy environment seems a daunting task. Though perhaps that’s just NY.

  • 2 Stanley Nachimson   February 13, 2011 at 8:28 pm

    Wes, you express very well the continued “debate” about the development of “documents” (structure) versus the development of “content” (molecules). I agree with your statement that “every system that has legitimate access to healthcare data may encode it using whatever vocabulary and ontology it wants. Sharing the data, vocabulary or ontology is a business decision made by the operators of a system”. It appears that it will fall to the UEL to correctly structure and define the information exchanged.

    What I feel may be a weakness in the approach is identified in your first Cliff Note – “Documents will continue to be at the heart of information flow for patient care and one primary way of bundling clinical information about people.” In our current system, patient information is collected not only for patient care, but for a myriad of other uses – especially administrative uses such as claims payment, and uses such as public health research, biosurveillance, etc. Limiting the UEL to only clinical exchange will be problematic when attempts are made to use the data for other purposes. The other purposes must be considered in this development process.

  • 3 Charles Parisot   February 13, 2011 at 9:33 pm

    Wes,
    You stated: “Most standards efforts have been oriented towards defining documents as they come closer to fulfilling the needs of specific use cases.” This is a misperception. Specific documents have been mostly defined by IHE Content profiles and HL7 CDA Implementation Guides, but by design they have been reusing molecules. This is old news. As a result, libraries of “molecules” have been defined and widely reused across different types of documents across HL7 and IHE and many other more specific projects around the world. In fact, an analysis shared by EHRA across over 20 types of documents demonstrates that most molecules (also called modules by IHE) have a wide rate of consistent reuse. Thanks for advocating what has already been largely achieved and needs only a little touch-up that the S&I CDA Consolidation project will rapidly finish with the help of HL7 and IHE.
    ==> I agree with the “library of molecules” approach you propose, but there is no need for another Direct-like project as you suggest. It is already well underway with the CDA Consolidation project.

    On a completely different topic, I would encourage you not to worry too much about the arcane HL7 CDA XML. One universal lesson that I have learned over the past 15 years in the XML world is that everyone thinks that someone else’s use of XML is arcane. Why? Because XML allows endless variations in style and can be optimized for different processing and expression needs.
    So let’s not chase a myth. Let’s use the HL7 CDA XML on the wire, and those who do not like it may define their favorite XSLT transform to their favorite XML style. We can all be Green our own way! Anyway, no health professional will ever read the XML format of a document to care for a patient!

    Charles

  • 4 Wes Rishel   February 13, 2011 at 10:07 pm

    Thanks for your comments, Charles. I think we will have to agree to disagree on the importance of reasonably intuitive XML. The issues are far more than stylistic.

  • 5 Wes Rishel   February 13, 2011 at 10:18 pm

    Stanley, thanks for the reply. A couple of points to consider. For some administrative data there are a set of regulations from CMS that specify the use of X12 and a substantial investment in infrastructure around those transactions.

    To the extent that you are referring to the use of clinical data to support payer-based care management, fraud and abuse investigations and (dare I say it?) claims attachments then I would expect payer-provider interactions to be coded using the same UEL described above.

    These transactions may well meet the criteria I described as “secondary” use of the data. They may use molecular UEL rather than full documents. For example, if a payer is monitoring dialysis it may expect to receive only the molecules that provide the latest enzyme results.

    On the other hand, if not all payers are ready to parse the XML in order to automatically adjudicate dialysis claims, they may choose an electronically transmitted document with embedded UEL. That way some payers can simply view the report and others can extract the molecules to drive auto adjudication.

  • 6 Wes Rishel   February 13, 2011 at 10:38 pm

    Steven, thanks for your reply. I am glad to know that the extended metaphor helped.

    It is difficult to judge the extent to which the underlying thinking in the PCAST report gets to the full scope of privacy concerns. It doesn’t explicitly call them out, but the thinking still may be there. It will take some analysis not yet conducted. However, molecules already exist for identifying the patient and could be defined for other metadata as needed if the PCAST scheme proves viable.

  • 7 Peter Basch   February 14, 2011 at 8:31 am

    Wes – a very thoughtful post and conceptualization of the strengths and weaknesses of an approach to make information more readily shareable. One additional layer of complexity: the size of the knowledge molecule necessary to provide appropriate context for treatment purposes is not constant, and varies by situation and specialty. For example, a neurosurgeon may find value in discrete images of a cervical spine MRI. That same surgeon might find that information too granular to make a decision to operate or not, and may need clinical findings over time along with it.

    Someone like myself (a PCP) might require a different size and shape molecule to find clinical meaning – such as a report of the cervical spine MRI + a neurosurgeon’s interpretation.

    One additional metaphor to consider when thinking about the same questions is with pixels and pictures. Overgranularity and separation / movement of pixels below a certain level can lead to clear but meaningless (or meaningful but misleading) snapshots in time. One additional example is that of electronic prescriptions. While each prescription is made up of a drug (brand and/or generic), a formulation, a strength, a fill #, a refill #, and directions – a prescription cannot be transmitted without all of these fields being linked together.

  • 8 Thomas Beale   February 14, 2011 at 6:29 pm

    Charles, can you point us to the HL7/IHE molecules / modules? I have never seen any reusable molecule in either space. Unless you mean message fragments? But for practical real-world reusability, we can’t be stuck in the v2 or v3 message space.

  • 9 will ross   February 15, 2011 at 7:00 am

    Wes — Thanks for the thoughtful comments. You write, “ONC should leverage the excellent work that has been done by the organizations that have been defining molecules to create straightforward XML representations of the most important molecules.” But then you do not mention the Standards and Interoperability Framework effort, launched last month by ONC. Are you specifically excluding the S&I Framework initiative because it is orthogonal or otherwise irrelevant to your suggestion to launch a national effort to define “straightforward XML representations”? Or did you not mention the S&I Framework for another reason?

  • 10 Wes Rishel   February 15, 2011 at 7:23 am

    Will, I was thinking ONC could do that work within the S&I framework.

  • 11 Mark Frisse   February 15, 2011 at 8:24 am

    Wes,

    Thanks for a great summary. When I think of examples to explain context, I think of allergies and oxygen saturation (we old-timers used to do arterial blood gases). As a medical student, I quickly learned the hazard of recording blood oxygen saturation without knowing the amount of oxygen one was breathing. Consider an asthmatic who presents with a low oxygen saturation, is treated, and is discharged with a high oxygen saturation. Viewed in isolation, either value could be misinterpreted. Is whatever value you are receiving the “baseline,” or is it a reflection of a patient when they are acutely ill? I’ve seen misinterpretations when these values are viewed in isolation. But I’ll defer to the real practitioners like Peter Basch to add appropriate context to my own note.

  • 12 Keith W. Boone   February 15, 2011 at 9:02 am

    Wes, we mostly agree. For points of departure, see http://motorcycleguy.blogspot.com/2011/02/wes-rishel-recently-gave-some-advice-to.html

    Keith

  • 13 William Goossen   February 15, 2011 at 9:54 am

    Dear Wes,

    Interesting approach to the molecules. I have been working in this space for about a decade now, and have found that besides what you say, another analogy is important.
    Once you have virtual molecules represented as byte strings, they can be copied without limit, unlike physical, atom-based materials, so the potential for reuse is great.

    Reuse of the virtual molecules, such as Detailed Clinical Models, helps on many occasions. For instance, a DCM virtual molecule can be linked to any CDA example. Or it can be part of any HL7 v2 or v3 message, indeed as a clinical statement representation. It can be part of the EHR specification: e.g., if there is a section in the spec stating that an EHR should contain 1-n assessment scales, any DCM representing an assessment scale can be included.

    In any reporting mechanism for aggregate data, virtual molecules like DCMs can be reused again and again. Context is to some extent available in a DCM because purpose, evidence, and guidance for data capture are included in the DCM expression.
    It helps to think of the virtual molecules as parts of larger virtual organs or even virtual systems. Hence, you can define the organ or system and identify the DCMs that go into it. DCMs are then more or less comparable to components of a stem cell.

    So the core characteristic is that the data elements, clinical knowledge, terminology binding and data-type spec can live in one virtual molecule that, with modern tools, can be used and reused again and again.
    There are different names available, and the six most common approaches were recently reviewed. Please have a look at
    http://pdf.medrang.co.kr/Hir/2010/016/Hir016-04-01.pdf

    William

  • 14 Dennis Giokas   February 15, 2011 at 1:10 pm

    Wes – I like the fact that you pushed us to think about molecules over atoms. I am going to push us further. It is not that I disagree with your observations. What I struggle with is that people continue to focus on the low-level representation of clinical data and seem not to put enough emphasis on the business/clinical context of the clinical models and their underlying behaviors. There are two stakeholders that can significantly benefit from my proposed shift in thinking: the clinicians and the implementers (software designers and developers).

    To continue with the analogy started, I propose we shift our thinking from molecules to “cells”. I want to wrap complete clinical concepts in that “single cell” and interact with it in an appropriate way, without looking into it. Within it there are many molecules which are the data. “Cells” can be assembled into higher forms of life. The role and relationships of the cells must be clearly defined, even though they are all based on the same building blocks.

    So this brings me back to my early days as a designer and programmer. Excuse me for sounding a bit dated, but let's model this using an object-oriented paradigm. These detailed clinical models hold all of the data (molecules) and the set of functions to read/write that data, without exposing how it is rendered (e.g. XML). (We still need the appropriate use of controlled medical vocabularies within the objects.) The objects have encapsulated functions (methods) that provide value-added business logic using that data, again without exposing the underlying mechanism holding the data. This aspect, which I feel is a key requirement, has not been prominent in this discussion so far.

    Then there is the matter of transmission for these objects and the wire format. Those are also things that I don’t want to expose. Just marshal the objects over the wire and be done with it!

    Who benefits? Clinicians do because we can model these objects to represent healthcare in a way they can understand and validate. The developers do because they have object specification (or the objects in a reference implementation and/or runtime forms) they can use in their solutions.
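    A minimal sketch of the encapsulation Dennis describes, in Python (the clinical concept, field names, and methods are illustrative only, not any actual HL7 or DCM API): the "cell" holds its data (molecules), exposes value-added clinical logic as methods, and hides the wire format behind a marshaling method.

```python
from dataclasses import dataclass
import json

@dataclass
class ApgarScore:
    """A 'cell': a complete clinical concept with behavior, not just data."""
    heart_rate: int
    respiration: int
    muscle_tone: int
    reflex: int
    color: int  # each component scored 0-2

    def total(self) -> int:
        # Value-added clinical logic lives inside the object.
        return (self.heart_rate + self.respiration + self.muscle_tone
                + self.reflex + self.color)

    def is_reassuring(self) -> bool:
        # Callers interact with the concept, not its representation.
        return self.total() >= 7

    def marshal(self) -> str:
        # The wire format (JSON here; could just as well be XML) is an
        # internal detail that consumers never need to see.
        return json.dumps(self.__dict__)

score = ApgarScore(heart_rate=2, respiration=2, muscle_tone=1, reflex=2, color=1)
```

The design choice is exactly the one argued for above: the clinician validates the model's behavior (`total`, `is_reassuring`) while the developer is free to change the marshaled form without touching any caller.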

  • 15 Liora Alschuler   February 15, 2011 at 2:35 pm

    Hey, Wes,
    It’s great to hear different views on this.

    You write: “HL7's recent position statement on Green CDA indicates an unwillingness to do anything but permit experimentation on the Green CDA”.

    I have a question and what might be a minor gloss on your statement.

    Question: What would you prefer over experimentation?

    Minor gloss: There is no barrier to a proposal for making greenCDA a normative wire format within HL7. If anyone would like to propose it as a normative wire format today, the process is open and available. I don’t speak for the organization, this is simply from knowledge of policy and procedures.

  • 16 J Marc Overhage   February 15, 2011 at 2:40 pm

    I honestly don’t know the answer to this, but is it that HL7's constructs are arcane, or is it that generalizable constructs in a complex environment are inherently complex? It seems to me that our experience is that we find these constructs complex and say, let's start over. We come up with a simpler construct, but then reality sinks in — we need all those things — at least if we are going to deal with more than a few trivial cases, and they get added back in a bit at a time until the new, simpler thing is just as complicated as the thing we tried to replace. In addition, I’m not sure that very many people need to understand the arcane formats. Shouldn't there be tools that expose the information from the arcane structure in usable ways, essentially views of the data for different purposes? I’ll go back to the Einstein quotation in your post — “make everything as simple as possible, but not simpler.” At some level, we are modeling a very complex world, and it may take a relatively complex structure to represent that world; that may be as simple as we can make it.

  • 17 Wes Rishel   February 15, 2011 at 3:34 pm

    Thanks, Liora. “Experimentation” covers a broad range of alternatives. If a group would form in HL7 that would develop the green CDA for an important use case and implement it within less than a year, that would be very productive. It might lead to sufficient proof of viability to inform decisions that must be made this year about meaningful use Stage 2.

    Ideally, I would like to see a project that, in some ways, follows the pattern of The Direct Project. The specific features of Direct that I have in mind are that

    a) there were known disagreements on the backbone protocol at the start of the project
    b) participants were committed to evaluating multiple options and then going with what was chosen
    c) rough specs (standards here), code to operation, final specs all in < 1 year
    d) multiple open source implementations

    In this case Green CDA would be one of the alternatives to consider.

  • 18 Eric Rose   February 16, 2011 at 5:52 pm

    A very interesting post.

    The health information “molecules” concept sounds like a step toward building a standardized data model for clinical information, at least in small to medium-sized chunks.

    Of course, that doesn’t directly imply a particular data model for clinical systems. In theory, systems that need to send or consume data according to such a model would be able to internally represent the data however they want. However, their existing internal representations might be inherently incongruous with whatever the standardized “molecular” schema is (one example that comes to mind is family medical history or medication history, for which various discrepant object models are possible).

    In such cases, the system could be designed to render the hard-to-handle data as narrative text on both import and output. But if a regulatory imperative comes about to be able to import/export specific types of data “molecules” with maintenance of fidelity, entire swaths of the current HIT installed base could be in trouble. What are your thoughts on that?

  • 19 David Tao   February 17, 2011 at 11:30 am

    Hi Wes,
    Thank you for the post. It seems to me that the CDA Consolidation project, run under the ONC S&I Framework, is ONC’s attempt to do what you recommended. The Direct Project was a precursor for the S&I Framework processes being used in the three new initiatives. I think there is strong agreement on the benefit of “molecules” as well as “documents” (it’s not an either/or choice). Charles Parisot’s comment above is talking about molecules (modules). I wonder if the reusable templates, which are a goal of the CDA Consolidation project, can be made sufficiently “green” to reach an agreement? As you pointed out, the Direct Project started with widely diverse viewpoints, but people worked together to reach a consensus. I hope that the ONC S&I Framework projects will be run in a way that promotes similar results — in fact, maybe they will be run even better because of lessons learned from The Direct Project.

  • 20 Hans Buitendijk   February 17, 2011 at 5:51 pm

    Wes:

    I support Marc Overhage’s notion that as we simplify, we lose expressivity and will build those capabilities back in over time as we realize what we lost. It is also an important notion that the end user need not be bothered by whatever the internal syntax is, complex or simple. That’s what the computer is for. And while there are many who believe CDA is complex, others will attest to it being quite workable. We are all in sync that implementation guides should be improved, and the S&I Framework CDA Consolidation Project is working on just that.

    There is another consideration I would like to contribute to the mix in our quest for an acceptable UEL:

    - We should be able to express a molecule the same way, whether it is part of a document, message, service, or other exchangeable data set.
    - We should not have to express the molecule in UEL1 for one document type, UEL2 for a message, UEL3 for another document type, etc.
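    A toy illustration of that requirement (the names and structures are invented, not any actual UEL): the molecule is encoded once, and every wrapper — a document here, a message there — embeds the identical representation.

```python
import json

def molecule_blood_pressure(systolic: int, diastolic: int,
                            units: str = "mm[Hg]") -> dict:
    """One encoding of the molecule, regardless of where it travels."""
    return {"concept": "blood-pressure", "systolic": systolic,
            "diastolic": diastolic, "units": units}

bp = molecule_blood_pressure(120, 80)

# The same molecule embedded unchanged in two different exchange contexts:
document = {"type": "discharge-summary", "entries": [bp]}
message = {"type": "result-message", "payload": bp}

# Both wrappers carry identical representations of the molecule —
# no UEL1-for-documents vs. UEL2-for-messages divergence.
assert json.dumps(document["entries"][0]) == json.dumps(message["payload"])
```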

    Clearly, today the reality is that we have multiple UELs that cannot easily be mapped bi-directionally with high fidelity. Within one UEL, HL7 V3, we also have multiple dialects that require further harmonization. HL7 workgroups do have a stated goal that data communicated as part of an order, a result message or a document exchange should be expressed the same way. Clinical Statement and DCM are after the same goal. There is a risk that greenCDA adds another dialect, but I’m confident that can be avoided.

    But as we evolve toward one UEL with minimal dialects that can accommodate the above considerations, creating yet another UEL is not going to be very helpful. In fact, starting a new UEL will yield a repeat of all the same questions, the same learning, the same pitfalls.

    So I would hope that we can rally around a common objective (which I think we have) and contribute to the current efforts that, despite their real and perceived warts, are getting us closer. Maybe not as fast as most of us would like, but a lot further than they are given credit for.

  • 21 Liora Alschuler   February 21, 2011 at 2:06 pm

    Hi, Wes,

    All interested in experimentation in this direction may want to attend Dan Pollock’s CDA HAI talk at the HL7 booth 5863, Wednesday, 2:40pm. Dan will review the use of CDA in the CDC’s National Healthcare Safety Network (HAI reporting) and their experimentation with greenCDA for Central Line Insertion Practice (CLIP) reporting.

    Seeing what CDC and those mandated to do CLIP reporting do with them will be interesting. An HL7 member has set up a wiki page to support communication around this and related trial uses.

    I’m not sure it has all the earmarks of Direct — for one, it’s not a code development project — however, I think it’s a great beginning to the process you would like to see, and it should give us a good basis to move to ballot, based on the outcome, well within a year.

    Liora

  • 22 HIMSS11 « health care commentaries from around the world   February 25, 2011 at 2:43 pm

    [...] take shots of Joyce and Robin on it.  The tweetup goes quite well.  I catch up with Wes Rishel and we talk about greenCDA for about 10 minutes before he has to run off to another meeting.  [...]