
Common Clinical Vocabularies in Health Care

I wrote this article for my course, “Advanced Topics in Computer Science: Health Informatics”.

Introduction

Accurate knowledge about patients and diseases is critical when clinical decisions are made. According to [8], the improvement of medical knowledge depends on the ability to analyze practice outcomes and apply them to patients. To analyze these outcomes, however, we need comparable data, and to have comparable data, all the parties involved need to share the same vocabulary. As a result, the lack of common clinical vocabularies remains the single greatest obstacle to comparable data. After all, the data we store one day might be difficult to interpret the next if the vocabulary used to encode it has changed. Common clinical vocabularies must be more than a list of terms. They need synonymy, multiple classifications, and domain completeness, and they must provide consistent views of their definitions while being unambiguous and avoiding redundancy.

In this article we examine different aspects of common clinical vocabularies. This article will:

List a set of requirements every organization should take into consideration when creating a common clinical vocabulary.

Discuss what common clinical vocabularies are for, as well as why it is so hard to create one.

Provide a list of vocabularies that are currently used.

Discuss how change is handled in common clinical vocabularies.

Definition

Common clinical vocabularies are the natural prerequisite for disease and health outcome studies. They consist of standardized terms and their synonyms, which record patient findings, circumstances, events, and interventions in sufficient detail to support clinical care, decision support, outcomes research, and quality improvement [8].

What is it for?

According to [6], despite a vast literature on common clinical vocabularies, there is little information on what tasks they need to perform. However, [6] identifies some tasks that vocabularies need to facilitate, such as:

Collecting information: on individual patients, populations of patients, and institutions.

Presenting information: querying and retrieving information about patients.

Navigating and browsing through the information: either using the web or on a local repository.

Indexing knowledge: either medical knowledge or information about patients.

Analyzing and generating a natural language: which can be used internationally, based on local usage and preferences.

Why Common Clinical Vocabularies?
There are multiple reasons why common clinical vocabularies are needed. First of all, creating one is a challenge, and not just any challenge: it has been considered one of the grand challenges of medical informatics. Most importantly, it is necessary to establish a common terminology that can be used to share data universally.

Computers play a big role, since they have changed the direction of medicine. Nonetheless, they complicate matters, because patients can educate themselves using the internet. This can result in patients reading inadequate information about the medical action their disease requires. On top of that, there is the English language, a complex language that looks even more complicated when we add the ambiguity and redundancy of the clinical terms used by doctors.

According to [8], a study conducted by the Institute of Medicine (IOM) indicates that between 44,000 and 180,000 Americans die each year as a result of medical errors. In another survey, conducted at the 2000 Healthcare Information and Management Systems Society (HIMSS) conference, 98% of respondents believed that common clinical vocabularies would be important in reducing medical errors.

Common clinical vocabularies attempt to eliminate these problems by removing semantic issues between doctors, nurses, researchers, patients, and the general public.

Why is it so hard?
The difficulties start with the fact that humans and computers understand information in different ways. In this regard, Donald Norman said “We are analog beings trapped in a digital world… We are compliant, flexible, and tolerant. Yet we have constructed a world of machines that requires us to be rigid, fixed, and intolerant” [6]. Nowadays, some of the existing technology for information exchange is designed for computers. As a result, users have to adapt to computers. The desired result is a vocabulary that is understandable to health care professionals and, at the same time, to the software engineers who work on health care systems.

Creating a common clinical vocabulary requires consensus. Consensus is not always possible, since doctors, nurses, and other health care professionals disagree. One way to minimize the difficulty of achieving consensus is to identify the areas in which a level of consensus is appropriate and the areas that can be left to the local choice of the health care professional.

Clinical vocabularies introduce a dilemma between the interpretation of terms by patients and by doctors. If the ultimate aim is diagnosis by computer, it is mandatory to have a totally unambiguous clinical vocabulary. How hard is this to achieve? Charles Murray [4] conducted a survey to evaluate how doctors and patients interpret clinical vocabularies. In his study, multiple-choice questionnaires were completed by 234 patients and compared with those completed by 35 doctors. In the study, the doctors reached a level of agreement of over 90%. However, the patients did not reach complete agreement on the definition of any term. For example, for the word diarrhea, 54% of the patients thought the term means “passing a lot of bowel motions in a short time”, while 68% of the doctors answered that the word means “Passing loose bowel motions” [4]. This example shows how hard it is to come up with unambiguous semantics, even for the most common terms.

Requirements
There are several requirements every common clinical vocabulary needs to meet. Based on [8], we identified four basic requirements:

Evolving: the vocabulary needs to be expandable. It must be able to grow as new terms are created, existing concepts are refined, or some concepts are retired. The vocabulary must also carefully track changes and flag any violations incurred when a term is modified.

Unique: each term should have a single conceptual meaning. Terms cannot be vague or redundant. If a term in a common clinical vocabulary is discovered to have two or more meanings, an appropriate response is to disambiguate these meanings by creating a separate term for each [2].

Unchangeable: once a term is defined, it should be permanent and immutable. If the concept is made inactive, the term still needs to retain its uniqueness and remain in the structure. Usually, terms that are deleted create a problem for systems that are using them [2]. For example, if a patient receives a diagnosis on a specific date, it is not acceptable to delete the diagnosis, only because the term was removed.

Hierarchical: a concept and its terms should be related to each other in the form of a hierarchy, based on the concept’s essential meaning. Individual terms can, however, be represented in multiple hierarchies as long as they remain unique.
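The four requirements above can be sketched as a small data structure. This is only an illustrative sketch, not how any real vocabulary is implemented: the class names `Term` and `Vocabulary`, the codes `T1` to `T3`, and the case-insensitive comparison used as a stand-in for "same meaning" are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Term:
    """Unchangeable: a frozen dataclass cannot be modified after creation."""
    code: str
    meaning: str


@dataclass
class Vocabulary:
    terms: dict = field(default_factory=dict)    # code -> Term
    parents: dict = field(default_factory=dict)  # code -> set of parent codes
    retired: set = field(default_factory=set)    # inactive codes, never deleted

    def add(self, code, meaning, parents=()):
        """Evolving: new terms can be added at any time."""
        # Unique: one code per meaning, one meaning per code.
        if code in self.terms:
            raise ValueError(f"code {code!r} is already defined")
        if any(t.meaning.lower() == meaning.lower() for t in self.terms.values()):
            raise ValueError(f"meaning {meaning!r} already has a term")
        # Hierarchical: a term may have several parents, all of which must exist.
        unknown = [p for p in parents if p not in self.terms]
        if unknown:
            raise KeyError(f"unknown parent codes: {unknown}")
        self.terms[code] = Term(code, meaning)
        self.parents[code] = set(parents)

    def retire(self, code):
        """Mark a term inactive while keeping it in the structure."""
        if code not in self.terms:
            raise KeyError(code)
        self.retired.add(code)

    def ancestors(self, code):
        """Return every code reachable by following parent links."""
        found, stack = set(), list(self.parents[code])
        while stack:
            current = stack.pop()
            if current not in found:
                found.add(current)
                stack.extend(self.parents[current])
        return found


# Hypothetical codes and meanings, for illustration only.
vocab = Vocabulary()
vocab.add("T1", "disorder of lung")
vocab.add("T2", "infectious disease")
vocab.add("T3", "pneumonia", parents=("T1", "T2"))
vocab.retire("T3")
print(vocab.terms["T3"].meaning)      # the retired term is still interpretable
print(sorted(vocab.ancestors("T3")))  # the term sits in two hierarchies
```

Retiring a term only adds its code to the `retired` set, so a patient record that was coded with `T3` in the past can still be looked up and understood.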

Impact on Health Care Organizations
Common clinical vocabularies can impact organizations in many ways. For example, they can create a link between industry-wide and organization-specific vocabularies [1]. Furthermore, they facilitate interoperability, because organizations can exchange comparable data with each other.

Integrating global and specific vocabularies allows Electronic Health Records (EHR) to be cross-referenced to standards that everyone can understand. Consequently, a health care organization could save time, money, and resources [1]. Finally, common vocabularies reduce the opportunities for misinterpreted, inaccurate, or imprecise data and for human error in a patient’s record. As a result, quality in the organization increases.

Vocabularies in Use
The following section discusses different vocabularies and classification standards.

International Classification of Diseases (ICD)
The ICD is a set of classifications where one code typically represents a category to which several diseases may be mapped [9]. The classification has its origins in the 1850s [11]. As of July 2010, the latest version is ICD-10 [11].

ICD has gained wide acceptance for coding clinical disorders, especially for hospital billing purposes [3]. It is used internationally as a standard diagnostic classification for epidemiology, health management, and clinical use [11]. Additionally, it includes terms for medical and surgical procedures, occupations, and other factors influencing a patient’s health status. The basic structure of ICD is a strict hierarchy.

Nonetheless, the ICD has several shortcomings: many categories are too broad to be clinically useful, a significant amount of detail is lost when a paper-based record is coded, and it contains many ambiguous and redundant catch-all categories.

Systematized Nomenclature of Medicine (SNOMED)
In comparison with coding or classification systems, SNOMED covers the breadth and depth of health care terminology. Several investigations have confirmed SNOMED as one of the sources with the best overall coverage of clinical content. It uses explicit hierarchies, description-logic concept definitions, and relationships.

Unified Medical Language System (UMLS)
According to [10], UMLS facilitates the development of computer systems that operate as if the system knows the meaning of the language of health and biomedicine.

The UMLS Knowledge Sources (databases) are distributed by the National Library of Medicine (NLM) in the United States. They are created for developers, not for end users. Additionally, the NLM distributes software tools that developers can use to create, process, retrieve, and integrate health data [10].

There are three UMLS Knowledge Sources: the Metathesaurus, the Semantic Network, and the Specialist Lexicon. The Metathesaurus is a very large multilingual vocabulary database that contains information about biomedical and health-related concepts [10]. The Semantic Network is a set of broad categories that provides a categorization of all the concepts represented in the UMLS Metathesaurus. The Specialist Lexicon is under development by the NLM to provide a general English lexicon that includes many biomedical terms.

The Knowledge Sources and the software tools distributed by the NLM are free of charge and accessible over the Internet to any user. However, use of the Metathesaurus requires a license agreement.

RadLex
RadLex is an initiative of the Radiological Society of North America (RSNA). It provides a uniform structure for capturing, indexing, and retrieving a variety of radiology information sources (e.g. radiology reports). Rather than “re-inventing the wheel”, RadLex unifies and supplements other lexicons and standards, such as SNOMED and UMLS. RadLex is beneficial for educators, clinical radiologists, and radiology researchers.

RadLex terms are organized into categories, which provide an overall organization for the lexicon and a guide to how imaging information can be used. Examples include treatment, uncertainty, and image quality.

To illustrate how RadLex can benefit radiology educators or researchers, the next subsection presents a case study from [5].

A RadLex Case Study: Clinical Decision Support
A radiologist is interpreting a chest CT showing a tree-in-bud appearance. The radiologist is unsure whether the examination being interpreted truly exhibits this feature, and does not know the diagnostic possibilities that might explain the appearance.

Before RadLex, the radiologist consults textbooks, journal articles, and online sources, and spends a lot of time searching different databases and looking through different results.

With RadLex, the radiologist is able to search for tree-in-bud on the RadLex site. He finds an image that matches the case at hand. A diagnosis of tree-in-bud is displayed, including links to relevant full-text articles from journal websites.

In this case study, RadLex satisfies the needs of a radiologist. If needed, it could also satisfy the needs of software developers and systems vendors.

Digital Imaging and Communications in Medicine (DICOM)

DICOM defines a method of communication for medical imaging systems. It is developed by the National Electrical Manufacturers Association (NEMA) and the American College of Radiology (ACR). To facilitate interoperability, it specifies a communication protocol and the semantics of commands, but it does not prescribe any implementation details.

The goals of DICOM include obtaining images together with all of the information associated with a patient, achieving compatibility, and improving workflow efficiency between imaging systems and other information systems in health care environments worldwide.

Why DICOM?
The first reason is that it provides a single identification of images. A radiology department produces thousands of images per day. If images are stored in JPEG or GIF format, they can lose their associated demographic data. DICOM therefore attaches information (such as the name of the patient, type of examination, hospital, date of examination, and type of acquisition) to each image produced. Thus each image is self-contained: if an image is misplaced, it is always possible to formally identify its origin, the patient, the date, and so on.

Each image has four unique identifiers: the service-object pair (SOP) class, the study authority, the series authority, and the image UID. The service-object pair class identifies the type of service for which the image is intended. The study authority identifies a whole examination, in time and place. The series authority identifies a series of images within the examination. Finally, the image UID identifies the image associated with the file. In the DICOM standard, these correspond to the SOP Class UID, Study Instance UID, Series Instance UID, and SOP Instance UID attributes.

The second reason is that it uses a common vocabulary. DICOM uses SNOMED to universally identify the data from machine to machine.

The third reason is that the format is used by different medical specialties. DICOM is used in radiology, cardiology, radiotherapy, and many others.

DICOM File Format
The DICOM file format consists of a header followed by the image data [7]. The header stores information such as the patient’s name, the type of scan, and the image dimensions. The image data can contain information in three dimensions, and it can be compressed to reduce the file size.

A DICOM file begins with a 128-byte preamble followed by the letters 'D', 'I', 'C', 'M'. This is followed by the header information, which is organized in groups [7]. The size of the header is not fixed: in the example described in [7], the first 794 bytes are used for the DICOM format header. These bytes describe the image dimensions and retain other text information about the scan. The image data follows the header. Which DICOM elements are required depends on the image type; if a required element is missing, the file violates the DICOM standard.
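The preamble rule gives a quick way to test whether a file looks like DICOM at all. The following minimal sketch checks only for the 128-byte preamble followed by the letters DICM; it does not parse the header groups or validate required elements. The function name `has_dicom_prefix` is made up for this example, and the byte strings are synthetic, not real scans.

```python
import io

PREAMBLE_LENGTH = 128  # bytes before the marker
MAGIC = b"DICM"        # the four letters that follow the preamble


def has_dicom_prefix(stream):
    """Return True if the stream starts with a 128-byte preamble and b'DICM'."""
    head = stream.read(PREAMBLE_LENGTH + len(MAGIC))
    if len(head) < PREAMBLE_LENGTH + len(MAGIC):
        return False  # too short to contain the preamble and marker
    return head[PREAMBLE_LENGTH:] == MAGIC


# Synthetic data for illustration: an empty preamble plus the marker,
# and a buffer that starts like a JPEG file instead.
looks_like_dicom = io.BytesIO(bytes(PREAMBLE_LENGTH) + MAGIC + bytes(16))
looks_like_jpeg = io.BytesIO(b"\xff\xd8\xff\xe0" + bytes(200))

print(has_dicom_prefix(looks_like_dicom))
print(has_dicom_prefix(looks_like_jpeg))
```

For a file on disk, the same function can be called on a handle opened in binary mode, for example `open(path, "rb")`.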

How to Handle Change in Common Clinical Vocabularies
Clinical vocabularies and medical knowledge will grow. Evolution is necessary and inevitable. Changes in common clinical vocabularies have several advantages and disadvantages.

Some advantages include addition, refinement, removal of redundancies, and the updating of obsolete terms. Addition is required by the evolution of the discipline of medicine, since new knowledge often requires new terms. Refinement occurs when one or more terms are added to a vocabulary to specify a greater level of detail. Any added code or term that is identical in meaning to an existing term needs to be removed. Finally, new knowledge can render some terms obsolete. Even when a term has fallen out of favor, we cannot remove it from the vocabulary, because a patient could have been diagnosed with that term. Instead, new terms can be added as refinements of the obsolete term.

Some disadvantages include major name changes and changed codes. A major name change is one in which the change of name corresponds to a true change in the term's meaning. There are two scenarios when dealing with major name changes: deletion and addition.

In the deletion scenario, terms may be deleted if the creators no longer wish to include the concept in the domain of the terminology. This can be harmful: if a patient was diagnosed with a disease on a particular date, it would be unacceptable to simply delete the diagnosis because the disease term is no longer part of the vocabulary. In most cases, however, no change is needed. For example, if the laboratory stops performing a particular test, the continued existence of the term in the clinical vocabulary is harmless. Any previous occurrences of the test remain coded in the patient databases and remain interpretable.

In the addition scenario, when the new term represents a truly new concept, the proper response is simply to accept it into the vocabulary and use it when appropriate.

There are different ways to deal with change in clinical vocabularies. One way is to apply automated vocabulary maintenance methods. However, the right method can only be applied when the type of change is well understood. At present, no method can automatically detect the type of change needed for specific scenarios. For example, no method can differentiate between a minor and a major name change. Vocabulary changes usually do not include information regarding the reason for the change. Such information in a structured, machine-readable format might help. Nonetheless, the most efficient way to deal with change is to have domain experts perform manual reviews of the required changes.
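As a rough illustration of what a structured, machine-readable reason for a change might look like, the sketch below attaches a change type and a free-text reason to each change and routes it either to automatic application or to expert review. Everything here is hypothetical: the names `ChangeType`, `ChangeRecord`, and `route`, the example codes, and in particular the choice of which change types are safe to apply automatically are assumptions for illustration, not an established method.

```python
from dataclasses import dataclass
from enum import Enum


class ChangeType(Enum):
    ADDITION = "addition"
    REFINEMENT = "refinement"
    REDUNDANCY = "redundancy"
    OBSOLETE = "obsolete"
    MINOR_NAME_CHANGE = "minor name change"
    MAJOR_NAME_CHANGE = "major name change"
    CODE_CHANGE = "changed code"
    DELETION = "deletion"


@dataclass(frozen=True)
class ChangeRecord:
    code: str                # the term affected by the change
    change_type: ChangeType  # declared by whoever publishes the change
    reason: str              # why the change was made


# Assumption for illustration: only these types are applied without review.
AUTO_APPLY = {ChangeType.ADDITION, ChangeType.MINOR_NAME_CHANGE}


def route(record):
    """Decide whether a change can be applied automatically."""
    if not record.reason.strip():
        return "expert review"  # no stated reason, so a human must look at it
    if record.change_type in AUTO_APPLY:
        return "apply"
    return "expert review"


# Hypothetical change records.
changes = [
    ChangeRecord("T10", ChangeType.ADDITION, "new laboratory test introduced"),
    ChangeRecord("T11", ChangeType.MAJOR_NAME_CHANGE, "meaning was narrowed"),
    ChangeRecord("T12", ChangeType.MINOR_NAME_CHANGE, ""),
]
for change in changes:
    print(change.code, route(change))
```

With the type and reason recorded explicitly, the software no longer has to guess whether a name change is minor or major; anything it cannot classify safely still goes to the domain experts.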

Conclusion

After reviewing the literature for common clinical vocabularies, we can point out many lessons learned such as:

Common clinical vocabularies are an essential piece in the process of moving health care toward automated, computerized systems.

Clinical vocabularies can improve quality and reduce errors in IT systems.

The ideal characteristics of a common clinical vocabulary include concepts with a single meaning, a structured and controlled organization, and the ability to evolve.

Patients will try to educate themselves on clinical terms using the internet. Achieving a consensus on clinical terms is necessary to avoid confusion between patients and doctors.

Until new methods are discovered, manual reviews by domain experts are the best way to deal with change in common clinical vocabularies.

SNOMED is the closest to a well-established common clinical vocabulary.

The potential of common clinical vocabularies will depend on their ability to have an impact on medicine and technology. That will only happen when common clinical vocabularies are used and reused in software, and when independently developed medical records and decision support systems share the same information using the same terminology. If common clinical vocabularies succeed, they will become a routine tool for all the parties involved in health care.

References

[1] 3M Health Information Systems. “Using a Medical Data Dictionary to Comply with Vocabulary Standards and Exchange Clinical Data”. Retrieved June 2010.

[2] Cimino, J; and Clayton, PD. “Coping with Changing Controlled Vocabularies”. In Eighteenth Annual Symposium on Computer Applications in Medical Care. 1994. Washington, DC: Hanley & Belfus, Inc, Philadelphia PA: pp. 135-139.

[3] Cimino, J; and Johnson, Stephen. “Designing an Introspective, Multipurpose, Controlled Medical Vocabulary”. In Proc. 13th Annual Symposium on Computer Applications in Medical Care. L. C. Kingsland (ed.), IEEE Computer Society Press, November 1989, 513-518.

[4] Murray, Charles. “Difference between Patient’s and Doctor’s Interpretation of Some Common Medical Terms”. British Medical Journal. 1970.

[5] Radiological Society of North America. “RadLex: Overview and Examples”. Retrieved June 2010.

[6] Rector, Alan. “Clinical Terminology: Why Is It So Hard?”. Methods of Information in Medicine 38(4):239-252. 1999.

[7] Rorden, Christopher. “The DICOM Standard”. Georgia State University. Retrieved June 2010.

[8] Rose, Jeffrey; Hogan, William; Marshal, Philip; and Kirkley, Debra. “Common Medical Terminology Comes of Age, Part One: Standard Language Improves Healthcare Quality”. Journal of Healthcare Information Management. 2001.

[9] Rose, Jeffrey; Hogan, William; Marshal, Philip; and Kirkley, Debra. “Common Medical Terminology Comes of Age, Part Two: Current Code and Terminology Sets Strengths and Weaknesses”. Journal of Healthcare Information Management. 2001.

[10] United States National Library of Medicine. “Unified Medical Language System Fact Sheet”. Retrieved July 2010.

[11] World Health Organization. “International Classification of Diseases”. Retrieved July 2010.
