python-hl7 - Easy HL7 v2.x Parsing

python-hl7 is a simple library for parsing messages of Health Level 7 (HL7) version 2.x into Python objects. python-hl7 includes a simple client that can send HL7 messages to a Minimal Lower Level Protocol (MLLP) server (mllp_send).

HL7 is a communication protocol and message format for health care data. It is the de facto standard for transmitting data between clinical information systems and between clinical devices. The version 2.x series, which is typically a pipe-delimited format, is currently the most widely accepted version of HL7 (there is an alternative XML-based format).

python-hl7 currently only parses HL7 version 2.x messages into an easy-to-access data structure. The library could eventually also gain the ability to create HL7 v2.x messages.

python-hl7 parses HL7 into a series of wrapped hl7.Container objects. There are specific subclasses of hl7.Container depending on the part of the HL7 message. hl7.Container itself is a subclass of a Python list, so we can easily access the HL7 message as an n-dimensional list. Specifically, the subclasses of hl7.Container, in order, are hl7.Message, hl7.Segment, hl7.Field, hl7.Repetition, and hl7.Component.


0.3.0 breaks backwards compatibility by correcting the indexing of the MSH segment and by introducing improved parsing down to the repetition and sub-component level.

Result Tree

HL7 messages have a limited number of levels. The top level is a Message. A Message is composed of a number of Segments (hl7.Segment), and each Segment is composed of a number of Fields (hl7.Field). Fields can repeat (hl7.Repetition). The content of a field is either a primitive data type (such as a string) or a composite data type composed of one or more Components (hl7.Component). Components are in turn composed of Sub-Components (primitive data types).
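As a rough picture of how these levels nest, the sketch below splits a single segment using plain string operations. It is not python-hl7's parser: split_segment is a hypothetical helper, it assumes the default HL7 separators (| ~ ^ &), it ignores escape sequences and the special handling that the MSH segment needs, and it always emits every level.

```python
def split_segment(segment, field_sep="|", rep_sep="~", comp_sep="^", sub_sep="&"):
    """Hypothetical helper: split one non-MSH segment into nested lists.

    Levels: field -> repetition -> component -> sub-component.
    """
    return [
        [
            [component.split(sub_sep) for component in repetition.split(comp_sep)]
            for repetition in field.split(rep_sep)
        ]
        for field in segment.split(field_sep)
    ]


tree = split_segment("OBX|1|SN|1554-5^GLUCOSE||^182|mg/dl")
glucose = tree[3][0][1][0]  # field 3, repetition 0, component 1, sub-component 0
value = tree[5][0][1][0]    # field 5, repetition 0, component 1, sub-component 0
print(glucose, value)
```

Unlike this sketch, python-hl7 wraps each level in its own container class and only creates the levels that actually occur in the message.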

The result of parsing is accessed as a tree using Python list conventions:

Message[segment][field][repetition][component][sub-component]


The result can also be accessed using HL7 1-based indexing conventions by treating each element as a callable:

Message(segment)(field)(repetition)(component)(sub-component)



As an example, let’s create an HL7 message:

>>> message = 'MSH|^~\&|GHH LAB|ELAB-3|GHH OE|BLDG4|200202150930||ORU^R01|CNTRL-3456|P|2.4\r'
>>> message += 'PID|||555-44-4444||EVERYWOMAN^EVE^E^^^^L|JONES|196203520|F|||153 FERNWOOD DR.^^STATESVILLE^OH^35292||(206)3345232|(206)752-121||||AC555444444||67-A4335^OH^20030520\r'
>>> message += 'OBR|1|845439^GHH OE|1045813^GHH LAB|1554-5^GLUCOSE|||200202150730||||||||555-55-5555^PRIMARY^PATRICIA P^^^^MD^^LEVEL SEVEN HEALTHCARE, INC.|||||||||F||||||444-44-4444^HIPPOCRATES^HOWARD H^^^^MD\r'
>>> message += 'OBX|1|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^182|mg/dl|70_105|H|||F'

We call hl7.parse() with the message string:

>>> import hl7
>>> h = hl7.parse(message)

We get an hl7.Message object, wrapping a series of hl7.Segment objects:

>>> type(h)
<class 'hl7.containers.Message'>

We can always get the HL7 message back:

>>> unicode(h) == message
True

Interestingly, hl7.Message can be accessed as a list:

>>> isinstance(h, list)
True

There are 4 segments (MSH, PID, OBR, OBX):

>>> len(h)
4

We can extract the hl7.Segment from the hl7.Message instance:

>>> h[3]
[[u'OBX'], [u'1'], [u'SN'], [[[u'1554-5'], [u'GLUCOSE'], [u'POST 12H CFST:MCNC:PT:SER/PLAS:QN']]], [u''], [[[u''], [u'182']]], [u'mg/dl'], [u'70_105'], [u'H'], [u''], [u''], [u'F']]
>>> h[3] is h(4)
True

Note that because the first element of a segment is the segment name, fields within a segment are effectively 1-based in Python as well (the HL7 spec does not count the segment name as part of the segment itself):

>>> h[3][0]
[u'OBX']
>>> h[3][1]
[u'1']
>>> h[3][2]
[u'SN']
>>> h(4)(2)
[u'SN']
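
The relationship between the two access styles can be pictured with a small list subclass. This is only a sketch under stated assumptions: OneBased and NamedSegment are hypothetical classes written for illustration, not python-hl7's containers. It shows why the callable index is shifted by one everywhere except inside a segment, where element 0 already holds the segment name.

```python
class OneBased(list):
    """Hypothetical container: obj[n] is 0-based, obj(n) is 1-based."""

    def __call__(self, index):
        return self[index - 1]


class NamedSegment(OneBased):
    """Element 0 is the segment name, so obj(n) and obj[n] refer to the same field."""

    def __call__(self, index):
        return self[index]


msh = NamedSegment([OneBased(["MSH"])])
obx = NamedSegment([OneBased(["OBX"]), OneBased(["1"]), OneBased(["SN"])])
msg = OneBased([msh, obx])

second_segment = msg(2)     # same object as msg[1]
field_two = msg(2)(2)       # same object as msg[1][2]
first_value = msg(2)(2)(1)  # same object as msg[1][2][0]
print(field_two, first_value)
```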

We can easily reconstitute this segment as HL7, using the appropriate separators:

>>> unicode(h[3])
u'OBX|1|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^182|mg/dl|70_105|H|||F'
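
Reconstituting a segment amounts to joining each level with its own separator. The sketch below assumes the default separators and a tree in which every level is present; join_segment is a hypothetical helper for illustration, not the library's implementation.

```python
def join_segment(fields, field_sep="|", rep_sep="~", comp_sep="^", sub_sep="&"):
    """Hypothetical inverse of a full split: nested lists back to one segment string."""
    return field_sep.join(
        rep_sep.join(
            comp_sep.join(sub_sep.join(component) for component in repetition)
            for repetition in field
        )
        for field in fields
    )


# Every level is spelled out: field -> repetition -> component -> sub-component.
fields = [
    [[["OBX"]]],
    [[["1"]]],
    [[["SN"]]],
    [[["1554-5"], ["GLUCOSE"]]],
    [[[""]]],
    [[[""], ["182"]]],
]
segment_text = join_segment(fields)
print(segment_text)
```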

We can extract individual elements of the message:

>>> h[3][3][0][1][0]
u'GLUCOSE'
>>> h[3][3][0][1][0] is h(4)(3)(1)(2)(1)
True
>>> h[3][5][0][1][0]
u'182'
>>> h[3][5][0][1][0] is h(4)(5)(1)(2)(1)
True

We can look up segments by the segment identifier, either via hl7.Message.segments() or via the traditional dictionary syntax:

>>> h.segments('OBX')[0][3][0][1][0]
u'GLUCOSE'
>>> h['OBX'][0][3][0][1][0]
u'GLUCOSE'
>>> h['OBX'][0][3][0][1][0] is h['OBX'](1)(3)(1)(2)(1)
True

Since many types of segments only have a single instance in a message (e.g. PID or MSH), hl7.Message.segment() provides a convenience wrapper around hl7.Message.segments() that returns the first matching hl7.Segment:

>>> h.segment('PID')[3][0]
u'555-44-4444'
>>> h.segment('PID')[3][0] is h.segment('PID')(3)(1)
True
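
The idea behind looking up segments by identifier can be sketched over plain lists. find_segments and find_segment are hypothetical helpers, and raising KeyError when nothing matches is an assumption of this sketch rather than documented library behavior.

```python
def find_segments(segments, segment_id):
    """Hypothetical helper: every segment whose first field equals segment_id."""
    return [segment for segment in segments if segment[0] == segment_id]


def find_segment(segments, segment_id):
    """First matching segment; raises KeyError when there is none (an assumption)."""
    matches = find_segments(segments, segment_id)
    if not matches:
        raise KeyError(segment_id)
    return matches[0]


# Segments are separated by carriage returns, fields by pipes.
raw = "MSH|^~\\&|GHH LAB\rPID|||555-44-4444\rOBX|1|SN\rOBX|2|SN"
segments = [line.split("|") for line in raw.split("\r")]

obx_count = len(find_segments(segments, "OBX"))
patient_id = find_segment(segments, "PID")[3]
print(obx_count, patient_id)
```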

The result of parsing contains up to 5 levels. The last level is a non-container type.

>>> type(h)
<class 'hl7.containers.Message'>

>>> type(h[3])
<class 'hl7.containers.Segment'>

>>> type(h[3][3])
<class 'hl7.containers.Field'>

>>> type(h[3][3][0])
<class 'hl7.containers.Repetition'>

>>> type(h[3][3][0][1])
<class 'hl7.containers.Component'>

>>> type(h[3][3][0][1][0])
<type 'unicode'>

The parser only generates the levels which are present in the message.

>>> type(h[3][1])
<class 'hl7.containers.Field'>

>>> type(h[3][1][0])
<type 'unicode'>

MLLP network client - mllp_send

python-hl7 features a simple network client, mllp_send, which reads HL7 messages from a file or sys.stdin and posts them to an MLLP server. mllp_send is a command-line wrapper around hl7.client.MLLPClient. mllp_send is a useful tool for testing HL7 interfaces or resending logged messages:

mllp_send --file sample.hl7 --port 6661

See mllp_send - MLLP network client for examples and usage instructions.

For receiving HL7 messages using the Minimal Lower Level Protocol (MLLP), take a look at the related twisted-hl7 package. If you do not want to use Twisted and are looking to rewrite some of twisted-hl7’s functionality, please reach out to us. It is likely that some of the MLLP parsing and formatting can be moved into python-hl7, which twisted-hl7 and other libraries can then depend upon.
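
For readers new to MLLP, the framing itself is small: a message is wrapped in a start-block byte (0x0B) and terminated by an end-block byte (0x1C) followed by a carriage return (0x0D). The sketch below shows only that framing, using hypothetical frame and unframe helpers. It is not the code in hl7.client.MLLPClient or twisted-hl7, and it does not open a network connection.

```python
START_BLOCK = b"\x0b"      # <VT>, marks the start of a message
END_BLOCK = b"\x1c"        # <FS>, marks the end of a message
CARRIAGE_RETURN = b"\x0d"  # <CR>, follows the end-block byte


def frame(payload):
    """Wrap an already-encoded HL7 message in MLLP framing bytes."""
    return START_BLOCK + payload + END_BLOCK + CARRIAGE_RETURN


def unframe(data):
    """Strip MLLP framing from exactly one complete frame."""
    if not (data.startswith(START_BLOCK) and data.endswith(END_BLOCK + CARRIAGE_RETURN)):
        raise ValueError("not a complete MLLP frame")
    return data[1:-2]


payload = "MSH|^~\\&|GHH LAB|ELAB-3\rPID|||555-44-4444\r".encode("utf-8")
framed = frame(payload)
print(repr(framed[:1]), repr(framed[-2:]))
```

A real receiver also has to buffer partial reads and handle several frames arriving in one stream, which this sketch leaves out.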

Python 2 vs Python 3 and Unicode vs Byte strings

python-hl7 supports both Python 2.6+ and Python 3.3+. The library primarily deals in unicode (the str type in Python 3).

Passing a byte string to hl7.parse() requires setting the encoding parameter if the data uses anything other than UTF-8. hl7.parse() will always return a data structure containing unicode.

hl7.Message can be forced back into a string using unicode(message) in Python 2 and str(message) in Python 3.
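
To see why the encoding matters, the Python 3 sketch below uses only the standard library. The patient name is made up for illustration, and Latin-1 stands in for whatever encoding a sending system might use. Bytes that are valid in one encoding can fail to decode as UTF-8, which is why a byte string in another encoding needs the encoding named explicitly.

```python
# Python 3 sketch; the patient name is made up for illustration.
text = "PID|||555-44-4444||M\u00dcLLER^EVE"
raw = text.encode("latin-1")  # byte string as it might arrive from a non-UTF-8 system

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False  # byte 0xDC followed by "L" is not valid UTF-8

decoded = raw.decode("latin-1")  # the right codec restores the original text
print(utf8_ok, decoded == text)
```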

The mllp_send network client assumes the stream is already in the correct encoding.

hl7.client.MLLPClient, if given a unicode string or hl7.Message instance, will use its encoding method to encode the unicode data to a byte string.


python-hl7 is available on PyPI via pip or easy_install:

pip install -U hl7

For recent versions of Debian and Ubuntu, the python-hl7 package is available:

sudo apt-get install python-hl7