
i2b2 Mapped Importer

A Java program that takes input from various sources and loads it into i2b2.


The aim is to allow any type of source that comes in a per-patient format. For example, a CSV file with a row per patient, an XML file with a node per patient, a set of XML files with one file per patient, or a database query with a row per patient.

Multiple rows (or nodes, files, etc.) can be supplied for a single patient.


The data can be output either as an i2b2 PDO XML file or loaded directly into an i2b2 database (currently Microsoft SQL Server only).


The source and output to be used are defined in a YAML file. By default this file is called settings.yaml and lives in the application root directory, but the settings file path can be set using the -s=xxx or --settingsFilePath=xxx command line arguments.
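The lookup described above can be sketched as follows. This is only an illustration of the default-plus-override behaviour, not the importer's actual code: the class name SettingsPathResolver and its method are hypothetical, and how the real program treats repeated or malformed arguments is not documented here (this sketch simply takes the first match).

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper; not taken from the importer's source. */
class SettingsPathResolver {
    static final String DEFAULT_SETTINGS_FILE = "settings.yaml";
    private static final String SHORT_PREFIX = "-s=";
    private static final String LONG_PREFIX = "--settingsFilePath=";

    /** Returns the path given by -s=xxx or --settingsFilePath=xxx, else the default file. */
    static Path resolve(String[] args) {
        for (String arg : args) {
            if (arg.startsWith(SHORT_PREFIX)) {
                return Paths.get(arg.substring(SHORT_PREFIX.length()));
            }
            if (arg.startsWith(LONG_PREFIX)) {
                return Paths.get(arg.substring(LONG_PREFIX.length()));
            }
        }
        // No override supplied: fall back to settings.yaml (relative to the working directory here).
        return Paths.get(DEFAULT_SETTINGS_FILE);
    }

    public static void main(String[] args) {
        System.out.println("Using settings file: " + resolve(args));
    }
}
```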

The settings that can be defined are dataSource and output. Both settings have a field called type that defines the class to be used as the source or output processor. The remaining fields are specific to the type chosen.

Here is an example settings file:

 server: Server Name
 instance: Instance Name
 database: Database Name
 username: Username
 password: Password
 export_directory: data

Available DataSources

There are four data sources at present, with others to be written as the need arises. The available ones are:

  • MsSqlDataSourceSet
  • MySqlDataSourceSet
  • OnyxDataSourceSet
  • CsvDataSourceSet

The exact settings required for each of these are shown on the page i2b2 Mapped Importer Data Sources.

Proposed Additional Data Sources

Other data sources that might be useful are:

  • XML using XPATH
  • Other relational databases
  • JSON

Entity Mapping

Mappings are used to specify which field from the source gets used for which field in the output. These are specified in a YAML file. By default this file is called mapping.yaml in the application root directory, but the mapping file path can be defined using the -m=xxx or --mappingFilePath=xxx command line arguments.

Mappings can be defined for the following entities:

  • Patients, including mappings
  • Events / Visits, including mappings
  • Observations

Currently mappings are not available for other i2b2 entities such as concepts and observers.

Data Fields

The following data fields are available for mapping entities:

  • BooleanTrueField
  • DataSourceConditionalExistenceField
  • DataSourceConditionalExistenceIfNotNullField
  • DataSourceConditionalExistenceIfTrueField
  • DataSourceConditionalExistenceIfValueOneOfField
  • DataSourceEnumerationField
  • DataSourceNumericField
  • DataSourceTextField
  • LiteralTextField
  • DataSourceDateField
  • YearOnlyDateField

Observation Fields

These fields can be used to create observations. An observation will only be created if a non-NULL value is returned.

Some observation fields create an observation with a text value, others a numeric value, and others an enumeration value (actually just a text value). Other fields create an observation without a value of any kind. These are called Existence fields.

Date Fields

Most dates in entities are defined as a list of date fields instead of a single date field. If the first date in the list does not yield a date, the next date field in the list is checked and so on. Only if all dates are undefined will the date be set to NULL. In some cases this will cause an error.
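The fallback through the list can be pictured with a minimal sketch. The class DateFallback and the use of Supplier and LocalDate are assumptions made for illustration; the importer's own field classes and date type may differ.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical illustration of "try each date field in order". */
class DateFallback {
    /** Returns the first non-null date produced by the candidates, or null if none yields one. */
    static LocalDate firstAvailable(List<Supplier<LocalDate>> candidates) {
        for (Supplier<LocalDate> candidate : candidates) {
            LocalDate date = candidate.get();
            if (date != null) {
                return date; // later fields are not consulted once a date is found
            }
        }
        return null; // every date field was undefined
    }

    public static void main(String[] args) {
        List<Supplier<LocalDate>> candidates = Arrays.asList(
            () -> null,                          // first field yields no date
            () -> LocalDate.of(2015, 4, 24));    // second field supplies one
        System.out.println(firstAvailable(candidates)); // prints 2015-04-24
    }
}
```

Whether a NULL result is acceptable depends on the entity being mapped, which is why the caller, not this helper, would decide whether to raise an error.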

Datasource Fields

These fields take their data from the patient row (etc) from the data source. The source attributes of these fields define how the value is located.

Literal Fields

These fields take their values from the mapping definition itself. In these fields the value is either defined by the source attribute or by the field type itself.

The Fields


Always sets the value to true.

  • type:!BooleanTrueField


An Existence observation is created if the field from the data source is not NULL.

  • type:!DataSourceConditionalExistenceIfNotNullField


An Existence observation is created if the field from the data source is true. True values are (case insensitive):

  • TRUE
  • YES
  • 1
  • -1

All other values are false.

  • type:!DataSourceConditionalExistenceIfTrueField
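As a rough sketch of the truth test listed above (TRUE, YES, 1 and -1, compared case insensitively, everything else false): the class TruthyValues is hypothetical, and treating NULL as false and not trimming whitespace are assumptions rather than documented behaviour.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper mirroring the documented list of true values. */
class TruthyValues {
    private static final Set<String> TRUE_VALUES =
        new HashSet<>(Arrays.asList("TRUE", "YES", "1", "-1"));

    /** Case-insensitive membership test; null is treated as false (an assumption). */
    static boolean isTrue(String value) {
        return value != null && TRUE_VALUES.contains(value.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isTrue("Yes")); // prints true
        System.out.println(isTrue("0"));   // prints false
    }
}
```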


An Existence observation is created if the field from the data source matches one of the supplied values. The matching is case insensitive.

  • type:!DataSourceConditionalExistenceIfValueOneOfField
  • create_if_values: List of values to match.
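The case-insensitive match against create_if_values might look like the sketch below. The class ValueOneOf and the sample values are hypothetical; only the create_if_values name and the case-insensitive comparison come from the description above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of matching a source value against create_if_values. */
class ValueOneOf {
    /** Returns true if value equals any entry of createIfValues, ignoring case. */
    static boolean matches(String value, List<String> createIfValues) {
        if (value == null) {
            return false; // assumption: a missing source value never matches
        }
        for (String candidate : createIfValues) {
            if (value.equalsIgnoreCase(candidate)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Sample values invented for this example.
        List<String> createIfValues = Arrays.asList("Current", "Ex-smoker");
        System.out.println(matches("current", createIfValues)); // prints true
        System.out.println(matches("never", createIfValues));   // prints false
    }
}
```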


Date field used for observation start and end dates as well as patient dates of birth, etc.

This field has an optional format mapping parameter. The processing differs depending on whether this parameter is present and what type of data source is being used.

If the parameter is provided the field will ask the data source for a string and attempt to parse it to a date using the provided date format string. If the parameter is missing, the field will ask the data source for a date value.

If a database data source is asked for a date variable it will expect the column returned by the query to have a type of date. If a text based data source, such as an XML or CSV file, is asked for a date, it will attempt to parse the text value using the format yyyy-MM-dd'T'HH:mm:ss.SSSZ.

  • type:!DataSourceDateField
  • format: (optional) Date format string (see SimpleDateFormat). If missing the field will rely on the data source for date parsing.
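For text values, the two parsing paths can be sketched with SimpleDateFormat. The class DateTextParser is hypothetical and simplifies matters by folding both paths into one method: in the importer, the fallback format is applied by text-based data sources, not by the field itself. The sketch also leaves SimpleDateFormat in its default lenient mode, since the source does not say which mode the importer uses.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

/** Hypothetical illustration of parsing a date from source text. */
class DateTextParser {
    /** Format the text-based data sources are described as falling back to. */
    static final String DEFAULT_TEXT_FORMAT = "yyyy-MM-dd'T'HH:mm:ss.SSSZ";

    /** Parses text with the mapping's format if given, otherwise with the default format. */
    static Date parse(String text, String format) throws ParseException {
        String pattern = (format != null) ? format : DEFAULT_TEXT_FORMAT;
        return new SimpleDateFormat(pattern).parse(text);
    }

    public static void main(String[] args) throws ParseException {
        // With an explicit format from the mapping file.
        System.out.println(parse("24/04/2015", "dd/MM/yyyy"));
        // Without a format: the text must match the default pattern.
        System.out.println(parse("2015-04-24T08:17:45.000+0000", null));
    }
}
```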


Creates an enumeration type observation containing the text from the data source.

  • type:!DataSourceEnumerationField


Creates a numeric observation by parsing the text from the data source.

  • type:!DataSourceNumericField


Creates a text observation by copying the text from the data source.

  • type:!DataSourceTextField


Creates a text observation by copying the text from the mapping.

  • type:!LiteralTextField


Creates a date, for use as an observation start or end date etc., from a numeric year field. The date created is 1 January of the year specified.

  • type:!YearOnlyDateField
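In Java terms the conversion amounts to something like the following. The class YearOnlyDate is hypothetical, and LocalDate is used only for illustration; the importer may use a different date type.

```java
import java.time.LocalDate;

/** Hypothetical illustration of turning a year into a full date. */
class YearOnlyDate {
    /** Returns 1 January of the given year. */
    static LocalDate fromYear(int year) {
        return LocalDate.of(year, 1, 1);
    }

    public static void main(String[] args) {
        System.out.println(fromYear(1975)); // prints 1975-01-01
    }
}
```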

Patient Entity

Observation Entity

Event / Visit Entity



Last modified on 04/24/15 08:17:45