= i2b2 Mapped Importer

A Java program that takes input from various sources and loads it into i2b2.

== Sources

The aim is to support any type of source that supplies data in a per-patient format.  For example, a CSV file with a row per patient, an XML file with a node per patient, a set of XML files with one file per patient, or a database query with a row per patient.

Multiple rows (or nodes, files, etc.) can be supplied for a single patient.

== Outputs
The data can either be written to an i2b2 PDO XML file or loaded directly into an i2b2 database (currently only Microsoft SQL Server is supported).

== Settings
The source and output to use are defined in a YAML file.  By default this file is called {{{settings.yaml}}} and lives in the application root directory, but the settings file path can be set using the {{{-s=xxx}}} or {{{--settingsFilePath=xxx}}} command line arguments.

The settings that can be defined are {{{dataSource}}} and {{{output}}}.  Both settings have a field called {{{type}}} that names the class to be used as the source or output processor.  The remaining fields are specific to the {{{type}}} specified.

Here is an example settings file:
{{{#!yaml
output:
 type: uk.org.briccs.onyxmappedexport2pdo.output.mssql.MsSqlOutput
 server: Server Name
 instance: Instance Name
 database: Database Name
 username: Username
 password: Password
dataSource:
 type: uk.org.briccs.onyxmappedexport2pdo.onyx.OnyxDataSourceSet
 export_directory: data
}}}

=== Available !DataSources

There are three data sources at present, with others to be written as the need arises. The available ones are:

* !MsSqlDataSourceSet
* !MySqlDataSourceSet
* !OnyxDataSourceSet

==== !MsSqlDataSourceSet

Settings

* {{{type}}}: {{{uk.org.briccs.onyxmappedexport2pdo.input.sql.MsSqlDataSourceSet}}}
* {{{server}}}: ''The name of the SQL Server''
* {{{instance}}}: ''The name of the SQL Server instance''
* {{{database}}}: ''The name of the SQL Server database''
* {{{username}}}: ''The username required to log on''
* {{{password}}}: ''The password for the above user''
* {{{query}}}: ''The query used to extract the patient data''

The query should return a row for each patient, and contain a patient identifier.
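
As an illustration, the settings listed above might be combined into a {{{dataSource}}} entry like the one below. This is only a sketch: the connection values are placeholders, the table and column names in the query are hypothetical, and the {{{server}}} key is written in lower case on the assumption that it matches the output example earlier on this page.

{{{#!yaml
dataSource:
 type: uk.org.briccs.onyxmappedexport2pdo.input.sql.MsSqlDataSourceSet
 server: Server Name     # key case assumed to match the output example
 instance: Instance Name
 database: Database Name
 username: Username
 password: Password
 # Hypothetical query: table and column names are placeholders.
 # Each row carries a patient identifier (patient_id here).
 query: "SELECT patient_id, date_of_birth, gender FROM patients"
}}}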

===== Mapping Source

The {{{source}}} attribute list for mappings (see later) should contain one item: the name of the field from the query.
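
For example, assuming the query returns a column named {{{date_of_birth}}} (a hypothetical name), the {{{source}}} list for a mapping might look like the fragment below. The rest of the mapping entry is omitted because its structure is not described in this section.

{{{#!yaml
# Fragment only: the enclosing mapping entry is omitted.
source:
 - date_of_birth   # hypothetical column name returned by the query
}}}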

==== !MySqlDataSourceSet

Settings

* {{{type}}}: {{{uk.org.briccs.onyxmappedexport2pdo.input.sql.MySqlDataSourceSet}}}
* {{{server}}}: ''The name of the MySQL server''
* {{{database}}}: ''The name of the MySQL database''
* {{{username}}}: ''The username required to log on''
* {{{password}}}: ''The password for the above user''
* {{{query}}}: ''The query used to extract the patient data''

The query should return a row for each patient, and contain a patient identifier.
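
A {{{dataSource}}} entry for MySQL might look like the sketch below. The connection values are placeholders, the table and column names in the query are hypothetical, and the lower-case {{{server}}} key is an assumption based on the output example earlier on this page.

{{{#!yaml
dataSource:
 type: uk.org.briccs.onyxmappedexport2pdo.input.sql.MySqlDataSourceSet
 server: Server Name     # key case assumed to match the output example
 database: Database Name
 username: Username
 password: Password
 # Hypothetical query: table and column names are placeholders.
 query: "SELECT patient_id, date_of_birth, gender FROM patients"
}}}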

===== Mapping Source

The {{{source}}} attribute list for mappings (see later) should contain one item: the name of the field from the query.

==== !OnyxDataSourceSet

Settings

* {{{type}}}: {{{uk.org.briccs.onyxmappedexport2pdo.onyx.OnyxDataSourceSet}}}
* {{{export_directory}}}: ''The directory where the exported questionnaires are stored''

This is a data source written to import questionnaire data exported from the Onyx questionnaire tool.

The tool exports a set of questionnaires into a series of sub-directories, each directory containing the results for all patients for one questionnaire. Each questionnaire folder contains a file called {{{Entities.xml}}} that maps patients to numbered XML files within the same directory (for example {{{00000001.xml}}}).

===== Mapping Source

The {{{source}}} attribute list for mappings (see later) should contain two items.  The first is the name of the questionnaire folder; the second is the name of the variable element within the patient's numbered XML file.
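
As a rough illustration, a two-item {{{source}}} list for an Onyx export might look like the fragment below. Both names are hypothetical, and the rest of the mapping entry is omitted because its structure is not described in this section.

{{{#!yaml
# Fragment only: the enclosing mapping entry is omitted.
source:
 - HealthQuestionnaire   # hypothetical questionnaire folder name
 - smoking_status        # hypothetical variable element in the patient's XML file
}}}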

==== Proposed Additional Data Sources

Other data sources that might be useful are:

* CSV
* XML using XPATH
* Other relational databases
* JSON

== Mapping

Mappings are used to specify which field from the source gets used for which field in the output.  These are specified in a YAML file.  By default this file is called mapping.yaml in the application root directory, but the mapping file path can be defined using the {{{-m=xxx}}} or {{{--mappingFilePath=xxx}}} command line arguments.

Mappings can be defined for:

* Patients, including patient mappings
* Events / Visits, including event mappings
* Observations

Currently mappings are not available for other i2b2 types such as concepts and observers.

=== Observation Mappings

==== Data Fields
The types available for observation mappings are:

* !BooleanTrueField
* !DataSourceConditionalExistenceField
* !DataSourceConditionalExistenceIfNotNullField
* !DataSourceConditionalExistenceIfTrueField
* !DataSourceConditionalExistenceIfValueOneOfField
* !DataSourceEnumerationField
* !DataSourceNumericField
* !DataSourceTextField
* !LiteralTextField

==== Date Fields
An observation date is specified as an ordered list of candidate date fields.  The first field in the list that is defined is used: if the first field is not defined, the next one in the list is tried, and so on.

The types of date field available are:

* !DataSourceDateField
* !YearOnlyDateField

== Extensions