wiki:i2b2 Data Import Pathology Procedure v2

Version 5 (modified by Richard Bramley, 11 years ago) ( diff )

--

Back

Version 2 of the i2b2 pathology data integration load will also use SSIS, but will address the following issues with version 1.

  1. Address and document the correct procedure for dealing with duplicates, by getting Paul Smalley to define which items are true duplicates and may be discarded and identifying a means of distinguish between duplicates where both should be loaded into i2b2.
  2. Improve the error handling and reporting.
  3. Move the SSIS packages onto the BRICCS SQL Server so that we can control their running and updating.
  4. Be configurable so that the source and destination databases can be changed more easily.

Implementation

Version 2 of the i2b2 pathology data integration load is written as the SQL Server Integration Services (SSIS) packages i2b2ImportPathologyObservations.dtsx and i2b2BuildPathologyOntoloy.dtsx. The mapping, ontology and logging data is stored in the i2b2ClinDataIntegration database on the UHLSQLBRICCSDB\UHLBRICCSDB server.

i2b2ClinDataIntegration Database

The database contains the following tables used by the two pathology integration packages:

Table Description
PathologyHierarchy The hierarchy nodes used to build the tree structure in the pathology ontology
PathologyCodes The relevant LOINC codes used in the pathology ontology, linked to their place in the hierarchy
PathologyMapping The mapping of LOINC codes to iLab codes. Each LOINC code can be mapped to more than one iLab code, but each iLab Code can only be mapped to one LOINC code.
ETLHistory Log file showing when each job was last run for each destination database.
Failed_Observation_Fact Log file showing Observation_Fact records that could not be loaded into the Observation_Fact table, when they were loaded and what the error message was.

It also contains the following views

View Description
PathologyHierarchyRecursed Recurses the PathologyHierarchy table to produce a tree structure

i2b2BuildPathologyOntoloy SSIS Package

The package contains the following steps:

1 Recreate Ontology in the i2b2 meta database.

1 Delete existing ontology records in the PATHOLOGY table. 1 Insert the new ontology hierarchy records created from the PathologyCodes table and PathologyHierarchyRecursed view in the i2b2ClinDataIntegration database into the PATHOLOGY table of the i2b2 meta database.

1 Recreate the concept codes in the i2b2 data database.

1 Delete existing concept records from the concept_dimension table in i2b2 data database. 1 Isert the new concept records created from the PathologyCodes table and PathologyHierarchyRecursed view in the i2b2ClinDataIntegration database into the concept_dimension table of the i2b2 meta database.

Note: See TracWiki for help on using the wiki.