wiki:i2b2 Data Import Pathology Procedure v2

Version 3 (modified by Richard Bramley, 11 years ago)

Version 2 of the i2b2 pathology data integration load will also use SSIS, but will address the following issues with version 1.

  1. Address and document the correct procedure for dealing with duplicates, by getting Paul Smalley to define which items are true duplicates and may be discarded, and by identifying a means of distinguishing between duplicates where both should be loaded into i2b2.
  2. Improve the error handling and reporting.
  3. Move the SSIS packages onto the BRICCS SQL Server so that we can control their running and updating.
  4. Be configurable so that the source and destination databases can be changed more easily.

Implementation

Version 2 of the i2b2 pathology data integration load is implemented as two SQL Server Integration Services (SSIS) packages, i2b2ImportPathologyObservations.dtsx and i2b2BuildPathologyOntoloy.dtsx. The mapping, ontology and logging data are stored in the i2b2ClinDataIntegration database on the UHLSQLBRICCSDB\UHLBRICCSDB server.

i2b2ClinDataIntegration Database

The database contains the following tables used by the two pathology integration packages:

Table                    Description
PathologyHierarchy       The hierarchy nodes used to build the tree structure in the pathology ontology.
PathologyCodes           The relevant LOINC codes used in the pathology ontology, linked to their place in the hierarchy.
PathologyMapping         The mapping of LOINC codes to iLab codes. Each LOINC code can be mapped to more than one iLab code, but each iLab code can only be mapped to one LOINC code.
ETLHistory               Log file showing when each job was last run for each destination database.
Failed_Observation_Fact  Log file showing Observation_Fact records that could not be loaded into the Observation_Fact table, when they were loaded and what the error message was.
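
The mapping rule for PathologyMapping (one LOINC code to many iLab codes, but each iLab code to only one LOINC code) could be checked along the following lines. This is only a sketch in Python, not part of the SSIS packages: the function name and the example codes are hypothetical, and the real column names in PathologyMapping are not given here, so the rows are represented as simple (LOINC code, iLab code) pairs.

```python
from collections import defaultdict


def find_conflicting_ilab_codes(mappings):
    """Return the iLab codes that are mapped to more than one LOINC code.

    `mappings` is an iterable of (loinc_code, ilab_code) pairs.
    The result maps each offending iLab code to the sorted list of
    LOINC codes it is mapped to. An empty dict means the rule holds.
    """
    loinc_by_ilab = defaultdict(set)
    for loinc_code, ilab_code in mappings:
        loinc_by_ilab[ilab_code].add(loinc_code)
    return {
        ilab_code: sorted(loinc_codes)
        for ilab_code, loinc_codes in loinc_by_ilab.items()
        if len(loinc_codes) > 1
    }


# Made-up codes for illustration only; these are not real LOINC or iLab codes.
example_mappings = [
    ("LOINC-A", "ILAB-1"),
    ("LOINC-A", "ILAB-2"),  # allowed: one LOINC code, several iLab codes
    ("LOINC-B", "ILAB-3"),
    ("LOINC-B", "ILAB-1"),  # not allowed: ILAB-1 is already mapped to LOINC-A
]

conflicts = find_conflicting_ilab_codes(example_mappings)
print(conflicts)
```

A check like this could be run against an extract of the mapping table before a load, so that an ambiguous iLab code is reported rather than silently mapped to the wrong LOINC code.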