Version 50 (modified by 12 years ago) ( diff ) | ,
---|
The PATS mappings are split into text and numeric fields.
Text Fields
Text fields are mapped to the i2b2 Observation Fact table like this
i2b2 Field | Source |
Encounter_Num | i2b2 Patient ID |
Patient_Num | i2b2 PatientID |
Concept_Cd | Enum lookup value (see below) |
Provider_Id | "@" |
Start_Date | The source table observation date field (see below) |
Modifier_Cd | "@" |
ValType_Cd | "T" (text) |
Units_Cd | "@" |
Location_Cd | "@" |
Update_Date | The current date using the GETDATE() Function |
SourceSystem_Cd | "BRICCS" |
Upload_Id | "1" |
SOMETIME (RB): Convert Update_Date to use a date parameter. It's just a little quicker.
SOMETIME (RB): Should the SourceSystem_Cd not be "PATS"? Or maybe "MINAP" or whatever.
SOMETIME (RB): The Upload_Id is presumably always one because everything is reloaded each time. Might be nice to change this to something more meaningful.
Enum Lookup Value
The Concept Code to use is found in the table DATA_DICTIONARY_ENUM_VALUES in the DWBRICCS database on UHLDWH server. This table contains records for concept code keyed on source table name, column name and column value. NULL column values may be mapped to the field '(empty)'.
ACTION (RB): Check if all possible values of column value contain a mapping in the DATA_DICTIONARY_ENUM_VALUES table.
Source Table Observation Date Field
For each source table in the there is a single field that is used as the start date for all columns from that field. The observation date field to use for each table is stored in the pats_tables table in the DWBRICS on the UHLDWH server.
Numeric Fields
Numeric fields are mapped to the i2b2 Observation Fact table like this
i2b2 Field | Source |
Encounter_Num | i2b2 Patient ID |
Patient_Num | i2b2 PatientID |
Concept_Cd | "PTS:{TABLE_NAME}_{COLUMN_ID}" |
Provider_Id | "@" |
Start_Date | The source table observation date field (see above) |
Modifier_Cd | "@" |
ValType_Cd | "N" (numeric) |
TVal_Char | "E" (see below) |
Units_Cd | Value calculated by ufn_get_units function (see below) |
Location_Cd | "@" |
Update_Date | The current date using the GETDATE() Function |
SourceSystem_Cd | "BRICCS" |
Upload_Id | "1" |
TVal_Char
The TVal_Char describes the meaning of the value contained in the NVal_Num field.
For all values currently the value of TVal_Char is set to "E", which means equals. That is, the value in the NVal_Num field is the result of the test. All possible values of TVal_Char are:
Value | Meaning |
E | Equals |
NE | Not Equal |
L | Less Than |
LE | Less Than or Equals |
G | Greater Than |
GE | Greater Than or Equals |
From a casual look at the data, I think that a value of less than X, or greater than Y is sometimes used for Pathology data.
ufn_get_units Function
The function ufn_get_units in the DWBRICCS database is called with parameters of table_name and field_name to determine the units.
ACTION (RB): Work out what this function actually does. The function just selects the units column from the Data_Dictionary table in the DWBRICCS database. If the result is an empty string, it returns 'units' instead. Don't know why it would do that.
ACTION (RB): Rewrite numeric data extract to do the same, but with a join. Or it could be moved into the query that copies all the numeric fields into the recordset for looping with the for each. Either really.
Tables
The PATS i2b2 data integration load maps source table types to either Text or Numeric. All other types are ignored. The mapping is as follows.
Source Type | i2b2 Type |
VARCHAR | Text |
INT or FLOAT | Numeric |
All other types | NOT LOADED |
In addition, all fields with certain names are also ignored.
Ignored Field Names
- PatientId
- EntryId
- PreviousEntry
- NextEntry
Comments about data:
COMMENT (RB): Which fields should be removed?
COMMENT (RB): All date fields are ignored, but could they be converted into numeric fields. For example, length of stay, time between stays, etc. This may already be being done. Nicks says that generally this information is not important for cohorting.
COMMENT (RB): There are quite a few denormalised fields. For example: drug1, drug2 - could these be normalised into one concept code. Or: readmittedDate1, readmittedDate2 - could these be different encounters? Are these already being recorded?
COMMENT (RB): Some dates are recorded as varchars and so presumably are being imported
COMMENT (RB): Are some text fields also being ignored because they don't have entries in the enum data dictionary? Yes they are ignored.
ACTION (RB): Rewrite extracting of varchar fields to check that there are any entries in the ENUM Data Dictionary for that field. This should reduce the number of queries that need to be run in the for each and speed things up quite a bit.
Registries
MINAP
Minap data is contained within the following mapped tables in the data warehouse
- GGHMINAP_GGHMINAP
- GGHMINAPNEW
- GGHMINAPNEW_GGHMINAPINIVW
- LRIMINAPNEW
- LRIMINAPNEW_LRIMINAPNEWINITIALVW
And the following unmapped tables
- MINAPGGH_MINAPGGHINIVW
- MINAPLRI_MINAPLRIINIVW (This table only contains records for 2 patients. The details of both have been copied into the table MINAPGGH_MINAPGGHINIVW.)