wiki:i2b2 Database Structure

Version 21 (modified by jeff.lusted, 14 years ago) ( diff )

--

i2b2 - DB Basics

You can get all of the following from the documentation, but here are pertinent things to where we are...

Structure

This is a very bare basics. At this stage we should be primarily interested in the observation_fact table and its relationship with the concept_dimension table within the CRC Cell. And then the correspondence with any ontology tables within the Ontology Cell. Interest in the patient_dimension, visit_dimension and other of the tables within the CRC are a little further down the line.

There can be a number of ontology tables within the Ontology Cell. We know which are ontology tables by an entry in the table_access table.

The QueryTool uses the ontology tables within the Ontology Cell to format queries for submission against the tables within the CRC Cell.

+-------------CRC Cell-------------+   +-----------Ontology Cell---------+
|                                  |   |                                 |
|        +===================+     |   |     +=====================+     |
|        | concept_dimension |<<<=========>>>|   ontology tables   |>:   | 
|        +=========^=========+     |   |     +==========^==========+ :   |
|                  |               |   |                |            :   |
|                  |               |   |                |            :   |
|                  |               |   |                |            :   |
|        +=========^=========+     |   |     +==========^==========+ :   |
|        | observation_fact  |     |   |     |   table_access      | :   | 
|        +===V===========V===+     |   |     +=====================+ :   |
|            |           |         |   |                             :   |
|            |           |         |   +-----------------------------:---+
| +==========V========+  |         |                                 :
| | patient_dimension |  |         |           +------------+        :
| +===================+  |         |<<<<<<<<<<<| QUERY TOOL |<<<<<<<<:
|                        |         |           +------------+  
|             +==========V======+  |
|             | visit_dimension |  |
|             +=================+  |
+----------------------------------+

Ontology Cell: DDL for Table ONYX

-- ORACLE --
  CREATE TABLE ONYX 
   (	"C_HLEVEL"           NUMBER(22,0)   NOT NULL, 
	"C_FULLNAME"         VARCHAR2(700)  NOT NULL, 
	"C_NAME"             VARCHAR2(2000) NOT NULL, 
	"C_SYNONYM_CD"       CHAR(1)	    NOT NULL, 
	"C_VISUALATTRIBUTES" CHAR(3)	    NOT NULL, 
	"C_TOTALNUM"         NUMBER(22,0)   NULL, 
	"C_BASECODE"         VARCHAR2(50)   NULL, 
	"C_METADATAXML"      CLOB	    NULL, 
	"C_FACTTABLECOLUMN"  VARCHAR2(50)   NOT NULL, 
	"C_TABLENAME"        VARCHAR2(50)   NOT NULL, 
	"C_COLUMNNAME"       VARCHAR2(50)   NOT NULL, 
	"C_COLUMNDATATYPE"   VARCHAR2(50)   NOT NULL, 
	"C_OPERATOR"         VARCHAR2(10)   NOT NULL, 
	"C_DIMCODE"          VARCHAR2(700)  NOT NULL, 
	"C_COMMENT"          CLOB	    NULL, 
	"C_TOOLTIP"          VARCHAR2(900)  NULL, 
	"UPDATE_DATE"        DATE	    NOT NULL, 
	"DOWNLOAD_DATE"      DATE	    NULL, 
	"IMPORT_DATE"        DATE	    NULL, 
	"SOURCESYSTEM_CD"    VARCHAR2(50)   NULL, 
	"VALUETYPE_CD"       VARCHAR2(50)   NULL
   ) ;

These are worth highlighting...
C_HLEVEL: hierarchical level in the tree. 0,1,2 etc
C_FULLNAME: full path within the tree.
C_NAME: in the demo system this is the last part of the path within the tree. Don't understand the size implications between C_FULLNAME and C_NAME.
C_SYNONYM_CD: Y or N regarding a synonym
C_VISUALATTRIBUTES: eg 'LA' would be an active leaf node...

  • 1st character: F = Folder C = Container M = Multiple L = Leaf
  • 2nd character: A = Active I = Inactive H = Hidden
  • 3rd character: E = editable

C_TOTALNUM: Not used, but suspect can be used in the latest version (for performance reasons?).
C_BASECODE: the term that describes the ontological concept.
C_METADATAXML: an optional field to store extra information about the concept in xml format. It is currently used to describe value metadata associated with a lab finding. For further details, see Documents page.Jeff has points to make about this.
C_FACTTABLECOLUMN: name of the column in the CRC cell observation_fact table that holds the concept code.
C_TABLENAME: name of the lookup table within the CRC cell that holds the concept code and concept path. In the top diagram this would be concept_dimension.
C_COLUMNNAME: the name of the column within the table described by C_TABLENAME that holds the concept path. In the top diagram this would be the column within the concept_dimension table, which in our case is called CONCEPT_PATH. See below.
C_COLUMNDATATYPE: either ‘T’ for text or ‘N’ for numeric and describes the datatype of the concept. Don't understand this fully. Jeff has questions about this.
C_OPERATOR: the SQL operator used in the WHERE clause of the SQL query. It is usually either ‘LIKE’ or ‘=’.
C_DIMCODE: the path that lives in the dimension table and maps to the concept path.

CRC Cell: DDL for Table concept_dimension

-- Oracle -- 
CREATE TABLE CONCEPT_DIMENSION ( 
	CONCEPT_CD          VARCHAR2(50)   NOT NULL,
	CONCEPT_PATH        VARCHAR2(700)  NOT NULL,
	NAME_CHAR           VARCHAR2(2000) NULL,
	CONCEPT_BLOB        CLOB           NULL,
	UPDATE_DATE         DATE           NULL,
	DOWNLOAD_DATE       DATE           NULL,
	IMPORT_DATE         DATE           NULL,
	SOURCESYSTEM_CD     VARCHAR2(50)   NULL,
	UPLOAD_ID           NUMBER(38,0)   NULL,
    CONSTRAINT CONCEPT_DIMENSION_PK PRIMARY KEY(CONCEPT_PATH)
) ;	
Note: See TracWiki for help on using the wiki.