i2b2 Ontology c_metadataxml Column: MetadataXML-documentation.txt

File MetadataXML-documentation.txt, 14.9 KB (added by jeff.lusted, 13 years ago)

XML Specification for Descriptive Files of Values

Version 3.02

In the Observation_fact table of the Clinical Research Chart of i2b2, the following columns are used to describe values associated with concepts:

ValueType_CD (Varchar(50))
NVal_num (number)
TVal_char (Varchar(255) – standard, sized per implementation)
ValueFlag_CD (Varchar(50))
Quantity_CD (number)
Units_CD (Varchar(50))
Observation_blob (blob)

Values that are expected to occur in these columns are specified in a column of the Ontology Cell named “Metadata_XML”. Modifiers often have values associated with them. In that case the above columns are filled, and the Metadata_XML is specified, in the same way as for a concept_cd, but the value applies to the modifier rather than to the concept_cd. When there is a value in these columns and a modifier is present in the modifier_cd column, the value ALWAYS applies to the modifier_cd coded element. If the value should apply to the base concept_cd, the modifier_cd column must be set to “@”.

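The rule for deciding which code a value belongs to can be sketched in a few lines of Python. This is only an illustration of the “@” convention described above; the function name and the concept code MED:12345 are hypothetical and are not part of i2b2.

```python
def value_applies_to(concept_cd: str, modifier_cd: str) -> str:
    """Return the code that the value columns of a fact describe."""
    # "@" in modifier_cd means "no modifier": the value belongs to the concept.
    if modifier_cd == "@":
        return concept_cd
    # Otherwise the value always belongs to the modifier.
    return modifier_cd


# Hypothetical fact rows: (concept_cd, modifier_cd)
print(value_applies_to("MED:12345", "@"))
print(value_applies_to("MED:12345", "MED:DOSE"))
```
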
Values are defined according to the ValueType_CD column, which can be construed as an identifier for the type of value object that fills out the rest of the above columns. Two identifiers are used for medication modifiers:

N = numeric value
T = enumerated or string value

Numeric values are normalized in one of two ways. In versions through 1.5, values are assumed to be expressed in a single “Normal” set of units, specified in the Metadata_XML file as <NormalUnits>. This implies that all values are normalized to those units during loading, and any value in the Units_CD column is ignored. In versions 1.6 RC4 and above, the Units_CD column is used to specify one of several possible units for the value. Conversions occur between units as long as they are specified in the Metadata_XML file. Be aware, however, that queries may run considerably slower when values are specified in this mode.

Quantity is part of the value specification, and has units that are in the denominator of the unit string. If the value of quantity is null, it is assumed to be 1.

The required tags are Version, CreationDateTime, TestID, DataType, and Flagstouse. All tags except the data in the Analysis tag are expected to persist from one build to the next. Note that the Metadata_XML file can be used during data loading to provide data type and normalization information for the values being loaded (especially important when this information is not specified by the source).

Version will contain a real number with the XML specification version. This is version “3.02”.

CreationDateTime is the date and time the file was created, in the format MM/DD/YY HH:MM:SS, using 24-hour time.

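A loader that reads CreationDateTime might parse it as sketched below. The specification names a two-digit year, while the examples later in this document write the year with four digits, so this sketch tries both. The function name is hypothetical, and accepting both forms is an assumption rather than documented i2b2 behavior.

```python
from datetime import datetime

# Two-digit year as specified, then four-digit year as used in the examples.
_FORMATS = ("%m/%d/%y %H:%M:%S", "%m/%d/%Y %H:%M:%S")


def parse_creation_datetime(text: str) -> datetime:
    """Parse a CreationDateTime value in MM/DD/YY or MM/DD/YYYY form."""
    cleaned = text.strip()
    for fmt in _FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized CreationDateTime: {text!r}")


print(parse_creation_datetime("01/26/2011 00:00:00").isoformat())
```
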
TestID is the code (concept_cd or modifier_cd) associated with the file.

CodeType is equivalent to the “semantic type” and will be fully specified at a later date.

TestName is the literal name of the test. It is provided for the convenience of the analyst.

DataType will contain the code for the kind of data to expect for this test. Possible values are:

 PosInteger – domain of all positive integers
 Integer – domain of all integers
 PosFloat – domain of all positive real numbers
 Float – domain of all real numbers
 Enum – domain of enumerated values
 String – domain of free text, NOT enumerated text values, which belong to the Enum data type

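During loading, a raw value could be checked against the numeric DataType domains roughly as follows. This is a minimal sketch; the function name is hypothetical, and treating “positive” as strictly greater than zero is an assumption, since the specification does not say whether zero is allowed.

```python
import math

NUMERIC_TYPES = {"PosInteger", "Integer", "PosFloat", "Float"}


def fits_numeric_datatype(raw: str, datatype: str) -> bool:
    """Return True when raw parses as a number inside the DataType's domain."""
    if datatype not in NUMERIC_TYPES:
        raise ValueError(f"not a numeric DataType: {datatype}")
    try:
        number = float(raw)
    except ValueError:
        return False
    if not math.isfinite(number):
        return False
    # Integer domains reject values with a fractional part.
    if datatype in ("PosInteger", "Integer") and not number.is_integer():
        return False
    # Assumption: "positive" excludes zero.
    if datatype in ("PosInteger", "PosFloat") and number <= 0:
        return False
    return True
```

Enum and String values are not covered here; they are constrained by EnumValues and MaxStringLength instead.
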
Flagstouse is a string of concatenated flags that are valid for this TestID. For example, for most PosIntegers it would be “LNH”. Acceptable values are L (low), N (normal), H (high), A (abnormal), and T (toxic).

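Splitting a Flagstouse string into its individual flags is straightforward. The sketch below uses a hypothetical helper name; rejecting unknown letters and dropping repeated flags are assumptions, not rules stated by the specification.

```python
FLAG_MEANINGS = {
    "L": "low",
    "N": "normal",
    "H": "high",
    "A": "abnormal",
    "T": "toxic",
}


def parse_flags(flagstouse: str) -> list:
    """Split a Flagstouse string such as "LNH" into a list of unique flags."""
    flags = []
    for letter in flagstouse.strip():
        if letter not in FLAG_MEANINGS:
            raise ValueError(f"unknown flag: {letter!r}")
        if letter not in flags:
            flags.append(letter)
    return flags


print([FLAG_MEANINGS[flag] for flag in parse_flags("LNH")])
```
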
Oktousevalues will contain a “Y” or a message indicating why it is not OK to use values. If it is empty, values cannot be used and the user may only specify values using flags.

UnitValues is the parent tag of a set of possibly repeating tags. It contains data when the datatype is PosInteger, Integer, Float, or PosFloat. All units are always LOWER CASE.

NormalUnits can exist only once. It contains a string that a user would recognize, representing the units of the value as stored in the data warehouse.

EqualUnits can repeat. It contains other strings that are numerically equal to the NormalUnits string.

ConvertingUnits can repeat. It contains another units string, Units, and a factor, MultiplyingFactor, such that values in these units must be multiplied by the MultiplyingFactor to convert them into values in NormalUnits. For example, if NormalUnits were feet and ConvertingUnits were yards, the MultiplyingFactor would be 3.

ExcludingUnits can repeat. It contains units that cause a test with these units to be excluded from the query (and, in versions 1.5 and prior, units that should not be included in the data load). These are units that cannot be converted to NormalUnits with a simple multiplier. Such concepts will need a new code or, if grouped, will need to go into their own group in order to be queried by value.

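The four unit tags together define a small conversion table. The sketch below reads a UnitValues fragment (adapted from the DOSE example in this document) with Python's standard xml.etree.ElementTree and converts a value to NormalUnits. The function names are hypothetical, and returning None for excluded units is an assumption about how a loader might signal “skip this value”.

```python
import xml.etree.ElementTree as ET

# Fragment adapted from the DOSE example in this document.
SAMPLE_UNIT_VALUES = """
<UnitValues>
  <NormalUnits>mg/dose</NormalUnits>
  <EqualUnits>mg/tablet</EqualUnits>
  <ExcludingUnits>iu</ExcludingUnits>
  <ConvertingUnits>
    <Units>gm/tablet</Units>
    <MultiplyingFactor>1000</MultiplyingFactor>
  </ConvertingUnits>
  <ConvertingUnits>
    <Units>mcg/tablet</Units>
    <MultiplyingFactor>0.001</MultiplyingFactor>
  </ConvertingUnits>
</UnitValues>
"""


def _clean(text):
    # Units are always lower case; also trim stray whitespace.
    return (text or "").strip().lower()


def load_unit_rules(unit_values_xml):
    """Return (factors, excluded) built from a UnitValues element."""
    root = ET.fromstring(unit_values_xml.strip())
    factors = {_clean(root.findtext("NormalUnits")): 1.0}
    for element in root.findall("EqualUnits"):
        factors[_clean(element.text)] = 1.0
    for element in root.findall("ConvertingUnits"):
        units = _clean(element.findtext("Units"))
        factors[units] = float(element.findtext("MultiplyingFactor"))
    excluded = {_clean(e.text) for e in root.findall("ExcludingUnits")}
    return factors, excluded


def to_normal_units(value, units, factors, excluded):
    """Convert value to NormalUnits; None means the units are excluded."""
    key = _clean(units)
    if key in excluded:
        return None
    if key not in factors:
        raise KeyError(f"units not listed in UnitValues: {units!r}")
    return value * factors[key]


factors, excluded = load_unit_rules(SAMPLE_UNIT_VALUES)
print(to_normal_units(2, "gm/tablet", factors, excluded))
```
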
EnumValues is used to specify the list of acceptable enumerated values, each enclosed in the Val tag. Enumerated values that indicate an invalid test result (for the Enum datatype) can be enclosed in the ExcludingVal tag. The “description” attribute allows a human-readable value to be presented for the enumerated value in choice boxes of user interfaces. ExcludingVal is directed to loading processes, and specifies values not to load into the database or display in a user interface (for example, “pending”).

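A user interface or loader could read an EnumValues element as sketched below. The sample mixes values from the examples in this document with the “pending” value mentioned above. The function name is hypothetical, and falling back to the code itself when the description is empty is an assumption.

```python
import xml.etree.ElementTree as ET

SAMPLE_ENUM_VALUES = """
<EnumValues>
  <Val description="Twice per day">BID</Val>
  <Val description="">inhalation</Val>
  <ExcludingVal description="">pending</ExcludingVal>
</EnumValues>
"""


def load_enum_values(enum_values_xml):
    """Return (labels, excluded): display labels by code, and codes to skip."""
    root = ET.fromstring(enum_values_xml.strip())
    labels = {}
    for element in root.findall("Val"):
        code = (element.text or "").strip()
        # Assumption: show the code itself when no description is given.
        labels[code] = element.get("description") or code
    excluded = {(e.text or "").strip() for e in root.findall("ExcludingVal")}
    return labels, excluded


labels, excluded = load_enum_values(SAMPLE_ENUM_VALUES)
print(labels, excluded)
```
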
MaxStringLength will contain a positive integer or 0, representing the longest acceptable string length, if the datatype is String.

LowofLowReference specifies the lowest of the low range values for PosInteger, Integer, Float, and PosFloat datatypes.

HighofLowReference specifies the highest of the low range values for PosInteger, Integer, Float, and PosFloat datatypes.

LowofHighReference specifies the lowest of the high range values for PosInteger, Integer, Float, and PosFloat datatypes.

HighofHighReference specifies the highest of the high range values for PosInteger, Integer, Float, and PosFloat datatypes.

LowofToxicReference specifies the lowest of the toxic range values for PosInteger, Integer, Float, and PosFloat datatypes.

HighofToxicReference specifies the highest of the toxic range values for PosInteger, Integer, Float, and PosFloat datatypes (rarely used).

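One way to read the six reference tags is as three closed ranges (low, high, toxic), with “normal” lying between the top of the low range and the bottom of the high range. The specification does not state how flags are derived from these ranges, so the sketch below is only one plausible interpretation. The function name and the sample numbers are hypothetical, and the toxic range is treated as open-ended when HighofToxicReference is absent.

```python
from typing import Optional


def flag_for_value(value: float, refs: dict) -> Optional[str]:
    """Return "T", "L", "H", or "N", or None when no range matches."""

    def in_range(low_key, high_key, open_top=False):
        low, high = refs.get(low_key), refs.get(high_key)
        if low is None:
            return False
        if high is None:
            # Only the toxic range is allowed to lack an upper bound.
            return open_top and value >= low
        return low <= value <= high

    if in_range("LowofToxicReference", "HighofToxicReference", open_top=True):
        return "T"
    if in_range("LowofLowReference", "HighofLowReference"):
        return "L"
    if in_range("LowofHighReference", "HighofHighReference"):
        return "H"
    top_of_low = refs.get("HighofLowReference")
    bottom_of_high = refs.get("LowofHighReference")
    if top_of_low is not None and bottom_of_high is not None:
        if top_of_low < value < bottom_of_high:
            return "N"
    return None


# Hypothetical reference ranges for illustration only.
SAMPLE_REFS = {
    "LowofLowReference": 0.0,
    "HighofLowReference": 3.4,
    "LowofHighReference": 5.1,
    "HighofHighReference": 7.0,
    "LowofToxicReference": 7.5,
}
print(flag_for_value(4.0, SAMPLE_REFS))
```
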
Analysis contains tags that are used to retain data for the analysis of a ValueMetadata element. However, it is transient and will not be retained from one build to the next. It generally reflects only the current state of the metadata database. The New tag is intended to contain new UnitValues, EnumValues, and the like.

<ValueMetadata>
 <Version>3.02</Version>
 <CreationDateTime/>
 <TestID/>
 <TestName/>
 <DataType>PosInteger, Integer, Float, PosFloat, Enum, String</DataType>
 <Flagstouse/>
 <Oktousevalues/>
 <UnitValues>
  <NormalUnits/>
  <EqualUnits/>
  <ConvertingUnits>
   <Units/>
   <MultiplyingFactor/>
  </ConvertingUnits>
  <ExcludingUnits/>
 </UnitValues>
 <EnumValues>
  <Val description=""/>
  <ExcludingVal description=""/>
 </EnumValues>
 <MaxStringLength/>
 <CommentsDeterminingExclusion>
  <Com></Com>
 </CommentsDeterminingExclusion>
 <LowofLowReference/>
 <HighofLowReference/>
 <LowofHighReference/>
 <HighofHighReference/>
 <LowofToxicReference/>
 <HighofToxicReference/>
 <Analysis>
  <Enums/>
  <Counts/>
  <New/>
 </Analysis>
</ValueMetadata>

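A quick structural check against the list of required tags given earlier might look like the sketch below. The function name is hypothetical. It assumes that “required” means the element must be present, not that it must be non-empty, since the examples in this document leave Flagstouse empty.

```python
import xml.etree.ElementTree as ET

REQUIRED_TAGS = ("Version", "CreationDateTime", "TestID", "DataType", "Flagstouse")


def missing_required_tags(metadata_xml: str) -> list:
    """Return the required tags that are absent from a ValueMetadata document."""
    root = ET.fromstring(metadata_xml.strip())
    if root.tag != "ValueMetadata":
        raise ValueError(f"unexpected root element: {root.tag}")
    return [tag for tag in REQUIRED_TAGS if root.find(tag) is None]


# Deliberately incomplete sample: CreationDateTime is left out.
SAMPLE = """
<ValueMetadata>
  <Version>3.02</Version>
  <TestID>MED:DOSE</TestID>
  <DataType>PosFloat</DataType>
  <Flagstouse></Flagstouse>
</ValueMetadata>
"""
print(missing_required_tags(SAMPLE))
```
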
Rolling up group codes

Groups are one hierarchical level above the leaf codes, and the XML file that represents a group code is computed from all the XML files that represent the children in the group. The following fields are computed from the children’s Metadata_XML files in the following manner:

Flagstouse is a string of concatenated flags that are valid for any of the Children in this group.

Oktousevalues will contain a Y or nothing. If the Children in the group are not all Numbers (Float, PosFloat, Integer, PosInteger), all Enums, or all Strings, then Oktousevalues is usually a short explanatory string.

UnitValues is the parent tag of a set of possibly repeating tags. It contains data when the datatype is PosInteger, Integer, Float, or PosFloat. Each unique XML element and value in the Children of this group is simply repeated as necessary under this tag.

EnumValues is used to specify the list of acceptable enumerated values for all Children in the group, each enclosed in the Val tag. Enumerated values that indicate an invalid result (for the Enum datatype) can be enclosed in the ExcludingVal tag. These (like UnitValues) are simply repeated for each unique value found in the Children.

MaxStringLength will contain a positive integer or 0, representing the longest acceptable string length of all the Children in the Group.

LowofLowReference specifies the lowest of the low range values for PosInteger, Integer, Float, and PosFloat datatypes of all the Children in the Group.

HighofLowReference specifies the highest of the low range values for PosInteger, Integer, Float, and PosFloat datatypes of all the Children in the Group.

LowofHighReference specifies the lowest of the high range values for PosInteger, Integer, Float, and PosFloat datatypes of all the Children in the Group.

HighofHighReference specifies the highest of the high range values for PosInteger, Integer, Float, and PosFloat datatypes of all the Children in the Group.

LowofToxicReference specifies the lowest of the toxic range values for PosInteger, Integer, Float, and PosFloat datatypes of all the Children in the Group.

HighofToxicReference specifies the highest of the toxic range values for PosInteger, Integer, Float, and PosFloat datatypes of all the Children in the Group.

Analysis is not used in the group, and therefore is always empty.

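The roll-up rules for the scalar fields can be sketched as follows, with each child represented as a plain dictionary of already-parsed values. The function names, the dictionary representation, and the wording of the explanatory message are assumptions made for illustration. UnitValues and EnumValues, which are simply repeated per unique value, are left out to keep the sketch short.

```python
NUMERIC_TYPES = {"PosInteger", "Integer", "PosFloat", "Float"}
RANGE_KEYS = (
    "LowofLowReference", "HighofLowReference",
    "LowofHighReference", "HighofHighReference",
    "LowofToxicReference", "HighofToxicReference",
)


def type_family(datatype: str) -> str:
    """Collapse the four numeric DataTypes into one family."""
    return "Number" if datatype in NUMERIC_TYPES else datatype


def roll_up(children: list) -> dict:
    """Compute group-level fields from the children's metadata."""
    group = {}

    # Flagstouse: every flag that is valid for any child, in first-seen order.
    flags = []
    for child in children:
        for flag in child.get("Flagstouse", ""):
            if flag not in flags:
                flags.append(flag)
    group["Flagstouse"] = "".join(flags)

    # Oktousevalues: "Y" only when all children share one type family.
    families = {type_family(child["DataType"]) for child in children}
    if len(families) == 1:
        group["Oktousevalues"] = "Y"
    else:
        group["Oktousevalues"] = "Children have mixed data types"

    # Lowof* takes the lowest child value, Highof* the highest.
    for key in RANGE_KEYS:
        found = [c[key] for c in children if c.get(key) is not None]
        if found:
            group[key] = min(found) if key.startswith("Low") else max(found)

    lengths = [c["MaxStringLength"] for c in children
               if c.get("MaxStringLength") is not None]
    if lengths:
        group["MaxStringLength"] = max(lengths)
    return group
```
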

Examples of Metadata_XML values for medication modifiers

DOSE:

<?xml version="1.0"?>
<ValueMetadata>
 <Version>3.02</Version>
 <CreationDateTime>01/26/2011 00:00:00</CreationDateTime>
 <TestID>MED:DOSE</TestID>
 <TestName>Medication Dose</TestName>
 <DataType>PosFloat</DataType>
 <Flagstouse></Flagstouse>
 <Oktousevalues>Y</Oktousevalues>
 <EnumValues></EnumValues>
 <UnitValues>
  <NormalUnits>mg/dose</NormalUnits>
  <EqualUnits>mg/tablet</EqualUnits>
  <EqualUnits>gm/liter</EqualUnits>
  <EqualUnits>mg/ml</EqualUnits>
  <ExcludingUnits>%</ExcludingUnits>
  <ExcludingUnits>iu</ExcludingUnits>
  <ExcludingUnits>iu/ml</ExcludingUnits>
  <ExcludingUnits>mcg/inh</ExcludingUnits>
  <ExcludingUnits>mEq</ExcludingUnits>
  <ExcludingUnits>mg/inh</ExcludingUnits>
  <ExcludingUnits>u/gm</ExcludingUnits>
  <ExcludingUnits>u/ml</ExcludingUnits>
  <ConvertingUnits>
   <Units>gm/tablet</Units>
   <MultiplyingFactor>1000</MultiplyingFactor>
  </ConvertingUnits>
  <ConvertingUnits>
   <Units>gm/15ml</Units>
   <MultiplyingFactor>66.66</MultiplyingFactor>
  </ConvertingUnits>
  <ConvertingUnits>
   <Units>gm/50ml</Units>
   <MultiplyingFactor>20</MultiplyingFactor>
  </ConvertingUnits>
  <ConvertingUnits>
   <Units>mcg/tablet</Units>
   <MultiplyingFactor>0.001</MultiplyingFactor>
  </ConvertingUnits>
  <ConvertingUnits>
   <Units>mcg/ml</Units>
   <MultiplyingFactor>0.001</MultiplyingFactor>
  </ConvertingUnits>
  <ConvertingUnits>
   <Units>mg/0.5ml</Units>
   <MultiplyingFactor>2</MultiplyingFactor>
  </ConvertingUnits>
  <ConvertingUnits>
   <Units>mg/15ml</Units>
   <MultiplyingFactor>15</MultiplyingFactor>
  </ConvertingUnits>
  <ConvertingUnits>
   <Units>mg/5ml</Units>
   <MultiplyingFactor>5</MultiplyingFactor>
  </ConvertingUnits>
 </UnitValues>
</ValueMetadata>

FREQUENCY:

<?xml version="1.0"?>
<ValueMetadata>
 <Version>3.02</Version>
 <CreationDateTime>01/26/2011 00:00:00</CreationDateTime>
 <TestID>MED:FREQ</TestID>
 <TestName>Medication Frequency</TestName>
 <DataType>Enum</DataType>
 <Flagstouse></Flagstouse>
 <Oktousevalues>N</Oktousevalues>
 <EnumValues>
  <Val description="Before meals">AC</Val>
  <Val description="Twice per day">BID</Val>
  <Val description="Once per day">QD</Val>
  <Val description="Once at night">QHS</Val>
  <Val description="Three times per day">TID</Val>
 </EnumValues>
 <UnitValues></UnitValues>
</ValueMetadata>

ROUTE:

<?xml version="1.0"?>
<ValueMetadata>
 <Version>3.02</Version>
 <CreationDateTime>01/26/2011 00:00:00</CreationDateTime>
 <TestID>MED:ROUTE</TestID>
 <TestName>Medication Route</TestName>
 <DataType>Enum</DataType>
 <Flagstouse></Flagstouse>
 <Oktousevalues></Oktousevalues>
 <EnumValues>
  <Val description="">inhalation</Val>
  <Val description="">injection</Val>
  <Val description="Intravenous">IV</Val>
  <Val description="By Mouth">PO</Val>
  <Val description="By Rectum">PR</Val>
  <Val description="">topical</Val>
  <Val description="">transdermal</Val>
 </EnumValues>
 <UnitValues></UnitValues>
</ValueMetadata>

PHARMACY ID:

<?xml version="1.0"?>
<ValueMetadata>
 <Version>3.02</Version>
 <CreationDateTime>01/26/2011 00:00:00</CreationDateTime>
 <TestID>MED:NCPDPID</TestID>
 <TestName>NCPDP Provider ID</TestName>
 <DataType>String</DataType>
 <MaxStringLength>255</MaxStringLength>
 <Flagstouse></Flagstouse>
 <Oktousevalues></Oktousevalues>
 <EnumValues></EnumValues>
 <UnitValues></UnitValues>
</ValueMetadata>

PHARMACY BENEFITS MANAGER (PBM) NUMBER:

<?xml version="1.0"?>
<ValueMetadata>
 <Version>3.02</Version>
 <CreationDateTime>01/26/2011 00:00:00</CreationDateTime>
 <TestID>MED:PBM</TestID>
 <TestName>Pharmacy Benefits Manager Number</TestName>
 <DataType>String</DataType>
 <MaxStringLength>255</MaxStringLength>
 <Flagstouse></Flagstouse>
 <Oktousevalues></Oktousevalues>
 <EnumValues></EnumValues>
 <UnitValues></UnitValues>
</ValueMetadata>

QUANTITY OF PILLS DISPENSED:

<?xml version="1.0"?>
<ValueMetadata>
 <Version>3.02</Version>
 <CreationDateTime>01/26/2011 00:00:00</CreationDateTime>
 <TestID>MED:DSPQ</TestID>
 <TestName>Quantity Dispensed</TestName>
 <DataType>PosFloat</DataType>
 <Flagstouse></Flagstouse>
 <Oktousevalues>Y</Oktousevalues>
 <EnumValues></EnumValues>
 <UnitValues>
  <NormalUnits>tablets</NormalUnits>
 </UnitValues>
</ValueMetadata>

NUMBER OF DAYS SUPPLY GIVEN:

<?xml version="1.0"?>
<ValueMetadata>
 <Version>3.02</Version>
 <CreationDateTime>01/26/2011 00:00:00</CreationDateTime>
 <TestID>MED:DDS</TestID>
 <TestName>Number of Days Supply Given</TestName>
 <DataType>PosFloat</DataType>
 <Flagstouse></Flagstouse>
 <Oktousevalues>Y</Oktousevalues>
 <EnumValues></EnumValues>
 <UnitValues>
  <NormalUnits>days</NormalUnits>
 </UnitValues>
</ValueMetadata>