Changes between Version 1 and Version 2 of OnyxExportOntology


Timestamp:
01/26/11 09:59:29 (14 years ago)
Author:
jeff.lusted
Comment:

--

  • OnyxExportOntology

    v1 v2  
    11== Deriving an Initial Ontology from an Onyx Export ==
     2This is about deriving a first ontology from Onyx for trialing the import of an ontology and subsequent data into i2b2.
     3
     4----
     5'''Nick's comments from meeting with Jason, Jeff and Dave'''
     6{{{
     7Working from variables.xml in MedicalHistoryInterviewQuestionnaire
     8
     9Top level of xml is the questionnaire - nothing pulled from export xml to populate this.
     10
     11Second level is taken from the stage name - all variables have a stage name attribute in the export xml.
     12
     13Third level is section name - all variables have a section name attribute.
     14
     15Fourth level is questionName - all variables have a questionName attribute.
     16
     17Fifth level is name of variable - defined in the <variable name=""> section of the export xml (or possibly constructed from variable name and category name, see below). Note that in some cases this will be the same as the fourth level. If this isn't a good idea, then the name will need to be extended by means of an additional string (maybe '.Question'?)
     18
     19<label> is derived from the variable's attribute "label"
     20
     21<type> is derived from the <variable valueType=""> declaration.
     22
     23* * *
     24
     25I've done the high BP questions (Do you.., When did you..., Have you received...) and then skipped all the other conditions as the question structure is essentially the same, albeit some with multiple categories for (e.g.) type of diabetes, treatment of diabetes, or with multiple question sets for multiple event conditions (MI, etc).
     26
     27* * *
     28
     29Discussion
     30
     31This is messy. Pulling the data only from each <variable></variable> element definition makes sense, but the 'category' variables (Y,N,PNA,DK,etc) would then have 'labels' of 'Y', 'N' and so on - put hundreds of those into i2b2 and you've got a very confusing ontology.
     32
     33Jeff's idea of using the <category> child elements of the original <variable> elements might therefore be better, but it requires a more complex filter. Where a variable has category child elements, the filter needs to construct additional ontology entries from those child elements (combining the variable label and the category label) and either ignore the following variables with the same root variable name or possibly add detail from them. Where the variable doesn't have category child elements, the filter must behave differently.
     34
     35Thinking further, it would be better if the primary question <label> element related to the fourth level, not the fifth. Thus in the hierarchy there would be the primary question label, with 1 or more variables underneath it.
     36
     37NOTE a GOTCHA with questions that generate an integer value: the boolean variable is named e.g. part_hist_highbp_onset_cat but the integer value associated with it is just part_hist_highbp_onset.
     38
     39Definitely do not need: page attribute, required attribute, condition attribute, validation attribute. I don't think we need to bring over either the category 'code' attribute or the 'missing' attribute, but maybe we do? Also don't think we need the exclusiveChoiceCategoryVariable attribute, as the ontology just needs to provide all possible variables for all possible participants.
     40
     41Does the stage attribute of a variable ALWAYS match its questionnaire attribute? Looks like it does.
     42
     43What is the 'script' attribute for?
     44
     45Does the ontology need to include the category variable 'code' attributes? Depends on the structure of the participant answer files. Looks like the codes aren't even mentioned in the answer files, so ignore them for now.
     46}}}
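
The five-level hierarchy and the category-based filter discussed in the notes above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the filter that was actually built. The function names (`get_attr`, `ontology_entries`), the dictionary layout of each entry, the `name.category` leaf naming and the "variable label: category label" format are all hypothetical. The code also assumes that stage, section, questionName and label are readable either as XML attributes on the element or as child `<attribute name="...">` elements, since the notes do not pin down the exact layout of the export. The '.Question' suffix is the tentative suggestion from the notes.

```python
import xml.etree.ElementTree as ET


def get_attr(element, name):
    """Return a named attribute of an element, or None if it is absent.

    Assumption: the value is either an XML attribute on the element or a
    child <attribute name="..."> element (optionally inside <attributes>).
    """
    value = element.get(name)
    if value is not None:
        return value
    children = element.findall("attribute") + element.findall("attributes/attribute")
    for child in children:
        if child.get("name") == name:
            return (child.text or "").strip()
    return None


def ontology_entries(xml_text, questionnaire):
    """Build one ontology entry per variable, or one per category if present."""
    root = ET.fromstring(xml_text)
    entries = []
    for variable in root.iter("variable"):
        name = variable.get("name")
        question = get_attr(variable, "questionName")
        label = get_attr(variable, "label")
        # Levels 1 to 4: questionnaire, stage, section, questionName.
        parent = (
            questionnaire,
            get_attr(variable, "stage"),
            get_attr(variable, "section"),
            question,
        )
        categories = variable.findall("category") + variable.findall("categories/category")
        if categories:
            # Category-based filter: combine the variable label with each
            # category label so entries are not just 'Y', 'N', and so on.
            for category in categories:
                cat_name = category.get("name")
                cat_label = get_attr(category, "label") or cat_name
                entries.append({
                    "path": parent + (f"{name}.{cat_name}",),
                    "label": f"{label}: {cat_label}",
                    "type": variable.get("valueType"),
                })
        else:
            # Level 5 is the variable name; extend it if it would clash
            # with the level 4 name.
            leaf = name + ".Question" if name == question else name
            entries.append({
                "path": parent + (leaf,),
                "label": label,
                "type": variable.get("valueType"),
            })
    return entries
```

Calling `ontology_entries(open("variables.xml").read(), "MedicalHistoryInterviewQuestionnaire")` would return a flat list of entries whose `path` tuples give the five levels. Following variables that share a root name with a categorised variable (for example the `_cat` variables mentioned in the gotcha) are not filtered out here; that would need an additional rule.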