Changes between Version 35 and Version 36 of Onyx Export and Purge


Timestamp:
10/14/11 09:20:57 (13 years ago)
Author:
jeff.lusted

  • Onyx Export and Purge

    v35 v36  
    164164Another awkwardness is that once a participant is exported, it is only possible (as far as I can see) to look at a limited amount of their data via the Onyx web interface. So validating what is at the end of the Onyx-to-i2b2 trail by manual inspection will be difficult.
    165165
     166=== On Exporting Small Numbers of Participants Regularly ===
     167
     168I remember from a conversation last year that Philippe Laflamme of OBIBA (one of the principal developers of Onyx) recommended using the export facility on a regular basis to export smallish numbers of participants. We haven't, and now (October 2011) have a backlog of over 1000 ready and waiting. I think it is imperative that we nevertheless use the export config to stick to Philippe's recommendation, i.e. smallish, regular exports. The following sections outline how this might be achieved.[[BR]]
     169[[BR]]
     170But in the meantime here are some things to ponder:
     171 * For 1000 participants, the unzipped export file will contain 14 subdirectories, each with two control files (one of which is metadata) and one file for each exported participant; and there is one control file for the overall export. Altogether, there would be 14029 xml files for an export of 1000 participants, probably totalling over a gigabyte in size.
     172 * The export is triggered by the administrator within the Onyx web application, and is executed by the Onyx web application itself. I suspect that it will take something in the order of 3 hours or more to export 1000 participants. My initial reaction is that the web server is unlikely to survive the memory demands of processing that many xml files to produce one zip file.
     173 * If, however, we did succeed in producing such a huge zip file, processing it would be difficult. Our first export(s) of non-artificial data will be used to bottom out the multi-step pipeline between Onyx export and i2b2 import. It would be sensible, at least in these first realistic tests, to keep the process within human bounds. If individual steps in each run of the pipeline take hours to finish, and may need a retry, then the development work, which will certainly be required given things like changes to the ontology, will be tortuous indeed.
     174
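The file count quoted above can be checked with a little arithmetic. The following is only a sketch: the function name and its parameters are made up for illustration, and the figures (14 subdirectories, two control files per subdirectory, one overall control file) are simply those quoted in the first bullet, not values read from Onyx itself.

```python
def export_file_count(participants: int,
                      subdirectories: int = 14,
                      control_files_per_subdirectory: int = 2,
                      overall_control_files: int = 1) -> int:
    """Estimate the number of XML files in an unzipped export.

    Defaults are the figures quoted in the text (assumptions, not
    values obtained from Onyx): each subdirectory holds its control
    files plus one file per exported participant, and the export as
    a whole has one further control file.
    """
    files_per_subdirectory = control_files_per_subdirectory + participants
    return subdirectories * files_per_subdirectory + overall_control_files


if __name__ == "__main__":
    # Compare a few batch sizes against the 1000-participant backlog.
    for batch_size in (50, 100, 1000):
        print(batch_size, export_file_count(batch_size))
```

On these assumptions the count grows linearly with the batch size, so smaller batches give proportionally fewer files to process at each step of the pipeline.
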
    166175=== On Filtering using a Date ===
    167176
    168 A further experiment revealed that the filter illustrated below worked, selecting participants with a status of completed where the conclusion questionnaire end date was some time in January, and the participant hadn't been exported before. All the valueTables within the export config file had the same filter.
     177Further experiments on exporting revealed that the filter illustrated below worked, selecting participants with a status of completed where the conclusion questionnaire end date was some time in January, and the participant hadn't been exported before. All the valueTables within the export config file had the same filter.
    169178
    170179{{{