wiki:Initial Incremental Export

Version 6 (modified by Nick Holden, 13 years ago)


In October 2011 we began work exporting data from Onyx for the first time.

Because the data was likely to be incorrect in at least some respects, this is not considered to be the definitive export, and NO PURGE WILL TAKE PLACE.

To manage the load on the server and limit file sizes, the plan is to run incremental exports until the exported data is up to date, recognising that this process might need to be repeated later.

The task of splitting the export into increments is managed in the export-destinations.xml configuration file, as described in the Onyx Export and Purge page.

Each incremental export will require its own export-destinations.xml file with inclusion/exclusion criteria tailored to suit. Each stage in the interview needs to be specifically referenced with a stanza in the export-destinations.xml file. Once the export-destinations.xml file has been edited, it must be installed on the server and tomcat restarted before entering Onyx to begin the export.

Data is exported as a timestamped zip file into a destination directory on the server. Each one will then be copied into a remote directory on the UHL BRICCS share, before beginning the next export.

The strategy is to conduct the exports as follows:

Include only interviews with the status COMPLETED, and exclude any previously exported.

Do NOT encrypt the export file, as it won't be leaving the UHL network.

Each increment then has a time window based on the Recruitment Context stage, specifically the timeStart attribute of its QuestionnaireRun.

For the first run, limit this to January through June 2010 (month values are zero-based, so 0 = January and 5 = June):

        <script type="INCLUDE">
          <javascript>
            <![CDATA[($('Participants:Admin.Interview.status').any('COMPLETED')).and($('RecruitmentContextQuestionnaire:QuestionnaireRun.timeStart').year().trim().any('2010')).and($('RecruitmentContextQuestionnaire:QuestionnaireRun.timeStart').month().trim().any('0', '1', '2', '3', '4', '5'))]]>
          </javascript>
        </script>
        <script type="EXCLUDE">
          <javascript>
            <![CDATA[$('Participants:Admin.Interview.exportLog.destination').any('BRICCS.Participants')]]>
          </javascript>
        </script>

Second run, extend the month limit through to 6 = July

Third run, extend the month limit through to 7 = August

Fourth run, extend the month limit through to 8 = September

Fifth run, extend the month limit through to 9 = October

Sixth run, extend the month limit through to 10 = November

Seventh run, extend the month limit through to 11 = December

Eighth run, remove the year limit, and set the month limit back to 0 = January (should include only January 2011 participants, because any earlier January interviews are excluded as previously exported)

Ninth run, extend the month limit through to 1 = February

Tenth run, extend the month limit through to 2 = March

Eleventh run, extend the month limit through to 3 = April

Twelfth run, extend the month limit through to 4 = May

Thirteenth run, extend the month limit through to 5 = June (the month filter may start to match 2010 cases again, but these are excluded if previously exported)

Fourteenth run, extend the month limit through to 6 = July

Fifteenth run, extend the month limit through to 7 = August

Sixteenth run, extend the month limit through to 8 = September

Seventeenth run, extend the month limit through to 9 = October

Eighteenth run, remove both the year and month limits. Should mop up all remaining unexported interviews.
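The schedule above can be sketched as a small generator for the INCLUDE expression. This is only an illustration: the names include_expression and RUNS are hypothetical and not part of Onyx, and the sketch assumes each run differs from the first only in its year and month limits. The output is intended to be pasted into the CDATA section of the INCLUDE script.

```python
# Hypothetical helper: builds the INCLUDE expression for each planned run.
# The selector strings are copied from the first-run stanza; nothing here
# talks to Onyx itself.
STATUS = "$('Participants:Admin.Interview.status').any('COMPLETED')"
TIME_START = "$('RecruitmentContextQuestionnaire:QuestionnaireRun.timeStart')"


def include_expression(year=None, last_month=None):
    """Return the INCLUDE expression for one run.

    year: e.g. 2010, or None for no year limit.
    last_month: zero-based month (0 = January), or None for no month limit.
    All months from 0 up to and including last_month are listed.
    """
    parts = ["(" + STATUS + ")"]
    if year is not None:
        parts.append(".and(%s.year().trim().any('%d'))" % (TIME_START, year))
    if last_month is not None:
        months = ", ".join("'%d'" % m for m in range(last_month + 1))
        parts.append(".and(%s.month().trim().any(%s))" % (TIME_START, months))
    return "".join(parts)


# Runs 1-7: year 2010, month limit 5 (June) to 11 (December).
# Runs 8-17: no year limit, month limit 0 (January) to 9 (October).
# Run 18: no year or month limit.
RUNS = (
    [(2010, m) for m in range(5, 12)]
    + [(None, m) for m in range(0, 10)]
    + [(None, None)]
)
```

Calling include_expression(*RUNS[0]) reproduces the first-run expression shown earlier. The EXCLUDE script stays the same for every run.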

On completion of the 18 runs, the total number of exported interviews should match the total number of COMPLETED interviews in Onyx.

Notes on the actual export process

First increment export-destinations.xml version created in svn

First increment export-destinations.xml version copied to data.briccs.org.uk/onyx-config/...

First increment export-destinations.xml version downloaded to the test onyx server and export run successfully.

First increment export-destinations.xml version downloaded to the live onyx server and export run. No errors reported.

First increment export copied to V:\Test Data Export from Live Onyx\BRICCS-20111031170159.zip

26 participants marked as exported in Onyx interface. 26 xml files in each directory of the export. 26 entities.
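A check like the one above (the same number of xml files in each directory of the export) could be scripted roughly as follows. This is a sketch only: the function names are hypothetical, and it assumes the export zip holds one xml file per participant in each directory, which is inferred from the counts noted here rather than from Onyx documentation.

```python
import posixpath
import zipfile
from collections import Counter


def xml_counts_by_directory(zip_path):
    """Count the .xml entries in each directory of an export zip.

    Zip entry names always use forward slashes, so posixpath is used
    to split them. Files at the top level are counted under ''.
    """
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".xml"):
                counts[posixpath.dirname(name)] += 1
    return dict(counts)


def all_directories_have(counts, expected):
    """True if at least one directory was found and every one has `expected` files."""
    return bool(counts) and all(n == expected for n in counts.values())
```

For the first increment, the expected value would be 26, the number of participants marked as exported in the Onyx interface.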

There were 27 participants interviewed in June 2010, but one is still 'in progress', so 26 exported is correct.

Second version created in svn, committed, copied to data.briccs.org.uk, downloaded to live onyx server, tomcat restarted, and export run.

Expected result for July 2010 export: 53 participants. 53 participants exported.

PROBLEM: Exporting to /tmp on the live system is a disaster - the /tmp/tomcat6 folder gets destroyed when tomcat is restarted. This does not happen on the test server, so the live tomcat needs a different configuration. For the time being, I am copying each of the export files to /home/nick/briccs/onyx-export/ and also to the remote directory, but we need a permanent solution.
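Until there is a permanent fix, the interim copy step could be scripted along these lines, to be run before each tomcat restart. This is a hedged sketch: preserve_exports is a hypothetical name, and the exact directory under /tmp/tomcat6 where the zip files land is not recorded here, so the source directory must be supplied by whoever runs it.

```python
import shutil
from pathlib import Path


def preserve_exports(source_dir, backup_dir):
    """Copy every .zip in source_dir to backup_dir, skipping files already there.

    Returns the names of the files copied on this call.
    """
    source = Path(source_dir)
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    copied = []
    for archive in sorted(source.glob("*.zip")):
        target = backup / archive.name
        if not target.exists():
            # copy2 also preserves the file's modification time
            shutil.copy2(archive, target)
            copied.append(target.name)
    return copied
```

With the backup directory set to /home/nick/briccs/onyx-export/, this mirrors the manual copy described above. It does not replace moving the export destination out of /tmp.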
