== Processing the Patient Identifier Set during a Load. ==

=== Cardinality, with some examples. ===

==== __Example One__ ====
First of all, this is wrong. Every pid must have a patient_id.
{{{
BPt00000040 Snnnnnnn
}}}

==== __Example Two__ ====
This is acceptable:
{{{
BPt00000040 Snnnnnnn
}}}

==== __Example Three__ ====
The following example is also acceptable, but it implies that either:
 * the participant is already within the CRC (and we know that their internal identifier = 2)
or:
 * this is a new participant and we are ourselves assigning a new i2b2 internal identifier (= 2) for them.
We can avoid both situations by adopting the approach of Example Two above and omitting the HIVE as a source.
{{{
2 BPt00000040 Snnnnnnn
}}}

==== __Comment__ ====
As far as I can tell, each row in the temporary table covers one patient_id / patient_map_id combination. So:
 * Example Two would give rise to one row.
 * Example Three would give rise to two rows.

=== First Stage: Eliminate Duplicates. ===
Any "duplicates" are eliminated from the temporary table. A row is a duplicate if another row matches it on all of:
 1. patient_id
 1. patient_id source
 1. patient_map_id
 1. patient_map_id source

=== Second Stage: Process HIVE as a Source. ===

=== Third Stage: Not using HIVE as a Source. ===
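
=== Sketch: Eliminating Duplicates in the First Stage. ===
The First Stage rule described above (a row is a duplicate when another row matches on patient_id, patient_id source, patient_map_id and patient_map_id source) can be sketched in Python. This is only a minimal illustration, not the CRC loader's actual code: the `PidRow` type, the `eliminate_duplicates` function, the source names `SOURCE_A` and `SOURCE_B`, and the choice to keep the first occurrence of each duplicate are all assumptions made for the example.

```python
from typing import Iterable, List, NamedTuple


class PidRow(NamedTuple):
    """Hypothetical stand-in for one row of the temporary table."""
    patient_id: str
    patient_id_source: str
    patient_map_id: str
    patient_map_id_source: str


def eliminate_duplicates(rows: Iterable[PidRow]) -> List[PidRow]:
    """Drop rows that match an earlier row on all four columns.

    Assumption: the first occurrence is the one that is kept.
    """
    seen = set()
    unique = []
    for row in rows:
        # A NamedTuple hashes on all of its fields, so set membership
        # is exactly the four-column match described in the text.
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


# Illustrative rows only; SOURCE_A and SOURCE_B are made-up source names.
rows = [
    PidRow("BPt00000040", "SOURCE_A", "Snnnnnnn", "SOURCE_B"),
    PidRow("BPt00000040", "SOURCE_A", "Snnnnnnn", "SOURCE_B"),  # exact repeat
    PidRow("2", "HIVE", "BPt00000040", "SOURCE_A"),
]
deduplicated = eliminate_duplicates(rows)
```

Here `deduplicated` holds two rows: the exact repeat is dropped, while the row that differs in any of the four columns is kept. In a real load the same effect would more likely be achieved inside the database, but the matching rule is the same.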