| 28 | |
| 29 | == Duplicate Processing |
| 30 | |
| 31 | Version one of the data load treats records as duplicates when they share the same patient, sample collection datetime and concept code. When it identifies a duplicate, it discards the most recent record. This is probably not correct for several reasons: |
| 32 | |
| 33 | 1. If there are more than two duplicates, it only discards one record and so there will still be a duplicate. |
| 34 | 1. Common sense and reason 1 suggest that it should be keeping the most recent record. |
| 35 | 1. There may be a better way to identify which record is correct, for example by checking whether the result has been suppressed (although result suppression alone will not solve the problem). |
| 36 | 1. Both records may be valid. |
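
The alternative suggested by reasons 1 and 2 (keep only the most recent record in each duplicate group, however many duplicates there are) could look roughly like the sketch below. It is not the data load's actual code. The field names `patient_id`, `collected_at`, `concept_code`, `updated_at` and `value`, and the function name `keep_most_recent`, are hypothetical. It also assumes that "most recent" can be judged from some update timestamp, which these notes do not specify.

```python
from datetime import datetime


def keep_most_recent(records):
    """Keep one record per (patient, collection datetime, concept code) key.

    Assumption: each record is a dict with an 'updated_at' timestamp that
    says which duplicate is the most recent. All field names are hypothetical.
    """
    latest = {}
    for rec in records:
        key = (rec["patient_id"], rec["collected_at"], rec["concept_code"])
        kept = latest.get(key)
        # Replace the kept record only when this one is strictly newer.
        if kept is None or rec["updated_at"] > kept["updated_at"]:
            latest[key] = rec
    return list(latest.values())


collected = datetime(2020, 1, 1, 9, 0)
records = [
    # Three duplicates of the same patient / collection time / concept code.
    {"patient_id": 1, "collected_at": collected, "concept_code": "K",
     "updated_at": datetime(2020, 1, 1, 10, 0), "value": 4.1},
    {"patient_id": 1, "collected_at": collected, "concept_code": "K",
     "updated_at": datetime(2020, 1, 1, 12, 0), "value": 4.3},
    {"patient_id": 1, "collected_at": collected, "concept_code": "K",
     "updated_at": datetime(2020, 1, 1, 11, 0), "value": 4.2},
    # A different patient, so not a duplicate.
    {"patient_id": 2, "collected_at": collected, "concept_code": "K",
     "updated_at": datetime(2020, 1, 1, 10, 0), "value": 3.9},
]

deduplicated = keep_most_recent(records)
```

Because the records are grouped by key, this leaves exactly one record per group even when there are three or more duplicates. It does not address reasons 3 and 4: it ignores result suppression, and it would wrongly drop a record if both duplicates were valid.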
| 37 | |
| 38 | Paul Smalley has agreed to look at the duplicate records to find out the reasons for the duplication. |