What is it? Why is it important?
Data sharing is a requirement of many funding bodies, and is increasingly considered a scientific standard across disciplines.
The primary objectives of data sharing are to:
- Enable reproducibility checks of research results
- Facilitate the reuse of study data to investigate new research questions
- Promote scientific advancement
In medical research, data sharing can conflict with the researchers' obligation to protect the privacy of their study participants. This might necessitate the de-identification of data prior to sharing.
De-identification is the process of preparing data in a way that prevents the identification of participants, except through deliberate and extensive efforts. By complying with de-identification guidelines (see details under "More"), researchers can balance the need for data sharing with the need to safeguard the privacy of study participants.
More
De-identification of data can be time-consuming and requires specific expertise in data management within the relevant research field.
According to the Swiss CTU-Network Guidance on Data Sharing, a de-identification process is a collaboration between the sponsor, statistician, and data manager.
Key steps may include:
- Replacing identifiers with random numbers or codes (e.g. identification numbers needed for analysis, such as the patient ID, are replaced by a random number)
- Removing free-text variables that could indirectly identify individuals (e.g. reason for patient visit)
- Grouping rare combinations of values to create larger patient groups (e.g. sensitive medications are grouped by family)
- Removing or coding information that in combination may identify a person (e.g. year of birth, diagnosis, gender)
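The steps above can be sketched in a few lines of Python. This is a minimal illustration only, not a validated de-identification procedure: the field names (`patient_id`, `visit_reason`, `medication`, `year_of_birth`), the medication mapping, and the choice to coarsen year of birth into decades are all assumptions made for the example.

```python
import secrets

# Hypothetical field names and mapping, chosen only for illustration.
FREE_TEXT_FIELDS = {"visit_reason"}
MEDICATION_FAMILY = {"sertraline": "antidepressant", "fluoxetine": "antidepressant"}


def deidentify(records):
    """Return de-identified copies of the records plus the ID key table."""
    key_table = {}  # original patient ID -> random code; keep separate from shared data
    used_codes = set()
    output = []
    for record in records:
        original_id = record["patient_id"]
        if original_id not in key_table:
            code = secrets.token_hex(4)
            while code in used_codes:  # guard against an unlikely collision
                code = secrets.token_hex(4)
            used_codes.add(code)
            key_table[original_id] = code

        # Drop free-text fields that could indirectly identify a person.
        cleaned = {k: v for k, v in record.items() if k not in FREE_TEXT_FIELDS}

        # Replace the identifier with its random code.
        cleaned["patient_id"] = key_table[original_id]

        # Group sensitive medications by family; other values stay unchanged.
        medication = cleaned.get("medication")
        if medication is not None:
            cleaned["medication"] = MEDICATION_FAMILY.get(medication, medication)

        # Coarsen year of birth to a decade (an assumed way of coding it).
        year = cleaned.pop("year_of_birth", None)
        if year is not None:
            decade = (year // 10) * 10
            cleaned["birth_decade"] = f"{decade}-{decade + 9}"

        output.append(cleaned)
    return output, key_table
```

In this sketch the key table is returned separately so that it can be stored apart from the shared dataset, or destroyed if the data are to be anonymised rather than coded.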
What do I need to do?
As a SP-INV:
- Be familiar with the Swiss law on the correct coding and anonymisation of health-related data and biological material
- Have the appropriate knowledge and skills regarding data security and data protection (e.g. Data Protection Act)
- In collaboration with the statistician, define the level of privacy required for the dataset and whether de-identification is necessary
Together with the statistician, assess the dataset and classify each variable as:
- Directly identifying variables, such as names, addresses, and dates of birth
- Indirectly identifying variables, which in combination with other variables, may potentially allow the identification of an individual
- Unproblematic variables, which are neither directly nor indirectly identifying variables
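As an illustration of this classification, the short Python sketch below records a class for each variable, lists variables that have not yet been classified, and counts how many records share each combination of indirectly identifying values. The variable names, the `CLASSIFICATION` table, and the threshold of 5 records are assumptions for the example; the appropriate threshold depends on the dataset and should be agreed with the statistician.

```python
from collections import Counter

# Hypothetical classification, as it might be agreed with the statistician.
CLASSIFICATION = {
    "name": "direct",
    "date_of_birth": "direct",
    "year_of_birth": "indirect",
    "gender": "indirect",
    "diagnosis": "indirect",
    "systolic_bp": "unproblematic",
}


def unclassified_variables(records, classification):
    """List variables present in the data that have no class assigned yet."""
    seen = set()
    for record in records:
        seen.update(record)
    return sorted(seen - set(classification))


def rare_combinations(records, classification, min_group_size=5):
    """Return combinations of indirect identifiers shared by too few records.

    Keys are tuples of values, ordered by the sorted indirect variable names.
    """
    indirect = sorted(v for v, c in classification.items() if c == "indirect")
    counts = Counter(tuple(r.get(v) for v in indirect) for r in records)
    return {combo: n for combo, n in counts.items() if n < min_group_size}
```

Combinations returned by `rare_combinations` are candidates for grouping or removal, since a small group of records with the same indirect identifiers is easier to link to an individual.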
Make sure to document the de-identification process properly.
When sharing data or biological material, make sure to set up a Data / Material Transfer Agreement that documents the conditions for sharing the data or material. The document is signed by both parties.
Where can I get help?
Your local CTU can support you with experienced staff regarding this topic:
Basel, Departement Klinische Forschung, CTU, dkf.unibas.ch
Lugano, Clinical Trials Unit, CTU-EOC, www.ctueoc.ch
Bern, Clinical Trials Unit, CTU, www.ctu.unibe.ch
Geneva, Clinical Research Center, CRC, crc.hug.ch
Lausanne, Clinical Research Center, CRC, www.chuv.ch
St. Gallen, Clinical Trials Unit, CTU, www.kssg.ch
Zürich, Clinical Trials Center, CTC, www.usz.ch
Swiss law
ClinO – see in particular
- Art. 6 paragraph 1 letter c Data security and data protection
ClinO-MD – see in particular
- Art. 5 paragraph 1 letter d Data security and data protection
HRO – see in particular articles
- Art. 4 paragraph 1 letter d Data security and data protection
- Art. 25 Anonymisation of health-related personal data and biological material
- Art. 26 Coding of health-related personal data and biological material