In the current age of big data, which is revolutionizing every field of work, one of the primary concerns is data protection and privacy. A large amount of data about individuals is collected and stored on computers and later used for purposes that were not intended when it was collected, which can reveal information about a specific person or organization.
Data should be anonymized so that, if an unauthorized person gains access to it, it is of no value to him or her. One of the primary procedures of data anonymization is de-identification, which can be performed with several techniques. However, de-identification does not guarantee that the data is completely anonymized while the relationships within the data remain meaningful. Those relationships are the key element that research examines in order to observe behavior or patterns; if the semantic relationships no longer exist in the data, it is of no use for research.
The whole purpose of data anonymization is to protect individuals from the use of inaccurate or incomplete personal information, and from personal information being used for purposes other than those for which it was collected. According to the data protection principles, data should be processed fairly and lawfully; personal data should be obtained only for one or more lawful purposes; and personal data should be relevant, adequate and not excessive in relation to the purpose of the research. Data required for research should be kept up to date and processed in accordance with the rights of the individual.
Different de-identification standards have been defined. For example, the HIPAA Privacy Rule protects protected health information (PHI) by allowing only specified disclosures of information related to health and its management. Creating a HIPAA de-identified data set involves removing identifiers such as the names of individuals, geographical subdivisions, medical record numbers and so on.
To meet the HIPAA standards, the names of patients should be replaced with pseudonyms. However, if the research concerns the medical symptoms of malaria, for example, the patient's medical history cannot be excluded from the data, because the research depends on it. Excluding the medical history would eliminate the very relationships in the data that the study needs, and the research could not be conducted.
Therefore, certain elements are essential to the research and must be included, while elements such as names and other attributes that have no impact on the results or outcomes can be omitted without affecting the research. It is important to understand the data privacy and protection rules before conducting research in any field, because there are organizations that protect individuals' data.
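The idea of replacing names with pseudonyms, keeping the medical fields the study needs and dropping everything else can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the field names (`name`, `address`, `symptoms`, `diagnosis`), the `SECRET_KEY` value and the helper functions `pseudonym` and `de_identify` are hypothetical, and the keyed hash is only one possible way to generate pseudonyms. It is a minimal sketch of pseudonymization and does not by itself satisfy HIPAA or any other standard.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be kept separately from the data.
SECRET_KEY = b"replace-with-a-real-secret"

# Hypothetical names of the fields the research actually needs.
RESEARCH_FIELDS = {"symptoms", "diagnosis"}


def pseudonym(name: str, key: bytes = SECRET_KEY) -> str:
    """Map a name to a stable pseudonym using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, name.encode("utf-8"), hashlib.sha256).hexdigest()
    return "patient-" + digest[:12]


def de_identify(record: dict) -> dict:
    """Replace the name with a pseudonym and keep only the research fields."""
    cleaned = {"patient_id": pseudonym(record["name"])}
    for field in RESEARCH_FIELDS:
        if field in record:
            cleaned[field] = record[field]
    return cleaned


# Two visits by the same (made-up) patient.
records = [
    {"name": "Alice Example", "address": "1 Sample Street",
     "symptoms": "fever, chills", "diagnosis": "malaria"},
    {"name": "Alice Example", "address": "1 Sample Street",
     "symptoms": "headache", "diagnosis": "malaria"},
]
cleaned_records = [de_identify(r) for r in records]
```

Because the same name always maps to the same pseudonym, both visits still belong to one `patient_id`, so the relationship between a patient and their medical history survives even though the name and address are gone.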