While cleansing PII from test data I've run into a daunting situation: cascading the changes down through the foreign key relationships in the data. Given the focus on privacy and regulations, should the practice of using PII as keys be discouraged? When the PII itself wasn't used as a key in any way, a neat trick is to just shuffle the columns.
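A minimal sketch of the column-shuffle trick, assuming the rows are already loaded as a list of dicts and that the column being shuffled is not part of any key. The names `shuffle_column`, `people`, and `last_name` are made up for illustration. Note that on a small table a value can land back on its original row, so this is only a rough scrub, not a guarantee of anonymity.

```python
import random


def shuffle_column(rows, column, seed=None):
    """Return copies of the rows with one column's values permuted across rows.

    Key columns are left alone, so foreign key relationships stay intact.
    """
    rng = random.Random(seed)
    values = [row[column] for row in rows]
    rng.shuffle(values)
    # Build new dicts so the caller's original rows are not modified.
    return [{**row, column: value} for row, value in zip(rows, values)]


# Hypothetical test data: "id" is the (synthetic) key, "last_name" is the PII.
people = [
    {"id": 1, "last_name": "Smith", "city": "Austin"},
    {"id": 2, "last_name": "Jones", "city": "Boston"},
    {"id": 3, "last_name": "Nguyen", "city": "Denver"},
    {"id": 4, "last_name": "Garcia", "city": "Eugene"},
]

scrubbed = shuffle_column(people, "last_name", seed=42)
```

The same set of last names is still present afterwards, just attached to different rows, which keeps value distributions realistic for testing.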
There are some commercial tools available to address this problem, but none of them seem to handle a wide range of databases well.
Sounds dangerous, stupid, and inefficient. Keys should be synthetic IDs.
HIPAA has a concept called the "Unique Patient Identifier" which can be used, as you describe, to link data: http://www.ncvhs.hhs.gov/app4.htm
Unique Patient Identifier eliminates the need for the repetitive use and disclosure of an individual's personal identification information (i.e. name, age, sex, race, marital status, place of residence, etc.) for routine internal and external communications (e.g. orders, results, medication, consultation, etc.) and protects the privacy of the individual. It helps preserve the patient's anonymity while facilitating communication and information sharing. Health care is essentially a multi-disciplinary process. A Unique Patient Identifier enables the integration and the availability of much needed information from multi-disciplinary sources and multiple care settings. Therefore, the integrity and security of the patient information depend on the use of a reliable Unique Patient Identifier.
The privacy issue hinges less on the identifier itself than on the security and privacy of the data the identifier is used to access, and on how that access is controlled. My understanding is that this typically means a system querying for data by patient identifier should only return data that cannot be pieced together to reveal personal information.
Basically you would generate an artificial key for each person. Even though it is unique to the person, it is not personally identifying unless you also release personal information along with it. For example, if you let people retrieve only first names with a particular query, but also returned the artificial key, they now know that artificial key 00003 is associated with the name Bob. If you then allow them to go back and query with 00003 as the criterion, and give them access to the last name, you can see how they could begin to accumulate information. It is important that there be no way for an unauthorized user to get the artificial key and PII returned in the same query, since that would make the artificial key itself PII. That's my interpretation, at least.
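Here is a rough sketch of that rule in code, under my own assumptions: the key is a random UUID, the PII column names are a fixed set, and a single `authorized` flag stands in for whatever real access control the system has. `PII_COLUMNS`, `new_patient_key`, and `select_columns` are hypothetical names, not anything from HIPAA or a particular library.

```python
import uuid

# Hypothetical list of columns treated as PII in this sketch.
PII_COLUMNS = {"first_name", "last_name", "ssn"}


def new_patient_key():
    """Generate an artificial key that carries no personal information."""
    return uuid.uuid4().hex


def select_columns(record, requested, authorized=False):
    """Return the requested columns of one record.

    Unauthorized callers may not get the artificial key and any PII
    column back in the same result.
    """
    requested = list(requested)
    wants_key = "patient_key" in requested
    wants_pii = any(col in PII_COLUMNS for col in requested)
    if wants_key and wants_pii and not authorized:
        raise PermissionError("artificial key and PII may not be returned together")
    return {col: record[col] for col in requested}


record = {
    "patient_key": new_patient_key(),
    "first_name": "Bob",
    "last_name": "Example",
    "diagnosis_code": "J10",
}

# Allowed: the key together with non-identifying data.
clinical = select_columns(record, ["patient_key", "diagnosis_code"])

# Refused: the key together with a name, for an unauthorized caller.
try:
    select_columns(record, ["patient_key", "first_name"])
    blocked = False
except PermissionError:
    blocked = True
```

A per-query check like this does not by itself stop someone from combining several permitted queries, so the broader point about controlling how results can be pieced together still applies.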
Apart from the HIPAA issues, another problem with using PII as a key is that it changes. People get new SSNs when their identities are stolen. SSNs are also frequently miskeyed and so attach the data to the wrong person (thinking more of data imports from other systems here). People (especially women) often change their names. Different people also share the same name (and often, as a result, databases hold incorrect SSN information for them as well, because they were matched to the wrong SSN for that name), so very little PII is actually unique enough to be a key field. Further, PII should be stored in an encrypted field, which makes it an even worse choice for a key field.
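To illustrate the point about PII changing, here is a small sketch using an in-memory SQLite database. The table and column names (`patient`, `visit`, `patient_id`) and the values are invented for the example. The child table references a synthetic integer key, so a name change or a corrected SSN is a one-row update with nothing to cascade.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# The synthetic key is the only column other tables reference.
conn.execute(
    "CREATE TABLE patient ("
    " patient_id INTEGER PRIMARY KEY,"
    " last_name TEXT,"
    " ssn TEXT)"
)
conn.execute(
    "CREATE TABLE visit ("
    " visit_id INTEGER PRIMARY KEY,"
    " patient_id INTEGER NOT NULL REFERENCES patient(patient_id),"
    " reason TEXT)"
)

cur = conn.execute(
    "INSERT INTO patient (last_name, ssn) VALUES (?, ?)",
    ("Jones", "000-00-0001"),
)
pid = cur.lastrowid
conn.execute(
    "INSERT INTO visit (patient_id, reason) VALUES (?, ?)", (pid, "checkup")
)

# The person changes name and gets a new SSN: only the patient row changes.
conn.execute(
    "UPDATE patient SET last_name = ?, ssn = ? WHERE patient_id = ?",
    ("Smith", "000-00-0002", pid),
)

rows = conn.execute(
    "SELECT v.reason, p.last_name"
    " FROM visit v JOIN patient p ON p.patient_id = v.patient_id"
).fetchall()
```

Had `visit` referenced the SSN or the name instead, the same change would have required updating every child row as well, which is exactly the cascading problem described in the question.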