How should one go about discovering PII information in a large enterprise?
The challenge here is to go beyond the information provided by System Owners through a written report or interview.
Good Answers (1)
Stuart G
Director, Risk Assessment at McGraw-Hill
Best Answers in: Regulation and Compliance (2), Writing and Editing (1), Corporate Governance (1)
Alex, I would start by defining your point of arrival: an inventory of all applications and platforms using PII data, together with the flow of PII data around the organization and its full life cycle.
To discover all PII information in a large enterprise (assume multiple information silos and no central repository of data?) you will likely need to pursue multiple paths to get a total picture:
- Software tools with deep packet scanning capability (per previous answers)
- Google hack / data leakage analysis (what are you already losing?)
- Identify known entry points for PII (e.g. credit card input points, staff data etc)
- Core to addressing the question is determining where you expect PII: consider the nature of the business and where and how PII is used, then follow that data through to the systems where it is processed and stored. This goes beyond the interview you suggest (e.g. all companies have HR data: identify the systems that use it and the partner organizations it goes to; it is reasonable to assume pensions, payroll processing and benefits are handled by third parties). Effectively, construct a model of how data is expected to be used and the path it follows.
I would use multiple techniques and methods to get a complete picture of PII. Point of arrival: a comprehensive dictionary of all PII data and the applications it passes through.
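To make the idea of a PII dictionary concrete, here is a minimal Python sketch of how one might record each PII element with its expected entry points, systems and third parties, and then compare that expected model against what a scan actually turns up. The class, field and function names are my own illustration, not taken from any particular tool, and the system names in the usage note are made up.

```python
from dataclasses import dataclass, field


@dataclass
class PiiElement:
    """One entry in a hypothetical PII dictionary."""
    name: str
    entry_points: list = field(default_factory=list)   # where the data comes in
    systems: list = field(default_factory=list)        # where it is expected to live
    third_parties: list = field(default_factory=list)  # partners it is sent to


def systems_index(elements):
    """Invert the dictionary: which PII elements does each system touch?"""
    index = {}
    for element in elements:
        for system in element.systems:
            index.setdefault(system, []).append(element.name)
    return index


def unexpected_locations(elements, observed):
    """Return {element name: systems where it was found but not expected}.

    `observed` maps an element name to the set of systems where a scan
    or review actually found it.
    """
    gaps = {}
    for element in elements:
        extra = set(observed.get(element.name, ())) - set(element.systems)
        if extra:
            gaps[element.name] = sorted(extra)
    return gaps
```

For example, if salary data is expected only in the HR and payroll systems but a scan also finds it on a shared drive, `unexpected_locations` would flag the shared drive as a location the model did not predict.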
More Answers (2)
Take a look at Symantec Enterprise Vault.
Andrew R
President & Owner of NMI InfoSecurity Solutions; ISACA NY Chapter Board Member
Some PII, such as social security numbers and phone numbers, has signatures that can be detected automatically. If you have an authoritative database of people with whom your organization does business, you can also search for any of those strings.
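As a rough illustration of signature-based detection, the Python sketch below looks for US-style social security numbers and phone numbers with regular expressions, and also searches for strings taken from a known list of people. The patterns are simplified assumptions on my part (US formats only) and will produce both false positives and misses on real data; the function name and the `known_strings` parameter are hypothetical.

```python
import re

# Assumed US-style formats only; real-world data needs broader patterns.
# SSN: ddd-dd-dddd, skipping area/group/serial values that are never issued.
SSN_RE = re.compile(r"\b(?!000|666|9\d\d)\d{3}-(?!00)\d{2}-(?!0000)\d{4}\b")
# Phone: (ddd) ddd-dddd or ddd-ddd-dddd, with '-', '.' or a space as separator.
PHONE_RE = re.compile(r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)")


def find_pii(text, known_strings=()):
    """Return a list of (kind, matched_text) pairs found in `text`.

    `known_strings` stands in for values pulled from an authoritative
    database (e.g. customer names); matching is case-insensitive.
    """
    hits = []
    for match in SSN_RE.finditer(text):
        hits.append(("ssn", match.group()))
    for match in PHONE_RE.finditer(text):
        hits.append(("phone", match.group()))
    lowered = text.lower()
    for value in known_strings:
        if value.lower() in lowered:
            hits.append(("known", value))
    return hits
```

A call such as `find_pii(document_text, known_strings=customer_names)` could be run over file shares or database exports; anything it flags still needs human review before it is treated as confirmed PII.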
As another reply has suggested, there are commercial tools that:
- Search for PII based on such criteria as I have described above
- Search for criteria in cross-perimeter communications
- Attempt to restrict the storage of PII to specific containers
In the end, masking or encryption will defeat any attempts to find PII that someone wants to keep hidden,