Question Details

Tags

privacy security-policy

Answers (3)

Accepted Answer
March 11, 2026 Score: 4 Rep: 47,363 Quality: Expert Completeness: 70%

Ewan's answer is a good overview of what constitutes personally identifiable information. Globally, we have a general understanding of what this means, but jurisdiction matters; many countries have specific legal frameworks we have to deal with. The General Data Protection Regulation (GDPR) is one such legal framework. Within the United States of America (where I'm from), we have some guidance from the National Institute of Standards and Technology (NIST).

GDPR Definition of Personal Data

While there is more to it, this is a good intro:

The data subjects are identifiable if they can be directly or indirectly identified, especially by reference to an identifier such as a name, an identification number, location data, an online identifier or one of several special characteristics, which expresses the physical, physiological, genetic, mental, commercial, cultural or social identity of these natural persons. In practice, these also include all data which are or can be assigned to a person in any kind of way. For example, the telephone, credit card or personnel number of a person, account data, number plate, appearance, customer number or address are all personal data.

Source: https://gdpr-info.eu/issues/personal-data/

NIST (USA) Definitions of Personal Data

Again, what follows isn't comprehensive, but it is a good overview. There are essentially three definitions of PII:

  1. Personally Identifiable Information; Any representation of information that permits the identity of an individual to whom the information applies to be reasonably inferred by either direct or indirect means.

  2. Any information about an individual maintained by an agency, including (1) any information that can be used to distinguish or trace an individual’s identity, such as name, social security number, date and place of birth, mother’s maiden name, or biometric records; and (2) any other information that is linked or linkable to an individual, such as medical, educational, financial, and employment information.

  3. Information that can be used to distinguish or trace an individual’s identity, either alone or when combined with other information that is linked or linkable to a specific individual.

Source: https://csrc.nist.gov/glossary/term/PII

Note that each country, and even each coalition of countries, is free to come up with its own definitions. The GDPR and NIST are just two examples of many. The true test of whether information is considered PII comes in a court of law, argued by lawyers in front of a judge about a specific situation after the damage has already been done.

From an engineering perspective, you need to consider where information is published, displayed, or disseminated, combined with what kinds of information are included, to see if a combination of data points can identify an individual person.
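As a rough illustration of that last point, here is a minimal Python sketch that counts how many records share each combination of fields. The field names and sample records are invented for the example, and a group size of 1 only signals that someone is unique on those fields within this dataset; it is not a legal test.

```python
from collections import Counter


def smallest_group_size(records, fields):
    """Return the size of the smallest group of records that share
    the same values for the given fields.

    A result of 1 means at least one record is unique on that combination.
    """
    groups = Counter(tuple(record[f] for f in fields) for record in records)
    return min(groups.values())


# Hypothetical sample data, invented for illustration only.
records = [
    {"zip": "12345", "birth_year": 1980},
    {"zip": "12345", "birth_year": 1980},
    {"zip": "12345", "birth_year": 1975},
    {"zip": "67890", "birth_year": 1975},
    {"zip": "67890", "birth_year": 1980},
    {"zip": "67890", "birth_year": 1975},
]

alone = smallest_group_size(records, ["zip"])
combined = smallest_group_size(records, ["zip", "birth_year"])
print(alone, combined)
```

In this sample, neither field singles anyone out on its own, but the two together do, which is why the combination matters more than any one column.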

Context Matters With PII

As an extreme example, Social Security Numbers are typically forbidden in e-mail exchanges between government employees in the USA; that's not because the mere existence of a Social Security Number in an e-mail allows me to open a bank account or claim government benefits if I happen to read it. Information in e-mails tends to come with other data points, too: names, e-mail addresses, phone numbers, mailing addresses — these can be combined with an SSN to wreak all sorts of financial havoc on a specific individual.

The Social Security Administration sends out a report called the Vocational Rehabilitation Client Earnings Report: a list of SSNs with the wages earned against each one over the past several quarters. This e-mail, sent only after an extensive background check and verification process, contains SSNs and a code representing wages earned; not the actual number, just a letter code. Wha-whaaaat! SSNs in plain text in an e-mail!? Well, it all comes back to the individual data points; there isn't enough information in that e-mail to identify someone. In fact, you can't tell what they actually earned in wages; it's just a letter code, like "A" or "C". That's not PII according to the US Government.
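To make the letter-code idea concrete, here is a small Python sketch of replacing an exact amount with a coarse band. The band boundaries and codes are assumptions made up for this example; I'm not reproducing the actual coding scheme used in the report.

```python
import bisect

# Hypothetical band boundaries and codes, invented for illustration.
# They are NOT the real report's scheme.
BAND_UPPER_BOUNDS = [1_000, 5_000, 10_000]
BAND_CODES = ["A", "B", "C", "D"]


def wage_code(amount):
    """Map an exact wage amount to a coarse letter code.

    Amounts below the first bound get the first code; amounts at or
    above the last bound get the last code.
    """
    return BAND_CODES[bisect.bisect_right(BAND_UPPER_BOUNDS, amount)]


codes = [wage_code(amount) for amount in (500, 7_500, 20_000)]
print(codes)
```

The reader of the coded value learns only which band a wage fell into, not the wage itself.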

Summary

While we have some similar ideas about what constitutes PII, the legal definitions are open-ended enough to expand the definition if lawyers can make a good argument. I don't think we can guarantee that synthetic primary keys (i.e., surrogate keys) won't be considered PII; this depends on where those primary keys are disseminated, what information is included, and what other information sources can be combined to produce a more comprehensive dataset.

As a software engineer, it's good to understand which legal frameworks your application is subject to, and then do some reading to get a surface-level understanding. If you are put in a position where you are responsible for making this assessment, you need to recommend the organization consult a lawyer or qualified cyber security professional.

Do not make this judgement in isolation.

March 11, 2026 Score: 3 Rep: 85,986 Quality: Medium Completeness: 30%

I'm pretty sure they are PII if you can link them back to a user.

From GDPR

The data subjects are identifiable if they can be directly or indirectly identified, especially by reference to an identifier such as a name, an identification number, location data, an online identifier...

The remedy is to make sure you can't look the user up:

‘pseudonymisation’ means the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person;

So if you need a username and password to get from, say, user:177980 to the user's details, then it's not PII.
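One way to picture the "additional information kept separately" part is a keyed hash: the identifier that leaves your system is derived with a secret key, and the key stays somewhere else. This is only a sketch of one possible approach using Python's standard hmac module; the function name and the keys are hypothetical, and whether a given setup meets the definition above is not something the code can decide.

```python
import hashlib
import hmac


def pseudonymise(user_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym for user_id using HMAC-SHA256.

    Without secret_key, the pseudonym cannot be recomputed from a
    candidate user_id, so the key is the 'additional information'.
    """
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical keys; in practice the key would be stored apart from the data.
key = b"example-key-kept-in-a-separate-system"
other_key = b"a-different-example-key"

token = pseudonymise("177980", key)
print(token)
```

The same ID and key always give the same token, so records can still be correlated, while a different key gives an unrelated token.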

Looking at your specific questions:

So should (for example) an Active Directory ObjectGUID be considered PII?

Yes, IF you have the whole AD graph, including other data such as the name.

No, IF you just log them to some file by themselves.

Are the values that I'm using as the synthetic database primary keys in my hypothetical USERS table considered PII?

Yes, IF you have the whole users table.

No, IF they are by themselves in a PDF.

What about if a person can be uniquely identified by a value, but it requires a cross-reference (or SQL JOIN)? If I have "case" or "order" tables owned by "subjects" or "customers", do I consider CASE.ID and ORDER.ID to be PII, because they can be unambiguously connected back to the CUSTOMER (and CUSTOMER contains PII)?

Yes, IF you have the whole database.

No, IF "such additional information [as is in the join table] is kept separately".
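Here is a small sketch of the cross-reference case using Python's built-in sqlite3 module. The schema and data are invented for the example (the table is called orders because ORDER is a reserved word in SQL).

```python
import sqlite3

# Hypothetical in-memory schema and data, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id)
    );
    INSERT INTO customer VALUES (1, 'Alice Example');
    INSERT INTO orders VALUES (1001, 1);
    """
)

# With the whole database, one JOIN ties the order ID to a named person.
linked = conn.execute(
    "SELECT c.name FROM orders o "
    "JOIN customer c ON c.id = o.customer_id "
    "WHERE o.id = ?",
    (1001,),
).fetchone()

# An export of the order IDs alone carries nothing that names the customer.
exported_ids = [row[0] for row in conn.execute("SELECT id FROM orders")]

print(linked, exported_ids)
```

The order ID is the same value in both cases; what changes is whether the customer table is reachable from wherever that ID ends up.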

March 11, 2026 Score: -1 Rep: 12,365 Quality: Low Completeness: 10%

The fact that the SSNs have been widely exposed does not change that.

It very much does. If a hacker publishes a list of GUIDs that are the primary keys of your customers table, the values of those GUIDs are useless to anyone who doesn't have the other columns of that table.

You can't identify a person from such a GUID without access to other data in your database.