To help developers select a primary identifier for federated applications, the AAF suggests persistentNameID as a persistent, unique attribute for identifying users. The AAF Attribute Definitions describe persistentNameID as:
A persistent, non-reassigned, privacy-preserving identifier for a user shared between an identity provider and service provider. An identity provider uses the appropriate value of this attribute when communicating with a particular service provider or group of service providers, and does not reveal that value to any other service provider.
This attribute is the best choice to differentiate between individuals accessing a service, as it presents a unique value per user, specific to the service accessed. A user will present the same value for every session with a service provider, and identity providers must never reassign a value to another user. Identity providers may offer to generate a new persistentNameID value if a user considers that an existing value compromises their privacy. A new value will break any existing relationship between a user and a service provider.
Federations classify persistentNameID as a "service-specific pseudonym" since it is an opaque identifier that is unique for every service a user accesses. Use of persistentNameID is also ideal where a service provider has no requirement to explicitly identify a user; however, usage commonly occurs with one or more personal identifiers like display name and email address.
When selecting persistentNameID for a service, consideration should be given to how an application consumes identities, local storage limits, and flexibility in the authentication pipeline.
Identifier Format and Generation
The persistentNameID attribute value is a string of no more than 2,306 characters. It consists of three fields separated by the exclamation mark (!), in the following order:
- identity provider’s entityID name, which authenticated the user,
- service provider’s entityID name of the service accessed by the user, and
- an opaque string.
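As per the SAML format definition, the identifier portion MUST NOT exceed 256 characters, and the identity and service provider entityID values MUST NOT each exceed 1024 characters. Together with the two separators, these limits give the 2,306-character maximum (1024 + 1024 + 256 + 2).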
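The format and length limits above can be checked with a short parser. This is a minimal sketch, not an AAF-supplied routine: the function and class names are hypothetical, and it assumes that neither entityID contains an exclamation mark.

```python
from typing import NamedTuple

MAX_ENTITY_ID = 1024  # limit for each entityID, as stated in the text
MAX_OPAQUE = 256      # limit for the opaque identifier portion


class PersistentNameID(NamedTuple):
    idp_entity_id: str
    sp_entity_id: str
    opaque: str


def parse_persistent_name_id(value: str) -> PersistentNameID:
    """Split 'idp!sp!opaque' into three fields and check the length limits."""
    # Split on the first two separators only, so any later '!' stays in the
    # opaque field (assumption: entityIDs themselves contain no '!').
    parts = value.split("!", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected three non-empty fields separated by '!'")
    idp, sp, opaque = parts
    if len(idp) > MAX_ENTITY_ID or len(sp) > MAX_ENTITY_ID:
        raise ValueError("entityID exceeds 1024 characters")
    if len(opaque) > MAX_OPAQUE:
        raise ValueError("opaque string exceeds 256 characters")
    return PersistentNameID(idp, sp, opaque)
```

For instance, with a hypothetical service provider entityID, `parse_persistent_name_id("https://idp.edu.au/idp/shibboleth!https://sp.example.edu.au/shibboleth!Mza74xVcOOJ/I/Z3NFFY86+nfOk")` returns the three fields as a named tuple.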
For each service a user accesses, an identity provider creates an opaque string by hashing a user identifier with a private salt. The user identifier should be a permanent serial number linked to an account, not a personal identifier such as a username or email address. The serial number should persist regardless of the underlying technology and be recoverable after a disaster.
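The generation step might look like the following sketch. It is illustrative only: the digest (SHA-256), the ordering of the hashed fields, and the Base64 encoding are assumptions, and real identity provider software may differ on all three. The function name is hypothetical.

```python
import base64
import hashlib


def make_opaque_string(user_serial: str, sp_entity_id: str, salt: bytes) -> str:
    """Derive a per-service opaque string from a permanent serial number.

    Assumption: the service provider's entityID is mixed into the hash so
    that each service receives a different value for the same user.
    """
    material = sp_entity_id.encode("utf-8") + b"!" + user_serial.encode("utf-8") + b"!" + salt
    digest = hashlib.sha256(material).digest()
    return base64.b64encode(digest).decode("ascii")
```

Because the inputs are stable, the same user receives the same value at the same service on every login, while a different service or a different salt produces an unrelated value. Losing the salt or the serial numbers makes the values unrecoverable, which is why the text asks for both to survive a disaster.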
Though uncommon, a different identity provider service may generate the same opaque string for another user. In this case, the persistentNameID value remains unique since entityID names are unique across identity providers and service providers. A less common format seen by the AAF for persistentNameID uses a GUID format for the opaque string. For example:
Services MUST be able to consume both forms of persistentNameID.
Usage in an Application
A typical web application would rely on credentials to identify a user, with the username as the primary identifier. Developers would use the username to query the local data store, retrieve user configuration data, and create a local security context. Developers could select the email address as the primary identifier and easily include a mechanism to cater for email address changes with minimal impact on the user experience.
For a federated application, credential validation is already complete by the time the user accesses the application. A validated user arrives at the application with a range of verified attributes, including email address and display name. If the email address is selected as the primary identifier, no handshake verification is necessary, but any mechanism that caters for email address updates becomes more complex and will most likely involve a manual update to the user’s profile within the application to restore access.
The developer can mitigate this complexity by employing persistentNameID as the primary key. Any change to an email address is then picked up on the next login without intervention. Email address changes are common, and the use of persistentNameID ensures uninterrupted access for users.
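A minimal sketch of this pattern follows, using an in-memory dictionary in place of a real data store. The attribute keys `mail` and `displayName`, the `UserRecord` class, and the `sign_in` function are assumptions made for illustration; the names an application actually receives depend on its service provider configuration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class UserRecord:
    email: str
    display_name: str


def sign_in(store: Dict[str, UserRecord], attributes: Dict[str, str]) -> UserRecord:
    """Look the user up by persistentNameID and refresh their personal details."""
    key = attributes["persistentNameID"]
    record = store.get(key)
    if record is None:
        # First visit: create the local profile keyed on the opaque identifier.
        record = UserRecord(attributes["mail"], attributes["displayName"])
        store[key] = record
    else:
        # Returning user: the key is unchanged, so new details simply overwrite old ones.
        record.email = attributes["mail"]
        record.display_name = attributes["displayName"]
    return record
```

Since the lookup never depends on the email address, a changed address updates the existing profile instead of creating a second one or locking the user out.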
If an application’s storage constrains the size of a primary identifier to fewer than 2,306 characters, several approaches can assist a developer:
- The first approach is to parse the persistentNameID value and remove the service provider’s entityID name. Since entityIDs are constant, it is not necessary to store this tuple component for a service. Using the previous example, the stored value can take the format illustrated here: https://idp.edu.au/idp/shibboleth!-!Mza74xVcOOJ/I/Z3NFFY86+nfOk, which combines an identity provider’s entityID name with the opaque string, and the delimiter “!-!”. This retains unique values for users and shortens the value by the same number of characters as the service provider’s entityID name length.
- A more drastic approach to further reduce the persistentNameID size relies on hashing the value, and storing a mapping of the hash to the values for auditing purposes. The stored mapping and session logging of the persistentNameID will assist when communicating with an IdP administrator about errant accounts within an application.
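Both approaches can be sketched in a few lines. The function names are hypothetical, SHA-256 is an assumed choice of hash, and the dictionary stands in for whatever audit table the application keeps.

```python
import hashlib
from typing import Dict


def strip_sp_entity_id(value: str) -> str:
    """Drop the service provider's entityID and join the rest with '!-!'."""
    idp, _sp, opaque = value.split("!", 2)
    return f"{idp}!-!{opaque}"


def hashed_key(value: str, audit_map: Dict[str, str]) -> str:
    """Return a fixed-length key and record the hash-to-value mapping for auditing."""
    key = hashlib.sha256(value.encode("utf-8")).hexdigest()
    audit_map[key] = value
    return key
```

The first function shortens the value by roughly the length of the service provider's entityID, while the second always yields 64 hexadecimal characters regardless of input length. The hashed form is not reversible, which is why the mapping has to be stored alongside it.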
NOTE: The attributes persistentNameID and eduPersonTargetedID share the same definition and should produce the same value when they share the same hashing salt and source identifier.
In the latest release of the eduPerson (2020-1) specification, REFEDS has deprecated the eduPersonTargetedID attribute in favour of a new identifier called pairwise-id “... with simpler syntax and safer comparison rules”. From a service provider’s perspective, the pairwise-id attribute has all the same properties and features as eduPersonTargetedID and persistentNameID with a simpler format. The format takes the form of “opaque string@scope”, with the generation of the opaque string having a similar process to the opaque string for eduPersonTargetedID. The scope would nominally be the identity provider's domain.
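To illustrate the simpler format, the sketch below splits a pairwise-id value into its opaque string and scope. The function name and the sample value are hypothetical, and the sketch checks only the overall shape of the value; the full syntax and comparison rules are defined in the OASIS specification and are not reproduced here.

```python
from typing import Tuple


def split_pairwise_id(value: str) -> Tuple[str, str]:
    """Split 'opaque@scope' into (opaque, scope)."""
    # rpartition splits on the last '@', so the scope is everything after it.
    opaque, sep, scope = value.rpartition("@")
    if not sep or not opaque or not scope:
        raise ValueError("expected a value of the form 'opaque@scope'")
    return opaque, scope
```

Compared with persistentNameID, there is a single separator and no service provider entityID to strip, so the value can usually be stored as received.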
Further information on the pairwise-id attribute is available on the REFEDS site: https://wiki.refeds.org/pages/viewpage.action?pageId=50626629#eduPerson2020-01-eduPersonTargetedID.
The AAF will commit to the pairwise-id attribute, once a migration strategy is in place for identity providers.
The AAF’s current advice to developers using eduPersonTargetedID is to migrate to persistentNameID, first collecting both attributes for each user in preparation. If a user’s values for both attributes match, the migration is simple and only requires updating the process that provides the source for the application’s primary identifier. If the values differ, a find and replace across the data store will also be necessary to update the primary key.
Developers who choose to adopt pairwise-id should prepare by collecting values for eduPersonTargetedID (or persistentNameID) and pairwise-id (when available), to create a mapping between the attribute values for each user.
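One way to prepare such a mapping is sketched below. The attribute keys, the function names, and the in-memory dictionaries are assumptions for illustration; a real application would persist the mapping and rekey its data store in a transaction.

```python
from typing import Any, Dict


def record_identifiers(mapping: Dict[str, str], attributes: Dict[str, str]) -> None:
    """On login, remember which new identifier belongs to the current one."""
    old = attributes.get("eduPersonTargetedID") or attributes.get("persistentNameID")
    new = attributes.get("pairwise-id")
    if old and new:
        mapping[old] = new


def rekey(store: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Return a copy of the store keyed on the new identifier where one is known."""
    # Users who have not logged in since collection began keep their old key.
    return {mapping.get(key, key): record for key, record in store.items()}
```

Collecting the pairs at login means the mapping fills in gradually, so the eventual switch of primary key only needs the find and replace step for users whose values differ.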
AAF eduPersonTargetedID Definition
AAF persistentNameID Definition
REFEDS eduPersonTargetedID Specification
REFEDS eduPerson 2020-01
OASIS SAML Subject Identifier Attributes Profile Specification (2019)
OASIS Pairwise Subject Identifier
Shibboleth Consortium NameIdentifiers
Shibboleth Consortium NameID Generation Configuration
Shibboleth Consortium Persistent NameID Generation Configuration
Additional Background Shibboleth Documentation