Best Practices for Managing User Identifiers

© 2015 Hitachi ID Systems, Inc. All rights reserved.
Contents

1 Introduction
2 Defining user identifiers
3 Different types of identifiers
4 Scope and uniqueness
5 When identifiers are assigned
6 Machine-readable versus human-readable identifiers
7 Desirable attributes of identifiers
8 Addressing challenges in identifier management
9 Common and recommended algorithms for assigning login IDs
   9.1 Login IDs for internal users
   9.2 Login IDs for external users
   9.3 Assigning new E-mail addresses to internal users
10 Example business processes
   10.1 Employee / contractor onboarding
   10.2 Customer onboarding (Internet-facing)
   10.3 Renaming an existing employee login ID

1 Introduction

This document presents best practices for assigning and managing unique identifiers for the users of computer systems in medium to large organizations. It begins with definitions and background information, then proceeds to explain scope, uniqueness, business processes, challenges and best practices.

2 Defining user identifiers

What is a user identifier, or ID for short?

Technical definition: Multi-user computer systems often need to identify users, so that access to applications and data can be controlled, logged and attributed to people. Computers refer to people using unique numbers or strings of characters. These numbers or character strings are user identifiers.

User-centric definition: Users have a variety of identifiers, which uniquely identify them in some context. Examples in the IT environment include operating system login IDs, e-mail addresses and employee numbers. Examples from day-to-day life include driver's license numbers, credit card numbers and passport numbers.

3 Different types of identifiers

In the context of a medium to large organization, users often have at least the following identifiers:

1. An employee number.
2. At least one network login ID.
3. Possibly additional login IDs to a variety of applications.
4. At least one e-mail address.

This document offers guidance to organizations regarding the management of these corporate user IDs.

4 Scope and uniqueness

An ID must uniquely identify a person within a defined scope. For example, since no two users can have the same login ID on an application, the application can be thought of as an identification domain, within which each user has a unique ID.

Unique IDs commonly have a scope drawn from the following list of possibilities:

Scope                          Examples
Single system or application   Active Directory domain, RACF security database.
Single organization            Employee number, standardized cross-application login ID.
Sub-national                   Driver's license, voter number.
National                       Passport number, federal tax number.
Global                         Fully qualified e-mail address.

In general, the scope over which an ID is unique can be expanded by appending the context where it was defined. This can be illustrated with some additional examples:

Original scope        Example          Append              New scope      Example
Single system         JSMITH           Application name    Organization   JSMITH@App01
Single organization   JSMITH           Organization name   Global         JSMITH@Acme.com
State/province        DL 1341135-013   Jurisdiction        National       DL 1341135-013@NewYork
National              QC0318876        Country code        Global         QC0318876 from Canada

5 When identifiers are assigned

When discussing how identifiers are assigned, it is helpful to consider when they are assigned. Here are some examples:

1. At birth, as happens in some jurisdictions for government IDs, social insurance numbers, etc.
2. When joining an organization: enrolling as a student, starting a new job, etc.
3. When being granted a new login ID to a system or application.

Identifiers are sometimes changed as well, for example following name changes, which in turn often follow marriage or divorce.

6 Machine-readable versus human-readable identifiers

People find it easier to remember and enter memorable strings of characters. On the other hand, computers are able to assign numeric identifiers which are guaranteed to be unique in some scope. This leads to two broad categories of identifiers:

1. Human-friendly identifiers, such as e-mail addresses and login IDs.
2. Computer-friendly identifiers, such as globally unique IDs (GUIDs), which are strings of 32 hexadecimal digits.

Computer-friendly identifiers often have the benefits of being unique in a larger scope and of never changing during the lifecycle of a user. In contrast, human-friendly identifiers are unique only within a smaller scope and are more volatile, but are easier for people to manage.

7 Desirable attributes of identifiers

Following is a list of desirable characteristics of user IDs. When designing an algorithm to assign IDs to users, or business processes for managing user IDs, it is helpful to consider each of these and to develop a process which satisfies as many of them as possible.

Identify a person, not a position: Identifiers should refer to people, not to positions. People often move from one position to another and changing their identifier when this happens is a nuisance and creates inconsistencies in audit logs.

User friendly: Identifiers should be reasonably easy to remember and short enough to enter quickly. Long and hard-to-remember IDs should be avoided unless they are only used by machines.

Easily recognizable: It is helpful for users to be able to recognize that a string of characters is a user ID on casual inspection. In other words, user IDs should be constructed in an easily recognizable format. This is helpful both for users, when reading text that contains IDs, and for automated processes, which can scan log files, scripts, network traffic or other data sets for user IDs.

Reusable: It makes sense to assign the smallest possible number of identifiers to a user and to reuse existing identifiers where possible. This is more user friendly, less troublesome to manage and easier to audit. In short, use an existing identifier if possible, rather than creating a new one. Standardize identifiers across as many systems as possible.

Compatible: Identifiers are often used on a variety of systems.
For example, a user might type the same identifier to sign into Windows / Active Directory, into a mainframe using RACF and into an ERP running SAP. Each of these systems will have different constraints on the allowable length and characters that can comprise an identifier. In order to support reuse (the previous objective), it makes sense to assign identifiers that are compatible with the largest possible number of systems.

Maximum scope: Different systems may have different, overlapping user populations. It makes sense to assign identifiers which are unique over the largest possible scope, so that they can be reused by the largest possible number of systems.

Unchanging:
Identifiers assigned to a user should be designed so that they never have to be changed. Changing identifiers is an administrative burden and can create significant operational problems. For example, the ID may appear on multiple systems, making it costly to change. Changing the ID would create a discontinuity in audit logs, perhaps violating security policy. The ID may be embedded in programs or scripts, which would stop working after the change. The ID may be known to other users, who would have to be informed of the change.

Never reused: Identifiers should never be reused. For example, when a user leaves an organization, that (old) user's identifier should never be assigned again to another (new) user. Doing so can have undesirable and unexpected consequences, such as the new user acquiring security access rights from the old user's profile. This means that a repository of every identifier that has ever been assigned must be maintained, rather than just a repository of currently-in-use identifiers.

Not offensive: People have an amazing ability to read meaning into meaningless strings of characters. Identifiers, often assigned by automatic processes, can be read literally or with poetic license to have colorful or offensive meanings, with results that range from humorous to offensive. This problem suggests that a human review process is often needed when new identifiers are assigned, so that they can be vetted and perhaps replaced if they are found to be offensive.

Cross-language: Many organizations span countries, languages and cultures. In this context, a question of cultural, rather than just technical, compatibility arises. For example, would a unilingual English speaker be able to read, remember or type an identifier for a co-worker if that identifier is in Kanji (Japanese)?
Since identifiers may have to be accessible by multiple users, it is important to consider the ability of users fluent in different languages to read and enter them.

Accessible only within an appropriate scope: In some cases, an organization may consider identifiers to be confidential. This is true in the legal sense with some identifiers, such as social security numbers. Confidentiality of identifiers may also be considered a secondary line of defense against security attacks such as automated password guessing. Since users often have to know, remember and enter their own identifiers, confidentiality means limiting the visibility of identifiers to just authorized users and not disclosing information about whether an identifier is valid to unauthorized or unauthenticated users.

8 Addressing challenges in identifier management

Some challenges arise in most organizations in the course of assigning new or managing existing identifiers. These are described below:

Collisions: If the algorithm used to assign unique IDs to users is based on users' names, then users with identical or even similar names may be assigned the same identifier. This needs to be rectified.
For example, an organization may employ 10 people with the (common among English speakers) name Michael Smith. If IDs are assigned using the algorithm "last name plus first initial" then they would all be assigned the ID smithm. Assigning the same ID to multiple users would defeat the purpose of IDs, which is unique identification, so the algorithm must be adjusted to eliminate these collisions. This may be done by appending one or two digits to the IDs above, for example.

Name changes: Where IDs are assigned using an algorithm based on the user's name, in the event that the user's name changes (for example, due to marriage or divorce) the user may wish to change his ID to match his new name. Changes to user IDs are undesirable, as described in Section 7.

Short names: Where IDs are based on user names, the algorithm used to calculate IDs may produce unsatisfactory results for users with short names. For example, two common Chinese surnames are written (in English) as Wu and Li. An organization with many Chinese users and IDs based on surname might have many collisions and require two or more extra characters appended to IDs to make them unique. These unique suffixes are hard to remember and tend to lead to confusion, such as e-mails intended for one user being sent to another.

Changes in user role or status: Where IDs are based on a user's role (e.g., which department he works in) or status (e.g., employee vs. contractor), changes in the user's role or status would trigger a change to the user's ID. For example, a contractor who is subsequently hired as an employee would be assigned a new ID. Changes to user IDs are undesirable, as described in Section 7.

Multiple character sets: As described in Section 7, users fluent in one language, or whose computer is configured for text input in one language, may be unable to read, remember or enter an ID in another language, especially when the two languages use different character sets.

9 Common and recommended algorithms for assigning login IDs

9.1 Login IDs for internal users

The following process and algorithm can be used to satisfy each of the requirements set forth in Section 7:

Requirement: Strategy
Identify a person: Assign IDs to people, not roles.
User friendly: IDs should be 7 characters, total.
Easily recognizable: Formulate IDs as Unnnnnn where n represents a digit. There are 1,000,000 possible IDs of this form.
Reusable: Use the same ID on every system and application.
Compatible: IDs starting with a letter and containing only one letter and 6 digits work on almost every conceivable system and application.
Maximum scope: Assign an ID to every user in the organization and use these IDs to sign users into applications. If possible, use the same ID as an employee number as well.
Unchanging: Since the IDs are numeric, changes in user names should not trigger a request for a new ID. Since they do not represent user role or status, changes in these attributes also do not trigger a request for a different ID.
Never reused: Create a database of every ID ever assigned. Only append to it and never reuse IDs.
Not offensive: Numbers are not generally offensive, though some numbers are considered bad luck in some cultures. Give users an opportunity to request a new ID (but not to specify what it will be) when they are first assigned an ID.
Cross-language: Roman letters (U) and digits are legible across cultures and languages.
Limited disclosure: Do not publish lists of IDs or the correlation between user names and IDs.
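
The numbering scheme above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the use of SQLite, the table name assigned_ids and the function names open_registry and assign_login_id are all hypothetical. It shows an append-only registry of every ID ever issued, with the "find the highest number, use the next one" step wrapped in a write-locked transaction so that two users provisioned at nearly the same instant cannot receive the same ID.

```python
import sqlite3


def open_registry(path=":memory:"):
    """Open (or create) the registry of every ID ever assigned.

    Hypothetical schema. Rows are only ever inserted, never deleted,
    so an ID that belonged to a departed user cannot be issued again.
    """
    # isolation_level=None puts the connection in autocommit mode so that
    # transactions can be started and ended explicitly below.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS assigned_ids ("
        " seq INTEGER PRIMARY KEY,"
        " login_id TEXT UNIQUE NOT NULL,"
        " full_name TEXT NOT NULL)"
    )
    return conn


def assign_login_id(conn, full_name):
    """Issue the next ID of the form Unnnnnn (one letter, six digits)."""
    # BEGIN IMMEDIATE takes the database write lock up front, so the
    # read of MAX(seq) and the INSERT happen as one atomic step.
    conn.execute("BEGIN IMMEDIATE")
    try:
        (highest,) = conn.execute(
            "SELECT COALESCE(MAX(seq), 0) FROM assigned_ids"
        ).fetchone()
        seq = highest + 1
        if seq > 999_999:
            raise RuntimeError("all IDs of the form Unnnnnn have been used")
        login_id = f"U{seq:06d}"
        conn.execute(
            "INSERT INTO assigned_ids (seq, login_id, full_name) VALUES (?, ?, ?)",
            (seq, login_id, full_name),
        )
        conn.execute("COMMIT")
        return login_id
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

In this sketch the user's name is stored only for correlation; it plays no part in computing the ID, which is why a later name change does not call for a new ID.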

Another reasonable process is as follows:

Requirement: Strategy
Identify a person: Assign IDs to people, not roles.
User friendly: IDs should be 7 characters, total.
Easily recognizable: Formulate IDs as up to 3 characters of the user's surname, in English, followed by a 4 digit number assigned sequentially for each prefix, starting at 0000. Example: the fourth Mike Smith could be assigned SMI0003.
Reusable: Use the same ID on every system and application.
Compatible: IDs always start with a letter, only have letters and digits and contain no more than 7 characters. Almost every conceivable system and application supports this.
Maximum scope: Assign an ID to every user in the organization and use these IDs to sign users into applications. If possible, use the same ID as an employee number as well.
Unchanging: Since IDs do not represent user role or status, changes in these attributes do not trigger a request for a different ID. Changes in a user's name may cause users to request a new ID, but in most cases only a short subset of the name is used, so users are likely to tolerate continuing use of their old ID.
Never reused: Create a database of every ID ever assigned. Only append to it and never reuse IDs.
Not offensive: Short strings of letters are not usually offensive and neither are numbers. Give users an opportunity to request a new ID, indicating the string they did not like, when they are first assigned an ID.
Cross-language: Roman letters and digits are legible across cultures and languages.
Limited disclosure: Do not publish lists of IDs or the correlation between user names and IDs.
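
The surname-prefix scheme above can be sketched as follows. This is an illustrative in-memory version only: the class name PrefixIdAllocator and its method are hypothetical, non-letters are simply dropped from the surname (an assumption, since the table does not say how to treat them), and a real deployment would persist the counters and the issued list in a locked database.

```python
import re


class PrefixIdAllocator:
    """Issue IDs made of up to 3 surname letters plus a 4-digit counter."""

    def __init__(self):
        self._next_seq = {}  # prefix -> next sequence number to issue
        self._issued = []    # append-only record of every ID ever assigned

    def assign(self, surname):
        # Keep only the letters A-Z (assumption: surname already romanized).
        letters = re.sub(r"[^A-Za-z]", "", surname).upper()
        if not letters:
            raise ValueError("surname contains no Roman letters")
        prefix = letters[:3]
        seq = self._next_seq.get(prefix, 0)
        if seq > 9999:
            raise RuntimeError(f"prefix {prefix} has no sequence numbers left")
        # Counters only move forward, so an ID is never issued twice.
        self._next_seq[prefix] = seq + 1
        login_id = f"{prefix}{seq:04d}"
        self._issued.append(login_id)
        return login_id


allocator = PrefixIdAllocator()
smiths = [allocator.assign("Smith") for _ in range(4)]
print(smiths[-1])  # the fourth Smith
```

Note that a two-letter surname such as Wu yields a 6-character ID in this sketch, which still satisfies the "no more than 7 characters" constraint.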

9.2 Login IDs for external users

External users that sign into an organization's Internet-facing applications generally sign on only infrequently. Since Internet users generally already have an e-mail address and since e-mail addresses are guaranteed to be globally unique, it makes sense to identify external users with their fully qualified e-mail address. This approach compares with the requirements in Section 7 as follows:

Requirement: Strategy
Identify a person: Use fully qualified e-mail addresses.
User friendly: Users already know their own e-mail addresses.
Easily recognizable: E-mail addresses are easily recognized by people and programs.
Reusable: Users already use their e-mail address elsewhere, so by definition assigning this as an ID is reusing it.
Compatible: E-mail addresses are not compatible with all applications. They can be quite long (over 100 characters) and may contain symbols not supported by some applications (@, _, -, .). These limitations are not usually problematic with Internet-facing applications, but they can present difficulties for back office systems, such as mainframes.
Maximum scope: E-mail addresses can be used as IDs on every Internet-facing application.
Unchanging: Users do periodically change their e-mail address, so this requirement is, unfortunately, violated.
Never reused: Few if any e-mail systems assign the same ID, consecutively, to different users. This reduces the problem of ID reuse to a vanishingly small size.
Not offensive: Users presumably already address this problem when provisioning their e-mail account, so this problem is transferred to another organization.
Cross-language: SMTP e-mail addresses are, by definition, cross-cultural and global.
Limited disclosure: E-mail addresses are widely known, so this requirement cannot be met using this strategy.
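
The compatibility concern in the table above can be checked mechanically before an e-mail address is adopted as an ID on a given system. The sketch below is only an illustration: the IdConstraints type, the incompatibilities function and the two example profiles are hypothetical, and the limits in them are made-up values rather than the documented limits of any real product.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class IdConstraints:
    max_length: int
    allowed: str  # body of a regex character class listing permitted characters


# Illustrative profiles only; not taken from any vendor documentation.
INTERNET_APP = IdConstraints(max_length=254, allowed=r"A-Za-z0-9@._+\-")
LEGACY_APP = IdConstraints(max_length=8, allowed=r"A-Za-z0-9")


def incompatibilities(identifier, constraints):
    """Return a list of reasons the identifier cannot be used, if any."""
    problems = []
    if len(identifier) > constraints.max_length:
        problems.append(
            f"too long ({len(identifier)} > {constraints.max_length})"
        )
    # Collect each distinct character that falls outside the allowed set.
    bad = sorted(set(re.findall(f"[^{constraints.allowed}]", identifier)))
    if bad:
        problems.append("unsupported characters: " + " ".join(bad))
    return problems


print(incompatibilities("john.smith@example.com", INTERNET_APP))
print(incompatibilities("john.smith@example.com", LEGACY_APP))
```

Running a candidate ID against the profile of every target system, not just the first one the user will access, makes it clear where a shorter internal ID is still needed.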

9.3 Assigning new E-mail addresses to internal users

Requirement: Strategy
Identify a person: Assign a new and unique e-mail address to every new e-mail user.
User friendly: Assign firstname.lastname@organizationdomain and insert .uniqueid before the @ if required, where the uniqueid is two letters: aa, ab, ac, etc.
Easily recognizable: E-mail addresses are easily recognized by people and programs.
Reusable: Users can use their e-mail address to sign into a variety of web-based applications. Since many legacy applications do not support long IDs or IDs containing punctuation marks, e-mail addresses cannot be reused everywhere. Nor should they be, because they are long and so take longer to type than other, typically internal, IDs.
Compatible: E-mail addresses have a standard format, compatible with all mail systems. Compatibility with other applications is not predictable.
Maximum scope: E-mail addresses can be used as IDs on many 3rd party Internet-facing applications.
Unchanging: Unfortunately, users will generally demand changes to their e-mail address when their name changes. This is unavoidable with this format.
Never reused: Create a repository of all current and previously assigned e-mail addresses. Even in the case where a user with a given name leaves and later a different person with the same name joins, use the unique field.
Not offensive: Users are not generally offended by their own names.
Cross-language: SMTP e-mail addresses are, by definition, cross-cultural and global.
Limited disclosure: E-mail addresses are widely known, so this requirement cannot be met.

10 Example business processes

Following are some typical examples that illustrate how the naming algorithms described in Section 9 above are used.

10.1 Employee / contractor onboarding

1. For employees: HR creates a new employee record.
2. For contractors: a manager submits a new-contractor request.
3. In either case, the request includes the user's full name.
4. Once the request is approved:
   (a) A new login ID is assigned.
   (b) Using the algorithm in Subsection 9.1:
      i. A database is referenced to find the highest-numbered, already-assigned ID.
      ii. The next number is used.
      iii. Database locking is used to ensure that two users, provisioned at nearly the same instant, do not get the same ID.
      iv. The ID might be U012311.
   (c) A new e-mail address is assigned.
   (d) Using the algorithm in Subsection 9.3:
      i. John Smith might become john.smith@acme.com.
      ii. As with the previous example, a database lookup is required to check for duplicates.
      iii. If a duplicate is found, the e-mail address might become john.smith.aa@acme.com.
      iv. The new e-mail address must be stored in the database, correlated to U012311.
      v. Also as before, record locking semantics must be used to avoid a case where two same-named users are assigned the same address if they are provisioned nearly simultaneously.

10.2 Customer onboarding (Internet-facing)

1. A new customer fills in an access request form.
2. The form should include a CAPTCHA to ensure that it is filled in by a person, rather than a (possibly malicious) script.
3. The user should be required to enter his existing e-mail address.
4. Form input should validate that the e-mail address is well formed.
5. Account activation may involve e-mail validation:
   (a) An activation URL is sent to the user's e-mail address.
   (b) The URL includes a pseudo-random string.
   (c) The user has to click through to the URL to activate the account.
   (d) Activation strings and un-activated accounts should be scrubbed periodically, for example when they are over 24 hours old.
6. This method ensures that all users have a globally-unique, already-remembered ID.
7. Password reset can be accomplished by sending an activation string to the user, just like account activation.

10.3 Renaming an existing employee login ID

1. Users may ask for a new ID in the event that their old ID was based on their name, which has since changed.
2. Organizational changes (mergers, acquisitions, etc.) may trigger renames to align naming standards.
3. In general, so long as a user has the same ID on all systems, it is safer to leave that ID alone and provision any new accounts for the same user with the pre-existing ID. Renames are dangerous since scripts or programs may explicitly refer to the old ID.
4. Where renaming a user is deemed essential, be careful to consider:
   (a) Scripts or programs that refer to the old ID.
   (b) Uniqueness of the new ID (it should not be used by any other user on any system).
   (c) Compatibility of the new ID with all systems, not just those which the user will access immediately.
5. Before renaming a user, notify him of the change, both so that he can sign in after it happens and so that he can quickly report problems that may have been caused by the change.

500, 1401-1 Street SE, Calgary AB Canada T2G 2J3
Tel: 1.403.233.0740 Fax: 1.403.233.0725 E-Mail: sales@hitachi-id.com
www.hitachi-id.com
Date: 2011-02-11
File: /pub/wp/documents/assigning-ids/managing-user-ids-1.tex