An Analysis of Keystroke Dynamics Use in User Authentication
Sam Hyland (0053677)
Last Revised: April 7, 2004
Prepared For: Software Engineering 4C03

Introduction

Authentication is an important factor in computer and network security. Whether it is controlling access to company resources or verifying the billing of a customer on Amazon.com, verifying a user's identity will always be an integral part of a secure system. There are many techniques currently used in authentication. The most common is the password. Passwords are convenient: they are easily implemented in software, require no specialized hardware, and are familiar to users. They also suffer from many flaws. Users frequently share passwords, forget passwords, and select poor passwords that are easily defeated. This has spawned research into alternatives that supplement or supplant the common password.[Joyce & Gupta]

Password Alternatives

Physical security measures, such as access cards or keys, are one such alternative. They are reliable provided the physical devices are not lost or stolen. Frequently they are paired with simple passwords to limit the loss of security after theft, for example the PIN associated with automatic teller machines. Physical security techniques are useful but rely heavily on expensive customized hardware and software products.[Joyce & Gupta]

The newest alternative access technique is known as biometric identification. Biometrics measures quantities or qualities of an individual in order to determine their identity. Traits measured by biometrics may be divided into two broad categories: physiological and behavioral.[Bergadano et al.] Physiological traits are qualities of the body that change little over time, for example patterns in fingerprints and retinas or the shape of one's face. These techniques have proven remarkably effective in identification tasks. However, like physical techniques, they require specialized hardware that makes them impractical for all but the most specialized of applications.

The second form of biometric identification, behavioral identification, measures quantities related to the actions an individual takes, for example their handwriting, voice patterns or how they type. Measuring the nature of an individual's typing is a technique known as keystroke analysis. Behavioral properties are often harder to measure than physiological properties, as they may change significantly over time and can be consciously influenced.[Bergadano et al.] For example, patterns in an individual's keystrokes may change even between samples.

Basics of Keystroke Analysis

Keystroke analysis, as a method of behavior-based biometric identification, is based on measuring aspects of an individual's keystrokes. There are two primary measures that can be taken: duration and latency. Duration measures how long a given key is held down.
Latency measures the time between keystrokes,[Bergadano et al.] in other words the time between releasing one key and pressing another. Actual implementations apply these measures in different ways. For example, they may measure keystrokes in groups of three, taking the time from when the first key is pressed until the third key is released as their measure. By statistically analyzing the results, keystroke analysis can produce surprisingly accurate results.

In order for keystroke analysis to perform successfully, a technique must be developed that measures how likely it is that two typing samples originate from the same user. In the literature this is often referred to as the distance between samples. Given samples A and B, we denote this as d(A, B). Techniques to calculate this distance are discussed in the literature. In most authentication tasks there will be N samples, S1, S2, ..., SN, for each user. A new reading X is taken and for each user a mean distance md = (d(S1, X) + d(S2, X) + ... + d(SN, X)) / N is calculated. The user with the smallest md is likely the one providing the new sample.[Bergadano et al.]

Performance

The performance of keystroke analysis during simple classification tasks is near perfect. In classification tasks, samples have been taken from all users. Real-life authentication is much more challenging, as users unknown to the system may attempt illegitimate access. To prevent this, a threshold must be set that the distance must fall within for a user to be accepted into the system. When designing an authentication system there are two important measures: the False Alarm Rate (FAR) and the Intruder Pass Rate (IPR). FAR measures the percentage of users that are permitted to access the system but are rejected by the authentication tools. IPR measures the percentage of users not permitted access to the system that are accepted by the software.[Bergadano et al.] Ideally the IPR should be close to zero and the FAR low. Often decreasing the IPR has the effect of increasing the FAR and vice versa. Exactly what numbers are acceptable depends on the application.

Table 1: FAR & IPR (see note 1)

Study                        FAR        IPR
Bergadano et al.             0-7.2%     0-2.3%
Leggett and Williams 1988    5.5%       5.0%
Joyce and Gupta 1990         16.36%     0.25%
Bleha et al. 1990            3.1%       0.5%
Brown and Rogers 1993        4.2-40%    0%*

Note 1: Data for table extracted from Bergadano et al. Refer to Bergadano et al. for full descriptions of the studies mentioned.

Table 1 summarizes some of the major results in keystroke analysis, with their IPR and FAR. Each of the studies used different techniques for comparing samples. Brown and Rogers' impressive score of 0% for IPR (marked * in the table) was a result of tuning their method to achieve that measure. An IPR of 0.25% is still too high for any application requiring high levels of security. As such, keystroke analysis is not yet ready for use as a primary authentication technique.
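
The mean-distance rule, the acceptance threshold, and the two error rates described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of any study in Table 1. The event format (key, press time, release time), the function names, and the distance itself (mean absolute difference between timing vectors) are all hypothetical stand-ins; the literature uses more refined distance measures.

```python
from statistics import mean


def timing_features(events):
    """Turn one typing sample into durations followed by latencies.

    events: list of (key, press_ms, release_ms) tuples in typing order
    (an assumed format, chosen for illustration).
    """
    durations = [release - press for _key, press, release in events]
    # Latency: time from releasing one key to pressing the next one.
    latencies = [nxt[1] - cur[2] for cur, nxt in zip(events, events[1:])]
    return durations + latencies


def distance(a, b):
    """Assumed stand-in for d(A, B): mean absolute timing difference."""
    if len(a) != len(b):
        raise ValueError("samples must cover the same text")
    return mean(abs(x - y) for x, y in zip(a, b))


def mean_distance(samples, x):
    """md = (d(S1, X) + d(S2, X) + ... + d(SN, X)) / N."""
    return mean(distance(s, x) for s in samples)


def identify(profiles, x):
    """Classification: return the user whose samples are closest to X."""
    return min(profiles, key=lambda user: mean_distance(profiles[user], x))


def authenticate(profiles, claimed_user, x, threshold):
    """Authentication: accept X only if md is within the threshold."""
    return mean_distance(profiles[claimed_user], x) <= threshold


def error_rates(profiles, genuine, impostor, threshold):
    """Return (FAR, IPR) as fractions.

    genuine:  (user, sample) attempts by the legitimate user.
    impostor: (claimed_user, sample) attempts by someone else.
    """
    rejected = sum(not authenticate(profiles, u, s, threshold) for u, s in genuine)
    accepted = sum(authenticate(profiles, u, s, threshold) for u, s in impostor)
    return rejected / len(genuine), accepted / len(impostor)


if __name__ == "__main__":
    # Made-up timing vectors in milliseconds, two stored samples per user.
    profiles = {
        "alice": [[100, 110, 50], [104, 106, 54]],
        "bob": [[150, 160, 90], [154, 158, 86]],
    }
    x = [102, 108, 52]
    print(identify(profiles, x))
    print(authenticate(profiles, "bob", x, threshold=10))
```

Raising the threshold in `authenticate` rejects fewer legitimate users (lower FAR) but lets more impostors through (higher IPR), which is the tradeoff described above.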

Implementations

Accuracy is not the only concern with keystroke analysis. The more accurate results in Table 1 were achieved using relatively long sample texts. Good results often require one or two paragraphs to be typed, often without mistakes. This can easily take a minute or more, which is unacceptable for simple authentication tasks.

Keystroke analysis can, however, be used to supplement passwords. For example, as passwords are easily forgotten, many websites ask personal questions during registration to aid in password recovery. Often the answers to these questions are easily forgotten or easily guessed. Rather than answer personal questions, users could be asked to provide typing samples during registration. If they forget their passwords, keystroke analysis techniques could be used for authentication.[Bergadano et al.]

Another use of keystroke analysis is to provide additional security to existing passwords. Keystroke analysis could be performed on a user's login and password as they connect to a system.[NetNanny] This provides an additional layer of security and makes the sharing of passwords useless, effectively discouraging the practice. This approach has been taken by NetNanny with their BioPassword software. BioPassword works through keystroke analysis techniques developed at Stanford during the late 1970s and early 1980s. It was originally conceived as a hardware product, but as computers became faster and more reliable it made the shift to software. It replaces the Microsoft Windows login framework and provides both password-based authentication and keystroke analysis. In order to deal with the FAR and IPR tradeoff, an administrator can tune the software to 10 different levels.
It has been successfully marketed and tested inside large governmental organizations such as the FBI, CIA and NSA, and large businesses such as the Chase Manhattan Bank, Dupont, Exxon, GE Capital and the New York Stock Exchange.[NetNanny] It has been given favorable mentions by both CNN[Fontana] and the computing press[Munro]. BioPassword is already in use by several groups, including the web-based music distribution company Musiccrypt.[Musiccrypt Inc.]

Conclusion

The moderate success of BioPassword demonstrates that keystroke analysis is a valid security tool. Work in this area can still be considered to be in its infancy. As more research is done, the results can be expected to improve. Keystroke analysis has proven itself capable of providing additional security above and beyond conventional passwords. The lack of special hardware and the resulting low cost make it a popular area in biometrics research. As the field matures, keystroke analysis will become more efficient and more visible in the security field.

References

Bergadano, F., D. Gunetti and C. Picardi (November 2002). User Authentication through Keystroke Dynamics. ACM Transactions on Information and System Security, 5, (4), 367-397. Accessed Online through: http://doi.acm.org/10.1145/581271.581272.

Fontana, J. (2000, Dec 6). Biometrics software aimed at improving Windows NT security. CNN. Accessed Online through: http://www.cnn.com/2000/tech/computing/12/26/biometrics.software.idg/index.html.

Joyce, R. and G. Gupta (February 1990). Identity Authentication Based on Keystroke Latencies. Communications of the ACM, 33, (2), 168-176. Accessed Online through: http://doi.acm.org/10.1145/75577.75582.

Munro, J. (2001, Sept 24). BioPassword 4.5: Hardware free Biometrics. PC Magazine. Accessed Online through: http://www.pcmag.com/article2/0,1759,38615,00.asp.

Musiccrypt Inc. (2004, March 25). Website. Accessed Online through: http://musiccrypt.com.

NetNanny Software (2001). Technical Report: BioPassword Keystroke Dynamics. Accessed Online through: http://www.biopassword.com/home/technology/bp%204.5%20technical%20paper.pdf.