Institut für Statistik und Wahrscheinlichkeitstheorie, 1040 Wien, Wiedner Hauptstr. 8-10/107, Austria, http://www.statistik.tuwien.ac.at

A security analysis of the SecLookOn authentication system. K. Grill. Research report (Forschungsbericht) MS-2010-1, February 2010. Contact: P.Filzmoser@tuwien.ac.at
A Security Analysis of the SecLookOn Authentication System

Karl Grill

May 19, 2009

Abstract

We show that an attacker who is able to retrieve a sufficiently large number of challenges (about 100 in simple cases) can, with high probability, reconstruct the secret by a brute-force approach, even without knowing the correct responses. With the same approach, a known-plaintext attack against the simple system is likely to succeed with knowledge of about 25 challenge-response pairs.

Key words: authentication, brute force, known plaintext attack, unknown plaintext attack, password security.

1 Introduction

SecLookOn, invented by Helmut Schluderbacher and marketed by MERLINnovations, replaces the classical username and password authentication method by a graphical challenge-response system. Thus, the credentials are never completely revealed to an observer. This kind of authentication scheme may be of merit in an office situation where it is not feasible to keep customers from watching the login process, memorizing the username and password, and gaining access when the authorized person is absent.

Marketing statements by MERLINnovations, however, seem to indicate that this system is intended to be used as an all-purpose high-security authentication method, and that it is safe against all imaginable attacks. Unfortunately, up to now, there is little theoretical evidence to support these claims. Consequently, there has been some criticism [5, 6, 7, 8]. Steinkamp [8] lists some attack scenarios that do not seem to be properly handled by the SecLookOn system. A security report [1] commissioned by MERLINnovations was conducted under very tight time restrictions and thus only addresses the most obvious points. It must be noted, however, that it does mention the problems that make our approach work.
Apart from this report, the only publicly visible effort to provide some kind of proof of the system's security was a "hack the key" contest held in 2008, which promised an Apple iPhone to whoever could determine the key secret from 333 challenge-response pairs of a bloated version of the basic system that we are dealing with.

After describing the basic system, we will present two attacks that can be used against this scheme: first, we will only assume that a sufficiently large number of challenges have been collected; then we will discuss the known-plaintext case where the correct responses are also known.

Author's address: Institut für Statistik und Wahrscheinlichkeitstheorie, TU Wien, 1040 Wien, Austria, e-mail: grill@ci.tuwien.ac.at, Tel.: +436991/9678101

2 The SecLookOn System

The SecLookOn system presents the user with two images, each of which consists of $6 \times 6$ smaller images (in the sequel, we will number the rows and columns from 0 to 5 and use coordinate pairs $(i, j)$ to refer to row $i$, column $j$). In the first image, the smaller images are randomly selected from a set of 9 predefined images. In the second image, the smaller images are constructed from

- the background, which has a color that is selected from a given set $C$ of six colors,
- one or more symbols, each selected from a set of six possible shapes and having a color from the same set $C$,
- in front of all the above, a digit from the range 0 to 9, again with a color from the same set $C$ (in each challenge, exactly six of the ten possible digits are present).

The colors and shapes are jointly referred to as properties. The type of property (e.g., the top left symbol or the color of the digit) is also referred to as a dimension, and, for each dimension, its actual value is picked from a set of six possible choices, which we will number from 0 to 5.
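The secret introduced below is built from "blocks", that is, connected sets of six squares in one of the $6 \times 6$ images. To give a feeling for how small this part of the search space is, the following Python sketch enumerates all such blocks. It assumes that "connected" means connected via shared edges (the text does not spell this out), and the function name `enumerate_blocks` is ours, not part of any SecLookOn software.

```python
def enumerate_blocks(size=6, cells=6):
    """Return all edge-connected sets of `cells` squares in a size x size grid.

    Assumption: two squares are neighbours when they share an edge.
    Every connected set of k squares arises from a connected set of
    k - 1 squares by adding one neighbouring square, so we grow the
    sets one square at a time and deduplicate with frozensets.
    """
    current = {frozenset([(r, c)]) for r in range(size) for c in range(size)}
    for _ in range(cells - 1):
        grown = set()
        for block in current:
            for (r, c) in block:
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    inside = 0 <= nr < size and 0 <= nc < size
                    if inside and (nr, nc) not in block:
                        grown.add(block | {(nr, nc)})
        current = grown
    return current


blocks = enumerate_blocks()
print(len(blocks))
```

Under the edge-adjacency assumption the enumeration yields 2816 blocks, in agreement with the count used throughout the paper.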
The secret that is shared between the user and the system consists of

- a connected set $B_1$ of six squares in the left image (such a connected set will in the sequel be referred to as a block, and it should be noted that there are exactly 2816 such blocks),
- a number $K \in \{2, 3\}$,
- a set $I = \{I_1, \ldots, I_K\}$ of small images (from the set of 9 mentioned above) that will be relevant for the login process,
- an upper bound $T \in \{1, 2\}$ for the frequencies of the images from $I$ that are used in constructing the response,
- a block $B_2$ in the second picture,
- the number $D$ of dimensions,
- for each possible combination $f = (f_1, \ldots, f_K) \in \{0, \ldots, T\}^K$ a property $X_f$, which, in turn, is a pair $(X_{f1}, X_{f2})$ of a dimension and an associated value.

In order to prove her knowledge of the hidden secret, the user has to proceed as follows:

- The frequencies $F = (F_1, \ldots, F_K)$ of the images in $I$ inside the block $B_1$ are determined and truncated, i.e., a frequency greater than $T$ is replaced by $T$.
- These frequencies determine the property $X_F$ associated with the right response.
- The user finds $X_F$ in the block $B_2$, and the digit inside that small picture constitutes the response to be returned.

This procedure is repeated a number of times (4 to 10). Apart from these facts and a few minor details (which are not really important for our investigations), this is about all the official information that is provided by the purveyors of this system [2].

Inspection of a number of image pairs reveals the following observations (cf. [8, 1]):

- There are six special blocks $S_{ij} = \{3i, 3i + 1, 3i + 2\} \times \{2j, 2j + 1\}$, $i = 0, 1$, $j = 0, 1, 2$. In each of these blocks, for each of the property dimensions, each of the six possible values occurs exactly once. In the sequel, these blocks will be called the segments.
- The color values inside each small picture are all different.
- The digits are arranged in a fixed pattern, i.e., the squares of the second image can be divided into six sets, and the squares in each of these sets will always contain identical digits (which will be different for different challenges, of course; again, this does not affect our investigations).

The main problem, and the one that is behind our attack, is that the user has to be presented with a valid, unambiguous response. This means that the property $X_F$ must be present in the block $B_2$, and if there is more than one occurrence of this property, each of them has to show the same digit. This is not a problem if $B_2$ is one of the segments.
Otherwise it is a nontrivial problem, and it has apparently been solved (although this is not really documented) by forcing the property $X_F$ to appear exactly once in $B_2$.

3 A Brute Force Attack

In view of the above, we may try the following (assuming that the parameters $K$, $T$, and $D$ are known; the last one can be found by scrutinizing a few challenges, the other two may be known from other channels or can simply be tried in turn, as there are only two choices for each):
1. Fix a pattern $f_0$.
2. Successively try each possible combination $B_1$, $B_2$, $I$, $X_{f_0}$ on one challenge. If the frequency pattern $F$ resulting from $B_1$ and $I$ equals $f_0$, and if $X_{f_0}$ does not occur exactly once inside $B_2$, then we can exclude this combination from subsequent trials.
3. Repeat this process on the $n$ challenges that we have (or until there is only one combination left).
4. Repeat the above for other patterns $f$, using the information from the previous rounds to reduce the search space.

This method has its limitations: if the true $B_2$ is one of the segments, then the uniqueness condition is satisfied by design. Hence, if we choose a segment as our $B_2$, step 2 above will not remove anything, and we will always get the segments as (probably false) positives. We will therefore exclude the segments from our searches. If the blocks are chosen with equal probabilities when the user accounts are created (this has to be done if the number of bits in the key is to be used to its full extent), only a fraction $6/2816$ of secrets will contain a segment as $B_2$, so this is not a severe restriction.

In addition, there are 87 pairs of complementary blocks, in the sense that the members of such a pair are disjoint and that their union coincides with the union of two adjacent segments. If one member of such a pair contains exactly one occurrence of a property, then this will also be true for the other member, so the two cannot be distinguished by this procedure. In all other cases, our approach should lead to success if sufficiently many challenges can be checked against.

The questions that remain are how large $n$ should be in order to be considered sufficiently large, and how expensive the computations are. Regarding the first question, we use a simple heuristic argument. Namely, first observe that going through our procedure is tantamount to checking all possible choices for $B_1$, $B_2$, $I$ and $(X_f, f \in \{0, \ldots, T\}^K)$ against our $n$ challenges.
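The elimination procedure above can be sketched in Python as follows. This is only an illustration under assumptions of our own: squares are indexed as `row * 6 + col`, each challenge is given as a pair of a dictionary `first_image` (square coordinates to image id) and a dictionary `prop_masks` (property to 36-bit mask of the squares showing it), and all function names are hypothetical. It is not the program used for the experiments reported below.

```python
def mask_of(squares, size=6):
    """Encode a collection of (row, col) squares as a bit mask."""
    mask = 0
    for r, c in squares:
        mask |= 1 << (r * size + c)
    return mask


def occurs_exactly_once(prop_mask, block_mask):
    """True if exactly one square of the block shows the property."""
    hits = prop_mask & block_mask
    return hits != 0 and hits & (hits - 1) == 0


def truncated_pattern(first_image, block1, images, T):
    """Frequencies of the chosen images inside block1, capped at T."""
    return tuple(
        min(T, sum(1 for sq in block1 if first_image[sq] == img))
        for img in images
    )


def survives(candidate, challenge, f0, T):
    """Step 2: reject only if the pattern is f0 and X_f0 is not unique."""
    block1, images, block2_mask, prop = candidate
    first_image, prop_masks = challenge
    if truncated_pattern(first_image, block1, images, T) != f0:
        return True  # this challenge carries no information about X_f0
    return occurs_exactly_once(prop_masks.get(prop, 0), block2_mask)


def eliminate(candidates, challenges, f0, T):
    """Step 3: filter the candidates against the challenges in turn."""
    remaining = list(candidates)
    for challenge in challenges:
        remaining = [c for c in remaining if survives(c, challenge, f0, T)]
        if len(remaining) <= 1:
            break
    return remaining
```

Representing blocks and property occurrences as bit masks reduces the uniqueness test to one AND and one power-of-two check, which keeps the inner loop cheap.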
Assume that our $B_1$ is disjoint from the true $B_1$ and that no segment intersects both our $B_2$ and the true $B_2$ simultaneously. Furthermore, assume that all choices that are possible for the second image (as dictated by the uniqueness condition and by the observations mentioned above) have the same probability of being chosen in constructing the challenge. This makes what happens in the blocks of the true secret independent of what happens inside the blocks that we chose. Finally, note that the probability that a given property occurs in $B_2$ exactly once is bounded above by $p_0 = 26/36$ (this maximum is attained for a block that has 5 squares inside one segment and one outside). The probability that our choice will not be rejected can then be bounded above by $p_0^n$, because in any round one of our $X_f$'s will be checked and rejected with probability at least $1 - p_0$.

Heuristically assuming that this estimate pertains in the general case, and observing that we have 2816 choices for $B_1$ and $B_2$ each (actually, for the latter there are only 2810 choices to consider because we have excluded the segments), $\binom{9}{K}$ choices for $I$, and $6D$ choices for each of the $(T + 1)^K$ $X_f$'s, we arrive at the following estimate $p$ for the probability that there is some false choice that cannot be rejected (it can also be interpreted as an estimate for the expected number of false positives):

\[ p = 2816^2 \binom{9}{K} (6D)^{(T+1)^K} p_0^n. \]

Our heuristic estimate for $n$ is then determined by $p = \theta$, where $\theta$ is any bound we wish to impose on the probability of retaining a false positive. In the simplest case $K = 2$, $T = 1$, $D = 4$ and for $\theta = 1$ we obtain $n = 98$, for $K = 2$, $T = 2$ and $T = 6$ we get $n = 180$.

As to the complexity of this scheme, first observe that if $B$ is different from the true $B_2$ or $X$ is not among the true $X_f$'s, then there is an upper bound $p_1 = 11/12$ for the probability that $X$ occurs in $B$ exactly once (this upper bound is obtained as the maximum of the conditional probability that $x$ occurs in $b$ exactly once given that $x'$ occurs exactly once in $b'$, over all choices $(x', b') \neq (x, b)$, excluding the cases where $b$ is a segment and where $b$ and $b'$ are complementary; the computation is elementary but tedious). For these, on average, a proportion $p_2/12$ will be eliminated, where $p_2$ is the probability that the pattern $f_0$ is observed. Choosing the pattern $f_0$ appropriately, one can always achieve $p_2 \geq (T + 1)^{-K}$. The number of combinations that we have to check against the first challenge is

\[ L = 2816^2 \binom{9}{K} 6D. \]

Of these,

\[ M = 2 \cdot 2816 \binom{9}{K} (T + 1)^K \]

do not fit the above description. Of those that do, an expected number not exceeding $(1 - p_2/12)L$ will be checked against the second challenge, and so on, constituting a geometric series. So, we arrive at the following estimate for the total number of checks:

\[ N \leq Mn + 12L/p_2 \leq 5632 \binom{9}{K} (T + 1)^K (8448D + n). \]

Inserting the parameters of the actual systems, we arrive at $N$ between $10^{10}$ and $2 \cdot 10^{11}$. This is of an order of magnitude that is accessible to personal computers, especially since the check can be implemented very efficiently using bitfield operations.
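The heuristic sample size follows from solving $p = \theta$ for $n$, which gives $n = \ln\bigl(2816^2 \binom{9}{K} (6D)^{(T+1)^K} / \theta\bigr) / \ln(1/p_0)$. The short Python sketch below evaluates this expression; the function name `heuristic_n` and its keyword arguments are ours, and the code merely restates the estimate above rather than adding anything to it.

```python
from math import comb, log


def heuristic_n(K, T, D, theta=1.0, blocks=2816, p0=26 / 36):
    """Solve blocks^2 * C(9, K) * (6D)^((T+1)^K) * p0^n = theta for n."""
    candidates = blocks ** 2 * comb(9, K) * (6 * D) ** ((T + 1) ** K)
    return log(candidates / theta) / log(1 / p0)


# Simplest case from the text: K = 2, T = 1, D = 4, theta = 1.
print(heuristic_n(2, 1, 4))
```

For the simplest case this evaluates to roughly 98.9, in line with the value $n = 98$ quoted above. Because $n$ depends only logarithmically on the number of candidates and on $\theta$, even a much stricter bound $\theta$ adds comparatively few challenges.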
4 Practical Examples

The results in the last section are based on heuristics and so leave some doubts regarding their validity. We would therefore like to complement these considerations by a few experiments.

The first is a simple simulation experiment. We generated 100 samples of 1000 challenges for $K = 2$ and $T = 1$, choosing the blocks, the 2 pictures and the 4 properties uniformly from their respective ranges, generating the first picture in the obvious way and selecting the second picture uniformly among the allowed choices. The first $n$ challenges in each sample were processed for $n = 96$ and $n = 128$. At $n = 96$, 11 of the 100 secrets were not uniquely determined. At $n = 128$ all secrets were uniquely found, up to complementarity: 8 of the samples had a member of a complementary pair as their second block, so these were only determined up to this unavoidable ambiguity. Running times on an AMD 64 with 4800 MHz were below 3 minutes per sample.

Encouraged by this agreement with our previous considerations, we decided to turn to a real-life example, and chose the account Key-User-D2 described in [3]. Downloading the necessary number of login screens was rather easy; the most demanding task was the conversion of the data into machine-readable form. This problem was solved at $n = 96$ up to complementarity.

Finally, we tried to do some real hacking and crack an unknown account. We guessed a username by extrapolating the Key-User-D* sequence, obtained a number of login screens, went through another tedious conversion job and finally ran the data through our program. It turned out that this account had $T = 2$ and $K = 2$. At $n = 160$, all but $X_{22}$ were uniquely determined; getting this last one right needed $n = 576$. This is not really a surprise because the probability of obtaining the pattern $F = (2, 2)$ is less than one percent, so it may not show up at all in 100 trials. On the other hand, it is not really needed, because the chance that it occurs during the at most ten rounds of a single login is less than 0.1, so one can easily log in without this particular knowledge. As a last proof to ourselves, we logged into this account successfully.

5 A Known-Plaintext Attack

The same approach as in the last section can be used if the responses are known. Computationally, the only difference is that we also reject a combination if the digit in the uniquely determined square is different from the correct response.
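The additional rejection rule can be sketched as follows. As before, this is only an illustration under our own assumptions: squares are indexed as `row * 6 + col`, occurrences of a property and the candidate block are given as 36-bit masks, `digits` is a list of the 36 digits of the second image, and the function names are hypothetical.

```python
def unique_square(prop_mask, block_mask):
    """Index of the single square of the block showing the property, or None."""
    hits = prop_mask & block_mask
    if hits == 0 or hits & (hits - 1):
        return None  # property absent from the block or present more than once
    return hits.bit_length() - 1


def consistent_with_response(prop_mask, block_mask, digits, response):
    """Known-plaintext check: unique occurrence AND matching digit."""
    square = unique_square(prop_mask, block_mask)
    return square is not None and digits[square] == response
```

A candidate is kept only if both conditions hold in every round whose frequency pattern selects the property in question, so each observed response removes strictly more candidates than the uniqueness test alone.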
In this setting, the segments and complementary blocks lose their special status. Using this on the simulation data from above, we found that at $n = 32$ all but 12 of the 100 keys were uniquely determined; 5 of those could not even be determined uniquely at $n = 64$. The number of keys that were left in each case, however, was small enough that it would not pose a problem to try these out in sequence (at $n = 32$ there was one case with 10 possible keys left, one with 5, and the rest had 2 or 3; at $n = 22$ the maximum number of keys left was 49, the mean was 3.77, and more than half of the cases showed a unique result). Computation times were less than one minute.

In addition, we simulated a set of samples containing only segments because these were excluded in the original samples. As expected, the results were similar to those for the first set of samples.
6 WebLookOn

WebLookOn [9] is a reduced version of the SecLookOn system. In this variant, the choice of the two blocks is removed from the key: the user is only shown the six squares that are relevant for the determination of the response. The number of dimensions is four, and all dimensions are symbols, so the peculiarities of color dimensions are removed. Apart from that, it works the same as before: the occurrence or non-occurrence of two small images in the first picture determines which symbol to look for in the second picture, and the digit in the square that contains this symbol is the response. This should make it clear that the results from the last section give an upper bound on the security of this system. In fact, it has been observed (cf. the discussion in [4]) that the observation of three logins (or twelve challenge-response pairs) is enough to reduce the number of possible keys to a handful.

This system is simple enough that we can do some strict mathematics. Assume that $S_0$ is the true key, and that we check a different key $S$ against one challenge-response round created from $S_0$. If the probability that $S$ is rejected in this round is $p(S_0, S)$, and we have a total of $n$ rounds (which we assume to be independently uniformly distributed among all possible choices) to check it against, then the probability that $S$ is not rejected by the whole procedure is $(1 - p(S_0, S))^n$, and the expected number of keys left is

\[ N(n) = \sum_{S \neq S_0} (1 - p(S_0, S))^n. \]

$p(S_0, S)$ can be determined as follows: let $X(S) = (X_1(S), X_2(S))$ be the (random) symbol determined by strategy $S$ for the round under consideration, with $X_1$ denoting its dimension and $X_2$ its value. The key $S$ passes the round exactly when the symbols $X(S_0)$ and $X(S)$ are found in the same square, so

\[ p(S_0, S) = 1 - \sum_{x, y} P(X(S_0) = x, X(S) = y)\, q(x, y). \]

Here, $q(x, y)$ is the probability that the symbols $x$ and $y$ are found in the same square and can be calculated as

\[ q(x, y) = \begin{cases} 1 & \text{if } x = y, \\ 0 & \text{if } x_1 = y_1 \text{ and } x_2 \neq y_2, \\ 1/6 & \text{otherwise.} \end{cases} \]
The probability $P(X(S_0) = x, X(S) = y)$ is obtained as the sum of the probabilities of all frequency patterns that are associated with $x$ in key $S_0$, and with $y$ in $S$, which in turn can be calculated from the multinomial distribution. Our calculations showed that indeed the values of $N(12)$ are small for all possible choices of $S_0$, with its maximum value, 0.752426, attained for a key that uses one symbol for the "both pictures absent" case, and one single symbol from a different dimension for all the other cases. $N(22)$ is less than 2 for all keys, which is very nicely in tune with the results we obtained in the previous section.
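The multinomial computation of the frequency-pattern probabilities can be sketched as follows. The sketch assumes that each of the six relevant squares independently shows one of the 9 images with probability $1/9$ (an assumption on our part; the text only says that the images are selected randomly), and the function name `pattern_distribution` is ours. It covers a single key; the joint distribution for two keys $S_0$ and $S$ is obtained along the same lines by counting the images of both keys.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def pattern_distribution(K, T, squares=6, images=9):
    """Exact distribution of the truncated frequency pattern (F_1, ..., F_K).

    Assumption: every square shows each of the `images` pictures
    independently with probability 1 / images.
    """
    p_img = Fraction(1, images)           # one particular key image
    p_rest = Fraction(images - K, images)  # any image outside the key
    dist = {}
    for counts in product(range(squares + 1), repeat=K):
        used = sum(counts)
        if used > squares:
            continue
        rest = squares - used
        denom = factorial(rest)
        for c in counts:
            denom *= factorial(c)
        ways = factorial(squares) // denom  # multinomial coefficient
        prob = ways * p_img ** used * p_rest ** rest
        pattern = tuple(min(c, T) for c in counts)
        dist[pattern] = dist.get(pattern, Fraction(0)) + prob
    return dist


for pattern, prob in sorted(pattern_distribution(2, 1).items()):
    print(pattern, float(prob))
```

Under the same assumption, `pattern_distribution(2, 2)` assigns the pattern $(2, 2)$ the probability $5300/531441 \approx 0.00997$, which agrees with the remark in Section 4 that this pattern occurs with probability less than one percent.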
7 More Complex Systems

It may be conjectured that adding more features to the basic system described above could make attacks like the ones we described infeasible. Unfortunately, leaving aside the fact that memorizing a system like the one described below comes close to a vaudeville act, this is not necessarily the case. As an example, we consider the scheme that is presented as the solution of the hacking contest that was held in 2008. This scheme sports the following extensions:

- Instead of a single fixed block, there is a choice of ten blocks in the first picture. Which one of these is actually to be used is determined by the contents of a certain square in this picture.
- Similarly, in the second picture there is a choice of ten blocks, which is determined by the digit in a given square.
- Finally, shifts are added to the final result. This means that the final answer is not found in the square where the property in question was found, but, for example, two squares above it or three to the right.

The method from section 3 cannot determine the shifts, but the shifts do not interfere with it either, because they only affect the position of the right response, which this method does not use at all. If we can obtain about 10000 challenges, we can use a divide-and-conquer strategy: pick one square in the first image and one in the second (due to the fact that the digits are arranged in a fixed pattern, there are only six essentially different choices for the latter), and use only those challenges that have a previously fixed image/digit combination in these places. Running the program from section 3 on these 216 samples will take about 11 hours and narrow the search space sufficiently, so that the time used for subsequent searches for other combinations will become negligible. Thus, in less than a day, almost all of the key information will be revealed; only segments, ambiguous pairs and the shifts will not be properly determined (in the solution shown, there are no segments or members of complementary pairs among the second-picture blocks, so, for this particular case, only the shifts would be missing). These remaining parts can be found if a (substantially smaller) number of correct responses is known.

8 Conclusion

Summing up, it seems that SecLookOn does not really provide the amount of security that is claimed. In particular, the known-plaintext attack from section 5 makes it questionable whether it really fulfills its original purpose, namely to avoid revealing the login secret to observers.
References

[1] T. Dübendorfer, Gutachten Sicherheitsanalyse von SecLookOn, (2008). Available: http://www.seclookon.com/seclookon/study.asp

[2] MERLINnovations & Consulting GmBH, SecLookOn: Beschreibung des Verfahrens und Lösung für Man-in-the-Middle Attacken, (2007). Available: http://www.seclookon.com/seclookon/white paper.asp

[3] MERLINnovations & Consulting GmBH, SecLookOn-Schlüssel für user Key-User-D2, (2008). Available: http://www.seclookon.com/seclookon/fix accounts.asp

[4] M. Mrak, Gedankensplitter, (2009). Available: http://gedankensplitter.mrak.at/2009/03/weblookon-eine-innovativeneue.html

[5] R. Oppliger, esecurity Communications, vol. 5, no. 1 (2008). Available: http://www.esecurity.ch/communications.html

[6] R. Oppliger, esecurity Communications, vol. 5, no. 2 (2008). Available: http://www.esecurity.ch/communications.html

[7] R. Oppliger, esecurity Communications, vol. 6, no. 1 (2009). Available: http://www.esecurity.ch/communications.html

[8] M. Steinkamp, Analyse des Verfahrens SecLookOn der Firma MERLINnovations, (2008). Available: email://steinmar@freenet.de

[9] WebLookOn GesmbH, WebLookOn, (2009). Available: http://www.weblookon.com