Counting, Generating, and Solving Sudoku

Mathias Weller

April 21, 2008

Abstract

In this work, we give an overview of the research done so far on the topics of Sudoku grid enumeration and of solving and generating Sudoku puzzles. We examine possible extensions and generalizations of previous work on solving and generating Sudoku puzzles, focusing mainly on rule-based solvers. We describe a possible way to influence the difficulty of a generated Sudoku puzzle and introduce new deduction rules for solving a puzzle, based on the rules described by David Eppstein in his paper Nonrepetitive Paths and Cycles in Graphs with Application to Sudoku. We then generalize these new rules further, leading to an efficient constraint propagation algorithm that is able to solve puzzles that could not be solved by applying only Eppstein's deduction rules. We explain the implementation of this strategy and how it may be used to implement the special cases, followed by a practical evaluation of the solving power of all presented solvers.
Contents

1 Introduction
  1.1 The Sudoku Game
  1.2 Sudoku Variants
2 Prior Work
  2.1 Counting Sudoku Grids
  2.2 Complexity
    2.2.1 Short Introduction to NP-completeness
    2.2.2 Sudoku Decision Problem
    2.2.3 Complexity of the Sudoku Decision Problem
  2.3 Generating Sudoku Puzzles
    2.3.1 Incremental Generation
    2.3.2 Decremental Generation
  2.4 Judging the Difficulty of Generated Sudoku Puzzles
  2.5 Finding Solutions to Sudoku Puzzles
    2.5.1 Solving Sudoku Puzzles via Backtracking
    2.5.2 Solving Sudoku Puzzles via Constraint Programming
    2.5.3 Solving Sudoku Puzzles via Logic Deduction
  2.6 Graph Coloring
3 Generalization and Contribution
  3.1 Counting Sudoku
  3.2 Generating Sudoku Puzzles
    3.2.1 Finding a Full Sudoku Grid
    3.2.2 Deletion Witnesses
  3.3 Judging the Difficulty of Generated Sudoku Puzzles
  3.4 Finding Solutions to Sudoku Puzzles
    3.4.1 Extension of Bilocation and Bivalue
    3.4.2 Group-Modified Rules
    3.4.3 Limited Constraint Propagation
4 Experimental Results
5 Outlook and Future Work
1 Introduction

Sudoku (or Number Place, as it is called in the US) is a well-known logic puzzle, popular for its appearance in newspapers and magazines. Its popularity shows in examples like a Japanese restaurant in Boston that hands out 10-dollar gift certificates to patrons who can finish a Sudoku puzzle before their sushi is served. There are official tournaments in Europe and the US that offer monetary prizes [Sem05]. Solving a Sudoku puzzle is usually very satisfying for the puzzler, or, to quote Henry Dudeney, "A good puzzle, like virtue, is its own reward" [Dud02].

Sudoku is a derivative of the Latin Square, a puzzle first described by Leonhard Euler in 1783. The Sudoku puzzle was first created for Dell Magazines by Howard Garns, an architect from Indianapolis, and introduced to the US public in 1979. It was not until seven years later that Sudoku became successful in Japan, where it was first published by the Nikoli company under its current name, which is Japanese for "single number". At the beginning of the 21st century, the puzzle spread all over the world. This international success partially relies on using numbers instead of letters or words [Sem05].

When solving Sudoku puzzles, one naturally stumbles upon a variety of questions: Does my puzzle have a solution? If so, is it the only one? If not, how many solutions are there, and is there a systematic way of determining all of them? Does the puzzle become harder if there are fewer hints? What is the minimum number of hints needed to ensure a unique solution? In this article, we will consider some of these and other questions.

1.1 The Sudoku Game

Sudoku is a puzzle game played on a grid that consists of $9 \times 9$ cells, each belonging to three groups: one of nine rows, one of nine columns and one of nine blocks (sometimes called boxes or subsquares).
Three blocks in a row are called a band, three vertically stacked blocks are called a stack, and a chute is either a band or a stack (see Figure 1). A Sudoku grid is full if each group contains the numerals from 1 to 9 exactly once. Figure 1 shows a full Sudoku grid. A Sudoku puzzle is a partially filled Sudoku grid: a set of fixed cells (cells whose numerals are given, i.e. cannot be chosen by the solver), also called hints or clues, is provided by a puzzle master, whereas the other cells are blank. Figure 2 shows a possible Sudoku puzzle for the grid in Figure 1. The objective of the game is to fill the grid by assigning a numeral to each blank cell in such a way that each numeral is unique in each of its three groups. A solution to a Sudoku puzzle is a full Sudoku grid that is consistent with the puzzle, meaning that all hints of the puzzle appear in the full grid as well. Figure 1 is a solution to the puzzle in Figure 2. A Sudoku puzzle is called proper or unisolvent if it has exactly one solution, ambiguous if it has more than one solution, and invalid if it has no solution at all, due to contradicting hints. Most daily newspaper Sudoku puzzles provide about 28 to 30 clues, but for the difficulty of the puzzle, the number of hints matters less than the complexity of the logical leaps required to assign numerals to the blank cells
1 3 8 2 7 6 5 4 9     1 3 8 2 7 6 5 4 9
2 7 5 4 1 9 8 6 3     2 7 5 4 1 9 8 6 3
6 4 9 5 8 3 1 7 2     6 4 9 5 8 3 1 7 2
5 8 3 1 6 7 2 9 4     5 8 3 1 6 7 2 9 4
9 1 6 3 4 2 7 5 8     9 1 6 3 4 2 7 5 8
7 2 4 9 5 8 3 1 6     7 2 4 9 5 8 3 1 6
3 5 2 7 9 4 6 8 1     3 5 2 7 9 4 6 8 1
4 6 1 8 3 5 9 2 7     4 6 1 8 3 5 9 2 7
8 9 7 6 2 1 4 3 5     8 9 7 6 2 1 4 3 5

Figure 1: A full Sudoku grid. On the right, the first band and the second stack are marked.

Figure 2: A proper Sudoku puzzle.
[Sem05]. In contrast to its minor role for the difficulty, the number of clues plays a crucial role in determining the properness of a puzzle. So far, no proper $9 \times 9$ Sudoku puzzle with fewer than 17 hints is known, whereas there are several proper puzzles with exactly 17 hints. The minimal number of hints necessary for an $n \times n$ puzzle to have a unique solution is not yet known [HM07], although a lower bound of $n - 1$ is easy to prove: if a puzzle has only $n - 2$ hints, then at least two numerals do not appear among the hints. These two numerals may be exchanged throughout any solution to the puzzle in order to obtain another solution, and thus the puzzle cannot be proper (see also Section 2.6).

In order to analyze the complexity of solving Sudoku puzzles, we parametrize them. Possible parameters are the size of each group, meaning the number of different numerals ($n = 9$), the order of the Sudoku grid, meaning the length of a side of the blocks ($m = \sqrt{n} = 3$), or the number of cells in the Sudoku grid ($h = n^2 = 81$). Since they are all polynomial in one another, an algorithm that is efficient with regard to any of these parameters is efficient with regard to all of them. For reasons of compatibility with other papers, $n$ will refer to the number of different numerals in the remainder of the article, unless explicitly stated otherwise. Though it is common to use numerals to fill the Sudoku grid, letters, pictures or any kind of items that can be divided into at least $n$ disjoint equivalence classes are suitable as well.

Sudoku is closely related to the Latin Square problem: given an $n \times n$ square of cells and a set of fixed cells, find a completely filled $n \times n$ grid that extends the fixed cells such that each item is unique in its column and row, while still using only $n$ different types of items. Figure 3 shows an example of a Latin Square puzzle and its solution.
6 8 4 3 7 2 1 5 9
1 3 6 9 4 5 7 2 8
9 4 2 6 1 8 3 7 5
7 5 3 2 8 4 9 6 1
8 6 7 1 5 3 4 9 2
5 9 1 8 2 7 6 3 4
3 2 8 4 9 6 5 1 7
4 7 9 5 3 1 2 8 6
2 1 5 7 6 9 8 4 3

Figure 3: A $9 \times 9$ Latin Square puzzle and its solution (only the solution grid is reproduced here).
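The definitions of Section 1.1 can be made concrete with a small check. The following is a minimal sketch, not taken from the original work: the function names, the list-of-rows representation, and the use of 0 for blank cells are assumptions made for illustration. It tests whether a grid is full, whether it is merely a Latin Square, and whether it is a solution to a given puzzle, using the grid of Figure 1 as data.

```python
from math import isqrt

# Full grid from Figure 1, one list per row.
FIGURE_1 = [
    [1, 3, 8, 2, 7, 6, 5, 4, 9],
    [2, 7, 5, 4, 1, 9, 8, 6, 3],
    [6, 4, 9, 5, 8, 3, 1, 7, 2],
    [5, 8, 3, 1, 6, 7, 2, 9, 4],
    [9, 1, 6, 3, 4, 2, 7, 5, 8],
    [7, 2, 4, 9, 5, 8, 3, 1, 6],
    [3, 5, 2, 7, 9, 4, 6, 8, 1],
    [4, 6, 1, 8, 3, 5, 9, 2, 7],
    [8, 9, 7, 6, 2, 1, 4, 3, 5],
]


def groups(n):
    """Return all groups (rows, columns, blocks) as lists of (row, column) pairs."""
    m = isqrt(n)
    if m * m != n:
        raise ValueError("this sketch assumes square blocks, i.e. n = m * m")
    rows = [[(r, c) for c in range(n)] for r in range(n)]
    cols = [[(r, c) for r in range(n)] for c in range(n)]
    blocks = [[(br + r, bc + c) for r in range(m) for c in range(m)]
              for br in range(0, n, m) for bc in range(0, n, m)]
    return rows + cols + blocks


def is_full(grid):
    """A grid is full if every group contains each numeral 1..n exactly once."""
    n = len(grid)
    numerals = set(range(1, n + 1))
    return all({grid[r][c] for r, c in group} == numerals for group in groups(n))


def is_latin_square(grid):
    """Like is_full, but only rows and columns are checked (no blocks)."""
    numerals = set(range(1, len(grid) + 1))
    return (all(set(row) == numerals for row in grid)
            and all(set(col) == numerals for col in zip(*grid)))


def is_solution(grid, puzzle):
    """A solution is a full grid that agrees with every hint (0 encodes a blank cell)."""
    return is_full(grid) and all(
        hint in (0, value)
        for hint_row, value_row in zip(puzzle, grid)
        for hint, value in zip(hint_row, value_row))
```

Every full Sudoku grid passes `is_latin_square`, but not the other way around: a cyclically shifted square such as `[[(r + c) % 9 + 1 for c in range(9)] for r in range(9)]` is a Latin Square whose blocks contain repeated numerals.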
Figure 4: An $8 \times 8$ Sudoku puzzle with $2 \times 4$ groups.

Figure 5: The 12 pentomino groups.

1.2 Sudoku Variants

Although this article is mainly about plain Sudoku as described above, a short introduction to Sudoku variants may be of interest to the reader. As already mentioned, a Sudoku grid may have any dimension $n$, although $9 \times 9$ puzzles are by far the most common. There are also puzzles that do not have the same number of stacks and bands. An example is the $8 \times 8$ Sudoku puzzle shown in Figure 4. The groups may be irregular, which allows for $5 \times 5$ grids with pentomino groups (groups of irregular shape that contain exactly five cells, see Figure 5). This type of puzzle is also known as Logi-5.

Apart from geometric differences, there are several Sudoku variants that impose new rules or modify the existing rules of the puzzle. The Sudoku X variant requires the numerals in the cells on the diagonals to be unique for each diagonal (see Figure 6) [Mon05]. The Hypersudoku variant, also called Windoku, consists of a normal Sudoku grid that is supplemented with additional regions that have to contain each numeral exactly once. These regions overlap the blocks, thereby giving additional information (see Figure 6). The Samurai Sudoku variant consists of five $9 \times 9$ Sudoku puzzles arranged in a quincunx (a formation of five entities similar to a cross, for example the five dots on a side of a die) such that the grid in the middle is overlapped by the other four grids in its four corners. The middle grid thus shares one block with each of the other grids, while the outer grids are disjoint (see Figure 7) [Tel06]. The circular Sudoku variant employs a circular formation of cells that is divided into sectors and rings. Each cell has to be assigned a numeral such that each ring and each pair of neighboring sectors contain each numeral exactly once (see Figure 8) [PMH06]. A variant combining the idea of the Rubik's cube with Sudoku puzzles is the Sudokucube, a $3 \times 3 \times 3$ cube that can be solved by turning planes of subcubes in such a way that each side becomes a valid Sudoku grid. Hence the cube contains the numbers 1 to 9 exactly 6 times each. Variants that use letters instead of numerals may require a valid word to be formed at some place in the grid. For almost every Sudoku variant, there is another variant with the nonconsecutive property, meaning that no two neighboring cells may be assigned consecutive numerals. Other variants may modify the way in which hints are given. For example, the 2005 U.S. Puzzle Championship featured a puzzle that contained ranges of numerals as hints.

5 4 6 1 2 7 8 3 9
8 9 7 4 5 3 2 1 6
2 3 1 9 6 8 7 5 4
1 7 8 6 4 5 3 9 2
6 2 9 7 3 1 5 4 8
3 5 4 8 9 2 6 7 1
7 1 2 5 8 9 4 6 3
9 6 3 2 7 4 1 8 5
4 8 5 3 1 6 9 2 7

Figure 6: Left: A $9 \times 9$ Sudoku X grid. Right: A $9 \times 9$ Hypersudoku grid.

2 Prior Work

This chapter introduces publications about Sudoku puzzles. We start with the problem of counting possible Sudoku grids in Section 2.1. A general complexity consideration in Section 2.2 leads to the topics of generating (Section 2.3) and solving (Section 2.5) Sudoku puzzles. Section 2.4 addresses the problem of judging the difficulty of a generated puzzle. We show parallels to the Graph-$n$-Coloring problem in Section 2.6. Chapter 3 introduces thoughts and ideas developed from the approaches of Chapter 2, and finally, results of applying some of these ideas are given in Chapter 4.
Figure 7: A $9 \times 9$ Samurai Sudoku grid.

Figure 8: A circular Sudoku puzzle with $n = 8$.
2.1 Counting Sudoku Grids

This section is a short summary of what has been done so far on the topic of determining the number of full Sudoku grids of specific dimensions. For a more detailed view, please refer to the literature given in this section.

First of all, we are interested in the number of different full Sudoku grids of a certain order. To calculate this number, we first need to define what it means for two Sudoku grids to be different. To this end, we provide an equality relation that relates equal Sudoku grids, so that two grids that are not related are considered different. For the following lemmas, two Sudoku grids are considered equal if every cell of one grid contains the same numeral as the cell at the same position in the other grid. This equality relation will be referred to as $E$.

Lemma 2.1 ([HM07]) There are $N_{4 \times 4} = 288$ valid full $4 \times 4$ Sudoku grids.

Lemma 2.2 ([FJ06]) There are $N_{9 \times 9} = 6{,}670{,}903{,}752{,}021{,}072{,}936{,}960$ valid full $9 \times 9$ Sudoku grids.

Remark The lemmas were proved using a combination of symmetry considerations and brute force calculation. This approach has not yet allowed the exact number of valid $16 \times 16$ Sudoku grids to be calculated, so this remains an open problem.

These results may suffice for the time being, but it may be unsatisfying that a Sudoku grid and the same grid rotated by $90^\circ$ are counted as different. Hence another equality relation is presented:

Definition Let $S$ denote the set of all full Sudoku grids of a certain order. A transformation $t : S \to S$ is called validity-preserving. Let $T$ be a set of validity-preserving transformations. We define the equality relation $E_T \subseteq S \times S$ with

$$(s_1, s_2) \in E_T \;\Leftrightarrow\; \exists k \in \mathbb{N}\ \exists t_1, \ldots, t_k \in T : ((t_1 \circ t_2 \circ \ldots \circ t_k)(s_1), s_2) \in E.$$

Remark The relation $E_T$ relates two grids iff one can be transformed into the other by using only transformations in $T$. Note that being validity-preserving is invariant with respect to composition, thus $t_1 \circ t_2 \circ \ldots \circ t_k$ is a validity-preserving transformation but is not necessarily in $T$.

So far, a number of transformations that preserve the validity of a Sudoku grid are known. For example, a Sudoku grid may be rotated by a multiple of $90^\circ$ without affecting its validity. Furthermore, it is possible to permute the numerals throughout the entire grid without changing the validity of the grid, because the items in a Sudoku grid are generally not ordered. More transformations will be mentioned later in this section, and additional possibilities will be discussed in Section 3.2.1.
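As a small illustration of validity-preserving transformations, the sketch below applies a rotation, a transposition and a numeral permutation to a $4 \times 4$ grid and checks that the results are still full grids. The function names and the list-of-rows representation are assumptions of this sketch, and the example grid is chosen only for illustration.

```python
from math import isqrt

# A full 4 x 4 Sudoku grid used as example data.
GRID = [
    [1, 2, 3, 4],
    [3, 4, 1, 2],
    [2, 1, 4, 3],
    [4, 3, 2, 1],
]


def is_full(grid):
    """Check that every row, column and block holds each numeral exactly once."""
    n = len(grid)
    m = isqrt(n)  # side length of a block; this sketch assumes n = m * m
    numerals = set(range(1, n + 1))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    blocks = [{grid[br + r][bc + c] for r in range(m) for c in range(m)}
              for br in range(0, n, m) for bc in range(0, n, m)]
    return all(group == numerals for group in rows + cols + blocks)


def rotate90(grid):
    """Rotate the grid clockwise by 90 degrees."""
    return [list(row) for row in zip(*grid[::-1])]


def transpose(grid):
    """Mirror the grid along its main diagonal."""
    return [list(row) for row in zip(*grid)]


def relabel(grid, permutation):
    """Permute the numerals; permutation maps each old numeral to a new one."""
    return [[permutation[value] for value in row] for row in grid]


rotated = rotate90(GRID)
swapped = relabel(GRID, {1: 2, 2: 1, 3: 3, 4: 4})
print(is_full(rotated), is_full(transpose(GRID)), is_full(swapped))
print(rotated != GRID)  # the rotated grid differs from GRID cell by cell
```

Under the cell-by-cell relation $E$, the rotated grid counts as a different grid, which is exactly the effect the relation $E_T$ is meant to remove.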
Definition Let $T_1$ be the set that consists of the following validity-preserving transformations:

• Permuting numerals
• Permuting rows in the same band
• Permuting bands
• Transposing the grid (that is, mirroring the grid along the main diagonal)

The equality relation $E_{T_1}$ is the one referred to when speaking of essentially different Sudoku grids. For irregular puzzle sizes, the transposition is not validity-preserving. The following transformations are referred to when speaking of essentially different irregular Sudoku grids:

• Permuting numerals
• Permuting rows in the same band
• Permuting bands
• Permuting columns in the same stack
• Permuting stacks

Lemma 2.3 ([HM07]) There are exactly 2 essentially different $4 \times 4$ Sudoku grids (see Figure 9).

1 2 3 4     1 2 3 4
3 4 1 2     3 4 2 1
2 1 4 3     2 1 4 3
4 3 2 1     4 3 1 2

Figure 9: Representatives of the only two equivalence classes of $4 \times 4$ Sudoku grids with respect to essentially different Sudoku grids.

Lemma 2.4 ([RJ06a]) There are exactly 5,472,730,538 essentially different $9 \times 9$ Sudoku grids.

Other Sudoku variants were analyzed as well. Applying the transformations listed in Definition 2.1 to different grid sizes results in different numbers of full grids. An overview of these results is given in Table 1.
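For the $4 \times 4$ case, both counts are small enough to be checked by brute force. The following sketch is not the method used in the cited proofs; it simply enumerates all full grids by backtracking and then groups them into classes by repeatedly applying generators of the transformations in $T_1$. Numeral permutations are handled by relabeling each grid in order of first appearance. All function names are assumptions of this sketch.

```python
from itertools import combinations


def all_full_grids(m):
    """Enumerate all full Sudoku grids with m x m blocks by backtracking."""
    n = m * m
    grid = [[0] * n for _ in range(n)]
    found = []

    def allowed(r, c, v):
        br, bc = r - r % m, c - c % m
        return (all(grid[r][k] != v for k in range(n))
                and all(grid[k][c] != v for k in range(n))
                and all(grid[br + i][bc + j] != v for i in range(m) for j in range(m)))

    def fill(pos):
        if pos == n * n:
            found.append(tuple(tuple(row) for row in grid))
            return
        r, c = divmod(pos, n)
        for v in range(1, n + 1):
            if allowed(r, c, v):
                grid[r][c] = v
                fill(pos + 1)
                grid[r][c] = 0

    fill(0)
    return found


def normalize(grid):
    """Relabel numerals in order of first appearance (factors out numeral permutations)."""
    mapping = {}
    return tuple(tuple(mapping.setdefault(v, len(mapping) + 1) for v in row) for row in grid)


def t1_moves(grid, m):
    """Yield the grid after one row swap within a band, one band swap, or a transposition."""
    rows = [list(row) for row in grid]
    for band in range(m):
        for i, j in combinations(range(band * m, band * m + m), 2):
            g = rows[:]
            g[i], g[j] = g[j], g[i]
            yield g
    for a, b in combinations(range(m), 2):
        g = rows[:]
        g[a * m:(a + 1) * m], g[b * m:(b + 1) * m] = g[b * m:(b + 1) * m], g[a * m:(a + 1) * m]
        yield g
    yield [list(col) for col in zip(*rows)]


def orbit(grid, m):
    """All normalized grids reachable from grid via the moves above."""
    start = normalize(grid)
    seen, stack = {start}, [start]
    while stack:
        for nxt in t1_moves(stack.pop(), m):
            key = normalize(nxt)
            if key not in seen:
                seen.add(key)
                stack.append(key)
    return seen


def count_classes(grids, m):
    """Count equivalence classes of the given grids."""
    seen, classes = set(), 0
    for g in grids:
        if normalize(g) not in seen:
            classes += 1
            seen |= orbit(g, m)
    return classes


grids = all_full_grids(2)
print(len(grids), count_classes(grids, 2))
```

For `m = 2` this should reproduce the counts of Lemma 2.1 and Lemma 2.3. For `m = 3` the enumeration is far out of reach of this naive approach, which is why the cited results rely on symmetry arguments.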
Grid type          Block type        Number of essentially different Sudoku grids
$4 \times 4$       $2 \times 2$      2 (see Lemma 2.3)
$6 \times 6$       $2 \times 3$      49 [RJ06b]
$8 \times 8$       $2 \times 4$      1,673,187 [Rus06]
$10 \times 10$     $2 \times 5$      4,743,933,602,050,718 [Pet06]
$9 \times 9$       $3 \times 3$      5,472,730,538 (see Lemma 2.4)

Table 1: Number of different Sudoku grids with respect to $E_{T_1}$ for different puzzle sizes. Note that different transformations apply for irregular Sudoku puzzle sizes.

2.2 Complexity

From the point of view of theoretical computer science, a very important consideration is the complexity analysis of a problem. In this section, we will discuss the decision variant of the Sudoku problem. It will be defined in Section 2.2.2, after a short introduction to NP-completeness.

2.2.1 Short Introduction to NP-completeness

This section provides a brief overview of the topic of NP-completeness. First of all, it is important to know some terms. In computer science, an algorithm is called deterministic if each step is determined only by prior steps and the input data. A deterministic algorithm is called efficient if its running time is bounded by a polynomial in the size of the input data. The set of problems that are solvable efficiently is denoted by P, while NP denotes the set of problems whose solutions are efficiently verifiable.

Let $A$ and $B$ be problems in NP. A function $f$ is called a reduction from $A$ to $B$ if for any input $d$, $d \in A \Leftrightarrow f(d) \in B$, and the computation of $f(d)$ is deterministic and efficient. So the question whether $d \in A$ can be answered by applying the reduction $f$ to $d$ and testing whether $f(d) \in B$. If such a function exists for two problems $A$ and $B$, $A$ is called reducible to $B$. Note that if $f(d) \in B$ can be determined efficiently, so can $d \in A$. Also note that the binary relation "reducible" is transitive, meaning that if $A$ is reducible to $B$ and $B$ is reducible to $C$, then $A$ is also reducible to $C$. A problem $Q$ in NP is called NP-complete if all problems in NP can be reduced to it.
Hence, if $Q$ were solvable efficiently, all problems in NP would be. For example, the SAT problem is NP-complete. It asks whether a given Boolean formula in conjunctive normal form has a satisfying assignment, in other words, whether the formula can evaluate to true. It is not yet known whether all problems in NP can be solved efficiently. This is called the P vs. NP problem.
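The difference between verifying and finding a solution can be illustrated with SAT. The sketch below is only an illustration with assumed names and an assumed encoding: a formula is a list of clauses, and each clause is a list of non-zero integers, where `k` stands for variable `k` and `-k` for its negation. Verifying a given assignment takes time linear in the size of the formula, whereas the naive search tries up to $2^v$ assignments for $v$ variables.

```python
from itertools import product


def satisfies(cnf, assignment):
    """Verify in linear time that assignment (dict: variable -> bool) satisfies cnf."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in cnf)


def find_assignment(cnf):
    """Naive exponential search for a satisfying assignment; returns None if there is none."""
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(cnf, assignment):
            return assignment
    return None


# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
formula = [[1, -2], [2, 3], [-1, -3]]
print(satisfies(formula, {1: True, 2: True, 3: False}))
print(find_assignment([[1], [-1]]))  # x1 and not x1 has no satisfying assignment
```

The verifier plays the role of the efficient check in the definition of NP, while nothing better than superpolynomial search is known for the general problem.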
2.2.2 Sudoku Decision Problem

We refer to the Sudoku problem as the problem of finding a solution to a given Sudoku puzzle. Much like SAT, where the decision problem is to determine whether a satisfying assignment of all variables of a given formula exists, the decision problem for Sudoku is to determine whether a solution to a given Sudoku puzzle exists. Note that it does not matter whether the puzzle is ambiguous: the uniqueness of a solution is not of interest. The decision variant of the Latin Square problem is defined analogously.

2.2.3 Complexity of the Sudoku Decision Problem

The Sudoku decision problem is in NP: the size of an $n \times n$ Sudoku grid is polynomial in $n$, and thus a given solution to the puzzle can be verified efficiently. It has been shown that the decision problem of Sudoku is NP-complete by reducing Latin Square, which is known to be NP-complete, to Sudoku [YS03]. In the following, we present a sketch of the proof.

To solve an $n \times n$ Latin Square, we construct a $k \times k$ Sudoku grid with $k = n^2$ as follows: let $S(i, j)$ denote the numeral in the cell of the Sudoku grid whose column is $i$ and whose row is $j$, and let $L(r, s)$ denote the cell of the Latin Square whose column is $r$ and whose row is $s$. The Sudoku grid is then constructed according to the equation

$$S(i, j) = \begin{cases} r(L(i - 1, (j - 1)/n)), & \text{if } \mathrm{stack}((i, j)) = \mathrm{rowband}((i, j)) = 0, \\ \mathrm{tr}_n((i, j)), & \text{otherwise,} \end{cases}$$

with

$$\mathrm{tr}_n(x) = (\mathrm{colstack}(x) \cdot n + \mathrm{stack}(x) + \mathrm{row}(x)) \bmod n^2 + 1,$$

where $\mathrm{rowband}(x)$ stands for the number of the row in the band of cell $x$, and $\mathrm{colstack}(x)$ stands for the number of the column in the stack of cell $x$. The functions $\mathrm{stack}(x)$ and $\mathrm{row}(x)$ stand for the stack and the row of $x$, respectively. All four are counted starting from 0. This leads to

$$S(i, j) = \begin{cases} r(L(i - 1, \frac{j - 1}{n})), & \text{if } \lfloor \frac{i - 1}{n} \rfloor = (j - 1) \bmod n = 0, \\ \mathrm{tr}_n((i, j)), & \text{otherwise,} \end{cases}$$

with $r$ being a bijection that maps the $n$ numerals of the Latin Square to $n$ of the $k$ numerals of the Sudoku:

$$r(x) = (x - 1) \cdot n + 1,$$

and

$$\mathrm{tr}_n((i, j)) = \left(((i - 1) \bmod n) \cdot n + \left\lfloor \frac{i - 1}{n} \right\rfloor + j - 1\right) \bmod n^2 + 1.$$

This construction enforces the assignment of all numerals $d$ with

$$\exists x \in \{1, \ldots, n\} : d = r(x)$$

to the cells with

$$\left\lfloor \frac{i - 1}{n} \right\rfloor = (j - 1) \bmod n = 0,$$

but does not enforce any ordering on them other than the Latin Square rules in order for the resulting grid to comply with the Sudoku rules. Figure 10 shows an example for the reduction of a $3 \times 3$ Latin Square: together, the gray cells make up a solution to the given Latin Square. The numerals 1, 4 and 7 in the gray cells of the Sudoku grid are translated to the numerals 1, 2 and 3 in the Latin Square.

Figure 10: An example for the reduction from Latin Square to Sudoku.

2.3 Generating Sudoku Puzzles

Generating a Sudoku puzzle is the task of choosing a subset of cells of the Sudoku grid to contain hints that enable the solver to calculate a solution for the puzzle. To be satisfactory for human solvers, the solution implied by the hints should be unique, so it is desirable to generate proper puzzles. Basically, there are two different methods to create a proper Sudoku puzzle:

• Incremental generation assigns numerals to one cell after another until sufficient hints are given for the puzzle to have a unique solution.
• Decremental generation removes numerals from the cells of a full Sudoku grid for as long as desired, or for as long as the solution stays unique.

2.3.1 Incremental Generation

Several Sudoku programmer forums advise implementing Sudoku generators that (randomly) pick cells and assign a (random) non-conflicting numeral to them until an automated solver can solve the puzzle. The disadvantage of this method is that determining whether a numeral contradicts another in a partially filled Sudoku grid in general requires a solver. When assigning a random numeral to a random cell, the puzzle may become invalid, so the generator must either utilize backtracking to find another cell or numeral, or discard the whole puzzle and start over.
1 2 3 4 5 6 7 8 9
4 5 6 7 8 9 1 2 3
7 8 9 1 2 3 4 5 6
2 3 4 5 6 7 8 9 1
5 6 7 8 9 1 2 3 4
8 9 1 2 3 4 5 6 7
3 4 5 6 7 8 9 1 2
6 7 8 9 1 2 3 4 5
9 1 2 3 4 5 6 7 8

Figure 11: Trivial Sudoku grid generated by $S(x, y) = ((\lfloor x/m \rfloor + m \cdot (x \bmod m) + y) \bmod n) + 1$, where $x$ is the number of the row of the cell starting with 0, $y$ is the number of its column starting with 0, $n$ is the number of numerals and $m = \sqrt{n}$ is the order of the Sudoku grid.

2.3.2 Decremental Generation

To generate a Sudoku puzzle decrementally, we have to create a completely filled grid first. There are multiple ways to achieve this. For instance, we could just take an existing Sudoku grid or generate a trivial Sudoku grid by employing a mathematical formula (see Figure 11). Transforming an existing grid using validity-preserving transformations will also yield a new Sudoku grid. We can also employ an algorithm for incremental generation of Sudoku puzzles and apply a solver to its result. This last method may seem intricate but may be of interest for complexity analysis.

After a full Sudoku grid has been generated, numerals are removed from it for as long as the solution stays unique. Therein lies the problem of decremental generation of Sudoku puzzles, because determining whether a Sudoku puzzle is proper is not trivial and usually requires a solver. If the removal of a numeral causes the puzzle to no longer be proper, backtracking is used or the puzzle is discarded.

2.4 Judging the Difficulty of Generated Sudoku Puzzles

With the generation of a Sudoku puzzle comes the task of judging its difficulty. To the best of our knowledge, all Sudoku puzzle generators determine the difficulty of a puzzle after its generation, which has the disadvantage that one cannot choose the difficulty of the puzzle to be generated. In order to get a puzzle of a desired difficulty, the generator may have to be run multiple times.
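As a concrete illustration of the decremental approach of Section 2.3.2, the sketch below builds the trivial grid from the formula of Figure 11 and then removes hints in random order, putting a hint back whenever its removal makes the puzzle ambiguous. The function names, the encoding of blank cells as 0, and the choice to undo only the last removal (instead of discarding the puzzle) are assumptions of this sketch. The uniqueness test is a plain backtracking counter that stops once it has seen two solutions.

```python
import random


def trivial_grid(m):
    """Full grid S(x, y) = ((x // m + m * (x % m) + y) mod n) + 1 with n = m * m."""
    n = m * m
    return [[(x // m + m * (x % m) + y) % n + 1 for y in range(n)] for x in range(n)]


def count_solutions(puzzle, m, limit=2):
    """Count solutions by backtracking, stopping early once limit is reached."""
    n = m * m
    grid = [row[:] for row in puzzle]

    def allowed(r, c, v):
        br, bc = r - r % m, c - c % m
        return (all(grid[r][k] != v for k in range(n))
                and all(grid[k][c] != v for k in range(n))
                and all(grid[br + i][bc + j] != v for i in range(m) for j in range(m)))

    def search():
        for r in range(n):
            for c in range(n):
                if grid[r][c] == 0:  # first blank cell: branch on its numerals
                    found = 0
                    for v in range(1, n + 1):
                        if allowed(r, c, v):
                            grid[r][c] = v
                            found += search()
                            grid[r][c] = 0
                            if found >= limit:
                                break
                    return found
        return 1  # no blank cell left: the grid is one solution

    return search()


def decremental_puzzle(full, m, rng):
    """Remove hints in random order for as long as the puzzle stays proper."""
    n = m * m
    puzzle = [row[:] for row in full]
    cells = [(r, c) for r in range(n) for c in range(n)]
    rng.shuffle(cells)
    for r, c in cells:
        saved = puzzle[r][c]
        puzzle[r][c] = 0
        if count_solutions(puzzle, m) != 1:
            puzzle[r][c] = saved  # removal made the puzzle ambiguous: undo it
    return puzzle


puzzle = decremental_puzzle(trivial_grid(2), 2, random.Random(0))
for row in puzzle:
    print(row)
```

The same functions work for `m = 3`, but the naive uniqueness test can become slow there. The resulting puzzle says nothing about difficulty, which is the issue discussed in this section.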
Eppstein's generator judges a puzzle by the logic rules needed to solve it. Each rule is assigned a value, and the difficulty value of the puzzle equals the maximum difficulty value of all rules needed to solve it, where the solver only applies a difficult rule if all simpler rules have been exhausted [Epp05b]. This means that if we were to generate a Sudoku puzzle of a certain difficulty, we would need an automated solver.

2.5 Finding Solutions to Sudoku Puzzles

Solutions to Sudoku puzzles are easily found by the simple backtracking algorithm explained in Section 2.5.1. However, there are two main reasons why this is not desirable: backtracking in general takes too much time, and it is not suited to judging the difficulty of a Sudoku puzzle. For the purpose of simulating a human solver, and thus evaluating the difficulty of a Sudoku puzzle in the context of human strategies, solving it with a set of deduction rules is of great interest. For these reasons, this article focuses on (efficient) non-backtracking algorithms for solving Sudoku puzzles and only briefly introduces other options.

2.5.1 Solving Sudoku Puzzles via Backtracking

To solve a given Sudoku puzzle, we can traverse the search tree of all compatible Sudoku grids, that is, grids that extend the puzzle. This leads to a trial-and-error backtracking algorithm:

1. Find an unfixed cell in the grid.
2. Choose a possible numeral for it.
3. With the new fixed cell, solve the grid (recursively).
4. If the choice leads to an invalid grid, track back and try another possible numeral.

The worst case running time of such an algorithm is $\Omega(n^{n-k})$, with $k$ being the number of fixed cells. Hence, if $n - k \in \omega(1)$, it exceeds polynomial bounds. It is easy to see that performing backtracking on a constant part of a Sudoku puzzle is generally not enough to solve it. However, in practice, the backtracking algorithm can be modified so that it often takes linear time to solve a given puzzle: instead of randomly picking a cell to branch from, choose the one with the smallest number of possible numerals. Although it has a superpolynomial worst case running time, the backtracking algorithm is capable of solving any proper Sudoku puzzle and of determining every solution to an ambiguous Sudoku puzzle.
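The four steps above, together with the heuristic of branching on the cell with the fewest possible numerals, can be sketched as follows. This is a minimal illustration under assumed conventions (list-of-rows grids, 0 for blank cells, hypothetical function names), not an implementation from the cited literature. It collects all solutions, so it can also classify a puzzle as proper, ambiguous or invalid.

```python
def solve_all(puzzle, m, limit=None):
    """Return a list of solutions of the puzzle (at most limit if limit is given)."""
    n = m * m
    grid = [row[:] for row in puzzle]
    solutions = []

    def candidates(r, c):
        """Numerals not yet used in the row, column or block of cell (r, c)."""
        br, bc = r - r % m, c - c % m
        used = set(grid[r]) | {grid[k][c] for k in range(n)}
        used |= {grid[br + i][bc + j] for i in range(m) for j in range(m)}
        return [v for v in range(1, n + 1) if v not in used]

    # Reject puzzles whose hints contradict each other directly.
    for r in range(n):
        for c in range(n):
            v = grid[r][c]
            if v:
                grid[r][c] = 0
                consistent = v in candidates(r, c)
                grid[r][c] = v
                if not consistent:
                    return []

    def search():
        if limit is not None and len(solutions) >= limit:
            return
        blanks = [(r, c) for r in range(n) for c in range(n) if grid[r][c] == 0]
        if not blanks:
            solutions.append([row[:] for row in grid])
            return
        # Step 1: pick the unfixed cell with the fewest possible numerals.
        r, c = min(blanks, key=lambda cell: len(candidates(*cell)))
        for v in candidates(r, c):   # step 2: choose a possible numeral
            grid[r][c] = v
            search()                 # step 3: solve the rest recursively
            grid[r][c] = 0           # step 4: track back and try the next numeral

    search()
    return solutions


def classify(puzzle, m):
    """Return 'proper', 'ambiguous' or 'invalid' as defined in Section 1.1."""
    found = len(solve_all(puzzle, m, limit=2))
    return ("invalid", "proper", "ambiguous")[found]


puzzle = [
    [0, 2, 3, 4],
    [3, 0, 1, 2],
    [2, 1, 0, 3],
    [4, 3, 2, 0],
]
print(classify(puzzle, 2), solve_all(puzzle, 2))
```

Called on an empty $4 \times 4$ puzzle without a limit, `solve_all` enumerates every full grid, which gives an independent check of Lemma 2.1.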
2.5.2 Solving Sudoku Puzzles via Constraint Programming Constraint Programming is the problem of finding an assignment to a given set of variables in a given domain that complies to a given set of constraints. For example, solving alphametic puzzles can be solved by Constraint Programming. A famous alphametic puzzle is shown in Figure 12. Applied to this puzzle, the Constraint Programming algorithm will come up with an assignment of the variables respecting the given constraints. One may utilize Constraint Programming to solve Sudoku puzzles by implementing the fundamental rules of Sudoku as constraints over the domain 15
s e n d m o r e m o n e y Figure 12: A popular alphametic puzzle. The objective is to find values for s, e, n, d, m, o, r, y {0... 9} with s 0, m 0, (1000s + 100e + 10n + d) + (1000m + 100o + 10r + e) = 10000m + 1000o + 100n + 10e + y, and no two different variables being assigned the same value. {1,..., n}: Each numeral must be unique for its column, row and block. In general, Constraint Programming is NP-complete and equally suitable for solving any given Sudoku puzzle as the backtracking algorithm. In the Internet, there are a lot of examples and tutorials on how to tweak constraint programming for Sudoku, effectively improving the performance for example by cutting down symmetric branches. 2.5.3 Solving Sudoku Puzzles via Logic Deduction This method tries to mimic a human solver by applying a set of rules that rule out possibilities for numerals in certain cells or fix unfixed cells in the grid thus simplifying the task of solving the puzzle. As long as each of these rules can be implemented efficiently, the whole solving process can, because the number of cells is polynomial in n and the number of possible numerals per cell is at most n. Hence not every given Sudoku puzzle is solvable by a solver using only logic deduction, unless P=NP. However, it is an open problem whether there is a ruleset that is able to solve all proper Sudoku puzzles. A set of deduction rules to solve a Sudoku puzzle is the following [Epp05a]: Eliminate If there is only one numeral left for a cell, assign it to this cell. Locate If there is only one cell left for a numeral in a group, assign it to this cell. Align Eliminate possibilities for numerals that would leave no choices for another group. This means that if all cells of a group g that may contain a numeral x share two of their three groups (g and g ), all possibilities of x in cells of g that are not part of g may be removed, because if placed in any of these cells, there would be no cell in g that may contain x. 
For example, if all possibilities of 1 in a block are in the same row, then all possibilities of 1 in this row outside of the block may be removed. Pair/Triad Eliminate possibilities for numerals that would leave no choices for two (three) other numerals in a group. This means that if two (three) cells that share two of their three groups contain the only two (three) possibilities 16
for two (three) numerals in one of their shared groups, then all possibilities of these numerals may be removed from both groups. For example, if the only possibilities for the numerals 1 and 2 in a block are in two cells in the same row, all possibilities of 1 and 2 may be removed from the rest of the row and the rest of the block. Digit Eliminate possibilities for numerals that cannot be extended to a placement of n copies of that numeral covering each group. For each numeral d, consider the bipartite graph G d = ({R, C}, E) with R being the set of all rows, C being the set of all columns and {r, c} E iff the cell of row r and column c may contain d. In order for a solution to exist, G d has to have a perfect matching. The Digit rule removes possibilities for any numeral in any cell that prevents a perfect matching of rows and columns for this numeral to exist. Rectangle, Trapezoid These only apply if we know that the given Sudoku puzzle is proper. If so, avoid formation of ambiguities in the grid. This means that if a puzzle is known to be proper, the possibility of any numeral in any cell that would imply the formation of an ambiguous rectangle (see Section 3.1) may be removed. Subproblem Eliminate possibilities for numerals that cannot be extended to a complete arrangement of all its groups. For each group g, consider the bipartite graph G g = ({N, C g }, E) with N being the set of all numerals, C g the set of all cells in g, and {k, c} E iff k is a possible assignment of c. In order for a solution to exist, G g has to have a perfect matching. The Subproblem rule removes possibilities for any numeral in any cell that prevents such a perfect matching of numerals and cells of any group. Bilocation Find non-repetitive cycles in the graph of bilocated cells and remove any other possibility from the their cells. For explanation please refer to Eppstein [Epp05a]. Bivalue Find non-repetitive cycles in the graph of bivalued cells and remove any other possibility. 
For an explanation, see Eppstein [Epp05a].

Repeat: Find repetitive cycles and assign the repeating numeral to the incident cell. For an explanation, see Eppstein [Epp05a].

Path: Find conflicting paths in the graphs of bivalued and bilocated cells. For an explanation, see Eppstein [Epp05a].
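The perfect-matching formulation of the Subproblem rule can be illustrated with a small sketch. It is not Eppstein's implementation: it simply forces each candidate in turn and checks by augmenting paths whether the remaining cells of the group can still be matched to distinct numerals. The names `has_perfect_matching` and `subproblem_rule` and the dictionary representation of a group are assumptions made for this example.

```python
def has_perfect_matching(cands):
    """cands maps each cell of a group to its set of possible numerals.
    Returns True if every cell can receive a distinct numeral."""
    match = {}  # numeral -> cell currently matched to it

    def augment(cell, seen):
        # Try to find a numeral for `cell`, re-matching other cells if needed.
        for k in cands[cell]:
            if k in seen:
                continue
            seen.add(k)
            if k not in match or augment(match[k], seen):
                match[k] = cell
                return True
        return False

    return all(augment(cell, set()) for cell in cands)


def subproblem_rule(cands):
    """Keep only those candidates that occur in some perfect matching."""
    result = {}
    for cell, numerals in cands.items():
        keep = set()
        for k in numerals:
            # Force cell := k, remove k from the other cells, test the rest.
            rest = {d: s - {k} for d, s in cands.items() if d != cell}
            if has_perfect_matching(rest):
                keep.add(k)
        result[cell] = keep
    return result


# Cells a and b form a pair on {1, 2}, so cell c can only hold 3.
group = {"a": {1, 2}, "b": {1, 2}, "c": {1, 2, 3}}
reduced = subproblem_rule(group)
```

In this example the rule reproduces a Pair deduction, which shows how the matching view subsumes the simpler rules within a single group. The brute-force check is quadratic in the number of candidates per group; it is meant to show the idea, not an efficient algorithm.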
With this ruleset, Eppstein managed to solve about 96% of the proper puzzles generated by a puzzle generator that works as follows [Epp05b]:

1. Generate a full Sudoku grid:
   (a) choose a random cell,
   (b) assign a random, non-conflicting numeral,
   (c) propagate the changes by applying the simplest deduction rules as often as possible.
2. Revert the changes step by step for as long as the puzzle stays proper (determined by a backtracking solver).

Solving by logic deduction is a suitable method to determine the difficulty of a Sudoku puzzle, that is, how hard it is for a human solver to find a solution, for example by assigning a difficulty index to each deduction rule.

2.6 Graph Coloring

An $n \times n$ Sudoku grid may be interpreted as a graph in the following way: for each cell there is a vertex of the graph. Two vertices are connected iff their cells share a group. Each numeral is represented by a different color. A full Sudoku grid corresponds to an $n$-coloring of this graph in which no two adjacent vertices have the same color. Solving a Sudoku puzzle is therefore equivalent to completing a partial coloring of the graph representing it. Note that this induces a reduction from Sudoku to Graph-$n$-Coloring: Given an $n \times n$ Sudoku puzzle, build a graph $G$ with $n^2$ vertices, one for each of the cells of the grid, and connect two vertices iff they share at least one of their groups. The size of the graph is polynomial in $n$, and the partial coloring of $G$ given by the hints in the Sudoku puzzle can be extended to an $n$-coloring of $G$ iff the Sudoku puzzle has a solution. The number of ways to complete a partial coloring is a monic polynomial (see footnote 2) of degree at most $n^2$ and, due to the reduction, the same holds for the number of different solutions to a Sudoku puzzle [HM07]. Note that this is an exponentially growing function in $n$.

3 Generalization and Contribution

In this chapter, the previously presented approaches will be extended and generalized.
We will extend the term essentially different and consider the impact on the number of Sudoku grids. After presenting a different way to generate Sudoku puzzles, we will introduce generalizations of parts of the set of deduction rules presented in Section 2.5.3 and a limited constraint propagation algorithm to solve Sudoku puzzles.

Footnote 2: A polynomial is called monic if its leading coefficient is 1.
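As an illustration of the graph interpretation from Section 2.6, the following sketch builds the edge set of the Sudoku graph and checks whether a full grid is a proper coloring. It assumes the usual case $n = b^2$ with square $b \times b$ blocks; the function names `sudoku_graph` and `is_proper_coloring` are hypothetical.

```python
from itertools import combinations


def sudoku_graph(b):
    """Edge set of the Sudoku graph for an n x n grid with n = b * b."""
    n = b * b
    cells = [(r, c) for r in range(n) for c in range(n)]

    def share_group(p, q):
        same_row = p[0] == q[0]
        same_col = p[1] == q[1]
        same_block = (p[0] // b, p[1] // b) == (q[0] // b, q[1] // b)
        return same_row or same_col or same_block

    return {(p, q) for p, q in combinations(cells, 2) if share_group(p, q)}


def is_proper_coloring(grid, edges):
    """True if no two adjacent cells of a full grid carry the same numeral."""
    return all(grid[p[0]][p[1]] != grid[q[0]][q[1]] for p, q in edges)


edges_9x9 = sudoku_graph(3)
full_4x4 = [[1, 2, 3, 4],
            [3, 4, 1, 2],
            [2, 1, 4, 3],
            [4, 3, 2, 1]]
```

For the traditional 9 × 9 grid every vertex has 20 neighbors, so the graph has 81 · 20 / 2 = 810 edges. The check applies to full grids only; a puzzle with empty cells would need a marker for uncolored vertices.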
3.1 Counting Sudoku

In Section 2.1, the transformations leading to the term essentially different grid are mentioned. Recall also the difference between two Sudoku grids being equal with respect to E and being equal with respect to E. In addition to the four transformations considered by E, the ambiguous rectangle transformation may be taken into account:

Definition: An ambiguous rectangle is a formation of four cells that share exactly two different rows, columns, blocks, and numerals. Flipping such a rectangle means replacing the content of each cell with the content of the cell it shares a block with.

Remark: Figure 9 on page 10 shows an ambiguous rectangle (the gray cells form one) and its flipped variant. This is exactly the structure that Eppstein [Epp05a] uses to test whether a Sudoku grid is ambiguous (a grid is ambiguous if it contains an ambiguous rectangle). If a solution to a Sudoku puzzle contains an ambiguous rectangle whose cells are not among the hints of the puzzle, this solution cannot be unique, because the ambiguous rectangle may be flipped to obtain a second solution to the puzzle. Flipping a flipped ambiguous rectangle reverts the transformation; thus, flipping is its own inverse.

The equivalence relation that takes the five transformations mentioned so far into account will be referred to as E. As shown in Figure 9 on page 10, the only two essentially different $4 \times 4$ Sudoku grids are in fact equal under flipping an ambiguous rectangle. Hence, with respect to E there is just one equivalence class of $4 \times 4$ Sudoku grids (that is, no two grids of this type are essentially different), which in turn means that all $4 \times 4$ Sudoku grids can be obtained from a single one by applying a series of the stated transformations. It is now of interest how many full $9 \times 9$ Sudoku grids exist that are essentially different with respect to E. Unfortunately, applying the ambiguous rectangle transformation to $9 \times 9$ Sudoku grids is not trivial.
Burnside's lemma, which Jarvis and Russell [RJ06a] used to calculate $N_{9 \times 9}$ (see Lemma 2.4 on page 10), is not applicable to the ambiguous rectangle transformation because the transformation depends on numerals, not just on geometric shapes. Thus, there is at the moment no better way than to look at all $N_{9 \times 9} = 5{,}472{,}730{,}538$ equivalence classes and to check all pairs of classes for equality by brute force. However, the sheer size of these classes makes handling them overwhelmingly costly. In the following, we estimate how many comparisons it would take to calculate the number of different Sudoku grids taking the ambiguous rectangle transformation into account. If the average number of grids in an equivalence class with respect to E, that is, the total number of grids divided by the number of classes, is

$$k \approx \frac{6.6 \cdot 10^{21}}{5.4 \cdot 10^{9}} \approx 1.2 \cdot 10^{12}$$

and a uniform distribution of grids in each class is assumed, the estimated number of comparisons is $k \cdot N_{9 \times 9} / 2$, which is approximately $3.3 \cdot 10^{21}$. Hence
even if we compared a trillion Sudoku grids per second, it would take about $3.3 \cdot 10^{9}$ seconds, that is, roughly 104 years, to finish the calculation. However, it is still interesting how many $9 \times 9$ grids are essentially different with respect to E, because from a list of these grids it would be possible to generate every valid $9 \times 9$ Sudoku grid, which is useful for generating Sudoku puzzles.

3.2 Generating Sudoku Puzzles

As we have already seen, Sudoku puzzle generators employ backtracking when the puzzle becomes invalid while adding or removing a numeral. To avoid this, we will try to retrace the actions a potential solver would take to solve a Sudoku puzzle and reverse them to reconstruct a puzzle from a given Sudoku grid. Of course, there are multiple puzzles for a single grid. The idea is to select, whenever there are multiple choices, the available trace that is closest to a desired difficulty level. This eliminates the need for backtracking in the generation process. However, a full Sudoku grid has to be obtained first.

3.2.1 Finding a Full Sudoku Grid

When trying to find a full Sudoku grid, it is possible to apply a composition of validity-preserving transformations to a previously saved grid. In Section 2.1 on page 9 we introduced the equality relation $E_T$ that relates two grids if one can be transformed into the other by using only transformations of $T$. $E_T$ partitions the set of all full Sudoku grids into equivalence classes. If we had a representative of each of the classes, we would be able to generate every full Sudoku grid there is by applying a composition of transformations of $T$ to a representative of an equivalence class. For this reason, we take a look at implementing such transformations.
To get a better overview, the previously mentioned transformations (see Section 3.1) can be split into numeric transformations and geometric transformations: relabeling entries and flipping ambiguous rectangles are both considered numeric, while the permutation of bands, the permutation of rows in the same band and transposing the grid are considered geometric transformations. All of the previously mentioned transformations can be written as a combination of these five. It is interesting to note that numeric transformations and geometric transformations are orthogonal, meaning they can be applied independently. In other words, in the order of application any occurrence of a numeric transformation may be swapped with an adjacent geometric transformation without changing the overall result. This enables us to first apply all geometric transformations and then apply all numeric transformations. Also note that all transformations that relabel entries can be replaced by a single transformation that relabels entries, because the concatenation of two permutations is itself a single permutation.

Unfortunately, we could not come up with a way to efficiently enumerate all Sudoku grids of one class, nor with a way to efficiently calculate a list of representatives from each class.
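The five basic transformations can be sketched in a few lines. The following is a minimal illustration, assuming a full grid stored as a list of rows and square $b \times b$ blocks; all function names are hypothetical and the code is not taken from any of the cited implementations.

```python
from itertools import combinations


def transpose(g):
    return [list(row) for row in zip(*g)]


def relabel(g, perm):
    """perm maps every numeral to its new label."""
    return [[perm[v] for v in row] for row in g]


def swap_bands(g, b, i, j):
    """Exchange band i and band j (a band is b consecutive rows)."""
    bands = [g[k * b:(k + 1) * b] for k in range(b)]
    bands[i], bands[j] = bands[j], bands[i]
    return [list(row) for band in bands for row in band]


def swap_rows_in_band(g, b, band, i, j):
    h = [list(row) for row in g]
    h[band * b + i], h[band * b + j] = h[band * b + j], h[band * b + i]
    return h


def ambiguous_rectangles(g, b):
    """All (r1, r2, c1, c2) whose four cells span exactly two blocks
    and hold exactly two numerals in a crossed arrangement."""
    n = b * b
    found = []
    for r1, r2 in combinations(range(n), 2):
        for c1, c2 in combinations(range(n), 2):
            two_blocks = (r1 // b == r2 // b) != (c1 // b == c2 // b)
            if two_blocks and g[r1][c1] == g[r2][c2] and g[r1][c2] == g[r2][c1]:
                found.append((r1, r2, c1, c2))
    return found


def flip(g, rect):
    """Swap the two numerals of an ambiguous rectangle."""
    r1, r2, c1, c2 = rect
    h = [list(row) for row in g]
    h[r1][c1], h[r1][c2] = g[r1][c2], g[r1][c1]
    h[r2][c1], h[r2][c2] = g[r2][c2], g[r2][c1]
    return h


def is_valid(g, b):
    n = b * b
    full = set(range(1, n + 1))
    rows = all(set(row) == full for row in g)
    cols = all(set(col) == full for col in zip(*g))
    blocks = all({g[br + i][bc + j] for i in range(b) for j in range(b)} == full
                 for br in range(0, n, b) for bc in range(0, n, b))
    return rows and cols and blocks


grid = [[1, 2, 3, 4],
        [3, 4, 1, 2],
        [2, 1, 4, 3],
        [4, 3, 2, 1]]
rect = (0, 1, 0, 2)  # rows 0 and 1, columns 0 and 2
flipped = flip(grid, rect)
```

On this $4 \times 4$ grid, the cells in rows 0 and 1 and columns 0 and 2 hold the numerals 1 and 3 in a crossed arrangement, so they form an ambiguous rectangle; flipping it yields another valid grid, and flipping again restores the original. Relabeling and transposing commute, which is a simple instance of the orthogonality noted above.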
3.2.2 Deletion Witnesses

To decide whether the numeral in a certain cell may be removed, it is of great importance whether it can be restored using the rules of a potential solver. One possibility would be to simply try removing that numeral. If the solver can derive it from the information left, it can be safely removed. However, there is a better way than this trial and error technique. In the following, we will use structures called deletion witnesses (DW) to know in advance which rules applied to which cells would cause the numeral in a certain cell to be derived.

Definition: A witness of a numeral $k$ of a cell $c$ is a pair $(R, S)$ of a logic deduction rule $R$ and a set $S$ of pairs of cells and numerals such that the pairs in $S$ together imply the assignment of $k$ to the cell $c$ by the rule $R$. The numeral $k$ is then called witnessed by $(R, S)$.

Remark: Witnessed numerals may be removed in the generation process because a solver can deduce them with the help of their witnesses.

This enables us to remove only those numerals that are deducible by a given set of rules, and thus to influence the difficulty level of the puzzle effectively. This method still does not generally allow for choosing a difficulty level in advance, because the rule that causes the deduction of the content of a selected cell is not necessarily the easiest one. Since the solver applies rules with a high difficulty rating only if the easier rules are exhausted, the puzzle may be easier than expected. However, a puzzle generated by this method will not be harder than expected. Also note that this method does not require backtracking.

3.3 Judging the Difficulty of Generated Sudoku Puzzles

When generating a Sudoku puzzle incrementally, the difficulty may be controlled by picking the numerals in such a way that deduction by rules with the desired difficulty level is possible.
When generating decrementally, by contrast, the difficulty of the puzzle may be controlled by removing those numerals whose recalculation has the desired difficulty.

3.4 Finding Solutions to Sudoku Puzzles

Looking at Eppstein's Bilocation and Bivalue rules [Epp05a], one gets the distinct impression that they are two instances of a common phenomenon. In our work, this phenomenon is called constraint propagation, meaning that changing the grid in a certain way may affect other cells. However, not all possible constraint propagation mechanisms can be considered efficiently, hence we focus on four ways in which such propagation may occur:

1. Not assigning a numeral to a certain cell may force not assigning another numeral to another cell.
2. Not assigning a numeral to a certain cell may force assigning another numeral to another cell.
3. Assigning a numeral to a certain cell may force not assigning another numeral to another cell.
4. Assigning a numeral to a certain cell may force assigning another numeral to another cell.

In fact, Eppstein's Bilocation and Bivalue rules [Epp05a] only cover points 2 (by the Bilocation rule) and 4 (by the Bivalue rule). Additionally, the Bilocation and Bivalue rules are separated from one another, which further limits their potential. A cycle of bivalued and bilocated cells in a puzzle may impose a constraint on other cells of the puzzle. This source of information is not exploited by applying the Bivalue and Bilocation rules separately. Therefore the next two sections will discuss a combination of these two. In Section 3.4.3, we will discuss a further step of generalization and the implementation of all mentioned rules.

3.4.1 Extension of Bilocation and Bivalue

Inspired by Eppstein's cycle analysis approach [Epp05a], the Bivalue and Bilocation rules were combined. This is possible since they are, as described above, constraint propagating rules. The combination can be condensed into two rules for a human Sudoku solver, which are explained in the following.

Definition: If two cells $c_1, c_2$ in a group share a possible numeral $x$, these two cells are called grouped by $x$ (written as $c_1 \mathrel{g_x} c_2$). If $c_1 \mathrel{g_x} c_2$ and $x$ cannot be assigned to any other cell in the group, these two cells are called bilocated by $x$ (written as $c_1 \mathrel{l_x} c_2$). If $c_1 \mathrel{g_x} c_2$ and $c_1, c_2$ have only two possible numerals each, the two cells are called bivalued by $x$ (written as $c_1 \mathrel{v_x} c_2$).

Remark: Note that for two cells to be bivalued, the intersection of the sets of their respective possible numerals may contain both numerals, although this is not required for the bivalued property to apply. Also note that the grouped, bilocated, and bivalued relations are symmetric and not transitive in general.
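The three relations can be computed directly from the candidate sets of a single group. The sketch below is only an illustration of the definition; the function name `relations_in_group` and the representation of candidates as a dictionary from cells to sets are assumptions.

```python
from itertools import combinations


def relations_in_group(cands, group):
    """cands maps each cell to its set of possible numerals, group lists
    the cells of one row, column or block. Returns three sets of triples
    (c1, c2, x): grouped by x, bilocated by x, bivalued by x."""
    grouped, bilocated, bivalued = set(), set(), set()
    for c1, c2 in combinations(group, 2):
        for x in cands[c1] & cands[c2]:
            grouped.add((c1, c2, x))
            # Bilocated: no other cell of the group can take x.
            holders = [c for c in group if x in cands[c]]
            if len(holders) == 2:
                bilocated.add((c1, c2, x))
            # Bivalued: both cells have exactly two possible numerals.
            if len(cands[c1]) == 2 and len(cands[c2]) == 2:
                bivalued.add((c1, c2, x))
    return grouped, bilocated, bivalued


group = ["a", "b", "c", "d"]
cands = {"a": {1, 2}, "b": {1, 3}, "c": {2, 3, 4}, "d": {2, 4}}
grouped, bilocated, bivalued = relations_in_group(cands, group)
```

Here the cells a and b are both bilocated and bivalued by 1, whereas a and d are bivalued by 2 without being bilocated, because the cell c may also contain 2.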
Definition: For an $n \times n$ Sudoku grid $S$, the graph $G = (W, E)$ with

$$W = \{c \mid c \text{ is a cell in } S\}$$
$$E = \{\{c, c'\} \mid c, c' \in W \wedge \exists x\,(c \mathrel{l_x} c' \vee c \mathrel{v_x} c')\}$$

and with the edge-labeling function

$$\mathrm{label} : E \to \mathcal{P}(\{L, V\} \times \{1, \dots, n\})$$
$$(L, x) \in \mathrm{label}(\{c, c'\}) \Leftrightarrow c \mathrel{l_x} c'$$
$$(V, x) \in \mathrm{label}(\{c, c'\}) \Leftrightarrow c \mathrel{v_x} c'$$

is called Force-Propagation-Graph, or FPG for short. Note that an edge may have multiple labels. Let $(x, y)$ be a label; then $x = \mathrm{type}((x, y))$ is the type of the label and $y = \mathrm{numeral}((x, y))$ is the numeral of the label. The function $d : (\{L, V\} \times \{1, \dots, n\})^2 \to \mathbb{N}$ calculates the distance of two labels. It is much like the Hamming distance in that it specifies how many parts of the labels differ:

$$d((p, q), (x, y)) = \begin{cases} 0, & \text{if } p = x \wedge q = y \\ 1, & \text{if } p = x \Leftrightarrow q \neq y \\ 2, & \text{else.} \end{cases}$$

A path in the FPG of length $p + 2$ is called alternating if for each edge $e_i$ in the path there is a label $b_i \in \mathrm{label}(e_i)$ such that $\forall i \in \{0, \dots, p\} : d(b_i, b_{i+1}) = 1$. That means that exactly one part of the label differs from edge to edge. An alternating cycle is defined analogously. The additional rules are defined as follows.

1. Alternating Cycle Rule (ACR): Suppose there is an alternating cycle in the FPG. Let $c_i$ be a cell of the cycle and $e_i$ and $e_{i+1}$ its incident edges. If there are two numerals $x$ and $y$ with $(L, x) \in \mathrm{label}(e_i)$ and $(L, y) \in \mathrm{label}(e_{i+1})$, remove all possible numerals except $x$ and $y$ from consideration for $c_i$ (see Figure 13).

2. Repetitive Cycle Rule (RCR): Suppose there is an alternating path of $p + 1$ edges in the FPG that starts and ends at the same vertex (cell) but is not an alternating cycle; this means that the edges incident to the starting cell prevent the alternating path from being an alternating cycle. Let $e_0, e_p$ denote these two edges and $b_0, b_p$ the labels of $e_0$ and $e_p$ that were used to form the alternating path (note that $d(b_0, b_1) = d(b_{p-1}, b_p) = 1$, but $d(b_0, b_p) \neq 1$). Then the starting cell may not be assigned the numeral of the label whose type is $V$, if there is one, and must be assigned the numeral of the label whose type is $L$, if there is one. These two numerals cannot be the same, because equality would yield $d(b_0, b_p) = 1$ and thus the alternating path would also be an alternating cycle.
Also, if both labels were of the same type, then for the same reason, their numerals would not differ.

While being an improvement over applying the Bilocation and Bivalue rules separately, the Alternating Cycle and Repetitive Cycle rules alone are not powerful enough to provide a substantial gain of solving power, as shown in Section 4. Further generalization of the rules will be considered in the following sections.
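The label distance and the alternating condition of this section can be sketched as follows. Labels are modeled as pairs `(type, numeral)`, and a path is given only by the label sets of its consecutive edges; the function names `choose_labels` and `closes_cycle` are hypothetical, and the search is a plain backtracking illustration rather than the path algorithm used in the implementation.

```python
def d(a, b):
    """Hamming-like distance: number of label parts (type, numeral) that differ."""
    return (a[0] != b[0]) + (a[1] != b[1])


def choose_labels(edge_labels):
    """edge_labels is a list of label sets, one per consecutive edge.
    Returns one label per edge such that neighboring labels have
    distance 1, or None if the path is not alternating."""
    def extend(i, chosen):
        if i == len(edge_labels):
            return chosen
        for label in sorted(edge_labels[i]):
            if not chosen or d(chosen[-1], label) == 1:
                result = extend(i + 1, chosen + [label])
                if result is not None:
                    return result
        return None

    return extend(0, [])


def closes_cycle(chosen):
    """An alternating path that returns to its start is an alternating
    cycle only if the last and first labels also have distance 1."""
    return d(chosen[-1], chosen[0]) == 1


path = [{("V", 1)}, {("L", 1)}, {("L", 2), ("V", 3)}]
labels = choose_labels(path)
```

In the example, the step from V1 to L1 changes only the type and the step from L1 to L2 changes only the numeral, so the path is alternating. The last label L2 and the first label V1 differ in both parts, so these labels would not close an alternating cycle.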
Figure 13: An example for the application of the ACR. The grid on the right shows the alternating cycle and the labels of its edges. Note the two marked cells in the top row. The left one may only contain 1 or 2, whereas the right one may only contain 1 or 3. Because each of them may only contain two numerals, one of which is 1, the two cells are connected by an edge labeled V1, which stands for bivalued by 1 with respect to the top row. That means that by assigning 1 to either of them, the other cell is forced not to contain 1 but its other possible numeral. Not being able to contain the 1 propagates along the edge labeled L1. This label means that the two cells are bilocated by 1 with respect to their group, meaning that if 1 cannot be assigned to one of them, the other cell is forced to contain 1. The other edges are formed in the same manner.
3.4.2 Group-Modified Rules

The definition of the FPG can be extended to support propagation through grouping. That means that propagation may occur among cells that do not have to be bivalued or bilocated, but are just in the same group. Since there are a lot of cells that are related by being in the same group, the extended FPG will be much bigger (although still polynomial in $n$) than the FPG. Its size may be too much for a human solver to handle, which is why grouping was not included in the (previous) definition of the FPG. However, for an automated solver, this is still of interest, so we will define the extended FPG in the following:

Definition: For an $n \times n$ Sudoku grid $S$, the graph $G' = (W, E')$ with

$$W = \{c \mid c \text{ is a cell in } S\}$$
$$E' = \{\{c, c'\} \mid c, c' \in W \wedge \exists x\,(c \mathrel{g_x} c')\}$$

and with the edge-labeling function

$$\mathrm{label}' : E' \to \mathcal{P}(\{L, V, G\} \times \{1, \dots, n\})$$
$$(L, x) \in \mathrm{label}'(\{c, c'\}) \Leftrightarrow c \mathrel{l_x} c'$$
$$(V, x) \in \mathrm{label}'(\{c, c'\}) \Leftrightarrow c \mathrel{v_x} c'$$
$$(G, x) \in \mathrm{label}'(\{c, c'\}) \Leftrightarrow c \mathrel{g_x} c'$$

is called extended Force-Propagation-Graph, or EFPG for short. The function $d' : (\{L, V, G\} \times \{1, \dots, n\})^2 \to \mathbb{N} \cup \{\infty\}$ calculates the distance of two labels:

$$d'((p, q), (x, y)) = \begin{cases} \infty, & \text{if } p = x = G \wedge q \neq y \\ 0, & \text{if } (p = L \Leftrightarrow x = L) \wedge q = y \\ 2, & \text{if } (p = L \Leftrightarrow x \neq L) \wedge q \neq y \\ 1, & \text{otherwise.} \end{cases}$$

Analogous to the FPG, a path in the EFPG of length $p + 2$ is called alternating if for each edge $e_i$ in the path there is a label $b_i \in \mathrm{label}'(e_i)$ such that $\forall i \in \{0, \dots, p\} : d'(b_i, b_{i+1}) = 1$. Now, both additional rules stated in the previous section may also be used with $d'$ instead of $d$:

1. Extended Alternating Cycle Rule (EACR): Suppose there is an alternating cycle in the EFPG. Let $c_i$ be a cell of the cycle and $e_i$ and $e_{i+1}$ its incident edges. If there are two numerals $x$ and $y$ with $(L, x) \in \mathrm{label}'(e_i)$ and $(L, y) \in \mathrm{label}'(e_{i+1})$, remove all possible numerals except $x$ and $y$ from consideration for $c_i$.
2. Extended Repetitive Cycle Rule (ERCR): Suppose there is an alternating path of $p + 1$ edges in the EFPG that starts and ends at the same vertex (cell) but is not an alternating cycle. Let $e_0, e_p$ denote the two edges that are incident to the starting cell and $b_0, b_p$ the labels of $e_0$ and $e_p$ that were used to form the alternating path (note that $d'(b_0, b_1) = d'(b_{p-1}, b_p) = 1$, but $d'(b_0, b_p) \neq 1$). Then the starting cell may not be assigned the numeral of the label whose type is $V$ or $G$, if there is one, and must be assigned the numeral of the label whose type is $L$, if there is one.

With the Extended Alternating Cycle and Repetitive Cycle Rules we are closer to the goal of making use of the four limited constraint propagation mechanisms mentioned in Section 3.4. However, there is still a more abstract formulation than these two rules, which for example takes into account multiple cells having influence on the content of a single cell. In the following we will introduce this formulation and explain our implementation of it.

3.4.3 Limited Constraint Propagation

After uniting the Bilocation and Bivalue rules, there are still unconsidered constraint propagation rules, as mentioned in Section 3.4. To take them into account, a limited constraint propagation algorithm was implemented. The idea is to build a graph by analyzing the Sudoku grid with respect to the following interpretation of the fundamental rules of Sudoku:

Each cell contains at least one numeral. This implies that, if there is only one numeral left for a cell, it has to be assigned to it (Eliminate).

Each cell contains at most one numeral. This implies that, if a numeral has been assigned to a cell, no other numeral may be assigned to it (Cell-Flood).

Each group contains each numeral at least once. This implies that, if a numeral can only be assigned to one cell in a group, it has to be assigned to this exact cell (Locate).

Each group contains each numeral at most once. This implies that, if a numeral has been assigned to a cell in a group, no other cell in this group may be assigned this numeral (Group-Flood).

In the following, the algorithm and its implementation will be described. The general structure of a constraint-propagation-node, or fp node, is shown in Figure 14. The assignment of a numeral to a cell of a Sudoku grid is represented by an fp node containing this numeral. Not assigning a certain numeral to a cell is represented by an fp node whose numeral is negative. An fp node can be triggered, meaning that it was determined to be true. For example, if a cell must contain the numeral $k$, the triggered-property of the fp node of $k$ in this cell is
set to true. If a numeral $k$ of a cell has been determined to cause a violation of the above rules, the node of $-k$ of this cell is triggered.

Figure 14: An fp node has an array of triggers and an array of impacts.

Every fp node has a list of triggers. A trigger of a node $f$ is a list of fp nodes that collectively imply $f$, meaning that $f$ is a logical consequence of the totality of all nodes of the trigger. Triggers have a type that describes the nature of their implication. Types may be Eliminate, Locate, Cell-Flood and Group-Flood:

Eliminate: For all $k \in \{1, \dots, n\}$ and all cells $c$ of the grid, the totality of all negative nodes of $c$ except the node of $-k$ triggers the node of $k$ of this cell, see Figure 16.

Cell-Flood: For all $k \in \{1, \dots, n\}$ and all cells $c$ of the grid, the node of $k$ triggers all negative nodes of $c$ except the node of $-k$, see Figure 16.

Locate: For all $k \in \{1, \dots, n\}$, all groups $g$ and all cells $c$ of $g$, the totality of all nodes with the numeral $-k$ of all cells of $g$ except $c$ triggers the node with $k$ of $c$, see Figure 17.

Group-Flood: For all $k \in \{1, \dots, n\}$, all groups $g$ and all cells $c$ of $g$, the node of $k$ of $c$ triggers all nodes with $-k$ of all cells of $g$ except $c$, see Figure 17.

Likewise, every node has a list of impacts, which point to the triggers it participates in (those have to be considered when changing the triggered property of a node).

Figure 15: An fp node may be triggered by several other fp nodes and may itself have impact on multiple fp nodes.

With this data structure it is possible to represent most of the implications that assigning or not assigning a certain numeral to a certain cell may have. The graph structure in which these constraints are organized is built by the following algorithm:

    for all cells c in the grid
    begin
      for all possibilities k of c
      begin
        f = fp_node(c, k)
        s = new set
        for all possible negative numerals -m
        begin
          if m != k then add fp_node(c, -m) to s
        end
        make s an impact of f by Cell-Flood
        make s a trigger of f by Eliminate
        for all groups g that contain c
        begin
          t = new set
          for all cells d of g
          begin
            if d != c then add fp_node(d, -k) to t
          end
          make t an impact of f by Group-Flood
          make t a trigger of f by Locate
        end
      end
    end

Figure 16: Illustration of the Eliminate and Cell-Flood implementation.

Figure 17: Illustration of the Locate and Group-Flood implementation.

Lemma 3.1: The size of the graph structure is polynomial in the size of the Sudoku grid.

Proof: Let $n$ be the number of different numerals in the grid. Note that the number of groups a cell belongs to is 3 and does not depend on $n$. Hence, in each run of the outer loops, 4 triggers and 4 impacts will be added, each of size $O(n)$. The outer loops will run $O(n^2) \cdot O(n)$ times, since there are $n^2$ cells in a grid and each cell contains at most $2n$ possibilities ($\{-n, \dots, n\} \setminus \{0\}$). In total, the graph structure will be of size $O(n^4)$.

The following algorithm is an implementation of the Limited Constraint Propagation method to solve Sudoku puzzles:

    build the graph structure G
    account for the given hints
    do
      for all cells c in the grid
      begin
        for all possibilities k of c
        begin
          f = fp_node(c, k)
          f_neg = fp_node(c, -k)
          if f_neg is a consequence of f in G then trigger f_neg
        end
      end
    while changes occurred

Remark: Being a consequence of $f$ in $G$ is determined by a modified BFS algorithm (see footnote 3) that gathers all consequences of $f$ in a set while traversing the graph. A node $f'$ is considered a consequence of $f$ if any trigger of $f'$ consists exclusively of nodes that are either triggered, a consequence of $f$, or $f$ itself. This way, the solver will only trigger nodes that are logical consequences of nodes that are triggered already. Hence at any given time, all triggered nodes are consequences of the hints given in the puzzle and therefore, if the solver finds a full Sudoku grid, it is indeed the solution to the puzzle. Note that if a puzzle is ambiguous, no solution will be found. Triggering $f_{\mathrm{neg}}$ causes all impacts of $f_{\mathrm{neg}}$ that do not contain any more untriggered nodes to be triggered as well, which may induce a chain-triggering of multiple nodes. The node $f$ is then deleted, since $f_{\mathrm{neg}}$ represents the impossibility of $f$.

Lemma 3.2: The solver runs in polynomial time.

Proof: Obviously, the inner for-loops will be executed at most $O(n^3)$ times. The outer while-loop will run for as long as nodes are being triggered. Since a node that has been triggered will not become untriggered again, and the size of the data structure containing all nodes is polynomial in $n$, the outer while-loop will run a number of times that is polynomial in $n$. Finding a path in a graph of polynomial size will also take polynomial time, and so will triggering a node. All in all, the worst-case running time stays polynomial in $n$.
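The two algorithms above can be condensed into a small runnable sketch. It is a simplified illustration under several assumptions, not the implementation evaluated in Section 4: nodes are triples `(cell, numeral, positive)`, triggers are stored as frozensets of nodes, impacts are not stored at all, and consequences are computed by naive forward chaining to a fixed point instead of the modified BFS. Square $b \times b$ blocks with $n = b^2$ are assumed, and all function names are hypothetical.

```python
from itertools import product


def build_triggers(b):
    """Return {node: [trigger, ...]} for an n x n grid with n = b * b.
    A trigger is a frozenset of nodes that together imply the node."""
    n = b * b
    numerals = range(1, n + 1)
    cells = list(product(range(n), repeat=2))

    def groups(cell):
        r, c = cell
        br, bc = r - r % b, c - c % b
        return [[(r, j) for j in range(n)],
                [(i, c) for i in range(n)],
                [(br + i, bc + j) for i in range(b) for j in range(b)]]

    trig = {(cell, k, s): [] for cell in cells for k in numerals
            for s in (True, False)}
    for cell in cells:
        for k in numerals:
            pos = (cell, k, True)
            others = [m for m in numerals if m != k]
            # Eliminate: all other numerals excluded from the cell imply k.
            trig[pos].append(frozenset((cell, m, False) for m in others))
            # Cell-Flood: k in the cell excludes every other numeral.
            for m in others:
                trig[(cell, m, False)].append(frozenset([pos]))
            for g in groups(cell):
                peers = [p for p in g if p != cell]
                # Locate: k excluded from all other cells of g implies k here.
                trig[pos].append(frozenset((p, k, False) for p in peers))
                # Group-Flood: k here excludes k from all other cells of g.
                for p in peers:
                    trig[(p, k, False)].append(frozenset([pos]))
    return trig


def closure(trig, facts):
    """Add every node that has a trigger consisting only of known nodes."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for node, triggers in trig.items():
            if node not in facts and any(t <= facts for t in triggers):
                facts.add(node)
                changed = True
    return facts


def solve(grid, b):
    """grid is a list of rows with 0 for empty cells."""
    n = b * b
    trig = build_triggers(b)
    hints = {((r, c), grid[r][c], True)
             for r in range(n) for c in range(n) if grid[r][c]}
    known = closure(trig, hints)
    changed = True
    while changed:
        changed = False
        for cell, k, positive in list(trig):
            f, f_neg = (cell, k, True), (cell, k, False)
            if positive and f not in known and f_neg not in known:
                # If assuming f implies its own negation, f is impossible.
                if f_neg in closure(trig, known | {f}):
                    known = closure(trig, known | {f_neg})
                    changed = True
    return [[next((k for k in range(1, n + 1) if ((r, c), k, True) in known), 0)
             for c in range(n)] for r in range(n)]


puzzle = [[1, 0, 3, 0],
          [0, 4, 0, 2],
          [2, 0, 4, 0],
          [0, 3, 0, 1]]
solution = solve(puzzle, 2)
```

The example uses a $4 \times 4$ puzzle to keep the naive fixed-point computation fast; for $9 \times 9$ grids one would maintain impact lists and counters of untriggered nodes per trigger, as the description of the data structure suggests. Cells that cannot be determined remain 0 in the result.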
With this technique, we now have a more powerful polynomial-time solving mechanism that allows for the implementation of the AC, RC, EAC and ERC rules by limiting the set of edges that are subject to the path algorithm used. The Graph-Accessibility algorithm that finds a path in the graph structure was modified to only use edges that imply their incident vertices to be bilocated, bivalued or, in the case of the extended rules, grouped, as described in the respective sections: All Cell-Flood triggers are allowed; Locate and Eliminate triggers are allowed only if the trigger contains at most one untriggered node. If a Group-Flood trigger is encountered, the AC and RC implementation makes sure that the cells of both nodes have only two possible numerals and thus imply a Bivalue rule. The EAC and ERC implementation makes sure that no two Group-Flood triggers that do not imply bivaluation are consecutive.

Footnote 3: BFS stands for breadth-first search.

To show how Alternating Cycles are detected and exploited, suppose there is an Alternating Cycle. Let $c$ be a cell in the cycle having more than two possible numerals. Let $x$, $y$ and $z$ be three of them, with $x$ and $y$ being part of the labels of the two incident edges used to form the cycle. The path
algorithm will find a path from the node $(c, z)$ to the node $(c, -z)$ along the cycle, thereby removing $z$ from consideration for this cell as proposed by the Alternating Cycle rule.

To show how Repetitive Cycles are detected and exploited, suppose there is an alternating path that starts and ends in the same cell $c$. If $(V, z)$ is a label of the first or last edge that is used to form this path, then the path algorithm will find a path from the node $(c, z)$ to $(c, -z)$, thereby removing $z$ from consideration for this cell. If $(L, z')$ is such a label, then the path algorithm will find a path from the node $(c, -z')$ to $(c, z')$, thereby removing $-z'$ from consideration for $c$ and hence triggering $(c, z')$ as described in the RCR definition. The same holds for the extended versions of the two rules.

When applying Limited Constraint Propagation, the difficulty of a puzzle can be estimated by modifying the path algorithm to always take the simplest available path and analyzing its structure. The overall length, the average number of untriggered nodes of the triggers, and the types of the triggers used are example features that may contribute to measuring the difficulty of a path. Having an implementation for all three methods, we are now prepared to carry out some experiments.

Table 2: Comparison of the three solving strategies presented in Section 3.4.

Strategy                         Test Puzzles   Test Puzzles Solved
ACR and RCR                      5464           6 (0.11%)
EACR and ERCR                    5464           4261 (78%)
Limited Constraint Propagation   5464           5464 (100%)

4 Experimental Results

In this section, the results of our implementation of the deduction rules mentioned in Section 3.4 will be presented. In order to test the AC, RC, EAC and ERC rules (see Section 3.4.1 and Section 3.4.2) we modified our implementation of the Limited Constraint Propagation algorithm according to Section 3.4.3. All three variants were tested as follows: A list of Sudoku puzzles that could not be solved by Eppstein's solver [Epp05b] was generated.
A puzzle from this list will be referred to as a test puzzle. For each test puzzle and each of the three solvers, the solver and Eppstein's solver were run in turns until the puzzle was solved or no more numerals could be determined. Only traditional $9 \times 9$ Sudoku puzzles were considered. All in all, 143262 Sudoku puzzles were generated, 5646 of which were test puzzles (3.94%). The results are presented briefly in Table 2.

Although the combination of the two rules did prove to be helpful in solving Sudoku puzzles (Figure 18 shows a Sudoku puzzle that is not solvable by Eppstein's solver [Epp05b], but does contain an alternating cycle that yields a solution to the puzzle), it was rare that the AC and RC rules were enough to solve a test puzzle. Only six test puzzles (0.11% of all test puzzles) proved to be solvable by applying the AC and RC rules. The EAC and ERC rules
proved to substantially increase solving power. 4261 test puzzles (78%) could be solved using the implementation described above. As expected, the furthest generalization, the implementation of the limited constraint propagation algorithm, was superior to the simpler cycle rules. In the test, it was able to solve all 5464 test puzzles. This result encourages the idea that this algorithm may be able to solve all proper Sudoku puzzles. Additionally, since it is easy to combine the limited constraint propagation algorithm with a rating system, it may be an excellent choice for an implementation of a Sudoku puzzle generator as described in Section 3.2.

Figure 18: An example test puzzle that is solvable by applying the Alternating Cycle Rule. The cycle shown on the right implies the removal of the possibility of the numeral 3 in the lower left marked cell. Hence, the 3 is to be assigned to the cell directly below it.

5 Outlook and Future Work

While writing this article, the following tasks arose but were not addressed here. They provide topics for future research regarding Sudoku:

1. Compare generated puzzles with those of other generators (this needs a measure of quality, for example the number of filled cells).
2. Think about how to generate a full Sudoku grid (efficiently enumerate all?).
3. Find a suitable parameter, check whether Sudoku is FPT, and determine the problem kernel.
4. Calculate the number of different Sudoku grids under the ambiguous rectangle transformation.
5. Consider Sudoku as convincing someone that a certain solution is indeed unique to a puzzle.
6. Research backtracking by Knuth's dancing links.
7. Check the hints of an incremental generator for superfluousness.
8. Try to prove that Limited Constraint Propagation can solve all proper Sudoku puzzles.
9. Implement a Sudoku generator based on deletion witnesses with regard to the LCP solver.

In the process of developing generalizations of previously researched topics, there were dead ends and ideas that proved wrong. Attempts at extending the set of transformations further than just adding the ambiguous rectangle failed. First, interpreting the $n$-th numeral in the $m$-th row as the row of the numeral $n$ in the column $m$ failed because the resulting grid may have the same numeral twice in a block. However, this may work for Latin Squares. Secondly, switching rows with blocks instead of switching rows and columns (as a transposition does) failed because each block intersects a row and a column in three numerals each. If this block were to become a row, it would have to be intersected by two groups in three numerals each, while there would have to be a cell shared by all three groups, which is impossible if the overall geometry of the groups is to be kept.

References

[Dud02] Henry Ernest Dudeney. The Canterbury Puzzles. Dover Publications, 2002.
[Epp05a] David Eppstein. Nonrepetitive Paths and Cycles in Graphs with Application to Sudoku. ACM Computing Research Repository, July 2005.
[Epp05b] David Eppstein. PADS library, July 2005. Source code of sudoku.py, URL: http://www.ics.uci.edu/~eppstein/pads/sudoku.py.
[FJ06] Bertram Felgenhauer and Frazer Jarvis. Mathematics of Sudoku I. Mathematical Spectrum, 39, February 2006.
[HM07] Agnes M. Herzberg and M. Ram Murty. Sudoku Squares and Chromatic Polynomials. Notices of the AMS, 54:708-717, June 2007.
[Mon05] Christopher Monckton. Sudoku X Book 1: The Only Puzzles With the X Factor. Justin, Charles & Co., Nov 2005.
[Pet06] Kjell Fredrik Pettersen. Sudoku Enumeration 2x5. URL: http://www.afjarvis.staff.shef.ac.uk/sudoku/sud25gp.html, 2006.
[PMH06] Caroline Higgins and Peter M. Higgins. The Official Book of Circular Sudoku: Book 1. Plume, July 2006.
[RJ06a] Ed Russell and Frazer Jarvis. Mathematics of Sudoku II. Mathematical Spectrum, 39, February 2006.
[RJ06b] Ed Russell and Frazer Jarvis. Sudoku Enumeration 2x3. URL: http://www.afjarvis.staff.shef.ac.uk/sudoku/sud23gp.html, 2006.
[Rus06] Ed Russell. Sudoku Enumeration 2x4. URL: http://www.afjarvis.staff.shef.ac.uk/sudoku/sud24gp.html, 2006.
[Sem05] Ivan Semeniuk. Addictive, seductive, sudoku. New Scientist, 2531, December 2005.
[Tel06] The Daily Telegraph. The Daily Telegraph Samurai Sudoku. Pan Books, Oct 2006.
[YS03] Takayuki Yato and Takahiro Seta. Complexity and Completeness of Finding Another Solution and Its Application to Puzzles. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 86(5):1052-1060, 2003.