ENUMERATION OF UNIQUELY SOLVABLE OPEN SURVO PUZZLES

Similar documents
Sudoku puzzles and how to solve them

5 INTEGER LINEAR PROGRAMMING (ILP) E. Amaldi Fondamenti di R.O. Politecnico di Milano 1

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS

Lecture 3: Finding integer solutions to systems of linear equations

Practical Guide to the Simplex Method of Linear Programming

1.2 Solving a System of Linear Equations

Linear Programming for Optimization. Mark A. Schulze, Ph.D. Perceptive Scientific Instruments, Inc.

5.1 Bipartite Matching

Solution to Homework 2

The Taxman Game. Robert K. Moniot September 5, 2003

Kenken For Teachers. Tom Davis June 27, Abstract

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

Notes on Determinant

ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS

Mathematics of Sudoku I

3. Mathematical Induction

160 CHAPTER 4. VECTOR SPACES

Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices:

Session 7 Bivariate Data and Analysis

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

Zeros of a Polynomial Function

Basic Probability Concepts

1 Review of Newton Polynomials

ALGEBRA. sequence, term, nth term, consecutive, rule, relationship, generate, predict, continue increase, decrease finite, infinite

SYSTEMS OF EQUATIONS AND MATRICES WITH THE TI-89. by Joseph Collison

Enumerating possible Sudoku grids

Scheduling Shop Scheduling. Tim Nieberg

CBA Fractions Student Sheet 1

The Prime Numbers. Definition. A prime number is a positive integer with exactly two positive divisors.

Row Echelon Form and Reduced Row Echelon Form

Linear Codes. Chapter Basics

Mathematical finance and linear programming (optimization)

6.2 Permutations continued

NP-Completeness I. Lecture Overview Introduction: Reduction and Expressiveness

Notes on Excel Forecasting Tools. Data Table, Scenario Manager, Goal Seek, & Solver

Lecture Notes on Linear Search

Excel 2010: Create your first spreadsheet

6.4 Normal Distribution

Probability Distributions

7 Gaussian Elimination and LU Factorization

EXCEL SOLVER TUTORIAL

Guide to the Uniform mark scale (UMS) Uniform marks in A-level and GCSE exams

a 11 x 1 + a 12 x a 1n x n = b 1 a 21 x 1 + a 22 x a 2n x n = b 2.

CHAPTER 15 NOMINAL MEASURES OF CORRELATION: PHI, THE CONTINGENCY COEFFICIENT, AND CRAMER'S V

Lecture 1: Systems of Linear Equations

Introduction to Matrix Algebra

Linear Programming. March 14, 2014

COMBINATORIAL PROPERTIES OF THE HIGMAN-SIMS GRAPH. 1. Introduction

Application of linear programming methods to determine the best location of concrete dispatch plants

A Review of Sudoku Solving using Patterns

Linear Programming Notes V Problem Transformations

Playing with Numbers

The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy

1(a). How many ways are there to rearrange the letters in the word COMPUTER?

Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics

Transportation Polytopes: a Twenty year Update

FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM

Using Excel for Statistical Analysis

Chapter 3. Cartesian Products and Relations. 3.1 Cartesian Products

Commonly Used Excel Functions. Supplement to Excel for Budget Analysts

WHERE DOES THE 10% CONDITION COME FROM?

CALCULATIONS & STATISTICS

USING EXCEL 2010 TO SOLVE LINEAR PROGRAMMING PROBLEMS MTH 125 Chapter 4

Notes on Factoring. MA 206 Kurt Bryan

8 Primes and Modular Arithmetic

Linear Programming. Solving LP Models Using MS Excel, 18

Exploratory Spatial Data Analysis

OPRE 6201 : 2. Simplex Method

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

MIP-Based Approaches for Solving Scheduling Problems with Batch Processing Machines

Algebra of the 2x2x2 Rubik s Cube

Mathematical Induction

Using Excel for Data Manipulation and Statistical Analysis: How-to s and Cautions

Comparables Sales Price

Lecture 11: Sponsored search

CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis. Linda Shapiro Winter 2015

ON THE COMPLEXITY OF THE GAME OF SET.

Solving Systems of Linear Equations

Pigeonhole Principle Solutions

SIMS 255 Foundations of Software Design. Complexity and NP-completeness

SUBGROUPS OF CYCLIC GROUPS. 1. Introduction In a group G, we denote the (cyclic) group of powers of some g G by

Lecture 3. Linear Programming. 3B1B Optimization Michaelmas 2015 A. Zisserman. Extreme solutions. Simplex method. Interior point method

Methods for Finding Bases

Sample Induction Proofs

INSTRUCTIONS FOR USING HMS 2014 Scoring (v2.0)

1 Solving LPs: The Simplex Algorithm of George Dantzig

5 Homogeneous systems

The Trip Scheduling Problem

Introduction to Microsoft Excel 2007/2010

Systems of Linear Equations

MATH 140 Lab 4: Probability and the Standard Normal Distribution

2 Describing, Exploring, and

Sensitivity Analysis 3.1 AN EXAMPLE FOR ANALYSIS

Unit 1 Number Sense. In this unit, students will study repeating decimals, percents, fractions, decimals, and proportions.

This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.

3 Some Integer Functions

Managerial Economics Prof. Trupti Mishra S.J.M. School of Management Indian Institute of Technology, Bombay. Lecture - 13 Consumer Behaviour (Contd )

Transcription:

ENUMERATION OF UNIQUELY SOLVABLE OPEN SURVO PUZZLES SEPPO MUSTONEN 30 OCTOBER 2007 1. Introduction In an m n Survo puzzle an m n table has to be filled by integers 1, 2,..., mn in such a way that the their row and column sums are equal to integers given as marginals. Often some of the numbers 1, 2,..., mn are placed readily in the table in order to guarantee that the solution is unique or to make the problem easier. A Survo puzzle is called open if the marginals only are given. Finding all open, uniquely solvable Survo puzzles is not a trivial task. Statistically speaking, it may seem surprising that such tables exist. An open Survo puzzle can be seen as a contingency table with row and column totals given but all cell frequencies missing. Usually the totals or marginal sums tell practically nothing about the cell frequencies. However, the restricition that the cell frequencies are exactly the integers 1, 2,..., mn in certain order is so strong that sometimes the marginals determine the cell frequencies uniquely. 2. Example Consider the following 3 4 contingency table with missing cell frequencies: 1 2 3 4 1 13 2 27 3 38 13 14 23 28 78 In this case there are a huge amount (1104682) of different tables having the same margins, two extreme examples being. Copyright c Seppo Mustonen, 2007 1 seppo.mustonen@survo.fi

2 SEPPO MUSTONEN 30 OCTOBER 2007 1 2 3 4 1 0 0 0 13 13 2 0 0 12 15 27 3 13 14 11 0 38 13 14 23 28 78 1 2 3 4 1 13 0 0 0 13 2 0 14 13 0 27 3 0 0 10 28 38 13 14 23 28 78 with correlation coefficients r = 0.78 and r = 0.92, respectively, when equally spaced scores are assumed. The grand total in these tables is 78 which happens to be equal to 1 + 2 + 3 + 4 +5+6+7+8+9+10 +11 +12. The only case among these over a million cases containing just these cell frequencies is 1 2 3 4 1 2 1 3 7 13 2 5 4 8 10 27 3 6 9 12 11 38 13 14 23 28 78 There is no direct proof that there are no other solutions for this open Survo puzzle. The only way to show the uniqueness is to use an algorithm which sorts out all possible solutions. On the other hand, it is easy to create, for example, 3 4 tables with cell counts 1, 2,..., 12 in some order. There are altogether 12! = 479001600 such tables but only 83952 (0.0175 per cent) of them presented as open Survo puzzles (only marginals given) have a unique solution. Of these 83952 tables only 83952/(3! 4!) = 583 are essentially different when tables obtained from each other by row and column permutations are considered similar. 3. Enumeration Two open, uniquely solvable Survo puzzles A and B are defined essentially different if the solution of A cannot be transformed into the solution of B by interchanging rows and columns or by transposing (in case m = n). Let S(m, n) be the number of essentially different open m n Survo puzzles. It is evident that S(1, 1) = 1 and S(2, 2) = 1, see [2].

ENUMERATION OF UNIQUELY SOLVABLE OPEN SURVO PUZZLES 3 Currently (October 2007) the following S(m, n) numbers have been found: m/n 2 3 4 5 6 7 8 9 10 2 1 18 62 278 1146 5706 28707 154587 843476 3 18 38 583 5337 55815 617658 4 62 583 5327 257773 5 278 5337 257773 6 1146 55815 7 5706 617658 8 28707 9 154587 10 843476 S(3, 3) = 38 was determined by Reijo Sund who was the first to pay attention to enumeration of open Survo puzzles. He calculated this result by studying all 9! = 362880 possible 3 3 tables by the standard combinatorial and data handling program modules of SURVO MM (see Appendix 1 in [2]). A much more efficient way for determining S(m, n) is to start from all possible partitions of margins, i.e. from possible distinct partitions of the total sum S = mn(mn + 1)/2 to sums of m and n integers. If the number of distinct partitions of r into s parts with minimal part min and maximal part max is denoted by P(r, s, min, max), there are when m n P(S, m, n(n + 1)/2, mn 2 n(n 1)/2)P(S, n, m(m + 1)/2, m 2 n m(m 1)/2) tables to be scanned for uniqueness of the solution and this is much less than (mn)!. For example, when m = 3 and n = 4, the numbers of those partitions partitions are P(78, 3, 10, 42) = 128 and P(78, 4, 6, 33) = 519 obtained in Survo by the COMB commands OFF=2,3,14,4,12,8,6,13,5,9,25,11 The sum of all numbers in a 3x4-table is 12*13/2=78. COMB P,CUR+1 / P=PARTITIONS,78,3 DISTINCT=1 MIN=10 MAX=42 RESULTS=0 Partitions 3 of 78: N[P]=128 COMB P,CUR+1 / P=PARTITIONS,78,4 DISTINCT=1 MIN=6 MAX=33 RESULTS=0 Partitions 4 of 78: N[P]=519 and thus there are only 128 519 = 66432 cases to be scanned instead of 12! = 479001600. The number S(3, 4) = 583 was computed by my original solver program (see [2], page 11). Already the number S(4, 4) would have been absolutely formidable to compute by my original program, but Petteri Kaski could do that rather easily by converting the task into an exact cover problem and got the result S(4, 4) = 5327. The most efficient solver program for open Survo puzzles seems to be SP_SOL which I created during Summer 2007 and the current table of known S(m, n) values is computed by it. The new program confirms the earlier results.

4 SEPPO MUSTONEN 30 OCTOBER 2007 4. Main features of the new solver program The idea of this program is simple, but its efficiency is essentially dependent on how various stages of the solution are implemented. The working principle is illustrated here by the next example given as the Problem 11 in [1], page 24. The row and column sums are here reversed to ascending order and thus the Survo puzzle 17 32 36 51 17 26 42 51 is to be considered. The only solution is 2 1 5 9 17 3 6 10 13 32 4 7 11 14 36 8 12 16 15 51 17 26 42 51 At first, no attention is paid on column sums. The target is to find preliminary solutions giving correct row sums 17,32,36,51. The procedure starts from the first row and the COMB program (used here only as an demonstration vehicle) gives the following 11 restricted partitions: COMB P,CUR+1 / P=PARTITIONS,17,4 DISTINCT=1 MAX=16 Partitions 4 of 17: N[P]=11 1 2 3 11 1 2 4 10 1 2 5 9! 1 2 6 8 1 3 4 9 1 3 5 8 1 3 6 7 1 4 5 7

ENUMERATION OF UNIQUELY SOLVABLE OPEN SURVO PUZZLES 5 2 3 4 8 2 3 5 7 2 4 5 6 From these partitions each in turn should be investigated but here a shortcut is taken by selecting the only one leading to a genuine solution, namely the third (!) 1 2 5 9. Next, the all alternatives for the row 2 are sought for. These partitions of 32 must not contain numbers 1,2,5,9 and they are found as follows: OFF=1,2,5,9 COMB P,CUR+1 / P=PARTITIONS,32,4 DISTINCT=1 MAX=16 Partitions 4 of 32: N[P]=16 3 4 10 15 3 4 11 14 3 4 12 13 3 6 7 16 3 6 8 15 3 6 10 13! 3 6 11 12 3 7 8 14 3 7 10 12 3 8 10 11 4 6 7 15 4 6 8 14 4 6 10 12 4 7 8 13 4 7 10 11 6 7 8 11 Again, each of these 16 partitions must be studied separately, but as in the first stage here, the only favourable one, the sixth (!) 3 6 10 11 is selected. Similarly, the candidates for the row 3 are found by observing what has selected already on the previous rows by OFF=1,2,5,9,3,6,10,13 COMB P,CUR+1 / P=PARTITIONS,36,4 DISTINCT=1 MAX=16 Partitions 4 of 36: N[P]=1 4 7 11 14 and this leads to only one possibility. The last row then must contain the remaining numbers. In this case they are 8,12,15,16.

6 SEPPO MUSTONEN 30 OCTOBER 2007 Thus a preliminary solution with correct row sums A B C D 1 1 2 5 9 17 2 3 6 10 13 32 3 4 7 11 14 36 4 8 12 15 16 51 16 27 41 52 computed sums 17 26 42 51 correct sums -1 1-1 1 error is then found, but the column sums taken now into consideration do not match. The final solution(s), if any, must be found by swapping numbers within the rows. In this case it is easy to detect that the swaps (1,2),(15,16) and these only lead to the final solution. In reality, the solution(s) must be found without shortcuts made in the previous instance. The solver program has to scan all possible alternatives. In this case a thorough search leads to 394 preliminary solutions with correct row sums, one of them being A B C D 1 1 2 3 11 17 2 5 6 9 12 32 3 4 8 10 14 36 4 7 13 15 16 51 17 29 37 53 computed sums 17 26 42 51 correct sums 0 3-5 2 error Now the first column sum is correct but the second sum is helplessly too large since it cannot be diminished by any swap within the rows (where numbers are always in increasing order). A similar deadlock is encountered by all the other 394 candidates save the only happy one. The solver program includes two recursive algorithms. The first one finds all candidates satisfying the row sums. The second one studies each candidate in turn and finds all final solutions by swaps within the rows. In both algorithms the essential task is to find restricted integer partitions with different kind of restrictions in each. 5. Reducing sets of marginal partitions Any solver program for Survo puzzles based on scanning the combinations of marginal partitions can be speeded up by observing that the sets of these partitions may be reduced by imposing simple linear constraints for cumulative sums of marginal sums. Let us consider an open m n Survo puzzle with marginal sums r 1 < r 2 <... < r m and c 1 < c 2 <... < c n. The corresponding cumulative sums are denoted R i = r 1 + + r i, i = 1, 2,..., m, C j = c 1 + + c j, j = 1, 2,..., n.

ENUMERATION OF UNIQUELY SOLVABLE OPEN SURVO PUZZLES 7 It can be easily verified that the puzzle has no solution if at least one of the conditions (1) R i < 2in(2in + 1)/2, C j < 2im(2im + 1)/2, i = 1, 2,...,m, j = 1, 2,...,n. is satisfied. The number of cases to be studied thoroughly by a solver program is still more substantially reduced by observing that the puzzle has no solutions if at least one of the conditions (2) (jm + in ij)(jm + in ij + 1)/2 + ij(ij + 1)/2 < R i + C j, i = 1,..., m 1, j = 1,..., n 1, is satisfied. Also these conditions are easy to prove. Instead of a general proof the validity of conditions (2) is shown in the case i = j = 2. Then (2) has the form (3) (2m + 2n 4)(2m + 2n 3)/2 + 10 < r 1 + r 2 + c 1 + c 2. and a possible solution of the puzzle may look like 1 2 r 1 3 4 r 2 c 1 c 2 The sum r 1 + r 2 + c 1 + c 2 takes its smallest value when in the left upper corner we have the four smallest numbers 1,2,3,4 (in some order) and in the two first rows and in the two first columns we have in other positions the smallest numbers above 4. Therefore in these rows and columns (2m + 2n 4 positions) the minimal setting consists of numbers 1, 2,..., 2m + 2n 4. The total sum of these numbers is (2m+2n 4)(2m+2n 3)/2 and because the sum r 1 +r 2 +c 1 +c 2 includes the numbers 1,2,3,4 twicely (sum equal to 10) the lower bound for r 1 + r 2 + c 1 + c 2 is the left side of (3). A general proof is a straightforward generalization. Unfortunately, conditions (1) and (2) do not cancel all impossible cases. For example, if the row sum r 1 is minimal and thus it contains the numbers 1, 2, 3, 4,..., n, the 2 2 left upper corner cannot hold numbers 1,2,3,4 but, at the lowest 1,2,5,6 which makes the condition (3) still stronger. These kind of enhancements have not yet been implemented. The good news are that conditions (1) and (2) cancel a substantial majority of puzzles doomed to lack any solution. For example, in case m = n = 4 the original number of cases to be solved is 2980 2979/2 = 4438710 and the number of unsolvable cases is then 1837169 (41.4 %), but when the conditions (1) and (2) are applied, their number drops to 193695 (4.4 %). A smaller case m = 3, n = 4 is suitable even for graphical illustration. The following graph tells how various cases are distributed in a lattice determined the indices of marginal partitions listed in a lexicographic order. The graph is plotted by Survo using e.g. the specification POINT_COLOR.

8 SEPPO MUSTONEN 30 OCTOBER 2007 Number of solutions for open 3x4 Survo puzzles (10 Sep 2007 / SM) 127 120 115 110 105 100 95 90 85 80 75 70 65 60 55 50 45 40 35 30 25 20 15 10 5 0 Part3 0 50 100 150 200 250 300 350 400 458 Part4 In this picture, each combination of marginal partitions, i.e. each open 3 4 Survo puzzle is represented by a colored dot. Interpretation of colors: Color Cases black 22821 0 solutions, according to (2) red 583 1 solution green 32505 2 or more solutions yellow 2843 0 solutions, not detected by (2) total 58752

ENUMERATION OF UNIQUELY SOLVABLE OPEN SURVO PUZZLES 9 The cases cancelled by conditions (1) (60 partitions of the total sum into 4 distinct parts) are omitted from the graph. Then it covers all remaining (519 60) 128 = 58752 combinations. The graph gives a general impression about the nature of the problem. It is good to zoom in to a suitable degree in order to see how nastily the red gems are located principally among the green and yellow dots. The current version of this paper can be downloaded from http://www.survo.fi/papers/enum survo puzzles.pdf [1] http://www.survo.fi/english [2] http://www.survo.fi/papers/puzzles.pdf References