Mid-contest analysis

by Navan Ruthramoorthy

Here is a simple analysis of the leading entry in the Sudoku contest. In a rough way, we want to summarize how far we've come and show that the possibility for algorithmic improvement still remains.

Contents

A Sample Puzzle

Let's load a sample puzzle.

clear
load midcontest puzzle1 list1
puzzle1
puzzle1 =

  Columns 1 through 6 

   28.2453         0         0         0         0         0
   40.0213         0         0         0         0    3.1104
         0         0         0         0         0         0
         0         0    2.7306   61.2247         0         0
         0         0         0         0   41.3306   15.6538
    5.4513   59.1322         0   26.1762         0         0
    7.6852         0         0         0         0         0
    4.9323   97.9423         0   39.5309         0         0
   10.8694         0         0   98.2561         0   24.3343

  Columns 7 through 9 

   62.3612         0    3.7733
         0         0    7.9101
         0    3.8627         0
         0         0         0
         0         0         0
         0   35.9642    9.2909
   20.9934         0         0
         0    6.3376         0
         0         0         0

Since the number of zeros in puzzle equals the length of the list, this problem involves only placing the numbers.

numel(puzzle1)-nnz(puzzle1)
length(list1)
ans =

    56


ans =

    56

Because there are no extra numbers in this problem, we can compute the target sum for the rows, columns, and the 3x3 matrices.

S = (sum(puzzle1(:)) + sum(list1))/9
S =

  268.1379

The current leader 'randomChangeNew1' by wuzhili gives the following answer.

solution1 = solver(puzzle1, list1)
solution1 =

  Columns 1 through 6 

   28.2453    4.0802   41.4974   12.1962   97.2444    5.9535
   40.0213   61.1029   11.5610    5.4586    5.0935    3.1104
   40.6966   25.5792   14.9902   13.1661   26.0939   99.2050
   93.0919    3.9903    2.7306   61.2247    8.6118   10.0240
   37.5220    0.7352   60.4859    1.0232   41.3306   15.6538
    5.4513   59.1322    6.1544   26.1762   10.3577   94.2764
    7.6852    9.4128   64.0140   11.1649   63.0417   10.0594
    4.9323   97.9423    1.3854   39.5309    5.0091    5.5122
   10.8694    6.8805   64.9051   98.2561   11.2048   24.3343

  Columns 7 through 9 

   62.3612   12.7141    3.7733
   95.9941   38.3403    7.9101
   16.6492    3.8627   26.2386
   16.5892   43.7246   27.4702
    7.5540   93.4369   14.6961
   19.2994   35.9642    9.2909
   20.9934   17.4942   64.0150
   10.3422    6.3376   97.5386
   18.4918   16.0829   16.6676

Let's calculate its score.

calculateScore(solution1)
ans =

   16.2687

Breaking down the score, the submitted solution does well with the column sums.

sum(abs(sum(solution1,1)-S))
ans =

    2.5817

It is not as good with row sums.

sum(abs(sum(solution1,2)-S))
ans =

   10.2941

And it's much worse with the sub-matrix sums.

h = reshape(1:9,3,3);
g = ceil((1:9)/3);
h = (h(g,g)-1)*9 + repmat(h,3);

sum(abs(sum(solution1(h))-S))
ans =

    3.3929

Sometimes just looping a little more helps. Just by increasing the NUMRUNS to 15, the M-code gives a much better solution.

betterSolution = solverm(puzzle1, list1);
calculateScore(betterSolution)
ans =

    7.3982

The error in each of the three dimensions is now equally small.

sum(abs(sum(betterSolution,1)-S))
sum(abs(sum(betterSolution,2)-S))
sum(abs(sum(betterSolution(h))-S))
ans =

    2.4112


ans =

    2.3085


ans =

    2.6786

Of course, this decreased error comes at a cost of additional runtime. Because entries are now pushed up against the steep part of the timing penalty, this can be too costly to your overall score.

Extra Numbers

Let us see another example with the length of list longer than the number of zeros in the puzzle.

load midcontest puzzle2 list2
puzzle2
puzzle2 =

  Columns 1 through 6 

   35.9172   99.6454         0         0         0    4.8500
         0         0         0   59.0665         0         0
         0         0    6.1037   78.7512         0         0
   95.6195         0   18.3014         0         0   40.0331
         0         0    5.7796         0   79.8167         0
   57.6140   37.3150         0         0    4.9584   89.7484
         0   60.9279   37.3798         0         0         0
         0    6.1666         0         0   96.0928         0
   21.6282         0         0         0         0         0

  Columns 7 through 9 

         0         0   93.5906
         0         0    7.4406
         0   39.8439         0
    4.9720         0         0
         0         0         0
   77.6967         0         0
         0         0         0
   60.0590         0         0
         0   92.3738    3.9943

In this puzzle, there too few empty spaces for all the numbers in the list.

numel(puzzle2)-nnz(puzzle2)
length(list2)
ans =

    53


ans =

    81

Again, try this puzzle against the current best entry.

solution2 = solver(puzzle2,list2)
solution2 =

  Columns 1 through 6 

   35.9172   99.6454   48.6732   47.7743   92.7385    4.8500
   80.3335    7.4161   97.5189   59.0665    1.0405   93.1149
   16.3018   89.4345    6.1037   78.7512   92.0520   11.2658
   95.6195   91.9891   18.3014   89.8951    5.6462   40.0331
   59.2705   18.9701    5.7796   40.6975   79.8167   92.7123
   57.6140   37.3150   96.3704   37.9078    4.9584   89.7484
   95.1059   60.9279   37.3798    8.7746   17.5893   98.3972
   19.1128    6.1666   95.4864   25.2319   96.0928   32.7882
   21.6282   70.5162   74.9541   92.7434   91.4784   18.0534

  Columns 7 through 9 

   38.4076   19.4348   93.5906
   55.1606   80.0086    7.4406
   74.1868   39.8439   73.3003
    4.9720   57.1190   77.6202
   62.7877   21.3030   99.7659
   77.6967   39.9556   40.2094
   92.7368   62.1994    8.2519
   60.0590   69.5821   76.5980
   15.0219   92.3738    3.9943

Here is a better solution that we've calculated.

load midcontest bettersolution2
bettersolution2
bettersolution2 =

  Columns 1 through 6 

   35.9172   99.6454   54.6707    4.5019   94.3556    4.8500
   80.3335   15.0219   98.3972   59.0665   37.9078   92.7434
   93.1149    4.5089    6.1037   78.7512   18.9701   96.3704
   95.6195   80.0086   18.3014   92.7368   57.1190   40.0331
    5.6462   92.1701    5.7796   92.7123   79.8167   21.3030
   57.6140   37.3150   95.1059    8.7746    4.9584   89.7484
   91.4784   60.9279   37.3798   91.7040   11.2658    1.0405
    6.1571    6.1666   91.7051   19.4348   96.0928   78.3816
   21.6282   91.7258   80.8319   40.2094   87.0928   62.7877

  Columns 7 through 9 

   25.2319   74.9541   93.5906
   89.8951    6.2719    7.4406
   88.2372   39.8439   62.1994
    4.9720    6.4548   92.0520
   38.4076   59.2705   92.2849
   77.6967   97.5189   19.1128
   95.4864   21.7773   77.1519
   60.0590   89.4345   39.9556
    7.4161   92.3738    3.9943

Not a single number in the better solution matches with the solution from the solver. The scores are pretty close, however, showing that there are local minimum scores whose solutions are very far apart.

[calculateScore(bettersolution2) calculateScore(solution2)]
ans =

    6.9684    7.6338

Integers

Let us try a standard Sudoku puzzle with these solvers.

% Define the puzzle.
puzzle3 = [ ...
    5     3     0     0     7     0     0     0     0
    6     0     0     1     9     5     0     0     0
    0     9     8     0     0     0     0     6     0
    8     0     0     0     6     0     0     0     3
    4     0     0     8     0     3     0     0     1
    7     0     0     0     2     0     0     0     6
    0     6     0     0     0     0     2     8     0
    0     0     0     4     1     9     0     0     5
    0     0     0     0     8     0     0     7     9];

% Define the list.
list3 = repmat(1:9,1,9);
dups = puzzle3(puzzle3~=0);
for i = 1:length(dups)
    list3(min(find(list3==dups(i)))) = [];
end

Here is the best solution which any regular Sudoku solver will find.

bestsolution = [ ...
    5     3     4     6     7     8     9     1     2
    6     7     2     1     9     5     3     4     8
    1     9     8     3     4     2     5     6     7
    8     5     9     7     6     1     4     2     3
    4     2     6     8     5     3     7     9     1
    7     1     3     9     2     4     8     5     6
    9     6     1     5     3     7     2     8     4
    2     8     7     4     1     9     6     3     5
    3     4     5     2     8     6     1     7     9];

calculateScore(bestsolution)
ans =

     0

The current leader does not find it.

solution3 = solver(puzzle3,list3)
calculateScore(solution3)
solution3 =

     5     3     5     6     7     6     5     2     6
     6     3     2     1     9     5     8     2     9
     3     9     8     3     7     1     2     6     5
     8     3     4     2     6     8     4     7     3
     4     7     4     8     4     3     7     7     1
     7     4     4     4     2     8     5     5     6
     9     6     6     9     1     3     2     8     1
     1     9     7     4     1     9     9     1     5
     2     1     5     8     8     2     3     7     9


ans =

     4

Integers + Jitter

One difference between the original Sudoku game and our contest test suite, is that the numbers are not unique. By introducing random jitter to the puzzle in the previous example, we can make the digits unique.

puzzle4 = puzzle3 + (rand(size(puzzle3)).*(puzzle3~=0))/100
list4 = list3 + rand(size(list3))/100;
puzzle4 =

  Columns 1 through 6 

    5.0088    3.0016         0         0    7.0089         0
    6.0008         0         0    1.0054    9.0066    5.0017
         0    9.0084    8.0022         0         0         0
    8.0006         0         0         0    6.0036         0
    4.0085         0         0    8.0039         0    3.0058
    7.0092         0         0         0    2.0028         0
         0    6.0023         0         0         0         0
         0         0         0    4.0042    1.0048    9.0001
         0         0         0         0    8.0029         0

  Columns 7 through 9 

         0         0         0
         0         0         0
         0    6.0076         0
         0         0    3.0013
         0         0    1.0022
         0         0    6.0019
    2.0013    8.0069         0
         0         0    5.0098
         0    7.0093    9.0055

In this case, the current best entry does a bit better, but still doesn't quite get it.

solution4 = solver(puzzle4,list4)
calculateScore(solution4)
solution4 =

  Columns 1 through 6 

    5.0088    3.0016    5.0057    7.0033    7.0089    6.0089
    6.0008    3.0054    2.0067    1.0054    9.0066    5.0017
    4.0073    9.0084    8.0022    2.0018    7.0087    1.0017
    8.0006    2.0014    2.0078    2.0077    6.0036    8.0094
    4.0085    7.0013    5.0003    8.0039    4.0017    3.0058
    7.0092    5.0080    5.0095    4.0053    2.0028    8.0069
    9.0046    6.0023    6.0056    9.0047    1.0081    3.0068
    1.0067    7.0092    8.0005    4.0042    1.0048    9.0001
    1.0004    3.0077    4.0084    8.0084    8.0029    2.0056

  Columns 7 through 9 

    2.0003    4.0026    6.0082
    6.0098    4.0023    9.0081
    3.0074    6.0076    5.0013
    9.0091    5.0009    3.0013
    7.0029    7.0086    1.0022
    4.0099    3.0100    6.0019
    2.0013    8.0069    1.0061
    9.0044    1.0027    5.0098
    3.0015    7.0093    9.0055


ans =

    2.0030

Conclusion

The current best entry is very impressive and we're looking forward to the watching for new innovations made in the last day of the contest. Thank you all for your participation.