Sudoku code

Based on this model, the transmitter sends a sequence of all symbols of a solved sudoku.

The decoder gets a matrix with missing entries and uses the constraints of sudoku puzzles to reconstruct a limited amount of erased symbols.

Sudoku codes are not suitable for practical usage but are subject of research.

Questions like the rate and error performance are still unknown for general dimensions.

[1] In a sudoku one can find missing information by using different techniques to reproduce the full puzzle.

This method can be seen as decoding a sudoku coded message that is sent over an erasure channel where some symbols got erased.

By using the sudoku rules the decoder can recover the missing information.

In the erasure channel model a symbol gets either transmitted correctly with probability

The binary erasure channel model however is not applicable because it erases only individual bits with some probability and not Sudoku symbols.

It is filled in a way, that in each column, row and sub-grid N distinct symbols occur exactly once.

Every solved sudoku and every sub-grid of it is a Latin square, meaning every symbol occurs exactly once in each row and column.

At the starting point (in this case after the erasure channel) the puzzle is only partially complete but has only one unique solution.

Decoding methods like belief propagation are also used for low-density parity-check codes are of special interest.

[1] and Moon[4] This method is originally designed for low-density parity-check codes.

LDPC decoding is a common use-case for belief propagation, with slight modifications this approach can be used for solving Sudoku codes.

The aim of error-correcting codes is to encode data in a way such that it is more resilient to errors in the transmission process.

One Sudoku contains therewith about the same information as 72 coin tosses or a sequence of 72 bits.

sub-grid this number is excluded from the possibilities in the belief propagation decoder.

The source has to be transformed into a uniform distribution between three possibilities and mapped to the valid numbers and so on, until the grid is filled.

In the second lines the information is additionally reduced by the area rule: cell

[5] The minimum number of given entries that render a unique solution was proven to be 17.

[6] In the worst case only four missing entries can lead to ambiguous solutions.

For an erasure channel it is very unlikely that 17 successful transmissions are enough to reproduce the puzzle.

[7] Density evolution is a capacity analysis algorithm originally developed for low-density parity-check codes on belief propagation decoding.

[1] One important simplification used in density evolution on LDPC codes is the sufficiency to analyze only the all-one codeword.

Therewith it is necessary to compute density evolution recursions for every possible Sudoku puzzle to get precise performance analysis.

Lets assume the true output value is 4 and the inputs have cardinalities

The final output cardinality distribution can be expressed by summing over all possible input combinations.

To find the threshold the erasure probability must be increased until the decoding error remains positive for any number of iterations.

With the method of Sayir[1] density evolution recursions can be used to calculate thresholds also for Sudoku codes up to an alphabet size

The channel model for a standard sudoku erasure channel with a mapping from the input $X$ to the channel output $Y$ with the erasure symbol $?$ and erasure probability $p_{e}$ .

Schema of a sudoku transmission in the erasure channel model