Rusty bit Sudoku solving

Recently I’ve been doing a lot of Sudoku puzzles, following the discovery of the Cracking the Cryptic YouTube channel. I’ve always enjoyed Sudoku and they have served to entertain me during long journeys for many years. However, I’d never encountered any of the newer variants of the original puzzle such as “chess Sudoku” or “thermo Sudoku”, where the placement of a value is restricted based on what values are within a standard chess move or the position on a “thermometer”. These variations are a fresh take on the original puzzle and make for some really interesting and difficult problems. (I highly recommend the Cracking the Cryptic YouTube channel link.)

I’ve wanted to write a solver for Sudoku puzzles for some time, but have never got around to starting. Since I’ve got some free time now and have been solving a lot of puzzles recently, I thought it might be a good time to start. Of course, this is a rather complex challenge with many different aspects. I’m going to focus on the design of the underlying objects here. I’ll cover other aspects in later posts (as I figure these steps out).

There are two core problems that need to be addressed before we start to design the the program itself:

  1. How do we reference positions in the grid; and
  2. How do we store the possible digits in each cell of the grid.

The first problem seems to be fairly easy to solve. We can simply store the row and column number associated to each cell. The row and column coordinate pair - each of which is a digit between 0 and 8 inclusive - uniquely identifies a cell in the 9 by 9 grid. We can also use the row and column number to calculate the box in which the cell lies. The boxes are the smaller 3 by 3 grids of cells, labelled 0 to 8 start in the top left and counting along each row. We will need to be able to access all three of these attributes, along with the value (or possible values) of the cell.

The second question is slightly more difficult to answer. There are various objects available, but none of the basic options seem to be particularly helpful. The most obvious method is to simply store a list on each cell that holds the values that can be placed in the cell. This has the advantage of being fairly flexible, but suffers from complexity. Alternatively, we might store fixed values on the cell objects and then have some other method for describing the possible values. For example, we could store them in a multimap object indexed by values, where there entries are the cells in which that value can be placed. This approach has the advantage of being easy to inspect and analyse, since we can very easily find all possible locations for a given value. However, updating and maintaining this multimap will likely prove to be a rather complex task.

How we choose between these options depends on the type of operations we need to perform in solving a puzzle. For instance, we need to quickly determine the possible locations of a given value within a row, column, or box. We will also need to perform reductions - remove possibilities from a row, column, or box when a value is fixed in a cell. Already we can see that these operations might be easier for the multimap model. However, the multimap model will really struggle with inspecting particular rows, columns, or boxes. Here we will have to perform several value lookups for reach row. I think this is a real disadvantage of trying to use a multimap, especially considering the extra complexity of constructing and maintaining this object.

I’ve been watching a lot of videos on YouTube in the last few months - typically in the background whilst doing other things. (This is how I came across the Cracking the Cryptic channel.) I’ve watched a lot of videos on logic circuits and microprocessor architecture, which involves manipulating electrical signals on individual wires to set bits according to the architecture of the components. (I recommend the 8-bit breadboard computer series by Ben Eater on YouTube. It is a very interesting deep dive into how computers - microprocessors - work.) This gave me an idea about how I might go about storing the information about the cells in a Sudoku grid.

Rather than storing a structure containing the row and column as an integer, I can store one number from 1 to 9 in 4 bits of information. From a storage point of view, this is far more efficient than storing them as even the smallest integer type (8 bits/1 Byte), since we can fit both numbers in a single byte rather than using 2 bytes. We can apply the same kind of thinking to the keep track of the values that can be placed in a cell. We can assign a single bit to each of the digits 1-9, where a bit is set to 1 if the corresponding value can be placed in the cell and 0 otherwise. Since this will already need more than a single 8 bit integer in memory, we will have to use the next smallest size of integer, which is a 16 bit integer. This gives us 7 spare bits to store any additional information such as the box number (4 bits) and a flag to indicate whether the value is fixed or not (1 bit).

All together, we need 22 bits to store all the information for each cell in the grid. We can store these in a combination of an 8 bit integer and a 16 bit integer - leaving two spare bits in case we need these later. This storage structure is most akin to storing a list of possible values for each cell, but it is a little more subtle than that. I didn’t realise this when I decided that this is the way I wanted to implement the Sudoku grid.

The next step was to choose a programming language to use. I probably could have chosen any programming language for this, but I decided to choose Rust because it is a nice low level programming language where I have control over the size of the integer types used and can easily manipulate bits, but still have a full suite of modern features. I really enjoy writing Rust code because it doesn’t feel as restrictive and verbose as a language like C or C++. Of course it is a relatively new language, and still doesn’t have some of the flexibility as some of the more mature languages - generic types are not as well developed as C++ templates, for instance - but it is constantly improving.

For this project I won’t really need any of the really advanced features of the language. In fact, all we really need is a struct to hold our two integer types, and array of these structs to represent the grid. The only other features we need are individual bit manipulations - bitwise and, or, shifts - and basic arithmetic, along with basic flow control. This really is an exercise in design rather than an exercise in using clever features of the programming language.

Designing the program to work at such a low level - as in, not using clever language features - means that I have to work much harder in the planning and organisation stages to make it work. This is exactly how I wanted this project to work. Unfortunately, it also means that the actual implementation is going to be rather more difficult to understand.

A Sudoku cell in this program contains the row and column number, combined together into a single 8 bit integer, and the box, set flag, and possible values (and two spare bits) in a 16 bit integer. We’ll use a Rust tuple struct to hold these two integers. (We’re going to call this a SudokuSquare rather than a Cell.)

1
struct SudokuSquare(u8, u16);

We need to write an associated function to create a new default square in a given position. The usual practice is to call this function new. However, before we do this, we need to set up some constants that will help us to select the relevant pieces of information from these two integer values. For this we define some constants at the top of our file.

1
2
3
4
5
pub(crate) static ROW_MASK: u8 = 0xF0;
pub(crate) static COL_MASK: u8 = 0x0F;
pub(crate) static SET_BIT: u16 = 0x0200;
pub(crate) static DIGIT_MASK: u16 = 0x01FF;
pub(crate) static BOX_MASK: u16 = 0x7800;

These are bit masks that are used to extract certain bits from a value by performing a bitwise and operation of the mask and the data. For example, we can extract the first 4 bits - which encodes the row number in binary - from the 8 bit integer value by performing a bitwise and with the binary number 11110000, which is written in hexidecimal as 0xF0. These constants are made public to the whole crate using pub(crate), which will make them available to other parts of the program, but not from outside. We can see from these masks how the data is going to be arranged in each of the two integer types. The first 4 bits of the row-column data represents the row and the last 4 bits are the column. The first bit of the second integer is not used. The next 4 bits of the the second integer represents the box - since 0x7800 is 0111100000000000 in binary. The next bit is not used, and bit that follows is the set flag. The final 9 bits of the integer - masked by 0x01FF which is 0000000111111111 in binary - represent whether each of the values 1 - 9 are possible in this cell.

To convert from a pair of numbers in the range 1 to 9 into the row-column data to be stored in a SudokuSquare struct, we need to take the binary of that of the row number an shift this 4 bits to the left and add the bits for the column number. We’re not going to do any checking at this stage to make sure the values are valid, we assume that the caller has performed these checks in advance. We also have to create a default for the second piece of data, which encodes the box, set flag, and possible digits. At the beginning, all digits are available, the value is not set, and the box is determined from the row and column data. This means our new function will look something like the following.

1
2
3
4
5
6
7
8
9
impl SudokuSquare {

	fn new(row: u8, col: u8) -> SudokuSquare {
		let box_id = SudokuSquare::get_box_index(row, col);
        
        let position: u8 = ((row & 0x0F) << 4) + (col & 0x0F);
        SudokuSquare(position, box_id | 0x01FF)
	}
}

Let’s ignore the get_box_index call for the time being and focus on the second two lines of code. First we set the position by shifting the row digit to the left by 4 places and adding the column digit. (To prevent problems we are using a mask to ignore the first 4 bits of the number.) The position integer is the sum of these two integers. The second value is given by the bitwise or of the box_id value, which is a 16 bit integer provided by the get_box_index function, and the value 0x01FF which signifies that all values are available. The implementation of theget_box_index function is a simple matching exercise.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
    fn get_box_index(row: u8, col: u8) -> u16
    {
        let mut box_id: u16 = match (row, col) {
            (r, c) if r <=3 && c<= 3 => 0x0001,
            (r, c) if r <=3 && c<= 6 => 0x0002,
            (r, c) if r <=3 && c<= 9 => 0x0003,
            (r, c) if r <=6 && c<= 3 => 0x0004,
            (r, c) if r <=6 && c<= 6 => 0x0005,
            (r, c) if r <=6 && c<= 9 => 0x0006,
            (r, c) if r <=9 && c<= 3 => 0x0007,
            (r, c) if r <=9 && c<= 6 => 0x0008,
            (r, c) if r <=9 && c<= 9 => 0x0009,
            _ => panic!("Invalid row/column configuration")
        };
        box_id <<= 11;
        box_id
    }

This definition is also placed within the impl block for the SudokuSquare struct. Now we have a struct that represents a cell in a Sudoku grid and a means of creating “blank” squares ready for working.

I’m conscious of the length of this post, so I think this is a good point to stop for now. The code that I have written for this project is available on my GitHub https://github.com/inakleinbottle/bitsudoku.