*Note: some facts, dates, or historical events might not be accurate on this page. Don't use the information given here for homework or anything like that. Not to mention that the English is far from excellent...*

Sudoku is very popular. I don't know why. Stuff happens.

Someone I know began to play it. Then I read an article in a "scientific" journal (*IEEE Spectrum*, to name it) which explained some things about the game.

I said to myself: *hey! let's be original and solve this little game! And let's do it using a SAT solver!*

Of course, some people already did it. This is my nightmare: whenever I have a funny idea, someone somewhere has already done it. But that's life.

What does not really exist is a web page that explains all the little details about this SAT solving of sudoku.

So let me explain what I know about this affair.

And let me do it in great detail.

Everyone knows about sudoku. You have a 9x9 grid divided into small squares of 3x3. The goal is to put the numbers 1 to 9 in the grid so that each line, each column and each 3x3 square contains each of the numbers exactly once.

Here is an example:

    8 3 7 | 5 9 2 | 1 4 6
    5 6 1 | 4 7 8 | 9 3 2
    4 9 2 | 1 3 6 | 7 8 5
    ------+-------+------
    3 8 9 | 7 6 5 | 4 2 1
    2 4 5 | 8 1 9 | 3 6 7
    7 1 6 | 3 2 4 | 8 5 9
    ------+-------+------
    9 5 3 | 2 8 1 | 6 7 4
    1 7 4 | 6 5 3 | 2 9 8
    6 2 8 | 9 4 7 | 5 1 3

I love to draw pictures with text. It gives me some kind of weird philosophical feeling. But that's not the point.

You can easily check that each line, column, and 3x3 square contains the numbers 1 to 9 exactly once.
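If you are lazy, here is a little Python sketch (an illustration of mine, not part of the solver) that checks the example grid above:

```python
# Check that each line, column, and 3x3 square of a 9x9 grid
# contains the numbers 1 to 9 exactly once.
GRID = [
    [8, 3, 7, 5, 9, 2, 1, 4, 6],
    [5, 6, 1, 4, 7, 8, 9, 3, 2],
    [4, 9, 2, 1, 3, 6, 7, 8, 5],
    [3, 8, 9, 7, 6, 5, 4, 2, 1],
    [2, 4, 5, 8, 1, 9, 3, 6, 7],
    [7, 1, 6, 3, 2, 4, 8, 5, 9],
    [9, 5, 3, 2, 8, 1, 6, 7, 4],
    [1, 7, 4, 6, 5, 3, 2, 9, 8],
    [6, 2, 8, 9, 4, 7, 5, 1, 3],
]

def is_valid(grid):
    units = [row[:] for row in grid]                                 # lines
    units += [[grid[r][c] for r in range(9)] for c in range(9)]      # columns
    units += [[grid[br + r][bc + c] for r in range(3) for c in range(3)]
              for br in (0, 3, 6) for bc in (0, 3, 6)]               # 3x3 squares
    return all(sorted(unit) == list(range(1, 10)) for unit in units)

print(is_valid(GRID))  # True
```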

Of course, when you play the game, the grid is not given to you fully filled. Some numbers are hidden. And your goal is to put the right number at the right position.

By using some weird possibilities of your brain, you find the numbers.

Normally, a given grid has only one solution. But I have seen some grids with more than one solution. Shame on their authors! (They will remain anonymous.)

The game is said to have been invented in the 1970s by an American, if anyone cares. Well, Euler is also involved. Anyway, go here for some very good historical background on the game.

There are variations on the game. Some people propose bigger grids (16x16, 25x25, ...) or weird ones like Samurai Sudoku (scroll down to see it), and there are things like Duplex Plex Sudoku, Duplex Sums Sudoku, Duplex Difference Sudoku, and probably billions of others. When something is popular, this kind of derived work becomes the norm. (Of course, I cannot prove this claim.)

Note: it is not really sudoku that is NP-complete. It is the ASP version of sudoku that is. *ASP* stands for *Another Solution Problem*. We want to know if, given a solution to a sudoku, there is another solution or not. The answer will be *yes* or *no*, so we have here a *decision* problem. These are the problems for which NP-completeness is defined. But more on that very soon...

Ok, we have a big thing to explain here.

What is 'NP-complete'?

Well, let me say it in a few words.

You have a problem, with certain unknown stuff. And you want to know if there is a solution to this problem. For example you have a bag and objects. And you want to know if, given the geometrical dimensions of the bag and the objects, you can put the objects in the bag.

So, how do you do that? Well, that's simple, you try to put the objects in the bag. If you can put everything in you're done.

Well, in real life it is not that easy. It might be that the bag is very, very big and the objects are very, very heavy. So you would like to know if you can do it without actually doing it. You want to *plan* the packing. Because what happens if you cannot fit everything in? Maybe the first object was not placed correctly? Then you would need to take everything out and start again from the beginning. Woo, no one would do that!

So, some people began to dig into problem solving, so that you can decide how you would pack your bag before doing it. And they found it was damn hard to get a good method to do it.

Later came automation. Algorithms, machines (not yet computers at the time) and the like.

So people began to program, you know. So they said: hey! let's use an algorithm to solve our problem. So they tried. And could not do it.

One guy came along one day and said: hmm, there is a problem here. We want to solve some problems (by then they wanted to do more than packing bags) but we can't find good algorithms and programs; are we stupid or what?

So he created an abstract model of a computer to think about all this stuff. This is the Turing machine and the guy is, well, Turing. He was gay and killed himself after World War 2, because England criminalized homosexuality at that time. They didn't care that this guy had worked hard to help defeat the Nazis. To them, he was just a criminal and nothing else.

You can read an electronic version of his paper *On computable numbers, with an application to the Entscheidungsproblem* (I don't know if it is accurate or not, especially the figures) if you want to know in detail what a Turing machine is. Yeah, the guy seems obsessed with mathematics, but it's just to look intelligent. What he really cares about is how to pack his bag for his week-end with his (boy)friend.

And then people used this model, which looked fun.

They invented the P stuff, which has the following definition. A problem is said to be in P if you have a Turing machine that can find a solution in polynomial time. Polynomial means that the number of steps is bounded by a polynomial in the size of the problem. If you increase the problem *a bit*, you also increase the number of steps *a bit*. There exist problems for which you have to increase the number of steps *a lot* to check for a solution. Yeah, it looks weird, but these people are weird.

And then the NP stuff, which has the following definition. A problem is said to be in NP if you can find a Turing machine that, given a solution to the problem, can check it in polynomial time.

Why did they care? Because they had problems for which they could not find a Turing machine in P, but they had one in NP. So they formalized these notions a bit, for the fun of it. (Yeah, it's fun. At least for them.)

Now, for the NP-complete stuff. You have a problem that you know is in NP (you have this nice Turing machine that can check the validity of a solution in polynomial time) and you would like to compare it with other problems. So that maybe, if you can solve one (remember that NP means you can check whether a solution is good or not, but you don't know how to find one), you can solve the others too. So, a problem is NP-complete if all the other problems in NP can be transformed into it in polynomial time.

That would be great, no? You just need to solve one (NP-complete) problem, and all the other (NP) problems are solved, as by magic!

That's where NP-completeness comes in. A problem is NP-complete if it is in NP and you can transform any NP problem into it in polynomial time.

Does this kind of problem exist? Damn, yes! Otherwise no one would care about this stupid idea of NP and NP-completeness. That's where the fun arises. (See below for the paper by Cook demonstrating that SAT is NP-complete. It was the first paper proving such a result. Weirdos in computer science find it very, very important.)

And sudoku (well, ASP-sudoku) is one of those! Read the paper *Complexity and Completeness of Finding Another Solution and Its Application to Puzzles* by Takayuki Yato and Takahiro Seta if you don't believe me (as of 2008-04-10 the link is dead, try this one instead). (If you understand the stuff, good for you. I can't say I don't, I wouldn't look very serious, would I?)

Anyway, Wikipedia has some more serious information about it. Look for P, NP, NP-complete, Turing, Turing machines and the like on it if you want.

What you really need to know is that people don't know how to solve some problems efficiently. So they invented formalisms to classify these problems. NP-completeness is such a classification. If you know that a problem is NP-complete, you get the intuition that it's hard to solve. But you are not absolutely sure. Maybe there is a nice algorithm waiting to be discovered. This is the P versus NP problem. Is there a problem in NP that is not in P? We don't know. If you find the answer, tell the world and then take some vacation, because your name will be famous for a few thousand years, so you won't have to work like hell to put food on the table for your kids.

But that's enough. Let's move on.

So, I did a solver. Can I prove it? Yes I can!

*2014-04-02: the following code may fail to compile with
modern systems. Only msudoku2 is known to work.*

Here is some code:

- zsudoku.0

  the first version of my solver, pretty barbarian. It uses zchaff as SAT solver. Read `sudoku.c` (very undocumented, I won't explain its workings further).

- zsudoku

  the previous code was too hairy and encoded the problem in a very bad way, leading to very big CNF formulae (see below to know what a CNF formula is). So I did this version, much better.

- msudoku (msudoku2, released 2014-04-02, works with modern systems)

  I wanted to try other SAT solvers. So I checked satcompetition.org to see what the state of the art of the day was. It appeared to be minisat. So here is a version using minisat as SAT solver. It's faster than zchaff (for the empty grid, where you must guess all cells, zchaff needs more than 10 seconds and minisat not even 0.1 second).

  Note that I wrote this webpage in March 2006; maybe things are different when you read it. Maybe minisat is out of the game or does not exist anymore. But I don't control the evolution of other webpages, you know. So don't blame me for not trying the SAT solver of the (your) day. I can't spend my time checking the latest stuff that appears on earth.

- mdudidoku

  This one solves Duplex Difference Sudoku. Yes, I wanted to be a little original, so I made this. (I won't explain its workings.)

After writing some code (yes, *after*, shame on me), I wondered whether other people had done the same (you know, I wanted to be original).

Of course, there are billions of solvers out there.

Here is a small list:

- SuViSa

  SAT based, not the same encoding as mine: mine is logarithmic, this one is linear.

- magictour.free.fr

  SAT based. You can find there programs to generate and rate sudokus.

- Robert Sandilands' solver

  I don't know how it works. There is some "thinking" in there; it's definitely not SAT-based. This program can solve NxN sudokus.

- more stuff

  Where you find solvers in Prolog, Haskell, JavaScript, etc.

And just for the sake of it, a Sudoku Puzzle Generator.

But enough with those links.

Let me explain to you how the solver works. You will learn what a CNF formula is, what SAT is, what the DPLL procedure is.

Ok, so you have a sudoku problem and you want to use a computer to solve it. How do you give the sudoku problem to the computer (more specifically the SAT solver)?

Well, that's simple. You transform the grid into a SAT instance in CNF form and then run the SAT solver on it to get a solution out of it.

You don't get the point?

Ok, let's enter the details.

First of all, let me explain what a CNF formula and SAT are.

You have a sudoku, and you need to encode it as a bunch of clauses, what we call a CNF formula (CNF means conjunctive normal form). Go to Wikipedia for a definition of CNF.

To be short, a CNF formula is a set of clauses. For the CNF formula to be true, each clause has to be true.

A clause is a bunch of atomic propositions. These atomic propositions are made of variables that can be true or false. For the clause to be true, one of the atomic propositions has to be true (several of them can be true; we need *at least* one).

It's not very clear, so let me give you an example of a clause:

A or B or !C

This clause is true if A is true. It is also true if B is true. And still true if C is false.

(You are familiar with boolean variables, aren't you? These are variables that can be true or false, ie. can be 0 or 1, light on or light off, binary. Here A, B and C are variables of this kind; they can take only two values.)
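To make the "at least one" business concrete, here is a tiny Python sketch (the representation of literals is my own, just for illustration):

```python
# A literal is a pair (negated?, variable); a clause is true as soon as
# at least one of its literals is true under the given assignment.
def clause_true(clause, assignment):
    return any(assignment[var] != negated for negated, var in clause)

# The clause "A or B or !C":
clause = [(False, "A"), (False, "B"), (True, "C")]

print(clause_true(clause, {"A": True, "B": False, "C": True}))    # True: A is true
print(clause_true(clause, {"A": False, "B": False, "C": False}))  # True: !C is true
print(clause_true(clause, {"A": False, "B": False, "C": True}))   # False: no literal is true
```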

It looks simple, no? It should not be that hard to find a solution, should it?

In reality you may have a lot of clauses, and some variables may appear in several clauses. So you say: ok, this clause is true if A is true, so I set A to true. Now, this other clause is true if A is false or B is false, so because A is now true I must have B false. Now I have this third clause that is true if B is true or C is true, but because... etc.

As you see, you apply a reasoning. In your reasoning you sometimes have to *choose* values for variables. And a choice might be the wrong one. In that case you would have to undo all your reasoning and start again with the other choice.

You could say: hey! that's simple! let's *not* choose at all. Let's pick the right value directly, as imposed by the problem.

Sure, it would be nice to have that. But for the moment, no one knows how to do it. And a lot of people think it is not possible. I am on their side. I think there is no such magic method to know whether a given CNF formula is satisfiable. It's the famous P versus NP problem.

So you now have your CNF formula. And you want to know if there is a solution for it, ie. if you can give values to the variables so that all the clauses are true.

That's the SAT problem.

Why give a name to such a simple problem?

Because it has been proven that SAT is NP-complete. In fact, SAT was the first problem proven NP-complete. That's why it has a name, and that's why so many people waste their time trying to create fast solvers for it.

Go read Stephen A. Cook's paper, *The Complexity of Theorem-Proving Procedures*, written in 1971, which demonstrates that SAT is NP-complete. It's a very important paper. Cook has only written around 60 papers in his whole academic life (which is not over yet), which is not a lot compared to others. But, hey, he wrote this one, so no one cares that much that he does not produce much other stuff.

What is funny is that he says he also tried to prove that PRIME is NP-complete, but failed.

We now know (demonstrated in 2002) that PRIME is in P (the article was written by Manindra Agrawal, Neeraj Kayal and Nitin Saxena), so it's not a big surprise that Cook could not prove it NP-complete (serious people (me for example) think that P is not equal to NP).

By the way, PRIME is the problem of knowing whether a given number is prime, ie. divisible only by itself and by 1. These numbers are important in electronic security, for example; that's why we care so much about them. Otherwise, they are no sexier than the others.
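For completeness, here is the naive way to decide PRIME (trial division; fine for small numbers, useless for the big numbers cryptography cares about):

```python
# PRIME as a decision problem: is n divisible only by 1 and itself?
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:       # a divisor pair always has one member <= sqrt(n)
        if n % d == 0:
            return False
        d += 1
    return True

print([n for n in range(2, 30) if is_prime(n)])  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```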

Let me now explain how to translate a sudoku into a CNF formula.

You have your grid. And the rules that each cell must obey.

Here is how I encode a sudoku problem into a CNF formula.

First of all I need to encode a cell. I choose to use 4 boolean variables per cell for that purpose. That gives me 16 possible values. For example, to encode 5 you have the binary value 0101, which states that two variables will be 0 and the two others will be 1. So each cell gives 4 boolean variables. We have 9x9 cells, so all in all 9x9x4 = 324 boolean variables. Got the point?
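In Python, the binary encoding of a cell looks like this (a sketch of mine for illustration; the real thing lives in the C code):

```python
# Each cell gets 4 boolean variables holding its value in binary, MSB first.
def cell_bits(value):
    return tuple(bool((value >> i) & 1) for i in (3, 2, 1, 0))

print(cell_bits(5))   # (False, True, False, True), ie. 0101
print(9 * 9 * 4)      # 324 boolean variables for the whole grid
```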

After that, the rules. Here they are:

- each cell must have value 1, 2, 3, 4, 5, 6, 7, 8 or 9
- all the cells of a column must have different values
- all the cells of a line must have different values
- all the cells of a square must have different values

To encode the first rule, I state that a cell must be different from 0 AND different from 10 AND different from 11 AND different from 12 AND different from 13 AND different from 14 AND different from 15 (the 4 bits can also encode these values, which do not correspond to any digit).

Why that? Because I want clauses, and using the other way (the cell is 1 OR the cell is 2 OR ...) is not in CNF form.

Then, to encode the fact that a cell is different from a value x, I simply say that vb_{1} is different from x_{1} OR vb_{2} is different from x_{2} OR vb_{3} is different from x_{3} OR vb_{4} is different from x_{4}, where the vb_{i} are the boolean variables of the cell and the x_{i} are the boolean values of the number x to compare to.

To encode vb_{i} different from x_{i}, it's simple.
If x_{i} is 0 then we want vb_{i} to be 1, so we have
the atomic proposition vb_{i} and if x_{i} is 1 then
we want vb_{i} to be 0, so we have the atomic
proposition !vb_{i}.

So we have a set of clauses to encode the first rule. (Just glue all the previous stuff together and you're done.)
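Here is how that gluing can look in Python (an illustrative sketch; the variable numbering is mine, not the one used in `sudoku.c`). Literals are integers: a positive variable id means "the bit is 1", a negative one means "the bit is 0":

```python
# "cell != x" is a single clause: for each of the 4 bits, take the literal
# that disagrees with x's bit (if x's bit is 1 we need vb = 0, ie. !vb).
def neq_clause(cell_vars, x):
    clause = []
    for i, v in zip((3, 2, 1, 0), cell_vars):  # MSB first
        bit = (x >> i) & 1
        clause.append(-v if bit else v)
    return clause

cell = [1, 2, 3, 4]  # the 4 boolean variables of one cell
# Rule 1: forbid the values that are not digits 1..9 (0 and 10..15).
rule1 = [neq_clause(cell, x) for x in [0] + list(range(10, 16))]
for c in rule1:
    print(c)
```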

For the second rule, let's name the cells cell_{1} to cell_{9}. We want all these cells to be different, which means we want them pairwise different.

Here is what I wrote in the code to document this case (cells X and Y have to be different, noted X != Y):

    we have 2 vars: X and Y that have 4 bits, so we have
    X1, X2, X3, X4 and Y1, Y2, Y3, Y4

    X != Y means:
      X1 != Y1 OR X2 != Y2 OR X3 != Y3 OR X4 != Y4

    X1 != Y1 means (X1 AND !Y1) OR (!X1 AND Y1) is true

    We want a CNF form for that, so we apply the wonderful rule:
      (A AND B) OR C = (A OR C) AND (B OR C)
    which gives an exponential blow up, but we don't care here.

    We finally get, by multiple applications of the rule, that
      X1 != Y1 means (X1 OR Y1) AND (!X1 OR !Y1)
    We have the same for X2 != Y2 etc.

    Now we have:
      ((X1 OR Y1) AND (!X1 OR !Y1)) OR
      ((X2 OR Y2) AND (!X2 OR !Y2)) OR
      ((X3 OR Y3) AND (!X3 OR !Y3)) OR
      ((X4 OR Y4) AND (!X4 OR !Y4))

    We still have to apply the same rule to take out the AND
    and take in the OR. We finally find 16 clauses, of shape:
      T1i(X1, Y1) OR T2i(X2, Y2) OR T4i(X3, Y3) OR T8i(X4, Y4)
    where Tji is true if bit j of integer i is 1
    and when Tji is true  then Tji(Xn, Yn) = Xn OR Yn
    and when Tji is false then Tji(Xn, Yn) = !Xn OR !Yn

Is it clear? It gives 16 clauses of 8 atomic propositions each. It's barbarian, that's for sure. But it works!

Let me write down the 16 clauses for you to fully understand:

    X1 OR Y1 OR X2 OR Y2 OR X3 OR Y3 OR X4 OR Y4
    X1 OR Y1 OR X2 OR Y2 OR X3 OR Y3 OR !X4 OR !Y4
    X1 OR Y1 OR X2 OR Y2 OR !X3 OR !Y3 OR X4 OR Y4
    X1 OR Y1 OR X2 OR Y2 OR !X3 OR !Y3 OR !X4 OR !Y4
    X1 OR Y1 OR !X2 OR !Y2 OR X3 OR Y3 OR X4 OR Y4
    X1 OR Y1 OR !X2 OR !Y2 OR X3 OR Y3 OR !X4 OR !Y4
    X1 OR Y1 OR !X2 OR !Y2 OR !X3 OR !Y3 OR X4 OR Y4
    X1 OR Y1 OR !X2 OR !Y2 OR !X3 OR !Y3 OR !X4 OR !Y4
    !X1 OR !Y1 OR X2 OR Y2 OR X3 OR Y3 OR X4 OR Y4
    !X1 OR !Y1 OR X2 OR Y2 OR X3 OR Y3 OR !X4 OR !Y4
    !X1 OR !Y1 OR X2 OR Y2 OR !X3 OR !Y3 OR X4 OR Y4
    !X1 OR !Y1 OR X2 OR Y2 OR !X3 OR !Y3 OR !X4 OR !Y4
    !X1 OR !Y1 OR !X2 OR !Y2 OR X3 OR Y3 OR X4 OR Y4
    !X1 OR !Y1 OR !X2 OR !Y2 OR X3 OR Y3 OR !X4 OR !Y4
    !X1 OR !Y1 OR !X2 OR !Y2 OR !X3 OR !Y3 OR X4 OR Y4
    !X1 OR !Y1 OR !X2 OR !Y2 OR !X3 OR !Y3 OR !X4 OR !Y4

It should be clearer now. If you want, you can take the X != Y stuff and do all the calculations yourself. It's fun. (Yes it is.) (And no, I am not a computer scientist, because I failed to finish my PhD.)
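If you don't want to do the calculation by hand, here is a Python sketch that enumerates the 16 clauses (my own illustration, with integer literals where a negative id means a negated variable):

```python
from itertools import product

# All 16 clauses encoding X != Y, for two cells of 4 boolean variables each.
# For each bit pair we pick either (Xn OR Yn) or (!Xn OR !Yn).
def neq_cells(X, Y):
    clauses = []
    for signs in product((1, -1), repeat=4):
        clause = []
        for s, x, y in zip(signs, X, Y):
            clause.extend([s * x, s * y])
        clauses.append(clause)
    return clauses

clauses = neq_cells([1, 2, 3, 4], [5, 6, 7, 8])
print(len(clauses))  # 16
print(clauses[0])    # [1, 5, 2, 6, 3, 7, 4, 8], ie. X1 OR Y1 OR ... OR X4 OR Y4
```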

And you are done. You can use this method for rules 2, 3 and 4, by choosing the right cells.

You are left with a bunch of clauses that you give to the SAT solver.
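Concretely, the clauses are handed to the solver in the DIMACS CNF format, which both zchaff and minisat read: a header line `p cnf <#variables> <#clauses>`, then each clause as integers (a negative integer is a negated variable) terminated by 0. A quick sketch:

```python
# Serialize a CNF formula in the DIMACS format that SAT solvers read.
def to_dimacs(num_vars, clauses):
    lines = ["p cnf %d %d" % (num_vars, len(clauses))]
    for clause in clauses:
        lines.append(" ".join(str(lit) for lit in clause) + " 0")
    return "\n".join(lines)

# (A or B or !C) and (!A or C), with A=1, B=2, C=3:
print(to_dimacs(3, [[1, 2, -3], [-1, 3]]))
```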

Now, let me explain a bit how the SAT solver works.

First of all, there is the resolution procedure. Go to Wikipedia to get the full story.

To be short, the algorithm takes the clauses and creates new ones, until a contradiction is reached or no more new clauses can be found.
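A single resolution step takes two clauses, one containing a variable and the other containing its negation, and produces their resolvent. A tiny Python sketch:

```python
# From (A OR x) and (B OR !x) derive the resolvent (A OR B).
# Clauses are sets of integer literals; negation is the negative id.
def resolve(c1, c2, var):
    assert var in c1 and -var in c2
    return (c1 - {var}) | (c2 - {-var})

print(sorted(resolve({1, 2}, {-1, 3}, 1)))  # [2, 3]
print(resolve({1}, {-1}, 1))                # set(): the empty clause, a contradiction
```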

This is more or less what we want, but not exactly. We want to
find *explicit* values, so we need more than just that.

Then there is the Davis-Putnam algorithm, the beloved one. In fact it's not exactly that one that we want; it's the Davis-Putnam-Logemann-Loveland algorithm. This is not the same.

See on Wikipedia, the Davis-Putnam algorithm and the Davis-Putnam-Logemann-Loveland algorithm to get the point.

To be short, the Davis-Putnam-Logemann-Loveland algorithm uses resolution. When it cannot use it anymore, it decides to assign a value to a variable and does it all again. The key point here is the choice of the variable. If you choose well, you may be fast. If you choose badly, you will be slow like hell. (But see 4.3 below: there are cases where, whatever choice you make, you will *always* be slow like hell. People wanted to know if this resolution stuff was good for everything or not. It turned out it's not. Back to your quest, my dear computer scientists; resolution sucks.)
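To fix the ideas, here is a toy DPLL in Python (unit propagation plus a naive branching choice; nothing like a real solver such as minisat, which adds clause learning, smart heuristics and much more):

```python
# A bare-bones DPLL sketch. Clauses are lists of integer literals
# (a negative id is a negated variable).
def dpll(clauses, assignment=None):
    assignment = dict(assignment or {})

    def simplify(clauses):
        out = []
        for c in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in c):
                continue                     # clause already satisfied
            c = [l for l in c if abs(l) not in assignment]
            if not c:
                return None                  # empty clause: conflict
            out.append(c)
        return out

    while True:
        clauses = simplify(clauses)
        if clauses is None:
            return None                      # conflict
        if not clauses:
            return assignment                # all clauses satisfied
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        for l in units:                      # unit propagation
            assignment[abs(l)] = l > 0

    lit = clauses[0][0]                      # naive choice of branching variable
    for choice in (lit > 0, lit <= 0):
        result = dpll(clauses, {**assignment, abs(lit): choice})
        if result is not None:
            return result
    return None

# (A or B) and (!A or B) and (!B or C), with A=1, B=2, C=3:
print(dpll([[1, 2], [-1, 2], [-2, 3]]))
```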

For your enjoyment, you can read the article describing the Davis-Putnam algorithm here. It was published in 1960. You can also read the article describing the Davis-Putnam-Logemann-Loveland algorithm here. It was published in 1962.

For some modernity, you can read articles describing minisat (published in 2003 and 2005), for you to see how to optimize stuff. On the webpage are also some papers describing other SAT solvers. It's a good reading I think.

Let me raise some issues, to add a bit more value to this webpage.

I encode each cell with 4 boolean variables. Some people use 9 (read *Sudoku as a SAT Problem* by Inês Lynce and Joël Ouaknine, for example). Why not? Which is best?
What is best?

I should compare, both in terms of clause size, number of clauses, time to solve the problem, and the like. But, hey, it's just sudoku! I won't waste too much of my time on it, will I?

But anyway, maybe we can encode the problem better, with fewer clauses, fewer variables, and the like.

Some people wrote scientific papers about sudoku. Let me introduce two of them.

The first one is called *A SAT-based Sudoku Solver* (written by Tjark Weber) and was published in December 2005 at the 12th International Conference on Logic for Programming, Artificial Intelligence and Reasoning, which took place in Montego Bay, Jamaica. You had to pay from 300 to 500$ (not including transportation and hotel) to attend the conference. I guess the guy didn't pay it himself; his university did. It is often like that in academia. (By the way, researchers spend some of their time travelling (for free most of the time), so if you like to travel, go and be a researcher!) So you write some stuff about sudoku and you get a few days of vacation in Jamaica. Nice life, no?

But wait! In January 2006, two guys (Inês Lynce and Joël Ouaknine) published *Sudoku as a SAT Problem*, presenting their research at the Ninth International Symposium on Artificial Intelligence and Mathematics, held in Fort Lauderdale, Florida. The registration was between 150 and 340$. Once again, I guess the guys did not pay that themselves; their institution did (if they both went; maybe only one did). What a fun life. You write a few pages about sudoku so that people have some fun at the conference, and then you are off for a few days in Florida. Kewl!

The research community is great, definitely. (Don't come and say it's a waste of resources. Their work is important and deserves that the society supports it. Yes it does!)

Since with the DPLL procedure you have to make choices, it's good to make the *right* choice. Of course, no one knows what the *right* choice is, but by studying the clauses, maybe you can find a *better* one, no?

Some people work on this.

Check the satcompetition.org website for the state of the art in SAT solvers.

The use of DPLL and resolution is far from optimal.

In 1985, Armin Haken wrote the paper *The Intractability of Resolution* (sorry, I could not find it online; if you have it, send it to me). It shows that for the pigeonhole problem, which says that you cannot put *n+1* pigeons into *n* holes with at most one pigeon per hole, there is no short resolution proof.
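The pigeonhole formula is easy to generate, which makes the result even more cruel: a few lines of code produce a CNF that resolution provably cannot refute quickly. A Python sketch (the variable numbering is my own):

```python
# The pigeonhole principle as CNF: variable p(i, j) means "pigeon i sits
# in hole j", for n+1 pigeons and n holes.  The formula is unsatisfiable,
# yet (Haken 1985) every resolution refutation of it is exponentially long.
def pigeonhole_cnf(n):
    var = lambda i, j: i * n + j + 1          # pigeon i in 0..n, hole j in 0..n-1
    clauses = []
    for i in range(n + 1):                    # each pigeon is in some hole
        clauses.append([var(i, j) for j in range(n)])
    for j in range(n):                        # no two pigeons share a hole
        for i1 in range(n + 1):
            for i2 in range(i1 + 1, n + 1):
                clauses.append([-var(i1, j), -var(i2, j)])
    return clauses

print(len(pigeonhole_cnf(3)))  # 22 clauses: 4 pigeon clauses + 3 * C(4,2) hole clauses
```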

This was bad news. Resolution is bad. There are some problems for which you absolutely cannot decide quickly, using resolution, whether they have a solution or not. Everyone was sad and some people even committed suicide. Where do you think the economic crisis of 1987 came from, huh? (Yeah, the world needs some time to digest what computer scientists produce. Two years is damn short, when you think of it, by the way.)

Anyway, there we are. Resolution is bad.

By the way, if you want to read some stuff about how to prove lower bounds for resolution and various problems, you can find tons of articles online.

There is this, which explains (but whether you will understand it, I don't know) Haken's paper. Haken wrote an article with Cook (the one who proved SAT NP-complete), *An Exponential Lower Bound for the Size of Monotone Real Circuits*, in 1995, using the same bottleneck-counting technique he introduced in 1985. You can read *Hard examples for resolution* by Alasdair Urquhart, written in 1987. A lot of papers may also be found here. Enjoy. A last one to finish: *Propositional Proof Complexity: Past, Present, and Future* by Paul Beame and Toniann Pitassi, from 1998. Nice reading, with some open (as of 1998) problems. I don't know if some of those open issues are now closed or not. Anyway, nice reading.

Now, let me express some ideas I've got concerning this SAT solving stuff. (I won't give pointers, maybe some stuff exists, I don't know. If you know, tell me.)

**Direct manipulation of clauses without finding an explicit instance**.
The SAT problem is to know whether there is a solution or not. We don't care
about an explicit solution. Is there a method that would directly manipulate
clauses and answer *yes* if there is a solution and *no* otherwise,
but without providing any explicit solution? (No, I am not talking about
*hiding* a solution, you can do it with minisat for example, I talk about
a program that would *never* create a solution but still give a real
answer to the question whether the problem has a solution or not.)

**Use mathematical methods** (with real or complex numbers for example).
Let's imagine you create an equation using your SAT problem. Maybe you
can solve this equation with some mathematical tools in a quick way,
and thus know if your SAT problem has a solution or not? Maybe
by using some calculus or I don't know. Maybe...

**Go into 'another' domain**. If you run along a Koch snowflake, you can hardly get the surface it covers, because its length is infinite. For some given iteration you can do it. You know that you move straight for a length *n*, then turn right at angle θ and move straight for another length *n*, etc. You can deduce from that the surface covered. But for the limit of the series you cannot do it, because the length of the curve is infinite. So, if you want to know the surface covered by a Koch snowflake, you need another way than running around. You will probably use some mathematics and calculate the limit of the series, which will give you the analytical solution, which (if you are lucky) will be an easy-to-calculate number. (I think it is, I did it when I was a kid but don't remember exactly.) Maybe there is some similar stuff for this SAT problem. Maybe it is a mathematical formalism, maybe something different that needs to be invented.

**Use 'some' physical devices** to do the calculation (biological
molecules, atoms, quantum properties of nature, etc.). Why not?
What we really care about is if there is a solution. What we want
to do is to encode the SAT problem *in a way or another*, let the stuff
we encoded it into go to *some state* that we would check
later in time. Sure, stated this way it's very vague, but why not?
A computer is just that, but with a lot of atoms and each step
of the calculation full of redundant tests that slow down everything,
but provide a reliable device to calculate stuff. But it's hellishly slow. Maybe we can have some better physical device?

Anyway, some people think about it. And some people try to know what is the best we can have out of the universe and the laws we know about it. There is a paper, Computational capacity of the universe, published in 2002, written by Seth Lloyd if you want some ideas about it. I don't know if it is reliable or what, but hey, there you are. Judge by yourself.

In fact, the question people (of computer science) are interested in is the following one. Can we find a Turing machine program that solves a SAT problem (or equivalent problems, ie. any problem that is NP-complete) in a quick way (both computationally and in terms of time)? It's the P versus NP question.

But what do we care? The real question is: can we find a
*practical way* to solve a SAT problem quickly, whatever this
way is. If we need to pop bubbles to do it, let's pop bubbles!

For example, there is the paper Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer by Peter W. Shor, published in 1994. No one (when I write this, back in March 2006) knows of a good algorithm to factorize numbers on a classical computer (a Turing machine, to state it shortly), but we have stuff using quantum properties of nature. See? We don't really care about the Turing stuff and the complexity zoo and the like. We want practical ways to handle problems, whatever these ways are.

Let me quickly present you a nice data structure that might be of some interest to solve a sudoku.

Some people wrote research articles about sudoku and SAT, but I could not find anything about sudoku and BDD. Maybe you can do something? Do the analysis, write some code to prove it works, and then write a paper for a conference somewhere in Tahiti or Australia or anywhere you would like to go for free (I suppose you work into an institution that would pay for your travel of course).

So what is a BDD? Go to Wikipedia, their article is nice. You can also go read the original paper *Graph-Based Algorithms for Boolean Function Manipulation* by Randal E. Bryant, where you get it first-hand and with no deformation. It was published in 1986.

To describe it shortly, a BDD is a (sometimes) compact representation of a boolean formula. You can do conjunction (AND) and disjunction (OR) of boolean formulae directly on the BDDs.
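Here is a toy Python sketch of the idea (hash-consed nodes and the classic *apply* operation from Bryant's paper; a real BDD package adds memoization, complement edges, and so on):

```python
# A toy reduced ordered BDD: a node is True, False, or (var, low, high).
# Nodes are interned so that identical subgraphs are shared.
_nodes = {}

def mk(var, low, high):
    if low is high:                     # reduction: both branches identical
        return low
    key = (var, id(low), id(high))
    if key not in _nodes:
        _nodes[key] = (var, low, high)
    return _nodes[key]

def bdd_var(i):
    return mk(i, False, True)

def apply_op(op, u, v):
    # Bryant's apply: combine two BDDs with a boolean operation.
    if isinstance(u, bool) and isinstance(v, bool):
        return op(u, v)
    uv = float("inf") if isinstance(u, bool) else u[0]
    vv = float("inf") if isinstance(v, bool) else v[0]
    top = min(uv, vv)
    u0, u1 = (u, u) if uv > top else (u[1], u[2])
    v0, v1 = (v, v) if vv > top else (v[1], v[2])
    return mk(top, apply_op(op, u0, v0), apply_op(op, u1, v1))

def evaluate(node, env):
    while not isinstance(node, bool):
        var, low, high = node
        node = high if env[var] else low
    return node

# Build the BDD of (x1 AND x2) OR x3:
AND = lambda a, b: a and b
OR = lambda a, b: a or b
f = apply_op(OR, apply_op(AND, bdd_var(1), bdd_var(2)), bdd_var(3))

print(evaluate(f, {1: True, 2: True, 3: False}))   # True
print(evaluate(f, {1: True, 2: False, 3: False}))  # False
```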

It's sometimes compact, but not always. There are boolean formulae that have no compact BDD (formulae that describe multiplication, for instance). I don't know for sudoku. I tried a bit but could not find anything useful. Since sudoku (well, ASP-sudoku) is NP-complete, I am not surprised at all. But even for 3x3 sudoku I could not find anything. Maybe I did not construct the BDD the right way? I take some small BDDs and combine them together with AND and OR; maybe I don't do it in the proper order? I don't know. Dig into it!

By the way, some people wanted to compare resolution and BDD.
Well, it seems these are two different beasts. There are some
problems that are easily solved by resolution but not by BDD and
some problems that are easily solved by BDD but not by resolution.
Read
*Resolution
and Binary Decision Diagrams cannot simulate each other polynomially*
by Jan Friso Groote and Hans Zantema, written in 2000. It explains the stuff.
(Yeah, it was just before the economic crisis of 2002.)

But, hey, someone should do the analysis for sudoku, no?

Some solvers out there are less brutal. Some people encode their (brainy) reasoning into a program and let the program systematically apply the rules they invented until a solution is found.

It seems better than SAT, no? The program is *more aware* of
sudoku than a barbarian SAT solver.

But what about grids that are not solvable with some thinking?
What about grids where you need some choice at some point to get the
solution? *Ok, I can't find a solution. Let's suppose there is
a 7 in this cell and keep going.* See what I mean?

Anyway, this kind of solver is good in that it lets you measure the difficulty of a sudoku. If you need only the simple rules to solve it, it's an easy sudoku. If you need the very weird and tricky ones you just discovered yesterday in a dream where a sudoku grid was fighting George W. Bush to let the great green elephant escape to Mars, then it's a hard one.

How to design a sudoku is another interesting question. Some people make money with this activity.

Anyway, I don't know how to design a sudoku. If you know some free resources on the great Internet, contact me and I'll put the links right here.

Ok, so let me present some philosophical implications of SAT solving of sudoku. Yes, there are some.

Is SAT solving thinking? I mean, we have this resolution
procedure, this DPLL procedure, but does it *think*? It
clearly solves a problem. But is it *aware* that it
solves a sudoku? Or is it just trying to find a point
of equilibrium where it cannot manipulate clauses anymore
and gets an answer to the question whether this set has a
solution or not? What is thinking after all? What happens
in your brain when you solve problems? Why wouldn't a computer
program be said to *think* when it clearly finds a solution
to a problem?

Same for BDDs: is that thinking? The technique is different there, but still, the computer gives an answer to a question, so doesn't it think? Hum?

What about "intelligent" solvers? Do they think? Here it's
even worse, because the computer *mimics* the way we humans
solve this sudoku game. But you could tell me that the program
just follows the rules, it does not create new ones. So, thinking
would be to create rules to solve problems? What if someone somewhere
makes a program that invents rules to solve a sudoku? You would reply
that the program has been especially designed to find rules, so it
does not think. So what do you call thinking? Just what *you* do,
you and your brain, and nothing else? Or do you accept that some other
device might also be able to think, hum? Can something that is not
based on neurons think? Or maybe you think there is something special,
something more, that is needed to think, that neurons are not enough.
Maybe you think that there is a need for some kind of *soul*.
But let me tell you that souls do not exist, so it's not a good
answer.

Anyway, what about you? What about me? Are we thinking? Do you even exist? Hum?

Ok. Enough about what I did. It's a useless way to spend some time. What can you do in this business?

Well, prove that P is not equal to NP and you are done.

It is probably the case. Why? Well, I *feel* it this
way. What is a proof if not a path along a certain road?
And when you follow a road, you cannot go anywhere you want;
you just follow the road.

So you can imagine very, very long roads, specially designed
and built so that anyone will need a huge amount
of time to go from A to B (think of the Koch snowflake).
In other words, you can probably
define problems such that, for *any* way of thinking, a
huge amount of time is required to solve them. Yeah, you
can try some random values and with luck hit the right
one. But if the search space is damn huge, this is not practical.

Imagine that you throw a 1.01cm-long stick into the middle of one billion 1cm-long sticks. Then you mix the stuff and ask someone to pick out the 1.01cm-long stick. No one will be able to do it in a reasonable amount of time. Your brain is probably convinced that picking up sticks at random is useless here. It's the same for the values of the digits.

Of course, one can argue here. Why do we care about a thinking method at all? We want a device that gives answers, without caring what's going on inside. Sure. But such a device does not exist. Just like those philosophers who wasted their time trying to prove that movement cannot exist or that there is a God down here, you may spend the rest of your life on this idea. It would not advance the P versus NP business, though.

No, I tell you. Try to find some theoretical framework in which it is easy to see that P is not equal to NP. Yeah, it's tricky. But you would be famous for this one.

Unless, of course, P equals NP. But I doubt it.
Thinking is a process. Data is manipulated over time, transformed
into other data. Steps *must* be followed, in a certain
order. I am pretty convinced that there are problems out there
for which, whatever thinking you use, the number of steps is
too huge.

One needs to prove it, though. Maybe we must step outside of
thinking for that? Maybe we need to know more about the
thinking process itself? Maybe we need to study the
universe and conclude from its nature that we absolutely
need this step-by-step approach and cannot bypass it?
But how could we do it without an *absolute* knowledge
of the nature of the universe? And is such a knowledge possible
to acquire? I guess not. So we are doomed.

But we are going a bit too far here. P and NP are defined for Turing machines, not for the universe as a whole.

So prove it using Turing machines and you will still have done something really big and important. (And if you can prove something like that outside the Turing machine framework, your result may still be meaningful in the particular case of Turing machines.)

Or maybe we just cannot prove that P is not equal to NP, nor that P is equal to NP. Think about that guy, Gödel. What do you think, hum?

Contact: sed@free.fr

Creation time:
Fri, 10 Mar 2006 14:03:16 +0100

Last update:
Wed, 02 Apr 2014 12:51:59 +0200

Powered by a human brain, best viewed with your eyes (or your fingers if you are blind).