Hacker News

From hours to 360ms: over-engineering a puzzle solution

85 points by max__dev - 18 comments

toast0 [3 hidden]5 mins ago

I don't think it would make a large difference in runtime, but the bounds aren't quite right.

The largest possible value isn't 999,999,999, because that's not a legal value. The largest value is 987,654,321

Dividing that by 9, you get 109,739,369, but that's not a legal value either, you'd want to start with 109,738,365 for the gcd as it's a valid possible row. I don't know that all gcds would need to be a valid row, but any gcd larger than 98,765,432 would need to be used at multiples from 1-9; with that gcd or smaller, multiple 10 becomes viable and the gcd doesn't need to be valid with multiple of 1.

aroundtown [3 hidden]5 mins ago

For us slow kids in the back what does this mean, “[…] such that the nine 9-digit numbers formed by the rows of the grid has the highest-possible GCD over any such grid”?

sizediterable [3 hidden]5 mins ago

I also wrote my solution in Rust. I was pleased that my approach gave me an opportunity to write a linked list and get the chance to apply some rarely used learning[0].

My solution doesn't use SIMD, but is actually takes about the same amount of time as the the solution in the article, given the same number of cores, though an glaring weakness of my approach is that it can only scale up to 7 cores as written.

Rough outline of how mine works:

- Set current best GCD to 1.

- Search through all valid board configurations.

- Bail out from the search early if the current partially generated board couldn't have a GCD greater than the current maximum found, e.g. if we've only generated two rows of the board, and the GCD of those two rows are already less than the max.

- Update the current best GCD as you find higher ones.

- Share the current best GCD value across multiple threads. That way the longer the program runs, the earlier and earlier the searches will start bailing out.

- Don't always read from the shared variable to avoid contention. Instead, each thread has its own copy of the last value it read which we compare with first before copying and caching the shared value.

- Another interesting property of this approach is that it can be used to validate the correct solution even faster than it takes to find it. Instead of initially setting the max GCD to 1, set it to `solution - 1`. That way branches in our search will bail even sooner from the beginning. This leads to the program running about 20% more quickly.

Source: https://gist.github.com/lazytype/35b45f3ea81b5c1c5555546fe6f...

[0] https://rust-unofficial.github.io/too-many-lists/

natessilva [3 hidden]5 mins ago

Should look into Knuth’s DLX. My JS implementation solved this case in 1.8ms

amyres [3 hidden]5 mins ago

Since we're over-engineering here, one more optimization is to skip all potential GCDs that are divisible by 2 or 5.

Suppose the GCD was divisible by 2, then all rows would be even. Since the last digit of an even integer is in {0,2,4,6,8} and we need 9 unique numbers in the final column, we know that 4 or 5 of the row numbers must be odd. So the GCD can't be even.

Similarly, the GCD can't be divisible by 5. If it were, all rows numbers would need a 0 or 5 in the final digit.

lern_too_spel [3 hidden]5 mins ago

We know that the GCD is divisible by 9 because the sum of the digits for every row is 45.

Jtsummers [3 hidden]5 mins ago

Not necessarily. The sum of the digits ranges from 36-45, there are 10 possible digits of which 9 will be used. If, say, 5 were the unused number then it would not be divisible by 9.

foota [3 hidden]5 mins ago

Sigh. My friend did this problem on pen and paper and made me feel stupid. Their solution was so wildly clever! It relied on the observation that the sum of the rows of the sudoku board (given the digits in use) is a known fixed value, and went from there (I'll leave the rest as an exercise to the reader to avoid spoilers).

rahimnathwani [3 hidden]5 mins ago

The Jane Street page explains the (a?) number theory approach.

toast0 [3 hidden]5 mins ago

This isn't exactly a sudoku board, because they allow for 0.

Jtsummers [3 hidden]5 mins ago

While it's not a true sudoku board, that invariant still holds. Whichever set of digits you end up using (it'll be the same 9 for rows, columns, and blocks) the sum will be same for all of them. It's also unique to the set of 9 numbers you end up using to solve the puzzle.

If you use 1-9, sum is 45. For anything else, it's 45 - (the unused number).

foota [3 hidden]5 mins ago

Yep, I'm aware, hence the given digits part. The whole board is still drawn from 9 digits, it's just that precisely which 9 digits is unknown.

moralestapia [3 hidden]5 mins ago

If anything, what a testament of the massive failure Z3 is.

Jtsummers [3 hidden]5 mins ago

Unless I missed it, their Z3 solution wasn't even presented. So we can't comment on how good or bad Z3 is without seeing how good or bad their Z3 attempt was.

moralestapia [3 hidden]5 mins ago

That is true, indeed.

It could be that the author is just massively unskilled at writing Z3 code.

moralestapia [3 hidden]5 mins ago

Oh come on ...

Either one of the two statements is true, no other way around.

fn-mote [3 hidden]5 mins ago

Can you say more about this?

On the face of it this comment seems ridiculous to me. Z3 is fabulously successful in other domains. Perhaps the problem fit is not there, or the problem encoding chosen was not appropriate.

moralestapia [3 hidden]5 mins ago

(Way) worse than exhaustive brute force search ... you don't really have to say much more.

I think it's the first time I see such thing in the wild.