The title of this problem is a homage to a truly delightful book on bitwise magic: Hacker’s Delight.
We will show our solution in steps and illustrate it on one example.
Imagine X
and X >> 1
written one below another:
X : 01121 20000 00000 00000 00000 00200 00000 00000
A = X >> 1: 00112 12000 00000 00000 00000 00020 00000 00000
Each column now contains two consecutive trits of X
. We are looking for the first column in which X >> 1
is bigger (if such a column exists). How can we find it? Take the logical or, and look at the first column that does not match X
.
(In general, note that even though the statement calls &
“and” and |
“or”, it is better to think of them as the “min” and “max” operators, as that is another description of what they do.)
Mismatches can be found using xor. In particular, X ^ ~X
is 2 everywhere. Hence, X ^ ~A
will be 2 where they match and 0 or 1 where they differ.
A = X | A : 01122 22000 00000 00000 00000 00220 00000 00000
A = ~A : 21100 00222 22222 22222 22222 22002 22222 22222
A = X ^ A : 22221 20222 22222 22222 22222 22202 22222 22222
Now we would like to isolate the trits where X
and A
mismatched. The first step is obvious: negate A
. Then, the mismatched positions are 1s and 2s while the matched ones are 0s.
In order to make our life easier, let’s now turn any potential 2s into 1s using another clever trick: note that A ^ A
has the same non-zero trits, but with 1s and 2s swapped. Thus A & (A ^ A)
will only have 0s and 1s. In particular, the leftmost 1 is the position where the first comparison failed, and the number of 0s to the left of that 1 is the answer we seek.
A = ~ A : 00001 02000 00000 00000 00000 00020 00000 00000
B = A ^ A : 00002 01000 00000 00000 00000 00010 00000 00000
A = A & B : 00001 01000 00000 00000 00000 00010 00000 00000
^^^^
answer = 4
Let’s now propagate the 1s towards the right by shifting A
and or-ing it with itself:
B = A >> 1 : 00000 10100 00000 00000 00000 00001 00000 00000
A = A | B : 00001 11100 00000 00000 00000 00011 00000 00000
Each original 1 are now two consecutive 1s. Let’s do the same again with step 2:
B = A >> 2 : 00000 01111 00000 00000 00000 00000 11000 00000
A = A | B : 00001 11111 00000 00000 00000 00011 11000 00000
Now we do the same with steps of size 4, 8, 16, and 32, yielding:
A : 00001 11111 11111 11111 11111 11111 11111 11111
Now let’s “binary-negate” A, for example by subtracting it from the constant 111…1113.
A : 11110 00000 00000 00000 00000 00000 00000 00000
All that remains is to count the 1s in this A
. As with the propagation of 1s, we will do the population count in a logarithmic number of iterations. In the first iteration we will make groups of two trits, then merge those into groups of four, eight, sixteen, thirty-two, and finally forty.
We will only show the first iteration here: Let B = A &
020202…3, and let C = (A >> 1) &
020202…3. That is, B
are trits of A
at even positions and C
are trits of A
at odd positions, shifted one to the right. Once we have these two, set A = B + C
.
B : 01010 00000 00000 00000 00000 00000 00000 00000
C : 01010 00000 00000 00000 00000 00000 00000 00000
A : 02020 00000 00000 00000 00000 00000 00000 00000
After this iteration, if we consider each two-trit block to be a separate number, those numbers are precisely the numbers of 1s that started in that block.
After the next iteration we would have
A : 00110 00000 00000 00000 00000 00000 00000 00000
Note that the 0011
(created by adding 0002
and 0002
) is the ternary representation of the number 4.
Wait, what? Primality testing?
With only 4,000 instructions allowed, 3401 primes up to $\sqrt{10^9}$, and the need to use at least two instructions per prime (“compute the remainder” and “do something with the remainder”), trial division is probably out of the question. And trying out just a subset of those primes is not worth the effort – we’ll miss so many that random tests will almost certainly find a test case we’ll solve incorrectly.
So it seems that we need a smarter algorithm. But how? We don’t even have an “if” statement, how on Earth should we implement anything non-trivial? Well, it’s painful but definitely possible. For example, you can essentially emulate an “if” by evaluating the condition, doing the computation anyway, and multiplying its effects by the value of the condition – so that if the condition was false, you multiply each change by zero and in effect nothing happens.
Let’s start by designing a helper function that returns 0 for 0 and 1 for anything else. This is quite easy to do in constant time: 1 − ((x − 1)/(340 − 1)). The constant is the largest possible value. If x is zero, x − 1 overflows to that value, otherwise it is some smaller value. Thus, the result of division is 1 iff x was zero.
(Alternately, it’s easy to construct signum that runs in logarithmic time by propagating set trits similarly to what we did in the easy subproblem.)
In order to test whether two numbers are equal, we just subtract one from the other and use signum to test whether the result is zero.
The test we picked for our implementation is a deterministic version of the Miller-Rabin test. In particular, a result from 2013 when Izykowski and Panasiuk showed that n < 1, 050, 535, 501 is a prime if it is a strong pseudoprime in bases 336781006125
and 9639812373923155
. Thus, all we need to do are two iterations of the Miller-Rabin test.
Here is a short Python implementation of the test we are going to implement:
def is_SPRP(n,a):
d, s, c, a = n-1, 0, 1, a%n
if a == 0:
return True
while d % 2 == 0:
d, s = d//2, s+1
while d:
if d % 2: c = (c * a) % n
a, d = (a * a) % n, d // 2
if c == 1: return True
for r in range(s):
if c == n-1: return True
c = (c * c) % n
return False
def is_n_leq_billion_prime(n):
return n > 1 and is_SPRP(n,336781006125) and is_SPRP(n,9639812373923155)
Of course, we don’t have conditionals and fancy cycles with a variable number of repetitions. Luckily, each of the cycles in the function makes at most log2n steps, so we can just do a fixed number of 30 iterations. (And those will be 30 physical copies of the same sequence of instructions.) Hence, our implementation of the test will look more like this:
if n == 0:
set n to 1 so that we don't get bitten by a%n
d, s, c, a = n-1, 0, 1, a%n
if a == 0:
return True
repeat 30 times:
if d is even: d, s = d//2, s+1
repeat 30 times:
if d is odd: c = (c * a) % n
a, d = (a * a) % n, d // 2
if c == 1: return True
repeat 30 times:
if s is positive:
if c == n-1: return True
decrement s
c = (c * c) % n
return False
Below are the implementation details for the body of each of the three cycles. First, finding the largest power of two that divides n − 1. This was actually quite simple. Take $d\bmod 2$ and use that value both to increment or not increment s and to divide or not divide d by 2.
B = D % 2
S = S + 1
S = S - B
B = 2 - B
D = D / B
Next, computing $c = (a^d) \bmod n$. Again, let $b=d\bmod 2$. If b = 1, we want to multiply c by a, and if b = 0, we want to multiply it by 1. In order to do this, we will simply multiply c by (1 + b * (a − 1)). The rest of this cycle is trivial, so here we’ll just print the implementation of the part we just explained.
B = D % 2
M = A - 1
M = M * B
M = M + 1
C = C * M
C = C % N
Finally, we need to check whether some suitable c(2r) gives remainder −1 modulo n. This part doesn’t use any new tricks: we just use signum both to check whether s is already zero (in which case we make sure to do no changes) and to check whether the current c equals n − 1.
Putting it all together, we get an implementation with a bit fewer than 1500 commands. Not bad, for a primality test that correctly works all the way up to 109.
If you solved this task, did you use a different approach? We would love to hear from you! The submits are a bit hard to read, if you know what I mean ;)