Internet Problem Solving Contest

IPSC 2006

Solution to Problem H – h4x0r t3h c0d3

Easy subproblem

The congruence given in the problems statement looks remarkably similar to the congruence used in the RSA digital signatures scheme. In fact, if B was equal to 1, the system would correspond to an instance of RSA with public exponent E=47 and modulus N (the number from the input file h1.in). The STOP-U box corresponds to a "signing device", which transforms the input number X into its signature S. The given modular equation can then be used to verify if the alleged signature is valid.

Due to the restriction of possible inputs, our problem is related to the concept of so-called "blind signatures", you can find more information here.

The same approach that can be used for blind RSA digital signatures can also be used in the case of our scheme -- we just need to find a number R such that X'=R47.X (mod N) is correctly transformed by the box (i.e. has lowest three bits equal to zero), where X=31415926535897932384626433 is the number that we want to "sign". If we ask the box to calculate the output corresponding to X', the output S' will satisfy S'47=B.X'=B.R47.X, which can be rewritten as (R-1.S')47=B.X (R-1 is the inverse of R modulo N; yes, we make the assumption that the inverse element exists).

Inverse element of R modulo N is the (unique) positive number R-1 smaller than N satisfying the congruence R.R-1=1 (mod N). This number can be calculated using the extended Euclidean algorithm (you can find its description here). Alternatively, we can simply rewrite the congruence into an ordinary equation R.R-1=k.N+1 in which k is an unknown integer. Obviously, we can assume 0≤k<R, thus if R is very small, we can simply try all the possibile choices by brute-force.

Summarizing it all, S=(R-1.S') is the answer we are looking for. In our case, choosing R=8 suffices, the corresponding R-1 is then equal to 64603145990192535568041110603904346455413161834021358733812390838706099548563459644034410948135747117266624970664457571512837377401131661044382987.

Thus, the complete solution looks like this: Ask the machine for the output corresponding to the number X' and multiply the answer by R-1 to obtain the right answer.

Hard subproblem

There is, naturally, more than one way of solving this problem. We'll describe the one we used to create the example solutions. First, we'll review some facts about MD5 and Postscript.

The basic structure of MD5 hash is very simple. At the beginning, it sets its internal state to a defined value -- the so-called Initial Vector (IV). Then, it processes the input in blocks 64 bytes long and updates its internal state using a complicated function of the previous state and the input data. The last block of input data is padded with zeroes and the total length of the data so that it is 64 bytes long (actually, this process might add one additional block of 64 bytes but it is not important for our purposes).

The key to the solution of this subproblem lies in knowledge of recent results regarding MD5 hashing algorithm -- it is possible to find a pair of blocks having the same MD5 hash in very short time for ANY initial vector (IV). More information can be found here.

Another important ingredient is the knowledge of Postscript file format. The key fact is that Postscript can perform comparisons of blocks of data and perform different actions depending on the result. We'll use the binary operator "eq" which, as the name suggests, tests for equality of its two arguments.

Armed with this knowledge, we can simply construct two Postscript files having the following structure:

  1. Prescribed header + possibly some padding to make it 64 bytes long. Let S be the internal state of the hashing algorithm after processing this block of data.
  2. Two DISTINCT blocks A and B with the same hash with respect to the initial vector S
  3. The block A (well, we also need to add some brackets to make it syntactically correct)
  4. The "eq" operator followed by a branch based on its result

In order to obtain two blocks with equal hashes, we've used the program of Pavel Dufek which can be downloaded here. The program expects the value of the initial vector (in our case, it's the vector S which has been calculated by a short C-program using the OpenSSL library (helper.c). Dufek's program is an improved version of a program by a famous Czech cryptographer Vlastimil Klima (can you discover his name in the problem statement?), that uses a concept of "tunnels" in the MD5 function to achieve a vast speedup in collision generation.