Fog Creek Hashing Puzzle

A few days ago, Fog Creek Software posted a really cool puzzle on their software developer jobs page.

The problem is as follows:

Find a 9 letter string of characters that contains only letters from

acdegilmnoprstuw

such that the hash(the_string) is

910897038977002

if hash() is defined by the following pseudo-code:

Int64 hash (String s) {  
    Int64 h = 7
    String letters = "acdegilmnoprstuw"
    for(Int32 i = 0; i < s.length; i++) {
        h = (h * 37 + letters.indexOf(s[i]))
    }
    return h
}

For example, if we were trying to find the 7 letter string where hash(the_string) was 680131659347, the answer would be "leepadg".

Note: DONT EVER USE A HASHING FUNCTION LIKE THIS

(for anything other than puzzles)

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-==-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Given the algorithm above, how can we reverse the hash function such that we can go from 910897038977002 to some 9-letter word?

First we have to take a look at what the Hash function actually does. Each iteration it does two things

  1. Multiplies the current value of h by 37
  2. Adds to result of (1) the index of the current character of s (the string being hashed) in the alphabet string letters

Lets look at an example to see how values of h change across iterations. What would the result be if the string being hashed was "age"?

Iterationihs[i]Index of s[i] in letters
0N/A7N/AN/A
10259'a'0
219587'g'4
32354722'e'3

The value of h for any row n can be given by the formula:

$f(n) = f(n-1) * 37 + x$

where $x$ = letters.indexOf(s[i])

To reverse this process we simply reverse the operations of the Hash function, so instead of multiplying by 37 we will divide by 37. Also, since these values are computed by multiplying the previous value of h by 37 we know already that 37 will divide into them evenly, which will leave the value of letters.indexOf(s[i]) as the remainder of the division operation.

So we can get to previous values of h by this formula:

$f(n-1) = f(n)~/~37$

and we can get the indices the characters (in reverse) by this formula:

$x = f(n)~\%~37$

Combining all of this gives you an algorithm that looks like this:

String letters = "acdegilmnoprstuw"  
String unhash(Int64 h) {  
   indices = []
   i = 0
   while ( h > 37 ) {
      indices[i++] = h % 37
      h /= 37
   }

   String unhashedWord = ""
   for(int j = indices.length - 1; j >= 0; j--) {
      unhashedWord += letters[ indices[j] ];
   }

   return unhashedWord
}