29 Sep 2017
Hashwalk is an exercise in futility. It attempts to mount a preimage attack on a cryptographic hash function by using local search. This is essentially a poor man’s differential cryptanalysis.
A good, general purpose cryptographic hash function should exhibit the avalanche effect. This means that a small change in the input (e.g. a single bit flip) results in a large change in the output (e.g. approximately half of the bits being flipped). In other words, the function should be psuedorandom. A hash function that isn’t psuedorandom is unsuitable for many cryptographic use cases, since it may leak information on the data being hashed.
In an effort to test this, I’ll apply local search to the hashing function. I’ll start with a target hash value, . The actual value of isn’t particularly important. I’ll define the optimization problem as finding an input whose hash minimizes the hamming distance to . The idea is that if there is any discernible pattern in the hashed values with respect to incremental changes to the input, then a local search heuristic should find better solutions than simply randomly hashing values.
I’ll try my luck on an older, broken hashing algorithm: md5. Md5 has been considered broken for well over a decade, although you can probably still find it in the wild today. Interestingly, it is only broken in terms of collision resistance. A practical preimage attack isn’t puclically known (although there is at least one theoretical attack). So if my primitive cryptanalysis works, I’ll be famous.
To start things off, I implemented something that isn’t really a local search at all. I mean technically it is local, and it is also a search. But it just wanders around making incremental changes to the input, and hashing the results. I called it Hiker. This is pretty much equivalent to randomly trying inputs, which gives us a good baseline. In practice, the most primitive local search heuristic is called a hill climber. The hill climber randomly chooses a neighboring solution but only accepts it if is better than the current one. By contrast, Hiker accepts every random incremental change to the input.
So what can we expect to see when running Hiker? When randomly hashing inputs, we would expect the hamming distance to be distributed according to the binomial distribution, for and . Every individual bit of the resulting hash is essentially a coin toss between 0 or 1.
I ran Hiker for ten million iterations and graphed the resulting hamming distances to the target, which I’ve defined as sixteen zero bytes.
The red line is the binomial probability mass function. So as it turns out, hashing random inputs in the same neighbhorhood without any kind of steering mechanism leads to randomly distributed hashes. This is not entirely unexpected. The best solution found was still a hamming distance of 35 away from our target.
I’ll try a real local search.
Simulated annealing is a simple but effective kind of local search. It permits all neighboring solutions that score better than the current solution, but rejects poorer solutions with a probability that decreases with time. Typically, this decrease is exponential – somewhat similar to how a hot object cools, hence the name. This helps prevent the heuristic from getting stuck in local minima prematurely, and helps it find (or at least get close to) the global minima.
If the md5 hash space has any local neighborhoods (in the sense that similar inputs produce similar hashes), then the algorithm should spend more time traversing good solutions – e.g. lower hamming distances. In that case, the distribution should shift left. I’ll run it for a similar amount of time as Hiker.
Hmmm. No such luck. Simulated annealing has approximately 25% overhead compared to hiker, which isn’t bad. The best solution found was again 35 bits away from the target.
Maybe simulated annealing isn’t sophisticated enough, and I should try something else. The simplest form of genetic algorithm performs crossover operations on parent solutions to form the next generation of solutions, along with a certain rate of mutation. That’s what I’ve implemented here. Crossover brings a new kind of search operation to the table, which may give different results.
Wait, is that… does that mean what I think it means? Have I demonstrated a weakness in md5?
Unfortunately, no. I’m not completely certain what causes this phenomenon. It’s possible that as the generation count increases and the population becomes fitter and less diverse, the same children keep being generated (why the children are fitter I’m not sure of). Adjusting the code to only count unique inputs confirms that I haven’t found anything interesting.
Maybe md5 is pretty good at modeling a psuedorandom function after all. I mean it did resist cryptanalysis much more sophisticated than this for a number of years. Of course, there are lots of other things I could try. Genetic algorithms work best when they exploit characteristics of the problem. In this case, I could try a crossover function that takes the internal block size of md5 into account. I’m not particularly familiar with md5’s construction, but there’s probably lots of operations we could add to the algorithm.
That sounds like a lot of work though, so I’ll leave it as an exercise for the reader.