Wed, 11 Apr 07

Getting to grips with pwdhash


Having used pwdhash for a while, I decided it was about time I dived into the implementation to see what was actually going on. I figured that the best way for me to get an understanding of the library was to translate it from javascript to ruby.

Investigating the javascript libraries

The first step was to create a basic environment within which I could explore the existing javascript functionality. Relying on Firebug to provide a javascript console, this was as simple as creating a very basic html page.

    <title>pwdhash client</title>
    <script type="text/javascript" src="md5.js"></script>
    <script type="text/javascript" src="hashed-password.js"></script>

It actually took me a good few minutes to get up and running with this environment. It seems that hashed-password.js relies on a variable, SPH_kPasswordPrefix, being defined. That variable is defined in pwdhash.js, but it turns out to be our only dependency on that file. Wanting to keep the environment as simple as possible, I chose to copy the definition of that variable into hashed-password.js, allowing us to remain independent of pwdhash.js.

Starting to implement the javascript functions in ruby

Knowing that md5.js implemented pretty standard md5 (RFC 1321) and hmac (RFC 2104) routines, for which there are equivalents in ruby, I chose to concentrate on translating hashed-password.js. Although the code is pretty simple, it took me a while to understand because of my limited javascript knowledge. We essentially construct an object (SPH_HashedPassword), supplying it with a password and a realm (domain); the object both generates and represents the hashed password.

The very first thing a newly created SPH_HashedPassword does is use the b64_hmac_md5 function, supplied by the md5.js library, to generate the initial hash. I really didn’t expect to be tripped up quite this early on, but I just couldn’t replicate the value obtained from b64_hmac_md5 using digest/md5 and ruby-hmac. I now realise that I didn’t fully understand what I was actually trying to replicate: I really should have paid more attention to those three little letters (b64, i.e. base64) at the beginning of the function name…

Back to the drawing-board – starting to implement the md5.js library in ruby

Without thinking about it as much as I should have, I decided that maybe I needed to implement the functionality provided in md5.js in ruby (code for reference). I dived in and started replicating one function at a time. All was going OK until I came across the zero-fill right shift operator (>>>) in javascript. There is no standard equivalent in ruby, but a little help from the mailing list allowed me to come up with this implementation.

class Integer
  # size_in_bits (also defined in Integer) is self.size * 8
  def zero_fill_right_shift(count)
    (self >> count) & ((2 ** (size_in_bits - count)) - 1)
  end
end

Although this allowed me to move a little further, progress completely ground to a halt when trying to replicate the bit_rol function. It seems that bitwise left-shifting by a negative number offers a different result in javascript and ruby.

javascript: 1 << -1 = -2147483648
ruby: 1 << -1 = 0
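The difference comes down to integer semantics: javascript's bitwise operators work on 32-bit integers and use only the low five bits of the shift count (so -1 is treated as 31), whereas ruby integers are arbitrary precision and a negative left shift is simply a right shift. The following is a minimal sketch of how those semantics could be mimicked in ruby. The helper names (to_int32, js_shl, js_ushr) are hypothetical and not part of md5.js or my port, and bit_rol is written from my reading of md5.js as (num << cnt) | (num >>> (32 - cnt)).

```ruby
# Hypothetical helpers that mimic javascript's 32-bit integer semantics.
MASK32 = 0xFFFFFFFF

# Wrap any ruby integer into the signed 32-bit range.
def to_int32(n)
  n &= MASK32
  n >= 0x80000000 ? n - 0x100000000 : n
end

# javascript `x << n`: only the low five bits of n are used as the count.
def js_shl(x, n)
  to_int32((x & MASK32) << (n & 31))
end

# javascript `x >>> n`: zero-fill right shift, the result is unsigned.
def js_ushr(x, n)
  (x & MASK32) >> (n & 31)
end

# Assumed shape of bit_rol in md5.js: rotate a 32-bit value left by cnt bits.
def bit_rol(num, cnt)
  to_int32(js_shl(num, cnt) | js_ushr(num, 32 - cnt))
end

puts js_shl(1, -1)  # -2147483648, matching the javascript result above
```

Masking with 0xFFFFFFFF rather than relying on Integer#size keeps the result independent of the platform's native word size.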

Back to the drawing-board (again) – using digest/md5 and ruby-hmac to replicate the results given by md5.js

This time, I decided to break the b64_hmac_md5 function down into its component parts and see which, if any, differed between ruby and javascript. The first step was to compare the hex output of a simple md5 of a string.

javascript: hex_md5(text) => "1cb251ec0d568de6a929b520c4aed8d1"
ruby: Digest::MD5.hexdigest(text) => "1cb251ec0d568de6a929b520c4aed8d1"

Next up was a comparison of an HMAC MD5 of a key and some data.

javascript: hex_hmac_md5(key, data) => "9d5c73ef85594d34ec4438b7c97e51d8"
ruby: HMAC::MD5.hexdigest(key, data) => "9d5c73ef85594d34ec4438b7c97e51d8"
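For anyone wanting to repeat these two comparisons without installing ruby-hmac, here is a small sketch that uses only the standard library. Swapping in OpenSSL::HMAC for ruby-hmac is my assumption, not what the port uses, and the method names simply mirror the md5.js function names. The inputs are well-known test vectors (the empty string for md5, and the "Jefe" case from RFC 2202 for hmac) rather than the text, key and data used above.

```ruby
require 'digest/md5'
require 'openssl'

# Counterpart of hex_md5(text) in md5.js.
def hex_md5(text)
  Digest::MD5.hexdigest(text)
end

# Counterpart of hex_hmac_md5(key, data) in md5.js,
# using OpenSSL::HMAC in place of ruby-hmac (an assumption of this sketch).
def hex_hmac_md5(key, data)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('MD5'), key, data)
end

puts hex_md5('')                                          # d41d8cd98f00b204e9800998ecf8427e
puts hex_hmac_md5('Jefe', 'what do ya want for nothing?') # 750c783e6ab0b503eaa86e310a5db738
```

I would only expect the javascript and ruby values to agree for plain ASCII input; anything beyond that depends on how each side turns characters into bytes.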

Although we can easily replicate the result of hex_hmac_md5 in ruby, we actually want to replicate the result of b64_hmac_md5 (i.e. the hashed data encoded in base64 representation.)
With a bit of thinking, and some trial and error, I realised that I needed to calculate the binary digest of the key and data, and then base64 encode the result. Note: my original problems arose from comparing the raw binary strings by eye. Javascript displays the bytes as unicode characters, whereas ruby displays them as escaped octal values, so the two outputs looked different even when the underlying bytes were the same.

javascript: b64_hmac_md5(key, data) => "nVxz74VZTTTsRDi3yX5R2A"
ruby: # This requires both ruby-hmac and base64
ruby: Base64.encode64(HMAC::MD5.digest(key, data)) => "nVxz74VZTTTsRDi3yX5R2A=="

This is the first time we see any difference in the values obtained from javascript and ruby. It turns out that there’s a clue to this difference in the md5.js library:

var b64pad  = ""; /* base-64 pad character. "=" for strict RFC compliance   */

Ok, so it seems I should just be able to remove the pad characters (=) from my generated hash and I’ll achieve the same value. Happy that this was a pretty easy difference to resolve, I could move forward with actually implementing the pwdhash magic in ruby…
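As a rough sketch of that step: take the binary digest, base64 encode it and drop the padding. The method names b64_unpadded and b64_hmac_md5 are hypothetical, and I am again assuming OpenSSL::HMAC from the standard library in place of ruby-hmac. Array#pack with 'm0' is used rather than Base64.encode64 because encode64 also appends a trailing newline, which would be one more difference to strip.

```ruby
require 'openssl'

# Base64 encode a raw (binary) digest and drop the "=" padding,
# matching md5.js when b64pad is left as "".
def b64_unpadded(raw)
  [raw].pack('m0').delete('=')
end

# Hypothetical ruby counterpart of b64_hmac_md5(key, data).
def b64_hmac_md5(key, data)
  b64_unpadded(OpenSSL::HMAC.digest(OpenSSL::Digest.new('MD5'), key, data))
end

# The hex digest from the hex_hmac_md5 comparison, converted back to raw bytes.
raw = ['9d5c73ef85594d34ec4438b7c97e51d8'].pack('H*')
puts b64_unpadded(raw)  # nVxz74VZTTTsRDi3yX5R2A
```

Feeding the hex digest back through pack('H*') shows that the hex and base64 values are just two encodings of the same 16 bytes.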

Implementing the pwdhash magic in ruby

The pwdhash specific stuff is actually very simple. It alters the initial base 64 encoded hash to ensure that we have:

So, having got over the first hurdle, the remainder of the translation was pretty simple. In fact, in its current incarnation (revision 62), there are only a few small syntactical differences between the javascript and ruby versions.

Next steps

Rubify the code. Although I’ve copied it almost directly from the javascript I think I’ve actually managed to make it slightly less readable.

Investigate the whole world of unicode a bit more – I definitely don’t know as much about it as I should.