ilovett

Revving File Names with Hashes

November 1st, 2009

In High Performance Web Sites, Steve Souders explains the value of adding a version number to resource filenames so that they are both cacheable and replaceable. In a blog post, he further explains why the query string is not an optimal spot for these numbers.

To create the version numbers, I've been using a variation on the technique described by Kevin Hale in Automatically Version Your CSS and JavaScript Files. His approach uses PHP to append the modification date of each file to its filename. An Apache rewrite rule allows those dates to be virtual, so that the client sees a path like /css/structure.1194900443.css but the server interprets it as /css/structure.css.
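To make the two halves of that technique concrete, here is a minimal sketch in Python rather than the PHP and Apache that Kevin's article uses. The function names `rev_with_mtime` and `strip_version`, the `fs_root` parameter, and the regular expression are my own illustration, not his code; the pattern only approximates what a rewrite rule for this scheme might match.

```python
import os
import re


def rev_with_mtime(path: str, fs_root: str) -> str:
    """Return `path` with the file's modification time inserted before
    the extension, e.g. /css/structure.css -> /css/structure.1194900443.css.

    `fs_root` is a hypothetical document root used to locate the file on disk.
    """
    mtime = int(os.path.getmtime(os.path.join(fs_root, path.lstrip("/"))))
    base, ext = os.path.splitext(path)
    return f"{base}.{mtime}{ext}"


# Assumed pattern: anything, a dot, a run of digits, then .css or .js.
_VERSIONED = re.compile(r"^(.+)\.\d+\.(css|js)$")


def strip_version(url_path: str) -> str:
    """Map a versioned URL path back to the real file, roughly what the
    server-side rewrite does. Paths without a version pass through unchanged."""
    return _VERSIONED.sub(r"\1.\2", url_path)
```

With this sketch, `strip_version("/css/structure.1194900443.css")` gives `/css/structure.css`, so the version exists only in the URL and never on disk.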

The first time I read Kevin's article, I wasn't keen on the PHP part. Even if the overhead of filemtime is minuscule, why have it there at all? My solution was to define a single PHP constant that could be populated by a build script during deployment, and then appended in all my CSS and JavaScript links. What could possibly go wrong?

If you have one version number for all your files, you end up treating them as a set. That's not good. If you upload some new CSS, the new version number will be applied across the board, even to files that haven't changed. That cuts down on the longevity of your cached files, which is the opposite of what we're trying to accomplish. I was generating version numbers from Unix time at the moment of deployment, but the same thing would have happened if I had used the most recent source control revision. Each file really needs to be revved separately.

SHA1 to the rescue. Version numbers based on the sha1 hash of each file ensure that a change to one file won't impact the version number of any other file. They're also less fragile than date-based version numbers, which can be changed in the course of copying files around unless you're careful. I also prefer the anonymity of the hashes because they don't reveal anything about a file's age.
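As a rough illustration of per-file hashing (in Python, not the Phing task I actually use), the sketch below computes a file's SHA-1 and inserts it into the filename. The helper names and the `length` parameter for shortening the hash are assumptions made for the example; whether to truncate at all is a separate decision.

```python
import hashlib
import os


def sha1_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-1 digest of the file's contents, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rev_with_hash(path: str, length: int = 40) -> str:
    """Insert the first `length` hex characters of the file's SHA-1 before
    the extension. `length` is a hypothetical knob; 40 keeps the full digest."""
    base, ext = os.path.splitext(path)
    return f"{base}.{sha1_of_file(path)[:length]}{ext}"
```

Because the hash depends only on a file's bytes, copying the file or redeploying it unchanged yields the same name, and editing one file leaves every other file's name alone. Note that a rewrite pattern written for numeric timestamps would need to accept hexadecimal characters to handle these names.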

I've started using Phing, so generating and inserting the hashes is easy. A FileHash task exists, but I'm using an ad-hoc task for now: