Guy Captures Millions of HTTP Headers and Analyzes Them | Nextthing

First of all, yes, HTTP headers form something like a long tail:

[Figure: graph of log(frequency) over rank of headers]

In particular, hapax legomena (one-offs) make up over half of the headers found. I expected this. Unfortunately for me, however, a lot of the really interesting stuff is over on that long flat section of the long tail, which means I spent a lot of time poring over one-offs looking for it. Weee.
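For anyone who wants to try this on their own captures, here is a minimal sketch of the rank/frequency count described above. It is not the original author's code. The function name `rank_frequency` and the sample list are made up for illustration, and it assumes "over half of the headers" means over half of the distinct header names. Names are lowercased first because HTTP header field names are case-insensitive.

```python
from collections import Counter
import math


def rank_frequency(header_names):
    """Return (rank, log(frequency)) points, the hapax legomena, and their share.

    header_names: iterable of header field names, one entry per occurrence.
    The share is measured against distinct header names (an assumption).
    """
    # Header field names are case-insensitive, so normalize before counting.
    counts = Counter(name.lower() for name in header_names)
    # most_common() sorts by descending frequency; rank 1 is the most frequent.
    ranked = counts.most_common()
    points = [
        (rank, math.log(freq))
        for rank, (_, freq) in enumerate(ranked, start=1)
    ]
    # Hapax legomena: names that occur exactly once in the whole capture.
    hapaxes = [name for name, freq in ranked if freq == 1]
    hapax_share = len(hapaxes) / len(counts) if counts else 0.0
    return points, hapaxes, hapax_share


# Made-up sample data, only to show the shape of the input.
sample = [
    "Server", "server",
    "Content-Type", "Content-Type", "Content-Type",
    "X-Example-One", "X-Example-Two", "X-Example-Three",
]
points, hapaxes, hapax_share = rank_frequency(sample)
print(f"{len(hapaxes)} one-offs, {hapax_share:.0%} of distinct headers")
```

Plotting `points` gives the same kind of graph as the one above: a steep head followed by a long flat run at log(1) = 0, which is where all the one-offs sit.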

I love this type of fun. I did something similar a couple of years back with SMTP banners; it was enlightening.