This is old news, but a few weeks ago AOL released 2 gigabytes of raw search data from its users. Basically, this document contains every web search every AOL member made for a period of several months. The "reason" given for releasing this was academic research -- and I have to agree... seeing the search terms of my fellow Americans is very academic: (crazy search histories from our friends at Something Awful. More here and here.)
In aggregate, this data is downright fascinating (when it's not totally disturbing). Aside from the trainwreck/cringe factor, I could (actually) understand the rationale for releasing it. It's another snapshot of the zeitgeist -- and much more personal than the (now bland-seeming) Google Zeitgeist's revelations that "ricky bobby" is a gaining query. Figuring out what's really on America's mind has been a hobby of sociologists for decades.
The problem is folks have been conditioned to see that little text field and "Search" button as a great anonymizer. And it is -- except that your searches, with a little detective work, can reveal who you are. In fact, last week the intrepid New York Times managed, using only the searches of one person, to figure out who that AOL user was in real life -- a 62-year-old grandmother.
(Sidebar: Bri spent a nice chunk of time yesterday on my computer trying to find the pictures mentioned in this article. Can't wait til Google decides to release my
What this little incident has taught me, however, is to be much more worried about the NSA wiretapping boondoggle. Honestly, before, I wasn't particularly worried about it. Yes, Bush broke the law. Yes, creating huge call lists is generally not something I'd like my government to be engaged in. But overall, my feelings against it were more about due process of law being ignored -- not the actual program itself.
See, I actually believed the whole notion of "We're just data mining it! There's soooo many numbers that we're just shuffling them through the computer as fast as we can looking for patterns that suggest terrorism." But that was the same rationale for the AOL search leak -- use it for data mining. Run those 2 gigs of data through a computer and try to decide how to market to people. The problem is not the data mining -- it's the file itself. Because when the file gets broken down and you focus on one person (either AOL subscriber or phonejack), whole lives can be sussed out.
And, in a weird way, I'd expect AOL to do a better job of keeping information safe than the Feds, so I think this bodes poorly for all of us.