DMV Dedupes Data. Darn.

February 20th, 2009 by Matt McAdams

Two weeks ago I went to my local DMV to get a new driver’s license. After the obligatory excruciating wait, I was told I’d need to go to another location and speak to an “investigator” about a mysterious problem. Another trip, another wait, situation resolved; now I have to go back to the first location and start over. Gotta love the DMV.

The problem that required “investigation” was that in 2000, I received a speeding ticket on which my date of birth was recorded differently than on all of my other speeding tickets. (Don’t read anything into this – stay with me here.) The incorrect date was 9/25/70; the correct date is 3/25/70. I had to convince the investigator that the ticketing officer, or perhaps a data entry clerk, had misread a 3 as a 9, rather than what they suspected: that I’d given a false “identity” to avoid insurance points, or arrest for an outstanding warrant (there are none of those – I drive fast, but that’s it!), or something else bad.

As annoying as the run-around was, I’m actually pretty impressed that the DMV spotted this data anomaly eight years after the fact. They’ve clearly implemented a new program designed to find duplicates, or near duplicates, in their vast stores of data. I suspect it’s part of a post-9/11 attempt to crack down on identity theft.
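I have no idea how the DMV's system actually works, but here is a minimal sketch of one way a near-duplicate check could catch my ticket. It assumes records are simple dictionaries with hypothetical `id`, `name` and `dob` fields, and it flags any two records with the same name whose birth dates differ by exactly one character.

```python
from itertools import combinations

def differs_by_one_char(a, b):
    """True when two equal-length strings differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

def near_duplicates(records):
    """Return pairs of records with the same name and nearly matching birth dates."""
    pairs = []
    for first, second in combinations(records, 2):
        same_name = first["name"].lower() == second["name"].lower()
        if same_name and differs_by_one_char(first["dob"], second["dob"]):
            pairs.append((first, second))
    return pairs

# Hypothetical ticket records; only the dates come from the story above.
tickets = [
    {"id": 1, "name": "Pat Example", "dob": "3/25/70"},
    {"id": 2, "name": "Pat Example", "dob": "9/25/70"},
    {"id": 3, "name": "Pat Example", "dob": "3/25/70"},
]

flagged = near_duplicates(tickets)
flagged_ids = [(a["id"], b["id"]) for a, b in flagged]
```

Comparing every pair like this gets slow quickly as the record count grows, so a real system working over millions of records would presumably narrow down the candidates first.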

The business value of being able to find potential duplicate records in large amounts of data is well-known to TrackVia users. Our Find Duplicates feature allows you to choose which field or fields you want to use as the basis for comparison – for example, email address and ZIP code together. And it’s fast – it can find, group, color-code and display possible duplicates from 100,000 database records in about 10 seconds. For smaller data sets, like a few thousand records, it’s essentially instantaneous.
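To give a feel for the general idea of comparing on chosen fields, here is a rough sketch in Python. This is not TrackVia's implementation; the `find_duplicates` function, the field names and the sample records are all made up for illustration, and the normalization (trimming whitespace and ignoring case) is an assumption.

```python
from collections import defaultdict

def find_duplicates(records, fields):
    """Group records whose values in `fields` match after light normalization."""
    groups = defaultdict(list)
    for record in records:
        # Build the comparison key from the chosen fields only.
        key = tuple(str(record.get(f, "")).strip().lower() for f in fields)
        if not any(key):
            continue  # skip records where every chosen field is blank
        groups[key].append(record)
    # Only groups with more than one record are possible duplicates.
    return [group for group in groups.values() if len(group) > 1]

# Hypothetical contact records.
records = [
    {"name": "Ann Lee", "email": "ann@example.com", "zip": "80202"},
    {"name": "A. Lee", "email": "ANN@example.com ", "zip": "80202"},
    {"name": "Bob Ray", "email": "bob@example.com", "zip": "80203"},
]

duplicates = find_duplicates(records, ["email", "zip"])
duplicate_names = [[r["name"] for r in group] for group in duplicates]
```

Because each record is visited once and dropped into a bucket by its key, this kind of grouping scales roughly linearly with the number of records.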

If only the waiting-in-line part at the DMV were that fast.
