Remember David Wilkins, former US ambassador to Canada? Well, if you do a Google search on him, this map from whitepages.com comes up near the top, showing the distribution of telephone directory listings matching his name:
Since they apparently generate these automatically for almost any name, I thought of looking up my own. But I figured that I would take another opportunity to increase the fame and internet profile of Mr. Wilkins. Can’t pass that up.
The colors are certainly less than ideal – as with so many of the maps seen here, there’s a mismatch between an orderable data set (number of listings) and an un-orderable symbology (the colors chosen to represent those numbers). Though, I suppose one can see a weak progression in the colors, depending on your perspective. But it’s still far from a good match to the data. Running from a light to a dark blue would be perfect. It would also be more friendly to people with color vision impairments.
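To make the light-to-dark idea concrete, here is a minimal sketch of how a sequential blue ramp could be generated for any number of classes. The function name `blue_ramp` and the two endpoint colors are my own assumptions, not anything whitepages.com uses, and plain linear interpolation in RGB is only a rough stand-in for a carefully designed color scheme.

```python
def blue_ramp(n_classes, light=(222, 235, 247), dark=(8, 48, 107)):
    """Return n_classes hex colors running from a light blue to a dark blue.

    The endpoint colors are assumed values chosen for illustration.
    Interpolation is linear in RGB, which is simple but not perceptually uniform.
    """
    if n_classes < 2:
        raise ValueError("need at least two classes to form a ramp")
    colors = []
    for i in range(n_classes):
        t = i / (n_classes - 1)  # 0.0 at the lightest class, 1.0 at the darkest
        rgb = [round(lo + t * (hi - lo)) for lo, hi in zip(light, dark)]
        colors.append("#{:02x}{:02x}{:02x}".format(*rgb))
    return colors


# One color per class on a five-class map, lowest class first.
print(blue_ramp(5))
```

Because every step gets darker, the reader can tell which class is "more" without consulting the legend, which is exactly what the original palette fails to offer.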
It would also be nice if I didn’t have to assume that white means zero listings, since it could also reasonably mean “no data available.” Troubling is the fact that some of the small states are filled in with white on the main map, but on the inset, where they are enlarged, they are given a color. The inset needs to be consistent with the main map – else it makes it harder to understand that the inset is, in fact, a zoomed-in version of the main map.
A sacrifice made with a classed choropleth map like this is that you lose some precision in getting the numbers off of it. Look at the states in light blue – they all have anywhere from 1 to 11 listings for “David Wilkins.” Grouping states like this is perfectly reasonable, to help reduce the number of colors used on the map and make it easier for someone to pick out one distinct color and match it to the legend. Some ambiguity is necessary as part of this process. But, look at Texas – the only state colored in dark red. It apparently has anywhere from 43 to 53 listings. It’s the only state in its class – why is the exact number not specified?
The classification scheme in general is a bit odd. There are a few big goals to aim for when deciding how to group your states. One is to minimize intra-class differences – that is, keep the range of values within each class narrow. You don’t want one class that runs from 1 to 11 listings and another that runs from 12 to 500; the second is way too broad. Another is to put roughly the same number of states in each class, which this map has a problem with: there is one state in the dark red class, two in the orange class, and twenty-five in the light blue class. A third goal is to have class breaks that are relatively evenly spaced – as an astute reader points out below, the class widths vary just a bit, but they’re roughly even, so I think they hold up pretty well. There are a few other goals, but I’ll leave it at that. As you might expect, it’s hard to fulfill all the goals at once, but the gap between 1 red state and 25 light blue ones is still pretty bad. The two lowest classes cover most of the country, and the two upper classes cover only three states. That makes those three states stand out, but more than they should. To my mind, there’s no large, unusual, worth-pointing-out difference between the states at the upper and lower ends.
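The tension between evenly spaced breaks and evenly filled classes is easier to see with numbers. The sketch below compares the two approaches on an invented list of listing counts; the values in `counts` are made up for illustration and are not read off the whitepages.com map, and the helper names are my own.

```python
from bisect import bisect_left


def equal_interval_breaks(values, k):
    """Upper bounds of k classes that each span the same range of values."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k + 1)]


def quantile_breaks(values, k):
    """Upper bounds of k classes that each hold about the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    # Index of the last value in each of k roughly equal slices.
    return [ordered[(n * i + k - 1) // k - 1] for i in range(1, k + 1)]


def class_counts(values, breaks):
    """Count how many values fall in each class (upper bounds are inclusive)."""
    tallies = [0] * len(breaks)
    for v in values:
        idx = min(bisect_left(breaks, v), len(breaks) - 1)
        tallies[idx] += 1
    return tallies


# Hypothetical listing counts for 15 states, skewed the way name counts tend to be.
counts = [1, 2, 2, 3, 4, 5, 6, 8, 9, 11, 14, 20, 30, 36, 53]

print("equal interval:", class_counts(counts, equal_interval_breaks(counts, 5)))
print("quantile:      ", class_counts(counts, quantile_breaks(counts, 5)))
```

With these invented numbers, equal intervals put 10 of the 15 states in the lowest class and one state in each of the top three, while quantiles put three states in every class at the cost of a top class that stretches from 30 to 53. Neither is automatically right, but it shows why a skewed data set makes the goals pull against each other.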
These data should probably be normalized, as well. Consider Texas again: a lot of people named David Wilkins live there. This is probably because a lot of people live there in the first place – it’s one of the most populous states. More populated places will probably have more people named David Wilkins. Likewise, you can’t find anyone named David Wilkins in places like Wyoming or South Dakota, because approximately no one lives in those states. The pattern shown by this map is highly correlated with the population distribution of the United States. It does not show whether people from Texas are more likely than people from Wisconsin to be named David Wilkins. Instead of mapping how many telephone listings there are in each state for David Wilkins, the author(s) should map how many listings there are per million inhabitants of the state. Then you would find out that Delaware has 8.1 listings for David Wilkins per million inhabitants, vs. only 2.2 for Texas. The name is also particularly popular in South Carolina, the state the Ambassador calls home.
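The normalization itself is a one-line calculation. Here is a small sketch, using two placeholder states with invented listing counts and populations (they are not the real figures for Texas, Delaware, or anywhere else), to show how the ranking can flip once population is taken into account.

```python
def listings_per_million(listings, population):
    """Convert a raw listing count into listings per million inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return listings * 1_000_000 / population


# Hypothetical values: (number of listings, number of inhabitants).
states = {
    "Big State": (50, 25_000_000),
    "Small State": (8, 1_000_000),
}

rates = {name: listings_per_million(n, pop) for name, (n, pop) in states.items()}

for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {rate:.1f} listings per million")
```

In this made-up case the big state has six times as many listings, but the small state has four times the rate, which is the number that actually says something about how common the name is.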
I find it a bit odd that they have region names listed for New England and the Mid Atlantic, but not the rest of the country. Also, I was under the impression that Maine was part of New England.
One Nice Thing: Those inset maps to the right sure are handy.
With that, I will leave off today’s effort to make this blog the #1 item on a Google search for David Wilkins.