A quarterly magazine of urban affairs, published by the Manhattan Institute, edited by Brian C. Anderson.
The data miners are watching, but they don't always see us clearly.
10 October 2008
The Numerati, by Stephen Baker (Houghton Mifflin, 256 pp., $26)
For all their vaunted sophistication, computers are not really that bright. They have a hard time telling if a blog post is written by a man or a woman. They don't grasp that a Seattle blogger who writes "I love this weather" in the middle of a 12-day rainy patch is being sarcastic. They can be flummoxed by spelling errors. But they will run equation after equation with no complaint. Their persistence allows crafty humans to mine the data we leave sprinkled all over our lives to piece together predictions that prove enormously useful to marketers, politicians, and, more consequentially, our employers and our governments.
That's the thesis of Stephen Baker's The Numerati, a field guide to the mathematical modeling of our lives. Baker, a BusinessWeek veteran, started reporting on this high-tech information gathering a few years ago, when he learned about IBM's attempts to construct a database that could match employees with their ideal projects based on past behavior and personal traits. He soon discovered that the same science is being used everywhere, from grocery stores to online dating. The so-called numerati are math whizzes looking for "patterns in data that describe something almost hopelessly complex: human life and behavior," he writes.
While Baker's title implies some sinister intelligentsia predicting our every move, his book is actually more of a gee-whiz tour of this nascent field. Baker organizes his seven breezy chapters around the major activities in which computers monitor and model our behavior: as workers, shoppers, voters, bloggers, potential terrorists or criminals, medical patients, and lovers. Employers like IBM, for instance, can monitor e-mails, phone calls, and work output to learn that we're useless on Mondays. Supermarkets, meanwhile, try to figure out just how susceptible we are to ice cream promotions.
All of this data mining is, of course, infinitely easier to do in the electronic age. Cash transactions and letters are hard to analyze. These days, almost everything we do leaves a trail: sequences of ones and zeros that reveal fascinating patterns when fed into computers with time on their hands. Baker recounts an example of how some people's online habits suggested that they were romantic movie fans. These fans also showed a great propensity to click on Alamo rental car ads. It turned out that the ads promoted weekend escapes, which proved irresistible to romantic movie buffs. Point noted for future ads.
But that's a rare example of a clear-cut connection. At this point in the data-crunching industry's history, Baker points out, the tools aren't precise enough truly to comprehend the maddening randomness of humans. Take IBM's employee-assignment system. Maybe you smell like onions. No one who has a choice will want to work with you. Yet there's no room, Baker writes, at least in the early versions of IBM's employee database, for those personal details.
So for most purposes, at least for the moment, the uses of data mining are more theoretical than practical. As Baker notes, the ideal industries for the Numerati are those in which they can goof up regularly and still top the status quo. The growing industry of political data consultants, for instance, tries to organize us into various tribes, each receptive to a different message. Such categorizations work in slightly more sophisticated fashion than old-style data cuts, in which a twentysomething woman who lives in New York City and works in media looks like a definite Democrat. If the data miners discover that she's married, has kids, and works for herself, though, she might be a swing voter open to a Republican direct mailing on schools and taxes. Of course, the Republicans might waste a stamp. But politicians, accustomed to paying big money for TV ad buys that hit even more of the wrong people, won't care if their targeted mail pitch for a specific group of swing voters only gets it right 75 percent of the time.
Online dating is another area where data miners can goof up with no real consequences. Near the end of his book, Baker conducts a humorous experiment to see if Chemistry.com matches him up with his wife. It doesn't, but not because his marriage is doomed. Rather, Baker carelessly "limited my search to women younger than my wife" and thus was practically "throwing my wife into the arms of more open-minded rivals." The program didn't have a way to check if he actually intended what he was typing. When he corrected his error, his wife was suddenly deemed a great match.
It's all fun and games with dating, shopping, and political direct mail. But things get fuzzier, Baker says, when it comes to governments looking through our phone records and other sensitive information to find out if we're potential terrorists. Health insurance companies may also use records to calculate that people with our habits and demographic profiles are bad bets. This is going to raise torturous moral questions, ones that until now we never knew enough to ask, Baker contends. Maybe someday we'll be able to predict that people who resemble a local fourth-grade teacher have a 60 percent likelihood of being pedophiles. But what are we going to do with that information? Using it to fire the teacher might save a child from harm. Still, no computer can yet predict with absolute certainty whether the teacher is, in fact, a potential pedophile, and if so, whether he'll act, or whether he'll triumph over temptation.
These are thought-provoking puzzles, and The Numerati is an enjoyable read. The one annoyance is the author's odd belief that a book should accommodate details usually edited out of magazine articles. One of Baker's subjects has "languid eyes." Another man speaks in a strong French accent. "We're talking in the lobby of a Midtown hotel in New York, and he has to yell to make himself heard over a particularly loud fountain." It's even occasionally a dark and stormy night for this frustrated novelist: "Outside the sky grows dark, and rain scatters the crowd," Baker concludes one chapter.
But, eye-rolling as these descriptions are, it's easy to skip over them to get to the meatier questions of whether humans can be analyzed like machines or financial models. "We're more than stocks and parts, quite a bit more," Baker insists. As you put the same cereal in your shopping cart for the 40th week in a row, though, you may wonder if you're really that complex after all.
Laura Vanderkam, a New York City-based freelance writer, is a member of the Board of Contributors of USA Today. Her work has also appeared in Reader's Digest, The American, The Huffington Post, and other publications.