Ever since leading the Boston Red Sox to victory in the 2007 World Series, Josh Beckett had been a mainstay of the team’s pitching rotation. But when he hobbled off the mound with an ankle injury on September 5, 2011, the Red Sox faithful took the news in stride. After all, their team was the hottest in baseball. The previous winter, the Sox had acquired two of the sport’s most sought-after players, outfielder Carl Crawford and first baseman Adrian Gonzalez. The acquisitions led the Boston Herald to declare the team the TOP SOX SQUAD OF ALL TIME before it had played an inning. And until the afternoon of September 5, the Sox had largely lived up to the hype, winning 84 games and losing just 55. Even with nearly a month of baseball left, they seemed like a lock for the postseason.
True, Red Sox fans were no strangers to tragic endings: the slow ground ball through Bill Buckner’s legs in the 1986 World Series; manager Grady Little’s leaving Pedro Martinez on the pitcher’s mound for one inning too many in the 2003 playoffs; Yankee Bucky Dent’s season-ending home run at Fenway Park in 1978. But this time, any Sox fans disconcerted by news of Beckett’s ankle sprain could rest assured that they had science on their side—that is, the various statistical models that gave their team almost a 100 percent chance of making the playoffs.
Coolstandings.com is a website that forecasts every baseball team’s odds of reaching the postseason by running a Monte Carlo simulation—millions of computerized iterations of the team’s upcoming games—and converting the results into the percentage chance that the team will win its division. “If a team wins the division 100,000 times out of a million,” the editors of Coolstandings explain, “then that team is given a 10 percent chance of winning the division.” When Beckett exited the game on September 5, Boston fans could visit Coolstandings and take solace in the fact that their team still had a 99.2 percent chance of making the playoffs. Coolstandings wasn’t alone: Baseball Prospectus, one of the nation’s best-respected baseball websites and another stat-based predictor of playoff odds, reported on Beckett’s injury but added: “Not that the Sox are in danger of missing the playoffs without him; their chance of bonus baseball stood at 99.7 percent even after losing [the game in] extra innings.”
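For readers curious about the mechanics, the Coolstandings method—simulate the remaining schedule many times and count the share of simulations in which the team qualifies—can be sketched in a few lines of Python. The win probabilities and standings below are invented for illustration, not Coolstandings’ actual model:

```python
import random

def playoff_odds(games_left, lead, win_prob, rival_win_prob, sims=100_000):
    """Estimate a team's playoff odds by simulating the rest of the
    season many times (a simple Monte Carlo simulation)."""
    made_playoffs = 0
    for _ in range(sims):
        team_wins = sum(random.random() < win_prob for _ in range(games_left))
        rival_wins = sum(random.random() < rival_win_prob for _ in range(games_left))
        # The team qualifies if its lead survives the remaining games.
        if lead + team_wins - rival_wins > 0:
            made_playoffs += 1
    return made_playoffs / sims

# Illustrative early-September standings: a 9-game lead with 23 games left.
odds = playoff_odds(games_left=23, lead=9, win_prob=0.55, rival_win_prob=0.55)
```

Run with a healthy lead, the simulation duly reports odds north of 99 percent—which shows how the method works, and also why it keeps reporting near-certainty for as long as its assumed win probabilities go unquestioned.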
Having read all that, don’t ask a Red Sox fan how the 2011 season played out. What ensued was the worst collapse in Major League Baseball history: losing 17 games and winning just six, the Sox watched their lead over Tampa Bay for the last American League playoff spot evaporate. But even as the team collapsed, the statistical models kept telling Boston fans to keep calm. “Before we all freak out about an epic Boston choke, let’s head over to coolstandings.com for a dose of reality,” wrote USA Today’s Gabe Lacques on September 16. “They have done us the favor of simulating the rest of the season 1 million times to determine every team’s chances of postseason play. And they still put the Red Sox’s chances at . . . 88.7 percent to make the playoffs.”
Twelve days later, Boston really did get a dose of reality. On the morning of September 28, the final day of the season, the team found itself tied with Tampa for the last playoff spot. That night, the Red Sox led Baltimore 3–2 in the seventh inning; then a torrential downpour delayed the game for nearly 90 minutes. After play resumed, Baltimore scored two runs in the bottom of the ninth to win the game. Just moments later, Tampa capped a miraculous comeback from a 7–0 deficit against the Yankees—winning the game in the tenth inning, clinching the final playoff spot, and sending the Red Sox home for a dismal winter. As the magnitude of the collapse sank in, nobody put it better than the Boston Globe’s Pete Abraham on Twitter: “I think my days of looking at Cool Standings or the Baseball Prospectus playoff odds are over with.”
Baseball is far from the only area in which predictions can disappoint, and in recent years, a handful of writers have drawn attention to the biases and habits that influence our ability to make good forecasts. The two most recent entrants in this category, Nate Silver’s The Signal and the Noise and Nassim Taleb’s Antifragile, appeared to wide acclaim in late 2012, with each author winning scores of favorable reviews and profiles; both received invitations to Silicon Valley to speak at Google’s prestigious Authors@Google series within weeks of each other. The two books, then, seem to have captured the zeitgeist—despite having largely contradictory messages and methodologies. Silver comes to praise those who have mastered the art of forecasting and to encourage us to learn from their lessons; Taleb comes to bury those who hubristically overestimated their own predictive capabilities and to caution us against repeating their mistakes. When read together, the two books illustrate not just how we can become better predictors but also how we can recognize better when we predict at our own peril.
In The Signal and the Noise: Why So Many Predictions Fail—but Some Don’t, Nate Silver offers a book-length extension of the statistical approach that has brought such prominence to his New York Times poll-tracking blog, FiveThirtyEight (a reference to the number of votes in the electoral college). Political statistical analysis had already enjoyed a renaissance in the first decade of the 2000s, thanks in part to RealClearPolitics.com’s creation of “RCP Averages,” which combined many polls into a single poll-of-polls. But Silver took aggregation to an entirely new level.
First, instead of treating all the relevant polls equally, as the RCP Averages did by simply averaging the polls’ results, Silver weighted them according to their perceived reliability, taking into account such factors as their freshness, their sample size, and the pollsters’ track records. Second, instead of averaging the polls’ bottom-line numbers and then reporting a single, nationwide average, Silver entered the polls into a Monte Carlo simulation. Much as Coolstandings runs millions of computerized simulations of upcoming baseball games, Silver ran millions of computerized simulations of elections. The outcomes of all these simulations varied: sometimes Candidate A beat Candidate B by a wide margin; sometimes A beat B by a narrow margin; sometimes B beat A. But when the millions of results were amassed, they could be translated into the most likely final vote tally. Perhaps more important was that the range of simulated outcomes illustrated the “variance,” the likely range of actual outcomes. So even if the likeliest outcome was a substantial win by Candidate A, Candidate B might still win upset victories in 10 percent of the simulated elections; thus, A’s odds of winning the election would be stated as 90 percent.
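Silver’s two refinements—weighting polls by reliability and feeding the result into a Monte Carlo simulation whose spread of outcomes yields a win probability—can be illustrated schematically. The polls, weights, and error model below are hypothetical stand-ins, not FiveThirtyEight’s actual inputs:

```python
import random

# Hypothetical polls: (Candidate A's margin in points, reliability weight
# reflecting freshness, sample size, and the pollster's track record).
polls = [(+3.0, 0.9), (+1.5, 0.6), (-0.5, 0.3), (+2.0, 0.8)]

# Weighted average: a fresher, larger, better-tracked poll counts for more.
weighted_margin = sum(m * w for m, w in polls) / sum(w for _, w in polls)

def simulate_election(mean_margin, error_sd=2.5, sims=200_000):
    """Run many simulated elections; each draws a plausible actual margin
    around the polling average. The spread of outcomes is the variance."""
    a_wins = sum(random.gauss(mean_margin, error_sd) > 0 for _ in range(sims))
    return a_wins / sims

# Translate the range of simulated outcomes into A's odds of winning.
a_odds = simulate_election(weighted_margin)
```

The point of the exercise is the last line: even when Candidate A leads in the weighted average, a meaningful fraction of simulations still come out for B, and that fraction—not the average alone—is what the forecast reports.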
So far, Silver’s predictions have been impressively accurate. For the 2008 presidential race, he accurately called 49 of the 50 states, getting only Indiana wrong. In 2012, he was 50-for-50—much to the chagrin of critics who had dismissed what they thought were unduly pro-Obama forecasts.
The 34-year-old forecaster has blogged about statistics and politics only for those two presidential election cycles, first on his own independent site and now at the Times, in a three-year agreement that expires this year. He has legions of readers, and his blog attracted almost 5 million page views on Election Day 2012. But his interest in statistics and forecasting manifested itself first in the context of baseball. He pioneered PECOTA, a forecasting model that revolutionized baseball statistical analysis, and sold it to Baseball Prospectus for an ownership stake and an editor’s position. Silver has also been a very successful poker player, and he credits the game for giving him “better training than anything else I can think of about how to weigh new information, what might be important information and what might be less so,” as he told the Guardian.
Accordingly, The Signal and the Noise isn’t Silver’s attempt to apply his work in political prognostication to foreign, nonpolitical territory. Rather, it’s a demonstration of the instincts, skill, and breadth of knowledge that made him such a good political forecaster in the first place.
The title of Silver’s book refers to a contrast that engineers draw between a transmitted signal and the background “noise” that interferes with the audience’s reception of that signal. “The signal is the truth,” Silver writes in his opening pages. “The noise is what distracts us from the truth.” Built as a series of case studies—some cautionary tales, others “hopeful examples”—rather than as a ground-up theoretical work, The Signal and the Noise seeks to show how we can improve at forecasting future events.
Silver sees himself as a pragmatist, not an absolutist or a utopian. In the very first pages, he excoriates one tech utopian, former Wired editor Chris Anderson, who evangelized in 2008 that the age of “Big Data” meant that “we can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.” Such views, Silver responds, are “badly mistaken.” For while Big Data may offer forecasters an unprecedented opportunity to model how the world works, “the numbers have no way of speaking for themselves.” Worse still, Anderson’s “data deluge” threatens to overwhelm us not just with more signal but also with more noise, forcing analysts to work harder than ever to separate truly useful information from spurious correlations and other false results. Bigger data may allow for more accurate forecasts, but only if we have analysts capable of discerning when the models have gone in the wrong direction.
This argument finds its best illustration in Silver’s chapter on weather forecasting, which shows how meteorologists use their own experience and intellectual craftsmanship to adjust computer-model results manually. “Humans can make the computer forecasts better or they can make them worse,” Silver writes. He praises analysts’ “eyesight”—their capacity to “see” fuzzy patterns that computer models miss. So while The Signal and the Noise is “an emphatically pro-science and pro-technology book,” Silver intends it to be first and foremost a practical book. One thinks of Michael Lewis’s Moneyball, which commenced a decade of bitter fighting between traditional baseball “scouts” and modern statistics geeks before a consensus emerged on the need for both technology and craftsmanship. Silver’s aim is to skip the dialectical thesis and antithesis and move right to synthesis, pointing a way forward.
The book’s case studies are examples of how forecasters can use this general approach—technology aided by human understanding—to overcome major forecasting failures. The most recent of these failures that Silver discusses is the recent financial crisis, which far too few on Wall Street foresaw or guarded against. Moody’s and Wall Street overestimated their ability to recognize the risk in portfolios; the broader American public was overconfident that housing prices would rise forever; later, the Obama administration was overconfident that its $800 billion stimulus package would jump-start the economy and return it to health.
Silver contrasts this cautionary tale with some success stories in which failures led to substantial improvements in the quality of forecasts. One of these examples involves Silver’s home turf: elections. Like financiers, political pundits suffer from overconfidence. They’re too confident that their own preferred parties or candidates will prevail at the polls, too confident to admit mistakes and learn from them, and too confident about their predictive capabilities to take seriously the range of other outcomes that might result on Election Day. According to Silver, these forecasters would do well to adopt a “probabilistic” approach, “acknowledging the real-world uncertainty in their forecasts” and the “imperfections in their theories about how the world [is] supposed to behave—the last thing that an ideologue wants to do.” In other words, they should follow the lead not just of FiveThirtyEight but also of the Cook Political Report, which takes much more seriously the value of “qualitative” information gleaned from interviews with candidates. The key is not to give these nonquantitative factors too much weight, “which might actually make . . . forecasts worse,” but rather to take them just seriously enough to help (in the Cook Report’s case) “call a few extra races right.”
And then there are those meteorologists—who, Silver explains, are sometimes corrupted by market competition that prizes TV ratings over technical accuracy. Viewers value the appearance of accuracy, so weathermen facing a 50 percent chance of rain will at times flip a coin to settle on a forecast stating either a 60 percent or a 40 percent chance, rather than deliver a wishy-washy 50-50 forecast. And because viewers would rather be pleasantly surprised by clear skies than be unpleasantly surprised by rain, weathermen tend to skew their forecasts toward showers. Even though Silver’s two scenarios seem at odds—are weathermen flipping coins, or are they reliably biased in one direction?—the general point rings true: meteorologists are in the TV business as much as the weather business, and their forecasts can be the worse for it. “If the forecast was objective, if it has zero bias in precipitation,” one Weather Channel meteorologist admitted to Silver, “we’d probably be in trouble.” Thus Silver applauds the “admirably well calibrated” forecasts of the National Weather Service, which shuns the biases that undermine so many TV forecasts. And he further praises meteorologists’ recognition that, in forecasting hurricanes, one should predict not just a narrow, most-likely path for the storm but rather a “cone” of outcomes that shows the public a broader range of possible paths.
All in all, Silver’s diagnosis of what ails modern forecasting appears sound, in no small part because of his willingness to stand on the shoulders of social-science giants. His chapter on political predictions draws heavily on Philip Tetlock’s 2005 classic Expert Political Judgment—which, after taking the confidential predictions of nearly 300 “experts” in various fields over 20 years and comparing them with the arbitrary “predictions” of a dart-throwing chimpanzee, found that “humanity barely bests the chimp.” Silver’s discussion of overconfidence in the financial crisis echoes Carmen Reinhart and Kenneth Rogoff’s wildly successful This Time Is Different: Eight Centuries of Financial Folly. Other predecessors go unacknowledged, perhaps because the author didn’t notice them. In describing the stock market as a two-track process—a “signal track” reflecting market fundamentals and long-term value and a “noise track” dominated by short-term “momentum trading” and “herding behaviors”—he neglects to mention that Benjamin Graham, one of the most influential market analysts of all time, wrote the same thing in 1949. “In the short run, the stock market is a voting machine,” Graham explained in The Intelligent Investor; “in the long run it is a weighing machine.”
Silver’s most important intellectual debt might be found near the end of the book. He uses an interview with Donald Rumsfeld as a jumping-off point for a discussion of what the defense secretary famously called “unknown unknowns”—that is, information that we don’t even know that we lack and questions that we don’t realize we need to ask. And that, in turn, leads Silver to Roberta Wohlstetter’s seminal study of the intelligence failures that preceded the Pearl Harbor attack, a study that itself employed the metaphor of signals and noise. (Silver graciously notes that Wohlstetter “pioneered the use of the metaphor 50 years ago.”)
This chapter reveals a central, if unstated, premise of Silver’s analysis, along with some limits of his prescriptions—limits that he refuses to concede meaningfully. By employing the metaphor of signal and noise, both Wohlstetter and Silver assume that there is a signal to be found. In Wohlstetter’s case, that assumption made sense: her signal consisted of the Japanese transmissions that America collected but didn’t digest in time to prevent the attack. For Silver, a signal is something quite different. True, his loose terminology sometimes obscures precisely what he means by the term: early in the book, his signal is the “truth” that we are too often distracted from; by the end, it is “an indication of the underlying truth behind a statistical or predictive problem.” But whether his signal is a future event or today’s approximation of that event based on past data points, he fails to deal with a troubling difficulty: signals don’t always exist.
For example, how would Silver calculate the odds of a truly rare event—say, Russia’s sudden, unexpected default on its own bonds in 1998, which set off a cascade of failures that killed the vaunted hedge fund Long-Term Capital Management and nearly precipitated a Wall Street disaster? As journalist Roger Lowenstein later recounted, “before August 1998, Russia had never defaulted on its debt—or not since 1917, at any rate. When it did, credit markets behaved in ways that Long-Term didn’t predict and wasn’t prepared for.” Even if Russia’s default had been forecastable, the cascade of subsequent events that led to LTCM’s demise wouldn’t have been. “For all the brilliance of [LTCM’s] theory, it was based on an assumption that the future performance of bonds would mirror their past movements,” the New York Times’s Gretchen Morgenson wrote at the time of the crisis. There was no “signal”—only silence.
How should we try to forecast such events? Or is the mere act of trying to forecast them dangerous? Silver comes closest to conceding the problem when he writes, “frankly,” that his methods for forecasting in politics and sports probably aren’t all that useful for national-security analysis because politics and sports are “data-rich fields that yield satisfying answers,” while national security involves very rare events. At this point, one might expect Silver to warn readers not to take his theories too far. In an earlier chapter, after all, he has cautioned against “overfitting” data to preconceived models. But in the end, he fails to heed that advice.
Nassim Taleb has spent the last several years railing against precisely this variety of predictive hubris. In Fooled by Randomness (2001), he warned that we often see cause-and-effect narratives in random courses of events. Six years later, in The Black Swan, he expanded his thesis into an all-out assault on modern society’s failure to take seriously the threat of “black-swan” events. These were low-probability, high-impact events that couldn’t plausibly be forecast using past data and experience—just the problem that Silver fails to acknowledge properly.
Taleb’s personal story, like Silver’s, has been repeated countless times in newspaper and magazine profiles. And just as with Silver, Taleb’s methodology grows out of his background. Silver’s roots are in baseball and poker; Taleb’s life has been marked by black swans. He was born in Lebanon in 1960, when that nation was still “a stable equilibrium,” a “mosaic of cultures and religions” seen by many as “an example of coexistence,” as he recounts in The Black Swan. But a civil war shattered that coexistence—and with it, Taleb’s philosophy: “The very idea of assumed equilibrium bothered me. I looked at the constellations in the sky and did not know what to believe.”
Taleb didn’t flee volatility, however; he pursued it. After receiving his MBA from the Wharton School, he worked as a derivatives trader at the investment bank First Boston, where he built up what he later described as a “massive” position in long-shot options. Ordinarily, those options were useless. But when the markets collapsed on October 19, 1987—Black Monday, as Wall Street would call it—Taleb’s position paid off. In a single day, he netted tens of millions of dollars—97 percent of all the money he would ever earn. Taleb’s work taught him that a deep gulf yawned between market theory and market reality, and he began to press the point in public, first as an evening professor at NYU and then in writing.
Taleb drew the term “black swan” from the tale of British explorers amazed to discover black swans in Australia. Before their journey, the explorers had assumed that all swans were white—an assumption derived from thousands of years of European history. “One single observation can invalidate a general statement derived from millennia of confirmatory sightings of millions of white swans,” wrote Taleb. “All you need is one single (and, I am told, quite ugly) black bird.” The point is that in some cases, past events do not reliably illustrate the possible scope of future events. And some of these unpredictable future events—financial collapses, terrorist attacks, unanticipated wars—can cause sudden, catastrophic harm. In The Black Swan and in media appearances, Taleb urged his audience not to leave itself exposed to the destructive force of black swans.
At first, Taleb saw the solution as making ourselves and our society more “robust”—that is, better able to survive black swans. In a long discussion appended to the book’s second edition, he described how we could replace our modern “fragility” with a “Black-Swan-Robust Society.” But as Taleb continued to research and write about the subject, he came to realize that he had seized on the wrong dichotomy. The opposite of being harmed by unforeseen events is not merely surviving them but benefiting from them. The opposite of fragility, therefore, is not robustness but anti-fragility. We are anti-fragile when we are “convex” to risk—when we have more to gain from unpredictable risks than we have to lose from them. Hence Taleb’s latest book, Antifragile, a prescription for how to thrive in an uncertain world.
Taleb emphasizes that his latest book is not a stand-alone project but rather an outgrowth of the same ideas that animated his earlier writings. The whole, he writes, constitutes “a main corpus focused on uncertainty, randomness, probability, disorder, and what to do in a world we don’t understand; . . . that is, decision making under opacity.” So readers familiar with Taleb’s work will find Antifragile a comfortable read whose prescriptions flow naturally from his previous books. For example, instead of constructing your life as a series of wagers on the future, build a “nonpredictive” life that eliminates downside risk whenever possible while leaving yourself open to upside risk—perhaps by putting your precious retirement savings into low-risk fixed-income investments while committing smaller parts of your investment portfolio to small stakes in start-up companies, which might yield a small loss but might produce an immense gain.
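The asymmetry behind that “barbell” allocation—a capped downside with an open-ended upside—can be made concrete with a toy simulation. Every number here is invented for illustration; this is a sketch of the shape of the payoff, not investment advice or Taleb’s own model:

```python
import random

def barbell_outcome(wealth=100.0, safe_frac=0.9, rng=random):
    """Toy model of a 'barbell' portfolio: most wealth in near-riskless
    bonds, a small slice in long-shot ventures. The downside is capped
    at the small slice; the upside is open-ended."""
    safe = wealth * safe_frac * 1.02          # bonds: a modest, steady return
    risky = wealth * (1 - safe_frac)
    if rng.random() < 0.05:                   # rare case: a start-up pays off hugely
        risky *= 50
    else:                                     # usual case: the small stake is lost
        risky = 0.0
    return safe + risky

outcomes = [barbell_outcome() for _ in range(100_000)]
worst, best = min(outcomes), max(outcomes)
# The worst case loses only the 10% slice; the best multiplies it many times.
```

The simulation usually loses the small stake, yet the worst possible outcome is known in advance, while the rare win dwarfs the accumulated losses—convexity, in Taleb’s vocabulary.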
More advice: antifragility requires keeping your options open as long as possible and thus maintaining your freedom to change a course of action if a black swan suddenly arrives. Furthermore, you can reduce fragility with a less-is-more mind-set that searches not for complex solutions to complex problems but for simple solutions that solve as much of the problem as possible. Finally, Taleb endorses conservative respect for the tried and true over the new and novel: what’s worked for generations ought to enjoy at least presumptive credit because it has withstood the test of time.
Taleb discusses few specific policy changes. But one of his primary assertions is readily applicable to policy debates: “Skin in the game is the only true mitigator of fragility.” The surest way to reduce the threat that unforeseeable risk poses is to make sure that those who create the risk also bear its potential costs. Among Taleb’s targets are the banks that the government considers “too big to fail.” Bank executives have no “skin in the game,” he argues, because they keep the upside and transfer the downside to others. When they hit it big, they collect bonuses; when they lose big, they get rescued by the bankruptcy code, government bailouts, and other parachutes. Similarly, “journalists who ‘analyze’ and predict” may win book deals and TV appearances when they make the right call, but seldom do they suffer a penalty for wrong predictions. Speculative traders, by contrast, take both the upside and the downside.
Rules ensuring that people have skin in the game create incentives for them to identify and mitigate risks ahead of time. Taleb traces such rules back nearly 4,000 years, to Hammurabi’s code: “If a builder builds a house and the house collapses and causes the death of the owner of the house—the builder shall be put to death.” Today, Taleb argues, an analogous rule should apply to Robert Rubin and the other “fragilistas” who built a brittle financial system and paid no cost for its collapse.
Antifragile helps dispel a common misconception about Taleb’s thought. He is not opposed to risk or even to black swans. Risk-taking and the mistakes that often result from it can be a source of strength in our society. Just as stress can make bones stronger, errors provide indispensable information—at least in a properly anti-fragile system, which learns from that information. The airline industry is a good example of such a system. “Every plane crash brings us closer to safety, improves the system, and makes the next flight safer,” writes Taleb. “Those who perish contribute to the overall safety of others. Swiss flight 111, TWA flight 800, and Air France flight 447 allowed the improvement of the system.” Thanks to legal and regulatory liability, the airlines fully bear the costs of their mistakes and ignorance; no airline is considered “too big to fail,” and thus the incentives are aligned for them to act prudently and improve over time. In the long run, the airline system as a whole benefits from this experience. (In 2010, it’s worth noting, two MIT Sloan School of Management professors and a National Transportation Safety Board official published a working paper calling on Wall Street regulators to mimic the NTSB’s postaccident procedures, saying that it could improve our understanding of systemic risk.) Taleb offers a similar take on the Fukushima nuclear disaster of 2011: “One can safely say that it made us aware of the problem with nuclear reactors . . . and prevented larger catastrophes.” We can’t learn from mistakes unless we make them.
Taleb thus stands in marked opposition to the modern-day activists who seize on the presence of uncertainty as grounds for shutting down certain projects altogether. Environmentalist critics of energy-infrastructure initiatives frequently resort to this tactic. If Taleb had the power to decide whether to allow shale gas “fracking,” say, his first question wouldn’t be: Have we figured out all the possible risks? Rather, he would ask: Where are the project’s fragilities, and who will bear the costs when the project produces actual failures and losses—even if those losses are the result not just of events that we anticipate today but of those that we don’t see coming? Only if companies remain responsible can we feel reasonably confident that they will mitigate the risks already on their radars and try, at least, to guard against the black swans that are not.
But Taleb’s specific prescriptions are less significant than the more general theme that he strives to convey in Antifragile and his other books: the need for epistemic modesty, a recognition of our own limits when we try to ascertain what the future holds. Taleb’s objective is not to make us better predictors but rather to reject the impossible ambition of foreseeing the unforeseeable. He wants to “live happily in a world I don’t understand,” as he puts it. That we can forecast some things doesn’t justify our overconfidence in forecasting generally.
That overconfidence is precisely the risk that Nate Silver runs at certain points of The Signal and the Noise. Silver might counter that his prescriptions deal with scenarios, such as baseball playoffs and presidential elections, in which black swans aren’t really a threat, since past data and experience offer a reasonable approximation of what the future will look like. And even if Silver himself wouldn’t offer that rejoinder, Taleb would, as he did in a rare joint appearance with Silver on Bloomberg TV this past December. Asked whether Silver’s book contradicted Taleb’s views on uncertainty and prediction, Taleb replied:
No, no, what he does is rigorous. Why? Because in economics, you can’t predict. In what he does, you can predict, because he’s dealing with binary variables. And let me explain. He’s predicting . . . the probability of getting this president. Okay, you’re going to get one president, so it’s a binary event. A war is not defined; an election is defined. A war can kill one person, or a million people, or 25 million people.
For the most part, Taleb continued, Silver’s work avoided contexts with “fat tails”—situations in which extreme events were extraordinarily likely and powerful—and was thus “immune to black swan errors.”
That might be an accurate characterization of Silver’s work on politics, baseball, and poker, but it doesn’t describe his book’s discussion of economics, let alone of terrorism and war. The problem in The Signal and the Noise is Silver’s faith in our ability to know when we’re on steady predictive ground. Throughout the book, he concedes that “if you can’t make a good prediction, it is very often harmful to pretend that you can”; that “purely statistical approaches toward forecasting are ineffective at best when there is not a sufficient sample of data to work with”; and that forecasting is difficult for areas that are not “data-rich.” But those warnings do little good when he offers little or no indication of how we can be certain that we’re in a position to make a “good” prediction, that we have “sufficient” data, and that our area is “data-rich.”
Compounding that problem, probabilistic frameworks inherently lend themselves to a bias that Taleb highlighted in The Black Swan: our ability to explain away mistakes without reassessing our presumptions. A good example of this kind of thinking occurred in the run-up to the 2012 presidential election, when Silver’s advocates, responding to conservative critics, argued that even if Silver blew his call, that wouldn’t disprove his model. After first blaming a possible Silver misfire on the polls that Silver used—as the old saying goes, garbage in, garbage out—the Washington Post’s Ezra Klein stressed that “if Mitt Romney wins on election day, it doesn’t mean Silver’s model was wrong.” After all, Klein explained, Silver’s model gave Romney a 25 to 40 percent chance of winning. So a Romney win would mean simply that the underdog won, not that the underdog wasn’t, in fact, the underdog.
Yet for all the energy that Klein and others devoted to criticizing Silver’s critics, they made little effort to grapple with a deeper question: How could one ever know whether Silver’s model was actually wrong? We’ve certainly seen other models implode. Wall Street bankers long depended on “value-at-risk” (VaR) models to determine how much they stood to lose from a market downturn. But those instruments failed to model the housing bubble realistically before it burst, sending shock waves through the financial markets. As Taleb has often argued since the financial crash, the problem with VaR models wasn’t just that they were sometimes wrong. It was that their appearance of precision inspired too much confidence among those who used them.
In a much less ruinous way, that’s what happened to Red Sox fans in 2011. The problem with the Coolstandings and Baseball Prospectus forecasts wasn’t merely that they overestimated Boston’s playoff odds when Josh Beckett hobbled to the dugout on September 5. Nor was it just that they failed to adjust their models appropriately in the days that immediately followed. It was that they persisted in their underlying assumptions, even as the Red Sox experienced a cascade of failures that simply had no precedent in the historical data. No one watching the Red Sox on September 16 should have taken at face value the Coolstandings statement that the team had an 88.7 percent chance of making the playoffs. Fans who took refuge in those statistics were mainly trying to fool themselves about what they were seeing with their own eyes.
When the Sox finally did collapse in the last game of the season, no shortage of analysts reviewed the statistical wreckage. Silver was among them. Tracing the series of putative probabilities, he did the math and concluded that there had been “a combined probability of about one chance in 278 million of all these events coming together in quite this way,” at least if you trusted the models. Silver admitted that “when confronted with numbers like these, you have to start to ask a few questions, statistical and existential.” But to Silver, the main lesson was that some of the models’ assumptions needed to be tweaked. To the rest of us, the lesson was that the models were almost certainly wrong.
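Silver’s “one chance in 278 million” figure comes from the standard arithmetic for a chain of events assumed independent: multiply the individual probabilities. The per-event numbers below are illustrative stand-ins, not Silver’s actual inputs:

```python
import math

# Hypothetical probabilities for each unlikely link in the chain.
# (Illustrative stand-ins, not Silver's actual figures.)
events = {
    "Sox squander a 9-game September lead": 0.003,
    "Sox lose after leading in the 9th": 0.05,
    "Rays come back from 7-0 down": 0.02,
    "both finishes land within minutes": 0.12,
}

# Under an independence assumption, the joint probability is the product.
joint = math.prod(events.values())
one_in = round(1 / joint)   # expressed as "one chance in N"
```

The fragile step is the independence assumption itself: when a single underlying collapse drives every link in the chain, multiplying the probabilities wildly understates the joint odds—which is exactly why an astronomically long number is better read as an indictment of the model than as a description of the world.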
Fortunately, it was only baseball. Fortunately for Silver and his defenders, the election results saved them from having to answer the nagging question of how, if ever, they would be able to tell in real time whether Silver’s model was wrong. And fortunately for all of us, a miscalled election would, in the great scheme of things, rank closer to a blown baseball prediction than to an unforeseen financial crisis in terms of societal harm.
But what about those future financial crises and terrorist attacks? For that matter, what about the risk of deepwater oil wells’ rupturing or of tsunamis’ hitting nuclear facilities? And those are just the “known unknowns.” What about the truly unknown unknowns? If we are to draw a useful lesson from Silver’s and Taleb’s excellent books, it is that we need both of them. As Silver contends, we should improve our forecasts whenever possible. But as Taleb cautions, we must have the modesty to admit that some matters do not lend themselves to sound forecasts—and plan accordingly.
Photo: Few experts foresaw the 2008 financial meltdown—a “black-swan” event, in the words of author Nassim Taleb. (Richard Drew/AP Photo)