Lies, Damned Lies, and Statistics

Why the world needs better data.

During the presidential campaign that just concluded, not a day went by without Mitt Romney trotting out the Chinese bogeyman, a looming red specter racking up a tremendous trade surplus and sapping Americans of their confidence. "It's pretty clear who doesn't want a trade war. And there's one going on right now, which we don't know about. It's a silent one. And they're winning," Romney declared during the third and final debate. The point was a politically useful one for the Republican challenger, but did he really know how big the trade deficit with China is and what it implies?

Such statistics play an extraordinarily important role in the analysis and formulation of government policy. But even when they are not used as political fodder, they are often misleading, and once they become the established way of measuring things, special interests latch on to them and make it almost impossible to change the metrics used. Greater discrimination should be used with statistics, and voters should insist that political leaders use and invest in statistics that accurately reflect what they are intended to measure.

Take exports, which have traditionally been reported on a gross basis -- simply the dollar value of what is exported. Such data is of limited economic consequence because nowadays, recorded exports very often consist of large amounts of imported inputs, ranging from petroleum to sophisticated machine tools, as distinct from value added domestically, such as the design and assembly of a jet fighter or an earth-moving machine.

Responding to this criticism, the Organization for Economic Cooperation and Development and the World Trade Organization have just released a new dataset based on the domestic value added of exports that transforms the way trade is viewed. The figures show that China's bilateral trade surplus with the United States is about 25 percent smaller than previously reported because so much of China's exports consists of the assembly of parts produced elsewhere, including in the United States. Moreover, the statistics illustrate more clearly than ever before how interconnected economies around the world have become, and how attempts to restrict imports -- such as those Romney was effectively advocating -- are likely to backfire not only because other countries will retaliate but also because domestic production will quickly become uncompetitive or even grind to a halt due to lack of parts or raw materials.
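
A stylized example may make the accounting concrete. The sketch below uses entirely hypothetical figures (a $200 device assembled in country A from parts made in countries B and C, then sold in B) and made-up function names. It is not the OECD-WTO methodology, which rests on full input-output tables; it only illustrates why counting value added shrinks an assembler's bilateral surplus.

```python
# Hypothetical supply chain: country A assembles a device and sells it in country B.
# All figures are invented for illustration.
FINAL_PRICE = 200.0
VALUE_ADDED = {"A": 80.0, "B": 20.0, "C": 100.0}  # where the $200 of value was created


def gross_surplus_of_assembler(final_price, value_added, consumer):
    """Assembler's surplus with the consumer on a gross basis.

    The full price of the finished good counts as the assembler's export;
    the parts it bought from the consumer country count as its imports.
    """
    return final_price - value_added[consumer]


def value_added_surpluses(value_added, consumer):
    """Each foreign country's surplus with the consumer, counting only
    the value created in that country. The consumer's own parts come home,
    so they are excluded."""
    return {country: v for country, v in value_added.items() if country != consumer}


gross = gross_surplus_of_assembler(FINAL_PRICE, VALUE_ADDED, consumer="B")
by_origin = value_added_surpluses(VALUE_ADDED, consumer="B")

print(f"Gross basis: A's surplus with B = {gross:.0f}")
for country, surplus in by_origin.items():
    print(f"Value-added basis: {country}'s surplus with B = {surplus:.0f}")
```

In this toy case B's overall deficit is 180 on either basis. What changes is the attribution: on a gross basis all of it is recorded against A, while on a value-added basis 80 is with A and 100 is with C.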

The new data also indicates that, contrary to general belief, the United States' service exports are about as large as its manufactured exports. Because domestically produced services such as transport and financial and legal services are exported both directly and indirectly as part of the domestic value-added of manufactured exports, traditional ways of measuring exports underestimate the importance of services as generators of foreign exchange. For example, much of what is reported as export of a Boeing aircraft actually includes a large proportion of services Boeing has purchased from U.S. service providers -- everything from janitors to lawyers to local government officials.

But while these statistics shed new light on how to measure economic output, do not expect textile or steel producers to adopt them any time soon. Given their interest in playing up the importance of manufacturing, these groups have a vested interest in maintaining the outdated economic statistics. The same applies to China-bashers interested in playing up the size of the bilateral trade deficit.

And trade is just one of many instances where standard economic statistics present a misleading picture of reality. Take the government budget deficit -- the difference between the government's total expenditures in a given year and its revenues from taxes, fees, and other miscellaneous revenue sources. This method of calculating the budget deficit is useful to understand how much the government needs to borrow each year, but the deficit fails to capture far more important dynamics in government spending and future obligations.

As Boston University economist Laurence Kotlikoff has long argued, in his article Deficit Delusion and elsewhere, the economic significance of the budget deficit as traditionally measured is, well, minuscule. What really matters for assessing the solvency of the government is not current cash outlays and receipts, but its overall balance sheet and how it is changing, notably through the new commitments it undertakes each year and the tax revenue it can expect in coming decades. It is therefore possible for the government to show a cash surplus in a given year but for its net liabilities to rise because life expectancy has risen or a new pension or health care scheme has been introduced that will draw on resources for many years to come.
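
That last possibility can be illustrated with a small present-value calculation. The numbers below are hypothetical (a cash surplus of 10, a new benefit costing 5 a year for 30 years, a 3 percent discount rate), and the function name is an assumption made for this sketch; real generational-accounting models are far more detailed.

```python
def present_value_of_annuity(annual_payment, years, discount_rate):
    """Present value of a fixed payment made at the end of each year."""
    return sum(
        annual_payment / (1.0 + discount_rate) ** t
        for t in range(1, years + 1)
    )


# Hypothetical figures, in billions of dollars.
cash_surplus = 10.0          # this year's revenues exceed outlays by 10
new_promise_per_year = 5.0   # a new pension benefit, paid annually
promise_years = 30
discount_rate = 0.03

new_liability = present_value_of_annuity(
    new_promise_per_year, promise_years, discount_rate
)
change_in_net_liabilities = new_liability - cash_surplus

print(f"Present value of the new promise: {new_liability:.1f}")
print(f"Change in net liabilities despite the surplus: {change_in_net_liabilities:+.1f}")
```

Under these assumptions the new promise is worth roughly 98 today, so the government's net liabilities rise by about 88 in a year in which the headline budget shows a surplus.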

As populations have aged and health-care costs have steeply increased in advanced countries, the cash deficit has tended to greatly underestimate the true deficit. Long-term budget projection models have been developed that incorporate demographic, productivity, and other trends to evaluate the true budget situation of countries. Although the results are widely available, they remain, by and large, invisible to the public eye, the province of policy wonks.

Instead, simple economic questions such as the size of the economic pie are obscured behind economic statistics that fail to accurately reflect the underlying economic reality. Gross domestic product, for instance -- the sum total of all things produced in an economy -- provides a very partial and vastly distorted picture of a nation's well-being. The measure fails to account for factors such as the cost of environmental pollution, urban congestion, and the depletion of natural resources that is associated with the economic activity it measures.

The growth of GDP per capita, the most widely used measure of economic progress, can also produce a vastly misleading picture because it fails to account for income distribution. From 1979 to 2007, for example, the United States saw average GDP per capita rise by 62 percent in real terms, but its median GDP per capita, the income level that half of families exceed, grew by just 35 percent. This is because the gains of economic growth have accrued overwhelmingly to the top of the income spectrum, a fact that remained obscured during the recent boom years and has only recently gained political attention.
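
A toy calculation shows how the average and the median can diverge. The five household incomes below are invented for illustration and are not drawn from U.S. data; the point is only that gains concentrated at the top lift the mean far more than the median.

```python
from statistics import mean, median

# Hypothetical household incomes (thousands of dollars), before and after a boom.
incomes_before = [20.0, 30.0, 40.0, 60.0, 150.0]
incomes_after = [21.0, 32.0, 44.0, 70.0, 283.0]  # most of the gain goes to the top


def growth(old, new):
    """Proportional change from old to new."""
    return (new - old) / old


mean_growth = growth(mean(incomes_before), mean(incomes_after))
median_growth = growth(median(incomes_before), median(incomes_after))

print(f"Growth of the average income: {mean_growth:.0%}")
print(f"Growth of the median income:  {median_growth:.0%}")
```

Here the average rises by 50 percent while the household in the middle sees a gain of only 10 percent.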

For ordinary people, inflation remains far and away the most important economic statistic -- many a revolution, after all, has found its spark in rising bread prices. Still, despite numerous attempts to rework it in the United States, Britain, and elsewhere, the standard measure of inflation, the consumer price index, tends to overstate inflation because it does not adequately account for quality improvements. The car I own today, for example, is a much safer, more comfortable, and more fuel-efficient vehicle than the one I bought when I was in graduate school almost 40 years ago. But the consumer price index would have me believe that it is far more expensive while not accounting for the fact that I have gotten much more for my buck.

The consumer price index can also overstate inflation because it does not adequately account for the fact that consumers buy fewer of the products and services whose prices rise faster. A "chain-weighted" index has been developed that updates the weights of different products every year to reflect the most recent choices of consumers and results in lower and more accurate inflation estimates. But its use to compute the inflation adjustment of Social Security benefits is fiercely resisted by advocacy groups like the American Association of Retired Persons, as a lower figure for inflation would result in lower cost-of-living adjustments for retired seniors.
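
The substitution effect is easy to see in a two-good example. The sketch below compares a fixed-basket (Laspeyres) index with a Fisher index, which also takes the later period's purchases into account. The Fisher formula is used here as a simple stand-in for the chained formulas statistical agencies actually apply, and the prices and quantities are hypothetical.

```python
from math import sqrt


def basket_cost(prices, quantities):
    """Cost of buying the given quantities at the given prices."""
    return sum(prices[good] * quantities[good] for good in prices)


def laspeyres(p0, p1, q0):
    """Price index that holds the base-period basket fixed."""
    return basket_cost(p1, q0) / basket_cost(p0, q0)


def paasche(p0, p1, q1):
    """Price index that uses the later period's basket."""
    return basket_cost(p1, q1) / basket_cost(p0, q1)


def fisher(p0, p1, q0, q1):
    """Geometric mean of the Laspeyres and Paasche indexes."""
    return sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))


# Hypothetical data: beef gets pricier, so shoppers switch toward chicken.
p0 = {"beef": 10.0, "chicken": 10.0}
p1 = {"beef": 15.0, "chicken": 10.0}
q0 = {"beef": 10.0, "chicken": 10.0}
q1 = {"beef": 6.0, "chicken": 14.0}

fixed_basket = laspeyres(p0, p1, q0)
updated_weights = fisher(p0, p1, q0, q1)

print(f"Fixed-basket inflation:    {fixed_basket - 1:.1%}")
print(f"Updated-weights inflation: {updated_weights - 1:.1%}")
```

With these numbers the fixed basket reports 25 percent inflation, while the index that reflects the shift toward chicken reports a little under 20 percent.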

Why do such statistics continue to be used when many now recognize that they are deeply flawed? Sheer inertia is part of the explanation, and gathering more nuanced and better-targeted statistics and disseminating them is costly. But perhaps the most important reason is that powerful interest groups latch on to the commonly used statistics that suit them.

Particular groups should have the right to their own view, but not to their own facts. Political leaders in the United States and around the world should invest in statistics that tell the truth and treat headline figures with healthy skepticism. Those charged with the big decisions that determine lives should be expected to delve deeper into what is really going on.



China and Japan's Wikipedia War

How a showdown over a group of remote islands in the East China Sea is heating up online.

As China and Japan jockey for influence in the Pacific, an unlikely diplomatic fault line has emerged: an archipelago of uninhabited rocks in the East China Sea. Known as the Senkakus in Japan, which controls them, the islands are also claimed by China and Taiwan -- and both are struggling to reassert sovereignty. Tremors have increased in recent months with confrontations between the Japanese and Taiwanese coast guards and rabble-rousing from Chinese media outlets. Statesmen have shuffled back and forth between Beijing, Tokyo, and Washington to cool the crisis, but neither Xi Jinping, the new head of the Chinese Communist Party, nor Shinzo Abe, Japan's new prime minister, shows any sign of backing down. On the contrary, China raised the stakes on Jan. 30, when one of its military frigates aimed weapons-targeting radar at a Japanese warship, prompting Japan to lodge a formal complaint with the Chinese government.

But if the physical posturing has been vigorously covered in the news media, the digital posturing has not. In recent years, partisans have taken the fight to Wikipedia, where articles about the islands have been subject to weekly "edit wars" between contributors. The content on these pages might seem to be of only marginal importance compared to more significant coverage in other outlets. But the "Senkaku Islands" and "Senkaku Islands dispute" Wikipedia articles are the two most prominent English-language sources of information about the islands on the Internet, with the top search result ranking on Google and thousands of page views every month. The Japanese and Chinese language editions of Wikipedia have their own article pages for the islands as well -- each offering different chronologies of ownership. These sites, however, receive far less traffic and the content debates are far more diplomatic.

For many Web users, Wikipedia remains a reliable first stop for facts, and the site's crowd-sourced quality control has always been more effective than critics give it credit for. But entries about contentious subjects -- from Kosovo's independence to Kim Kardashian's pregnancy -- are difficult to monitor around the clock, and remain susceptible to vandalism, questionable sources, and editorial disagreement. Charges of censorship and bias are rampant on the Senkaku Islands entries' talk pages, where the process of article creation is negotiated. Many of the combatants are veteran editors with established user handles and years of experience who would never admit to any sort of partisanship. Nonetheless, strongly entrenched opinions are evident, with each side claiming adherence to Wikipedia's editing guidelines -- much like respective Japanese and Chinese officials continue to ground their claims in international law.

Like the real-world debate, the Wikipedia dispute is a fairly recent development. The original Senkaku Islands article, created in 2003, was relatively short at just over 300 words. This version actually listed the traditional Chinese name, "Diaoyutai," first in the opening paragraph. By January 2010, the article had swelled to well over 4,000 words, and included 43 different footnotes. Although the article emphasized that ownership of the islands was disputed, "Senkaku" was now used on first reference, and many of the geographic citations were Japanese maps. That year, the article was subject to more than 800 separate edits. And when a Chinese fishing trawler collided with two Japanese fishing boats near the islands in September, the article's talk page exploded with activity.

As the article attracted more attention, three issues emerged as key points of contention between editors. First and foremost: the name. China refers to the chain as the Diaoyu Islands, Taiwan as the Diaoyutai Islands. Wikipedia searches for those entries have always redirected readers to the main "Senkaku Islands" page. But in 2009, an editor renamed the page "Diaoyutai Islands" and moved that name ahead of the Japanese translation in the opening paragraph. Although that change was quickly reversed by another editor, it launched a talk page dispute that raged through 2010. Some editors supported changing the article's name to "Pinnacle Islands" -- the English-language name for the island chain used in the 19th century -- to mitigate concerns about article bias. This attempt at a compromise was quickly shot down, even as the talk page rhetoric heated up. "These pro-Japanese editors just a bunch of bully boys and hooligans!" an editor named STSC vented.

The second point of contention is ownership. As the Senkaku Islands article developed, competing Japanese and Chinese/Taiwanese historical claims to the territory were outlined in long, excessively detailed sections that soon took up the bulk of the article. One editor wisely created a new "Senkaku Islands dispute" page in October 2010 to accommodate new additions and outline the dispute's chronology. But deciding what evidence was admissible even as a "claim" remained contentious. For example, a classified PRC government map identifying the islands as Japanese territory was added as a graphic after it was referenced in a 2010 Washington Times column. But some editors questioned the map's authenticity, and others wondered whether the Times could really be considered a "reliable" source of information on the subject. A 2012 New York Times column by Taiwanese academic Han-yi Shaw received similar scrutiny. Shaw revealed Japanese government documents from the Meiji era that seemed to acknowledge Chinese ownership of the islands, but this evidence was dismissed by one editor because the piece featured an introduction by Nicholas Kristof, a "pro-China journalist" with a "Chinese wife."

Third and finally, editorial neutrality has been a regular area of dispute. As the previous examples make clear, charges of bias are the most common sticking point during article development. Editing from a neutral point of view (NPOV in Wiki-speak) is a fundamental principle of Wikipedia, and any new content that appears less-than-objective is likely to be removed by another editor. But recent charges of subjectivity have had less to do with wording or sourcing, and more to do with nationality. Editors using Japanese or Chinese words in their user handles have been frequent targets for this line of attack. A "Suggested Rules of Engagement" tag has been placed on the top of both articles' talk pages to encourage civility, and parts of the main article were "locked" throughout 2012 to prevent editors from attempting to change the name.

Regular editing dust-ups might suggest that the Senkaku Islands article and its "dispute" offshoot are dubious resources of little value. In fact, both articles nicely summarize the controversy and provide a long list of citations and references that can advance further research. While news accounts of the islands focus on recent diplomatic incidents and their international implications, these Wikipedia articles provide historical context and a more detailed explanation of the arguments underlying each side's claims to the territory. The vitriol exchanged by editors might be ugly, but it's also evidence of a transparent and ongoing screening process. Wikipedia has a strict policy against "original research" -- all claims and assertions must be supported by reliable, published sources, not personal interpretation -- but editors are encouraged to vet sources and use their language skills to translate foreign documents.

Furthermore, while the Senkaku pages are particularly "active" right now, Wikipedia articles related to other territorial disputes have experienced similar disputes and edit wars. A recently proposed change to a single sentence on the Falkland Islands article produced multiple rounds of recriminations between two editors, each asserting the NPOV high ground. A suggestion to split the Cyprus article into a "Republic of Cyprus" page and "Cyprus (island)" page dissolved into a month-long debate that was 5,000 words longer than the existing article. And the Northern Ireland page -- as you might suspect -- is currently subject to active arbitration by administrators.

As the standoff over the Senkaku Islands escalates, Wikipedia will continue to be a kinetic diplomatic front. The pages' high profile and the subject's newsworthiness force embattled editors to revisit and relitigate the same name and legal status battles again and again against new challengers. Whether voluntary cooperation and third-party mediation are enough to contain the crisis -- editing or otherwise -- remains to be seen. What is clear, however, is that a large Web audience increasingly perceives Wikipedia as the encyclopedia of record where history is documented and judged.