In Box

Big Data: A Short History

How we arrived at a term to describe the potential and peril of today's data deluge.


Humans have been whining about being bombarded with too much information since the advent of clay tablets. The complaint in Ecclesiastes that "of making many books there is no end" resonated in the Renaissance, when the invention of the printing press flooded Western Europe with what an alarmed Erasmus called "swarms of new books." But the digital revolution -- with its ever-growing horde of sensors, digital devices, corporate databases, and social media sites -- has been a game-changer, with 90 percent of the data in the world today created in the last two years alone. In response, everyone from marketers to policymakers has begun embracing a loosely defined term for today's massive data sets and the challenges they present: Big Data. While today's information deluge has enabled governments to improve security and public services, it has also sowed fears that Big Data is just another euphemism for Big Brother.

American statistician Herman Hollerith invents an electric machine that reads holes punched into paper cards to tabulate 1890 census data, revolutionizing the concept of a national head count, which had originated with the Babylonians in 3800 B.C. The device, which enables the United States to complete its census in one year instead of eight, spreads globally as the age of modern data processing begins.

President Franklin D. Roosevelt's Social Security Act launches the U.S. government on its most ambitious data-gathering project ever, as IBM wins a government contract to keep employment records on 26 million working Americans and 3 million employers. "Imagine the vast army of clerks which will be necessary to keep these records," Republican presidential candidate Alf Landon scoffs. "Another army of field investigators will be necessary to check up on the people whose records are not clear."

At Bletchley Park, a British facility dedicated to breaking Nazi codes during World War II, engineers develop a series of groundbreaking mass data-processing machines, culminating in the first programmable electronic computer. The device, named "Colossus," searches for patterns in intercepted messages by reading paper tape at 5,000 characters per second -- reducing a process that had previously taken weeks to a matter of hours. Deciphered information on German troop formations later helps the Allies during their D-Day invasion.

The U.S. National Security Agency (NSA), a nine-year-old intelligence agency with more than 12,000 cryptologists, confronts information overload during the espionage-saturated Cold War, as it begins collecting and processing signals intelligence automatically with computers while struggling to digitize a backlog of records stored on analog magnetic tape in warehouses. (In July 1961 alone, the agency receives 17,000 reels of tape.)

The U.S. government secretly studies a plan to transfer all government records -- including 742 million tax returns and 175 million sets of fingerprints -- to magnetic computer tape at a single national data center, though the plan is later scrapped amid public concern about bringing "Orwell's '1984' at least as close as 1970," as one report puts it. The outcry inspires the 1974 Privacy Act, which places limits on federal agencies' sharing of personal information.

British computer scientist Tim Berners-Lee proposes leveraging the Internet, pioneered by the U.S. government in the 1960s, to share information globally through a "hypertext" system called the World Wide Web. "The information contained would grow past a critical threshold," he writes, "so that the usefulness [of] the scheme would in turn encourage its increased use."

August 1996
"We are developing a supercomputer that will do more calculating in a second than a person with a hand-held calculator can do in 30,000 years." --U.S. President Bill Clinton

NASA researchers Michael Cox and David Ellsworth use the term "big data" for the first time to describe a familiar challenge in the 1990s: supercomputers generating massive amounts of information -- in Cox and Ellsworth's case, simulations of airflow around aircraft -- that cannot be processed and visualized. "[D]ata sets are generally quite large, taxing the capacities of main memory, local disk, and even remote disk," they write. "We call this the problem of big data."

After the 9/11 attacks, the U.S. government, which has already dabbled in mining large volumes of data to thwart terrorism, escalates these efforts. Former national security advisor John Poindexter leads a Defense Department effort to fuse existing government data sets into a "grand database" that sifts through communications, criminal, educational, financial, medical, and travel records to identify suspicious individuals. Congress shutters the program a year later due to civil liberties concerns, though components of the initiative are simply shifted to other agencies.

The 9/11 Commission calls for unifying counterterrorism agencies "in a network-based information sharing system" that is quickly inundated with data. By 2010, the NSA's 30,000 employees will be intercepting and storing 1.7 billion emails, phone calls, and other communications daily. Meanwhile, with retailers amassing information on customers' shopping and personal habits, Wal-Mart boasts a cache of 460 terabytes -- more than double the amount of data on the Internet at the time.

As social networks proliferate, technology bloggers and professionals breathe new life into the "big data" concept. "This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear," Wired's Chris Anderson writes in "The End of Theory." Government agencies, some of the United States' top computer scientists report, "should be deeply involved in the development and deployment of big-data computing, since it will be of direct benefit to many of their missions."

January 2009
The Indian government establishes the Unique Identification Authority of India to fingerprint, photograph, and take an iris scan of all 1.2 billion people in the country and assign each person a 12-digit ID number, funneling the data into the world's largest biometric database. Officials say it will improve the delivery of government services and reduce corruption, but critics worry about the government profiling individuals and sharing intimate details about their personal lives.

May 2009
U.S. President Barack Obama's administration launches Data.gov as part of its Open Government Initiative. The website's more than 445,000 data sets go on to fuel websites and smartphone apps that track everything from flights to product recalls to location-specific unemployment, inspiring governments from Kenya to Britain to launch similar initiatives.

July 2009
Reacting to the global financial crisis, U.N. Secretary-General Ban Ki-moon pledges to create an alert system that captures "real-time data on the impact of the economic crisis on the poorest nations." The U.N. Global Pulse program has conducted research on how to predict everything from spiraling prices to disease outbreaks by analyzing data from sources such as mobile phones and social networks.

August 2010
"There were 5 exabytes of information created by the entire world between the dawn of civilization and 2003. Now that same amount is created every two days." --Google CEO Eric Schmidt

February 2011
Scanning 200 million pages of information, or 4 terabytes of disk storage, in a matter of seconds, IBM's Watson computer system defeats two human challengers on the quiz show Jeopardy! The New York Times later dubs this moment a "triumph of Big Data computing."

March 2012
The Obama administration announces a $200 million Big Data Research and Development Initiative in response to a U.S. government report calling for every federal agency to have a "'big data' strategy." The National Institutes of Health puts a data set of the Human Genome Project in Amazon's computer cloud, while the Defense Department pledges to develop "autonomous" defense systems that can "learn from experience." CIA Director David Petraeus, marveling that the "'digital dust' to which we have access is being delivered by the equivalent of dump trucks," discusses a post-Arab Spring agency effort to collect and analyze global social media feeds through cloud computing.

July 2012
U.S. Secretary of State Hillary Clinton announces a public-private partnership called "Data 2X" to collect statistics on women and girls' economic, political, and social status around the world. "Data not only measures progress -- it inspires it," she explains. "Once you start measuring problems, people are more inclined to take action to fix them because nobody wants to end up at the bottom of a list of rankings." Let the Big Data race begin.

Sources for charts: International Data Corp., March 2012; Facebook SEC filing, April 2012.




Blame Game

Want to avert another global recession? Stop the finger-pointing.

The global economic blame game is reaching a crescendo as Americans go to the polls and Europeans approach critical decision points. And everyone -- from economists to central bankers, from TV analysts to the person on the street -- seems to have a favorite scapegoat for Europe's recession and debt crisis, for America's feeble recovery and its recurrent political fiscal dramas, for dangerously high youth unemployment in a surprising number of countries, and for China's sudden economic slowdown.

But four years into the global economic malaise that has followed the 2008 crash, the endless recriminations are more than just academic. They are actually preventing us from coming to a consensus not only on how to dig out of this mess, but also on how to prevent it from happening again. The unhappy result is that the risk of a global recession is rising, as is that of another financial crisis. So can we please get the finger-pointing out of our systems and move on?

Banks are at the top of most lists of bad guys. They lent way too much to credit-challenged entities, often using structured products like collateralized debt obligations and repackaged loans that few of them understood sufficiently, let alone knew how to manage responsibly. This lapse in the most fundamental element of banking -- failing to properly channel loanable funds to productive uses -- was consequential. Yet it was far from the only one.

Banks compounded the mistake by massively leveraging their balance sheets, making a whole set of expensive side bets, and moving activities to unregulated areas. Many did so while benefiting from the protection afforded to them by state-run deposit guarantees, emergency loans from central banks, and, most destructive of all, the notion that they were "too big to fail."

To add insult to injury, many banks (particularly in the United States) are seen as now overreacting. Having lent way too much and in a reckless manner, today they are withholding credit from legitimate borrowers, preferring just to add to the capital they've stockpiled at the Federal Reserve.

Even as I write this, I can hear the bankers shout, "Unfair!" Many of them argue that these crises are due to regulators falling asleep at the switch. They have a point.

Enamored with the textbook characterization of efficient, unfettered capitalism and well-functioning markets, regulators gave the banks an enormous amount of rope with which to hang themselves. In the process, they aided and abetted the illusion that banks could operate outside the constraints of the real economic activity that they finance. The laxity was intensified by competitive hubris -- among cities vying to be the world's financial center (London vs. New York) and among others that, in a quest for greater international respect, allowed their national banks to vastly outgrow domestic realities (Dublin, Reykjavik, and Zurich).

"Not so fast," respond the regulators, who invariably rebut the finger-pointing by arguing that the politicians pushed them to turn a blind eye or, even worse, deregulate beyond what prudence would have dictated. And the politicians wouldn't let up, believing that leverage could deliver sustainable economic growth to an impatient electorate.

After all, the West was (and is) increasingly losing out to emerging economies that benefit from lower wages, technology leapfrogging, or considerable natural resource endowments. Too many Western governments fell prey to the idea that advanced economies, in a bid to retain their global standing, could simply migrate up the value-added curve from producing things to financing them (and then trading over and over again a seemingly endless array of derivatives).

To which the politicians, of course, will say: "It's not our fault. Without a level economic playing field, we're forced to take risks to compete." In their version of reality, emerging economies gained competitive advantage by manipulating their currencies, weakening labor standards, degrading the environment, or engaging in various forms of implicit protectionism -- complaints that resurfaced in America's presidential campaign, pushing candidate Mitt Romney to threaten to label China a "currency manipulator."

These same politicians are also quick to point the finger at "do-nothing" multilateral institutions, claiming that the lapses leading to the crisis were a reflection of poor governance at global bodies like the International Monetary Fund (IMF), whose very purpose is to blow the whistle on unsustainable national policies, especially if they risk harming other countries. Remember, the raison d'être of the IMF and other multilateral institutions is to promote and facilitate global cooperation and shared policy responsibilities. Yet, when push came to shove, these institutions shied away from their duties, hindered by widespread representation and legitimacy deficits.

Then again, the multilaterals will say, "Don't blame us. We're only as strong as our member nations let us be." Fair enough: For many years, a number of key countries acted to undermine responsible multilateralism. Western countries stubbornly held on (and still do) to their outmoded entitlements at the IMF and World Bank, from absurd voting overrepresentation to feudalistic strangleholds on key decision-making positions. For their part, emerging economies were too timid in forcing much-needed change.

"How could we?" ask these emerging economies. After all, the last thing they wished to do was alienate powerful politicians in Europe and the United States who were too timid to face reality, shying away from the responsibility of explaining to their citizens the reality and consequences of global economic realignments. These same politicians -- hostage to the tyranny of short election cycles -- instead wooed voters seeking instant gratification, the protection of unsustainable entitlements, and shortcuts to continued prosperity. All this brings us back to the banks, the beginning of this vicious circle of blame. I mean, who else could have enabled these voters to consume as they wanted, inflating their living standards based not on income but credit?

And so go the endless, useless recriminations. The blame game, however, is a lot more dangerous than it sounds. This never-ending cycle not only diverts responsibility, but distracts from coherent responses. That has two immediate consequences.

First, it is virtually impossible to generate the sense of shared responsibility that must underpin any sustainable, effective solution. And while we dither, the global malaise spreads: Unemployment remains too high, financial fragility grows, and healthy capital retreats to the sidelines. Meanwhile, a growing number of weak economies now risk being tipped into severe crises while the stronger ones see their vibrancy sapped by global instabilities. Just witness the sharp slowdown in Germany -- Europe's strongest economy -- and the related rise in unemployment. Or look at how growth is plummeting in once vibrant emerging markets such as Brazil, China, and India. The longer the blame game continues, the greater the scope and scale of the synchronized global slowdown.

Second, the temptation increases for each country to turn inward, significantly raising the risk of protectionism. In today's world, this risk is not limited just to trade and currency wars. If we're not careful, a cascade of capital controls -- gates that stop money from flowing around the global system -- could be on the horizon, severely undermining the functioning of the open markets our economies have come to rely on and risking a further deterioration in growth and job dynamics. That's something we can ill afford.

There is no time to waste. Instead of a blame game, we need a cooperative game. And whoever wins the U.S. election in November must make this priority No. 1.

The world is looking for bold economic leadership, and in its absence, a dysfunction that would make 2008 look like a mere flesh wound is becoming ever more likely. And trust me -- no one will want to take the blame for that tragedy.

Illustration by Laurie Rosenwald