The NSA's Data Haul Is Bigger Than You Can Possibly Imagine

And so are its mistakes.

Editor's note: Shortly after this story was published, the Washington Post released a series of eye-popping leaked documents showing that the National Security Agency has accidentally intercepted the communications of thousands of people it had no right to spy on. The story below is in many ways the precursor to that blockbuster revelation.

The NSA, as intelligence historian Matthew Aid shows, collects so much information online that even its mistakes are enormous. Every day, it actively analyzes the rough equivalent of what's inside the Library of Congress and "touches," to use the agency's term, another 2,990 Libraries' worth of data. With such a huge haul, even the tiniest of error rates -- one in a hundred thousand, say -- still produces terabytes and terabytes of improperly harvested data. It still means thousands and thousands of people are wrongly caught in the surveillance driftnet.

The NSA's defenders will point to the many times the agency's intelligence analysts followed the rules and got things right. But that misses the point; no one expects these analysts, or the systems they use, to be flawless. The problem is that the surveillance net is so very large that even the most minuscule of imperfections can have an outsized impact. And that calls into question whether the NSA's intelligence-collection efforts have grown too big for their own good.

The electronic spies at the National Security Agency have tried lately to play down the amount of Internet traffic they inspect -- and play up how central that monitoring is to stopping terrorist attacks. Neither one of those arguments is entirely true. Yes, the NSA claimed in a recently released white paper that it "touches" only 1.6 percent of the planet's online data, but the agency neglected to note that this is roughly equivalent to the Library of Congress's entire textual collection, inspected 2,990 times every day. And sure, the NSA's Internet surveillance has been instrumental in some counterterrorism operations. But this analysis of online communications has also been central to U.S. spying on places like Syria, Libya, China, and Iran.

The importance of the Internet as an intelligence source for the NSA cannot be overestimated. The NSA may have made its Cold War reputation intercepting phone and radio traffic; these days, it's all about the Net. According to information gathered from interviews with three former or currently serving U.S. intelligence officials conducted over the past month, the NSA is now producing high-grade intelligence information on a multitude of national and transnational targets at levels never before achieved in the agency's history. Here are a few examples of the intelligence reportedly derived from NSA's intercepts of the contents of emails and other Internet-based communications systems:

* According to a recently retired U.S. intelligence analyst, much of what the U.S. intelligence community knows, or thinks it knows, about the Iranian nuclear program is based largely on intercepted online communications.

* Intercepted emails and other Internet communications have been an essential source of information about what has been transpiring in Syria and the countries surrounding it since the Syrian civil war broke out in early 2011.

* The NSA's ability to exploit email traffic, both plaintext and encrypted, has proved to be a critically important tool allowing the U.S. intelligence community to track military activities around the world, particularly in certain key countries in the Middle East, South Asia, and the Far East. For instance, intercepted Internet traffic reportedly played an important role in allowing the U.S. intelligence community to keep close tabs on the activities of military units loyal to Muammar al-Qaddafi during Libya's civil war in 2011.

* Intercepted emails and text messages were also essential to the success of Gen. David Petraeus's Baghdad "surge" operation in Iraq in the spring and summer of 2007. According to an Aug. 9 NSA white paper, "The senior U.S. commander in Iraq credited signals intelligence with being a prime reason for the significant progress made by U.S. troops in the 2008 [actually 2007] surge, directly enabling the removal of almost 4,000 insurgents from the battlefield."

* According to one official, intelligence information derived from Internet signals collection, or SIGINT (for "signals intelligence"), has been responsible, directly or indirectly, for more than 60 percent of the al Qaeda terrorists captured or killed since the 9/11 attacks.

* Since 2008, signals intelligence derived from mobile phone and email intercepts has become the principal intelligence source used by the CIA, the Defense Intelligence Agency, and Joint Special Operations Command to target drone strikes and commando raids against al Qaeda terrorists and local insurgent targets in northern Pakistan and Yemen. Signals intelligence has become so important to the U.S. intelligence community's counterterrorism effort that it has given birth to a new type of CIA intelligence officer, the human intelligence targeting officer (HTO). The HTO is responsible for fusing real-time signals intelligence concerning the locations of al Qaeda officials with available intelligence received from agents in order to direct CIA Reaper drones equipped with Hellfire air-to-surface missiles to their targets.

Working in close conjunction with its English-speaking partners in Britain, Canada, Australia, and New Zealand, the NSA is currently engaged in two Internet-related SIGINT collection programs.

The first involves the collection of Internet metadata -- who communicates with whom and how. The domestic component of this program, which started shortly after 9/11, involved AT&T, Verizon, and Sprint providing the NSA with massive volumes of Internet usage data for all their subscribers in the United States and overseas. This program was officially terminated in December 2011 after Sen. Mark Udall and Sen. Ron Wyden questioned whether the program was producing sufficient intelligence to justify continuing to fund it. Whether the NSA still retains the massive database of Internet metadata is unknown. But the agency isn't in the habit of throwing things away.

Either way, the NSA continues to collect the exact same sort of Internet metadata on foreign targets to this very day (though determining who's a foreigner and who's not can be a near-impossible task, as my FP colleague Shane Harris has shown). Every minute of every day of the year, the NSA's vast array of computers sweeps the entire global Internet using almost exactly the same search and sweep techniques as Google, collecting vast amounts of metadata on Internet usage around the world. The metadata that the NSA and its partners collect every day yields vast amounts of information on computer systems and email communications links of particular interest to the agency: Internet protocol (IP) addresses, email accounts, user names, domains, service providers, server locations, ports, blocked sites, browser(s) used, dates and times of logins, length of web sessions, website addresses (URLs) visited, IP addresses contacted, and, for Skype users, all phone numbers called.

The Internet metadata program has been particularly useful for identifying which email links use PGP or other encryption systems, which automatically earns that particular system increased scrutiny by the NSA's computer-hacking organization, the Office of Tailored Access Operations, to determine whether this communications traffic might be of intelligence value.

Separate from the Internet metadata program, the NSA and its overseas partners intercept the content of vast amounts of communications and digital data traffic carried on the Internet, especially email traffic. The NSA and its English-speaking partners are intercepting, machine-reading, and caching millions (if not billions) of emails every day. According to previously published reports, the agency may even be able to read emails that were encrypted with a wide variety of commercially available encryption systems.

Getting at the vast and growing volume of email and related communications traffic being carried over the Internet is, from a purely technical standpoint, a relatively easy proposition for the NSA because, according to industry estimates, roughly 80 percent of the world's Internet traffic either originates in the United States or transits through Internet service providers and/or computer servers in the United States.

And what the NSA cannot access, sources report, the agency's British, Canadian, Australian, and New Zealand SIGINT partners oftentimes can. They do this by covertly collecting all Internet and data traffic being carried on all fiber-optic cables that touch on their territory.

The majority of the Internet traffic entering, leaving, or transiting through the United States travels through one of 32 fiber-optic-cable landing points or terminals: 20 on the U.S. East Coast and 12 on the West Coast. According to the consulting firm TeleGeography in Washington, D.C., 56 global fiber-optic cable systems carrying Internet and digital data traffic to and from Europe, Asia, the Middle East, Africa, Latin America, and the Caribbean are connected to these 32 cable landing points.

The NSA can now access almost all traffic transiting through these fiber-optic cable systems (except those cables connecting the lower U.S. mainland with Alaska) pursuant to a classified program called Upstream. Upstream consists of four subordinate programs called Fairview, Stormbrew, Blarney, and Oakstar. An April 2013 top secret PowerPoint slide leaked by Edward Snowden to the Washington Post indicates that Stormbrew focuses on Internet traffic passing between the United States and Asia, while Blarney appears to cover traffic between the United States and Europe and the Middle East. The precise functions of the Fairview and Oakstar programs are not yet known.

Getting at this traffic is only technically feasible because of the NSA's intimate relationships with the largest American telecommunications companies and Internet service providers. Thanks to a series of secret cooperative agreements with America's three largest telecommunications companies -- AT&T, Verizon, and Sprint -- since 9/11 the NSA has been given access to virtually all foreign Internet traffic carried by these underwater fiber-optic cable systems. These access agreements with the "Big Three" telecommunications companies are legally sanctioned by warrants that are routinely renewed every 90 days by the Foreign Intelligence Surveillance Court in Washington, D.C.

AT&T, Verizon, and Sprint can access most Internet traffic transiting the United States via these fiber-optic cables because at some point the traffic passes through one or more gateway nodes, backbone nodes, remote access routers, Internet exchange points, or network access points in the United States that are operated by the "Big Three." At these points, Internet traffic of interest to the agency is intercepted by NSA equipment (euphemistically referred to as "black boxes" by company personnel) that is operated and maintained by specially cleared personnel on the payroll of the telecommunications companies.

For example, all Internet and data traffic from Latin America and the Caribbean arrives in the United States via eight submarine fiber-optic cables whose terminals are located in Florida at Jacksonville, Vero Beach, West Palm Beach, Spanish River Park, Boca Raton, Hollywood, North Miami Beach, and Miami. All Internet traffic from these eight fiber-optic cables is forwarded to the AT&T backbone node facility in Orlando, Florida, where email and data traffic of interest to the NSA is instantly copied and sent via secure buried fiber-optic cable links to NSA headquarters for processing, analysis, and reporting.

And since September 2007, the NSA has been able to expand and enhance its coverage of global Internet communications traffic through a now-infamous program called PRISM, which uses orders issued by the Foreign Intelligence Surveillance Court that permit the NSA to access emails and other communications traffic held by nine American companies: Microsoft, Google, Yahoo!, Facebook, PalTalk, YouTube, Skype, AOL, and Apple.

Thanks to PRISM, for the past six years the NSA has been exploiting a plethora of other communications systems besides emails that also use the Internet as their platform: voice-over-Internet protocol (VoIP) systems like Skype, instant messaging and text messaging systems, social networking sites, and web chat sites and forums, to name but a few. The NSA is also reading emails and text messages carried on 3G and 4G wireless traffic around the world because many of these systems are made by American companies, such as Verizon Wireless.

No matter how you measure it, the amount of intercepted Internet-based communications traffic that the NSA must process, analyze, and report on is massive and getting larger by the day.

In an unclassified white paper released on Aug. 9, the NSA claimed that it "touches" only 1.6 percent of the 1,826 petabytes of traffic currently being carried by the Internet, which equates to approximately 29.2 petabytes of communications data. To give one a sense of how much raw data this is, the Library of Congress's entire collection, the world's largest, holds an estimated 10 terabytes of data, which is equivalent to 0.009765625 petabytes. In other words, the NSA collects just from intercepted Internet traffic the equivalent of the entire textual collection of the Library of Congress 2,990 times every day.

Of this amount, according to the NSA, only 0.025 percent of the intercepted Internet material is selected for review based on a vast and ever-changing "key word" or "key phrase" alert system. On paper this sounds reasonably manageable until you realize that the daily amount of material in question is the equivalent of 75 percent of the Library of Congress's entire collection.
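The arithmetic behind these comparisons can be checked with a short sketch. It uses only the figures quoted above; the variable names are illustrative, and two points are assumptions: that a petabyte is treated as 1,024 terabytes (which is what makes 10 terabytes equal 0.009765625 petabytes) and that the "touched" volume is rounded to 29.2 petabytes before dividing, as the text does.

```python
# Figures quoted in the article from the NSA's Aug. 9 white paper.
INTERNET_TRAFFIC_PB = 1826     # petabytes carried by the Internet (daily, per the article)
TOUCHED_FRACTION = 0.016       # the NSA "touches" 1.6 percent
REVIEWED_FRACTION = 0.00025    # 0.025 percent of that is selected for review
LIBRARY_OF_CONGRESS_TB = 10    # estimated size of the Library's collection

# Assumption: binary units, so that 10 TB == 0.009765625 PB as stated in the text.
TB_PER_PB = 1024

# Volume the NSA "touches", rounded to one decimal as in the article.
touched_pb = round(INTERNET_TRAFFIC_PB * TOUCHED_FRACTION, 1)

# Library of Congress collection expressed in petabytes.
library_pb = LIBRARY_OF_CONGRESS_TB / TB_PER_PB

# How many Libraries of Congress fit into the touched volume.
libraries_touched = touched_pb / library_pb

# Volume selected for review, and its share of one Library of Congress.
reviewed_tb = touched_pb * REVIEWED_FRACTION * TB_PER_PB
reviewed_share = reviewed_tb / LIBRARY_OF_CONGRESS_TB

print(f"Touched: {touched_pb} PB")
print(f"Library of Congress: {library_pb} PB")
print(f"Libraries touched: {libraries_touched:.0f}")
print(f"Selected for review: {reviewed_tb:.2f} TB ({reviewed_share:.0%} of one Library)")
```

Under those assumptions the numbers line up with the text: about 29.2 petabytes touched, roughly 2,990 Libraries of Congress, and about 7.5 terabytes (three-quarters of one Library) selected for review.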

More and more of the data to be reviewed is Chinese. Although the Internet was invented in the United States, its future is in China, which has seen its online population increase a hundredfold in the last 10 years and now boasts double the number of America's Internet users. That means the NSA's ability to access Chinese communications, which sources confirm is the U.S. intelligence community's top Tier I target after al Qaeda and other foreign terrorist groups, has also increased a hundredfold in just the past decade, and the NSA's access to Chinese communications will only continue to grow incrementally as tens of millions more Chinese people are expected to get online in the next few years.

The same is true about Russia, another increasingly important Tier I high-priority target for the U.S. intelligence community and another place where Internet usage is growing. As Russian President Vladimir Putin's relations with Washington continue to deteriorate, the U.S. intelligence community's prioritization of Russia as an intelligence target has risen significantly in just the past two months.

But if there is one area where Internet-based signals intelligence has played a particularly critical role, it is in the field of counterterrorism. The NSA has confirmed that al Qaeda and other terrorist leaders in the Middle East and South Asia depend on email and other Internet-based communications systems to communicate with one another because, according to a leaked 2009 NSA inspector general's report, "they are ubiquitous, anonymous, and usually free of charge," allowing terrorist leaders to "access Web-based email accounts and similar services from any origination point around the world." Of course U.S. spies are going to try to listen in.



Spy or Die

Can corporate suicide stop the NSA?

When the U.S. government orders a communications company to give up its data, the firm has two basic choices: resist, and risk its leaders going to jail, or comply, and break faith with its customers. On Thursday, Aug. 8, however, two privacy-minded businesses chose a third and unprecedented option: They committed corporate suicide rather than bend to the surveillance state's wishes.

These could be just the opening battles in a new front of the surveillance war.

In a move that blocks governmental monitoring of private email accounts, two secure email providers closed shop on Thursday rather than divulge information about their users to the authorities. The first, Dallas-based Lavabit -- which reportedly counts among its users NSA-leaker Edward Snowden -- stopped operations after apparently fighting a losing battle to resist a federal surveillance order. (Snowden called the decision "inspiring" in a note to the Guardian's Glenn Greenwald.) A few hours later, Silent Circle, headquartered outside Washington, D.C., announced it was suspending its encrypted email service as a preemptive measure before ever receiving a command from the government to spy on its users.

The companies' extreme actions put them in an exclusive club. Security and legal experts said they could not recall a company preventing government access to its customers' information by shutting down its business. Some companies have appealed surveillance orders in the courts or attempted to force more public disclosure about the secretive intelligence-gathering process, but they have remained functioning. Refusing to comply with an order also means the government is cut off from potentially valuable information that it may have no other means of obtaining.

Ladar Levison, the owner and operator of Lavabit, said in a cryptic public message to his users that he had "been forced to make a difficult decision: to become complicit in crimes against the American people or walk away from nearly ten years of hard work by shutting down Lavabit."

Levison didn't say precisely what events had led to his decision, but his letter strongly suggests that he had refused to comply with an official order to hand over Lavabit users' emails and give the government ongoing, prospective access to the company's systems. In the letter, Levison said he was forbidden from discussing "the events that led to my decision." Recipients of secretly issued government surveillance orders are often prohibited from disclosing or discussing them publicly.

Silent Circle, in a letter to its customers, cited Lavabit's decision. "We see the writing [on] the wall, and we have decided that it is best for us to shut down Silent Mail [its encrypted email service] now. We have not received subpoenas, warrants, security letters, or anything else by any government, and this is why we are acting now."

The company also acknowledged that its email service didn't have protections as strong as those for its phone and text services, which can delete communications entirely, as well as any corresponding metadata records. Email leaves a digital trail that can be recovered and therefore forcibly disclosed by the authorities.

"Tough decision but we couldn't wait for the inevitable risking member security," Vic Hyder, the company's chief operations officer, wrote on Twitter.

"We huddled this afternoon and saw no other choice," Jon Callas, Silent Circle's chief technology officer and a noted computer security expert, wrote on his Twitter feed.

Companies that receive surveillance demands find themselves in an unenviable position. Some, such as Yahoo!, Microsoft, and Google, have either fought surveillance orders in court or petitioned the government to let them disclose more information about what the authorities are asking about the companies' users. But until now, these companies and others, including Internet mainstays such as Facebook that have hundreds of millions of users, have complied with the orders and helped form the backbone of official surveillance.

Companies also know they cooperate at the risk of undermining their reputation and their business. Take the encrypted email service Hushmail, a Canadian company that like Lavabit had marketed itself as a secure system. In 2007, the firm gave over information on three customers as part of a U.S. federal investigation into illegal steroids. Although Hushmail was complying with a court order and a legal assistance treaty between the United States and Canada, its reputation was significantly damaged among its product's core users.

Closing a company is certainly not illegal. But evading an official demand is. What penalties or charges Levison might face depends on what the government is seeking. He could face a contempt proceeding, which could include jail time, if he refused to comply with a court order, said Albert Gidari, a lawyer with the firm Perkins Coie who represents companies on surveillance and communications law.

But the government might also be looking for ongoing or prospective surveillance of Lavabit's customers and access to the company's systems. Given Levison's drastic actions, that is likely the case. Shuttering the company would do little to stop the authorities from gaining access to Snowden's or any other customer's old emails. But going out of business would mean Lavabit couldn't comply with any future surveillance.

"It may be that by shutting down the service, he can't comply, and so it's doubtful he would be held in contempt," Gidari said. But "shutting down the service could be viewed as obstruction of justice, so he isn't necessarily out of the woods yet."

Levison faced two bad options. That helps explain why Silent Circle's executives may have decided to avoid the quandary altogether.

Levison's decision was greeted by some as a heroic act of protest. A fund was set up to help pay for his legal expenses. "We've already started preparing the paperwork needed to continue to fight for the Constitution in the Fourth Circuit Court of Appeals," he wrote.

But Silent Circle's decision added a new wrinkle. The company appeared to be making a business decision, rather than a legal or ideological one. It had not been served with a government order. Indeed, the company, which was founded by an ex-Navy SEAL and the inventor of the first widely distributed commercial encryption software, says it counts intelligence agency employees and special operations forces as its most loyal customers. Silent Circle has billed its encrypted email service as a way for people with secretive jobs to communicate securely, not as an end run around federal surveillance. (The firm has been known to help privacy-minded journalists stay beneath government radars.) By preemptively shutting down its email service -- and purging all data related to it -- Silent Circle preserves its reputation as a secret-keeper. It will continue to sell its secure phone, text-messaging, and video services.

Companies may also find resisting NSA surveillance a losing battle. Recently disclosed documents show that the agency has the legal authority to collect and store any electronic communication that uses encryption. And if companies are storing email in servers within the government's jurisdiction, they may not be able to make good on promises to users that their communications are absolutely private and secure. In his letter, Levison said, "I would _strongly_ recommend against anyone trusting their private data to a company with physical ties to the United States."

The government has given no indication that it will back down from using surveillance orders to demand all kinds of customer records, from Internet searches to phone logs to email metadata and content. But what Lavabit and Silent Circle have done may mark the beginning of a resistance.

The truth is that for all the government's extraordinary powers under surveillance law and the NSA's global reach, the U.S. intelligence community is largely at the mercy of companies to help it monitor the world's networks. Indeed, current surveillance law was modified a few years ago to give telecom companies that assisted the NSA with warrantless wiretapping legal immunity from prosecution. Officials feared that without those protections, the companies would do everything in their power not to help the government.

If enough companies were to take the drastic step of shutting down, the government would find itself in the dark on potentially crucial intelligence. The likelihood of this happening is still remote. But the fact that two companies would take such drastic measures to preserve their independence and keep the government out of their business may speak to a dawning awareness: While the government may hold the legal power, it is not all-powerful.
