Over the last week, critics and defenders of the National Security Agency have heatedly debated the merits of metadata -- information about the phone activity of millions of Americans that was given to the government via a secret court order.
The information collected includes records of every call placed on the Verizon communications network (and, it appears, every other U.S. phone carrier) including times, dates, lengths of calls, and the phone numbers of the participants, but not the names associated with the accounts.
For some, the collection of these data represents a grave violation of the privacy of American citizens. For others, the privacy issue is negligible, as long as the collection helps keep us safe from terrorism.
There are indeed privacy issues at play here, but they aren't necessarily the obvious ones. To put the most important questions into context, consider the following illustration of a metadata analysis using sample data derived from a real social network. The sample data aren't derived from telephone records, but they're close enough to give a sense of the analysis challenges and privacy issues in play.
This example is relevant to what happens behind the NSA's closed doors, but it is not in any way intended to be a literal or accurate portrayal. Every effort was made to keep it close to reality, but a large number of hypotheticals and classified procedures ensure that the reality is somewhat different.
We start with a classic scenario. U.S. intelligence officials have captured an al Qaeda operative and obtained the phone number of an al Qaeda fundraiser in Yemen.
You are an analyst for a fictionalized version of the NSA, and you have been authorized to search through metadata in order to expose the fundraiser's network, armed with only a single phone number as a starting point.
The first step is refreshingly simple: You type the fundraiser's phone number into the metadata analysis software and click OK.
In our example data, the result is a list of 79 phone numbers that were involved in an incoming or outgoing call with the fundraiser's phone within the last 30 days. The fundraiser is a covert operator and this phone is dedicated to covert activities, so almost anyone who calls the number is a high-value target right out of the gate.
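For readers who want to see what that single query amounts to, here is a minimal Python sketch. The record layout (`CallRecord`), the function name `contacts_of`, and the sample numbers are all invented for illustration; nothing here reflects the actual software or the real data format.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class CallRecord(NamedTuple):
    # Hypothetical record layout: metadata only, no names and no call content.
    caller: str
    callee: str
    start: datetime
    duration_s: int


def contacts_of(records, target, now, window_days=30):
    """Return every number on a call with `target` in the last `window_days` days."""
    cutoff = now - timedelta(days=window_days)
    contacts = set()
    for r in records:
        if r.start < cutoff:
            continue  # outside the lookback window
        if r.caller == target:
            contacts.add(r.callee)  # outgoing call from the target
        elif r.callee == target:
            contacts.add(r.caller)  # incoming call to the target
    return contacts


# Made-up sample data; "F" stands in for the fundraiser's number.
now = datetime(2013, 6, 14)
records = [
    CallRecord("F", "A", datetime(2013, 6, 10, 9, 0), 120),  # outgoing
    CallRecord("B", "F", datetime(2013, 6, 1, 23, 30), 45),  # incoming
    CallRecord("C", "F", datetime(2013, 4, 1, 12, 0), 300),  # older than 30 days
    CallRecord("A", "B", datetime(2013, 6, 12, 8, 0), 60),   # does not involve F
]
found = contacts_of(records, "F", now)
print(sorted(found))
```

The point of the sketch is how little is needed: one phone number, one date range, and a pass over the records.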
Using the metadata, we can weight each phone number according to the number of calls it was involved in, the lengths of the calls, the location of the other participant, and the time of day each call was placed. Your NSA training manual claims these qualities help indicate the threat level of each participant. Your workstation renders these data as a graph. Each dot represents a phone number, and the higher a number scores on the "threat" calculus, the bigger its dot.
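A weighting of this kind could be sketched as a simple weighted sum. Everything below is an assumption made for illustration: the field names, the four weights, and the idea that the factors combine linearly. The fictional training manual's real formula is not given, so treat this as one possible shape, not the method.

```python
from dataclasses import dataclass


@dataclass
class ContactStats:
    # Hypothetical per-number summary built from the call metadata.
    number: str
    call_count: int
    total_minutes: float
    in_region_of_interest: bool  # stand-in for the "location" factor
    night_calls: int             # stand-in for the "time of day" factor


def threat_score(s, w_calls=1.0, w_minutes=0.1, w_region=5.0, w_night=2.0):
    """Combine the four factors into one number; the weights are arbitrary."""
    return (
        w_calls * s.call_count
        + w_minutes * s.total_minutes
        + (w_region if s.in_region_of_interest else 0.0)
        + w_night * s.night_calls
    )


# Made-up contacts for illustration.
a = ContactStats("A", call_count=10, total_minutes=50.0,
                 in_region_of_interest=True, night_calls=2)
b = ContactStats("B", call_count=1, total_minutes=2.0,
                 in_region_of_interest=False, night_calls=0)

# Highest score first; a plot could size each dot by any increasing
# function of the score.
ranked = sorted([a, b], key=threat_score, reverse=True)
for s in ranked:
    print(s.number, round(threat_score(s), 2))
```

Note that the choice of weights decides who looks dangerous, and that choice is a judgment call made by whoever wrote the formula.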
This is already a significant intelligence windfall, and you've barely been at this for five minutes. But you can go back to the metadata and query which of these 79 people have been talking to each other in addition to talking to the fundraiser.
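That follow-up query could look something like the sketch below, which keeps only the calls whose two ends are both on the contact list. The function name `links_among` and the sample pairs are hypothetical, and call records are reduced to bare (caller, callee) pairs to keep the example short.

```python
def links_among(call_pairs, group):
    """Return the undirected links between members of `group`.

    `call_pairs` is an iterable of (caller, callee) tuples. A link is stored
    as a frozenset so that A calling B and B calling A count as one link.
    """
    links = set()
    for caller, callee in call_pairs:
        if caller in group and callee in group and caller != callee:
            links.add(frozenset((caller, callee)))
    return links


# Made-up sample data: A, B and C are on the fundraiser's contact list.
group = {"A", "B", "C"}
call_pairs = [
    ("A", "B"),  # both on the list
    ("B", "A"),  # same link, opposite direction
    ("C", "X"),  # X is not on the list
    ("X", "Y"),  # neither is on the list
]
links = links_among(call_pairs, group)
print([sorted(link) for link in links])
```

Each link found this way becomes a line between two dots on the graph.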