Analysts are understandably greedy for data even when they don't necessarily need it, and much of government is filled with people for whom Big Data might as well be magic. The inevitable result is that when presidents, lawmakers, and judges are told in vague but enthusiastic terms that more data equals less terrorism, they might be inclined to write blank checks.
But while these matters are complex, they are not impenetrable. Once we get beyond the obvious and not-insignificant issue of whether the Foreign Intelligence Surveillance Act was intended to authorize such broad collection, there are important questions that must be addressed if we're going to continue to use these techniques -- which we almost certainly are.
1. How much contact can an analyst have with a U.S. person's data before it becomes a troublesome violation of privacy? Is it a violation to load a phone record into a graph if the analyst never looks at it individually? Is it a violation to look at the number individually if the analyst doesn't associate a name with it? Is it a violation to associate a name if the analyst never takes any additional investigative steps?
2. Metadata analysis is more accurate when the data is more complete. Should minimization practices filter metadata on American citizens out of the analysis altogether? What if that means targeting might be less accurate and, ironically, more likely to designate innocent people for more intrusive scrutiny?
3. What percentage of phone traffic to targeted numbers travels only on foreign carriers? Does the absence of that data skew analysis and possibly overemphasize the scoring of phone numbers used by American citizens?
4. On a fundamental level, are we willing to trust mathematical formulas and behavioral models to decide who should receive intrusive scrutiny?
5. Metadata analysis rarely deals in certainties; it almost always produces probabilities. What probability of evil intent should these models demonstrate before the government uses them to help justify a phone tap, or a house search, or a drone strike? 90 percent? 60 percent? Should we allow incremental collection of slightly more intrusive data if it can clarify a marginal case?
6. Have we tested our analytical math to see how accurate its predictions are relative to the actual content of calls? If so, how were these tests done? If not, are we willing to trust these models based on their success in other fields, or do they need to be tested specifically for counterterrorism?
7. If we believe the models do need to be tested for accuracy, are we willing to endure the privacy violations such tests would almost certainly entail? Will more accurate models lead to better privacy in the long run by reducing the number of innocent people subjected to more intrusive scrutiny?
8. Are we willing to trust the government to hold this data? Although the government says its use of this data is currently focused on foreign counterterrorism, do we believe the president would refrain from ordering the NSA to access metadata in the wake of a terrorist attack of domestic origin?
9. On a related note, what happens if the origin of an attack isn't immediately clear, as in the Boston Marathon bombing? Should the NSA immediately begin a broad analysis of metadata and continue until it's clear where the responsibility lies?
10. If we were to allow the use of this technology in domestic terrorism investigations, during a crisis or otherwise, how do we avoid collecting information on legal political dissent? For instance, targeting anarchists might inadvertently produce a list of influential leaders in the Occupy movement. Targeting militia groups might create a database of gun sellers. When you plunge into a huge dataset, you sometimes get insights you didn't expect.
None of these questions is simple or easy. None of them lends itself to polling or punditry. They aren't easy to discuss in a reasoned and accurate manner during a two-minute TV hit or on the floor of the House of Representatives.
Yet they cut straight to the intersection of Big Data, counterterrorism, and the U.S. legal system, including constitutional protections against unreasonable searches. The founding fathers couldn't have imagined that one day a machine using advanced math might provide an argument in favor of a search warrant.
Our technological capabilities far exceed the wildest dreams of the authors of the Fourth Amendment, and neither the courts nor our laws have kept pace.
If America can't muster the energy to tackle these questions thoughtfully, we are likely to lose control of the outcome and become less free, less secure, or both.
And no one will be able to explain why.