Government-funded trolls. Decoy documents. Software that identifies you by how you type. Those are just a few of the methods the Pentagon has pursued in order to find the next Edward Snowden before he leaks. The small problem, military-backed researchers tell Foreign Policy, is that every spot-the-leaker solution creates almost as many headaches as it's supposed to resolve.
With more than 1.4 million Americans holding top-secret clearance throughout a complex network of military, government, and private agencies, rooting out the next Snowden or Bradley Manning is a daunting task. But even before last week's National Security Agency (NSA) revelations, the government was funding research to see whether there are telltale signs in the mountains of data that can help detect internal threats in advance.
In the months following the WikiLeaks revelations, the Defense Advanced Research Projects Agency (DARPA) -- the U.S. military's far-out tech arm -- put out a number of requests for research on methods to detect suspicious behavior in large datasets as a way to root out rogue actors like Manning (or, in more extreme cases, ones like Fort Hood shooter Nidal Malik Hasan).
The most ambitious of these is known as Anomaly Detection at Multiple Scales (ADAMS), a program that, as an October 2010 research request put it, is meant "to create, adapt and apply technology to the problem of anomaly characterization and detection in massive data sets." The hope is that ADAMS would develop computers that could analyze a large set of user-generated data -- the emails and data requests passing through an NSA office in Honolulu, for instance -- and learn to detect abnormal behavior in the system.
The tricky part of this kind of analysis is not so much training a computer to detect aberrant behavior -- there's plenty of that going around on any large network -- as teaching it what to ignore.
"I like to use the example of learning to recognize the difference between reindeer and elk," wrote Oregon State University computer scientist Tom Dietterich, who worked on developing anomaly detection methods for ADAMS, in an email to Foreign Policy. "If all I need to do is tell these species apart, I can focus on the size [of] their antlers and whether the antlers have velvety fur, and I don't need to consider color. But if I only focus on these features, I won't notice that Rudolph the Red-Nosed Reindeer is anomalous, because I'm ignoring color (and noses, for that matter). So in an anomaly detection system, it is important to consider any attribute (or behavior) that might possibly be relevant rather than trying to focus on a very few specific characteristics."
Over the past three years, DARPA has shelled out millions of dollars on efforts to learn how to root out Rudolphs from the rest of the reindeer and find out exactly what these red noses look like. This includes a $9 million award to Georgia Tech to coordinate research on developing anomaly detection algorithms. You can peruse much of the research funded through ADAMS online. For instance, a proposal by the New York-based firm Allure Security Technology, founded by a Columbia University computer science professor, calls for seeding government systems with "honeypot servers" and decoy documents meant to entice potential leakers or subversives. The files would alert administrators when accessed and allow the system to develop models for suspicious behavior. The company cheekily refers to this technique as "fog computing."
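For readers who want to see the idea in miniature, here is a rough Python sketch of how Dietterich's point and the decoy-document proposal might fit together. It is not the ADAMS or Allure Security code, neither of which is described in detail here. The user names, the activity numbers, the three features, and the cutoff of 2.0 are all invented for illustration, and the scoring rule (largest absolute z-score across the chosen features) is just one simple stand-in for an anomaly detector.

```python
from statistics import mean, pstdev

# Hypothetical weekly activity per user (all values invented for illustration):
# (emails_sent, files_requested, decoy_documents_opened)
activity = {
    "analyst_01": (98, 40, 0),
    "analyst_02": (102, 42, 0),
    "analyst_03": (100, 38, 0),
    "analyst_04": (101, 41, 0),
    "analyst_05": (99, 39, 0),
    "analyst_06": (100, 40, 0),
    "analyst_07": (97, 43, 0),
    "analyst_08": (103, 37, 0),
    "analyst_09": (100, 40, 3),  # ordinary volume, but opened decoy files
}


def anomaly_scores(records, feature_indexes):
    """Score each user by the largest absolute z-score over the chosen features."""
    scores = {name: 0.0 for name in records}
    for i in feature_indexes:
        values = [row[i] for row in records.values()]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # a feature that never varies cannot single anyone out
        for name, row in records.items():
            scores[name] = max(scores[name], abs(row[i] - mu) / sigma)
    return scores


def flag(records, feature_indexes, threshold=2.0):
    """Return the users whose score exceeds the (assumed) threshold."""
    scores = anomaly_scores(records, feature_indexes)
    return sorted(name for name, score in scores.items() if score > threshold)


# Watching only email and file-request volume: nobody stands out.
volume_only = flag(activity, [0, 1])

# Adding the decoy-access attribute: the red nose becomes visible.
all_features = flag(activity, [0, 1, 2])
```

With these made-up numbers, `volume_only` comes back empty while `all_features` picks out `analyst_09`. A detector that tracks only a few familiar attributes can miss the one that matters. A real system would also have to cope with far noisier data and with the false alarms the researchers describe.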