"Big Data Is Anonymous, so It Doesn't Invade Our Privacy."
Flat-out wrong. While many big-data providers do their best to de-identify individuals from human-subject data sets, the risk of re-identification is very real. Cell-phone data, en masse, may seem fairly anonymous, but a recent study on a data set of 1.5 million cell-phone users in Europe showed that just four points of reference were enough to individually identify 95 percent of people. There is a uniqueness to the way that people make their way through cities, the researchers observed, and given how much can be inferred from the large number of public data sets, this makes privacy a "growing concern." We already know, thanks to academics like Alessandro Acquisti, how to predict an individual's Social Security number simply by cross-analyzing publicly available data.
But big data's privacy problem goes far beyond standard re-identification risks. Currently, medical data sold to analytics firms carries a risk of being used to trace your identity. There is a lot of chatter about personalized medicine, where the hope is that drugs and other therapies will be so individually targeted that they work to heal an individual's body as if they were made from that person's very own DNA. It's a wonderful prospect in terms of improving the power of medical science, but it's fundamentally reliant on personal identification at cellular and genetic levels, with high risks if it is used inappropriately or leaked. Yet despite the rapid growth in personal health data collectors such as RunKeeper and Nike+, practical use of big data to improve health-care delivery is still more aspiration than reality.
Other kinds of intimate information are being collected by big-data energy initiatives, such as the Smart Grid. This effort looks to improve the efficiency of energy distribution to our homes and businesses by analyzing enormous data sets of consumer energy usage. The project has great promise but also comes with great privacy risks. It can not only predict how much energy we need and when we need it, but also reveal minute-by-minute information on where we are in our homes and what we are doing. This can include knowing when we are in the shower, when our dinner guests leave for the night, and when we turn off the lights to go to sleep.
Of course, such highly personal big-data sets are prime targets for hackers or leakers. WikiLeaks has been at the center of some of the most significant big-data releases of recent times. And as we saw recently with the massive data leak from Britain's offshore financial industry, the 1 percenters of the world are just as vulnerable as everyone else to having their personal data made public.