Could comparable techniques work for predicting social upheaval? The UN has launched an initiative, Global Pulse, that uses new technologies to collect, analyze, and filter information to help governments and organizations better understand what is happening in certain at-risk communities. In 2010 and 2011, the Ushahidi group, famous for its pioneering crowdsourcing and real-time visualization software, created a website that tracked potential disturbances during Liberia's elections.
On a similar note, the Associated Press reported in November 2011 that the CIA was monitoring five million tweets a day to detect signs of revolutionary change. (Of course, the same data can be used by repressive governments that want to track dissent and unrest.)
One of the key problems in making assumptions about societal change based on social media data is our limited understanding of the relevant conversion rates. Online marketers, advertisers, and political campaigners are frantically trying to understand the relationship between tweets and dollars or tweets and votes; the conversion rates for social change are murkier still. What, for example, can data scientists assume about the relationship between tweet volume and the number of people likely to attend a protest? Or can the sentiments expressed in Facebook status updates be mined to produce an accurate indication of support for Vladimir Putin?
Given that social media analysis is still in its infancy, the answers to those questions remain elusive. So-called sentiment analysis still struggles to distinguish the signal from the noise. Network diagrams mapping relationships between tweeters and "likers" tell us that there is a big crowd, but they are of little help in telling us what that crowd is thinking and why. While programs will become better at parsing huge amounts of data, they are still more comfortable counting than interpreting. Computers still struggle with slang, sarcasm, and subtext.
Take these tweets from user DanielNothing, who was tweeting about the London riots on August 6, 2011:
Heading to Tottenham to join the riot! who's with me? #ANARCHY
Clear enough, right? If thousands were retweeting DanielNothing's tweet, or tweeting similar sentiments, police officers might be well advised to deploy resources accordingly. But then DanielNothing tweets again:
Hang on, that last tweet should've read 'Curling up on the sofa with an Avengers DVD and my missus, who's with me?' What a klutz I am!
Only friends of DanielNothing could say for sure what he meant. Is he just being sarcastic? Or could his first tweet be taken at face value, with his second being read as an attempt to mask his true intent? If a human struggles to decipher the true meaning, how would a computer fare?
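To make the difficulty concrete, here is a minimal sketch of the kind of counting that programs are comfortable with: a naive keyword scorer run over the two tweets quoted above. The term list and the function name `alarm_score` are invented for this illustration and are not drawn from any real monitoring system.

```python
import re

# Hypothetical alarm vocabulary, chosen only for this illustration.
ALARM_TERMS = {"riot", "anarchy", "loot", "burn"}


def alarm_score(tweet: str) -> int:
    """Count how many words in the tweet appear in ALARM_TERMS.

    Case and punctuation are ignored, so "#ANARCHY" matches "anarchy".
    """
    words = re.findall(r"[a-z']+", tweet.lower())
    return sum(1 for word in words if word in ALARM_TERMS)


first = "Heading to Tottenham to join the riot! who's with me? #ANARCHY"
second = (
    "Hang on, that last tweet should've read 'Curling up on the sofa "
    "with an Avengers DVD and my missus, who's with me?' What a klutz I am!"
)

# Score each tweet in isolation, the way a simple counter would.
for tweet in (first, second):
    print(alarm_score(tweet), tweet)
```

The counter gives the first tweet a score of 2 and the second a score of 0. It treats the two as unrelated messages and has no way of deciding whether the second is a retraction, a joke, or a cover story, which is exactly the judgment the example calls for.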