When free will, causality and privacy are all at stake

Review of Viktor Mayer-Schonberger and Kenneth Cukier’s Big Data

We live in a world where flu outbreaks are predicted faster and more accurately by analysing Google search results than by doctors or clinicians, where traffic jams are better judged by crunching data from cellphone signals than by direct reports from people on the ground, and where your shopping habits might reveal that you’re pregnant before anyone else in your family knows.

This is the power of big data. It is defined not by the sheer volume of information, but by what that large volume enables us to do that a smaller volume of similar data would not. For example, Google’s Flu Trends would hardly work if the number of queries made per second were not in the thousands. Another aspect of big data is that using it means shedding our obsession with causality and embracing correlations. The important thing is to know the what rather than the why.

Take the example of how Google’s page ranking system works. The computer algorithm at the heart of Google search, which analyses data from all across the web, is not trying to understand what the websites say or mean so much as it is trying to correlate search queries with what people want when they type them. More queries followed by more clicks on relevant sites improve the algorithm that ranks pages, helping it to make better predictions about which links will work best. Now add to it information like a person’s search history, location, time of day, etc., and Google is able to give near-perfect search results.

The book also rightly argues that despite data’s ubiquitous use today, the revolution has only just begun. A lot of the information in the world still remains locked or wasted. Consider the example of electrocardiography (ECG). When a patient undergoes an ECG, hundreds of data points are collected every second, but most of them get thrown away. Now that storage is easy, none of that data needs to be thrown away, and it can instead be used to make better predictions of the patient’s health in the future. Datafication, the recording of everything possible, can unlock information around us. Things which might seem uninteresting could, in combination with other data, reveal insights that we could not have guessed before.

Big data’s use does not paint a uniformly rosy picture. Governments are desperately trying to control more and more data from their citizens’ lives under the guise of security concerns. But, just as with private companies, the data can easily be employed by governments for uses that citizens would not approve of if they were asked to give consent. The story of Minority Report could come true. In it the government develops a system that is used to predict the future occurrence of crime and make arrests in time to stop it. These sorts of uses are still science fiction, but not for long. They risk taking away from humanity its dearest capacity: to act on “free will”.

It is this double-edged sword of big data that makes Messrs Cukier and Mayer-Schonberger’s book timely and important. Written beautifully and convincingly, it makes for a great read. Where I don’t agree with the book is its claim that big data “will transform how we live, work and think”. I think it already has.


