When free will, causality and privacy are all at stake

Review of Viktor Mayer-Schonberger and Kenneth Cukier’s Big Data

We live in a world where flu outbreaks are predicted faster and more accurately by analysing Google search queries than by doctors or clinicians, where traffic jams are better judged by crunching data from cellphone signals than from direct reports by people on the ground, and where your shopping habits might reveal that you’re pregnant before anyone else in your family knows.

This is the power of big data. It is defined not by the sheer volume of information, but by what that large volume enables us to do that a smaller volume of similar data wouldn’t. For example, Google Flu Trends would hardly work if the number of queries made per second were not in the thousands. Another aspect of big data is that using it means shedding our obsession with causality and embracing correlations. The important thing is to know the what rather than the why.

Take the example of how Google’s page ranking system works. The computer algorithm at the heart of Google search, which analyses data from all across the web, is not trying to understand what the websites say or mean so much as it is trying to correlate what people want with what they type in the search box. More queries followed by more clicks on relevant sites improve the algorithm that ranks pages, helping it to make better predictions about which links will work best. Now add to it information like a person’s search history, location, time of the day, etc. and Google is able to give near-perfect search results.

The book also rightly argues that despite data’s ubiquitous use today, the revolution has only just begun. A lot of the information in the world still remains locked or wasted. Consider the example of electrocardiography (ECG). When a patient undergoes an ECG, hundreds of data points are collected every second, but most of them get thrown away. With cheap, easy storage there is no need to throw any data away, and the full record can instead be used to make better predictions about the patient’s future health. Datafication, the recording of everything possible, can unlock information around us. Things that might seem uninteresting could, in combination with other data, reveal insights that we could not have guessed before.

Big data’s use does not paint a uniformly rosy picture. Governments are desperately trying to control more and more data from their citizens’ lives under the guise of security concerns. But, just as private companies do, governments can easily employ the data for uses that citizens would not approve of if they were asked to give consent. The story of Minority Report could come true. In it the government develops a system that is used to predict the future occurrence of crime and make arrests in time to stop it. These sorts of uses are still science fiction, but not for long. They risk taking away from humanity its dearest capacity: to act on “free will”.

It is this double-edged sword of big data that makes Messrs Cukier and Mayer-Schonberger’s book timely and important. Written beautifully and convincingly, it makes for a great read. Where I don’t agree with the book is that big data “will transform how we live, work and think”. I think it already has.

Redefining the notion of a book

Two months of failing to fulfill my reading goals towards the #100bookschallenge have made me rethink the purpose of taking up the challenge.

Less than a decade ago, it was easy to recognise a book. It was anything that could be printed, bound and put on the shelf of a library or a store. Now, though, things have gotten messy. There are ebooks, Kindle singles, Atavist originals, Matter stories, and the list goes on.

In many parts of the world, digital has become the primary platform for the written word. The advantages are plenty and this trend towards digital is no surprise. But it disrupts how ideas get shared, and sharing ideas was the main reason for books to come into existence.

I began my #100bookschallenge with the classical definition of a book, but the main reason for taking up the challenge was to learn about the greatest ideas out there. These are increasingly being communicated outside books. Many of them take the form of long conversations running on a blog, or appear in longform writing and journalism such as the New Yorker or The Economist’s special reports.

Thus I’m revising the definition of a (non-fiction) book that can count towards my challenge of 100 books. Apart from books in the classical sense, any piece of writing that fulfills all the conditions below can count towards my target:

  • Longform writing that has a clearly defined message, explores a topic in a significant amount of detail, or has a central theme.
  • Has been written by a single author (‘classical’ books may have more than one author).
  • Is more than 10,000 words long as a single piece.
  • A series of blog posts counts only if at least one of the posts is close to 10,000 words long and explains the main idea of the series.
  • The writing should be so dense (full of ideas) that I cannot stop myself from writing a review of what I read.
  • (UPDATE) The work should be more than just newsworthy, i.e. it should still be relevant and worth reading after, say, many months or sometimes years.
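As a rough illustration only, the conditions above could be written down as a simple checklist in Python. Everything named here (the `Piece` fields, `counts_towards_challenge`, `MIN_WORDS`) is made up for the sketch. The subjective conditions are reduced to yes/no flags the reader fills in, and the rule for a series of blog posts is left out because "close to 10,000 words" is not pinned down.

```python
from dataclasses import dataclass

# Word threshold taken from the list above.
MIN_WORDS = 10_000


@dataclass
class Piece:
    """One piece of writing; all field names are hypothetical."""
    word_count: int
    author_count: int
    has_central_theme: bool   # clear message, detailed topic, or central theme
    prompted_a_review: bool   # dense enough that a review had to be written
    lasting_relevance: bool   # more than just newsworthy
    is_classical_book: bool = False


def counts_towards_challenge(piece: Piece) -> bool:
    """Return True if the piece can count towards the 100-book target."""
    # Classical books count regardless of the other conditions.
    if piece.is_classical_book:
        return True
    # Everything else must meet all of the listed conditions.
    return (
        piece.has_central_theme
        and piece.author_count == 1
        and piece.word_count > MIN_WORDS
        and piece.prompted_a_review
        and piece.lasting_relevance
    )
```

For example, a 12,000-word single-author essay with all three flags set to `True` would count, while the same essay at 8,000 words would not.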

Why 10,000 words? Because it’s not too short, and it feels like the right length for a comprehensive look at a topic. I’m open to revising my definition, so please feel free to make suggestions.

Busting myths with science

Review of David Bradley’s Deceived Wisdom

A quick, interesting read for those who like reading websites like Quora (and later having arguments in a pub). This is a book about setting the facts right. David Bradley attempts to use science and rational thinking to clear up some age-old myths (like “cats are smarter than dogs”) and some modern ones (like “everyone is connected to everyone else through six connections”).

This “book” should really exist as well-referenced posts on a single website. Some of it exists on sciencebase.com, Bradley’s blog. There are others like snopes.com too. Perhaps what Bradley has done with the book is refine his explanations.

I somewhat agree with what Brian Clegg has to say about it: If you liken popular science books to food, Deceived Wisdom is simply not meaty enough to make it a three course meal. It is, however, a top notch box of chocolates – and who doesn’t like that?