Sunday, 14 October 2012

Big Data - Big Limits

It must be more than twenty years since I predicted that one day we would be able to cram all the world's data onto a single optical disk. I recall that I was writing an article about Apple Computer, perhaps for Computer Weekly. Apple was very much a niche player at the time, and Steve Jobs, as ever well ahead of the curve, had expressed real excitement about the technology.

Of course, I was wrong in one very important respect. All the world's data, or the sum of human knowledge, twenty years ago was only a fraction of what it is today, at a time when IDC reports that three exabytes of new data are created every twenty-four hours. An exabyte is roughly two to the sixtieth bytes, or simply a billion gigabytes; I'm told that is about 50,000 years' worth of DVD-quality video. As volumes of data continue to grow at near-exponential rates, the prevailing 'Big Data' debate surrounds the very real challenge of storing and making sense of it all, let alone the privacy implications of capturing every keystroke and query for pattern and behaviour analysis.
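
For anyone who wants to check that arithmetic, here is a rough back-of-envelope sketch in Python. It assumes DVD-quality video averages about 5 megabits per second, which is an illustrative figure rather than anything IDC states, and it reads the 50,000-year figure as applying to a single exabyte.

```python
# Back-of-envelope check of the exabyte figures quoted above.
# Assumption (not from the article): DVD-quality video averages ~5 Mbit/s.

EXABYTE = 10**18       # decimal (SI) exabyte, in bytes
EXBIBYTE = 2**60       # binary equivalent, "two to the sixtieth" bytes
GIGABYTE = 10**9       # decimal gigabyte, in bytes

ASSUMED_DVD_BITRATE = 5_000_000          # bits per second (assumed)
SECONDS_PER_YEAR = 365.25 * 24 * 3600    # average year, including leap days


def gigabytes_in(num_bytes):
    """Return how many decimal gigabytes fit in num_bytes."""
    return num_bytes / GIGABYTE


def years_of_video(num_bytes, bitrate=ASSUMED_DVD_BITRATE):
    """Return the years of continuous video that num_bytes holds at bitrate."""
    seconds = num_bytes * 8 / bitrate
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    print("Gigabytes per exabyte:", gigabytes_in(EXABYTE))
    print("2**60 bytes relative to 10**18 bytes:", EXBIBYTE / EXABYTE)
    print("Years of video per exabyte:", round(years_of_video(EXABYTE)))
    print("Years of video per day at 3 EB/day:", round(years_of_video(3 * EXABYTE)))
```

At that assumed bitrate, one exabyte comes out at a little over 50,000 years of video, so three exabytes a day would be roughly three times that. Two to the sixtieth bytes is about fifteen per cent more than a billion gigabytes, so the two descriptions are close but not identical.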

In October's ARC Magazine, Samuel Arbesman worries that we will soon no longer be able to understand a large fraction of the knowledge we have created, and in 'Scientific American 2010' Danny Hillis makes a similar point. Hillis argued that we have moved from the Enlightenment, a period where logic and reason could bring understanding, to the Entanglement, where everything is so unbelievably interconnected that we can no longer understand systems of our own making.

Arbesman speculates that for the greater part of human history, the vast majority of humanity has understood its surroundings according to the knowledge of the day: "From the four elements to the workings of the screw and the pulley, a significant fraction of the world's knowledge was within the grasp of most individuals." He adds: "As our world has become more complex and knowledge has increased rapidly, a smaller and smaller fraction of society has felt it has a true-enough understanding of everything."

This is a theme I would like to explore in the weeks and months ahead, as I think further on the nature of physical limits and our ability to draw meaningful and useful conclusions from seemingly infinite volumes of human behaviour, expressed and delivered in neat digital packets.