The Data Order

Tanveer Sayyed
Published in Predict
5 min read · Dec 12, 2018

Data are fast becoming the fifth factor of production, and we may soon find a “Minister of Data” in our cabinets. In this disconnected and emotional post-truth era, can we truly arrange data methodically in ‘order’? Or will it be the other way around: will data ‘order’ us?

In Hayao Miyazaki’s film The Wind Rises (2013), the mother of a 10-year-old boy advises him: “Fighting is never justified”. The same boy later goes on to revolutionize Japanese fighter plane design, moving it from wood and canvas to all-metal aircraft. These very planes were used by Japan against its enemies during World War II. And we know how millions perished in that bloody war! The name of the boy was Jiro Horikoshi. So, did Horikoshi end up justifying fighting and violence?

Certainly not! Horikoshi’s only dream was to build good airplanes… and he did! But how those planes would be put to use, and what purpose they would ultimately serve, was not in his hands.

Now, what does all of this have to do with data?

A datum, put simply, is a fact. It is neutral, shapeless, tasteless. But converting it to information infuses it with power! Data thus bestow power on whoever is “able” to manifest it. Here “ability” means developing the tools, methods and techniques that construct robust, comprehensive and holistic data. The crucial question we must then raise is: will that power be exercised ‘morally’? Only just means can achieve just ends.

We can even form a truth table:

*The terms ‘Good Data’ and ‘Bad Data’ are used here in a general sense, not a technical one.
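
One way to make such a table concrete is a small Python sketch. It assumes the two inputs are ‘ability’ and ‘morality’, and that the four outcomes map onto the categories used in the examples in this post. That mapping, along with the names `CATEGORIES`, `classify` and `truth_table`, is an illustrative assumption and not a definitive reconstruction.

```python
from itertools import product

# Assumed mapping of (ability, morality) to a category.
# Inferred from the examples in this post; the labels for the
# (False, True) and (False, False) rows are especially tentative.
CATEGORIES = {
    (True, True): "Good Data",
    (True, False): "Bad Data",
    (False, True): "Not Good Data",
    (False, False): "last category (unnamed in the text)",
}


def classify(ability, morality):
    """Return the assumed category for one combination of inputs."""
    return CATEGORIES[(bool(ability), bool(morality))]


def truth_table():
    """List all four rows as (ability, morality, category) tuples."""
    return [(a, m, classify(a, m)) for a, m in product([True, False], repeat=2)]


if __name__ == "__main__":
    print(f"{'ability':<8} {'morality':<9} category")
    for a, m, category in truth_table():
        print(f"{str(a):<8} {str(m):<9} {category}")
```

Read this way, only the row where both inputs hold yields “Good Data”: ability without morality, or morality without ability, is not enough.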

Multiple Examples:
[1/3]
Imperial Britain, during the Pax Britannica, gathered exhaustive data about its colonies: climate, flora, fauna, herbs, soil types, crops, demography, resources and more. But the real reason all this was done was to satiate an unfathomable greed. Lord William Bentinck, the Governor-General of India (1833–1835), put it honestly and bluntly: “The misery hardly finds parallel in the history of commerce… The bones of the cotton weavers are bleaching the plains of India.” Another example is the Holocaust itself. It is well known that during World War II the Nazis rented IBM’s punch-card technology to pinpoint the residences of minorities in Germany. IBM did that simply to get more business. Both examples belong to the “Bad Data” category because, even though the ‘ability’ with which the data were gathered may have been exhaustive and the best available at the time, the use of the data failed the test of ‘morality’. The power bestowed by data was employed as a means to achieve an evil end.

[2/3]
The work of Angus Deaton, who won the 2015 Nobel Prize in economics, is a good example of “Good Data”. His work questioned the very role of the great leveler called “globalization” in India during the 1990s. Indian policy analysts had concluded that the leveler was indeed successful in lifting “all” boats. He, however, pointed out that the methods they used were outdated and erroneous, so the calculated extent to which the fruits of globalization had been distributed was not exactly correct (it turned out to be lower). This means the Indian data belonged to the “Not Good Data” category. His work was (and is) valued because it focuses more on developing the skills, techniques and tools required to “construct” a data series than on analyzing one. Deaton even devised a way to test, to a good extent, for gender discrimination against girl children within families by observing per capita expenditure. That is one of the many reasons he famously said that “Good data is fundamental for good governance!”.

[3/3]
Governments intentionally “hide” data about poverty, low nutrition levels, rapes, health indicators, sanitation, state-sponsored violence and so on. All this is done to keep attracting investments and tourists. It can also work the other way, with governments presenting false (bad) GDP figures to maintain their credibility, as we saw in the 2010 Eurozone crisis. Corporations “hide” data about human rights violations, their “political” investments and their safe havens. The chemical industry spent thousands of dollars to malign the name of Rachel Carson in the early 1960s, after she showed that DDT was affecting our environment through biomagnification. Similar cases have occurred in relation to mercury poisoning in Minamata and the Bhopal gas tragedy in India. Even today, a majority of corporations do not know whether or not they are procuring conflict minerals from Africa. The moment we switch on our smartphone’s location, at least 75 companies directly harvest our data. In all of these cases “ability” is deliberately killed or rendered inefficient, which itself disables morality. These examples, in my opinion, fall in the last category.

Lastly, data are also preponderant. Roughly every two years, the total amount of data on the internet doubles. Big Data thus has extremely high “velocity” and “volume”. The human mind alone does not have the capacity to analyze even 1% of this juggernaut. Even after deploying bots and making use of artificial intelligence, we can still analyze only a small part of it. But as Alan Turing’s presage creepily unfolds and machines slowly begin taking control, we observe that bots also have the potential to trick us. So when it comes to “veracity”, Big Data has also made our lives worse by spreading hatred and fake news like wildfire. “Big data may mean more information, but it also means more false information,” warns Nassim Nicholas Taleb. Making matters worse, China’s AI is ushering in “algorithmic governance”!

To sum up: it is important to understand that our capacity to analyze data is far smaller than the amount of data generated each second. And the effort required to construct Good Data is definitely not trivial. The task of analyzing this Big Data is going to fall largely to data science professionals. They could easily get lost in the waves of data, since they are no longer searching for a needle in just one haystack. More importantly, as Taleb points out, the growing pile of redundant information generated each second sharply increases the chances of error.

It is thus easy to recognize that the Data order is an important notion that shapes the World order. Good data is fundamental for a better world. Like clean water and air, it ‘is’ now a basic necessity. As Data Science marches ahead, we need to keep taking cues from history to fully comprehend the power of data. Big Data comes with a much “Bigger” responsibility on the shoulders of humanity.

So the most important question that ought to haunt us is: can we convert Big Data to Good Data efficiently enough to support the idea of peace and development for “everyone”? Or are we going to be witnesses to even more systematic Holocausts ahead?

Tanveer Sayyed

Data science enthusiast. Ardent reader. Art lover. Dissent lover… rest of the time swinging on the rings of Saturn