Beware! Don’t fall for the ‘data’ lies like ‘97% of cow related violence happened after 2014’

It is often said that a lie can get halfway around the world before the truth can get its pants on. If you have been watching the media over the last few days, you might have come across a puzzling statistic that “97%” of cow related violence happened after Narendra Modi became Prime Minister of India.

Not surprisingly, the mainstream media has been actively promoting this number and here are just a few examples of media outlets that have made it to the roll of dishonor thus far.



Where did this 97% number originate? It turns out that all these articles are referencing a report compiled by an organization called IndiaSpend:


I visited their website and discovered that they claim to do something called “data journalism”. Which is shocking, because the methodology behind their “data” of 97% cow related violence happening after 2014 is so dishonest that it can only be considered a lynching of statistics:


A “database” on crime compiled not by looking at police records, but by running Google searches with certain keywords! Worse, they decided to stick to English language media only, taking pride in saying that a “cursory search” through Hindi media “appeared” to throw up the same incidents.

Well, the Hindi media should count its blessings. At least they got a “cursory search”. Everyone knows that nothing worth talking about is ever reported in media in any other language.

So, if the English media chooses not to cover an incident, it doesn’t get counted! Of course, they found that 97% of cow related violence happened after 2014! Because the English media only began talking about this after 2014.

Hey IndiaSpend, why stop there?

For your next “data journalism report”, perhaps you could save some more effort and just search the NDTV website instead of all of English language media.

Why do that even? Why not just collect all your “data” from searching Rajdeep Sardesai’s Twitter timeline? There’s no way that could throw up a fake, distorted picture of India, right?

If only IndiaSpend had bothered with a “cursory” search regarding the ethical methods of data collection….

This 97% number and the media fanfare surrounding it is something of a watershed moment in the history of fake news. Liberals couldn’t find the actual data to prove their accusations against the Modi government. The IndiaSpend report is the first attempt that I know of where liberal outrage itself has been recycled into “data”.

Considering the furious pace at which the mainstream media has fanned this piece of fake news, it is only a matter of time before this number of 97% becomes a fixture in international reports about India as well.

This first wave of “reports” in the Hindustan Times, Economic Times, Firstpost and Business Standard will soon be referenced by BBC and New York Times.

In turn, future articles in Indian media will reference this second wave of “reports”, firmly establishing this fake 97% number in the discourse.

The final stage will consist of academics and court poet historians whose job it is to turn the fake news into accepted historical fact.

It is up to the common people to resist. And it’s a good thing that the criticism over IndiaSpend’s figure of 97% lynchings occurring after 2014 (see examples here and here) seems to have reached some of the usual suspects.

One of these social “science” eminences put out a series of tweets, apparently defending IndiaSpend’s practice of using media reports to generate its now (in)famous “data” (We also note with some amusement Prof. Varshney’s possible bid for self-promotion):

Let’s quickly examine this line of reasoning. Prof. Varshney’s main claim is that media reports would be more reliable than the “useless” government records when it comes to cow related violence.

So, why not check this assumption quickly with respect to the “data” that IndiaSpend has put out here? In its list of 63 incidents of cow related violence, IndiaSpend itself admits that charges have been filed in as many as 61 cases. In another case, IndiaSpend says that the police didn’t file a case, but the High Court acted severely, transferring the District Magistrate, the SP, DSP and the SHO of police, and even the CBI took note of the case. That makes 62 of 63 where the case of cow related violence is present in official government records.

Or as IndiaSpend would have said, 98.4% of cases of cow related violence appearing in English media have been recorded by the government.
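For anyone who wants to check the arithmetic, here is a minimal sketch. The variable names are made up for illustration; the counts are the ones quoted above from IndiaSpend's own list (61 cases with charges filed, plus the one case taken up by the High Court and the CBI).

```python
# Counts quoted above from IndiaSpend's list of incidents.
total_incidents = 63
charges_filed = 61        # cases where IndiaSpend says charges were filed
court_intervention = 1    # no police case, but High Court and CBI acted

# Incidents that show up in official government records.
in_official_records = charges_filed + court_intervention

# Share of the media-reported incidents that are also in official records.
share_percent = in_official_records / total_incidents * 100

print(f"{in_official_records} of {total_incidents} = {share_percent:.1f}%")
```

Rounded to one decimal place, this is the 98.4% figure mentioned above.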

This blows apart Prof. Varshney’s core argument that government data would be “useless” for cow-related violence.

Thus far, IndiaSpend seems only to have discovered that there is little reason to suspect the government records of leaving out cow related violence. This means that the real unknown in the data put out by IndiaSpend is the extent to which the “English media” might have left out other incidents of cow related violence.

If anything, IndiaSpend’s data seems to make a stronger case for reliability of government records. Instead, the question mark shifts firmly to IndiaSpend’s approach of using media reports instead of government records.

But wait, let us not forget that IndiaSpend did much worse than scan media reports: they scanned only English media reports.

Scientists and in fact even social scientists are trained to spot every qualifying word. As such, it is rather shocking that Prof. Varshney fails to even notice the leap of faith from “media” to “English media”.

Data from the Audit Bureau of Circulations [pdf] shows that only one out of the top 10 most circulated newspapers in India is published in English. That would be The Times of India. Incidentally, the ToI is referenced only once in IndiaSpend’s alleged dataset, further underlining how spectacularly unrepresentative their search effort was.

Instead, there are 5 newspapers in the top 10 that are published in Hindi, and the Hindi media gets only a “cursory search”. Other languages, which account for 4 of the top 10, don’t even get that much.
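To put the language split in proportion, here is a small sketch using only the counts cited above (1 English, 5 Hindi and 4 other-language titles in the top 10 by circulation). The dictionary and its labels are illustrative, and the shares are shares of titles in the top 10, not of readers or copies sold.

```python
# Number of top-10 newspapers by language, as cited above.
top10_by_language = {
    "English": 1,
    "Hindi": 5,
    "Other languages": 4,
}

total_titles = sum(top10_by_language.values())

# Share of top-10 titles published in each language.
share_by_language = {
    language: count / total_titles * 100
    for language, count in top10_by_language.items()
}

for language, share in share_by_language.items():
    print(f"{language}: {share:.0f}% of the top {total_titles} titles")
```

By this count, a search restricted to English covers the language of just one title in ten.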

Imagine if you were faced with an opinion poll that surveyed only Indians who speak English…

Not to mention that IndiaSpend did not even make an attempt to define the term “English media”. Their list of sources seems to include plenty of links from outlets such as Catch News and The Wire.

To summarize, we see:

  1. No attempt to define the source (“English media”) that has been studied.
  2. No attempt to explain why the subclass (“English media”) that has been studied is representative of the whole class (“media”).
  3. No justification provided as to why media reports are to be preferred to government records.

No. (2) is particularly surprising in light of the fact that publicly available data proves their implicit assumption to be spectacularly false. And No. (3) is totally unforgivable in view of the fact that their own data blatantly contradicts the idea that government records on cow related violence are useless.

I don’t know about Prof. Varshney, but I think it is recognized worldwide that such sloppiness is the kind of blunder that ends careers.

Of course, the more deep-seated evil here is the assumption that English media, written by the elite, can be used to exonerate the elites who are themselves in the dock here.

Remember that the issue at hand here is branding of the country as Lynchistan, and that branding happens through narrative setting, which is the main job of the English media. The accusation is that the English media selectively highlights instances of cow-related violence in certain years and in certain states with a specific agenda.

Remember the ghar-wapsi reports and debates in English media? Ghar-wapsi events always happened but somehow never made their way into English media reports before 2014. You just needed to look into Hindi media (e.g. here and here) to know it was an annual affair.

When English media itself is the accused party here, how exactly can you use media reports to give it a clean chit? This is worse than AAP’s internal Lokpal.

It is understandable that certain eminent personalities, who have thus far comfortably and cyclically quoted each other through decades of agreement under a Congress-fed establishment, feel irritated that their assumptions are being questioned. But the challengers are here, whether the consensus likes it or not.
