The importance of data dissemination

Thanks to the efforts of the customs department and an open data enthusiast, we can look at trade at a more granular level


The port of Nhava Sheva accounts for the largest share of both imported and exported goods (by value). Photo: Bloomberg
While broad patterns in Indian trade are rather well known and well studied, what we’ve largely lacked so far is a mechanism to track Indian imports at a much more granular level. Now, thanks to the efforts of the customs department (icegate.gov.in/DailyList/DL), which uploads commodity-wise and port-wise trade data on a daily basis, and activist and open data enthusiast Srinivas Kodali (bit.ly/1H7gfys), who has scraped and compiled the data, it is possible to examine both exports and imports in that kind of detail.

Kodali released the above data last week through the Datameet platform (bit.ly/1RcWAD7) during the Open Data Week celebrations.

Data is available for trade between 8 August and 17 October 2015, and provides us some very interesting insights into the way India trades.

There are no real surprises in terms of our largest trading partners: we import the most from China, while we export the most to Sharjah (interestingly, while the precise port of destination is available for exports, we only have granularity up to the country level when it comes to imports).

Charts 1 and 2 look at the top 10 sources of imports into India and the top 10 destinations for exports.

The data on the largest Indian ports of entry and exit (by value of traded goods) offers some interesting insights. The port of Nhava Sheva accounts for the largest share of both imported and exported goods (by value). Other prominent sea ports such as Mundra, Chennai, Vizag and Tuticorin are also present in the top 10 in terms of both imports and exports (charts 3 and 4).

What is surprising, however, is the presence of airports in the list. Delhi Air Cargo has the fourth largest market share in terms of the value of goods imported. Bengaluru and Mumbai Air Cargo have the fourth and fifth largest market shares in terms of the value of goods exported.

Looking through the data, close to 40% of imports through the Delhi Air Cargo port consists of gold in various forms. Gold makes up a large part of imports at the Bengaluru Air Cargo terminal too.

Coming to exports from the Bengaluru Air Cargo terminal, jewellery accounts for 67% of the value, with aircraft components coming a distant second. Some 22% of Mumbai Air Cargo terminal’s exports comprise pharmaceutical products.

Next, we will look at the most-traded commodities. This is not a very easy task since the data is not clear in terms of item description.

We get around this by using the harmonised system codes (these are universal codes used by customs to define specific categories of commodities; there is some loss of information due to aggregation, but we will live with that). Charts 5 and 6 show the top 10 commodity classes for imports and exports, respectively.
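For readers who want to try this kind of aggregation themselves, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method used to produce the charts: it assumes each record is a pair of HS code and value, and it rolls codes up to their first four digits (the HS heading). The record layout, the function name top_classes and the sample values are all hypothetical.

```python
from collections import defaultdict


def top_classes(records, digits=4, n=10):
    """Sum values by the first `digits` characters of the HS code
    and return the n largest classes as (prefix, total) pairs."""
    totals = defaultdict(float)
    for hs_code, value in records:
        totals[hs_code[:digits]] += value
    # Largest totals first, truncated to the top n
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Invented sample rows: (8-digit tariff code, value). The values are made up.
sample = [
    ("71081200", 50.0),
    ("71081300", 25.0),
    ("27090000", 100.0),
    ("71131910", 10.0),
]

ranked = top_classes(sample)
for hs_class, total in ranked:
    print(hs_class, total)
```

Truncating to fewer digits gives broader classes and loses more detail, which is the trade-off mentioned above.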

Despite the fall in commodity prices, 20% of our import bill comprises petroleum oils. Most of our other top imports are also commodities, including gold, silver, copper and natural gas. Among manufactured products, fertilisers (urea and diammonium phosphate) and telecom equipment are among the top imported items.

Our export basket over this period is much more diverse: the largest class (jewellery) accounts for only 3.6% of our exports, followed by pharmaceuticals. This seems to suggest that while our exports are well diversified, our import bill is vulnerable to commodity price shocks, and we should not read too much into our reduced current account deficit, given the depressed commodity prices.

This is only some of the information that can be obtained by looking at the customs data. We can also construct some interesting cross-tabs (mintne.ws/1cJAfPA), and infer daily and weekly trends of commodity imports and exports.

Before we end, a note on data dissemination. Writing in this newspaper last week, V. Anantha Nageswaran lamented the quality and availability of economic data in India (mintne.ws/1GmCJR8).

Nageswaran wrote that one of the missions of the NITI Aayog should be to make India a data-rich country.

The customs data provides us an illustration of what is wrong with data dissemination in India, even when data is available. While it is a great thing that the department uploads data on a daily basis, there are several shortcomings.

Firstly, data is available on the website only for the preceding seven days.

Secondly, data is available in bits and pieces, with multiple small text files released on each day (the two months of data Kodali uploaded contained more than 7,500 text files for imports, and an equal number for exports).

Thirdly, the data is unclean. For example, product descriptions contain both “0.995 FINENESS GOLDMEDALLIONS” and “0.995 FINENESSGOLD MEDALLIONS”, which differ only in where the spaces fall.
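One simple way to reconcile such spacing variants is to compare descriptions after removing all whitespace. The Python sketch below is only an illustration of that idea: the function names are hypothetical, and it handles spacing differences alone, not misspellings or abbreviations. Removing every space can also merge descriptions that are genuinely different, so the resulting groups would need a manual check.

```python
import re
from collections import defaultdict


def description_key(text):
    """Uppercase the text and drop all whitespace so that
    variants differing only in spacing produce the same key."""
    return re.sub(r"\s+", "", text.upper())


def group_variants(descriptions):
    """Map each normalised key to the raw descriptions that share it."""
    groups = defaultdict(list)
    for description in descriptions:
        groups[description_key(description)].append(description)
    return dict(groups)


# The two variants quoted in the article
variants = [
    "0.995 FINENESS GOLDMEDALLIONS",
    "0.995 FINENESSGOLD MEDALLIONS",
]

groups = group_variants(variants)
for key, raw in groups.items():
    print(key, len(raw))
```

Both strings collapse to a single key here, so their values could be summed under one item.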

All these shortcomings make the process of analysing existing data expensive, since they call for skills in scraping, consolidating and cleaning data.

They also result in duplication of effort since several people might be independently spending their time doing the same thing. In this context, if the data disseminator (the customs department in this case) were to disseminate the data in an easy-to-use, consistent and clean format, it could significantly lower the barriers to analysing government data, and lead to better decision-making overall.

The customs department is by no means the worst offender in terms of data dissemination, though, since they release their data in text files, which are machine-readable.

Several other government departments in India have a penchant for releasing data in PDF files (sometimes from scanned images), which makes them impossible to use unless someone spends the time and effort manually entering data into a spreadsheet—an effort that is completely avoidable.

There is a case to be made for a national data dissemination policy (perhaps the NITI Aayog can take the lead on that?) for use by the different government departments that put out data. Even if we don’t have great data yet, whatever data we have would then be put out in an easily consumable and analysable format, so that we can make the best use of it.
