Mint Explainer: How India is lifting tax collections with a data drive

Over 100 software geeks are at work at the Income Tax Department’s Directorate of IT Systems housed in New Delhi. They use analytics tools to mine, crunch and spot insights and patterns for detecting tax frauds.

  • Given the robustness of the Indian IT industry and the startup ecosystem, the tax administration will, sooner rather than later, leverage emerging technology tools to raise revenues. This will give the Centre more fiscal space to meet its mounting expenditure needs for supporting the recovery.

In the 2018 Bollywood movie Raid, actor Ajay Devgn played Amay Patnaik, a senior income tax officer investigating an influential man in Lucknow for money laundering. It’s a pretty accurate picture of the way the income tax department goes about conducting what it calls ‘searches and seizures’. The important point, though, is that the department should rely less and less on ‘raids’ and instead use technology creatively to nab evaders and improve tax collections.

Over 100 software geeks are at work at the Income Tax Department’s Directorate of IT Systems housed in New Delhi. They use analytics tools to mine, crunch and spot insights and patterns for detecting tax frauds. As taxpayer data is confidential, tax officers trained in-house to manage IT systems oversee this work. Some parts of the job are outsourced to data analytics firms for extracting deeper insights.

The department works with assorted data from the many sources it has access to. These include direct and indirect tax data, motor vehicle registrations, passports, import and export data, SEBI data, stock exchange data, investment data including foreign investment, and data collected from the tax administrations of other countries, for example under the OECD’s Country-by-Country Reporting (CbCR). This mine of taxpayer data is pooled in the Tax Information Network (TIN) for processing, and its volume is growing manifold.

The department’s stated goal is to widen the tax base, which will push up tax collections in relation to GDP. Today, in India, less than 4% of people file tax returns, and only a few lakh admit to having incomes higher than ₹10 lakh a year, which explains the low level of tax collections in relation to GDP. Perhaps this is why finance minister Nirmala Sitharaman chose to invoke a verse from the Mahabharata in her budget speech in Parliament in February this year, one that says tax collections are in fact in line with Dharma.

The good news is that advance tax collections are in fact buoyant this year amid an incipient recovery. Advance tax collections during 1 April to 16 June grew 33% to touch ₹1,01,017 crore, against ₹75,783 crore in the year-ago period, swelling the government’s direct tax revenues. Several factors lie behind the improvement: the base effect from the hit to collections during the second wave of the Covid pandemic and lockdowns, and the higher profitability and sales of big companies on the back of a surge in pent-up consumer demand as the economy re-opened after the shutdowns. Equally important is the tax department’s increasing reliance on data analytics for mining information and tracking the source of unaccounted funds.

There is a growing realisation that IT platforms and analytics, along with new-generation technology tools such as Artificial Intelligence and Machine Learning, are the best way to raise collections. Tax administrations in advanced economies have recognised this potential and made considerable use of analytics tools in digital tax administration. Real-time or near real-time data analytics engines are used to validate invoices and flag discrepancies, cross-check sales against purchase declarations, verify salary and withholding declarations, and compare data across jurisdictions and taxpayers.
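
To make the cross-check of sales against purchase declarations concrete, here is a minimal Python sketch. It is illustrative only and does not describe any tax authority's actual system: the `Invoice` record, the `cross_check` function, the tolerance and the sample figures are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    # Hypothetical record layout, assumed for illustration.
    invoice_no: str
    seller_id: str
    buyer_id: str
    amount: float


def cross_check(sales, purchases, tolerance=0.01):
    """Compare seller-declared sales with buyer-declared purchases.

    Returns a list of (invoice_no, reason) tuples, one per discrepancy.
    """
    sales_by_no = {inv.invoice_no: inv for inv in sales}
    purchases_by_no = {inv.invoice_no: inv for inv in purchases}
    flags = []

    # Every purchase a buyer claims should match a sale some seller declared.
    for no, purchase in purchases_by_no.items():
        sale = sales_by_no.get(no)
        if sale is None:
            flags.append((no, "purchase claimed but no matching sale declared"))
        elif abs(sale.amount - purchase.amount) > tolerance:
            flags.append((no, "amount mismatch"))

    # Declared sales that no buyer has reported are also worth a look.
    for no in sales_by_no:
        if no not in purchases_by_no:
            flags.append((no, "sale declared but no matching purchase declared"))
    return flags


# Made-up sample data.
sales = [Invoice("INV-1", "S1", "B1", 1000.0), Invoice("INV-2", "S1", "B2", 500.0)]
purchases = [Invoice("INV-1", "S1", "B1", 1200.0), Invoice("INV-3", "S2", "B1", 300.0)]

flags = cross_check(sales, purchases)
for invoice_no, reason in flags:
    print(invoice_no, "->", reason)
```

A flag here is only a lead for further scrutiny, not proof of fraud; timing differences or clerical errors can produce the same mismatches.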

Big Data refers to data whose volume, velocity and variety are increasing manifold as it arrives from disparate sources, and to the speed at which it can be processed. Analytics is the way to extract value from this data. Tax data analytics combines technical knowledge of tax laws, large data sets that are analysed computationally to reveal patterns and trends in tax fraud, and technologies such as machine learning, AI and visualisation to generate insights.

In India, the income tax department has a mine of information. Individuals and corporations file tax returns. Data also flows from third-party sources, including banks, credit card companies, registrars of properties and jewellery houses, through filings for Tax Deducted at Source, etc. The expenditure patterns of individuals who are big spenders in high-value transactions are analysed to identify potential taxpayers. Real estate is a sink for tax evasion, so the information provided by registrars of properties on those who buy and sell property above a certain limit is tapped. Because each financial transaction is tagged to the tax department’s unique identifier, the permanent account number (PAN), the data can be analysed intelligently to detect incomes that may have escaped taxation. Data is also sourced from other government entities such as the Goods and Services Tax Network (GSTN), the Ministry of Corporate Affairs and SEBI.
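
The PAN-based matching described above can be sketched in a few lines of Python. This is a simplified illustration, not the department's method: the function name `flag_pans`, the `spend_ratio` threshold, the PAN values and all amounts are hypothetical.

```python
from collections import defaultdict


def flag_pans(transactions, declared_income, spend_ratio=1.0):
    """Aggregate third-party reported transactions per PAN and flag outliers.

    transactions: iterable of (pan, amount) pairs from third-party reports.
    declared_income: dict mapping PAN to income declared in the tax return;
        a PAN absent from this dict is treated as a non-filer.
    spend_ratio: assumed threshold; spending above spend_ratio * income is flagged.
    """
    totals = defaultdict(float)
    for pan, amount in transactions:
        totals[pan] += amount

    flagged = {}
    for pan, total in totals.items():
        income = declared_income.get(pan)
        if income is None:
            flagged[pan] = "non-filer with reported transactions"
        elif total > spend_ratio * income:
            flagged[pan] = "reported spending exceeds declared income"
    return flagged


# Fictitious PANs and rupee amounts, for illustration only.
transactions = [
    ("AAAAA0000A", 1_500_000),  # e.g. a property purchase
    ("AAAAA0000A", 900_000),    # e.g. credit card spending
    ("BBBBB1111B", 400_000),
    ("CCCCC2222C", 2_000_000),
]
declared_income = {"AAAAA0000A": 1_000_000, "BBBBB1111B": 1_200_000}

flagged = flag_pans(transactions, declared_income)
for pan, reason in flagged.items():
    print(pan, "->", reason)
```

In this toy data the first PAN is flagged because reported spending exceeds declared income, and the third because no return was filed at all, which is how people can be identified even if they do not file tax returns.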

When the requirement of annual information returns, filed by third parties, was first introduced, scrutiny was not that rigorous, as PAN was found missing in many large transactions gathered through the TIN. Getting a PAN has become much easier now. A largely fool-proof PAN, combined with large-scale use of data analytics, has enabled the tax department to identify people with large incomes and collect tax from them, even if they do not file tax returns. This, coupled with an efficient TIN, has reduced dependence on information volunteered through tax returns. The potential to collect revenue has become much larger with the GST roll-out and data sharing between the GSTN and the income tax department. The tax department must publish data on the quantum of taxes garnered through data analytics so that its efficacy can be clearly assessed.

In fact, the only limit on how much information is collected is the tax department’s ability to use the datasets meaningfully to extract intelligence and insights while safeguarding data against misuse.

The scope for data mining and analytics has vastly expanded after the Goods and Services Tax (GST) roll-out. GST leaves digital footprints (read audit trails) across the income and production chain as manufacturers get credit for the taxes paid on inputs used by them to make a product. Compliance has improved, making it easier to track how much GST a company paid for how much value it added. Using data analytics, this information is correlated with what the company claims as its expenses. This enables tax authorities to check whether a company has declared its income correctly and to assess its tax liability more accurately. Assiduously following up on audit trails and crunching data will progressively help widen the direct tax base. As more and more data gets mined, the algorithms improve through machine learning, making it easier to tap the unified base of indirect and direct taxes and to plug the holes that allow tax evasion.
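
A stylised Python sketch of the input-credit arithmetic may help. It assumes, purely for simplicity, a single GST rate on both inputs and outputs; the function names `gst_profile` and `expense_gap`, the 12% rate and all figures are hypothetical.

```python
def gst_profile(sales_value, purchase_value, gst_rate_pct):
    """Derive value added and net GST from declared sales and purchases.

    Assumes one GST rate applies to both inputs and outputs (a simplification).
    """
    output_tax = sales_value * gst_rate_pct / 100
    input_credit = purchase_value * gst_rate_pct / 100
    return {
        "value_added": sales_value - purchase_value,
        "output_tax": output_tax,
        "input_credit": input_credit,
        "net_gst_paid": output_tax - input_credit,
    }


def expense_gap(purchases_per_gst, expenses_claimed):
    """Expenses claimed in the income tax return minus purchases seen in GST data."""
    return expenses_claimed - purchases_per_gst


# Made-up figures in rupees.
profile = gst_profile(sales_value=1_000_000, purchase_value=700_000, gst_rate_pct=12)
gap = expense_gap(purchases_per_gst=700_000, expenses_claimed=900_000)

print(profile)
print("Expenses not visible in GST purchase data:", gap)
```

A positive gap is not evidence of wrongdoing by itself, since expenses such as wages do not appear as GST purchases; it only shows how the two data sets can be correlated to decide where to look more closely.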

Take an example. A garment maker who claims plenty of credit against taxes paid on inputs without paying cash or adding a lot of value to a product catches the eye of tax collectors. But they need conclusive proof to establish fraud. Data analytics is deployed to track all the purchases made by such a garment maker and analyse them to fish for fake invoices and claims. The entire supply chain gets covered: the suppliers of embellishments, fabric and yarn and, in turn, their suppliers too. If the yarn supplier figures in various networks of fake invoices, he can be nabbed and his bank account details checked. The scrutiny thus extends to others in the supply chain if they are found to be perpetrating the fraud.
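
The garment-maker example amounts to walking a supplier graph. The Python sketch below uses a breadth-first search to do that; it is an assumption about how such tracing could be done, not a description of the department's tools, and the entity names, the `suppliers_of` mapping and the `suspects` set are invented.

```python
from collections import deque


def trace_supply_chain(suppliers_of, start, suspect_ids):
    """Walk upstream from `start` through suppliers and their suppliers.

    suppliers_of: dict mapping each entity to the list of its direct suppliers.
    suspect_ids: set of entities already linked to fake-invoice networks.
    Returns a dict mapping each suspect reached to the path leading to it.
    """
    seen = {start}
    queue = deque([(start, [start])])
    hits = {}
    while queue:
        node, path = queue.popleft()
        for supplier in suppliers_of.get(node, []):
            if supplier in seen:
                continue  # already visited; avoids loops in circular trading
            seen.add(supplier)
            new_path = path + [supplier]
            if supplier in suspect_ids:
                hits[supplier] = new_path
            queue.append((supplier, new_path))
    return hits


# Invented supply chain for illustration.
suppliers_of = {
    "garment_maker": ["fabric_co", "embellishment_co"],
    "fabric_co": ["yarn_co"],
    "embellishment_co": [],
    "yarn_co": ["fibre_co"],
}
suspects = {"yarn_co"}

hits = trace_supply_chain(suppliers_of, "garment_maker", suspects)
for suspect, path in hits.items():
    print(suspect, "reached via", " -> ".join(path))
```

The recorded path shows which intermediaries link the garment maker to the suspect yarn supplier, which is the trail an investigator would then follow up with bank account checks.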

Social media data is one of the sources used in Big Data. In the UK, the revenue and customs department has developed a computerised data mining system, built on social network analysis software, that cross-checks the tax records of companies and individuals with other databases to establish fraud. The software combines analytical tools, collects information and applies predictive analysis. The US Internal Revenue Service too gathers social media data and deploys Big Data to check tax evasion. South Korea has developed a Big Data analytics system based on AI to analyse tax invoices. Indian tax authorities are also building effective systems, tools and feeds to tap this source of data, and are leveraging social media and various other types of disparate data to gather intelligence. Using mobile data, textual analytics, geocoding data, and audio and video analytics, they are able to keep pace with what technology tools can provide. Many of these are work-in-progress; still, given the robustness of the Indian IT industry and the start-up ecosystem, the tax administration will, sooner rather than later, leverage emerging technology tools to raise revenues. This will give the government more fiscal space to meet its mounting expenditure needs for supporting the recovery.
