State of use of AI tax systems
The Tax and Customs Administration of the Kingdom of Spain (AEAT) has always been a frontrunner in fiscal algorithmic governance.
The AEAT is the first EU tax administration to have deployed a chatbot, with a high degree of satisfaction among taxpayers. It is also among the first administrations to have publicly disclosed its use of social network analysis (SNA). The first mentions of the AEAT's use of AI date back to 2012, but earlier sources already refer to statistical governance akin to machine learning.
This high degree of maturity in the use of AI arguably helps explain why Spain is among the States with the smallest VAT gap, despite using comparatively fewer resources than neighbouring States.
What functions are performed with AI?
Based on publicly available data, tax machine-learning algorithms perform at least the following types of functions for the AEAT:
- Webscraping: the AEAT uses an open-source scraping system, as do a number of other tax administrations, such as those of Austria and Denmark. The scraping system is reportedly based on the Python framework ‘Scrapy’ and automatically collects taxpayers' online data from the HTML of websites. Few details have been disclosed about the areas of taxation concerned or the sources of data collection. Reportedly, the scraping tool is used in particular to collect data from e-commerce and sharing-economy platforms.
- Social Network Analysis (SNA): ‘TESEO’ visually represents a network of individual taxpayers as a combination of nodes and edges. Using graph theory, TESEO quantitatively and qualitatively measures connectivity between the nodes. TESEO is used for a wide array of purposes, for instance to detect fraud among ultra-high-net-worth individuals (UHNWI).
- Real-time risk-detection: INFONOR automatically detects and flags suspicious data and transactions in real time, without any active participation from tax inspectors.
- Risk-detection: DEDALO is used to identify and locate taxpayers for whom no precise information is available. DEDALO identifies taxpayers through search parameters other than the NIF number or the name, such as real estate, bank accounts, vehicles, or even disparate information held in zújares.
- Internal risk-management: ZÚJAR enables the filtering of taxpayers by predefined variables through Boolean algebra. The ZÚJAR program contains zújares, i.e. units of ordered information, to develop tax management, inspection and collection actions. ZÚJAR classifies and divides data according to different concepts and filters thousands of variables and millions of records. Therefore, it can draft lists of taxpayers or specific attributes for risk-scoring.
- Internal risk-management: GENIO is a supplementary tool that allows the issuing of standardized reports as a conclusion of the risk analysis performed.
- Internal risk-management: PROMETEO issues a detailed report of tax digital documentation, such as accounting documents, VAT book and bank accounts. PROMETEO also acts as a data-matching tool, enabling the treatment of accounting and computer records obtained from taxpayers and the comparison of these documents with information already present in the data warehouse of the tax administration.
- External risk-management (risk-scoring): HERMES is a risk and profiling system for the analysis of taxpayers that classifies them into risk categories in order to devise annual audit plans and treatment strategies. The system also makes use of the existing ZÚJAR infrastructure.
- External risk-management (risk-scoring): The HLF tool predicts risks of non-compliance of taxpayers for social security or wage-related fraud, to devise treatment strategies and (pre-)select taxpayers for further audits. This tool is particularly used to combat undeclared work and bogus employment.
- Taxpayer assistance ‘AVIVA’: AVIVA is a chatbot designed to automatically answer queries and FAQs from legal persons regarding the SII (Suministro Inmediato de Información), corporate income taxation, VAT and e-invoicing.
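The Boolean filtering attributed to ZÚJAR above can be illustrated with a minimal sketch. It is not the AEAT's implementation, whose internals are not public: the record fields, the threshold and the helper names (`TaxpayerRecord`, `select`, `low_income_with_assets`) are all hypothetical and chosen only to show how predefined variables can be combined with AND/OR logic to draft a list of taxpayers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxpayerRecord:
    # Hypothetical variables; the real ZÚJAR variables are not disclosed.
    taxpayer_id: str
    declared_income: float
    owns_real_estate: bool
    has_foreign_assets: bool


def select(records, predicate):
    """Return the identifiers of all records that satisfy the Boolean predicate."""
    return [r.taxpayer_id for r in records if predicate(r)]


def low_income_with_assets(r):
    # Assumed example rule: low declared income AND (real estate OR foreign assets).
    return r.declared_income < 20_000 and (r.owns_real_estate or r.has_foreign_assets)


# Invented sample data, for illustration only.
records = [
    TaxpayerRecord("T1", 18_000.0, True, True),
    TaxpayerRecord("T2", 95_000.0, True, False),
    TaxpayerRecord("T3", 12_000.0, False, True),
    TaxpayerRecord("T4", 15_000.0, False, False),
]

shortlist = select(records, low_income_with_assets)
print(shortlist)
```

In this sketch the shortlist plays the role of a list of taxpayers handed over for further risk-scoring; a production system would apply such predicates to a data warehouse rather than to an in-memory list.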
What data can be processed by these systems?
The data processed by these systems is not specified by the AEAT.
The data processed by TESEO reportedly includes individuals' and companies' bank accounts, cadastral property, foreign assets and a wide range of additional data. TESEO has reportedly measured more than 49 different types of relations across more than 530 million connectivity arcs.
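To make the notions of relation types and connectivity arcs more concrete, the following sketch models a handful of typed arcs between taxpayers and computes two basic graph-theoretic measures: the degree of a node and the set of nodes reachable from it. It is an illustration under stated assumptions, not a description of TESEO itself: the arcs, the relation labels and the function names are invented for the example.

```python
from collections import defaultdict, deque

# Each arc links two taxpayers and carries a relation type.
# All identifiers and relation labels are hypothetical.
arcs = [
    ("A", "B", "shared_bank_account"),
    ("B", "C", "company_director"),
    ("C", "D", "co_owned_property"),
    ("E", "F", "shared_bank_account"),
]


def build_graph(arcs):
    """Build an undirected adjacency map, ignoring the relation type."""
    graph = defaultdict(set)
    for source, target, _relation in arcs:
        graph[source].add(target)
        graph[target].add(source)
    return graph


def degree(graph, node):
    """Number of distinct taxpayers directly linked to the node."""
    return len(graph[node])


def connected_component(graph, start):
    """Breadth-first search: all taxpayers reachable from the start node."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


graph = build_graph(arcs)
relation_types = {relation for _, _, relation in arcs}

print(degree(graph, "B"))
print(sorted(connected_component(graph, "A")))
print(sorted(relation_types))
```

Here taxpayers A to D form one connected group through three different relation types, while E and F form a separate one. At the scale reported for TESEO, a dedicated graph database or graph-processing library would presumably be needed instead of in-memory dictionaries.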
Are these systems regulated by specific norms?
The use of machine-learning by the AEAT is not subject to ad hoc legal norms.