State of use of AI tax systems
Since the adoption of the first CFVR decree (Ciblage de la Fraude et Valorisation des Requêtes, i.e. Targeting of Fraud and Valuation of Requests) in 2014, the French Tax Administration (DGFiP) has used machine learning to carry out a number of its functions.
What functions are performed with AI?
The machine-learning algorithms developed by the DGFiP perform a wide range of functions:
- Web-scraping: this tool automatically collects data from hundreds of webpages, including social media (Facebook, Instagram, LinkedIn), e-commerce sites (eBay, Amazon) and sharing-economy platforms (Airbnb, LeBonCoin, 2emeMain, Uber), and matches it with the data provided by taxpayers. Unlike the model used by the Belgian SPF, the data collected by the DGFiP's scraping tool must be freely accessible ('manifestement rendus publics', i.e. manifestly made public) and must not require the creation of an account, a password, or registration on the platform where the data is collected.
In addition, the DGFiP uses a web-scraping tool, in collaboration with Google, to collect satellite images and aerial photography in order to detect undeclared swimming pools and real-estate extensions (verandas, garages, sheds, etc.). Reportedly, the system detected more than 140,000 undeclared swimming pools in 2023.
- Social Network Analytics (SNA): a tool used to detect and visualize fraudulent networks. It represents a network of taxpayers as a combination of nodes (individuals or points of interest) and lines that quantitatively and qualitatively measure the relations between the nodes.
- External-risk management (risk-scoring): the tool segments individual taxpayers into risk categories and selects taxpayers to be audited manually by DGFiP tax officials (valorisation des requêtes, i.e. valuation of requests).
- Taxpayer assistance: 'AMI' ('Friend' in French) is a nudging system and virtual conversational assistant that guides taxpayers toward the correct forms, answers simple taxpayer queries and facilitates access to fiscal documentation.
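To make the SNA and risk-scoring descriptions above more concrete, the following Python sketch shows one generic way to represent taxpayers and related entities as nodes joined by typed, weighted links, to group them into connected networks, and to sort a numeric risk score into categories. It is purely illustrative: the entity names, relation types, weights and thresholds are all invented for the example, and nothing here reflects the actual DGFiP implementation, which is not described in the sources.

```python
from collections import defaultdict, deque

# Hypothetical relations: (entity_a, entity_b, relation_type, weight).
# The relation type is the qualitative measure, the weight the quantitative one.
RELATIONS = [
    ("taxpayer_A", "company_X", "director", 1.0),
    ("taxpayer_B", "company_X", "shareholder", 0.6),
    ("taxpayer_B", "company_Y", "director", 1.0),
    ("taxpayer_C", "company_Z", "director", 1.0),
]


def build_graph(relations):
    """Build an undirected graph: node -> {neighbour: (relation_type, weight)}."""
    graph = defaultdict(dict)
    for a, b, kind, weight in relations:
        graph[a][b] = (kind, weight)
        graph[b][a] = (kind, weight)
    return graph


def connected_networks(graph):
    """Return the list of connected components, each as a set of nodes."""
    seen = set()
    networks = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Breadth-first search from an unvisited node collects one network.
        queue = deque([start])
        seen.add(start)
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        networks.append(component)
    return networks


def risk_category(score, medium=0.4, high=0.7):
    """Map a score in [0, 1] to a category; the thresholds are assumptions."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


if __name__ == "__main__":
    for network in connected_networks(build_graph(RELATIONS)):
        print(sorted(network))
    print(risk_category(0.82))
```

In this toy data set, taxpayers A and B end up in the same network because both are linked to company_X, while taxpayer C forms a separate one. A real system would presumably derive the risk score from a trained model rather than take it as a given input.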
What data can be processed by these systems?
The wide range of data that the machine-learning algorithms can use is detailed in Art. 3 of the CFVR decree, which includes but is not limited to:
- Data for the identification of physical and legal persons and their professional or economic situation (e.g. phone number, emails, postcodes, place of work, monthly revenues, financial status of the company, relations with other companies etc.);
- Financial performance indicators, accounting information (both national and international), banking and patrimonial information, external and internal tax indicators;
- Information from holders of internet pages (i.e. social media, e-commerce, e-sharing platforms, etc.), as per the Arrêté of 8 March 2021 amending the 2014 CFVR decree;
- Fraud signals and general information of the DGFiP.
Are these systems regulated by specific norms?
The use of these tax machine-learning algorithms is regulated by specific legal norms, namely:
- CFVR 2014 Decree (Arrêté du 21 Février 2014 portant création par la direction générale des finances publiques d’un traitement automatisé de lutte contre la fraude dénommé “ciblage de la fraude et valorisation des requêtes”)
- Loi n° 2019-1479 de finances pour 2020 (Finance Act for 2020), Art. 154
- CFVR 2021 Arrêté amending the 2014 CFVR Decree (Arrêté du 8 mars 2021 modifiant l’Arrêté du 21 Février 2014)