AI and machine learning techniques are being used by regulated institutions for regulatory compliance, and by authorities for supervision. RegTech is often regarded as the subset of FinTech that focuses on facilitating regulatory compliance more efficiently and effectively than existing capabilities. The total RegTech market is expected to reach $6.45 billion by 2020, growing at a compound annual growth rate (CAGR) of 76%.
SupTech is the use of these technologies by public sector regulators and supervisors. Within SupTech, the objective of AI and machine learning applications is to enhance the efficiency and effectiveness of supervision and surveillance. Some of the use cases described here are not yet applied by regulatory or supervisory bodies but represent potential applications in this sector. The use cases are grouped by the function for which they are used, namely regulatory compliance; regulatory reporting and data quality; monetary policy and systemic risk analysis; and surveillance and fraud detection.
Applications by financial institutions for regulatory compliance
For analysing unstructured data, RegTech can use machine learning combined with NLP. Besides being applied to the monitoring of traders' behaviour and communication for transparency and market conduct, machine learning together with NLP can interpret data inputs such as e-mails, spoken word, instant messaging, documents, and metadata. This in turn raises the question of where the boundaries of employee surveillance policy should lie. Some regulated institutions are experimenting with use cases that seek to enhance their ability to comply with product suitability requirements.
NLP could be used by asset management firms to cope with new regulations. In the EU, investment managers have to comply with specific requirements in the Markets in Financial Instruments Directive (MiFID II), the Undertakings for Collective Investment in Transferable Securities (UCITS) Directive, and the Alternative Investment Fund Managers Directive (AIFMD). Firms could potentially leverage NLP and other machine learning tools to interpret these regulations into a common language.
They could then analyse and codify the rules for automation into the integrated risk and reporting systems to help firms comply with the regulations. This could bring down the cost, effort and time needed to interpret and implement new and updated regulations for fund managers.
Knowing the identity of customers (‘know your customer’ or KYC) is another area where AI and machine learning are applied to address one of the biggest pain points in the financial industry, both with regard to user experience and regulator expectations. The KYC process is often costly, laborious, and highly duplicative across many services and institutions. Machine learning is increasingly used by financial services firms in remote KYC to perform identity and background pre-checks. It is predominantly used in two ways:
(1) evaluating whether images in identifying documents match one another, and
(2) calculating risk scores that firms use to determine which individuals or applications need additional scrutiny.
Machine learning-based risk scores are also used in ongoing periodic checks based on public and other data sources, such as police registers of offenders and social media services. Use of these sources may enable risk and trust to be assessed quickly and often cheaply. Firms can use risk scores on the probability of customers raising “red flags” on KYC checks to help make decisions on whether to proceed with the time and expense of a full background check. Nonetheless, concerns about their accuracy have kept some financial services firms from incorporating these tools.
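The triage step described above can be sketched as follows. This is a minimal illustration, assuming a logistic score with hand-picked weights; the feature names, weights, bias, and threshold are all hypothetical and are not taken from any firm's system, where the model would instead be fitted to labelled KYC outcomes.

```python
import math

# Hypothetical feature weights for illustration only; a real model would be
# fitted to labelled KYC outcomes rather than set by hand.
WEIGHTS = {
    "document_mismatch": 2.5,   # 1 if the ID photo does not match the selfie
    "adverse_media_hits": 0.8,  # number of matches in public data sources
    "high_risk_country": 1.2,   # 1 if the jurisdiction is flagged, else 0
}
BIAS = -3.0

def red_flag_probability(features):
    """Logistic score: estimated probability that a check raises a red flag."""
    z = BIAS + sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def triage(applicants, threshold=0.5):
    """Split applicant ids into those needing additional scrutiny and the rest."""
    escalate, fast_track = [], []
    for applicant_id, features in applicants.items():
        if red_flag_probability(features) >= threshold:
            escalate.append(applicant_id)
        else:
            fast_track.append(applicant_id)
    return escalate, fast_track

applicants = {
    "A-001": {"document_mismatch": 0, "adverse_media_hits": 0, "high_risk_country": 0},
    "A-002": {"document_mismatch": 1, "adverse_media_hits": 2, "high_risk_country": 1},
}
escalate, fast_track = triage(applicants)
print(escalate, fast_track)
```

The threshold controls the trade-off the text alludes to: lowering it sends more applicants to a full check, at higher cost, while raising it risks missing cases when the score is inaccurate.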
Uses for macroprudential surveillance and data quality assurance
AI and machine learning methods may help to improve macroprudential surveillance by automating macroprudential analysis and data quality assurance. A series of new reporting requirements across jurisdictions has led to a greater volume and frequency of reported data, as well as greater resources required from financial institutions to complete reporting on time. In some cases (for example, transactions data in MiFID, AIFMD templates, etc.), the volume of data received can be challenging for the authorities receiving the data, such that it cannot be used to its full potential using traditional methods.
Moreover, substantial errors, blank fields, and other data quality issues are often more prevalent in new datasets, and additional checks and data quality assurance are needed. Machine learning can help improve data quality, for example, by automatically identifying anomalies (potential errors) to flag them to the statistician and/or the data-providing source. This may allow for both lower-cost and higher-quality reporting and more efficient and effective data processing and macroprudential surveillance of data by authorities.
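As a rough illustration of automated anomaly flagging, the sketch below marks blank fields and values that lie far from the median of a reported series. It uses the modified z-score based on the median absolute deviation (MAD), a simple statistical rule rather than a trained model; the sample figures, the field name, and the 3.5 cutoff are assumptions for illustration.

```python
import statistics

def flag_anomalies(values, cutoff=3.5):
    """Return indices of values that are blank or far from the median.

    Uses the modified z-score 0.6745 * (x - median) / MAD. The 3.5 cutoff is
    a conventional default, not a threshold set by any reporting regime.
    """
    present = [v for v in values if v is not None]
    median = statistics.median(present)
    mad = statistics.median([abs(v - median) for v in present])
    flagged = []
    for i, v in enumerate(values):
        if v is None:
            flagged.append(i)  # blank field
        elif mad > 0 and abs(0.6745 * (v - median) / mad) > cutoff:
            flagged.append(i)  # potential error, e.g. a misplaced decimal
    return flagged

# Invented sample: one blank field and one value that is off by a factor of 100.
reported_notionals = [100.0, 102.0, 98.0, None, 101.0, 99.0, 10000.0, 100.0]
suspect_rows = flag_anomalies(reported_notionals)
print(suspect_rows)
```

The flagged rows would then be returned to the statistician or the data-providing source for review rather than corrected automatically.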
Similarly, AI and machine learning could help trade repositories (TRs) tackle data quality issues, increasing the value of TR data to authorities and the public. Authorities report that overcoming data quality issues continues to be a key challenge to making full use of TR data. Application of machine learning techniques may help TRs – for over-the-counter (OTC) derivatives or (where applicable) other types of transactions, such as exchange-traded derivatives or securities financing transactions – improve data quality. Specifically, appropriately trained machine learning algorithms could help identify data gaps, data inconsistencies, and fat-finger errors, as well as match likely pairs of transactions and/or interpolate missing data. The same techniques can be used by authorities themselves. In this context, the Autorité des marchés financiers du Québec reports that it has successfully tested in its FinTech Laboratory a supervised machine learning algorithm able to recognise distinct categories from unstructured free text fields in OTC derivatives data, such as the floating leg of swaps. Implementation of alerts based on this algorithm is underway to automatically detect transactions that are not compliant with mandatory clearing requirements.
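To make the idea of classifying free text fields concrete, the sketch below trains a small supervised classifier to map floating-leg descriptions to a reference rate category. The source does not say which algorithm was tested; multinomial naive Bayes with add-one smoothing is used here purely as a stand-in, and the training strings, labels, and class name are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Lower-case the text and split on whitespace, hyphens, and underscores."""
    return text.lower().replace("-", " ").replace("_", " ").split()

class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
            for word in tokenize(text):
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented training examples of free-text floating-leg descriptions.
texts = ["USD-LIBOR-3M", "usd libor bba 3m", "EUR-EURIBOR-6M",
         "euribor reuters 6m", "USD-SOFR-COMPOUND", "sofr ois compound"]
labels = ["LIBOR", "LIBOR", "EURIBOR", "EURIBOR", "SOFR", "SOFR"]

model = NaiveBayesTextClassifier().fit(texts, labels)
print(model.predict("Libor 3M USD"), model.predict("EURIBOR 6m act/360"))
```

Once free text is mapped to categories in this way, a downstream rule could compare each category against a list of products subject to mandatory clearing, which is the kind of alert the text describes.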
Uses and potential uses by central banks and prudential authorities
Machine learning can be applied to systemic risk identification and risk propagation channels. Specifically, NLP tools may help authorities to detect, measure, predict, and anticipate, among other things, market volatility, liquidity risks, financial stress, housing prices, and unemployment. In a recent Banca d’Italia (BdI) study, still in progress, textual sentiment derived from Twitter posts is used as a proxy for the time-varying retail depositors’ trust in banks. The indicator is used to challenge the predictions of a model of banks’ retail funding, and to try to capture possible threats to financial stability deriving from an increase of public distrust in the banking system. Furthermore, at the BdI, in order to extract the most relevant information available on the web, newspaper articles are processed through a suitable NLP pipeline that evaluates their sentiment. In another study, academics developed a model using computational linguistics and probabilistic approaches to uncover semantics of natural language in mandatory US bank disclosures. The model found risks as early as 2005 related to interest rates, mortgages, real estate, capital requirements, rating agencies and marketable securities. Other studies are able to predict and anticipate market outcomes and economic conditions, including volatility and growth.
Machine learning combined with NLP can be used to identify patterns in large and complex data that merit further attention from supervisors. Machine learning can also be used with NLP to link trading databases to other information on market participants. This could include, for example, the ability to integrate and compare trading activity information with behavioural data like communications and to compare normal trading scenarios with those that may have substantial deviations, triggering the need for further analysis.
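A text-derived sentiment indicator of the kind mentioned above can be sketched in a few lines. This is a deliberately simple lexicon-based version, not the method used in the BdI study, which the source does not describe; the word lists, day labels, and posts are invented for illustration.

```python
from collections import defaultdict

# Hypothetical word lists; a production lexicon would be far larger and
# language-specific.
POSITIVE = {"safe", "trust", "solid", "confident"}
NEGATIVE = {"collapse", "panic", "withdraw", "distrust"}

def post_sentiment(text):
    """Score one post in [-1, 1] from counts of positive and negative words."""
    words = [w.strip(".,!?#@").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)

def daily_index(posts):
    """Average post sentiment per day, giving a time-varying trust proxy."""
    by_day = defaultdict(list)
    for day, text in posts:
        by_day[day].append(post_sentiment(text))
    return {day: sum(s) / len(s) for day, s in sorted(by_day.items())}

# Invented posts, each tagged with a placeholder day label.
posts = [
    ("day-1", "I trust my bank, deposits are safe."),
    ("day-1", "Solid results this quarter"),
    ("day-2", "Panic! Everyone wants to withdraw"),
    ("day-2", "Is my money safe? Growing distrust."),
]
index = daily_index(posts)
print(index)
```

A falling index could then be compared with the predictions of a funding model, in the spirit of the challenge exercise described in the text.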
Central banks can use AI to assist with monetary policy assessments. A 2015 survey of central banks’ use of and interest in big data reported, among other things, that central banks expected a growing use of big data for macroeconomic and financial stability purposes. The most prevalent expected use was for economic forecasting, in particular for economic indicators such as inflation and prices. For instance, 39% of central banks expect to ‘nowcast,’ or predict in real time, retail home prices using big data. AI can be used to forecast unemployment, GDP, industrial production, retail sales, tourism activity, and the business cycle (for example, with sentiment indicators and nowcasting techniques).
Recent research highlights how these methods could be used. Researchers at Columbia University have recently combined newly developed machine learning approaches with observational studies to enable public authorities and market participants to:
(i) ‘score’ policy choices and link them to indicators of financial sector performance;
(ii) simulate the impact of policies under varying economic and political conditions; and
(iii) detect the rate of change of market innovation by comparing trends of policy efficacy over time.
With the aim of studying the redistributive effects of fiscal policy over different municipalities, a study from the BdI employs a dynamic factor model and utilises a dataset containing variables from different sectors of the economy. In order to select the statistically most relevant independent variables, the authors use automatic regression variable selection. At the Office of Financial Research (OFR), researchers are evaluating the potential for machine learning tools to identify new financial innovations receiving more attention from market participants in financial publications. OFR researchers have also used machine learning to extract sentiment and key topics from financial publications in order to evaluate the relationship between news, attention, and financial stability.
Uses by market regulators for surveillance and fraud detection
Some regulators are using AI for fraud and AML/CFT detection. The Australian Securities and Investments Commission (ASIC) has been exploring the quality of results and potential use of NLP technology to identify and extract entities of interest from evidentiary documents. ASIC is using NLP and other technology to visualise and explore the extracted entities and their relationships. In order to fight criminal activities carried out through the banking system (such as money laundering), BdI collects detailed information on bank transfers and correlates this information with information from newspaper articles. The correlation involves both structured and unstructured data for file sizes of more than 50 gigabytes. In the same vein, the Monetary Authority of Singapore (MAS) is exploring the use of AI and machine learning in the analysis of suspicious transactions to identify those transactions that warrant further attention, allowing supervisors to focus their resources on higher risk transactions. Investigating suspicious transactions is time-consuming and often suffers from a high rate of false positives, due to defensive filings by regulated entities.
Machine learning methods that analyse granular data from transactions, client profiles, and a variety of unstructured data are being explored to uncover non-linear relationships among different attributes and entities, and to detect potentially complicated behaviour patterns of money laundering and the financing of terrorism that are not directly observable through suspicious transaction filings from individual entities.
Market regulators can also use these techniques for disclosure and risk assessment. The US Securities and Exchange Commission (SEC) staff leverages “big data” to develop text analytics and machine learning algorithms to detect possible fraud and misconduct. Certain risk assessment tools are beginning to move into the AI space. For instance, the SEC staff uses machine learning to identify patterns in the text of SEC filings. With supervised learning, these patterns can be compared to past examination outcomes to find risks in investment manager filings. The SEC staff notes that these techniques are five times better than random at finding language that merits a referral to enforcement. While the results can generate false positives that can be explained by non-nefarious actions and intent, they nonetheless provide increasingly important signals to prioritise examination. For investment advisers, the SEC staff compiles structured and unstructured data. Unsupervised learning algorithms, including both topic modelling and tonality analysis, are used to identify unique or outlier reporting behaviours. The output from this first stage is then combined with past examination outcomes and fed into a second-stage machine learning algorithm to predict the presence of idiosyncratic risks at each investment adviser. In Australia, ASIC has also used machine learning software to identify misleading marketing in a particular sub-sector, such as unlicensed accountants in the provision of financial advice.
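One common way to read a claim such as "five times better than random" is as a lift: the hit rate among the filings a model flags, divided by the hit rate across all filings. Whether the SEC staff computed the figure this way is an assumption; the sketch below only shows the arithmetic, using invented counts.

```python
def lift(flagged_outcomes, all_outcomes):
    """Hit rate among flagged filings divided by the overall hit rate."""
    base_rate = sum(all_outcomes) / len(all_outcomes)
    flagged_rate = sum(flagged_outcomes) / len(flagged_outcomes)
    return flagged_rate / base_rate

# Invented numbers: 1 = the filing merited a referral, 0 = it did not.
all_outcomes = [1] * 20 + [0] * 980      # 2% of all filings merit referral
flagged_outcomes = [1] * 10 + [0] * 90   # 10% of model-flagged filings do

print(round(lift(flagged_outcomes, all_outcomes), 2))
```

Even at a lift of five, 90 of the 100 flagged filings in this invented example are false positives, which is consistent with the text's point that the output serves to prioritise examinations rather than to establish misconduct.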