Researchers in computer science and statistics have developed advanced techniques to obtain insights from large disparate data sets. Data may be of different types, from different sources, and of different quality (structured and unstructured data). These techniques can leverage the ability of computers to perform tasks, such as recognizing images and processing natural languages, by learning from experience. The application of computational tools to address tasks traditionally requiring human sophistication is broadly termed ‘artificial intelligence’ (AI). As a field, AI has existed for many years. However, recent increases in computing power coupled with increases in the availability and quantity of data have resulted in a resurgence of interest in potential applications of artificial intelligence. These applications are already being used to diagnose diseases, translate languages, and drive cars; and they are increasingly being used in the financial sector as well.
There are many terms that are used in describing this field, so some definitions are needed before proceeding. ‘Big data’ is a term for which there is no single, consistent definition, but the term is used broadly to describe the storage and analysis of large and/or complicated data sets using a variety of techniques including AI. Analysis of such large and complicated data sets is often called ‘big data analytics.’ A key source of complexity in big data analytics is often the amount of unstructured or semi-structured data contained in the data sets.
Many machine learning tools build on statistical methods that are familiar to most researchers. These include extending linear regression models to deal with potentially millions of inputs, or using statistical techniques to summarize a large data set for easy visualization. Yet machine learning frameworks are inherently more flexible; patterns detected by machine learning algorithms are not constrained to the linear relationships that tend to dominate economic and financial analysis. In general, machine learning deals with (automated) optimization, prediction, and categorization, not with causal inference. In other words, classifying whether the debt of a company will be investment grade or high yield one year from now could be done with machine learning. However, determining what factors have driven the level of bond yields would likely not be done using machine learning.
Types of Machine Learning algorithms
There are several categories of machine learning algorithms. These categories vary according to the level of human intervention required in labeling the data:
- In supervised learning, the algorithm is fed a set of ‘training’ data that contains labels on some portion of the observations. For instance, a data set of transactions may contain labels on some data points identifying those that are fraudulent and those that are not fraudulent. The algorithm will ‘learn’ a general rule of classification that it will use to predict the labels for the remaining observations in the data set.
- Unsupervised learning refers to situations where the data provided to the algorithm does not contain labels. The algorithm is asked to detect patterns in the data by identifying clusters of observations that depend on similar underlying characteristics. For example, an unsupervised machine learning algorithm could be set up to look for securities that have characteristics similar to an illiquid security that is hard to price. If it finds an appropriate cluster for the illiquid security, pricing of other securities in the cluster can be used to help price the illiquid security.
- Reinforcement learning falls in between supervised and unsupervised learning. In this case, the algorithm is fed an unlabelled set of data, chooses an action for each data point, and receives feedback (perhaps from a human) that helps the algorithm learn. For instance, reinforcement learning can be used in robotics, game theory, and self-driving cars.
- Deep learning is a form of machine learning that uses algorithms that work in ‘layers’ inspired by the structure and function of the brain. Deep learning algorithms, whose structures are called artificial neural networks, can be used for supervised, unsupervised, or reinforcement learning.
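The difference between the supervised and unsupervised cases above can be made concrete with a minimal sketch. It is only an illustration under stated assumptions: the feature choices, the toy numbers, and the helper names `knn_predict` and `kmeans` are hypothetical, and a nearest-neighbour vote and a basic k-means loop stand in for whatever algorithms a practitioner would actually choose.

```python
from collections import Counter
from math import dist
from statistics import mean


def knn_predict(features, labels, query, k=3):
    """Supervised: label `query` by majority vote of its k nearest labelled points."""
    ranked = sorted(zip(features, labels), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def kmeans(points, k, iterations=20):
    """Unsupervised: group unlabelled points into k clusters (basic k-means)."""
    centroids = list(points[:k])  # deterministic start: the first k points
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return assignment


# Supervised example: hypothetical transactions as (amount in thousands, foreign flag).
transactions = [(0.02, 0.0), (0.05, 0.0), (0.03, 0.0), (4.8, 1.0), (5.2, 1.0), (6.0, 1.0)]
labels = ["legitimate"] * 3 + ["fraudulent"] * 3
flagged = knn_predict(transactions, labels, (5.0, 1.0))  # 'fraudulent'

# Unsupervised example: hypothetical bonds as (duration in years, credit spread in %).
# The last bond is the illiquid one with no reliable price.
bonds = [(2.0, 1.0), (9.0, 4.0), (2.2, 1.1), (1.8, 0.9), (9.5, 4.2), (8.8, 3.9), (2.1, 1.05)]
prices = [101.0, 88.0, 100.5, 101.5, 87.0, 88.5]  # known prices of the first six bonds
clusters = kmeans(bonds, k=2)
peers = [prices[i] for i in range(len(prices)) if clusters[i] == clusters[-1]]
estimate = mean(peers)  # price the illiquid bond from its cluster peers
```

In the first half the labels drive the prediction; in the second half no labels exist, and the only information used is which securities end up grouped together.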
Recently, deep learning has led to remarkable results in diverse fields, such as image recognition and natural language processing (NLP). Deep learning algorithms are capable of discovering generalisable concepts, such as encoding the concept of a ‘car’ from a series of images. An investor might deploy an algorithm that recognizes cars to count the number of cars in a retail parking lot from a satellite image in order to infer a likely store sales figure for a particular period. NLP allows computers to ‘read’ and produce written text or, when combined with voice recognition, to read and produce spoken language. This allows firms to automate financial service functions previously requiring manual intervention.
Applications of Machine Learning
Machine learning can be applied to different types of problems, such as classification or regression analysis. Classification algorithms, which are far more frequently deployed in practice, group observations into a finite number of categories.
Classification algorithms are probability-based: the outcome is the category to which the algorithm assigns the highest probability for a given observation. An example might be to automatically read a sell-side report and label it as ‘bullish’ or ‘bearish’ with some probability, or to estimate an unrated company’s initial credit rating. Regression algorithms, in contrast, estimate the outcome of problems that have an infinite number of solutions (a continuous set of possible outcomes). This outcome can be accompanied by a confidence interval. Regression algorithms can be used for the pricing of options. They can also serve as an intermediate step in a classification algorithm. It is important to note what machine learning cannot do, such as determining causality.
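The contrast between the two problem types can be sketched in a few lines. This is an illustration only: the scores and data points are invented, the names `classify` and `fit_line` are hypothetical, and the interval shown is a rough band of 1.96 residual standard deviations (a normal approximation that ignores parameter uncertainty), not a full confidence interval.

```python
from math import exp, sqrt
from statistics import mean


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the maximum for numerical stability
    weights = {label: exp(s - peak) for label, s in scores.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


def classify(scores):
    """Classification: return the most probable category and its probability."""
    probabilities = softmax(scores)
    best = max(probabilities, key=probabilities.get)
    return best, probabilities[best]


def fit_line(x, y):
    """Regression: ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Classification: hypothetical model scores for one sell-side report.
label, probability = classify({"bullish": 2.0, "bearish": 0.5})

# Regression: hypothetical (input, price) observations with a continuous outcome.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
slope, intercept = fit_line(x, y)
point = slope * 6.0 + intercept  # prediction for a new input of 6.0

# Rough 95% band from the residual standard deviation (an approximation).
residuals = [b - (slope * a + intercept) for a, b in zip(x, y)]
residual_sd = sqrt(sum(r * r for r in residuals) / (len(x) - 2))
low, high = point - 1.96 * residual_sd, point + 1.96 * residual_sd
```

The classifier returns one of a finite set of labels together with a probability, while the regression returns a number on a continuous scale together with a range around it.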
Generally speaking, machine learning algorithms are used to identify patterns that are correlated with other events or patterns. The patterns that machine learning identifies are merely correlations, some of which are unrecognizable to the human eye. However, AI and machine learning applications are being used increasingly by economists and others to help understand complex relationships, along with other tools and domain expertise.
Many machine learning techniques are hardly new. Indeed, neural networks, the base concept for deep learning, were first developed in the 1960s. However, after an initial burst of excitement, machine learning and AI failed to live up to their promise, and funding dissipated for over a decade, in part because of the lack of sufficient computing power and data. Renewed funding and interest in applications followed in the 1980s, during which many of the research concepts behind later breakthroughs were developed.
By 2011 and 2012, driven by the vast increase in the computational power of modern computers, machine learning algorithms, especially deep learning algorithms, began to consistently win image, text, and speech recognition contests. Noticing this trend, major tech companies began to acquire deep learning start-ups and rapidly accelerate deep learning research. Also new is the scale of collection of big data, for example the ability to capture data on the scale of every single credit card transaction or every word on the web, and even ‘mouse’ hovers over websites. Other advances have also helped, such as increased interconnectedness of information technology resources with cloud computing architecture, with which big data can now be organised and analysed. Using data sets of this size and complexity, and with the increase in computing power, machine learning algorithms have produced improved results, some of which are highlighted in the sections that follow. This has also spurred large investments in AI start-ups.
The continued advancement of AI-related technologies is expected to drive double-digit year-over-year growth in spending into the next decade. The number of merger and acquisition (M&A) deals in AI has also accelerated over this period.
Key drivers in FinTech
A variety of factors that have contributed to the growing use of FinTech generally have also spurred adoption of AI and machine learning in financial services. On the supply side, financial market participants have benefited from the availability of AI and machine learning tools developed for applications in other fields. These include availability of computing power owing to faster processor speeds, lower hardware costs, and better access to computing power via cloud services. Similarly, there is cheaper storage, parsing, and analysis of data through the availability of targeted databases, software, and algorithms. There has also been a rapid growth of data sets for learning and prediction owing to increased digitization and the adoption of web-based services. The same tools driving advances in machine learning in search engines and self-driving cars can be adopted in the financial sector. For example, entity recognition tools that enable search engines to understand when a user is referring to Ford Motor Company, rather than fording a river, are now used to quickly identify news or social media chatter relevant to publicly traded firms. As more firms adopt these tools, the financial incentives to access new or additional data and to develop faster and more accurate AI and machine learning tools may increase. In turn, such adoption and development of tools may affect incentives for yet other firms. A variety of technological developments in the financial sector have contributed to the creation of infrastructure and data sets. The proliferation of electronic trading platforms has been accompanied by an increase in the availability of high quality market data in structured formats.
In some countries, such as the United States, market regulators allow publicly traded firms to use social media for public announcements. In addition to making digitized financial data available for machine learning, the computerization of markets has made it possible for AI algorithms to interact directly with markets, placing complex buy and sell orders in real time based on sophisticated decision-making, in many cases with minimal human intervention. Meanwhile, retail credit scoring systems have become more common since the 1980s, and news has become machine readable since the 1990s. With the growth of data in financial markets as well as data sets – such as online search trends, viewership patterns and social media that contain financial information about markets and consumers – there are even more data sources that can be explored and mined in the financial sector.
On the demand side, financial institutions have incentives to use AI and machine learning for business needs. Opportunities for cost reduction, risk management gains, and productivity improvements have encouraged adoption, as they all can contribute to greater profitability. In a recent study, industry sources described priorities for using AI and machine learning as follows: optimizing processes on behalf of clients; working to create interactions between systems and staff applying AI to enhance decision-making; and developing new products and services to offer to clients. In many cases these factors may also drive ‘arms races’ in which market participants increasingly find it necessary to keep up with their competitors’ adoption of AI and machine learning, including for reputational reasons. There is also demand due to regulatory compliance.
New regulations have increased the need for efficient regulatory compliance, which has pushed banks to automate and adopt new analytical tools that can include use of AI and machine learning. Financial institutions are seeking cost-effective means of complying with regulatory requirements, such as prudential regulations, data reporting, best execution of trades, and rules on anti-money laundering and combating the financing of terrorism (AML/CFT).
Correspondingly, supervisory agencies are faced with responsibility for evaluating larger, more complex and faster-growing data sets, necessitating more powerful analytical tools to better monitor the financial sector. A number of developments could impact future adoption of a broad range of financial applications of AI and machine learning. These developments include continued growth in the number of data sources and the timeliness of access to data; growth in data repositories, data granularity, variety of data types; and efforts to enhance data quality.
Continued improvement in hardware, as well as AI and machine learning software as a service, including open-source libraries, will also impact continued innovation. Development in hardware includes processing chips and quantum computing that enable faster and more powerful AI. These developments could enable cheaper and broader access to AI and machine learning tools that are increasingly powerful. They could make more sophisticated real-time insights possible on larger data sets, such as real-time databases of online user behavior or internet-of-things (IoT) sensors located around the world.
At the same time, sophisticated software services are becoming more widely available. Some of the software services are open-source libraries made available in the past few years that provide researchers with off-the-shelf tools for machine learning. There is also a growing variety of vendors that provide machine learning for financial market participants, including some firms that scrape news and/or metadata and enable users to identify the specific features (webpages viewed, etc.) that correlate with the events they are interested in predicting. As services emerge to provide, clean, organise, and analyse these data for financial insights, the cost to users of incorporating sophisticated insights may fall significantly. However, risks related to many users relying on the same information and techniques across the financial sector could grow as a result.
The legal framework for relevant data will likely also impact the adoption of AI and machine learning tools. Breaches of personal data or uses of data that are not in the interests of consumers may be expected to lead to added data protection legislation. In addition, the development of new data standards, new data reporting requirements, or other institutional changes in financial services can impact the adoption of AI and machine learning in specific markets.