The global rise in interest rates means that the recent history of near-zero rates for major currencies no longer represents their likely future evolution. Most conventional interest rate models work by modeling incremental changes to the initial level of rates: the further rates move from today's markets, the worse these models' predictions become.
This incremental approach works best for small rate changes in quiet markets. It becomes less effective during market turmoil, when rates change rapidly and their future evolution takes them far from the initial state.
The new type of machine learning risk model we have developed, called Autoencoder Market Models (AEMM), is trained on the entire history of interest rates, not only for the currency being modeled but across all other currencies as well. This historical data covers all kinds of market regimes: for example, the Japanese yen had near-zero interest rates for decades, while rates in other markets were very high.
Because our models are trained on all of the rate regimes across all currencies, they remain effective after a change in market regime that moves rates rapidly away from today's markets. Wherever this evolution takes us, a similar market regime can always be found in the historical data, and the model will represent it well. This makes AEMM more effective than conventional models based on incremental changes from today's rates.
How Long It Takes to Train a Machine Learning Model
Training is the most computationally intensive part of machine learning (ML). However, far less data is available for financial markets than for, say, language or image recognition models, so training an ML model, even on a regular machine, takes only minutes (or, in some cases, an hour or two).
The most important advantage of ML is that this training only needs to be repeated daily, or even less frequently (e.g., monthly or quarterly). Once trained, the model runs at a speed comparable to complex analytical or semi-analytical solutions, outperforming methods such as Monte Carlo, finite differences, and the numerical solution of differential equations.
Autoencoders, Machine Learning Risk Models, and Real-World Problems
The textbook definition of autoencoders is that they are ML algorithms that can compress high-dimensional data into a low-dimensional latent space. These concepts have a very specific meaning in relation to financial markets.
Let us start with high-dimensional data. Financial data for interest rates and other asset classes is high dimensional in the sense that each observation consists of many individual market quotes. For a single stock, for example, valuing the position requires only the stock price. But a yield curve (a curve representing the yields of bonds, swaps, or other interest rate instruments across maturities) consists of 10 to 15 quotes for different maturities. The model must represent all these quotes together because they depend on each other and evolve together, so they cannot be modeled independently. That already gives 10 to 15 dimensions, where each dimension is an individual market quote.
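To make this concrete, here is a minimal sketch in Python; the tenors and yields are purely illustrative numbers, not real market data. A single stock observation is one price, while a single yield curve observation is already a vector of quotes, one per maturity, that must be modeled jointly.

```python
import numpy as np

# One stock observation: a single price (illustrative value)
stock_price = 102.5

# One yield curve observation: a quote for each maturity (tenors and yields are illustrative)
maturities_years = np.array([0.25, 0.5, 1, 2, 3, 5, 7, 10, 15, 20, 30])
curve_quotes = np.array([5.3, 5.2, 5.0, 4.6, 4.4, 4.1, 4.0, 4.0, 4.1, 4.1, 4.0])  # yields in %

print(curve_quotes.shape)  # (11,): one day of data is already an 11-dimensional observation
```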
If we talk about the volatility surface, namely the volatilities of different interest rate options, it has two dimensions:
- The maturity dimension, consisting of options of different maturities
- The strike dimension (the strike prices of the options)
The volatility cube adds a third dimension: the maturity of the underlying swap.
For swaptions, which are options to enter into an interest rate swap, there are three dimensions:
- The option expiry, i.e., the time you have to decide whether to exercise the option
- The length (tenor) of the swap you would enter into
- The strike, expressed as a yield of the underlying instrument
Overall, 100 to 200 individual market quotes together form this market data. The data is high dimensional in the sense that it has many dimensions, but fortunately far fewer than, for example, an image, making it less computationally demanding.
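The sketch below shows how the curve, surface, and cube quotes for a single date combine into one high-dimensional observation vector; all shapes and the random placeholder values are illustrative assumptions, not CompatibL's actual data layout.

```python
import numpy as np

n_curve_tenors = 12           # yield curve: roughly 10-15 quotes
n_expiries, n_strikes = 6, 5  # volatility surface: option expiry x strike
n_swap_tenors = 5             # volatility cube adds the maturity of the underlying swap

# Placeholder quotes for one date (random numbers stand in for market data)
curve = np.random.rand(n_curve_tenors)
surface = np.random.rand(n_expiries, n_strikes)
cube = np.random.rand(n_expiries, n_strikes, n_swap_tenors)

# Flattening everything for one date gives a single observation vector
observation = np.concatenate([curve, surface.ravel(), cube.ravel()])
print(observation.shape)  # (192,): well over 100 individual market quotes
```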
The model cannot describe the evolution of each quote individually: a model with hundreds of state variables (parameters) is very hard to build and work with, and it becomes unstable, changing significantly from day to day. Instead, the model must represent the quotes collectively using a small number of variables.
This is why all interest rate models, not only ML ones, perform dimension reduction: they learn to represent the whole collection of market quotes using fewer variables. These variables are called state variables in interest rate research and latent variables in ML autoencoder research. The two coincide when the model adopts autoencoder latent variables as its state variables.
Most interest rate models in production today use between two and five state variables, far fewer than the number of market quotes in the input data. This is what is meant by saying the state variable space (the same as the latent space) is low dimensional.
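As a hedged illustration of the generic idea of dimension reduction (not the AEMM algorithm itself), the sketch below applies PCA, a familiar non-ML technique, to a synthetic yield curve history and summarizes each day by a handful of state variables; AEMM use autoencoder latent variables for the same purpose.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_days, n_tenors = 2500, 12

# Synthetic yield curve history: random walks stand in for real quotes
history = np.cumsum(rng.normal(scale=0.02, size=(n_days, n_tenors)), axis=0)

# Reduce 12 quotes per day to 3 state variables (production models use 2 to 5)
pca = PCA(n_components=3)
state_variables = pca.fit_transform(history)
print(state_variables.shape)  # (2500, 3): each day is summarized by 3 numbers
```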
What Is a Variational Autoencoder (VAE) and How Is It Used in CompatibL’s Autoencoder Market Models?
Dimension reduction is performed by a compression algorithm, similar in spirit to the algorithms used to compress images (such as JPEG). General-purpose compression algorithms achieve a moderate rate of compression (around 10x for JPEG).
Variational autoencoders are ML algorithms that provide a fundamentally different type of compression: more powerful, but with certain limitations. They can achieve much higher rates of compression (100x to 1000x, compared with 10x for general-purpose algorithms), but only when trained and tuned to specific data, whereas general-purpose algorithms can compress any data. For example, a VAE trained to compress images of faces will do so much better than JPEG, but it will not work well on an image of a different type (e.g., a house or a landscape).
AEMM use VAEs by adopting VAE latent variables as their state variables. An autoencoder learns from its training data how to perform optimal compression; it can then compress similar data it has not seen before.
CompatibL’s AEMM are trained on high-dimensional historical financial market data (the yield curve, volatility surface, and volatility cube, collectively comprising well over 100 dimensions). The models learn how to optimally compress that data into between two and five state variables. VAEs have been used successfully in areas other than finance, and we have now taken this remarkable ML innovation and applied it to financial markets.
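The sketch below shows a minimal variational autoencoder in PyTorch; the layer sizes, names, and training setup are illustrative assumptions, not CompatibL's production architecture. A vector of roughly 100 to 200 market quotes is compressed into a few latent state variables and then reconstructed.

```python
import torch
import torch.nn as nn

class MarketVAE(nn.Module):
    """Compresses a market observation into a few latent state variables and reconstructs it."""
    def __init__(self, n_quotes=192, n_latent=3):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_quotes, 64), nn.ReLU())
        self.to_mu = nn.Linear(64, n_latent)      # mean of the latent distribution
        self.to_logvar = nn.Linear(64, n_latent)  # log-variance of the latent distribution
        self.decoder = nn.Sequential(nn.Linear(n_latent, 64), nn.ReLU(), nn.Linear(64, n_quotes))

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.decoder(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    # Reconstruction error plus KL divergence to a standard normal prior
    reconstruction = nn.functional.mse_loss(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return reconstruction + kl

# One training step on a synthetic batch of "historical" observations
model = MarketVAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
batch = torch.randn(32, 192)  # random numbers stand in for historical market data
x_hat, mu, logvar = model(batch)
loss = vae_loss(batch, x_hat, mu, logvar)
loss.backward()
optimizer.step()
```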
AEMM can also use other types of autoencoders, or any other ML or conventional algorithm that performs compression effectively.
Why We Applied Variational Autoencoders to Interest Rate Modeling
The main motivation for these models is respect for the financial markets and a willingness to learn from history without preconceptions. The conventional interest rate and quant models that banks, financial institutions, and asset managers use to manage risk and value their portfolios are based on equations. Applying such a model starts with deriving an equation and then calibrating its parameters to data, either today's market-implied instrument prices or historical data.
But the first step, selecting an equation, already constrains what the model can do: the chosen equation imposes its rigid structure on what the fit to the financial markets can be. You hope that your equation represents the historical data well enough, but it will never represent it perfectly, because markets do not move according to neat equations.
When we developed AEMM, we needed to consider how financial markets actually operate. Our objective was to learn from history directly, without external constraints imposed by model choice, rather than trying to fit history into an equation. When we train a model on historical data, the model learns to make predictions (of price or risk) based on how the market has behaved in the past, without any arbitrary choice of equation.
How Can AEMM Help with Interest Rate Portfolios, Managing Limits and Add-Ons, and Credit Exposure Concerns?
The models can be used both for pricing and valuation adjustments (XVA), under what quants call the Q- or risk-neutral measure, and for modeling risk and limits, under the P- or real-world measure. They perform especially well over long time horizons, and interest rate portfolios tend to be very long dated. When conventional models based on representing rate increments are used over a 20- to 30-year horizon, the initial state is long forgotten, and the model should not be sensitive to it. The AEMM approach, which optimally represents the curve or surface itself rather than its increments, therefore works especially well for the pricing, XVA, and limits of interest rate portfolios.
This is why we decided to apply these models to interest rate markets first: they work better for portfolios that are very long dated.
Most popular conventional models, however, start from today's market and model rate increments from it. One reason this approach falls short of modeling the actual market is that during market turmoil the future evolution can take us too far from today's market. Another is that over a very long horizon (from a 20-year maturity to a 30-year one, or even longer for fixed-income portfolios), the initial state is long forgotten: what the model predicts 30 years into the future should not depend on the exact point it started from. The longer the horizon, the better the AEMM approach works, because it represents the state of the curve, surface, or volatility cube rather than just the change from the initial state.
So, during market turmoil, when many changes happen in a short period of time (or in any market if you go far enough into the future), models that represent the state (what shape the curve has, as opposed to how it has changed from a very distant point in the past) work better. This applies to pricing, XVA, risk, and limits. It is why AEMM are especially suitable for interest rate portfolios, and indeed for any portfolio that includes long-dated instruments, and why they also help with any portfolio during market turmoil.
Models with Autoencoders Potentially Yield Better Predictions Than Traditional Methods
We expect our model predictions to be better because we learn directly from history. There is a well-known adage that history repeats itself, so by learning from history rather than relying on equations, we hope these models will work well. The data we have gathered from testing AEMM is very encouraging and indicates that this approach is sound.
Because the models learn from history, they can fit the markets better, without the constraints of an equation, and we can see this in the predictions they have made. For example, the yield curve (the yields of instruments with different maturities) can behave differently for near-zero or very low interest rates (sometimes even negative ones) than for mid-range and very high rates. Because AEMM learn from historical data that includes all these regimes, they can represent all of them equally well, whereas conventional models work best for regimes similar to the initial state of the markets and become progressively less accurate as the regime changes.
This ability is what makes AEMM better than conventional models, and the findings we have obtained so far, as well as data from other research groups that adopted this approach, provide strong support for these models.
How the Cloud and Other IT Innovations Are Helping to Bring Down Costs for Cutting-Edge Autoencoders
The computational load pattern of most ML models, including AEMM, differs from that of conventional models. In production, the computationally intensive training process must be performed from time to time (daily in some cases, less frequently, e.g., quarterly, in others). When the trained model is used, the computational load is minimal and comparable to that of an analytical solution.
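A hedged sketch of this load pattern is given below, reusing the hypothetical MarketVAE class from the VAE sketch above; the function names, schedule, and file path are illustrative, not part of an actual deployment. Expensive training runs only occasionally, while day-to-day use just loads the trained weights and runs a cheap forward pass.

```python
import torch

def periodic_retrain(history, weights_path="aemm_weights.pt"):
    # Computationally intensive step, run daily or even quarterly
    model = MarketVAE()
    # ... training loop over `history` would go here ...
    torch.save(model.state_dict(), weights_path)

def run_trained_model(observation, weights_path="aemm_weights.pt"):
    # Cheap step, comparable in cost to an analytical formula
    model = MarketVAE()
    model.load_state_dict(torch.load(weights_path))
    model.eval()
    with torch.no_grad():
        reconstruction, mu, _ = model(observation)
    return mu  # latent state variables that feed pricing and risk calculations
```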
By using cloud services, financial institutions can access powerful computing resources on demand without the need to invest in expensive hardware: with on-premises model deployment, the hardware would be idle most of the time, whereas the cloud pay-as-you-go model provides the most efficient way to deploy ML models. So, the cloud is extremely well suited for ML because of its flexibility in capacity.
Cloud resources are hugely scalable and can be increased or decreased when needed, so you do not have to pay for the capacity you do not use. With ML there is a typical load pattern, with intensive computational demand during training and very little during model use. A lot of ML research and adoption would not have happened without cloud services.
Another aspect of digital transformation that is very important for adopting AEMM and other ML models is the tremendous progress made by the open-source community in releasing powerful Python-based ML and math libraries and making them available in the cloud at minimal cost.
Performing ML research or running models in production on commercially available infrastructure and code from major cloud vendors lowers the barrier to entry: a massive up-front investment in dedicated IT infrastructure is no longer required to carry out ML research or to use these powerful models in production.
Interested in Using Autoencoder Market Models in Your Project?
CompatibL’s Autoencoder Market Models are now available to the quant community as open source. Do not hesitate to get in touch with us if you would like to become a contributor or use them in your project.