In the realm of financial mathematics, understanding the dynamics of interest rates is of paramount importance. The Vasicek model, named after its creator Oldřich Vašíček, is a pioneering contribution that revolutionized the modeling of interest rates. This article delves into the Vasicek model, elucidating its underlying principles, mathematical formulation, applications, and significance in the world of finance.

The Vasicek model emerged in 1977, when the Czech economist Oldřich Vašíček introduced a groundbreaking framework for modeling the evolution of short-term interest rates. His model aimed to address limitations in existing approaches and provide a more realistic depiction of interest rate movements, and it quickly gained prominence for its elegance and applicability in a variety of financial contexts.

It is based on the following assumptions:

Changes in interest rates: The model assumes that changes in interest rates in the economy are continuous. This means that rates follow a connected path, moving through every intermediate value before reaching their final position, rather than jumping.

Arbitrage: The model (an Ornstein-Uhlenbeck process) assumes there is no arbitrage. Arbitrage is the practice of profiting from price differences for the same currency, security, or good across markets: buying it where it sells at a lower price in order to resell it in another market where its price is higher.

Z random term: To simplify the model and its calculations, the equation includes a random term Z. This random term is assumed to be normally distributed with mean zero and variance $\sigma^2$.

Mean reversion: The most critical assumption of the model is that interest rates follow a process of mean reversion. Mean reversion means that if interest rates rise well above their historical levels, they tend to fall back, and conversely, if they drop well below their historical levels, they tend to rise again. The model therefore assumes that interest rates do not drift off to extreme values.

Let us now look at the drawbacks of each assumption:

Only a single factor affects the interest rate: This is the first and most self-explanatory limitation. The model implies that only market volatility, i.e. market risk, drives interest rates. In the real world this is not true: the economy rests on many broad indicators, and interest rates depend on several factors besides risk.

Introduction of a drift term: The drift term implies that interest rates in the economy can go below zero, in other words, become negative. This is an extreme phenomenon that deserves careful consideration; monetary authorities do use negative rates as a tool to restore economic stability, but in a very deliberate manner.

Continuous interest rates: Assuming that interest rates are continuous is another questionable simplification. In a real-world scenario this hardly holds, since rates are widely observed to change in distinct steps. It would be fine if they moved point by point along a path, but in practice they can make sharp, unpredicted jumps.

No arbitrage: In the dynamic set-up of investment and savings found in every economy, it is doubtful that arbitrage is absent. Globalization has vastly integrated capital markets and increased their mobility, so assuming arbitrage opportunities away becomes unrealistic.

Z follows a normal distribution: The Vasicek model also assumes that Z, the random term, follows a normal distribution, without any strong justification. This can fail in practice, where variables are erratic and time-varying. If the assumption is violated, the Vasicek model may become too unreliable to use as a decision-making criterion for investment.

The model is based on the following stochastic differential equation:

$$dr_t = a(b - r_t)\,dt + \sigma\,dW_t$$

where $W_t$ is a Wiener process under the risk-neutral framework modeling the random market risk factor (the stochastic term would be 0 if no exogenous stimuli occurred), $\sigma$ is the instantaneous volatility of the interest rate, $a$ is a non-negative constant representing the speed of reversion, and $b$ is the long-term mean of the interest rate. The formula shows that the interest rate can take negative values; since in practice this is quite rare, the problem is addressed by later models such as the Cox-Ingersoll-Ross model and the Hull-White model, among others.

The long-term variance is given by the term $\sigma^2 / (2a)$, showing that a higher speed of reversion reduces the long-term volatility (demonstration below).
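The mean-reverting dynamics and the variance-damping effect of $a$ can be illustrated with a short simulation. This is a minimal sketch with illustrative parameters (not estimated from data), using an Euler-Maruyama discretisation of the SDE:

```python
import numpy as np

def simulate_vasicek(r0, a, b, sigma, dt, n_steps, seed=0):
    """Simulate one Vasicek path with an Euler-Maruyama scheme."""
    rng = np.random.default_rng(seed)
    r = np.empty(n_steps + 1)
    r[0] = r0
    for i in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))  # Wiener increment over dt
        r[i + 1] = r[i] + a * (b - r[i]) * dt + sigma * dW
    return r

# Illustrative parameters: weekly steps, two speeds of reversion
slow = simulate_vasicek(r0=0.03, a=0.1, b=0.05, sigma=0.02, dt=1/52, n_steps=20_000)
fast = simulate_vasicek(r0=0.03, a=2.0, b=0.05, sigma=0.02, dt=1/52, n_steps=20_000)

# A higher speed of reversion a shrinks the long-run variance sigma^2 / (2a),
# while both paths hover around the long-term mean b = 5%
print(np.var(slow[5000:]), np.var(fast[5000:]))
```

Discarding the first part of each path as burn-in, the sample variance of the fast-reverting path is markedly smaller, in line with the $\sigma^2/(2a)$ term.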

After some simplification we find that the state variable is normally distributed with mean:

$$\mathbb{E}[r_t] = r_0 e^{-at} + b\left(1 - e^{-at}\right)$$

and variance:

$$\mathrm{Var}[r_t] = \frac{\sigma^2}{2a}\left(1 - e^{-2at}\right)$$

As $t$ tends to infinity the expected interest rate approaches $b$, and the long-term variance is:

$$\lim_{t \to \infty} \mathrm{Var}[r_t] = \frac{\sigma^2}{2a}$$
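These closed-form moments can be checked numerically by simulating many paths up to a horizon $t$ and comparing the sample mean and variance with the formulas. A minimal sketch with illustrative parameters:

```python
import numpy as np

# Illustrative parameters (not estimated from data)
a, b, sigma, r0, t = 0.5, 0.05, 0.02, 0.03, 2.0
n_paths, n_steps = 50_000, 400
dt = t / n_steps

# Euler-Maruyama simulation of many paths, vectorised over paths
rng = np.random.default_rng(1)
r = np.full(n_paths, r0)
for _ in range(n_steps):
    r += a * (b - r) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)

# Closed-form conditional mean and variance at horizon t
mean_th = r0 * np.exp(-a * t) + b * (1 - np.exp(-a * t))
var_th = sigma**2 / (2 * a) * (1 - np.exp(-2 * a * t))

print(r.mean(), mean_th)  # sample vs theoretical mean
print(r.var(), var_th)    # sample vs theoretical variance
```

With a fine time grid the Monte Carlo moments agree with the formulas to within sampling error.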

**Estimating the parameters**

From the general formula written above we can see that there are two parameters, $a$ and $b$, which are not given and can lead to different estimations, so we are going to estimate them. However, estimating parameters directly from an SDE is quite complex, so we work with the discrete-time model (with the time step absorbed into the parameters):

$$r_t - r_{t-1} = a(b - r_{t-1}) + \varepsilon_t$$

The parameters can then be estimated with an OLS regression, which can be written as:

$$r_t - r_{t-1} = \alpha + \beta\, r_{t-1} + \varepsilon_t, \qquad \alpha = ab, \quad \beta = -a$$

We took weekly data from the last 70 years for the 3-month Treasury Bill on the secondary market. We started by regressing $r_t$ on its lagged value $r_{t-1}$ and obtained the coefficients discussed below.

The estimated coefficient on $r_{t-1}$, corresponding to $-a$, is $-0.001777$, and the intercept, corresponding to $ab$, is $0.008545$. Both estimated coefficients have t-values below 1.96, so they are not significant at the 95% confidence level; at the 90% level, however, we can still reject the null hypothesis, accepting a slightly higher risk. This gives a speed of reversion $a \approx 0.0018$ and an equilibrium rate $b = 0.008545 / 0.001777 \approx 4.8\%$.

The Durbin-Watson statistic tests for autocorrelation under the null hypothesis that there is none. A value of 1.46 indicates positive autocorrelation in the series, so we can reject the null hypothesis, which supports using an autoregressive specification such as the Vasicek model for this sample.
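As a sketch of the OLS procedure on synthetic data (the Treasury-bill series itself is not reproduced here, and the parameters below are illustrative), the regression recovers $a$ and $b$ from the discrete model:

```python
import numpy as np

# Simulate the discrete Vasicek model r_t - r_{t-1} = a(b - r_{t-1}) + eps_t
true_a, true_b, sigma = 0.05, 0.05, 0.002
rng = np.random.default_rng(42)
n = 200_000
r = np.empty(n)
r[0] = 0.03
for i in range(1, n):
    r[i] = r[i - 1] + true_a * (true_b - r[i - 1]) + sigma * rng.standard_normal()

# OLS: regress the change r_t - r_{t-1} on r_{t-1};
# then a = -slope and b = -intercept / slope
dr = np.diff(r)
X = np.column_stack([np.ones(n - 1), r[:-1]])
(alpha, beta), *_ = np.linalg.lstsq(X, dr, rcond=None)

a_hat, b_hat = -beta, -alpha / beta
print(a_hat, b_hat)  # close to the true values 0.05 and 0.05
```

On a long enough sample the OLS estimates land close to the true speed of reversion and equilibrium rate.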

**Maximum Likelihood calibration**

The OLS estimation approach is not the only one available: we can also calibrate the model to historical data by applying maximum likelihood estimation to fit the parameters. In maximum likelihood estimation, we search over all possible parameter values of a specified model for the set under which the observed sample was most likely, that is, the parameter values that, given the model, were most likely to have generated the data we have in hand.

Given a sample of $n$ i.i.d. random variables $X_i$ drawn from a population $X$ with probability/density function $f(x, \theta)$, the likelihood function, representing the probability/density function of the sample itself, is constructed. In this context it is treated as a function of the parameter vector $\theta$, while the sample values $x_i$ are fixed.

Analytically we have:

$$L(x_1, \ldots, x_n; \theta) = \prod_{i=1}^{n} f(x_i, \theta)$$

The statistic $\hat{\theta} = t(x_1, x_2, \ldots, x_n)$ is called a maximum likelihood estimator if, for each extracted sample, it assigns to the vector $\theta$ the value that maximizes the likelihood function. In symbols, the maximum likelihood estimate is defined as:

$$\hat{\theta} = \arg\max_{\theta} L(x_1, \ldots, x_n; \theta)$$

To compute the MLE we use the log-likelihood function, obtained by applying the natural logarithm:

$$\ell(x, \theta) = \ln L(x, \theta)$$

Since the logarithm is a monotonically increasing transformation, passing to the log-likelihood preserves where $L(x, \theta)$ increases and decreases. For i.i.d. random variables this is convenient because the joint density of the sample factorizes as the product of the marginals; by the properties of logarithms, from [1.1] the log-likelihood becomes a sum:

$$\ell(x, \theta) = \sum_{i=1}^{n} \ln f(x_i, \theta)$$

In the case of a normally distributed population the density function is:

$$f(x_i; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)$$

and the log-likelihood function is:

$$\ell(\mu, \sigma^2) = -\frac{n}{2}\ln\!\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$$
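As a quick numerical sanity check of the normal case: maximizing this log-likelihood recovers the sample mean and the (biased) sample standard deviation. A minimal sketch using a numerical optimizer:

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic normal sample with known parameters
rng = np.random.default_rng(3)
x = rng.normal(loc=2.0, scale=1.5, size=10_000)

def neg_loglik(params):
    """Negative normal log-likelihood; optimize log(sigma) to keep sigma > 0."""
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    return (0.5 * len(x) * np.log(2 * np.pi * sigma**2)
            + np.sum((x - mu) ** 2) / (2 * sigma**2))

res = minimize(neg_loglik, x0=[0.0, 0.0], method="L-BFGS-B")
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print(mu_hat, sigma_hat)  # ~ sample mean and (biased) sample std
```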

The distribution of $r_{t+\Delta t}$ given $r_t$ is assumed to be normal, and it follows:

$$r_{t+\Delta t} \mid r_t \sim \mathcal{N}\!\left(r_t e^{-a\Delta t} + b\left(1 - e^{-a\Delta t}\right),\; \frac{\sigma^2}{2a}\left(1 - e^{-2a\Delta t}\right)\right)$$

Reducing to the log-likelihood:

$$\ell(a, b, \sigma) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln V - \frac{1}{2V}\sum_{i=1}^{n}\left(r_{t_i} - r_{t_{i-1}}e^{-a\Delta t} - b\left(1 - e^{-a\Delta t}\right)\right)^2, \qquad V = \frac{\sigma^2}{2a}\left(1 - e^{-2a\Delta t}\right)$$

We need to find values for $a$, $b$ and $\sigma$; it is usually more convenient to work with the log-likelihood, which is easier to maximize. We run the MLE optimization algorithm with an initial guess for the model parameters; the algorithm iteratively adjusts them to maximize the likelihood of the observed data. The final values of $a$, $b$ and $\sigma$ are the estimated parameters that best fit the historical data according to the Vasicek model.

For this estimation we used the L-BFGS-B algorithm, available in Python's scipy library. It stands for Limited-memory Broyden-Fletcher-Goldfarb-Shanno with Bound constraints and is an optimization algorithm for solving unconstrained and bound-constrained nonlinear optimization problems. It is a variant of the BFGS (Broyden-Fletcher-Goldfarb-Shanno) algorithm, a popular method for finding a local minimum (or maximum) of a function.

The algorithm starts with an initial estimate $x_0$ of the optimum and proceeds iteratively to refine it with a sequence of better estimates $x_1, x_2, \ldots$. The gradients $g_k := \nabla f(x_k)$ are the key driver of the algorithm: they identify the direction of steepest descent and are also used to form an estimate of the Hessian matrix (second derivatives) of $f(x)$. L-BFGS-B belongs to the class of quasi-Newton methods, which iteratively build an approximation of the Hessian from gradient information, updating it at each iteration to guide the search for the minimum. A significant advantage of L-BFGS-B is its limited-memory approach, which approximates the Hessian without storing it explicitly, making it efficient for large-scale optimization problems where memory is limited.
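A minimal sketch of the full calibration on simulated data (illustrative parameters and starting guesses, not the article's Treasury-bill series), using the exact Gaussian transition density and scipy's L-BFGS-B:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(params, r, dt):
    """Exact-transition negative log-likelihood of the Vasicek model."""
    a, b, sigma = params
    mean = r[:-1] * np.exp(-a * dt) + b * (1 - np.exp(-a * dt))
    var = sigma**2 / (2 * a) * (1 - np.exp(-2 * a * dt))
    return 0.5 * np.sum(np.log(2 * np.pi * var) + (r[1:] - mean) ** 2 / var)

# Simulate weekly data from the exact transition with known parameters
true_a, true_b, true_sigma, dt = 0.5, 0.05, 0.02, 1 / 52
rng = np.random.default_rng(0)
n = 20_000
r = np.empty(n)
r[0] = 0.03
m = np.exp(-true_a * dt)
s = np.sqrt(true_sigma**2 / (2 * true_a) * (1 - np.exp(-2 * true_a * dt)))
for i in range(1, n):
    r[i] = r[i - 1] * m + true_b * (1 - m) + s * rng.standard_normal()

# Maximize the likelihood (minimize its negative) with L-BFGS-B,
# bounding a and sigma away from zero
res = minimize(neg_log_likelihood, x0=[0.1, 0.03, 0.01], args=(r, dt),
               method="L-BFGS-B",
               bounds=[(1e-4, None), (None, None), (1e-6, None)])
a_hat, b_hat, sigma_hat = res.x
print(a_hat, b_hat, sigma_hat)  # near the true 0.5, 0.05, 0.02
```

The speed of reversion is the hardest parameter to pin down from data; its estimate carries much more sampling error than the equilibrium rate or the volatility, which is worth bearing in mind when comparing OLS and MLE results.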

Running this model we obtain an equilibrium rate of 4.8% and a speed of reversion of 0.44, which seems credible compared with the OLS results.

## Comments