
Econometrics model for volatility - GARCH

Updated: Dec 4, 2023


In the analysis of macroeconomic data, the variance of the error terms in time-series models often turns out to be less stable than is generally assumed. In addition, time series, and financial series in particular, often exhibit non-linear dependence. This caused considerable problems, since the models then available did not account for it at all.

Empirical results suggest that, in models fitted to this type of data, large and small errors tend to occur in clusters, generating a form of heteroskedasticity in which the variance of the error term depends on the size of the preceding errors. The first characteristic to emerge from the pioneering contributions of Mandelbrot (1963) and Fama (1965) is that time series of returns are characterized by leptokurtic distributions and "volatility clustering". Their presence undermines the hypothesis of normality, since the same series alternate between periods of large oscillations around the mean and periods of small variations. Subsequently, other authors have shown that volatility tends to increase when events occur that raise uncertainty in the markets.

The empirical evidence also provided the following results:

1. Time series of financial asset prices are generally integrated processes, while those of returns are stationary.

2. Return series are often fractionally integrated processes.

3. Returns are usually not autocorrelated.

4. Squared returns show significant autocorrelation, which supports the assumption of non-linear relationships between returns and their own past values.
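Points 3 and 4 can be illustrated with a short sketch. The series below is not market data: it is a hypothetical stand-in whose volatility alternates between calm and turbulent blocks, chosen only so that the effect is easy to see. The helper name sample_acf and all parameter values are assumptions made for illustration.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    centred = x - x.mean()
    denom = np.sum(centred ** 2)
    acf = [1.0]
    for lag in range(1, max_lag + 1):
        acf.append(np.sum(centred[lag:] * centred[:-lag]) / denom)
    return np.array(acf)

rng = np.random.default_rng(42)
n = 20_000

# Hypothetical stand-in for returns: independent shocks scaled by a
# volatility that switches between 0.5 and 2.0 every 250 observations.
block = (np.arange(n) // 250) % 2
vol = np.where(block == 0, 0.5, 2.0)
returns = vol * rng.standard_normal(n)

acf_raw = sample_acf(returns, max_lag=5)
acf_squared = sample_acf(returns ** 2, max_lag=5)

print("lag-1 autocorrelation of returns:        ", round(acf_raw[1], 3))
print("lag-1 autocorrelation of squared returns:", round(acf_squared[1], 3))
```

The raw series shows almost no autocorrelation, while the squared series does, which is the pattern the two points describe.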

Based on these findings, the hypothesis that returns are normal and i.i.d. collapses. This hypothesis is an essential condition, especially in asset allocation models, and it implies three conditions:

1. Constant volatility of returns for each security,

2. Correlations between the returns of different securities constant over time,

3. About 99.7% of the available data lie in the range [μ − 3σ, μ + 3σ], where μ and σ represent the mean and standard deviation of the distribution of returns, respectively.
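The third condition can be checked directly from the normal distribution. The sketch below uses only the standard library; the function name normal_coverage is an assumption for illustration.

```python
import math

def normal_coverage(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

coverage_3sigma = normal_coverage(3.0)
print(f"P(|X - mu| <= 3 sigma) = {coverage_3sigma:.4f}")
```

Under normality, only about 0.27% of observations should fall outside the three-sigma band. Real return series typically produce more such observations, which is one symptom of the fat tails discussed in this article.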

In addition to mean-variance analysis, the theory of finance has drawn on other contributions that focus on different aspects. Among these are the many models of time-varying volatility, which are useful tools for interpreting all the empirical characteristics of financial assets listed so far. Their main feature is the ability to capture the nonlinearity of some economic phenomena, a goal that earlier models could not achieve. Early empirical applications to financial markets focused on analysing time series through the first conditional moment, treating the higher moments essentially as constraints. The growing role of risk and uncertainty in decision making, together with evidence that risk itself (and therefore volatility) varies over time, led modern economic theory to develop new time-series techniques focused mainly on the conditional moments beyond the first. This category includes the class of Auto-Regressive Conditional Heteroskedasticity (ARCH) models, which is particularly important because, for the first time, attention is paid to the distinction between the conditional and the unconditional second moment. The innovative element is that, while the unconditional variance-covariance matrix of a generic variable of interest (security returns, exchange rates, inflation rates, etc.) may not vary over time, the conditional one often shows a dynamic pattern.

1. Volatility

Volatility is the variability of a value or of a financial index over a given time interval. It became a new object of study in time-series analysis once it was realized that components such as risk, uncertainty and structural change play an important and often decisive role within the economic system. Volatility is the observable expression of the uncertainty present in financial markets. Analytically, it is identified with the variance (if it exists) conditional on the information set available at time t. In univariate models volatility is a scalar (h_t), because there is a single variable of interest, while in multivariate models it is represented by a square, symmetric and at least positive semi-definite matrix (H_t) whose dimension equals the number of variables considered. Volatility is a phenomenon with a memory that can go back very far in time, so all its formulations also take its past values into account. For example, in asset allocation it is crucial to be able to examine movements in the volatility of the returns of several assets at the same time, since portfolio risk can be reduced through diversification. To explain time-varying volatility, Engle (1982) developed the first ARCH model, a new method of time-series analysis based on the intuition that conditional variance is related to its own past values. From an econometric point of view, this means that volatility shows autoregressive dynamics over time. With reference to the linear regression model, the contribution of Engle (1982) therefore provided the basis for the later formalization of several error specifications that are linear neither in mean nor in variance and, above all, are able to explain different characteristics of time-varying volatility.

2.1 Leptokurtosis

Mandelbrot (1963) and Fama (1965, 1970) were the first to document that time series of returns are characterized by leptokurtic distributions, that is, distributions in which the probability mass in the tails is greater than under the density of the normal random variable. Statistically, this means that the observed samples show excess kurtosis.
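As a minimal sketch of what excess kurtosis measures, the snippet below compares a normal sample with a fat-tailed one. The Student-t distribution with 5 degrees of freedom is only an assumed stand-in for a leptokurtic return distribution, and the function name excess_kurtosis is hypothetical.

```python
import numpy as np

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (zero for a normal distribution)."""
    x = np.asarray(x, dtype=float)
    centred = x - x.mean()
    m2 = np.mean(centred ** 2)
    m4 = np.mean(centred ** 4)
    return m4 / m2 ** 2 - 3.0

rng = np.random.default_rng(0)
normal_sample = rng.standard_normal(100_000)
# Assumption: Student-t with 5 degrees of freedom as a fat-tailed example.
fat_tailed_sample = rng.standard_t(df=5, size=100_000)

kurt_normal = excess_kurtosis(normal_sample)
kurt_fat = excess_kurtosis(fat_tailed_sample)
print("excess kurtosis, normal sample:    ", round(kurt_normal, 3))
print("excess kurtosis, fat-tailed sample:", round(kurt_fat, 3))
```

A value close to zero is consistent with normality, while a clearly positive value indicates heavier tails than the normal distribution.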

2.2 Volatility clustering

Volatility is a persistent phenomenon: its value in a given period is closely related to its value in the preceding period and to its value in the following one. To highlight this phenomenon Mandelbrot (1963) writes:

big changes tend to be followed by big changes,

while small changes tend to be followed by small changes

These words capture the concept of volatility clustering: time series show a continuous alternation of periods of large oscillations and periods of small oscillations around their mean. Volatility clustering and leptokurtosis are closely related. Conditional heteroskedasticity models, such as ARCH-type and stochastic volatility (SV) models, were introduced essentially to model this phenomenon. Diebold (1988) and Drost and Nijman (1993) also document that volatility clustering tends to vanish when the data are aggregated over time.

Example “SP500 stock index”

Volatility clustering is a consequence of different events that tend to influence the market, for example central bank policies, company earnings announcements, trading volumes and macroeconomic factors such as recessions. These variables tend to affect the volatility of different securities in a similar way.


The main motivation for the study of conditional heteroskedasticity in finance is to study the volatility of asset returns. Volatility is an incredibly important concept in finance because it is highly synonymous with risk.

For example, consider the prevalence of downside portfolio insurance among long-only fund managers. If the stock markets were to have a challenging day (i.e. a substantial drop!), this could trigger automated sell orders for risk management, which would further depress the prices of the shares within these portfolios. Because larger portfolios are generally highly correlated, this could trigger significant downward volatility.

These sell-off periods, as well as many other forms of volatility that occur in finance, lead to heteroskedasticity that is serially correlated and therefore conditional on periods of greater variance. We therefore say that these series are conditionally heteroskedastic.

Before delving into generalized models, we take a look at the ARCH(1) model, where

$$\epsilon_t = \sigma_t w_t, \qquad \sigma_t^2 = \alpha_0 + \alpha_1 \epsilon_{t-1}^2$$

Here w_t is white noise, and the conditional variance is a function of the previous squared value of the series. This is what gives the process the name autoregressive: the variance follows an AR(1)-type recursion without a noise term of its own. Note that ARCH(1) should only be applied to a series for which we have already fitted an appropriate model, sufficient for the residuals to be regarded as discrete white noise. Since we can only verify whether an ARCH model is suitable by squaring the residuals and examining their correlogram, we also need to make sure that the mean of the residuals is zero.
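The link to an AR(1) process can be made explicit with a short derivation. It assumes the standard ARCH(1) form, with w_t a white noise of zero mean and unit variance; the symbol η_t is introduced here only for illustration.

```latex
% Assumed standard ARCH(1) form
\epsilon_t = \sigma_t w_t, \qquad \sigma_t^2 = \alpha_0 + \alpha_1 \epsilon_{t-1}^2

% Add and subtract \sigma_t^2 to obtain an equation for the squared series
\epsilon_t^2 = \sigma_t^2 + \left(\epsilon_t^2 - \sigma_t^2\right)
             = \alpha_0 + \alpha_1 \epsilon_{t-1}^2 + \eta_t,
\qquad \eta_t = \sigma_t^2 \left(w_t^2 - 1\right)

% \eta_t has zero conditional mean, because E[w_t^2] = 1
E\left[\eta_t \mid \epsilon_{t-1}, \epsilon_{t-2}, \dots\right] = 0
```

Under these assumptions the squared series behaves like an AR(1) process with coefficient α1, which is why the correlogram of the squared residuals is the natural diagnostic.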

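In its generalized form, the GARCH(p,q) model has

$$\epsilon_t = \sigma_t w_t, \qquad \sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2$$

where w_t is white noise. It is the same as an ARCH model, except that it also adds a moving-average-type term in the lagged conditional variances, so that it is quite similar to an ARMA process. Note that for GARCH(1,1) it is necessary that α1+β1<1, otherwise the series becomes unstable.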
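The stability condition can be motivated by a short calculation for the GARCH(1,1) case. It assumes the standard recursion for the conditional variance and a stationary process with unconditional variance written here as \bar{\sigma}^2, a symbol introduced only for this sketch.

```latex
% Assumed GARCH(1,1) recursion
\sigma_t^2 = \alpha_0 + \alpha_1 \epsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2

% Take unconditional expectations, using
% E[\epsilon_{t-1}^2] = E[\sigma_{t-1}^2] = \bar{\sigma}^2 under stationarity
\bar{\sigma}^2 = \alpha_0 + (\alpha_1 + \beta_1)\,\bar{\sigma}^2
\quad\Longrightarrow\quad
\bar{\sigma}^2 = \frac{\alpha_0}{1 - \alpha_1 - \beta_1}
```

With α0 > 0, this long-run variance is finite and positive only when α1+β1<1. Its square root is the quantity computed as the long-run volatility in the code at the end of the article.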

We took data from the last 13 years for the S&P 500 and calibrated the model by maximum likelihood estimation (MLE). The resulting parameters are α1 equal to 0.20, α0 equal to 0.0008 and β1 equal to 0.75, with an average daily volatility of 1%. To solve the MLE we applied the Nelder-Mead method. Also known as the downhill simplex method, it is a numerical optimization technique used to find the minimum (or maximum) of an objective function. It is a derivative-free method, meaning that it does not require the gradients or derivatives of the objective function. Instead, it explores the function's behavior by iteratively adjusting a simplex (a geometric shape with N+1 vertices in N-dimensional space, such as a triangle in two dimensions) until it converges to the minimum.
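To show the optimizer in isolation, here is a minimal sketch that applies Nelder-Mead to a toy objective with a known minimum. The function bowl, the starting point and the tolerance settings are assumptions chosen for illustration and are unrelated to the GARCH likelihood.

```python
import numpy as np
from scipy.optimize import minimize

def bowl(x):
    """Toy objective with a known minimum at (1.0, -0.5)."""
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2

# No gradient is supplied: Nelder-Mead only evaluates the function itself.
result = minimize(
    bowl,
    x0=[2.0, 2.0],
    method="Nelder-Mead",
    options={"xatol": 1e-8, "fatol": 1e-8, "maxiter": 2000},
)

print("estimated minimum:", result.x)
print("function evaluations:", result.nfev)
```

To maximise a log-likelihood with the same routine, one minimises its negative, as the estimation code at the end of the article does.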

Based on these parameter values, we now want to simulate a GARCH(1,1) process, in a way similar to simulating a random walk. We create two vectors: one to store the randomly generated white noise values and one for the simulated series. We then plot the correlogram:

Judging by this graph, the series looks like a white noise process. However, the correlogram of the squared values reveals a conditionally heteroskedastic process, visible in the slow decay of successive lags:
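The simulation described above can be sketched as follows. It assumes standard normal white noise and uses the parameter values quoted earlier (α0 = 0.0008, α1 = 0.20, β1 = 0.75). The function names, the sample size and the seed are hypothetical, and the lag-1 correlation is only a rough numerical substitute for the correlogram.

```python
import numpy as np

def simulate_garch11(n, alpha0, alpha1, beta1, seed=0):
    """Simulate n observations of a GARCH(1,1) process with Gaussian noise."""
    if alpha1 + beta1 >= 1:
        raise ValueError("alpha1 + beta1 must be below 1 for a stable process")
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(n)       # vector of white noise draws
    eps = np.zeros(n)                # vector for the simulated series
    sigma2 = np.zeros(n)             # conditional variances
    sigma2[0] = alpha0 / (1 - alpha1 - beta1)  # start at the long-run variance
    eps[0] = np.sqrt(sigma2[0]) * w[0]
    for t in range(1, n):
        sigma2[t] = alpha0 + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * w[t]
    return eps, sigma2

def lag1_corr(x):
    """Correlation between a series and its own first lag."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]

eps, sigma2 = simulate_garch11(10_000, alpha0=0.0008, alpha1=0.20, beta1=0.75)

print("lag-1 correlation of the series:        ", round(lag1_corr(eps), 3))
print("lag-1 correlation of the squared series:", round(lag1_corr(eps ** 2), 3))
```

Starting the recursion at the long-run variance is a design choice that avoids a burn-in period. Other initial values are possible if the first observations are discarded.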

We then fit a GARCH model to our simulated series, to test whether the model parameters are robust.

The results show that all of the parameters are significant, because the absolute value of each t-statistic is greater than 2.

#importing packages
import numpy as np
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt
import scipy.optimize as spop

#specifying the sample
ticker = '^GSPC'
start = '2015-12-31'
end = '2021-06-25'
#downloading closing prices (call assumed to be yf.download)
prices = yf.download(ticker, start=start, end=end)['Close']

#calculating returns from a flat array of prices
prices_array = np.asarray(prices, dtype=float).ravel()
returns = prices_array[1:]/prices_array[:-1] - 1

#starting parameter values - sample mean and variance
mean = np.average(returns)
var = np.std(returns)**2

def garch_mle(params):
    #specifying model parameters
    mu = params[0]
    omega = params[1]
    alpha = params[2]
    beta = params[3]
    #calculating long-run volatility
    long_run = (omega/(1 - alpha - beta))**(1/2)
    #calculating realised and conditional volatility
    resid = returns - mu
    realised = abs(resid)
    conditional = np.zeros(len(returns))
    conditional[0] = long_run
    for t in range(1,len(returns)):
        conditional[t] = (omega + alpha*resid[t-1]**2 + beta*conditional[t-1]**2)**(1/2)
    #calculating log-likelihood under normally distributed residuals
    likelihood = 1/((2*np.pi)**(1/2)*conditional)*np.exp(-realised**2/(2*conditional**2))
    log_likelihood = np.sum(np.log(likelihood))
    #returning the negative because the optimiser minimises
    return -log_likelihood

#maximising log-likelihood
res = spop.minimize(garch_mle, [mean, var, 0, 0], method='Nelder-Mead')

#retrieving optimal parameters
params = res.x
mu = res.x[0]
omega = res.x[1]
alpha = res.x[2]
beta = res.x[3]
#res.fun is the minimised negative log-likelihood
log_likelihood = -float(res.fun)

#calculating realised and conditional volatility for optimal parameters
long_run = (omega/(1 - alpha - beta))**(1/2)
resid = returns - mu
realised = abs(resid)
conditional = np.zeros(len(returns))
conditional[0] = long_run
for t in range(1,len(returns)):
    conditional[t] = (omega + alpha*resid[t-1]**2 + beta*conditional[t-1]**2)**(1/2)

#printing optimal parameters
print('GARCH model parameters')
print('mu '+str(round(mu, 6)))
print('omega '+str(round(omega, 6)))
print('alpha '+str(round(alpha, 4)))
print('beta '+str(round(beta, 4)))
print('long-run volatility '+str(round(long_run, 4)))
print('log-likelihood '+str(round(log_likelihood, 4)))

#visualising the results
plt.rc('xtick', labelsize = 10)
#assumed completion: plotting realised against conditional volatility
plt.plot(realised, label='realised')
plt.plot(conditional, label='conditional')
plt.legend()
plt.show()


