Introduction

Within the expansive domain of quantitative trading, Kakushadze and Serur's "151 Trading Strategies" presents a panorama of diverse tactics. This report analyzes selected strategies, each embodying a distinct facet of the intricate fabric of financial markets. The examination focuses on the practicality, mathematical coherence, and real-world adaptability of the chosen strategies, shedding light on the nuanced landscape of quantitative trading.

Stocks

There are many strategies in the field of stocks, but the majority concern a single stock at a time.

Price-momentum

This strategy rests on the empirical observation that there appears to be a certain "inertia" in stock returns, known as the momentum effect, whereby future returns are positively correlated with past returns. In the next example the cumulative returns of several stocks are computed over the last two years. They are then sorted in order to identify the top quartile (stocks that should have been bought) and the bottom quartile (the short leg of the strategy). The histogram is then plotted.

Code
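The original code is not shown, so the following is a minimal sketch of the ranking step, using simulated prices and placeholder ticker names (both are illustrative assumptions, not data from the report): cumulative two-year returns are computed, sorted, and split into a long (top-quartile) and a short (bottom-quartile) basket.

```python
# Hypothetical sketch of the price-momentum ranking step.
# Tickers and the simulated return data are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
tickers = [f"STK{i:02d}" for i in range(20)]

# Simulated daily returns for ~2 years (504 trading days) per stock.
daily_returns = rng.normal(0.0005, 0.02, size=(504, len(tickers)))

# Cumulative return over the look-back window for each stock.
cum_returns = np.prod(1.0 + daily_returns, axis=0) - 1.0

# Sort by past performance and split into quartiles:
# buy the top quartile, short the bottom one.
order = np.argsort(cum_returns)
q = len(tickers) // 4
short_basket = [tickers[i] for i in order[:q]]    # worst past performers
long_basket = [tickers[i] for i in order[-q:]]    # best past performers
print("long:", long_basket)
print("short:", short_basket)
```

A histogram of `cum_returns` (e.g. via matplotlib) would then visualize the cross-sectional spread the strategy exploits.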

Volatility

This strategy is based on the empirical observation that future returns of previously low-return-volatility portfolios outperform those of previously high-return-volatility portfolios. This runs counter to the expectation that higher-risk assets should yield proportionately higher returns; the effect arises because performance is evaluated on a risk-adjusted basis (the "low-risk anomaly"). As in the previous example, stocks are sorted, but by volatility, measured here as the standard deviation of returns.

Code
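Again the original code is elided, so here is a minimal sketch under the same simulated-data assumption as before: realized volatility (standard deviation of daily returns) is computed per stock, and the cross-section is sorted so the low-volatility quartile forms the long side and the high-volatility quartile the short side.

```python
# Hypothetical sketch of the low-volatility sorting step.
# Tickers and per-stock volatilities are simulated, not real data.
import numpy as np

rng = np.random.default_rng(1)
tickers = [f"STK{i:02d}" for i in range(20)]

# Give each simulated stock its own volatility level.
true_vols = rng.uniform(0.01, 0.04, size=len(tickers))
daily_returns = rng.normal(0.0, 1.0, size=(504, len(tickers))) * true_vols

# Realized volatility: standard deviation of daily returns per stock.
realized_vol = daily_returns.std(axis=0)

# Sort ascending: buy the calmest quartile, short the wildest.
order = np.argsort(realized_vol)
q = len(tickers) // 4
low_vol_basket = [tickers[i] for i in order[:q]]    # candidates to buy
high_vol_basket = [tickers[i] for i in order[-q:]]  # candidates to short
```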

A histogram is plotted, allowing for the possibility of buying stocks with lower volatility and selling those with higher volatility.

Residual Momentum

This strategy is similar to price-momentum, with the difference that it considers, in place of raw returns, the residuals from a linear regression on the three Fama-French factors MKT(t), SMB(t), HML(t). The Fama-French factors capture market performance, company size, and value-related effects, respectively. By using residuals, the strategy aims to filter out the influence of these factors, emphasizing the unique patterns beyond general market trends and size or value considerations. Investors employing the residual-momentum strategy may seek to capitalize on these more refined signals for potential market outperformance or underperformance.
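The regression step above can be sketched as follows, with simulated factor and return series standing in for the real Fama-French data (an assumption for illustration): each stock's returns are regressed on the three factors plus an intercept, and the momentum signal is built from the residuals rather than the raw returns.

```python
# Hypothetical sketch of residual momentum: regress returns on
# simulated MKT/SMB/HML factor series and rank by cumulative residual.
import numpy as np

rng = np.random.default_rng(2)
T, n_stocks = 252, 12

# Simulated factor returns (columns: MKT, SMB, HML) -- illustrative only.
factors = rng.normal(0.0, 0.01, size=(T, 3))
betas = rng.normal(1.0, 0.3, size=(3, n_stocks))
returns = factors @ betas + rng.normal(0.0, 0.01, size=(T, n_stocks))

# OLS with intercept, all stocks at once.
X = np.column_stack([np.ones(T), factors])
coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
residuals = returns - X @ coef          # factor-free component of returns

# Residual-momentum signal: cumulative residual over the window,
# then rank the cross-section from weakest to strongest.
signal = residuals.sum(axis=0)
ranking = np.argsort(signal)
```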

Pairs trading

Pairs trading aims to identify a pair of historically highly correlated stocks and, when a mispricing (i.e., a deviation from the high historical correlation) occurs, to short the "rich" stock and buy the "cheap" one. This is an example of a mean-reversion strategy.

R_{A,B} = P_{A,B}(t1)/P_{A,B}(t0) − 1

R̄ = 1/2 (R_A + R_B)

R̃_{A,B} = R_{A,B} − R̄

Let R̃_A and R̃_B be the demeaned returns, where R̄ is the mean return of the pair. A stock is "rich" if its demeaned return is positive, and it is "cheap" if its demeaned return is negative. Demeaned returns help remove common market effects or factors that affect both stocks in a pair similarly. By subtracting the mean return (or other appropriate measure of central tendency) from the actual returns, you focus on the relative performance of each stock, independent of overall market movements. In fact, a low demeaned return indicates that, relative to its historical average or the average of the pair, the stock is underperforming. This may be due to factors specific to that stock rather than general market conditions. The stock might be undervalued or experiencing a temporary dip in performance, presenting an opportunity for mean reversion.
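The formulas above can be worked through in a few lines; the prices below are made-up numbers chosen only to illustrate the rich/cheap classification.

```python
# Worked example of the pairs-trading demeaning step.
# Prices for stocks "A" and "B" are illustrative, not real quotes.
pA0, pA1 = 100.0, 103.0   # A: price at t0 and t1
pB0, pB1 = 50.0, 50.5     # B: price at t0 and t1

RA = pA1 / pA0 - 1.0      # return of A over the period (0.03)
RB = pB1 / pB0 - 1.0      # return of B over the period (0.01)
Rbar = 0.5 * (RA + RB)    # mean return of the pair

ReA = RA - Rbar           # demeaned return of A
ReB = RB - Rbar           # demeaned return of B

# Positive demeaned return -> "rich" (short it);
# negative demeaned return -> "cheap" (buy it).
rich, cheap = ("A", "B") if ReA > ReB else ("B", "A")
```

Note that by construction the two demeaned returns sum to zero, so one leg is always rich and the other cheap.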

Moving Averages

This strategy is based on simple moving averages (SMAs) and their crossovers. In this practical example only two SMAs are considered, with 50 and 200 periods, but generally more than two are used in order to find a well-behaved strategy. When the shorter SMA crosses the longer one from above, the price action is considered bearish; when it crosses from below, the sentiment is considered bullish.

Code
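Since the original code is not reproduced, here is a minimal sketch of the 50/200 crossover logic on a simulated price path (the random-walk prices are an assumption for illustration): a +1 signal when the fast SMA is above the slow one, −1 otherwise.

```python
# Hypothetical sketch of a 50/200 SMA crossover signal on simulated prices.
import numpy as np

def sma(x, window):
    # Simple moving average via cumulative sums (no external libraries).
    c = np.cumsum(np.insert(x, 0, 0.0))
    return (c[window:] - c[:-window]) / window

rng = np.random.default_rng(3)
prices = 100.0 * np.cumprod(1 + rng.normal(0.0003, 0.01, 600))

fast = sma(prices, 50)
slow = sma(prices, 200)
fast = fast[-len(slow):]          # align both series on common end dates

# +1 (bullish) while the 50-period SMA sits above the 200-period SMA,
# -1 (bearish) otherwise; crossovers are the sign changes of this series.
signal = np.where(fast > slow, 1, -1)
```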

KNN algorithm

The K-Nearest Neighbors (KNN) algorithm is a supervised learning method used for classification and regression problems. In the context of stock analysis, applying KNN involves several steps.

Firstly, historical data on stocks is collected, including daily closing prices, trading volumes, and other relevant indicators. This data is prepared by normalizing or standardizing features to ensure they carry equal weight. Next, the number of neighbors (K) to consider during the prediction phase is chosen. This represents the number of closest points that will influence the classification or prediction of a new data point.

The algorithm calculates the distance between the point of interest (the new observation) and all points in the training set using a distance metric like Euclidean distance. The K nearest neighbors are identified based on this distance. For a classification problem, the most common class label among the nearest neighbors is assigned to the point of interest. For a regression problem, the average of the output values of the nearest neighbors is calculated.

It’s important to note that the dataset is commonly divided into a training set (60%) and a test set (40%). The training set is used to train the model, while the test set is used to assess the algorithm’s ability to make predictions on data not seen during training. This helps evaluate the model’s ability to generalize to new data and reduces the risk of overfitting.

Code
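The steps above can be sketched with scikit-learn; the lagged-return features, the synthetic return series, and the choice of K = 5 are illustrative assumptions, not the report's actual setup. The data is split 60/40 into training and test sets, as described, and accuracy is measured on the held-out portion.

```python
# Hypothetical KNN sketch: predict next-day direction from the two
# previous daily returns. All data here is simulated.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(4)
returns = rng.normal(0.0005, 0.01, 800)

# Features: the two previous daily returns; label: next-day direction.
X = np.column_stack([returns[:-2], returns[1:-1]])
y = (returns[2:] > 0).astype(int)

# Standardize so each feature carries equal weight in the distance metric.
X = StandardScaler().fit_transform(X)

# 60% training / 40% test, kept in time order (no shuffling).
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.4, shuffle=False)

model = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
acc = accuracy_score(y_te, model.predict(X_te))
```

On pure-noise data like this, accuracy hovers near 50%, which is itself a useful baseline check.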

This strategy has to be backtested out-of-sample, and the process of dividing the data set into a training group and a test group is called cross-validation. Forecasting quality can then be evaluated by minimizing the MSE, i.e., the out-of-sample error.

MSE = (1/n) Σ_{i=1}^{n} (ŷ_i − y_i)²

In this case an accuracy test is performed, comparing the predicted class labels against the actual labels on the test set.

The first disadvantage is that equally weighting the contributions of all k nearest neighbors can be suboptimal. The second is that, since KNN is a lazy algorithm, it requires more memory and data storage than other classifiers, which can be costly in terms of both time and money.

ARIMA model

ARIMA (AutoRegressive Integrated Moving Average) is a statistical model used for time series forecasting. It combines AutoRegression (AR), Integration (I), and Moving Average (MA) components. The AR part involves predicting future values based on past values, while the I part deals with differencing to achieve stationarity. The MA part considers past error terms through a moving average. In an AR(p) model the future value of a variable is assumed to be a linear combination of p past observations and a random error together with a constant term. Mathematically the AR(p) model can be expressed as:

y(t) = c + φ_1 y(t−1) + ... + φ_p y(t−p) + ε(t)

where c is a constant, φ_1, ..., φ_p are the model coefficients, and ε(t) is the random error.

Just as an AR(p) model regresses against past values of the series, an MA(q) model uses past errors as the explanatory variables. The MA(q) model is given by:

y(t) = μ + ε(t) + θ_1 ε(t−1) + ... + θ_q ε(t−q)

where μ is the mean of the series and θ_1, ..., θ_q are the model coefficients.

The random shocks ε(t) are assumed to be a white noise process, i.e. a sequence of independent and identically distributed (i.i.d.) random variables with zero mean and a constant variance σ². Thus, conceptually, a moving average model is a linear regression of the current observation of the time series against the random shocks of one or more prior observations. The formal notation is ARIMA(p, d, q), where p is the order of AutoRegression, d is the degree of differencing, and q is the order of Moving Average.

Autoregressive (AR) and moving average (MA) models can be effectively combined to form a general and useful class of time series models, known as ARMA models. The ARMA models described above can only be used for stationary time series data. Thus, from an application viewpoint, ARMA models are inadequate to properly describe the non-stationary time series frequently encountered in practice. For this reason the ARIMA model was proposed, which generalizes the ARMA model to the non-stationary case as well. In ARIMA models a non-stationary time series is made stationary by applying finite differencing to the data points. Another limitation of the ARIMA model is that it applies to non-seasonal non-stationary data. Box and Jenkins generalized this model to deal with seasonality; their proposed model is known as the Seasonal ARIMA (SARIMA) model.

In this practical scenario, Amazon stock closing prices from January 1, 2021, to January 1, 2024, are considered for the application of the ARIMA model. To perform model identification, i.e., choosing the right parameter values, it can be beneficial to examine Autocorrelation (ACF) and Partial Autocorrelation (PACF). Specifically, while ACF indicates how many lagged observations could significantly influence the dependent variable, PACF is used to measure the correlation between an observation k periods ago and the current observation, after controlling for data points at intermediate lags. By observing ACF and PACF values outside the confidence interval, it is possible to deduce the proper values for both parameters, p and q.

The process of model identification can also be done by an automated function (auto_arima()), which leads to the best model without having to fit and try multiple values of p and q. This function is based on the minimization of the AIC (Akaike Information Criterion), a statistical measure that quantifies the trade-off between the goodness of fit of a statistical model and its complexity (i.e. the number of parameters).

The model has been trained using the previously mentioned parameters, which include considerations for the seasonal component of the model. In this scenario, the initially contemplated ARIMA model has transformed into a SARIMA model. Examining the residual plot reveals that there is no discernible pattern indicating that our chosen model adequately captures the autocorrelation of the variable.

Following the training of the SARIMA model, it is employed to forecast the closing price of Amazon stock beyond the sample data, thereby providing predictions for future values. This out-of-sample forecasting allows us to assess the model’s performance in predicting the behavior of the Amazon stock price beyond the time frame it was trained on.

Code

Conclusion

Single-stock technical analysis strategies, such as those relying on moving averages, momentum, single-stock KNN, and ARIMA models, are often considered "unscientific" by many professionals and academics. While moving averages are used in trend-following/momentum strategies, applying them to individual stocks can lead to biased conclusions. In contrast, strategies involving a large cross-section of stocks introduce statistical elements, making mean reversion more plausible due to correlations within industries. But it is important to remember that the "*stock market – an imperfect man-made construct – is not governed by laws of nature the same way as, say, the motion of planets in the solar system is governed by fundamental laws of gravity. The markets behave the way they do because their participants behave in certain ways, which are sometimes irrational and certainly not always efficient*" (Z. Kakushadze and J.A. Serur, 151 Trading Strategies).
