Investment Strategy Based on Machine Learning Models by Jiayue Zhang

Nikolai Pokryshkin
Moderator
Member since: 2022-07-22 09:48:36
2024-06-19 13:36:57

Chapter 1
Introduction
Data mining is a process of finding hidden patterns within data using automatic or
semi-automatic learning algorithms. The accelerating development of computer technology and
machine learning has generated growing research interest in innovative solutions to
traditional challenges in the social sciences. In particular, machine learning (ML) techniques
have shown impressive performance in solving real-life classification problems in many
different areas, such as communications, internet traffic analysis, medical imaging,
astronomy, document analysis, biology and time series analysis (Gerlein, 2016) [11]. In terms of
finance applications, especially in the equity market, finding effective investment strategies
based on algorithms from theoretical computer science has become more popular in recent years.
Financial trading of securities using technical and quantitative analysis has been
traditionally modelled by statistical techniques for time series analysis such as ARMA and
ARIMA models, and more sophisticated ARCH models (Engle, 1982) [10]. In contrast
to these statistical approaches, complex models from the ML field have emerged that
attempt to predict future movements of securities’ prices [4]. An extensive literature
has shown that some ML techniques specializing in classification and regression tasks
are well suited to quantitative analysis in the financial industry, as
their ability to find hidden patterns in large amounts of financial data may help in
derivatives pricing, risk management and financial forecasting. In this essay, we compare
the prediction accuracies of our investment strategy under different machine learning
models and methods, and introduce some innovations to improve the rate of return.

1.1 Investment Strategy with Classification Models
Regression and classification both fall under the umbrella of supervised machine
learning. Both use known datasets (referred to as training datasets) to make
predictions. In supervised learning, an algorithm is employed to
learn a mapping function from a set of input variables, X, to an output variable, y; that is,
y = f(X), where X = (x1, x2, ..., xp). The objective is to approximate
the mapping function f as accurately as possible, so that whenever there is a new input
x0, the corresponding output y can be predicted. The main difference
between classification models and regression models is that the output variable in regression
is numerical (or continuous), while that in classification is categorical (or discrete).
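
As an illustration of this distinction, the following minimal sketch approximates f with
a k-nearest-neighbour rule and applies it once to a numerical target and once to a
categorical one. The choice of k-nearest neighbours, the function names and the toy data
are assumptions made purely for illustration; they are not the models used in this essay.

```python
from collections import Counter
from math import dist
from statistics import mean


def k_nearest(X_train, x0, k):
    """Indices of the k training rows closest to x0 (Euclidean distance)."""
    order = sorted(range(len(X_train)), key=lambda i: dist(X_train[i], x0))
    return order[:k]


def predict_regression(X_train, y_train, x0, k=3):
    """Numerical output: average the targets of the k nearest neighbours."""
    return mean(y_train[i] for i in k_nearest(X_train, x0, k))


def predict_classification(X_train, y_train, x0, k=3):
    """Categorical output: majority vote among the k nearest neighbours."""
    votes = Counter(y_train[i] for i in k_nearest(X_train, x0, k))
    return votes.most_common(1)[0][0]


# Toy training set (made-up values): each row is X = (x1, x2).
X_train = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 0.9), (0.9, 1.1), (1.2, 1.0)]
y_numeric = [-0.02, -0.01, -0.03, 0.04, 0.05, 0.03]   # e.g. a next-period return
y_label = ["down", "down", "down", "up", "up", "up"]  # e.g. the direction of that return

x0 = (1.0, 1.0)  # a new, unseen input
print(predict_regression(X_train, y_numeric, x0))      # a number
print(predict_classification(X_train, y_label, x0))    # a category
```

The same inputs and the same neighbourhood rule are used in both cases; only the type of
the output variable, and hence the way neighbours are aggregated, differs.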
Most of the relevant research on investment strategies using machine learning models
and methods focuses on classification models. Takeuchi and Lee (2013) [23] develop
an enhanced momentum strategy on the U.S. stock market using the data from 1965 until
2009. They apply deep neural networks (DNN) as classifiers to calculate the probability
for each stock to outperform the cross-sectional median return of all stocks in the holding
month t + 1. Their strategy, based on standardized cumulative returns, produces annualized
returns of 45.93 percent in the out-of-sample testing period from 1990 until 2009. Heaton
et al. (2016) [12] discuss the application of deep learning to financial prediction and
classification, which can exploit empirical data to find functional relationships between an
output variable and a group of input variables that are not predicted by existing financial
economic theory. Krauss and Huck (2016) [16] apply four kinds of classification models and
their ensembles to predict the probability of a stock outperforming the cross-sectional median
in the next period, and find that tree-based models give the best results. Culkin and Das
(2017) [6] train a deep learning neural network to calculate option prices, achieving a
high degree of accuracy compared to the Black-Scholes option pricing formula. Nobre and
Ferreira Neves (2018) [20] present an expert system in the financial area that combines
Principal Component Analysis (PCA), Discrete Wavelet Transform (DWT), Extreme Gradient
Boosting (XGBoost) and a Multi-Objective Optimization Genetic Algorithm (MOO-GA)
in order to achieve high returns with a low level of risk.
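
The classification target described above for Takeuchi and Lee (2013) and Krauss and Huck
(2016), namely whether a stock outperforms the cross-sectional median return in the next
period, can be sketched as follows. The function name, the tickers and the return values
are hypothetical, and the strict inequality used for ties at the median is an assumption
rather than a detail taken from those papers.

```python
from statistics import median


def median_labels(next_returns):
    """Label a stock 1 if its next-period return beats the cross-sectional median, else 0."""
    med = median(next_returns.values())
    # Assumption: a stock sitting exactly at the median is labelled 0.
    return {ticker: int(r > med) for ticker, r in next_returns.items()}


# Hypothetical returns for the holding month t + 1 (made-up tickers and numbers).
returns_t1 = {"AAA": 0.031, "BBB": -0.012, "CCC": 0.004, "DDD": 0.018, "EEE": -0.007}

labels = median_labels(returns_t1)
print(labels)  # {'AAA': 1, 'BBB': 0, 'CCC': 0, 'DDD': 1, 'EEE': 0}
```

A classifier trained on such labels outputs, for each stock, an estimated probability of
belonging to class 1, which is the quantity these studies use to rank stocks.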
