
2019 Fubon Life Management Master's and Doctoral Thesis Award (2019 富邦人壽管理博碩士論文獎)

Artificial Intelligence for ETF Market Prediction and Portfolio Optimization

ABSTRACT

This research aims to develop a system that applies machine learning and deep learning algorithms to investment return forecasting and portfolio optimization, in order to recommend suitable strategies to investors for both short-term and long-term investment. The core of our design is an artificial-intelligence-embedded system that performs time-series forecasting in the financial market, with a focus on ETF trading. Many studies address algorithmic trading, traditional time-series forecasting, and portfolio management in various forms and applications; however, the literature on applications that use a variety of machine learning algorithms to forecast market trends is limited. In this research, we used five machine learning algorithms and two deep learning approaches, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), to develop our system and build applications for short-term and long-term portfolio management. We develop a forecasting module and use its results to construct a day trading strategy and to perform portfolio optimization, in order to demonstrate that machine learning algorithms add value; we also compare the different methodologies and evaluate which algorithms perform better in our task. The system and module we built are expandable and portable, and can be used as a framework or submodule when developing trading recommendation systems for different time intervals.

Keywords: Artificial Intelligence (AI), Exchange Traded Funds (ETF), Deep Learning, Portfolio Optimization, Financial Market Prediction


1. Introduction

In this research, we used five machine learning algorithms and two deep learning approaches, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), to develop our system and build applications for short-term and long-term portfolio management. We developed a forecasting module and used the different models to generate model-based investor views as inputs to our portfolio optimization algorithms, in order to demonstrate that machine learning algorithms add value.

1.1. Objective

The purpose of this research is to propose an architecture for an investment recommendation system that delivers investment consultation services for short-term and long-term periods respectively. The system integrates several technologies, including machine learning, data analytics, and portfolio optimization. The results of our research can be applied in various ways, for instance to compare different algorithms and identify the better practice for decision support, and to provide investment recommendations that increase the ROI of the optimized portfolio.

1.2. Background and Motivation

The rise of financial technology (FinTech) and deep learning has received enormous attention from a wide spectrum of industries in recent years. The World Economic Forum (2015) identified FinTech as a disruptive innovation, and the disruptive power of FinTech innovations will manifest itself clearly as the market evolves (Lee & Shin, 2018). In 2012, Hinton and Krizhevsky proposed AlexNet and demonstrated the strength of deep learning in the ImageNet competition, where a convolutional neural network (CNN) significantly outperformed traditional approaches. Since then, artificial intelligence and deep learning have rapidly changed almost every industry.

For stock market forecasting, deep learning models such as recurrent neural networks for time-series forecasting, together with attention mechanisms, have given artificial neural networks the ability to model the relationship between time and trends. These successes open up more possibilities for predicting the future and making better decisions. The relevant literature is discussed in detail in Section 2.

1.3. Research Gap

In recent years, financial technology (FinTech) has received enormous attention in science, information technology, and the financial industry. Using different types of strategies for stock market prediction has become a trend in the trading market. Some studies investigate portfolio optimization models based on traditional approaches; however, few combine deep learning with asset allocation, and forecasting precision still leaves wide room for improvement. Therefore, this study focuses on using machine learning and deep learning models for stock market forecasting and then integrating portfolio optimization theories, in order to deliver a higher return on investment (ROI) to customers who accept our portfolio recommendations.

1.4. Research Purpose

The main purpose of this study is to use artificial intelligence technology, including different machine learning and deep learning mechanisms, to forecast the trend of ETFs listed in the Taiwan and US markets. Moreover, we compare the performance of different machine learning and deep learning models in forecasting investment returns, and use the best-performing forecasts to allocate our future portfolio. We use two asset allocation theories to optimize our portfolio and, by comparing them with the buy-and-hold and average-weight strategies, determine which performs better.

1.5. Research Questions

Q1: Can the combination of deep learning and an investing strategy effectively improve the return on investment?

Q2: Which asset allocation theory yields higher returns than the others?

1.6. Research Values

It is hoped that the results of this study will become a submodule of a financial recommendation system. We recommend investment strategies to investors and help them reach a higher return on investment; according to each investor's opinions and risk tolerance coefficient, we can customize the portfolio recommendation for each type of investor. The predictions generated by our deep learning and machine learning models can serve as a decision support system: for different investment horizons we provide different ways to forecast trends, so as to meet each investor's requirements and expectations. We hope this achievement will bring more intelligent services to the financial community and use cutting-edge technology to help the financial industry progress and flourish.

1.7. Overview of this Paper

The rest of this paper is structured as follows. Section 2 reviews the literature on asset allocation, deep learning, and machine learning. Section 3 describes the research methodology. Section 4 presents the experimental results and discussion. Finally, Section 5 presents the conclusions of this research.


2. Related Works

In this section, we review related work on machine learning, deep learning, and portfolio optimization. Through this literature we can understand the structure and methods of different forecasting strategies and approaches to asset allocation. We introduce previous research and empirical results that serve as the foundation and background for our research, describe the topics of recent related works and the methods or algorithms they used, and present the results of their model implementations.

2.1. Artificial Intelligence in Finance

Statistical approaches that predict from historical trends, such as the autoregressive integrated moving average (ARIMA) model (Box, Jenkins, Reinsel, & Ljung, 2015), the autoregressive conditional heteroscedasticity (ARCH) model (Engle, 1982), and the generalized autoregressive conditional heteroscedasticity (GARCH) model (Karolyi & Statistics, 1995), have been widely used to predict the financial market; however, they do not perform well because of their inherent limitations. In this research we propose several machine learning and deep learning models to address these issues. In recent years, using machine learning models to forecast stock trends has become common in the stock market and has prompted discussion. L. Chen et al. (2018) argue that deep learning models do benefit stock prediction, but that it remains a complicated task because factors such as the economy, politics, the environment, and culture introduce many chaotic indicators that affect the market and the ability to forecast it.

2.2. Machine Learning and Deep Learning

Machine learning is a field of computer science that uses statistical techniques to give computer systems the ability to "learn" from data without being explicitly programmed (Samuel, 1988). In our research, we leverage five machine learning models to generate investor views for the portfolio optimization model: SVM (Cortes & Vapnik, 1995), Decision Tree (Quinlan, 1986), eXtreme Gradient Boosting (XGBoost) (T. Chen & Guestrin, 2016), Random Forest (Ho, 1995), and Naïve Bayes (McCallum & Nigam, 1998).

Deep learning is a branch of machine learning, specifically neural networks with multiple hidden layers. Its advantages include multi-dimensional computation and the ability to adjust the weight of each parameter automatically in search of the minimum loss and highest accuracy. The concept of deep learning has existed since the 1990s; however, hardware capability and computing efficiency were insufficient for deep learning models to reach high performance. Deep learning did not begin to flourish until AlexNet (Krizhevsky, Sutskever, & Hinton, 2012) and Deep Learning (LeCun, Bengio, & Hinton, 2015) were published. In our research we implement LSTM (Hochreiter & Schmidhuber, 1997) and GRU (Glorot, Bordes, & Bengio, 2011) as our forecasting models.

Table 1. The empirical result and comparison of trend forecasting and portfolio optimization

Table 1 summarizes empirical studies that apply different machine learning methods to predict stock movement. These studies use accuracy to validate their models and evaluate predictive ability; the reported accuracies of both machine learning and deep learning models mostly fall between 50% and 60%. In this research we compare the performance of different models and improve the testing accuracy by tuning the parameters.

Authors (Years)                Data Type                                                 Methods                 Accuracy
Jui-Sheng Chou et al. (2018)   S&P 500 index, Oct 5, 2011 to May 31, 2017                SVM                     45.48%
Magnus Hansson et al. (2017)   S&P 500 index, 2009-01-02 to 2017-04-28                   ARMA                    50.43%
Magnus Hansson et al. (2017)   S&P 500 index, 2009-01-02 to 2017-04-28                   Deep-LSTM               50.72%
Magnus Hansson et al. (2017)   S&P 500 index, 2009-01-02 to 2017-04-28                   LSTM                    51.88%
Yang Jiao et al. (2017)        S&P 500 index, Jan 2, 2009 to Jun 30, 2017                Random Forest           58.00%
Chang Sim et al. (2013)        S&P 500 index, 1 January 2000 to 31 December 2009         ANN                     56.30%
Huacheng Wang et al. (2009)    Shanghai Stock Exchange, 2000 to 2003                     Decision Tree           54.00%
Huacheng Wang et al. (2009)    Shanghai Stock Exchange, 2000 to 2003                     Bagging-decision tree   61.87%
Huacheng Wang et al. (2009)    Shanghai Stock Exchange, 2000 to 2003                     Boosting-decision tree  59.16%


2.3. Portfolio Optimization

The study of portfolio and asset allocation dates back to 1952, when Harry Markowitz introduced what is known as the Markowitz Mean-Variance Model (Markowitz, 1952). However, the model is highly sensitive to its input parameters, which can amplify deviations in the result, so Black and Litterman proposed the Black-Litterman model in the early 1990s (Black & Litterman, 1990) to address this disadvantage.

Table 2 shows that previous studies of portfolio optimization use historical data to calculate expected returns, so the result is strongly affected by the actual returns of the market. Although such portfolios can achieve better performance and lower risk than buying a single asset, they can hardly outperform the actual market when the market trends downward or volatility is greater than expected. In our research, we use machine learning and deep learning forecasts to optimize our portfolio, in order to capture the non-linear relationships among our assets, and use the information ratio as an objective indicator for verifying our performance.

Table 2. A comparison of related works on portfolio optimization

Authors (Years)                   Portfolio Selected From                        Methods     Sharpe Ratio
(Zhao & Palomar, 2018)            NASDAQ (2014-2017)                             Markowitz   0.1761
(Jia-long, Bo-wei, & Min, 2013)   NASDAQ and NYSE (2012)                         3F          0.0684
(Jia-long et al., 2013)           NASDAQ and NYSE (2012)                         4F          0.1996
(Jia-long et al., 2013)           NASDAQ and NYSE (2012)                         BL          0.4437
(Martellini & Ziemann, 2007)      US stocks, bonds, VIX, etc. (1996-2004)        BL          1.80
(Martellini & Ziemann, 2007)      US stocks, bonds, VIX, etc. (1996-2004)        Markowitz   1.58
(Paudel & Koirala, 2006)          NEPSE index, 1997 to mid-May 2006              Markowitz   0.406

Notes: 3F = Fama-French three-factor model, 4F = Carhart four-factor model, BL = Black-Litterman Model


3. Research Methods and System Framework

In this section, we describe our research methods and propose the system framework of the day trading module and the portfolio optimization module. Section 3.1 describes the research design and structure of this thesis; Section 3.2 describes the system architecture; Section 3.3 describes the data collection and the data set; Section 3.4 presents the transformation to supervised learning format and the data normalization; Section 3.5 describes the implementation of the time-series deep learning methods and the machine learning methods.

3.1. Research Design

Our research adopts the System Development Research Methodology from the field of information systems research (Nunamaker, Chen, & Purdin, 1990). Under this methodology, the development of our day trading and portfolio optimization system is divided into five stages.

3.2. System Architecture

To carry out the time-series forecasting task, we develop several modules for our system. For machine learning forecasting we use scikit-learn, a flexible and powerful library for implementing machine learning methods. For deep learning time-series forecasting we use Keras, a high-level deep learning API that runs on top of TensorFlow or Theano, to develop our LSTM and GRU models. Figure 1 shows the structure of our system architecture.

Figure 1. System Architecture


3.3. Data Collection and Data Set

We use ETFs (Exchange Traded Funds) listed in the Taiwan market as our subjects. The daily adjusted close price is the input variable, and the target variable is the trend of the next day. The data cover about 11 years, from 2008/01/02 to 2018/12/31, 2768 trading days in total; the year 2018, approximately 245 trading days, is used for validation. We collected 20 ETFs listed in the Taiwan market. The data were obtained automatically by our data collection module from the Yahoo Finance API.

3.4. Increase complexity of time series data and normalization

In supervised machine learning and deep learning, we have to transform the data into a supervised time-series learning format. We set up a moving window of the past 120 trading days for each day we are trying to predict, so each row has 120 features; this richer representation helps capture the movement of the time series. We apply L2 normalization to the input data for ease of computation and better performance. For instance, to predict 2018/12/28, the data from 2018/07/11 to 2018/12/27 (120 trading days in total) are used as the input.
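The windowing and normalization described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function names `l2_normalize` and `make_supervised` are hypothetical, and treating an unchanged price as a "down" day is an assumption, since the text does not say how ties are labeled.

```python
import math


def l2_normalize(window):
    """Scale a window of prices so that its Euclidean (L2) norm equals 1."""
    norm = math.sqrt(sum(x * x for x in window))
    if norm == 0:
        return list(window)  # all-zero window: nothing to scale
    return [x / norm for x in window]


def make_supervised(prices, window=120):
    """Turn a price series into (features, labels) rows for supervised learning.

    Each row holds the `window` prices that precede the day being predicted.
    The label is 1 if that day's price is above the previous day's price,
    otherwise -1 (assumption: an unchanged price counts as "down").
    """
    features, labels = [], []
    for t in range(window, len(prices)):
        past = prices[t - window:t]          # the `window` days before day t
        features.append(l2_normalize(past))
        labels.append(1 if prices[t] > prices[t - 1] else -1)
    return features, labels
```

With `window=120`, a series of 2768 daily prices yields 2648 rows, because the first 120 days have no complete history.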

3.5. Models Training and Validation

We split the whole dataset into a training dataset and a testing dataset. The training dataset runs from January 2008 to December 2017, and the testing dataset from January 2018 to December 2018. We use the confusion matrix to illustrate the difference between the predicted results and the actual trends. We use the Python library scikit-learn for the machine learning algorithms and the high-level library Keras to build the deep learning structures.
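A date-based split and a confusion matrix can be sketched in plain Python as follows. The helper names `chronological_split` and `confusion_matrix` are hypothetical and are written out here only to keep the example dependency-free; the row/column orientation (rows are actual labels, columns are predicted labels) is an assumption of this sketch.

```python
from datetime import date


def chronological_split(dates, rows, test_start):
    """Split rows by date: rows before `test_start` train, the rest test."""
    train = [r for d, r in zip(dates, rows) if d < test_start]
    test = [r for d, r in zip(dates, rows) if d >= test_start]
    return train, test


def confusion_matrix(actual, predicted, labels=(-1, 1)):
    """Count outcomes; rows are actual labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix
```

Splitting by date rather than at random keeps every test observation later than every training observation, so the model is never evaluated on days that precede its training data.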


4. Applications and Results

In this section, we present two applications. The first is day trading based on the results generated by machine learning methods: the strategy follows the forecasts of the different ML models, going long on the target equity if the model predicts that the next day's price will go up and, conversely, shorting it if the forecast is down. In the second application, we feed the forecasts generated by the LSTM model into our portfolio optimization module and combine them with the Black-Litterman Model and the Markowitz Mean-Variance Model, to reach better results and add value through portfolio optimization.

4.1.1. Model Based Day Trading Strategy

We apply a day trading strategy in our module using the forecasts obtained from the different machine learning and deep learning models. As mentioned in Section 3.3, we label the output data 1 (goes up) or -1 (goes down) to signify the trend of the next trading day.

In this research we apply this trading strategy to the trend prediction for each trading day in 2018 and use the F1 score as the indicator to validate model performance. We set up a buy-and-hold strategy as our control group: the target stock is purchased at the beginning of 2018 and held until the end of the year.

4.1.2. Result and Analysis

In this section, we present the forecasted accumulated return of each model. We use figures of accumulated return and confusion matrices to show the predictive ability of each model for each ETF, and to demonstrate whether our models achieve better performance for each asset.
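The comparison between the signal-following strategy and buy and hold can be sketched as below. This is an illustration under simplifying assumptions that the text does not state: a short position is modeled as earning the negative of the day's return, and transaction costs are ignored. The function names and the three-day numbers are hypothetical, not data from the study.

```python
def strategy_returns(signals, daily_returns):
    """Daily return of following the signal: +1 keeps the return, -1 flips it."""
    if len(signals) != len(daily_returns):
        raise ValueError("signals and daily_returns must have the same length")
    return [s * r for s, r in zip(signals, daily_returns)]


def cumulative_return(daily_returns):
    """Compound a sequence of simple daily returns into one total return."""
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    return growth - 1.0


# Hypothetical three-day example (not data from the study).
actual = [0.01, -0.02, 0.03]   # realized daily returns
signals = [1, -1, 1]           # model forecast for each of those days
model_based = cumulative_return(strategy_returns(signals, actual))
buy_and_hold = cumulative_return(actual)
```

In this toy case every forecast is correct, so the strategy turns the losing day into a gain and ends at about 6.1% against about 1.9% for buy and hold.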

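Non-normalized confusion matrix (rows: actual trend; columns: predicted trend)

              Predicted Down   Predicted Up
Actual Down   45               85
Actual Up     37               83

Normalized confusion matrix (each row divided by its total)

              Predicted Down   Predicted Up
Actual Down   0.35             0.65
Actual Up     0.31             0.69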

Table 3: Predicted Result of 0050.TW by SVM


Figure 2 and Table 3 show the normalized and non-normalized confusion matrices of the forecasts generated by the Support Vector Machine model.

          precision   recall   f1-score   support
Down      0.55        0.35     0.42       130
Up        0.49        0.69     0.58       120
Average   0.52        0.51     0.50       250

Table 4 shows the performance of our model in forecasting the trend over the whole of 2018 (250 trading days). We do not evaluate the model by accuracy alone; we also examine precision, recall, and the F1 score, which present the prediction results more objectively and are less distorted by unequal class proportions in the data set. We train each model by adjusting its parameters, through both manual modification and grid search optimization. Grid search trains multiple models over a set of parameter values and returns the best one, named the best estimator; we use the best estimator of each model to raise the F1 score.

Table 5 shows the structure of the LSTM model in our research. We implemented a bidirectional LSTM with 32 nodes in each layer, followed by one fully connected layer with a SoftMax activation function to generate the prediction.

Figure 2: Confusion Matrices for 0050.TW prediction in 2018
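As a worked check of how precision, recall, and F1 follow from a confusion matrix, the sketch below uses the SVM counts for 0050.TW (45, 85, 37, 83), read as rows for actual Down and actual Up, which is the reading consistent with the supports of 130 and 120. The function names are hypothetical, and treating the "Average" row as a support-weighted average is an assumption; it is the variant that reproduces the reported 0.52, 0.51, and 0.50.

```python
def classification_report(matrix, names):
    """Per-class precision, recall, F1 and support from a confusion matrix.

    `matrix[i][j]` counts samples of actual class i predicted as class j.
    """
    report = {}
    for i, name in enumerate(names):
        true_pos = matrix[i][i]
        support = sum(matrix[i])                    # actual members of class i
        predicted = sum(row[i] for row in matrix)   # samples predicted as class i
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[name] = {"precision": precision, "recall": recall,
                        "f1": f1, "support": support}
    return report


def weighted_average(report, key):
    """Average a metric over classes, weighting each class by its support."""
    total = sum(row["support"] for row in report.values())
    return sum(row[key] * row["support"] for row in report.values()) / total


svm_counts = [[45, 85], [37, 83]]   # rows: actual Down, actual Up
report = classification_report(svm_counts, ["Down", "Up"])
```

For the Down class this gives precision 45/82 ≈ 0.55, recall 45/130 ≈ 0.35, and F1 ≈ 0.42; for the Up class, precision 83/168 ≈ 0.49, recall 83/120 ≈ 0.69, and F1 ≈ 0.58.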

Table 4: Predicted Result of 0050.TW by SVM


Layer               Nodes per Layer
Bi-directional *2   32
Dense               1
SoftMax             1

4.1.3. Comparison with Longer Time Horizon

Our models are trained with data from 2008 to 2017 and tested on data from 2018. We use 0050.TW as the example equity and the support vector machine as the training model. Figure 3 and Figure 4 show the volatility and the cumulative return of the forecasts generated by the Support Vector Machine model. We compare three cases: buy and hold, the results generated by the support vector machine model, and a do-nothing baseline. The cumulative returns figure shows the difference in returns between the forecast-based result and the buy-and-hold strategy: the forecast-based result is better than buy and hold.

Figure 3: Volatility for 0050 TW prediction in 2018

Figure 4: Result and Performance of Each Predict Model

Table 5: Structure of our LSTM model


Table 6 shows the performance of each model. We pick the model with the best winning rate to predict each asset and pass the predictions to the portfolio optimization section. Note that the winning rate of the buy-and-hold strategy is 35%: 7 of the 20 ETFs have a positive accumulated return in 2018, so the baseline winning rate is 35%. In contrast, the forecasts generated by the LSTM model give a positive accumulated return for 12 of 20, so the winning rate of LSTM is 60%, outperforming the baseline by 25 percentage points.
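The winning rate used here is simply the share of assets with a positive accumulated return. A small sketch, with the hypothetical helper name `winning_rate` and placeholder return values chosen only to match the counts quoted above (7 of 20 and 12 of 20), not the actual returns:

```python
def winning_rate(cumulative_returns):
    """Share of assets whose cumulative return over the period is positive."""
    wins = sum(1 for r in cumulative_returns if r > 0)
    return wins / len(cumulative_returns)


# Placeholder returns that only reproduce the counts quoted in the text.
buy_and_hold = [0.01] * 7 + [-0.01] * 13
lstm_based = [0.01] * 12 + [-0.01] * 8

baseline_rate = winning_rate(buy_and_hold)      # 7 / 20
lstm_rate = winning_rate(lstm_based)            # 12 / 20
improvement = lstm_rate - baseline_rate         # difference in percentage points
```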

4.2. Portfolio Optimization Using the Model Based Investor Views

Note. RF = Random Forest; DT = Decision Tree; NB = Naïve Bayes; XGB = XGBoost

In this section, we use the portfolio optimization model to allocate assets, taking the investor views derived from our forecast model as input parameters. We chose six assets with low correlation coefficients to construct our portfolio (shown in Table 7), and we rebalance the portfolio quarterly to reach better performance and a higher cumulative return. We set 0050.TW as the benchmark: its stable performance makes the comparison more objective and helps investors understand the advantage of our system.

4.2.1. Portfolio Assets Selection

In this section, we use the adjusted close prices from 2008 to 2017 for portfolio selection. We calculate the correlation between each pair of assets, find the assets with negative correlation, and combine them into our target portfolio.

The correlation coefficient represents the relationship between two sets of samples. It lies between -1 and 1: the closer to 1, the stronger the positive correlation between the two series; the closer to -1, the stronger the negative correlation; and a value close to 0 means there is no linear relationship between the two sets of data. Portfolio theory combines a variety of negatively correlated but profitable investment strategies to create a portfolio with low risk and relatively stable returns.

Model   Winning Rate   % Better than Buy and Hold Strategy
LSTM    60%            25%
GRU     50%            15%
RF      45%            10%
DT      40%            5%
NB      40%            5%
SVM     50%            15%
XGB     55%            20%

Table 6: Result and Performance of Each Predict Model

Figure 5 shows the correlation between the assets. We compute the correlation matrix from the adjusted close prices and select six equities with relatively low correlation to construct our portfolio (Table 7). With lower correlation, we expect the portfolio to reduce risk and perform better when we apply the portfolio optimization models.
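The pairwise correlation screening can be sketched as follows. This is only an illustration: the text does not describe the exact selection procedure, so ranking pairs from lowest to highest correlation is an assumption, the function names are hypothetical, and the three short series are made-up numbers. The sketch also assumes no series is constant, since a zero variance would make the coefficient undefined.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rank_pairs_by_correlation(series):
    """Return (correlation, name_a, name_b) tuples, lowest correlation first."""
    pairs = [(pearson(series[a], series[b]), a, b)
             for a, b in combinations(sorted(series), 2)]
    return sorted(pairs)


# Made-up price series: B moves with A, C moves against both.
prices = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1]}
ranked = rank_pairs_by_correlation(prices)
```

Pairs at the top of `ranked` are the strongest candidates for diversification, while the last entry (here A and B) would add little risk reduction.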

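Symbol      Name                                   English translation
0050.TW     元大寶來台灣卓越 50指數股票型基金      Yuanta Polaris Taiwan Top 50 Index ETF
00677U.TW   富邦 VIX                               Fubon VIX
00632R.TW   元大台灣 50單日反向 1倍基金            Yuanta Taiwan 50 Daily Inverse 1x Fund
006203.TW   元大摩臺基金                           Yuanta MSCI Taiwan Fund
00664R.TW   國泰臺灣加權指數單日反向 1倍基金       Cathay TAIEX Daily Inverse 1x Fund
00685L.TW   群益臺灣加權正 2                       Capital TAIEX Leveraged 2x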

Table 8. The Data of Portfolio Weight Rebalancing for Each Quarter in 2018

       Q1         Q2         Q3         Q4
Date   2018/1/2   2018/4/2   2018/7/2   2018/10/1

Figure 5: Correlation Matrix for whole 20 ETFs

Table 7: List of Selected ETFs


4.2.2. Portfolio Optimization and Performance Comparison

Following the steps mentioned in Section 3.7, we implement the Black-Litterman model to optimize our portfolio. The optimization requires the historical average returns and the covariance matrix (the co-movement between the chosen stocks). In the next step, we form the investor views: we use our prediction model to predict each stock and define the disparity between the performance of the stocks, combine these views with the historical returns, and generate the posterior mean and posterior covariance matrix by applying the Black-Litterman formula. We also use the Markowitz mean-variance optimization model as another portfolio optimization strategy. This model is sensitive to the investor views, and the whole portfolio may be affected by changing the weight of a single stock, which can cause greater fluctuation and unpredictable results. The weights of the Markowitz portfolio are generated by mean-variance optimization using quarterly historical returns and the covariance matrix. Table 8 shows the weight rebalancing dates of our portfolio and Table 9 shows the weights of the 6 ETFs under the Black-Litterman Model.
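The posterior mean of the standard Black-Litterman formula, mu = [(tau*Sigma)^-1 + P' Omega^-1 P]^-1 [(tau*Sigma)^-1 pi + P' Omega^-1 q], can be sketched in plain Python as below. The symbols are the conventional ones and are assumptions with respect to this paper, which does not give its notation or parameter values: `pi` is the prior (historical) mean return, `sigma` the covariance matrix, `p` the view-picking matrix, `q` the view returns (here, the model forecasts), `omega` the view uncertainty, and `tau` a scaling constant. The small matrix helpers are written out only to keep the sketch dependency-free.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]


def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_inv(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def black_litterman_mean(pi, sigma, p, q, omega, tau=0.05):
    """Posterior expected returns that blend the prior `pi` with the views `q`."""
    prior_precision = mat_inv([[tau * v for v in row] for row in sigma])
    pt_omega_inv = mat_mul(transpose(p), mat_inv(omega))
    posterior_cov = mat_inv(mat_add(prior_precision, mat_mul(pt_omega_inv, p)))
    rhs = mat_add(mat_mul(prior_precision, [[x] for x in pi]),
                  mat_mul(pt_omega_inv, [[x] for x in q]))
    return [row[0] for row in mat_mul(posterior_cov, rhs)]
```

Two sanity checks follow from the formula: if the views equal the prior, the posterior equals the prior; and when the view uncertainty matches the scaled prior uncertainty for uncorrelated assets, the posterior lies halfway between prior and view. The resulting posterior mean, together with the posterior covariance, is what a mean-variance optimizer would then turn into portfolio weights.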

Table 10 : Correlation Matrix for the selected 6 ETFs

Table 9 : Weight of 6 ETFs of the Black-Litterman Model

Symbol Weight

0050.TW 0.262296

00677U.TW 0.039492

00632R.TW 0.051052

006203.TW 0.125719

00664R.TW 0.232864

00685L.TW 0.288578


Table 10 shows the correlation matrix for the selected 6 ETFs. The correlation coefficients between the selected equities are negative, meaning that we can decrease risk by constructing this portfolio.

Figure 6 presents the cumulative returns of our selected stocks over the whole of 2018, approximately 245 trading days. Each selected stock has a roughly neutral cumulative return, and the cumulative return of our portfolio is approximately zero. In 2018 most stocks did not perform well: their performance was neutral or even negative. In our research, we forecast the stock market to reach better performance and use portfolio optimization to decrease volatility and obtain more stable performance.

Figure 8 presents the result of the portfolio optimization derived from our forecast model. We use long short-term memory to predict the trend of each stock and use the correlation matrix to form the portfolio, selecting 6 stocks with negative correlation coefficients. We then compute the return for each trading day and pass it to the Black-Litterman Model, expecting it to deliver better performance than the actual return.

Figure 6: Cumulative Return of 6 stocks (2018)

Figure 7 : Cumulated return of portfolio optimization strategies (Return from Actual Return)

Figure 8 : Cumulated return of portfolio optimization strategies (Return from Predict Model)

Figure 9 : Cumulated return (By Naïve Bayes) of portfolio optimization strategies (Return from Predict Model)

Figure 7 and Figure 8 differ dramatically. Figure 7 shows the result of applying the same portfolio optimization algorithms to the actual returns: the portfolio return is approximately zero. In other words, although our customers would not lose money with the portfolio built on actual data, they would not earn any profit over the whole year either (which is still better than holding 0050.TW). Conversely, Figure 8 shows the result derived from our prediction model: the portfolio reaches an annual return of more than 14% with the Markowitz mean-variance model and 16% with the equal-weight strategy, and the relatively stable Black-Litterman model earns more than 20% in a bearish market. The portfolio with forecasted data as input thus outperforms both 0050.TW and the optimized portfolio with actual data.

Figure 8 and Figure 9 show that the quality of the prediction model strongly affects the outcome. The average F1 score of our LSTM model reaches 0.57, whereas Naïve Bayes reaches only 0.48. With the LSTM predictions, the optimized portfolio reaches a cumulative return (ROI) of 20% for 2018, while with the Naïve Bayes predictions it returns -2%. The choice of prediction model therefore makes a large difference.
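For reference, the F1 score used to compare the prediction models is the harmonic mean of precision and recall. The sketch below computes it for one class with the standard library; the function name and the up/down label encoding (1 = up, 0 = down) are assumptions for illustration, and how the scores are averaged in this research is not shown here.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 score of the `positive` class for two equally long label lists."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_pos = sum(1 for t, p in pairs if t != positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos == 0:
        return 0.0  # no correct positive prediction, so precision/recall are 0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = price goes up, 0 = price goes down.
actual = [1, 1, 0, 0]
predicted = [1, 0, 1, 0]
print(f1_score(actual, predicted))
```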

The portfolio statistics in Table 11 support the conclusion that the Black-Litterman portfolio driven by the deep learning predictions outperforms the other approaches: it has a higher annual return, Sharpe ratio, alpha, and information ratio, and its volatility and daily value at risk are lower than those of the benchmark.

Table 11 : Portfolio Statistics of the Portfolio Optimization Strategies and the 0050.TW Index

                      Black-Litterman       Markowitz    Equally Weighted    0050.TW
                      Portfolio - the LSTM  Portfolio    Portfolio           Index
                      Investor Views

Annual return         20.240%               14.414%      16.687%             0.4086%
Q1 return             5.3084%               3.6833%      6.3783%             3.2890%
Q2 return             2.6887%               2.9470%      4.1152%             5.7721%
Q3 return             10.865%               10.070%      3.3257%             -8.482%
Q4 return             1.3773%               -2.2865%     2.8682%             -0.170%
Annual volatility     4.9711%               4.7278%      4.1916%             10.751%
Sharpe ratio          1.8853                1.6907       1.4178              0.1076
Stability             0.93053               0.93576      0.79187             0.24595
Max drawdown          -6.7751%              -7.0713%     -10.782%            -17.532%
Skew                  0.7170                0.0339       -0.9928             -1.1434
Kurtosis              5.8448                4.0851       5.5475              8.7708
Daily value at risk   -0.9252%              -0.5882%     -1.1315%            -1.5560%
Alpha                 0.00078               0.00059      0.00066             0.00000
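Several of the statistics in Table 11 can be derived from a series of daily portfolio returns. The following is a minimal sketch under stated assumptions: 252 trading periods per year, a zero risk-free rate by default, and hypothetical function names. It is not claimed to reproduce the exact values or conventions behind Table 11.

```python
from math import sqrt


def annualized_return(daily, periods=252):
    """Compound growth rate scaled to one year (252 periods is an assumption)."""
    growth = 1.0
    for r in daily:
        growth *= 1.0 + r
    return growth ** (periods / len(daily)) - 1.0


def annualized_volatility(daily, periods=252):
    """Sample standard deviation of daily returns scaled to one year."""
    n = len(daily)
    mean = sum(daily) / n
    variance = sum((r - mean) ** 2 for r in daily) / (n - 1)
    return sqrt(variance) * sqrt(periods)


def sharpe_ratio(daily, periods=252, risk_free_daily=0.0):
    """Annualized mean excess return divided by annualized volatility."""
    excess = [r - risk_free_daily for r in daily]
    mean = sum(excess) / len(excess)
    return mean * periods / annualized_volatility(excess, periods)


def max_drawdown(daily):
    """Largest peak-to-trough loss of the compounded value (a negative number)."""
    growth, peak, worst = 1.0, 1.0, 0.0
    for r in daily:
        growth *= 1.0 + r
        peak = max(peak, growth)
        worst = min(worst, growth / peak - 1.0)
    return worst


# Made-up daily returns, for illustration only.
sample_daily = [0.004, -0.002, 0.003, -0.006, 0.005, 0.001]
print(annualized_return(sample_daily), annualized_volatility(sample_daily))
print(sharpe_ratio(sample_daily), max_drawdown(sample_daily))
```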

5. Conclusion and Recommendations

In this section, we summarize the research findings and contributions of this thesis, the practical implications of our module for practitioners, and how it can help address existing real-world problems. We then discuss the limitations of this research and how its performance can be improved. Finally, we describe our vision for future work, including integration with other systems or modules to better meet the market's needs.

5.1. Research contribution

The contributions of this research can be divided into the following parts:

• This system can be integrated into a portfolio optimization robo-advisor recommendation system as a sub-module to assist investors in making long-term or short-term investment decisions.

• The LSTM model performs better than the machine learning models. Recurrent neural networks are better suited to time-series data, and the stateful LSTM yields more reliable forecasts than traditional or ensemble machine learning methods when processing and forecasting day-trading data.

• We proposed a portfolio optimization model integrated with deep learning forecasting. Even in a neutral stock market, it outperforms the baseline ETF and the optimized portfolio built on actual data.

5.2. Recommendations for Future Research

In the future, our module can be improved in the following ways:

• Integrate more technical indicators into model training. Given a wider variety of inputs, the model can extract the important features during training and identify the indicators that really matter.

• Some research suggests that fully convolutional neural networks outperform recurrent neural networks in some cases. In the future, we will build a CNN module and compare it with our system to find out whether it really performs better.

• Combine our forecast module with natural language processing, merging investors' comments and opinions with daily prices, to determine whether comments from the internet really indicate future trends.

References

Black, F., & Litterman, R. (1990). Asset allocation: combining investor views with market equilibrium. Retrieved from

Box, G. E., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015). Time series analysis: forecasting and control: John Wiley & Sons.

Chen, L., Qiao, Z., Wang, M., Wang, C., Du, R., & Stanley, H. E. (2018). Which Artificial Intelligence Algorithm Better Predicts the Chinese Stock Market? IEEE Access, 6, 48625-48633.

Chen, T., & Guestrin, C. (2016). Xgboost: A scalable tree boosting system. Paper presented at the Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273-297.

Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica, 987-1007.

Glorot, X., Bordes, A., & Bengio, Y. (2011). Deep sparse rectifier neural networks. Paper presented at the Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics.

Ho, T. K. (1995). Random decision forests. Paper presented at the Proceedings of 3rd International Conference on Document Analysis and Recognition.

Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.

Karolyi, G. A. (1995). A multivariate GARCH model of international transmissions of stock returns and volatility: The case of the United States and Canada. Journal of Business & Economic Statistics, 13(1), 11-25.

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. Paper presented at the Advances in Neural Information Processing Systems.

LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436.

Lee, I., & Shin, Y. J. (2018). Fintech: Ecosystem, business models, investment decisions, and challenges. Business Horizons, 61(1), 35-46.

Markowitz, H. (1952). The utility of wealth. Journal of Political Economy, 60(2), 151-158.

McCallum, A., & Nigam, K. (1998). A comparison of event models for naive bayes text classification. Paper presented at the AAAI-98 Workshop on Learning for Text Categorization.

Nunamaker, J. F., Chen, M., & Purdin, T. D. M. (1990). Systems Development in Information Systems Research. Journal of Management Information Systems, 7(3), 89-106. doi:10.1080/07421222.1990.11517898

Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81-106.

Samuel, A. L. (1988). Some Studies in Machine Learning Using the Game of Checkers. II—Recent Progress. In Computer Games I (pp. 366-400): Springer.