This study proposes a competitive model that uses the Box–Jenkins approach to implement an ARIMA-GARCH model in order to improve financial forecasting. Unlike previous studies, we optimize the lagged terms, which helps capture the relationships in the data more accurately. The competitive model is then used to forecast the stock market index in Taiwan. We conduct out-of-sample forecasting and compare the root mean square errors (RMSEs) with those of previous studies. The results show that the competitive model outperforms the earlier models in terms of both RMSE and consistency.
Box–Jenkins ARIMA
The application of these Box–Jenkins ARIMA models consists of three main steps: identifying whether the series is I(0) or I(1) (unit root test and differencing), Box–Jenkins optimization, and training linear and nonlinear models. Many studies suggest that a hybrid model should be applied to handle both linear and nonlinear patterns. However, this study uses GARCH(1,1)
For comparison purposes, this study chooses various fuzzy time series models, including first-order models (
The contributions of this study are as follows. First, the competitive model builds a time series approach that handles both the linearity and nonlinearity of the data by optimizing the lagged terms, and it produces good forecasting results. Second, the Box–Jenkins approach better captures the lagged-term relationships and thus provides a better forecast. Third, optimizing the lagged terms allows the competitive model (a univariate model) to compete with a bivariate model.
This study proposes a competitive model that uses the Box–Jenkins approach to implement an ARIMA-GARCH model to improve forecasting. Toward that end, the remainder of this paper consists of the following sections.
ARIMA(
In ARIMA, the autoregressive series
This study also uses the GARCH model, which is designed to deal with sensitivity, nonstationarity, and asymmetric volatility in the series (
Fuzzy time series models have been utilized for decades by many researchers.
The neural network (NN) is a nonlinear technique that resembles the architecture of the human brain and is widely applied in forecasting. The first NN-fuzzy time series in forecasting was suggested by
This study employed daily closing data for the stock market index of Taiwan, the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX). To facilitate comparisons, the sample size was set to be the same as in
We conducted the Box–Jenkins ARIMA-GARCH(1,1) procedure as follows and refer to it as Model 6.
Step 1. Unit root test
Checking for data stationarity is an important step. In the case of stationarity, the TAIEX at time (
Step 2. Difference
Many previous researchers used the differences of a stock market index for prediction by the ARIMA family (
Step 3. Box–Jenkins optimization
The orders
Step 4. Building the competitive model
Heteroskedasticity is a statistical problem that leads to biased parameter estimates. Hence, a GARCH(1,1) equation is usually used to address heteroskedasticity and to enhance the robustness of the ARIMA family (
Here,
Step 5. Forecasting
Similar to previous studies, this research recovers the forecasted stock market index from the forecasted differences by backward induction. The output of the Box–Jenkins ARIMA-GARCH(1,1) model is therefore still a forecast of the index (
Step 6. Performance evaluation
Following a previous study (
We took TAIEX data in the year 2000 as an example to demonstrate the empirical analysis.
Step 1. Unit root test
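The null hypothesis of the ADF test is that the TAIEX has a unit root, that is, it is nonstationary. We performed this test on the level series and on its first difference; the null hypothesis was rejected only for the first difference. The results are shown in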
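To illustrate the logic of the unit root step, the following is a minimal sketch of a plain Dickey–Fuller regression with a constant. It is a simplification and not the augmented test used in this study, since it omits the lagged difference terms. The series is a synthetic random walk rather than TAIEX data, the function name `df_tstat` is hypothetical, and −2.86 is the usual asymptotic 5% critical value for the constant-only case.

```python
import math
import random


def df_tstat(series):
    """t-statistic of rho in: dy_t = c + rho * y_{t-1} + e_t (plain Dickey-Fuller)."""
    x = series[:-1]                                         # lagged level y_{t-1}
    dy = [b - a for a, b in zip(series[:-1], series[1:])]   # first difference
    n = len(x)
    mean_x, mean_dy = sum(x) / n, sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    rho = sxy / sxx                                         # OLS slope
    c = mean_dy - rho * mean_x                              # OLS intercept
    rss = sum((di - c - rho * xi) ** 2 for xi, di in zip(x, dy))
    se_rho = math.sqrt(rss / (n - 2) / sxx)                 # standard error of rho
    return rho / se_rho


# Synthetic random walk (assumption: stands in for an index-like I(1) series).
random.seed(0)
level = [8000.0]
for _ in range(300):
    level.append(level[-1] + random.gauss(0, 100))
diff = [b - a for a, b in zip(level[:-1], level[1:])]

CRIT_5PCT = -2.86  # asymptotic 5% Dickey-Fuller critical value, constant only
t_level = df_tstat(level)
t_diff = df_tstat(diff)
print("level:", round(t_level, 2), "reject unit root:", t_level < CRIT_5PCT)
print("diff: ", round(t_diff, 2), "reject unit root:", t_diff < CRIT_5PCT)
```

A statistic below the critical value rejects the unit root. For a series that is I(1), this is expected to happen for the differenced series, which is why the models that follow are fitted with one order of differencing.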
Step 2. Difference
The stock market index on January 5 was 8849.87, and it was 8756.55 on January 4. Hence, the first difference for January 5 was 8849.87 − 8756.55 = 93.32.
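The differencing step can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `first_differences` is hypothetical, and the two closing values are the January 4 and January 5 figures quoted in this step.

```python
def first_differences(closes):
    """Return d_t = y_t - y_{t-1} for a list of closing values, rounded to 2 decimals."""
    return [round(today - yesterday, 2)
            for yesterday, today in zip(closes[:-1], closes[1:])]


# January 4 and January 5, 2000 closing values of the TAIEX from the text.
closes = [8756.55, 8849.87]
print(first_differences(closes))  # prints [93.32]
```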
Step 3. Box–Jenkins optimization
By using Box–Jenkins methods, the value of order
Step 4. Building the competitive model
By using Equation (4), we optimized the error terms
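As a rough illustration of what the GARCH(1,1) component contributes, the sketch below runs the textbook conditional variance recursion, sigma2_t = omega + alpha * e_{t-1}^2 + beta * sigma2_{t-1}. This is the standard form and is not necessarily written exactly as Equation (4) of this study. The function name `garch11_variance` is hypothetical, and the parameter values in the usage line are illustrative, not estimates from TAIEX data.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances from the standard GARCH(1,1) recursion.

    sigma2_t = omega + alpha * e_{t-1}**2 + beta * sigma2_{t-1}
    The first value is the unconditional variance omega / (1 - alpha - beta).
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    if not residuals:
        return []
    sigma2 = [omega / (1.0 - alpha - beta)]        # starting value
    for e in residuals[:-1]:
        # yesterday's squared shock and yesterday's variance drive today's variance
        sigma2.append(omega + alpha * e * e + beta * sigma2[-1])
    return sigma2


# Illustrative parameters only (assumption): one shock followed by calm days.
variances = garch11_variance([2.0, 0.0, 0.0], omega=1.0, alpha=0.1, beta=0.8)
print([round(v, 2) for v in variances])
```

With these illustrative values the variance starts near 10, stays elevated after the shock at about 9.4, and then decays toward 8.52, which is the volatility clustering behavior that the ARIMA mean equation alone does not model.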
Step 5. Forecasting
As in Equation (4), the closing index on November 1 was 5425.02 and on November 2 it was 5626.08. Employing the model, the forecast of stock difference (
Step 6. Performance evaluation
For the year 2000, the RMSE of Model 6 was 122.68.
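For reference, the RMSE is the square root of the mean squared gap between actual and forecast values. The sketch below shows the computation; the function name `rmse` is hypothetical, and the three actual and forecast pairs are the first three November 2000 rows of the forecast table, used only as an illustration. The result for three days is not comparable to the full-year figure of 122.68.

```python
import math


def rmse(actual, forecast):
    """Root mean square error between two equally long sequences."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# First three November 2000 rows: actual TAIEX and model forecast.
actual = [5425.02, 5626.08, 5796.08]
forecast = [5543.85, 5439.55, 5685.32]
print(round(rmse(actual, forecast), 2))
```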
We repeated the forecasting for all years and compared the performance of six models in terms of RMSE, as in
The competitive model could capture both the linearity and nonlinearity of the data, as the hybrid models in previous studies do. In the model, the Box–Jenkins approach allows the researcher to optimize the lagged terms of both the autoregressive and moving average series by minimizing the white noise, which means that more of the relevant information is taken into consideration. Hence, the forecasting results are expected to improve. Model 5 (the bivariate NN-fuzzy approach), which uses all the degrees of membership, exhibited good results. The drawback of using all the degrees of membership for training and forecasting is that there can be too many fuzzy sets or inputs for the NN.
This study proposes a Box–Jenkins ARIMA-GARCH model as a competitive model to improve forecasting, as it optimizes the lagged terms of both the autoregressive and moving average series by minimizing the white noise. Because it incorporates most of the relevant lag information, the model performed better than the models in many previous studies. Another advantage of optimizing the lagged terms is that it allows a univariate model to compete with a bivariate model.
The competitive model can easily expand its function and also calculate fuzzy relationships. Following the empirical results in this study, the competitive model can solve problems of both linear and nonlinear data. For future work, if other suitable inputs can be observed, then the competitive model can be easily expanded to bivariate models.
The author, Nhan, Nguyen-Thanh, is very grateful to Prof. Yung-Chang, Wang and Prof. Kuo-Hsuan, Chin for their supportive comments.
Autocorrelation  Partial Correlation  Lag  AC  PAC  Q-Stat  Prob  

..   ..   1  0.037  0.037  0.3763  0.540 
..   ..   2  0.043  0.042  0.8847  0.643 
..   ..   3  0.045  0.042  1.4412  0.696 
*.   *.   4  −0.163  −0.168  8.7337  0.068 
..   ..   5  0.016  0.026  8.8085  0.117 
..   ..   6  −0.049  −0.039  9.4727  0.149 
..   ..   7  −0.029  −0.012  9.7123  0.205 
..   ..   8  −0.037  −0.063  10.094  0.259 
..   ..   9  0.009  0.027  10.115  0.341 
..   ..   10  −0.005  −0.017  10.124  0.430 
..   ..   11  0.029  0.030  10.359  0.498 
..   ..   12  −0.004  −0.026  10.364  0.584 
.*   .*   13  0.090  0.101  12.654  0.475 
.*   .*   14  0.119  0.104  16.722  0.271 
*.   *.   15  −0.123  −0.136  21.087  0.134 
..   ..   16  −0.031  −0.050  21.364  0.165 
..   ..   17  −0.045  −0.005  21.964  0.186 
*.   *.   18  −0.178  −0.139  31.245  0.027 
..   ..   19  0.016  −0.005  31.324  0.037 
..   ..   20  −0.014  0.008  31.379  0.050 
*.   *.   21  −0.088  −0.085  33.673  0.039 
..   ..   22  0.011  −0.035  33.710  0.053 
..   ..   23  −0.045  −0.047  34.319  0.061 
..   ..   24  0.046  0.047  34.951  0.069 
..   ..   25  0.022  −0.012  35.093  0.087 
..   ..   26  −0.007  −0.026  35.108  0.109 
..   ..   27  −0.011  −0.055  35.143  0.135 
.*   .*   28  0.084  0.126  37.277  0.113 
..   ..   29  −0.006  0.009  37.289  0.139 
..   ..   30  0.031  0.016  37.574  0.161 
ARIMA stands for autoregressive integrated moving average, which was proposed by (
GARCH is the well-known generalized autoregressive conditional heteroskedasticity approach.
TAIFEX denotes Taiwan Futures Exchange,
TAIEX denotes Taiwan Capitalization Weighted Stock Index,
Results of ADF unit root tests.
2000  2001  2002  2003  2004  


0.2622  −0.8788  −0.8427  −0.7568  −1.6883 

−15.75 ***  −14.41 ***  −15.26 ***  −15.05 ***  −14.85 *** 
*** denotes significance at the 1% level, ** at the 5% level, and * at the 10% level. Note: The null hypothesis of the Augmented Dickey–Fuller (ADF) test is that the TAIEX has a unit root.
Box–Jenkins method optimization and results for the year 2000.
Model  Available order to be chosen as 


AR( 
4  13  14  15  18  21  28 
−0.163  0.09  0.119  −0.123  −0.178 **  −0.088  0.084  
MA(q)  4  13  14  15  18  21  28 
−0.168 **  0.101  0.104  −0.136 **  −0.139  −0.085  0.126  
Model selection by information criteria  
ARIMA family  SIC  AIC  Rank  
ARIMA(18;1;4)  12.9674  12.9254  3  
ARIMA(18;1;15)  12.9879  12.9458  4  
ARIMA(18;1;4,15)  12.9664  12.9126  1  
ARIMA(0;1;4,15,18)  12.9686  12.9131  2 
*** denotes significance at the 1% level, ** at the 5% level, and * at the 10% level.
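The ranking in the model selection panel above can be reproduced by sorting the candidate models on an information criterion, where a lower value is better. The sketch below uses the SIC and AIC values reported in that panel; the dictionary layout and variable names are illustrative assumptions, not part of the original procedure.

```python
# SIC and AIC values for the year 2000 candidates, copied from the table.
candidates = {
    "ARIMA(18;1;4)": {"SIC": 12.9674, "AIC": 12.9254},
    "ARIMA(18;1;15)": {"SIC": 12.9879, "AIC": 12.9458},
    "ARIMA(18;1;4,15)": {"SIC": 12.9664, "AIC": 12.9126},
    "ARIMA(0;1;4,15,18)": {"SIC": 12.9686, "AIC": 12.9131},
}

# Lower criterion values indicate a better trade-off between fit and parsimony.
ranked_by_aic = sorted(candidates, key=lambda name: candidates[name]["AIC"])
best_by_sic = min(candidates, key=lambda name: candidates[name]["SIC"])

for rank, name in enumerate(ranked_by_aic, start=1):
    print(rank, name, candidates[name]["AIC"])
print("best by SIC:", best_by_sic)
```

Both criteria select ARIMA(18;1;4,15), which matches the model reported for the year 2000.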
Optimized model for each year.
Year  Best model 

2000  ARIMA(18;1;4,15)-GARCH(1,1) 
2001  ARIMA(4;1;4,15)-GARCH(1,1) 
2002  ARIMA(9;1;4)-GARCH(1,1) 
2003  ARIMA(8;1;8)-GARCH(1,1) 
2004  ARIMA(29;1;2,15)-GARCH(1,1) 
Forecast from the Box–Jenkins ARIMA-GARCH(1,1) model.
Date  Actual TAIEX  First difference series (input)  Previous-day TAIEX (input)  Forecast value of stock market index difference (output)  Forecast (output)  

11/1/2000  5425.02  −119.16  5544.18  −0.33156  5543.85 
11/2/2000  5626.08  201.06  5425.02  14.5274  5439.55 
11/3/2000  5796.08  170.00  5626.08  59.2410  5685.32 
11/4/2000  5677.30  −118.78  5796.08  13.9798  5810.06 
11/6/2000  5657.48  −19.82  5677.30  45.7822  5723.08 
11/7/2000  5877.77  220.29  5657.48  0.21743  5657.70 
11/8/2000  6067.94  190.17  5877.77  −89.554  5788.22 
11/9/2000  6089.55  21.61  6067.94  20.0731  6088.01 
11/10/2000  6088.74  −0.81  6089.55  34.4934  6124.04 
11/13/2000  5793.52  −295.22  6088.74  −129.906  5958.83 
11/14/2000  5772.51  −21.01  5793.52  −136.168  5657.35 
11/15/2000  5737.02  −35.49  5772.51  −33.7512  5738.76 
11/16/2000  5454.13  −282.89  5737.02  −19.9644  5717.06 
11/17/2000  5351.36  −102.77  5454.13  25.2855  5479.42 
11/18/2000  5167.35  −184.01  5351.36  −35.5636  5315.80 
11/20/2000  4845.21  −322.14  5167.35  11.3685  5178.72 
11/21/2000  5103.00  257.79  4845.21  38.9101  4884.12 
11/22/2000  5130.61  27.61  5103.00  10.8685  5113.87 
11/23/2000  5146.92  16.31  5130.61  48.9828  5179.59 
11/24/2000  5419.99  273.07  5146.92  52.0762  5199.00 
11/27/2000  5433.78  13.79  5419.99  −124.454  5295.54 
11/28/2000  5362.26  −71.52  5433.78  −48.3960  5385.38 
11/29/2000  5319.46  −42.80  5362.26  −9.66358  5352.60 
11/30/2000  5256.93  −62.53  5319.46  −96.5995  5222.86 
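The rows of the forecast table can be checked against the backward induction in Step 5: each index forecast equals the previous day's actual close plus the forecasted difference. The sketch below verifies this for the first three rows; the function name `reconstruct_index` and the tuple layout are illustrative assumptions.

```python
def reconstruct_index(previous_close, forecast_difference):
    """Index forecast = previous actual close + forecasted first difference."""
    return round(previous_close + forecast_difference, 2)


# (date, previous close, forecasted difference, forecast reported in the table)
rows = [
    ("11/1/2000", 5544.18, -0.33156, 5543.85),
    ("11/2/2000", 5425.02, 14.5274, 5439.55),
    ("11/3/2000", 5626.08, 59.2410, 5685.32),
]

for date, previous_close, forecast_difference, reported in rows:
    rebuilt = reconstruct_index(previous_close, forecast_difference)
    print(date, rebuilt, rebuilt == reported)
```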
Performance evaluation by root mean square errors (RMSEs). NN: neural network.
Model  2000  2001  2002  2003  2004 

(1) First-order model  176.32  147.84  101.18  74.46  84.28 
(2) Multivariate model  154.42  124.02  95.73  70.76  72.35 
(3) NN model  152  130  84  56  N/A 
(4) NN-based fuzzy time series  149.59  98.31  78.71  58.78  55.91 
(5) Bivariate fuzzy time series  67  120  69  52  60 
(6) The competitive model  122.68  108.62  66.11  55.56  52.99 
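One simple way to read the consistency of the competitive model from the RMSE table is to rank the models within each year, with rank 1 for the lowest RMSE. The sketch below does this with the values reported in the table. The shortened model labels, the use of `None` for the missing 2004 entry, and the helper `rank_in_year` are illustrative assumptions.

```python
YEARS = [2000, 2001, 2002, 2003, 2004]

# RMSE values copied from the table; None marks the N/A entry.
rmse_table = {
    "(1) First-order": [176.32, 147.84, 101.18, 74.46, 84.28],
    "(2) Multivariate": [154.42, 124.02, 95.73, 70.76, 72.35],
    "(3) NN": [152, 130, 84, 56, None],
    "(4) NN-based fuzzy": [149.59, 98.31, 78.71, 58.78, 55.91],
    "(5) Bivariate fuzzy": [67, 120, 69, 52, 60],
    "(6) Competitive": [122.68, 108.62, 66.11, 55.56, 52.99],
}


def rank_in_year(model, column):
    """Rank of a model in one year (1 = lowest RMSE), skipping missing entries."""
    available = sorted(v[column] for v in rmse_table.values() if v[column] is not None)
    return available.index(rmse_table[model][column]) + 1


competitive_ranks = [rank_in_year("(6) Competitive", i) for i in range(len(YEARS))]
print(dict(zip(YEARS, competitive_ranks)))
# prints {2000: 2, 2001: 2, 2002: 1, 2003: 2, 2004: 1}
```

On these figures the competitive model is first or second in every year, whereas each of the other models falls to third place or lower in at least one year.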