Bitcoin Price Forecasting using Different Artificial Neural Network and Training Algorithm
Bitcoin is gaining popularity day by day, and some economists
anticipate that it might one day replace current transaction
methods. However, the Bitcoin price is difficult for investors to
predict, which complicates investment decisions. One reason is
that the price fluctuates heavily and behaves nonlinearly. A
better forecasting method is therefore needed to minimize the
risk of inaccurate decisions. The aim of this paper is to find
the best model for predicting the Bitcoin price using two
different neural networks (NN): the Feedforward Neural Network
(FNN) and the Nonlinear Autoregressive (NAR) neural network. The
NN models are tested with two different backpropagation training
algorithms: Levenberg-Marquardt (LM) and Scaled Conjugate
Gradient (SCG). The best model is identified by evaluating the
performance measurements of each model. The results show that
NAR with the LM training algorithm out-performed the other
models, which indicates that it is the most suitable of the
tested networks for predicting the Bitcoin price. The resulting
model provides new insights into Bitcoin forecasting using NAR,
which can help investors and economists lower the risk of making
wrong decisions when investing in Bitcoin.
INTRODUCTION
METHODOLOGY
Two NN models are used in this study: the Feedforward Neural
Network (FNN) and the Nonlinear Autoregressive (NAR) neural
network. The NN models are trained with two different training
algorithms: Levenberg-Marquardt (LM) and Scaled Conjugate
Gradient (SCG). The best model is determined by comparing the
performance measurements of each model.
2.1. Data
Different websites offer different selling prices for Bitcoin.
For this research, the Bitcoin price data were collected from
Blockchain, which is the master ledger that records the
original Bitcoin price. The study uses 2435 observations of
daily Bitcoin price data, from 1st January 2012 to 31st August
2018. Aside from the Bitcoin price data, several other daily
variables were collected from Blockchain: hash rate, average
block size, transaction cost, number of transactions, miner
revenue and number of transactions per block. These are some of
the influencing factors mentioned by Kristoufek [6].
2.2. Feedforward Neural Network
The Feedforward Neural Network is one of the basic forms of
Artificial Neural Network. It passes the information from the
input layer directly to the output layer after applying an
activation function [7]. However, in order to handle nonlinear
data, a hidden layer needs to be inserted between the input and
output layers [8]. Fig. 1 illustrates the model of the FNN.
The weight of each interconnection is updated repeatedly
according to the chosen training algorithm.
In this study, a basic three-layer backpropagation
feedforward neural network with α input nodes, β hidden
nodes and one output node is used. The predicted output
values are obtained from Equation (1).
In Equation (1), $y_t$ is the output value at actual time t;
$x_t$ is the input value at actual time t; $w_{ij}$ is the
connection weight between input and hidden layer nodes; $w_j$ is
the connection weight between hidden and output layer nodes;
$\theta$ is the bias constant; $f(x)$ and $g(x)$ are the
activation functions; i = j = 1, 2, 3, …, n.
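The forward pass of such a three-layer network can be sketched in a few lines. The snippet below is a minimal illustration, assuming a log-sigmoid hidden activation and a linear output node (the transfer functions listed in Section 2.4), one bias per node, and hypothetical weight values. The names `logsig` and `fnn_forward` are illustrative and are not taken from the original implementation.

```python
import math


def logsig(z: float) -> float:
    """Log-sigmoid activation: 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))


def fnn_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a three-layer network with one linear output node.

    x        : list of input values (alpha entries)
    w_hidden : one row per hidden node, each row holding alpha weights
    b_hidden : one bias per hidden node (beta entries)
    w_out    : hidden-to-output weights (beta entries)
    b_out    : bias of the output node
    """
    hidden = [
        logsig(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Linear (identity) activation on the output node.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Toy example: 2 inputs, 2 hidden nodes; every number here is hypothetical.
y_hat = fnn_forward(
    x=[0.2, 0.7],
    w_hidden=[[0.5, -0.3], [0.1, 0.8]],
    b_hidden=[0.0, 0.0],
    w_out=[1.0, -1.0],
    b_out=0.1,
)
```

In practice the weights and biases would come from training with LM or SCG rather than being set by hand.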
2.3. Nonlinear Autoregressive (NAR)
The NAR neural network is a form of recurrent neural network.
The difference between NAR and NARX is that NAR does not have
exogenous inputs; NAR only loops the output information back to
the hidden layer. It is commonly used to forecast nonlinear
time series without taking any influencing variables into
account [10]. The equation of the NAR model is as follows:
where $\hat{y}(t+1)$ is the next predicted output value.
The architecture of the NAR neural network is shown in Fig.
2.
Figure 2: Architecture of NAR Model [11]
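The autoregressive structure can be made concrete with a short sketch. A NAR model is commonly written as a nonlinear function of the previous d values of the same series, e.g. $\hat{y}(t+1) = f(y(t), y(t-1), \ldots, y(t-d+1))$; the lag count d and the helper names below are assumptions for illustration only. The code builds the lagged input/target pairs used for training and shows how a one-step model can be iterated by feeding its own predictions back in.

```python
def make_lagged_pairs(series, d):
    """Return (inputs, targets); each input holds the d values before its target."""
    inputs, targets = [], []
    for t in range(d, len(series)):
        inputs.append(series[t - d:t])
        targets.append(series[t])
    return inputs, targets


def forecast_closed_loop(model, history, d, steps):
    """Iterate a one-step model, feeding each prediction back as an input."""
    window = list(history[-d:])
    predictions = []
    for _ in range(steps):
        y_next = model(window)
        predictions.append(y_next)
        window = window[1:] + [y_next]  # drop the oldest value, append the newest
    return predictions


prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X, y = make_lagged_pairs(prices, d=2)


def toy_model(window):
    """Stand-in for a trained network: linear extrapolation of the last two values."""
    return 2 * window[-1] - window[-2]


future = forecast_closed_loop(toy_model, prices, d=2, steps=3)
```

No exogenous variables appear anywhere in the inputs, which is the distinction from NARX drawn above.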
2.4. Parameters Setting for NN Models
The parameters shown in Table 1 were used for both FNN and
NAR. All models are set with the same parameters in order to
obtain a fair comparison.
All neural networks are set with the Levenberg-Marquardt
training algorithm, a log-sigmoid transfer function from the
input layer to the hidden layer and a linear transfer function
from the hidden layer to the output layer. Furthermore, the
neural networks are set with a maximum of 500 validation
failures, a maximum of 10000 epochs, a learning rate of 0.01,
a performance goal of 0, a minimum gradient of 1.00 × 10^-6,
a μ of 1.00 × 10^-3 and a maximum μ of 1.00 × 10^10.
Table 1: Training Parameters of each model
Parameter            FNN                    NAR
Transfer function    log-sigmoid + linear   log-sigmoid + linear
Maximum fail         500                    500
Maximum epochs       10000                  10000
Learning rate, α     0.01                   0.01
Performance goal     0                      0
Minimum gradient     1.00 × 10^-6           1.00 × 10^-6
μ                    1.00 × 10^-3           1.00 × 10^-3
Maximum μ            1.00 × 10^10           1.00 × 10^10
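To show how these settings act as stopping criteria during training, the following sketch collects the Table 1 values in a small configuration object and checks them in turn. The class name, field names, the order of the checks and the exact comparison operators are assumptions made for illustration; they are not taken from the toolbox used in the study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainParams:
    """Training settings from Table 1 (field names are hypothetical)."""
    max_fail: int = 500          # validation failures tolerated
    max_epochs: int = 10000
    learning_rate: float = 0.01
    goal: float = 0.0            # performance (error) goal
    min_grad: float = 1e-6
    mu: float = 1e-3             # LM damping factor at the start
    mu_max: float = 1e10


def stop_reason(p: TrainParams, epoch: int, perf: float,
                grad_norm: float, mu: float, val_fails: int) -> Optional[str]:
    """Return the name of the first stopping criterion met, or None."""
    if epoch >= p.max_epochs:
        return "max_epochs"
    if perf <= p.goal:
        return "goal"
    if grad_norm < p.min_grad:
        return "min_grad"
    if mu > p.mu_max:
        return "mu_max"
    if val_fails >= p.max_fail:
        return "max_fail"
    return None


params = TrainParams()
status = stop_reason(params, epoch=120, perf=0.02, grad_norm=1e-3,
                     mu=1e-3, val_fails=3)
```

With a performance goal of 0, training would in practice end through one of the other criteria, since the error rarely reaches exactly zero.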
2.5. Data Pre-processing
Different value ranges among the variables directly influence
the tendency and accuracy of the models, especially NN [12].
Therefore, normalization is applied in the analysis. The data
are normalized using the Min-Max normalization method, which
transforms the data into a defined range of 0 to 1 [13]. The
Min-Max normalization method is given in Equation (3).
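For a target range of 0 to 1, Min-Max normalization takes the standard form $x' = (x - x_{\min}) / (x_{\max} - x_{\min})$. The sketch below applies it to a toy series and also shows the inverse transform, which is needed to convert network outputs back to price units. The function names and sample values are hypothetical.

```python
def min_max_fit(values):
    """Return (minimum, maximum) of the data used to define the scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant series cannot be min-max scaled")
    return lo, hi


def min_max_transform(values, lo, hi):
    """Map values into [0, 1] using the fitted minimum and maximum."""
    return [(v - lo) / (hi - lo) for v in values]


def min_max_inverse(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]


prices = [10.0, 15.0, 20.0]
lo, hi = min_max_fit(prices)
scaled = min_max_transform(prices, lo, hi)
restored = min_max_inverse(scaled, lo, hi)
```

One design choice worth noting: fitting the minimum and maximum on the training portion only avoids leaking information from the test period into the scaling.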
2.6. Forecast Accuracy
In this research, five forecasting accuracy measurements are
applied to evaluate the accuracy and performance of the
predicted output of all models. The measurements used in this
study are Mean Absolute Error (MAE), Mean Forecast Error (MFE),
Root Mean Square Error (RMSE), Mean Absolute Percentage Error
(MAPE) and Mean Absolute Scaled Error (MASE). The best model is
the one with the smallest values across the measurements. The
equations for the forecast accuracy measurements are as
follows:
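The five measurements can be written compactly in code. The following sketch uses their commonly cited definitions; two details are assumptions rather than facts from this study: the forecast error is taken as actual minus predicted, and MASE is scaled by the mean absolute error of the one-step naïve forecast computed on the same series.

```python
import math


def mae(actual, pred):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, pred)) / len(actual)


def mfe(actual, pred):
    """Mean Forecast Error (assumed sign convention: actual minus predicted)."""
    return sum(a - p for a, p in zip(actual, pred)) / len(actual)


def rmse(actual, pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual))


def mape(actual, pred):
    """Mean Absolute Percentage Error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, pred)) / len(actual)


def mase(actual, pred):
    """MAE scaled by the MAE of the naive forecast y(t) = y(t-1) (assumed scaling)."""
    naive_mae = sum(
        abs(actual[i] - actual[i - 1]) for i in range(1, len(actual))
    ) / (len(actual) - 1)
    return mae(actual, pred) / naive_mae


# Toy values, not data from the study.
actual = [100.0, 110.0, 120.0, 130.0]
pred = [90.0, 115.0, 120.0, 140.0]
scores = {
    "MAE": mae(actual, pred),
    "MFE": mfe(actual, pred),
    "RMSE": rmse(actual, pred),
    "MAPE": mape(actual, pred),
    "MASE": mase(actual, pred),
}
```

Note that MFE can be close to zero even when errors are large, because positive and negative errors cancel; it measures bias rather than accuracy.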
RESULT AND DISCUSSIONS
Fig. 3.1 shows the time series plot of the Bitcoin price. Based
on visual inspection, the series appears to be nonlinear and
non-stationary. However, proper statistical tests are needed to
confirm these findings. Therefore, the Augmented Dickey-Fuller
(ADF) and Anderson-Darling (AD) statistical tests were performed
to examine the stationarity and linearity properties of the
data. The ADF test returned a logical result of 0 with a p-value
of 0.8401 (p > 0.05). This indicates that the test fails to
reject the null hypothesis of a unit root, suggesting that the
data are not stationary. Meanwhile, the AD test returned a
logical result of 1 with a p-value of approximately 0.0005,
which indicates that the null hypothesis of normality is
rejected. Thus, it can be concluded that the data do not follow
the normal distribution, which is taken here as evidence that
nonlinearity exists in the data. Together, these results
indicate that the Bitcoin time series is non-stationary and
nonlinear, and hence fulfils the assumptions for applying a
neural network. With the non-stationary and nonlinear properties
established, the dataset is ready to be used for prediction
with the neural network method.
The predicted values versus the actual values for each model are
illustrated in Fig. 4 to Fig. 7. Visual inspection indicates
that the predicted values of both FNN models fluctuate widely,
and that FNN with the SCG training algorithm fluctuates the
most. In contrast, the predicted values of both NAR models are
very close to the actual values.
The accuracy of the predicted values was evaluated using MAE,
MAPE, RMSE, MASE and MFE. The forecast accuracies of the models
are tabulated in Table 2.
From Table 2, the best performance measurements are obtained
from the NAR model, with values of 65.87, 39.71, 202.34 and
1.09 for MAE, MAPE, RMSE and MASE, respectively. The lowest MAE
and RMSE imply that NAR produced smaller errors than FNN. The
MAPE values of NAR with the LM and SCG training algorithms fall
in the category of reasonable forecasting accuracy, whereas the
MAPE values of FNN fall in the category of inaccurate
forecasting. Meanwhile, the MASE of NAR for both training
algorithms is close to 1, which implies that the models perform
comparably to the naïve model (a MASE below 1 would indicate
out-performance; the reported 1.09 is marginally above 1). The
MFE for FNN with the LM training algorithm and NAR with the SCG
training algorithm indicates that these models slightly
under-forecast, whereas the MFE for FNN with the SCG training
algorithm and NAR with the LM training algorithm shows that
these models over-forecast.
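The interpretation rules used in this discussion can be summarized as a small helper. The MAPE thresholds below follow a commonly cited scale (below 10% highly accurate, 10% to 20% good, 20% to 50% reasonable, above 50% inaccurate); the exact cut-offs applied in this study are not stated, so they are an assumption, as is the sign convention for MFE (actual minus predicted). All function names are hypothetical.

```python
def mape_category(mape_percent):
    """Classify a MAPE value (in percent) on an assumed four-level scale."""
    if mape_percent < 10:
        return "highly accurate"
    if mape_percent < 20:
        return "good"
    if mape_percent <= 50:
        return "reasonable"
    return "inaccurate"


def bias_direction(mfe_value):
    """Interpret the sign of MFE, assuming error = actual - predicted."""
    if mfe_value > 0:
        return "under-forecast"
    if mfe_value < 0:
        return "over-forecast"
    return "unbiased"


def beats_naive(mase_value):
    """A MASE below 1 means a smaller error than the naive forecast."""
    return mase_value < 1.0


# Reported NAR values from this section.
nar_mape_class = mape_category(39.71)
nar_beats_naive = beats_naive(1.09)
```

Applied to the reported NAR results, a MAPE of 39.71 falls in the reasonable band, and a MASE of 1.09 is close to, but not below, the naïve benchmark.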
CONCLUSION
The ADF and AD tests indicate that the Bitcoin price is
nonlinear and non-stationary. Therefore, classical forecasting
methods are not suitable for forecasting the Bitcoin price, as
they require the linearity and stationarity assumptions to be
fulfilled. NAR has the lowest error compared to FNN in terms of
MAE, MAPE, RMSE and MASE, with values of 65.878, 39.708%,
202.337 and 1.090, respectively. Thus, NAR is the best of the
tested models for Bitcoin price prediction.
However, there are some limitations in forecasting the Bitcoin
price using ANN. First, it is hard to explain how an ANN
produces a solution. Second, the network structure of an ANN
has a significant effect on the result. Determining a suitable
network structure therefore requires repeated trial and error,
which consumes a large amount of time. Furthermore, ANN is
limited to numerical information, so it cannot process
information such as Bitcoin news, global commentary and other
non-numerical information.
Aside from internal factors of the Bitcoin system, external
factors such as global trends, Bitcoin news and recent events
might also influence the Bitcoin price [6]. Another limitation
is that the Bitcoin price data have to be up to date in order
to achieve better accuracy in predicting the price.
It is recommended that further research investigate the optimal
network structure in order to achieve better accuracy.
In addition, a hybrid model that combines quantitative and
qualitative forecasting approaches is recommended so that all
factors can be included in the forecasting.