Electricity Demand Forecasting using LSTM-based RNNs in Romania

Machine Learning

We have developed a deep learning approach to Electricity Demand Forecasting (EDF). By looking at the Romanian electricity market, we demonstrate how moving from traditional econometric models to algorithms such as LSTM-based RNNs brings significant improvements in accuracy.

Apolline Foucher, Augustine Malija, Vlad Surdea-Hernea
2021-04-24

Abstract

Electricity demand forecasting is the central challenge for the operators of a modern grid system. Traditionally, this computational problem has been tackled using statistical models such as the Autoregressive Integrated Moving Average (ARIMA) model. However, recent developments show that machine-learning methods outperform the traditional econometric models. In this paper, we conduct a series of experiments to demonstrate that an LSTM-based RNN can accurately forecast the complex electric load time series. The performance of the machine-learning approach compares favorably to the ARIMA model, especially given the lack of weather-related data.

Introduction

The deep decarbonization of the European Union’s economy implies the massive electrification of industries, transport and heating and cooling. Additionally, the rapid growth of the renewable energy sources (RES) share in the energy mix comes with its own organic challenges, such as seasonal intermittencies. Overall, the upcoming decades will bring the urgent need to reform and smarten the electricity grid, making it capable of serving the contemporary needs of Europe in a sustainable manner. To enable the well-functioning of this smart grid, operators in the distribution and transport industries will have to improve their capacity to predict electricity loads in advance. Therefore, electricity demand forecasting (EDF) lies at the foundation of the EU’s decarbonization plans, being essential for the management of the national electricity market.

EDF is a primary input in decisions ensuring the security of energy supply, as well as the optimal requirements on adjacent markets such as the electricity balancing market. Mistakes related to EDF can lead to significant societal problems:

  1. Overestimating daily electricity demand can lead to a waste of resources, put unnecessary pressure on the environment through the additional operation of fossil-fuel-powered plants, and ultimately prevent the optimal deployment of new renewable energy sources.
  2. Underestimating electricity demand is also a significant issue, as it can lead to prolonged blackouts, or introduce supplementary costs for the participants in the electricity market who would need to pay the higher tariffs existing on the balancing markets.

In the past, EDF was primarily carried out with traditional econometric models such as the autoregressive (AR) method, the exponential smoothing (ES) method, and the autoregressive integrated moving average (ARIMA) method. The main disadvantage of all these models is the assumption that past load changes will continue with an identical trend. Given the non-linearity of electricity loads across time, especially in the case of residential electricity consumption, this assumption is rather strong and unlikely to properly reflect the realities of the electricity sector. Thus, traditional statistical time series models have failed to improve their accuracy over time. This is why machine-learning models have gained traction in recent years, especially since the rise of renewable energy.

In this project we deploy deep-learning architectures, in the form of Long Short-Term Memory (LSTM)-based Recurrent Neural Networks (RNNs), to solve the challenge of EDF. To characterize the accuracy of our model precisely, we compare it with the results of a traditional ARIMA model applied to the same dataset. The proposed LSTM-based RNN exploits long-term dependencies in the electricity consumption time series to generate accurate forecasts of the aggregate load.

Because no similar research had previously been applied to Eastern Europe, and because consultancy companies in this region have recently begun using machine-learning models for EDF, we chose to focus on the Romanian market.

Given the rapid and significant advancements in machine learning, and in deep learning in particular, researchers have used multiple architectures for a plethora of EDF challenges. Different time resolutions have been studied: monthly, weekly, daily, hourly, half-hourly, and minute-by-minute. These deep-learning architectures have been used to target different markets such as business consumers, household consumers, industrial consumers, etc. Thus, our proposed model of applying LSTM-based RNNs to EDF in Romania is grounded in the latest literature in the field, as represented by papers such as the following:

  1. Lago et al. (2018) propose a novel machine-learning approach towards EDF on the spot market. To demonstrate the effectiveness of their methodology, the authors compare 27 state-of-the-art models used in academia and industry. They show that, in general, machine-learning models outperform the standard statistical models, and that the LSTM-based RNN approach in particular tends to yield very accurate results. The LSTM-RNN approach is the most accurate in cases in which the expected load is not linear, which is mostly the case in the retail market.

  2. Zheng et al. (2017) use an LSTM-based RNN to handle the nonlinear, non-stationary and nonseasonal nature of the electric load time series. The authors use 906 different samples and train their model such that it predicts the load for the next day based on the loads from the past ten days. Multiple experiments show that LSTM-based RNNs outperform traditional methods of EDF, especially in the case of short-term EDF, which is the hardest to predict using statistical tools.

  3. Muzaffar and Afshari (2019) find that regardless of the horizon for the prediction (next day, next hour, next month, etc.), LSTM-based RNNs outperform traditional statistical models such as SARIMA, ARMA and ARMAX. Having access to a large dataset, the authors train the LSTM-based RNN on the first 12 months of observations and use the 13th month for testing. One interesting fact discovered by Muzaffar and Afshari (2019) is that over-learning might be an issue for a larger number of hidden units.

  4. Son and Kim (2020) apply the same machine-learning approach to EDF to a dataset spanning 22 years of electrical loads in South Korea. The LSTM-based RNN is compared with 4 standard statistical models, and performance is assessed using 6 different benchmark criteria (MAE, RMSE, MAPE, C, MBE, and UPA). While LSTM RNNs outperform other models in all six categories, the authors recognize that different criteria yield different accuracy disparities.

Proposed Method

The following section describes the two models used for EDF in the Romanian market. We first briefly describe the mathematical foundations of the traditional ARIMA model, and then move on to a comprehensive analysis of the architecture of LSTM-based RNNs.

ARIMA

ARIMA is the best-known statistical method for time series analysis, used primarily to forecast univariate data. The model is defined by three parameters, p, d, and q. These represent the order of the autoregressive component, the degree of differencing (integration), and the order of the moving-average component, respectively.
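For reference, the standard textbook form of an ARIMA(p, d, q) model can be written with the lag operator \(L\), where \(L x_t = x_{t-1}\). The notation below is introduced here for illustration and is not taken from the project's code:

\[ \left(1-\sum_{i=1}^{p}\phi_i L^i\right)(1-L)^d x_t = c + \left(1+\sum_{j=1}^{q}\theta_j L^j\right)\varepsilon_t \]

Here \(\phi_i\) are the autoregressive coefficients, \(\theta_j\) the moving-average coefficients, \(c\) a constant, and \(\varepsilon_t\) a white-noise error term. Setting \(d=0\) recovers an ARMA(p, q) model.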

LSTM-based RNN

In standard RNN architectures, the neural network is designed as a chain of identical modules formed as a series of hidden networks, usually in the form of a single sigmoid layer (see Figure 1). Given that the architecture of an RNN assigns one layer for each moment in time, it is theoretically suitable for time series analysis. However, training sequences with very long time steps is challenging due to the RNN’s inherent limitations, such as vanishing gradients.

In contrast to this standard architecture, the hidden layers of an LSTM-based RNN have a more complex structure capable of learning long-term dependencies. The LSTM extension introduces the concepts of gate and memory cell in each hidden layer of the network. As a consequence, a memory block in the LSTM-based RNN is composed of four structural parts: an input gate, a forget gate, an output gate, and the self-connected memory cells. Within this structure, the multiplicative gates make it possible for the memory cells to access and store information over long time intervals.

Briefly stated, given the input time-series vector \(x=\{x_1,x_2,x_3,...,x_t\}\), the LSTM-based RNN computes two sequences: the hidden state sequence \(h=\{h_1,h_2,h_3,...,h_t\}\) and the output sequence \(y=\{y_1,y_2,y_3,...,y_t\}\). The process is iterative and is carried out by sequentially updating the states of the memory cells.
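The iterative update of the memory cells can be illustrated with a minimal NumPy sketch of the standard LSTM cell equations. This is not the project's implementation (which relies on Keras); the function names `lstm_step` and `run_lstm`, the weight layout and the random initialisation are assumptions made purely for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """Apply one LSTM time step and return the new hidden and cell states.

    Assumed (hypothetical) layout: W has shape (4*n, n_input), U has shape
    (4*n, n) and b has shape (4*n,), with rows stacked in the order
    input gate, forget gate, output gate, candidate cell value.
    """
    n = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:n])              # input gate: how much new information to store
    f = sigmoid(z[n:2 * n])          # forget gate: how much old memory to keep
    o = sigmoid(z[2 * n:3 * n])      # output gate: how much memory to expose
    g = np.tanh(z[3 * n:4 * n])      # candidate values for the memory cell
    c_t = f * c_prev + i * g         # self-connected memory cell update
    h_t = o * np.tanh(c_t)           # hidden state passed to the next step
    return h_t, c_t


def run_lstm(xs, n_hidden, seed=0):
    """Run an untrained LSTM cell over a sequence xs of shape (T, n_input)."""
    rng = np.random.default_rng(seed)
    n_input = xs.shape[1]
    W = rng.normal(scale=0.1, size=(4 * n_hidden, n_input))
    U = rng.normal(scale=0.1, size=(4 * n_hidden, n_hidden))
    b = np.zeros(4 * n_hidden)
    h = np.zeros(n_hidden)
    c = np.zeros(n_hidden)
    hidden_states = []
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, W, U, b)
        hidden_states.append(h)
    return np.stack(hidden_states)
```

Because the cell state is updated additively (`f * c_prev + i * g`) rather than being squashed through a new layer at every step, information can persist across many time steps, which is the property the text attributes to the memory cells.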

Coding strategy

The proposed LSTM-based RNN model was implemented using the Keras API in TensorFlow. The model, consisting of one input layer, five hidden layers and one output layer, was trained with a mean-squared error (MSE) loss function and optimized with the adaptive moment estimation (Adam) scheme. Adam was chosen over RMSprop, the other standard alternative. The pre-processing of the data was straightforward: we made sure to obtain a dataset that contains all the time-series information (hour, day, weekday, week, month, year) as well as the electricity load for each observation in the time series. For clarity, we also introduced a date-time variable as the index of the dataset.
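The strategy described above could be sketched roughly as follows. This is a hedged illustration, not the project's actual code: the helper names (`add_time_features`, `make_windows`, `build_model`), the look-back window, the number of units and the choice of three LSTM plus two dense hidden layers are all assumptions, since the text only specifies the layer count, the MSE loss and the Adam optimizer.

```python
import numpy as np
import pandas as pd


def add_time_features(load: pd.Series) -> pd.DataFrame:
    """Return a frame indexed by timestamp with calendar features and the load."""
    idx = pd.DatetimeIndex(load.index)
    frame = pd.DataFrame(index=idx)
    frame["hour"] = idx.hour
    frame["day"] = idx.day
    frame["weekday"] = idx.weekday  # Monday = 0, ..., Sunday = 6
    frame["week"] = idx.isocalendar()["week"].astype(int).to_numpy()  # ISO week
    frame["month"] = idx.month
    frame["year"] = idx.year
    frame["load"] = load.to_numpy()
    return frame


def make_windows(values: np.ndarray, lookback: int):
    """Cut a (n_samples, n_features) array into overlapping input windows.

    Each window of `lookback` rows is paired with the load (last column)
    of the row that immediately follows it.
    """
    n_windows = len(values) - lookback
    X = np.stack([values[i:i + lookback] for i in range(n_windows)])
    y = values[lookback:, -1]
    return X, y


def build_model(lookback: int, n_features: int, units: int = 64):
    """Hypothetical network: one input, five hidden and one output layer."""
    import tensorflow as tf  # imported lazily so the helpers above run without it

    model = tf.keras.Sequential(
        [
            tf.keras.Input(shape=(lookback, n_features)),
            tf.keras.layers.LSTM(units, return_sequences=True),
            tf.keras.layers.LSTM(units, return_sequences=True),
            tf.keras.layers.LSTM(units),
            tf.keras.layers.Dense(units, activation="relu"),
            tf.keras.layers.Dense(units // 2, activation="relu"),
            tf.keras.layers.Dense(1),  # predicted load for the next time step
        ]
    )
    model.compile(optimizer="adam", loss="mse")
    return model
```

Under these assumptions, training would amount to calling `build_model(lookback, n_features).fit(X, y)` on windows built from the feature frame, ideally after scaling the inputs.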

Experiments

Data: For this project, we use the historical electricity consumption database provided by the European Network of Transmission System Operators for Electricity (ENTSO-E). This database encompasses a range of datasets provided by national Transmission System Operators (TSOs) in Europe.

We have extracted, from the aggregate dataset, data relevant for the case of Romania. This was done in order to explore a market that was previously ignored by researchers trying to compare the accuracy of machine-learning methods with econometric models for the task of EDF. The dataset provided by the Romanian TSO contains 119771 observations, one for each hour between 01.01.2006 and 31.08.2019.

Evaluation method: The LSTM model is evaluated using a series of benchmark metrics that are used in multiple studies comparing different options for EDF:

Mean bias error (MBE) and root-mean-squared error (RMSE), which are absolute performance measures that allow us to compare the deviation between the actual values and the predictions.

\[ RMSE = \sqrt{\frac{\sum_{i=1}^{n}(x_i-y_i)^2}{n}} \]

\[ MBE= \frac{\sum_{i=1}^{n}(y_i-x_i)}{n} \]

MAPE, which is a relative measure that represents the forecasting error between the actual value and the prediction:

\[ MAPE = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i-x_i}{y_i}\right| \times 100 \]
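The three metrics can be computed with a few lines of NumPy. The sketch below makes two assumptions that the formulas leave open: the MBE is taken as prediction minus actual value, so that a negative value signals underestimation, and the MAPE is normalised by the actual values, which is the usual convention. The function names are illustrative only.

```python
import numpy as np


def rmse(actual, predicted):
    """Root-mean-squared error between actual and predicted loads."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))


def mbe(actual, predicted):
    """Mean bias error; negative when the model underestimates on average
    (assumed sign convention: prediction minus actual)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(predicted - actual))


def mape(actual, predicted):
    """Mean absolute percentage error, relative to the actual values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)
```

For example, with actual loads `[100, 200, 400]` and predictions `[90, 210, 380]`, the errors are -10, 10 and -20, so the MBE is negative (about -6.67) even though one of the three predictions is too high.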

Results:

The table above reports the results of both our baseline ARIMA model and the LSTM-based RNN model. At first glance, the LSTM outperforms the traditional model in all categories, irrespective of the metric used to measure accuracy.

Comment on quantitative results: Firstly, the comparison with the ARIMA model reveals significant improvements. In practice this would imply massive cost reductions for the companies managing the electricity grid in Romania. The huge improvements measured by the MAPE score are the most important given the scope of this project, as they show that in relative terms, our accuracy would allow for improvements in the electricity balancing market, which is one of the most sensitive and costly elements of the modern electricity grid.

Secondly, the results are better than initially expected, as our model seems to outperform similar models that were trained using weather data in addition to the electricity load time series. The absence of weather data seems to matter little, which is surprising given the findings of other papers (including papers that use deep-learning methods). However, this might be explained by the peculiarity of the Romanian market, which is mostly driven by hydro power and nuclear power. Both hydro and nuclear are base-load technologies with extremely low variation capabilities. In this sense, the market generated by these technologies is very different from one in which wind and solar PV would play more significant roles. However, given the expected increase in the role of these renewable sources in Romania in the upcoming decades, it is advisable for deep-learning EDF models to also use weather data as input (if available). Additionally, one can see that the ARIMA model was, in contrast to the LSTM-based RNN, significantly affected by the lack of weather data.

Thirdly, the human evaluators from Romania have revealed one important fact about the results of our model. The negative MBE in the case of the LSTM implies that our approach underestimated the actual electricity consumed by the Romanian market at a given point in time. While the very small value of the MBE is an improvement over the ARIMA model, negative values are dangerous in EDF. This is because, from a grid-operating perspective, it is easier to work with overestimated values (and thus with companies producing more, and electricity prices going up) than with underestimated values (and thus with companies producing less, and either prices spiking or blackouts happening). These results indicate that future deep-learning approaches to EDF that also lead to negative MBEs might need to be adjusted upwards. The mathematical way of doing this is beyond the scope of this project.

Analysis

Figure 1 displays the results of the ARIMA model. While the model predicts the direction in which the electricity load seems to go, it is inaccurate in actually determining the load at a given moment. Given that ARIMA is a traditional econometric model, it doesn’t use the train-test split. As such, one can see the predicted values alongside the actual values of the electricity load across the entire dataset. Nevertheless, the results do not improve if we only look at one period in time, as the main fault of the model is the incapacity of anticipating spikes in demand.

Figure 1. ARIMA


Figure 2 displays the results of the LSTM-based RNN model. This plot shows the test-set results aggregated on a daily basis. The improvement over the ARIMA model is clearly visible. One very important point, however, is that the results tend to improve towards the end of the dataset. In this sense, one can conclude that the best EDF approach with an LSTM-based RNN would be to predict a small number of electricity loads based on a very large pre-existing historical time series. This is ideal, as usually the most important predictions are for the upcoming day, and at most for the next 7 days. Our prediction horizon of a couple of months is beyond the needs of the industry.
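A daily aggregation of hourly test results, as used for a plot of this kind, might look like the sketch below. It assumes that both series share an hourly date-time index and that days are summarised by their mean; the project may have used a different aggregate (for example the daily sum), and the function name `daily_aggregate` is hypothetical.

```python
import numpy as np
import pandas as pd


def daily_aggregate(actual: pd.Series, predicted: pd.Series) -> pd.DataFrame:
    """Average hourly actual and predicted loads over each calendar day."""
    hourly = pd.DataFrame({"actual": actual, "predicted": predicted})
    return hourly.resample("D").mean()
```

The resulting frame has one row per day and can be plotted directly, for instance with `daily_aggregate(actual, predicted).plot()`.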

Figure 2. LSTM-based RNN


Figure 3 offers a sample of the results provided by the LSTM-based RNN. There is only one significant discrepancy that can be observed: the one between 5517 and 6663. This particular divergence is associated with 03-06-2019, a typical weekday of summer in which the electricity load was extremely low. This exception proves that in the absence of weather data, even the best performing deep-learning models cannot predict values that are entirely dependent on weather events.

Figure 3. Sample of results


Conclusion(s)

Our project proves that deep-learning can be effectively and efficiently used for EDF. We developed an LSTM-based RNN model that predicted the electricity consumed hourly for the Romanian market, and obtained results that are significantly better than the ones driven by traditional econometric models like ARIMA. One clear limitation of the proposed LSTM-based RNN model is that we only use the historical time-series as input, and no other weather-related data. In this sense, future research should try to integrate more information on hourly solar irradiation, wind speed level or rainfall.

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/vladsurdea/ML, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".