Thanks to theidioms.com

Time-Series Forecasting with TensorFlow 2.0

Time-Series Forecasting with TensorFlow 2.0

Building Multi-Step Forecasting Models

Both the single-output and multiple-output models in the previous sections made single time step predictions, i.e., an hour into the future. In this lesson, we will be going over how to build different multiple-step time-series forecasting models using TensorFlow 2.0.

In a multi-step prediction, the model needs to learn to predict a range of future values. Thus, unlike a single step model, where only a single future point is predicted, a multi-step model predicts a sequence of the values into the future.

Multi-step forecasting can be done in following two approaches,

  1. Direct method where the entire sequence of future values is predicted at once.
  2. Recursive method where the model only makes single step predictions such that the prediction made is again fed back into the model as input recursively.

This chapter will cover both the approaches for multi-step forecasting.

For the multi-step model, the training data will again consist of hourly samples. However, here, the models will learn to predict 24h of the future, given 24h of the past.

Here is a Window object that generates these slices from the dataset:

OUT_STEPS = 24
multi_window = WindowGenerator(input_width=24,
                               label_width=OUT_STEPS,
                               shift=OUT_STEPS)

# Plotting the window
multi_window.plot()
multi_window
Total window size: 48 
Input indices: [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23] 
Label indices: [24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47] 
Label column name(s): None
Building Multi-Step Forecasting Models

In the above diagram, the blue dots represent the input features whereas the green dots represent the labels to be predicted.

Creating a baseline

A simple baseline for this task is to repeat the last input time step for the required number of output timesteps:

Building Multi-Step Forecasting Models
class MultiStepLastBaseline(tf.keras.Model):
  def call(self, inputs):
    return tf.tile(inputs[:, -1:, :], [1, OUT_STEPS, 1])

last_baseline = MultiStepLastBaseline()
last_baseline.compile(loss=tf.losses.MeanSquaredError(),
                      metrics=[tf.metrics.MeanAbsoluteError()])

multi_val_performance = {}
multi_performance = {}

multi_val_performance['Last'] = last_baseline.evaluate(multi_window.val)
multi_performance['Last'] = last_baseline.evaluate(multi_window.val, verbose=0)
multi_window.plot(last_baseline)
437/437 [==============================] - 1s 2ms/step - loss: 0.6285 - mean_absolute_error: 0.5007
Building Multi-Step Forecasting Models

From the above-plotted graphs, we can clearly see that our baseline model is returning the last input value as its output for the next 24h.

Another approach to creating a baseline model would be to repeat the exact same pattern of the previous 24 hrs (input data) as shown below,

Building Multi-Step Forecasting Models
class RepeatBaseline(tf.keras.Model):
  def call(self, inputs):
    return inputs

repeat_baseline = RepeatBaseline()
repeat_baseline.compile(loss=tf.losses.MeanSquaredError(),
                        metrics=[tf.metrics.MeanAbsoluteError()])

multi_val_performance['Repeat'] = repeat_baseline.evaluate(multi_window.val)
multi_performance['Repeat'] = repeat_baseline.evaluate(multi_window.test, verbose=0)
multi_window.plot(repeat_baseline)
437/437 [==============================] - 1s 2ms/step - loss: 0.4270 - mean_absolute_error: 0.3959
Building Multi-Step Forecasting Models

From the above figures we can clearly observe that the predictions from our base line model (represented by orange cross) is exactly the same as the input data (represented by blue dots).

Single-shot models for multi-step forecasting

One high-level approach to this problem is to use a “single-shot” model, where the model makes the entire sequence prediction in a single step. Such a forecasting method is also sometimes referred to as a direct method.

This can be implemented efficiently as a layers.Dense with OUT_STEPS*features output units. The model just needs to reshape that output to the required (OUTPUT_STEPS, features).

1. Using a linear model

A simple linear model based on the last input time step does better than either baseline, but is underpowered. The model needs to predict OUTPUT_STEPS time steps, from a single input time step with a linear projection. It can only capture a low-dimensional slice of the behavior, likely based mainly on the time of day and time of year.

Building Multi-Step Forecasting Models
multi_linear_model = tf.keras.Sequential([
    # Take the last time-step.
    # Shape [batch, time, features] => [batch, 1, features]
    tf.keras.layers.Lambda(lambda x: x[:, -1:, :]),
    # Shape => [batch, 1, out_steps*features]
    tf.keras.layers.Dense(OUT_STEPS*num_features,
                          kernel_initializer=tf.initializers.zeros),
    # Shape => [batch, out_steps, features]
    tf.keras.layers.Reshape([OUT_STEPS, num_features])
])

history = compile_and_fit(multi_linear_model, multi_window)

IPython.display.clear_output()
multi_val_performance['Linear'] = multi_linear_model.evaluate(multi_window.val)
multi_performance['Linear'] = multi_linear_model.evaluate(multi_window.test, verbose=0)
multi_window.plot(multi_linear_model)
437/437 [==============================] - 1s 2ms/step - loss: 0.2556 - mean_absolute_error: 0.3057
2. Using a dense model

Adding a layers.Dense between the input and output gives the linear model more power, but is still only based on a single input timestep.

multi_dense_model = tf.keras.Sequential([
    # Take the last time step.
    # Shape [batch, time, features] => [batch, 1, features]
    tf.keras.layers.Lambda(lambda x: x[:, -1:, :]),
    # Shape => [batch, 1, dense_units]
    tf.keras.layers.Dense(512, activation='relu'),
    # Shape => [batch, out_steps*features]
    tf.keras.layers.Dense(OUT_STEPS*num_features,
                          kernel_initializer=tf.initializers.zeros),
    # Shape => [batch, out_steps, features]
    tf.keras.layers.Reshape([OUT_STEPS, num_features])
])

history = compile_and_fit(multi_dense_model, multi_window)

IPython.display.clear_output()
multi_val_performance['Dense'] = multi_dense_model.evaluate(multi_window.val)
multi_performance['Dense'] = multi_dense_model.evaluate(multi_window.test, verbose=0)
multi_window.plot(multi_dense_model)
437/437 [==============================] - 1s 2ms/step - loss: 0.2202 - mean_absolute_error: 0.2803
Building Multi-Step Forecasting Models
3. Using a Convolutional Neural Network (CNN)

A convolutional model makes predictions based on a fixed-width history, which may lead to better performance than the dense model since it can see how things are changing over time:

Building Multi-Step Forecasting Models
CONV_WIDTH = 3
multi_conv_model = tf.keras.Sequential([
    # Shape [batch, time, features] => [batch, CONV_WIDTH, features]
    tf.keras.layers.Lambda(lambda x: x[:, -CONV_WIDTH:, :]),
    # Shape => [batch, 1, conv_units]
    tf.keras.layers.Conv1D(256, activation='relu', kernel_size=(CONV_WIDTH)),
    # Shape => [batch, 1,  out_steps*features]
    tf.keras.layers.Dense(OUT_STEPS*num_features,
                          kernel_initializer=tf.initializers.zeros),
    # Shape => [batch, out_steps, features]
    tf.keras.layers.Reshape([OUT_STEPS, num_features])
])

history = compile_and_fit(multi_conv_model, multi_window)

IPython.display.clear_output()

multi_val_performance['Conv'] = multi_conv_model.evaluate(multi_window.val)
multi_performance['Conv'] = multi_conv_model.evaluate(multi_window.test, verbose=0)
multi_window.plot(multi_conv_model)
437/437 [==============================] - 1s 2ms/step - loss: 0.2142 - mean_absolute_error: 0.2798
Building Multi-Step Forecasting Models
4. Using a Recurrent Neural Network (RNN)

A recurrent model can learn to use a long history of inputs, if it’s relevant to the predictions the model is making. Here the model will accumulate internal state for 24h, before making a single prediction for the next 24h.

In this single-shot format, the LSTM only needs to produce an output at the last time step, so set return_sequences=False.

Building Multi-Step Forecasting Models
multi_lstm_model = tf.keras.Sequential([
    # Shape [batch, time, features] => [batch, lstm_units]
    # Adding more `lstm_units` just overfits more quickly.
    tf.keras.layers.LSTM(32, return_sequences=False),
    # Shape => [batch, out_steps*features]
    tf.keras.layers.Dense(OUT_STEPS*num_features,
                          kernel_initializer=tf.initializers.zeros),
    # Shape => [batch, out_steps, features]
    tf.keras.layers.Reshape([OUT_STEPS, num_features])
])

history = compile_and_fit(multi_lstm_model, multi_window)

IPython.display.clear_output()

multi_val_performance['LSTM'] = multi_lstm_model.evaluate(multi_window.val)
multi_performance['LSTM'] = multi_lstm_model.evaluate(multi_window.train, verbose=0)
multi_window.plot(multi_lstm_model)
437/437 [==============================] - 1s 3ms/step - loss: 0.2133 - mean_absolute_error: 0.2833
Building Multi-Step Forecasting Models

Auto-regressive models for multi-step forecasting

The above models all predict the entire output sequence as a in a single shot.

In some cases, it may be helpful for the model to decompose this prediction into individual time steps. Then each model’s output can be fed back into itself at each step and predictions can be made conditioned on the previous one. Such a method is also sometimes referred to as a recursive method.

One clear advantage to this style of model is that it can be used to produce output with a varying length as the output itself is fed into the model.

You could take any of single single-step multi-output models trained in the first half of this tutorial and run in an autoregressive feedback loop, but here we’ll focus on building a model that’s been explicitly trained to do that.

Building Multi-Step Forecasting Models
Using a Recurrent Neural Network

This course only builds an autoregressive RNN model, but this pattern could be applied to any model that was designed to output a single timestep.

The model will have the same basic form as the single-step LSTM models we studied in the previous chapter: An LSTM followed by a layers.Dense that converts the LSTM outputs to model predictions.

In this case the model has to manually manage the inputs for each step so it uses layers.LSTMCell directly for the lower level, single time step interface.

class FeedBack(tf.keras.Model):
  def __init__(self, units, out_steps):
    super().__init__()
    self.out_steps = out_steps
    self.units = units
    self.lstm_cell = tf.keras.layers.LSTMCell(units)
    # Also wrap the LSTMCell in an RNN to simplify the `warmup` method.
    self.lstm_rnn = tf.keras.layers.RNN(self.lstm_cell, return_state=True)
    self.dense = tf.keras.layers.Dense(num_features)

feedback_model = FeedBack(units=32, out_steps=OUT_STEPS)

The first method this model needs is a warmup method to initialize is its internal state based on the inputs. Once trained this state will capture the relevant parts of the input history. This is equivalent to the single-step LSTM model from earlier:

def warmup(self, inputs):
  # inputs.shape => (batch, time, features)
  # x.shape => (batch, lstm_units)
  x, *state = self.lstm_rnn(inputs)

  # predictions.shape => (batch, features)
  prediction = self.dense(x)
  return prediction, state

FeedBack.warmup = warmup

This method returns a single time-step prediction, and the internal state of the LSTM. We can see the shape of the prediction made by the model as,

prediction, state = feedback_model.warmup(multi_window.example[0])
prediction.shape
TensorShape([32, 19])

With the RNN‘s state, and an initial prediction, we can now continue iterating the model feeding the predictions at each step back as the input. Using this approach iteratively, we can make predictions much further into the future.

The simplest approach to collecting the output predictions is to use a python list, and tf.stack after the loop.

def call(self, inputs, training=None):
  # Use a TensorArray to capture dynamically unrolled outputs.
  predictions = []
  # Initialize the lstm state
  prediction, state = self.warmup(inputs)

  # Insert the first prediction
  predictions.append(prediction)

  # Run the rest of the prediction steps
  for n in range(1, self.out_steps):
    # Use the last prediction as input.
    x = prediction
    # Execute one lstm step.
    x, state = self.lstm_cell(x, states=state,
                              training=training)
    # Convert the lstm output to a prediction.
    prediction = self.dense(x)
    # Add the prediction to the output
    predictions.append(prediction)

  # predictions.shape => (time, batch, features)
  predictions = tf.stack(predictions)
  # predictions.shape => (batch, time, features)
  predictions = tf.transpose(predictions, [1, 0, 2])
  return predictions

FeedBack.call = call

We can now test run this model on the example inputs as,

print('Output shape (batch, time, features): ', feedback_model(multi_window.example[0]).shape)
Output shape (batch, time, features): (32, 24, 19)

Now train the model:

history = compile_and_fit(feedback_model, multi_window)

IPython.display.clear_output()

multi_val_performance['AR LSTM'] = feedback_model.evaluate(multi_window.val)
multi_performance['AR LSTM'] = feedback_model.evaluate(multi_window.test, verbose=0)
multi_window.plot(feedback_model)
437/437 [==============================] - 3s 6ms/step - loss: 0.2250 - mean_absolute_error: 0.2998
Building Multi-Step Forecasting Models

Performance

In this final section of this chapter, we are going to evaluate the performance of every models built in this lesson.

x = np.arange(len(multi_performance))
width = 0.3


metric_name = 'mean_absolute_error'
metric_index = lstm_model.metrics_names.index('mean_absolute_error')
val_mae = [v[metric_index] for v in multi_val_performance.values()]
test_mae = [v[metric_index] for v in multi_performance.values()]

plt.bar(x - 0.17, val_mae, width, label='Validation')
plt.bar(x + 0.17, test_mae, width, label='Test')
plt.xticks(ticks=x, labels=multi_performance.keys(),
           rotation=45)
plt.ylabel(f'MAE (average over all times and outputs)')
_ = plt.legend()
Building Multi-Step Forecasting Models

The metrics for the multi-output models in the first half of this tutorial show the performance averaged across all output features. These performances similar but also averaged across output timesteps.

for name, value in multi_performance.items():
  print(f'{name:8s}: {value[1]:0.4f}')
Last : 0.5007 
Repeat : 0.3774 
Linear : 0.2986 
Dense : 0.2758 
Conv : 0.2749 
LSTM : 0.2727 
AR LSTM : 0.2927

The gains achieved going from a dense model to convolutional and recurrent models are only a few percent (if any), and the autoregressive model performed clearly worse. So these more complex approaches may not be worthwhile on this problem, but there was no way to know without trying, and these models could be helpful for your problem.

Leave your thought here

Your email address will not be published. Required fields are marked *

Close Bitnami banner
Bitnami