Flow Forecast (FF) is a state-of-the-art deep learning framework for time series forecasting built in PyTorch. In this ongoing series we will use FF to perform forecasts (and classification) on real-world time series datasets. In this first example we will use FF to forecast prices from a publicly available avocado dataset hosted on Kaggle (Open Database). Forecasting produce prices can prove valuable for consumers and producers alike, as it can help determine the best time to buy or the expected revenue.
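If you want to follow along, FF needs to be installed first. To the best of my knowledge the PyPI package is published as flood-forecast (the import name is flood_forecast), but check the project README if in doubt:
pip install flood-forecast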
Dataset: This dataset contains information on US weekly avocado sales from 2015–2020. Data is separated into different metropolitan areas such as Chicago, Detroit, Milwaukee, etc. The dataset has roughly nine columns in total. All things considered, this is a relatively “simple” dataset to forecast on, as there are not many missing values in the columns.
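As a quick illustration, here is a minimal sketch of how the single-city slice used in the configurations below might be carved out of the raw Kaggle CSV. The file name and the geography/type column names are assumptions based on the public 2015–2020 dataset and may need adjusting to your download:
import pandas as pd

# Load the raw Kaggle avocado data (file name assumed).
df = pd.read_csv("avocado-updated-2020.csv")

# Keep one metro area and one avocado type so each date maps to a single row.
chicago_df = df[(df["geography"] == "Chicago") & (df["type"] == "conventional")]
chicago_df = chicago_df.sort_values("date")

# Save the slice that the configuration files below point at.
chicago_df.to_csv("chicago_df.csv", index=False)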
We will now try to forecast the average price of avocados based on total_volume and several of the other columns, such as 4046 (the number of avocados of that PLU code sold).
It is worth noting that, in a real long-range forecasting problem where we forecast with more than one forward pass through the model (i.e. we concatenate the model’s prediction for the target onto the other features and feed it back into the model, as in the sketch below), we would likely need to treat columns like total_volume and 4046 as targets as well, since we would not have access to their real-world ground-truth values several time steps ahead of time. To keep this tutorial simple, however, we will assume that we do (these values could also come from separate estimates or other models).
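To make that concrete, here is a minimal, framework-agnostic sketch of the rolling (autoregressive) decoding idea. It is not FF’s simple_decode implementation, just an illustration of re-feeding predictions; the model output shape and tensor layout are assumptions:
import torch

def rolling_forecast(model, history, future_known_feats, steps):
    """Roll a one-step model forward `steps` times.

    history: (1, seq_len, n_features) tensor with the target in column 0.
    future_known_feats: (steps, n_features - 1) tensor of the other columns,
        which this tutorial assumes we know ahead of time.
    """
    window = history.clone()
    preds = []
    for t in range(steps):
        # Assumed model output shape: (batch, seq_len, out_feats); take the last step's target.
        next_target = model(window)[:, -1, 0]
        new_row = torch.cat([next_target.view(1, 1), future_known_feats[t].view(1, -1)], dim=1)
        # Slide the window: drop the oldest step, append the new (predicted + known) row.
        window = torch.cat([window[:, 1:], new_row.unsqueeze(0)], dim=1)
        preds.append(next_target.item())
    return preds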
Approach 1: The first approach we will try is DA-RNN, an older but still effective deep learning model for time series forecasting. To do this we first design a configuration file that includes our model’s parameters:
the_config = {
    "model_name": "DARNN",
    "model_type": "PyTorch",
    "model_params": {
        "n_time_series": 6,
        "hidden_size_encoder": 128,
        "decoder_hidden_size": 128,
        "out_feats": 1,
        "forecast_history": 5,
        "gru_lstm": False
    },
    "dataset_params": {
        "class": "default",
        "training_path": "chicago_df.csv",
        "validation_path": "chicago_df.csv",
        "test_path": "chicago_df.csv",
        "forecast_length": 1,
        "batch_size": 4,
        "forecast_history": 4,
        "train_end": int(len(chicago_df)*.7),
        "valid_start": int(len(chicago_df)*.7),
        "valid_end": int(len(chicago_df)*.9),
        "test_start": int(len(chicago_df)*.9),
        "target_col": ["average_price"],
        "sort_column": "date",
        "no_scale": True,
        "relevant_cols": ["average_price", "total_volume", "4046", "4225", "4770"],
        "scaler": "StandardScaler",
        "interpolate": False,
        "feature_param": {
            "datetime_params": {
                "month": "numerical"
            }
        }
    },
    "training_params": {
        "criterion": "DilateLoss",
        "optimizer": "Adam",
        "optim_params": {"lr": 0.001},
        "epochs": 4,
        "batch_size": 4
    },
    "inference_params": {
        "datetime_start": "2020-11-01",
        "hours_to_forecast": 5,
        "test_csv_path": "chicago_df.csv",
        "decoder_params": {
            "decoder_function": "simple_decode",
            "unsqueeze_dim": 1
        }
    },
    "GCS": False,
    "wandb": {
        "name": "avocado_training",
        "tags": ["DA-RNN", "avocado_forecast", "forecasting"],
        "project": "avocado_flow_forecast"
    },
    "forward_params": {},
    "metrics": ["DilateLoss", "MSE", "L1"]
}
In this case we will use the DilateLoss function, a loss proposed in 2019 that penalizes errors in both the values and the shape of the time series. It is a great function for training, but unfortunately it does not work with every model. We also add month as a feature via the feature_param block of the configuration file.
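FF generates that month column for us from the date column when it sees the datetime_params entry above. A rough pandas equivalent of the transform, shown for intuition only and not FF’s internal code, would be:
import pandas as pd

chicago_df = pd.read_csv("chicago_df.csv")
# "numerical" here means the month is added as a plain integer feature (1-12).
chicago_df["month"] = pd.to_datetime(chicago_df["date"]).dt.month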
Now we will train the model for several epochs using the train_function:
import os

from flood_forecast.trainer import train_function
from kaggle_secrets import UserSecretsClient

# Pull the Weights & Biases API key from Kaggle secrets and expose it to wandb.
user_secrets = UserSecretsClient()
secret_value_0 = user_secrets.get_secret("WANDB_KEY")
os.environ["WANDB_API_KEY"] = secret_value_0

# Train the DA-RNN with the configuration defined above.
trained_model = train_function("PyTorch", the_config)
Now let’s analyze some of the results on Weights and Biases:
We can see that the model seemed to converge fairly well. Based on the validation_loss, we probably could have squeezed out a bit more performance by training for another epoch or two.
Additionally, the prediction is not bad, particularly given that we did not extensively tune the parameters. On the flip side, the model does not appear to really use total_volume when predicting the price (at least according to SHAP).
You can see the full code in this tutorial here and the W&B log here.
Approach 2: We will use a probabilistic version of a GRU to predict the price of the avocados in different regions around the US.
The advantage of a probabilistic model is that it predicts upper and lower bounds for the forecasted value. Again we define a configuration file:
the_config = {
    "model_name": "VanillaGRU",
    "model_type": "PyTorch",
    "model_params": {
        "n_time_series": 6,
        "hidden_dim": 32,
        "probabilistic": True,
        "num_layers": 1,
        "forecast_length": 2,
        "n_target": 2,
        "dropout": 0.15
    },
    "probabilistic": True,
    "dataset_params": {
        "class": "default",
        "training_path": "chicago_df.csv",
        "validation_path": "chicago_df.csv",
        "test_path": "chicago_df.csv",
        "forecast_length": 2,
        "forecast_history": 5,
        "train_end": int(len(chicago_df)*.7),
        "valid_start": int(len(chicago_df)*.7),
        "valid_end": int(len(chicago_df)*.9),
        "test_start": int(len(chicago_df)*.9),
        "target_col": ["average_price"],
        "sort_column": "date",
        "no_scale": True,
        "relevant_cols": ["average_price", "total_volume", "4046", "4225", "4770"],
        "scaler": "StandardScaler",
        "interpolate": False,
        "feature_param": {
            "datetime_params": {
                "month": "numerical"
            }
        }
    },
    "training_params": {
        "criterion": "NegativeLogLikelihood",
        "optimizer": "Adam",
        "optim_params": {"lr": 0.001},
        "epochs": 5,
        "batch_size": 4
    },
    "inference_params": {
        "probabilistic": True,
        "datetime_start": "2020-11-01",
        "hours_to_forecast": 5,
        "test_csv_path": "chicago_df.csv",
        "decoder_params": {
            "decoder_function": "simple_decode",
            "unsqueeze_dim": 1,
            "probabilistic": True
        }
    },
    "GCS": False,
    "wandb": {
        "name": "avocado_training",
        "tags": ["GRU_PROB", "avocado_forecast", "forecasting"],
        "project": "avocado_flow_forecast"
    },
    "forward_params": {},
    "metrics": ["NegativeLogLikelihood"]
}
Here we will use NegativeLogLikelihood loss for our loss function. This is a special loss function for probabilistic models. Now like with the previous model we can examine the results on Weights and Biases.
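For intuition, a probabilistic head typically emits a mean and a standard deviation for each forecast step and is trained by maximizing the likelihood of the observed value. A minimal Gaussian negative log-likelihood sketch in PyTorch, shown as an illustration of the idea rather than FF’s exact implementation, looks like this:
import torch
from torch.distributions import Normal

def gaussian_nll(mean, std, target):
    """Average negative log-likelihood of `target` under N(mean, std)."""
    dist = Normal(mean, std)
    return -dist.log_prob(target).mean()

# Toy usage: a forecast of 1.50 +/- 0.10 evaluated against an observed price of 1.62.
loss = gaussian_nll(torch.tensor([1.50]), torch.tensor([0.10]), torch.tensor([1.62]))
print(loss)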
You can see the full code in this tutorial notebook.
Approach 3: We can now try using a single neural network to predict several geographic areas at once. To do this we will use a simple transformer model. As with the last two models, we define a configuration file (a sketch of how the merged multi-city data frame referenced below might be prepared follows the config):
the_config = {
    "model_name": "CustomTransformerDecoder",
    "model_type": "PyTorch",
    "model_params": {
        "n_time_series": 11,
        "seq_length": 5,
        "dropout": 0.1,
        "output_seq_length": 2,
        "n_layers_encoder": 2,
        "output_dim": 2,
        "final_act": "Swish"
    },
    "n_targets": 2,
    "dataset_params": {
        "class": "default",
        "training_path": "multi_city.csv",
        "validation_path": "multi_city.csv",
        "test_path": "multi_city.csv",
        "sort_column": "date",
        "batch_size": 10,
        "forecast_history": 5,
        "forecast_length": 2,
        "train_end": int(len(merged_df)*.7),
        "valid_start": int(len(merged_df)*.7),
        "valid_end": int(len(merged_df)*.9),
        "test_start": int(len(merged_df)*.9),
        "test_end": int(len(merged_df)),
        "target_col": ["average_price_ch", "average_price_dt"],
        "relevant_cols": ["average_price_ch", "average_price_dt", "total_volume_ch", "4046_ch", "4225_ch", "4770_ch", "total_volume_dt", "4046_dt", "4225_dt", "4770_dt"],
        "scaler": "MinMaxScaler",
        "no_scale": True,
        "scaler_params": {
            "feature_range": [0, 2]
        },
        "interpolate": False,
        "feature_param": {
            "datetime_params": {
                "month": "numerical"
            }
        }
    },
    "training_params": {
        "criterion": "MSE",
        "optimizer": "Adam",
        "optim_params": {
            "lr": 0.001
        },
        "epochs": 5,
        "batch_size": 5
    },
    "GCS": False,
    "wandb": {
        "name": "avocado_training",
        "tags": ["multi_trans", "avocado_forecast", "forecasting"],
        "project": "avocado_flow_forecast"
    },
    "forward_params": {},
    "metrics": ["MSE"],
    "inference_params": {
        "datetime_start": "2020-11-08",
        "num_prediction_samples": 20,
        "hours_to_forecast": 5,
        "test_csv_path": "multi_city.csv",
        "decoder_params": {
            "decoder_function": "simple_decode",
            "unsqueeze_dim": 1
        }
    }
}
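The configuration above references a merged_df (saved as multi_city.csv) that combines Chicago and Detroit into one frame with _ch and _dt column suffixes. Here is a minimal sketch of how such a frame might be built; the raw file name and geography/type columns are assumptions, mirroring the earlier Chicago slice:
import pandas as pd

df = pd.read_csv("avocado-updated-2020.csv")  # raw Kaggle file, name assumed
cols = ["date", "average_price", "total_volume", "4046", "4225", "4770"]

chicago = df[(df["geography"] == "Chicago") & (df["type"] == "conventional")][cols]
detroit = df[(df["geography"] == "Detroit") & (df["type"] == "conventional")][cols]

# Join the two cities on date, suffixing each city's columns with _ch / _dt.
merged_df = chicago.merge(detroit, on="date", suffixes=("_ch", "_dt")).sort_values("date")
merged_df.to_csv("multi_city.csv", index=False)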
For this model we will go back to using MSE for our loss function. We can now analyze the results from W&B.
The Chicago forecast looks a bit off; however, with some additional hyper-parameter tuning (especially more dropout) the model could likely perform well.
Conclusion
Here we saw the results of three different models for forecasting the price of avocados over a five-week period. FF makes it easy to train many different types of forecasting models and to see which performs best. In part two of this series we will go over forecasting grocery sales.
Deep Time Series Forecasting with Flow Forecast Part 1: Avocado Prices. Republished from https://towardsdatascience.com/deep-time-series-forecasting-with-flow-forecast-part-1-avocado-prices-276a59eb454f