PP: Multi-Horizon Time Series Forecasting with Temporal Attention Learning

Problem:

multi-horizon probabilistic forecasting tasks;

Propose an end-to-end framework for multi-horizon time series forecasting, with temporal attention mechanisms to capture latent patterns.

Introduction:

forecasting is key to understanding demand.

traditional methods: ARIMA, Holt-Winters.

recently: LSTM

multi-step forecasting can be naturally formulated as sequence-to-sequence learning.

???? what is sequence-to-sequence learning

??? What is multi-horizon forecasting: forecasting on multiple steps in future time.
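To make the definition concrete, here is a minimal sketch of how a series can be sliced into (history, future) pairs for multi-horizon training. The function name `make_windows` and its parameters `history_len` and `horizon` are hypothetical, chosen for illustration; they do not come from the paper.

```python
def make_windows(series, history_len, horizon):
    """Slice a 1-D series into (history, future) pairs.

    Each pair holds `history_len` past values and the next `horizon`
    values, i.e. several future steps are predicted at once.
    """
    pairs = []
    last_start = len(series) - history_len - horizon
    for start in range(last_start + 1):
        split = start + history_len
        history = series[start:split]
        future = series[split:split + horizon]
        pairs.append((history, future))
    return pairs


series = [10, 12, 13, 15, 14, 16, 18]
pairs = make_windows(series, history_len=3, horizon=2)
for history, future in pairs:
    print(history, "->", future)
```

With `horizon=1` this reduces to ordinary one-step-ahead forecasting; multi-horizon just means `horizon > 1`.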

forecasting the overall distribution!!

quantile regression to make predictions of different quantiles to approximate the target distribution without making distributional assumptions;

mean regression / least squares method;

cite 29,31 produce quantile estimations with quantile loss functions.

RELATED WORK:

1. pre-assume underlying distribution

DeepAR makes probabilistic forecasts by assuming an underlying distribution for time series data, and could produce the probability density functions for target variables by estimating the distribution parameters on each point with multi-layer perceptrons.

2. quantile regression: don't pre-assume an underlying distribution, but generate quantile estimates for the target variables.

Attention mechanism, cite 3.

APPROACH:

Use an LSTM-based encoder-decoder model;

The decoder is another recurrent network which takes the encoded history as its initial state, and the future information as inputs, to generate the future sequence as outputs. The decoder is a bi-directional LSTM (BiLSTM). The hidden states of the BiLSTM are then fed into a fully-connected layer or a temporal convolution layer.

How to prevent error accumulation: prediction results from previous time steps are not fed back in to predict the current time step.
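A toy illustration of why feeding predictions back in accumulates error. This is not the paper's model: it assumes a hypothetical process y_t = 0.9 * y_{t-1} and a slightly wrong one-step model with coefficient 0.95, then rolls the model forward on its own outputs.

```python
def recursive_forecast(last_value, coefficient, horizon):
    """Roll a one-step model forward, feeding each prediction back in."""
    forecasts = []
    current = last_value
    for _ in range(horizon):
        current = coefficient * current  # uses the previous *prediction*
        forecasts.append(current)
    return forecasts


TRUE_COEF = 0.9    # assumed data-generating coefficient (illustrative)
MODEL_COEF = 0.95  # assumed, slightly mis-estimated model coefficient

truth = recursive_forecast(100.0, TRUE_COEF, horizon=3)
predicted = recursive_forecast(100.0, MODEL_COEF, horizon=3)
errors = [abs(p - t) for p, t in zip(predicted, truth)]
print(errors)  # the gap widens at every step of the rollout
```

A small one-step error compounds because each step starts from an already wrong value. A decoder that only consumes the encoded history and known future inputs has no such feedback path.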

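??? Hard to capture long-term dependency due to memory update. Why is it hard to retain long-term memory? An LSTM already carries long-term memory by design, even though the memory cell is continually updated.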

?? How long should the attention span be? Attending to a long history would lead to inaccurate attention as well as inefficient computation.
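A minimal sketch of attention restricted to the last h hidden states. The dot-product scoring used here is an assumption made for illustration (these notes do not record the paper's scoring function), and `temporal_attention`, `query` and `width` are hypothetical names.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(query, hidden_states, width):
    """Attend over only the last `width` hidden states.

    Scoring is a plain dot product (an assumption, see lead-in).
    Returns the attention weights and the weighted-sum context vector.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    window = hidden_states[-width:]
    scores = [sum(q * k for q, k in zip(query, state)) for state in window]
    weights = softmax(scores)
    context = [
        sum(w * state[d] for w, state in zip(weights, window))
        for d in range(len(query))
    ]
    return weights, context


hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = temporal_attention([1.0, 0.0], hidden_states, width=2)
print(weights, context)
```

The cost of each attention step grows linearly with `width`, and a wider window spreads the softmax mass over more candidates, which is one way to read the trade-off raised in the question above.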

EXPERIMENTS

test on two datasets: public - GEFCom2014 electricity price forecasting dataset; JD50K sales dataset

multivariate time series: the JD50K dataset includes product region, category index, promotion type, and holiday event.

evaluate our algorithms with mean absolute deviation, which is defined as the sum of standard quantile losses.

$$L(y_i^p, y_i) = \max\left[\, q\,(y_i^p - y_i),\; (q - 1)\,(y_i^p - y_i) \,\right]$$
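A small sketch of the quantile loss exactly as written in these notes, with the difference taken as prediction minus target. Many references define the pinball loss on target minus prediction instead, which swaps which side is weighted by q, so check the convention before comparing numbers. The function names are hypothetical.

```python
def quantile_loss(y_pred, y_true, q):
    """Quantile loss for one point, following the formula in the notes:
    max[q * (y_pred - y_true), (q - 1) * (y_pred - y_true)]."""
    diff = y_pred - y_true
    return max(q * diff, (q - 1) * diff)


def total_quantile_loss(preds_by_quantile, y_true):
    """Sum the loss over several quantile predictions for one target.

    `preds_by_quantile` maps a quantile level q to its prediction.
    """
    return sum(
        quantile_loss(pred, y_true, q) for q, pred in preds_by_quantile.items()
    )


over = quantile_loss(12.0, 10.0, q=0.9)   # prediction above the target
under = quantile_loss(8.0, 10.0, q=0.9)   # prediction below the target
total = total_quantile_loss({0.1: 8.0, 0.5: 10.0, 0.9: 12.0}, y_true=10.0)
print(over, under, total)
```

The loss is zero only when the prediction equals the target, and the two branches of the `max` weight positive and negative differences asymmetrically.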

Training and test split: the time series is split along the time axis; the earlier period is used for training and the later period for testing.
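A minimal sketch of such a chronological split. The helper name `chronological_split` and the `test_size` parameter are hypothetical; the key point is that there is no shuffling, so the test period lies strictly after the training period.

```python
def chronological_split(series, test_size):
    """Split a time-ordered sequence into an earlier training part
    and a later test part, without shuffling."""
    if not 0 < test_size < len(series):
        raise ValueError("test_size must be between 1 and len(series) - 1")
    cut = len(series) - test_size
    return series[:cut], series[cut:]


observations = list(range(10))  # stand-in for time-ordered data
train, test = chronological_split(observations, test_size=3)
print(train, test)
```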

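Results: compared with other methods on quantile loss, the improvement is 0.2-0.8, but the maximum scale of the loss is not given, so it is unclear how significant 0.2-0.8 really is. Evaluated with MSE loss, the results are good: considerably smaller. For point prediction one can compare directly with the true values, but the accuracy of a quantile estimate is harder to measure, or at least I do not currently know the corresponding evaluation method. The authors tested two values of the temporal attention width, h = 1 and h = 3; the choice of this value needs more justification.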

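me: Compared with the modeling extreme events paper, both add an attention mechanism. The difference is that the extreme events paper uses fixed windows to generate fixed-length attention over extreme events, kept separate from the hidden state, with the input being whether an extreme event occurred across the whole sequence, whereas the attention in this paper is over the h hidden states of past data. By comparison, the network design in this paper is more sophisticated. As for architectural novelty, if the BiLSTM encoder-decoder already existed, then the only contribution of this paper is the temporal attention mechanism. Another thought: different types of time series have different degrees of autocorrelation, so could the temporal attention width h be chosen according to their autocorrelation? The more autocorrelated a series is, the more it is influenced by earlier values, and so the more it needs temporal attention over earlier steps.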
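A rough sketch of the autocorrelation idea above. This heuristic is speculative and is not from the paper: it suggests h as the longest run of consecutive lags whose absolute sample autocorrelation stays at or above a threshold. The function names and the default threshold of 0.5 are assumptions.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a given lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        return 0.0  # constant series: treat as uncorrelated
    covariance = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance / variance


def suggest_attention_width(series, max_lag, threshold=0.5):
    """Heuristic (assumption): largest lag reachable while every
    consecutive lag keeps |autocorrelation| >= threshold; at least 1."""
    width = 1
    for lag in range(1, max_lag + 1):
        if abs(autocorrelation(series, lag)) >= threshold:
            width = lag
        else:
            break
    return width


trend = list(range(20))  # strongly autocorrelated toy series
print(suggest_attention_width(trend, max_lag=10))
```

Whether a width chosen this way actually improves forecasts would still need to be checked empirically against the fixed values h = 1 and h = 3 tested in the paper.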

Supplementary knowledge:

?? what is temporal attention mechanism and multi-horizon time series.

Original post: https://www.cnblogs.com/dulun/p/12241901.html

Time: 2024-10-29 18:18:12
