Chapter Four, Time As a Variable: Time-Series Analysis

The main components of every time series: trend, seasonality, noise, and other.

The following paragraph explains them in detail.

The trend may be linear or nonlinear, and we may want to investigate its magnitude. The
seasonality pattern may be either additive or multiplicative. In the first case, the seasonal
change has the same absolute size no matter what the magnitude of the current baseline of
the series is; in the latter case, the seasonal change has the same relative size compared
with the current magnitude of the series. Noise (i.e., some form of random variation) is
almost always part of a time series. Finding ways to reduce the noise in the data is usually
a significant part of the analysis process. Finally, “other” includes anything else that we
may observe in a time series, such as particular significant changes in overall behavior,
special outliers, missing data—anything remarkable at all.
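
To make the additive/multiplicative distinction concrete, here is a minimal sketch. The baseline (100 + 5t), the sine-shaped seasonal pattern, and the amplitudes are made-up illustration values, not data from the text.

```python
import math


def seasonal_series(n, period, kind, amplitude):
    """Linear baseline plus a sine-shaped seasonal pattern (illustrative numbers only)."""
    series = []
    for t in range(n):
        baseline = 100.0 + 5.0 * t                     # assumed linear trend
        season = math.sin(2.0 * math.pi * t / period)  # swings between -1 and 1
        if kind == "additive":
            # same absolute swing regardless of the baseline
            series.append(baseline + amplitude * season)
        elif kind == "multiplicative":
            # swing is a fixed fraction of the current baseline
            series.append(baseline * (1.0 + amplitude * season))
        else:
            raise ValueError("kind must be 'additive' or 'multiplicative'")
    return series


add_series = seasonal_series(20, 4, "additive", 10.0)
mul_series = seasonal_series(20, 4, "multiplicative", 0.1)

# Seasonal peaks fall at t = 1 and t = 17; compare the swing above the baseline.
for t in (1, 17):
    baseline = 100.0 + 5.0 * t
    print("t=%d additive swing=%.2f multiplicative swing=%.2f"
          % (t, add_series[t] - baseline, mul_series[t] - baseline))
```

The additive swing stays at 10 at both peaks, while the multiplicative swing grows with the baseline (10% of 105, then 10% of 185).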

Next come the three tasks of time-series analysis: description, prediction, and control.

Smoothing

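Running (windowed) averages, weighted running averages, and Gaussian-weighted running averages.

All of the above share several drawbacks: (1) the results are hard to evaluate and cannot be reproduced; (2) because of the window, the smoothed curve cannot get close to the true values; (3) nothing can be computed for points outside the data range, so these methods cannot forecast.

The method that overcomes these drawbacks: exponential smoothing, or the Holt–Winters method.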
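
The three window smoothers, and the reason they cannot reach the ends of the data, can be sketched as follows. The function names, the sample data, and the choice of a centered window are my own assumptions for illustration.

```python
import math


def moving_average(values, k):
    """Unweighted running average over a centered window of 2k+1 points."""
    width = 2 * k + 1
    return [sum(values[i - k:i + k + 1]) / float(width)
            for i in range(k, len(values) - k)]


def weighted_moving_average(values, weights):
    """Running average with explicit weights (odd length, summing to 1)."""
    k = len(weights) // 2
    return [sum(w * v for w, v in zip(weights, values[i - k:i + k + 1]))
            for i in range(k, len(values) - k)]


def gaussian_weights(k, sigma):
    """Gaussian-shaped weights for a window of 2k+1 points, normalized to sum to 1."""
    raw = [math.exp(-0.5 * (j / float(sigma)) ** 2) for j in range(-k, k + 1)]
    total = sum(raw)
    return [w / total for w in raw]


data = [3.0, 5.0, 4.0, 6.0, 8.0, 7.0, 9.0]
plain = moving_average(data, 1)
gauss = weighted_moving_average(data, gaussian_weights(2, 1.0))

# The window cannot be centered on the first or last k points, so each
# smoothed series is shorter than the data and says nothing beyond its end.
print("points: %d, plain: %d, gaussian: %d" % (len(data), len(plain), len(gauss)))
```

Each smoother loses k points at both ends, which is drawbacks (2) and (3) in miniature.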
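
Before the full Holt–Winters listings, a minimal sketch of plain (single) exponential smoothing may help. It has no trend or seasonal term. Starting the recursion at the first data point is an assumed start-up choice, and the data are made up.

```python
def exponential_smoothing(values, alpha):
    """Simple exponential smoothing: s[i] = alpha * x[i] + (1 - alpha) * s[i-1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    smoothed = [values[0]]  # assumed start-up choice: s[0] = x[0]
    for x in values[1:]:
        smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


data = [10.0, 20.0, 30.0, 20.0]
smoothed = exponential_smoothing(data, 0.5)

# Without a trend term, the forecast for every future step is the last smoothed value.
forecast = smoothed[-1]
print("smoothed: %s, forecast: %s" % (smoothed, forecast))
```

Unlike a window average, the smoothed series has one value per data point, including the last one, and that final value doubles as a forecast.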

https://gist.github.com/andrequeiroz/5888967

# Holt-Winters algorithms to forecasting
# Coded in Python 2 by: Andre Queiroz
# Description: This module contains three exponential smoothing algorithms. They are Holt's linear trend method and Holt-Winters seasonal methods (additive and multiplicative).
# References:
#  Hyndman, R. J.; Athanasopoulos, G. (2013) Forecasting: principles and practice. http://otexts.com/fpp/. Accessed on 07/03/2013.
#  Byrd, R. H.; Lu, P.; Nocedal, J. A Limited Memory Algorithm for Bound Constrained Optimization, (1995), SIAM Journal on Scientific and Statistical Computing, 16, 5, pp. 1190-1208.

from sys import exit
from math import sqrt
from numpy import array
from scipy.optimize import fmin_l_bfgs_b

def RMSE(params, *args):

    Y = args[0]
    type = args[1]
    rmse = 0

    if type == 'linear':

        alpha, beta = params
        a = [Y[0]]
        b = [Y[1] - Y[0]]
        y = [a[0] + b[0]]

        for i in range(len(Y)):

            a.append(alpha * Y[i] + (1 - alpha) * (a[i] + b[i]))
            b.append(beta * (a[i + 1] - a[i]) + (1 - beta) * b[i])
            y.append(a[i + 1] + b[i + 1])

    else:

        alpha, beta, gamma = params
        m = args[2]
        a = [sum(Y[0:m]) / float(m)]
        b = [(sum(Y[m:2 * m]) - sum(Y[0:m])) / float(m ** 2)]  # float() avoids integer division on Python 2

        if type == 'additive':

            s = [Y[i] - a[0] for i in range(m)]
            y = [a[0] + b[0] + s[0]]

            for i in range(len(Y)):

                a.append(alpha * (Y[i] - s[i]) + (1 - alpha) * (a[i] + b[i]))
                b.append(beta * (a[i + 1] - a[i]) + (1 - beta) * b[i])
                s.append(gamma * (Y[i] - a[i] - b[i]) + (1 - gamma) * s[i])
                y.append(a[i + 1] + b[i + 1] + s[i + 1])

        elif type == 'multiplicative':

            s = [Y[i] / a[0] for i in range(m)]
            y = [(a[0] + b[0]) * s[0]]

            for i in range(len(Y)):

                a.append(alpha * (Y[i] / s[i]) + (1 - alpha) * (a[i] + b[i]))
                b.append(beta * (a[i + 1] - a[i]) + (1 - beta) * b[i])
                s.append(gamma * (Y[i] / (a[i] + b[i])) + (1 - gamma) * s[i])
                y.append((a[i + 1] + b[i + 1]) * s[i + 1])  # multiplicative forecast: (level + trend) * seasonal

        else:

            exit('Type must be either linear, additive or multiplicative')

    rmse = sqrt(sum([(m - n) ** 2 for m, n in zip(Y, y[:-1])]) / len(Y))

    return rmse

def linear(x, fc, alpha = None, beta = None):

    Y = x[:]

    if (alpha is None or beta is None):

        initial_values = array([0.3, 0.1])
        boundaries = [(0, 1), (0, 1)]
        type = 'linear'

        parameters = fmin_l_bfgs_b(RMSE, x0 = initial_values, args = (Y, type), bounds = boundaries, approx_grad = True)
        alpha, beta = parameters[0]

    a = [Y[0]]
    b = [Y[1] - Y[0]]
    y = [a[0] + b[0]]
    rmse = 0

    for i in range(len(Y) + fc):

        if i == len(Y):
            Y.append(a[-1] + b[-1])

        a.append(alpha * Y[i] + (1 - alpha) * (a[i] + b[i]))
        b.append(beta * (a[i + 1] - a[i]) + (1 - beta) * b[i])
        y.append(a[i + 1] + b[i + 1])

    rmse = sqrt(sum([(m - n) ** 2 for m, n in zip(Y[:-fc], y[:-fc - 1])]) / len(Y[:-fc]))

    return Y[-fc:], alpha, beta, rmse

def additive(x, m, fc, alpha = None, beta = None, gamma = None):

    Y = x[:]

    if (alpha is None or beta is None or gamma is None):

        initial_values = array([0.3, 0.1, 0.1])
        boundaries = [(0, 1), (0, 1), (0, 1)]
        type = 'additive'

        parameters = fmin_l_bfgs_b(RMSE, x0 = initial_values, args = (Y, type, m), bounds = boundaries, approx_grad = True)
        alpha, beta, gamma = parameters[0]

    a = [sum(Y[0:m]) / float(m)]
    b = [(sum(Y[m:2 * m]) - sum(Y[0:m])) / float(m ** 2)]  # float() avoids integer division on Python 2
    s = [Y[i] - a[0] for i in range(m)]
    y = [a[0] + b[0] + s[0]]
    rmse = 0

    for i in range(len(Y) + fc):

        if i == len(Y):
            Y.append(a[-1] + b[-1] + s[-m])

        a.append(alpha * (Y[i] - s[i]) + (1 - alpha) * (a[i] + b[i]))
        b.append(beta * (a[i + 1] - a[i]) + (1 - beta) * b[i])
        s.append(gamma * (Y[i] - a[i] - b[i]) + (1 - gamma) * s[i])
        y.append(a[i + 1] + b[i + 1] + s[i + 1])

    rmse = sqrt(sum([(m - n) ** 2 for m, n in zip(Y[:-fc], y[:-fc - 1])]) / len(Y[:-fc]))

    return Y[-fc:], alpha, beta, gamma, rmse

def multiplicative(x, m, fc, alpha = None, beta = None, gamma = None):

    Y = x[:]

    if (alpha is None or beta is None or gamma is None):

        initial_values = array([0.0, 1.0, 0.0])
        boundaries = [(0, 1), (0, 1), (0, 1)]
        type = 'multiplicative'

        parameters = fmin_l_bfgs_b(RMSE, x0 = initial_values, args = (Y, type, m), bounds = boundaries, approx_grad = True)
        alpha, beta, gamma = parameters[0]

    a = [sum(Y[0:m]) / float(m)]
    b = [(sum(Y[m:2 * m]) - sum(Y[0:m])) / float(m ** 2)]  # float() avoids integer division on Python 2
    s = [Y[i] / a[0] for i in range(m)]
    y = [(a[0] + b[0]) * s[0]]
    rmse = 0

    for i in range(len(Y) + fc):

        if i == len(Y):
            Y.append((a[-1] + b[-1]) * s[-m])

        a.append(alpha * (Y[i] / s[i]) + (1 - alpha) * (a[i] + b[i]))
        b.append(beta * (a[i + 1] - a[i]) + (1 - beta) * b[i])
        s.append(gamma * (Y[i] / (a[i] + b[i])) + (1 - gamma) * s[i])
        y.append((a[i + 1] + b[i + 1]) * s[i + 1])

    rmse = sqrt(sum([(m - n) ** 2 for m, n in zip(Y[:-fc], y[:-fc - 1])]) / len(Y[:-fc]))

    return Y[-fc:], alpha, beta, gamma, rmse
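
As a reading aid, the additive variant above appears to implement the recursions below. The notation is my own: level $\ell_t$ corresponds to the list `a`, trend $b_t$ to `b`, seasonal term $s_t$ to `s`, and $m$ is the season length. The start-up values (first-season mean for the level, and so on) are left out.

```latex
\begin{aligned}
\ell_t &= \alpha\,(y_t - s_{t-m}) + (1-\alpha)\,(\ell_{t-1} + b_{t-1}) \\
b_t    &= \beta\,(\ell_t - \ell_{t-1}) + (1-\beta)\,b_{t-1} \\
s_t    &= \gamma\,(y_t - \ell_{t-1} - b_{t-1}) + (1-\gamma)\,s_{t-m} \\
\hat{y}_{t+1} &= \ell_t + b_t + s_{t+1-m}
\end{aligned}
```

The multiplicative variant has the same shape, with the seasonal term divided out rather than subtracted, and the forecast formed as $(\ell_t + b_t)\,s_{t+1-m}$.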

http://adorio-research.org/wordpress/?p=1230

def holtwinters(y, alpha, beta, gamma, c, debug=True):
    """
    y - time series data.
    alpha, beta, gamma - exponential smoothing coefficients
                         for level, trend, seasonal components.
    c - extrapolated future data points.
          4 quarterly
          7 weekly
          12 monthly

    The length of y must be an integer multiple (> 2) of c.
    """
    # Compute initial b and intercept using the first two complete c periods.
    ylen = len(y)
    if ylen % c != 0:
        return None
    fc = float(c)
    ybar2 = sum([y[i] for i in range(c, 2 * c)]) / fc
    ybar1 = sum([y[i] for i in range(c)]) / fc
    b0 = (ybar2 - ybar1) / fc
    if debug: print("b0 = %s" % b0)

    # Compute the level estimate a0 using b0 above.
    tbar = sum(i for i in range(1, c + 1)) / fc
    print("tbar = %s" % tbar)
    a0 = ybar1 - b0 * tbar
    if debug: print("a0 = %s" % a0)

    # Compute the initial seasonal indices.
    I = [y[i] / (a0 + (i + 1) * b0) for i in range(0, ylen)]
    if debug: print("Initial indices = %s" % I)

    S = [0] * (ylen + c)
    for i in range(c):
        S[i] = (I[i] + I[i + c]) / 2.0

    # Normalize so S[i] for i in [0, c) will add to c.
    tS = c / sum([S[i] for i in range(c)])
    for i in range(c):
        S[i] *= tS
        if debug: print("S[%d] = %s" % (i, S[i]))

    # Holt-Winters proper ...
    if debug: print("Use Holt Winters formulae")
    F = [0] * (ylen + c)

    At = a0
    Bt = b0
    for i in range(ylen):
        Atm1 = At
        Btm1 = Bt
        At = alpha * y[i] / S[i] + (1.0 - alpha) * (Atm1 + Btm1)
        Bt = beta * (At - Atm1) + (1 - beta) * Btm1
        S[i + c] = gamma * y[i] / At + (1.0 - gamma) * S[i]
        F[i] = (a0 + b0 * (i + 1)) * S[i]
        print("i= %s y= %s S= %s Atm1= %s Btm1= %s At= %s Bt= %s S[i+c]= %s F= %s"
              % (i + 1, y[i], S[i], Atm1, Btm1, At, Bt, S[i + c], F[i]))
        print("%s %s %s" % (i, y[i], F[i]))
    # Forecast for next c periods:
    for m in range(c):
        print("forecast: %s" % ((At + Bt * (m + 1)) * S[ylen + m]))

# the time-series data.
y =[146, 96, 59, 133, 192, 127, 79, 186, 272, 155, 98, 219]

holtwinters(y, 0.2, 0.1, 0.05, 4)
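
The start-up estimates that `holtwinters` derives from this data set can be checked by hand. The sketch below repeats only the initialization arithmetic for c = 4, using the same formulas as the function; the variable names mirror the listing.

```python
y = [146, 96, 59, 133, 192, 127, 79, 186, 272, 155, 98, 219]
c = 4  # quarterly data, as in the call above

ybar1 = sum(y[:c]) / float(c)           # mean of the first season
ybar2 = sum(y[c:2 * c]) / float(c)      # mean of the second season
b0 = (ybar2 - ybar1) / c                # average change per period
tbar = sum(range(1, c + 1)) / float(c)  # mean time index of the first season
a0 = ybar1 - b0 * tbar                  # level extrapolated back to t = 0

print("ybar1=%s ybar2=%s b0=%s tbar=%s a0=%s" % (ybar1, ybar2, b0, tbar, a0))
```

The season means are 108.5 and 146.0, so the slope is (146.0 - 108.5) / 4 = 9.375 per quarter and the initial level is 108.5 - 9.375 * 2.5 = 85.0625.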

Time: 2024-08-27 06:10:05
