PP: Time series clustering via community detection in Networks

tasks:
1. review the community detection paper
2. formulate your problem and software functions
3.

Assumption: similar (i.e., highly correlated) time series tend to connect to each other and form communities.

Background and related works

Shape-based, feature-based, and structure-based distance measures; time series clustering; community detection in networks.

Methodology

  1. data normalization
  2. time series distance calculation
  3. network construction
  4. community detection
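
A minimal sketch of step 1, assuming z-score normalization of each series (a common choice for this kind of pipeline; the paper's exact normalization may differ). The function name and the array layout (one series per row) are illustrative.

```python
import numpy as np

def z_normalize(X, eps=1e-12):
    """Normalize each row (one time series per row) to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, keepdims=True)
    return (X - mean) / (std + eps)   # eps guards against constant series
```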

Which steps influence the clustering results:

The distance calculation algorithm; the network construction method; the community detection method.

2. distance matrix

Calculate the distance between each pair of time series in the data set and construct a distance matrix D, where d_ij is the distance between series X_i and X_j. A good choice of distance measure has a strong influence on the network construction and the clustering result.
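
A minimal sketch of step 2 using SciPy's pairwise-distance routines. Only a few of the measures listed in the experiments (Euclidean, Manhattan, infinity norm, Pearson correlation) map directly onto pdist metrics; the others (DTW, DISSIM, CID, ...) would need their own implementations.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def distance_matrix(X, metric="euclidean"):
    """Return the symmetric n x n matrix D with D[i, j] = d(X_i, X_j).

    metric can be e.g. "euclidean", "cityblock" (Manhattan),
    "chebyshev" (infinity norm) or "correlation" (1 - Pearson r).
    """
    return squareform(pdist(np.asarray(X, dtype=float), metric=metric))
```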

3. network construction

Two common methods: k-NN and ε-NN. (EXPLORATION)
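
A minimal sketch of the two construction rules, assuming D is the distance matrix from step 2; knn_graph and eps_graph are illustrative names, not from the paper.

```python
import numpy as np

def knn_graph(D, k):
    """Connect each series to its k nearest neighbours (excluding itself)."""
    n = D.shape[0]
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        order = np.argsort(D[i])                 # ascending distance
        neighbours = order[order != i][:k]       # drop self, keep the k closest
        A[i, neighbours] = 1
    return np.maximum(A, A.T)                    # symmetrise the adjacency

def eps_graph(D, eps):
    """Connect two series whenever their distance is at most eps."""
    A = (D <= eps).astype(int)
    np.fill_diagonal(A, 0)                       # no self-loops
    return A
```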

Experiments

45 time series data sets.

Purpose: check the performance of each combination of step 2, step 3, and step 4 on each data set.

Evaluation index: Rand index.
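
A minimal sketch of the (unadjusted) Rand index: the fraction of pairs of series on which the predicted clustering and the ground-truth labels agree (same cluster vs. different clusters). scikit-learn also ships this as sklearn.metrics.rand_score.

```python
from itertools import combinations

def rand_index(labels_true, labels_pred):
    """Fraction of pairs on which the two labelings agree."""
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)
```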

Vary the parameters: the k of k-NN from 1 to n-1; the ε of ε-NN from min(D) to max(D) in 100 steps.
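
A minimal sketch of that sweep, reusing the illustrative helpers sketched elsewhere in these notes (knn_graph, eps_graph, rand_index, and the detect_communities wrapper below); min(D) is taken over off-diagonal entries so the diagonal zeros do not collapse the ε range.

```python
import numpy as np

def sweep(D, labels_true, method="multilevel"):
    """Evaluate one community detection method over all k and eps settings."""
    n = D.shape[0]
    off_diag = D[~np.eye(n, dtype=bool)]
    results = []
    for k in range(1, n):                                         # k = 1 .. n-1
        pred = detect_communities(knn_graph(D, k), method)
        results.append(("knn", k, rand_index(labels_true, pred)))
    for eps in np.linspace(off_diag.min(), off_diag.max(), 100):  # 100 eps steps
        pred = detect_communities(eps_graph(D, eps), method)
        results.append(("eps", eps, rand_index(labels_true, pred)))
    return results
```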

Step 2 (distance measures): Manhattan, Euclidean, infinity norm, DTW, short time series (STS), DISSIM, complexity-invariant distance (CID), wavelet transform, Pearson correlation, integrated periodogram.

Step 3 (network construction): vary the parameters k and ε.

Step 4 (community detection): fast greedy; multilevel; walktrap; infomap; label propagation.
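
A minimal sketch of step 4 using python-igraph's built-in community detection routines, assuming the adjacency matrix comes from step 3; the wrapper name is illustrative.

```python
import igraph as ig

def detect_communities(adjacency, method="multilevel"):
    """Return a community label for each node of the k-NN / eps-NN graph."""
    g = ig.Graph.Adjacency((adjacency > 0).astype(int).tolist(), mode="undirected")
    if method == "fastgreedy":
        clustering = g.community_fastgreedy().as_clustering()
    elif method == "multilevel":
        clustering = g.community_multilevel()
    elif method == "walktrap":
        clustering = g.community_walktrap().as_clustering()
    elif method == "infomap":
        clustering = g.community_infomap()
    elif method == "label_propagation":
        clustering = g.community_label_propagation()
    else:
        raise ValueError(f"unknown method: {method}")
    return clustering.membership
```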

Results

1. The k-NN construction method only allows discrete values of k, while the ε-NN method accepts continuous values of ε.

Supplementary knowledge: 

1. box plot

A box plot shows the maximum, the minimum, the median, and the upper and lower quartiles of a data set.

Below is a concrete example of a box plot:

                            +-----+-+
  *           o     |-------|   + | |---|
                            +-----+-+    

+---+---+---+---+---+---+---+---+---+---+   score
0   1   2   3   4   5   6   7   8   9  10

This data set shows:

  • minimum = 5
  • lower quartile (Q1) = 7
  • median (Q2) = 8.5
  • upper quartile (Q3) = 9
  • maximum = 10
  • mean = 8
  • interquartile range (IQR) = Q3 - Q1 = 2 (i.e., ΔQ)
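
A minimal matplotlib sketch that draws a box plot for a small illustrative sample (not the exact data behind the diagram above).

```python
import matplotlib.pyplot as plt

scores = [5, 7, 7, 8, 8.5, 9, 9, 9.5, 10, 10]    # illustrative sample
plt.boxplot(scores, vert=False, showmeans=True)  # box = Q1..Q3, line = median, marker = mean
plt.xlabel("score")
plt.show()
```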

2. A change of mindset: the experiment section is also important, not optional; read it carefully.

Original article: https://www.cnblogs.com/dulun/p/12170759.html
