PP: Time series clustering via community detection in Networks

tasks:
1. review the community detection paper
2. formulate your problem and software functions
3.

Assumption: similar (i.e., highly correlated) time series tend to connect to each other and form communities.

Background and related works

Shape-based, feature-based, and structure-based distance measures; time series clustering; community detection in networks.

Methodology

  1. data normalization
  2. time series distance calculation
  3. network construction
  4. community detection
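
A minimal sketch of step 1, assuming z-score normalization of each series (a common choice for this kind of pipeline; the paper's exact normalization may differ). The function name and the array layout (one series per row) are illustrative.

```python
import numpy as np

def z_normalize(X, eps=1e-12):
    """Normalize each row (one time series per row) to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, keepdims=True)
    return (X - mean) / (std + eps)   # eps guards against constant series
```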

Which steps influence the clustering results:

The distance calculation algorithm; the network construction method; the community detection method.

2. distance matrix

Calculate the distance between each pair of time series in the data set and construct a distance matrix D, where d_ij is the distance between series X_i and X_j. A good choice of distance measure has a strong influence on the network construction and the clustering result.
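
A minimal sketch of step 2 using SciPy's pairwise-distance routines. Only a few of the measures listed in the experiments (Euclidean, Manhattan, infinity norm, Pearson correlation) map directly onto pdist metrics; the others (DTW, DISSIM, CID, ...) would need their own implementations.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def distance_matrix(X, metric="euclidean"):
    """Return the symmetric n x n matrix D with D[i, j] = d(X_i, X_j).

    metric can be e.g. "euclidean", "cityblock" (Manhattan),
    "chebyshev" (infinity norm) or "correlation" (1 - Pearson r).
    """
    return squareform(pdist(np.asarray(X, dtype=float), metric=metric))
```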

3. network construction

Two common methods: k-NN and ε-NN. (EXPLORATION)
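
A minimal sketch of the two construction rules, assuming D is the distance matrix from step 2; knn_graph and eps_graph are illustrative names, not from the paper.

```python
import numpy as np

def knn_graph(D, k):
    """Connect each series to its k nearest neighbours (excluding itself)."""
    n = D.shape[0]
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        order = np.argsort(D[i])                 # ascending distance
        neighbours = order[order != i][:k]       # drop self, keep the k closest
        A[i, neighbours] = 1
    return np.maximum(A, A.T)                    # symmetrise the adjacency

def eps_graph(D, eps):
    """Connect two series whenever their distance is at most eps."""
    A = (D <= eps).astype(int)
    np.fill_diagonal(A, 0)                       # no self-loops
    return A
```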

Experiments

45 time series data sets.

Purpose: check the performance of each combination of step 2, step 3, and step 4 on each data set.

Evaluation index: Rand index.
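
A minimal sketch of the (unadjusted) Rand index: the fraction of pairs of series on which the predicted clustering and the ground-truth labels agree (same cluster vs. different clusters). scikit-learn also ships this as sklearn.metrics.rand_score.

```python
from itertools import combinations

def rand_index(labels_true, labels_pred):
    """Fraction of pairs on which the two labelings agree."""
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)
```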

Vary the parameters: the k of k-NN from 1 to n-1; the ε of ε-NN from min(D) to max(D) in 100 steps.
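
A minimal sketch of that sweep, reusing the illustrative helpers sketched elsewhere in these notes (knn_graph, eps_graph, rand_index, and the detect_communities wrapper below); min(D) is taken over off-diagonal entries so the diagonal zeros do not collapse the ε range.

```python
import numpy as np

def sweep(D, labels_true, method="multilevel"):
    """Evaluate one community detection method over all k and eps settings."""
    n = D.shape[0]
    off_diag = D[~np.eye(n, dtype=bool)]
    results = []
    for k in range(1, n):                                         # k = 1 .. n-1
        pred = detect_communities(knn_graph(D, k), method)
        results.append(("knn", k, rand_index(labels_true, pred)))
    for eps in np.linspace(off_diag.min(), off_diag.max(), 100):  # 100 eps steps
        pred = detect_communities(eps_graph(D, eps), method)
        results.append(("eps", eps, rand_index(labels_true, pred)))
    return results
```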

Step 2 (distance measures): Manhattan, Euclidean, infinity norm, DTW, short time series (STS), DISSIM, complexity-invariant distance (CID), wavelet transform, Pearson correlation, integrated periodogram.

Step 3 (network construction): vary the parameters k and ε.

Step 4 (community detection): fast greedy; multilevel; walktrap; infomap; label propagation.
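
A minimal sketch of step 4 using python-igraph's built-in community detection routines, assuming the adjacency matrix comes from step 3; the wrapper name is illustrative.

```python
import igraph as ig

def detect_communities(adjacency, method="multilevel"):
    """Return a community label for each node of the k-NN / eps-NN graph."""
    g = ig.Graph.Adjacency((adjacency > 0).astype(int).tolist(), mode="undirected")
    if method == "fastgreedy":
        clustering = g.community_fastgreedy().as_clustering()
    elif method == "multilevel":
        clustering = g.community_multilevel()
    elif method == "walktrap":
        clustering = g.community_walktrap().as_clustering()
    elif method == "infomap":
        clustering = g.community_infomap()
    elif method == "label_propagation":
        clustering = g.community_label_propagation()
    else:
        raise ValueError(f"unknown method: {method}")
    return clustering.membership
```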

Results

1. The k-NN construction method only allows discrete values of k, while the ε-NN method accepts continuous values of ε.

Supplementary knowledge: 

1. box plot

A box plot shows the maximum, the minimum, the median, and the upper and lower quartiles of a data set.

Below is a concrete example of a box plot:

                            +-----+-+
  *           o     |-------|   + | |---|
                            +-----+-+    

+---+---+---+---+---+---+---+---+---+---+   score
0   1   2   3   4   5   6   7   8   9  10

This data set shows:

  • minimum = 5
  • lower quartile (Q1) = 7
  • median (Q2) = 8.5
  • upper quartile (Q3) = 9
  • maximum = 10
  • mean = 8
  • interquartile range (IQR) = Q3 - Q1 = 2 (i.e., ΔQ)
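
A minimal matplotlib sketch that draws a box plot for a small illustrative sample (not the exact data behind the diagram above).

```python
import matplotlib.pyplot as plt

scores = [5, 7, 7, 8, 8.5, 9, 9, 9.5, 10, 10]    # illustrative sample
plt.boxplot(scores, vert=False, showmeans=True)  # box = Q1..Q3, line = median, marker = mean
plt.xlabel("score")
plt.show()
```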

2. A change of mindset: the experiment section is also important, not optional; read it carefully.

Original article: https://www.cnblogs.com/dulun/p/12170759.html
