PP: Deep clustering based on a mixture of autoencoders

Problem: clustering

A clustering network maps each data point into another space and selects one of the clusters; the autoencoder associated with that cluster is then used to reconstruct the data point.
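A minimal PyTorch sketch of this idea (the layer sizes, the softmax clustering network, and the assignment-weighted reconstruction loss are my own illustrative assumptions, not necessarily the paper's exact architecture):

```python
# Sketch of a mixture-of-autoencoders clustering model (PyTorch).
# Sizes and loss weighting are illustrative, not the paper's exact setup.
import torch
import torch.nn as nn

class MixtureOfAutoencoders(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=64, n_clusters=10):
        super().__init__()
        # Clustering network: maps a data point to soft cluster assignments.
        self.cluster_net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, n_clusters), nn.Softmax(dim=1),
        )
        # One small autoencoder per cluster.
        self.autoencoders = nn.ModuleList([
            nn.Sequential(
                nn.Linear(input_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, input_dim),
            )
            for _ in range(n_clusters)
        ])

    def forward(self, x):
        p = self.cluster_net(x)                                            # (batch, K)
        recons = torch.stack([ae(x) for ae in self.autoencoders], dim=1)   # (batch, K, D)
        errors = ((recons - x.unsqueeze(1)) ** 2).mean(dim=2)              # (batch, K)
        loss = (p * errors).sum(dim=1).mean()   # assignment-weighted reconstruction
        return p, loss

model = MixtureOfAutoencoders()
x = torch.randn(32, 784)                 # placeholder batch
assignments, loss = model(x)
hard_labels = assignments.argmax(dim=1)  # the autoencoder most responsible for each point
```

Training this loss pushes each autoencoder to specialize on the points its cluster is responsible for; the final cluster label is read off from the argmax of the clustering network.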

Introduction:

traditional method: data -> extract a feature vector from each object -> aggregate groups of vectors in a feature space.

each cluster is represented by an autoencoder network. ??how

common method: k-means; but for high-dimensional datasets it is less useful, because inter-point distances become less informative in high-dimensional spaces.
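A quick numerical illustration of that claim, on synthetic Gaussian data (numpy only, nothing from the paper):

```python
# In high dimensions, the nearest and farthest neighbours of a point become
# nearly equidistant, so Euclidean distance loses discriminative power for k-means.
import numpy as np

rng = np.random.default_rng(0)
for d in (2, 10, 100, 1000):
    X = rng.standard_normal((500, d))
    dists = np.linalg.norm(X[1:] - X[0], axis=1)   # distances from the first point
    contrast = (dists.max() - dists.min()) / dists.min()
    print(f"dim={d:4d}  relative contrast={contrast:.2f}")
# The relative contrast shrinks steadily as the dimension grows.
```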

For finding patterns in a sequence, is the time dimension what makes it a high-dimensional case, with each pattern treated as a cluster, while some subsequences cannot be assigned to any cluster?

representation learning has been used to map the input data into a low-dimensional feature space.

Attempts: apply unsupervised deep learning approaches for clustering.  ??how

However, most of these attempts focus on clustering over a low-dimensional feature space.

Transform the data into more clustering-friendly representations:

A deep version of k-means is based on learning a data representation and applying k-means in the embedded space.
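A two-stage sketch of that pipeline, assuming a simple fully connected autoencoder and scikit-learn's KMeans (the sizes, data, and training schedule are placeholders; deep-clustering methods typically go further and refine the embedding and the clusters jointly):

```python
# Learn a low-dimensional representation with an autoencoder, then run k-means
# in the embedded space. Everything here is an illustrative placeholder.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
decoder = nn.Sequential(nn.Linear(10, 128), nn.ReLU(), nn.Linear(128, 784))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

X = torch.randn(1000, 784)                # placeholder data
for _ in range(50):                       # reconstruction pre-training
    opt.zero_grad()
    loss = ((decoder(encoder(X)) - X) ** 2).mean()
    loss.backward()
    opt.step()

with torch.no_grad():
    Z = encoder(X).numpy()                # clustering-friendly representation
labels = KMeans(n_clusters=10, n_init=10).fit_predict(Z)
```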

How to represent a cluster:

a centroid vector vs. an autoencoder network.
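To make the contrast concrete, a small sketch of the two assignment rules; the helper functions and the pre-trained `centroids` / `autoencoders` are hypothetical:

```python
# Assigning a point x to a cluster under the two representations:
#  - centroid vector: nearest mean (k-means style)
#  - autoencoder per cluster: lowest reconstruction error
import torch

def assign_by_centroid(x, centroids):
    # centroids: (K, D) tensor of cluster means; x: (D,) point
    return int(torch.argmin(((centroids - x) ** 2).sum(dim=1)))

def assign_by_autoencoder(x, autoencoders):
    # autoencoders: list of K modules mapping (D,) -> (D,)
    errors = [((ae(x) - x) ** 2).mean() for ae in autoencoders]
    return int(torch.argmin(torch.stack(errors)))
```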

Data collapsing problem: for every dataset, the procedure has to be re-tuned all over again.

for multivariate time series, how to find patterns:

1. find patterns: SAX; TICC; sliding windows; derivatives (see the sliding-window sketch after this list).

2. VG, statistical features.

3.
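A sliding-window sketch for item 1 above: it turns a multivariate series into candidate pattern segments that could then be fed to any of the clustering models sketched earlier (window length and stride are arbitrary illustrative choices):

```python
# Segment a multivariate time series with a sliding window; each flattened
# window becomes one candidate "pattern" vector for clustering.
import numpy as np

def sliding_windows(series, window=50, stride=10):
    # series: (T, n_variables) multivariate time series
    T = series.shape[0]
    starts = range(0, T - window + 1, stride)
    return np.stack([series[s:s + window].reshape(-1) for s in starts])

ts = np.random.randn(1000, 3)       # placeholder 3-variable series
segments = sliding_windows(ts)      # shape: (n_segments, window * 3)
```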

Supplementary knowledge: 

1. Pattern recognition and clustering

Pattern recognition is a mature field in computer science with well-established techniques for the assignment of unknown patterns to categories, or classes. A pattern is defined as a vector of some number of measurements, called features. Usually, a pattern recognition system uses training samples from known categories to form a decision rule for unknown patterns. The unknown pattern is assigned to one of the categories according to the decision rule. Since we are interested in the classes of documents that have been assigned by the user, we can use pattern recognition techniques to try to classify previously unseen documents into the user's categories. While pattern recognition techniques require that the number and labels of categories are known, clustering techniques are unsupervised, requiring no external knowledge of categories. Clustering methods simply try to group similar patterns into clusters whose members are more similar to each other (according to some distance measure) than to members of other clusters. There is no a priori knowledge of patterns that belong to certain groups, or even how many groups are appropriate. Refer to basic pattern recognition and clustering texts such as [567] for further information.

We first employ pattern recognition techniques on documents to attempt to find features for classification, then focus on clustering the raw features of the documents.

