Why is the one-norm an agreeable alternative to the zero-norm?

[Please credit the source when reposting] http://www.cnblogs.com/mashiqi

Today I try to give a brief look at why we usually choose the one-norm as an approximation of the zero-norm, which is a sparsity indicator. This post is not theoretically rigorous; I just want to give an intuitive explanation. It may be extended to be more comprehensive in the future.

I first learned about the zero-norm through the emergence of the so-called Compressive Sensing (CS) theory. While CS brings us a bunch of encouraging tools for problems such as image denoising, we also know that it is hard to operate directly on the zero-norm (in fact, zero-norm minimization is NP-hard). Therefore many scholars regard the one-norm as an agreeable alternative to the zero-norm! But why the one-norm, and why not the two-norm or some other norm?

There is a picture (with some small modifications for my own use) from [Davenport et al. 2011] that gives an illustrative explanation of what I want to express.

We see that the intersection $\hat{x}$ for $p=1/2$ coincides with $\hat{x}$ for $p=1$: both are the intersection of the solid line and the y-axis. But the corresponding intersections for $p=2$ and $p=\infty$ are not like this: they lie off both axes. Further, each of the first two intersections has only one non-zero coordinate, and both have $0 < p \leq 1$. Here is my intuitive explanation of the main question of this post: the shape of the $l_p$ unit circle near certain critical points, namely its intersections with the axes, contributes a lot to the sparsity of the solution of an algorithm carried out in this $l_p$ space. These intersections look like sharp vertices when $0 < p \leq 1$, while they are dull when $p > 1$. I'll show this with a simple mathematical example.
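Before the example, the picture described above can be reproduced numerically. The following is only a minimal sketch: the constraint line $x + 2y = 2$, the helper name `min_lp_point_on_line`, and the grid search are my own assumptions for illustration and are not taken from [Davenport et al. 2011]. It looks for the point on the line with the smallest $l_p$ "norm" for several values of $p$.

```python
import numpy as np

def min_lp_point_on_line(a, b, p, half_width=3.0, num=6001):
    """Grid-search the point on the line a[0]*x + a[1]*y = b (a[1] != 0)
    that has the smallest l_p "norm". Hypothetical helper, illustration only."""
    a1, a2 = a
    x = np.linspace(-half_width, half_width, num)  # candidate x values
    y = (b - a1 * x) / a2                          # matching y on the line
    if np.isinf(p):
        cost = np.maximum(np.abs(x), np.abs(y))    # l_infinity norm
    else:
        # |x|^p + |y|^p has the same minimizer as the l_p norm itself
        cost = np.abs(x) ** p + np.abs(y) ** p
    k = np.argmin(cost)
    return x[k], y[k]

# Assumed example line: x + 2y = 2
for p in (0.5, 1.0, 2.0, np.inf):
    x_hat, y_hat = min_lp_point_on_line((1.0, 2.0), 2.0, p)
    print(f"p = {p}: x_hat = ({x_hat:.3f}, {y_hat:.3f})")
```

For this particular line, $p = 1/2$ and $p = 1$ both select roughly $(0, 1)$, which lies on the y-axis and has a single non-zero coordinate, while $p = 2$ selects roughly $(0.4, 0.8)$ and $p = \infty$ roughly $(2/3, 2/3)$, up to the grid resolution.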

Let's consider the $l_p$ unit circle in two-dimensional space: $$\|(x,y)\|_p = (|x|^p + |y|^p)^{1/p} = 1 \iff |x|^p + |y|^p = 1,~(p > 0)$$ For simplicity, I only plot the unit circle in the first quadrant ($y = (1 - x^p)^{1/p},~(0 \leq x \leq 1,~y \geq 0)$):
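Since the plot itself is not reproduced here, the first-quadrant arc can be sampled with a few lines of NumPy. This is a minimal sketch, not the code used for the original figures; the function name `lp_unit_arc` and the sample count are assumptions.

```python
import numpy as np

def lp_unit_arc(p, num=201):
    """Sample the first-quadrant arc y = (1 - x^p)^(1/p) of the l_p unit
    circle for p > 0. Hypothetical helper for illustration."""
    x = np.linspace(0.0, 1.0, num)
    y = (1.0 - x ** p) ** (1.0 / p)
    return x, y

# Height of each arc at x = 0.5: it is lower for smaller p,
# so the curve is pulled towards the axes as p decreases.
for p in (0.2, 0.8, 1.0, 1.2, 1.8):
    x, y = lp_unit_arc(p)
    print(f"p = {p}: y(0.5) = {y[len(x) // 2]:.4f}")
```

The returned arrays can be passed to any plotting library to draw the arcs for the different values of $p$.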

It is essential to look at the details around $x=0$: the tangent of the unit circle at that point is the key to my intuitive explanation. Now let's look at the curve and its tangent at $x=0$ to see what happens there.

In these figures, the blue lines are unit circles and the red lines are the tangent lines at the point $(0,1)$. We see that the tangent line is vertical when $p = 0.2$ and $p = 0.8$, and horizontal when $p = 1.2$ and $p = 1.8$. $p = 1$ is the cut-off point. In fact, some simple mathematics shows that the tangent is vertical when $0 < p < 1$ and horizontal when $p > 1$, and only when $p = 1$ is the tangent at an angle of 45 degrees. Therefore when $0 < p < 1$ there is a sharp vertex at $(0,1)$ and, by symmetry, at $(1,0)$. At $p = 1$ the two mirrored arcs still meet in a corner there, while for $p > 1$ the curve is smooth.
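The simple mathematics can be sketched as follows, assuming $p > 0$ and $0 < x < 1$ on the first-quadrant arc $y = (1 - x^p)^{1/p}$. Differentiating with the chain rule gives

$$\frac{dy}{dx} = \frac{1}{p}(1 - x^p)^{1/p - 1} \cdot (-p\,x^{p-1}) = -x^{p-1}(1 - x^p)^{1/p - 1}$$

As $x \to 0^+$ the factor $(1 - x^p)^{1/p - 1}$ tends to $1$, so the limit is decided by $x^{p-1}$ alone:

$$\lim_{x \to 0^+} \frac{dy}{dx} = \begin{cases} -\infty, & 0 < p < 1 \\ -1, & p = 1 \\ 0, & p > 1 \end{cases}$$

These three cases correspond to the vertical tangent, the 45-degree tangent, and the horizontal tangent at $(0,1)$.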

Reference:

Davenport, Mark A., et al. "Introduction to compressed sensing." Preprint 93 (2011).

Time: 2024-10-12 14:55:26
