MNIST

THE MNIST DATABASE of handwritten digits

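MNIST has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.

It is a good database for people who want to try pattern recognition techniques on real-world data while spending minimal effort on preprocessing and formatting. (Not sure why several databases all make a point of saying this. If there is some deep reason behind it, I haven't figured it out.)

There are 4 files in total: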

train-images-idx3-ubyte.gz: training set images (9912422 bytes)
train-labels-idx1-ubyte.gz: training set labels (28881 bytes)
t10k-images-idx3-ubyte.gz: test set images (1648877 bytes)
t10k-labels-idx1-ubyte.gz: test set labels (4542 bytes)

If the files you downloaded are larger than the sizes shown above, your browser has probably decompressed them without telling you.

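The files are not in any standard image format, so you have to write your own small program to read them. The file format (IDX) is described at the bottom of the original MNIST page.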
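As a rough illustration of such a "small program", here is a minimal sketch of an IDX reader. It assumes the usual IDX layout: two zero bytes, one byte for the data type (0x08 for unsigned bytes), one byte for the number of dimensions, then one big-endian 32-bit size per dimension, followed by the raw data. The function names `parse_idx` and `read_idx` are made up for this example, and NumPy is assumed to be installed.

```python
import gzip
import struct

import numpy as np


def parse_idx(data):
    """Parse the bytes of an (uncompressed) IDX file into a uint8 array."""
    # Header: 2 zero bytes, 1 byte data type, 1 byte number of dimensions.
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    # One big-endian unsigned 32-bit integer per dimension.
    header_size = 4 + 4 * ndim
    dims = struct.unpack(">" + "I" * ndim, data[4:header_size])
    # Everything after the header is the raw pixel or label data.
    return np.frombuffer(data, dtype=np.uint8, offset=header_size).reshape(dims)


def read_idx(path):
    """Read an IDX file from disk, gzip-compressed or not (hypothetical helper)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        return parse_idx(f.read())
```

With the files listed above, `read_idx("train-images-idx3-ubyte.gz")` should give an array of shape (60000, 28, 28) and `read_idx("train-labels-idx1-ubyte.gz")` an array of shape (60000,).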

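The original black and white (bilevel) images come from NIST. They were size-normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels because the normalization algorithm uses anti-aliasing. Each image was then centered in a 28x28 image by computing the center of mass of its pixels and translating the image so that this point sits at the center of the 28x28 field. (I didn't really follow how the center is computed. I just read it as: the digit sits roughly in the middle of the 28x28 image.)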
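To make the centering step more concrete, here is a small sketch of center-of-mass placement. It is only my reading of the description, not the original preprocessing code: the center of mass is taken as the intensity-weighted mean pixel position, the offset is rounded to whole pixels, and the digit is clamped so it stays inside the canvas. The names `center_of_mass` and `center_by_mass` are made up for this example.

```python
import numpy as np


def center_of_mass(img):
    """Return the intensity-weighted mean (row, col) of a non-empty image."""
    img = np.asarray(img, dtype=float)
    rows, cols = np.indices(img.shape)
    total = img.sum()
    return (rows * img).sum() / total, (cols * img).sum() / total


def center_by_mass(digit, size=28):
    """Paste a small digit image (e.g. 20x20) into a size x size canvas
    so that its center of mass lands near the middle of the canvas."""
    digit = np.asarray(digit, dtype=float)
    h, w = digit.shape
    canvas = np.zeros((size, size))
    if digit.sum() == 0:
        return canvas  # nothing to center
    r0, c0 = center_of_mass(digit)
    mid = (size - 1) / 2  # geometric center of the canvas
    # Whole-pixel offset, clamped so the digit is never cut off (assumption).
    top = min(max(int(round(mid - r0)), 0), size - h)
    left = min(max(int(round(mid - c0)), 0), size - w)
    canvas[top:top + h, left:left + w] = digit
    return canvas
```

Because of the rounding and clamping, the result is only centered to within about a pixel, and a very lopsided digit may end up further off.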

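With some classification methods (particularly template-based methods such as SVM and K-nearest neighbors), the error rate improves when the digits are centered by bounding box rather than by center of mass. If you do this kind of preprocessing, you should report it in your papers.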
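For comparison, bounding-box centering could look roughly like the sketch below. It assumes that every non-zero pixel belongs to the digit and that the cropped digit fits inside the canvas. The function name `center_by_bounding_box` is made up for this example.

```python
import numpy as np


def center_by_bounding_box(img, size=28):
    """Re-center a digit so that its bounding box sits in the middle
    of a size x size canvas."""
    img = np.asarray(img)
    canvas = np.zeros((size, size), dtype=img.dtype)
    # Rows and columns that contain at least one non-zero pixel.
    rows = np.flatnonzero(img.any(axis=1))
    cols = np.flatnonzero(img.any(axis=0))
    if rows.size == 0:
        return canvas  # empty image, nothing to center
    # Crop to the tight bounding box of the digit.
    crop = img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
    h, w = crop.shape
    # Put the box in the middle (off by at most half a pixel).
    top = (size - h) // 2
    left = (size - w) // 2
    canvas[top:top + h, left:left + w] = crop
    return canvas
```

The only difference from center-of-mass centering is which point gets aligned with the middle of the image: the middle of the box here, the weighted mean of the ink there.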

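The MNIST database was constructed from NIST's SD3 (Special Database 3) and SD1 (Special Database 1), which contain binary images of handwritten digits. NIST originally designated SD3 as the training set and SD1 as the test set. However, SD3 is much cleaner and easier to recognize than SD1, because SD3 was collected among Census Bureau employees while SD1 was collected among high-school students. Drawing sensible conclusions from learning experiments requires that the result be independent of the choice of training set and test set among the complete set of samples. It was therefore necessary to build a new database by mixing NIST's datasets.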

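The MNIST training set is composed of 30,000 images from SD3 and 30,000 images from SD1. The test set is composed of 5,000 images from SD3 and 5,000 images from SD1. The 60,000-image training set contains examples from approximately 250 writers. We made sure that the sets of writers in the training set and the test set are disjoint.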

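SD1 contains 58,527 digit images written by 500 different writers. In contrast to SD3, where blocks of data from each writer appear in sequence, the data in SD1 is scrambled. Writer identities for SD1 are available, and we used this information to unscramble the writers. We then split SD1 in two: digits written by the first 250 writers went into the new training set, and the remaining 250 writers went into the test set. This gave two sets of nearly 30,000 images each. The new training set was completed with enough examples from SD3, starting at pattern 0, to make a full set of 60,000 training images. Similarly, the new test set was completed with SD3 examples starting at pattern 35,000 to make a full set of 60,000 test images. Only a 10,000-image subset of the test set can be downloaded from this site. The full 60,000-image training set is available.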
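The writer-based split described above can be sketched in a few lines. This is only an illustration of the idea of keeping writers disjoint, not the procedure actually used to build MNIST. In particular, "the first 250 writers" is interpreted here as the 250 smallest writer IDs, which is my assumption, and `split_by_writer` is a made-up name.

```python
def split_by_writer(writer_ids, n_train_writers):
    """Split sample indices so that no writer appears in both sets.

    writer_ids[i] is the ID of the writer who produced sample i.
    Returns (train_indices, test_indices).
    """
    # Assumption: "first" writers means the smallest IDs.
    ordered_writers = sorted(set(writer_ids))
    train_writers = set(ordered_writers[:n_train_writers])
    train_idx = [i for i, w in enumerate(writer_ids) if w in train_writers]
    test_idx = [i for i, w in enumerate(writer_ids) if w not in train_writers]
    return train_idx, test_idx
```

For SD1 this would be called with the 58,527 writer IDs and `n_train_writers=250`. Splitting by writer rather than by sample is what guarantees that the test set measures generalization to unseen handwriting.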

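Many methods have been tested on this dataset. Here are a few examples. Details are given in the linked papers. Some of the experiments used a version of the database in which the input images were deskewed (by computing the principal axis of the shape that is closest to the vertical, and shifting the lines so as to make it vertical). In some other experiments, the training set was augmented with artificially distorted versions of the original training samples. The distortions are random combinations of shifts, scaling, skewing, and compression. (There are parts of this short paragraph that I didn't understand.)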
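Since the deskewing description is quite terse, here is one common way to approximate it with image moments. This is a sketch under my own assumptions, not the code used in the cited papers: the slant is estimated as the covariance between row and column positions divided by the row variance, and each row is then shifted horizontally by a whole number of pixels to cancel it. The names `shift_row` and `deskew` are made up for this example.

```python
import numpy as np


def shift_row(row, k):
    """Shift a 1-D array right by k pixels (left if k < 0), filling with zeros."""
    out = np.zeros_like(row)
    n = len(row)
    if abs(k) >= n:
        return out  # everything is shifted out of the image
    if k > 0:
        out[k:] = row[:n - k]
    elif k < 0:
        out[:n + k] = row[-k:]
    else:
        out[:] = row
    return out


def deskew(img):
    """Straighten a slanted digit by shifting its rows horizontally."""
    img = np.asarray(img, dtype=float)
    total = img.sum()
    if total == 0:
        return img.copy()
    rows, cols = np.indices(img.shape)
    r0 = (rows * img).sum() / total  # center of mass, row
    c0 = (cols * img).sum() / total  # center of mass, column
    cov = ((rows - r0) * (cols - c0) * img).sum()
    var_rows = ((rows - r0) ** 2 * img).sum()
    if var_rows == 0:
        return img.copy()
    slope = cov / var_rows  # horizontal drift in columns per row
    out = np.zeros_like(img)
    for r in range(img.shape[0]):
        # Undo the drift of this row relative to the center of mass.
        out[r] = shift_row(img[r], int(round(-slope * (r - r0))))
    return out
```

A more careful version would interpolate instead of shifting by whole pixels, but this is enough to see what "shifting the lines so as to make it vertical" means.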

CLASSIFIER	PREPROCESSING	TEST ERROR RATE (%)	Reference
Linear Classifiers
linear classifier (1-layer NN)	none	12.0	LeCun et al. 1998
linear classifier (1-layer NN)	deskewing	8.4	LeCun et al. 1998
pairwise linear classifier	deskewing	7.6	LeCun et al. 1998
K-Nearest Neighbors
K-nearest-neighbors, Euclidean (L2)	none	5.0	LeCun et al. 1998
K-nearest-neighbors, Euclidean (L2)	none	3.09	Kenneth Wilder, U. Chicago
K-nearest-neighbors, L3	none	2.83	Kenneth Wilder, U. Chicago
K-nearest-neighbors, Euclidean (L2)	deskewing	2.4	LeCun et al. 1998
K-nearest-neighbors, Euclidean (L2)	deskewing, noise removal, blurring	1.80	Kenneth Wilder, U. Chicago
K-nearest-neighbors, L3	deskewing, noise removal, blurring	1.73	Kenneth Wilder, U. Chicago
K-nearest-neighbors, L3	deskewing, noise removal, blurring, 1 pixel shift	1.33	Kenneth Wilder, U. Chicago
K-nearest-neighbors, L3	deskewing, noise removal, blurring, 2 pixel shift	1.22	Kenneth Wilder, U. Chicago
K-NN with non-linear deformation (IDM)	shiftable edges	0.54	Keysers et al. IEEE PAMI 2007
K-NN with non-linear deformation (P2DHMDM)	shiftable edges	0.52	Keysers et al. IEEE PAMI 2007
K-NN, Tangent Distance	subsampling to 16x16 pixels	1.1	LeCun et al. 1998
K-NN, shape context matching	shape context feature extraction	0.63	Belongie et al. IEEE PAMI 2002
Boosted Stumps
boosted stumps	none	7.7	Kegl et al., ICML 2009
products of boosted stumps (3 terms)	none	1.26	Kegl et al., ICML 2009
boosted trees (17 leaves)	none	1.53	Kegl et al., ICML 2009
stumps on Haar features	Haar features	1.02	Kegl et al., ICML 2009
product of stumps on Haar f.	Haar features	0.87	Kegl et al., ICML 2009
Non-Linear Classifiers
40 PCA + quadratic classifier	none	3.3	LeCun et al. 1998
1000 RBF + linear classifier	none	3.6	LeCun et al. 1998
SVMs
SVM, Gaussian Kernel	none	1.4
SVM deg 4 polynomial	deskewing	1.1	LeCun et al. 1998
Reduced Set SVM deg 5 polynomial	deskewing	1.0	LeCun et al. 1998
Virtual SVM deg-9 poly [distortions]	none	0.8	LeCun et al. 1998
Virtual SVM, deg-9 poly, 1-pixel jittered	none	0.68	DeCoste and Scholkopf, MLJ 2002
Virtual SVM, deg-9 poly, 1-pixel jittered	deskewing	0.68	DeCoste and Scholkopf, MLJ 2002
Virtual SVM, deg-9 poly, 2-pixel jittered	deskewing	0.56	DeCoste and Scholkopf, MLJ 2002
Neural Nets
2-layer NN, 300 hidden units, mean square error	none	4.7	LeCun et al. 1998
2-layer NN, 300 HU, MSE, [distortions]	none	3.6	LeCun et al. 1998
2-layer NN, 300 HU	deskewing	1.6	LeCun et al. 1998
2-layer NN, 1000 hidden units	none	4.5	LeCun et al. 1998
2-layer NN, 1000 HU, [distortions]	none	3.8	LeCun et al. 1998
3-layer NN, 300+100 hidden units	none	3.05	LeCun et al. 1998
3-layer NN, 300+100 HU [distortions]	none	2.5	LeCun et al. 1998
3-layer NN, 500+150 hidden units	none	2.95	LeCun et al. 1998
3-layer NN, 500+150 HU [distortions]	none	2.45	LeCun et al. 1998
3-layer NN, 500+300 HU, softmax, cross entropy, weight decay	none	1.53	Hinton, unpublished, 2005
2-layer NN, 800 HU, Cross-Entropy Loss	none	1.6	Simard et al., ICDAR 2003
2-layer NN, 800 HU, cross-entropy [affine distortions]	none	1.1	Simard et al., ICDAR 2003
2-layer NN, 800 HU, MSE [elastic distortions]	none	0.9	Simard et al., ICDAR 2003
2-layer NN, 800 HU, cross-entropy [elastic distortions]	none	0.7	Simard et al., ICDAR 2003
NN, 784-500-500-2000-30 + nearest neighbor, RBM + NCA training [no distortions]	none	1.0	Salakhutdinov and Hinton, AI-Stats 2007
6-layer NN 784-2500-2000-1500-1000-500-10 (on GPU) [elastic distortions]	none	0.35	Ciresan et al. Neural Computation 10, 2010 and arXiv 1003.0358, 2010
committee of 25 NN 784-800-10 [elastic distortions]	width normalization, deslanting	0.39	Meier et al. ICDAR 2011
deep convex net, unsup pre-training [no distortions]	none	0.83	Deng et al. Interspeech 2010
Convolutional nets
Convolutional net LeNet-1	subsampling to 16x16 pixels	1.7	LeCun et al. 1998
Convolutional net LeNet-4	none	1.1	LeCun et al. 1998
Convolutional net LeNet-4 with K-NN instead of last layer	none	1.1	LeCun et al. 1998
Convolutional net LeNet-4 with local learning instead of last layer	none	1.1	LeCun et al. 1998
Convolutional net LeNet-5, [no distortions]	none	0.95	LeCun et al. 1998
Convolutional net LeNet-5, [huge distortions]	none	0.85	LeCun et al. 1998
Convolutional net LeNet-5, [distortions]	none	0.8	LeCun et al. 1998
Convolutional net Boosted LeNet-4, [distortions]	none	0.7	LeCun et al. 1998
Trainable feature extractor + SVMs [no distortions]	none	0.83	Lauer et al., Pattern Recognition 40-6, 2007
Trainable feature extractor + SVMs [elastic distortions]	none	0.56	Lauer et al., Pattern Recognition 40-6, 2007
Trainable feature extractor + SVMs [affine distortions]	none	0.54	Lauer et al., Pattern Recognition 40-6, 2007
unsupervised sparse features + SVM, [no distortions]	none	0.59	Labusch et al., IEEE TNN 2008
Convolutional net, cross-entropy [affine distortions]	none	0.6	Simard et al., ICDAR 2003
Convolutional net, cross-entropy [elastic distortions]	none	0.4	Simard et al., ICDAR 2003
large conv. net, random features [no distortions]	none	0.89	Ranzato et al., CVPR 2007
large conv. net, unsup features [no distortions]	none	0.62	Ranzato et al., CVPR 2007
large conv. net, unsup pretraining [no distortions]	none	0.60	Ranzato et al., NIPS 2006
large conv. net, unsup pretraining [elastic distortions]	none	0.39	Ranzato et al., NIPS 2006
large conv. net, unsup pretraining [no distortions]	none	0.53	Jarrett et al., ICCV 2009
large/deep conv. net, 1-20-40-60-80-100-120-120-10 [elastic distortions]	none	0.35	Ciresan et al. IJCAI 2011
committee of 7 conv. net, 1-20-P-40-P-150-10 [elastic distortions]	width normalization	0.27 +-0.02	Ciresan et al. ICDAR 2011
committee of 35 conv. net, 1-20-P-40-P-150-10 [elastic distortions]	width normalization	0.23	Ciresan et al. CVPR 2012

References

[LeCun et al., 1998a]: Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. "Gradient-based learning applied to document recognition." Proceedings of the IEEE, 86(11):2278-2324, November 1998.

Time: 2024-10-08 02:11:18
