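This post documents how to build face gender recognition with Caffe by fine-tuning the IMDB-WIKI model. The IMDB-WIKI dataset and models can be downloaded from https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/.
Preparing the training environment
(1) Prepare the OS: Ubuntu 16.04
(2) Install the NVIDIA GPU driver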
https://www.nvidia.com/Download/index.aspx?lang=en-us
(3) Install CUDA
https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
To check the CUDA version:
cat /usr/local/cuda/version.txt
(4) Install cuDNN (optional)
https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html
To check the cuDNN version:
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
(5) Install Docker (optional)
https://docs.docker.com/install/linux/docker-ce/ubuntu/#set-up-the-repository
(6) Install NVIDIA Docker (optional)
https://github.com/NVIDIA/nvidia-docker
(7) Prepare a Docker image (optional)
One way to enter a running container:
nvidia-docker exec -it $ContainerID /bin/bash
Use nvidia-docker ps to look up the ContainerID.
Preparing the model and training datasets
(1) Download the IMDB-WIKI model
https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/static/gender.caffemodel
https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/static/gender_train.prototxt
If you have downloaded the IMDB-WIKI dataset, you can read its metadata file as follows:
import scipy.io as sio
mat_contents = sio.loadmat('wiki.mat')
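(2) Download the CelebA dataset
CelebA is short for CelebFaces Attributes, a dataset of celebrity face attributes. It contains 202,599 face images of 10,177 celebrity identities. Every image is annotated with a face bounding box, 5 facial landmark coordinates, and 40 attribute labels. CelebA is published by The Chinese University of Hong Kong and is widely used for face-related computer vision training tasks such as face attribute recognition, face detection, and landmark localization. Download img_align_celeba.zip from http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html.
(3) Generate the training and test list files for CelebA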
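After deleting the header lines of list_attr_celeba (the official file has two: the image count and the attribute names), extract the gender attribute. In the official file, Male is the 21st attribute, i.e. field 22 when the file name is counted as field 1. Its values are 1 and -1; Caffe class labels must be non-negative, so -1 is mapped to 0 here:
cat list_attr_celeba | awk '{print $1, ($22 == 1 ? 1 : 0)}' > gender.txt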
Count the number of images:
cat list_attr_celeba | wc -l
Shuffle the lines of gender.txt (NUM_IMAGES is a shell variable holding the image count):
cat gender.txt | awk -F"\3" -v n="$NUM_IMAGES" 'BEGIN{srand();}{value=int(rand()*n); print value"\3"$0 }' | sort | awk -F"\3" '{print $2}' > shuffled
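Generate the training set (the first 90% of the shuffled lines) and the test set (the rest), with NUM_IMAGES set to the image count:
head -n $((NUM_IMAGES * 9 / 10)) shuffled > train.txt
tail -n +$((NUM_IMAGES * 9 / 10 + 1)) shuffled > test.txt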
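A vi command that may be useful for rewriting image paths: :1,$ s/old/new/g
(4) To recognize the gender of Asian faces more accurately, you can also supplement the data with labeled Asian face images, collected for example by crawling.
Training the model
(1) Prepare solver.prototxt
For an explanation of the solver file, see: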
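The extraction, shuffle, and 90/10 split above can also be sketched in Python. This is a minimal sketch, not the author's script: it assumes the official list_attr_celeba.txt layout (a count line, an attribute-name line, then one row per image), looks up the Male column by name rather than by position, and assumes that label 1 = male and 0 = female matches the pretrained model's convention, which should be verified. The function names and the fixed seed are illustrative.

```python
import random


def read_gender_labels(attr_path):
    """Return [(file_name, label)] with label 1 = male, 0 = female (assumed convention)."""
    with open(attr_path) as f:
        rows = [line.split() for line in f if line.strip()]
    # Locate the attribute-name row; any line before it (the image count) is skipped.
    header_idx = next(i for i, row in enumerate(rows) if "Male" in row)
    # Data rows start with the file name, so the attribute index shifts by one.
    male_col = rows[header_idx].index("Male") + 1
    return [(row[0], 1 if row[male_col] == "1" else 0)
            for row in rows[header_idx + 1:]]


def shuffle_and_split(samples, train_fraction=0.9, seed=0):
    """Shuffle a copy of samples and split it into (train, test)."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


def write_list(path, samples):
    """Write 'file_name label' lines, the format read by Caffe's ImageData layer."""
    with open(path, "w") as f:
        for name, label in samples:
            f.write(f"{name} {label}\n")
```

A typical call sequence would be read_gender_labels("list_attr_celeba.txt"), then shuffle_and_split on the result, then write_list for "train.txt" and "test.txt". The file names in the lists still need the image directory prefix expected by the data layer.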
https://github.com/BVLC/caffe/wiki/Solver-Prototxt
(More comprehensive) https://zhuanlan.zhihu.com/p/48462756
net: "gender.prototxt"
test_iter: 100
test_interval: 500
test_compute_loss: true
base_lr: 0.00001
momentum: 0.95
type: "SGD"
weight_decay: 0.0005
lr_policy: "step"
gamma: 0.9
stepsize: 200
display: 100
max_iter: 20000
snapshot: 2000
snapshot_prefix: "gender"
solver_mode: GPU
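To see what these solver values imply, the sketch below evaluates Caffe's "step" learning rate policy, base_lr * gamma ^ floor(iter / stepsize), using the base_lr, gamma, and stepsize from the solver above. The function name step_lr is illustrative and not part of Caffe.

```python
def step_lr(iteration, base_lr=1e-5, gamma=0.9, stepsize=200):
    """Learning rate under the "step" policy: base_lr * gamma ** floor(iteration / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


# Print the learning rate at a few points of the 20000-iteration run.
for it in (0, 200, 2000, 20000):
    print(f"iter {it:>5}: lr = {step_lr(it):.3e}")
```

With stepsize 200 and max_iter 20000 the rate is multiplied by 0.9 a total of 100 times, so it ends several orders of magnitude below base_lr. If training stalls early, a larger stepsize or gamma may be worth trying.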
(2) Modify gender.prototxt
name: "VGG_ILSVRC_16_layers"
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  transform_param { mirror: true crop_size: 224 mean_file: "imagenet_mean.binaryproto" }
  image_data_param { source: "train.txt" batch_size: 32 new_height: 256 new_width: 256 }
  include { phase: TRAIN }
}
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  image_data_param { source: "test.txt" batch_size: 10 new_height: 256 new_width: 256 }
  transform_param { mirror: false crop_size: 224 mean_file: "imagenet_mean.binaryproto" }
  include { phase: TEST }
}
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param { num_output: 64 pad: 1 kernel_size: 3 }
}
layer { name: "relu1_1" type: "ReLU" bottom: "conv1_1" top: "conv1_1" }
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param { num_output: 64 pad: 1 kernel_size: 3 }
}
layer { name: "relu1_2" type: "ReLU" bottom: "conv1_2" top: "conv1_2" }
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1_2"
  top: "pool1"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "conv2_1"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2_1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param { num_output: 128 pad: 1 kernel_size: 3 }
}
layer { name: "relu2_1" type: "ReLU" bottom: "conv2_1" top: "conv2_1" }
layer {
  name: "conv2_2"
  type: "Convolution"
  bottom: "conv2_1"
  top: "conv2_2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param { num_output: 128 pad: 1 kernel_size: 3 }
}
layer { name: "relu2_2" type: "ReLU" bottom: "conv2_2" top: "conv2_2" }
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2_2"
  top: "pool2"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "conv3_1"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3_1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param { num_output: 256 pad: 1 kernel_size: 3 }
}
layer { name: "relu3_1" type: "ReLU" bottom: "conv3_1" top: "conv3_1" }
layer {
  name: "conv3_2"
  type: "Convolution"
  bottom: "conv3_1"
  top: "conv3_2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param { num_output: 256 pad: 1 kernel_size: 3 }
}
layer { name: "relu3_2" type: "ReLU" bottom: "conv3_2" top: "conv3_2" }
layer {
  name: "conv3_3"
  type: "Convolution"
  bottom: "conv3_2"
  top: "conv3_3"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param { num_output: 256 pad: 1 kernel_size: 3 }
}
layer { name: "relu3_3" type: "ReLU" bottom: "conv3_3" top: "conv3_3" }
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3_3"
  top: "pool3"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "conv4_1"
  type: "Convolution"
  bottom: "pool3"
  top: "conv4_1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param { num_output: 512 pad: 1 kernel_size: 3 }
}
layer { name: "relu4_1" type: "ReLU" bottom: "conv4_1" top: "conv4_1" }
layer {
  name: "conv4_2"
  type: "Convolution"
  bottom: "conv4_1"
  top: "conv4_2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param { num_output: 512 pad: 1 kernel_size: 3 }
}
layer { name: "relu4_2" type: "ReLU" bottom: "conv4_2" top: "conv4_2" }
layer {
  name: "conv4_3"
  type: "Convolution"
  bottom: "conv4_2"
  top: "conv4_3"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param { num_output: 512 pad: 1 kernel_size: 3 }
}
layer { name: "relu4_3" type: "ReLU" bottom: "conv4_3" top: "conv4_3" }
layer {
  name: "pool4"
  type: "Pooling"
  bottom: "conv4_3"
  top: "pool4"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "conv5_1"
  type: "Convolution"
  bottom: "pool4"
  top: "conv5_1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param { num_output: 512 pad: 1 kernel_size: 3 }
}
layer { name: "relu5_1" type: "ReLU" bottom: "conv5_1" top: "conv5_1" }
layer {
  name: "conv5_2"
  type: "Convolution"
  bottom: "conv5_1"
  top: "conv5_2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param { num_output: 512 pad: 1 kernel_size: 3 }
}
layer { name: "relu5_2" type: "ReLU" bottom: "conv5_2" top: "conv5_2" }
layer {
  name: "conv5_3"
  type: "Convolution"
  bottom: "conv5_2"
  top: "conv5_3"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param { num_output: 512 pad: 1 kernel_size: 3 }
}
layer { name: "relu5_3" type: "ReLU" bottom: "conv5_3" top: "conv5_3" }
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5_3"
  top: "pool5"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param { lr_mult: 10 decay_mult: 1 }
  param { lr_mult: 20 decay_mult: 0 }
  inner_product_param { num_output: 4096 }
}
layer { name: "relu6" type: "ReLU" bottom: "fc6" top: "fc6" }
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param { lr_mult: 10 decay_mult: 1 }
  param { lr_mult: 20 decay_mult: 0 }
  inner_product_param { num_output: 4096 }
}
layer { name: "relu7" type: "ReLU" bottom: "fc7" top: "fc7" }
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc8-2"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8-2"
  param { lr_mult: 10 decay_mult: 1 }
  param { lr_mult: 20 decay_mult: 0 }
  inner_product_param {
    num_output: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8-2"
  bottom: "label"
  include { phase: TRAIN }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc8-2"
  top: "prob"
  include { phase: TEST }
}
layer {
  name: "accuracy_train_top01"
  type: "Accuracy"
  bottom: "fc8-2"
  bottom: "label"
  top: "accuracy_train_top01"
  include { phase: TEST }
}
The imagenet_mean.binaryproto file can be generated by following https://github.com/BVLC/caffe/blob/master/examples/imagenet/make_imagenet_mean.sh
or downloaded directly from the web.
(3) Start training
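The network definition fixes crop_size at 224 because the pretrained fc6 weights expect a specific input size. The sketch below traces the spatial size through the five convolution and pooling blocks, using the standard output-size formulas (3x3 convolution with pad 1 and stride 1; 2x2 max pooling with stride 2, rounded up as Caffe does). The helper names are illustrative, not Caffe APIs.

```python
import math

# (number of 3x3 convolutions, output channels) for each block in the prototxt
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def conv_out(size, kernel=3, pad=1, stride=1):
    """Spatial output size of a convolution layer."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial output size of a pooling layer (Caffe rounds up)."""
    return math.ceil((size - kernel) / stride) + 1


def trace_shapes(input_size=224):
    """Return (channels, height, width) after each pooling layer."""
    size = input_size
    shapes = []
    for n_convs, channels in VGG16_BLOCKS:
        for _ in range(n_convs):
            size = conv_out(size)
        size = pool_out(size)
        shapes.append((channels, size, size))
    return shapes


shapes = trace_shapes()
channels, height, width = shapes[-1]
fc6_inputs = channels * height * width
```

For a 224x224 crop, pool5 has shape 512x7x7, so fc6 receives 25088 inputs. A different crop_size would change that number and the pretrained fc6 weights would no longer fit.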
caffe train --solver=pathto/solver.prototxt --weights=pathto/gender.caffemodel --gpu all
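(4) Use the trained model
In practice, we first run face detection, crop the detected face out of the image, and feed the crop to the trained model for gender recognition. The dlib library can be used for face detection, and its face detector can be fine-tuned for a specific application scenario.
Original post: https://www.cnblogs.com/dskit/p/10269502.html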
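The cropping step can be sketched as a small helper that widens a detected face box and clamps it to the image bounds before the crop is resized for the network. This is an illustration under assumptions: the function name expand_box and the 0.4 margin are hypothetical choices, not taken from the original post, and the box coordinates are expected to come from a detector such as dlib, whose rectangles expose left(), top(), right(), and bottom().

```python
def expand_box(left, top, right, bottom, img_w, img_h, margin=0.4):
    """Grow a face box by `margin` of its size on each side, clamped to the image.

    Returns (left, top, right, bottom) suitable for slicing an image array as
    image[top:bottom, left:right]. The default margin is an assumed value.
    """
    dx = int(round((right - left) * margin))
    dy = int(round((bottom - top) * margin))
    return (
        max(0, left - dx),
        max(0, top - dy),
        min(img_w, right + dx),
        min(img_h, bottom + dy),
    )
```

The crop should then go through the same preprocessing the network saw in training (resize, mean subtraction, 224x224 input), otherwise predictions may degrade.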