Deep Learning Interview Question 20: GoogLeNet (Inception V1)


















git clone https://github.com/tensorflow/models.git


# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains the definition for inception v1 classification network."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

from nets import inception_utils

slim = tf.contrib.slim
trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)

def inception_v1_base(inputs,
                      final_endpoint='Mixed_5c',
                      include_root_block=True,
                      scope='InceptionV1'):
  """Defines the Inception V1 base architecture.

  This architecture is defined in:
    Going deeper with convolutions
    Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed,
    Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich.

  Args:
    inputs: a tensor of size [batch_size, height, width, channels].
    final_endpoint: specifies the endpoint to construct the network up to. It
      can be one of ['Conv2d_1a_7x7', 'MaxPool_2a_3x3', 'Conv2d_2b_1x1',
      'Conv2d_2c_3x3', 'MaxPool_3a_3x3', 'Mixed_3b', 'Mixed_3c',
      'MaxPool_4a_3x3', 'Mixed_4b', 'Mixed_4c', 'Mixed_4d', 'Mixed_4e',
      'Mixed_4f', 'MaxPool_5a_2x2', 'Mixed_5b', 'Mixed_5c']. If
      include_root_block is False, ['Conv2d_1a_7x7', 'MaxPool_2a_3x3',
      'Conv2d_2b_1x1', 'Conv2d_2c_3x3', 'MaxPool_3a_3x3'] will not be available.
    include_root_block: If True, include the convolution and max-pooling layers
      before the inception modules. If False, excludes those layers.
    scope: Optional variable_scope.

  Returns:
    A dictionary from components of the network to the corresponding activation.

  Raises:
    ValueError: if final_endpoint is not set to one of the predefined values.
  """
  end_points = {}
  with tf.variable_scope(scope, 'InceptionV1', [inputs]):
    with slim.arg_scope(
        [slim.conv2d, slim.fully_connected],
        weights_initializer=trunc_normal(0.01)):
      with slim.arg_scope([slim.conv2d, slim.max_pool2d],
                          stride=1, padding='SAME'):
        net = inputs
        if include_root_block:
          end_point = 'Conv2d_1a_7x7'
          net = slim.conv2d(inputs, 64, [7, 7], stride=2, scope=end_point)
          end_points[end_point] = net
          if final_endpoint == end_point:
            return net, end_points
          end_point = 'MaxPool_2a_3x3'
          net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)
          end_points[end_point] = net
          if final_endpoint == end_point:
            return net, end_points
          end_point = 'Conv2d_2b_1x1'
          net = slim.conv2d(net, 64, [1, 1], scope=end_point)
          end_points[end_point] = net
          if final_endpoint == end_point:
            return net, end_points
          end_point = 'Conv2d_2c_3x3'
          net = slim.conv2d(net, 192, [3, 3], scope=end_point)
          end_points[end_point] = net
          if final_endpoint == end_point:
            return net, end_points
          end_point = 'MaxPool_3a_3x3'
          net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)
          end_points[end_point] = net
          if final_endpoint == end_point:
            return net, end_points

        end_point = 'Mixed_3b'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 96, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 128, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 16, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 32, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 32, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_3c'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 192, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'MaxPool_4a_3x3'
        net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_4b'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 96, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 208, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 16, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 48, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_4c'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 112, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 224, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 24, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 64, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_4d'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 256, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 24, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 64, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_4e'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 112, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 144, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 288, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 64, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_4f'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 256, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 320, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 128, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'MaxPool_5a_2x2'
        net = slim.max_pool2d(net, [2, 2], stride=2, scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_5b'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 256, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 320, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 128, [3, 3], scope='Conv2d_0a_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_5c'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 384, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 128, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points
    raise ValueError('Unknown final endpoint %s' % final_endpoint)

def inception_v1(inputs,
                 num_classes=1000,
                 is_training=True,
                 dropout_keep_prob=0.8,
                 prediction_fn=slim.softmax,
                 spatial_squeeze=True,
                 reuse=None,
                 scope='InceptionV1',
                 global_pool=False):
  """Defines the Inception V1 architecture.

  This architecture is defined in:

    Going deeper with convolutions
    Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed,
    Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich.

  The default image size used to train this network is 224x224.

  Args:
    inputs: a tensor of size [batch_size, height, width, channels].
    num_classes: number of predicted classes. If 0 or None, the logits layer
      is omitted and the input features to the logits layer (before dropout)
      are returned instead.
    is_training: whether is training or not.
    dropout_keep_prob: the percentage of activation values that are retained.
    prediction_fn: a function to get predictions out of logits.
    spatial_squeeze: if True, logits is of shape [B, C], if false logits is of
        shape [B, 1, 1, C], where B is batch_size and C is number of classes.
    reuse: whether or not the network and its variables should be reused. To be
      able to reuse 'scope' must be given.
    scope: Optional variable_scope.
    global_pool: Optional boolean flag to control the avgpooling before the
      logits layer. If false or unset, pooling is done with a fixed window
      that reduces default-sized inputs to 1x1, while larger inputs lead to
      larger outputs. If true, any input size is pooled down to 1x1.

  Returns:
    net: a Tensor with the logits (pre-softmax activations) if num_classes
      is a non-zero integer, or the non-dropped-out input to the logits layer
      if num_classes is 0 or None.
    end_points: a dictionary from components of the network to the corresponding
      activation.
  """
  # Final pooling and prediction
  with tf.variable_scope(scope, 'InceptionV1', [inputs], reuse=reuse) as scope:
    with slim.arg_scope([slim.batch_norm, slim.dropout],
                        is_training=is_training):
      net, end_points = inception_v1_base(inputs, scope=scope)
      with tf.variable_scope('Logits'):
        if global_pool:
          # Global average pooling.
          net = tf.reduce_mean(net, [1, 2], keep_dims=True, name='global_pool')
          end_points['global_pool'] = net
        else:
          # Pooling with a fixed kernel size.
          net = slim.avg_pool2d(net, [7, 7], stride=1, scope='AvgPool_0a_7x7')
          end_points['AvgPool_0a_7x7'] = net
        if not num_classes:
          return net, end_points
        net = slim.dropout(net, dropout_keep_prob, scope='Dropout_0b')
        logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,
                             normalizer_fn=None, scope='Conv2d_0c_1x1')
        if spatial_squeeze:
          logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')

        end_points['Logits'] = logits
        end_points['Predictions'] = prediction_fn(logits, scope='Predictions')
  return logits, end_points
inception_v1.default_image_size = 224

inception_v1_arg_scope = inception_utils.inception_arg_scope
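
The docstring above says the fixed 7x7 average pool reduces default-sized (224x224) inputs to 1x1, while larger inputs give larger outputs. The sketch below traces the spatial size through the stride-2 layers of the listing to show why. It assumes the usual TensorFlow output-size rules (SAME: ceil(size / stride); VALID: (size - kernel) // stride + 1) and that `slim.avg_pool2d` is left at VALID padding; the helper names `same_out`, `valid_out` and `trace_sizes` are made up for this illustration.

```python
import math


def same_out(size, stride):
    """Output length of a SAME-padded conv or pool: ceil(size / stride)."""
    return math.ceil(size / stride)


def valid_out(size, kernel, stride=1):
    """Output length of a VALID (unpadded) conv or pool."""
    return (size - kernel) // stride + 1


# Layers in the listing that use stride 2. Every other conv/max-pool layer
# runs with stride 1 and SAME padding, so it keeps the spatial size.
STRIDED_LAYERS = [
    ("Conv2d_1a_7x7", 2),
    ("MaxPool_2a_3x3", 2),
    ("MaxPool_3a_3x3", 2),
    ("MaxPool_4a_3x3", 2),
    ("MaxPool_5a_2x2", 2),
]


def trace_sizes(input_size=224):
    """Return {end point: spatial size} for a square input."""
    sizes = {}
    size = input_size
    for name, stride in STRIDED_LAYERS:
        size = same_out(size, stride)
        sizes[name] = size
    # Assumption: the final 7x7 average pool uses VALID padding, stride 1.
    sizes["AvgPool_0a_7x7"] = valid_out(size, kernel=7)
    return sizes


if __name__ == "__main__":
    for name, size in trace_sizes(224).items():
        print(f"{name}: {size}x{size}")
```

With a 224x224 input the feature map entering the final pool is 7x7, so the 7x7 window covers it exactly; passing `global_pool=True` instead averages over whatever spatial size arrives.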




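(2) At the end of the network, average pooling replaces the fully connected layers, which was found to improve top-1 accuracy by about 0.6%. In practice, a fully connected layer is still added at the very end, mainly so that the output can be adapted flexibly;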

(3) Although the fully connected layers were removed, the network still uses Dropout;
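
To make point (2) concrete, here is a minimal pure-Python sketch of global average pooling, together with a rough weight count for the final classifier with and without it. The 7x7x1024 feature-map shape follows from the listing for a 224x224 input; the 1000-class output and the helper names `global_avg_pool` and `classifier_weights` are assumptions made for this illustration.

```python
def global_avg_pool(feature_map):
    """Average an [H][W][C] nested list over H and W; returns C values."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    sums = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                sums[c] += value
    return [s / (height * width) for s in sums]


def classifier_weights(height, width, channels, num_classes, use_avg_pool):
    """Weights (biases ignored) of the last linear layer.

    With average pooling the layer sees one value per channel; without it,
    it would see the whole flattened H*W*C feature map.
    """
    inputs = channels if use_avg_pool else height * width * channels
    return inputs * num_classes


if __name__ == "__main__":
    # Toy 2x2 feature map with 2 channels.
    toy = [[[1.0, 10.0], [3.0, 20.0]],
           [[5.0, 30.0], [7.0, 40.0]]]
    print(global_avg_pool(toy))
    print(classifier_weights(7, 7, 1024, 1000, use_avg_pool=True))
    print(classifier_weights(7, 7, 1024, 1000, use_avg_pool=False))
```

Under these assumptions the pooled classifier needs 1,024,000 weights, against 50,176,000 for a fully connected layer on the flattened map.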




Inception V1 has 5,607,184 parameters, or roughly 5.6 million.
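
The text does not say how this figure was counted. One reading that reproduces it exactly, offered here as an assumption rather than the author's method, is to count every convolution kernel without a bias plus three batch-normalization variables per output channel (beta, moving mean, moving variance), and to leave out the final logits layer. The sketch below does that arithmetic from the channel numbers in the listing; `MODULES`, `ROOT_CONVS`, `conv_vars` and `count_base_variables` are names invented for this example.

```python
# Output channels per Inception module, read off the listing:
# (Branch_0 1x1, Branch_1 1x1 reduce, Branch_1 3x3,
#  Branch_2 1x1 reduce, Branch_2 3x3, Branch_3 1x1 after pooling)
MODULES = [
    ("Mixed_3b", (64, 96, 128, 16, 32, 32)),
    ("Mixed_3c", (128, 128, 192, 32, 96, 64)),
    ("Mixed_4b", (192, 96, 208, 16, 48, 64)),
    ("Mixed_4c", (160, 112, 224, 24, 64, 64)),
    ("Mixed_4d", (128, 128, 256, 24, 64, 64)),
    ("Mixed_4e", (112, 144, 288, 32, 64, 64)),
    ("Mixed_4f", (256, 160, 320, 32, 128, 128)),
    ("Mixed_5b", (256, 160, 320, 32, 128, 128)),
    ("Mixed_5c", (384, 192, 384, 48, 128, 128)),
]

# Root block convolutions: (kernel size, output channels).
ROOT_CONVS = [(7, 64), (1, 64), (3, 192)]


def conv_vars(kernel, c_in, c_out, per_channel):
    """Kernel weights plus `per_channel` extra variables per output channel."""
    return kernel * kernel * c_in * c_out + per_channel * c_out


def count_base_variables(per_channel=3, in_channels=3):
    """Count variables of the base network; returns (total, final channels).

    per_channel=3 is the assumed batch-norm case (beta, moving mean, moving
    variance); per_channel=1 corresponds to plain conv biases. Pooling layers
    add no variables and do not change the channel count.
    """
    total = 0
    channels = in_channels
    for kernel, c_out in ROOT_CONVS:
        total += conv_vars(kernel, channels, c_out, per_channel)
        channels = c_out
    for _name, (b0, b1_red, b1, b2_red, b2, b3) in MODULES:
        total += conv_vars(1, channels, b0, per_channel)
        total += conv_vars(1, channels, b1_red, per_channel)
        total += conv_vars(3, b1_red, b1, per_channel)
        total += conv_vars(1, channels, b2_red, per_channel)
        total += conv_vars(3, b2_red, b2, per_channel)
        total += conv_vars(1, channels, b3, per_channel)
        # The four branches are concatenated along the channel axis.
        channels = b0 + b1 + b2 + b3
    return total, channels


if __name__ == "__main__":
    total, channels = count_base_variables()
    print(total, channels)
```

Under this assumption the total comes to 5,607,184 with 1024 output channels. Counting plain biases instead (`per_channel=1`) gives 5,592,624, so the quoted figure depends on which variables are included.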




inceptionV1-Going Deeper with Convolutions



CNN Classic Models Explained: GoogLeNet (the evolution from Inception v1 to v4)




Time: 2024-10-06 13:25:14
