Deep Learning Interview Question 20: GoogLeNet (Inception V1)

Contents

  Introduction

  Network structure

  Code

  Notes on the network

  References



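Introduction

In 2014, GoogLeNet and VGG were the two standout entries in that year's ImageNet challenge (ILSVRC14): GoogLeNet took first place and VGG second. What the two architectures have in common is greater depth. VGG inherits much of its framework from LeNet and AlexNet, whereas GoogLeNet makes a bolder structural departure. Although it is 22 layers deep, it is far smaller than AlexNet or VGG: GoogLeNet has about 5 million parameters, AlexNet has about 12 times as many, and VGGNet has roughly 3 times as many as AlexNet. GoogLeNet is therefore a good choice when memory or compute is limited, and judged by its results it is also the more accurate model.

GoogLeNet is a deep network architecture developed by Google. It is written "GoogLeNet" rather than "GoogleNet" as a tribute to LeNet.

The GoogLeNet team set out to build an Inception module (the name comes from the film Inception) that would let deep networks perform better.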



Network structure

PS: Slim was developed in 2016. Even for Inception V1, it does not use the 5*5 kernels described in the paper; it uses 3*3 kernels instead.
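To see what that substitution means for model size, the short sketch below compares the weight count of a 3*3 and a 5*5 convolution for one branch. It takes the channel numbers from Branch_2 of Mixed_3b in the Slim listing (16 input channels after the 1*1 reduction, 32 output filters); the helper name `conv_weights` is made up for this illustration and is not part of Slim.

```python
def conv_weights(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution (bias excluded)."""
    return kernel * kernel * c_in * c_out

# Branch_2 of Mixed_3b: a 1x1 reduction to 16 channels, then 32 filters.
slim_3x3 = conv_weights(3, 16, 32)   # kernel size used in the Slim code
paper_5x5 = conv_weights(5, 16, 32)  # kernel size described in the paper

print("3x3 weights:", slim_3x3)
print("5x5 weights:", paper_5x5)
print("ratio:", round(paper_5x5 / slim_3x3, 2))
```

For this branch the 5*5 variant needs 25/9 times the weights of the 3*3 variant, so the substitution makes the branch smaller at the cost of a smaller receptive field.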



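Code

The listing below is the official code, tensorflow/models/research/slim/nets/inception_v1.py.

How to download it:

git clone https://github.com/tensorflow/models.git

The repository is large. If the download is slow, the Slim code alone can be downloaded from the shared files of QQ group 537594183.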

# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains the definition for inception v1 classification network."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

from nets import inception_utils

slim = tf.contrib.slim
trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)

def inception_v1_base(inputs,
                      final_endpoint='Mixed_5c',
                      include_root_block=True,
                      scope='InceptionV1'):
  """Defines the Inception V1 base architecture.

  This architecture is defined in:
    Going deeper with convolutions
    Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed,
    Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich.
    http://arxiv.org/pdf/1409.4842v1.pdf.

  Args:
    inputs: a tensor of size [batch_size, height, width, channels].
    final_endpoint: specifies the endpoint to construct the network up to. It
      can be one of ['Conv2d_1a_7x7', 'MaxPool_2a_3x3', 'Conv2d_2b_1x1',
      'Conv2d_2c_3x3', 'MaxPool_3a_3x3', 'Mixed_3b', 'Mixed_3c',
      'MaxPool_4a_3x3', 'Mixed_4b', 'Mixed_4c', 'Mixed_4d', 'Mixed_4e',
      'Mixed_4f', 'MaxPool_5a_2x2', 'Mixed_5b', 'Mixed_5c']. If
      include_root_block is False, ['Conv2d_1a_7x7', 'MaxPool_2a_3x3',
      'Conv2d_2b_1x1', 'Conv2d_2c_3x3', 'MaxPool_3a_3x3'] will not be available.
    include_root_block: If True, include the convolution and max-pooling layers
      before the inception modules. If False, excludes those layers.
    scope: Optional variable_scope.

  Returns:
    A dictionary from components of the network to the corresponding activation.

  Raises:
    ValueError: if final_endpoint is not set to one of the predefined values.
  """
  end_points = {}
  with tf.variable_scope(scope, 'InceptionV1', [inputs]):
    with slim.arg_scope(
        [slim.conv2d, slim.fully_connected],
        weights_initializer=trunc_normal(0.01)):
      with slim.arg_scope([slim.conv2d, slim.max_pool2d],
                          stride=1, padding='SAME'):
        net = inputs
        if include_root_block:
          end_point = 'Conv2d_1a_7x7'
          net = slim.conv2d(inputs, 64, [7, 7], stride=2, scope=end_point)
          end_points[end_point] = net
          if final_endpoint == end_point:
            return net, end_points
          end_point = 'MaxPool_2a_3x3'
          net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)
          end_points[end_point] = net
          if final_endpoint == end_point:
            return net, end_points
          end_point = 'Conv2d_2b_1x1'
          net = slim.conv2d(net, 64, [1, 1], scope=end_point)
          end_points[end_point] = net
          if final_endpoint == end_point:
            return net, end_points
          end_point = 'Conv2d_2c_3x3'
          net = slim.conv2d(net, 192, [3, 3], scope=end_point)
          end_points[end_point] = net
          if final_endpoint == end_point:
            return net, end_points
          end_point = 'MaxPool_3a_3x3'
          net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)
          end_points[end_point] = net
          if final_endpoint == end_point:
            return net, end_points

        end_point = 'Mixed_3b'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 96, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 128, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 16, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 32, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 32, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_3c'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 192, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'MaxPool_4a_3x3'
        net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_4b'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 96, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 208, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 16, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 48, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_4c'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 112, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 224, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 24, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 64, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_4d'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 256, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 24, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 64, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_4e'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 112, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 144, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 288, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 64, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_4f'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 256, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 320, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 128, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'MaxPool_5a_2x2'
        net = slim.max_pool2d(net, [2, 2], stride=2, scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_5b'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 256, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 320, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 128, [3, 3], scope='Conv2d_0a_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_5c'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 384, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 128, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(
              axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points
    raise ValueError('Unknown final endpoint %s' % final_endpoint)

def inception_v1(inputs,
                 num_classes=1000,
                 is_training=True,
                 dropout_keep_prob=0.8,
                 prediction_fn=slim.softmax,
                 spatial_squeeze=True,
                 reuse=None,
                 scope='InceptionV1',
                 global_pool=False):
  """Defines the Inception V1 architecture.

  This architecture is defined in:

    Going deeper with convolutions
    Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed,
    Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich.
    http://arxiv.org/pdf/1409.4842v1.pdf.

  The default image size used to train this network is 224x224.

  Args:
    inputs: a tensor of size [batch_size, height, width, channels].
    num_classes: number of predicted classes. If 0 or None, the logits layer
      is omitted and the input features to the logits layer (before dropout)
      are returned instead.
    is_training: whether is training or not.
    dropout_keep_prob: the percentage of activation values that are retained.
    prediction_fn: a function to get predictions out of logits.
    spatial_squeeze: if True, logits is of shape [B, C], if false logits is of
        shape [B, 1, 1, C], where B is batch_size and C is number of classes.
    reuse: whether or not the network and its variables should be reused. To be
      able to reuse 'scope' must be given.
    scope: Optional variable_scope.
    global_pool: Optional boolean flag to control the avgpooling before the
      logits layer. If false or unset, pooling is done with a fixed window
      that reduces default-sized inputs to 1x1, while larger inputs lead to
      larger outputs. If true, any input size is pooled down to 1x1.

  Returns:
    net: a Tensor with the logits (pre-softmax activations) if num_classes
      is a non-zero integer, or the non-dropped-out input to the logits layer
      if num_classes is 0 or None.
    end_points: a dictionary from components of the network to the corresponding
      activation.
  """
  # Final pooling and prediction
  with tf.variable_scope(scope, 'InceptionV1', [inputs], reuse=reuse) as scope:
    with slim.arg_scope([slim.batch_norm, slim.dropout],
                        is_training=is_training):
      net, end_points = inception_v1_base(inputs, scope=scope)
      with tf.variable_scope('Logits'):
        if global_pool:
          # Global average pooling.
          net = tf.reduce_mean(net, [1, 2], keep_dims=True, name='global_pool')
          end_points['global_pool'] = net
        else:
          # Pooling with a fixed kernel size.
          net = slim.avg_pool2d(net, [7, 7], stride=1, scope='AvgPool_0a_7x7')
          end_points['AvgPool_0a_7x7'] = net
        if not num_classes:
          return net, end_points
        net = slim.dropout(net, dropout_keep_prob, scope='Dropout_0b')
        logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,
                             normalizer_fn=None, scope='Conv2d_0c_1x1')
        if spatial_squeeze:
          logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')

        end_points['Logits'] = logits
        end_points['Predictions'] = prediction_fn(logits, scope='Predictions')
  return logits, end_points
inception_v1.default_image_size = 224

inception_v1_arg_scope = inception_utils.inception_arg_scope
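The spatial sizes and channel counts implied by the listing can be traced by hand. The sketch below is plain Python and not part of the Slim code. It assumes a square 224x224 input, that SAME padding yields ceil(size / stride), and that the final slim.avg_pool2d call uses VALID padding (my understanding of its default). The names `same_out`, `valid_out`, `ENDPOINTS` and `trace` are made up for this illustration.

```python
import math

def same_out(size, stride):
    """Output size of a SAME-padded conv or pool: ceil(size / stride)."""
    return math.ceil(size / stride)

def valid_out(size, kernel, stride):
    """Output size of a VALID-padded conv or pool."""
    return (size - kernel) // stride + 1

# (endpoint, stride, output channels), read off the listing above.
# For Mixed_* the channel count is the sum of the four concatenated branches.
ENDPOINTS = [
    ("Conv2d_1a_7x7", 2, 64),
    ("MaxPool_2a_3x3", 2, 64),
    ("Conv2d_2b_1x1", 1, 64),
    ("Conv2d_2c_3x3", 1, 192),
    ("MaxPool_3a_3x3", 2, 192),
    ("Mixed_3b", 1, 64 + 128 + 32 + 32),
    ("Mixed_3c", 1, 128 + 192 + 96 + 64),
    ("MaxPool_4a_3x3", 2, 480),
    ("Mixed_4b", 1, 192 + 208 + 48 + 64),
    ("Mixed_4c", 1, 160 + 224 + 64 + 64),
    ("Mixed_4d", 1, 128 + 256 + 64 + 64),
    ("Mixed_4e", 1, 112 + 288 + 64 + 64),
    ("Mixed_4f", 1, 256 + 320 + 128 + 128),
    ("MaxPool_5a_2x2", 2, 832),
    ("Mixed_5b", 1, 256 + 320 + 128 + 128),
    ("Mixed_5c", 1, 384 + 384 + 128 + 128),
]

def trace(input_size=224):
    """Return [(endpoint, spatial size, channels)] for a square input."""
    size, rows = input_size, []
    for name, stride, channels in ENDPOINTS:
        size = same_out(size, stride)
        rows.append((name, size, channels))
    return rows

rows = trace(224)
for name, size, channels in rows:
    print(f"{name:16s} {size:3d} x {size:<3d} x {channels}")

# The fixed 7x7 average pool then reduces the final feature map further.
print("AvgPool_0a_7x7 output size:", valid_out(rows[-1][1], 7, 1))
```

Calling `trace` with a larger input size shows why the docstring mentions global_pool: the fixed 7x7 window only reduces the default-sized input to 1x1.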



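Notes on the network

(1) GoogLeNet uses a modular structure (the Inception module), which makes it easy to extend and modify.

(2) At the end of the network, average pooling replaces the fully connected layers, which was found to improve top-1 accuracy by about 0.6%. In practice one fully connected layer is still added at the very end, mainly so that the output can be adapted flexibly.

(3) Although the fully connected layers were removed, the network still uses Dropout.

(4) What are the two auxiliary classifier branches for?

Role 1: They can be seen as a small detail of the Inception network. They ensure that the hidden units and intermediate layers also take part in computing features and can themselves predict the image class. They have a regularizing effect on the network and help prevent overfitting.

Role 2: In a relatively deep network, propagating gradients back through all layers effectively is a concern. Attaching auxiliary classifiers to intermediate layers is expected to encourage discrimination in the lower stages of the network. During training, their losses are added to the total loss of the network with a discount weight (the auxiliary classifier losses are weighted by 0.3).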
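The loss weighting described in Role 2 can be written out as a minimal sketch. This is plain Python with made-up loss values, not code from the Slim listing (which does not appear to build the auxiliary branches at all). The function name `total_training_loss` is hypothetical.

```python
AUX_WEIGHT = 0.3  # discount weight for auxiliary classifier losses, as stated above

def total_training_loss(main_loss, aux_losses, aux_weight=AUX_WEIGHT):
    """Main classifier loss plus the discounted auxiliary classifier losses."""
    return main_loss + aux_weight * sum(aux_losses)

# Hypothetical per-batch cross-entropy values for the main head and the
# two auxiliary heads.
loss = total_training_loss(1.20, [1.50, 1.40])
print(round(loss, 2))
```

The auxiliary terms only shape the gradients during training; the weighting gives them less influence than the main classifier.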

Inception V1 has 5,607,184 parameters, or roughly 5.6 million.
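That figure can be cross-checked against the layer sizes in the listing. The source does not say how the number was obtained, so the breakdown below is an assumption: it counts the convolution kernels of the base network (without the final logits layer) plus three per-channel vectors for every convolution, such as a batch-norm offset, moving mean and moving variance. The names `ROOT`, `MODULES` and `module_convs` are made up for this sketch.

```python
# Root convolutions as (kernel, c_in, c_out), read off the listing.
ROOT = [(7, 3, 64), (1, 64, 64), (3, 64, 192)]

# Inception modules in order, as
# (b0_1x1, b1_reduce, b1_3x3, b2_reduce, b2_3x3, b3_proj).
MODULES = [
    (64, 96, 128, 16, 32, 32),      # Mixed_3b
    (128, 128, 192, 32, 96, 64),    # Mixed_3c
    (192, 96, 208, 16, 48, 64),     # Mixed_4b
    (160, 112, 224, 24, 64, 64),    # Mixed_4c
    (128, 128, 256, 24, 64, 64),    # Mixed_4d
    (112, 144, 288, 32, 64, 64),    # Mixed_4e
    (256, 160, 320, 32, 128, 128),  # Mixed_4f
    (256, 160, 320, 32, 128, 128),  # Mixed_5b
    (384, 192, 384, 48, 128, 128),  # Mixed_5c
]

def module_convs(c_in, b0, b1r, b1, b2r, b2, b3):
    """Convolutions of one Inception module as (kernel, c_in, c_out)."""
    return [(1, c_in, b0), (1, c_in, b1r), (3, b1r, b1),
            (1, c_in, b2r), (3, b2r, b2), (1, c_in, b3)]

convs = list(ROOT)
c_in = 192  # channels after the root block; pooling leaves channels unchanged
for b0, b1r, b1, b2r, b2, b3 in MODULES:
    convs += module_convs(c_in, b0, b1r, b1, b2r, b2, b3)
    c_in = b0 + b1 + b2 + b3  # the four branches are concatenated

kernel_weights = sum(k * k * ci * co for k, ci, co in convs)
out_channels = sum(co for _, _, co in convs)

print("kernel weights:", kernel_weights)
print("output channels:", out_channels)
print("kernels + 3 vectors per conv:", kernel_weights + 3 * out_channels)
```

Under this assumed breakdown the last line matches the figure quoted above. A 1000-class logits layer would add another 1024 * 1000 + 1000 parameters on top.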



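References

《图解深度学习与神经网络:从张量到TensorFlow实现》 (Illustrated Deep Learning and Neural Networks: From Tensors to TensorFlow Implementation), Zhang Ping

Inception V1: Going Deeper with Convolutions

http://noahsnail.com/2017/07/21/2017-07-21-GoogleNet%E8%AE%BA%E6%96%87%E7%BF%BB%E8%AF%91%E2%80%94%E2%80%94%E4%B8%AD%E8%8B%B1%E6%96%87%E5%AF%B9%E7%85%A7/

《深度学习核心技术与实践》 (Deep Learning: Core Technologies and Practice)

大话CNN经典模型:GoogLeNet(从Inception v1到v4的演进) (CNN classics explained: GoogLeNet, the evolution from Inception v1 to v4)

https://my.oschina.net/u/876354/blog/1637819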


Original article: https://www.cnblogs.com/mfryf/p/11381352.html

Time: 2024-10-06 13:25:14
