The constructor of the nn.Module base class:
def __init__(self):
    self._parameters = OrderedDict()
    self._modules = OrderedDict()
    self._buffers = OrderedDict()
    self._backward_hooks = OrderedDict()
    self._forward_hooks = OrderedDict()
    self.training = True
Each of these attributes is explained below:
_parameters: a dict holding the parameters set directly by the user. self.param1 = nn.Parameter(t.randn(3, 3)) is detected, and an entry with key 'param1' and the corresponding parameter as its value is added to the dict; the parameters inside self.submodule = nn.Linear(3, 4), however, are not stored here.
_modules: the child modules. A submodule assigned via self.submodel = nn.Linear(3, 4) is stored here.
_buffers: buffers. For example, batchnorm uses a momentum mechanism, so each forward pass needs the result of the previous forward pass.
_backward_hooks and _forward_hooks: hooks, used to extract intermediate variables, similar to hooks on variables (a minimal sketch follows right after this list).
training: BatchNorm and Dropout layers use different strategies in the training and testing phases; the value of training decides which forward-pass strategy is used.
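The two hook dicts are not revisited later in this article, so here is a minimal sketch of a forward hook; the hook name save_output_hook and the saved dict are illustrative choices, not part of the original example.

import torch as t
from torch import nn
from torch.autograd import Variable as V

# a small dict to stash the intermediate output captured by the hook
saved = {}

def save_output_hook(module, input, output):
    # a forward hook receives the module, its input tuple and its output
    saved['feat'] = output

layer = nn.Linear(3, 4)
handle = layer.register_forward_hook(save_output_hook)  # the hook is stored in layer._forward_hooks
out = layer(V(t.rand(2, 3)))
print(saved['feat'].size())   # torch.Size([2, 4])
handle.remove()               # detach the hook once it is no longer needed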
The entries of the three dicts above, _parameters, _modules and _buffers, can all be accessed as self.key, which is equivalent to self._parameters['key'].
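A minimal sketch of this equivalence; a bare nn.Module instance is used here purely for illustration.

import torch as t
from torch import nn

m = nn.Module()
m.p = nn.Parameter(t.randn(2, 2))   # routed into m._parameters by __setattr__
m.sub = nn.Linear(2, 2)             # routed into m._modules by __setattr__

# attribute lookup falls back to the three dicts via __getattr__
print(m.p is m._parameters['p'])    # True
print(m.sub is m._modules['sub'])   # True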
Let's define a Module that contains both its own Parameters and a sub-Module together with that sub-Module's Parameters:
import torch as t
from torch import nn
from torch.autograd import Variable as V

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # equivalent to self.register_parameter('param1', nn.Parameter(t.randn(3, 3)))
        self.param1 = nn.Parameter(t.rand(3, 3))
        self.submodel1 = nn.Linear(3, 4)

    def forward(self, input):
        x = self.param1.mm(input)
        x = self.submodel1(x)
        return x

net = Net()
1. _modules
# printing the network object prints the structure of its child modules
print(net)

Output:
Net(
  (submodel1): Linear(in_features=3, out_features=4)
)

# ._modules also gives the child-module structure, though as a different data structure from the above
print(net.submodel1)
print(net._modules)  # a dict subclass

Output:
Linear(in_features=3, out_features=4)
OrderedDict([('submodel1', Linear(in_features=3, out_features=4))])

for name, submodel in net.named_modules():
    print(name, submodel)

Output:
 Net(
  (submodel1): Linear(in_features=3, out_features=4)
)
submodel1 Linear(in_features=3, out_features=4)

print(list(net.named_modules()))  # named_modules actually includes the current module itself

Output:
[('', Net(
  (submodel1): Linear(in_features=3, out_features=4)
)), ('submodel1', Linear(in_features=3, out_features=4))]
2. _parameters
# ._parameters stores its entries in the same kind of structure
print(net.param1)
print(net._parameters)  # a dict subclass, containing only the directly defined nn.Parameter objects

Output:
Parameter containing:
 0.6135  0.8082  0.4519
 0.9052  0.5929  0.2810
 0.6825  0.4437  0.3874
[torch.FloatTensor of size 3x3]

OrderedDict([('param1', Parameter containing:
 0.6135  0.8082  0.4519
 0.9052  0.5929  0.2810
 0.6825  0.4437  0.3874
[torch.FloatTensor of size 3x3]
)])

for name, param in net.named_parameters():
    print(name, param.size())

Output:
param1 torch.Size([3, 3])
submodel1.weight torch.Size([4, 3])
submodel1.bias torch.Size([4])
3. _buffers
bn = nn.BatchNorm1d(2)
input = V(t.rand(3, 2), requires_grad=True)
output = bn(input)
bn._buffers

Output:
OrderedDict([('running_mean',
1.00000e-02 *
  9.1559
  1.9914
[torch.FloatTensor of size 2]), ('running_var',
 0.9003
 0.9019
[torch.FloatTensor of size 2])])
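Buffers are part of a module's state but are not parameters: they appear in state_dict() but not in parameters(). For BatchNorm, the running statistics are updated with the momentum mechanism mentioned earlier, roughly running_mean = (1 - momentum) * running_mean + momentum * batch_mean, with a default momentum of 0.1. A minimal sketch, continuing from the bn object above (the exact state_dict keys can vary across PyTorch versions):

# buffers belong to the module's state but are not parameters
print([name for name, _ in bn.named_parameters()])
# ['weight', 'bias']

print(list(bn.state_dict().keys()))
# includes 'running_mean' and 'running_var'
# (newer PyTorch versions may also list 'num_batches_tracked')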
4. training
input = V(t.arange(0, 12).view(3, 4))
model = nn.Dropout()
# during training, roughly half of the values are randomly zeroed
model(input)

Output:
Variable containing:
  0   2   4   0
  8  10   0   0
  0  18   0  22
[torch.FloatTensor of size 3x4]

model.training = False
# at test time, dropout does nothing
model(input)

Output:
Variable containing:
  0   1   2   3
  4   5   6   7
  8   9  10  11
[torch.FloatTensor of size 3x4]
The relationship between the Module.train() / Module.eval() methods and the Module.training attribute
print(net.training, net.submodel1.training)
net.train()           # sets training to True for this module and all submodules
net.eval()            # sets training to False for this module and all submodules
net.training = True   # note: assigning the attribute directly only affects this module; submodules are unaffected
net.training, net.submodel1.training

Output:
True True
(True, False)
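train() and eval() affect the submodules because they walk the module tree; the following is only a rough sketch of that idea, not the library's actual source.

def set_training(module, mode=True):
    # set the flag on this module, then recurse into every child module,
    # which is roughly what Module.train()/Module.eval() do
    module.training = mode
    for child in module.children():
        set_training(child, mode)

set_training(net, False)
print(net.training, net.submodel1.training)   # False False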