Variables are used to store and update parameters, that is, the W and b of a network. Variables live in memory. When training finishes, they need to be saved to disk so that the model can be used or analyzed later.
1. Variables

Creating and Initializing

When you create a variable, you pass a Tensor to the Variable() constructor as its initial value. This initial value can be random or constant. The initial Tensor needs a specified shape; this shape is usually fixed, although it can be adjusted with some advanced mechanisms.

Creating the variable is not enough on its own: you also need to define an initialization op and run it before using any variable. Example:
```python
import tensorflow as tf

# Create two variables.
weights = tf.Variable(tf.random_normal([20, 10], stddev=0.35), name="weights")
biases = tf.Variable(tf.zeros([10]), name="biases")

# Add other net structure...
# ...

# Add an op to initialize the variables.
init = tf.initialize_all_variables()

# Later, when launching the model
with tf.Session() as sess:
    sess.run(init)
    print(weights.eval())
    print(biases.eval())
```
Output: weights is a [20, 10] matrix and biases is a [10] vector.

Note:

tf.initialize_all_variables() initializes all variables in parallel, so be careful when one variable's initial value depends on the value of another variable. Initializing directly may happen to work, but when it goes wrong the cause is hard to track down.

In that case, initialize as follows:
```python
# Create a variable with a random value.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")
# Create another variable with the same value as 'weights'.
w2 = tf.Variable(weights.initialized_value(), name="w2")
# Create another variable with twice the value of 'weights'.
w_twice = tf.Variable(weights.initialized_value() * 2.0, name="w_twice")
```
There is also custom initialization; since I have not needed it yet, I will leave it as a placeholder and fill it in later. See the Variables Documentation for details.

Variables can be initialized to constants or to random numbers, depending on the initialization strategy. Exactly which method to use in which situation is another placeholder I will come back to once I have learned it.

Methods for initializing to constants:
tf.zeros(shape, dtype=tf.float32, name=None)
Initializes all elements to 0.
tf.zeros_like(tensor, dtype=None, name=None, optimize=True)
Creates a tensor with the same shape as the given tensor, but with every element set to zero. For example, if 'tensor' = [[1, 2, 3], [4, 5, 6]], then tf.zeros_like(tensor) ==> [[0, 0, 0], [0, 0, 0]].
tf.ones(shape, dtype=tf.float32, name=None)
Initializes all elements to 1.
tf.ones_like() works the same way as tf.zeros_like().
tf.fill(dims, value, name=None)
Fills a tensor of the given shape with value: tf.fill([2, 3], 9) ==> [[9, 9, 9], [9, 9, 9]]
tf.constant(value, dtype=None, shape=None, name='Const', verify_shape=False)
For example:
```python
# Constant 1-D Tensor populated with value list.
tensor = tf.constant([1, 2, 3, 4, 5, 6, 7])  # => [1 2 3 4 5 6 7]

# Constant 2-D tensor populated with scalar value -1.
tensor = tf.constant(-1.0, shape=[2, 3])  # => [[-1. -1. -1.]
                                          #     [-1. -1. -1.]]
```
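Any of these constant tensors can also serve directly as a variable's initial value. A minimal sketch (the variable names here are just for illustration):

```python
import tensorflow as tf

# Variables whose initial values come from the constant-valued tensors above.
b_zeros = tf.Variable(tf.zeros([10]), name="b_zeros")
b_ones = tf.Variable(tf.ones([10]), name="b_ones")
b_nines = tf.Variable(tf.fill([2, 3], 9), name="b_nines")
```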
Initializing to a sequence
tf.linspace()
tf.range()
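Neither function is covered in detail here; as a quick sketch of how they might be used (the arguments below are purely illustrative):

```python
import tensorflow as tf

# 5 evenly spaced values from 0.0 to 1.0, endpoints included.
seq1 = tf.linspace(0.0, 1.0, 5)   # => [0.0, 0.25, 0.5, 0.75, 1.0]

# Values from 3 up to (but not including) 18, stepping by 3.
seq2 = tf.range(3, 18, 3)         # => [3, 6, 9, 12, 15]

with tf.Session() as sess:
    print(sess.run(seq1))
    print(sess.run(seq2))
```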
Initializing to random numbers

tf.random_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)

Initializes values drawn from a normal distribution.
tf.truncated_normal()
Same as above, except that values more than two standard deviations from the mean are discarded and re-drawn, so every generated value lies within two standard deviations of the mean of the normal distribution.
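A short sketch of both functions used as variable initializers (the shapes and stddev are just illustrative):

```python
import tensorflow as tf

# Values drawn from N(0, 0.35^2); outliers are kept as-is.
w_normal = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="w_normal")

# Same distribution, but any draw farther than two standard deviations
# from the mean is discarded and re-drawn.
w_trunc = tf.Variable(tf.truncated_normal([784, 200], stddev=0.35), name="w_trunc")
```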
Saving and Loading

Once we have a trained model, we need to save it so that it can be loaded and used again later. The simplest way is to use tf.train.Saver. Here is an example.

First, saving the variables:
```python
import tensorflow as tf

# Create some variables.
v1 = tf.Variable([1, 2, 3, 4, 5], name="v1")
v2 = tf.Variable([11, 12, 13, 14], name="v2")

# Add an op to initialize the variables.
init = tf.initialize_all_variables()

# Add an op to save and restore all the variables.
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(init)
    # Do something with the model.
    print(v1.eval())
    print(v2.eval())
    # Save the variables to disk.
    save_path = saver.save(sess, "model/model.ckpt")
    print("Model saved in file:", save_path)
```
Next, loading:
```python
import tensorflow as tf

# Recreate variables with the same names as in the checkpoint.
v3 = tf.Variable([0, 0, 0, 0, 0], name="v1")
v4 = tf.Variable([0, 0, 0, 0], name="v2")

saver = tf.train.Saver()

with tf.Session() as sess:
    # Restore the variables from disk; no explicit initialization is needed.
    saver.restore(sess, "model/model.ckpt")
    print("Model restored.")
    print(v3.eval())
    print(v4.eval())
```
When restoring variables from a file, there is no need to initialize them beforehand. Note: when restoring, the name argument passed to tf.Variable() must match the original variable's name, otherwise the value cannot be restored into the corresponding variable.
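If the Python-side names differ from the names stored in the checkpoint, the mapping can also be given explicitly when constructing the Saver. A hedged sketch, reusing the checkpoint path from the example above (the variable names restored_v1/restored_v2 are hypothetical):

```python
import tensorflow as tf

v3 = tf.Variable([0, 0, 0, 0, 0], name="restored_v1")
v4 = tf.Variable([0, 0, 0, 0], name="restored_v2")

# Map the checkpoint names ("v1", "v2") to the variables they should be loaded into.
saver = tf.train.Saver({"v1": v3, "v2": v4})

with tf.Session() as sess:
    saver.restore(sess, "model/model.ckpt")
    print(v3.eval())
    print(v4.eval())
```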
Methods of Variable
__init__(initial_value=None, trainable=True, collections=None, validate_shape=True, caching_device=None, name=None, variable_def=None, dtype=None, expected_shape=None, import_scope=None)

Creates a new variable with value initial_value.

The new variable is added to the graph collections listed in collections, which defaults to [GraphKeys.GLOBAL_VARIABLES]. If trainable is True, the variable is also added to the graph collection GraphKeys.TRAINABLE_VARIABLES.

This constructor creates both a variable Op and an assign Op to set the variable to its initial value.

Args:

- initial_value: A Tensor, or Python object convertible to a Tensor, which is the initial value for the Variable. The initial value must have a shape specified unless validate_shape is set to False. Can also be a callable with no argument that returns the initial value when called. In that case, dtype must be specified. (Note that initializer functions from init_ops.py must first be bound to a shape before being used here.)
- trainable: If True, the default, also adds the variable to the graph collection GraphKeys.TRAINABLE_VARIABLES. This collection is used as the default list of variables to use by the Optimizer classes. (I take this to be the collection that gets used when running the optimization step.)
- collections: List of graph collections keys. The new variable is added to these collections. Defaults to [GraphKeys.GLOBAL_VARIABLES].
- validate_shape: If False, allows the variable to be initialized with a value of unknown shape. If True, the default, the shape of initial_value must be known.
- caching_device: Optional device string describing where the Variable should be cached for reading. Defaults to the Variable's device. If not None, caches on another device. Typical use is to cache on the device where the Ops using the Variable reside, to deduplicate copying through Switch and other conditional statements.
- name: Optional name for the variable. Defaults to 'Variable' and gets uniquified automatically.
- variable_def: VariableDef protocol buffer. If not None, recreates the Variable object with its contents. variable_def and the other arguments are mutually exclusive.
- dtype: If set, initial_value will be converted to the given type. If None, either the datatype will be kept (if initial_value is a Tensor), or convert_to_tensor will decide.
- expected_shape: A TensorShape. If set, initial_value is expected to have this shape.
- import_scope: Optional string. Name scope to add to the Variable. Only used when initializing from protocol buffer.

Raises:

- ValueError: If both variable_def and initial_value are specified.
- ValueError: If the initial value is not specified, or does not have a shape and validate_shape is True.
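As a brief sketch of a few of these arguments in use (the variable names below are just for illustration): the shape is taken from initial_value, trainable=False keeps a variable out of GraphKeys.TRAINABLE_VARIABLES, and dtype forces a conversion of the initial value.

```python
import tensorflow as tf

# A trainable weight matrix; its shape [784, 200] comes from the initial value.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")

# A step counter the optimizer should not update: trainable=False keeps it out of
# GraphKeys.TRAINABLE_VARIABLES; dtype converts the Python int 0 to int64.
global_step = tf.Variable(0, trainable=False, dtype=tf.int64, name="global_step")
```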
eval(session=None)
In a session, computes and returns the value of this variable.
This is not a graph construction method, it does not add ops to the graph.
This convenience method requires a session where the graph containing this variable has been launched. If no session is passed, the default session is used. See tf.Session for more information on launching a graph and on sessions.

```python
v = tf.Variable([1, 2])
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    # Usage passing the session explicitly.
    print(v.eval(sess))
    # Usage with the default session. The 'with' block
    # above makes 'sess' the default session.
    print(v.eval())
```
Args:

- session: The session to use to evaluate this variable. If none, the default session is used.

Returns:

A numpy ndarray with a copy of the value of this variable.
2. Common Classes
Tensor
The Tensor class is the core data structure. A Tensor is a symbolic handle to the output of an operation; it does not hold that output's values, but it provides a means of computing those values in a Session.
This class has two primary purposes:
- A Tensor can be passed as an input to another Operation. This builds a dataflow connection between operations, which enables TensorFlow to execute an entire Graph that represents a large, multi-step computation.
- After the graph has been launched in a session, the value of the Tensor can be computed by passing it to Session.run(). t.eval() is a shortcut for calling tf.get_default_session().run(t).
Operation
An Operation is a node in TensorFlow: it takes Tensors as input and outputs a Tensor. In other words, it is a computation, e.g. tf.matmul(a, b) is simply a × b. Once the graph has been launched in a session, an Operation can be executed via tf.Session.run() or via op.run().
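A minimal sketch tying the two classes together (the constants below are purely illustrative): tf.matmul builds an Operation, its output c is a Tensor, and the value is only computed when it is run in a Session.

```python
import tensorflow as tf

a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0, 0.0], [0.0, 1.0]])

# Building the graph: c is a Tensor, the symbolic output of the matmul Operation.
c = tf.matmul(a, b)

with tf.Session() as sess:
    # Either run the Tensor through the session...
    print(sess.run(c))
    # ...or use the eval() shortcut, which relies on the default session
    # established by the 'with' block above.
    print(c.eval())
```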