tf.gradients
Official definition:

tf.gradients(ys, xs, grad_ys=None, name='gradients', stop_gradients=None)
Constructs symbolic derivatives of sum of ys w.r.t. x in xs.

ys and xs are each a Tensor or a list of tensors. grad_ys is a list of Tensor, holding the gradients received by the ys. The list must be the same length as ys.
gradients() adds ops to the graph to output the derivatives of ys with respect to xs. It returns a list of Tensor of length len(xs) where each tensor is the sum(dy/dx) for y in ys.
grad_ys is a list of tensors of the same length as ys that holds the initial gradients for each y in ys. When grad_ys is None, we fill in a tensor of '1's of the shape of y for each y in ys. A user can provide their own initial grad_ys to compute the derivatives using a different initial gradient for each y (e.g., if one wanted to weight the gradient differently for each value in each y).
stop_gradients is a Tensor or a list of tensors to be considered constant with respect to all xs. These tensors will not be backpropagated through, as though they had been explicitly disconnected using stop_gradient. Among other things, this allows computation of partial derivatives as opposed to total derivatives.
Translation:

1. xs and ys can each be a single tensor or a list of tensors. tf.gradients(ys, xs) computes the derivative of ys (if ys is a list, the sum of all its elements) with respect to xs (if xs is a list, the derivative is taken with respect to each element separately). The return value is a list with the same length as xs.
For example, if ys=[y1,y2,y3] and xs=[x1,x2,x3,x4], then tf.gradients(ys,xs) = [d(y1+y2+y3)/dx1, d(y1+y2+y3)/dx2, d(y1+y2+y3)/dx3, d(y1+y2+y3)/dx4]. See lines 16-17 of the code below for a concrete example.
2. grad_ys is a list of weights for ys, with the same length as ys. When grad_ys=[g1,g2,g3], tf.gradients(ys,xs,grad_ys) = [d(g1*y1+g2*y2+g3*y3)/dx1, d(g1*y1+g2*y2+g3*y3)/dx2, d(g1*y1+g2*y2+g3*y3)/dx3, d(g1*y1+g2*y2+g3*y3)/dx4]. See lines 19-21 of the code below for a concrete example.
3. stop_gradients excludes the specified tensors from differentiation, i.e. treats them as constants; see the official documentation for its concrete example, which is not reproduced here.
```python
 1 import tensorflow as tf
 2 w1 = tf.Variable([[1,2]])
 3 w2 = tf.Variable([[3,4]])
 4 res = tf.matmul(w1, [[2],[1]])
 5
 6 # ys must depend on xs, otherwise an error is raised
 7 # grads = tf.gradients(res,[w1,w2])
 8 # TypeError: Fetch argument None has invalid type <class 'NoneType'>
 9
10 # grads = tf.gradients(res,[w1])
11 # Result: [array([[2, 1]])]
12
13 res2a = tf.matmul(w1, [[2],[1]]) + tf.matmul(w2, [[3],[5]])
14 res2b = tf.matmul(w1, [[2],[4]]) + tf.matmul(w2, [[8],[6]])
15
16 # grads = tf.gradients([res2a,res2b],[w1,w2])
17 # Result: [array([[4, 5]]), array([[11, 11]])]
18
19 grad_ys = [tf.Variable([[1]]), tf.Variable([[2]])]
20 grads = tf.gradients([res2a,res2b], [w1,w2], grad_ys=grad_ys)
21 # Result: [array([[6, 9]]), array([[19, 17]])]
22
23 with tf.Session() as sess:
24     tf.global_variables_initializer().run()
25     re = sess.run(grads)
26     print(re)
```
Original post: https://www.cnblogs.com/luckyscarlett/p/10570938.html