Structuring Your TensorFlow Models

Translated from http://danijar.com/structuring-your-tensorflow-models/

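Building a neural network model in TensorFlow can easily lead to a large amount of code, so how do you structure that code in a readable and reusable way? (If you are impatient, go straight to the source code: https://gist.github.com/danijar/8663d3bbfd586bffecf6a0094cd116f2)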

Defining the Compute Graph

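Starting with one class per model is a sensible choice. What should the interface of that class be? Usually, a model connects to some input data and placeholders, and provides operations for training, evaluation, and inference.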

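import tensorflow as tf  # the code in this article uses the TensorFlow 1.x graph API

class Model:

    def __init__(self, data, target):
        # Infer the layer sizes from the static shapes of the inputs.
        data_size = int(data.get_shape()[1])
        target_size = int(target.get_shape()[1])
        # A single softmax layer.
        weight = tf.Variable(tf.truncated_normal([data_size, target_size]))
        bias = tf.Variable(tf.constant(0.1, shape=[target_size]))
        incoming = tf.matmul(data, weight) + bias
        self._prediction = tf.nn.softmax(incoming)
        # Cross-entropy: multiply the target by the log-prediction, then sum.
        cross_entropy = -tf.reduce_sum(target * tf.log(self._prediction))
        self._optimize = tf.train.RMSPropOptimizer(0.03).minimize(cross_entropy)
        # Fraction of examples whose predicted class differs from the target.
        mistakes = tf.not_equal(
            tf.argmax(target, 1), tf.argmax(self._prediction, 1))
        self._error = tf.reduce_mean(tf.cast(mistakes, tf.float32))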

    @property
    def prediction(self):
        return self._prediction

    @property
    def optimize(self):
        return self._optimize

    @property
    def error(self):
        return self._error

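The code above is a basic example of how to define a model in TensorFlow. However, it has some problems. Most notably, the whole graph is defined in a single function, the constructor __init__, which is neither readable nor reusable.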

Using Properties

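Simply splitting the code into functions does not work, because every call to a function would add more operations to the graph. We therefore have to make sure that the operations are added to the graph only the first time the function is called. In other words, we use lazy-loading.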
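As a rough illustration of the problem (not from the original article), the sketch below uses a plain Python list as a stand-in for the graph, so it runs without TensorFlow. The names graph, build_prediction, and LazyModel are hypothetical. It only mimics the behaviour described above: a naive builder adds a node on every call, while a cached property adds it once.

```python
# Plain-Python stand-in for a computation graph: a list of node names.
# This is NOT TensorFlow; it only mimics "calling a builder adds nodes".
graph = []


def build_prediction():
    """Naive builder: appends a new node on every call."""
    graph.append('prediction')
    return len(graph) - 1  # index of the node that was just added


class LazyModel:
    """Caches the node so that the builder runs at most once."""

    def __init__(self):
        self._prediction = None

    @property
    def prediction(self):
        if self._prediction is None:
            self._prediction = build_prediction()
        return self._prediction


# Calling the naive builder twice duplicates the node.
build_prediction()
build_prediction()
naive_count = len(graph)

# Accessing the lazy property twice adds only one more node.
model = LazyModel()
model.prediction
model.prediction
lazy_count = len(graph) - naive_count
```

The class below applies the same caching idea to the actual model.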

class Model:

    def __init__(self, data, target):
        self.data = data
        self.target = target
        self._prediction = None
        self._optimize = None
        self._error = None

    @property
    def prediction(self):
        # Compare with None: in graph mode a tf.Tensor cannot be used as a Python bool.
        if self._prediction is None:
            data_size = int(self.data.get_shape()[1])
            target_size = int(self.target.get_shape()[1])
            weight = tf.Variable(tf.truncated_normal([data_size, target_size]))
            bias = tf.Variable(tf.constant(0.1, shape=[target_size]))
            incoming = tf.matmul(self.data, weight) + bias
            self._prediction = tf.nn.softmax(incoming)
        return self._prediction

    @property
    def optimize(self):
        if self._optimize is None:
            cross_entropy = -tf.reduce_sum(self.target * tf.log(self.prediction))
            optimizer = tf.train.RMSPropOptimizer(0.03)
            self._optimize = optimizer.minimize(cross_entropy)
        return self._optimize

    @property
    def error(self):
        if self._error is None:
            mistakes = tf.not_equal(
                tf.argmax(self.target, 1), tf.argmax(self.prediction, 1))
            self._error = tf.reduce_mean(tf.cast(mistakes, tf.float32))
        return self._error

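This is much better than the first example. The code is now organized into several functions. However, the lazy-loading logic still leaves the code somewhat bloated. Let's see how we can improve it further.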

Lazy Property Decorator

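Python is a very flexible language, so how can we strip the redundant code from the last example? We can use a decorator that behaves like @property but evaluates the function only once. It stores the result in a member named after the decorated function (with a prefix prepended) and returns this value on any subsequent call. If you have not used custom decorators before, this tutorial may help: http://blog.apcelent.com/python-decorator-tutorial-with-example.html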

import functools

def lazy_property(function):
    attribute = '_cache_' + function.__name__

    @property
    @functools.wraps(function)
    def decorator(self):
        if not hasattr(self, attribute):
            setattr(self, attribute, function(self))
        return getattr(self, attribute)

    return decorator
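To check that the decorator really evaluates the function only once, here is a small TensorFlow-free sketch. The Counter class and its value property are hypothetical names used only for illustration; lazy_property is repeated so that the snippet runs on its own.

```python
import functools


def lazy_property(function):
    attribute = '_cache_' + function.__name__

    @property
    @functools.wraps(function)
    def decorator(self):
        if not hasattr(self, attribute):
            setattr(self, attribute, function(self))
        return getattr(self, attribute)

    return decorator


class Counter:
    """Hypothetical example class, not part of the original article."""

    def __init__(self):
        self.calls = 0

    @lazy_property
    def value(self):
        # Count how often the body actually runs.
        self.calls += 1
        return 42


counter = Counter()
first = counter.value   # runs the body and caches the result
second = counter.value  # served from the cached attribute
```

After both accesses, counter.calls is 1 and the cached result lives in the attribute counter._cache_value.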

With this decorator, our example simplifies to the code below.

class Model:

    def __init__(self, data, target):
        self.data = data
        self.target = target
        # Access each property once so that the whole graph is built here.
        self.prediction
        self.optimize
        self.error

    @lazy_property
    def prediction(self):
        data_size = int(self.data.get_shape()[1])
        target_size = int(self.target.get_shape()[1])
        weight = tf.Variable(tf.truncated_normal([data_size, target_size]))
        bias = tf.Variable(tf.constant(0.1, shape=[target_size]))
        incoming = tf.matmul(self.data, weight) + bias
        return tf.nn.softmax(incoming)

    @lazy_property
    def optimize(self):
        cross_entropy = -tf.reduce_sum(self.target * tf.log(self.prediction))
        optimizer = tf.train.RMSPropOptimizer(0.03)
        return optimizer.minimize(cross_entropy)

    @lazy_property
    def error(self):
        mistakes = tf.not_equal(
            tf.argmax(self.target, 1), tf.argmax(self.prediction, 1))
        return tf.reduce_mean(tf.cast(mistakes, tf.float32))

Note that we mention the properties in the constructor. This ensures that the whole graph has been defined by the time we run tf.initialize_variables().

Organizing the Graph with Scopes

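We now have a clean way to define the model in code, but the resulting computation graph is still cluttered. If you visualize the graph, you will find a large number of small, interconnected nodes. The solution is to wrap the content of each function in tf.name_scope('name') or tf.variable_scope('name'), so that the nodes are grouped together in the graph. We can also adjust our earlier decorator to do this automatically: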

import functools

import tensorflow as tf

def define_scope(function):
    attribute = '_cache_' + function.__name__

    @property
    @functools.wraps(function)
    def decorator(self):
        if not hasattr(self, attribute):
            # Build the operations inside a scope named after the function.
            with tf.variable_scope(function.__name__):
                setattr(self, attribute, function(self))
        return getattr(self, attribute)

    return decorator

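The decorator gets a new name because, in addition to the lazy caching, it now has functionality specific to TensorFlow. Other than that, the model looks identical to the previous one.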

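We could go even further and let the @define_scope decorator forward arguments to tf.variable_scope(), for example to define a default initializer for the scope. If you are interested, see the full example: https://gist.github.com/danijar/8663d3bbfd586bffecf6a0094cd116f2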
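One way such argument forwarding might look is sketched below. This is not the implementation from the linked gist. To keep the snippet runnable without TensorFlow, a hypothetical fake_variable_scope context manager stands in for tf.variable_scope and only records the arguments it receives. The names opened_scopes, Demo, and the scope keyword are likewise assumptions made for illustration.

```python
import contextlib
import functools

# Records every scope that gets opened, so the effect is observable.
opened_scopes = []


@contextlib.contextmanager
def fake_variable_scope(name, **kwargs):
    """Hypothetical stand-in for tf.variable_scope; it only logs its arguments."""
    opened_scopes.append((name, kwargs))
    yield


def define_scope(function=None, scope=None, **scope_kwargs):
    """Lazy property that builds its value inside a (stand-in) scope.

    Works both bare (@define_scope) and with arguments
    (@define_scope(scope='name', initializer=...)).
    """
    if function is None:
        # Called with arguments only: return a decorator awaiting the function.
        return functools.partial(define_scope, scope=scope, **scope_kwargs)

    attribute = '_cache_' + function.__name__
    name = scope or function.__name__

    @property
    @functools.wraps(function)
    def decorator(self):
        if not hasattr(self, attribute):
            # With TensorFlow 1.x, tf.variable_scope would be used here instead.
            with fake_variable_scope(name, **scope_kwargs):
                setattr(self, attribute, function(self))
        return getattr(self, attribute)

    return decorator


class Demo:
    """Hypothetical example class."""

    @define_scope
    def prediction(self):
        return 'prediction-node'

    @define_scope(scope='training', initializer='zeros')
    def optimize(self):
        return 'optimize-node'


demo = Demo()
demo.prediction
demo.prediction  # cached: the scope is not opened a second time
demo.optimize
```

The bare form falls back to the function name as the scope name, while the parameterized form passes any extra keyword arguments through to the scope unchanged.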

We can now define models in a structured and compact way, which results in cleanly organized computation graphs.

Time: 2024-10-05 10:04:03
