Theano Study Notes (5): Configuration Settings and Compiling Modes

Configuration

The config module contains the attributes that modify Theano's behavior. Many of them are checked when Theano is imported, and some are read-only.

By convention, attributes of the config module should not be modified inside user code.

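Every attribute has a default value. You can override the defaults in your .theanorc file, and override those values in turn with the THEANO_FLAGS environment variable.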

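The order of precedence, from highest to lowest, is:

1. An assignment to theano.config.<property>

2. An assignment in THEANO_FLAGS

3. An assignment in the .theanorc file (or in the file indicated by THEANORC)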
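The precedence order above can be sketched in plain Python 3. This is only an illustration, not Theano's actual implementation: the helper names parse_theano_flags, parse_theanorc and resolve_setting are hypothetical, and the sketch assumes only that THEANO_FLAGS is a comma-separated list of key=value pairs and that .theanorc keeps its settings in a [global] section.

```python
import configparser

def parse_theano_flags(flags):
    """Split a THEANO_FLAGS-style string such as 'floatX=float32,mode=FAST_RUN'."""
    settings = {}
    for item in flags.split(','):
        if '=' in item:
            key, value = item.split('=', 1)
            settings[key.strip()] = value.strip()
    return settings

def parse_theanorc(text):
    """Read the [global] section of a .theanorc-style INI text."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive (floatX)
    parser.read_string(text)
    return dict(parser['global']) if parser.has_section('global') else {}

def resolve_setting(name, default, rc_values, flag_values, assigned):
    """Return the value that wins under the precedence order listed above."""
    for source in (assigned, flag_values, rc_values):  # highest priority first
        if name in source:
            return source[name]
    return default

rc = parse_theanorc("[global]\nfloatX = float64\nmode = FAST_COMPILE\n")
flags = parse_theano_flags("floatX=float32")
print(resolve_setting('floatX', 'float64', rc, flags, {}))  # flag beats .theanorc
print(resolve_setting('mode', 'FAST_RUN', rc, flags, {}))   # only .theanorc sets it
```

A direct assignment (the last argument) would win over both of the other sources, which is why the logistic regression script further down can force float32 regardless of the environment.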

You can display the current configuration by printing theano.config:

python -c 'import theano; print(theano.config)' | less

For example, modify the logistic regression function from Notes (2) so that it uses float32 precision:

#!/usr/bin/env python
# Theano tutorial
# Solution to Exercise in section 'Configuration Settings and Compiling Modes'

import numpy
import theano
import theano.tensor as tt

# Use single precision for every floatX variable
theano.config.floatX = 'float32'

rng = numpy.random

N = 400
feats = 784
D = (rng.randn(N, feats).astype(theano.config.floatX),
     rng.randint(size=N, low=0, high=2).astype(theano.config.floatX))
training_steps = 10000

# Declare Theano symbolic variables
x = tt.matrix("x")
y = tt.vector("y")
w = theano.shared(rng.randn(feats).astype(theano.config.floatX), name="w")
b = theano.shared(numpy.asarray(0., dtype=theano.config.floatX), name="b")
x.tag.test_value = D[0]
y.tag.test_value = D[1]
# print("Initial model:")
# print(w.get_value(), b.get_value())

# Construct Theano expression graph
p_1 = 1 / (1 + tt.exp(-tt.dot(x, w) - b))  # Probability of having a one
prediction = p_1 > 0.5  # The prediction that is done: 0 or 1
xent = -y * tt.log(p_1) - (1 - y) * tt.log(1 - p_1)  # Cross-entropy
cost = tt.cast(xent.mean(), 'float32') + 0.01 * (w ** 2).sum()  # The cost to optimize
gw, gb = tt.grad(cost, [w, b])

# Compile expressions to functions
train = theano.function(
            inputs=[x, y],
            outputs=[prediction, xent],
            updates={w: w - 0.01 * gw, b: b - 0.01 * gb},
            name="train")
predict = theano.function(inputs=[x], outputs=prediction,
            name="predict")

# Inspect the compiled graph to see whether the BLAS ops run on the CPU or the GPU
op_names = [node.op.__class__.__name__ for node in train.maker.fgraph.toposort()]
if any(name in ['Gemv', 'CGemv', 'Gemm', 'CGemm'] for name in op_names):
    print('Used the cpu')
elif any(name in ['GpuGemm', 'GpuGemv'] for name in op_names):
    print('Used the gpu')
else:
    print('ERROR, not able to tell if theano used the cpu or the gpu')
    print(train.maker.fgraph.toposort())

for i in range(training_steps):
    pred, err = train(D[0], D[1])
# print("Final model:")
# print(w.get_value(), b.get_value())

print("target values for D")
print(D[1])

print("prediction on D")
print(predict(D[0]))

Running it with time python file.py gives:

real  0m15.055s
user 0m11.527s
sys   0m0.801s

Mode

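Every time theano.function is called, the symbolic relationships between the Theano input and output variables are optimized and compiled.

How this compilation is carried out is controlled by the value of the mode parameter.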

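Theano defines the following modes:

FAST_COMPILE:

compile.mode.Mode(linker='py', optimizer='fast_compile')

Applies only a few graph optimizations and uses only Python implementations.

FAST_RUN:

compile.mode.Mode(linker='cvm', optimizer='fast_run')

Applies all optimizations and uses C implementations where possible.

DebugMode:

compile.debugmode.DebugMode()

Verifies the correctness of all optimizations and compares the C and Python implementations. This mode takes much longer than the others, but it can identify many kinds of problems.

ProfileMode (deprecated):

compile.profilemode.ProfileMode()

Applies the same optimizations as FAST_RUN, but prints some profiling information.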

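The default mode is FAST_RUN. It can be changed through the configuration variable config.mode, and that value can in turn be overridden by passing the mode keyword argument to theano.function.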

Linkers

A mode is composed of two parts: an optimizer and a linker.
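The division of labor between the two parts can be sketched with a toy expression compiler in plain Python. Nothing here is Theano code: the functions optimize, link and compile_toy are hypothetical, the "graph" is just a nested tuple, and the sketch only aims to show that the optimizer rewrites the graph while the linker turns it into something callable.

```python
def optimize(expr):
    """Toy optimizer: rewrite ('add', e, 0) and ('mul', e, 1) to e."""
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr[0], optimize(expr[1]), optimize(expr[2])
    if (op == 'add' and right == 0) or (op == 'mul' and right == 1):
        return left
    return (op, left, right)

def link(expr):
    """Toy linker: turn an expression tree into a Python callable of x."""
    if expr == 'x':
        return lambda x: x
    if not isinstance(expr, tuple):
        return lambda x: expr  # numeric constant
    op, left, right = expr[0], link(expr[1]), link(expr[2])
    if op == 'add':
        return lambda x: left(x) + right(x)
    return lambda x: left(x) * right(x)

def compile_toy(expr, use_optimizer=True):
    """A toy 'mode': an optional optimization pass followed by the linker."""
    return link(optimize(expr) if use_optimizer else expr)

graph = ('mul', ('add', ('mul', 'x', 10), 0), 1)
print(optimize(graph))  # ('mul', 'x', 10)
print(compile_toy(graph)(5), compile_toy(graph, use_optimizer=False)(5))  # 50 50
```

Both variants compute the same value; the optimized one simply evaluates a smaller graph, which is the kind of trade-off the predefined modes make between compile time and run time.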

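[1] gc refers to garbage collection of the intermediate results produced during computation. Without it, the memory used by the ops is kept between Theano function calls, which avoids reallocating memory and lowers the overhead, making execution faster.

[2] Default linker

[3] Deprecated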
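The trade-off described in note [1] can be illustrated with a toy class in plain Python. ToyFunction is hypothetical and unrelated to Theano's internals; it only counts how often storage for an intermediate result has to be allocated, with and without releasing it after each call.

```python
class ToyFunction:
    """Toy stand-in for a compiled function computing sum(2 * x + 1)."""

    def __init__(self, gc):
        self.gc = gc            # True: free the intermediate after each call
        self.buffer = None      # storage for the intermediate result
        self.allocations = 0    # how many times storage was allocated

    def __call__(self, xs):
        if self.buffer is None or len(self.buffer) != len(xs):
            self.buffer = [0.0] * len(xs)  # allocate intermediate storage
            self.allocations += 1
        for i, x in enumerate(xs):
            self.buffer[i] = 2 * x + 1
        result = sum(self.buffer)
        if self.gc:
            self.buffer = None             # release memory between calls
        return result

with_gc, without_gc = ToyFunction(gc=True), ToyFunction(gc=False)
for _ in range(3):
    with_gc([1.0, 2.0])
    without_gc([1.0, 2.0])
print(with_gc.allocations, without_gc.allocations)  # 3 1
```

With gc enabled the memory is free between calls but must be reallocated every time; without it, one allocation is reused at the cost of holding the memory.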

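Using DebugMode

In general you should use the FAST_RUN or FAST_COMPILE mode. When you define new kinds of expressions or new optimizations, however, it is useful to run your code with DebugMode first (mode='DebugMode'). DebugMode runs a number of self-checks and assertions that help diagnose programming errors which could lead to incorrect output. Note that DebugMode is much slower than FAST_RUN or FAST_COMPILE, so use it only during development.

For example: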

import theano
import theano.tensor as T
x = T.dvector('x')
f = theano.function([x], 10 * x, mode='DebugMode')
f([5])
f([0])
f([7])

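If a problem is detected during the run, DebugMode raises an exception that describes what went wrong. If you still cannot resolve it, consult an expert in this area.

DebugMode is not a cure-all, though, because some errors only show up for particular input values.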
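The idea of cross-checking two implementations, and the reason such a check can still miss bugs, can be sketched in plain Python. This is not how DebugMode is implemented; softplus_reference, softplus_fast and checked_call are hypothetical names, and the bug in the "fast" version is planted on purpose.

```python
import math

def softplus_reference(x):
    """Slow but straightforward implementation: log(1 + exp(x))."""
    return math.log(1.0 + math.exp(x))

def softplus_fast(x):
    """'Optimized' variant with a deliberate bug for large inputs."""
    if x > 30.0:
        return 0.0  # bug: should return x
    return math.log1p(math.exp(x))

def checked_call(x, tol=1e-9):
    """Run both implementations and raise if they disagree."""
    expected, got = softplus_reference(x), softplus_fast(x)
    if abs(expected - got) > tol * max(1.0, abs(expected)):
        raise AssertionError("mismatch at x=%r: %r vs %r" % (x, expected, got))
    return got

for value in (0.0, 1.5, -3.0):
    checked_call(value)  # passes: the bug is not exercised
try:
    checked_call(40.0)   # only this input reveals the bug
except AssertionError as err:
    print("detected:", err)
```

The first three calls pass even though the fast version is wrong, so a clean checked run only says something about the inputs that were actually tried.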

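If you instantiate DebugMode through its constructor instead of passing the keyword 'DebugMode', you can configure its behavior through the constructor arguments. The keyword version is very strict.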

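ProfileMode (deprecated)

Retrieving timing information

Once the graph is compiled, simply run it. Then call profmode.print_summary(), which reports timing information such as where your graph spends most of its time.

Again, take logistic regression as the example.

Create a ProfileMode instance: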

from theano import ProfileMode
profmode = theano.ProfileMode(optimizer='fast_run', linker=theano.gof.OpWiseCLinker())

Pass it as the mode argument at the end of the theano.function call:

train = theano.function(
            inputs=[x, y],
            outputs=[prediction, xent],
            updates={w: w - 0.01 * gw, b: b - 0.01 * gb},
            name="train", mode=profmode)
# For a Module, declare it like this instead:
# m = theano.Module()
# minst = m.make(mode=profmode)

Retrieving the timing information

Add this at the end of the file:

profmode.print_summary()

The output then looks like this:

ProfileMode.print_summary()
---------------------------

Time since import 6.183s
Theano compile time: 0.000s (0.0% since import)
    Optimization time: 0.000s
    Linker time: 0.000s
Theano fct call 5.452s (88.2% since import)
   Theano Op time 5.003s 80.9%(since import) 91.8%(of fct call)
   Theano function overhead in ProfileMode 0.449s 7.3%(since import) 8.2%(of fct call)
10000 Theano fct call, 0.001s per call
Rest of the time since import 0.730s 11.8%

Theano fct summary:
<% total fct time> <total time> <time per call> <nb call> <fct name>
   100.0% 5.452s 5.45e-04s 10000 train

Single Op-wise summary:
<% of local_time spent on this kind of Op> <cumulative %> <self seconds> <cumulative seconds> <time per call> [*] <nb_call> <nb_op> <nb_apply> <Op name>
   87.9%  87.9%  4.400s  4.400s 2.20e-04s * 20000  1  2 <class 'theano.tensor.blas_c.CGemv'>
   10.8%  98.8%  0.542s  4.942s 5.42e-06s * 100000 10 10 <class 'theano.tensor.elemwise.Elemwise'>
    0.5%  99.3%  0.023s  4.966s 1.17e-06s * 20000  1  2 <class 'theano.tensor.basic.Alloc'>
    0.4%  99.6%  0.018s  4.984s 6.05e-07s * 30000  2  3 <class 'theano.tensor.elemwise.DimShuffle'>
    0.3%  99.9%  0.013s  4.997s 1.25e-06s * 10000  1  1 <class 'theano.tensor.elemwise.Sum'>
    0.1% 100.0%  0.007s  5.003s 3.35e-07s * 20000  1  2 <class 'theano.compile.ops.Shape_i'>
   ... (remaining 0 single Op account for 0.00%(0.00s) of the runtime)
(*) Op is running a c implementation

Op-wise summary:
<% of local_time spent on this kind of Op> <cumulative %> <self seconds> <cumulative seconds> <time per call> [*] <nb_call> <nb apply> <Op name>
   87.9%  87.9%  4.400s  4.400s 2.20e-04s * 20000  2 CGemv{inplace}
    6.3%  94.3%  0.318s  4.718s 3.18e-05s * 10000  1 Elemwise{Composite{[Composite{[Composite{[sub(mul(i0, i1), neg(i2))]}(i0, scalar_softplus(i1), mul(i2, i3))]}(i0, i1, i2, scalar_softplus(i3))]}}
    2.1%  96.3%  0.103s  4.820s 1.03e-05s * 10000  1 Elemwise{Composite{[Composite{[Composite{[Composite{[mul(i0, add(i1, i2))]}(i0, neg(i1), true_div(i2, i3))]}(i0, mul(i1, i2, i3), i4, i5)]}(i0, i1, i2, exp(i3), i4, i5)]}}[(0, 0)]
    1.6%  98.0%  0.082s  4.902s 8.16e-06s * 10000  1 Elemwise{ScalarSigmoid{output_types_preference=transfer_type{0}}}[(0, 0)]
    0.5%  98.4%  0.023s  4.925s 1.17e-06s * 20000  2 Alloc
    0.3%  98.7%  0.013s  4.938s 1.25e-06s * 10000  1 Sum
    0.2%  98.9%  0.012s  4.950s 6.11e-07s * 20000  2 InplaceDimShuffle{x}
    0.2%  99.1%  0.008s  4.959s 8.44e-07s * 10000  1 Elemwise{gt,no_inplace}
    0.1%  99.2%  0.007s  4.965s 6.80e-07s * 10000  1 Elemwise{sub,no_inplace}
    0.1%  99.4%  0.007s  4.972s 3.35e-07s * 20000  2 Shape_i{0}
    0.1%  99.5%  0.006s  4.978s 6.11e-07s * 10000  1 Elemwise{Composite{[sub(neg(i0), i1)]}}[(0, 0)]
    0.1%  99.6%  0.006s  4.984s 5.93e-07s * 10000  1 InplaceDimShuffle{1,0}
    0.1%  99.7%  0.005s  4.989s 5.33e-07s * 10000  1 Elemwise{neg,no_inplace}
    0.1%  99.8%  0.005s  4.994s 4.85e-07s * 10000  1 Elemwise{Cast{float32}}
    0.1%  99.9%  0.005s  4.999s 4.60e-07s * 10000  1 Elemwise{inv,no_inplace}
    0.1% 100.0%  0.004s  5.003s 4.25e-07s * 10000  1 Elemwise{Composite{[sub(i0, mul(i1, i2))]}}[(0, 0)]
   ... (remaining 0 Op account for   0.00%(0.00s) of the runtime)
(*) Op is running a c implementation

Apply-wise summary:
<% of local_time spent at this position> <cumulative %%> <apply time> <cumulative seconds> <time per call> [*] <nb_call> <Apply position> <Apply Op name>
   54.7%  54.7%  2.737s  2.737s 2.74e-04s  * 10000  7 CGemv{inplace}(Alloc.0, TensorConstant{1.0}, x, w, TensorConstant{0.0})
   33.2%  87.9%  1.663s  4.400s 1.66e-04s  * 10000 18 CGemv{inplace}(w, TensorConstant{-0.00999999977648}, x.T, Elemwise{Composite{[Composite{[Composite{[Composite{[mul(i0, add(i1, i2))]}(i0, neg(i1), true_div(i2, i3))]}(i0, mul(i1, i2, i3), i4, i5)]}(i0, i1, i2, exp(i3), i4, i5)]}}[(0, 0)].0, TensorConstant{0.999800026417})
    6.3%  94.3%  0.318s  4.718s 3.18e-05s  * 10000 13 Elemwise{Composite{[Composite{[Composite{[sub(mul(i0, i1), neg(i2))]}(i0, scalar_softplus(i1), mul(i2, i3))]}(i0, i1, i2, scalar_softplus(i3))]}}(y, Elemwise{Composite{[sub(neg(i0), i1)]}}[(0, 0)].0, Elemwise{sub,no_inplace}.0, Elemwise{neg,no_inplace}.0)
    2.1%  96.3%  0.103s  4.820s 1.03e-05s  * 10000 16 Elemwise{Composite{[Composite{[Composite{[Composite{[mul(i0, add(i1, i2))]}(i0, neg(i1), true_div(i2, i3))]}(i0, mul(i1, i2, i3), i4, i5)]}(i0, i1, i2, exp(i3), i4, i5)]}}[(0, 0)](Elemwise{ScalarSigmoid{output_types_preference=transfer_type{0}}}[(0, 0)].0, Alloc.0, y, Elemwise{Composite{[sub(neg(i0), i1)]}}[(0, 0)].0, Elemwise{sub,no_inplace}.0, Elemwise{Cast{float32}}.0)
    1.6%  98.0%  0.082s  4.902s 8.16e-06s  * 10000 14 Elemwise{ScalarSigmoid{output_types_preference=transfer_type{0}}}[(0, 0)](Elemwise{neg,no_inplace}.0)
    0.3%  98.3%  0.015s  4.917s 1.53e-06s  * 10000 12 Alloc(Elemwise{inv,no_inplace}.0, Shape_i{0}.0)
    0.3%  98.5%  0.013s  4.930s 1.25e-06s  * 10000 17 Sum(Elemwise{Composite{[Composite{[Composite{[Composite{[mul(i0, add(i1, i2))]}(i0, neg(i1), true_div(i2, i3))]}(i0, mul(i1, i2, i3), i4, i5)]}(i0, i1, i2, exp(i3), i4, i5)]}}[(0, 0)].0)
    0.2%  98.7%  0.008s  4.938s 8.44e-07s  * 10000 15 Elemwise{gt,no_inplace}(Elemwise{ScalarSigmoid{output_types_preference=transfer_type{0}}}[(0, 0)].0, TensorConstant{(1,) of 0.5})
    0.2%  98.9%  0.008s  4.946s 8.14e-07s  * 10000  5 Alloc(TensorConstant{0.0}, Shape_i{0}.0)
    0.1%  99.0%  0.007s  4.953s 6.80e-07s  * 10000  4 Elemwise{sub,no_inplace}(TensorConstant{(1,) of 1.0}, y)
    0.1%  99.1%  0.006s  4.959s 6.16e-07s  * 10000  6 InplaceDimShuffle{x}(Shape_i{0}.0)
    0.1%  99.2%  0.006s  4.965s 6.11e-07s  * 10000  9 Elemwise{Composite{[sub(neg(i0), i1)]}}[(0, 0)](CGemv{inplace}.0, InplaceDimShuffle{x}.0)
    0.1%  99.4%  0.006s  4.972s 6.07e-07s  * 10000  0 InplaceDimShuffle{x}(b)
    0.1%  99.5%  0.006s  4.977s 5.93e-07s  * 10000  2 InplaceDimShuffle{1,0}(x)
    0.1%  99.6%  0.005s  4.983s 5.33e-07s  * 10000 11 Elemwise{neg,no_inplace}(Elemwise{Composite{[sub(neg(i0), i1)]}}[(0, 0)].0)
   ... (remaining 5 Apply instances account for 0.41%(0.02s) of the runtime)
(*) Op is running a c implementation
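To make the columns of these summaries easier to read, here is a minimal sketch in plain Python of how a per-operation table with percentage, cumulative percentage, seconds and call count can be produced. ToyProfiler is a hypothetical class, not part of Theano, and the two "ops" it times are ordinary Python lambdas.

```python
import time
from collections import defaultdict

class ToyProfiler:
    """Accumulate wall-clock time and call counts per named operation."""

    def __init__(self):
        self.seconds = defaultdict(float)
        self.calls = defaultdict(int)

    def run(self, name, func, *args):
        start = time.perf_counter()
        result = func(*args)
        self.seconds[name] += time.perf_counter() - start
        self.calls[name] += 1
        return result

    def summary(self):
        """Return rows (percent, cumulative percent, seconds, calls, name)."""
        total = sum(self.seconds.values()) or 1.0
        rows, cumulative = [], 0.0
        for name in sorted(self.seconds, key=self.seconds.get, reverse=True):
            share = 100.0 * self.seconds[name] / total
            cumulative += share
            rows.append((share, cumulative, self.seconds[name], self.calls[name], name))
        return rows

prof = ToyProfiler()
for _ in range(1000):
    dot = prof.run('dot', lambda: sum(a * b for a, b in zip(range(200), range(200))))
    prof.run('neg', lambda v: -v, dot)
for share, cumulative, secs, calls, name in prof.summary():
    print("%6.1f%% %6.1f%% %8.5fs %6d %s" % (share, cumulative, secs, calls, name))
```

As in the real output above, rows are sorted by time spent, so the first row is where optimization effort pays off most; in the logistic regression run that row is CGemv.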


Time: 2024-11-11 10:01:11
