Please credit the source when reposting:
http://www.cnblogs.com/darkknightzh/p/7608916.html
References:
https://stackoverflow.com/questions/39758094/clearing-tensorflow-gpu-memory-after-model-execution
https://github.com/tensorflow/tensorflow/issues/1727#issuecomment-285815312s
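In TensorFlow, if you configure the GPU inside a function, TF allocates GPU memory, and that memory is not released when the function returns (Torch7 seems to behave the same way). The second reference explains why: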
As for the original problem, currently the Allocator in the GPUDevice belongs to the ProcessState, which is essentially a global singleton. The first session using GPU initializes it, and frees itself when the process shuts down. Even if a second session chooses a different GPUOptions, it would not take effect.
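In other words, once the first session has initialized the GPU, a second session that tries to initialize it with different GPU options has no effect, even if the first session has already released its memory.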
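The "first caller wins" behavior described in the quote can be pictured with a toy, pure-Python model. This is only an analogy and not TensorFlow's actual implementation: `ToyAllocator`, `get_allocator`, and `memory_fraction` are hypothetical names made up for illustration.

```python
# Toy analogy only: this is NOT TensorFlow code, and every name here is hypothetical.

_process_allocator = None  # process-wide singleton; it lives until the process exits


class ToyAllocator:
    def __init__(self, memory_fraction):
        self.memory_fraction = memory_fraction


def get_allocator(memory_fraction):
    """Return the process-wide allocator, creating it on first use only."""
    global _process_allocator
    if _process_allocator is None:
        # Only the first caller's options are ever used.
        _process_allocator = ToyAllocator(memory_fraction)
    return _process_allocator


first = get_allocator(memory_fraction=0.9)   # stands in for the first session
second = get_allocator(memory_fraction=0.2)  # stands in for a second session with other options

print(second is first)         # True: both calls share one allocator
print(second.memory_fraction)  # 0.9: the second call's options were ignored
```

Because the singleton is tied to the process, the only way to get a fresh one in this model is to start a new process.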
In the first reference, Oli Blum points out that the way to free the GPU memory is to "use processes and shut them down after the computation". The code is as follows (see the first reference):
# Written against the TensorFlow 1.x API (tf.Session, tf.placeholder), Python 3 syntax.
import multiprocessing

import numpy as np
import tensorflow as tf


def run_tensorflow():
    n_input = 10000
    n_classes = 1000

    # Create model: a single linear layer (no activation is applied)
    def multilayer_perceptron(x, weight):
        layer_1 = tf.matmul(x, weight)
        return layer_1

    # Weight matrix of the layer
    weights = tf.Variable(tf.random_normal([n_input, n_classes]))

    x = tf.placeholder("float", [None, n_input])
    y = tf.placeholder("float", [None, n_classes])
    pred = multilayer_perceptron(x, weights)

    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)

    init = tf.global_variables_initializer()

    with tf.Session() as sess:
        sess.run(init)

        # Train on random data; the point is only to make TF allocate GPU memory
        for i in range(100):
            batch_x = np.random.rand(10, n_input)
            batch_y = np.random.rand(10, n_classes)
            sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})

    print("finished doing stuff with tensorflow!")


if __name__ == "__main__":
    # option 1: execute code with extra process
    p = multiprocessing.Process(target=run_tensorflow)
    p.start()
    p.join()

    # wait until user presses enter key
    input()

    # option 2: just execute the function
    run_tensorflow()

    # wait until user presses enter key
    input()
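When run_tensorflow is run through multiprocessing.Process, the GPU memory is released automatically once the child process exits; when run_tensorflow is called directly, the memory is not released. Note that this function does very little computation, so on a fast GPU you may not be able to observe the allocate, compute, and release cycle of the multiprocessing.Process run at all, and it can look as if nothing ran.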
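In practice you usually want a result back from the child process, which the listing above does not show. The following is a minimal sketch, not code from either reference: `run_in_subprocess`, `_worker`, and `fake_training` are hypothetical names, and `fake_training` is a plain-Python stand-in for a function such as run_tensorflow. It uses only the standard library's multiprocessing module and passes the result back through a queue.

```python
import multiprocessing


def _worker(queue, fn, args):
    """Run fn(*args) in the child and send back ("ok", value) or ("error", text)."""
    try:
        queue.put(("ok", fn(*args)))
    except Exception as exc:
        queue.put(("error", repr(exc)))


def run_in_subprocess(fn, args=(), start_method=None):
    """Run fn(*args) in a short-lived child process and return its result.

    start_method=None uses the platform default; "fork" or "spawn" can be
    passed explicitly. With "spawn", fn and args must be picklable.
    """
    ctx = multiprocessing.get_context(start_method)
    queue = ctx.Queue()
    proc = ctx.Process(target=_worker, args=(queue, fn, args))
    proc.start()
    status, payload = queue.get()  # read before join() so the child is never blocked on the queue
    proc.join()                    # once the child has exited, its resources are gone
    if status == "error":
        raise RuntimeError("child process failed: " + payload)
    return payload


def fake_training(n_steps):
    # Hypothetical stand-in for a GPU workload such as run_tensorflow.
    return sum(i * 0.5 for i in range(n_steps))


if __name__ == "__main__":
    print(run_in_subprocess(fake_training, args=(100,)))
```

The result has to be picklable to travel through the queue, so return plain Python or NumPy values rather than TensorFlow objects. This sketch assumes the child always reaches `queue.put`; if the child is killed outright, `queue.get()` would wait forever, so a timeout may be worth adding.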