Reducing and Profiling GPU Memory Usage in Keras with TensorFlow Backend

Adaptive GPU memory allocation in keras & clearing unused variables to free GPU memory

Intro

Are you running out of GPU memory when using keras or tensorflow deep learning models, but only some of the time?

Are you curious about exactly how much GPU memory your tensorflow model uses during training?

Are you wondering if you can run two or more keras models on your GPU at the same time?

Background

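By default, tensorflow pre-allocates nearly all of the available GPU memory. That gets in the way when several processes have to share a GPU in production, and it makes memory profiling meaningless, because the reported usage reflects the reservation rather than what the model actually needs.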

When keras uses tensorflow for its back-end, it inherits this behavior.

Setting tensorflow GPU memory options

For new models

Thankfully, tensorflow allows you to change how it allocates GPU memory, and to set a limit on how much GPU memory it is allowed to allocate.

Let’s set GPU options on keras’s example, Sequence classification with LSTM:

## keras example imports
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Embedding
from keras.layers import LSTM

## extra imports to set GPU options
import tensorflow as tf
from keras import backend as k

###################################
# TensorFlow wizardry
# (tf.ConfigProto and tf.Session are the TensorFlow 1.x API; TensorFlow 2.x
# exposes them as tf.compat.v1.ConfigProto and tf.compat.v1.Session)
config = tf.ConfigProto()

# Don't pre-allocate memory; allocate as-needed
config.gpu_options.allow_growth = True

# Only allow a total of half the GPU memory to be allocated
#config.gpu_options.per_process_gpu_memory_fraction = 0.5

# Create a session with the above options specified.
k.tensorflow_backend.set_session(tf.Session(config=config))
###################################

# max_features, x_train, y_train, x_test and y_test are not defined in this
# listing; they are assumed to be set up beforehand.
model = Sequential()
model.add(Embedding(max_features, output_dim=256))
model.add(LSTM(128))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.fit(x_train, y_train, batch_size=16, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=16)

After the above, creating the sequence classification model no longer reserves GPU memory up front. Instead, tensorflow allocates GPU memory as needed during the calls to model.fit() and model.evaluate().

Additionally, with per_process_gpu_memory_fraction = 0.5 (commented out in the listing above), tensorflow will only allocate a total of half the available GPU memory.
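To make the arithmetic behind the fraction concrete, here is a minimal sketch that is not part of the original post. It converts a fraction into a memory budget and picks an equal fraction for several processes sharing one GPU. The function names, the 8192 MiB card size and the 10% headroom are illustrative assumptions. The headroom is a guess at what to leave free for memory that falls outside tensorflow's limit, so tune it for your setup.

```python
def memory_budget_mib(total_mib, fraction):
    """Return the MiB a process may allocate under the given fraction."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return total_mib * fraction


def fraction_per_process(n_processes, headroom=0.1):
    """Return an equal per-process fraction for processes sharing one GPU.

    headroom is an assumed safety margin that is left unallocated.
    """
    if n_processes < 1:
        raise ValueError("n_processes must be at least 1")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    return (1.0 - headroom) / n_processes


total_mib = 8192  # hypothetical 8 GiB card
print(memory_budget_mib(total_mib, 0.5))  # 4096.0
print(fraction_per_process(2))            # roughly 0.45 for two models
```

The value returned by fraction_per_process is what you would assign to config.gpu_options.per_process_gpu_memory_fraction in each process.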

If it tries to allocate more than half of the total GPU memory, tensorflow will throw a ResourceExhaustedError, and you’ll get a lengthy stack trace.
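One way to react to that error, sketched here as an assumption rather than something the original post does, is to retry with a smaller batch size. The helper below is plain Python: fit_with_backoff, fit_fn and FakeOOM are hypothetical names, and the fake training function only simulates the failure so the logic can be run without a GPU. Whether a retry inside the same process actually succeeds depends on your model and tensorflow version.

```python
def fit_with_backoff(fit_fn, batch_size, oom_errors, min_batch_size=1):
    """Call fit_fn(batch_size), halving batch_size after each OOM error.

    fit_fn     -- callable taking a batch size, e.g. a wrapper around model.fit
    oom_errors -- exception class (or tuple of classes) that signals
                  out-of-memory, e.g. tf.errors.ResourceExhaustedError
    Returns a (result, batch_size_used) tuple.
    """
    while True:
        try:
            return fit_fn(batch_size), batch_size
        except oom_errors:
            if batch_size <= min_batch_size:
                raise  # nothing smaller left to try
            batch_size = max(min_batch_size, batch_size // 2)


class FakeOOM(Exception):
    """Stand-in for tf.errors.ResourceExhaustedError in this demo."""


def fake_fit(batch_size):
    # Pretend that anything above a batch size of 4 exhausts GPU memory.
    if batch_size > 4:
        raise FakeOOM()
    return "trained"


print(fit_with_backoff(fake_fit, 16, FakeOOM))  # ('trained', 4)
```

With a real model, fit_fn could be lambda bs: model.fit(x_train, y_train, batch_size=bs, epochs=10) and oom_errors could be tf.errors.ResourceExhaustedError.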

If you have a Linux machine and an NVIDIA card, you can watch nvidia-smi to see how much GPU memory is in use, or configure a monitoring tool like monitorix to generate graphs for you.
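If you would rather read those numbers from a script than watch the terminal, nvidia-smi can print them as CSV. The sketch below is an addition to the original post: parse_gpu_memory and read_gpu_memory are hypothetical helper names, and the sample string only imitates the CSV output so the parser can be tried on a machine without a GPU.

```python
import subprocess

# Ask nvidia-smi for one CSV row per GPU: index, used MiB, total MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse the CSV rows into a list of dicts, one per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue  # skip blank lines
        index, used, total = (field.strip() for field in line.split(","))
        rows.append(
            {"index": int(index), "used_mib": int(used), "total_mib": int(total)}
        )
    return rows


def read_gpu_memory():
    """Run nvidia-smi and return the parsed rows (needs an NVIDIA driver)."""
    output = subprocess.check_output(QUERY, universal_newlines=True)
    return parse_gpu_memory(output)


# Illustrative sample, not real measurements.
sample = "0, 1523, 8192\n1, 11, 8192\n"
rows = parse_gpu_memory(sample)
print(rows[0]["used_mib"])  # 1523
```

Note that nvidia-smi reports memory reserved by every process on the card, so the figure includes anything tensorflow has claimed but is not currently using.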

Figure: GPU memory usage, as shown in Monitorix for Linux

For a model that you’re loading

We can even set GPU memory management options for a model that’s already created and trained, and that we’re loading from disk for deployment or for further training.

For that, let’s tweak keras’s load_model example:

# keras example imports
from keras.models import load_model

## extra imports to set GPU options
import tensorflow as tf
from keras import backend as k

###################################
# TensorFlow wizardry
config = tf.ConfigProto()

# Don't pre-allocate memory; allocate as-needed
config.gpu_options.allow_growth = True

# Only allow a total of half the GPU memory to be allocated
config.gpu_options.per_process_gpu_memory_fraction = 0.5

# Create a session with the above options specified.
k.tensorflow_backend.set_session(tf.Session(config=config))
###################################

# returns a compiled model
# identical to the previous one
model = load_model('my_model.h5')

# TODO: classify all the things

Now, with your loaded model, you can open your favorite GPU monitoring tool and watch how the GPU memory usage changes under different loads.
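If you want a single number out of such a session, for example the peak usage while a batch of predictions runs, a small background sampler can record it. This is a sketch under assumptions, not the author's method: PeakMemorySampler is a hypothetical class, and read_used_mib is any callable you supply that returns the current usage in MiB (for instance one that wraps nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits). The fake reader below stands in for it.

```python
import threading
import time


class PeakMemorySampler:
    """Context manager that polls a reader and keeps the highest value seen."""

    def __init__(self, read_used_mib, interval_s=0.5):
        self._read = read_used_mib
        self._interval = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.peak_mib = 0

    def _run(self):
        # Sample until asked to stop, sleeping interval_s between samples.
        while not self._stop.is_set():
            self.peak_mib = max(self.peak_mib, self._read())
            self._stop.wait(self._interval)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._stop.set()
        self._thread.join()
        # Take one last sample so that very short workloads are still counted.
        self.peak_mib = max(self.peak_mib, self._read())
        return False


# Fake reader for demonstration; replace it with a real GPU query.
readings = iter([100, 900, 400])


def fake_reader():
    return next(readings, 0)


with PeakMemorySampler(fake_reader, interval_s=0.01) as sampler:
    time.sleep(0.1)  # stands in for model.predict(...) or model.fit(...)

print(sampler.peak_mib)
```

Because the sampler only polls, spikes shorter than interval_s can be missed. With allow_growth enabled, tensorflow does not hand memory back once it has grown, so the peak mostly reflects the largest allocation reached so far.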

Conclusion

Good news everyone! That sweet deep learning model you just made doesn’t actually need all that memory it usually claims!

And, now that you can tell tensorflow not to pre-allocate memory, you can get a much better idea of what kind of rig(s) you need in order to deploy your model into production.

Is this how you’re handling GPU memory management issues with tensorflow or keras?

Did I miss a better, cleaner way of handling GPU memory allocation with tensorflow and keras?

Let me know in the comments!

====================================================================================

How to remove stale models from GPU memory

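import gc
from keras import backend as K
from keras.models import Model, load_model

# tmp_model_name is a file path of your choosing, e.g. one ending in .h5
m = Model(.....)
m.save(tmp_model_name)          # persist the model before discarding it
del m                           # drop the Python reference to the old model
K.clear_session()               # destroy the current TensorFlow graph and session
gc.collect()                    # collect whatever is now unreachable
m = load_model(tmp_model_name)  # reload the model into a fresh session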

References: https://michaelblogscode.wordpress.com/2017/10/10/reducing-and-profiling-gpu-memory-usage-in-keras-with-tensorflow-backend/

https://github.com/keras-team/keras/issues/5345


Original post: https://www.cnblogs.com/jins-note/p/9687181.html

Date: 2024-12-23 20:31:27
