Multithreaded Video Capture and Deep Learning Inference on the TX2

Background

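When deploying a deep learning model on the TX2, the usual setup is to read frames from a camera stream or a video file and run a task such as object detection on them. For larger models, inference cannot keep up with the frame rate of the camera or video. With a single thread that reads one frame and then runs detection on it, inference blocks the video stream, and the visible result is a camera feed that lags far behind real time. For applications with strict real-time requirements, this latency is unacceptable. The fix is to use multiple threads: video capture and deep learning inference run in two separate threads so that neither blocks the other, which keeps the output real-time.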
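
To make the latency argument concrete, here is a rough back-of-the-envelope sketch. It assumes a simplified model that is not part of the original post: frames are queued without limit and processed strictly in order, one inference per frame. The function name `capture_lag` and the numbers (30 fps, 100 ms per inference) are hypothetical.

```python
def capture_lag(frame_index, fps, infer_seconds):
    """Seconds between capture of frame `frame_index` and the start of its inference.

    Simplified model (assumption): frames are queued without limit and are
    processed strictly in order, one inference per frame.
    """
    # Each processed frame costs infer_seconds, but a new frame arrives every 1/fps.
    per_frame_deficit = infer_seconds - 1.0 / fps
    return max(0.0, frame_index * per_frame_deficit)


if __name__ == '__main__':
    # Hypothetical numbers: 30 fps source, 100 ms per inference.
    for k in (0, 30, 300):
        print('frame %4d lags by %.1f s' % (k, capture_lag(k, 30, 0.1)))
```

Under these assumptions the lag grows linearly with the number of frames processed, which matches the behaviour described above: the longer the single-threaded loop runs, the further the displayed video falls behind.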

Implementation approach

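Video capture from the camera goes into a sub-thread, which acts as the producer. Inference stays in the main thread, which acts as the consumer: after the main thread finishes inference on one frame, it fetches the latest frame from the sub-thread and continues. A figure in the original blog post illustrates the relationship between the two threads.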
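
The producer/consumer split can be sketched without a camera or a model. The following is a minimal illustration, not the code used in this post: `LatestFrameSource` and `consume` are hypothetical names, frames are plain integers, and both the frame period and the inference time are simulated with `time.sleep`.

```python
import threading
import time


class LatestFrameSource:
    """Stand-in for a camera: a sub-thread keeps overwriting `latest`."""

    def __init__(self, period=0.005):
        self.period = period      # simulated time between frames
        self.latest = None        # most recent frame (an int here)
        self.running = False
        self.thread = None

    def _produce(self):
        frame_id = 0
        while self.running:
            self.latest = frame_id    # overwrite: older frames are dropped
            frame_id += 1
            time.sleep(self.period)

    def start(self):
        self.running = True
        self.thread = threading.Thread(target=self._produce, daemon=True)
        self.thread.start()

    def stop(self):
        self.running = False
        self.thread.join()


def consume(source, n_inferences=5, infer_seconds=0.03):
    """Main-thread consumer: a slow 'inference' on the newest frame each time."""
    seen = []
    for _ in range(n_inferences):
        while source.latest is None:   # wait for the first frame
            time.sleep(0.001)
        seen.append(source.latest)
        time.sleep(infer_seconds)      # pretend to run the model
    return seen


if __name__ == '__main__':
    src = LatestFrameSource()
    src.start()
    print(consume(src))   # frame ids jump ahead: intermediate frames were skipped
    src.stop()
```

Because the producer overwrites a single slot rather than filling a queue, the consumer always works on a recent frame and no backlog can build up. The price is that frames arriving during an inference are dropped.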

Code

Sub-thread

"""camera.py
This code implements the Camera class, which encapsulates code to
handle IP CAM, USB webcam or the Jetson onboard camera.  The Camera
class is further extended to take either a video or an image file as
input.
"""

import time
import logging
import threading

import numpy as np
import cv2

def open_cam_rtsp(uri, width, height, latency):
    """Open an RTSP URI (IP CAM)."""
    gst_str = ('rtspsrc location={} latency={} ! '
               'rtph264depay ! h264parse ! omxh264dec ! '
               'nvvidconv ! '
               'video/x-raw, width=(int){}, height=(int){}, '
               'format=(string)BGRx ! videoconvert ! '
               'appsink').format(uri, latency, width, height)
    return cv2.VideoCapture(gst_str, cv2.CAP_GSTREAMER)

def open_cam_usb(dev, width, height):
    """Open a USB webcam.
    We want to set width and height here, otherwise we could just do:
        return cv2.VideoCapture(dev)
    """
    gst_str = ('v4l2src device=/dev/video{} ! '
               'video/x-raw, width=(int){}, height=(int){}, '
               'format=(string)RGB ! videoconvert ! '
               'appsink').format(dev, width, height)
    return cv2.VideoCapture(gst_str, cv2.CAP_GSTREAMER)

def open_cam_onboard(width, height):
    """Open the Jetson onboard camera.
    On versions of L4T prior to 28.1, you might need to add
    'flip-method=2' into gst_str.
    """
    gst_str = ('nvcamerasrc ! '
               'video/x-raw(memory:NVMM), '
               'width=(int)2592, height=(int)1458, '
               'format=(string)I420, framerate=(fraction)30/1 ! '
               'nvvidconv ! '
               'video/x-raw, width=(int){}, height=(int){}, '
               'format=(string)BGRx ! videoconvert ! '
               'appsink').format(width, height)
    return cv2.VideoCapture(gst_str, cv2.CAP_GSTREAMER)

def grab_img(cam):
    """This 'grab_img' function is designed to be run in the sub-thread.
    Once started, this thread continues to grab a new image and put it
    into 'cam.img_handle', until 'cam.thread_running' is set to False.
    """
    while cam.thread_running:
        if cam.args.use_image:
            assert cam.img_handle is not None, 'img_handle is empty in use_image case!'
            # keep using the same img, no need to update it
            time.sleep(0.01)  # yield CPU to other threads
        else:
            _, img = cam.cap.read()
            if img is None:
                logging.warning('grab_img(): cap.read() returns None...')
                break
            # publish the frame only after it is known to be valid
            cam.img_handle = img
            # pace reads by the reported frame rate; some sources report 0,
            # so guard against a division by zero
            fps = cam.cap.get(cv2.CAP_PROP_FPS)
            if fps > 0:
                time.sleep(1 / fps)
    cam.thread_running = False

class Camera():
    """Camera class which supports reading images from these video sources:
    1. Video file
    2. Image (jpg, png, etc.) file, repeating indefinitely
    3. RTSP (IP CAM)
    4. USB webcam
    5. Jetson onboard camera
    """

    def __init__(self, args):
        self.args = args
        self.is_opened = False
        self.thread_running = False
        self.img_handle = None
        self.img_width = 0
        self.img_height = 0
        self.cap = None
        self.thread = None

    def open(self):
        """Open camera based on command line arguments."""
        assert self.cap is None, 'Camera is already opened!'
        args = self.args
        if args.use_file:
            self.cap = cv2.VideoCapture(args.filename)
            # ignore image width/height settings here
        elif args.use_image:
            self.cap = 'OK'
            self.img_handle = cv2.imread(args.filename)
            # ignore image width/height settings here
            if self.img_handle is not None:
                self.is_opened = True
                self.img_height, self.img_width, _ = self.img_handle.shape
        elif args.use_rtsp:
            self.cap = open_cam_rtsp(
                args.rtsp_uri,
                args.image_width,
                args.image_height,
                args.rtsp_latency
            )
        elif args.use_usb:
            self.cap = open_cam_usb(
                args.video_dev,
                args.image_width,
                args.image_height
            )
        else:  # by default, use the jetson onboard camera
            self.cap = open_cam_onboard(
                args.image_width,
                args.image_height
            )
        if self.cap != 'OK':
            if self.cap.isOpened():
                # Try to grab the 1st image and determine width and height
                _, img = self.cap.read()
                if img is not None:
                    self.img_height, self.img_width, _ = img.shape
                    self.is_opened = True

    def start(self):
        assert not self.thread_running
        self.thread_running = True
        self.thread = threading.Thread(target=grab_img, args=(self,))
        self.thread.start()

    def stop(self):
        self.thread_running = False
        self.thread.join()

    def read(self):
        if self.args.use_image:
            return np.copy(self.img_handle)
        else:
            return self.img_handle

    def release(self):
        assert not self.thread_running
        if self.cap != 'OK':
            self.cap.release()
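
One limitation of handing frames over through a plain attribute is that `read()` cannot tell whether the frame is new, so a fast consumer may process the same frame twice. A possible variant, which is not part of the original code, is a single-slot buffer with a sequence number and a `threading.Condition`. The class name `FrameSlot` and its methods are hypothetical.

```python
import threading


class FrameSlot:
    """Single-slot, thread-safe hand-off for the most recent frame."""

    def __init__(self):
        self._cond = threading.Condition()
        self._frame = None
        self._seq = 0          # incremented on every put()

    def put(self, frame):
        """Called by the capture thread; replaces any unread frame."""
        with self._cond:
            self._frame = frame
            self._seq += 1
            self._cond.notify_all()

    def get_newer(self, last_seq, timeout=1.0):
        """Block until a frame newer than `last_seq` exists.

        Returns (seq, frame), or (last_seq, None) on timeout.
        """
        with self._cond:
            ok = self._cond.wait_for(lambda: self._seq > last_seq, timeout)
            if not ok:
                return last_seq, None
            return self._seq, self._frame


if __name__ == '__main__':
    slot = FrameSlot()
    producer = threading.Thread(
        target=lambda: [slot.put('frame-%d' % i) for i in range(3)])
    producer.start()
    producer.join()
    seq, frame = slot.get_newer(0)
    print(seq, frame)
```

In such a design the capture thread would call `put()` instead of assigning to `img_handle`, and the main loop would pass the last sequence number it processed to `get_newer()`. Whether the extra synchronization is worth it depends on how fast the model is relative to the source.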

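Main thread

The main-thread program is built around the TensorFlow object-detection code. The part to focus on is how it reads from the camera or video; the camera or video source must be passed as command-line arguments at run time: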

# coding: utf-8
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

import cv2
import time
from PIL import Image

import tensorflow.contrib.tensorrt as trt
from camera import Camera
import argparse

os.environ['CUDA_VISIBLE_DEVICES'] = '0'

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
from object_detection.utils import ops as utils_ops

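# Compare versions numerically; a plain string comparison would treat '1.10.0' as older than '1.4.0'.
if tuple(int(v) for v in tf.__version__.split('.')[:2]) < (1, 4):
  raise ImportError('Please upgrade your tensorflow installation to v1.4.* or later!')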

from utils import label_map_util
from utils import visualization_utils as vis_util

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_FROZEN_GRAPH = 'data/ssd_mobilenet_coco_0129/frozen_inference_graph.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data/object-detection.pbtxt')

NUM_CLASSES = 12

VIDEO_NAME = 'data/2018-09-10_162811'
filename = VIDEO_NAME + '.mp4'

def parse_args():
    """Parse input arguments."""
    desc = ('This script captures and displays live camera video, '
            'and does real-time object detection with TF-TRT model '
            'on Jetson TX2/TX1')
    parser = argparse.ArgumentParser(description=desc)
    parser.add_argument('--file', dest='use_file',
                        help='use a video file as input (remember to '
                        'also set --filename)',
                        action='store_true')
    parser.add_argument('--image', dest='use_image',
                        help='use an image file as input (remember to '
                        'also set --filename)',
                        action='store_true')
    parser.add_argument('--filename', dest='filename',
                        help='video file name, e.g. test.mp4',
                        default='data/2018-09-10_162811.mp4', type=str)
    parser.add_argument('--rtsp', dest='use_rtsp',
                        help='use IP CAM (remember to also set --uri)',
                        action='store_true')
    parser.add_argument('--uri', dest='rtsp_uri',
                        help='RTSP URI, e.g. rtsp://admin:jiaxun123@192.168.170.119/H.264/ch1/main',
                        default=None, type=str)
    parser.add_argument('--latency', dest='rtsp_latency',
                        help='latency in ms for RTSP [200]',
                        default=200, type=int)
    parser.add_argument('--usb', dest='use_usb',
                        help='use USB webcam (remember to also set --vid)',
                        action='store_true')
    parser.add_argument('--vid', dest='video_dev',
                        help='device # of USB webcam (/dev/video?) [1]',
                        default=1, type=int)
    parser.add_argument('--width', dest='image_width',
                        help='image width [1280]',
                        default=1280, type=int)
    parser.add_argument('--height', dest='image_height',
                        help='image height [720]',
                        default=720, type=int)
    parser.add_argument('--confidence', dest='conf_th',
                        help='confidence threshold [0.3]',
                        default=0.3, type=float)
    args = parser.parse_args()
    return args

def detect_in_video():
    args = parse_args()
    detection_graph = tf.Graph()
    with detection_graph.as_default():
        od_graph_def = tf.GraphDef()
        with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
            serialized_graph = fid.read()
            od_graph_def.ParseFromString(serialized_graph)
            tf.import_graph_def(od_graph_def, name='')

    label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)

    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True

    with detection_graph.as_default():
        with tf.Session(graph=detection_graph,config=config) as sess:
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            detection_boxes = detection_graph.get_tensor_by_name(
                'detection_boxes:0')
            detection_scores = detection_graph.get_tensor_by_name(
                'detection_scores:0')
            detection_classes = detection_graph.get_tensor_by_name(
                'detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name(
                'num_detections:0')

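            cam = Camera(args)
            cam.open()
            if not cam.is_opened:
                sys.exit('Failed to open camera!')
            cam.start()

            try:
                while cam.thread_running:
                    frame = cam.read()
                    if frame is None:
                        # the sub-thread has not delivered a frame yet
                        time.sleep(0.01)
                        continue
                    color_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
                    image_np_expanded = np.expand_dims(color_frame, axis=0)
                    (boxes, scores, classes, num) = sess.run(
                        [detection_boxes, detection_scores,
                            detection_classes, num_detections],
                        feed_dict={image_tensor: image_np_expanded})
            finally:
                # always stop the sub-thread and free the capture device
                cam.stop()
                cam.release()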

def main():
  detect_in_video()

if __name__ =='__main__':
  main()
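
The script parses a `--confidence` threshold (`conf_th`) but the loop above does not use it. One way it could be applied to the outputs of `sess.run` is sketched below. `filter_detections` is a hypothetical helper, not part of the original script, and the sample values are made up. It is written in plain Python so it works on NumPy arrays as well as nested lists.

```python
def filter_detections(boxes, scores, classes, conf_th=0.3):
    """Keep detections whose score is at least `conf_th`.

    Inputs are the batched outputs for a batch of one image, indexable as
    boxes[0][i], scores[0][i], classes[0][i] (NumPy arrays or nested lists).
    Returns a list of (box, score, class_id) tuples.
    """
    kept = []
    for box, score, cls in zip(boxes[0], scores[0], classes[0]):
        if float(score) >= conf_th:
            kept.append((tuple(float(v) for v in box), float(score), int(cls)))
    return kept


if __name__ == '__main__':
    # Made-up values shaped like one image with three candidate detections.
    boxes = [[[0.1, 0.1, 0.5, 0.5], [0.2, 0.2, 0.4, 0.4], [0.0, 0.0, 1.0, 1.0]]]
    scores = [[0.9, 0.2, 0.5]]
    classes = [[1.0, 3.0, 7.0]]
    for det in filter_detections(boxes, scores, classes, conf_th=0.3):
        print(det)
```

Inside the main loop this would be called as `filter_detections(boxes, scores, classes, args.conf_th)` right after `sess.run`, and the surviving detections could then be drawn or logged.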

References

Sub-thread implementation
Original blog post

Original article: https://www.cnblogs.com/chay/p/10553822.html

Time: 2024-10-12 20:25:54
