Paper Reading - Convolutional Image Captioning ( CVPR 2018 )

Innovations:

  • The authors develop a convolutional ( CNN-based ) image captioning method that shows comparable performance to an LSTM based method on standard metrics.
  • The authors analyze the characteristics of CNN and LSTM nets and provide useful insights such as -- CNNs produce more entropy ( useful for diverse predictions ), better classification accuracy, and do not suffer from vanishing gradients.

Improvements:

  • A Convolutional Neural Network with Attention mechanism.

General Points:

  • Image Captioning is applicable to virtual assistants, editing tools, image indexing and support of the disabled.
  • Image Captioning is a basic ingredient for more complex operations such as storytelling and visual summarization.

原文地址:https://www.cnblogs.com/zlian2016/p/9520893.html

时间: 2024-10-25 22:59:48

Paper Reading - Convolutional Image Captioning ( CVPR 2018 )的相关文章

单目标跟踪CVPR 2018 ECO+

FROM https://blog.csdn.net/weixin_40245131/article/details/79754531 目标跟踪的相关滤波方向,Martin Danelljan 4月底在arXiv上挂出来的最新论文: Bhat G, Johnander J, Danelljan M, et al. Unveiling the Power of Deep Tracking [J]. arXiv preprint arXiv:1804.06833, 2018. https://arx

爬取CVPR 2018过程中遇到的坑

爬取 CVPR 2018 过程中遇到的坑 使用语言及模块 语言: Python 3.6.6 模块: re requests lxml bs4 过程 一开始都挺顺利的,先获取到所有文章的链接再逐个爬取获取内容, 中间有一部分的是用正则进行匹配出想要的内容,写完了就想全部跑一遍试试吧. 爬到一半出错了,看了一下是这篇出问题了. 好吧,那就f12看看什么情况. emmmmm.... 跟之前的差不多啊... 直接复制下来匹配试试 ...都能匹配到啊... 直到....emmmm....看看不print出

Paper Reading: Stereo DSO

开篇第一篇就写一个paper reading吧,用markdown+vim写东西切换中英文挺麻烦的,有些就偷懒都用英文写了. Stereo DSO: Large-Scale Direct Sparse Visual Odometry with Stereo Cameras Abstract Optimization objectives: intrinsic/extrinsic parameters of all keyframes all selected pixels' depth Inte

CVPR 2016 paper reading (6)

1. Neuroaesthetics in fashion: modeling the perception of fashionability, Edgar Simo-Serra, Sanja Fidler, Francesc Moreno-Noguer, Raquel Urtasun, in CVPR 2015. Goal: learn and predict how fashionable a person looks on a photograph, and suggest subtle

CVPR 2018 | 腾讯AI Lab入选21篇论文详解

近十年来在国际计算机视觉领域最具影响力.研究内容最全面的顶级学术会议CVPR,近日揭晓2018年收录论文名单,腾讯AI Lab共有21篇论文入选,位居国内企业前列,我们将在下文进行详解,欢迎交流与讨论. 去年CVPR的论文录取率为29%,腾讯AI Lab 共有6篇论文入选,点击 这里可以回顾.2017年,腾讯 AI Lab共有100多篇论文发表在AI顶级会议上,包括ICML(4篇).ACL(3篇).NIPS(8篇)等. 我们还坚持与学界.企业界和行业「共享AI+未来」,已与美国麻省理工大学.英国

【Paper Reading】R-CNN(V5)论文解读

R-CNN论文:Rich feature hierarchies for accurate object detection and semantic segmentation 用于精确目标检测和语义分割的丰富特征层次结构作者:Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik,UC Berkeley(加州大学伯克利分校)一作者Ross Girshick的个人首页:http://www.rossgirshick.info/,有其

Paper Reading: Perceptual Generative Adversarial Networks for Small Object Detection

Perceptual Generative Adversarial Networks for Small Object Detection 2017-07-11  19:47:46   CVPR 2017 This paper use GAN to handle the issue of small object detection which is a very hard problem in general object detection. As shown in the followin

Paper Reading: Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking

Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual TrackingECCV 2016  The key point of KCF is the ability to efficiently exploit available negative data by including all shifted versions of a training sample, in anthor w

Paper Reading: In Defense of the Triplet Loss for Person Re-Identification

In Defense of the Triplet Loss for Person Re-Identification  2017-07-02  14:04:20   This blog comes from: http://blog.csdn.net/shuzfan/article/details/70069822 Paper:  https://arxiv.org/abs/1703.07737 Github: https://github.com/VisualComputingInstitu