DL: R-FCN. A detailed illustrated guide to the R-FCN algorithm: introduction (paper overview), architecture details, and example applications


Introduction to the R-FCN algorithm (paper overview)

Abstract
We present region-based, fully convolutional networks for accurate and efficient object detection. In contrast to previous region-based detectors such as Fast/Faster R-CNN [6, 18] that apply a costly per-region subnetwork hundreds of times, our region-based detector is fully convolutional with almost all computation shared on the entire image. To achieve this goal, we propose position-sensitive score maps to address a dilemma between translation-invariance in image classification and translation-variance in object detection. Our method can thus naturally adopt fully convolutional image classifier backbones, such as the latest Residual Networks (ResNets) [9], for object detection. We show competitive results on the PASCAL VOC datasets (e.g., 83.6% mAP on the 2007 set) with the 101-layer ResNet. Meanwhile, our result is achieved at a test-time speed of 170ms per image, 2.5-20× faster than the Faster R-CNN counterpart. Code is made publicly available at: https://github.com/daijifeng001/r-fcn.
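Summary of the abstract
R-FCN is a region-based, fully convolutional network for accurate and efficient object detection. Unlike earlier region-based detectors such as Fast/Faster R-CNN, which run a costly per-region subnetwork hundreds of times, R-FCN shares almost all computation across the whole image. Position-sensitive score maps reconcile the translation invariance wanted in image classification with the translation variance needed in object detection, so fully convolutional classification backbones such as ResNets [9] can be adopted directly. With the 101-layer ResNet, the method reaches 83.6% mAP on the PASCAL VOC 2007 set at a test-time speed of 170 ms per image, 2.5-20× faster than the Faster R-CNN counterpart. Code is publicly available at: https://github.com/daijifeng001/r-fcn.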
Conclusion and Future Work
We presented Region-based Fully Convolutional Networks, a simple but accurate and efficient framework for object detection. Our system naturally adopts the state-of-the-art image classification backbones, such as ResNets, that are by design fully convolutional. Our method achieves accuracy competitive with the Faster R-CNN counterpart, but is much faster during both training and inference. We intentionally keep the R-FCN system presented in the paper simple. There have been a series of orthogonal extensions of FCNs that were developed for semantic segmentation (e.g., see [2]), as well as extensions of region-based methods for object detection (e.g., see [9, 1, 22]). We expect our system will easily enjoy the benefits of the progress in the field.
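Summary of the conclusion and future work
R-FCN is a simple yet accurate and efficient framework for object detection. It naturally adopts state-of-the-art image classification backbones such as ResNets, which are fully convolutional by design. Its accuracy is competitive with the Faster R-CNN counterpart, while both training and inference are much faster. The authors deliberately keep the system simple and expect it to benefit readily from the orthogonal extensions of FCNs developed for semantic segmentation (e.g., [2]) and from extensions of region-based object detection methods (e.g., [9, 1, 22]).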

Paper
Jifeng Dai, Yi Li, Kaiming He, Jian Sun.
R-FCN: Object detection via region-based fully convolutional networks. NIPS, 2016
https://arxiv.org/abs/1605.06409

1、Motivation: Sharing is Caring

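R-FCN restructures Faster R-CNN: the convolutional layers that used to follow the RoI layer are moved in front of it, and position-sensitive score maps are used to estimate the probability of each class. This keeps localization accuracy high while greatly increasing detection speed.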
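The idea of scoring a region from position-sensitive maps can be sketched in a few lines of plain Python. This is only an illustrative toy under stated assumptions, not the authors' implementation: the function names `psroi_pool` and `softmax`, the channel layout `(bin_row * k + bin_col) * num_classes + c`, and the floor/ceil bin rounding are all choices made here for the example. Each of the k × k bins of a region reads only from its own group of score maps, and the bins then vote by averaging.

```python
import math

def psroi_pool(score_maps, roi, k, num_classes):
    """Position-sensitive RoI pooling followed by average voting (toy sketch).

    score_maps: nested list [channel][y][x] with k*k*num_classes channels.
                Assumed layout: channel = (bin_row*k + bin_col)*num_classes + c.
    roi:        (x0, y0, x1, y1) in feature-map cells, x1/y1 exclusive,
                assumed to lie inside the map.
    Returns one voted score per class (num_classes includes background).
    """
    x0, y0, x1, y1 = roi
    bin_w = (x1 - x0) / k
    bin_h = (y1 - y0) / k
    votes = [0.0] * num_classes
    for i in range(k):  # bin row
        ys = range(math.floor(y0 + i * bin_h), math.ceil(y0 + (i + 1) * bin_h))
        for j in range(k):  # bin column
            xs = range(math.floor(x0 + j * bin_w), math.ceil(x0 + (j + 1) * bin_w))
            for c in range(num_classes):
                # Bin (i, j) looks only at its own score map for class c.
                channel = score_maps[(i * k + j) * num_classes + c]
                total = sum(channel[y][x] for y in ys for x in xs)
                votes[c] += total / (len(ys) * len(xs))  # average pooling
    return [v / (k * k) for v in votes]  # the k*k bins vote by averaging

def softmax(scores):
    """Turn voted scores into class probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Toy data: one object fills a 4x4 map. The map for bin (i, j) of class 1
# fires only in the quadrant where that part of the object actually is.
k, num_classes, size = 2, 2, 4
score_maps = [
    [[1.0 if (c == 1 and y // 2 == i and x // 2 == j) else 0.0
      for x in range(size)] for y in range(size)]
    for i in range(k) for j in range(k) for c in range(num_classes)
]

tight = psroi_pool(score_maps, (0, 0, 4, 4), k, num_classes)    # [0.0, 1.0]
partial = psroi_pool(score_maps, (0, 0, 2, 2), k, num_classes)  # [0.0, 0.25]
print(tight, softmax(tight))
print(partial, softmax(partial))
```

In this toy, the box that covers the whole object gets a full vote for class 1, while the box covering only the top-left quarter gets one vote out of four. The pooling itself has no learned weights, so all per-region work is cheap averaging on top of shared convolutional maps.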

7、Experimental results under different strategies

1、The atrous convolution trick

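·The effective stride of ResNet-101 is reduced from 32 pixels to 16 pixels, which increases the resolution of the score maps.
·All layers in conv4 and earlier (stride = 16) are left unchanged; the stride = 2 in the first conv5 block is changed to stride = 1.
·All convolutional filters in conv5 are modified with the "hole algorithm" (algorithme à trous) to compensate for the reduced stride.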
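The three points above can be checked on a one-dimensional toy in plain Python. This is a sketch under assumptions, not ResNet code: `conv1d`, the input `x`, and the kernels `k1` and `k2` are hypothetical names and values chosen for illustration. `k1` stands in for the layer whose stride is changed from 2 to 1, and `k2` for a later filter that is dilated by 2 to compensate.

```python
def conv1d(signal, kernel, stride=1, dilation=1):
    """Valid 1-D cross-correlation with optional stride and dilation."""
    span = (len(kernel) - 1) * dilation + 1  # input cells covered by the kernel
    return [
        sum(w * signal[start + t * dilation] for t, w in enumerate(kernel))
        for start in range(0, len(signal) - span + 1, stride)
    ]

x = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]
k1 = [1, 0, -1]  # stand-in for the layer that originally had stride = 2
k2 = [1, 2, 1]   # stand-in for a filter that follows it

# Original design: stride 2, then an ordinary convolution.
coarse = conv1d(conv1d(x, k1, stride=2), k2)

# Modified design: stride 1, then the following filter dilated by 2.
dense = conv1d(conv1d(x, k1, stride=1), k2, dilation=2)

print(coarse)
print(dense[::2])               # same values as coarse
print(len(coarse), len(dense))  # the dense output has twice the resolution
```

Every second value of the dense output equals the original coarse output, so the dilated filters still see the same input pattern they would have seen after the stride-2 layer. The extra values in between are what doubles the resolution.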