YOLO Object Detection

2017-12-01

YOLO

Paper links

YOLO9000: Better, Faster, Stronger
Fast YOLO: A Fast You Only Look Once System for Real-time Embedded Object Detection in Video
You Only Look Once: Unified, Real-Time Object Detection
YOLOv3

Analysis

YOLO

YOLO paper walkthrough

YOLO9000(YOLOv2)

YOLO9000 paper walkthrough

Better

Shortcomings of the original YOLO

Significant number of localization errors compared to Fast R-CNN
Low recall compared to region proposal-based methods

YOLOv2 improvements over YOLO

Batch Normalization

add batch normalization on all of the convolutional layers in YOLO
more than 2% improvement in mAP
dropout can then be removed from the model without overfitting

High Resolution Classifier

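fine-tune the classification network at the full 448x448 resolution for 10 epochs on ImageNet before fine-tuning on detection
increase of almost 4% mAP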

Convolutional With Anchor Boxes

remove fully connected layers from YOLO and use anchor boxes to predict bounding boxes.
eliminate one pooling layer to make the output of the network’s convolutional layers higher resolution.
shrink the network to operate on 416x416 input images instead of 448x448. The output feature map is then 13x13 rather than 14x14, so there is exactly one center cell.
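
The grid-size arithmetic can be checked with a minimal sketch. It assumes the total downsampling factor of 32 stated in the paper; the helper names `output_grid_size` and `has_single_center_cell` are hypothetical and used only for illustration.

```python
def output_grid_size(input_size: int, stride: int = 32) -> int:
    """Side length of the final feature map for a square input image."""
    if input_size % stride != 0:
        raise ValueError("input size must be a multiple of the stride")
    return input_size // stride


def has_single_center_cell(grid_size: int) -> bool:
    """An odd grid has one center cell; an even grid has four cells around the center."""
    return grid_size % 2 == 1


for size in (448, 416):
    grid = output_grid_size(size)
    print(size, grid, has_single_center_cell(grid))
```

Large objects tend to sit near the middle of the image, so a single center cell is a convenient place to predict them from.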

Performance compared with YOLO

mAP: 69.5 -> 69.2 (without anchor boxes -> with anchor boxes)
recall: 81% -> 88%

Dimension Clusters

Use k-means to choose the box dimensions automatically, instead of picking the prior dimensions by hand.
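
A minimal sketch of the clustering step, using the distance from the paper, d(box, centroid) = 1 - IOU(box, centroid), on (width, height) pairs. The function names, the random initialization, and the toy box list are assumptions for illustration and are not taken from the Darknet code.

```python
import random


def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    w, h = box
    cw, ch = centroid
    inter = min(w, cw) * min(h, ch)
    return inter / (w * h + cw * ch - inter)


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs; the nearest centroid is the one with the highest IoU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # assumption: start from k random boxes
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Minimizing 1 - IoU is the same as maximizing IoU.
            best = max(range(k), key=lambda j: iou_wh(box, centroids[j]))
            clusters[best].append(box)
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


# Toy data: three small, roughly square boxes and three wide boxes.
toy_boxes = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (5.0, 2.0), (5.5, 2.2), (4.8, 1.9)]
anchors = kmeans_anchors(toy_boxes, k=2)
print(anchors)
```

Using IoU instead of Euclidean distance keeps large boxes from dominating the error, which is the reason the paper gives for this choice.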

Direct Location Prediction

Using anchor boxes with YOLO runs into model instability. This is fixed by changing how the box location is predicted.
Using dimension clusters along with directly predicting the bounding box center location improves YOLO by almost 5% over the version with anchor boxes.
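
A minimal sketch of the decoding described in the paper: the center offsets pass through a sigmoid so the center stays inside its grid cell, and the width and height scale a prior box. The function name `decode_box` and the tuple layout of its arguments are assumptions for illustration.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(t, cell, prior):
    """Turn raw predictions into a box in grid-cell units.

    t     -- (tx, ty, tw, th), raw network outputs
    cell  -- (cx, cy), top-left corner of the responsible grid cell
    prior -- (pw, ph), width and height of the prior (anchor) box
    """
    tx, ty, tw, th = t
    cx, cy = cell
    pw, ph = prior
    bx = sigmoid(tx) + cx      # center x, bounded to [cx, cx + 1]
    by = sigmoid(ty) + cy      # center y, bounded to [cy, cy + 1]
    bw = pw * math.exp(tw)     # width as a scaling of the prior
    bh = ph * math.exp(th)     # height as a scaling of the prior
    return bx, by, bw, bh


print(decode_box((0.0, 0.0, 0.0, 0.0), cell=(6, 6), prior=(3.0, 2.0)))
```

Because the sigmoid bounds the offset, a prediction can no longer place the box center anywhere in the image, which is what made early training unstable.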

Fine-Grained Features

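A passthrough layer brings in features from an earlier 26x26 layer by stacking adjacent features into channels, turning the 26x26x512 map into 13x13x2048 so it can be concatenated with the final features.
This gives a modest 1% performance increase.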
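
The paper's passthrough layer stacks adjacent spatial features of a 26x26 map into channels, so a 26x26x512 map becomes 13x13x2048. A minimal sketch of such a space-to-depth rearrangement on nested lists follows. The channel ordering here is an assumption and may differ from Darknet's reorg layer; the name `space_to_depth` is hypothetical.

```python
def space_to_depth(x, block=2):
    """Rearrange a [C][H][W] nested list into [C*block*block][H//block][W//block].

    Each output channel takes one fixed position (dy, dx) out of every
    block x block neighborhood of one input channel.
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    if h % block or w % block:
        raise ValueError("height and width must be divisible by block")
    out = []
    for dy in range(block):
        for dx in range(block):
            for ch in range(c):
                out.append([
                    [x[ch][i * block + dy][j * block + dx] for j in range(w // block)]
                    for i in range(h // block)
                ])
    return out


# One 4x4 channel with values 0..15 becomes four 2x2 channels.
feature_map = [[[row * 4 + col for col in range(4)] for row in range(4)]]
stacked = space_to_depth(feature_map)
print(len(stacked), len(stacked[0]), len(stacked[0][0]))
```

No values are dropped: the spatial resolution halves in each direction while the channel count grows fourfold.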

Multi-Scale Training

Because the model uses only convolutional and pooling layers, the input can be resized on the fly. Every 10 batches our network randomly chooses new image dimensions.
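
A minimal sketch of the size schedule, assuming the paper's choice of multiples of 32 from 320 to 608 (the network downsamples by 32). The function name `pick_input_size` and the loop around it are hypothetical; a real training loop would also resize the images and labels.

```python
import random

STRIDE = 32
SIZES = list(range(320, 608 + 1, STRIDE))  # 320, 352, ..., 608


def pick_input_size(batch_index, current_size, rng, every=10):
    """Choose a new square input size every `every` batches, else keep the current one."""
    if batch_index % every == 0:
        return rng.choice(SIZES)
    return current_size


rng = random.Random(0)
size = 416
for batch_index in range(30):
    size = pick_input_size(batch_index, size, rng)
    # ... resize the batch to (size, size) and run one training step ...
```

Training on many resolutions lets the same weights run at small sizes for speed or large sizes for accuracy.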

Fengyang
