YOLO
Paper links
YOLO9000: Better, Faster, Stronger
Fast YOLO: A Fast You Only Look Once System for Real-time Embedded Object Detection in Video
You Only Look Once: Unified, Real-Time Object Detection
YOLOv3
Analysis
YOLO
YOLO9000(YOLOv2)
Better
Shortcomings of the original YOLO
A significant number of localization errors compared to Fast R-CNN
Low recall compared to region proposal-based methods
Improvements of YOLOv2 over YOLO
Batch Normalization
Add batch normalization to all of the convolutional layers in YOLO.
More than 2% improvement in mAP.
Dropout can then be removed without overfitting.
High Resolution Classifier
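Fine-tuning the classification network at the full 448x448 resolution before detection training gives an increase of almost 4% mAP.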
Convolutional With Anchor Boxes
Remove the fully connected layers from YOLO and use anchor boxes to predict bounding boxes.
Eliminate one pooling layer to make the output of the network's convolutional layers higher resolution.
Shrink the network to operate on 416x416 input images instead of 448x448. The output feature map is then 13x13 rather than 14x14, so it has a single center cell.
Performance compared with the version without anchor boxes
mAP: 69.5->69.2
recall: 81%->88%
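The 13x13 versus 14x14 grid sizes follow from the network's total downsampling factor of 32. A minimal sketch of that arithmetic; the helper name `grid_size` is made up for illustration and is not part of any YOLO code base:

```python
def grid_size(input_size: int, stride: int = 32) -> int:
    """Side length of the output feature map for a square input image."""
    if input_size % stride != 0:
        raise ValueError("input size must be a multiple of the stride")
    return input_size // stride


for size in (448, 416):
    g = grid_size(size)
    # An odd grid has exactly one center cell; an even grid has four cells around the center.
    kind = "odd, single center cell" if g % 2 == 1 else "even, four cells around the center"
    print(f"{size}x{size} input -> {g}x{g} grid ({kind})")
```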
Dimension Clusters
Use k-means clustering to pick the box dimensions automatically instead of hand-picking them.
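A rough sketch of clustering box sizes with the distance d(box, centroid) = 1 - IOU(box, centroid) used in the paper. The function names, the mean-based centroid update, the random initialization, and the default k=5 are assumptions made for illustration; actual implementations differ in these details.

```python
import numpy as np


def iou_wh(boxes: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """IoU between (N, 2) boxes and (K, 2) centroids given as (width, height).

    Boxes are compared as if they shared the same top-left corner,
    so only their shapes matter, not their positions.
    """
    inter_w = np.minimum(boxes[:, None, 0], centroids[None, :, 0])
    inter_h = np.minimum(boxes[:, None, 1], centroids[None, :, 1])
    inter = inter_w * inter_h
    area_boxes = (boxes[:, 0] * boxes[:, 1])[:, None]
    area_cents = (centroids[:, 0] * centroids[:, 1])[None, :]
    return inter / (area_boxes + area_cents - inter)


def kmeans_anchors(boxes: np.ndarray, k: int = 5, iters: int = 100, seed: int = 0) -> np.ndarray:
    """Cluster (width, height) pairs into k anchor shapes."""
    rng = np.random.default_rng(seed)
    # Assumption: initialize centroids from k distinct random boxes.
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Smallest distance 1 - IoU is the same as largest IoU.
        assign = np.argmax(iou_wh(boxes, centroids), axis=1)
        updated = centroids.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members) > 0:
                # Assumption: update each centroid with the mean of its members.
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids
```

With real data, `boxes` would hold the ground-truth box widths and heights from the training set, usually normalized to the grid or image size.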
Direct Location Prediction
Using anchor boxes with YOLO leads to model instability. YOLOv2 fixes this by changing how the box location is predicted.
Using dimension clusters along with directly predicting the bounding box center location improves YOLO by almost 5% over the version with anchor boxes.
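In the paper, the box center is predicted as a sigmoid offset inside the grid cell and the size as a scaling of the prior: b_x = σ(t_x) + c_x, b_y = σ(t_y) + c_y, b_w = p_w e^{t_w}, b_h = p_h e^{t_h}. A minimal sketch of that decoding in grid-cell units; the function names are chosen here for illustration only:

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_box(t_x, t_y, t_w, t_h, c_x, c_y, p_w, p_h):
    """Turn raw network outputs into a box (center x, center y, width, height).

    (c_x, c_y) is the top-left corner of the grid cell and (p_w, p_h) is the
    prior (anchor) size, all measured in grid cells.
    """
    b_x = sigmoid(t_x) + c_x  # the sigmoid keeps the center inside its cell
    b_y = sigmoid(t_y) + c_y
    b_w = p_w * math.exp(t_w)  # the prior is scaled, never made negative
    b_h = p_h * math.exp(t_h)
    return b_x, b_y, b_w, b_h


# Zero outputs place the box at the middle of cell (6, 6) with the prior's size.
print(decode_box(0.0, 0.0, 0.0, 0.0, c_x=6, c_y=6, p_w=3.0, p_h=4.0))
```

Because the sigmoid bounds the offsets to the range 0 to 1, a prediction cannot drift to an arbitrary point in the image, which is the source of the stability gain described above.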
Fine-Grained Features
A passthrough layer brings in finer-grained features from an earlier 26x26 layer. This gives a modest 1% performance increase.
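The paper's passthrough layer stacks adjacent features into channels, turning a 26x26x512 feature map into a 13x13x2048 one that can be concatenated with the 13x13 features. The sketch below is a generic space-to-depth rearrangement in NumPy under an assumed (height, width, channels) layout; it illustrates the shape change and is not claimed to reproduce Darknet's exact channel ordering.

```python
import numpy as np


def space_to_depth(x: np.ndarray, block: int = 2) -> np.ndarray:
    """Rearrange (H, W, C) into (H/block, W/block, C*block*block)."""
    h, w, c = x.shape
    if h % block != 0 or w % block != 0:
        raise ValueError("height and width must be divisible by the block size")
    # Split each spatial axis into (coarse position, position inside the block).
    x = x.reshape(h // block, block, w // block, block, c)
    # Move the two in-block axes next to the channel axis.
    x = x.transpose(0, 2, 1, 3, 4)
    # Fold the in-block positions into the channel dimension.
    return x.reshape(h // block, w // block, block * block * c)


features = np.zeros((26, 26, 512), dtype=np.float32)
print(space_to_depth(features).shape)
```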
Multi-Scale Training
Because the model uses only convolutional and pooling layers, the input can be resized on the fly. Every 10 batches the network randomly chooses new image dimensions.
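Since the model downsamples by a factor of 32, the paper draws the new size from the multiples of 32 between 320 and 608. A small sketch of such a schedule; `size_schedule`, its parameters, and the seeded generator are hypothetical and only show the sampling pattern, not a real training loop:

```python
import random

# Candidate square input sizes: 320, 352, ..., 608.
SIZES = list(range(320, 608 + 1, 32))


def size_schedule(num_batches: int, period: int = 10, seed: int = 0) -> list:
    """Return the input size used for each batch, resampled every `period` batches."""
    rng = random.Random(seed)
    schedule = []
    size = None
    for batch in range(num_batches):
        if batch % period == 0:
            size = rng.choice(SIZES)  # pick a new resolution for the next `period` batches
        schedule.append(size)
    return schedule


print(size_schedule(30)[::10])  # one sampled size per block of 10 batches
```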