Practical Server-side detection method based on RCNN

Introduction

Object detection has attracted widespread attention in recent years. PaddleClas open-sourced the ResNet50_vd_SSLD pretrained model, which reaches 82.4% Top-1 accuracy on ImageNet. Building on this pretrained model and the rich set of operators in PaddleDetection, PaddleDetection provides PSS-DET (Practical Server-side Detection). On a single V100 GPU, inference reaches 61 FPS at 41.6% COCO mAP and 20 FPS at 47.8% COCO mAP.
Taking the standard Faster RCNN ResNet50_vd FPN as an example, the following table shows the ablation study of PSS-DET. Each row adds one trick on top of all the rows above it.

| Trick | Train scale | Test scale | COCO mAP | Infer speed/FPS |
|- |:-: |:-: | :-: | :-: |
| baseline | 640x640 | 640x640 | 36.4% | 43.589 |
| +test proposal=pre/post topk 500/300 | 640x640 | 640x640 | 36.2% | 52.512 |
| +fpn channel=64 | 640x640 | 640x640 | 35.1% | 67.450 |
| +ssld pretrain | 640x640 | 640x640 | 36.3% | 67.450 |
| +ciou loss | 640x640 | 640x640 | 37.1% | 67.450 |
| +DCNv2 | 640x640 | 640x640 | 39.4% | 60.345 |
| +3x, multi-scale training | 640x640 | 640x640 | 41.0% | 60.345 |
| +auto augment | 640x640 | 640x640 | 41.4% | 60.345 |
| +libra sampling | 640x640 | 640x640 | 41.6% | 60.345 |
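To make the trade-offs in the ablation table easier to read, the per-step changes in mAP and speed can be computed with a short script. This is only an illustrative sketch: the numbers are copied from the table, while the names `ABLATION` and `step_deltas` are hypothetical helpers and not part of PaddleDetection.

```python
# Rows copied from the ablation table: (trick, COCO mAP in %, inference speed in FPS).
ABLATION = [
    ("baseline", 36.4, 43.589),
    ("+test proposal=pre/post topk 500/300", 36.2, 52.512),
    ("+fpn channel=64", 35.1, 67.450),
    ("+ssld pretrain", 36.3, 67.450),
    ("+ciou loss", 37.1, 67.450),
    ("+DCNv2", 39.4, 60.345),
    ("+3x, multi-scale training", 41.0, 60.345),
    ("+auto augment", 41.4, 60.345),
    ("+libra sampling", 41.6, 60.345),
]


def step_deltas(rows):
    """Return (trick, mAP change, FPS change) for each row relative to the row before it."""
    deltas = []
    for (_, prev_map, prev_fps), (trick, cur_map, cur_fps) in zip(rows, rows[1:]):
        deltas.append(
            (trick, round(cur_map - prev_map, 1), round(cur_fps - prev_fps, 3))
        )
    return deltas


if __name__ == "__main__":
    for trick, d_map, d_fps in step_deltas(ABLATION):
        print(f"{trick:40s} mAP {d_map:+.1f}  FPS {d_fps:+.3f}")
    # Overall change from the baseline to the final configuration.
    total_map = round(ABLATION[-1][1] - ABLATION[0][1], 1)
    total_fps = round(ABLATION[-1][2] - ABLATION[0][2], 3)
    print(f"{'total':40s} mAP {total_map:+.1f}  FPS {total_fps:+.3f}")
```

Read this way, the first two tricks give up accuracy in exchange for speed, and DCNv2 is the only later step that costs speed.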

The following figure shows mAP-speed curves for some common detectors.

Note

For a fair comparison, the inference time of PSS-DET models measured on a V100 GPU is converted to its Titan V GPU equivalent by multiplying it by 1.2.
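The conversion in this note can be sketched as follows. It assumes that "inference time" means per-image latency, so multiplying the time by 1.2 is equivalent to dividing the FPS by 1.2. The function name and constant below are hypothetical and only illustrate the arithmetic.

```python
# Factor taken from the note: Titan V time = V100 time * 1.2.
V100_TO_TITAN_V_TIME_FACTOR = 1.2


def v100_fps_to_titan_v_fps(fps_v100, factor=V100_TO_TITAN_V_TIME_FACTOR):
    """Convert a V100 throughput (FPS) to the Titan V equivalent.

    Assumption: the factor applies to per-image latency, so the
    resulting FPS is the V100 FPS divided by the factor.
    """
    if fps_v100 <= 0:
        raise ValueError("fps_v100 must be positive")
    latency_v100 = 1.0 / fps_v100            # seconds per image on V100
    latency_titan_v = latency_v100 * factor  # scaled latency on Titan V
    return 1.0 / latency_titan_v


if __name__ == "__main__":
    # V100 speeds of the two ResNet50-vd models listed in the Model Zoo.
    for name, fps in [("Faster", 61.425), ("Cascade Faster", 20.001)]:
        print(f"{name}: {fps} FPS on V100 -> "
              f"{v100_fps_to_titan_v_fps(fps):.3f} FPS Titan V equivalent")
```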

Model Zoo

COCO dataset

| Backbone | Type | Image/gpu | Lr schd | Inf speed (FPS) | Box AP | Mask AP | Download | Configs |
|- |:-: |:-: | :-: | :-: | :-: | :-: | :-: | :-: |
| ResNet50-vd-FPN-Dcnv2 | Faster | 2 | 3x | 61.425 | 41.6 | - | model | config |
| ResNet50-vd-FPN-Dcnv2 | Cascade Faster | 2 | 3x | 20.001 | 47.8 | - | model | config |
| ResNet101-vd-FPN-Dcnv2 | Cascade Faster | 2 | 3x | 19.523 | 49.4 | - | model | config |

Attention: Pretrained models whose configurations are in the generic directory currently support inference only; training and evaluation are not yet supported.
