README_en.md 3.2 KB

Practical Server-side detection method base on RCNN

Introduction

  • In recent years, object detection tasks have attracted widespread attention. PaddleClas open-sourced the ResNet50_vd_SSLD pretrained model based on ImageNet(Top1 Acc 82.4%). And based on the pretrained model, PaddleDetection provided the PSS-DET (Practical Server-side detection) with the help of the rich operators in PaddleDetection. The inference speed can reach 61FPS on single V100 GPU when COCO mAP is 41.6%, and 20FPS when COCO mAP is 47.8%.

  • We take the standard Faster RCNN ResNet50_vd FPN as an example. The following table shows ablation study of PSS-DET.

| Trick | Train scale | Test scale | COCO mAP | Infer speed/FPS | |- |:-: |:-: | :-: | :-: | | baseline | 640x640 | 640x640 | 36.4% | 43.589 | | +test proposal=pre/post topk 500/300 | 640x640 | 640x640 | 36.2% | 52.512 | | +fpn channel=64 | 640x640 | 640x640 | 35.1% | 67.450 | | +ssld pretrain | 640x640 | 640x640 | 36.3% | 67.450 | | +ciou loss | 640x640 | 640x640 | 37.1% | 67.450 | | +DCNv2 | 640x640 | 640x640 | 39.4% | 60.345 | | +3x, multi-scale training | 640x640 | 640x640 | 41.0% | 60.345 | | +auto augment | 640x640 | 640x640 | 41.4% | 60.345 | | +libra sampling | 640x640 | 640x640 | 41.6% | 60.345 |

And the following figure shows mAP-Speed curves for some common detectors.

pssdet

Note

For fair comparison, inference time for PSS-DET models on V100 GPU is transformed to Titan V GPU by multiplying by 1.2 times.

Model Zoo

COCO dataset

Backbone Type Image/gpu Lr schd Inf time (fps) Box AP Mask AP Download Configs
ResNet50-vd-FPN-Dcnv2 Faster 2 3x 61.425 41.6 - model config
ResNet50-vd-FPN-Dcnv2 Cascade Faster 2 3x 20.001 47.8 - model config
ResNet101-vd-FPN-Dcnv2 Cascade Faster 2 3x 19.523 49.4 - model config

Attention: Pretrained models whose congigurations are in the directory generic just support inference but do not support training and evaluation as now.