MaochengHu 576cda45b8 first commit | 2 years ago | |
---|---|---|
.. | ||
_base_ | 2 years ago | |
README.md | 2 years ago | |
README_cn.md | 2 years ago | |
ppyoloe_crn_l_300e_coco.yml | 2 years ago | |
ppyoloe_crn_l_36e_coco_xpu.yml | 2 years ago | |
ppyoloe_crn_m_300e_coco.yml | 2 years ago | |
ppyoloe_crn_s_300e_coco.yml | 2 years ago | |
ppyoloe_crn_x_300e_coco.yml | 2 years ago |
English | 简体中文
PP-YOLOE is an excellent single-stage anchor-free model based on PP-YOLOv2, surpassing a variety of popular yolo models. PP-YOLOE has a series of models, named s/m/l/x, which are configured through width multiplier and depth multiplier. PP-YOLOE avoids using special operators, such as deformable convolution or matrix nms, to be deployed friendly on various hardware. For more details, please refer to our report.
PP-YOLOE-l achieves 51.4 mAP on COCO test-dev2017 dataset with 78.1 FPS on Tesla V100. While using TensorRT FP16, PP-YOLOE-l can be further accelerated to 149.2 FPS. PP-YOLOE-s/m/x also have excellent accuracy and speed performance, which can be found in Model Zoo
PP-YOLOE is composed of following methods:
Model | GPU number | images/GPU | backbone | input shape | Box APval | Box APtest | Params(M) | FLOPs(G) | V100 FP32(FPS) | V100 TensorRT FP16(FPS) | download | config |
---|---|---|---|---|---|---|---|---|---|---|---|---|
PP-YOLOE-s | 8 | 32 | cspresnet-s | 640 | 42.7 | 43.1 | 7.93 | 17.36 | 208.3 | 333.3 | model | config |
PP-YOLOE-m | 8 | 28 | cspresnet-m | 640 | 48.6 | 48.9 | 23.43 | 49.91 | 123.4 | 208.3 | model | config |
PP-YOLOE-l | 8 | 20 | cspresnet-l | 640 | 50.9 | 51.4 | 52.20 | 110.07 | 78.1 | 149.2 | model | config |
PP-YOLOE-x | 8 | 16 | cspresnet-x | 640 | 51.9 | 52.2 | 98.42 | 206.59 | 45.0 | 95.2 | model | config |
Notes:
mAP(IoU=0.5:0.95)
.--run_benchmark=True
,you should install these dependencies at first, pip install pynvml psutil GPUtil
.Training PP-YOLOE with mixed precision on 8 GPUs with following command
python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml --amp
Notes: use --amp
to train with default config to avoid out of memeory.
Evaluating PP-YOLOE on COCO val2017 dataset in single GPU with following commands:
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
For evaluation on COCO test-dev2017 dataset, please download COCO test-dev2017 dataset from COCO dataset download and decompress to COCO dataset directory and configure EvalDataset
like configs/ppyolo/ppyolo_test.yml
.
Inference images in single GPU with following commands, use --infer_img
to inference a single image and --infer_dir
to inference all images in the directory.
# inference single image
CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams --infer_img=demo/000000014439_640x640.jpg
# inference all images in the directory
CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams --infer_dir=demo
For deployment on GPU or speed testing, model should be first exported to inference model using tools/export_model.py
.
Exporting PP-YOLOE for Paddle Inference without TensorRT, use following command
python tools/export_model.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
Exporting PP-YOLOE for Paddle Inference with TensorRT for better performance, use following command with extra -o trt=True
setting.
python tools/export_model.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams trt=True
If you want to export PP-YOLOE model to ONNX format, use following command refer to PaddleDetection Model Export as ONNX Format Tutorial.
# export inference model
python tools/export_model.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml --output_dir=output_inference -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
# install paddle2onnx
pip install paddle2onnx
# convert to onnx
paddle2onnx --model_dir output_inference/ppyoloe_crn_l_300e_coco --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 11 --save_file ppyoloe_crn_l_300e_coco.onnx
Notes: ONNX model only supports batch_size=1 now
For fair comparison, the speed in Model Zoo do not contains the time cost of data reading and post-processing(NMS), which is same as YOLOv4(AlexyAB) in testing method. Thus, you should export model with extra -o exclude_nms=True
setting.
Using Paddle Inference without TensorRT to test speed, run following command
# export inference model
python tools/export_model.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams exclude_nms=True
# speed testing with run_benchmark=True
CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_300e_coco --image_file=demo/000000014439_640x640.jpg --run_mode=paddle --device=gpu --run_benchmark=True
Using Paddle Inference with TensorRT to test speed, run following command
# export inference model with trt=True
python tools/export_model.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams exclude_nms=True trt=True
# speed testing with run_benchmark=True,run_mode=trt_fp32/trt_fp16
CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_300e_coco --image_file=demo/000000014439_640x640.jpg --run_mode=trt_fp16 --device=gpu --run_benchmark=True
PP-YOLOE can be deployed by following approches:
Next, we will introduce how to use Paddle Inference to deploy PP-YOLOE models in TensorRT FP16 mode.
First, refer to Paddle Inference Docs, download and install packages corresponding to CUDA, CUDNN and TensorRT version.
Then, Exporting PP-YOLOE for Paddle Inference with TensorRT, use following command.
python tools/export_model.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams trt=True
Finally, inference in TensorRT FP16 mode.
# inference single image
CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_300e_coco --image_file=demo/000000014439_640x640.jpg --device=gpu --run_mode=trt_fp16
# inference all images in the directory
CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_300e_coco --image_dir=demo/ --device=gpu --run_mode=trt_fp16
Notes:
use_static=True
in enable_tensorrt_engine. In this way, the serialized file generated will be saved in the output_inference
folder, and the saved serialized file will be loaded the next time when TensorRT is executed.Model | AP | AP50 |
---|---|---|
YOLOX | 22.6 | 37.5 |
YOLOv5 | 26.0 | 42.7 |
PP-YOLOE | 30.5 | 46.4 |
Notes
person, bicycles, car, van, truck, tricyle, awning-tricyle, bus, motor
.Ablation experiments of PP-YOLOE.
NO. | Model | Box APval | Params(M) | FLOPs(G) | V100 FP32 FPS |
---|---|---|---|---|---|
A | PP-YOLOv2 | 49.1 | 54.58 | 115.77 | 68.9 |
B | A + Anchor-free | 48.8 | 54.27 | 114.78 | 69.8 |
C | B + CSPRepResNet | 49.5 | 47.42 | 101.87 | 85.5 |
D | C + TAL | 50.4 | 48.32 | 104.75 | 84.0 |
E | D + ET-Head | 50.9 | 52.20 | 110.07 | 78.1 |