Hybrid Proposal Refiner: Revisiting DETR Series from the Faster R-CNN Perspective
Proceedings of the IEEE/CVF Conference on Computer Vision and …, 2024 • openaccess.thecvf.com
With the transformative impact of the Transformer, DETR pioneered the application of the
encoder-decoder architecture to object detection. A collection of follow-up research, e.g.,
Deformable DETR, aims to enhance DETR while adhering to the encoder-decoder design. In
this work, we revisit the DETR series through the lens of Faster R-CNN. We find that
DETR resonates with the underlying principles of Faster R-CNN's RPN-refiner design but
benefits from end-to-end detection owing to the incorporation of Hungarian matching. We …
Abstract
With the transformative impact of the Transformer, DETR pioneered the application of the encoder-decoder architecture to object detection. A collection of follow-up research, e.g., Deformable DETR, aims to enhance DETR while adhering to the encoder-decoder design. In this work, we revisit the DETR series through the lens of Faster R-CNN. We find that DETR resonates with the underlying principles of Faster R-CNN's RPN-refiner design but benefits from end-to-end detection owing to the incorporation of Hungarian matching. We systematically adapt Faster R-CNN towards Deformable DETR by integrating or repurposing each component of Deformable DETR, and note that Deformable DETR's improved performance over Faster R-CNN is attributed to the adoption of advanced modules such as a superior proposal refiner (e.g., deformable attention rather than RoI Align). When viewing DETR through the RPN-refiner paradigm, we delve into various proposal refinement techniques such as deformable attention, cross attention, and dynamic convolution. These proposal refiners cooperate well with each other; thus, we synergistically combine them to establish a Hybrid Proposal Refiner (HPR). Our HPR is versatile and can be incorporated into various DETR detectors. For instance, by integrating HPR into a strong DETR detector, we achieve an AP of 54.9 on the COCO benchmark, utilizing a ResNet-50 backbone and a 36-epoch training schedule. Code and models are available at https://github.com/ZhaoJingjing713/HPR.
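The role of Hungarian matching in end-to-end detection can be illustrated with a toy sketch. This is not the authors' code: the function names `l1_cost` and `match_one_to_one`, the box values, and the pure L1 box cost are all assumptions made for illustration (DETR-style detectors typically also include classification and IoU-based terms in the cost). For brevity, the sketch finds the optimal assignment by brute force rather than with the Hungarian algorithm itself, which would reach the same optimum far more efficiently.

```python
from itertools import permutations


def l1_cost(pred_box, gt_box):
    """Sum of absolute coordinate differences between two (cx, cy, w, h) boxes."""
    return sum(abs(p - g) for p, g in zip(pred_box, gt_box))


def match_one_to_one(pred_boxes, gt_boxes):
    """Return the minimum-cost one-to-one assignment as (pred_idx, gt_idx) pairs.

    Brute force over permutations (hypothetical helper, toy sizes only).
    Assumes len(pred_boxes) >= len(gt_boxes).
    """
    best_cost, best_pairs = float("inf"), []
    for pred_ids in permutations(range(len(pred_boxes)), len(gt_boxes)):
        # pred_ids[g] is the prediction assigned to ground-truth box g.
        cost = sum(l1_cost(pred_boxes[p], gt_boxes[g]) for g, p in enumerate(pred_ids))
        if cost < best_cost:
            best_cost = cost
            best_pairs = sorted((p, g) for g, p in enumerate(pred_ids))
    return best_pairs, best_cost


# Toy data (made up): three predictions, two ground-truth boxes.
preds = [(0.50, 0.50, 0.20, 0.20), (0.10, 0.10, 0.05, 0.05), (0.52, 0.48, 0.22, 0.18)]
gts = [(0.50, 0.50, 0.20, 0.20), (0.12, 0.10, 0.05, 0.05)]

pairs, total = match_one_to_one(preds, gts)
matched = {p for p, _ in pairs}
# Predictions left unmatched would be supervised as "no object".
unmatched = [i for i in range(len(preds)) if i not in matched]
```

In this toy case, predictions 0 and 2 both overlap the first ground-truth box, yet only one of them is matched to it. Because each ground-truth box supervises exactly one prediction, near-duplicate predictions are discouraged during training, which is what lets such detectors run without a separate duplicate-removal step.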