Feature Pyramid Networks for Object Detection
Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, Serge Belongie
https://arxiv.org/abs/1612.03144
References
1. Feature Pyramid Networks for Object Detection
2. Feature Pyramid Networks in PyTorch
3. Conv2d, PyTorch documentation
4. Jonathan Hui, Understanding Feature Pyramid Networks for object detection (FPN), Medium
5. 염창동 형준킴, 갈아먹는 Object Detection [7] Feature Pyramid Network, Tistory
6. 동산, Feature Pyramid Networks for Object Detection, Naver Blog
7. srk lee, Feature Pyramid Networks for Object Detection paper reading, Medium
8. EfficientDet: Scalable and Efficient Object Detection Review
9. Technical Fridays, Autoencoder: Downsampling and Upsampling
10. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Abstract
● Feature pyramids are a basic component of recognition systems for detecting objects at different scales, but they have been avoided in deep-learning-based detectors because of their compute and memory cost.
● This paper proposes the FPN (Feature Pyramid Network), an architecture that extracts feature maps with strong semantics at multiple scales for little additional cost. FPN has since been adopted by nearly every subsequent SOTA object detection method.
Here, semantic features carry class information, while localization features carry position information.
● FPN is not a standalone object detector (single model) but a feature extractor. Applying FPN to an existing SOTA backbone therefore improves accuracy at a similar cost.
● Uses lateral connections and a top-down architecture.
● Uses a multi-scale pyramidal hierarchy.
https://arxiv.org/pdf/1612.03144.pdf
Introduction
(a) Traditional object detection approach
An image pyramid is built and hand-engineered features are computed at every pyramid level; objects are then detected at all positions in each level.
Because features are extracted and predictions are made at every level, the approach is robust across scales, but the computation cost is enormous and impractical (a minimal sketch follows below).
Wikipedia, Pyramid (image processing), https://en.wikipedia.org/wiki/Pyramid_(image_processing)
https://arxiv.org/pdf/1612.03144.pdf
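A minimal sketch of such a "featurized image pyramid" (names are illustrative and not from the paper; feature_extractor stands for any hand-crafted or learned feature computation). The same extractor is run once per scale, which is exactly why the approach is expensive:

import torch
import torch.nn.functional as F

def featurized_image_pyramid(image, feature_extractor, scales=(1.0, 0.5, 0.25)):
    # image: [N, 3, H, W]; one full feature computation per pyramid level
    features = []
    for s in scales:
        scaled = F.interpolate(image, scale_factor=s, mode='bilinear', align_corners=False)
        features.append(feature_extractor(scaled))
    return features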
Introduction
Deep Convolutional Networks (ConvNets)
● ConvNet features have some robustness to scale, but it is limited.
● For fast detection, only a single scale is used.
● Only single-scale features from the last layer are used; the different depths of the in-network feature maps create large semantic gaps between scales.
● Because a single-scale ConvNet is only partially robust to scale changes, entries in the ImageNet and COCO detection challenges use multi-scale testing on featurized image pyramids, which requires a lot of memory during training and increases inference time.
https://arxiv.org/pdf/1612.03144.pdf
Introduction
SSD: Single Shot MultiBox Detector
● Uses a pyramidal feature hierarchy.
● Feature maps at different scales are used for detection.
● Extra higher layers keep being generated on top of already-computed feature maps.
● Each already-computed higher-resolution map is used only once for detection.
● Deeper convolutions extract stronger semantic features, but the reduced image resolution makes small objects hard to detect. To address this, it is better to reuse the feature maps several times rather than only once.
● -> FPN's goal is to exploit the pyramidal shape of a ConvNet's feature hierarchy while obtaining strong semantics at all scales.
https://arxiv.org/pdf/1612.03144.pdf
Duplicate detections are removed with NMS (Non-Maximum Suppression); a minimal sketch follows below.
Reference: https://dyndy.tistory.com/275
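A minimal sketch of greedy NMS (illustrative, not the implementation of any particular detector): boxes are sorted by score and any box whose IoU with an already-kept box exceeds a threshold is suppressed.

import torch

def nms(boxes, scores, iou_threshold=0.5):
    # boxes: [N, 4] as (x1, y1, x2, y2); scores: [N]. Returns indices of kept boxes.
    order = scores.argsort(descending=True)
    keep = []
    while order.numel() > 0:
        i = order[0]
        keep.append(i.item())
        if order.numel() == 1:
            break
        rest = order[1:]
        # Intersection of the top-scoring box with the remaining boxes
        x1 = torch.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = torch.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = torch.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = torch.minimum(boxes[i, 3], boxes[rest, 3])
        inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_threshold]  # drop overlapping duplicates
    return keep

In practice, torchvision.ops.nms(boxes, scores, iou_threshold) provides the same operation.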
Introduction
Feature Pyramid Network, FPN
● Improves a ConvNet's pyramidal feature hierarchy so that the resulting feature pyramid carries high-level semantics at every level.
● The proposed architecture consists of a bottom-up pathway, a top-down pathway, and lateral connections.
● It merges the pyramid produced by the CNN forward pass with a top-down path and skip (lateral) connections: the semantically rich features from the forward pass are upsampled in the top-down path to recover resolution, and the local detail lost in the forward pass is supplemented through the skip connections, yielding a fast model that is robust to scale changes (see the merge-step sketch below).
https://arxiv.org/pdf/1612.03144.pdf
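A minimal sketch of one top-down merge step (function and argument names are illustrative; the full module appears in the "FPN in PyTorch" slides later):

import torch.nn.functional as F

def merge_step(top_down, lateral, lateral_conv):
    # 1x1 conv projects the bottom-up (lateral) map to the pyramid channel width (e.g. 256)
    lat = lateral_conv(lateral)
    # upsample the coarser top-down map to the lateral map's spatial size
    up = F.interpolate(top_down, size=lat.shape[-2:], mode='nearest')
    # element-wise addition restores spatial detail lost in the forward pass
    return up + lat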
Bottom-up pathway
https://medium.com/@jonathan_hui/understanding-feature-pyramid-networks-for-object-detection-fpn-45b227b9106c
Top-down pathway and lateral connections
https://medium.com/@jonathan_hui/understanding-feature-pyramid-networks-for-object-detection-fpn-45b227b9106c
FPN in PyTorch
'''FPN in PyTorch.
See the paper "Feature Pyramid Networks for Object Detection" for more details.
'''
import torch
import torch.nn as nn
import torch.nn.functional as F


class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, in_planes, planes, stride=1):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, self.expansion*planes, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(self.expansion*planes)

        # Projection shortcut when the spatial size or channel count changes
        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != self.expansion*planes:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, self.expansion*planes, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(self.expansion*planes)
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = F.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        out += self.shortcut(x)  # residual connection
        out = F.relu(out)
        return out
https://github.com/kuangliu/pytorch-fpn/blob/master/fpn.py
FPN in PyTorch
class FPN(nn.Module):
    def __init__(self, block, num_blocks):
        super(FPN, self).__init__()
        self.in_planes = 64

        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
        self.bn1 = nn.BatchNorm2d(64)

        # Bottom-up layers (ResNet stages); C2..C5 have 256/512/1024/2048 channels
        self.layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)
        self.layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)
        self.layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)
        self.layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)

        # Top layer: reduce C5 to 256 channels to start the top-down pathway
        self.toplayer = nn.Conv2d(2048, 256, kernel_size=1, stride=1, padding=0)

        # Smooth layers: 3x3 convs applied after each merge to reduce aliasing
        self.smooth1 = nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)
        self.smooth2 = nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)
        self.smooth3 = nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)

        # Lateral layers: 1x1 convs projecting C4/C3/C2 to 256 channels
        self.latlayer1 = nn.Conv2d(1024, 256, kernel_size=1, stride=1, padding=0)
        self.latlayer2 = nn.Conv2d( 512, 256, kernel_size=1, stride=1, padding=0)
        self.latlayer3 = nn.Conv2d( 256, 256, kernel_size=1, stride=1, padding=0)

https://github.com/kuangliu/pytorch-fpn/blob/master/fpn.py
FPN in PyTorch
    def _make_layer(self, block, planes, num_blocks, stride):
        strides = [stride] + [1]*(num_blocks-1)
        layers = []
        for stride in strides:
            layers.append(block(self.in_planes, planes, stride))
            self.in_planes = planes * block.expansion
        return nn.Sequential(*layers)

    def _upsample_add(self, x, y):
        '''Upsample and add two feature maps.

        Args:
          x: top feature map to be upsampled.
          y: lateral feature map.

        Returns:
          added feature map.

        Note: in PyTorch, when the input size is odd, the feature map upsampled
        with scale_factor=2, mode='nearest' may not match the lateral feature
        map size.
        E.g. original input size: [N,_,15,15] ->
             conv2d feature map size: [N,_,8,8] ->
             upsampled feature map size: [N,_,16,16]
        So bilinear upsampling to an explicit target size is used instead.
        (F.upsample is deprecated in recent PyTorch; F.interpolate is the
        equivalent replacement.)
        '''
        _, _, H, W = y.size()
        return F.upsample(x, size=(H, W), mode='bilinear') + y

    def forward(self, x):
        # Bottom-up pathway
        c1 = F.relu(self.bn1(self.conv1(x)))
        c1 = F.max_pool2d(c1, kernel_size=3, stride=2, padding=1)
        c2 = self.layer1(c1)
        c3 = self.layer2(c2)
        c4 = self.layer3(c3)
        c5 = self.layer4(c4)
        # Top-down pathway with lateral connections
        p5 = self.toplayer(c5)
        p4 = self._upsample_add(p5, self.latlayer1(c4))
        p3 = self._upsample_add(p4, self.latlayer2(c3))
        p2 = self._upsample_add(p3, self.latlayer3(c2))
        # Smooth (3x3 conv on each merged map)
        p4 = self.smooth1(p4)
        p3 = self.smooth2(p3)
        p2 = self.smooth3(p2)
        return p2, p3, p4, p5


def FPN101():
    # Note: [2,2,2,2] is a small ResNet-style configuration; a true ResNet-101
    # backbone would use num_blocks = [3,4,23,3].
    return FPN(Bottleneck, [2,2,2,2])
https://github.com/kuangliu/pytorch-fpn/blob/master/fpn.py
https://kharshit.github.io/blog/2019/02/15/autoencoder-downsampling-and-upsampling
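A quick shape check of the FPN defined above (a usage sketch; it assumes the classes from the preceding snippets and a 224x224 input):

net = FPN101()
p2, p3, p4, p5 = net(torch.randn(1, 3, 224, 224))
print(p2.shape)  # torch.Size([1, 256, 56, 56])  - stride 4
print(p3.shape)  # torch.Size([1, 256, 28, 28])  - stride 8
print(p4.shape)  # torch.Size([1, 256, 14, 14])  - stride 16
print(p5.shape)  # torch.Size([1, 256, 7, 7])    - stride 32

All pyramid levels share the same channel width (256), so a single detection head can be applied to every level.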
Feature Pyramid Networks for RPN
https://medium.com/@jonathan_hui/understanding-feature-pyramid-networks-for-object-detection-fpn-45b227b9106
https://yeomko.tistory.com/44
How the RPN is built in Faster R-CNN:
1. Produce a feature map by passing the image through a pretrained VGG.
2. Apply a 3x3 convolution to the feature map to create an intermediate layer.
3. Apply 1x1 convolutions for classification and bounding-box regression.
With FPN, the same head (with shared weights) is attached to every pyramid level (a minimal sketch follows below).
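A minimal sketch of the head described above, adapted to FPN inputs (class and parameter names are illustrative, not taken from any specific implementation):

import torch.nn as nn
import torch.nn.functional as F

class RPNHead(nn.Module):
    def __init__(self, in_channels=256, num_anchors=3):
        super().__init__()
        # 3x3 conv -> intermediate layer
        self.conv = nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1)
        # 1x1 convs for objectness classification and box regression
        self.cls_logits = nn.Conv2d(in_channels, num_anchors, kernel_size=1)
        self.bbox_pred = nn.Conv2d(in_channels, num_anchors * 4, kernel_size=1)

    def forward(self, pyramid_features):
        # pyramid_features: e.g. [p2, p3, p4, p5] from the FPN; the same head is applied to each level
        logits, deltas = [], []
        for f in pyramid_features:
            t = F.relu(self.conv(f))
            logits.append(self.cls_logits(t))
            deltas.append(self.bbox_pred(t))
        return logits, deltas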
Feature Pyramid Networks for Fast R-CNN
https://medium.com/@jonathan_hui/understanding-feature-pyramid-networks-for-object-detection-fpn-45b227b9106
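For the Fast R-CNN head, the paper assigns each RoI of width w and height h to a pyramid level P_k via k = floor(k0 + log2(sqrt(w*h)/224)) with k0 = 4 (Eq. 1), so smaller RoIs are pooled from finer levels. A minimal sketch of this assignment (helper name is illustrative):

import math

def roi_pyramid_level(w, h, k0=4, canonical_size=224, k_min=2, k_max=5):
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / canonical_size))
    return min(max(k, k_min), k_max)  # clamp to the available levels P2..P5

# e.g. a 224x224 RoI maps to P4, while a 112x112 RoI maps to P3
print(roi_pyramid_level(224, 224))  # 4
print(roi_pyramid_level(112, 112))  # 3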
Experiments