[DL Reading Group] Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

Upload: deep-learning-jp

Post on 15-Mar-2018

Page 1: [DL Reading Group] Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

DEEP LEARNING JP

[DL Papers]

http://deeplearning.jp/

“Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation”

Kento Doi (土居健人), Iwasaki Lab, Department of Aeronautics and Astronautics

Page 2

Bibliographic information

• Authors

– A research group at Google

– The first author, Chen, created DeepLab and is a co-author of MobileNetV2

• Published 2018/02/07

– State of the art on semantic segmentation at the time of this talk

• Why this paper

– A good opportunity to summarize the DeepLab series of papers.

– Atrous (dilated) convolution looks useful for other tasks as well.

Page 3

Outline

• Overview of the DeepLab family of networks

– DeepLab v1 & v2

    • atrous convolution

    • atrous spatial pyramid pooling

– DeepLab v3

    • atrous convolution in cascade and in parallel

– DeepLab v3+

    • effective decoder module

    • Xception model

    • depthwise convolution

Page 4

DeepLab v1, v2

• “DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs”

• v1 and v2 differ mainly in the backbone architecture (VGG vs. ResNet)

• The paper has three key points

– atrous convolution

– atrous spatial pyramid pooling

– post-processing with a CRF

This talk covers the first two.

Page 5

Atrous Convolution

• Also called dilated convolution

• The convolution is computed over pixels that are spaced apart from each other

– Enlarges the receptive field without shrinking the feature map

“DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs”, L. Chen et al. 2016
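
To make the "spaced-apart pixels" idea concrete, here is a minimal 1-D sketch in plain Python. It is not code from the paper: the function names `atrous_conv1d` and `receptive_field`, the zero padding, and the toy signal are all assumptions chosen for illustration. As is usual in deep learning, the "convolution" is computed without flipping the kernel.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """1-D atrous (dilated) convolution, stride 1, zero padding.

    Kernel taps are placed `rate` samples apart, so the output keeps the
    input length while each output value sees a wider span of the input.
    """
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * rate   # taps are `rate` samples apart
            if 0 <= idx < len(signal):      # zero padding outside the signal
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, rate):
    """Input span covered by one output value of a single dilated layer."""
    return rate * (kernel_size - 1) + 1


signal = [float(v) for v in range(10)]
kernel = [1.0, 1.0, 1.0]

for rate in (1, 2, 4):
    out = atrous_conv1d(signal, kernel, rate)
    # output length stays 10 while the receptive field grows: 3, 5, 9
    print(rate, len(out), receptive_field(len(kernel), rate))
```

With `rate=1` this is an ordinary 3-tap convolution; larger rates widen the receptive field with the same three weights and no downsampling.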

Page 6

Atrous Spatial Pyramid Pooling (ASPP)

• Inspired by Spatial Pyramid Pooling (SPP)

• What is SPP?

– Applies pooling at several scales to a single feature map

– Turns a feature map of arbitrary size into a vector of fixed length

Atrous Spatial Pyramid Pooling (ASPP) does this with atrous convolution

“Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition”, K. He et al. 2014
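
The fixed-length property of SPP can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the bin boundaries follow a simple floor/ceil rule, the pyramid levels `(1, 2, 4)` are arbitrary, and the names `max_pool_bins` and `spatial_pyramid_pool` are made up here rather than taken from the SPP paper.

```python
def max_pool_bins(feature_map, bins):
    """Max-pool a 2-D map (a list of rows) into a bins x bins grid."""
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for by in range(bins):
        y0 = (by * h) // bins
        y1 = max(((by + 1) * h + bins - 1) // bins, y0 + 1)  # ceil, non-empty
        for bx in range(bins):
            x0 = (bx * w) // bins
            x1 = max(((bx + 1) * w + bins - 1) // bins, x0 + 1)
            pooled.append(max(feature_map[y][x]
                              for y in range(y0, y1)
                              for x in range(x0, x1)))
    return pooled


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Concatenate pooled grids from every pyramid level into one vector."""
    vector = []
    for bins in levels:
        vector.extend(max_pool_bins(feature_map, bins))
    return vector


small = [[x + 10 * y for x in range(5)] for y in range(7)]   # 7 x 5 map
large = [[x * y for x in range(13)] for y in range(9)]       # 9 x 13 map

# Both maps yield 1 + 4 + 16 = 21 values, regardless of input size.
print(len(spatial_pyramid_pool(small)), len(spatial_pyramid_pool(large)))
```

The output length depends only on the pyramid levels, which is what lets SPP feed inputs of any size into a fixed-size layer.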

Page 7

Atrous Spatial Pyramid Pooling (ASPP)

• Atrous convolutions with different rates are applied to the same feature map

• In the figure on the right, the feature of the red pixel is being computed

• The size of the feature map after ASPP can be set freely

“DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs”, L. Chen et al. 2016
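
A toy 1-D version of the parallel branches might look like the following. It is a sketch under assumptions, not the DeepLab implementation: the rates `(1, 2, 4)`, the all-ones kernels, and the helper names are hypothetical, and the branch outputs are simply stacked per position as a stand-in for combining them along the channel axis.

```python
def dilated_conv1d(signal, kernel, rate):
    """1-D dilated convolution with stride 1 and zero padding."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * rate
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def aspp1d(signal, kernels, rates):
    """Run one dilated branch per rate and stack the results per position."""
    branches = [dilated_conv1d(signal, k, r) for k, r in zip(kernels, rates)]
    return [list(channels) for channels in zip(*branches)]


signal = [0.0] * 8 + [1.0] + [0.0] * 8      # single impulse at index 8
rates = (1, 2, 4)                            # illustrative rates only
kernels = [[1.0, 1.0, 1.0]] * len(rates)

features = aspp1d(signal, kernels, rates)
print(features[8])   # every branch sees the impulse at its center tap
print(features[4])   # only the rate-4 branch reaches index 8 from index 4
```

Each position ends up with one response per rate, so the same pixel is described at several effective scales at once.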

Page 8

DeepLab v1 architecture

• The fully connected layers of VGG16 are replaced with atrous convolution, ASPP, and 1x1 convolutions

“DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs”, L. Chen et al. 2016

Page 9

DeepLab v3

• “Rethinking Atrous Convolution for Semantic Image Segmentation”

• Differences from DeepLab v1 and v2

– atrous convolution in cascade (serial)

– atrous convolution in parallel

• As the title says, the paper revisits atrous convolution and develops it further

Page 10

Atrous convolution in cascade and in parallel

• ResNet is made deeper, and atrous convolutions are stacked in place of stride-2 convolutions

• Atrous convolutions with different dilation rates are also arranged in parallel

“Rethinking Atrous Convolution for Semantic Image Segmentation”, L. Chen et al. 2017 (arXiv:1706.05587)
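
The trade of stride for dilation can be sketched as simple bookkeeping. The helper below is an illustrative assumption, not code from the paper: `plan_strides` and the stride list are hypothetical. Once the accumulated stride reaches the target output stride, a layer keeps stride 1, and the stride it gave up multiplies the dilation rate of the layers that follow.

```python
def plan_strides(layer_strides, output_stride):
    """Return a (stride, dilation) pair per layer.

    Downsampling stops once the accumulated stride reaches `output_stride`;
    every stride removed after that point is folded into the dilation rate
    used by the subsequent layers.
    """
    plan = []
    current_stride = 1
    rate = 1
    for s in layer_strides:
        if current_stride >= output_stride:
            plan.append((1, rate))   # keep resolution at this layer
            rate *= s                # later layers compensate with dilation
        else:
            plan.append((s, rate))   # still allowed to downsample
            current_stride *= s
    return plan


# Hypothetical network: five stride-2 layers followed by one stride-1 layer.
strides = [2, 2, 2, 2, 2, 1]

print(plan_strides(strides, 16))
print(plan_strides(strides, 8))
```

With a target output stride of 8, the last layers in this toy network end up with rates 2 and 4, so the feature map stays at 1/8 resolution while the receptive field keeps growing.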

Page 11

DeepLab v3+

• “Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation”

• Differences from DeepLab v3

– The decoder structure was improved

    • Previously, the output was simply upsampled bilinearly

– The Xception network architecture was incorporated

Page 12

Improved decoder

• Makes use of low-level features
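
As a rough illustration of how low-level features enter the decoder, the sketch below only tracks tensor shapes. The specific numbers (encoder output stride 16, low-level features at stride 4, 256 encoder channels, 48 reduced low-level channels, 21 classes) reflect my reading of the v3+ paper and should be treated as assumptions; the function `decoder_shapes` is hypothetical.

```python
def decoder_shapes(input_hw, encoder_stride=16, low_level_stride=4,
                   encoder_channels=256, low_level_channels=48,
                   num_classes=21):
    """Trace (height, width, channels) through a v3+-style decoder."""
    h, w = input_hw
    encoder = (h // encoder_stride, w // encoder_stride, encoder_channels)
    # Low-level features after a 1x1 conv that reduces their channels.
    low_level = (h // low_level_stride, w // low_level_stride,
                 low_level_channels)
    # Upsample the encoder output to the low-level resolution.
    factor = encoder_stride // low_level_stride
    upsampled = (encoder[0] * factor, encoder[1] * factor, encoder[2])
    if upsampled[:2] != low_level[:2]:
        raise ValueError("spatial sizes must match before concatenation")
    # Concatenate along channels, then refine and upsample to input size.
    fused = (low_level[0], low_level[1], upsampled[2] + low_level[2])
    logits = (fused[0] * low_level_stride, fused[1] * low_level_stride,
              num_classes)
    return {"encoder": encoder, "low_level": low_level,
            "upsampled": upsampled, "fused": fused, "logits": logits}


shapes = decoder_shapes((512, 512))
for name, shape in shapes.items():
    print(name, shape)
```

The point of the two-step upsampling is that the fine spatial detail comes from the stride-4 features, rather than from one large bilinear upsampling of the coarse encoder output.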

Page 13

Using the Xception model

• The encoder is changed to Xception

• Convolution is split into a spatial step and a channel step

• Stride-2 pooling is replaced with strided depthwise separable convolution
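
The split into a spatial step and a channel step can be sketched in 1-D plain Python, together with a weight count that shows why it is cheaper. Everything here is illustrative: the function names, the toy input, and the 3x3, 256-channel example are assumptions, and biases are ignored.

```python
def depthwise_conv1d(x, kernels, rate=1):
    """Spatial step: each channel is filtered by its own kernel."""
    out = []
    for channel, kernel in zip(x, kernels):
        center = len(kernel) // 2
        row = []
        for i in range(len(channel)):
            acc = 0.0
            for j, w in enumerate(kernel):
                idx = i + (j - center) * rate   # rate > 1 makes it atrous
                if 0 <= idx < len(channel):
                    acc += w * channel[idx]
            row.append(acc)
        out.append(row)
    return out


def pointwise_conv1d(x, weights):
    """Channel step: a 1x1 convolution, weights[o][c] mixes channel c into o."""
    length = len(x[0])
    return [[sum(w * x[c][i] for c, w in enumerate(w_out))
             for i in range(length)]
            for w_out in weights]


def standard_conv_params(k, c_in, c_out):
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    return k * k * c_in + c_in * c_out   # depthwise + pointwise


x = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]   # 2 channels
spatial = depthwise_conv1d(x, [[1.0, 1.0, 1.0], [0.0, 1.0, 0.0]])
mixed = pointwise_conv1d(spatial, [[1.0, 1.0]])          # 1 output channel
print(mixed)

print(standard_conv_params(3, 256, 256), separable_conv_params(3, 256, 256))
```

For a 3x3 layer with 256 input and 256 output channels, the separable form needs 67,840 weights against 589,824 for a standard convolution, roughly 8.7 times fewer.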

Page 14

Summary of experimental results

• Results on the PASCAL VOC 2012 test set

“DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs”, L. Chen et al. 2016

Page 15

Summary

• DeepLab v1, v2

– atrous convolution

– atrous spatial pyramid pooling

• DeepLab v3

– atrous convolution in cascade

– atrous convolution in parallel

• DeepLab v3+

– low-level features are used in the decoder

– Xception is used as the encoder

Page 16

References

• “Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation”, L. Chen et al. 2018

• “DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs”, L. Chen et al. 2016

• “Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition”, K. He et al. 2014

• “Xception: Deep Learning with Depthwise Separable Convolutions”, F. Chollet, CVPR 2017