A Colorectal Cancer Recognition System for Colonoscopy
TRANSCRIPT
A colorectal cancer recognition system for colonoscopic examination
Toru Tamaki, Bisser Raytchev, Kazufumi Kaneda, and many others
http://www.sciencekids.co.nz/pictures/humanbody/braintomography.html
http://www.sciencekids.co.nz/pictures/humanbody/heartsurfaceanatomy.html
http://sozai.rash.jp/medical/p/000154.html
http://sozai.rash.jp/medical/p/000152.html
http://medical.toykikaku.com/ħ�źŬŧƂ/ć/
http://www.sciencekids.co.nz/pictures/humanbody/humanorgans.html
Examples of diagnostic imaging:
• Brain tumor: CT image
• Heart disease: diagnostic images
• Colorectal cancer: NBI endoscopic image
Colorectal cancer in Japan
• Incidence: 235,000 cases (2009 estimate) — among the most common cancers
• Deaths: 42,434 (one year) — roughly a 1.7-fold increase over 20 years — 3rd among cancer deaths overall (1st: lung cancer, 2nd: stomach cancer) — 1st among cancer deaths in women
• 5-year survival rate: around 20% at the most advanced stage
http://www.mhlw.go.jp/toukei/saikin/
http://www.gunma-cc.jp/sarukihan/seizonritu/index.html
[Bar chart: colorectal cancer 5-year survival rate [%] (0–100) by stage, stage 1 – stage 4]
stage 1: cancer confined to the colorectal wall; stage 2: cancer beyond the wall; stage 3: lymph-node metastasis; stage 4: metastasis to other organs
At stage 1 (early cancer), nearly 100% of cases can be cured.
[Line chart: fatalities of colorectal cancer per year, '90–'09]
http://www.mhlw.go.jp/toukei/saikin/hw/jinkou/geppo/nengai11/kekka03.html#k3_2
Annual trend in death rates (per 100,000 population) for major sites of malignant neoplasms
http://ameblo.jp/gomora16610/entry-10839715830.html
Can early cancer be found by fecal occult blood testing?
Fecal occult blood testing cannot find every early cancer.
One account: "I had a fecal occult blood test every year and felt reassured. When I was finally examined after falling ill, it was already advanced cancer."
http://daichou.com/ben.htm
Colonoscopy
• A scope with a CCD at its tip is inserted through the anus to examine the inside of the colon
• Steps of a colonoscopic examination
Malignancy is diagnosed from the surface pattern of the colorectal wall, magnified up to about 100×.
I think this is a cancer…
Endoscopic examination: confirming the presence of lesions and diagnosing them
From the diagnosed lesion tissue, benignity and degree of malignancy are determined
http://www.ajinomoto-seiyaku.co.jp/newsrelease/2004/1217.html
Colonoscopy: Yotsuba Clinic blog http://yotsuba-clinic.jp/WordPress/?p=63
Oiya Clinic http://www.oiya-clinic.jp/inform3.html
Colonoscopy insertion video: https://www.youtube.com/watch?v=40L-y9rNOzw
Capture ~Setup~
NBI endoscope
Processing PC
Scope
Light source (NBI)
Video processor
Recorder / scope connection port
Conventional endoscope vs. magnifying endoscope (70×–100×)
Observation modes:
• white-light observation
• indigo carmine staining
• crystal violet staining
• NBI
Magnifying endoscope: the same observation modes apply.
NBI (Narrow Band Imaging: narrow-band light observation)
Reference: an endoscopy atlas on optical/digital image-enhanced observation — NBI, AFI, and IRI in practice — Nihon Medical Center, 2006.
[Diagram: light from a xenon lamp passes an RGB rotary filter; with the NBI filter switched ON, only narrow bands at 415 nm and 540 nm illuminate the mucosa; the CCD signal is color-transformed in the video processor and shown on the monitor. Filter OFF: normal light; filter ON: NBI.]
Based on the absorption characteristics of hemoglobin
http://www.olympus.co.jp/jp/technology/technology/luceraelite/
Colorectal cancer examination flow
• Fecal occult blood test (primary screening)
• Endoscopic examination (detailed examination)
  ü no treatment is performed; examination only
  ü identification of the affected region
  ü based on the findings, tissue is taken at suspicious sites
  ü the physician judges lesions during the examination ("I think this is a cancer…")
• Treatment plan: follow-up observation / endoscopic resection / surgery / other examinations (biopsy, etc.)
  – Biopsy tends to be avoided because it risks affecting progression or metastasis of the lesion.
http://cancernavi.nikkeibp.co.jp/daicho/worry/post_2.html
[Examination-flow slide repeated, highlighting the endoscopic-examination step]
Research objective
Development of a real-time recognition system aimed at supporting diagnosis
Ø contributes to objective judgment during examinations
Ø supports physicians
[Examination-flow slide repeated, highlighting where this work fits]
Related research: detection and recognition of lesions
Oh et al., MIA '07; Sundaram et al., MIA '08; Diaz & Rao, PRL '07; Al-Kadi, PR '10; Gunduz-Demir et al., MIA '10; Tosun et al., PR '09
Pit-pattern classification: Häfner et al., PAA '09; Häfner et al., ICPR '10; Häfner et al., PR '09; Kwitt & Uhl, ICCV '07; Tischendorf et al., Endoscopy '10
NBI magnifying-endoscopy classification: Stehle et al., MI '09; Gross et al., MI '08; et al. (in Japanese), PRMU '10; Tamaki et al., ACCV '10
pit-pattern classification [S. Tanaka et al., '06]
• Classification by the shape of the pits (gland openings) on the colorectal surface
• Staining takes time and places a large burden on the patient
Type I: round pits (normal gland openings)
Type II: stellar pits
Type IIIs: tubular/round pits smaller than normal
Type IIIL: tubular/round pits larger than normal
Type IV: dendritic or gyrus-like pits
Type V(I): pits of types I–IV with irregular size and arrangement
Type V(N): amorphous pits suggesting loss of structure and invasion
(normal — adenoma — advanced cancer)
NBI magnifying-observation classification (NBI: Narrow-band Imaging) [H. Kanao et al., '09]
• Classification by pit structure and microvessel patterns
  – no staining required; only the illumination needs switching
  – smaller burden on the patient
Type A: microvessels are not (or only faintly) visible
Type B: fine microvessels surround the gland openings, and a regular, uniform pit structure is observed
Type C1: an irregular pit structure is observable; vessel caliber and distribution are comparatively uniform
Type C2: a strongly irregular pit structure is observable; vessel caliber and distribution are non-uniform
Type C3: the pit structure is unclear and cannot be observed; vessel caliber and distribution are non-uniform; avascular areas (AVA) appear; scattered microvessels are present
(Type A: normal — Type B: adenoma/early cancer — Type C: cancer of increasing invasion)
texture analysis approach
Yoshito Takemura, Shigeto Yoshida, Shinji Tanaka, Keiichi Onji, Shiro Oka, Toru Tamaki, Kazufumi Kaneda, Masaharu Yoshihara, Kazuaki Chayama: "Quantitative analysis and development of a computer-aided system for identification of regular pit patterns of colorectal lesions," Gastrointestinal Endoscopy, Vol. 72, No. 5, pp. 1047-1051 (2010.11).
Bag-of-Visual Words Approach
[Diagram: local feature descriptors are extracted from Type A / Type B / Type C3 training images; vector quantization in feature space assigns each descriptor to a visual word; each image becomes a visual-word histogram; a classifier is learned from the histograms and applied to a test image.]
Description of local features → vector quantization → histogram → classifier → classification result
Local features + Bag-of-Features
Object → Bag of "words"
Slide by Li Fei-Fei at CVPR2007 Tutorial http://people.csail.mit.edu/torralba/shortCourseRLOC/
Analogy to documentsOf all the sensory impressions proceeding to the brain, the visual experiences are the dominant ones. Our perception of the world around us is based essentially on the messages that reach the brain from our eyes. For a long time it was thought that the retinal image was transmitted point by point to visual centers in the brain; the cerebral cortex was a movie screen, so to speak, upon which the image in the eye was projected. Through the discoveries of Hubel and Wiesel we now know that behind the origin of the visual perception in the brain there is a considerably more complicated course of events. By following the visual impulses along their path to the various cell layers of the optical cortex, Hubel and Wiesel have been able to demonstrate that the message about the image falling on the retina undergoes a step-wise analysis in a system of nerve cells stored in columns. In this system each cell has its specific function and is responsible for a specific detail in the pattern of the retinal image.
sensory, brain, visual, perception,
retinal, cerebral cortex,eye, cell, optical
nerve, imageHubel, Wiesel
China is forecasting a trade surplus of $90bn (£51bn) to $100bn this year, a threefold increase on 2004's $32bn. The Commerce Ministry said the surplus would be created by a predicted 30% jump in exports to $750bn, compared with a 18% rise in imports to $660bn. The figures are likely to further annoy the US, which has long argued that China's exports are unfairly helped by a deliberately undervalued yuan. Beijing agrees the surplus is too high, but says the yuan is only one factor. Bank of China governor Zhou Xiaochuan said the country also needed to do more to boost domestic demand so more goods stayed within the country. China increased the value of the yuan against the dollar by 2.1% in July and permitted it to trade within a narrow band, but the US wants the yuan to be allowed to trade freely. However, Beijing has made it clear that it will take its time and tread carefully before allowing the yuan to rise further in value.
China, trade, surplus, commerce,
exports, imports, US, yuan, bank, domestic,
foreign, increase, trade, value
Histograms: Type A / Type B / Type C3
Training images: Type A, Type B, Type C3
Features:
Bag-of-Visual-Words approach: classifying image patches of lesions [Tamaki et al., 2013]
• Trained on 908 NBI images (Type A: 359, Type B: 462, Type C3: 87)
• Types C1 and C2 are excluded because they often contain ambiguous regions
• Best recognition rate: 96%
Recognition pipeline:
Extract features → cluster the features and take the representatives as Visual Words → build a Visual-Words histogram
Test image → build its Visual-Words histogram → Type? (e.g. Type B)
SVM training / recognition
The Bag-of-Visual-Words framework
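The framework above (local descriptors → clustering into Visual Words → vector quantization → histogram → classifier) can be sketched compactly in NumPy. This is a minimal illustration, not the authors' implementation: random vectors stand in for SIFT descriptors, and a plain Lloyd k-means stands in for the clustering step.

```python
import numpy as np

def build_vocabulary(descriptors, k, iters=20, seed=0):
    """Cluster local descriptors into k visual words with plain Lloyd k-means."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), size=k, replace=False)].copy()
    for _ in range(iters):
        # assign each descriptor to its nearest center
        d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centers[j] = members.mean(0)
    return centers

def bovw_histogram(descriptors, centers):
    """Vector-quantize an image's descriptors into a normalized visual-word histogram."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(1)
    hist = np.bincount(words, minlength=len(centers)).astype(float)
    return hist / hist.sum()
```

The resulting per-image histograms are what the SVM is trained on in the slides that follow.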
Local features: gridSIFT
• Scale Invariant Feature Transform (SIFT) [Lowe, '99]
  – describes the intensity-gradient distribution around a keypoint as a 128-dimensional vector
  – with DoG keypoint detection alone, the recognition rate stays around 90[%] or below
  (scale-space construction → keypoint detection (DoG) → descriptor extraction)
• SIFT descriptors via grid sampling (gridSIFT)
  – for recognition, sampling densely improves performance
  – sample on a regular grid and extract SIFT descriptors
grid sampling / grid space / scale size
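The grid-sampling step can be sketched as follows; the grid spacing and margin are the knobs varied later in the experiments. This is a generic sketch: in practice a SIFT descriptor (e.g. via OpenCV or VLFeat, which the system uses) would be computed at each grid point at one or more fixed scales.

```python
import numpy as np

def grid_keypoints(height, width, spacing, margin=0):
    """Regular grid of (row, col) sampling points, replacing DoG keypoint
    detection with dense grid sampling as in gridSIFT."""
    ys = np.arange(margin, height - margin, spacing)
    xs = np.arange(margin, width - margin, spacing)
    return np.array([(y, x) for y in ys for x in xs])
```

Halving the spacing quadruples the number of sampling points, which is exactly the feature-count growth discussed in the training-time slides below.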
Classifier: Support Vector Machine (SVM)
• Kernels
  – linear: k_linear(u, v) = uᵀv
  – Radial basis function (RBF): k_RBF(u, v) = exp(−γ‖u − v‖²)
  – χ²: k_χ²(u, v) = exp(−γ Σ_i (u_i − v_i)² / (u_i + v_i))
• Multi-class classification: One-Versus-One
A two-class max-margin classifier: maximize the margin 2/‖w‖ between the classes, i.e.
  min ½‖w‖²  subject to  y_i (wᵀx_i) ≥ 1
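The χ² kernel named above can be sketched for the visual-word histograms; this is a generic implementation of the usual exponential χ² form, with a small epsilon guarding empty bins (an implementation choice, not from the slides).

```python
import numpy as np

def chi2_kernel(U, V, gamma=1.0):
    """Exponential chi-squared kernel for histograms:
    k(u, v) = exp(-gamma * sum_i (u_i - v_i)^2 / (u_i + v_i))."""
    K = np.empty((len(U), len(V)))
    for i, u in enumerate(U):
        num = (u - V) ** 2                 # (len(V), dim)
        den = u + V
        d = np.where(den > 0, num / np.maximum(den, 1e-12), 0.0).sum(axis=1)
        K[i] = np.exp(-gamma * d)
    return K
```

Such a kernel matrix would typically be handed to an SVM as a precomputed kernel (for instance scikit-learn's `SVC(kernel="precomputed")`), with one-versus-one handling of the three Types as the slide states.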
Experimental images
• Lighting, shooting conditions, and magnification vary
• Typical Type regions are trimmed from endoscopic videos
• Image size: 100×300 to 900×800 [pix.]
• Labeled by physicians
• 908 training images in total (Type A: 359, Type B: 462, Type C3: 87)
< example images: Type A / Type B / Type C3 >
Results <10-fold Cross Validation>
[Plots: correct rate, recall rate, and precision rate (Type A / Type B / Type C3) [%] vs. number of visual words (10–100,000)]
Best correct rate: 96.00%
Results <Holdout Testing>
[Plots: correct rate, recall rate, and precision rate (Type A / Type B / Type C3) [%] vs. number of visual words (10–100,000)]
Best correct rate: 92.86%
MOTIVATION
• Improving image-recognition accuracy ← large training datasets
• Building a large database to recognize NBI images captured under many different conditions
→ extending the NBI image dataset
→ but labeling a large number of images costs: × cost × time × physician effort
(A / B / C3)
ABSTRACT
• Self-training
  n label unlabeled regions
  n build a training dataset that raises the recognition rate
× conventional method [Yoshimuta et al., '10] → ○ proposed method
Key Idea: use information from the unlabeled regions
Self-training
• use unlabeled data to improve performance
• generate training data sequentially
initial training → recognition → confidence check → Accept / Reject
labeled data ⇄ unlabeled data; accepted samples are added and the classifier is retrained
POINT
1. How to assign the labels
2. How to choose the samples to add
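The self-training loop above can be sketched generically. The classifier here is a deliberately simple nearest-centroid stand-in (the actual system uses an SVM), and the confidence score and threshold are illustrative assumptions; the Accept/Reject step is the point.

```python
import numpy as np

class NearestCentroid:
    """Stand-in classifier with confidence scores; an SVM fills this role in the system."""
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.centers = np.array([X[y == c].mean(0) for c in self.classes])
        return self
    def predict_conf(self, X):
        d = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(-1)
        e = np.exp(-(d - d.min(1, keepdims=True)))   # soft confidence from distances
        conf = e / e.sum(1, keepdims=True)
        return self.classes[conf.argmax(1)], conf.max(1)

def self_train(X_l, y_l, X_u, threshold=0.9, rounds=3):
    """Self-training: train, label the unlabeled pool, accept only confident
    predictions (Accept/Reject), grow the labeled set, and retrain."""
    X_l, y_l, X_u = X_l.copy(), y_l.copy(), X_u.copy()
    for _ in range(rounds):
        clf = NearestCentroid().fit(X_l, y_l)
        if len(X_u) == 0:
            break
        pred, conf = clf.predict_conf(X_u)
        accept = conf >= threshold
        if not accept.any():
            break
        X_l = np.vstack([X_l, X_u[accept]])
        y_l = np.concatenate([y_l, pred[accept]])
        X_u = X_u[~accept]
    return NearestCentroid().fit(X_l, y_l), len(y_l)
```

The two POINTs map directly onto this sketch: the predicted label is what gets assigned, and the confidence threshold decides which samples are added.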
labeled samples
• trimmed and labeled by physicians
• image size: 100×300 to 900×800 [pix.]
• breakdown:
TypeA 359 / TypeB 462 / TypeC3 87 / Total 908
Unlabeled samples
• 10 patches cut out from each image
• patch size: 30×30 to 250×250 [pix.]
• how patches are cut:
  – random crops from the whole image
  – crops around the labeled region
• breakdown:
TypeA 3590 / TypeB 4610 / TypeC3 870 / Total 9070
* Some source images are unavailable, so fewer than 10 unlabeled samples could be cut from part of the labeled set.
Result
[Bar chart: recognition rate 0.90–0.96 for the conventional method vs. Algorithm 1 / Algorithm 2 / Algorithm 3]
Using samples cut out around the labeled regions worked best.
* p = 0.013314
Topic 2
(Recap of the Bag-of-Visual-Words framework [Tamaki et al., 2013]: 908 NBI training images, Type A: 359, Type B: 462, Type C3: 87; Types C1/C2 excluded as ambiguous; best recognition rate 96%.)
Grid spacing | 15[pixel] | 10[pixel] | 5[pixel]
Best recognition rate | 92.11[%] | 93.89[%] | 96.00[%]
Training time | ~13 min | ~30 min | ~3 h
(spacing ×2/3: +1.78%; spacing ×1/2: +2.11%)
The growth of training time with the number of features is the problem.
Background: grid spacing vs. training time
Ø Extracting more features improves the recognition rate [Jurie et al., 2005]
Ø Narrowing the feature-extraction (grid) spacing was confirmed to improve recognition [Yoshimuta et al., 2011]
Feature count: ×2.25 (spacing ×2/3), ×4 (spacing ×1/2)
Training images: 908 NBI images (Type A: 359, Type B: 462, Type C3: 87)
Conventional method
feature space / histograms
1. Extract features (on a grid) from all training images
2. Cluster the extracted features
3. Extract features (on a grid) from each training image
4. Vector-quantize the features into its Visual-Words histogram
grid spacing
all training images I = {I_n | n = 1, …, N}
training image I_n ∈ I
Visual Words
Proposed method (reduce the features used to create the Visual Words)
feature space / histograms
1. Extract a small number of features from all training images
2. Cluster this small feature set into Visual Words
3. Extract many features (on a grid) from each training image
4. Vector-quantize the features into its Visual-Words histogram
grid spacing / all training images I = {I_n | n = 1, …, N} / training image I_n ∈ I / Visual Words
Goal: confirm reduced training time and improved recognition rate
Visual-Words creation: feature count reduced; histogram creation: feature count increased
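The one change the proposed method makes to the pipeline is step 1: clustering only a random subset of descriptors (19,742 of the millions extracted, per the experiment slide), since k-means cost grows with the number of clustered points, while all densely-extracted descriptors are still quantized into histograms. A minimal sketch of that subsampling step:

```python
import numpy as np

def subsample_for_vocab(descriptors, n_keep, seed=0):
    """Pick a small random subset of descriptors for Visual-Words creation.
    k-means cost is roughly linear in the number of clustered descriptors,
    so clustering only this subset cuts vocabulary-building time; histogram
    creation still uses every densely-extracted descriptor."""
    rng = np.random.default_rng(seed)
    n_keep = min(n_keep, len(descriptors))
    idx = rng.choice(len(descriptors), size=n_keep, replace=False)
    return descriptors[idx]
```

The clustering and quantization steps themselves are unchanged from the conventional pipeline.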
Experimental setup
Environment — OS: Linux Fedora 18, CPU: Intel Xeon CPU E5-2620, Memory: 128GB
Classifier — Ø Linear SVM
Training images — Ø 908 labeled NBI images (Type A: 359, Type B: 462, Type C3: 87)
Reduce the number of features used to create the Visual Words: 19,742 (vs. 8,678,198 extracted in total at grid spacings of 5/2/1 [pixel])
Increase the number of features used to create the histograms
Ø total training time vs. feature count — confirm the reduction in training time
Ø recognition rate vs. grid spacing — confirm the improvement in recognition rate
Training time vs. grid spacing
[Bar chart, CPU time in seconds:
conventional (spacing 5): 10233.45
proposed (spacing 5): 680.72 (6.6%)
proposed (spacing 2): 4167.89 (40.7%)
proposed (spacing 1): 16471.8 (160.9%)]
With grid spacings of 5 and 2 [pixel] the training time is reduced; at 1 [pixel] it increases.
Number of Visual Words: 32
Recognition rate vs. grid spacing
[Plot: correct rate 0.80–0.98 vs. number of Visual Words (32–16384) for the conventional method (spacing 5) and the proposed method (spacings 5, 2, 1)]
There is a clear gap between spacing 5 [pixel] and spacings 2 and 1 [pixel]; spacings 2 and 1 [pixel] show little difference from each other.
Problem
Different optics → different captured images → different feature distributions
Old and new endoscopes coexist in practice.
Old endoscopy (EVIS LUCERA) vs. New endoscopy (EVIS LUCERA ELITE):
Viewing angle — 140°(WIDE), 80°(TELE) vs. 170°(WIDE), 90°(TELE)
Resolution — 1440×1080 vs. 1980×1080
Ø The new endoscope is wider-angle, higher-resolution, and brighter
→ recognition performance drops on the new endoscope; old and new training images cannot simply be mixed
Collecting training images for the new endoscope is difficult:
Ø recognition assumes training and test data share the same distribution
Ø cancer patients are not numerous
Ø images can only be captured during examinations
Ø only physicians can do the labeling
Ø the latest devices keep appearing; we are in a transition period
http://www.olympus.co.jp/jp/technology/technology/luceraelite/
Objective
Solution: transform the new endoscope's features into the old endoscope's feature space, then train.
Framework of transfer learning — the two image sets are related.
[Histograms: New endoscopy / Old endoscopy]
transform the features
train: old, recognize: old → works; train: old, recognize: new → recognition rate drops
Related Work
Adapting Visual Category Models to New Domains [Saenko et al., ECCV2010]
— addresses joint recognition of Source and Target (Source: x, Target: y)
— learns a matrix W that transforms Target into Source
Our problem recognizes only the Target.
Ø Our method has no hyperparameters
Ø Saenko et al.'s method has hyperparameters that require adjustment
Our Approach
Find a matrix A satisfying the Source–Target conditions (for each class):
Same class: (x_i − y_j)ᵀ A (x_i − y_j) ≤ upper bound
Different class: (x_i − y_j)ᵀ A (x_i − y_j) ≥ lower bound
(A^{1/2} maps Target features toward Source features)
→ W = arg min_W ‖x − W y‖²_F
Convert Histogram
Source X = (x_1, …, x_N), Target Y = (y_1, …, y_N)
1. Treat the Visual-Words histograms as vectors and stack them into matrices X and Y.
2. Find the transform matrix W minimizing the histogram error with ADMM*:
   arg min_W ‖X − W Y‖²_F  subject to  W_ij ≥ 0
*ADMM solution (the rows of W decouple, so this can be solved row by row) — iterate:
   W^{k+1} = (X Yᵀ + Z^k − U^k)(Σ_{n=1}^N y_n y_nᵀ + E)^{−1}
   Z^{k+1} = π_C(W^{k+1} + U^k)   (π_C: projection onto W_ij ≥ 0)
   U^{k+1} = U^k + W^{k+1} − Z^{k+1}
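The ADMM iteration above is short enough to sketch directly in NumPy. Matrix sizes and data here are illustrative; the penalty parameter is fixed at 1, matching the (Σ y yᵀ + E) form on the slide.

```python
import numpy as np

def admm_nnls(X, Y, iters=200):
    """ADMM for min_W (1/2)||X - W Y||_F^2 subject to W_ij >= 0.
    X: (d, N) source histograms, Y: (d, N) target histograms; returns W: (d, d)."""
    d = X.shape[0]
    Z = np.zeros((d, d))
    U = np.zeros((d, d))
    Ginv = np.linalg.inv(Y @ Y.T + np.eye(d))   # (sum_n y_n y_n^T + E)^(-1), fixed
    XYt = X @ Y.T
    for _ in range(iters):
        W = (XYt + Z - U) @ Ginv                # unconstrained least-squares step
        Z = np.maximum(W + U, 0.0)              # projection pi_C onto W_ij >= 0
        U = U + W - Z
    return Z
```

Applying the returned W to new-endoscope histograms maps them into the old-endoscope feature space before training, as the Objective slide describes.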
How to Make Pseudo Dataset
l Since new-endoscope images are expected to look crisper and more vivid, apply
  ① contrast enhancement ② a sharpening filter
[Tone curve: input range 42–213 mapped to output 0–255] [3×3 sharpening kernel]
contrast enhancement → sharpening filter
Source → Target
l This approach cannot be used without correspondences between training images
Ø in reality, obtaining corresponding image pairs is difficult
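The two operations can be sketched as below. The 42/213 tone-curve endpoints mirror the slide; the 3×3 sharpening kernel is a generic choice, since the slide's exact kernel is not recoverable.

```python
import numpy as np

def stretch_contrast(img, lo=42, hi=213):
    """Tone-curve contrast enhancement: map [lo, hi] linearly to [0, 255]."""
    out = (img.astype(float) - lo) * 255.0 / (hi - lo)
    return np.clip(out, 0.0, 255.0)

def sharpen(img):
    """3x3 sharpening filter (generic kernel, weights summing to 1,
    so flat regions pass through unchanged)."""
    k = np.array([[0.0, -1.0, 0.0], [-1.0, 5.0, -1.0], [0.0, -1.0, 0.0]])
    pad = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += k[dy, dx] * pad[dy:dy + h, dx:dx + w]
    return np.clip(out, 0.0, 255.0)

def make_pseudo_target(img):
    """Old-endoscope image -> pseudo new-endoscope image."""
    return sharpen(stretch_contrast(img))
```

Each old-endoscope image processed this way yields a corresponding pseudo new-endoscope image, giving the paired data the transform estimation needs.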
Result
Transferring achieves a recognition rate comparable to the old endoscope (almost the same).
Training / Test:
n ① Source / Source
n ② Source / Target
n ③ Source+Target / Target
n ④ Source + W-transformed Target / Target
Related Works
Cross-Domain Transform [Saenko et al., ECCV2010]
  min tr(W) − log det W
  s.t. W ⪰ 0
       ‖x_iˢ − x_jᵗ‖_W ≤ upper bound, (x_iˢ, x_jᵗ) ∈ the same class
       ‖x_iˢ − x_jᵗ‖_W ≥ lower bound, (x_iˢ, x_jᵗ) ∈ different classes
Ø Estimates a transformation matrix that minimizes a Mahalanobis distance.
Ø Considers only the transformed feature distributions.
Ø Does not ensure the classification result.
Max-Margin Domain Transfer (MMDT) [Hoffman et al., ICLR2013]
  min_{W,θ,b} ½‖W‖²_F + ½ Σ_{k=1}^K ‖θ_k‖²_2 + C_s Σ_{i=1}^n Σ_{k=1}^K ξˢ_{i,k} + C_t Σ_{j=1}^m Σ_{k=1}^K ξᵗ_{j,k}
  s.t. yˢ_{i,k} θ_kᵀ x_iˢ − b_k ≥ 1 − ξˢ_{i,k}
       yᵗ_{j,k} θ_kᵀ W x_jᵗ ≥ 1 − ξᵗ_{j,k}
       ξˢ_{i,k} ≥ 0, ξᵗ_{j,k} ≥ 0
Ø Optimizes the transformation matrix and the SVM parameters at the same time.
Ø Ensures the classification result.
Ø Does not guarantee the transformed feature distributions.
W: transform matrix; θ_k: SVM parameter; ξˢ, ξᵗ: slack variables; y_{i,k}: indicator function
Proposed Method
Max-Margin Domain Transfer with L2 Distance Constraints (MMDTL2)
  min_{W,θ,b} ½‖W‖²_F + ½ Σ_{k=1}^K ‖θ_k‖²_2 + C_s Σ_{i=1}^n Σ_{k=1}^K ξˢ_{i,k} + C_t Σ_{j=1}^m Σ_{k=1}^K ξᵗ_{j,k} + (D/2) Σ_{i=1}^M Σ_{j=1}^N y_{i,j} ‖W x_iᵗ − x_jˢ‖²_2
  s.t. yˢ_{i,k} θ_kᵀ x_iˢ − b_k ≥ 1 − ξˢ_{i,k}
       yᵗ_{j,k} θ_kᵀ W x_jᵗ ≥ 1 − ξᵗ_{j,k}
       ξˢ_{i,k} ≥ 0, ξᵗ_{j,k} ≥ 0
The added term constrains the transformed target to lie close to the source.
Ø Adds L2 distance constraints to MMDT.
Ø Our method ensures both the classification result and the transformed feature distributions.
Decompose to Sub-problems
Hoffman et al. decompose the MMDT objective into 2 sub-problems; our method decomposes likewise. The objective is optimized by iterating (1) and (2):
(1) min_{θ,ξˢ,ξᵗ} ½ Σ_{k=1}^K ‖θ_k‖²_2 + C_s Σ_{i=1}^N Σ_{k=1}^K ξˢ_{i,k} + C_t Σ_{j=1}^M Σ_{k=1}^K ξᵗ_{j,k}
    — objective for optimizing the SVM parameters
(2) min_{W,ξᵗ} ½‖W‖²_F + C_t Σ_{j=1}^M Σ_{k=1}^K ξᵗ_{j,k} + (D/2) Σ_{i=1}^M Σ_{j=1}^N y_{i,j} ‖W x_iᵗ − x_jˢ‖²_2
    — objective for optimizing the transform matrix (the added term keeps the transformed target close to the source)
  s.t. yˢ_{i,k} θ_kᵀ x_iˢ − b_k ≥ 1 − ξˢ_{i,k}; yᵗ_{j,k} θ_kᵀ W x_jᵗ ≥ 1 − ξᵗ_{j,k}; ξˢ_{i,k} ≥ 0, ξᵗ_{j,k} ≥ 0
Primal Problem
With w = vec(W), φ(x) = vec(θ xᵀ), v_{i,j} = vec(x_jˢ (x_iᵗ)ᵀ), and U(x) = blockdiag(x xᵀ, …, x xᵀ):
(2) min_{w,ξᵗ} ½‖w‖²_2 + C_t Σ_{j=1}^M Σ_{k=1}^K ξᵗ_{j,k} + (D/2) Σ_{i=1}^M Σ_{j=1}^N (wᵀ U(x_iᵗ) w − 2 v_{i,j}ᵀ w + (x_iᵗ)ᵀ x_jˢ)
  s.t. ξᵗ_i ≥ 0, yᵗ_{i,k} φ_k(x_iᵗ)ᵀ w ≥ 1 − ξᵗ_{i,k}
Derived from the objective for optimizing the transform matrix.
This is a standard quadratic program, but:
p high computational cost
p needs huge memory
p depends on the dimensionality of the data
→ derive the dual problem.
Dual Problem
(2) max_a − ½ Σ_{k₁=1}^K Σ_{k₂=1}^K Σ_{i=1}^M Σ_{j=1}^M a_i a_j yᵗ_{i,k₁} yᵗ_{j,k₂} φ_{k₁}(x_iᵗ)ᵀ V⁻¹ φ_{k₂}(x_jᵗ)
           + Σ_{k=1}^K Σ_{i=1}^M a_i (1 − D φ_k(x_iᵗ)ᵀ V⁻¹ Σ_{m=1}^M Σ_{n=1}^N y_{m,n} v_{m,n})
  s.t. 0 ≤ a_i ≤ C_T, Σ_{i=1}^M a_i yᵗ_{i,k} = 0
where V = (I + D Σ_{i=1}^M Σ_{j=1}^N y_{i,j} U(x_iᵗ)); a_i: Lagrange multipliers
The dual problem has many advantages:
p low computational cost
p defined by a sparse problem
p depends on the number of target data (not the data dimensionality)
Comparison of Primal and Dual Computation Time
SetupTime: computing the coefficients (e.g. U(x) and v_{i,j}). OptimizationTime: solving the quadratic program (for w or a). CalculationTime: recovering w from a (dual only).
[Bar chart: primal vs. dual, total computation time 0–7000 s, broken into SetupTime / OptimizationTime / CalculationTime]
Visual Words: 128 — the dual is about 14 times faster.
Result
MMDTL2 achieves good performance, equivalent with the baseline — but "not transfer" gives the best performance.
[Plot: recognition rate 0.4–1.0 vs. number of Visual Words (8–1024) for Baseline, Source only, Not transfer, MMDT, MMDTL2]
Processing speed: 14.7 [fps]
A / B / C3
Recognition system
Image acquisition
• crop the central 120×120 [pix.] region of the image
Classification
• build the Visual-Word histogram
• classifier: SVM
Output
• classification result (A or B or C3)
• probabilities of A, B, C3
Display
Real-time recognition system
Feature computation
• grid sampling
• SIFT
Classification result: probability of A / probability of B / probability of C3
Thresholding by confidence
[Plot: probabilities (0–1) of A, B, C3 over time]
Set a threshold on the probabilities; if no class exceeds the threshold, the frame is rejected.
Objective
Connect the processing PC to the NBI endoscope and enable online recognition.
System configuration
Development environment: Visual Studio 2012, OpenCV 3.0-devel, VLFeat 0.9.18, Boost 1.55.0, DeckLink SDK 10.0
OS: Windows 7 Home Premium SP1 64bit; CPU: Intel Core i7-4470 3.40GHz; Memory: 16GB
OLYMPUS EVIS LUCERA ELITE (NBI endoscope*1) —SDI→ Blackmagic DeckLink SDI (capture board*2) —PCI Express→ processing PC
*1 http://www.olympus.co.jp/jp/news/2012b/nr121002luceraj.jsp
*2 http://www.genkosha.com/vs/news/entry/sdi.html
Capture ~Setup~
NBI endoscope / processing PC / NBI scope
Capture ~demo & performance~
NBI endoscope screen / processing PC screen
color-conversion processing, feature extraction
Problem [real-time recognition system]
[Plot: probability (0–1) vs. frame number (0–200), Type A / Type B]
• output of the real-time recognition system
• the label classified at each frame (Type A / Type B / Type C3)
→ stable recognition results cannot be obtained
MRF/HMM model
f(x | y) ∝ exp(Σ_i A(x_i, y_i)) · exp(Σ_i Σ_{j∈N_i} I(x_i, x_j))
(data term)                       (smoothing term)
x: the label sequence to estimate at each frame; y: the sequence of per-frame image features output by the SVM
x_1 … x_50 … x_100 … x_150 … x_200 (e.g. B, C3, B, C3, B); i = 0 … 200; y_1 … y_200
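MAP inference in this chain model reduces to dynamic programming (Viterbi), which matches the "DP" results on the following slides. A minimal sketch, with an assumed uniform stay/switch transition standing in for the smoothing term I (the slide does not specify its exact potentials):

```python
import numpy as np

def viterbi_smooth(log_lik, stay=0.99):
    """MAP label sequence for the chain model
    f(x | y) ∝ Π_i exp A(x_i, y_i) · Π_i exp I(x_i, x_{i+1}).
    log_lik[t, k]: per-frame log-probability (data term A);
    stay: probability of keeping the same label between frames (smoothing term I)."""
    T, K = log_lik.shape
    trans = np.full((K, K), np.log((1.0 - stay) / (K - 1)))
    np.fill_diagonal(trans, np.log(stay))
    score = log_lik[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + trans          # cand[from_label, to_label]
        back[t] = cand.argmax(0)
        score = cand.max(0) + log_lik[t]
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

Raising `stay` toward 1 suppresses isolated label flips more aggressively, which is the effect the DP_0.8 … DP_0.999 plots sweep over.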
Applied results
[Plots for a Type B sequence, frame number 0–200: original vs. DP smoothing (DP_0.8, DP_0.9, DP_0.99, DP_0.999) and Gibbs sampling (Gibbs_p4=0.6, 0.7, 0.8, 0.9)]
[Plots: label sequences (A/B/C) over frames 0–200 for Type B and Type A_1 videos: original, DP_0.99, Gibbs_p4=0.9]
Applied results for a Type A video
MAP estimation applied over time, with the estimated labels rendered (Type A / Type B / Type C3)
Labeling results: weighting that favors C3 vs. uniform weighting
The central region of each video frame is cropped and recognition is performed; classification result: probability of A / B / C3
The MRF result is displayed as a colored border on the frame
Colorectal Tumor Classification System in Magnifying Endoscopic NBI Images [Tamaki et al., MedIA2013]
Recognizing colorectal images
p Feature: Bag-of-Visual-Words of densely sampled SIFT
p Classifier: Linear SVM
Extended to video frames
Display posterior probabilities at each frame.
[Plot: probability (0–1) of A / B / C vs. frame number, 0–200]
Highly unstable classification results
Possible Cause of Instability
p Classification results would be affected by out-of-focus frames.
[Plot: recognition rate vs. number of visual words (10–10000) for test images blurred with Gaussians of increasing SD: no defocus, SD = 0.5, 1, 2, 3, 5, 7, 9, 11]
p Test images: 1191 — Gaussian blur with different SDs is added
p Train images: 480 — 160 images per class
Recognition results for out-of-focus images
Particle Filter (Online Bayesian Filtering)
State vector: x_t = (x_t^(A), x_t^(B), x_t^(C3)), with x_t^(A) + x_t^(B) + x_t^(C3) = 1
Observation vector: y_t = (y_t^(A), y_t^(B), y_t^(C3)), with y_t^(A) + y_t^(B) + y_t^(C3) = 1
(t: time)
Prediction (state transition):
p(x_t | y_{1:t−1}) = ∫ p(x_t | x_{t−1}, θ₁) p(x_{t−1} | y_{1:t−1}) dx_{t−1}
Update (likelihood):
p(x_t | y_{1:t}) ∝ p(y_t | x_t, θ₂) p(x_t | y_{1:t−1})
We use the Dirichlet distribution for the state transition and the likelihood.
Dirichlet distribution
Dir_x[α] = (Γ(Σ_{i=1}^N α_i) / Π_{i=1}^N Γ(α_i)) Π_{i=1}^N x_i^{α_i − 1}
[Simplex density plots for α = (0.50, 0.50, 0.50), (0.85, 1.50, 2.00), (1.00, 1.00, 1.00), (1.00, 1.76, 2.35), (4.00, 4.00, 4.00), (3.40, 6.00, 8.00); density shown low → high]
Parameter of the distribution: α(x) = a x + b
Problem & Our Approach
Dirichlet Particle Filter (DPF) → Defocus-aware Dirichlet Particle Filter (D-DPF)
[Graphical models: the DPF chains states x_t with observations y_t; the D-DPF adds defocus variables γ_t with observations z_t]
Prediction (state transition):
p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) = ∫ p(x_t | x_{t−1}, θ₁) p(x_{t−1} | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) dx_{t−1}
Update (likelihood):
p(x_t | y_{1:t}, γ_{1:t}, z_{1:t}) ∝ p(y_t, γ_t, z_t | x_t) p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1})
with p(y_t, γ_t, z_t | x_t) = p(y_t, x_t, γ_t) p(z_t | γ_t)
Isolated Pixel Ratio (IPR) [Oh et al., MedIA2007]
Endoscopic image → edge pixels by Canny edge detector
Clear edge vs. defocused edge; edge pixel / non-edge pixel / edge-and-isolated pixel
IPR: the percentage of isolated pixels among all edge pixels
[Histogram: frequency of IPR values, 0–0.015]
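Given a binary edge map, the IPR itself is a simple neighborhood count. A sketch operating on an already-computed edge map (in practice it would come from a Canny detector, e.g. OpenCV's `cv2.Canny`):

```python
import numpy as np

def isolated_pixel_ratio(edges):
    """IPR: fraction of edge pixels whose 8-neighborhood contains no other
    edge pixel. `edges` is a binary edge map (e.g. a Canny output)."""
    e = edges.astype(bool)
    pad = np.pad(e, 1)
    neigh = np.zeros(e.shape, dtype=int)
    for dy in (0, 1, 2):                # sum the 8 shifted neighbor maps
        for dx in (0, 1, 2):
            if dy == 1 and dx == 1:
                continue
            neigh += pad[dy:dy + e.shape[0], dx:dx + e.shape[1]]
    n_edge = e.sum()
    if n_edge == 0:
        return 0.0
    return float((e & (neigh == 0)).sum()) / float(n_edge)
```

Defocused frames produce fragmented, isolated edge responses, so a higher IPR signals a blurrier frame — the quantity the D-DPF observes as z_t.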
Modeling with the Rayleigh distribution and IPR
Ray_x[σ] = (x/σ²) exp(−x²/(2σ²))
[Plot: Rayleigh densities for σ = 0.5, 1, 2, 3, 4]
[Plot: σ(z_t) against z_t (0–0.015), from defocused to clear]
σ(z_t) = 4 exp(100 log(0.25) z_t)
p(z_t | γ_t) = Ray_{γ_t}[σ(z_t)]
Sequential filtering
Prediction:
p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) = ∫ p(x_t | x_{t−1}, θ₁) p(x_{t−1} | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) dx_{t−1}
Update:
p(x_t | y_{1:t}, γ_{1:t}, z_{1:t}) ∝ p(y_t, γ_t, z_t | x_t) p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1})
with p(y_t, γ_t, z_t | x_t) = p(y_t, x_t, γ_t) p(z_t | γ_t),
p(y_t, x_t, γ_t) = Dir_{x_t}[α₂(y_t, γ_t)], p(z_t | γ_t) = Ray_{γ_t}[σ(z_t)],
p(x_t | x_{t−1}, θ₁) = Dir_{x_t}[α₁(x_{t−1}, θ₁)]
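One predict/update/resample step of the basic Dirichlet particle filter (without the defocus branch) can be sketched as follows. The linear parameterization α(x) = a·x + b follows the earlier slide, but the values a = 20, b = 0.5 are illustrative assumptions, not the authors' settings.

```python
import numpy as np
from math import lgamma

def dirichlet_logpdf(x, alpha):
    """Log density of Dir_x[alpha]."""
    return (lgamma(float(alpha.sum())) - sum(lgamma(float(a)) for a in alpha)
            + float(((alpha - 1.0) * np.log(x)).sum()))

def dpf_step(particles, y, rng, a=20.0, b=0.5):
    """Predict from the Dirichlet state transition, weight by the Dirichlet
    likelihood of the observed class probabilities y_t, then resample.
    particles: (M, K) probability vectors; y: observed K-vector summing to 1."""
    M, K = particles.shape
    pred = np.vstack([rng.dirichlet(a * p + b) for p in particles])   # p(x_t | x_{t-1})
    logw = np.array([dirichlet_logpdf(y, a * p + b) for p in pred])   # likelihood of y_t
    w = np.exp(logw - logw.max())
    w /= w.sum()
    idx = rng.choice(M, size=M, p=w)                                  # resample
    return pred[idx]
```

The D-DPF extends this step by additionally weighting each particle's defocus variable γ_t with the Rayleigh likelihood of the observed IPR z_t.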
The performance for defocused frames
[Plots over frame number 0–600: ground truth, observation, IPR (0–0.010), result by DPF, result by D-DPF]
Smoothing result for an actual NBI video
No-smoothing result vs. smoothing result (■ Type A ■ Type B ■ Type C3)
Summary
• Recognition of magnifying colorectal NBI endoscopy images
• Baseline: SIFT + Bag-of-Visual Words
• Development of a recognition system and training datasets
• Topics covered:
  – self-training
  – sampling
  – domain adaptation / transfer learning
• Extension to videos / time series:
  – temporal smoothing with MRF/HMM models
  – defocus-aware Dirichlet particle filtering