[NN Reading Group] YouTube-8M: A Large-Scale Video Classification Benchmark
2017/7/12
YouTube-8M: A Large-Scale Video Classification Benchmark
[Google Research, 2016/9/27, arXiv:1609.08675v1] TFUG NN #3
1
✤
✤
✤
✤
✤
✤
✤
✤ Kaggle
2
YouTube-8M
✤
ImageNet…
3
YouTube-8M
✤
4
2TB
1GPU 1
1) 1 1
2) Inception
3) PCA
4) TensorFlow
→ URL
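Step 3 of the list above applies PCA to the Inception features. The following is a minimal NumPy sketch of PCA-based dimensionality reduction, not the pipeline used to build the dataset. The function names `fit_pca` and `apply_pca`, the random stand-in features, and the sizes (64 input dimensions reduced to 16) are assumptions chosen only to keep the example small.

```python
import numpy as np


def fit_pca(features, out_dim):
    """Estimate a PCA projection from a (num_frames, in_dim) feature matrix."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:out_dim].T  # shape: (in_dim, out_dim)
    return mean, components


def apply_pca(features, mean, components):
    """Project features onto the leading principal directions."""
    return (features - mean) @ components


# Hypothetical stand-in for per-frame CNN features (sizes are illustrative).
rng = np.random.default_rng(0)
frame_features = rng.normal(size=(360, 64))

mean, components = fit_pca(frame_features, out_dim=16)
reduced = apply_pca(frame_features, mean, components)
print(reduced.shape)
```

The projection is fitted once and then reused for every video, so the stored features stay comparable across the dataset.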
✤ YouTube
5
✤ Knowledge Graph entity
6
✤
✤ 3 1-2.57
✤
✤
✤ …
✤ Precision 78.8%, recall 14.5%
✤ → 80%
✤ →
8
DBoF
✤ k N
✤ ReLU M
✤ → (MxN)
✤
✤ Max pooling
✤
✤
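The DBoF steps on this slide (sample k frames of N dimensions, project them to M dimensions with ReLU, then max-pool over frames) can be sketched roughly as follows. This is an illustrative NumPy forward pass with random, untrained weights. The function name `dbof_pool` and the sizes are assumptions, and the classifier that would follow the pooling is omitted.

```python
import numpy as np


def dbof_pool(frames, weights, bias, k, rng):
    """Pool a (num_frames, N) matrix of frame features into one M-dim vector."""
    num_frames = frames.shape[0]
    # Sample k frames; fall back to sampling with replacement for short videos.
    idx = rng.choice(num_frames, size=k, replace=num_frames < k)
    sampled = frames[idx]                                    # (k, N)
    projected = np.maximum(sampled @ weights + bias, 0.0)    # ReLU, (k, M)
    return projected.max(axis=0)                             # max pooling, (M,)


# Illustrative sizes only (assumed for this sketch).
N, M, k = 32, 128, 30
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, N))
weights = rng.normal(scale=0.1, size=(N, M))  # the (N x M) projection parameters
bias = np.zeros(M)

video_vector = dbof_pool(frames, weights, bias, k, rng)
print(video_vector.shape)
```

Because max pooling ignores frame order, the result is a fixed-length, order-independent video representation regardless of video length.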
9
✤
✤
✤ φ
✤ PCA
✤ L2
10
✤ mAP:
✤ Hit@k: k
1
✤ PERR(Precision at equal recall rate):
✤ GAP: Kaggle
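The metrics listed above can be sketched in NumPy as follows. This is a simplified illustration, not the official evaluation code: the function names are hypothetical, GAP keeps the top `top_n` predictions per video and is normalized by the total number of ground-truth labels (an assumption of this sketch), and the tiny score and label matrices are made up for the example.

```python
import numpy as np


def hit_at_k(scores, labels, k):
    """Fraction of videos with at least one true label among the top-k scores."""
    topk = np.argsort(-scores, axis=1)[:, :k]
    return np.take_along_axis(labels, topk, axis=1).any(axis=1).mean()


def perr(scores, labels):
    """Precision at equal recall rate: precision of the top-G predictions,
    where G is the number of ground-truth labels of each video."""
    order = np.argsort(-scores, axis=1)
    precisions = []
    for row_order, row_labels in zip(order, labels):
        g = int(row_labels.sum())
        if g == 0:
            continue  # videos without labels are skipped
        precisions.append(row_labels[row_order[:g]].mean())
    return float(np.mean(precisions))


def gap(scores, labels, top_n=20):
    """Global average precision over the pooled top-n predictions of all videos."""
    order = np.argsort(-scores, axis=1)[:, :top_n]
    conf = np.take_along_axis(scores, order, axis=1).ravel()
    correct = np.take_along_axis(labels, order, axis=1).ravel().astype(float)
    ranking = np.argsort(-conf, kind="stable")
    correct = correct[ranking]
    precision_at_i = np.cumsum(correct) / np.arange(1, correct.size + 1)
    return float((precision_at_i * correct).sum() / labels.sum())


# Two videos, three classes (made-up numbers).
scores = np.array([[0.9, 0.1, 0.8],
                   [0.2, 0.7, 0.3]])
labels = np.array([[1, 0, 0],
                   [0, 0, 1]])

print(hit_at_k(scores, labels, 1), hit_at_k(scores, labels, 2))
print(perr(scores, labels))
print(gap(scores, labels, top_n=2))
```

Hit@k and PERR are averaged per video, whereas GAP ranks the predictions of all videos in a single list, so it also rewards confidence scores that are calibrated across videos.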
11
✤ 1
✤
✤ 2
DBoF, LSTM
12
→
✤
✤ PERR
15%
✤
13
✤ ActivityNet
✤ Sports-1M
14
Kaggle
✤
✤ 6
✤ Google Cloud …
15
Kaggle 1st place
✤ https://github.com/antoine77340/LOUPE
✤ Learnable pooling with Context Gating for video classification
✤ [Antoine Miech et al., arXiv:1706.06905v1, 2017/6/21]
✤ 25
✤ 7 models, GAP 84.698%: Gated NetVLAD (256 clusters), Gated NetFV (128 clusters), Gated Soft-DBoW (4096 clusters), Soft-DBoW (8000 clusters), Gated NetRVLAD (256 clusters), GRU (2 layers, hidden size: 1200), LSTM (2 layers, hidden size: 1024)
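Combining several models like the seven listed above can be illustrated with a simple averaging ensemble. This sketch assumes uniform averaging of per-class probabilities. The slide does not state how the models were combined, so the weighting, the function name `average_ensemble`, and the random stand-in predictions are all assumptions.

```python
import numpy as np


def average_ensemble(predictions):
    """Average a list of (num_videos, num_classes) probability matrices."""
    stacked = np.stack(predictions, axis=0)  # (num_models, num_videos, num_classes)
    return stacked.mean(axis=0)


# Random stand-ins for the outputs of seven models (illustrative only).
rng = np.random.default_rng(0)
model_outputs = [rng.uniform(size=(4, 10)) for _ in range(7)]

ensemble = average_ensemble(model_outputs)
print(ensemble.shape)
```

Averaging tends to help most when the individual models make different kinds of errors, which is one plausible reason for mixing pooling-based and recurrent architectures.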
16
✤ Q.p8
✤ A.
✤
…)
80% 80%
p8
17