![Page 1: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/1.jpg)
From pixel to Visual Intelligence
Speaker: Cewu Lu (卢策吾)
Shanghai Jiaotong University
上海交通大学
![Page 2: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/2.jpg)
Outline
• About me.
• My understanding of the computer vision big picture.
• My research within that big picture.
![Page 3: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/3.jpg)
About Me
• Professor • Ph.D. supervisor • National Young Thousand Talents Program (国家青年千人计划)
Machine Vision and Intelligence Group
![Page 4: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/4.jpg)
Before I joined SJTU
Postdoc and research fellow with:
Prof. Fei-Fei Li, Director of the Stanford AI Lab
Prof. Leonidas J. Guibas, member of the US National Academy of Engineering (NAE)
![Page 5: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/5.jpg)
About Me
• Core member, Stanford-Toyota Self-Driving Cars
• Published (accepted) 21 CVPR/ICCV/PAMI/IJCV papers (77% as first author); 31 CCF-A papers
• Most-cited SIGGRAPH paper of the past 5 years, among 1000+ papers
• Two papers are included in OpenCV
![Page 6: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/6.jpg)
Computer Vision
Machine can See
NSF white paper: let machines see like humans
![Page 7: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/7.jpg)
Computer Vision
Machine can See
Pixel level Patch level Human Understanding
Object level Super object
[SIFT Feature, 2004]
[Deep Learning, 2012]
![Page 8: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/8.jpg)
Input: Image, Video, RGBD
Levels: Pixel level, Patch level, Object level (recognition), Human Understanding
Topics: Gradient Processing, Image Abstraction, Stereo Deblur, Denoise, Patch Representation, Saliency, Scene Parsing, Point Cloud Segmentation, Tracking, Face, 3D Reconstruction, Object Detection, Fine-grained, Scene Understanding, Action Recognition, Event Understanding, Image2caption, Video Storying, Visual QA
![Page 9: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/9.jpg)
My Research Work
(The topic map from the previous slide, with topics labeled by publication venue: SIGGRAPH Asia, IJCV, CVPR, ICCV, ECCV, TIP, TVCG, PAMI, ICCP.)
![Page 10: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/10.jpg)
Representative Work on Patch Level
![Page 11: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/11.jpg)
L0-norm smoothing
Cewu Lu*, Li Xu*, Yi Xu, Jiaya Jia, "Image Smoothing via L0 Gradient Minimization", ACM Transactions on Graphics, Vol. 30, No. 5 (SIGGRAPH Asia 2011). * indicates co-first author.
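The slide names the method without stating it. As a rough illustration of what L0 gradient minimization does, here is a minimal 1D sketch in Python (NumPy). It assumes the alternating scheme as I understand it from the cited paper (a hard threshold on gradients, then an FFT-based least-squares solve), circular boundary conditions, and illustrative defaults for `lam`, `kappa`, and `beta_max`. The function name `l0_smooth_1d` is hypothetical, and this is not the authors' released code.

```python
import numpy as np

def l0_smooth_1d(signal, lam=0.02, kappa=2.0, beta_max=1e5):
    """Approximately minimize sum((S - I)**2) + lam * (number of nonzero gradients)."""
    I = np.asarray(signal, dtype=float)
    n = I.size
    # Circular forward difference (dS)[i] = S[i+1] - S[i], written as a kernel
    # so that it can be applied in the Fourier domain (assumed boundary handling).
    kernel = np.zeros(n)
    kernel[0], kernel[-1] = -1.0, 1.0
    K = np.fft.fft(kernel)
    I_hat = np.fft.fft(I)
    S = I.copy()
    beta = 2.0 * lam
    while beta < beta_max:
        # h-step: keep a gradient only where it is worth paying lam for it.
        dS = np.roll(S, -1) - S
        h = np.where(dS ** 2 > lam / beta, dS, 0.0)
        # S-step: least squares of (S - I)^2 + beta * (dS - h)^2, solved by FFT.
        S_hat = (I_hat + beta * np.conj(K) * np.fft.fft(h)) / (1.0 + beta * np.abs(K) ** 2)
        S = np.real(np.fft.ifft(S_hat))
        beta *= kappa
    return S

# Usage: a unit step with a small deterministic ripple added.
step = np.concatenate([np.zeros(50), np.ones(50)])
noisy = step + 0.01 * np.sin(np.arange(100))
smooth = l0_smooth_1d(noisy)
```

The ripple is flattened while the large jump survives, which is the "main structure extraction" behavior shown on the following slides. Larger `lam` removes more edges.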
![Page 12: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/12.jpg)
Main Structure Extraction
Smoothing result
![Page 13: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/13.jpg)
Ours
Extracted Edge
![Page 14: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/14.jpg)
Canny
Extracted Edge
![Page 15: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/15.jpg)
![Page 16: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/16.jpg)
Stationary Estimation
L0 Regularized Stationary Time Estimation for Crowd Group Analysis, [CVPR 2014] [PAMI 2016]
![Page 17: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/17.jpg)
Abnormal Event Detection at 1000 FPS
[Cewu Lu et al. ICCV]
Cewu Lu, Jianping Shi, Jiaya Jia, "Abnormal Event Detection at 150 FPS in MATLAB", IEEE International Conference on Computer Vision [ICCV 2013] [IJCV 2017]
![Page 18: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/18.jpg)
Results (UCSD Ped1 Dataset)
MPPCA: [Mahadevan et al. 2009] MPPCA+SF: [Mahadevan et al. 2009] SF: [Mahadevan et al. 2009] MDT: [Mahadevan et al. 2009] Sparse: [Cong et al. 2011] Adam: [Adam et al. 2008]
Pixel-level comparison. FPR: False Positive Rate. TPR: True Positive Rate. Subspace: replacing our combination learning by [Ehsan et al. 2009].
![Page 19: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/19.jpg)
Results
| Method | Sec per Frame | Platform | CPU | Memory |
|---|---|---|---|---|
| [Mahadevan et al. 2009] | 25 | - | 3.0GHz | 2.0GB |
| [Cong et al. 2011] | 3.8 | - | 2.6GHz | 2.0GB |
| [Antic et al. 2011] | 10 | MATLAB | - | - |
| Ours | 0.00098 | MATLAB 2012 | 3.4GHz | 8.0GB |

Testing time comparison on the UCSD Ped1 dataset.
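The "1000 FPS" in the section title follows from the "Sec per Frame" column. A small Python sketch of the conversion, using only the numbers listed above. The ratios are indicative only, since the methods were timed on different hardware and platforms.

```python
# Seconds per frame, copied from the timing comparison above.
sec_per_frame = {
    "Mahadevan et al. 2009": 25.0,
    "Cong et al. 2011": 3.8,
    "Antic et al. 2011": 10.0,
    "Ours": 0.00098,
}

def frames_per_second(seconds_per_frame):
    """Frames per second is the reciprocal of seconds per frame."""
    return 1.0 / seconds_per_frame

fps = {name: frames_per_second(t) for name, t in sec_per_frame.items()}
# Per-frame time of each method relative to ours.
ratio_to_ours = {name: t / sec_per_frame["Ours"] for name, t in sec_per_frame.items()}

for name in sec_per_frame:
    print(f"{name}: {fps[name]:.2f} FPS, {ratio_to_ours[name]:.0f}x our per-frame time")
```

For our method this gives 1 / 0.00098, about 1020 frames per second.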
![Page 20: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/20.jpg)
Results
Others
Ours
![Page 21: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/21.jpg)
Results
![Page 22: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/22.jpg)
Representative Work on Object Level
![Page 23: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/23.jpg)
Personal Object Discovery [Cewu Lu et al. TIP]
![Page 24: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/24.jpg)
Object Scene Distribution
![Page 25: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/25.jpg)
Highlight Projects (Personal Object Discovery)
Cewu Lu, Renjie Liao, Jiaya Jia, "Personal Object Discovery", IEEE Transactions on Image Processing.
![Page 26: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/26.jpg)
Weather Understanding [Cewu Lu et al. CVPR 2014] [Cewu Lu et al. TPAMI 2014]
![Page 27: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/27.jpg)
Highlight Projects (Weather classification)
Cewu Lu, Di Lin, Jiaya Jia, Chi-Keung Tang, "Two-class Weather Classification", IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2014, (TPAMI) 2017.
![Page 28: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/28.jpg)
![Page 29: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/29.jpg)
Real-Time Video Stylization Using Object Flows. Cewu Lu, Yao Xiao and Chi-Keung Tang. IEEE Transactions on Visualization and Computer Graphics (TVCG), 2017.
Combining Sketch and Tone for Pencil Drawing Production. Cewu Lu, Li Xu, Jiaya Jia. Non-Photorealistic Animation and Rendering (NPAR), 2012 (Best Paper Award).
![Page 30: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/30.jpg)
Cewu Lu et al. Real-Time Video Stylization Using Object Flows
![Page 31: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/31.jpg)
Papers (Object Detection)
Cewu Lu, Hao Chen, Qifeng Chen, Hei Law, Yao Xiao, Chi-Keung Tang, ECCV 2014 workshop: ImageNet Large Scale Visual Recognition Challenge.
Di Lin, Xiaoyong Shen, Cewu Lu, Jiaya Jia, Deep LAC: Deep Localization, Alignment and Classification for Fine-grained Recognition, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
Cewu Lu, Yongyi Lu, CK Tang, Efficient Square Localization for Efficient and Accurate Object Detection, submitted to IEEE International Conference on Computer Vision (ICCV), 2015.
Cewu Lu, Yongyi Lu, CK Tang, Explicit Closed-Curve Optimization for Objectness Estimation, submitted to IEEE International Conference on Computer Vision (ICCV), 2015.
Cewu Lu, Yongyi Lu, CK Tang, Unobjectness for Object Proposals Generation, submitted to IEEE International Conference on Computer Vision (ICCV), 2015.
![Page 32: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/32.jpg)
Comparison to NLP
Computer Vision: Pixel level, Patch level, Object level [Deep Learning 2012], Human Understanding
Natural Language Understanding: Noun, Verb, Phrase, Sentence, Story
![Page 33: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/33.jpg)
What can I do here?
![Page 34: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/34.jpg)
Representative Work Beyond Object Level
![Page 35: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/35.jpg)
Visual Relationship Detection with Language Priors
Cewu Lu, Ranjay Krishna, Michael Bernstein, Li Fei-Fei
ECCV 2016 (oral) (reported by ECCV Daily)
![Page 36: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/36.jpg)
Detecting <Subject, Predicate, Object>
![Page 37: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/37.jpg)
Difficulties
(1) Errors from detecting each component individually are huge (under 5%)
(2) Training data is sparse
Subject, Predicate, Object
![Page 38: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/38.jpg)
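| Subject | Predicate | Object | Subject-Predicate-Object |
|---|---|---|---|
| 100 classes | 70 classes | 100 classes | 700,000 classes |

Difficulties
(1) Errors from detecting each component individually are huge (under 5%)
(2) Training data is sparse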
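The class counts on this slide explain the sparsity: the number of possible triples is the product of the three vocabularies. A short Python check of that arithmetic follows. The "separate" total is my own illustration of why treating the components independently needs far fewer categories; it is not a figure from the slide.

```python
# Vocabulary sizes as listed on the slide.
n_subjects, n_predicates, n_objects = 100, 70, 100

# One class per <subject, predicate, object> triple.
n_triple_classes = n_subjects * n_predicates * n_objects

# Illustration only: categories needed if each slot is modeled on its own.
n_separate_classes = n_subjects + n_predicates + n_objects

print(n_triple_classes)    # 700000
print(n_separate_classes)  # 270
```

With 700,000 possible triples, most of them have few or no training examples, which is the second difficulty listed above.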
![Page 39: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/39.jpg)
Linkage from Language Prior
Person-ride-horse
Person-ride-elephant
Person-ride-moto
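One way to read "linkage from language prior" is that relationships whose words are close in a word-embedding space can share evidence, so a rarely seen triple borrows from a similar common one. The Python sketch below illustrates only that idea. The 4-dimensional vectors are hypothetical toy values invented for this example, and `triple_similarity` and `nearest_seen` are illustrative helpers, not the model from the cited paper.

```python
import numpy as np

# Hypothetical toy embeddings, invented for this illustration only.
# A real system would use learned word vectors instead.
TOY_VECTORS = {
    "person":   np.array([1.0, 0.0, 0.0, 0.0]),
    "ride":     np.array([0.0, 0.0, 1.0, 0.0]),
    "horse":    np.array([0.0, 1.0, 0.2, 0.0]),
    "elephant": np.array([0.0, 0.9, 0.4, 0.0]),
    "moto":     np.array([0.0, 0.3, 0.0, 1.0]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def triple_similarity(a, b):
    """Mean cosine similarity over the (subject, predicate, object) slots."""
    return sum(cosine(TOY_VECTORS[x], TOY_VECTORS[y]) for x, y in zip(a, b)) / 3.0

def nearest_seen(query, seen):
    """Return the seen relationship most similar to the query."""
    return max(seen, key=lambda triple: triple_similarity(query, triple))

seen = [("person", "ride", "horse"), ("person", "ride", "moto")]
query = ("person", "ride", "elephant")
print(nearest_seen(query, seen))
```

With these toy vectors the query is linked to Person-ride-horse, because "elephant" lies much closer to "horse" than to "moto".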
![Page 40: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/40.jpg)
![Page 41: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/41.jpg)
Some Results
• Discover and predict relationships in images.
Mining relationship tuples:
<man, wear, glass>
<man, carry, bag>
<car, on, ground>
<trash bin, next to, streetlight>
...
![Page 42: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/42.jpg)
Using relationship: Human-ride-horse
![Page 43: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/43.jpg)
Accuracy
Slides with more details: http://cs.stanford.edu/people/ranjaykrishna/vrd/slides.pdf
![Page 44: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/44.jpg)
Person, Person
A problem: sub-object-level information is missed!
![Page 45: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/45.jpg)
Beneath Holistic Object Recognition
Leg stamps on something
Scale pan is stamped by something
Richer semantics on parts help to infer the story.
![Page 46: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/46.jpg)
Figure: example images (a) to (e) annotated with part states such as "sth sits on saddle", "wheel in the air", "wheel on sth", "sth holds handlebar", "sth touches head", "leg in the air", "leg on sth", "torso wears sth", "head with bridle rein", "sth rides torso", "hand embraces sth", "torso sits on sth", "leg is bent".
Beyond Holistic Object Recognition: Enriching Image Understanding with Part States
Cewu Lu et al. (with Stanford University), arXiv:1612.07310
![Page 47: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/47.jpg)
Beyond Holistic Object Recognition
![Page 48: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/48.jpg)
Regional Multi-person Pose Estimation
Haoshu Fang, Shuqin Xie, Cewu Lu (corresponding author)
arXiv:1612.00137v2
![Page 49: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/49.jpg)
SST network
![Page 50: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/50.jpg)
STN: spatial transformer network
SDTN: spatial de-transformer network
SPPE: single person pose estimation
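The three components suggest a warp, estimate, un-warp pipeline: the STN aligns a detected person region, the SPPE estimates the pose there, and the SDTN maps the pose back to the original image. A minimal Python sketch of the geometric part, assuming both transforms are 2D affine maps and that the de-transform is the exact inverse of the transform. The helper names are hypothetical, and the coordinate conventions of the actual network may differ.

```python
import numpy as np

def apply_affine(theta, points):
    """Map (N, 2) points with a 2x3 affine matrix theta = [A | t]."""
    A, t = theta[:, :2], theta[:, 2]
    return points @ A.T + t

def invert_affine(theta):
    """Return the 2x3 matrix of the inverse map, gamma = [A^-1 | -A^-1 t]."""
    A, t = theta[:, :2], theta[:, 2]
    A_inv = np.linalg.inv(A)
    return np.hstack([A_inv, (-A_inv @ t)[:, None]])

# Hypothetical transform: scale x by 2, scale y by 0.5, then shift.
theta = np.array([[2.0, 0.0, 1.0],
                  [0.0, 0.5, -3.0]])
keypoints = np.array([[10.0, 20.0], [15.0, 40.0]])  # made-up joint locations

aligned = apply_affine(theta, keypoints)            # "STN" direction
gamma = invert_affine(theta)                        # "SDTN" parameters
restored = apply_affine(gamma, aligned)             # back in image coordinates
```

Because `gamma` is derived from `theta` rather than learned separately, `restored` matches `keypoints` up to floating-point error.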
![Page 51: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/51.jpg)
![Page 52: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/52.jpg)
Comparison

| Method | MPII | COCO |
|---|---|---|
| Ours | 77.4 | 64.7 |
| CMU | 75.6 | 61.8 |

"CMU" indicates Real-time Multi-Person 2D Pose Estimation using Part Affinity Fields, Cao et al., CVPR 2017.
![Page 53: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/53.jpg)
![Page 54: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/54.jpg)
Computer Vision
Machine can See
Pixel level Patch level Human Understanding
Object level Super object
Part level
![Page 55: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/55.jpg)
Computer Vision Big Picture
Machine can See
Machine can Act
![Page 56: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/56.jpg)
![Page 57: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/57.jpg)
Without Action…
![Page 58: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/58.jpg)
Without Action…
To acquire perception, we need daily action indeed!
![Page 59: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/59.jpg)
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., van den Driessche, G., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489.
Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T. P., Harley, T., et al. (2016, February 5). Asynchronous Methods for Deep Reinforcement Learning. arXiv.org.
Zhu, Y., Mottaghi, R., Kolve, E., Lim, J. J., Gupta, A., Fei-Fei, L., & Farhadi, A. (2016, September 17). Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning. arXiv.org.
Reinforcement Learning
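The works cited on this slide use deep reinforcement learning. As the simplest possible illustration of learning from interaction, the Python sketch below shows a single tabular Q-learning update instead. This is a swapped-in textbook technique, not the method of any cited paper, and the state names, action names, and hyperparameters are hypothetical.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """One Q-learning step: move Q(state, action) toward the TD target."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]

actions = ["steer_left", "straight", "steer_right"]  # hypothetical action set
q = defaultdict(float)                               # all values start at 0.0

# Two made-up transitions: a rewarded step, then an unrewarded step back.
first = q_update(q, "s0", "straight", 1.0, "s1", actions)
second = q_update(q, "s1", "straight", 0.0, "s0", actions)
```

Each update needs an actual transition (state, action, reward, next state), which is why interaction with an environment, real or simulated, is the bottleneck.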
![Page 60: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/60.jpg)
Suiqin Xie, Cewu Lu (corresponding author), Reinforcement Learning for Pose Estimation
![Page 61: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/61.jpg)
Yurong You, Cewu Lu (corresponding author), Reinforcement Learning for Car Self-Driving
![Page 62: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/62.jpg)
Learning Steps
1. Low speed straight
2. Low speed curve
3. Stuck
4. High speed straight
5. Low speed curve
6. Collision
![Page 63: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/63.jpg)
Yurong You, Cewu Lu (corresponding author), Reinforcement Learning for Car Self-Driving
![Page 64: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/64.jpg)
Virtual to Real Reinforcement Learning for Autonomous Driving (with Berkeley)
Yurong You, Xinlei Pan, Ziyan Wang and Cewu Lu, arXiv:1704.03952v1

| Method | Result |
|---|---|
| Ours | 43.40% |
| B-RL | 28.33% |

B-RL: training the vehicle in the virtual car racing simulator TORCS [31] with virtual images as input.
![Page 65: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/65.jpg)
The pain point of reinforcement learning: interaction! Interaction! Interaction!
Why is everything virtual?!
![Page 66: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/66.jpg)
Visual Intelligence Big Picture
Machine can See
Machine can Act Machine has Knowledge
![Page 67: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/67.jpg)
ShapeNet (Stanford, Princeton, Adobe)
A Scalable Active Framework for Region Annotation in 3D Shape Collections
ACM Transactions on Graphics (ACM SIGGRAPH Asia 2016) (with Stanford, Adobe, UCB)
![Page 68: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/68.jpg)
Editable real world
Promising for one-shot learning
![Page 69: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/69.jpg)
![Page 70: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/70.jpg)
Unsupervised Image Group Distance Inference
ZhengTian Xu, Cewu Lu (corresponding author), to be submitted to arXiv soon
![Page 71: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/71.jpg)
Unsupervised Image Group Distance Inference
ZhengTian Xu, Cewu Lu (corresponding author)
![Page 72: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/72.jpg)
From pixel to Visual Intelligence
See
Act Knowledge
See: finer and finer
• Object recognition (2013)
• Detection (2014)
• Segmentation (2015)
• Part level such as pose (2016)
My goal: (1) explore information beyond the object level to mine high-level semantics and improve object-level recognition (partly solving the long-tail effect).
![Page 73: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/73.jpg)
From pixel to Visual Intelligence
See
Act Knowledge
See: finer and finer
• Object recognition (2013)
• Detection (2014)
• Segmentation (2015)
• Part level, such as pose (2016)
My direction: explore information beyond the object level to mine high-level semantics and improve object-level recognition (partly reducing the long-tail effect).
Not just increasing the quantity of data, but the depth of the data (its information content)!
![Page 74: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/74.jpg)
From pixel to Visual Intelligence
See
Act Knowledge
See: finer and finer
• Object recognition (2013)
• Detection (2014)
• Segmentation (2015)
• Part level, such as pose (2016)
My direction: explore information beyond the object level to mine high-level semantics and improve object-level recognition (partly addressing the long-tail effect).
![Page 75: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/75.jpg)
From pixel to Visual Intelligence
See
Act Knowledge
See: finer and finer
• Object recognition (2013)
• Detection (2014)
• Segmentation (2015)
• Part level, such as pose (2016)
Challenges:
(1) How do we benchmark visual understanding?
Subjective part (task-driven) + objective part (doing the task)
My thinking: leave it to Act
(2) How do we achieve low-shot (even one-shot) learning?
My thinking: leave it to Knowledge
![Page 76: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/76.jpg)
Our lab is recruiting students... please help spread the word...
![Page 77: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/77.jpg)
My Research Directions
Machine can See
• Better performance (deep learning)!
• Sub- and super-object level
• In video and image
Machine can Act
• Real-world interaction
• Learning speed
• Reward function (inverse RL)
• Huge action space
Machine has Knowledge
• One-shot learning with O(1) effort
• Visual knowledge base (self-learning and scale-up)
![Page 78: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/78.jpg)
Applications
Machine can See (11 students)
• Pose estimation
• Video action understanding
• Visual relationship
• Object detection
• Deep learning on mobile phones
Machine can Act (9 students)
• Auto-car
• Robot arm
• Auto-navigation
Machine has Knowledge (5 students)
![Page 79: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/79.jpg)
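Send email to: [email protected]
Enrollment in 2018: master's students, Ph.D. students, and postdocs (salary is negotiable; funding is not a problem)
Benefits: recommendations for summer internships at top North American universities. Successful recommendations this year: Stanford (vision group), MIT, CMU
Current group members come from: the SJTU ACM Class, the Fudan ACM team, the USTC School of the Gifted Young, and Zhejiang University's Chu Kochen Honors College. Members hold 1.6 National Scholarships each on average.
For 2018 enrollment, two offers have been made so far: a top-three student in SJTU's computer science department (first author of an ICCV 2017 paper) and a top-three student in SJTU's electronic engineering department. Openings remain...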
![Page 80: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/80.jpg)
Interns welcome!
• Past interns include students from UC Berkeley, HKUST, and Zhejiang University. We provide accommodation.
![Page 81: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/81.jpg)
Thanks!
![Page 82: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,](https://reader033.vdocuments.pub/reader033/viewer/2022043021/5f3cfb8b826d9b49471d88cf/html5/thumbnails/82.jpg)
Follow our lab's WeChat official account: MVIG@SJTU