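An Introduction to Selected Recent Results in Computer Vision

Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University

Professor Jie Yang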

Analysis of Camera Response Functions for Image Deblurring

ECCV 2012 and PAMI 2013

Addressed Problem: Motion Deblurring

• Traditional methods model motion blur in the intensity domain:

B* = I ⊗ K, where I is the latent intensity image and K is the blur kernel.

• Images captured by a camera instead follow B = ψ(φ(I) ⊗ K),

where ψ is the camera response function (CRF) and φ = ψ⁻¹ is the inverse CRF;

the blur occurs in the irradiance domain.
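The gap between the two blur models can be illustrated on a 1-D step edge. The sketch below is only an illustration: the gamma-2.2 curve standing in for ψ, the box kernel, and the signal values are all assumptions, not taken from the slides.

```python
def convolve_valid(signal, kernel):
    """1-D 'valid' convolution (the kernel is symmetric here, so no flip is needed)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

GAMMA = 2.2  # assumed CRF shape, for illustration only

def crf(irradiance):
    """psi: irradiance -> intensity (assumed gamma curve)."""
    return irradiance ** (1.0 / GAMMA)

def inverse_crf(intensity):
    """phi = psi^-1: intensity -> irradiance."""
    return intensity ** GAMMA

# Sharp intensity signal I: a step edge from dark to bright.
I = [0.2] * 20 + [0.8] * 20
K = [0.2] * 5  # normalized box blur kernel

# Intensity-domain model: B* = I (x) K
B_star = convolve_valid(I, K)
# Camera model: B = psi(phi(I) (x) K)
B = [crf(v) for v in convolve_valid([inverse_crf(v) for v in I], K)]

# Per-sample disagreement between the two models.
gap = [abs(b - bs) for b, bs in zip(B, B_star)]
```

In this toy setting the two models agree in the flat regions and disagree around the edge, which is where an intensity-domain deblurring method would be misled.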

Our main contributions:

• Analyze how CRFs affect intensity-based deblurring.

• Develop a dual-image solution that simultaneously estimates the CRF and deblurs the image.

CRF Estimation

Proposed solution:

Capture a pair of sharp and blurry images.

Fit CRFs to observed images by minimizing

J(φ) = || W∙ (ψ(φ(I) ⊗ K ∙ r) – B) ||2

where: r = ratio of exposure between B and I.

Weight observations by estimating the “blur inconsistency” Γ: the weight W(i, j) is 1 where Γ(i, j) passes the threshold τ and 0 elsewhere.

Model the CRF with the GGCM model:

$$\phi(x) = x^{1/P(x,\alpha)}, \qquad P(x,\alpha) = \sum_{i=0}^{n} \alpha_i x^{i}$$

Minimize the energy using the Nelder-Mead simplex method.
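A minimal 1-D sketch of the fitting objective, assuming the GGCM form φ(x) = x^(1/P(x, α)) with P(x, α) = Σ αᵢ xⁱ. The bisection-based inversion of φ, the 1-D signals, the all-ones weights, and every function name below are illustrative assumptions rather than the authors' implementation, and the squared residual is used for convenience.

```python
def ggcm_inverse_crf(x, alpha):
    """phi(x) = x ** (1 / P(x, alpha)), with P(x, alpha) = sum_i alpha[i] * x**i."""
    p = sum(a * x ** i for i, a in enumerate(alpha))
    return x ** (1.0 / p)

def crf(y, alpha, iters=60):
    """psi = phi^-1 by bisection; assumes phi is increasing on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ggcm_inverse_crf(mid, alpha) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def blur(signal, kernel):
    """1-D 'valid' convolution with a symmetric kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def objective(alpha, sharp, blurry, kernel, weights, ratio=1.0):
    """Weighted squared residual between the predicted and observed blurry signal."""
    irradiance = blur([ggcm_inverse_crf(v, alpha) for v in sharp], kernel)
    predicted = [crf(min(ratio * e, 1.0), alpha) for e in irradiance]
    return sum(w * (p - b) ** 2 for w, p, b in zip(weights, predicted, blurry))

# Synthetic sharp/blurry pair generated with a known (made-up) parameter vector.
true_alpha = [0.4, 0.1]
kernel = [1.0 / 3.0] * 3
sharp = [0.2] * 10 + [0.8] * 10
blurry = [crf(e, true_alpha)
          for e in blur([ggcm_inverse_crf(v, true_alpha) for v in sharp], kernel)]
weights = [1.0] * len(blurry)  # stand-in for the binary mask W

j_true = objective(true_alpha, sharp, blurry, kernel, weights)
j_linear = objective([1.0], sharp, blurry, kernel, weights)  # linear CRF guess
```

The objective vanishes at the generating parameters and is positive for the linear-CRF guess. A derivative-free optimizer could then search over `alpha`, for example `scipy.optimize.minimize(..., method="Nelder-Mead")`.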

(a, c, e): images captured by Canon 400D, 60D, and Nikon D300 respectively. (b, d, f): estimated CRFs.

Results: Estimated CRFs

(a) Iteratively deblurred images using gamma 2.2 correction. (b) Curves generated by gamma 2.2 and by the proposed method (mean CRFs). (c) Error maps using gamma correction. (d) Error maps using the proposed CRF correction. The results show that our CRF-based method outperforms gamma 2.2 correction.

Results: Image deblurring


Comparison of images from blind and non-blind deconvolution using: (a) a linear CRF, (b) a gamma curve, (c) CRF correction.

The results show that the CRF correction-based method consistently outperforms the other two methods.

Fast Patch-based Denoising using Approximated Patch Geodesic Paths

CVPR 2013

Problems addressed

Image denoising using traditional patch-based approaches requires intensive computations.

Examples of traditional patch-based denoising methods:

- Non-Local Means (NLM)
- BM3D
- LPG-PCA [PR 2010]
- EPLL [ICCV 2011]
- …

These methods use similar patches within the image as cues for denoising.

Drawbacks of traditional patch-based denoising:

- Computationally expensive: requires pairwise patch comparisons.
- Low-quality denoising results.

Distance metric

Γ: a geodesic path connecting the patches centered at s and t. N_I(p): the patch centered at p.

Brief description of the proposed method

Main idea of the proposed method: employ a more efficient patch-based denoising approach based on “approximated patch geodesic paths”.

Weighting kernel

wp(i; j) = Gaussian function of

where

Denoising using Patch-based Geodesic Path

w: kernel weight; Z: normalization factor
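A rough 1-D sketch of denoising with patch geodesic distances. It assumes the geodesic distance is the smallest accumulated patch difference along any pixel path inside a search window, and that the weight is a Gaussian of that distance. It uses exact Dijkstra search rather than the fast approximation described here, and the window size, the bandwidth `h`, and all function names are hypothetical.

```python
import heapq
import math

def patch(signal, p, radius):
    """Patch centered at p, with edge replication at the borders."""
    n = len(signal)
    return [signal[min(max(q, 0), n - 1)] for q in range(p - radius, p + radius + 1)]

def patch_cost(signal, p, q, radius):
    """Euclidean distance between the patches centered at p and q."""
    a, b = patch(signal, p, radius), patch(signal, q, radius)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def geodesic_distances(signal, source, window, radius=1):
    """Dijkstra over neighboring pixels inside the search window."""
    lo, hi = max(source - window, 0), min(source + window, len(signal) - 1)
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, p = heapq.heappop(heap)
        if d > dist.get(p, math.inf):
            continue  # stale heap entry
        for q in (p - 1, p + 1):
            if lo <= q <= hi:
                nd = d + patch_cost(signal, p, q, radius)
                if nd < dist.get(q, math.inf):
                    dist[q] = nd
                    heapq.heappush(heap, (nd, q))
    return dist

def denoise(signal, window=6, h=0.5):
    """Weighted average with Gaussian weights on the geodesic patch distance."""
    out = []
    for s in range(len(signal)):
        dist = geodesic_distances(signal, s, window)
        weights = {t: math.exp(-(d / h) ** 2) for t, d in dist.items()}
        z = sum(weights.values())  # normalization factor Z
        out.append(sum(w * signal[t] for t, w in weights.items()) / z)
    return out

step = [0.0] * 8 + [1.0] * 8
smoothed = denoise(step)
```

On the clean step signal, pixels on one side of the edge receive almost no weight from the other side, because every path between them must cross the edge and accumulate cost.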

Test Results: from the proposed method

(b), (e): two patch windows; (c), (f): patch distance maps; (d), (g): color-coded path hop maps.

The results show that the patch geodesic path can be effectively approximated by the proposed method.

Evaluation: accuracy

(a) 5×5 patch size; (b) 7×7 patch size.

The above results are obtained from 200 test images.

Further improvements to the proposed method (FM-PatchGP):

+ use a better weighting function

+ employ multiscale denoising

Test results show that FM-PatchGP is as effective as the previously proposed method but much faster.

Results and comparisons:

Saliency Driven Clustering for Salient Object Detection

Neurocomputing 2014

Saliency Detection

Two saliency models are computed:

• Color contrast prior saliency

• Boundary prior saliency

Their outputs are merged into the combined saliency.

Histogram Analysis

The histogram of the combined saliency is analyzed to determine 1) the number of clusters and 2) the starting centroids for clustering.

Saliency Driven Clustering

K-means clustering produces the generated regions; the number of clusters and the starting centroids are determined in the histogram analysis stage.
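The idea of letting a histogram fix both the number of clusters and the starting centroids can be sketched as follows. The rule used here (one centroid per local-maximum bin of the saliency histogram) is an assumption for illustration, since the actual histogram analysis is not spelled out; clustering is done on scalar saliency values only, and all names and values are made up.

```python
def histogram_peaks(values, bins=10):
    """Centers of the local-maximum bins of a histogram over [0, 1]."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    peaks = []
    for b, c in enumerate(counts):
        left = counts[b - 1] if b > 0 else 0
        right = counts[b + 1] if b < bins - 1 else 0
        if c > 0 and c >= left and c > right:
            peaks.append((b + 0.5) / bins)
    return peaks

def kmeans_1d(values, centroids, iters=20):
    """Plain k-means on scalars, started from the given centroids."""
    centroids = list(centroids)

    def nearest(v):
        return min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))

    for _ in range(iters):
        groups = [[] for _ in centroids]
        for v in values:
            groups[nearest(v)].append(v)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    labels = [nearest(v) for v in values]
    return centroids, labels

# Made-up combined saliency values for a handful of pixels.
saliency = [0.05, 0.08, 0.11, 0.12, 0.13, 0.52, 0.55, 0.56, 0.58, 0.91, 0.93, 0.94]
start = histogram_peaks(saliency)  # number of peaks = number of clusters
centroids, labels = kmeans_1d(saliency, start)
```

Starting k-means from histogram peaks removes the need to guess the number of clusters or to use random initialization, which is the role the histogram analysis stage plays in the pipeline.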

Regional Saliency Computation

Average color prior saliency + average boundary prior saliency + pixel-level saliency values → final saliency values

The Role of Regional Saliency Computation

Experiments on MSRA Database

Experiments on Berkeley Database

Diversity-Enhanced Condensation Algorithm and Its Application for Robust and Accurate Endoscope Three-Dimensional Motion Tracking

(CVPR 2014)

Limitations

Condensation algorithm (CA)

• Uses sampling importance resampling (SIR) to solve multimodal-density, nonlinear, non-Gaussian problems

• Limitation: particle impoverishment

Endoscope 3-D motion tracking

• Synchronization of pre- and intra-operative sensory information, e.g., computed tomography (CT) slices, endoscopic images, and positional sensor measurements

• Limitations: image artifacts, tissue deformation, inaccurate sensor measurements

Motivation

Differential evolution (DE)

• Can deal with non-differentiable, nonlinear, and multimodal optimization problems over continuous dynamic state estimation

Purpose

• Solve the particle impoverishment problem

• Inspired by these properties of DE, our strategy is to use the DE algorithm to tackle particle impoverishment.

• Propose a diversity-enhanced condensation algorithm (DECA) that differentially evolves particles to enhance their diversity

Diversity-Enhanced Condensation Algorithm

In general, DECA consists of three steps:

(1) particle diversification using adaptive differential evolution (ADE);

(2) particle transition;

(3) an observation model to compute the particle probability density.
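The three steps can be sketched on a toy 1-D tracking problem. This is a simplified illustration, not the authors' algorithm: it uses plain DE/rand/1 with fixed `F` and `CR` instead of adaptive DE, a Gaussian observation model, a random-walk transition, and made-up parameter values.

```python
import math
import random

def likelihood(x, z, sigma=0.5):
    """Observation model: Gaussian likelihood of measurement z given state x."""
    return math.exp(-0.5 * ((z - x) / sigma) ** 2)

def diversify(particles, z, rng, F=0.8, CR=0.9):
    """DE/rand/1 mutation with greedy selection (fixed F and CR, not adaptive)."""
    n = len(particles)
    evolved = []
    for i, x in enumerate(particles):
        a, b, c = rng.sample([j for j in range(n) if j != i], 3)
        trial = particles[a] + F * (particles[b] - particles[c])
        if rng.random() > CR:  # crossover: in 1-D, keep or replace the single component
            trial = x
        evolved.append(trial if likelihood(trial, z) >= likelihood(x, z) else x)
    return evolved

def deca_step(particles, z, rng, motion_std=0.2):
    particles = diversify(particles, z, rng)                          # (1) diversification
    particles = [x + rng.gauss(0.0, motion_std) for x in particles]   # (2) transition
    weights = [likelihood(x, z) for x in particles]                   # (3) observation model
    total = sum(weights)
    if total == 0.0:  # all likelihoods underflowed: fall back to uniform weights
        weights = [1.0 / len(particles)] * len(particles)
    else:
        weights = [w / total for w in weights]
    estimate = sum(w * x for w, x in zip(weights, particles))
    resampled = rng.choices(particles, weights=weights, k=len(particles))
    return resampled, estimate

# Track a state that drifts by 0.1 per step, observed without noise.
rng = random.Random(0)
particles = [rng.uniform(-1.0, 1.0) for _ in range(50)]
for t in range(20):
    z = 0.1 * t
    particles, estimate = deca_step(particles, z, rng)
```

The diversification step is where this differs from plain SIR: particles are recombined before weighting, so the set does not collapse onto a few resampled copies.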

Diversity-Enhanced Condensation Algorithm

Application to Endoscope Tracking

Results and Discussion

Comparison of accuracy and smoothness

The visual quality and weight distribution of different methods


Visual comparison of tracking results from different methods. The top row shows selected images. The other rows display virtual images generated by the methods of Schwarz, Mori, Luo, and ours; our method outperforms the others.

Visual Tracking via Graph-Based Efficient Manifold Ranking with Low-Dimensional Compressive Features

(ICME 2014, oral)

Motivation: Manifold Ranking Application

[Figure: query → database → results]

Research goal

Treating tracking as a ranking problem, we propose a novel tracking method based on the graph-based manifold ranking algorithm.

Framework

Object representation: flaws with Haar-like features

• All scales and positions must be considered.

• Haar-like features require heavy computation for feature extraction in the training and tracking phases.

Object representation: low-dimensional compressive features

Find a very sparse measurement matrix that projects the high-dimensional features into low-dimensional features.

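$$V = R \times X: \qquad \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = R \times \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}, \qquad v_i = \sum_{j} r_{ij}\, x_j$$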
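A small sketch of projecting a high-dimensional feature vector with a very sparse random matrix. The entry distribution used here (√s · {+1, 0, −1} with probabilities 1/(2s), 1 − 1/s, 1/(2s)) is one common construction for very sparse random projections and is an assumption; the slides do not state which matrix is used, and the sizes and the value of `s` are arbitrary.

```python
import math
import random

def sparse_measurement_matrix(n, m, s, rng):
    """n x m matrix with entries sqrt(s) * {+1, 0, -1}, drawn with probabilities
    1/(2s), 1 - 1/s, 1/(2s); each row is stored as a {column: value} dict."""
    scale = math.sqrt(s)
    rows = []
    for _ in range(n):
        row = {}
        for j in range(m):
            u = rng.random()
            if u < 1.0 / (2 * s):
                row[j] = scale
            elif u < 1.0 / s:
                row[j] = -scale
        rows.append(row)
    return rows

def project(rows, x):
    """v_i = sum_j r_ij * x_j, touching only the nonzero entries of R."""
    return [sum(r * x[j] for j, r in row.items()) for row in rows]

rng = random.Random(1)
R = sparse_measurement_matrix(20, 1000, 100, rng)   # 1000-D features -> 20-D
x = [rng.random() for _ in range(1000)]
v = project(R, x)
nonzeros = sum(len(row) for row in R)
```

Because only about a fraction 1/s of the entries are nonzero, each low-dimensional feature costs a handful of additions instead of a full dot product, which is what makes the representation cheap at tracking time.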

Updating appearance model

Compute the average ranking score:

$$\mu_{r_m^*} = \sum_{i=1}^{t} (r_m^*)_i$$

Then compute the displacement error:

$$e_i = \left( (r_m^*)_i - \mu_{r_m^*} \right)^2$$

We delete the node with the largest displacement error and then add the current tracking result to the appearance model.

Temporal and spatial context

The appearance model only represents the temporal context from previous frames.

Note: the object can also be influenced by its surrounding background.

Efficient manifold ranking

Find anchor points to represent the data points.

Build the relationship between data points and anchor points; the graph then only needs to be built over the anchor points.

The number of anchor points is very small.

[Figure: data points and anchor points]
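Why a small anchor set makes ranking cheap can be sketched with NumPy. The sketch assumes a low-rank adjacency W = ZᵀZ built from data-to-anchor weights Z, the usual manifold-ranking closed form r = (I − αS)⁻¹y with S = D^(−1/2) W D^(−1/2), and the Woodbury identity to replace the n × n inverse by a d × d one. The Gaussian anchor weights, the choice of anchors, and all names are illustrative assumptions.

```python
import numpy as np

def anchor_weights(X, anchors, sigma=1.0):
    """Z[k, i]: normalized Gaussian affinity between anchor k and data point i."""
    d2 = ((anchors[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    Z = np.exp(-d2 / (2.0 * sigma ** 2))
    return Z / Z.sum(axis=0, keepdims=True)

def emr_scores(Z, y, alpha=0.99):
    """Ranking scores using only a d x d linear solve (d = number of anchors)."""
    deg = Z.T @ Z.sum(axis=1)          # row sums of W = Z^T Z, without forming W
    H = Z / np.sqrt(deg)               # H = Z D^{-1/2}, so S = H^T H
    d = Z.shape[0]
    # Woodbury: (I - alpha H^T H)^{-1} y = y - H^T (H H^T - I/alpha)^{-1} H y
    small = H @ H.T - np.eye(d) / alpha
    return y - H.T @ np.linalg.solve(small, H @ y)

def direct_scores(Z, y, alpha=0.99):
    """Reference: the same scores via the full n x n system."""
    W = Z.T @ Z
    deg = W.sum(axis=1)
    S = W / np.sqrt(np.outer(deg, deg))
    return np.linalg.solve(np.eye(len(y)) - alpha * S, y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))           # 50 data points
Z = anchor_weights(X, X[:5])           # 5 anchors (here simply the first 5 points)
y = np.zeros(50)
y[0] = 1.0                             # the query
scores = emr_scores(Z, y)
```

Both functions return the same scores, but `emr_scores` only solves a system whose size is the number of anchors, so the cost grows linearly with the number of data points.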


Quantitative Results

We compared our method with six state-of-the-art methods. Implemented in MATLAB, our tracking method runs at about 10 frames per second (FPS) on average on an i3 3.20 GHz machine with 4 GB RAM.

Screenshots of sampled tracking results

Some tracking demos: Stone, Lemming

ReLISH: Reliable Label Inference via Smoothness Hypothesis

AAAI 2014

Introduction to Semi-Supervised Learning

Why SSL is Necessary: Limited Labeled Data

• Image segmentation (Wang et al., TPAMI 2009)

• Web image classification (Zhu, ICML 2007 tutorial)

Other situations/applications: webpage classification, visual object tracking, 3D protein annotation, NLP… All situations share the same problem:

• Labeled examples are scarce: not adequate for training a supervised classifier.

• Unlabeled examples are abundant: why not use them for classification? → Improved results


Manifold assumption: data are supported by an intrinsic manifold. Labels should vary smoothly on this manifold.

Representative (graph-based) algorithms: GFHF (Zhu et al., 2003), LGC (Zhou et al., NIPS 2003), LNP (Wang et al., ICML 2008), LapSVM/LapRLS (Belkin et al., JMLR 2006)

Smoothness is a key issue for accurate classification

(a) bridge point (b) result of LapRLS (c) result of ReLISH

Motivation

Our method follows the manifold assumption.

“Bridge points” significantly degrade the results of graph-based methods.

Key observation: examples with low degree should be regularized heavily.
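The key observation can be illustrated on a toy graph: a point lying between two dense clusters has a much lower degree than the cluster members. The Gaussian-kernel graph, the point layout, and the inverse-degree penalty below are assumptions chosen for illustration; they are not ReLISH's actual regularizer.

```python
import math

def degrees(points, sigma=0.5):
    """Degree of each node in a fully connected Gaussian-kernel graph."""
    deg = []
    for i, p in enumerate(points):
        total = 0.0
        for j, q in enumerate(points):
            if i != j:
                total += math.exp(-math.dist(p, q) ** 2 / (2 * sigma ** 2))
        deg.append(total)
    return deg

# Two tight 3 x 3 clusters and one bridge point halfway between them.
offsets = [-0.3, 0.0, 0.3]
cluster_a = [(dx, dy) for dx in offsets for dy in offsets]
cluster_b = [(4.0 + dx, dy) for dx in offsets for dy in offsets]
bridge = (2.0, 0.0)
points = cluster_a + cluster_b + [bridge]

deg = degrees(points)
# Hypothetical per-example penalty: the lower the degree, the heavier the regularization.
penalty = [1.0 / d for d in deg]
```

Here the bridge point ends up with the smallest degree and therefore the largest penalty, so its label is constrained most strongly and is less able to leak across the two classes.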

Local smoothness term

Pairwise smoothness term

Model


initial state induction term proposed smoothness term fidelity term

Theoretical Analyses: Smoothness

Experimental Results

Baselines: HF (Zhu et al., 2003), LGC (Zhou et al., 2003), CML (Xia et al., 2008), LNP (Wang et al., 2008), LapRLS/LapSVM (Belkin et al., 2006)

Datasets/Applications:

Synthetic datasets, UCI datasets, Digit recognition, Image classification

Transductive Transductive & Inductive


Synthetic Datasets

DoubleSemicircle (mentioned above):

(a) bridge point (b) result of LapRLS (c) result of ReLISH

Experimental Results: Inductive Ability

[Figure, panels (a) to (f): results on the DoubleMoon, DoubleRing, and Square&Ring datasets; legend: Positive, Negative, Unlabeled]

Observation: The decision boundaries are consistent with the geometry of unlabeled examples

Experimental Results: UCI Datasets (Iris, Seeds, BreastCancer, Wine)

1st row: transductive results; 2nd row: inductive results

Experimental Results: Handwritten Digit Recognition

• Digits 0 to 9: 10 classes

• 800 examples per class: 500 for training, 300 for testing

(a) transductive results; (b) inductive results

Experimental Results: Image Classification (Caltech 256)

• 9 animals: dog, goose, swan, zebra, dolphin, duck, goldfish, horse, and whale

• Features: PHOG, SIFT, Region Covariance, LBP

FLAP: Fick’s Law Assisted Propagation for Semi-Supervised Learning

Accepted by TNNLS

Motivation: FLAP simulates the diffusion of fluid for label propagation, based on a physical theory: Fick’s first law of diffusion.

• Labeled examples: the diffusive source, with a high concentration of label

• Unlabeled examples: the sink, with a low concentration of label

Reference: Chen Gong, Dacheng Tao, Keren Fu, Jie Yang, “FLAP: Fick’s Law Assisted Propagation for Semi-Supervised Learning,” submitted to TNNLS, 2013.
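The diffusion analogy in the motivation can be sketched as a discrete process on a graph: the flow along an edge is proportional to the difference in label concentration, as in Fick's first law. This is a generic illustration and not FLAP's actual update rule. In particular, the labeled nodes are simply held fixed, and the step size `rate` and the toy graph are assumptions.

```python
def diffuse_labels(weights, labels, labeled, rate=0.25, iters=500):
    """Explicit diffusion of label concentration over a weighted graph."""
    n = len(labels)
    f = list(labels)
    for _ in range(iters):
        new_f = list(f)
        for i in range(n):
            if i in labeled:
                continue  # labeled nodes keep their concentration in this sketch
            # Net inflow: flux along each edge is proportional to the concentration difference.
            inflow = sum(weights[i][j] * (f[j] - f[i]) for j in range(n))
            new_f[i] = f[i] + rate * inflow
        f = new_f
    return f

# Path graph 0 - 1 - 2 - 3 - 4 with unit edge weights.
n = 5
W = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]
# Node 0 is labeled with concentration 1, node 4 with concentration 0.
f = diffuse_labels(W, labels=[1.0, 0.0, 0.0, 0.0, 0.0], labeled={0, 4})
```

On this path graph the concentration settles into a linear profile between the two labeled ends. The step size must stay small relative to the node degrees for the explicit update to remain stable.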

Propagation Between Two Nodes:

Propagation On the Whole Graph:

Vectorization:

Important Theoretical Analyses:

Theorem 1: The labels of the labeled examples will remain almost unchanged after the iteration process.

Important Theoretical Analyses:

The reason for the high convergence rate achieved by FLAP.

Interpretation and Connections: 1. Regularization networks:

2. First Order Intrinsic Gaussian Markov Random Fields:

Experimental Results Synthetic Data:

Experimental Results Real Benchmarks Data:

UCI Data:


Handwritten Digit Recognition:

Experimental Results Image Classification:

Experimental Results Activity Recognition:

Dataset: INRIA IXMAS

Features:

Reference: Du Tran and Alexander Sorokin, “Human Activity Recognition with Metric Learning,” ECCV 2008.

Thank you!
