Anisotropic Partial Differential Equation based Video Saliency Detection

Vartika Sharma, Vembarasan Vaitheeswaran, Chee Seng Chan



Original LESD Model

Our Contributions

First, we propose a novel method to generate a static saliency map based on an adaptive nonlinear PDE model. It builds on the Linear Elliptic System with Dirichlet boundary (LESD) model for image saliency detection. We refine this model for video saliency detection because the original LESD model does not consider the orientation and motion information contained in a video. Further, the original LESD algorithm was tested on the MSRA and Berkeley datasets, where images are mostly noiseless and the salient object lies near the image center, whereas most video datasets contain heavy noise and the salient object usually moves across frames. For this reason we do not use the center prior of the original LESD model; instead, an extensive direction map consisting of background prior, color prior, texture and luminance features is used. We then combine the static map with a motion map, built from motion features extracted from the motion vectors of the predicted frames, to obtain the final saliency map. Figure 1 shows the pipeline of our model.
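A minimal Python sketch of the final fusion step of this pipeline, assuming the static and motion maps have already been computed; the min-max normalization and the simple weighted average are illustrative assumptions, not the model's exact fusion rule.

```python
# Illustrative fusion of a static saliency map with a motion map.
# The weighting scheme here is an assumption, not the paper's exact rule.
import numpy as np

def fuse_saliency(static_map: np.ndarray, motion_map: np.ndarray,
                  alpha: float = 0.5) -> np.ndarray:
    """Combine a static saliency map and a motion map into a final saliency map."""
    def norm(m: np.ndarray) -> np.ndarray:
        # Min-max normalize to [0, 1] so neither map dominates by scale.
        m = m.astype(np.float64)
        rng = m.max() - m.min()
        return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

    return alpha * norm(static_map) + (1.0 - alpha) * norm(motion_map)
```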

Addition of Non-Linear Metric Tensor

The diffusion PDE seen previously does not give reliable information in the presence of flow-like structures (e.g. fingerprints).

We extend our model to flow-like structures, where the PDE flow must be rotated towards the orientation of the interesting features.
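As a rough illustration of the quantity such a rotation needs, the Python sketch below estimates local orientation and coherence from a structure tensor; a coherence-enhancing diffusion scheme would then steer the PDE flow along this orientation. The smoothing scale and helper names are assumptions for illustration, not the authors' implementation.

```python
# Estimate per-pixel orientation and coherence from the structure tensor.
# This only computes the steering information an anisotropic PDE would use;
# it is not a full diffusion solver.
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def structure_tensor_orientation(img: np.ndarray, sigma: float = 2.0):
    """Return per-pixel orientation (radians) and coherence in [0, 1]."""
    img = img.astype(np.float64)
    gx = sobel(img, axis=1)          # horizontal gradient
    gy = sobel(img, axis=0)          # vertical gradient

    # Smooth the gradient outer products to form the structure tensor J.
    jxx = gaussian_filter(gx * gx, sigma)
    jxy = gaussian_filter(gx * gy, sigma)
    jyy = gaussian_filter(gy * gy, sigma)

    # Orientation of the dominant eigenvector and a coherence measure
    # (eigenvalue contrast) indicating how strongly flow-like the structure is.
    theta = 0.5 * np.arctan2(2.0 * jxy, jxx - jyy)
    coherence = np.sqrt((jxx - jyy) ** 2 + 4.0 * jxy ** 2) / (jxx + jyy + 1e-12)
    return theta, coherence
```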


Feature Extraction From DCT Coefficients

Three features, namely luminance, color and texture, are extracted from the unpredicted frames (I-frames) using DCT coefficients. On a given video frame, the DCT operates on one 8x8 block at a time. Each block contains 64 elements, i.e. 64 coefficients, and the DCT processes the block in a left-to-right, top-to-bottom manner (zig-zag sequencing).

Feature Extraction From DCT Coefficients

The result of a 64-element DCT transform is 1 DC coefficient and 63 AC coefficients. The DC coefficient represents the average color of the 8x8 region (color and luminance prior). The 63 AC coefficients represent the color change across the block (texture).
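A hedged Python sketch of this step: in a real decoder the coefficients would be read directly from the compressed I-frame, but here they are recomputed with SciPy's DCT on 8x8 blocks to show how the DC value (average intensity) and the AC energy (texture) of each block are obtained. Function and variable names are illustrative assumptions.

```python
# Per-block DC coefficient and AC energy from 8x8 DCT blocks of one channel.
import numpy as np
from scipy.fftpack import dct

def block_dct_features(channel: np.ndarray, block: int = 8):
    """Return (dc, ac_energy) maps, one value per 8x8 block of the channel."""
    h = channel.shape[0] - channel.shape[0] % block   # crop to a multiple of 8
    w = channel.shape[1] - channel.shape[1] % block
    dc = np.zeros((h // block, w // block))
    ac_energy = np.zeros_like(dc)

    for i in range(0, h, block):
        for j in range(0, w, block):
            blk = channel[i:i + block, j:j + block].astype(np.float64)
            # 2-D DCT-II via two separable 1-D transforms.
            coeffs = dct(dct(blk, axis=0, norm='ortho'), axis=1, norm='ortho')
            dc[i // block, j // block] = coeffs[0, 0]                    # DC term
            ac_energy[i // block, j // block] = (coeffs ** 2).sum() - coeffs[0, 0] ** 2
    return dc, ac_energy
```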

Motion Feature Extraction from Motion Vectors

Motion Vector: a two-dimensional vector used for inter prediction that provides an offset from the coordinates in the decoded picture to the coordinates in a reference picture.

There are two types of predicted frames: P frames use motion compensated prediction from a past reference frame, while B frames are bidirectionally predictive-coded by using motion compensated prediction from a past and/or a future reference frame.

Motion Feature Extraction from Motion Vectors

As there is just one prediction direction (prediction from a past reference frame) for P frames, the original motion vectors (MV) are used to represent the motion feature for P frames. As B frames may include two types of motion-compensated prediction (backward and forward), we calculate the motion vectors for B frames by combining the two predictions.
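A minimal Python sketch of turning per-block motion vectors into a motion feature map. The rule used here for B frames (a simple average of the forward and backward predictions) is an assumption made for illustration; the slide does not spell out the exact combination.

```python
# Per-block motion magnitude from decoded motion vectors.
import numpy as np

def motion_feature(mv_forward: np.ndarray, mv_backward: np.ndarray = None) -> np.ndarray:
    """mv_* has shape (rows, cols, 2); returns the per-block motion magnitude."""
    if mv_backward is None:
        mv = mv_forward                      # P frame: one prediction direction
    else:
        # B frame: combine both directions (illustrative average; the backward
        # vector points towards a future reference, hence the sign flip).
        mv = 0.5 * (mv_forward - mv_backward)
    return np.linalg.norm(mv, axis=-1)
```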


Result of our Video Saliency Detection model on KTH Action Dataset

Results on KTH Action Dataset

Number of action classes = 6: boxing, hand clapping, hand waving, jogging, running, walking.

[Figure: original action videos* and final saliency maps for Boxing, Hand Clapping, Hand Waving, Jogging, Running and Walking]

* For convenience, I have chosen only 16 frames per video. KTH dataset: "Recognizing Human Actions: A Local SVM Approach", Christian Schuldt, Ivan Laptev and Barbara Caputo, in Proc. ICPR'04, Cambridge, UK.


We performed salient region segmentation using the MCMC segmentation method proposed by Barbu et al.~\cite{Barbu2012} for crowd counting. The main purpose of our experiment is to estimate the crowd in a particular video frame and to calculate the rate at which the crowd count changes across consecutive frames. Although CCTV cameras are now very common in video surveillance, very few algorithms are available for real-time automated crowd counting. It is important to note that our focus is on the rate of change of the crowd count rather than the exact crowd count of every frame: a sudden increase or decrease in the crowd count can act as a warning sign of an unusual activity such as an explosion, a fight or some other emergency. For our experiment, we compute the standard deviation of the crowd count over consecutive video frames for every 10 seconds and use it as a risk indicator. We train our algorithm on 2000 video frames from the Mall Dataset \cite{Loy2013} to set the threshold on this standard deviation below which the rate of change of the crowd count is considered safe. We further test our algorithm on a few videos from the Pedestrian Traffic Database. Figure~\ref{fig:final2} shows our result on the Mall Dataset for crowd counting.
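A small Python sketch of this risk indicator, assuming a sequence of per-frame crowd counts is already available from the segmentation; the frame rate and the example threshold are placeholders, not values from the paper.

```python
# Standard deviation of the crowd count over consecutive 10-second windows,
# flagged when it exceeds a threshold learned on the Mall Dataset.
import numpy as np

def risk_flags(counts, fps: float = 25.0, window_sec: float = 10.0,
               threshold: float = 3.0):
    """Return the per-window std of crowd counts and a boolean alert per window."""
    counts = np.asarray(counts, dtype=np.float64)
    win = int(fps * window_sec)
    n_windows = len(counts) // win
    stds = np.array([counts[k * win:(k + 1) * win].std() for k in range(n_windows)])
    return stds, stds > threshold
```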

\begin{figure}
\begin{center}
%\fbox{\rule{0pt}{2in} \rule{0.9\linewidth}{0pt}}
\includegraphics[height=0.95\linewidth, width=0.95\linewidth]{final2.png}
\end{center}
\caption{Crowd counting result on frames of the Mall Dataset. $(a)$ shows the original video frames, $(b)$ our saliency detection results and $(c)$ the segmentation based on the MCMC method.}
\label{fig:final2}
%\label{fig:onecol}
\end{figure}

[Comparison chart legend: CIOFM, MRS, SURPRISE, CA, Our Model]

Thank You