Post on 24-Dec-2015

1

Pre-fetching based on video analysis for interactive region-of-interest streaming of soccer sequences

Authors: Aditya Mavlankar and Bernd Girod

Information Systems Laboratory, Department of Electrical Engineering

Stanford University, Stanford, CA 94305, USA

Email: {maditya, bgirod}@stanford.edu

Speaker: 童耀民 MA1G0222

2013.03.22

2 Outline

1. INTRODUCTION

2. ROI PREDICTION AND PRE-FETCHING

2.1. Trajectory Prediction

2.2. Prediction Using H.264/AVC Motion Vectors

2.3. Prediction Tracking Soccer Ball

2.4. Prediction Tracking Soccer Ball and Players

3. EXPERIMENTAL RESULTS

4. CONCLUSIONS

3 INTRODUCTION

We consider a video streaming system in which the user can interactively watch an arbitrary region of a high-spatial-resolution scene.

Region-of-interest (RoI) prediction helps pre-fetch select slices of encoded video.

4 INTRODUCTION

Despite the availability of high-resolution video, challenges in delivering this high-resolution content to the client are posed by the limited resolution of the display and/or limited data rate for communications.

5 INTRODUCTION

The goal of the paper is to find out whether domain-specific techniques can predict the client’s RoI more accurately.

The more accurate the RoI prediction, the lower the percentage of missing pixels.
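As a rough illustration of this metric, the sketch below computes the percentage of RoI pixels that a pre-fetched region fails to cover. It assumes axis-aligned rectangles given as (x, y, width, height) and a single pre-fetched rectangle. The function name and the rectangle model are assumptions made for illustration, not the paper's slice-based accounting.

```python
def missing_pixel_percentage(actual, prefetched):
    """Percentage of the actual RoI not covered by the pre-fetched rectangle.

    Both arguments are (x, y, width, height) tuples in pixels
    (a simplifying assumption for this sketch).
    """
    ax, ay, aw, ah = actual
    px, py, pw, ph = prefetched
    # Overlap extent along each axis (zero if the rectangles are disjoint).
    overlap_w = max(0, min(ax + aw, px + pw) - max(ax, px))
    overlap_h = max(0, min(ay + ah, py + ph) - max(ay, py))
    covered = overlap_w * overlap_h
    return 100.0 * (aw * ah - covered) / (aw * ah)
```

A perfect prediction gives 0, and a prediction that misses the RoI entirely gives 100.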

6 INTRODUCTION

In this paper, we focus on interactive viewing of soccer and investigate whether domain-specific RoI prediction based on semantic video analysis is more accurate than RoI prediction based on general techniques that apply to any type of content.

7 INTRODUCTION

8 ROI PREDICTION AND PRE-FETCHING

As part of earlier work, we have developed a graphical user interface [2,3] to allow the user to select an RoI while watching the video.

The application supports continuous zoom to provide smooth control of the zoom factor.

9 ROI PREDICTION AND PRE-FETCHING

The high-resolution layers are encoded using independent slices.
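Because the slices are independent, the client only needs those that overlap the RoI. The sketch below lists them, assuming the slices form a regular grid of fixed-size rectangles. The grid layout, the slice dimensions and the function name are hypothetical and are not stated in the slides.

```python
def slices_for_roi(roi, slice_w, slice_h):
    """Return (column, row) indices of grid slices that overlap the RoI.

    roi is (x, y, width, height) in pixels of the high-resolution layer.
    A regular slice grid is an assumption of this sketch.
    """
    x, y, w, h = roi
    first_col, last_col = x // slice_w, (x + w - 1) // slice_w
    first_row, last_row = y // slice_h, (y + h - 1) // slice_h
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]
```

Under these assumptions, a 480 × 240 RoI at (100, 100) with 64 × 64 slices needs 45 slices.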

We choose the high-resolution layer that corresponds most closely to the user’s zoom factor.
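One way to read this selection rule is as a nearest-value search. The sketch below assumes each layer is described by a single zoom factor and that closeness is measured as an absolute difference. Both the representation and the function name are assumptions.

```python
def closest_layer(zoom, layer_zooms):
    """Index of the layer whose zoom factor is nearest to the requested zoom.

    layer_zooms is a list of per-layer zoom factors (hypothetical values).
    Ties resolve to the lower index.
    """
    return min(range(len(layer_zooms)),
               key=lambda i: abs(layer_zooms[i] - zoom))
```

For example, with hypothetical layers at zoom factors 1.0, 2.0 and 4.0, a requested zoom of 1.8 selects the second layer.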

10 ROI PREDICTION AND PRE-FETCHING

If some required high-resolution slices are unavailable, we conceal the error by upsampling portions of the thumbnail video.
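The concealment step can be sketched as follows, assuming nearest-neighbour upsampling by an integer factor and frames stored as nested lists of pixel values. The interpolation method and the helper names are assumptions; the slides do not say which upsampling filter is used.

```python
def upsample_nearest(thumb, factor):
    """Enlarge a 2-D list of pixels by an integer factor (nearest neighbour)."""
    out = []
    for row in thumb:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def conceal(high_res, missing, thumb, factor):
    """Fill pixels flagged in `missing` with upsampled thumbnail pixels.

    high_res and missing have the same shape; thumb is that shape divided
    by `factor` in each dimension.
    """
    filler = upsample_nearest(thumb, factor)
    return [[filler[r][c] if missing[r][c] else high_res[r][c]
             for c in range(len(row))]
            for r, row in enumerate(high_res)]
```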

We compare the performance of four RoI predictors in this paper.

11 ROI PREDICTION AND PRE-FETCHING

The goal of each predictor is to predict the RoI in frame n + d when frame n is rendered on screen.

The zoom factor for frame n + d is predicted to be the same as the zoom factor observed for frame n.

12 ROI PREDICTION AND PRE-FETCHING

2.1. Trajectory Prediction

We adapt the autoregressive moving average (ARMA) prediction algorithm of [13] to extrapolate the coordinates of the RoI center.
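The ARMA model of [13] is not reproduced in these slides. As a simplified stand-in, the sketch below extrapolates the RoI center d frames ahead from an exponentially smoothed velocity estimate. The smoothing constant alpha and the function name are assumptions, and this is not the algorithm of [13].

```python
def extrapolate_center(centers, d, alpha=0.5):
    """Predict the RoI center at frame n + d from centers observed up to frame n.

    centers is a list of (x, y) tuples, oldest first.
    Simplified stand-in for the ARMA predictor, not the method of [13].
    """
    vx = vy = 0.0
    for (x0, y0), (x1, y1) in zip(centers, centers[1:]):
        # Blend the newest frame-to-frame displacement into the velocity estimate.
        vx = alpha * (x1 - x0) + (1 - alpha) * vx
        vy = alpha * (y1 - y0) + (1 - alpha) * vy
    x, y = centers[-1]
    return (x + d * vx, y + d * vy)
```

With a single observation the velocity estimate is zero, so the predicted center equals the last observed one.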

13 ROI PREDICTION AND PRE-FETCHING

2.2. Prediction Using H.264/AVC Motion Vectors

This algorithm, proposed in our earlier work [12], exploits the motion vectors (MVs) contained within the encoded bitstream of the thumbnail video frames that are buffered at the client.

14 ROI PREDICTION AND PRE-FETCHING

2.2. Prediction Using H.264/AVC Motion Vectors

The MVs are used to find a plausible propagation of the RoI center pixel in every subsequent frame up to frame n+d.
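A minimal sketch of such a propagation follows. It assumes the MVs have already been converted into one forward displacement (dx, dy) per 16 × 16 block for each buffered frame, stored as a 2-D grid. That representation, the block size and the function name are assumptions for illustration, and the actual algorithm of [12] may differ.

```python
def propagate_center(center, motion_fields, block=16):
    """Move the RoI center through successive per-block motion fields.

    motion_fields holds one grid per frame from n + 1 to n + d; each grid
    entry is a forward (dx, dy) displacement for that block (assumed format).
    """
    x, y = center
    for field in motion_fields:
        rows, cols = len(field), len(field[0])
        # Look up the block containing the current position, clamped to the grid.
        r = min(max(int(y) // block, 0), rows - 1)
        c = min(max(int(x) // block, 0), cols - 1)
        dx, dy = field[r][c]
        x, y = x + dx, y + dy
    return (x, y)
```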

15 ROI PREDICTION AND PRE-FETCHING

2.3. Prediction Tracking Soccer Ball

The RoI is simply predicted to be centered around the ball.
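Given a tracked ball position, centering the RoI is a small computation. The sketch below also clamps the rectangle so it stays inside the frame. The clamping and the function name are assumptions, since the slides do not say how frame borders are handled.

```python
def roi_around(ball, roi_w, roi_h, frame_w, frame_h):
    """Return (x, y, width, height) of an RoI centered on the ball.

    The RoI is shifted as needed so it stays inside the frame
    (border handling is an assumption of this sketch).
    """
    bx, by = ball
    x = min(max(bx - roi_w // 2, 0), frame_w - roi_w)
    y = min(max(by - roi_h // 2, 0), frame_h - roi_h)
    return (x, y, roi_w, roi_h)
```

For a 2560 × 704 frame and a 480 × 240 RoI, a ball at the frame center yields an RoI whose top-left corner is at (1040, 232).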

16 ROI PREDICTION AND PRE-FETCHING

2.4. Prediction Tracking Soccer Ball and Players

We have developed our own algorithm for player tracking using background subtraction and blob tracking based on MVs.

17 EXPERIMENTAL RESULTS

We use the Soccer1 sequence, which has 2560 × 704 pixels at 25 frames/sec.

The RoI display is 480 × 240 pixels.

18 EXPERIMENTAL RESULTS

19 EXPERIMENTAL RESULTS

20 EXPERIMENTAL RESULTS

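PSNR (Peak Signal to Noise Ratio): this is also a signal-to-noise ratio, except that the signal term is replaced by the maximum value the signal can take. For example, when computing PSNR for a signal that ranges from 0 to 255, the signal term is taken to be the maximum measurable value, 255, rather than the actual signal.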
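The standard definition is PSNR = 10 log10(peak² / MSE), where MSE is the mean squared error between the reference and the test signal. The sketch below applies it to flat lists of pixel values. The function name and the list representation are choices made for this example.

```python
import math


def psnr(reference, test, peak=255):
    """PSNR in dB between two equal-length sequences of pixel values."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)
```

With 8-bit pixels, a mean squared error of 1 corresponds to roughly 48.13 dB.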

21 CONCLUSIONS

For long look-ahead, RoI prediction is very challenging for both kinds of techniques and incurs a large percentage of missing pixels.

Nevertheless, we found that the domain-specific technique performs better, though only by about 1 dB, while the drop in PSNR with respect to perfect RoI prediction is more than 3 dB.
