[ieee 2010 second international conferences on advances in multimedia - athens, greece...



Boundary-based Measures for Evaluation of Color Image Segmentation

Bogdan Popescu, Andreea Iancu, Marius Brezovan and Dumitru Dan Burdescu

University of Craiova, Software Engineering Department

Craiova, Bd. Decebal 107, Romania

{bogdan.popescu, andreea.iancu}@itsix.com, {brezovan marius, burdescu dumitru}@software.ucv.ro

Abstract: The present paper addresses the problem of image segmentation evaluation by comparing four different approaches. We are introducing a new method of salient object recognition with very good results relative to other already known object detection methods. We developed a simple evaluation framework in order to compare the results of our method with other segmentation methods. The experimental results offer a complete basis for parallel analysis with respect to the precision of our algorithm, rather than the individual efficiency.

Keywords: color segmentation; graph-based segmentation; contour-based evaluation methods

I. INTRODUCTION

The aim of image segmentation is the domain-independent partition of the image into a set of regions which are visually distinct and uniform with respect to some property, such as grey level, texture or color. The problem of segmentation is an important research field and many segmentation methods have been proposed in the literature so far ([1], [3], [4], [7]).

Image segmentation is one of the most important operations performed on acquired images. Image segmentation evaluation [2] focuses on two main properties: objectivity and generality. Objectivity means that all the test images in the benchmark should have an unambiguous ground-truth segmentation so that the evaluation can be conducted objectively. Generality means that the test images in the benchmark should have a large variety so that the evaluation results can be extended to other images and applications.

The main objective of this paper is to emphasize the very good results of image segmentation obtained by our segmentation technique, Graph-Based Salient Object Detection, and to compare them with other existing methods. The algorithms that we use for comparison are: Normalized Cuts, Efficient Graph-Based Image Segmentation (Local Variation) and Mean Shift. All of them are complex and well-known algorithms with very good results in this area, so their results provide a solid reference for comparison.

The experiments were completed using the images and ground-truth segmentations in the Berkeley segmentation dataset [8]. Since the ground-truth segmentation may not be well and uniquely defined, each test image in the Berkeley benchmark is manually segmented by a group of people.

The segmentation accuracy is measured by taking into consideration the global consistency error and the local consistency error.

The paper is organized as follows. Section II briefly presents previous studies in the domain of image segmentation, and Section III describes the segmentation methods we compare, including the one we propose. The methodology of performance evaluation is presented in Section IV. The experimental results are presented in Section V. Section VI concludes the paper and outlines the main directions of future work.

II. RELATED WORK

Image segmentation evaluation is an open subject in today’s image processing field. The goal of existing studies is to establish the accuracy of each individual approach and find new improvement methods. The evaluation methods require ground truth image segmentations as reference. The main drawback of providing such a reference is the resources that are needed. However, after analyzing the differences between the image under study and the ground truth segmentation, a performance proof is obtained.

Region-based segmentation methods can be broadly classified as either model-based [11] or visual feature-based [12] approaches. A distinct category of region-based segmentation methods that is relevant to our approach is represented by graph-based segmentation methods. Most graph-based segmentation methods search for certain structures in the associated edge-weighted graph constructed on the image pixels, such as a minimum spanning tree [3] or a minimum cut [13].

The Berkeley image segmentation benchmark is the reference that we use for our study. Using the same input, the provided image dataset, we are developing a customized methodology in order to efficiently evaluate our algorithm.

The closest work to ours is [3], in which an image segmentation is produced by creating a forest of minimum spanning trees of the connected components of the associated weighted graph of the image. The novelty of our contribution concerns two main aspects: (a) in order to minimize the running time we construct a hexagonal structure based on the image pixels, that is used in both color-based and syntactic-based segmentation algorithms, and (b) we propose an efficient method for segmentation of color images based on spanning trees and both color and syntactic features of regions.

2010 Second International Conferences on Advances in Multimedia

978-0-7695-4068-9/10 $26.00 © 2010 IEEE

DOI 10.1109/MMEDIA.2010.34

III. SEGMENTATION METHODS

We compare four different segmentation techniques: the Mean Shift-Based segmentation algorithm [4], the Efficient Graph-Based segmentation algorithm [3], the Normalized Cuts segmentation algorithm [7] and our own contour-based segmentation method. We have chosen Mean Shift-Based segmentation because it is generally effective and has become widely used in the vision community. The Efficient Graph-Based segmentation algorithm was chosen as an interesting comparison to Mean Shift. Its general approach is similar; however, it excludes the mean shift filtering step itself, thus partially addressing the question of whether the filtering step is useful. Due to its computational efficiency, Normalized Cuts represents a solid reference in our study. We use all these algorithms as terms of comparison for the evaluation we performed.

A. Graph-Based Salient Object Detection

We present an efficient segmentation method that uses color and some geometric features of an image to process it and create a reliable result [10]. The color space used is RGB because of its color consistency and its computational efficiency.

Figure 1. The grid-graph constructed on the hexagonal structure of an image

What is particular to this approach is the use of a hexagonal structure instead of individual color pixels. In this way we can represent the structure as a grid-graph G = (V, E) where each hexagon h in the structure has a corresponding vertex v ∈ V, as presented in Figure 1. Each hexagon has six neighbors and each neighborhood connection is represented by an edge in the set E of the graph. Two important attributes are associated with each hexagon of the structure: the dominant color and the coordinates of the gravity center. Basically, each hexagonal cell contains eight pixels: six from the frontier and two from the middle.
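The grid-graph described above can be illustrated with a small sketch. It assumes hexagons are addressed by axial coordinates (q, r), which is a common convention for hexagonal grids and not something the paper specifies; the function name `hex_grid_graph` is hypothetical, and the per-hexagon attributes (dominant color, gravity center) are left out.

```python
def hex_grid_graph(cols, rows):
    """Build a grid-graph G = (V, E) over hexagons in axial coordinates.

    Assumption: axial (q, r) addressing; the paper does not state how its
    hexagons are indexed. Each interior hexagon gets exactly six neighbors.
    """
    # The six neighbor offsets of a hexagon in axial coordinates.
    directions = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]
    vertices = {(q, r) for q in range(cols) for r in range(rows)}
    edges = set()
    for (q, r) in vertices:
        for dq, dr in directions:
            neighbor = (q + dq, r + dr)
            if neighbor in vertices:
                # Store each undirected edge once, as a sorted pair.
                edges.add(tuple(sorted([(q, r), neighbor])))
    return vertices, edges


V, E = hex_grid_graph(3, 3)
center_degree = sum(1 for e in E if (1, 1) in e)
print(len(V), len(E), center_degree)
```

Hexagons on the border of the structure have fewer than six neighbors in this sketch, since edges are only created between cells that exist.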

Image segmentation is realized in two distinct steps. The first step represents a pre-segmentation step when only color information is used to determine an initial segmentation. The second step represents a syntactic-based segmentation step when both color and geometric properties of regions are used.

The first step of the segmentation algorithm uses a color-based region model and will produce a forest of maximum spanning trees based on a modified form of Kruskal’s algorithm. In this case the evidence for a boundary between two adjacent regions is based on the difference between the internal contrast and the external contrast between the regions. The color-based segmentation algorithm builds a maximal spanning tree for each salient region of the input image.

The second step of the segmentation algorithm uses a new graph, which has a vertex for each connected component determined by the color-based segmentation algorithm. In this case the region model contains in addition some geometric properties of regions such as the area of the region and the region boundary. The final segmentation step produces a forest of minimum spanning trees based on a modified form of Boruvka’s algorithm. Each determined minimum spanning tree represents a final salient region determined by the segmentation algorithm.

B. Efficient Graph-Based Image Segmentation

Efficient Graph-Based image segmentation [3] is an efficient method of performing image segmentation. The basic principle is to directly process the data points of the image, using a variation of single linkage clustering without any additional filtering. A minimum spanning tree of the data points is used to perform traditional single linkage clustering from which any edges with length greater than a given threshold are removed [6].

Let G = (V, E) be a fully connected graph, with m edges {e_i} and n vertices. Each vertex is a pixel, x, represented in the feature space. The final segmentation will be S = (C_1, ..., C_r), where C_i is a cluster of data points. The algorithm [3] can be summarized as follows:

1) Sort $E = (e_1, \ldots, e_m)$ such that $|e_t| \le |e_{t'}|$ for all $t < t'$.

2) Let $S^0 = (\{x_1\}, \ldots, \{x_n\})$; in other words, each initial cluster contains exactly one vertex.

3) For $t = 1, \ldots, m$:

a) Let $x_i$ and $x_j$ be the vertices connected by $e_t$.

b) Let $C_{x_i}^{t-1}$ be the connected component containing point $x_i$ on iteration $t-1$, and let $l_i = \max \mathrm{mst}(C_{x_i}^{t-1})$ be the longest edge in the minimum spanning tree of $C_{x_i}^{t-1}$. Likewise for $l_j$.

c) Merge $C_{x_i}^{t-1}$ and $C_{x_j}^{t-1}$ if:

$$ |e_t| < \min\left\{ l_i + \frac{k}{|C_{x_i}^{t-1}|},\; l_j + \frac{k}{|C_{x_j}^{t-1}|} \right\} \tag{1} $$

4) $S = S^m$.
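The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the paper: it assumes edges are given as (weight, i, j) tuples, that the denominator in Eq. (1) is the number of points in the component, and it uses a union-find structure to track components. The name `segment_graph` and the toy intensities are hypothetical.

```python
def segment_graph(num_vertices, edges, k):
    """Merge components following steps 1-4; returns one label per vertex.

    edges: list of (weight, i, j) tuples. k: scale-of-observation constant.
    """
    parent = list(range(num_vertices))   # union-find forest
    size = [1] * num_vertices            # number of points per component
    longest = [0.0] * num_vertices       # longest MST edge per component

    def find(v):
        # Root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Step 1: process edges in order of increasing weight.
    for w, i, j in sorted(edges):
        a, b = find(i), find(j)
        if a == b:
            continue
        # Step 3c: merge test of Eq. (1).
        if w < min(longest[a] + k / size[a], longest[b] + k / size[b]):
            parent[b] = a
            size[a] += size[b]
            # Edges arrive sorted, so w is the longest edge of the merged tree.
            longest[a] = w
    return [find(v) for v in range(num_vertices)]


# Toy example: a chain of six "pixels" with two clearly separated intensities.
values = [0, 1, 2, 50, 51, 52]
chain = [(abs(values[i] - values[i + 1]), i, i + 1) for i in range(5)]
labels = segment_graph(len(values), chain, k=3)
print(labels)
```

The example uses a chain graph rather than a fully connected one only to keep it small; larger values of k make the merge test easier to satisfy and so tend to produce larger regions.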


C. Normalized Cuts

The Normalized Cuts method models an image using a graph G = (V, E), where V is a set of vertices corresponding to image pixels and E is a set of edges connecting neighboring pixels. The edge weight w(u, v) describes the affinity between two vertices u and v based on different metrics like proximity and intensity similarity. The algorithm segments an image into two segments that correspond to a graph cut (A, B), where A and B are the vertices in the two resulting subgraphs.

The segmentation cost is defined by:

$$ Ncut(A,B) = \frac{cut(A,B)}{assoc(A,V)} + \frac{cut(A,B)}{assoc(B,V)} \tag{2} $$

where $cut(A,B) = \sum_{u \in A, v \in B} w(u,v)$ is the cut cost of (A, B) and $assoc(A,V) = \sum_{u \in A, v \in V} w(u,v)$ is the association between A and V. The algorithm finds a graph cut (A, B) with a minimum cost in Eq. (2). Since this is an NP-complete problem, a spectral graph algorithm was developed to find an approximate solution [7]. This algorithm can be recursively applied on the resulting subgraphs to get more segments. For this method, the most important parameter is the number of regions to be segmented. Normalized Cuts is an unbiased measure of dissociation between the subgraphs, and it has the property that minimizing normalized cuts leads directly to maximizing the normalized association relative to the total association within the sub-groups.
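To make the cost in Eq. (2) concrete, the following sketch evaluates it directly for a given partition of a small weighted graph. It only computes the cost; it does not implement the spectral approximation of [7]. The function name `ncut_cost` and the 4-vertex weight matrix are made up for illustration.

```python
def ncut_cost(weights, part_a):
    """Evaluate Ncut(A, B) for a symmetric weight matrix (list of lists).

    part_a: set of vertex indices forming A; B is the complement.
    """
    n = len(weights)
    part_b = set(range(n)) - set(part_a)
    # cut(A, B): total weight of edges crossing the partition.
    cut = sum(weights[u][v] for u in part_a for v in part_b)
    # assoc(X, V): total weight from vertices of X to all vertices.
    assoc_a = sum(weights[u][v] for u in part_a for v in range(n))
    assoc_b = sum(weights[u][v] for u in part_b for v in range(n))
    return cut / assoc_a + cut / assoc_b


# Two tightly connected pairs (0-1 and 2-3) joined by one weak edge (1-2).
W = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
print(ncut_cost(W, {0, 1}))  # cutting the weak edge: low cost
print(ncut_cost(W, {0}))     # isolating a single vertex: high cost
```

In this toy graph, isolating one vertex is penalized because the association of the small side is low, which is the normalization effect the criterion is designed to provide.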

D. Mean Shift

The Mean Shift-Based segmentation technique [4] is one of many techniques dealing with “feature space analysis”. Advantages of feature-space methods are the global representation of the original data and the excellent tolerance to noise [9]. The algorithm has two important steps: a mean shift filtering of the image data in feature space and a clustering process of the data points already filtered. During the filtering step, segments are processed using the kernel density estimation of the gradient. Details can be found in [4]. A uniform kernel for gradient estimation with radius vector $h = [h_s, h_s, h_r, h_r, h_r]$ is used, where $h_s$ is the radius of the spatial dimensions and $h_r$ the radius of the color dimensions. Combining these two parameters, complex analysis can be performed while training on different subjects.

Mean shift filtering is only a preprocessing step. Another step is required in the segmentation process: clustering of the filtered data points {x′}. During filtering, each data point in the feature space is replaced by its corresponding mode. This suggests a single linkage clustering that converts the filtered points into a segmentation.
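The idea of replacing each data point by its mode can be illustrated with a one-dimensional sketch using a uniform kernel of radius h. This is a simplification under stated assumptions: the method in [4] works on a five-dimensional spatial-plus-color feature vector with separate radii, whereas the hypothetical `mean_shift_mode` below works on plain numbers.

```python
def mean_shift_mode(x, points, h, tol=1e-6, max_iter=100):
    """Shift x toward a local density mode using a uniform kernel of radius h.

    Assumes x starts at one of the data points, so the window is never empty.
    """
    for _ in range(max_iter):
        # Uniform kernel: every point within distance h has equal weight.
        window = [p for p in points if abs(p - x) <= h]
        new_x = sum(window) / len(window)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


data = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
modes = [mean_shift_mode(p, data, h=1.0) for p in data]
print([round(m, 3) for m in modes])
```

Points that converge to the same mode would then be grouped into one cluster, which corresponds to the clustering step that follows the filtering.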

Another paper that describes the clustering is [5]. A region adjacency graph (RAG) is created to hierarchically cluster the modes. Also, edge information from an edge detector is combined with the color information to better guide the clustering. This is the method used in the available EDISON system, also described in [5]. The EDISON system is the implementation we use in our evaluation system.

IV. CONTOUR-BASED PERFORMANCE EVALUATION

We present comparative results of segmentation performance for our contour-based segmentation method and the three alternative segmentation methods mentioned above.

Our evaluation measure is mainly related to the precision-recall framework. It represents a parametric curve that illustrates the relation between accuracy and noise while the detector threshold varies.

The precision and recall measures are particularly meaningful in the context of boundary detection when we consider applications that make use of boundary maps, such as stereo or object recognition. It is reasonable to characterize higher level processing in terms of how much true signal is required to succeed (recall, R), and how much noise can be tolerated (precision, P). A particular application can define a relative cost α between these quantities, which focuses attention at a specific point on the precision-recall curve. The F-measure [2] is defined as:

$$ F = \frac{PR}{\alpha R + (1 - \alpha) P} \tag{3} $$

This equation expresses F as the weighted harmonic mean of P and R. A more accessible formula results when α = 0.5:

$$ F = \frac{2PR}{R + P} \tag{4} $$

The location of the maximum F-measure along the curve provides the optimal detector threshold for the given application. We report the maximum F-measure value across an algorithm’s precision-recall curve as its summary statistic.
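As a small worked illustration of the F-measure and of picking the operating point with the maximum F along a curve, consider the sketch below. The function names and the three (threshold, precision, recall) triples are invented for the example and are not results from the paper.

```python
def f_measure(p, r, alpha=0.5):
    """Weighted harmonic mean of precision p and recall r."""
    denom = alpha * r + (1 - alpha) * p
    return 0.0 if denom == 0 else p * r / denom


def best_operating_point(curve):
    """Return the (threshold, precision, recall) triple with the highest F."""
    return max(curve, key=lambda point: f_measure(point[1], point[2]))


# Illustrative values only.
curve = [(0.1, 0.50, 0.90), (0.5, 0.80, 0.60), (0.9, 0.95, 0.20)]
threshold, p, r = best_operating_point(curve)
print(threshold, round(f_measure(p, r), 4))
```

With α = 0.5 the function reduces to 2PR / (R + P), so a point with high precision but very low recall (or the reverse) scores worse than a balanced one.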

We will evaluate the performance of our algorithm on the Berkeley Segmentation Database (BSD) [8]. We will refer to the characteristics of the error metrics previously defined by Martin et al. [2], explore potential problems with these metrics, introduce precision/recall curves that use the BSD to evaluate the quality of a segmentation, and show that these curves can be used to tune the parameters of a segmentation algorithm, and to characterize its performance over a range of parameter values.

The current public version of the Berkeley Segmentation Database is composed of 300 color images. The images have a size of 481 × 321 pixels, and are divided into two sets: a training set containing 200 images that can be used to tune the parameters of a segmentation algorithm, and a testing set containing the remaining 100 images on which the final performance evaluations should be carried out.

We built a custom benchmark framework that processes the Berkeley dataset, converts it to our proprietary format and performs parallel analysis. Additionally, we adapted the other mentioned algorithms to the same evaluation format so that all methods are evaluated uniformly.


The human segmented images provide the ground truth boundaries. Therefore, any boundary marked by a human subject is considered to be valid. Since there are multiple segmentations of each image by different subjects, it is the collection of these human-marked boundaries that constitutes the ground truth. Based on the output of the previously presented algorithms for a set of images, we will determine how well the ground truth boundaries are approximated.

In order to determine an algorithm’s efficiency by comparing it to the ground truth boundaries, a threshold of the boundary map is needed. At each threshold level precision and recall are computed, resulting in a precision-recall curve for each algorithm.

Even though the precision-recall curve for the output of our segmentation algorithm is a rich descriptor of its performance, it is still desirable to summarize the performance of an algorithm as a single number. We are providing an additional evaluation tool based on a histogram of the pixels that have a correspondent on a human segmentation boundary. This can provide the ratio of matched pixels on a segmented contour. The summary statistic that we use is a measure of the distance used to calculate the histogram.

V. EXPERIMENTAL RESULTS

Our study of segmentation quality is based on experimental results and uses the Berkeley segmentation dataset provided at [8].

A. Precision-Recall Analysis

In order to properly evaluate the segmentation method we propose, we used a well-known quantitative contour-based measure: the Precision-Recall metric. To calculate the values for Precision and Recall, we have elaborated and implemented a proprietary framework as an alternative to Berkeley’s benchmark tool. We will present a parallel analysis of our method with the three other segmentation algorithms mentioned above: Normalized Cuts [7], Efficient Graph-Based Image Segmentation [3] and Mean-Shift [4].

We have implemented a custom definition for Precision and Recall. Let C be the set of points on the objects’ contour derived from the segmentation and GC be the ground truth reference. We use as ground truth definitions the human segmentations from the Berkeley dataset. The matching function match(C) can be defined as the number of pixels from C that have a correspondent in GC with a tolerance ε. The tolerance is the maximum distance in pixels at which two contour points are still considered to correspond. Similarly, match(GC) will represent the reverse calculation of correspondent pixels, from ground truth to the segmentation result. The values for precision and recall are defined as follows:

$$ P = \frac{match(C)}{|C|} \tag{5} $$

$$ R = \frac{match(GC)}{|GC|} \tag{6} $$
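The definitions above can be sketched as follows, with precision taken as match(C)/|C| and recall as match(GC)/|GC|. This is a brute-force illustration under assumptions: contours are lists of (x, y) pixel coordinates, correspondence uses Euclidean distance, and several points may match the same target point. The paper does not state these details, and the function names are hypothetical.

```python
from math import hypot


def match(source, target, eps):
    """Count points of `source` with at least one `target` point within eps."""
    return sum(
        1
        for (x, y) in source
        if any(hypot(x - u, y - v) <= eps for (u, v) in target)
    )


def precision_recall(contour, ground_truth, eps):
    """Precision: matched fraction of the contour; recall: of the ground truth."""
    precision = match(contour, ground_truth, eps) / len(contour)
    recall = match(ground_truth, contour, eps) / len(ground_truth)
    return precision, recall


# Toy contours: one spurious detected point and one missed ground-truth point.
C = [(0, 0), (1, 0), (2, 0), (10, 10)]
GC = [(0, 1), (1, 1), (2, 1), (3, 1), (4, 1)]
print(precision_recall(C, GC, eps=1.5))
```

The quadratic search is fine for a sketch; a spatial index or a distance transform of the ground-truth map would be the usual choice for full-size images.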

From the set of 100 test pictures in Berkeley’s dataset that we have analyzed, we will present the evaluation results of the images with IDs 42049 and 101087.

In order to have a better analysis result and a more complete Precision-Recall curve, we have performed 10 different tests for each subject per algorithm. More precisely, by varying some parameters, we have obtained 10 distinct points that define the curve for each approach. For Normalized Cuts [7] we have varied the number of segments over the values {5, 10, 12, 15, 20, 25, 30, 40, 50, 70}. The variable parameter for Efficient Graph-Based Image Segmentation [3] was the scale of observation, k, taking the values {100, 200, 300, 400, 500, 600, 700, 800, 900, 1000}. For Mean-Shift [4] we have made 10 combinations from Spatial Bandwidth {8, 16} and Range Bandwidth {4, 7, 10, 13, 16}. The resulting Precision-Recall curves are shown in Figure 2 and Figure 3.
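The parameter settings listed above can be written down as a small configuration sketch, which also makes it easy to check that each algorithm contributes 10 points to its curve. The variable names are hypothetical; only the values come from the text.

```python
from itertools import product

# Number of segments tried for Normalized Cuts.
NCUTS_SEGMENTS = [5, 10, 12, 15, 20, 25, 30, 40, 50, 70]

# Scale of observation k for Efficient Graph-Based Image Segmentation.
EGB_K = list(range(100, 1001, 100))

# (spatial bandwidth, range bandwidth) pairs for Mean-Shift.
MEAN_SHIFT_BANDWIDTHS = list(product([8, 16], [4, 7, 10, 13, 16]))

for name, grid in [
    ("Normalized Cuts", NCUTS_SEGMENTS),
    ("Efficient Graph-Based", EGB_K),
    ("Mean-Shift", MEAN_SHIFT_BANDWIDTHS),
]:
    print(name, len(grid))
```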

Figure 2. Precision-Recall curve for image 42049

Figure 3. Precision-Recall curve for image 101087

We calculated the Precision-Recall average values for the 100 test images provided by Berkeley. Similar to the previous representation, Figure 4 illustrates the Precision (P) and Recall (R) curve, where P and R represent the average calculated values for each parameter considered for each algorithm.

Figure 4. Average Precision-Recall curve for Berkeley test images

From the diagrams presented above we can see that the Precision-Recall curve for our proposed method, denoted GBSOD (Graph-Based Salient Object Detection), is situated above the other curves, indicating a better performance result and a balanced algorithm.

B. Histogram based evaluation

We elaborated a histogram-based evaluation mechanism aimed at comparing contour segmentation results. The histograms present graphically the match(C) function used by the Precision-Recall equations. They are very useful as they provide an overview of the results of every segmentation. Figures 5, 6, 7 and 8 present the individual histogram analysis for each of the four compared algorithms. Visually, the red areas represent pixels that have no match with the ground truth contour, while the blue areas denote matching pixels on both contours relative to the tolerance measure ε. We consider a contour C, as a result of segmentation through one of the selected methods, and the ground truth contour G. For each point from C, we applied a distance function match that determines whether there is a corresponding point in G, taking into account the tolerance measure ε. If a corresponding point is found in G, a blue line is drawn, otherwise a red one is drawn. The height of each blue line is proportional to the calculated distance. For a correct interpretation, consider the histogram segments put side by side. The horizontal axis lists the points of contour C, from 0 to the maximum number of points.
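A possible way to compute the data behind such a histogram is sketched below: one bar per point of C, colored blue when its nearest ground-truth point lies within ε and red otherwise. Assumptions not taken from the paper: Euclidean distance, bar heights equal to the nearest distance (the text only says blue heights are proportional to it and does not specify red heights), and the hypothetical names `contour_histogram` and `matched_ratio`.

```python
from math import hypot


def contour_histogram(contour, ground_truth, eps):
    """Return one (color, distance) bar per contour point, in contour order."""
    bars = []
    for (x, y) in contour:
        # Distance from this contour point to the nearest ground-truth point.
        d = min(hypot(x - u, y - v) for (u, v) in ground_truth)
        bars.append(("blue", d) if d <= eps else ("red", d))
    return bars


def matched_ratio(bars):
    """Fraction of contour points that found a match (the blue bars)."""
    return sum(1 for color, _ in bars if color == "blue") / len(bars)


C = [(0, 0), (1, 0), (10, 10)]
G = [(0, 1), (1, 1)]
bars = contour_histogram(C, G, eps=1.5)
print([color for color, _ in bars], round(matched_ratio(bars), 3))
```

The ratio of blue bars equals match(C)/|C|, so the histogram can be read as a per-point breakdown of the precision value.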

VI. CONCLUSION AND FUTURE WORK

In this paper we presented a new graph-based method for image segmentation and extraction of visual objects. Starting from a survey of several segmentation strategies, we aimed at performing an image segmentation evaluation experiment. Our segmentation method and three other segmentation methodologies were chosen for the experiment, and the complementary nature of the methods was demonstrated in the results. The study results offer a clear view of the effectiveness of each segmentation algorithm, trying in this way to offer a solid reference for future studies.

Figure 5. Graph Based Salient Object Detection Histogram

Figure 6. Normalized Cuts Histogram

Future work will be carried out in the direction of integrating syntactic visual information into a semantic level of a semantic image processing and indexing system.

ACKNOWLEDGMENT

This work was supported by CNCSIS-UEFISCSU, project number PNII-IDEI code/2008.

REFERENCES

[1] K. Fu and J. Mui, “A survey on image segmentation”, Pattern Recognition, 1981.

[2] D. Martin, C. Fowlkes, D. Tal, and J. Malik, “A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics”, Proc. Int. Conf. Comp. Vis., vol. 2, pp. 416-425, 2001.

[3] P. Felzenszwalb and D. Huttenlocher, “Efficient Graph-Based Image Segmentation”, Intl J. Computer Vision, vol. 59, no. 2, 2004.

[4] D. Comaniciu and P. Meer, “Mean Shift: A Robust Approach toward Feature Space Analysis”, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, pp. 603-619, 2002.

[5] C. Christoudias, B. Georgescu, and P. Meer, “Synergism in Low Level Vision”, Proc. Intl Conf. Pattern Recognition, vol. 4, pp. 150-156, 2002.


Figure 9. Comparative segmentation results: Human Segmentation, Graph-based Salient Object Detection, Normalized Cuts, Efficient Graph-based,Mean-Shift

Figure 7. Efficient Graph-Based Image Segmentation Histogram

Figure 8. Mean-Shift Histogram

[6] R. Unnikrishnan, C. Pantofaru, and M. Hebert, “Toward Objective Evaluation of Image Segmentation Algorithms”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, 2007.

[7] J. Shi and J. Malik, “Normalized Cuts and Image Segmentation”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, 2000.

[8] “Berkeley Segmentation and Boundary Detection Benchmark and Dataset”, 2003, http://www.cs.berkeley.edu/projects/vision/grouping/segbench.

[9] R.O. Duda, P.E. Hart, and D.G. Stork, “Pattern Classification”, John Wiley & Sons, New York, 2000.

[10] D. Burdescu, M. Brezovan, E. Ganea, and L. Stanescu, “A New Method for Segmentation of Images Represented in a HSV Color Space”, Lecture Notes in Computer Science, 5807, 606-617, 2009.

[11] C. Carson, S. Belongie, H. Greenspan, and J. Malik, “Blobworld: Image segmentation using expectation-maximization and its application to image querying and classification”, IEEE Trans. on Pattern Analysis and Machine Intelligence, 24(8), 1026–1037, 2002.

[12] J. Fauqueur and N. Boujemaa, “Region-based image retrieval: Fast coarse segmentation and fine color description”, Journal of Visual Languages and Computing, 15(1), 69–95, 2004.

[13] J. Shi and J. Malik, “Normalized cuts and image segmentation”, Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, San Juan, Puerto Rico, 731–737, 1997.
