
Research & Reviews: A Journal of Statistics

Volume 2, Issue 3, December 2012, Pages 17-30

__________________________________________________________________________________________

ISSN: 2278 – 2273© STM Journals 2012. All Rights Reserved


Edge Estimation in Isomorphic Graph Population using Node-Sampling

D. Shukla¹, Narendra Singh Thakur²*, Yashwant Singh Rajput¹

¹Department of Mathematics and Statistics, Dr. H.S. Gour Central University, Sagar (M.P.), India

²Department of Mathematics and Statistics, Banasthali University, Rajasthan, India

*Author for Correspondence: E-mail [email protected]

ABSTRACT

Consider a population whose units have a graphical relationship in the form of vertices and edges. Two graphs are said to be isomorphic if they have a similar mapping of edges. Each edge has a length value in both graphs. This paper combines graph structure with sampling by applying sampling over the graphical population. It is assumed that one graph is of main interest and the other, its isomorphic counterpart, is an auxiliary source of information. Estimation of the mean edge length of the former graph is considered using prior knowledge of the edge lengths of its isomorphic counterpart. An estimation strategy is proposed in the form of a class of estimators based on a random sample of some vertices, and its properties are examined. The sample of edges is drawn through a designed node sampling procedure. Optimality criteria are derived, and the mathematical results are supported numerically with a 99% confidence interval test. All sample-based estimates of edge length are found to be close to the true value.

Keywords: Graph, Isomorphic graph, Edge, Vertices (nodes), Simple random sampling without replacement (SRSWOR), Estimator, Bias, Mean Squared Error (MSE), Optimum choice, Confidence interval

1. INTRODUCTION

Two figures are regarded as equivalent if they behave identically in terms of geometric proportions. According to the concept of isomorphism, two graphs G and G′ are said to be isomorphic if there is a one-to-one correspondence between their vertices and edges such that the incidence relationship is preserved. As an example, consider Figure 1.0 and Figure 1.1 [1], where graph G has vertices $V_1, V_2, V_3, V_4, V_5$ and edges $e_1, e_2, e_3, e_4, e_5, e_6$; graph G′ has vertices $V'_1, V'_2, V'_3, V'_4, V'_5$ and edges $e'_1, e'_2, e'_3, e'_4, e'_5, e'_6$.

Fig. 1.0: Graph G (vertices V1 to V5, edges e1 to e6).




Both G and G′ are drawn differently, but the edges $e_i$ have the same relations with their vertices as the edges $e'_i$. In general, $e_i$ carries the same mapping between the vertices $V_i, V_j$ ($i \ne j = 1, 2, 3, \ldots$) as $e'_i$ does between $V'_i, V'_j$ ($i \ne j = 1, 2, 3, \ldots$). In all, the isomorphic properties of G and G′ require the following [1]:

1. The same number of vertices in both;

2. The same number of edges in both;

3. An equal number of vertices with a given degree.
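The three requirements listed above can be checked mechanically. The following is a minimal sketch, assuming each graph is given as a list of undirected vertex pairs with no isolated vertices; the function names and the example graphs are hypothetical and are not taken from Figures 1.0 and 1.1. Passing all three checks is necessary for isomorphism but does not by itself establish it.

```python
from collections import Counter


def degree_sequence(edges):
    """Sorted vertex degrees of an undirected graph given as (u, v) pairs."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(degree.values())


def necessary_conditions_hold(edges_g, edges_h):
    """Check the three listed requirements (necessary, not sufficient)."""
    vertices_g = {v for edge in edges_g for v in edge}
    vertices_h = {v for edge in edges_h for v in edge}
    return (
        len(vertices_g) == len(vertices_h)          # same number of vertices
        and len(edges_g) == len(edges_h)            # same number of edges
        and degree_sequence(edges_g) == degree_sequence(edges_h)  # same degree counts
    )


# Hypothetical example graphs used only for illustration.
triangle_with_tail = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
relabelled_copy = [("w", "x"), ("x", "y"), ("y", "w"), ("w", "z")]
four_cycle = [("p", "q"), ("q", "r"), ("r", "s"), ("s", "p")]

same_shape = necessary_conditions_hold(triangle_with_tail, relabelled_copy)
different_shape = necessary_conditions_hold(triangle_with_tail, four_cycle)
```

Here the four-cycle has the same vertex and edge counts as the first graph but fails the degree requirement, so it cannot be isomorphic to it.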

Some useful contributions to the theory and concept of isomorphic graphs, and to their structures, developments and applications, are due to Unger, Corneil, Gotlieb, Berztiss, Schmidt and Druffel, Lueker and Booth, Miller, Zibin et al., Kleinberg and Kleinberg, Hallgren et al., etc. [1–10].

Suppose a finite population of size N contains a graphical relationship in the form of vertices (nodes) and edges. We say the population is $G^* = (V, E)$, where V is a set of nodes and E is a set of edges. Further, as a special case, assume that the finite population $G^*$ contains two isomorphic graphs $G = (V, E)$ and $G' = (V', E')$, each with N vertices. The average edge length of G is unknown, but this average for G′ is assumed to be available. This paper aims to estimate the mean edge length of G using the edge data of G′ as an auxiliary source of known information, with the help of a random sample of a few edges.

2. NODE–EDGE MATRICES (NEM)

Consider a matrix whose $(i, j)$th element $a_{ij}$ is defined as:

\[
a_{ij} =
\begin{cases}
1, & \text{if the } i\text{th node is linked with the } j\text{th edge;} \\
0, & \text{otherwise.}
\end{cases}
\]

The Main Node Edge Matrix (MNEM) for Figure 1.0 is in Table 2.1.

Fig. 1.1: Graph G′ (vertices V′1 to V′5, edges e′1 to e′6).




Table 2.1: MNEM.

Node    e1   e2   e3   e4   e5   e6   Count
V1      0    1    0    1    1    0    3
V2      0    0    0    0    1    1    2
V3      0    0    1    1    0    1    3
V4      1    1    1    0    0    0    3
V5      1    0    0    0    0    0    1
Total                                 12

Similarly, the Auxiliary Node Edge Matrix (ANEM) for Figure 1.1 is in Table 2.2.

Table 2.2 ANEM.

NOTE 2.1: Both matrices, MNEM and ANEM, are identical, which establishes the existence of an isomorphic relation between G and G′.
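A node–edge matrix of this kind can be built directly from a node-to-edge listing. The sketch below is illustrative only: it uses the incidence of graph G as listed in Table 7.1 of the numerical illustration, and the function name `node_edge_matrix` is a hypothetical helper, not something defined in the paper.

```python
def node_edge_matrix(node_edges, edge_labels):
    """Rows of 0/1 entries: a_ij = 1 if node i is linked with edge j, else 0."""
    return [
        [1 if edge in linked else 0 for edge in edge_labels]
        for linked in node_edges.values()
    ]


# Node-to-edge incidence of graph G as listed in Table 7.1.
graph_g = {
    "V1": ["e1", "e3", "e5"],
    "V2": ["e1", "e2"],
    "V3": ["e2", "e3", "e4"],
    "V4": ["e4", "e5", "e6"],
    "V5": ["e6", "e7", "e8"],
    "V6": ["e7"],
    "V7": ["e8"],
}
edge_labels = [f"e{j}" for j in range(1, 9)]

mnem = node_edge_matrix(graph_g, edge_labels)
row_counts = [sum(row) for row in mnem]             # edges linked to each node
column_counts = [sum(col) for col in zip(*mnem)]    # nodes linked to each edge
```

Every column count equals 2 because each edge joins exactly two nodes, so the total of the row counts is twice the number of distinct edges.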

2.1. Assumption

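Let edge $e_i$ in G, the graph of main interest, be correlated with edge $e'_i$ in G′. There are N edges in total in G, and likewise in G′, with mean lengths:

\[
\bar{e}^{(1)} = \frac{1}{N}\sum_{i=1}^{N} e_i \quad \text{for } e_i \in G; \qquad
\bar{e}^{(2)} = \frac{1}{N}\sum_{i=1}^{N} e'_i \quad \text{for } e'_i \in G'
\]

…(2.1)

The object of the estimation problem is to obtain the mean edge length $\bar{e}^{(1)}$ of G using known information on the auxiliary mean edge length $\bar{e}^{(2)}$ of G′.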
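As a quick illustration of (2.1), the two population means can be computed from the node-wise edge lengths of Table 7.1. This is a sketch under the assumption that every entry of that table counts as one of the N = 16 units, so each edge length appears once under each of its two end nodes; the variable and function names are illustrative.

```python
# Edge lengths listed node by node (V1..V7) in Table 7.1.
lengths_g = [10, 19, 12, 10, 14, 14, 19, 18, 18, 12, 21, 21, 23, 25, 23, 25]
lengths_g_prime = [13, 21, 19, 13, 16, 16, 21, 18, 18, 19, 22, 22, 26, 24, 26, 24]


def mean_edge_length(lengths):
    """Population mean edge length as in (2.1)."""
    return sum(lengths) / len(lengths)


mean_g = mean_edge_length(lengths_g)              # plays the role of the unknown mean
mean_g_prime = mean_edge_length(lengths_g_prime)  # plays the role of the known auxiliary mean
```

These two values agree with the population means reported in Table 7.5.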

3. NODE SAMPLING PROCEDURE

STEP I: Draw a random sample of n nodes from N (n < N) by SRSWOR in the matrix MNEM discussed in Table 2.1.

STEP II: For the ith sampled node of Step I, take only one element of the ith row of MNEM, namely the very first non-zero element from the left. This provides the edge-length value $e_{si}$ of the ith node.

STEP III: Corresponding to $e_{si}$ in MNEM, select the element $e'_{si}$ at the same $(i, j)$th position in ANEM.

STEP IV: Repeat Steps II and III for all n nodes (i = 1, 2, 3, …, n) appearing in the sample.

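On completion of the node sampling procedure, one obtains the sample edge values $(e_{si}, e'_{si})$ with means

\[
\bar{e}_s^{(1)} = \frac{1}{n}\sum_{i=1}^{n} e_{si} \quad \text{for } e_{si} \in G; \qquad
\bar{e}_s^{(2)} = \frac{1}{n}\sum_{i=1}^{n} e'_{si} \quad \text{for } e'_{si} \in G'
\]

…(3.1)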
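Steps I to IV can be sketched in a few lines of code. The example below is a minimal illustration, assuming the incidence and edge lengths of Table 7.1 and assuming that the ANEM has the same layout as the MNEM (as Note 2.1 states for isomorphic graphs); all names are hypothetical.

```python
import random

# Rows are nodes V1..V7, columns are edges e1..e8 (incidence as in Table 7.1).
MNEM = [
    [1, 0, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
]
LENGTHS_G = [10, 14, 19, 18, 12, 21, 23, 25]        # e1..e8 in G
LENGTHS_G_PRIME = [13, 16, 21, 18, 19, 22, 26, 24]  # e'1..e'8 in G'


def draw_nodes(n, rng):
    """Step I: SRSWOR of n node indices (0-based)."""
    return sorted(rng.sample(range(len(MNEM)), n))


def node_sample_means(sampled_nodes):
    """Steps II to IV: first non-zero entry per sampled row, then both sample means."""
    columns = [MNEM[i].index(1) for i in sampled_nodes]   # Step II
    e_s = [LENGTHS_G[j] for j in columns]                 # lengths from G
    e_s_prime = [LENGTHS_G_PRIME[j] for j in columns]     # Step III: same position in ANEM
    n = len(sampled_nodes)
    return sum(e_s) / n, sum(e_s_prime) / n               # means as in (3.1)


first_sample = node_sample_means([0, 1, 2, 3])   # nodes V1, V2, V3, V4
random_sample = node_sample_means(draw_nodes(4, random.Random(0)))
```

With nodes V1 to V4 the selected edges are e1, e1, e2, e4, which reproduces the first row of Table 7.4.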

4. A CLASS OF ESTIMATORS

Define,

\[
\bar{e}_s^{(2)*} = \frac{N\bar{e}^{(2)} - n\,\bar{e}_s^{(2)}}{N - n}; \qquad f = \frac{n}{N}
\]

…(4.1)
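The quantity in (4.1) is easy to evaluate once a sample mean is available. The sketch below assumes the values N = 16, n = 4 and the auxiliary population mean 19.875 from the numerical illustration, together with the auxiliary sample mean 15.000 of the first sample in Table 7.4; the function name is hypothetical. Under SRSWOR of the N units, this expression equals the mean of the N − n units left out of the sample.

```python
def complement_mean(pop_mean_aux, sample_mean_aux, N, n):
    """Evaluate (4.1): (N * population mean - n * sample mean) / (N - n)."""
    return (N * pop_mean_aux - n * sample_mean_aux) / (N - n)


N, n = 16, 4
f = n / N                                            # sampling fraction in (4.1)
e_star = complement_mean(19.875, 15.0, N, n)         # first sample of Table 7.4
```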

In order to estimate the unknown $\bar{e}^{(1)}$, the proposed class of estimators is

*)2()2(

(*))2(2)1(

2

ss

sss

b

e

eBfeCeA

eCeBfeCAeM …(4.2)

where A = (k – 1)(k – 2);

B = (k –1)(k – 4);

C = (k – 2)(k – 3)(k – 4); (0 < k < ∞).
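To see why the integer choices of k lead to simple special cases, it helps to tabulate A, B and C. The following sketch only evaluates the three polynomials defined above; the function name is illustrative.

```python
def factor_coefficients(k):
    """Return (A, B, C) for a given k, as defined after (4.2)."""
    A = (k - 1) * (k - 2)
    B = (k - 1) * (k - 4)
    C = (k - 2) * (k - 3) * (k - 4)
    return A, B, C


special_cases = {k: factor_coefficients(k) for k in (1, 2, 3, 4)}
```

At k = 1 only C is non-zero, at k = 2 only B, at k = 3 only A and B, and at k = 4 only A, so each of these choices removes some terms from the class.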

NOTE 4.1: The motivation for (4.2) is derived from Singh and Shukla [11, 12], Shukla et al. [13] and Shukla [14].

NOTE 4.2: The problem of estimating a population parameter in a finite population using auxiliary information is discussed in Sukhatme et al. [15], Singh and Choudhary [16], Cochran [17], etc.

NOTE 4.3: Expression (4.2) contains a parameter k (0 < k < ∞) whose different choices generate a series of mean-edge estimators. Therefore, $M_e^b$ can be regarded as a class of edge estimators.

4.1. Special Estimators

Atk = 1,

2

21

1

2 s

*

ss

b

e

e

eee M …(4.3)

Atk = 2,

*

s

ss

b

e

e

ee M

22

2 …(4.4)

Atk = 3,

*

s

ss

b

e

efe

efeeM

2

221

3 …(4.5)

At k = 4, $\left(M_e^b\right)_{k=4} = \bar{e}_s^{(1)}$ …(4.6)

5. SETTING APPROXIMATIONS FOR DERIVATION

For any two small numbers $r_1$ and $r_2$ ($|r_1| < 1$, $|r_2| < 1$), the approximation is

For graph G: $\bar{e}_s^{(1)} = \bar{e}^{(1)}(1 + r_1)$, …(5.1)

For graph G′: $\bar{e}_s^{(2)} = \bar{e}^{(2)}(1 + r_2)$

with assumptions



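(i) $E(r_1) = E(r_2) = 0$ ...(5.2)

(ii) $E(r_1^2) = \left(\bar{e}^{(1)}\right)^{-2} E\left[\bar{e}_s^{(1)} - \bar{e}^{(1)}\right]^2 = \left(\bar{e}^{(1)}\right)^{-2} V\left(\bar{e}_s^{(1)}\right) = (N - n)(Nn)^{-1}\left(C_e^{(1)}\right)^2$ ...(5.3)

(iii) $E(r_2^2) = \left(\bar{e}^{(2)}\right)^{-2} E\left[\bar{e}_s^{(2)} - \bar{e}^{(2)}\right]^2 = \left(\bar{e}^{(2)}\right)^{-2} V\left(\bar{e}_s^{(2)}\right) = (N - n)(Nn)^{-1}\left(C_e^{(2)}\right)^2$ ...(5.4)

(iv) $E(r_1 r_2) = \left(\bar{e}^{(1)}\bar{e}^{(2)}\right)^{-1} E\left[\left(\bar{e}_s^{(1)} - \bar{e}^{(1)}\right)\left(\bar{e}_s^{(2)} - \bar{e}^{(2)}\right)\right] = (N - n)(Nn)^{-1}\rho\, C_e^{(1)} C_e^{(2)}$ ...(5.5)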

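NOTE 5.1: The symbols $\left(S_e^{(1)}\right)^2$ and $\left(S_e^{(2)}\right)^2$ are the population mean squares of the N edges as per the MNEM and ANEM of G and G′ respectively. V denotes the variance and E the expectation of an estimator based on a sample of size n. Moreover, $\left(C_e^{(1)}\right)^2 = \left(S_e^{(1)}\right)^2 / \left(\bar{e}^{(1)}\right)^2$ and $\left(C_e^{(2)}\right)^2 = \left(S_e^{(2)}\right)^2 / \left(\bar{e}^{(2)}\right)^2$ are the squared coefficients of variation of the edges, and $\rho$ denotes the correlation coefficient between the N pairs of edges in G and G′ present in the MNEM and ANEM.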
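The last equality in (5.3) rests on the standard SRSWOR result that the variance of a sample mean is (N − n)/(Nn) times the population mean square. The sketch below checks that identity by enumerating every sample of size 4 from the 16 node-wise edge lengths of Table 7.1. It assumes those 16 entries are the sampling units drawn by plain SRSWOR, which is a simplification and not the node sampling procedure of Section 3; the variable names are illustrative.

```python
from itertools import combinations
from statistics import mean, pvariance, variance

# Node-wise edge lengths of graph G from Table 7.1 (N = 16 units).
population = [10, 19, 12, 10, 14, 14, 19, 18, 18, 12, 21, 21, 23, 25, 23, 25]
N, n = len(population), 4

# Every SRSWOR sample is equally likely, so the variance of the sample mean
# is the population variance of the list of all possible sample means.
sample_means = [mean(sample) for sample in combinations(population, n)]
exact_variance = pvariance(sample_means)

# Right-hand side: (N - n) / (N n) * S^2, with S^2 using the divisor N - 1.
formula_variance = (N - n) / (N * n) * variance(population)
```

Dividing either value by the squared population mean gives the right-hand side of (5.3) for this data set.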

REMARK 5.1: Expression (4.1) can be expressed using (5.1) as

\[
\bar{e}_s^{(2)*} = \bar{e}^{(2)}\left(1 - \delta r_2\right); \qquad \delta = n(N - n)^{-1}
\]
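The step behind Remark 5.1 is a one-line substitution. The derivation below is a sketch that writes $\delta$ for $n(N - n)^{-1}$ (the symbol is an assumption made here for readability) and substitutes the form of $\bar{e}_s^{(2)}$ from (5.1) into (4.1).

```latex
\begin{aligned}
\bar{e}_s^{(2)*}
  &= \frac{N\bar{e}^{(2)} - n\,\bar{e}^{(2)}(1 + r_2)}{N - n} \\
  &= \frac{(N - n)\,\bar{e}^{(2)} - n\,\bar{e}^{(2)} r_2}{N - n} \\
  &= \bar{e}^{(2)}\left(1 - \frac{n}{N - n}\, r_2\right)
   = \bar{e}^{(2)}\left(1 - \delta r_2\right).
\end{aligned}
```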

THEOREM 5.1: Using (5.1) and Remark 5.1, the class of estimators (4.2) is in the approximate form

21 2 2b

e 2

1 1 2 1 2

1 M

r re

r r r r r

where

CfB;

fBC2; fBCA 2

PROOF: Using (5.1), Remark 5.1, one can express b

eM of (4.2)




2

2

2

12

2

2

2

22

1

1

112

111

r e B fre C e A

r e Cre B fe CA reM b

e

2

21

1

2

1

rBfC

rCBfre

1

221

1

1 1 1

rrre

3

2

2

2221

1

) () ( 1 1 1 rrrrre

Assume terms jr2 very small when 2j , and ignore in expansion 1

2 1

r [15, 17], we get

2

2221

1

111 )r(r r reM b

e

2

21211

2

22

1

1 rrrrrrre

Hence the theorem.

THEOREM 5.2: Bias of the proposed class is

21221

eee

b

e CCC Nn

nNe MB

PROOF: Let B(.) denote the bias; then, using Theorem 5.1,

2

21211

2

22

1

1 rrrr rr re E ME b

e

)E()E()()( E) E(1 2

21211

2

22

1

rrrrrErre

Using (5.2) to (5.5) and substituting

$E\left[r_1^i\, r_2^j\right] = 0$ when $i + j \ge 3$; $i, j = 0, 1, 2, 3, \ldots$ [17] ...(5.6)

CCNn

nNC

Nn

nNe ME eee

b

e

21221

1

212211

eee CCCNn

nNee

Therefore, bias is

1b

eM b

eB E M e

21221

eee

b

e CCC Nn

nNe MB

Hence the theorem.




REMARK 5.2: The class of estimators b

eM is almost unbiased if

' =

Let V 2

1

e

e

C

C

PROOF: Substituting 0b

eMB , we get

02122 eee CCC and hence the result.

REMARK 5.3: The Remark 5.2 provides an equation

2 1 0 V A f B V C V ...(5.7)

which is a necessary condition for obtaining an unbiased estimator in the class (4.2) to the first order of approximation (that is, almost unbiased) by a suitable choice of k. Equation (5.7) is a cubic equation in k, which may give at most three values of k for which the bias is zero. One can choose the best k as the one with the lowest m.s.e. All three k-values generate a sub-class of almost unbiased edge estimators.

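THEOREM 5.3: The mean squared error of the class $M_e^b$ is

\[
M\left(M_e^b\right) = MSE\left(M_e^b\right) = \left(\bar{e}^{(1)}\right)^2 \frac{N - n}{Nn}\left[\left(C_e^{(1)}\right)^2 + P^2\left(C_e^{(2)}\right)^2 + 2P\rho\, C_e^{(1)} C_e^{(2)}\right]
\]

…(5.8)

where P is the coefficient of $r_2$ in the first-order expansion of $M_e^b$ given in Theorem 5.1.

PROOF: $M\left(M_e^b\right) = MSE\left(M_e^b\right) = E\left[M_e^b - \bar{e}^{(1)}\right]^2$. On replacing $M_e^b$ by Theorem 5.1 and using equation (5.6), we get

\[
M\left(M_e^b\right) = \left(\bar{e}^{(1)}\right)^2 E\left[r_1 + P r_2\right]^2 = \left(\bar{e}^{(1)}\right)^2\left[E\left(r_1^2\right) + P^2 E\left(r_2^2\right) + 2P\,E\left(r_1 r_2\right)\right]
\]

With equations (5.2) to (5.5), the result is

\[
M\left(M_e^b\right) = \left(\bar{e}^{(1)}\right)^2 \frac{N - n}{Nn}\left[\left(C_e^{(1)}\right)^2 + P^2\left(C_e^{(2)}\right)^2 + 2P\rho\, C_e^{(1)} C_e^{(2)}\right]
\]

Hence the theorem.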
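The mean squared error can be evaluated numerically once P is known. The sketch below assumes the first-order form of Theorem 5.3 to be the squared mean of G times (N − n)/(Nn) times the bracket C₁² + P²C₂² + 2PρC₁C₂, where C₁ and C₂ stand for the coefficients of variation of G and G′; the function name and argument names are hypothetical. The example call uses the values printed in Table 7.5 with P = 0, where the correlation term drops out.

```python
def mse_first_order(P, mean_g, cv_g, cv_g_prime, rho, N, n):
    """First-order MSE of the class for a given P (assumed form of Theorem 5.3)."""
    lam = (N - n) / (N * n)
    bracket = cv_g**2 + P**2 * cv_g_prime**2 + 2 * P * rho * cv_g * cv_g_prime
    return mean_g**2 * lam * bracket


# Values as printed in Table 7.5; rho has no effect when P = 0.
mse_at_p_zero = mse_first_order(
    P=0.0, mean_g=17.75, cv_g=0.29056, cv_g_prime=0.20694, rho=0.98726, N=16, n=4
)
```

Because the expression is quadratic in P with a positive leading coefficient, it has a single minimum in P.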

REMARK 5.4: Some special derivation to bias and m.s.e. are

21221

21 eee1

b

e CCC Nn

nNe MB ,k At

212222121

1222 eeee

b

e CC CCNn

nNeMM




21221

212 eee

b

e CCC Nn

nNe MB ,k

212222121

2121 eeee

b

e CC C CNn

nNeMM

21221

11

13 eee3

b

e CCCf

f

f

f

Nn

nNe MB ,k

2122

2

2121

31

12

1

1eeee

b

e CC f

fC

f

fC

Nn

nNeMM

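At k = 4, $B\left[\left(M_e^b\right)_{k=4}\right] = 0$

$M\left[\left(M_e^b\right)_{k=4}\right] = \left(\bar{e}^{(1)}\right)^2 \dfrac{N - n}{Nn}\left(C_e^{(1)}\right)^2$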

6. OPTIMUM CHOICE OF k

Expression (5.8) for the mean squared error of the class depends on P, which is a function of k. One can therefore choose P such that the mean squared error is minimum.

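THEOREM 6.1: The minimum mean squared error occurs when

\[
P = -V
\]

PROOF: On differentiating the expression for $MSE\left(M_e^b\right)$ in Theorem 5.3 with respect to P and equating to zero,

\[
\frac{d}{dP} M\left(M_e^b\right) = 0
\;\Rightarrow\; P\left(C_e^{(2)}\right)^2 + \rho\, C_e^{(1)} C_e^{(2)} = 0
\;\Rightarrow\; P = -\rho\,\frac{C_e^{(1)}}{C_e^{(2)}} = -V
\]

...(6.1)

Hence the theorem.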
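It may help to see what the optimum condition implies for the error itself. The derivation below is a sketch that assumes the first-order form of Theorem 5.3, with bracket $\left(C_e^{(1)}\right)^2 + P^2\left(C_e^{(2)}\right)^2 + 2P\rho\, C_e^{(1)} C_e^{(2)}$, and takes $V = \rho\, C_e^{(1)} / C_e^{(2)}$; under those assumptions, substituting $P = -V$ gives the familiar regression-type minimum.

```latex
\begin{aligned}
P^2\left(C_e^{(2)}\right)^2 &= \rho^2\left(C_e^{(1)}\right)^2, \qquad
2P\rho\, C_e^{(1)} C_e^{(2)} = -2\rho^2\left(C_e^{(1)}\right)^2, \\
\left[M\left(M_e^b\right)\right]_{\min}
  &= \left(\bar{e}^{(1)}\right)^2 \frac{N - n}{Nn}
     \left[\left(C_e^{(1)}\right)^2 + \rho^2\left(C_e^{(1)}\right)^2 - 2\rho^2\left(C_e^{(1)}\right)^2\right] \\
  &= \left(\bar{e}^{(1)}\right)^2 \frac{N - n}{Nn}\left(C_e^{(1)}\right)^2\left(1 - \rho^2\right).
\end{aligned}
```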

REMARK 6.1: The optimality criterion (6.1) provides the equation

0221 V CV)( BfVA ...(6.2)

which is cubic in the parameter k. For known values of f and V, one can obtain at most three values of k for which the m.s.e. is optimum (minimum).

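REMARK 6.2: Let $k'_1$, $k'_2$ and $k'_3$ be the three values for which $MSE\left(M_e^b\right)$ is optimum, obtained from equation (6.2). The best choice among these is the one with the lowest bias,

\[
k'_{opt} = \operatorname{Min}\left[B\left(M_e^b\right)_{k'_1},\; B\left(M_e^b\right)_{k'_2},\; B\left(M_e^b\right)_{k'_3}\right]
\]

...(6.3)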

7. NUMERICAL ILLUSTRATION

Consider a graphical population of two isomorphic graphs G and G′ with the structure and relationship shown in Figures 7.1 and 7.2. The lengths of the edges $e_i$ and $e'_i$ of G and G′ respectively are shown in Table 7.1. The total number of edges is N = 16, counting each of the eight edges e1 to e8 once under each of its two end nodes in Table 7.1; the total number of vertices is M = 7 and the sample size is n = 4.

Fig. 7.1: [Graph G].

Fig. 7.2: [Graph G′].




Table 7.1: Node–Edge Length Table.

For Graph G (Main Interest)              For Graph G′ (Auxiliary Source)
Node   Edge          Length              Node   Edge             Length
V1     e1, e3, e5    10, 19, 12          V′1    e′1, e′3, e′5    13, 21, 19
V2     e1, e2        10, 14              V′2    e′1, e′2         13, 16
V3     e2, e3, e4    14, 19, 18          V′3    e′2, e′3, e′4    16, 21, 18
V4     e4, e5, e6    18, 12, 21          V′4    e′4, e′5, e′6    18, 19, 22
V5     e6, e7, e8    21, 23, 25          V′5    e′6, e′7, e′8    22, 26, 24
V6     e7            23                  V′6    e′7              26
V7     e8            25                  V′7    e′8              24

Table 7.2: MNEM (for Fig. 7.1)

Table 7.3: ANEM (For Fig. 7.2)




Table 7.4: Sample Edge Description (for samples of size n = 4 drawn as per the node sampling procedure).

S. No.  Sample vertices   Graph G: sample edges (lengths)    $\bar{e}_s^{(1)}$   Graph G′: sample edges (lengths)       $\bar{e}_s^{(2)}$
1.      V1, V2, V3, V4    e1, e1, e2, e4 (10, 10, 14, 18)    13.000              e′1, e′1, e′2, e′4 (13, 13, 16, 18)    15.000
2.      V1, V2, V3, V5    e1, e1, e2, e6 (10, 10, 14, 21)    13.750              e′1, e′1, e′2, e′6 (13, 13, 16, 22)    16.000
3.      V1, V2, V3, V6    e1, e1, e2, e7 (10, 10, 14, 23)    14.250              e′1, e′1, e′2, e′7 (13, 13, 16, 26)    17.000
4.      V1, V2, V3, V7    e1, e1, e2, e8 (10, 10, 14, 25)    14.750              e′1, e′1, e′2, e′8 (13, 13, 16, 24)    16.500
5.      V2, V3, V4, V5    e1, e2, e4, e6 (10, 14, 18, 21)    15.750              e′1, e′2, e′4, e′6 (13, 16, 18, 22)    17.250
6.      V2, V3, V5, V6    e1, e2, e6, e7 (10, 14, 21, 23)    17.000              e′1, e′2, e′6, e′7 (13, 16, 22, 26)    19.250
7.      V3, V4, V5, V6    e2, e4, e6, e7 (14, 18, 21, 23)    19.000              e′2, e′4, e′6, e′7 (16, 18, 22, 26)    20.500
8.      V4, V5, V6, V7    e4, e6, e7, e8 (18, 21, 23, 25)    21.750              e′4, e′6, e′7, e′8 (18, 22, 26, 24)    22.500

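Table 7.5: Population Parameters of Fig. 7.1 and 7.2.

S. No.  Parameter                        Value      S. No.  Parameter                        Value
1       M                                7          7       $\left(C_e^{(1)}\right)^2$       0.08443
2       n                                4          8       $\left(C_e^{(2)}\right)^2$       0.04283
3       $\bar{e}^{(1)}$                  17.7500    9       $C_e^{(1)}$                      0.29056
4       $\bar{e}^{(2)}$                  19.8750    10      $C_e^{(2)}$                      0.20694
5       $\left(S_e^{(1)}\right)^2$       26.6000    11      $\rho$                           0.98726
6       $\left(S_e^{(2)}\right)^2$       16.9167    12      V                                1.25992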
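The population parameters can be recomputed from Table 7.1 as a cross-check. The sketch below assumes that the N = 16 node-wise entries are the units, that mean squares use the divisor N − 1, and that V is the ratio ρC₁/C₂ of Theorem 6.1, with C₁ and C₂ the coefficients of variation of G and G′; all variable names are illustrative. Under these assumptions the means, mean squares and coefficients of variation match Table 7.5, and the correlation comes to about 0.897, which is the value consistent with V = 1.25992 in that table.

```python
from math import sqrt
from statistics import mean, variance

# Paired node-wise edge lengths from Table 7.1 (same node order in both lists).
lengths_g = [10, 19, 12, 10, 14, 14, 19, 18, 18, 12, 21, 21, 23, 25, 23, 25]
lengths_g_prime = [13, 21, 19, 13, 16, 16, 21, 18, 18, 19, 22, 22, 26, 24, 26, 24]
N = len(lengths_g)

mean_g, mean_gp = mean(lengths_g), mean(lengths_g_prime)
s2_g, s2_gp = variance(lengths_g), variance(lengths_g_prime)   # divisor N - 1
cv_g, cv_gp = sqrt(s2_g) / mean_g, sqrt(s2_gp) / mean_gp        # coefficients of variation

# Sample-style covariance with the same divisor, then the correlation coefficient.
cov = sum((x - mean_g) * (y - mean_gp)
          for x, y in zip(lengths_g, lengths_g_prime)) / (N - 1)
rho = cov / sqrt(s2_g * s2_gp)

v_ratio = rho * cv_g / cv_gp   # assumed definition of V
```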

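Table 7.6: MSE when Estimator is Almost Unbiased.

S. No.  Value of k             Bias at $k_i$                                      MSE at $k_i$
1.      $k_1$ = 2.394215       $\left[B(M_e^b)\right]_{k_1}$ = -0.00164           $\left[MSE(M_e^b)\right]_{k_1}$ = 1.339895
2.      $k_2$ = ---            $\left[B(M_e^b)\right]_{k_2}$ = ---                $\left[MSE(M_e^b)\right]_{k_2}$ = ---
3.      $k_3$ = ---            $\left[B(M_e^b)\right]_{k_3}$ = ---                $\left[MSE(M_e^b)\right]_{k_3}$ = ---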

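Table 7.7: Optimum MSE.

S. No.  Value of k′            Bias at $k'_i$                                     MSE at $k'_i$
1.      $k'_1$ = 2.661921      $\left[B(M_e^b)\right]_{k'_1}$ = -0.09761          $\left[MSE(M_e^b)\right]_{k'_1}$ = 1.054235
2.      $k'_2$ = ---           $\left[B(M_e^b)\right]_{k'_2}$ = ---               $\left[MSE(M_e^b)\right]_{k'_2}$ = ---
3.      $k'_3$ = ---           $\left[B(M_e^b)\right]_{k'_3}$ = ---               $\left[MSE(M_e^b)\right]_{k'_3}$ = ---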




Table 7.8: Calculation of $MSE\left(M_e^b\right)$ over Varying k.

k                                           Bias($M_e^b$)   MSE($M_e^b$)   V($M_e^b$)
k = 1                                       0.7515          3.8874         3.3226
k = 2                                       0.3027          17.9833        17.8916
k = 3                                       0.0504          6.8670         6.8644
k = $k_1$ = 2.3942 (unbiased)               -0.0016         1.3399         1.3348
k = $k_{opt}$ = $k'_1$ = 2.6619 (opt. MSE)  -0.0976         1.0542         1.0447

where the variance is computed as V(.) = MSE(.) − [Bias(.)]².
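The variance column of Table 7.8 follows from the other two columns by the relation just stated. The sketch below recomputes it for the rows k = 1, 2, 3 and k = k′₁, using the bias and MSE values as they appear in the table; the dictionary layout and names are illustrative, and agreement is only expected up to the four-decimal rounding of the table.

```python
# (bias, MSE, tabulated variance) for selected rows of Table 7.8.
rows = {
    "k = 1": (0.7515, 3.8874, 3.3226),
    "k = 2": (0.3027, 17.9833, 17.8916),
    "k = 3": (0.0504, 6.8670, 6.8644),
    "k = k'1 = 2.6619": (-0.0976, 1.0542, 1.0447),
}


def variance_from(bias, mse):
    """Variance as MSE minus squared bias."""
    return mse - bias**2


recomputed = {label: variance_from(bias, mse) for label, (bias, mse, _) in rows.items()}
largest_gap = max(abs(recomputed[label] - rows[label][2]) for label in rows)
```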

Table 7.9: Sample Based Estimates of Edge in G (Related To Table 7.4).

Note that the confidence interval is computed using the formula

\[
\left[\, M_e^b - 3\sqrt{V\left(M_e^b\right)},\;\; M_e^b + 3\sqrt{V\left(M_e^b\right)} \,\right]
\]

for the different values k = 1, 2, 3, $k_1$ and $k'_1$.
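The interval is simple to compute for any single estimate. In the sketch below the variance 1.0447 is the k = k′₁ entry of Table 7.8, while the estimate 17.0 is a hypothetical value chosen only for illustration; the function name is also an assumption. Under approximate normality, three standard deviations on either side cover a little more than 99% of the distribution.

```python
from math import sqrt


def three_sigma_interval(estimate, variance):
    """Interval estimate +/- 3 * sqrt(variance)."""
    half_width = 3 * sqrt(variance)
    return estimate - half_width, estimate + half_width


# Hypothetical estimate 17.0 with the tabulated variance at k = k'1.
low, high = three_sigma_interval(17.0, 1.0447)
covers_true_mean = low <= 17.75 <= high   # 17.75 is the true mean edge length of G
```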

8. DISCUSSION

For the problem undertaken, the sample is drawn by the node sampling procedure, which is used to estimate the mean edge length. Tables 7.2 and 7.3 show the MNEM and ANEM, each having 16 edges of different lengths. Figures 7.1 and 7.2 are two isomorphic graphs having a similar mapping of edges.

Table 7.4 shows the sample means over 8 different samples. The proposed class of estimators is almost unbiased at k = 2.3942. The other two roots of Eq. (5.7) are complex, so only one real root exists (Table 7.6). For Eq. (6.2) also, only one real root exists for this data set of isomorphic graphs. The value k′ = 2.6619 provides the minimum mean squared error. Moreover, the range 2.3 ≤ k ≤ 2.7 contains a sub-class of almost unbiased, minimum m.s.e. estimators. The value k = 1 is also appreciable (Table 7.8). All estimates based on the 8 random samples, as in Table 7.9, are close to the true value $\bar{e}^{(1)}$ = 17.75, and when k = $k_{opt}$ all sample estimates are very close to $\bar{e}^{(1)}$. The true value lies in the 99% confidence interval for all eight sample estimates, showing the reliability of the proposed strategy.

9. CONCLUSIONS

A population having a graphical relationship in the form of isomorphic graphs is taken into consideration. Estimation of the mean edge length of one graph is made possible through a sampling technique using its isomorphic counterpart. A node sampling procedure is derived and found useful for solving the edge estimation problem. A class of estimators is proposed which has several interesting cases as its members. The condition for obtaining the optimum member of the class is derived along with the bias component. The minimum mean squared error may occur at different values of the constant k; the best of them is the one with the lowest bias. Moreover, the condition for an unbiased estimator of the class is also derived. One can obtain the almost unbiased estimator of the class having the lowest mean squared error. The node sampling procedure is developed for an isomorphic graphical population and found effective in providing an average edge length. The best and most efficient sub-class of $M_e^b$ is obtained when 2.3 ≤ k ≤ 2.7. Sample estimates of the mean edge length for k′ = 2.6619 are very close to the true value $\bar{e}^{(1)}$. This shows that the proposed class of estimators is worthwhile and useful for edge length estimation.

REFERENCES

1. Deo N. Graph Theory with Application to Engineering and Computer Science. 2001. Prentice-Hall. Eastern Economy Edition. New Delhi, India.

2. Unger S. H. GIT-a Heuristic Program for Testing Pairs of Directed Line Graphs for Isomorphism. Communications of the ACM. 1964. 7. 1–11p.

3. Corneil D. G., Gotlieb C. C. Journal of the ACM (JACM). 1970. 17. 1–9p.

4. Berztiss A. T. Journal of the ACM (JACM). 1973. 20. 3–10p.

5. Schmidt D. C., Druffel L. E. Journal of the ACM (JACM). 1976. 23. 3. 435–445p.

6. Lueker G. S., Booth K. S. Journal of the ACM (JACM). 1979. 26. 2–10p.

7. Miller G. L. Proceedings of the 12th Annual ACM Symposium on Theory of Computing STOC '80. 1980. 225–235p.

8. Zibin Y., Gil J. Y., Considine J. Efficient Algorithms for Isomorphisms of Simple Types, ACM SIGPLAN Notices, Proceedings of the 30th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL). 2003. 203–211p.

9. Kleinberg R. D., Kleinberg J. M. Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms SODA '05. 2005. 270–286p.

10. Hallgren S., Moore C., Rotteler M. et al. Proceedings of the 38th Annual ACM Symposium on Theory of Computing STOC '06. 2006. 604–617p.

11. Singh V. K., Shukla D. Metron. 1987. 45. 1–2. 30-VI. 273–283p.

12. Singh V. K., Shukla D. Metron. 1993. 51. 1–2. 30-VI. 139–159p.

13. Shukla D., Singh V. K., Singh G. N. Metron. 1991. 49. 1–4. 31-XII. 349–361p.

14. Shukla D. Metron. 2002. 49. 97–106p.

15. Sukhatme P. V., Sukhatme B. V., Sukhatme S., et al. Sampling Theory of Surveys with Applications. 1984. Iowa State University Press. I.S.A.S. Publication. New Delhi.

16. Singh D., Choudhary F. S. Theory and Analysis of Sample Survey Design. 1986. Wiley Eastern Limited. New Delhi. India.

17. Cochran W. G. Sampling Techniques. 2005. John Wiley and Sons. 5th Edn. New York.
