Research & Reviews: A Journal of Statistics
Volume 2, Issue 3, December 2012, Pages 17-30
__________________________________________________________________________________________
ISSN: 2278 – 2273© STM Journals 2012. All Rights Reserved
Page 17
Edge Estimation in Isomorphic Graph Population using Node-Sampling
D. Shukla¹, Narendra Singh Thakur²*, Yashwant Singh Rajput¹
¹Department of Mathematics and Statistics, Dr. H.S. Gour Central University, Sagar (M.P.), India
²Department of Mathematics and Statistics, Banasthali University, Rajasthan, India
*Author for Correspondence: E-mail [email protected]
ABSTRACT
Consider a population whose units are related graphically through vertices and edges. Two graphs are said to be isomorphic if they have a similar mapping of edges. Each edge has a length value in both graphs. This paper combines graphical structure with sampling by applying a sampling design to the graphical population. It is assumed that one graph is of main interest and the other, its isomorphic counterpart, is an auxiliary source of information. Estimation of the mean edge length of the former graph is considered using prior knowledge of the edge lengths of its isomorphic counterpart. An estimation strategy is proposed in the form of a class of estimators based on a random sample of vertices, and its properties are examined. The sample of edges is drawn through a designed node-sampling procedure. Optimality criteria are derived, and the mathematical results are supported numerically using 99% confidence intervals. All sample-based estimates of edge length are found to be close to the true value.
Keywords: Graph, Isomorphic graph, Edge, Vertices (nodes), Simple random sampling without replacement (SRSWOR), Estimator, Bias, Mean Squared Error (MSE), Optimum choice, Confidence interval
1. INTRODUCTION
Two figures are regarded as equivalent if they behave identically in terms of geometric proportions. According to the concept of isomorphism, two graphs $G$ and $G'$ are said to be isomorphic if there is a one-to-one correspondence between their vertices and edges such that the incidence relationship is preserved. As an example, consider Figure 1.0 and Figure 1.1 [1], where graph $G$ has vertices $V_1, V_2, V_3, V_4, V_5$ and edges $e_1, e_2, e_3, e_4, e_5, e_6$; graph $G'$ has vertices $V_1', V_2', V_3', V_4', V_5'$ and edges $e_1', e_2', e_3', e_4', e_5', e_6'$.
Fig. 1.0: Graph $G$ (vertices $V_1$ to $V_5$, edges $e_1$ to $e_6$)
Both $G$ and $G'$ are drawn differently, but the edges $e_i$ have the same relations with their vertices as the edges $e_i'$ have. In general, $e_i$ has an identical mapping between the vertices $V_i, V_j$ $(i \neq j = 1, 2, 3, \ldots)$ as $e_i'$ has between $V_i', V_j'$ $(i \neq j = 1, 2, 3, \ldots)$. In all, the isomorphic properties of $G$ and $G'$ require the following [1]:
1. The same number of vertices in both;
2. The same number of edges in both;
3. An equal number of vertices with a given
degree.
Useful contributions to the theory of isomorphic graphs, and to their structure, development and applications, are due to Unger, Corneil and Gotlieb, Berztiss, Schmidt and Druffel, Lueker and Booth, Miller, Zibin et al., Kleinberg and Kleinberg, Hallgren et al., etc. [1–10].
Suppose a finite population of size $N$ contains a graphical relationship in the form of vertices (nodes) and edges. We say the population is $G^* = (V, E)$, where $V$ is a set of nodes and $E$ is a set of edges. Further, as a special case, assume that the finite population $G^*$ contains two isomorphic graphs $G = (V, E)$ and $G' = (V', E')$, each with $N$ vertices. The average edge length of $G$ is unknown, but this average for $G'$ is assumed to be available. This paper aims to estimate the mean edge length of $G$, using the edge data of $G'$ as an auxiliary source of known information, with the help of a random sample of a few edges.
2. NODE–EDGE MATRICES (NEM)
Consider a matrix whose $(i, j)$th element $a_{ij}$ is defined as:
$$a_{ij} = \begin{cases} 1, & \text{if the } i\text{th node is linked with the } j\text{th edge} \\ 0, & \text{otherwise} \end{cases}$$
The Main Node Edge Matrix (MNEM) for
Figure 1.0 is in Table 2.1.
Fig. 1.1: Graph $G'$ (vertices $V_1'$ to $V_5'$, edges $e_1'$ to $e_6'$)
Table 2.1: MNEM.
Node    e1  e2  e3  e4  e5  e6  Count
V1      0   1   0   1   1   0   3
V2      0   0   0   0   1   1   2
V3      0   0   1   1   0   1   3
V4      1   1   1   0   0   0   3
V5      1   0   0   0   0   0   1
Total                           12
Similarly, the Auxiliary Node Edge Matrix (ANEM) for Figure 1.1 is in Table 2.2.
Table 2.2 ANEM.
NOTE 2.1: The matrices MNEM and ANEM are identical, which establishes the isomorphic relation between $G$ and $G'$.
2.1. Assumption
Let the edge $e_i$ in $G$, the graph of main interest, be correlated with the edge $e_i'$ in $G'$. There are $N$ edges in total in $G$ and also in $G'$, with mean lengths:
$$\bar{e}^{(1)} = \frac{1}{N}\sum_{i=1}^{N} e_i \ \ (\text{for } G), \qquad \bar{e}^{(2)} = \frac{1}{N}\sum_{i=1}^{N} e_i' \ \ (\text{for } G') \tag{2.1}$$
The object of the estimation problem is to obtain the mean edge length $\bar{e}^{(1)}$ of $G$ using the known auxiliary mean edge length $\bar{e}^{(2)}$ of $G'$.
3. NODE SAMPLING PROCEDURE
STEP I: Draw a random sample of $n$ nodes from $N$ $(n < N)$ by SRSWOR in the matrix MNEM discussed in Table 2.1.
STEP II: For the $i$th sampled node of Step I, take only one element of the $i$th row of MNEM: the first non-zero element from the left. This provides the edge-length value $e_{si}$ of the $i$th node.
STEP III: Corresponding to $e_{si}$ in MNEM, select the element $e_{si}'$ at the same $(i, j)$th position in ANEM.
STEP IV: Continue Steps II and III for all $n$ nodes $(i = 1, 2, 3, \ldots, n)$ appearing in the sample.
On completion of the node-sampling procedure, one obtains the sample edge values $(e_{si}, e_{si}')$ with means
$$\bar{e}_s^{(1)} = \frac{1}{n}\sum_{i=1}^{n} e_{si} \ \ (\text{for } G), \qquad \bar{e}_s^{(2)} = \frac{1}{n}\sum_{i=1}^{n} e_{si}' \ \ (\text{for } G') \tag{3.1}$$
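Steps I to IV can be sketched in a few lines of Python. This is only an illustrative sketch: the node-edge lists and lengths are transcribed from Table 7.1 of the numerical illustration, and the names `NODE_EDGES`, `draw_nodes` and `sample_means` are hypothetical helpers, not part of the paper.

```python
import random

# Node -> list of (edge label, length in G, length in G'), in MNEM column order.
# Values transcribed from Table 7.1; the data structure itself is an assumption.
NODE_EDGES = {
    "V1": [("e1", 10, 13), ("e3", 19, 21), ("e5", 12, 19)],
    "V2": [("e1", 10, 13), ("e2", 14, 16)],
    "V3": [("e2", 14, 16), ("e3", 19, 21), ("e4", 18, 18)],
    "V4": [("e4", 18, 18), ("e5", 12, 19), ("e6", 21, 22)],
    "V5": [("e6", 21, 22), ("e7", 23, 26), ("e8", 25, 24)],
    "V6": [("e7", 23, 26)],
    "V7": [("e8", 25, 24)],
}


def draw_nodes(n, seed=None):
    """Step I: simple random sample of n nodes without replacement."""
    rng = random.Random(seed)
    return rng.sample(sorted(NODE_EDGES), n)


def sample_means(nodes):
    """Steps II-IV: keep the first (leftmost) edge of each sampled node,
    read the matching edge of G', and return the two sample means of (3.1)."""
    first_edges = [NODE_EDGES[v][0] for v in nodes]
    n = len(first_edges)
    mean_g = sum(edge[1] for edge in first_edges) / n
    mean_g_prime = sum(edge[2] for edge in first_edges) / n
    return mean_g, mean_g_prime


# Sample 1 of Table 7.4: nodes V1..V4 give edges e1, e1, e2, e4.
print(sample_means(["V1", "V2", "V3", "V4"]))  # (13.0, 15.0)
```

Because only the leftmost edge of each node is used, two sampled nodes can contribute the same edge (here $e_1$ appears for both $V_1$ and $V_2$), which is exactly what the first row of Table 7.4 shows.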
4. A CLASS OF ESTIMATORS
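Define,
$$\bar{e}_s^{*} = \frac{N\bar{e}^{(2)} - n\bar{e}_s^{(2)}}{N - n}; \quad \text{and} \quad f = n/N \tag{4.1}$$
In order to estimate the unknown $\bar{e}^{(1)}$, the proposed class of estimators is
$$M_e^b = \bar{e}_s^{(1)} \left[ \frac{(A + C)\,\bar{e}^{(2)} + fB\,\bar{e}_s^{(2)} + C\,\bar{e}_s^{*}}{A\,\bar{e}^{(2)} + fB\,\bar{e}_s^{*} + 2C\,\bar{e}_s^{(2)}} \right] \tag{4.2}$$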
where A = (k – 1)(k – 2);
B = (k –1)(k – 4);
C = (k – 2)(k – 3)(k – 4); (0 < k < ∞).
NOTE 4.1: The motivation for (4.2) is derived
from Singh and Shukla [11, 12], Shukla et al.
[13] and Shukla [14].
NOTE 4.2: The estimation problem of
population parameter in a finite population
using auxiliary information is discussed in
Sukhatme et al. [15], Singh and Choudhary
[16] and Cochran [17] etc.
NOTE 4.3: Equation (4.2) contains a parameter $k$ $(0 < k < \infty)$ whose different choices generate a series of mean-edge estimators. Therefore, $M_e^b$ can be regarded as a class of edge estimators.
4.1. Special Estimators
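At $k = 1$,
$$\left[M_e^b\right]_{k=1} = \bar{e}_s^{(1)} \left[ \frac{\bar{e}^{(2)} + \bar{e}_s^{*}}{2\,\bar{e}_s^{(2)}} \right] \tag{4.3}$$
At $k = 2$,
$$\left[M_e^b\right]_{k=2} = \bar{e}_s^{(1)} \left[ \frac{\bar{e}_s^{(2)}}{\bar{e}_s^{*}} \right] \tag{4.4}$$
At $k = 3$,
$$\left[M_e^b\right]_{k=3} = \bar{e}_s^{(1)} \left[ \frac{\bar{e}^{(2)} - f\,\bar{e}_s^{(2)}}{\bar{e}^{(2)} - f\,\bar{e}_s^{*}} \right] \tag{4.5}$$
At $k = 4$,
$$\left[M_e^b\right]_{k=4} = \bar{e}_s^{(1)} \tag{4.6}$$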
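As an illustration, the class can be evaluated numerically for any $k$. The sketch below assumes the form $M_e^b = \bar{e}_s^{(1)}\,[(A+C)\bar{e}^{(2)} + fB\bar{e}_s^{(2)} + C\bar{e}_s^{*}] / [A\bar{e}^{(2)} + fB\bar{e}_s^{*} + 2C\bar{e}_s^{(2)}]$ for (4.2), with $\bar{e}_s^{*}$ from (4.1); the function name `edge_estimator` and its argument names are hypothetical.

```python
def edge_estimator(k, es1, es2, e2_bar, N, n):
    """Evaluate the class of estimators for a given k.

    es1, es2 : sample means of edge lengths in G and G'
    e2_bar   : known population mean edge length of G'
    N, n     : population and sample sizes
    The algebraic form used here is the one stated in the lead-in
    (an assumption about how (4.2) reads).
    """
    f = n / N
    A = (k - 1) * (k - 2)
    B = (k - 1) * (k - 4)
    C = (k - 2) * (k - 3) * (k - 4)
    e_star = (N * e2_bar - n * es2) / (N - n)  # eq. (4.1)
    numerator = (A + C) * e2_bar + f * B * es2 + C * e_star
    denominator = A * e2_bar + f * B * e_star + 2 * C * es2
    return es1 * numerator / denominator


# Sample 1 of Table 7.4: sample means 13.0 (G) and 15.0 (G'),
# with N = 16, n = 4 and the known auxiliary mean 19.875.
for k in (1, 2, 3, 4, 2.6619):
    print(k, round(edge_estimator(k, 13.0, 15.0, 19.875, 16, 4), 4))
```

At $k = 4$ the auxiliary information drops out and the function returns the plain sample mean, while at $k = 2.6619$ the value for this sample lies close to the true mean 17.75 quoted in Section 7.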
5. SETTING APPROXIMATIONS FOR
DERIVATION
For any two small quantities $r_1$ and $r_2$ $(|r_1| < 1, |r_2| < 1)$, the approximations are
$$\bar{e}_s^{(1)} = \bar{e}^{(1)}(1 + r_1) \ \ (\text{for graph } G), \qquad \bar{e}_s^{(2)} = \bar{e}^{(2)}(1 + r_2) \ \ (\text{for graph } G') \tag{5.1}$$
with assumptions
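(i) $$E(r_1) = E(r_2) = 0 \tag{5.2}$$
(ii) $$E(r_1^2) = \frac{E\left[\bar{e}_s^{(1)} - \bar{e}^{(1)}\right]^2}{\left(\bar{e}^{(1)}\right)^2} = \frac{V\left(\bar{e}_s^{(1)}\right)}{\left(\bar{e}^{(1)}\right)^2} = (N - n)(Nn)^{-1}\left(C_e^{(1)}\right)^2 \tag{5.3}$$
(iii) $$E(r_2^2) = \frac{E\left[\bar{e}_s^{(2)} - \bar{e}^{(2)}\right]^2}{\left(\bar{e}^{(2)}\right)^2} = \frac{V\left(\bar{e}_s^{(2)}\right)}{\left(\bar{e}^{(2)}\right)^2} = (N - n)(Nn)^{-1}\left(C_e^{(2)}\right)^2 \tag{5.4}$$
(iv) $$E(r_1 r_2) = \frac{E\left[\left(\bar{e}_s^{(1)} - \bar{e}^{(1)}\right)\left(\bar{e}_s^{(2)} - \bar{e}^{(2)}\right)\right]}{\bar{e}^{(1)}\,\bar{e}^{(2)}} = (N - n)(Nn)^{-1}\,\rho\, C_e^{(1)} C_e^{(2)} \tag{5.5}$$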
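NOTE 5.1: The symbols $\left(S_e^{(1)}\right)^2$, $\left(S_e^{(2)}\right)^2$ are the population mean squares of the $N$ edges as per the MNEM and ANEM of $G$ and $G'$ respectively. $V$ denotes the variance and $E$ the expectation of an estimator based on a sample of size $n$. Moreover, $\left(C_e^{(1)}\right)^2 = \left(S_e^{(1)}\right)^2 / \left(\bar{e}^{(1)}\right)^2$ and $\left(C_e^{(2)}\right)^2 = \left(S_e^{(2)}\right)^2 / \left(\bar{e}^{(2)}\right)^2$ are the squared coefficients of variation of the edges, and $\rho$ denotes the correlation coefficient between the $N$ pairs of edges of $G$ and $G'$ present in MNEM and ANEM.
REMARK 5.1: Equation (4.1) can be expressed using (5.1) as
$$\bar{e}_s^{*} = \bar{e}^{(2)}(1 - \delta r_2); \quad \delta = n(N - n)^{-1}$$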
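THEOREM 5.1: Using (5.1) and Remark 5.1, the class of estimators (4.2) has the approximate form
$$M_e^b = \bar{e}^{(1)}\left[1 + r_1 + (\theta_1' - \theta_2')(r_2 + r_1 r_2) - \theta_2'(\theta_1' - \theta_2')(r_2^2 + r_1 r_2^2)\right]$$
where $\theta_1' = \theta_1/\theta_3$ and $\theta_2' = \theta_2/\theta_3$, with
$\theta_1 = fB - \delta C$; $\theta_2 = 2C - \delta fB$; $\theta_3 = A + fB + 2C$; $\delta = n(N - n)^{-1}$.
PROOF: Using (5.1) and Remark 5.1, one can express $M_e^b$ of (4.2) as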
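$$M_e^b = \bar{e}^{(1)}(1 + r_1)\,\frac{(A + C)\,\bar{e}^{(2)} + fB\,\bar{e}^{(2)}(1 + r_2) + C\,\bar{e}^{(2)}(1 - \delta r_2)}{A\,\bar{e}^{(2)} + fB\,\bar{e}^{(2)}(1 - \delta r_2) + 2C\,\bar{e}^{(2)}(1 + r_2)}$$
$$= \bar{e}^{(1)}(1 + r_1)\,\frac{(A + fB + 2C) + (fB - \delta C)\,r_2}{(A + fB + 2C) + (2C - \delta fB)\,r_2}$$
$$= \bar{e}^{(1)}(1 + r_1)(1 + \theta_1' r_2)(1 + \theta_2' r_2)^{-1}$$
with $\theta_1' = (fB - \delta C)/(A + fB + 2C)$ and $\theta_2' = (2C - \delta fB)/(A + fB + 2C)$, so that
$$M_e^b = \bar{e}^{(1)}(1 + r_1)(1 + \theta_1' r_2)\left[1 - \theta_2' r_2 + (\theta_2' r_2)^2 - (\theta_2' r_2)^3 + \ldots\right]$$
Assuming the terms $r_2^j$ to be very small when $j > 2$, and ignoring them in the expansion of $(1 + \theta_2' r_2)^{-1}$ [15, 17], we get
$$M_e^b = \bar{e}^{(1)}(1 + r_1)(1 + \theta_1' r_2)\left[1 - \theta_2' r_2 + (\theta_2' r_2)^2\right]$$
$$= \bar{e}^{(1)}\left[1 + r_1 + (\theta_1' - \theta_2')(r_2 + r_1 r_2) - \theta_2'(\theta_1' - \theta_2')(r_2^2 + r_1 r_2^2)\right]$$
Hence the theorem.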
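THEOREM 5.2: The bias of the proposed class is
$$B\left(M_e^b\right) = \bar{e}^{(1)}\,\frac{N - n}{Nn}\,(\theta_2' - \theta_1')\left[\theta_2'\left(C_e^{(2)}\right)^2 - \rho\, C_e^{(1)} C_e^{(2)}\right]$$
where $\theta_1' = (fB - \delta C)/(A + fB + 2C)$ and $\theta_2' = (2C - \delta fB)/(A + fB + 2C)$.
PROOF: Let $B(\cdot)$ denote bias. Then, using Theorem 5.1,
$$E\left(M_e^b\right) = \bar{e}^{(1)}\,E\left[1 + r_1 + (\theta_1' - \theta_2')(r_2 + r_1 r_2) - \theta_2'(\theta_1' - \theta_2')(r_2^2 + r_1 r_2^2)\right]$$
$$= \bar{e}^{(1)}\left[1 + E(r_1) + (\theta_1' - \theta_2')\{E(r_2) + E(r_1 r_2)\} - \theta_2'(\theta_1' - \theta_2')\{E(r_2^2) + E(r_1 r_2^2)\}\right]$$
Using (5.2) to (5.5) and substituting
$$E\left(r_1^i\, r_2^j\right) = 0 \ \text{ when } i + j \geq 3; \ i, j = 0, 1, 2, 3, \ldots \ [17] \tag{5.6}$$
we get
$$E\left(M_e^b\right) = \bar{e}^{(1)} + \bar{e}^{(1)}\,\frac{N - n}{Nn}\,(\theta_2' - \theta_1')\left[\theta_2'\left(C_e^{(2)}\right)^2 - \rho\, C_e^{(1)} C_e^{(2)}\right]$$
Therefore, the bias is
$$B\left(M_e^b\right) = E\left(M_e^b\right) - \bar{e}^{(1)} = \bar{e}^{(1)}\,\frac{N - n}{Nn}\,(\theta_2' - \theta_1')\left[\theta_2'\left(C_e^{(2)}\right)^2 - \rho\, C_e^{(1)} C_e^{(2)}\right]$$
Hence the theorem.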
REMARK 5.2: The class of estimators $M_e^b$ is almost unbiased if
$$\theta_2' = V, \quad \text{where } \theta_2' = \frac{2C - \delta fB}{A + fB + 2C} \ \text{ and } \ V = \rho\,\frac{C_e^{(1)}}{C_e^{(2)}}$$
PROOF: Setting $B\left(M_e^b\right) = 0$, we get $\theta_2'\left(C_e^{(2)}\right)^2 - \rho\, C_e^{(1)} C_e^{(2)} = 0$, and hence the result.
REMARK 5.3: Remark 5.2 provides the equation
$$VA + fB\,(V + \delta) + 2C\,(V - 1) = 0 \tag{5.7}$$
which is a necessary condition for obtaining an unbiased estimator in the class (4.2) to the first order of approximation (an almost unbiased estimator) by a suitable choice of $k$. Equation (5.7) is cubic in $k$ and may give at most three values of $k$ for which the bias is zero. One can choose the best $k$ as the one with the lowest m.s.e. All three $k$-values generate a sub-class of almost unbiased edge estimators.
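THEOREM 5.3: The mean squared error of the class $M_e^b$ is
$$MSE\left(M_e^b\right) = \left(\bar{e}^{(1)}\right)^2\,\frac{N - n}{Nn}\left[\left(C_e^{(1)}\right)^2 + P^2\left(C_e^{(2)}\right)^2 - 2P\rho\, C_e^{(1)} C_e^{(2)}\right] \tag{5.8}$$
where $P = (\theta_2' - \theta_1') = \dfrac{(2 + \delta)\,C - (1 + \delta)\,fB}{A + fB + 2C}$.
PROOF: $MSE\left(M_e^b\right) = E\left[M_e^b - \bar{e}^{(1)}\right]^2$. On replacing $M_e^b$ by Theorem 5.1 and using equation (5.6), we get
$$MSE\left(M_e^b\right) = \left(\bar{e}^{(1)}\right)^2\left[E(r_1^2) + (\theta_2' - \theta_1')^2 E(r_2^2) - 2(\theta_2' - \theta_1')\,E(r_1 r_2)\right]$$
With equations (5.2) to (5.5), the result is
$$MSE\left(M_e^b\right) = \left(\bar{e}^{(1)}\right)^2\,\frac{N - n}{Nn}\left[\left(C_e^{(1)}\right)^2 + (\theta_2' - \theta_1')^2\left(C_e^{(2)}\right)^2 - 2(\theta_2' - \theta_1')\rho\, C_e^{(1)} C_e^{(2)}\right]$$
$$= \left(\bar{e}^{(1)}\right)^2\,\frac{N - n}{Nn}\left[\left(C_e^{(1)}\right)^2 + P^2\left(C_e^{(2)}\right)^2 - 2P\rho\, C_e^{(1)} C_e^{(2)}\right]$$
Hence the theorem.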
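The first-order bias and m.s.e. can be evaluated numerically as a check on the algebra. The sketch below assumes the expressions $B = \bar{e}^{(1)}\frac{N-n}{Nn}P[\theta_2'(C_e^{(2)})^2 - \rho C_e^{(1)}C_e^{(2)}]$ and (5.8), with $\theta_1' = (fB - \delta C)/(A + fB + 2C)$, $\theta_2' = (2C - \delta fB)/(A + fB + 2C)$ and $P = \theta_2' - \theta_1'$; the function names are hypothetical. The inputs are the values of Table 7.5, except that $\rho \approx 0.897$ is the correlation obtained by computing it directly from the paired lengths of Table 7.1.

```python
def theta_constants(k, N, n):
    """Return (theta1', theta2') for a given k.
    Undefined when A + f*B + 2*C equals zero."""
    f = n / N
    delta = n / (N - n)
    A = (k - 1) * (k - 2)
    B = (k - 1) * (k - 4)
    C = (k - 2) * (k - 3) * (k - 4)
    theta3 = A + f * B + 2 * C
    theta1 = (f * B - delta * C) / theta3
    theta2 = (2 * C - delta * f * B) / theta3
    return theta1, theta2


def bias_and_mse(k, e1_bar, c1, c2, rho, N, n):
    """First-order bias and mean squared error of the estimator at k."""
    theta1, theta2 = theta_constants(k, N, n)
    P = theta2 - theta1
    lam = (N - n) / (N * n)
    bias = e1_bar * lam * P * (theta2 * c2 ** 2 - rho * c1 * c2)
    mse = e1_bar ** 2 * lam * (c1 ** 2 + P ** 2 * c2 ** 2 - 2 * P * rho * c1 * c2)
    return bias, mse


bias, mse = bias_and_mse(2, 17.75, 0.29056, 0.20694, 0.89726, 16, 4)
print(round(bias, 4), round(mse, 2))
```

For $k = 2$ this gives a bias of about 0.30 and an m.s.e. of about 17.98, in line with the $k = 2$ row of Table 7.8; at $k = 4$ the bias vanishes and the m.s.e. reduces to that of the plain sample mean.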
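REMARK 5.4: Some special cases of the bias and m.s.e., obtained by substituting the values of $A$, $B$, $C$ in Theorems 5.2 and 5.3, are as follows.
At $k = 1$ $(A = 0, B = 0, C = -6)$: $\theta_2' = 1$, $P = 1 + \delta/2$, so that
$$B\left[M_e^b\right]_{k=1} = \bar{e}^{(1)}\,\frac{N - n}{Nn}\left(1 + \frac{\delta}{2}\right)\left[\left(C_e^{(2)}\right)^2 - \rho\, C_e^{(1)} C_e^{(2)}\right]$$
$$MSE\left[M_e^b\right]_{k=1} = \left(\bar{e}^{(1)}\right)^2\,\frac{N - n}{Nn}\left[\left(C_e^{(1)}\right)^2 + \left(1 + \frac{\delta}{2}\right)^2\left(C_e^{(2)}\right)^2 - 2\left(1 + \frac{\delta}{2}\right)\rho\, C_e^{(1)} C_e^{(2)}\right]$$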
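At $k = 2$ $(A = 0, B = -2, C = 0)$: $\theta_2' = -\delta$, $P = -(1 + \delta)$, so that
$$B\left[M_e^b\right]_{k=2} = \bar{e}^{(1)}\,\frac{N - n}{Nn}\,(1 + \delta)\left[\delta\left(C_e^{(2)}\right)^2 + \rho\, C_e^{(1)} C_e^{(2)}\right]$$
$$MSE\left[M_e^b\right]_{k=2} = \left(\bar{e}^{(1)}\right)^2\,\frac{N - n}{Nn}\left[\left(C_e^{(1)}\right)^2 + (1 + \delta)^2\left(C_e^{(2)}\right)^2 + 2(1 + \delta)\rho\, C_e^{(1)} C_e^{(2)}\right]$$
At $k = 3$ $(A = 2, B = -2, C = 0)$: $\theta_2' = \dfrac{\delta f}{1 - f}$, $P = \dfrac{(1 + \delta) f}{1 - f}$, so that
$$B\left[M_e^b\right]_{k=3} = \bar{e}^{(1)}\,\frac{N - n}{Nn}\,\frac{(1 + \delta) f}{1 - f}\left[\frac{\delta f}{1 - f}\left(C_e^{(2)}\right)^2 - \rho\, C_e^{(1)} C_e^{(2)}\right]$$
$$MSE\left[M_e^b\right]_{k=3} = \left(\bar{e}^{(1)}\right)^2\,\frac{N - n}{Nn}\left[\left(C_e^{(1)}\right)^2 + \left(\frac{(1 + \delta) f}{1 - f}\right)^2\left(C_e^{(2)}\right)^2 - 2\,\frac{(1 + \delta) f}{1 - f}\,\rho\, C_e^{(1)} C_e^{(2)}\right]$$
At $k = 4$ $(A = 6, B = 0, C = 0)$: $P = 0$, so that
$$B\left[M_e^b\right]_{k=4} = 0, \qquad MSE\left[M_e^b\right]_{k=4} = \left(\bar{e}^{(1)}\right)^2\,\frac{N - n}{Nn}\left(C_e^{(1)}\right)^2$$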
6. OPTIMUM CHOICE OF k
Expression (5.8) of mean squared error of the
class depends on a parameter P , which is
actually a function of k . One can obtain the
appropriate choice of P such that the mean
squared error is minimum.
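THEOREM 6.1: The minimum mean squared error occurs when
$$P = V$$
PROOF: On differentiating the expression for $MSE\left(M_e^b\right)$ of Theorem 5.3 with respect to $P$ and equating to zero,
$$\frac{d}{dP}\,MSE\left(M_e^b\right) = 0 \ \Rightarrow \ P\left(C_e^{(2)}\right)^2 - \rho\, C_e^{(1)} C_e^{(2)} = 0 \ \Rightarrow \ P = \rho\,\frac{C_e^{(1)}}{C_e^{(2)}} = V \tag{6.1}$$
Hence the theorem.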
REMARK 6.1: The optimality criterion (6.1) provides the equation
$$VA + fB\,(1 + \delta + V) + C\,(2V - 2 - \delta) = 0 \tag{6.2}$$
which is cubic in the parameter $k$. For known values of $f$ and $V$, one can obtain at most three values of $k$ for which the m.s.e. is optimum (minimum).
REMARK 6.2: Let $k_1'$, $k_2'$ and $k_3'$ be the three values for which $MSE\left(M_e^b\right)$ is optimum, using equation (6.2). The best choice among these is the one with the smallest bias,
$$k_{opt}' : \ \min\left[\,\left|B\left(M_e^b\right)_{k_1'}\right|, \ \left|B\left(M_e^b\right)_{k_2'}\right|, \ \left|B\left(M_e^b\right)_{k_3'}\right|\,\right] \tag{6.3}$$
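The cubic conditions can be solved numerically once $f$, $\delta$ and $V$ are known. The sketch below is one possible approach, not the authors' procedure: it assumes the forms $VA + fB(V + \delta) + 2C(V - 1) = 0$ for (5.7) and $VA + fB(1 + \delta + V) + C(2V - 2 - \delta) = 0$ for (6.2), and uses plain bisection on a bracket chosen by inspection. All function names are hypothetical.

```python
def abc(k):
    """Constants A, B, C of the class of estimators."""
    return (k - 1) * (k - 2), (k - 1) * (k - 4), (k - 2) * (k - 3) * (k - 4)


def unbiased_condition(k, V, f, delta):
    """Left-hand side of the almost-unbiasedness cubic (5.7)."""
    A, B, C = abc(k)
    return V * A + f * B * (V + delta) + 2 * C * (V - 1)


def optimum_condition(k, V, f, delta):
    """Left-hand side of the minimum-m.s.e. cubic (6.2)."""
    A, B, C = abc(k)
    return V * A + f * B * (1 + delta + V) + C * (2 * V - 2 - delta)


def bisect(func, lo, hi, tol=1e-10):
    """Find one root of func on [lo, hi]; requires a sign change."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2


# Values of the numerical illustration: N = 16, n = 4, V = 1.25992.
N, n, V = 16, 4, 1.25992
f, delta = n / N, n / (N - n)
k_unbiased = bisect(lambda k: unbiased_condition(k, V, f, delta), 2.0, 3.0)
k_optimum = bisect(lambda k: optimum_condition(k, V, f, delta), 2.0, 3.0)
print(round(k_unbiased, 4), round(k_optimum, 4))
```

With these inputs the two roots come out at approximately 2.394 and 2.662. Bisection returns only the root inside the chosen bracket, so a full search for up to three real roots would need several brackets or a polynomial root finder.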
7. NUMERICAL ILLUSTRATION
Consider a graphical population of two isomorphic graphs $G$ and $G'$ with the structure and relationship shown in Figures 7.1 and 7.2.
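The lengths of the edges $e_i$ and $e_i'$ of $G$ and $G'$ respectively are shown in Table 7.1. The total number of edge entries is $N = 16$ (each of the eight edges $e_1$ to $e_8$ is recorded at both of its end nodes in Table 7.1), the number of vertices is $M = 7$ and the sample size is $n = 4$.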
Fig. 7.1: [Graph G].
Fig. 7.2: [Graph G′].
Table 7.1: Node–Edge Length Table.
For Graph G (Main Interest)              For Graph G' (Auxiliary Source)
Node   Edge           Length             Node   Edge              Length
V1     e1, e3, e5     10, 19, 12         V1'    e1', e3', e5'     13, 21, 19
V2     e1, e2         10, 14             V2'    e1', e2'          13, 16
V3     e2, e3, e4     14, 19, 18         V3'    e2', e3', e4'     16, 21, 18
V4     e4, e5, e6     18, 12, 21         V4'    e4', e5', e6'     18, 19, 22
V5     e6, e7, e8     21, 23, 25         V5'    e6', e7', e8'     22, 26, 24
V6     e7             23                 V6'    e7'               26
V7     e8             25                 V7'    e8'               24
Table 7.2: MNEM (for Fig. 7.1)
Table 7.3: ANEM (For Fig. 7.2)
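The node-edge matrix for Fig. 7.1 can be rebuilt from the node-edge lists of Table 7.1. The following is a minimal sketch; the dictionary `INCIDENCE` and the function `node_edge_matrix` are illustrative names, and because $G$ and $G'$ are isomorphic the same 0/1 pattern applies to the ANEM.

```python
# Node -> indices of the edges incident to it (from Table 7.1).
INCIDENCE = {
    "V1": [1, 3, 5],
    "V2": [1, 2],
    "V3": [2, 3, 4],
    "V4": [4, 5, 6],
    "V5": [6, 7, 8],
    "V6": [7],
    "V7": [8],
}


def node_edge_matrix(incidence, n_edges):
    """Build the 0/1 matrix a_ij: 1 if node i is linked with edge j."""
    return [
        [1 if j in edges else 0 for j in range(1, n_edges + 1)]
        for edges in incidence.values()
    ]


matrix = node_edge_matrix(INCIDENCE, 8)
row_counts = [sum(row) for row in matrix]           # edges per node
column_counts = [sum(col) for col in zip(*matrix)]  # nodes per edge

for node, row, count in zip(INCIDENCE, matrix, row_counts):
    print(node, row, count)
print("Total", sum(row_counts))  # Total 16
```

Every column sums to 2, since each edge joins two nodes, so the 8 edges produce the 16 non-zero entries that make up the population of size $N = 16$.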
Table 7.4: Sample Edge Description (for samples of size n = 4 drawn as per the node-sampling procedure).
S. No.  Sample Vertices   Sample Edges of G (lengths)        Mean (G)   Sample Edges of G' (lengths)          Mean (G')
1.      V1, V2, V3, V4    e1, e1, e2, e4 (10, 10, 14, 18)    13.000     e1', e1', e2', e4' (13, 13, 16, 18)   15.000
2.      V1, V2, V3, V5    e1, e1, e2, e6 (10, 10, 14, 21)    13.750     e1', e1', e2', e6' (13, 13, 16, 22)   16.000
3.      V1, V2, V3, V6    e1, e1, e2, e7 (10, 10, 14, 23)    14.250     e1', e1', e2', e7' (13, 13, 16, 26)   17.000
4.      V1, V2, V3, V7    e1, e1, e2, e8 (10, 10, 14, 25)    14.750     e1', e1', e2', e8' (13, 13, 16, 24)   16.500
5.      V2, V3, V4, V5    e1, e2, e4, e6 (10, 14, 18, 21)    15.750     e1', e2', e4', e6' (13, 16, 18, 22)   17.250
6.      V2, V3, V5, V6    e1, e2, e6, e7 (10, 14, 21, 23)    17.000     e1', e2', e6', e7' (13, 16, 22, 26)   19.250
7.      V3, V4, V5, V6    e2, e4, e6, e7 (14, 18, 21, 23)    19.000     e2', e4', e6', e7' (16, 18, 22, 26)   20.500
8.      V4, V5, V6, V7    e4, e6, e7, e8 (18, 21, 23, 25)    21.750     e4', e6', e7', e8' (18, 22, 26, 24)   22.500
Table 7.5: Population Parameters of Fig. 7.1 and 7.2.
S. No.  Parameter                 Value      S. No.  Parameter                 Value
1       $M$                       7          7       $(C_e^{(1)})^2$           0.08443
2       $n$                       4          8       $(C_e^{(2)})^2$           0.04283
3       $\bar{e}^{(1)}$           17.7500    9       $C_e^{(1)}$               0.29056
4       $\bar{e}^{(2)}$           19.8750    10      $C_e^{(2)}$               0.20694
5       $(S_e^{(1)})^2$           26.6000    11      $\rho$                    0.89726
6       $(S_e^{(2)})^2$           16.9167    12      $V$                       1.25992
Table 7.6: MSE when Estimator is Almost Unbiased.
S. No.  Value of k            Bias at k_i     MSE at k_i
1.      k_1 = 2.394215        -0.00164        1.339895
2.      k_2 = ---             ---             ---
3.      k_3 = ---             ---             ---
Table 7.7: Optimum MSE.
S. No.  Value of k'           Bias at k_i'    MSE at k_i'
1.      k_1' = 2.661921       -0.09761        1.054235
2.      k_2' = ---            ---             ---
3.      k_3' = ---            ---             ---
Table 7.8: Calculation of $MSE\left(M_e^b\right)$ over Varying k.
k                                        Bias B(M)   MSE(M)     V(M)
k = 1                                    0.7515      3.8874     3.3226
k = 2                                    0.3027      17.9833    17.8916
k = 3                                    0.0504      6.8670     6.8644
k = k_1 = 2.3942 (unbiased)              0.0016      1.3399     1.3348
k = k_opt = k_1' = 2.6619 (opt. MSE)     0.0976      1.0542     1.0447
where the variance is computed by $V(\cdot) = MSE(\cdot) - [Bias(\cdot)]^2$.
Table 7.9: Sample Based Estimates of Edge in G (Related To Table 7.4).
Note that the confidence interval is computed using the formula
$$\left[\, M_e^b - 3\sqrt{V\left(M_e^b\right)}, \ \ M_e^b + 3\sqrt{V\left(M_e^b\right)} \,\right]$$
for the different values $k = 1, 2, 3, k_1$ and $k_1'$.
8. DISCUSSION
For the problem undertaken, the sample is drawn by the node-sampling procedure, which is used to estimate the mean edge length. Tables 7.2 and 7.3 show the MNEM and ANEM, each having 16 edge entries of different lengths. Figures 7.1 and 7.2 are two isomorphic graphs with a similar mapping of edges.
Table 7.4 shows the sample means over 8 different samples. The proposed class of estimators is almost unbiased at k = 2.3942. The other two roots of Eq. (5.7) are imaginary, so only one real root exists (Table 7.6). For Eq. (6.2) also, only one real root exists for this data set of isomorphic graphs. The value k' = 2.6619 provides the minimum mean squared error. Moreover, the range 2.3 ≤ k ≤ 2.7 contains a sub-class of almost unbiased, minimum-m.s.e. estimators. The value k = 1 is also appreciable (Table 7.8). All estimates based on the 8 random samples, as in Table 7.9, are close to the true value $\bar{e}^{(1)}$ = 17.75, and when k = k_opt all sample estimates are very close to $\bar{e}^{(1)}$. The true value lies within the 99% confidence interval for all eight sample estimates, showing the reliability of the proposed strategy.
9. CONCLUSIONS
A population with a graphical relationship in the form of isomorphic graphs is considered. Estimation of the mean edge length of one graph is made possible through a sampling technique that uses its isomorphic counterpart. A node-sampling procedure is derived and found useful for solving the edge estimation problem. A class of estimators is proposed which has several interesting cases as its members. The condition for obtaining the optimum member of the class is derived along with the bias component. The minimum mean squared error may occur at different values of the constant k; the best of them is the one with the lowest bias. Moreover, the condition for an almost unbiased estimator of the class is also derived, so one can obtain an almost unbiased estimator of the class having the lowest mean squared error. The node-sampling procedure is developed for an isomorphic graphical population and is found effective in providing the average edge length. The best and most efficient sub-class of $M_e^b$ is obtained when 2.3 ≤ k ≤ 2.7. Sample estimates of the mean edge length for k' = 2.6619 are very close to the true value $\bar{e}^{(1)}$. This indicates that the proposed class of estimators is worthwhile and useful for edge-length estimation.
REFERENCES
1. Deo N. Graph Theory with Application to
Engineering and Computer Science. 2001.
Prentice-Hall. Eastern Economy Edition.
New Delhi, India.
2. Unger S. H. GIT-a Heuristic Program for
Testing Pairs of Directed Line Graphs For
Isomorphism. Communications of the
ACM. 1964. 7. 1–11p.
3. Corneil D. G., Gotlieb C. C. Journal of
the ACM (JACM). 1970. 17. 1–9p.
4. Berztiss A. T. Journal of the ACM
(JACM). 1973. 20. 3–10p.
5. Schmidt D. C., Druffel L. E. Journal of
the ACM (JACM). 1976. 23. 3. 435–445p.
6. Lueker G. S., Booth K. S. Journal of the
ACM (JACM). 1979. 26. 2–10p.
7. Miller G. L. Proceedings of the 12th Annual
ACM Symposium on Theory of Computing
STOC '80. 1980. 225–235p.
8. Zibin Y., Gil J. Y., Considine J. Efficient
Algorithms for Isomorphisms of Simple
Types, ACM SIGPLAN Notices,
Proceedings of the 30th ACM SIGPLAN-
SIGACT Symposium on Principles of
Programming Languages (POPL). 2003.
203–211p.
9. Kleinberg R. D., Kleinberg J. M.
Proceedings of the 16th Annual ACM-
SIAM Symposium on Discrete Algorithms
SODA '05. 2005. 270–286p.
10. Hallgren S., Moore C., Rötteler M. et al.
Proceedings of the 38th Annual ACM
Symposium on Theory of Computing
STOC '06. 2006. 604–617p.
11. Singh V. K., Shukla D. Metron. 1987. 45.
1–2. 30–VI. 273–283p.
12. Singh V. K., Shukla D. Metron. 1993. 51.
1–2. 30-VI. 139–159p.
13. Shukla D., Singh V. K., Singh G. N.
Metron. 1991. 49. 1–4. 31-XII. 349–361p.
14. Shukla D. Metron. 2002. 49. 97–106p.
15. Sukhatme P. V., Sukhatme B. V.,
Sukhatme S., et al. Sampling Theory of
Surveys with Applications. 1984. Iowa
State University Press. I.S.A.S.
Publication. New Delhi.
16. Singh D., Choudhary F. S. Theory and
Analysis of Sample Survey Design. 1986.
Wiley Eastern Limited. New Delhi. India.
17. Cochran W. G. Sampling Techniques.
2005. John Wiley and Sons. 5th Edn. New
York.