1/44 a simple test for the consecutive ones property
2/44
P-Q Trees
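Two types of internal nodes: P-nodes & Q-nodes
Children of a P-node can be “permuted arbitrarily”
Children of a Q-node can only be “reversed”
[Figure: a PQ-tree T whose root is a Q-node with children 1, 2 and a P-node; the P-node has children 3 and 4.]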
L(T) = { all permutations generated by T }
In the example, L(T) = { 1234,1243,4321,3421 }
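The set L(T) can be enumerated mechanically. The following is a minimal sketch, not part of the original slides: the tuple encoding of nodes and the function name `frontiers` are assumptions made for illustration, and the tree shape (Q-node root over 1, 2 and a P-node over 3, 4) is inferred from the listed L(T).

```python
from itertools import permutations, product

def frontiers(node):
    """Return the set of leaf orders (as tuples) generated by a PQ-tree node.

    Hypothetical encoding: a leaf is any non-tuple value; an internal node
    is a tuple (kind, children) with kind "P" or "Q".
    """
    if not isinstance(node, tuple):
        return {(node,)}
    kind, children = node
    if kind == "P":
        # Children of a P-node may be permuted arbitrarily.
        orders = list(permutations(children))
    else:
        # Children of a Q-node may only be reversed.
        orders = [tuple(children), tuple(reversed(children))]
    result = set()
    for order in orders:
        # Combine every frontier choice of each child, left to right.
        for parts in product(*(frontiers(c) for c in order)):
            result.add(tuple(x for part in parts for x in part))
    return result

# Q-node root with children 1, 2 and a P-node over 3, 4.
T = ("Q", [1, 2, ("P", [3, 4])])
L = {"".join(map(str, f)) for f in frontiers(T)}
print(sorted(L))
```

The enumeration is exponential in general, so it is only useful for checking small examples by hand.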
3/44
Intermediate On-Line Operations
PQ-tree and PC-tree implementations for the COP are “on-line”, i.e., the rows are revealed one at a time
4/44
Off-Line vs On-Line
• PC-tree algorithm is an on-line algorithm
• We design a simple off-line algorithm
– If one arranges the rows in a good order, then even a kid knows how to test C1P
• In general we can apply graph decomposition to obtain a good ordering
5/44
An intelligent off-line test
• Assume all rows of the matrix are given at the beginning
• Can we find a “good” order of the rows to be processed for the COP test so that the test becomes easier?
– No more PQ-trees nor PC-trees.
– Even a child can perform the test for you!!
6/44
Strictly Overlapping (S.O.) Relationships
Two rows are said to overlap strictly if they overlap but neither is contained in the other.
Such a pair of rows implies the following column partition:
[Figure: rows u and v drawn as overlapping runs of 1s; the columns split into three blocks: v \ u, v ∩ u, u \ v.]
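As a small illustration of the definition, the sketch below represents each row by the set of columns holding a 1. The set representation and the function names are assumptions for this example, not something the slides prescribe.

```python
def strictly_overlap(u, v):
    """True if column sets u and v intersect and neither contains the other."""
    u, v = set(u), set(v)
    return bool(u & v) and not u <= v and not v <= u

def column_partition(u, v):
    """Return the three blocks (v minus u, v and u, u minus v) of an s.o. pair."""
    u, v = set(u), set(v)
    if not strictly_overlap(u, v):
        raise ValueError("rows do not strictly overlap")
    return v - u, v & u, u - v

u = {3, 4, 5, 6}
v = {1, 2, 3, 4}
print(strictly_overlap(u, v))
print(column_partition(u, v))
```

In any consecutive-ones arrangement the three blocks must appear in this order (or its reverse), which is why a single strictly overlapping pair already constrains the column order.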
7/44
Ideal Situation
If there is a row ordering v1, v2, …, vm such that each vi (i > 1) strictly overlaps some vj with j < i,
then it is trivial to test the consecutive ones property
[Figure: the column partition before and after processing a row that strictly overlaps an earlier one.]
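One way to picture the test in the ideal situation is as a left-to-right list of column blocks that each new row refines. The code below is a hedged sketch of that idea, not the authors' implementation: rows are sets of column labels, the names `refine` and `ideal_order_c1p` are hypothetical, and the input is assumed to be already in an ideal order.

```python
def refine(blocks, v):
    """Refine an ordered list of column blocks with a new row v.

    Returns the new block list, or None if v cannot be made consecutive.
    Assumes v strictly overlaps some row processed earlier.
    """
    v = set(v)
    new_cols = v - set().union(*blocks)
    touched = [i for i, blk in enumerate(blocks) if blk & v]
    if not touched:
        raise ValueError("row does not overlap any earlier row")
    a, b = touched[0], touched[-1]

    def full(i):
        return blocks[i] <= v

    # Touched blocks must be contiguous, and inner ones fully inside v.
    if touched != list(range(a, b + 1)):
        return None
    if not all(full(i) for i in range(a + 1, b)):
        return None

    def split(i, inside_first):
        inside, outside = blocks[i] & v, blocks[i] - v
        parts = [inside, outside] if inside_first else [outside, inside]
        return [p for p in parts if p]

    k = len(blocks)
    if new_cols:
        # v sticks out of the covered region, so it must reach one end.
        if a == 0 and all(full(i) for i in range(b)):
            return [new_cols] + blocks[:b] + split(b, True) + blocks[b + 1:]
        if b == k - 1 and all(full(i) for i in range(a + 1, k)):
            return blocks[:a] + split(a, False) + blocks[a + 1:] + [new_cols]
        return None
    if a == b:
        raise ValueError("row is contained in an earlier row's block")
    return (blocks[:a] + split(a, False) + blocks[a + 1:b]
            + split(b, True) + blocks[b + 1:])

def ideal_order_c1p(rows):
    """Return an ordered column partition, or None if the test fails."""
    blocks = [set(rows[0])]
    for v in rows[1:]:
        blocks = refine(blocks, v)
        if blocks is None:
            return None
    return blocks

print(ideal_order_c1p([{1, 2, 3}, {3, 4}, {2, 3}]))
print(ideal_order_c1p([{1, 2}, {2, 3}, {2, 4}]))
```

Columns inside one block remain interchangeable; only the order of the blocks is fixed, up to reversal of the whole list.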
8/44
The strictly overlapping graph
• Define the graph G on the set of rows whose edge set consists of the strictly overlapping pairs of rows.
• Each connected component of G satisfies the above “ideal situation”.
– Why? Consider a spanning tree and a breadth-first order
– The corresponding submatrices are called prime
– Can show that the matrix satisfies the COP iff each of its prime submatrices does
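The decomposition into prime submatrices can be sketched by brute force. The code below checks every pair of rows, which is exactly the quadratic work the later slides avoid, so it is meant only to make the definition concrete. Rows are sets of column labels; `prime_components` is a hypothetical name.

```python
from collections import deque

def strictly_overlap(u, v):
    """True if u and v intersect and neither contains the other."""
    return bool(u & v) and not u <= v and not v <= u

def prime_components(rows):
    """Return the components of the s.o. graph, each as row indices in BFS order."""
    n = len(rows)
    # Brute force: test all pairs of rows (quadratic in the number of rows).
    adj = [[j for j in range(n)
            if j != i and strictly_overlap(rows[i], rows[j])]
           for i in range(n)]
    seen = [False] * n
    components = []
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        order, queue = [], deque([s])
        while queue:
            i = queue.popleft()
            order.append(i)
            for j in adj[i]:
                if not seen[j]:
                    seen[j] = True
                    queue.append(j)
        components.append(order)
    return components

rows = [{1, 2, 3}, {3, 4}, {2, 3}, {7, 8}, {8, 9}, {1, 2, 3, 4}]
print(prime_components(rows))
```

Within each component, every row after the first strictly overlaps its BFS parent, which appears earlier in the list; this is the ordering the “ideal situation” asks for.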
11/44
Decomposition – A Divide and Conquer Strategy
• Divide and conquer
1. To reduce complexity
2. To make the problem easier to solve
• To simplify a graph algorithm, we can decompose the graph until the problem becomes easier to solve on the final components
– If the decomposition operation is also very efficient, then we will get an efficient algorithm at the end
13/44
A spanning subgraph G’ of G
• However, we cannot afford to compute all the edges in G, which could take O(r^2) time.
• We shall compute a subset of edges that contains a spanning tree of each connected component.
• Note that the process of obtaining the components actually decomposes the matrix into prime submatrices
15/44
Exact Algorithm for Consecutive Ones Testing
1. Construct a subgraph G’ which contains a spanning tree of each connected component of G (the S.O. graph). Each connected component corresponds to a prime submatrix. (matrix decomposition)
2. Decide a good ordering of the rows of each prime matrix based on BFS.
3. For each prime matrix determine the ordering of columns, using the set partition strategy as described in the following slides.
• Process the rows from small to large
16/44
An Efficiency Note
Assume every row in A strictly overlaps every row in B. The # of strictly overlapping pairs is |A| |B|. Let a, b be the least indexed rows in A, B, respectively. To connect A and B in the graph, it suffices to make a adjacent to all rows in B and b adjacent to all rows in A.
[Figure: row sets A and B with their least-indexed rows a and b.]
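The saving can be checked on a small case. The sketch below assumes rows are identified by integer indices and that A and B are disjoint; the function names are hypothetical. It builds only the edges through a and b and confirms that the result is connected with |A| + |B| - 1 edges instead of |A| |B|.

```python
def sparse_connection(A, B):
    """Edges joining row sets A and B through their least-indexed rows only."""
    a, b = min(A), min(B)
    # a is made adjacent to all of B, and b to all of A; (a, b) is shared.
    return {(a, y) for y in B} | {(x, b) for x in A}

def is_connected(nodes, edges):
    """Depth-first check that the undirected graph (nodes, edges) is connected."""
    nodes = list(nodes)
    adj = {x: set() for x in nodes}
    for x, y in edges:
        adj[x].add(y)
        adj[y].add(x)
    stack, seen = [nodes[0]], {nodes[0]}
    while stack:
        x = stack.pop()
        for y in adj[x] - seen:
            seen.add(y)
            stack.append(y)
    return len(seen) == len(nodes)

A = [0, 2, 4]
B = [1, 3, 5, 7]
edges = sparse_connection(A, B)
print(len(edges), "edges instead of", len(A) * len(B))
print(is_connected(A + B, edges))
```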
18/44
Representative Rows vA and vB
Let v be adjacent to both A and B. However, vA and vB are forbidden to be made adjacent to A and B, respectively (to avoid forming incorrect s.o. edges).
[Figure: row v with the representative rows vA and vB, marked with 1/2 entries.]
19/44
Classifying the neighbors of a row u
[Figure: row u and its neighbor classes A, B, C, D.]
1. Append A(u), B(u) and D(u) to PT(u) (the set of candidates that are potentially s.o. with u).
2. Append uD to PT(w) for all w in C(u) whose index is smaller than Ind(uD).
3. Delete the row u and use an artificial column [u] to replace the region covered by the columns of u.
4. Add edges from u to the nodes of PT(u) - FB(u), where FB(u) is the set of rows forbidden to be s.o. with u.
20/44
Partition the columns within row u
This is relatively unique
At the end of the iteration, no longer have to worry about the columns within u
21/44
After a row u is processed
• All of its columns are shrunk to one artificial column [u] (the main reduction)
• All the 1’s for rows in A, B in columns of u are eliminated except to save ½ for vA and vB (to discover future strictly overlapping relations)
• Save a 1 for rows in C and eliminate all rows in D.
22/44
Which S.O. relations have been changed?
[Figure: row u with rows A, B, C, E, F and the representative rows vA and vB.]
All rows whose s.o. relations have changed are connected to u, i.e., they belong to the same component as u.
23/44
Afterwards
[Figure: rows A, B, C, E, F, the representative rows vA and vB, and u after the reduction.]
The connected components do not change after the reduction of u.
Rows that were not s.o. cannot become s.o.
24/44
[Figure: worked example of the reduction on a 0/1 matrix with rows 1 to 6, with 0.5 entries appearing after a reduction step.]
25/44
[Figure: worked example, continued: further reduction steps on the remaining rows.]
27/44
Lemma 2
• If one of ui and uj (i < j) is contained in the other and the containment is changed before iteration i, then ui and uj are connected in G’.
[Figure: rows ui and uj, row uk and its artificial column [uk], with entries 0 and 0.5.]
28/44
The sub-graph G’ generated by the algorithm
G’ is a spanning sub-graph of G with the same components.
Claim 1. G’ is a subgraph of G.
If (ui, uj) ∉ G, then (ui, uj) ∉ G’
Claim 2. If (ui, uj) ∈ G, then ui and uj belong to the same component of G’
29/44
Claim 1: G’ is a subgraph of G
[Figure: row u, its artificial column [u], and the representative rows vA and vB with entries 0.5.]
1. ui and uj are independent originally. The only case in which they could become s.o. is when they become vA and vB for some u. In this case, ui is in FB(uj) and uj is in FB(ui).
2. ui is contained in uj originally. (Apply Lemma 2.)
30/44
Claim 2
If (ui, uj) ∈ E(G), then ui and uj belong to the same component of G’.
• Suppose not. Let (ui, uj) be the minimal bad pair (for every other bad pair (up, uq), either i < p or j < q).
• Consider the changing of the intersection relationship
– “intersect” to “contain” (case 1)
– “intersect” to “independent” (case 2)
31/44
Case 1. “intersect” changed to “contain”
• ui and uj intersect originally. Let one of ui and uj become contained in the other after iteration k. Consider the following two subcases:
Case 1.1: Both ui and uj overlap uk.
Case 1.2: Only one of ui and uj (say, z) overlaps uk (the other is named eA).
33/44
Case 1.2: Only one of ui and uj (say, z) overlaps uk
[Figure: rows z, eA, uk and ukA.]
uk is connected to z and ukA. We shall verify that ukA is connected to eA.
34/44
Case 1.2: Only one of ui and uj (say, z) overlaps uk
• Case (i): ukA is contained in eA originally
By Lemma 2, ukA is connected to eA.
• Case (ii): ukA contains eA originally
[Figure: rows z, eA, ukA and uk.]
π⁻¹(eA) < π⁻¹(ukA) < π⁻¹(z)
If z is deleted at iteration t (t < π⁻¹(eA)):
[Figure: rows z, eA, ukA, uk and t.]
π⁻¹(eA) < π⁻¹(z) < π⁻¹(utD)
eA connects to utD. utD connects to t. t connects to z.
35/44
Case 1.2
• Case (iii): ukA is independent of eA originally
Let ukA overlap eA after iteration t. Then ukA is connected to eA via ut.
• Case (iv): ukA intersects eA originally
(ukA, eA) becomes the minimal bad pair (a contradiction).
We conclude that ukA is connected to eA in G’, so that eA and z are connected in G’.
36/44
Case 2. “intersect” changed to “independent”
• ui and uj intersect originally. Let ui and uj become independent after iteration k. Consider the following two subcases:
Case 2.1: Both ui and uj overlap uk.
Case 2.2: Only one of ui and uj (say, z) intersects uk (the other is named eA).
38/44
Case 2.2: Only one of ui and uj (say, z) intersects uk
[Figure: rows z, eA, uk and ukA.]
uk is connected to z and ukA. We shall verify that ukA is connected to eA.