graph-based semi-supervised and active learning for edge...
TRANSCRIPT
Graph-based Semi-Supervised and Active Learning for Edge FlowsJunteng Jia, Michael T. Schaub, Santiago Segarra and Austin R. BensonR [email protected], [email protected], [email protected], [email protected]
� https://github.com/000Justin000/ssl_edge
Motivation & Problem StatementConsider the problem of monitoring traffic flows in a region.Setting up sensors on all roads would provide accurate mea-surements, but is costly. Given traffic flow measurements ona subset of the roads, can we estimate the remaining flows?
Problem statement
Given:◦ a graph topology G = (V , E)◦flows on a subset of the edges EL
Infer:◦unknown flows on EU ≡ E/E L.
?? ?
??
labeled unknown?
Key insight: a suitable learning assumption for edge flows.Flow conservation – flows that enter/exit a node must balance.
Edge- vs. vertex-based semi-supervised learning
labeled inferred inferredlabeled
Vertex-based learning◦given some vertex labels◦ impose smoothness assumption◦ interpolate unknown vertices
Edge-based flow learning◦given some edge flows◦ impose flow conservation◦ infer unknown edge flows
Formulation & Inference Algorithm
1
2
3 4
5
6
71
2
3
4
5
6
7
80
1 1
1
0 00
0-1
0 0
0
0 0
00
0 0
01 1 0
0 00
000 0
0
0
0
0
0
0 1-1
00
0 -1
0
1
-1 0
-1 0
0
0-10
1 0
0
-1-1
incidence matrix B
◦undirected graph G = (V , E) with |V| = n and |E | = m◦vertex label vector y ∈ Rn
◦define (net) edge flow as alternating function f : V ×V → R
f (i, j) ={− f (j, i), ∀ (i, j) ∈ E0, otherwise.
edge flow vector f ∈ Rm with fr = f (i, j) if Er ≡ (i, j), i < j
◦vertex-edge incidence matrix B ∈ Rn×m (right panel)
Computations: edge vs. vertex-based learningVertex-based learning
◦ ‖Bᵀy‖2 = ∑(i,j)∈E(yi− yj)2 measures “unsmoothness”
◦minimize sum-of-squares difference
y∗ = arg miny‖Bᵀy‖2 s.t. yi = yi, ∀Vi ∈ VL.
Edge flow learning
◦ (Bf)i measures the flow “divergence” on the ith vertex◦minimize sum-of-square divergence, with regularization
f∗ = arg minf‖Bf‖2 + λ2 · ‖f‖2 s.t. fr = fr, ∀Er ∈ EL.
◦ least-square solution (null space method f = f0 + ΦfU)
fU∗ = arg minfU
∥∥∥∥[BΦλ · I
]fU−
[−Bf0
0
]∥∥∥∥2
.
Empirical results & Reconstruction error bound
ZeroFill
LineGraph
FlowSSL
0.0 0.2 0.4 0.6 0.8 1.0
ratio labeled (| L|/| |)
0.0 0.2 0.4 0.6 0.8 1.0
ratio labeled (| L|/| |)
WinnipegPower Grid
0.0
0.2
0.4
0.6
0.8
1.0
corr
ela
tion
◦Transforming flow learning problem with LineGraph to besolved as vertex label learning problem performs no betterthan consistently predicting zero.
◦FlowSSL, our proposed semi-supervised edge-flow learningalgorithm, outperforms the baselines by a large margin.
Theorem: Assume the ground truth flow f = f + δ, where fis a divergence free flow; and we have flow measurements ona subset C edges with cardinality at least m− n + 1. Denotethe null-space of the incidence matrix as V = Null(B). Thenas the regularization parameter λ → 0 in our method, the re-construction error is bounded by [σ−1
min(VC, :) + 1] · ‖δ‖, whereσmin(·) is the smallest singular value of a matrix.
Active Learning Problem & Strategies
Goal: Select a set of edges |EL| = mL to minimize reconstruc-tion error (optimal sensor deployment with a limited budget).
1. RRQR – minimize error bound◦use rank revealing QR (RRQR)
to select well conditioned rows
VᵀC, : Π = Q [R1 R2] .
◦ EL from leading columns of Π
2. RB – select bottleneck edges◦ capture global flow trends◦ recursively bisect (RB) & select
edges that bridge clustersblack arrow: edges selected by RB
0.0 0.2 0.4 0.6 0.8 1.0
ratio labeled (| L|/| |)0.0 0.2 0.4 0.6 0.8 1.0
ratio labeled (| L|/| |)
Power Grid
0.0
0.2
0.4
0.6
0.8
1.0
corr
ela
tion
Winnipeg
Baseline
Random
RRQR
RB
Findings: RRQR provides additional gains for approximatelydivergence-free flows (left), RB works well for flows withglobal trends (right).
This research was supported by NSF award DMS-1830274, ARO AwardW911NF-19-1-0057, and European Union’s Horizon 2020 research andinnovation programme under the Marie Sklodowska-Curie grant agree-ment No 702410.