
Network Design for Information Networks

Chaitanya Swamy, Caltech and U. Waterloo

Ara Hayrapetyan, Éva Tardos

Cornell University

Typical Network Design

• Users/clients.

• Each user has a demand – number of packets/bits.

• Cost of sending information on an edge for a set of users depends on a single parameter – the total demand of that set of users.

e.g. Steiner tree: $\mathrm{cost}_e(S) = c_e$ for $S \neq \emptyset$

Buy-at-bulk ND: $\mathrm{cost}_e(S) = c_e \cdot f(|S|)$, f concave

Implicitly assumes that routing a set of users requires sending the total demand of that set.

Information Aggregation Model

• Take a higher level view – want to capture information aggregation.

• Each user has some information.

• Interested in the total information flow of a set of users, allowing for information aggregation.
  – cost of sending information of a set of users could be much less than the sum of the individual information needs ⇒ incur cost savings
  – some information may aggregate better than others ⇒ aggregation/cost function depends on the set of users

• Can capture complex relations between users by using a set-based cost function.

Model (contd.)

Graph G = (V, E)

D: set of terminals/users/clients.

$c_e$: length of edge e.

Cost function $h : 2^D \to \mathbb{R}_{\ge 0}$, $h(\emptyset) = 0$

Want to model economies of scale – will assume h(.) is increasing and submodular, i.e., if $A \subseteq B$ and $i \notin B$, then

$h(A+i) - h(A) \ge h(B+i) - h(B)$

h(.) is given implicitly, e.g., via an oracle.

Algorithm should make only a polynomial number of queries.
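The increasing and submodular conditions on h(.) can be checked by brute force on a tiny ground set. The sketch below is illustrative only: the user names, the `ITEMS` data, and both example functions are assumptions, not from the talk. It also queries h on every subset, so it does not meet the polynomial-query requirement; it only makes the definition concrete.

```python
from itertools import combinations

def powerset(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_increasing_submodular(h, ground):
    """Check h(A) <= h(B) and h(A+i) - h(A) >= h(B+i) - h(B)
    for all A subset of B and all i not in B (exponential time)."""
    ground = frozenset(ground)
    subsets = list(powerset(ground))
    for A in subsets:
        for B in subsets:
            if not A <= B:
                continue
            if h(A) > h(B) + 1e-12:          # not increasing
                return False
            for i in ground - B:
                gain_A = h(A | {i}) - h(A)
                gain_B = h(B | {i}) - h(B)
                if gain_A < gain_B - 1e-12:  # diminishing returns violated
                    return False
    return True

# Hypothetical data: each user holds some data items; overlapping items aggregate.
ITEMS = {"u1": {"a", "b"}, "u2": {"b", "c"}, "u3": {"c"}}

def coverage(S):
    """h(S) = number of distinct items held by the users in S."""
    covered = set()
    for user in S:
        covered |= ITEMS[user]
    return len(covered)

def squared(S):
    """A convex function of |S|: increasing but not submodular."""
    return len(S) ** 2

print(is_increasing_submodular(coverage, ITEMS))
print(is_increasing_submodular(squared, ITEMS))
```

The coverage function passes because shared items are counted once, which is the kind of aggregation the model is meant to capture; the squared function fails because marginal costs grow with the set.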

Applicability

• Sensor Networks
  – Distributed sensor nodes send information to central node(s)
  – Information can often be well aggregated along paths, e.g., temperature readings
  – May care only about aggregate information, e.g., average temperature, humidity …

• Content-based publish-subscribe systems
  – Users “publish” or “subscribe” to information
  – Information flowing through network can be aggregated

Two Network Design Problems

• Single-sink problem

For each terminal, choose a path to sink to send information.

Goal: Minimize total cost of sending information

= $\sum_e c_e \cdot h(A_e)$

$A_e$: set of terminals using edge e

[Figure legend: terminal, sink, node]

• Facility location setting

Multiple facilities (sinks) – can route to facility i paying a fixed cost of $f_i$

For each terminal, choose a path to a facility to send information.

Goal: Minimize facility opening + information sending costs

= $\sum_{i \in F} f_i + \sum_e c_e \cdot h(A_e)$

$A_e$: set of terminals using edge e

F: set of opened facilities

[Figure legend: terminal, facility, node]
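Both objectives can be evaluated directly once each terminal's path is fixed. The following is a minimal sketch under assumed inputs: the toy graph, the three example cost functions, and the convention that a facility counts as opened when some path ends at it are all illustrative choices.

```python
from collections import defaultdict
import math

def single_sink_cost(paths, length, h):
    """Sum over edges e of c_e * h(A_e), where A_e is the set of
    terminals whose chosen path uses e."""
    users = defaultdict(set)
    for terminal, path in paths.items():
        for u, v in zip(path, path[1:]):
            users[frozenset((u, v))].add(terminal)
    return sum(length[e] * h(A_e) for e, A_e in users.items())

def facility_location_cost(paths, length, h, open_cost):
    """Opening cost of every facility some path ends at, plus routing cost."""
    opened = {path[-1] for path in paths.values()}
    return sum(open_cost[i] for i in opened) + single_sink_cost(paths, length, h)

# Toy instance (assumed): terminals a and b meet at m, then continue to sink r.
PATHS = {"a": ["a", "m", "r"], "b": ["b", "m", "r"]}
LENGTH = {
    frozenset(("a", "m")): 1.0,
    frozenset(("b", "m")): 2.0,
    frozenset(("m", "r")): 5.0,
}

def steiner(S):          # perfect aggregation: pay c_e once if anyone uses e
    return 1.0 if S else 0.0

def buy_at_bulk(S):      # concave function of the number of users
    return math.sqrt(len(S))

def no_aggregation(S):   # every user pays separately
    return float(len(S))

for h in (steiner, buy_at_bulk, no_aggregation):
    print(h.__name__, single_sink_cost(PATHS, LENGTH, h))

# Facility variant (assumed data): both terminals stop at a facility opened at m.
FACILITY_PATHS = {"a": ["a", "m"], "b": ["b", "m"]}
OPEN_COST = {"m": 4.0, "r": 1.0}
print(facility_location_cost(FACILITY_PATHS, LENGTH, steiner, OPEN_COST))
```

Only the shared edge (m, r) distinguishes the three cost functions here, which shows how the choice of h(.) rewards routing users together.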

• Buy-at-bulk network design.

• Facility location with buy-at-bulk connection costs – includes uncapacitated facility location.

• Dependent Maybecast – generalization of Karger-Minkoff (KM00) maybecast problem.

• 2-stage stochastic Steiner tree problem.

• Well-approximates the multi-stage stochastic Steiner tree problem.

• Interval routing problem (Williamson et al.): each user has to send an interval to the root on a single path; cost of e = total length of intervals sent on it

General problem includes many interesting problem classes.

Our Results

• Give an O(log |V|)-approximation for the general problem using tree embeddings.

• Obtain a 4-approximation for Group Facility Location:
  – terminals divided into groups; cost of e = $c_e \cdot$ (# of groups using e)
  – have to open facilities and connect each group to open facilities via a Steiner forest.

Algorithm combines the [AKR, GW] algorithm for Steiner forest and the JV algorithm for facility location via a novel cleanup phase.

• Give an O(k)-approximation for Dependent Maybecast (probabilistic Steiner tree) with k-level distribution tree.

• Get a 2k-approximation for k-stage Steiner tree problem. Also obtained independently by Gupta-Pal-Ravi-Sinha ’05.

Dependent Maybecast

Probability distribution on subsets of terminals – determines which terminals to connect to root r. Want a simple communication scheme.

– Select a single t–r path for each terminal t;
– t will use this path to “talk” to the root when activated.

Goal: Minimize expected cost of edges used

= $E_S\big[c\big(\bigcup_{t \in S} \mathrm{path}(t)\big)\big] = \sum_e c_e \cdot p(A_e)$

$A_e$: set of terminals using edge e

$p(S)$: probability that at least one terminal in S is active

p(.) is submodular, so this is a special case of the single-sink problem. KM00 introduced the special case in which each terminal is activated independently.
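For the independent-activation case of KM00, p(S) has a closed form, and the edge-by-edge expression for the objective can be compared against a direct expectation over activation scenarios. The instance below (graph, paths, activation probabilities) is an assumed toy example, not taken from the talk.

```python
from itertools import product
from math import prod, isclose

def p_independent(S, q):
    """Probability that at least one terminal in S is active,
    when terminal t is active independently with probability q[t]."""
    return 1.0 - prod(1.0 - q[t] for t in S)

def path_edges(path):
    return set(zip(path, path[1:]))

def cost_by_edges(paths, length, q):
    """Sum over edges e of c_e * p(A_e)."""
    total = 0.0
    for e, c in length.items():
        A_e = [t for t, path in paths.items() if e in path_edges(path)]
        total += c * p_independent(A_e, q)
    return total

def cost_by_scenarios(paths, length, q):
    """Expected cost of the union of the paths of the active terminals."""
    terminals = list(paths)
    total = 0.0
    for pattern in product([False, True], repeat=len(terminals)):
        prob = prod(q[t] if on else 1.0 - q[t] for t, on in zip(terminals, pattern))
        used = set()
        for t, on in zip(terminals, pattern):
            if on:
                used |= path_edges(paths[t])
        total += prob * sum(length[e] for e in used)
    return total

# Assumed toy instance: a and b share the edge (m, r) on their way to root r.
PATHS = {"a": ["a", "m", "r"], "b": ["b", "m", "r"]}
LENGTH = {("a", "m"): 1.0, ("b", "m"): 2.0, ("m", "r"): 5.0}
Q = {"a": 0.5, "b": 0.2}

print(cost_by_edges(PATHS, LENGTH, Q), cost_by_scenarios(PATHS, LENGTH, Q))
```

The two numbers agree because an edge is paid for exactly when some terminal routed through it is active, which is the event whose probability p(A_e) measures.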

Tree-based distributions

$\Gamma$: Distribution tree with root $\rho$; leaves are terminals.

Distinct from the original graph. Each edge e is labeled with $p_e \in [0, 1]$ and is turned on independently with probability $p_e$.

Activated terminals = $\{t \in D : \text{all edges on the } \rho\text{–}t \text{ path are turned on}\}$

[Figure: example distribution tree with a probability on each edge; the root level is marked level(0).]


Karger-Minkoff model: 1-level tree.

With general trees, can model correlation between terminals.
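A tree-based distribution is easy to sample from, and p(S) can be computed exactly by a recursion over the tree. The sketch below assumes a dictionary representation (`node -> list of (child, p_e)`) and a small example tree; both are illustrative, as are the function names.

```python
import random
from math import isclose

# Assumed example: root "rho", internal nodes x and y, terminals t1, t2, t3.
TREE = {
    "rho": [("x", 0.5), ("y", 0.2)],
    "x": [("t1", 0.9), ("t2", 0.3)],
    "y": [("t3", 1.0)],
}

def sample_active(tree, node, rng):
    """Turn each edge below `node` on independently and return the
    leaves whose whole path from `node` is on."""
    children = tree.get(node, [])
    if not children:
        return {node}
    active = set()
    for child, p in children:
        if rng.random() < p:
            active |= sample_active(tree, child, rng)
    return active

def prob_some_active(tree, node, S):
    """Probability that at least one terminal of S below `node` is
    reached, given that `node` itself is reached."""
    children = tree.get(node, [])
    if not children:
        return 1.0 if node in S else 0.0
    miss = 1.0
    for child, p in children:
        miss *= 1.0 - p * prob_some_active(tree, child, S)
    return 1.0 - miss

print(sample_active(TREE, "rho", random.Random(0)))
print(prob_some_active(TREE, "rho", {"t1", "t2"}))
```

In this example t1 and t2 are each active with probability 0.45 and 0.15, but the probability that at least one is active is 0.465 rather than the 0.5325 that independence would give, because both depend on the shared edge into x.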

The Algorithm

Assume G is complete.

1. Sample from the distribution tree $\Gamma$.

2. Build MST T on {r} ∪ {sampled terminals}. Contract T.

3. Recurse (separately) on each subtree $\Gamma_\nu$ of $\Gamma$ (rooted at a child $\nu$ of the root), with
   graph = contracted graph,
   root = node containing r,
   distr. tree = subtree $\Gamma_\nu$,
   terminals = leaves of subtree $\Gamma_\nu$.

Algorithm (contd.)

Stage 0: Sample from the distribution tree $\Gamma$. Build MST T on {r} ∪ {sampled terminals}.

Stage i+1: Consider each node $\nu \in$ level(i+1).
– $\Gamma_\nu$: subtree rooted at $\nu$.
– $\nu_0 = \rho, \nu_1, \ldots, \nu_{i+1} = \nu$: nodes on the path from the root $\rho$ to $\nu$.
– Contract trees $T, T_{\nu_1}, \ldots, T_{\nu_i}$.
– Sample from $\Gamma_\nu$.
– Build MST $T_\nu$ in contracted graph on {r} ∪ {sampled terminals}.

Continue up to Stage k. Gives a tree which defines unique paths between terminals.
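The staged procedure can be sketched recursively. This is a hedged illustration rather than the authors' implementation: the tree representation, the point coordinates, and all function names are assumptions, and contraction is modelled by starting Prim's algorithm from the whole set of already-connected nodes (the `core`), which acts as the single node containing r.

```python
import math
import random

def sample_leaves(tree, node, rng):
    """Sample from the subtree at `node`: leaves reached via 'on' edges."""
    children = tree.get(node, [])
    if not children:
        return {node}
    reached = set()
    for child, p in children:
        if rng.random() < p:
            reached |= sample_leaves(tree, child, rng)
    return reached

def mst_from_core(core, new_nodes, dist):
    """Prim's algorithm in the graph where `core` is contracted to one node.
    Returns edges (u, v) with u already connected and v newly attached."""
    in_tree = set(core)
    remaining = set(new_nodes) - in_tree
    edges = []
    while remaining:
        u, v = min(((a, b) for a in in_tree for b in remaining),
                   key=lambda e: dist(*e))
        edges.append((u, v))
        in_tree.add(v)
        remaining.remove(v)
    return edges

def build_tree(tree, node, core, dist, rng):
    """One stage at `node`, then recurse separately on each child subtree."""
    sampled = sample_leaves(tree, node, rng)
    edges = mst_from_core(core, sampled, dist)
    new_core = set(core) | sampled            # contract the tree just built
    for child, _p in tree.get(node, []):
        edges += build_tree(tree, child, new_core, dist, rng)
    return edges

# Assumed example: a distribution tree and terminal positions in the plane.
TREE = {
    "rho": [("x", 0.5), ("y", 0.2)],
    "x": [("t1", 0.9), ("t2", 0.3)],
    "y": [("t3", 1.0)],
}
POS = {"r": (0.0, 0.0), "t1": (1.0, 0.0), "t2": (1.0, 1.0), "t3": (-2.0, 0.0)}

def dist(u, v):
    return math.dist(POS[u], POS[v])

edges = build_tree(TREE, "rho", {"r"}, dist, random.Random(7))
parent = {v: u for u, v in edges}   # each terminal's next hop toward r
print(parent)
```

Because the stage at a leaf always samples that leaf, every terminal is attached exactly once, so the returned edges form a tree and the `parent` map gives each terminal its unique path to r.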

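Analysis

Stage 0: Cost incurred = stage(0) = cost of the MST T.

Let $O_S$ = cost incurred by OPT on terminal set S.

$\mathrm{OPT} = E_S[O_S]$ and $\mathrm{MST}(S) \le 2 \cdot O_S$, so $E[\mathrm{stage}(0)] \le 2 \cdot \mathrm{OPT}$.

Stage i: Let $\nu \in$ level(i), and let $q_\nu$ = product of the $p_e$'s for edges on the path from the root $\rho$ to $\nu$ in the distribution tree: $q_\nu = p_{e_1} \times \cdots \times p_{e_i}$.

Tree $T_\nu$ is used only by terminals in the subtree rooted at $\nu$ ⇒ Pr[edge e of $T_\nu$ is used] ≤ $q_\nu$

Define stage i cost = stage(i) = $\sum_{\nu \in \mathrm{level}(i)} q_\nu \cdot c(T_\nu)$

Total cost ≤ $\sum_{i=0}^{k} \mathrm{stage}(i)$

Will show that $E[\mathrm{stage}(i+1)] \le E[\mathrm{stage}(i)]$, $0 \le i < k$

⇒ get a solution of expected cost ≤ $2(k+1) \cdot \mathrm{OPT}$.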
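Chaining the three facts on this slide gives the stated guarantee. Written out, with stage(i) and OPT as defined above:

```latex
\begin{align*}
E[\text{total cost}]
  &\le \sum_{i=0}^{k} E[\mathrm{stage}(i)]
  && \text{(an edge of a stage-$i$ tree is used with probability at most its } q\text{)} \\
  &\le (k+1)\, E[\mathrm{stage}(0)]
  && \text{(since } E[\mathrm{stage}(i+1)] \le E[\mathrm{stage}(i)] \text{ for } 0 \le i < k\text{)} \\
  &\le 2(k+1) \cdot \mathrm{OPT}
  && \text{(since } E[\mathrm{stage}(0)] \le 2 \cdot \mathrm{OPT}\text{)}.
\end{align*}
```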

Cost sharing

• $\xi(G, A, t)$ = t’s share in building a tree on A in graph G

• Defining $\xi(G, A, t)$: build an MST on A ∪ {r}. $\xi(G, A, t)$ = cost of edge connecting t to its parent, OR 0 if $t \notin A$

• Will cost-share trees built by the algorithm and compare expected total cost-shares across different stages. Cost-sharing idea first used in Gupta-Kumar-Pál-Roughgarden.

[Figure: MST rooted at r over the terminals in A, with a terminal t joined to its parent.]

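$E[\mathrm{stage}(i+1)] \le E[\mathrm{stage}(i)]$

Show this for i=0. Consider node $\nu \in$ level(1), and let e be the distribution-tree edge from the root to $\nu$.

Condition on set H' = nodes from $\Gamma'$ (the distribution tree outside the subtree of $\nu$) picked in stage 0. Let S = nodes “attached” to $\nu$ in stage 0. Same random process determines S in stage 0 and stage 1.

Cost share (CS) of S in stage 0 = 0 if e is not “on”, and $\sum_{t \in S} \xi(G, H' \cup S, t)$ otherwise.

CS of S in stage 1 = $\sum_{t \in S} \xi(G/H, S, t)$, where $H \supseteq H'$ = terminals activated in stage 0.

$\sum_{t \in S} \xi(G/H, S, t) = \mathrm{cost}(\mathrm{MST}(S) \text{ in } G/H) \le \sum_{t \in S} \xi(G, H' \cup S, t)$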
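The final inequality can be checked numerically on random instances. In the sketch below, cost shares are the parent-edge costs produced by Prim's algorithm, and contracting H is modelled by starting Prim from {r} ∪ H. The node names, the random planar points, and the particular choice of H', S, and H are assumptions made for illustration.

```python
import math
import random

def prim_shares(core, new_nodes, dist):
    """Build an MST attaching `new_nodes` to the contracted set `core`.
    Returns {t: cost of the edge joining t to its parent}."""
    in_tree = list(core)
    remaining = set(new_nodes) - set(core)
    share = {}
    while remaining:
        u, v = min(((a, b) for a in in_tree for b in remaining),
                   key=lambda e: dist(*e))
        share[v] = dist(u, v)
        in_tree.append(v)
        remaining.remove(v)
    return share

def check_once(rng):
    """Compare the stage-1 and stage-0 cost shares of S on one random instance."""
    pos = {name: (rng.random(), rng.random()) for name in "r a b c d e".split()}

    def dist(u, v):
        return math.dist(pos[u], pos[v])

    H_prime, S = {"a", "b"}, {"c", "d"}
    H = H_prime | {"e"}                      # any superset of H' works here
    stage0 = prim_shares({"r"}, H_prime | S, dist)
    cs_stage0 = sum(stage0[t] for t in S)    # shares of S in MST on H' ∪ S ∪ {r}
    cs_stage1 = sum(prim_shares({"r"} | H, S, dist).values())
    return cs_stage1 <= cs_stage0 + 1e-9

print(all(check_once(random.Random(seed)) for seed in range(50)))
```

The check succeeds on every instance for a structural reason: the parent edges of the terminals in S already connect S to the contracted node, so the MST in the contracted graph can cost no more than their sum.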

Stochastic Steiner Tree

Set of terminals to connect to the root is given by a distribution.

Can buy edges in stage I knowing only the distribution:
– pay cost of $c_e$, OR

Buy edges in stage II knowing the terminal set to connect:
– pay cost of $\lambda_A c_e$ in scenario A.

Choose which edges to buy in stage I so as to minimize expected total cost.

2-stage problem reduces to dep. maybecast with 2-level tree.

k-stage problem reduces to dep. maybecast with k-level tree.

Obtain a 2k-approximation algorithm for k-stage Steiner tree problem with black-box distribution, scenario-dependent costs.

Open Questions

• Better approximation for the general problem.

• Approx. algorithms for dependent maybecast with
  – arbitrary distributions with only (conditional) “black-box” sampling access
  – “graph-based” distributions.

• Approximation ratio independent of k for k-level dependent maybecast and k-stage Steiner tree.

• Cost oblivious network design: a single solution that is simultaneously near-optimal for every function h(.) in a given class. Goel-Estrin’04 designed a solution for all concave functions.

Thank You.
