Approximation Algorithm Design: A Case Study of MRCT


Upload: carlos-ruiz

Post on 30-Dec-2015



Shu-Te University, Department of Computer Science and Information Engineering

Bang Ye Wu (B. Y. Wu)

1988 – before studying algorithms

2000 – after studying algorithms

Ron Rivest

Adi Shamir

Leonard Adleman

RSA

Last year, after Prof. Chang gave a talk at NSYSU, a student asked me a question:

Why do people who work on algorithms all have white hair, while people who work on security tend to go bald?

Algorithm research and the NP-Complete Theorem

NP-hard: the barrier
• Since the results of Cook (1971), Levin (1973) and Karp (1972), many important problems have been shown to be NP-hard.

Cook: 1982 Turing Award

Karp: 1985 Turing Award

Levin

The NPC Theorem
• The name "NP-Complete" is due to Knuth (高德納)
• In their 1979 book Computers and Intractability: A Guide to the Theory of NP-Completeness, Garey and Johnson collected hundreds of important NPC problems. Today the NPC problems are too numerous to list.
• According to Wikipedia, in a 2002 poll of 100 researchers, 61 believed that NP is not equal to P, 9 believed that NP = P, 22 were unsure, and 8 believed that the question cannot be proved under the currently accepted axioms.

Knuth: 1974 Turing Award

Johnson

• For an NP-complete or NP-hard problem, we do not expect to find an efficient algorithm. Or maybe you are after the 1,000,000 USD prize.
• In the 70s, the life cycle of a problem:
– Defined
– NP-hard
– Heuristics, or algorithms for special data

A hard road, gradually fading
• Life finds a way
– Approximation
– Online
– Distributed
– Mobile
– New models
  • Quantum computing
  • Bio-computing

Approximation algorithms

• For optimization (min/max) problems
• Heuristic vs. approximation algorithms
– An approximation algorithm guarantees the worst-case quality
• The error
– Relative and absolute
– A function k of the input size n. A k-approximation:
  • minimization: sol/opt <= k
  • maximization: opt/sol <= k
• The ratio is always >= 1
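The definition of the ratio above can be written down directly. The following is a minimal sketch; the function names `approximation_ratio` and `is_k_approximation` are illustrative and do not come from the slides.

```python
def approximation_ratio(sol: float, opt: float, minimize: bool = True) -> float:
    """Return sol/opt for a minimization problem, opt/sol for a maximization problem."""
    if sol <= 0 or opt <= 0:
        raise ValueError("solution values must be positive")
    return sol / opt if minimize else opt / sol


def is_k_approximation(sol: float, opt: float, k: float, minimize: bool = True) -> bool:
    """True if the solution value is within a factor k of the optimum."""
    return approximation_ratio(sol, opt, minimize) <= k
```

With this convention the ratio of an optimal solution is exactly 1, and larger values mean worse solutions for both kinds of problem.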

The ultimate goal: polynomial time approximation scheme (PTAS)

• Some algorithms have a fixed ratio
• Approximation scheme: allows us to trade time for quality
– The more time, the better the quality
• PTAS: for any fixed k>0, it finds a (1+k)-approximation in polynomial time.
– Usually (1/k) appears in the time complexity, e.g. O(n/k), O(n^(1/k)).
– FPTAS if (1/k) is not in the exponent
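Stated for a minimization problem, and using the slide's k as the error parameter, the two notions can be summarized as follows. This is a sketch of the standard definitions; the running times shown are only examples of the two shapes.

```latex
\[
\text{PTAS:}\quad \forall k>0:\ \mathrm{sol} \le (1+k)\,\mathrm{opt},
\quad \text{time polynomial in } n \text{ for each fixed } k,\ \text{e.g. } O\!\left(n^{1/k}\right)
\]
\[
\text{FPTAS:}\quad \forall k>0:\ \mathrm{sol} \le (1+k)\,\mathrm{opt},
\quad \text{time polynomial in both } n \text{ and } 1/k,\ \text{e.g. } O\!\left(n/k\right)
\]
```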

The first PTAS (not sure)

• In Ronald L. Graham's 1969 paper on a scheduling problem (contribution also due to Knuth and another)

An example -- TSP
• Starting at a node, find a tour of minimum distance that visits all nodes and returns to the starting node.

[Figure: example graph with edge weights 10, 15, 8, 3, 2, 5, 2, 6, 10]

The doubling tree algorithm

• Find a minimum spanning tree
• Output the Euler tour in the doubling tree of the MST
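The two steps above can be sketched in Python as follows. This is a minimal sketch, not the slides' own code: the input is assumed to be a symmetric distance matrix, and all function names are illustrative. The `shortcut` step, which skips repeated nodes, is an addition that the slide does not mention; it cannot lengthen the tour only when the distances satisfy the triangle inequality.

```python
import heapq


def mst_children(dist):
    """Prim's algorithm on a symmetric distance matrix.

    Returns (children, weight): children[u] lists u's children in the MST
    rooted at node 0, and weight is the total MST edge length.
    """
    n = len(dist)
    children = [[] for _ in range(n)]
    in_tree = [False] * n
    weight = 0
    heap = [(0, 0, 0)]  # (edge length, node, parent); the root is its own parent
    while heap:
        w, u, p = heapq.heappop(heap)
        if in_tree[u]:
            continue
        in_tree[u] = True
        weight += w
        if u != p:
            children[p].append(u)
        for v in range(n):
            if not in_tree[v]:
                heapq.heappush(heap, (dist[u][v], v, u))
    return children, weight


def euler_walk(children, root=0):
    """Euler tour of the doubled tree: every tree edge is walked down and back up."""
    walk = [root]
    for c in children[root]:
        walk.extend(euler_walk(children, c))
        walk.append(root)
    return walk


def shortcut(walk):
    """Keep only the first visit of each node, then return to the start."""
    seen, tour = set(), []
    for v in walk:
        if v not in seen:
            seen.add(v)
            tour.append(v)
    tour.append(walk[0])
    return tour


def tour_length(dist, tour):
    """Total length of consecutive hops along the given node sequence."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))
```

The Euler walk has length exactly twice the MST weight, because each tree edge is traversed once in each direction.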

[Figures: the example graph shown two more times with the same edge weights (10, 15, 8, 3, 2, 5, 2, 6, 10)]

The error ratio
• MST <= TSP
– MST is the minimum cost of any spanning tree.
– A tour must contain a spanning tree since it is connected.
• It is a 2-approximation
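Written out, the argument is a short chain of inequalities. Here w(·) denotes total edge length, an auxiliary notation not used on the slide. Removing one edge from an optimal tour leaves a spanning path, which weighs at least as much as the MST, and the Euler tour walks every MST edge exactly twice:

```latex
\[
w(\text{Euler tour of the doubled MST}) \;=\; 2\,w(\mathrm{MST}) \;\le\; 2\,w(\mathrm{TSP})
\]
```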

Optimum communication spanning tree problems

OCT: definition
• Input:
– an undirected graph with nonnegative edge lengths
– a nonnegative requirement for each pair of vertices
• Output:
– a spanning tree minimizing the total communication cost summed over all pairs of vertices, in which the cost of a vertex pair is the distance multiplied by their requirement; that is, we want to minimize Σ_{i,j} λ_{i,j} d_T(i,j)

First studied by T. C. Hu (SICOMP, 1974).
First approximation appeared in Wong (1980).
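The objective can be evaluated for a given spanning tree as in the sketch below. The representation is an assumption: the tree is a list of `(u, v, length)` edges on nodes `0..n-1`, `req[i][j]` plays the role of λ_{i,j}, and the sum runs over unordered pairs i < j. The function names are illustrative.

```python
from collections import deque


def tree_distances(n, tree_edges, source):
    """Distances from source to every node along the tree (tree paths are unique)."""
    adj = [[] for _ in range(n)]
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = [None] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, w in adj[u]:
            if dist[v] is None:
                dist[v] = dist[u] + w
                queue.append(v)
    return dist


def communication_cost(n, tree_edges, req):
    """Sum of req[i][j] * d_T(i, j) over unordered pairs i < j."""
    total = 0
    for i in range(n):
        d = tree_distances(n, tree_edges, i)
        for j in range(i + 1, n):
            total += req[i][j] * d[j]
    return total
```

This only scores a candidate tree; finding the tree that minimizes the score is the hard part of the problem.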

A way to a PTAS: a case study of the MRCT problem

Optimum Communication Spanning Trees

Minimum routing cost spanning trees

• A spanning tree with minimum all-to-all distance
• NP-hard in the strong sense
• A tree with short edges may have a large routing cost
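The last point can be checked on a small instance. The sketch below computes the routing cost over ordered pairs, using the fact that a tree edge separating s nodes from the other n - s nodes lies on the paths of 2·s·(n - s) ordered pairs. The function name and the 5-node instance in the usage note are illustrative assumptions, not taken from the slides.

```python
def routing_cost(n, tree_edges):
    """Sum of tree distances over all ordered pairs (u, v), u != v."""
    adj = [[] for _ in range(n)]
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    parent = [-1] * n
    parent_w = [0] * n
    seen = [False] * n
    seen[0] = True
    order, stack = [], [0]
    while stack:  # depth-first search from node 0; parents are recorded before children
        u = stack.pop()
        order.append(u)
        for v, w in adj[u]:
            if not seen[v]:
                seen[v] = True
                parent[v] = u
                parent_w[v] = w
                stack.append(v)
    size = [1] * n
    cost = 0
    for u in reversed(order):  # children are handled before their parents
        if parent[u] >= 0:
            size[parent[u]] += size[u]
            cost += 2 * size[u] * (n - size[u]) * parent_w[u]
    return cost
```

For example, take a complete graph on 5 nodes in which the path edges 0-1, 1-2, 2-3, 3-4 have length 10 and every other edge has length 11. The path is a minimum spanning tree of total length 40, yet `routing_cost(5, [(0, 1, 10), (1, 2, 10), (2, 3, 10), (3, 4, 10)])` is larger than the routing cost of the star `[(2, 0, 11), (2, 1, 10), (2, 3, 10), (2, 4, 11)]`, whose total length is 42.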

Approximation – comparing with a trivial lower bound
• A lower bound
– d(T,u,v) >= d(G,u,v) (the distance in the tree is at least the shortest-path distance in the original graph)
– OPT >= Σ d(G,u,v)
• The median of G: a node m minimizing Σ_v d(G,m,v)
– Since min <= mean, Σ_v d(G,m,v) <= (1/n) Σ d(G,u,v)
• Y: a shortest path tree rooted at m
– d(Y,i,j) <= d(Y,i,m) + d(Y,m,j)
– Σ d(Y,u,v) <= 2n Σ_v d(G,m,v) <= 2 Σ d(G,u,v) <= 2·OPT
• A shortest path tree rooted at the median is a 2-approximation of the MRCT.

[Figure: nodes i and j connected through the root m]
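The algorithm described above, a shortest path tree rooted at the median, can be sketched as follows. The assumptions are: the graph is connected and undirected with nonnegative edge lengths, it is given as an adjacency dictionary, and the nodes are comparable values such as integers. The function names are illustrative.

```python
import heapq


def dijkstra(adj, source):
    """Return (dist, parent) dictionaries for shortest paths from source."""
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent


def median_shortest_path_tree(adj):
    """adj: {node: [(neighbor, length), ...]}. Return (median, tree edges)."""
    best = None
    for m in adj:
        dist, parent = dijkstra(adj, m)
        total = sum(dist.values())  # sum over v of d(G, m, v)
        if best is None or total < best[0]:
            best = (total, m, parent)
    _, median, parent = best
    edges = [(p, v) for v, p in parent.items() if p is not None]
    return median, edges
```

Running Dijkstra from every node is the simplest way to find the median. Ties between candidate medians are broken by iteration order, which does not affect the 2-approximation argument.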

To find an approximation
• A lower bound of the optimum
• An algorithm
• Analyze the worst-case ratio

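Solution decomposition
• Suppose T is an optimal solution (OPT). We transform T into another solution Y such that
– the cost of Y is not much larger than that of T
– Y belongs to a special class of solutions whose best member can be found in polynomial time
• Note: we cannot actually know Y. Y never appears in the algorithm; it only serves as an intermediary in the analysis.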

Metric MRCT
• For ease of understanding, we consider only the metric case
• The input is a metric graph: a complete graph with edge lengths satisfying the triangle inequality

[Figure label: >= n/2]

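Metric MRCT
• Suppose T is OPT and r is the centroid of T
– A centroid of a tree is a node whose removal leaves subtrees that each contain no more than half of the nodes
• When computing the cost, d(T,r,v) is counted at least n times
– opt >= n Σ_v d(T,r,v)
• Let Y be the star centered at r
– C(Y) = 2(n-1) Σ_v d(Y,r,v)
– Y is a 2-approximation

[Figure: nodes r and v]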
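The step from these two bullets to the factor 2 uses the triangle inequality: in a metric graph the star edge (r,v) is no longer than the tree path from r to v, so d(Y,r,v) <= d(T,r,v). Combining this with the lower bound on opt gives:

```latex
\[
C(Y) \;=\; 2(n-1)\sum_{v} d(Y,r,v)
\;\le\; 2(n-1)\sum_{v} d(T,r,v)
\;\le\; \frac{2(n-1)}{n}\,\mathrm{opt}
\;<\; 2\,\mathrm{opt}
\]
```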

• Using solution decomposition, we have shown that
– there exists a star that is a 2-approximation
• Trying all stars exhaustively (there are n of them) and taking the best one must give a 2-approximation

• Can we do better?
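Trying all n stars is straightforward. The sketch below assumes the metric graph is given as a symmetric distance matrix and measures routing cost over ordered pairs, matching C(Y) = 2(n-1) Σ_v d(Y,r,v). The name `best_star` is illustrative.

```python
def best_star(dist):
    """Return (center, routing cost) of the cheapest star in a metric graph."""
    n = len(dist)
    best_center, best_cost = None, None
    for r in range(n):
        # In a star, edge (r, v) lies on the paths of 2(n-1) ordered pairs.
        cost = 2 * (n - 1) * sum(dist[r][v] for v in range(n) if v != r)
        if best_cost is None or cost < best_cost:
            best_center, best_cost = r, cost
    return best_center, best_cost
```

The algorithm never needs to know the centroid of the optimal tree: the star centered there is one of the n candidates, so the best star is at least as good.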

• Separator of a tree:
– The centroid is a ½-separator
• How does the 2-approximation algorithm work?
– Guess (try all possible) the separator
– Connect the others greedily
– Distance increases only for nodes in the same branch -- we don't pay too much

δ-separator

• To get a better result, we try to generalize the centroid to a general δ-separator
• Indeed, when δ↘, the error↘
• But it costs too much to obtain the exact δ-separator for δ<1/2.
– For example, a 1/3-separator may have n/3 nodes

[Figure: 1/3-separator; labels n/3, n/3]

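When the subordinates are sacrificed, the boss should be sacrificed too

• We don't need a perfect separator
– Only some critical nodes are necessary
  • Leaves of the separator (to make sure the subordinates have a good place to attach to)
  • Branch nodes of the separator (to preserve the structure)

[Figure: δ-separator]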

To a k-Star
• k-star: a tree with at most k internal nodes
• Need some other work to show the ratio (a phrase like this usually means there is something too painful to look at behind it)

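Solution decomposition
• Starting from an OPT, we transform it into a k-star and prove that this k-star is a good approximation
• Design an algorithm that finds the best k-star; since it is the best, it is certainly no worse than the transformed one
• A refined analysis is important: "if it is good, you must be able to say so"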

• 3-star => 1.5-approximation
• k-star => (k+3)/(k+1)-approximation
• The best k-star for fixed k can be found in polynomial time
• We have a PTAS
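To see why this is a PTAS, note that (k+3)/(k+1) = 1 + 2/(k+1), so any target ratio 1 + eps is reached once k >= 2/eps - 1. The sketch below only evaluates this arithmetic; it does not search for the best k-star, and the function names are illustrative.

```python
import math
from fractions import Fraction


def k_star_ratio(k: int) -> Fraction:
    """Approximation ratio (k+3)/(k+1) guaranteed by the best k-star."""
    return Fraction(k + 3, k + 1)


def smallest_k_for(eps: Fraction) -> int:
    """Smallest k >= 1 with (k+3)/(k+1) <= 1 + eps, i.e. k >= 2/eps - 1."""
    return max(1, math.ceil(Fraction(2) / eps - 1))
```

For instance, `smallest_k_for(Fraction(1, 2))` gives k = 3, matching the 1.5-approximation of the 3-star. Since k is fixed once eps is fixed, the time to find the best k-star stays polynomial in n.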

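Some lessons from experience
• Evolutionary tree reconstruction
– Given a distance matrix for n species, find a tree with these n species as leaves such that the tree distance between every pair of species is >= the given distance, while minimizing the total distance
• This problem is harder, because the internal nodes of the tree can be chosen arbitrarily
• Steiner tree vs. spanning tree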

[Diagram labels: tree-driven SP-alignments; SP (Sum of Pairs); Tree-alignment; Multiple sequence alignment; Ultrametric tree; Without auxiliary node; Minimum increment with L1-norm; Additive tree; Evolutionary tree reconstruction; Computational biology]

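• Spent a lot of time studying Steiner trees
• Worked on the spanning case first
– MRCT
• Found the separator method
– (15/8)-approx => 1.577 => 1.5 => 4/3+
– Two kinds of extension

• On general graphs, this method cannot do better than 4/3+
• The difficulty lies in being restricted to shortest path trees
• On metric graphs it might be possible to do better
• But it was not yet known whether the metric graph case is NP-hard
– I was really fed up with proving NP-completeness

• Leafed through Garey & Johnson's book again and again
– Seemingly far away, yet right before my eyes
– Transform the general case to the metric case
– This not only settled the NP-hardness question, but also proved that an approximation for the metric case can be used for the general case

• Found the k-star method
• An unexpected episode
– Research is very competitive
– On tenterhooks, unable to sleep
– The moment the answer was revealed

• In 1997, I achieved something I had not been able to achieve even in my dreams for two years

• More extensions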

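• Doing research is hoping for one success among thousands of failures
• Doing administration is waiting for one failure among thousands of successes
• The road of research is fascinating, and it is even better with companions (being a student is a blessing!)

• Prof. Lee told me: there is no plan, only a direction

• So it is with research, and so it is with life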

Thanks

Q&A