Design and Analysis of Algorithms - AI Lab (ailab.cs.nchu.edu.tw/course/algorithms/104/al06.pdf)
TRANSCRIPT
1
Design and Analysis of
Algorithms
演算法設計與分析
Lecture 6
October 21, 2015
洪國寶
2
Homework # 5
1. 28.2-6 (p. 741) / 4.2-7 (p. 83)
2. -- / 15.3-6 (p. 390): Imagine that you wish to exchange one currency for another. You realize that instead of directly exchanging one currency for another, you might be better off making a series of trades through other currencies, winding up with the currency you want. Suppose that you can trade n different currencies, numbered 1, 2, …, n, where you start with currency 1 and wish to wind up with currency n. You are given, for each pair of currencies i and j, an exchange rate rij, meaning that if you start with d units of currency i, you can trade for d·rij units of currency j. A sequence of trades may entail a commission, which depends on the number of trades you make. Let ck be the commission that you are charged when you make k trades. Show that, if ck = 0 for all k = 1, 2, …, n, then the problem of finding the best sequence of exchanges from currency 1 to currency n exhibits optimal substructure. Then show that if the commissions ck are arbitrary values, the problem does not necessarily exhibit optimal substructure.
3. 15.4-5 (p. 356) / 15.4-5 (p. 397)
4. A string S is a cyclic rotation of a string T if S and T have the same length and S consists of a suffix of T followed by a prefix of T. Given two strings, design an efficient algorithm to determine whether one string is a cyclic rotation of another.
Due October 28, 2015
3
Outline
• Review
• Greedy method
– Activity-selection algorithm
– Basic elements of greedy approach
– Examples where greedy approach does not work
– More examples (optimal merge pattern, Huffman code)
– Theoretical foundation (matroid)
– Task scheduling
4
Review: Dynamic Programming
• Dynamic programming
– A metatechnique, not an algorithm
(like divide & conquer)
– The word “programming” is historical and
predates computer programming
• Typically applied to optimization problems
• Use when problem breaks down into
recurring small subproblems
5
Review: Dynamic Programming vs.
Divide and Conquer
• Divide-and-Conquer
– Partition the problem into independent subproblems, solve the subproblems recursively, and then combine their solutions to solve the original problem.
• Dynamic Programming
– Applicable when the subproblems are not independent, that is, when subproblems share subsubproblems.
– Solves every subsubproblem just once and then saves its answer in a table, thereby avoiding the work of recomputing the answer every time the subsubproblem is encountered
6
Review: Development of A
Dynamic-Programming Algorithm
1. Characterize the structure of an optimal solution
2. Recursively define the value of an optimal
solution
3. Compute the value of an optimal solution in a
bottom-up fashion
4. Construct an optimal solution from computed
information
7
Matrix chain multiplication
• Given a sequence (chain) <A1, A2, …, An> of n matrices to be multiplied, compute the product A1A2…An in a way that minimizes the number of scalar multiplications.
• Every way of multiplying the matrices corresponds to a parenthesization.
• Impractical to check all possible parenthesizations
The number of alternative parenthesizations P(n) satisfies
P(n) = 1 if n = 1
P(n) = Σk=1..n−1 P(k)·P(n−k) if n ≥ 2
(P(n) = Cn−1, the (n−1)st Catalan number, which grows as Ω(2^n))
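A minimal memoized sketch of this recurrence in Common Lisp (the language of the GAS demo later in this lecture; count-parens is a hypothetical name, not from the slides, and a fresh memo table is created per top-level call):

(defun count-parens (n &optional (memo (make-hash-table)))
  ;; Number of ways to fully parenthesize a chain of n matrices,
  ;; computed top-down from the recurrence above.
  (cond ((= n 1) 1)
        ((gethash n memo))                       ; reuse a cached answer
        (t (setf (gethash n memo)
                 (loop for k from 1 to (- n 1)   ; split between position k and k+1
                       sum (* (count-parens k memo)
                              (count-parens (- n k) memo)))))))

;; (count-parens 4) => 5 and (count-parens 5) => 14, the Catalan numbers.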
8
Review: Elements of Dynamic
Programming
Optimal substructure: an optimal
solution to the problem contains within
it optimal solutions to subproblems
(for DP to be applicable)
Overlapping subproblems: the space of
subproblems must be small
(for algorithm to be efficient)
9
Common Pattern in Discovering
Optimal Substructure
• Show that a solution to the problem consists of making a choice. Making the choice leaves one or more subproblems to be solved.
• Suppose that for a given problem, the choice that leads to an optimal solution is available.
– Given this optimal choice, determine which subproblems ensue and how to best characterize the resulting space of subproblems
– Show that the solutions to the subproblems used within the optimal solution to the problem must themselves be optimal, using a “cut-and-paste” argument and proof by contradiction
10
Review: Memoization
• A variation of dynamic programming that often offers the efficiency of the usual dynamic-programming approach while maintaining a top-down strategy
– Memoize the natural, but inefficient, recursive algorithm
– Maintain a table with subproblem solutions, but the control structure for filling in the table is more like the recursive algorithm
11
Review: LCS Algorithm
• If |X| = m and |Y| = n, then there are 2^m subsequences of X; checking whether each one is a subsequence of Y takes O(n) time
• So the running time of the brute-force algorithm is O(n·2^m)
• Notice that the LCS problem has optimal substructure: solutions of subproblems are parts of the final solution.
• Subproblems: “find LCS of pairs of prefixes of X and Y”
12
Review: LCS Algorithm Running Time
• The LCS algorithm calculates the value of each entry of the table c[0..m, 0..n]
• The running time is
O(mn)
since each c[i,j] is calculated in
constant time, and there are mn
elements in the array
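As a concrete illustration (not part of the original slides), here is a Common Lisp sketch of the O(mn) table-filling step; lcs-length is a hypothetical helper name:

(defun lcs-length (x y)
  ;; Fill the (m+1) x (n+1) table c bottom-up; c[i][j] is the LCS length
  ;; of the prefixes x[0..i-1] and y[0..j-1].
  (let* ((m (length x)) (n (length y))
         (c (make-array (list (1+ m) (1+ n)) :initial-element 0)))
    (loop for i from 1 to m do
      (loop for j from 1 to n do
        (setf (aref c i j)
              (if (char= (char x (1- i)) (char y (1- j)))
                  (1+ (aref c (1- i) (1- j)))
                  (max (aref c (1- i) j) (aref c i (1- j)))))))
    (aref c m n)))

;; (lcs-length "ABCBDAB" "BDCABA") => 4, matching the CLRS example (an LCS is "BCBA").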
13
Review: Optimal Polygon Triangulation
• Input: a convex polygon P=(v0, v1, …, vn-1)
Output: an optimal triangulation
14
Optimal Polygon Triangulation vs.
Matrix-chain multiplication
• Optimal Polygon Triangulation
t[i,j] = 0 if i = j
t[i,j] = min{ t[i,k] + t[k+1,j] + w(△ vi−1 vk vj) : i ≤ k < j } if i < j
• Matrix-chain multiplication
m[i,j] = 0 if i = j
m[i,j] = min{ m[i,k] + m[k+1,j] + pi−1 pk pj : i ≤ k < j } if i < j
15
Dynamic Programming (TSP)
• Traveling salesperson problem (revisit)
- Optimal substructure? (subproblems)
• g(i,S) = min{ cij + g(j, S − {j}) : j ∈ S }
= the length of a shortest path starting at vertex i, going through all vertices in S, and terminating at vertex 1
- Overlapping subproblems?
• N = number of distinct values g(i,S) = (n−1)·2^(n−2)
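A bitmask rendering of this recurrence (a sketch, not from the slides; tsp-cost is a hypothetical name, c is assumed to be an n x n cost array, and vertices are renumbered 0..n−1 so the tour starts and ends at vertex 0 instead of vertex 1):

(defun tsp-cost (c)
  ;; Held-Karp: g(i,S) = min over j in S of c[i][j] + g(j, S - {j}),
  ;; with the set S represented as a bitmask.
  (let* ((n (array-dimension c 0))
         (memo (make-hash-table :test #'equal)))
    (labels ((g (i s)             ; shortest path from i through S, ending at 0
               (if (zerop s)
                   (aref c i 0)
                   (or (gethash (cons i s) memo)
                       (setf (gethash (cons i s) memo)
                             (loop for j from 1 below n
                                   when (logbitp j s)
                                   minimize (+ (aref c i j)
                                               (g j (logandc2 s (ash 1 j))))))))))
      (g 0 (- (ash 1 n) 2)))))    ; start with S = {1, ..., n-1}

;; (tsp-cost #2A((0 10 15 20) (10 0 35 25) (15 35 0 30) (20 25 30 0))) => 80.
;; The memo table holds the (n-1)*2^(n-2) overlapping subproblems counted above.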
16
Outline
• Review
• Greedy method
– Activity-selection algorithm
– Basic elements of greedy approach
– Examples where greedy approach does not work
– More examples (optimal merge pattern, Huffman code)
– Theoretical foundation (matroid)
– Task scheduling
17
Algorithm design techniques
• So far, we’ve looked at the following design
techniques:
- induction (incremental approach)
- divide and conquer
- augmenting data structures
- dynamic programming
• Coming up: greedy method
18
Dynamic Programming VS.
Greedy Algorithms
• Dynamic programming uses optimal substructure in a bottom-up fashion
– First find optimal solutions to subproblems and, having solved the subproblems, we find an optimal solution to the problem
• Greedy algorithms use optimal substructure in a top-down fashion
– First make a choice (the choice that looks best at the time) and then solve the resulting subproblem ■
19
Greedy Algorithms
• A greedy algorithm always makes the
choice that looks best at the moment
– The hope: a locally optimal choice will lead to
a globally optimal solution
– For some problems, it works
• Dynamic programming can be overkill;
greedy algorithms tend to be easier to code ■
20
Example: Activity-Selection
• Formally:
– Given a set S of n activities
si = start time of activity i
fi = finish time of activity i
– Find max-size subset A of compatible activities:
for all i, j ∈ A, [si, fi) and [sj, fj) do not overlap
[Figure: six sample activities drawn as intervals on a timeline]
21
Activity Selection: Optimal Substructure
• Let k be the minimum activity in A (i.e., the one with the earliest finish time). Then A − {k} is an optimal solution to S′ = {i ∈ S : si ≥ fk}
– In words: once activity k is selected, the problem reduces to finding an optimal solution for activity selection over the activities in S compatible with k
– Proof: if we could find a solution B′ to S′ with |B′| > |A − {k}|,
• then B′ ∪ {k} is compatible
• and |B′ ∪ {k}| > |A|, contradicting the optimality of A ■
22
Greedy Choice Property
• Dynamic programming? Memoize?
• Activity selection problem also exhibits the
greedy choice property:
– Locally optimal choice ⇒ globally optimal sol’n
– Thm 16.1: if S is an activity-selection problem sorted by finish time, then there is an optimal solution A ⊆ S such that activity 1 ∈ A
• Sketch of proof: if an optimal solution B does not contain activity 1, we can always replace the first activity in B with activity 1 (Why?). Same number of activities, thus optimal. ■
23
Activity Selection: A Greedy Algorithm
• Intuition is simple:
– Always pick the activity with the earliest
finish time available at the time
– GAS(S)
1 if S=NIL then return NIL
2 else return {k} U GAS(S’)
where k is the activity in S with smallest f, and
S’ = {i ∈ S : si ≥ fk}
24
Activity Selection: A Greedy Algorithm
• GAS(S)
1 if S=NIL then return NIL
2 else return {k} U GAS(S’)
where k is the activity in S with smallest f, and
S’ = {i ∈ S : si ≥ fk}
Proof of correctness: use blackboard
25
Activity Selection: A LISP program
• GAS(S)
1 if S=NIL then return NIL
2 else return {k} U GAS(S’)
where k is the activity in S with smallest f, and
S’ = {i ∈ S : si ≥ fk}
(defun gas (l)                    ; l = remaining activities, sorted by finish time
  (print l)                       ; show the current S' at each call
  (cond ((null l) nil)
        (T (cons (car l)          ; greedily take the earliest-finishing activity
                 (gas (filter (nth 1 (car l)) (cdr l)))))))

(defun filter (s l)               ; keep only activities whose start time is >= s
  (cond ((null l) nil)
        ((> s (nth 0 (car l))) (filter s (cdr l)))
        (T (cons (car l) (filter s (cdr l))))))
26
Activity Selection: A LISP program
(cont.)
(defvar *activity* '((1 4) (3 5) (0 6) (5 7) (3 8) (5 9) (6 10) (8 11) (8 12) (2 13) (12 14)))
#<BUFFERED FILE-STREAM CHARACTER #P"gas.txt" @1>
[3]>
*activity*
((1 4) (3 5) (0 6) (5 7) (3 8) (5 9) (6 10) (8 11) (8 12) (2 13) (12 14))
[4]> (gas *activity*)
((1 4) (3 5) (0 6) (5 7) (3 8) (5 9) (6 10) (8 11) (8 12) (2 13) (12 14))
((5 7) (5 9) (6 10) (8 11) (8 12) (12 14))
((8 11) (8 12) (12 14))
((12 14))
NIL
((1 4) (5 7) (8 11) (12 14))
[5]> (dribble)
(The successive printed lists show S’ for each iteration, produced by the (print l) call in gas.)
27
Activity Selection: A Greedy Algorithm
• If we sort the activities by finish time then
the algorithm is simple (iterative instead of
recursive):
– Sort the activities by finish time
– Schedule the first activity
– Then schedule the next activity in sorted list
which starts after previous activity finishes
– Repeat until no more activities
Note: we don’t have to construct S’ explicitly. Why?
28
Greedy-Activity-Selector
Assume (w.l.o.g.) that
f1 ≤ f2 ≤ … ≤ fn
• Time complexity: O(n)
(given the pre-sorted input)
• Exercise 1: Proof of
correctness by loop
invariant.
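In Common Lisp, the iterative version might look like this (a sketch in the style of the GAS demo above; greedy-activity-selector is a hypothetical name, and the input is assumed to be a non-empty list of (start finish) pairs already sorted by finish time):

(defun greedy-activity-selector (acts)
  ;; Keep an activity iff it starts no earlier than the finish time
  ;; of the most recently selected activity.
  (let ((chosen (list (first acts))))
    (dolist (a (rest acts) (nreverse chosen))
      (when (>= (first a) (second (first chosen)))
        (push a chosen)))))

;; (greedy-activity-selector *activity*) => ((1 4) (5 7) (8 11) (12 14)),
;; the same answer as the recursive GAS demo.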
29
Activity Selection
30
Other greedy choices
• Greedy-Activity-Selector always picks the activity with the earliest finish time available at the time (smallest f)
• Some other greedy choices:
– Largest f
– Largest/Smallest s
– Largest/Smallest (f-s)
– Fewest overlapping ■
Exercise 2: Which of these criteria result in optimal solutions?
31
A Variation of the Problem
• Instead of maximizing the number of activities we
want to schedule, we want to maximize the total
time the resource is in use.
• None of the obvious greedy choices would work:
■Choose the activity that starts earliest/latest
■Choose the activity that finishes earliest/latest
■Choose the longest activity ■
Exercise 3: Design an efficient algorithm for this variation of the activity-selection problem.
32
Elements of Greedy Strategy
Greedy choice property:
An optimal solution can be obtained by making choices that seem best at the time, without considering their implications for solutions to subproblems.
Optimal substructure:
An optimal solution can be obtained by augmenting the partial solution constructed so far with an optimal solution of the remaining subproblem. ■
33
Steps in designing
greedy algorithms
1. Cast the optimization problem as one in
which we make a choice and are left with
one subproblem to solve.
2. Prove the greedy choice property.
3. Demonstrate optimal substructure. ■
34
Examples where greedy approach
does not work
• Traveling salesman problem: nearest
neighbor, closest pair
• Matrix-chain multiplication: multiply the
two matrices with lowest cost first
• Knapsack problem: largest values,
largest value per unit weight ■
35
Traveling salesman problem
• Correctness is not obvious (nearest neighbor) [Figure: animation frame 2/7]
36
Traveling salesman problem
• Correctness is not obvious [Figure: animation frame 3/7]
37
Matrix-chain multiplication
• Greedy choice: multiply the two matrices
with lowest cost first
• Example 1: <A1, A2, A3> (10*100, 100*5,
5*50)
• Example 2: <A1, A2, A3, A4, A5, A6 >
(30*35, 35*15, 15*5, 5*10, 10*20, 20*25)
38
Matrix-chain multiplication
• Example 1: <A1, A2, A3> (10×100, 100×5, 5×50)
– ((A1A2)A3): 10·100·5 + 10·5·50 = 5000 + 2500 = 7500
– (A1(A2A3)): 100·5·50 + 10·100·50 = 25000 + 50000 = 75000
– Here the greedy choice (A1A2 costs 5000 < 25000 for A2A3) happens to give the optimal order
39
Matrix-chain multiplication
• Example 2: <A1, A2, A3, A4, A5, A6> (30×35, 35×15, 15×5, 5×10, 10×20, 20×25)
– The greedy choice multiplies A3A4 first: 15·5·10 = 750
– But the optimal parenthesization is ((A1(A2A3))((A4A5)A6)) (Figure 15.3), which never multiplies A3A4 directly
40
41
Knapsack problem
Given some items, pack the knapsack to get
the maximum total value. Each item has some
weight and some value. Total weight that we can
carry is no more than some fixed number W.
So we must consider weights of items as well as
their value.
Item #   Weight   Value
1        1        8
2        3        6
3        5        5
42
Knapsack Problem
• A thief breaks into a jewelry store carrying a
knapsack. Given n items S = {item1,… ,itemn},
each having a weight wi and value vi, which
items should the thief put into the knapsack
with capacity W to obtain the maximum value?
43
Knapsack problem
There are two versions of the problem:
(1) “0-1 knapsack problem”: items are indivisible; you either take an item or not.
Solved with dynamic programming
(2) “Fractional knapsack problem”: items are divisible; you can take any fraction of an item.
Solved with a greedy algorithm.
44
0-1 Knapsack Problem
• This problem requires a subset A of S to be determined such that
Σ itemi∈A vi
is maximized subject to
Σ itemi∈A wi ≤ W
• The naïve approach is to generate all 2^n possible subsets of S and find the maximum value, requiring Ω(2^n) time.
45
Does Greedy Approach Work?
• Strategy 1:
– Steal the items with the largest values.
– E.g. w=[25, 10, 10], v=[$10, $9, $9], W=30.
– Value is 10, although the optimal value is 18.
w1=25 w2=10 w3=10
v1 = $10 v2 = $9 v3 = $9
46
Does Greedy Approach Work?
• Strategy 2:
– Steal the items with the largest value per unit
weight.
– E.g. w=[5, 20,10], v=[$50, $140, $60], W=30.
– Value is 190, although optimal would be 200.
w1=5, w2=20, w3=10
v1=$50, v2=$140, v3=$60
47
NP-hard Problems(Knapsack, Traveling Salesman,...)
• The two approaches above cannot yield the optimal
result for 0-1 knapsack.
• This problem is NP-hard even when all items are of the same kind, that is, itemi is described by wi only (vi = wi).
• Observation: Greedy approach to the fractional
knapsack problem yields the optimal solution.
– E.g. (see 2nd example)
50 + 140 + (5/10)*60 = 220
where 5 is the remaining capacity of the knapsack
48
Dynamic Programming Approach for the
Knapsack Problem
• We can find optimal solution for 0-1 knapsack problem by DP,
but not in polynomial-time.
• Let A be an optimal subset with items from
{item1,… ,itemi} only. There are two cases:
1) A contains itemi. Then the total value for A is equal to vi plus the optimal value obtainable from the first i−1 items, where the total weight cannot exceed W − wi.
2) A does not contain itemi. Then the total value for A is equal to that of the optimal subset chosen from the first i−1 items, with total weight not exceeding W.
Q: What are the subproblems?
49
Dynamic Programming Approach
for the Knapsack Problem
• A[i][w] = max(A[i−1][w], vi + A[i−1][w−wi]) if w ≥ wi
A[i][w] = A[i−1][w] if w < wi
• The maximum value is A[n][W]
• Running time is O(nW), not polynomial in n. Q: WHY?
[Figure: the table A with rows 1, …, n and columns 1, …, W, filled row by row]
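A direct Common Lisp rendering of this table (a sketch, not from the slides; knapsack-01 is a hypothetical name, and cap plays the role of W to avoid clashing with the column index w):

(defun knapsack-01 (ws vs cap)
  ;; ws and vs are parallel lists of weights and values; fill the
  ;; (n+1) x (cap+1) table a row by row and return a[n][cap].
  (let* ((n (length ws))
         (a (make-array (list (1+ n) (1+ cap)) :initial-element 0)))
    (loop for i from 1 to n
          for wi in ws
          for vi in vs
          do (loop for w from 0 to cap
                   do (setf (aref a i w)
                            (if (>= w wi)
                                (max (aref a (1- i) w)
                                     (+ vi (aref a (1- i) (- w wi))))
                                (aref a (1- i) w)))))
    (aref a n cap)))

;; (knapsack-01 '(25 10 10) '(10 9 9) 30) => 18, the optimum from Strategy 1's example.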
50
Fractional knapsack problem
• Greedy approach: taking items in order of
greatest value per pound
• Optimal for the fractional version (why?),
but not for the 0-1 version
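A short Common Lisp sketch of the fractional rule (not from the slides; fractional-knapsack is a hypothetical name, and items is assumed to be a list of (weight value) pairs):

(defun fractional-knapsack (items cap)
  ;; Sort by value per unit weight, take items whole while they fit,
  ;; and take a fraction of the first item that does not.
  (loop for (w v) in (sort (copy-list items) #'>
                           :key (lambda (it) (/ (second it) (first it))))
        while (plusp cap)
        sum (let ((take (min w cap)))
              (decf cap take)
              (* v (/ take w)))))

;; (fractional-knapsack '((5 50) (20 140) (10 60)) 30) => 220,
;; the value computed on the NP-hard slide above.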
51
Optimal merge pattern
52
Optimal merge pattern
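The two slides above present the optimal merge pattern through figures that are not preserved in this transcript. The greedy rule being illustrated (repeatedly merge the two smallest files, the same shape as Huffman's algorithm below) can be sketched as follows; optimal-merge-cost is a hypothetical name, and the cost counts every record moved in every merge:

(defun optimal-merge-cost (lengths)
  ;; Repeatedly replace the two smallest file lengths by their sum,
  ;; accumulating the cost of each merge.
  (let ((s (sort (copy-list lengths) #'<)))
    (loop while (rest s)
          sum (let ((z (+ (pop s) (pop s))))
                (setf s (merge 'list (list z) s #'<))
                z))))

;; (optimal-merge-cost '(20 30 10 5)) => 115: merge 5+10, then 15+20, then 35+30.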
53
Huffman codes
• Compact Text Encoding
• Code that can be decoded
• Huffman encoding
54
Compact Text Encoding
• The goal is to develop a code that represents a given text as compactly as possible.
• A standard encoding is ASCII, which represents every character using 7 bits:
• “An English sentence”
1000001 (A) 1101110 (n) 0100000 ( ) 1000101 (E) 1101110 (n) 1100111 (g) 1101100 (l) 1101001 (i) 1110011 (s) 1101000 (h) 0100000 ( ) 1110011 (s) 1100101 (e) 1101110 (n) 1110100 (t) 1100101 (e) 1101110 (n) 1100011 (c) 1100101 (e)
• This requires 133 bits ≈ 17 bytes
55
Compact Text Encoding (Cont.)
• Of course, this is wasteful because we can encode 12 characters in 4 bits:
‹space› = 0000 A = 0001 E = 0010 c = 0011 e = 0100 g = 0101 h = 0110 i = 0111 l = 1000 n = 1001 s = 1010 t = 1011
• Then we encode the phrase as
0001 (A) 1001 (n) 0000 ( ) 0010 (E) 1001 (n) 0101 (g) 1000 (l) 0111 (i) 1010 (s) 0110 (h) 0000 ( ) 1010 (s) 0100 (e) 1001 (n) 1011 (t) 0100 (e) 1001 (n) 0011 (c) 0100 (e)
• This requires 76 bits ≈ 10 bytes
56
Compact Text Encoding (Cont.)
• An even better code is given by the following encoding:
‹space› = 000 A = 0010 E = 0011 s = 010 c = 0110 g = 0111 h = 1000 i = 1001 l = 1010 t = 1011 e = 110 n = 111
• Then we encode the phrase as
0010 (A) 111 (n) 000 ( ) 0011 (E) 111 (n) 0111 (g) 1010 (l) 1001 (i) 010 (s) 1000 (h) 000 ( ) 010 (s) 110 (e) 111 (n) 1011 (t) 110 (e) 111 (n) 0110 (c) 110 (e)
• This requires 65 bits ≈ 9 bytes
57
Code that can be decoded
• Fixed-length codes:
– Every character is encoded using the same number of bits.
– To determine the boundaries between characters, we form groups of w bits, where w is the length of a character.
– Examples:
• ASCII
• Our first improved code
• Prefix codes:
– No character's codeword is a prefix of another character's codeword.
– Examples:
• Fixed-length codes
• Huffman codes ■
58
Why Prefix Codes?
• Consider a code that is not a prefix code:
a = 01 m = 10 n = 111 o = 0 r = 11 s = 1 t = 0011
• Now you send a fan letter to your favorite movie star. One of the sentences is “You are a star.” You encode “star” as “1 0011 01 11”.
• Your idol receives the letter and decodes the text using your coding table:
100110111 = 10 0 11 0 111 = “moron”
• Prefix codes are unambiguous. (See next slide) ■
59
Why Are Prefix Codes Unambiguous?
It suffices to show that the first character can be decoded unambiguously. We then remove this character and are left with the problem of decoding the first character of the remaining text, and so on until the whole text has been decoded.
Assume that there are two characters c and c' that could potentially be the first character in the text, with encodings x0 x1 … xk and y0 y1 … yl. Assume further that k ≤ l. Since both c and c' can occur at the beginning of the text, we have xi = yi for 0 ≤ i ≤ k; that is, x0 x1 … xk is a prefix of y0 y1 … yl. This contradicts the assumption that we have a prefix code, so prefix codes are unambiguous. ■
60
Representing a Prefix-Code Dictionary
• In the example:
‹space› = 000 A = 0010 E = 0011 s = 010 c = 0110 g = 0111
h = 1000 i = 1001 l = 1010 t = 1011 e = 110 n = 111
[Figure: the binary trie for this code; edges to left children are labeled 0, edges to right children 1, and the leaves from left to right are ‹space›, A, E, s, c, g, h, i, l, t, e, n]
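To make the unambiguity concrete, here is a small Common Lisp decoding sketch (not from the slides; decode-prefix is a hypothetical helper, and the bit string is assumed to be a valid encoding):

(defun decode-prefix (bits codes)
  ;; codes: alist of (character . codeword). The prefix property guarantees
  ;; that at most one codeword matches at each position, so greedy
  ;; matching is unambiguous.
  (let ((i 0) (out '()))
    (loop while (< i (length bits))
          do (loop for (ch . code) in codes
                   when (and (<= (+ i (length code)) (length bits))
                             (string= bits code :start1 i
                                      :end1 (+ i (length code))))
                   do (push ch out)
                      (incf i (length code))
                      (return)))
    (coerce (nreverse out) 'string)))

;; (decode-prefix "0010111000"
;;                '((#\A . "0010") (#\n . "111") (#\s . "010") (#\Space . "000")))
;; => "An " (A, n, space).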
61
Figure 16.3/ Figure 16.4
62
63
Huffman code
64
The Cost of Huffman Code
• Let C be the set of characters in the text to be encoded, and let
f(c) be the frequency of character c.
• Let dT(c) be the depth of node (character) c in the tree
representing the code. Then
B(T) = Σ c∈C f(c)·dT(c)
is the number of bits required to encode the text using the code represented by tree T. We call B(T) the cost of tree T.
• Observation: In a tree T representing an optimal prefix code,
every internal node has two children. ■
65
Greedy Choice
• Lemma: There exists an optimal prefix
code such that the two characters with
smallest frequency are siblings and have
maximal depth in T. ■
66
Greedy Choice
• Proof: Let x and y be two such characters, and let T be a tree representing an optimal prefix code. Let a and b be two sibling leaves of maximal depth in T. Assume w.l.o.g. that f(x) ≤ f(y) and f(a) ≤ f(b). This implies that f(x) ≤ f(a) and f(y) ≤ f(b). Let T' be the tree obtained by exchanging a and x and b and y. (Continued on the next slide.)
[Figure: tree T with sibling leaves a, b at maximal depth, and tree T' after exchanging x with a and y with b]
67
Greedy Choice
The cost difference between trees T and T' is
B(T) − B(T') = (f(a) − f(x))(dT(a) − dT(x)) + (f(b) − f(y))(dT(b) − dT(y)) ≥ 0,
since f(x) ≤ f(a), f(y) ≤ f(b), and a and b have maximal depth in T. Hence T' is also optimal, and in it the two least-frequent characters are sibling leaves of maximal depth. ■
[Figure: trees T and T' after the exchange]
68
Optimal Substructure
• After joining two nodes x and y by making them children of a new node z, the algorithm treats z as a leaf with frequency f(z) = f(x) + f(y).
• Let C' be the character set in which x and y are replaced by the single character z with frequency f(z) = f(x) + f(y), and let T' be an optimal tree for C'.
• Let T be the tree obtained from T' by making x and y children of z.
• We observe the following relationship between B(T) and B(T'):
B(T) = B(T') + f(x) + f(y), since dT(x) = dT(y) = dT'(z) + 1. (Continued on the next slide.)
69
70
Lemma: If T' is optimal for C', then T is optimal for C.
Assume the contrary. Then there exists a better tree T'' for C.
Also, there exists a tree T''' at least as good as T'' for C where x and y are sibling
leaves of maximal depth.
The removal of x and y from T''' turns their parent into a leaf; we can associate this
leaf with z.
The cost of the resulting tree is B(T''') – f(x) – f(y) < B(T) – f(x) – f(y) = B(T').
This contradicts the optimality of B(T').
Hence, T must be optimal for C. ■
71
72
Huffman code (Remarks)
• Assume that the string is generated by a memoryless source: regardless of the past, the next character in the string is c with probability f(c).
Then the Huffman code is optimal.
• Can we do better?
73
Huffman code (Remarks)
• Huffman encodes fixed length blocks.
– What if we vary them?
• Huffman uses one fixed encoding throughout.
– What if the data's characteristics change over time?
• What if the data has structure?
– For example: raster images, video
• Huffman is lossless.
– Necessary?
• LZW, MPEG, etc. ■
74
Huffman code (Implementation)
• Time complexity and data structure:
Let S be the set of n weights (nodes).
Constructing a Huffman code based on the greedy strategy can be described as follows:
Repeat until |S|=1
Find two min nodes x and y from S and remove them from S
Construct a new node z with weight w(z)=w(x)+w(y) and
insert z into S
75
Huffman code (Implementation)
Why data structures are important?
• An algorithm for constructing Huffman code (Tree):
Repeat until |S| = 1
– Find two min nodes x and y from S and remove them from S
– Construct a new node z with weight w(z)=w(x)+w(y) and insert z into S
• The time complexity of the algorithm depends on how S is implemented.
• Data structure for S    find two min    insert      total
linked list               O(n)            O(1)        O(n^2)
sorted array              O(1)            O(n)        O(n^2)
?                         O(log n)        O(log n)    O(n log n)
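As a concrete sketch of the loop above (not from the slides), here is a Common Lisp version that keeps S as a sorted list, i.e. the O(n^2) variant from the table; a heap would bring it to O(n log n). huffman-tree and huffman-codes are hypothetical names:

(defun huffman-tree (freqs)
  ;; freqs is a list of (character . weight) pairs. Leaves are (weight char);
  ;; internal nodes are (weight left right). Returns the final code tree.
  (let ((s (sort (loop for (c . w) in freqs collect (list w c))
                 #'< :key #'first)))
    (loop while (rest s)
          do (let* ((x (pop s)) (y (pop s))       ; the two minimum nodes
                    (z (list (+ (first x) (first y)) x y)))
               (setf s (merge 'list (list z) s #'< :key #'first))))
    (first s)))

(defun huffman-codes (node &optional (prefix ""))
  ;; Walk the tree: 0 on left branches, 1 on right branches.
  (if (= (length node) 2)                         ; leaf: (weight char)
      (list (cons (second node) prefix))
      (append (huffman-codes (second node) (concatenate 'string prefix "0"))
              (huffman-codes (third node) (concatenate 'string prefix "1")))))

;; With the textbook's example frequencies a:45 b:13 c:12 d:16 e:9 f:5,
;; (huffman-codes (huffman-tree '((#\a . 45) (#\b . 13) (#\c . 12)
;;                                (#\d . 16) (#\e . 9) (#\f . 5))))
;; yields a=0, c=100, b=101, f=1100, e=1101, d=111.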
76
We will cover heaps in Lecture 8
77
Greedy method: Recap
• Greedy algorithms are efficient algorithms for optimization problems that exhibit two properties:
– Greedy choice property: an optimal solution can be obtained by making locally optimal choices.
– Optimal substructure: an optimal solution contains within it optimal solutions to smaller subproblems.
• If only optimal substructure is present, dynamic programming may be a viable approach; that is, the greedy choice property is what allows us to obtain faster algorithms than what can be obtained using dynamic programming. ■
78
Questions?