Design and Analysis of Algorithms - AI Lab (ailab.cs.nchu.edu.tw/course/algorithms/104/al06.pdf)
TRANSCRIPT
1
Design and Analysis of
Algorithms
演算法設計與分析
Lecture 6
October 21, 2015
洪國寶
2
Homework # 5
1. 28.2-6 (p. 741) / 4.2-7 (p. 83)
2. -- / 15.3-6 (p. 390): Imagine that you wish to exchange one currency for another. You realize that instead of directly exchanging one currency for another, you might be better off making a series of trades through other currencies, winding up with the currency you want. Suppose that you can trade n different currencies, numbered 1, 2, …, n, where you start with currency 1 and wish to wind up with currency n. You are given, for each pair of currencies i and j, an exchange rate rij, meaning that if you start with d units of currency i, you can trade for d·rij units of currency j. A sequence of trades may entail a commission, which depends on the number of trades you make. Let ck be the commission that you are charged when you make k trades. Show that, if ck = 0 for all k = 1, 2, …, n, then the problem of finding the best sequence of exchanges from currency 1 to currency n exhibits optimal substructure. Then show that if the commissions ck are arbitrary values, the problem does not necessarily exhibit optimal substructure.
3. 15.4-5 (p. 356) / 15.4-5 (p. 397)
4. A string S is a cyclic rotation of a string T if S and T have the same length and S consists of a suffix of T followed by a prefix of T. Given two strings, design an efficient algorithm to determine whether one string is a cyclic rotation of another.
Due October 28, 2015
3
Outline
• Review
• Greedy method
– Activity-selection algorithm
– Basic elements of greedy approach
– Examples where greedy approach does not work
– More examples (optimal merge pattern, Huffman code)
– Theoretical foundation (matroid)
– Task scheduling
4
Review: Dynamic Programming
• Dynamic programming
– A metatechnique, not an algorithm
(like divide & conquer)
– The word “programming” is historical and
predates computer programming
• Typically applied to optimization problems
• Use when problem breaks down into
recurring small subproblems
5
Review: Dynamic Programming vs.
Divide and Conquer
• Divide-and-Conquer
– Partition the problem into independent subproblems, solve the subproblems recursively, and then combine their solutions to solve the original problem.
• Dynamic Programming
– Applicable when the subproblems are not independent, that is, when subproblems share subsubproblems.
– Solves every subsubproblem just once and then saves its answer in a table, thereby avoiding the work of recomputing the answer every time the subsubproblem is encountered
6
Review: Development of A
Dynamic-Programming Algorithm
1. Characterize the structure of an optimal solution
2. Recursively define the value of an optimal
solution
3. Compute the value of an optimal solution in a
bottom-up fashion
4. Construct an optimal solution from computed
information
7
Matrix chain multiplication
• Given a sequence (chain) <A1, A2, …, An> of n matrices to be multiplied, compute the product A1A2…An in a way that minimizes the number of scalar multiplications.
• Every way of multiplying the matrices corresponds to a parenthesization.
• Impractical to check all possible parenthesizations
The number of alternative parenthesizations P(n) satisfies
P(n) = 1 if n = 1
P(n) = Σk=1..n−1 P(k)·P(n−k) if n ≥ 2
(P(n) = Cn−1, the (n−1)st Catalan number, which grows as Ω(2^n))
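A minimal memoized sketch of this recurrence in Common Lisp (the language of the GAS demo later in this lecture; count-parens is a hypothetical name, not from the slides, and a fresh memo table is created per top-level call):

(defun count-parens (n &optional (memo (make-hash-table)))
  ;; Number of ways to fully parenthesize a chain of n matrices,
  ;; computed top-down from the recurrence above.
  (cond ((= n 1) 1)
        ((gethash n memo))                       ; reuse a cached answer
        (t (setf (gethash n memo)
                 (loop for k from 1 to (- n 1)   ; split between position k and k+1
                       sum (* (count-parens k memo)
                              (count-parens (- n k) memo)))))))

;; (count-parens 4) => 5 and (count-parens 5) => 14, the Catalan numbers.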
8
Review: Elements of Dynamic
Programming
Optimal substructure: an optimal
solution to the problem contains within
it optimal solutions to subproblems
(for DP to be applicable)
Overlapping subproblems: the space of
subproblems must be small
(for algorithm to be efficient)
9
Common Pattern in Discovering
Optimal Substructure
• Show that a solution to the problem consists of making a choice. Making the choice leaves one or more subproblems to be solved.
• Suppose that for a given problem, the choice that leads to an optimal solution is available.
– Given this optimal choice, determine which subproblems ensue and how to best characterize the resulting space of subproblems
– Show that the solutions to the subproblems used within the optimal solution to the problem must themselves be optimal, using a “cut-and-paste” argument and proof by contradiction
10
Review: Memoization
• A variation of dynamic programming that often offers the efficiency of the usual dynamic-programming approach while maintaining a top-down strategy
– Memoize the natural, but inefficient, recursive algorithm
– Maintain a table with subproblem solutions, but the control structure for filling in the table is more like the recursive algorithm
11
Review: LCS Algorithm
• If |X| = m and |Y| = n, then there are 2^m subsequences of X; checking whether each one is a subsequence of Y takes O(n) time
• So the running time of the brute-force algorithm is O(n·2^m)
• Notice that the LCS problem has optimal substructure: solutions of subproblems are parts of the final solution.
• Subproblems: “find LCS of pairs of prefixes of X and Y”
12
Review: LCS Algorithm Running Time
• The LCS algorithm calculates the value of each entry of the table c[0..m, 0..n]
• The running time is
O(mn)
since each c[i,j] is calculated in
constant time, and there are mn
elements in the array
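As a concrete illustration (not part of the original slides), here is a Common Lisp sketch of the O(mn) table-filling step; lcs-length is a hypothetical helper name:

(defun lcs-length (x y)
  ;; Fill the (m+1) x (n+1) table c bottom-up; c[i][j] is the LCS length
  ;; of the prefixes x[0..i-1] and y[0..j-1].
  (let* ((m (length x)) (n (length y))
         (c (make-array (list (1+ m) (1+ n)) :initial-element 0)))
    (loop for i from 1 to m do
      (loop for j from 1 to n do
        (setf (aref c i j)
              (if (char= (char x (1- i)) (char y (1- j)))
                  (1+ (aref c (1- i) (1- j)))
                  (max (aref c (1- i) j) (aref c i (1- j)))))))
    (aref c m n)))

;; (lcs-length "ABCBDAB" "BDCABA") => 4, matching the CLRS example (an LCS is "BCBA").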
13
Review: Optimal Polygon Triangulation
• Input: a convex polygon P=(v0, v1, …, vn-1)
Output: an optimal triangulation
14
Optimal Polygon Triangulation vs.
Matrix-chain multiplication
• Optimal Polygon Triangulation
t[i,j] = 0 if i = j
t[i,j] = min{ t[i,k] + t[k+1,j] + w(△ vi−1 vk vj) : i ≤ k < j } if i < j
• Matrix-chain multiplication
m[i,j] = 0 if i = j
m[i,j] = min{ m[i,k] + m[k+1,j] + pi−1 pk pj : i ≤ k < j } if i < j
15
Dynamic Programming (TSP)
• Traveling salesperson problem (revisit)
- Optimal substructure? (subproblems)
• g(i,S) = min{ cij + g(j, S − {j}) : j ∈ S }
= the length of a shortest path starting at vertex i, going through all vertices in S, and terminating at vertex 1
- Overlapping subproblems?
• N = number of distinct values g(i,S) = (n−1)·2^(n−2)
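A bitmask rendering of this recurrence (a sketch, not from the slides; tsp-cost is a hypothetical name, c is assumed to be an n x n cost array, and vertices are renumbered 0..n−1 so the tour starts and ends at vertex 0 instead of vertex 1):

(defun tsp-cost (c)
  ;; Held-Karp: g(i,S) = min over j in S of c[i][j] + g(j, S - {j}),
  ;; with the set S represented as a bitmask.
  (let* ((n (array-dimension c 0))
         (memo (make-hash-table :test #'equal)))
    (labels ((g (i s)             ; shortest path from i through S, ending at 0
               (if (zerop s)
                   (aref c i 0)
                   (or (gethash (cons i s) memo)
                       (setf (gethash (cons i s) memo)
                             (loop for j from 1 below n
                                   when (logbitp j s)
                                   minimize (+ (aref c i j)
                                               (g j (logandc2 s (ash 1 j))))))))))
      (g 0 (- (ash 1 n) 2)))))    ; start with S = {1, ..., n-1}

;; (tsp-cost #2A((0 10 15 20) (10 0 35 25) (15 35 0 30) (20 25 30 0))) => 80.
;; The memo table holds the (n-1)*2^(n-2) overlapping subproblems counted above.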
16
Outline
• Review
• Greedy method
– Activity-selection algorithm
– Basic elements of greedy approach
– Examples where greedy approach does not work
– More examples (optimal merge pattern, Huffman code)
– Theoretical foundation (matroid)
– Task scheduling
17
Algorithm design techniques
• So far, we’ve looked at the following design
techniques:
- induction (incremental approach)
- divide and conquer
- augmenting data structures
- dynamic programming
• Coming up: greedy method
18
Dynamic Programming VS.
Greedy Algorithms
• Dynamic programming uses optimal substructure in a bottom-up fashion
– First find optimal solutions to subproblems and, having solved the subproblems, we find an optimal solution to the problem
• Greedy algorithms use optimal substructure in a top-down fashion
– First make a choice (the choice that looks best at the time) and then solve the resulting subproblem ■
19
Greedy Algorithms
• A greedy algorithm always makes the
choice that looks best at the moment
– The hope: a locally optimal choice will lead to
a globally optimal solution
– For some problems, it works
• Dynamic programming can be overkill;
greedy algorithms tend to be easier to code ■
20
Example: Activity-Selection
• Formally:
– Given a set S of n activities
si = start time of activity i
fi = finish time of activity i
– Find max-size subset A of compatible activities:
for all i, j ∈ A, [si, fi) and [sj, fj) do not overlap
[Figure: six sample activities drawn as intervals on a timeline]
21
Activity Selection: Optimal Substructure
• Let k be the minimum activity in A (i.e., the one with the earliest finish time). Then A − {k} is an optimal solution to S′ = {i ∈ S : si ≥ fk}
– In words: once activity k is selected, the problem reduces to finding an optimal solution for activity selection over the activities in S compatible with k
– Proof: if we could find a solution B′ to S′ with |B′| > |A − {k}|,
• then B′ ∪ {k} is compatible
• and |B′ ∪ {k}| > |A|, contradicting the optimality of A ■
22
Greedy Choice Property
• Dynamic programming? Memoize?
• Activity selection problem also exhibits the
greedy choice property:
– Locally optimal choice ⇒ globally optimal sol’n
– Thm 16.1: if S is an activity-selection problem sorted by finish time, then there is an optimal solution A ⊆ S such that activity 1 ∈ A
• Sketch of proof: if an optimal solution B does not contain activity 1, we can always replace the first activity in B with activity 1 (Why?). Same number of activities, thus optimal. ■
23
Activity Selection: A Greedy Algorithm
• Intuition is simple:
– Always pick the activity with the earliest
finish time available at the time
– GAS(S)
1 if S=NIL then return NIL
2 else return {k} U GAS(S’)
where k is the activity in S with smallest f, and
S’ = {i ∈ S : si ≥ fk}
24
Activity Selection: A Greedy Algorithm
• GAS(S)
1 if S=NIL then return NIL
2 else return {k} U GAS(S’)
where k is the activity in S with smallest f, and
S’ = {i ∈ S : si ≥ fk}
Proof of correctness: use blackboard
25
Activity Selection: A LISP program
• GAS(S)
1 if S=NIL then return NIL
2 else return {k} U GAS(S’)
where k is the activity in S with smallest f, and
S’ = {i ∈ S : si ≥ fk}
(defun gas (l)                    ; l = remaining activities, sorted by finish time
  (print l)                       ; show the current S' at each call
  (cond ((null l) nil)
        (T (cons (car l)          ; greedily take the earliest-finishing activity
                 (gas (filter (nth 1 (car l)) (cdr l)))))))

(defun filter (s l)               ; keep only activities whose start time is >= s
  (cond ((null l) nil)
        ((> s (nth 0 (car l))) (filter s (cdr l)))
        (T (cons (car l) (filter s (cdr l))))))
26
Activity Selection: A LISP program
(cont.)
(defvar *activity* '((1 4) (3 5) (0 6) (5 7) (3 8) (5 9) (6 10) (8 11) (8 12) (2 13) (12 14)))
#<BUFFERED FILE-STREAM CHARACTER #P"gas.txt" @1>
[3]>
*activity*
((1 4) (3 5) (0 6) (5 7) (3 8) (5 9) (6 10) (8 11) (8 12) (2 13) (12 14))
[4]> (gas *activity*)
((1 4) (3 5) (0 6) (5 7) (3 8) (5 9) (6 10) (8 11) (8 12) (2 13) (12 14))
((5 7) (5 9) (6 10) (8 11) (8 12) (12 14))
((8 11) (8 12) (12 14))
((12 14))
NIL
((1 4) (5 7) (8 11) (12 14))
[5]> (dribble)
(The successive printed lists show S’ for each iteration, produced by the (print l) call in gas.)
27
Activity Selection: A Greedy Algorithm
• If we sort the activities by finish time then
the algorithm is simple (iterative instead of
recursive):
– Sort the activities by finish time
– Schedule the first activity
– Then schedule the next activity in sorted list
which starts after previous activity finishes
– Repeat until no more activities
Note: we don’t have to construct S’ explicitly. Why?
28
Greedy-Activity-Selector
Assume (w.l.o.g.) that
f1 ≤ f2 ≤ … ≤ fn
• Time complexity: O(n)
(given the pre-sorted input)
• Exercise 1: Proof of
correctness by loop
invariant.
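In Common Lisp, the iterative version might look like this (a sketch in the style of the GAS demo above; greedy-activity-selector is a hypothetical name, and the input is assumed to be a non-empty list of (start finish) pairs already sorted by finish time):

(defun greedy-activity-selector (acts)
  ;; Keep an activity iff it starts no earlier than the finish time
  ;; of the most recently selected activity.
  (let ((chosen (list (first acts))))
    (dolist (a (rest acts) (nreverse chosen))
      (when (>= (first a) (second (first chosen)))
        (push a chosen)))))

;; (greedy-activity-selector *activity*) => ((1 4) (5 7) (8 11) (12 14)),
;; the same answer as the recursive GAS demo.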
29
Activity Selection
30
Other greedy choices
• Greedy-Activity-Selector always picks the activity with the earliest finish time available at the time (smallest f)
• Some other greedy choices:
– Largest f
– Largest/Smallest s
– Largest/Smallest (f-s)
– Fewest overlapping ■
Exercise 2: Which of these criteria result in optimal solutions?
31
A Variation of the Problem
• Instead of maximizing the number of activities we
want to schedule, we want to maximize the total
time the resource is in use.
• None of the obvious greedy choices would work:
■Choose the activity that starts earliest/latest
■Choose the activity that finishes earliest/latest
■Choose the longest activity ■
Exercise 3: Design an efficient algorithm for this variation of the activity-selection problem.
32
Elements of Greedy Strategy
Greedy choice property:
An optimal solution can be obtained by making choices that seem best at the time, without considering their implications for solutions to subproblems.
Optimal substructure:
An optimal solution can be obtained by augmenting the partial solution constructed so far with an optimal solution of the remaining subproblem. ■
33
Steps in designing
greedy algorithms
1. Cast the optimization problem as one in
which we make a choice and are left with
one subproblem to solve.
2. Prove the greedy choice property.
3. Demonstrate optimal substructure. ■
34
Examples where greedy approach
does not work
• Traveling salesman problem: nearest
neighbor, closest pair
• Matrix-chain multiplication: multiply the
two matrices with lowest cost first
• Knapsack problem: largest values,
largest value per unit weight ■
35
Traveling salesman problem
• Correctness is not obvious (nearest neighbor) [Figure: animation frame 2/7]
36
Traveling salesman problem
• Correctness is not obvious [Figure: animation frame 3/7]
37
Matrix-chain multiplication
• Greedy choice: multiply the two matrices
with lowest cost first
• Example 1: <A1, A2, A3> (10*100, 100*5,
5*50)
• Example 2: <A1, A2, A3, A4, A5, A6 >
(30*35, 35*15, 15*5, 5*10, 10*20, 20*25)
38
Matrix-chain multiplication
• Example 1: <A1, A2, A3> (10×100, 100×5, 5×50)
– ((A1A2)A3): 10·100·5 + 10·5·50 = 5000 + 2500 = 7500
– (A1(A2A3)): 100·5·50 + 10·100·50 = 25000 + 50000 = 75000
– Here the greedy choice (A1A2 costs 5000 < 25000 for A2A3) happens to give the optimal order
39
Matrix-chain multiplication
• Example 2: <A1, A2, A3, A4, A5, A6> (30×35, 35×15, 15×5, 5×10, 10×20, 20×25)
– The greedy choice multiplies A3A4 first: 15·5·10 = 750
– But the optimal parenthesization is ((A1(A2A3))((A4A5)A6)) (Figure 15.3), which never multiplies A3A4 directly
40
41
Knapsack problem
Given some items, pack the knapsack to get
the maximum total value. Each item has some
weight and some value. Total weight that we can
carry is no more than some fixed number W.
So we must consider weights of items as well as
their value.
Item #   Weight   Value
1        1        8
2        3        6
3        5        5
42
Knapsack Problem
• A thief breaks into a jewelry store carrying a
knapsack. Given n items S = {item1,… ,itemn},
each having a weight wi and value vi, which
items should the thief put into the knapsack
with capacity W to obtain the maximum value?
43
Knapsack problem
There are two versions of the problem:
(1) “0-1 knapsack problem”: items are indivisible; you either take an item or not.
Solved with dynamic programming
(2) “Fractional knapsack problem”: items are divisible; you can take any fraction of an item.
Solved with a greedy algorithm.
44
0-1 Knapsack Problem
• This problem requires a subset A of S to be determined such that
Σ itemi∈A vi
is maximized subject to
Σ itemi∈A wi ≤ W
• The naïve approach is to generate all 2^n possible subsets of S and find the maximum value, requiring Ω(2^n) time.
45
Does Greedy Approach Work?
• Strategy 1:
– Steal the items with the largest values.
– E.g. w=[25, 10, 10], v=[$10, $9, $9], W=30.
– Value is 10, although the optimal value is 18.
w1=25 w2=10 w3=10
v1 = $10 v2 = $9 v3 = $9
46
Does Greedy Approach Work?
• Strategy 2:
– Steal the items with the largest value per unit
weight.
– E.g. w=[5, 20,10], v=[$50, $140, $60], W=30.
– Value is 190, although optimal would be 200.
w1=5, w2=20, w3=10
v1=$50, v2=$140, v3=$60
47
NP-hard Problems(Knapsack, Traveling Salesman,...)
• The two approaches above cannot yield the optimal
result for 0-1 knapsack.
• This problem is NP-hard even when all items are of the same kind, that is, itemi is described by wi only (vi = wi).
• Observation: Greedy approach to the fractional
knapsack problem yields the optimal solution.
– E.g. (see 2nd example)
50 + 140 + (5/10)*60 = 220
where 5 is the remaining capacity of the knapsack
48
Dynamic Programming Approach for the
Knapsack Problem
• We can find optimal solution for 0-1 knapsack problem by DP,
but not in polynomial-time.
• Let A be an optimal subset with items from
{item1,… ,itemi} only. There are two cases:
1) A contains itemi. Then the total value for A is equal to vi plus the optimal value obtainable from the first i−1 items, where the total weight cannot exceed W − wi.
2) A does not contain itemi. Then the total value for A is equal to that of the optimal subset chosen from the first i−1 items, with total weight not exceeding W.
Q: What are the subproblems?
49
Dynamic Programming Approach
for the Knapsack Problem
• A[i][w] = max(A[i−1][w], vi + A[i−1][w−wi]) if w ≥ wi
A[i][w] = A[i−1][w] if w < wi
• The maximum value is A[n][W]
• Running time is O(nW), not polynomial in n. Q: WHY?
[Figure: the table A with rows 1, …, n and columns 1, …, W, filled row by row]
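A direct Common Lisp rendering of this table (a sketch, not from the slides; knapsack-01 is a hypothetical name, and cap plays the role of W to avoid clashing with the column index w):

(defun knapsack-01 (ws vs cap)
  ;; ws and vs are parallel lists of weights and values; fill the
  ;; (n+1) x (cap+1) table a row by row and return a[n][cap].
  (let* ((n (length ws))
         (a (make-array (list (1+ n) (1+ cap)) :initial-element 0)))
    (loop for i from 1 to n
          for wi in ws
          for vi in vs
          do (loop for w from 0 to cap
                   do (setf (aref a i w)
                            (if (>= w wi)
                                (max (aref a (1- i) w)
                                     (+ vi (aref a (1- i) (- w wi))))
                                (aref a (1- i) w)))))
    (aref a n cap)))

;; (knapsack-01 '(25 10 10) '(10 9 9) 30) => 18, the optimum from Strategy 1's example.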
50
Fractional knapsack problem
• Greedy approach: taking items in order of
greatest value per pound
• Optimal for the fractional version (why?),
but not for the 0-1 version
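A short Common Lisp sketch of the fractional rule (not from the slides; fractional-knapsack is a hypothetical name, and items is assumed to be a list of (weight value) pairs):

(defun fractional-knapsack (items cap)
  ;; Sort by value per unit weight, take items whole while they fit,
  ;; and take a fraction of the first item that does not.
  (loop for (w v) in (sort (copy-list items) #'>
                           :key (lambda (it) (/ (second it) (first it))))
        while (plusp cap)
        sum (let ((take (min w cap)))
              (decf cap take)
              (* v (/ take w)))))

;; (fractional-knapsack '((5 50) (20 140) (10 60)) 30) => 220,
;; the value computed on the NP-hard slide above.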
51
Optimal merge pattern
52
Optimal merge pattern
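The two slides above present the optimal merge pattern through figures that are not preserved in this transcript. The greedy rule being illustrated (repeatedly merge the two smallest files, the same shape as Huffman's algorithm below) can be sketched as follows; optimal-merge-cost is a hypothetical name, and the cost counts every record moved in every merge:

(defun optimal-merge-cost (lengths)
  ;; Repeatedly replace the two smallest file lengths by their sum,
  ;; accumulating the cost of each merge.
  (let ((s (sort (copy-list lengths) #'<)))
    (loop while (rest s)
          sum (let ((z (+ (pop s) (pop s))))
                (setf s (merge 'list (list z) s #'<))
                z))))

;; (optimal-merge-cost '(20 30 10 5)) => 115: merge 5+10, then 15+20, then 35+30.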
53
Huffman codes
• Compact Text Encoding
• Code that can be decoded
• Huffman encoding
54
Compact Text Encoding
• The goal is to develop a code that represents a given text as compactly as possible.
• A standard encoding is ASCII, which represents every character using 7 bits:
• “An English sentence”
1000001 (A) 1101110 (n) 0100000 ( ) 1000101 (E) 1101110 (n) 1100111 (g) 1101100 (l) 1101001 (i) 1110011 (s) 1101000 (h) 0100000 ( ) 1110011 (s) 1100101 (e) 1101110 (n) 1110100 (t) 1100101 (e) 1101110 (n) 1100011 (c) 1100101 (e)
• This requires 133 bits ≈ 17 bytes
55
Compact Text Encoding (Cont.)
• Of course, this is wasteful because we can encode 12 characters in 4 bits:
‹space› = 0000 A = 0001 E = 0010 c = 0011 e = 0100 g = 0101 h = 0110 i = 0111 l = 1000 n = 1001 s = 1010 t = 1011
• Then we encode the phrase as
0001 (A) 1001 (n) 0000 ( ) 0010 (E) 1001 (n) 0101 (g) 1000 (l) 0111 (i) 1010 (s) 0110 (h) 0000 ( ) 1010 (s) 0100 (e) 1001 (n) 1011 (t) 0100 (e) 1001 (n) 0011 (c) 0100 (e)
• This requires 76 bits ≈ 10 bytes
56
Compact Text Encoding (Cont.)
• An even better code is given by the following encoding:
‹space› = 000 A = 0010 E = 0011 s = 010 c = 0110 g = 0111 h = 1000 i = 1001 l = 1010 t = 1011 e = 110 n = 111
• Then we encode the phrase as
0010 (A) 111 (n) 000 ( ) 0011 (E) 111 (n) 0111 (g) 1010 (l) 1001 (i) 010 (s) 1000 (h) 000 ( ) 010 (s) 110 (e) 111 (n) 1011 (t) 110 (e) 111 (n) 0110 (c) 110 (e)
• This requires 65 bits ≈ 9 bytes
57
Code that can be decoded
• Fixed-length codes:
– Every character is encoded using the same number of bits.
– To determine the boundaries between characters, we form groups of w bits, where w is the length of a character.
– Examples:
• ASCII
• Our first improved code
• Prefix codes:
– No character's codeword is a prefix of another character's codeword.
– Examples:
• Fixed-length codes
• Huffman codes ■
58
Why Prefix Codes?
• Consider a code that is not a prefix code:
a = 01 m = 10 n = 111 o = 0 r = 11 s = 1 t = 0011
• Now you send a fan letter to your favorite movie star. One of the sentences is “You are a star.” You encode “star” as “1 0011 01 11”.
• Your idol receives the letter and decodes the text using your coding table:
100110111 = 10 0 11 0 111 = “moron”
• Prefix codes are unambiguous. (See next slide) ■
59
Why Are Prefix Codes Unambiguous?
It suffices to show that the first character can be decoded unambiguously. We then remove this character and are left with the problem of decoding the first character of the remaining text, and so on until the whole text has been decoded.
Assume that there are two characters c and c' that could potentially be the first character in the text, with encodings x0 x1 … xk and y0 y1 … yl. Assume further that k ≤ l. Since both c and c' can occur at the beginning of the text, we have xi = yi for 0 ≤ i ≤ k; that is, x0 x1 … xk is a prefix of y0 y1 … yl. This contradicts the assumption that we have a prefix code, so prefix codes are unambiguous. ■
60
Representing a Prefix-Code Dictionary
• In the example:
‹space› = 000 A = 0010 E = 0011 s = 010 c = 0110 g = 0111
h = 1000 i = 1001 l = 1010 t = 1011 e = 110 n = 111
[Figure: the binary trie for this code; edges to left children are labeled 0, edges to right children 1, and the leaves from left to right are ‹space›, A, E, s, c, g, h, i, l, t, e, n]
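To make the unambiguity concrete, here is a small Common Lisp decoding sketch (not from the slides; decode-prefix is a hypothetical helper, and the bit string is assumed to be a valid encoding):

(defun decode-prefix (bits codes)
  ;; codes: alist of (character . codeword). The prefix property guarantees
  ;; that at most one codeword matches at each position, so greedy
  ;; matching is unambiguous.
  (let ((i 0) (out '()))
    (loop while (< i (length bits))
          do (loop for (ch . code) in codes
                   when (and (<= (+ i (length code)) (length bits))
                             (string= bits code :start1 i
                                      :end1 (+ i (length code))))
                   do (push ch out)
                      (incf i (length code))
                      (return)))
    (coerce (nreverse out) 'string)))

;; (decode-prefix "0010111000"
;;                '((#\A . "0010") (#\n . "111") (#\s . "010") (#\Space . "000")))
;; => "An " (A, n, space).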
61
Figure 16.3/ Figure 16.4
62
63
Huffman code
64
The Cost of Huffman Code
• Let C be the set of characters in the text to be encoded, and let
f(c) be the frequency of character c.
• Let dT(c) be the depth of node (character) c in the tree
representing the code. Then
B(T) = Σ c∈C f(c)·dT(c)
is the number of bits required to encode the text using the code represented by tree T. We call B(T) the cost of tree T.
• Observation: In a tree T representing an optimal prefix code,
every internal node has two children. ■
65
Greedy Choice
• Lemma: There exists an optimal prefix
code such that the two characters with
smallest frequency are siblings and have
maximal depth in T. ■
66
Greedy Choice
• Proof: Let x and y be two such characters, and let T be a tree representing an optimal prefix code. Let a and b be two sibling leaves of maximal depth in T. Assume w.l.o.g. that f(x) ≤ f(y) and f(a) ≤ f(b). This implies that f(x) ≤ f(a) and f(y) ≤ f(b). Let T' be the tree obtained by exchanging a and x and b and y. (Continued on the next slide.)
[Figure: tree T with sibling leaves a, b at maximal depth, and tree T' after exchanging x with a and y with b]
67
Greedy Choice
The cost difference between trees T and T' is
B(T) − B(T') = (f(a) − f(x))(dT(a) − dT(x)) + (f(b) − f(y))(dT(b) − dT(y)) ≥ 0,
since f(x) ≤ f(a), f(y) ≤ f(b), and a and b have maximal depth in T. Hence T' is also optimal, and in it the two least-frequent characters are sibling leaves of maximal depth. ■
[Figure: trees T and T' after the exchange]
68
Optimal Substructure
• After joining two nodes x and y by making them children of a new node z, the algorithm treats z as a leaf with frequency f(z) = f(x) + f(y).
• Let C' be the character set in which x and y are replaced by the single character z with frequency f(z) = f(x) + f(y), and let T' be an optimal tree for C'.
• Let T be the tree obtained from T' by making x and y children of z.
• We observe the following relationship between B(T) and B(T'):
B(T) = B(T') + f(x) + f(y), since dT(x) = dT(y) = dT'(z) + 1. (Continued on the next slide.)
69
70
Lemma: If T' is optimal for C', then T is optimal for C.
Assume the contrary. Then there exists a better tree T'' for C.
Also, there exists a tree T''' at least as good as T'' for C where x and y are sibling
leaves of maximal depth.
The removal of x and y from T''' turns their parent into a leaf; we can associate this
leaf with z.
The cost of the resulting tree is B(T''') – f(x) – f(y) < B(T) – f(x) – f(y) = B(T').
This contradicts the optimality of B(T').
Hence, T must be optimal for C. ■
71
72
Huffman code (Remarks)
• Assume that the string is generated by a memoryless source: regardless of the past, the next character in the string is c with probability f(c).
Then the Huffman code is optimal.
• Can we do better?
73
Huffman code (Remarks)
• Huffman encodes fixed length blocks.
– What if we vary them?
• Huffman uses one fixed encoding throughout.
– What if the data's characteristics change over time?
• What if the data has structure?
– For example: raster images, video
• Huffman is lossless.
– Necessary?
• LZW, MPEG, etc. ■
74
Huffman code (Implementation)
• Time complexity and data structure:
Let S be the set of n weights (nodes).
Constructing a Huffman code based on the greedy strategy can be described as follows:
Repeat until |S|=1
Find two min nodes x and y from S and remove them from S
Construct a new node z with weight w(z)=w(x)+w(y) and
insert z into S
75
Huffman code (Implementation)
Why data structures are important?
• An algorithm for constructing Huffman code (Tree):
Repeat until |S| = 1
– Find two min nodes x and y from S and remove them from S
– Construct a new node z with weight w(z)=w(x)+w(y) and insert z into S
• The time complexity of the algorithm depends on how S is implemented.
• Data structure for S    find two min    insert      total
linked list               O(n)            O(1)        O(n^2)
sorted array              O(1)            O(n)        O(n^2)
?                         O(log n)        O(log n)    O(n log n)
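As a concrete sketch of the loop above (not from the slides), here is a Common Lisp version that keeps S as a sorted list, i.e. the O(n^2) variant from the table; a heap would bring it to O(n log n). huffman-tree and huffman-codes are hypothetical names:

(defun huffman-tree (freqs)
  ;; freqs is a list of (character . weight) pairs. Leaves are (weight char);
  ;; internal nodes are (weight left right). Returns the final code tree.
  (let ((s (sort (loop for (c . w) in freqs collect (list w c))
                 #'< :key #'first)))
    (loop while (rest s)
          do (let* ((x (pop s)) (y (pop s))       ; the two minimum nodes
                    (z (list (+ (first x) (first y)) x y)))
               (setf s (merge 'list (list z) s #'< :key #'first))))
    (first s)))

(defun huffman-codes (node &optional (prefix ""))
  ;; Walk the tree: 0 on left branches, 1 on right branches.
  (if (= (length node) 2)                         ; leaf: (weight char)
      (list (cons (second node) prefix))
      (append (huffman-codes (second node) (concatenate 'string prefix "0"))
              (huffman-codes (third node) (concatenate 'string prefix "1")))))

;; With the textbook's example frequencies a:45 b:13 c:12 d:16 e:9 f:5,
;; (huffman-codes (huffman-tree '((#\a . 45) (#\b . 13) (#\c . 12)
;;                                (#\d . 16) (#\e . 9) (#\f . 5))))
;; yields a=0, c=100, b=101, f=1100, e=1101, d=111.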
76
We will cover heaps in Lecture 8
77
Greedy method: Recap
• Greedy algorithms are efficient algorithms for optimization problems that exhibit two properties:
– Greedy choice property: an optimal solution can be obtained by making locally optimal choices.
– Optimal substructure: an optimal solution contains within it optimal solutions to smaller subproblems.
• If only optimal substructure is present, dynamic programming may be a viable approach; that is, the greedy choice property is what allows us to obtain faster algorithms than what can be obtained using dynamic programming. ■
78
Questions?