1
Solution Space Smoothing Method and its Application
Dong Sheqin, Hong Xianlong
董社勤 洪先龙
Department of Computer Science and Technology, Tsinghua University, Beijing,100084 P.R. China
2
Outline
– Principle of Search Space Smoothing
– VLSI Placement based on SSS
– Local Smoothing & Global Smoothing
– VLSI Placement based on P-SSS
– Experimental Results
– Applications: TSP, Temporal planning, FPGA Floorplanning
3
NP-hard Problem and Optimization Algorithm
An NP-hard problem has a complicated search space; a greedy search strategy easily gets stuck in one of its deep canyons, unable to climb out and reach the global energy minimum.
To avoid getting stuck at a local minimum, there are commonly two types of approaches: 1) introduce complex moves; 2) introduce mechanisms that allow the search to climb over the energy barrier (e.g., the simulated annealing algorithm).
4
History of Search Space Smoothing (SSS)
Jun Gu and Xiaofei Huang proposed a third approach: smooth the solution space.
They applied this method to the classical NP-hard problem TSP, using the smoothing function below.
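The formula itself did not survive in this transcript. As a point of reference (reconstructed from the literature, not taken verbatim from the slide), the TSP smoothing function of Gu and Huang is usually cited in the following form, with city distances d_ij normalized to [0, 1], average distance d̄, and smoothing parameter α ≥ 1 gradually reduced to 1:

```latex
d_{ij}(\alpha) =
\begin{cases}
\bar{d} + \left(d_{ij} - \bar{d}\right)^{\alpha}, & d_{ij} \ge \bar{d} \\[4pt]
\bar{d} - \left(\bar{d} - d_{ij}\right)^{\alpha}, & d_{ij} < \bar{d}
\end{cases}
```

At large α all distances collapse toward d̄ (a trivial, fully smoothed instance); at α = 1 the original distances are recovered.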
5
Principle of Search Space Smoothing
By search space smoothing, the rugged terrain of the search space of an NP-hard problem is smoothed, so the original problem instance is transformed into a series of gradually simplified problem instances. The solutions of the simplified instances are used to guide the search on the more complicated ones; finally, the original problem is solved at the end of the series.
6
Principle of Search space smoothing
[Figure: an example of one-dimensional solution space smoothing. The original solution space is progressively smoothed into spaces 1, 2, …, n; the initial search point and the minimum solution in the original space are marked. The minimum solution of solution space i becomes the initial starting point in solution space i+1.]
7
Formal Description of SSS Algorithm
// Initialization
α ← α0; x ← x0;
// Search
while (α >= 0) do begin
    H(α) ← DoSmooth(α, H);
    for (some times) do begin
        x' ← NeighborhoodTransition(x);
        if (Acceptable(α, x, x')) then
            x ← x';
    end;
    α ← NewAlpha(α);
end;
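As a concrete toy illustration of this pseudocode, here is a minimal runnable sketch in Python. The 1-D objective, the schedule, and the function names are invented for illustration and are not from the talk; the greedy test plays the role of Acceptable():

```python
import math
import random

def sss_search(h_smooth, neighbor, x0, alphas, iters=500):
    """Skeleton of the search-space-smoothing loop from the slide.

    h_smooth(alpha, x): smoothed objective H(alpha) evaluated at x
    neighbor(x):        NeighborhoodTransition -- a random local move
    alphas:             smoothing schedule, ending at alpha = 0
                        (the original, un-smoothed instance)
    """
    x = x0
    for alpha in alphas:                 # DoSmooth: switch the objective
        for _ in range(iters):
            x_new = neighbor(x)
            if h_smooth(alpha, x_new) <= h_smooth(alpha, x):  # greedy accept
                x = x_new
        # the solution found on instance alpha seeds the next, less
        # smoothed instance in the series
    return x

# Toy 1-D landscape: the ripples vanish at alpha = 1 (fully smoothed)
# and are fully present at alpha = 0 (the original instance).
def h(alpha, x):
    return x * x + (1.0 - alpha) * 2.0 * math.sin(8.0 * x)

def step(x):
    return x + random.uniform(-0.3, 0.3)

random.seed(1)
best = sss_search(h, step, x0=3.0, alphas=[1.0, 0.5, 0.0])
```

Greedy search on the un-smoothed surface alone would get stuck in a ripple far from the origin; descending the smoothed surface first guides it into the right basin.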
8
History of Search Space Smoothing (SSS)
Johannes Schneider investigated this method thoroughly for the traveling salesman problem and pointed out that "the advantage (of search space smoothing) over the latter one (simulated annealing: SA) is that a certain amount of computational effort usually provides a better result than in the case of SA".
Schneider's analytic and experimental results showed that combining SSS with SA is infeasible.
11
Summary of the Principle of SSS
To solve the original problem instance Pi, SSS first transforms Pi into a series of problem instances Pi0, Pi1, Pi2, Pi3, Pi4, … Obviously Pi0 is similar to Pi1, Pi1 is similar to Pi2, and so on. In some sense, the "distance" between Pi0 and Pi1 is smaller than the "distance" between Pi0 and Pi2. Likewise, the optimal solution of Pi1 is very close, in some sense, to the optimal solution of Pi0, because the two problem instances are highly similar.
In the series, each problem instance is a gradually smoothed approximation of the previous one.
12
Outline
– Principle of Search Space Smoothing
– VLSI Placement based on SSS
– Local Smoothing & Global Smoothing
– VLSI Placement based on P-SSS
– Experimental Results
– Applications: TSP, Temporal planning, FPGA Floorplanning
13
Problem of VLSI Placement
An SoC is composed of IPs and macro building blocks.
The first step in the physical design of an SoC is constraint-driven floorplanning and placement.
[Figure: an SoC layout containing CPU, ROM/RAM, A/D, PLA and I/O blocks]
14
How to smooth the search space for a Placement instance--an example
Incremental optimization
[Figure: placements (a) and (b)]
From an optimal placement of all building blocks with the same size (a), we can easily obtain the optimal placement after the size of only one block has been changed (b).
15
How to smooth the search space for a Placement instance--an example
Placement instance smoothing (their optimal solutions must be very close to each other):
[Figure: the original placement instance (Pi) and the smoothed instances (Pi0), (Pi1), (Pi2), (Pi3)]
16
The basic smoothing function
To calculate the first (fully smoothed) placement instance, every module is given the average width and height, and each pin position is scaled to the new module dimensions:

    h̄ = (1/n) Σ_{i=1..n} h_i,    w̄ = (1/n) Σ_{i=1..n} w_i

    PinX_ij ← PinX_ij · (w̄ / w_i),    PinY_ij ← PinY_ij · (h̄ / h_i)

The successive placement instances are calculated by reducing the smoothing parameter α step by step toward 0, moving each module's dimensions from the average back toward their original values (and rescaling the pin positions accordingly):

    h_i(α) = h_i + α (h̄ − h_i),    w_i(α) = w_i + α (w̄ − w_i)

(α = 1 gives the fully averaged first instance; α = 0 restores the original instance.)
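The size-smoothing schedule described on this slide can be sketched in a few lines of Python. The linear interpolation form is an assumption consistent with the slide's description, not verbatim from the talk; α = 1 gives the fully averaged first instance and α = 0 restores the original sizes, matching the deck's convention of ending the schedule at α = 0:

```python
def smooth_sizes(widths, heights, alpha):
    """Move every module's dimensions toward the average by factor alpha.

    alpha = 1.0 -> fully smoothed instance (all modules average-sized)
    alpha = 0.0 -> the original placement instance
    """
    n = len(widths)
    w_avg = sum(widths) / n
    h_avg = sum(heights) / n
    w_s = [w + alpha * (w_avg - w) for w in widths]
    h_s = [h + alpha * (h_avg - h) for h in heights]
    return w_s, h_s

# The four-module instance used in the global-smoothing example:
widths  = [2.0, 2.0, 1.8, 1.8]
heights = [1.8, 1.8, 0.5, 0.5]
w_full, h_full = smooth_sizes(widths, heights, 1.0)  # every module 1.9 x 1.15
w_orig, h_orig = smooth_sizes(widths, heights, 0.0)  # original sizes back
```

Pin positions would be rescaled by the same per-module ratios, as on the slide.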
17
Outline
– Principle of Search Space Smoothing
– VLSI Placement based on SSS
– Local Smoothing & Global Smoothing
– VLSI Placement based on P-SSS
– Experimental Results
– Applications: TSP, Temporal planning, FPGA Floorplanning
18
Global Smoothing
19
Global Smoothing
For a placement instance with 4 modules of sizes 2 × 1.8, 2 × 1.8, 1.8 × 0.5, 1.8 × 0.5, numbered 1, 2, 3, 4, the global optimal solution is depicted in Fig. 1(a); we name it solution-1. Another solution, depicted in Fig. 1(b), is obviously not globally optimal; we name it solution-2. For the BSG of Fig. 1(c), if we place modules 1, 2, 3, 4 into the BSG cells (2,2), (3,2), (2,1), (3,1), we get a local minimum (solution-2). With the greedy search described above, there is no way to reach the global optimal solution from this local minimum.
However, if we "smooth" the placement instance so that the four modules have the same size, solution-1 and solution-2 both become global optimal solutions of the "smoothed" placement instance. Note that the original placement instance (2 × 1.8, 2 × 1.8, 1.8 × 0.5, 1.8 × 0.5) can be smoothed to the instance (2 × 1.8, 2 × 1.8, 2 × 1.8, 2 × 1.8) or to (1.8 × 0.5, 1.8 × 0.5, 1.8 × 0.5, 1.8 × 0.5). The latter two "smoothed" instances are similar to the original instance, and their solution spaces are similar to the original solution space, but the number of local minima is reduced in the "smoothed" solution spaces.
20
Global Smoothing
Definition 1: Suppose the neighborhood of a solution s0 is N(s0). s0 is a local minimum iff

    ∀ s_i ∈ N(s0): H(s_i) ≥ H(s0)

After smoothing with parameter α, we say the local minimum s0 is eliminated iff

    ∃ s_i ∈ N(s0): H_α(s_i) < H_α(s0)
21
Local effect of the smoothing operation
We first randomly choose tens of different solutions, and for each solution s_i we randomly select 1000 other solutions s_j (j = 1, 2, …, 1000) within its neighborhood N(s_i). For every α, we calculate the energy (area, for the placement problem) differences

    ΔH_ij = H(s_j) − H(s_i),    j = 1, 2, …, 1000

Then the root-mean-square (RMS) value of all 1000 energy differences is

    RMS_i = sqrt( (1/1000) Σ_{j=1..1000} ΔH_ij² )
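This measurement can be reproduced on a toy landscape (the 1-D energy and all names here are illustrative, not the placement objective from the talk):

```python
import math
import random

def rms_energy_diff(h, s, neighbors):
    """Root-mean-square of the energy differences H(s_j) - H(s)
    over the sampled neighbors, as defined on the slide."""
    diffs = [h(sj) - h(s) for sj in neighbors]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def make_h(smoothing):
    """Toy 1-D energy: smoothing = 1 removes the ripples entirely."""
    return lambda x: x * x + (1.0 - smoothing) * 2.0 * math.sin(8.0 * x)

random.seed(0)
s = 0.7
nbrs = [s + random.uniform(-0.3, 0.3) for _ in range(1000)]
rms_rugged = rms_energy_diff(make_h(0.0), s, nbrs)  # original landscape
rms_smooth = rms_energy_diff(make_h(1.0), s, nbrs)  # fully smoothed
```

On this toy surface the smoothed RMS is markedly smaller, which is exactly the local smoothing effect the slide quantifies.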
23
Local Smoothing
Definition 2: Suppose the neighborhood of a solution s0 is N(s0) = { s1, s2, …, s_|N(s0)| }, with neighborhood size |N(s0)|, and we have a vector of energy differences

    V_s0(i) = H(s_i) − H(s0),    i = 1, 2, …, |N(s0)|

Then the local smoothness of s0 can be described by

    LS_s0 = (1/|N(s0)|) Σ_i |V_s0(i)|
24
How to make use of Global Smoothing effect in SSS
For greedy local search, suppose there are two solutions s_i and s_j within a neighborhood, and ΔH_ij = H(s_j) − H(s_i). Then the probability of accepting the transition is

    A_ij = 1 if ΔH_ij ≤ 0,    A_ij = 0 if ΔH_ij > 0

The greedy strategy responds to the global smoothing effect, which changes the sign of the energy difference for some pairs of solutions, but it is completely insensitive to the local smoothing effect.
25
How to make use of Local smoothing effect
In simulated annealing, the Metropolis algorithm is used to reach a quasi-equilibrium state at a given temperature t:

    A_ij = 1 if ΔH_ij ≤ 0,    A_ij = exp(−ΔH_ij / t) if ΔH_ij > 0

Obviously ΔH_ij / t = ΔH_ij for t = 1; ΔH_ij / t can be viewed as the smoothed result of the energy difference ΔH_ij under the control parameter t.
26
How to make use of Local smoothing effect
A local search that is sensitive to both the global and the local smoothing effects would lead to a better result.
An acceptance function with a smooth transition of the accepting probability from 1 to 0 should be introduced; for convergence, the local search algorithm should degenerate to the greedy algorithm in the original, un-smoothed search space.
27
A Local search that can make use of local smoothing
A local search with a proper accepting probability can make use of both the global smoothing effect and the local smoothing effect:

    A_ij = 1 if ΔH_ij ≤ 0,    A_ij = exp(−K · ΔH_ij) if ΔH_ij > 0
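A runnable sketch of this acceptance rule (parameter values and helper names are illustrative, not from the talk):

```python
import math
import random

def accept_prob(delta_h, k):
    """Acceptance probability from the slide: improving moves are always
    accepted; worsening moves are accepted with probability exp(-K * dH).
    Unlike Metropolis, K is a fixed constant rather than 1/temperature,
    so the rule stays sensitive to the magnitude of the (smoothed)
    energy differences throughout the search."""
    if delta_h <= 0:
        return 1.0
    return math.exp(-k * delta_h)

def local_search(h, neighbor, x0, k=2.0, iters=2000):
    """Local search driven by the fixed-K acceptance function."""
    x = x0
    for _ in range(iters):
        x_new = neighbor(x)
        if random.random() < accept_prob(h(x_new) - h(x), k):
            x = x_new
    return x

random.seed(0)
x_best = local_search(lambda v: v * v,
                      lambda v: v + random.uniform(-0.5, 0.5),
                      x0=5.0)
```

Large uphill moves are exponentially suppressed, while small ones (the ones local smoothing shrinks) are still taken with useful probability.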
28
Outline
– Principle of Search Space Smoothing
– VLSI Placement based on SSS
– Local Smoothing & Global Smoothing
– VLSI Placement based on Probability Search Space Smoothing
– Experimental Results
– Applications: TSP, Temporal planning, FPGA Floorplanning
29
Algorithm: Probability-SSS()
STEP 1: Create the initial placement instance according to the smoothing function.
STEP 2: Use a local search with the probability acceptance function to search for a solution of the initial placement instance. The result is the starting solution.
STEP 3: α ← NewAlpha(α); apply the smoothing function to the previous solution to produce a new placement instance.
STEP 4: Use a local search with the probability acceptance function to search for a solution of the new placement instance. The result is the current solution.
STEP 5: If α = 0, stop; the current solution is the final solution. Otherwise, using the current solution, go to STEP 3.
30
Outline
– Principle of Search Space Smoothing
– VLSI Placement based on SSS
– Local Smoothing & Global Smoothing
– VLSI Placement based on Probability Search Space Smoothing
– Experimental Results
– Applications: TSP, Temporal planning, FPGA Floorplanning
31
Experimental Results

Table I: Comparison of solution quality and run time. Area (mm²) / Time (sec) among ECBL (on Sparc 20), Fast-SP (on Ultra-I), O-tree (on Ultra 60), B*-tree (on Ultra-I), TCG (on Ultra 60), and Probability-SSS (on V880).

Case  | ECBL       | Fast-SP   | O-tree     | B*-tree    | TCG        | P-SSS
Ami33 | 1.192/73   | 1.205/20  | 1.242/119  | 1.27/3417  | 1.20/306   | 1.170/31
  (P-SSS vs. others) | 1.8% | 2.9% | 5.7% | 7.8% | 2.5% | 0
Ami49 | 36.70/117  | 36.50/31  | 37.73/526  | 36.8/4752  | 36.77/434  | 36.08/64
  (P-SSS vs. others) | 1.6% | 1.2% | 4.3% | 1.9% | 1.9% | 0

Table II: Minimum / average distribution, O-tree (Sun Ultra 1) vs. P-SSS (Sun V880), for simultaneous area and wire-length optimization.

Circuits | O-tree Area (mm²) | O-tree Wire (mm) | P-SSS Area (mm²) | P-SSS Wire (mm)
Ami33    | 1.26 / 1.34       | 51.6 / 59.8      | 1.221 / 1.242    | 31.34 / 39.94
Ami49    | 39.1 / 42.0       | 671 / 777        | 37.60 / 38.18    | 675.2 / 789.7
32
Experimental Results: placement example ami33(1)- area usage is 98.85%
33
Experimental Results: placement example ami49 - area usage is 98.85%
34
Outline
– Principle of Search Space Smoothing
– VLSI Placement based on SSS
– Local Smoothing & Global Smoothing
– VLSI Placement based on Probability Search Space Smoothing
– Experimental Results
– Applications: TSP, Temporal planning, FPGA Floorplanning
35
Application: Using P-SSS to solve TSP
Solution quality (excess over the optimal solution)
36
Application: FPGA Temporal Planning using P-SSS
3D-BSSG representation
37
Experimental Results
Cost Function: Φ = Volume + β * Wirelength
Temporal precedence requirements, which describe the temporal ordering among modules, should also be satisfied in our algorithm.
Using 3D-MCNC benchmarks, two groups of experiments are performed.
38
Experimental Results
In the first experiment, our objective is to compare P-SSS with G-SSS as the quantity of precedence constraints varies.
Conclusions from this experiment:
1. Increasing the number of precedence constraints decreases the quality of the search.
2. Combining SSS with the Metropolis algorithm makes it more powerful than using the greedy algorithm as the local search method.
[Figure: volume usage (%) vs. quantity of precedence constraints (2 to 22), comparing P-SSS and G-SSS]
39
Experimental Results
Experimental results on all circuits of 3D-MCNC. Conclusion: the P-SSS algorithm improves on the G-SSS algorithm in both volume and wirelength.
40
Experimental Results
In the second experiment, using the same benchmarks and the same constraints, we respectively execute the simulated annealing approach and the P-SSS algorithm based on two kinds of representations: 3D-subTCG and Sequence Triplet (ST).
41
Best Results of 3D-ami49: Volume usage is 84.9%
42
Application: Heterogeneous FPGA Floorplanning Based on Instance Augmentation
43
Instance Augmentation
Instance augmentation is a new stochastic optimization method which has shown great ability in constrained floorplanning, such as fixed-outline floorplanning [Rong Liu, ISCAS05].
Floorplanning for heterogeneous FPGAs can be regarded as a constrained floorplanning problem:
– fixed-outline, since the size of the device is fixed;
– each module's requirement for all kinds of resources must be satisfied.
Therefore, we applied IA to the heterogeneous floorplanning problem.
44
Overview
– Start from a sub-instance of the given instance; that is, first floorplan a subset of the given modules.
– Simulated annealing or greedy local search may be adopted to find feasible solutions of a specific instance.
– When a feasible solution of the sub-instance is found, augment it by inserting modules (called down-casting).
– If no feasible solution of the current sub-instance is found, "shrink" it by removing a module (called up-casting).
[Figure: illustration of the so-called instance augmentation]
45
Overview
Main flow of Instance Augmentation
Note: a solution is feasible iff all the modules of the instance are placed in the device and their requirements for all kinds of resources are fulfilled.
46
Some Details
Once a sub-instance is augmented to a bigger one, an initial solution of the bigger instance is also generated by inserting the module into the feasible solution of the sub-instance.
For example: (abcd, badc) -> (aebcd, baedc)
Different inserting positions and realizations of the module are tried to find a better insertion. Experiments show that many feasible solutions can be obtained directly in this way.
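The down-casting step on a sequence pair can be sketched as follows (a hypothetical helper, not code from the talk); it enumerates every pair of insertion positions, which is exactly the candidate set the algorithm tries:

```python
from itertools import product

def insert_module(seq_pair, module):
    """Yield all sequence pairs obtained by inserting `module` at every
    position of both sequences (the down-casting candidates)."""
    s1, s2 = seq_pair
    for i, j in product(range(len(s1) + 1), range(len(s2) + 1)):
        yield (s1[:i] + module + s1[i:], s2[:j] + module + s2[j:])

candidates = list(insert_module(("abcd", "badc"), "e"))
# the slide's example (aebcd, baedc) is one of the 25 candidates
```

In the real algorithm each candidate would also be tried with the different shape realizations of the module.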
47
Some Details
When the current instance is "shrunk" to a smaller one, an initial solution of the smaller instance is also generated by removing the module from the feasible solution of the current instance.
For example: (aebcd, baedc) -> (abcd, badc)
To keep the algorithm from getting stuck in a local minimum, more than one module may be removed when "shrinking" the current instance.
48
Some Details
Either simulated annealing or greedy local search can be used to search for a feasible solution of the current instance; this work adopted simulated annealing.
Since the initial solutions usually already have good quality, the simulated annealing used here has:
– a very low start temperature;
– few iterations at each temperature.
49
Some Details
Inserting small modules disturbs the floorplan less than inserting large modules. Therefore, we:
– sort the modules by their resource requirements in descending order;
– insert the modules in this order.
Experiments show that inserting modules in this order yields higher success ratios.
50
Problem Definition
Heterogeneous FPGA device
– Instead of being composed of similar CLBs, modern FPGA devices have more heterogeneous logical resources.
– Xilinx's Virtex-II and Spartan-3 families are typical heterogeneous FPGA devices.
[Figure: simplified architecture of Xilinx's XC3S5000, which is composed of CLBs, RAMs and Multipliers]
51
Problem Definition
Heterogeneous FPGA floorplanning
– A module in FPGA floorplanning is associated with a resource requirement vector r = (nc, nr, nm), indicating that the module requires nc CLBs, nr RAMs, and nm Multipliers.
– Given a set of modules and their connections, the objective of floorplanning is to place and shape each module inside the chip so that its resource requirements are fulfilled, no two modules overlap, and a given cost is optimized.
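The per-module feasibility condition can be stated as a small check (a sketch with invented helper names; the real algorithm also verifies non-overlap and the fixed outline):

```python
def satisfies(requirement, available):
    """Does a region's resource supply cover a module's requirement
    vector r = (nc, nr, nm) of CLBs, RAMs and Multipliers?"""
    return all(have >= need for need, have in zip(requirement, available))

def floorplan_feasible(requirements, regions):
    """A floorplan is feasible iff every module's assigned region
    provides enough of each resource (overlap checks omitted here)."""
    return all(satisfies(r, a) for r, a in zip(requirements, regions))

# Two modules, two candidate regions with (CLB, RAM, MUL) supplies:
ok = floorplan_feasible([(10, 1, 0), (4, 0, 2)],
                        [(12, 1, 1), (4, 0, 2)])
```

Comparing component-wise keeps the check independent of how many resource types the target device distinguishes.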
52
Experimental Results
Employ Xilinx’s XC3S5000 as target device Generate testing data with different modules and resource
requirements
# ofmodules
CLB-rate RAM-rate MUL-rate time(sec) success-rate
21 72% 88% 86% 0.1 100%
23 83% 81% 81% 0.1 100%
37 94% 75% 75% 2.4 100%
50 78% 78% 76% 0.6 100%
100 78% 79% 77% 1.7 91%
Results of different testing data.
53
Experimental Results
Floorplan of 20 modules, obtained in 1.2 sec (the resource utilization is 100%).
A resultant floorplan of 50 modules.
54