parallel techniques 之 易并行计算
DESCRIPTION
Parallel Techniques 之 易并行计算. Lecture 4. 张少强 http://bioinfo.uncc.edu/szhang [email protected]. Parallel Techniques. Embarrassingly Parallel Computations. Embarrassingly Parallel Computations. - PowerPoint PPT PresentationTRANSCRIPT
1
张少强http://bioinfo.uncc.edu/szhang
Lecture 4
Parallel Techniques之易并行计算
2
Embarrassingly Parallel Computations
Parallel Techniques
3
Embarrassingly Parallel ComputationsA computation that can obviously be divided into a number of completely independent parts, each of which can be executed by a separate process(or). 计算可以很显然地被分成一组完全独立的部分,每部分可分别由一个进程执行。
进程之间没有或有很少的通信每个进程可以不与其他进程有交互来运行任务
4
Practical embarrassingly parallel computation with static process
creation and master-slave approach
5
Embarrassingly Parallel Computation Examples
• Low level image processing
• Mandelbrot set
• Monte Carlo Calculations
6
Low level image processing
Many low level image processing operations only involve local data with very limited if any communication between areas of interest.
7
Image coordinate system
. (x, y)
x
y
Origin (0,0)
8
Some geometrical operationsShifting平移
Object shifted by x in x-dimension and y in y-dimension:x = x + xy = y + y
where x and y are original coordinates and x and y are the new coordinates.
x
yx
y
9
Some geometrical operations
Scaling缩放Object scaled by factor Sx in x-direction and Sy in y-direction:
x = xSx
y = ySy
3.8
x
y
10
Rotation翻转Object rotated through an angle q about the origin of the coordinate system:
x = x cos + y siny = -x sin + y cos
3.8
x
y
11
Parallelizing Image OperationsPartitioning into regions for individual processes
Example
Square region for each process (can also use strips) 3.9
12
for(i=0,row=0;i<48;i++,row=row+10) Send(row,P(i)); /*send row to process i*/
for(i=0;i<480;i++) for(i=0;i<640;j++) temp_map[i][j]=0;
for(i=0;i<(640*480);i++) Recv(oldrow,oldcol,newrow,newcol,P_any); if!((newrow<0)||(newrow>=480)||(newcol<0)||(newcol>=640)) temp_map[newrow][newcol]=map[oldrow][oldcol];}
for(i=0;i<480;i++) for(j=0;j<640;j++) map[i][j]=temp_map[i][j];
主进程master process480行, 10行一组分割;分出 48个从进程;每个进程处理 640*10区域。
13
从进程 slave process
Recv(row, P(0)); /*revecive from master process*/for(oldrow=row;oldrow<(row+10);oldrow++) for(oldcol=0;oldcol<640;oldcol++){ newrow=oldrow+delta_x; newcol=oldcol+delta_y; Send(oldrow,oldcol,newrow,newcol,P(0)); }
平移的从进程
从进程之间没有通信
14
Mandelbrot Set 曼德博集合
Set of points in a complex plane that are quasi-stable (will
increase and decrease, but not exceed some limit) when
computed by iterating the function:
where zk +1 is the (k + 1)th iteration of complex number:
z = a + bi
and c is a complex number giving position of point in the
complex plane. The initial value for z is zero.
是一种在复平面上组成分形的点的集合
15
Mandelbrot Set continued
Iterations continued until magnitude of z is:
• Greater than 2
or
• Number of iterations reaches arbitrary limit.
Magnitude of z is the length of the vector given by:
Zk 或者延伸到无限大,或者只停留在有限半径的圆盘内。曼德博集合就是使以上序列不延伸至无限大的所有 c点的集合。
16
对一点 (像素 )的值进行计算并返回迭代次数 ( 色 )
structure complex {float real;float imag;
};int cal_pixel(complex c){
int count, max;complex z;float temp, lengthsq;max = 256; /*若令 256为黑色 */z.real = 0; z.imag = 0;count = 0; /* number of iterations */do {
temp = z.real * z.real - z.imag * z.imag + c.real;z.imag = 2 * z.real * z.imag + c.imag;z.real = temp;lengthsq = z.real * z.real + z.imag * z.imag;count++;
} while ((lengthsq < 4.0) && (count < max));return count;
}
17
Mandelbrot set
18
Parallelizing Mandelbrot Set Computation
Static Task Assignment
Simply divide the region in to fixed number of parts, each computed by a separate processor.
Not very successful because different regions require different numbers of iterations and time.
19
Dynamic Task Assignment
Have processor request regions after computing previousregions
20
动态主进程Count=0;Row=0;For(k=0;k<num_proc;k++) /*进程数 <row/ send(&row,Pk,data_tag); count++; row++;}
do{ recv(&slave,&r,color,P_any,result_tag); count--; if(row <disp_height){
send(&row,P_slave,data_tag); row++; count++; }else send(&row, P_slave,terminator_tag); display();}while(count>0)
21
Monte Carlo Methods
Another embarrassingly parallel computation.
Monte Carlo methods use of random selections.
22
2 x 2的正方形内接一个圆 . 圆与正方形的面具比值为 :
在正方形内随机选择很多“点” . 记下有多少个“点”落在圆中。
Fraction of points within circle will be , given sufficient number of randomly selected samples.
23
24
Computing an IntegralOne quadrant can be described by integral
Random pairs of numbers, (xr,yr) generated, each between 0 and 1.Counted as in circle if
25
Alternative (better) Method
Use random values of x to compute f(x) and sum values of f(x):
where xr are randomly generated values of x between x1 and x2.
Monte Carlo method very useful if the function cannot be integrated numerically (maybe having a large number of variables)
26
ExampleComputing the integral
Sequential Code
sum = 0;for (i = 0; i < N; i++) { /* N random samples */
xr = rand_v(x1, x2); /* generate next random value */sum = sum + xr * xr - 3 * xr; /* compute f(xr) */
}area = (sum / N) * (x2 - x1);
Routine randv(x1, x2) returns a pseudorandom number between x1 and x2.
27
For parallelizing Monte Carlo code, must address best way to generate random numbers in parallel
28
1arctan41
41
0 2dx
x/*********************************************************************************pi_calc.cpp calculates value of pi and compares with actual value (to 25digits) of pi to give error. Integrates function f(x)=4/(1+x^2).**********************************************************************************/#include <math.h> //include files#include <iostream.h>#include "mpi.h "
void printit(); //function prototypesint main(int argc, char *argv[]){double actual_pi = 3.141592653589793238462643; //for comparison laterint n, rank, num_proc, i;double temp_pi, calc_pi, int_size, part_sum, x;char response = 'y', resp1 = 'y';MPI::Init(argc, argv); //initiate MPI
29
num_proc = MPI::COMM_WORLD.Get_size();rank = MPI::COMM_WORLD.Get_rank();if (rank == 0) printit(); /* I am root node, print out welcome */
while (response == 'y') {if (resp1 == 'y') {if (rank == 0) { /*I am root node*/cout <<"__________________________________" <<endl;cout <<"\nEnter the number of intervals: (0 will exit)" << endl;cin >> n;}
} else n = 0;
MPI::COMM_WORLD.Bcast(&n, 1, MPI::INT, 0); //broadcast nif (n==0) break; //check for quit condition
30
else {int_size = 1.0 / (double) n; //calcs interval sizepart_sum = 0.0;
for (i = rank + 1; i <= n; i += num_proc) { //calcs partial sums
x = int_size * ((double)i - 0.5);part_sum += (4.0 / (1.0 + x*x));
}temp_pi = int_size * part_sum; //collects all partial sums computes pi
MPI::COMM_WORLD.Reduce(&temp_pi,&calc_pi, 1, MPI::DOUBLE, MPI::SUM, 0);
31
if (rank == 0) { /*I am server*/cout << "pi is approximately " << calc_pi<< ". Error is " << fabs(calc_pi - actual_pi)<< endl<<"_______________________________________"<< endl;}} //end elseif (rank == 0) { /*I am root node*/cout << "\nCompute with new intervals? (y/n)" << endl; cin >> resp1;}}//end whileMPI::Finalize(); //terminate MPIreturn 0;} //end main
32
//functionsvoid printit(){cout << "\n*********************************" << endl<< "Welcome to the pi calculator!" << endl<< "You set the number of divisions \nfor estimating the integral:\n\tf(x)=4/(1+x^2)"<< endl<< "*********************************" << endl;} //end printit
33