OpenMP
Speaker: 呂宗螢    Date: 2007/06/01
Embedded and Parallel Systems Lab
Outline
OpenMP
OpenMP 2.5: multi-threaded, shared-memory programming for Fortran and C/C++
Basic syntax: #pragma omp directive [clause]
Requirements and supported environments:
  Windows: Visual Studio 2005 Standard, Intel® C++ Compiler 9.1
  Linux: gcc 4.2.0, Omni
  Xbox 360 & PS3
Windows
Add #include <omp.h> at the top of the program.
Visual Studio 2005 Standard: under Project / Project Properties / Configuration Properties / C/C++ / Language, set "OpenMP Support" to Yes.
Linux
Requires gcc 4.2; if it is not installed, download it from GNU (http://gcc.gnu.org/). Using gcc 4.2.1 as an example:
1. Extract gcc: tar -zxvf gcc-4.2.1.tar.gz
2. Enter the directory: cd gcc-4.2.1
3. Configure, installing to /opt/gcc-4.2.1: ./configure --prefix=/opt/gcc-4.2.1/
4. Build: make
5. Install: make install
OpenMP Constructs
Types of Work-Sharing Constructs
Loop: shares iterations of a loop across the team. Represents a type of "data parallelism".
Source : http://www.llnl.gov/computing/tutorials/openMP/
Sections : breaks work into separate, discrete sections. Each section is executed by a thread. Can be used to implement a type of "functional parallelism".
Types of Work-Sharing Constructs
single: the enclosed block is executed by exactly one thread in the team (any one thread, not necessarily the master thread); the other threads wait at the implicit barrier at the end of the construct.
Source : http://www.llnl.gov/computing/tutorials/openMP/
Loop work-sharing

#pragma omp parallel for
for (int i = 0; i < 10000; i++)
    for (int j = 0; j < 100; j++)
        function(i);

is equivalent to:

#pragma omp parallel
{   // the opening brace must be on a new line; it cannot follow "parallel" on the same line
    #pragma omp for
    for (int i = 0; i < 10000; i++)
        for (int j = 0; j < 100; j++)
            function(i);
}

parallel for requires the loop index to be of type int, and the number of iterations must be known before the loop starts.
Execution on a CPU with two hardware threads:

Thread 0 (Master):
for (i = 0; i < 5000; i++)
    for (int j = 0; j < 100; j++)
        function(i);

Thread 1:
for (i = 5000; i < 10000; i++)
    for (int j = 0; j < 100; j++)
        function(i);
OpenMP example: log.cpp

#include <omp.h>
...
// divide the for loop evenly between 2 threads
// (x, z, addr, ans are written by every thread, so they must be private)
#pragma omp parallel for num_threads(2) private(x, z, addr, ans)
for (y = 2; y < BufSizeY-2; y++)
    for (x = 2; x < BufSizeX-2; x++)
        for (z = 0; z < BufSizeBand; z++) {
            addr = (y*BufSizeX + x)*BufSizeBand + z;
            ans = (BYTE)(*(InBuf + addr)) * 16
                + (BYTE)(*(InBuf + ((y*BufSizeX + x+1)*BufSizeBand + z))) * (-2)
                + (BYTE)(*(InBuf + ((y*BufSizeX + x-1)*BufSizeBand + z))) * (-2)
                + (BYTE)(*(InBuf + (((y+1)*BufSizeX + x)*BufSizeBand + z))) * (-2)
                + (BYTE)(*(InBuf + (((y-1)*BufSizeX + x)*BufSizeBand + z))) * (-2)
                + (BYTE)(*(InBuf + ((y*BufSizeX + x+2)*BufSizeBand + z))) * (-1)
                + (BYTE)(*(InBuf + ((y*BufSizeX + x-2)*BufSizeBand + z))) * (-1)
                + (BYTE)(*(InBuf + (((y+2)*BufSizeX + x)*BufSizeBand + z))) * (-1)
                + (BYTE)(*(InBuf + (((y-2)*BufSizeX + x)*BufSizeBand + z))) * (-1)
                + (BYTE)(*(InBuf + (((y+1)*BufSizeX + x+1)*BufSizeBand + z))) * (-1)
                + (BYTE)(*(InBuf + (((y+1)*BufSizeX + x-1)*BufSizeBand + z))) * (-1)
                + (BYTE)(*(InBuf + (((y-1)*BufSizeX + x+1)*BufSizeBand + z))) * (-1)
                + (BYTE)(*(InBuf + (((y-1)*BufSizeX + x-1)*BufSizeBand + z))) * (-1);
            *(OutBuf + addr) = abs(ans) / 8;
        }
Converting an image with the LoG filter
[Figure: source image (left) and filtered output image (right)]
Sections work-sharing

int main(int argc, char* argv[])
{
    #pragma omp parallel sections
    {
        #pragma omp section
        { toPNG(); }

        #pragma omp section
        { toJPG(); }

        #pragma omp section
        { toTIF(); }
    }
}
[Diagram: the input image is handed to toPNG, toJPG, and toTIF, each section executed by its own thread]
OpenMP pitfalls
int Fe[10];
Fe[0] = 0;
Fe[1] = 1;
#pragma omp parallel for num_threads(2)
for (i = 2; i < 10; ++i)
    Fe[i] = Fe[i-1] + Fe[i-2];

Data dependence: each iteration reads values written by the previous two iterations, so the loop produces wrong results when parallelized.
#pragma omp parallel
{
    #pragma omp for
    for (int i = 0; i < 1000000; ++i)
        sum += i;
}

Race condition: every thread updates the shared variable sum without synchronization, so updates can be lost.
OpenMP pitfalls
Deadlock

int me;
#pragma omp parallel private(me)
{
    me = omp_get_thread_num();
    if (me == 0) goto Master;   /* thread 0 skips the barrier ... */
    #pragma omp barrier          /* ... so the other threads wait here forever */
Master:
    #pragma omp single
    printf("done\n");
}
OpenMP example: matrix (1)

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>                              /* for strcmp() used below */

#define RANDOM_SEED 2882                         /* random seed */
#define VECTOR_SIZE 4                            /* square matrix: width equals height */
#define MATRIX_SIZE (VECTOR_SIZE * VECTOR_SIZE)  /* total number of matrix elements */

int main(int argc, char *argv[])
{
    int i, j, k;
    int node_id;    /* unused in this excerpt */
    int *AA;        /* matrix A; also used by the sequential version to check the result */
    int *BB;        /* matrix B */
    int *CC;        /* result matrix C */
    int computing;
    int _vector_size = VECTOR_SIZE;
    int _matrix_size = MATRIX_SIZE;
    char c[10];     /* unused in this excerpt */
OpenMP example: matrix (2)

    if (argc > 1) {
        for (i = 1; i < argc; ) {
            if (strcmp(argv[i], "-s") == 0) {
                _vector_size = atoi(argv[i+1]);
                _matrix_size = _vector_size * _vector_size;
                i += 2;
            } else {
                printf("the arguments are:\n");
                printf("-s: the width of the matrix, e.g. -s 256\n");
                return 0;
            }
        }
    }

    AA = (int *)malloc(sizeof(int) * _matrix_size);
    BB = (int *)malloc(sizeof(int) * _matrix_size);
    CC = (int *)malloc(sizeof(int) * _matrix_size);
OpenMP example: matrix (3)

    srand(RANDOM_SEED);

    /* create matrix A and matrix B */
    for (i = 0; i < _matrix_size; i++) {
        AA[i] = rand() % 10;
        BB[i] = rand() % 10;
    }

    /* compute C = A * B */
    #pragma omp parallel for private(computing, j, k)
    for (i = 0; i < _vector_size; i++) {
        for (j = 0; j < _vector_size; j++) {
            computing = 0;
            for (k = 0; k < _vector_size; k++)
                computing += AA[i*_vector_size + k] * BB[k*_vector_size + j];
            CC[i*_vector_size + j] = computing;
        }
    }
OpenMP example: matrix (4)

    printf("\nVector_size:%d\n", _vector_size);
    printf("Matrix_size:%d\n", _matrix_size);
    printf("Processing time:%f\n", time);   /* time measurement is omitted on this slide */
    return 0;
}
OpenMP Directive Table

Directive      Description
atomic         Specifies a memory location that will be updated atomically.
barrier        Synchronizes all threads in a team; every thread pauses at the barrier until all threads have reached it.
critical       Specifies that the code is executed by only one thread at a time.
flush          Specifies that all threads have the same view of memory for all shared objects.
for            Causes the work done in a for loop inside a parallel region to be divided among threads.
master         Specifies that only the master thread should execute a section of the program.
ordered        Specifies that code under a parallelized for loop should be executed like a sequential loop.
parallel       Defines a parallel region, which is code that will be executed by multiple threads in parallel.
sections       Identifies code sections to be divided among all threads.
single         Lets you specify that a section of code should be executed on a single thread, not necessarily the master thread.
threadprivate  Specifies that a variable is private to a thread.

Source: http://msdn2.microsoft.com/zh-tw/library/0ca2w8dk(VS.80).aspx
OpenMP Clause Table

Clause        Description
copyin        Allows threads to access the master thread's value, for a threadprivate variable.
copyprivate   Specifies that one or more variables should be shared among all threads.
default       Specifies the behavior of unscoped variables in a parallel region.
firstprivate  Specifies that each thread should have its own instance of a variable, initialized with the value the variable has before the parallel construct.
if            Specifies whether a loop should be executed in parallel or in serial.
lastprivate   Specifies that the enclosing context's version of the variable is set equal to the private version of whichever thread executes the final iteration (for-loop construct) or last section (#pragma sections).
nowait        Overrides the barrier implicit in a directive.
num_threads   Sets the number of threads in a thread team.
ordered       Required on a parallel for statement if an ordered directive is to be used in the loop.
private       Specifies that each thread should have its own instance of a variable.
reduction     Specifies that one or more variables that are private to each thread are the subject of a reduction operation at the end of the parallel region.
schedule      Applies to the for directive; has four kinds: static, dynamic, guided, runtime.
shared        Specifies that one or more variables should be shared among all threads.

Source: http://msdn2.microsoft.com/zh-tw/library/0ca2w8dk(VS.80).aspx
Reference
- Michael J. Quinn, "Parallel Programming in C with MPI and OpenMP"
- Introduction to Parallel Computing: http://www.llnl.gov/computing/tutorials/parallel_comp/
- OpenMP standard: http://www.openmp.org/drupal/
- OpenMP MSDN tutorial: http://msdn2.microsoft.com/en-us/library/tt15eb9t(VS.80).aspx
- OpenMP tutorial: http://www.llnl.gov/computing/tutorials/openMP/#DO
- Kang Su Gatlin, Pete Isensee, "Reap the Benefits of Multithreading without All the Work", MSDN Magazine