Parallel Programming using MPI
Collective Communications
Claudio Gheller
CINECA
c.gheller@cineca.it




Collective Communications




   Collective Communications

     !"#$$%&'()*'#&+,'&-#.-'&/,),/0#%1,#2,10#(3++
     !")..34,56,).. 10#(3++3+,'&,),(#$$%&'()*#0
   Barrier Synchronization
   Broadcast
   Gather/Scatter
   Reduction (sum, max, prod, … )




Collective Communications




     Characteristics



     All processes must call the collective routine
     No non-blocking collective communication
     No tags



                  Safest and most efficient
                    communication mode







Collective Communications



    MPI_Barrier
    Stops each process until all processes within the communicator have reached the
       barrier

    Fortran:
    CALL MPI_BARRIER(comm, ierr)


    C:
    int MPI_Barrier(MPI_Comm comm)
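
    A minimal C sketch of a typical barrier use, added here for illustration; the file
    name and the "setup work" being synchronized are placeholders, not part of the
    original slides:

      /* barrier_example.c -- illustrative sketch */
      #include <stdio.h>
      #include <mpi.h>

      int main(int argc, char *argv[])
      {
          int myid, nproc;

          MPI_Init(&argc, &argv);
          MPI_Comm_size(MPI_COMM_WORLD, &nproc);
          MPI_Comm_rank(MPI_COMM_WORLD, &myid);

          /* ... each process does some local setup work here ... */

          /* no process passes this point until every process has reached it */
          MPI_Barrier(MPI_COMM_WORLD);

          if (myid == 0)
              printf("all %d processes passed the barrier\n", nproc);

          MPI_Finalize();
          return 0;
      }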




Collective Communications



  Barrier
 [Figure: at time t1 processes P0..P4 are at different points; at t2 they wait at the
  barrier; at t3, once all five have arrived, they all proceed past the barrier.]




Collective Communications

               Broadcast (MPI_BCAST)
    One-to-all communication: same data sent from root process to all others in the
       communicator

    Fortran:
             INTEGER count, type, root, comm, ierr
             CALL MPI_BCAST(buf, count, type, root, comm, ierr)
             buf is an array of type type


    C:
             int MPI_Bcast(void *buf, int count, MPI_Datatype datatype, int root,
                MPI_Comm comm)


    All processes must specify the same root and comm




Collective Communications

              Broadcast
 PROGRAM broad_cast
  INCLUDE 'mpif.h'
  INTEGER ierr, myid, nproc, root
  INTEGER status(MPI_STATUS_SIZE)
  REAL A(2)
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
  root = 0
  IF( myid .EQ. 0 ) THEN
    a(1) = 2.0
    a(2) = 4.0
  END IF
  CALL MPI_BCAST(a, 2, MPI_REAL, 0,
 &     MPI_COMM_WORLD, ierr)
  WRITE(6,*) myid, ': a(1)=', a(1), 'a(2)=', a(2)
  CALL MPI_FINALIZE(ierr)
  END

 [Figure: the root process P0 sends array a to P1, P2 and P3.]








Collective Communications

                          Scatter / Gather



     Scatter                                            Gather

 [Figure: in Scatter, the root P0's sndbuf (a1 a2 a3 a4) is split so that each of
  P0..P3 receives one element (a1..a4) in its rcvbuf; in Gather, the element in each
  process's sndbuf (a1..a4) is collected, in rank order, into P0's rcvbuf (a1 a2 a3 a4).]




Collective Communications



   MPI_Scatter
   One-to-all communication: different data sent from root process to all others in the
      communicator



   Fortran:
        CALL MPI_SCATTER(sndbuf, sndcount, sndtype, rcvbuf, rcvcount, rcvtype,
           root, comm, ierr)


   Argument definitions are like those of other MPI subroutines
   sndcount is the number of elements sent to each process, not the size of sndbuf, which
      should be sndcount times the number of processes in the communicator
   The send arguments are meaningful only for the root








Collective Communications


   MPI_SCATTERV
   Usage
        –   int MPI_Scatterv( void* sendbuf,               /* in */
                              int* sendcounts,             /* in */
                              int* displs,                 /* in */
                              MPI_Datatype sendtype,       /* in */
                              void* recvbuf,               /* out */
                              int recvcount,               /* in */
                              MPI_Datatype recvtype,       /* in */
                              int root,                    /* in */
                              MPI_Comm comm );             /* in */
   Description
       –   Distributes individual messages from root to each process in communicator
       –   Messages can have different sizes and displacements
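
    A hedged C sketch of MPI_Scatterv (not in the original slides): the root sends
    i+1 elements to rank i, with displs marking where each piece starts in sendbuf.
    The buffer sizes and the assumption of exactly 4 processes are illustrative only.

      /* scatterv_example.c -- illustrative sketch, assumes exactly 4 processes */
      #include <stdio.h>
      #include <mpi.h>

      int main(int argc, char *argv[])
      {
          int myid, nproc, i;
          int sendcounts[4], displs[4];
          double sendbuf[10];           /* 1+2+3+4 elements, filled on the root */
          double recvbuf[4];            /* large enough for the biggest piece   */

          MPI_Init(&argc, &argv);
          MPI_Comm_size(MPI_COMM_WORLD, &nproc);
          MPI_Comm_rank(MPI_COMM_WORLD, &myid);

          /* rank i receives i+1 elements; pieces lie back to back in sendbuf */
          for (i = 0; i < 4; i++) {
              sendcounts[i] = i + 1;
              displs[i] = (i * (i + 1)) / 2;         /* 0, 1, 3, 6 */
          }
          if (myid == 0)
              for (i = 0; i < 10; i++) sendbuf[i] = (double)i;

          MPI_Scatterv(sendbuf, sendcounts, displs, MPI_DOUBLE,
                       recvbuf, myid + 1, MPI_DOUBLE,
                       0, MPI_COMM_WORLD);

          printf("rank %d received %d elements, first = %f\n",
                 myid, myid + 1, recvbuf[0]);

          MPI_Finalize();
          return 0;
      }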








Collective Communications



   MPI_Gather
   All-to-one communication: different data collected by the root process from all other
      processes in the communicator. It is the opposite of Scatter
   Fortran:
   CALL MPI_GATHER(sndbuf, sndcount, sndtype, rcvbuf, rcvcount,
      rcvtype, root, comm, ierr)




   Argument definitions are like those of other MPI subroutines
   rcvcount is the number of elements collected from each process, not the size of rcvbuf,
       which should be rcvcount times the number of processes in the communicator
   The receive arguments are meaningful only for the root








Collective Communications


   MPI_GATHERV
   Usage
        –   int MPI_Gatherv( void* sendbuf,                /* in */
                             int sendcount,                /* in */
                             MPI_Datatype sendtype,        /* in */
                             void* recvbuf,                /* out */
                             int* recvcounts,              /* in */
                             int* displs,                  /* in */
                             MPI_Datatype recvtype,        /* in */
                             int root,                     /* in */
                             MPI_Comm comm );              /* in */
   Description
        –   Collects individual messages from each process in the communicator to the root
            process and stores them in rank order
       –   Messages can have different sizes and displacements
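
    A matching C sketch for MPI_Gatherv (not from the slides), the mirror of the
    scatterv example above: each rank contributes myid+1 elements and the root stores
    them contiguously in rank order. Again, exactly 4 processes are assumed.

      /* gatherv_example.c -- illustrative sketch, assumes exactly 4 processes */
      #include <stdio.h>
      #include <mpi.h>

      int main(int argc, char *argv[])
      {
          int myid, nproc, i;
          int recvcounts[4], displs[4];
          double sendbuf[4];            /* each rank fills myid+1 entries        */
          double recvbuf[10];           /* 1+2+3+4 elements, significant on root */

          MPI_Init(&argc, &argv);
          MPI_Comm_size(MPI_COMM_WORLD, &nproc);
          MPI_Comm_rank(MPI_COMM_WORLD, &myid);

          for (i = 0; i <= myid; i++)
              sendbuf[i] = (double)myid;             /* rank i sends i+1 copies of i   */

          for (i = 0; i < 4; i++) {
              recvcounts[i] = i + 1;                 /* how many elements rank i sends */
              displs[i] = (i * (i + 1)) / 2;         /* where they land in recvbuf     */
          }

          MPI_Gatherv(sendbuf, myid + 1, MPI_DOUBLE,
                      recvbuf, recvcounts, displs, MPI_DOUBLE,
                      0, MPI_COMM_WORLD);

          if (myid == 0)
              for (i = 0; i < 10; i++)
                  printf("recvbuf[%d] = %f\n", i, recvbuf[i]);

          MPI_Finalize();
          return 0;
      }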








Collective Communications

  Scatter example
    PROGRAM scatter
    INCLUDE 'mpif.h'
    INTEGER ierr, myid, nproc, nsnd, I, root
    INTEGER status(MPI_STATUS_SIZE)
    REAL A(16), B(2)
    CALL MPI_INIT(ierr)
    CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)
    CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
    root = 0
    IF( myid .eq. root ) THEN
      DO i = 1, 16
        a(i) = REAL(i)
      END DO
    END IF
    nsnd = 2
    CALL MPI_SCATTER(a, nsnd, MPI_REAL, b, nsnd,
   & MPI_REAL, root, MPI_COMM_WORLD, ierr)
    WRITE(6,*) myid, ': b(1)=', b(1), 'b(2)=', b(2)
    CALL MPI_FINALIZE(ierr)
    END






Collective Communications
    Gather example
      PROGRAM gather
      INCLUDE 'mpif.h'
      INTEGER ierr, myid, nproc, nsnd, I, root
      INTEGER status(MPI_STATUS_SIZE)
      REAL A(16), B(2)
      CALL MPI_INIT(ierr)
      CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
      root = 0
      b(1) = REAL( myid )
      b(2) = REAL( myid )
      nsnd = 2
      CALL MPI_GATHER(b, nsnd, MPI_REAL, a, nsnd,
     & MPI_REAL, root, MPI_COMM_WORLD, ierr)
      IF( myid .eq. root ) THEN
        DO i = 1, (nsnd*nproc)
          WRITE(6,*) myid, ': a(i)=', a(i)
        END DO
      END IF
      CALL MPI_FINALIZE(ierr)
      END

Collective Communications

                    MPI_Alltoall
             Fortran:
            CALL MPI_ALLTOALL(sndbuf, sndcount, sndtype, rcvbuf, rcvcount,
            rcvtype, comm, ierr)


 [Figure: with 4 processes, the sndbuf rows P0 (a1 a2 a3 a4), P1 (b1 b2 b3 b4),
  P2 (c1 c2 c3 c4), P3 (d1 d2 d3 d4) are redistributed so that the rcvbuf rows become
  P0 (a1 b1 c1 d1), P1 (a2 b2 c2 d2), P2 (a3 b3 c3 d3), P3 (a4 b4 c4 d4).]

                Very useful to implement data transposition
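
    A hedged C sketch of MPI_Alltoall (not part of the original slides) performing the
    transposition sketched above: every rank sends one element to every rank. The value
    encoding (rank + 0.01*index) is just for illustration.

      /* alltoall_example.c -- illustrative sketch */
      #include <stdio.h>
      #include <stdlib.h>
      #include <mpi.h>

      int main(int argc, char *argv[])
      {
          int myid, nproc, i;
          double *sndbuf, *rcvbuf;

          MPI_Init(&argc, &argv);
          MPI_Comm_size(MPI_COMM_WORLD, &nproc);
          MPI_Comm_rank(MPI_COMM_WORLD, &myid);

          sndbuf = malloc(nproc * sizeof(double));
          rcvbuf = malloc(nproc * sizeof(double));

          /* element i of rank r encodes "r.i"; after the call, rank i holds
             element i of every rank, i.e. the transposed layout */
          for (i = 0; i < nproc; i++)
              sndbuf[i] = myid + 0.01 * i;

          MPI_Alltoall(sndbuf, 1, MPI_DOUBLE,
                       rcvbuf, 1, MPI_DOUBLE, MPI_COMM_WORLD);

          for (i = 0; i < nproc; i++)
              printf("rank %d: rcvbuf[%d] = %5.2f\n", myid, i, rcvbuf[i]);

          free(sndbuf);
          free(rcvbuf);
          MPI_Finalize();
          return 0;
      }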





Collective Communications



     Reduction
     The reduction operation allows one to:
     •       Collect data from each process
     •       Reduce the data to a single value
     •       Store the result on the root process (MPI_REDUCE)
     •       Store the result on all processes (MPI_ALLREDUCE)
     •       Overlap communication and computation




Collective Communications

               Reduce, Parallel Sum

 [Figure: P0..P3 hold local pairs (a1,b1)..(a4,b4); the parallel sum computes
  Sa = a1+a2+a3+a4 and Sb = b1+b2+b3+b4, and each process ends up with Sa and Sb.]

                     The reduction function works with arrays
                     Other operations: product, min, max, and, ...








Collective Communications



    MPI_REDUCE and MPI_ALLREDUCE
    Fortran:
    MPI_REDUCE( snd_buf, rcv_buf, count, type, op, root, comm, ierr)

    snd_buf      input array of type type containing local values.
    rcv_buf      output array of type type containing global results
    count       (INTEGER) number of elements of snd_buf and rcv_buf
    type        (INTEGER) MPI type of snd_buf and rcv_buf
    op          (INTEGER) parallel operation to be performed
    root        (INTEGER) MPI id of the process storing the result
    comm        (INTEGER) communicator of processes involved in the operation
    ierr        (INTEGER) output, error code (if ierr=0 no error occurs)



    MPI_ALLREDUCE( snd_buf, rcv_buf, count, type, op, comm, ierr)

    The argument root is missing; the result is stored on all processes.
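
    A short C sketch (not from the slides) of MPI_ALLREDUCE: each rank contributes one
    local value and every rank receives the global sum; note there is no root argument.

      /* allreduce_example.c -- illustrative sketch */
      #include <stdio.h>
      #include <mpi.h>

      int main(int argc, char *argv[])
      {
          int myid, nproc;
          double local, total;

          MPI_Init(&argc, &argv);
          MPI_Comm_size(MPI_COMM_WORLD, &nproc);
          MPI_Comm_rank(MPI_COMM_WORLD, &myid);

          local = (double)myid;        /* each rank contributes its own id */

          /* every rank receives the global sum 0 + 1 + ... + (nproc-1) */
          MPI_Allreduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

          printf("rank %d: total = %f\n", myid, total);

          MPI_Finalize();
          return 0;
      }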




Collective Communications


         Predefined Reduction Operations
                   MPI op           Function

                   MPI_MAX          Maximum

                   MPI_MIN          Minimum

                   MPI_SUM          Sum

                   MPI_PROD         Product

                   MPI_LAND         Logical AND

                   MPI_BAND         Bitwise AND

                   MPI_LOR          Logical OR

                   MPI_BOR          Bitwise OR

                   MPI_LXOR         Logical exclusive OR

                   MPI_BXOR         Bitwise exclusive OR

                   MPI_MAXLOC       Maximum and location

                   MPI_MINLOC       Minimum and location
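
    MPI_MAXLOC and MPI_MINLOC return both the extreme value and the rank (or index) that
    owns it, and require one of the paired MPI datatypes such as MPI_DOUBLE_INT. A hedged
    C sketch, not from the original slides; the local values are arbitrary:

      /* maxloc_example.c -- illustrative sketch of MPI_MAXLOC */
      #include <stdio.h>
      #include <mpi.h>

      int main(int argc, char *argv[])
      {
          int myid, nproc;
          struct { double value; int rank; } in, out;   /* layout of MPI_DOUBLE_INT */

          MPI_Init(&argc, &argv);
          MPI_Comm_size(MPI_COMM_WORLD, &nproc);
          MPI_Comm_rank(MPI_COMM_WORLD, &myid);

          in.value = (double)((7 * myid) % 5);          /* some local value      */
          in.rank  = myid;                              /* tag it with the owner */

          /* every rank learns the maximum value and which rank holds it */
          MPI_Allreduce(&in, &out, 1, MPI_DOUBLE_INT, MPI_MAXLOC, MPI_COMM_WORLD);

          if (myid == 0)
              printf("max value %f is on rank %d\n", out.value, out.rank);

          MPI_Finalize();
          return 0;
      }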








Collective Communications

    Reduce / 1

    C:
         int MPI_Reduce(void * snd_buf, void * rcv_buf, int count,
         MPI_Datatype type, MPI_Op op, int root, MPI_Comm comm)

         int MPI_Allreduce(void * snd_buf, void * rcv_buf, int
         count, MPI_Datatype type, MPI_Op op, MPI_Comm comm)




Collective Communications

       Reduce, example
       PROGRAM reduce
         INCLUDE 'mpif.h'
         INTEGER ierr, myid, nproc, root
         INTEGER status(MPI_STATUS_SIZE)
         REAL A(2), res(2)
         CALL MPI_INIT(ierr)
         CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)
         CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
         root = 0
         a(1) = 2.0*myid
         a(2) = 4.0+myid
         CALL MPI_REDUCE(a, res, 2, MPI_REAL, MPI_SUM, root,
        & MPI_COMM_WORLD, ierr)
         IF( myid .EQ. 0 ) THEN
           WRITE(6,*) myid, ': res(1)=', res(1), 'res(2)=', res(2)
         END IF
         CALL MPI_FINALIZE(ierr)
         END


