
An exclusive lock should be used if the executing process will change the value of window entries using MPI_Put and if these entries could also be accessed by other processes.

A shared lock is indicated by MPI_LOCK_SHARED. This lock type guarantees that the following RMA operations of the calling process are protected from exclusive RMA operations of other processes, i.e., other processes are not allowed to change entries of the window via RMA operations that are protected by an exclusive lock. But other processes are allowed to perform RMA operations on the same window that are also protected by a shared lock. Shared locks should be used if the executing process accesses window entries only by MPI_Get or MPI_Accumulate.

When a process wants to read or manipulate entries of its local window using local operations, it must protect these local operations with a lock mechanism if these entries can also be accessed by other processes.

An access epoch started by MPI_Win_lock for a window win can be terminated by calling the MPI function

  int MPI_Win_unlock(int rank, MPI_Win win)

where rank is the rank of the target process. The call of this function blocks until all RMA operations issued by the calling process on the specified window have been completed both at the calling process and at the target process. This guarantees that all manipulations of window entries issued by the calling process have taken effect at the target process.

Example: The use of lock synchronization for the iterative computation of a distributed data structure is illustrated in the following example, which is a variation of the previous examples. Here, an exclusive lock is used to protect the RMA operations:

  while (!converged(A)) {
    update(A);
    update_buffer(A, from_buf);
    for (i = 0; i < num_neighbors; i++) {
      /* exclusive lock: no other process may access the window
         of neighbor[i] while the put is in progress */
      MPI_Win_lock(MPI_LOCK_EXCLUSIVE, neighbor[i], 0, win);
      MPI_Put(from_buf[i], size[i], MPI_INT, neighbor[i],
              to_disp[i], size[i], MPI_INT, win);
      MPI_Win_unlock(neighbor[i], win); /* blocks until the put has completed */
    }
  }
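As a counterpart, a read-only variant of this loop would need only shared locks, since the calling process then accesses the remote window entries exclusively via MPI_Get. The following sketch is not from the text; it reuses the names of the example above (neighbor, to_disp, size, win) and assumes a receive buffer recv_buf and the MPI_INT datatype:

  /* hedged sketch: shared locks suffice because the window entries
     are only read, so concurrent shared-lock accesses of other
     processes to the same window remain possible */
  for (i = 0; i < num_neighbors; i++) {
    MPI_Win_lock(MPI_LOCK_SHARED, neighbor[i], 0, win);
    MPI_Get(recv_buf[i], size[i], MPI_INT, neighbor[i],
            to_disp[i], size[i], MPI_INT, win);
    MPI_Win_unlock(neighbor[i], win); /* recv_buf[i] is valid after this call */
  }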
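For experimentation, the following minimal, self-contained program (our sketch, not from the text) lets process 0 write a value into the window of process 1 under an exclusive lock and read it back under a shared lock; the window layout of one integer per process is an illustrative assumption:

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char *argv[]) {
    int rank, size, win_buf = 0;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* each process exposes one integer in the window */
    MPI_Win_create(&win_buf, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    if (rank == 0 && size > 1) {
      int val = 42, result = 0;

      /* exclusive lock: no other process may access the window
         of process 1 while the put is in progress */
      MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
      MPI_Put(&val, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
      MPI_Win_unlock(1, win);  /* blocks until the put has taken effect */

      /* shared lock suffices here: the window is only read via MPI_Get */
      MPI_Win_lock(MPI_LOCK_SHARED, 1, 0, win);
      MPI_Get(&result, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
      MPI_Win_unlock(1, win);  /* result is valid after this call */

      printf("process 0 read %d from the window of process 1\n", result);
    }

    MPI_Win_free(&win);  /* collective, matches the collective create */
    MPI_Finalize();
    return 0;
  }

Compiled with mpicc and started with at least two processes (e.g., mpiexec -n 2 ./a.out), process 0 should print the value 42.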