Below is the current design for passive target RMA on top of CH3.

We assume that there is some asynchronous agent (thread) that
periodically pokes the progress engine, i.e., periodically calls
MPID_Progress_test(). We need to do a general poke of the progress
engine because we don't know whether there will be passive target RMA
or not and there may be other communication going on.

This thread is created only when MPI_Win_create is called and the user
did not pass an info object with the key "no_locks" set to "true". (As
an aside, I wish this were an assert instead of an info. An assert can
be easily passed, whereas users are not likely to go through the
trouble of creating an info object to say no_locks, even if they only
plan to use fence.)

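As a rough illustration of the thread-creation rule above, the check
could look like the sketch below. The struct info_entry type and the
needs_progress_thread() helper are hypothetical stand-ins for an MPI
info object and are not part of CH3.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for one (key, value) pair of an MPI info object. */
struct info_entry {
    const char *key;
    const char *value;
};

/* Returns 1 if the progress-poking thread should be created for a new
 * window, 0 if the user set "no_locks" to "true". */
int needs_progress_thread(const struct info_entry *info, int n_entries)
{
    assert(n_entries == 0 || info != NULL);
    for (int i = 0; i < n_entries; i++) {
        if (strcmp(info[i].key, "no_locks") == 0 &&
            strcmp(info[i].value, "true") == 0) {
            return 0;   /* the user promised not to lock this window */
        }
    }
    return 1;           /* no info passed, or no_locks not set to "true" */
}
```

A window created without any info entries (n_entries == 0) gets the
thread, which matches the default case described above.
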
Assuming that such a thread exists, passive target RMA is implemented
in the CH3 channel as follows:

On the source side, MPI_Win_lock and all the RMA operations after it
are simply queued until MPI_Win_unlock is called (similar to what we
do with fence and start/complete). At MPI_Win_unlock, a "lock" packet
is sent over to the target, containing the lock type and the rank of
the source. The source then waits for a "lock granted" reply from the
target. (Singleton puts/gets are handled differently. See
Optimization for Single Puts, Gets, Accs below.)

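The source-side sequence can be pictured as a small state machine. The
sketch below is only an illustration: the state names, the struct
origin_epoch type and the origin_* functions are made up for this
example, and the actual sending of packets is reduced to a comment.

```c
#include <assert.h>

/* Hypothetical states of one lock/unlock epoch on the source side. */
enum origin_state {
    ORIGIN_IDLE,        /* no epoch open */
    ORIGIN_QUEUING,     /* after MPI_Win_lock: operations are only queued */
    ORIGIN_WAIT_GRANT,  /* after MPI_Win_unlock: lock packet sent */
    ORIGIN_ISSUING      /* "lock granted" received: operations go out */
};

struct origin_epoch {
    enum origin_state state;
    int lock_type;
    int target_rank;
    int n_queued_ops;
};

/* MPI_Win_lock: nothing is sent yet, the request is just recorded. */
void origin_win_lock(struct origin_epoch *e, int lock_type, int target_rank)
{
    assert(e->state == ORIGIN_IDLE);
    e->state = ORIGIN_QUEUING;
    e->lock_type = lock_type;
    e->target_rank = target_rank;
    e->n_queued_ops = 0;
}

/* Put/get/accumulate inside the epoch: queued, not sent. */
void origin_rma_op(struct origin_epoch *e)
{
    assert(e->state == ORIGIN_QUEUING);
    e->n_queued_ops++;
}

/* MPI_Win_unlock: the "lock" packet (lock type + source rank) would be
 * sent here; the source then waits for the reply. */
void origin_win_unlock(struct origin_epoch *e)
{
    assert(e->state == ORIGIN_QUEUING);
    e->state = ORIGIN_WAIT_GRANT;
}

/* "lock granted" reply: returns how many queued operations to issue. */
int origin_lock_granted(struct origin_epoch *e)
{
    assert(e->state == ORIGIN_WAIT_GRANT);
    e->state = ORIGIN_ISSUING;
    return e->n_queued_ops;
}
```

The point of the sketch is that no communication happens before
origin_win_unlock(); everything earlier only changes local state.
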
When the target receives a lock packet, it examines the current lock
information for that window (described below) and either grants
the lock by sending a "lock granted" packet to the source or just
queues up the lock request.

When the source receives a "lock granted" reply, it performs the RMA
operations exactly as in the fence or start/complete case. The last
RMA operation also releases the lock on the window at the target.
No separate unlock packet needs to be sent.

Note that RMA requests that the target receives (put, get, acc) are
always satisfied, because they won't be sent in the first place unless
the source has been authorized to send them.

If none of the RMA operations is a get, the target must send an
acknowledgement to the source when the last RMA operation has
completed. If any one of the operations is a get, we reorder the
operations and perform the get last. In this case, since the source
must wait to receive the data, the acknowledgement is not needed
assuming that data transfer is ordered. If data transfer is not
ordered, an acknowledgement is needed even if the last operation is a
get.

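The acknowledgement rule can be written down compactly. The following
is a minimal sketch, assuming a hypothetical struct rma_op that carries
only an operation kind and an id; move_get_last() and needs_ack() are
illustrative names, not CH3 functions. The sketch keeps the relative
order of the other operations, which the text does not require.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum rma_kind { RMA_PUT, RMA_GET, RMA_ACC };

/* Hypothetical queued operation: kind plus an id for illustration. */
struct rma_op {
    enum rma_kind kind;
    int id;
};

/* Moves one get (the last one in the queue) to the end, keeping the
 * relative order of the other operations. Returns 1 if a get exists. */
int move_get_last(struct rma_op *ops, size_t n)
{
    for (size_t i = n; i-- > 0; ) {
        if (ops[i].kind == RMA_GET) {
            if (i + 1 < n) {
                struct rma_op get = ops[i];
                memmove(&ops[i], &ops[i + 1], (n - 1 - i) * sizeof ops[0]);
                ops[n - 1] = get;
            }
            return 1;
        }
    }
    return 0;
}

/* Returns 1 if the target must send an explicit acknowledgement: always,
 * unless the last operation is a get and data transfer is ordered. */
int needs_ack(const struct rma_op *ops, size_t n, int transfer_is_ordered)
{
    assert(n > 0);
    return !(ops[n - 1].kind == RMA_GET && transfer_is_ordered);
}
```

For a queue of put, get, acc, the get ends up last and, with ordered
transfer, the returned get data itself tells the source that everything
before it has completed.
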
The MPI_Win object needs the following lock info:
    int current_lock_type;    /* no_lock, shared, exclusive */
    int shared_lock_ref_cnt;  /* count of active shared locks */
    a queue of unsatisfied lock requests;

When a lock request arrives at the target, it looks at the incoming
and existing lock types and takes the following action:

Incoming     Existing            Action
--------     --------            ------
Shared       Exclusive           Queue it
Shared       NoLock/Shared       Grant it
Exclusive    NoLock              Grant it
Exclusive    Exclusive/Shared    Queue it

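The table above translates almost directly into code. This is a minimal
sketch that reuses the two field names from the lock info; the enum
names and handle_lock_request() are assumptions made for the example,
and the queue itself is left to the caller.

```c
#include <assert.h>

enum lock_type { LOCK_NONE, LOCK_SHARED, LOCK_EXCLUSIVE };
enum lock_action { LOCK_GRANT, LOCK_QUEUE };

struct win_lock_state {
    enum lock_type current_lock_type;  /* no_lock, shared, exclusive */
    int shared_lock_ref_cnt;           /* count of active shared locks */
};

/* Applies the incoming/existing table. On LOCK_GRANT the window state is
 * updated; on LOCK_QUEUE the caller appends the request to its queue. */
enum lock_action handle_lock_request(struct win_lock_state *w,
                                     enum lock_type incoming)
{
    assert(incoming != LOCK_NONE);

    if (incoming == LOCK_SHARED && w->current_lock_type != LOCK_EXCLUSIVE) {
        /* Shared on NoLock/Shared: grant and count the new holder. */
        w->current_lock_type = LOCK_SHARED;
        w->shared_lock_ref_cnt++;
        return LOCK_GRANT;
    }
    if (incoming == LOCK_EXCLUSIVE && w->current_lock_type == LOCK_NONE) {
        /* Exclusive on NoLock: grant. */
        w->current_lock_type = LOCK_EXCLUSIVE;
        return LOCK_GRANT;
    }
    /* Shared on Exclusive, or Exclusive on Exclusive/Shared: queue. */
    return LOCK_QUEUE;
}
```

Each of the four table rows corresponds to one path through the
function, so the table and the code can be checked against each other.
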
No change needs to be made to the existing code in the progress engine
for handling put/get/accumulate, except that when the last RMA
operation from a source is completed, grant the next queued lock
request if there is one and change current_lock_type if necessary. This
can be done even if the sync model is fence or post/start because in
that case there will be no lock request to grant. Therefore we don't
need to know whether the current sync model is lock/unlock or not.

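The release step might look like the sketch below. The fixed-size FIFO,
the struct lock_request type and release_lock() are hypothetical. The
sketch also assumes that a shared lock is released only when its last
holder finishes, and it grants only one queued request at a time.

```c
#include <assert.h>

enum lock_type { LOCK_NONE, LOCK_SHARED, LOCK_EXCLUSIVE };

#define MAX_PENDING 16

struct lock_request {
    enum lock_type type;
    int source_rank;
};

struct win_lock_state {
    enum lock_type current_lock_type;
    int shared_lock_ref_cnt;
    struct lock_request pending[MAX_PENDING];  /* FIFO of queued requests */
    int head;
    int count;
};

/* Called when the last RMA operation from a source has completed.
 * Returns the rank that is granted the lock next, or -1 if none. */
int release_lock(struct win_lock_state *w)
{
    assert(w->current_lock_type != LOCK_SHARED || w->shared_lock_ref_cnt > 0);

    if (w->current_lock_type == LOCK_SHARED && --w->shared_lock_ref_cnt > 0)
        return -1;              /* other shared holders are still active */

    w->current_lock_type = LOCK_NONE;
    w->shared_lock_ref_cnt = 0;

    if (w->count == 0)
        return -1;              /* fence or post/start: nothing is queued */

    struct lock_request next = w->pending[w->head];
    w->head = (w->head + 1) % MAX_PENDING;
    w->count--;

    w->current_lock_type = next.type;
    if (next.type == LOCK_SHARED)
        w->shared_lock_ref_cnt = 1;
    return next.source_rank;    /* caller sends "lock granted" to this rank */
}
```

With an empty queue the function only resets the lock state, which is
why it is safe to call under fence or post/start as well.
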
Optimization for Single Puts, Gets, Accs
----------------------------------------
For the case where the lock/unlock is for a single short
put/accumulate or get, we can send over the put data (or get info)
along with the lock packet. If the lock needs to be queued, it will be
queued with this data or info. When the lock is granted, no "lock
granted" reply needs to be sent. Instead the put data is simply copied
or the get data is sent over. Except in the case of get operations,
MPI_Win_unlock must block until it receives an acknowledgement from the
target that the RMA operation has completed (for both shared and
exclusive locks).

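As an illustration of the combined packet, the sketch below shows one
possible layout and what the target does once the lock can be granted.
The struct lock_op_pkt layout, the SHORT_MAX limit and
apply_single_op() are assumptions for this example, and accumulate is
modelled as a byte-wise sum rather than a real MPI reduction.

```c
#include <assert.h>
#include <string.h>

enum lock_type { LOCK_SHARED, LOCK_EXCLUSIVE };
enum single_op { SINGLE_PUT, SINGLE_ACC, SINGLE_GET };
enum reply { REPLY_ACK, REPLY_GET_DATA };

#define SHORT_MAX 64   /* hypothetical limit for a "short" operation */

/* Hypothetical lock packet that also carries the single operation. */
struct lock_op_pkt {
    enum lock_type lock_type;
    int source_rank;
    enum single_op op;
    size_t offset;                  /* byte offset into the window */
    size_t len;                     /* number of bytes */
    unsigned char data[SHORT_MAX];  /* put/acc payload, unused for get */
};

/* Target side, once the lock can be granted: perform the operation
 * directly instead of sending "lock granted". The return value says
 * which reply goes back to the source. */
enum reply apply_single_op(unsigned char *win_base, size_t win_size,
                           const struct lock_op_pkt *pkt,
                           unsigned char *reply_buf)
{
    assert(pkt->len <= SHORT_MAX);
    assert(pkt->offset <= win_size && pkt->len <= win_size - pkt->offset);

    switch (pkt->op) {
    case SINGLE_PUT:
        memcpy(win_base + pkt->offset, pkt->data, pkt->len);
        return REPLY_ACK;       /* MPI_Win_unlock waits for this ack */
    case SINGLE_ACC:
        for (size_t i = 0; i < pkt->len; i++)
            win_base[pkt->offset + i] =
                (unsigned char)(win_base[pkt->offset + i] + pkt->data[i]);
        return REPLY_ACK;
    case SINGLE_GET:
        memcpy(reply_buf, win_base + pkt->offset, pkt->len);
        return REPLY_GET_DATA;  /* the data itself serves as completion */
    }
    return REPLY_ACK;           /* not reached for valid packets */
}
```

If the lock cannot be granted immediately, the whole packet would be
queued as is and passed to apply_single_op() later.
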