Blame doc/notes/rma/pt-rma.txt

Packit Service c5cf8c
Below is the current design for passive target RMA on top of CH3.

We assume that there is some asynchronous agent (thread) that
periodically pokes the progress engine, i.e., periodically calls
MPID_Progress_test(). We need to do a general poke of the progress
engine because we don't know whether there will be passive target RMA
or not and there may be other communication going on.

This thread is created only when MPI_Win_create is called and the user
did not pass an info object with the key "no_locks" set to "true". (As
an aside, I wish this were an assert instead of an info. An assert can
be easily passed, whereas users are not likely to go through the
trouble of creating an info object to say no_locks, even if they only
plan to use fence.)

Assuming that such a thread exists, passive target RMA is implemented
in the CH3 channel as follows:

On the source side, MPI_Win_lock and all the RMA operations after it
are simply queued until MPI_Win_unlock is called (similar to what we
do with fence and start/complete). At MPI_Win_unlock, a "lock" packet
is sent over to the target, containing the lock type and the rank of
the source. The source then waits for a "lock granted" reply from the
target. (Singleton puts/gets are handled differently. See
"Optimization for Single Puts, Gets, Accs" below.)

When the target receives a lock packet, it examines the current lock
information for that window (described below) and either grants
the lock by sending a "lock granted" packet to the source or just
queues up the lock request.

When the source receives a "lock granted" reply, it performs the RMA
operations exactly as in the fence or start/complete case. The last
RMA operation also releases the lock on the window at the target.
No separate unlock packet needs to be sent.
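
The source-side bookkeeping described above can be sketched roughly as
follows. This is only an illustration, not the CH3 code: the names
lock_epoch, queued_op, epoch_begin, epoch_queue_op and
epoch_prepare_send, and the fixed MAX_OPS limit, are all hypothetical.
The sketch shows that nothing is sent at lock time, that operations
are merely queued, and that the last queued operation is the one
flagged to release the lock at the target.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; this is not the CH3 implementation. */
enum lock_type { SHARED, EXCLUSIVE };
enum op_kind { OP_PUT, OP_GET, OP_ACC };

#define MAX_OPS 16   /* arbitrary limit for the sketch */

struct queued_op {
    enum op_kind kind;
    int releases_lock;   /* set only on the last op of the epoch */
};

struct lock_epoch {
    enum lock_type type;   /* goes into the "lock" packet */
    int target_rank;
    struct queued_op ops[MAX_OPS];
    size_t n_ops;
};

/* MPI_Win_lock analogue: record the request, send nothing yet. */
void epoch_begin(struct lock_epoch *e, enum lock_type type, int target_rank)
{
    assert(e != NULL);
    e->type = type;
    e->target_rank = target_rank;
    e->n_ops = 0;
}

/* Put/get/accumulate analogue: just queue. Returns 0, or -1 if full. */
int epoch_queue_op(struct lock_epoch *e, enum op_kind kind)
{
    if (e->n_ops == MAX_OPS)
        return -1;
    e->ops[e->n_ops].kind = kind;
    e->ops[e->n_ops].releases_lock = 0;
    e->n_ops++;
    return 0;
}

/* Called from the MPI_Win_unlock analogue once "lock granted" has
 * arrived: flag the last op so that no separate unlock packet is
 * needed. Returns the number of ops to send. */
size_t epoch_prepare_send(struct lock_epoch *e)
{
    if (e->n_ops > 0)
        e->ops[e->n_ops - 1].releases_lock = 1;
    return e->n_ops;
}
```

Sending the lock packet and waiting for the reply are left out on
purpose; they depend on the channel.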

Note that RMA requests that the target receives (put, get, acc) are
always satisfied, because they won't be sent in the first place unless
the source has been authorized to send them.

If none of the RMA operations is a get, the target must send an
acknowledgement to the source when the last RMA operation has
completed. If any one of the operations is a get, we reorder the
operations and perform the get last. In this case, since the source
must wait to receive the data, the acknowledgement is not needed
assuming that data transfer is ordered. If data transfer is not
ordered, an acknowledgement is needed even if the last operation is a
get.
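
A minimal sketch of the reordering and acknowledgement rule, assuming
the queued operations can be represented by their kind alone (real
operations carry buffers and datatypes). The function names
move_get_last and target_ack_needed are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum op_kind { OP_PUT, OP_GET, OP_ACC };

/* Move the last get in the list to the end, keeping the relative order
 * of the other operations. Returns 1 if the list now ends with a get,
 * 0 if it contains no get at all. */
int move_get_last(enum op_kind *ops, size_t n)
{
    size_t i = n;

    /* Find the position just past the last get. */
    while (i > 0 && ops[i - 1] != OP_GET)
        i--;
    if (i == 0)
        return 0;   /* no get in the list */

    /* Shift the operations after that get down by one slot. */
    for (size_t j = i; j < n; j++)
        ops[j - 1] = ops[j];
    ops[n - 1] = OP_GET;
    return 1;
}

/* The explicit acknowledgement can be skipped only when the last
 * operation is a get and data transfer is ordered. */
int target_ack_needed(int last_is_get, int transfer_is_ordered)
{
    assert(last_is_get == 0 || last_is_get == 1);
    return !(last_is_get && transfer_is_ordered);
}
```

With an ordered transport, the arrival of the get data at the source
doubles as the completion notice, which is why the separate
acknowledgement can be dropped in that case.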

The MPI_Win object needs the following lock info:
   int current_lock_type;  /* no_lock, shared, exclusive */
   int shared_lock_ref_cnt;  /* count of active shared locks */
   a queue of unsatisfied lock requests;

When a lock request arrives at the target, the target compares the
incoming and existing lock types and takes the following action:

Incoming           Existing             Action
--------           --------             ------
Shared             Exclusive            Queue it
Shared             NoLock/Shared        Grant it
Exclusive          NoLock               Grant it
Exclusive          Exclusive/Shared     Queue it
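
The table above maps directly onto a small decision function. The
sketch below reuses the field names current_lock_type and
shared_lock_ref_cnt from the lock info; the enum names, the struct name
win_lock_state and the function name handle_lock_request are
assumptions made for the example. Queuing itself is left to the caller.

```c
#include <assert.h>

enum lock_type { NO_LOCK, SHARED, EXCLUSIVE };
enum lock_action { GRANT, QUEUE };

/* Hypothetical container for the per-window lock info. */
struct win_lock_state {
    enum lock_type current_lock_type;
    int shared_lock_ref_cnt;   /* count of active shared locks */
};

/* Apply the Incoming/Existing table. On GRANT the state is updated;
 * on QUEUE the caller is expected to append the request to the queue
 * of unsatisfied lock requests. */
enum lock_action handle_lock_request(struct win_lock_state *w,
                                     enum lock_type incoming)
{
    assert(incoming == SHARED || incoming == EXCLUSIVE);

    if (incoming == SHARED && w->current_lock_type != EXCLUSIVE) {
        /* Shared on NoLock/Shared: grant it. */
        w->current_lock_type = SHARED;
        w->shared_lock_ref_cnt++;
        return GRANT;
    }
    if (incoming == EXCLUSIVE && w->current_lock_type == NO_LOCK) {
        /* Exclusive on NoLock: grant it. */
        w->current_lock_type = EXCLUSIVE;
        return GRANT;
    }
    /* Shared on Exclusive, or Exclusive on Exclusive/Shared. */
    return QUEUE;
}
```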

No change needs to be made to the existing code in the progress engine
for handling put/get/accumulate, except that when the last RMA
operation from a source is completed, the target grants the next
queued lock request if there is one and changes the lock type if
necessary. This can be done even if the synchronization model is fence
or post/start because in that case there will be no lock request to
grant. Therefore we don't need to know whether the current
synchronization model is lock/unlock or not.
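
One possible shape for this release step is sketched below. The
fixed-size FIFO, the struct and function names, and the choice to
grant only the request at the head of the queue are assumptions for
the example; the notes only say that the next queued request is
granted.

```c
#include <assert.h>
#include <stddef.h>

enum lock_type { NO_LOCK, SHARED, EXCLUSIVE };

#define MAX_PENDING 8   /* arbitrary limit for the sketch */

struct lock_req {
    enum lock_type type;
    int source_rank;
};

/* Hypothetical per-window lock info, including the request queue. */
struct win_lock_state {
    enum lock_type current_lock_type;
    int shared_lock_ref_cnt;
    struct lock_req pending[MAX_PENDING];   /* FIFO ring buffer */
    size_t head, count;
};

/* Append an unsatisfied request. Returns 0, or -1 if the queue is full. */
int enqueue_lock_req(struct win_lock_state *w, enum lock_type t, int rank)
{
    if (w->count == MAX_PENDING)
        return -1;
    size_t tail = (w->head + w->count) % MAX_PENDING;
    w->pending[tail].type = t;
    w->pending[tail].source_rank = rank;
    w->count++;
    return 0;
}

/* Called when the last RMA operation from a source has completed.
 * Returns the rank that was granted the lock, or -1 if none was. */
int release_and_grant_next(struct win_lock_state *w)
{
    if (w->current_lock_type == SHARED) {
        assert(w->shared_lock_ref_cnt > 0);
        if (--w->shared_lock_ref_cnt > 0)
            return -1;   /* other shared holders remain */
    }
    w->current_lock_type = NO_LOCK;

    /* With fence or post/start nothing is ever queued, so this
     * returns here without touching anything else. */
    if (w->count == 0)
        return -1;

    struct lock_req next = w->pending[w->head];
    w->head = (w->head + 1) % MAX_PENDING;
    w->count--;

    w->current_lock_type = next.type;
    if (next.type == SHARED)
        w->shared_lock_ref_cnt = 1;
    return next.source_rank;
}
```

A real implementation would send the "lock granted" packet to the
returned rank at this point.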


Optimization for Single Puts, Gets, Accs
----------------------------------------
For the case where the lock/unlock is for a single short
put/accumulate or get, we can send over the put data (or get info)
along with the lock packet. If the lock needs to be queued, it will be
queued with this data or info. When the lock is granted, no "lock
granted" reply needs to be sent. Instead the put data is simply copied
or the get data is sent over. Except in the case of get operations,
MPI_Win_unlock must block until it receives an acknowledgement from
the target that the RMA operation has completed (for both shared and
exclusive locks).
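
As a rough illustration of this optimization, the sketch below handles
a combined lock-plus-operation packet at the target. The packet layout
lock_op_pkt, the SHORT_MAX limit, the reply codes and the function name
handle_lock_op are all hypothetical, and accumulate is omitted for
brevity. Because the lock is acquired and released around the single
operation, the window's lock state is the same before and after a
granted request.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum lock_type { NO_LOCK, SHARED, EXCLUSIVE };
enum op_kind { OP_PUT, OP_GET };
enum reply { REPLY_NONE_QUEUED, REPLY_ACK, REPLY_GET_DATA };

#define SHORT_MAX 16   /* hypothetical "short" payload limit */

/* Hypothetical lock packet carrying one short operation. */
struct lock_op_pkt {
    enum lock_type lock;             /* requested lock type */
    enum op_kind op;
    size_t offset, len;              /* byte range in the window */
    unsigned char data[SHORT_MAX];   /* put payload, or get result */
};

/* Returns which reply the target owes the source. No "lock granted"
 * reply is ever sent: a put is answered with an acknowledgement, a get
 * with the data, and a request that cannot be granted yet is queued
 * together with its data or info (queuing is left to the caller). */
enum reply handle_lock_op(enum lock_type existing,
                          unsigned char *win_base,
                          struct lock_op_pkt *p)
{
    assert(p->len <= SHORT_MAX);

    int grantable = (p->lock == SHARED) ? (existing != EXCLUSIVE)
                                        : (existing == NO_LOCK);
    if (!grantable)
        return REPLY_NONE_QUEUED;

    if (p->op == OP_PUT) {
        memcpy(win_base + p->offset, p->data, p->len);
        return REPLY_ACK;   /* MPI_Win_unlock at the source waits for this */
    }
    memcpy(p->data, win_base + p->offset, p->len);
    return REPLY_GET_DATA;  /* the data itself completes the get */
}
```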