Blame doc/notes/pt2pt/pt2pt.txt

pt2pt requirement

- need to specify blocking vs. non-blocking for most routines


------------------------------------------------------------------------

MPI_Send_init(buf, count, datatype, dest, tag, comm, request, error)
MPI_Bsend_init(buf, count, datatype, dest, tag, comm, request, error)
MPI_Rsend_init(buf, count, datatype, dest, tag, comm, request, error)
MPI_Ssend_init(buf, count, datatype, dest, tag, comm, request, error)
MPI_Recv_init(buf, count, datatype, src, tag, comm, request, error)
{
    request_p = MPIR_Request_alloc();

    /* Fill in request structure based on parameters and type of operation */
    request_p->buf = buf;
    request_p->count = count;
    request_p->datatype = datatype;
    request_p->rank = dest;    /* src in the MPI_Recv_init case */
    request_p->tag = tag;
    request_p->comm = comm;
    request_p->type = persistent | <type>;

    *request = MPIR_Request_handle(request_p);
}


MPI_Start(request, error)
{
    switch(request->type)
    {
        case send:
            MPID_Isend(buf, count, datatype, dest, tag, comm, request_p,
                       error);
            break;
        case bsend:
            MPID_Ibsend(...)
            break;
        case rsend:
            MPID_Irsend(...)
            break;
        case ssend:
            MPID_Issend(...)
            break;
        case recv:
            MPID_Irecv(...)
            break;
    }
}
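
The init/start split above can be sketched in self-contained C.  This is only
an illustration under stated assumptions: the names preq_t, preq_init,
preq_start and the OP_* constants are hypothetical, plain ints stand in for
MPI handles, and the dispatch only reports which MPID_I*send/MPID_Irecv call
it would have made instead of calling a device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoding of "persistent | <type>": the low bits hold the
   operation, one flag bit marks the request as persistent. */
enum {
    OP_SEND = 1, OP_BSEND, OP_RSEND, OP_SSEND, OP_RECV,
    OP_MASK = 0x0f,
    OP_PERSISTENT = 0x10
};

typedef struct preq {
    void *buf;
    int count;
    int datatype;   /* plain ints stand in for MPI handles */
    int rank;       /* dest for sends, src for receives */
    int tag;
    int comm;
    int type;
    int starts;     /* how many times the request has been started */
} preq_t;

/* Mirrors the body of the MPI_*_init routines: copy the parameters into the
   request and record the operation type. */
static void preq_init(preq_t *r, void *buf, int count, int datatype,
                      int rank, int tag, int comm, int op)
{
    assert(r != NULL && op >= OP_SEND && op <= OP_RECV);
    r->buf = buf;
    r->count = count;
    r->datatype = datatype;
    r->rank = rank;
    r->tag = tag;
    r->comm = comm;
    r->type = OP_PERSISTENT | op;
    r->starts = 0;
}

/* Mirrors MPI_Start: dispatch on the stored type.  A real implementation
   would call MPID_Isend(), MPID_Ibsend(), ... here; this stand-in only counts
   the start and returns the operation it would have issued (-1 on error). */
static int preq_start(preq_t *r)
{
    int op = r->type & OP_MASK;

    if (!(r->type & OP_PERSISTENT))
        return -1;
    switch (op) {
    case OP_SEND: case OP_BSEND: case OP_RSEND: case OP_SSEND: case OP_RECV:
        r->starts++;
        return op;
    default:
        return -1;
    }
}
```

Because the parameters live in the request, the same request can be started
any number of times without the caller passing them again.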

- persistent requests require copying parameters into the request structure.
  should we always fill in a request and simply pass the request as the only
  parameter?  this would eliminate optimizations on machines where large
  numbers of parameters can be passed in registers, but the intel boxes will
  just end up pushing the parameters on the stack anyway...

- there is an optimization here that allows registered memory to be maintained
  as registered in the persistent case.  to do this we will need to let the
  method know whether or not we want the memory unregistered.

- need to store request type in request structure so that MPI_Start() can do
  the right thing (tm).

- we chose not to convert handles to structure pointers since the handles may
  contain quick access to common information avoiding pointer dereferences.
  in some cases, an associated structure may not even exist.

  the implication here is that many of the non-persistent MPI_Xsend routines
  will do little work outside of calling an MPID function.  Perhaps we should
  not have separate MPI functions in those cases but rather map the MPI
  functions directly to the MPID functions (through the use of macros or weak
  symbols).

------------------------------------------------------------------------


MPI_Send(buf, count, datatype, dest, tag, comm, error)
MPI_Bsend(buf, count, datatype, dest, tag, comm, error)
MPI_Rsend(buf, count, datatype, dest, tag, comm, error)
MPI_Ssend(buf, count, datatype, dest, tag, comm, error)
{
  /* Map (comm,dest) handle to a virtual connection */
  MPID_Comm_get_connection(comm, dest, &vc);

  /* If virtual connection is not bound to a real connection, then perform
     connection resolution. */

  /* (atomically) If no other requests are queued on this connection, then
     send as much data as possible.  If the entire message could not be sent
     "immediately" then queue the request for later processing.  (We need a
     progress engine to ensure that later happens.) */
  /* Build up a segment unless the datatype is "trivial" */


  /* Wait until entire message is sent */
}
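
The "send as much as possible, otherwise queue" step can be sketched as
follows.  Everything here is an assumption made for illustration: vc_t,
send_req_t, wire_write, vc_send and vc_progress are hypothetical names, the
"wire" is a stand-in that accepts a fixed number of bytes per call, and the
sketch is single-threaded, so the "(atomically)" requirement is not
addressed.

```c
#include <assert.h>
#include <stddef.h>

typedef struct send_req {
    const char *buf;
    size_t len;
    size_t sent;               /* bytes already handed to the wire */
    struct send_req *next;
} send_req_t;

typedef struct vc {
    send_req_t *head, *tail;   /* sends still in progress, in posting order */
    size_t window;             /* bytes the fake wire accepts per call */
} vc_t;

/* Stand-in for a nonblocking write: accepts at most vc->window bytes. */
static size_t wire_write(vc_t *vc, const char *p, size_t n)
{
    (void)p;
    return n < vc->window ? n : vc->window;
}

/* Returns 1 if the message completed immediately, 0 if it was queued. */
static int vc_send(vc_t *vc, send_req_t *req)
{
    assert(req->sent == 0);
    req->next = NULL;
    if (vc->head == NULL) {    /* nothing ahead of us: try to send now */
        req->sent = wire_write(vc, req->buf, req->len);
        if (req->sent == req->len)
            return 1;
    }
    if (vc->tail != NULL)      /* queue behind earlier requests */
        vc->tail->next = req;
    else
        vc->head = req;
    vc->tail = req;
    return 0;
}

/* One step of a progress engine: push the request at the head of the queue. */
static void vc_progress(vc_t *vc)
{
    send_req_t *req = vc->head;

    if (req == NULL)
        return;
    req->sent += wire_write(vc, req->buf + req->sent, req->len - req->sent);
    if (req->sent == req->len) {
        vc->head = req->next;
        if (vc->head == NULL)
            vc->tail = NULL;
    }
}
```

A blocking send would then loop on vc_progress() until its own request has
been fully sent.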

- heterogeneity should be handled by the method.  this allows methods which do
  not require conversions, such as shared memory, to be fully optimized.

- who should set up the segment and convert the buffer (buf, count, datatype)
  to one or more blocks of bytes?  should that be a layer above the method or
  should it be the method itself?

  a method may or may not need to use segments depending on its capabilities.

  there should only be one implementation of the segment API which will be
  called by all of the method implementations.

- we noticed that the segment initialization code takes a (comm,rank) pair
  which will have to be dereferenced to a virtual connection in order to
  determine if data conversion is required.  since we have already done the
  dereference, it would be ideal if the segment took an ADI3 implementation
  (MPID) specific connection type instead of a (comm,rank).  Making this
  parameter type implementation specific implies that the segment interface is
  never called from the MPI layer or that the ADI3 interface provides a means
  of converting a (comm, rank) to a connection type.

- David suggested that we might be able to use the xfer interface for
  point-to-point messaging as well as for collective operations.

  What should the xfer interface look like?

  - David provided a write-up of the existing interface

  - We questioned whether or not multiple receive blocks could be used to
    receive a message sent from a single send block.  We decided that blocks
    define envelopes which match, where a single block defines an envelope (and
    payload) per destination and/or source.  So, a message sent to a particular
    destination (from a single send block) must be received by a single receive
    block.  In other words, the message cannot be broken across receive blocks.

    - there is an asymmetry in the existing interface which allows multiple
      destinations but prevents multiple sources.  the result of this is that
      scattering operations can be naturally described, but aggregation
      operations cannot.  we believe that there are important cases where
      aggregation would benefit collective operations.

    - to address this we believe that we should extend the interface to
      implement a many-to-one, in addition to the existing one-to-many
      interface.  we hope we don't need the many-to-many...

    - perhaps we should call these scatter_init and gather_init (etc)?

  - Nick proposed that the interface be split up such that send requests were
    separate from receive requests.  This implies that there would be a
    xfer_send_init() and xfer_recv_init().  We later threw this out, as it
    didn't make a whole lot of sense with forwards existing in the recv case.

  - Brian wondered about aggregating sends into a single receive and whether
    that could be used to reduce the overhead of message headers when
    forwarding.  We think that this can be done below the xfer interface when
    converting into a dataflow-like structure (?)

- We think it may be necessary to describe dependencies, such as progress,
  completion and buffer.  These dependencies are frighteningly close to
  dataflow...

- basically we see the xfer init...start calls as being converted into a set of
  comm. agent requests and a dependency graph.  we see the dependencies as
  being possibly stored in a tabular format, so that ranges of the incoming
  stream can have different dependencies on them -- specifically this allows
  for progress dependencies on a range basis, which we see as a requirement.
  completion dependencies (of which there may be > 1) would be listed at the
  end of this table.

  the table describes what depends on THIS request, rather than the other way
  around.  this is tailored to a notification system rather than some sort of
  search-for-ready approach (which would be a disaster).
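
One possible shape for such a table, as a rough sketch only.  The names
dep_table_t, dep_entry_t, dep_add and dep_advance are hypothetical, and the
notes do not fix a layout; here each entry says "once this many bytes of the
incoming stream have arrived, notify that dependent" by decrementing a
counter owned by the dependent.  Entries whose threshold equals the message
length play the role of completion dependencies.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPS 8

typedef struct dep_entry {
    size_t upto;     /* fires once this many bytes have arrived */
    int *pending;    /* counter owned by the dependent request */
    int fired;
} dep_entry_t;

typedef struct dep_table {
    dep_entry_t e[MAX_DEPS];
    int n;
    size_t total;    /* message length; upto == total means "on completion" */
    size_t arrived;
} dep_table_t;

/* Register a dependent: it cannot proceed until `upto` bytes have arrived. */
static void dep_add(dep_table_t *t, size_t upto, int *pending)
{
    assert(t->n < MAX_DEPS && upto <= t->total);
    t->e[t->n].upto = upto;
    t->e[t->n].pending = pending;
    t->e[t->n].fired = 0;
    t->n++;
    (*pending)++;
}

/* Called when more data arrives: notify every dependent whose range is now
   satisfied.  Dependents are never searched for; they are told. */
static void dep_advance(dep_table_t *t, size_t nbytes)
{
    int i;

    t->arrived += nbytes;
    for (i = 0; i < t->n; i++) {
        if (!t->e[i].fired && t->arrived >= t->e[i].upto) {
            t->e[i].fired = 1;
            (*t->e[i].pending)--;
        }
    }
}
```

A dependent whose counter has dropped to zero is ready to run, which is what
lets the agent avoid a search-for-ready pass.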

- for dependencies BETWEEN blocks, we propose waiting on the first block to
  complete before starting the next block.  you can still create blocks ahead
  of time if desired.  otherwise blocks may be processed in parallel.

- blocks follow the same envelope matching rules as posted mpi send/recvs
  (commit time order).  this is the only "dependency" between blocks.

reminder: envelope = (context (communicator), source_rank, tag)

QUESTION: what exactly are the semantics of a block?  Sends to the same
destination are definitely ordered.  Sends to different destinations could
proceed in parallel.  Should they?

example:
  init
  rf(5)
  rf(4)
  r
  start

a transfer block defines 0 or 1 envelope/payloads for sources and 0 to N
envelope/payloads for destinations, one per destination.



- The communication agent will need to process these requests and data
  dependencies.  We see the agent having queues of requests similar in nature
  to the run queue within an operating system.  (We aren't really sure what
  this means yet...)  Queues might consist of the active queue, the wait queue,
  and the still-to-be-matched queue.

  - the "try to send right away" code will look to see if there is anything in
    the active queue for the vc, and if not just put it in the run queue and
    call the make progress function (whatever that is...)

- adaptive polling done at the agent level, perhaps with method supplied
  min/max/increments.  comm. agent must track outstanding requests (as
  described above) in order to know WHAT to poll.  we must also take into
  account that there might be incoming active messages or error conditions, so
  we should poll all methods (and all vcs) periodically.
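
A minimal sketch of what method-supplied min/max/increment values could
drive.  The policy shown (reset to the minimum interval when a poll finds
work, back off by the increment up to the maximum when it does not) is an
assumption for illustration and is not decided in these notes; poll_state_t
and poll_update are hypothetical names.

```c
#include <assert.h>

typedef struct poll_state {
    int min;        /* shortest interval between polls of this method */
    int max;        /* longest interval between polls of this method */
    int incr;       /* back-off step applied after an idle poll */
    int interval;   /* current interval, always within [min, max] */
} poll_state_t;

/* Adjust the interval after one poll of the method. */
static void poll_update(poll_state_t *p, int found_work)
{
    assert(p->min <= p->max && p->incr >= 0);
    if (found_work) {
        p->interval = p->min;            /* busy: poll as often as allowed */
    } else {
        p->interval += p->incr;          /* idle: back off ... */
        if (p->interval > p->max)
            p->interval = p->max;        /* ... but never beyond max */
    }
}
```

Keeping max finite is what guarantees that every method is still polled
periodically, so unexpected incoming messages and errors are eventually
noticed.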

- We believe that a MPID_Request might simply contain enough information for
  signalling that one or more CARs have completed.  This implies that a
  MPID_Request might consist of an integer counter of outstanding CARs.  When
  the counter reaches zero, the request is complete.  David suggests making
  CARs and MPID_Requests reside in the same physical structure so that in the
  MPI_Send/Recv() case, two logical allocations (one for the MPID_Request and
  one for the CAR) are combined into one.
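
The counter idea, with the first CAR embedded in the request, might look
roughly like this.  The type and function names (req_t, car_t, car_attach,
car_complete, req_is_complete) are hypothetical and the real CAR contents are
not specified here; the sketch only shows the counting and the single
allocation for the common one-CAR case.

```c
#include <assert.h>
#include <stddef.h>

struct req;

typedef struct car {
    struct req *req;       /* request to notify on completion */
} car_t;

typedef struct req {
    int outstanding;       /* CARs attached and not yet completed */
    car_t first;           /* embedded CAR: the MPI_Send/Recv case needs no
                              second allocation */
} req_t;

static void req_init(req_t *r)
{
    r->outstanding = 0;
    r->first.req = NULL;
}

/* Attach a CAR (embedded or separately allocated) to a request. */
static void car_attach(car_t *c, req_t *r)
{
    c->req = r;
    r->outstanding++;
}

/* Signal that one CAR has finished. */
static void car_complete(car_t *c)
{
    assert(c->req != NULL && c->req->outstanding > 0);
    c->req->outstanding--;
    c->req = NULL;
}

static int req_is_complete(const req_t *r)
{
    return r->outstanding == 0;
}
```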

- operations within a block are prioritized by the order in which they are
  added to the block.  operations may proceed in parallel so long as higher
  priority operations are not slowed down by lesser priority operations.  a
  valid implementation is to serialize the operations thus guaranteeing that
  the current operation has all available resources at its disposal.

MPI_Isend(buf, count, datatype, dest, tag, comm, request, error)
MPI_Ibsend(buf, count, datatype, dest, tag, comm, request, error)
MPI_Irsend(buf, count, datatype, dest, tag, comm, request, error)
MPI_Issend(buf, count, datatype, dest, tag, comm, request, error)
{
    request_p = MPIR_Request_alloc();
    MPID_IXsend(buf, count, datatype, dest, tag, comm, request_p, error);
    *request = MPIR_Request_handle(request_p);
}

MPI_Recv()
MPI_Irecv()

- need to cover wild card receive!

MPI_Sendrecv()
{
    /* KISS */
    MPI_Isend()
    MPI_Irecv()
    MPI_Waitall()
}


MPID_Send(buf, count, datatype, dest, tag, comm, group, error)
MPID_Isend(buf, count, datatype, dest, tag, comm, request, error)

MPID_Bsend(buf, count, datatype, dest, tag, comm, error)
MPID_Ibsend(buf, count, datatype, dest, tag, comm, request, error)

MPID_Rsend(buf, count, datatype, dest, tag, comm, error)
MPID_Irsend(buf, count, datatype, dest, tag, comm, request, error)

MPID_Ssend(buf, count, datatype, dest, tag, comm, error)
MPID_Issend(buf, count, datatype, dest, tag, comm, request, error)


-----

Items which make life more difficult:

-