MPI
- spawn/attach
- communicators
- pt2pt requests
- collective
- RMA
- error (handling, FT, reporting)

MPID
- messages
  - traditional MPI messages
  - one-sided operations
  - control?

- streams
- process management (via BNR)

Communication Methods
- TCP
- VIA
- Shared Memory
- Loopback
- IMPI

===============================================================================

MPI layer

--------------------

Operations

- point-to-point
  - requests
  - datatypes
  - communicators
  - status
  - errors

- collective
  - datatypes
  - communicators
  - errors

- process management
  - communicators
  - info
  - errors

- RMA
  - datatypes
  - windows
  - communicators
  - groups
  - epoch
  - errors

- I/O
  - depends on
    - files
    - datatypes
    - requests
    - info
    - status
    - errors
  * implemented via ROMIO
    - dependent only on MPI functions
    - future enhancements may use low-level interfaces

- topology
  - communicators
  - errors
  * can be implemented entirely at the MPI layer

- generalized requests
  - errors
  * can be implemented entirely at the MPI layer

--------------------

Structures

- requests
- datatypes
- communicators
  - groups
- groups
  * are groups modified or augmented by lower layers?
- windows
- files
  * defined and implemented via ROMIO
    - future enhancements may require access to lower layers
- status
- errors
- attributes
  * can be defined and operations implemented entirely at the MPI layer
  - communicator
  - datatypes
  - windows

- info
  * can be defined and operations implemented entirely at the MPI layer

===============================================================================

MPID

--------------------

Operations

- point-to-point

- collective operations

- process management

- RMA

- generalized requests


Structures


--------------------

Concepts

- MPI buffer movement (moving buffers defined by an address, count and
  datatype)

- internal buffer management

- Connection management

  - virtual connection structures

  - low-level connection management (sockets, etc.) should be handled
    entirely by the device and probably driven by a state machine


===============================================================================

Multi-method design

--------------------

Device-level objects

- group

  - data structures

  - methods

    - set_connection(group, rank, vc_ptr) - associates a pointer to a virtual
      connection structure with a (group,rank) pair

    - get_connection(group, rank) - returns the pointer to the virtual
      connection structure associated with (group,rank)

- communicators

  - data structures

    - group

  - methods

    - set_connection(dcomm, rank, vc_ptr) - associates a pointer to a virtual
      connection structure with a (dcomm,rank) pair

    - get_connection(dcomm, rank) - returns the pointer to the virtual
      connection structure associated with (dcomm,rank)

- virtual connections

  - alloc() - returns a pointer to a virtual connection structure

  - add_ref(vc) - increments the reference count (atomically)

  - release(vc) - decrements the reference count; if the reference count
    reaches zero, the structure is freed

  NOTE: It may be useful to be able to locate a virtual connection based on a
  process group ID and rank, in part so we can detect when multiple virtual
  connections might be formed between a pair of processes.
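
A minimal C sketch of how the reference-counted virtual connection and the
group-level set_connection()/get_connection() methods listed above could fit
together.  All type and function names here (vc_t, group_t, vc_alloc, ...) are
hypothetical, and it assumes that a group holds its own reference to every VC
stored in its table; the notes do not settle either point.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical virtual connection; only the reference count is modeled. */
typedef struct vc {
    atomic_int ref_count;
} vc_t;

/* Hypothetical device-level group: one VC slot per rank. */
typedef struct group {
    int size;
    vc_t **vc_table;
} group_t;

/* alloc(): returns a VC that carries one reference (the caller's). */
vc_t *vc_alloc(void)
{
    vc_t *vc = malloc(sizeof *vc);
    if (vc != NULL)
        atomic_init(&vc->ref_count, 1);
    return vc;
}

/* add_ref(vc): atomically increments the reference count. */
void vc_add_ref(vc_t *vc)
{
    assert(vc != NULL);
    atomic_fetch_add(&vc->ref_count, 1);
}

/* release(vc): atomically drops one reference and frees the VC when the
 * count reaches zero.  Returns 1 if the structure was freed, else 0. */
int vc_release(vc_t *vc)
{
    assert(vc != NULL && atomic_load(&vc->ref_count) > 0);
    if (atomic_fetch_sub(&vc->ref_count, 1) == 1) {
        free(vc);
        return 1;
    }
    return 0;
}

/* Creates a group whose connection table is initially empty. */
group_t *group_create(int size)
{
    group_t *g = malloc(sizeof *g);
    if (g == NULL)
        return NULL;
    g->vc_table = calloc((size_t)size, sizeof *g->vc_table);
    if (g->vc_table == NULL) {
        free(g);
        return NULL;
    }
    g->size = size;
    return g;
}

/* set_connection(group, rank, vc_ptr): the group takes its own reference
 * (an assumption of this sketch) and releases any VC it replaces. */
void group_set_connection(group_t *g, int rank, vc_t *vc)
{
    assert(g != NULL && rank >= 0 && rank < g->size);
    if (vc != NULL)
        vc_add_ref(vc);
    if (g->vc_table[rank] != NULL)
        vc_release(g->vc_table[rank]);
    g->vc_table[rank] = vc;
}

/* get_connection(group, rank): returns the stored pointer, or NULL. */
vc_t *group_get_connection(const group_t *g, int rank)
{
    assert(g != NULL && rank >= 0 && rank < g->size);
    return g->vc_table[rank];
}

/* Releases every VC still referenced by the group, then the group itself. */
void group_free(group_t *g)
{
    for (int i = 0; i < g->size; i++)
        if (g->vc_table[i] != NULL)
            vc_release(g->vc_table[i]);
    free(g->vc_table);
    free(g);
}
```

With this ownership rule a VC shared by a group and a communicator is freed
only after both have dropped it, which is the behavior the add_ref/release
pair is meant to provide.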


Q: Can connect/accept be called multiple times between the same set of
processes?

--------------------

Method-level functions

Method descriptors are strings that describe the capabilities of the methods.
These descriptors can then be used to determine whether two processes might be
capable of "talking" using the method in question.  We say "might be capable"
because, for a method like VIA, it may be impossible to provide enough
information in the descriptor to determine whether two processes can talk.  It
may be necessary to simply attempt to form the connection.  This implies that
binding to a particular protocol may need to be deferred until we are ready to
form a real connection.  However, some methods, such as shared memory, can
provide enough information and thus can be bound immediately.

- query/get_descriptor()

- match_descriptors()
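
A rough C sketch of a three-valued match_descriptors(), illustrating the
"might be capable" case described above.  The descriptor format
("method:details"), the method names "shm" and "via", and the match_t result
type are assumptions made for this example only; the notes do not define a
descriptor syntax.

```c
#include <assert.h>
#include <string.h>

/* Result of comparing two descriptors (hypothetical type). */
typedef enum { MATCH_NO, MATCH_MAYBE, MATCH_YES } match_t;

/* Length of the method name: the text before the first ':' (or the whole
 * string if there is no ':'). */
static size_t method_len(const char *desc)
{
    const char *colon = strchr(desc, ':');
    return colon ? (size_t)(colon - desc) : strlen(desc);
}

/* Compares two descriptors of the assumed form "method:details". */
match_t match_descriptors(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    size_t la = method_len(a);
    size_t lb = method_len(b);

    /* Different methods can never talk to each other. */
    if (la != lb || strncmp(a, b, la) != 0)
        return MATCH_NO;

    /* Shared memory: the details are assumed to name the host, so the
     * descriptor alone decides and the method can be bound immediately. */
    if (la == 3 && strncmp(a, "shm", 3) == 0)
        return strcmp(a + la, b + lb) == 0 ? MATCH_YES : MATCH_NO;

    /* Methods such as VIA: the descriptor cannot prove reachability, so
     * binding is deferred until a real connection is attempted. */
    return MATCH_MAYBE;
}
```

A MATCH_MAYBE result corresponds to the deferred-binding case: the caller
keeps the method as a candidate and finds out whether it works only when it
tries to connect.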

===============================================================================


MPI_Init()

- create basic datatypes

  MPIR_Datatype_init()
  {
      foreach dt (all basic datatypes)
      {
          MPID_Datatype_init(dt, ...)
      }
  }


- initialize device

  - BNR initialization

    BNR_Init()
    BNR_Get_group(&my_bnr_group)
    BNR_Get_size(my_bnr_group, &size)
    BNR_Get_rank(my_bnr_group, &rank)
    BNR_Get_parent(&parent_bnr_group)
    BNR_Merge(my_bnr_group, parent_bnr_group, &inter_bnr_group)

  - loop through methods

    - initialize method

    - query for descriptor of method's capabilities

    Q: What about dynamically loaded methods?  Do they have to be initialized
    now or can they be added later?

  - publish capabilities of all known methods

  - initialize AQ, buffer management, etc.

- establish MPI_COMM_WORLD

  - create MPI_GROUP_WORLD (internal) from BNR my_group, etc.

    stores BNR info in group structure
    allocates virtual connection structures
    initializes virtual connections to stubs

  - create MPI_COMM_WORLD from MPI_GROUP_WORLD

- establish inter-communicator with parent (if parent exists)

  - create inter_group from inter_bnr_group

  - create inter-communicator from inter_group


MPI_Spawn()
{
    BNR_Open_group(my_bnr_group, &remote_bnr_group)
    BNR_Spawn(remote_bnr_group, N, ..., func)
    BNR_Close(remote_bnr_group)
    BNR_Merge(my_bnr_group, remote_bnr_group, &inter_bnr_group)
}


Need a globally unique BNR_Group_ID in order to implement
MPI_Connect/Attach.


------------------------------------------------------------------------

Structures that cross layers

- Many of the information structures that are passed through the layers contain
  data sections from multiple layers.

- One option is to include device (and method) include files in the MPICH layer
  include file.  Rob and Brian feel this would be bad.

- David suggests that the structure definitions be supplied by the device
  header files and that method-specific information be included in those
  definitions using unions.  Rob and Brian feel this is ugly (from a software
  engineering standpoint).

- Rob and Brian suggest having each layer define its own portion of the
  structure.  The definitions of the higher layers are known to the lower
  layers, but not vice versa.  To increase cache locality and reduce memory
  allocation, the device (and methods) report the amount of space they need in
  these structures so that the highest layer can allocate sufficient space.
  Each lower layer then locates its portion using pointer arithmetic, etc.
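
A minimal C sketch of the last option: the lower layers report how much space
they need, the highest layer allocates one block, and each layer finds its
portion by pointer arithmetic.  The names (mpi_request_t, layout_t,
layout_compute, ...) and the fields of the MPI-layer portion are hypothetical,
and rounding each portion up to the maximum alignment is an assumption of this
sketch rather than something the notes specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Portion owned by the MPI layer (hypothetical fields). */
typedef struct mpi_request {
    int rank;
    int tag;
} mpi_request_t;

/* Offsets computed once from the sizes the lower layers report. */
typedef struct layout {
    size_t dev_offset;   /* start of the device portion */
    size_t meth_offset;  /* start of the method portion */
    size_t total;        /* size of the single allocation */
} layout_t;

/* Rounds n up to a multiple of the strictest fundamental alignment. */
static size_t align_up(size_t n)
{
    size_t a = _Alignof(max_align_t);
    return (n + a - 1) / a * a;
}

/* Builds the layout from the sizes reported by the device and the method. */
layout_t layout_compute(size_t dev_size, size_t meth_size)
{
    layout_t l;
    l.dev_offset = align_up(sizeof(mpi_request_t));
    l.meth_offset = l.dev_offset + align_up(dev_size);
    l.total = l.meth_offset + align_up(meth_size);
    return l;
}

/* The highest layer performs the only allocation (zero-filled). */
mpi_request_t *request_alloc(const layout_t *l)
{
    assert(l != NULL);
    return calloc(1, l->total);
}

/* The device reaches its portion without knowing the method's layout. */
void *request_dev_part(mpi_request_t *req, const layout_t *l)
{
    assert(req != NULL && l != NULL);
    return (char *)req + l->dev_offset;
}

/* The method reaches its portion the same way. */
void *request_meth_part(mpi_request_t *req, const layout_t *l)
{
    assert(req != NULL && l != NULL);
    return (char *)req + l->meth_offset;
}
```

The MPI layer only ever sees mpi_request_t, so it needs no device or method
headers, while a lower layer casts the pointer returned by
request_dev_part() or request_meth_part() to its own private type.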

Virtual connection

- used by the multi-method (MM) implementation to allow late binding to a
  method

  this implies that the VC contains a pointer to either the function pointer
  table for the method to which it is bound or the function pointer table for
  a set of functions that perform the binding to such a method

- one-to-one correspondence between a virtual connection and a real connection

- Contains
  - state of binding
  - method-specific information
  - function pointer table?
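
A small C sketch of late binding through a function pointer table.  A new VC
points at a stub table whose send routine performs the binding, swaps in the
real method's table, and forwards the call.  Every name here (vc_ops_t,
stub_send, tcp_send, ...) is hypothetical, the "TCP" method only counts calls
instead of sending, and the stub always picks that one method; a real binder
would choose by matching descriptors.

```c
#include <assert.h>
#include <stddef.h>

struct vc;

/* Function pointer table supplied by each method (one entry for brevity). */
typedef struct vc_ops {
    int (*send)(struct vc *vc, const void *buf, size_t len);
} vc_ops_t;

typedef enum { VC_UNBOUND, VC_BOUND } vc_state_t;

typedef struct vc {
    vc_state_t state;     /* state of binding */
    const vc_ops_t *ops;  /* stub table or the bound method's table */
    int sends_done;       /* stand-in for method-specific information */
} vc_t;

/* Stand-in for a real method: records the call and reports len bytes. */
static int tcp_send(vc_t *vc, const void *buf, size_t len)
{
    (void)buf;
    vc->sends_done++;
    return (int)len;
}

static const vc_ops_t tcp_ops = { tcp_send };

/* Stub send: binds the VC on first use, then forwards the operation. */
static int stub_send(vc_t *vc, const void *buf, size_t len)
{
    assert(vc->state == VC_UNBOUND);
    vc->ops = &tcp_ops;  /* assumption: TCP is the only candidate here */
    vc->state = VC_BOUND;
    return vc->ops->send(vc, buf, len);
}

static const vc_ops_t stub_ops = { stub_send };

/* Initializes a VC to the unbound stub. */
void vc_init(vc_t *vc)
{
    assert(vc != NULL);
    vc->state = VC_UNBOUND;
    vc->ops = &stub_ops;
    vc->sends_done = 0;
}

/* Callers always go through the table and never test the binding state. */
int vc_send(vc_t *vc, const void *buf, size_t len)
{
    assert(vc != NULL);
    return vc->ops->send(vc, buf, len);
}
```

After the first vc_send() the stub is out of the call path, so the cost of
the binding check is paid once per connection rather than once per message.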

----------

Communicators
- Contains
  - communication group
  - local group (inter-communicator only)
  - send and receive context IDs - same for intra-communicator
  - attributes
  - reference count
  - error handlers
  - device specific information (needed for MPICH-G2)

----------

Groups
- Contains
  - virtual connection table
  - my rank
  - reference count
  - device specific information (???)

----------

Requests
- probably allocated and partially initialized above ADI
- initialization completed by device/method

- Request contains
  - Immutable after initialization
    - type of request
      - persistent request flag
      - send, bsend, rsend, ssend, recv, generalized
    - buffer
    - count
    - datatype
    - rank (src or dest depending on type of request)
    - tag
    - comm
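
One possible C rendering of the immutable part of a request, as a sketch only.
The names (request_t, req_type_t, request_init, request_rank_is_source) are
hypothetical, and the datatype and communicator are shown as plain int
placeholders because the notes do not say how those handles are represented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Type of request, following the list above. */
typedef enum {
    REQ_SEND, REQ_BSEND, REQ_RSEND, REQ_SSEND, REQ_RECV, REQ_GENERALIZED
} req_type_t;

/* Fields that do not change once the request is initialized. */
typedef struct request {
    req_type_t type;
    bool persistent;  /* persistent request flag */
    void *buf;
    int count;
    int datatype;     /* placeholder for a datatype handle */
    int rank;         /* source or destination, depending on type */
    int tag;
    int comm;         /* placeholder for a communicator handle */
} request_t;

/* Fills in the immutable fields; in the scheme above this would run above
 * the ADI, with the device/method completing its own part afterwards. */
void request_init(request_t *req, req_type_t type, bool persistent,
                  void *buf, int count, int datatype,
                  int rank, int tag, int comm)
{
    assert(req != NULL && count >= 0);
    req->type = type;
    req->persistent = persistent;
    req->buf = buf;
    req->count = count;
    req->datatype = datatype;
    req->rank = rank;
    req->tag = tag;
    req->comm = comm;
}

/* True if the rank field names the source (a receive) rather than the
 * destination (any of the send variants). */
bool request_rank_is_source(const request_t *req)
{
    assert(req != NULL);
    return req->type == REQ_RECV;
}
```

Passing the request to lower layers as a const request_t * is one simple way
to express "immutable after initialization" without extra machinery.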

----------

Connection resolution

- needs to talk with BNR

----------

Communication agent