This file contains the current DRAFT description of the structures used 
in the MPI2 debugger interface.  The author of this interface was
Rob Faught <rtf@etnus.com>

AN INTERFACE BETWEEN A DEBUGGER AND AN MPI IMPLEMENTATION
(DRAFT)

Jan 18 2007 RTF: The executable name has been added to several
   structures. It is needed along with the pid and hostname to attach
   to a process.
Jan 18 2007 RTF: Pull the breakpoint out of the info type.
Jan 22 2007 RTF: At Bill Gropp's request: Changed name of defines for
   debugger_flags and mpi_flags to MPI2DD_FLAGS_xxx and
   MPI2DD_MPIFLAGS_xxx, respectively.

Types
_________________________________________________________


MPI2DD_ADDR_T  is the type of an address on the target machine.

MPI2DD_INT32_T is the type of a signed integer with a size of four
               bytes on the target machine.

MPI2DD_UINT32_T is the type of an unsigned integer with a size of four
               bytes on the target machine.

MPI2DD_BYTE_T is the type of an unsigned integer the size of one byte.
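
As a sketch only, a debugger that targets a machine with 64-bit
addresses might map these names onto the fixed-width types from
<stdint.h>. The 64-bit width chosen for MPI2DD_ADDR_T is an assumption
made for illustration; this draft leaves the address width to the
target machine.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Assumption: the target machine uses 64-bit addresses.
   A 32-bit target would use uint32_t here instead. */
typedef uint64_t MPI2DD_ADDR_T;

typedef int32_t  MPI2DD_INT32_T;    /* signed, four bytes   */
typedef uint32_t MPI2DD_UINT32_T;   /* unsigned, four bytes */
typedef uint8_t  MPI2DD_BYTE_T;     /* unsigned, one byte   */

/* Compile-time checks of the sizes stated in the text above. */
static_assert(sizeof(MPI2DD_INT32_T) == 4, "MPI2DD_INT32_T must be four bytes");
static_assert(sizeof(MPI2DD_UINT32_T) == 4, "MPI2DD_UINT32_T must be four bytes");
static_assert(sizeof(MPI2DD_BYTE_T) == 1, "MPI2DD_BYTE_T must be one byte");
```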




Process Info
_________________________________________________________


extern "C" struct MPI2DD_INFO MPI2DD_info;

This structure is defined in each rank process and starter. The symbol
"MPI2DD_info" is associated with the address of this structure and
must be visible to the attached debugger.


struct MPI2DD_INFO {

MPI2DD_BYTE_T magic[5]

     magic[0] == 'M', magic[1] == 'P', magic[2] == 'I',
     magic[3] == '2', magic[4] == 0x7f


MPI2DD_BYTE_T version

     A version number for this interface. This will be one(1) for all
     instances defined by this document. It can only be changed by
     general agreement of the formal or informal organization that
     maintains this document and interface.

MPI2DD_BYTE_T variant

     A code that allows for small variations in layout of the
     structures defined here or small changes in the standard
     interaction of debugger and mpi application. This field should be
     one(1) unless it is changed by general agreement between a
     debugger and MPI implementation.

MPI2DD_BYTE_T debug_state

     This byte indicates why the MPI2DD_breakpoint function was
     called. The MPI implementation writes it before calling the
     breakpoint function. The debugger does not change it.

     #define MPI2DD_DEBUG_START                     1
     #define MPI2DD_DEBUG_SPAWN                     2
     #define MPI2DD_DEBUG_CONNECT                   3
     #define MPI2DD_DEBUG_ACCEPT                    4
     #define MPI2DD_DEBUG_JOIN                      5
     #define MPI2DD_DEBUG_DIRECTORY_CHANGED         6
     #define MPI2DD_DEBUG_METADIRECTORY_CHANGED     7
     #define MPI2DD_DEBUG_ABORT                     8

MPI2DD_UINT32_T debugger_flags

     The bits in this field are initialized by the MPI implementation
     and may be modified by the debugger.

     #define MPI2DD_FLAG_GATE    0x01

     This bit is initialized to zero by the MPI implementation and
     set to one by the debugger after it has acquired a process. This
     is used in some implementations to allow rank processes to run
     out of MPI_Init. Implementations are not required to use this
     method.

     #define MPI2DD_FLAG_BEING_DEBUGGED    0x02

     This bit is initialized to zero by the MPI implementation and set
     to one by the debugger to tell the starter program that a
     debugger is attached.

     #define MPI2DD_FLAG_REQUEST_DIRECTORY_EVENTS 0x04

     Set by the debugger if it would like to receive breakpoint events
     when changes occur to a directory or metadirectory.


MPI2DD_UINT32_T mpi_flags

     The bits in this field are set by the MPI implementation and are
     not modified by the debugger.

     #define MPI2DD_MPIFLAG_I_AM_METADIR      0x01

     Set if this process is a metadirectory.

     #define MPI2DD_MPIFLAG_I_AM_DIR          0x02

     Set if this process is a directory.

     #define MPI2DD_MPIFLAG_I_AM_STARTER      0x04

     Set if this is a starter process.

     #define MPI2DD_MPIFLAG_FORCE_TO_MAIN     0x08

     Set if this process is acquired before running its main
     procedure.

     #define MPI2DD_MPIFLAG_IGNORE_QUEUE      0x10

     Set if message queue debugging is not implemented.

     #define MPI2DD_MPIFLAG_ACQUIRED_PRE_MAIN 0x20

     Set if the rank processes are attached before main.

     #define MPI2DD_MPIFLAG_PARTIAL_ATTACH_OK 0x40

     Set if the job can be started by continuing the initial process.


MPI2DD_ADDR_T dll_name_32;

     The address of an ascii null-terminated string containing the
     pathname of the message queue debug library that is dynamically
     loaded by the debugger. This library is used by debuggers that
     are built as 32 bit executables. If there is no 32 bit message
     queue debug library, this field is null.

MPI2DD_ADDR_T dll_name_64;

     The address of an ascii null-terminated string containing the
     pathname of the message queue debug library that is dynamically
     loaded by the debugger. This library is used by debuggers that
     are built as 64 bit executables. If there is no 64 bit message
     queue debug library, this field is null.

MPI2DD_ADDR_T meta_host_name;

     The address of an ascii null-terminated string containing the
     address or name of the network node where a metadirectory process
     is running.

     The host_name is either a host name, or an IPv4 address in
     standard dot notation, or an IPv6 address in colon (and possibly
     dot) notation.  (See RFC 1884 for the description of IPv6
     addresses.)

     The debugger needs meta_pid and this field to locate a
     metadirectory from an arbitrarily selected rank process.


MPI2DD_ADDR_T meta_executable_name;

     The address of an ascii null-terminated string containing the
     path name of the metadirectory executable. The executable is
     opened by the debugger to read symbol tables, so the path should
     be accessible to the debugger.


MPI2DD_ADDR_T abort_string

     The address of an ascii null-terminated string that holds an
     abort message that is shown to the user when the breakpoint at
     MPI2DD_breakpoint is triggered and MPI2DD_info.debug_state is
     set to MPI2DD_DEBUG_ABORT. MPI implementations are not
     required to implement this feature.


MPI2DD_ADDR_T proctable;

     This field is null except in directory processes. In a directory
     process this field contains the address of an array of proctable
     structures.

MPI2DD_ADDR_T directory_table;

     This field is null except in metadirectory processes. In a
     metadirectory process this field contains the address of an array
     of directory entry structures. Each directory entry in the array
     allows the debugger to find one directory process. The process
     that contains this info structure should not have an entry for
     itself in its directory table. It is possible that the value of
     this field is null in a metadirectory process, if the
     metadirectory process is also a directory process and there are
     no other directory processes.


MPI2DD_ADDR_T metadirectory_table;

     This field is null except in metadirectory processes. In a
     metadirectory process this field contains the address of an array
     of directory entry structures. Each directory entry in the array
     allows the debugger to find one metadirectory process. The process
     that contains this info structure should not have an entry for
     itself in its metadirectory table. It is possible that the value of
     this field is null in a metadirectory process, if there are no other
     metadirectory processes in the application.


MPI2DD_INT32_T proctable_size

     If this is a directory process, this field contains a count of
     the entries in the proctable, otherwise it is zero.


MPI2DD_INT32_T directory_size

     The number of entries in the array indicated by
     directory_table.


MPI2DD_INT32_T metadirectory_size

     The number of entries in the array indicated by
     metadirectory_table.


MPI2DD_INT32_T meta_pid;

     The process id or task id of the metadirectory process on the
     node given by the meta_host_name field. On UNIX this will be a
     pid.

MPI2DD_INT32_T padding[8];

     Thirty-two bytes of padding. Reserved for future expansion and
     vendor use.
};
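
Before trusting any other field, a debugger would probably check the
magic bytes and the version. The following is a minimal sketch of
that check, assuming the debugger has already copied the first bytes
found at the address of MPI2DD_info into a local buffer. The helper
names mpi2dd_header_ok and mpi2dd_is_directory are hypothetical and
are not part of this interface. Because the first eight fields are
all single bytes, magic occupies offsets 0 to 4 and version sits at
offset 5.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

#define MPI2DD_MPIFLAG_I_AM_METADIR 0x01
#define MPI2DD_MPIFLAG_I_AM_DIR     0x02

/* The five magic bytes given in the description of MPI2DD_INFO. */
static const uint8_t mpi2dd_magic[5] = { 'M', 'P', 'I', '2', 0x7f };
static_assert(sizeof mpi2dd_magic == 5, "magic is five bytes long");

/* Hypothetical helper: returns 1 if the bytes copied from the start of
   MPI2DD_info carry the expected magic and version 1, otherwise 0. */
static int mpi2dd_header_ok(const uint8_t *bytes, size_t len)
{
    if (bytes == NULL || len < 6)
        return 0;                       /* need magic[5] plus version */
    if (memcmp(bytes, mpi2dd_magic, sizeof mpi2dd_magic) != 0)
        return 0;
    return bytes[5] == 1;               /* version follows magic */
}

/* Hypothetical helper: tests the directory bit of an mpi_flags value. */
static int mpi2dd_is_directory(uint32_t mpi_flags)
{
    return (mpi_flags & MPI2DD_MPIFLAG_I_AM_DIR) != 0;
}
```

A process may set both the metadirectory bit and the directory bit, so
each bit is tested separately rather than treated as an exclusive
role.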



Breakpoint address symbol
_________________________________________________________


void MPI2DD_breakpoint() { }

     This function provides an address where the debugger can set a
     breakpoint. It will be a routine that MPI calls at points of
     interest. When the debugger gets the breakpoint trap, it can use
     the debug_state field of MPI2DD_info to determine why the
     breakpoint was triggered. (It was pulled out of the info
     structure because its address may be needed before a process
     runs any instructions.)
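
The ordering described above (write the reason first, then call the
breakpoint function) can be sketched as follows. This is an
illustration only: the variable debug_state stands in for the
debug_state field of MPI2DD_info, and the helpers mpi2dd_notify and
mpi2dd_debug_state_name are hypothetical names that are not part of
this interface.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MPI2DD_DEBUG_START                     1
#define MPI2DD_DEBUG_SPAWN                     2
#define MPI2DD_DEBUG_CONNECT                   3
#define MPI2DD_DEBUG_ACCEPT                    4
#define MPI2DD_DEBUG_JOIN                      5
#define MPI2DD_DEBUG_DIRECTORY_CHANGED         6
#define MPI2DD_DEBUG_METADIRECTORY_CHANGED     7
#define MPI2DD_DEBUG_ABORT                     8

/* Stand-in for MPI2DD_info.debug_state (assumption for this sketch). */
static volatile uint8_t debug_state;

/* Empty function whose address the debugger sets a breakpoint on. */
void MPI2DD_breakpoint(void) { }

/* MPI side (hypothetical helper): record the reason, then trap. */
static void mpi2dd_notify(uint8_t reason)
{
    assert(reason >= MPI2DD_DEBUG_START && reason <= MPI2DD_DEBUG_ABORT);
    debug_state = reason;       /* written before the call */
    MPI2DD_breakpoint();
}

/* Debugger side (hypothetical helper): printable name for a state. */
static const char *mpi2dd_debug_state_name(uint8_t state)
{
    static const char *const names[] = {
        "UNKNOWN", "START", "SPAWN", "CONNECT", "ACCEPT", "JOIN",
        "DIRECTORY_CHANGED", "METADIRECTORY_CHANGED", "ABORT"
    };
    if (state < MPI2DD_DEBUG_START || state > MPI2DD_DEBUG_ABORT)
        return names[0];
    return names[state];
}
```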



Proctable
_________________________________________________________


The new proctable will be read without first having to find its type
in the debug information of the MPI executable. The order of fields is
fixed. These structures are packed with no intervening padding bytes
allowed. The version field in the MPI2DD_INFO structure will indicate
future changes to this structure. Each instance of a struct
MPI2DD_PROCDESC has attributes to locate one rank process. Any process
with entries in its proctable is, by definition, a directory process.


struct MPI2DD_PROCDESC {

  MPI2DD_ADDR_T host_name;

                        The address of an ascii null-terminated string
                        containing the address or name of the network
                        node where this process is running. More
                        precisely, it is the IP address of the network
                        node where a debugger server can be run to
                        control this process.

                        The host_name is either a host name, or an
                        IPv4 address in standard dot notation, or an
                        IPv6 address in colon (and possibly dot)
                        notation.  (See RFC 1884 for the description
                        of IPv6 addresses.)


  MPI2DD_ADDR_T executable_name;

                        The address of an ascii null-terminated string
                        containing the path name of the
                        executable. The executable is opened by the
                        debugger to read symbol tables, so the path
                        should be accessible to the debugger.


  MPI2DD_ADDR_T spawn_desc;

                        (new) There are two ways that processes are
                        created in an MPI job. They are created by a
                        starter program or they are spawned by an
                        existing MPI group. This field is either the
                        address of an MPI2DD_SPAWNDESC structure that has
                        the context for the spawn, or null if this
                        process is part of the MPI_COMM_WORLD created
                        by a starter program. [To save space it is
                        possible that this field could be moved to a
                        separate table that is indexed by the
                        comm_world_id field below.]

  MPI2DD_ADDR_T comm_world_id;

                        This field is the address of an ascii
                        null-terminated string that identifies the
                        MPI_COMM_WORLD associated with this rank
                        process. It should distinguish this comm world
                        from any other comm worlds spawned in this job
                        and any job that this job joins or connects to.

  MPI2DD_INT32_T pid;

                        The process id or task id of the rank process
                        on the node given by the host_name field. On
                        UNIX this will be a pid.

  MPI2DD_INT32_T rank;

                        (new) The rank of the process in the
                        MPI_COMM_WORLD. [The table index can no longer
                        be used as the rank indicator because the
                        proctable for a job may be distributed across
                        multiple directory nodes, processes may appear
                        in more than one proctable, and it may be
                        possible for a rank process to remove itself
                        from its MPI_COMM_WORLD (?)].

 };
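
Because the field order is fixed and the structure is packed, a
debugger can decode proctable entries directly from bytes read out of
the directory process. The sketch below assumes a target with 64-bit
addresses and little-endian byte order; neither is fixed by this
draft. Under those assumptions one entry is 40 bytes long, with pid at
offset 32 and rank at offset 36. The names struct procdesc,
decode_procdesc, rd_u64le and rd_i32le are hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: four 8-byte addresses followed by two 4-byte integers. */
enum { PROCDESC_SIZE = 4 * 8 + 2 * 4 };
static_assert(PROCDESC_SIZE == 40, "packed entry size for 64-bit targets");

/* Host-side copy of one decoded MPI2DD_PROCDESC entry. */
struct procdesc {
    uint64_t host_name;        /* target address of the host string   */
    uint64_t executable_name;  /* target address of the path string   */
    uint64_t spawn_desc;       /* target address, or 0 if not spawned */
    uint64_t comm_world_id;    /* target address of the id string     */
    int32_t  pid;
    int32_t  rank;
};

/* Read a little-endian 64-bit value from p. */
static uint64_t rd_u64le(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Read a little-endian signed 32-bit value from p. */
static int32_t rd_i32le(const uint8_t *p)
{
    uint32_t u = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    int32_t v;
    memcpy(&v, &u, sizeof v);   /* reinterpret the bits as signed */
    return v;
}

/* Decode entry number index from a raw copy of the proctable. */
static struct procdesc decode_procdesc(const uint8_t *table, size_t index)
{
    const uint8_t *p = table + index * PROCDESC_SIZE;
    struct procdesc d;
    d.host_name       = rd_u64le(p + 0);
    d.executable_name = rd_u64le(p + 8);
    d.spawn_desc      = rd_u64le(p + 16);
    d.comm_world_id   = rd_u64le(p + 24);
    d.pid             = rd_i32le(p + 32);
    d.rank            = rd_i32le(p + 36);
    return d;
}
```

Decoding byte by byte, rather than casting the buffer to a host
structure, keeps the result independent of the padding and byte order
of the machine on which the debugger itself runs.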




struct MPI2DD_SPAWNDESC {

  MPI2DD_ADDR_T parent_comm_world_id;

                        This field is the address of an ascii
                        null-terminated string that identifies the
                        MPI_COMM_WORLD associated with the parent_rank
                        process. It should distinguish this comm world
                        from any other comm worlds spawned in this job
                        and any job that this job joins or connects to.

  MPI2DD_INT32_T parent_rank;

                        Rank of the parent process.

  MPI2DD_INT32_T sequence;

                        The sequence of this spawn command among those
                        rooted on the parent process. This should
                        start at zero and increment by one for each
                        spawn that is rooted at the parent_rank
                        process.

};




Directory and MetaDirectory Tables
_________________________________________________________

A metadirectory process will have two tables that allow a debugger to
find metadirectory processes and directory processes. A process should
not be in its own metadirectory or directory tables. These tables both
have the same format.

In a simple job, where there are no other metadirectory processes and
the metadirectory process is also the only directory process, these
tables might both be empty.


struct MPI2DD_DIRECTORYENTRY {

MPI2DD_ADDR_T host_name;

     The address of an ascii null-terminated string containing the
     address or name of the network node where a directory or
     metadirectory process is running.

     The host_name is either a host name, or an IPv4 address in
     standard dot notation, or an IPv6 address in colon (and possibly
     dot) notation.  (See RFC 1884 for the description of IPv6
     addresses.)


MPI2DD_ADDR_T executable_name;

     The address of an ascii null-terminated string containing the
     path name of the directory or metadirectory executable. The
     executable is opened by the debugger to read symbol tables, so
     the path should be accessible to the debugger.


MPI2DD_INT32_T pid;

     The process id or task id of the directory or metadirectory
     process on the node given by the host_name field. On UNIX this
     will be a pid.

};
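
Since a process never lists itself in its own tables, while other
metadirectory processes may list it, a debugger that walks these
tables needs to remember which processes it has already found. The
following is a minimal sketch of that bookkeeping, assuming entries
have already been decoded into host-side strings. Identifying a
process by an exact match on host name and pid is an assumption of
this sketch (the same node could be named in more than one way), and
the names struct entry, struct seen_set, seen_add and merge_table are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_SEEN 64   /* arbitrary capacity chosen for this sketch */

/* Decoded MPI2DD_DIRECTORYENTRY; executable_name omitted for brevity. */
struct entry {
    const char *host_name;
    int32_t     pid;
};

/* Set of directory and metadirectory processes found so far. */
struct seen_set {
    struct entry items[MAX_SEEN];
    int          count;
};

/* Returns 1 if (host, pid) was newly added, 0 if it was already
   present or the set is full. */
static int seen_add(struct seen_set *s, const char *host, int32_t pid)
{
    assert(s != NULL && host != NULL);
    for (int i = 0; i < s->count; i++)
        if (s->items[i].pid == pid &&
            strcmp(s->items[i].host_name, host) == 0)
            return 0;
    if (s->count == MAX_SEEN)
        return 0;
    s->items[s->count].host_name = host;
    s->items[s->count].pid = pid;
    s->count++;
    return 1;
}

/* Merge one directory or metadirectory table into the set and return
   the number of processes not seen before. */
static int merge_table(struct seen_set *s, const struct entry *table,
                       int32_t size)
{
    int added = 0;
    for (int32_t i = 0; i < size; i++)
        added += seen_add(s, table[i].host_name, table[i].pid);
    return added;
}
```

A debugger using this scheme would first add the metadirectory it
started from (found through meta_host_name and meta_pid) and then
merge each table it reads, visiting only the newly added processes.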