% -*- Mode: latex; -*-
\documentclass[dvipdfm,11pt]{article}
\usepackage[dvipdfm]{hyperref} % Upgraded url package
\parskip=.1in

% Formatting conventions for contributors
%
% A quoting mechanism is needed to set off things like file names, command
% names, code fragments, and other strings that would confuse the flow of
% text if left undistinguished from preceding and following text.  In this
% document we use the LaTeX macro '\texttt' to indicate such text in the
% source, which normally produces, when used as in '\texttt{special text}',
% the typewriter font.

% It is particularly easy to use this convention if one is using emacs as
% the editor and LaTeX mode within emacs for editing LaTeX documents.  In
% such a case the key sequence ^C^F^T (hold down the control key and type
% 'cft') produces '\texttt{}' with the cursor positioned between the
% braces, ready for the special text to be typed.  The closing brace can
% be skipped over by typing ^e (go to the end of the line) if entering
% text or ^C-} to just move the cursor past the brace.
%
% Please add index entries for important terms and keywords, including
% environment variables that may control the behavior of MPI or one of the
% tools and concepts such as line labeling from mpiexec.

% LaTeX mode is usually loaded automatically.  At Argonne, one way to 
% get several useful emacs tools working for you automatically is to put
% the following in your .emacs file.

% (require 'tex-site)
% (setq LaTeX-mode-hook '(lambda ()
%                        (auto-fill-mode 1)
%                        (flyspell-mode 1)
%                        (reftex-mode 1)
%                        (setq TeX-command "latex")))
\begin{document}
\markright{MPICH User's Guide}
\title{\textbf{MPICH User's Guide}\thanks{This work was supported by the Mathematical,
    Information, and Computational Sciences Division subprogram of the
    Office of Advanced Scientific Computing Research, SciDAC Program,
    Office of Science, U.S. Department of Energy, under Contract
    DE-AC02-06CH11357.}\\
Version %MPICH_VERSION%\\
Mathematics and Computer Science Division\\
Argonne National Laboratory}

\author{
Abdelhalim Amer \and Pavan Balaji \and Wesley Bland \and William Gropp \and
Yanfei Guo \and Rob Latham \and Huiwei Lu \and Lena Oden \and Antonio J. Pe\~na
\and Ken Raffenetti \and Sangmin Seo \and Min Si \and Rajeev Thakur \and
Junchao Zhang \and Xin Zhao
}

\maketitle

\cleardoublepage

\pagenumbering{roman}
\tableofcontents
\clearpage

\pagenumbering{arabic}
\pagestyle{headings}
\section{Introduction}
\label{sec:introduction}

This manual assumes that MPICH has already been installed.  For
instructions on how to install MPICH, see the \emph{MPICH
  Installer's Guide}, or the \texttt{README} in the top-level MPICH
directory.  This manual explains how to compile, link, and run MPI
applications, and use certain tools that come with MPICH.  This is a
preliminary version and some sections are not complete yet.  However,
there should be enough here to get you started with MPICH.


\section{Getting Started with MPICH}
\label{sec:migrating}

MPICH is a high-performance and widely portable implementation of the
MPI Standard, designed to implement all of MPI-1, MPI-2, and MPI-3
(including dynamic process management, one-sided operations, parallel
I/O, and other extensions).  The \emph{MPICH Installer's Guide}
provides some information on MPICH with respect to configuring and
installing it. Details on compiling, linking, and running MPI programs
are described below.


\subsection{Default Runtime Environment}
\label{sec:default-environment}

MPICH provides a separation of process management and communication.
The default runtime environment in MPICH is called Hydra. Other
process managers are also available.

\subsection{Starting Parallel Jobs}
\label{sec:startup}

MPICH implements \texttt{mpiexec} and all of its standard arguments,
together with some extensions. See Section~\ref{sec:mpiexec-standard}
for standard arguments to \texttt{mpiexec} and various subsections of
Section~\ref{sec:mpiexec} for extensions particular to various process
management systems.
\subsection{Command-Line Arguments in Fortran}
\label{sec:fortran-command-line}

MPICH1 (more precisely, MPICH1's \texttt{mpirun}) required access to
command-line arguments in all application programs, including Fortran
ones.  MPICH1's \texttt{configure} therefore devoted some effort to
finding the libraries that contained the right versions of
\texttt{iargc} and \texttt{getarg}, and to adding those libraries to
the ones with which the \texttt{mpifort} script linked MPI programs.
Since MPICH does not require access to the command-line arguments of
applications, these functions are optional, and \texttt{configure}
does nothing special with them.  If you need them in your
applications, you will have to ensure that they are available in the
Fortran environment you are using.
\section{Quick Start}
\label{sec:quickstart}

To use MPICH, you will have to know the directory where MPICH has
been installed.  (Either you installed it there yourself, or your
system administrator has installed it.  One place to look in this
case might be \texttt{/usr/local}.  If MPICH has not yet been
installed, see the \emph{MPICH Installer's Guide}.)  We suggest that
you put the \texttt{bin} subdirectory of that directory into your
path.  This will give you access to assorted MPICH commands to
compile, link, and run your programs conveniently.  Other commands in
this directory manage parts of the run-time environment and execute
tools.

One of the first commands you might run is \texttt{mpichversion} to
find out the exact version and configuration of MPICH you are working
with. Some of the material in this manual depends on just what version
of MPICH you are using and how it was configured at installation
time.

You should now be able to run an MPI program.  Let us assume that the
directory where MPICH has been installed is
\texttt{/home/you/mpich-installed}, and that you have added its
\texttt{bin} subdirectory to your path, using
\begin{verbatim}
    setenv PATH /home/you/mpich-installed/bin:$PATH
\end{verbatim}
for \texttt{tcsh} and \texttt{csh}, or 
\begin{verbatim}
    export PATH=/home/you/mpich-installed/bin:$PATH
\end{verbatim}
for \texttt{bash} or \texttt{sh}.
Then to run an MPI program, albeit only on one machine, you can do:
\begin{verbatim}
    cd /home/you/mpich-installed/examples
    mpiexec -n 3 ./cpi
\end{verbatim}

Details for these commands are provided below, but if you can
successfully execute them here, then you have a correctly installed
MPICH and have run an MPI program. 
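
If one of the commands above is reported as not found, the path
setting is the first thing to check.  The following is a minimal
sketch for \texttt{bash} or \texttt{sh}; the helper name
\texttt{check\_mpich\_tool} is hypothetical and is not part of MPICH,
and only the three command names are taken from this guide.
\begin{verbatim}
    # Report whether a command is reachable through PATH.
    check_mpich_tool() {
        if command -v "$1" >/dev/null 2>&1; then
            echo "$1: found at $(command -v "$1")"
        else
            echo "$1: not found; is the MPICH bin directory on PATH?"
            return 1
        fi
    }

    # Check the commands used in this section.
    missing=0
    for tool in mpichversion mpicc mpiexec; do
        check_mpich_tool "$tool" || missing=1
    done
\end{verbatim}
If any command is missing, \texttt{missing} is set to 1 and the
message names the command that could not be found.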
\section{Compiling and Linking}
\label{sec:compiling}

A convenient way to compile and link your program is by using scripts
that use the same compiler that MPICH was built with.  These are
\texttt{mpicc}, \texttt{mpicxx}, and \texttt{mpifort},
for C, C++, and Fortran programs, respectively.  If any
of these commands is missing, it means that MPICH was configured
without support for that particular language.
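
As a brief illustration, the commands below compile one program in
each language.  The source file names (\texttt{hello.c},
\texttt{hello.cxx}, and \texttt{hello.f90}) are hypothetical.  The last
command uses the \texttt{-show} option of the compiler scripts, which
prints the underlying compiler command line without running it; the
exact output depends on how MPICH was configured.
\begin{verbatim}
    # Compile and link in one step, as with an ordinary compiler
    mpicc   -o hello_c   hello.c
    mpicxx  -o hello_cxx hello.cxx
    mpifort -o hello_f   hello.f90

    # Show the compiler and flags that mpicc would use
    mpicc -show
\end{verbatim}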
%% Pavan Balaji (12/27/2009): I'm commenting out this part as this is
%% broken in the current MPICH stack (see ticket #502).

%% \subsection{Specifying Compilers}
%% \label{sec:specifying-compilers}

%% You need not use the same compiler that MPICH was built with, but not
%% all compilers are compatible.  You can also specify the compiler for
%% building MPICH itself, as reported by \texttt{mpichversion}, just by
%% using the compiling and linking commands from the previous section.
%% The environment variables \texttt{MPICH_CC}, \texttt{MPICH_CXX},
%% \texttt{MPICH_F77}, and \texttt{MPICH_F90} may be used to specify
%% alternate C, C++, Fortran 77, and Fortran 90 compilers, respectively.


\subsection{Special Issues for C++}
\label{sec:cxx}

Some users may get error messages such as
\begin{small}
\begin{verbatim}
    SEEK_SET is #defined but must not be for the C++ binding of MPI
\end{verbatim}
\end{small}
The problem is that both \texttt{stdio.h} and the MPI C++ interface use
\texttt{SEEK\_SET}, \texttt{SEEK\_CUR}, and \texttt{SEEK\_END}.  This is really a bug
in the MPI standard.  You can try adding
\begin{verbatim}
    #undef SEEK_SET
    #undef SEEK_END
    #undef SEEK_CUR
\end{verbatim}
before \texttt{mpi.h} is included, or add the definition
\begin{verbatim}
    -DMPICH_IGNORE_CXX_SEEK
\end{verbatim}
to the command line (this will cause the MPI versions of \texttt{SEEK\_SET}
etc. to be skipped).
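
For example, assuming a C++ source file named \texttt{app.cxx} (a
hypothetical name), the definition can be given on the
\texttt{mpicxx} command line like any other preprocessor definition:
\begin{verbatim}
    mpicxx -DMPICH_IGNORE_CXX_SEEK -o app app.cxx
\end{verbatim}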
\subsection{Special Issues for Fortran}
\label{sec:fortran}

MPICH provides two kinds of support for Fortran programs.  For
Fortran 77 programmers, the file \texttt{mpif.h} provides the
definitions of the MPI constants such as \texttt{MPI\_COMM\_WORLD}.
Fortran 90 programmers should use the \texttt{MPI} module instead;
this provides all of the definitions as well as interface definitions
for many of the MPI functions.  However, this MPI module does not
provide full Fortran 90 support; in particular, interfaces for the
routines, such as \texttt{MPI\_Send}, that take ``choice'' arguments
are not provided.
\section{Running Programs with \texttt{mpiexec}}
\label{sec:mpiexec}

The MPI Standard describes \texttt{mpiexec} as a suggested way to run
MPI programs. MPICH implements the \texttt{mpiexec} standard, and also
provides some extensions.

\subsection{Standard \texttt{mpiexec}}
\label{sec:mpiexec-standard}

Here we describe the standard \texttt{mpiexec} arguments from the MPI
Standard~\cite{mpi-forum:mpi2-journal}.  To run a program with
\texttt{<number>} processes on your local machine, you can use:

\begin{verbatim}
   mpiexec -n <number> ./a.out
\end{verbatim}

To test that you can run a \texttt{<number>}-process job on multiple nodes:

\begin{verbatim}
   mpiexec -f machinefile -n <number> ./a.out
\end{verbatim}

The \texttt{machinefile} is of the form:

\begin{verbatim}
   host1
   host2:2
   host3:4   # Random comments
   host4:1
\end{verbatim}

\texttt{host1}, \texttt{host2}, \texttt{host3}, and \texttt{host4} are
the hostnames of the machines you want to run the job on. The
\texttt{:2}, \texttt{:4}, and \texttt{:1} suffixes specify the number
of processes you want to run on each node. If nothing is specified,
\texttt{:1} is assumed.
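
To make the counting rule concrete, the sketch below writes the sample
file shown above and adds up the processes it provides for.  The helper
name \texttt{total\_slots} is hypothetical, and the parsing covers only
the syntax described in this section (an optional \texttt{:count}
suffix and \texttt{\#} comments); Hydra itself may accept more.
\begin{verbatim}
    # Recreate the sample machinefile.
    printf '%s\n' 'host1' 'host2:2' \
        'host3:4   # Random comments' 'host4:1' > machinefile

    # Strip comments and blank lines, then sum the per-host counts,
    # treating a missing count as 1.
    total_slots() {
        sed -e 's/#.*//' -e 's/[[:space:]]*$//' -e '/^$/d' "$1" |
            awk -F: '{ total += ($2 == "" ? 1 : $2) }
                     END { print total + 0 }'
    }

    total_slots machinefile
\end{verbatim}
For the sample file the sum is $1 + 2 + 4 + 1 = 8$ processes.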
More details on interacting with Hydra can be found at
\url{http://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager}.


\subsection{Extensions for All Process Management Environments}
\label{sec:extensions-uniform}

Some \texttt{mpiexec} arguments are specific to particular
communication subsystems (``devices'') or process management
environments (``process managers'').  Our intention is to make all
arguments as uniform as possible across devices and process managers.
For the time being we will document these separately.

\subsection{\texttt{mpiexec} Extensions for the Hydra Process Manager}

MPICH provides a number of process management systems. Hydra is the
default process manager in MPICH. More details on Hydra and its
extensions to \texttt{mpiexec} can be found at
\url{http://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager}.
\subsection{Extensions for the gforker Process Management Environment}
\label{sec:extensions-forker}
\texttt{gforker} is a process management system for starting
processes on a single machine, so called because the MPI processes are
simply \texttt{fork}ed from the \texttt{mpiexec} process.  This process
manager supports programs that use \texttt{MPI\_Comm\_spawn} and the other
dynamic process routines, but does not support the use of the dynamic process
routines from programs that are not started with \texttt{mpiexec}.  The
\texttt{gforker} process manager is primarily intended as a debugging aid as
it simplifies development and testing of MPI programs on a single node or
processor.  

\subsubsection{\texttt{mpiexec} arguments for gforker}
\label{sec:mpiexec-forker}

In addition to the standard \texttt{mpiexec} command-line arguments, the
\texttt{gforker} \texttt{mpiexec} supports the following options:
\begin{description}
\item[\texttt{-np <num>}]A synonym for the standard \texttt{-n} argument.
\item[\texttt{-env <name> <value>}]Set the environment variable
\texttt{<name>} to \texttt{<value>} for the processes being run by
\texttt{mpiexec}.
\item[\texttt{-envnone}]Pass no environment variables (other than ones
specified with other \texttt{-env} or \texttt{-genv} arguments) to the
processes being run by \texttt{mpiexec}. 
By default, all environment
variables are provided to each MPI process (rationale: principle of
least surprise for the user).
\item[\texttt{-envlist <list>}]Pass the listed environment variables (names
separated by commas), with their current values, to the processes being run by
\texttt{mpiexec}.
\item[\texttt{-genv <name> <value>}]The \texttt{-genv} options have the same
meaning as their corresponding \texttt{-env} versions, except they apply to all
executables, not just the current executable (in the case that the colon
syntax is used to specify multiple executables).
\item[\texttt{-genvnone}]Like \texttt{-envnone}, but for all executables.
\item[\texttt{-genvlist <list>}]Like \texttt{-envlist}, but for all executables.
\item[\texttt{-usize <n>}]Specify the value returned for the
attribute \texttt{MPI\_UNIVERSE\_SIZE}.
\item[\texttt{-l}]Label standard output and standard error (\texttt{stdout} and \texttt{stderr}) with 
  the rank of the process.
\item[\texttt{-maxtime <n>}]Set a time limit of \texttt{<n>} seconds.
\item[\texttt{-exitinfo}]Provide more information on the reason each process
exited if there is an abnormal exit.
\end{description}
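
The following command lines sketch how these options combine.  The
program names (\texttt{./manager}, \texttt{./worker}) and the
environment variables (\texttt{OMP\_NUM\_THREADS}, \texttt{LOG\_LEVEL},
\texttt{SCRATCH}) are hypothetical, and the placement of the options
relative to \texttt{-n} is an assumption; only the option names and
their meanings come from the list above.
\begin{verbatim}
    # Four processes, output labeled by rank, one variable set
    mpiexec -l -n 4 -env OMP_NUM_THREADS 2 ./a.out

    # Two executables joined with the colon syntax: -genv applies
    # to both, -env only to ./worker
    mpiexec -genv LOG_LEVEL debug -n 1 ./manager : \
            -n 3 -env SCRATCH /tmp ./worker

    # Pass only PATH, limit the run to 60 seconds, and report
    # why processes exited if the job ends abnormally
    mpiexec -maxtime 60 -exitinfo -n 2 -envnone \
            -env PATH "$PATH" ./a.out
\end{verbatim}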
In addition to the command-line arguments, the \texttt{gforker} \texttt{mpiexec}
recognizes a number of environment variables that can be used to control the
behavior of \texttt{mpiexec}:

\begin{description}
\item[\texttt{MPIEXEC\_TIMEOUT}]Maximum running time in seconds.
\texttt{mpiexec} will terminate MPI programs that take longer than the value
specified by \texttt{MPIEXEC\_TIMEOUT}.  
\item[\texttt{MPIEXEC\_UNIVERSE\_SIZE}]Set the universe size.
\item[\texttt{MPIEXEC\_PORT\_RANGE}]Set the range of ports that
\texttt{mpiexec} will use  
  in communicating with the processes that it starts.  The format of 
  this is \texttt{<low>:<high>}.  For example, to specify any port between
  10000 and 10100, use \texttt{10000:10100}.  
\item[\texttt{MPICH\_PORT\_RANGE}]Has the same meaning as
\texttt{MPIEXEC\_PORT\_RANGE} and is used if \texttt{MPIEXEC\_PORT\_RANGE} is
not set. 
\item[\texttt{MPIEXEC\_PREFIX\_DEFAULT}]If this environment variable is set,
output to standard output is prefixed by the rank in \texttt{MPI\_COMM\_WORLD}
of the process and output to standard error is prefixed by the rank and the
text \texttt{(err)}; both are followed by an angle bracket (\texttt{>}).  If 
  this variable is not set, there is no prefix.
\item[\texttt{MPIEXEC\_PREFIX\_STDOUT}]Set the prefix used for lines sent to
standard output.  A \texttt{\%d} is replaced with the rank in
\texttt{MPI\_COMM\_WORLD}; a \texttt{\%w} is replaced with an indication of
which \texttt{MPI\_COMM\_WORLD} in MPI jobs that involve multiple
\texttt{MPI\_COMM\_WORLD}s (e.g., ones that use \texttt{MPI\_Comm\_spawn} or
\texttt{MPI\_Comm\_connect}). 
\item[\texttt{MPIEXEC\_PREFIX\_STDERR}]Like \texttt{MPIEXEC\_PREFIX\_STDOUT},
but for standard error. 
\item[\texttt{MPIEXEC\_STDOUTBUF}]Sets the buffering mode for standard
  output.  Valid values are \texttt{NONE} (no buffering),
  \texttt{LINE} (buffering by lines), and \texttt{BLOCK} (buffering by
  blocks of characters; the size of the block is implementation
  defined).  The default is \texttt{NONE}. 
\item[\texttt{MPIEXEC\_STDERRBUF}]Like \texttt{MPIEXEC\_STDOUTBUF},
  but for standard error. 
\end{description}
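
As an illustration, the \texttt{bash} or \texttt{sh} fragment below
sets several of these variables before starting a job.  The particular
values and prefix strings are arbitrary choices for this sketch; only
the variable names and formats come from the list above.
\begin{verbatim}
    # Terminate the job if it runs longer than 120 seconds
    export MPIEXEC_TIMEOUT=120

    # Restrict mpiexec to ports 10000 through 10100 (<low>:<high>)
    export MPIEXEC_PORT_RANGE=10000:10100

    # Prefix each output line with the rank (%d)
    export MPIEXEC_PREFIX_STDOUT='[%d] '
    export MPIEXEC_PREFIX_STDERR='[%d err] '

    # Buffer standard output by lines instead of not at all
    export MPIEXEC_STDOUTBUF=LINE

    mpiexec -n 4 ./a.out
\end{verbatim}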
\subsection{Restrictions of the remshell Process Management Environment}
\label{sec:restrictions-remshell}

The \texttt{remshell} ``process manager'' provides a very simple version of
\texttt{mpiexec} that makes use of the secure shell command (\texttt{ssh}) to
start processes on a collection of machines.  As this is intended primarily as
an illustration of how to build a version of \texttt{mpiexec} that works with
other process managers, it does not implement all of the features of the other
\texttt{mpiexec} programs described in this document.  In particular, it
ignores the command-line options that control the environment variables given
to the MPI programs.  It does, however, support the same output labeling
features provided by the \texttt{gforker} version of \texttt{mpiexec}.
This version of \texttt{mpiexec} can be used
much like the \texttt{mpirun} for the \texttt{ch\_p4} device in MPICH-1 to run
programs on a collection of machines that allow remote shells.  A file
named \texttt{machines} should contain the names of machines on which
processes can be run, one machine name per line.  There must be enough
machines listed to satisfy the requested number of processes; you can list the
same machine name multiple times if necessary.  
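
The sketch below creates such a file and checks that it lists enough
entries for a four-process job.  The host names are hypothetical, and
the file is written to the current working directory, which is an
assumption; this section does not state where \texttt{mpiexec} looks
for it.
\begin{verbatim}
    # One machine name per line; node01 is listed twice so that
    # it can host two processes.
    printf '%s\n' node01 node01 node02 node03 > machines

    # Compare the number of entries with the requested job size.
    nprocs=4
    nhosts=$(grep -c . machines)
    if [ "$nhosts" -ge "$nprocs" ]; then
        echo "machines has $nhosts entries for $nprocs processes"
    else
        echo "machines has only $nhosts entries; $nprocs needed"
    fi
\end{verbatim}
With the file in place, the job itself would be started in the usual
way, for example with \texttt{mpiexec -n 4 ./a.out}.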
\subsection{Using MPICH with SLURM and PBS}
\label{sec:external_pm}

There are multiple ways of using MPICH with SLURM or PBS. Hydra
provides native support for both SLURM and PBS, and is likely the
easiest way to use MPICH on these systems (see the Hydra
documentation above for more details).

Alternatively, SLURM also provides compatibility with MPICH's
internal process management interface. To use this, you need to
configure MPICH with SLURM support, and then use the \texttt{srun}
job launching utility provided by SLURM.
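
Once MPICH has been built with SLURM support, a launch might look like
the following sketch.  The process count and program name are
arbitrary, and the \texttt{configure} options needed to enable SLURM
support are not shown because they depend on the MPICH version (see
the \emph{MPICH Installer's Guide}).
\begin{verbatim}
    # Start 4 MPI processes with SLURM's launcher instead of mpiexec
    srun -n 4 ./a.out
\end{verbatim}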
For PBS, MPICH jobs can be launched in two ways: (i) use Hydra's
\texttt{mpiexec} with the appropriate options corresponding to PBS, or (ii)
use the OSC \texttt{mpiexec}.


\subsubsection{OSC mpiexec}
\label{sec:osc_mpiexec}

Pete Wyckoff from the Ohio Supercomputer Center provides an alternate
utility called OSC mpiexec to launch MPICH jobs on PBS systems. More
information about this can be found here:
\url{http://www.osc.edu/~pw/mpiexec}.
\section{Specification of Implementation Details}
\label{sec:specification}

The MPI Standard defines a number of areas where a library is free to
define its own specific behavior as long as such behavior is documented
appropriately. This section provides that documentation for MPICH where
necessary.

\subsection{MPI Error Handlers for Communicators}
\label{sec:errhandler}

In Section 8.3.1 (Error Handlers for Communicators) of the MPI-3.0
Standard~\cite{mpi-forum:mpi3},
MPI defines an error handler callback function as

\begin{verbatim}
typedef void MPI_Comm_errhandler_function(MPI_Comm *, int *, ...);
\end{verbatim}
where the first argument is the communicator in use, the second argument is
the error code to be returned by the MPI routine that raised the error, and
the remaining arguments are implementation-specific ``\texttt{varargs}''.
MPICH does not provide any arguments as part of this list, so a callback
function provided to MPICH is sufficient if its prototype is

\begin{verbatim}
typedef void MPI_Comm_errhandler_function(MPI_Comm *, int *);
\end{verbatim}
\section{Debugging}
\label{sec:debugging}

Debugging parallel programs is notoriously difficult.  Here we describe
a number of approaches, some of which depend on the exact version of
MPICH you are using. 


\subsection{TotalView}
\label{sec:totalview}

MPICH supports use of the TotalView debugger from Etnus.  If MPICH
has been configured to enable debugging with TotalView then one can
debug an MPI program using

\begin{verbatim}
    totalview -a mpiexec -a -n 3 cpi
\end{verbatim}

You will get a popup window from TotalView asking whether you want to
start the job in a stopped state.  If so, when the TotalView window
appears, you may see assembly code in the source window.  Click on
\texttt{main} in the stack window (upper left) to see the source of
the \texttt{main} function.  TotalView will show that the program (all
processes) is stopped in the call to \texttt{MPI\_Init}.

If you have TotalView 8.1.0 or later, you can use a TotalView feature
called indirect launch with MPICH. Invoke TotalView as:

\begin{verbatim}
    totalview <program> -a <program args>
\end{verbatim}

Then select the Process/Startup Parameters command. Choose the
Parallel tab in the resulting dialog box and choose MPICH as the
parallel system. Then set the number of tasks using the Tasks field
and enter other needed \texttt{mpiexec} arguments into the Additional Starter
Arguments field.
Packit Service c5cf8c
Packit Service c5cf8c
\section{Checkpointing}
Packit Service c5cf8c
\label{sec:checkpointing}
Packit Service c5cf8c
MPICH supports checkpoint/rollback fault tolerance when used with the
Packit Service c5cf8c
Hydra process manager.  Currently only the BLCR checkpointing library
Packit Service c5cf8c
is supported.  BLCR needs to be installed separately.  Below we
Packit Service c5cf8c
describe how to enable the feature in MPICH and how to use it.  This
Packit Service c5cf8c
information can also be found on the MPICH Wiki:
Packit Service c5cf8c
\url{http://wiki.mpich.org/mpich/index.php/Checkpointing}

\subsection{Configuring for checkpointing}
\label{sec:conf-checkp}

First, you need to have BLCR version 0.8.2 installed on your
machine.  If it's installed in the default system location, add the
following two options to your configure command:
\begin{small}
\begin{verbatim}
    --enable-checkpointing --with-hydra-ckpointlib=blcr
\end{verbatim}
\end{small}

If BLCR is not installed in the default system location, you'll need
to tell MPICH's configure where to find it.  You might also need to
set the \texttt{LD\_LIBRARY\_PATH} environment variable so that BLCR's shared
libraries can be found.  In this case add the following options to your
configure command:
\begin{small}
\begin{verbatim}
    --enable-checkpointing --with-hydra-ckpointlib=blcr
    --with-blcr=BLCR_INSTALL_DIR LD_LIBRARY_PATH=BLCR_INSTALL_DIR/lib
\end{verbatim}
\end{small}
where \texttt{BLCR\_INSTALL\_DIR} is the directory where BLCR has been
installed (whatever was specified in \texttt{--prefix} when BLCR was
configured).  Note that checkpointing is only supported with the Hydra
process manager.  Hydra will be used by default, unless you choose
something else with the \texttt{--with-pm=} configure option.

After it's configured, compile as usual (e.g., \texttt{make; make install}).
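
Putting these steps together, a complete build might look like the
following sketch.  The paths \texttt{/opt/blcr} and
\texttt{/opt/mpich} are placeholders chosen for illustration, not
defaults; substitute the locations used on your system, and drop the
\texttt{--with-blcr} and \texttt{LD\_LIBRARY\_PATH} settings if BLCR
is installed in the default system location.
\begin{small}
\begin{verbatim}
    # /opt/blcr and /opt/mpich are placeholder paths
    shell$ ./configure --prefix=/opt/mpich \
           --enable-checkpointing --with-hydra-ckpointlib=blcr \
           --with-blcr=/opt/blcr LD_LIBRARY_PATH=/opt/blcr/lib
    shell$ make
    shell$ make install
\end{verbatim}
\end{small}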

\subsection{Taking checkpoints}
\label{sec:taking-checkpoints}

To use checkpointing, include the \texttt{-ckpointlib} option for
\texttt{mpiexec} to specify the checkpointing library to use and
\texttt{-ckpoint-prefix} to specify the directory where the checkpoint
images should be written:
\begin{small}
\begin{verbatim}
    shell$ mpiexec -ckpointlib blcr \
           -ckpoint-prefix /home/buntinas/ckpts/app.ckpoint \
           -f hosts -n 4 ./app
\end{verbatim}
\end{small}

While the application is running, you can request a checkpoint at any
time by sending a \texttt{SIGUSR1} signal to \texttt{mpiexec}.  You
can also automatically checkpoint the application at regular
intervals using the \texttt{mpiexec} option
\texttt{-ckpoint-interval} to specify the number of seconds between
checkpoints:
\begin{small}
\begin{verbatim}
    shell$ mpiexec -ckpointlib blcr \
           -ckpoint-prefix /home/buntinas/ckpts/app.ckpoint \
           -ckpoint-interval 3600 -f hosts -n 4 ./app
\end{verbatim}
\end{small}
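
To request a checkpoint by hand instead, send the signal to the
\texttt{mpiexec} process.  A minimal sketch, assuming a POSIX shell in
which \texttt{mpiexec} was the most recently started background job
(so that \texttt{\$!} holds its process ID):
\begin{small}
\begin{verbatim}
    shell$ mpiexec -ckpointlib blcr \
           -ckpoint-prefix /home/buntinas/ckpts/app.ckpoint \
           -f hosts -n 4 ./app &
    # $! is the PID of the mpiexec job started above
    shell$ kill -s USR1 $!
\end{verbatim}
\end{small}
If \texttt{mpiexec} was started some other way, pass its process ID
to \texttt{kill} in place of \texttt{\$!}.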

The checkpoint/restart parameters can also be controlled with the
environment variables \texttt{HYDRA\_\linebreak[0]CKPOINTLIB},
\texttt{HYDRA\_\linebreak[0]CKPOINT\_\linebreak[0]PREFIX} and
\texttt{HYDRA\_\linebreak[0]CKPOINT\_\linebreak[0]INTERVAL}.
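
For example, the periodic-checkpoint run shown earlier could be
expressed with environment variables as in the following sketch.  It
assumes a Bourne-style shell and that each variable accepts the same
value as the corresponding \texttt{mpiexec} option:
\begin{small}
\begin{verbatim}
    # assumed: values mirror the -ckpointlib, -ckpoint-prefix
    # and -ckpoint-interval options
    shell$ export HYDRA_CKPOINTLIB=blcr
    shell$ export HYDRA_CKPOINT_PREFIX=/home/buntinas/ckpts/app.ckpoint
    shell$ export HYDRA_CKPOINT_INTERVAL=3600
    shell$ mpiexec -f hosts -n 4 ./app
\end{verbatim}
\end{small}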

Each checkpoint generates one file per node.  Note that checkpoints
for all processes on a node will be stored in the same file.  Each
time a new checkpoint is taken, an additional set of files is created.
The files are numbered by the checkpoint number.  This allows the
application to be restarted from checkpoints other than the most
recent.  The checkpoint number can be specified with the
\texttt{-ckpoint-num} parameter.  To restart the application from a
given checkpoint (checkpoint 5 in this example):
\begin{small}
\begin{verbatim}
    shell$ mpiexec -ckpointlib blcr \
           -ckpoint-prefix /home/buntinas/ckpts/app.ckpoint \
           -ckpoint-num 5 -f hosts -n 4
\end{verbatim}
\end{small}

Note that by default, the process will be restarted from the first
checkpoint, so in most cases, the checkpoint number should be
specified.


\section{Other Tools Provided with MPICH}
\label{sec:other-tools}
MPICH also includes a test suite for MPI functionality; this suite may
be found in the \texttt{mpich/test/mpi} source directory and can be
run with the command \texttt{make testing}.  This test suite should
work with any MPI implementation, not just MPICH.

\clearpage
\appendix

\section{Frequently Asked Questions}

The frequently asked questions are maintained online at
\url{http://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions}.

\bibliographystyle{plain}
\bibliography{user}

\end{document}