#
# Copyright © 2013-2014 University of Wisconsin-La Crosse.
#                         All rights reserved.
# Copyright © 2016 Inria.  All rights reserved.
#
# Additional copyrights may follow.
# See COPYING in top-level directory.
#

First build and install netloc, then add the directory containing the netloc
binaries to your PATH environment variable. In bash:

  $ PATH=$PATH:/path/to/netloc/installation/bin
  $ export PATH
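
To confirm that the PATH change took effect, a small check along these lines
may help. It is only a sketch: missing_tools is a hypothetical helper name,
and the tool names are the ones used later in this document.

```shell
# List the tools given as arguments that are not found on PATH.
# Prints one missing tool name per line; prints nothing if all are found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing_tools netloc_ib_gather_raw netloc_ib_extract_dats lsnettopo
```

Empty output means all three tools can be found.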

Then create a directory to gather your cluster information and enter it:

  $ mkdir mycluster-data
  $ cd mycluster-data/

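Create a subdirectory "hwloc" and place the lstopo XML output of each node
there. The command below writes the file on the node itself, so it assumes
~/mycluster-data is on a filesystem shared with the nodes.

  $ mkdir hwloc
  $ ssh node001 lstopo ~/mycluster-data/hwloc/node001.xml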
  [...]

  $ ls -l hwloc/
  -rw-r--r-- 1 user group 74242 13 sept. 15:07 node001.xml
  [...]
  -rw-r--r-- 1 user group 26438 24 sept. 08:59 node263.xml
  -rw-r--r-- 1 user group 26438 13 sept. 08:11 node264.xml
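
With many nodes, the per-node ssh commands can be generated rather than typed.
The sketch below only prints the commands, in the same form as the one shown
above. lstopo_commands is a hypothetical helper, and node001 and node002 are
placeholders for your own node names.

```shell
# Hypothetical helper: read node names from stdin (one per line) and print
# the matching "ssh NODE lstopo FILE" command for each, without running it.
lstopo_commands() {
    outdir=$1
    while read -r node; do
        [ -n "$node" ] || continue   # skip blank lines
        printf 'ssh %s lstopo %s/%s.xml\n' "$node" "$outdir" "$node"
    done
}

# node001 and node002 stand in for your own node names.
printf '%s\n' node001 node002 | lstopo_commands '~/mycluster-data/hwloc'
```

Redirect the output to a file, review it, and then run that file with sh.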

Now run the netloc_ib_gather_raw script to gather IB network information.
It uses the hwloc information to find out the IB subnets to query.

It uses the ibnetdiscover and ibroute utilities, which require privileged
access. You can either run the entire script as root, or pass --sudo
so that ibnetdiscover and ibroute are called through sudo.

If the hwloc XML files are not in the "hwloc" subdirectory, specify their
location with --hwloc-dir.

  $ netloc_ib_gather_raw ~/mycluster-data --sudo
  Found 1 subnets in hwloc directory:
   Subnet fe80:0000:0000:0000 is locally accessible from board qib0 port 1.

  Looking at fe80:0000:0000:0000 (through local board qib0 port 1)...
   Running ibnetdiscover...
   Getting routes...
    Running ibroute for switch LID 19...
    [...]
    Running ibroute for switch LID 7...
    Running ibroute for switch LID 12...

Now you have IB information under the "ib-raw" directory.
Each subnet gets one file and one directory.

  $ ls -l ~/mycluster-data/ib-raw/
  -rw-r--r-- 1 user group 274493 22 janv. 15:21 ib-subnet-fe80:0000:0000:0000.txt
  drwxr-xr-x 2 user group   4096 22 janv. 15:21 ibroutes-fe80:0000:0000:0000/
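
Since each subnet should have both a file and a directory, a quick consistency
check is possible. The sketch below assumes the ib-subnet-SUBNET.txt and
ibroutes-SUBNET naming shown in the listing above. subnets_missing_routes is a
hypothetical helper name.

```shell
# Print each subnet that has an ib-subnet-*.txt file but no matching
# ibroutes-* directory under the given raw directory.
subnets_missing_routes() {
    rawdir=$1
    for f in "$rawdir"/ib-subnet-*.txt; do
        [ -e "$f" ] || continue          # the glob matched nothing
        subnet=${f##*/ib-subnet-}        # strip directory and prefix
        subnet=${subnet%.txt}            # strip suffix
        [ -d "$rawdir/ibroutes-$subnet" ] || echo "$subnet"
    done
}

subnets_missing_routes ~/mycluster-data/ib-raw
```

Empty output means every subnet file has its routes directory.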

The last step is to run netloc_ib_extract_dats to collect the IB information.

By default it reads from the "ib-raw" directory and writes its output under
"netloc". Use "--raw-dir" and "--out-dir" to change these locations.

  $ netloc_ib_extract_dats .
  ----------------------------------------------------------------------
  Processing Subnet: fe80:0000:0000:0000
  ----------------------------------------------------------------------
  --------------- General Network Information

The netloc information is now ready for use under the "netloc" subdirectory.

  $ ls -l netloc/
  -rw-r--r-- 1 user group 3226573 22 janv. 15:31 IB-fe80:0000:0000:0000-log-paths.ndat
  -rw-r--r-- 1 user group  384011 22 janv. 15:31 IB-fe80:0000:0000:0000-nodes.ndat
  -rw-r--r-- 1 user group 3237964 22 janv. 15:31 IB-fe80:0000:0000:0000-phy-paths.ndat

  $ lsnettopo netloc/
  Network: IB-fe80:0000:0000:0000
    Type    : InfiniBand
    Subnet  : fe80:0000:0000:0000
    Hosts   :   279
    Switches:   25
    [...]
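
The gather, extract and listing steps above can be chained in a small wrapper.
This is a sketch rather than part of netloc: run, netloc_pipeline and the
DRY_RUN variable are hypothetical names, and the commands are the ones shown
in this document with the same arguments.

```shell
# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Chain the gather, extract and listing steps for one cluster directory.
# --sudo follows the example above; drop it when running as root.
netloc_pipeline() {
    dir=$1
    run netloc_ib_gather_raw "$dir" --sudo &&
    ( cd "$dir" && run netloc_ib_extract_dats . && run lsnettopo netloc/ )
}

# Print the commands without executing anything:
DRY_RUN=1 netloc_pipeline .
```

Without DRY_RUN, the same call runs the three commands in order and stops at
the first one that fails.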