.\" .\" *********************************************************************
.\" .\" *                                                                   *
.\" .\" *             Copyright 2015-2019, Intel Corporation                *
.\" .\" *                                                                   *
.\" .\" *                       All Rights Reserved.                        *
.\" .\" *                                                                   *
.\" .\" *********************************************************************

.TH opafabricanalysis 8 "Intel Corporation" "Copyright(C) 2015\-2019" "IFSFFCLIRG (Man Page)"
.SH NAME
opafabricanalysis


.PP

\fB(All)\fR
Performs analysis of the fabric.
.SH Syntax
opafabricanalysis [-b|-e] [-s] [-d  \fIdir\fR] [-c  \fIfile\fR] [-t  \fIportsfile\fR]
.br
[-p  \fIports\fR]
[-T  \fItopology\(ulinput\fR]
.SH Options

.TP 10
--help

Produces full help text.
.TP 10
-b

Specifies the baseline mode. Default is compare/check mode.
.TP 10
-e

Evaluates health only. Default is compare/check mode.
.TP 10
-s

Saves history of failures (errors/differences).
.TP 10
-d dir

Specifies the top-level directory for saving baseline and history of failed checks. Default is /var/usr/lib/opa/analysis.
.TP 10
-c file

Specifies the error thresholds config file. Default is /etc/opa/opamon.conf.
.TP 10
-t portsfile

Specifies the file containing the list of local HFI ports used to access fabrics for analysis. Default is /etc/opa/ports.
.TP 10
-p ports

Specifies the list of local HFI ports used to access fabrics for analysis.

.IP
Default is first active port. The first HFI in the system is 1. The first port on an HFI is 1. Uses the format hfi:port,
.br
for example:
.RS
.TP 10

.sp
0:0
First active port in system.

.RE

.RS
.TP 10

.sp
0:y
Port \fIy\fR within system.

.RE

.RS
.TP 10

.sp
x:0
First active port on HFI \fIx\fR.

.RE

.RS
.TP 10

.sp
x:y
HFI \fIx\fR, port \fIy\fR.

.RE
.TP 10
-T \fItopology\(ulinput\fR

Specifies the name of the topology input file to use. Any %P markers in this filename are replaced with the HFI:port being operated on (such as 0:0 or 1:2). Default is /etc/opa/topology.%P.xml. If -T NONE is specified, no topology input file is used. See
\fIDetails\fR
and
\fIopareport\fR
for more information.
.SH Example
opafabricanalysis
.br

opafabricanalysis -p \[aq]1:1 1:2 2:1 2:2\[aq]
.PP
The fabric analysis tool checks the following:
.IP \(bu
Fabric links (both internal to switch chassis and external cables)
.IP \(bu
Fabric components (nodes, links, SMs, systems, and their SMA configuration)
.IP \(bu
Fabric PMA error counters and link speed mismatches
.PP

.B NOTE:
The comparison includes components on the fabric. Therefore, operations such as shutting down a server cause the server to no longer appear on the fabric and are flagged as a fabric change or failure by opafabricanalysis.
.SH Environment Variables

.PP
The following environment variables are also used by this command:
.TP 10
\fBPORTS\fR

List of ports, used in absence of -t and -p.
.TP 10
\fBPORTS\(ulFILE\fR

File containing list of ports, used in absence of -t and -p.
.TP 10
\fBFF\(ulTOPOLOGY\(ulFILE\fR

File containing topology\(ulinput (may have %P marker in filename), used in absence of -T.
.TP 10
\fBFF\(ulANALYSIS\(ulDIR\fR

Top-level directory for baselines and failed health checks.
.SH Details

.PP
For simple fabrics, the Intel(R) Omni-Path Fabric Suite FastFabric Toolset host is connected to a single fabric. By default, the first active port on the FastFabric Toolset host is used to analyze the fabric. However, in more complex fabrics, the FastFabric Toolset host may be connected to more than one fabric or subnet. In this case, you can specify the ports or HFIs to use with one of the following methods:
.IP \(bu
On the command line using the -p option.
.IP \(bu
In a file specified using the -t option.
.IP \(bu
Through the environment variables \fBPORTS\fR or \fBPORTS\(ulFILE\fR.
.IP \(bu
Using the \fBPORTS\(ulFILE\fR configuration option in opafastfabric.conf.
.PP
If the specified port does not exist or is empty, the first active port on the local system is used. In more complex configurations, you must specify the exact ports to use for all fabrics to be analyzed.
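.PP
As an illustration of the hfi:port format accepted by -p, the following shell sketch classifies a single port specification. The function name describe_port is hypothetical and is not part of the FastFabric Toolset; it only restates the rules listed under the -p option.
.nf
```shell
# Hypothetical helper: describe one hfi:port specification.
describe_port() {
    hfi=${1%%:*}    # text before the first colon
    port=${1##*:}   # text after the last colon
    if [ "$hfi" = 0 ] && [ "$port" = 0 ]; then
        echo "first active port in system"
    elif [ "$hfi" = 0 ]; then
        echo "port $port within system"
    elif [ "$port" = 0 ]; then
        echo "first active port on HFI $hfi"
    else
        echo "HFI $hfi, port $port"
    fi
}

describe_port 0:0
describe_port 2:1
```
.fi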
.PP
You can specify the topology\(ulinput file to be used with one of the following methods:
.IP \(bu
On the command line using the -T option.
.IP \(bu
In a file specified through the environment variable \fBFF\(ulTOPOLOGY\(ulFILE\fR.
.IP \(bu
Using the ff\(ultopology\(ulfile configuration option in opafastfabric.conf.
.PP
If the specified file does not exist, no topology\(ulinput file is used. Alternatively, the filename can be specified as NONE to prevent use of an input file.
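.PP
The %P marker substitution described for -T can be sketched in shell as follows. The function name expand_topology_name is hypothetical, and this is not how opafabricanalysis itself is implemented; it only shows the resulting filename for a given hfi:port value.
.nf
```shell
# Hypothetical helper: replace every %P marker with the hfi:port value.
# $1 is the filename template, $2 is the hfi:port being operated on.
expand_topology_name() {
    template=$1
    port=$2
    echo "$template" | sed "s|%P|$port|g"
}

expand_topology_name /etc/opa/topology.%P.xml 1:2
```
.fi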
.PP
For more information on topology\(ulinput, refer to
\fIopareport\fR.
.PP
By default, the error analysis includes PMA counters and slow links (that is, links running below enabled speeds). You can change this using the \fBFF\(ulFABRIC\(ulHEALTH\fR configuration parameter in opafastfabric.conf. This parameter specifies the opareport options and reports to be used for the health analysis. It also can specify the PMA counter clearing behavior (-I \fIseconds\fR, -C, or none at all).
.PP
When a topology\(ulinput file is used, it can also be useful to extend \fBFF\(ulFABRIC\(ulHEALTH\fR to include fabric topology verification options such as -o verifylinks.
.PP
The thresholds for PMA counter analysis default to /etc/opa/opamon.conf. However, you can specify an alternate configuration file for thresholds using the -c option. The opamon.si.conf file can also be used to check for any non-zero values for signal integrity (SI) counters.
.PP
All files generated by opafabricanalysis start with fabric in their file name. This is followed by the port selection option identifying the port used for the analysis. Default is 0:0.
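.PP
The naming rule above can be sketched in shell. The function name fabric_file is hypothetical; the sketch only assembles a name from the fabric prefix, the port selection, and a suffix, and falls back to 0:0 when no port is given.
.nf
```shell
# Hypothetical helper: build an output file name from port and suffix.
# $1 is the hfi:port selection (0:0 when empty), $2 is the file suffix.
fabric_file() {
    port=${1:-0:0}
    suffix=$2
    echo "fabric.$port.$suffix"
}

fabric_file 0:0 snapshot.xml
fabric_file 1:2 links
```
.fi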
.PP
The opafabricanalysis tool generates files such as the following within FF\(ulANALYSIS\(ulDIR:
.PP

\fBHealth Check\fR

.IP \(bu
latest/fabric.0:0.errors stdout of opareport for errors encountered during fabric error analysis.

.IP \(bu
latest/fabric.0:0.errors.stderr stderr of opareport during fabric error analysis.

.PP

\fBBaseline\fR

.PP
During a baseline run, the following files are also created in FF\(ulANALYSIS\(ulDIR/latest.
.IP \(bu
baseline/fabric.0:0.snapshot.xml opareport snapshot of complete fabric components and SMA configuration.

.IP \(bu
baseline/fabric.0:0.comps opareport summary of fabric components and basic SMA configuration.

.IP \(bu
baseline/fabric.0:0.links opareport summary of internal and external links.
.PP

\fBFull Analysis\fR

.IP \(bu
latest/fabric.0:0.snapshot.xml opareport snapshot of complete fabric components and SMA configuration.

.IP \(bu
latest/fabric.0:0.snapshot.stderr stderr of opareport during snapshot.

.IP \(bu
latest/fabric.0:0.errors stdout of opareport for errors encountered during fabric error analysis.

.IP \(bu
latest/fabric.0:0.errors.stderr stderr of opareport during fabric error analysis.

.IP \(bu
latest/fabric.0:0.comps stdout of opareport for fabric components and SMA configuration.

.IP \(bu
latest/fabric.0:0.comps.stderr stderr of opareport for fabric components.

.IP \(bu
latest/fabric.0:0.comps.diff diff of baseline and latest fabric components.

.IP \(bu
latest/fabric.0:0.links stdout of opareport summary of internal and external links.

.IP \(bu
latest/fabric.0:0.links.stderr stderr of opareport summary of internal and external links.

.IP \(bu
latest/fabric.0:0.links.diff diff of baseline and latest fabric internal and external links.

.IP \(bu
latest/fabric.0:0.links.changes.stderr stderr of opareport comparison of links.

.IP \(bu
latest/fabric.0:0.links.changes opareport comparison of links against baseline. This is typically easier to read than the links.diff file and contains the same information.

.IP \(bu
latest/fabric.0:0.comps.changes.stderr stderr of opareport comparison of components.

.IP \(bu
latest/fabric.0:0.comps.changes opareport comparison of components against baseline. This is typically easier to read than the comps.diff file and contains the same information.

.PP
The .diff and .changes files are only created if differences are detected.
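.PP
Because the .diff and .changes files exist only when differences were found, their presence can serve as a quick check after a run. The following shell sketch lists any such files for one port. The function name list_change_files is hypothetical, and the directory falls back to the -d default when FF\(ulANALYSIS\(ulDIR is unset.
.nf
```shell
# Hypothetical helper: print the .diff and .changes files found for a port.
# $1 is the directory holding the latest results, $2 is the hfi:port value.
list_change_files() {
    for kind in links comps; do
        for ext in diff changes; do
            f="$1/fabric.$2.$kind.$ext"
            [ -f "$f" ] && echo "$f"
        done
    done
    return 0
}

# Prints nothing when no differences were recorded.
list_change_files "${FF_ANALYSIS_DIR:-/var/usr/lib/opa/analysis}/latest" 0:0
```
.fi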
.PP
If the -s option is used and failures are detected, files related to the checks that failed are also copied to a time-stamped directory under FF\(ulANALYSIS\(ulDIR.
.SH Fabric Items Checked Against the Baseline

.PP
Based on opareport -o links:
.IP \(bu
Unconnected/down/missing cables
.IP \(bu
Added/moved cables
.IP \(bu
Changes in link width and speed
.IP \(bu
Changes to Node GUIDs in fabric (replacement of HFI or Switch hardware)
.IP \(bu
Adding/Removing Nodes [FI, Virtual FIs, Virtual Switches, Physical Switches, Physical Switch internal switching cards (leaf/spine)]
.IP \(bu
Changes to server or switch names
.PP
Based on opareport -o comps:
.IP \(bu
Overlap with items from links report
.IP \(bu
Changes in port MTU, LMC, number of VLs
.IP \(bu
Changes in port speed/width enabled or supported
.IP \(bu
Changes in HFI or switch device IDs/revisions/VendorID (for example, ASIC hardware changes)
.IP \(bu
Changes in port Capability mask (which features/agents run on port/server)
.IP \(bu
Changes to ErrorLimits and PKey enforcement per port
.IP \(bu
Changes to IOUs/IOCs/IOC Services provided


.PP
Location (port, node) and number of SMs in fabric. Includes:
.IP \(bu
Primary and backups
.IP \(bu
Configured priority for SM
.SH Fabric Items Also Checked During Health Check

.PP
Based on opareport -s -C -o errors -o slowlinks:
.IP \(bu
PMA error counters on all Intel(R) Omni-Path Fabric ports (HFI, switch external and switch internal) checked against configurable thresholds.
.IP \(bu
Counters are cleared each time a health check is run. Each health check reflects a counter delta since last health check.
.IP \(bu
Typically identifies potential fabric errors, such as symbol errors.
.IP \(bu
May also identify transient congestion, depending on the counters that are monitored.
.IP \(bu
Link active speed/width as compared to Enabled speed.
.IP \(bu
Identifies links whose active speed/width is < min (enabled speed/width on each side of link).
.IP \(bu
This typically reflects bad cables or bad ports or poor connections.
.IP \(bu
Side effect is the verification of SA health.
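.PP
The slow link rule above (active value below the minimum of the enabled values on both sides of the link) can be sketched in shell. The function name is_slow_link is hypothetical, and the sketch assumes all three values are plain integers in the same unit, which is a simplification rather than the representation opareport uses.
.nf
```shell
# Hypothetical helper: succeed (exit status 0) when a link is slow.
# $1 is the active value, $2 and $3 are the enabled values on each side.
# All three are assumed to be integers in the same unit.
is_slow_link() {
    min=$2
    [ "$3" -lt "$min" ] && min=$3
    [ "$1" -lt "$min" ]
}

if is_slow_link 12 25 25; then
    echo "slow link"
fi
```
.fi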