                        OpenSM Release Notes 2.0.5
                       ============================

Version: OpenFabrics Enterprise Distribution (OFED) 1.1
Repo:    https://openib.org/svn/gen2/branches/1.1/src/userspace/management/osm
Version: 9535 (openib-2.0.5)
Date:    October 2006

1 Overview
----------
This document describes the contents of the OpenSM OFED 1.1 release.
OpenSM is an InfiniBand compliant Subnet Manager and Administration,
and runs on top of OpenIB. The OpenSM version for this release
is openib-2.0.5.

This document includes the following sections:
1 This Overview section (describing new features and software
  dependencies)
2 Known Issues And Limitations
3 Unsupported IB Compliance Statements
4 Major Bug Fixes
5 Main Verification Flows
6 Qualified software stacks and devices

1.1 Major New Features

* Partition manager:
  The partition manager provides a means to set up multiple partitions
  by providing a partition policy file. For details please read the
  doc/partition-config.txt or the opensm man page.

* Basic QoS Manager:
  Provides a uniform configuration of the entire fabric with values defined
  in the OpenSM options file. The options support different settings for
  CAs, Switches, and Routers. Note that this is disabled by default; the
  -Q option enables QoS fabric setup.

* Loading pre-routes from a file:
  A new routing module enables loading pre-routes from a file.
  To use this option you should use the command line options:
  "-R file -U <your routing file>" or
  "--routing_engine file --ucast_file <your routing file>"
  For more information refer to the file doc/modular-routing.txt
  or the opensm man page.

* SA MultiPathRecord support:
  The SA can now handle requests for multiple PathRecords in one query.
  This includes methods SA GetMulti/GetMultiResp and dual sided RMPP.

* PPC64 is now QAed and supported

* Support LMC > 0 for Switch Enhanced Port 0:
  Allows enhanced switch port 0 (ESP0) to have a non-zero LMC, using
  the configured subnet-wide LMC. Modifications to the LID assignment
  and routing were necessary to support this.
  Also, a configuration option was added to either use the LMC
  configured for the subnet for enhanced switch port 0, or set it to 0
  even if a non-zero LMC is configured for the subnet. The default is
  currently the latter option. The new configuration option is: lmc_esp0
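
As an illustration of what a non-zero LMC means for LID assignment, the
following Python sketch lists the LIDs a port would answer to and picks
the LMC applied to ESP0. The function names and the boolean form of
lmc_esp0 are assumptions made for this example; this is not OpenSM code.

```python
def lids_for_port(base_lid: int, lmc: int) -> range:
    """Return the 2**lmc consecutive LIDs that start at base_lid."""
    if not 0 <= lmc <= 7:
        raise ValueError("LMC is a 3-bit value (0..7)")
    count = 1 << lmc
    if base_lid % count != 0:
        # The low `lmc` bits of a base LID must be zero.
        raise ValueError("base LID is not aligned to the LMC")
    return range(base_lid, base_lid + count)


def esp0_lmc(subnet_lmc: int, lmc_esp0: bool) -> int:
    """LMC for enhanced switch port 0 (hypothetical model of lmc_esp0)."""
    # Option off (the default described above): ESP0 keeps LMC 0.
    return subnet_lmc if lmc_esp0 else 0
```

With a subnet LMC of 2, lids_for_port(8, esp0_lmc(2, True)) covers four
LIDs, while the default setting leaves ESP0 with a single LID.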

1.2 Minor New Features

* IPoIB broadcast group configuration:
  It is now possible to control the IPoIB broadcast group parameters
  (MTU, rate, SL) through the partitions configuration file.

* Limiting OpenSM log file size:
  By providing the command line option: "-L <size in MB>" or
  "--log_limit <size in MB>" the user can limit the generated log
  file size. When specified, the log file will be truncated upon reaching
  this limit.
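
The truncation behavior can be pictured with a small Python sketch. It
is only a model of "truncate upon reaching the limit": the class name is
hypothetical, the limit is given in bytes rather than MB to keep the
example small, and OpenSM's own logging code may differ in detail.

```python
import os


class TruncatingLog:
    """Append-only log that restarts from empty once it would pass max_bytes."""

    def __init__(self, path, max_bytes):
        self.max_bytes = max_bytes            # 0 means "no limit"
        self.fh = open(path, "ab")
        self.size = os.path.getsize(path)     # bytes already in the file

    def write(self, message):
        data = (message + "\n").encode()
        if self.max_bytes and self.size + len(data) > self.max_bytes:
            self.fh.truncate(0)               # drop everything logged so far
            self.size = 0
        self.fh.write(data)
        self.fh.flush()
        self.size += len(data)

    def close(self):
        self.fh.close()
```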

* Favor 1K MTU for Tavor (MT23108) HCA:
  In cases where a PathRecord or MultiPathRecord is queried and the
  requestor either does not specify the MTU or specifies it in a way
  that allows the MTU to be 1K, and one end of the path is a Tavor,
  the MTU is limited to 1K max.
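
The rule above can be sketched in Python as follows. The MTU codes and
selector values follow the IB PathRecord encoding; the function name and
parameters are assumptions for illustration, and this is not the OpenSM
implementation.

```python
# IB MTU codes and their sizes in bytes.
MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}
MTU_1K = 3


def select_path_mtu(path_mtu, selector=None, req_mtu=None, tavor_end=False):
    """Pick an MTU code for a path, or None if the request cannot be met.

    path_mtu  -- largest MTU code the path itself supports
    selector  -- None if the requester left the MTU unspecified, else
                 0 (greater than), 1 (less than), 2 (exactly), 3 (largest)
    req_mtu   -- MTU code the selector refers to (unused for None and 3)
    tavor_end -- True if either end of the path is a Tavor HCA
    """
    candidates = [code for code in MTU_BYTES if code <= path_mtu]
    if selector == 0:
        candidates = [c for c in candidates if c > req_mtu]
    elif selector == 1:
        candidates = [c for c in candidates if c < req_mtu]
    elif selector == 2:
        candidates = [c for c in candidates if c == req_mtu]
    if not candidates:
        return None
    # Favor 1K only when the request still allows it.
    if tavor_end and MTU_1K in candidates:
        return MTU_1K
    return max(candidates)
```

A request that explicitly asks for more than 1K is left alone, since 1K
is then not among the allowed values.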

* Man pages:
  Added opensm.8 and osmtest.8

* Leaf VL stall count control:
  A new parameter (leaf_vl_stall_count) was added to the options file.
  It controls the number of sequential packets dropped on a switch port
  driving an HCA/TCA/Router that cause the port to enter the VLStalled
  state.

* SM Polling/Handover defaults changed:
  The default number of SMInfo polling retries was decreased from 18
  to 4, which reduces the default handover time from 3 min to 40 seconds.
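
The two handover figures are consistent with a fixed polling interval of
10 seconds (180 s / 18 = 40 s / 4 = 10 s). That interval is inferred
from the numbers above rather than stated in these notes, so the sketch
below treats it as an assumption.

```python
def handover_time_s(polling_retries, polling_interval_s=10):
    """Worst-case time before a standby SM takes over, in seconds.

    polling_interval_s is an assumed value derived from the figures
    quoted in the release notes, not a documented default.
    """
    return polling_retries * polling_interval_s
```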

1.3 Library API Changes

* cl_mem* APIs deprecated in complib:
  These functions are now considered deprecated and should be
  replaced by direct calls to malloc, free, memset, etc.

* osm_log_init_v2 API added in libopensm:
  Supports providing the new option for log file truncation.

1.4 Software Dependencies

OpenSM depends on the installation of either OFED 1.1, OFED 1.0,
OpenIB gen2 (e.g. IBG2 distribution), OpenIB gen1 (e.g. IBGD
distribution), or Mellanox VAPI stacks. The qualified driver versions
are provided in Table 2, "Qualified IB Stacks".

1.5 Supported Devices Firmware

The main task of OpenSM is to initialize InfiniBand devices. The
qualified devices and their corresponding firmware versions
are listed in Table 3.

2 Known Issues And Limitations
------------------------------

* No Service / Key associations:
  There is no way to manage Service access by Keys.

* No SM to SM SMDB synchronization:
  This puts the burden of re-registering services, multicast groups, and
  inform-info on the client application (or IB access layer core).

* No "port down" event handling:
  Changing the switch port through which OpenSM connects to the IB
  fabric may cause incorrect operation. Please restart OpenSM whenever
  such a connectivity change is made.

* Changing connections during SM operation:
  Under some conditions the SM can get confused by a change in
  cabling (moving a cable from one switch port to another) and
  momentarily see this as having the same GUID appear connected
  to two different IB ports. Under some conditions, when the SM fails to
  get the corresponding change event, it might mistakenly report this case
  as a "duplicated GUID" case and abort. It is advisable to double-check
  the syslog after each such change in connectivity and restart
  OpenSM if it has exited.

3 Unsupported IB Compliance Statements
--------------------------------------
The following section lists all the IB compliance statements which
OpenSM does not support. Please refer to the IB specification for detailed
information regarding each compliance statement.

* C14-22 (Authentication):
  M_Key, M_KeyProtectBits and M_KeyLeasePeriod shall be set in one
  SubnSet method. As a work-around, an OpenSM option is provided for
  defining the protect bits.

* C14-67 (Authentication):
  On SubnGet(SMInfo) and SubnSet(SMInfo) - if M_Key is not zero then
  the SM shall generate a SubnGetResp if the M_Key matches, or
  silently drop the packet if M_Key does not match.

* C15-0.1.23.4 (Authentication):
  InformInfoRecords shall always be provided with the QPN set to 0,
  except for the case of a trusted request, in which case the actual
  subscriber QPN shall be returned.

* o13-17.1.2 (Event-FWD):
  If no permission to forward, the subscription should be removed and
  no further forwarding should occur.

* C14-24.1.1.5 and C14-62.1.1.22 (Initialization):
  GUIDInfo - SM should enable assigning Port GUIDInfo.

* C14-44 (Initialization):
  If the SM discovers that it is missing an M_Key to update CA/RT/SW,
  it should notify the higher level.

* C14-62.1.1.12 (Initialization):
  PortInfo:M_Key - Set the M_Key to a node based random value.

* C14-62.1.1.13 (Initialization):
  PortInfo:P_KeyProtectBits - set according to an optional policy.

* C14-62.1.1.24 (Initialization):
  SwitchInfo:DefaultPort - should be configured for random FDB.

* C14-62.1.1.32 (Initialization):
  RandomForwardingTable should be configured.

* o15-0.1.12 (Multicast):
  If the JoinState is SendOnlyNonMember = 1 (only), then the endport
  should join as sender only.

* o15-0.1.8 (Multicast):
  If a request to create an MCG has fields that cannot be met,
  return ERR_REQ_INVALID (currently ignores SL and FlowLabelTClass).

* C15-0.1.8.6 (SA-Query):
  Respond to SubnAdmGetTraceTable - this is an optional attribute.

* C15-0.1.13 (Services):
  Reject ServiceRecord create, modify or delete if the given
  ServiceP_Key does not match the one included in the ServiceGID port
  and the port that sent the request.

* C15-0.1.14 (Services):
  Provide means to associate service name and ServiceKeys.

4 Major Bug Fixes
-----------------

The following is a list of bugs that were fixed. Note that other less critical
or visible bugs were also fixed.

* "Broken" fabric (duplicated port GUIDs) handling improved:
  Replaced an assert with a real check to handle an invalid physical port
  in osm_node_info_rcv.c, which could occur on a broken fabric.

* SA client synchronous request failed but status returned was IB_SUCCESS
  even if there was no response:
  There was a missing setting of the status in the synchronous case.

* Memory leak fixes:
  1. In libvendor/osm_vendor_ibumad.c:osm_vendor_get_all_port_attr
  2. In libvendor/osm_vendor_ibumad_sa.c:__osmv_sa_mad_rcv_cb
  3. On receiving SMInfo SA request from a node that does not share a
     partition, the response mad was allocated but never free'd
     as it was never sent.

* Set(InformInfo) OpenSM Deadlock:
  Occurred when receiving a request with an unknown LID.

* PathRecord to inconsistent multicast destination:
  Fix the return error when multicast destination is not consistently
  indicated.

* Remove double calculation of reversible path:
  In osm_sa_path_record.c:__osm_pr_rcv_get_lid_pair_path a PathRecord
  query used to check twice whether the path is reversible.

* Some PathRecord log messages use "net order":
  Fix GUID net to host conversion in some osm_log messages.

* DR/LID routed SMPs direction bit handling:
  osm_resp.c:osm_resp_make_resp_smp, set direction bit only if direct
  routed class. This bug caused two issues:
  1. Get/Set responses always had the direction bit set.
  2. Trap represses never had the direction bit set.
  The direction bit needs setting in direct routed responses and it
  doesn't exist in LID routed responses.
  osm_sm_mad_ctrl.c: did not detect the "direction bit" correctly.
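
The intended rule (direction bit set for direct routed responses, absent
for LID routed ones) can be sketched in Python. The class values 0x01
and 0x81 and the position of the D bit follow the IB SMP format; the
constant and function names are chosen for this example and are not
taken from osm_resp.c.

```python
SUBN_LID_CLASS = 0x01    # LID routed subnet management class
SUBN_DIR_CLASS = 0x81    # direct routed subnet management class
DIRECTION_BIT = 0x8000   # D bit: top bit of the 16-bit status field


def response_status(mgmt_class: int, status: int) -> int:
    """Return the status field for a response SMP of the given class."""
    if mgmt_class == SUBN_DIR_CLASS:
        # Direct routed response: mark it as travelling back to the sender.
        return (status | DIRECTION_BIT) & 0xFFFF
    # LID routed response: the bit has no meaning and must stay clear.
    return status & ~DIRECTION_BIT & 0xFFFF
```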

* OpenSM crash due to transaction lookup (interop with Cisco stack):
  When a wire TID that maps to an internal TID of zero (after applying
  the mask) was received, the lookup of the transaction was successful.
  The stale transaction pointed to "free'd" memory.

* Better handling for Path/MultiPath requests for raw traffic

* Wrong ProducerType provided in Notice Reports:
  When formatting an SM generated report, the ProducerType was using
  CL_NTOH32 which cannot be used to format a 24-bit network order number.
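
To see why a 32-bit byte swap does not fit a 24-bit field, consider the
Python sketch below. It assumes the generic Notice layout in which the
first byte carries IsGeneric/Type and the next three bytes carry
ProducerType in network order; the helper names are hypothetical.

```python
import struct


def producer_type_from_wire(first_word: bytes) -> int:
    """Extract the 24-bit ProducerType from the first 4 bytes of a Notice."""
    if len(first_word) != 4:
        raise ValueError("expected exactly 4 bytes")
    # Bytes 1..3 hold the value, most significant byte first.
    return int.from_bytes(first_word[1:4], "big")


def swap32(value: int) -> int:
    """Reverse the byte order of a 32-bit value, as a 32-bit ntoh/hton
    does on a little endian host."""
    return struct.unpack("<I", struct.pack(">I", value))[0]
```

Reading the three bytes directly gives the expected small number, while
swap32 applied to the same number moves it into the top byte of the
word, outside the 24-bit range.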

* OpenSM breaks on PPC64:
  complib: Fixed memory corruption in cl_pool.c:cl_qcpool_init. This
  affected big endian 64-bit architectures only.

* Illegal Set(InformInfo) was wrongly successful in updating the SMDB:
  osm_sa_informinfo.c: In osm_infr_rcv_process_set_method, if sending
  error, don't call osm_infr_rcv_process_set_method

* RMPP queries of InformInfoRecord fail:
  ib_types.h: Pad ib_inform_info_record_t to a multiple of 8 bytes in
  size so that the attribute offset is calculated properly.
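
The SA header expresses AttributeOffset in units of 8 bytes, so a record
whose size is not a multiple of 8 cannot be described exactly. A minimal
Python sketch of the arithmetic follows; the 60-byte size used in the
usage note is an illustrative value, not a figure taken from ib_types.h.

```python
def pad_to_8(size: int) -> int:
    """Round a record size up to the next multiple of 8 bytes."""
    return (size + 7) & ~7


def attribute_offset(record_size: int) -> int:
    """Return the record size in 8-byte words, as carried in the SA header."""
    if record_size % 8 != 0:
        raise ValueError("record size must be a multiple of 8 bytes")
    return record_size // 8
```

For example, a 60-byte record has to be padded to 64 bytes, which gives
an attribute offset of 8.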

* Returning "invalid request" rather than "unsupported method/attribute":
  In these cases, a noncompliant response was being provided.

* Noncompliant response for SubnAdmGet(PortInfoRecord) with no match:
  osm_pir_rcv_process now returns "SA no records error" for SubnAdmGet
  with 0 records found.

* Noncompliant non base LID returned by some queries:
  The following attributes used to return the request LID rather than
  its base LID in responses: PKeyTableRecord, GUIDInfoRecord,
  SLtoVLMappingTableRecord, VLArbitrationTableRecord, LinkRecord
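
With LMC > 0 a port answers to several LIDs, and the base LID is the
requested LID with its low LMC bits cleared. A one-line Python sketch
(the function name is chosen for this example):

```python
def base_lid(lid: int, lmc: int) -> int:
    """Clear the low `lmc` bits of a LID to obtain the port's base LID."""
    return lid & ~((1 << lmc) - 1)
```

For instance, with LMC = 2 a query addressed to LID 0x13 should be
answered with base LID 0x10.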

* Noncompliant SubnAdmGet and SubnAdmGetTable:
  Mixing of error codes in case of no records or multiple records
  fixed for the attributes:
  LinearForwardingTableRecord, GUIDInfoRecord,
  VLArbitrationTableRecord, LinkRecord, PathRecord

* Segfault in InformInfo flows:
  Occurred under stress with concurrent Set/Delete/Get flows. Fixed by
  adding a missing lock.

* SA queries containing LID out of range did not return ERR_REQ_INVALID

5 Main Verification Flows
-------------------------

OpenSM verification is run using the following activities:
* osmtest - a stand-alone program
* ibmgtsim (IB management simulator) based - a set of flows that
  simulate clusters, inject errors and verify OpenSM capability to
  respond and bring up the network correctly.
* small cluster regression testing - where the SM is used on back to
  back or single switch configurations. The regression includes
  multiple OpenSM dedicated tests.
* cluster testing - when we run OpenSM to set up a large cluster, perform
  hand-off, reboots and reconnects, verify routing correctness and SA
  responsiveness at the ULP level (IPoIB and SDP).

5.1 osmtest

osmtest is an automated verification tool used for OpenSM
testing. Its verification flows are described by the list below.

* Inventory File: Obtain and verify all port info, node info, link and path
  records parameters.

* Service Record:
   - Register new service
   - Register another service (with a lease period)
   - Register another service (with service p_key set to zero)
   - Get all services by name
   - Delete the first service
   - Delete the third service
   - Added bad flows of get/delete of a non-valid service
   - Add / Get same service with different data
   - Add / Get / Delete by different component mask values (services
     by Name & Key / Name & Data / Name & Id / Id only )

* Multicast Member Record:
   - Query of existing Groups (IPoIB)
   - BAD Join with insufficient comp mask (o15.0.1.3)
   - Create given MGID=0 (o15.0.1.4)
   - Create given MGID=0xFF12A01C,FE800000,00000000,12345678 (o15.0.1.4)
   - Create BAD MGID=0xFA. (o15.0.1.6)
   - Create BAD MGID=0xFF12A01B w/ link-local not set (o15.0.1.6)
   - New MGID with invalid join state (o15.0.1.9)
   - Retry of existing MGID - See JoinState update (o15.0.1.11)
   - BAD RATE when connecting to existing MGID (o15.0.1.13)
   - Partial JoinState delete request - removing FullMember (o15.0.1.14)
   - Full Delete of a group (o15.0.1.14)
   - Verify Delete by trying to Join deleted group (o15.0.1.14)
   - BAD Delete of IPoIB membership (no prev join) (o15.0.1.15)

* GUIDInfo Record:
   - All GUIDInfoRecords in subnet are obtained

* MultiPathRecord:
   - Perform some compliant and noncompliant MultiPathRecord requests
   - Validation is via status in responses and IB analyzer

* PKeyTableRecord:
  - Perform some compliant and noncompliant PKeyTableRecord queries
  - Validation is via status in responses and IB analyzer

* LinearForwardingTableRecord:
  - Perform some compliant and noncompliant LinearForwardingTableRecord queries
  - Validation is via status in responses and IB analyzer

* Event Forwarding: Register for trap forwarding using reports
   - Send a trap and wait for report
   - Unregister non-existing

* Trap 64/65 Flow: Register to Trap 64-65, create traps (by
  disconnecting/connecting ports) and wait for report, then unregister.

* Stress Test: send PortInfoRecord queries, both single and RMPP and
  check for the rate of responses as well as their validity.


5.2 IB Management Simulator OpenSM Test Flows

The simulator provides the ability to simulate the SM handling of virtual
topologies that are not limited to actual lab equipment availability.
OpenSM was simulated to bring up clusters of up to 10,000 nodes. Daily
regressions use smaller clusters (16 and 128 nodes).

The following test flows are run on the IB management simulator:

* Stability:
  Up to 12 links from the fabric are randomly selected to drop packets
  at drop rates up to 90%. The SM is required to succeed in bringing the
  fabric up. The resulting routing is verified to be correct as well.

* LID Manager:
  Using LMC = 2 the fabric is initialized with LIDs. Faults such as
  zero LID, Duplicated LID, non-aligned (to LMC) LIDs are
  randomly assigned to various nodes and other errors are randomly
  output to the guid2lid cache file. The SM sweep is run 5 times and
  after each iteration a complete verification is made to ensure that all
  LIDs that could possibly be maintained are kept, as well as that all nodes
  were assigned a legal LID range.

* Multicast Routing:
  Nodes randomly join the 0xc000 group and eventually the
  resulting routing is verified for completeness and adherence to
  Up/Down routing rules.

* osmtest:
  The complete osmtest flow as described in section 5.1 is run on
  the simulated fabrics.

* Stress Test:
  This flow merges fabric, LID and stability issues with continuous
  PathRecord, ServiceRecord and Multicast Join/Leave activity to
  stress the SM/SA during continuous sweeps. InformInfo Set/Delete/Get
  were added to the test such that both existing and non-existing nodes
  perform them in random order.

5.3 OpenSM Regression

Using a back-to-back or single switch connection, the following set of
tests is run nightly on the stacks described in Table 2. The included
tests are:

* Stress Testing: Flood the SA with queries from multiple channel
  adapters to check the robustness of the entire stack up to the SA.

* Dynamic Changes: Dynamic Topology changes, through randomly
  dropping SMP packets, used to test OpenSM adaptation to an unstable
  network & verify DB correctness.

* Trap Injection: This flow injects traps to the SM and verifies that it
  handles them gracefully.

* SA Query Test: This test exhaustively checks the SA responses to every
  possible single-bit component mask. To do that the test examines the
  entire set of records the SA can provide, classifies them by their
  field values and then selects every field (using component mask and a
  value) and verifies that the response matches the expected set of records.
  A random selection using multiple component mask bits is also performed.
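
The single-bit part of the SA Query Test can be sketched in Python as
follows. The record fields, the bit assignment in FIELD_BITS and the
query callback are hypothetical stand-ins; the actual regression test is
not reproduced here.

```python
from collections import defaultdict

# Hypothetical field -> component mask bit assignment for one record type.
FIELD_BITS = {"lid": 0, "port_num": 1, "mtu": 2}


def reference_query(records, component_mask, template):
    """Return the records that match the template on every masked field."""
    selected = [f for f, bit in FIELD_BITS.items()
                if component_mask & (1 << bit)]
    return [r for r in records
            if all(r[f] == template[f] for f in selected)]


def check_single_component_masks(records, query):
    """Issue one query per (field, value) pair and collect the mismatches."""
    failures = []
    for field, bit in FIELD_BITS.items():
        by_value = defaultdict(list)      # classify records by field value
        for r in records:
            by_value[r[field]].append(r)
        for value, expected in by_value.items():
            got = query(records, 1 << bit, {field: value})
            if got != expected:
                failures.append((field, value))
    return failures
```

Passing the SA under test as the query callback yields an empty list
when every single-field selection returns exactly the expected records.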

5.4 Cluster testing

Cluster testing is usually run before a distribution release. It
involves real hardware setups of 16 to 32 nodes (or more if a beta site
is available). Each test is validated by running all-to-all ping through the IB
interface. The test procedure includes:

* Cluster bringup

* Hand-off between 2 or 3 SMs while performing:
  - Node reboots
  - Switch power cycles (disconnecting the SMs)

* Unresponsive port detection and recovery

* osmtest from multiple nodes

* Trap injection and recovery


6 Qualification
---------------

Table 2 - Qualified IB Stacks
=============================

Stack                                    | Version
-----------------------------------------|--------------------------
OFED                                     |   1.1
OFED                                     |   1.0
OpenIB Gen2 (IBG2 distribution)          |   1.0
OpenIB Gen1 (IBGD distribution)          |   1.8.0
VAPI (Mellanox InfiniBand HCA Driver)    |   3.2 and later

Table 3 - Qualified Devices and Corresponding Firmware
======================================================

Mellanox
Device  |   FW versions
--------|-----------------------------------------------------------
MT43132 |   InfiniScale - fw-43132  5.2.0 (and later)
MT47396 |   InfiniScale III - fw-47396 0.5.0 (and later)
MT23108 |   InfiniHost - fw-23108   3.3.2 (and later)
MT25204 |   InfiniHost III Lx - fw-25204  1.0.1i (and later)
MT25208 |   InfiniHost III Ex (InfiniHost Mode) - fw-25208  4.6.2 (and later)
MT25208 |   InfiniHost III Ex (MemFree Mode) - fw-25218  5.0.1 (and later)

QLogic/PathScale
Device  |   Note
--------|-----------------------------------------------------------
iPath   | QHT6040 (PathScale InfiniPath HT-460)
iPath   | QHT6140 (PathScale InfiniPath HT-465)
iPath   | QLE6140 (PathScale InfiniPath PE-880)

Note: OpenSM does not run on an IBM Galaxy (eHCA) as it does not expose
QP0 and QP1. However, it does support it as a device on the subnet.