Exadata and Exalogic Elastic Cloud IB Ports State Shown as Down Along with Infiniband Switch Ports Status Showing AutomaticHighErrorRate Message

Oracle Exadata Troubleshooting

In Exadata and Exalogic Elastic Cloud racks, IB ports may show a state of "Down" when you run the "ibstat" command.

This issue can occur on compute nodes, InfiniBand switches, or storage heads. A down IB port on a compute node can be confirmed from the InfiniBand switch by running the command below.

[root@xc000xc-ibb01 ~]# listlinkup
Connector 0A Present <-> Switch Port 20 is up (Enabled)
Connector 1A Present <-> Switch Port 22 is up (Enabled)
Connector 2A Present <-> Switch Port 24 is up (Enabled)
Connector 3A Present <-> Switch Port 26 is up (Enabled)
Connector 4A Present <-> Switch Port 28 is up (Enabled)
Connector 5A Present <-> Switch Port 30 is up (Enabled)
Connector 6A Present <-> Switch Port 35 is up (Enabled)
Connector 7A Present <-> Switch Port 33 is up (Enabled)
Connector 8A Present <-> Switch Port 31 is up (Enabled)
Connector 9A Present <-> Switch Port 14 is up (Enabled)
Connector 10A Present <-> Switch Port 16 is up (Enabled)
Connector 11A Present <-> Switch Port 18 is up (Enabled)
Connector 12A Not present
Connector 13A Present <-> Switch Port 09 is up (Enabled)
Connector 14A Present <-> Switch Port 07 is up (Enabled)
Connector 15A Present <-> Switch Port 5 down(AutomaticHighErrorRate)
Connector 16A Present <-> Switch Port 03 is up (Enabled)
Connector 17A Present <-> Switch Port 01 is up (Enabled)
Connector 0B Present <-> Switch Port 19 is up (Enabled)
Connector 1B Present <-> Switch Port 21 is up (Enabled)
Connector 2B Present <-> Switch Port 23 is up (Enabled)
Connector 3B Present <-> Switch Port 25 is up (Enabled)
Connector 4B Present <-> Switch Port 27 is up (Enabled)
Connector 5B Not present
Connector 6B Present <-> Switch Port 36 is up (Enabled)
Connector 7B Present <-> Switch Port 34 is up (Enabled)
Connector 8B Present <-> Switch Port 32 is up (Enabled)
Connector 9B Present <-> Switch Port 13 is up (Enabled)
Connector 10B Present <-> Switch Port 15 is up (Enabled)
Connector 11B Present <-> Switch Port 17 is up (Enabled)
Connector 12B Present <-> Switch Port 12 is up (Enabled)
Connector 13B Present <-> Switch Port 10 is up (Enabled)
Connector 14B Present <-> Switch Port 08 is up (Enabled)
Connector 15B Present <-> Switch Port 06 is up (Enabled)
Connector 16B Present <-> Switch Port 04 is up (Enabled)
Connector 17B Present <-> Switch Port 02 is up (Enabled)
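
With 36 connectors in the listing, the one down port is easy to miss. The sketch below filters the output so only connectors that are present but not reported as up remain. The function name `find_down_connectors` is hypothetical, and the matching assumes the line layout shown in the output above.

```shell
# Hypothetical helper: reads `listlinkup` output on stdin and prints every
# connector line that is present but not reported as "is up".
# Assumes the line layout shown in the listing above.
find_down_connectors() {
    awk '/^Connector/ && / Present / && !/ is up /'
}

# Assumed usage on the switch:
#   listlinkup | find_down_connectors
```

For the listing above, this would leave only the connector 15A line with the AutomaticHighErrorRate status.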

[root@xc000xc-iba01 ~]# getportstatus 15A
Port status for connector 15A Switch port 5
Adminstate:………………….Disabled
LinkWidthEnabled:…………….1X or 4X
LinkWidthSupported:…………..1X or 4X
LinkWidthActive:……………..4X
LinkSpeedSupported:…………..2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkState:…………………..Active
PhysLinkState:……………….LinkUp
LinkSpeedActive:……………..10.0 Gbps
LinkSpeedEnabled:…………….2.5 Gbps or 5.0 Gbps or 10.0 Gbps
NeighborMTU:…………………2048
OperVLs:…………………….VL0-7

Meanwhile, the SymbolErrors counter for the port climbs very rapidly, within seconds, as seen below.

[root@xc000xc-iba01 ~]# date; getportcounters 15a | grep "SymbolErrors"
Wed Feb 5 10:13:20 +03 2020
SymbolErrors…………………1120
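
A single reading does not show how fast the counter grows. A minimal sketch of one way to estimate the growth is to take two samples a known number of seconds apart and divide the difference by the interval. The helper names `symbol_errors` and `symbol_error_rate` are hypothetical, and the parsing assumes the counter line looks like the one above (name, filler characters, then the value).

```shell
# Hypothetical helper: extracts the SymbolErrors value from
# `getportcounters` output read on stdin.
symbol_errors() {
    sed -n 's/^SymbolErrors[^0-9]*\([0-9][0-9]*\)[[:space:]]*$/\1/p'
}

# Hypothetical helper: whole symbol errors per second between two samples.
# Arguments: first count, second count, seconds between the samples.
symbol_error_rate() {
    echo $(( ($2 - $1) / $3 ))
}

# Assumed usage on the switch:
#   first=$(getportcounters 15A | symbol_errors); sleep 10
#   second=$(getportcounters 15A | symbol_errors)
#   symbol_error_rate "$first" "$second" 10
```

The division is integer division, so rates below one error per second are reported as 0.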

Run the following command to verify whether the autodisable feature is turned on for the connector:

[root@xc000xc-iba01 ~]# autodisable list
Connectors which will be disabled on high error rate:
0A 0B 1A 1B 2A 2B 3A 3B 4A 4B 5A 5B 6A 6B 7A 7B 8A 8B 9A 9B 10A 10B 11A 11B 12A 12B 13A 13B 14A 14B 15A 15B 16A 16B 17A 17B
Connectors which will be disabled on suboptimal link speed or width:
0A 0B 1A 1B 2A 2B 3A 3B 4A 4B 5A 5B 6A 6B 7A 7B 8A 8B 9A 9B 10A 10B 11A 11B 12A 12B 13A 13B 14A 14B 15A 15B 16A 16B 17A 17B
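
To check one connector without reading the whole list, the output can be searched for it. The sketch below is an assumption-laden example: `autodisable_covers` is a hypothetical name, and it relies on the high-error-rate connectors appearing on the single line that follows the "high error rate:" heading, as in the output above.

```shell
# Hypothetical helper: reads `autodisable list` output on stdin and succeeds
# (exit status 0) if the given connector is in the high-error-rate list.
# Assumes the connectors sit on the one line after the heading.
autodisable_covers() {
    awk -v c="$1" '
        /high error rate:/ {
            getline
            for (i = 1; i <= NF; i++) if ($i == c) found = 1
        }
        END { exit !found }
    '
}

# Assumed usage on the switch:
#   autodisable list | autodisable_covers 15A && echo "15A is covered"
```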

Re-enable connector 15A, which the switch disabled because of the high symbol error rate.

[root@xc000xc-iba01 ~]# enableswitchport --automatic 15A

Verify the status of the port you just re-enabled by running the following command on the iba01 switch.

[root@xc000xc-iba01 ~]# getportstatus 15A
Port status for connector 15A Switch port 5
Adminstate:………………….Enabled
LinkWidthEnabled:…………….1X or 4X
LinkWidthSupported:…………..1X or 4X
LinkWidthActive:……………..4X
LinkSpeedSupported:…………..2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkState:…………………..Active
PhysLinkState:……………….LinkUp
LinkSpeedActive:……………..10.0 Gbps
LinkSpeedEnabled:…………….2.5 Gbps or 5.0 Gbps or 10.0 Gbps
NeighborMTU:…………………2048
OperVLs:…………………….VL0-7
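
The field that changes between the two `getportstatus` outputs is Adminstate. As a rough sketch, it can be pulled out on its own for a quick before-and-after comparison. `port_adminstate` is a hypothetical name, and the pattern assumes the field is printed as the name, filler characters, and then a single word.

```shell
# Hypothetical helper: prints the Adminstate value (for example Enabled or
# Disabled) from `getportstatus` output read on stdin.
port_adminstate() {
    sed -n 's/^Adminstate:[^A-Za-z]*\([A-Za-z][A-Za-z]*\)[[:space:]]*$/\1/p'
}

# Assumed usage on the switch:
#   [ "$(getportstatus 15A | port_adminstate)" = "Enabled" ] && echo "15A enabled"
```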

# listlinkup
Connector 0A Present <-> Switch Port 20 is up (Enabled)
Connector 1A Present <-> Switch Port 22 is up (Enabled)
Connector 2A Present <-> Switch Port 24 is up (Enabled)
Connector 3A Present <-> Switch Port 26 is up (Enabled)
Connector 4A Present <-> Switch Port 28 is up (Enabled)
Connector 5A Present <-> Switch Port 30 is up (Enabled)
Connector 6A Present <-> Switch Port 35 is up (Enabled)
Connector 7A Present <-> Switch Port 33 is up (Enabled)
Connector 8A Present <-> Switch Port 31 is up (Enabled)
Connector 9A Present <-> Switch Port 14 is up (Enabled)
Connector 10A Present <-> Switch Port 16 is up (Enabled)
Connector 11A Present <-> Switch Port 18 is up (Enabled)
Connector 12A Not present
Connector 13A Present <-> Switch Port 09 is up (Enabled)
Connector 14A Present <-> Switch Port 07 is up (Enabled)
Connector 15A Present <-> Switch Port 05 is up (Enabled)
Connector 16A Present <-> Switch Port 03 is up (Enabled)
Connector 17A Present <-> Switch Port 01 is up (Enabled)
Connector 0B Present <-> Switch Port 19 is up (Enabled)
Connector 1B Present <-> Switch Port 21 is up (Enabled)
Connector 2B Present <-> Switch Port 23 is up (Enabled)
Connector 3B Present <-> Switch Port 25 is up (Enabled)
Connector 4B Present <-> Switch Port 27 is up (Enabled)
Connector 5B Not present
Connector 6B Present <-> Switch Port 36 is up (Enabled)
Connector 7B Present <-> Switch Port 34 is up (Enabled)
Connector 8B Present <-> Switch Port 32 is up (Enabled)
Connector 9B Present <-> Switch Port 13 is up (Enabled)
Connector 10B Present <-> Switch Port 15 is up (Enabled)
Connector 11B Present <-> Switch Port 17 is up (Enabled)
Connector 12B Present <-> Switch Port 12 is up (Enabled)
Connector 13B Present <-> Switch Port 10 is up (Enabled)
Connector 14B Present <-> Switch Port 08 is up (Enabled)
Connector 15B Present <-> Switch Port 06 is up (Enabled)
Connector 16B Present <-> Switch Port 04 is up (Enabled)
Connector 17B Present <-> Switch Port 02 is up (Enabled)

Run the commands below to check for remaining errors and to confirm the port state across the fabric.

[root@xc000xc-iba01 ~]# ibqueryerrors.pl -rR -s RcvSwRelayErrors,XmtDiscards,XmtWait
Suppressing: RcvSwRelayErrors,XmtDiscards,XmtWait

# ibcheckstate -v

Checking Switch: nodeguid 0x0010e0dc1269a0a0

Node check lid 1: OK
Port check lid 1 port 36: OK
Port check lid 1 port 35: OK
Port check lid 1 port 34: OK
Port check lid 1 port 33: OK
Port check lid 1 port 32: OK
Port check lid 1 port 31: OK
Port check lid 1 port 30: OK
Port check lid 1 port 28: OK
Port check lid 1 port 27: OK
Port check lid 1 port 26: OK
Port check lid 1 port 25: OK
Port check lid 1 port 24: OK
Port check lid 1 port 23: OK
Port check lid 1 port 22: OK
Port check lid 1 port 21: OK
Port check lid 1 port 20: OK
Port check lid 1 port 19: OK
Port check lid 1 port 18: OK
Port check lid 1 port 17: OK
Port check lid 1 port 16: OK
Port check lid 1 port 15: OK
Port check lid 1 port 14: OK
Port check lid 1 port 13: OK
Port check lid 1 port 12: OK
Port check lid 1 port 10: OK
Port check lid 1 port 9: OK
Port check lid 1 port 8: OK
Port check lid 1 port 7: OK
Port check lid 1 port 6: OK
Port check lid 1 port 5: OK
Port check lid 1 port 4: OK
Port check lid 1 port 3: OK
Port check lid 1 port 2: OK
Port check lid 1 port 1: OK

Checking Switch: nodeguid 0x0010e0ceea80c0a0

Node check lid 11: OK
Port check lid 11 port 36: OK
Port check lid 11 port 35: OK
Port check lid 11 port 32: OK
Port check lid 11 port 30: OK
Port check lid 11 port 29: OK
Port check lid 11 port 28: OK
Port check lid 11 port 27: OK
Port check lid 11 port 24: OK
Port check lid 11 port 22: OK
Port check lid 11 port 18: OK
Port check lid 11 port 11: OK
Port check lid 11 port 10: OK
Port check lid 11 port 9: OK
Port check lid 11 port 8: OK
Port check lid 11 port 7: OK
Port check lid 11 port 6: OK
Port check lid 11 port 5: OK
Port check lid 11 port 4: OK
Port check lid 11 port 3: OK
Port check lid 11 port 2: OK
Port check lid 11 port 1: OK
Port check lid 11 port 21: OK
Port check lid 11 port 23: OK

Checking Switch: nodeguid 0x0010e0cfaa60c0a0

Node check lid 58: OK
Port check lid 58 port 32: OK
Port check lid 58 port 31: OK
Port check lid 58 port 24: OK
Port check lid 58 port 22: OK
Port check lid 58 port 18: OK
Port check lid 58 port 11: OK
Port check lid 58 port 10: OK
Port check lid 58 port 9: OK
Port check lid 58 port 8: OK
Port check lid 58 port 7: OK
Port check lid 58 port 6: OK
Port check lid 58 port 5: OK
Port check lid 58 port 4: OK
Port check lid 58 port 3: OK
Port check lid 58 port 2: OK
Port check lid 58 port 1: OK
Port check lid 58 port 36: OK
Port check lid 58 port 35: OK
Port check lid 58 port 30: OK
Port check lid 58 port 29: OK
Port check lid 58 port 28: OK
Port check lid 58 port 27: OK
Port check lid 58 port 21: OK
Port check lid 58 port 23: OK

Checking Switch: nodeguid 0x0010e0dc1072a0a0

Node check lid 2: OK
Port check lid 2 port 19: OK
Port check lid 2 port 21: OK

Checking Switch: nodeguid 0x0010e0dc180ca0a0

Node check lid 3: OK
Port check lid 3 port 30: OK
Port check lid 3 port 28: OK
Port check lid 3 port 27: OK
Port check lid 3 port 26: OK
Port check lid 3 port 25: OK
Port check lid 3 port 24: OK
Port check lid 3 port 23: OK
Port check lid 3 port 22: OK
Port check lid 3 port 21: OK
Port check lid 3 port 20: OK
Port check lid 3 port 19: OK
Port check lid 3 port 12: OK
Port check lid 3 port 10: OK
Port check lid 3 port 9: OK
Port check lid 3 port 8: OK
Port check lid 3 port 7: OK
Port check lid 3 port 6: OK
Port check lid 3 port 5: OK
Port check lid 3 port 4: OK
Port check lid 3 port 3: OK
Port check lid 3 port 2: OK
Port check lid 3 port 1: OK
Port check lid 3 port 32: OK
Port check lid 3 port 33: OK
Port check lid 3 port 35: OK
Port check lid 3 port 34: OK
Port check lid 3 port 36: OK
Port check lid 3 port 31: OK
Port check lid 3 port 17: OK
Port check lid 3 port 18: OK
Port check lid 3 port 15: OK
Port check lid 3 port 16: OK
Port check lid 3 port 13: OK
Port check lid 3 port 14: OK

Checking Ca: nodeguid 0x0010e00001d39a88

Node check lid 6: OK
Port check lid 6 port 2: OK

Checking Ca: nodeguid 0x0010e0cfaa60c000

Node check lid 48: OK
Port check lid 48 port 1: OK
Port check lid 48 port 2: OK

Checking Ca: nodeguid 0x0010e0cfaa60c040

Node check lid 52: OK
Port check lid 52 port 1: OK
Port check lid 52 port 2: OK

Checking Ca: nodeguid 0x0010e00001d38228

Node check lid 5: OK
Port check lid 5 port 2: OK
Port check lid 5 port 1: OK

Checking Ca: nodeguid 0x0010e00001d385e8

Node check lid 27: OK
Port check lid 27 port 2: OK
Port check lid 27 port 1: OK

Checking Ca: nodeguid 0x0010e00001d395d8

Node check lid 25: OK
Port check lid 25 port 2: OK
Port check lid 25 port 1: OK

Checking Ca: nodeguid 0x0010e00001d3a9a8

Node check lid 21: OK
Port check lid 21 port 2: OK
Port check lid 21 port 1: OK

Checking Ca: nodeguid 0x0010e00001d38578

Node check lid 23: OK
Port check lid 23 port 2: OK
Port check lid 23 port 1: OK

Checking Ca: nodeguid 0x0010e00001d3a9c8

Node check lid 17: OK
Port check lid 17 port 2: OK
Port check lid 17 port 1: OK

Checking Ca: nodeguid 0x0010e00001d39698

Node check lid 19: OK
Port check lid 19 port 2: OK
Port check lid 19 port 1: OK

Checking Ca: nodeguid 0x0010e00001d4e278

Node check lid 13: OK
Port check lid 13 port 2: OK
Port check lid 13 port 1: OK

Checking Ca: nodeguid 0x0010e00001d3a128

Node check lid 15: OK
Port check lid 15 port 2: OK
Port check lid 15 port 1: OK

Checking Ca: nodeguid 0x0010e0ceea80c000

Node check lid 7: OK
Port check lid 7 port 1: OK
Port check lid 7 port 2: OK

Checking Ca: nodeguid 0x0010e0ceea80c040

Node check lid 9: OK
Port check lid 9 port 1: OK
Port check lid 9 port 2: OK

Checking Ca: nodeguid 0x0010e00001d469f8

Node check lid 68: OK
Port check lid 68 port 1: OK
Port check lid 68 port 2: OK

Checking Ca: nodeguid 0x0010e00001d2d8e8

Node check lid 70: OK
Port check lid 70 port 1: OK
Port check lid 70 port 2: OK

Checking Ca: nodeguid 0x0010e00001d45818

Node check lid 72: OK
Port check lid 72 port 1: OK
Port check lid 72 port 2: OK

Checking Ca: nodeguid 0x0010e00001d2d968

Node check lid 74: OK
Port check lid 74 port 1: OK
Port check lid 74 port 2: OK

Checking Ca: nodeguid 0x0010e00001d4d558

Node check lid 42: OK
Port check lid 42 port 1: OK
Port check lid 42 port 2: OK

Checking Ca: nodeguid 0x0010e00001d5e260

Node check lid 36: OK
Port check lid 36 port 1: OK
Port check lid 36 port 2: OK

Checking Ca: nodeguid 0x0010e00001d4f568

Node check lid 54: OK
Port check lid 54 port 1: OK
Port check lid 54 port 2: OK

Checking Ca: nodeguid 0x0010e00001d4ab98

Node check lid 44: OK
Port check lid 44 port 1: OK
Port check lid 44 port 2: OK

Checking Ca: nodeguid 0x0010e00001d45968

Node check lid 56: OK
Port check lid 56 port 1: OK
Port check lid 56 port 2: OK

Checking Ca: nodeguid 0x0010e00001d4a998

Node check lid 46: OK
Port check lid 46 port 1: OK
Port check lid 46 port 2: OK

Checking Ca: nodeguid 0x0010e00001d4a9b8

Node check lid 32: OK
Port check lid 32 port 1: OK
Port check lid 32 port 2: OK

Checking Ca: nodeguid 0x0010e00001d2d5c8

Node check lid 67: OK
Port check lid 67 port 2: OK
Port check lid 67 port 1: OK

Checking Ca: nodeguid 0x0010e00001d2d008

Node check lid 63: OK
Port check lid 63 port 2: OK
Port check lid 63 port 1: OK

Checking Ca: nodeguid 0x0010e00001d2d458

Node check lid 65: OK
Port check lid 65 port 2: OK
Port check lid 65 port 1: OK

Checking Ca: nodeguid 0x0010e00001d2d6d8

Node check lid 62: OK
Port check lid 62 port 2: OK
Port check lid 62 port 1: OK

Checking Ca: nodeguid 0x0010e00001d5e100

Node check lid 41: OK
Port check lid 41 port 2: OK
Port check lid 41 port 1: OK

Checking Ca: nodeguid 0x0010e00001d3b298

Node check lid 29: OK
Port check lid 29 port 2: OK
Port check lid 29 port 1: OK

Checking Ca: nodeguid 0x0010e00001d4c258

Node check lid 31: OK
Port check lid 31 port 2: OK
Port check lid 31 port 1: OK

Checking Ca: nodeguid 0x0010e00001d3afc8

Node check lid 39: OK
Port check lid 39 port 2: OK
Port check lid 39 port 1: OK

Checking Ca: nodeguid 0x0010e00001d3b238

Node check lid 51: OK
Port check lid 51 port 2: OK
Port check lid 51 port 1: OK

Checking Ca: nodeguid 0x0010e00001d3b0a8

Node check lid 60: OK
Port check lid 60 port 2: OK
Port check lid 60 port 1: OK

Checking Ca: nodeguid 0x0010e00001d3b098

Node check lid 35: OK
Port check lid 35 port 2: OK
Port check lid 35 port 1: OK

Summary: 41 nodes checked, 0 bad nodes found

188 ports checked, 0 ports with bad state found
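
The verbose output is long, and only the two summary lines decide whether the fabric is clean. A small sketch, assuming the summary wording shown above, can turn them into an exit status. The function name `fabric_state_clean` is hypothetical.

```shell
# Hypothetical helper: reads `ibcheckstate` output on stdin and succeeds
# (exit status 0) only if both summary lines are present and report zero
# bad nodes and zero ports with bad state.
fabric_state_clean() {
    awk '
        /nodes checked,/ { seen_nodes = 1; if ($0 !~ /, 0 bad nodes found/) bad = 1 }
        /ports checked,/ { seen_ports = 1; if ($0 !~ /, 0 ports with bad state found/) bad = 1 }
        END { exit !(seen_nodes && seen_ports && !bad) }
    '
}

# Assumed usage on the switch:
#   ibcheckstate -v | fabric_state_clean && echo "fabric clean"
```

Requiring both summary lines means truncated or empty output counts as a failure rather than as a clean result.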

Congratulations, and have a nice day.
