Common Causes of External Bus Reset Events in the VTrak Logs
The External Bus Reset error is seen when the connection
between the VTrak and the SCSI HBA times out. The VTrak then sends out a command to
try to reestablish the connection, which produces the External Bus Reset event
in the event log of the VTrak.
Common Causes:
1: An External Bus Reset event naturally occurs if the
server that is connected to VTrak is rebooted. You will see one when the server
is shutting down and coming back up.
2: There may be a cabling problem.
- Be sure to use a 1 meter (3 ft) cable. Anything longer may cause signal
fluctuations between the server and the VTrak. You can also try lowering
the speed on the HBA to see whether that stops the External Bus Resets.
3: There may be a port issue on the VTrak or the HBA.
- Try swapping the port on either the VTrak or the HBA to see whether the
External Bus Resets stop.
4: There may be a software issue pertaining to the HBA.
- Make sure you are using drivers that are provided by the HBA manufacturer.
It may also be a good idea to double-check that the BIOS and firmware on the
HBA are up to date. Some OS-compatible drivers may cause the External
Bus Reset events to appear at a constant interval.
5: There may be a compatibility issue between the VTrak and
the HBA.
- Be sure to check the VTrak compatibility list, which you can find on our
website.
If you need help troubleshooting the problem, please give us
a call at 408-228-1400 (Option 4) and we will be glad to help.
The SAS expansion cable is the weakest link in an otherwise cable-less system, and issues can arise due to a faulty cable having a marginal signal that will only cause a transmission problem under very particular circumstances. A SAS cable has a fragile connector inside that could be affected by all of the handling it undergoes from manufacturing to final installation, so care must be taken when unpacking and connecting the cables between the VTrak enclosures.
The Promise VTrak has the ability to report any signal errors that may occur over the two SAS paths from the RAID controllers to the JBOD I/O modules. This can be done by using the serial port and a terminal emulator or using the network port and a telnet or SSH session.
The command to display SAS communication error statistics is:
administrator@cli> sasdiag -a errorlog -l expander -e X -i Y
The two command parameters used to check the individual SAS ports of each RAID controller and I/O module are:
-e refers to the enclosure number, starting with 1 for the RAID unit.
-i refers to the expander number: 1 on the left and 2 on the right.
This diagram illustrates the values of these two parameters and the SAS ports they are referencing, which can be best thought of as rows and columns.
Note: additional JBODs would be numbered 3, 4, etc.
For this example, we have one VTrak RAID and one VTrak JBOD, which gives a total of four SAS ports to check.
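As a rough illustration of how the -e and -i values combine, the short Python sketch below builds the command line for every SAS port in a system. The function name and the enclosure_count parameter are made up for this example; only the sasdiag command text itself comes from this article, and the script does not connect to the VTrak.

```python
def sasdiag_errorlog_commands(enclosure_count):
    """Build one 'sasdiag errorlog' command per SAS expander.

    enclosure_count is hypothetical input: 1 for the RAID unit alone,
    2 for one RAID unit plus one JBOD, and so on.
    """
    commands = []
    for enclosure in range(1, enclosure_count + 1):  # -e: 1 is the RAID unit
        for expander in (1, 2):  # -i: 1 is the left module, 2 is the right
            commands.append(
                f"sasdiag -a errorlog -l expander -e {enclosure} -i {expander}"
            )
    return commands


# One RAID unit and one JBOD: four commands, one per SAS port.
for command in sasdiag_errorlog_commands(2):
    print(command)
```

The commands still have to be entered at the administrator@cli> prompt over the serial, telnet, or SSH session.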
To display the SAS statistics for the RAID controller on the left we would issue the command:
administrator@cli> sasdiag -a errorlog -l expander -e 1 -i 1
To display the SAS statistics for the JBOD I/O module on the right we would issue the command:
administrator@cli> sasdiag -a errorlog -l expander -e 2 -i 2
This command will return the SAS statistics for each “lane” - denoted by the PHYId - with the first sixteen entries being for the 16 drives in the enclosure, and the remaining eight entries for the SAS expansion port. This means that even though 24 entries are returned, we will only be looking at entries 17 through 24.
administrator@cli> sasdiag -a errorlog -l expander -e 1 -i 1
The error counts for PHYIds 17 through 24 should all be “0” if there are no transmission errors detected on the port.
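The check on PHYIds 17 through 24 can be sketched in Python as follows. This is only an illustration: it assumes the error counts have already been copied by hand from the sasdiag output into a dictionary keyed by PHYId, with one total per PHYId. The function name and the dictionary layout are assumptions, not part of the VTrak CLI.

```python
# PHYIds 17 through 24 belong to the SAS expansion port (per this article).
EXPANSION_PHY_IDS = range(17, 25)


def nonzero_expansion_errors(counts_by_phy):
    """Return {PHYId: count} for expansion-port PHYs with a nonzero count.

    counts_by_phy is an assumed layout: PHYId -> error count, transcribed
    manually from the sasdiag output. Missing PHYIds are treated as 0.
    """
    return {
        phy: counts_by_phy.get(phy, 0)
        for phy in EXPANSION_PHY_IDS
        if counts_by_phy.get(phy, 0) != 0
    }


# Example with made-up numbers: drive lane 3 and expansion lane 18 show errors.
counts = {phy: 0 for phy in range(1, 25)}
counts[3] = 5
counts[18] = 2
print(nonzero_expansion_errors(counts))
```

Errors on PHYIds 1 through 16 are ignored here because those entries belong to the drives, not to the SAS expansion cable.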
To test the quality of the SAS expansion cable and attempt to induce any errors that may occur due to a marginal cable, we will perform what is referred to as a “wiggle” test. This involves holding the cable near the center and slowly moving it in a circle approximately two to three inches in diameter for a few seconds. Please see the attached video.
Use the sasdiag command again to display the SAS statistics for the two SAS ports that are connected by the SAS cable under test:
administrator@cli> sasdiag -a errorlog -l expander -e 1 -i 1
administrator@cli> sasdiag -a errorlog -l expander -e 2 -i 1
If any SAS errors occurred during the test, the counts for PHYIds 17 through 24 will have increased.
If the test results in these counts increasing, power down both the RAID head and attached JBODs and then unplug and re-insert the SAS cable(s). Power back up and re-run the test. If the counts continue to increase, replace the SAS cable and test again.
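To make the before-and-after comparison concrete, here is a minimal Python sketch. It assumes the counts for one expander were noted down once before the wiggle test and once after it, as dictionaries keyed by PHYId; the function name and data layout are hypothetical and are not produced by the VTrak itself.

```python
def increased_counts(before, after, phy_ids=range(17, 25)):
    """Return {PHYId: increase} for PHYs whose count grew during the test.

    before and after are assumed to map PHYId -> error count, recorded by
    hand from two runs of the sasdiag errorlog command on the same expander.
    Only the SAS expansion PHYIds (17 through 24) are compared by default.
    """
    changes = {}
    for phy in phy_ids:
        delta = after.get(phy, 0) - before.get(phy, 0)
        if delta > 0:
            changes[phy] = delta
    return changes


# Made-up readings: PHYId 18 rose from 1 to 3, PHYId 22 rose from 0 to 1.
before = {17: 0, 18: 1, 21: 4}
after = {17: 0, 18: 3, 21: 4, 22: 1}
print(increased_counts(before, after))
```

An empty result means no counter grew during the test. A count that was already nonzero but did not change, such as PHYId 21 in the example, is not reported.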
To make it easier to recognize any incrementing errors, you may want to clear the SAS counters. For a VTrak running firmware SR2.5, the RAID controllers must be restarted. For firmware SR2.6 you can use this command, remembering to run it for each expander:
administrator@cli> sasdiag -a clearerrlog -l expander -e 1 -i 1
Note: the controller restart method must be used for the VTrak for Mac.
Additional Information:
- PHYIds 17-20 signify the "Out" port on the controllers/IO modules. They are identified by the "Circle" on the rear of each module.
- PHYIds 21-24 signify the "In" port on the controllers/IO modules. They are identified by the "Diamond" on the rear of each module.
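The PHYId ranges given in this article can be summarized as a small Python lookup, which may help when reading the 24 entries of the error log. The function name is made up for illustration, and the ranges assume a 16-drive enclosure as described above.

```python
def describe_phy(phy_id):
    """Map a PHYId to the lane it represents, per the ranges in this article."""
    if 1 <= phy_id <= 16:
        return "drive"
    if 17 <= phy_id <= 20:
        return 'SAS expansion "Out" port (Circle)'
    if 21 <= phy_id <= 24:
        return 'SAS expansion "In" port (Diamond)'
    raise ValueError(f"PHYId {phy_id} is outside the range 1-24")


for phy_id in (4, 17, 24):
    print(phy_id, describe_phy(phy_id))
```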