KB 10299 Mx10i Immediate Data setting causing slow IO and possible controller reset
Date: 3/8/2011
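What the Immediate Data parameter does:
To reduce protocol overhead, iSCSI allows the initiator to send a small amount of write data (e.g. 4K) in the same PDU as the WRITE command, instead of waiting for the target to request the data. This saves a round trip and improves response time for small writes.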
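As a rough illustration of the paragraph above, the sketch below computes how much of a write could travel inside the WRITE command PDU itself. It assumes the iSCSI specification defaults for FirstBurstLength (65536) and MaxRecvDataSegmentLength (8192); the values actually negotiated between ESXi and the Mx10i may differ, and the function name is hypothetical.

```python
def immediate_data_bytes(write_len, immediate_data=True,
                         first_burst_length=65536,
                         max_recv_data_segment_length=8192):
    """Bytes of write data that may ride in the WRITE command PDU itself.

    The default limits are the iSCSI specification defaults, used here as
    assumptions; a real session uses whatever values were negotiated at login.
    """
    if not immediate_data:
        # ImmediateData=No: the command PDU carries no write data.
        return 0
    return min(write_len, first_burst_length, max_recv_data_segment_length)


for size in (4096, 65536):
    for enabled in (True, False):
        sent = immediate_data_bytes(size, immediate_data=enabled)
        print("write=%d ImmediateData=%s immediate=%d remaining=%d"
              % (size, "Yes" if enabled else "No", sent, size - sent))
```

Under these assumptions a 4K write fits entirely in the command PDU when Immediate Data is enabled, while larger writes still need follow-up Data-Out PDUs for the remainder.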
Problem Description
Recently we encountered a problem on Mx10i with Immediate Data enabled in the following environment:
Initiator: VMware ESXi 4.1 iSCSI initiator
Target: Mx10i with Immediate Data enabled
Operation:
1. Install Ubuntu 10.04 as a guest OS on ESXi
2. Use the ESXi iSCSI initiator to log in to the Mx10i
3. Map a LUN as a raw device for Ubuntu
4. Create a filesystem on the raw device in Ubuntu
5. Use dd to write a large file to the filesystem
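If a throughput figure is wanted for step 5, a sequential write can also be timed with a short script. This is only a sketch, not the procedure used when the problem was found (that was dd); it assumes Python 3 is available in the guest, and the function name and defaults are hypothetical.

```python
import os
import time


def timed_write(path, total_bytes, block_size=1024 * 1024):
    """Write total_bytes of zeros to path and return throughput in MB/s."""
    block = bytes(block_size)
    written = 0
    start = time.monotonic()
    with open(path, "wb") as f:
        while written < total_bytes:
            # The last chunk may be shorter than block_size.
            chunk = block[:min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        # Force the data out to the LUN rather than leaving it in page cache.
        os.fsync(f.fileno())
    elapsed = time.monotonic() - start
    return written / (1024 * 1024) / max(elapsed, 1e-9)
```

For example, timed_write("/mnt/vtrak/bigfile.bin", 4 * 1024**3) would write a 4 GB file, where /mnt/vtrak is a hypothetical mount point for the filesystem created in step 4. Comparing the result with Immediate Data enabled and disabled gives a simple before/after check.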
Symptom:
IO is very slow and a watchdog reset may occur. See the Appendix for sample console output.
Cause
The firmware/driver of the iSCSI controller has compatibility problems when Immediate Data is enabled. In this case, the iSCSI session is torn down and re-established frequently, and IO becomes slow. We have seen this corner case trigger Mx10i error handling and, in some instances, a watchdog reset.
Solution
Disable Immediate Data for this case. To disable ImmediateData you must use the advanced mode in the CLU:
1) Log in to the command line via serial, SSH, or Telnet to port 2300
2) At the CLI prompt, type Menu
3) Highlight iSCSI Management. Do not press Enter to go to the next menu level.
4) Press “CTRL” + “SHIFT” + “-” (Control Shift Minus) at the same time. The iSCSI Management option should change to Advanced iSCSI Management.
5) Choose Advanced iSCSI Management
6) Choose Node Settings. You may need to choose an iSCSI port; disable this setting for both ports.
7) Disable ImmediateData
8) Save the change
9) Reboot the VTrak
Appendix
Output from the debug console log while large file WRITEs are in progress with ImmediateData enabled. These messages repeat frequently:
sbist_ack_data_transfer(): READ failue result=4, pFeCmdResp=65129034 CDB2A
sbist_ack_data_transfer(): READ failue result=4, pFeCmdResp=650de034 CDB2A
sbist_ack_data_transfer(): READ failue result=4, pFeCmdResp=6511d034 CDB2A
sbist_ack_data_transfer(): READ failue result=4, pFeCmdResp=650f7034 CDB2A
sbist_ack_data_transfer(): READ failue result=4, pFeCmdResp=6511e034 CDB2A
FDIO_WARN: Fail to send status for 65106034
FDIO_WARN: Fail to send status for 650ea034
FDIO_WARN: Fail to send status for 65105034
FDIO_WARN: Fail to send status for 650ff034
FDIO_WARN: Fail to send status for 6512b034
FDIO_WARN: Fail to send status for 6512e034
FDIO_WARN: Fail to send status for 65128034
fdIOReleaseFedResource 915 65129034
fdIOReleaseFedResource 916 FF
fdIOReleaseFedResource 917 62EF2A50
fdIOReleaseFedResource 922 0
*****sbist_internal_task_abort 61ce9430 650f1034 *****
*****sbist_internal_task_abort 657c9850 650da034 *****
*****sbist_internal_task_abort 657c3a50 65124034 *****
*****sbist_internal_task_abort 61ce9550 65100034 *****
*****sbist_internal_task_abort 657c36f0 65111034 *****
*****sbist_internal_task_abort 657bdc50 6511e034 *****
*****sbist_internal_task_abort 657cf9d0 650de034 *****
*****sbist_internal_task_abort 657c57d0 6511d034 *****
*****sbist_internal_task_abort 62ef5210 650f7034 *****
*****sbist_internal_task_abort 657c4370 65106034 *****
*****sbist_internal_task_abort 657cb5d0 650ea034 *****
*****sbist_internal_task_abort 657c3b70 65105034 *****
*****sbist_internal_task_abort 61ceb750 650ff034 *****
*****sbist_internal_task_abort 63297c50 6512b034 *****
*****sbist_internal_task_abort 657bf790 6512e034 *****
*****sbist_internal_task_abort 657d0bd0 65128034 *****
*****sbist_internal_task_abort 657c93d0 650d3034 *****
fdIOReleaseFedResource 915 650de034
fdIOReleaseFedResource 916 FF
fdIOReleaseFedResource 917 657CF9D0
fdIOReleaseFedResource 922 0
fdIOReleaseFedResource 915 6511d034
fdIOReleaseFedResource 916 FF
fdIOReleaseFedResource 917 657C57D0
fdIOReleaseFedResource 922 0
fdIOReleaseFedResource 915 650f7034
fdIOReleaseFedResource 916 FF
fdIOReleaseFedResource 917 62EF5210
fdIOReleaseFedResource 922 0
fdIOReleaseFedResource 915 6511e034
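
To check a captured console log for the messages shown above, something like the following sketch could be used. The patterns are derived only from the sample output in this Appendix (including the "failue" spelling as it appears in the log); the function name and category names are hypothetical, and other firmware versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns copied from the sample output above; "failue" is spelled
# exactly as it appears in the console log.
SIGNATURES = {
    "ack_data_transfer_failure": re.compile(
        r"sbist_ack_data_transfer\(\): READ failue result=\d+"),
    "send_status_failure": re.compile(
        r"FDIO_WARN: Fail to send status for [0-9a-fA-F]+"),
    "internal_task_abort": re.compile(
        r"sbist_internal_task_abort [0-9a-fA-F]+ [0-9a-fA-F]+"),
}


def count_signatures(lines):
    """Count how many lines match each known message pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Passing an open file object for the saved log (any iterable of lines works) returns a count per message type. Many hits in a short capture while large writes are running would be consistent with the symptom described in this article.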