Transport Procedure for VTrak E-Class

Version 1.0 - January 18th, 2008

 

Transport is the action of moving the physical drives of a disk array:

  • To different slots in the same VTrak enclosure
  • From one VTrak enclosure to another

Promise recommends that you use the CLI, not WebPAM PRO, to perform a Transport.

Step 1: Preparing the Disk Array for Transport

Before you can use the Transport feature, you must verify normal operation of the:

  • Disk array
  • Logical drives
  • Source RAID controllers
  • Target RAID controllers
  • JBOD IO modules

Connecting to the RAID Head

  1. Connect to the Source RAID Head controller via Telnet or HyperTerminal.
     
  2. Verify that the source array’s Operational Status is OK.
    At the command line, type array -v and press Enter.
     
  3. Verify that the logical drive’s Operational Status is OK.
    At the command line, type logdrv -v and press Enter.
     
  4. Verify that no background activities are running on the disk array or logical drive.
    At the command line, type bga and press Enter.
     
  5. If you are moving the disk array to a different enclosure, verify that the RAID controllers in the target enclosure are:
    • Running firmware 3.28.0000.00 or the prescribed firmware build (3.29.0000.00 for Apple users who purchased the product from the Apple Store)
    • Running in Active/Active mode
    • Reporting an Operational Status of OK
    • Not reporting any link errors
    At the command line, type ctrl -v and press Enter.

Connecting to JBOD IO Modules

  1. Connect to each JBOD IO module via the RJ11 console.
     
  2. Verify that each JBOD IO module is running SEP firmware 1.07.0000.03 or newer (1.07.0000.04 for Apple users who purchased the product from the Apple Store).
    At the command line, type enclosure and press Enter.
     
  3. Verify that each JBOD IO module is free from link errors.
    Each JBOD enclosure has two IO modules.
    At the command line, type link and press Enter.

    The example below shows link output that is free of link errors:
cli:> link
Link Status:
Port Type Rate Init Dev Link PRdy
P 0 D01 SATA 3.0G OK End ---- Rdy
P 1 D02 SATA 3.0G OK End ---- Rdy
P 2 D03 SATA 3.0G OK End ---- Rdy
P 3 D04 SATA 3.0G OK End ---- Rdy
P 4 D05 SATA 3.0G OK End ---- Rdy
P 5 D06 SATA 3.0G OK End ---- Rdy
P 6 D07 SATA 3.0G OK End ---- Rdy
P 7 D08 SATA 3.0G OK End ---- Rdy
P 8 D09 SATA 3.0G OK End ---- Rdy
P 9 D10 SATA 3.0G OK End ---- Rdy
P10 D11 SATA 3.0G OK End ---- Rdy
P11 D12 SATA 3.0G OK End ---- Rdy
P12 D13 SATA 3.0G OK End ---- Rdy
P13 D14 SATA 3.0G OK End ---- Rdy
P14 D15 SATA 3.0G OK End ---- Rdy
P15 D16 SATA 3.0G OK End ---- Rdy
P16 CN1 SAS 3.0G OK Exp ---- Rdy
P17 CN1 SAS 3.0G OK Exp ---- Rdy
P18 CN1 SAS 3.0G OK Exp ---- Rdy
P19 CN1 SAS 3.0G OK Exp ---- Rdy
P20 CN2 SAS 3.0G OK Exp ---- Rdy
P21 CN2 SAS 3.0G OK Exp ---- Rdy
P22 CN2 SAS 3.0G OK Exp ---- Rdy
P23 CN2 SAS 3.0G OK Exp ---- Rdy

 

Port: Port Id       Type: SAS or SATA    Rate: Rate 1.5G/3G
Init: Init Passed   Dev:  Device Type    Link: Link Connected
PRdy: Phy Ready

 

Link Counter:
    InDW       DsEr       DwLo       PhRe       CoVi       PhCh
P 0 ---------- ---------- ---------- ---------- ---------- 0x0B
P 1 ---------- ---------- ---------- ---------- ---------- 0x0B
P 2 ---------- ---------- ---------- ---------- ---------- 0x0B
P 3 ---------- ---------- ---------- ---------- ---------- 0x0B
P 4 ---------- ---------- ---------- ---------- ---------- 0x0B
P 5 ---------- ---------- ---------- ---------- ---------- 0x0B
P 6 ---------- ---------- ---------- ---------- ---------- 0x0B
P 7 ---------- ---------- ---------- ---------- ---------- 0x0B
P 8 ---------- ---------- ---------- ---------- ---------- 0x0B
P 9 ---------- ---------- ---------- ---------- ---------- 0x0B
P10 ---------- ---------- ---------- ---------- ---------- 0x0B
P11 ---------- ---------- ---------- ---------- ---------- 0x0B
P12 ---------- ---------- ---------- ---------- ---------- 0x0B
P13 ---------- ---------- ---------- ---------- ---------- 0x0B
P14 ---------- ---------- ---------- ---------- ---------- 0x0B
P15 ---------- ---------- ---------- ---------- ---------- 0x0B
P16 ---------- ---------- ---------- ---------- ---------- 0x01
P17 ---------- ---------- ---------- ---------- ---------- 0x01
P18 ---------- ---------- ---------- ---------- ---------- 0x01
P19 ---------- ---------- ---------- ---------- ---------- 0x01
P20 ---------- ---------- ---------- ---------- ---------- 0x01
P21 ---------- ---------- ---------- ---------- ---------- 0x01
P22 ---------- ---------- ---------- ---------- ---------- 0x01
P23 ---------- ---------- ---------- ---------- ---------- 0x01

InDW: Invalid Dword Count      DsEr: Disparity Err Count
DwLo: Dword Sync Loss Count    PhRe: Phy Reset Problem Count
CoVi: Code Violations Count    PhCh: Phy Change Count
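
If the link output is captured to a text file, the Link Counter table can be checked by script rather than by eye. The Python sketch below is illustrative only and is not part of the Promise CLI. It assumes that the console prints one row per port, with the six counters in the order InDW, DsEr, DwLo, PhRe, CoVi, PhCh, and that a run of dashes means the counter is zero. Because PhCh is non-zero even in the error-free example, the sketch treats only the first five counters as errors; that choice is an assumption. The function names and the sample text are hypothetical.

```python
import re

COLUMNS = ["InDW", "DsEr", "DwLo", "PhRe", "CoVi", "PhCh"]
# Assumption: PhCh (Phy Change Count) is non-zero even in the error-free
# example, so only the first five counters are treated as errors here.
ERROR_COLUMNS = COLUMNS[:5]
ROW = re.compile(r"^\s*P\s*(\d+)\s+(.*)$")


def parse_link_counters(text):
    """Return {port: {counter: int}} for every Link Counter row in text."""
    counters = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue
        fields = match.group(2).split()
        if len(fields) != len(COLUMNS):
            continue  # e.g. a Link Status row, which has more columns
        try:
            values = {
                name: 0 if set(field) == {"-"} else int(field, 16)
                for name, field in zip(COLUMNS, fields)
            }
        except ValueError:
            continue  # not hexadecimal, so not a counter row
        counters[int(match.group(1))] = values
    return counters


def ports_with_errors(counters):
    """Return only the ports whose error counters are non-zero."""
    flagged = {}
    for port, values in counters.items():
        errors = {n: values[n] for n in ERROR_COLUMNS if values[n]}
        if errors:
            flagged[port] = errors
    return flagged


# Hypothetical capture: the P16 row is invented to show a Disparity error.
sample = """\
P 0 ---------- ---------- ---------- ---------- ---------- 0x0B
P16 ---------- 0x03       ---------- ---------- ---------- 0x01
"""
print(ports_with_errors(parse_link_counters(sample)))  # {16: {'DsEr': 3}}
```

An empty result means that none of the five error counters is set on any port that was parsed.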

Step 2: Interpreting Link Errors

If your system has no link errors, skip to “Step 4: Transporting a Disk Array.”

Link errors may be observed on P0 through P15, the ports that connect to the physical drives. These ports are not the main area of interest, but you may want to take corrective action. The link counter shows an error when any of the following counts increments:

  • (InDW) Invalid Dword Count
  • (DsEr) Disparity Err Count
  • (DwLo) Dword Sync Loss Count
  • (PhRe) Phy Reset Problem Count
  • (CoVi) Code Violations Count
  • (PhCh) Phy Change Count

These errors can be isolated cases that occur when a physical drive times out or resets, when it encounters read/write errors, or when an AAMUX adapter is bad.

    1. Clear the link errors.
      At the command line, type link -a clear and press Enter.
    2. Then type link and press Enter to see whether any link counter has incremented its hexadecimal value.

This action might also require a rebuild of the disk array to which the physical drive belongs.
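
The clear-and-recheck procedure above amounts to comparing two readings of the link counters. As a minimal sketch, assuming each reading has already been turned into a dictionary of the form {port: {counter name: value}}, the following Python function lists every counter that grew between the two readings. The function name and the sample readings are hypothetical.

```python
def incremented(before, after):
    """List (port, counter, old, new) for every counter that grew."""
    changes = []
    for port in sorted(after):
        old_row = before.get(port, {})
        for name, new in after[port].items():
            old = old_row.get(name, 0)
            if new > old:
                changes.append((port, name, old, new))
    return changes


# Hypothetical readings: the first taken right after clearing the
# counters, the second taken some time later.
first = {3: {"InDW": 0, "DsEr": 0}, 16: {"InDW": 0, "DsEr": 0}}
second = {3: {"InDW": 0, "DsEr": 0}, 16: {"InDW": 2, "DsEr": 0}}
print(incremented(first, second))  # [(16, 'InDW', 0, 2)]
```

An empty list means that no counter incremented after the clear.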

Focusing on Critical Links

The main area of interest is the link counters for P16 through P23. Errors here can affect the Transport operation or may cause the controller RAID Head IO modules to break a path and cause a controller to enter Maintenance Mode.

The link errors may increment when you issue the link command. These ports are the physical connectors on the JBOD IO module that are labeled CN1 and CN2.

For connector assignments, see page 23, Figure 17 of the VTrak J-Class Product Manual. For additional information on the link command output, see page 37 of the same manual: http://www.promise.com/upload/Support/Manual/VTrak_J610s_J310s_PM_v1.0a.pdf

If link errors are detected:

  1. Clear the link error.
    At the command line, type link -a clear and press Enter.
     
  2. Check to see if the link error comes back.
    At the command line, type link and press Enter.
     
  3. If errors return, identify the source of the link error.
    • CN1 = P16 through P19
    • CN2 = P20 through P23
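
The port-to-connector assignment can be expressed as a small lookup. The Python sketch below is illustrative only. It uses the mapping shown in the Link Status example (P0 through P15 are drive slots D01 through D16, P16 through P19 are CN1, and P20 through P23 are CN2); the function name is hypothetical.

```python
def describe_port(port):
    """Map a JBOD IO module port number to what it connects to."""
    if 0 <= port <= 15:
        return f"drive slot D{port + 1:02d}"
    if 16 <= port <= 19:
        return "SAS connector CN1"
    if 20 <= port <= 23:
        return "SAS connector CN2"
    raise ValueError(f"unknown port: P{port}")


print(describe_port(18))  # SAS connector CN1
print(describe_port(4))   # drive slot D05
```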

       

Step 3: Correcting Link Errors

After you have identified the source of the link errors, you must Fail Over the affected SAS domain before you can take corrective action.

  1. Pull the RAID controller for the affected SAS domain from the enclosure.
    See diagram below:




    When the RAID controller has been removed from the enclosure, all IOs will resume on the remaining RAID controller SAS domain. Controller Fail Over is almost instantaneous.

  2. Verify the controller Fail Over via the remaining RAID controller.
    Using Telnet or HyperTerminal, at the command line, type ctrl and press Enter.
    In the example below, note that controller 2 is no longer present.
  administrator@cli> ctrl
  ===================================================
  CId  Alias  OpStatus     Readiness Status
  ===================================================
  1           OK           Active
  2    N/A    Not Present  N/A
         
         
  3. Check the RAID controller CLI event logs to verify that there are no other problems.
    At the command line, type event -l nvram and press Enter.
    Then type event -l and press Enter.
     
  4. Find and correct the root cause of the link error.
    A link error can be caused by:
    • Faulty SAS cable – Replace a suspect cable with a known-good cable.
    • Debris blocking the SAS cable connector – Visually inspect and clean.
    • Bad IO module CN1 or CN2 connector – Check online after other possibilities are eliminated. At the command line, type sasdiag -a errorlog -l c2cport and press Enter. Look for incrementing errors.
       
  5. When you have corrected the root cause of the link errors on P16 through P23 on the respective IO modules, verify that all SAS cables are properly connected.
     
  6. Insert the RAID controller back into the enclosure and restore the SAS connections to the Host.
    When the RAID controller is replaced and all paths are restored, the RAID Head will Fail Back and return to Active/Active mode. This action can take up to one minute from the moment all the SAS connections are restored and the RAID controller is inserted.

    To verify that the RAID Head is in Active/Active mode, do one of the following actions at the command line:
    • Type ctrl -v and press Enter.
    • Type event -l nvram and press Enter.
    • Type event -l and press Enter.
     
  7. When the RAID Head is back to normal, repeat “Connecting to the RAID Head” to verify that the system is free of link errors.
    • If link errors are reported, repeat the procedure beginning with “Connecting to JBOD IO Modules” until you have eliminated all link errors on CN1 = P16 through P19 and CN2 = P20 through P23.
    • If no link errors are reported, proceed to “Step 4: Transporting a Disk Array.”

Step 4: Transporting a Disk Array

This step is the actual operation of transporting the physical drives of a disk array from one location to another.

  1. Connect to the Target RAID Head controller via Telnet or HyperTerminal.
     
  2. Place the disk array into Transport mode.
    At the command line, type array -a transport -d 0 and press Enter.
    For proper syntax, type ? array and press Enter or see the CLI User Manual.
     
  3. Move four physical drives at a time from the Source enclosure to the Target enclosure.
    Keep the physical drives in exactly the same order and sequence.
     
  4. Verify that all controllers have discovered each of the transported physical drives.
    At the command line, type phydrv -v -pX and press Enter,
    where X is the physical drive’s number.
     
    See the example below. The VisibleTo field should read All Controllers:
     
    Administrator@cli> phydrv -v -p1

    -----------------------------------------------------------------
    PdId: 1
    OperationalStatus: OK
    Alias:
    PhysicalCapacity: 153.39GB          ConfigurableCapacity: 152.74GB
    UsedCapacity: 83.33GB               BlockSize: 512Bytes
    ConfigStatus: Array0 SeqNo0         Location: Encl1 Slot1
    ModelNo: ATA Hitachi HDS72161       VisibleTo: All Controllers
    SerialNo: PVB300Z2R23RBD            FirmwareVersion: P22OA70A
    DriveInterface: SATA 3Gb/s          Protocol: ATA/ATAPI-7
    WriteCacheSupport: Yes              WriteCache: Enabled
    RLACacheSupport: Yes                RLACache: Enabled
    SMARTFeatureSetSupport: Yes
    SMARTSelfTestSetSupport: Yes
    SMARTErrorLoggingSupport: Yes
    CmdQueuingSupport: NCQ              CmdQueuing: Enabled
    CmdQueueDepth: 16                   MultiDMASupport: MDMA2
    UltraDMASupport: UDMA5              DMAMode: UDMA5
    Errors: 0                           NonRWErrors: 0
    ReadErrors: 0                       WriteErrors: 0
    DriveTemperature: N/A               ReferenceDriveTemperature: N/A
     
     
  5. After all physical drives in the disk array have been moved to their new locations, verify that the logical drive’s Operational Status is OK.
    At the command line, type logdrv -v and press Enter.
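
When many drives are transported, the per-drive check in step 4 can be applied to captured console text. The Python sketch below is illustrative only and assumes the phydrv -v output has been saved as text with the OperationalStatus and VisibleTo fields laid out as in the example above, with VisibleTo as the last field on its line. The function names are hypothetical, and the value "Controller 1" in the second sample is invented to show a failing case.

```python
import re


def drive_visibility(output):
    """Extract (OperationalStatus, VisibleTo) from captured phydrv -v text."""
    status = re.search(r"OperationalStatus:\s*(\S+)", output)
    visible = re.search(r"VisibleTo:\s*(.+?)\s*$", output, re.MULTILINE)
    return (
        status.group(1) if status else None,
        visible.group(1) if visible else None,
    )


def transported_ok(output):
    """True when the drive is OK and every controller can see it."""
    status, visible = drive_visibility(output)
    return status == "OK" and visible == "All Controllers"


good = """\
PdId: 1
OperationalStatus: OK
ModelNo: ATA Hitachi HDS72161 VisibleTo: All Controllers
"""
# Hypothetical failing capture; the VisibleTo value is invented.
bad = "OperationalStatus: OK\nVisibleTo: Controller 1\n"

print(transported_ok(good))  # True
print(transported_ok(bad))   # False
```

A drive that fails this check is a candidate for the Troubleshooting procedure that follows.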

 

Troubleshooting

If a physical drive is NOT visible to all controllers when you run the phydrv -v -pX command, or is not present when you run the array -v or logdrv -v commands, the logical drive’s Operational Status will be Offline.

Take the following actions:
 

  1. Remove the physical drive that is not visible to all controllers from the enclosure.
     
  2. Replace the AAMUX adapter.
     
  3. Reinsert the physical drive into its slot.
     
  4. Wait one minute for the RAID controllers to detect the physical drive.
     
  5. At the command line, type phydrv -v -pX and press Enter.
     
    • If the physical drive is “Visible to All Controllers,” verify that the logical drive’s Operational Status is OK. This completes the procedure.
    • If the physical drive is NOT reported as “Visible to All Controllers,” contact your Promise FAE for assistance.

 

 

©2008 Promise Technology, Inc. All Rights Reserved.

No part of this document may be reproduced or transmitted in any form without the express written permission of Promise Technology, Inc.