Summary of Contents for IBM Storwize V7000 Unified
Page 1
IBM Storwize V7000 Unified Problem Determination Guide GA32-1057-07...
Page 2
Before using this information and the product it supports, read the general information in “Notices” on page 309, the information in the “Safety and environmental notices” on page xi, as well as the information in the IBM Environmental Notices and User Guide, which is provided on a DVD.
. . 55
Emphasis . . xix
Removing and replacing file module components . . 58
Storwize V7000 Unified library and related publications . . xx
Resolving hard disk drive problems . . 61
Monitoring memory usage on a file module .
Page 4
Working with NFS clients that fail to mount NFS shares after a client IP change . . 275
Procedure: Fixing node errors . . 220
Procedure: Changing the service IP address of a node canister . 220
Storwize V7000 Unified: Problem Determination Guide Version...
Page 5
Working with file modules that report a stale NFS file handle . . 276
Appendix. Accessibility features for IBM Storwize V7000 Unified . . 307
File module-related issues . . 277
Restoring System x firmware (BIOS) settings
Notices .
Page 6
Error code port location mapping . . 36
installed . . 145
Fibre Channel cabling from the file module to the control enclosure . . 37
Storwize V7000 Unified logical devices and physical port locations . 165
LED states and associated actions. For the
Hostname and service IP reference .
Page 10
DANGER
A danger notice indicates the presence of a hazard that has the potential of causing death or serious personal injury. (D002)
2. Locate IBM Systems Safety Notices with the user publications that were provided with the Storwize® V7000 Unified hardware.
Page 12
Les sikkerhetsinformasjonen (Safety Information) før du installerer dette produktet. Antes de instalar este produto, leia as Informações sobre Segurança. Antes de instalar este producto, lea la información de seguridad. Läs säkerhetsinformationen innan du installerar den här produkten.
Safety statements Each caution and danger statement in this document is labeled with a number. This number is used to cross reference an English-language caution or danger statement with translated versions of the caution or danger statement in the Safety Information document.
Page 14
Statement 2 CAUTION: When replacing the lithium battery, use only IBM Part Number 33F8354 or an equivalent type battery recommended by the manufacturer. If your system has a module containing a lithium battery, replace it only with the same module type made by the same manufacturer.
Page 15
DANGER Some laser products contain an embedded Class 3A or Class 3B laser diode. Note the following. Laser radiation when open. Do not stare into the beam, do not view directly with optical instruments, and avoid direct exposure to the beam. Class 1 Laser Product Laser Klasse 1 Laser Klass 1...
Page 16
240 V under any distribution fault condition. Important: This product is not suitable for use with visual display workplace devices according to Clause 2 of the German Ordinance for Work with Visual Display Units.
Sound pressure Attention: Depending on local conditions, the sound pressure can exceed 85 dB(A) during service operations. In such cases, wear appropriate hearing protection.
Page 18
V7000 Unified. The chapters that follow introduce you to the hardware components and to the tools that assist you in troubleshooting and servicing the Storwize V7000 Unified, such as the management GUI and the service assistant. The troubleshooting procedures can help you analyze failures that occur in a Storwize V7000 Unified system.
Storwize V7000 Unified library Unless otherwise noted, the publications in the Storwize V7000 Unified library are available in Adobe portable document format (PDF) from the following website: www.ibm.com/storage/support/storwize/v7000/unified Each of the PDF publications in Table 1 is available in this information center by clicking the number in the “Order number”...
Page 21
... Machine Code. Contains the License Agreement for Machine Code for the Storwize V7000 Unified product. Order number: SC28-6872 (contains Z125-5468).
Other IBM publications
Table 2 on page xxii lists IBM publications that contain information related to the Storwize V7000 Unified.
Some publications are available for you to view or download at no charge. You can also order publications. The publications center displays prices in your local currency. You can access the IBM Publications Center through the following website: www.ibm.com/e-business/linkweb/publications/servlet/pbi.wss...
To submit any comments about this book or any other Storwize V7000 Unified documentation:
• Go to the feedback page on the website for the Storwize V7000 Unified Information Center at publib.boulder.ibm.com/infocenter/storwize/unified_ic/index.jsp?topic=/com.ibm.storwize.v7000.unified.doc/feedback_ifs.htm. There you can use the feedback page to enter and submit comments or browse to the topic and use the feedback link in the running footer of that page to identify the topic for which you have a comment.
Page 24
Chapter 1. Storwize V7000 Unified hardware components
A Storwize V7000 Unified system consists of one or more machine type 2076 rack-mounted enclosures and two machine type 2073 rack-mounted file modules. There are several model types for the 2076 machine type. The main differences among the model types are the following items:
• The number of drives that an enclosure can hold.
Page 26
Use this address if the control enclosure CLI is not working. These addresses are not set during the installation of a Storwize V7000 Unified system, but you can set these IP addresses later by using the chserviceip CLI command.
RAID arrays for the disk system. The Storwize V7000 Unified system uses a pair of file modules for redundancy. Follow the appropriate power down procedures to minimize impacts to the system operations.
IBM automatically opens a problem report, and if appropriate, contacts you to verify if replacement parts are required. If you set up Call Home to IBM, ensure that the contact details that you configure are correct and kept up to date as personnel change.
Know your IBM warranty and maintenance agreement details If you have a warranty or maintenance agreement with IBM, know the details that must be supplied when you call for support. Have the phone number of the support center available. When you call support, provide the machine type and the serial number of the enclosure or file module that has the problem.
Page 31
Support personnel also ask for your customer number, machine location, contact details, and the details of the problem.
Page 32
If users or applications are having trouble accessing data that is held on the Storwize V7000 Unified system, or if the management GUI is not accessible or is running slowly, the Storwize V7000 control enclosure might have a problem.
169; otherwise, see “Checking the GPFS file system mount on each file module” on page 171. If you have lost access to the files, but there is no sign that anything is wrong with the Storwize V7000 Unified system, see “Host to file modules connectivity” on page 25. Installation troubleshooting This topic provides information for troubleshooting problems encountered during the installation.
Page 35
– Product Family: Disk Systems
– Product: IBM Storwize V7000 Unified
– Release: All
– Platform: All
Before loading the USB flash drive, verify that it has a FAT32-formatted file system. Plug the USB flash drive into the laptop. Go to Start (My Computer) and right-click the USB drive.
SONAS_results.txt file and open it. Check for errors and corrective actions (refer to Storwize V7000 Unified Problem Determination Guide PDF on the CD). If no errors are listed, reboot both file modules, allow file modules to boot completely, reinsert the USB flash drive as originally instructed and try again.
3. Refer to Table 5 to match the code (A-H) to the recommended action. Follow the suggested action, in order, completing one before trying the next. 4. If the recommended action or actions fail, call the IBM Support Center. Table actions defined This table serves as a legend for defining the precise action to follow.
Page 38
Verify that the Ethernet cabling connections are seated properly between the Storwize V7000 Unified control enclosure and the customer network, as well as the file modules cabling to the customer network. Then reinsert the USB flash drive into the original file module.
Page 39
0AAF Unable to get node roles from VPD.
0AB0 Error opening /etc/sysconfig/rsyslog.
0AB1 Error writing to /etc/sysconfig/rsyslog.
0AB2 Error reading /etc/rsyslog.conf.
0AB3 Unable to open /opt/IBM/sonas/etc/rsyslog_template_mgmt.conf.
0AB4 Unable to open /opt/IBM/sonas/etc/rsyslog_template_int.conf.
0AB5 Unable to open /opt/IBM/sonas/etc/rsyslog_template_strg.conf.
0AB6 Unknown node roles.
Page 40
Trying to install management stack on non-management node.
0AF9 Invalid site ID. Currently only 'st001' is supported on physical systems.
0AFA This node is already a part of a cluster. Unable to configure.
Page 41
Table 6. Error messages and actions (continued)
Error code Error message Action key
0AFB Unable to generate public/private keys.
0AFC Unable to copy user SSH keys.
0AFD Unable to copy host SSH keys.
0AFE Unable to set the system's timezone.
0AFF Unable to write clock file.
Page 42
Storage controllers may be cabled incorrectly or UUIDs might not be set properly.
0B95 Invalid parameters.
0B96 Failed to configure the management processes on mgmt001st001. Action key: D then A then B
0B97 IP is invalid.
0B98 Netmask is invalid.
Page 43
Table 6. Error messages and actions (continued)
Error code Error message Action key
0B99 IP, gateway, and netmask are not a valid combination.
0B9A There was an internal error.
0B9B Invalid NAS private key file.
0B9C Unable to copy the NAS private key file.
0B9D Internal error setting permissions on NAS private key file.
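The messages for codes 0B97 through 0B99 describe an invalid IP address, an invalid netmask, and an inconsistent IP, gateway, and netmask combination. As a rough illustration of what such a consistency check can look like, here is a minimal Python sketch that uses the standard ipaddress module. The function name check_network_settings and the "Gateway is invalid" text are assumptions made for this example; the sketch does not reproduce the checks that the installation software actually performs.

```python
import ipaddress


def check_network_settings(ip, netmask, gateway):
    """Return a list of problems; an empty list means the values look consistent.

    Hypothetical helper for pre-checking values before installation.
    """
    problems = []
    try:
        ipaddress.IPv4Address(ip)
    except ValueError:
        problems.append("IP is invalid")
    try:
        # Building a network from 0.0.0.0 and the mask rejects malformed masks.
        ipaddress.IPv4Network(f"0.0.0.0/{netmask}")
    except ValueError:
        problems.append("Netmask is invalid")
    try:
        gateway_address = ipaddress.IPv4Address(gateway)
    except ValueError:
        # This message text is an assumption; it is not listed in Table 6.
        problems.append("Gateway is invalid")
    if problems:
        return problems
    # The gateway must be inside the subnet defined by the IP and netmask.
    subnet = ipaddress.IPv4Interface(f"{ip}/{netmask}").network
    if gateway_address not in subnet:
        problems.append("IP, gateway, and netmask are not a valid combination")
    return problems


if __name__ == "__main__":
    # Example values only; substitute the addresses planned for your system.
    print(check_network_settings("192.168.1.10", "255.255.255.0", "192.168.1.1"))
```

Running such a check on the planned values before writing them to the USB flash drive can catch a mistyped gateway or netmask early.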
Use this information when troubleshooting problems reported by the CLI commands during software configurations. The following table contains error messages that might be displayed when running the CLI commands during software configuration.
Table 7. CLI command problems
CLI Command: mkfs
Symptom/Message: SG0002C Command exception found : Disk <arrayname> might still belong to file system
Action: This message indicates that the arrays listed in the error message appear to already be part of a file system.
1. Does the GUI launch and are there problems logging into the system?
• Yes: Check that the user ID being used was set up to access the GUI. Refer to “Authentication basic concepts” in the IBM Storwize V7000 Unified Information Center.
– Yes: Run the CLI command lshealth. Reference the active management node hostname (mgmt001st001 or mgmt002st001) obtained from the lsnode command. Ensure that HOST_STATE, SERVICE, and NETWORK from lshealth are set to OK.
Sample output: mgmt001st001 HOST_STATE SERVICE All services are running OK CTDB CTDBSTATE_STATE_ACTIVE GPFS...
Page 48
About this task Within the Storwize V7000 Unified system, the system Health Status is based on a set of predefined software and hardware health status sensors that are reflected in the System Details page under the Status section for the corresponding logical host name.
a. Review the Sensor column and the Level column for Critical Error, Major Warning, or Minor Warning items. If the problem that caused the Level item is resolved, right-click the event and select the Mark Event as Resolved action.
b. Follow the online instructions to complete the change.
c.
108 and “Installing a PCI adapter in a PCI riser-card assembly” on page 109. Ethernet connectivity between file modules This topic covers troubleshooting Ethernet connectivity issues between the file modules. These connections are used for internal management operations between
They use the internal IP address range that you provided when you initialized the Storwize V7000 Unified system. About this task This procedure is used to troubleshoot Ethernet connectivity between the file modules. These network paths are used for all internal file system communication.
Page 52
Someone at your site might set up another machine to use one or more of the IP addresses that your Storwize V7000 Unified system is already using. Use the management GUI to check which four IP addresses the file modules are currently using to communicate with each other.
If you cannot stop other machines on your network from using these IP addresses and must change the internal IP address range, contact IBM Remote Technical Support for help with returning your file modules to an out-of-box state so that you can choose a different internal IP address range.
Page 54
USB flash drive to discover the state and settings of the Storwize V7000. Make sure that there is no satask.txt file on the USB flash drive before you plug it into the control enclosure.
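The check for a leftover satask.txt file can be done by eye, but a short script is handy when the drive is reused often. The following Python sketch lists any satask.txt file in the top-level directory of the drive. The function name find_satask_files and the mount path in the usage note are hypothetical; the comparison ignores case as a precaution, because FAT32 file names are not case-sensitive.

```python
from pathlib import Path


def find_satask_files(usb_root):
    """Return any satask.txt files found in the top-level directory of the drive.

    Hypothetical helper: usb_root is wherever the USB flash drive is mounted.
    """
    root = Path(usb_root)
    return [
        entry
        for entry in root.iterdir()
        if entry.is_file() and entry.name.lower() == "satask.txt"
    ]
```

For example, calling find_satask_files("/media/usb") (an assumed mount point) and getting an empty list back suggests that the drive carries no pending satask.txt file.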
Page 55
CLI command). Otherwise, you might have plugged the USB flash drive into the wrong control enclosure (such as one that is not part of this Storwize V7000 Unified system). The node_status should be active for each node canister in the cluster under sainfo lsservicestatus. Otherwise, follow the service action under sainfo lsservicerecommendation.
Page 56
Update the file module's record of the control enclosure system IP: To find the file module's current record of the control enclosure system IP address, use the Storwize V7000 Unified management CLI to issue the lsstoragesystem command. Here is an example: >ssh admin@<management_IP>...
Page 57
Verify that communication from the file module to the control enclosure is now possible by running the lssystemip command on the Storwize V7000 Unified management CLI:
>ssh admin@<management IP address>
[kd01ghf.ibm]$ lssystemip
Changing the cluster IP of the file modules:
If the cluster IP address of the file modules is not known, or has been incorrectly set, the value can be changed by logging into the system using a console.
Each file module has a dual port Fibre Channel adapter card located in PCI slot 2. Both ports are used to connect to the Storwize V7000 control enclosure with a connection going to each control canister.
Figure 3. Diagram shows how to connect the file modules to the control enclosure using Fibre Channel cables. (A) is file module 1 and (B) is file module 2. (C) is the control enclosure.
2. Run the command: locatenode #HOSTNAME on #SECONDS. #HOSTNAME is the hostname associated with the error, either mgmt001st001 or mgmt002st001. #SECONDS is the number of seconds for the LED indicator to be turned on. Physical connection and repair:
Each file module has a dual port Fibre Channel adapter card located in PCI slot 2. Both ports are used to connect to the Storwize V7000 system with a connection going to each Storwize V7000 node canister. Table 12. Fibre Channel cabling from the file module to the control enclosure. File Module Node # 1 File Module Storage Node # 2 PCI slot #2, port 1...
[Figure: front view of the file module. Labels: hard disk drive status LED (amber), rack release latches, bay 0, bay 11, hard disk drive bays, CD/DVD drive (optical drive), CD/DVD drive activity LED, CD/DVD eject button.]
Page 63
2. To view the light path diagnostics panel, slide the latch to the left on the front of the operator information panel and pull the panel forward. This reveals the light path diagnostics panel. Lit LEDs on this panel indicate the type of error that has occurred.
Page 64
12v channel error LEDs indicate an overcurrent condition. Refer to the procedure “Solving power problems” in the “Troubleshooting the System x3650” in the IBM Storwize V7000 Unified Information Center to identify the components that are associated with each power channel, and the order in which to troubleshoot the components.
Page 65
Light path diagnostics LEDs
LEDs on the light path diagnostics panel of a Storwize V7000 Unified file module indicate the cause of a problem. About this task Table 15 shows suggested actions to correct detected problems. Note: Check the system-event log for additional information before you replace a FRU.
LINK Reserved.
An error message has been written to the system-event log. Action: Check the system logs for information about the error. Replace any components that are identified in the error logs.
Page 67
The power supplies are using more power than their maximum rating. Action: ... on “Power problems” in the appropriate server guide in “Troubleshooting the System x3650” in the IBM Storwize V7000 Unified Information Center. (For the location of power channel error LEDs, see the section on “Internal...
Page 68
• One power supply
• Power cord
• Three cooling fans
• One PCI riser-card assembly in PCI riser connector 2
The following illustration shows the locations of the power-supply LEDs.
Page 69
Refer to “Removing and replacing parts” on page 85 to determine which components are customer replaceable units (CRU) and which components are field replaceable units (FRU).
• Go to the IBM support website at www.ibm.com/storage/support/storwize/v7000/unified to check for technical information, hints, tips, and new device drivers, or to submit a request for information.
Figure 4. LEDs on the power supply units of the control enclosure
Table 17. Power-supply unit LEDs
Columns: Power supply, ac failure, dc failure, failure, Status, Action
Status: Communication failure between the power supply unit and...
Action: Replace the power supply unit. If failure is still present, replace the enclosure chassis.
LEDs also flash. Table 18 on page 49 shows the three canister status LEDs on each of the node canisters. Figure 5 on page 49 shows the LEDs on the node canister.
Figure 5. LEDs on the node canisters Table 18. Power LEDs Power LED status Description There is no power to the canister. Try reseating the canister. Go to “Procedure: Reseating a node canister” on page 222. If the state persists, follow the hardware replacement procedures for the parts in the following order: node canister, enclosure chassis.
Page 74
Battery Good Battery Fault Description Action
Battery is good and fully charged. Action: None
Flashing: Battery is good but not fully charged. The battery is either charging or a maintenance discharge is being performed. Action: None
Table 20. Control enclosure battery LEDs (continued)
Battery Good Battery Fault Description Action
Nonrecoverable battery fault. Action: Replace the battery. If replacing the battery does not fix the issue, replace the power supply unit.
Flashing: Recoverable battery fault. Action: None
Flashing, Flashing: The battery cannot be used because the firmware for the... Action: None
GUI to resolve the problem. Always use the fix procedures for both system configuration problems and hardware failures. The fix procedures analyze the system to ensure that the required changes do not cause volumes to be
You can use fix procedures to diagnose and resolve problems with the Storwize V7000 Unified.
About this task
For example, to repair a Storwize V7000 Unified system, you might perform the following tasks:
• Analyze the event log
• Replace failed components...
Page 78
Many of the file module fix procedures are not automated. In these cases, you are directed to a documented procedure in the Storwize V7000 Unified Information Center. The example uses the management GUI to repair a Storwize V7000 Unified system. Perform the following steps to start the fix procedure: Procedure 1.
3. The node reboot restarts all services that were previously running. Removing a file module to perform a maintenance action You can remove an IBM Storwize V7000 Unified file module to perform maintenance. The procedure that you follow differs slightly, depending on whether you must unplug the power cables.
Page 80
Removing a file module and disconnecting power You can remove an IBM Storwize V7000 Unified file module and disconnect it from its power line cords before performing a maintenance action that requires the file module to have no power.
Page 81
To remove the mgmt001st001 file module from the system, for example, issue the following command:
# suspendnode mgmt001st001
3. Wait for the Storwize V7000 Unified system to stop the file module at the clustered trivial database (CTDB) level. The command does not unmount any mounted file systems.
FRUs must be installed by trained service technicians. About this task Installation guidelines To help you work safely with IBM Storwize V7000 Unified file modules, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and these guidelines.
Page 83
• Do not attempt to lift an object that you think is too heavy for you. If you have to lift a heavy object, observe the following precautions:
– Make sure that you can stand safely without slipping.
– Distribute the weight of the object equally between your feet.
–
Page 84
Take additional care when handling devices during cold weather. Heating reduces indoor humidity and increases static electricity. Returning a device or component When returning a device or component, follow all packaging instructions and use any supplied packaging materials for shipping.
Resolving hard disk drive problems
Use this information to address various hard disk drive issues.
About this task
• Before running a procedure, refer to “Removing a file module to perform a maintenance action” on page 55.
• Follow the suggested actions for a Symptom in the order in which they are listed in the Action column until the problem is solved.
Page 86
Turn on the server and observe the activity of the hard disk drive LEDs. Displaying node mirror and hard drive status The Storwize V7000 Unified system provides a method to check the node mirror status and hard drive status for each file module.
1. Ensure that you are logged into the file module as root.
2. To display mirror status and hard drive status, run the following Perl script:
# /opt/IBM/sonas/bin/cnrspromptnode.pl -a -c "/opt/IBM/sonas/bin/cnrsQueryNodeDrives.pl"
File modules in this Storwize V7000 Unified Cluster
Node Node Name Node Details
--------------------------------------------------------------------------------
1.
The volume is Active. The user data is not fully protected due to a configuration change or drive failure.
Rebuilding (RBLD) or Resyncing (RSY): A data resynchronization or rebuild might be in progress.
Table 21. Status of volume (continued)
Status of volume Description
Inactive, Okay (OKY): The volume is inactive and the drives are functioning correctly. The user data is protected if the current RAID level is RAID 1 (IM) or RAID 1E (IME).
Inactive, Degraded (DGD): The volume is inactive and the user data is not fully protected due
SMART ASCQ : none Figure 8. Example that shows that mirroring is re-synchronizing If a drive were not synchronized, the status might appear like the status shown in Figure 9 on page 67:
The mirror is not created/configured. If the mirror is not created, refer to “Troubleshooting the System x3650” in the IBM Storwize V7000 Unified Information Center for information on launching the LSI configuration tool.
ASC/ ASCQ error of 05/00. For isolation and the repair of hard disk problems, refer to “Troubleshooting the System x3650” in the IBM Storwize V7000 Unified Information Center. For a list of SMART (ASC/ASCQ) error codes and their descriptions, go to “SMART ASC/ASCQ error codes and messages”...
Page 93
Device is a Hard disk
Enclosure #
Slot #
Connector ID
Target ID
State : Online (ONL)
Size (in MB)/(in sectors) : 286102/585937500
Manufacturer : IBM-ESXS
Model Number : MBD2300RC
Firmware Revision : SB19
Serial No : D009P9A01SJC
Drive Type : SAS
Protocol...
NO REFERENCE POSITION FOUND MULTIPLE PERIPHERAL DEVICES SELECTED LOGICAL UNIT COMMUNICATION FAILURE LOGICAL UNIT COMMUNICATION TIME-OUT LOGICAL UNIT COMMUNICATION PARITY ERROR LOGICAL UNIT COMMUNICATION CRC ERROR (ULTRA-DMA/32) UNREACHABLE COPY TARGET TRACK FOLLOWING ERROR
Page 95
Table 23. SMART ASC/ASCQ error codes and messages (continued) ASCQ Description HEAD SELECT FAULT ERROR LOG OVERFLOW WARNING WARNING - SPECIFIED TEMPERATURE EXCEEDED WARNING - ENCLOSURE DEGRADED WARNING - BACKGROUND SELF-TEST FAILED WARNING - BACKGROUND PRE-SCAN DETECTED MEDIUM ERROR WARNING - BACKGROUND MEDIUM SCAN DETECTED MEDIUM ERROR WARNING - NON-VOLATILE CACHE NOW VOLATILE WARNING - DEGRADED POWER TO NON-VOLATILE CACHE...
Page 96
RECOVERED DATA WITH ERROR CORR. & RETRIES APPLIED RECOVERED DATA - DATA AUTO-REALLOCATED RECOVERED DATA - RECOMMEND REASSIGNMENT RECOVERED DATA - RECOMMEND REWRITE RECOVERED DATA WITH ECC - DATA REWRITTEN DEFECT LIST ERROR
Page 97
Table 23. SMART ASC/ASCQ error codes and messages (continued) ASCQ Description DEFECT LIST NOT AVAILABLE DEFECT LIST ERROR IN PRIMARY LIST DEFECT LIST ERROR IN GROWN LIST PARAMETER LIST LENGTH ERROR SYNCHRONOUS DATA TRANSFER ERROR DEFECT LIST NOT FOUND PRIMARY DEFECT LIST NOT FOUND GROWN DEFECT LIST NOT FOUND MISCOMPARE DURING VERIFY OPERATION MISCOMPARE VERIFY OF UNMAPPED LBA...
Page 98
COMMAND SEQUENCE ERROR ILLEGAL POWER CONDITION REQUEST PREVIOUS BUSY STATUS PREVIOUS TASK SET FULL STATUS PREVIOUS RESERVATION CONFLICT STATUS ORWRITE GENERATION DOES NOT MATCH COMMANDS CLEARED BY ANOTHER INITIATOR COMMANDS CLEARED BY POWER LOSS NOTIFICATION
Page 99
Table 23. SMART ASC/ASCQ error codes and messages (continued) ASCQ Description COMMANDS CLEARED BY DEVICE SERVER INCOMPATIBLE MEDIUM INSTALLED CANNOT READ MEDIUM - UNKNOWN FORMAT CANNOT READ MEDIUM - INCOMPATIBLE FORMAT CLEANING CARTRIDGE INSTALLED CANNOT WRITE MEDIUM - UNKNOWN FORMAT CANNOT WRITE MEDIUM - INCOMPATIBLE FORMAT CANNOT FORMAT MEDIUM - INCOMPATIBLE MEDIUM CLEANING FAILURE...
Page 100
SCSI PARITY ERROR DATA PHASE CRC ERROR DETECTED SCSI PARITY ERROR DETECTED DURING ST DATA PHASE INFORMATION UNIT IUCRC ERROR DETECTED ASYNCHRONOUS INFORMATION PROTECTION ERROR DETECTED PROTOCOL SERVICE CRC ERROR PHY TEST FUNCTION IN PROGRESS
Page 101
Table 23. SMART ASC/ASCQ error codes and messages (continued) ASCQ Description SOME COMMANDS CLEARED BY ISCSI PROTOCOL EVENT INITIATOR DETECTED ERROR MESSAGE RECEIVED INVALID MESSAGE ERROR COMMAND PHASE ERROR DATA PHASE ERROR INVALID TARGET PORT TRANSFER TAG RECEIVED TOO MUCH WRITE DATA ACK/NAK TIMEOUT NAK RECEIVED DATA OFFSET ERROR...
Page 102
DATA CHANNEL IMPENDING FAILURE DATA ERROR RATE TOO HIGH DATA CHANNEL IMPENDING FAILURE SEEK ERROR RATE TOO HIGH DATA CHANNEL IMPENDING FAILURE TOO MANY BLOCK REASSIGNS DATA CHANNEL IMPENDING FAILURE ACCESS TIMES TOO HIGH
Page 103
Table 23. SMART ASC/ASCQ error codes and messages (continued) ASCQ Description DATA CHANNEL IMPENDING FAILURE START UNIT TIMES TOO HIGH DATA CHANNEL IMPENDING FAILURE CHANNEL PARAMETRICS DATA CHANNEL IMPENDING FAILURE CONTROLLER DETECTED DATA CHANNEL IMPENDING FAILURE THROUGHPUT PERFORMANCE DATA CHANNEL IMPENDING FAILURE SEEK TIME PERFORMANCE DATA CHANNEL IMPENDING FAILURE SPIN-UP RETRY COUNT DATA CHANNEL IMPENDING FAILURE DRIVE CALIBRATION...
SA CREATION PARAMETER NOT SUPPORTED AUTHENTICATION FAILED LOGICAL UNIT ACCESS NOT AUTHORIZED SECURITY CONFLICT IN TRANSLATED DEVICE Monitoring memory usage on a file module Use this procedure to monitor memory usage on a file module.
Understanding error codes The Storwize V7000 Unified error codes convey specific information in an alphanumeric sequence. Tip: Search for error codes or event IDs by using EFS on the front. For 66012FC, for example, search on EFS66012FC.
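As a small illustration of the search tip, the following Python sketch builds the search term by putting EFS in front of a code. The function name efs_search_term is hypothetical and is not part of any IBM tool; it only reproduces the string convention that the tip describes. Converting the input to uppercase is an assumption made for convenience.

```python
def efs_search_term(code: str) -> str:
    """Return the search term for an error code or event ID.

    Hypothetical helper: it prepends "EFS" as the tip describes and
    leaves a code that already starts with EFS unchanged.
    """
    cleaned = code.strip().upper()  # uppercase is an assumption, not a documented rule
    if cleaned.startswith("EFS"):
        return cleaned
    return "EFS" + cleaned


if __name__ == "__main__":
    print(efs_search_term("66012FC"))
```

For the code used in the tip, 66012FC, the helper returns EFS66012FC.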
Optional Ethernet port 7 (Dual Port 10G card) Fibre channel adapter 1 (both ports) – Storage node only Fibre channel adapter 2 (both ports) – Storage node only Bonded device (data0 mgmt0) System x internal hard disk drives
Table 27. Originating file module specific software code – Code 1, 3, 5. Listing devices for variable C in the specific software code sequence of ABBCDDDD. C = Originating specific software code in sequence ABBCDDDD Code Device Red Hat Linux GPFS CIFS server CTDB...
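Because the tables refer to positions within the sequence ABBCDDDD, it can help to split a code into those positional fields before looking each one up. The Python sketch below does only that split. The function name split_software_code and the sample value in the usage note are made up for illustration, and the sketch attaches no meaning to the fields beyond their positions.

```python
def split_software_code(code: str) -> dict:
    """Split an 8-character code into the positional fields A, BB, C, and DDDD.

    Hypothetical helper: it assumes the code is exactly eight characters long,
    matching the ABBCDDDD layout named in the table captions.
    """
    if len(code) != 8:
        raise ValueError("expected 8 characters in the form ABBCDDDD")
    return {
        "A": code[0],
        "BB": code[1:3],
        "C": code[3],
        "DDDD": code[4:],
    }
```

With a made-up value such as "12345678", the C field is "4", which is the position that Table 27 maps to the originating software component.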
Unique error code Severity of the error Understanding event IDs The Storwize V7000 Unified messages follow a specific format, which is detailed here. About this task Tip: Search for error codes or event IDs by using EFS on the front. For 66012FC, for example, search on EFS66012FC.
I for Asynchronous Replication
J for SCM
L for HSM
AK for NDMP
• The element nnnn is a 4-digit message number.
• The element x indicates the severity of the error. The value x can be:
A for Action: GUI error messages. The user must perform a specific action.
C for Critical: A critical error occurred which must be corrected by the user or system administrator.
“Removing the fan bracket” on page 100
“Installing the fan bracket” on page 102
“Removing the IBM virtual media key” on page 103
“Installing the IBM virtual media key” on page 104
“Removing a PCI riser-card assembly” on page 105
“Installing a PCI riser-card assembly”...
Page 111
The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
Page 112
The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
Page 113
The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
Page 114
To remove the battery, complete the following procedure.
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
Page 115
9. Remove the battery:
   a. If there is a rubber cover on the battery holder, use your fingers to lift the battery cover from the battery connector.
   b. Use one finger to push the battery horizontally away from the PCI riser card in slot 2 and out of its housing.
Page 116
In the United States, IBM has established a return process for reuse, recycling, or proper disposal of used IBM sealed lead acid, nickel cadmium, nickel metal hydride, and other battery packs from IBM Equipment. For information on proper disposal of these batteries, contact IBM at 1-800-426-4333.
Page 118
To install the replacement battery, complete the following steps:
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
Page 120
Installing the microprocessor 2 air baffle
The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at...
Page 121
To install the microprocessor air baffle, complete the following steps.
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
Page 124
Removing the fan bracket
The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at...
Page 125
To remove the fan bracket, complete the following steps.
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
Page 131
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
2. Reinstall any adapters you removed in other procedures.
Page 134
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
2. Install the adapter in the expansion slot.
Page 135
If you replace a Fibre Channel adapter within a storage node, the WWPNs change. For 2851-DR1/DE1 attached storage, the WWPN updates are automatic. If the attached storage unit is a gateway configuration (consisting of IBM XIV Storage System, V7000, or SAN Volume Controller), the WWPN update is not automatic.
Page 136
These installation instructions show the slot location for the 10-Gbps Ethernet PCI adapter.
About this task
The 10-Gbps Ethernet adapter must go in PCI slot 4. The following illustration shows the locations of the adapter expansion slots from the rear of the file module.
Figure 14. Location of the Ethernet adapter filler panel on the chassis (callouts: Ethernet adapter filler panel, standoff, rubber stopper)
6. Install the two standoffs on the system board.
7. Insert the bottom tabs of the metal clip into the port openings from outside the chassis.
Attention: Make sure the port connectors on the adapter are aligned properly with the chassis on the rear of the server. An incorrectly seated adapter might cause damage to the system board or the adapter.
Figure 18. Port connector alignment
To remove the SAS riser-card and controller assembly from the server, complete the following procedure.
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
SAS controller. See Figure 22.
Figure 22. Controller retention brackets on 16-drive-capable server model
1) Remove the SAS controller front retention bracket from the server. See Figure 23 on page 121.
Figure 23. SAS controller front retention brackets (callout: SAS expander card front retention bracket)
2) Remove the rear controller retention bracket located in the battery bay above the power supplies by pulling up the release tab (1) and sliding the bracket outward (2). See Figure 24.
Figure 24.
3. To install the SAS riser-card and controller assembly for a tape-enabled server model, complete the following steps. Figure 27 on page 123 shows the SAS riser card in the tape-enabled server model.
Page 149
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
2. Touch the static-protective package that contains the new ServeRAID SAS controller to any unpainted metal surface on the file module.
3. Press down on the left and right side latches and pull the server out of the rack enclosure until both slide rails lock.
4. Remove the cover, as described in “Removing the cover” on page 87.
5. Locate the remote battery tray in the server and remove the battery that you want to replace.
Page 160
To remove the CD-RW/DVD drive, complete the following procedure.
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
Page 164
To install a DIMM, complete the following procedure.
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
Make sure that the devices that you are installing are supported. For a list of supported devices for the server, see “Parts listing for file modules” in the IBM Storwize V7000 Unified Information Center.
- Before you install an additional power supply or replace a power supply with one of a different wattage, you can use the IBM Power Configurator utility to determine current system power consumption.
Page 170
To install an ac power supply, complete the following steps:
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
To remove a microprocessor and heat sink, complete the following steps:
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
About this task
Read the documentation that comes with the microprocessor to determine whether you must update the IBM System x Server Firmware. To download the most current level of server firmware, complete the following steps:
1. Go to http://www.ibm.com/systems/support/.
Page 175
Note: For simplicity, certain components are not shown in this illustration.
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
Page 176
8. Twist the handle of the installation tool clockwise to secure the microprocessor in the tool.
Note: You can pick up or release the microprocessor by twisting the microprocessor installation tool handle.
Page 177
(Figure callouts: handle, installation tool, microprocessor)
9. Carefully align the microprocessor installation tool over the microprocessor socket. Twist the handle of the microprocessor tool counterclockwise to insert the microprocessor into the socket.
Attention: The microprocessor fits only one way on the socket. You must place a microprocessor straight down on the socket to avoid damaging the pins on the socket.
21. Install the cover, as described in “Installing the cover” on page 88.
22. Slide the server into the rack.
23. Reconnect the external cables; then, reconnect the power cords and turn on the peripheral devices and the file module.
Page 179
Note: You must wait approximately 2.5 minutes after you connect the power cord of the file module to an electrical outlet before the power-control button becomes active.
Removing and replacing the thermal grease
The following procedure is for a field replaceable unit (FRU). FRUs must be installed only by trained service technicians.
Page 180
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
2. Turn off the file module and peripheral devices, then label and disconnect both power cords and all external cables.
Page 181
8. If you are instructed to return the heat-sink retention module, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you.
Installing a heat-sink retention module
The following procedure is for a field replaceable unit (FRU). FRUs must be installed only by trained service technicians.
Page 182
To remove the system board, complete the following steps.
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
Page 183
9. If an Ethernet daughter card is installed in the server, remove it.
10. If a virtual media key is installed in the server, remove it, as described in “Removing the IBM virtual media key” on page 103.
11. Remove the DIMM air baffle, as described in “Removing the DIMM air baffle”...
Page 184
Note: You must wait approximately 2.5 minutes after you connect the power cord of the file module to an electrical outlet before the power-control button becomes active.
Page 185
[root@PFESONAS1.mgmt001st001 ~]# asu show BootOrder.BootOrder
IBM Advanced Settings Utility version 3.62.71B
Licensed Materials - Property of IBM
(C) Copyright IBM Corp. 2007-2010 All Rights Reserved
Successfully discovered the IMM via SLP.
Discovered IMM at IP address 169.254.95.118
Connected to IMM at IP address 169.254.95.118
BootOrder.BootOrder=Legacy Only=CD/DVD Rom=Floppy Disk=PXE Network=Hard Disk 0...
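When comparing the boot order of two file modules, it can help to turn the result line of the output above into an ordered list. The following is a minimal sketch that assumes the values on a multi-value result line are separated by equals signs, as the sample suggests. The function name `parse_asu_setting` is hypothetical, and the sample string contains only the part of the output that is visible in this excerpt.

```python
def parse_asu_setting(line):
    """Split an 'asu show' result line into (setting name, ordered list of values)."""
    # Everything before the first "=" is the setting name.
    name, _, raw_values = line.strip().partition("=")
    # Assumption: remaining "=" characters separate the individual values.
    values = [value for value in raw_values.split("=") if value]
    return name, values


if __name__ == "__main__":
    sample = "BootOrder.BootOrder=Legacy Only=CD/DVD Rom=Floppy Disk=PXE Network=Hard Disk 0"
    setting, boot_order = parse_asu_setting(sample)
    print(setting)
    for position, device in enumerate(boot_order, start=1):
        print(position, device)
```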
To remove the 240 VA safety cover, perform the following steps:
Procedure
1. To help you work safely with the Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
About this task
The ASU package is part of the Storwize V7000 Unified code. ASU is available to authorized service personnel from the command-line interface (CLI) on the file module. Use ASU to modify selected settings in the integrated management module (IMM)-based Storwize V7000 Unified file modules.
2. Issue the following command to view the current settings for the machine type and model:
   asu show SYSTEM_PROD_DATA.SysInfoProdName
3. Issue the ASU command on the Storwize V7000 Unified file module to set the machine type and model:
   asu set SYSTEM_PROD_DATA.SysInfoProdName 2073-700
4.
About this task Logical devices and physical port locations Use this table to help identify logical devices, file module roles used, and physical locations. Table 33. Storwize V7000 Unified logical devices and physical port locations Logical Ethernet device name Device description...
About this task
If both file modules are operating correctly with regard to management services, perform the following procedure to fail over the active management node to the passive management node.
Page 191
If you see the following error message when running the command, wait until the initialization has completed before running setcluster again:
IBM SONAS management service is starting up
EFSSG0654I The Management Service is starting up.
After you run the startmgtsrv command, the system displays information that is similar to the following example:
[yourlogon@yourmachine.mgmt002st001 ~]# startmgtsrv...
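The advice to wait until initialization has completed can be sketched as a simple retry loop. This is an illustration under stated assumptions, not part of the product: `run_command` is a hypothetical callable that runs the CLI command and returns its text output, and the retry count and delay are arbitrary defaults. The only value taken from the text is the message ID EFSSG0654I.

```python
import time

# Message ID quoted in the text for "The Management Service is starting up."
STARTING_UP_CODE = "EFSSG0654I"


def run_when_service_ready(run_command, attempts=10, delay_seconds=30.0, sleep=time.sleep):
    """Call run_command() until its output no longer reports that the service is starting up."""
    output = ""
    for attempt in range(attempts):
        output = run_command()
        if STARTING_UP_CODE not in output:
            return output
        # Wait before retrying, but not after the final attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(
        f"management service still starting after {attempts} attempts: {output!r}"
    )
```

Passing `sleep` as a parameter keeps the loop testable without real delays. In practice the callable would wrap whatever mechanism is used to issue the command on the management node.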
Page 192
7. Run the CLI command startmgtsrv. This starts the management services on the passive node.
8. Once command execution is complete:
   a. Verify that the management service is running by again executing the CLI command lsnode.
Use this information for checking system health with the clustered trivial database (CTDB). About this task CTDB checks the health status of the Storwize V7000 Unified file modules, scanning elements such as storage access, General Parallel File System (GPFS), networking, Common Internet File System (CIFS), and Network File System (NFS).
“Checking the GPFS file system mount on each file module” on page 171.
- Refer to the information in “Troubleshooting the System x3650” in the IBM Storwize V7000 Unified Information Center to determine if any additional hardware problems might be causing the “unhealthy”...
System (GPFS) file system mounts on IBM Storwize V7000 Unified file modules.
About this task
A GPFS file system that is not mounted on a Storwize V7000 Unified file module can cause the clustered trivial database (CTDB) status to be 'UNHEALTHY'. The...
2. To identify the currently created file systems on each Storwize V7000 Unified file module, log in as the root user on the active management node, then enter the onnode -n mgmt001st001 df | grep ibm command from the CLI, as shown...
If file systems remain unmounted, contact IBM support.
Resolving stale NFS file systems
You can resolve problems with stale NFS file systems on Storwize V7000 Unified file modules. A file module might have the file system mounted, but the file system remains inaccessible due to a stale NFS file handle.
Command_Output_Data Home_Directory Template_Shell
FETCH USER INFO SUCCEED 12004360 12000513 /var/opt/IBM/sofs/scproot /usr/bin/rssh
EFSSG1000I The command completed successfully.
When the system is unable to authenticate against an external authentication server, you must ensure that it can obtain user information from the authentication server.
This can cause some clients to have access while others do not.
Procedure
1. To obtain the IP addresses of your Storwize V7000 Unified cluster, issue the nslookup command; this non-disruptive command requires “root” access and your domain name.
You are running this procedure on a file module.
- You are logged in to the file module, which is the active management node, as root. See “Accessing a file module as root” on page 273.
Page 201
4. Issue the chkfs file_system_name -v | tee /ftdc/chkfs_fs_name.log1 command to capture the output to a file. Review the output file for errors and save it for IBM support to investigate any problems. If the file contains a TSM ERROR message, perform the following steps: a.
Resolving issues reported by lshealth
Use this information to resolve lshealth reported issues, specifically for “MGMTNODE_REPL_STATE ERROR DATABASE_REPLICATION_FAILED” and “The mount state of the file system /ibm/Filesystem_Name changed to error level” errors.
About this task
These errors might be transient and can clear automatically at any time.
4. Issue the command lshealth -i gpfs_fs -r. The command output should display: The mount state of the file system /ibm/gpfs1 was set back to normal level.
5. If the error persists, refer to the GPFS documentation to debug or repair the error.
2. If there is no space in fragments or if the mmdefragfs command does not free up space, add disks (NSDs) to the file system to create space.
   a. Add disks to the file system.
Kerberos tickets, for example, can expire and then no one can access the cluster. For the Storwize V7000 Unified file module, the ntpq -p command shows which server is used for synchronization, lists any peers, and reports a set of data about their status.
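To make the ntpq -p check above easier to script, the peer list can be scanned for the line that ntpq marks with a leading asterisk, which is the tally code for the peer currently selected for synchronization. The sketch below assumes the standard ten-column ntpq -p layout (remote, refid, st, t, when, poll, reach, delay, offset, jitter). The sample output is illustrative only and was not captured from a file module, and the function name `find_sync_peer` is hypothetical.

```python
# Illustrative ntpq -p output using documentation-only host names and addresses.
SAMPLE_NTPQ_OUTPUT = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ntp1.example.com 192.0.2.10      2 u   45   64  377    0.512    0.034   0.021
+ntp2.example.com 192.0.2.11      2 u   12   64  377    0.698   -0.120   0.045
"""


def find_sync_peer(ntpq_output):
    """Return details of the peer marked '*' (the selected sync source), or None."""
    for line in ntpq_output.splitlines():
        if not line.startswith("*"):
            continue
        fields = line[1:].split()
        if len(fields) >= 10:
            return {
                "remote": fields[0],
                "refid": fields[1],
                "stratum": int(fields[2]),
                "offset_ms": float(fields[8]),  # ntpq reports offset in milliseconds
            }
    return None


if __name__ == "__main__":
    print(find_sync_peer(SAMPLE_NTPQ_OUTPUT))
```

If the function returns None, no peer is currently selected, which is consistent with a node whose time is not synchronized.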
Page 206
[root@domain.node ~]# service ntpd start
Starting ntpd: [ OK ]
[root@domain.node ~]#
After the time on all of the servers is synchronized, you can verify that the logs apply to your troubleshooting situation.
You cannot manage a system by using the 10 Gbps Ethernet ports. You can perform almost all of the configuration, troubleshooting, recovery, and maintenance of the storage system from within the Storwize V7000 Unified management GUI or the CLI commands that are running on the Storwize V7000 file modules.
Page 208
- When you cannot access the system from the management GUI and you cannot access the storage Storwize V7000 Unified to run the recommended actions
- When the recommended action directs you to use the service assistant.
The storage system management GUI operates only when there is an online system.
Accessing the storage system CLI
Follow the steps that are described in the “Command-line interface” topic in the “Reference” section of the Storwize V7000 Unified Information Center to initialize and use a CLI session.
Chapter 5. Control enclosure...
Accessing the service CLI
Follow the steps that are described in the “Command-line interface” topic in the “Reference” section of the Storwize V7000 Unified Information Center to initialize and use a CLI session.
USB flash drive and Initialization tool interface
Use a USB flash drive to initialize a system and also to help service the node canisters in a control enclosure.
Page 211
USB flash drive, you can download the application from the support website (search for initialization tool): www.ibm.com/storage/support/storwize/v7000/unified If you download the initialization tool, you must copy the file onto the USB flash drive that you are going to use.
Page 212
Use the chsystemip CLI command to change the managed gateway IP address setting on the control enclosure. (This must be done before you change the management gateway IP address setting on the file modules):
[kd52v6h.ibm]$ chsystemip -gw 9.71.16.1 -port 1
Page 213
You should be able to access the management GUI or CLI from a computer that is on a different subnet or a different Ethernet switch from the Storwize V7000 Unified system. The link to the management GUI from the InitTool.exe panel should now work.
Page 214
Use this command when you are unable to log on to the system because you have forgotten the superuser password and you want to reset it. Attention: Run this command only when instructed by IBM support. Running this command directly on a Storwize V7000 can affect your I/O operations on the file modules.
Page 215
Use this command to collect diagnostic information from the node canister and to write the output to a USB flash drive. Attention: Run this command only when instructed by IBM support. Running this command directly on a Storwize V7000 can affect your I/O operations on the file modules.
Page 216
Note: The reference to cluster is not the same as the file system cluster on the Storwize V7000 file modules. Attention: Run this command only when instructed by IBM support. Running this command directly on a Storwize V7000 can affect your I/O operations on the file modules.
If any service activity is required, a notification is sent.
Event reporting process
The following methods are used to notify you and the IBM Support Center of a new event:
- If you enabled Simple Network Management Protocol (SNMP), an SNMP trap is sent to an SNMP manager that is configured by the customer.
Resolve the root event first.
Sense data
Additional data that gives the details of the condition that caused the event to be logged.
Critical notifications can be configured to be sent as a Call Home email to the IBM Support Center.
Warning
A warning notification is sent to indicate a problem or unexpected condition with the Storwize V7000 Unified.
You can view information about collecting CIM log files or you can view examples of a configuration dump, error log, or featurization log. To do this, click Reference in the left pane of the Storwize V7000 Unified Information Center and then expand the Logs and traces section.
Page 221
There are two power supply units in the control enclosure. Each one contains an integrated battery. Both power supply units and batteries provide power to both control canisters. Each battery has a sufficient charge to power both node canisters for the duration of saving critical data to the local drive. In a fully redundant system with two batteries and two canisters, there is enough charge in the batteries to support saving critical data from both canisters to a local drive twice.
Important: Although Storwize V7000 Unified is resilient to power failures and brownouts, always install Storwize V7000 Unified in an environment where there is reliable and consistent ac power that meets the Storwize V7000 Unified requirements.
Understanding the medium errors and bad blocks
A storage system returns a medium error response to a host when it is unable to successfully read a block. The Storwize V7000 Unified response to a host read follows this behavior. The volume virtualization that is provided extends the time when a medium error is returned to a host.
The Start here: Use the management GUI recommended actions topic gives the starting point for any service action. The situations covered in this section are the...
The management GUI provides extensive facilities to help you troubleshoot and correct problems on your system. You can connect to and manage a Storwize V7000 Unified system using the management GUI as soon as you have created a clustered system. If you cannot create a clustered system, see the problem that contains information about what to do if you cannot create one.
Update the file module's record of the control enclosure system IP: To find the file module's current record of the control enclosure system IP address, use the Storwize V7000 Unified management CLI to issue the lsstoragesystem command. Here is an example: >ssh admin@<management_IP>...
>[kd01ghf.ibm]$ chstoragesystem --ip1 9.71.18.136 --ip2 9.71.18.136 EFSSG1000I The command completed successfully. Verify that communication from the file module to the control enclosure is now possible by running the lssystem command on the Storwize V7000 Unified management CLI: >ssh admin@<management IP address>...
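The command sequence in this procedure (lsstoragesystem, chstoragesystem, lssystem) can be sketched as a small helper that composes the ssh invocations. This is a minimal sketch under assumptions: the helper names and the management IP 192.0.2.10 are hypothetical, and passing the CLI command as an ssh argument (instead of typing it at the CLI prompt after logging in) is an assumption that is not confirmed by this guide.

```python
import shlex


def build_ssh_command(management_ip, cli_command, user="admin"):
    """Return the argv list for running one management CLI command over ssh.

    Assumption: the management CLI accepts a command passed as an ssh
    argument. The guide itself only shows an interactive session.
    """
    return ["ssh", f"{user}@{management_ip}"] + shlex.split(cli_command)


def chstoragesystem_command(ip1, ip2):
    """Compose the chstoragesystem call with the --ip1/--ip2 options from the example."""
    return f"chstoragesystem --ip1 {ip1} --ip2 {ip2}"


# Hypothetical management IP; replace it with your own.
mgmt_ip = "192.0.2.10"
steps = [
    "lsstoragesystem",                                      # show the current record
    chstoragesystem_command("9.71.18.136", "9.71.18.136"),  # update the record
    "lssystem",                                             # verify communication
]
for step in steps:
    print(" ".join(build_ssh_command(mgmt_ip, step)))
```

The sketch only prints the commands; running them is left to the operator so that each result can be checked before the next step.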
Updating file module's record of the control enclosure system IP: To find the file module's current record of the control enclosure system IP address, use the Storwize V7000 Unified management CLI to issue the lsstoragesystem command. Here is an example: >ssh admin@<management_IP>...
of both node canisters is candidate, then there is not a clustered system to connect to. If the node state is service, go to “Procedure: Fixing node errors” on page 220. v Ensure that you are using the correct system IP address. If you know the service address of a node canister, go to “Procedure: Getting node canister and system information using the service assistant”...
1. Point your browser at the /service directory of the management IP address of the system. If your management IP address is 11.22.33.44, point your web browser to 11.22.33.44/service. 2. Log in to the service assistant.
3. The service assistant home page lists the node canister that can communicate with the node. 4. If the service address of the node canister that you are looking for is listed in the Change Node window, make the node the current node. Its service address is listed under the Access tab of the node details.
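The address rule in step 1 (management IP address plus the /service directory) can be illustrated with a short helper. This is a sketch only: the function name is hypothetical, and the https scheme is an assumption, because the guide gives the address without a scheme.

```python
import ipaddress


def service_assistant_url(management_ip, scheme="https"):
    """Build the service assistant address for a management IP address.

    The scheme default is an assumption; the guide shows only
    "11.22.33.44/service". Malformed addresses raise ValueError.
    """
    ip = ipaddress.ip_address(management_ip)
    # IPv6 literals must be bracketed inside a URL.
    host = f"[{ip}]" if ip.version == 6 else str(ip)
    return f"{scheme}://{host}/service"


print(service_assistant_url("11.22.33.44"))
```

Validating the address first catches typing mistakes before the browser reports a less specific connection error.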
Problem: SAS cabling not valid This topic provides information to be aware of if you receive errors that indicate the SAS cabling is not valid. Check the following items:
v No more than five expansion enclosures can be chained to port 1 (below the control enclosure). The connecting sequence from port 1 of the node canister is called chain 1. v No more than four expansion enclosures can be chained to port 2 (above the control enclosure).
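The two chain limits listed above (five expansion enclosures on port 1, four on port 2) can be checked with a few lines of code. This is a minimal sketch; the function and variable names are hypothetical, and it covers only the two limits stated here, not the other cabling rules.

```python
# Limits taken from the items above: chain 1 hangs off port 1, chain 2 off port 2.
MAX_ENCLOSURES_PER_PORT = {1: 5, 2: 4}


def check_sas_chains(chain_lengths):
    """Return a list of problems for a {port: expansion enclosure count} mapping."""
    problems = []
    for port, count in sorted(chain_lengths.items()):
        limit = MAX_ENCLOSURES_PER_PORT.get(port)
        if limit is None:
            problems.append(f"port {port}: not a known SAS port")
        elif count > limit:
            problems.append(
                f"port {port}: {count} expansion enclosures exceeds the limit of {limit}"
            )
    return problems


print(check_sas_chains({1: 6, 2: 4}))
```

An empty list means that both chains are within the stated limits.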
About this task If you are having problems attaching to the FCoE hosts, your problem might be related to the network, the Storwize V7000 Unified system, or the host. Procedure 1. If you are seeing error code 705 on the node, this means that the Fibre Channel I/O port is inactive.
Verify that the Storwize V7000 Unified system and the host get an FCID on the FCF. If not, check the VLAN configuration. b. Verify that the Storwize V7000 Unified port and the host port are part of a zone and that the zone is currently in force.
The Node tab shows general information about the node canister that includes the node state and whether it is a configuration node.
v The Hardware tab shows information about the hardware. v The Access tab shows the management IP addresses and the service addresses for this node. v The Location tab identifies the enclosure in which the node canister is located. v The Ports tab shows information about the I/O ports. Procedure: Getting node canister and system information using a USB flash drive This procedure explains how to view information about the node canister and...
LEDs on the power supply unit for the 2076-112 or 2076-124. The LEDs on the power supply units for the 2076-312 and 2076-324 are similar, but they are not shown here. Figure 47. LEDs on the power supply units of the control enclosure
Page 239
Table 38. Power-supply unit LEDs (columns: Power supply, ac failure, dc failure, failure, Status, Action)
Status: Communication failure between the power supply unit and the enclosure chassis. Action: Replace the power supply unit. If failure is still present, replace the enclosure chassis.
Status: No ac power to... Action: Turn on power.
If the power LEDs show green, reseat the node canister. See “Procedure: Reseating a node canister” on page 222. If the LED status does not change, see “Replacing a node canister” on page 224.
Page 241
Table 40. System status and fault LEDs (continued) (columns: System status, Fault LED, Status, Action)
Status: Code is not active. The BIOS or the service processor has detected a hardware fault. Action: Follow the hardware replacement procedures for the node canister.
Status: Code is active. Action: No action.
1. Verify that each end of the cable is securely connected. 2. Verify that the port on the Ethernet switch or hub is configured correctly. 3. Connect the cable to a different port on your Ethernet network.
4. If the status is obtained using the USB flash drive, review all the node errors that are reported. 5. Replace the Ethernet cable. Procedure: Removing system data from a node canister This procedure guides you through the process to remove system information from a node canister.
You can set an IPv4 address, an IPv6 address, or both, as the service address of a node. Enter the required address correctly. If you set the address to 0.0.0.0 or 0000:0000:0000:0000:0000:0000:0000:0000, you disable the access to the port on that protocol.
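Because an all-zero service address disables access on that protocol, it can help to check a candidate address before you apply it. The following sketch uses the Python standard library `ipaddress` module; the function name is hypothetical, and the check runs on your workstation only and does not contact the system.

```python
import ipaddress


def disables_service_access(address):
    """Return True if the address is the all-zero (unspecified) address.

    Works for IPv4 (0.0.0.0) and IPv6 (all eight groups zero, or "::").
    A malformed address raises ValueError instead of being accepted.
    """
    return ipaddress.ip_address(address).is_unspecified


for candidate in ("0.0.0.0", "::", "192.0.2.5"):
    print(candidate, disables_service_access(candidate))
```

A full-length IPv6 address has eight 16-bit groups, so a string with fewer groups and no "::" shorthand is rejected with ValueError.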
Procedure Change the service IP address. v Use the control enclosure management GUI when the system is operating and the system is able to connect to the node with the service IP address that you want to change. 1. Select Settings > Network from the navigation. 2.
Results Procedure: Powering off your system Use this procedure to power off your Storwize V7000 Unified system when it must be serviced or to permit other maintenance actions in your data center. To turn off the Storwize V7000 Unified system, see “Turning off the system” in the Storwize V7000 Unified information center.
About this task Procedure: Collecting information for support IBM support might ask you to collect trace files and dump files from your system to help them resolve a problem. Typically, you perform this task from the Storwize V7000 Unified management GUI. You can also collect information from the Storwize V7000 control enclosure itself.
Before you remove and replace parts, you must be aware of all safety issues. Before you begin First, read the safety precautions in the IBM Systems Safety Notices. These guidelines help you safely work with the Storwize V7000 Unified. Replacing a node canister This topic describes how to replace a node canister.
v If the system status is off, it is acceptable to remove a node canister. However, do not remove a node canister unless directed to do so by a service procedure. v If the power LED is flashing or off, it is safe to remove a node canister. However, do not remove a node canister unless directed to do so by a service procedure.
When you replace hardware components that are located in the back of the system, be careful not to inadvertently disturb or remove any cables that you are not instructed to remove.
Be aware of the following canister LED states: v If the power LED is on, do not remove an expansion canister unless directed to do so by a service procedure. v If the power LED is flashing or off, it is safe to remove an expansion canister. However, do not remove an expansion canister unless directed to do so by a service procedure.
When you replace hardware components that are located in the back of the system, be careful not to inadvertently disturb or remove any cables that you are not instructed to remove.
Page 253
CAUTION: Some laser products contain an embedded Class 3A or Class 3B laser diode. Note the following information: laser radiation when open. Do not stare into the beam, do not view directly with optical instruments, and avoid direct exposure to the beam. (C030) About this task Perform the following steps to remove and then replace an SFP transceiver: Procedure...
Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: v Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
Page 255
Attention: If your system is powered on and performing I/O operations, go to the management GUI and follow the fix procedures. Performing the replacement actions without the assistance of the fix procedures can result in loss of data or access to data. Attention: A powered-on enclosure must not have a power supply removed for more than five minutes because the cooling does not function correctly with an empty slot.
6. Insert the replacement power supply unit into the enclosure with the handle pointing towards the center of the enclosure. Insert the unit in the same orientation as the one that you removed.
7. Push the power supply unit back into the enclosure until the handle starts to move. 8. Finish inserting the power supply unit into the enclosure by closing the handle until the locking catch clicks into place. 9. Reattach the power cable and cable retention bracket. 10.
Page 258
Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: v Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
Page 259
Attention: A powered-on enclosure must not have a power supply removed for more than five minutes because the cooling does not function correctly with an empty slot. Ensure that you have read and understood all these instructions and have the replacement available, and unpacked, before you remove the existing power supply.
6. Insert the replacement power supply unit into the enclosure with the handle pointing towards the center of the enclosure. Insert the unit in the same orientation as the one that you removed.
7. Push the power supply unit back into the enclosure until the handle starts to move. 8. Finish inserting the power supply unit in the enclosure by closing the handle until the locking catch clicks into place. 9. Reattach the power cable and cable retention bracket. 10.
Page 262
Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: v Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
Page 263
The battery is a lithium ion battery. To avoid possible explosion, do not burn. Exchange only with the IBM-approved part. Recycle or discard the battery as instructed by local regulations. In the United States, IBM has a process for the collection of this battery. For information, call 1-800-426-4333. Have the IBM part number for the battery unit available when you call.
Remove the battery from the packaging. b. Remove the end caps. c. Attach the end caps to both ends of the battery that you removed and place the battery in the original packaging.
d. Place the replacement battery in the opening on top of the power supply in its proper orientation. e. Press the battery to seat the connector. f. Place the handle in its downward location. 5. Push the power supply unit back into the enclosure until the handle starts to move.
224 refers. 2. Unlock the assembly by squeezing together the tabs on the side. Figure 59. Unlocking the 3.5" drive 3. Open the handle to the full extension. Figure 60. Removing the 3.5" drive
4. Pull out the drive. 5. Push the new drive back into the slot until the handle starts to move. 6. Finish inserting the drive by closing the handle until the locking catch clicks into place. Replacing a 2.5" drive assembly or blank carrier This topic describes how to remove a 2.5"...
4. Fit the slot that is on the top of the end cap over the tab on the top of the chassis flange. 5. Rotate the end cap down until it snaps into place. Make sure that the inside surface of the end cap is flush with the chassis.
Attention: The left end cap is printed with information that helps identify the enclosure. v machine type and model v enclosure serial number v its machine part number The information on the end cap should always match the information printed on the rear of the enclosure, and it should also match the information that is stored on the enclosure midplane.
The procedures for replacing a control enclosure chassis are different from those procedures for replacing an expansion enclosure chassis. For information about replacing an expansion enclosure chassis, see “Replacing an expansion enclosure chassis” on page 251.
Page 271
Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: v Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
Page 272
Attention: Perform this procedure only if instructed to do so by a service action or the IBM support center. If you have a single control enclosure, this procedure requires that you shut down your system to replace the control enclosure. If you...
Page 273
b. Use the following CLI command to find the volumes that depend on this enclosure: lsdependentvdisks -enclosure <enclosure_id> Dependent volume names that start with IFS are file volumes that are used by the file modules to provide file systems. Turn off these file modules.
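The naming rule above (dependent volumes whose names start with IFS are file volumes) can be applied to a list of volume names as follows. This is a sketch under assumptions: the function name and the sample volume names are hypothetical, and the list is assumed to have been extracted from the lsdependentvdisks output by hand or by another tool.

```python
def split_dependent_volumes(volume_names):
    """Separate file volumes (names starting with "IFS") from other volumes."""
    file_volumes = [name for name in volume_names if name.startswith("IFS")]
    other_volumes = [name for name in volume_names if not name.startswith("IFS")]
    return file_volumes, other_volumes


# Hypothetical names, for illustration only.
names = ["IFS1350385068630", "db_vol01", "IFS1350385068745", "vmware_ds2"]
file_volumes, other_volumes = split_dependent_volumes(names)
print("file volumes:", file_volumes)
print("other volumes:", other_volumes)
```

If the first list is not empty, the file modules depend on the enclosure and must be turned off as described in the step.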
Page 274
“Procedure: Fixing node errors” on page 220. To restart a node from the service assistant, perform the following steps: 1) Log on to the service assistant. 2) From the home page, select the node that you want to restart from the Changed Node List.
3) Select Actions > Restart. d. The system starts and can handle I/O requests from the host systems. Note: The configuration changes that are described in the following steps must be performed to ensure that the system is operating correctly. If you do not perform these steps, the system is unable to report certain errors.
Page 276
Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: v Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
Page 277
Attention: If your system is powered on and performing I/O operations, go to the management GUI and follow the fix procedures. Performing the replacement actions without the assistance of the fix procedures can result in loss of data or access to data. Even though many of these procedures are hot-swappable, these procedures are intended to be used only when your system is not up and running and performing I/O operations.
2. Record the location of the rail assembly in the rack cabinet. 3. Working from the back of the rack cabinet, remove the clamping screw 1 from the rail assembly on both sides of the rack cabinet.
Figure 64. Removing a rail assembly from a rack cabinet 4. Working from the front of the rack cabinet, remove the clamping screw from the rail assembly on both sides of the rack cabinet. 5. From one side of the rack cabinet, grip the rail and slide the rail pieces together to shorten the rail.
1. Ensure that the Fibre Channel cable is securely connected at each end. 2. Replace the Fibre Channel cable. 3. Replace the SFP transceiver for the failing port on the Storwize V7000 Unified node. Note: Storwize V7000 Unified nodes are supported with both longwave SFP transceivers and shortwave SFP transceivers.
Ethernet iSCSI host-link problems If you are having problems attaching to the Ethernet hosts, your problem might be related to the network, the Storwize V7000 Unified system, or the host. Before you begin For network problems, you can attempt any of the following actions: v Test your connectivity between the host and Storwize V7000 Unified ports.
Attention: If you experience failures at any time while you are running the recover system procedure, call the IBM Support Center. Do not attempt to do further recovery actions because these actions might prevent IBM Support from restoring the system to an operational status.
Page 283
Certain conditions must be met before you run the recovery procedure. Use the following items to help you determine when to run the recovery procedure: v Check to see if any node in the system has a node status of active. This status means that the system is still available.
Note: If after resolving all these scenarios, half or greater than half of the nodes are reporting node error 578, it is appropriate to run the recovery procedure. You can also call IBM Support for further assistance. – For any nodes that are reporting a node error 550, ensure that all the missing hardware that is identified by these errors is powered on and connected without faults.
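Two of the checks described above can be written down as a small decision helper: an active node means that the system is still available, and the recovery procedure is appropriate only when half or more of the nodes report node error 578. This is a sketch only, with hypothetical names and input format; it does not cover the other conditions in this procedure (such as resolving node error 550 first) and is not a substitute for following the procedure or calling IBM Support.

```python
def recovery_is_appropriate(nodes):
    """Evaluate two of the documented conditions for running the recovery procedure.

    nodes is a list of (state, error_codes) tuples, for example
    ("starting", [578]). The input format is an assumption for this sketch.
    """
    if not nodes:
        return False
    # An active node means the system is still available.
    if any(state == "active" for state, _ in nodes):
        return False
    reporting_578 = sum(1 for _, errors in nodes if 578 in errors)
    # "Half or greater than half" of the nodes must report node error 578.
    return 2 * reporting_578 >= len(nodes)


print(recovery_is_appropriate([("starting", [578]), ("starting", [550])]))
```

The comparison is written as `2 * reporting_578 >= len(nodes)` so that exactly half of the nodes qualifies without any floating-point division.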
Attention: This service action has serious implications if not performed properly. If at any time an error is encountered not covered by this procedure, stop and call IBM Support. Note: The web browser must not block pop-up windows, otherwise progress windows cannot open.
Page 286
“Recovering from offline VDisks using the CLI” on page 263 for details. T3 failed Call IBM Support. Do not attempt any further action. Run the recovery from any node canisters in the system; the node canisters must not have participated in any other system.
Perform the following steps to recover an offline volume after the recovery procedure has completed: 1. Delete all IBM FlashCopy function mappings and Metro Mirror or Global Mirror relationships that use the offline volumes. 2. Run the recovervdisk or recovervdiskbysystem command.
Before using the file volumes that are used by GPFS on the file modules to provide Network Attached Storage (NAS), perform the following task: v Contact IBM support for assistance with recovering the GPFS quorum state so that access to files as NAS can be restored.
Page 289
Contact the IBM support center to help you prepare the Storwize V7000 Unified system to do the restoring of the system configuration on the control enclosure.
Typically the restoration should be performed via canister 1. Before you begin, hardware recovery must be complete. The following hardware must be operational: hosts, Storwize V7000 Unified, drives, the Ethernet network, and the SAN fabric. Backing up the system configuration using the CLI You can back up your configuration data using the command-line interface (CLI).
Page 291
data. This can be attempted via the <Recover System Procedure> also known as a Tier 3 (T3) procedure. Restoring the system configuration without attempting to recover the application data is performed via the <Restoring the System Configuration> procedure also known as a Tier 4 (T4) recovery. Both of these procedures require a recent backup of the configuration data.
2. Issue the following CLI command to erase all of the files that are stored in the /tmp directory: svcconfig clear -all
6. Save the new configuration by clicking the OK button. Results Configuring the remote support system IBM Storwize V7000 Unified uses IBM Tivoli Assist On Site software to establish remote connections to IBM support representatives. Establishing an AOS connection Use this information to establish an AOS connection with IBM remote support for diagnosing and reviewing issues and problems on your system.
Page 294
Storwize V7000 Unified system. About this task Configure the system for a lights-out connection using the Enable IBM Tivoli Assist On-Site (AOS) task. After you configure the system, no other tasks are needed. The remote support contact might ask you for machine information, such as machine type and models, serial numbers, and your machine name.
Page 295
Enter the customer name, the case number (use the PMR number), and the geography. f. Talk to the IBM authorized servicer at the customer site to make sure that the servicer is ready to establish the link before you submit the form.
Page 298
8. From the KVM where you logged on as root, use the chrootpwd command to change the root password on both file modules. Results The chrootpwd program prompts you for the new root password.
SCSI protocol. Before you begin During the USB initialization of the Storwize V7000 Unified system, one of the node canisters in the control enclosure creates a public/private key pair to use for ssh. The node canister stores the public key and writes the private key to the USB flash drive memory.
- /sharename The ls command can return the following error: ls: .: Stale NFS file handle The Storwize V7000 Unified system hosting file module might display the following error: mgmt002st001 mountd[3055867]: refused mount request from hostname for sharename (/): not exported If one of these errors occurs, complete the following steps.
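On an NFS client, the "Stale NFS file handle" condition reported by ls corresponds to the operating system error code ESTALE. The following sketch shows one way a client-side script could detect it; the function names are hypothetical, and the example assumes a Linux or UNIX client where `errno.ESTALE` is defined.

```python
import errno
import os


def is_stale_error(exc):
    """Return True if an OSError carries the stale file handle error code."""
    return isinstance(exc, OSError) and exc.errno == errno.ESTALE


def has_stale_handle(path):
    """List a directory and report whether the attempt fails with ESTALE.

    Other failures (for example a missing path) return False, because they
    are different problems from a stale handle.
    """
    try:
        os.listdir(path)
    except OSError as exc:
        return is_stale_error(exc)
    return False


print(has_stale_handle("."))
```

A True result for a mounted share is the client-side symptom only; the steps in this topic still apply for correcting the export on the system.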
This section covers the recovery procedures related to file module issues. Restoring System x firmware (BIOS) settings During critical repair actions such as the replacement of a system planar in an IBM Storwize V7000 Unified file module, you might have to reset the System x firmware.
13. Press ESC or click Exit Setup, and then press Enter. 14. When prompted, click Y to exit the setup menu. The system now reboots. During the reboot, the Storwize V7000 Unified code automatically modifies the configuration of the System x firmware (BIOS) to change the default settings to the required settings.
Use this procedure after completing the procedure in “Fibre Channel connectivity between file modules and control enclosure” on page 34. The Storwize V7000 Unified system can experience problems where multipathd failures occur. If the paths are not automatically restored, a system reboot can recover the paths.
Use this procedure to recover from an httpd service error when the service is reported as unhealthy or off. About this task Procedure To fix the httpd error, perform the following steps: 1. Attempt to start the http service manually.
a. Log in as root. b. Issue the service http start command. 2. When you complete the service action, refer to “Health status and recovery” on page 23. Recovering from an sshd_data service error Use this procedure to recover from an sshd_data service error. About this task This recovery procedure starts the sshd_data service when it is down.
About this task Procedure To run the fix procedures, perform the following steps: 1. Log in to the Storwize V7000 Unified management GUI. 2. Go to Monitoring > Events and click the Block tab. 3. Run any Next recommended action.
Point in time block copies are a good candidate for deletion. Storwize V7000 Unified can virtualize external block storage controllers. If spare capacity is available on other block storage controllers, you can virtualize those controllers and use that capacity to free local arrays.
You can immediately remount any remaining unmounted file systems without waiting for IBM support to tell you that it is safe for you to re-enable the control enclosure CLI. Note: The management GUI can become very slow when the control enclosure CLI is restricted, so the following procedure shows how to use the management CLI to check if the file systems are mounted.
Page 309
-r -n <node name of the active mode> initnode -r 4. Log back on to the Storwize V7000 Unified CLI. Then wait for GPFS to be active on both file modules in the output of the CLI command: Chapter 7. Recovery procedures...
CLI is restricted. When you log on to the management GUI, it issues a warning that the Storwize V7000 CLI is restricted. The management GUI runs the fix procedure to direct you to send logs to IBM. The fix procedure directs you back to this procedure to make the file systems accessible again.
Site A by using the rmtask CLI command. Restoring Tivoli Storage Manager data The Storwize V7000 Unified system contains a Tivoli Storage Manager client that works with your Tivoli Storage Manager server system to perform high-speed data backup and recovery operations.
2. After each recommended fix, restart the upgrade by issuing the applysoftware command again. If the action fails, try the next recommended action. 3. If the recommended actions fail to resolve the issue, call the IBM Support Center. Table 43. Upgrade error codes from using the applysoftware command and recommended...
Page 313
Table 43. Upgrade error codes from using the applysoftware command and recommended actions (continued) The applysoftware Error Code command explanation Action EFSSG4102A The applysoftware command returned software package does not exist EFSSG4103 The software package is not The package might be valid.
EFSSG4160 Explanation: The system has insufficient file system space. Action: At least 3 GB of space is required. Remove unneeded files from the /var file system. EFSSA0201C Explanation: The license agreement has not been accepted.
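The EFSSG4160 entry requires at least 3 GB of free space in the /var file system. A free-space check of that kind can be sketched with the Python standard library. The names below are hypothetical, treating 3 GB as 3 x 1024^3 bytes is an assumption, and the guide does not state that Python is available on the file modules, so read this as an illustration of the check and not as a supported tool.

```python
import shutil

# Assumption: "3 GB" is interpreted as 3 * 1024**3 bytes.
REQUIRED_FREE_BYTES = 3 * 1024**3


def free_bytes(path="/var"):
    """Return the free space, in bytes, of the file system that holds path."""
    return shutil.disk_usage(path).free


def has_enough_space(free, required=REQUIRED_FREE_BYTES):
    """Return True if the free space meets the stated minimum."""
    return free >= required


# "." is used here so that the example runs anywhere; use "/var" on the target.
available = free_bytes(".")
print(f"free: {available / 1024**3:.1f} GB, sufficient: {has_enough_space(available)}")
```

Keeping the comparison separate from the measurement makes the threshold easy to test without depending on the state of a real file system.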
Page 315
2. After each recommended fix, restart the upgrade by issuing the applysoftware command again. If the action fails, try the next recommended action. 3. If the recommended actions fail to resolve the issue, call the IBM Support Center. Table 44. Upgrade error codes and recommended actions...
If there is no obvious event that could have caused this error, refer to “Ethernet connectivity from file modules to the control enclosure” on page 29.
Page 317
Table 44. Upgrade error codes and recommended actions (continued) Error Code Explanation Action 01B5 Explanation: Storwize V7000 multipaths are unhealthy. Action: Check the Fibre Channel connections to the system. Reseat Fibre Channel cables. For more information, see “Fibre Channel connectivity between file modules and control enclosure”...
Page 318
Unable to configure node. 1. Pull both power supply cables from the subject node. Wait 10 seconds, then plug them back in. After the system restarts, try again. 2. Contact your next level of support.
Page 319
Table 44. Upgrade error codes and recommended actions (continued) Error Code Explanation Action 01D0 Unable to disable call home. Contact your next level of support. 01D1 Unable to enable call home. Contact your next level of support. 01D2 Failed to stop GPFS. 1.
Storage pool is full and the file system pool is offline: Increase capacity of the storage pool. Storage pool is full and the file system pool is offline, but no additional storage is available to add to the pool: Contact IBM Remote Technical Support or your service representative.
Page 322
2. Select the storage system to view a list of MDisks that are currently detected on the external storage system. If there are no MDisks that are displayed, click Detect MDisks. If the Storwize V7000 Unified system is attached to external storage systems, you can allocate additional LUNs.
3. Right-click an unmanaged MDisk and select Add to Pool. 4. On the Add to Pool dialog, select the pool and click Add to Pool. 5. Verify that the MDisk was added to the selected pool by expanding the pool and ensuring that the added MDisk is displayed.
Page 324
5. On the right side of the panel, under the Capacity heading, the real capacity for the compressed volume is displayed. The storage pool must have at least the real capacity of the volume to successfully migrate the data.
To decrease the file system capacity, you can remove the disks (NSD) and the corresponding mapping to block volumes to force migration of the data to other NSDs, thus freeing up space on the file system. To remove an NSD, contact IBM Remote Technical Support.
Page 326
In the management GUI, select Files > File Systems. b. Right-click the compressed file system that is offline and select Mount. If the file system does not come back online you may need to restart all of the...
Step 1 and select Mark as... > Spare. e. Click OK. To add additional drives to the system, complete these steps: a. Acquire additional drives from IBM or your vendor. b. Install drives into available drive slots on the enclosure. See Installing a hot-swap hard disk drive.
Page 328
Additionally you must also monitor file capacity utilization to ensure that the file system does not reach 100% utilization and run out of capacity. The capacity utilization of a file system issued physical capacity in the compressed pool. The
Page 329
system uses the same threshold and alerting system and suggests corrective actions when thresholds are reached. It is based on the original, uncompressed capacity that the system presents to users and applications of the file system. To free up capacity in a file system, you can either delete files from the file system or increase the current capacity of the storage pool, which can be used to expand the volumes that are related to the NSDs from the unused physical capacity.
Accessibility features help a user who has a physical disability, such as restricted mobility or limited vision, to use software products successfully. Accessibility features These are the major accessibility features associated with the Storwize V7000 Unified Information Center: v You can use screen-reader software and a digital speech synthesizer to hear what is displayed on the screen.
Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.
Page 334
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created...
IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall not be liable for any damages arising out of your use of the sample programs.
Properly shielded and grounded cables and connectors must be used in order to meet FCC emission limits. IBM is not responsible for any radio or television interference caused by using other than recommended cables and connectors, or by unauthorized changes or modifications to this equipment.
Klasse A ein. Um dieses sicherzustellen, sind die Geräte wie in den Handbüchern beschrieben zu installieren und zu betreiben. Des Weiteren dürfen auch nur von der IBM empfohlene Kabel angeschlossen werden. IBM übernimmt keine Verantwortung für die Einhaltung der Schutzanforderungen, wenn das Produkt ohne Zustimmung der IBM verändert bzw.
Statement International Electrotechnical Commission (IEC) statement This product has been designed and built to comply with (IEC) Standard 950. Korean Communications Commission (KCC) Class A Statement Russia Electromagnetic Interference (EMI) Class A Statement
Fax: 0049 (0)711 785 1283 Email: tjahn@de.ibm.com Taiwan Contact Information This topic contains the product service contact information for Taiwan. IBM Taiwan Product Service Contact Information: IBM Taiwan Corporation 3F, No 7, Song Ren Rd., Taipei Taiwan Tel: 0800-016-888...
Page 343
38 mirrored volumes light path diagnostics 41 not identical 209 system status 213 multipath events legal notices outputs 279 IBM virtual media key Notices 309 removal 103 trademarks 311 replacing 104 light path diagnostics identifying LEDs 41...
Page 344
206 operator information panel accessing 307 service assistant assembly 147, 148 accessing 184, 223 parts interface 183 overview 224 supported browsers 208 preparing 224 query status command 192 when to use 183
Page 345
257 error codes 288 Storwize V7000 46 recovery 288 hardware indicators 46 USB flash drive Storwize V7000 Unified library detection error 210 related publications xx USB key superuser using 186 password when to use 186 resetting 211...
Page 348
Part Number: 00AR050 Printed in USA GA32-1057-07...