VNX Uptime Bulletin Q3 2013


EMC sends out a VNX Uptime Bulletin every quarter to update customers on best practices and fixes that help you achieve the maximum possible uptime and robustness for your VNX. You can subscribe to it as you would to any other ETA (EMC Technical Advisory): log in at http://support.emc.com, go to Support by Product, open your product page (in this case the VNX) and click “Get Advisory Alerts”. This quarter's bulletin discusses storage pools and LUN ownership, vault drives, powering down the array, and software versions.

Storage Pools & LUN Ownership

Uptime is all well and good, but if your attached hosts can’t get the required performance from the array, you’re still in trouble. The bulletin kicks off by explaining the “allocation owner”. Understanding what types of LUN owners exist is critical for performance when provisioning LUNs from a VNX storage pool. You can read about this in a separate post I made over here.

Apart from that, it’s vital that you keep some free space in a storage pool. The recommendation is 10% free space on VNX OE R31; R32 can cope with 5%. Without any free space, FAST VP slice relocations will either take longer to complete or fail outright. This will of course hurt performance when a slice that needs to move to a faster tier can’t be promoted. Another problem is that EMC has certain recovery tools (used for example when LUNs are offline) that need some free space in a pool. No free space -> extra delay in getting your data online.
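To make the rule of thumb concrete, here is a minimal Python sketch that checks a pool's free space against the 10% (R31) and 5% (R32) figures mentioned above. The function names and the capacity numbers are made up for illustration; the sketch does not talk to an array, so you would feed it the total and free capacity you read from your own management tooling.

```python
# Minimum free-space percentages per VNX OE release, as quoted in the text above.
MIN_FREE_PCT = {"R31": 10.0, "R32": 5.0}


def pool_free_pct(total_gb, free_gb):
    """Return the free space of a pool as a percentage of its total capacity."""
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    return 100.0 * free_gb / total_gb


def pool_has_headroom(total_gb, free_gb, oe_release):
    """True if the pool meets the free-space guideline for the given release.

    oe_release must be a key of MIN_FREE_PCT ("R31" or "R32").
    """
    return pool_free_pct(total_gb, free_gb) >= MIN_FREE_PCT[oe_release]


# Hypothetical pool: 10,000 GB total, 800 GB free (8% free).
print(pool_has_headroom(10000, 800, "R31"))  # False: below the 10% guideline
print(pool_has_headroom(10000, 800, "R32"))  # True: above the 5% guideline
```

The same pool passes on R32 but not on R31, which is worth keeping in mind if you are still running the older release.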

Human error

The vault drives hold the VNX OE and potentially also some user data. There are scenarios where you would want to swap all the vault drives: for smaller drives, to waste less capacity when you’re not using the user space on them, or for higher-capacity drives when a FLARE / VNX OE upgrade requires it. The bottom line: vault drives have your storage OS on them. Handle them with care and don’t replace them as if they were just regular failed drives. There’s serious potential for wrecking your storage system, so make sure a certified engineer with the proper EMC procedures does the swap!


A different task, but one with just as much potential for catastrophic data loss, is powering down your VNX storage system. If you’ve got a Unified VNX, always power down your Data Movers and Control Station gracefully first. Next, power down the block component using the SPS power switches. NEVER EVER pull the power cables while the system is on; you’ll lose data! If you want a procedure document for this, look over here.

Software versions

An important part of achieving good VNX uptime is making sure you’re not too far behind with software versions (i.e. the VNX OE software). Both the VNX Uptime Bulletin and the EMC support website list the latest and target software releases. The website has the added benefit of also showing the adoption rate, so you can get a sense of what the rest of the world is doing. I myself usually schedule a VNX upgrade once or twice per year, upgrading to the latest code that’s at least 30-60 days old. This usually filters out the sporadic bad releases, yet also makes sure I don’t fall too far behind. Read the full VNX Uptime Bulletin for the latest code-related news.
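The "latest code that’s at least 30-60 days old" rule is easy to express in a few lines of Python. This is only a sketch of that personal rule of thumb: the release labels and dates below are invented placeholders, not real VNX OE versions, and you would fill in the list by hand from the release information on the support website.

```python
from datetime import date, timedelta


def pick_upgrade_target(releases, today, min_age_days=30):
    """Return the label of the newest release that is at least min_age_days old.

    releases is a list of (label, release_date) tuples in any order.
    Returns None if no release is old enough yet.
    """
    cutoff = today - timedelta(days=min_age_days)
    eligible = [r for r in releases if r[1] <= cutoff]
    if not eligible:
        return None
    # Newest eligible release wins.
    return max(eligible, key=lambda r: r[1])[0]


# Placeholder release list; labels and dates are hypothetical.
releases = [
    ("release-A", date(2013, 3, 1)),
    ("release-B", date(2013, 7, 15)),
    ("release-C", date(2013, 8, 20)),
]

today = date(2013, 9, 1)
print(pick_upgrade_target(releases, today, min_age_days=30))  # release-B
print(pick_upgrade_target(releases, today, min_age_days=60))  # release-A
```

A longer waiting period trades newer fixes for more time in which problems with a release can surface at other sites, which is why the choice between 30 and 60 days is left as a parameter.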