Every once in a while you might need to replace an Isilon InfiniBand switch: maybe the switch is broken, maybe you need more ports, or maybe the old switch is simply too old. Good news: it’s a fairly straightforward job. And if your cluster has two switches, you can replace one switch at a time without an outage.
We’re in the midst of a VCE vBlock 340 software upgrade. Part of this upgrade process is upgrading the Cisco Nexus 5K switches that connect the blades and storage to the customer network. After upgrading the first switch we noticed that the VNX Unified standby data mover (server_3) interface was suspended with a “no LACP PDUs” error message. A quick check on the switch that wasn’t upgraded yet showed that interface to be online. So what’s up with that?
Once upon a time there was a data center filled with racks of physical servers. Thanks to hypervisors such as VMware ESX it was possible to virtualize these systems and run them as virtual machines, using less hardware. This had a lot of advantages in terms of compute efficiency, ease of management and deployment/DR agility.
To enable many of the hypervisor features such as VMotion, HA and DRS, the data of the virtual machine had to be located on a shared storage system. This had an extra benefit: it’s easier to hand out pieces of a big pool of shared storage than to predict capacity requirements for hundreds of individual servers. Some servers might need a lot of capacity (file servers), some might need just enough for an OS and maybe a web server application. This meant that the move to centralized storage was also beneficial from a capacity allocation perspective.
A Brocade firmware upgrade once in a while is highly recommended: new releases usually squash bugs and add new features, which helps with a stable and efficient SAN infrastructure. The upgrade process itself is relatively straightforward if you keep in mind that you can only non-disruptively upgrade one major release at a time. With that knowledge you only need an FTP server (like FileZilla), an SSH client (PuTTY), the upgrade packages and some patience.
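The CLI sequence for such an upgrade can be sketched roughly as follows. This is a hedged outline, not a full runbook: the exact prompts and options vary per Fabric OS release, and the FTP server details you enter are your own.

```shell
# Check the currently installed Fabric OS version first,
# so you know which release you can non-disruptively step to.
firmwareshow

# Back up the switch configuration to your FTP server before touching anything.
# This command prompts interactively for server IP, user, path and password.
configupload

# Start the firmware download; the switch again prompts for the FTP server
# details and the path to the unpacked release directory.
firmwaredownload

# Follow the progress; the switch reboots its control processor
# non-disruptively as part of the process.
firmwaredownloadstatus

# When it's done, verify both partitions run the new release.
firmwareshow
```

Repeat the sequence once per major release hop until you reach the target version.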
After you’ve built a new storage environment you will probably want to monitor it and/or integrate the equipment in existing monitoring tools. SNMP is one of the protocols to use for this, but for some reason I always forget how to do a Cisco NX-OS SNMP v3 configuration. There’s a big difference in security between SNMP v2c and v3, and they’re configured quite differently: SNMPv2c uses community strings, while SNMPv3 builds on the user accounts in the switch. This post will show you how to configure SNMP v3 in the DCNM SAN GUI and on the Cisco MDS NX-OS CLI.
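On the CLI side, the core of an SNMPv3 setup boils down to a user with authentication and privacy settings, plus a trap destination. A minimal sketch, with placeholder user name, passwords and monitoring host that you should replace with your own:

```shell
! Create an SNMPv3 user mapped to a switch role; "monitoruser" and
! both passphrases are placeholders.
snmp-server user monitoruser network-operator auth sha AuthPass123 priv aes-128 PrivPass123

! Send v3 traps to the monitoring station, using that user and
! full auth+privacy ("priv") security level.
snmp-server host 192.0.2.10 traps version 3 priv monitoruser

! Enable the trap categories you care about, e.g. link state changes.
snmp-server enable traps link
```

Because SNMPv3 rides on the switch user accounts, the same credentials also govern what the monitoring station may read via its role assignment.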
Cisco Smart Zoning greatly reduces the time needed to zone servers to storage on Cisco NX-OS SAN switches. Instead of creating numerous zones that contain one single initiator and one single target, you can now classify a WWN as initiator, target or both and throw them all into one single zone. The switch then figures out which devices should be allowed to talk with each other (based on the parameter you set for each WWN). Not only does this speed up the entire zoning process, but it also keeps the zoning interface uncluttered and minimizes the risk of errors. Let’s see how you can configure this in the DCNM-SAN GUI…
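For reference, the same idea on the MDS CLI looks roughly like the sketch below. The VSAN number, zone and zoneset names, and WWNs are all placeholders; the key point is the `init`/`target` attribute on each member, which is what lets the switch derive the allowed initiator-target pairs.

```shell
! Enable Smart Zoning for the VSAN (placeholder VSAN 10).
configure terminal
zone smart-zoning enable vsan 10

! One zone holding a host HBA and two array front-end ports,
! each tagged with its device type.
zone name APP_ZONE vsan 10
  member pwwn 10:00:00:00:c9:aa:bb:01 init
  member pwwn 50:06:01:60:3b:e0:12:34 target
  member pwwn 50:06:01:68:3b:e0:12:34 target

! Add the zone to a zoneset and activate it as usual.
zoneset name FABRIC_A vsan 10
  member APP_ZONE
zoneset activate name FABRIC_A vsan 10
```

The switch programs initiator-to-target pairs only, so the two targets in the zone are not allowed to talk to each other.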
Last weekend we replaced six old Brocade SAN switches with brand new Cisco MDS 9148 switches. Everything went according to plan with no disruption to the rest of the infrastructure. I was however stuck with a bunch of old Brocade 4900 switches ready to be decommissioned. Performing a Brocade reset to factory default settings proved to be a bit of a challenge though…
If you ask Google how to perform a Brocade reset to factory default settings, you’ll find a lot of commands. One command removes the zoning, another command removes a different part of the config, a third command replaces some config values with the default settings. However, none of these commands reset the IP configuration, user passwords or switch name. Which is kind of awkward since that’s THE part of the switch config you wouldn’t want to become public domain…
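Piecing those commands together, a complete wipe ends up looking something like the sketch below. This is a hedged outline: use a serial console (the IP configuration gets cleared along the way), and note that the name, passwords and IP settings indeed have to be reset manually on top of the usual config-clearing commands.

```shell
switchdisable      # take the switch offline; configdefault requires this
cfgdisable         # deactivate the active zoning configuration
cfgclear           # wipe the zoning database on the local switch
cfgsave            # commit the (now empty) zoning database

configdefault      # reset most configuration parameters to factory defaults

switchname "switch"  # reset the switch name by hand (placeholder name)
passwd               # interactively reset the user passwords
ipaddrset            # interactively clear/reset the IP configuration

switchenable       # bring the switch back online
```

Only after the last three manual steps is the switch actually safe to hand over or dispose of.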
Recently I ran into an environment with a couple of VNX5700 systems that were attached to the front-end SAN switches with only two ports per storage processor. The customer was complaining: performance was OK most of the time, but at certain times of day it was noticeably lower. Analysis revealed that the back-end was coping well with the workload (30-50% load on the disks and storage processors). The front-end ports, however, were overloaded and spewing QFULL errors. Time to cable in some extra ports and rebalance the existing hosts over the new storage paths!