I recently installed a new Data Domain DD6300. Part of the whole installation procedure is to run a DD OS upgrade to bring the system up to the target DD OS release. You can find the target releases over here. While running the upgrade to 126.96.36.199, the Data Domain correctly rebooted as part of the upgrade. Logging back in, the system GUI kept throwing an “Upgrade in progress” popup, blocking everything else in view. There is also an alert that shows “DD OS Upgrade is in progress. The system will not be available for backup and restore operations. The alert will be cleared after the upgrade operation is complete.” Which I guess is NEVER when the upgrade is hung…
I’ve installed quite a few new Isilon clusters in 2019. All of them are generation 6 clusters (H400, H500, A200), using the very cool 4-nodes-in-a-chassis hardware. What all these systems have in common is a 1GbE management port next to the two 10GbE ports. While Isilon uses in-band management, we typically use those UTP ports for management traffic: SRS, HTTP, etc. We assign those interfaces to subnet0:pool0 and make it a static SmartConnect pool. This assigns one IP address to each interface; if you do it right, these should be sequential.
A recent addition to my install procedure is to create some DNS A-records for those management ports. This makes it a bit more human-friendly to connect your browser or SSH client to a specific node. In line with the Isilon naming convention, I followed the -# suffix format. So if the cluster is called cluster01, node 1 is cluster01-1, node 2 is cluster01-2, etc. However, it turns out this messes up your SyncIQ replication behavior!
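As a rough illustration of the naming scheme described above, the sketch below generates the name/address pairs for a hypothetical four-node cluster. The function name `mgmt_a_records`, the 192.0.2.x documentation addresses, and the assumption that the management IPs are strictly sequential are mine, not taken from a real cluster. Keep the SyncIQ caveat in mind before actually creating records in this format.

```python
import ipaddress


def mgmt_a_records(cluster, first_ip, node_count):
    """Return (hostname, ip) pairs using the <cluster>-<node number> suffix format.

    Assumes node N holds the Nth address counting up from first_ip,
    which is only true if the static pool was assigned sequentially.
    """
    start = ipaddress.ip_address(first_ip)
    return [
        (f"{cluster}-{node}", str(start + (node - 1)))
        for node in range(1, node_count + 1)
    ]


# Print the records in a zone-file-like layout (hypothetical values).
for name, ip in mgmt_a_records("cluster01", "192.0.2.11", 4):
    print(f"{name}\tIN\tA\t{ip}")
```

Generating the list from the first address makes it easy to spot a node whose management IP ended up out of sequence.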
If you’re remotely managing a Linux machine, you’ll probably use an SSH connection to run commands on that machine. There’s one problem with this approach: if you close the SSH connection, any long-running jobs/commands will halt. If you know a job will take a long time and you won’t be able to babysit the SSH connection, you can plan accordingly. But what if you underestimated the time a job will take, and you need to disconnect anyway? Here’s how to keep the job running AND make it home in time for dinner!
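For the "plan accordingly" case, one common approach is to start the job detached from the terminal in the first place. The sketch below is not necessarily the method this post goes on to describe. It assumes a POSIX system, and the helper name `start_detached` and the log path are hypothetical. It launches the command in its own session, so the job has no controlling terminal and is not tied to the SSH connection's terminal when that connection goes away.

```python
import subprocess


def start_detached(cmd, log_path):
    """Start cmd in a new session so it keeps running after the terminal closes.

    Output goes to log_path because the job must not depend on the terminal
    for stdin, stdout or stderr.
    """
    with open(log_path, "ab") as log:
        proc = subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,   # never read from the terminal
            stdout=log,                 # append normal output to the log file
            stderr=subprocess.STDOUT,   # send errors to the same log
            start_new_session=True,     # POSIX: run the child in its own session
        )
    return proc
```

A call such as `start_detached(["rsync", "-a", "/src/", "/dst/"], "/tmp/job.log")` (hypothetical paths) would return immediately, and progress can be followed later by reading the log file from a new SSH session.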
While upgrading OneFS it’s important to keep the InsightIQ software version compatible with the Isilon systems. In this case, InsightIQ hadn’t been updated for a while and I had to upgrade from 3.0 -> 3.1 -> 3.2 -> 4.x. The actual upgrade process isn’t too hard (it just takes a lot of time), but there’s one little prerequisite in the 3.1 -> 3.2 upgrade: a minimum of 502 MB of free space in the root partition. As you can see in the screenshot, I wasn’t even close to the minimum requirement. I got to 357 MB, and that’s after cleaning up redundant stuff. Time to add some more disk space and extend the root partition!
When deleting an Isilon folder, you might come across some peculiar behavior. If you browse to an SMB share with a file explorer and delete a folder, the operation apparently succeeds and the folder disappears. When you refresh the share, however, the folder is back. If you then resort to an SSH session to delete the folder, you get an Operation not permitted error and the rm/rmdir command fails.
Last week I implemented two new Data Domain systems for a new customer who’d like to use these systems as backup targets for their existing Veeam 8 environment. Backups are replicated to the secondary system to guarantee recoverability even if the first system or data center experiences a catastrophic failure. In this case replication is handled by the Data Domain system itself. You’d like your backup software to be aware of the replicas at the secondary location. This in turn means Veeam should be able to read from the replica, which turned out to be a bit of a configuration challenge. Bring out the CLI!