According to Tintri, the rise of server virtualization broke the traditional storage system. Initially we had relatively simple environments where one server talked to a number of LUNs on a storage system. Sometimes we’d have a small cluster of servers accessing those volumes. Still relatively simple.
Fast forward to now: large clusters of hypervisor hosts are the norm, collectively accessing an even larger number of volumes. Each hypervisor in turn hosts a large number of virtual machines. In case of performance problems, how are you ever going to figure out the root cause and which other systems are affected?
For any engineer working in storage, this might sound all too familiar. I can still vividly recall the case where roughly half the VMs suddenly started to perform poorly. After checking the four storage systems, we found one that was showing a 100% CPU load on one storage processor. It took another couple of minutes to pinpoint the LUN responsible for the excessive write IO, then some more time to find the VM behind it: a SQL server. A very professional (ahum!) shout across the department identified the SQL admin who was restoring a database: he was really happy with the amazing restore speed (300+ MB/s, which was exciting back then!), but we weren’t so happy about the extra load on our already overloaded systems.
Making storage VM-aware
Tintri simplifies this troubleshooting exercise by showing you the virtual machines located on the system, straight from the storage management interface. It can do this for a whole host of hypervisors: VMware vSphere, Microsoft Hyper-V, Red Hat Enterprise Virtualization, OpenStack and XenServer at the time of Storage Field Day 10.
Since the Tintri system can paint a broader picture than just the storage latency, it can also show you the latency components such as host, network/SAN, storage and disk latency. Useful for those “It’s the storage again, boohoo!” discussions!
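To make the idea of a latency breakdown concrete, here is a minimal Python sketch. It is not Tintri's API or data model: the `LatencyBreakdown` class, its field names and the sample numbers are all hypothetical, and it simply assumes the four components add up to the end-to-end latency a VM experiences.

```python
from dataclasses import dataclass


@dataclass
class LatencyBreakdown:
    """Hypothetical per-VM latency split, all values in milliseconds."""
    host_ms: float
    network_ms: float
    storage_ms: float
    disk_ms: float

    def components(self) -> dict:
        # Map each layer name to its share of the latency.
        return {
            "host": self.host_ms,
            "network": self.network_ms,
            "storage": self.storage_ms,
            "disk": self.disk_ms,
        }

    def total_ms(self) -> float:
        # Assumption: end-to-end latency is the plain sum of the layers.
        return sum(self.components().values())

    def dominant(self) -> str:
        # The layer contributing the most latency.
        parts = self.components()
        return max(parts, key=parts.get)


# Made-up sample: a VM with 8 ms total latency, most of it on the host.
sample = LatencyBreakdown(host_ms=6.0, network_ms=1.0, storage_ms=0.5, disk_ms=0.5)
print(f"total: {sample.total_ms()} ms, dominant layer: {sample.dominant()}")
```

In this made-up case the storage layer accounts for only 0.5 ms of the 8 ms, which is exactly the kind of evidence that settles a “blame the storage” discussion.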
The Tintri GUI is based on Adobe Flex, with modernization coming over time. I like the clean approach of the GUI, with plenty of mouse-over options, for example the latency drilldown per virtual machine. See for yourself in this SFD10 demo video:
The above is still a bit reactive, I can hear you think. Well… Tintri proactively helps you avoid bottlenecks with Tintri Global Center, its global federated management system. Twice a day it crunches the numbers and shows you the bottlenecks inside and outside of the system. It then shows you the recommended actions (e.g. move VMs) and the impact of these actions (e.g. capacity reduction and load improvements).
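As a rough illustration of what "recommend an action and show its impact" could look like, here is a small Python sketch. It is a deliberately naive heuristic of my own, not Tintri's algorithm: the `VM` and `Array` classes, the `recommend_move` function, the 80% threshold and the sample loads are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    load_pct: float  # hypothetical: share of one array's performance this VM uses


@dataclass
class Array:
    name: str
    vms: list = field(default_factory=list)

    def load(self) -> float:
        # Total load on this array is the sum of its VMs' loads.
        return sum(vm.load_pct for vm in self.vms)


def recommend_move(arrays, threshold=80.0):
    """Suggest moving one VM from the busiest to the quietest array.

    Returns a dict with the move and its projected impact, or None
    when no array is over the threshold or no single move helps.
    """
    source = max(arrays, key=Array.load)
    target = min(arrays, key=Array.load)
    if source is target or source.load() <= threshold:
        return None
    excess = source.load() - threshold
    # Candidates: VMs big enough to fix the source without overloading the target.
    candidates = [
        vm for vm in source.vms
        if vm.load_pct >= excess and target.load() + vm.load_pct <= threshold
    ]
    if not candidates:
        return None
    vm = min(candidates, key=lambda v: v.load_pct)  # move as little as possible
    return {
        "vm": vm.name,
        "from": source.name,
        "to": target.name,
        "source_load_after": source.load() - vm.load_pct,
        "target_load_after": target.load() + vm.load_pct,
    }


# Made-up environment: array-a is at 95% load, array-b at 20%.
array_a = Array("array-a", [VM("sql01", 40.0), VM("web01", 30.0), VM("file01", 25.0)])
array_b = Array("array-b", [VM("test01", 20.0)])
print(recommend_move([array_a, array_b]))
```

The point of returning the projected loads alongside the move is the same as in the real product: the admin gets to sanity-check the impact before executing anything.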
The Tintri systems come in either hybrid or all-flash configurations. You can start with one system and scale up to 32 systems in a cluster, with the Tintri Global Center to manage the whole environment.
My thoughts on Tintri and VM-aware storage
I spent my first years in IT initializing storage systems, building RAID groups, creating LUNs and gluing them together into MetaLUNs. A lot of planning went into these installations and as long as you knew what you were doing, you could get surprisingly stable, high performance out of the systems. Basically we were doing all these fancy things like storage pools manually. Since you were working full time with these systems, you knew everything about them and troubleshooting usually didn’t even need conscious thought: it was second nature, you just felt where the problem would probably reside.
The downside is that a customer isn’t that familiar with the system. And all that LUN planning and calculating IOps might be cool for a storage geek, but the other 99.99999% of the population couldn’t care less. The sheer amount of data and virtual machines has made it a daunting task, and to be honest, after a couple of years of hunting for peaks of IOps, it quickly loses its glamour.
Fast forward to systems like this Tintri VM-aware storage: installation is just a matter of racking the system. Troubleshooting is either easy because of a much more advanced GUI, or even automatic thanks to the background analytics that will just show you what to do. You just make sure the action isn’t completely bogus and click “execute”.
Tintri isn’t unique in this: many different vendors are making storage simpler and more automated. If you’ve followed the news lately, hyper-converged infrastructure (HCI) is the new cloud. So what does this mean for the die-hard storage admin who dreams in IOps and QoS? They won’t go away overnight: there are too many legacy storage systems out there.
But in a couple of years I expect plenty of mid-size customers to consolidate their storage and hypervisor teams. Fellow SFD10 delegate Max has written an interesting post: Storage Administrators: an endangered species? And I think he could be right: maybe not for the large companies that can afford to keep a dedicated storage team, but the mid-size and smaller companies will probably consolidate teams. Adapt, or…
Make sure to check out the Storage Field Day 10 videos over here, and it’s also worthwhile to read Dan’s take. I’ll be keeping an eye on Tintri; I like where they are heading. There’s plenty of competition (as Enrico points out), but they are making storage simpler. And who doesn’t like that?
Disclaimer: GestaltIT paid for the flight, hotel and various other expenses to make it possible for me to attend SFD10. I was however not compensated for my time and there is no requirement to blog or tweet about any of the presentations. Everything I post is of my own accord.