There’s no denying that off-premises cloud services are growing. Just look at the year-to-year growth of big public cloud providers. There’s big potential if you focus on two aspects of cloud. The first is speeding up access to data that is potentially not located in the same city or even geographic area. The second is supporting new protocols and storage methodologies that are suited for cloud native applications. One player in this area of IT is Avere, which aims to connect on-premises storage and compute to their siblings in the cloud.
I met Avere several years ago. They started out with NAS optimization: accelerating access from on-premises compute to on-premises storage systems, typically slow NAS filers. By inserting an Avere caching layer between the compute and the storage, performance was improved without investing massive amounts of money in the storage layer itself. Typical read:write IO ratios in the field are 70:30, so if you can absorb a large portion of those reads in a cache, that helps considerably.
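To make that concrete, here's a back-of-the-envelope sketch of how much backend load a read cache absorbs, using the 70:30 read:write split from above and an assumed (hypothetical, not Avere-published) 90% cache hit rate, with writes passing straight through to the filer:

```python
# Back-of-the-envelope: how much of the IO still reaches the backing filer
# when a read cache sits in front of it. The 70:30 read:write split comes
# from the text; the 90% hit rate is an illustrative assumption.

def backend_load(read_fraction: float, cache_hit_rate: float) -> float:
    """Fraction of total IO that still reaches the backing filer."""
    reads_missed = read_fraction * (1 - cache_hit_rate)  # cache misses
    writes = 1 - read_fraction                           # write-through
    return reads_missed + writes

load = backend_load(read_fraction=0.70, cache_hit_rate=0.90)
print(f"Backend sees {load:.0%} of the original IO")  # 37% with these numbers
```

In other words, with these assumed numbers the filer only sees roughly a third of the original load, which is why a caching layer can postpone a storage upgrade.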
The next step for Avere was object storage translation. Avere would take storage from an object storage system and translate it into a NAS file system that clients and compute resources could use. Useful if you hadn't filled up your object store completely and still wanted to get a return on your investment.
After this, Avere joined the cloud movement and built a cloud storage gateway, which connected cloud storage to on-premises compute. Vice versa, it enabled cloud bursting, in which on-premises storage was made available to cloud-based compute. Finally, there's pure cloud NAS: cloud compute connected to cloud storage.
Media and Entertainment
So who could benefit from this technology? Avere showcased several Media and Entertainment use cases. The Pixars of this world need large amounts of compute for short periods of time to render their CGI and animated scenes. It doesn’t make sense for these companies to have all this compute in their own datacenters: the majority of the time it would sit idle, not generating any money.
In these cases, an Avere storage gateway allows them to keep their source data on-site, pull some compute power out of a wall somewhere (that’s a Dutch thing right there), and hand the processing power back once rendering has finished.
Caching vs Tiering
We had a roundtable discussion with several delegates about tiering vs caching. What’s the difference? Here’s a quick summary: tiering moves data, for example to higher-performance (faster) or better-economy (cheaper) storage, while caching copies (parts of) the data to typically faster storage. Think caching if you want to speed up a system, and tiering if you also want to improve the TCO of a system.
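The move-vs-copy distinction can be illustrated with a toy sketch (the block names and two dictionaries standing in for storage tiers are made up):

```python
# Toy illustration: caching *copies* data to the fast store,
# tiering *moves* it there. "slow" and "fast" stand in for the
# backing filer and the acceleration tier.

slow = {"A": b"cold data", "B": b"warm data"}
fast = {}

# Caching block A: a copy. A now lives in both stores, so losing
# the fast store loses nothing.
fast["A"] = slow["A"]

# Tiering block B: a move. B now lives only in the fast store, so
# losing the fast store would lose B.
fast["B"] = slow.pop("B")

print("A" in slow, "B" in slow)  # True False
```

This is exactly why the failure behaviour discussed below matters: a pure cache can die without data loss, a true tier cannot.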
Avere advertises their systems as tiering; however, this is not entirely true according to that definition. The best example is what happens if an Avere cluster blows up entirely: your data is still safe. You simply connect a new Avere system to the S3-based config data bucket and repopulate the system from the underlying storage systems. Hence there’s an extra layer in your storage stack (with the mathematically slightly lower total availability that comes with adding components in series), but your data is always completely safe.
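That "mathematically slightly lower total availability" claim is just the serial-availability product: components in series multiply their availabilities. The figures below are illustrative assumptions, not Avere specs:

```python
# Components in series multiply their availabilities, so adding a layer
# can only lower (or at best preserve) total availability. The "four
# nines" figures are illustrative assumptions, not vendor numbers.

def serial_availability(*components: float) -> float:
    """Combined availability of components stacked in series."""
    total = 1.0
    for availability in components:
        total *= availability
    return total

filer = 0.9999                                    # assumed backing NAS
without_cache = serial_availability(filer)
with_cache = serial_availability(filer, 0.9999)   # add an equally reliable layer
print(f"{without_cache:.6f} vs {with_cache:.6f}")
```

The drop is tiny with highly available components, but it is there, which is why "your data is safe, but the path to it has one more link" is the honest way to describe the trade-off.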
Avere C2N Cloud-core NAS system
Apart from a caching layer, Avere now also makes a complete storage solution of its own. The system can be bought in either an erasure-coding configuration with a minimum of 6 nodes, or a triple-replication (mirroring) configuration with a minimum of 3 nodes. List pricing is somewhere in the region of 26 cents per raw GB.
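Since the price is quoted per raw GB, the protection scheme determines what you pay per usable GB. The exact erasure-coding layout isn't given, so a 4-data + 2-parity split is purely my assumption for illustration; triple replication keeps three full copies:

```python
# Usable-capacity and cost-per-usable-GB comparison of the two C2N
# configurations. The 4+2 erasure-coding layout is an assumption for
# illustration (the post doesn't specify the scheme); the 26c/raw-GB
# list price comes from the post.

def usable_fraction_ec(data_shards: int, parity_shards: int) -> float:
    # Only the data shards count toward usable capacity.
    return data_shards / (data_shards + parity_shards)

def usable_fraction_replication(copies: int) -> float:
    return 1 / copies

PRICE_PER_RAW_GB = 0.26

for name, frac in [("EC 4+2", usable_fraction_ec(4, 2)),
                   ("3x replication", usable_fraction_replication(3))]:
    print(f"{name}: {frac:.0%} usable, "
          f"${PRICE_PER_RAW_GB / frac:.2f} per usable GB")
```

Under these assumptions, erasure coding yields roughly twice the usable capacity of triple replication for the same raw spend, which is the usual reason vendors reserve replication for small clusters.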
Dave Henry has written an extensive post on the C2N system, so do check it out.
My thoughts on Avere’s products
Public cloud is growing but will not completely destroy on-premises systems. I’ve written a lengthy post about that over here. There are still several challenges to overcome, one of them being connectivity to the public cloud. If you’ve got a low-bandwidth, high-latency link, you’ll have complaining users if you force a cloud approach on them. Avere systems can ease the move towards the cloud and even support a cloud-only approach: they either make the migration graceful, or they speed up technically difficult cloud implementations (e.g. low bandwidth availability) enough to tip them into the “possible” range.
With regard to the tiering and caching discussion: I think the distinction matters. While the Avere product is pretty promising in itself, I think true policy-based tiering would definitely be beneficial. File systems usually contain a lot of inactive data. If you could tier the active data to the on-premises Avere system and tier the inactive data to the cloud (S3, Glacier, whatever), that would give you three benefits:
- Less WAN utilization: admittedly, caching already achieves this as well.
- Higher performance: active data (e.g. files younger than 180 days) is usually local.
- Potentially lower retrieval costs for cloud storage: some cloud tiers charge per GB retrieved, so by minimizing that I/O, you save money.
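A policy rule like the one described above could be sketched as follows. The 180-day threshold comes from my example; the tier names are made up and not Avere policy settings:

```python
# Sketch of a policy-based tiering rule: files active within the last
# 180 days stay on the local tier, older files go to a cloud tier.
# Tier names and the threshold are illustrative assumptions.

import time

ACTIVE_WINDOW_DAYS = 180
SECONDS_PER_DAY = 86400

def pick_tier(last_access_epoch: float, now: float) -> str:
    """Return the tier a file belongs on, based on its last access time."""
    age_days = (now - last_access_epoch) / SECONDS_PER_DAY
    return "local-avere" if age_days < ACTIVE_WINDOW_DAYS else "cloud-s3"

now = time.time()
print(pick_tier(now - 10 * SECONDS_PER_DAY, now))    # local-avere
print(pick_tier(now - 400 * SECONDS_PER_DAY, now))   # cloud-s3
```

A real implementation would of course run such a rule periodically against file metadata and move data asynchronously, but the policy itself can stay this simple.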
It was surprising to see that the Avere interface is still an oldskool “throw as many stats at the admin as possible” GUI, whereas Storage Field Day 10 was all about “here’s a single button you can click and the system will take care of itself”. I strongly believe in a middle ground: storage systems should automate and optimize as much as they can, but should also provide enough information back to the admins to facilitate troubleshooting.
Disclaimer: I wouldn’t have been able to attend Storage Field Day 11 without GestaltIT picking up the tab for the flights, hotel and various other expenses like food. I was however not compensated for my time and there is no requirement to blog or tweet about any of the presentations. Everything I post is of my own accord and because I like what I see and hear.