Bringing hyperscale operations to the masses with Datera

Datera was founded in 2013 with a clear mission: bringing hyperscale operations and economics to private clouds. Hyperscalers such as Facebook and Google don’t manage individual pieces of hardware; instead, they define policies and let “the system” decide where to spin up an app or place data. This means a single admin can manage far more servers or storage. So why is this level of automation only used by the big corporations? Datera aims to change that!

Datera officially launched in April 2016 from Sunnyvale in Silicon Valley, with 50+ employees. A month later we received a briefing at Storage Field Day 10. At that point they had four announced customers and roughly ten more that hadn’t been announced yet.

With hyperscale economics and operations, you basically want to manage more equipment (in this case storage) with fewer people/FTEs. You can optimize processes, standardize operations, etc., but what it really boils down to is automation. Datera calls this Application Intent: the admin defines the storage characteristics the application needs, and the Datera Elastic Data Fabric will automatically select the correct type of storage.

Architecture of the Datera Elastic Data Fabric

The Datera Elastic Data Fabric is made up of software-defined storage nodes. You can mix all-flash, hybrid and HDD nodes in the same fabric. Licensing is capacity-based, in 50TB or 100TB increments. The deployment is fully featured: snapshots, clones, thin provisioning and replication (1:5 within the cluster) are all included. The front-end protocol is iSCSI; other protocols haven’t been requested by customers yet.

[Image: Datera GUI]

Instead of a software-only deployment you can also order a software + hardware combination; currently you’ll get a SuperMicro server, but Datera is working on adding additional vendors to the mix.

Backend connectivity between the scale-out nodes is currently 40Gbit/s, with front-end connectivity towards the clients at 10Gbit/s. The SDS approach means it’s pretty easy to incorporate new technologies like 3D XPoint or other flash technologies into the Datera Elastic Data Fabric.

The Datera control plane is distributed across all storage nodes: there’s no dedicated control/management node. During deployment you configure one node with all the required settings (such as connect-home details, IP addresses, etc.). Each subsequent node performs a broadcast and retrieves its configuration from that first node.
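Datera didn’t share the details of this discovery handshake, but as a minimal sketch of how broadcast-based bootstrapping can work in general (the port, message format and function names below are my own assumptions, not Datera’s protocol):

```python
# Hypothetical sketch of broadcast bootstrap; the port and wire format are
# invented for illustration and are NOT Datera's actual protocol.
import json
import socket

DISCOVERY_PORT = 51000  # assumed port

def serve_config(config: dict) -> None:
    """Runs on the first, manually configured node: answers broadcasts."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", DISCOVERY_PORT))
    while True:
        data, addr = sock.recvfrom(1024)
        if data == b"DISCOVER_CONFIG":
            sock.sendto(json.dumps(config).encode(), addr)

def fetch_config(timeout: float = 5.0) -> dict:
    """Runs on every subsequent node: broadcast, then wait for the reply."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.settimeout(timeout)
    sock.sendto(b"DISCOVER_CONFIG", ("<broadcast>", DISCOVERY_PORT))
    data, _ = sock.recvfrom(4096)
    return json.loads(data)
```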

Every node reports its characteristics back to the cluster. This metadata is continuously analyzed by the Optimizer in the control plane to determine whether data is still stored on the optimal node.
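To illustrate the idea (this is my own toy construction, not Datera’s actual algorithm), an optimizer pass could compare each volume’s policy against the characteristics its current node reports, and flag anything that no longer matches:

```python
# Toy placement check; the data model is invented purely to illustrate
# "re-evaluate placement against reported node characteristics".
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    media: str      # "flash", "hybrid" or "hdd", as reported by the node
    free_tb: float  # reported free capacity

@dataclass
class Volume:
    name: str
    wants_media: str  # media class demanded by the volume's policy
    node: Node        # where the data currently lives

def misplaced(volumes: list[Volume]) -> list[Volume]:
    """Return the volumes whose current node no longer matches their policy."""
    return [v for v in volumes if v.node.media != v.wants_media]
```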

Intent provisioning

The policy engine in the Datera Elastic Data Fabric is responsible for provisioning volumes based on the application intent. This intent consists of application policies or templates (see the sketch after this list), which contain:

  • Management policies: QoS (Gold/Silver/Bronze), data retention, data protection and data placement.
  • Storage templates: the number of volumes and their respective sizes: maybe you always want your VMware VMFS volumes to be 2TB-512 bytes (oldskool!).
  • Pools of storage: useful for multi-tenancy environments where you want some sort of isolation (e.g. don’t place the Pepsi data on the same nodes as the Coca Cola data).
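I don’t know the exact schema Datera uses for these templates, but conceptually they could be modelled like this (all field names below are hypothetical, not taken from Datera’s API):

```python
# Hypothetical data model for an application template; field names are
# invented for illustration and do not come from Datera's API.
from dataclasses import dataclass

@dataclass
class ManagementPolicy:
    qos: str             # "gold", "silver" or "bronze"
    replicas: int        # data protection: number of copies in the cluster
    retention_days: int  # data retention for snapshots
    placement: str       # data placement hint, e.g. "all-flash" or "any"

@dataclass
class StorageTemplate:
    volume_count: int    # how many volumes to create
    volume_size_gb: int  # size of each volume

@dataclass
class AppTemplate:
    name: str
    policy: ManagementPolicy
    storage: StorageTemplate
    pool: str            # pool of storage, for tenant isolation
```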

For example, if you need a couple of log volumes for a database server, you could select the “SQL Log volumes” template (which you’d have to create first, of course) and create a number of volumes with the predefined settings (for example QoS policy Gold, three replicas, created in the Coca Cola pool of storage).
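Continuing the hypothetical model sketched above, that “SQL Log volumes” scenario would boil down to something like this (all values illustrative):

```python
# Stamping out volumes from the hypothetical "SQL Log volumes" template.
sql_logs = AppTemplate(
    name="SQL Log volumes",
    policy=ManagementPolicy(qos="gold", replicas=3,
                            retention_days=30, placement="all-flash"),
    storage=StorageTemplate(volume_count=4, volume_size_gb=512),
    pool="coca-cola",
)

for i in range(sql_logs.storage.volume_count):
    print(f"{sql_logs.name}-{i}: {sql_logs.storage.volume_size_gb}GB, "
          f"QoS {sql_logs.policy.qos}, {sql_logs.policy.replicas} replicas, "
          f"pool {sql_logs.pool}")
```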

My thoughts on the Datera product

I’m a strong believer in automation. It allows an administrator to manage more storage/volumes/servers with less work involved. It also ensures consistency in application provisioning: no more mismatches because someone forgets to create a replica or sizes volumes in powers of 1000 instead of 1024.
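To put a number on that last one: for a “2TB” volume the two interpretations differ by almost 10%:

```python
# The 1000-vs-1024 mismatch in numbers.
tb = 2 * 1000**4    # "2TB" in powers of 1000: 2,000,000,000,000 bytes
tib = 2 * 1024**4   # "2TB" in powers of 1024: 2,199,023,255,552 bytes
print(f"difference: {tib - tb:,} bytes ({(tib - tb) / tb:.1%})")
# difference: 199,023,255,552 bytes (10.0%)
```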

If you want some more information on Datera, check out Arjan’s post over here, or Dan’s post over here. And you can always watch the session recordings on the Storage Field Day 10 site.

Disclaimer: GestaltIT paid for the flight, hotel and various other expenses to make it possible for me to attend SFD10. I was, however, not compensated for my time and there was no requirement to blog or tweet about any of the presentations. Everything I post is of my own accord.