Hardware has set the pace for latency, time for software to catch up

I can’t recall the last storage system installation that didn’t have some amount of solid state drives in its configuration. And for good reason: we’re rapidly getting used to the performance benefits of SSD technology. Faster applications usually result in real business value. The doctor treating patients can get his paperwork done faster and thus has time for more patients in a day. Or the batch job processing customer mailing lists or CGI renderings completes sooner, giving you a faster time to market.

To reduce application wait times even further, solid state drives need to achieve even lower latencies. Just reducing the media latency won’t cut it anymore: the software component in the chain needs to catch up. Intel is doing just that with the Storage Performance Development Kit (SPDK).

Whack a mole

Flash media have given us an enormous performance boost compared to traditional spinning media. Historically, systems that were designed for spinning disk were retrofitted to use flash media. The result: disks that could handle the IO load, but a storage processor CPU or bus that was saturated by the increased load. The old CLARiiON systems were a good example. Solve one performance problem, get another one back.

The next generation system would have faster CPUs, or better utilize the available CPU cores through software optimization from single-threaded to multi-threaded applications. We just whacked one mole; when will the next one pop up?

According to Intel at Storage Field Day 11, the next mole will be controller and performance latency. With current NAND flash technologies, controller and protocol latencies are already responsible for 25% of the total latency.

Figure: Intel media latency

We’ve already seen that the move from SAS to NVMe cuts out so much of the controller latency, it’s hardly visible in the graph anymore.
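
To see why a fixed overhead turns into the next mole, here is a rough back-of-the-envelope sketch. It takes the 25% share quoted above as its starting point; the media speedup factors are my own assumptions for illustration, not Intel numbers, and the function name is made up for this post.

```python
def overhead_share(overhead_fraction, media_speedup):
    """Share of total latency taken by controller/protocol overhead
    after the media alone becomes `media_speedup` times faster.

    overhead_fraction: today's overhead share of total latency (0..1).
    media_speedup: hypothetical factor by which media latency shrinks.
    """
    media_fraction = 1.0 - overhead_fraction
    new_total = overhead_fraction + media_fraction / media_speedup
    return overhead_fraction / new_total


# 25% comes from the Intel session; the speedup factors are illustrative only.
for speedup in (1, 2, 10):
    share = overhead_share(0.25, speedup)
    print(f"media {speedup}x faster -> overhead is {share:.0%} of total latency")
```

In this simple model, if the media became ten times faster while the overhead stayed put, that overhead would grow from a quarter to roughly three quarters of the total latency. Real systems are messier than a two-term sum, but it shows why faster media alone stops paying off.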

SPDK and software latency

A lot of storage drivers use a kernel-based, interrupt-driven model. What we learned during the Intel session is that this has a couple of disadvantages.

First of all, interrupts used to be considered fast. That was in the era of 10 ms IO latencies, however, which is a low bar to clear today. With microsecond latencies, interrupts are suddenly slow, and polled mode is much faster.
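
For readers who haven’t seen polled mode before, the idea can be sketched with a toy model. This is not SPDK code: `ToyDevice`, `poll_completions` and `run_polled` are hypothetical names, and the "ticks" stand in for time. The point is only the shape of the loop: the application thread never sleeps waiting for an interrupt; it keeps submitting IO and checking for finished IO itself.

```python
from collections import deque


class ToyDevice:
    """Hypothetical device: every IO completes a fixed number of ticks after submission."""

    def __init__(self, ticks_per_io):
        self.ticks_per_io = ticks_per_io
        self.now = 0
        self.inflight = deque()  # (finish_tick, io_id) in submission order

    def submit(self, io_id):
        self.inflight.append((self.now + self.ticks_per_io, io_id))

    def tick(self):
        self.now += 1

    def poll_completions(self):
        """Non-blocking check: return the IOs that are finished right now."""
        done = []
        while self.inflight and self.inflight[0][0] <= self.now:
            done.append(self.inflight.popleft()[1])
        return done


def run_polled(device, io_ids, queue_depth):
    """Keep up to `queue_depth` IOs in flight and poll for completions in a loop."""
    pending = deque(io_ids)
    completed = []
    outstanding = 0
    while pending or outstanding:
        # Top up the queue without waiting for anything.
        while pending and outstanding < queue_depth:
            device.submit(pending.popleft())
            outstanding += 1
        device.tick()                      # time passes
        done = device.poll_completions()   # ask the device; no interrupt, no sleep
        outstanding -= len(done)
        completed.extend(done)
    return completed


device = ToyDevice(ticks_per_io=3)
print(run_polled(device, list(range(5)), queue_depth=2))
```

The trade-off is that a polling thread keeps a core busy even when nothing completes, which only makes sense when completions arrive so quickly that the cost of an interrupt and a context switch is large compared to the IO itself.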

Placing drivers in the kernel and switching between kernel and user mode is also expensive latency-wise. It has a commercial side to it as well: code that goes into the Linux kernel generally has to be released under the GPL, whereas a user-space driver leaves the licensing choice to the vendor.

To get an idea of the full extent of the SPDK innovations, make sure to watch the video of Jonathan Stern’s presentation. One clear benefit is scalability: with SPDK, Intel is able to push over 3.5 million IOPS through a single Intel Xeon core, whereas the NVMe driver in the Linux kernel only manages around 500K IOPS. And keep in mind these IOs are seeing lower latencies at the same time!
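
To put those two numbers in perspective, it helps to turn them into a per-IO time budget. A minimal sketch, assuming the core is fully busy at the quoted rates and treating "over 3.5 million" and "around 500K" as exact values; the helper name is my own.

```python
def ns_per_io(iops):
    """Average core time available per IO, in nanoseconds, at a given IOPS rate."""
    return 1e9 / iops


spdk_iops = 3_500_000   # "over 3.5 million" per core, from the presentation
kernel_iops = 500_000   # "around 500K" for the Linux kernel NVMe driver

print(f"SPDK:   {ns_per_io(spdk_iops):.0f} ns of core time per IO")
print(f"kernel: {ns_per_io(kernel_iops):.0f} ns of core time per IO")
print(f"ratio:  {spdk_iops / kernel_iops:.1f}x")
```

That works out to roughly 286 ns of core time per IO for SPDK against about 2,000 ns for the kernel driver, a factor of seven on the same core.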

Figure: NVMe scalability with Intel SPDK

Intel also optimized part of the iSCSI protocol. Here, switching from the Linux kernel-based driver to the Intel SPDK results in roughly a 2x efficiency win. The next bottleneck is the kernel network stack, which consumes more than 70% of the CPU cycles. Jonathan hinted that Intel is working on a solution (“There are some ideas”), so let’s see what the team comes up with.

My thoughts on Intel’s SPDK

From an administrative perspective, it’s beneficial to have enough performance in your system. Time spent locating the source of storage problems is time not spent optimizing your environment and planning for the future. Undersizing your storage system could directly cost you money: I don’t mind getting paid to troubleshoot performance problems, but extended troubleshooting and optimization sessions spanning days or weeks can sometimes be avoided more cheaply with a couple of extra SSDs.

Most people who talk SSDs like to talk throughput. “Hey, this SSD can do so many IOPS, so we can use fewer drives to handle the workload!” The next performance battle will be one of latencies, though. Faster media like 3D XPoint will be one piece of the puzzle; software optimizations like the ones coming out of the Intel SPDK will be another.

At Storage Field Day 8, Coho Data mentioned the performance potential of NVMe devices and the CPU and NIC saturation problems that came along with it. Since Coho Data and Intel work closely together, I’m curious to see how both companies work together in leveraging the power of the SPDK in building faster storage systems.

Make sure to watch the rest of the Storage Field Day 11 videos from Intel. James Green has recorded a podcast together with J Metz, which you can find here. Last but not least, Max sees SPDK as the foundation for a new generation of storage systems.

Disclaimer: I wouldn’t have been able to attend Storage Field Day 11 without GestaltIT picking up the tab for the flights, hotel and various other expenses like food. I was however not compensated for my time and there is no requirement to blog or tweet about any of the presentations. Everything I post is of my own accord and because I like what I see and hear. Many thanks to Chris for helping with a catchy title for this post.