Dr. J. Metz talked with us about NVMe at Storage Field Day 16 in Boston. NVMe is rapidly becoming the new hype in the storage infrastructure market. A few years ago, everything was cloud. Vendors now go out of their way to mention that their array contains NVMe storage, or is at the very least ready for it. So should you care? And if so, why?
SNIA’s mission is to lead the storage industry worldwide in developing and promoting vendor-neutral architectures, standards and educational services that facilitate the efficient management, movement, and security of information. They do that in a number of ways: standards development and adoption for one, but also through interoperability testing (a.k.a. plugfest). They aim to help in technology acceleration and promotion: solving current problems with new technologies. So NVMe-oF fits this mission well: it’s a relatively new technology, and it can solve some of the queuing problems we’re seeing in storage nowadays. Let’s dive in!
NVMe vs NVM Express vs NVMe-oF
First of all: let's get some terminology straight.
- NVMe is the technology
- NVM Express, Inc. is the organization behind the NVMe specification.
- NVMe-oF or NVMe over Fabrics is the extension to the base NVMe specification that enables NVMe to be accessible beyond PCIe limitations (read: distance). And very importantly: it’s transport agnostic, meaning it is not pinned to a single transport mechanism.
So why is NVMe hot? Solid state media is becoming faster. Latencies are decreasing and the new media behaves very differently compared to traditional spinning disk or tape devices. In effect, storage is becoming a lot more like memory. That means we need a newer, better way of accessing it. The alternatives, for example SCSI, just add too much latency and cannot handle the increase in parallelism.
NVMe achieves this by accessing the solid state media over the PCI Express (PCIe) bus, using the NVMe command set. This removes the need for a translation step through a SCSI or SATA controller.
NVMe uses queues: one admin queue and multiple I/O queues. These I/O queues can be pinned to individual CPU cores, and this is where the scalability and parallelism come into play. Each host/controller pair also has its own independent set of NVMe queues, which operate autonomously.
Each of these queues is in fact a queue pair: a submission queue and a completion queue. The host places a command in the submission queue, then notifies the controller that there's a command waiting by writing to the doorbell register. Or, in Dr. J. Metz's words: "Ding, order up!". The NVMe controller then fetches the command from the submission queue into its own memory buffer. If any data needs to be transferred, that happens at this stage; if it doesn't (for example, with a flush), that's it. The NVMe controller then posts a completion queue entry back to the host and the relevant pointers are updated. That's one I/O done.
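To make that queue-pair dance a bit more concrete, here's a minimal Python sketch of the flow described above. It's purely a toy model of my own making, not real driver code: the QueuePair, Host and Controller classes and the doorbell counter are simplifications, and a real controller works with memory-mapped registers and DMA rather than Python method calls.

```python
from collections import deque

class QueuePair:
    """One NVMe queue pair: a submission queue (SQ) and a completion queue (CQ)."""
    def __init__(self, qid):
        self.qid = qid
        self.sq = deque()   # submission queue entries (SQEs)
        self.cq = deque()   # completion queue entries (CQEs)
        self.doorbell = 0   # simplified stand-in for the SQ tail doorbell register

class Controller:
    def process(self, qp):
        while qp.sq:
            cmd = qp.sq.popleft()      # 3. controller fetches the SQE into its own memory
            # (any data transfer for reads/writes would happen here)
            qp.cq.append({"cid": cmd["cid"], "status": "success"})  # 4. post the CQE

class Host:
    def __init__(self, controller, num_cores):
        self.controller = controller
        # one admin queue pair, plus one I/O queue pair per CPU core
        self.admin_qp = QueuePair(qid=0)
        self.io_qps = [QueuePair(qid=i + 1) for i in range(num_cores)]

    def submit(self, qp, command):
        qp.sq.append(command)          # 1. host places the command in the SQ
        qp.doorbell += 1               # 2. host rings the doorbell ("Ding, order up!")
        self.controller.process(qp)    # controller notices the doorbell write

    def reap_completions(self, qp):
        while qp.cq:
            cqe = qp.cq.popleft()      # 5. host reads the CQE and updates its pointers
            print(f"qid={qp.qid}: command {cqe['cid']} completed, {cqe['status']}")

ctrl = Controller()
host = Host(ctrl, num_cores=4)
qp = host.io_qps[2]                    # e.g. the queue pair pinned to core 2
host.submit(qp, {"cid": 1, "opcode": "read", "lba": 0, "blocks": 8})
host.reap_completions(qp)
```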
NVMe-oF: multiple transport mechanisms
NVMe over Fabrics extends the base NVMe specification and allows you to access NVMe devices that are not local to the system. There are multiple transport mechanisms available, such as RDMA, Fibre Channel and TCP.
There are a few key differences between NVMe and NVMe-oF, for example the one-to-one mapping between submission queues and completion queues, and the fact that a controller is only associated with one host at a time. Dr. J. Metz does a much better job of explaining the differences though, and he also dives into the transport protocol differences. Check out the video if you want the explanation from the original source!
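That first difference is easy to show in the same toy-model style: over local PCIe, multiple submission queues may post their completions to a shared completion queue, while over Fabrics each submission queue gets its own completion queue. The snippet below is again my own illustration, not anything from the spec or the presentation.

```python
def queue_mapping_is_valid(sq_to_cq, fabrics):
    """sq_to_cq maps submission queue IDs to the completion queue they post to."""
    if fabrics:
        # NVMe-oF: every submission queue must pair 1:1 with its own completion queue
        cq_ids = list(sq_to_cq.values())
        return len(cq_ids) == len(set(cq_ids))
    # Local PCIe NVMe: several SQs are allowed to share one CQ
    return True

mapping = {1: 1, 2: 1, 3: 2}                     # SQ 1 and SQ 2 share CQ 1
print(queue_mapping_is_valid(mapping, fabrics=False))  # True: fine over PCIe
print(queue_mapping_is_valid(mapping, fabrics=True))   # False: not 1:1, invalid over Fabrics
```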
My thoughts on SNIA’s presentation
Yes, you should care about NVMe. Storage media are becoming faster, parallelism is improving, and latencies are dropping through the floor. To utilize their full potential, we need smarter ways to access them. But you also want to access this media outside of a single box, for all the reasons you can think of. That's why NVMe and NVMe-oF were invented. A great thing about NVMe-oF is that it's transport agnostic. Which transport mechanism you choose depends on your use case. Don't simply jump on RDMA or TCP, because each transport mechanism has its own limitations and use cases. For example, TCP offers great flexibility, but it's not the best option when you need the lowest latency.
I loved the presentation from SNIA. As an organization, they've presented at Storage Field Day before. They consistently do a good job explaining market developments or things I didn't even know about (like tail latency). And they approach it all in a relatively unbiased way. It is refreshing to just get an objective explanation of technology, without all the "we're better than them"-marketing BS.
I learned a lot from this session. I knew the basics (why and what), but not the how. Check out these posts from Matt and Keiran to get their opinions on how SNIA did and what they think of NVMe. And if you want to get a glimpse of what it's like to present to a Tech Field Day audience, go read this write-up from Dr. J. Metz himself!
Disclaimer: I wouldn’t have been able to attend Storage Field Day 16 without GestaltIT picking up the tab for the flights, hotel and various other expenses like food. I was however not compensated for my time and there is no requirement to blog or tweet about any of the presentations. Everything I post is of my own accord and because I like what I see and hear.