Consistency and predictability matter. You expect Google to answer your search query within a second. If it takes two seconds, that is slow but ok. Much longer and you will probably hit refresh because ‘it’s broken and maybe that will fix it’.
There are many examples that could substitute the scenario above. Starting a Netflix movie, refreshing your Facebook timeline, or powering on an Azure VM. Or in your business: retrieving an MRI scan or patient data, compiling a 3D model, or listing all POs from last month.
Ensuring your service can meet this demand of predictability and consistency requires a multifaceted approach, both in hardware and procedures. You can have a modern hypervisor environment with fast hardware, but if you allow a substantially lower spec system in the cluster, performance will not be consistent. What happens when a virtual machine moves to the lower spec system and suddenly takes longer to finish a query?
Similarly, in storage, tiering across different disk types helps improve TCO. However, what happens when data trickles down to the slowest tier? Achieving that lower TCO comes with the tradeoff of less latency predictability.
These challenges are not new. If they impact user experience too much, you can usually work around them. For example, ensure your data is moved to a faster tier in time. If you have a lot of budget, maybe forgo the slowest & cheapest NL-SAS tier and stick to SAS & SSD. But what if the source of the latency inconsistency is something internal to a component, like a drive?