A long, long time ago when public cloud IaaS (Infrastructure as a Service) was still relatively new I was doing some contract work for a big international company. One of the tasks for the department was an IaaS proof of concept: does offloading servers to the public cloud result in cost savings? Long story short: the PoC was halted after several months because the AWS IaaS offering was prohibitively expensive and inflexible. We had enough systems and data to keep a team of qualified engineers busy and to get good purchasing discounts. An additional problem was the very rigid service catalog: there weren’t that many flavors of machines available back then and custom machines were either not possible or even more expensive.
Fast forward to 2016 and I’m looking into public/private/on-premises IaaS again for a different company. Prices have dropped, but there are still some things to keep in mind when considering a move to the public cloud IaaS models.
I’ve previously written a 101 cloud computing post with some of the basic cloud terminology. For the purpose of this post, private cloud will be off-premises private cloud in a service provider data center. Public cloud is an IaaS or virtual machine offering from one of the three big providers; pick the one you like most.
Burst vs 24/7
If you need temporary computing power, the public cloud is ideal. You pay per hour the instance is switched on. As soon as you turn it off, you no longer pay for the computing power, only for the storage in use. Especially if you only need a server for 1 week per month (for example for reporting/rendering purposes), purchasing hardware yourself is way, way more expensive.
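To make that concrete, here's a back-of-the-envelope comparison using a made-up €0.50/hr instance rate (illustrative only, not any provider's actual price list):

```python
# Hypothetical hourly rate; real prices vary per provider and flavor.
rate_eur_per_hour = 0.50

burst_hours = 7 * 24             # powered on one week per month
always_on_hours = 30 * 24        # powered on the whole month

burst_cost = rate_eur_per_hour * burst_hours          # €84/month
always_on_cost = rate_eur_per_hour * always_on_hours  # €360/month

print(f"burst: €{burst_cost:.0f}/month, always-on: €{always_on_cost:.0f}/month")
```

Roughly a 4x difference for the exact same machine, which is the gap pay-per-hour pricing exploits in your favor.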
However, if your virtual machines run 24/7/365, owning the hardware leverages the investment to the max. During the night there are usually maintenance or backup jobs running on your application servers, so even for a company that's not global, your systems will still be doing something useful overnight.
Ask yourself: are your software and organization mature enough to switch off unused virtual machines or to automatically power down unused cluster nodes? For most organizations I run into: nope. Just scan Twitter for the "Whoops, I left my AWS machines running and blew my budget overnight"-style tweets and you've got some good reading material. Even the pros who work with cloud all day get it wrong sometimes…
Cisco presenter just mentioned turning off AWS VMs to control cost. Forgot I left a load running! #CFD1 @RayLucchesi pic.twitter.com/UkezyXOnvx
— Nigel Poulton (@nigelpoulton) September 14, 2016
Software to automatically power off machines does exist. VMware DPM, for example, powers off ESXi hosts when cluster utilization is low and powers them back on when more capacity is needed. Surprise: I have seen it in production use exactly zero times over the last three years. I'm not sure there's even a similar technology to scale down Citrix/SQL clusters during the night, but I would be surprised if any organization is using it.
Is it far-fetched to think the majority of basic company utility servers are powered on 24/7/365? I think not. And thus one of the big benefits of public cloud (pay as you go) is actually not that much of a benefit for your day-to-day workloads.
Standardization
Virtual machine instances come in predefined sizes and configurations. This is part of the reason why public cloud is cheap: standardized solutions are easier to configure and support than custom-made ones. The problem with standard solutions is: what do you do with the machines that don't fit in a standard slot?
Public cloud providers predefine a number of virtual machine flavors, with various ratios of CPU, RAM and storage. For example:
- 8 CPU + 56GB RAM + 400GB disk
- 16 CPU + 112GB RAM + 800GB disk
So what if you wanted a machine that had 8 CPUs and 112GB of RAM? Or 12 CPU and 70GB of RAM? It’s not on the menu, so you would have to either downscale or upscale. Upscaling means paying more money to the cloud provider with no business benefit, downscaling means a raging app owner at your desk…
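A quick sketch of the capacity you pay for but don't use when you're forced to upscale, using the 12 CPU / 70GB requirement from the text and the nearest fitting flavor from the example list:

```python
# Hypothetical requirement from the example: 12 vCPUs and 70 GB RAM.
need_cpu, need_ram_gb = 12, 70

# Nearest flavor that fits, from the example list: 16 CPU + 112 GB RAM.
flavor_cpu, flavor_ram_gb = 16, 112

cpu_waste = 1 - need_cpu / flavor_cpu        # 0.25  -> 25% of vCPUs unused
ram_waste = 1 - need_ram_gb / flavor_ram_gb  # 0.375 -> 37.5% of RAM unused

print(f"unused vCPU: {cpu_waste:.0%}, unused RAM: {ram_waste:.1%}")
```

You pay for the whole flavor, so a quarter of the CPU and over a third of the RAM is money spent with no business benefit.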
With smaller machines the difference between flavors is less pronounced and the losses are smaller. However, you'll likely have a whole bunch of those small virtual machines, so the cumulative losses still add up. With bigger machines, the differences are more apparent.
Then there are the exceptionally sized virtual machines. You'll pay through the nose for these. A single G5 machine with 32 CPUs, 448GB of RAM and 6.1TB of storage will cost you roughly €9.36 per hour (Azure, at the time of writing this post). A wonderful machine for database workloads, but it will cost you a whopping €417k over 5 years (a standard depreciation period). Last time I checked, that buys you a lot of hardware and maybe even an IT engineer to monitor it. Sure, there are more hidden costs involved in running an IaaS infrastructure. But if you need two or three of those machines, you've got some budget to play with in building your own private cloud!
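A sanity check of that 5-year figure, using the ~€9.36/hr rate quoted above (the €417k in the text presumably reflects a slightly different hour count or rounding):

```python
hourly_rate_eur = 9.36           # Azure G5 price quoted above, at the time of writing
hours_in_5_years = 24 * 365 * 5  # 43,800 hours, ignoring leap days

five_year_cost = hourly_rate_eur * hours_in_5_years
print(f"≈ €{five_year_cost:,.0f} over 5 years")  # lands in the €410k ballpark
```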
Legal restraints
Compared to the US, we in Europe still have a lot of privacy laws that differ per country. Some countries (Germany and Norway are notable examples) place seriously stringent restrictions on where sensitive information about their inhabitants can go. Sometimes that means it can't leave the country.
The big cloud providers are building data centers like crazy, so there is likely already a data center in your country, or one is on the roadmap. If there isn't, however, you will still need some local footprint.
And if I place my Dutch data on an Irish server that's owned by an American company, does that mean the US government can subpoena it? And what happens with all the audits that are periodically run on financial and healthcare organizations?
Connectivity
We’re pretty spoiled in the Netherlands. We’ve got so much fiber in the ground, we can get a dark fiber or managed ethernet connection between pretty much every part of the country. And we can get it cheap.
Move over to the Middle East, Sweden or the Balkans and you've got a different situation. Stable, fast and secure connectivity to the cloud is expensive or not available at all. Maybe you still need a local footprint in the building to keep your people operational when the link goes down. This is a cost you should factor in when deciding whether a move to the cloud is feasible, but admittedly, this is the same for both public and private off-premises IaaS.
My thoughts on public cloud
First of all: I'm not bashing the public cloud for IaaS. It's simply brilliant for temporary workloads. You can easily and quickly consume some compute power for a short period and tear it down again when it's no longer needed. Since there are millions of instances running in the public clouds, your extra 100 virtual machines aren't even noticed; they're compensated for because someone else just turned a few machines off.
If you’re a tiny company that doesn’t need many servers, or a start-up that needs maximum agility: creating them in public cloud is fast and cheap. This saves you the investment in a suitable computer room and an IT guy/gal, and allows you to focus on your core business. Plus it’s an OPEX spend model instead of CAPEX, which might be just what you can afford.
And if you can move up one more step into PaaS/SaaS solutions, you can offset some of the additional spend on cloud services with reduced OPEX on application specialists.
This craze however, which goes something like:
Oh my god, the public cloud is going to consume EVERYTHING. All these companies that are making hardware/software should just pack up!
Nope. Not soon, and in my opinion: never. There will always be customers that need flexibility or performance that the public cloud can’t offer. Or that have sufficient scale to be able to do it cheaper themselves, at least for the base loads. These customers will still benefit from a private cloud.
If you need to move your existing environment to the public cloud, don’t just pick up your existing virtual machines. Instead, run a rationalization project first to resize machines where possible and maybe even check them for relevance.
Leveraging the strengths of both public and private cloud in a hybrid cloud, moving workloads between these distinctly different operating models: that's the big step for the next couple of years. Run your 24/7/365 systems in a private cloud. Run your temporary workloads in the public cloud.
The next big challenge will be rebuilding your environment and educating your admins to actually switch off machines when no longer needed…