How complex is Kubernetes really?

Andrei Dascalu
5 min read · May 3, 2023


I started thinking about this largely after reading this piece. Of course, the title is clickbait, as the obvious TL;DR has nothing to do with Kubernetes but with TCO. It’s also my experience that running Kubernetes on AWS with EKS isn’t just an exercise in frustration but also a recipe for skyrocketing TCO.

But keeping things “cloud agnostic”, particularly with Kubernetes, can be a wise move for cost control. Once you have a delivery setup, you can move to wherever you can get a cluster up and running: reconfigure your application for whatever external dependencies you have and off you go. Big clouds are expensive, but there are plenty of smaller providers that give you Kubernetes on a platter for a fraction of the cost.

Kubernetes handles a lot. So does that make it a complex tool? And if so, how complex is it really?

As a bit of a background, I started working with Kubernetes about 6 years ago. For the past 4 years, I’ve run most projects I got involved with on Kubernetes. While I generally use cloud solutions (EKS, AKS, GKE mainly and a bit of Civo), I’ve done my share of bare metal clusters as well. Around the same time (~4 years ago), I ran through Kubernetes the hard way (by Kelsey Hightower).

Given that, I would like to be in a position to provide a definitive answer, but the truth is that any absolute answer (it’s complex, it’s simple) is meaningless. The perception of Kubernetes has a lot to do with the environment in which it’s meant to be used: provider, team, developers, etc. The perception improves with experience and training — without it, solutions come with dangerous caveats, increased risk and so on. It’s a lot like any other tool: you need to know it just well enough to cover your use cases.

To inspect the complexity of Kubernetes (just like with any other tool), there are at least 2 aspects to consider: the tool itself (what it involves, setting it up and running it) as well as using the tool (the learning curve for a user). These are well captured in the two Kubernetes certifications: CKA (for those who need to set up and manage clusters) and CKAD (for those developing applications meant to run in Kubernetes).

What has Kubernetes ever done for us?

The use of any tool should be evaluated through the benefits gained versus its perceived risks (costs, complexity, etc). For this, it’s important to separate the benefits of containers (having your system dependencies packaged together with the application) from those of Kubernetes (managing running containers).

Containers alone are an insane benefit — speaking from my experience as someone who has had to run PHP applications on bare metal. Installing the right extensions, not being able to reliably run 2 versions side by side (it eventually became possible, but still not easy when you want to scale horizontally), managing updates, scripting it all together to be able to replicate an environment, the occasional OS update that broke everything (because yes, you always want the latest updates on the machine that’s exposed). Nowadays I can upgrade the underlying machine and the containers continue to live happily as they are, 99% of the time.

Kubernetes, on the other hand, makes sure that the containers are up and running: if one goes down, it gets replaced. It wraps containers with configuration provided from various sources, and it ensures the availability of disk volumes.
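
As a minimal sketch of those three duties (all names and values below are hypothetical), a single Deployment manifest covers them: a replica count that Kubernetes keeps honest, configuration wrapped in from a ConfigMap, and a persistent volume it attaches for you:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app                # hypothetical name
spec:
  replicas: 3                   # Kubernetes replaces any pod that dies to keep 3 running
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: demo-app
          image: registry.example.com/demo-app:1.0.0   # hypothetical image
          envFrom:
            - configMapRef:
                name: demo-app-config   # configuration wrapped around the container
          volumeMounts:
            - name: data
              mountPath: /var/lib/demo
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: demo-app-data    # Kubernetes ensures this volume is attached

Delete a pod and Kubernetes recreates it; from the user’s side, that is the whole self-healing story.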

The truly great benefits, though, come from application autoscaling (vertical as well as horizontal), cluster autoscaling (more machines, please), networking isolation (which containers should talk to which) and basic metrics and monitoring. These alone would have meant insane amounts of work in a bare metal setup of the old days. Even in a cloud compute environment it wouldn’t be easy to fully secure private inter-service communication and isolate environments in an auto-scalable way.
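
As a hedged sketch of the first and third of those (reusing the hypothetical demo-app from above), a HorizontalPodAutoscaler scales the Deployment on CPU pressure while a NetworkPolicy restricts which pods may talk to it:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: demo-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add pods when average CPU crosses 70%
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: demo-app-ingress
spec:
  podSelector:
    matchLabels:
      app: demo-app
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: frontend       # only "frontend" pods may reach demo-app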

Serverless doesn’t fully supplant these things. Whether we talk functions, containers as a service or the like, the matter of isolating environments from one another remains, and to do so in a cloud-native way you have to get very cloud-specific. Also, while Kubernetes just runs your app, going serverless means you have to change the design of your application to fit. For example, one recent case had us evaluate moving a small service from AWS ECS to Lambda — but the service was based on a small “framework” which was reading configuration from the environment. In ECS, you just inject values from Parameter Store into a running container’s environment, but in Lambda you need to use the AWS SDK to make API calls to Parameter Store. It’s a significant code change, but it also increases the cost in unpredictable ways, because there’s a price per API call in addition to the execution cost of Lambda.
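
For illustration, here is the ECS side of that contrast as a CloudFormation-style task definition fragment (service name and ARN are hypothetical); the value arrives as a plain environment variable, no SDK call in sight:

ContainerDefinitions:
  - Name: small-service
    Image: registry.example.com/small-service:1.0.0
    Secrets:
      - Name: APP_CONFIG   # injected into the container as an ordinary env var
        ValueFrom: arn:aws:ssm:eu-west-1:123456789012:parameter/small-service/app-config

In Lambda, the equivalent is a GetParameter call in your own code, billed per request.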

Setting Kubernetes up

For this chapter, Kubernetes the hard way provides you with the required knowledge. Setting up the moving parts (etcd, the API server, the scheduler and so on), basically the whole control plane, is a chore.

Of course, in a cloud environment these are handled by the provider. The big clouds charge you for that, while the smaller providers (Civo, Linode, Scaleway, etc.) charge a small flat rate or even nothing at all.

Even in a bare metal setup you can rely on system packages to set it all up for you, but while in the cloud you get some reliability guarantees, on bare metal it really pays off to know each of these components: particularly where to find them and how to debug them.
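
As a sketch of that “system packages” route (version, endpoint and subnet below are hypothetical), a minimal kubeadm configuration stands those components up for you:

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.27.1               # hypothetical version
controlPlaneEndpoint: "10.0.0.10:6443"   # a load balancer in front of the API servers
etcd:
  local:
    dataDir: /var/lib/etcd               # where to look when etcd misbehaves
networking:
  podSubnet: 10.244.0.0/16               # must match your CNI plugin's expectations

You’d feed this to kubeadm init --config; the control plane components then run as static pods, which is exactly where the “know where to find them and how to debug them” part begins.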

If it sounds complex — it’s because it is. Outside of a cloud environment, running Kubernetes needs experienced people; with experience come reduced maintenance time, fewer downtimes and so on. In a cloud environment, those bits are maintained for you, so most of the work required is to plan the cluster so that upgrades won’t deadlock your system (e.g. have enough spare resources so that pods can be moved around as required and your availability isn’t affected), as well as to keep up to date with Kubernetes API changes so that you can notify developers to update their manifests accordingly.
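
Part of that planning can live in the cluster itself. For instance (a sketch, with the same hypothetical names as before), a PodDisruptionBudget tells Kubernetes how much availability it must preserve while nodes are drained during an upgrade:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: demo-app
spec:
  minAvailable: 2          # never evict below 2 running copies during a node drain
  selector:
    matchLabels:
      app: demo-app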

Using Kubernetes

When it comes to using Kubernetes with a proper DevOps mentality (yes, I use DevOps as a culture, not as a fancy synonym for Ops), there’s little complexity involved.

The one thing a developer in a cloud native environment needs to know is: what does it take for my application to run? That means compute resources, configuration and external dependencies. How much RAM/CPU, where configs come from and where they go, databases, queues, etc.

Then, some basic manifest writing ability (nothing fancy, the documentation covers everything with detailed examples).
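
In practice, that means fragments like this one (hypothetical names, and a pod spec excerpt rather than a full manifest): declare the RAM/CPU the app needs and point each config value at its source:

containers:
  - name: demo-app
    image: registry.example.com/demo-app:1.0.0
    resources:
      requests:
        cpu: 250m              # what the scheduler reserves for the pod
        memory: 256Mi
      limits:
        cpu: 500m              # ceiling before CPU throttling
        memory: 512Mi          # ceiling before the OOM killer steps in
    env:
      - name: DATABASE_URL     # external dependency, wired in from a Secret
        valueFrom:
          secretKeyRef:
            name: demo-app-secrets
            key: database-url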

The Kubernetes ecosystem provides enough tools that any cluster design can be self-contained and secure. Tools like Argo can ensure that you don’t need to expose your cluster to external systems; instead, you pull updates into the cluster (e.g. the cluster needs access to GitHub or registries, not the other way around). Resource usage is transparent (whether you set up open-source monitoring in the cluster and thus increase your compute costs, or you use cloud monitoring and increase your direct cloud costs).
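
As a hedged sketch of that pull model (repository URL and paths are made up), an Argo CD Application watches a Git repository from inside the cluster and syncs whatever manifests it finds there:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: demo-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/demo-app-manifests.git  # hypothetical repo
    targetRevision: main
    path: k8s
  destination:
    server: https://kubernetes.default.svc   # deploy into this same cluster
    namespace: demo
  syncPolicy:
    automated:
      prune: true      # remove resources that disappear from Git
      selfHeal: true   # revert manual drift back to what Git declares

Nothing leaves the cluster except outbound pulls to GitHub and the image registry.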

These are fairly basic things and, given the amount of knowledge available online, I’d say the complexity is very low — but here is where experience and perception kick in.

Conclusion

Kubernetes can do a lot for you. It doesn’t replace serverless, nor does serverless replace Kubernetes (except for particular use cases, of course). The degree to which it abstracts away operational overhead is great, but at least a cursory understanding of its components and benefits is still needed. Know your tools!
