eBook

Doing It Yourself with Kubernetes — Harder than You Thought?

Introduction

A while ago, your enterprise decided to go ahead and deploy Kubernetes. Given its reputation for ease-of-use and stability, you were enthusiastic and confident you could get things into production quickly and reliably — using in-house resources, doing it yourself — DIY. After all, there were so many free, open-source tools, and the developers couldn’t wait!

However, over time, your Kubernetes journey didn’t turn out exactly as planned. You knew you’d be learning a lot, but not this much. You found out that Kubernetes is one thing when developers are creating an app but is quite complex and time-consuming when you’re putting clusters into production.

This eBook tells a very short story of what may have been your Kubernetes journey (so far) and its challenges. It concludes with an optional happy ending — for all its complexity, with a little help, deploying and maintaining Kubernetes for your microservices applications can be a quick and painless process.

Read on and have fun!

“A lot of customers who managed their own Kubernetes switched to EKS. Doing-it-yourself just wasn’t worth it.”
Deepak Singh, VP of Compute Services

Hello, Kubernetes! Getting started...

You built your first “Hello world!” app and K8s cluster, then added some features and thought how handy it would be to have a package manager. After some googling, you found Helm and you started working with Charts.

Except the app crashed. You needed to have persistent storage set up. So you got help from your DevOps and IT teams, who also had to learn a host of new things. They learned about hostPath, container storage and network interfaces (CSI/CNI). And then they got things cooking, everything was running, and you were ready to take it to the next level.

For your first major project, you looked around for one of your existing enterprise apps that wasn’t revenue-generating and that, if it broke, wouldn’t cause the end of the world. How was it deployed? Was it a VM? Could you containerize it? How would you dockerize it?

After lots of questions, lots of learning, and lots of work, it was almost ready to deploy. It took a lot more energy than you anticipated, but you were on your way, ready to move to production.

“Most organizations face challenges in determining the right time to scale pilot projects into production deployments, given the steep learning curve, lack of a DevOps culture, and an unclear ROI.”
Best Practices for Running Containers and Kubernetes in Production

Waking up to a brand-new ecosystem of apps and skills

But wait a sec, not so fast. You were going to production, so you needed to get your DevOps, SREs, and ITOps teams involved to deal with a lot more learning and a lot more work:

  • Monitoring? A service or an app? Prometheus, Grafana, AlertManager? Figuring this out took some time and research.
  • Logging? More time, more research. How many different ways are there to do logging, 30+? Loki, Elastic, Kibana, Fluent Bit, the ELK Stack? And whichever method you chose, you had to learn how to set it up specifically for K8s. How would you correlate the logs?
  • A/B deployments? You had to figure out Istio, or Linkerd, or Traefik; how many months did that take?

You started to understand how differently K8s works from traditional software development. It takes a unique way of thinking, and it has a completely different toolset: it’s an entirely new, complex ecosystem of apps and skills.

By the way, were you working with any personal data, any PII? Was it easy to figure out how to define red and green zones inside a K8s cluster? How long did that take?

Failure is not an option — it’s a certainty

That’s why managing production-stage K8s is about having the experience and personnel to attack complex and inevitable K8s issues as they arise. For example:

  1. You’ve deployed 13,000+ services and a worker node goes offline. Ingress is fine but jobs fail and all you see are generic 500 errors.
  2. You try to connect via SSH but the box is hung hard. Maybe the app is consuming too much memory and it needs limits? You need access to troubleshoot, so you reserve some memory and dedicate some CPU bandwidth to the server itself. Even if K8s goes crazy, at least the server will stay online.
  3. The next time a node fails, you see a “PLEG is not healthy” message. You google PLEG and learn it’s the Pod Lifecycle Event Generator, a kubelet component that tracks the state of pods and their containers. But what’s failing? Every time a node goes down, pods are reassigned, but you’re losing compute resources each time.
  4. After weeks of noodling, you throw in the towel and redeploy the entire cluster.
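
The reservation in step 2 comes down to simple arithmetic: whatever you hold back for the operating system and for the Kubernetes node components is no longer available to pods. Here is a minimal Python sketch of that bookkeeping. It assumes the usual breakdown into system-reserved, kube-reserved, and an eviction threshold (the kubelet exposes these as systemReserved, kubeReserved, and evictionHard). The Resources class, the allocatable function, and every number below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """An amount of node resources: memory in MiB, CPU in millicores."""
    memory_mib: int
    cpu_millicores: int


def allocatable(capacity: Resources, *held_back: Resources) -> Resources:
    """Return what is left for pods after subtracting every reservation."""
    memory = capacity.memory_mib - sum(r.memory_mib for r in held_back)
    cpu = capacity.cpu_millicores - sum(r.cpu_millicores for r in held_back)
    if memory <= 0 or cpu <= 0:
        raise ValueError("reservations exceed node capacity")
    return Resources(memory, cpu)


# Hypothetical 16 GiB, 4-core worker node; all figures are illustrative.
node = Resources(memory_mib=16384, cpu_millicores=4000)
system_reserved = Resources(memory_mib=1024, cpu_millicores=500)  # OS daemons such as sshd
kube_reserved = Resources(memory_mib=1024, cpu_millicores=500)    # kubelet and container runtime
eviction_threshold = Resources(memory_mib=100, cpu_millicores=0)  # memory headroom before pods are evicted

pods_get = allocatable(node, system_reserved, kube_reserved, eviction_threshold)
print(pods_get)
```

The right reservation sizes depend on your nodes and workloads, so treat the numbers as placeholders. The point is that SSH and the node agents keep a guaranteed slice even when pods misbehave.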

This type of experience is typical with a cutting-edge technology like K8s that is evolving constantly. It takes time and expertise to solve these mysteries.

Are you prepared to spend hours and hours troubleshooting and to risk downtime in production environments? Are you staffed to dig deep into these kinds of riddles?

In the above example, after days of troubleshooting by some of the top K8s experts, it turned out that kube-proxy, Flannel, and the kubelets were fighting over iptables rules, the database wasn’t updating correctly, and the Pod Lifecycle Event Generator thought the node was dead.

The solution: Replace iptables with IPVS.
Simple, right? Hah!

Hiring and retaining K8s Ops talent

You started out enthusiastically embracing Kubernetes and, given its reputation for ease-of-use, your in-house talent and the abundance of free, open resources, you naturally tried a DIY approach to rolling it out.

Your first clusters were successful, but the more you built them out, the more time it took, and you came to realize that K8s at the production level is much more involved and time-consuming than you originally budgeted for.

To help things along, maybe you got reqs for in-house platform engineers. But your recruitment agency kept telling you, “Hyperscalers snatch them up and you won’t believe how much they’re paying them ...” And consultants, if you could find them, were backlogged and expensive or available and not particularly savvy.

What do you do now? You need to deliver a high-performance K8s-based cloud. DIY K8s management has proved problematic, but you’re determined not to just throw money at the problem.

“We needed to hire 18 people, Kubernetes experts and site reliability engineers, in order to implement and operate our Kubernetes deployments worldwide 24x7. This expertise is hard to come by in Silicon Valley, and it’s worse outside.”
Ravi Ramachandran, VP Cloud Platform Engineering

Plug your infrastructure into an alternative that just works

So what’s the right thing to do?

  • If you can afford to find dependable platform engineers and pay them hyperscaler salaries, and keep them challenged to stay with you through your next rollout, you’re fortunate.
  • If you can run your apps on a public cloud and write off your existing infrastructure, maybe that’s the way to go.
  • If you can figure out how to make outsourced OpenShift management fit your budget, let alone run your apps reliably, then that’s an option too.

Yet none of these alternatives leverages the fundamental promise of a cloud-native approach: letting the top experts in any field make their expertise accessible on demand at a fraction of traditional costs via SaaS. The quick and cost-effective cloud-native approach is the Platform9 SaaS-based management plane.

“Since starting with Platform9, there’ve been a number of seamless upgrades to the management plane — but because it’s SaaS, no one in the IT staff has noticed.”
IT Platform Manager, Data Analytics Technology Company

Platform9 — the expertise of a managed service + the ease of cost-effective SaaS

You probably have enough on your plate supporting enterprise IT, let alone helping developers with Kubernetes management and cloud-native app development. The Platform9 team’s expertise is in deploying and managing K8s infrastructure. Monitoring? We do that. Logging? We do that. A/B deployments? We do that. CSI, CNI, CI/CD? We give you all the curated and safe choices. Working alongside each other via a SaaS management plane results in the greatest K8s performance at the lowest cost.

  • Your teams create revenue-generating apps with their choice of tools and data sources.
  • Platform9 and our K8s-certified engineers manage Kubernetes and platform applications.

“Platform9 has freed up considerable time so we can focus more on business-critical initiatives rather than maintaining and troubleshooting our private cloud.”
IT Platform Manager, Data Analytics Technology Company

You’re not on your own with 24/7/365 proactive support

When you register a node with the Platform9 service, an agent starts monitoring and reporting every important performance metric about the environment. When something is off, Platform9 automatically alerts you and Platform9 engineers and begins debugging, troubleshooting, and if necessary, restarting services and performing other self-healing activities. If a disk crashes or there’s a fatal condition in the customer-owned infrastructure, proactively minded Platform9 support teams are poised to help. Importantly:

  • All our support engineers are Certified Kubernetes Administrators (CKA).
  • We guarantee 99.9% management plane availability with a financially backed SLA.

“The Platform9 team has been awesome. A very, very good experience I witnessed is when we had a network issue, Platform9 proactively worked with our team to troubleshoot the problem. It’s been years since I have seen support that dedicated.”
CTO
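
To put the 99.9% availability figure in perspective, here is a minimal Python sketch that converts an availability percentage into the downtime it permits over a period. It assumes availability is measured as simple uptime over the whole period. How the SLA actually defines the measurement window and any exclusions is not stated here, so the function name and the periods used are illustrative assumptions.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: float) -> float:
    """Minutes of downtime permitted by an availability target over a period."""
    unavailable_fraction = 1 - availability_percent / 100
    minutes_in_period = period_days * 24 * 60
    return minutes_in_period * unavailable_fraction


# Illustrative periods: a 30-day month and a 365-day year.
monthly = allowed_downtime_minutes(99.9, 30)
yearly = allowed_downtime_minutes(99.9, 365)

print(f"{monthly:.1f} minutes per 30-day month")
print(f"{yearly / 60:.2f} hours per 365-day year")
```

Under those assumptions, 99.9% works out to roughly 43 minutes of management plane downtime in a 30-day month.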

Quickly transition from DIY to getting-it-done

The first step? Visit platform9.com and log in to the Platform9 Managed Kubernetes free tier. Tour the user interface or attach a server with real apps and see how it actually works. When you’re done, schedule a short demo and talk with one of our engineers. Chances are, you’ll want to set up a proof-of-concept.

  • A Platform9 PoC can be as simple as a 3-4 hour workshop, a call with an SE where we stand up a cluster together. The PoC can also be a comprehensive, full setup where we’ll review your requirements in detail. You’ll see your workloads running on Platform9 as well as experience a live support call. You can ask anything you want about production K8s in a “stump the expert” session. And it will take less than a week.
  • An initial deployment is swift with white-glove attention to your specific needs. Our success teams start off with a Confluence wiki page with every relevant detail about your environment. You’re introduced to our 24/7/365, follow-the-sun support model staffed by certified experts with a 99.9% customer-satisfaction rating. A dedicated solution architect takes you and your Platform9 team through onboarding to fully-automated Day-2 operations.

“What Platform9 delivered through a PoC in 2 1/2 days would have taken several weeks with other vendors.”
Digital Platform Manager, European Retailer

What’s next?

Managing Kubernetes in production environments is never easy. And you don’t have to do it anymore. SaaS-based Platform9 gives you incomparable ease-of-use along with everything you love about a public cloud at a fraction of the cost: self-service with CI/CD automation; instant scalability; security; and subscription-based pricing. And perhaps most importantly, your DevOps and IT teams are freed up to work on revenue-generating apps rather than backend and network maintenance.

“Based on all the math we’ve done, a public cloud is about four times as expensive as using our own infrastructure.”
IT Platform Manager, Data Analytics Technology Company
