Keeping up with cloud native is hard (Going Cloud Native, Part 1)

The pressure is on to go cloud native because it creates a better, more responsive customer experience. Your journey so far has been difficult and expensive, and yet you’re still a long way from arriving. You’re faced with some combination of a rapid rate of technology change in the Kubernetes ecosystem, growing complexity and scale needs, multiple locations to support, and a widening skills gap. You can’t keep up. And you certainly can’t create a compelling customer experience when all you’re doing is fixing infrastructure issues. You need always-on assurance at scale. You need a better way to go cloud native.

The promise of cloud native

To be competitive and to respond better to your customers’ needs, you need application agility and better development productivity. To achieve this, you have decided to build and scale containerized applications. You want to take advantage of their speed, scalability, elasticity, resiliency, and flexibility. And you are looking for a faster way to iterate on and deploy your applications using the power of Kubernetes and other cloud-native technologies. You are understandably eager to leverage the promise of cloud native.

Challenges of Kubernetes management 

To be successful, however, you realize that you need a rock-solid foundation: a highly reliable and available cloud-native platform everywhere you need it. Yet managing such a platform, built on a Kubernetes stack with “always-on” assurance, is proving to be a massive challenge.

It’s complicated. It’s expensive. It’s hard to do.

Whatever your situation, it is hard to keep up

DIY management

If you are developing your own platform using open-source Kubernetes and related components, you are confronted with a high rate of change in the open-source ecosystem, and your team is required to master an increasing number of new technologies. To effectively deploy at scale, you need to constantly evaluate and integrate many independent components. It’s hard to keep up.

Commercial distros

If you have deployed commercial Kubernetes, your teams still have the burden of installation, patching, upgrading, and troubleshooting. These installations can be difficult to run without advanced expertise and in-depth understanding of Kubernetes and CNCF add-ons, and they frequently fail at scale. With the constantly changing ecosystem, you still can’t keep up. Moreover, you wind up paying exorbitant consulting, support, and vendor license costs in addition to budgeting for your employees. Many of the solutions are also highly prescriptive in nature, which reduces flexibility.

On-premises, public, or hybrid – more deployment options mean more things to keep up with


If your infrastructure is on-premises, you are unsure of how best to leverage your existing investments. You realize that your current tools, processes, and skills don’t work in the cloud-native world. You now require an entirely new set of tools, and the path to the desired future state appears uncertain and difficult. You are investing huge amounts of time and money evaluating tools and building and operating cloud-native infrastructure. You are attempting to force-fit existing tools and processes onto cloud-native apps, or you are stuck in tool sprawl and a never-ending try-fail loop. All of this hurts developer productivity and slows down app delivery.

Public clouds

You can’t move fast enough internally to meet the needs of your business. It’s often very tempting, with a lot of nudging from the large public cloud providers, to move to their clouds, but you are concerned about the total cost of ownership (TCO) and vendor lock-in. You may also have data and workloads that simply cannot leave your premises or jurisdiction. And for the workloads that can be migrated, you may be unable to make a convincing business case or settle on a strategy.

If your infrastructure is already in a public cloud, you may find that public cloud costs are spiraling out of control, especially if you don’t have proper governance and FinOps models in place. You may also find that public cloud container services do not completely simplify operations. You do get IaaS and managed control planes from these providers, but you still have to deal with integration, life cycle management, upgrades, and troubleshooting of other add-ons and services yourself. Using multiple public clouds adds to the complexity and operational burden. Additionally, you may want a hybrid approach, but there seems to be no easy way to do that either.

To make matters worse, you’re in the middle of a talent war for Kubernetes expertise.

Many in IT view the opportunity to learn and work with new cloud-native technologies as a rewarding experience. However, running a production-grade cloud-native stack requires skills in many new technologies, and there is a severe shortage of knowledgeable talent. Your already short-handed staff are spending massive amounts of time on maintenance. You may be experiencing reputation-damaging outages. You are facing the real risk of your staff leaving. And you either can’t hire the skills and people you need fast enough or don’t have the budget for it.

Your enterprise depends on you to make the cloud-native shift. But you are stuck in a perpetual hiring cycle.

Your developers are struggling to work with Kubernetes

Application developers often end up trying to fix Kubernetes rather than working on their applications. They’re spinning up their own clusters to develop apps, learn Kubernetes, and test it themselves. But the learning curve is steep. They’re stuck spending hours pulling application logs and often have to ask the helpdesk for increased permissions just to get their job done efficiently. The result is a long list of helpdesk requests, hours of manual work, and apps that don’t work in test, staging, and production.

And scaling challenges abound

All these challenges multiply 100-fold when you start to scale. Scale can mean any of the following depending on your use case:

  • Geographic scale (e.g., thousands of edge locations)
  • Hundreds of clusters in your data center or a single public cloud
  • A handful of clusters with hundreds of nodes each, divided up by namespace

Scale results in cluster sprawl, access control problems, and challenges with monitoring, observability, consistency, and policy governance. Multi-cluster management is also very difficult in a hybrid or multi-cloud environment. This in turn leads to higher operational overhead, reducing the ROI and business value of going cloud native in the first place.

Not keeping up has business consequences

Not keeping up and not having a reliable cloud-native environment means outages and missed timelines, which in turn result in undesirable business consequences: lost revenue, dissatisfied developers, disgruntled customers, employee burnout, turnover, and reputational damage.

Ultimately, the inability to maintain highly available, up-to-date cloud-native infrastructure at scale may prove to be the Achilles’ heel of your cloud-native transformation.

Platform9: Your cloud native accelerator platform

Platform9 offers you a better way to go cloud native. We enable companies to tackle any location, any cloud, at any scale with our software platform, world-class expertise, and always-on assurance. Our unified cloud-native platform includes Kubernetes container management, virtualization, bare metal management, and continuous deployment. Get a full stack of everything you need to keep up with cloud native. Stop fixing infrastructure issues and start building compelling customer experiences.

Read: Going Cloud Native, Part 2 – Introducing Platform9: A better way to go cloud native

Kamesh Pemmaraju
