Kubernetes Logging Best Practices
In this blog post you’ll learn how to:
- Configure stdout and stderr Streams
- Decide Whether to Sidecar or Not to Sidecar
- Pick Your Log Analysis Tool – EFK or Dedicated Logging
- Control Access to Logs with RBAC
- Keep Log Formats Consistent
- Set Resource Limits on Log Collection Daemons
Kubernetes manages the lifecycle of hundreds of containers deployed in pods. It is highly distributed, and its parts are dynamic. A running Kubernetes environment typically spans several clusters and nodes hosting hundreds of containers that are constantly spun up and destroyed as workloads change.
When you run a large pool of containerized applications and workloads, it is important to monitor Kubernetes proactively and debug errors early. Errors can surface at the container, node, or cluster level. Logging is therefore a crucial part of managing and monitoring services and infrastructure: logs let you track errors and even fine-tune the performance of the containers that host your applications.
Configure stdout and stderr Streams
The first step is to understand how logs are generated. A container writes its logs to two streams: stdout and stderr. Both streams are captured and written to a log file on the node, and this process is handled for you by Kubernetes and the container runtime. You choose which messages go to which stream. A best practice is to send regular application logs to stdout and error logs to stderr.
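As a minimal sketch of that split, the Python snippet below routes records by severity using the standard logging module. The function name build_logger and the choice of ERROR as the cut-off are assumptions made for illustration; adjust the threshold to whatever your team treats as an error log.

```python
import logging
import sys


class MaxLevelFilter(logging.Filter):
    """Let through only records strictly below max_level."""

    def __init__(self, max_level: int) -> None:
        super().__init__()
        self.max_level = max_level

    def filter(self, record: logging.LogRecord) -> bool:
        return record.levelno < self.max_level


def build_logger(name: str, out=None, err=None) -> logging.Logger:
    """Return a logger that sends errors to stderr and everything else to stdout.

    `out` and `err` default to the process streams; they are parameters only
    so the routing can be exercised with in-memory streams.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False   # avoid duplicate output via the root logger
    logger.handlers.clear()

    # Records below ERROR (DEBUG, INFO, WARNING) go to stdout.
    stdout_handler = logging.StreamHandler(out if out is not None else sys.stdout)
    stdout_handler.addFilter(MaxLevelFilter(logging.ERROR))

    # ERROR and CRITICAL go to stderr.
    stderr_handler = logging.StreamHandler(err if err is not None else sys.stderr)
    stderr_handler.setLevel(logging.ERROR)

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    for handler in (stdout_handler, stderr_handler):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Calling build_logger("checkout").info("order placed") writes to stdout, while .error("payment failed") writes to stderr, so each message lands on the stream the practice above calls for.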
Decide Whether to Sidecar or Not to Sidecar
The Kubernetes documentation describes sidecar containers as one way to collect logs. In this approach, every application container has a neighboring ‘streaming container’ that reads the application’s logs and streams them to its own stdout and stderr. The sidecar model helps to avoid exposing logs at the node level, and it gives you control over logs at the container level.
This model works well for low-volume logging, but at scale it can be a resource drain, because you need to run a separate logging container for every application container. The K8s docs say that this model is ‘hardly a significant overhead.’ It is worth trying the model and measuring the resources it consumes before you commit to it.
The alternative is to run a logging agent that collects logs at the node level. This adds little overhead and ensures the logs are handled securely. Fluentd has emerged as a popular choice for aggregating Kubernetes logs at scale. It acts as a bridge between Kubernetes and any number of endpoints where you’d like to consume the logs. A managed K8s service like Platform9 can also provide a fully managed Fluentd instance, so you don’t have to configure or maintain it yourself.
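To make the streaming sidecar idea concrete, here is a rough Python sketch of what such a container does: it follows a log file that the application writes to a shared volume and copies each complete line to its own stdout. The function names and the polling approach are assumptions for illustration. Real sidecars often just run a tail command, and this sketch does not handle log rotation.

```python
import sys
import time
from typing import TextIO


def stream_new_lines(path: str, offset: int, out: TextIO = sys.stdout) -> int:
    """Copy complete lines appended to `path` since `offset` into `out`.

    Returns the new offset so the caller can poll again later.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a partial line still being written
            out.write(line.decode("utf-8", errors="replace"))
            offset = f.tell()
    out.flush()
    return offset


def follow(path: str, poll_seconds: float = 1.0) -> None:
    """Run forever, streaming new lines from `path` to stdout."""
    offset = 0
    while True:
        offset = stream_new_lines(path, offset)
        time.sleep(poll_seconds)
```

Every application container that logs to a file needs its own copy of a loop like follow, which is where the per-container overhead discussed above comes from.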
Once you’ve decided on Fluentd to better aggregate and route log data, the next step is to decide how you’ll store and analyze the log data.
Pick Your Log Analysis Tool – EFK or Dedicated Logging
Traditionally, with on-prem, server-centric systems, application logs are stored in log files on the system itself. These files live in a defined location or are moved to a central server. In Kubernetes, container logs are written to files on each node’s disk under /var/log. Relying on these files alone isn’t safe, because pods are often temporary or short-lived, and their log files are lost when the pod is deleted. That leaves gaps in the log data just when you need it for troubleshooting.
Kubernetes recommends two options: send all logs to Elasticsearch, or use a third-party logging tool of your choice. Here again, there is a choice to make. Going the Elasticsearch route means you buy into a complete stack, the EFK stack, which includes Elasticsearch, Fluentd, and Kibana. Each tool has its own role to play. As mentioned above, Fluentd aggregates and routes logs. Elasticsearch is the powerhouse that stores and indexes raw log data so it can be searched and analyzed. Kibana is an open-source data visualization tool that builds custom dashboards from your log data. This is a completely open-source stack and is a powerful solution for logging with Kubernetes.
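The exact layout of these files depends on the container runtime. On nodes that use a CRI runtime such as containerd, each line is typically plain text in the form "timestamp stream tag message" rather than JSON, where the tag is F for a full line or P for a partial one. The sketch below parses that layout; the type and function names are made up for illustration, and a node-level agent such as Fluentd normally does this parsing for you.

```python
from typing import NamedTuple


class CriLogLine(NamedTuple):
    timestamp: str   # RFC 3339 timestamp written by the runtime
    stream: str      # "stdout" or "stderr"
    partial: bool    # True when the runtime split a long line (tag "P")
    message: str


def parse_cri_line(line: str) -> CriLogLine:
    """Parse one line of a CRI-format container log file."""
    parts = line.rstrip("\n").split(" ", 3)
    if len(parts) == 3:
        parts.append("")  # a log entry with an empty message
    if len(parts) != 4:
        raise ValueError(f"not a CRI log line: {line!r}")
    timestamp, stream, tag, message = parts
    if stream not in ("stdout", "stderr"):
        raise ValueError(f"unexpected stream: {stream!r}")
    if tag not in ("P", "F"):
        raise ValueError(f"unexpected tag: {tag!r}")
    return CriLogLine(timestamp, stream, tag == "P", message)
```

For a line such as "2023-02-08T10:15:30.123456789Z stderr F connection refused", the parser returns the stream stderr and the message "connection refused", which is the same stream split configured in the application.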
Still, there are things to keep in mind. Elasticsearch is built and maintained by an organization called Elastic, and a huge community of open source developers. While it is battle-tested to be blazing fast and very powerful at running queries on large scale data, it also has its quirks when operating at scale. Self-managed Elasticsearch needs someone who knows how to architect the platform for scale.
The alternative is to use a cloud-based log analysis tool, such as Sumo Logic or Splunk, to store and analyze Kubernetes logs. Some of these tools use Fluentd to route logs to their platform, while others ship their own logging agent that runs at the node level within Kubernetes. Setup is usually easy, and this route tends to take the least time to go from zero to viewing logs in dashboards.
Control Access to Logs with RBAC
Kubernetes uses role-based access control (RBAC) as an authorization mechanism: it decides whether an authenticated user has permission to perform a given action. The audit logs generated during operation are annotated with the authorization decision (authorization.k8s.io/decision) and the reason for that decision (authorization.k8s.io/reason). Audit logs are not activated by default. Activating them to track access issues is recommended. On a self-managed cluster this is done by passing an audit policy file to the kube-apiserver (the --audit-policy-file and --audit-log-path flags), not through ‘kubectl’.
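Once audit logging is on, those two annotations make it easy to pull out denied requests. The sketch below assumes the audit log is written as one JSON event per line, which is the format of the API server's log backend. The function name denied_requests and the fields picked for the summary are choices made for this example.

```python
import json
from typing import Iterable, Iterator

DECISION = "authorization.k8s.io/decision"
REASON = "authorization.k8s.io/reason"


def denied_requests(lines: Iterable[str]) -> Iterator[dict]:
    """Yield a short summary for every audit event whose decision is 'forbid'."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        annotations = event.get("annotations") or {}
        if annotations.get(DECISION) == "forbid":
            yield {
                "user": (event.get("user") or {}).get("username"),
                "verb": event.get("verb"),
                "uri": event.get("requestURI"),
                "reason": annotations.get(REASON, ""),
            }
```

Passing an open audit log file to denied_requests yields one dictionary per forbidden request, which you can count per user or forward to your log analysis tool.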
Keep Log Formats Consistent
Kubernetes logs are generated by different parts of the Kubernetes architecture. Once aggregated, they should share a consistent format so that log aggregation tools like Fluentd or Fluent Bit can process them easily. Keep this in mind when configuring stdout and stderr, or when assigning labels and metadata with Fluentd, for example. Structured logs of this kind, once sent to Elasticsearch, reduce latency during log analysis.
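One common way to keep the format consistent is to have every service emit one JSON object per log line. The formatter below is a small sketch of that idea using Python's standard logging module. The field names (time, level, logger, message) are an assumed convention for illustration, not something Kubernetes or Fluentd requires.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            # UTC timestamp in ISO 8601 form, so all services agree on time format
            "time": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Keep the traceback inside the same JSON object, not on extra lines
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)
```

Attach it with handler.setFormatter(JsonFormatter()) on whichever handlers write to stdout and stderr. Because every line is then valid JSON with the same keys, the aggregator can parse fields directly instead of relying on per-service regular expressions.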
Set Resource Limits on Log Collection Daemons
With the high volume of logs generated, managing logs at the cluster level can get hard. A DaemonSet in Kubernetes plays a role similar to a daemon in Linux: it runs a copy of a pod in the background on every node to perform a specific task. Fluentd and Filebeat are two log collectors commonly deployed this way. It is imperative to set resource limits on each daemon so that log collection stays within the available system resources.
Kubernetes contains multiple layers and components that should be monitored and tracked closely. Kubernetes encourages logging with external ‘Kubernetes Native’ tools that integrate seamlessly and make logging easier for admins. The practices described here help you build a robust logging architecture, one that uses computing resources efficiently and keeps the Kubernetes environment secure and performant.
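As a sketch of what such limits look like, the function below builds a DaemonSet manifest as a Python dictionary with resource requests and limits on the collector container. The names (log-collector, the kube-system namespace), the image argument, and the CPU and memory figures are all placeholders chosen for illustration. Size them from what you measure on your own nodes, and keep each request at or below its limit.

```python
import json


def logging_daemonset(
    image: str,
    cpu_request: str = "100m",
    memory_request: str = "200Mi",
    cpu_limit: str = "500m",
    memory_limit: str = "500Mi",
) -> dict:
    """Build an apps/v1 DaemonSet manifest for a node-level log collector."""
    labels = {"app": "log-collector"}
    container = {
        "name": "log-collector",
        "image": image,
        "resources": {
            # What the scheduler reserves for the collector on each node
            "requests": {"cpu": cpu_request, "memory": memory_request},
            # Hard ceiling so log collection cannot starve application pods
            "limits": {"cpu": cpu_limit, "memory": memory_limit},
        },
        # Read-only access to the node's log directory
        "volumeMounts": [{"name": "varlog", "mountPath": "/var/log", "readOnly": True}],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": "log-collector", "namespace": "kube-system", "labels": labels},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [container],
                    "volumes": [{"name": "varlog", "hostPath": {"path": "/var/log"}}],
                },
            },
        },
    }
```

Since kubectl accepts JSON as well as YAML, printing json.dumps(manifest, indent=2) gives you a file you can review and apply. In practice, most teams start from the manifest or Helm chart published by their collector and adjust only the resources section.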