How to Set Up the EFK Stack for Kubernetes Application Log Monitoring

Summary

Logging is a growing problem for Kubernetes users, and centralized log management solutions are now critical. In this blog we walk through how to rapidly implement a complete Kubernetes environment with logging enabled, using multiple popular open-source tools (Elasticsearch, FluentD, Kibana), Platform9’s free Managed Kubernetes service, and ArtifactHub.

A solution to the Kubernetes logging challenge

Kubernetes is becoming a huge cornerstone of cloud software development. Kubernetes deployments require many logs in many locations, and Site Reliability Engineers (SREs), DevOps and IT Ops teams are finding that more and more of their time is spent setting up logs, troubleshooting logging issues, or working with log data in different places. What can be done to solve this?

Fortunately, with advances in open-source tools and ready-made integrations from commercial providers, it’s now much simpler to set up and manage a logging solution. In this blog, we are using multiple open-source tools:

  • Elasticsearch, a distributed, open-source search and analytics engine for all types of data
  • FluentD for log aggregation. Fluentd is an open-source data collector for building the unified logging layer
  • Kibana, an open-source data visualization dashboard for Elasticsearch
  • Kubernetes itself

Together Elasticsearch, FluentD, and Kibana are commonly referred to as the EFK stack.

We’ll be using solutions from Platform9 to rapidly implement a complete environment:

  • Platform9’s Managed Kubernetes which provides built-in FluentD (early access)


Deploying Helm Charts

With the need for fast software development and delivery, the DevOps community relies on tools that can be deployed easily using Helm charts. For example, Helm charts are often created by organizations such as Jetstack, Bitnami, and Elastic – and provided to the community to give them the ability to launch these organizations’ software with a few command-line options. Helm charts also make it easy for developers to change the configuration options of applications. By editing the values.yaml file, an application can be set up in different ways – such as using a different database or using different configuration controls for production apps.

Here is a quick example of how you can work with Helm charts.

Go to artifacthub.io and search for the cert-manager chart. Find the official chart.

Select Install and you will see instructions on how to add the helm repository.

$ helm repo add cert-manager https://charts.jetstack.io
$ helm repo update

Next, to install the chart you can use:

helm install \
  cert-manager cert-manager/cert-manager \
  --namespace cert-management \
  --version v0.15.2

We cover installing cert-manager in more detail below.

For a production install, you’ll want to review the information in the README file for each chart. A good example README can be found here.

Deploying the EFK Logging Stack for Kubernetes

Platform9 deploys Prometheus and Grafana with every cluster, helping solve the monitoring piece, and we are actively developing a built-in FluentD deployment that will help simplify log aggregation and monitoring.

Note: You Must Have a Platform9 Managed Free Account to Get Started

Part 1: Deploying Kubernetes + FluentD using Platform9

To follow this tutorial you must have a Platform9 free managed Kubernetes account.

Platform9 can run clusters in public clouds (AWS, Azure), private clouds, and edge locations, with the ability to manage everything from the bare metal up (a BareOS cluster). All clusters can be built using the Platform9 SaaS platform by connecting your public clouds or by onboarding physical or virtual servers.

The example below uses a four-node Kubernetes cluster running on Platform9 Managed OpenStack, but the same result can be achieved using any virtual infrastructure, public cloud, or physical servers. Once complete you will have a Kubernetes cluster, managed by Platform9, with built-in monitoring and early access to our FluentD capabilities, connected to Elasticsearch and Kibana running on Rook CSI storage.

Requirements

Infrastructure:

Kubernetes Platform:

  • Single Node Control Plane (2 CPU, 16 GB RAM, 1 NIC)
  • Three Worker Nodes (4 CPU, 16 GB RAM, 1 NIC)
  • OS: Ubuntu 18.04

Rook Storage

  • Three Volumes (1 per Worker node)

Software

  • Git installed and a GitHub account
  • kubectl
  • Helm v3 client

Note: To install any charts and to manipulate the cluster, ensure Helm 3 and kubectl are installed and that a kubeconfig has been set up so that you can access the cluster.

Visit here for help on kubeconfig files and visit here for help on Helm.

Step 1: Sign up and Build a Cluster

If you currently don’t have a Platform9 free managed Kubernetes account, create your free account now. 

Platform9 is able to build, upgrade, and manage clusters in AWS, Azure, and on bare metal operating systems (BareOS), which can be physical or virtual servers running CentOS or Ubuntu.

This blog covers deploying a BareOS cluster on virtual machines using Rook for persistent storage. Deploying onto Azure or AWS can be achieved by adding the native AWS or Azure storage classes for the EFK data plane.

Once your account is active, create 4 virtual machines running either Ubuntu or CentOS on your platform of choice (physical nodes can also be used), attach an empty, unformatted volume to each VM (to support Rook), and then use the Platform9 CLI to connect each VM to the Platform9 SaaS management plane.

I built my environment on Platform9 Managed OpenStack; below you can see I have a single VM dedicated as the primary Kubernetes node and three VMs as Kubernetes worker nodes.

Platform9 Managed OpenStack Virtual Machines

To attach a VM or physical server, install the CLI by running:

bash <(curl -sL http://pf9.io/get_cli)

The installation will ask for your account details; these can be found on the first step of the BareOS wizard or the Add Node page.

Platform9 CLI Commands to Connect Nodes
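As a rough sketch (the pf9ctl binary name and prompts may vary between CLI versions), once the installer has captured your account details each node is attached by running:

pf9ctl prep-node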

Once you have installed the CLI and run the ‘prep-node’ command on each node, they will be attached to Platform9 and ready to host a cluster. Use the BareOS wizard to create the Kubernetes cluster; the following configuration is required to create a cluster with our FluentD operator enabled:

Control Plane Setup: Single Node Control Plane with Privileged Containers Enabled

  1. Select the node that will run the Kubernetes Control Plane
  2. Ensure Privileged Containers is enabled 

Workers Setup: Three Worker Nodes

  1. Select the three nodes you are using in this cluster

Network Setup:

  1. Cluster Virtual IP: Leave all fields empty as we are creating a single node control plane.
  2. Cluster Networking Range & HTTP Proxy: Leave with defaults
  3. CNI: Select Calico and use the default configuration
  4. MetalLB: Disabled

NOTE: If you want to deploy MetalLB, ensure the IP range is reserved within your environment and that port security will not block traffic at the virtual machine.

Final Tweaks – This is where we enable Fluentd

  1. Ensure monitoring is enabled
  2. Tags – Use the tags field to enable Fluentd

Enable Platform9 FluentD (Early Access Feature)

Platform9 has a built-in FluentD operator that will be used to forward logs to Elasticsearch. To enable the FluentD operator, edit the cluster from the Infrastructure dashboard and add the following tag to the cluster’s configuration:

  • key: “pf9-system:logging” 
  • value: “true”

      3. Review and done

Your cluster will now be built and you will be redirected to the Cluster Details page where you can review the status of the cluster deployment on the Node Health Page.

Once the cluster has finished being built you can confirm Fluentd has been enabled in two places. 

  1. Select the cluster and choose Edit on the Infrastructure dashboard. 

On the Edit screen, you should see the tag for logging added.


       2. Navigate to the Pods, Deployments and Services dashboard, and filter the Pods table to display the Logging Namespace. You should see Fluentd pods running.
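Alternatively, once you have a kubeconfig for the cluster you can check from the command line; the FluentD pods live in the pf9-logging namespace referenced later in this guide:

kubectl get pods -n pf9-logging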

Step 2: Obtain KubeConfig

Once the cluster has been built you can download a kubeconfig file directly from Platform9. Choose either token or username and password authentication, place the file in your .kube directory, and name the file config. Visit here for help on kubeconfig files.

Download Kubeconfig
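A minimal sketch of putting the file in place (the downloaded filename will differ, so adjust the path accordingly):

mkdir -p ~/.kube
mv ~/Downloads/<downloaded-kubeconfig>.yaml ~/.kube/config
kubectl get nodes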

Step 3: Create a namespace

For this example, I’m using a namespace called ‘monitoring-demo’. Go ahead and create that in your cluster:

 kubectl create namespace monitoring-demo

Step 4: Add Cert Manager

We can use artifacthub.io to search for the Jetstack cert-manager chart location. After finding the official chart we can add the chart repository and then install the chart itself.

Chart Location: https://artifacthub.io/packages/helm/cert-manager/cert-manager

Install Cert-Manager

To ensure cert-manager installs and operates correctly, you need to first create a namespace for cert-manager and add its CRDs (Custom Resource Definitions).

  1. Create the cert-management namespace
 kubectl create namespace cert-management

       2. Install the CRDs

kubectl apply --validate=false -f \
https://github.com/jetstack/cert-manager/releases/download/v0.16.1/cert-manager.crds.yaml

       3. Install the helm chart

helm install \
  cert-manager cert-manager/cert-manager \
  --namespace cert-management \
  --version v0.16.1
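To confirm the release came up cleanly, check the pods in the cert-management namespace:

kubectl get pods -n cert-management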

      4. Once installed, add the following certificate issuer for self-signed certificates:

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: selfsigned-issuer
spec:
  selfSigned: {}
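Save the manifest to a file and apply it, then confirm the issuer exists (selfsigned-issuer.yaml is just an example filename):

kubectl apply -f selfsigned-issuer.yaml
kubectl get clusterissuer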

Now we have a cluster with multiple nodes and we don’t need to worry about certificates. The next step to running Elasticsearch is setting up storage.

Part 2: Setting up storage with Rook

For this example, we have chosen to use Rook, an open-source CSI storage provider based on Ceph. To run Rook you must have an unformatted volume larger than 5 GB attached to each node. I achieved this in our Managed OpenStack platform by creating a 10 GB volume for each worker node and attaching it.

Storage with Rook

How to Add Rook CSI

I’m going to cheat here: Rook isn’t complicated to deploy, but to keep this blog focused on the EFK stack I’m going to refer to a great example on our Kool Kubernetes GitHub repository that steps through building a three-worker-node Rook cluster.

If you’re looking for an overview of Rook, an installation guide, and tips on validating your new Rook cluster, read through this blog on ITNEXT.

Quick Guide to Deploying Rook on Kubernetes:

Clone the Kool Kubernetes repository on any machine from which kubectl can deploy manifests to your Kubernetes cluster.

$ git clone https://github.com/KoolKubernetes/csi.git

Deploy the first YAML manifest:

$ kubectl apply -f rook/internal-ceph/1-common.yaml

Deploy the second YAML manifest for the Rook operator:

$ kubectl apply -f rook/internal-ceph/2-operator.yaml
configmap/rook-ceph-operator-config created
deployment.apps/rook-ceph-operator created

Once your Rook cluster is running you can continue. 
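Before continuing, it’s worth checking that the Rook operator pods are running and that a block storage class is available (rook-ceph and rook-ceph-block are the default names used later in this guide; adjust if your deployment differs):

$ kubectl get pods -n rook-ceph
$ kubectl get storageclass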

Now the fun part, let’s get Elasticsearch and Kibana running, then direct our FluentD output into Elasticsearch.

Part 3: Deploy Elasticsearch

The catch with all Helm charts is ensuring that you configure each one for your environment using the values.yaml file and by specifying the version, namespace, and ‘release’ (the name of the deployment).

The chart, available versions, instructions from the vendor, and security scan results can all be found at ArtifactHub: https://artifacthub.io/packages/helm/elastic/elasticsearch.

To deploy the chart you will need to create a `values.yaml` file (I called mine elastic-values.yaml).  To ensure Helm can access the yaml file, either provide the absolute path or have your terminal session in the directory where the values.yaml file is located. 

Some notes on elastic-values.yaml

To ensure your deployment runs, check that the following values are in line with the defaults:

clusterName: "elasticsearch"
protocol: http
httpPort: 9200
transportPort: 9300

To make life a little easier (not advised for production), make the following additions to your values.yaml file:

antiAffinity: "soft"
resources:
  requests:
    cpu: "100m"
    memory: "1500M"
  limits:
    cpu: "1000m"
    memory: "1500M"
esJavaOpts: "-Xmx1024m -Xms1024m"
replicas: 1
minimumMasterNodes: 1

To use the Rook storage, add the following to the values.yaml file.
NOTE: Ensure the storage class name matches your implementation

volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: "rook-ceph-block"
  resources:
    requests:
      storage: 1Gi

Once your file is set up save it and we are ready to deploy the chart.
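If you haven’t already added the Elastic chart repository, add it and refresh your local index first (https://helm.elastic.co is Elastic’s official Helm repository):

helm repo add elastic https://helm.elastic.co
helm repo update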

helm install \
    elasticsearch elastic/elasticsearch \
    --namespace monitoring-demo \
    --version 7.7.1 \
    -f elastic-values.yaml

The above command will install the 7.7.1 release of Elasticsearch into the monitoring-demo namespace using the configuration parameters defined in the elastic-values.yaml file.
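To confirm the Elasticsearch pod comes up and the cluster responds, you can check the pods and, in a separate terminal, port-forward the service and query it (elasticsearch-master is the chart’s default service name used throughout this guide):

kubectl get pods -n monitoring-demo
kubectl port-forward -n monitoring-demo svc/elasticsearch-master 9200:9200
curl http://localhost:9200/_cluster/health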

Part 4: Deploy Kibana

Deploying Kibana is very similar to Elasticsearch: you will need a values.yaml file; I used a file named kibana-values.yaml. For this demo, I used a NodePort to expose the Kibana UI, and to do this I modified the default values.yaml with the following override.

service:
  type: NodePort
  port: 5601
  nodePort: 31000
  labels: {}
  annotations: {}

Do not change “elasticsearchHosts” unless you modified the Elasticsearch values.yaml file. By default the values.yaml file contains elasticsearchHosts: "http://elasticsearch-master:9200". Port 9200 is the default port and elasticsearch-master is the default Elasticsearch deployment name.

The chart, available versions, instructions from the vendor and security scan results can also all be found at ArtifactHub: https://artifacthub.io/packages/helm/elastic/kibana.

To deploy Kibana run the following commands

helm install \
  kibana-ui elastic/kibana \
  --namespace monitoring-demo \
  --version 7.7.1 \
  -f kibana-values.yaml
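Once the chart is deployed you can look up a node IP and double-check the exposed NodePort with:

kubectl get nodes -o wide
kubectl get svc -n monitoring-demo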

Once deployed, you can confirm both Kibana and Elasticsearch are running by navigating to the Kibana UI in your browser of choice. My cluster is running on 10.128.130.41 and the NodePort is 31000, as specified in the values.yaml file.

http://10.128.130.41:31000/app/kibana#/

Now we are ready to connect FluentD to Elasticsearch, then all that remains is a default Index Pattern.

Part 5: Configure FluentD

The Platform9 FluentD operator is running; you can find its pods in the ‘pf9-logging’ namespace. What we need to do now is connect the two platforms; this is done by setting up an ‘Output’ configuration.

You will need to place the configuration below in a YAML file and apply it to your cluster. Please note, you will need to adjust the user, password, index_name and, importantly, the url.

The URL is an important piece; if it isn’t correct, the data cannot be forwarded into Elasticsearch. The syntax is as follows:

http://<elasticsearch-service>.<namespace>.svc.cluster.local:<port>

If you have followed this example using the same names you will not need to change anything.

apiVersion: logging.pf9.io/v1alpha1
kind: Output
metadata:
  name: es-objstore
spec:
  type: elasticsearch
  params:
    - name: url
      value: http://elasticsearch-master.monitoring-demo.svc.cluster.local:9200
    - name: user
      value: myelasticuser
    - name: password
      value: mygreatpassword
    - name: index_name
      value: k8s-prdsjcmon01-fluentd

Use kubectl to apply the yaml file. 
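For example, if you saved the Output manifest as fluentd-output.yaml (the filename is just an example):

kubectl apply -f fluentd-output.yaml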

Once the file has been applied, FluentD will start to forward data to Elasticsearch. Wait a few minutes, then refresh the Kibana UI and you will be able to go through the process of setting up the first index pattern.

Setting up an index pattern is a two-step process. First, you define a pattern that matches the inbound data from FluentD; this needs to match the index_name value. The next step is to select the field used for log timestamps.

Setting up an Index Pattern Step 1

Setting up an Index Pattern Step 2

Once the index pattern has been configured you can use the Discover dashboard to view the log files.

Dashboard
