How to configure and monitor VMware Tanzu Application Platform with Prometheus, Loki, and Grafana (PLG)
Introduction
There are many monitoring tools available; in this tutorial we will show how to configure and monitor VMware Tanzu Application Platform with one of the most popular monitoring tool sets: Prometheus-Loki-Grafana (PLG). This tool set is publicly available and uses Prometheus to collect metrics in a time series database, Loki to aggregate pod logs, and Grafana to visualize the data. The stack can be deployed either outside or inside a Tanzu Application Platform cluster. Here, we will explain a simple approach for configuring and using Grafana inside the Tanzu Application Platform cluster, integrated with the built-in Contour so that the endpoints can be accessed. In line with this, we’ll show how to:
● Set up and install Prometheus-Grafana using a publicly available Helm chart
● Set up and install Loki-Grafana using a publicly available Helm chart
Prerequisites
- A Kubernetes cluster (Kubernetes version v1.24.3) that is set up and configured
- Tanzu CLI v0.11.6
- Helm v3.6.3 or above
- Tanzu Application Platform Cluster Essentials 1.2.0
- Tanzu Application Platform 1.3.0, full profile installed with the OOTB Basic supply chain and a dev namespace set up
Step 1: Set up and install the Prometheus-Grafana Helm chart
We use the publicly available Prometheus-Grafana Helm chart and integrate it with Contour to access the Grafana endpoint.
- Add the public Helm repo.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
- Install the Prometheus-Grafana chart in the namespace “monitoring,” creating the namespace if it does not exist.
helm install prometheus \
  prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace
- Validate whether the deployment is successful.
kubectl --namespace monitoring get pods -l "release=prometheus"
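Reading the pod list by eye works for a handful of pods; for a scripted check, the output can be piped through a small filter. The following is a minimal sketch, assuming the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, ...). The function name `pods_healthy` is a hypothetical helper, not part of kubectl or the chart.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin.
# Succeeds only if at least one pod is listed and every pod is either
# Completed or Running with all of its containers ready.
pods_healthy() {
  awk '
    $3 == "Completed" { next }            # finished Job pods are fine
    {
      split($2, ready, "/")               # READY column, e.g. "2/2"
      if ($3 != "Running" || ready[1] != ready[2]) {
        print "not ready: " $1 " (" $2 ", " $3 ")"
        bad++
      }
    }
    END { if (bad > 0 || NR == 0) exit 1 }
  '
}
```

For example: `kubectl --namespace monitoring get pods -l "release=prometheus" --no-headers | pods_healthy && echo "all pods ready"`.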
- We will integrate the Grafana service with the Tanzu Application Platform Contour so that we can access it via the Contour ingress domain, here at “grafana.blog.INGRESS_DOMAIN”.
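One way to do this integration is a Contour HTTPProxy that routes the ingress hostname to the Grafana service. The following is a minimal sketch under stated assumptions, not necessarily the exact manifest used for this setup: it assumes the Grafana service created by the chart for a release named "prometheus" is called prometheus-grafana and listens on port 80, and the file name grafana-httpproxy.yaml and the fallback domain example.com are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch only: render an HTTPProxy manifest that exposes Grafana through Contour.
# INGRESS_DOMAIN is your TAP ingress domain; example.com is a placeholder fallback.
INGRESS_DOMAIN="${INGRESS_DOMAIN:-example.com}"

cat > grafana-httpproxy.yaml <<EOF
apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: grafana
  namespace: monitoring
spec:
  virtualhost:
    fqdn: grafana.blog.${INGRESS_DOMAIN}
  routes:
    - services:
        # Assumed service name and port; verify with: kubectl get svc -n monitoring
        - name: prometheus-grafana
          port: 80
EOF

echo "Wrote grafana-httpproxy.yaml for grafana.blog.${INGRESS_DOMAIN}"
```

Apply the file with `kubectl apply -f grafana-httpproxy.yaml`, then confirm with `kubectl get httpproxy -n monitoring` that the proxy is reported as valid.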
- Now we can access the endpoint via the configured ingress domain “grafana.blog.INGRESS_DOMAIN” and monitor the metrics of all Tanzu Application Platform components and workload pods. Creating dashboards is out of scope for this tutorial, so a sample Grafana dashboard with Prometheus as the data source is shown below.
Here’s a sample of resource utilization for an individual pod.
Step 2: Set up and install the Loki-Grafana Helm chart
The PLG stack is commonly used to collect pod logs in Kubernetes clusters. Promtail runs as a DaemonSet, reads the pod logs on each node, and ships them to Loki, which in turn acts as a data source for Grafana so that the logs can be viewed in a readable format. We use the publicly available Loki-Grafana Helm chart and integrate it with Contour to access the Grafana endpoint.
- Add the public Helm repo.
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
- To tailor the configuration for the PLG stack, store the values below in a YAML file (here, ~/loki-stack-values.yml), which we will use later during the Loki-Grafana installation.
loki:
  enabled: true
  persistence:
    enabled: true
    storageClassName: default
    size: 50Gi
promtail:
  enabled: true
  pipelineStages:
    - cri: {}
    - json:
        expressions:
          is_even: is_even
          level: level
          version: version
grafana:
  enabled: true
  sidecar:
    datasources:
      enabled: true
  image:
    tag: 8.3.5
- Here, setting “storageClassName” to “default” creates an Azure managed disk, because we are using an AKS cluster; the storage class name will be different on other clusters. The resulting disk is used to persist the Loki data. In general, any storage class that provisions persistent volumes can be used, and addressing all possible options is out of scope for this tutorial.
- Install the Loki-Grafana chart in the namespace “loki,” creating the namespace if it does not exist.
helm install loki grafana/loki-stack -n loki --create-namespace -f ~/loki-stack-values.yml
- Validate whether the installation is successful.
kubectl get all -n loki
- Now we can follow the same process as above to integrate this instance of Grafana with the Tanzu Application Platform Contour, this time using the hostname “loki.blog.INGRESS_DOMAIN”.
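One way to expose this Grafana instance is a Contour HTTPProxy in the “loki” namespace. This is a minimal sketch under stated assumptions: the service name loki-grafana matches the secret name used in the next step, port 80 is assumed to be the service port, and the file name loki-httpproxy.yaml and the fallback domain example.com are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch only: render an HTTPProxy manifest for the Grafana bundled with loki-stack.
# INGRESS_DOMAIN is your TAP ingress domain; example.com is a placeholder fallback.
INGRESS_DOMAIN="${INGRESS_DOMAIN:-example.com}"

cat > loki-httpproxy.yaml <<EOF
apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: loki-grafana
  namespace: loki
spec:
  virtualhost:
    fqdn: loki.blog.${INGRESS_DOMAIN}
  routes:
    - services:
        # Assumed service name and port; verify with: kubectl get svc -n loki
        - name: loki-grafana
          port: 80
EOF

echo "Wrote loki-httpproxy.yaml for loki.blog.${INGRESS_DOMAIN}"
```

Apply the file with `kubectl apply -f loki-httpproxy.yaml`.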
- Before proceeding, we have to retrieve the admin password for accessing Grafana (username: admin).
kubectl get secret loki-grafana -n loki -o template --template '{{ index .data "admin-password" }}' | base64 -d; echo
- Now we can access the endpoint via the configured ingress domain “http://loki.blog.INGRESS_DOMAIN” and monitor the logs for all Tanzu Application Platform components and workload pods. Creating dashboards is out of scope for this tutorial, so a sample Explore query with Loki as the data source and the logs from the namespace “kapp-controller” is given below.
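A query for the logs of one namespace can be written as a LogQL stream selector. The sketch below builds such a selector; it assumes Promtail attaches a `namespace` label to each log stream (the chart's default scrape configuration is expected to do so) and that Loki is reachable through a port-forward of a service named loki on port 3100. Both are assumptions to verify in your cluster.

```shell
#!/bin/sh
# Hypothetical values: adjust the namespace and the Loki address to your setup.
NAMESPACE="kapp-controller"
LOKI_URL="http://localhost:3100"   # assumes: kubectl -n loki port-forward svc/loki 3100

# LogQL stream selector: all log lines from pods in the chosen namespace.
QUERY="{namespace=\"${NAMESPACE}\"}"
echo "LogQL query: ${QUERY}"

# Uncomment to run the query against the Loki HTTP API once the port-forward is active.
# curl -sG "${LOKI_URL}/loki/api/v1/query_range" \
#   --data-urlencode "query=${QUERY}" --data-urlencode "limit=20"
```

The printed selector can be pasted into the Grafana Explore view with Loki selected as the data source.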