How to configure and monitor VMware Tanzu Application Platform with Prometheus, Loki, and Grafana (PLG)

Introduction

There are many monitoring tools available. In this tutorial, we show how to configure and monitor VMware Tanzu Application Platform with one of the most popular monitoring tool sets: Prometheus-Loki-Grafana (PLG). This tool set is publicly available and uses Prometheus to collect metrics in a time series database, Loki to aggregate pod logs, and Grafana to visualize the data. It can be deployed either outside or inside a Tanzu Application Platform cluster. Here, we explain a simple approach: running Grafana inside the Tanzu Application Platform cluster and exposing its endpoints through the built-in Contour ingress. We’ll show how to:

● Set up and install Prometheus-Grafana using a publicly available Helm chart

● Set up and install Loki-Grafana using a publicly available Helm chart

Prerequisites

A Kubernetes cluster (version v1.24.3) that is set up and configured
Tanzu CLI v0.11.6
Helm v3.6.3 or later
Tanzu Application Platform Cluster Essentials 1.2.0
Tanzu Application Platform 1.3.0, full profile, installed with the OOTB Basic supply chain and a dev namespace set up

Step 1: Set up and install the Prometheus-Grafana Helm chart

We use the publicly available Prometheus-Grafana Helm chart and integrate it with Contour to access the Grafana endpoint.

Add the public Helm repo.

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

helm repo update

Install the Prometheus-Grafana chart in the namespace “monitoring,” creating the namespace if it does not exist.

helm install prometheus \
  prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace

Validate whether the deployment is successful.

kubectl --namespace monitoring get pods -l "release=prometheus"
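The pod listing above is a point-in-time view. As a complementary check, a sketch like the following blocks until the Grafana deployment has finished rolling out. The deployment name prometheus-grafana is an assumption based on the service name used in the HTTPProxy manifest in this step, and the function name and default timeout are illustrative only.

```shell
# Hypothetical helper: waits for a deployment rollout to complete.
# Arguments: namespace, deployment name, optional timeout (default 300s).
wait_for_rollout() {
  local namespace="$1"
  local deployment="$2"
  local timeout="${3:-300s}"
  kubectl rollout status "deployment/${deployment}" \
    --namespace "${namespace}" "--timeout=${timeout}"
}

# Usage (requires cluster access; the deployment name is an assumption):
#   wait_for_rollout monitoring prometheus-grafana
```

The command returns a non-zero exit code if the rollout does not finish within the timeout, which makes it convenient to use in scripts.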

We will integrate the Grafana service with Tanzu Application Platform Contour so that we can access it via the Contour ingress domain. Here’s an example HTTPProxy manifest:

apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: grafana
  namespace: monitoring
spec:
  routes:
  - services:
    - name: prometheus-grafana
      port: 80
  virtualhost:
    fqdn: grafana.blog.INGRESS_DOMAIN
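The manifest still has to be applied to the cluster, for example by saving it to a file and running kubectl apply -f on it. As an alternative sketch, the helper below substitutes the INGRESS_DOMAIN placeholder and prints the manifest so it can be piped straight into kubectl. The function name and the example.com domain are hypothetical and not part of the original setup.

```shell
# Hypothetical helper: prints the Grafana HTTPProxy manifest for a given
# ingress domain (replaces the INGRESS_DOMAIN placeholder).
render_grafana_proxy() {
  local ingress_domain="$1"
  cat <<EOF
apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: grafana
  namespace: monitoring
spec:
  routes:
  - services:
    - name: prometheus-grafana
      port: 80
  virtualhost:
    fqdn: grafana.blog.${ingress_domain}
EOF
}

# Usage (requires cluster access; example.com is a placeholder domain):
#   render_grafana_proxy example.com | kubectl apply -f -
```

Printing the manifest first, without the pipe, is a quick way to review the rendered FQDN before anything is created in the cluster.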

Now we can access the endpoint via the configured ingress domain “grafana.blog.INGRESS_DOMAIN” and monitor the metrics of all Tanzu Application Platform components and workload pods. Creating dashboards is out of scope for this blog, so a sample Grafana dashboard with Prometheus as the data source is shown below.

Here’s a sample of resource utilization for an individual pod.

Step 2: Set up and install the Loki-Grafana Helm chart

The PLG stack is commonly used to collect pod logs in Kubernetes clusters. Promtail runs as a DaemonSet on the cluster nodes, extracts the logs, and ingests them into Loki, which in turn acts as a data source for Grafana, where the logs can be viewed and queried. We use the publicly available Loki-Grafana Helm chart and integrate it with Contour to access the Grafana endpoint.

Add the public Helm repo.

helm repo add grafana https://grafana.github.io/helm-charts

helm repo update

To tailor the configuration of the PLG stack, store the following values in a YAML file (saved here as ~/loki-stack-values.yml), which we will use later during the Loki-Grafana installation.

loki:
 enabled: true
 persistence:
  enabled: true
  storageClassName: default
  size: 50Gi

promtail:
 enabled: true
 pipelineStages:
  - cri: {}
  - json:
      expressions:
        is_even: is_even
        level: level
        version: version

grafana:
 enabled: true
 sidecar:
  datasources:
   enabled: true
 image:
  tag: 8.3.5

Here, “storageClassName” is set to “default”. Because we are using an AKS cluster, this storage class provisions an Azure managed disk, which is used to persist the Loki data. The storage class name will differ on other clusters. Any storage class that provisions persistent volumes can be used, and covering every option is outside the scope of this tutorial.

Install the Loki-Grafana chart in the namespace “loki,” creating the namespace if it does not exist.

helm install loki grafana/loki-stack -n loki --create-namespace -f ~/loki-stack-values.yml
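Because the storage class name differs between clusters, it can be convenient to override it at install time instead of editing the values file. The sketch below wraps the install command above and adds a --set override, which takes precedence over the value in the file. The function name and the my-storage-class placeholder are hypothetical; running kubectl get storageclass lists the names available on a given cluster.

```shell
# Hypothetical helper: installs the loki-stack chart with a storage class
# override. Arguments: storage class name, optional path to the values file.
install_loki_stack() {
  local storage_class="$1"
  local values_file="${2:-${HOME}/loki-stack-values.yml}"
  helm install loki grafana/loki-stack \
    --namespace loki --create-namespace \
    -f "${values_file}" \
    --set "loki.persistence.storageClassName=${storage_class}"
}

# Usage (requires cluster access; my-storage-class is a placeholder):
#   kubectl get storageclass
#   install_loki_stack my-storage-class
```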

Validate whether the installation is successful.

kubectl get -n loki all

Now we can follow the same process as above to integrate this instance of Grafana with Tanzu Application Platform Contour. For example:

apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: loki
  namespace: loki
spec:
  routes:
  - services:
    - name: loki-grafana
      port: 80
  virtualhost:
    fqdn: loki.blog.INGRESS_DOMAIN
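Once the manifest has been applied (for instance with kubectl apply -f), Contour records in the resource's status whether it accepted the proxy. The following sketch reads that status field; a value of valid indicates that the route was accepted. The function name is hypothetical.

```shell
# Hypothetical helper: prints the status Contour recorded for an HTTPProxy.
# Arguments: namespace, HTTPProxy name.
httpproxy_status() {
  local namespace="$1"
  local name="$2"
  kubectl get httpproxy "${name}" --namespace "${namespace}" \
    -o jsonpath='{.status.currentStatus}'
  echo
}

# Usage (requires cluster access):
#   httpproxy_status loki loki
```

If the status is anything other than valid, kubectl describe httpproxy on the same resource usually shows the reason, such as a missing service or a conflicting FQDN.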

Before proceeding, we have to extract the password for logging in to Grafana (username: admin).

kubectl get secret loki-grafana -n loki -o template --template '{{ index .data "admin-password" }}' | base64 -d; echo
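Kubernetes stores Secret values base64-encoded, which is why the command above pipes the value through base64 -d. The sketch below wraps the same lookup in a reusable function, using jsonpath output instead of a Go template. The function name is hypothetical. Applying it to the Grafana instance from Step 1 assumes that the kube-prometheus-stack chart created a secret named prometheus-grafana in the monitoring namespace, which this tutorial does not confirm.

```shell
# Hypothetical helper: prints the decoded Grafana admin password.
# Arguments: namespace, secret name.
grafana_admin_password() {
  local namespace="$1"
  local secret="$2"
  kubectl get secret "${secret}" --namespace "${namespace}" \
    -o jsonpath='{.data.admin-password}' | base64 -d
  echo
}

# Usage (requires cluster access):
#   grafana_admin_password loki loki-grafana
#   grafana_admin_password monitoring prometheus-grafana   # assumed secret name
```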

Now we can access the endpoint via the configured ingress domain “http://loki.blog.INGRESS_DOMAIN” and monitor the logs for all Tanzu Application Platform components and workload pods. Creating dashboards is out of scope for this tutorial, so a sample Explore query with Loki as the data source, showing the logs from the namespace “kapp-controller”, is given below.
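In Grafana's Explore view, logs for a single namespace are selected with a LogQL stream selector. The sketch below builds such a selector and, as an optional second step, sends it to Loki's HTTP API with curl. It assumes that Promtail attaches a namespace label to each log stream, that the Loki service is named loki, and that it has been made reachable on localhost:3100 (for example with a port-forward). None of these are confirmed by this tutorial, and both function names are hypothetical.

```shell
# Hypothetical helper: builds a LogQL stream selector for one namespace.
logql_namespace_selector() {
  printf '{namespace="%s"}' "$1"
}

# Hypothetical helper: queries recent logs for a namespace via Loki's HTTP API.
# Assumes Loki is reachable on localhost:3100, for example after running:
#   kubectl port-forward --namespace loki service/loki 3100:3100
query_namespace_logs() {
  curl --silent --get "http://localhost:3100/loki/api/v1/query_range" \
    --data-urlencode "query=$(logql_namespace_selector "$1")" \
    --data-urlencode "limit=20"
}

# Usage:
#   logql_namespace_selector kapp-controller   # paste the result into Explore
#   query_namespace_logs kapp-controller       # requires the port-forward
```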
