Monitoring setup on Kubernetes
This tutorial walks you through installing a sample OpenTelemetry-based monitoring setup on Kubernetes that works together with the Observability patterns provided by nevisAdmin 4.
The configuration presented here is intended for use with Product Analytics and retains all data for 24 hours.
Prerequisites
- An existing Kubernetes cluster, with sufficient permissions to create cluster-scoped resources and namespaces.
- The following software installed:
  - kubectl: the Kubernetes command-line interface.
  - helm: the Helm command-line interface.
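The prerequisites above can be checked from a shell before anything is installed. The following is a minimal sketch, not an official check: the helper name `check_prereqs` is made up for illustration, and probing `namespaces` and `clusterroles` is only an assumption about which cluster-scoped resources are representative for this setup.

```shell
# Sketch: verify that the required tools and permissions are available.
# check_prereqs is a hypothetical helper name, not part of kubectl or helm.
check_prereqs() {
  local tool resource
  # Both CLIs must be on the PATH.
  for tool in kubectl helm; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing tool: $tool" >&2
      return 1
    fi
  done
  # Assumption: namespaces and clusterroles stand in for the
  # cluster-scoped resources the charts create.
  # kubectl auth can-i exits with a non-zero code when the answer is "no".
  for resource in namespaces clusterroles; do
    if ! kubectl auth can-i create "$resource" >/dev/null 2>&1; then
      echo "not allowed to create: $resource" >&2
      return 1
    fi
  done
  echo "prerequisites look fine"
}
```

After defining the function in the current shell, run `check_prereqs` against the cluster selected by the active kubectl context.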
Infrastructure
- Grafana Loki is used to store the logs of the Nevis components.
- Promtail is an agent which ships the gathered logs to the Grafana Loki instance.
- Grafana Tempo is a tracing backend. It's used to ingest the traces gathered by the OpenTelemetry Collector.
- Prometheus is used to ingest the metrics gathered by the OpenTelemetry Collector.
- OpenTelemetry Collector receives metrics and traces from the Nevis components using the Observability patterns and forwards them to Prometheus and Tempo.
- Grafana provides visualization for the gathered metrics, traces and logs.
Installation
Grafana Loki
Use the following values file for the installation.
loki-values.yaml
loki:
  auth_enabled: false
  persistence:
    enabled: true
    size: 50Gi
  limits_config:
    retention_period: 24h
  compactor:
    retention_enabled: true
promtail:
  config:
    snippets:
      pipelineStages:
        - match:
            selector: '{app!~"nevis.*"}'
            action: drop
            drop_counter_reason: not_nevis_log
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm upgrade --install --namespace observability --create-namespace loki grafana/loki-stack -f loki-values.yaml
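Once the release is installed, it can help to wait until its pods are ready before moving on. The sketch below wraps `kubectl wait` in a small helper; the name `wait_for_pods` and the default timeout of 300 seconds are assumptions chosen for illustration, not values from the charts.

```shell
# Sketch: block until every pod in a namespace reports Ready,
# or fail once the timeout expires.
# wait_for_pods is a hypothetical helper; the defaults are assumptions.
wait_for_pods() {
  local namespace="${1:-observability}"
  local timeout="${2:-300s}"
  kubectl wait --namespace "$namespace" \
    --for=condition=Ready pod --all --timeout="$timeout"
}
```

Calling `wait_for_pods` without arguments checks the `observability` namespace used throughout this tutorial, so the same call can be repeated after each installation step.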
Prometheus
Use the following values file for the installation.
prometheus-values.yaml
alertmanager:
  enabled: false
kube-state-metrics:
  enabled: false
prometheus-node-exporter:
  enabled: false
prometheus-pushgateway:
  enabled: false
server:
  extraFlags:
    - web.enable-remote-write-receiver
  fullnameOverride: prometheus
  retention: 24h
  persistentVolume:
    size: 25Gi
serverFiles:
  prometheus.yml:
    scrape_configs: []
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm upgrade --install --namespace observability --create-namespace prometheus prometheus-community/prometheus -f prometheus-values.yaml
Grafana Tempo
Use the following values file for the installation.
tempo-values.yaml
tempo:
  retention: 24h
  metricsGenerator:
    enabled: true
    remoteWriteUrl: "http://prometheus:80/api/v1/write"
persistence:
  enabled: true
  size: 50Gi
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm upgrade --install --namespace observability --create-namespace tempo grafana/tempo -f tempo-values.yaml
OpenTelemetry Collector
Use the following values file for the installation.
otel-values.yaml
nameOverride: "collector"
mode: deployment
presets:
  kubernetesAttributes:
    enabled: true
config:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: ${env:MY_POD_IP}:4317
        http:
          endpoint: ${env:MY_POD_IP}:4318
  processors:
    batch: {}
    memory_limiter:
      check_interval: 5s
      limit_percentage: 80
      spike_limit_percentage: 25
  exporters:
    prometheusremotewrite:
      endpoint: http://prometheus:80/api/v1/write
    otlp:
      endpoint: tempo:4317
      tls:
        insecure: true
  service:
    pipelines:
      traces:
        receivers: [otlp]
        processors: []
        exporters: [otlp]
      metrics:
        receivers: [otlp]
        processors: []
        exporters: [prometheusremotewrite]
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
helm upgrade --install --namespace observability --create-namespace otel open-telemetry/opentelemetry-collector -f otel-values.yaml
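With the release name `otel` and `nameOverride: "collector"`, the collector service is expected to be called `otel-collector`, which is the host name used by the endpoints of this tutorial. The sketch below lists the ports that service exposes; the helper name `collector_ports` is made up for illustration, and the exact port names depend on the chart version.

```shell
# Sketch: print "name=port" for every port of the collector service.
# collector_ports is a hypothetical helper name.
# Assumption: the service is named otel-collector, as in the endpoints
# of this tutorial.
collector_ports() {
  kubectl get service otel-collector \
    --namespace "${1:-observability}" \
    --output jsonpath='{range .spec.ports[*]}{.name}={.port}{"\n"}{end}'
}
```

If the OTLP receivers are configured as in `otel-values.yaml`, the output should include ports 4317 (gRPC) and 4318 (HTTP).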
Grafana
Use the following values file for the installation.
grafana-values.yaml
persistence:
  enabled: true
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Tempo
        type: tempo
        access: proxy
        orgId: 1
        url: http://tempo:3100
        basicAuth: false
        isDefault: true
        version: 1
        editable: true
        apiVersion: 1
        uid: tempo
        jsonData:
          serviceMap:
            datasourceUid: 'prometheus'
          tracesToLogsV2:
            datasourceUid: 'loki'
            spanStartTimeShift: '-10s'
            spanEndTimeShift: '10s'
            filterByTraceID: false
            filterBySpanID: false
            customQuery: true
            query: '{app=~"nevis.+"} |= "$${__span.traceId}"'
      - name: Loki
        type: loki
        access: proxy
        orgId: 1
        url: http://loki:3100
        basicAuth: false
        isDefault: false
        version: 1
        editable: true
        apiVersion: 1
        uid: loki
      - name: Prometheus
        type: prometheus
        access: proxy
        orgId: 1
        url: http://prometheus:80
        basicAuth: false
        isDefault: false
        version: 1
        editable: true
        apiVersion: 1
        uid: prometheus
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm upgrade --install --namespace observability --create-namespace grafana grafana/grafana -f grafana-values.yaml
After the installation completes, follow the instructions shown in the output to access the Grafana UI with port forwarding.
The added data sources are available under the /explore path.
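For reference, accessing the Grafana UI typically comes down to two steps: reading the generated admin password and forwarding a local port to the service. The sketch below assumes the chart defaults are unchanged, namely a secret and a service both named `grafana` and a service port of 80. The helper names and the local port 3000 are made up for illustration.

```shell
# Sketch: read the Grafana admin password from the secret created by the chart.
# Assumption: the secret is named "grafana" (release name used above).
grafana_admin_password() {
  kubectl get secret grafana \
    --namespace "${1:-observability}" \
    --output jsonpath='{.data.admin-password}' | base64 --decode
}

# Sketch: forward a local port (default 3000) to the Grafana service.
# Assumption: the service is named "grafana" and listens on port 80.
grafana_port_forward() {
  kubectl port-forward --namespace "${1:-observability}" \
    service/grafana "${2:-3000}:80"
}
```

With the port forward running, the UI is reachable at http://localhost:3000, and the data sources can be queried at http://localhost:3000/explore.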
Endpoints
With the above setup, the following endpoints can be used from within the cluster:
otelUrl: http://otel-collector.observability:4318
tracesEndpoint: http://otel-collector.observability:4318/v1/traces
metricsEndpoint: http://otel-collector.observability:4318/v1/metrics
logsEndpoint: http://otel-collector.observability:4318/v1/logs
prometheusUrl: http://prometheus.observability:80
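The three signal endpoints follow one pattern: the collector base URL, then `/v1/`, then the signal name. The sketch below builds them from that pattern and adds a simple smoke test that posts an empty OTLP/JSON trace export. The helper names are made up for illustration, and the smoke test assumes it runs inside the cluster (for example from a pod with `curl` available), where the service DNS name resolves.

```shell
# Sketch: build the OTLP/HTTP endpoint for a signal (traces, metrics or logs).
# otlp_endpoint is a hypothetical helper; the default base URL is the
# otelUrl listed above.
otlp_endpoint() {
  local signal="$1"
  local base="${2:-http://otel-collector.observability:4318}"
  echo "$base/v1/$signal"
}

# Sketch: post an empty trace export to check that the collector answers.
# curl exits with a non-zero code on connection errors or HTTP errors.
smoke_test_traces() {
  curl --silent --show-error --fail \
    --header 'Content-Type: application/json' \
    --data '{"resourceSpans":[]}' \
    "$(otlp_endpoint traces "$1")"
}
```

For example, `otlp_endpoint metrics` prints `http://otel-collector.observability:4318/v1/metrics`, matching the metricsEndpoint value above.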