Local Thanos on Kubernetes

Chris Cowley

March 3, 2026 - 6 minutes read - 1215 words

I run Kubernetes in my homelab and use the (now classic) Prometheus/Grafana stack for monitoring everything. In my experience, the thing that Prometheus does not do well is long-term storage. Prometheus’ own database is fine for a few days storage, but performance and reliability quickly degrade. There are quite a few solutions to this problem, but I choose to use Thanos, partly because Kube Prometheus Stack has very good support for it.

What is Thanos? Thanos is a project that provides a highly available, long-term storage solution for Prometheus. It is designed to be easy to set up and use, and it can be deployed in a variety of environments, including Kubernetes. Thanos works by running a sidecar container alongside each Prometheus instance, which uploads the data to a central store. This allows you to have multiple Prometheus instances running in parallel and it provides a single query interface for all of your data. In fact, one use case some of my colleagues use is to have a central Thanos instance that collects data from multiple Prometheus instances running in different environments and even continents.

I am only using a single instance and, if I am honest, the reason behind doing this is partly “because I can”. However, it does provide some benefits, such as the ability to store data for a longer period of time and to have a single query interface for all of my data. It also allows me to easily scale my monitoring setup in the future if I need to (however unlikely).

The architecture for a typical Thanos setup looks something like:

thanos-architecture

A sidecar container is added to the Prometheus Pod, which uploads to the Thanos Storage Gateway. This then writes the data to S3 for long-term storage.

And here we have a problem! I want to keep this local, so I need some S3 storage on site. For a long time, the default solution for this was MinIO, but they have stopped playing the open-source game. This is the open source world however, so there are alternatives. I choose to go with Garage because it seems pretty simple (UNIX philosophy and all that) and has excellent support for deployment on Kubernetes.

Garage deployment

As I said, Garage has really good support for Kubernetes and maintains a Helm chart.

garage:
  repicationFactor: 2

deployment:
  replicaCount: 3
  podManagementPolicy: Parallel

persistence:
  enabled: true
  meta:
    storageClass: longhorn
  data:
    size: 20G
    storageClass: longhorn-local

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/name
                operator: In
                values:
                  - garage
          topologyKey: kubernetes.io/hostname

There is more to my values.yaml than this, but this is the important stuff that is not specific to me. I keep the replication factor at 2 and te replica count at 3 for plenty for redundancy. Notice I use longhorn-local as the storageClass. As Garage is dealing with replication, I do not want to have the overhead of Longhorn’s own replication. The config for the storageClass looks like:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-local
parameters:
  dataLocality: strict-local
  fromBackup: ""
  fsType: ext4
  numberOfReplicas: "1"
  staleReplicaTimeout: "30"
provisioner: driver.longhorn.io
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true

With that in place you can install Garage using Helm in the namespace `garage``. I use FluxCD to manage it, but you do you.

helm install garage deuxfleurs/garage -n garage -f values.yaml \
  --create-namespace

Next we need to create a bucket for Thanos to store data in and a key to allow access:

kubectl exec --stdin --tty -n garage garage-0 -- ./garage status
kubectl exec --stdin --tty -n garage garage-0 -- ./garage layout \
  assign 0bfb01a7b5bea679 -z node-0 -c 19G
kubectl exec --stdin --tty -n garage garage-0 -- ./garage layout \
  assign 2d9502223f1a7d38 -z node-2 -c 19G
kubectl exec --stdin --tty -n garage garage-0 -- ./garage layout \
  assign e54cf4ab4f03e5a4 -z node-1 -c 19G
kubectl exec --stdin --tty -n garage garage-0 -- ./garage layout show
kubectl exec --stdin --tty -n garage garage-0 -- ./garage layout apply --version 1
kubectl exec --stdin --tty -n garage garage-0 -- ./garage bucket create thanos
kubectl exec --stdin --tty -n garage garage-0 -- ./garage key create thanos-app-key
kubectl exec --stdin --tty -n garage garage-0 -- ./garage bucket allow \
  --read --write --owner  thanos --key thanos-app-key

Info

Create a bash alias to make Garage commands easier to run:

alias garage="kubectl -n garage exec -it garage-0 -- ./garage"

When you created the key, it will have output the access and secret keys. Create a file called objstore.yml with the followin content:

type: S3
config:
  bucket: thanos
  endpoint: garage.garage.svc.cluster.local:3900
  region: garage
  access_key: <your access key here>
  secret_key: <your secret key here>
  insecure: true
  signature_version2: false

Now create a secret from this file:

kubectl --namespace monitoring create secret generic thanos-objstore \
  --from-file objstore.yml

How you keep that secure and versioned is an exercise for the reader (I use Sealed Secrets). In any case, now you we can move on to Prometheus/Thanos and that will hinge around that secret.

Adding Thanos sidecar

Like many in the Kubernetes world, I use Kube Prometheus Stack which has first class support for Thanos.

prometheus:
  enabled: true
... 
  thanosService:
    enabled: true
  prometheusSpec:
    retention: 6h
    externalLabels:
      cluster: lab
    thanos:
      objectStorageConfig:
        existingSecret:
          name: thanos-objstore
          key: objstore.yml

That addition to Prometheus enables the Thanos sidecar and the secret has all the information it needs to upload data to Garage. It also lowers the retention in Prometheus itself to 6 hours, which is more than enough.

With Prometheus writing to Garage, now we need to install the rest of Thanos.

Thanos deployment

The rest of Thanos is deployed using Helm also:

global:
  security:
    allowInsecureImages: true

image:
  registry: quay.io
  repository: thanos/thanos

existingObjstoreSecret: thanos-objstore

storegateway:
  enabled: true
  persistence:
    enabled: true
    size: 20Gi
    storageClass: longhorn

compactor:
  enabled: true
  persistence:
    enabled: true
    size: 20Gi
    storageClass: longhorn
  resources:
    requests:
      cpu: 100m
      memory: 512Mi

query:
  enabled: true
  replicaCount: 1
  dnsDiscovery:
    enabled: true
    sidecarsService: prometheus-kube-prometheus-thanos-discovery
    sidecarsNamespace: monitoring

  extraFlags:
    - --query.auto-downsampling

bucket:
  enabled: false

Install that using Helm in the monitoring namespace:

helm install thanos oci://registry-1.docker.io/bitnamicharts/thanos \
  --values ./thanos-values.yaml \
  --namespace monitoring

One of the key steps that makes this relatively simple was the creation of the secret with the credentials. By using the format we did, including the filename and key name, we are able to rely on defaults provided by the two Helm charts. This way we are not fighting an uphill battle for S3 access.

Now you have everything installed, but there is still a final step. For now we have Prometheus writing to Garage, with Thanos pointing at that same bucket. However, Grafana is still using Prometheus directly. What we want is Grafana to use Thanos Query, which will then decide whether to use Garage or Prometheus to serve the data. Once again we modify the values for Kube Prometheus Stack:

...
grafana:
...
  additionalDataSources:
    - name: Thanos
      type: prometheus
      access: proxy
      url: http://thanos-query.monitoring.svc.cluster.local:9090
      isDefault: true
      jsonData:
        prometheusVersion: 2.45.0
        prometheusType: Thanos
...

Apply that and Grafana will now have a (default) datasource for Thanos in addition to the existing Prometheus data source. Depending on your dashboards, you may need to update them to use the new datasource.

Conclusion

There you have it: scalable, long-term, completely local metrics storage in Kubernetes. I would have liked an automated way to pass the S3 credentials from Garage to the Thans secret, but this works. If anyone has any suggestions I am all ears.