Consul, Prometheus and Puppet
Recently I’ve been playing around with Prometheus. For now I think it is the best open source solution for monitoring (in the same way that chlamydia is probably the best STD). Previously I was a fan of Sensu, but honestly Sensu has too many moving parts that can go wrong, and inevitably they did.
So, why do I like Prometheus? It stays pretty close to the UNIX philosophy of doing one thing and doing it well: at its core it is just a time-series database. Alerting, for example, lives in a separate component (Alertmanager) and graphing is pretty much left to Grafana. Initially I was not taken by it for one simple reason:
- All its configuration is central.
Unlike with Sensu, a node cannot announce itself to the Prometheus server and then be automatically monitored. In this day and age, that sucks. However, while browsing the docs I discovered that it supports service discovery.
So the process:
- Use Puppet to configure Prometheus
- Individual nodes announce to Consul what services they have
- Prometheus collects its endpoints from Consul
The demo network consists of five machines:
- Prometheus (also the Consul server)
- Puppet
- 3 that will be monitored
This is a very simple Consul cluster. Normally one would run at least three Consul servers spread across different datacentres; it works for this demo though.
Right, let’s jump straight into the Puppet code. I am using the classic ‘Roles and Profiles’ pattern. You can find my control repo here. A few Puppet modules are necessary, so your Puppetfile will contain:
forge 'http://forge.puppetlabs.com'
mod 'KyleAnderson/consul', '2.1.0'
mod 'puppet/archive', '1.3.0'
mod 'puppetlabs/stdlib', '4.15.0'
mod 'puppetlabs/firewall', '1.8.2'
mod 'prometheus',
  :git => 'https://github.com/voxpupuli/puppet-prometheus.git'
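If you manage your control repo with r10k, running r10k puppetfile install will pull these modules down.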
To begin with, let’s install Node Exporter everywhere. This will collect basic system stats and make them available to Prometheus.
In common.yaml:
---
prometheus::node_exporter::version: '0.13.0'
and in your profile::base:
class profile::base {
  include ::prometheus::node_exporter

  # Allow Prometheus to scrape the node exporter's metrics port
  firewall { '102 node exporter':
    dport  => 9100,
    proto  => 'tcp',
    action => 'accept',
  }
}
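Once this is applied, every node should expose its metrics at http://localhost:9100/metrics.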
Consul needs to be everywhere, and each node needs to announce to it that the node exporter is there, so extend your base profile:
class profile::base {
  include ::consul

  # Allow access to Consul's CLI RPC (8400) and HTTP API (8500) ports
  firewall { '103 Consul':
    dport  => [8400, 8500],
    proto  => 'tcp',
    action => 'accept',
  }
}
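If you assign classes through Hiera, as is done for the Prometheus node below, pulling the base profile onto every machine is a one-line addition to common.yaml:

classes:
  - profile::base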
And in common.yaml:
---
consul::version: 0.7.4
consul::config_hash:
  data_dir: '/opt/consul'
  datacenter: 'homelab'
  log_level: 'INFO'
  node_name: "%{::hostname}"
  retry_join:
    - 192.168.1.89
consul::services:
  node_exporter:
    address: "%{::fqdn}"
    checks:
      - http: http://localhost:9100
        interval: 10s
    port: 9100
    tags:
      - monitoring
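Once Puppet has run on a node, you can check that the service registered by querying Consul's HTTP API, for example http://localhost:8500/v1/catalog/service/node_exporter on any cluster member.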
Obviously, modify the retry_join to suit your infrastructure. If you are doing the right thing and have a cluster, add the other servers to the array.
For the Consul server, create a profile that contains:
class profile::consulmaster {
  # Server RPC (8300), Serf LAN/WAN gossip (8301/8302) and DNS (8600)
  firewall { '102 consul inbound':
    dport  => [8300, 8301, 8302, 8600],
    proto  => 'tcp',
    action => 'accept',
  }
}
You need the following in Hiera applied to that node (or nodes):
---
consul::version: 0.7.4
consul::config_hash:
  bootstrap_expect: 1
  data_dir: '/opt/consul'
  datacenter: 'homelab'
  log_level: 'INFO'
  server: true
  node_name: "%{::hostname}"
Change bootstrap_expect to match the number of servers in your cluster; it is the number of servers Consul waits for before bootstrapping itself.
To configure the Prometheus server itself, create profile::prometheus:
class profile::prometheus {
  # Prometheus web UI/API (9090) and Alertmanager (9093)
  firewall { '100 Prometheus inbound':
    dport  => [9090, 9093],
    proto  => 'tcp',
    action => 'accept',
  }

  class { 'prometheus':
    scrape_configs => [
      {
        'job_name'          => 'consul',
        'consul_sd_configs' => [
          {
            'server'   => 'localhost:8500',
            'services' => [
              'node_exporter',
            ],
          },
        ],
      },
    ],
  }
}
This will create a scrape config that queries Consul for all services named 'node_exporter'.
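For reference, what this renders into prometheus.yml looks roughly like the following (a sketch; the exact output depends on the module's defaults):

scrape_configs:
  - job_name: consul
    consul_sd_configs:
      - server: 'localhost:8500'
        services:
          - node_exporter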
Finally, the Hiera data for your Prometheus node will look like:
---
classes:
  - profile::prometheus
prometheus::version: '1.5.0'
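After the next Puppet run, the three monitored nodes should show up on the targets page of the Prometheus web UI (port 9090, under /targets), having been discovered via Consul.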
That is it!
As an aside, the basic ideas here are based on Gareth Rushgrove’s excellent presentation about having two different speeds of configuration management: Puppet is the slow and stable speed, while Consul, in parallel, provides a second path that is much more reactive.
{% youtube XfSrc_sAm2c %}