Monitoring servers with Icinga usually requires a small setup on each monitored server. Of the available agents, the NRPE daemon running on top of xinetd is easy to set up and can report any information that is available locally on the monitored server. To automate this setup, and as a learning exercise, I wrote an Ansible playbook, which I describe here.

The Linux world is big, so you might come here asking: “Why the hell do you use xinetd and NRPE for monitoring?” Looking at the options in the official documentation, there are actually many different techniques that can work; I just happen to always use NRPE in an environment of mostly Linux servers. Here is the list of all the options, which I should probably check out in a future post:

SNMP
SSH
NSClient++ (https://nsclient.org/)
NSCA-NG
NRPE
Passive Check Results and SNMP Traps

Starting from the bottom: Configuring Ansible

Ansible uses an agentless architecture, but one host still needs to act as the “master”, the point of execution. To run the following playbook, Ansible can be installed on the monitoring server or on a different machine, such as a virtual machine on the engineer's laptop that is connected to the network only during installation or maintenance.

After an easy installation of Ansible through the package manager, what remains is to establish the SSH connection between the master and the servers involved in the monitoring setup. An article in the Linux Journal is a great resource on how to go about this. Basically, the more convenient you make it for Ansible to connect to the servers, the less secure the setup becomes. For the test setup on my laptop I chose an insecure but convenient solution: an SSH key pair without a passphrase, which lets whoever holds the private key log into the remote server's root account. While this undermines any security mechanisms that might have been established before, it is probably optimal for playing around and learning Ansible.

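 # generate a key pair; leave the passphrase empty for the password-less setup
 ssh-keygen -t ed25519
 # repeat the two steps below for every involved server
 ssh-copy-id -i ~/.ssh/id_ed25519.pub root@remote.computer.ip
 ssh root@remote.computer.ip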
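
With more than two or three servers, repeating ssh-copy-id by hand gets tedious. Here is a minimal sketch of a loop that does the same thing, assuming a POSIX shell. The function name copy_key_to_hosts is made up for this example, and the key path and hostnames in the usage line are placeholders:

```shell
# Copy one public key to the root account of every host given as an argument.
# Returns non-zero if at least one host could not be reached.
copy_key_to_hosts() {
    key="$1"
    shift
    failed=0
    for host in "$@"; do
        if ! ssh-copy-id -i "$key" "root@${host}"; then
            echo "key copy failed for ${host}" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Usage (placeholder key path and hostnames):
# copy_key_to_hosts "$HOME/.ssh/your_key.pub" server1.laptop server2.laptop icinga2.laptop
```

Each host still asks for the root password once; after that, the key is enough.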

After the SSH setup we need to declare the servers on the Ansible server. In my setup I declared the exact IPs in /etc/hosts and referred to the declared hostnames in /etc/ansible/hosts. My test environment doesn't have static IP addresses, so I'd like to be able to change them conveniently in one place (/etc/hosts) whenever they change.

The servers involved in the monitoring setup are either monitored servers or monitoring servers, so it makes sense to declare these two groups for Ansible in /etc/ansible/hosts, something like this:

[monitored_servers]
server1.laptop
server2.laptop

[monitoring_servers]
icinga2.laptop
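
Since the inventory only works if every name in it resolves, it can be worth checking the two files against each other. The following is a rough sketch and not part of the original setup: the helper names inventory_hosts and missing_from_hosts are hypothetical, and it assumes a simple INI inventory like the one above, without [group:vars] sections.

```shell
# Print the unique hostnames of a simple INI inventory,
# skipping group headers, comments and blank lines.
inventory_hosts() {
    grep -v -e '^\[' -e '^#' -e '^[[:space:]]*$' "$1" | awk '{ print $1 }' | sort -u
}

# Print every inventory hostname that has no entry in the hosts file.
# The second argument defaults to /etc/hosts.
missing_from_hosts() {
    inventory="$1"
    hosts_file="${2:-/etc/hosts}"
    for name in $(inventory_hosts "$inventory"); do
        # Field 1 of a hosts line is the IP; every later field is a name.
        awk -v n="$name" '
            !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$hosts_file" || echo "$name"
    done
}

# Usage:
# missing_from_hosts /etc/ansible/hosts
```

The call prints nothing when every inventory name has an entry in /etc/hosts. Once that is the case, ansible all -m ping is a quick way to confirm that Ansible can actually log in everywhere.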

The Ansible Playbook

The playbook itself is a .yml file and is easily readable once you get the hang of it. It helps to know that in this file every entry besides “name”, “hosts”, “tasks”, the conditional “when” and the loop “with_items” refers to an Ansible module. So if you stumble across a section of this .yml file and it is not clear how it works, simply search for its name in the Ansible module documentation. To me all the different names were confusing at the start, and once I figured this out playbooks became much easier to read.
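
If you prefer the terminal over the website, the same documentation ships with Ansible as the ansible-doc command. Here is a small sketch, where show_module_docs is a made-up helper name; the -s flag asks ansible-doc for a short playbook snippet instead of the full page:

```shell
# Print the playbook snippet for each module name given as an argument.
show_module_docs() {
    for module in "$@"; do
        ansible-doc -s "$module"
    done
}

# Usage, with some of the modules that appear in the playbook:
# show_module_docs package service synchronize lineinfile
```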

Things worth pointing out:

For this task we need a specific monitoring package. It has a different name depending on the platform, which is why the first play, “Include OS specific variables”, loads the matching name. This depends on two short additional files, which can be found below the playbook.
Depending on the OS and the monitoring package, the check used to test connectivity at the end is also installed in a different location. To solve this, the playbook first searches for the check using find and runs the first match through shell command substitution.
The playbook copies over the nrpe configuration file for xinetd, which is expected at nrpe.d/nrpe relative to the playbook path. The content of this file can also be found below.
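
The find-based lookup from the second point can be tried by hand on a server before running the playbook. This is only a sketch of the same idea: find_check_nrpe and run_check_users are hypothetical helper names, and the optional search root and the -type f filter are additions that the playbook itself does not use.

```shell
# Print the path of the first file named check_nrpe below the given
# directory (default: /). Prints nothing if there is no such file.
find_check_nrpe() {
    find "${1:-/}" -type f -name check_nrpe 2>/dev/null | head -n 1
}

# Run the check_users command against a host through whichever
# check_nrpe binary was found. $1 = host, $2 = optional search root.
run_check_users() {
    plugin=$(find_check_nrpe "${2:-/}")
    if [ -z "$plugin" ]; then
        echo "check_nrpe not found" >&2
        return 1
    fi
    "$plugin" -H "$1" -c check_users
}

# Usage:
# run_check_users localhost
```

Searching the whole filesystem is slow, but it avoids hard-coding a plugin directory per distribution, which is the trade-off the playbook makes as well.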

---
- name: Include OS specific variables
  hosts: all
  tasks:
  - debug: var=hostvars[inventory_hostname]['ansible_distribution']
  - include_vars: oel.yml
    when: ansible_distribution == "OracleLinux"
  - include_vars: ubuntu.yml
    when: ansible_distribution == "Ubuntu"
  # TODO include a default case, for now this works though
- name: Install the nrpe check on the monitoring-server
  hosts: monitoring_servers
  tasks:
  - debug: var=monitoring_checks
  - name: Install required packages
    package:
      name: "{{ monitoring_checks }}"
      state: latest
- name: Install nrpe daemon with monitoring-configuration
  hosts: monitored_servers
  tasks:
  - debug: var=hostvars[inventory_hostname]['ansible_default_ipv4']['address']
  - debug: var=monitoring_checks
  - name: Disable the firewall
    when: ansible_distribution == "OracleLinux" #TODO handle different firewalls
    service:
      name: firewalld
      enabled: no #TODO create a rule instead of just disabling it
      state: stopped
  - name: Install required packages
    package:
      name: nrpe, {{ monitoring_checks }}, xinetd
      state: latest
  - name: Copy over the configuration file for the nrpe daemon
    synchronize:
      src: nrpe.d/nrpe
      dest: /etc/xinetd.d/nrpe
  - name: Add xinetd to autostart and start it
    service:
      name: xinetd
      enabled: yes
      state: started
  - name: Add the required entry to /etc/hosts
    lineinfile:
      path: /etc/hosts
      state: present
      line: "{{ hostvars[item]['ansible_default_ipv4']['address'] }}   monitoring-server"
    with_items: "{{ groups.monitoring_servers }}"
    # it should be ok to have multiple IP addresses under the same hostname in case we have multiple
    # monitoring servers. this way we add them all with the same hostname to /etc/hosts
  - name: Test the installation by executing the command locally
    raw: "$(find / -name check_nrpe 2>/dev/null | head -n 1) -H localhost -c check_users"
    # this command searches for the check_nrpe binary and then executes it,
    # because the location of the binary differs depending on the OS / packaging.
- name: Test the check_users check from the monitoring_servers
  hosts: monitoring_servers
  tasks:
  - raw: "$(find / -name check_nrpe 2>/dev/null | head -n 1) -H {{ hostvars[item]['ansible_default_ipv4']['address'] }} -c check_users"
    with_items: "{{ groups.monitored_servers }}"

The OS-specific variables file ubuntu.yml:

---
# define variables which are specific to the ubuntu setup
monitoring_checks: monitoring-plugins-common

The OS-specific variables file oel.yml:

---
# define variables which are specific to the oel setup
monitoring_checks: nagios-plugins-all, nagios-plugins-nrpe

The nrpe configuration file for xinetd (nrpe.d/nrpe):

# default: off
# description: NRPE (Nagios Remote Plugin Executor)
service nrpe
{
        flags           = IPv4
        socket_type     = stream
        type            = UNLISTED
        port            = 5666
        wait            = no
        user            = nagios
        group           = nagios
        server          = /usr/sbin/nrpe
        server_args     = -c /etc/nagios/nrpe.cfg --inetd
        log_on_failure  += USERID
        disable         = no
        only_from       = 127.0.0.1 localhost monitoring-server
}

Finally, to run the playbook on the hosts defined in /etc/ansible/hosts, simply execute the following:

ansible-playbook --become nrpe.yml
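
Before letting a playbook loose on real servers, it can help to check it first. Here is a hedged sketch using two standard ansible-playbook options, --syntax-check and --list-hosts. The wrapper name preflight_and_run is made up, and any extra arguments are simply passed on to the real run:

```shell
# Validate the playbook, show which hosts each play would touch,
# and only then run it. $1 = playbook, remaining arguments = extra
# options for the final ansible-playbook call.
preflight_and_run() {
    playbook="$1"
    shift
    ansible-playbook --syntax-check "$playbook" || return 1
    ansible-playbook --list-hosts "$playbook" || return 1
    ansible-playbook "$@" "$playbook"
}

# Usage, with -v for more verbose output:
# preflight_and_run nrpe.yml -v
```

Neither of the two preflight calls changes anything on the servers, so a typo in the YAML or a misspelled group name shows up before any package gets installed.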