Nagios: A Modular Monitoring Tool for Infrastructure and Networks
If you are looking for an enterprise class open source network, application and server monitoring tool, there is no better option than Nagios. This article explores the very popular Nagios network infrastructure monitoring tool.
It is not uncommon for the IT department in an organisation to run into problems and spend a great deal of time on troubleshooting. So there is always a need to be more proactive and prevent these issues. It is also worthwhile to optimise the performance of the infrastructure and the network. When a problem occurs, figuring it out immediately and troubleshooting it is strenuous. Most of the problems tend to build up over a period of time. Monitoring the devices and the network is crucial in preventing major issues and enhancing performance. Moreover, if the health of the IT environment is not monitored, how can one ensure efficient service to the end users? Having total visibility of your infrastructure is paramount in preventing issues and making informed decisions to shape the future of the IT environment.
Monitoring not only helps in preventing IT disasters, it also has a lot of other advantages. It enables data-driven insights and helps in decision making. We can improve productivity and performance using this data. It prevents downtime and business losses. It can also help to cut down a company’s budget and increase savings. Keeping track of the performance of hardware and software resources involves having first-hand knowledge of the health of the network and a complete control over IT resource usage trends.
There are numerous open source and commercial products in the market for monitoring the network and the infrastructure. Some of the well known open source monitoring tools are Nagios, Zabbix, Cacti, Icinga, Groundwork, OpenNMS and Hyperic. Among these, Nagios is the most powerful tool. It is a framework that is capable of monitoring a large network of around 100,000 hosts and almost all components. Nagios can be integrated with third party tools. It uses the concept of modularity via plugins, which provide support for protocols, operating systems, system metrics, applications, services, Web servers, websites, middleware, etc.
Nagios does not have any internal monitoring logic. It is unaware of what is to be monitored and contains no built-in
proprietary interpreters. It only does what it is instructed to do in a way one expects it to be done. Nagios lets the user intelligently schedule monitoring programs in any language. These monitoring programs are called plugins, which report the monitoring status back to Nagios. It has lots of hooks that make it easy to get data in and out; so it can provide real-time data to graphing programs such as RRDTool and MRTG, and can easily work along with other monitoring systems, either by feeding them or by being fed by them.
Nagios Core and Nagios XI
Nagios Core is open source whereas Nagios XI is a commercial, enterprise version of Nagios. Unlike Nagios Core, Nagios XI is completely Web based. It is also wizard driven. The dashboards are more robust and personalised. There are plenty of additional features in the enterprise version such as an advanced UI, maintenance tools, integration with third party ticketing and much more.
How does Nagios work?
Nagios was developed for Linux but is now capable of monitoring UNIX based operating systems. It has three components — the daemon, the Web interface, and the plugins. Nagios has four basic object types, which are listed below.
Commands: These are used to interact with the plugins, and to control the event handlers, notifications and checks.
Contact and contact groups: These define the individuals to be contacted in case of an event.
Host and host groups: These are used to specify services and hosts in a particular network.
Notifications: These define what content needs to be sent to contacts and contact groups when there are drifts.
Services: These specify network services such as HTTP/80, SSH, FTP, etc, and host services such as clock,
DNS entries, etc.
Host and service check definitions instruct Nagios which plugins to call to obtain the status of a host or service. The definitions in the plugins are checked against that particular host/service. If it is seen going beyond the threshold, Nagios notifies the contacts/contact groups. Nagios returns any one of the following four status codes for any event:
Nagios can be configured to do either passive checks or active checks. In active checks, the host that runs Nagios ‘polls’ devices or services for status information every so often, whereas passive checks are initiated and performed by external applications/processes. The results are then sent to the monitoring host (where Nagios runs) for processing. The major difference between active and passive checks is that active checks are initiated and performed by Nagios, while passive checks are performed by external applications. Passive checks can be used to monitor asynchronous services and to monitor the services that are located behind a firewall. Active checks can be used for on-demand checks and regular interval checks. The results can be monitored via the Web GUI, which is CGI based.
The features of Nagios are listed below:
Centralised view of the entire monitored IT infrastructure Automatic restart of failed applications is granted by its event handlers
Selective access allows clients to view only the infrastructure components that affect them
Monitoring of network services like SMTP, POP3, HTTP, NNTP, PING, etc
Monitoring of host resources such as processor load, disk usage, etc
Parallelised service checks
Ability to define network host hierarchy using ‘parent’ hosts, allowing detection of and distinction between hosts that are down and those that are unreachable
Optional Web interface for viewing current network status, notifications and problem history, log files, etc Powerful script APIs allow easy monitoring of in-house and custom applications, services, and systems Availability reports ensure SLAs are being met
Failover capabilities ensure non-stop monitoring of critical IT infrastructure components
Let’s install Nagios on Ubuntu 16.x (or above) where SELinux is not in an enforcing state.
Prerequisite: Graphics library to display the monitoring on the Web front-end (Apache). Use the following commands to download it:
$sudo apt-get update
$sudo apt-get install -y autoconf gcc libc6 make wget unzip apache2 php libapache2-mod-php7.0 libgd2-xpm-dev
Download Nagios from the source, as follows:
$wget -O nagioscore.tar.gz https://github.com/ NagiosEnterprises/nagioscore/archive/nagios-4.3.4.tar.gz $ttar xzf nagioscore.tar.gz
Compile the source using the following code:
$sudo ./configure withhttpdconf=/etc/apache2/sitesenabled
$sudo make all
Create User nagios and www-data. Add www-data to the Nagios group, as follows:
$sudo useradd nagios
$sudo usermod -a -G nagios www-data
Install binaries, HTMLs and CGIs by giving the following command:
$sudo make install
Install and configure the service or daemon that has to start on booting up, as follows:
$sudo make install-init
$sudo update-rc.d nagios defaults
Next, install the *SAMPLE* configuration files to allow Nagios to start:
$sudo make installconfig
Install the Apache Web server configuration files and configure Apache settings, as shown below:
$sudo make install-webconf $sudo a2enmod rewrite $sudo a2enmod cgi
Configure the firewall to allow Port 80 inbound traffic on the local firewall in order to reach the Nagios Core Web interface, as follows:
$sudo ufw allow Apache $sudo ufw reload
Create an Apache user account to be able to log into Nagios. This user name and password would be used to log in to the Nagios Web page:
$sudo htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Restart Nagios Core and Apache Service, as follows:
$sudo systemctl restart apache2.service $sudo systemctl start nagios.service
To test how Nagios Core works, open the Nagios Web client in the browser at http://<ip address or FQDN>/nagios. The page that appears is shown in Figure 1.
The host and services page will display errors as there are no plugins installed, but once they are, these errors will get resolved. Here are steps to install plugins from the Nagios Plugin Package.
Download the prerequisites as follows:
$sudo apt-get install -y autoconf gcc libc6 libmcrypt-dev make libssl-dev wget bc gawk dc build-essential snmp libnet-snmpperl gettext
Next, download and compile the source, and install the plugins:
$wget nocheckcertificate O nagiosplugins.tar.gz https:// github.com/nagios-plugins/nagios-plugins/archive/release2.2.1.tar.gz
$tar zxf nagios-plugins.tar.gz
$sudo make install
Figure 1: Nagios Web UI
Figure 2: Services status after installing plugins