Google Cloud and Nagios monitoring
Author: Mike Nieuwstraten
Category: IT development and operations
If you’re planning to move your resources to the cloud, you’ll be forced to rethink a lot of your infrastructure choices. One often overlooked choice is monitoring. Most cloud providers offer their own monitoring tools, but adopting one means your company has to build up knowledge of that particular tool, and it gives you one more monitoring tool to configure and manage.
Qualogy has been moving more and more of its resources to Google Cloud and one of the challenges was to find the right monitoring option for this situation.
Stackdriver can be quite expensive
As early adopters of Google’s G Suite, we have always preferred Google Cloud. Google offers Stackdriver as its monitoring tool of choice, which can be quite expensive depending on the number of resources used.
Spending a lot of time and money on monitoring isn’t always desirable, and monitoring is often overlooked or ignored, especially in the early stages of moving your on-site resources to the cloud.
Nagios infrastructure monitoring tools
We are big fans of Nagios, one of the more widely used infrastructure monitoring tools, and were quite surprised at how easy it was to integrate with Google Cloud. We were even more surprised by how powerful Nagios could be with the added functionality of Google Cloud SDK.
gcloud: a command-line interface to Google Cloud Platform
Google Cloud SDK gives you access to gcloud, a tool that provides a command-line interface to the Google Cloud Platform. This means that most of what you can do in Google Cloud using your browser, you can also do on the command line.
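As a rough illustration of what this looks like from a script, the sketch below asks gcloud for the list of Compute Engine instances as JSON and reduces it to a name-to-status mapping. It assumes the Google Cloud SDK is installed and authenticated on the monitoring host; the helper names are made up for this example.

```python
import json
import subprocess
from typing import Dict

# Real gcloud subcommand; --format=json makes the output machine-readable.
LIST_COMMAND = ["gcloud", "compute", "instances", "list", "--format=json"]


def parse_instance_status(raw_json: str) -> Dict[str, str]:
    """Map each instance name to its status (for example RUNNING or TERMINATED)."""
    return {item["name"]: item["status"] for item in json.loads(raw_json)}


def list_instance_status() -> Dict[str, str]:
    """Run gcloud and parse its output (hypothetical helper, needs the SDK installed)."""
    result = subprocess.run(LIST_COMMAND, capture_output=True, text=True, check=True)
    return parse_instance_status(result.stdout)
```

Keeping the parsing separate from the gcloud call makes the logic easy to check with sample data before it is wired into a monitoring check.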
This ability to interact with your cloud environment using command-line commands is perfect for Nagios. Event handlers are a key feature of Nagios: if Nagios determines there’s something wrong with one of your resources, it will automatically run a script to handle this event.
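A minimal sketch of such an event handler follows. It assumes the Nagios command definition passes the standard macros $HOSTSTATE$, $HOSTSTATETYPE$ and $HOSTATTEMPT$ (or their service equivalents) as arguments, and that you only want to act once a problem is confirmed as a HARD state. The function names are illustrative and not part of Nagios.

```python
from typing import Callable, Sequence


def should_handle(state: str, state_type: str) -> bool:
    """Act only on confirmed (HARD) problem states, not on the first soft failure."""
    return state_type == "HARD" and state in {"DOWN", "CRITICAL"}


def handle_event(argv: Sequence[str], action: Callable[[], None]) -> bool:
    """Run `action` if the event warrants it.

    argv is expected to be [state, state_type, attempt], in the order the
    (assumed) Nagios command definition passes the macros.
    Returns True when the action was triggered.
    """
    if len(argv) < 2:
        return False  # Not enough information to decide; do nothing.
    state, state_type = argv[0], argv[1]
    if should_handle(state, state_type):
        action()
        return True
    return False
```

In a real handler script, `argv` would be `sys.argv[1:]` and `action` would be whatever recovery step suits the resource, such as a gcloud call.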
‘Start secondary’ script
One example of how we use this is our ‘start secondary’ script. Nagios checks whether our primary server in a certain region is up or down. If the server is down and does not come back up in time, the event handler uses Google’s gcloud to bring up the resource in a different region.
Not having an active, running secondary server saves costs, and the performance of Google Cloud is high enough to keep downtime to a minimum.
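The ‘start secondary’ idea could be sketched roughly as follows. This is not our actual script: it assumes the secondary already exists as a stopped Compute Engine instance, and the instance and zone names in the usage note are placeholders. The `gcloud compute instances start` subcommand and its `--zone` flag are real.

```python
import subprocess
from typing import Callable, List


def build_start_command(instance: str, zone: str) -> List[str]:
    """Build the gcloud call that starts an existing, stopped instance."""
    return ["gcloud", "compute", "instances", "start", instance, f"--zone={zone}"]


def start_secondary(instance: str, zone: str,
                    run: Callable[..., object] = subprocess.run) -> List[str]:
    """Start the standby instance and return the command that was executed.

    `run` is injectable so the function can be exercised without gcloud installed.
    """
    command = build_start_command(instance, zone)
    # check=True makes subprocess.run raise if gcloud exits with a non-zero status.
    run(command, check=True)
    return command
```

Called from an event handler, this might look like `start_secondary("secondary-vm", "us-east1-b")`, where both values stand in for your own instance name and zone.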
Disk space alerts
Another example is disk space alerts, a common event in most IT organizations. Nagios can monitor disk space and usage quite easily, and with the added functionality of gcloud it can also act on these events by enlarging a disk or adding new disks.
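A disk space handler along these lines might look like the sketch below. The growth percentage, the size cap and the function names are assumptions made for illustration. `gcloud compute disks resize` with `--size` and `--zone` is a real subcommand, and `--quiet` suppresses its confirmation prompt. Note that a persistent disk can only grow, and that the file system inside the VM still has to be extended separately.

```python
from typing import List, Optional


def next_size_gb(current_gb: int, growth_percent: int = 20, max_gb: int = 500) -> int:
    """Return the new disk size: current size plus a percentage, capped at max_gb."""
    # Integer ceiling division avoids floating-point rounding surprises.
    increase = max(1, (current_gb * growth_percent + 99) // 100)
    return min(current_gb + increase, max_gb)


def build_resize_command(disk: str, zone: str, new_size_gb: int) -> List[str]:
    """Build the gcloud call that resizes a persistent disk."""
    return ["gcloud", "compute", "disks", "resize", disk,
            f"--size={new_size_gb}GB", f"--zone={zone}", "--quiet"]


def plan_resize(disk: str, zone: str, current_gb: int) -> Optional[List[str]]:
    """Return the resize command, or None when the disk is already at the cap."""
    new_size = next_size_gb(current_gb)
    if new_size <= current_gb:
        return None  # Nothing to do; shrinking is not possible.
    return build_resize_command(disk, zone, new_size)
```

The cap is a deliberate choice: an event handler that enlarges disks without an upper bound could quietly run up costs if something keeps filling the disk.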