Troubleshooting Mail on Nagios XI Server
Email management and troubleshooting can be a challenge because there are so many options and configuration choices that can get tangled.

Email Tests

Before you do anything else, test whether the Nagios server can send email to an account. Do this once you have configured the mail settings for Nagios. Go to Admin/System Settings and Manage Email Settings, then select “Send a Test Email”. Make sure the address used for testing is similar to the one you will use for the contact you are troubleshooting. Send the test email and check that it arrives. If it does not arrive, the problem is with how Nagios sends email. If you chose “Sendmail” as the mail method, the likely cause is that the Nagios server cannot send mail outside of your network. If you chose SMTP, the likely cause is that Nagios cannot relay mail through your company server. Check your SMTP settings and talk with your mail administrators to verify that Nagios can relay mail through the company server.
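If the test email from the XI interface does not arrive, it can help to try the relay by hand from the Nagios server. The following is a minimal sketch using Python's standard smtplib, not the mechanism Nagios XI itself uses. The relay host, port, and addresses in the usage comment are hypothetical, and the sketch assumes an unauthenticated relay without TLS.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender: str, recipient: str) -> EmailMessage:
    """Build a small plain-text test message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Nagios XI relay test"
    msg.set_content("Test message sent from the Nagios server to verify SMTP relay.")
    return msg


def send_test_message(msg: EmailMessage, host: str, port: int = 25,
                      timeout: float = 10.0) -> None:
    """Hand the message to the relay; smtplib raises if the relay refuses it."""
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.send_message(msg)


# Usage (hypothetical relay and addresses, replace with your own):
#   msg = build_test_message("nagios@example.com", "sue@example.com")
#   try:
#       send_test_message(msg, "mail.example.com")
#   except (smtplib.SMTPException, OSError) as exc:
#       print(f"Relay test failed: {exc}")
```

An exception such as a refused recipient or a connection timeout points at the relay or the network rather than at the Nagios notification settings.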
Host/Service Notification Basics

One place you can get caught is in the definitions for Check Period under the “Check Settings” tab and Notification Period under the “Alert Settings” tab. The Check Period must cover the Notification Period, that is, be equal to or larger than it. If Nagios is not checking a host or service during a specific time, it will not send notifications during that time. In other words, if you only check a service Monday through Friday and you have notifications set for Saturday, you will not receive notifications on Saturday or during any time exclusions. The other basic requirement is that the contact must either be directly associated with the host or service or be a member of a contactgroup that is assigned to the host or service. In the example, the contact sue is listed as a contact for this host.

Contact Timeperiods

Each contact has a timeperiod setting that determines when they receive notifications. Go to Configure/Core Config Manager/Alerting/Timeperiods and review the notification times for the specific user. In the example, the contact sue has her timeperiod defined as 24x7. Also review the time exclusions closely. Click “Manage Timeperiod Exclusions” to verify them. These are times when the user will not be sent notifications. In the example, sue will not receive notifications during us-holidays because it was added as an exclusion.
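The rule that the Check Period must cover the Notification Period can be checked mechanically. The sketch below is illustrative only: it assumes each period is written as a mapping from weekday name to comma-separated "HH:MM-HH:MM" ranges, ignores exclusions, and uses hypothetical helper names (parse_ranges, uncovered_days) that are not part of Nagios.

```python
def parse_ranges(spec: str) -> set[int]:
    """Turn "09:00-17:00,18:00-20:00" into the set of minutes of the day it covers."""
    minutes: set[int] = set()
    for part in spec.split(","):
        start, end = part.strip().split("-")
        start_h, start_m = map(int, start.split(":"))
        end_h, end_m = map(int, end.split(":"))
        minutes.update(range(start_h * 60 + start_m, end_h * 60 + end_m))
    return minutes


def uncovered_days(check: dict[str, str], notify: dict[str, str]) -> list[str]:
    """Return the days on which the notification period extends beyond the check period."""
    gaps = []
    for day, spec in notify.items():
        checked = parse_ranges(check[day]) if day in check else set()
        if not parse_ranges(spec) <= checked:
            gaps.append(day)
    return gaps


# Hypothetical periods mirroring the Monday-Friday example in the text.
workdays = {d: "00:00-24:00"
            for d in ("monday", "tuesday", "wednesday", "thursday", "friday")}
notify_period = dict(workdays, saturday="00:00-24:00")
print(uncovered_days(workdays, notify_period))  # ['saturday']
```

Any day returned here is a day on which notifications are configured but can never fire, because no checks run then.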
Notification Preferences

Log in as the administrator who should receive email notifications and go to Configure/Notification Options/Notification Preferences. There are three sections to these settings. The top section lets you turn notifications on or off and select how the contact is notified. Verify those settings first. Next, look at the second section, which determines what the contact will be notified about. In this example, the contact will only be notified when a host is down or in scheduled downtime, or when a service is in a CRITICAL state. Verify that these settings match the states of the host or service you are trying to receive notifications for.
The last section is another way to alter the timeperiods during which the contact will receive notifications.

Host/Service Alert Settings

It is important to check the host or service alert settings to make sure they are not stopping notifications when you want them to occur. In this example, no notification will be sent because notifications are disabled. In addition, even if notifications were enabled, these settings only send a notification when the host is down. These are simple settings that can get overlooked.
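The checks described under Notification Preferences and Host/Service Alert Settings can be summarized as a short list of conditions that must all hold. The following sketch is a simplified model for reasoning about them, not the actual Nagios filter logic, which also considers timeperiods, downtime, and other factors. The class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class NotifySettings:
    """Simplified stand-in for the enable flag and state checkboxes in the XI forms."""
    enabled: bool
    states: set = field(default_factory=set)


def blocking_reasons(state: str, alert: NotifySettings,
                     contact: NotifySettings) -> list:
    """List every setting that would stop a notification for the given state."""
    reasons = []
    if not alert.enabled:
        reasons.append("notifications are disabled in the host/service alert settings")
    if state not in alert.states:
        reasons.append(f"alert settings do not notify on {state}")
    if not contact.enabled:
        reasons.append("notifications are turned off in the contact's preferences")
    if state not in contact.states:
        reasons.append(f"contact preferences do not include {state}")
    return reasons


# Hypothetical values mirroring the example: notification disabled, DOWN only.
host_alerts = NotifySettings(enabled=False, states={"DOWN"})
sue = NotifySettings(enabled=True, states={"DOWN", "DOWNTIME"})
for reason in blocking_reasons("UNREACHABLE", host_alerts, sue):
    print(reason)
```

An empty list means none of these particular settings is blocking the notification, so the cause lies elsewhere, for example in the timeperiods or the mail transport.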
Tracking Notifications

Go to Home/Incident Management/Notifications to see which notifications Nagios is sending, based on the settings you have chosen, and to which contacts. In this example, the contact sue should have received notifications about these two problems. This tool helps you confirm whether Nagios intends to notify the appropriate contact.
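The same information can usually be cross-checked in the Nagios log file. The sketch below assumes the Nagios Core log line layout, "[timestamp] HOST NOTIFICATION: contact;host;state;..." and "[timestamp] SERVICE NOTIFICATION: contact;host;service;state;...". Verify this against your own log before relying on it. The sample lines are invented for illustration.

```python
import re
from typing import NamedTuple, Optional


class Notification(NamedTuple):
    timestamp: int
    kind: str               # "HOST" or "SERVICE"
    contact: str
    host: str
    service: Optional[str]  # None for host notifications
    state: str


LINE_RE = re.compile(r"^\[(\d+)\] (HOST|SERVICE) NOTIFICATION: (.*)$")


def parse_line(line: str) -> Optional[Notification]:
    """Parse one log line; return None if it is not a notification entry."""
    match = LINE_RE.match(line.strip())
    if not match:
        return None
    timestamp, kind, rest = match.groups()
    fields = rest.split(";")
    if kind == "HOST":
        if len(fields) < 3:
            return None
        contact, host, state = fields[:3]
        service = None
    else:
        if len(fields) < 4:
            return None
        contact, host, service, state = fields[:4]
    return Notification(int(timestamp), kind, contact, host, service, state)


def notifications_for(lines, contact: str) -> list:
    """Return all notification entries addressed to the given contact."""
    parsed = (parse_line(line) for line in lines)
    return [n for n in parsed if n is not None and n.contact == contact]


# Invented sample lines in the assumed format.
sample = [
    "[1700000000] HOST NOTIFICATION: sue;web01;DOWN;notify-host-by-email;Host Unreachable",
    "[1700000060] SERVICE NOTIFICATION: sue;web01;HTTP;CRITICAL;notify-service-by-email;Connection refused",
    "[1700000120] SERVICE ALERT: web01;HTTP;CRITICAL;HARD;3;Connection refused",
]
for entry in notifications_for(sample, "sue"):
    print(entry.kind, entry.host, entry.service, entry.state)
```

If an entry for the contact appears here but no email arrives, Nagios did attempt the notification and the problem is in mail delivery rather than in the notification settings.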
Acknowledging Problems

Acknowledging a problem is another way to test email. Go to the host or service and click “Acknowledge this problem”. This takes you to a screen where you can add a comment to communicate the status to other administrators.
Note that for email testing you need to select the “Send Notification” option. Once the acknowledgment is committed, you should see it listed under Acknowledgments. You should also receive an email notification.

The Acknowledgment Email

Here is an example of the email that should be received when the problem is acknowledged.

***** Nagios XI Alert *****