To access the alerts table, click on the “Alerts” menu on the left. A large table will appear as shown below.
By default, certain columns are not displayed in order to limit the size of each line and adapt to all screen sizes.
If required, you can display the columns that are hidden by default using the 'Show/hide columns' button at the top right, outlined in red in the image below:
The menu appears just below the button. Uncheck the “Auto” box to choose which columns to show or hide.
The following columns are hidden by default:
You can run a basic search, such as filtering on groups that contain the letters “serv”. The result looks like this.
This feature is available in versions later than 3.2.5.
There are also keywords that let you either configure your dashboard widget or refine your search. Here is the list of keywords:
The screen with “VOIP|| té”:
Here is an example on a “Current Alerts” dashboard widget, which has a filter section on the right. To exclude alerts at the unknown level, enter !unknown in the alerts filter, as shown below.
Once saved, you can see that the filter has been added to your dashboard widget.
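To make the effect of these filters concrete, here is a minimal Python sketch of how a plain search term and a negated term such as !unknown could be evaluated. It is only an illustration: the Alert fields, the matches function and the case-insensitive "contains" behaviour are assumptions made for this example, not Esia's actual implementation, and the other keywords are not modelled.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    # Hypothetical fields, chosen only for this illustration.
    node: str
    group: str
    service: str
    state: str  # e.g. "critical", "warning", "unknown"


def matches(alert: Alert, term: str) -> bool:
    """Return True if the alert passes a single filter term.

    Assumed semantics: a leading "!" excludes alerts that contain the text;
    otherwise the alert must contain the text (case-insensitive).
    """
    negate = term.startswith("!")
    needle = term.lstrip("!").lower()
    haystack = " ".join(
        [alert.node, alert.group, alert.service, alert.state]
    ).lower()
    found = needle in haystack
    return not found if negate else found


alerts = [
    Alert("houston", "servers", "PING", "critical"),
    Alert("paris", "servers", "CPU", "unknown"),
    Alert("berlin", "network", "DISK", "warning"),
]

# "!unknown" drops the alert whose state is unknown.
without_unknown = [a for a in alerts if matches(a, "!unknown")]
# "serv" keeps the alerts whose text contains those letters.
servers_only = [a for a in alerts if matches(a, "serv")]
```

With this sample data, without_unknown keeps the houston and berlin alerts, and servers_only keeps the houston and paris alerts.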
By default, the table displays errors according to the priority of each service. There are 7 levels available (as in the OSI model). This enables errors to be sorted automatically. Level 1 is the most critical.
By default, the Windows or Linux supervision pattern prioritises services as follows.
This basic nomenclature can be explained as follows: If the ping does not respond, the node is unreachable, so there is no point in displaying the rest. If the CPU load is 100%, it is normal for SNMP requests to fail, and the problem to deal with is the CPU load. If SNMP is not configured, only the CPU line is displayed. It is therefore not necessary to display the other errors, which would be duplicates.
Example: my Houston server, which has a PING problem (note the use of a search filter).
If I click on it, I can see that there are 4 services in error: the ping plus the 3 basic SNMP services. In the example below, the “Processor” service has been acknowledged.
As Ping has the highest priority (1 by default), the alert table has hidden all the errors with a higher level number, that is, a lower priority.
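The behaviour described above can be sketched in a few lines of Python: for each node, only the errors at the most critical level (the lowest number) are kept. The service names and levels below are illustrative assumptions, not Esia's default values, and the handling of acknowledged services is not modelled.

```python
from collections import defaultdict

# Each entry: (node, service, priority level).
# Names and levels are illustrative assumptions, not Esia defaults.
errors = [
    ("houston", "PING", 1),
    ("houston", "Processor", 2),
    ("houston", "Memory", 3),
    ("houston", "Disk", 3),
    ("dallas", "Memory", 3),
]


def top_priority_errors(errors):
    """Keep, for each node, only the errors at its most critical level."""
    by_node = defaultdict(list)
    for node, service, level in errors:
        by_node[node].append((service, level))

    shown = {}
    for node, items in by_node.items():
        # Level 1 is the most critical, so the minimum wins.
        most_critical = min(level for _, level in items)
        shown[node] = [svc for svc, level in items if level == most_critical]
    return shown


print(top_priority_errors(errors))
# prints {'houston': ['PING'], 'dallas': ['Memory']}
```

For houston, the three lower-priority errors disappear behind PING, which matches the single line shown in the alerts table.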
If you want to change the priority of services on a node, you can use the following tutorial as a guide: Applying services to your nodes
Let's take a classic Esia server. Its hardware-related part is supervised by the Linux pattern, which has the following basic service priorities:
I would add the service that tests disk IO (CHECK_SNMP_LINUX_IO) and give it level 4 priority, because if the IOs are saturated, my database could go down or my Apache server could become very slow. We therefore consider that an error at priority level 4 or below points to a “hardware” problem.
For the software, here is the list of processes running on our server:
I'm going to add the following services, from the most critical to the least critical, in other words by rebuilding the dependency chain.
So as soon as I have an error on my server, I already have a diagnosis just by reading the first line in my dashboard.
Finally, here is the list of all the services with their respective priorities.