Quick Tour

Get on board

Welcome to our quick tour! This page gives you a short impression of what logpecker can do for you. Details are left out. This is a quick tour, so don't complain; see Europe in three days but don't ask questions! Now get on board, the doors will close automatically.

Imagine we have a small network with one NFS server, called zappa, and several clients, named beethoven, bach and bruch. We want to use logpecker to process messages like the following:

Sept 19 20:13:15 bruch nfs: server zappa not responding
Sept 19 20:13:17 bruch nfs: server zappa OK

The rules

These are the rules needed to teach logpecker to recognize the messages; they have to go into the file "quicktour" in your rule search path.

group nfs.server

new crit
tag nfs
prio kern.*
match server $server not responding
name $server.no-response

new ok
tag nfs
prio kern.notice
match server $server OK
name $server.no-response

This defines two rules that match these messages. The first creates a "critical" incident; the second tells logpecker that this incident is now resolved. The following stripped-down configuration tells logpecker to use these rules to create a "ticker report":

report quicktour-ticker {
  type ticker;
  file /var/log/quicktour-ticker;
};

process {
  rules { quicktour; };
  reports { quicktour-ticker; };
};

Logpecker pecking

Now here is an example log (the leading number is a line number, added just for reference):

01  Sept 19 20:13:15 bruch nfs: server zappa not responding
02  Sept 19 20:13:17 bruch nfs: server zappa OK
03  Sept 19 20:18:00 bach nfs: server zappa not responding
04  Sept 19 20:18:01 bruch nfs: server zappa not responding
05  Sept 19 20:18:03 bruch nfs: server zappa OK
06  Sept 19 20:20:00 bach nfs: server zappa not responding
07  Sept 19 20:20:00 bruch nfs: server zappa not responding
08  Sept 19 20:20:03 beethoven nfs: server zappa not responding
...  many more of these from varying clients ...
09  Sept 19 21:15:00 bruch nfs: server zappa OK
10  Sept 19 21:15:01 beethoven nfs: server zappa OK
11  Sept 19 21:17:03 bach nfs: server zappa is OK

In line 1, an incident with the name "nfs.server.zappa.no-response" is created. Note that the name of the client is not included in the incident name, since it is irrelevant here. The clause "name $server.from.&host.no-response" would instead have created the name "nfs.server.zappa.from.bruch.no-response". You get the idea.
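To make the naming step concrete, here is a minimal Python sketch of how a "match" pattern could bind $server and how a "name" clause could then be expanded. It is not logpecker's implementation: the functions compile_match and expand_name are hypothetical, and the assumptions that a variable matches exactly one whitespace-free word and that the group name is prepended with a dot are inferred only from the example above.

```python
import re

def compile_match(pattern):
    """Turn a rule pattern such as 'server $server OK' into a regex.

    Assumption: each $word stands for one whitespace-free token.
    """
    # re.split with one capture group alternates literal text and variable names.
    parts = re.split(r"\$(\w+)", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)          # literal text
        else:
            regex += f"(?P<{part}>\\S+)"      # variable placeholder
    return re.compile("^" + regex + "$")

def expand_name(group, template, bindings):
    """Fill $var / &var placeholders and prefix the group name."""
    expanded = re.sub(r"[$&](\w+)", lambda m: bindings[m.group(1)], template)
    return f"{group}.{expanded}"

rule = compile_match("server $server not responding")
found = rule.match("server zappa not responding")
# 'host' is taken from the syslog line itself, not from the match pattern.
bindings = dict(found.groupdict(), host="bruch")

print(expand_name("nfs.server", "$server.no-response", bindings))
# nfs.server.zappa.no-response
print(expand_name("nfs.server", "$server.from.&host.no-response", bindings))
# nfs.server.zappa.from.bruch.no-response
```

Under the same assumptions, the pattern "server $server OK" does not match the text "server zappa is OK", because the extra word "is" has no place in the pattern.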

Logpecker always waits for a short period (20 seconds, but this is configurable) before an incident is actually reported. This means that in line 2 it recognizes that the "critical" condition is over before it has even considered telling anybody about it. So the incident is silently dropped.

In line 3, another incident with the same name is created. The message in line 4 would create yet another incident with the same name, so it is "lumped together" with the existing one. After line 5, there is no incident left. Again, nothing has been reported.

At line 6, something really bad has happened. All clients log error messages. Logpecker again lumps these together and waits until 20:20:20; then the grace period is over and the incident is reported; at 21:15:00 (line 9), the incident is over.
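The walk-through above can be replayed with a small Python simulation. This is only a sketch of the grace-period and lumping behaviour as described here, not logpecker's code: the function run and the event tuples are made up for illustration, only the single incident from this example is tracked, and the grace period is checked when the next message arrives rather than by a timer.

```python
GRACE = 20  # seconds; the tour says this delay is configurable

def seconds(hms):
    """Convert 'HH:MM:SS' to seconds since midnight."""
    h, m, s = map(int, hms.split(":"))
    return h * 3600 + m * 60 + s

def run(events, grace=GRACE):
    """Replay (time, host, kind) events, kind being 'crit' or 'ok'.

    Returns the ticker entries as (time, kind, host) tuples.
    """
    ticker = []
    incident = None  # the one open incident in this example, or None
    for time, host, kind in events:
        now = seconds(time)
        # Report the open incident once its grace period has elapsed.
        if incident and not incident["reported"] and now >= incident["start"] + grace:
            incident["reported"] = True
            ticker.append((incident["time"], "crit", incident["host"]))
        if kind == "crit":
            if incident is None:
                incident = {"start": now, "time": time, "host": host, "reported": False}
            # Otherwise the message is lumped together with the open incident.
        elif kind == "ok" and incident is not None:
            if incident["reported"]:
                ticker.append((time, "ok", host))
            # An unreported incident is dropped silently.
            incident = None
    return ticker

LOG = [
    ("20:13:15", "bruch", "crit"), ("20:13:17", "bruch", "ok"),
    ("20:18:00", "bach", "crit"), ("20:18:01", "bruch", "crit"),
    ("20:18:03", "bruch", "ok"),
    ("20:20:00", "bach", "crit"), ("20:20:00", "bruch", "crit"),
    ("20:20:03", "beethoven", "crit"),
    ("21:15:00", "bruch", "ok"), ("21:15:01", "beethoven", "ok"),
]

for entry in run(LOG):
    print(entry)
# ('20:20:00', 'crit', 'bach')
# ('21:15:00', 'ok', 'bruch')
```

The first two bursts never outlive the 20 seconds, so only the third incident and its resolution show up.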

Line 11 contains a message that no rule matches (note the innocent-looking word "is"!); assume it is logged with facility/priority kern.notice. Software is that way: sometimes a log message just looks a little bit different for no apparent reason.

And here are the final results

Depending on how you have configured your reports, this could be the output:

Sept 19 20:20:00 crit      nfs.server.zappa.no-response
                           bach nfs: server zappa not responding

Sept 19 21:15:00 ok        nfs.server.zappa.no-response
                           bruch nfs: server zappa OK

Sept 19 21:17:03 notice    unknown.notice.0.342324
                           bach nfs: server zappa is OK

This is a report of type ticker. I like to have root-tail throw this on my screen. There are other types, too:

A summary contains all relevant log messages of a day, ordered by groups.
A notification contains a single message that could be sent by mail or to a pager. After sending out a message, logpecker can ignore other incidents with the same or lower priority for a certain period. Additionally, the recipient has a way to acknowledge the incident, and you can define an escalation scheme: if logpecker sees no acknowledgement for a certain period, it sends out another message to somebody else.
An unknown report contains all undefined messages of a day in a format so that you can cut-and-paste it into your rule configuration. A little bit of manual work is still needed, of course.
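The escalation scheme mentioned for notifications can be pictured with a few lines of Python. This is a sketch under assumptions, not logpecker's configuration or code: the function escalation_schedule, the recipient names and the fixed waiting interval are all hypothetical.

```python
def escalation_schedule(recipients, interval, ack_after=None):
    """Return (offset_seconds, recipient) for every message that gets sent.

    recipients: escalation chain, first contact first (hypothetical names).
    interval:   seconds to wait for an acknowledgement before escalating.
    ack_after:  seconds after the first message at which the incident is
                acknowledged, or None if nobody ever acknowledges it.
    """
    sent = []
    for step, recipient in enumerate(recipients):
        offset = step * interval
        # Stop escalating once the acknowledgement has arrived.
        if ack_after is not None and ack_after <= offset:
            break
        sent.append((offset, recipient))
    return sent

chain = ["oncall", "teamlead", "manager"]
print(escalation_schedule(chain, 600, ack_after=700))
# [(0, 'oncall'), (600, 'teamlead')]
print(escalation_schedule(chain, 600))
# [(0, 'oncall'), (600, 'teamlead'), (1200, 'manager')]
```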

All these reports can be limited to certain severities, hosts, incidents of certain groups etc. This is all explained in the report configuration reference.
