Figure 19: Applying patterns
The followings describe how patterns work. This information applies to program patterns and message patterns alike, even though message patterns are used to illustrate the procedure.
Patterns can consist of literals (keywords, or rather, keycharacters) and pattern parsers. Pattern parsers attempt to parse a sequence of characters according to certain rules.
|
NOTE:
Wildcards and regular expressions cannot be used in patterns. The @ character must be escaped, that is, to match for this character, you have to write @@ in your pattern. This is required because pattern parsers of syslog-ng are enclosed between @ characters. |
When a new message arrives, syslog-ng attempts to classify it using the pattern database. The available patterns are organized alphabetically into a tree, and syslog-ng inspects the message character-by-character, starting from the beginning. This approach ensures that only a small subset of the rules must be evaluated at any given step, resulting in high processing speed. Note that the speed of classifying messages is practically independent from the total number of rules.
For example, if the message begins with the Apple string, only patterns beginning with the character A are considered. In the next step, syslog-ng selects the patterns that start with Ap, and so on, until there is no more specific pattern left. The syslog-ng application has a strong preference for rules that match the input string completely.
Note that literal matches take precedence over pattern parser matches: if at a step there is a pattern that matches the next character with a literal, and another pattern that would match it with a parser, the pattern with the literal match is selected. Using the previous example, if at the third step there is the literal pattern Apport and a pattern parser Ap@STRING@, the Apport pattern is matched. If the literal does not match the incoming string (for example, Apple), syslog-ng attempts to match the pattern with the parser. However, if there are two or more parsers on the same level, only the first one will be applied, even if it does not perfectly match the message.
If there are two parsers at the same level (for example, Ap@STRING@ and Ap@QSTRING@), it is random which pattern is applied (technically, the one that is loaded first). However, if the selected parser cannot parse at least one character of the message, the other parser is used. But having two different parsers at the same level is extremely rare, so the impact of this limitation is much less than it appears.
Artificial ignorance is a method to detect anomalies. When applied to log analysis, it means that you ignore the regular, common log messages - these are the result of the regular behavior of your system, and therefore are not too interesting. However, new messages that have not appeared in the logs before can sign important events, and should be therefore investigated. "By definition, something we have never seen before is anomalous" (Marcus J. Ranum).
The syslog-ng application can classify messages using a pattern database: messages that do not match any pattern are classified as unknown. This provides a way to use artificial ignorance to review your log messages. You can periodically review the unknown messages — syslog-ng can send them to a separate destination, and add patterns for them to the pattern database. By reviewing and manually classifying the unknown messages, you can iteratively classify more and more messages, until only the really anomalous messages show up as unknown.
Obviously, for this to work, a large number of message patterns are required. The radix-tree matching method used for message classification is very effective, can be performed very fast, and scales very well. Basically the time required to perform a pattern matching is independent from the number of patterns in the database. For sample pattern databases, see Downloading sample pattern databases.
To classify messages using a pattern database, include a db-parser() statement in your syslog-ng configuration file using the following syntax:
parser <identifier> { db-parser(file("<database_filename>")); };
Note that using the parser in a log statement only performs the classification, but does not automatically do anything with the results of the classification.
The following statement uses the database located at /opt/syslog-ng/var/db/patterndb.xml.
parser pattern_db { db-parser( file("/opt/syslog-ng/var/db/patterndb.xml") ); };
To apply the patterns on the incoming messages, include the parser in a log statement:
log { source(s_all); parser(pattern_db); destination( di_messages_class); };
By default, syslog-ng tries to apply the patterns to the body of the incoming messages, that is, to the value of the $MESSAGE macro. If you want to apply patterns to a specific field, or to an expression created from the log message (for example, using template functions or other parsers), use the message-template() option. For example:
parser pattern_db { db-parser( file("/opt/syslog-ng/var/db/patterndb.xml") message-template("${MY-CUSTOM-FIELD-TO-PROCESS}") ); };
By default, syslog-ng uses the name of the application (content of the ${PROGRAM} macro) to select which rules to apply to the message. If the content of the ${PROGRAM} macro is not the proper name of the application, you can use the program-template() option to specify it. For example:
parser pattern_db { db-parser( file("/opt/syslog-ng/var/db/patterndb.xml") program-template("${MY-CUSTOM-FIELD-TO-SELECT-RULES}") ); };
Note that the program-template() option is available in syslog-ng OSE version
|
NOTE:
The default location of the pattern database file is /opt/syslog-ng/var/run/patterndb.xml. The file option of the db-parser() statement can be used to specify a different file, thus different db-parser statements can use different pattern databases. |
The following destination separates the log messages into different files based on the class assigned to the pattern that matches the message (for example Violation and Security type messages are stored in a separate file), and also adds the ID of the matching rule to the message:
destination di_messages_class { file( "/var/log/messages-${.classifier.class}" template("${.classifier.rule_id};${S_UNIXTIME};${SOURCEIP};${HOST};${PROGRAM};${PID};${MESSAGE}\n") template-escape(no) ); };
Note that if you chain pattern databases, that is, use multiple databases in the same log path, the class assigned to the message (the value of ${.classifier.class}) will be the one assigned by the last pattern database. As a result, a message might be classified as unknown even if a previous parser successfully classified it. For example, consider the following configuration:
log { ... parser(db_parser1); parser(db_parser2); ... };
Even if db_parser1 matches the message, db_parser2 might set ${.classifier.class} to unknown. To avoid this problem, you can use an 'if' statement to apply the second parser only if the first parser could not classify the message:
log { ... parser{ db-parser(file("db_parser1.xml")); }; if (match("^unknown$" value(".classifier.class"))) { parser { db-parser(file("db_parser2.xml")); }; }; ... };
For details on how to create your own pattern databases see The syslog-ng pattern database format.
If you want to automatically drop unmatched messages (that is, discard every message that does not match a pattern in the pattern database), use the drop-unmatched() option in the definition of the pattern database:
parser pattern_db { db-parser( file("/opt/syslog-ng/var/db/patterndb.xml") drop-unmatched(yes) ); };
Note that the drop-unmatched() option is available in syslog-ng OSE version
The results of message classification and parsing can be used in custom filters and templates, for example, in file and database templates. The following built-in macros allow you to use the results of the classification:
The .classifier.class macro contains the class assigned to the message (for example violation, security, or unknown).
The .classifier.rule_id macro contains the identifier of the message pattern that matched the message.
The .classifier.context_id macro contains the identifier of the context for messages that were correlated. For details on correlating messages, see Correlating log messages using pattern databases.
To filter on a specific message class, create a filter that checks the .classifier_class macro, and use this filter in a log statement.
filter fi_class_violation { match( "violation" value(".classifier.class") type("string") ); };
log { source(s_all); parser(pattern_db); filter(fi_class_violation); destination(di_class_violation); };
Filtering on the unknown class selects messages that did not match any rule of the pattern database. Routing these messages into a separate file allows you to periodically review new or unknown messages.
To filter on messages matching a specific classification rule, create a filter that checks the .classifier.rule_id macro. The unique identifier of the rule (for example e1e9c0d8-13bb-11de-8293-000c2922ed0a) is the id attribute of the rule in the XML database.
filter fi_class_rule { match( "e1e9c0d8-13bb-11de-8293-000c2922ed0a" value(".classifier.rule_id") type("string") ); };
Pattern database rules can assign tags to messages. These tags can be used to select tagged messages using the tags() filter function.
|
NOTE:
The syslog-ng OSE application automatically adds the class of the message as a tag using the .classifier.<message-class> format. For example, messages classified as "system" receive the .classifier.system tag. Use the tags() filter function to select messages of a specific class. filter f_tag_filter {tags(".classifier.system");}; |
The message-segments parsed by the pattern parsers can also be used as macros as well. To accomplish this, you have to add a name to the parser, and then you can use this name as a macro that refers to the parsed value of the message.
For example, you want to parse messages of an application that look like "Transaction: <type>.", where <type> is a string that has different values (for example refused, accepted, incomplete, and so on). To parse these messages, you can use the following pattern:
'Transaction: @ESTRING::.@'
Here the @ESTRING@ parser parses the message until the next full stop character. To use the results in a filter or a filename template, include a name in the parser of the pattern, for example:
'Transaction: @ESTRING:TRANSACTIONTYPE:.@'
After that, add a custom template to the log path that uses this template. For example, to select every accepted transaction, use the following custom filter in the log path:
match("accepted" value("TRANSACTIONTYPE"));
|
NOTE:
The above macros can be used in database columns and filename templates as well, if you create custom templates for the destination or logspace. Use a consistent naming scheme for your macros, for example, APPLICATIONNAME_MACRONAME. |
© ALL RIGHTS RESERVED. Terms of Use Privacy Cookie Preference Center