By default, syslog-ng uses PCRE-style regular expressions. To use other expression types, add the type() option after the regular expression.
The syslog-ng PE application supports the following expression types:
pcre
Description: Use Perl Compatible Regular Expressions (PCRE). Starting with syslog-ng PE version 3.1, PCRE expressions are supported on every platform. If the type() parameter is not specified, syslog-ng uses PCRE regular expressions by default.
PCRE regular expressions have the following flag options:
global
Usable only in rewrite rules: match for every occurrence of the expression, not only the first one.
ignore-case
Disable case-sensitivity.
store-matches
Store the matches of the regular expression into the $0, ... $255 variables. The $0 stores the entire match, $1 is the first group of the match (parentheses), and so on. Named matches (also called named subpatterns), for example, (?<name>...), are stored as well. Matches from the last filter expression can be referenced in regular expressions.
unicode
Use Unicode support for UTF-8 matches: UTF-8 character sequences are handled as single characters.
utf8
An alias for the unicode flag.
Example: Using PCRE regular expressions
rewrite r_rewrite_subst {
    subst("a*", "?", value("MESSAGE") flags("utf8" "global"));
};
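The store-matches flag described above can be sketched as follows. This is a minimal illustration, not an official example: the filter name, the destination name, the file path, the message text, and the usracct group name are all hypothetical. The pattern is written in single quotes so that no characters need to be escaped.

```
# Hypothetical filter: capture the user name from an sshd-style message.
# The named group (usracct) and the numbered matches are stored only
# because the store-matches flag is set.
filter f_ssh_accepted {
    match('Accepted password for (?<usracct>[a-z0-9_-]+)'
          value("MESSAGE") flags("store-matches"));
};

# Hypothetical destination: $0 is the entire match, ${usracct} is the named match.
destination d_ssh_users {
    file("/var/log/ssh-users.log"
         template("${ISODATE} user=${usracct} matched=$0\n"));
};
```

The destination is only meaningful in a log path that also applies the f_ssh_accepted filter, because the match variables are filled when that filter is evaluated.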
string
Description: Match the strings literally, without regular expression support. By default, only identical strings are matched. For partial matches, use the flags("prefix") or the flags("substring") flags.
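The following sketch shows the three ways of matching literal strings described above. The filter names and the matched strings are hypothetical and only serve as illustration.

```
# Exact match: the PROGRAM field must be identical to "sshd".
filter f_sshd_exact { program("sshd" type("string")); };

# Prefix match: the PROGRAM field must start with "ssh".
filter f_ssh_prefix { program("ssh" type("string") flags("prefix")); };

# Substring match: the text may appear anywhere in the message.
filter f_failed_password {
    match("Failed password" value("MESSAGE") type("string") flags("substring"));
};
```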
glob
Description: Match the strings against a pattern containing '*' and '?' wildcards, without regular expression and character range support. The advantage of glob patterns over regular expressions is that globs can be processed much faster.
- * matches an arbitrary string, including an empty string
- ? matches an arbitrary character
- The wildcards can match the / character.
- You cannot use the * and ? literally in the pattern.
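A minimal sketch of glob patterns in filters, assuming hypothetical filter names, program names, and hostnames:

```
# Because * can also match the / character, this pattern is intended to
# cover program names such as postfix/smtpd and postfix/qmgr.
filter f_postfix_glob { program("postfix*" type("glob")); };

# Each ? stands for one arbitrary character, for example web-01 or web-ab.
filter f_web_hosts_glob { host("web-??" type("glob")); };
```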
The host(), match(), and program() filter functions and some other syslog-ng objects accept regular expressions as parameters. However, evaluating general regular expressions puts a high load on the CPU, which can cause problems when the message traffic is very high. Often the regular expression can be replaced with simple filter functions and logical operators, which achieve the same effect at a much lower CPU load.
Example: Optimizing regular expressions in filters
Suppose you need a filter that matches the following error message logged by the xntpd NTP daemon:
xntpd[1567]: time error -1159.777379 is too large (set clock manually);
The following filter uses regular expressions and matches every instance and variant of this message.
filter f_demo_regexp {
    program("demo_program") and
    match("time error .* is too large .* set clock manually");
};
Segmenting the match() part of this filter into separate match() functions greatly improves the performance of the filter.
filter f_demo_optimized_regexp {
    program("demo_program") and
    match("time error") and
    match("is too large") and
    match("set clock manually");
};
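Because the segments of this filter contain no wildcards, they could also be written as literal string matches instead of regular expressions. The following variant is a sketch under assumptions, not a measured optimization: the filter name is hypothetical, type("string") without flags requires the program name to be identical to demo_program (the regular expression version also accepts partial matches), and value("MESSAGE") restricts the search to the message text.

```
# Hypothetical variant that uses literal string matching only.
filter f_demo_string_match {
    program("demo_program" type("string")) and
    match("time error" value("MESSAGE") type("string") flags("substring")) and
    match("is too large" value("MESSAGE") type("string") flags("substring")) and
    match("set clock manually" value("MESSAGE") type("string") flags("substring"));
};
```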
parser: Parse and segment structured messages
The filters and default macros of syslog-ng work well on the headers and metainformation of the log messages, but are rather limited when processing the content of the messages. Parsers can segment the content of the messages into name-value pairs, and these names can be used as user-defined macros. Subsequent filtering or other types of processing of the message can use these custom macros to refer to parts of the message. Parsers are global objects most often used together with filters and rewrite rules.
The syslog-ng PE application provides the following possibilities to parse the messages, or parts of the messages:
- By default, syslog-ng PE parses every message as a syslog message. To disable message parsing, use the flags(no-parse) option of the source. To explicitly parse a message as a syslog message, use the syslog parser. For details, see Parsing syslog messages.
- To segment a message into columns using a CSV-parser, see Parsing messages with comma-separated and similar values.
- To segment a message consisting of whitespace or comma-separated key=value pairs (for example, Postfix log messages), see Parsing key=value pairs.
- To parse JSON-formatted messages, see JSON parser.
- To parse XML-formatted messages, see XML parser.
- To identify and parse the messages using a pattern database, see Processing message content with a pattern database.
- To parse a specially-formatted date or timestamp, see Parsing dates and timestamps.
- To write a custom parser in Python or Hy, see Python parser.
The syslog-ng PE application provides built-in parsers for the following application logs:
By default, syslog-ng PE parses every message as a syslog message using the syslog-parser, and fills the macros with the values of the message. The syslog-parser does not discard messages: if the message cannot be parsed as a syslog message, the entire message (including its header) is stored in the $MSG macro. If you do not want to parse the message as a syslog message, use the flags(no-parse) option of the source.
You can also use the syslog-parser to explicitly parse a message, or a part of a message as a syslog message (for example, after rewriting the beginning of a message that does not comply with the syslog standards).
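The rewrite-then-parse case mentioned above can be sketched as follows. The rewrite rule name, the parser name, and the "VENDOR: " prefix are hypothetical; the source and destination reuse the values of the other examples in this section. The sketch assumes that the source does not parse the message, so the raw message is available for rewriting before the syslog-parser processes it.

```
# Hypothetical rule: remove a non-standard prefix from the raw message.
rewrite r_strip_prefix { subst('^VENDOR: ', '', value("MESSAGE")); };

# Parse the rewritten message as a syslog message.
parser p_syslog { syslog-parser(); };

log {
    source { syslog(ip(10.1.2.3) transport("tcp") flags(no-parse)); };
    rewrite(r_strip_prefix);
    parser(p_syslog);
    destination { file("/var/log/messages"); };
};
```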
Example: Using junctions
For example, suppose that you have a single network source that receives log messages from different devices, and some devices send messages that are not RFC-compliant (some routers are notorious for that). To solve this problem in earlier versions of syslog-ng PE, you had to create two different network sources using different IP addresses or ports: one that received the RFC-compliant messages, and one that received the improperly formatted messages (for example, using the flags(no-parse) option). Using junctions, this becomes much simpler: you can use a single network source to receive every message, then use a junction and two channels. The first channel processes the RFC-compliant messages, the second processes everything else. At the end, every message is stored in a single file. The filters used in the example can be host() filters (if you have a list of the IP addresses of the devices sending non-compliant messages), but that depends on your environment.
log {
    source { syslog(ip(10.1.2.3) transport("tcp") flags(no-parse)); };
    junction {
        channel { filter(f_compliant_hosts); parser { syslog-parser(); }; };
        channel { filter(f_noncompliant_hosts); };
    };
    destination { file("/var/log/messages"); };
};
Since every channel receives every message that reaches the junction, use the flags(final) option in the channels to avoid unnecessarily processing the messages multiple times:
log {
    source { syslog(ip(10.1.2.3) transport("tcp") flags(no-parse)); };
    junction {
        channel { filter(f_compliant_hosts); parser { syslog-parser(); }; flags(final); };
        channel { filter(f_noncompliant_hosts); flags(final); };
    };
    destination { file("/var/log/messages"); };
};
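The examples above reference the f_compliant_hosts and f_noncompliant_hosts filters without defining them. One possible definition is sketched below. It is an assumption for illustration only: the IP addresses are hypothetical, and the sketch uses the netmask() filter to select messages by sender address instead of the host() filter mentioned in the text, because host() matches the hostname stored in the message.

```
# Hypothetical list of devices that send non-compliant messages.
filter f_noncompliant_hosts {
    netmask("10.1.2.50/32") or
    netmask("10.1.2.51/32");
};

# Every other sender is treated as compliant.
filter f_compliant_hosts { not filter(f_noncompliant_hosts); };
```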
Note that syslog-ng PE has several parsers that you can use to parse non-compliant messages. You can even write a custom syslog-ng parser in Python. For details, see parser: Parse and segment structured messages.
Note that by default, the syslog-parser attempts to parse the message as an RFC3164-formatted (BSD-syslog) message. To parse the message as an RFC5424-formatted message, use the flags(syslog-protocol) option in the parser.
syslog-parser(flags(syslog-protocol));