Starting with syslog-ng OSE version 3.1, PCRE expressions are supported on every platform. If the type() parameter is not specified, syslog-ng OSE uses PCRE regular expressions by default.
The following example shows the structure of PCRE-style regular expressions in use.
Example: Using PCRE regular expressions
rewrite r_rewrite_subst {
    subst("a*", "?", value("MESSAGE") flags("utf8" "global"));
};
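The same default applies to filters. As a minimal sketch (the filter names f_pcre_default and f_pcre_explicit are hypothetical), the following two filters are intended to behave identically, because type("pcre") only spells out the default:

```
# Hypothetical filter names, for illustration only.
filter f_pcre_default {
    match("fail(ed|ure)" value("MESSAGE"));
};
filter f_pcre_explicit {
    match("fail(ed|ure)" value("MESSAGE") type("pcre"));
};
```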
PCRE-style regular expressions have the following flags() options:
dupnames
Allows using duplicate names for named subpatterns.
Configuration example:
filter { match("(?<DN>foo)|(?<DN>bar)" value(MSG) flags(store-matches, dupnames)); };
...
destination { file(/dev/stdout template("$DN\n")); };
global
Usable only in rewrite rules. With flags("global"), every occurrence of the expression is matched, not only the first one.
ignore-case
Disables case-sensitivity.
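Configuration example (a sketch with a hypothetical filter name): the following filter is expected to match "error", "Error", and "ERROR" alike.

```
filter f_any_case_error {
    match("error" value("MESSAGE") flags("ignore-case"));
};
```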
newline
When configured, it changes the newline definition used in PCRE regular expressions to accept any of the following:
- a single carriage-return (\r)
- a single linefeed (\n)
- the sequence carriage-return and linefeed (\r\n)
This newline definition is used when the circumflex and dollar patterns (^ and $) are matched against an input. By default, PCRE interprets the linefeed character as indicating the end of a line. The newline flag does not affect the \r, \n or \R characters used in patterns.
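A minimal sketch of where this matters, assuming a multi-line message whose lines are separated by \r\n. The filter name is hypothetical, and the inline (?m) modifier is standard PCRE syntax for multi-line mode, not a syslog-ng flag. With the default newline definition, $ cannot match directly after "failed", because a carriage-return stands between it and the linefeed. With flags("newline"), the \r\n sequence as a whole counts as a line ending.

```
filter f_line_ends_with_failed {
    # (?m) lets ^ and $ match at line boundaries inside the message.
    match('(?m)failed$' value("MESSAGE") flags("newline"));
};
```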
store-matches
Stores the matches of the regular expression into the $0, ... $255 variables. The $0 stores the entire match, $1 is the first group of the match (parentheses), and so on. Named matches (also called named subpatterns), for example, (?<name>...), are stored as well. Matches from the last filter expression can be referenced in regular expressions.
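For instance, a named subpattern stored by a filter can be used in the template of a destination on the same log path. This is a sketch only: the source s_local, the object names, the file path, and the message format (user=...) are assumptions made for illustration.

```
filter f_user {
    match('user=(?<username>[a-z]+)' value("MESSAGE") flags("store-matches"));
};
destination d_users {
    file("/var/log/users.log" template("${username}\n"));
};
log {
    source(s_local);    # assumed to be defined elsewhere
    filter(f_user);
    destination(d_users);
};
```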
unicode
Uses Unicode support for UTF-8 matches: UTF-8 character sequences are handled as single characters.
utf8
An alias for the unicode flag.
Literal string searches have the following flags() options:
global
Usable only in rewrite rules. With flags("global"), every occurrence of the expression is matched, not only the first one.
ignore-case
Disables case-sensitivity.
prefix
During the matching process, patterns (also called search expressions) are matched against the input string starting from the beginning of the input string, and the input string is matched only for the maximum character length of the pattern. The initial characters of the pattern and the input string must be identical in the exact same order, and the pattern's length is definitive for the matching process (that is, if the pattern is longer than the input string, the match will fail).
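Example: matching / non-matching patterns for the input string 'exam'
For the input string 'exam', the patterns 'ex' and 'exam' match, because each consists of the initial characters of the input string in the same order. The patterns 'example' and 'hexameter' do not match: 'example' is longer than the input string, and 'hexameter' does not begin with the same characters as the input string.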
store-matches
Stores the matches of the regular expression into the $0, ... $255 variables. The $0 stores the entire match, $1 is the first group of the match (parentheses), and so on. Named matches (also called named subpatterns), for example, (?<name>...), are stored as well. Matches from the last filter expression can be referenced in regular expressions.
substring
The given literal string matches when the pattern is found anywhere within the input. Unlike flags("prefix"), the pattern does not have to be at the beginning of the input string.
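The difference between the two flags can be sketched as follows. The filter names are hypothetical, and type("string") selects literal string matching. The first filter is expected to match a PROGRAM value such as 'exam' or 'example' but not 'hexameter'; the second matches wherever 'exam' occurs, so it matches 'hexameter' too.

```
# Matches only if PROGRAM begins with "exam" (for example, "exam" or "example").
filter f_program_prefix {
    match("exam" value("PROGRAM") type("string") flags("prefix"));
};
# Matches if "exam" occurs anywhere in PROGRAM (for example, "hexameter").
filter f_program_substring {
    match("exam" value("PROGRAM") type("string") flags("substring"));
};
```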
Glob patterns (which have no regular expression support) do not support any flags() options.
The host(), match(), and program() filter functions and some other syslog-ng objects accept regular expressions as parameters. However, evaluating general regular expressions puts a high load on the CPU, which can cause problems when the message traffic is very high. Often the regular expression can be replaced with simple filter functions and logical operators, which achieve the same effect at a much lower CPU load.
Example: Optimizing regular expressions in filters
Suppose you need a filter that matches the following error message logged by the xntpd NTP daemon:
xntpd[1567]: time error -1159.777379 is too large (set clock manually);
The following filter uses regular expressions and matches every instance and variant of this message.
filter f_demo_regexp {
    program("xntpd") and
    match("time error .* is too large .* set clock manually");
};
Segmenting the match() part of this filter into separate match() functions greatly improves the performance of the filter. Note that the segmented filter no longer requires the fragments to appear in this order.
filter f_demo_optimized_regexp {
    program("xntpd") and
    match("time error") and
    match("is too large") and
    match("set clock manually");
};
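Because the three fragments contain no regular expression metacharacters, they could also be searched as literal strings. The following variant is a sketch under that assumption: the filter name is hypothetical, program("xntpd") assumes the program name shown in the sample message, and any additional performance gain over the PCRE version has not been measured here.

```
filter f_demo_literal {
    program("xntpd") and
    # type("string") with flags("substring") performs a plain substring search.
    match("time error" value("MESSAGE") type("string") flags("substring")) and
    match("is too large" value("MESSAGE") type("string") flags("substring")) and
    match("set clock manually" value("MESSAGE") type("string") flags("substring"));
};
```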