Filters and substitution rewrite rules can use regular expressions. In regular expressions, the characters ()[].*?+^$|\ are used as special symbols. Depending on how you want to use these characters and which quotation mark you use, these characters must be used differently, as summarized below.
- Strings between single quotes ('string') are treated literally and are not interpreted at all; you do not have to escape special characters. For example, the output of '\x41' is \x41 (characters as follows: backslash, x (letter), 4 (number), 1 (number)). This makes writing and reading regular expressions much simpler: it is recommended to use single quotes when writing regular expressions.
- When enclosing strings between double quotes ("string"), the string is interpreted and you have to escape special characters, that is, precede them with a backslash (\) character if they are meant literally. For example, the output of "\x41" is simply the letter A. Therefore special characters like \ (backslash) or " (quotation mark) must be escaped (\\ and \"). The following expressions are interpreted: \a, \n, \r, \t, \v. For example, the \$40 expression matches the $40 string. Backslashes have to be escaped as well if they are meant literally, for example, the \\d expression matches the \d string.
  TIP: If you use single quotes, you do not need to escape the backslash, for example match("\\.") is equivalent to match('\.').
- Enclosing alphanumeric strings between double quotes ("string") is not necessary, you can just omit the double quotes. For example, when writing filters, match("sometext") and match(sometext) will both match the sometext string.
  NOTE: Only strings containing alphanumeric characters can be used without quotes or double quotes. If the string contains whitespace or any special characters (()[].*?+^$|\ or ;:#), you must use quotes or double quotes. When using the ;:# characters, you must use quotes or double quotes, but escaping them is not required.
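The quoting rules above can be compared side by side in a short sketch. The filter names are hypothetical; the expressions are the ones discussed in the list.

```
# Single quotes: the backslash reaches the regular expression unchanged.
filter f_demo_single_quotes { match('\.'); };

# Double quotes: the backslash must be doubled to produce the same expression.
filter f_demo_double_quotes { match("\\."); };

# Alphanumeric strings need no quotes at all.
filter f_demo_unquoted { match(sometext); };
```

The first two filters are meant to be equivalent. The single-quoted form is usually easier to read because the expression appears exactly as the regular expression engine receives it.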
By default, all regular expressions are case sensitive. To disable the case sensitivity of the expression, add the flags(ignore-case) option to the regular expression.
filter demo_regexp_insensitive { host("system" flags(ignore-case)); };
The regular expressions can use up to 255 regexp matches (${1} ... ${255}), but only from the last filter and only if the flags("store-matches") flag was set for the filter. For case-insensitive searches, use the flags("ignore-case") option.
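As a rough illustration of stored matches, the sketch below sets flags("store-matches") on a filter and then refers to the first group. All object names, the pattern, and the file path are hypothetical, and the sketch assumes that the stored match can be referenced as ${1} in the file name of the destination.

```
# Hypothetical filter: the parenthesized group is stored in ${1}.
filter f_demo_store_matches { message("user=([a-z]+)" flags("store-matches")); };

# Hypothetical destination that uses the stored match in the file name
# (assumes match variables are available to the destination).
destination d_demo_per_user { file("/var/log/demo/${1}.log"); };

# s_demo is assumed to be a source defined elsewhere in the configuration.
log { source(s_demo); filter(f_demo_store_matches); destination(d_demo_per_user); };
```

Because only the matches of the last filter are kept, the filter that stores the matches should be the last one evaluated before the value is used.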
By default, syslog-ng uses POSIX-style regular expressions. To use other expression types, add the type() option after the regular expression.
The syslog-ng PE application supports the following expression types:
POSIX regular expressions (type("posix"))
Description: Use POSIX regular expressions. If the type() parameter is not specified, syslog-ng uses POSIX regular expressions by default.
POSIX regular expressions have the following flag options:
global: Usable only in rewrite rules: match for every occurrence of the expression, not only the first one.
ignore-case: Disable case-sensitivity.
store-matches: Store the matches of the regular expression into the $0, ... $255 variables. The $0 variable stores the entire match, $1 is the first group of the match (parentheses), and so on. Matches from the last filter expression can be referenced in regular expressions.
Example 14.19. Using Posix regular expressions
filter f_message { message("keyword" flags("utf8" "ignore-case") ); };
Perl Compatible Regular Expressions (type("pcre"))
Description: Use Perl Compatible Regular Expressions (PCRE). Starting with syslog-ng PE version 3.1, PCRE expressions are supported on every platform.
PCRE regular expressions have the following flag options:
global: Usable only in rewrite rules: match for every occurrence of the expression, not only the first one.
ignore-case: Disable case-sensitivity.
store-matches: Store the matches of the regular expression into the $0, ... $255 variables. The $0 variable stores the entire match, $1 is the first group of the match (parentheses), and so on. Named matches (also called named subpatterns), for example (?<name>...), are stored as well. Matches from the last filter expression can be referenced in regular expressions.
unicode: Use Unicode support for UTF-8 matches: UTF-8 character sequences are handled as single characters.
utf8: An alias for the unicode flag.
Example 14.20. Using PCRE regular expressions
rewrite r_rewrite_subst {subst("a*", "?", value("MESSAGE") type("pcre") flags("utf8" "global")); };
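Named subpatterns can be combined with the store-matches flag, as in the minimal sketch below. The filter name and the pattern are hypothetical, and it is an assumption that the stored value can later be referenced by its name (for example as ${username}).

```
# Hypothetical filter: the named subpattern "username" is stored in addition
# to the numbered matches. Single quotes keep the expression literal.
filter f_demo_named_match {
    message('(?<username>[a-z]+) logged in' type("pcre") flags("store-matches"));
};
```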
Literal string searches (type("string"))
Description: Match the strings literally, without regular expression support. By default, only identical strings are matched. For partial matches, use the flags("prefix") or the flags("substring") flags.
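The three literal matching modes could be written as follows. The filter names and the matched strings are hypothetical; the type() and flags() values are the ones described above.

```
# Exact match: only the identical program name matches.
filter f_demo_string_exact { program("sshd" type("string")); };

# Prefix match: host names that begin with "web".
filter f_demo_string_prefix { host("web" type("string") flags("prefix")); };

# Substring match: messages that contain the text anywhere.
filter f_demo_string_substring { message("connection refused" type("string") flags("substring")); };
```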
Glob patterns (type("glob"))
Description: Match the strings against a pattern containing '*' and '?' wildcards, without regular expression and character range support. The advantage of glob patterns over regular expressions is that globs can be processed much faster.
- *: matches an arbitrary string, including an empty string
- ?: matches an arbitrary character
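A short sketch of both wildcards in filters follows. The filter names, the host name pattern, and the program name pattern are hypothetical.

```
# '*' matches any string, including an empty one: web.example.com, web01.example.com, ...
filter f_demo_glob_any { host("web*.example.com" type("glob")); };

# '?' matches an arbitrary character: for example "crond".
filter f_demo_glob_one { program("cron?" type("glob")); };
```

In a glob pattern the dot has no special meaning, so it does not need to be escaped as it would in a regular expression.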
NOTE: The host(), match(), and program() filter functions and some other syslog-ng objects accept regular expressions as parameters. But evaluating general regular expressions puts a high load on the CPU, which can cause problems when the message traffic is very high. Often the regular expression can be replaced with simple filter functions and logical operators. Using simple filters and logical operators, the same effect can be achieved at a much lower CPU load.
Example 14.21. Optimizing regular expressions in filters
Suppose you need a filter that matches the following error message logged by the xntpd NTP daemon:
xntpd[1567]: time error -1159.777379 is too large (set clock manually);
The following filter uses regular expressions and matches every instance and variant of this message.
filter f_demo_regexp { program("demo_program") and match("time error .* is too large .* set clock manually"); };
Segmenting the match() part of this filter into separate match() functions greatly improves the performance of the filter.
filter f_demo_optimized_regexp { program("demo_program") and match("time error") and match("is too large") and match("set clock manually"); };