Options of key=value parsers
The kv-parser has the following options.
extract-stray-words-into()
Synopsis: extract-stray-words-into("<name-value-pair>")
Description: Specifies the name-value pair where syslog-ng PE stores any stray words that appear before or between the parsed key-value pairs (mainly when the pair-separator() option is also set). If multiple stray words appear in a message, then syslog-ng PE stores them as a comma-separated list. Note that the prefix() option does not affect the name-value pair storing the stray words. Default value: N/A
Example: Extracting stray words in key-value pairs
For example, consider the following message:
VSYS=public; Slot=5/1; protocol=17; source-ip=10.116.214.221; source-port=50989; destination-ip=172.16.236.16; destination-port=162;time=2016/02/18 16:00:07; interzone-emtn_s1_vpn-enodeb_om; inbound; policy=370;
This is a list of key-value pairs, where the value separator is = and the pair separator is ;. However, before the last key-value pair (policy=370), there are two stray words: interzone-emtn_s1_vpn-enodeb_om and inbound. To store or process these words, specify the name-value pair that will hold them in the extract-stray-words-into() option, for example, extract-stray-words-into("my-stray-words"). For this message, the value of ${my-stray-words} will be: interzone-emtn_s1_vpn-enodeb_om, inbound
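The setup described above could be configured along the following lines. This is a minimal sketch: the parser name p_kv_stray is hypothetical, and it assumes that the pair separator of the sample message is a bare semicolon.

```
parser p_kv_stray {
    kv-parser(
        # The sample message separates its pairs with semicolons
        pair-separator(";")
        # Collect words that are not part of any key=value pair
        extract-stray-words-into("my-stray-words")
    );
};
```

Because prefix() does not affect this name-value pair, the stray words remain available as ${my-stray-words} even if the parsed pairs get a prefix.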
prefix()
Description: Insert a prefix before the name part of the parsed name-value pairs to help further processing. For example:
- To insert the my-parsed-data. prefix, use the prefix(my-parsed-data.) option.
- To refer to a particular piece of data that has a prefix, use the prefix in the name of the macro, for example, ${my-parsed-data.name}.
- If you forward the parsed messages using the IETF-syslog protocol, you can insert all the parsed data into the SDATA part of the message using the prefix(.SDATA.my-parsed-data.) option.
Names starting with a dot (for example, .example) are reserved for use by syslog-ng PE. If you use such a macro name as the name of a parsed value, it will attempt to replace the original value of the macro (note that only soft macros can be overwritten, see Hard versus soft macros for details). To avoid such problems, use a prefix when naming the parsed values, for example, prefix(my-parsed-data.)
By default, kv-parser() uses the .kv. prefix. To modify it, use the following format:
parser {
kv-parser(prefix("myprefix."));
};
pair-separator()
Synopsis: pair-separator("<separator-string>")
Description: Specifies the character or string that separates the key-value pairs from each other. Default value: , (a comma followed by a whitespace)
For example, to parse key1=value1;key2=value2 pairs, use kv-parser(pair-separator(";"));
template()
Synopsis: template("${<macroname>}")
Description: The macro that contains the part of the message that the parser will process. It can also be a macro created by a previous parser of the log path. By default, the parser processes the entire message (${MESSAGE}).
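As an illustration of processing a macro created by an earlier parser, the following sketch first parses a JSON message, then runs the key-value parser on one of the extracted fields. The field name details and the s_json and d_kv objects are assumptions made for this example; they are expected to be defined elsewhere in the configuration.

```
log {
    source(s_json);
    # First parser: extracts the JSON fields with the .json. prefix
    parser { json-parser(prefix(".json.")); };
    # Second parser: processes only the hypothetical "details" field
    parser { kv-parser(template("${.json.details}")); };
    destination(d_kv);
};
```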
value-separator()
Synopsis: value-separator("<separator-character>")
Description: Specifies the character that separates the keys from the values. Default value: =
For example, to parse key:value pairs, use kv-parser(value-separator(":"));
JSON parser
JavaScript Object Notation (JSON) is a text-based open standard designed for human-readable data interchange. It is used primarily to transmit data between a server and a web application, serving as an alternative to XML. It is described in RFC 4627. The syslog-ng PE application can separate parts of incoming JSON-encoded log messages into name-value pairs. For details on using value-pairs in syslog-ng PE, see Structuring macros, metadata, and other value-pairs.
You can refer to the separated parts of the JSON message using the key of the JSON object as a macro. For example, if the JSON contains {"KEY1":"value1","KEY2":"value2"}, you can refer to the values as ${KEY1} and ${KEY2}. If the JSON content is structured, syslog-ng PE converts it to dot-notation-format. For example, to access the value of the following structure {"KEY1": {"KEY2": "VALUE"}}, use the ${KEY1.KEY2} macro.
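As a brief sketch of how the dot notation can be used, the following configuration writes a nested value to a file. The parser and destination names and the file path are assumptions made for this example.

```
parser p_json_nested {
    json-parser();
};

destination d_nested {
    # For {"KEY1": {"KEY2": "VALUE"}}, ${KEY1.KEY2} refers to the nested value
    file("/tmp/nested.log" template("${KEY1.KEY2}\n"));
};
```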
Caution:
If the names of keys in the JSON content are the same as the names of syslog-ng PE soft macros, the value from the JSON content will overwrite the value of the macro. For example, the {"PROGRAM":"value1","MESSAGE":"value2"} JSON content will overwrite the ${PROGRAM} and ${MESSAGE} macros. To avoid overwriting such macros, use the prefix() option.
Hard macros cannot be modified, so they will not be overwritten. For details on the macro types, see Hard versus soft macros.
NOTE: The JSON parser currently supports only integer, double and string values when interpreting JSON structures. As syslog-ng does not handle different data types internally, the JSON parser converts all JSON data to string values. In case of boolean types, the value is converted to 'TRUE' or 'FALSE' as their string representation.
The JSON parser discards messages if it cannot parse them as JSON messages, so it acts as a JSON-filter as well.
To create a JSON parser, define a parser that has the json-parser() option. Defining the prefix and the marker is optional. By default, the parser will process the ${MESSAGE} part of the log message. To process other parts of a log message with the JSON parser, use the template() option. You can also define the parser inline in the log path.
Declaration
parser parser_name {
json-parser(
marker()
prefix()
);
};
Example: Using a JSON parser
In the following example, the source is a JSON-encoded log message. The syslog parser is disabled with flags(no-parse), so that syslog-ng PE does not parse the message. The json-parser inserts the ".json." prefix before all extracted name-value pairs. The destination is a file that uses the format-json template function. Every name-value pair that begins with a dot (".") character will be written to the file (dot-nv-pairs). The log statement connects the source, the parser, and the destination.
source s_json {
network(port(21514) flags(no-parse));
};
destination d_json {
file("/tmp/test.json"
template("$(format-json --scope dot-nv-pairs)\n"));
};
parser p_json {
json-parser (prefix(".json."));
};
log {
source(s_json);
parser(p_json);
destination(d_json);
};
You can also define the parser inline in the log path.
source s_json {
network(port(21514) flags(no-parse));
};
destination d_json {
file("/tmp/test.json"
template("$(format-json --scope dot-nv-pairs)\n"));
};
log {
source(s_json);
parser {
json-parser (prefix(".json."));
};
destination(d_json);
};
Options of JSON parsers
The JSON parser has the following options.
extract-prefix()
Synopsis: extract-prefix()
Description: Extract only the specified subtree from the JSON message. Use dot notation to specify the subtree. The rest of the message will be ignored. For example, assuming that the incoming object is named msg, the json-parser(extract-prefix("foo.bar[5]")); parser is equivalent to the msg.foo.bar[5] JavaScript expression. Note that the resulting expression must be a JSON object in order to extract its members into name-value pairs.
This feature also works when the top-level object is an array, because you can use an array index at the first indirection level, for example: json-parser(extract-prefix("[5]")), which is equivalent to msg[5].
In addition to alphanumeric characters, the key of the JSON object can contain the following characters: !"#$%&'()*+,-/:;<=>?@\^_`{|}~
It cannot contain the following characters: .[]
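A minimal sketch of subtree extraction follows. It assumes a hypothetical incoming message of the form {"event": {"details": {"user": "alice", "action": "login"}}}; the parser name and the key names are illustrative only.

```
parser p_json_subtree {
    json-parser(
        # Parse only the members of the event.details object
        extract-prefix("event.details")
        prefix(".json.")
    );
};
```

With this configuration, the members of the selected object (user and action in the assumed message) are expected to be available as ${.json.user} and ${.json.action}, and the rest of the message is ignored.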
Example: Convert logstash eventlog format v0 to v1
The following parser converts messages in the logstash eventlog v0 format to the v1 format.
parser p_jsoneventv0 {
channel {
parser { json-parser(extract-prefix("@fields")); };
parser { json-parser(prefix(".json.")); };
rewrite {
set("1" value("@version"));
set("${.json.@timestamp}" value("@timestamp"));
set("${.json.@message}" value("message"));
};
};
};
marker()
Description: Use a marker in case of mixed log messages, to identify JSON encoded messages for the parser.
Some logging implementations require a marker to be set before the JSON payload. The JSON parser can find these markers and parse a message only if the marker is present.
Example: Using the marker option in JSON parser
This JSON parser parses log messages that use the "@cee:" marker in front of the JSON payload. It inserts ".cee." in front of the names of the name-value pairs, so that it is easier to find the name-value pairs that were parsed using this parser later on. (For details on selecting name-value pairs, see value-pairs().)
parser {
json-parser(
marker("@cee:")
prefix(".cee.")
);
};
prefix()
Description: Insert a prefix before the name part of the parsed name-value pairs to help further processing. For example:
- To insert the my-parsed-data. prefix, use the prefix(my-parsed-data.) option.
- To refer to a particular piece of data that has a prefix, use the prefix in the name of the macro, for example, ${my-parsed-data.name}.
- If you forward the parsed messages using the IETF-syslog protocol, you can insert all the parsed data into the SDATA part of the message using the prefix(.SDATA.my-parsed-data.) option.
Names starting with a dot (for example, .example) are reserved for use by syslog-ng PE. If you use such a macro name as the name of a parsed value, it will attempt to replace the original value of the macro (note that only soft macros can be overwritten, see Hard versus soft macros for details). To avoid such problems, use a prefix when naming the parsed values, for example, prefix(my-parsed-data.)
This parser does not have a default prefix. To configure a custom prefix, use the following format:
parser {
json-parser(prefix("myprefix."));
};
template()
Synopsis: template("${<macroname>}")
Description: The macro that contains the part of the message that the parser will process. It can also be a macro created by a previous parser of the log path. By default, the parser processes the entire message (${MESSAGE}).
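For instance, template() can be used when a field of a JSON message itself contains a JSON-encoded string. The following sketch parses the outer message first, then parses the embedded document from the extracted field. The field name inner, the prefixes, and the s_json and d_json objects are assumptions made for this example; the source and destination are expected to be defined elsewhere.

```
log {
    source(s_json);
    # Parse the outer JSON message from ${MESSAGE}
    parser { json-parser(prefix(".outer.")); };
    # Parse the JSON string stored in the hypothetical "inner" field
    parser { json-parser(template("${.outer.inner}") prefix(".inner.")); };
    destination(d_json);
};
```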
XML parser
Extensible Markup Language (XML) is a text-based open standard designed for both human-readable and machine-readable data interchange. Like JSON, it is used primarily to transmit data between a server and web application. It is described in W3C Recommendation: Extensible Markup Language (XML).
The XML parser processes input in XML format, and adds the parsed data to the message object.
To create an XML parser, define a parser that has the xml() option. By default, the parser will process the ${MESSAGE} part of the log message. To process other parts of a log message using the XML parser, use the template() option. You can also define the parser inline in the log path.
Declaration
parser xml_name {
xml(template()
prefix()
drop-invalid()
exclude-tags()
strip-whitespaces()
);
};
Example: Using an XML parser
In the following example, the source is an XML-encoded log message. The destination is a file that uses the format-json template function. The log statement connects the source, the parser, and the destination.
source s_local {
file("/tmp/aaa");
};
destination d_local {
file("/tmp/bbb" template("$(format-json .xml.*)\n"));
};
parser xml_parser {
xml();
};
log {
source(s_local);
parser(xml_parser);
destination(d_local);
};
You can also define the parser inline in the log path.
log {
source(s_file);
parser { xml(prefix(".SDATA")); };
destination(d_file);
};
The XML parser inserts an ".xml" prefix by default before the extracted name-value pairs. Since format-json replaces a dot with an underscore at the beginning of keys, the ".xml" prefix becomes "_xml". Attributes get an _ prefix. For example, from the XML input:
<tags attr='attrval'>part1<tag1>Tag1 Leaf</tag1>part2<tag2>Tag2 Leaf</tag2>part3</tags>
The following output is generated:
{"_xml":{"tags":{"tag2":"Tag2 Leaf","tag1":"Tag1 Leaf","_attr":"attrval","tags":"part1part2part3"}}}
When the text is separated by tags on different levels or tags on the same level, the parser uses the list-handling functionality (enabled by default) to handle lists in the XML.
The list-handling functionality of the XML parser treats the elements of vector-like structures as separate entries and stores them as a comma-separated list. Using the following structure as an example:
<vector>
<entry>value1</entry>
<entry>value 2</entry>
<entry>Doe,John</entry>
<entry>value3</entry>
...
<entry>valueN</entry>
</vector>
After parsing, the entries are separated by a comma. If an entry contains a space or a comma, for example, value 2 or Doe,John in the previous example, the entry is quoted:
vector.entry = value1,"value 2","Doe,John",value3...valueN
Note that if you disable the list-handling functionality, the XML parser cannot address each element of a vector-like structure individually. Using the following structure as an example:
<vector>
<entry>value1</entry>
<entry>value2</entry>
...
<entry>valueN</entry>
</vector>
After parsing, the entries are not addressed individually. Instead, the text of the entries is concatenated:
vector.entry = "value1value2...valueN"
For more information about the list-handling functionality, see Limitations of the XML parsers.
Whitespaces are kept as they are in the XML input. No collapsing happens on significant whitespaces. For example, from this input XML:
<133>Feb 25 14:09:07 webserver syslogd: <b>|Test\n\n Test2|</b>\n
The following output is generated:
[2017-09-04T13:20:27.417266] Setting value; msg='0x7f2fd8002df0', name='.xml.b', value='|Test\x0a\x0a Test2|'
To strip whitespace from the parsed values instead, use the strip-whitespaces() option.
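A minimal sketch of enabling whitespace stripping is shown below. The parser name is hypothetical, and the example assumes that the option takes a yes or no value.

```
parser p_xml_stripped {
    xml(
        # Remove leading and trailing whitespace from the parsed values
        strip-whitespaces(yes)
    );
};
```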
Configuration hints
Define a source that correctly detects the end of the message, otherwise the XML parser will consider the input invalid, resulting in a parser error.
To ensure that the end of the XML document is accurately detected, use any of the following options:
In case you experience issues, start syslog-ng with debug logs enabled. There will be a debug log about the incoming log entry, which shows the complete message to be parsed. The entry should contain the entire XML document.
NOTE: If your log messages are entirely in XML format, make sure to disable message parsing on the source side by including the flags("no-parse") option in your source statement. This puts the entire log message into the ${MESSAGE} macro, which is the part of the message that the XML parser processes by default.
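Such a source could be set up as in the following sketch. The source name, the port number, and the d_xml destination are assumptions made for this example; the destination is expected to be defined elsewhere in the configuration.

```
source s_xml {
    # no-parse keeps the whole incoming message in ${MESSAGE}
    network(port(21515) flags(no-parse));
};

log {
    source(s_xml);
    parser { xml(); };
    destination(d_xml);
};
```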