syslog-ng Premium Edition 7.0.32 - Administration Guide

The JSON parser has the following options.

extract-prefix()

Synopsis:

extract-prefix()

Description: Extract only the specified subtree from the JSON message. Use dot notation to specify the subtree. The rest of the message is ignored. For example, assuming that the incoming object is named msg, the json-parser(extract-prefix("foo.bar[5]")); parser is equivalent to the msg.foo.bar[5] JavaScript expression. Note that the resulting expression must be a JSON object in order to extract its members into name-value pairs.

This feature also works when the top-level object is an array, because you can use an array index at the first indirection level, for example: json-parser(extract-prefix("[5]")), which is equivalent to msg[5].

In addition to alphanumeric characters, the key of the JSON object can contain the following characters: !"#$%&'()*+,-/:;<=>?@\^_`{|}~

It cannot contain the following characters: .[]
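As an illustration of the dot notation, the following sketch extracts one nested object and prefixes its members. The input message, the event.details path, the parser name, and the app. prefix are hypothetical, chosen only for this example.

```
# Hypothetical input message:
#   {"event": {"details": {"user": "alice", "code": "42"}}, "other": "ignored"}

parser p_json_details {
    json-parser(
        extract-prefix("event.details")   # keep only this subtree
        prefix("app.")                    # hypothetical prefix
    );
};

# With this input, the members of event.details should be available as
# ${app.user} and ${app.code}; the "other" key is outside the subtree
# and is ignored.
```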

Example: Convert logstash eventlog format v0 to v1

The following parser converts messages in the logstash eventlog v0 format to the v1 format.

parser p_jsoneventv0 {
  channel {
    parser { json-parser(extract-prefix("@fields")); };
    parser { json-parser(prefix(".json.")); };
    rewrite {
      set("1" value("@version"));
      set("${.json.@timestamp}" value("@timestamp"));
      set("${.json.@message}" value("message"));
    };
  };
};

marker()

Synopsis:

marker()

Description: Use a marker in case of mixed log messages, to identify JSON encoded messages for the parser.

Some logging implementations require a marker to be set before the JSON payload. The JSON parser can find these markers, and parses the message only if the marker is present.

Example: Using the marker option in JSON parser

This JSON parser parses log messages that use the "@cee:" marker in front of the JSON payload. It inserts ".cee." in front of the name of each name-value pair, so that the name-value pairs parsed by this parser are easier to find later. (For details on selecting name-value pairs, see value-pairs().)

parser {
    json-parser(
        marker("@cee:")
        prefix(".cee.")
    );
};

prefix()

Synopsis:

prefix()

Description: Insert a prefix before the name part of the parsed name-value pairs to help further processing. For example:

To insert the my-parsed-data. prefix, use the prefix(my-parsed-data.) option.
To refer to parsed data that has a prefix, include the prefix in the name of the macro, for example, ${my-parsed-data.name}.
If you forward the parsed messages using the IETF-syslog protocol, you can insert all the parsed data into the SDATA part of the message using the prefix(.SDATA.my-parsed-data.) option.

Names starting with a dot (for example, .example) are reserved for use by syslog-ng PE. If you use such a macro name as the name of a parsed value, it will attempt to replace the original value of the macro (note that only soft macros can be overwritten, see Hard versus soft macros for details). To avoid such problems, use a prefix when naming the parsed values, for example, prefix(my-parsed-data.)

This parser does not have a default prefix. To configure a custom prefix, use the following format:

parser {
    json-parser(prefix("myprefix."));
};

template()

Synopsis:

template("${<macroname>}")

Description: The macro that contains the part of the message that the parser will process. It can also be a macro created by a previous parser of the log path. By default, the parser processes the entire message (${MESSAGE}).
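A minimal sketch of using template() to parse a value produced earlier in the log path instead of the whole message. The parser name, the ${app.payload} name-value pair, and the .json. prefix are assumptions for illustration; substitute the macro that holds the JSON text in your configuration.

```
parser p_json_from_field {
    json-parser(
        template("${app.payload}")   # hypothetical name-value pair holding JSON text
        prefix(".json.")
    );
};
```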

XML parser

Extensible Markup Language (XML) is a text-based open standard designed for both human-readable and machine-readable data interchange. Like JSON, it is used primarily to transmit data between a server and a web application. It is described in W3C Recommendation: Extensible Markup Language (XML).

The XML parser processes input in XML format, and adds the parsed data to the message object.

To create an XML parser, define a parser that uses the xml() option. By default, the parser processes the ${MESSAGE} part of the log message. To process other parts of a log message using the XML parser, use the template() option. You can also define the parser inline in the log path.

Declaration

parser xml_name {
    xml(template()
        prefix()
        drop-invalid()
        exclude-tags()
        strip-whitespaces()
    );
};

Example: Using an XML parser

In the following example, the source is an XML-encoded log message. The destination is a file that uses the format-json template. The log line connects the source, the destination and the parser.

source s_local {
    file("/tmp/aaa");
};

destination d_local {
    file("/tmp/bbb" template("$(format-json .xml.*)\n"));
};

parser xml_parser {
    xml();
};

log {
    source(s_local);
    parser(xml_parser);
    destination(d_local);
};

You can also define the parser inline in the log path.

log {
    source(s_file);
    parser { xml(prefix(".SDATA")); };
    destination(d_file);
};

The XML parser inserts an ".xml" prefix by default before the extracted name-value pairs. Since format-json replaces a dot with an underscore at the beginning of keys, the ".xml" prefix becomes "_xml". Attributes get an _ prefix. For example, from the XML input:

<tags attr='attrval'>part1<tag1>Tag1 Leaf</tag1>part2<tag2>Tag2 Leaf</tag2>part3</tags>

The following output is generated:

{"_xml":{"tags":{"tag2":"Tag2 Leaf","tag1":"Tag1 Leaf","_attr":"attrval","tags":"part1part2part3"}}}

When the text is separated by tags on different levels or tags on the same level, the parser uses the list-handling functionality (enabled by default) to handle lists in the XML.

The list-handling functionality of the XML parser represents vector-like structures as a list of comma-separated entries. Using the following structure as an example:

<vector>
    <entry>value1</entry>
    <entry>value 2</entry>
    <entry>Doe,John</entry>
    <entry>value3</entry>
    ...
    <entry>valueN</entry>
</vector>

After parsing, the entries are separated by commas. If an entry contains a space or a comma, for example, value 2 or Doe,John in the previous example, the entry is quoted:

vector.entry = value1,"value 2","Doe,John",value3...valueN

Note that if you disable the list-handling functionality, the XML parser cannot address each element of a vector-like structure individually. Using the following structure as an example:

<vector>
    <entry>value1</entry>
    <entry>value2</entry>
    ...
    <entry>valueN</entry>
</vector>

After parsing, the entries are not addressed individually. Instead, the text of the entries is concatenated:

vector.entry = "value1value2...valueN"

For more information about the list-handling functionality, see Limitations of the XML parser.

Whitespace is preserved as it appears in the XML input. Significant whitespace is not collapsed. For example, from this input XML:

<133>Feb 25 14:09:07 webserver syslogd: <b>|Test\n\n   Test2|</b>\n

The following output is generated:

[2017-09-04T13:20:27.417266] Setting value; msg='0x7f2fd8002df0', name='.xml.b', value='|Test\x0a\x0a   Test2|'

To strip whitespace instead, use the strip-whitespaces() option.

Configuration hints

Define a source that correctly detects the end of the message, otherwise the XML parser will consider the input invalid, resulting in a parser error.

To ensure that the end of the XML document is accurately detected, use any of the following options:

Ensure that the XML is a single-line message.
In the case of multiline XML documents:
- If the opening and closing tags are fixed and known, you can use multi-line-mode(prefix-suffix). Using regular expressions, specify a prefix and suffix matching the opening and closing tags. For details on using multi-line-mode(prefix-suffix), see the multi-line-prefix() and multi-line-suffix() options.
- In the case of TCP, you can encapsulate and send the document in syslog-protocol format, and use a syslog() source. Make sure that the message conforms to the octet counting method described in RFC6587.
  
  For example:
```
59 <133>Feb 25 14:09:07 webserver syslogd: <book>\nText\n</book>
```
  Counting each newline as one character, the message is 59 characters long, so 59 is prepended to the original message.
- You can use a datagram-based source. In the case of datagram-based sources, the protocol signals the end of the message automatically. Ensure that the complete XML document is written in one message.
- Unless the opening and closing tags are fixed and known, stream-based sources are currently not supported.

In case you experience issues, start syslog-ng with debug logs enabled. There will be a debug log about the incoming log entry, which shows the complete message to be parsed. The entry should contain the entire XML document.

NOTE: If your log messages are entirely in XML format, disable message parsing on the source side by including the flags("no-parse") option in your source statement. This puts the entire log message in the $MESSAGE macro, which is the field that the XML parser parses by default.
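The hints in this section can be combined in a source definition. The following is a sketch only: the source names, the file path, and the <book> element are assumptions for illustration, and the regular expressions must match the actual opening and closing tags of your documents.

```
# Multiline XML documents in a file, with fixed and known opening and closing tags.
source s_xml_file {
    file("/var/log/app/books.xml"          # hypothetical path
        flags(no-parse)                    # the file contains only XML, no syslog header
        multi-line-mode(prefix-suffix)
        multi-line-prefix("<book>")        # regular expression matching the opening tag
        multi-line-suffix("</book>")       # regular expression matching the closing tag
    );
};

# XML documents sent over TCP in syslog-protocol format with octet counting.
source s_xml_tcp {
    syslog(transport("tcp"));
};
```

If parsing fails with either source, the debug log of the incoming entry shows whether the complete XML document arrived as a single message.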

Limitations of the XML parser

The XML parser comes with certain limitations.

Using the list-handling functionality with vector-like structures

The XML parser uses the list-handling functionality to handle lists in the XML. The list-handling functionality has limitations when handling name-value pairs or quoting in SDATA as described below. Note that you can disable the list-handling functionality if needed.

The list-handling functionality of the XML parser represents vector-like structures as a list of comma-separated entries. Using the following structure as an example:

<vector>
    <entry>value1</entry>
    <entry>value 2</entry>
    <entry>Doe,John</entry>
    <entry>value3</entry>
    ...
    <entry>valueN</entry>
</vector>

vector.entry = value1,"value 2","Doe,John",value3...valueN

Using the list-handling functionality with name-value pairs

As every value in name-value pairs can be quoted, One Identity recommends that you access name-values as lists as follows:

Use list-related template functions on the list created by the XML parser.

Use type-hinting using the format-json template function as shown in the example below:

template("$(format-json --scope dot-nv-pairs LIST=list(${.xml.Event.EventData.Data}))\n")
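As a sketch of the first recommendation, list-related template functions can be applied directly to the list created by the XML parser. This example assumes that the list-head and list-count template functions are available in your syslog-ng version, and reuses the ${.xml.Event.EventData.Data} name-value pair from the template above. The destination name and the file path are hypothetical.

```
# Writes the first element of the list, followed by the number of elements.
destination d_first_data {
    file("/tmp/first-data.log"
        template("$(list-head ${.xml.Event.EventData.Data}) $(list-count ${.xml.Event.EventData.Data})\n")
    );
};
```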

Using the list-handling functionality with SDATA

According to RFC5424, SDATA parameter values must be quoted with double-quote ('"') characters. If the value contains double-quotes, they must be escaped with a backslash (\) character.

Due to the list-handling functionality of the XML parser, parsed XML text contents are also quoted using double-quote ('"') characters. Because the parsed XML text content is part of the message, it is quoted again when used as an SDATA parameter value.

Using the following structure as an example:

<Event>
<Data>42</Data>
<Data>Testing testing</Data>
</Event>

The expected name-value pair is as follows:

Event.Data = 42,"Testing testing"

In SDATA, this is quoted as shown below:

[Event Data="42,\"Testing testing\""]

Disabling the list-handling functionality

To disable the list-handling functionality, set the create-lists() option to no as shown below. The default value is yes.

parser p_xml {
    xml(create-lists(no));
};

Note that if you disable the list-handling functionality, the XML parser cannot address each element of a vector-like structure individually. Using the following structure as an example:

<vector>
    <entry>value1</entry>
    <entry>value2</entry>
    ...
    <entry>valueN</entry>
</vector>

After parsing, the entries are not addressed individually. Instead, the text of the entries is concatenated:

vector.entry = "value1value2...valueN"

CDATA

The XML parser does not support CDATA. CDATA inside the XML input is ignored. Processing instructions are ignored as well.

Inherited limitations

The XML parser is based on the GLib XML subset parser, called the "GMarkup" parser, which is not a full-scale XML parser. It is intended to parse a simple markup format that is a subset of XML. Some limitations are inherited:

Do not use the XML parser if you expect to interoperate with applications generating full-scale XML. Instead, use it for application data files, configuration files, log files, and so on, where you know your application will be the only one writing the file.
The XML parser is not guaranteed to display an error message in the case of invalid XML. It may accept invalid XML. However, it does not accept XML input that is not well-formed (a condition that is weaker than requiring XML to be valid).

No support for long keys

If the key is longer than 255 characters, syslog-ng drops the entry and an error log is emitted. There is no chunking or any other way of recovering data, not even partial data. The entry will be replaced by an empty string.

Options of the XML parsers

The XML parser has the following options.

create-lists()

Synopsis:	create-lists()
Format:	yes|no
Default:	yes
Mandatory:	no

Description: If set, the list-handling functionality of the XML parser represents vector-like structures as a list of comma-separated entries. For more information, see Limitations of the XML parser.

drop-invalid()

Synopsis:	drop-invalid()
Format:	yes|no
Default:	no
Mandatory:	no

Description: If set, messages that contain invalid XML are dropped entirely.
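For example, a parser that drops messages it cannot parse could look like the following sketch. The parser name is hypothetical.

```
parser p_xml_strict {
    xml(drop-invalid(yes));
};
```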

exclude-tags()

Synopsis:	exclude-tags()
Format:	list of globs
Default:	None. If not set, no filtering is done.
Mandatory:	no

Description: The XML parser matches tags against the listed globs. If there is a match, the given subtree of the XML will be omitted.

Example: Using exclude-tags

parser xml_parser {
    xml(template("$MSG") exclude-tags("tag1", "tag2", "inner*"));
};

From this XML input:

<tag1>Text1</tag1><tag2>Text2</tag2><tag3>Text3<innertag>TextInner</innertag></tag3>

The following output is generated:

{"_xml":{"tag3":"Text3"}}

prefix()

Synopsis:

prefix()

Description: Insert a prefix before the name part of the parsed name-value pairs to help further processing. For example:

To insert the my-parsed-data. prefix, use the prefix(my-parsed-data.) option.
To refer to parsed data that has a prefix, include the prefix in the name of the macro, for example, ${my-parsed-data.name}.
If you forward the parsed messages using the IETF-syslog protocol, you can insert all the parsed data into the SDATA part of the message using the prefix(.SDATA.my-parsed-data.) option.

The prefix() option is optional and its default value is ".xml".

strip-whitespaces()

Synopsis:	strip-whitespaces()
Format:	yes|no
Default:	no
Mandatory:	no

Description: Strip whitespace from the XML text nodes before adding them to the message.

Example: Using strip-whitespaces

parser xml_parser {
    xml(template("$MSG") strip-whitespaces(yes));
};

From this XML input:

<tag1> Tag </tag1>

The following output is generated:

{"_xml":{"tag1":"Tag"}}

template()

Synopsis:

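template("${<macroname>}")

Description: The macro that contains the part of the message that the parser will process. By default, the parser processes the ${MESSAGE} part of the log message.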
